Typically, calculated fields
are not stored in a relational table, but there is nothing wrong with storing calculated
fields in a fact table. In some cases this can improve performance by calculating the
values ahead of time instead of requiring the cube engine to calculate the values on the
fly. Calculations in the cube engine will be mentioned later in this chapter.
Extracting, Transforming, and Loading Data
After designing the relational data warehouse to support the needs of the business,
the next step is to identify the data sources, extract the data, transform data as
needed, and then load the data into the relational warehouse. This process is
generically called extraction, transformation, and loading, or ETL. There are many
ETL tools available and while most can connect to all the common data sources,
the real differences lie in their abilities to control the flow of data and provide
transformations to the data.
Data transformation is required because data stored in disparate source systems
may be stored in different formats, with different codes, or using different data types.
For example, one system might store a Yes as 1 and a No as 0, while another system
may store the strings ???Y??™ and ???N.??™ If data is pulled from two systems and a Yes/No
field is needed, then the data must be transformed so that it is consistent in the
warehouse. Data consistency is one of the cornerstones of a data warehouse; all data
must be in the same format for analysis to be effective.
Pages:
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84