Untitled
surrounding objects (dimension lookup tables) like a star. Each dimension is represented as a single table. The primary key in each dimension table is related to a foreign key in the fact table.

[Figure: Sample star schema]

All measures in the fact table are related to all the dimensions that the fact table is related to. In other words, they all have the same level of granularity.

A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.

Let's look at an example: Assume our data warehouse keeps store sales data, and the different dimensions are time, store, product, and customer. In this case, the sample star schema figure represents our star schema. The lines between two tables indicate that there is a primary key / foreign key relationship between the two tables. Note that different dimensions are not related to one another.

Snowflake Schema

The snowflake schema is an extension of the star schema, where each point of the star explodes into more points. In a star schema, each dimension is represented by a single dimensional table, whereas in a snowflake schema, that dimensional table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy.

[Figure: Sample snowflake schema]

For example, consider a Time Dimension that consists of 2 different hierarchies:

1. Year → Month → Day
2. Week → Day

We will have 4 lookup tables in a snowflake schema: a lookup table for year, a lookup table for month, a lookup table for week, and a lookup table for day. Year is connected to Month, which is then connected to Day. Week is only connected to Day. The sample snowflake schema figure illustrates these relationships in the Time Dimension.

The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and joins against smaller lookup tables.
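The snowflaked Time Dimension described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a reference design: the table names (year_lookup, month_lookup, week_lookup, day_lookup, sales_fact), the column names, and the sample rows are all assumptions made for the example. The text itself only specifies the four levels and how they connect.

```python
import sqlite3


def build_snowflake():
    """Create an in-memory snowflaked Time Dimension plus a tiny fact table.

    All table names, column names, and rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hierarchy 1: Year -> Month -> Day
        CREATE TABLE year_lookup  (year_key INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE month_lookup (month_key INTEGER PRIMARY KEY,
                                   month_name TEXT,
                                   year_key INTEGER REFERENCES year_lookup(year_key));
        -- Hierarchy 2: Week -> Day (Week is connected only to Day)
        CREATE TABLE week_lookup  (week_key INTEGER PRIMARY KEY, week_number INTEGER);
        CREATE TABLE day_lookup   (day_key INTEGER PRIMARY KEY,
                                   calendar_date TEXT,
                                   month_key INTEGER REFERENCES month_lookup(month_key),
                                   week_key  INTEGER REFERENCES week_lookup(week_key));
        -- The fact table references only the lowest level, Day.
        CREATE TABLE sales_fact   (day_key INTEGER REFERENCES day_lookup(day_key),
                                   sales_amount REAL);
    """)
    conn.execute("INSERT INTO year_lookup VALUES (1, 2003)")
    conn.executemany("INSERT INTO month_lookup VALUES (?, ?, ?)",
                     [(1, "January", 1), (2, "February", 1)])
    conn.executemany("INSERT INTO week_lookup VALUES (?, ?)", [(1, 1), (2, 6)])
    conn.executemany("INSERT INTO day_lookup VALUES (?, ?, ?, ?)",
                     [(1, "2003-01-02", 1, 1), (2, "2003-02-03", 2, 2)])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?)",
                     [(1, 100.0), (1, 50.0), (2, 25.0)])
    return conn


def sales_by_month(conn):
    """Roll sales up the Year -> Month -> Day hierarchy (three joins)."""
    return conn.execute("""
        SELECT y.year, m.month_name, SUM(f.sales_amount)
        FROM sales_fact f
        JOIN day_lookup   d ON d.day_key   = f.day_key
        JOIN month_lookup m ON m.month_key = d.month_key
        JOIN year_lookup  y ON y.year_key  = m.year_key
        GROUP BY y.year, m.month_key, m.month_name
        ORDER BY m.month_key
    """).fetchall()


def sales_by_week(conn):
    """Roll sales up the Week -> Day hierarchy (two joins)."""
    return conn.execute("""
        SELECT w.week_number, SUM(f.sales_amount)
        FROM sales_fact f
        JOIN day_lookup  d ON d.day_key  = f.day_key
        JOIN week_lookup w ON w.week_key = d.week_key
        GROUP BY w.week_key, w.week_number
        ORDER BY w.week_key
    """).fetchall()
```

Calling sales_by_month(build_snowflake()) walks from the fact table through day, month, and year. In a star schema the same four levels would sit in a single time dimension table, so the roll-up would need only one join.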
The main disadvantage of the snowflake schema is the additional maintenance effort needed due to the increased number of lookup tables.

Slowly Changing Dimensions

The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below:

Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:

Customer Key    Name         State
1001            Christina    Illinois

At a later date, in January 2003, she moved to Los Angeles, California. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem.

There are in general three ways to solve this type of problem, and they are categorized as follows:

Type 1: The new record replaces the original record. No trace of the old record exists.

Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.

Type 3: The original record is modified to reflect the change.

We next take a look at each of the scenarios and at what the data model and the data look like for each of them. Finally, we compare and contrast the three alternatives.

Type 1 Slowly Changing Dimension

In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept.

In our example, recall we originally have the following table: