ICS 2408 - Lecture 3 and 4 - Data Warehouse and OLAP
February 19, 2024 Moso J : Dedan Kimathi University 12
A Sample Data Cube
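Figure: sample data cube with a Product axis (PC, VCR, sum) and a Country axis (Kenya, India, U.K, sum); the "sum" entries hold the totals along each axis.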
1-D cuboids: product, date, country
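The cuboids of a cube correspond to the possible group-by subsets of its dimensions. The following is a minimal sketch, assuming a three-dimensional cube over product, date and country; the fact rows, the `units_sold` measure and the helper name `cuboid` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical base facts: (product, date, country, units_sold).
facts = [
    ("PC", "2024-Q1", "Kenya", 10),
    ("PC", "2024-Q1", "India", 5),
    ("VCR", "2024-Q1", "Kenya", 3),
    ("VCR", "2024-Q2", "U.K", 7),
]
dimensions = ("product", "date", "country")


def cuboid(group_by):
    """Sum units_sold, grouping only by the dimensions named in group_by."""
    idx = [dimensions.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


# All 2^3 = 8 cuboids, from the 0-D apex (grand total) to the 3-D base cuboid.
lattice = {
    group_by: cuboid(group_by)
    for k in range(len(dimensions) + 1)
    for group_by in combinations(dimensions, k)
}

one_d = [g for g in lattice if len(g) == 1]
print(one_d)                  # the three 1-D cuboids
print(lattice[("product",)])  # totals per product
print(lattice[()])            # apex cuboid: the grand total
```

With n dimensions and no hierarchies there are 2^n such cuboids, which is why real warehouses usually materialize only some of them.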
Database organization
Must look like the business
Must be simple
Schema Types
Star Schema
Snowflake schema
Dimension tables
Define business in terms already familiar to users
heavily indexed
typical dimensions
Central table
mostly raw numeric items
Star schema: a single fact table in the middle, with one dimension table
for each dimension.
It does not capture hierarchies directly.
Snowflake schema: a refinement of the star schema in which some
dimensional hierarchies are normalized into a set of smaller dimension
tables, forming a shape similar to a snowflake.
It is easy to maintain and saves storage.
Fact constellation: multiple fact tables share dimension tables. Because it
can be viewed as a collection of stars, it is also called a galaxy schema.
Example of Star Schema
Sales Fact Table
Keys: time_key, item_key, branch_key, location_key
Measures: units_sold, dollars_sold, avg_sales
Dimension tables
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city, state_or_province, country
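The star schema above can be sketched as relational tables. This is a minimal illustration using Python's built-in `sqlite3` module, not a production design: the column types, the sample rows and the table name `sales` are assumptions, and `avg_sales` is left out of the stored columns because an average can be derived at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time     (time_key INTEGER PRIMARY KEY, day INTEGER,
                       day_of_the_week TEXT, month INTEGER,
                       quarter TEXT, year INTEGER);
CREATE TABLE item     (item_key INTEGER PRIMARY KEY, item_name TEXT,
                       brand TEXT, type TEXT, supplier_type TEXT);
CREATE TABLE branch   (branch_key INTEGER PRIMARY KEY, branch_name TEXT,
                       branch_type TEXT);
CREATE TABLE location (location_key INTEGER PRIMARY KEY, street TEXT,
                       city TEXT, state_or_province TEXT, country TEXT);
-- Central fact table: one foreign key per dimension, plus the measures.
CREATE TABLE sales (
    time_key     INTEGER REFERENCES time(time_key),
    item_key     INTEGER REFERENCES item(item_key),
    branch_key   INTEGER REFERENCES branch(branch_key),
    location_key INTEGER REFERENCES location(location_key),
    units_sold   INTEGER,
    dollars_sold REAL
);
""")

# Hypothetical sample rows, for illustration only.
conn.execute("INSERT INTO time VALUES (1, 19, 'Monday', 2, 'Q1', 2024)")
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?, ?)",
                 [(1, "PC", "BrandA", "computer", "wholesale"),
                  (2, "VCR", "BrandB", "video", "retail")])
conn.execute("INSERT INTO branch VALUES (1, 'Main', 'store')")
conn.executemany("INSERT INTO location VALUES (?, ?, ?, ?, ?)",
                 [(1, "1 First St", "Nyeri", "Nyeri County", "Kenya"),
                  (2, "2 Second St", "Mumbai", "Maharashtra", "India")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 1, 1, 1, 10, 5000.0),
                  (1, 2, 1, 1, 4, 800.0),
                  (1, 1, 1, 2, 6, 3000.0)])

# A typical star join: the fact table joined to two of its dimensions.
rows = conn.execute("""
    SELECT l.country, i.item_name, SUM(s.units_sold), SUM(s.dollars_sold)
    FROM sales AS s
    JOIN item AS i     ON i.item_key = s.item_key
    JOIN location AS l ON l.location_key = s.location_key
    GROUP BY l.country, i.item_name
    ORDER BY l.country, i.item_name
""").fetchall()
print(rows)
```

Every dimension is reached with a single join from the fact table, which is the property that keeps star-schema queries simple.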
Example of Snowflake Schema
Sales Fact Table
Keys: time_key, item_key, branch_key, location_key
Measures: units_sold, dollars_sold, avg_sales
Dimension tables
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, state_or_province, country
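The effect of snowflaking can be seen on the location dimension alone. The sketch below, again using Python's `sqlite3`, assumes that location refers to a separate city table through `city_key`; the fact table is reduced to one key and one measure, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The city attributes are normalized out of the location table.
CREATE TABLE city     (city_key INTEGER PRIMARY KEY, city TEXT,
                       state_or_province TEXT, country TEXT);
CREATE TABLE location (location_key INTEGER PRIMARY KEY, street TEXT,
                       city_key INTEGER REFERENCES city(city_key));
CREATE TABLE sales    (location_key INTEGER REFERENCES location(location_key),
                       units_sold INTEGER);
""")

# Hypothetical rows: two streets share one city record.
conn.execute("INSERT INTO city VALUES (1, 'Nyeri', 'Nyeri County', 'Kenya')")
conn.executemany("INSERT INTO location VALUES (?, ?, ?)",
                 [(1, "1 First St", 1), (2, "2 Second St", 1)])
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (2, 5)])

# Reaching country now takes two joins instead of one.
snow_rows = conn.execute("""
    SELECT c.country, SUM(s.units_sold)
    FROM sales AS s
    JOIN location AS l ON l.location_key = s.location_key
    JOIN city AS c     ON c.city_key = l.city_key
    GROUP BY c.country
""").fetchall()
print(snow_rows)
```

The city details are stored once rather than repeated for every street, which saves storage and eases maintenance at the cost of an extra join per normalized level.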
Structure-oriented: depends on the number of layers used by the architecture
Source layer: operational data, external data
Enterprise warehouse
collects all of the information about subjects spanning the entire
organization
Data Mart
a subset of corporate-wide data that is of value to a specific
Data extraction
get data from multiple, heterogeneous, and external sources
Data cleaning
detect errors in the data and rectify them when possible
Data transformation
convert data from legacy or host format to warehouse format
Load
sort, summarize, consolidate, compute views, check integrity,
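The extraction, cleaning, transformation and load steps can be sketched as a small pipeline. This is an illustrative outline only: the two sources, their record formats, the rule for dropping dirty rows and the function names are all assumptions, not part of the lecture.

```python
from collections import defaultdict

# Hypothetical rows from two heterogeneous sources.
operational_rows = [
    {"item": "PC", "country": "kenya", "units": "10"},
    {"item": "VCR", "country": "Kenya ", "units": "3"},
    {"item": "PC", "country": "India", "units": "n/a"},  # dirty measure
]
external_rows = [("PC", "U.K", 7)]  # a different (tuple) format


def extract():
    """Extraction: gather records from every source into one common shape."""
    for r in operational_rows:
        yield r["item"], r["country"], r["units"]
    yield from external_rows


def clean(records):
    """Cleaning: rectify what can be fixed and drop what cannot."""
    for item, country, units in records:
        try:
            units = int(units)
        except (TypeError, ValueError):
            continue  # unrecoverable measure: skip the record
        yield item, country.strip().title(), units


def transform(records):
    """Transformation: convert to the (assumed) warehouse row format."""
    for item, country, units in records:
        yield {"item_name": item, "country": country, "units_sold": units}


def load(rows):
    """Load: summarize by (country, item) and return the rows sorted."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["country"], row["item_name"])] += row["units_sold"]
    return sorted((c, i, u) for (c, i), u in totals.items())


warehouse = load(transform(clean(extract())))
print(warehouse)
```

Each stage is a generator, so records stream through the pipeline one at a time until the load step consolidates them.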
Business data: business terms and definitions, ownership of data, charging policies
Figure: sales cube with dimensions Product (Juice, Cola, Milk, Cream), Region (NY, LA, SF) and Date (Month); sample cell values 10, 47, 30, 12.
“Slicing and Dicing”
Figure: cube with product categories Household, Telecomm, Video, Audio and regions Europe, Far East, India.
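Slicing fixes one dimension to a single value, while dicing selects a sub-cube by restricting several dimensions at once. The following is a minimal sketch, assuming a cube stored as a dictionary keyed by (product, region, month); the cell values and the helper names `slice_` and `dice` are hypothetical.

```python
# Hypothetical cube cells: (product, region, month) -> sales.
cube = {
    ("Video", "Europe", "Jan"): 12,
    ("Video", "India", "Jan"): 8,
    ("Audio", "Europe", "Jan"): 5,
    ("Audio", "Far East", "Feb"): 9,
    ("Telecomm", "India", "Feb"): 4,
}
DIMS = ("product", "region", "month")


def slice_(cells, dim, value):
    """Fix one dimension to a single value and drop it from the coordinates."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == value}


def dice(cells, **allowed):
    """Keep cells whose coordinate on each named dimension is in the allowed set."""
    return {
        key: value
        for key, value in cells.items()
        if all(key[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


jan = slice_(cube, "month", "Jan")  # a 2-D slice: (product, region)
sub = dice(cube, product={"Video", "Audio"}, region={"Europe", "India"})
print(jan)
print(sub)
```

The slice has one dimension fewer than the cube, whereas the dice keeps all three dimensions and only narrows their ranges.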
Roll-up and Drill Down
Figure: sales hierarchy from the higher level of aggregation down to low-level details: Sales Channel, Region, Country, State, Location Address, Sales Representative. Roll-up moves up the hierarchy; drill-down moves down.
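Roll-up and drill-down amount to re-aggregating the same detail rows at a different level of a hierarchy. The sketch below assumes a three-level slice of the hierarchy (state, country, region); the sales rows and the function names are hypothetical.

```python
from collections import defaultdict

# Part of the hierarchy, ordered from low-level detail to higher aggregation.
LEVELS = ("state", "country", "region")

# Hypothetical detail rows recorded at the state level.
sales = [
    {"state": "Nyeri", "country": "Kenya", "region": "Africa", "units": 10},
    {"state": "Nairobi", "country": "Kenya", "region": "Africa", "units": 15},
    {"state": "Maharashtra", "country": "India", "region": "Asia", "units": 7},
]


def aggregate(rows, level):
    """Total the units for each member of the given hierarchy level."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level]] += row["units"]
    return dict(totals)


def roll_up(level):
    """Return the next, more aggregated level (or the same level at the top)."""
    i = LEVELS.index(level)
    return LEVELS[i + 1] if i + 1 < len(LEVELS) else level


def drill_down(level):
    """Return the next, more detailed level (or the same level at the bottom)."""
    i = LEVELS.index(level)
    return LEVELS[i - 1] if i > 0 else level


by_country = aggregate(sales, roll_up("state"))    # state -> country
by_region = aggregate(sales, roll_up("country"))   # country -> region
by_state = aggregate(sales, drill_down("country")) # country -> state
print(by_country, by_region, by_state)
```

Drill-down is only possible because the detail rows are retained; a roll-up computed from already summarized data cannot be reversed.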
Data Warehouse Usage