Building an Effective Data Warehouse Architecture - James Serra
Why use a data warehouse? What is the best methodology to use when creating a data warehouse? Should I use a normalized or dimensional approach? What is the difference between the Kimball and Inmon methodologies? Does the new Tabular model in SQL Server 2012 change things? What is the difference between a data warehouse and a data mart? Is there hardware that is optimized for a data warehouse? What if I have a ton of data? During this session James will help you to answer these questions.
Data Warehouse Design and Best Practices - Ivo Andreev
A data warehouse is a database designed for query and analysis rather than for transaction processing. An appropriate design leads to a scalable, balanced, and flexible architecture that is capable of meeting both present and long-term future needs. This session covers a comparison of the main data warehouse architectures together with best practices for the logical and physical design that support staging, loading, and querying.
Disclaimer:
The images, company, product, and service names used in this presentation are for illustration purposes only. All trademarks and registered trademarks are the property of their respective owners.
Data and images were collected from various sources on the Internet.
The intention is to present the big picture of Big Data and Hadoop.
Data Warehousing Trends, Best Practices, and Future Outlook - James Serra
Over the last decade, the 3Vs of data (Volume, Velocity, and Variety) have grown massively. The Big Data revolution has completely changed the way companies collect, analyze, and store data. Advancements in cloud-based data warehousing technologies have empowered companies to fully leverage big data without heavy investments of time and resources. But that doesn't mean building and managing a cloud data warehouse is free of challenges. From deciding on a service provider to designing the architecture, deploying a data warehouse tailored to your business needs is a strenuous undertaking. Looking to deploy a data warehouse to scale your company's data infrastructure, or still on the fence? In this presentation you will gain insights into current data warehousing trends, best practices, and the future outlook. Learn how to build your data warehouse with the help of real-life use cases and a discussion of commonly faced challenges. In this session you will learn:
- Choosing the best solution - Data Lake vs. Data Warehouse vs. Data Mart
- Choosing the best Data Warehouse design methodologies: Data Vault vs. Kimball vs. Inmon
- Step by step approach to building an effective data warehouse architecture
- Common reasons for the failure of data warehouse implementations and how to avoid them
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had... - Simplilearn
This presentation about Hive will help you understand the history of Hive, what Hive is, Hive architecture, data flow in Hive, Hive data modeling, Hive data types, the different modes in which Hive can run, the differences between Hive and an RDBMS, the features of Hive, and a demo of HiveQL commands. Hive is a data warehouse system used for querying and analyzing large datasets stored in HDFS. Hive uses a query language called HiveQL, which is similar to SQL; this SQL-like abstraction lets queries be expressed without implementing them against the low-level Java API (a brief HiveQL sketch follows the topic list below). Now, let us get started and understand Hadoop Hive in detail.
Below topics are explained in this Hive presentation:
1. History of Hive
2. What is Hive?
3. Architecture of Hive
4. Data flow in Hive
5. Hive data modeling
6. Hive data types
7. Different modes of Hive
8. Difference between Hive and RDBMS
9. Features of Hive
10. Demo on HiveQL
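To illustrate how close HiveQL is to SQL, here is a minimal, hypothetical sketch; the page_views table and its columns are made up for illustration and are not taken from the presentation.

-- HiveQL: define a table over delimited files and query it with SQL-like syntax
CREATE TABLE page_views (
    view_time TIMESTAMP,
    user_id   STRING,
    page_url  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- The query compiles to distributed jobs over data stored in HDFS
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;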
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of the Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
7. Understand Flume, its architecture, sources, sinks, channels, and configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand Resilient Distributed Datasets (RDDs) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, including creating, transforming, and querying DataFrames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
This document provides an introduction to big data, including its key characteristics of volume, velocity, and variety. It describes different types of big data technologies like Hadoop, MapReduce, HDFS, Hive, and Pig. Hadoop is an open source software framework for distributed storage and processing of large datasets across clusters of computers. MapReduce is a programming model used for processing large datasets in a distributed computing environment. HDFS provides a distributed file system for storing large datasets across clusters. Hive and Pig provide data querying and analysis capabilities for data stored in Hadoop clusters using SQL-like and scripting languages respectively.
This presentation explains the basics of the ETL (Extract-Transform-Load) concept in relation to data solutions such as data warehousing, data migration, and data integration. CloverETL is presented in detail as an example of an enterprise ETL tool. It also covers typical phases of data integration projects.
This document defines a data warehouse as a collection of corporate information derived from operational systems and external sources to support business decisions rather than operations. It discusses the purpose of data warehousing to realize the value of data and make better decisions. Key components like staging areas, data marts, and operational data stores are described. The document also outlines evolution of data warehouse architectures and best practices for implementation.
The ETL process in data warehousing involves extraction, transformation, and loading of data. Data is extracted from operational databases, transformed to match the data warehouse schema, and loaded into the data warehouse database. As source data and business needs change, the ETL process must also evolve to maintain the data warehouse's value as a business decision making tool. The ETL process consists of extracting data from sources, transforming it to resolve conflicts and quality issues, and loading it into the target data warehouse structures.
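As a minimal sketch of those three steps in generic SQL; all table and column names here are hypothetical and not taken from the document.

-- Extract: copy source rows into a staging table
INSERT INTO StageOrders (OrderID, CustomerCode, OrderDate, Amount)
SELECT OrderID, CustomerCode, OrderDate, Amount
FROM SourceOrders;

-- Transform and Load: resolve surrogate keys and load the warehouse fact table
INSERT INTO FactSales (DateKey, CustomerKey, SalesAmount)
SELECT d.DateKey, c.CustomerKey, s.Amount
FROM StageOrders s
JOIN DimDate     d ON d.FullDate     = s.OrderDate
JOIN DimCustomer c ON c.CustomerCode = s.CustomerCode;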
A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system.
This document discusses concepts related to data streams and real-time analytics. It begins with introductions to stream data models and sampling techniques. It then covers filtering, counting, and windowing queries on data streams. The document discusses challenges of stream processing like bounded memory and proposes solutions like sampling and sketching. It provides examples of applications in various domains and tools for real-time data streaming and analytics.
Hadoop MapReduce is an open source framework for distributed processing of large datasets across clusters of computers. It allows parallel processing of large datasets by dividing the work across nodes. The framework handles scheduling, fault tolerance, and distribution of work. MapReduce consists of two main phases: the map phase, where the data is processed as key-value pairs, and the reduce phase, where the outputs of the map phase are aggregated together. It provides an easy programming model for developers to write distributed applications for large-scale processing of structured and unstructured data.
In KDD2011, Vijay Narayanan (Yahoo!) and Milind Bhandarkar (Greenplum Labs, EMC) conducted a tutorial on "Modeling with Hadoop". This is the first half of the tutorial.
This document discusses data warehousing and OLAP (online analytical processing) technology. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data to support management decision making. It describes how data warehouses use a multi-dimensional data model with facts and dimensions to organize historical data from multiple sources for analysis. Common data warehouse architectures like star schemas and snowflake schemas are also summarized.
A data warehouse is a subject-oriented, integrated, time-variant collection of data that supports management's decision-making processes. It contains data extracted from various operational databases and data sources. The data is cleaned, transformed, integrated and loaded into the data warehouse for analysis. A data warehouse uses a multidimensional model with facts and dimensions to allow for complex analytical and ad-hoc queries from multiple perspectives. It is separately administered from operational databases to avoid impacting transaction processing systems and allow optimized access for decision support.
In this presentation, Raghavendra BM of Valuebound has discussed the basics of MongoDB - an open-source document database and leading NoSQL database.
----------------------------------------------------------
Get Socialistic
Our website: http://valuebound.com/
LinkedIn: http://bit.ly/2eKgdux
Facebook: https://www.facebook.com/valuebound/
Twitter: http://bit.ly/2gFPTi8
This document discusses big data and Hadoop. It defines big data and Hadoop, and explains how big data can transform businesses through predictive analytics, understanding markets and customers, and optimizing business processes. It also outlines the challenges of utilizing big data, including data, process, security, and privacy challenges. Hadoop is introduced as an open source framework for storing and processing big data across clustered systems, and some of the challenges in implementing Hadoop are discussed.
This is my presentation at SQLBits 8, Brighton, 9th April 2011. This session is about advanced dimensional modelling topics such as Fact Table Primary Key, Vertical Fact Tables, Aggregate Fact Tables, SCD Type 6, Snapshotting Transaction Fact Tables, 1 or 2 Dimensions, Dealing with Currency Rates, When to Snowflake, Dimensions with Multi Valued Attributes, Transaction-Level Dimensions, Very Large Dimensions, A Dimension With Only 1 Attribute, Rapidly Changing Dimensions, Banding Dimension Rows, Stamping Dimension Rows and Real Time Fact Table. Prerequisites: you need to have a basic knowledge of dimensional modelling and relational database design.
My name is Vincent Rainardi. I am a data warehouse & BI architect. I wrote a book on SQL Server data warehousing & BI, as well as many articles on my blog, www.datawarehouse.org.uk. I welcome questions and discussions on data warehousing on [email protected]. Enjoy the presentation.
Hadoop is an open-source framework for distributed storage and processing of large datasets across clusters of commodity hardware. It was created to support applications handling large datasets operating on many servers. Key Hadoop technologies include MapReduce for distributed computing, and HDFS for distributed file storage inspired by Google File System. Other related Apache projects extend Hadoop capabilities, like Pig for data flows, Hive for data warehousing, and HBase for NoSQL-like big data. Hadoop provides an effective solution for companies dealing with petabytes of data through distributed and parallel processing.
The document provides an introduction to data warehousing. It defines a data warehouse as a subject-oriented, integrated, time-varying, and non-volatile collection of data used for organizational decision making. It describes key characteristics of a data warehouse such as maintaining historical data, facilitating analysis to improve understanding, and enabling better decision making. It also discusses dimensions, facts, ETL processes, and common data warehouse architectures like star schemas.
This document outlines the objectives, key concepts, and curriculum for a Big Data and Hadoop training module. The objectives are to understand what Big Data is, the Hadoop ecosystem and its features, career opportunities, and the training curriculum. It defines Big Data, Hadoop, and the Hadoop ecosystem. It discusses the V's of Big Data and domains where Big Data is applicable. It also outlines job roles in the Big Data industry, potential employers, career paths, and the 10-module training curriculum covering topics like Hadoop, MapReduce, Pig, Hive, HBase, Zookeeper and Oozie.
The document discusses emerging trends in big data and analytics, including how expectations for business intelligence are changing with the growth of unstructured data sources. It covers challenges associated with integrating big data, and introduces concepts and tools like Hadoop, NoSQL databases, and textual ETL to address these challenges. The final sections discuss best practices for big data projects and provide examples of successful big data applications.
The document provides an overview of key concepts in data warehousing and business intelligence, including:
1) It defines data warehousing concepts such as the characteristics of a data warehouse (subject-oriented, integrated, time-variant, non-volatile), grain/granularity, and the differences between OLTP and data warehouse systems.
2) It discusses the evolution of business intelligence and key components of a data warehouse such as the source systems, staging area, presentation area, and access tools.
3) It covers dimensional modeling concepts like star schemas, snowflake schemas, and slowly and rapidly changing dimensions.
The document discusses data warehouses and their characteristics. A data warehouse integrates data from multiple sources and transforms it into a multidimensional structure to support decision making. It has a complex architecture including source systems, a staging area, operational data stores, and the data warehouse. A data warehouse also has a complex lifecycle as business rules change and new data requirements emerge over time, requiring the architecture to evolve.
Big data architectures and the data lake - James Serra
The document provides an overview of big data architectures and the data lake concept. It discusses why organizations are adopting data lakes to handle increasing data volumes and varieties. The key aspects covered include:
- Defining top-down and bottom-up approaches to data management
- Explaining what a data lake is and how Hadoop can function as the data lake
- Describing how a modern data warehouse combines features of a traditional data warehouse and data lake
- Discussing how federated querying allows data to be accessed across multiple sources
- Highlighting benefits of implementing big data solutions in the cloud
- Comparing shared-nothing, massively parallel processing (MPP) architectures to symmetric multi-processing (SMP) architectures
Spark is an open-source distributed computing framework used for processing large datasets. It allows for in-memory cluster computing, which enhances processing speed. Spark core components include Resilient Distributed Datasets (RDDs) and a directed acyclic graph (DAG) that represents the lineage of transformations and actions on RDDs. Spark Streaming is an extension that allows for processing of live data streams with low latency.
The document discusses Hadoop, an open-source software framework that allows distributed processing of large datasets across clusters of computers. It describes Hadoop as having two main components - the Hadoop Distributed File System (HDFS) which stores data across infrastructure, and MapReduce which processes the data in a parallel, distributed manner. HDFS provides redundancy, scalability, and fault tolerance. Together these components provide a solution for businesses to efficiently analyze the large, unstructured "Big Data" they collect.
Dimensional data modeling is a technique for database design intended to support analysis and reporting. It contains dimension tables that provide context about the business and fact tables that contain measures. Dimension tables describe attributes and may include hierarchies, while fact tables contain measurable events linked to dimensions. When designing a dimensional model, the business process, grain, dimensions, and facts are identified. Star and snowflake schemas are common types that differ in normalization of the dimensions. Slowly changing dimensions also must be accounted for.
The document discusses operational data warehousing and the Data Vault model. It begins with an agenda for the presentation and introduction of the speaker. It then provides a short review of the Data Vault model. The remainder of the document discusses operational data warehousing, how the Data Vault model is well-suited for this purpose, and the benefits it provides including flexibility, scalability, and productivity. It also discusses how tools and technologies are advancing to support automation and self-service business intelligence using an operational data warehouse architecture based on the Data Vault model.
The document discusses Business Intelligence (BI) and defines it as technologies, applications, and practices for collecting, integrating, analyzing, and presenting business information to support better business decision making. It then lists some common questions BI helps answer related to understanding what happened in the past, present, and future. Finally, it discusses how BI can help companies adapt quickly to changing customer demands and be better informed about competitors' actions.
The document discusses dimensional modeling and data warehousing. It describes how dimensional models are designed for understandability and ease of reporting rather than updates. Key aspects include facts and dimensions, with facts being numeric measures and dimensions providing context. Slowly changing dimensions are also covered, with types 1-3 handling changes to dimension attribute values over time.
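For concreteness, a Type 2 change is usually handled by expiring the current dimension row and inserting a new version. The sketch below uses generic SQL with hypothetical column names; the dimension's surrogate key is assumed to be auto-generated.

-- Type 1 would simply overwrite the attribute; Type 2 keeps history:
UPDATE DimCustomer
SET EndDate = CURRENT_DATE, IsCurrent = 0
WHERE CustomerCode = 'C-1001' AND IsCurrent = 1;

INSERT INTO DimCustomer (CustomerCode, City, StartDate, EndDate, IsCurrent)
VALUES ('C-1001', 'Boston', CURRENT_DATE, NULL, 1);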
Business intelligence involves analyzing large datasets to help with decision making. It is commonly used in retail, banking, IT security, and online marketing. The process involves extracting data from multiple sources into a data warehouse, where it is transformed and organized. Data is then mined from the warehouse to generate insights through techniques like forecasting, segmentation, and market basket analysis. A data warehouse consists of fact and dimension tables. Facts contain measures while dimensions provide context for analyzing facts. Data warehouses can have a star or snowflake schema to organize this data.
Why BI?
Performance management
Identify trends
Cash flow trend
Fine-tune operations
Sales pipeline analysis
Future projections
Business Forecasting
Decision Making Tools
Convert data into information
How to Think?
What happened?
What is happening?
Why did it happen?
What will happen?
What do I want to happen?
This document provides an overview of data warehousing concepts, including definitions, architectures, design processes, modeling techniques, and types of dimensions, facts, and data marts. It defines a data warehouse as a subject-oriented collection of integrated and non-volatile data used for analysis. The document outlines the stages of a data warehouse architecture and the processes of identifying key dimensions and metrics for a subject area. It also describes star and snowflake schemas, and different types of dimensions, facts, and loading approaches.
The document discusses business intelligence (BI) tools, data warehousing concepts like star schemas and snowflake schemas, data quality measures, master data management (MDM), and business intelligence competency centers (BICC). It provides examples of BI tools and industries that use BI. It defines what a BICC is and some of the typical jobs in a BICC like business analyst and BI programmer.
The document provides an overview of key data warehousing concepts. It defines a data warehouse as a single, consistent store of data obtained from various sources and made available to users in a format they can understand for business decision making. The document outlines some common questions end users may have that a data warehouse can help answer. It also discusses the differences between online transaction processing (OLTP) systems and data warehouses, including that data warehouses integrate historical data from various sources and are optimized for analysis rather than transactions.
The document discusses the need for data warehousing and provides examples of how data warehousing can help companies analyze data from multiple sources to help with decision making. It describes common data warehouse architectures like star schemas and snowflake schemas. It also outlines the process of building a data warehouse, including data selection, preprocessing, transformation, integration and loading. Finally, it discusses some advantages and disadvantages of data warehousing.
The document discusses the data warehouse lifecycle and key components. It covers topics like source systems, data staging, presentation area, business intelligence tools, dimensional modeling concepts, fact and dimension tables, star schemas, slowly changing dimensions, dates, hierarchies, and physical design considerations. Common pitfalls discussed include becoming overly focused on technology, tackling too large of projects, and neglecting user acceptance.
The document discusses data warehousing and OLAP technology for data mining. It defines a data warehouse as a subject-oriented, integrated, time-variant, and nonvolatile collection of data to support management decision making. It describes how a data warehouse uses a multi-dimensional data model with dimensions and measures. It also discusses efficient computation of data cubes, OLAP operations, and further developments in data cube technology like discovery-driven and multi-feature cubes to support data mining applications from information processing to analytical processing and knowledge discovery.
The document discusses advances in database querying and summarizes key topics including data warehousing, online analytical processing (OLAP), and data mining. It describes how data warehouses integrate data from various sources to enable decision making, and how OLAP tools allow users to analyze aggregated data and model "what-if" scenarios. The document also covers data transformation techniques used to build the data warehouse.
MSBI online training offered by Quontra Solutions covers both MSBI training and placement support, including help with resume preparation and mock interviews.
Emphasis is given to the important topics that are required and most used in real-time projects. Quontra Solutions is an online training provider focused on delivering effective and competent training to both students and professionals who are eager to enrich their technical skills.
Become BI Architect with 1KEY Agile BI Suite - OLAP - Dhiren Gala
Business intelligence uses applications and technologies to analyze data and help users make better business decisions. Online transaction processing (OLTP) is used for daily operations like processing, while online analytical processing (OLAP) is used for data analysis and decision making. Data warehouses integrate data from different sources to provide a centralized system for analysis and reporting. Dimensional modeling approaches like star schemas and snowflake schemas organize data to support OLAP.
Leveraging AI to Simplify and Speed Up ETL Testing - RTTS
The data validation and ETL testing process is difficult and time-consuming without an automated ETL testing solution like QuerySurge.
Creating tests between source and target data stores requires:
- Strong SQL skills
- Lots of time
QuerySurge’s new AI-powered technology is a generative artificial intelligence module that automatically creates data validation tests, including transformational tests, based on data mappings.
QuerySurge AI provides a radical shift in ETL testing. The average data warehouse project has between 250 and 1,500 data mappings, and test creation requires approximately 1 hour per mapping.
With QuerySurge AI, test creation happens in minutes, converting data mappings into tests written in the data store’s native SQL with little to no human intervention from this low-code or no-code solution.
QuerySurge AI leverages artificial intelligence to automatically convert data mappings into data validation and ETL tests in each data store’s native SQL with extremely high accuracy.
Benefits from QuerySurge AI include:
- Dramatically decreases the time to create tests and analyze results
- Improves data quality due to a much faster and more thorough testing cycle
- Reduces the need for skilled testers
- Facilitates increase in ETL testing coverage to upwards of 100%
Learn more about QuerySurge at www.QuerySurge.com
Speakers
------------------------------------------------------------------------------------------------------
Matthew Moss
Matt joined RTTS in 2010 and spent the first 7 years implementing data quality and performance engineering on numerous projects. Since 2017, he has been part of the QuerySurge team and is responsible for product direction. Matt graduated from SUNY Polytechnic Institute in 2008 with a BS in Computer Information Science.
Mike Calabrese
Mike began his career in 2009, when he joined RTTS as a Test Engineer. He now has over a decade of experience successfully implementing automated functional, data validation, and ETL testing solutions for multiple clients across many industry verticals. Mike is a technical expert on QuerySurge, RTTS' flagship data testing solution, and he supports clients around the world with their QuerySurge implementations. Mike graduated from Hofstra University with a Bachelor of Science in Computer Engineering.
The document discusses decision support, data warehousing, and online analytical processing (OLAP). It outlines the evolution of decision support from batch reporting in the 1960s to modern data warehousing with OLAP engines. Key aspects covered include the differences between OLTP and OLAP systems, data warehouse architecture including star schemas, and approaches to OLAP including relational and multidimensional servers.
By James Francis, CEO of Paradigm Asset Management
In the landscape of urban safety innovation, Mt. Vernon is emerging as a compelling case study for neighboring Westchester County cities. The municipality’s recently launched Public Safety Camera Program not only represents a significant advancement in community protection but also offers valuable insights for New Rochelle and White Plains as they consider their own safety infrastructure enhancements.
GenAI for Quant Analytics: survey-analytics.aiInspirient
Pitched at the Greenbook Insight Innovation Competition as part of IIEX North America 2025 on 30 April 2025 in Washington, D.C.
Join us at survey-analytics.ai!
Telangana State, India's newest state, carved out of the erstwhile state of Andhra Pradesh in 2014, has launched the Water Grid Scheme named 'Mission Bhagiratha (MB)' to seek a permanent and sustainable solution to the drinking water problem in the state. MB is designed to provide potable drinking water to every household on their premises through piped water supply (PWS) by 2018. The vision of the project is to ensure a safe and sustainable piped drinking water supply from surface water sources.
Mieke Jans is a Manager at Deloitte Analytics Belgium. She learned about process mining from her PhD supervisor while she was collaborating with a large SAP-using company for her dissertation.
Mieke extended her research topic to investigate the data availability of process mining data in SAP and the new analysis possibilities that emerge from it. It took her 8-9 months to find the right data and prepare it for her process mining analysis. She needed insights from both process owners and IT experts. For example, one person knew exactly how the procurement process took place at the front end of SAP, and another person helped her with the structure of the SAP-tables. She then combined the knowledge of these different persons.
2. What is a Data Warehouse?
A Simple Relational Database
Different Architecture
Less Normalized
Analytical Design
Facts and Dimensions
Non-Operational
5. Benefits of a Data Warehouse
Centralized Data Source
Enhanced Business Intelligence
Increased Query and System Performance
Business Intelligence from Multiple Sources
Timely Access to Data
Enhanced Data Quality and Consistency
Historical Intelligence
High Return on Investment
6. OLTP vs. OLAP
Online Transaction Processing (OLTP)
Optimized for Transactions
Concurrent Operations
Consistent and Accurate
Real-Time Data
Short Life Span
Too Many Small Tables
Normalized
Online Analytical Processing (OLAP)
Optimized for Analysis
Large Amounts of Historical Data
Fed from OLTP Databases
Less Normalized (2NF)
Facts and Dimensions
Not Real-Time
Extract, Transform and Load (ETL)
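The difference in access patterns can be sketched in generic SQL; the table and column names (Orders, FactSales, DimDate) are hypothetical examples, not taken from the slides.

-- OLTP: a short, single-row transaction against a normalized table
UPDATE Orders
SET Status = 'Shipped'
WHERE OrderID = 10248;

-- OLAP: an analytical scan that aggregates history from a fact table and a dimension
SELECT d.CalendarYear,
       SUM(f.SalesAmount) AS TotalSales
FROM FactSales f
JOIN DimDate d ON f.DateKey = d.DateKey
GROUP BY d.CalendarYear
ORDER BY d.CalendarYear;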
7. Relational vs. Multidimensional DWs
Relational DWs
Similar to OLTP
Simpler Structure
Query Using SQL
Less Processing Cost
Easier Maintenance
Best for Real-Time Ad-hoc Reporting
Multidimensional DWs
Different Structure (Cubes)
Different Query Language (MDX)
Much Faster for Extra-Large Data Sets
Pre-Calculated Measures, KPIs
Optimized to Write and Answer Complicated Requests
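The speed advantage of pre-calculated measures can be illustrated with a hand-built aggregate in generic SQL; a cube engine does this internally and is queried with MDX instead. Table names are hypothetical, and some engines (e.g. SQL Server) spell the first statement as SELECT ... INTO rather than CREATE TABLE AS.

-- Pre-aggregate once, at load time
CREATE TABLE AggSalesByYearProduct AS
SELECT d.CalendarYear,
       f.ProductKey,
       SUM(f.SalesAmount)   AS SalesAmount,
       SUM(f.OrderQuantity) AS OrderQuantity
FROM FactSales f
JOIN DimDate d ON f.DateKey = d.DateKey
GROUP BY d.CalendarYear, f.ProductKey;

-- Later queries read the small aggregate instead of scanning the full fact table
SELECT CalendarYear, SUM(SalesAmount) AS TotalSales
FROM AggSalesByYearProduct
GROUP BY CalendarYear;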
12. Metrics / Measures
Measurable Columns
Things we’re actually looking for
They are usually aggregated (Sum, Avg, Min, Max…)
Examples:
Sales Amount
Order Quantity
Customer Count
Tax Paid
Etc.
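The measures listed above map directly to aggregate expressions in a query; here is a minimal sketch in generic SQL against a hypothetical FactSales table.

-- Measures are numeric columns that are aggregated at query time
SELECT SUM(f.SalesAmount)            AS SalesAmount,
       SUM(f.OrderQuantity)          AS OrderQuantity,
       COUNT(DISTINCT f.CustomerKey) AS CustomerCount,
       SUM(f.TaxAmount)              AS TaxPaid
FROM FactSales f;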
13. Facts
Describing Measures by Dimensions
Tables Containing Multiple Dimension Keys and Measure Values
Usually the Primary Key is all the Dimension Keys or the Event Key
Dimension Keys are also Foreign Keys to Dimension Tables
Facts usually express real events that happened at a specific time
Example:
We sold 2 Toyotas to John Smith in New York yesterday for $20,000.00 each and gave him a $2,000.00 overall discount, so the total was $38,000.00.
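The car-sale example above can be written as one row in a fact table whose columns are dimension keys plus measures. The DDL below is a hypothetical sketch: the table name, key names, and data types are assumptions, not part of the slides.

CREATE TABLE FactCarSales (
    DateKey        INT NOT NULL,           -- FK to the Date dimension (yesterday)
    ProductKey     INT NOT NULL,           -- FK to the Product dimension (Toyota)
    CustomerKey    INT NOT NULL,           -- FK to the Customer dimension (John Smith)
    GeographyKey   INT NOT NULL,           -- FK to the Geography dimension (New York)
    OrderQuantity  INT NOT NULL,           -- measure: 2
    UnitPrice      DECIMAL(19,4) NOT NULL, -- measure: 20,000.00
    DiscountAmount DECIMAL(19,4) NOT NULL, -- measure: 2,000.00
    SalesAmount    DECIMAL(19,4) NOT NULL, -- measure: 38,000.00
    PRIMARY KEY (DateKey, ProductKey, CustomerKey, GeographyKey)
);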
16. Fact / Dimension Relationship
Star Schema
Facts connect DIRECTLY to each Dimension with a single relation.
Simple Structure
Easier to Query
Not the best approach for complicated Dimensions
No Built-in Drill-Down
Snowflake Schema / Dimensions
Dimensions are HIERARCHICALLY connected to each other.
Facts connect to one of the Dimensions and use the other ones through the connected dimension.
More Complicated
Built-in Drill-Down
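The two schemas differ mainly in how the joins are written; the sketch below uses generic SQL with hypothetical table names.

-- Star schema: the fact joins directly to each dimension
SELECT d.CalendarYear,
       c.CustomerName,
       SUM(f.SalesAmount) AS SalesAmount
FROM FactSales f
JOIN DimDate     d ON f.DateKey     = d.DateKey
JOIN DimCustomer c ON f.CustomerKey = c.CustomerKey
GROUP BY d.CalendarYear, c.CustomerName;

-- Snowflake: the fact reaches Geography through the Customer dimension
SELECT g.Country,
       SUM(f.SalesAmount) AS SalesAmount
FROM FactSales f
JOIN DimCustomer  c ON f.CustomerKey  = c.CustomerKey
JOIN DimGeography g ON c.GeographyKey = g.GeographyKey
GROUP BY g.Country;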
20. Designing Data Warehouse
Fact Oriented
You have and know the business facts
Design the Facts, fill them with Measures and then Dimensions
Might need a couple of iterations
You will end up with Facts with real Primary Keys
Ex. Internet Sales (we talked about it before)
Measure Group Oriented
You only know/want your Measures
Write down all the business measures you need
Connect them to Dimensions
Group them by their meaning and common Dimensions
You will end up with Facts with Dimension-Combined Primary Keys
Ex. Employee Offdays (TimeKey, EmployeeKey, ReasonKey, OffDayCount)
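The Employee Offdays example translates into a fact table whose primary key is the combination of its dimension keys. Only the key/measure structure comes from the slide; the column types below are assumptions.

CREATE TABLE FactEmployeeOffDays (
    TimeKey     INT NOT NULL,  -- FK to the Date/Time dimension
    EmployeeKey INT NOT NULL,  -- FK to the Employee dimension
    ReasonKey   INT NOT NULL,  -- FK to the Off-day Reason dimension
    OffDayCount INT NOT NULL,  -- the measure
    PRIMARY KEY (TimeKey, EmployeeKey, ReasonKey)
);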
23. About Me
Amin Choroomi
CTO & Co-Founder at vdash
Software Developer, Teacher and Consultant
Data Visualization, Analytics, Dashboards
Data Warehousing, Integration, Business Intelligence
http://www.vdash.ir
[email protected]
[email protected]
https://linkedin.com/in/choroomi
@aminchoroomi