CH 4 DW
Slide 29- 2
Introduction, Definitions, and Terminology (1)
The data warehouse (DW) was proposed as a new type of database
management system that would keep no transactional data, only
summarized historical information for decision-making purposes.
DWs are modeled and structured differently, use different
techniques for storage and retrieval, and cater to a different set of
users.
W. H. Inmon characterized a data warehouse as:
“A subject-oriented, integrated, nonvolatile, time-variant
collection of data in support of management’s decisions.”
Slide 29- 3
Purpose of Data Warehousing
Traditional databases are not optimized for data access: they have to
balance the requirement of data access with the need to ensure the
integrity of data.
Most of the time, data warehouse users need only read access, but they
need that access to be fast over a large volume of data.
Most of the data required for data warehouse analysis comes from
multiple sources, which may include databases built on different data
models and sometimes files acquired from independent systems and platforms.
Slide 29- 4
Introduction, Definitions, and Terminology (2)
Data warehouses are databases that store and maintain
analytical data separately from transaction-oriented
databases for the purpose of decision support.
Traditional databases support online transaction processing (OLTP).
Data warehouses are for analytical applications, largely OLAP.
Applications that a data warehouse supports include:
OLAP (Online Analytical Processing) is a term used to
describe the analysis of complex data from the data warehouse.
DSS (Decision Support Systems), also known as EIS (Executive
Information Systems), support an organization’s leading decision
makers in making complex and important decisions.
Data Mining is used for knowledge discovery, the process of
searching data for unanticipated new knowledge (see Chapter 28).
Slide 29- 5
Conceptual Structure of Data Warehouse
Figure 29.1: Overview of the general architecture of a data warehouse.
Labeled component: Data Mining.
Slide 29- 6
Comparison with Traditional Databases
Slide 29- 7
Characteristics of Data Warehouses
Slide 29- 8
Classification of Data Warehouses
Slide 29- 9
Other Concepts common with Data Warehouses
Slide 29- 10
Data Modeling for Data Warehouses (1)
Slide 29- 11
Data Modeling for Data Warehouses (2)
Slide 29- 12
Functionality of a Data Warehouse
Slide 29- 13
The Pivot operation in a Data Warehouse
Slide 29- 17
Multi-dimensional Schemas (2)
Slide 29- 18
Multi-dimensional Schemas (3)
Star schema:
Consists of a fact table with a single table for each dimension.
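The star layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (sales_fact, product_dim, region_dim), the columns, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical star schema: one fact table plus a single table per dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
CREATE TABLE region_dim (
    region_id INTEGER PRIMARY KEY,
    region    TEXT
);
-- The fact table holds the measure (revenue) and one key per dimension.
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product_dim(product_id),
    region_id  INTEGER REFERENCES region_dim(region_id),
    revenue    REAL
);
INSERT INTO product_dim VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO region_dim VALUES (1, 'East'), (2, 'West');
INSERT INTO sales_fact VALUES
    (1, 1, 100.0), (2, 1, 50.0), (3, 2, 20.0), (1, 2, 30.0);
""")

# A typical analytical query: join the fact table to one dimension
# and aggregate the measure by a dimension attribute.
star_result = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(star_result)
```

Each dimension is reached from the fact table with a single join, which is the property that keeps star-schema queries simple.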
Slide 29- 19
Multi-dimensional Schemas (4)
Snowflake Schema:
It is a variation of the star schema in which the dimension tables
from a star schema are organized into a hierarchy by
normalizing them.
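As a rough sketch of that normalization, the example below splits a product dimension into a product table and a separate category table, so the dimension forms a two-level hierarchy. All table names, columns, and rows are hypothetical, chosen only to illustrate the idea with Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical snowflake schema: the product dimension is normalized so
# that category attributes live in their own table.
snow = sqlite3.connect(":memory:")
snow.executescript("""
CREATE TABLE category_dim (
    category_id INTEGER PRIMARY KEY,
    category    TEXT
);
CREATE TABLE product_dim (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES category_dim(category_id)
);
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product_dim(product_id),
    revenue    REAL
);
INSERT INTO category_dim VALUES (1, 'Hardware'), (2, 'Books');
INSERT INTO product_dim VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'Manual', 2);
INSERT INTO sales_fact VALUES (1, 100.0), (2, 50.0), (3, 20.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
snowflake_result = snow.execute("""
    SELECT c.category, SUM(f.revenue)
    FROM sales_fact AS f
    JOIN product_dim  AS p ON p.product_id  = f.product_id
    JOIN category_dim AS c ON c.category_id = p.category_id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()

print(snowflake_result)
```

The trade-off visible here is less redundancy in the dimension data at the cost of an extra join per hierarchy level.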
Fact Constellation
A fact constellation is a set of fact tables that share some
dimension tables. However, fact constellations limit the
possible queries for the warehouse.
The example shows the Product dimension table being shared
by two fact tables.
Slide 29- 21
Multi-dimensional Schemas (6)
Indexing
A data warehouse also utilizes indexing to support
high-performance access.
A technique called bitmap indexing constructs a bit vector
for each value in the domain being indexed.
Bitmap indexing works very well for domains of low cardinality.
(See the example of using a bitmap index in Section 19.8.)
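A minimal sketch of the bitmap idea, assuming each bit vector is stored as a Python integer whose bit i is set when row i holds the indexed value. The helper names (build_bitmap_index, matching_rows) and the sample columns are hypothetical and are not taken from Section 19.8.

```python
def build_bitmap_index(values):
    """Build one bit vector per distinct value in the column.

    Bit i of index[v] is 1 exactly when values[i] == v.
    """
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmask):
    """Return the row numbers whose bit is set in the bit vector."""
    return [row for row in range(bitmask.bit_length()) if (bitmask >> row) & 1]


# Two low-cardinality columns of a hypothetical five-row table.
gender = ["M", "F", "F", "M", "F"]
region = ["East", "East", "West", "West", "East"]

gender_idx = build_bitmap_index(gender)
region_idx = build_bitmap_index(region)

# A conjunctive condition (gender = 'F' AND region = 'East') becomes
# a single bitwise AND of two bit vectors.
rows = matching_rows(gender_idx["F"] & region_idx["East"])
print(rows)
```

With few distinct values, only a handful of bit vectors are needed, which is why the approach suits low-cardinality domains.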
Master Data Management (MDM)
The purpose of MDM is to define the standards, processes, policies,
and governance related to the critical data entities
of the organization.
Slide 29- 22
Building A Data Warehouse (1)
Slide 29- 23
Building A Data Warehouse (2)
Slide 29- 24
Data Acquisition (1)
Slide 29- 25
Data Acquisition (2)
Slide 29- 26
Storing Data in a Data Warehouse
Slide 29- 27
DW Design Considerations
Usage projections
The fit of the data model
Characteristics of available resources
Design of the metadata component
Modular component design
Design for manageability and change
Considerations of distributed and parallel architecture
Distributed DWs: Replication, Partitioning,
Communication, Consistency issues
Federated DWs: Decentralized federation of
autonomous DWs.
Slide 29- 28
Metadata Repositories
Slide 29- 29
Functionality of Data Warehouses
Slide 29- 30
Data Warehouse vs. Data Views
Slide 29- 31
Difficulties of implementing Data Warehouses
Slide 29- 32
Open Issues in Data Warehousing
Slide 29- 33
Future of Data Warehousing
Slide 29- 34
Future of Data Warehousing
See: https://www.gartner.com/doc/2057915/understanding-logical-data-warehouse-emerging
Slide 29- 35
Recap
Slide 29- 36