
Topic 9 - Data Warehousing

The document provides an introduction to data warehousing including its objectives, architecture, components and differences between data warehouses and data marts. It discusses extracting, integrating and loading data from various sources into a centralized warehouse to enable complex analysis and reporting across business dimensions.

Uploaded by

miragelimited91

Introduction To

Data Warehousing

Intro to Data Warehousing 1


Objectives of Today’s Businesses
• Access and combine data from a variety of
data stores
• Perform complex data analysis across
these data stores
• Create multidimensional views of data and
its metadata
• Easily summarize and roll up the
information across subject areas and
business dimensions
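
The roll-up objective above can be illustrated with a minimal sketch. The sales facts, the dimension names, and the `roll_up` helper are hypothetical, invented only for illustration; the idea is that summarizing means summing a measure while keeping a chosen subset of business dimensions.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
SALES = [
    ("North", "Widget", "Q1", 100),
    ("North", "Gadget", "Q1", 50),
    ("South", "Widget", "Q1", 70),
    ("South", "Widget", "Q2", 30),
]

# Names of the dimension columns, in the same order as in each fact row.
DIMENSIONS = ("region", "product", "quarter")

def roll_up(facts, keep):
    """Sum the amount column, keeping only the dimensions listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)

by_region = roll_up(SALES, ["region"])
by_product_quarter = roll_up(SALES, ["product", "quarter"])
grand_total = roll_up(SALES, [])  # every dimension rolled up
```

Dropping a dimension from `keep` moves the summary one level up; an empty list gives the grand total.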

These objectives cannot be met easily
• Data is scattered in many types of
incompatible structures.
• Lack of documentation has prevented the
integration of older legacy systems with newer
systems
• Internet software such as search engines
needs to be improved
• Accurate and accessible metadata across
multiple organizations is hard to get

Four Levels of Analytical Processing
• In a modern organization, at least four levels
of analytical processing should be
supported by information systems
– First level: Consists of simple queries and
reports against current and historical data
– Second level: Goes deeper and requires the
ability to do “what if” processing across data
store dimensions

– Third level: Needs to step back and analyze
what has previously occurred to bring about
the current state of the data
– Fourth level: Analyzes what has happened in
the past and what needs to be done in the
future in order to bring some specific change



Definition of Data Warehouse
• A data warehouse is a subject-oriented,
integrated, time-variant, and non-
volatile collection of data in support of
management's decision making process.

• Subject-Oriented: a data warehouse is
organized around the major subjects of
an organization e.g. customers,
products, sales rather than the major
application areas e.g. stock control,
invoicing.

• Integrated: the data comes from different
sources and is made consistent to present
a unified view.
• Time-variant: data in a data warehouse
is only accurate and valid at some point
in time or over some time interval.
• Non-volatile: the data is not updated in
real-time but is refreshed from
operational systems on a regular basis.
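
A minimal sketch of the time-variant and non-volatile properties, assuming a hypothetical customer table refreshed in dated snapshots. The `refresh` and `as_of` helpers and the sample rows are illustrative only: each refresh appends rows stamped with a load date, existing rows are never modified, and a query asks what was valid at a given point in time.

```python
import datetime

def refresh(warehouse, operational_rows, load_date):
    """Append a dated snapshot; rows already in the warehouse are never changed."""
    for customer_id, city in operational_rows:
        warehouse.append((load_date, customer_id, city))

def as_of(warehouse, customer_id, date):
    """Return the city from the latest snapshot taken on or before `date`."""
    candidates = [row for row in warehouse
                  if row[1] == customer_id and row[0] <= date]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[0])[2]

warehouse = []  # append-only list of (load_date, customer_id, city)
refresh(warehouse, [(1, "Leeds")], datetime.date(2024, 1, 1))
refresh(warehouse, [(1, "York")], datetime.date(2024, 2, 1))

city_mid_january = as_of(warehouse, 1, datetime.date(2024, 1, 15))
city_february = as_of(warehouse, 1, datetime.date(2024, 2, 1))
```

Because the January row is kept after the February refresh, the warehouse can still answer questions about the earlier state.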

Data Warehousing Architecture
[Figure: data warehousing architecture. Labels: External Sources, Operational dbs; Extract, Transform, Load, Refresh; Metadata Repository; Monitoring & Administration; Data Marts; Serve; OLAP servers; Analysis, Query/Reporting, Data Mining.]

Three-Tier Architecture
• Warehouse database server
– Almost always a relational DBMS; rarely flat files

• OLAP servers
– Relational OLAP (ROLAP): extended relational DBMS that maps
operations on multidimensional data to standard relational
operations.
– Multidimensional OLAP (MOLAP): special purpose server that
directly implements multidimensional data and operations.

• Clients
– Query and reporting tools.
– Analysis tools
– Data mining tools (e.g., trend analysis, prediction)
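
The ROLAP idea of mapping multidimensional operations to standard relational operations can be sketched as follows. This is an illustration under assumptions, not how any particular OLAP server works: the `sales` table, its columns, and the `rolap_query` helper are hypothetical, and SQLite stands in for the warehouse database server. A roll-up becomes a `GROUP BY` and a slice becomes a `WHERE` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("North", "Widget", 2023, 100),
    ("North", "Gadget", 2023, 50),
    ("South", "Widget", 2023, 70),
    ("South", "Widget", 2024, 30),
])

ALLOWED_DIMENSIONS = {"region", "product", "year"}

def rolap_query(conn, group_by, where=None):
    """Translate a roll-up (group_by) and an optional slice (where) into SQL."""
    where = where or {}
    if not group_by:
        raise ValueError("at least one dimension is required")
    if not set(group_by) <= ALLOWED_DIMENSIONS or not set(where) <= ALLOWED_DIMENSIONS:
        raise ValueError("unknown dimension")
    cols = ", ".join(group_by)
    sql = f"SELECT {cols}, SUM(amount) FROM sales"
    if where:
        # Column names are validated above; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{dim} = ?" for dim in where)
    sql += f" GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql, list(where.values())).fetchall()

by_region = rolap_query(conn, ["region"])
widget_by_year = rolap_query(conn, ["year"], {"product": "Widget"})
```

A MOLAP server would instead store the cube in a multidimensional structure and answer the same requests without generating SQL.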

Data Warehouse - Components
• Operational Data
• Load Manager - performs all the
operations associated with the
extraction and loading of data into the
warehouse

• Warehouse Manager - performs all
operations associated with the
management of the data in the
warehouse e.g. analysis of data to
ensure consistency, transforming and
merging of data sources
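
A minimal sketch of the transforming, merging, and consistency-checking work described above, assuming two hypothetical source systems (a CRM and a billing system) that name and format the same customer fields differently. All field names and helpers here are invented for illustration.

```python
def normalise_crm(row):
    """Hypothetical CRM layout: {"cust_id": ..., "country": ...}."""
    return {
        "customer_id": row["cust_id"].strip().upper(),
        "country": row["country"].strip().upper(),
    }

def normalise_billing(row):
    """Hypothetical billing layout: {"customer": ..., "country_code": ...}."""
    return {
        "customer_id": row["customer"].strip().upper(),
        "country": row["country_code"].strip().upper(),
    }

def merge_sources(crm_rows, billing_rows):
    """Merge both sources on customer_id and report rows that disagree."""
    merged, conflicts = {}, []
    rows = [normalise_crm(r) for r in crm_rows]
    rows += [normalise_billing(r) for r in billing_rows]
    for row in rows:
        existing = merged.get(row["customer_id"])
        if existing is None:
            merged[row["customer_id"]] = row
        elif existing != row:
            # Same customer, different values: flag for review, keep the first.
            conflicts.append((existing, row))
    return merged, conflicts

crm = [{"cust_id": "c001", "country": "uk"}, {"cust_id": "C002", "country": "FR"}]
billing = [{"customer": " C001 ", "country_code": "UK"},
           {"customer": "c002", "country_code": "DE"}]
merged, conflicts = merge_sources(crm, billing)
```

Here customer C001 merges cleanly once both rows are normalised, while C002 is reported as a conflict because the two sources disagree on the country.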

• Query Manager - performs all
operations associated with the
management of user queries e.g.
directing queries to appropriate tables
and scheduling the execution of queries
• End-User Access Tools - Data reporting
and query tools, application development
tools, executive information system
tools, OLAP tools, data mining tools
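
Directing queries to appropriate tables can be sketched as picking the smallest table that still contains every dimension a query needs. The table catalogue and the `route` function below are hypothetical and cover only routing, not scheduling.

```python
# Hypothetical catalogue, ordered from most summarized to most detailed.
TABLES = [
    ("sales_by_region", {"region"}),
    ("sales_by_region_month", {"region", "month"}),
    ("sales_detail", {"region", "month", "product", "customer"}),
]

def route(dimensions):
    """Return the first (smallest) table that covers all requested dimensions."""
    needed = set(dimensions)
    for name, available in TABLES:
        if needed <= available:
            return name
    raise ValueError(f"no table covers dimensions: {sorted(needed)}")

region_table = route(["region"])
month_table = route(["month"])
detail_table = route(["product", "region"])
```

Queries that only need coarse dimensions are answered from a summary table, and only queries that need fine-grained dimensions reach the detail table.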

Data Warehousing: Two Distinct Issues
(1) How to get information into the warehouse:
“Data warehousing”
(2) What to do with data once it’s in the
warehouse: “Warehouse DBMS”
• Both rich research areas
• Industry has focused on (2)



Issues in Data Warehousing
• Warehouse Design
• Extraction
– Wrappers, monitors (change detectors)
• Integration
– Cleansing & merging
• Warehousing specification &
Maintenance
• Optimizations
• Miscellaneous (e.g., evolution)
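
One simple way a monitor (change detector) could work is by comparing two snapshots of a source table. This is a sketch under that assumption; real monitors may instead rely on triggers, logs, or timestamps. The `detect_changes` helper and the sample rows are hypothetical.

```python
def detect_changes(old, new):
    """Compare two snapshots (dicts of key -> row) and classify the differences."""
    inserts = {k: new[k] for k in new.keys() - old.keys()}
    deletes = {k: old[k] for k in old.keys() - new.keys()}
    updates = {k: (old[k], new[k])
               for k in old.keys() & new.keys() if old[k] != new[k]}
    return inserts, updates, deletes

yesterday = {1: ("Acme", "Leeds"), 2: ("Bolt", "York"), 3: ("Cog", "Hull")}
today = {1: ("Acme", "Leeds"), 2: ("Bolt", "Bath"), 4: ("Dyna", "Kent")}

inserts, updates, deletes = detect_changes(yesterday, today)
```

Only the detected differences need to be passed on to the integration step, rather than reloading the whole source.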

Data Mart
• A data mart is a subset of a data
warehouse that supports the
requirements of a particular
department or business function.



Data Warehouse Vs Data Marts

• Enterprise warehouse: collects all
information about subjects
(customers, products, sales, assets,
personnel) that span the entire
organization.
– Requires extensive business modeling
– May take years to design and build
• Data marts: departmental subsets that
focus on selected subjects, e.g. a marketing
data mart: customers, products, sales.
– Faster roll out, but complex integration in
the long run.
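
The relationship between an enterprise warehouse and a departmental data mart can be sketched as selecting a few subject areas from the full set. The `WAREHOUSE` contents and the `build_data_mart` helper are hypothetical and only illustrate the subset idea.

```python
# Hypothetical enterprise warehouse: subject area -> rows.
WAREHOUSE = {
    "customers": [{"id": 1, "name": "Acme"}],
    "products": [{"id": 10, "name": "Widget"}],
    "sales": [{"customer_id": 1, "product_id": 10, "amount": 100}],
    "assets": [{"id": 500, "kind": "forklift"}],
    "personnel": [{"id": 900, "name": "J. Smith"}],
}

def build_data_mart(warehouse, subjects):
    """Copy only the selected subject areas into a separate data mart."""
    missing = set(subjects) - set(warehouse)
    if missing:
        raise KeyError(f"unknown subjects: {sorted(missing)}")
    return {s: [dict(row) for row in warehouse[s]] for s in subjects}

marketing_mart = build_data_mart(WAREHOUSE, ["customers", "products", "sales"])
```

The mart holds its own copies of the rows, which is one reason separately built marts can become hard to integrate later.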

• Virtual warehouse: views over
operational dbs
– Materialize some summary views for
efficient query processing
– Easier to build
– Requires excess capacity on operational db
servers
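
A minimal sketch of a virtual warehouse, assuming a hypothetical `orders` table in an operational database, with SQLite standing in for the operational db server. The view is computed from the operational data on every query. SQLite has no materialized views, so the materialized summary is emulated here with a table created from the view.

```python
import sqlite3

op = sqlite3.connect(":memory:")  # stands in for an operational database
op.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)
op.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)",
               [("North", 100), ("North", 50), ("South", 70)])

# Virtual warehouse: a view over the operational table, recomputed per query.
op.execute("""CREATE VIEW v_sales_by_region AS
              SELECT region, SUM(amount) AS total FROM orders GROUP BY region""")

# Materialized summary (emulated): stored once, cheap to read, can go stale.
op.execute("CREATE TABLE m_sales_by_region AS SELECT * FROM v_sales_by_region")

# A new operational order arrives after the summary was materialized.
op.execute("INSERT INTO orders (region, amount) VALUES ('South', 30)")

live = dict(op.execute("SELECT region, total FROM v_sales_by_region").fetchall())
stored = dict(op.execute("SELECT region, total FROM m_sales_by_region").fetchall())
```

The view always reflects current operational data but adds load to the operational server on each query; the stored summary is faster to read but has to be refreshed.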
