Unit-2

Uploaded by

Veer Gohil

UNIT - 2

The Architecture of BI and DW

Outline
• Data Warehouse Architecture
• OLTP v/s OLAP
• Data Warehouse Schema Architecture
• OLAP Operations
• OLAP Servers
Data Warehouse Architecture
• Bottom tier:
• The bottom tier is a warehouse database server that is almost always a relational
database system.
• Back-end tools and utilities are used to feed data into the bottom tier from
operational databases or other external sources.
• These tools and utilities perform data extraction, cleaning, and transformation, as
well as load and refresh functions to update the data warehouse.
• The data are extracted using application program interfaces known as gateways.
• A gateway is supported by the underlying DBMS and allows client programs to
generate SQL code to be executed at a server.
• Examples of gateways include ODBC (Open Database Connectivity) and OLE DB (Object
Linking and Embedding, Database) by Microsoft, and JDBC (Java Database
Connectivity).
• This tier also contains a metadata repository, which stores information about the
data warehouse and its contents.
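The extract, clean, transform, and load steps described above can be sketched as follows. This is a minimal illustration, not a production ETL tool: the source rows, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse server.

```python
import sqlite3

# Hypothetical rows from an operational source; names and values are illustrative only.
operational_rows = [
    {"order_id": 1, "city": " toronto ", "amount": "120.50"},
    {"order_id": 2, "city": "Vancouver", "amount": "80"},
    {"order_id": 3, "city": "TORONTO", "amount": None},  # dirty record: missing amount
]

def extract(rows):
    """Extraction: read records from the operational source."""
    return list(rows)

def clean(rows):
    """Cleaning: drop records whose amount is missing."""
    return [r for r in rows if r["amount"] is not None]

def transform(rows):
    """Transformation: normalize city names and convert amounts to numbers."""
    return [(r["order_id"], r["city"].strip().title(), float(r["amount"])) for r in rows]

def load(conn, rows):
    """Load/refresh: write the prepared rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
    )
    # INSERT OR REPLACE lets a later refresh overwrite rows with the same key.
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")  # stand-in for the warehouse database server
load(warehouse, transform(clean(extract(operational_rows))))
loaded = warehouse.execute(
    "SELECT order_id, city, amount FROM sales ORDER BY order_id"
).fetchall()
print(loaded)
```

In a real deployment the extraction step would go through a gateway such as ODBC or JDBC rather than an in-memory list.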
• Middle tier:
• The middle tier is an OLAP (Online Analytical
Processing) server that is typically implemented using either
• A relational OLAP (ROLAP) model, that is, an extended relational
DBMS that maps operations on multidimensional data to standard
relational operations or,
• A multidimensional OLAP (MOLAP) model, that is, a special-
purpose server that directly implements multidimensional data and
operations.
• Top tier:
• The top tier is a front-end client layer, which contains query
and reporting tools, analysis tools, and/or data mining
tools.
OLAP (On-Line Analytical Processing)
• OLAP is characterized by a relatively low volume of
transactions.
• Queries are often very complex and involve
aggregations.
• For OLAP systems, response time is the measure of
effectiveness.
• OLAP applications are widely used together with data
mining techniques.
• An OLAP database holds aggregated, historical
data stored in multidimensional schemas (usually a
star schema).
OLTP (On-Line Transaction Processing)
• OLTP is characterized by a large number of short
online transactions (INSERT, UPDATE, DELETE).
• The main emphasis for OLTP systems is on very fast
query processing and on maintaining data integrity in
multi-access environments; effectiveness is measured
by the number of transactions per second.
• An OLTP database holds detailed, current data, and
the schema used to store transactional data is the
entity model (usually 3NF).
OLTP v/s OLAP (Understanding)
OLTP | OLAP
Many short transactions (queries + updates) | Long transactions (complex queries)
Examples: update account balance; enroll in course; add book to shopping cart | Examples: report total sales for each department in each month; identify top-selling books; count classes with fewer than 10 students
Queries touch small amounts of data (one record or a few records) | Queries touch large amounts of data
Updates are frequent | Updates are infrequent
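The contrast between the two workloads can be illustrated on a single toy table. This is only a sketch: the `sales` table and its rows are made up, and a real OLTP system and a real warehouse would normally be separate databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, department TEXT, month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (department, month, amount) VALUES (?, ?, ?)",
    [("Books", "Jan", 100.0), ("Books", "Jan", 50.0),
     ("Books", "Feb", 70.0), ("Toys", "Jan", 30.0)],
)

# OLTP-style work: a short transaction that touches a single record.
with conn:  # commits on success, rolls back on error
    conn.execute("UPDATE sales SET amount = amount + 10 WHERE id = ?", (1,))

# OLAP-style work: a read-only query that scans and aggregates many records.
report = conn.execute(
    "SELECT department, month, SUM(amount) FROM sales "
    "GROUP BY department, month ORDER BY department, month"
).fetchall()
print(report)
```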
OLTP v/s OLAP
Functionality | OLTP | OLAP
Characteristic | Operational processing | Informational processing
Orientation | Transaction | Analysis
User | Clerk, DBA, database professional | Knowledge worker (e.g., manager, executive, analyst)
Function | Day-to-day operations | Long-term informational requirements, decision support
DB design | ER based, application-oriented | Star/snowflake, subject-oriented
Data | Current; guaranteed up-to-date | Historical; accuracy maintained over time
Summarization | Primitive, highly detailed | Summarized, consolidated
View | Detailed, flat relational | Summarized, multidimensional
Unit of work | Short, simple transaction | Complex query
Access | Read/write | Mostly read
Data Warehouse Schema Architecture
• A data warehouse environment usually transforms the
relational data model into specialized architectures.
• Many schema models have been designed for data
warehousing, but the most commonly used are:
• Star schema
• Snowflake schema
• Fact constellation schema (a group of stars, that is, a
collection of fact tables)
• The choice of schema model for a data warehouse
should be based on an analysis of project requirements,
available tools, and project team preferences.
Star Schema
• The star schema architecture is the simplest data
warehouse schema.
• It is called a star schema because the diagram resembles
a star, with points radiating from a center.
• The center of the star consists of the fact table, and the
points of the star are the dimension tables.
• Usually the fact tables in a star schema are in third
normal form (3NF), whereas the dimension tables are
denormalized.
• Although the star schema is the simplest architecture,
it is the most commonly used nowadays and is
recommended by Oracle.
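A star schema can be sketched as one fact table whose foreign keys point at denormalized dimension tables. The table names, columns, and rows below are illustrative assumptions, not taken from the slide's example figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one flat (denormalized) table per dimension.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

-- Fact table at the center of the star: keys plus numeric measures.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    dollars_sold REAL,
    units_sold INTEGER
);

INSERT INTO dim_time VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO dim_item VALUES (1, 'Modem', 'BrandA'), (2, 'Mobile', 'BrandB');
INSERT INTO dim_location VALUES (1, 'Toronto', 'Canada'), (2, 'Vancouver', 'Canada');
INSERT INTO fact_sales VALUES (1, 1, 1, 100.0, 2), (1, 2, 2, 300.0, 1), (2, 1, 1, 150.0, 3);
""")

# A typical star query: join the fact table to each needed dimension, then aggregate.
sales_by_quarter_city = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.dollars_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON f.time_key = t.time_key
    JOIN dim_location AS l ON f.location_key = l.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
""").fetchall()
print(sales_by_quarter_city)
```

Each dimension is exactly one join away from the fact table, which is what keeps star queries simple.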
Star Schema - Example
Snowflake Schema
• The snowflake schema architecture is a more complex
variation of the star schema used in a data warehouse,
because the tables which describe the dimensions are
normalized.
• The normalized dimension tables are easy to maintain and save storage space.
• However, this saving of space is negligible in comparison to
the typical size of the fact table.
• Furthermore, the snowflake structure can reduce the
effectiveness of browsing, since more joins will be needed to
execute a query.
• Hence, although the snowflake schema reduces
redundancy, it is not as popular as the star schema in data
warehouse design.
Snowflake Schema - Example
• DMQL (Data Mining Query Language) code for the
snowflake schema can be written as follows:
• define cube sales_snowflake [time, item, branch, location]:
• dollars_sold = sum(sales_in_dollars), units_sold = count(*)
• define dimension time as (time_key, day, day_of_week,
month, quarter, year)
• define dimension item as (item_key, item_name, brand,
type, supplier (supplier_key, supplier_type))
• define dimension branch as (branch_key, branch_name,
branch_type)
• define dimension location as (location_key, street, city (city_key,
city, province_or_state, country))
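In relational terms, the nested supplier definition inside the item dimension corresponds to a separate, normalized supplier table. The sketch below is an assumed relational rendering of that one dimension, not output of any DMQL tool; the sample rows are invented. It also shows the extra join that snowflaking adds to a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (snowflaked) item dimension: supplier attributes live in their own table.
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
CREATE TABLE item (
    item_key INTEGER PRIMARY KEY,
    item_name TEXT, brand TEXT, type TEXT,
    supplier_key INTEGER REFERENCES supplier(supplier_key)
);
CREATE TABLE sales (item_key INTEGER REFERENCES item(item_key), dollars_sold REAL);

INSERT INTO supplier VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item VALUES
    (10, 'Modem', 'BrandA', 'network', 1),
    (11, 'Mobile', 'BrandB', 'phone', 2),
    (12, 'Router', 'BrandA', 'network', 1);
INSERT INTO sales VALUES (10, 100.0), (11, 300.0), (12, 50.0), (10, 25.0);
""")

# Reaching supplier_type takes two joins (sales -> item -> supplier);
# in a star schema, supplier_type would be a column of item and need only one.
by_supplier_type = conn.execute("""
    SELECT s.supplier_type, SUM(f.dollars_sold)
    FROM sales AS f
    JOIN item AS i ON f.item_key = i.item_key
    JOIN supplier AS s ON i.supplier_key = s.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()
print(by_supplier_type)
```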
Fact Constellation Schema
• Sophisticated applications may require multiple fact tables to
share dimension tables.
• This kind of schema can be viewed as a collection of stars,
and hence is called a galaxy schema or a fact constellation.
• A fact constellation schema allows dimension tables to be
shared between fact tables.
• For example, the dimensions tables for time, item, and
location are shared between both the sales and shipping fact
tables.
• The main shortcoming of the fact constellation schema is a
more complicated design because many variants for
particular kinds of aggregation must be considered and
selected.
• DMQL code for the fact constellation schema can be written as follows:
• define cube sales [time, item, branch, location]:
• dollars_sold = sum(sales_in_dollars), units_sold = count(*)
• define dimension time as (time_key, day, day_of_week, month, quarter, year)
• define dimension item as (item_key, item_name, brand, type, supplier_type)
• define dimension branch as (branch_key, branch_name, branch_type)
• define dimension location as (location_key, street, city, province_or_state,
country)
• define cube shipping [time, item, shipper, from_location, to_location]:
• dollars_cost = sum(cost_in_dollars), units_shipped = count(*)
• define dimension time as time in cube sales
• define dimension item as item in cube sales
• define dimension shipper as (shipper_key, shipper_name, location as location in
cube sales, shipper_type)
• define dimension from_location as location in cube sales
• define dimension to_location as location in cube sales
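The sharing of dimension tables between the sales and shipping cubes can be sketched relationally as two fact tables that reference the same dimension rows. Table names and data here are illustrative assumptions, and only the time and item dimensions are modeled to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimensions, defined once.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables that both reference the shared dimensions.
CREATE TABLE fact_sales (time_key INTEGER, item_key INTEGER, dollars_sold REAL);
CREATE TABLE fact_shipping (time_key INTEGER, item_key INTEGER, dollars_cost REAL);

INSERT INTO dim_time VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO dim_item VALUES (1, 'Modem');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 1, 40.0), (2, 1, 150.0);
INSERT INTO fact_shipping VALUES (1, 1, 10.0), (2, 1, 12.0), (2, 1, 3.0);
""")

# Aggregate each fact table separately, then combine the results on the shared
# time dimension. Joining the raw fact tables directly would multiply rows.
sold_vs_cost = conn.execute("""
    SELECT t.quarter, s.sold, h.cost
    FROM dim_time AS t
    JOIN (SELECT time_key, SUM(dollars_sold) AS sold
          FROM fact_sales GROUP BY time_key) AS s ON s.time_key = t.time_key
    JOIN (SELECT time_key, SUM(dollars_cost) AS cost
          FROM fact_shipping GROUP BY time_key) AS h ON h.time_key = t.time_key
    ORDER BY t.quarter
""").fetchall()
print(sold_vs_cost)
```

The need to pre-aggregate each fact table before combining them is one concrete form of the design complexity noted above.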
OLAP Operations
• Roll up
• Drill Down
• Slice
• Dice
• Pivot (Rotate)
Roll up – OLAP Operation
• The roll-up operation (also called drill-up or aggregation)
performs aggregation on a data cube in either of the following
ways:
• By climbing up a concept hierarchy for a dimension
• By dimension reduction
• In the example, roll-up is performed by climbing up a concept
hierarchy for the dimension location.
• The concept hierarchy is "street < city < province <
country".
• On rolling up, the data is aggregated by ascending the location
hierarchy from the level of city to the level of country.
• The data is then grouped by country rather than by city.
• When roll-up is performed by dimension reduction, one or more
dimensions are removed from the data cube.
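Both forms of roll-up can be sketched on a small in-memory cube. The cell values and the city-to-country mapping below are invented for illustration; a real OLAP server would run this inside its own engine.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter) -> units sold.
cells = {
    ("Toronto", "Q1"): 605,
    ("Toronto", "Q2"): 680,
    ("Vancouver", "Q1"): 825,
    ("Chicago", "Q1"): 440,
    ("New York", "Q1"): 1087,
}

# One step of the location concept hierarchy (city < country), collapsed for brevity.
city_to_country = {
    "Toronto": "Canada", "Vancouver": "Canada",
    "Chicago": "USA", "New York": "USA",
}

def roll_up_location(cells, mapping):
    """Roll-up by climbing the hierarchy: replace each city by its country and sum."""
    totals = defaultdict(int)
    for (city, quarter), units in cells.items():
        totals[(mapping[city], quarter)] += units
    return dict(totals)

def roll_up_drop_time(cells):
    """Roll-up by dimension reduction: remove the time dimension and sum over it."""
    totals = defaultdict(int)
    for (city, _quarter), units in cells.items():
        totals[city] += units
    return dict(totals)

by_country = roll_up_location(cells, city_to_country)
by_city_all_time = roll_up_drop_time(cells)
print(by_country)
print(by_city_all_time)
```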
Drill Down – OLAP Operation
• Drill-down is the reverse operation of roll-up. It is performed by
either of the following ways:
• By stepping down a concept hierarchy for a dimension
• By introducing a new dimension
• Drill-down is performed by stepping down a concept hierarchy for
the dimension time.
• Initially the concept hierarchy was "day < month < quarter < year."
• On drilling down, the time dimension is descended from the
level of quarter to the level of month.
• When drill-down is performed by introducing a new dimension, one or
more dimensions are added to the data cube.
• It navigates the data from less detailed data to highly detailed
data.
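Drill-down from quarter to month can be sketched as re-aggregating the same base facts at a finer level of the time hierarchy. The facts below are hypothetical, and the sketch assumes the detailed data is still available, since a summary alone cannot be drilled into.

```python
from collections import defaultdict

# Hypothetical base facts: (quarter, month, units sold).
facts = [
    ("Q1", "Jan", 100),
    ("Q1", "Feb", 120),
    ("Q1", "Mar", 90),
    ("Q2", "Apr", 110),
]

def by_quarter(facts):
    """Less detailed view: one total per quarter."""
    totals = defaultdict(int)
    for quarter, _month, units in facts:
        totals[quarter] += units
    return dict(totals)

def drill_down_to_month(facts):
    """More detailed view: step down the hierarchy from quarter to month."""
    totals = defaultdict(int)
    for quarter, month, units in facts:
        totals[(quarter, month)] += units
    return dict(totals)

summary = by_quarter(facts)
detail = drill_down_to_month(facts)
print(summary)
print(detail)
```

Summing the month-level cells of a quarter gives back that quarter's total, which is why drill-down is the reverse of roll-up.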
Slice – OLAP Operation
• The slice operation performs a selection on one dimension of a given cube
and provides a new sub-cube.
• Here, slice is performed on the dimension "time" using a criterion such as
time = "Q1" (or, similarly, time = "Q2", time = "Q3", etc.).
• The new sub-cube contains only the cells that satisfy the criterion.
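A slice on the time dimension can be sketched as a filter that fixes one dimension to a single value. The cube contents and the `slice_cube` helper are hypothetical.

```python
# Hypothetical cube: (location, time, item) -> units sold.
DIMENSIONS = ("location", "time", "item")
cube = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q2", "Mobile"): 680,
    ("Vancouver", "Q1", "Modem"): 825,
    ("Vancouver", "Q2", "Modem"): 952,
}

def slice_cube(cube, dimension, value):
    """Keep cells where `dimension` equals `value`, then drop that dimension from the key."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

q1 = slice_cube(cube, "time", "Q1")
print(q1)
```

Because time is fixed to one value, the resulting sub-cube in this sketch is keyed only by location and item.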
Dice – OLAP Operation
• The dice operation performs a selection on two or more dimensions of a
given cube and provides a new sub-cube.
• The dice operation below uses selection criteria that involve three
dimensions:
• (location = "Toronto" or "Vancouver")
• (time = "Q1" or "Q2")
• (item = "Mobile" or "Modem")
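The three-dimension dice above can be sketched as a filter with one set of allowed values per dimension. The cube cells and the `dice` helper are illustrative assumptions.

```python
# Hypothetical cube: (location, time, item) -> units sold.
DIMENSIONS = ("location", "time", "item")
cube = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q3", "Mobile"): 812,
    ("Vancouver", "Q2", "Modem"): 952,
    ("Chicago", "Q1", "Modem"): 440,
    ("Vancouver", "Q1", "Phone"): 31,
}

def dice(cube, **criteria):
    """Keep cells whose value on every named dimension is in the allowed set."""
    allowed = {DIMENSIONS.index(name): set(values) for name, values in criteria.items()}
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }

sub_cube = dice(
    cube,
    location={"Toronto", "Vancouver"},
    time={"Q1", "Q2"},
    item={"Mobile", "Modem"},
)
print(sub_cube)
```

Unlike the slice, the dice keeps all three dimensions in the result; it only narrows the range of values on each.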
Pivot – OLAP Operation
• The pivot operation is also known as rotation.
• It rotates the data axes in view in order to provide an alternative
presentation of data.
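For a two-dimensional view, rotation amounts to swapping the row and column axes. The sketch below uses a made-up location-by-item table; no values change, only the presentation.

```python
# Hypothetical 2-D view: rows are locations, columns are items.
view = {
    "Toronto": {"Mobile": 605, "Modem": 825},
    "Vancouver": {"Mobile": 14, "Modem": 400},
}

def pivot(view):
    """Rotate the axes: rows become columns and columns become rows."""
    rotated = {}
    for row_label, columns in view.items():
        for column_label, value in columns.items():
            rotated.setdefault(column_label, {})[row_label] = value
    return rotated

rotated = pivot(view)
print(rotated)
```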
OLAP Servers
• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
Relational OLAP (ROLAP)
• Relational On-Line Analytical Processing (ROLAP) servers work mainly with
data that resides in a relational database, where the base data and
dimension tables are stored as relational tables.
• ROLAP servers are placed between the relational back-end server and
client front-end tools.
• ROLAP servers use an RDBMS to store and manage warehouse data, and
OLAP middleware to support missing pieces.
• Advantages of ROLAP
• ROLAP can handle large amounts of data.
• It can be used with both data warehouse and OLTP systems.
• Disadvantages of ROLAP
• It is limited by SQL functionality.
• Aggregate tables are hard to maintain.
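The role of the OLAP middleware can be sketched as a small translator that turns a multidimensional request into ordinary SQL for the relational back end. The `cube_query` function, the `sales` table, and its column names are hypothetical; real ROLAP products generate far more elaborate SQL.

```python
import sqlite3

ALLOWED_DIMENSIONS = {"city", "quarter", "item"}  # hypothetical column names

def cube_query(group_by, filters=None):
    """Translate a request (dimensions to keep, values to fix) into SQL plus parameters."""
    filters = filters or {}
    if not group_by:
        raise ValueError("at least one grouping dimension is required")
    # Only known column names are interpolated; filter values go in as parameters.
    for name in list(group_by) + list(filters):
        if name not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
    columns = ", ".join(group_by)
    sql = f"SELECT {columns}, SUM(units) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    sql += f" GROUP BY {columns} ORDER BY {columns}"
    return sql, list(filters.values())

conn = sqlite3.connect(":memory:")  # stand-in for the relational back-end server
conn.execute("CREATE TABLE sales (city TEXT, quarter TEXT, item TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("Toronto", "Q1", "Modem", 5), ("Toronto", "Q1", "Mobile", 7),
    ("Toronto", "Q2", "Modem", 3), ("Vancouver", "Q1", "Modem", 4),
])

# Slice on quarter = 'Q1', rolled up over item: becomes WHERE plus GROUP BY.
sql, params = cube_query(["city"], {"quarter": "Q1"})
q1_by_city = conn.execute(sql, params).fetchall()
print(sql)
print(q1_by_city)
```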
Multidimensional OLAP (MOLAP)
• Multidimensional On-Line Analytical Processing (MOLAP) supports
multidimensional views of data through array-based multidimensional
storage engines.
• With multidimensional data stores, storage utilization may be low if
the data set is sparse.
• Advantages of MOLAP
• Optimal for slice and dice operations.
• Performs better than ROLAP when the data is dense.
• Can perform complex calculations.
• Disadvantages of MOLAP
• It is difficult to change a dimension without re-aggregation.
• MOLAP can handle only a limited amount of data.
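Array-based storage and the sparsity problem can be sketched with a small dense three-dimensional array. The dimension members and cell values are invented, and nested Python lists stand in for a real MOLAP storage engine.

```python
# Hypothetical dimension members; their list positions serve as array indices.
locations = ["Toronto", "Vancouver"]
quarters = ["Q1", "Q2", "Q3", "Q4"]
items = ["Mobile", "Modem"]

# Dense array with one slot per (location, quarter, item) combination; None means empty.
array = [[[None for _ in items] for _ in quarters] for _ in locations]

def store(location, quarter, item, units):
    """Write a measure directly at its array position."""
    array[locations.index(location)][quarters.index(quarter)][items.index(item)] = units

def lookup(location, quarter, item):
    """Read a cell by position; no searching or joining is needed."""
    return array[locations.index(location)][quarters.index(quarter)][items.index(item)]

store("Toronto", "Q1", "Mobile", 605)
store("Vancouver", "Q2", "Modem", 952)

total_cells = len(locations) * len(quarters) * len(items)
filled_cells = sum(
    1 for plane in array for row in plane for cell in row if cell is not None
)
utilization = filled_cells / total_cells
print(lookup("Toronto", "Q1", "Mobile"), total_cells, filled_cells, utilization)
```

Every combination of members reserves a slot whether or not it holds data, so a sparse data set leaves most of the array empty.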
Hybrid OLAP (HOLAP)
• Hybrid On-Line Analytical Processing (HOLAP) is a combination of ROLAP
and MOLAP.
• HOLAP provides the greater scalability of ROLAP and the faster
computation of MOLAP.
• Advantages of HOLAP
• HOLAP provides the advantages of both MOLAP and ROLAP.
• It provides fast access at all levels of aggregation.
• Disadvantages of HOLAP
• The HOLAP architecture is very complex because it supports
both MOLAP and ROLAP servers.
