Unit 2 Notes DWM

The document provides an overview of data warehousing concepts, focusing on Data Cubes and OLAP (Online Analytical Processing). It discusses the advantages and disadvantages of OLAP, differentiates between OLAP and OLTP (Online Transaction Processing), and explains various data warehouse schemas including Star, Snowflake, and Fact Constellation schemas. Additionally, it covers OLAP operations such as roll-up, drill-down, slice, dice, and pivot, along with a comparison between OLTP and OLAP systems.

Uploaded by Gajanan Markad


Unit 2: Data Warehousing Modeling & Online Analytical Processing (OLAP) I

Q.1] Define Data cube used in Data warehouse modelling

Data Cube or OLAP Cube:
When data warehouse data is grouped or combined into multidimensional
matrices, the result is called a data cube. A data cube is a data structure
optimized for very fast data analysis. It is also called an OLAP cube or hypercube.

Q.2] Define OLAP with examples.

OLAP stands for Online Analytical Processing:
OLAP is a category of software that allows users to analyse information from
multiple database systems at the same time.
Examples of application areas:
1. Finance and accounting
2. Sales and marketing
3. Production
Q.3] Advantages and Disadvantages of OLAP
Advantages of OLAP:
1. Fast query performance.
2. Multidimensional data analysis.
3. Easy data aggregation.
4. Supports complex calculations.
5. Aids in decision-making.
6. User-friendly interfaces.
Disadvantages of OLAP:
1. Complex setup and design.
2. High storage requirements.
3. Maintenance can be time-consuming.
4. Not ideal for real-time data.
5. Scalability challenges with large data.
6. High implementation and maintenance costs.
Q.4] Define OLTP in data warehouse
Online Transaction Processing (OLTP):
OLTP databases are designed to handle many small transactions, and they
usually serve as a “single source of storage” for the data that is later loaded
into the data warehouse.
Q.5] Define the term data cube in multidimensional data model.
▪ A data cube is a multidimensional data structure for storing data in
the data warehouse.
▪ A data cube can be 2D, 3D or n-dimensional in structure.
▪ When data is grouped or combined into multidimensional matrices,
the result is called a data cube.
▪ A data cube represents data in terms of dimensions and facts.
▪ A dimension in a data cube represents an attribute of the data set.
▪ Each cell of a data cube holds aggregated data.
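The points above can be sketched in Python by modelling a cube as a mapping from one coordinate per dimension to an aggregated measure. This is only an illustration: the dimension names (time, item, location) and every sales figure here are hypothetical, and real OLAP engines store cubes far more compactly.

```python
from collections import defaultdict

# Hypothetical fact rows: (quarter, item, city, sales). All values are made up.
rows = [
    ("Q1", "Phone", "Pune", 100),
    ("Q1", "Phone", "Pune", 50),
    ("Q1", "Laptop", "Mumbai", 200),
    ("Q2", "Phone", "Mumbai", 80),
]

# The cube maps one coordinate per dimension to an aggregated measure.
dimensions = ("time", "item", "location")
cube = defaultdict(int)
for quarter, item, city, sales in rows:
    cube[(quarter, item, city)] += sales  # each cell holds aggregated data

print(len(dimensions), "dimensions,", len(cube), "non-empty cells")
print(cube[("Q1", "Phone", "Pune")])  # 150: two fact rows fell into this cell
```

Adding a fourth element to each key would turn this into a 4-dimensional cube, which is why the structure is also called a hypercube.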
Q.6] Explain need of OLAP
1. Complex Data Analysis: OLAP enables multidimensional analysis, offering
deeper insights from large datasets.
2. Faster Decision-Making: It speeds up decision-making by delivering fast
query responses and timely insights.
3. Data Exploration: OLAP allows users to easily explore data through features
like drilling down, rolling up, and pivoting.
4. Aggregated Data: Pre-aggregated data in OLAP systems simplifies analysis
and reduces manual computation.
5. Reporting and Visualization: OLAP tools support detailed reporting and
visualization, aiding in clearer decision-making.
6. Forecasting and Trend Analysis: OLAP helps with forecasting and
analyzing trends to predict future outcomes.
Q.7] List & explain schemas used in Data warehouse modeling.
Schemas in Data warehouse modeling:
1. Star Schema
2. Snowflake Schema
3. Fact Constellation or Galaxy Schema
1] Star Schema:
▪ A star schema is the basic form of a dimensional model, in which data
is organized into facts and dimensions.
▪ A fact is an event that is counted or measured, such as a sale.
▪ A dimension holds descriptive information about the fact, such as date,
item, or customer.
▪ The star schema is the simplest data warehouse schema.
▪ It is known as a star schema because its entity-relationship diagram
resembles a star, with points radiating from a central table.
▪ The centre of the schema consists of a large fact table, and the points of the
star are the dimension tables.
Fact Table: (applicable for all schemas) This table contains foreign keys that
reference the primary keys of multiple dimension tables. It also contains facts
or measures like quantity sold, amount sold, etc.
Dimension Table: (applicable for all schemas) This table provides descriptive
information for all measures recorded in the fact table, like product, item,
location, time, etc.
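A minimal sketch of a star schema, using Python's built-in sqlite3 module, may make the fact/dimension split concrete. The table names, column names and rows (fact_sales, dim_item, dim_location and their contents) are assumptions made for illustration; they do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, state TEXT);

-- Fact table: foreign keys to each dimension plus the numeric measures.
CREATE TABLE fact_sales (
    item_id INTEGER REFERENCES dim_item(item_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    quantity_sold INTEGER,
    amount_sold REAL
);

INSERT INTO dim_item VALUES (1, 'Phone'), (2, 'Laptop');
INSERT INTO dim_location VALUES (1, 'Pune', 'Maharashtra'), (2, 'Mumbai', 'Maharashtra');
INSERT INTO fact_sales VALUES (1, 1, 2, 260.0), (2, 2, 1, 390.0), (1, 2, 1, 130.0);
""")

# One join from the fact table to a dimension is enough to label the measure.
star_query = """
SELECT l.city, SUM(f.amount_sold)
FROM fact_sales AS f
JOIN dim_location AS l ON l.location_id = f.location_id
GROUP BY l.city
ORDER BY l.city
"""
sales_by_city = conn.execute(star_query).fetchall()
print(sales_by_city)  # [('Mumbai', 520.0), ('Pune', 260.0)]
```

Note that the state name is repeated on every city row of dim_location. That repetition is the redundancy a star schema accepts in exchange for fewer joins.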

Advantages of Star Schema:

1. Simple and easy to understand.
2. Improved query performance with fewer joins.
3. Optimized for OLAP systems.
4. Flexible and scalable design.
5. Simplifies ETL processes.
6. User-friendly for business users.
Disadvantages of Star Schema:
1. Can cause data redundancy.
2. Complex queries may slow down performance.
3. Lack of normalization can lead to inconsistencies.
4. Requires frequent updates to dimension tables.
5. Becomes inefficient with many dimensions.
2] Snowflake Schema:
▪ A snowflake schema is a refinement of the star schema.
▪ "A schema is known as a snowflake where one or more dimension tables
do not connect directly to the fact table, but must join through other
dimension tables."
▪ The snowflake schema is an expansion of the star schema where each point
(dimension table) of the star explodes into more points (more dimension
tables).
▪ Snowflaking is a method of normalizing the dimension tables in a star
schema.
▪ Snowflaking is used to improve the performance of specific queries.
▪ The snowflake schema consists of one fact table which is linked to many
dimension tables, which can be linked to other dimension tables through a
many-to-one relationship.
▪ Tables in a snowflake schema are generally normalized to the third normal
form.
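As a rough sketch of snowflaking, the location dimension can be normalized so that the state lives in its own table and is reached through the city table. The schema below uses Python's built-in sqlite3 module; all table names, column names and the Surat/Gujarat row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The location dimension is split: city rows point to state rows (many-to-one).
CREATE TABLE dim_state (state_id INTEGER PRIMARY KEY, state_name TEXT);
CREATE TABLE dim_city (
    city_id INTEGER PRIMARY KEY,
    city_name TEXT,
    state_id INTEGER REFERENCES dim_state(state_id)
);
CREATE TABLE fact_sales (
    city_id INTEGER REFERENCES dim_city(city_id),
    amount_sold REAL
);

INSERT INTO dim_state VALUES (1, 'Maharashtra'), (2, 'Gujarat');
INSERT INTO dim_city VALUES (1, 'Pune', 1), (2, 'Mumbai', 1), (3, 'Surat', 2);
INSERT INTO fact_sales VALUES (1, 260.0), (2, 390.0), (3, 100.0);
""")

# dim_state is not linked to the fact table directly; it is reached via dim_city.
snowflake_query = """
SELECT s.state_name, SUM(f.amount_sold)
FROM fact_sales AS f
JOIN dim_city AS c ON c.city_id = f.city_id
JOIN dim_state AS s ON s.state_id = c.state_id
GROUP BY s.state_name
ORDER BY s.state_name
"""
sales_by_state = conn.execute(snowflake_query).fetchall()
print(sales_by_state)  # [('Gujarat', 100.0), ('Maharashtra', 650.0)]
```

Each state name is now stored once, at the cost of one extra join per query that needs it.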
Advantages of Snowflake Schema:
1. Reduces data redundancy by normalizing dimension tables.
2. Saves storage space compared to the star schema.
3. Improves data consistency due to normalization.
4. More efficient for complex queries that involve multiple dimensions.
5. Easier to maintain and update dimension tables.
Disadvantages of Snowflake Schema:
1. More complex design, making it harder to understand and use.
2. ETL processes are more complicated and time-consuming.
3. May lead to slower performance with large datasets.
4. Not as user-friendly for non-technical users.
5. Less Flexible.
3] Fact Constellation Schema:
▪ A fact constellation consists of two or more fact tables that share one or
more dimension tables.
▪ It can be viewed as a collection of several star schemas and is therefore
also known as a Galaxy schema.
▪ It is one of the widely used schemas for data warehouse design.
▪ It is much more complex than the star and snowflake schemas.
▪ Complex systems require fact constellations.
Fig. Fact Constellation Schema
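A small sketch of a fact constellation, again with Python's built-in sqlite3 module: two fact tables share one dimension table. The names fact_sales, fact_shipping and dim_item, and all rows, are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared dimension table.
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables, each the centre of its own star, both pointing at dim_item.
CREATE TABLE fact_sales (item_id INTEGER REFERENCES dim_item(item_id), units_sold INTEGER);
CREATE TABLE fact_shipping (item_id INTEGER REFERENCES dim_item(item_id), units_shipped INTEGER);

INSERT INTO dim_item VALUES (1, 'Phone'), (2, 'Laptop');
INSERT INTO fact_sales VALUES (1, 5), (1, 3), (2, 4);
INSERT INTO fact_shipping VALUES (1, 6), (2, 4);
""")

# Each fact table is aggregated separately, then lined up on the shared dimension.
galaxy_query = """
SELECT i.item_name,
       (SELECT SUM(units_sold) FROM fact_sales WHERE item_id = i.item_id),
       (SELECT SUM(units_shipped) FROM fact_shipping WHERE item_id = i.item_id)
FROM dim_item AS i
ORDER BY i.item_name
"""
sold_vs_shipped = conn.execute(galaxy_query).fetchall()
print(sold_vs_shipped)  # [('Laptop', 4, 4), ('Phone', 8, 6)]
```

The per-table subqueries are a deliberate choice here: joining both fact tables to the dimension in one step would multiply rows and inflate the sums.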
Advantages of Fact Constellation Schema:
1. Supports multiple fact tables for complex analysis.
2. High performance for large datasets.
3. Flexible for various business requirements.
4. Handles complex relationships well.
5. Scalable for evolving business needs.
6. Reduces data redundancy.
Disadvantages of Fact Constellation Schema:
1. Complex design and maintenance.
2. Slower queries due to multiple joins.
3. Difficult for non-technical users.
4. Requires more storage space.
5. Complicated ETL processes.
Q.8] Differentiate between star schema and snowflake schema
Parameter | Star Schema | Snowflake Schema
1. Ease of maintenance | It has redundant data and is hence less easy to maintain | No redundancy and therefore easier to maintain
2. Ease of change | It has redundant data and is hence less easy to change | No redundancy and therefore easier to change
3. Ease of use | Less complex queries and simple to understand | More complex queries and therefore less easy to understand
4. Normalization | It has de-normalized tables | It has normalized tables
5. Joins | Fewer joins | Higher number of joins
6. Dimension table | It contains only a single dimension table for each dimension | It may have more than one dimension table for each dimension
7. Foreign keys used | Fewer | More

Q.9] Explain Multi-Dimensional Data Model

Multi-Dimensional Data Model:
▪ A multidimensional model views data in the form of a data cube.
▪ A data cube enables data to be modelled and viewed in multiple
dimensions.
▪ A multidimensional data model consists of a fact table and dimension tables.
Fact Table:
▪ This table contains foreign keys that reference the primary keys of multiple
dimension tables.
▪ It contains facts or measures like quantity sold, amount sold, etc.
Dimension Table:
This table provides descriptive information for all measures
recorded in the fact table, like product, item, location, time, etc.
Example:
Consider the data of a shop for items sold per quarter in the city of Delhi. The
data is shown in the table. In this 2D representation, the sales for Delhi are shown
along the time dimension (organized in quarters) and the item dimension (classified
according to the type of item sold). The fact or measure displayed is rupees
sold (in thousands).
The data from the above table can be represented in the form of a 3D (3-dimensional)
data cube, as shown in fig:

Q.10] Explain the following OLAP operations:

1. Roll-up
2. Drill-down
3. Slice
4. Dice
5. Pivot
1] Roll-up:
Roll-up is also known as "consolidation" or "aggregation". The roll-up operation
can be performed in two ways:
a. Reducing dimensions
b. Climbing up a concept hierarchy. A concept hierarchy is a system of grouping
things based on their order or level.
Consider the following diagram:
In this example, roll-up is performed by climbing up (merging) the concept
hierarchy of the Location dimension (City to State).
▪ The cities Pune and Mumbai are rolled up into the state Maharashtra.
▪ The sales figures of Pune and Mumbai are 260 and 390 respectively. They
become 650 after roll-up.
▪ In this aggregation process, the data moves up the location hierarchy from
city to state.
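The roll-up arithmetic above can be sketched in a few lines of Python. The sales figures for Pune and Mumbai are the ones given in the example; the function name roll_up and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

# City-level sales from the example above.
sales_by_city = {"Pune": 260, "Mumbai": 390}

# Concept hierarchy for the Location dimension: city -> state.
city_to_state = {"Pune": "Maharashtra", "Mumbai": "Maharashtra"}

def roll_up(cells, hierarchy):
    """Aggregate each cell into its parent level of the hierarchy."""
    totals = defaultdict(int)
    for member, value in cells.items():
        totals[hierarchy[member]] += value
    return dict(totals)

sales_by_state = roll_up(sales_by_city, city_to_state)
print(sales_by_state)  # {'Maharashtra': 650}
```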
2] Drill-down:
In drill-down, data is fragmented (divided) into smaller parts. It is the opposite
of the roll-up process. It can be done by
a. Moving down in the concept hierarchy, or
b. Adding a dimension.
Consider the following diagram:
In this example, drill-down is performed by moving down the concept hierarchy
of the Time dimension (Quarter to Month).
Quarter Q1 is drilled down to the months January, February, and March, and the
corresponding sales are shown for each month, i.e. the month level is added to
the Time dimension.
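A minimal sketch of drill-down from quarter to month follows. The notes do not give month-level figures, so every number below is hypothetical, as are the names monthly_sales and drill_down. The key point is that drilling down needs the finer-grained data to be stored; it cannot be derived from the quarterly total alone.

```python
# Hypothetical month-level sales; the notes do not give these figures.
monthly_sales = {"January": 100, "February": 120, "March": 130, "April": 90}

# Concept hierarchy for the Time dimension: month -> quarter.
month_to_quarter = {
    "January": "Q1",
    "February": "Q1",
    "March": "Q1",
    "April": "Q2",
}

def drill_down(quarter):
    """Return the month-level cells that make up one quarter."""
    return {
        month: value
        for month, value in monthly_sales.items()
        if month_to_quarter[month] == quarter
    }

q1_months = drill_down("Q1")
print(q1_months)                 # {'January': 100, 'February': 120, 'March': 130}
print(sum(q1_months.values()))   # 350, the Q1 total that rolling up would give
```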

3] Slice:
In this operation, a single value is selected on one dimension, and a new
sub-cube is created from the matching cells.
In this example, the Time dimension is sliced with quarter Q1 as the filter,
which creates a new cube containing only the Q1 data.
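Slicing can be sketched as a filter that fixes one dimension and drops it from the result. The cube contents and the helper name slice_cube are hypothetical; only the choice of Time = Q1 follows the example.

```python
# Hypothetical cube cells keyed by (time, item, location).
cube = {
    ("Q1", "Phone", "Pune"): 100,
    ("Q1", "Laptop", "Mumbai"): 200,
    ("Q2", "Phone", "Pune"): 80,
    ("Q2", "Laptop", "Mumbai"): 150,
}

def slice_cube(cells, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cells.items()
        if key[axis] == value
    }

# Slice on the Time dimension (axis 0) with Q1 as the filter.
q1_slice = slice_cube(cube, axis=0, value="Q1")
print(q1_slice)  # {('Phone', 'Pune'): 100, ('Laptop', 'Mumbai'): 200}
```

The result has one dimension fewer than the original cube, which is what distinguishes a slice from a dice.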
4] Dice:
▪ This operation is similar to a slice. The difference is that in a dice, values
are selected on two or more dimensions to create the sub-cube.
▪ In this example, a sub-cube is selected by choosing Location = Pune
or Mumbai and Time = Q1 or Q2.
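A dice can be sketched as a filter over several dimensions at once. The selection (Time in Q1 or Q2, Location in Pune or Mumbai) follows the example; the cube contents, including the Q3 and Nagpur cells added to show what gets excluded, and the helper name dice_cube are hypothetical.

```python
# Hypothetical cube cells keyed by (time, item, location).
cube = {
    ("Q1", "Phone", "Pune"): 100,
    ("Q1", "Laptop", "Mumbai"): 200,
    ("Q2", "Phone", "Pune"): 80,
    ("Q3", "Phone", "Mumbai"): 60,
    ("Q1", "Phone", "Nagpur"): 40,
}

def dice_cube(cells, allowed):
    """Keep cells whose coordinate on every listed axis is in the allowed set."""
    return {
        key: measure
        for key, measure in cells.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }

# Axis 0 is Time, axis 2 is Location; the Item dimension is left unrestricted.
sub_cube = dice_cube(cube, {0: {"Q1", "Q2"}, 2: {"Pune", "Mumbai"}})
print(len(sub_cube))  # 3 cells survive; the Q3 and Nagpur cells are dropped
```

Unlike the slice, the keys keep all three dimensions, so the result is still a cube of the same dimensionality, only smaller.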

5] Pivot:
In the pivot operation, the data axes are rotated to provide an alternative
presentation of the data. In this example, performing a pivot on the sub-cube
obtained from the slice operation gives a new view of that slice.
Consider the result (slice) of the slice operation.
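For a 2D slice, a pivot amounts to swapping rows and columns. The sketch below assumes a hypothetical slice with items as rows and cities as columns; the labels, values and the function name pivot are illustrative only.

```python
# Hypothetical 2D slice: rows are items, columns are cities.
row_labels = ["Phone", "Laptop"]
col_labels = ["Pune", "Mumbai"]
values = [
    [100, 120],  # Phone sales in Pune, Mumbai
    [200, 150],  # Laptop sales in Pune, Mumbai
]

def pivot(rows, cols, table):
    """Swap the two axes; the cell values themselves do not change."""
    transposed = [list(column) for column in zip(*table)]
    return cols, rows, transposed

new_rows, new_cols, pivoted = pivot(row_labels, col_labels, values)
print(new_rows)  # ['Pune', 'Mumbai']
print(pivoted)   # [[100, 200], [120, 150]]
```

No aggregation happens in a pivot: the same four numbers are shown, only with cities down the side and items across the top.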
Q.11] Distinguish between OLTP & OLAP.
OLTP | OLAP
1. OLTP is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). | 1. OLAP is characterized by a relatively low volume of transactions.
2. OLTP queries are simple and easy to understand. | 2. OLAP queries are often very complex and involve aggregations.
3. OLTP is widely used for small transactions. | 3. OLAP applications are widely used by data mining techniques.
4. OLTP is highly normalized. | 4. OLAP is typically de-normalized.
5. OLTP data is backed up religiously. | 5. OLAP data is backed up at regular intervals.
6. OLTP performance is comparatively fast. | 6. OLAP performance is comparatively slow.
7. Write-heavy operations | 7. Read-heavy operations
8. Lower redundancy | 8. Higher redundancy
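The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The orders table and its rows are hypothetical. As a simplification, both kinds of statement run against the same table here, whereas in practice OLAP queries usually run against a separate warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
)

# OLTP-style work: short write transactions that touch a few rows each.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (city, amount) VALUES (?, ?)", ("Pune", 260.0))
with conn:
    conn.execute("INSERT INTO orders (city, amount) VALUES (?, ?)", ("Mumbai", 390.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (400.0, 2))

# OLAP-style work: a read-only query that scans and aggregates many rows.
summary = conn.execute(
    "SELECT COUNT(*), SUM(amount), AVG(amount) FROM orders"
).fetchone()
print(summary)  # (2, 660.0, 330.0)
```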
