DWM Exp 1-2

Uploaded by

kalax55722

DEPARTMENT OF COMPUTER ENGINEERING

CSL503 Data Warehousing & Mining Lab

Fifth Semester, 2024-2025 (Odd Semester)

Name of Student :

Roll No. :

Batch : Batch

Day / Session : Friday (14:45 – 16:45)

Venue : Computer Lab 307

Experiment No. :

Title of Experiment :

Date of Conduction :

Date of Submission :
Particulars                        Max. Marks    Marks Obtained
Preparedness and Efforts (PE)      3
Knowledge of Tools (KT)            3
Debugging and Results (DR)         3
Documentation (DN)                 3
Punctuality & Lab Ethics (PL)      3
Total                              15

Grades – Meet Expectations (3 Marks), Moderate Expectations (2 Marks), Below Expectations (1 Mark)
Checked and verified by

Name of Faculty : Ms. Vaishali Salvi

Signature :
Date :
Dept. of Computer Engineering, Shree L. R. Tiwari College of Engineering, Thane-401107.
EXPERIMENT NO. 1
AIM: To conduct one case study on building a data warehouse/data mart: write a detailed problem statement
and design the dimensional model (creation of star and snowflake schemas).
OBJECTIVE: To learn how to build a data warehouse using dimensional modelling.
THEORY:
Dimensional Modelling:
Dimensional modelling is a data structure technique optimized for data storage in a data warehouse. Its
purpose is to optimize the database for faster retrieval of data. The concept was developed by Ralph Kimball
and consists of "fact" and "dimension" tables. A dimensional model in a data warehouse is designed to read,
summarize and analyze numeric information such as values, balances, counts and weights.
In the relational model, normalization and ER models reduce redundancy in data. In contrast, a dimensional
model arranges data in such a way that it is easier to retrieve information and generate reports.

Elements of Dimensional Data Model:

1) Fact:
Facts are the measurements or metrics of your business process. For a sales business
process, a measurement would be the quarterly sales number.

2) Dimension:
A dimension provides the context surrounding a business process event. In simple terms, dimensions give
the who, what and where of a fact. In the sales business process, for the fact quarterly sales number,
the dimensions would be
● Who – Customer Names

● Where – Location

● What – Product Name

In other words, a dimension is a window to view information in the facts.

3) Attributes:
The Attributes are the various characteristics of the dimension in dimensional data modeling. In the
Location dimension, the attributes can be
● State

● Country

● Zip code etc.

4) Fact Table:
A fact table is the primary table in dimensional modelling. A fact table contains

● Measurements/facts

● Foreign key to dimension table

5) Dimension Table:
A dimension table contains the dimensions of a fact. It is joined to the fact table via a foreign key.
Dimension tables are de-normalized tables. The dimension attributes are the various columns in a
dimension table. Dimensions offer descriptive characteristics of the facts with the help of their
attributes. There is no set limit on the number of dimensions. A dimension can also contain one or
more hierarchical relationships.
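
The five elements above can be sketched in a few lines of Python. This is only an illustration: the class names (LocationDim, ProductDim, SalesFact), the key values and the sample rows are assumptions made for the example, not part of the lab's case study.

```python
from dataclasses import dataclass

# Hypothetical dimension: one row per location, with descriptive attributes.
@dataclass(frozen=True)
class LocationDim:
    location_key: int   # key referenced by the fact table
    state: str
    country: str
    zip_code: str

# Hypothetical dimension: one row per product.
@dataclass(frozen=True)
class ProductDim:
    product_key: int
    product_name: str

# Hypothetical fact: foreign keys to the dimensions plus one measurement.
@dataclass(frozen=True)
class SalesFact:
    product_key: int     # foreign key to ProductDim
    location_key: int    # foreign key to LocationDim
    sales_amount: float  # the measurement (fact)

# Made-up sample data.
locations = {1: LocationDim(1, "Maharashtra", "India", "401107")}
products = {10: ProductDim(10, "Laptop")}
facts = [SalesFact(10, 1, 55000.0), SalesFact(10, 1, 45000.0)]

def describe(fact):
    """Resolve a fact's foreign keys to readable dimension attributes."""
    product = products[fact.product_key]
    location = locations[fact.location_key]
    return (product.product_name, location.state, fact.sales_amount)

report = [describe(f) for f in facts]
print(report)
```

The fact rows carry only keys and numbers; everything a report reader sees as a label comes from the dimension rows.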

Steps of Dimensional Modelling

The accuracy of your dimensional model determines the success of your data warehouse
implementation. The steps to create a dimensional model are listed here. The model should describe the
Why, How much, When/Where/Who and What of your business process.
Step 1) Identify the Business Process: Identify the actual business process the data warehouse should cover.
This could be Marketing, Sales, HR, etc., as per the data analysis needs of the organization. The selection of
the business process also depends on the quality of data available for that process. It is the most important
step of the data modelling process, and a failure here would cause cascading and irreparable defects.
Step 2) Identify the Grain: The grain describes the level of detail for the business problem/solution.
Identifying the grain means identifying the lowest level of information for any table in your data warehouse.
If a table contains sales data for every day, then it has daily granularity. If a table contains total sales data
for each month, then it has monthly granularity.
During this stage, you answer questions like:
● Do we need to store all the available products or just a few types of products? This decision is based on
the business processes selected for the data warehouse.
● Do we store the product sale information on a monthly, weekly, daily or hourly basis? This decision
depends on the nature of reports requested by executives.
● How do the above two choices affect the database size?
Example of Grain: The CEO at an MNC wants to find the sales for specific products in different locations
on a daily basis. So, the grain is "product sale information by location by the day."
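
To make the effect of the grain concrete, here is a minimal sketch that rolls daily-grain rows up to monthly grain. The rows, the product and location names, and the function name roll_up_to_month are all made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Illustrative rows at daily grain: (day, product, location, sales amount).
daily_sales = [
    (date(2024, 1, 1), "Laptop", "Thane", 100.0),
    (date(2024, 1, 2), "Laptop", "Thane", 150.0),
    (date(2024, 2, 1), "Laptop", "Thane", 200.0),
    (date(2024, 1, 1), "Phone", "Mumbai", 80.0),
]

def roll_up_to_month(rows):
    """Aggregate daily-grain rows to monthly grain by summing sales."""
    monthly = defaultdict(float)
    for day, product, location, amount in rows:
        monthly[(day.year, day.month, product, location)] += amount
    return dict(monthly)

monthly_sales = roll_up_to_month(daily_sales)
print(len(daily_sales), "daily rows ->", len(monthly_sales), "monthly rows")
```

Daily rows can always be summed into monthly rows, but monthly rows cannot be split back into days. A coarser grain gives a smaller table, while the chosen grain fixes the most detailed question the warehouse can answer.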
Step 3) Identify the Dimensions: Dimensions are nouns like date, store, inventory, etc. These dimensions are
where the descriptive data is stored. For example, the date dimension may contain data like year, month
and weekday.
Example of Dimensions: The CEO at an MNC wants to find the sales for specific products in different
locations on a daily basis.
Dimensions: Product, Location and Time
Attributes: For Product: Product key (primary key, referenced as a foreign key by the fact table), Name, Type, Specifications
Hierarchies: For Location: Country, State, City, Street Address, Name

Step 4) Identify the Fact: This step is closely tied to the business users of the system because this is
where they get access to data stored in the data warehouse. Most fact table values are numerical,
like price or cost per unit.
Example of Facts: The CEO at an MNC wants to find the sales for specific products in different locations
on a daily basis. The fact here is the sum of sales by product by location by time.

Step 5) Build Schema: In this step, you implement the dimensional model. A schema is the
database structure (arrangement of tables). There are two popular schemas:
● Star Schema

● Snowflake Schema

Star Schema:
A dimensional model with the fact table in the middle and the dimension tables arranged around it is
called a star schema. Here, the fact table is at the core of the star and the dimension tables are along the
spikes of the star. In this arrangement every attribute in a dimension table has an equal chance of
participating in a query that analyzes that attribute. Each dimension table is related to the fact table in a
one-to-many relationship.
The star schema structure shows how users normally view their metrics along their business dimensions. It
can answer questions of what, when, by whom and to whom. The answers are produced by combining one
or more dimension tables with the fact table. The relationships of a particular row in the fact table with the
rows in each dimension table are shown as the spikes of the star schema. Example: Star
Schema for sales data
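
A minimal sketch of a star schema for sales data, built with Python's sqlite3 module in an in-memory database. The table names (fact_sales, dim_product, dim_location, dim_time), the columns and the sample rows are assumptions for illustration only. Note that SQLite enforces the REFERENCES clauses only when PRAGMA foreign_keys is switched on.

```python
import sqlite3

star_conn = sqlite3.connect(":memory:")
star_conn.executescript("""
-- Dimension tables: one de-normalized table per dimension.
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, state TEXT, country TEXT);
CREATE TABLE dim_time     (time_key INTEGER PRIMARY KEY, day TEXT, month INTEGER, year INTEGER);

-- Fact table at the centre: foreign keys to every dimension plus the measure.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    time_key     INTEGER REFERENCES dim_time(time_key),
    sales_amount REAL
);

-- Made-up sample rows.
INSERT INTO dim_product  VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_location VALUES (1, 'Thane', 'Maharashtra', 'India'), (2, 'Pune', 'Maharashtra', 'India');
INSERT INTO dim_time     VALUES (1, '2024-01-01', 1, 2024), (2, '2024-01-02', 1, 2024);
INSERT INTO fact_sales   VALUES (1, 1, 1, 100.0), (1, 1, 2, 150.0), (2, 2, 1, 40.0);
""")

# Sum of sales by product by location: one join per dimension used.
star_rows = star_conn.execute("""
    SELECT p.name, l.city, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p  ON p.product_key = f.product_key
    JOIN dim_location l ON l.location_key = f.location_key
    GROUP BY p.name, l.city
    ORDER BY p.name, l.city
""").fetchall()
print(star_rows)
```

Each dimension is reached from the fact table with a single join, which is the property the spikes of the star represent.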
Snowflake schema:
A snowflake schema is a refinement of the star schema in which some dimensional hierarchies are further
split (normalized) into a set of smaller dimension tables, forming a shape similar to a snowflake. However,
the snowflake structure can reduce the effectiveness of browsing, since more joins are needed. The
snowflake schema is a more complex data warehouse model than a star schema, and is a variant of the star
schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.
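
The extra joins can be seen in a small sketch that normalizes a location dimension into city, state and country tables. It uses Python's sqlite3 module with an in-memory database; all table names, columns and rows are hypothetical.

```python
import sqlite3

snow_conn = sqlite3.connect(":memory:")
snow_conn.executescript("""
-- The location hierarchy is split into one table per level.
CREATE TABLE dim_country (country_key INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_state   (state_key INTEGER PRIMARY KEY, state TEXT,
                          country_key INTEGER REFERENCES dim_country(country_key));
CREATE TABLE dim_city    (city_key INTEGER PRIMARY KEY, city TEXT,
                          state_key INTEGER REFERENCES dim_state(state_key));
CREATE TABLE fact_sales  (city_key INTEGER REFERENCES dim_city(city_key),
                          sales_amount REAL);

-- Made-up sample rows.
INSERT INTO dim_country VALUES (1, 'India');
INSERT INTO dim_state   VALUES (1, 'Maharashtra', 1), (2, 'Gujarat', 1);
INSERT INTO dim_city    VALUES (1, 'Thane', 1), (2, 'Pune', 1), (3, 'Surat', 2);
INSERT INTO fact_sales  VALUES (1, 100.0), (2, 40.0), (3, 60.0);
""")

# Reaching the state level takes two joins (fact -> city -> state).
sales_by_state = snow_conn.execute("""
    SELECT s.state, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_city c  ON c.city_key = f.city_key
    JOIN dim_state s ON s.state_key = c.state_key
    GROUP BY s.state
    ORDER BY s.state
""").fetchall()
print(sales_by_state)
```

In a star schema the state would be a column of a single location table, so the same question would need only one join. The snowflake form stores each state name once, at the cost of the longer join path.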
LAB EXERCISE: Students will select one case study on building a data warehouse/data mart, write a detailed
problem statement and design the dimensional model (creation of star and snowflake schemas).
Problem Statement: Case Study on Real Estate Management System for creating Data Warehouse/Data Mart

Star Schema

Snowflake Schema

CONCLUSION: Hence, we have understood the concept of dimensional modelling and have created
the dimensional model design using a star schema and a snowflake schema for the selected case study for
building a data warehouse/data mart.
EXPERIMENT NO. 2
AIM: Implementation of all Dimension Table and Fact Table based on Experiment 1 Case Study.
OBJECTIVE: To learn how to build a data warehouse using dimensional modelling.
THEORY:
Fact Table:
In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It is
located at the center of a star schema or a snowflake schema surrounded by dimension tables. Where
multiple fact tables are used, these are arranged as a fact constellation schema. A fact table typically has two
types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary
key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the
content of the data warehouse and store different types of measures like additive, non-additive, and
semi-additive measures.
Fact tables provide the (usually) additive values that act as independent variables by which dimensional
attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the
most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated as
"Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a
day, product and store. Other dimensions might be members of this fact table (such as location/region) but
these add nothing to the uniqueness of the fact records. These "affiliate dimensions" allow for additional
slices of the independent facts but generally provide insights at a higher level of aggregation (a region
contains many stores).
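
The grain "Sales volume by Day by Product by Store" can be expressed as a composite primary key. The sketch below uses Python's sqlite3 module; the table name, the column names and the integer key values are assumptions for illustration.

```python
import sqlite3

grain_conn = sqlite3.connect(":memory:")
grain_conn.execute("""
    CREATE TABLE fact_sales (
        day_key      INTEGER NOT NULL,
        product_key  INTEGER NOT NULL,
        store_key    INTEGER NOT NULL,
        sales_volume INTEGER NOT NULL,
        -- The grain: at most one row per day, product and store.
        PRIMARY KEY (day_key, product_key, store_key)
    )
""")
grain_conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 1, 5)")
grain_conn.execute("INSERT INTO fact_sales VALUES (20240101, 2, 1, 3)")

# A second row for the same day, product and store violates the grain.
try:
    grain_conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 1, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# sales_volume is additive, so it can be summed across all rows.
total_volume = grain_conn.execute(
    "SELECT SUM(sales_volume) FROM fact_sales"
).fetchone()[0]
print(duplicate_rejected, total_volume)
```

The database itself rejects a row that would break the stated grain, and the additive measure still sums cleanly over the rows that remain.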

Dimension Table:
In data warehousing, a dimension table is one of the set of companion tables to a fact table. The fact table
contains business facts (or measures), and foreign keys which refer to candidate keys (normally primary
keys) in the dimension tables.
Contrary to fact tables, dimension tables contain descriptive attributes (or fields) that are typically textual
fields (or discrete numbers that behave like text). These attributes are designed to serve two critical
purposes: query constraining and/or filtering, and query result set labeling.
Dimension attributes should be:
• Verbose (labels consisting of full words)
• Descriptive
• Complete (having no missing values)
• Discretely valued (having only one value per dimension table row)
• Quality assured (having no misspellings or impossible values)
Dimension table rows are uniquely identified by a single key field. It is recommended that the key field be a
simple integer because a key value is meaningless, used only for joining fields between the fact and
dimension tables.
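
One simple way to obtain such integer keys is to number each distinct source value as it is first seen. The following is a sketch under that assumption; the function name build_dimension and the sample rows are hypothetical, and real loading tools may assign keys differently.

```python
from itertools import count

def build_dimension(natural_values):
    """Assign a meaningless integer surrogate key to each distinct value."""
    next_key = count(start=1)
    key_for = {}
    for value in natural_values:
        if value not in key_for:
            key_for[value] = next(next_key)
    return key_for

# Made-up source rows: (city name, sales amount).
source_rows = [("Thane", 100.0), ("Pune", 40.0), ("Thane", 150.0)]

# Dimension: city name -> surrogate key.
city_keys = build_dimension(city for city, _ in source_rows)

# Fact rows store only the integer key, never the descriptive text.
fact_rows = [(city_keys[city], amount) for city, amount in source_rows]
print(city_keys, fact_rows)
```

Because the key carries no meaning, a city can be renamed in the dimension table without touching any fact row.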
OUTPUT:
Fact and Dimension Table of Star and Snowflake Schema

CONCLUSION: Hence, we have created and displayed the fact table and the dimension tables.
