DWM Exp 1-2
Name of Student :
Roll No. :
Batch : Batch
Experiment No. :
Title of Experiment :
Date of Conduction :
Date of Submission :
Particulars                          Max. Marks   Marks Obtained
Preparedness and Efforts (PE)        3
Knowledge of Tools (KT)              3
Debugging and Results (DR)           3
Documentation (DN)                   3
Punctuality & Lab Ethics (PL)        3
Total                                15
Grades – Meet Expectations (3 Marks), Moderate Expectations (2 Marks), Below Expectations (1 Mark)
Checked and verified by
Signature :
Date :
Dept. of Computer Engineering, Shree L. R. Tiwari College of Engineering, Thane-401107.
EXPERIMENT NO. 1
AIM: Conduct one case study on building a Data Warehouse/Data Mart. Write a detailed problem statement and
design the dimensional model (creation of star and snowflake schemas).
OBJECTIVE: To learn how to build a data warehouse using dimensional modelling.
THEORY:
Dimensional Modelling:
Dimensional modelling is a data structure technique optimized for data storage in a data warehouse. Its
purpose is to optimize the database for faster retrieval of data. The concept of dimensional modelling was
developed by Ralph Kimball and consists of “fact” and “dimension” tables. A dimensional model in a data
warehouse is designed to read, summarize, and analyze numeric information such as values, balances, counts,
and weights.
In the relational model, normalization and ER models reduce redundancy in data. In contrast, a dimensional
model in a data warehouse arranges data so that it is easier to retrieve information and generate reports.
2) Dimension:
A dimension provides the context surrounding a business process event. In simple terms, dimensions give
the who, what, and where of a fact. In the Sales business process, for the fact "quarterly sales number",
the dimensions would be:
● Who – Customer Names
● Where – Location
3) Attributes:
Attributes are the various characteristics of a dimension in dimensional data modelling. In the
Location dimension, the attributes can be:
● State
● Country
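4) Fact Table:
A fact table contains the measurements/facts of a business process, together with foreign keys to the
dimension tables.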
5) Dimension Table:
A dimension table contains the dimensions of a fact. Dimension tables are joined to the fact table via a
foreign key, and they are de-normalized tables. The dimension attributes are the various columns in a
dimension table. Dimensions offer descriptive characteristics of the facts with the help of their
attributes. There is no set limit on the number of dimensions. A dimension can also contain one or
more hierarchical relationships.
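The description above can be sketched as a small de-normalized dimension table. This is only an illustration using Python's built-in sqlite3 module; the table name dim_location, the city column, and the sample rows are assumptions made for the sketch and are not part of the case study.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical Location dimension: one row per location, attributes as columns.
conn.execute("""
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,  -- key that a fact table would reference
        city         TEXT NOT NULL,        -- assumed attribute, for illustration
        state        TEXT NOT NULL,
        country      TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO dim_location VALUES (?, ?, ?, ?)",
    [
        (1, "Thane", "Maharashtra", "India"),
        (2, "Pune", "Maharashtra", "India"),
        (3, "Surat", "Gujarat", "India"),
    ],
)

# De-normalized: state and country repeat on every row, so the hierarchy
# city -> state -> country can be read from this one table without joins.
locations_per_state = conn.execute(
    "SELECT state, COUNT(*) FROM dim_location GROUP BY state ORDER BY state"
).fetchall()
print(locations_per_state)  # [('Gujarat', 1), ('Maharashtra', 2)]
```

Keeping state and country in the same table repeats values, but it lets a report group by any level of the hierarchy directly.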
Step 4) Identify the Fact: This step is closely associated with the business users of the system, because
this is where they get access to the data stored in the data warehouse. Most fact table values are numeric,
such as price or cost per unit.
Example of Facts: The CEO at an MNC wants to find the sales for specific products in different locations
on a daily basis. The fact here is the sum of sales by product by location by time.
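As a rough illustration of the fact "sum of sales by product by location by time", the sketch below aggregates a few made-up sales records in plain Python. The record layout, product names, cities, and amounts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw sales records: (day, product, location, amount).
sales = [
    ("2024-01-01", "Laptop", "Mumbai", 500),
    ("2024-01-01", "Laptop", "Mumbai", 300),
    ("2024-01-01", "Phone", "Pune", 200),
    ("2024-01-02", "Laptop", "Mumbai", 400),
]

# The fact is the sales amount summed for each (product, location, day).
totals = defaultdict(int)
for day, product, location, amount in sales:
    totals[(product, location, day)] += amount

for key in sorted(totals):
    print(key, totals[key])
```

Each distinct (product, location, day) combination becomes one summarized fact, which is the level at which the CEO's question is answered.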
Step 5) Build Schema: In this step, you implement the dimensional model. A schema is the database
structure (arrangement of tables). There are two popular schemas:
● Star Schema
● Snowflake Schema
Star Schema:
A dimensional model with the fact table in the middle and the dimension tables arranged around it is
called a star schema. Here, the fact table is at the core of the star and the dimension tables are along the
spikes of the star. In this arrangement, every attribute in a dimension table has an equal chance of
participating in a query. Each dimension table is related to the fact table in a one-to-many relationship.
The star schema structure shows how users normally view their metrics along their business dimensions. It
can answer questions of what, when, by whom, and to whom. The answers are produced by combining one
or more dimension tables with the fact table. Each row in the fact table is related to rows in each
dimension table, and these relationships are shown as the spikes of the star schema.
Example: Star Schema for sales data
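A minimal sketch of a star schema for sales data follows, built with Python's sqlite3 module. All table names, column names, and rows are assumptions chosen for illustration; they are not the schema from the figure or from the case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one per spike of the star (hypothetical names).
    CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, state TEXT, country TEXT);
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);

    -- Fact table at the centre: foreign keys to each dimension plus the measure.
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product(product_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        sales_amount INTEGER
    );

    INSERT INTO dim_product  VALUES (1, 'Laptop'), (2, 'Phone');
    INSERT INTO dim_location VALUES (1, 'Maharashtra', 'India'), (2, 'Gujarat', 'India');
    INSERT INTO dim_customer VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO fact_sales   VALUES (1, 1, 1, 500), (1, 2, 2, 300), (2, 1, 2, 200), (1, 1, 2, 100);
""")

# A question is answered by joining the fact table to the dimensions it needs.
sales_by_product_state = conn.execute("""
    SELECT p.product_name, l.state, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_key  = f.product_key
    JOIN dim_location l ON l.location_key = f.location_key
    GROUP BY p.product_name, l.state
    ORDER BY p.product_name, l.state
""").fetchall()
print(sales_by_product_state)
```

Every dimension is exactly one join away from the fact table, which is what gives the diagram its star shape.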
Snowflake schema:
A snowflake schema is a refinement of the star schema in which some dimensional hierarchies are further
split (normalized) into a set of smaller dimension tables, forming a shape similar to a snowflake. However,
the snowflake structure can reduce the effectiveness of browsing, since more joins are needed. The snowflake
schema is a more complex data warehouse model than a star schema, and is a variation of the star schema. It
is called a snowflake schema because the diagram of the schema resembles a snowflake.
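The extra joins mentioned above can be seen in a small sketch. Here a hypothetical Location dimension is normalized into location, state, and country tables using Python's sqlite3 module; every table name, column name, and row is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The location hierarchy is split into separate, normalized tables.
    CREATE TABLE dim_country (country_key INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE dim_state (
        state_key   INTEGER PRIMARY KEY,
        state_name  TEXT,
        country_key INTEGER REFERENCES dim_country(country_key)
    );
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,
        city         TEXT,
        state_key    INTEGER REFERENCES dim_state(state_key)
    );
    CREATE TABLE fact_sales (
        location_key INTEGER REFERENCES dim_location(location_key),
        sales_amount INTEGER
    );

    INSERT INTO dim_country  VALUES (1, 'India');
    INSERT INTO dim_state    VALUES (1, 'Maharashtra', 1), (2, 'Gujarat', 1);
    INSERT INTO dim_location VALUES (1, 'Thane', 1), (2, 'Pune', 1), (3, 'Surat', 2);
    INSERT INTO fact_sales   VALUES (1, 500), (2, 300), (3, 200);
""")

# Reaching the country name now takes three joins instead of one.
sales_by_state = conn.execute("""
    SELECT s.state_name, c.country_name, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_location l ON l.location_key = f.location_key
    JOIN dim_state    s ON s.state_key    = l.state_key
    JOIN dim_country  c ON c.country_key  = s.country_key
    GROUP BY s.state_name, c.country_name
    ORDER BY s.state_name
""").fetchall()
print(sales_by_state)  # [('Gujarat', 'India', 200), ('Maharashtra', 'India', 800)]
```

The state and country names are now stored once each, at the cost of a longer join path for every query that needs them.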
LAB EXERCISE: Students will select one case study on building a Data Warehouse/Data Mart, write a detailed
problem statement, and design the dimensional model (creation of star and snowflake schemas).
Problem Statement: Case Study on Real Estate Management System for creating Data Warehouse/Data Mart
Star Schema
Snowflake Schema
CONCLUSION: Hence, we have understood the concept of dimensional modelling and have created the
dimensional model design, using a star schema and a snowflake schema, for the selected case study on
building a Data Warehouse/Data Mart.
EXPERIMENT NO. 2
AIM: Implementation of all dimension tables and the fact table based on the Experiment 1 case study.
OBJECTIVE: To learn how to build a data warehouse using dimensional modelling.
THEORY:
Fact Table:
In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It is
located at the center of a star schema or a snowflake schema surrounded by dimension tables. Where
multiple fact tables are used, these are arranged as a fact constellation schema. A fact table typically has two
types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary
key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the
content of the data warehouse and store different types of measures, such as additive, non-additive, and
semi-additive measures.
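The difference between additive and semi-additive measures can be illustrated with a few made-up rows. In this sketch the deposit amount is treated as an additive measure and the closing balance as a semi-additive one; the account names and figures are hypothetical and unrelated to the case study.

```python
# Hypothetical daily rows: (day, account, deposit_amount, closing_balance).
rows = [
    ("2024-01-01", "A", 100, 100),
    ("2024-01-01", "B", 50, 50),
    ("2024-01-02", "A", 20, 120),
    ("2024-01-02", "B", 0, 50),
]

# Additive measure: deposits can be summed across every dimension.
total_deposits = sum(deposit for _, _, deposit, _ in rows)

# Semi-additive measure: balances can be summed across accounts for one day...
balance_on_jan_2 = sum(bal for day, _, _, bal in rows if day == "2024-01-02")

# ...but summing them across days counts the same money more than once.
naive_balance_sum = sum(bal for _, _, _, bal in rows)

print(total_deposits, balance_on_jan_2, naive_balance_sum)  # 170 170 320
```

A non-additive measure, such as a ratio or a unit price, cannot be meaningfully summed along any dimension and is usually recomputed from additive components.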
Fact tables provide the (usually) additive values that act as independent variables by which dimensional
attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents the
most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated as
"Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a
day, product and store. Other dimensions might be members of this fact table (such as location/region) but
these add nothing to the uniqueness of the fact records. These "affiliate dimensions" allow for additional
slices of the independent facts but generally provide insights at a higher level of aggregation (a region
contains many stores).
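The grain "Sales volume by Day by Product by Store" can be enforced with a composite primary key. The sketch below uses Python's sqlite3 module; the table name, column names, and values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (day, product, store): the composite key encodes the grain.
conn.execute("""
    CREATE TABLE fact_sales (
        date_key     INTEGER NOT NULL,
        product_key  INTEGER NOT NULL,
        store_key    INTEGER NOT NULL,
        sales_volume INTEGER NOT NULL,
        PRIMARY KEY (date_key, product_key, store_key)
    )
""")
conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 1, 10)")
conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 2, 4)")  # different store: allowed

# A second row for the same day, product, and store violates the grain.
try:
    conn.execute("INSERT INTO fact_sales VALUES (20240101, 1, 1, 7)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

A region column could be added to this table for slicing, but it would not need to be part of the key, since the store already determines the region.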
Dimension Table:
In data warehousing, a dimension table is one of the set of companion tables to a fact table. The fact table
contains business facts (or measures), and foreign keys which refer to candidate keys (normally primary
keys) in the dimension tables.
Contrary to fact tables, dimension tables contain descriptive attributes (or fields) that are typically textual
fields (or discrete numbers that behave like text). These attributes are designed to serve two critical
purposes: query constraining and/or filtering, and query result set labeling.
Dimension attributes should be:
• Verbose (labels consisting of full words)
• Descriptive
• Complete (having no missing values)
• Discretely valued (having only one value per dimension table row)
• Quality assured (having no misspellings or impossible values)
Dimension table rows are uniquely identified by a single key field. It is recommended that the key field be a
simple integer because a key value is meaningless, used only for joining fields between the fact and
dimension tables.
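A simple integer key of this kind can be sketched with Python's sqlite3 module, where an INTEGER PRIMARY KEY column is assigned automatically on insert. The table names, the product_code column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The integer key carries no business meaning; it exists only for joining.
conn.execute("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,   -- assigned by SQLite on insert
        product_code TEXT UNIQUE,           -- assumed business identifier
        product_name TEXT NOT NULL          -- verbose, descriptive label
    )
""")
insert_sql = "INSERT INTO dim_product (product_code, product_name) VALUES (?, ?)"
laptop_key = conn.execute(insert_sql, ("LAP-15", "Laptop 15 inch")).lastrowid
phone_key = conn.execute(insert_sql, ("PHN-6", "Phone 6 inch")).lastrowid

conn.execute("""
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product(product_key),
        sales_amount INTEGER
    )
""")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(laptop_key, 500), (phone_key, 200), (laptop_key, 100)],
)

# The dimension attribute labels the result set; the key only drives the join.
labelled_sales = conn.execute("""
    SELECT p.product_name, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(labelled_sales)  # [('Laptop 15 inch', 600), ('Phone 6 inch', 200)]
```

Because the fact table stores only the integer key, a change to a product's code or name touches one dimension row and no fact rows.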
OUTPUT:
Fact and Dimension Table of Star and Snowflake Schema
CONCLUSION: Hence, we have created and displayed the fact table and dimension tables.