Assignment - 2 DWH
1) What are the objectives of dimensional modeling, and how does it help in
data warehouse design?
Objectives of Dimensional Modeling in Data Warehousing
o Organizes data in a structure that enables fast query execution and efficient
retrieval.
o Uses business-friendly concepts like facts and dimensions, making it easier for
users to analyze and interpret data.
o Provides an intuitive model that allows users to perform operations like drill-
down, roll-up, slice, and dice for better insights.
o Structures data into fact tables (numeric business measures) and dimension
tables (descriptive attributes), making it easier to retrieve meaningful insights.
5. Reduces Redundancy:
Conclusion:
• Identify Trends and Patterns: Uses historical and real-time data for forecasting and
trend analysis.
• Improve Decision Speed: Summarizes vast amounts of data into concise reports,
reducing the time needed for analysis.
• Risk Management: Detects anomalies and predicts potential business risks.
EIS relies heavily on data warehousing to provide structured and historical data. The data
warehouse:
• Stores large volumes of historical and current data from multiple sources.
Example: A retail company uses a data warehouse to track sales performance across
multiple locations and make strategic pricing decisions.
• Drill-down & Roll-up: Executives can zoom into specific regions, products, or time
periods.
Example: A CEO of an airline company uses OLAP to analyze ticket sales by city, season,
and customer category, helping optimize future pricing strategies.
5. Conclusion
EIS, powered by data warehousing and OLAP, plays a crucial role in strategic decision-
making by offering real-time insights, trend analysis, and performance tracking. By
integrating historical data, business intelligence, and predictive analytics, it helps
executives make data-driven decisions efficiently.
This data-driven approach enables businesses to stay competitive, minimize risks, and
achieve long-term success.
1. Star Schema
Definition:
The Star Schema is the simplest form of a dimensional model, where a single fact table is
connected to multiple denormalized dimension tables. The fact table is at the center, and
dimensions spread outward, resembling a star shape.
Diagram:
Characteristics:
• Dimension tables are denormalized (i.e., stored as flat tables with redundancy).
Example:
• A retail sales database where the fact table contains sales data and the dimension
tables store information about products, time, customers, employees, and locations.
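The retail example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a full design: the table names (fact_sales, dim_product, dim_location), the columns, and the sample rows are all assumptions made for the example, and only two of the listed dimensions are shown.

```python
import sqlite3

# In-memory database; every table, column, and row here is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_location VALUES (1, 'Delhi', 'North'), (2, 'Chennai', 'South');
INSERT INTO fact_sales VALUES (1, 1, 1000.0), (1, 2, 800.0), (2, 1, 300.0);
""")

# One join per dimension: the fact table sits at the center of the star.
star_rows = conn.execute("""
SELECT p.category, l.region, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_location AS l ON l.location_id = f.location_id
GROUP BY p.category, l.region
ORDER BY p.category, l.region
""").fetchall()
print(star_rows)
```

Because each dimension is a single flat table, the query needs only one join per dimension, which is the main reason star schemas are associated with simple, fast reporting.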
2. Snowflake Schema
Definition:
The Snowflake Schema is a more normalized version of the Star Schema, where
dimension tables are divided into multiple related tables, reducing redundancy. It resembles
a snowflake due to its branching structure.
Diagram:
Characteristics:
Example:
• A banking database where the Customer Dimension is split into Customer, Region,
and Country tables to avoid duplication.
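A rough sketch of the banking example, again using sqlite3. The Customer, Region, and Country split follows the text; the fact table (fact_account), the column names, and the sample rows are assumptions added only so the query has something to aggregate.

```python
import sqlite3

# In-memory database; all names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_region (
    region_id INTEGER PRIMARY KEY,
    region_name TEXT,
    country_id INTEGER REFERENCES dim_country(country_id)
);
CREATE TABLE dim_customer (
    customer_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    region_id INTEGER REFERENCES dim_region(region_id)
);
CREATE TABLE fact_account (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    balance REAL
);
INSERT INTO dim_country VALUES (1, 'India');
INSERT INTO dim_region VALUES (1, 'North', 1), (2, 'South', 1);
INSERT INTO dim_customer VALUES (1, 'Asha', 1), (2, 'Ravi', 2), (3, 'Meera', 1);
INSERT INTO fact_account VALUES (1, 500.0), (2, 250.0), (3, 100.0);
""")

# The country name is stored once; reaching it takes a chain of joins
# (customer -> region -> country) instead of one flat dimension lookup.
snowflake_rows = conn.execute("""
SELECT co.country_name, r.region_name, SUM(f.balance)
FROM fact_account AS f
JOIN dim_customer AS c ON c.customer_id = f.customer_id
JOIN dim_region AS r ON r.region_id = c.region_id
JOIN dim_country AS co ON co.country_id = r.country_id
GROUP BY co.country_name, r.region_name
ORDER BY r.region_name
""").fetchall()
print(snowflake_rows)
```

The trade-off is visible in the query: less duplicated text in the dimension tables, but more joins per report.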
3. Fact Constellation Schema
Definition:
The Fact Constellation Schema (also called Galaxy Schema) consists of multiple fact
tables sharing common dimension tables. It is used for complex data warehouses that
support multiple business processes.
Diagram:
Characteristics:
Example:
• A logistics company where separate fact tables exist for Sales, Inventory, and
Shipment data, but they all share Product, Customer, and Time dimensions.
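The idea of several fact tables sharing one dimension can be sketched as follows. Only two fact tables and one shared dimension are shown to keep the example short; the names (fact_sales, fact_shipment, dim_product) and the data are illustrative assumptions.

```python
import sqlite3

# In-memory database; all names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER
);
CREATE TABLE fact_shipment (
    product_id INTEGER REFERENCES dim_product(product_id),
    units_shipped INTEGER
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 7);
INSERT INTO fact_shipment VALUES (1, 12), (2, 7), (2, 1);
""")

# Each fact table is aggregated separately and lined up on the shared
# dimension; joining the two fact tables directly would multiply rows.
constellation_rows = conn.execute("""
SELECT p.name,
       (SELECT COALESCE(SUM(s.units_sold), 0)
          FROM fact_sales AS s WHERE s.product_id = p.product_id),
       (SELECT COALESCE(SUM(sh.units_shipped), 0)
          FROM fact_shipment AS sh WHERE sh.product_id = p.product_id)
FROM dim_product AS p
ORDER BY p.name
""").fetchall()
print(constellation_rows)
```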
Key Differences
Feature  | Star Schema               | Snowflake Schema                  | Fact Constellation Schema
Use Case | Simple and fast reporting | When data redundancy is a concern | Large, complex data warehouses
Conclusion:
• Fact Constellation Schema is ideal for large, complex data warehouses with
multiple business processes.
Definition:
3. Provides Fast Data Retrieval – Uses pre-aggregated data and optimized structures
to reduce query response time.
4. Drill-Down & Roll-Up – Users can navigate from summary data to detailed
records or aggregate data into higher-level summaries.
5. Slicing & Dicing – Enables users to filter specific data views by selecting subsets of
dimensions.
Proposed by Dr. E.F. Codd, these rules define an ideal OLAP system:
2. Transparency – Users should interact with OLAP tools without needing knowledge
of the underlying data structure.
10. Intuitive Data Manipulation – Users can drill, pivot, and slice/dice data
interactively.
Conclusion:
OLAP plays a vital role in data warehousing by enabling businesses to perform fast,
interactive, and multidimensional data analysis. It improves decision-making, enhances
reporting efficiency, and provides insights into trends and patterns for better strategic
planning.
OLAP (Online Analytical Processing) allows users to analyze data from multiple
perspectives using various operations. The four key OLAP operations are:
1. Drill-Down
Definition:
Drill-down moves from summary-level data to detailed data, increasing the level of
granularity.
Example:
A sales report shows yearly sales data. Drilling down will break it down into quarterly →
monthly → weekly → daily sales.
Yearly Sales → Quarterly Sales → Monthly Sales → Weekly Sales → Daily Sales
Use Case: Analyzing why Q3 sales were lower by looking at monthly or weekly trends.
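A small sketch of drill-down in plain Python, assuming a hypothetical list of daily sales records. The same records are totalled at progressively finer levels: year, then quarter, then the months inside Q3.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (date, amount).
daily_sales = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 2, 10), 150.0),
    (date(2024, 7, 5), 80.0),
    (date(2024, 8, 20), 40.0),
    (date(2024, 9, 1), 30.0),
]

def quarter(d):
    """Return the calendar quarter (1 to 4) of a date."""
    return (d.month - 1) // 3 + 1

def total_by(records, key):
    """Sum amounts grouped by key(date)."""
    totals = defaultdict(float)
    for d, amount in records:
        totals[key(d)] += amount
    return dict(totals)

# Summary level: one number per year.
yearly = total_by(daily_sales, lambda d: d.year)
# Drill down one level: year -> quarter.
quarterly = total_by(daily_sales, lambda d: (d.year, quarter(d)))
# Drill down again, restricted to Q3: quarter -> month.
q3_records = [r for r in daily_sales if quarter(r[0]) == 3]
q3_monthly = total_by(q3_records, lambda d: (d.year, d.month))

print(yearly, quarterly, q3_monthly)
```

In this made-up data Q3 is lower than Q1, and the monthly breakdown shows the decline running from July through September.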
2. Roll-Up
Definition:
Roll-up is the opposite of drill-down; it aggregates detailed data into higher levels of
abstraction.
Example:
A sales report shows daily sales data. Rolling up will summarize it into weekly → monthly
→ quarterly → yearly sales.
Daily Sales → Weekly Sales → Monthly Sales → Quarterly Sales → Yearly Sales
Use Case: Generating a yearly revenue summary for management instead of detailed
daily transactions.
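Roll-up can be sketched as repeatedly aggregating one level into its parent level. The daily totals below are hypothetical, and the weekly level is skipped for brevity.

```python
from collections import defaultdict

# Hypothetical daily totals keyed by (year, month, day).
daily = {
    (2024, 1, 1): 10.0,
    (2024, 1, 2): 20.0,
    (2024, 2, 1): 5.0,
    (2024, 4, 1): 15.0,
}

def roll_up(totals, to_parent):
    """Aggregate totals into the parent level given by to_parent(key)."""
    parent = defaultdict(float)
    for key, amount in totals.items():
        parent[to_parent(key)] += amount
    return dict(parent)

# Day -> month: drop the day component.
monthly = roll_up(daily, lambda k: (k[0], k[1]))
# Month -> quarter: map month 1-12 to quarter 1-4.
quarterly = roll_up(monthly, lambda k: (k[0], (k[1] - 1) // 3 + 1))
# Quarter -> year: keep only the year.
yearly = roll_up(quarterly, lambda k: k[0])

print(monthly, quarterly, yearly)
```

Each level is built from the level below it, so the yearly figure equals the sum of all daily figures.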
3. Slice-and-Dice
Definition:
Example:
• Slice: Selecting sales data for January 2024 from a dataset containing all months.
• Dice: Selecting sales data for January 2024 in the "Electronics" category from
the "North Region".
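The two selections above can be sketched as filters over a list of records. The records and the helper name `select` are assumptions for illustration; a slice fixes one dimension, while a dice fixes several at once.

```python
# Hypothetical sales records with three dimensions and one measure.
sales = [
    {"month": "2024-01", "category": "Electronics", "region": "North", "amount": 120.0},
    {"month": "2024-01", "category": "Clothing", "region": "North", "amount": 60.0},
    {"month": "2024-01", "category": "Electronics", "region": "South", "amount": 90.0},
    {"month": "2024-02", "category": "Electronics", "region": "North", "amount": 70.0},
]

def select(rows, **conditions):
    """Keep the rows whose fields match every given dimension value."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]

# Slice: fix a single dimension (month).
january_slice = select(sales, month="2024-01")
# Dice: fix several dimensions at once (month, category, region).
january_dice = select(sales, month="2024-01", category="Electronics", region="North")

print(len(january_slice), january_dice)
```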
4. Pivot (Rotation)
Definition:
Pivot reorients the data cube by changing the arrangement of dimensions for better
visualization.
Example:
A sales report is structured as Region vs. Product Type. Pivoting it might restructure it as
Product Type vs. Region for different insights.
Use Case: Swapping the rows and columns of a Region vs. Product Type sales report so
that regions can be compared within each product type.
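A pivot can be sketched as swapping the outer and inner keys of a nested table. The figures below are hypothetical; the values themselves do not change, only the orientation.

```python
# Hypothetical report laid out as Region (rows) vs. Product Type (columns).
by_region = {
    "North": {"Electronics": 120.0, "Clothing": 60.0},
    "South": {"Electronics": 90.0, "Clothing": 40.0},
}

def pivot(table):
    """Swap the row and column dimensions of a nested dict."""
    rotated = {}
    for row_key, columns in table.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

# Same numbers, now laid out as Product Type (rows) vs. Region (columns).
by_product = pivot(by_region)
print(by_product)
```

Pivoting twice returns the original layout, which confirms that no data is added or lost by the rotation.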
Conclusion:
• Roll-Up → Summarization
6) Differentiate between MOLAP, ROLAP, and DOLAP models. What are the
key differences between ROLAP and MOLAP?
In summary, MOLAP is best for fast, pre-aggregated analytics, ROLAP is suitable for
handling large, dynamic datasets, and DOLAP is useful for localized personal data analysis.
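The contrast between pre-aggregation and on-demand querying can be illustrated with a deliberately simplified sketch. It is not how any particular OLAP product is implemented: the rows, the `cube` dictionary, and both function names are assumptions. The ROLAP-style function scans detail rows at query time, while the MOLAP-style function reads a total that was computed in advance.

```python
# Hypothetical detail rows: (region, category, amount).
rows = [
    ("North", "Electronics", 120.0),
    ("North", "Clothing", 60.0),
    ("South", "Electronics", 90.0),
]

def rolap_total(region, category):
    """ROLAP-style: compute the answer from detail rows at query time."""
    return sum(a for r, c, a in rows if r == region and c == category)

# MOLAP-style: aggregate once up front into a cube keyed by dimension values.
cube = {}
for r, c, a in rows:
    cube[(r, c)] = cube.get((r, c), 0.0) + a

def molap_total(region, category):
    """MOLAP-style: look up a pre-aggregated cell; no scan is needed."""
    return cube.get((region, category), 0.0)

before = (rolap_total("North", "Electronics"), molap_total("North", "Electronics"))

# A new row arrives after the cube was built.
rows.append(("North", "Electronics", 30.0))
after = (rolap_total("North", "Electronics"), molap_total("North", "Electronics"))

print(before, after)
```

The lookup is fast but stale until the cube is rebuilt, whereas the scan always reflects the latest rows. This mirrors the summary above: pre-aggregation suits fast analytics, and on-demand querying suits data that changes often.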
Query and reporting in a data warehouse play a crucial role in Business Intelligence (BI) by
enabling organizations to extract meaningful insights from large datasets. These tools help
in:
Executive Information Systems (EIS) are specialized BI tools designed for senior
executives to access critical business data. They support strategic decision-making by:
5. Data Integration – Combines information from multiple sources into a unified view.
Conclusion
Query and reporting tools in a data warehouse enhance business intelligence by enabling
efficient data access and analysis. EIS plays a strategic role by providing executives with
timely, relevant insights for informed decision-making and long-term business planning.