DWDM Unit 2
UNIT 2
Syllabus
Data Cube: A Multidimensional Data Model
“What is a data cube?” A data cube allows data to be modeled
and viewed in multiple dimensions.
It is defined by dimensions and facts.
In general terms, dimensions are the perspectives or entities
with respect to which an organization wants to keep records.
For example, AllElectronics may create a sales data warehouse in
order to keep records of the store’s sales with respect to the
dimensions time, item, branch, and location. These dimensions
allow the store to keep track of things like monthly sales of items
and the branches and locations at which the items were sold.
Each dimension may have a table associated with it, called a
dimension table, which further describes the dimension.
For example, a dimension table for item may contain the attributes
item name, brand, and type. Dimension tables can be specified by
users or experts, or automatically generated and adjusted based on
data distributions.
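The idea of viewing facts along several dimensions can be sketched in a few lines. This is a minimal illustration, assuming a handful of made-up sales records and a single measure, units_sold; the record values and the helper build_cube are hypothetical and not part of the AllElectronics example. It aggregates the measure for every combination of the four dimensions, using "*" to mean "all values" of a dimension.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales records: (time, item, branch, location, units_sold)
sales = [
    ("Q1", "TV", "B1", "Delhi", 10),
    ("Q1", "Phone", "B1", "Delhi", 25),
    ("Q2", "TV", "B2", "Mumbai", 7),
    ("Q2", "TV", "B1", "Delhi", 5),
]
DIMENSIONS = ("time", "item", "branch", "location")


def build_cube(records):
    """Sum units_sold for every subset of the dimensions."""
    cube = defaultdict(int)
    n = len(DIMENSIONS)
    for *dims, units in records:
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                # Dimensions that are not kept are summarized as "*" (all).
                key = tuple(dims[i] if i in kept else "*" for i in range(n))
                cube[key] += units
    return cube


cube = build_cube(sales)
print(cube[("*", "TV", "*", "*")])      # TV units over all times, branches, locations
print(cube[("Q1", "*", "*", "Delhi")])  # all items sold in Delhi during Q1
print(cube[("*", "*", "*", "*")])       # grand total
```

Materializing every combination is only practical for tiny data; it is done here to make the "multiple views of the same facts" idea visible.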
Principles of Dimensional Modeling
What is Dimensional Modeling?
Dimensional data modeling is one of the data
modeling techniques used in data warehouse design.
The concept of dimensional modeling was developed
by Ralph Kimball, and it is built on fact and
dimension tables. Because the main goal of this modeling
is to improve data retrieval, it is optimized for
SELECT operations.
In dimensional modeling, each transaction record is
divided into either "facts," which are usually
numerical transaction data, or "dimensions," which
are the reference information that gives context to the
facts.
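The split between facts and dimensions can be illustrated with a small in-memory SQLite database. This is only a sketch: the table names (sales_fact, item_dim, time_dim), their columns, and the sample rows are assumptions made for illustration. The fact table holds the numeric measure and keys; the dimension tables hold the descriptive context used in the SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive reference information.
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, month TEXT, year INTEGER);

-- Fact table: keys to the dimensions plus a numeric measure.
CREATE TABLE sales_fact (
    time_key   INTEGER REFERENCES time_dim(time_key),
    item_key   INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER
);

INSERT INTO item_dim VALUES (1, 'TV', 'Acme', 'electronics'), (2, 'Phone', 'Acme', 'electronics');
INSERT INTO time_dim VALUES (1, 'Jan', 2024), (2, 'Feb', 2024);
INSERT INTO sales_fact VALUES (1, 1, 10), (1, 2, 25), (2, 1, 7);
""")

# A typical read-only query: join facts to their dimensions and aggregate.
rows = conn.execute("""
    SELECT t.month, i.item_name, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN item_dim AS i ON i.item_key = f.item_key
    GROUP BY t.month, i.item_name
    ORDER BY t.time_key, i.item_name
""").fetchall()
print(rows)  # [('Jan', 'Phone', 25), ('Jan', 'TV', 10), ('Feb', 'TV', 7)]
```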
Objectives of Dimensional Modeling
The purposes of dimensional modeling are:
To produce a database architecture that is easy for end
users to understand and to write queries against.
To maximize the efficiency of queries.
Dimensional modeling achieves these goals by minimizing
the number of tables and the relationships between them.
Advantages of Dimensional Modeling
The benefits of dimensional modeling are:
Dimensional modeling is simple: Dimensional modeling methods
make it possible for warehouse designers to create database schemas
that business users can easily grasp and comprehend. There is no
need for extensive training on how to read diagrams, and there are no
complicated relationships between different data elements.
Dimensional modeling promotes data quality: The star schema
enables warehouse administrators to enforce referential integrity checks
on the data warehouse. Since the key of a fact record is a
concatenation of the keys of its associated dimensions, a fact
record is loaded successfully only if the corresponding dimension
records are properly defined and already exist in the database.
By enforcing foreign key constraints as a form of referential integrity
check, data warehouse DBAs add a line of defense against corrupted
warehouse data.
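The referential integrity check described above can be demonstrated with SQLite, which enforces foreign keys only when PRAGMA foreign_keys is switched on. This is a minimal sketch; the tables item_dim and sales_fact and their rows are assumed for illustration. A fact row that points at an existing dimension row is accepted, while one that points at a missing dimension row is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT)")
conn.execute("""
    CREATE TABLE sales_fact (
        item_key   INTEGER NOT NULL REFERENCES item_dim(item_key),
        units_sold INTEGER
    )
""")

conn.execute("INSERT INTO item_dim VALUES (1, 'TV')")
conn.execute("INSERT INTO sales_fact VALUES (1, 10)")  # item 1 exists: accepted

try:
    conn.execute("INSERT INTO sales_fact VALUES (99, 5)")  # item 99 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

fact_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
print(rejected, fact_count)  # True 1
```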
Performance optimization is possible through aggregates: As the
size of the data warehouse increases, performance optimization
becomes a pressing concern. Users who have to wait for hours
to get a response to a query will quickly become discouraged with the
warehouse. Aggregates are one of the easiest methods by which query
performance can be optimized.
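One common way to use aggregates is to precompute a summary table when the warehouse is loaded, so that frequent queries read a few summary rows instead of scanning the detailed facts. The sketch below assumes a simplified sales_fact table and a hypothetical summary table named sales_by_month; it shows that both return the same answer, with the summary query touching far fewer rows in a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (month TEXT, item_key INTEGER, units_sold INTEGER);
INSERT INTO sales_fact VALUES ('Jan', 1, 10), ('Jan', 2, 25), ('Feb', 1, 7), ('Feb', 1, 3);

-- Aggregate table: one row per month, computed once at load time.
CREATE TABLE sales_by_month AS
    SELECT month, SUM(units_sold) AS units_sold
    FROM sales_fact
    GROUP BY month;
""")

# Same question answered from the detailed facts and from the aggregate.
detail = conn.execute(
    "SELECT SUM(units_sold) FROM sales_fact WHERE month = 'Feb'"
).fetchone()[0]
summary = conn.execute(
    "SELECT units_sold FROM sales_by_month WHERE month = 'Feb'"
).fetchone()[0]
print(detail, summary)  # 10 10
```

The trade-off is that the aggregate table must be refreshed whenever new facts are loaded.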
Disadvantages of Dimensional Modeling
4. Star schema: It takes less time for the execution of queries.
   Snowflake schema: It takes more time than the star schema for the execution of queries.
10. Star schema: It has high data redundancy.
    Snowflake schema: It has low data redundancy.
What is Fact Constellation Schema?
Fact constellation is a schema for representing a
multidimensional model. It is a collection of multiple
fact tables that share some common dimension tables. It
can be viewed as a collection of several star schemas. It
is one of the widely used schemas for data warehouse
design, and it is much more complex than the star and
snowflake schemas.
A fact constellation means two or more fact tables
sharing one or more dimensions. It is also
called a galaxy schema.
The fact constellation schema is a sophisticated database design in
which information is difficult to summarize. It can be implemented
between aggregate fact tables, or by decomposing a complex
fact table into independent, simpler fact tables.
Example: A fact constellation schema is shown in the figure below.
This schema defines two fact tables, sales and
shipping. The sales table is treated along four dimensions,
namely, time, item, branch, and location. The schema
contains a fact table for sales that includes keys to each
of the four dimensions, along with two measures:
Rupee_sold and units_sold. The shipping table has five
dimensions, or keys: item_key, time_key, shipper_key,
from_location, and to_location, and two measures:
Rupee_cost and units_shipped.
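The sales and shipping example can be written out as tables to make the sharing of dimensions concrete. This is a sketch under assumptions: the fact columns follow the description above, but the fact table names (sales_fact, shipping_fact), the dimension table names, their attributes, and all sample rows are invented for illustration. Both fact tables reference the same item, time, and location dimensions, so one query can compare measures from both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (attributes are illustrative).
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE shipper_dim  (shipper_key INTEGER PRIMARY KEY, shipper_name TEXT);

-- Fact table 1: sales, four dimension keys and two measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    Rupee_sold   REAL,
    units_sold   INTEGER
);

-- Fact table 2: shipping, five keys and two measures.
-- It shares the item, time, and location dimensions with sales.
CREATE TABLE shipping_fact (
    item_key      INTEGER REFERENCES item_dim(item_key),
    time_key      INTEGER REFERENCES time_dim(time_key),
    shipper_key   INTEGER REFERENCES shipper_dim(shipper_key),
    from_location INTEGER REFERENCES location_dim(location_key),
    to_location   INTEGER REFERENCES location_dim(location_key),
    Rupee_cost    REAL,
    units_shipped INTEGER
);

INSERT INTO time_dim VALUES (1, 'Q1');
INSERT INTO item_dim VALUES (1, 'TV'), (2, 'Phone');
INSERT INTO branch_dim VALUES (1, 'B1');
INSERT INTO location_dim VALUES (1, 'Delhi'), (2, 'Mumbai');
INSERT INTO shipper_dim VALUES (1, 'FastShip');
INSERT INTO sales_fact VALUES (1, 1, 1, 1, 5000.0, 10), (1, 2, 1, 1, 7500.0, 25);
INSERT INTO shipping_fact VALUES (1, 1, 1, 1, 2, 300.0, 12), (2, 1, 1, 1, 2, 200.0, 30);
""")

# Compare measures from both fact tables through the shared item dimension.
rows = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(s.units_sold) FROM sales_fact AS s
             WHERE s.item_key = i.item_key),
           (SELECT SUM(sh.units_shipped) FROM shipping_fact AS sh
             WHERE sh.item_key = i.item_key)
    FROM item_dim AS i
    ORDER BY i.item_key
""").fetchall()
print(rows)  # [('TV', 10, 12), ('Phone', 25, 30)]
```

Each fact table is aggregated in its own subquery; joining both fact tables directly on item_key would multiply rows and inflate the sums.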
The primary disadvantage of the fact constellation
schema is that it is a more challenging design because
many variants for specific kinds of aggregation must
be considered and selected.
OLAP in the Data Warehouse
OLAP stands for On-Line Analytical Processing. OLAP
is a category of software technology that enables
analysts, managers, and executives to gain insight into
information through fast, consistent, interactive access to a
wide variety of possible views of data that has been
transformed from raw information to reflect the real
dimensionality of the enterprise as understood by its
users.
OLAP implements the multidimensional analysis of
business information and supports the capability for
complex calculations, trend analysis, and sophisticated data
modeling.
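As a rough illustration of multidimensional analysis, the sketch below applies two standard OLAP operations, roll-up and slice, to a few made-up fact rows. The operation names are standard OLAP terminology rather than something defined in the text above, and the data and the helper roll_up are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, city, units_sold)
facts = [
    (2023, "Q1", "Delhi", 10),
    (2023, "Q2", "Delhi", 15),
    (2023, "Q1", "Mumbai", 8),
    (2024, "Q1", "Delhi", 12),
]


def roll_up(rows, key):
    """Sum the measure (last field) for each group produced by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[-1]
    return dict(totals)


# Roll-up: move from the quarter level to the coarser year level.
by_year = roll_up(facts, key=lambda r: r[0])

# Slice: fix one dimension (city = Delhi), then view the rest by year.
delhi_slice = [r for r in facts if r[2] == "Delhi"]
delhi_by_year = roll_up(delhi_slice, key=lambda r: r[0])

print(by_year)        # {2023: 33, 2024: 12}
print(delhi_by_year)  # {2023: 25, 2024: 12}
```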
Who uses OLAP and Why?