Dimensions DW

Conformed dimensions refer to dimension tables that are shared across multiple fact tables. Junk dimensions group various flags and indicators together. Role-playing dimensions refer to dimensions like date that can be used in multiple contexts within a database. Degenerate dimensions store attributes in the fact table rather than a separate dimension table. Mini-dimensions track attributes that frequently change in separate tables linked to fact tables.

Uploaded by

Dinesh Gora
Copyright
© Attribution Non-Commercial (BY-NC)

Conformed Dimension:

If a table is used as a dimension table by more than one fact table, it is called a conformed
dimension, e.g., a Date_Dim table.

Junk Dimension
A junk dimension is a convenient grouping of flags and indicators. It's helpful, but not absolutely
required, if there's a positive correlation among the values. The benefits of a junk dimension
include

Role-playing dimensions
Dimensions are often recycled for multiple applications within the same database. For instance, a "Date"
dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often
referred to as a "role-playing dimension".
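
The role-playing idea can be sketched with Python's built-in sqlite3 module. The table and column names here (date_dim, sales_fact, sale_date_key, delivery_date_key) are assumptions made for illustration. The single date dimension is joined twice under different aliases, once per role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
    -- Hypothetical fact table: two foreign keys point at the same dimension.
    CREATE TABLE sales_fact (
        sale_date_key     INTEGER REFERENCES date_dim(date_key),
        delivery_date_key INTEGER REFERENCES date_dim(date_key),
        amount            REAL
    );
    INSERT INTO date_dim VALUES (1, '2005-01-15'), (2, '2005-01-18');
    INSERT INTO sales_fact VALUES (1, 2, 120.0);
""")

# Each alias of date_dim plays one role: date of sale, date of delivery.
rows = conn.execute("""
    SELECT sale_d.calendar_date, delivery_d.calendar_date, f.amount
    FROM sales_fact AS f
    JOIN date_dim AS sale_d     ON f.sale_date_key = sale_d.date_key
    JOIN date_dim AS delivery_d ON f.delivery_date_key = delivery_d.date_key
""").fetchall()
print(rows)
```

Only one physical date table exists; the roles are created purely by the aliases in the query (in practice they are often exposed as views).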

Degenerate dimensions: (not a dimension table)

• A degenerate dimension is a dimension attribute that is stored in the fact table itself rather than in
a separate dimension table.
• Any values in the fact table that don't join to dimensions are considered either degenerate
dimensions or measures.
• A degenerate dimension acts as a dimension key in the fact table but does not join to a
corresponding dimension table, because all its interesting attributes have already been placed in
other analytic dimensions.

Mini dimensions:
The mini-dimension technique uses one or more separate dimensions for the attributes that change frequently.
We might build a mini-dimension for customer demographic attributes, such as own/rent home, presence
of children, and income level. This dimension would contain a row for every unique combination of these
attributes observed in the data. The static and less frequently changing attributes are kept in our large
base customer dimension. The fact table captures the relationship of the base customer dimension and
demographic mini-dimension as the fact rows are loaded.
It is not unusual for organizations dealing with consumer-level data to create a series of related mini-
dimensions. A financial services organization might have mini-dimensions for customer scores,
delinquency statuses, behavior segmentations, and credit bureau attributes. The appropriate mini-
dimensions along with the base customer dimension are tied together via their foreign key
relationship in the fact table rows. The mini-dimensions effectively track changes and also provide
smaller points of entry into the fact tables. They are particularly useful when analysis does not
require consumer-specific detail.
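
A minimal sketch of how a demographic mini-dimension might be populated while fact rows are loaded. The sample observations, the attribute names and the sequential surrogate keys are all assumptions made for illustration.

```python
# Hypothetical observations: (customer_id, home, has_children, income_band)
observations = [
    (101, "own", True, "high"),
    (102, "rent", False, "low"),
    (103, "own", True, "high"),   # same combination as customer 101
    (101, "own", False, "high"),  # customer 101's demographics changed
]

demographics_dim = {}  # attribute combination -> surrogate key
fact_rows = []         # (customer_id, demographics_key)

for customer_id, home, has_children, income_band in observations:
    combo = (home, has_children, income_band)
    # One mini-dimension row per unique combination observed in the data.
    key = demographics_dim.setdefault(combo, len(demographics_dim) + 1)
    # The fact row ties the base customer to the current demographic profile.
    fact_rows.append((customer_id, key))

print(demographics_dim)
print(fact_rows)
```

The change in customer 101's profile adds one small row to the mini-dimension and leaves the large base customer dimension untouched.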
Fact table
From Wikipedia, the free encyclopedia

In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It
is often located at the centre of a star schema or a snowflake schema, surrounded by dimension tables.

Fact tables provide the (usually) additive values that act as independent variables by which dimensional
attributes are analyzed. Fact tables are often defined by their grain. The grain of a fact table represents
the most atomic level by which the facts may be defined. The grain of a SALES fact table might be stated
as "Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined
by a day, product and store. Other dimensions might be members of this fact table (such as
location/region) but these add nothing to the uniqueness of the fact records. These "affiliate
dimensions" allow for additional slices of the independent facts but generally provide insights at a
higher level of aggregation (a region contains many stores).
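
The grain described above can be checked mechanically. The sketch below assumes a small in-memory list of rows with hypothetical column names; it reports any combination of day, product and store that appears more than once, which would violate the stated grain.

```python
from collections import Counter

GRAIN = ("day", "product", "store")  # assumed grain columns

# Hypothetical fact rows; "region" is an affiliate dimension, not part of the grain.
sales_fact = [
    {"day": "2005-01-15", "product": "P1", "store": "NY", "region": "East", "sales": 100.0},
    {"day": "2005-01-15", "product": "P2", "store": "NY", "region": "East", "sales": 40.0},
    {"day": "2005-01-15", "product": "P1", "store": "LA", "region": "West", "sales": 75.0},
]

def grain_violations(rows, grain):
    """Return the grain-key combinations that occur in more than one row."""
    counts = Counter(tuple(row[col] for col in grain) for row in rows)
    return [key for key, n in counts.items() if n > 1]

print(grain_violations(sales_fact, GRAIN))
```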

Example

If the business process is SALES, then the corresponding fact table will typically contain columns
representing both raw facts and aggregations in rows such as:

• $12,000, being "sales for New York store for 15-Jan-2005"
• $34,000, being "sales for Los Angeles store for 15-Jan-2005"
• $22,000, being "sales for New York store for 16-Jan-2005"
• $50,000, being "sales for Los Angeles store for 16-Jan-2005"
• $21,000, being "average daily sales for Los Angeles Store for Jan-2005"
• $65,000, being "average daily sales for Los Angeles Store for Feb-2005"
• $33,000, being "average daily sales for Los Angeles Store for year 2005"

"Average daily sales" is a measurement that is stored in the fact table. The fact table also contains
foreign keys from the dimension tables, where time series (e.g. dates) and other dimensions (e.g. store
location, salesperson, product) are stored.

All foreign keys between fact and dimension tables should be surrogate keys, not reused keys from
operational data.

The centralized table in a star schema is called a fact table. A fact table typically has two types of
columns: those that contain facts and those that are foreign keys to dimension tables. The primary key
of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the
content of the data warehouse and store different types of measures like additive, non additive, and
semi additive measures.
Measure types

• Additive - Measures that can be added across all dimensions.
• Non Additive - Measures that cannot be added across any dimension.
• Semi Additive - Measures that can be added across some dimensions but not others.
A fact table might contain either detail level facts or facts that have been aggregated (fact tables that
contain aggregated facts are often instead called summary tables).
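
A common illustration of a semi-additive measure is an account balance: it can be summed across accounts for a single day, but not across days. The sketch below uses made-up balance snapshots to show the difference.

```python
from collections import defaultdict

# Hypothetical daily balance snapshots: (day, account, balance)
balances = [
    ("2005-01-15", "A", 100.0),
    ("2005-01-15", "B", 50.0),
    ("2005-01-16", "A", 120.0),
    ("2005-01-16", "B", 40.0),
]

# Valid: add across the account dimension within each day.
total_by_day = defaultdict(float)
for day, account, balance in balances:
    total_by_day[day] += balance

# Not valid: adding across the time dimension counts the same money twice.
misleading_sum = sum(balance for _, _, balance in balances)

# Across time, use something like the closing value or an average instead.
days = sorted(total_by_day)
closing_total = total_by_day[days[-1]]
average_total = sum(total_by_day.values()) / len(days)

print(dict(total_by_day), misleading_sum, closing_total, average_total)
```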
Special care must be taken when handling ratios and percentage. One good design rule [1] is to never
store percentages or ratios in fact tables but only calculate these in the data access tool. Thus only store
the numerator and denominator in the fact table, which then can be aggregated and the aggregated
stored values can then be used for calculating the ratio or percentage in the data access tool.
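
A small sketch of why the rule above matters, using made-up profit and revenue figures. Aggregating the stored numerator and denominator and then dividing gives the true overall ratio; averaging per-row ratios does not.

```python
# Hypothetical fact rows: (store, profit, revenue)
rows = [
    ("New York", 25.0, 100.0),
    ("Los Angeles", 150.0, 200.0),
]

# Aggregate numerator and denominator first, then compute the ratio.
total_profit = sum(profit for _, profit, _ in rows)
total_revenue = sum(revenue for _, _, revenue in rows)
overall_margin = total_profit / total_revenue  # 175 / 300

# Averaging stored ratios ignores that the stores differ in size.
naive_margin = sum(profit / revenue for _, profit, revenue in rows) / len(rows)

print(round(overall_margin, 4), naive_margin)
```

The two results differ because the naive average weights both stores equally regardless of revenue.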

In the real world, it is possible to have a fact table that contains no measures or facts. These tables are
called "factless fact tables", or "junction tables".

Factless fact tables can, for example, be used for modeling many-to-many relationships or for capturing
events. [1]
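
As an illustration, a factless fact table that captures events could look like the sketch below. The attendance scenario and its names are assumptions, not part of the text above. Each row holds only dimension keys, and analysis is done by counting rows.

```python
from collections import Counter

# Hypothetical factless fact table: one row per attendance event,
# holding only dimension keys (date, student, class) and no measure.
attendance_fact = [
    ("2005-01-15", "alice", "math"),
    ("2005-01-15", "bob", "math"),
    ("2005-01-16", "alice", "physics"),
]

# The "measure" is simply the number of rows per class.
attendance_by_class = Counter(class_key for _, _, class_key in attendance_fact)
print(attendance_by_class)
```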

Types of fact tables

There are three fundamental types of measurement events, which characterize all fact tables. [2]

• Transactional

A transactional table is the most basic and fundamental. The grain associated with a transactional fact
table is usually specified as "one row per line in a transaction", e.g., every line on a receipt. Typically a
transactional fact table holds data of the most detailed level, causing it to have a great number of
dimensions associated with it.

• Periodic snapshots

The periodic snapshot, as the name implies, takes a "picture of the moment", where the moment could
be any defined period of time, e.g. a performance summary of a salesman over the previous month. A
periodic snapshot table is dependent on the transactional table, as it needs the detailed data held in the
transactional fact table in order to deliver the chosen performance output.

• Accumulating snapshots

This type of fact table is used to show the activity of a process that has a well defined beginning and
end, e.g., the processing of an order. An order moves through specific steps until it is fully processed. As
steps towards fulfilling the order are completed, the associated row in the fact table is updated. An
accumulating snapshot table often has multiple date columns, each representing a milestone in the
process. Therefore, it's important to have an entry in the associated date dimension that represents an
unknown date, as many of the milestone dates are unknown at the time of the creation of the row.
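
A minimal sketch of an accumulating snapshot row for order processing. The milestone names, the date-key format and the use of -1 as the key of the "unknown date" entry are all assumptions made for illustration.

```python
UNKNOWN_DATE_KEY = -1  # hypothetical key of the "unknown" row in the date dimension

date_dim = {
    UNKNOWN_DATE_KEY: "Unknown",
    20050115: "2005-01-15",
    20050118: "2005-01-18",
}

def new_order_row(order_id, ordered_date_key):
    """Create the row when the order is placed; later milestones are unknown."""
    return {
        "order_id": order_id,
        "ordered_date_key": ordered_date_key,
        "shipped_date_key": UNKNOWN_DATE_KEY,
        "delivered_date_key": UNKNOWN_DATE_KEY,
    }

def record_milestone(row, milestone, date_key):
    """Update the existing row in place when a milestone is reached."""
    column = f"{milestone}_date_key"
    if column not in row:
        raise KeyError(f"unknown milestone: {milestone}")
    row[column] = date_key

order = new_order_row(1, 20050115)
record_milestone(order, "shipped", 20050118)
print({name: date_dim[key] for name, key in order.items() if name.endswith("_date_key")})
```

Because every milestone column always holds a valid key, joins to the date dimension never drop rows for orders that are still in progress.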
Balanced and unbalanced hierarchies
When a dimension has a recursive hierarchy, you do not need to create any levels. Instead, you need to
specify any required member information.

Balanced hierarchies

In balanced hierarchies (balanced/standard), the branches of the hierarchy all descend to the same level, with
each member's parent being at the level immediately above the member. A common example of a balanced
hierarchy is one that represents time, where the depth of each level (year, quarter, and month) is consistent.
DB2 Alphablox Cube Server supports balanced hierarchies.

Unbalanced hierarchies

Unbalanced hierarchies have a consistent parent-child relationship, but their levels are logically
inconsistent. The hierarchy branches can also have inconsistent depths. An example of an unbalanced
hierarchy is an organization chart, which shows reporting relationships among employees in an organization.
The levels within the organizational structure are unbalanced, with some branches in the hierarchy having more
levels than others.
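
The difference between the two kinds of hierarchy can be sketched as a check on leaf depths. The parent-to-children dictionaries below are made-up examples; a hierarchy is treated as balanced when every branch ends at the same depth.

```python
def leaf_depths(children, root):
    """Return the depth of every leaf reachable from root."""
    depths = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        kids = children.get(node, [])
        if not kids:
            depths.append(depth)
        for kid in kids:
            stack.append((kid, depth + 1))
    return depths

def is_balanced(children, root):
    """True when all branches descend to the same level."""
    return len(set(leaf_depths(children, root))) == 1

# Hypothetical time hierarchy: year -> quarter -> month.
time_hierarchy = {"2005": ["Q1", "Q2"], "Q1": ["Jan", "Feb"], "Q2": ["Apr"]}
# Hypothetical organization chart with branches of different depth.
org_chart = {"CEO": ["CTO", "Assistant"], "CTO": ["Engineer"]}

print(is_balanced(time_hierarchy, "2005"), is_balanced(org_chart, "CEO"))
```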
First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

• Eliminate duplicative columns from the same table.
• Create separate tables for each group of related data and identify each row with a unique
column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

• Meet all the requirements of the first normal form.
• Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
• Create relationships between these new tables and their predecessors through the use of
foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

• Meet all the requirements of the second normal form.
• Remove columns that are not dependent upon the primary key.

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:

• Meet all the requirements of the third normal form.
• A relation is in 4NF if it has no multi-valued dependencies.

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first
fulfill all the criteria of a 1NF database.
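
As a small illustration of the 3NF rule, the sketch below starts from a made-up orders table in which customer_city depends on customer_id rather than on the primary key order_id. The column is moved into a separate customers table that is linked by a foreign key. All names and values are assumptions.

```python
# Hypothetical unnormalized rows: customer_city repeats for every order.
orders_unnormalized = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Pune", "amount": 250},
    {"order_id": 2, "customer_id": 10, "customer_city": "Pune", "amount": 100},
    {"order_id": 3, "customer_id": 11, "customer_city": "Delhi", "amount": 75},
]

customers = {}  # customer_id -> attributes that depend only on the customer
orders = []     # rows whose columns depend only on order_id

for row in orders_unnormalized:
    customers[row["customer_id"]] = {"customer_city": row["customer_city"]}
    orders.append({
        "order_id": row["order_id"],
        "customer_id": row["customer_id"],  # foreign key to customers
        "amount": row["amount"],
    })

print(customers)
print(orders)
```

After the split, each city is stored once per customer, so changing it requires updating a single row.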