2.1 Principles of Dimensional Modeling

Uploaded by

Ansh Singh 10 B

 Principles of Dimensional Modeling

Objectives of Dimensional Modeling:


 Simplicity: Dimensional modeling aims to simplify complex business processes into
understandable and efficient structures. By organizing data into dimensions and facts, it
becomes easier for users to analyze and make decisions.
 Performance: Dimensional models are optimized for query performance, enabling fast and
efficient retrieval of data. This is crucial for analytical tasks where users need to interactively
explore large datasets.
 Flexibility: Dimensional models are designed to adapt to changing business requirements.
They allow for easy modification and expansion without disrupting existing data structures.

From Requirements to Data Design:


a. Understanding Business Requirements: Start by gathering and analyzing business
requirements. This involves identifying the key metrics and dimensions that stakeholders need
to analyze.
b. Identify Dimensions and Facts: Dimensions represent the descriptive attributes by which users
want to analyze data, such as time, location, product, and customer. Facts are the numerical
measures or metrics that users want to analyze, such as sales revenue, quantity sold, or profit
margin.
c. Conceptual Design: Create a conceptual design by defining the primary dimensions and facts
based on the identified requirements. This involves sketching out the structure of the dimensional
model without getting into technical details.
d. Logical Design: Translate the conceptual design into a logical design by defining the specific
attributes and relationships between dimensions and facts. This includes specifying hierarchies,
relationships, and data types.
e. Physical Design: Implement the logical design into a physical data model tailored to the target
database platform. This involves optimizing storage, indexing, and partitioning strategies for
performance.
f. Testing and Refinement: Test the dimensional model with sample data to ensure that it meets
the business requirements and performs efficiently. Refine the design as needed based on
feedback and testing results.
g. Deployment and Maintenance: Deploy the dimensional model into the production environment
and provide the necessary training to users. Continuously monitor and maintain the model to
accommodate changes in business requirements or data sources.

Example:
Let's consider a retail business that wants to analyze its sales data. From requirements gathering,
they identify dimensions like time, product, store, and customer, along with facts such as sales
revenue and quantity sold. They then design a dimensional model where time, product, store, and
customer are dimensions, and sales revenue and quantity sold are facts. This model is implemented
logically and then physically, optimized for querying performance. Finally, it undergoes testing,
deployment, and ongoing maintenance to ensure its effectiveness.
In essence, dimensional modeling bridges the gap between business requirements and data design
by providing a structured approach to organizing and analyzing data for decision-making purposes.
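
The retail example above can be sketched as a small star schema: one fact table holding the measures, keyed by the four dimensions. This is only an illustrative sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made for the example and do not come from any particular system.

```python
import sqlite3

# Hypothetical star schema for the retail example: one fact table that
# references four dimension tables. All names here are illustrative.
SCHEMA = """
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT, month TEXT);
CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store    (store_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE fact_sales (
    time_id       INTEGER REFERENCES dim_time(time_id),
    product_id    INTEGER REFERENCES dim_product(product_id),
    store_id      INTEGER REFERENCES dim_store(store_id),
    customer_id   INTEGER REFERENCES dim_customer(customer_id),
    sales_revenue REAL,
    quantity_sold INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?, ?)",
                     [(1, 2024, "Q1", "Jan"), (2, 2024, "Q2", "Apr")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Laptop", "Electronics"), (2, "Desk", "Furniture")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                     [(1, "Delhi", "India"), (2, "Kolkata", "India")])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "Retail"), (2, "Corporate")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1000.0, 2),
                      (1, 2, 2, 2, 300.0, 1),
                      (2, 1, 2, 1, 500.0, 1)])
    return conn


def revenue_by_category(conn):
    """Join the fact table to the product dimension and total both measures."""
    return conn.execute("""
        SELECT p.category, SUM(f.sales_revenue), SUM(f.quantity_sold)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
```

Calling revenue_by_category(build_demo_db()) answers a typical analytical question (revenue and quantity per product category) with a single join from the fact table to one dimension, which is the query shape a dimensional model is meant to keep simple.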

 OLAP
OLAP stands for Online Analytical Processing. It is a technology used to organize large
business databases and support business intelligence. OLAP databases are divided into one or
more cubes, and each cube is organized and designed by a cube administrator to fit the way users
retrieve and analyze data, so that it is easier to create and use the PivotTable and PivotChart
reports they need.

Characteristics of OLAP

The FASMI Test: FASMI (Fast Analysis of Shared Multidimensional Information) describes the
characteristics of an OLAP application in a specific way, without dictating how it should be
implemented.

a. Fast − The system is targeted to deliver most responses to users within about five seconds,
with the simplest analyses taking no more than one second and very few taking more than
20 seconds.

Independent research in the Netherlands has shown that end-users assume that a process has
failed if results are not received within 30 seconds, and they are apt to hit
‘Alt+Ctrl+Delete’ unless the system warns them that the report will take longer.

b. Analysis − The system can cope with any business logic and statistical analysis that is
relevant for the application and the user, while keeping it easy enough for the target user.
Although some pre-programming may be needed, it is not acceptable if all application
definitions have to be written in a professional 4GL.

It is necessary to allow the user to define new ad hoc calculations as part of the analysis
and to report on the data in any desired way, without having to program. This excludes
products (like Oracle Discoverer) that do not allow adequate end-user oriented calculation
flexibility.

c. Shared − The system implements all the security requirements for confidentiality
(possibly down to cell level) and, if multiple write access is needed, concurrent update
locking at an appropriate level. Not all applications need users to write data back, but for
the growing number that do, the system must be able to handle multiple updates in a timely,
secure manner. This is a major area of weakness in some OLAP products, which tend to assume
that all OLAP applications will be read-only, with simple security controls.
d. Multidimensional − The system should support a multidimensional conceptual view of the
data, including full support for hierarchies and multiple hierarchies. The test does not set
a specific minimum number of dimensions that must be handled, as this is too application
dependent, and most products seem to have enough for their target markets.
e. Information − Information is all of the data and derived information needed, wherever it
is and however much is relevant for the application. Capacity is measured in terms of how
much input data a product can handle, not how many gigabytes it takes to store it.

Advantages of OLAP

 Fast query performance due to optimized storage, multidimensional indexing, and caching.
 Smaller on-disk size of data compared to data stored in a relational database, due to
compression techniques.
 Automated computation of higher-level aggregates of the data.
 It is very compact for low-dimension data sets. Array models provide natural indexing.
 Effective data extraction achieved through the pre-structuring of aggregated data.

Disadvantages of OLAP

 In some OLAP solutions the processing step (data load) can be quite lengthy, especially
on large data volumes. This is usually remedied by doing only incremental processing, i.e.,
processing only the data that has changed (usually new data) instead of reprocessing the
entire data set.
 Some OLAP methodologies introduce data redundancy.

OLAP Applications

a. Business Reporting for Sales: Business reporting gives an overview of the sales
activities within an organization. It shows sales trends over a given time period and
analyzes the individual steps of the sales process and the performance of sales executives.
These reports can be used to analyze the sales data, assess the situation, and decide on
the best course of action.
b. Marketing: Industries such as digital marketing, health care, eCommerce, and finance use
OLAP in their marketing. Example: Market Basket Analysis is a technique that studies the
purchases made by customers in a supermarket and identifies the items that are frequently
bought together. This analysis helps companies promote deals, offers, and sales, and data
mining techniques are used to carry it out.
c. Management Reporting: It aims to inform managers about different aspects of the
organization using data from the company's various departments, in order to help them make
better decisions. The data is collected and presented in an understandable way. It also
provides insight into how the company is doing and what steps should be taken to increase
efficiency and remain competitive in the market.
d. Business Process Management: Business process management refers to improving a
business process from end to end by analyzing it. It helps organizations identify the steps
required to carry out a business task.
e. Financial Reporting: Financial reporting refers to the financial reports of an
organization that are released to stakeholders and the public. It includes the financial
statements, such as the balance sheet, income statement, and statement of cash flows. It
shows the financial information that the company chooses to disclose.
Some other applications of OLAP are as follows:
 Marketing analysis
 Customer and product profitability
 Supply and Demand forecasting
 Human resources analysis
 Resource analysis and capacity planning
 Variance analysis
 Claims experience analysis

 OLAP Rules (OLAP Guidelines)


The term OLAP was introduced by Dr. E.F. Codd in 1993, together with 12 rules for OLAP systems:
a. Multidimensional Conceptual View:
A multidimensional data model is provided that is intuitively analytical and easy to use.
The multidimensional data model corresponds to how users perceive business problems.

b. Transparency:
It makes the technology, underlying data repository, computing architecture, and the
diverse nature of source data totally transparent to users.

c. Accessibility:
Access should be provided only to the data that is actually needed to perform the specific
analysis, presenting a single, coherent, and consistent view to the users.

d. Consistent Reporting Performance:


Users should not experience any significant degradation in reporting performance as the
number of dimensions or the size of the database increases. Users should also perceive
consistent run time, response time, and machine utilization every time a given query is
run.
e. Client/Server Architecture:
The system should conform to the principles of client/server architecture for optimum
performance, flexibility, adaptability, and interoperability.

f. Generic Dimensionality:
Every data dimension should be equivalent in both structure and operational capabilities,
with one logical structure for all dimensions.

g. Dynamic Sparse Matrix Handling:


The physical schema should adapt to the specific analytical model being created and loaded,
so that sparse matrix handling is optimized.

h. Multi-user Support:
Support should be provided for end users to work concurrently with either the same
analytical model or to create different models from the same data.

i. Unrestricted Cross-dimensional Operations:


The system should be able to recognize dimensional hierarchies and automatically perform
roll-up and drill-down operations within a dimension or across dimensions.

j. Intuitive Data Manipulation:


Consolidation path reorientation, drill-down, roll-up, and other manipulations should be
accomplished intuitively and directly via point-and-click actions.

k. Flexible Reporting:
The business user should be able to arrange columns, rows, and cells in a manner that
facilitates easy manipulation, analysis, and synthesis of information.

l. Unlimited Dimensions and Aggregation Levels:


The system should support at least fifteen, and preferably twenty, data dimensions within a
common analytical model.

 OLAP Functions

OLAP functions provide powerful capabilities for analyzing multidimensional data, enabling organizations
to derive insights, make informed decisions, and optimize business processes. These functions encompass a
wide range of analytical techniques, from basic aggregations to advanced forecasting and time series analysis,
tailored to meet diverse analytical needs across industries and domains.

1. Aggregation Functions:
Description: Aggregation functions calculate summary statistics or aggregate values across one or
more dimensions in the data cube.
Examples:
 Sum: Calculates the total sum of a measure across selected dimensions. For instance, summing
up sales revenue across product categories.
 Average: Computes the average value of a measure across selected dimensions. For example,
calculating the average monthly sales quantity.
 Count: Counts the number of data points or records within a dimension. For instance, counting
the number of customers in each region.
 Minimum/Maximum: Determines the minimum or maximum value of a measure within selected
dimensions. For example, finding the highest and lowest temperatures recorded per month.
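
As a rough illustration of the aggregation functions listed above, the sketch below groups a handful of fact rows by one dimension and computes sum, average, count, minimum, and maximum for each group. The row layout, the sample figures, and the function name aggregate_by are assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fact rows: (region, product_category, sales_revenue)
SALES = [
    ("North", "Electronics", 1200.0),
    ("North", "Furniture", 300.0),
    ("South", "Electronics", 800.0),
    ("South", "Electronics", 400.0),
]


def aggregate_by(rows, dim_index, measure_index=2):
    """Group rows by the dimension at dim_index and summarize the measure."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dim_index]].append(row[measure_index])
    return {
        key: {
            "sum": sum(values),
            "avg": mean(values),
            "count": len(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }
```

For instance, aggregate_by(SALES, 0) summarizes revenue per region, while aggregate_by(SALES, 1) summarizes the same measure per product category.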

2. Ranking Functions:
Description: Ranking functions assign a rank or position to data based on specified criteria, enabling
the identification of top or bottom performers.
Examples:

 Rank: Assigns a rank to each data point based on a measure, such as ranking products by
sales revenue.
 Top N/Bottom N: Selects the top or bottom N data points based on a measure, such as
identifying the top 5 best-selling products.
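
The ranking functions above can be sketched in a few lines. This is a minimal example, assuming that ties share a rank and the next rank is skipped (1, 2, 2, 4); other tie-handling rules are equally valid. The product names and revenue figures are made up for illustration.

```python
def rank_by_measure(totals, descending=True):
    """Return (key, value, rank) tuples; tied values share the same rank."""
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=descending)
    ranked = []
    for position, (key, value) in enumerate(ordered, start=1):
        if ranked and ranked[-1][1] == value:
            rank = ranked[-1][2]  # same value as the previous entry: reuse its rank
        else:
            rank = position
        ranked.append((key, value, rank))
    return ranked


def top_n(totals, n):
    """Keys of the n largest values."""
    return [key for key, _, _ in rank_by_measure(totals)[:n]]


def bottom_n(totals, n):
    """Keys of the n smallest values."""
    return [key for key, _, _ in rank_by_measure(totals, descending=False)[:n]]


# Hypothetical revenue per product
PRODUCT_REVENUE = {"Laptop": 5000, "Phone": 7000, "Tablet": 5000, "Monitor": 1500}
```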

3. Forecasting Functions:
Description: Forecasting functions use historical data to predict future trends, allowing organizations
to anticipate demand or plan resources.
Examples:

 Moving Average: Calculates the average of a measure over a specified period, smoothing out
fluctuations to identify trends.
 Exponential Smoothing: Applies weighted averages to historical data, giving more
importance to recent observations in forecasting future values.
 Time Series Analysis: Utilizes statistical models to analyze patterns and seasonality in
sequential data points, projecting future values based on historical trends.
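
Two of the forecasting techniques above are simple enough to sketch directly. The code below assumes a trailing (not centered) moving average and simple exponential smoothing seeded with the first observation; the function names and parameters are illustrative choices rather than a standard API.

```python
def moving_average(series, window):
    """Trailing moving average: one value per full window of observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; a higher alpha weights recent values more."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed
```

In both cases the last value of the returned list is commonly used as a naive forecast for the next period.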

4. Time Series Analysis Functions:


Description: Time series analysis functions analyze data over time to identify trends, seasonality,
and patterns.
Examples:

 Trend Analysis: Identifies long-term trends in data, such as increasing sales over several
years.
 Seasonality Detection: Detects recurring patterns or cycles in data, such as increased sales
during holiday seasons.
 Smoothing Techniques: Removes noise or irregular fluctuations in data to reveal underlying
trends more clearly.
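
A minimal sketch of trend analysis and seasonality detection follows. It assumes evenly spaced observations, estimates the trend as the slope of an ordinary least-squares line, and describes seasonality by averaging the values that fall at the same position within each cycle. Both function names are hypothetical, and real tools typically use more robust statistical models.

```python
def linear_trend_slope(series):
    """Least-squares slope of the series against its index (0, 1, 2, ...)."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def seasonal_averages(series, period):
    """Average of the observations at each position within the cycle."""
    if period <= 0 or period > len(series):
        raise ValueError("period must be between 1 and len(series)")
    buckets = [[] for _ in range(period)]
    for i, value in enumerate(series):
        buckets[i % period].append(value)
    return [sum(bucket) / len(bucket) for bucket in buckets]
```

A positive slope suggests an upward trend. With quarterly data and period=4, a consistently higher average in the last position would point to a year-end seasonal peak.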
5. Calculation Functions:
Description: Calculation functions create custom calculations or derive new measures from existing
data to meet specific analytical requirements.
Examples:

 Profit Margin Calculation: Calculates the profit margin by subtracting the cost from the
revenue and dividing by revenue.
 Year-over-Year Growth: Computes the percentage change in a measure from one year to the
next, indicating growth or decline.
 Market Basket Analysis: Identifies associations or relationships between items purchased
together to inform cross-selling strategies.
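
The first two calculations above reduce to one-line formulas. The sketch below follows the definitions given in the text: profit margin as (revenue minus cost) divided by revenue, and year-over-year growth as the percentage change from the previous year. The function names are illustrative.

```python
def profit_margin(revenue, cost):
    """Profit margin as a fraction of revenue: (revenue - cost) / revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return (revenue - cost) / revenue


def year_over_year_growth(previous, current):
    """Percentage change from the previous year's value to the current one."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) * 100 / previous
```

For example, revenue of 200 with a cost of 150 gives a margin of 0.25 (25%), and a rise from 100 to 120 gives 20% growth.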

 OLAP Hypercubes and OLAP Operations


Data operations and analysis are usually performed in a simple spreadsheet, where data
values are arranged in rows and columns. This is ideal for two-dimensional data. However,
OLAP works with multidimensional data, usually obtained from different and unrelated
sources, so a spreadsheet is not an optimal choice. A cube can store and analyze
multidimensional data in a logical and orderly manner.

 OLAP operations:

There are five basic analytical operations that can be performed on an OLAP cube:
1. Drill down: In the drill-down operation, less detailed data is converted into more
detailed data. It can be done by:
 Moving down in the concept hierarchy
 Adding a new dimension
In the cube given in the overview section, the drill-down operation is performed by moving
down in the concept hierarchy of the Time dimension (Quarter -> Month).

2. Roll up: It is the opposite of the drill-down operation. It performs aggregation on the
OLAP cube. It can be done by:
 Climbing up in the concept hierarchy
 Reducing the dimensions
In the cube given in the overview section, the roll-up operation is performed by climbing up
in the concept hierarchy of the Location dimension (City -> Country).

3. Dice: It selects a sub-cube from the OLAP cube by selecting two or more dimensions. In the
cube given in the overview section, a sub-cube is selected by selecting following dimensions
with criteria:
 Location = “Delhi” or “Kolkata”
 Time = “Q1” or “Q2”
 Item = “Car” or “Bus”
4. Slice: It selects a single value on one dimension of the OLAP cube, which results in a
new sub-cube with one fewer dimension. In the cube given in the overview section, slice is
performed on the dimension Time = “Q1”.

5. Pivot: It is also known as rotation operation as it rotates the current view to get a new view
of the representation. In the sub-cube obtained after the slice operation, performing pivot
operation gives a new view of it.
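
The operations above can be illustrated on a toy cube stored as a Python dictionary keyed by (city, quarter, item). This is a sketch under stated assumptions: the cell values, the city-to-country mapping, and all function names are invented for the example, and the cube referred to in the overview section is not reproduced here. Drill-down is not shown as a separate function because it simply means returning to the finer-grained cells that roll-up aggregated.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter, item) -> units sold
CUBE = {
    ("Delhi", "Q1", "Car"): 10,
    ("Delhi", "Q2", "Car"): 12,
    ("Delhi", "Q1", "Bus"): 4,
    ("Kolkata", "Q1", "Car"): 7,
    ("Kolkata", "Q2", "Bus"): 3,
    ("Toronto", "Q1", "Car"): 9,
    ("Toronto", "Q3", "Bus"): 5,
}
CITY_TO_COUNTRY = {"Delhi": "India", "Kolkata": "India", "Toronto": "Canada"}


def roll_up_location(cube):
    """Roll up: climb the Location hierarchy from City to Country."""
    rolled = defaultdict(int)
    for (city, quarter, item), value in cube.items():
        rolled[(CITY_TO_COUNTRY[city], quarter, item)] += value
    return dict(rolled)


def slice_cube(cube, axis, member):
    """Slice: fix one dimension to a single member and drop it from the key."""
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }


def dice_cube(cube, allowed):
    """Dice: keep cells whose member on every dimension is in the allowed set."""
    return {
        key: value
        for key, value in cube.items()
        if all(member in members for member, members in zip(key, allowed))
    }


def pivot(table):
    """Pivot: swap the two axes of a two-dimensional view."""
    return {(b, a): value for (a, b), value in table.items()}
```

For example, slice_cube(CUBE, 1, "Q1") fixes Time = “Q1” and leaves a two-dimensional (city, item) view, which pivot then turns into an (item, city) view.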

 OLAP Implementation Considerations:


1. Data Model Design:
 Designing an appropriate multidimensional data model is crucial for OLAP
implementation. This involves identifying relevant dimensions, hierarchies, and
measures that align with business requirements.
 Consider using star schema or snowflake schema for structuring the data model,
optimizing for query performance and ease of analysis.
2. Data Integration:
 Ensure seamless integration of data from various sources into the OLAP system. This
may involve data extraction, transformation, and loading (ETL) processes to ensure
data consistency and accuracy.
 Evaluate the frequency and latency requirements for data updates to ensure that the
OLAP system provides timely and relevant information to users.
3. Scalability and Performance:
 Consider the scalability requirements of the OLAP system to accommodate growing
data volumes and increasing user concurrency.
 Optimize query performance by designing efficient data structures, indexing
strategies, and partitioning schemes.
 Implement caching mechanisms and pre-aggregation techniques to improve query
response times for frequently accessed data.
4. Security and Access Control:
 Implement robust security measures to protect sensitive data and ensure compliance
with regulatory requirements.
 Define role-based access control policies to restrict access to data based on user roles
and privileges.
 Consider encryption, authentication, and auditing mechanisms to enhance data
security and integrity.
5. User Interface and Visualization:
 Design user-friendly interfaces that provide intuitive navigation and interactive
visualization capabilities.
 Incorporate features such as dashboards, charts, and drill-down functionalities to
facilitate data exploration and analysis.
 Ensure compatibility with various devices and platforms to support diverse user
requirements.
6. Metadata Management:
 Establish a comprehensive metadata management framework to document and
organize metadata related to dimensions, measures, hierarchies, and data lineage.
 Maintain metadata consistency and accuracy to facilitate data governance, data
lineage analysis, and impact analysis.
7. Performance Monitoring and Optimization:
 Implement monitoring tools and performance dashboards to track system usage,
query performance, and resource utilization.
 Continuously monitor system performance and identify bottlenecks or inefficiencies
for optimization.
 Regularly review and fine-tune OLAP configurations, indexing strategies, and query
optimization techniques to improve overall system performance.
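
The pre-aggregation technique mentioned under Scalability and Performance can be sketched as follows: totals are computed once for every combination of dimensions, so that later queries become dictionary lookups instead of scans over the fact rows. The row format and the names precompute_aggregates and lookup are assumptions for this example. The number of combinations doubles with each added dimension, which is why real systems usually materialize only a chosen subset.

```python
from collections import defaultdict
from itertools import combinations


def precompute_aggregates(rows, dimensions, measure):
    """Total the measure for every subset of the given dimensions."""
    aggregates = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            aggregates[dims] = dict(totals)
    return aggregates


def lookup(aggregates, dims, members):
    """Answer a query from the precomputed totals without touching the rows."""
    return aggregates[dims].get(members, 0.0)


# Hypothetical fact rows
ROWS = [
    {"region": "North", "quarter": "Q1", "revenue": 100.0},
    {"region": "North", "quarter": "Q2", "revenue": 150.0},
    {"region": "South", "quarter": "Q1", "revenue": 80.0},
]
AGGREGATES = precompute_aggregates(ROWS, ("region", "quarter"), "revenue")
```

Here lookup(AGGREGATES, ("region",), ("North",)) returns the revenue for the North region, and lookup(AGGREGATES, (), ()) returns the grand total.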

 Query and Reporting:


1. Ad-Hoc Querying:
 OLAP systems should support ad-hoc querying capabilities, allowing users to explore
and analyze data dynamically based on their specific requirements.
 Provide users with tools for building custom queries, applying filters, and defining
calculations to derive insights from multidimensional data.
2. Standardized Reporting:
 Implement standardized reporting templates and formats for generating routine
reports and dashboards.
 Enable scheduling and distribution of reports to stakeholders through email, file
sharing, or web portals.
3. Drill-Down and Drill-Through:
 Support drill-down and drill-through functionalities to enable users to navigate
hierarchies and access detailed data underlying summary information.
 Allow users to drill down from aggregated data to lower levels of granularity or drill
through to transactional-level data for deeper analysis.
4. Interactive Visualization:
 Incorporate interactive visualization tools such as charts, graphs, and heatmaps to
enhance data exploration and interpretation.
 Enable users to interactively manipulate visualizations, apply filters, and drill down
into specific data points for deeper insights.
5. Performance Optimization:
 Optimize query performance by leveraging caching mechanisms, pre-aggregation,
and indexing strategies.
 Implement query optimization techniques such as query rewriting, query caching, and
materialized views to improve response times for complex queries.
6. Mobile and Self-Service Reporting:
 Provide support for mobile reporting, allowing users to access and analyze data on-
the-go using smartphones or tablets.
 Enable self-service reporting capabilities, empowering users to create, customize, and
share reports without extensive IT intervention.
7. Integration with BI Tools:
 Integrate OLAP systems with popular business intelligence (BI) tools such as
Tableau, Power BI, or QlikView for advanced reporting, visualization, and analytics
capabilities.
 Ensure compatibility and seamless data exchange between OLAP systems and BI
tools to enable a unified analytical environment for users.
 Executive Information Systems (EIS):
Definition:
 Executive Information Systems (EIS) are specialized information systems designed
to support the strategic information needs of top-level executives in organizations.
Purpose:
 EIS provides top executives with quick and easy access to summarized, high-level
information relevant to their decision-making processes.
 It helps executives monitor the performance of the organization, assess strategic
initiatives, and identify trends and opportunities.
Features:
 User-Friendly Interface: EIS typically have intuitive interfaces that allow executives
to access information easily through dashboards, reports, and graphical
representations.
 Customization: EIS can be customized to display information based on the specific
needs and preferences of individual executives.
 Drill-Down Capabilities: Executives can drill down from summary information to
detailed data to investigate underlying factors influencing organizational
performance.
 Alerting Mechanisms: EIS often include alerting mechanisms to notify executives of
critical events or deviations from predefined thresholds.
 Integration: EIS integrate data from multiple sources within and outside the
organization, providing a comprehensive view of organizational performance.
Example:
 Suppose the CEO of a multinational company wants to monitor the sales performance
across different regions. The EIS dashboard provides a visual representation of sales
revenue by region, allowing the CEO to quickly identify regions with high or low
performance. The CEO can then drill down into detailed reports to analyze factors
contributing to variations in sales performance, such as market trends, competition,
or customer preferences.
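
The alerting mechanism described under Features can be sketched as a comparison of KPI values against predefined thresholds. The KPI names, threshold values, and the find_alerts function are all hypothetical; an actual EIS would read both from its own configuration and data sources.

```python
def find_alerts(kpis, thresholds):
    """Return (kpi, value, reason) for every KPI outside its allowed range.

    thresholds maps a KPI name to (minimum, maximum); either bound may be None.
    KPIs without an entry in thresholds are never flagged.
    """
    alerts = []
    for name, value in kpis.items():
        low, high = thresholds.get(name, (None, None))
        if low is not None and value < low:
            alerts.append((name, value, "below minimum"))
        elif high is not None and value > high:
            alerts.append((name, value, "above maximum"))
    return alerts


# Hypothetical regional figures and limits
KPIS = {"sales_north": 90, "sales_south": 130, "returns_rate": 0.08}
THRESHOLDS = {"sales_north": (100, None), "returns_rate": (None, 0.05)}
```

With these sample values, the North region's sales and the returns rate would be flagged, giving the executive a starting point for drill-down.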

 Data Warehouse and Business Strategy:

Definition:
 A data warehouse is a centralized repository that stores integrated, historical data
from various sources within an organization. It is optimized for analytical processing
and supports decision-making processes.

Role in Business Strategy:


 Strategic Decision Support: Data warehouses provide the foundation for strategic
decision-making by enabling analysis of historical data and identification of trends,
patterns, and opportunities.
 Performance Measurement: Data warehouses facilitate performance measurement by
providing key performance indicators (KPIs) and metrics for assessing organizational
performance against strategic objectives.
 Competitive Advantage: Data-driven insights derived from data warehouses can
provide organizations with a competitive advantage by informing strategic initiatives,
market positioning, and resource allocation.
 Alignment with Business Goals: Data warehouses help align operational activities
with strategic goals by providing a unified view of organizational data and ensuring
consistency and accuracy in reporting.

Benefits:
 Improved Decision-Making: Data warehouses enable informed decision-making by
providing timely access to reliable, integrated data for analysis.
 Enhanced Agility: Data warehouses support agile decision-making by enabling quick
access to relevant data and insights, allowing organizations to respond rapidly to
changing market conditions.
 Cost Savings: Data warehouses help reduce costs associated with data duplication,
inconsistency, and inefficiency by providing a single source of truth for
organizational data.
 Innovation: Data warehouses support innovation by providing a platform for
advanced analytics, data mining, and predictive modeling to uncover new business
opportunities and optimize processes.

Example:

 A retail company uses a data warehouse to analyze sales data from various sources,
including point-of-sale systems, online transactions, and customer feedback. By
analyzing historical sales data, the company identifies trends in customer purchasing
behavior, seasonality effects, and product performance. Based on these insights, the
company develops targeted marketing campaigns, optimizes inventory management,
and introduces new product offerings to align with its strategic objectives of
increasing market share and customer satisfaction.
