Lecture 8 p2
Refs
• https://www.guru99.com/online-analytical-processing.html
• https://www.guru99.com/multidimensional-online-analytical-processing.html
• https://www.javatpoint.com/sparse-matrix
• https://www.mssqltips.com/sqlservertutorial/4206/sql-server-analysis-services-multidimensional-data-model/
What is OLAP?
OLAP = On-line analytical processing.
• OLAP is a characterization of applications, not a database design
technique.
• It is a technology that enables analysts to extract and view business
data from different points of view.
• Analysts frequently need to group, aggregate and join data. With
OLAP, data can be pre-calculated and pre-aggregated, making
analysis faster.
• OLAP databases are divided into one or more cubes. The cubes are
designed so that creating and viewing reports becomes easy.
Cont.
• People often confuse OLAP with specific physical design techniques.
• This is a mistake: OLAP is a characterization of the application domain
centered around slice-and-dice analytics.
• As we will see, many possible implementations can deliver OLAP
characteristics. The specific implementation technique varies with
data size, performance requirements, cost constraints, etc.
Supporting the human thought process
[Figure: THOUGHT PROCESS mapped to QUERY SEQUENCE]
• Analysis is iterative
• Answer to one question leads to a dozen more
• Analysis is directional
• Drill Down
• Roll Up
• Pivot
(More in subsequent slides)
Challenges …
• Not feasible to write predefined queries.
• Fails to remain user-driven (becomes programmer-driven).
Challenges (Cont.)
• Contradiction
• Want to compute answers in advance, but don't know the
questions
• Solution
• Compute answers to “all” possible “queries”. But how?
“All” possible queries (level aggregates)
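One way to read "compute answers to all possible queries" is to pre-aggregate the measure for every subset of dimensions, writing ALL for each dimension that has been aggregated away. The sketch below does this by brute force. The fact rows and the dimension names (product, region, quarter) are made up for illustration, and a real OLAP engine would not materialize a cube this naively.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (product, region, quarter, amount).
SALES = [
    ("p1", "America", "Q1", 10),
    ("p2", "America", "Q1", 20),
    ("p1", "Europe", "Q1", 5),
    ("p1", "America", "Q2", 7),
]
DIMS = ("product", "region", "quarter")


def full_cube(rows, n_dims):
    """Sum the measure for every subset of dimensions (2**n_dims group-bys)."""
    cube = defaultdict(int)
    for k in range(n_dims + 1):
        for kept in combinations(range(n_dims), k):
            for row in rows:
                # Dimensions that are not kept are replaced by the marker "ALL".
                key = tuple(row[i] if i in kept else "ALL" for i in range(n_dims))
                cube[key] += row[-1]
    return dict(cube)


cube = full_cube(SALES, len(DIMS))
print(cube[("ALL", "ALL", "ALL")])     # grand total over every dimension
print(cube[("ALL", "America", "Q1")])  # all products in Q1 in America
```

With n dimensions this produces 2^n group-bys, which is why the choice of what to precompute matters as the number of dimensions grows.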
OLAP Cube
• Spreadsheets are ideal for two-dimensional data. However,
OLAP data is multidimensional and is usually obtained from
different and unrelated sources.
• The cube can store and analyze multidimensional data in a
logical and orderly manner.
How does it work?
• A data warehouse extracts information from multiple data
sources and formats such as text files, Excel sheets, multimedia files, etc.
[Figure: a 2-D table (dimensions = 2) and a 3-D cube (dimensions = 3); example: all products in Qtr 1 in America]
How is aggregation usually carried out?
Aggregates
• Add up amounts for day 1
• In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
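The query above can be tried against an in-memory SQLite database. This is a minimal sketch: the slide only names the table SALE and the columns amt and date, so the other columns (prodId, storeId) and all row values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only SALE, date and amt appear on the slide.
conn.execute(
    "CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12),
        ("p2", "s1", 1, 11),
        ("p1", "s3", 1, 50),
        ("p1", "s1", 2, 44),
    ],
)

# Add up amounts for day 1 (the query from the slide).
total_day1 = conn.execute(
    "SELECT sum(amt) FROM SALE WHERE date = 1"
).fetchone()[0]

# The same aggregate for every day at once, using GROUP BY.
per_day = conn.execute(
    "SELECT date, sum(amt) FROM SALE GROUP BY date ORDER BY date"
).fetchall()

print(total_day1)
print(per_day)
```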
Where does OLAP fit in?
Basic analytical operations of OLAP
• The four types of analytical OLAP operations are:
1. Roll-up
2. Drill-down
3. Slice and dice
4. Pivot (rotate)
1) Roll-up:
• Roll-up is also known as
“consolidation” or “aggregation.” The
roll-up operation can be performed
in two ways:
1. Reducing dimensions
2. Climbing up the concept hierarchy.
A concept hierarchy is a system of
grouping things based on their order
or level.
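Both ways of rolling up can be sketched on a small cube held as a dictionary. The cell values and the city-to-country hierarchy below are hypothetical, and the two helper functions are illustrative names rather than part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical cells: (city, quarter) -> sales amount.
CELLS = {
    ("Chicago", "Q1"): 10,
    ("New York", "Q1"): 20,
    ("Toronto", "Q1"): 5,
    ("Chicago", "Q2"): 7,
}
# Hypothetical concept hierarchy: city -> country.
CITY_TO_COUNTRY = {"Chicago": "USA", "New York": "USA", "Toronto": "Canada"}


def roll_up_hierarchy(cells, parent_of):
    """Climb the concept hierarchy: replace each city by its country and re-sum."""
    out = defaultdict(int)
    for (city, quarter), amt in cells.items():
        out[(parent_of[city], quarter)] += amt
    return dict(out)


def roll_up_drop_dimension(cells, axis):
    """Reduce dimensions: remove one axis from every key and re-sum."""
    out = defaultdict(int)
    for key, amt in cells.items():
        out[key[:axis] + key[axis + 1:]] += amt
    return dict(out)


by_country = roll_up_hierarchy(CELLS, CITY_TO_COUNTRY)
by_quarter = roll_up_drop_dimension(CELLS, axis=0)
print(by_country)
print(by_quarter)
```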
2) Drill-down
• In drill-down, data is broken into
smaller, more detailed parts. It is the opposite of the
roll-up process. It can be done by
• Moving down the concept hierarchy
• Adding a dimension
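Drill-down needs the finer-grained data to be available, since detail cannot be recovered from a total alone. The sketch below assumes a hypothetical quarter-to-month hierarchy and made-up amounts. It aggregates the same detail rows at two levels, then drills from the Q1 total down to its months.

```python
from collections import defaultdict

# Hypothetical detail rows: (quarter, month, amount).
DETAIL = [
    ("Q1", "Jan", 4),
    ("Q1", "Feb", 6),
    ("Q1", "Mar", 10),
    ("Q2", "Apr", 7),
]


def aggregate(rows, level):
    """Sum amounts at a hierarchy level: 'quarter' (coarse) or 'month' (fine)."""
    out = defaultdict(int)
    for quarter, month, amt in rows:
        key = (quarter,) if level == "quarter" else (quarter, month)
        out[key] += amt
    return dict(out)


coarse = aggregate(DETAIL, "quarter")  # the rolled-up view
fine = aggregate(DETAIL, "month")      # one level down the hierarchy

# Drill down into Q1: show the monthly cells behind the quarterly total.
q1_months = {key: amt for key, amt in fine.items() if key[0] == "Q1"}
print(coarse[("Q1",)])
print(q1_months)
```

The monthly cells for Q1 add back up to the quarterly total, which is the sense in which drill-down is the inverse of roll-up.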
3) Slice: filtering
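As a rough illustration of slice as filtering, the sketch below fixes one dimension to a single value and keeps the remaining dimensions. A dice variant, which keeps a chosen subset of values on several dimensions, is shown alongside it for comparison. The cube contents and function names are hypothetical.

```python
# Hypothetical cells: (product, region, quarter) -> sales amount.
CELLS = {
    ("p1", "America", "Q1"): 10,
    ("p2", "America", "Q1"): 20,
    ("p1", "Europe", "Q1"): 5,
    ("p1", "America", "Q2"): 7,
}


def slice_cube(cells, axis, value):
    """Fix one dimension to a single value and drop it from the key."""
    return {
        key[:axis] + key[axis + 1:]: amt
        for key, amt in cells.items()
        if key[axis] == value
    }


def dice_cube(cells, allowed):
    """Keep cells whose value on each listed axis is in the allowed set."""
    return {
        key: amt
        for key, amt in cells.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


q1_slice = slice_cube(CELLS, axis=2, value="Q1")
p1_dice = dice_cube(CELLS, {0: {"p1"}, 2: {"Q1", "Q2"}})
print(q1_slice)
print(p1_dice)
```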
MOLAP stands for Multidimensional online analytical processing.
• It is a type of OLAP that uses a multidimensional
data model.
• Data in MOLAP is pre-computed, pre-summarized, and
stored in a multidimensional (MOLAP) cube.
• MOLAP can store different permutations
and combinations of the data in a
multidimensional array.
• All cells of data can be accessed directly from the
array.
• As a result, MOLAP answers analytical queries
quickly.
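Direct access to cells can be pictured with a dense array in which each cell's position is computed from its dimension members. This is only a sketch under assumed dimensions (two products, two regions, two quarters) and made-up values. Real MOLAP servers use their own storage formats, which usually also compress the empty cells.

```python
# Hypothetical dimension members, each in a fixed order.
PRODUCTS = ["p1", "p2"]
REGIONS = ["America", "Europe"]
QUARTERS = ["Q1", "Q2"]
DIMS = [PRODUCTS, REGIONS, QUARTERS]


def offset(coords):
    """Row-major position of a cell in the flat array."""
    pos = 0
    for members, member in zip(DIMS, coords):
        pos = pos * len(members) + members.index(member)
    return pos


# One slot per possible cell, whether or not any data exists for it.
cube = [0] * (len(PRODUCTS) * len(REGIONS) * len(QUARTERS))
FACTS = [
    (("p1", "America", "Q1"), 10),
    (("p2", "America", "Q1"), 20),
    (("p1", "Europe", "Q1"), 5),
    (("p1", "America", "Q2"), 7),
]
for coords, amt in FACTS:
    cube[offset(coords)] += amt


def cell(product, region, quarter):
    """Read a cell by computing its position; no scan over the facts is needed."""
    return cube[offset((product, region, quarter))]


print(cell("p2", "America", "Q1"))
print(cube.count(0))  # empty slots: the array is sparse
```

The lookup cost does not depend on how many facts were loaded, but half of the slots here hold no data, which hints at why sparse cubes can waste space.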
Implementation of MOLAP
❖ Once cubes are created, it is difficult to scale their number
and size, even though they need to scale as the
dimensions change or increase.
❖ Specific languages are used to query MOLAP. However, querying
involves extensive click-and-drag support.
❖ The data is by default stored in a multidimensional array. This
gives the user different perspectives on the data, for example
aggregating sales by time, geography, or product.
❖ Data cubes cannot be created on the go with ad hoc queries.
Hence it is said that they work best with predefined
queries. Cube design is thus critical and requires
detailed front-end and design work.
Advantages of MOLAP
• MOLAP allows the fastest indexing of pre-computed, summarized
data.
ROLAP components
• Database server
• ROLAP server (saved on the main data repository
(DWH))
• Front-end tool
Advantages of ROLAP model:
• High data efficiency. Query performance and the access language
are optimized specifically for multidimensional data analysis.
• Scalability. This type of OLAP system scales to manage
large volumes of data, even when the data is steadily
increasing.