ON-LINE ANALYTICAL PROCESSING - Analyzing Data Resources
ADITI PAUL
MCS/08/20
REGISTRATION NO – 003834 OF 2008
POST GRADUATE DEPARTMENT OF COMPUTER SCIENCE
ST. XAVIER'S COLLEGE (AUTONOMOUS)
WHAT IS OLAP?
Basic idea: quickly answer multi-dimensional analytical queries.
Convert data into the information that decision makers need.
It is a continuous, iterative, and preferably interactive process.

WHO USES OLAP?
It is used in an organization to carry out different ORGANIZATIONAL FUNCTIONS in:
Finance departments
Sales analysis and forecasting
Marketing departments
Cardinal goal: “Provide managers with the information they need to make effective decisions.”
Understanding Online Analytical Processing - OLAP
A three-part description:
Part 1 – Online
Part 2 – Analytical
Part 3 – Processing
PART 1 – ONLINE
FLASH BACK
Data Stored in a Database
TYPE 1: Operational Data
Data that “works”.
Frequent updates and queries.
Normalized for efficient search and updates.
Fragmented, with local relevance.
Point queries.

Examples of Operational Data
Account details of a customer in a bank
Student details in a college/school database
Employee records, etc.

Example Queries on Operational Data
What is the salary of Mr. Chatterjee? (point query)
What is the address and phone number of the person in charge of the hardware department?
How many students received a “distinction” in the latest exam?

Operational data pertains to what we call “ONLINE TRANSACTION PROCESSING” (OLTP).
As the name suggests, this kind of data is used for day-to-day operations such as data entry and retrieval.
For example, an ATM is a commercial online transaction system.
Types of Data in a Database
TYPE 2: Historical Data
Data that “tells”.
Very infrequent updates.
Integrated data set with global relevance.
Analytical queries that require huge amounts of aggregation.
Performance issues mainly in query response time.

Examples of Historical Data
The last 10 transactions on a customer's bank account
Sales records of a product over the last 15 years in a company's database
A company's profits stored month-wise for a whole fiscal year
Example Queries on Historical Data
How has the percentage of marks scored by students changed over the years in the college?
Is there a correlation between the geographical location of a company unit and excellent employee appraisals?
How is employee attrition changing over the years across the company?

Historical data pertains to “Online Analytical Processing”, where a query does not depend on looking at just one part of a tuple. For example, to find out how employee attrition is changing, we have to compute aggregate attrition and then map it against time. Such queries therefore require “analyzing” certain facts before producing an answer.

The requirement that these queries be ONLINE means that they must be answered within an “ONLINE INTERACTIVE RESPONSE TIME”, since users are willing to wait only a few seconds.
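The differences between OLTP and OLAP, following the two types of data described above, are thus:
OLTP: operational data; frequent updates; normalized; fragmented, with local relevance; point queries.
OLAP: historical data; very infrequent updates; integrated, with global relevance; analytical queries with heavy aggregation.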
PART 2 - ANALYTICAL
Analysis of the Data
In order to “analyze” this historical data, it needs to be stored in a formatted and organized manner.
This is accomplished by a Data Warehouse.
A data warehouse is an infrastructure to manage historical data from various sources.
It is designed to support OLAP queries that make heavy use of aggregation.
It is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
WAREHOUSING SCHEMATIC DATA DIAGRAM
Dimensions of Data Warehouse Modeling
Measures – key performance indicators that we want to evaluate.
Typically numerical, including volume, sales and cost.
A rule of thumb: if a number makes (business) sense when aggregated, then it is a measure.
Measures affect what should be stored in the data warehouse.
Example: aggregate daily volume to month, quarter and year.

Dimensions – categories of data analysis.
Typical dimensions include product, time and region.
A rule of thumb: when a report is requested “by” something, that something is usually a dimension.
Example: in a sales report, view sales by month and by region, so the two dimensions needed are time and region.
Dimensions and measures are physically represented by a STAR SCHEMA.
The data model adhered to when handling historical data to populate a data warehouse is a “MULTIDIMENSIONAL DATA MODEL”.
One way to look at a multidimensional data model is to view it as a cube.

CUBE
It is a data structure that allows fast analysis of data. It can also be defined as the capability of manipulating and analyzing data from multiple perspectives.

BASIC STRUCTURE OF A CUBE
The response time of a multidimensional query still depends on how many cells have to be added on the fly.
The n-D base cube is called the BASE CUBOID. The topmost 0-D cuboid, which holds the highest level of summarization, is called the APEX CUBOID. The lattice of cuboids forms a data CUBE.
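The lattice of cuboids can be made concrete with a small sketch. It assumes three dimensions (product, time and region, the typical dimensions named earlier) and no hierarchies inside a dimension, so every subset of the dimensions is one cuboid and there are 2^n cuboids in total. The function name `cuboids` is illustrative only.

```python
from itertools import combinations

# Assumed dimensions, borrowed from the "typical dimensions" listed earlier.
dimensions = ("product", "time", "region")

def cuboids(dims):
    """Return every subset of dims, from the n-D base cuboid down to the 0-D apex."""
    result = []
    for size in range(len(dims), -1, -1):
        result.extend(combinations(dims, size))
    return result

lattice = cuboids(dimensions)
base_cuboid = lattice[0]    # grouped by all dimensions: the finest level of detail
apex_cuboid = lattice[-1]   # grouped by no dimension: a single grand total

for group_by in lattice:
    print(f"{len(group_by)}-D cuboid: {group_by}")
```

Each tuple is the list of GROUP BY columns for one cuboid; the base cuboid is the most detailed and the apex cuboid is the single overall total.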
PART 3 - PROCESSING
PROCESSING DATA TO INFORMATION
Now that we have the required data in the requisite form, how do we get the desired output for a query that requires analyzing the data?
This is accomplished by:
OLAP Operations
OLAP Functions
SQL Extensions for OLAP
OLAP OPERATIONS
Dimension tables
    Market (Market_ID, City, Region)
    Product (Product_ID, Name, Category, Price)
    Time (Time_ID, Week, Month, Quarter)
Fact table
    Sales (Market_ID, Product_ID, Time_ID, Amount)
OLAP OPERATIONS
1. AGGREGATION
Computing the total of a measure over one or more dimensions.
QUERY: Find the total sales (over time) of each product in each market.
    SELECT Market_ID, Product_ID, SUM(Amount)
    FROM Sales
    GROUP BY Market_ID, Product_ID;
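To try the aggregation query, the star schema above can be built in an in-memory SQLite database using Python's standard `sqlite3` module. This is only a sketch: the column types and the four fact rows are invented for illustration, and the dimension tables are created but left empty because this query reads only the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Market  (Market_ID INTEGER PRIMARY KEY, City TEXT, Region TEXT);
    CREATE TABLE Product (Product_ID INTEGER PRIMARY KEY, Name TEXT, Category TEXT, Price REAL);
    CREATE TABLE Time    (Time_ID INTEGER PRIMARY KEY, Week TEXT, Month TEXT, Quarter TEXT);
    CREATE TABLE Sales   (Market_ID INTEGER, Product_ID INTEGER, Time_ID INTEGER, Amount REAL);
""")

# Invented fact rows: (Market_ID, Product_ID, Time_ID, Amount).
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 100), (1, 1, 2, 50), (1, 2, 1, 70), (2, 1, 1, 30)],
)

# Total sales of each product in each market, summed over the time dimension.
totals = conn.execute("""
    SELECT Market_ID, Product_ID, SUM(Amount)
    FROM Sales
    GROUP BY Market_ID, Product_ID
    ORDER BY Market_ID, Product_ID
""").fetchall()

for row in totals:
    print(row)
```

The two sales of product 1 in market 1 fall in different time periods and are collapsed into a single total, which is what aggregating "over time" means here.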
OLAP OPERATIONS
2. ROLL UP
A specific grouping on one dimension, going from a lower level of aggregation to a higher one.
Example: “ROLL UP sales on MARKET from CITY to REGION”
First, the total sales of each product in each city are computed and stored in City_Sales.
Then the city totals are combined, using the region that each city belongs to, to give the sales of each product in that region.
    SELECT S.Product_ID, M.City, SUM(S.Amount)
    INTO City_Sales
    FROM Sales S, Market M
    WHERE M.Market_ID = S.Market_ID
    GROUP BY S.Product_ID, M.City
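Both roll-up steps can be sketched with SQLite through Python's `sqlite3` module. The market and sales rows are invented, and because SQLite has no `SELECT ... INTO`, the sketch uses `CREATE TABLE ... AS` in its place. The second query is one possible way to write the city-to-region step, which the slide describes only in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Market (Market_ID INTEGER PRIMARY KEY, City TEXT, Region TEXT);
    CREATE TABLE Sales  (Market_ID INTEGER, Product_ID INTEGER, Time_ID INTEGER, Amount REAL);
    INSERT INTO Market VALUES (1, 'Kolkata', 'East'), (2, 'Howrah', 'East'), (3, 'Mumbai', 'West');
    INSERT INTO Sales  VALUES (1, 1, 1, 100), (2, 1, 1, 40), (3, 1, 1, 70), (1, 2, 1, 10);

    -- Step 1: city-level totals (CREATE TABLE ... AS stands in for SELECT ... INTO).
    CREATE TABLE City_Sales AS
        SELECT S.Product_ID, M.City, SUM(S.Amount) AS Amount
        FROM Sales S JOIN Market M ON M.Market_ID = S.Market_ID
        GROUP BY S.Product_ID, M.City;
""")

# Step 2: roll the city totals up to the region each city belongs to.
region_sales = conn.execute("""
    SELECT C.Product_ID, R.Region, SUM(C.Amount)
    FROM City_Sales C
    JOIN (SELECT DISTINCT City, Region FROM Market) R ON R.City = C.City
    GROUP BY C.Product_ID, R.Region
    ORDER BY C.Product_ID, R.Region
""").fetchall()

for row in region_sales:
    print(row)
```

Kolkata and Howrah both belong to the East region in this sample, so their city totals for product 1 merge into one regional row.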
OLAP OPERATIONS
3. DRILL DOWN
A finer-grained view of aggregated data, i.e. going from a higher to a lower level of aggregation.
The converse of roll-up.
E.g. disaggregate country sales by region/city.

OLAP OPERATIONS
4. PIVOTING
Select a different dimension (orientation) for analysis.
OLAP OPERATIONS
5. SLICE and DICE
Slicing: selection on one or more dimensions.
Example: “Choosing sales only in week 12”
Slicing the data cube in the time dimension:
    SELECT S.*
    FROM Sales S, Time T
    WHERE T.Time_ID = S.Time_ID
    AND T.Week = 'Week 12'

OLAP OPERATIONS
Dicing: a range selection in a hypercube; partition or group on one or more dimensions.
Example: “Total sales for each product in each quarter”
Dicing sales in the time dimension:
    SELECT S.Product_ID, T.Quarter, SUM(S.Amount)
    FROM Sales S, Time T
    WHERE T.Time_ID = S.Time_ID
    GROUP BY T.Quarter, S.Product_ID
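The slice and dice queries above can be run against a tiny sample in SQLite. The rows below are invented for illustration, and the joins are written with explicit `JOIN ... ON`, which is equivalent to the comma-style joins in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Time  (Time_ID INTEGER PRIMARY KEY, Week TEXT, Month TEXT, Quarter TEXT);
    CREATE TABLE Sales (Market_ID INTEGER, Product_ID INTEGER, Time_ID INTEGER, Amount REAL);
    INSERT INTO Time  VALUES (1, 'Week 12', 'March', 'Q1'),
                             (2, 'Week 13', 'March', 'Q1'),
                             (3, 'Week 14', 'April', 'Q2');
    INSERT INTO Sales VALUES (1, 1, 1, 100), (1, 2, 1, 20), (1, 1, 2, 50), (1, 1, 3, 30);
""")

# Slice: keep only the facts that fall in week 12 of the time dimension.
week_12 = conn.execute("""
    SELECT S.Market_ID, S.Product_ID, S.Time_ID, S.Amount
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    WHERE T.Week = 'Week 12'
    ORDER BY S.Product_ID
""").fetchall()

# Dice (as in the slides): total sales for each product in each quarter.
per_quarter = conn.execute("""
    SELECT S.Product_ID, T.Quarter, SUM(S.Amount)
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    GROUP BY T.Quarter, S.Product_ID
    ORDER BY T.Quarter, S.Product_ID
""").fetchall()

print(week_12)
print(per_quarter)
```

The slice returns unaggregated fact rows for one time value, while the dice query regroups the whole fact table by quarter and product.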
SQL EXTENSIONS FOR OLAP
1. ROLLUP
    SELECT SEM, SUM(MARKS),
           RANK() OVER (ORDER BY SUM(MARKS) DESC) AS rank
    FROM TEACHERS
    GROUP BY ROLLUP(SEM)
    ORDER BY SEM
ROLLUP thus adds subtotal rows to the aggregate rows: here, the sum of marks for each semester plus a grand total.
SQL EXTENSIONS
2. CUBE
    SELECT SEM, SUM(MARKS)
    FROM TEACHERS
    GROUP BY CUBE(SEM)
The CUBE operator adds subtotals for every combination of the grouping columns to the result set.
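The difference between the two operators is easiest to see by listing the grouping sets each one expands to. The sketch below follows the standard SQL semantics: ROLLUP takes each prefix of its column list, CUBE takes every subset. The helper names `rollup_sets` and `cube_sets` are hypothetical and not part of any database API.

```python
from itertools import combinations

def rollup_sets(columns):
    """Grouping sets of GROUP BY ROLLUP(columns): each prefix, down to the empty set."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

def cube_sets(columns):
    """Grouping sets of GROUP BY CUBE(columns): every subset of the columns."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

# With a single column, as in the SEM examples, both give the same sets.
print(rollup_sets(["SEM"]))
print(cube_sets(["SEM"]))

# With two columns, CUBE adds the ("QUARTER",) grouping that ROLLUP omits.
print(rollup_sets(["YEAR", "QUARTER"]))
print(cube_sets(["YEAR", "QUARTER"]))
```

The empty tuple stands for the grand-total row. For n columns ROLLUP yields n + 1 grouping sets and CUBE yields 2^n.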
SQL EXTENSIONS
3. GROUPING SETS
GROUPING SETS lets us compute groups on several different sets of grouping columns in the same query.
The following query returns a row for each year and quarter plus a subtotal row for each year, but no subtotal rows for the individual quarters.
    SELECT YEAR AS YEAR, QUARTER AS QUARTER, COUNT(*) AS ORDERS
    FROM SALES
    GROUP BY GROUPING SETS ((YEAR, QUARTER), (YEAR))
    ORDER BY YEAR, QUARTER
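SQLite, used here through Python's `sqlite3` module, does not implement GROUPING SETS, so the sketch below emulates the query with one GROUP BY per grouping set combined by UNION ALL. This is a stand-in for the operator and not how a database that supports it would be queried. The SALES rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALES (YEAR INTEGER, QUARTER TEXT);
    INSERT INTO SALES VALUES (2000, 'Q1'), (2000, 'Q1'), (2000, 'Q2'), (2001, 'Q1');
""")

# One SELECT per grouping set: (YEAR, QUARTER) detail rows, then (YEAR) subtotals.
# A NULL in the QUARTER column marks a per-year subtotal row.
rows = conn.execute("""
    SELECT YEAR, QUARTER, COUNT(*) AS ORDERS FROM SALES GROUP BY YEAR, QUARTER
    UNION ALL
    SELECT YEAR, NULL, COUNT(*) FROM SALES GROUP BY YEAR
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

There is a subtotal row for each year but none for a quarter across years, matching the description of the query above.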
OLAP FUNCTIONS
1. RANK FUNCTION
Lets us compile a list of values from our data set in ranked order.
Example: the SQL query that follows finds the male and female employees from Kolkata and ranks them in descending order of salary.
    SELECT emp_lname, salary, sex,
           RANK() OVER (ORDER BY salary DESC) "Rank"
    FROM employee
    WHERE city IN ('KOL')
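The ranking query can be tried in SQLite, which supports window functions from version 3.25 onward (assumed here). The employee rows are invented, and the output column is aliased `salary_rank` instead of "Rank" purely for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_lname TEXT, salary INTEGER, sex TEXT, city TEXT);
    INSERT INTO employee VALUES
        ('Roy', 70000, 'M', 'KOL'), ('Sen', 50000, 'F', 'KOL'),
        ('Das', 50000, 'M', 'KOL'), ('Rao', 90000, 'M', 'MUM');
""")

# Rank the Kolkata employees by salary, highest first; equal salaries share a rank.
ranked = conn.execute("""
    SELECT emp_lname, salary, sex,
           RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM employee
    WHERE city IN ('KOL')
    ORDER BY salary_rank, emp_lname
""").fetchall()

for row in ranked:
    print(row)
```

The WHERE clause is applied before ranking, so the highest earner overall (from another city) does not occupy rank 1, and the two employees with equal salaries both receive rank 2.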
OLAP FUNCTIONS
2. REPORTING FUNCTION
Reporting functions let us compare non-aggregate values to aggregate values.
Example: the following query returns the products that sold more than the average number of sales. The result set is partitioned by year.
    SELECT *
    FROM (SELECT year(order_date) AS Year, prod_id,
                 SUM(quantity) AS Q,
                 AVG(SUM(quantity)) OVER (PARTITION BY Year) AS Average
          FROM sales_order JOIN sales_order_items
          GROUP BY year(order_date), prod_id
          ORDER BY Year) AS derived_table
    WHERE Q > Average
For the year 2000, the average number of orders was 1787. Four products (700, 601, 600, and 400) sold more than that amount. In 2001, the average number of orders was 1048 and three products exceeded that amount.
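A scaled-down version of the reporting query can be run in SQLite (3.25 or later is assumed for window functions). The single table `order_lines` and its rows are hypothetical stand-ins for the joined sales_order and sales_order_items tables, and the per-product sums are computed in an inner derived table before the window average is taken, which is a simplification of the nested AVG(SUM(...)) form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (order_year INTEGER, prod_id INTEGER, quantity INTEGER);
    INSERT INTO order_lines VALUES
        (2000, 400, 10), (2000, 400, 20), (2000, 600, 5), (2000, 700, 25),
        (2001, 400, 8), (2001, 600, 12);
""")

# Inner query: quantity sold per product per year.
# Middle query: attach the yearly average of those sums to every row.
# Outer query: keep only products that sold more than their year's average.
above_average = conn.execute("""
    SELECT order_year, prod_id, q, average
    FROM (SELECT order_year, prod_id, q,
                 AVG(q) OVER (PARTITION BY order_year) AS average
          FROM (SELECT order_year, prod_id, SUM(quantity) AS q
                FROM order_lines
                GROUP BY order_year, prod_id) AS yearly)
         AS derived_table
    WHERE q > average
    ORDER BY order_year, prod_id
""").fetchall()

for row in above_average:
    print(row)
```

Each result row carries both the product's own total and the aggregate for its year, which is the non-aggregate versus aggregate comparison that reporting functions make possible.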
OLAP FUNCTIONS
3. WINDOW FUNCTIONS
Window functions let us analyze our data by computing aggregate values over windows surrounding each row. The result set returns a summary value representing a set of rows.

The query below partitions the data by department and then provides a cumulative summary of employees' salaries, starting with the employee who has been at the company the longest. The result set includes only those employees who reside in West Bengal, BBSR, Maharashtra, or Arunachal. The column Sum_Salary provides the cumulative total of employees' salaries.
    SELECT dept_id, emp_lname, start_date, salary,
           SUM(salary) OVER (PARTITION BY dept_id
                             ORDER BY start_date
                             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS "Sum_Salary"
    FROM employee
    WHERE state IN ('WB', 'BBSR', 'MH', 'AR') AND dept_id IN ('100', '200')
    ORDER BY dept_id, start_date;
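The cumulative-salary query can be checked on a few invented rows in SQLite (3.25 or later assumed). Two employees in department 200 deliberately share a start date to show how a RANGE frame treats rows that tie on the ORDER BY column; the extra `emp_lname` sort key in the final ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (dept_id TEXT, emp_lname TEXT, start_date TEXT,
                           salary INTEGER, state TEXT);
    INSERT INTO employee VALUES
        ('100', 'Roy', '2001-01-10', 50000, 'WB'),
        ('100', 'Sen', '2003-05-01', 40000, 'MH'),
        ('100', 'Das', '2005-07-15', 30000, 'DL'),
        ('200', 'Rao', '2002-02-02', 60000, 'AR'),
        ('200', 'Pal', '2002-02-02', 20000, 'WB');
""")

# Running total of salary within each department, ordered by start date.
running = conn.execute("""
    SELECT dept_id, emp_lname, start_date, salary,
           SUM(salary) OVER (PARTITION BY dept_id
                             ORDER BY start_date
                             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Sum_Salary
    FROM employee
    WHERE state IN ('WB', 'BBSR', 'MH', 'AR') AND dept_id IN ('100', '200')
    ORDER BY dept_id, start_date, emp_lname
""").fetchall()

for row in running:
    print(row)
```

With a RANGE frame, rows that tie on start_date are peers and receive the same cumulative value, so both department 200 employees show the combined total. A ROWS frame would instead accumulate one row at a time.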
On Line Analytical Processing
Thus Online Analytical Processing as a whole can be understood as a method that takes in raw data, processes it through various functions and operations, and produces information in response to multidimensional queries in real time.
SERVER ARCHITECTURES
MOLAP: Multidimensional OLAP
The database is stored in a special, usually proprietary, structure that is optimized for multidimensional analysis.
+ : very fast query response time, because data is mostly pre-calculated
- : practical limit on size, because of the time taken to calculate the database and the space required to hold the pre-calculated values

SERVER ARCHITECTURES
ROLAP: Relational OLAP
The database is a standard relational database and the database model is a multidimensional model, often referred to as a star or snowflake model or schema.
+ : more scalable solution
- : query performance is largely governed by the complexity of the SQL and by the number and size of the tables being joined in the query

SERVER ARCHITECTURES
HOLAP: Hybrid OLAP
A hybrid of ROLAP and MOLAP. It can be thought of as a virtual database in which the higher levels of the database are implemented as MOLAP and the lower levels as ROLAP.
SERVER ARCHITECTURES
DOLAP: Desktop OLAP
The previous terms refer to server-based OLAP technologies.
DOLAP enables users to quickly pull together small cubes that run on their desktops or laptops.
COMMERCIAL OLAP SYSTEMS
IBM DB2 DATA WAREHOUSING ENTERPRISE EDITION
BASE EDITION
ORACLE 9i ENTERPRISE EDITION
MICROSOFT SQL SERVER 2005 BUSINESS INTELLIGENCE WORKBENCH PLATFORM
OLAP Challenges and Future Scope
Analytical complexity
Business questions can rarely be answered by a single query.
Complex queries are hard to understand, write and execute efficiently.
Need for good business analysts.
Data cubes can be HUGE, but they can also be sparse.
Cubes can be computed in advance, computed on demand, or some combination of the two.
OLAP forms the underlying structure of DDAS – Distributed Data Analysis and Dissemination System.
From Online Analytical Processing to Online Analytical Mining (OLAP to OLAM).