Chapter 9 & 10 - Data Warehouse

Author: Muhammad Hamiz Mohd Radzi Edited By: Zuhri Arafah Zulkifli


Course outline
• The Need for Data Analysis
• Business Intelligence
• Business Intelligence Architecture
• Decision Support Data
• Data Warehouse
• Online Analytical Processing
• Star Schema

Objectives
At the end of this lesson, you should be able to:
• Describe the need for data analysis
• Describe business intelligence and its steps
• Explain the architectural components of business intelligence
• Explain the tools used in business intelligence
• Explain operational vs. decision support data
• Describe the contrasting characteristics of operational and decision support data
• Explain the decision support database requirements
• Explain the data warehouse, its characteristics, the 12 rules of a data warehouse, and the architectural styles of a data warehouse
• Describe the multidimensional data analysis technique
• Explain online analytical processing (OLAP) functions
• Explain the advanced database support for OLAP
• Explain MOLAP and ROLAP
• Draw a data cube and a star schema
• Explain the operations on a data cube

Introduction

• Data are crucial raw material in this information age, and data storage and management have become the focus of database design and implementation.
• Ultimately, the reason for collecting, storing, and managing data is to generate information that becomes the basis for rational decision making.
• Decision support systems (DSSs) were originally developed to facilitate the decision-making process.
• However, as the complexity and range of information requirements increased, so did the difficulty of extracting all the necessary information from the data structures typically found in an operational database.
• Therefore, a new data storage facility, called a data warehouse, was developed.
• The data warehouse extracts or obtains its data from operational databases as well as from external sources, providing a more comprehensive data pool.

The Need For Data Analysis

• Organizations tend to grow and prosper as they gain a better understanding of their environment.
• Most managers want to be able to track daily transactions to evaluate how the business is performing.
• In addition, data analysis can provide information about short-term tactical evaluations and strategies such as these:
  - Are our sales promotions working?
  - What market percentage are we controlling?
  - Are we attracting new customers?
• Given the many and varied competitive pressures, managers are always looking for a competitive advantage and keep changing to stay relevant.
• Different managerial levels have different decision support needs.
• Managers require detailed information designed to help them make decisions in a complex data and analysis environment.
• This more comprehensive and integrated decision support framework within organizations became known as business intelligence.

Business Intelligence
• Business intelligence (BI) is a term used to describe a comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyze data with the purpose of generating and presenting information used to support business decision making.
• As the name implies, BI is about creating intelligence about a business.
• This intelligence is based on learning and understanding the facts about a business environment.
• BI has the potential to positively affect a company’s culture by creating “business wisdom” and distributing it to all users in an organization.
• BI is not a product by itself, but a framework of concepts, practices, tools, and technologies that help a business better understand its core capabilities, provide snapshots of the company situation, and identify key opportunities to create competitive advantage.
• The general steps in BI creation are:
1. Collecting and storing operational data.
2. Aggregating the operational data into decision support data.
3. Analyzing decision support data to generate information.
4. Presenting such information to the end user to support business decisions.
5. Making business decisions, which in turn generate more data that is collected, stored, etc. (restarting the process).
6. Monitoring results to evaluate outcomes of the business decisions (providing more data to be collected, stored, etc.).
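
Steps 1 to 4 of the BI cycle can be illustrated with a minimal Python sketch. The transaction records, field names, and the `aggregate_sales` helper are hypothetical and exist only for illustration; a real BI platform does far more than this.

```python
from collections import defaultdict

# Step 1 (assumed data): operational records, one per sales transaction.
transactions = [
    {"date": "2010-01-01", "region": "East", "amount": 120.0},
    {"date": "2010-01-01", "region": "West", "amount": 80.0},
    {"date": "2010-01-02", "region": "East", "amount": 50.0},
]

def aggregate_sales(rows, key):
    """Step 2: aggregate operational rows into decision support totals."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)

# Step 3: analyze the aggregated data to generate information.
sales_by_region = aggregate_sales(transactions, "region")
best_region = max(sales_by_region, key=sales_by_region.get)

# Step 4: present the information to the end user.
print(f"Top region: {best_region} ({sales_by_region[best_region]:.2f})")
```

Steps 5 and 6 (deciding and monitoring) would feed new transactions back into the same list, restarting the cycle.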

Business Intelligence Architecture

• BI covers a range of technologies and applications to manage the entire data life cycle from acquisition to storage, transformation, integration, analysis, monitoring, presentation, and archiving.
• BI functionality ranges from simple data gathering and extraction to very complex data analysis and presentation.
• There is no single BI architecture.
• However, there are some general types of functionality that all BI implementations share.

Practices to Manage Data

• Master data management (MDM): Collection of concepts, techniques, and processes for the identification, definition, and management of data elements
• Governance: Method of governing used to control business health and to support consistent decision making
• Key performance indicators (KPI): Numeric or scale-based measurements that assess a company’s effectiveness in reaching its goals

Practices to Manage Data

• Data visualization: Abstracting data to provide information in a visual format
• Enhances the user’s ability to efficiently comprehend the meaning of the data
• Techniques:
  - Pie charts and bar charts
  - Line graphs
  - Scatter plots
  - Gantt charts
  - Heat maps

Reporting Styles of a Modern BI System

• Advanced reporting
• Monitoring and alerting
• Advanced data analytics

Business Intelligence Benefits

Improved decision making

Integrating architecture

Common user interface for data reporting and analysis

Common data repository fosters single version of company data

Improved organizational performance



Evolution of BI Information Dissemination Formats



Business Intelligence Technology Trends

Data storage improvements

Business intelligence appliances

Business intelligence as a service

Big Data analytics

Personal analytics

Decision Support Data

• Effectiveness of BI depends on the quality of data gathered at the operational level
• Operational data
  - Seldom well-suited for decision support tasks
  - Stored in relational databases with highly normalized structures
  - Optimized to support transactions representing daily operations
• Decision support data differ from operational data in:
  - Time span
  - Granularity
    - Drill down: Decomposing data to a lower level
    - Roll up: Aggregating data into a higher level
  - Dimensionality

Contrasting Operational and Decision Support Data Characteristics

Decision Support Database Requirements

• Database schema
  - Must support complex, non-normalized data representations
  - Data must be aggregated and summarized
  - Queries must be able to extract multidimensional time slices
• Data extraction and loading
  - Allow batch and scheduled data extraction
  - Support different data sources and check for inconsistent data or data validation rules
  - Support advanced integration, aggregation, and classification
• Database size should support:
  - Very large databases (VLDBs)
  - Advanced storage technologies
  - Multiple-processor technologies

Data Warehouse
• A central data repository where data from operational databases and other sources are integrated, cleaned, and standardized to support decision making.
• A warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process.

• Subject-oriented: Can be used to analyze a particular subject area.
• Integrated: Integrates data from multiple data sources.
• Time-variant: Historical data is kept in a data warehouse.
• Non-volatile: Once data is in the data warehouse, it will not change.

Characteristics of Data Warehouse Data and Operational Database Data

The ETL Process (steps involved in a DW)
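
As an illustration only, the extract, transform, and load steps might be sketched in Python as follows. The two sources, the cleaning rules, and all function names are assumptions made for this example; the original slide showed the process as a figure.

```python
def extract(sources):
    """Extract: pull raw rows from every (hypothetical) source."""
    return [row for source in sources for row in source]

def transform(rows):
    """Transform: clean and standardize rows, dropping invalid ones."""
    cleaned = []
    for row in rows:
        if row.get("amount") is None:
            continue  # assumed validation rule: amount is required
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        })
    return cleaned

def load(rows, warehouse):
    """Load: append the cleaned rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse

# Assumed inputs: one operational database and one external file.
operational_db = [{"customer": " kamal ", "amount": "100"}]
external_file = [
    {"customer": "MOKTHAR", "amount": None},
    {"customer": "bukhairi", "amount": "25.5"},
]

warehouse = load(transform(extract([operational_db, external_file])), [])
```

The row with a missing amount is rejected during the transform step, so only integrated, cleaned, and standardized rows reach the warehouse.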

Twelve Rules for a Data Warehouse



Data Warehouse Architectures

• Two-tier (enterprise data model, EDM)
• Three-tier (data marts)
• Bottom-up
• Oper marts

Why Do We Need Data Warehouse Architecture?


 Many data warehouse projects have failed due to poor planning and
performance.
 Data warehouse projects are large efforts that involve coordination among
many parts of an organization.
 Appropriate architecture can help alleviate the above problems

Two-tier architecture
An architecture for a data warehouse in which user departments directly use the data warehouse rather than data marts.
Operational data are transformed and then transferred to a data warehouse.
A separate layer of servers may be used to support the complex activities of the transformation process.
To assist with the transformation process, an enterprise data model (EDM) is created.
EDM:
i. Describes the structure of the data warehouse.
ii. Contains metadata for data transformation.
iii. Contains details about cleaning and integrating data sources.

Three-tier Architecture
An architecture for a data warehouse in which user departments access data marts rather than the data warehouse.
To provide users with faster access while isolating them from data needed by other user groups, smaller data warehouses called data marts are often used.
Data marts:
i. Are a subset or view of a data warehouse.
ii. Typically serve a department or functional level.
iii. Act as the interface between end users and the corporate data warehouse.
iv. Store a subset of the data warehouse.
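
The idea of a data mart as a department-level subset of the warehouse can be sketched in Python. The warehouse rows, the `department` column, and the `build_data_mart` helper are hypothetical.

```python
# Hypothetical warehouse rows; the "department" column is an assumption.
warehouse = [
    {"department": "Sales", "metric": "revenue", "value": 500},
    {"department": "HR", "metric": "headcount", "value": 42},
    {"department": "Sales", "metric": "orders", "value": 31},
]

def build_data_mart(rows, department):
    """Return the subset of warehouse rows that one department needs."""
    return [row for row in rows if row["department"] == department]

# The Sales users query their own mart, not the full warehouse.
sales_mart = build_data_mart(warehouse, "Sales")
```

Because each mart holds only its own department's rows, users get a smaller data set to query and are isolated from data needed by other groups.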

Bottom-up architecture
 An architecture for a data warehouse in which data marts are built for user
departments.
 Data are modeled one entity at a time and stored in separate data marts.
 Over time, new data are synthesized, cleaned, and merged into existing data
marts or built into new data marts.
 Data marts may eventually evolve into a data warehouse

Oper mart architecture

• Short for operational mart.
• A just-in-time data mart.
• Usually built from one operational database in anticipation of or in response to major events.
• Supports the peak demand for reporting and business analysis that accompanies a major event.

Online Analytical Processing


 Advanced data analysis environment that supports decision making,
business modeling, and operations research

 Characteristics
• Multidimensional data analysis techniques
• Advanced database support
• Easy-to-use end-user interfaces

Multidimensional Data Analysis Techniques


Data are processed and viewed as part of a multidimensional structure

Augmenting functions
• Advanced data presentation functions
• Advanced data aggregation, consolidation, and classification functions
• Advanced computational functions
• Advanced data-modeling functions

Advanced Database Support

Advanced data access features:
• Access to many different kinds of DBMSs, flat files, and internal and external data sources
• Access to aggregated data warehouse data and to the detail data found in operational databases
• Advanced data navigation features
• Rapid and consistent query response times
• Ability to map end-user requests to the appropriate data source and to the proper data access language
• Support for very large databases

Easy-to-Use End-User Interface

• Proper implementation leads to simple navigation and accelerated decision making or data analysis
• Advanced OLAP features are more useful when access is kept simple
• Many interface features are borrowed from previous generations of data analysis tools

OLAP Architecture

OLAP Server with Local Miniature Data Marts



Relational Online Analytical Processing (ROLAP)

• Provides OLAP functionality using relational databases and familiar relational tools to store and analyze multidimensional data
• Extensions added to traditional RDBMS technology
  - Multidimensional data schema support within the RDBMS
  - Data access language and query performance optimized for multidimensional data
  - Support for very large databases (VLDBs)

Multidimensional Online Analytical Processing (MOLAP)

• Extends OLAP functionality to multidimensional database management systems (MDBMSs).
• MDBMS: Uses proprietary techniques to store data in matrix-like n-dimensional arrays
• End users visualize stored data as a 3D data cube
  - Data cubes can grow to n dimensions, becoming hypercubes
  - Data cubes are held in memory in a cube cache to speed access
• Sparsity: Measures the density of the data held in the data cube
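
The sparsity measure can be made concrete with a short Python sketch. It assumes density is computed as populated cells divided by all possible cells, which is a common convention rather than something stated in the source; the cube contents and names are hypothetical.

```python
# Hypothetical 3D cube: only populated cells are stored, keyed by
# (product, region, quarter).
cube = {
    ("Pencils", "East", "Q1"): 45,
    ("Balls", "East", "Q1"): 55,
    ("Widgets", "West", "Q2"): 25,
}
products = ["Pencils", "Balls", "Widgets"]
regions = ["East", "West"]
quarters = ["Q1", "Q2"]

def density(cells, *dimensions):
    """Fraction of all possible cells that actually hold a value."""
    total = 1
    for members in dimensions:
        total *= len(members)
    return len(cells) / total

# 3 populated cells out of 3 * 2 * 2 = 12 possible cells.
cube_density = density(cube, products, regions, quarters)
cube_sparsity = 1 - cube_density
```

A low density (high sparsity) means most cells are empty, which is why storing only the populated cells, as the dictionary does here, can save space.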

Relational vs. Multidimensional OLAP

Cengage Learning © 2015



Multidimensional representation

Three-Dimensional Data Cube



Multidimensional Representation with Row Total



MULTIDIMENSIONAL TERMINOLOGY
• Dimension: subject label for a row or column

• Member: value of dimension

• Measure: quantitative data stored in cells



DIMENSIONS
• Hierarchies: members can have sub-members, e.g., a Location dimension may have the hierarchy country → state → city

• Hierarchies can be used to drill down from a higher level to a lower level of detail, or to roll up in the reverse direction.

• Sparsity: a large number of empty cells in a data cube
  - Empty cells waste space and can be slow to process.

MEASURES
• Support numeric operations such as simple arithmetic and statistical calculations.

• A cell can hold multiple measures
  - e.g., sales amount or number of units sold

• Derived measures can be stored in the data cube or computed from other measures
  - e.g., total ringgit sales = total units sold * unit price
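
The derived measure above can be computed on demand rather than stored. In this minimal Python sketch the cell keys and the measure names `units_sold` and `unit_price` are hypothetical.

```python
# Hypothetical cells: each holds two stored measures.
cells = {
    ("Pencils", "East"): {"units_sold": 45, "unit_price": 2.0},
    ("Balls", "East"): {"units_sold": 55, "unit_price": 4.0},
}

def total_sales(cell):
    """Derived measure: total sales = units sold * unit price."""
    return cell["units_sold"] * cell["unit_price"]

# Compute the derived measure for every cell instead of storing it.
sales = {key: total_sales(cell) for key, cell in cells.items()}
```

Storing the derived value in the cube would trade extra space for faster queries; computing it as shown keeps the cube smaller.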

DATA CUBE OPERATIONS

Operator | Purpose | Description
Slice | Focus attention on a subset of dimensions | Replace a dimension with a single member value or with a summary of its measure values
Dice | Focus attention on a subset of member values | Replace a dimension with a subset of members
Drill-down | Obtain more detail about a dimension | Navigate from a more general level to a more specific level
Roll-up | Summarize details about a dimension | Navigate from a more specific level to a more general level
Pivot | Present data in a different order | Rearrange the dimensions in a data cube

SLICE OPERATOR
• Focuses on a subset of dimensions

• Similar to the restriction operator

• A dimension is set to a specific value, e.g., 1/1/2006

Example Slice Operation



Example Slice-Summarize Operation



DICE OPERATOR
• Focus on a subset of member values
• Replace dimension with a subset of values
• Dice operation often follows a slice operation
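
A slice followed by a dice can be sketched in Python as below. The cube is modeled as a dictionary keyed by (year, region, product); this representation, the values, and the function names are assumptions for illustration, not how an OLAP server stores data.

```python
# Hypothetical cube keyed by (year, region, product); values are quantities.
cube = {
    (2009, "East", "Erasers"): 62,
    (2009, "North", "Widgets"): 250,
    (2010, "East", "Pencils"): 45,
    (2010, "South", "Widgets"): 155,
    (2010, "West", "Balls"): 100,
}

def slice_cube(cube, axis, member):
    """Slice: fix one dimension to a single member and drop that dimension."""
    return {
        key[:axis] + key[axis + 1:]: value
        for key, value in cube.items()
        if key[axis] == member
    }

def dice_cube(cube, axis, members):
    """Dice: keep only cells whose member on `axis` is in `members`."""
    return {key: value for key, value in cube.items() if key[axis] in members}

# Slice on year = 2010, leaving a 2D (region, product) view.
sales_2010 = slice_cube(cube, 0, 2010)
# Dice the slice down to a subset of regions.
east_south_2010 = dice_cube(sales_2010, 0, {"East", "South"})
```

The slice removes the year dimension entirely, while the dice keeps the region dimension but with fewer members.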

DRILL-DOWN

• Navigate from a more general level to a more specific level.

• Obtain more detail about a dimension.

ROLL-UP/DRILL-UP
• Remove detail from a dimension.

• Move from a specific level to a more general level of a hierarchical dimension.
  - e.g., roll up sales data from the daily to the quarterly level.
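
Rolling up from the daily to the quarterly level can be sketched in Python. The daily figures and the helper names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales figures.
daily_sales = {
    date(2010, 1, 1): 100,
    date(2010, 2, 15): 40,
    date(2010, 4, 3): 70,
}

def quarter_of(day):
    """Map a date to its (year, quarter) level in the time hierarchy."""
    return (day.year, (day.month - 1) // 3 + 1)

def roll_up(daily):
    """Roll up: aggregate daily values to the quarterly level."""
    quarterly = defaultdict(int)
    for day, amount in daily.items():
        quarterly[quarter_of(day)] += amount
    return dict(quarterly)

quarterly_sales = roll_up(daily_sales)
```

Drill-down is the reverse navigation: starting from a quarterly total and returning to the daily rows that produced it, which requires the detailed data to still be available.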

PIVOT
• Rearrange dimensions so that data cube can be presented in a visually
appealing order.

• Most typically used on data cube of more than two dimensions.
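
A pivot changes only the order in which dimensions are presented, not the measures. A minimal Python sketch, again assuming a dictionary-based cube with hypothetical contents:

```python
# Hypothetical cube keyed by (year, region, product).
cube = {
    (2010, "East", "Pencils"): 45,
    (2010, "West", "Balls"): 100,
}

def pivot(cube, order):
    """Pivot: reorder the dimensions of every cell key.

    `order` lists the old axis positions in their new order.
    """
    return {tuple(key[i] for i in order): value for key, value in cube.items()}

# Present the cube as (product, region, year) instead.
by_product = pivot(cube, (2, 1, 0))
```

Every cell value is unchanged; only the key layout differs.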

We have a multidimensional data model with the fact table Sales and the dimension tables Customers, Products, and Salespeople. The Sales table below represents sales data for 1st January 2010:

[Tables: SALES, PRODUCT, SALESPEOPLE]

CUSTOMER
Custid  Name      Address
101     Kamal     London
102     Mokthar   New York
103     Bukhairi  Paris

Draw a 3D picture of a data cube. Assume that all values that are missing from the Sales table are 0.

Star Schema
• Data-modeling technique
• Maps multidimensional decision support data into a relational database
• Creates the near equivalent of a multidimensional database schema from an existing relational database
• Yields an easily implemented model for multidimensional data analysis

Components of Star Schemas


Facts

• Numeric values that represent a specific business aspect

Dimensions

• Qualifying characteristics that provide additional perspectives to a given fact

Attributes

• Used to search, filter, and classify facts


• Slice and dice: Ability to focus on slices of the data cube for more detailed analysis

Attribute hierarchy

• Provides a top-down data organization



Star Schema Representation

• Facts and dimensions are represented by physical tables in the data warehouse database
• Many-to-one (M:1) relationship between the fact table and each dimension table
• Fact and dimension tables
  - Related by foreign keys
  - Subject to primary and foreign key constraints
• Primary key of a fact table
  - Is a composite primary key because the fact table is related to many dimension tables
  - Always formed by combining the foreign keys pointing to the related dimension tables
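
The fact table's composite primary key can be sketched with Python's built-in `sqlite3` module. The table and column names, the product names, and the quantities are assumptions made for this example; only the customer names come from the exercise earlier in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (custid INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE product  (prodid INTEGER PRIMARY KEY, name TEXT);
-- Fact table: the composite primary key combines the foreign keys
-- that point to the dimension tables.
CREATE TABLE sales (
    custid   INTEGER REFERENCES customer(custid),
    prodid   INTEGER REFERENCES product(prodid),
    quantity INTEGER,
    PRIMARY KEY (custid, prodid)
);
INSERT INTO customer VALUES (101, 'Kamal', 'London'), (102, 'Mokthar', 'New York');
INSERT INTO product  VALUES (1, 'Pencils'), (2, 'Erasers');
INSERT INTO sales    VALUES (101, 1, 10), (101, 2, 5), (102, 1, 7);
""")

# Each fact row joins to exactly one row in each dimension (M:1).
rows = conn.execute("""
    SELECT c.name, SUM(s.quantity)
    FROM sales AS s JOIN customer AS c ON c.custid = s.custid
    GROUP BY c.name ORDER BY c.name
""").fetchall()
```

A real fact table would usually include a time dimension key as well, so that the same customer and product can appear on different dates.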

Techniques Used to Optimize Data Warehouse Design

• Normalizing dimensional tables
  - Snowflake schema: Dimension tables can have their own dimension tables
• Maintaining multiple fact tables to represent different aggregation levels
• De-normalizing fact tables
• Partitioning and replicating tables
  - Partitioning: Splits tables into subsets of rows or columns and places them close to the customer location
  - Replication: Makes a copy of a table and places it in a different location
  - Periodicity: Provides information about the time span of the data stored in the table

CONSTELLATION SCHEMA
• Data modeling representation of a multidimensional database
• A constellation schema contains multiple fact tables in the center, related to the dimension tables
• Typically, the fact tables share some dimension tables
• Because it can be viewed as a collection of stars, it is also called a galaxy schema or fact constellation

SNOWFLAKE SCHEMA
• Data modeling representation of a multidimensional database
• A snowflake schema has multiple levels of dimension tables related to one or more fact tables
• Consider the snowflake schema instead of the star schema for small dimension tables that are not in 3NF
• However, the snowflake structure can reduce the effectiveness of browsing, since more joins will be needed
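
The extra joins that a snowflake schema requires can be seen in a small `sqlite3` sketch. All tables, columns, and rows here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflake: the product dimension is normalized into its own
-- category dimension table (hypothetical tables and data).
CREATE TABLE category (catid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (prodid INTEGER PRIMARY KEY, name TEXT,
                       catid INTEGER REFERENCES category(catid));
CREATE TABLE sales    (prodid INTEGER REFERENCES product(prodid),
                       quantity INTEGER);
INSERT INTO category VALUES (1, 'Stationery'), (2, 'Toys');
INSERT INTO product  VALUES (10, 'Pencils', 1), (11, 'Erasers', 1), (12, 'Balls', 2);
INSERT INTO sales    VALUES (10, 45), (11, 20), (12, 55);
""")

# Browsing sales by category needs two joins (fact -> product -> category);
# a star schema storing the category name in product would need only one.
by_category = conn.execute("""
    SELECT c.name, SUM(s.quantity)
    FROM sales AS s
    JOIN product  AS p ON p.prodid = s.prodid
    JOIN category AS c ON c.catid = p.catid
    GROUP BY c.name ORDER BY c.name
""").fetchall()
```

The normalized layout avoids repeating the category name on every product row, at the cost of the additional join.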

SALES ORDER

Year  Region  Agent   Product  Quantity
2009  East    Carlos  Erasers  50
2009  East    Tere    Erasers  12
2009  North   Carlos  Widgets  120
2009  North   Tere    Widgets  100
2009  North   Carlos  Widgets  30
2009  South   Victor  Balls    145
2009  South   Victor  Balls    34
2009  South   Victor  Balls    80
2009  West    Mary    Pencils  89
2009  West    Mary    Pencils  56
2010  East    Carlos  Pencils  45
2010  East    Victor  Balls    55
2010  North   Mary    Pencils  60
2010  North   Victor  Erasers  20
2010  South   Carlos  Widgets  30
2010  South   Mary    Widgets  75
2010  South   Mary    Widgets  50
2010  South   Tere    Balls    70
2010  South   Tere    Erasers  90
2010  West    Carlos  Widgets  25
2010  West    Tere    Balls    100

Data Analytics
• Encompasses a wide range of mathematical, statistical, and modeling techniques to extract knowledge from data
  - Subset of BI functionality

• Classification of tools
  - Explanatory analytics: Focuses on discovering and explaining data characteristics and relationships based on existing data
  - Predictive analytics: Focuses on predicting future outcomes with a high degree of accuracy

Data Mining
• Analyzing massive amounts of data to:
  - Uncover hidden trends, patterns, and relationships
  - Form computer models to simulate and explain the findings
  - Use the models to support business decision making

• Runs in two modes
  - Guided
  - Automated

Extracting Knowledge from Data



Data-Mining Phases

Predictive Analytics
• Employs mathematical and statistical algorithms, neural networks, artificial intelligence, and other advanced modeling tools

• Creates actionable predictive models based on available data
  - Next logical step after data mining

• Adds value to an organization
  - Helps optimize existing processes
  - Helps identify hidden problems
  - Helps anticipate future problems or opportunities

