Data Warehousing

Uploaded by

Lakshay Garg
DATA WAREHOUSING

PRESENTED BY: LAKSHAY, VIJAY, PIYUSH AND AARAV VATS
Contents
• Database and Data Warehousing
• History of Data Warehousing
• Data Warehouse Architecture
• Advantages and Disadvantages of Data Warehousing
• Data Mining
• Text Mining
• OLAP
• Business Intelligence
Database and data warehousing

• The difference:
A DWH constitutes the entire information base for all time.
A database holds real-time information.
A DWH supports data mining and business intelligence.
A database is used to run the business.
A DWH is used to decide how to run the business.
What is a data warehouse?

A single, complete, and consistent store of data obtained from
a variety of different sources, made available to end users so
that they can understand and use it in a business context.
What is data warehousing?

A process of transforming data into information and making
it available to users in a timely enough manner to make a
difference.
History of data warehousing
• The concept of data warehousing dates back to the late 1980s when
IBM researchers Barry Devlin and Paul Murphy developed the
“business data warehouse”.
• 1960s – General Mills and Dartmouth College, in a joint research
project, develop the terms "dimensions" and "facts".
• 1970s – ACNielsen and IRI provide dimensional data marts for retail
sales.
• 1983 – Teradata introduces a database management system
specifically designed for decision support.
• 1988 – Barry Devlin and Paul Murphy publish the article An
Architecture for a business and information systems in IBM systems
Data warehouse architecture
• Data are selected from various sources, then integrated and stored
in a single, consistent format.
• Data warehouses contain current detailed data, historical detailed
data, lightly and highly summarized data, and metadata.
• Current and historical data are voluminous because they are stored
at the highest level of detail.
• Lightly and highly summarized data save processing time when
users request them, because they are precomputed and readily
accessible.
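
The integration and summarization steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the two source systems, their field names, and the `to_common_format` helper are hypothetical, and a real warehouse would normally rely on dedicated ETL tooling.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems with different layouts.
sales_system = [
    {"sku": "A1", "sold_on": "2024-01-05", "amount_usd": 120.0},
    {"sku": "B2", "sold_on": "2024-01-17", "amount_usd": 80.0},
]
web_shop = [
    # This source uses DD/MM/YYYY dates and amounts in cents.
    {"product": "A1", "date": "05/02/2024", "total_cents": 4500},
]


def to_common_format(record, source):
    """Transform one source record into the single warehouse format."""
    if source == "sales_system":
        return {"product": record["sku"], "date": record["sold_on"],
                "amount": record["amount_usd"], "source": source}
    if source == "web_shop":
        day, month, year = record["date"].split("/")
        return {"product": record["product"], "date": f"{year}-{month}-{day}",
                "amount": record["total_cents"] / 100, "source": source}
    raise ValueError(f"unknown source: {source}")


# Detailed data: every record is kept at the highest level of detail.
detailed = [to_common_format(r, "sales_system") for r in sales_system]
detailed += [to_common_format(r, "web_shop") for r in web_shop]


def summarize(rows, key):
    """Total the amount of all rows that share the same key."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row["amount"]
    return dict(totals)


# Lightly summarized: totals per product per month (YYYY-MM).
lightly_summarized = summarize(detailed, lambda r: (r["product"], r["date"][:7]))
# Highly summarized: totals per product over all time.
highly_summarized = summarize(detailed, lambda r: r["product"])
```

Queries against `highly_summarized` avoid rescanning `detailed`, which is the processing-time saving the summarized layers are meant to provide.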
Data warehouse architecture
• Metadata are “data about data”. They are important for designing,
constructing, retrieving, and controlling the warehouse data.

• Technical Metadata include where the data come from, how the data
were changed, how the data are organized, how the data are stored,
who owns the data, who is responsible for the data and how to
contact them, who can access the data, and the date of last update.

• Business Metadata include what data are available, where the data
are, what the data mean, how to access the data, predefined reports
and queries, and how current the data are.
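
One way to picture the two kinds of metadata is as a catalog record attached to each warehouse table. The sketch below is illustrative only: the class names, field names, and sample values are assumptions chosen to mirror the bullets above, not a standard metadata schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TechnicalMetadata:
    source_system: str       # where the data come from
    transformations: list    # how the data were changed
    storage_table: str       # how and where the data are stored
    owner: str               # who is responsible for the data
    owner_contact: str       # how to contact them
    allowed_roles: list      # who can access the data
    last_updated: date       # date of last update


@dataclass
class BusinessMetadata:
    description: str         # what the data mean
    location: str            # where the data are
    how_to_access: str       # how to access the data
    predefined_reports: list = field(default_factory=list)


@dataclass
class CatalogEntry:
    name: str
    technical: TechnicalMetadata
    business: BusinessMetadata

    def age_in_days(self, today):
        """How current the data are: days since the last update."""
        return (today - self.technical.last_updated).days


# Hypothetical entry for a monthly sales table.
entry = CatalogEntry(
    name="monthly_sales",
    technical=TechnicalMetadata(
        source_system="sales_system",
        transformations=["currency normalized", "aggregated by month"],
        storage_table="dw.monthly_sales",
        owner="Sales Operations",
        owner_contact="sales-ops@example.com",
        allowed_roles=["analyst", "manager"],
        last_updated=date(2024, 3, 1),
    ),
    business=BusinessMetadata(
        description="Total sales per product per month",
        location="Sales subject area",
        how_to_access="Reporting portal",
        predefined_reports=["Monthly sales by product"],
    ),
)
```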
Advantages
• It provides business users with a “customer-centric” view of the
company’s heterogeneous data by helping to integrate data from
sales, service, manufacturing and distribution, and other customer-
related business systems.
• It provides added value to the company’s customers by allowing
them to access better information when data warehousing is
coupled with internet technology.
• It consolidates data about individual customers and provides a
repository of all customer contacts for segmentation modeling,
customer retention planning, and cross sales analysis.
Disadvantages
• Data warehouses are not the optimal environment for unstructured
data.
• Because data must be extracted, transformed, and loaded into the
warehouse, there is an element of latency in the warehouse’s data.
• Over their life, data warehouses can have high costs. Maintenance
costs are high.
• Data warehouses can get outdated relatively quickly. There is a cost
of delivering suboptimal information to the organization.
Data mining
• Data Mining is the process of extracting information from the
company’s various databases and re-organizing it for purposes
other than what the databases were originally intended for.
• It provides a means of extracting previously unknown, predictive
information from the base of accessible data in data warehouses.
• The data mining process differs from one organization to another,
depending on the nature of the data and of the organization.
• Data Mining tools use sophisticated, automated algorithms to
discover hidden patterns, correlations, and relationships among
organizational data.
Data mining for decision support
Data mining provides two capabilities that open up new business
opportunities:
• Automated prediction of trends and behavior: for example,
targeted marketing.
• Automated discovery of previously unknown patterns: for
example, detecting fraudulent credit card transactions and
identifying anomalous data that represent data-entry keying
errors.
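
As a rough illustration of the second capability, the sketch below flags values that sit far from the rest of a column, which is one simple way a keying error might surface. The median-based rule, the threshold of 5, and the sample order quantities are assumptions made for this example; real data mining tools use considerably more sophisticated methods.

```python
from statistics import median


def flag_anomalies(values, threshold=5.0):
    """Return values whose distance from the median exceeds
    `threshold` times the median absolute deviation (MAD)."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All values are (nearly) identical; this rule cannot rank them.
        return []
    return [v for v in values if abs(v - center) / mad > threshold]


# Hypothetical order quantities; 1010 looks like a mistyped 101.
order_quantities = [102, 98, 105, 99, 1010, 101]
suspects = flag_anomalies(order_quantities)
```

A flagged value is only a candidate for review: an unusually large order can be legitimate, so the output would normally go to a person rather than be corrected automatically.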
Data mining tools
Data miners use the following IT tools and techniques:
• Neural computing – A machine learning approach by
which historical data can be examined for patterns.
• Intelligent agents – A promising approach to retrieving
information from the internet or from intranet-based
databases.
• Association analysis – An approach that uses a specialized
set of algorithms that sort through large data sets and
express statistical rules among items.
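
To make association analysis concrete, the sketch below counts how often pairs of items appear together and expresses the result as rules with support and confidence. It is a minimal pairwise version written for illustration; the basket contents and the two thresholds are hypothetical, and production tools use more scalable algorithms such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"grill", "charcoal", "utensils"},
    {"grill", "charcoal"},
    {"patio chair", "utensils"},
    {"grill", "charcoal", "patio chair"},
]


def association_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for item pairs that clear
    both thresholds.

    support    = share of baskets containing both items
    confidence = share of baskets with lhs that also contain rhs
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = association_rules(transactions)
```

With these baskets, only the grill and charcoal pair clears both thresholds, so `rules` holds one rule in each direction between those two items.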
Text mining
• Text mining is the application of data mining to unstructured
or less structured text files.

• Operates with less structured information.

• Frequently focused on document format rather than
document content.
OLAP
• Online Analytical Processing: the term was coined by
E. F. Codd in a 1994 paper commissioned by Arbor
Software.
• Generally synonymous with earlier terms such
as Decisions Support, Business Intelligence,
Executive Information System.
• OLAP = Multidimensional Database
olap
• Online Analytical Processing refers to such end user
activities as DSS modelling using spreadsheets and
graphics that are done online.
• OLAP involves many different data items in complex
relationships.
• Objective of OLAP is to analyze complex
relationships and look for patterns, trends and
exceptions.
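
The multidimensional view behind OLAP can be sketched with a small cube of facts and two typical operations, roll-up and slice. This is a plain-Python illustration only: the dimensions, the sample figures, and the helper names `roll_up` and `slice_cube` are assumptions, and real OLAP servers precompute and index such aggregates.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales).
facts = [
    ("North", "grill", "Q1", 100),
    ("North", "grill", "Q2", 150),
    ("North", "chair", "Q1", 40),
    ("South", "grill", "Q1", 90),
    ("South", "chair", "Q2", 60),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Aggregate sales over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in positions)] += row[-1]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


by_region = roll_up(facts, ["region"])
by_product_quarter = roll_up(facts, ["product", "quarter"])
q1_by_region = roll_up(slice_cube(facts, "quarter", "Q1"), ["region"])
```

Comparing `by_region` with `q1_by_region` is the kind of step an analyst takes when looking for trends and exceptions across dimensions.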
OLAP is FASMI
• Fast
• Analysis
• Shared
• Multidimensional
• Information
Business intelligence
• One ultimate use of the data gathered and processed in the data life
cycle is for business intelligence.
• Business Intelligence generally involves the creation or use of a data
warehouse and/or data mart for storage of data, and the use of
front-end analytical tools such as Oracle’s Sales Analyzer and
Financial Analyzer or MicroStrategy’s Web.
• Such tools can be employed by end users to access data, ask
queries, request ad hoc (special) reports, examine scenarios, create
CRM activities, devise pricing strategies, and much more.
How does business intelligence work?
• The process starts with raw data, which are usually kept in corporate
databases. For example, a national retail chain that sells everything
from grills and patio furniture to plastic utensils has data about
inventory, customer information, data about past promotions, and
sales numbers in various databases.
• Though all this information may be scattered across multiple
systems and may seem unrelated, business intelligence software can
bring it together. This is done by using a data warehouse.
• In the data warehouse (or mart) tables can be linked, and data
cubes are formed. For instance, inventory information is linked to
sales numbers and customer databases, allowing for deep analysis
of information.
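
The linking step in the retail example can be sketched with an in-memory SQLite database. The table layouts, column names, and figures below are hypothetical and chosen only to mirror the inventory, customer, and sales data mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (product TEXT PRIMARY KEY, on_hand INTEGER);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE sales (product TEXT, customer_id INTEGER, units INTEGER);

INSERT INTO inventory VALUES ('grill', 20), ('patio chair', 200);
INSERT INTO customers VALUES (1, 'homeowner'), (2, 'restaurant');
INSERT INTO sales VALUES
    ('grill', 1, 2), ('grill', 2, 10),
    ('patio chair', 1, 4), ('patio chair', 2, 30);
""")

# Link sales to customers and inventory, then aggregate by product and
# customer segment: one two-dimensional face of a data cube.
rows = conn.execute("""
    SELECT s.product, c.segment, SUM(s.units) AS units_sold, i.on_hand
    FROM sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    JOIN inventory AS i ON i.product = s.product
    GROUP BY s.product, c.segment, i.on_hand
    ORDER BY s.product, c.segment
""").fetchall()

for product, segment, units_sold, on_hand in rows:
    print(f"{product:12} {segment:10} sold={units_sold:3} on_hand={on_hand}")
```

Putting units sold next to stock on hand for each segment is the kind of cross-system comparison that is hard to make while the data stay in separate databases.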
Business intelligence
More advanced applications of business
intelligence include outputs such as:
1. Financial modeling
2. Budgeting
3. Resource allocation
4. Competitive intelligence
