Data Warehousing
• The difference between a data warehouse (DWH) and a database:
A DWH constitutes the entire information base for all time.
A database constitutes real-time information.
A DWH supports data mining and business intelligence.
A database is used for running the business.
A DWH is used for deciding how to run the business.
What is a data warehouse?
• Technical metadata includes where the data come from, how the data
were changed, how the data are organized, how the data are stored,
who owns the data, who is responsible for the data and how to
contact them, who can access the data, and the date of last update.
• Business metadata includes what data are available, where the data
are, what the data mean, how to access the data, predefined reports
and queries, and how current the data are.
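The two kinds of metadata listed above can be pictured as two records attached to each warehouse table. The following is a minimal sketch only; the class names, field names, and sample values are hypothetical and do not come from any particular warehouse product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TechnicalMetadata:
    # All field names are illustrative assumptions.
    source_system: str          # where the data come from
    transformations: list[str]  # how the data were changed
    storage_location: str       # how/where the data are stored
    owner: str                  # who owns the data
    steward_contact: str        # who is responsible and how to contact them
    allowed_roles: list[str]    # who can access the data
    last_updated: date          # date of last update


@dataclass
class BusinessMetadata:
    description: str               # what the data mean
    how_to_access: str             # how to access the data
    predefined_reports: list[str]  # predefined reports and queries
    current_as_of: date            # how current the data are


# Hypothetical catalogue entry for a sales table.
sales_technical = TechnicalMetadata(
    source_system="order-entry system",
    transformations=["currency converted to USD", "duplicates removed"],
    storage_location="warehouse.sales_fact",
    owner="Sales Operations",
    steward_contact="data-steward@example.com",
    allowed_roles=["analyst", "manager"],
    last_updated=date(2024, 1, 31),
)
sales_business = BusinessMetadata(
    description="One row per completed customer order",
    how_to_access="SQL query or the monthly sales dashboard",
    predefined_reports=["Monthly revenue by region"],
    current_as_of=date(2024, 1, 31),
)
```

Keeping the two records separate mirrors the split in the bullets: technical metadata serves the people who maintain the warehouse, business metadata serves the people who query it.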
Advantages
• It provides business users with a “customer-centric” view of the
company’s heterogeneous data by helping to integrate data from
sales, service, manufacturing and distribution, and other customer-
related business systems.
• It provides added value to the company’s customers by allowing
them to access better information when data warehousing is
coupled with internet technology.
• It consolidates data about individual customers and provides a
repository of all customer contacts for segmentation modeling,
customer retention planning, and cross sales analysis.
Disadvantages
• Data warehouses are not the optimal environment for unstructured
data.
• Because data must be extracted, transformed, and loaded into the
warehouse, there is an element of latency in the warehouse's data.
• Over their life, data warehouses can have high costs. Maintenance
costs are high.
• Data warehouses can get outdated relatively quickly. There is a cost
of delivering suboptimal information to the organization.
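The extract, transform, and load step mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not a production pipeline: the source rows, the sales_fact table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical operational records; names and values are illustrative only.
SOURCE_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "sold_on": "2024-01-05"},
    {"customer": "bob", "amount": "80", "sold_on": "2024-01-06"},
]


def extract():
    """Extract: read rows from the (here, in-memory) source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: tidy customer names and convert amounts to numbers."""
    return [
        (row["customer"].strip().title(), float(row["amount"]), row["sold_on"])
        for row in rows
    ]


def load(conn, rows):
    """Load: append the cleaned rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(customer TEXT, amount REAL, sold_on TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run one batch; the warehouse is only as fresh as the last run."""
    load(conn, transform(extract()))


conn = sqlite3.connect(":memory:")
run_etl(conn)
warehouse_rows = conn.execute(
    "SELECT customer, amount FROM sales_fact ORDER BY customer"
).fetchall()
```

Because run_etl works in batches, anything recorded in the source after a run is invisible in the warehouse until the next run, which is the latency the bullet refers to.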
Data mining
• Data Mining is the process of extracting information from the
company’s various databases and re-organizing it for purposes
other than what the databases were originally intended for.
• It provides a means of extracting previously unknown, predictive
information from the base of accessible data in data warehouses.
• The data mining process differs between organizations,
depending on the nature of the data and of the organization.
• Data Mining tools use sophisticated, automated algorithms to
discover hidden patterns, correlations, and relationships among
organizational data.
Data mining for decision support
Two capabilities provide new business opportunities:
• Automated prediction of trends and behavior: for example,
targeted marketing.
• Automated discovery of previously unknown patterns: for
example, detecting fraudulent credit card transactions and
identifying anomalous data that represent data-entry (keying)
errors.
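As a small illustration of the second capability, the sketch below flags values that sit far from the rest of a column, which is one simple way a keying error might surface. It uses the modified z-score based on the median absolute deviation; the technique, the 3.5 threshold, the function name, and the sample amounts are assumptions chosen for the example, not something prescribed by these slides.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to compare against; nothing can be flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts; 4080.0 looks like 40.80 keyed
# without its decimal point.
amounts = [42.0, 39.5, 41.2, 40.8, 4080.0, 38.9, 43.1]
anomalies = find_anomalies(amounts)
```

The median is used instead of the mean because a single extreme value would otherwise pull the centre towards itself and hide the very error being looked for.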
Data mining tools
IT tools and techniques used by data miners:
• Neural computing – A machine learning approach by
which historical data can be examined for patterns.
• Intelligent agents – A promising approach for retrieving
information from the internet or from intranet-based
databases.
• Association analysis – An approach that uses a specialized
set of algorithms to sort through large data sets and
express statistical rules among items.
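To make "statistical rules among items" concrete, the sketch below derives simple "if A then B" rules from a handful of shopping baskets, using support (how often the items occur together) and confidence (how often B appears when A does). It is a brute-force illustration over item pairs, not one of the specialized algorithms used on large data sets; the baskets, thresholds, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for each qualifying pair rule."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to base a rule on
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: share of baskets with lhs that also contain rhs.
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


rules = pair_rules(transactions)
```

In this toy data every basket containing butter also contains bread, so the rule butter -> bread has confidence 1.0, while butter and milk appear together too rarely to produce a rule.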
Text mining
• Text mining is the application of data mining to
unstructured or less structured text files.