Data Warehouse and Data Mining

Submitted by: Tushar Rana BBA(CAM)-5 (morning), 02713401918


Submitted to: Dr. Mahesh Sharma; Mr. Satpal Arora
Data warehousing
Data warehousing is a technology that aggregates structured
data from one or more sources so that it can be compared and
analyzed, rather than used for transaction processing. A data
warehouse is designed to support the management
decision-making process by providing a platform for data
cleaning, data integration and data consolidation. A data
warehouse contains subject-oriented, integrated, time-variant
and non-volatile data. It consolidates data from many sources
while ensuring data quality, consistency and accuracy, and it
improves system performance by separating analytics
processing from transactional databases. Data flows into a
data warehouse from the various source databases. A data
warehouse works by organizing data into a schema, which
describes the layout and type of the data. Query tools then
analyze the data tables using that schema.
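As a rough illustration of the paragraph above, the following minimal sketch loads rows from two source systems into one schema and then queries it. Everything in it is an assumption made for illustration: the two source lists, the table names dim_product and fact_sales, and the choice of an in-memory SQLite database standing in for a real warehouse.

```python
import sqlite3

# Hypothetical source systems (assumed data): each row is
# (sale date, product name, quantity, unit price).
store_db_rows = [("2023-01-05", "Laptop", 2, 1000.0),
                 ("2023-02-10", "Phone", 1, 500.0)]
web_db_rows = [("2023-01-20", "laptop ", 1, 1000.0),
               ("2023-02-11", "PHONE", 3, 500.0)]


def build_warehouse():
    """Create an in-memory warehouse and load both sources into one schema."""
    conn = sqlite3.connect(":memory:")
    # The schema describes the layout and type of the data.
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name TEXT UNIQUE
        );
        CREATE TABLE fact_sales (
            sale_date TEXT,
            product_id INTEGER REFERENCES dim_product(product_id),
            channel TEXT,
            quantity INTEGER,
            revenue REAL
        );
    """)
    for channel, rows in (("store", store_db_rows), ("web", web_db_rows)):
        for sale_date, product, quantity, unit_price in rows:
            # Data cleaning: make product names consistent across sources.
            name = product.strip().title()
            conn.execute(
                "INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (name,))
            (product_id,) = conn.execute(
                "SELECT product_id FROM dim_product WHERE name = ?",
                (name,)).fetchone()
            # Data integration: both sources land in the same fact table.
            conn.execute(
                "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                (sale_date, product_id, channel, quantity,
                 quantity * unit_price))
    conn.commit()
    return conn


def revenue_by_product(conn):
    """Analytical query that relies on the schema defined above."""
    return conn.execute("""
        SELECT p.name, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()
```

Calling revenue_by_product(build_warehouse()) returns one consolidated revenue total per product, even though the two sources spelled the product names differently.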
Types of Data Warehousing
Three main types of Data Warehouses (DWH) are:

1. Enterprise Data Warehouse (EDW):


Enterprise Data Warehouse (EDW) is a centralized warehouse. It provides decision support
service across the enterprise. It offers a unified approach for organizing and representing
data. It also provides the ability to classify data according to subject and to give access
according to those divisions.
2. Operational Data Store:
An Operational Data Store (ODS) is a data store that is required when neither the data
warehouse nor the OLTP systems support an organization's reporting needs. In an ODS, the
data is refreshed in real time. Hence, it is widely preferred for routine activities
like storing employee records.
3. Data Mart:
A data mart is a subset of the data warehouse. It is specially designed for a particular line of
business, such as sales or finance. In an independent data mart, data can be
collected directly from sources.
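To make the idea of a data mart as a subset of the warehouse concrete, here is a minimal sketch that exposes only the sales rows of a warehouse table through a view. The table warehouse_facts, the view sales_mart and the sample rows are assumptions for illustration. Note that this shows a mart fed from the warehouse (often called a dependent data mart), not the independent kind that collects directly from sources.

```python
import sqlite3

# Hypothetical warehouse table holding facts for several lines of business.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_facts (business_line TEXT, period TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "2023-Q1", 120.0),
        ("sales", "2023-Q2", 150.0),
        ("finance", "2023-Q1", 80.0),
        ("hr", "2023-Q1", 30.0),
    ],
)

# The data mart: a subset of the warehouse for one line of business.
conn.execute("""
    CREATE VIEW sales_mart AS
    SELECT period, amount
    FROM warehouse_facts
    WHERE business_line = 'sales'
""")

# Users of the mart only ever see the sales rows.
sales_rows = conn.execute(
    "SELECT period, amount FROM sales_mart ORDER BY period").fetchall()
```

A view keeps the mart in sync with the warehouse automatically; a real deployment might instead copy the subset into its own tables or its own database.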
Who Needs Data warehousing
DWH (Data warehouse) is needed by all types of users, such as:

•Decision makers who rely on large amounts of data.

•Users who use customized, complex processes to obtain information from multiple data sources.

•People who want simple technology to access the data.

•People who want a systematic approach for making decisions.

•Users who want fast performance on a huge amount of data, which is a necessity for reports, grids or charts.

•Anyone who wants to discover 'hidden patterns' of data flows and groupings, for which a data warehouse is a first step.
Advantages of Data warehousing
•Data warehouse allows business users to quickly access critical data from several sources all in one place.

•Data warehouse provides consistent information on various cross-functional activities. It also supports ad-hoc
reporting and querying.

•Data Warehouse helps to integrate many sources of data to reduce stress on the production system.

•Data warehouse helps to reduce total turnaround time for analysis and reporting.

•Restructuring and integration make the data easier for users to work with for reporting and analysis.

•Data warehouse allows users to access critical data from a number of sources in a single place. Therefore, it saves
users the time of retrieving data from multiple sources.

•Data warehouse stores a large amount of historical data. This helps users to analyze different time periods and trends to
make future predictions.
Data Mining
Data mining is the process of finding patterns and correlations within large data sets to identify relationships between data.
Data mining tools allow a business organization to predict customer behavior. Data mining tools are used to
build risk models and detect fraud. Data mining is used in market analysis and management, fraud detection,
corporate analysis and risk management.
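One of the simplest ways to look for a correlation between two columns of data is the Pearson correlation coefficient. The sketch below is only an illustration of that single measure, not a full data mining tool; the function name pearson and the sample figures for ad_spend and units_sold are assumptions made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical figures: advertising spend and units sold per month.
ad_spend = [10, 20, 30, 40, 50]
units_sold = [12, 24, 33, 46, 55]
relationship = pearson(ad_spend, units_sold)
```

A value close to 1 suggests the two quantities rise together, a value close to -1 suggests one falls as the other rises, and a value near 0 suggests no linear relationship. Correlation alone does not show that one quantity causes the other.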
Key Features Of Data Mining

• Automatic pattern predictions based on trend and behavior analysis.

• Prediction based on likely outcomes.

• Creation of decision-oriented information.

• Focus on large data sets and databases for analysis.

• Clustering based on finding and visually documenting groups of facts not previously known.
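The clustering feature in the list above can be sketched with k-means, one common clustering technique (chosen here for illustration; the slides do not name a specific algorithm). The version below works on one-dimensional data only, and the function names and the purchases sample are assumptions.

```python
def assign(values, centers):
    """Put each value into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        clusters[nearest].append(v)
    return clusters


def kmeans_1d(values, k, iterations=20):
    """Very small k-means for one-dimensional data."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)]
               for i in range(k)]
    for _ in range(iterations):
        clusters = assign(values, centers)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, assign(values, centers)


# Hypothetical purchase amounts with two groups that were not labeled in advance.
purchases = [5, 6, 7, 50, 52, 55]
centers, clusters = kmeans_1d(purchases, k=2)
```

Here the small and large purchases end up in separate groups without anyone telling the algorithm where the boundary lies, which is the sense in which clustering finds groups "not previously known".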
Process
• Before the actual data mining can occur, several steps are involved in a data mining implementation. Here’s
how:
• Step 1: Business Research – Before you begin, you need to have a complete understanding of your enterprise’s
objectives, available resources, and current situation in relation to its requirements. This helps create a
detailed data mining plan that effectively reaches the organization’s goals.
• Step 2: Data Quality Checks – As the data gets collected from various sources, it needs to be checked and matched to
ensure there are no bottlenecks in the data integration process. Quality assurance helps spot any underlying anomalies in the
data, such as missing values that need interpolation, keeping the data in top shape before it undergoes mining.
• Step 3: Data Cleaning – It is believed that 90% of the time is spent on selecting, cleaning, formatting, and
anonymizing data before mining.
• Step 4: Data Transformation – This step comprises five sub-stages that turn the data into final data
sets. It involves:
• Data Smoothing: Here, noise is removed from the data.
• Data Summary: The aggregation of data sets is applied in this process.
• Data Generalization: Here, the data gets generalized by replacing any low-level data with higher-level conceptualizations.
• Data Normalization: Here, data is scaled to fall within set ranges.
• Data Attribute Construction: New attributes are constructed from the given set of attributes before data mining.
• Step 5: Data Modelling – For better identification of data patterns, several mathematical models are applied to the
dataset, based on several conditions.
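The steps above can be sketched in a few small functions. This is a minimal illustration under stated assumptions, not a prescribed implementation: every function name, the age bands, and the record fields price and quantity are made up for the example, and each technique (linear interpolation, moving average, min-max scaling, least-squares line) is just one common choice for its step.

```python
def fill_missing(values):
    """Step 2: fill None gaps by linear interpolation between known values."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    # Gaps at the edges have only one neighbour: copy the nearest known value.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    return filled


def moving_average(values, window=3):
    """Data smoothing: centered moving average (window shrinks at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def total_by_key(rows):
    """Data summary: aggregate (key, amount) pairs into one total per key."""
    totals = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0) + amount
    return totals


def generalize_age(age):
    """Data generalization: replace an exact age with a band (assumed bands)."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def min_max_normalize(values):
    """Data normalization: rescale values into the range 0 to 1."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def add_revenue(record):
    """Attribute construction: derive a new attribute from existing ones."""
    return {**record, "revenue": record["price"] * record["quantity"]}


def fit_line(xs, ys):
    """Step 5: a very simple model, a least-squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x
```

For example, fill_missing([10, None, 30]) fills the gap with 20.0, and fit_line([1, 2, 3], [3, 5, 7]) recovers a slope of 2.0 and an intercept of 1.0. In practice these steps would usually be done with a data analysis library rather than hand-written functions.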
Architecture of Data Mining
Conclusion
• Data mining is considered a process of extracting information from large data sets, whereas data warehousing is
the process of pooling all the relevant data together.
• Data mining is the process of analyzing unknown patterns of data, whereas data warehousing is a technique
for collecting and managing data.
• Data mining is usually done by business users with the assistance of engineers, while data warehousing is a
process which needs to occur before any data mining can take place.
• Data mining allows users to ask more complicated queries, which would increase the workload, while a data
warehouse is complicated to implement and maintain.
• Data mining helps to create suggestive patterns of important factors like the buying habits of customers, while
a data warehouse, once integrated, is useful for operational business systems like CRM systems.
