Data Warehouse and Data Mining
•Users who use customized, complex processes to obtain information from multiple data sources.
•It is also used by people who want simple technology to access the data.
•It is also essential for people who want a systematic approach to making decisions.
•If the user wants fast performance on a huge amount of data, which is a necessity for reports, grids, or charts, then
a Data warehouse proves useful.
•A Data warehouse is a first step if you want to discover 'hidden patterns' of data flows and groupings.
Advantages of Data warehousing
•Data warehouse allows business users to quickly access critical data from several sources all in one place.
•Data warehouse provides consistent information on various cross-functional activities. It also supports ad-hoc
reporting and querying.
•Data Warehouse helps to integrate many sources of data to reduce stress on the production system.
•Data warehouse helps to reduce total turnaround time for analysis and reporting.
•Restructuring and integration make the data easier for users to use for reporting and analysis.
•Data warehouse allows users to access critical data from a number of sources in a single place. Therefore, it saves
users the time of retrieving data from multiple sources.
•Data warehouse stores a large amount of historical data. This helps users to analyze different time periods and trends to
make future predictions.
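The two ideas above, pooling several sources in one place and comparing time periods, can be sketched in a few lines. This is only a minimal illustration using Python's built-in sqlite3 module as a stand-in for a warehouse. The source lists (crm_orders, web_orders), the sales table, and the helper names are all hypothetical and not part of the original material.

```python
import sqlite3

# Hypothetical extracts from two separate operational systems.
# Each row is (sale_date, region, amount).
crm_orders = [("2021-03-14", "north", 120.0), ("2022-07-02", "north", 150.0)]
web_orders = [
    ("2021-11-20", "south", 80.0),
    ("2022-01-09", "south", 95.0),
    ("2022-05-30", "north", 60.0),
]


def build_warehouse(*sources):
    """Load every source into one in-memory table (a toy 'warehouse')."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
    for rows in sources:
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def yearly_totals(conn):
    """Total sales per year: a simple query over historical data."""
    cur = conn.execute(
        "SELECT substr(sale_date, 1, 4) AS year, SUM(amount) "
        "FROM sales GROUP BY year ORDER BY year"
    )
    return cur.fetchall()


warehouse = build_warehouse(crm_orders, web_orders)
totals = yearly_totals(warehouse)
print(totals)
```

Because both sources land in the same table, the yearly comparison is a single query rather than one lookup per source system.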
Data Mining
Data mining is the process of finding patterns and correlations within large data sets to identify relationships between data.
Data mining tools allow a business organization to predict customer behavior. Data mining tools are used to
build risk models and detect fraud. Data mining is used in market analysis and management, fraud detection,
corporate analysis and risk management.
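As a small illustration of what "finding correlations" can mean in practice, the sketch below computes the Pearson correlation coefficient between two columns. Pearson correlation is one common measure chosen here for illustration; the text does not name a specific one. The sample figures (ad_spend, purchases) are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up sample: advertising spend vs. number of purchases per week.
ad_spend = [10, 20, 30, 40, 50]
purchases = [12, 24, 33, 46, 55]
r = pearson(ad_spend, purchases)
print(round(r, 3))
```

A value close to 1 suggests the two columns rise together, a value close to -1 suggests one falls as the other rises, and a value near 0 suggests no linear relationship.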
Key Features Of Data Mining
• Clustering: finding and visually documenting groups of facts not previously known.
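To make clustering concrete, here is a minimal sketch of k-means on one-dimensional values. K-means is just one clustering technique, picked here as an example rather than taken from the text. The function name kmeans_1d, the starting-centre rule, and the sample spending figures are assumptions made for illustration.

```python
def assign(values, centers):
    """Put each value into the group of its nearest centre."""
    groups = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, k, iterations=20):
    """Very small k-means for 1-D data; returns (centers, groups)."""
    if k < 1 or k > len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: take k evenly spaced points from the sorted data.
    step = len(ordered) / k
    centers = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        groups = assign(values, centers)
        # Move each centre to the mean of its group (keep it if the group is empty).
        updated = [
            sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(values, centers)


# Made-up monthly spend per customer: two groups that were not labelled in advance.
spend = [12, 15, 14, 90, 95, 88, 13, 92]
centers, groups = kmeans_1d(spend, k=2)
print(centers, groups)
```

The groups are not given in advance; they emerge from the data, which is the sense in which clustering uncovers "groups of facts not previously known".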
Process
• Before the actual data mining can occur, several processes are involved in data mining implementation. Here’s
how:
• Step 1: Business Research – Before you begin, you need to have a complete understanding of your enterprise’s
objectives, available resources, and current scenarios in alignment with its requirements. This helps create a
detailed data mining plan that effectively reaches the organization’s goals.
• Step 2: Data Quality Checks – As the data gets collected from various sources, it needs to be checked and matched to
ensure there are no bottlenecks in the data integration process. Quality assurance helps spot any underlying anomalies
in the data, such as missing values that require interpolation, keeping the data in good shape before it undergoes mining.
• Step 3: Data Cleaning – It is believed that 90% of the time is spent selecting, cleaning, formatting, and
anonymizing data before mining.
• Step 4: Data Transformation – Comprising five sub-stages, this step turns the prepared data into final data
sets. It involves:
• Data Smoothing: Here, noise is removed from the data.
• Data Summary: The aggregation of data sets is applied in this process.
• Data Generalization: Here, the data gets generalized by replacing any low-level data with higher-level conceptualizations.
• Data Normalization: Here, data is scaled to fall within set ranges.
• Data Attribute Construction: New attributes are built from the existing set of attributes so that the data sets are in the required form before data mining.
• Step 5: Data Modelling – For better identification of data patterns, several mathematical models are applied to the
dataset, based on several conditions.
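The five transformation sub-stages in Step 4 can each be sketched as a small function. The versions below are one simple choice per sub-stage (moving average for smoothing, per-key totals for summary, age bands for generalization, min-max scaling for normalization, a derived unit price for attribute construction). All function names, field names, and sample records are hypothetical.

```python
from collections import defaultdict


def smooth(values, window=3):
    """Data smoothing: centred moving average to damp noise."""
    half = window // 2
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def summarize(records, key, field):
    """Data summary: aggregate `field` into one total per `key`."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[field]
    return dict(totals)


def generalize_age(age):
    """Data generalization: replace an exact age with a higher-level band."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def normalize(values):
    """Data normalization: min-max scaling into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def add_unit_price(record):
    """Attribute construction: derive a new attribute from existing ones."""
    return {**record, "unit_price": record["amount"] / record["quantity"]}


# Made-up records for illustration only.
records = [
    {"region": "north", "age": 17, "amount": 20.0, "quantity": 4},
    {"region": "south", "age": 40, "amount": 30.0, "quantity": 3},
    {"region": "north", "age": 70, "amount": 50.0, "quantity": 5},
]

smoothed = smooth([1, 10, 1, 10, 1])
by_region = summarize(records, "region", "amount")
bands = [generalize_age(r["age"]) for r in records]
scaled = normalize([r["amount"] for r in records])
enriched = [add_unit_price(r) for r in records]
print(smoothed, by_region, bands, scaled, enriched[0]["unit_price"])
```

In a real project each sub-stage would be chosen to suit the data and the model used in Step 5; the point here is only that each one is a distinct, small transformation of the data set.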
Architecture of Data Mining
Conclusion
• Data mining is considered as a process of extracting data from large data sets, whereas a Data warehouse is
the process of pooling all the relevant data together.
• Data mining is the process of analyzing unknown patterns of data, whereas a Data warehouse is a technique
for collecting and managing data.
• Data mining is usually done by business users with the assistance of engineers, while Data warehousing is a
process which needs to occur before any data mining can take place.
• Data mining allows users to ask more complicated queries which would increase the workload while Data
Warehouse is complicated to implement and maintain.
• Data mining helps to create suggestive patterns of important factors like the buying habits of customers while
Data Warehouse is useful for operational business systems like CRM systems when the warehouse is
integrated.