Data Warehousing & Data Mining
Submitted to: Dr. Jaydip Choudhary Faculty of Information Technology in Business Department of Business & Industrial Management Veer Narmad South Gujarat University
Introduction
In organizations today, advances in transaction-processing technology mean that the amount and rate of data capture must be matched by the speed at which that data is processed into information usable for decision making.
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data that supports the decision-making process. Data mining uses various data analysis tools to discover new facts, valid patterns and relationships in large data sets. It also covers analysis of, and prediction from, that data.
Data Warehousing
Data warehousing is the process of centralized data management and retrieval. It represents an ideal vision of maintaining a central repository of all organizational data. A data warehouse is a storage area for processed and integrated data drawn from different sources, both operational and external.
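The idea of integrating operational and external data into one repository can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source extracts, their field names, the fact_spend table and the helper functions are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical source extracts; field names and values are illustrative only.
operational_sales = [
    {"cust": "C001", "amount": "120.50", "date": "2024-01-05"},
    {"cust": "C002", "amount": "80.00", "date": "2024-01-06"},
]
external_survey = [
    {"customer_id": "c001", "spend": 45.0, "day": "2024-01-07"},
]


def load_warehouse(conn, sales, survey):
    """Integrate both sources into one table with a single, consistent schema."""
    conn.execute(
        "CREATE TABLE fact_spend ("
        "customer_id TEXT, amount REAL, day TEXT, source TEXT)"
    )
    rows = []
    # Each source names and formats its fields differently, so normalize
    # the customer key and convert amounts to numbers before loading.
    for r in sales:
        rows.append((r["cust"].upper(), float(r["amount"]), r["date"], "operational"))
    for r in survey:
        rows.append((r["customer_id"].upper(), float(r["spend"]), r["day"], "external"))
    conn.executemany("INSERT INTO fact_spend VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def spend_per_customer(conn):
    """Query the integrated data as a decision maker might: total per customer."""
    cur = conn.execute(
        "SELECT customer_id, SUM(amount) FROM fact_spend "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_warehouse(conn, operational_sales, external_survey)
totals = spend_per_customer(conn)
print(totals)
```

Once both sources share one schema, a single query can answer a question that neither source could answer alone, which is the point of the central repository.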
What does Facebook do? Facebook basically gathers all of your data (your friends, your likes, who you stalk, etc.) and then stores that data in one central repository. Why would they want to do this? They want to make sure that you see the most relevant ads, the ones you're most likely to click on, that the friends they suggest are the most relevant to you, and so on.
Advantages of data warehousing: potential high returns on investment; competitive advantage; increased productivity of corporate decision-makers; more cost-effective decision making; enhanced data quality and consistency; enhanced business intelligence.
Problems of data warehousing: underestimation of the resources needed for data loading; hidden problems with source systems; required data not captured; increased end-user demands; data homogenization; high demand for resources; data ownership; high maintenance; long-duration projects.
Data Mining
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
Blockbuster Entertainment mines its video rental history database to recommend rentals to individual customers. American Express can suggest products to its cardholders based on analysis of their monthly expenditures. WalMart is pioneering massive data mining to transform its supplier relationships.
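As a small illustration of "finding correlations among fields", the sketch below computes the Pearson correlation coefficient for every pair of numeric fields in a set of records. The records, the field names (ad_spend, visits, returns) and the helper names are invented for the example; a real data mining tool would work on far more fields and rows.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / sqrt(var_x * var_y)


def correlate_fields(records):
    """Return {(field_a, field_b): r} for every pair of fields in the records."""
    fields = sorted(records[0])
    columns = {f: [r[f] for r in records] for f in fields}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(fields, 2)
    }


# Hypothetical records; values are illustrative only.
records = [
    {"ad_spend": 1.0, "visits": 12.0, "returns": 9.0},
    {"ad_spend": 2.0, "visits": 19.0, "returns": 7.0},
    {"ad_spend": 3.0, "visits": 33.0, "returns": 8.0},
    {"ad_spend": 4.0, "visits": 41.0, "returns": 6.0},
]

pairs = correlate_fields(records)
# The pair with the largest absolute correlation is the strongest linear relationship.
strongest = max(pairs, key=lambda p: abs(pairs[p]))
print(strongest, round(pairs[strongest], 3))
```

A coefficient near +1 or -1 flags a pair of fields worth a closer look; it does not by itself show that one field causes the other.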
Data mining techniques: regression modeling; visualization; correlation; variance analysis; discriminant analysis; forecasting; cluster analysis; decision trees; neural networks.
Issues in data mining: privacy issues; security issues; misuse of information or use of inaccurate information; high cost at the implementation stage.
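To make two of these techniques concrete, the following minimal sketch combines regression modeling and forecasting: it fits a straight line to past values by ordinary least squares and extends it one period ahead. The monthly sales figures and the function name fit_line are assumptions made for the example, not data from any real system.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales; values are illustrative only.
months = [1, 2, 3, 4, 5]
sales = [110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_line(months, sales)
# Forecast month 6 by extending the fitted line one step.
forecast = slope * 6 + intercept
print(slope, intercept, forecast)
```

Real data rarely falls on a perfect line, so in practice the fit would be checked against held-out periods before the forecast is trusted.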
Data Warehousing is the process of compiling and organizing data into one common database and Data Mining is the process of extracting meaningful data from that database. The Data Mining process relies on the data compiled in the Data Warehousing phase in order to detect meaningful patterns.
Conclusion
Organizations today are under tremendous pressure to compete in an environment of tight deadlines and reduced profits. Business processes that require data to be extracted and manipulated prior to use will no longer be acceptable. Instead, enterprises need rapid decision support based on the analysis and forecasting of predictive behavior. Data-warehousing and data-mining techniques provide this capability.