The knowledge discovery process consists of seven iterative steps: data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge presentation. A typical data mining system architecture includes a database, database server, knowledge base, data mining engine, pattern evaluation module, and graphical user interface. The data mining engine applies intelligent methods to extract patterns from the data, while the pattern evaluation module assesses pattern interestingness.
Knowledge Discovery Process
Knowledge discovery is a process that consists of an iterative sequence of the following steps:

1. Data cleaning (to remove noise or irrelevant data).
2. Data integration (where multiple data sources may be combined).
3. Data selection (where data relevant to the analysis task are retrieved from the database).
4. Data transformation (where data are transformed or consolidated into forms appropriate for mining, for instance by performing summary or aggregation operations).
5. Data mining (an essential process where intelligent methods are applied in order to extract data patterns).
6. Pattern evaluation (to identify the truly interesting patterns representing knowledge, based on some interestingness measures).
7. Knowledge presentation (where visualization and knowledge representation techniques are used to present the mined knowledge to the user).

Architecture of a typical data mining system

The architecture of a typical data mining system may have the following major components:

1. Database, data warehouse, or other information repository. This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.
2. Database or data warehouse server. The database or data warehouse server is responsible for fetching the relevant data, based on the user's data mining request.
3. Knowledge base. This is the domain knowledge that is used to guide the search or to evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction. Knowledge such as user beliefs, which can be used to assess a pattern's interestingness based on its unexpectedness, may also be included.
4. Data mining engine. This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, and evolution and deviation analysis.
5. Pattern evaluation module. This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the mining module, depending on the implementation of the data mining method used.
6. Graphical user interface. This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results.
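
The steps of the knowledge discovery process can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy purchase records, the function names, and the choice of support and confidence as interestingness measures (the text above does not name specific measures). It is one possible reading of the process, not a reference implementation of any data mining system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two "sources" of purchase records.
store_a = [
    {"id": 1, "items": ["bread", "milk"]},
    {"id": 2, "items": ["bread", "butter", "milk"]},
    {"id": 3, "items": None},  # noisy record with missing data
]
store_b = [
    {"id": 4, "items": ["Bread", "butter"]},
    {"id": 5, "items": ["milk", "butter", "bread"]},
]

def clean(records):
    """Data cleaning: drop records whose item list is missing or empty."""
    return [r for r in records if r["items"]]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]

def select(records):
    """Data selection: keep only the attribute relevant to the task."""
    return [r["items"] for r in records]

def transform(baskets):
    """Data transformation: normalise case and store each basket as a set."""
    return [frozenset(item.lower() for item in basket) for basket in baskets]

def mine(baskets):
    """Data mining: count how often single items and item pairs occur."""
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    return item_counts, pair_counts

def evaluate(item_counts, pair_counts, n, min_support, min_confidence):
    """Pattern evaluation: keep rules x -> y that pass both thresholds.

    Support and confidence are assumed measures; the thresholds play the
    role of interestingness thresholds held in a knowledge base.
    """
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if support >= min_support and confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)

def present(rules):
    """Knowledge presentation: render the rules in a readable form."""
    return [
        f"{x} -> {y} (support={s:.2f}, confidence={c:.2f})"
        for x, y, s, c in rules
    ]

records = integrate(clean(store_a), clean(store_b))
baskets = transform(select(records))
item_counts, pair_counts = mine(baskets)
rules = evaluate(item_counts, pair_counts, len(baskets),
                 min_support=0.6, min_confidence=0.9)
report = present(rules)
for line in report:
    print(line)
```

Keeping the thresholds as arguments to `evaluate` rather than hard-coding them mirrors the idea that the pattern evaluation module may read its interestingness thresholds from the knowledge base, so they can change without touching the mining step.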