Data Mining Process

CRISP-DM is a data mining methodology consisting of six steps: 1) business understanding, 2) data understanding, 3) data preparation, 4) modeling, 5) evaluation, and 6) deployment. The process is iterative, with feedback between steps. It involves gaining domain knowledge, preparing data, building models by applying algorithms to training data, evaluating them on test data, deploying the chosen model, and accumulating knowledge throughout.

Uploaded by

Fahmida Akter

2. Data Mining Process

CRISP-DM process

[Figure: the CRISP-DM cycle, with Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment arranged around the Data.]

Process

[Figure: the data mining process in five steps.]
1. Prior Knowledge: Business Understanding, Data Understanding
2. Preparation: Prepare Data
3. Modeling: Training Data, Building Model using Algorithms; Test Data, Applying Model and performance evaluation
4. Application: Deployment
5. Knowledge: Knowledge and Actions


1. Prior Knowledge

Gaining information on:

- Objective of the problem
- Subject area of the problem
- Data
2. Data Preparation

Data Exploration
Data quality
Handling missing values
Data type conversion
Transformation
Outliers
Feature selection
Sampling
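
The preparation tasks listed above can be illustrated with a minimal sketch using only the Python standard library. The records, field names, and the 1.5 * IQR outlier rule are assumptions chosen for illustration; the slides do not prescribe specific techniques.

```python
import random
import statistics

# Hypothetical raw records: ages arrive as strings, one is missing,
# and one income is an obvious outlier.
raw = [
    {"age": "34", "income": 52000.0},
    {"age": "41", "income": 61000.0},
    {"age": None, "income": 58000.0},
    {"age": "29", "income": 48000.0},
    {"age": "38", "income": 950000.0},
    {"age": "45", "income": 67000.0},
]

# Data type conversion: parse age strings to integers, keep None for missing.
ages = [int(r["age"]) if r["age"] is not None else None for r in raw]

# Handling missing values: impute with the mean of the observed ages.
mean_age = statistics.mean(a for a in ages if a is not None)
ages = [a if a is not None else mean_age for a in ages]

# Outliers: drop records whose income lies outside 1.5 * IQR of the quartiles.
incomes = [r["income"] for r in raw]
q1, _, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
clean = [(a, inc) for a, inc in zip(ages, incomes) if low <= inc <= high]

# Transformation: min-max scale the remaining incomes to the range [0, 1].
lo_inc = min(inc for _, inc in clean)
hi_inc = max(inc for _, inc in clean)
scaled = [(a, (inc - lo_inc) / (hi_inc - lo_inc)) for a, inc in clean]

# Sampling: draw a reproducible random sample of three prepared records.
sample = random.Random(0).sample(scaled, 3)
```

Mean imputation and min-max scaling are only one option each; median imputation or standardization would slot into the same places.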
3. Modeling

[Figure: the model is built from the Training Data and evaluated on the Test Data to arrive at the Final Model.]
3. Modeling
Splitting training and test data sets

Training Data
Test Data
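
A minimal sketch of splitting a data set into training and test portions follows. The function name `train_test_split`, the 70/30 ratio, and the seed are assumptions for illustration, not values taken from the slides.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = list(rows)          # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set of ten record identifiers.
rows = list(range(10))
train, test = train_test_split(rows)
```

Shuffling before the cut avoids a split that merely reflects the order in which records were collected, and every record ends up in exactly one of the two sets.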
3. Modeling

Evaluation of test dataset
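
Evaluation on the test data set can be sketched as follows: a model is fitted on the training data only, then scored on records it has not seen. The one-feature threshold classifier and the data points are hypothetical stand-ins for whatever algorithm and data a real project would use.

```python
from statistics import mean

# Hypothetical labelled data as (feature, label) pairs.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test = [(1.5, 0), (2.5, 0), (5.0, 1), (4.0, 1), (9.0, 1)]

def fit_threshold(data):
    """Learn a cut-off halfway between the two class means."""
    mean0 = mean(x for x, y in data if y == 0)
    mean1 = mean(x for x, y in data if y == 1)
    return (mean0 + mean1) / 2

def predict(threshold, x):
    """Assign class 1 to values above the threshold, else class 0."""
    return 1 if x > threshold else 0

# Build the model on the training data only.
threshold = fit_threshold(train)

# Apply the model to the held-out test data and measure accuracy.
predictions = [predict(threshold, x) for x, _ in test]
correct = sum(p == y for p, (_, y) in zip(predictions, test))
accuracy = correct / len(test)
```

Here the learned threshold is 4.5, so the test point at 4.0 with label 1 is misclassified and the accuracy on the test set is 0.8.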


4. Application

Product readiness
Technical integration
Model response time
Remodeling
Assimilation
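
Of the application concerns above, model response time is the easiest to check in code. The sketch below times a hypothetical `predict` function with `time.perf_counter`; the function, the inputs, and the repeat count are assumptions for illustration.

```python
import statistics
import time

def predict(x):
    # Hypothetical stand-in for a deployed model's scoring function.
    return 1 if x > 4.5 else 0

def measure_latency(fn, inputs, repeats=100):
    """Return (median, worst) per-call latency in seconds."""
    timings = []
    for _ in range(repeats):
        for x in inputs:
            start = time.perf_counter()
            fn(x)
            timings.append(time.perf_counter() - start)
    return statistics.median(timings), max(timings)

median_s, worst_s = measure_latency(predict, [1.0, 5.0, 9.0])
```

Reporting the worst case alongside the median matters when the model sits behind an interactive application, where occasional slow responses are what users notice.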
5. Knowledge

Posterior knowledge

Kotu, V., & Deshpande, B. (2014). Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann.
