Week 5 CRISP-DM Process and Its Applications (PDF)
INTRODUCTION TO DATA SCIENCE
CRISP-DM PROCESS AND ITS APPLICATION
Professor O.O Obe
Objectives
At the end of this lesson you should be able to:
● Identify various data science workflows.
● Describe different data science workflows.
● Identify the different applications of each data science
workflow.
Introduction to Data Science
● Healthcare
Predicting disease outbreaks, aiding in patient diagnosis,
and formulating effective treatment plans.
● Finance
Detecting fraud, assessing risks, and optimizing
investment strategies.
CRISP-DM Process: Cross-Industry Standard Process for Data Mining
● Data Understanding
Thoroughly exploring and gaining insights into the structure and quality of
the data is paramount for meaningful analysis.
● Data Preparation
Cleaning, transforming, and selecting data for modelling ensures that the
dataset is refined and optimised for subsequent processes.
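The two phases above can be illustrated with a minimal sketch in plain Python. The records, column names, and helper functions (missing_counts, impute_mean, min_max_scale) are hypothetical and chosen only for illustration; mean imputation and min-max scaling are just two of many possible preparation choices, not steps prescribed by CRISP-DM.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 61000.0},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000.0},
]


def missing_counts(rows):
    """Data understanding: count missing values per column."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


def impute_mean(rows):
    """Data preparation: replace missing values with the column mean."""
    columns = list(rows[0].keys())
    means = {c: mean(r[c] for r in rows if r[c] is not None) for c in columns}
    return [
        {c: (r[c] if r[c] is not None else means[c]) for c in columns}
        for r in rows
    ]


def min_max_scale(rows):
    """Data preparation: rescale every column to the range [0, 1]."""
    columns = list(rows[0].keys())
    lo = {c: min(r[c] for r in rows) for c in columns}
    hi = {c: max(r[c] for r in rows) for c in columns}
    return [
        {
            c: (r[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
            for c in columns
        }
        for r in rows
    ]


print(missing_counts(records))   # quality check before modelling
cleaned = impute_mean(records)   # fill the gaps found above
scaled = min_max_scale(cleaned)  # put columns on a comparable scale
```

Profiling the data first (missing_counts) informs which preparation steps are needed, which is why Data Understanding precedes Data Preparation in the process.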
Mid-lesson questions
1. What forms the backbone of data-driven decision-making and involves a series of
structured approaches for tackling intricate problems through the utilization of data?
A Data Integration
B Data Exploration
C Data Science Workflows
D Data Visualisation
2. What is CRISP-DM?
A A programming language
C A software tool
D An encryption algorithm
● Evaluation
Assessing the model’s performance and determining its
impact on the business helps in refining and optimising
the model further.
● Deployment
Implementing the model into production environments
ensures its practical application in real-world scenarios.
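As a rough sketch of how Evaluation can feed the Deployment decision, the Python below computes accuracy, precision, and recall for a binary classifier on held-out labels and compares recall with a business threshold. The labels, the function names (evaluate, ready_to_deploy), and the 0.8 threshold are assumptions made for illustration; a real project would pick the metrics and thresholds that reflect its own business objectives.

```python
def evaluate(y_true, y_pred):
    """Evaluation: compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def ready_to_deploy(metrics, min_recall=0.8):
    """Deployment gate: min_recall is a hypothetical business requirement."""
    return metrics["recall"] >= min_recall


# Hypothetical held-out labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
print(ready_to_deploy(metrics))
```

If the gate fails, the project loops back to earlier phases rather than moving to production, which reflects the iterative nature of the process.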
● Manufacturing
Enabling predictive maintenance and ensuring
quality control in production processes.
● Telecommunications
Optimizing network operations and effectively
segmenting customer demographics.
Knowledge Management
1. Data Repositories
Centralized storage for both structured and unstructured
data ensures easy access and retrieval.
2. Knowledge Bases
An organized collection of insights, models, and best
practices forms the foundation for informed
decision-making.
3. Collaboration Platforms
Tools that facilitate the sharing of ideas, experiences, and
knowledge promote a collaborative environment.
Conclusion
In conclusion, Data Science Workflows, including KDD,
CRISP-DM, and Knowledge Management, are essential
frameworks for extracting maximum value from data. By
implementing these methodologies effectively, we’re not
only discovering insights but also preserving and
leveraging them for future endeavours. This drives
innovation and ensures informed decision-making.
SUMMARY
● Data Science Workflows form the backbone of data-driven
decision-making
● KDD process steps are: selection, pre-processing,
transformation, data mining, and interpretation/evaluation
● CRISP-DM is a widely accepted industry-standard
methodology for executing data mining projects
● Knowledge Management involves the systematic capture,
organisation, and application of collective knowledge
within an organisation
REFERENCES
● Provost, F., & Fawcett, T. (2013). Data science for
business: What you need to know about data mining
and data-analytic thinking. O'Reilly Media.
● Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996).
From data mining to knowledge discovery in
databases. AI Magazine, 17(3), 37-54.
● Dalkir, K. (2011). Knowledge management in theory
and practice. MIT Press.
Thank You