Module 1 CertDA
Data mining is the process of identifying relationships, trends and patterns in large sets of data,
effectively turning raw data into useful information. Data mining approaches involve various
methods such as statistics, machine learning, and database systems.
The information obtained through the data mining process can then be further processed and used
to support decision-making.
CRISP-DM is a cross-industry process for data mining and is a process model designed to facilitate a
structured approach to data mining. It was first conceived in 1996, and in 1997 it became an official
European Union project under the ESPRIT funding initiative. The project was spear-headed by five
companies: Integral Solutions Ltd (ISL), Teradata, Daimler AG, NCR Corporation and OHRA, an
insurance company, and led to the first version of the methodology being published as a data mining
guide in 1999.
Recent research indicates that CRISP-DM is the most widely used data-mining process, largely because it addresses problems that previously hampered data mining projects. Much of its success and wide adoption stems from the fact that it is industry, tool and application neutral.
The process model is composed of six distinct but connected phases which represent the ideal
sequence of activities involved in the data mining process. In practice some of these activities may
be performed in a different order. Some of the paths between activities are two-way, indicating that
it will frequently be necessary to return to earlier steps depending on the outcome of a particular
activity.
Business understanding
Business understanding is the essential and mandatory first phase in any data mining or data
analytics project. It involves identifying and describing the fundamental aims of the project from a
business perspective. This may involve solving a key business problem or exploring a particular
business opportunity.
Typical business aims include:
• Establishing whether the business has been performing or under-performing, and in which areas
• Monitoring and controlling performance against targets or budgets
• Identifying areas where efficiency and effectiveness in business processes can be improved
• Understanding customer behaviour to identify trends, patterns and relationships
• Predicting sales volumes at given prices
• Detecting and preventing fraud more easily
• Using scarce resources most profitably
• Optimising sales or profits.
Having identified the aims of the project to address the business problem or opportunity, the next
step is to establish a set of project objectives and requirements. These are then used to inform the
development of a project plan. The plan will detail the steps to be performed over the course of the rest of the project.
Data understanding
The second phase of the CRISP-DM process involves obtaining and exploring the data identified as part of the previous phase, and has three separate steps, each resulting in the production of a report.
Data Acquisition
This step involves retrieving the data from their respective sources and producing a data acquisition report that lists the sources of data, along with their provenance and the tools or techniques used to acquire them. It should also document any issues which arose during the acquisition, along with the relevant solutions. This report will facilitate the replication of the data acquisition process if the project is repeated in the future.
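To make this concrete, the following is a minimal sketch in Python, using the pandas library, of how data might be acquired from two hypothetical sources, a CSV export and a SQLite database, while keeping a simple provenance log to support the acquisition report. The file names, table name and provenance notes are illustrative assumptions, not part of CRISP-DM itself.

    # Minimal data acquisition sketch; file names, the database table and
    # the provenance notes are hypothetical examples.
    import sqlite3
    import pandas as pd

    # Acquire a customer extract from a CSV export.
    customers = pd.read_csv("customers.csv")

    # Acquire purchase records from a SQLite database.
    with sqlite3.connect("sales.db") as conn:
        purchases = pd.read_sql_query("SELECT * FROM purchases", conn)

    # Keep a simple provenance log for the data acquisition report.
    acquisition_log = pd.DataFrame([
        {"dataset": "customers", "source": "customers.csv",
         "provenance": "CRM export", "rows": len(customers)},
        {"dataset": "purchases", "source": "sales.db",
         "provenance": "ERP purchases table", "rows": len(purchases)},
    ])
    print(acquisition_log)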
Data Description
The next step requires loading the data and performing a rudimentary examination of it to aid in the production of a data quality report. This report should describe the data that has been acquired.
It should detail the number of attributes and the type of data they contain. For quantitative data,
this should include descriptive statistics such as minimum and maximum values as well as their mean
and median, as well as other statistical measures. For qualitative data, the summary should include the number of distinct values, known as the cardinality of the data, and how many instances of each value exist. At this stage the task is simply to describe the raw data; for instance, if analysing a purchases ledger, you would produce counts of the number of transactions for each department and cost centre, and the minimum, mean and maximum for amounts. Relationships between variables are examined later, in the data exploration step (e.g. by calculating correlations). For both types of data, the report should also detail the number of missing or invalid values in each of the attributes.
If there are multiple sources of data, the report should state on which common attributes these
sources will be joined. Finally, the report should include a statement as to whether the data acquired
is complete and satisfies the requirements outlined during the business understanding phase.
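As an illustration, the sketch below (Python with pandas) produces the kind of per-attribute summary a data description report draws on: data types, missing values, descriptive statistics for quantitative attributes and cardinality for qualitative ones. The DataFrame df is assumed to hold the acquired data.

    # Sketch of a per-attribute summary for the data description report;
    # `df` is assumed to be a pandas DataFrame holding the acquired data.
    import pandas as pd

    def describe_dataset(df):
        rows = []
        for col in df.columns:
            info = {
                "attribute": col,
                "dtype": str(df[col].dtype),
                "missing": int(df[col].isna().sum()),
            }
            if pd.api.types.is_numeric_dtype(df[col]):
                # Quantitative attribute: minimum, maximum, mean and median.
                info.update(minimum=df[col].min(), maximum=df[col].max(),
                            mean=df[col].mean(), median=df[col].median())
            else:
                # Qualitative attribute: cardinality and most frequent value.
                info.update(cardinality=df[col].nunique(),
                            most_frequent=df[col].value_counts().idxmax())
            rows.append(info)
        return pd.DataFrame(rows)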
Data Exploration
This step builds on the data description and involves using statistical and visualisation techniques to
develop a deeper understanding of the data and their suitability for the analysis.
These exploratory data analysis techniques can help provide an indication of the likely outcome of the analysis and may uncover patterns in the data that are worth subjecting to further examination.
The results of the exploratory data analysis should be presented as part of a data exploration report
that should also detail any initial findings.
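A simple illustration of such techniques, assuming the dataset df from the previous step and a hypothetical numeric column named amount, might look like this:

    # Exploratory sketch: pairwise correlations and a distribution plot;
    # `df` and the `amount` column are assumed from earlier steps.
    import matplotlib.pyplot as plt

    # Correlations between numeric attributes hint at relationships worth
    # examining further during modelling.
    print(df.select_dtypes("number").corr().round(2))

    # A histogram shows how a quantitative attribute is distributed.
    df["amount"].plot.hist(bins=30, title="Distribution of amount")
    plt.xlabel("amount")
    plt.show()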
Data preparation
As with the data understanding phase, the data preparation phase is composed of multiple steps and is about ensuring that the correct data is used, in the correct form, so that the data analytics model works effectively.
Data Selection
Feature selection is the process of eliminating features or variables which exhibit little predictive value, or which are highly correlated with others, and retaining those that are the most relevant to the process of building analytical models (a simple selection sketch follows this list), such as:
• Multiple linear regression, where the correlation between multiple independent variables and the dependent variable is used to model the relationship between them
• Decision trees, which simulate human approaches to solving problems by dividing the set of predictors into smaller and smaller subsets and associating an outcome with each one
• Neural networks, a simplified simulation of multiple interconnected brain cells that can be configured to learn and recognise patterns.
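The sketch below illustrates one simple approach to feature selection: it drops features that show almost no variation and one of each pair of highly correlated features. The DataFrame X of numeric candidate features and the 0.9 correlation threshold are illustrative assumptions.

    # Feature selection sketch; `X` is assumed to be a numeric pandas
    # DataFrame of candidate features, and 0.9 is an illustrative threshold.
    def select_features(X, corr_threshold=0.9):
        # Drop features with (almost) no variation: little predictive value.
        X = X.loc[:, X.std() > 1e-9]

        # Drop one of each pair of highly correlated features, since they
        # carry largely redundant information.
        corr = X.corr().abs()
        to_drop = set()
        cols = list(corr.columns)
        for i, first in enumerate(cols):
            for second in cols[i + 1:]:
                if corr.loc[first, second] > corr_threshold:
                    to_drop.add(second)
        return X.drop(columns=sorted(to_drop))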
Sampling may be needed if the amount of data exceeds the capabilities of the tools or systems used
to build the model. This normally involves retaining a random selection of rows as a predetermined
percentage of the total number of rows. Often, surprisingly small samples can give reasonably reliable information about the wider population of data, as voter exit polls in local and national elections demonstrate.
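In pandas, for example, drawing such a sample is a one-line operation; the 10% fraction below is purely illustrative and df is assumed to be the full dataset.

    # Retain a random 10% of rows; the fixed random_state makes the sample
    # reproducible so the selection can be documented and repeated.
    sample = df.sample(frac=0.10, random_state=42)
    print(f"Sampled {len(sample)} of {len(df)} rows")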
Any decisions taken during this step should be documented, along with a description of the reasons
for eliminating non-significant variables or selecting samples of data from a wider population of such
data.
Data Cleaning
Data cleaning is the process of ensuring the data can be used effectively in the analytical model.
The next step is to process missing and erroneous data identified during the data understanding or
collection phase. Erroneous data, that is, values outside of reasonably expected ranges, are generally set to missing. Missing values in each feature are then replaced, either using simple rules of thumb, such as setting them equal to the mean or median of the data in that feature, or by building models that represent the patterns of missing data and using those models to "predict" the missing values.
Other data cleaning tasks include transforming dates into a common format and removing non-
alphanumeric characters from text.
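The sketch below illustrates these cleaning tasks on an assumed DataFrame df with hypothetical columns amount, order_date and description; the expected range for amount is also an assumption made for the example.

    # Data cleaning sketch; the column names and the accepted range for
    # `amount` are hypothetical.
    import numpy as np
    import pandas as pd

    # Treat values outside the reasonably expected range as missing.
    df.loc[(df["amount"] < 0) | (df["amount"] > 1_000_000), "amount"] = np.nan

    # Replace missing values using a simple rule of thumb (the median).
    df["amount"] = df["amount"].fillna(df["amount"].median())

    # Transform dates into a common format.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")

    # Remove non-alphanumeric characters from free text.
    df["description"] = df["description"].str.replace(r"[^0-9A-Za-z ]", "", regex=True)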
The activities undertaken, and decisions made during this step should be documented in a data
cleaning report.
Data Integration
Data mining algorithms expect a single source of data to be organised into rows and columns. If
multiple sources of data are to be used in the analysis, it is necessary to combine them. This involves
using common features in each dataset to join the datasets together. For example, a dataset of
customer details may be combined with records of their purchases. The resulting joined dataset will
have one row for each purchase containing attributes of the purchase combined with attributes
related to the customer.
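Using the customer and purchase example, a join on an assumed common key customer_id might look like this in pandas:

    # Join purchases to customer details on a common attribute; the
    # `customer_id` key and both DataFrames are assumed from earlier steps.
    combined = purchases.merge(customers, on="customer_id", how="left")
    # One row per purchase, now carrying the purchasing customer's attributes.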
Feature Engineering
This optional step involves creating new variables, or attributes derived from the features originally included, in order to improve the model's capability. It is frequently performed when the data analyst believes that the derived attribute or new feature is likely to make a positive contribution to the modelling process, particularly where it captures a complex relationship that the model is unlikely to infer by itself.
An example of a derived feature might be adding attributes such as the amount a customer spends on different products in a given time period, how soon they pay and how often they return goods, in order to assess the profitability of that customer more reliably than simply measuring the gross profit generated by the customer's sales values.
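Derived features of this kind can often be built with a simple aggregation. The sketch below assumes a purchases DataFrame with hypothetical columns customer_id, amount, days_to_pay and returned.

    # Derive customer-level features from transaction-level data; all
    # column names are hypothetical examples.
    customer_features = (
        purchases.groupby("customer_id").agg(
            total_spend=("amount", "sum"),            # spend in the period
            avg_days_to_pay=("days_to_pay", "mean"),  # how soon they pay
            return_rate=("returned", "mean"),         # how often goods come back
        ).reset_index()
    )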
Modelling
This key part of the data mining process involves creating generalised, concise representations of the
data. These are frequently mathematical in nature and are used later to generate predictions from
new, previously unseen data.
The first step in creating models is to choose the modelling techniques which are the most
appropriate, given both the nature of the analysis and of the data used. Many modelling methods
make assumptions about the nature of the data. For example, some methods can perform well in the presence of missing data, whereas others will fail to produce a valid model.
Before proceeding to build a data analytics model, you will need to determine how you are going to assess the quality of the model's predictive ability, in other words, how well the model will perform on data it has not yet seen. This involves keeping aside a subset of the data for this purpose and using it to evaluate how far the model's predictions of the dependent variable are from the actual values in the data.
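A minimal sketch of this hold-out approach, using scikit-learn and assuming a prepared feature table X and target y, is shown below; linear regression stands in for whichever modelling technique is actually chosen.

    # Hold out a quarter of the data to assess predictive quality on
    # examples the model has not seen; `X` and `y` are assumed inputs.
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    model = LinearRegression().fit(X_train, y_train)

    # How far are the predictions from the actual values of the
    # dependent variable in the held-out data?
    predictions = model.predict(X_test)
    print("Mean absolute error:", mean_absolute_error(y_test, predictions))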
Deployment
During this final phase, the outcome of the evaluation will be used to establish a timetable and
strategy for the deployment of the data mining models, detailing the required steps and how they
should be implemented.
Data mining projects are rarely "set it and forget it" in nature. At this time, you will need to develop a
comprehensive plan for the monitoring of the deployed models as well as their future maintenance.
This should take the form of a detailed document. Once the project has been completed there
should be a final written report, re-stating and re-affirming the project objectives, identifying the
deliverables, providing a summary of the results and identifying any problems encountered and how
they were dealt with.
Depending on the requirements, the deployment phase can be as simple as generating a report and
presenting it to the sponsors or as complex as implementing a repeatable data mining process across
the enterprise. In many cases, it is the customer, not the data analyst, who carries out the
deployment steps. However, even if the analyst does carry out the deployment, it is important for
the customer to clearly understand which actions need to be carried out in order to actually make
use of the created models. This is where data visualisation is most important: as the data analyst hands over the findings from the modelling to the sponsor or the end user, those findings should be presented and communicated in a form which is easily understood.