01 Section 6.2.1 QR Code Content
Example 1: Let us look at a simple example of constructing a decision tree with a training dataset
consisting of 10 data instances and 3 attributes {A1, A2, A3}, as shown in Table 1. The last column
of the table is the target class to be predicted. Each row in the table is called a data instance, and
each data instance is classified as 'Positive' or 'Negative'. Construct the decision tree and predict the
target class for a given test instance.
Table 1
S.No. A1 A2 A3 Target Class
1 10 0.5 2.5 Negative
2 7 1.3 3 Positive
3 5 0.7 5.6 Negative
4 9 1.0 4.3 Positive
5 10 1.2 7.1 Positive
6 8 0.9 6.5 Negative
7 6 0.8 3.2 Negative
8 4 1.5 5.1 Negative
9 9 1.7 4.7 Positive
10 8 1.6 7.4 Positive
Solution:
Table 2 depicts the number of data instances categorized as 'Positive' or 'Negative' in each category
of each of the three attributes in the training dataset.
Table 2
A1    Positive  Negative  Total
≥5    5         4         9
<5    0         1         1

A2    Positive  Negative  Total
≥1    5         1         6
<1    0         4         4

A3    Positive  Negative  Total
≥5    2         3         5
<5    3         2         5
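These counts can be verified directly from Table 1. The following Python sketch is only an illustration (it assumes the data are stored as (A1, A2, A3, target) tuples and uses the split points 5, 1 and 5 shown in Table 2):

# Minimal sketch: tally Positive/Negative counts per category for Table 2.
data = [
    (10, 0.5, 2.5, "Negative"),
    (7,  1.3, 3.0, "Positive"),
    (5,  0.7, 5.6, "Negative"),
    (9,  1.0, 4.3, "Positive"),
    (10, 1.2, 7.1, "Positive"),
    (8,  0.9, 6.5, "Negative"),
    (6,  0.8, 3.2, "Negative"),
    (4,  1.5, 5.1, "Negative"),
    (9,  1.7, 4.7, "Positive"),
    (8,  1.6, 7.4, "Positive"),
]

splits = {"A1": (0, 5), "A2": (1, 1), "A3": (2, 5)}  # attribute -> (column index, split point S)

for name, (col, s) in splits.items():
    for side, keep in ((">=", lambda v: v >= s), ("<", lambda v: v < s)):
        subset = [row for row in data if keep(row[col])]
        pos = sum(1 for row in subset if row[3] == "Positive")
        neg = len(subset) - pos
        print(f"{name} {side} {s}: Positive={pos}, Negative={neg}, Total={len(subset)}")

Running it prints the same Positive, Negative and Total counts for all six categories of Table 2.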
Step 1: First, we calculate the entropy of the whole dataset T based on the target class. The target
class has 5 data instances classified as 'Positive' and 5 classified as 'Negative', so the class entropy
is Entropy_Info(5,5).
Entropy_Info(T) = Entropy_Info(5,5)
= -[(5/10) log₂(5/10) + (5/10) log₂(5/10)]
= -(-0.4997 - 0.4997)
= 0.9994
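The same entropy expression can be written as a small helper function. This is only an illustrative sketch; the name entropy_info mirrors the Entropy_Info notation of this example and is not part of any library:

import math

def entropy_info(*counts):
    """Entropy of a class distribution given its raw counts, e.g. entropy_info(5, 5)."""
    total = sum(counts)
    entropy = 0.0
    for c in counts:
        if c:                       # a count of 0 contributes nothing (0 * log2(0) is taken as 0)
            p = c / total
            entropy -= p * math.log2(p)
    return entropy

print(entropy_info(5, 5))   # exactly 1.0; the hand calculation gives 0.9994 because of rounded log values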
Step 2: Now we calculate Entropy_Info for each of the attributes in the training dataset. Since
all the attributes in this example are continuous-valued, they are discretized by assuming a split point
S: all values ≥ S belong to one category and all values < S belong to the other. Thus, the
continuous values are converted into categorical values, and every attribute in this example has 2 categories.
Table 2 shows the total number of data instances belonging to each category of an attribute and,
within each category, how many instances are classified as 'Positive' and how many are
classified as 'Negative'. For example, in attribute A1, 9 data instances are ≥ 5 and 1 data instance is < 5.
Among the 9 data instances, 5 are classified as 'Positive' and 4 as 'Negative'. Hence, we
calculate Entropy_Info for A1 as below:
Entropy_Info(T, A1) = (9/10) Entropy_Info(5,4) + (1/10) Entropy_Info(0,1)
= (9/10) [-(5/9) log₂(5/9) - (4/9) log₂(4/9)] + (1/10) [-(0/1) log₂(0/1) - (1/1) log₂(1/1)]
= (9/10) (0.4708 + 0.5196) + (1/10) (0)
= 0.8913
After calculating Entropy_Info(T, A1), we obtain the gain of A1 by subtracting it from Entropy_Info(T):

Gain(A1) = Entropy_Info(T) - Entropy_Info(T, A1)
= 0.9994 - 0.8913
= 0.1081
Entropy_Info(T, A2) = (6/10) [-(5/6) log₂(5/6) - (1/6) log₂(1/6)] + (4/10) [-(0/4) log₂(0/4) - (4/4) log₂(4/4)]
= (6/10) (0.2190 + 0.4305) + 0
= 0.3897

Gain(A2) = 0.9994 - 0.3897
= 0.6097
Entropy_Info(T, A3) = (5/10) [-(2/5) log₂(2/5) - (3/5) log₂(3/5)] + (5/10) [-(3/5) log₂(3/5) - (2/5) log₂(2/5)]
= (5/10) (0.5284 + 0.4419) + (5/10) (0.5284 + 0.4419)
= 0.9703

Gain(A3) = 0.9994 - 0.9703
= 0.0291
Table 3 shows the gain value calculated for each attribute.
Table 3
Attribute Gain
A1 0.1081
A2 0.6097
A3 0.0291
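The gain values in Table 3 can be reproduced from the counts in Table 2. The sketch below is again only an illustration (it repeats the entropy_info helper so that it runs on its own); because it uses exact logarithms, its output agrees with Table 3 up to the rounding of the hand calculation:

import math

def entropy_info(*counts):
    # Entropy of a class distribution from raw counts (same helper as above).
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# (Positive, Negative) counts for the two categories of each attribute, from Table 2.
partitions = {"A1": [(5, 4), (0, 1)],   # A1 >= 5 and A1 < 5
              "A2": [(5, 1), (0, 4)],   # A2 >= 1 and A2 < 1
              "A3": [(2, 3), (3, 2)]}   # A3 >= 5 and A3 < 5

class_entropy = entropy_info(5, 5)      # Entropy_Info(T) for the whole dataset
total = 10

for attr, groups in partitions.items():
    weighted = sum(sum(g) / total * entropy_info(*g) for g in groups)
    print(f"Gain({attr}) = {class_entropy - weighted:.4f}")
# prints Gain(A1) = 0.1080, Gain(A2) = 0.6100, Gain(A3) = 0.0290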
Step 3:
Now we choose the attribute with the maximum gain as the best split attribute. Here A2 has the
maximum gain, so it is placed at the root node. A2 has two outcomes: ≥ 1 and < 1.
Table 4 shows the data instances whose A2 values are < 1. All of them are categorized as 'Negative',
so the entropy of this subset is 0. Hence, the branch A2 < 1 ends in a leaf node with class
'Negative'.
Table 4
S.No A1 A2 A3 Target Class
1 10 0.5 2.5 Negative
3 5 0.7 5.6 Negative
6 8 0.9 6.5 Negative
7 6 0.8 3.2 Negative
All the other data instances, with A2 values ≥ 1, form another subset, and the process is repeated on
it in Iteration 2. This is shown in Figure 1.
Figure 1: Decision tree after Iteration 1, with A2 at the root; the branch A2 < 1 is a leaf labelled 'Negative' and the branch A2 ≥ 1 is expanded further.
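Because the same split-selection step is applied again to each impure subset, the construction is naturally expressed as a recursive routine. The sketch below is only an illustration of that idea (the function and variable names are our own, the split points are fixed to those of Table 2, and edge cases such as an impure subset with no attributes left are ignored):

import math
from collections import Counter

def entropy(rows):
    # Entropy of the target labels (last element of each row).
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def build_tree(rows, splits):
    # splits maps attribute name -> (column index, split point S), as in Table 2.
    labels = {row[-1] for row in rows}
    if len(labels) == 1:                            # pure subset: make a leaf node
        return labels.pop()
    best_attr, best_gain, best_parts = None, -1.0, None
    for attr, (col, s) in splits.items():
        left = [r for r in rows if r[col] >= s]     # the ">= S" category
        right = [r for r in rows if r[col] < s]     # the "< S" category
        weighted = sum(len(p) / len(rows) * entropy(p) for p in (left, right) if p)
        gain = entropy(rows) - weighted
        if gain > best_gain:
            best_attr, best_gain, best_parts = attr, gain, (left, right)
    remaining = {a: v for a, v in splits.items() if a != best_attr}
    left, right = best_parts
    return {"attribute": best_attr, "split": splits[best_attr][1],
            ">=": build_tree(left, remaining) if left else None,
            "<": build_tree(right, remaining) if right else None}

data = [(10, 0.5, 2.5, "Negative"), (7, 1.3, 3.0, "Positive"), (5, 0.7, 5.6, "Negative"),
        (9, 1.0, 4.3, "Positive"), (10, 1.2, 7.1, "Positive"), (8, 0.9, 6.5, "Negative"),
        (6, 0.8, 3.2, "Negative"), (4, 1.5, 5.1, "Negative"), (9, 1.7, 4.7, "Positive"),
        (8, 1.6, 7.4, "Positive")]
splits = {"A1": (0, 5), "A2": (1, 1), "A3": (2, 5)}

tree = build_tree(data, splits)
print(tree["attribute"])        # 'A2' is selected at the root, as in Figure 1
print(tree[">="]["attribute"])  # 'A1' is selected for the A2 >= 1 branch

On this dataset the sketch selects A2 at the root and A1 under the A2 ≥ 1 branch, matching the tree derived step by step in Iteration 2 below.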
Table 5 depicts the number of data instances categorized as 'Positive' or 'Negative' for the attributes
A1 and A3 within this subset (A2 ≥ 1).
Table 5
A1    Positive  Negative  Total
≥5    5         0         5
<5    0         1         1

A3    Positive  Negative  Total
≥5    2         1         3
<5    3         0         3
The entropy of this subset (5 'Positive' and 1 'Negative' instances) is:

Entropy_Info(T) = Entropy_Info(5,1)
= -[(5/6) log₂(5/6) + (1/6) log₂(1/6)]
= -(-0.2190 - 0.4305)
= 0.6495
Entropy_Info(T, A1) = (5/6) [-(5/5) log₂(5/5) - (0/5) log₂(0/5)] + (1/6) [-(0/1) log₂(0/1) - (1/1) log₂(1/1)]
= 0 + 0
= 0

Gain(A1) = 0.6495 - 0
= 0.6495
Entropy_Info(T, A3) = (3/6) [-(2/3) log₂(2/3) - (1/3) log₂(1/3)] + (3/6) [-(3/3) log₂(3/3) - (0/3) log₂(0/3)]
= (3/6) (0.3900 + 0.5283) + 0
= 0.4591

Gain(A3) = 0.6495 - 0.4591
= 0.1904
Table 6
Attribute Gain
A1 0.6495
A3 0.1904
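These gains can be checked in the same way as before, now using the subset counts of Table 5 (again an illustrative sketch with exact logarithms, so the values agree with Table 6 only up to rounding):

import math

def entropy_info(*counts):
    # Entropy of a class distribution from raw counts (same helper as above).
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

subset_entropy = entropy_info(5, 1)     # the A2 >= 1 subset: 5 Positive, 1 Negative
total = 6

# (Positive, Negative) counts per category within the subset, from Table 5.
partitions = {"A1": [(5, 0), (0, 1)],   # A1 >= 5 and A1 < 5
              "A3": [(2, 1), (3, 0)]}   # A3 >= 5 and A3 < 5

for attr, groups in partitions.items():
    weighted = sum(sum(g) / total * entropy_info(*g) for g in groups)
    print(f"Gain({attr}) = {subset_entropy - weighted:.4f}")
# prints Gain(A1) = 0.6500 and Gain(A3) = 0.1909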
As shown in Table 6, A1, with the maximum information gain, is chosen as the best split attribute. The
branch A1 ≥ 5 covers 5 instances of the subset considered in Iteration 2, all of them 'Positive', so its
entropy is 0 and it ends in a leaf node labelled 'Positive'. The branch A1 < 5 has 1 instance, classified
as 'Negative', so it also ends in a leaf node, labelled 'Negative'.
Figure 2: Final decision tree, with A2 at the root, A1 below the branch A2 ≥ 1, and leaf nodes labelled 'Positive' or 'Negative'.
The final decision tree is shown in Figure 2. Given a test instance, the target class can now be
predicted. The sample test instance is (6, 1, 6.7). We traverse the tree from root to leaf, taking at
each node the branch whose condition the instance satisfies: A2 = 1 ≥ 1, so we follow the A2 ≥ 1
branch to node A1; A1 = 6 ≥ 5, so we reach the leaf 'Positive'. The predicted class label is 'Positive'.
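The finished tree of Figure 2 is small enough to be written out directly as nested conditions. The sketch below (our own encoding of the tree described above, not a general-purpose implementation) classifies the sample test instance:

def predict(a1, a2, a3):
    # Traverse the tree of Figure 2: test A2 at the root, then A1 on the A2 >= 1 branch.
    # a3 is accepted for completeness but never tested, since A3 does not appear in the tree.
    if a2 < 1:
        return "Negative"      # leaf from Table 4: every instance with A2 < 1 is Negative
    if a1 >= 5:
        return "Positive"      # all 5 such instances in Iteration 2 are Positive
    return "Negative"          # the single remaining instance (A1 < 5) is Negative

print(predict(6, 1, 6.7))      # A2 = 1 >= 1, then A1 = 6 >= 5  ->  'Positive'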