
Decision Tree

Algorithm
Supervised ML
What is a Decision Tree?

 A decision tree is a type of supervised learning algorithm (it has a predefined target variable) that is mostly used in classification problems. It works for both categorical and continuous input and output variables. In this technique, we split the population or sample into two or more homogeneous sets (sub-populations) based on the most significant splitter/differentiator among the input variables.
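
In practice, such a tree can be trained in a few lines. A minimal sketch with scikit-learn, assuming the library is installed; the Iris dataset and parameter values are chosen only for illustration:

# Minimal sketch: fit a decision tree classifier on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# criterion="entropy" uses information gain; the default "gini" uses the Gini index.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))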
Structure of a Decision Tree

 Root Node: Represents the entire population or sample; it is further divided into two or more homogeneous sets.
 Splitting: The process of dividing a node into two or more sub-nodes.
 Decision Node: A sub-node that splits into further sub-nodes is called a decision node.
 Leaf / Terminal Node: A node that does not split is called a leaf or terminal node.
 Branch / Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
 Parent and Child Node: A node that is divided into sub-nodes is called the parent node of those sub-nodes, whereas the sub-nodes are the children of the parent node.
How does the Decision Tree Algorithm Work?
The basic idea behind any decision tree algorithm is as follows:
 Select the best attribute using an Attribute Selection Measure (ASM) to split the records.
 Make that attribute a decision node and break the dataset into smaller subsets.
 Build the tree by repeating this process recursively for each child until one of the following conditions is met:
 All the tuples belong to the same class (target attribute value).
 There are no remaining attributes.
 There are no remaining instances.
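
A simplified recursive sketch of this procedure in Python. It is only an illustration, not a full ID3/CART implementation; best_attribute and majority_class are hypothetical helpers, and best_attribute would rely on one of the attribute selection measures described next:

# Simplified sketch of recursive tree building; records are assumed to be
# dicts with a "class" key. best_attribute and majority_class are hypothetical.

def build_tree(records, attributes):
    classes = {r["class"] for r in records}
    if len(classes) == 1:                       # all tuples belong to the same class
        return next(iter(classes))
    if not attributes or not records:           # no remaining attributes or instances
        return majority_class(records)          # in practice: parent's majority class if empty

    attr = best_attribute(records, attributes)  # chosen via an attribute selection measure
    node = {"attribute": attr, "children": {}}
    for value in {r[attr] for r in records}:
        subset = [r for r in records if r[attr] == value]
        node["children"][value] = build_tree(subset, [a for a in attributes if a != attr])
    return node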
Attribute Selection Measures

An attribute selection measure is a heuristic for selecting the splitting criterion that partitions the data in the best possible manner. It is also known as a splitting rule because it helps us determine the breakpoints for tuples at a given node. An ASM assigns a rank to each feature (attribute) of the given dataset, and the attribute with the best score is selected as the splitting attribute. In the case of a continuous-valued attribute, split points for the branches also need to be defined.

The most popular selection measures are:
 Entropy
 Gini Index
 Chi-Square
 Gain Ratio
What is Entropy?

Entropy is a measure of the uncertainty or impurity in a dataset. It quantifies the amount of disorder or randomness. In the context of a decision tree, entropy helps to determine how informative a particular split is.
 High Entropy: Indicates high disorder, meaning the data is diverse and uncertain.
 Low Entropy: Indicates low disorder, meaning the data is more homogeneous and certain.
The formula for entropy H for a binary classification problem is:

H(S) = -p+ log2(p+) - p- log2(p-)

where:
p+ is the proportion of positive examples in the dataset S
p- is the proportion of negative examples in the dataset S
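
A small sketch of this formula in Python (binary_entropy is just an illustrative name):

import math

def binary_entropy(p_pos):
    """Entropy H(S) of a binary dataset, given the proportion of positive examples."""
    p_neg = 1.0 - p_pos
    if p_pos == 0.0 or p_neg == 0.0:   # a pure node has zero entropy
        return 0.0
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

print(binary_entropy(0.5))   # 1.0  (maximum disorder)
print(binary_entropy(1.0))   # 0.0  (pure node)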
What is Information Gain?

 Information Gain (IG) is a measure of the effectiveness of an attribute in classifying the training data. It quantifies the reduction in entropy (uncertainty) achieved by splitting the dataset on that attribute.
 The formula for Information Gain is:

Gain(S, A) = Entropy(S) - Σv (|Sv| / |S|) * Entropy(Sv)

where the sum runs over all values v of attribute A, and:
• S is the original dataset
• A is the attribute being evaluated
• Sv is the subset of S for which attribute A has value v
• Entropy(S) is the entropy of the original dataset
• Entropy(Sv) is the entropy of the subset Sv
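
A sketch of this calculation built directly from the definitions above; records are assumed to be dicts keyed by attribute name, with the class label stored under "class":

from collections import Counter
import math

def entropy(labels):
    """Entropy of a list of class labels (generalises the binary formula above)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(records, attr):
    """Gain(S, A): reduction in entropy from splitting records on attribute attr."""
    labels = [r["class"] for r in records]
    before = entropy(labels)
    after = 0.0
    for value in {r[attr] for r in records}:
        subset = [r["class"] for r in records if r[attr] == value]
        after += (len(subset) / len(records)) * entropy(subset)
    return before - after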
Gini index

 Another decision tree algorithm, CART (Classification and Regression Trees), uses the Gini method to create split points. The Gini index of a dataset D is:

Gini(D) = 1 - Σi pi^2

where pi is the probability that a tuple in D belongs to class Ci.

 The Gini Index considers a binary split for each attribute. You can compute a weighted sum of the impurity of each partition. If a binary split on attribute A partitions data D into D1 and D2, the Gini index of D is:

GiniA(D) = (|D1| / |D|) * Gini(D1) + (|D2| / |D|) * Gini(D2)
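
A sketch of these two formulas in Python (gini and gini_split are illustrative names):

from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum of pi^2 over the classes present in labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_split(left_labels, right_labels):
    """Weighted Gini index of a binary split of D into D1 and D2."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) + (len(right_labels) / n) * gini(right_labels)

print(gini(["yes", "yes", "no", "no"]))          # 0.5 (maximally impure)
print(gini_split(["yes", "yes"], ["no", "no"]))  # 0.0 (perfect split)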
Decision tree algorithms:

 CART (Classification and Regression Trees) → uses the Gini Index (classification) as its metric.
 ID3 (Iterative Dichotomiser 3) → uses the Entropy function and Information Gain as its metrics.
Information Gain:

 By using information gain as a criterion, we try to estimate the information contained in each attribute, using a few ideas borrowed from information theory.
 The randomness or uncertainty of a random variable X is measured by its entropy.
 Consider a binary classification problem with only two classes, positive and negative.
 If all examples are positive or all are negative, the entropy is zero, i.e. low.
 If half of the records belong to the positive class and half to the negative class, the entropy is one, i.e. high.
 By calculating the entropy measure for each attribute we can calculate its information gain. Information Gain calculates the expected reduction in entropy due to sorting on that attribute.
Entropy can be calculated using the formula:

Entropy = -p log2(p) - q log2(q)

Here p and q are the probabilities of success and failure, respectively, in that node.

Entropy is also used with a categorical target variable. We choose the split that has the lowest entropy compared to the parent node and the other candidate splits. The lower the entropy, the better.

Steps to calculate entropy for a split:
 Calculate the entropy of the parent node.
 Calculate the entropy of each individual node of the split, and take the weighted average of all sub-nodes in the split.
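
A short worked example of these steps, with made-up numbers: a parent node holds 10 positive and 10 negative records, and a candidate split produces one sub-node with 8 positive / 2 negative and another with 2 positive / 8 negative.

import math

def node_entropy(p, q):
    """Entropy of a node given the probabilities of success (p) and failure (q)."""
    return -(p * math.log2(p) if p else 0.0) - (q * math.log2(q) if q else 0.0)

parent = node_entropy(0.5, 0.5)               # 1.0
left = node_entropy(0.8, 0.2)                 # ~0.722
right = node_entropy(0.2, 0.8)                # ~0.722
split = (10 / 20) * left + (10 / 20) * right  # weighted average ~0.722
print("information gain:", parent - split)    # ~0.278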
PROCEDURE

 First, the entropy of the total dataset is calculated.
 The dataset is then split on the different attributes.
 The entropy for each branch is calculated, then added proportionally to get the total entropy for the split.
 The resulting entropy is subtracted from the entropy before the split.
 The result is the Information Gain, or decrease in entropy.
 The attribute that yields the largest IG is chosen for the decision node.
EXAMPLE
How can we avoid overfitting in decision trees?

 Overfitting is a practical problem while building a decision tree model. A model is considered to be overfitting when the algorithm keeps going deeper and deeper in the tree to reduce the training-set error but ends up with an increased test-set error, i.e. the prediction accuracy of our model goes down. It generally happens when the tree builds many branches due to outliers and irregularities in the data.
Two approaches we can use to avoid overfitting are:
 Pre-Pruning
 Post-Pruning
 Pre-Pruning
In pre-pruning, tree construction is stopped a bit early. It is preferred not to split a node if its goodness measure is below a threshold value, but it is difficult to choose an appropriate stopping point.
 Post-Pruning
In post-pruning, the algorithm first goes deeper and deeper to build a complete tree. If the tree shows the overfitting problem, pruning is then done as a post-processing step. We use cross-validation data to check the effect of our pruning: it tests whether expanding a node makes an improvement or not. If it shows an improvement, we can keep the expanded node; but if it shows a reduction in accuracy, the node should not be expanded, i.e. it should be converted to a leaf node.
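
As a sketch of both ideas with scikit-learn: constraints such as max_depth or min_samples_split act as pre-pruning, while cost-complexity pruning via ccp_alpha is a form of post-pruning. The specific values below are illustrative, not recommendations.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Pre-pruning: stop growing the tree early with depth / sample-count constraints.
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_split=10, random_state=0)
pre_pruned.fit(X_train, y_train)

# Post-pruning: grow a full tree, then prune it with cost-complexity pruning (ccp_alpha).
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
post_pruned.fit(X_train, y_train)

print("Pre-pruned test accuracy: ", pre_pruned.score(X_test, y_test))
print("Post-pruned test accuracy:", post_pruned.score(X_test, y_test))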
