
Helmi ayari

Lesson 5

Decision Tree (Rule-Based Approach)


Example
[Slide shows a sample dataset table with feature columns and a class column]

Given: <sunny, cool, high, true>
Predict: will there be a match?

Assume that I have a set of rules:


- If ((outlook=sunny) and (humidity=high) and (windy=false)) then (yes) else (no)
- If (outlook=overcast) then (yes)
- If ((outlook=sunny) and (humidity=high)) then (yes) else (no)
- and so on…

A set of rules can be visualized as a tree.
Rule 1: If ((outlook=sunny) and (humidity=high)) then (yes) else (no)
Rule 2: If (outlook=overcast) then (yes)
Rule 3: If ((outlook=rain) and (windy=true)) then (no) else (yes)
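
As a quick illustration, the three rules above can be written as a small prediction function. This is only a sketch in Python that applies the rules exactly as written on the slide; the function name and the fallback branch are illustrative, not part of the lesson.

def predict_match(outlook, temperature, humidity, windy):
    # Apply the three rules from the slide; temperature is never tested by them.
    if outlook == "overcast":                        # Rule 2
        return "yes"
    if outlook == "sunny":                           # Rule 1
        return "yes" if humidity == "high" else "no"
    if outlook == "rain":                            # Rule 3
        return "no" if windy else "yes"
    return "no"                                      # fallback for unseen outlook values (assumption)

# The query from the earlier slide: <sunny, cool, high, true>
print(predict_match("sunny", "cool", "high", True))  # -> "yes" under Rule 1 as written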
Many possible Trees

Which Tree is the Best?


• Which feature should be used to split the dataset?
• Types of decision trees:
  • ID3 (Iterative Dichotomiser 3)
  • C4.5 (successor of ID3)
  • CART (Classification and Regression Trees)
  • Random Forest
ID3
1. Calculate the entropy of the total dataset => H(S) = 0.9911

Entropy(S) = -(p/(p+n))·log2(p/(p+n)) - (n/(p+n))·log2(n/(p+n))

Entropy(4F,5M) = -(4/9)log2(4/9) - (5/9)log2(5/9) = 0.9911
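
For readers who want to check the number, here is a minimal sketch in Python of the same binary entropy computation (the helper name and the 4F/5M counts follow the slide; nothing else is from the lesson):

import math

def entropy(p, n):
    # Binary entropy of a set with p items of one class and n of the other
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # 0·log2(0) is treated as 0
            frac = count / total
            result -= frac * math.log2(frac)
    return result

print(round(entropy(4, 5), 4))  # -> 0.9911, matching H(S) above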
ID3
1. Calculate the entropy of the total dataset
2. Choose an attribute and split the dataset by that attribute
ID3
1. Calculate the entropy of the total dataset
2. Choose an attribute and split the dataset by that attribute
3. Calculate the entropy of each branch

Entropy(3F,2M) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.9710
Entropy(1F,3M) = -(1/4)log2(1/4) - (3/4)log2(3/4) = 0.8113
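
Using the entropy helper sketched above, these two branch values can be checked as entropy(3, 2) ≈ 0.9710 and entropy(1, 3) ≈ 0.8113.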
ID3
1. Calculate the entropy of the total dataset
2. Choose an attribute and split the dataset by that attribute
3. Calculate the entropy of each branch
4. Calculate the information gain of the split

IG(A1) = H(S) - [p(S1)·H(S1) + p(S2)·H(S2)]
Gain(Hair Length <= 5) = 0.9911 - (4/9 × 0.8113 + 5/9 × 0.9710) = 0.0911

Entropy(3F,2M) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.9710
Entropy(1F,3M) = -(1/4)log2(1/4) - (3/4)log2(3/4) = 0.8113
What is Information Gain?
Which split is better?
Information gain is the reduction in uncertainty (entropy) of the parent dataset after the split:

IG(A1) = H(S) - [p(S1)·H(S1) + p(S2)·H(S2)]
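
A minimal Python sketch of this formula, reproducing the Hair Length <= 5 gain from the previous slide (the (p, n) class counts are the 4F/5M example from the slides; everything else here is illustrative):

import math

def entropy(p, n):
    return sum(-c / (p + n) * math.log2(c / (p + n)) for c in (p, n) if c)

def information_gain(parent, branches):
    # parent: (p, n) counts of the whole set; branches: list of (p, n) counts per child
    total = sum(p + n for p, n in branches)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in branches)
    return entropy(*parent) - remainder

# Split "Hair Length <= 5": parent (4F,5M), branches (1F,3M) and (3F,2M)
print(round(information_gain((4, 5), [(1, 3), (3, 2)]), 4))  # -> 0.0911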
ID3
1. Calculate the entropy of the total dataset
2. Choose an attribute and split the dataset by that attribute
3. Calculate the entropy of each branch
4. Calculate the information gain of the split
5. Repeat 2, 3, 4 for all attributes
6. The attribute that yields the largest IG is chosen for the decision node.

Gain(Hair Length <= 5) = 0.0911
Gain(Weight <= 160) = 0.5900
Gain(Age <= 40) = 0.0183
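
Step 6 then just takes the attribute with the largest gain; a one-line sketch using the three values above (attribute names as written on the slide):

gains = {"Hair Length <= 5": 0.0911, "Weight <= 160": 0.5900, "Age <= 40": 0.0183}
print(max(gains, key=gains.get))  # -> "Weight <= 160" becomes the decision node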
ID3
1. Calculate the entropy of the total dataset
2. Choose an attribute and split the dataset by that attribute
3. Calculate the entropy of each branch
4. Calculate the information gain of the split
5. Repeat 2, 3, 4 for all attributes
6. The attribute that yields the largest IG is chosen for the decision node.
7. Repeat 1 to 6 for each sub-dataset until every sub-dataset contains a single class.
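
Putting the seven steps together, here is a compact recursive sketch in Python. It assumes categorical attributes and rows stored as dictionaries of attribute -> value; it is only an illustration of the procedure above, not the code used in this lesson.

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels (steps 1 and 3)
    total = len(labels)
    return sum(-c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Steps 2-4: split on attr and measure the drop in entropy
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[attr], []).append(label)
    remainder = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - remainder

def id3(rows, labels, attributes):
    # Step 7 stopping condition: the sub-dataset holds a single class
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority-vote fallback (assumption)
    # Steps 5-6: evaluate every attribute and keep the one with the largest gain
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        # Step 7: recurse on each sub-dataset
        tree[best][value] = id3([rows[i] for i in idx], [labels[i] for i in idx], remaining)
    return tree

The sketch returns the tree as nested dictionaries, one level per chosen attribute, with class labels at the leaves.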
