
id3algorithm-200307175839

Uploaded by

Lekshmi
Copyright
© © All Rights Reserved

ID3 ALGORITHM

Abstract

• ID3 builds a decision tree from a fixed set of examples.


• Using this decision tree, future samples are classified.
• Each example has several attributes and belongs to a class.
• The leaf nodes of the decision tree contain the class name whereas a non-leaf
node is a decision node.
• The decision node is an attribute test with each branch being a possible value of
the attribute.
• ID3 uses information gain to help it decide which attribute goes into a decision
node.
Algorithm

• For every attribute, calculate the entropy of the data set after splitting on that attribute.
• Split the set into subsets using the attribute for which this entropy is minimum (or, equivalently, information gain is maximum).
• Make a decision tree node containing that attribute.
• Recurse on the subsets using the remaining attributes, stopping when a subset is homogeneous or no attributes remain.
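The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the nested-dict tree representation, the majority-class fallback when attributes run out, and the four-row `toy` data set are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attributes, target):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:      # homogeneous subset: make a leaf
        return labels[0]
    if not attributes:             # nothing left to test: fall back to majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical four-row data set, invented only to exercise the function.
toy = [
    {"Wind": "Weak", "Humidity": "High", "Play": "Yes"},
    {"Wind": "Weak", "Humidity": "Normal", "Play": "Yes"},
    {"Wind": "Strong", "Humidity": "High", "Play": "No"},
    {"Wind": "Strong", "Humidity": "Normal", "Play": "No"},
]
print(id3(toy, ["Wind", "Humidity"], "Play"))
```

In the toy data Wind alone determines the class, so the sketch should choose Wind as the only decision node and never test Humidity.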
Entropy and Information gain

• Entropy measures the randomness (impurity) of the class labels in a sample.
• If the sample is completely homogeneous, the entropy is zero; if it is divided equally between two classes, the entropy is one.
• Entropy can be calculated as:
Entropy(S) = ∑ – p(I) . log2 p(I)
where p(I) is the proportion of examples in S that belong to class I.
• The information gain is the decrease in entropy after a data set is split on an attribute.
• Information gain can be calculated as:
Gain(S, A) = Entropy(S) – ∑ [ p(S|A) . Entropy(S|A) ]
where the sum runs over the values of attribute A, p(S|A) is the fraction of examples in S that have a given value of A, and Entropy(S|A) is the entropy of that subset.
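Both formulas can be evaluated directly from class counts. The sketch below is one possible way to write them; the function names and the count-list inputs are assumptions chosen for illustration.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    # A class with zero examples contributes nothing (0 * log2(0) is taken as 0).
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


def gain(parent_counts, subsets):
    """Information gain of a split; `subsets` holds the class counts per attribute value."""
    total = sum(parent_counts)
    remainder = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - remainder


print(entropy([7, 0]))                  # homogeneous sample
print(entropy([7, 7]))                  # two classes, equally divided
print(gain([4, 4], [[4, 0], [0, 4]]))   # a split that separates the classes perfectly
```

The three calls cover the boundary cases named above: a homogeneous sample, an even two-class split, and a split whose gain equals the full entropy of the parent set.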
Decision tree for deciding if tennis is playable, using data from the past 14 days
Entropy

Entropy(Decision) = – p(Yes) . log2p(Yes) – p(No) . log2p(No)

Entropy(Decision) = – (9/14) . log2(9/14) – (5/14) . log2(5/14)

= 0.940
Wind factor on decision

• The Wind attribute has two values: weak and strong.
Gain(Decision, Wind) = Entropy(Decision) – [ p(Decision|Wind=Weak) . Entropy(Decision|Wind=Weak) ] – [ p(Decision|Wind=Strong) . Entropy(Decision|Wind=Strong) ]
• We need to calculate Entropy(Decision|Wind=Weak) and Entropy(Decision|Wind=Strong) respectively.


Weak wind factor

• There are 8 instances with weak wind: the decision is no for 2 of them and yes for 6.
Entropy(Decision|Wind=Weak) = – p(No) . log2p(No) – p(Yes) . log2p(Yes)
Entropy(Decision|Wind=Weak) = – (2/8) . log2(2/8) – (6/8) . log2(6/8)
= 0.811
Strong wind factor

• There are 6 instances with strong wind, and the decision is divided into two equal parts (3 yes, 3 no).
Entropy(Decision|Wind=Strong) = – (3/6) . log2(3/6) – (3/6) . log2(3/6)
= 1
Wind factor on decision

• The information gain can now be calculated:
Gain(Decision, Wind) = Entropy(Decision) – [ p(Decision|Wind=Weak) . Entropy(Decision|Wind=Weak) ] – [ p(Decision|Wind=Strong) . Entropy(Decision|Wind=Strong) ]
= 0.940 – [ (8/14) . 0.811 ] – [ (6/14). 1]
= 0.048
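For readers who want the intermediate arithmetic, the same substitution can be written out term by term, with every value rounded to three decimals:

```latex
\begin{aligned}
\mathrm{Gain}(\mathrm{Decision}, \mathrm{Wind})
  &= 0.940 - \tfrac{8}{14}\,(0.811) - \tfrac{6}{14}\,(1) \\
  &\approx 0.940 - 0.463 - 0.429 \\
  &\approx 0.048
\end{aligned}
```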
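Other factors on decision

On applying similar calculations to the other columns, we get:

• Gain(Decision, Outlook) = 0.246
• Gain(Decision, Temperature) = 0.029
• Gain(Decision, Humidity) = 0.151
• Outlook has the highest gain, so it becomes the root node of the tree.
[Figure: tree with root Outlook and branches Sunny, Overcast, Rainy]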
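The gains can be recomputed programmatically. The slides quote only counts and results, not the underlying rows, so the sketch below assumes the data is the classic 14-day play-tennis table; that table is consistent with every count given here (9 yes and 5 no overall, 8 weak-wind and 6 strong-wind days, 5 sunny days). The helper names are likewise assumptions. Because the slides round intermediate values, the third decimal may differ slightly from the figures quoted.

```python
from collections import Counter
from math import log2

# ASSUMPTION: the rows below are the classic 14-day play-tennis table.
# The slides quote only counts and gains, not the rows themselves.
COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "Decision")
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
DATA = [dict(zip(COLUMNS, row)) for row in ROWS]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def gain(rows, attribute, target="Decision"):
    """Information gain from splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


gains = {a: gain(DATA, a) for a in COLUMNS[:-1]}
for attribute, value in gains.items():
    print(f"{attribute}: {value:.3f}")
print("Root:", max(gains, key=gains.get))
```

The same `gain` function can be applied to a filtered subset, for example only the rows with `Outlook == "Sunny"`, to choose the attribute for a lower-level node.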
Overcast outlook on decision

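• The decision is always yes when the outlook is overcast, so the Overcast branch becomes a leaf.
[Figure: tree with root Outlook; the Overcast branch ends in the leaf Yes, while Sunny and Rainy are not yet split]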
Sunny outlook on decision

• There are 5 instances with sunny outlook: the decision is no for 3 of them (3/5) and yes for 2 (2/5).
• Gain(Outlook=Sunny|Temperature) = 0.570
• Gain(Outlook=Sunny|Humidity) = 0.971
• Gain(Outlook=Sunny|Wind) = 0.019
[Figure: tree so far. Outlook = Sunny: test Humidity (branches High, Normal); Outlook = Overcast: Yes; Outlook = Rainy: not yet split]
• With a sunny outlook, the decision is always no when humidity is high.
• With a sunny outlook, the decision is always yes when humidity is normal.
[Figure: tree so far. Outlook = Sunny: test Humidity (High: No, Normal: Yes); Outlook = Overcast: Yes; Outlook = Rainy: not yet split]
Rain outlook on decision

The information gains for the Rain outlook are:
• Gain(Outlook=Rain | Temperature) = 0.020
• Gain(Outlook=Rain | Humidity) = 0.020
• Gain(Outlook=Rain | Wind) = 0.971
Wind has the highest gain, so it is the attribute tested on the Rain branch.
[Figure: tree so far. Outlook = Sunny: test Humidity (High: No, Normal: Yes); Outlook = Overcast: Yes; Outlook = Rainy: test Wind (branches Strong, Weak)]
• The decision is always yes when the wind is weak and the outlook is rain.
• The decision is always no when the wind is strong and the outlook is rain.
[Figure: final decision tree. Outlook = Sunny: test Humidity (High: No, Normal: Yes); Outlook = Overcast: Yes; Outlook = Rainy: test Wind (Strong: No, Weak: Yes)]
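Once the tree is complete, classifying a new day means walking from the root to a leaf. The sketch below encodes the finished tree as nested dictionaries and follows a sample through it; the dictionary layout, the `classify` name and the two sample days are assumptions made for illustration.

```python
# The finished tree, written as nested dicts:
# {attribute: {value: subtree-or-class-label}}.
TREE = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rainy": {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}


def classify(tree, sample):
    """Walk from the root to a leaf, following the sample's attribute values."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))   # the attribute tested at this node
        tree = tree[attribute][sample[attribute]]
    return tree


# Two hypothetical days, not taken from the training data.
print(classify(TREE, {"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Strong"}))
print(classify(TREE, {"Outlook": "Rainy", "Humidity": "High", "Wind": "Strong"}))
```

Temperature never appears in the tree, so a sample does not need to supply it. A value the tree has not seen (for example an unknown Outlook) raises a `KeyError` in this sketch.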
THANK YOU
