Lec-3-Decision Trees
Introduction
Example
[Figure: the learned decision tree. Outlook is the root with branches Sunny, Overcast and Rain; the Sunny branch tests Humidity (High / Normal) and the Rain branch tests Wind (Strong / Weak).]
[Figure: the partially learned tree. Outlook is the root; its Sunny branch holds instances D1, D2, D8, D9, D11, the Overcast branch holds D3, D7, D12, D13, and the Rain branch holds D4, D5, D6, D10, D14.]
Looking at the partially learned tree above: what is the "best" attribute to test at this point? The possible choices are Temperature, Wind and Humidity.
Entropy
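For a sample S containing a proportion p_{+} of positive and p_{-} of negative examples, entropy is the standard binary-entropy measure that the examples below rely on:

Entropy(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}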
To illustrate this equation, let us calculate the entropy of the dataset in Figure 1. The dataset has 9 positive instances and 5 negative instances, therefore:
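Entropy(S) = -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) \approx 0.940        (1.2)

For comparison, an equally divided sample (7 positive, 7 negative) gives

Entropy(S) = -(7/14)\log_2(7/14) - (7/14)\log_2(7/14) = 1        (1.3)

and a completely homogeneous sample (14 positive, 0 negative) gives

Entropy(S) = -(14/14)\log_2(14/14) - 0 = 0        (1.4)

(The equation numbers are chosen here to match the references in the next paragraph; 0.940 is the value used in the information-gain calculations later on.)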
Observing equations 1.2, 1.3 and 1.4 closely, we can conclude that if the dataset is completely homogeneous then the impurity is 0 and therefore the entropy is 0 (equation 1.4), but if the dataset can be divided equally into two classes then it is completely non-homogeneous, the impurity is 100%, and therefore the entropy is 1 (equation 1.3).
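As a small illustration, here is a minimal Python sketch of this entropy calculation (the function name entropy and the example calls are ours, not part of any library):

import math

def entropy(pos, neg):
    # Binary entropy of a sample with `pos` positive and `neg` negative instances.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(9, 5))   # ~0.940 -> the Figure 1 dataset
print(entropy(7, 7))   # 1.0    -> equally divided, maximum impurity
print(entropy(14, 0))  # 0.0    -> completely homogeneous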
The Information Gain
Information Gain:
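The information gain of an attribute A relative to a sample S is the expected reduction in entropy caused by partitioning S on A:

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)

where Values(A) is the set of possible values of A and S_v is the subset of S for which attribute A has value v.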
To make this clearer, let us use this equation to measure the information gain of the attribute Wind on the dataset of Figure 1.
The dataset has 14 instances, so the sample space is 14, with 9 positive and 5 negative instances.
The attribute Wind can take the values Weak or Strong.
Therefore,
Values(Wind) = {Weak, Strong}
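Assuming Figure 1 is the standard 14-instance dataset, Weak covers 8 instances (6 positive, 2 negative) and Strong covers 6 instances (3 positive, 3 negative), so:

Entropy(S_{Weak}) = -(6/8)\log_2(6/8) - (2/8)\log_2(2/8) \approx 0.811
Entropy(S_{Strong}) = -(3/6)\log_2(3/6) - (3/6)\log_2(3/6) = 1.000
Gain(S, Wind) = 0.940 - (8/14)(0.811) - (6/14)(1.000) \approx 0.048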
So the information gain of the Wind attribute is 0.048. Let us now calculate the information gain of the Outlook attribute.
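Under the same assumption about the dataset, Outlook splits the sample as Sunny (2 positive, 3 negative), Overcast (4 positive, 0 negative) and Rain (3 positive, 2 negative), which gives:

Entropy(S_{Sunny}) = -(2/5)\log_2(2/5) - (3/5)\log_2(3/5) \approx 0.971
Entropy(S_{Overcast}) = 0 (homogeneous)
Entropy(S_{Rain}) = -(3/5)\log_2(3/5) - (2/5)\log_2(2/5) \approx 0.971
Gain(S, Outlook) = 0.940 - (5/14)(0.971) - (4/14)(0) - (5/14)(0.971) \approx 0.246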
These two examples should make it clear how we can calculate information gain.
The information gains of the four attributes of the Figure 1 dataset are:
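Gain(S, Outlook) \approx 0.246
Gain(S, Humidity) \approx 0.151
Gain(S, Wind) \approx 0.048
Gain(S, Temperature) \approx 0.029

(Wind and Outlook are as computed above; the Humidity and Temperature values follow from the same formula applied to the standard version of the Figure 1 dataset.)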
Remember, the main goal of measuring information gain is to find the attribute that is most useful for classifying the training set.
Our ID3 algorithm will use that attribute as the root of the decision tree, and will then calculate information gain again to find the next node.
As calculated above, the most useful attribute is "Outlook", since it gives us more information than the others. So "Outlook" will be the root of our tree.
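A compact Python sketch of this attribute-selection step is shown below. The helper names (entropy, information_gain, best_attribute) and the data layout (a list of dicts plus a parallel list of labels) are our own choices for illustration, not a full ID3 implementation:

import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Gain(S, A): entropy reduction from splitting `rows` on attribute `attr`.
    gain = entropy(labels)
    for value in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain

def best_attribute(rows, labels, attributes):
    # The attribute ID3 would place at the current node.
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# For the Figure 1 dataset, best_attribute(rows, labels,
# ["Outlook", "Temperature", "Humidity", "Wind"]) would return "Outlook".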
We can now measure the information gain of Temperature and Wind in the same way we measured Gain(S, Humidity). Finally, we get:
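Gain(S_{Sunny}, Humidity) \approx 0.970
Gain(S_{Sunny}, Temperature) \approx 0.570
Gain(S_{Sunny}, Wind) \approx 0.019

(Writing S_{Sunny} for the Sunny subset {D1, D2, D8, D9, D11}, which contains 2 positive and 3 negative examples; these are the standard values for that subset.)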
So Humidity gives us the most information at this stage. The node after "Outlook" on the Sunny branch will therefore be Humidity.
The High descendant contains only negative examples and the Normal descendant contains only positive examples, so both become leaf nodes and cannot be expanded further.
Decision Boundary for Decision Trees
Reference