Decision Tree
Overview
We can see why such diagrams are called trees: although they are admittedly drawn upside down, they start from a root and have branches leading to leaves (the tips of the graph at the bottom). Note that the leaves are always decisions, and a particular decision might sit at the end of multiple branches (for example, we could choose to go to the cinema for two different reasons).

According to our decision tree diagram, on Saturday morning, when we wake up, all we need to do is check (a) the weather, (b) how much money we have, and (c) whether our parents' car is parked in the drive. The decision tree will then enable us to make our decision. Suppose, for example, that the parents haven't turned up and the sun is shining. Tracing the corresponding path through the tree, from the root down to a leaf, tells us what to do.
Entropy
Putting together a decision tree is all a matter of choosing which attribute to test at each node in the tree. We shall define a measure called information gain, which will be used to decide which attribute to test at each node. Information gain is itself calculated using a measure called entropy.
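The standard definition, from Mitchell (1997), whose running example this section uses, is as follows. For a set S containing positive and negative examples in proportions p_{+} and p_{-}:

    Entropy(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}

with 0 \log_2 0 taken to be 0, so a pure set has entropy 0 and an evenly split set has entropy 1. For example, the 14-example training set below contains 9 positive and 5 negative examples, giving

    Entropy(S) = -(9/14) \log_2 (9/14) - (5/14) \log_2 (5/14) \approx 0.940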
Information Gain
We now return to the problem of trying to determine the best attribute to choose for a particular node in a tree. The following measure calculates a numerical value for a given attribute, A, with respect to a set of examples, S. Note that the values of attribute A will range over a set of possibilities which we call Values(A).
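This measure is the information gain of A relative to S; Mitchell (1997) defines it as

    Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} (|S_v| / |S|) \, Entropy(S_v)

where S_v is the subset of S for which attribute A has the value v. The attribute chosen for a node is the one with the highest information gain over the examples that reach that node.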
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
[See: Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997]
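Read as a classifier, this hypothesis can be written directly as a Python predicate. This is an illustrative sketch of ours, not code from the source; the function and argument names are invented:

def play_tennis(outlook, humidity, wind):
    # Disjunction of the three paths through the learned tree.
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

# Spot checks against the table: D11 is a Yes day, D14 a No day.
assert play_tennis("Sunny", "Normal", "Strong")    # D11
assert not play_tennis("Rain", "High", "Strong")   # D14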
The tree-building procedure is then:
First, test all attributes and select the one that would function as the best root;
Break up the training set into subsets based on the branches of the root node;
Test the remaining attributes to see which ones fit best underneath the branches of the root node;
Continue this process for all other branches until
a. all examples at a node have the same classification (the node becomes a leaf labelled with that class), or
b. there are no attributes left to test (make a leaf labelled with the majority class of the remaining examples), or
c. there are no examples left for a branch (make a leaf labelled with the majority class of the parent node's examples).
A minimal Python sketch of this procedure follows below.
Outlook: on the training set above, Outlook has the highest information gain of the four attributes (Gain(S, Outlook) ≈ 0.247, against ≈ 0.152 for Humidity, ≈ 0.048 for Wind and ≈ 0.029 for Temperature), so it is chosen as the root of the tree.
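To make the procedure concrete, here is a minimal Python sketch of ID3 over the training set above. It is our illustration, not the book's code: the function names, the attribute-index convention, and the tuple-based tree representation are all ours, and branches are only created for attribute values that actually occur in the examples, so case (c) never arises on this data.

from collections import Counter
from math import log2

# The PlayTennis training set from the table above.
# Columns: Outlook, Temperature, Humidity, Wind; the last field is the class.
COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind"]
DATA = [
    ("Sunny",    "Hot",  "High",   "Weak",   "No"),   # D1
    ("Sunny",    "Hot",  "High",   "Strong", "No"),   # D2
    ("Overcast", "Hot",  "High",   "Weak",   "Yes"),  # D3
    ("Rain",     "Mild", "High",   "Weak",   "Yes"),  # D4
    ("Rain",     "Cool", "Normal", "Weak",   "Yes"),  # D5
    ("Rain",     "Cool", "Normal", "Strong", "No"),   # D6
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),  # D7
    ("Sunny",    "Mild", "High",   "Weak",   "No"),   # D8
    ("Sunny",    "Cool", "Normal", "Weak",   "Yes"),  # D9
    ("Rain",     "Mild", "Normal", "Weak",   "Yes"),  # D10
    ("Sunny",    "Mild", "Normal", "Strong", "Yes"),  # D11
    ("Overcast", "Mild", "High",   "Strong", "Yes"),  # D12
    ("Overcast", "Hot",  "Normal", "Weak",   "Yes"),  # D13
    ("Rain",     "Mild", "High",   "Strong", "No"),   # D14
]

def entropy(examples):
    # Entropy of the class labels (the last field) of a set of examples.
    total = len(examples)
    counts = Counter(row[-1] for row in examples)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def gain(examples, attr):
    # Information gain of splitting `examples` on attribute index `attr`.
    total = len(examples)
    remainder = 0.0
    for value in {row[attr] for row in examples}:
        subset = [row for row in examples if row[attr] == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(examples) - remainder

def id3(examples, attrs):
    # Returns a class label (a leaf) or a node (attr_index, {value: subtree}).
    labels = {row[-1] for row in examples}
    if len(labels) == 1:                      # case (a): pure node
        return labels.pop()
    if not attrs:                             # case (b): no attributes left
        return Counter(r[-1] for r in examples).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(examples, a))
    return (best, {v: id3([r for r in examples if r[best] == v],
                          [a for a in attrs if a != best])
                   for v in {row[best] for row in examples}})

for i, name in enumerate(COLUMNS):
    print(f"Gain(S, {name}) = {gain(DATA, i):.3f}")
print(id3(DATA, list(range(len(COLUMNS)))))

Running this prints the four root gains quoted above and a tree that reads back exactly as the hypothesis (Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak).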