Lecture 023 - Decision Trees - 1
Introduction
Classification
Classification: A Two-Step Process
[Figure: Training Data -> Classification Algorithms -> Classifier; the Classifier is then applied to Testing Data and to Unseen Data.]

Testing data:
NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

Unseen data: (Jeff, Professor, 4). Tenured?
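The testing step can be sketched in a few lines of Python. The rule used as the classifier below is a hypothetical stand-in for whatever model the training step produces (it is not given in the slides), and the four labelled records are treated here as the testing data.

```python
# Testing step: apply an already-built classifier to labelled test records
# and measure predictive accuracy. The rule below is a hypothetical
# stand-in for a trained model; it is not taken from the slides.

test_data = [
    # (name, rank, years, tenured)
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

def classify(rank, years):
    """Hypothetical rule: tenured if full professor or more than 6 years."""
    return "yes" if rank == "Professor" or years > 6 else "no"

# Count the test records whose predicted label matches the known label.
correct = sum(classify(rank, years) == label
              for _, rank, years, label in test_data)
accuracy = correct / len(test_data)
print(f"accuracy on test data: {accuracy:.2f}")

# Only after testing would the classifier be applied to unseen data.
print("Jeff ->", classify("Professor", 4))
```

With this particular rule, the Merlisa record is the one misclassified, which is why accuracy is measured on data the classifier was not built from.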
Classification
• Types of classifiers
  • Comprehensible classifiers (rule-based classifiers)
    • Decision tree, RIPPER, CN2, etc.
  • Non-comprehensible classifiers (statistical or mathematical classifiers)
    • SVM, Naïve Bayes, NN, etc.
Evaluating Classification Methods
• Predictive accuracy
• Robustness
– handling noise and missing values
• Scalability
– efficiency in disk-resident databases
• Interpretability
– understanding and insight provided by the model
DECISION TREES
Introduction
Example
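[Figure: decision tree. The root tests Outlook (Sunny, Overcast, Rain); the Sunny branch tests Humidity (High, Normal) and the Rain branch tests Wind (Strong, Weak).]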
Entropy of X, where X takes the values v_1, ..., v_n:
H(X) = -\sum_{j=1}^{n} P(X = v_j) \log_2 P(X = v_j)
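A minimal Python sketch of the entropy H(X), estimated from a list of observed values. The function name `entropy` and the toy inputs are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    Uses p * log2(1 / p), which is the same quantity as -p * log2(p).
    """
    n = len(values)
    return sum((c / n) * log2(n / c) for c in Counter(values).values())

print(entropy(["a", "b"]))            # two equally likely values
print(entropy(["a", "a", "a", "a"]))  # a single value: no uncertainty
```

Entropy is largest when the values are evenly spread and zero when only one value occurs.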
[Figure: attribute Major with values Math, History, CS.]
Conditional entropy of Y given X:
H(Y | X) = \sum_{j} P(X = v_j) H(Y | X = v_j)
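The conditional entropy can be sketched the same way: group the Y values by the value of X, then weight each group's entropy by the group's share of the records. The records below are made up for illustration (they are not from the slides), and the target name `likes` is hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return sum((c / n) * log2(n / c) for c in Counter(values).values())

def conditional_entropy(xs, ys):
    """H(Y | X) = sum over v of P(X = v) * H(Y | X = v)."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)          # collect the Y values seen with each X value
    n = len(xs)
    return sum(len(g) / n * entropy(g) for g in groups.values())

# Made-up records, illustrative only.
major = ["Math", "History", "CS", "Math", "Math", "CS", "History", "Math"]
likes = ["Yes", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

h_y = entropy(likes)
h_y_given_x = conditional_entropy(major, likes)
print("H(Y)       =", h_y)
print("H(Y | X)   =", h_y_given_x)
print("difference =", h_y - h_y_given_x)
```

The difference H(Y) - H(Y | X) is the reduction in uncertainty about Y from knowing X, usually called information gain.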
Let’s investigate the attribute Wind.
Example
Which attribute should be selected as the first test?
The process of selecting a new attribute is now repeated for each non-terminal descendant node, this time using only the training examples associated with that node.
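This process continues for each new leaf node until either:
• every attribute has already been used along the path to that node, or
• the training examples associated with the node all have the same class (their entropy is zero).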
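The recursive procedure can be sketched as follows. This is a minimal illustration, not a full implementation: the nested-dict tree encoding, the helper names, and breaking ties by attribute order are assumptions of the sketch. It is run on the Grades table that appears later in these slides.

```python
from collections import Counter
from math import log2

COLUMNS = ["Grades", "Hardworking", "Intelligent", "Unlucky"]
TABLE = """Good Yes Yes No
Bad No Yes Yes
Bad Yes No Yes
Good Yes Yes No
Good Yes Yes No
Bad No Yes No
Bad Yes No No"""
rows = [dict(zip(COLUMNS, line.split())) for line in TABLE.splitlines()]

def entropy(labels):
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())

def gain(rows, attr, target):
    """Entropy reduction obtained by splitting `rows` on `attr`."""
    remainder = 0.0
    for value in sorted({r[attr] for r in rows}):
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def build_tree(rows, attrs, target):
    labels = [r[target] for r in rows]
    # Stop: all examples share one class, or no attributes remain on this path.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Pick the attribute with the highest gain (first listed wins ties).
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    # Recurse on each branch, using only the examples that reach it.
    return {best: {v: build_tree([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}

tree = build_tree(rows, COLUMNS[1:], "Grades")
print(tree)
```

Each recursive call sees only the examples routed to its branch and only the attributes not yet used on that path, which mirrors the two stopping conditions.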
After building the decision tree, we trace each path from the root node to a leaf node, recording the test outcomes as antecedents and the leaf node classification as the consequent of a rule.
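Tracing the paths can be sketched with a short recursive generator. The nested-dict encoding of the tree is an assumption of this sketch; the tree itself is the one these slides arrive at for the Grades example.

```python
def extract_rules(tree, conditions=()):
    """Yield (antecedents, consequent) for every root-to-leaf path."""
    if not isinstance(tree, dict):           # leaf: a class label
        yield list(conditions), tree
        return
    (attribute, branches), = tree.items()    # one attribute test per node
    for value, subtree in branches.items():
        # Extend the path with this test outcome and keep descending.
        yield from extract_rules(subtree, conditions + ((attribute, value),))

tree = {"Hardworking": {"No": "Bad",
                        "Yes": {"Intelligent": {"Yes": "Good", "No": "Bad"}}}}

for antecedents, consequent in extract_rules(tree):
    tests = " AND ".join(f"{a} = {v}" for a, v in antecedents)
    print(f"IF {tests} THEN Grades = {consequent}")
```

Each leaf produces exactly one rule, so a tree with three leaves yields three rules.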
Example
    Grades  Hardworking  Intelligent  Unlucky
1.  Good    Yes          Yes          No
2.  Bad     No           Yes          Yes
3.  Bad     Yes          No           Yes
4.  Good    Yes          Yes          No
5.  Good    Yes          Yes          No
6.  Bad     No           Yes          No
7.  Bad     Yes          No           No
Show the steps of the decision tree induction algorithm for the data above, with Grades as the class attribute.
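The first step, choosing the root test, can be checked with a short script that computes the entropy of Grades and the entropy reduction (gain) for each candidate attribute. The helper names are assumptions of this sketch.

```python
from collections import Counter
from math import log2

COLUMNS = ["Grades", "Hardworking", "Intelligent", "Unlucky"]
TABLE = """Good Yes Yes No
Bad No Yes Yes
Bad Yes No Yes
Good Yes Yes No
Good Yes Yes No
Bad No Yes No
Bad Yes No No"""
rows = [dict(zip(COLUMNS, line.split())) for line in TABLE.splitlines()]

def entropy(labels):
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())

def gain(rows, attr, target="Grades"):
    """H(target) minus the weighted entropy of target after splitting on attr."""
    remainder = 0.0
    for value in sorted({r[attr] for r in rows}):
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

base = entropy([r["Grades"] for r in rows])
gains = {a: gain(rows, a) for a in COLUMNS[1:]}

print(f"H(Grades) = {base:.3f}")
for attr, g in gains.items():
    print(f"Gain({attr}) = {g:.3f}")
```

On this table each of the three attributes splits the seven examples into a group of five (3 Good, 2 Bad) and a pure group of two Bad, so the gains appear to tie at about 0.29. The slides proceed with Hardworking as the first test.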
Attribute “Hardworking”
[Figure: partial tree. Root: Hardworking. No -> Grades = Bad; Yes -> ? (still to be split).]
Attribute “Intelligent”
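For the second step, only the examples with Hardworking = Yes are considered. A minimal sketch, with helper names that are assumptions of the sketch, compares the two remaining attributes on that subset.

```python
from collections import Counter
from math import log2

COLUMNS = ["Grades", "Hardworking", "Intelligent", "Unlucky"]
TABLE = """Good Yes Yes No
Bad No Yes Yes
Bad Yes No Yes
Good Yes Yes No
Good Yes Yes No
Bad No Yes No
Bad Yes No No"""
rows = [dict(zip(COLUMNS, line.split())) for line in TABLE.splitlines()]

def entropy(labels):
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())

def gain(rows, attr, target="Grades"):
    remainder = 0.0
    for value in sorted({r[attr] for r in rows}):
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

# Keep only the examples that reach the Hardworking = Yes branch.
subset = [r for r in rows if r["Hardworking"] == "Yes"]
subset_gains = {a: gain(subset, a) for a in ("Intelligent", "Unlucky")}

print(f"examples in branch: {len(subset)}")
for attr, g in subset_gains.items():
    print(f"Gain({attr}) = {g:.3f}")
```

Within this branch Intelligent separates the Good and Bad examples completely, so its gain equals the branch's full entropy and it is the natural choice for the next test.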
[Figure: final tree.]
Hardworking?
  No  -> Grades = Bad
  Yes -> Intelligent?
           Yes -> Grades = Good
           No  -> Grades = Bad
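As a check, the final tree can be applied back to the seven training examples. The nested-dict encoding and the `classify` helper are assumptions of this sketch.

```python
COLUMNS = ["Grades", "Hardworking", "Intelligent", "Unlucky"]
TABLE = """Good Yes Yes No
Bad No Yes Yes
Bad Yes No Yes
Good Yes Yes No
Good Yes Yes No
Bad No Yes No
Bad Yes No No"""
rows = [dict(zip(COLUMNS, line.split())) for line in TABLE.splitlines()]

tree = {"Hardworking": {"No": "Bad",
                        "Yes": {"Intelligent": {"Yes": "Good", "No": "Bad"}}}}

def classify(tree, example):
    """Follow attribute tests from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        (attribute, branches), = tree.items()
        tree = branches[example[attribute]]
    return tree

predictions = [classify(tree, r) for r in rows]
matches = sum(p == r["Grades"] for p, r in zip(predictions, rows))
print(f"{matches} of {len(rows)} training examples classified correctly")
```

Agreement on the training examples only shows the tree is consistent with the table; predictive accuracy would have to be measured on separate testing data.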
Reference
Sections 3.1 – 3.4 of T. Mitchell
Section 4.3 of Witten & Frank