Lecture 05: Decision Tree and K-Means
Example: Text Clustering
Improve Healthcare, Win $3M
Motivation:
◦ 71M Americans are admitted to hospitals per year
◦ $30 billion was spent on unnecessary hospital admissions
◦ Can we identify earlier those most at risk and ensure they get
the treatment they need?
Objective:
◦ Identify patients who will be admitted to a hospital within
the next year, using historical claims data.
◦ Develop new care plans and strategies to reach patients before
emergencies occur, thereby reducing the number of
unnecessary hospitalizations.
Competition:
◦ Grand Prize: $3M
◦ Milestone Prizes: $230K across 6 milestones
◦ Time: 4 April 2011 - 3 April 2013
Methods/algorithms for data analysis / data mining
ICDM’06 survey (number of votes in parentheses):
1. C4.5 (61)
2. K-Means (60)
3. SVM (58)
4. Apriori (52)
5. EM (48)
6. PageRank (46)
7. AdaBoost (45)
8. k-NN (45)
9. Naïve Bayes (45)
10. CART (34)
CLASSIFICATION BY DECISION TREE INDUCTION
BuyComputer Data
Classification by Decision Tree
◦ Internal node: a condition (test on an attribute)
◦ Leaf: a conclusion (class label)
General Algorithm
Create a new node N
If all the data belongs to the same class C Then
◦ Return N as a leaf node labeled with C
Select the “best” attribute A
Label N with A
For each value Ai of attribute A
◦ Select the subset Di of examples for which A = Ai
◦ Run the algorithm recursively on Di and attach the resulting subtree to N
EndFor
Return N
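A minimal Python sketch of this recursive procedure (the names and the dictionary representation are illustrative, not from the slides); choose_best_attribute is a placeholder for the attribute-selection rule, e.g. information gain as defined on the next slides:

from collections import Counter

def build_tree(examples, attributes, choose_best_attribute):
    # examples: list of (features: dict, label) pairs
    # attributes: attribute names still available for splitting
    # choose_best_attribute: scoring rule, e.g. information gain (ID3)
    labels = [label for _, label in examples]

    # If all the data belongs to the same class C, return a leaf labeled with C
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}

    # If no attributes are left, fall back to the majority class (a common refinement)
    if not attributes:
        return {"leaf": Counter(labels).most_common(1)[0][0]}

    # Select the "best" attribute A and label the new node N with it
    best = choose_best_attribute(examples, attributes)
    node = {"attribute": best, "children": {}}

    # For each value Ai of A, recurse on the subset Di of examples with A = Ai
    for value in {features[best] for features, _ in examples}:
        subset = [(f, y) for f, y in examples if f[best] == value]
        remaining = [a for a in attributes if a != best]
        node["children"][value] = build_tree(subset, remaining, choose_best_attribute)
    return node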
Which Attribute is “best”?
Entropy
Given a collection S, containing positive and negative examples of some target concept, the entropy of S relative to this Boolean classification is
Entropy(S) = - p_+ \log_2 p_+ - p_- \log_2 p_-
where p_+ and p_- are the proportions of positive and negative examples in S. More generally, for a c-class problem:
H(X) = - \sum_{i=1}^{c} p_i \log_2 (p_i)
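This definition translates directly into a small Python helper (a sketch; the function name and signature are illustrative):

import math

def entropy(class_counts):
    # H = -sum_i p_i * log2(p_i) over the classes in a collection
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]   # 0 * log(0) is taken as 0
    return -sum(p * math.log2(p) for p in probs)

For example, for a set with 29 positive and 35 negative examples (used on a later slide), entropy([29, 35]) ≈ 0.994, i.e. the E([29+,35-]) = 0.99 shown there.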
ID3: Information Gain
Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)
◦ S – a collection of examples
◦ A – an attribute
◦ Values(A) – possible values of attribute A
◦ S_v – the subset of S for which attribute A has value v
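Using the entropy helper above, the gain of splitting on an attribute can be computed as follows (again an illustrative sketch, assuming examples are (feature-dict, label) pairs):

def information_gain(examples, attribute):
    # Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)
    def label_counts(subset):
        counts = {}
        for _, label in subset:
            counts[label] = counts.get(label, 0) + 1
        return list(counts.values())

    total = len(examples)
    gain = entropy(label_counts(examples))
    for value in {features[attribute] for features, _ in examples}:
        subset = [(f, y) for f, y in examples if f[attribute] == value]
        gain -= len(subset) / total * entropy(label_counts(subset))
    return gain

Plugged into the general algorithm as choose_best_attribute = lambda ex, attrs: max(attrs, key=lambda a: information_gain(ex, a)), this gives an ID3-style learner.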
Which Attribute is “best”?
[Figure: the same collection of 29 positive and 35 negative examples, with E([29+,35-]) = 0.99, split by candidate attribute A (branches a, b) and by candidate attribute B (branches c, d); the attribute giving the higher information gain is “best”.]
Gini-index
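For reference, the Gini index used by CART (an alternative to entropy as a splitting criterion) is Gini(S) = 1 - \sum_i p_i^2; a minimal helper in the same style as the entropy function above:

def gini_index(class_counts):
    # Gini impurity: 1 - sum_i p_i^2; 0 for a pure node, larger when classes are mixed
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)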
Tree Pruning
Occam’s razor:
prefer the simplest hypothesis that fits the data
SUPPORT VECTOR MACHINES
CLUSTERING
Visualization of the Iraq War Logs
Image Clustering
K-means Clustering: Example
Step 1: pick K points at random as the initial cluster centers
K-means Clustering: Example
Step 2: assign each data point to the closest cluster center
K-means Clustering: Example
Step 3: change each cluster center to the mean/average of its assigned data points
K-means Clustering: Example
Repeat Steps 2 and 3 until convergence
K-means Clustering
Initialize
◦ Pick K random points as cluster centers
Repeat
1. Assign each data point to the closest cluster center
2. Change each cluster center to the mean/average of its assigned data points
Until no point's assignment changes
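A compact NumPy sketch of this loop (illustrative names and code, not from the lecture), assuming the data is a float array of shape (n, d):

import numpy as np

def kmeans(X, k, seed=0):
    # X: (n, d) data matrix; k: number of clusters
    rng = np.random.default_rng(seed)

    # Initialize: pick K random points as the cluster centers
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    assignment = np.full(len(X), -1)

    while True:
        # Step 1: assign each data point to its closest cluster center -- O(K*n)
        distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_assignment = distances.argmin(axis=1)

        # Stop when no point's assignment changes
        if np.array_equal(new_assignment, assignment):
            return centers, assignment
        assignment = new_assignment

        # Step 2: move each cluster center to the mean of its assigned points -- O(n)
        for j in range(k):
            members = X[assignment == j]
            if len(members) > 0:          # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)

Usage: centers, labels = kmeans(data, k=3) for a float array data of shape (n, d).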
K-means: Features
Guaranteed to converge in a finite number of iterations
Complexity of one iteration:
1. Assign each data point to the closest cluster center: O(Kn)
2. Change each cluster center to the average of its assigned points: O(n)
K-means: Convergence
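Why the loop terminates (a sketch of the standard argument): both steps can only decrease the K-means objective, the total squared distance of points to their assigned centers,

J(c, \mu) = \sum_{i=1}^{n} \lVert x_i - \mu_{c(i)} \rVert^2

Reassigning a point to its closest center cannot increase its term in J, and replacing a center by the mean of its assigned points minimizes that cluster's contribution to J. Since there are only finitely many possible assignments and J never increases, the algorithm stops after a finite number of iterations.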
K-means: Randomness
[Figure: the same input data clustered twice with different random initializations, producing two different outputs.]
K-means: Local Minimum
Hierarchical Clustering
Distance Measures
Algorithms
Evaluation
◦ Open problem: the number of clusters is usually not known in advance
Segmentation – Classification
Report: Decision Tree Learning Algorithm
Tree pruning
◦ Pre-pruning
◦ Post-pruning