Hierarchical Clustering
Gabriela Ochoa
[email protected]
Types of Hierarchical Clustering
Measuring Similarity
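As a small illustration of a dissimilarity measure, the sketch below computes the Euclidean distance between two points. The choice of Euclidean distance and the sample points are assumptions made for illustration; the slides do not state which measure is used.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical points, chosen only for illustration.
p, q = (0.0, 0.0), (3.0, 4.0)
distance = euclidean(p, q)
print(distance)  # 5.0
```

A smaller distance means the two observations are more similar; identical points have distance 0.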
Linkage Methods
Agglomerative Clustering
Pseudo-code
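The pseudo-code from this slide is not reproduced in the text. As a rough sketch of the usual bottom-up procedure (not the slide's own listing), the code below starts with one cluster per observation and repeatedly merges the two closest clusters. The function name `agglomerative`, the one-dimensional sample points, and the use of absolute difference as the distance are all assumptions for illustration.

```python
from itertools import combinations


def agglomerative(points, linkage=min):
    """Naive agglomerative clustering of 1-D points.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Returns a list of (merge height, merged cluster) in merge order.
    """
    clusters = [[i] for i in range(len(points))]  # one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for a, b in combinations(range(len(clusters)), 2):
            height = linkage(abs(points[i] - points[j])
                             for i in clusters[a] for j in clusters[b])
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((height, merged))
        # Replace the two merged clusters by their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


points = [0.0, 1.0, 5.0, 7.0]  # hypothetical data
merges = agglomerative(points)
for height, cluster in merges:
    print(height, cluster)
```

This version recomputes every cluster-to-cluster distance in each round, which is simple to read but slow for large datasets.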
Agglomerative Clustering
Intuitive simple example
[Figure: raw data and the corresponding dendrogram.]
Real-world Dataset
Violent Crime Rates by US State
Dendrogram after Agglomerative Clustering
Violent Crime Rates by US State
How many clusters?
Violent Crime Rates by US State
Agglomerative Clustering
Synthetic small dataset
[Figure: scatter plot of the ten observations, labelled 1 to 10, on axes running from 0 to 8.]
Distance Matrix
       1     2     3     4     5     6     7     8     9    10
 1   0.0
 2   1.0   0.0
 3   3.6   2.8   0.0
 4   4.0   3.0   2.2   0.0
 5   6.0   5.0   3.6   2.0   0.0
 6   6.1   5.1   4.2   2.2   1.0   0.0
 7  10.0   9.2   6.4   7.2   6.3   7.3   0.0
 8   8.6   7.8   5.0   5.8   5.1   6.1   1.4   0.0
 9   8.6   8.1   5.4   7.1   7.1   8.1   3.2   2.8   0.0
10   5.4   5.1   3.2   5.4   6.4   7.2   6.1   5.0   3.6   0.0
Clusters: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
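To make the first step concrete, the sketch below stores the lower-triangular distance matrix above and searches it for the closest pair(s) of observations. The names `LOWER` and `d` are illustrative choices, not part of the slides.

```python
# Lower triangle of the distance matrix; row r holds d(r+1, 1), ..., d(r+1, r).
LOWER = [
    [],
    [1.0],
    [3.6, 2.8],
    [4.0, 3.0, 2.2],
    [6.0, 5.0, 3.6, 2.0],
    [6.1, 5.1, 4.2, 2.2, 1.0],
    [10.0, 9.2, 6.4, 7.2, 6.3, 7.3],
    [8.6, 7.8, 5.0, 5.8, 5.1, 6.1, 1.4],
    [8.6, 8.1, 5.4, 7.1, 7.1, 8.1, 3.2, 2.8],
    [5.4, 5.1, 3.2, 5.4, 6.4, 7.2, 6.1, 5.0, 3.6],
]


def d(i, j):
    """Distance between observations i and j (labels 1 to 10)."""
    if i == j:
        return 0.0
    hi, lo = max(i, j), min(i, j)
    return LOWER[hi - 1][lo - 1]


pairs = [(i, j) for i in range(1, 11) for j in range(i + 1, 11)]
smallest = min(d(i, j) for i, j in pairs)
closest = [(i, j) for i, j in pairs if d(i, j) == smallest]
print(smallest, closest)  # 1.0 [(1, 2), (5, 6)]
```

Two pairs tie at the smallest distance, so both can be merged at the same height.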
Single (minimal) distance
d     k    Clusters                                            Comment
0.0   10   {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}   Start with each observation as its own cluster.
1.0   8    {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}       Merge {1} and {2} as well as {5} and {6}, since these pairs are the closest: d(1,2) = 1.0 and d(5,6) = 1.0.
Clusters: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
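Once clusters contain more than one observation, single linkage measures the distance between two clusters by their closest pair of members. The sketch below applies that rule to the clusters listed above to find the next merge; `LOWER`, `d` and `single_link` are illustrative names, not taken from the slides.

```python
# Lower triangle of the distance matrix; row r holds d(r+1, 1), ..., d(r+1, r).
LOWER = [
    [],
    [1.0],
    [3.6, 2.8],
    [4.0, 3.0, 2.2],
    [6.0, 5.0, 3.6, 2.0],
    [6.1, 5.1, 4.2, 2.2, 1.0],
    [10.0, 9.2, 6.4, 7.2, 6.3, 7.3],
    [8.6, 7.8, 5.0, 5.8, 5.1, 6.1, 1.4],
    [8.6, 8.1, 5.4, 7.1, 7.1, 8.1, 3.2, 2.8],
    [5.4, 5.1, 3.2, 5.4, 6.4, 7.2, 6.1, 5.0, 3.6],
]


def d(i, j):
    """Distance between observations i and j (labels 1 to 10)."""
    if i == j:
        return 0.0
    return LOWER[max(i, j) - 1][min(i, j) - 1]


def single_link(a, b):
    """Smallest distance between a member of cluster a and a member of b."""
    return min(d(i, j) for i in a for j in b)


clusters = [(1, 2), (3,), (4,), (5, 6), (7,), (8,), (9,), (10,)]
candidates = [(single_link(a, b), a, b)
              for x, a in enumerate(clusters) for b in clusters[x + 1:]]
next_merge = min(candidates, key=lambda t: t[0])

print(single_link((1, 2), (3,)))  # 2.8, the smaller of d(1,3)=3.6 and d(2,3)=2.8
print(next_merge)                 # (1.4, (7,), (8,))
```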
d     k    Clusters                                      Comment
1.4   7    {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}   Merge {7} and {8} since they are the closest: d(7,8) = 1.4.
Clusters: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
d     k    Clusters                                    Comment
2.0   6    {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}   Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5) = 2.0.
Clusters: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
d     k    Clusters                                  Comment
2.2   5    {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}   Merge {3} and {4, 5, 6} since 3 and 4 are the closest: d(3,4) = 2.2.
d     k    Clusters                                Comment
2.8   3    {1, 2, 3, 4, 5, 6}, {7, 8, 9}, {10}     Merge {1, 2} and {3, 4, 5, 6} as well as {7, 8} and {9}, since 2 and 3 as well as 8 and 9 are the closest: d(2,3) = 2.8 and d(8,9) = 2.8.
d     k    Clusters                              Comment
3.2   2    {1, 2, 3, 4, 5, 6, 10}, {7, 8, 9}     Merge {1, 2, 3, 4, 5, 6} and {10} since 3 and 10 are the closest: d(3,10) = 3.2.
d     k    Clusters                            Comment
3.6   1    {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}     Merge the remaining two clusters: d(9,10) = 3.6.
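The whole single-linkage walk-through can be checked mechanically. The sketch below runs a naive single-linkage loop on the distance matrix and records, for each merge height d, the number of clusters k left after all merges at that height. It is an illustrative implementation with assumed names (`LOWER`, `d`, `single_linkage_history`), not the code behind the slides.

```python
# Lower triangle of the distance matrix; row r holds d(r+1, 1), ..., d(r+1, r).
LOWER = [
    [],
    [1.0],
    [3.6, 2.8],
    [4.0, 3.0, 2.2],
    [6.0, 5.0, 3.6, 2.0],
    [6.1, 5.1, 4.2, 2.2, 1.0],
    [10.0, 9.2, 6.4, 7.2, 6.3, 7.3],
    [8.6, 7.8, 5.0, 5.8, 5.1, 6.1, 1.4],
    [8.6, 8.1, 5.4, 7.1, 7.1, 8.1, 3.2, 2.8],
    [5.4, 5.1, 3.2, 5.4, 6.4, 7.2, 6.1, 5.0, 3.6],
]


def d(i, j):
    """Distance between observations i and j (labels 1 to 10)."""
    if i == j:
        return 0.0
    return LOWER[max(i, j) - 1][min(i, j) - 1]


def single_linkage_history(n):
    """Return (merge height, clusters remaining) for each of the n-1 merges."""
    clusters = [frozenset([i]) for i in range(1, n + 1)]
    history = []
    while len(clusters) > 1:
        # Single linkage: cluster distance is the closest pair of members.
        candidates = [
            (min(d(i, j) for i in a for j in b), a, b)
            for x, a in enumerate(clusters) for b in clusters[x + 1:]
        ]
        height, a, b = min(candidates, key=lambda t: t[0])
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
        history.append((height, len(clusters)))
    return history


history = single_linkage_history(10)
table = {}
for height, k in history:
    table[height] = k  # keep k after the last merge at each height
print(table)
# {1.0: 8, 1.4: 7, 2.0: 6, 2.2: 5, 2.8: 3, 3.2: 2, 3.6: 1}
```

Tied merges (at d = 1.0 and d = 2.8) are performed one at a time here and then grouped by height, which gives the same d and k values as merging them together.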
Single Linkage Cluster Dendrogram
[Figure: dendrogram of the single-linkage result; vertical axis "Height", with ticks from 1.0 to 3.5.]
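Cutting a dendrogram at a chosen height yields a flat clustering. For single linkage this is equivalent to linking every pair of observations whose distance is at most that height and taking the connected groups. The sketch below does this with a small union-find structure; the function name `cut_single_linkage` and the cut heights are illustrative assumptions.

```python
# Lower triangle of the distance matrix; row r holds d(r+1, 1), ..., d(r+1, r).
LOWER = [
    [],
    [1.0],
    [3.6, 2.8],
    [4.0, 3.0, 2.2],
    [6.0, 5.0, 3.6, 2.0],
    [6.1, 5.1, 4.2, 2.2, 1.0],
    [10.0, 9.2, 6.4, 7.2, 6.3, 7.3],
    [8.6, 7.8, 5.0, 5.8, 5.1, 6.1, 1.4],
    [8.6, 8.1, 5.4, 7.1, 7.1, 8.1, 3.2, 2.8],
    [5.4, 5.1, 3.2, 5.4, 6.4, 7.2, 6.1, 5.0, 3.6],
]


def d(i, j):
    """Distance between observations i and j (labels 1 to 10)."""
    if i == j:
        return 0.0
    return LOWER[max(i, j) - 1][min(i, j) - 1]


def cut_single_linkage(n, height):
    """Clusters obtained by cutting the single-linkage tree at `height`."""
    parent = list(range(n + 1))  # index 0 is unused

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Join every pair of observations no further apart than the cut height.
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if d(i, j) <= height:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


print(cut_single_linkage(10, 3.0))
# [[1, 2, 3, 4, 5, 6], [7, 8, 9], [10]]
```

Lowering the cut height gives more, smaller clusters; raising it to 3.6 or above leaves a single cluster.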
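[Figure: dendrograms of the same data under three linkage methods; each panel has a "Height" axis. Leaf orders are given as printed.]
Single Linkage: height axis from 1.0 to 3.5; leaf order 9, 7, 8, 10, 1, 2, 3, 4, 5, 6
Complete Linkage: height axis from 0 to 10; leaf order 9, 7, 8, 5, 6, 10, 1, 2, 3, 4
Average Linkage: height axis from 1 to 7; leaf order 9, 7, 8, 4, 5, 6, 1, 2, 3, 10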
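The three linkage methods differ only in how they turn the pairwise distances between two clusters into one number: the minimum (single), the maximum (complete), or the mean (average). The sketch below compares them for one pair of clusters from the distance matrix. The pair {1, 2} and {3, 4} is chosen purely for illustration, and `RULES` and `cluster_distance` are assumed names.

```python
from statistics import mean

# Lower triangle of the distance matrix; row r holds d(r+1, 1), ..., d(r+1, r).
LOWER = [
    [],
    [1.0],
    [3.6, 2.8],
    [4.0, 3.0, 2.2],
    [6.0, 5.0, 3.6, 2.0],
    [6.1, 5.1, 4.2, 2.2, 1.0],
    [10.0, 9.2, 6.4, 7.2, 6.3, 7.3],
    [8.6, 7.8, 5.0, 5.8, 5.1, 6.1, 1.4],
    [8.6, 8.1, 5.4, 7.1, 7.1, 8.1, 3.2, 2.8],
    [5.4, 5.1, 3.2, 5.4, 6.4, 7.2, 6.1, 5.0, 3.6],
]


def d(i, j):
    """Distance between observations i and j (labels 1 to 10)."""
    if i == j:
        return 0.0
    return LOWER[max(i, j) - 1][min(i, j) - 1]


RULES = {"single": min, "complete": max, "average": mean}


def cluster_distance(a, b, rule):
    """Combine all member-to-member distances between a and b with `rule`."""
    return rule([d(i, j) for i in a for j in b])


a, b = (1, 2), (3, 4)  # hypothetical pair of clusters
result = {name: cluster_distance(a, b, rule) for name, rule in RULES.items()}
for name, value in result.items():
    print(name, round(value, 2))
```

Here the pairwise distances are 3.6, 4.0, 2.8 and 3.0, so single linkage gives 2.8, complete linkage gives 4.0, and average linkage gives about 3.35. Because each rule ranks cluster pairs differently, the merge order and the heights in the dendrogram can differ between methods.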
Conclusions