
Artificial Intelligence (CSC9YE)

Hierarchical Clustering

Gabriela Ochoa
[email protected]
Hierarchical Clustering

- K-means clustering requires us to pre-specify the number of clusters K. This can be a disadvantage.
- Hierarchical clustering does not require that we commit to a particular choice of K.
- Hierarchical clustering algorithms produce a hierarchy of clusters, which can be visualised as a dendrogram (tree).
- The dendrogram helps us to decide on the number of clusters K (a minimal SciPy sketch of this workflow follows below).
- There are two types of hierarchical clustering: agglomerative (bottom-up) and divisive (top-down).
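
As a concrete illustration (not from the lecture), here is a minimal SciPy sketch; the data points and the final choice of three clusters are made-up assumptions, but it shows that the hierarchy is built first and K is chosen only afterwards, from the dendrogram.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Made-up two-dimensional examples, purely for illustration
X = np.array([[1.0, 1.0], [1.5, 1.2], [5.0, 5.0],
              [5.2, 4.8], [9.0, 1.0], [9.1, 1.3]])

# Build the whole hierarchy; no number of clusters K is specified here
Z = linkage(X, method="single", metric="euclidean")

# Visualise the hierarchy as a dendrogram (tree)
dendrogram(Z)
plt.ylabel("Height (dissimilarity)")
plt.show()

# Only after inspecting the dendrogram do we commit to a particular K
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)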

1 / 34
Types of Hierarchical Clustering

- Agglomerative (bottom-up, agglomerative nesting)
  - Each example starts as a single-element cluster (leaf).
  - At each step, the two clusters that are the most similar are combined into a new, bigger cluster (node).
  - Iterate until all points are members of a single big cluster (root).
- Divisive (top-down, divisive analysis)
  - Start with a single cluster (root) containing all examples.
  - At each step, the most heterogeneous cluster is divided into two.
  - Iterate until each example is in its own cluster.

2 / 34
Measuring Similarity

- We need a way of measuring the similarity between two clusters.
- As we discussed for K-means, we can measure the (dis)similarity of observations using a distance measure, such as the Euclidean distance.
- How do we measure the dissimilarity between two clusters of observations?
- A number of different cluster agglomeration methods (i.e., linkage methods) have been proposed:
  - Minimum or single linkage
  - Maximum or complete linkage
  - Mean or average linkage
  - Inter-cluster variance (Ward)

3 / 34
Linkage Methods

- The linkage method determines the metric used for merging clusters. Two clusters are merged if they are close.
- All pairwise distances between the observations in cluster A and the observations in cluster B are computed.

Linkage    Description
Single     Minimal inter-cluster distance: the shortest distance between observations of the two clusters.
Complete   Maximal inter-cluster distance: the maximum distance between observations of the two clusters.
Average    Mean inter-cluster distance: the average distance between observations of the two clusters.
Ward       Inter-cluster variance: merge the pair of clusters that gives the smallest increase in the total within-cluster sum of squared differences.

A small code sketch of these criteria follows below.
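
To make the criteria concrete, this sketch (an illustration, not from the slides) computes the single, complete and average linkage distances between two small made-up clusters with NumPy, together with Ward's increase in the within-cluster sum of squares.

import numpy as np

def pairwise_distances(A, B):
    # All Euclidean distances between observations in cluster A and cluster B
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def single_linkage(A, B):    # minimal inter-cluster distance
    return pairwise_distances(A, B).min()

def complete_linkage(A, B):  # maximal inter-cluster distance
    return pairwise_distances(A, B).max()

def average_linkage(A, B):   # mean inter-cluster distance
    return pairwise_distances(A, B).mean()

def ward_increase(A, B):
    # Increase in the within-cluster sum of squared differences caused by merging A and B
    def ss(X):
        return ((X - X.mean(axis=0)) ** 2).sum()
    return ss(np.vstack([A, B])) - ss(A) - ss(B)

# Two made-up clusters of two-dimensional observations
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[4.0, 0.0], [5.0, 1.0]])
print(single_linkage(A, B), complete_linkage(A, B), average_linkage(A, B), ward_increase(A, B))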

4 / 34
Agglomerative Clustering
Pseudo-code

Input: a set of examples without labels

- Let each example form one cluster. For N examples, this means creating N clusters, each containing a single example.
- Find the pair of clusters with the smallest cluster-to-cluster distance.
- Merge the two clusters into one, thus reducing the total number of clusters by one (to N − 1 after the first merge).
- Iterate until all examples are members of a single big cluster.

A runnable sketch of this procedure follows below.
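
A naive, runnable version of this pseudo-code (my own sketch, using single linkage and made-up points; O(N^3) and written for clarity rather than efficiency):

import numpy as np

def agglomerative_single_linkage(X):
    # X: (N, d) array of unlabelled examples; returns the sequence of merges
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    clusters = [[i] for i in range(n)]   # one single-element cluster per example
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest cluster-to-cluster distance
        best = (np.inf, None, None)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Merge the two clusters into one, reducing the number of clusters by one
        merges.append((d, clusters[a][:], clusters[b][:]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Made-up points, purely for illustration
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
for d, A, B in agglomerative_single_linkage(X):
    print("merge", A, "and", B, "at distance", round(d, 2))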

5 / 34
Agglomerative Clustering
Intuitive simple example

(Two figures: the raw data and the corresponding dendrogram.)

6 / 34
Real-world Dataset
Violent Crime Rates by US State

Statistics, in arrests per 100,000 residents, in each of the 50 US states in 1973.

- Murder: murder arrests (per 100,000)
- Assault: assault arrests (per 100,000)
- UrbanPop: percent of urban population
- Rape: rape arrests (per 100,000)

7 / 34
Dendrogram after Agglomerative Clustering
Violent Crime Rates by US State

- Each leaf corresponds to one observation.
- As we move up the tree, observations that are similar to each other are combined into branches, which are themselves fused at a greater height.
- The vertical axis indicates dissimilarity: the greater the height of the fusion, the less similar the observations are.

8 / 34
How many clusters?
Violent Crime Rates by US State

- The height at which we cut the dendrogram controls the number of clusters obtained.
- It plays the same role as K in K-means clustering.
- In order to identify sub-groups (i.e. clusters), we need to cut the dendrogram. Here is an example of cutting the tree into 4 groups (a SciPy sketch of such a cut follows below).
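
In code, cutting the tree into a fixed number of groups might look like the sketch below; the file name usarrests.csv, its column layout and the use of complete linkage are assumptions made for illustration.

import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore

# Hypothetical file with the crime data described earlier (one row per US state)
df = pd.read_csv("usarrests.csv", index_col=0)
X = zscore(df[["Murder", "Assault", "UrbanPop", "Rape"]].values)  # standardise the features

Z = linkage(X, method="complete")  # build the hierarchy (complete linkage assumed here)

# Cut the tree into exactly 4 groups; this fixes K after the dendrogram has been inspected
labels = fcluster(Z, t=4, criterion="maxclust")
print(pd.Series(labels, index=df.index).head())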

9 / 34
Agglomerative Clustering
Synthetic small dataset

(Scatter plot of the ten labelled data points.)

10 / 34
Hierarchical Clustering
Synthetic small dataset

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
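
The matrix above can be fed directly to SciPy. The sketch below (an illustration, not part of the slides) runs single linkage, the method used in the walkthrough on the following pages, and prints the merge heights; these should match the distances d in the step tables (1.0, 1.0, 1.4, 2.0, 2.2, 2.8, 2.8, 3.2, 3.6).

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# The symmetric 10 x 10 distance matrix from the slide above
D = np.array([
    [ 0.0,  1.0,  3.6,  4.0,  6.0,  6.1, 10.0,  8.6,  8.6,  5.4],
    [ 1.0,  0.0,  2.8,  3.0,  5.0,  5.1,  9.2,  7.8,  8.1,  5.1],
    [ 3.6,  2.8,  0.0,  2.2,  3.6,  4.2,  6.4,  5.0,  5.4,  3.2],
    [ 4.0,  3.0,  2.2,  0.0,  2.0,  2.2,  7.2,  5.8,  7.1,  5.4],
    [ 6.0,  5.0,  3.6,  2.0,  0.0,  1.0,  6.3,  5.1,  7.1,  6.4],
    [ 6.1,  5.1,  4.2,  2.2,  1.0,  0.0,  7.3,  6.1,  8.1,  7.2],
    [10.0,  9.2,  6.4,  7.2,  6.3,  7.3,  0.0,  1.4,  3.2,  6.1],
    [ 8.6,  7.8,  5.0,  5.8,  5.1,  6.1,  1.4,  0.0,  2.8,  5.0],
    [ 8.6,  8.1,  5.4,  7.1,  7.1,  8.1,  3.2,  2.8,  0.0,  3.6],
    [ 5.4,  5.1,  3.2,  5.4,  6.4,  7.2,  6.1,  5.0,  3.6,  0.0],
])

# linkage expects a condensed distance vector, obtained here with squareform
Z = linkage(squareform(D), method="single")
print(Z[:, 2])  # heights of the successive merges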

11 / 34
Hierarchical Clustering
Single (minimal) distance

(Scatter plot of the ten labelled data points.)

12 / 34
Hierarchical Clustering

Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.

13 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}

14 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

15 / 34
Hierarchical Clustering

Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.

16 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}

17 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

18 / 34
Hierarchical Clustering

Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.
d = 2.0, k = 6: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
    Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5)=2.0.

19 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}

20 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

21 / 34
Hierarchical Clustering

Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.
d = 2.0, k = 6: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
    Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5)=2.0.
d = 2.2, k = 5: {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}
    Merge {3} and {4, 5, 6} since 3 and 4 are the closest: d(3,4)=2.2.

22 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}

23 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

24 / 34
Hierarchical Clustering

Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.
d = 2.0, k = 6: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
    Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5)=2.0.
d = 2.2, k = 5: {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}
    Merge {3} and {4, 5, 6} since 3 and 4 are the closest: d(3,4)=2.2.
d = 2.8, k = 3: {1, 2, 3, 4, 5, 6}, {7, 8, 9}, {10}
    Merge {1, 2} and {3, 4, 5, 6} as well as {7, 8} and {9}, since 2 and 3 as well as 8 and 9 are the closest: d(2,3)=2.8 and d(8,9)=2.8.

25 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2, 3, 4, 5, 6}, {7, 8, 9}, {10}

26 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

27 / 34
Hierarchical Clustering
Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.
d = 2.0, k = 6: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
    Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5)=2.0.
d = 2.2, k = 5: {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}
    Merge {3} and {4, 5, 6} since 3 and 4 are the closest: d(3,4)=2.2.
d = 2.8, k = 3: {1, 2, 3, 4, 5, 6}, {7, 8, 9}, {10}
    Merge {1, 2} and {3, 4, 5, 6} as well as {7, 8} and {9}, since 2 and 3 as well as 8 and 9 are the closest: d(2,3)=2.8 and d(8,9)=2.8.
d = 3.2, k = 2: {1, 2, 3, 4, 5, 6, 10}, {7, 8, 9}
    Merge {1, 2, 3, 4, 5, 6} and {10} since 3 and 10 are the closest: d(3,10)=3.2.

28 / 34
Hierarchical Clustering

Distance Matrix
1 2 3 4 5 6 7 8 9 10
1 0.0
2 1.0 0.0
3 3.6 2.8 0.0
4 4.0 3.0 2.2 0.0
5 6.0 5.0 3.6 2.0 0.0
6 6.1 5.1 4.2 2.2 1.0 0.0
7 10.0 9.2 6.4 7.2 6.3 7.3 0.0
8 8.6 7.8 5.0 5.8 5.1 6.1 1.4 0.0
9 8.6 8.1 5.4 7.1 7.1 8.1 3.2 2.8 0.0
10 5.4 5.1 3.2 5.4 6.4 7.2 6.1 5.0 3.6 0.0

Clusters: {1, 2, 3, 4, 5, 6, 10}, {7, 8, 9}

29 / 34
Hierarchical Clustering

(Scatter plot of the ten data points at this stage of the merging.)

30 / 34
Hierarchical Clustering
Merging steps (d: merge distance, k: number of clusters):

d = 0.0, k = 10: {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}, {10}
    Start with each observation as one cluster.
d = 1.0, k = 8: {1, 2}, {3}, {4}, {5, 6}, {7}, {8}, {9}, {10}
    Merge {1} and {2} as well as {5} and {6}, since they are the closest: d(1,2)=1 and d(5,6)=1.
d = 1.4, k = 7: {1, 2}, {3}, {4}, {5, 6}, {7, 8}, {9}, {10}
    Merge {7} and {8} since they are the closest: d(7,8)=1.4.
d = 2.0, k = 6: {1, 2}, {3}, {4, 5, 6}, {7, 8}, {9}, {10}
    Merge {4} and {5, 6} since 4 and 5 are the closest: d(4,5)=2.0.
d = 2.2, k = 5: {1, 2}, {3, 4, 5, 6}, {7, 8}, {9}, {10}
    Merge {3} and {4, 5, 6} since 3 and 4 are the closest: d(3,4)=2.2.
d = 2.8, k = 3: {1, 2, 3, 4, 5, 6}, {7, 8, 9}, {10}
    Merge {1, 2} and {3, 4, 5, 6} as well as {7, 8} and {9}, since 2 and 3 as well as 8 and 9 are the closest: d(2,3)=2.8 and d(8,9)=2.8.
d = 3.2, k = 2: {1, 2, 3, 4, 5, 6, 10}, {7, 8, 9}
    Merge {1, 2, 3, 4, 5, 6} and {10} since 3 and 10 are the closest: d(3,10)=3.2.
d = 3.6, k = 1: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    Merge the remaining two clusters: d(9,10)=3.6.
31 / 34
Hierarchical Clustering
Single Linkage Cluster Dendrogram
(Single-linkage dendrogram of the ten synthetic points; the heights of the merges correspond to the distances d in the table above, from 1.0 up to 3.6.)
32 / 34
Hierarchical Clustering

(Dendrograms of the synthetic dataset under single, complete and average linkage, shown side by side; the leaf orderings and merge heights differ across the three linkage methods. A SciPy sketch reproducing these plots follows.)
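
Plots like these could be reproduced with SciPy from the distance matrix given earlier; in the sketch below (an illustration, not from the slides) the condensed vector simply lists the upper-triangular distances d(i, j) of that matrix row by row.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# Condensed distance vector: the upper triangle of the matrix from the earlier slides
condensed = np.array([
    1.0, 3.6, 4.0, 6.0, 6.1, 10.0, 8.6, 8.6, 5.4,   # d(1, 2..10)
    2.8, 3.0, 5.0, 5.1, 9.2, 7.8, 8.1, 5.1,         # d(2, 3..10)
    2.2, 3.6, 4.2, 6.4, 5.0, 5.4, 3.2,              # d(3, 4..10)
    2.0, 2.2, 7.2, 5.8, 7.1, 5.4,                   # d(4, 5..10)
    1.0, 6.3, 5.1, 7.1, 6.4,                        # d(5, 6..10)
    7.3, 6.1, 8.1, 7.2,                             # d(6, 7..10)
    1.4, 3.2, 6.1,                                  # d(7, 8..10)
    2.8, 5.0,                                       # d(8, 9..10)
    3.6,                                            # d(9, 10)
])

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, method in zip(axes, ["single", "complete", "average"]):
    Z = linkage(condensed, method=method)
    dendrogram(Z, labels=[str(i) for i in range(1, 11)], ax=ax)  # leaves labelled 1..10
    ax.set_title(method.capitalize() + " linkage")
    ax.set_ylabel("Height")
plt.tight_layout()
plt.show()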
33 / 34
Conclusions

- Hierarchical clustering seeks to build a hierarchy of clusters.
- We can decide on the number of clusters after exploring the dendrogram.
- Unsupervised learning is important for understanding the variation and grouping structure of a set of unlabeled data, and can be a useful pre-processor for supervised learning.
- It is intrinsically more difficult than supervised learning because there is no gold standard (like an outcome variable) and no single objective (like test set accuracy).

34 / 34
