Data Mining and Machine Learning: Fundamental Concepts and Algorithms
1 Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
2 Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Zaki & Meira Jr. (RPI and UFMG) Data Mining and Machine Learning Chapter 15: Density-based Clustering 1/
Density-based Clustering
Density-based methods are able to mine nonconvex clusters, where distance-based
methods may have difficulty.
[Figure: scatter plot of the density-based dataset in the X1-X2 plane, with X1 ranging from 0 to 600 and X2 from 20 to 395.]
The DBSCAN Approach
Neighborhood and Core Points
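Define the ε-neighborhood of a point x as the ball of radius ε around it:
$$N_\epsilon(x) = \{ y \mid \delta(x, y) \le \epsilon \}$$
Here δ(x, y) represents the distance between points x and y, which is usually
assumed to be the Euclidean distance.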
We say that x is a core point if there are at least minpts points in its
ε-neighborhood, i.e., if |N_ε(x)| ≥ minpts.
A border point does not meet the minpts threshold, i.e., |N_ε(x)| < minpts, but it
belongs to the ε-neighborhood of some core point z, that is, x ∈ N_ε(z).
If a point is neither a core nor a border point, then it is called a noise point or an
outlier.
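The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy points, and the brute-force neighborhood search are hypothetical, and the distance is taken to be Euclidean.

```python
import math

def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def classify_points(points, eps, minpts):
    """Label every point as 'core', 'border', or 'noise'."""
    neighborhoods = [eps_neighborhood(points, i, eps) for i in range(len(points))]
    core = {i for i, nb in enumerate(neighborhoods) if len(nb) >= minpts}
    labels = []
    for i, nb in enumerate(neighborhoods):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):  # x lies in the neighborhood of a core point z
            labels.append("border")
        else:
            labels.append("noise")
    return labels

# Hypothetical toy data: a tight group, one point on its fringe, one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (8.0, 8.0)]
labels = classify_points(points, eps=1.0, minpts=3)
print(labels)
```

With these toy values the four corner points are core, (2, 1) is a border point because it lies within ε of the core point (1, 1), and (8, 8) is noise.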
Core, Border and Noise Points
[Figure: the ε-neighborhood of a point x, and labeled points x, y, and z illustrating core, border, and noise points.]
The DBSCAN Approach
Reachability and Density-based Cluster
DBSCAN Density-based Clustering Algorithm
DBSCAN Algorithm
dbscan (D, ε, minpts):
    Core ← ∅
    foreach x_i ∈ D do // find the core points
        compute N_ε(x_i)
        id(x_i) ← ∅ // cluster id for x_i
        if |N_ε(x_i)| ≥ minpts then Core ← Core ∪ {x_i}
    k ← 0 // cluster id
    foreach x_i ∈ Core, such that id(x_i) = ∅ do
        k ← k + 1
        id(x_i) ← k // assign x_i to cluster id k
        DensityConnected(x_i, k)
    C ← {C_i} for i = 1, ..., k, where C_i ← {x ∈ D | id(x) = i}
    Noise ← {x ∈ D | id(x) = ∅}
    Border ← D \ {Core ∪ Noise}
    return C, Core, Border, Noise
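A minimal Python sketch of the pseudocode above follows. The slides do not show the body of DensityConnected, so the expansion step here is an assumption: it spreads the cluster id to every unassigned neighbor and continues only through core points. The brute-force neighborhood search, the use of `None` for the empty id, and the toy data are likewise illustrative choices.

```python
import math

def dbscan(points, eps, minpts):
    """Return (ids, core, border, noise); ids[i] is None for unassigned points."""
    n = len(points)
    # Compute the eps-neighborhood of every point and collect the core points.
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = {i for i in range(n) if len(nbrs[i]) >= minpts}
    ids = [None] * n
    k = 0  # cluster id
    for i in sorted(core):
        if ids[i] is not None:
            continue
        k += 1
        ids[i] = k
        # Assumed DensityConnected step: give id k to everything density
        # reachable from i, expanding further only through core points.
        stack = [i]
        while stack:
            x = stack.pop()
            for y in nbrs[x]:
                if ids[y] is None:
                    ids[y] = k
                    if y in core:
                        stack.append(y)
    noise = {i for i in range(n) if ids[i] is None}
    border = set(range(n)) - core - noise
    return ids, core, border, noise

# Hypothetical toy data: two groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (8.0, 8.0)]
ids, core, border, noise = dbscan(points, eps=1.0, minpts=3)
print(ids, core, border, noise)
```

In this sketch a border point that is reachable from two clusters keeps the id of whichever cluster reaches it first, which is one of several reasonable conventions.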
Density-based Clusters
ε = 15 and minpts = 10
[Figure: clusters found by DBSCAN on the density-based dataset in the X1-X2 plane (X1 from 0 to 600, X2 from 20 to 395); each group of points is drawn with its own marker.]
DBSCAN Clustering
Iris Dataset
[Figure: two panels showing DBSCAN clusterings of the Iris data; horizontal axis X1 from 4 to 7, vertical axis from 2 to 4.]
Kernel Density Estimation
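Univariate Density Estimation
Estimate the density at x from the empirical cumulative distribution function F̂,
using a window of width h centered at x:
$$\hat{f}(x) = \frac{\hat{F}\left(x + \frac{h}{2}\right) - \hat{F}\left(x - \frac{h}{2}\right)}{h} = \frac{k/n}{h} = \frac{k}{nh}$$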
where k is the number of points that lie in the window of width h centered at x.
The density estimate is the ratio of the fraction of the points in the window (k/n)
to the volume of the window (h).
Kernel Estimator
Kernel density estimation relies on a kernel function K that is non-negative,
symmetric, and integrates to 1, that is, K(x) ≥ 0, K(−x) = K(x) for all values x,
and ∫ K(x) dx = 1.
Discrete Kernel: Define the discrete kernel function K, that computes the
number of points in a window of width h:
$$K(z) = \begin{cases} 1 & \text{if } |z| \le \frac{1}{2} \\ 0 & \text{otherwise} \end{cases}$$
The density estimate fˆ(x) can be rewritten in terms of the kernel function as
follows:
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$
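As a small check of the kernel form of the estimate, the sketch below evaluates it with the discrete kernel and compares it with the direct count k/(nh). The sample values, the window width, and the function names are hypothetical.

```python
def discrete_kernel(z):
    """K(z) = 1 if |z| <= 1/2, and 0 otherwise."""
    return 1.0 if abs(z) <= 0.5 else 0.0

def kde(x, data, h, kernel):
    """Kernel density estimate: 1/(n h) * sum_i K((x - x_i) / h)."""
    n = len(data)
    return sum(kernel((x - xi) / h) for xi in data) / (n * h)

data = [4.2, 5.0, 5.5, 5.9, 8.0]  # hypothetical 1D sample
h = 2.0                           # window width
x = 5.0                           # window center

# Direct count of the points inside the window [x - h/2, x + h/2].
k = sum(1 for xi in data if abs(x - xi) <= h / 2)
estimate = kde(x, data, h, discrete_kernel)
print(k, estimate, k / (len(data) * h))
```

Four of the five sample points fall in the window, so both expressions give 4 / (5 · 2) = 0.4.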
Kernel Density Estimation
Discrete Kernel (Iris 1D)
[Figure: four panels plotting the discrete-kernel density estimate f(x) against x (from 4 to 8) for the 1D Iris data.]
The width h is a parameter that denotes the spread or smoothness of the density
estimate. The discrete kernel function has abrupt changes.
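A smoother transition of influence can be defined via a Gaussian kernel:
$$K(z) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{z^2}{2}\right\}$$
Thus, we have
$$K\left(\frac{x - x_i}{h}\right) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{(x - x_i)^2}{2h^2}\right\}$$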
Here x, which is at the center of the window, plays the role of the mean, and h
acts as the standard deviation.
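The Gaussian-kernel estimate can be sketched as follows. The sample, the bandwidth, and the evaluation grid are hypothetical; the Riemann sum at the end is only a rough check that the estimate integrates to about 1.

```python
import math

def gaussian_kernel(z):
    """K(z) = exp(-z^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def gaussian_kde(x, data, h):
    """Density estimate at x: 1/(n h) * sum_i K((x - x_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)

data = [4.9, 5.0, 5.1, 6.3, 6.5]  # hypothetical 1D sample
h = 0.25                          # bandwidth, acting as the standard deviation

step = 0.01
grid = [3.0 + step * i for i in range(601)]  # 3.00, 3.01, ..., 9.00
density = [gaussian_kde(x, data, h) for x in grid]

# Rough numerical check that the estimate is a proper density.
area = sum(density) * step
print(round(area, 4))
```

Each sample point contributes a Gaussian bump centered on itself with standard deviation h, and the estimate is the average of those bumps.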
Kernel Density Estimation
Gaussian Kernel (Iris 1D)
[Figure: four panels plotting the Gaussian-kernel density estimate f(x) against x (from 4 to 8) for the 1D Iris data, including (a) h = 0.1 and (b) h = 0.15.]
When h is small the density function has many local maxima. A large h results in a
unimodal distribution.
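The effect of h on the number of local maxima can be illustrated with a short sketch. The two-group sample, the two bandwidths, and the grid-based peak counting are assumptions chosen for illustration, not values from the slides.

```python
import math

def gaussian_kde(x, data, h):
    """Gaussian-kernel density estimate at x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

def count_local_maxima(data, h, lo, hi, steps=1500):
    """Count interior local maxima of the estimate on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    f = [gaussian_kde(x, data, h) for x in xs]
    # A grid point is a peak if it rises above its left neighbor and does not
    # fall below its right neighbor (so an exact tie is counted only once).
    return sum(1 for i in range(1, steps)
               if f[i] > f[i - 1] and f[i] >= f[i + 1])

# Hypothetical sample with two well-separated groups.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
modes_small_h = count_local_maxima(data, h=0.5, lo=-5.0, hi=10.0)
modes_large_h = count_local_maxima(data, h=10.0, lo=-5.0, hi=10.0)
print(modes_small_h, modes_large_h)
```

With the small bandwidth the estimate keeps a separate peak for each group, while the large bandwidth smooths them into a single peak.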
Multivariate Density Estimation
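In d dimensions, the window is a hypercube H_d(h) with edge length h, whose volume is
$$\mathrm{vol}(H_d(h)) = h^d$$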
The density is estimated as the fraction of the point weight lying within the
d-dimensional window centered at x, divided by the volume of the hypercube:
$$\hat{f}(x) = \frac{1}{nh^d} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$
where the multivariate kernel function K satisfies the condition ∫ K(z) dz = 1.
Multivariate Density Estimation
Discrete and Gaussian Kernel
$$K(z) = \begin{cases} 1 & \text{if } |z_j| \le \frac{1}{2} \text{ for all dimensions } j = 1, \ldots, d \\ 0 & \text{otherwise} \end{cases}$$
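A minimal sketch of the multivariate estimate with the discrete kernel is given below. The 2D sample, the edge length h, and the function names are hypothetical.

```python
def hypercube_kernel(z):
    """K(z) = 1 if |z_j| <= 1/2 in every dimension j, and 0 otherwise."""
    return 1.0 if all(abs(zj) <= 0.5 for zj in z) else 0.0

def kde_multivariate(x, data, h, kernel):
    """Density estimate: 1/(n h^d) * sum_i K((x - x_i) / h)."""
    n, d = len(data), len(x)
    total = sum(kernel([(a - b) / h for a, b in zip(x, xi)]) for xi in data)
    return total / (n * h ** d)

# Hypothetical 2D sample: three nearby points and one far away.
data = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.3), (3.0, 3.0)]
estimate = kde_multivariate((1.0, 1.0), data, h=2.0, kernel=hypercube_kernel)
print(estimate)
```

Three of the four points lie inside the square of edge length 2 centered at (1, 1), so the estimate is 3 / (4 · 2²) = 0.1875.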
Density Estimation
Iris 2D Data (Gaussian Kernel)
[Figure: Gaussian-kernel density estimates for the 2D Iris data, with the data points overlaid.]
Density Estimation
Gaussian kernel, h = 20
[Figure: two panels plotted over axes X1 and X2 (Gaussian kernel, h = 20).]
Nearest Neighbor Density Estimation
In kernel density estimation we implicitly fixed the volume by fixing the width h,
and we used the kernel function to find out the number or weight of points that
lie inside the fixed volume region.
An alternative approach to density estimation is to fix k, the number of points
required to estimate the density, and allow the volume of the enclosing region to
vary to accommodate those k points. This approach is called the k nearest
neighbors (KNN) approach to density estimation.
Given k, the number of neighbors, we estimate the density at x as follows:
$$\hat{f}(x) = \frac{k}{n \, \mathrm{vol}(S_d(h_x))}$$
where $h_x$ is the distance from x to its kth nearest neighbor, and $\mathrm{vol}(S_d(h_x))$ is the
volume of the d-dimensional hypersphere $S_d(h_x)$ centered at x, with radius $h_x$.
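As a minimal sketch of the KNN density estimate above, the following Python code computes k / (n vol(S_d(h_x))) for a query point. The function names `hypersphere_volume` and `knn_density` are hypothetical, and the sketch assumes Euclidean distance and the standard closed form for the volume of a d-dimensional ball, neither of which is stated on the slide.

```python
import math
import numpy as np

def hypersphere_volume(radius, d):
    """Volume of a d-dimensional ball: pi^(d/2) / Gamma(d/2 + 1) * radius^d."""
    return (math.pi ** (d / 2) / math.gamma(d / 2 + 1)) * radius ** d

def knn_density(x, data, k):
    """KNN density estimate at x: k / (n * vol(S_d(h_x))).

    Assumes Euclidean distance. If x coincides with a data point and k = 1,
    h_x is zero and the estimate is undefined (division by zero).
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    n, d = data.shape
    # Sorted distances from x to every point in the dataset
    dists = np.sort(np.linalg.norm(data - x, axis=1))
    h_x = dists[k - 1]  # distance to the kth nearest neighbor
    return k / (n * hypersphere_volume(h_x, d))

# Usage: four points on a line, query at 0.5 with k = 2
estimate = knn_density([0.5], [[0.0], [1.0], [2.0], [3.0]], k=2)
```

Note that the region grows where the data is sparse and shrinks where it is dense, so the estimate adapts to the local density without a fixed bandwidth h.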
DENCLUE Density-based Clustering
Attractor and Gradient
$$\nabla \hat{f}(x) = \frac{1}{n h^{d+2}} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \cdot (x_i - x)$$
This equation can be thought of as having two parts for each point: a vector
$(x_i - x)$ and a scalar influence value $K\left(\frac{x - x_i}{h}\right)$.
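The gradient formula can be sketched in a few lines of Python. This is an illustrative sketch only: the names `gaussian_kernel` and `density_gradient` are hypothetical, and the kernel is assumed to be the standard multivariate Gaussian K(z) = (2π)^{-d/2} exp(-zᵀz / 2).

```python
import numpy as np

def gaussian_kernel(z):
    """Assumed standard Gaussian kernel, evaluated row-wise on an (n, d) array."""
    z = np.atleast_2d(z)
    d = z.shape[1]
    return np.exp(-0.5 * np.sum(z * z, axis=1)) / (2 * np.pi) ** (d / 2)

def density_gradient(x, data, h):
    """Gradient of the kernel density estimate at x.

    Computes 1 / (n h^(d+2)) * sum_i K((x - x_i) / h) * (x_i - x).
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    n, d = data.shape
    weights = gaussian_kernel((x - data) / h)  # scalar influence per point, shape (n,)
    diffs = data - x                           # difference vectors x_i - x, shape (n, d)
    return (weights[:, None] * diffs).sum(axis=0) / (n * h ** (d + 2))

# Usage: a single point to the right of x pulls the gradient to the right
g = density_gradient([0.0, 0.0], [[1.0, 0.0]], h=1.0)
```

Points placed symmetrically around x cancel each other, so the gradient vanishes there, which is consistent with x being a stationary point of the density.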
The Gradient Vector
We first compute the direction from x toward $x_i$, i.e., the difference vector $(x_i - x)$.
Next, we scale it using the Gaussian kernel value $K\left(\frac{x - x_i}{h}\right)$ as the weight.
$\nabla \hat{f}(x)$ is the net influence at x, i.e., the weighted sum of the difference vectors.
[Figure: the gradient vector $\nabla \hat{f}(x)$ at x, shown with the data points x1, x2, and x3.]
DENCLUE: Density Attractor
In the iterative update rule, t denotes the current iteration and $x_{t+1}$ is the updated value for the
current vector $x_t$.
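The update formula itself is not present in this text. One fixed-point rule that matches the description, obtained by setting the gradient of the Gaussian kernel density estimate to zero, is $x_{t+1} = \sum_i K\left(\frac{x_t - x_i}{h}\right) x_i \,/\, \sum_i K\left(\frac{x_t - x_i}{h}\right)$. The sketch below iterates that rule as an assumption, not as a transcription of the slide; the function name `find_attractor` and the parameters `eps` and `max_iter` are hypothetical.

```python
import numpy as np

def find_attractor(x, data, h, eps=1e-6, max_iter=1000):
    """Iterate an assumed kernel-weighted mean update until the step is below eps.

    x_{t+1} = sum_i K((x_t - x_i)/h) x_i / sum_i K((x_t - x_i)/h),
    with a Gaussian kernel. If x is very far from all points relative
    to h, the weights can underflow to zero.
    """
    data = np.asarray(data, dtype=float)
    x_t = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        sq = np.sum(((x_t - data) / h) ** 2, axis=1)
        w = np.exp(-0.5 * sq)        # Gaussian weights; the constant factor cancels
        x_next = w @ data / w.sum()  # kernel-weighted mean of the data points
        if np.linalg.norm(x_next - x_t) <= eps:
            return x_next
        x_t = x_next
    return x_t

# Usage: two 1D points whose kernels merge into a single mode at 0.5
attractor = find_attractor([0.2], [[0.0], [1.0]], h=1.0)
```

Compared with plain gradient ascent, this form needs no step size, since each iterate is a weighted average of the data points.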
The DENCLUE Algorithm: Find Attractor
DENCLUE: Density-based Cluster
The DENCLUE Algorithm
DENCLUE: Iris 2D Data
Iris 2D dataset comprising the sepal length and sepal width attributes.
The results were obtained with h = 0.2 and ξ = 0.08, using a Gaussian kernel.
[Figure: density f(x) plotted over axes X1 and X2.]
DENCLUE: Density-based Dataset
Using the parameters h = 10 and $\xi = 9.5 \times 10^{-5}$, with a Gaussian kernel, we
obtain eight clusters.
[Figure: DENCLUE clustering of the density-based dataset; axes X1 (0 to 700) and X2 (0 to 500).]
Data Mining and Machine Learning:
Fundamental Concepts and Algorithms
dataminingbook.info