
Graph based Clustering

Dr. Sraban Kumar Mohanty


Computer Science and Engineering
PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, India
Outline

● Introduction to cluster analysis


● What is Graph Clustering?
● Graph Modeling & Representations
● Commonly used Graphs
○ Sparse Graph
● MST based Clustering
● Spectral Clustering
Unsupervised Learning

● When flying over a city, one can easily identify the forests,
commercial places, farmlands, riverbeds etc. based on their
features, without any explicit training.
● Class labels of the data are unknown
● Given a set of data, the task is to establish the existence
of classes or clusters in the data
What is Cluster Analysis?

● Finding groups of objects such that the objects in a group are similar (or
related) to one another and different from (or unrelated to) the objects in
other groups
Intra-cluster distances are minimized; inter-cluster distances are maximized.
Application 1: Market Segmentation
● A retail company may collect the following information on households:
• Household income
• Household size
• Occupation of the household’s head
• Distance from nearest urban area
● Identify the following clusters:
• Cluster 1: Small family, high spenders
• Cluster 2: Larger family, high spenders
• Cluster 3: Small family, low spenders
• Cluster 4: Large family, low spenders
• The company can then send personalized advertisements or sales letters to
each household based on how likely they are to respond to specific types of
advertisements.
Application 2: Document Clustering

● Document Clustering:
– Goal: To find groups of documents that are similar to each other based on the important terms appearing in them.
– Approach: To identify frequently occurring terms in each document, form a similarity measure based on the frequencies of different terms, and use it to cluster (a sketch of this approach follows).
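A minimal sketch of this approach in Python, assuming raw term frequencies and cosine similarity as the similarity measure (one common choice; the function names below are illustrative, not from the source):

from collections import Counter
import math

def term_frequencies(doc):
    # Split a document into lowercase terms and count how often each occurs.
    return Counter(doc.lower().split())

def cosine_similarity(tf_a, tf_b):
    # Cosine similarity between two term-frequency vectors.
    common = set(tf_a) & set(tf_b)
    dot = sum(tf_a[t] * tf_b[t] for t in common)
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

docs = ["graph based clustering of documents",
        "clustering documents with graphs",
        "minimum spanning tree algorithms"]
tfs = [term_frequencies(d) for d in docs]
# Pairwise similarity matrix on which any clustering algorithm can operate.
sim = [[cosine_similarity(a, b) for b in tfs] for a in tfs]
print(sim)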
Clustering: An application

● Summarization
– Reduce the size of large data sets
● In fact, clustering is one of the most utilized data
mining techniques.
– It has a long history and is used in almost every field, e.g.,
medicine, botany, sociology, biology, marketing,
insurance, libraries, etc.
What is not Clustering?

● Simple segmentation
– Dividing students into different registration groups alphabetically, by
last name

● Results of a query
– Groupings are a result of an external specification
– Clustering is a grouping of objects based on the data

● Supervised classification
– Have class label information
Notion of a Cluster can be Ambiguous

How many clusters? Two clusters, four clusters, and six clusters are all plausible answers.

(figure: the same 20 points and three different ways to divide them)

Aspects of clustering

● A clustering algorithm
– Partitional clustering
– Hierarchical clustering
– Density based clustering
– Graph based clustering
– …
● A proximity (similarity, or dissimilarity) function
● Clustering quality
– Inter-cluster distance ⇒ maximized
– Intra-cluster distance ⇒ minimized
● The quality of a clustering result depends on the algorithm, the distance
function, and the application.
Proximity Measure
2. Euclidean Distance (L2 norm: r = 2)

This metric is the usual Euclidean distance between any two points x and y in R^n:

d(x, y) = √( Σ_{i=1}^{n} (xi − yi)² )

Example: x = [7, 3, 5] and y = [3, 2, 6].

The Euclidean distance between x and y is

d(x, y) = √( (7−3)² + (3−2)² + (5−6)² ) = √18 ≈ 4.243

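A minimal sketch of this computation in plain Python (in practice numpy.linalg.norm(x − y) does the same job):

import math

def euclidean_distance(x, y):
    # L2 distance between two equal-length numeric sequences.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(euclidean_distance([7, 3, 5], [3, 2, 6]))  # sqrt(18) ≈ 4.243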

Quality of Clustering

● Quality of clustering:
– There is usually a separate "quality" function that measures the "goodness" of a cluster.
– It is hard to define "similar enough" or "good enough"
◆ The answer is typically highly subjective
● Examples: sum of squared errors (SSE), Rand index, Adjusted Rand index, Silhouette coefficient, etc.
Major Clustering Approaches

(a) Partition-based (b) Hierarchical-based

(c) Density-based (d) Graph-based


What is Graph Clustering?

Pipeline (figure): input dataset → graph representation → connected components → final clusters
Some properties of a Graph
● Formally: G=(V, E, W)
○ V = non-empty set of vertices
○ E = subset of V × V, the edges, consisting of (ordered) pairs of vertices
○ W = set of distances/weights between pairs of vertices
● Directed or undirected

(figure: examples of directed and undirected graphs)

● Degree of a node
○ Number of edges incident on it
○ Undirected degree, in-degree, out-degree
Some properties of Graph
● Walk in a graph between nodes x and y:
○ x = v0 – v1 – v2 – v3 – … – v(t−1) – v(t) = y
○ There is an edge between every pair of consecutive vertices in the sequence
○ Length of walk = number of hops = number of edges in the walk
● Closed walk: x=y
● Trail: a walk in which no edge is repeated
● Path: a walk in which no vertex is repeated (except start and end)
● Closed path: start vertex = end vertex
● Cycle = a closed path with length >= 3
● Vertices x and y connected: A path connecting x to y
● Connected graph: All vertex pairs are connected
Unweighted Graph Representation
● The graph has vertices A–F and edges A–B, A–E, B–E, C–D, C–F, D–F, E–F.

● Adjacency matrix:
      A  B  C  D  E  F
  A   0  1  0  0  1  0
  B   1  0  0  0  1  0
  C   0  0  0  1  0  1
  D   0  0  1  0  0  1
  E   1  1  0  0  0  1
  F   0  0  1  1  1  0

● Edge list: (A,B), (A,E), (B,E), (E,F), (C,F), (C,D), (D,F)

● Node list:
  A: B, E
  B: E, A
  C: D, F
  D: C, F
  E: A, B, F
  F: C, D, E
Weighted Graph Representation
● The graph has weighted edges: A–B (2), A–E (8), B–E (1), C–D (5), C–F (4), D–F (3), E–F (5).

● Adjacency matrix:
      A  B  C  D  E  F
  A   0  2  0  0  8  0
  B   2  0  0  0  1  0
  C   0  0  0  5  0  4
  D   0  0  5  0  0  3
  E   8  1  0  0  0  5
  F   0  0  4  3  5  0

● Edge list: (A,B,2), (A,E,8), (B,E,1), (E,F,5), (C,F,4), (C,D,5), (D,F,3)

● Node list:
  A: (B, 2), (E, 8)
  B: (E, 1), (A, 2)
  C: (D, 5), (F, 4)
  D: (C, 5), (F, 3)
  E: (A, 8), (B, 1), (F, 5)
  F: (C, 4), (D, 3), (E, 5)
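A minimal sketch of these three representations in Python, using the weighted example above (the dictionary-based node list is one common idiom, not the only one):

import numpy as np

vertices = ["A", "B", "C", "D", "E", "F"]
# Edge list: (u, v, weight), undirected.
edges = [("A", "B", 2), ("A", "E", 8), ("B", "E", 1),
         ("C", "D", 5), ("C", "F", 4), ("D", "F", 3), ("E", "F", 5)]

# Weighted adjacency matrix (0 means "no edge").
index = {v: i for i, v in enumerate(vertices)}
W = np.zeros((len(vertices), len(vertices)))
for u, v, w in edges:
    W[index[u], index[v]] = W[index[v], index[u]] = w

# Node (adjacency) list as a dictionary of neighbour -> weight maps.
adj = {v: {} for v in vertices}
for u, v, w in edges:
    adj[u][v] = adj[v][u] = w

print(W)
print(adj["E"])  # {'A': 8, 'B': 1, 'F': 5}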
Graph Representation
● Adjacency matrix:

Image source: https://matthewlincoln.net/2014/12/20/adjacency-matrix-plots-with-r-and-ggplot2.html
Relational Data
● Data not represented as graphs can be converted into graphs
○ Every Data Record = A Node of the graph (d1, d2, d3, . . .)
○ Every pair of nodes connected by an edge (d1,d2), . . . . , (di,dj)
○ Distance (di,dj) = weight of edge (di, dj).

● Similarity function choice is critical


○ Similarity between two drugs?
■ Induced gene expressions
■ Molecular structures
■ Sets of side-effects
■ Sets of targets
○ Each similarity metric induces a different graph

A subgraph of a drug-drug network;


(example from: http://warunika.weebly.com/research.html)
Relational Data

• Any dataset can be converted into a graph.

• The graph view provides great insights into the dataset.

(figure) Co-citation graph of the research paper: Jothi, R., Mohanty, S.K. and Ojha, A., 2018. Fast approximate minimum spanning tree based clustering algorithm. Neurocomputing, 272, pp. 542-557.
(Example source: https://www.connectedpapers.com/)
What is Graph Clustering?
● Given an undirected graph G=(V,E,W), partition the graph into k
subgraphs, on the basis of edge structure
● Each subgraph is a cluster
○ In its loosest sense, a graph cluster is a connected component
○ In its strictest sense, it’s a maximal clique of a graph
● Two points are placed in the same cluster if the distance between them is less than a certain threshold
● The goal of graph partitioning is to minimize the number of edges that cross from one subgraph of vertices to another
Graph Modeling:
How to model the data in the form of a graph?

- Given an input dataset X of dimension D, the graph G = (V, E, W) can be constructed by taking the data points as vertices; the dissimilarity between each pair of points defines the corresponding edge and its weight. A minimal sketch is given below.
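A minimal sketch of this modeling step, assuming Euclidean dissimilarity and using scipy's pairwise-distance helpers (X below is a toy dataset, not from the slides):

import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy dataset: N points in D dimensions (rows are data points / vertices).
X = np.array([[1.0, 0.8], [0.9, 0.7], [0.8, 0.8],
              [0.2, 0.1], [0.1, 0.2], [0.2, 0.2]])

# Pairwise Euclidean dissimilarities form the weighted edges of the
# complete graph G = (V, E, W): W[i, j] is the weight of edge (i, j).
W = squareform(pdist(X, metric="euclidean"))
print(W.round(2))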
Commonly used graph models:
I. K-nearest neighbors graph (KNN): an edge (u, v) belongs to the graph if v is among the K nearest neighbors of u.

It’s a directed graph.


Commonly used graph models:
The following strategies can be used to make the KNN graph undirected (a sketch follows the list):
➔ If (u, v) belongs to E, then also add the edge (v, u).
➔ Mutual KNN graph: add the edge connecting u and v only if u and v are both among the K nearest neighbors of each other.
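A minimal sketch of KNN-graph construction from a dissimilarity matrix such as W above; the symmetric ("OR") and mutual ("AND") variants differ only in how the directed relation is combined (knn_adjacency and the toy matrix are illustrative):

import numpy as np

def knn_adjacency(dist, k, mutual=False):
    # Boolean adjacency of the K-nearest-neighbour graph from a distance matrix.
    # Assumes the diagonal of dist is 0 and smaller than all off-diagonal entries.
    n = dist.shape[0]
    directed = np.zeros((n, n), dtype=bool)
    for u in range(n):
        neighbours = np.argsort(dist[u])[1:k + 1]  # skip u itself at position 0
        directed[u, neighbours] = True
    # Mutual KNN keeps (u, v) only if both directions exist; symmetric KNN keeps either.
    return directed & directed.T if mutual else directed | directed.T

dist = np.array([[0.0, 1.0, 2.0, 9.0],
                 [1.0, 0.0, 1.5, 8.0],
                 [2.0, 1.5, 0.0, 7.0],
                 [9.0, 8.0, 7.0, 0.0]])
print(knn_adjacency(dist, k=1))                # symmetric ("OR") version
print(knn_adjacency(dist, k=1, mutual=True))   # mutual ("AND") version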
Commonly used graph models:
Scanning-radius based neighborhood graph: all edges whose weights are less than the scanning radius ε belong to the graph. (Note: this assumes the weights are dissimilarities.) A sketch follows.
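A minimal sketch under the same assumption (dist is a dissimilarity matrix and eps is the scanning radius ε):

import numpy as np

def epsilon_graph(dist, eps):
    # Adjacency of the scanning-radius (epsilon-neighbourhood) graph, without self-loops.
    return (dist < eps) & ~np.eye(dist.shape[0], dtype=bool)

dist = np.array([[0.0, 1.0, 2.0, 9.0],
                 [1.0, 0.0, 1.5, 8.0],
                 [2.0, 1.5, 0.0, 7.0],
                 [9.0, 8.0, 7.0, 0.0]])
print(epsilon_graph(dist, eps=2.0))  # keeps only the edges with dissimilarity below 2.0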
Commonly used graph models:
Fully connected graph: the similarity between the points is decided by a kernel function,
e.g. the Gaussian kernel s(xi, xj) = exp(−||xi − xj||² / (2σ²)),
where σ controls how quickly the similarity decays and hence the effective sparsity.
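A minimal sketch of the fully connected Gaussian-kernel graph (gaussian_similarity and the toy data are illustrative):

import numpy as np
from scipy.spatial.distance import pdist, squareform

def gaussian_similarity(X, sigma=1.0):
    # Fully connected similarity graph: s(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2)).
    sq_dists = squareform(pdist(X, metric="sqeuclidean"))
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

X = np.array([[1.0, 0.8], [0.9, 0.7], [0.2, 0.1], [0.1, 0.2]])
print(gaussian_similarity(X, sigma=0.5).round(3))  # small sigma -> near-zero cross-cluster weights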
Commonly used graph models:
K rounds of MST: the neighborhood is decided by the closeness of the data points. Let G = (V, E) be the complete weighted undirected graph of the dataset.
● The first-round MST of G, say K1, is computed.
● The consecutive MSTs are then computed by removing the edges of the MSTs computed in the previous rounds, i.e. Ki is an MST of the graph G(V, E − (K1 ∪ K2 ∪ … ∪ K(i−1))).
● The K-round MST neighborhood graph is then defined as the union K1 ∪ K2 ∪ … ∪ KK of these MSTs. (A sketch follows.)
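A minimal sketch of this construction using networkx, one possible tooling choice (k_round_mst_graph is an illustrative name; if removing MST edges disconnects the graph, networkx returns a spanning forest in later rounds):

import networkx as nx
import numpy as np
from scipy.spatial.distance import pdist, squareform

def k_round_mst_graph(X, k):
    # Union of k successive MSTs of the complete Euclidean graph of X.
    dist = squareform(pdist(X))
    n = len(X)
    complete = nx.Graph()
    complete.add_weighted_edges_from(
        (i, j, dist[i, j]) for i in range(n) for j in range(i + 1, n))
    union = nx.Graph()
    union.add_nodes_from(range(n))
    for _ in range(k):
        mst = nx.minimum_spanning_tree(complete)        # MST (or forest) of this round
        union.add_weighted_edges_from(mst.edges(data="weight"))
        complete.remove_edges_from(mst.edges())         # exclude these edges next round
    return union

X = np.random.RandomState(0).rand(20, 2)
G2 = k_round_mst_graph(X, k=2)
print(G2.number_of_edges())  # roughly 2 * (N - 1) edges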
Minimum spanning tree

● A minimum spanning tree (MST), or minimum weight spanning tree, is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.

● Three classical algorithms are available to find the MST of the graph:
○ Boruvka's Algorithm
○ Kruskal's Algorithm
○ Prim's Algorithm
Prim’s Algorithm
• Robert Clay Prim

Prim's Algorithm:
let T be a single vertex x
while (T has fewer than n vertices)
{
find the smallest edge connecting T to G-T
add it to T
}
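A minimal runnable sketch of the same idea in Python, using a lazy heap of candidate edges (prim_mst and the adjacency-dict format are illustrative choices, not from the source):

import heapq

def prim_mst(adj, start):
    # Prim's algorithm on an adjacency dict {u: {v: weight}}; returns the MST edge list.
    in_tree = {start}
    # Heap of candidate edges (weight, u, v) leaving the current tree T.
    heap = [(w, start, v) for v, w in adj[start].items()]
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)        # smallest edge connecting T to G - T
        if v in in_tree:
            continue
        in_tree.add(v)
        mst.append((u, v, w))
        for nxt, nw in adj[v].items():
            if nxt not in in_tree:
                heapq.heappush(heap, (nw, v, nxt))
    return mst

# Weighted graph from the earlier representation slide.
adj = {"A": {"B": 2, "E": 8}, "B": {"A": 2, "E": 1}, "C": {"D": 5, "F": 4},
       "D": {"C": 5, "F": 3}, "E": {"A": 8, "B": 1, "F": 5}, "F": {"C": 4, "D": 3, "E": 5}}
print(prim_mst(adj, "A"))  # five edges spanning all six vertices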
Prim's Algorithm - Example

(figure: step-by-step growth of the MST on a weighted example graph, adding the smallest edge leaving the tree at each step)
Minimum spanning tree based clustering

Steps for MST-based clustering (input dataset: X, number of clusters: K); a sketch follows the steps:

1. Let X be the dataset of size N and dimension D.
2. Construct the sparse graph G = (V, E), where the data points of X are the vertices V and E is the set of weighted edges between them.
3. Construct an MST of graph G.
4. Remove the K−1 inconsistent edges from the MST.
5. Find the connected components; they represent the K clusters.
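A minimal sketch of these steps with scipy, taking "inconsistent" to mean "largest weight" (one common choice) and using the complete Euclidean graph in place of the sparse graph of step 2:

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from scipy.spatial.distance import pdist, squareform

def mst_clustering(X, k):
    # Cluster X into k groups by cutting the k-1 largest MST edges.
    W = squareform(pdist(X))                   # step 2: weighted graph (dense here)
    mst = minimum_spanning_tree(W).toarray()   # step 3: MST edge weights in a matrix
    # Step 4: remove the k-1 largest-weight MST edges ("inconsistent" edges);
    # ties would remove all equal-weight edges, which is fine for a sketch.
    cut = np.sort(mst[mst > 0])[-(k - 1):] if k > 1 else []
    for w in cut:
        mst[mst == w] = 0
    # Step 5: connected components of the remaining forest are the clusters.
    _, labels = connected_components(mst, directed=False)
    return labels

X = np.vstack([np.random.RandomState(0).randn(10, 2),
               np.random.RandomState(1).randn(10, 2) + 8])
print(mst_clustering(X, k=2))  # two well-separated blobs -> two labels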
Minimum spanning tree based clustering
I. Commonly used graph modeling: time O(N²)

(figure: input dataset → commonly used sparse graph → minimum spanning tree → MST with inconsistent edges → connected components → final clusters)
Minimum spanning tree based clustering
II. Using a sparse graph

(figure: input dataset → sparse graph → minimum spanning tree → MST with inconsistent edges → connected components → final clusters)
Spectral clustering

Pipeline (figure): input dataset → similarity graph construction → computing the Laplacian matrix → finding the smallest eigenvectors → detecting clusters

- Input: dataset X and the number of clusters K.
- Similarity graph construction: graph-based modeling; the data points represent the nodes and the edges represent the connectivities.
- Computing the Laplacian matrix: the Laplacian matrix represents the affinity and neighboring information of the graph.
- Finding the smallest eigenvectors: the K smallest eigenvectors of L are used to construct the transformed matrix U.
- Detecting clusters: apply any traditional clustering method, such as K-means, on U.

U. von Luxburg, A tutorial on spectral clustering, Stat. Comput. 17 (4) (2007) 395–416.
Graph Cut:
I. Mincut problem: select the subsets A and B so as to minimize
cut(A, B) = Σ over i in A, j in B of w(i, j),
i.e. the total weight of the edges crossing between A and B.

(figure: weighted graph on vertices u1, …, u7)

Example 1: Let A = {u1, u3, u4, u7} and B = {u2, u5, u6};
then cut(A, B) = 0.4 + 0.5 = 0.9.

Example 2: A = {u1, u5, u6, u3, u4, u7}, B = {u2};
cut(A, B) = 0.7.

Drawback: the mincut criterion may select a single node as one cluster.
Graph Cut:
II. RatioCut: find balanced clusters based on the number of vertices in each cluster:
RatioCut(A, B) = cut(A, B) · (1/|A| + 1/|B|)

(figure: the same weighted graph on vertices u1, …, u7)

Example 1: Let A = {u1, u3, u4, u7}, |A| = 4, and B = {u2, u5, u6}, |B| = 3;
then cut(A, B) = 0.4 + 0.5 = 0.9 and RatioCut(A, B) = 0.9 · (1/4 + 1/3) = 0.525.

Example 2: A = {u1, u5, u6, u3, u4, u7}, B = {u2};
then cut(A, B) = 0.7 and RatioCut(A, B) = 0.7 · (1/6 + 1) ≈ 0.82.
Graph Cut:
III. NCut: find balanced clusters based on the degree sum (volume) of the nodes within each cluster:
NCut(A, B) = cut(A, B) · (1/vol(A) + 1/vol(B))

(figure: the same weighted graph on vertices u1, …, u7)

Let A = {u1, u3, u4, u7}.
|A| = number of nodes in A = 4
vol(A) = sum of the degrees of the nodes in A
       = deg(u1) + deg(u3) + deg(u4) + deg(u7)
       = 1.9 + 2.5 + 3.0 + 1.7 = 9.1
Graph Cut:
Comparison on the example graph (figure: weighted graph on vertices u1, …, u7); a sketch for computing these quantities follows the table.

Partition                                    Mincut   RatioCut                   NCut
A = {u1, u3, u4, u7}, B = {u2, u5, u6}       0.9      0.9/4 + 0.9/3 = 0.525      0.9/9.1 + 0.9/4.3 = 0.308
A = {u1, u5, u6, u3, u4, u7}, B = {u2}       0.7      0.7/6 + 0.7/1 = 0.8167     0.7/12.7 + 0.7/0.7 = 1.05
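A minimal sketch of the three criteria for an arbitrary weighted adjacency matrix; the tiny toy graph below is illustrative only, not the exact graph from the figure:

import numpy as np

def cut(W, A, B):
    # Total weight of edges crossing between vertex sets A and B.
    return W[np.ix_(A, B)].sum()

def ratio_cut(W, A, B):
    return cut(W, A, B) * (1.0 / len(A) + 1.0 / len(B))

def ncut(W, A, B):
    degrees = W.sum(axis=1)
    vol_a, vol_b = degrees[A].sum(), degrees[B].sum()
    return cut(W, A, B) * (1.0 / vol_a + 1.0 / vol_b)

# Toy weighted graph on 4 vertices (two heavy pairs joined by one light edge).
W = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 0.8],
              [0.0, 0.0, 0.8, 0.0]])
A, B = [0, 1], [2, 3]
print(cut(W, A, B), ratio_cut(W, A, B), ncut(W, A, B))  # 0.1, 0.1, ~0.11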
Graph Cut:
➔ Both the RatioCut and NCut problems are NP-hard.
➔ The spectral clustering algorithm is a relaxation of the above graph cut problems, obtained by considering the smallest eigenvectors of the graph Laplacians.
Graph Laplacians:

I. Unnormalized graph Laplacian: L = D − W,
where D is the degree matrix and W is the adjacency matrix.

II. Normalized graph Laplacians:
A. Symmetric: Lsym = D^(−1/2) L D^(−1/2) = I − D^(−1/2) W D^(−1/2)
B. Related to random walks: Lrw = D^(−1) L = I − D^(−1) W
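A minimal sketch of these three Laplacians with numpy, assuming every vertex has nonzero degree (graph_laplacians is an illustrative name):

import numpy as np

def graph_laplacians(W):
    # Unnormalized, symmetric-normalized, and random-walk Laplacians of W.
    d = W.sum(axis=1)                     # vertex degrees
    D = np.diag(d)
    L = D - W                             # unnormalized Laplacian
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = d_inv_sqrt @ L @ d_inv_sqrt   # D^(-1/2) L D^(-1/2)
    L_rw = np.diag(1.0 / d) @ L           # D^(-1) L
    return L, L_sym, L_rw

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L, L_sym, L_rw = graph_laplacians(W)
print(L)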
Unnormalized spectral clustering:
(Relaxation of RatioCut)
Algorithm unnormalized_spectral_clustering (dataset: X, number of clusters: K):

1. Construct the sparse similarity graph G for X.
2. Find the unnormalized graph Laplacian L = D − W.
3. Compute the K smallest eigenvectors (u1, u2, …, uK) of the eigenproblem Lu = λu.
4. Form an N × K matrix U, where each eigenvector ui is stacked as a column in U.
5. Treat each row of U as a data point.
6. Apply the K-means clustering algorithm on U to get the final K clusters.

A minimal end-to-end sketch is given below.
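A minimal end-to-end sketch, assuming a fully connected Gaussian-kernel similarity graph (any of the graph models above could be substituted) and scikit-learn's KMeans for the final step:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import KMeans

def unnormalized_spectral_clustering(X, k, sigma=1.0):
    # Step 1: similarity graph (fully connected, Gaussian kernel).
    W = np.exp(-squareform(pdist(X, "sqeuclidean")) / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized Laplacian.
    L = np.diag(W.sum(axis=1)) - W
    # Steps 3 & 4: the k smallest eigenvectors form the columns of U
    # (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    # Steps 5 & 6: each row of U is a point; cluster the rows with K-means.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(15, 2), rng.randn(15, 2) + 6])
print(unnormalized_spectral_clustering(X, k=2, sigma=1.0))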
Computing the K smallest eigenvectors of L:
➔ A standard method uses QR factorization:
Algorithm k_eig_qr (input matrix: L, number of smallest eigenvectors: K):

1. Transform L into an upper Hessenberg matrix L0.
2. Apply the QR algorithm: for k = 1, 2, … repeat
   a. Decompose Lk−1 into QR factors: Qk Rk = Lk−1, where Qk is an orthogonal matrix and Rk is an upper triangular matrix.
   b. Swap the two factors to get Lk: Lk = Rk Qk.
3. Take the diagonal elements of Lk as the eigenvalues of L: λ = diag(Lk) = {λ1, λ2, …, λN}.
4. Sort the N eigenvalues and select the K smallest ones (λ'1, λ'2, …, λ'K).
5. Finally, compute the K eigenvectors (u1, u2, …, uK) corresponding to the selected K smallest eigenvalues.
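In practice the K smallest eigenvectors are usually obtained from a library eigensolver rather than a hand-written QR iteration; a minimal sketch for the dense symmetric case (numpy) and the large sparse case (scipy):

import numpy as np
from scipy.sparse.linalg import eigsh

def k_smallest_eigenvectors(L, k):
    # K eigenvectors of a symmetric matrix L with the smallest eigenvalues.
    vals, vecs = np.linalg.eigh(L)   # eigh returns eigenvalues in ascending order
    return vals[:k], vecs[:, :k]

def k_smallest_eigenvectors_sparse(L_sparse, k):
    # Same, for a large sparse Laplacian ("SA" = smallest algebraic eigenvalues;
    # fine for a sketch, shift-invert is faster for very large graphs).
    vals, vecs = eigsh(L_sparse, k=k, which="SA")
    return vals, vecs

L = np.array([[ 2., -1., -1.],
              [-1.,  2., -1.],
              [-1., -1.,  2.]])
print(k_smallest_eigenvectors(L, 2)[0])  # [0., 3.] up to numerical error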
Example: Adjacency and Degree Matrices

Adjacency matrix (W):
      0  1  2  3  4  5  6  7
  0   0  1  1  1  0  0  0  0
  1   1  0  1  1  0  0  0  0
  2   1  1  0  1  0  0  0  0
  3   1  1  1  0  0  0  1  0
  4   0  0  0  0  0  1  1  1
  5   0  0  0  0  1  0  1  1
  6   0  0  0  1  1  1  0  1
  7   0  0  0  0  1  1  1  0

Degree matrix (D):
      0  1  2  3  4  5  6  7
  0   3  0  0  0  0  0  0  0
  1   0  3  0  0  0  0  0  0
  2   0  0  3  0  0  0  0  0
  3   0  0  0  4  0  0  0  0
  4   0  0  0  0  3  0  0  0
  5   0  0  0  0  0  3  0  0
  6   0  0  0  0  0  0  4  0
  7   0  0  0  0  0  0  0  3
Unnormalized Laplacian matrix: L = D − W

      0   1   2   3   4   5   6   7
  0   3  -1  -1  -1   0   0   0   0
  1  -1   3  -1  -1   0   0   0   0
  2  -1  -1   3  -1   0   0   0   0
  3  -1  -1  -1   4   0   0  -1   0
  4   0   0   0   0   3  -1  -1  -1
  5   0   0   0   0  -1   3  -1  -1
  6   0   0   0  -1  -1  -1   4  -1
  7   0   0   0   0  -1  -1  -1   3
Computation of the K smallest eigenvectors of L

Combining the two smallest eigenvectors of L gives the transformed space U:

  Node   Smallest eigenvector   Second-smallest eigenvector
   0          -0.35                     -0.38
   1          -0.35                     -0.38
   2          -0.35                     -0.38
   3          -0.35                     -0.25
   4          -0.35                      0.38
   5          -0.35                      0.38
   6          -0.35                      0.25
   7          -0.35                      0.38

Note: in this case the smallest eigenvector is constant, so the second-smallest eigenvector alone already separates the data points into the two desired clusters (negative vs. positive entries).
Apply K-Means clustering on U to get the final clusters of X

Input dataset (X):
  Point   Feature 0   Feature 1
    0       1.00        0.80
    1       0.91        0.73
    2       0.84        0.76
    3       0.89        0.82
    4       0.72        1.00
    5       0.64        0.94
    6       0.77        0.92
    7       0.68        0.90

Transformed space U (one row per data point) with predicted cluster labels:
  Point    u1      u2     Predicted label
    0     -0.35   -0.38         1
    1     -0.35   -0.38         1
    2     -0.35   -0.38         1
    3     -0.35   -0.25         1
    4     -0.35    0.38         0
    5     -0.35    0.38         0
    6     -0.35    0.25         0
    7     -0.35    0.38         0
Other examples: Flame dataset
Heatmap of the smallest and second-smallest eigenvectors (figure).

The values of the smallest eigenvector do not vary much and are therefore not helpful for finding the clusters.

The second-smallest eigenvector separates the data points better.
Other examples: Smile dataset
Heatmap of U, i.e., the four smallest eigenvectors of L (figure)
References

● Lecture Notes for Chapter 7, Introduction to Data Mining, 2nd Edition, by Tan, Steinbach, Karpatne, Kumar. Available at: https://www-users.cs.umn.edu/~kumar001/dmbook/index.php
● Data Mining: Concepts and Techniques (3rd Edn.) by Jiawei Han, Micheline Kamber and Jian Pei, Morgan Kaufmann (2014).
● http://cse.iitkgp.ac.in/~dsamanta/courses/da/index.html#resources
● Lecture note on "Minimum Spanning Tree" by Swee-Ling Tang
Thank You
