
UNIT: 7

CLUSTERING

Unit 7: Clustering (06 hours)

Sr.No.  Topic                                                      Hours
1       Introduction, categorization of major clustering methods    1
2       Partitioning methods - K-Means                               1
3       Hierarchical method                                          1
4       Hierarchical - Agglomerative                                 1
5       Hierarchical - Divisive methods                              1
6       Model-based - Expectation                                    1
7       Model-based - Maximization                                   1

2
Clustering: Basic Idea

▷ Groups similar data together into clusters
▷ Clustering is accomplished by determining the similarity among the data on predefined attributes
▷ Most similar data are grouped into clusters
▷ Partitioning or segmentation

▷ Clustering is unsupervised, because
  ○ No previous categorization known
  ○ Totally data driven
  ○ Classes are formed after examining the data

3
Clustering : Application

▷ Marketing Management: Discover distinct groups


in customer bases, and then use this knowledge to
develop targeted marketing programs
▷ WWW
○ Cluster Weblog data to discover groups of
similar access patterns
▷ Text Mining
○ Grouping documents with similar characteristics

4
Clustering : Example

▷ A good clustering method will produce high quality clusters

[Figure: a scatter plot of points (*) falling into a few well-separated groups]

5
Similarity?

▷ Groups of similar customers
  ○ Similar demographics
  ○ Similar buying behavior
  ○ Similar health
▷ Similar products
  ○ Similar cost
  ○ Similar function
  ○ Similar store
  ○ …
▷ Similarity usually is domain/problem specific

6
Clustering (Contd.)

▷ The output of a clustering algorithm consists


of a summarized representation of each
cluster.
▷ The type of summarized representation
depends strongly on the type & shape of
clusters the algorithm computes.

7
Types of Clustering

▷ Hard Clustering:
○ Each record is in one and only one cluster
▷ Soft Clustering:
○ Each record has a probability of being in each
cluster

8
Type of Clustering Algorithms

▷ Hierarchical clustering
  ○ Divisive clustering (top down)
  ○ Agglomerative clustering (bottom up)
▷ Partitional clustering
  ○ K-means clustering
  ○ K-medoids clustering
  ○ EM (expectation maximization) clustering
▷ Density-based methods
  ○ Regions of dense points separated by sparser regions of relatively low density

9
Type of Clustering Algorithms

▷ A hierarchical clustering algorithm generates a tree of clusters.
▷ It is categorized as agglomerative or divisive.
▷ The agglomerative approach starts with each object in its own cluster; nearby clusters are repeatedly merged into larger and larger clusters, until a stopping criterion is met or all objects end up in a single cluster at the highest level of the hierarchy.
▷ The divisive approach starts by putting all data objects into one cluster and repeatedly splits clusters into smaller and smaller ones, until a stopping criterion is met or each cluster contains only one object.
10
Agglomerative algorithm

1. Bottom-up approach.
2. Initially each item x1, . . . , xn is in its own cluster C1, . . . , Cn.
3. Repeat until there is only one cluster left:
   merge the nearest clusters, say Ci and Cj.

▷ The result is a cluster tree. One can cut the tree at any level to produce different clusterings.

11
Agglomerative algorithm

1. Allocate each point to a cluster of its own, so start with n clusters for n objects.
2. Using a distance measure, at each step identify the two clusters to be merged.
   ○ Single link approach: the distance between two clusters is the smallest distance between any of their members (nearest neighbour).
   ○ Complete link approach: the distance between two clusters is the largest distance between any of their members (farthest neighbour).
   ○ There are other approaches too……

13
Agglomerative algorithm
Single link approach
3. Two clusters are merged if the minimum distance between any two of their points is <= the threshold distance:

   d(ci, cj) = min { d(x, y) : x ∈ ci, y ∈ cj }

4. Identify the pair of clusters and merge them.
5. Compute all distances from the new cluster and update the distance matrix after the merger.
6. Continue the merging process until only one cluster is left, or stop merging once the distance between clusters is above the threshold.
▷ Single linkage tends to produce stringy or elongated clusters; that is, clusters get chained together by single objects that happen to be close together.
14
Single Link Example

15
Agglomerative algorithm

Complete link approach

3. Two clusters are merged if the maximum distance between any two of their points is <= the threshold distance:

   d(ci, cj) = max { d(x, y) : x ∈ ci, y ∈ cj }

4. Identify the pair of clusters and merge them.
5. Compute all distances from the new cluster and update the distance matrix after the merger.
6. Continue the merging process until only one cluster is left, or stop merging once the distance between clusters is above the threshold.
▷ Clusters tend to be compact and roughly equal in diameter.
16
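As a concrete illustration of steps 3-6, here is a minimal pure-Python sketch of agglomerative merging on a precomputed distance matrix; passing min gives the single link variant and max the complete link variant. The distance values and labels are illustrative, not taken from the slides.

def agglomerative(dist, labels, linkage=min):
    # Start with one cluster per object (step 1 of the algorithm above).
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        # The distance between two clusters is the linkage (min or max) over all cross-pairs.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, "".join(labels[i] for i in clusters[a]),
                          "".join(labels[i] for i in clusters[b])))
        clusters[a].extend(clusters[b])   # merge the two nearest clusters
        del clusters[b]
    return merges

# Illustrative symmetric distance matrix for four objects A-D.
dist = [[0, 2, 6, 10],
        [2, 0, 3, 9],
        [6, 3, 0, 7],
        [10, 9, 7, 0]]
print(agglomerative(dist, "ABCD", linkage=min))   # single link merge order and heights
print(agglomerative(dist, "ABCD", linkage=max))   # complete link merge order and heights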
Complete Link Example

17
Agglomerative algorithm

Average link approach

   d(ci, cj) = ( Σ_{x ∈ ci} Σ_{x' ∈ cj} d(x, x') ) / ( |ci| · |cj| )

▷ Stop merging once the average distance between clusters is above the threshold.
▷ Somewhere between single-linkage and complete-linkage.
▷ and a million other ways you can think of …

18
Dendrogram: Hierarchical Clustering

• Clustering obtained by cutting the dendrogram at a desired level: each connected component forms a cluster.

19
▷ Use single and complete link
agglomerative clustering to group the
data described by the following distance
matrix. Show the dendrograms.

21
Single link

After merging A and B:

      AB   C   D
  AB   0   2   5
  C        0   3
  D            0

After merging {A,B} with C (distance 2):

      ABC  D
  ABC   0   3
  D         0

22
Complete link

After merging A and B:

      AB   C   D
  AB   0   4   6
  C        0   3
  D            0

After the next merge:

      ABC  D
  ABC   0   6
  D         0

24
complete link: distance between two clusters is the longest distance
between a pair of elements from the two clusters

25
▷ Use single and complete link agglomerative clustering to group the data described by the following distance matrix. Show the dendrograms.

      A   B   C   D   E
  A   0   1   2   2   3
  B   1   0   2   4   3
  C   2   2   0   1   5
  D   2   4   1   0   3
  E   3   3   5   3   0

26
Single link

Starting from the distance matrix above, {A,B} and {C,D} are merged first (d(A,B) = 1, d(C,D) = 1):

      AB  CD  E
  AB   0
  CD   2   0
  E    3   3   0

Then {AB} and {CD} are merged (single-link distance = 2):

      ABCD  E
  ABCD   0
  E      3   0

[Dendrogram: A-B and C-D join at height 1, AB-CD at height 2, E joins at height 3]

27
Single link

d   K   Clusters                 Comment
0   5   {A},{B},{C},{D},{E}      Start with each point as its own cluster
1   3   {AB},{CD},{E}            Merge {A},{B} and {C},{D}: d(A,B) = 1 <= 1 and d(C,D) = 1 <= 1
2   2   {ABCD},{E}               Merge {AB} and {CD}: closest cross-pair d(A,C) = 2 <= 2
3   1   {ABCDE}                  Merge {E}

28
Complete link

Starting from the same distance matrix, {A,B} and {C,D} are merged first (d(A,B) = 1, d(C,D) = 1):

      AB  CD  E
  AB   0
  CD   4   0
  E    3   5   0

Then {E} is merged with {AB} (complete-link distance = 3):

      ABE  CD
  ABE   0
  CD    5   0

[Dendrogram: A-B and C-D join at height 1, E joins AB at height 3, ABE-CD at height 5]

29
d   K   Clusters                 Comment
0   5   {A},{B},{C},{D},{E}      Start with each point as its own cluster
1   3   {AB},{CD},{E}            Merge {A},{B} and {C},{D}: d(A,B) = 1 <= 1 and d(C,D) = 1 <= 1
2   3   {AB},{CD},{E}            No merge: d(AB,E) = 3 > 2 and d(CD,E) = 5 > 2
3   2   {ABE},{CD}               Merge {E} with {AB}: d(AB,E) = 3 <= 3
4   2   {ABE},{CD}               No merge: d(ABE,CD) = 5 > 4
5   1   {ABCDE}                  Merge the remaining clusters: d(ABE,CD) = 5 <= 5

30
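The merge heights in the two tables above can be cross-checked with SciPy's hierarchical clustering (a sketch assuming SciPy and NumPy are installed; linkage() expects the condensed form of the distance matrix).

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# Distance matrix for points A-E from the exercise above.
D = np.array([[0, 1, 2, 2, 3],
              [1, 0, 2, 4, 3],
              [2, 2, 0, 1, 5],
              [2, 4, 1, 0, 3],
              [3, 3, 5, 3, 0]], dtype=float)

condensed = squareform(D)                      # condensed (upper-triangle) form
for method in ("single", "complete"):
    Z = linkage(condensed, method=method)
    print(method, "merge heights:", Z[:, 2])   # single: 1, 1, 2, 3 / complete: 1, 1, 3, 5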
Average link

▷ In average linkage hierarchical clustering, the distance between two clusters is defined as the average distance between each point in one cluster and every point in the other cluster. For example, the distance between clusters "r" and "s" in the figure is the average length of the arrows connecting the points of one cluster to the points of the other.

31
Average link cluster

32
Merge the clusters

39
Agglomerative algorithm

NOTE:
Start with:
▷ Initial number of clusters = n.
▷ Initial value of the threshold distance d = 0.
▷ Increment d in each iteration.
▷ Check the condition (min dist <= d) for merging.
▷ Continue until you get one cluster.

40
Weakness of Agglomerative

▷ Major weaknesses of agglomerative clustering methods
  ○ Do not scale well: time complexity of at least O(n^2), where n is the total number of objects
  ○ Can never undo what was done previously

41
Hierarchical Clustering algorithms

▷ Divisive (top-down):
  ○ Generate a spanning tree (no loop in the graph)
  ○ Start with all documents belonging to the same cluster.
  ○ Eventually each node forms a cluster on its own.
▷ Does not require the number of clusters k in advance
▷ Needs a termination/readout condition
  ○ The final level of the hierarchy in both the agglomerative and divisive cases is of no use by itself.

42
Overview

▷ Divisive Clustering starts by placing all


objects into a single group. Before we
start the procedure, we need to decide on
a threshold distance.
The procedure is as follows:
▷ The distance between all pairs of objects
within the same group is determined and
the pair with the largest distance is
selected.

43
Hierarchical Clustering

▷ Divisive Approaches
  Initialization: all objects stay in one cluster
  Iteration: select a cluster and split it into two sub-clusters
  Until each leaf cluster contains only one object

[Figure: top-down splitting of {a,b,c,d,e} into {a,b} and {c,d,e}, then {c,d,e} into {c} and {d,e}, and so on down to single objects (Step 0 to Step 4)]

44
Overview-contd

▷ This maximum distance is compared to the


threshold distance.
○ If it is larger than the threshold, this group is
divided in two. This is done by placing the
selected pair into different groups and using
them as seed points. All other objects in this
group are examined, and are placed into the new
group with the closest seed point. The
procedure then returns to Step 1.
○ If the distance between the selected objects is
less than the threshold, the divisive clustering
stops.
▷ To run a divisive clustering, you simply need
to decide upon a method of measuring the
distance between two objects.
45
[Figure: five points A-E connected in a chain with distances d(A,B) = 1, d(B,C) = 2, d(C,D) = 1, d(D,E) = 3]

• Max distance d(E,D) = 3, so split into {E} and {ABCD}
• Next, max distance d(B,C) = 2, so split into {E}, {AB}, {CD}
• Finally {A}, {B}, {C}, {D}, {E}

46
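A rough sketch of one round of the divisive split just described (find the most distant pair, use the two points as seeds, assign everything else to the nearest seed). The data and helper names are hypothetical; only the procedure follows the slides.

def divisive_split(points, dist, threshold):
    # Step 1: find the pair inside the group with the largest distance.
    pairs = [(dist(a, b), a, b) for i, a in enumerate(points) for b in points[i + 1:]]
    d_max, seed1, seed2 = max(pairs)
    if d_max <= threshold:
        return [points]                       # below the threshold: stop splitting
    # Step 2: use the selected pair as seeds and put every other object
    # into the group whose seed is closest.
    group1, group2 = [seed1], [seed2]
    for p in points:
        if p in (seed1, seed2):
            continue
        (group1 if dist(p, seed1) <= dist(p, seed2) else group2).append(p)
    return [group1, group2]

# Hypothetical 1-D example: five values, distance = absolute difference.
values = {0: 1.0, 1: 2.0, 2: 2.5, 3: 8.0, 4: 9.0}
d = lambda a, b: abs(values[a] - values[b])
print(divisive_split(list(values), d, threshold=3.0))   # -> [[0, 1, 2], [3, 4]]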
Type of Clustering Algorithms

▷ A Partitional clustering algorithm partitions the


data into k groups such that some criterion that
evaluates the clustering quality is optimized.
▷ The number of clusters k is a parameter whose
value is specified by the user.

47
Partitioning Algorithms

▷ Partitioning method: Construct a partition of a


database D of n objects into a set of k clusters
▷ Given a k, find a partition of k clusters that optimizes
the chosen partitioning criterion
○ Global optimal: exhaustively enumerate all
partitions
○ Heuristic methods: k-means and k-medoids
algorithms
○ k-means (MacQueen, 1967): Each cluster is
represented by the center of the cluster
○ k-medoids or PAM (Partition around medoids)
(Kaufman & Rousseeuw, 1987): Each cluster is
represented by one of the objects in the cluster
48
K-Means Clustering

▷ Given k, the k-means algorithm consists of


four steps:
○ Select initial centroids at random.
○ Assign each object to the cluster with the nearest
centroid.
○ Compute each centroid as the mean of the objects
assigned to it.
○ Repeat previous 2 steps until no change.

49
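A compact sketch of these four steps, assuming NumPy; the data, k and random seed are illustrative, and empty clusters are not handled.

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # 1. random initial centroids
    for _ in range(n_iter):
        # 2. assign each object to the cluster with the nearest centroid
        labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
        # 3. recompute each centroid as the mean of the objects assigned to it
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):               # 4. repeat until no change
            break
        centroids = new_centroids
    return centroids, labels

X = np.array([[1.0, 1.0], [1.5, 2.0], [3.0, 4.0], [5.0, 7.0], [3.5, 5.0], [4.5, 5.0], [3.5, 4.5]])
print(kmeans(X, k=2))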
K-Means Clustering (contd.)

▷ Example

[Figure: four scatter plots on 10 x 10 axes showing successive k-means iterations - initial centroids, assignment of points to the nearest centroid, recomputed centroids, and the final clustering]

50
Example:

▷ Suppose we want to group the visitors to a website using just their age (a one-dimensional space) as follows:
▷ n = 19
▷ 15, 15, 16, 19, 19, 20, 20, 21, 22, 28, 35, 40, 41, 42, 43, 44, 60, 61, 65

52
▷ Initial clusters (random
centroid or average): k = 2
▷ c1 = 16
c2 = 22

53
Iteration 1: new centroids c1 = 15.33, c2 = 36.25

54
Iteration 2: new centroids c1 = 18.56, c2 = 45.90

55
Iteration 3: new centroids c1 = 19.50, c2 = 47.89

56
Iteration 4: new centroids c1 = 19.50, c2 = 47.89

57
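The iterations above can be reproduced with a short loop (a sketch; it hard-codes the initial centroids 16 and 22 from the earlier slide, and assigns ties to the second cluster).

ages = [15, 15, 16, 19, 19, 20, 20, 21, 22, 28, 35, 40, 41, 42, 43, 44, 60, 61, 65]
c1, c2 = 16.0, 22.0                                       # initial centroids
for it in range(1, 5):
    g1 = [a for a in ages if abs(a - c1) < abs(a - c2)]   # nearer to c1 (ties go to c2)
    g2 = [a for a in ages if a not in g1]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    print(f"iteration {it}: c1 = {c1:.2f}, c2 = {c2:.2f}")
# prints 15.33/36.25, 18.56/45.90, 19.50/47.89, 19.50/47.89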
Do it yourself

▷ Use k-means algorithm to cluster the


following points into 2 clusters:
▷ {2, 4, 10, 12, 3, 20, 30, 11, 25}
▷ Suppose that initial cluster centers (seed)
are 3 & 4. Run K-means algorithm for
three iterations.

58
x    |x - m1| (m1 = 3)    |x - m2| (m2 = 4)    cluster   new centroid
2          1                    2                 1
3          0                    1                 1        m1 = 2.5
4          1                    0                 2
10         7                    6                 2
11         8                    7                 2
12         9                    8                 2
20        17                   16                 2
25        22                   21                 2
30        27                   26                 2        m2 = 16

x    |x - m1| (m1 = 2.5)  |x - m2| (m2 = 16)   cluster   new centroid
2         0.5                  14                 1
3         0.5                  13                 1
4         1.5                  12                 1        m1 = 3
10        7.5                   6                 2
11        8.5                   5                 2
12        9.5                   4                 2
20       17.5                   4                 2
25       22.5                   9                 2
30       27.5                  14                 2        m2 = 18

59
x    |x - m1| (m1 = 3)    |x - m2| (m2 = 18)   cluster   new centroid
2          1                   16                 1
3          0                   15                 1
4          1                   14                 1
10         7                    8                 1        m1 = 4.75
11         8                    7                 2
12         9                    6                 2
20        17                    2                 2
25        22                    7                 2
30        27                   12                 2        m2 = 19.6

x    |x - m1| (m1 = 4.75) |x - m2| (m2 = 19.6)  cluster   new centroid
2         2.75                17.6                1
3         1.75                16.6                1
4         0.75                15.6                1
10        5.25                 9.6                1
11        6.25                 8.6                1
12        7.25                 7.6                1        m1 = 7
20       15.25                 0.4                2
25       20.25                 5.4                2
30       25.25                10.4                2        m2 = 25

60
x    |x - m1| (m1 = 7)    |x - m2| (m2 = 25)   cluster   new centroid
2          5                   23                 1
3          4                   22                 1
4          3                   21                 1
10         3                   15                 1
11         4                   14                 1
12         5                   13                 1        m1 = 7
20        13                    5                 2
25        18                    0                 2
30        23                    5                 2        m2 = 25

The assignments and centroids no longer change, so the algorithm has converged with clusters {2, 3, 4, 10, 11, 12} and {20, 25, 30}.

61
Measurement metric

63
Euclidean

  d(X, Y) = sqrt( Σ (yi - xi)^2 )

X = (2, 0)
Y = (-2, -2)

  sqrt( [-2 - 2]^2 + [-2 - 0]^2 ) = sqrt( 4^2 + 2^2 ) = sqrt(20) = 4.47

[Figure: X and Y plotted on the V1-V2 axes]

4.47 represents the "multivariate dissimilarity" of X and Y.

64
Squared Euclidean

  d(X, Y) = Σ (yi - xi)^2

X = (2, 0)
Y = (-2, -2)

  [-2 - 2]^2 + [-2 - 0]^2 = 4^2 + 2^2 = 20

Squared Euclidean is a little better at "noticing" strays:
• remember that we use a square root transform to "pull in" outliers
• leaving the value squared makes the strays stand out a bit more

65
Manhattan distance

▷ The function computes the distance that would be traveled to get from one data point to the other if a grid-like path is followed. The Manhattan distance between two items is the sum of the differences of their corresponding components.
▷ The formula for this distance between a point X = (X1, X2, …) and a point Y = (Y1, Y2, …) is:

  d(X, Y) = Σ |Xi - Yi|

66
Manhattan distance or city block

  d(X, Y) = Σ |yi - xi|

X = (2, 0)
Y = (-2, -2)

  |-2 - 2| + |-2 - 0| = 4 + 2 = 6

So named because in a city you have to go "around the block"; you can't "cut the diagonal".

67
Chebyshev metric

Chebyshev distance is also known as maximum value distance. It is the largest absolute difference between the coordinates of two objects:

  d(X, Y) = max_i |Xi - Yi|

68
Chebyshev

  d(X, Y) = max |yi - xi|

X = (2, 0)
Y = (-2, -2)

  |-2 - 2| = 4
  |-2 - 0| = 2
  max(4, 2) = 4

Uses the "greatest univariate difference" to represent the multivariate dissimilarity.

69
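All four metrics for the running example X = (2, 0), Y = (-2, -2), computed with SciPy's distance helpers (a sketch assuming NumPy and SciPy are available).

import numpy as np
from scipy.spatial import distance

x, y = np.array([2.0, 0.0]), np.array([-2.0, -2.0])

print("Euclidean:        ", distance.euclidean(x, y))     # sqrt(4^2 + 2^2) = 4.47
print("Squared Euclidean:", distance.sqeuclidean(x, y))   # 4^2 + 2^2 = 20
print("Manhattan:        ", distance.cityblock(x, y))     # 4 + 2 = 6
print("Chebyshev:        ", distance.chebyshev(x, y))     # max(4, 2) = 4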
▷ A simple example showing the implementation of the k-means algorithm (using K = 2). Use the Euclidean distance metric.

70
▷ A simple example showing the implementation of the k-means algorithm (two dimensions, K = 3). Use the Manhattan distance metric.

Point   X   Y
A1      2   10
A2      2   5
A3      8   4
A4      5   8
A5      7   5
A6      6   4
A7      1   2
A8      4   9

Initial centroids: c1 = (2, 10), c2 = (5, 8), c3 = (1, 2)

78
K-means

First pass (centroids c1 = (2,10), c2 = (5,8), c3 = (1,2)) and second pass (centroids c1' = (2,10), c2' = (6,6), c3' = (1.5,3.5)), Manhattan distances:

Point  X  Y   d(c1)  d(c2)  d(c3)  Cluster |  d(c1')  d(c2')  d(c3')  Cluster
A1     2  10    0      5      9      C1    |    0        8       7      C1
A2     2   5    5      6      4      C3    |    5        5       2      C3
A3     8   4   12      7      9      C2    |   12        4       7      C2
A4     5   8    5      0     10      C2    |    5        3       8      C2
A5     7   5   10      5      9      C2    |   10        2       7      C2
A6     6   4   10      5      7      C2    |   10        2       5      C2
A7     1   2    9     10      0      C3    |    9        9       2      C3
A8     4   9    3      2     10      C2    |    3        5       8      C1

Second iteration, new centroids:
c1 => {A1} => c1 = (2, 10)
c2 => {A3, A4, A5, A6, A8} => ((8+5+7+6+4)/5, (4+8+5+4+9)/5) => (30/5, 30/5) => c2 = (6, 6)
c3 => {A2, A7} => ((2+1)/2, (5+2)/2) => c3 = (1.5, 3.5)

Third iteration:
c1 = (3, 9.5)
c2 = (6.5, 5.25)
c3 = (1.5, 3.5)

79
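A sketch of the same computation (K = 3, Manhattan distance, the eight points and initial centroids from the table); as on the slides, centroids are still updated with the mean even though assignment uses the Manhattan distance. Assumes NumPy and SciPy.

import numpy as np
from scipy.spatial.distance import cdist

X = np.array([[2, 10], [2, 5], [8, 4], [5, 8], [7, 5], [6, 4], [1, 2], [4, 9]], dtype=float)
centroids = np.array([[2, 10], [5, 8], [1, 2]], dtype=float)     # c1, c2, c3

for it in range(1, 4):
    labels = cdist(X, centroids, metric="cityblock").argmin(axis=1)   # assign by Manhattan distance
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(3)])
    print(f"after pass {it}:", centroids.round(2))
# pass 1: (2, 10), (6, 6), (1.5, 3.5)   pass 2: (3, 9.5), (6.5, 5.25), (1.5, 3.5)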
Comments on the K-Means Method

81
EM Model

82
Soft Clustering

▷ Clustering typically assumes that each


instance is given a “hard” assignment to
exactly one cluster.
▷ Does not allow uncertainty in class
membership or for an instance to belong to
more than one cluster.
▷ Soft clustering gives probabilities that an
instance belongs to each of a set of clusters.
▷ Each instance is assigned a probability
distribution across a set of discovered
categories (probabilities of all categories
must sum to 1). 83
Model based clustering
▷ Algorithm optimizes a probabilistic model criterion
▷ Soft clustering is usually done by the Expectation
Maximization (EM) algorithm
○ Gives a soft variant of the K-means algorithm
○ Assume k clusters: {c1, c2,… ck}
○ Assume a probabilistic model of categories that
allows computing P(ci | E) for each category, ci, for
a given example, E.
○ For text, typically assume a naïve Bayes category
model.
  ○ Parameters θ = {P(ci), P(wj | ci) : i ∈ {1, …, k}, j ∈ {1, …, |V|}}
84
Expectation Maximization (EM)
Algorithm
▷ EM algorithm provides a general approach to learning in
presence of unobserved variables
▷ Iterative method for learning probabilistic
categorization model from unsupervised data.
▷ Initially assume random assignment of examples to
categories.
▷ Learn an initial probabilistic model by estimating the model parameters θ from this randomly labeled data.
▷ Iterate following two steps until convergence:
○ Expectation (E-step): Compute P(ci | E) for each example
given the current model, and probabilistically re-label
the examples based on these posterior probability
estimates.

○ Maximization (M-step): Re-estimate the model parameters θ from the probabilistically re-labeled data.
85
Applications

▷ Filling in missing data in samples


▷ Discovering the value of hidden variables
▷ Estimating the parameters of Hidden
Markov Model (HMM)
▷ Unsupervised learning of clusters
▷ Semi-supervised classification and
clustering.

86
Example

87
EM Example

91
Step 1

▷ Randomly select two probabilities:
▷ θa = probability of heads for coin C1
▷ θb = probability of heads for coin C2

▷ Suppose: θa = 0.6
           θb = 0.5

93
Step 2:

▷ Take the first experiment, for which we will calculate the individual likelihoods under C1 and C2. Its outcome is:

▷ H: 5
▷ T: 5

94
Step 3: compute the likelihood

95
Step 4

▷ Normalize: compute, from the calculated likelihoods, the probability that each coin produced the observed outcome.
▷ These normalized values for coins A and B are the estimated (E-step) quantities.

96
Step 4 (contd.)

▷ Repeat the process for all the experiments; in our case there are 5 experiments.

97
Step 5

98
Step 6:

▷ Compute the total H and T attributed to C1 and C2; in our case:

C1 => 21.3 H and 8.6 T
C2 => 11.7 H and 8.4 T

99
Step 7

Calculate the new probability of heads for C1 and C2.
This is the maximization (M-step) of the parameter θ for each coin.

100
Step 8

▷ Repeat the process until you get the same probabilities again.
▷ In our case it is,

101
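A sketch of the full E/M loop for this two-coin setting. The per-experiment head/tail counts below are an assumption (the actual toss records are in the slide figures, which are not reproduced here), chosen so that one E-step gives totals close to the ones quoted above (about 21.3 H / 8.6 T for C1 and 11.7 H / 8.4 T for C2).

def coin_em(counts, theta_a=0.6, theta_b=0.5, n_iter=10):
    """counts: list of (heads, tails) per experiment; returns the estimated (theta_a, theta_b)."""
    for _ in range(n_iter):
        ha = ta = hb = tb = 0.0
        for h, t in counts:
            # E-step: responsibility of each coin for this experiment
            # (the binomial coefficient cancels in the normalization).
            la = theta_a ** h * (1 - theta_a) ** t
            lb = theta_b ** h * (1 - theta_b) ** t
            wa = la / (la + lb)
            wb = 1.0 - wa
            ha += wa * h; ta += wa * t       # expected heads/tails attributed to coin 1
            hb += wb * h; tb += wb * t       # expected heads/tails attributed to coin 2
        # M-step: re-estimate each coin's probability of heads.
        theta_a, theta_b = ha / (ha + ta), hb / (hb + tb)
    return theta_a, theta_b

experiments = [(5, 5), (9, 1), (8, 2), (4, 6), (7, 3)]   # assumed 10-toss experiments
print(coin_em(experiments, n_iter=1))    # one round: roughly (0.71, 0.58)
print(coin_em(experiments, n_iter=20))   # run until the estimates stop changing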
EM Example (Do it yourself)

102
Step 1

▷ Randomly select two probabilities:
▷ θa = probability of heads for coin C1
▷ θb = probability of heads for coin C2

▷ Suppose: θa = 0.6
           θb = 0.5

103
Step 2:

▷ Take the first experiment, for which we will calculate the individual likelihoods under C1 and C2. Its outcome is:

▷ H: 3
▷ T: 2

104
Step 3: compute the likelihood
First round = 3 heads and 2 tails

Likelihood of A = (0.6)^3 * (0.4)^2 = 0.03456
Likelihood of B = (0.5)^3 * (0.5)^2 = 0.03125

After normalization:

Expected value of A = 0.03456 / (0.03456 + 0.03125) = 0.5251
Expected value of B = 0.03125 / (0.03456 + 0.03125) = 0.4748

105
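The two likelihoods and their normalization can be checked in a few lines (the 3-heads / 2-tails counts are the ones given above).

la = 0.6 ** 3 * 0.4 ** 2                  # likelihood of 3 H, 2 T under coin A (theta_a = 0.6)
lb = 0.5 ** 3 * 0.5 ** 2                  # likelihood of 3 H, 2 T under coin B (theta_b = 0.5)
print(la, lb)                             # 0.03456, 0.03125
print(la / (la + lb), lb / (la + lb))     # normalized weights: about 0.5251 and 0.4749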
Step 4

▷ Repeat the process for all the experiments; in our case there are 5 experiments.

106
Experiment   Coin A (Heads)        Coin A (Tails)        Coin B (Heads)        Coin B (Tails)
3 H, 2 T     0.5251 * 3 = 1.57     0.5251 * 2 = 1.05     0.4748 * 3 = 1.424    0.4748 * 2 = 0.9496
5 H, 0 T     0.5251 * 5 = 2.6255   0.5251 * 0 = 0        0.4748 * 5 = 2.374    0.4748 * 0 = 0
4 H, 1 T     0.5251 * 4 = 2.1      0.5251 * 1 = 0.5251   0.4748 * 4 = 1.899    0.4748 * 1 = 0.4748
2 H, 3 T     0.5251 * 2 = 1.05     0.5251 * 3 = 1.57     0.4748 * 2 = 0.9496   0.4748 * 3 = 1.424
4 H, 1 T     0.5251 * 4 = 2.1      0.5251 * 1 = 0.5251   0.4748 * 4 = 1.899    0.4748 * 1 = 0.4748

             Total Head A          Total Tail A          Total Head B          Total Tail B
             9.4455                3.67                  8.545                 3.319

107
▷ Calculate the new probability of heads for C1 and C2.
▷ This is the maximization (M-step) of the parameter θ for each coin.
▷ Θa = 9.44 / (9.44 + 3.67) = 0.72
▷ Θb = 8.5456 / (8.5456 + 3.32) = 0.72

▷ Repeat the process until you get the same probabilities again.

108
Limitations

▷ The EM algorithm can be very slow, even on fast computers. It works best when only a small percentage of the data is missing and the dimensionality of the data is not too large. The higher the dimensionality, the slower the E-step; for high-dimensional data the E-step may run extremely slowly as the procedure approaches a local maximum.

109