Quality of Clustering: Clustering (K-Means Algorithm)

K-means clustering is an unsupervised machine learning algorithm that groups unlabeled data points into a specified number of clusters (k) based on their similarities. It works by assigning data points to the cluster with the nearest mean and then recalculating the mean for each cluster until the means converge. The algorithm aims to minimize intra-cluster distances and maximize inter-cluster distances. It is commonly used for exploratory data analysis to find hidden patterns or groupings in the data.

Uploaded by Sk Arif Ahmed

Clustering (K-Means Algorithm)

Clustering means finding similarities between data objects on the basis of the characteristics found in the data, and grouping similar objects into clusters. It is an unsupervised learning technique (there is no dependent variable).

Quality of Clustering
A good clustering method produces high-quality clusters with minimum intra-cluster distance (high similarity within a cluster) and maximum inter-cluster distance (low similarity between clusters).
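These two criteria can be checked numerically. Below is a minimal sketch (the helper names and toy clusters are my own, not from the text) that measures intra-cluster tightness and inter-cluster separation with the Manhattan distance, the metric used in the worked example later in this document.

```python
def manhattan(p, q):
    # Manhattan distance between two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def centroid(points):
    # Coordinate-wise mean of a list of 2-D points.
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def avg_intra_cluster_distance(points):
    # Average distance from each point to its own cluster centroid:
    # smaller means higher similarity within the cluster.
    c = centroid(points)
    return sum(manhattan(p, c) for p in points) / len(points)

def inter_cluster_distance(a, b):
    # Distance between two cluster centroids:
    # larger means lower similarity between the clusters.
    return manhattan(centroid(a), centroid(b))

cluster1 = [(1, 2), (2, 1), (2, 2)]   # a tight cluster
cluster2 = [(8, 9), (9, 8), (9, 9)]   # a second, well-separated cluster

print(avg_intra_cluster_distance(cluster1))        # small: tight cluster
print(inter_cluster_distance(cluster1, cluster2))  # large: well separated
```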

Ways to measure Distance


There are multiple ways to measure distance. The two most popular methods are as follows:

Euclidean distance: sqrt((x2-x1)^2 + (y2-y1)^2)

Manhattan distance: |x2-x1| + |y2-y1|

K-means Clustering
In the k-means clustering algorithm we take a number k as input, called the number of clusters. The value of k is defined by the user. Each data point is assigned to the cluster whose center is closest to it, where closeness is measured with a distance formula such as the Euclidean distance.

Steps to perform k-means clustering

1. Choose the number of clusters k.

2. Compute the centers of these clusters, i.e. the centroids or cluster seeds (the mean of the points in a cluster). We can take any k random objects as the initial centroids, or simply the first k objects in sequence.

3. Determine the distance of each object to the centroids (e.g. using the Euclidean distance).

4. Assign each object to the cluster with the minimum distance.

5. Compute new cluster seeds: recompute the centroids (centers) of these clusters by taking the mean of all points in each cluster formed above.

6. Repeat steps 3, 4 and 5 until the centroids no longer change (i.e. convergence is reached).
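The steps above can be sketched in Python. This is a minimal illustration, not the original author's code; the function and parameter names (`k_means`, `init`, `max_iter`) are my own assumptions, and it uses the Manhattan distance to match the worked example that follows.

```python
# Minimal k-means sketch following steps 1-6 above (illustrative names).

def manhattan(p, q):
    # Manhattan distance between two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def mean(points):
    # Centroid (coordinate-wise mean) of a list of 2-D points.
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def k_means(data, k, init=None, max_iter=100):
    # Step 2: initial seeds - given explicitly, or the first k objects.
    seeds = list(init) if init else list(data[:k])
    clusters = []
    for _ in range(max_iter):
        # Steps 3-4: assign each point to its nearest seed.
        clusters = [[] for _ in range(k)]
        for p in data:
            i = min(range(k), key=lambda j: manhattan(p, seeds[j]))
            clusters[i].append(p)
        # Step 5: recompute the seeds as the cluster means.
        new_seeds = [mean(c) if c else seeds[i] for i, c in enumerate(clusters)]
        # Step 6: stop when the seeds no longer change.
        if new_seeds == seeds:
            break
        seeds = new_seeds
    return seeds, clusters

# The dataset and initial seeds from the worked example below:
data = [(2, 10), (2, 5), (8, 5), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
seeds, clusters = k_means(data, 3, init=[(2, 10), (5, 8), (1, 2)])
print([tuple(round(v, 2) for v in s) for s in seeds])
```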

Calculation Steps: How k-means clustering works

Dataset: A1(2,10), A2(2,5), A3(8,5), B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9)

Step 1: We choose k = 3 clusters.

Step 2: The initial cluster centers (means) are (2,10), (5,8) and (1,2), chosen arbitrarily as the points A1, B1 and C1. They are also called cluster seeds.

Step 3: We calculate the distance between each data point and the cluster centers.

For two points (x1,y1) and (x2,y2):

Euclidean distance = sqrt((x2-x1)^2 + (y2-y1)^2)

or Manhattan distance = |x2-x1| + |y2-y1|

In this example we use the Manhattan distance.
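Both distance formulas can be written as tiny functions. A sketch (the function names are mine), checked on the points A2(2,5) and A1(2,10) from the example below:

```python
from math import sqrt

def euclidean(p, q):
    # Straight-line distance between two 2-D points.
    return sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def manhattan(p, q):
    # Sum of absolute coordinate differences.
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

print(euclidean((2, 5), (2, 10)))  # 5.0
print(manhattan((2, 5), (2, 10)))  # 5
```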

1st Row:

Calculate the distance between data point A2 and the centroids A1, B1, C1.
Distance between A2(2,5) and A1(2,10) = |2-2| + |5-10| = 0 + 5 = 5
Distance between A2(2,5) and B1(5,8) = |2-5| + |5-8| = 3 + 3 = 6
Distance between A2(2,5) and C1(1,2) = |2-1| + |5-2| = 1 + 3 = 4
The nearest cluster center to A2 is C1.

2nd Row:

Calculate the distance between data point A3 and the centroids A1, B1, C1.
Distance between A3(8,5) and A1(2,10) = 11
Distance between A3(8,5) and B1(5,8) = 6
Distance between A3(8,5) and C1(1,2) = 10
The nearest cluster center to A3 is B1.

3rd Row:

Calculate the distance between data point B2 and the centroids A1, B1, C1.
Distance between B2(7,5) and A1(2,10) = 10
Distance between B2(7,5) and B1(5,8) = 5
Distance between B2(7,5) and C1(1,2) = 9
The nearest cluster center to B2 is B1.

4th Row:

Calculate the distance between data point B3 and the centroids A1, B1, C1.
Distance between B3(6,4) and A1(2,10) = 10
Distance between B3(6,4) and B1(5,8) = 5
Distance between B3(6,4) and C1(1,2) = 7
The nearest cluster center to B3 is B1.

5th Row:

Calculate the distance between data point C2 and the centroids A1, B1, C1.
Distance between C2(4,9) and A1(2,10) = 3
Distance between C2(4,9) and B1(5,8) = 2
Distance between C2(4,9) and C1(1,2) = 10
The nearest cluster center to C2 is B1.
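The five rows above can be reproduced in one loop. A small sketch (the helper and variable names are mine):

```python
def manhattan(p, q):
    # Manhattan distance between two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

seeds = {"A1": (2, 10), "B1": (5, 8), "C1": (1, 2)}
points = {"A2": (2, 5), "A3": (8, 5), "B2": (7, 5), "B3": (6, 4), "C2": (4, 9)}

assigned = {}
for name, p in points.items():
    dists = {s: manhattan(p, c) for s, c in seeds.items()}
    assigned[name] = min(dists, key=dists.get)  # nearest cluster center
    print(name, dists, "-> nearest:", assigned[name])
```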

Step 4: Calculate the new cluster seeds (mean values).

The only point near cluster seed A1(2,10) is A1 itself, which was the old mean, so this cluster center remains the same.

The points near cluster seed B1(5,8) are B1(5,8), A3(8,5), B2(7,5), B3(6,4) and C2(4,9).
B1 mean value = ((5+8+7+6+4)/5, (8+5+5+4+9)/5) = (6, 6.2)

The points near cluster seed C1(1,2) are C1(1,2) and A2(2,5).
C1 mean value = ((1+2)/2, (2+5)/2) = (1.5, 3.5)

The updated cluster seeds are: A1(2,10), B1(6,6.2), C1(1.5,3.5)
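Step 4's mean computation can be sketched directly (the helper name is my own):

```python
def mean(points):
    # Coordinate-wise mean of a list of 2-D points.
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

# Points assigned to cluster B1 after the first pass, as listed above:
b1_points = [(5, 8), (8, 5), (7, 5), (6, 4), (4, 9)]
print(mean(b1_points))   # (6.0, 6.2)

c1_points = [(1, 2), (2, 5)]
print(mean(c1_points))   # (1.5, 3.5)
```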

Step 5: Go to the next iteration with the updated cluster seeds A1(2,10), B1(6,6.2), C1(1.5,3.5).

We calculate the distance from each data point to the updated centroids and re-assign the points.

The points nearest to seed A1(2,10) are A1(2,10) and C2(4,9).
A1 mean value = (3, 9.5)

The points nearest to seed B1(6,6.2) are B1(5,8), A3(8,5), B2(7,5) and B3(6,4).
B1 mean value = ((5+8+7+6)/4, (8+5+5+4)/4) = (6.5, 5.5)

The points nearest to seed C1(1.5,3.5) are C1(1,2) and A2(2,5).
C1 mean value = (1.5, 3.5)

The updated cluster seeds are: A1(3,9.5), B1(6.5,5.5), C1(1.5,3.5)
After iteration 2 the cluster seeds are not equal to the iteration 1 seeds, so we go on to iteration 3.

Step 6: Check convergence.

We repeat the assignment and update steps. In iteration 3 the point B1(5,8) moves to the cluster of seed A1, giving the seeds (3.67, 9), (7, 4.67) and (1.5, 3.5); in iteration 4 no point changes cluster, so the seeds no longer change and we stop.
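In code, this stopping test is a comparison of consecutive seed lists; since the means are floats, a small tolerance is safer than exact equality. A hypothetical helper:

```python
def converged(old_seeds, new_seeds, tol=1e-9):
    # Converged when every seed moved less than tol in both coordinates.
    return all(abs(a - c) < tol and abs(b - d) < tol
               for (a, b), (c, d) in zip(old_seeds, new_seeds))

print(converged([(2.0, 10.0)], [(2.0, 10.0)]))  # True: seed unchanged
print(converged([(2.0, 10.0)], [(3.0, 9.5)]))   # False: seed moved
```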

Limitations of k-means clustering

1. The number of clusters k must be chosen before running k-means.
2. It is sensitive to outliers and noise, because the mean is used as the cluster center.
3. When there are not many data points, the initial grouping (choice of seeds) strongly influences the final clusters.
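Limitation 2 can be seen with a tiny example (the numbers are illustrative): a single outlier drags a cluster mean far away from the bulk of its points, which can distort later assignments.

```python
def mean(pts):
    # Coordinate-wise mean of a list of 2-D points.
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

points = [(1, 1), (2, 1), (1, 2), (2, 2)]   # a tight cluster
outlier = (50, 50)                          # one far-away noise point

print(mean(points))              # (1.5, 1.5)
print(mean(points + [outlier]))  # (11.2, 11.2) - pulled far from the cluster
```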
