Quality of Clustering: Clustering (K-Means Algorithm)
Quality of Clustering
A good clustering method produces high-quality clusters with minimum
intra-cluster distance (high similarity within a cluster)
and maximum inter-cluster distance (low similarity between different clusters).
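The two quality criteria above can be measured directly. The following is a minimal sketch, assuming Euclidean distance, the average pairwise distance as the intra-cluster measure, and the distance between centroids as the inter-cluster measure. The function names and the toy points are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def centroid(cluster):
    """Coordinate-wise mean of the points in one cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def mean_intra_cluster_distance(cluster):
    """Average pairwise distance inside one cluster (smaller is better)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def inter_cluster_distance(cluster_a, cluster_b):
    """Distance between the centroids of two clusters (larger is better)."""
    return dist(centroid(cluster_a), centroid(cluster_b))


# Hypothetical toy data: two compact, well-separated groups.
group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
group_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]

intra_a = mean_intra_cluster_distance(group_a)
inter_ab = inter_cluster_distance(group_a, group_b)
print("intra-cluster distance of group_a:", intra_a)
print("inter-cluster distance a-b:", inter_ab)
```

For a good clustering, the intra-cluster value should be small relative to the inter-cluster value, as it is for these two groups.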
K-means Clustering
The k-means clustering algorithm takes one input parameter, k, which is the
number of clusters to form from the data set. The value of k is chosen by the
user. Each data point is assigned to the cluster whose center (centroid) is
nearest. The distance is commonly measured with the Euclidean distance
formula; the worked example below instead uses the Manhattan distance,
|x1 - x2| + |y1 - y2|.
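To make the distance measure concrete, here is a small sketch that computes both the Euclidean distance and the Manhattan distance (sum of absolute coordinate differences) for two pairs of points taken from the worked example. The helper name manhattan is an assumption introduced for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def manhattan(p, q):
    """Sum of absolute coordinate differences: |x1 - x2| + |y1 - y2|."""
    return sum(abs(a - b) for a, b in zip(p, q))


a2, a1, b1 = (2, 5), (2, 10), (5, 8)

# The points differ along one axis only, so both metrics give 5.
print(dist(a2, a1), manhattan(a2, a1))

# The points differ along both axes: Euclidean is about 4.24, Manhattan is 6.
print(dist(a2, b1), manhattan(a2, b1))
```

The two measures can rank centroids differently, so the choice of distance should be stated and used consistently throughout a calculation.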
1st Row:
Distances between data point A2 and the centroids A1, B1, C1:
Distance between A2(2,5) & A1(2,10) = |2-2| + |5-10| = 0+5 = 5
Distance between A2(2,5) & B1(5,8) = |2-5| + |5-8| = 3+3 = 6
Distance between A2(2,5) & C1(1,2) = |2-1| + |5-2| = 1+3 = 4
The nearest cluster center to A2 is C1.
2nd Row:
Distances between data point A3 and the centroids A1, B1, C1:
Distance between A3(8,5) & A1(2,10) = |8-2| + |5-10| = 6+5 = 11
Distance between A3(8,5) & B1(5,8) = |8-5| + |5-8| = 3+3 = 6
Distance between A3(8,5) & C1(1,2) = |8-1| + |5-2| = 7+3 = 10
The nearest cluster center to A3 is B1.
3rd Row:
Distances between data point B2 and the centroids A1, B1, C1:
Distance between B2(7,5) & A1(2,10) = |7-2| + |5-10| = 5+5 = 10
Distance between B2(7,5) & B1(5,8) = |7-5| + |5-8| = 2+3 = 5
Distance between B2(7,5) & C1(1,2) = |7-1| + |5-2| = 6+3 = 9
The nearest cluster center to B2 is B1.
4th Row:
Distances between data point B3 and the centroids A1, B1, C1:
Distance between B3(6,4) & A1(2,10) = |6-2| + |4-10| = 4+6 = 10
Distance between B3(6,4) & B1(5,8) = |6-5| + |4-8| = 1+4 = 5
Distance between B3(6,4) & C1(1,2) = |6-1| + |4-2| = 5+2 = 7
The nearest cluster center to B3 is B1.
5th Row:
Distances between data point C2 and the centroids A1, B1, C1:
Distance between C2(4,9) & A1(2,10) = |4-2| + |9-10| = 2+1 = 3
Distance between C2(4,9) & B1(5,8) = |4-5| + |9-8| = 1+1 = 2
Distance between C2(4,9) & C1(1,2) = |4-1| + |9-2| = 3+7 = 10
The nearest cluster center to C2 is B1.
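The five rows above all follow the same assignment rule, which can be sketched in a few lines. This is an illustrative sketch using the points and centroids from the rows and the same sum-of-absolute-differences distance. The names manhattan and nearest_centroid are hypothetical helpers, not part of any library.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences, as used in the rows above."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_centroid(point, centroids):
    """Return the name of the centroid with the smallest distance to point."""
    return min(centroids, key=lambda name: manhattan(point, centroids[name]))


# Centroids and data points taken from the worked example.
centroids = {"A1": (2, 10), "B1": (5, 8), "C1": (1, 2)}
points = {"A2": (2, 5), "A3": (8, 5), "B2": (7, 5), "B3": (6, 4), "C2": (4, 9)}

assignments = {}
for name, xy in points.items():
    distances = {c: manhattan(xy, cxy) for c, cxy in centroids.items()}
    assignments[name] = nearest_centroid(xy, centroids)
    print(name, distances, "->", assignments[name])
```

Running the sketch reproduces the assignments derived by hand: A2 goes to C1, and A3, B2, B3, and C2 go to B1.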
Cluster A1(2, 10): the only nearby point is A1(2, 10) itself, which was the old
mean, so the cluster center remains the same.
Cluster B1(5, 8): the nearby points are B1(5, 8), A3(8, 5), B2(7, 5), B3(6, 4),
C2(4, 9).
B1 mean value = ((5+8+7+6+4)/5, (8+5+5+4+9)/5) = (6, 6.2)
Cluster C1(1, 2): the nearby points are C1(1, 2) and A2(2, 5).
C1 mean value = ((1+2)/2, (2+5)/2) = (1.5, 3.5)
The updated cluster seeds are: A1(2, 10), B1(6, 6.2), C1(1.5, 3.5)
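The update step takes the coordinate-wise mean of the points assigned to each cluster. Below is a minimal sketch of that calculation. The membership of cluster C1 (C1 and A2) is inferred from the assignment of A2 to C1 in the first row, and mean_point is a hypothetical helper name.

```python
def mean_point(members):
    """Coordinate-wise mean of a list of (x, y) points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


# Cluster membership after the first assignment pass.
clusters = {
    "A1": [(2, 10)],
    "B1": [(5, 8), (8, 5), (7, 5), (6, 4), (4, 9)],
    "C1": [(1, 2), (2, 5)],  # inferred: C1 itself plus A2
}

new_seeds = {name: mean_point(members) for name, members in clusters.items()}
for name, seed in new_seeds.items():
    print(name, seed)
```

The resulting means match the updated seeds listed above: (2, 10) for A1, (6, 6.2) for B1, and (1.5, 3.5) for C1.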
Step 5: Run the next iteration with the updated cluster seeds.
After iteration 2, the updated cluster seeds are: A1(3, 9.5), B1(6.7, 4), C1(1.7, 4.2)
Because the cluster seeds after iteration 2 differ from those after iteration 1,
we go on to iteration 3.
Step 6: Check convergence
When the cluster seeds do not change between iteration 2 and iteration 3,
we stop iterating.
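The iterate-until-unchanged procedure of Steps 5 and 6 can be sketched as a loop. This is an illustrative sketch, not a reference implementation: it assumes the same sum-of-absolute-differences distance as the worked example, keeps the old seed if a cluster ends up empty, and uses a hypothetical four-point data set rather than the points from the example. The names k_means, manhattan, and mean_point are introduced here for illustration.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_point(members):
    """Coordinate-wise mean of a list of points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def k_means(points, seeds, max_iterations=100):
    """Repeat assignment and update until the seeds stop changing."""
    seeds = list(seeds)
    clusters = [[] for _ in seeds]
    for iteration in range(1, max_iterations + 1):
        # Assignment step: each point joins the cluster of its nearest seed.
        clusters = [[] for _ in seeds]
        for p in points:
            nearest = min(range(len(seeds)), key=lambda i: manhattan(p, seeds[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its old seed (an assumption).
        new_seeds = [mean_point(c) if c else s for c, s in zip(clusters, seeds)]
        # Convergence check: stop when no seed moved in this iteration.
        if new_seeds == seeds:
            return seeds, clusters, iteration
        seeds = new_seeds
    return seeds, clusters, max_iterations


# Hypothetical toy data with k = 2.
toy_points = [(0, 0), (0, 2), (10, 10), (10, 12)]
final_seeds, final_clusters, iterations = k_means(toy_points, [(0, 0), (10, 10)])
print(final_seeds, iterations)
```

On this toy data the seeds move once and are unchanged in the second iteration, so the loop stops there. Exact equality is used for the convergence test to mirror the description above; a small tolerance is a common alternative when working with floating-point data.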