
Clustering

K-means clustering is an unsupervised machine learning algorithm that groups unlabeled data points into K clusters based on their similarity. It works by assigning each data point to the cluster with the closest centroid and then recalculating each centroid as the average of all the data points assigned to that cluster. This process iterates until the centroids stabilize and no longer move. For example, in a dataset with 4 data points described by 2 attributes each, k-means clustering with K=2 would initially pick 2 data points as centroids, calculate the distance of each point to the centroids, assign every point to its closest centroid, and then recalculate each centroid as the average of its cluster, repeating the process until convergence.

Uploaded by

Alimuddin Nento
Copyright
© Attribution Non-Commercial (BY-NC)

k-means clustering

k-means clustering is an algorithm that classifies, or groups, objects into K groups based on their attributes (features), where K is a positive integer. The grouping is done by minimizing the sum of squared distances between each data point and its corresponding cluster centroid. The purpose of k-means clustering is thus to classify the data.
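
The quantity being minimized can be written out as a formula. This is only a sketch of the standard sum-of-squares objective; the symbols are assumptions introduced here for illustration and do not come from the slides: S_j is the set of objects in group j, c_j is that group's centroid, and x is an object's attribute vector.

```latex
J = \sum_{j=1}^{K} \sum_{x \in S_j} \lVert x - c_j \rVert^2
```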

1. Determine the number of clusters K.
2. Take any random objects as the initial centroids.
3. Iterate until stable (= no object moves to another group):
   a. Determine the centroid coordinates.
   b. Determine the distance of each object to the centroids.
   c. Group the objects based on minimum distance.
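
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, and the choice to keep the old centroid when a group ends up empty are all assumptions made here. The loop assigns objects first and then updates the centroids, which is the same cycle started from the random initial centroids.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Group `points` (a list of equal-length tuples) into `k` clusters."""
    rng = random.Random(seed)
    # Take k random objects as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each object to the centroid at minimum Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stable: no object moved to a different group.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: keep the old centroid if a group is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids
```

Calling `kmeans(points, 2)` on a list of 2-attribute tuples returns one group index per object together with the final centroid coordinates. Which group gets index 0 depends on the random initial centroids.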

Example: Suppose we have 4 objects as training data points, and each object has 2 attributes.

We set K = 2. Each object is a medicine, represented as one point with two attributes (X, Y).

1. Initial value of centroids: Suppose we use medicine A and medicine B as the first centroids. Let C1 and C2 denote the coordinates of the centroids; then C1 = (1, 2) and C2 = (2, 1).

2. Objects-centroids distance: We calculate the distance from each cluster centroid to each object. Using Euclidean distance gives the distance matrix at iteration 0.

For example, consider the distance from medicine C = (4, 3) to each of the two centroids.
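
The distance for medicine C can be checked with a short calculation. The sketch below uses only values stated in the example (C1, C2, and medicine C); the variable names are chosen here for illustration.

```python
import math

# Values taken from the example: the two initial centroids and medicine C.
c1 = (1, 2)
c2 = (2, 1)
medicine_c = (4, 3)

# Euclidean distance: square root of the summed squared coordinate differences.
d_c1 = math.dist(medicine_c, c1)  # sqrt((4-1)^2 + (3-2)^2) = sqrt(10)
d_c2 = math.dist(medicine_c, c2)  # sqrt((4-2)^2 + (3-1)^2) = sqrt(8)

print(round(d_c1, 2), round(d_c2, 2))  # 3.16 2.83
```

Medicine C is therefore closer to C2 than to C1.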

3. Object clustering: Assign each object to the group whose centroid is at minimum distance.
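
A minimal sketch of this assignment step follows. It covers only the three medicines whose coordinates appear in the text (A and B are the initial centroids, C is given in step 2); the fourth object is left out because its coordinates are not stated. The dictionary layout is an assumption made for illustration.

```python
import math

centroids = {"C1": (1, 2), "C2": (2, 1)}
# Only the objects whose coordinates are stated in the example.
objects = {"A": (1, 2), "B": (2, 1), "C": (4, 3)}

# For each object, pick the centroid at minimum Euclidean distance.
assignment = {
    name: min(centroids, key=lambda c: math.dist(point, centroids[c]))
    for name, point in objects.items()
}
print(assignment)  # {'A': 'C1', 'B': 'C2', 'C': 'C2'}
```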

4. Determine Centroids
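
Each new centroid is the average of the points assigned to its group, taken one coordinate at a time. The helper below, with the hypothetical name `centroid`, sketches that calculation. The group used in the call is illustrative only: it contains medicines B and C, while the real group at this step may also depend on the fourth object, whose coordinates are not stated in the text, so the result should not be read as the slide's actual centroid.

```python
def centroid(members):
    """Mean of a group's points, coordinate by coordinate."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


# Illustrative group only: medicines B = (2, 1) and C = (4, 3).
print(centroid([(2, 1), (4, 3)]))  # (3.0, 2.0)
```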
