LAB 6A: K-Means Clustering
1. Theoretical basis
The division of the set X into K classes is done so as to minimize the objective
function J, defined by:
\[ J = \sum_{k=1}^{K} \sum_{i=1}^{N_k} \left\| X_i^{(k)} - c_k \right\|^2 \]
where:
- Nk = number of vectors in class k
- K = number of classes, k = 1 .. K
- ck = mean (center) of class k
- Xi(k) = vector Xi belonging to class k, i = 1 .. Nk
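As a minimal sketch of how J is evaluated, the Python function below sums the squared Euclidean distances between each vector and the center of its class. The function name objective_j, the data layout (one list of vectors per class) and the example data are assumptions made for this illustration; they are not taken from the lab's MATLAB application.

```python
def objective_j(classes, centers):
    """Return J: the sum over classes k and vectors Xi(k) of ||Xi(k) - ck||^2.

    classes: list of K lists; classes[k] holds the vectors assigned to class k
    centers: list of K vectors; centers[k] is the center ck of class k
    """
    total = 0.0
    for vectors, center in zip(classes, centers):
        for x in vectors:
            # Squared Euclidean distance between the vector and its class center
            total += sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return total


# Hypothetical example data: two classes of 2-D vectors and their means
classes = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 1.0), (5.0, 3.0)]]
centers = [(1.0, 0.0), (5.0, 2.0)]
print(objective_j(classes, centers))  # 1 + 1 + 1 + 1 = 4.0
```

Each vector in the example lies at distance 1 from its class center, so every term contributes 1 to the sum.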
1.2. Algorithm:
1. The N vectors are randomly assigned to the K classes and the class means (centers) are
calculated.
Alternatively, the centers (means) of the K classes can be initialized at random.
2. For each of the N vectors, the distance to each class center is calculated
and the vector Xi is assigned to the class k for which the distance ||Xi - ck|| is minimum.
3. After the entire set of vectors (points in p-dimensional space) has been traversed and
all vectors have been re-labeled, the class centers (means) are recalculated and the
algorithm resumes from step 2.
The algorithm stops when no re-labeling of any vector occurs.
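The steps above can be sketched in Python as follows. This is an illustrative sketch, not the code of the lab's MATLAB application: the function name kmeans, the choice of starting from explicitly given centers (the second initialization option in step 1), the tie-breaking rule (lowest class index wins) and the handling of empty classes are all assumptions.

```python
import math


def kmeans(points, centers):
    """Cluster points starting from the given initial centers.

    Returns (labels, centers, passes), where labels[i] is the class index of
    points[i] and passes counts how many times step 2 was executed.
    """
    centers = [tuple(c) for c in centers]
    labels = [None] * len(points)
    passes = 0
    while True:
        passes += 1
        # Step 2: assign each vector to the class with the nearest center
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
            for x in points
        ]
        # Stop criterion: no vector changed its label
        if new_labels == labels:
            return labels, centers, passes
        labels = new_labels
        # Step 3: recompute each center as the mean of its class
        for k in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == k]
            if members:  # assumption: an empty class keeps its old center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))


# Hypothetical example data: two well-separated pairs of 2-D points
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels, centers, passes = kmeans(points, [(0, 0), (1, 0)])
print(labels, centers, passes)  # [0, 0, 1, 1] [(0.0, 1.0), (10.0, 1.0)] 2
```

The second pass only confirms that no label changes, so the loop stops there. The last pass is therefore always a verification pass.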
1.3. Problem
Consider the vectors:
A = (2 0)T, B = (3 1)T, C = (4 0)T, D = (3 -1)T,
E = (-2 0)T, F = (-3 1)T, G = (-4 0)T, H = (-3 -1)T.
Classify these vectors into K = 2 classes using K-Means Clustering,
assuming the class means are initialized with c1 = (1 1)T and c2 = (4 0)T.
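To check a hand calculation, the first pass of the algorithm on this data (one execution of step 2 followed by step 3) can be reproduced with the Python sketch below. The function name first_pass and the dictionary layout are assumptions made for this illustration; the remaining passes are left to be worked out by repeating the same two steps until no label changes.

```python
def first_pass(vectors, centers):
    """Run steps 2 and 3 once; return (labels, updated_centers).

    Assumes every class receives at least one vector, which holds for the
    data of this problem.
    """
    def sq_dist(x, c):
        # Squared Euclidean distance ||x - c||^2
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    # Step 2: label each vector with the nearest initial center
    labels = {}
    for name, x in vectors.items():
        labels[name] = min(centers, key=lambda k: sq_dist(x, centers[k]))

    # Step 3: recompute each center as the mean of its class
    updated = {}
    for k in centers:
        members = [vectors[n] for n, lab in labels.items() if lab == k]
        updated[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, updated


# Vectors and initial centers from the problem statement
vectors = {
    "A": (2, 0), "B": (3, 1), "C": (4, 0), "D": (3, -1),
    "E": (-2, 0), "F": (-3, 1), "G": (-4, 0), "H": (-3, -1),
}
labels, updated = first_pass(vectors, {1: (1, 1), 2: (4, 0)})
print(labels)
# {'A': 1, 'B': 2, 'C': 2, 'D': 2, 'E': 1, 'F': 1, 'G': 1, 'H': 1}
print(updated)  # c1 = (-2, 0), c2 = (10/3, 0)
```

After this pass, A is still grouped with E, F, G and H. The updated centers show why at least one more pass is needed.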
2. Lab Application
Objective: use the unsupervised K-Means algorithm to classify two-dimensional
vectors into K classes.
The vectors can be entered by the user as points on the plot, or one of the predefined
sets of points can be used.
- Open MATLAB and run the L6_Kmeans.m file.
- To add a new point, enter its coordinates and then click Adauga ("Add").
- To set the number of classes, fill in the Nr. Clase ("No. of classes") field with the
desired number of classes.
By default there are 3 classes.
- To classify, click Clustering K-means.
- To use a predefined set of points, choose one of the predefined sets
and then click Clustering K-means.
- To delete all entered points, click Reset.
- To delete the classification without deleting the points, click Clear plot.
The number of iterations performed as well as the objective function J are displayed
in the Command Window.
The application window is shown in Fig. 1.
Fig. 1. MATLAB application of K-means Clustering for classifying 2-D vectors into K classes.