Ann Unit 3

K-means clustering

• K-means clustering is an unsupervised machine learning algorithm used for partitioning data into K clusters. The goal of K-means clustering is to group similar data points together and assign them to K clusters, where K is a user-specified parameter.
K-means clustering
• Clustering: the process of grouping a set of objects into classes of similar objects
– Documents within a cluster should be similar.
– Documents from different clusters should be dissimilar.
• Clustering is the most common form of unsupervised learning.
• Unsupervised learning = learning from raw data, as opposed to supervised learning, where a classification of the examples is given.
– Clustering is a common and important task that finds many applications in IR and other places.
Here's how the K-means clustering algorithm works:
1. Initialization:
• Choose K initial cluster centroids randomly from the data points, or use some heuristic method to initialize them.
2. Assignment step:
• For each data point, calculate its distance to each of the K centroids.
• Assign the data point to the cluster whose centroid is closest (i.e., has the minimum distance).
3. Update step:
• Recalculate the centroids of the K clusters based on the data points assigned to them in the previous step.
• The new centroid of a cluster is the mean of all the data points assigned to that cluster.
4. Repeat steps 2 and 3:
• Iteratively perform the assignment and update steps until convergence or until a predefined number of iterations is reached.
5. Convergence:
• The algorithm converges when the cluster assignments no longer change or when a certain convergence criterion is met (e.g., the change in cluster centroids becomes small).
6. Final output:
• Once the algorithm converges, the K clusters are formed, and each data point belongs to exactly one of these clusters.
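The six steps above can be sketched as a minimal NumPy implementation (a sketch, not a production routine; the function name `kmeans` and the seeding scheme are illustrative):

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Partition the rows of X into k clusters (steps 1-6 above)."""
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # 2. Assignment: each point joins the cluster of its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update: each centroid becomes the mean of its assigned points
        #    (an empty cluster keeps its old centroid).
        new_centroids = np.array([X[labels == j].mean(axis=0)
                                  if np.any(labels == j) else centroids[j]
                                  for j in range(k)])
        # 5. Convergence: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

On 1-D data with two well-separated groups of values, this converges to the two group means within a few iterations regardless of which points are drawn as seeds.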
K-means Clustering
Basic Algorithm:
• Step 0: select K
• Step 1: randomly select initial cluster seeds

[Figure: data points with two initial seeds, Seed 1 and Seed 2]

K-means Clustering
• An initial cluster seed represents the "mean value" of its cluster.
• In the preceding figure:
• Cluster seed 1 = 650
• Cluster seed 2 = 200

[Figure: objects assigned to the nearer of Seed 1 and Seed 2]
K-means Clustering
• Step 4: Compute the new centroid for each cluster

[Figure: recomputed centroids — Cluster seed 1 = 708.9, Cluster seed 2 = 214.2]
K-means Clustering

• Iterate:

• Calculate distances from objects to cluster centroids.

• Assign objects to the closest cluster.

• Recalculate the new centroids.

• Stop based on a convergence criterion:

• No change in cluster assignments

• Maximum number of iterations reached
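These stopping criteria correspond directly to parameters of library implementations; for example, assuming scikit-learn is available, its `KMeans` exposes `max_iter` and `tol`:

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[190.0], [200.0], [210.0], [640.0], [650.0], [660.0]])

# max_iter caps the number of assignment/update rounds; tol declares
# convergence once the centroid shift per round becomes negligible;
# n_init reruns the algorithm from several random seeds and keeps the best.
km = KMeans(n_clusters=2, n_init=10, max_iter=300, tol=1e-4, random_state=0).fit(X)
print(sorted(km.cluster_centers_.ravel().tolist()))  # ~[200.0, 650.0]
```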
Radial Basis Function Neural Network
The idea of Radial Basis Function (RBF) networks derives from the theory of function approximation. We have already seen how Multi-Layer Perceptron (MLP) networks with a hidden layer of sigmoidal units can learn to approximate functions. RBF networks take a slightly different approach. Their main features are:

Basic form of an RBF network

Input layer: source nodes connected to the environment

Hidden layer: provides a set of basis functions that map the input into the hidden-layer space

Output layer: supplies the response of the network
Radial Basis Function Neural Network

P is the dimensionality of the input feature space, and M is the dimensionality of the transformed feature space on which we have imposed our RBFs.
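As a sketch of this mapping from the P-dimensional input space to the M-dimensional RBF feature space, assuming Gaussian basis functions with a receptor matrix and a shared width sigma (all names illustrative):

```python
import numpy as np

def rbf_transform(X, receptors, sigma):
    """Map (N, P) inputs into the (N, M) RBF feature space.
    receptors: (M, P) matrix of centres t_j; sigma: Gaussian width."""
    # Squared distance from every input to every receptor.
    sq_dists = ((X[:, None, :] - receptors[None, :, :]) ** 2).sum(axis=2)
    # Gaussian RBF: phi_j(x) = exp(-||x - t_j||^2 / (2 * sigma^2))
    return np.exp(-sq_dists / (2 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 1.0]])               # N=2 points, P=2 dims
T = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # M=3 receptors
Phi = rbf_transform(X, T, sigma=1.0)
print(Phi.shape)  # (2, 3)
```

Each row of `Phi` is the hidden-layer response for one input; an input sitting exactly on a receptor produces a response of 1 for that unit.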
Radial Basis Function Neural Network (Training)
Training for these kinds of networks comprises two phases:
 Training the hidden layer, which comprises M RBF functions; the parameters to be determined for each RBF are the receptor position t and, in the case of a Gaussian RBF, the spread sigma.
 Training the weight vectors Wij of the output layer.
Training the hidden layer:

There are different approaches to training the hidden layer. Let us assume for now that we are dealing with Gaussian RBFs, so we need to determine the receptors t and the spread, i.e., sigma. One approach is to randomly select M receptors from the N sample feature vectors, but this does not seem well founded, so we can instead use a clustering mechanism to determine the receptors ti.

Since we have M nodes in the hidden layer and N samples, for clustering to work here we need N > M.
Radial Basis Function Neural Network

Calculation of receptors:
Consider an example with M = 3, so we need to determine three t's. Initially we divide the feature-vector space into three arbitrary clusters and take their means as the initial receptors; then we iterate over every sample feature vector and perform the steps below:
 a) For the selected input feature vector x, determine its distances to the means (t1, t2, t3) of the three clusters; x is assigned to the cluster whose mean is at the minimum distance.
 b) After x has been assigned to a cluster, all the means (t1, t2, t3) are recomputed.
 c) Perform steps a) and b) for all sample points.
 Once the iteration finishes, we obtain the final t1, t2, and t3.
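The receptor-update loop above is essentially K-means clustering with M clusters; a batch-style sketch (the one-sample-at-a-time variant described above behaves similarly; names are illustrative):

```python
import numpy as np

def find_receptors(X, M, iters=50, seed=0):
    """Choose M receptors t_j by K-means-style clustering (assumes N > M)."""
    rng = np.random.default_rng(seed)
    # Initial receptors: M distinct sample feature vectors.
    t = X[rng.choice(len(X), size=M, replace=False)]
    for _ in range(iters):
        # a) assign each sample to the nearest receptor
        labels = np.linalg.norm(X[:, None] - t[None, :], axis=2).argmin(axis=1)
        # b) recompute each receptor as the mean of its cluster
        t = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else t[j]
                      for j in range(M)])
    return t
```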
Use the K-nearest-neighbour rule to find the function width σ

Calculation of sigma:

Once the receptors have been calculated, we can use the K-nearest-neighbour rule to compute sigma; we need to select the value of P, the number of nearest receptors considered. A common form of this heuristic sets σ_j = √((1/P) Σᵢ₌₁ᴾ ‖t_j − t_i‖²), where the sum runs over the P receptors nearest to t_j.
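Assuming the root-mean-square form of the P-nearest-receptor heuristic described above (the slide's exact formula was an image and may differ), a sketch:

```python
import numpy as np

def rbf_widths(receptors, P):
    """sigma_j = sqrt(mean squared distance to the P nearest receptors)."""
    # Pairwise distances between all receptors.
    D = np.linalg.norm(receptors[:, None] - receptors[None, :], axis=2)
    sigmas = []
    for j in range(len(receptors)):
        nearest = np.sort(D[j])[1:P + 1]   # skip D[j, j] == 0 (itself)
        sigmas.append(np.sqrt((nearest ** 2).mean()))
    return np.array(sigmas)

T = np.array([[0.0], [1.0], [3.0]])
print(rbf_widths(T, P=1))  # width of each unit = distance to its nearest receptor
```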
Radial Basis Function Neural Network
Training the weight vectors
Let the dimensionality of the hidden layer be M and the sample size be N; then we can calculate the optimal weight vector for the network using the pseudo-inverse matrix solution.
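A sketch of the pseudo-inverse solution, assuming Φ is the N × M matrix of hidden-layer outputs and D is the N × K one-hot target matrix (names illustrative), so that W = Φ⁺D:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 12, 4, 3
Phi = rng.random((N, M))             # hidden-layer outputs for N samples
labels = rng.integers(0, K, size=N)  # class index of each sample
D = np.eye(K)[labels]                # one-hot target rows d_k

# Least-squares-optimal output weights: W = pinv(Phi) @ D
W = np.linalg.pinv(Phi) @ D
print(W.shape)  # (M, K): one weight vector per output class
```

The pseudo-inverse gives the weight matrix minimizing ‖ΦW − D‖ in the least-squares sense, so no iterative training of the output layer is needed.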
Radial Basis Function Neural Network

Every component of dk will be either 1 or 0: it is equal to 1 if the corresponding input vector belongs to class k, and 0 otherwise.
