
Fuzzy Clustering: Presented by CH - Srikanth (07991A1268)

This document provides an overview of fuzzy clustering algorithms. It describes fuzzy clustering as allowing data points to belong to more than one cluster with varying membership degrees between 0 and 1. The document then summarizes several clustering algorithms, including fuzzy c-means clustering. Fuzzy c-means assigns each data point a membership coefficient for every cluster based on its distance from the cluster centers, with the coefficients totaling 1 for each point. It iteratively recalculates the cluster centers and membership coefficients until convergence.
Copyright
© Attribution Non-Commercial (BY-NC)

FUZZY CLUSTERING

Presented By
Ch. Srikanth (07991A1268)

Abstract:
This paper presents a short overview of methods for fuzzy clustering and states desired properties for an optimal fuzzy clustering algorithm. Based on these criteria we chose one of the most prominent fuzzy clustering methods: fuzzy c-means, more precisely probabilistic c-means. This algorithm is presented in more detail, along with some empirical results from clustering 2-dimensional points. To cluster the points, we implemented fuzzy c-means in Java, reading the input from files and writing the output back to files. A few difficulties with the implementation and their possible solutions are also listed. In conclusion, we propose further work that would be needed to fully exploit the power of fuzzy clustering across a range of applications.

Introduction:
Clustering is the process of dividing data elements into classes or clusters so that items in the same class are as similar as possible, and items in different classes are as dissimilar as possible. Depending on the nature of the data and the purpose for which clustering is being used, different measures of similarity may be used to place items into classes, where the similarity measure controls how the clusters are formed. Some examples of measures that can be used in clustering include distance, name and concept. Clustering can also be thought of as a form of data compression, where a large number of samples is converted into a small number of representative clusters.

Types of clustering include hard clustering and soft clustering. The definitions of these techniques are as follows.

Hard clustering assigns each feature vector to one and only one of the clusters, with a degree of membership equal to one and well-defined boundaries between clusters.

Fuzzy clustering allows each feature vector to belong to more than one cluster with different membership degrees (between 0 and 1) and vague or fuzzy boundaries between clusters.

We now present some fuzzy clustering algorithms, devoting the majority of space to hard c-means, fuzzy c-means and possibilistic c-means. For the other methods we provide just a short description, as we did not find them appropriate for our needs.

All algorithms described here are based on objective functions, which are mathematical criteria that quantify the quality of cluster models. The goal of each clustering algorithm is the minimization of its objective function. The following notation will be used in the equations, algorithms and their explanations:

J ... objective function
X = [x1, x2, ..., xn] ... dataset of all objects
C = [c1, c2, ..., ck] ... set of cluster centroids (k is the number of clusters)
dij = ||xi - cj|| ... distance between object xi and center cj
uij ... weight of assignment of object xj to cluster i

HARD C-MEANS:

Hard c-means is better known as k-means, and in general it is not a fuzzy algorithm. However, its overall structure is the basis for all the other methods. Therefore we call it hard c-means, to emphasize that it serves as a starting point for the fuzzy extensions.

The k-means algorithm assigns each point to the cluster whose center (also called centroid) is nearest. The center is the average of all the points in the cluster.

Example: The data set has three dimensions and the cluster has two points: X = (x1, x2, x3) and Y = (y1, y2, y3).

Then the centroid Z becomes Z = (z1, z2, z3), where

$$z_1 = \frac{x_1 + y_1}{2}, \quad z_2 = \frac{x_2 + y_2}{2} \quad \text{and} \quad z_3 = \frac{x_3 + y_3}{2}.$$

The algorithm steps are:

• Choose the number of clusters, k.
• Randomly generate k clusters and determine the cluster centers, or directly generate k random points as cluster centers.
• Assign each point to the nearest cluster center, where "nearest" is defined with respect to one of the distance measures discussed above.
• Recompute the new cluster centers.
• Repeat the two previous steps until some convergence criterion is met (usually that the assignment hasn't changed).

The main advantages of this algorithm are its simplicity and speed, which allow it to run on large datasets. Its disadvantage is that it does not yield the same result with each run, since the resulting clusters depend on the initial random assignments (the k-means++ algorithm addresses this problem by seeking to choose better starting clusters). It minimizes intra-cluster variance, but does not ensure that the result has a global minimum of variance. Another disadvantage is the requirement for the concept of a mean to be definable, which is not always the case. For such datasets the k-medoids variant is appropriate. An alternative, using a different criterion for which points are best assigned to which centre, is k-medians clustering.
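
The hard c-means steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Java implementation mentioned in the abstract: it assumes Euclidean distance, uses the "directly generate k random points" initialisation by sampling k of the input points, and the function names (`centroid`, `nearest`, `k_means`) and the sample points are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise average of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance, an assumption)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def k_means(points, k, max_iter=100, seed=0):
    """Hard c-means: returns (centers, assignment) once the assignment stops changing."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)           # k random input points as initial centers
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centers) for p in points]
        if new_assignment == assignment:      # convergence: no point changed cluster
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:                       # an empty cluster keeps its old center
                centers[j] = centroid(members)
    return centers, assignment


# Centroid of a two-point cluster in three dimensions, as in the example above.
print(centroid([(1.0, 2.0, 3.0), (3.0, 4.0, 5.0)]))   # (2.0, 3.0, 4.0)

# Hypothetical 2-dimensional data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = k_means(points, k=2)
print(centers, labels)
```

Changing `seed` changes the initial centers, which is the source of the run-to-run variation described above; on data as well separated as this sample the final grouping is the same either way.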

Fuzzy c-means clustering

In fuzzy clustering, each point has a degree of belonging to clusters, as in fuzzy logic, rather than belonging completely to just one cluster. Thus, points on the edge of a cluster may be in the cluster to a lesser degree than points in the center of the cluster. For each point x we have a coefficient u_k(x) giving the degree of being in the kth cluster. Usually, the sum of those coefficients for any given x is defined to be 1:

$$\forall x: \quad \sum_{k} u_k(x) = 1.$$

With fuzzy c-means, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster:

$$\mathrm{center}_k = \frac{\sum_x u_k(x)^m \, x}{\sum_x u_k(x)^m}.$$

The degree of belonging is related to the inverse of the distance to the cluster center:

$$u_k(x) = \frac{1}{d(\mathrm{center}_k, x)},$$

then the coefficients are normalized and fuzzified with a real parameter m > 1 so that their sum is 1. So

$$u_k(x) = \frac{1}{\sum_j \left( \dfrac{d(\mathrm{center}_k, x)}{d(\mathrm{center}_j, x)} \right)^{2/(m-1)}}.$$

For m equal to 2, this is equivalent to normalising the coefficients linearly to make their sum 1. When m is close to 1, the cluster center closest to the point is given much more weight than the others, and the algorithm is similar to k-means.

The fuzzy c-means algorithm is very similar to the k-means algorithm: [2]

• Choose a number of clusters.
• Assign randomly to each point coefficients for being in the clusters.
• Repeat until the algorithm has converged (that is, the coefficients' change between two iterations is no more than ε, the given sensitivity threshold):
    o Compute the centroid for each cluster, using the formula above.
    o For each point, compute its coefficients of being in the clusters, using the above formulae.

The algorithm minimizes intra-cluster variance as well, but has the same problems as k-means: the minimum is a local minimum, and the results depend on the initial choice of weights.

The expectation-maximization algorithm is a more statistically formalized method which includes some of these ideas, such as partial membership in classes. It has better convergence properties and is in general preferred to fuzzy c-means.

EXAMPLE:

Initially, take ten two-dimensional points p(x, y), or Pij, where i represents the x-axis value and j represents the y-axis value. First we want to divide the points into two clusters, with different membership values of the points in the clusters.

The INITIAL MEMBERSHIP VALUES are

M =
0.25  0.75
0.21  0.79
0.40  0.60
0.11  0.89
0.07  0.93
0.73  0.27
0.63  0.37
0.74  0.26
0.83  0.17
0.98  0.02

Each data point belongs to the two clusters to different degrees.
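
The fuzzy c-means loop (random initial memberships, then alternating centroid and membership updates until the coefficients change by no more than a threshold) can be sketched in Python as follows. This is a hedged illustration rather than the paper's Java program: the coordinates of the ten example points are not listed in the text, so the `points` below are hypothetical, and the function names, the Euclidean distance, the defaults `m=2.0` and `eps=1e-4`, and the handling of a point that coincides with a center are all assumptions.

```python
import math
import random


def update_centers(points, u, m):
    """Centroid of each cluster: mean of all points weighted by u[i][k] ** m."""
    n_clusters, dim = len(u[0]), len(points[0])
    centers = []
    for k in range(n_clusters):
        w = [u[i][k] ** m for i in range(len(points))]
        total = sum(w)
        centers.append(tuple(
            sum(w[i] * points[i][d] for i in range(len(points))) / total
            for d in range(dim)))
    return centers


def update_memberships(points, centers, m):
    """u_k(x) = 1 / sum_j (d(center_k, x) / d(center_j, x)) ** (2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if min(dists) == 0.0:
            # Assumed convention: a point lying exactly on a center belongs
            # fully to it (shared equally if several centers coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            u.append([h / sum(hits) for h in hits])
        else:
            u.append([1.0 / sum((dk / dj) ** exponent for dj in dists)
                      for dk in dists])
    return u


def fuzzy_c_means(points, n_clusters, m=2.0, eps=1e-4, max_iter=300, seed=0):
    """Returns (centers, memberships); each row of memberships sums to 1."""
    rng = random.Random(seed)
    u = []
    for _ in points:                          # random initial coefficients
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        u.append([r / sum(row) for r in row])
    for _ in range(max_iter):
        centers = update_centers(points, u, m)
        new_u = update_memberships(points, centers, m)
        change = max(abs(a - b)
                     for row_new, row_old in zip(new_u, u)
                     for a, b in zip(row_new, row_old))
        u = new_u
        if change <= eps:                     # sensitivity threshold reached
            break
    return centers, u


# Hypothetical data: ten 2-dimensional points in two loose groups of five.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (1.0, 2.5), (2.0, 2.0),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0), (8.0, 9.5), (9.0, 9.0)]
centers, u = fuzzy_c_means(points, n_clusters=2)
for row in u:
    print(["%.4f" % value for value in row])
```

Which column ends up describing which group depends on the random start, so the columns may be swapped from one seed to another.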

STEP-1:

[Figure: the data points; o = cluster 1 portion, * = cluster 2 portion, x = centres]

Here, all the data points are arranged.

STEP-2:

1. Place two cluster centers.
2. Assign a fuzzy membership to each data point depending upon the distance.

STEP-3:

1. Compute the new center of each class and move the cross (x).

STEP-4:

[Figures: the clusters in Iteration 2, Iteration 5, Iteration 10 and Iteration 13]

Then stop, because there is no visible change. Each data point belongs to the two clusters to a degree.

Then, the FINAL MEMBERSHIP VALUES are

M =
0.0025  0.9975
0.0091  0.9909
0.0129  0.9871
0.0001  0.9999
0.0107  0.9893
0.9393  0.0607
0.9638  0.0362
0.9574  0.0426
0.9906  0.0094
0.9807  0.0193

The membership matrix M:

1. The last five data points (rows) belong mostly to the first cluster (column).
2. The first five data points (rows) belong mostly to the second cluster (column).

APPLICATIONS:

• Marketing: Help marketers discover distinct groups in their customer bases, and then use this knowledge to develop targeted marketing programs.
• Land use: Identification of areas of similar land use in an earth observation database.
• Insurance: Identifying groups of motor insurance policy holders with a high average claim cost.
• City-planning: Identifying groups of houses according to their house type, value, and geographical location.
• Earth-quake studies: Observed earthquake epicenters should be clustered along continent faults.

Conclusion:

• Fuzzy clustering can be done very efficiently.
• It can be done in many different ways.
• The optimal number of clusters K to be created has to be determined in c-means (the number of clusters cannot always be defined a priori, and a good cluster validity criterion has to be found).
• The character and location of cluster prototypes (centers) are not necessarily known a priori, and initial guesses have to be made.
• It has many applications in different fields, as the algorithm can be extended easily with formulae specific to each application.

References:

1. Fuzzy clustering, Wolfram Research.
http://documents.wolfram.com/applications/fuzzylogic/Manual/12.html

2. Kaymak, U. and Setnes, M., Extended Fuzzy Clustering Algorithms.
http://publishing.eur.nl/ir/repub/asset/57/erimrs20001123094510.pdf

3. Fuzzy c-means clustering.
http://en.wikipedia.org/wiki/Cluster_Analysis#Fuzzy_c-means_clustering

4. Cluster analysis.
http://en.wikipedia.org/wiki/Cluster_Analysis

5. Aldenderfer, M.S. and Blashfield, R.K. (1984), Cluster Analysis, Newbury Park (CA): Sage.
