AI21-Fuzzy Clustering

Fuzzy clustering is a method that allows objects to belong to multiple clusters based on degrees of membership, as opposed to traditional clustering which assigns each object to a single cluster. It utilizes a partition matrix to represent the membership degrees and employs an Expectation-Maximization (EM) algorithm for iterative optimization of cluster centers. The method is particularly useful for handling vague and uncertain information, reflecting the fuzzy nature of human reasoning.

Uploaded by

zaydenguide

Fuzzy Clustering

• Example: Words like young, tall, good or high are fuzzy.


• There is no single quantitative value which defines the term young.
• For some people, age 25 is young, and for others, age 35 is young.
• The concept young has no clean boundary.
• Age 35 has some possibility of being young and usually depends on the
context in which it is being considered.
• Fuzzy set theory is an extension of classical set theory in which
elements have degrees of membership.
• In the real world, much knowledge is fuzzy (i.e., vague, uncertain, inexact, etc.).
• Human thinking and reasoning (analysis, logic, interpretation) frequently involve
fuzzy information.
• Humans can give satisfactory answers, which are probably true.
• Conventional systems are unable to answer many questions because they are
designed on classical set theory, which cannot handle unreliable or incomplete information.
• A system should be able to cope with unreliable and incomplete information.
• Fuzzy systems can provide a solution.
• A membership function $\mu_A(x)$ is associated with a fuzzy set A such
that the function maps every element of the universe of discourse X to
the interval [0,1].
• The mapping is written as: $\mu_A: X \to [0,1]$.
Example
• The more digital camera units that are sold, the more popular the camera is.
• The following formula can be used to compute the degree of popularity of a digital
camera, o, given the sales of o:

• Function pop() defines a fuzzy set of popular digital cameras.

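Then the fuzzy set of popular digital cameras is {A(0.05), B(1), C(0.86), D(0.27)}, where the number in parentheses is each camera's degree of membership.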
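The formula itself is not reproduced above. As a minimal sketch, assume that popularity grows linearly with sales and saturates at 1 once 1,000 units are sold. Both the saturation threshold and the sales figures below are assumptions, chosen only so that the resulting degrees match the values listed for cameras A to D.

```python
def pop(units_sold, saturation=1000):
    """Degree of membership in the fuzzy set of popular cameras.

    Assumed form (not taken from the slides): linear in sales,
    capped at 1 once `saturation` units have been sold.
    """
    if units_sold < 0:
        raise ValueError("units_sold must be non-negative")
    return min(units_sold / saturation, 1.0)


# Hypothetical sales figures, picked to reproduce the degrees quoted in the text.
sales = {"A": 50, "B": 1320, "C": 860, "D": 270}
popular = {camera: pop(units) for camera, units in sales.items()}
print(popular)
```

Any monotone function mapping sales into [0,1] would define a valid membership function; the linear ramp is simply the easiest one to read.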


Fuzzy clustering
• Given a set of objects, a cluster is a fuzzy set of objects. Such a cluster is called a
fuzzy cluster. Consequently, a clustering contains multiple fuzzy clusters.
• Formally, given a set of objects, o1, o2, …, on, a fuzzy clustering of k fuzzy clusters,
C1, C2, …, Ck, can be represented using a partition matrix,
$M = [w_{ij}] \quad (1 \le i \le n,\ 1 \le j \le k),$
where $w_{ij}$ is the membership degree of $o_i$ in fuzzy cluster $C_j$.
• Let c1, c2, …, ck be the centers of clusters C1, C2, …, Ck, respectively. Here, a center
can be defined either as the mean or the medoid, or in other ways specific to the
application.
• The partition matrix should satisfy the following three requirements:
• For each object $o_i$ and cluster $C_j$, $0 \le w_{ij} \le 1$. This requirement enforces
that a fuzzy cluster is a fuzzy set.
• For each object $o_i$, $\sum_{j=1}^{k} w_{ij} = 1$. This requirement ensures that every object
participates in the clustering equivalently.
• For each cluster $C_j$, $0 < \sum_{i=1}^{n} w_{ij} < n$. This requirement ensures that for
every cluster, there is at least one object for which the membership value is
nonzero.
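The three requirements can be checked mechanically. The sketch below assumes the partition matrix is stored as a non-empty list of n rows with k entries each; the function name and the tolerance parameter are illustrative choices, not part of the original material.

```python
def is_valid_partition_matrix(M, tol=1e-9):
    """Check the three requirements on an n-by-k partition matrix M (list of rows)."""
    n = len(M)
    k = len(M[0])
    # 1. Every membership degree lies in [0, 1].
    if any(not (0.0 <= w <= 1.0) for row in M for w in row):
        return False
    # 2. Each object's degrees sum to 1 (up to floating-point rounding).
    if any(abs(sum(row) - 1.0) > tol for row in M):
        return False
    # 3. Each cluster's total membership lies strictly between 0 and n.
    column_sums = [sum(row[j] for row in M) for j in range(k)]
    return all(0.0 < s < n for s in column_sums)


# Hypothetical matrices for illustration.
M_ok = [[1.0, 0.0], [0.3, 0.7], [0.5, 0.5]]
M_bad = [[1.0, 0.0], [1.0, 0.0]]  # the second cluster has no members at all
print(is_valid_partition_matrix(M_ok), is_valid_partition_matrix(M_bad))
```

The second matrix fails the third requirement: one cluster's memberships sum to 0 while the other's sum to n.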
• Usually, the distance or similarity between an object and the center of the cluster
to which the object is assigned can be used to measure how well the object
belongs to the cluster.
• For any object $o_i$ and cluster $C_j$, if $w_{ij} > 0$, then $\mathrm{dist}(o_i, c_j)$ measures how well $o_i$
is represented by $c_j$, and thus belongs to cluster $C_j$.
• Because an object can participate in more than one cluster, the sum of distances
to the corresponding cluster centers weighted by the degrees of membership
captures how well the object fits the clustering.
• For an object $o_i$, the sum of the squared error (SSE) is given by
$$\mathrm{SSE}(o_i) = \sum_{j=1}^{k} w_{ij}^{p}\, \mathrm{dist}(o_i, c_j)^2$$
• where the parameter p ≥ 1 controls the influence of the degrees of membership.
The larger the value of p, the larger the influence of the degrees of membership.
• The SSE for a cluster, $C_j$, is
$$\mathrm{SSE}(C_j) = \sum_{i=1}^{n} w_{ij}^{p}\, \mathrm{dist}(o_i, c_j)^2$$
• Finally, the SSE of the clustering is defined as
$$\mathrm{SSE}(C) = \sum_{i=1}^{n} \sum_{j=1}^{k} w_{ij}^{p}\, \mathrm{dist}(o_i, c_j)^2$$
• The SSE can be used to measure how well a fuzzy clustering fits a data set.
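The three SSE definitions above translate directly into code. This is a minimal sketch assuming Euclidean distance, objects and centers stored as coordinate tuples, and the partition matrix stored as a list of rows; the function names and the sample data are hypothetical.

```python
import math


def sse_object(o, centers, weights, p=2):
    """SSE(o_i): sum over clusters of w_ij^p * dist(o_i, c_j)^2."""
    return sum((w ** p) * math.dist(o, c) ** 2 for w, c in zip(weights, centers))


def sse_cluster(objects, center, column, p=2):
    """SSE(C_j): sum over objects of w_ij^p * dist(o_i, c_j)^2."""
    return sum((w ** p) * math.dist(o, center) ** 2 for w, o in zip(column, objects))


def sse_clustering(objects, centers, M, p=2):
    """SSE(C): the per-object SSE summed over all objects."""
    return sum(sse_object(o, centers, row, p) for o, row in zip(objects, M))


# Hypothetical data: two objects sitting on the two centers.
objects = [(0.0, 0.0), (2.0, 0.0)]
centers = [(0.0, 0.0), (2.0, 0.0)]
M = [[1.0, 0.0], [0.5, 0.5]]
total = sse_clustering(objects, centers, M, p=2)
print(total)
```

Summing `sse_cluster` over all clusters gives the same total as summing `sse_object` over all objects, since both are the same double sum taken in a different order.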
• Fuzzy clustering is also called soft clustering because it allows an object to belong
to more than one cluster.
• It is easy to see that traditional (rigid) clustering, which requires each object to
belong to exactly one cluster, is the special case of fuzzy clustering in which every
membership degree is either 0 or 1.
• In traditional (hard) clustering such as k-means, each iteration consists of two steps:
• The expectation step (E-step): Given the current cluster centers, each object is
assigned to the cluster with a center that is closest to the object. Here, an object
is expected to belong to the closest cluster.
• The maximization step (M-step): Given the cluster assignment, for each cluster,
the algorithm adjusts the center so that the sum of the distances from the
objects assigned to this cluster to the new center is minimized. That is, the
similarity of objects assigned to a cluster is maximized.
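The two steps above describe hard assignment, as in k-means-style clustering. A minimal sketch of one iteration follows, assuming Euclidean distance and taking the mean as the new center (the mean minimizes the sum of squared distances, which is one common reading of the M-step). The function names and sample points are hypothetical.

```python
import math


def e_step(objects, centers):
    """Assign each object to the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(o, centers[j]))
        for o in objects
    ]


def m_step(objects, assignment, centers):
    """Move each center to the mean of the objects assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [o for o, a in zip(objects, assignment) if a == j]
        if not members:
            new_centers.append(old)  # leave the center of an empty cluster unchanged
            continue
        dim = len(old)
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        )
    return new_centers


# Hypothetical points forming two well-separated groups.
objects = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
assignment = e_step(objects, centers)
new_centers = m_step(objects, assignment, centers)
print(assignment, new_centers)
```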
• In the context of fuzzy clustering, an EM algorithm starts with an initial set of
parameters and iterates until the clustering cannot be improved, that is, until the
clustering converges or the change is sufficiently small (less than a preset
threshold).
• Each iteration also consists of two steps:
• The expectation step assigns objects to clusters according to the current fuzzy clustering.
• The maximization step finds the new clustering that minimizes the SSE in fuzzy clustering.
• In the E-step, for each point we calculate its membership degree in each cluster:
$$w_{ij} = \frac{1}{\sum_{q=1}^{k} \left( \frac{\mathrm{dist}(o_i, c_j)^2}{\mathrm{dist}(o_i, c_q)^2} \right)^{\frac{1}{p-1}}}$$
• Euclidean distance is used.
• In the M-step, compute the cluster centroids:
$$c_j = \frac{\sum_{i=1}^{n} w_{ij}^{p}\, o_i}{\sum_{i=1}^{n} w_{ij}^{p}}$$
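The membership and centroid updates above can be sketched as two small functions. This is an illustrative implementation, not a reference one: it assumes Euclidean distance, requires p > 1 (the exponent 1/(p-1) is undefined at p = 1), and adopts the convention that an object lying exactly on a center belongs fully to that center. The function names and sample data are hypothetical.

```python
import math


def update_memberships(objects, centers, p=2):
    """E-step: w_ij = 1 / sum_q (dist(o_i,c_j)^2 / dist(o_i,c_q)^2)^(1/(p-1))."""
    if p <= 1:
        raise ValueError("the update formula needs p > 1")
    exponent = 1.0 / (p - 1)
    M = []
    for o in objects:
        d2 = [math.dist(o, c) ** 2 for c in centers]
        if 0.0 in d2:
            # Assumed convention: full membership in the coinciding center(s).
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            M.append([h / sum(hits) for h in hits])
            continue
        M.append([1.0 / sum((dj / dq) ** exponent for dq in d2) for dj in d2])
    return M


def update_centers(objects, M, p=2):
    """M-step: c_j = sum_i w_ij^p * o_i / sum_i w_ij^p."""
    k, dim = len(M[0]), len(objects[0])
    centers = []
    for j in range(k):
        weights = [row[j] ** p for row in M]
        total = sum(weights)  # nonzero whenever the partition matrix is valid
        centers.append(
            tuple(sum(w * o[d] for w, o in zip(weights, objects)) / total
                  for d in range(dim))
        )
    return centers


# Hypothetical one-dimensional layout embedded in the plane.
objects = [(0.0, 0.0), (4.0, 0.0), (1.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
M = update_memberships(objects, centers, p=2)
new_centers = update_centers(objects, M, p=2)
print(M, new_centers)
```

Alternating the two functions until the centers move by less than a preset threshold gives the iterative procedure described in the text.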
• Suppose two points are randomly selected, c1 = a and c2 = b, as the initial
centers of the two clusters.
• Let p = 2.
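• $$w_{c,c_1} = \frac{1}{\sum_{q=1}^{2} \left( \frac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_q)^2} \right)^{\frac{1}{p-1}}}$$
• When p=2,
$$w_{c,c_1} = \frac{1}{\sum_{q=1}^{2} \left( \frac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_q)^2} \right)^{\frac{1}{2-1}}}$$
• $$w_{c,c_1} = \frac{1}{\mathrm{dist}(c, c_1)^2 \sum_{q=1}^{2} \frac{1}{\mathrm{dist}(c, c_q)^2}} = \frac{1}{\mathrm{dist}(c, c_1)^2 \cdot \left[ \frac{1}{\mathrm{dist}(c, c_1)^2} + \frac{1}{\mathrm{dist}(c, c_2)^2} \right]}$$
$$= \frac{1}{\frac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_1)^2} + \frac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_2)^2}}$$
$$= \frac{\mathrm{dist}(c, c_2)^2}{\mathrm{dist}(c, c_1)^2 + \mathrm{dist}(c, c_2)^2}$$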
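The simplification above can be checked numerically by comparing the general membership formula with the closed form for p = 2 and two clusters. The coordinates below are hypothetical, since the text does not list the points; Euclidean distance is assumed.

```python
import math


def membership_general(o, centers, j, p):
    """General formula: 1 / sum_q (dist(o,c_j)^2 / dist(o,c_q)^2)^(1/(p-1))."""
    d2 = [math.dist(o, c) ** 2 for c in centers]
    return 1.0 / sum((d2[j] / dq) ** (1.0 / (p - 1)) for dq in d2)


def membership_two_clusters(o, c1, c2):
    """Closed form for p = 2, k = 2: dist(o,c2)^2 / (dist(o,c1)^2 + dist(o,c2)^2)."""
    d1 = math.dist(o, c1) ** 2
    d2 = math.dist(o, c2) ** 2
    return d2 / (d1 + d2)


# Hypothetical coordinates for illustration only.
c1, c2, c = (0.0, 0.0), (3.0, 4.0), (0.0, 1.0)
general = membership_general(c, [c1, c2], 0, p=2)
closed = membership_two_clusters(c, c1, c2)
print(general, closed)
```

The closed form also makes the behavior easy to read: the farther c is from c2 relative to c1, the larger its membership in the cluster centered at c1.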
Iteration -1
