Fuzzy clustering is a method that allows objects to belong to multiple clusters based on degrees of membership, as opposed to traditional clustering which assigns each object to a single cluster. It utilizes a partition matrix to represent the membership degrees and employs an Expectation-Maximization (EM) algorithm for iterative optimization of cluster centers. The method is particularly useful for handling vague and uncertain information, reflecting the fuzzy nature of human reasoning.
• Example: Words like young, tall, good or high are fuzzy.
• There is no single quantitative value that defines the term young.
• For some people, age 25 is young; for others, age 35 is young.
• The concept young has no clean boundary.
• Age 35 has some possibility of being young, and that possibility usually depends on the context in which it is considered.
• Fuzzy set theory is an extension of classical set theory in which elements have degrees of membership.
• In the real world, much knowledge is fuzzy (i.e., vague, uncertain, inexact, etc.).
• Human thinking and reasoning (analysis, logic, interpretation) frequently involve fuzzy information.
• Humans can give satisfactory answers, which are probably true.
• Our systems are unable to answer many questions because they are designed on classical set theory, while the available information is often unreliable and incomplete.
• A system should be able to cope with unreliable and incomplete information.
• Fuzzy systems can provide a solution.
• A membership function $\mu_A(x)$ is associated with a fuzzy set A such that the function maps every element of the universe of discourse X to the interval [0,1].
• The mapping is written as $\mu_A : X \to [0,1]$.
Example
• The more units of a digital camera that are sold, the more popular the camera is.
• The following formula can be used to compute the degree of popularity of a digital camera, o, given the sales of o:
• Function pop() defines a fuzzy set of popular digital cameras.
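• The resulting fuzzy set of popular cameras is {A(0.05), B(1), C(0.86), D(0.27)}, where the number in parentheses is each camera's degree of membership.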
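As an illustration of a membership function such as pop(), the sketch below assumes a linear ramp that saturates at 1. The saturation threshold of 1,000 units and the sales figures are hypothetical, chosen only so that the resulting degrees match those listed above; they are not taken from the text.

```python
def pop(units_sold, saturation=1000):
    """Degree of popularity in [0, 1]: linear in sales, capped at 1.

    `saturation` (assumed here to be 1000 units) is the sales level at
    which a camera counts as fully popular.
    """
    if units_sold < 0:
        raise ValueError("units_sold must be non-negative")
    return min(units_sold / saturation, 1.0)


# Hypothetical sales figures, chosen to reproduce the degrees listed above.
sales = {"A": 50, "B": 1320, "C": 860, "D": 270}
popular = {camera: pop(units) for camera, units in sales.items()}
print(popular)
```

Any monotone function that maps sales into [0,1] would define a fuzzy set in the same way; the linear ramp is just the simplest choice.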
Fuzzy clustering
• Given a set of objects, a cluster is a fuzzy set of objects. Such a cluster is called a fuzzy cluster. Consequently, a clustering contains multiple fuzzy clusters.
• Formally, given a set of objects, $o_1, o_2, \dots, o_n$, a fuzzy clustering of k fuzzy clusters, $C_1, C_2, \dots, C_k$, can be represented using a partition matrix, $M = [w_{ij}]$ ($1 \le i \le n$, $1 \le j \le k$), where $w_{ij}$ is the membership degree of $o_i$ in fuzzy cluster $C_j$.
• Let $c_1, c_2, \dots, c_k$ be the centers of clusters $C_1, C_2, \dots, C_k$, respectively. Here, a center can be defined either as the mean or the medoid, or in other ways specific to the application.
• The partition matrix should satisfy the following three requirements:
  • For each object, $o_i$, and cluster, $C_j$, $0 \le w_{ij} \le 1$. This requirement enforces that a fuzzy cluster is a fuzzy set.
  • For each object, $o_i$, $\sum_{j=1}^{k} w_{ij} = 1$. This requirement ensures that every object participates in the clustering equivalently.
  • For each cluster, $C_j$, $0 < \sum_{i=1}^{n} w_{ij} < n$. This requirement ensures that for every cluster, there is at least one object for which the membership value is nonzero.
• Usually, the distance or similarity between an object and the center of the cluster to which the object is assigned can be used to measure how well the object belongs to the cluster.
• For any object, $o_i$, and cluster, $C_j$, if $w_{ij} > 0$, then $\mathrm{dist}(o_i, c_j)$ measures how well $o_i$ is represented by $c_j$, and thus belongs to cluster $C_j$.
• Because an object can participate in more than one cluster, the sum of distances to the corresponding cluster centers weighted by the degrees of membership captures how well the object fits the clustering.
• For an object $o_i$, the sum of the squared error (SSE) is given by

$$
SSE(o_i) = \sum_{j=1}^{k} w_{ij}^{p} \, \mathrm{dist}(o_i, c_j)^2
$$

• where the parameter p ≥ 1 controls the influence of the degrees of membership. The larger the value of p, the larger the influence of the degrees of membership.
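The three requirements on the partition matrix can be checked mechanically. The following is a minimal sketch, assuming the matrix is stored as a list of n rows with k entries each; the function name `is_valid_partition_matrix` and the tolerance `tol` are illustrative choices, not part of the original material.

```python
def is_valid_partition_matrix(W, tol=1e-9):
    """Check the three requirements on an n-by-k partition matrix W.

    W[i][j] is the membership degree of object i in cluster j.
    """
    n = len(W)
    if n == 0:
        return False
    k = len(W[0])
    if k == 0 or any(len(row) != k for row in W):
        return False
    # Requirement 1: every membership degree lies in [0, 1].
    if any(not (0.0 <= w <= 1.0) for row in W for w in row):
        return False
    # Requirement 2: the degrees of each object sum to 1 (within tol).
    if any(abs(sum(row) - 1.0) > tol for row in W):
        return False
    # Requirement 3: each cluster's total membership is strictly
    # between 0 and n.
    for j in range(k):
        total = sum(W[i][j] for i in range(n))
        if not (0.0 < total < n):
            return False
    return True


print(is_valid_partition_matrix([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]))
print(is_valid_partition_matrix([[1.0, 0.0], [1.0, 0.0]]))
```

The second matrix fails the third requirement: its second cluster has no object with nonzero membership.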
• The SSE for a cluster, $C_j$, is

$$
SSE(C_j) = \sum_{i=1}^{n} w_{ij}^{p} \, \mathrm{dist}(o_i, c_j)^2
$$

• Finally, the SSE of the clustering is defined as

$$
SSE(C) = \sum_{i=1}^{n} \sum_{j=1}^{k} w_{ij}^{p} \, \mathrm{dist}(o_i, c_j)^2
$$

• The SSE can be used to measure how well a fuzzy clustering fits a data set.
• Fuzzy clustering is also called soft clustering because it allows an object to belong to more than one cluster.
• It is easy to see that traditional (rigid) clustering, which forces each object to belong to exactly one cluster, is a special case of fuzzy clustering in which every $w_{ij}$ is either 0 or 1.
• The k-means algorithm can be viewed as an EM algorithm with two steps:
  • The expectation step (E-step): Given the current cluster centers, each object is assigned to the cluster with a center that is closest to the object. Here, an object is expected to belong to the closest cluster.
  • The maximization step (M-step): Given the cluster assignment, for each cluster, the algorithm adjusts the center so that the sum of the distances from the objects assigned to this cluster to the new center is minimized. That is, the similarity of objects assigned to a cluster is maximized.
• In the context of fuzzy clustering, an EM algorithm starts with an initial set of parameters and iterates until the clustering cannot be improved, that is, until the clustering converges or the change is sufficiently small (less than a preset threshold).
• Each iteration also consists of two steps:
  • The expectation step assigns objects to clusters according to the current fuzzy clustering.
  • The maximization step finds the new clustering that minimizes the SSE in fuzzy clustering.
• In the E-step, for each point we calculate its membership degree in each cluster:

$$
w_{ij} = \frac{1}{\sum_{q=1}^{k} \left( \dfrac{\mathrm{dist}(o_i, c_j)^2}{\mathrm{dist}(o_i, c_q)^2} \right)^{\frac{1}{p-1}}}
$$

• Euclidean distance is used.
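The E-step membership formula and the clustering SSE can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the point coordinates are made up for the example, and giving full membership to an object that coincides with a center is an assumed convention to avoid division by zero. The formula needs p > 1.

```python
def dist2(x, y):
    """Squared Euclidean distance between two points (sequences of numbers)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def memberships(point, centers, p=2.0):
    """E-step for one object: its membership degree in each cluster."""
    if p <= 1:
        raise ValueError("the membership formula requires p > 1")
    d2 = [dist2(point, c) for c in centers]
    if 0.0 in d2:
        # Assumed convention: an object lying exactly on a center
        # belongs fully to that center (shared if several coincide).
        hits = d2.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in d2]
    exponent = 1.0 / (p - 1.0)
    return [1.0 / sum((dj / dq) ** exponent for dq in d2) for dj in d2]


def sse(points, centers, W, p=2.0):
    """SSE of the clustering: sum over i, j of w_ij^p * dist(o_i, c_j)^2."""
    return sum(
        W[i][j] ** p * dist2(points[i], centers[j])
        for i in range(len(points))
        for j in range(len(centers))
    )


# Illustrative coordinates (assumed, not given in the text).
points = [(3, 3), (9, 6)]
centers = [(3, 3), (4, 10)]
W = [memberships(pt, centers) for pt in points]
print(W)
print(sse(points, centers, W))
```

For the point (9, 6) the squared distances to the two centers are 45 and 41, so with p = 2 the memberships are 41/86 and 45/86: the closer center receives the larger degree.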
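• In the M-step, compute the cluster centroids:

$$
c_j = \frac{\sum_{i=1}^{n} w_{ij}^{p} \, o_i}{\sum_{i=1}^{n} w_{ij}^{p}}
$$

• Example: two points are selected at random as the initial centers of the two clusters, $c_1 = a$ and $c_2 = b$.
• Let p = 2.
• For a point c, the membership degree in the first cluster is

$$
w_{c,c_1} = \frac{1}{\sum_{q=1}^{2} \left( \dfrac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_q)^2} \right)^{\frac{1}{p-1}}}
$$

• When p = 2, the exponent is 1 and the expression simplifies:

$$
\begin{aligned}
w_{c,c_1} &= \frac{1}{\sum_{q=1}^{2} \dfrac{\mathrm{dist}(c, c_1)^2}{\mathrm{dist}(c, c_q)^2}}
= \frac{1}{\mathrm{dist}(c, c_1)^2 \left[ \dfrac{1}{\mathrm{dist}(c, c_1)^2} + \dfrac{1}{\mathrm{dist}(c, c_2)^2} \right]} \\
&= \frac{1}{\mathrm{dist}(c, c_1)^2 \cdot \dfrac{\mathrm{dist}(c, c_1)^2 + \mathrm{dist}(c, c_2)^2}{\mathrm{dist}(c, c_1)^2 \, \mathrm{dist}(c, c_2)^2}}
= \frac{\mathrm{dist}(c, c_2)^2}{\mathrm{dist}(c, c_1)^2 + \mathrm{dist}(c, c_2)^2}
\end{aligned}
$$

Iteration 1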
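One full iteration (E-step followed by M-step) with two clusters and p = 2 can be sketched as follows. The six point coordinates are assumed for illustration, since the data set is not listed in the text; the helper names `e_step` and `m_step` and the handling of a point that coincides with a center are likewise assumptions.

```python
def dist2(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def e_step(points, centers, p=2.0):
    """Membership matrix W (n rows, k columns) for the current centers."""
    exponent = 1.0 / (p - 1.0)  # requires p > 1
    W = []
    for pt in points:
        d2 = [dist2(pt, c) for c in centers]
        if 0.0 in d2:
            # Assumed convention: a point on a center belongs fully to it.
            hits = d2.count(0.0)
            W.append([1.0 / hits if d == 0.0 else 0.0 for d in d2])
        else:
            W.append([1.0 / sum((dj / dq) ** exponent for dq in d2)
                      for dj in d2])
    return W


def m_step(points, W, p=2.0):
    """New centers: means of the points weighted by w_ij^p."""
    dim = len(points[0])
    centers = []
    for j in range(len(W[0])):
        weights = [row[j] ** p for row in W]
        total = sum(weights)
        centers.append(tuple(
            sum(w * pt[d] for w, pt in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


# Assumed coordinates for six points; the first two serve as initial centers.
points = [(3, 3), (4, 10), (9, 6), (14, 8), (18, 11), (21, 7)]
centers = [points[0], points[1]]

W = e_step(points, centers)          # E-step: membership degrees
new_centers = m_step(points, W)      # M-step: weighted centroids
for row in W:
    print([round(w, 2) for w in row])
print(new_centers)
```

For the third point, the p = 2 closed form gives a membership of 41 / (45 + 41) in the first cluster, which is what `e_step` computes. Repeating the two steps until the centers stop moving by more than a preset threshold gives the complete algorithm.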