Fuzzy Systems
Fuzzy Clustering 1
2. Clustering
3. Fuzzy Clustering
Runtime 80 ms
• A new sport factor is assigned 12 times per second
Statistics:
• Parameter fitting, structure identification, inference method,
model selection
Machine learning:
• Computational learning (PAC learning), inductive learning,
decision tree learning, concept learning
Neural networks: learning from data
Cluster analysis: unsupervised learning
2. Clustering
Hard c-means
3. Fuzzy Clustering
Definition
$d : \mathbb{R}^p \times \mathbb{R}^p \to [0, \infty)$ is a distance function if for all $x, y, z \in \mathbb{R}^p$:
(i) $d(x, y) = 0 \Leftrightarrow x = y$ (identity),
(ii) $d(x, y) = d(y, x)$ (symmetry),
(iii) $d(x, z) \le d(x, y) + d(y, z)$ (triangle inequality).
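As a small illustration of the three axioms, the following sketch checks them on a finite sample of points. The function names and the sample points are made up for illustration, and passing on a sample is of course not a proof that a function is a distance.

```python
import itertools
import math

def euclidean(x, y):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def satisfies_axioms(d, points, tol=1e-12):
    """Check identity, symmetry and triangle inequality of d on a finite sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if (d(x, y) <= tol) != (x == y):       # (i) identity
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # (ii) symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:  # (iii) triangle inequality
            return False
    return True

def squared_euclidean(x, y):
    """Squared Euclidean distance; used here as a counterexample."""
    return euclidean(x, y) ** 2

# Hypothetical sample points in the plane, three of them collinear.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
print(satisfies_axioms(euclidean, points))
print(satisfies_axioms(squared_euclidean, points))
```

On this sample the Euclidean distance passes all three checks, while the squared Euclidean distance fails the triangle inequality on the collinear points (4 > 1 + 1).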
• One group of parameters is optimized while the other is held fixed (and vice versa).
Fuzzy c-means
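Minimize
\[ J_f(X, U, C) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} \]
subject to
\[ \sum_{i=1}^{c} u_{ij} = 1, \quad \forall j \in \{1, \dots, n\} \]
and
\[ \sum_{j=1}^{n} u_{ij} > 0, \quad \forall i \in \{1, \dots, c\}. \]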
Usually m = 2.
Solution procedure:
• Take the partial derivatives of f and set them to zero:
\[ \frac{\partial f}{\partial x} = 2x + y - 4 = 0, \qquad \frac{\partial f}{\partial y} = 2y + x - 5 = 0. \]
• Solve the resulting (here linear) equation system: x = 1, y = 2.
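The two conditions above form a 2x2 linear system, which can be checked with a few lines of code. This is a minimal sketch: the helper `solve_2x2` is hypothetical, and the function `f` is an assumption, since the slide text does not show it. It is one function whose partial derivatives are exactly the two expressions above.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] (x, y)^T = (b1, b2)^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def f(x, y):
    """Assumed objective: its partials are 2x + y - 4 and 2y + x - 5."""
    return x * x + x * y + y * y - 4 * x - 5 * y

# 2x + y = 4 and x + 2y = 5, i.e. both partial derivatives set to zero.
x, y = solve_2x2(2, 1, 1, 2, 4, 5)
print(x, y)

# Sanity check: nearby points should not give a smaller value of f.
neighbours = [(x + dx, y + dy) for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)]
print(all(f(x, y) <= f(px, py) for px, py in neighbours))
```

The solver returns x = 1 and y = 2, in agreement with the solution stated above.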
R. Kruse, C. Doell FS – Fuzzy Clustering 1 Lecture 9 23 / 44
Example
Task: Minimize $f(x_1, x_2) = x_1^2 + x_2^2$ subject to $g : x_1 + x_2 = 1$.
[Figure: the constrained subspace $x + y = 1$ and the unconstrained minimum $p_0 = (0, 0)$ in the $x$-$y$ plane.]
The unconstrained minimum is not in the constrained subspace. At the
minimum in the constrained subspace the gradient does not vanish.
Lagrange Theory: Revisited Example 1
Example task: Minimize $f(x, y) = x^2 + y^2$ subject to $x + y = 1$.
[Figure, left panel: the constraint $C(x, y) = x + y - 1$ with the line $x + y - 1 = 0$. Right panel: the Lagrange function $L(x, y, -1) = x^2 + y^2 - (x + y - 1)$ with its minimum $p_1 = (\frac{1}{2}, \frac{1}{2})$.]
The gradient of the constraint is perpendicular to the constrained
subspace. For the appropriate multiplier (here $\lambda = -1$), the
unconstrained minimum of $L(x, y, \lambda)$ is the minimum of $f(x, y)$
in the constrained subspace.
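For this example the stationarity conditions of the Lagrange function are linear, so they can be solved directly. The sketch below assumes the convention $L(x, y, \lambda) = f(x, y) + \lambda\, C(x, y)$; the helper `solve_linear` is a hypothetical, hand-written elimination routine, not part of the lecture.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Stationarity of L(x, y, lam) = x^2 + y^2 + lam * (x + y - 1):
#   dL/dx   = 2x + lam  = 0
#   dL/dy   = 2y + lam  = 0
#   dL/dlam = x + y - 1 = 0
x, y, lam = solve_linear(
    [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]],
    [0.0, 0.0, 1.0],
)
print(x, y, lam)
```

The result is x = y = 0.5 with multiplier -1, which matches the minimum $p_1$ and the multiplier used in the example.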
Lagrange Theory: Example 2
Example task: find the side lengths $x, y, z$ of a box with maximum
volume for a given area $S$ of the surface.
Formally: maximize $f(x, y, z) = xyz$ such that $2xy + 2xz + 2yz = S$.
Solution procedure:
• The constraint is $C(x, y, z) = 2xy + 2xz + 2yz - S = 0$.
• The Lagrange function is
$L(x, y, z, \lambda) = xyz + \lambda(2xy + 2xz + 2yz - S)$.
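• Taking the partial derivatives yields (in addition to the constraint):
\[ \frac{\partial L}{\partial x} = yz + 2\lambda(y + z) = 0, \quad \frac{\partial L}{\partial y} = xz + 2\lambda(x + z) = 0, \quad \frac{\partial L}{\partial z} = xy + 2\lambda(x + y) = 0. \]
• The solution is $\lambda = -\frac{1}{4}\sqrt{\frac{S}{6}}$, $x = y = z = \sqrt{\frac{S}{6}}$ (i.e. the box is a cube).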
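The stated solution can be checked numerically. The following sketch plugs the cube into the partial derivatives of the Lagrange function and compares its volume with randomly generated boxes of the same surface area. The value S = 6, the sampling ranges and all function names are assumptions chosen for illustration.

```python
import math
import random

def cube_solution(S):
    """Closed-form stationary point: side length a and multiplier lam."""
    a = math.sqrt(S / 6.0)
    return a, -a / 4.0

def lagrange_gradient(x, y, z, lam, S):
    """Partial derivatives of L = xyz + lam * (2xy + 2xz + 2yz - S)."""
    return (
        y * z + 2 * lam * (y + z),
        x * z + 2 * lam * (x + z),
        x * y + 2 * lam * (x + y),
        2 * x * y + 2 * x * z + 2 * y * z - S,
    )

def random_box(S, rng):
    """Random box with surface area S: pick x, y with 2xy < S, then solve for z."""
    while True:
        x = rng.uniform(0.05, 3.0)
        y = rng.uniform(0.05, 3.0)
        if 2 * x * y < S:
            z = (S - 2 * x * y) / (2 * (x + y))
            return x, y, z

S = 6.0  # hypothetical surface area; the cube then has side length 1
a, lam = cube_solution(S)
grad = lagrange_gradient(a, a, a, lam, S)

rng = random.Random(0)
best_random = max(x * y * z for x, y, z in (random_box(S, rng) for _ in range(1000)))
print(grad, a ** 3, best_random)
```

All four derivatives vanish at the cube, and none of the sampled boxes has a larger volume than the cube.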
Optimizing the Membership Degrees
$J_f$ is alternately optimized, i.e.
• optimize U for fixed cluster parameters: $U_\tau = j_U(C_{\tau-1})$,
• optimize C for fixed membership degrees: $C_\tau = j_C(U_\tau)$.
The update formulas are obtained by setting the derivatives of $J_f$
w.r.t. the parameters U and C to zero.
The resulting equations form the fuzzy c-means (FCM)
algorithm [Bezdek, 1981]:
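\[ u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}^{2}}{d_{kj}^{2}} \right)^{\frac{1}{m-1}}} = \frac{d_{ij}^{-\frac{2}{m-1}}}{\sum_{k=1}^{c} d_{kj}^{-\frac{2}{m-1}}}. \]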
Inserting this into the equation for the membership degrees yields
\[ \forall i;\ 1 \le i \le c : \forall j;\ 1 \le j \le n : \quad u_{ij} = \frac{d_{ij}^{\frac{2}{1-m}}}{\sum_{k=1}^{c} d_{kj}^{\frac{2}{1-m}}}. \]
Thus the second step (i.e. the derivatives of $J_f$ w.r.t. the centers)
yields [Bezdek, 1981]
\[ c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}. \]
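The two update formulas can be combined into a short alternating optimization loop. The following is a minimal sketch, not a reference implementation. It assumes the squared Euclidean distance for $d_{ij}^2$, draws the initial centers from the data points, stops when the centers barely move, and uses a small `eps` to avoid division by zero when a data point coincides with a center. All names and the toy data are made up for illustration.

```python
import random

def sq_dist(x, c):
    """Squared Euclidean distance d_ij^2 between data point x and center c."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def update_memberships(data, centers, m=2.0, eps=1e-12):
    """u_ij = d_ij^(2/(1-m)) / sum_k d_kj^(2/(1-m)); returns u[i][j]."""
    exponent = 1.0 / (1.0 - m)  # applied to the squared distances
    u = [[0.0] * len(data) for _ in centers]
    for j, x in enumerate(data):
        # eps guards against division by zero when a point sits on a center
        weights = [max(sq_dist(x, c), eps) ** exponent for c in centers]
        total = sum(weights)
        for i, w in enumerate(weights):
            u[i][j] = w / total
    return u

def update_centers(data, u, m=2.0):
    """c_i = sum_j u_ij^m x_j / sum_j u_ij^m."""
    dim = len(data[0])
    centers = []
    for row in u:
        w = [uij ** m for uij in row]
        total = sum(w)
        centers.append(tuple(
            sum(wj * x[k] for wj, x in zip(w, data)) / total for k in range(dim)
        ))
    return centers

def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Alternate the two updates until the centers move less than tol."""
    rng = random.Random(seed)
    centers = rng.sample(data, c)  # assumption: random data points as start
    for _ in range(max_iter):
        u = update_memberships(data, centers, m)
        new_centers = update_centers(data, u, m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol ** 2:
            break
    return centers, update_memberships(data, centers, m)

# Toy data (made up): two well-separated groups in the plane.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centers, u = fuzzy_c_means(data, c=2)
print(centers)
```

By construction the membership degrees of every data point sum to 1 and each center is a weighted mean of the data. On data with well-separated groups one would expect the centers to settle near the groups, but the procedure only finds a local optimum and the result depends on the initialization.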
Discussion: Fuzzy c-means
It is initialized with randomly placed cluster centers.
Usually X is multidimensional.
[Figure: four plots; vertical axes from 0 to 1, horizontal axes from 4 to 8 (left column) and 1 to 7 (right column).]
For every attribute value and cluster center, only the maximum
membership degree is considered.
3. Convex Completion
[Figure: six plots; vertical axes from 0 to 1, horizontal axes from 4 to 8 (left column) and 1 to 7 (right column).]
Bezdek, J. (1981).
Pattern Recognition With Fuzzy Objective Function Algorithms.
Plenum Press, New York, NY, USA.
Couso, I., Dubois, D., and Sánchez, L. (2014).
Random sets and random fuzzy sets as ill-perceived random variables.
Höppner, F., Klawonn, F., Kruse, R., and Runkler, T. (1999).
Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition.
John Wiley & Sons Ltd, New York, NY, USA.