$u_{ik} < N \;\forall i$. $U$ is the matrix which contains the membership degree of an input vector $k$ to a center $i$ ($u_{ik}$). Let $V = [v_{ik}]$ with $i = 1 \ldots m$ and $k = 1 \ldots l+1$ be the matrix where $v_{ik}$ with $k = 1 \ldots l$ are the coordinates of the center $c_i$ and $v_{ik}$ with $k = l+1$ is the hypothetical output for $c_i$.
3.1 w Parameter
We use the parameter $w$ as it was defined for the CFA algorithm. Thanks to this parameter, we are able to consider the variability of the objective function by calculating the difference between the hypothetical output of a cluster and the real output of the values that belong to that cluster. This parameter will allow us to modify the Euclidean distance in
such a way that it rewards the cases in which the output of a cluster is close to the output of an input vector. To calculate $w$ we use:
$$ w_{kj} = \frac{|F(x_k) - o_j|}{\max_{i=1}^{n}\{F(x_i)\} - \min_{i=1}^{n}\{F(x_i)\}} \qquad (1) $$
where $F(x)$ is the function output and $o_j$ is the hypothetical output of the center $j$; this hypothetical output, as we commented before, corresponds to $v_{ik}$ with $k = l+1$.
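As an illustration, a minimal NumPy sketch of equation (1) could look as follows; the names compute_w, F_x (outputs of the input vectors) and o (hypothetical outputs of the centers) are ours, not part of the original formulation:

import numpy as np

def compute_w(F_x, o):
    # w[k, j] = |F(x_k) - o_j| / (max_i F(x_i) - min_i F(x_i)), equation (1)
    output_range = F_x.max() - F_x.min()
    return np.abs(F_x[:, None] - o[None, :]) / output_range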
3.2 Weighted Distance Calculation
The innovation with respect to Fuzzy C-means is the weighting of the distance by a parameter that allows us to consider the variability of the output. In the Fuzzy K-means algorithm, the distance between the center of a cluster and an input vector is defined as the Euclidean distance:
$$ d_{kj} = \|x_k - c_j\|^2 . \qquad (2) $$
Proceeding this way, matrix $U$ will have small values for input vectors that are far away from a center and large values for input vectors near the center. For our functional approximation problem, it is not always true that if a vector is far from a center its membership value has to be small. To solve this, we introduce the parameter $w$ (1), which measures the difference between the output of the input vector and the hypothetical output of the center. Using $w$ we reward the case in which an input vector and a center have the same output even though they are far from each other.
The distance is now calculated by:
$$ d_{kj} = \|z_k - v_j\|^2 \, w_{kj}^{p} \qquad (3) $$
with $p > 1$. The exponent $p$ allows us to increase or decrease how much we want the output to influence the calculation of the distance. Since $w \in [0, 1]$, when we increase the value of $p$, $w^p$ becomes smaller and thus its influence over the distance will be higher, and the centers will concentrate more where the output is more variable.
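A minimal sketch of the weighted distance (3), under the same naming assumptions as above (Z holds the input vectors extended with their outputs, V the centers with their hypothetical outputs, and w the matrix of equation (1)):

import numpy as np

def weighted_distance(Z, V, w, p=2.0):
    # d[k, j] = ||z_k - v_j||^2 * w[k, j]^p, equation (3), with p > 1
    diff = Z[:, None, :] - V[None, :, :]      # (n_vectors, n_centers, l+1)
    sq_dist = np.sum(diff ** 2, axis=2)       # squared Euclidean distances
    return sq_dist * w ** p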
3.3 Migration Process
The migration step is another feature that has been taken from the CFA algorithm; it also appears in another algorithm, ELBG (Russo and Patanè, 1999). This migration process takes place on each iteration, right after the new centers are calculated. The migration scheme is the same as in the CFA algorithm.
In FCFA's migration, a pre-selection process is performed before choosing the centers to be migrated, limiting the set of centers that can be the source or destination of the migration. By doing this selection, we exclude the centers that own a large number of input vectors and have a small error with respect to them.
Let:
error: an array of dimension m
numvec: an array of dimension m
avg(vec): a function that returns the average of the values contained in the array vec
For each center c_i:
1) Accumulate the error (error_i) between the hypothetical output of the center c_i and the output of the vectors that belong to that center (the threshold established is the average of the membership function values for the cluster c_i).
2) numvec_i = number of vectors that belong to c_i.

For each center c_i:
1) If error_i < avg(error) then select c_i.
2) Else, if numvec_i < avg(numvec) then select c_i with distortion_i = error_i · numvec_i.
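The pre-selection above could be sketched as follows; this is only our literal reading of the two loops, with hypothetical names and without the migration itself:

import numpy as np

def preselect_centers(error, numvec):
    # A center is selected when its accumulated error is below the average error
    # or, failing that, when it owns fewer vectors than the average; in the latter
    # case a distortion value error_i * numvec_i is attached to it.
    selected, distortion = [], {}
    avg_error, avg_numvec = np.mean(error), np.mean(numvec)
    for i in range(len(error)):
        if error[i] < avg_error:
            selected.append(i)
        elif numvec[i] < avg_numvec:
            selected.append(i)
            distortion[i] = error[i] * numvec[i]
    return selected, distortion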
3.4 FCFA General Scheme
The convergence to the final solution is reached by an iterative process in which we calculate the matrix $V$ from matrix $U$ and, once this new $V$ is calculated, a new $U$ from this last $V$. This process starts with the random initialization of either of those matrices: ($V_{t-1} \rightarrow U_t \rightarrow V_t$ or $U_{t-1} \rightarrow V_t \rightarrow U_t$).
The stop criterion can be $\|V_{t-1} - V_t\| < \epsilon$ or $\|U_{t-1} - U_t\| < \epsilon$. The iterative process that we use to calculate the matrices $U$ and $V$ is:
$$ u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{D_{ikA}}{D_{jkA}} \right)^{\frac{2}{m-1}} \right]^{-1} \qquad v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, z_k}{\sum_{k=1}^{n} u_{ik}^{m}} \qquad (4) $$
where $D_{jkA}$ is the distance as it is defined in (3). This iterative process is the same as the one that Fuzzy K-means follows, and its convergence is demonstrated in (Bezdek, 1981).
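A compact NumPy sketch of the update equations (4), assuming D holds the weighted distances of (3) with input vectors on the rows and centers on the columns, and m > 1 is the fuzziness exponent; the function names are ours:

import numpy as np

def update_U(D, m):
    # u[i, k] = [ sum_j (D_ikA / D_jkA)^(2/(m-1)) ]^(-1), first part of (4)
    ratio = D[:, :, None] / D[:, None, :]             # (k, i, j)
    u = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)
    return u.T                                         # memberships as (centers, vectors)

def update_V(U, Z, m):
    # v_i = sum_k u_ik^m z_k / sum_k u_ik^m, second part of (4)
    Um = U ** m                                        # (centers, vectors)
    return (Um @ Z) / Um.sum(axis=1, keepdims=True)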
The general scheme that the FCFA algorithm follows is:

i = 1
Initialize a random matrix U_1
Calculate V_1 from U_1
Do
    Calculate w
    Calculate the distance between V_i and Z
    Calculate the new U_i
    Calculate the new V_i from U_i
    Migrate
    i = i + 1
While (abs(V_{i-1} - V_i) >= threshold)
4 EXPERIMENTAL RESULTS
To compare the results provided by the different algorithms we will use the normalized root mean squared error (NRMSE), which is defined as:
$$ \mathrm{NRMSE} = \sqrt{\frac{\sum_{k=1}^{n} \left( F(x_k) - F(x_k; C, R, \Omega) \right)^2}{\sum_{k=1}^{n} \left( F(x_k) - \bar{F} \right)^2}} \qquad (5) $$
where $\bar{F}$ is the average of the outputs of the input vectors.
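A small sketch of (5), where y_true stands for the target outputs F(x_k) and y_pred for the outputs of the trained RBF network (both names are ours):

import numpy as np

def nrmse(y_true, y_pred):
    # Equation (5): normalized root mean squared error
    residual = np.sum((y_true - y_pred) ** 2)
    variance = np.sum((y_true - y_true.mean()) ** 2)
    return np.sqrt(residual / variance)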
To get the radii of the RBFs we used the k-nearest neighbours algorithm with k = 1.
Figure 1: Target function $f_1$.
Table 1 shows the approximation error for the function $f_1$ (Figure 1) obtained by executing the algorithms Fuzzy C-means, CFA, ELBG, Hard C-means and FCFA, together with the results for a random center initialization, before and after applying a local search algorithm (Levenberg-Marquardt).
We can appreciate that the new algorithm gives better results not only after the first initialization but also after the local search process.
5 CONCLUSIONS
We have seen how RBFs can be used to solve functional approximation problems. We have proposed a new algorithm which improves CFA by solving some of its drawbacks, such as the way the partition was performed and the migration process.
From the analysis of the results, we obtain the following conclusions:
1. A first step of RBF center initialization is crucial in order to obtain accurate results when approximating functions using RBFs.
2. When we compare our algorithm with its predecessor, CFA, we have seen how FCFA improves on it and provides much better solutions, due to the use of fuzzy logic and the modified migration process that allows us to escape from local minima.
3. Regardless of which algorithm we use, we should apply an optimization algorithm to
improve the solutions given by the initialization algorithm, in order to exploit the
solution space.
Clusters  Random        Hard CM       Fuzzy CM      ELBG          CFA           FCFA
5         0.985(0.014)  0.979(0.003)  0.988(9e-5)   0.979(0.001)  0.928(0.001)  0.906(0.001)
6         0.972(0.022)  0.972(0.004)  0.985(2e-5)   0.970(0.007)  0.813(0.128)  0.696(0.000)
7         0.976(0.020)  0.964(0.005)  0.983(4e-4)   0.962(0.002)  0.781(0.052)  0.359(0.005)
8         0.960(0.054)  0.961(0.007)  0.980(3e-4)   0.960(0.011)  0.537(0.007)  0.351(0.007)
9         0.987(0.019)  0.947(0.006)  0.978(0.001)  0.954(0.004)  0.514(0.070)  0.343(0.003)
10        0.949(0.029)  0.940(0.015)  0.976(0.001)  0.945(0.006)  0.509(0.113)  0.339(0.000)

5         0.963(0.017)  0.818(0.034)  0.812(0.205)  0.202(0.132)  0.081(0.028)  0.103(0.069)
6         0.953(0.021)  0.759(0.263)  0.783(0.317)  0.152(0.122)  0.090(0.008)  0.103(0.060)
7         0.948(0.029)  0.344(0.281)  0.248(0.115)  0.111(0.072)  0.081(0.016)  0.069(0.017)
8         0.918(0.063)  0.227(0.368)  0.150(0.162)  0.093(0.064)  0.053(0.028)  0.048(0.031)
9         0.939(0.039)  0.252(0.386)  0.300(0.160)  0.073(0.057)  0.056(0.027)  0.027(0.024)
10        0.926(0.028)  0.087(0.100)  0.285(0.334)  0.064(0.039)  0.047(0.015)  0.011(0.015)

Table 1: Average and standard deviation of the approximation error (NRMSE) for function $f_1$, before (upper block) and after (lower block) applying the local search algorithm (Levenberg-Marquardt).
References
Baraldi, A. and Blonda, P. (1999). A Survey of Fuzzy Clustering Algorithms for Pattern Recognition, Part I. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 29(6):786–801.
Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
González, J., Rojas, I., Pomares, H., Ortega, J., and Prieto, A. (2002). A New Clustering Technique for Function Approximation. IEEE Transactions on Neural Networks, 13(1):132–142.
Rojas, I., Pomares, H., González, J., Bernier, J. L., Ros, E., Pelayo, F. J., and Prieto, A. (2000). Analysis of the functional block involved in the design of radial basis function networks. Neural Processing Letters, 12(1):1–17.
Pedrycz, W. (1998). Conditional Fuzzy Clustering in the Design of Radial Basis Function Neural Networks. IEEE Transactions on Neural Networks, 9(4):601–612.
Poggio, T. and Girosi, F. (1990). Networks for approximation and learning. Proceedings of the IEEE, 78(9):1481–1497.
Russo, M. and Patanè, G. (1999). Improving the LBG Algorithm. Lecture Notes in Computer Science, 1606:621–630.
Uykan, Z., Güzeliş, C., Çelebi, M. E., and Koivo, H. N. (2000). Analysis of Input-Output Clustering for Determining Centers of RBFN. IEEE Transactions on Neural Networks, 11(4):851–858.