An Optimized Approach On Applying Genetic Algorithm To Adaptive Cluster Validity Index
IJCSES International Journal of Computer Sciences and Engineering Systems, Vol. 1, No. 4, October 2007
ISSN 0973-4406 © 2007 CSES International
Abstract: Partitioning, or clustering, is an important research branch in the data mining area: it divides a dataset into an arbitrary number of clusters based on the correlated attributes of the elements of the dataset. Most datasets have an original number of clusters, which can be estimated with a cluster validity index, but most existing methods estimate this number incorrectly for many real datasets. To solve this problem, this paper applies the optimization technique of the genetic algorithm (GA) to a new adaptive cluster validity index, called the Gene Index (GI). The algorithm applies GA to adjust the weighting factors of the adaptive cluster validity index and thereby train an optimal index. It is tested with many real datasets, and the results show that, compared with current cluster validity index methods, the proposed algorithm gives higher performance and accurately estimates the original cluster number of real datasets.
Keywords: Clustering, Genetic Algorithm, Cluster Validity Index, Optimization, Data Mining
$$\sum_{i=1}^{c} u_{ik} = 1, \quad \forall k.$$

To assess the effectiveness of the clustering algorithm, the larger the PC index value, the better the performance.

2.2 Cluster Validity Index: PE Index

The PE (partition entropy) index was also proposed in [3], with the definition in Eq. (2):

$$V_{PE}(U) = -\frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{c} \left[ u_{ik} \cdot \log(u_{ik}) \right] \qquad (2)$$

To assess the effectiveness of the clustering algorithm, the smaller the PE index value, the better the performance.

2.3 Cluster Validity Index: XB Index

The XB index was proposed by Xie and Beni in [4], built on the two important concepts of compactness and separation. For a good clustering result, the data points within the same cluster should be as compact as possible, while any two different clusters should be as far apart as possible. This can be formulated by Eq. (3):

$$V_{XB}(U, V; X) = \frac{\sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{2} \left\| x_{j} - \nu_{i} \right\|^{2}}{n \cdot \min_{i \neq k} \left\| \nu_{i} - \nu_{k} \right\|^{2}} \qquad (3)$$

where $x_{j}$ is the $j$-th $d$-dimensional measured data point (we use $d = 2$ here) and $\nu_{i}$ is the $d$-dimensional center of cluster $i$. In Eq. (3), the numerator implies the compactness and the denominator denotes the separation. Therefore, to assess the effectiveness of the clustering algorithm, the smaller the XB index value, the better the performance.
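As a concrete illustration of the definitions above, the following Python sketch (our own illustration, not code from the paper) computes the PC, PE, and XB indices from a fuzzy membership matrix U of shape (c, n), cluster centers V of shape (c, d), and data X of shape (n, d); the PC formula assumes the usual definition $V_{PC}(U) = \frac{1}{n}\sum_{k}\sum_{i} u_{ik}^{2}$ from Sec. 2.1.

import numpy as np

def pc_index(U):
    """Partition coefficient (Sec. 2.1): larger is better."""
    return np.sum(U ** 2) / U.shape[1]

def pe_index(U):
    """Partition entropy, Eq. (2): smaller is better."""
    return -np.sum(U * np.log(U + 1e-12)) / U.shape[1]   # epsilon guards log(0)

def xb_index(U, V, X):
    """Xie-Beni index, Eq. (3): smaller is better.
    U: (c, n) memberships, V: (c, d) centers, X: (n, d) data."""
    sq_dist = np.sum((X[None, :, :] - V[:, None, :]) ** 2, axis=2)  # (c, n)
    compactness = np.sum((U ** 2) * sq_dist)
    center_sq = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(center_sq, np.inf)                             # exclude i == k
    return compactness / (X.shape[0] * center_sq.min())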
2.4 Cluster Validity Index: K Index

The K index was proposed by Kwon [5] as an improvement of the XB index. From Eq. (3), we find that when $c \to n$, $V_{XB} \to 0$, which is generally incorrect for practical applications. By modifying Eq. (3), we obtain Eq. (4):

$$V_{K}(U, V; X) = \frac{\sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{2} \left\| x_{j} - \nu_{i} \right\|^{2} + \frac{1}{c} \sum_{i=1}^{c} \left\| \nu_{i} - \bar{\nu} \right\|^{2}}{\min_{i \neq k} \left\| \nu_{i} - \nu_{k} \right\|^{2}} \qquad (4)$$

where $\bar{\nu}$ denotes the geometric center of the data points. To assess the effectiveness of the clustering algorithm, the smaller the K index value, the better the performance.

2.5 Cluster Validity Index: B_crit Index

The B_crit index was proposed in [6]. It is also composed of compactness and separation parameters in order to obtain the optimal number of clusters, and the two measures are derived independently. First, the separation between clusters is denoted by G(c) in Eq. (5), where $\nu_{i}$ and $\nu_{j}$ denote the centers of clusters $i$ and $j$, with the definition of the distance measure given in Eq. (6), in which $A$ denotes a positive definite matrix with dimension $d \times d$ (or $2 \times 2$ here). For simplicity, the identity matrix $I$ is used to replace the matrix $A$ in Eq. (6) in the distance measure.

Next, the compactness is represented by the ratio between the variance of the data points within each cluster and the variance of the whole data set, denoted by $V_{wt}(c)$:

$$V_{wt}(c) = \frac{1}{c} \sum_{k=1}^{c} \frac{\sum_{q=1}^{d} \mathrm{var}_{q}(k)}{\sum_{q=1}^{d} \mathrm{var}_{total}(q)} \qquad (7)$$

where $\mathrm{var}_{q}(k)$ denotes the variance of the current cluster $k$ along dimension $q$ and $\mathrm{var}_{total}(q)$ denotes the variance of the whole data set along dimension $q$. From experimental results, the value of G(c) is much larger than that of $V_{wt}(c)$, with the ranges $G(c) \in [0, 20]$ and $V_{wt}(c) \in [0, 0.8]$; thus we need to include a weighting factor $\alpha$ to balance the effects of both factors, and we obtain

$$B_{crit}(c) = G(c) + \alpha \cdot V_{wt}(c) \qquad (8)$$

where $\alpha = \max G(c) / \max V_{wt}(c)$ denotes the weighting factor. From the derivations above, the smaller the B_crit index, the better the clustering performance.

2.6 Cluster Validity Index: SV Index

The SV index was proposed in [7]. It also adopts the concepts of compactness and separation, but unlike the B_crit index in Sec. 2.5, both factors are normalized to values between 0 and 1 to balance their effects. In measuring the compactness, the mean within-cluster distance over the $c$ clusters in the data set is calculated,

$$V_{u}(c, V; X) = \frac{1}{c} \sum_{i=1}^{c} \left( \frac{1}{n_{i}} \sum_{x \in X_{i}} \left\| V_{i} - x \right\| \right) \qquad (9)$$

where $n_{i}$ denotes the number of data points within cluster $i$, $V_{i}$ is the geometric center of cluster $i$, and the $c$ mean distances are averaged. The separation measure is simply denoted by $V_{o} = d_{min} / c$, where $d_{min}$ denotes the minimum distance between any two clusters.

Next, the compactness and separation measures are normalized by

$$V_{uN}(c, V; X) = \frac{V_{u}(c, V; X) - \min \left( V_{u}(c, V; X) \right)}{\max \left( V_{u}(c, V; X) \right) - \min \left( V_{u}(c, V; X) \right)}, \qquad (10)$$

$$V_{oN}(c, V) = \frac{V_{o}(c, V) - \min \left( V_{o}(c, V) \right)}{\max \left( V_{o}(c, V) \right) - \min \left( V_{o}(c, V) \right)}. \qquad (11)$$
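The K index and the SV building blocks can be sketched in the same style (again our own illustration with hypothetical names, not the paper's code; `labels` holds a hard assignment of points to clusters, and the min-max normalization of Eqs. (10)-(11) is applied across the candidate cluster numbers c):

import numpy as np

def k_index(U, V, X):
    """Kwon's K index, Eq. (4): smaller is better."""
    sq_dist = np.sum((X[None, :, :] - V[:, None, :]) ** 2, axis=2)   # (c, n)
    compactness = np.sum((U ** 2) * sq_dist)
    penalty = np.sum((V - X.mean(axis=0)) ** 2) / V.shape[0]         # (1/c) sum ||v_i - v_bar||^2
    center_sq = np.sum((V[:, None, :] - V[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(center_sq, np.inf)
    return (compactness + penalty) / center_sq.min()

def sv_compactness(V, X, labels):
    """Mean within-cluster distance V_u of Eq. (9)."""
    return np.mean([np.linalg.norm(X[labels == i] - V[i], axis=1).mean()
                    for i in range(V.shape[0])])

def sv_separation(V):
    """V_o = d_min / c: minimum center-to-center distance over the cluster count."""
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return d.min() / V.shape[0]

def min_max_normalize(values):
    """The min-max normalization of Eqs. (10)-(11), applied across candidate c."""
    v = np.asarray(values, dtype=float)
    return (v - v.min()) / (v.max() - v.min())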
2.7 Preliminary Results with Existing Indices

To evaluate the effectiveness of the existing indices, we generate a two-dimensional, 2000-point, 9-cluster testing database called 'My_sample', illustrated in Fig. 1. All six indices are examined, and the results are given in Table 1.

Table 1
Index Values from 2 to 10 Clusters for the Six Existing Indices on the 'My_sample' Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.722  0.673  0.625  0.640  0.674  0.720  0.757  0.797  0.770
PE       0.631  0.852  1.051  1.071  1.030  0.936  0.855  0.754  0.834
XB       0.277  0.110  0.188  0.224  0.095  0.100  0.068  0.042  0.628
K        555    221    377    449    191    202    139    87     1300
B_crit   17.87  10.85  9.86   10.77  8.14   8.79   8.59   8.39   17.42
SV       1.000  0.691  0.619  0.485  0.334  0.312  0.261  0.220  1.000

With this database, we expect the column with k = 9 to perform the best, i.e., the largest PC value and the smallest values of the other five indices should be obtained there. As Table 1 shows, not all of the indices indicate that the correct clustering result is at k = 9. Moreover, the criterion for PC is to search for its maximum value, while for the remaining indices the criterion is to find the minimum value. Based on these two findings, optimization techniques can be incorporated into the clustering algorithm to search for better and more correct results.
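The paper does not give the generation procedure for 'My_sample'; a comparable two-dimensional, roughly 2000-point, 9-cluster testing set could be generated along the following lines (the Gaussian-blob layout is our assumption):

import numpy as np

rng = np.random.default_rng(0)
centers = [(i, j) for i in (0, 5, 10) for j in (0, 5, 10)]   # 9 well-separated centers
my_sample = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(2000 // 9, 2))        # ~222 points per cluster
    for c in centers
])                                                           # 1998 two-dimensional points
# Each index is then evaluated for k = 2, ..., 10 (e.g., after running fuzzy
# c-means at each k), and the optimizing k is compared against the true value 9.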
3. GENETIC-BASED CLUSTER VALIDITY INDEX

As we can see from Secs. 2.1 to 2.6, every index has its own specific concept for data clustering, and the results in Sec. 2.7 show a diversity of performances. Therefore, we employ the genetic algorithm (GA) to find an optimized result based on the concepts of the indices above. GA consists of three major steps: crossover, mutation, and selection. Based on the fitness function, we integrate our clustering scheme with the GA procedures.

3.1 Preprocessing in GA

We need chromosomes to perform the three steps in GA. We employ five popularly used databases, including auto-mpg [8], bupa [9], cmc [10], iris [11], and wine [12], listed in Table 2, for the GA optimization. Half of the data set in each database is used for training, and the other half is used for testing.
3.2 Deciding the Fitness Function

After considering practical implementations in GA, and based on the indices described in Secs. 2.1 to 2.6, in this paper we propose the genetic-based index for data clustering. The fitness function is denoted by

$$V_{gene}(c, V; X) = \alpha \cdot \frac{1}{c} \sum_{k=1}^{c} \frac{INTRA(k)}{MSD_{t}} + \beta \cdot \frac{\max_{i,j} d(V_{i}, V_{j})}{\min_{i \neq j} d(V_{i}, V_{j})}. \qquad (13)$$

The first term denotes the compactness, with

$$INTRA(k) = \frac{1}{n_{k}} \sum_{x_{j} \in X_{k}} \left\| V_{k} - x_{j} \right\|, \quad \text{and} \qquad (14)$$

$$MSD_{t} = \frac{1}{n_{t}} \sum_{j=1}^{n_{t}} \left\| V_{t} - x_{j} \right\|, \qquad (15)$$

where $n_{k}$ is the number of data points in cluster $k$, $V_{k}$ is the center of cluster $k$, and $n_{t}$ and $V_{t}$ denote the size and the center of the whole data set, respectively. The second term denotes the separation between the cluster centers.
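A minimal Python sketch of the fitness function, under our reading of Eqs. (13)-(15) (the hard labels and all names are ours, not the paper's):

import numpy as np

def gi_fitness(alpha, beta, V, X, labels):
    """V_gene of Eq. (13) for one clustering; smaller is better."""
    c = V.shape[0]
    msd_t = np.linalg.norm(X - X.mean(axis=0), axis=1).mean()     # MSD_t, Eq. (15)
    intra = np.array([np.linalg.norm(X[labels == k] - V[k], axis=1).mean()
                      for k in range(c)])                         # INTRA(k), Eq. (14)
    compactness = (intra / msd_t).mean()                          # (1/c) sum INTRA(k)/MSD_t
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)     # center distances
    separation = d.max() / d[~np.eye(c, dtype=bool)].min()
    return alpha * compactness + beta * separation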
Table 2
The Five Databases Used in This Paper

Training database   # of data points   Testing database   # of data points
auto-mpg_train      196                auto-mpg_test      196
bupa_train          173                bupa_test          172
cmc_train           737                cmc_test           736
iris_train          75                 iris_test          75
wine_train          89                 wine_test          89

3.3 Training with GA

Step 1: Calculating the fitness values: Fitness values are calculated from the training databases in Table 2. At the beginning of the first iteration, the chromosome values are randomly set. During training, the chromosome values are modified based on the output of the previous iteration.

Step 2: Selecting the better chromosomes: All 40 sets of chromosomes are fed into the fitness function, and the corresponding fitness scores are calculated. The 20 chromosomes with the smaller fitness values are kept for use in the next iteration, and the other 20 are discarded. The 20 new chromosomes of the next iteration are produced by crossover and mutation from the 20 chromosomes that remain.

Step 3: Crossover of chromosomes: From the 20 remaining chromosomes, we randomly choose 10 of them and gather them into 5 pairs to perform the crossover operation. By swapping the α or β values of every pair, 10 new chromosomes are produced.

Step 4: Mutation of chromosomes: The 10 chromosomes not chosen in Step 3 are used in this step. The α values of the first five chromosomes are replaced by randomly set new α values; the same operation is performed on the β values of the other five.

Step 5: The stopping condition: Once the pre-determined number of iterations is reached, or when the fitness value equals 0, the training is stopped, and the weighting factors (α, β) corresponding to the smallest fitness score in the final iteration are the output. A sketch of this training loop is given below.
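The following Python sketch shows one reading of Steps 1-5; the surrogate `total_fitness` is a hypothetical stand-in, whereas the real fitness would aggregate V_gene (Eq. (13), previous sketch) over clusterings of the five training databases:

import numpy as np

rng = np.random.default_rng(0)
population = rng.uniform(0.0, 1.0, size=(40, 2))          # Step 1: random (alpha, beta)

def total_fitness(chrom):
    # Hypothetical surrogate; in the real system this would sum gi_fitness
    # over the clusterings of the five training databases in Table 2.
    alpha, beta = chrom
    return (alpha - 0.8561) ** 2 + (beta - 0.0826) ** 2

for _ in range(1000):                                     # Step 5: iteration budget
    scores = np.array([total_fitness(ch) for ch in population])
    order = np.argsort(scores)
    keep = population[order[:20]]                         # Step 2: keep the better 20
    if scores[order[0]] == 0.0:                           # Step 5: early stop at zero fitness
        break
    idx = rng.permutation(20)
    chosen, rest = keep[idx[:10]], keep[idx[10:]].copy()
    a, b = chosen[0::2], chosen[1::2]                     # Step 3: 5 pairs swap their
    cross = np.vstack([np.column_stack([b[:, 0], a[:, 1]]),
                       np.column_stack([a[:, 0], b[:, 1]])])  # alpha values
    rest[:5, 0] = rng.uniform(0.0, 1.0, 5)                # Step 4: fresh random alphas
    rest[5:, 1] = rng.uniform(0.0, 1.0, 5)                # ... and fresh random betas
    population = np.vstack([keep, cross, rest])           # 40 chromosomes again

alpha, beta = keep[0]                                     # smallest-fitness chromosome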
4. SIMULATION RESULTS

After training for 1000 iterations with the GA optimization in Sec. 3.3, we obtain the optimized weighting factors (α, β) = (0.8561, 0.0826). With these two values, we can compare the GA-optimized result with those of the indices in Secs. 2.1 to 2.6 on the five test databases in Table 2. We depict the detailed results for the auto-mpg database in Table 3, the bupa database in Table 4, the cmc database in Table 5, the iris database in Table 6, and the wine database in Table 7, respectively.

The numerical values in Table 3 depict the results for the auto-mpg database, which has three clusters. We can see that only the proposed GA-based index gives the correct result. For the bupa, cmc, iris, and wine databases, similar results are obtained, and detailed comparisons can be found in Table 4 to Table 7, respectively. In addition, from Table 8 we see that the proposed GI finds the correct cluster number in four of the five test databases. Compared with the other six indices, each of which finds only one correct cluster number, our scheme achieves better performance. Regarding the cmc database, none of the seven indices finds the correct cluster number.

Table 3
Index Values from 2 to 10 Clusters in Seven Different Schemes. Shaded Blocks Show the Correct Results for the Auto-mpg Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.866  0.801  0.787  0.764  0.726  0.716  0.713  0.704  0.700
PE       0.330  0.522  0.596  0.679  0.804  0.847  0.869  0.915  0.944
XB       0.056  0.073  0.083  0.067  0.145  0.121  0.121  0.104  0.123
K        11.31  15.30  18.21  15.74  35.53  33.51  36.00  32.70  41.44
B_crit   13.57  8.20   6.94   6.47   9.21   9.89   11.11  11.21  13.24
SV       1.000  0.633  0.466  0.415  0.548  0.592  0.705  0.771  1.000
GI       0.523  0.487  0.521  0.536  0.780  0.854  0.960  0.974  1.148

Table 4
Index Values from 2 to 10 Clusters in Seven Different Schemes. Shaded Blocks Show the Correct Results for the Bupa Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.882  0.664  0.562  0.476  0.411  0.383  0.346  0.328  0.295
PE       0.304  0.809  1.136  1.435  1.676  1.826  2.011  2.131  2.309
XB       0.065  0.511  0.587  0.623  1.480  1.307  1.073  1.395  1.407
K        11.64  94.16  110.2  118.5  284.2  256.1  212.8  271.9  282.5
B_crit   59.03  46.69  45.75  47.62  67.55  83.79  49.41  56.06  63.48
SV       1.000  0.718  0.617  0.555  0.702  0.786  0.699  0.882  1.000
GI       1.088  1.225  1.286  1.328  1.754  1.787  1.690  1.903  1.981

5. CONCLUSION

In this paper, we discussed data clustering schemes and proposed a new cluster validity index based on GA. The GI index outperforms the six existing indices in the literature. However, the clustering results for some databases are not correct even after GA training, and this is the motivation for our future research.

ACKNOWLEDGMENTS

This work is partially supported by the National Science Council (Taiwan) under grant NSC95-2218-E-005-034.
REFERENCES

[1] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Reading, MA: Addison-Wesley, 1989.
[2] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: Fuzzy C-Means Algorithm", Computers and Geosciences, Vol. 10, No. 2-3, 1984, pp. 16-20.
[3] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, New York, NY: Plenum, 1981.
[4] X. L. Xie and G. Beni, "A Validity Measure for Fuzzy Clustering", IEEE Trans. Patt. Anal. Machine Intell., Vol. 13, No. 8, 1991, pp. 841-846.
[5] S. H. Kwon, "Cluster Validity Index for Fuzzy Clustering", Electronics Letters, Vol. 34, No. 22, 1998, pp. 2176-2177.
[6] A. O. Boudraa, "Dynamic Estimation of Number of Clusters in Data Sets", Electronics Letters, Vol. 35, No. 19, 1999, pp. 1606-1608.
[7] D. J. Kim, Y. W. Park, and D. J. Park, "A Novel Validity Index for Determination of the Optimal Number of Clusters", IEICE Trans. Inf. & Syst., Vol. E84-D, No. 2, 2001, pp. 281-285.
[8] R. Quinlan, "Auto-mpg Data", ftp://ftp.ics.uci.edu/pub/machine-learning-databases/auto-mpg/, 1993.
[9] BUPA Medical Research Ltd., "BUPA Liver Disorders", ftp://ftp.ics.uci.edu/pub/machine-learning-databases/liver-disorders/, 1990.
[10] T. S. Lim, "Contraceptive Method Choice", ftp://ftp.ics.uci.edu/pub/machine-learning-databases/cmc/, 1999.
[11] R. A. Fisher, "Iris Plants Database", ftp://ftp.ics.uci.edu/pub/machine-learning-databases/iris/, 1988.
[12] S. Aeberhard, "Wine Recognition Data", ftp://ftp.ics.uci.edu/pub/machine-learning-databases/wine/, 1998.

Table 5
Index Values from 2 to 10 Clusters in Seven Different Schemes. Shaded Blocks Show the Correct Results for the cmc Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.809  0.704  0.597  0.528  0.474  0.423  0.378  0.342  0.321
PE       0.459  0.773  1.089  1.323  1.523  1.723  1.905  2.066  2.189
XB       0.096  0.125  0.197  0.222  0.231  0.296  0.388  0.604  0.539
K        70.86  92.96  146.9  165.7  173.6  223.1  293.0  458.3  410.4
B_crit   18.57  13.26  11.96  13.35  13.37  17.26  16.77  19.17  22.55
SV       1.000  0.580  0.452  0.428  0.440  0.514  0.664  0.935  1.000
GI       0.617  0.595  0.660  0.721  0.771  0.866  0.990  1.214  1.206

Table 6
Index Values from 2 to 10 Clusters in Seven Different Schemes. Shaded Blocks Show the Correct Results for the iris Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.888  0.790  0.738  0.678  0.610  0.584  0.562  0.538  0.535
PE       0.290  0.559  0.736  0.933  1.108  1.216  1.337  1.435  1.486
XB       0.058  0.115  0.160  0.265  0.316  0.549  0.239  0.227  0.289
K        4.622  9.920  14.72  25.25  33.21  61.43  26.73  28.05  36.45
B_crit   18.46  12.13  10.40  10.77  17.03  21.21  16.08  16.80  16.53
SV       1.000  0.724  0.598  0.628  0.695  0.907  0.700  0.832  1.000
GI       0.442  0.510  0.602  0.755  0.887  1.147  0.881  0.953  1.091

Table 7
Index Values from 2 to 10 Clusters in Seven Different Schemes. Shaded Blocks Show the Correct Results for the Wine Database

index    k=2    k=3    k=4    k=5    k=6    k=7    k=8    k=9    k=10
PC       0.868  0.783  0.772  0.746  0.751  0.784  0.786  0.760  0.738
PE       0.328  0.572  0.636  0.720  0.738  0.663  0.677  0.764  0.830
XB       0.067  0.141  0.101  0.081  0.123  0.071  0.097  0.209  0.261
K        6.264  13.81  11.28  11.00  18.96  14.83  22.75  50.47  67.97
B_crit   22.85  14.17  11.64  9.169  11.01  10.06  12.82  19.18  22.89
SV       1.000  0.672  0.569  0.413  0.406  0.357  0.461  0.772  1.000
GI       0.570  0.566  0.605  0.641  0.841  0.828  1.061  1.594  1.896
Table 8
Comparisons of the Seven Indices for the Five Test Databases.
Our Scheme Performs the Best