
Global Journal of Computer Science and Technology: C

Software & Data Engineering


Volume 15 Issue 7 Version 1.0 Year 2015
Type: Double Blind Peer Reviewed International Research Journal
Publisher: Global Journals Inc. (USA)
Online ISSN: 0975-4172 & Print ISSN: 0975-4350

A Modified Version of the K-Means Clustering Algorithm


By Juhi Katara & Naveen Choudhary
Maharana Pratap University of Agriculture and Technology, India
Abstract- Clustering is a technique in data mining which divides a given data set into small clusters based on their similarity. The k-means clustering algorithm is a popular, unsupervised and iterative clustering algorithm which divides a given dataset into k clusters. But the traditional k-means clustering algorithm has some drawbacks: it takes more time to run, as it has to calculate the distance between each data object and all centroids in each iteration, and the accuracy of the final clustering result mainly depends on the correctness of the initial centroids, which are selected randomly. This paper proposes a methodology which finds better initial centroids; this method is then combined with an existing improved method for assigning data objects to clusters, which requires two simple data structures to store information about each iteration for use in the next iteration. The proposed algorithm is compared in terms of time and accuracy with the traditional k-means clustering algorithm as well as with a popular improved k-means clustering algorithm.
Keywords: clustering, data mining, initial centroids, k-means clustering.
GJCST-C Classification : B.2.4 B.7.1


Strictly as per the compliance and regulations of:

© 2015. Juhi Katara & Naveen Choudhary. This is a research/review paper, distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
A Modified Version of the K-Means Clustering
Algorithm
Juhi Katara α & Naveen Choudhary σ

Abstract- Clustering is a technique in data mining which divides a given data set into small clusters based on their similarity. The k-means clustering algorithm is a popular, unsupervised and iterative clustering algorithm which divides a given dataset into k clusters. But the traditional k-means clustering algorithm has some drawbacks: it takes more time to run, as it has to calculate the distance between each data object and all centroids in each iteration, and the accuracy of the final clustering result mainly depends on the correctness of the initial centroids, which are selected randomly. This paper proposes a methodology which finds better initial centroids; this method is then combined with an existing improved method for assigning data objects to clusters, which requires two simple data structures to store information about each iteration for use in the next iteration. The proposed algorithm is compared in terms of time and accuracy with the traditional k-means clustering algorithm as well as with a popular improved k-means clustering algorithm.

Keywords: clustering, data mining, initial centroids, k-means clustering.

Author α σ: Department of Computer Science & Engineering, College of Technology and Engineering, Maharana Pratap University of Agriculture and Technology, Udaipur, Rajasthan, India. e-mail: [email protected]

I. Introduction

Data mining refers to using a variety of data analysis techniques and tools to discover previously unknown, valid patterns and relationships in large datasets [5]. Data mining techniques like clustering and association can be used to find meaningful patterns for future predictions. Clustering may be defined as a preprocessing step in data mining algorithms in which the data objects are divided into clusters with high intra-cluster similarity and low inter-cluster similarity [3], [10].

Clustering can be applied to a wide range of fields like pattern recognition, marketing, image processing etc. [3]. Clustering algorithms are mainly divided into partitioning, hierarchical, density based, grid based and model based clustering algorithms. A partitioning clustering algorithm first creates an initial set of k partitions, where parameter k is the number of partitions to construct; it then uses an iterative relocation technique that tries to improve the clustering by moving objects from one class to another. A hierarchical clustering algorithm creates a hierarchical decomposition of the dataset using some criterion; the method can be categorized as either agglomerative or divisive, based on how the hierarchical decomposition is designed. A density based clustering algorithm uses the notion of density for clustering data objects: it grows clusters either according to the density of neighborhood objects or according to some density function. A grid based clustering algorithm first quantizes the object space into a finite number of cells that form a grid structure, and then performs clustering on the grid structure. A model based clustering algorithm attempts to optimize the fit between the given data and some mathematical model.

K-means clustering is a partitioning clustering technique in which clusters are formed with the help of centroids. It follows an unsupervised, non-deterministic and iterative approach to clustering, and proceeds by minimizing the average squared Euclidean distance between the data objects and the cluster centroids. The result of the k-means clustering algorithm is affected by the choice of initial centroids: distinct initial centroids might result in distinct final clusters. The centroid of a cluster may be defined as the mean of the objects in the cluster; it is not necessarily a member of the dataset.

II. Traditional K-Means Clustering Algorithm

K-means clustering is the most popular clustering algorithm [9]. In traditional k-means clustering the given dataset is classified into k disjoint clusters, where the value of k is given as input to the algorithm. The algorithm is implemented in two phases. In the first phase k centroids are selected randomly. In the second phase each data object is assigned to the cluster whose centroid is closest. When all data objects have been assigned to one of the k clusters, the first iteration is completed and an early grouping is done. After completion of the first iteration the centroids are recalculated by taking the mean of the data objects of each cluster. As k new centroids are calculated, a new assignment is done between the same data objects and the new centroids, generating a loop that results in a number of iterations. As a result of this loop the k centroids and the data objects may change their position in a step by step manner. Ultimately a situation will occur where the centroids do not update anymore. This means the

© 2015 Global Journals Inc. (US)


convergence criterion for clustering is achieved. In this algorithm the Euclidean distance is generally used to find the distance between data objects and centroids [3]. Between one data object X = (x1, x2, ..., xn) and another data object Y = (y1, y2, ..., yn) the Euclidean distance d(X, Y) is calculated as follows:

d(X, Y) = [(x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2]^(1/2)

Algorithm 1: The Traditional K-Means Clustering Algorithm [3]

Input: D = {d1, d2, ..., dn} // set of n data objects
       k // number of required clusters
Output: k clusters
Steps:
1. Randomly select k data objects from D as initial centroids.
2. Calculate the distance between each data object di (1 <= i <= n) and all k cluster centroids cj (1 <= j <= k), then allocate data object di to the cluster with the closest centroid.
3. Calculate the new mean for each cluster. // the new mean is the updated centroid of the cluster
4. Repeat steps 2 and 3 until there is no change in the centroids of the clusters.

III. Drawbacks of the Traditional K-Means Clustering Algorithm

The traditional k-means clustering algorithm has several drawbacks. The major drawback is that its performance mainly depends on the initial centroids, which are selected randomly, so the resulting clusters differ between runs on the same input dataset. Another drawback is the distance calculation process, which makes the algorithm slow to converge: it calculates the distance from each data object to every cluster centroid in each iteration, even though there is often no need to recalculate these distances each time, since some data objects remain in the same cluster over several iterations. This affects the performance of the algorithm. One more drawback of k-means clustering is that the number of clusters to be formed must be given as input by the user.

IV. Related Work

Xiuyun Li et al. [1] proposed an enhanced k-means clustering algorithm based on fuzzy feature selection. This algorithm generates a feature important factor (FIF) weight to describe the contribution of each feature to the clustering, and makes use of the FIF to improve the similarity measure and thereby the clustering result.

Wang Shunye et al. [5] proposed an improved k-means clustering algorithm with optimal initial centroids based on dissimilarity. This algorithm computes the dissimilarity to reflect the degree of correlation between data objects and then uses a Huffman tree to find the initial centroids. It takes less time because the number of iterations diminishes through the Huffman algorithm.

Shi Na et al. [3] proposed an improved k-means clustering algorithm to increase the efficiency of the k-means clustering algorithm. This algorithm requires two simple data structures to store information in every iteration for use in the next iteration. The improved algorithm does fewer calculations, which saves run time.

Mohammed El Agha et al. [4] proposed an improved k-means clustering algorithm with ElAgha initialization, which uses a guided random technique, since the k-means clustering algorithm suffers from the initial centroids problem. ElAgha initialization outperformed random initialization and enhanced the quality of clustering by a large margin on complex datasets.

K. A. Abdul Nazeer et al. [2] proposed an algorithm to enhance the accuracy and efficiency of the k-means clustering algorithm. This algorithm consists of two phases: the first phase determines initial centroids systematically so as to produce clusters with better accuracy, and the second phase allocates data objects to the appropriate clusters in less time. This algorithm outputs good clusters in a short run time.

V. Proposed Algorithm

In this section a modified algorithm is proposed for improving the performance of the k-means clustering algorithm. In paper [3] the authors proposed an improved k-means clustering algorithm to improve the efficiency of the k-means clustering algorithm, but in that algorithm the initial centroids are selected randomly, so the method is very sensitive to the initial centroids: random selection of initial centroids does not guarantee a unique clustering result. In paper [5] the authors proposed an improved k-means clustering algorithm with optimal initial centroids based on dissimilarity; however, this algorithm is computationally complex and requires more time to run.
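For concreteness, the traditional procedure of Algorithm 1 can be sketched in Python. This is an illustrative sketch, not the paper's MATLAB implementation; the function and variable names are our own, and no guard against empty clusters is included.

```python
import numpy as np

def kmeans(data, k, max_iter=100, seed=0):
    """Traditional k-means (Algorithm 1): random initial centroids and a full
    object-to-centroid distance computation in every iteration."""
    rng = np.random.default_rng(seed)
    # Phase 1: randomly select k data objects as the initial centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        # Phase 2: distance from every data object to every centroid,
        # recomputed from scratch each iteration (the cost Section III criticises).
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # The new mean of each cluster becomes its updated centroid.
        new_centroids = np.array([data[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):  # centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids
```

Note the full n-by-k distance matrix recomputed in every iteration; this is exactly the cost that the modification below sets out to reduce.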
In this paper we propose a new approach for selecting better initial centroids which outputs a unique clustering result and increases the accuracy of the basic k-means clustering algorithm; this approach is combined with the algorithm of paper [3] for allocating the data objects to the suitable cluster. The algorithm of paper [3] is referred to as the Shi Na improved k-means clustering algorithm in this paper. We compared the traditional k-means clustering algorithm, the Shi Na improved k-means clustering algorithm [3] and the proposed algorithm in terms of time and accuracy.

Algorithm 2: The Modified K-Means Clustering Algorithm

Input: D = {d1, d2, d3, ..., dn} // dataset of n data objects
       k // number of required clusters
Output: a set of k clusters
Steps:
1. Calculate the distance of each data object di in the dataset D from the origin.
2. Sort the data objects according to the distances obtained in step 1.
3. Divide the sorted data objects into k equal sets.
4. Select the middle data object of each set as an initial centroid.
5. Calculate the distance between each data object di (1 <= i <= n) and all k cluster centroids cj (1 <= j <= k) as the Euclidean distance d(di, cj).
6. For each data object di find the closest centroid cj and assign di to cluster j.
7. Repeat steps 8 to 11 until there is no change in the centroids of the clusters.
8. Store the cluster number in array cluster[ ]: set cluster[i] = j.
9. Store the distance of each data object from its closest centroid in array dist[ ]: set dist[i] = d(di, cj).
10. For each cluster j (1 <= j <= k) recalculate the cluster centroid.
11. For each data object di:
    11.1 Compute its distance from the newly computed centroid of its present cluster.
    11.2 If this distance is less than or equal to the previous closest distance, the data object remains in the same cluster.
         Else:
         11.2.1 For every centroid cj (1 <= j <= k) calculate the distance d(di, cj), then assign di to the cluster with the closest centroid.
    End For

In the proposed algorithm the distance of each data object from the origin is calculated. The data objects are then sorted in accordance with these distances; insertion sort is used for sorting in this paper. The sorted data objects are divided into k equal sets, and the middle data object of each set is taken as an initial centroid. This process of selecting centroids outputs a better, unique clustering result. Then for every data object in the dataset the distance from every initial centroid is calculated. The next step is an iterative process which reduces the required run time. The data objects are assigned to the cluster with the closest centroid. Two data structures, cluster[ ] and dist[ ], are required to store information about the completed iteration of the algorithm: array cluster[ ] stores the number of the cluster to which each data object belongs, and array dist[ ] stores the distance of every data object from its closest centroid. Next, for each cluster obtained in the completed iteration, the new centroid is calculated by taking the mean of its data objects.

Then for each data object the distance from the newly calculated centroid of its present cluster is calculated. If this distance is less than or equal to the previous closest distance, the data object remains in the same cluster; otherwise, for every such remaining data object the distance from all the newly calculated centroids is computed and the data object is assigned to the cluster with the closest centroid. The arrays cluster[ ] and dist[ ] are then updated with the new values obtained in this step. This reassignment process is repeated until there is no change in the centroids of the clusters.

VI. Experimental Results and Discussion

All the experiments are carried out on an Intel Core i3 based PC with 4 GB RAM, running the Windows 7 64-bit operating environment; the programming platform is MATLAB version R2013a.

In this paper two datasets are taken from the UCI repository of machine learning databases [6] to test the performance of the proposed k-means clustering algorithm and to compare the traditional k-means clustering algorithm, the Shi Na improved k-means clustering algorithm [3] and the proposed algorithm. The Iris and Wine datasets are selected as the test datasets [6]. The values of the attributes are numeric.

A brief introduction of the datasets used in the experimental evaluation is given in the table below:

Table 1: Characteristics of datasets

Dataset   Number of attributes   Number of instances
Iris      4                      150
Wine      13                     178
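The centroid selection of steps 1-4 and the cluster[ ]/dist[ ] bookkeeping of steps 7-11 can be sketched as follows. This is an illustrative Python sketch, not the paper's MATLAB code: NumPy's sort stands in for the insertion sort used in the paper, and no guard against empty clusters is included.

```python
import numpy as np

def initial_centroids(data, k):
    """Steps 1-4 of Algorithm 2: sort objects by distance from the origin,
    split into k equal sets, take the middle object of each set."""
    order = np.argsort(np.linalg.norm(data, axis=1))  # paper uses insertion sort
    sets = np.array_split(order, k)
    return data[[s[len(s) // 2] for s in sets]]

def modified_kmeans(data, k, max_iter=100):
    """Steps 5-11: k-means with the dist[]/cluster[] shortcut of paper [3]."""
    centroids = initial_centroids(data, k)
    d = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    cluster = d.argmin(axis=1)  # cluster[i]: index of the closest centroid
    dist = d.min(axis=1)        # dist[i]: distance to that centroid
    for _ in range(max_iter):
        # Step 10: recalculate centroids (assumes no cluster becomes empty).
        new_centroids = np.array([data[cluster == j].mean(axis=0) for j in range(k)])
        moved = not np.allclose(new_centroids, centroids)
        centroids = new_centroids
        # Step 11.1: distance to the new centroid of each object's present cluster.
        own = np.linalg.norm(data - centroids[cluster], axis=1)
        keep = own <= dist  # step 11.2: these objects stay put, no full scan needed
        dist[keep] = own[keep]
        if (~keep).any():   # step 11.2.1: only the rest compare with all centroids
            d = np.linalg.norm(data[~keep, None, :] - centroids[None, :, :], axis=2)
            cluster[~keep] = d.argmin(axis=1)
            dist[~keep] = d.min(axis=1)
        if not moved:       # step 7: centroids unchanged, convergence reached
            break
    return cluster, centroids
```

Because the initialization is deterministic, repeated runs on the same dataset return the same clustering, which is the "unique result" property claimed for the proposed method.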


a) Iris dataset

The Iris dataset contains three classes of iris flower: setosa, versicolour and virginica. The dataset contains 150 instances in three classes; each class contains 50 instances with four attributes: sepal length, sepal width, petal length and petal width.

b) Wine dataset

This dataset contains the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivators. The dataset contains 178 instances in three classes with 13 attributes; the first class contains 59 instances, the second class 71 and the third class 48. The attributes of the dataset are alcohol, malic acid, ash, alcalinity of ash, magnesium, total phenols, flavonoids, nonflavanoid phenols, proanthocyanins, color intensity, hue, OD280/OD315 of diluted wines and proline.

The same datasets are given as input to all the algorithms. The number of clusters k is set to three for both datasets. The experiment compares the proposed k-means clustering algorithm with the traditional k-means clustering algorithm and with the Shi Na improved k-means clustering algorithm [3] in terms of time and accuracy.

Accuracy: the number of correctly predicted instances divided by the total number of instances.

Time: the amount of time that passes from the start of an algorithm to its finish.

The accuracy of clustering is determined by comparing the clustering results with the class labels already available in the UCI datasets [6]. The traditional and the Shi Na improved k-means clustering algorithms give different accuracy and time on every run, as they select the initial centroids randomly, so these algorithms are executed several times and the averages of accuracy and time are taken. The accuracy of the proposed k-means clustering algorithm is the same on every run, but the time differs between runs, so it too is executed several times and the average time is taken.

Fig. 1: Accuracy comparison chart for Iris dataset

Fig. 2: Time comparison chart for Iris dataset
Table 2: Performance comparison on Iris dataset

Parameters        Traditional k-means    Shi Na improved k-means    Proposed k-means
                  clustering algorithm   clustering algorithm       clustering algorithm
Accuracy (in %)   76                     80                         89
Time (in ms)      86                     24                         4
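The paper does not spell out how cluster numbers are matched against the UCI class labels when computing accuracy; one common convention, assumed in the sketch below, is to credit each cluster with its majority class.

```python
import numpy as np
from collections import Counter

def clustering_accuracy(pred_clusters, true_labels):
    """Accuracy = correctly predicted instances / total instances, with each
    cluster mapped to the majority true class among its members (a common
    convention; the paper does not specify its matching rule)."""
    correct = 0
    for j in np.unique(pred_clusters):
        members = true_labels[pred_clusters == j]
        # Count the members belonging to the cluster's most frequent class.
        correct += Counter(members.tolist()).most_common(1)[0][1]
    return correct / len(true_labels)
```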

Fig. 3: Accuracy comparison chart for Wine dataset

Fig. 4: Time comparison chart for Wine dataset

Table 3: Performance comparison on Wine dataset

Parameters        Traditional k-means    Shi Na improved k-means    Proposed k-means
                  clustering algorithm   clustering algorithm       clustering algorithm
Accuracy (in %)   64                     66                         70
Time (in ms)      115                    27                         10

The result of the experiment shows that the proposed k-means clustering algorithm outputs a better, unique clustering result in less time than the traditional k-means clustering algorithm and the Shi Na improved k-means clustering algorithm [3], as it selects better initial centroids, which reduces the number of iterations. The Shi Na improved method [3] of assigning data objects to the appropriate clusters results in a smaller number of distance calculations. The proposed algorithm combines both of these methods and so runs in less time, while at the same time improving the accuracy of the algorithm.

VII. Conclusion

The k-means clustering algorithm is one of the most popular and effective algorithms for clustering datasets and is used in a number of fields, from scientific to commercial applications. However, the algorithm has several drawbacks: the selection of the initial centroids is random, which does not guarantee a unique clustering result, and k-means clustering involves a large number of iterations and distance calculations, which results in a long run time. Various enhancements of the traditional k-means clustering algorithm have been carried out by different researchers, addressing different drawbacks. The proposed algorithm combines a systematic way of selecting initial centroids with an efficient method for assigning data objects to clusters, and is found to be more accurate, efficient and feasible. The value of k, the required number of clusters, is still required as an input to the proposed algorithm; intelligent pre-estimation of the value of k is suggested as future work.

References

1. Xiuyun Li, Jie Yang, Qing Wang, Jinjin Fan, Peng Liu, "Research and Application of Improved K-means Algorithm Based on Fuzzy Feature Selection", Fifth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 1, IEEE, 2008.
2. K. A. Abdul Nazeer and M. P. Sebastian, "Improving the accuracy and efficiency of the k-means clustering algorithm", International Conference on Data Mining and Knowledge Engineering (ICDMKE), Proceedings of the World Congress on Engineering (WCE 2009), vol. 1, London, UK, July 2009.
3. Shi Na, Liu Xumin, Guan Yong, "Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm", Third International Symposium on Intelligent Information Technology and Security Informatics, IEEE, 2010.
4. Mohammed El Agha, Wesam M. Ashour, "Efficient and Fast Initialization Algorithm for K-means Clustering", I.J. Intelligent Systems and Applications, 2012.
5. Wang Shunye, Cui Yeqin, Jin Zuotao, Liu Xinyuan, "K-means algorithm in the optimal initial centroids based on dissimilarity", Journal of Chemical and Pharmaceutical Research, 2013.


6. Merz C. and Murphy P., UCI Repository of Machine Learning Databases, available: ftp://ftp.ics.uci.edu/pub/machine-learning-databases
7. Charles Elkan, "Using the Triangle Inequality to Accelerate k-Means", Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), Washington DC, 2003.
8. Hong Liu and Xiaohong Yu, "Application Research of k-means Clustering Algorithm in Image Retrieval System", Proceedings of the Second Symposium on International Computer Science and Computational Technology (ISCSCT), 2009.
9. Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", 2nd edition, Morgan Kaufmann, 2006.
10. Osama Abu Abbas, "Comparisons between data clustering algorithms", The International Arab Journal of Information Technology, vol. 5, no. 3, July 2008.
11. Oyelade O. J., Oladipupo O. O., Obagbuwa I. C., "Application of k-Means Clustering algorithm for prediction of Students Academic Performance", International Journal of Computer Science and Information Security, vol. 7, 2010.
12. Chunfei Zhang, Zhiyi Fang, "An Improved K-means Clustering Algorithm", Journal of Information & Computational Science, 2013.