0% found this document useful (0 votes)

56 views37 pages

Survey of Clustering Algorithms

1. The document summarizes and compares different clustering algorithms including hierarchical clustering and k-means partitioning clustering. It discusses how each algorithm works and its complexity. 2. The algorithms are tested on the Iris dataset using the WEKA tool to analyze and visualize the differences between the clustering results. 3. While each algorithm has its strengths, k-means clustering is found to be the simplest to use compared to the other algorithms for this dataset and application.

Uploaded by

Aniket Roy

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

56 views37 pages

Survey of Clustering Algorithms

Uploaded by

Aniket Roy

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 37

Survey of Clustering Algorithms

PREPARED BY-

GUIDED BY-

DEBABRAT DAS- R010113014

JYOTISMITA TALUKDAR

ANIKET ROY- R010113007

ASST. PROFESSOR
CIT, UTM

CONTENTS
PROBLEM STATEMENT
WHAT IS CLUSTERING?
ALGORITHMS FOR CLUSTERING
HIERARCHICAL CLUSTERING
PARTITIONING BASED CLUSTERING
k-means algorithm

BRIEF INTRODUCTION TO WEKA

CONCLUSION

PROBLEM STATEMENT AND AIM

GIVEN A SET OF RECORDS (INSTANCES, EXAMPLES, OBJECTS, OBSERVATIONS, ),
ORGANIZE THEM INTO CLUSTERS (GROUPS, CLASSES)

CLUSTERING: THE PROCESS OF GROUPING PHYSICAL OR ABSTRACT OBJECTS INTO

CLASSES OF SIMILAR OBJECTS

AIM:
HERE WE USE WEKA TOOL TO COMPARE DIFFERENT CLUSTERING ALGORITHM ON IRIS
DATA SET AND SHOWN DIFFERENCES BETWEEN THEM.

WHAT IS CLUSTERING?
CLUSTERING IS THE MOST COMMON FORM OF UNSUPERVISED LEARNING.

CLUSTERING IS THE PROCESS OF GROUPING A SET OF PHYSICAL OR

ABSTRACT OBJECTS INTO CLASSES OF SIMILAR OBJECTS.

APPLICATION
Clustering help marketers discover distinct groups in their customer base. and they can
characterize their customer groups based on the purchasing patterns.
Thematic maps in GIS by clustering feature spaces
WWW
Document classification
Cluster weblog data to discover groups of similar access patterns

Clustering is also used in outlier detection applications such as detection of credit card fraud.
As data mining function, cluster analysis serves as a tool to gain insight into the distribution of
data to observe characteristics of each cluster.

ALGORITHMS TO BE
ANALYSED

HIERARCHICAL CLUSTERING ALGORITHM

PARTITIONING BASED CLUSTERING ALGORITHM

Hierarchical Clustering

Initially each point in the data set is assigned as a cluster.

Then we repeatedly combine two nearest points in a single cluster.

The distance function is calculated on the basis of the characteristics we want to cluster.

Hierarchical:
Agglomerative (bottom up):
Initially, each point is a cluster
Repeatedly combine the two
nearest clusters into one
Divisive (top down):
start with one cluster and recursively split it

s
r
e
w
Ans

Three important questions:

How do you represent a cluster
of more than one point?

Centroid- Centroid (mean) in i th dimension =

SUMi /N.
SUMi = i th component of SUM.

How do you determine the

nearness of clusters?

Nearest inter-cluster distance (Euclidian

Distance).

When to stop combining

clusters?

3. When min(inter-cluster distance d[i]) for a point

i > Threshold.

Algorithm

Hierarchical
Clustering

1.Begin with the disjoint clustering having level L(0) = 0 and sequence number m
= 0.
2. Find the least dissimilar pair of clusters in the current clustering, say pair (r),
(s), according to
where the minimum is over all pairs of clusters in the current clustering.
d[(r),(s)] = min
d[(i),(j)]

3. Increment the sequence number : m = m +1. Merge clusters (r) and (s) into a
single cluster to form the next clustering m. Set the level of this clustering to
L(m) = d[(r),(s)]

Cont.

Hierarchical
Clustering

4.Update the proximity matrix, D, by deleting the rows and columns corresponding to clusters (r)
and (s) and adding a row and column corresponding to the newly formed cluster.
5.The proximity between the new cluster, denoted (r,s) and old cluster (k) is defined in this way:
d[(k), (r,s)] = min d[(k),(r)], d[(k),
(s)]
6.If all objects are in one cluster, stop. Else, go to step 2.

EXAMPLE: HIERARCHICAL CLUSTERING

(5,3)
o
(1,2)
o
x (1.5,1.5)
x (1,1) o (2,1)
o (0,0)

Data:
o data point
x centroid

o (4,1)

x (4.7,1.3)

x (4.5,0.5)
o (5,0)

Dendrogram

Implementation
//Assigning points to
array
loop: i=0 to n
a[i] = Point(i);

//For The Distance matrix:

loop:i=1 to n
Begin
loop: j=i to n
Begin
calculate distance(i,j);
set d[i,j] = distance(i,j);
set d[j,i] = distance(i,j);
END
calculate min( row([i] );
set n[i] = min( row[i] );
END

//Clustering the points

loop: i=1 to n
Begin
loop j=1 to n
Begin
if( i=j)
continue;
if(n[i]<n[j]-threshhold)
cluster i and j;
calculate centroid;
update d[i,j];
else
Set outlier[i] = a[i];
END
END

Complexities
Complexity of setting the points to array=O(n)
Complexity of calculating d[i,j]=O(nlog(n))
Complexity of clustering=O(n2log(n))

PARTITIONING BASED
Suppose we are given a database of n objects and the partitioning method constructs k
partition of data. Each partition will represent a cluster and k n. It means that it will
classify the data into k groups, which satisfy the following requirements

Each group contains at least one object.

Each object must belong to exactly one group.

FOR A GIVEN NUMBER OF PARTITIONS (SAY K), THE PARTITIONING METHOD

WILL CREATE AN INITIAL PARTITIONING.
THEN IT USES THE ITERATIVE RELOCATION TECHNIQUE TO IMPROVE THE
PARTITIONING BY MOVING OBJECTS FROM ONE GROUP TO OTHER.
GLOBAL OPTIMAL: EXHAUSTIVELY ENUMERATE ALL PARTITIONS.
HEURISTIC METHODS: K-MEANS.

K-MEANS CLUSTERING

KMEANS ALGORITHM(S)
ASSUMES EUCLIDEAN SPACE/DISTANCE
START BY PICKING K, THE NUMBER OF CLUSTERS
INITIALIZE CLUSTERS BY PICKING ONE POINT PER CLUSTER
FOR THE MOMENT, ASSUME WE PICK THE K POINTS AT RANDOM

POPULATING CLUSTERS

1) FOR EACH POINT, PLACE IT IN THE CLUSTER WHOSE CURRENT CENTROID IT IS NEAREST
2) AFTER ALL POINTS ARE ASSIGNED, UPDATE THE LOCATIONS OF CENTROIDS OF THE K
CLUSTERS
3) REASSIGN ALL POINTS TO THEIR CLOSEST CENTROID
SOMETIMES MOVES POINTS BETWEEN CLUSTERS
REPEAT 2 AND 3 UNTIL CONVERGENCE
CONVERGENCE: POINTS DONT MOVE BETWEEN CLUSTERS AND CENTROIDS STABILIZE

EXAMPLE: ASSIGNING CLUSTERS

x
x
x
x
x
x

x data point
centroid

Clusters after round 1

EXAMPLE: ASSIGNING CLUSTERS

x
x
x
x
x
x

x data point
centroid

Clusters after round 2

EXAMPLE: ASSIGNING CLUSTERS

x
x
x
x
x
x

x data point
centroid

Clusters at the end

GETTING THE K RIGHT

HOW TO SELECT K?

Average
distance to
centroid

Best value
of k

k
25

EXAMPLE: PICKING K
Too few;
many long
distances
to centroid.

x
x
x x x x
x xx x
x x x
x x

x
xx x
x x
x x x
x
xx x
x

x x
x x x x
x x x
x

Just right;
distances
rather short.

x
x
x x x x
x xx x
x x x
x x

x
xx x
x x
x x x
x
xx x
x

x x
x x x x
x x x
x

Too many;
little improvement
in average
distance.

x
x
x x x x
x xx x
x x x
x x

x
xx x
x x
x x x
x
xx x
x

x x
x x x x
x x x
x

ADVANTAGES OF K-MEANS

WITH A LARGE NUMBER OF VARIABLES, K-MEANS MAY BE COMPUTATIONALLY FASTER

THAN HIERARCHICAL CLUSTERING
(IF K IS SMALL).
K-MEANS MAY PRODUCE TIGHTER CLUSTERS THAN HIERARCHICAL CLUSTERING,
ESPECIALLY IF THE CLUSTERS ARE GLOBULAR.

COMPLEXITY:
EACH ROUND IS O(KN) FOR N POINTS, K CLUSTER

DISADVANTAGES

DIFFICULTY IN COMPARING QUALITY OF THE CLUSTERS PRODUCED (E.G. FOR

DIFFERENT INITIAL PARTITIONS OR VALUES OF K
AFFECT OUTCOME).
FIXED NUMBER OF CLUSTERS CAN MAKE IT DIFFICULT TO PREDICT WHAT K
SHOULD BE.

WEKA
WAIKATO ENVIRONMENT FOR KNOWLEDGE ANALYSIS (WEKA) IS A
POPULAR SUITE OF MACHINE LEARNING SOFTWARE WRITTEN IN JAVA,
DEVELOPED AT THE UNIVERSITY OF WAIKATO, NEW ZEALAND
IRIS DATA

HIERARCHAL ALGORITHM

K-MEANS

FURTHER WORK

WE WOULD LIKE TO SOLVE THE COMPLEXITY PROBLEM OF HIERARCHAL

ALGORITHM
IMPROVEMENT OF K-MEANS FOR THE NO OF CLUSTER SELECTION NEED TO BE
AUTOMATE FURTHER

CONCLUSION
EVERY ALGORITHM HAS THEIR OWN IMPORTANCE AND WE USE THEM ON THE BEHAVIOR
OF THE DATA

ON THE BASIS OF THIS RESEARCH WE FOUND THAT K-MEANS CLUSTERING ALGORITHM IS

SIMPLEST ALGORITHM AS COMPARED TO OTHER
ALGORITHMS.

WE CANT REQUIRED DEEP KNOWLEDGE OF ALGORITHMS FOR WORKING IN WEKA.

THATS WHY WEKA IS MORE SUITABLE TOOL FOR DATA MINING APPLICATIONS.

REFERENCES
YRM, S. AND KRKKINEN, T. INTRODUCTION TO PARTITIONING-BASED CLUSTERING METHODS WITH A ROBUST EXAMPLE,
REPORTS OF THE DEPT. OF MATH. INF. TECH. (SERIES C. SOFTWARE AND COMPUTATIONAL ENGINEERING), 1/2006, UNIVERSITY
OF JYVSKYL, 2006.
BERKHIN, P. (1998). SURVEY OF CLUSTERING DATA MINING TECHNIQUES. RETRIEVED NOVEMBER 6TH, 2015, WEBSITE:
E.B FAWLKES AND C.L. MALLOWS, A METHOD FOR COMPARING TWO HIERARCHICAL CLUSTERINGS, JOURNAL OF THE
AMERICAN STATISTICAL ASSOCIATION, 78:553584, 1983.
J. BEZDEK AND R. HATHAWAY, NUMERICAL CONVERGENCE AND INTERPRETATION OF THE FUZZY C-SHELLS CLUSTERING
ALGORITHMS, IEEE TRANS. NEURAL NETW., VOL. 3, NO. 5, PP. 787793, SEP. 1992.
M. AND HECKERMAN, D. (FEBRUARY, 1998), AN EXPERIMENTAL COMPARISON OF SEVERAL CLUSTERING AND INITIALIZATION
METHOD, TECHNICAL REPORT MSRTR-98-06, MICROSOFT RESEARCH, REDMOND, WA
MYTHILI S ET AL, INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND MOBILE COMPUTING, VOL.3 ISSUE.1 , JANUARY- 2014,
PG. 334-340
NARENDRA SHARMA, AMAN BAJPAI, MR. RATNESH LITORIYA COMPARISON THE VARIOUS CLUSTERING ALGORITHMS OF WEKA
TOOLS, INTERNATIONAL JOURNAL OF EMERGING TECHNOLOGY AND ADVANCED ENGINEERING, VOLUME 2, ISSUE 5, MAY 2012.
R. DAV, ADAPTIVE FUZZY C-SHELLS CLUSTERING AND DETECTION OF ELLIPSES, IEEE TRANS. NEURAL NETW., VOL. 3, NO. 5, PP.
643662, SEP. 1992.
T. VELMURUGAN AND T. SANTHANAM, 2011. A SURVEY OF PARTITION BASED CLUSTERING ALGORITHMS IN DATA MINING: AN
EXPERIMENTAL APPROACH. INFORMATION TECHNOLOGY JOURNAL, 10: 478-484.
37
WEKA AT HTTP://WWW.CS.WAIKATO.AC.NZ/~ML/WEKA.

Batch B DWM Experiments
No ratings yet
Batch B DWM Experiments
90 pages
Machine Learning With Go
No ratings yet
Machine Learning With Go
9 pages
02 01 KMeans
100% (1)
02 01 KMeans
62 pages
Network Management For Wireless Sensor Netwok
No ratings yet
Network Management For Wireless Sensor Netwok
48 pages
Debesai Gutierrez Koyluoglu
No ratings yet
Debesai Gutierrez Koyluoglu
11 pages
Unit-4
No ratings yet
Unit-4
53 pages
The Application of Artificial Intelligence in Project Management Research: A Review
No ratings yet
The Application of Artificial Intelligence in Project Management Research: A Review
14 pages
2014 - Data Mining & Complex Network Algorithms For Traffic Accident Analysis
No ratings yet
2014 - Data Mining & Complex Network Algorithms For Traffic Accident Analysis
9 pages
Chapter 3-Unsupervised learning_updated
No ratings yet
Chapter 3-Unsupervised learning_updated
54 pages
Clustering Large Data Sets With Mixed Numeric and Categorical Values
No ratings yet
Clustering Large Data Sets With Mixed Numeric and Categorical Values
14 pages
MODULE_5
No ratings yet
MODULE_5
43 pages
MACHINE LEARNING NOTES ANNA UNIVERSITY
No ratings yet
MACHINE LEARNING NOTES ANNA UNIVERSITY
14 pages
ML Unit III.pptx
No ratings yet
ML Unit III.pptx
82 pages
Get Practical Guide to Cluster Analysis in R Unsupervised Machine Learning Alboukadel Kassambara free all chapters
100% (1)
Get Practical Guide to Cluster Analysis in R Unsupervised Machine Learning Alboukadel Kassambara free all chapters
55 pages
Clustering
No ratings yet
Clustering
80 pages
Anomaly Detection RapidMiner
No ratings yet
Anomaly Detection RapidMiner
12 pages
drones-08-00296 (1)
No ratings yet
drones-08-00296 (1)
30 pages
dwm exp6 a49
No ratings yet
dwm exp6 a49
7 pages
2012 Network Anomaly Detection by Cascading K-Means Clustering and C4.5 Decision Tree Algorithm
No ratings yet
2012 Network Anomaly Detection by Cascading K-Means Clustering and C4.5 Decision Tree Algorithm
9 pages
Chapter_6 (2)
No ratings yet
Chapter_6 (2)
54 pages
Data Clustering (Contd) : CS771: Introduction To Machine Learning Piyush Rai
No ratings yet
Data Clustering (Contd) : CS771: Introduction To Machine Learning Piyush Rai
15 pages
Machine Learning Techniques Using Python For Data
No ratings yet
Machine Learning Techniques Using Python For Data
17 pages
Clustering
No ratings yet
Clustering
34 pages
5812d46b-1c39-4a89-ae4b-eec09f93ba4b
No ratings yet
5812d46b-1c39-4a89-ae4b-eec09f93ba4b
66 pages
UNIT-5 PPT
No ratings yet
UNIT-5 PPT
85 pages
Case Study-1: Department of Computer Science and Engineering (7 Semester)
No ratings yet
Case Study-1: Department of Computer Science and Engineering (7 Semester)
16 pages
Week-10
No ratings yet
Week-10
84 pages
AI20- Hierarchical-clustering
No ratings yet
AI20- Hierarchical-clustering
31 pages
Cluster-Guided Contrastive Graph Clustering Network
No ratings yet
Cluster-Guided Contrastive Graph Clustering Network
9 pages
ML Module 4 Unsupervised Learning - Updated
No ratings yet
ML Module 4 Unsupervised Learning - Updated
55 pages
DWM UNIT-5 SEM ANS
No ratings yet
DWM UNIT-5 SEM ANS
8 pages
Cluster
No ratings yet
Cluster
20 pages
8. Clustering
No ratings yet
8. Clustering
80 pages
Clustering-Part1
No ratings yet
Clustering-Part1
79 pages
Handbook of Cluster Analysis: C. Hennig, M. Meila, F. Murtagh, R. Rocci (Eds.)
No ratings yet
Handbook of Cluster Analysis: C. Hennig, M. Meila, F. Murtagh, R. Rocci (Eds.)
28 pages
IT3080 Lecture04 2023
No ratings yet
IT3080 Lecture04 2023
56 pages
TOPIC WISE DSA QUESTIONS
No ratings yet
TOPIC WISE DSA QUESTIONS
15 pages
Unit - 4 DM
No ratings yet
Unit - 4 DM
24 pages
Clustering
No ratings yet
Clustering
28 pages
Chapter 6
No ratings yet
Chapter 6
62 pages
unsupervised_learning_1
No ratings yet
unsupervised_learning_1
40 pages
Clustering
No ratings yet
Clustering
45 pages
DWMModule 4 (1) (1) (1)
No ratings yet
DWMModule 4 (1) (1) (1)
31 pages
APznzaaxpWzYylHJmwXGn2puBz7GP1usZYf9XTi7oqfrrKnFV9DMMfVzPCu6yO0UOnr_XFt1gJv4TE1ITR6850n9k65DydQUgoRlylNdn2acWAu6KNonoO8z7QULN6BlLxY_B-JhKko0tJ3K77woLz26oTaAv1YNcIuMcOSqInmgeCUzpUxjKC9VqnT_lhE7vDyWp_LQQjGTRnamgIC6ya3nlwi7mjjE9EUIiO2sUhjkD6RV
No ratings yet
APznzaaxpWzYylHJmwXGn2puBz7GP1usZYf9XTi7oqfrrKnFV9DMMfVzPCu6yO0UOnr_XFt1gJv4TE1ITR6850n9k65DydQUgoRlylNdn2acWAu6KNonoO8z7QULN6BlLxY_B-JhKko0tJ3K77woLz26oTaAv1YNcIuMcOSqInmgeCUzpUxjKC9VqnT_lhE7vDyWp_LQQjGTRnamgIC6ya3nlwi7mjjE9EUIiO2sUhjkD6RV
38 pages
Clustering
No ratings yet
Clustering
65 pages
MODULE 4 - 5TH SEM (2)
No ratings yet
MODULE 4 - 5TH SEM (2)
23 pages
Unit 5
No ratings yet
Unit 5
10 pages
DSML-ML09. Unsupervised Learning
No ratings yet
DSML-ML09. Unsupervised Learning
69 pages
DM_C6
No ratings yet
DM_C6
37 pages
Final ML
No ratings yet
Final ML
50 pages
6 - Clustering and Applications and Trends in Datamining
No ratings yet
6 - Clustering and Applications and Trends in Datamining
66 pages
DMW Unit-V
No ratings yet
DMW Unit-V
47 pages
UnSupervisedLearning
No ratings yet
UnSupervisedLearning
22 pages
Chp10 Cluster Analysis Basic Concepts and Methods
No ratings yet
Chp10 Cluster Analysis Basic Concepts and Methods
24 pages
19 - Sessionppt - Clusteringalgos
No ratings yet
19 - Sessionppt - Clusteringalgos
36 pages
Kaggle Competitions - How To Win
No ratings yet
Kaggle Competitions - How To Win
74 pages
Clustering
No ratings yet
Clustering
75 pages
RIce Plant Disease Detection Using Different AI Approaches
No ratings yet
RIce Plant Disease Detection Using Different AI Approaches
11 pages
Introduction to Cluster Analysis.
No ratings yet
Introduction to Cluster Analysis.
53 pages
Experiment 3.1 K-Mean
No ratings yet
Experiment 3.1 K-Mean
8 pages
Data Mining - Clustering
No ratings yet
Data Mining - Clustering
90 pages
Module-5-Cluster Analysis-Part1
No ratings yet
Module-5-Cluster Analysis-Part1
24 pages
Clustering
No ratings yet
Clustering
75 pages
8 Clustering
No ratings yet
8 Clustering
53 pages
Unit 4 - Data Warehousing and Mining
No ratings yet
Unit 4 - Data Warehousing and Mining
51 pages
Clustering
No ratings yet
Clustering
110 pages
Clustering
No ratings yet
Clustering
35 pages
pp9 - v4 - Mejorado
No ratings yet
pp9 - v4 - Mejorado
6 pages
Clustering
No ratings yet
Clustering
104 pages
8. Clustering
No ratings yet
8. Clustering
38 pages
Data Mining Unit 3 Cluster Analysis: Types of Clusters
No ratings yet
Data Mining Unit 3 Cluster Analysis: Types of Clusters
11 pages
Unit 4 Descriptive Modeling
No ratings yet
Unit 4 Descriptive Modeling
18 pages
UNIT5
No ratings yet
UNIT5
60 pages
Grouping
No ratings yet
Grouping
98 pages
22AIP3101A Session 9
No ratings yet
22AIP3101A Session 9
38 pages
Lecture 01 - Unsupervised Learning (Optional)
No ratings yet
Lecture 01 - Unsupervised Learning (Optional)
57 pages
Lecture 14 Clustering
0% (1)
Lecture 14 Clustering
57 pages
AI Chapter 3 Part 5
No ratings yet
AI Chapter 3 Part 5
30 pages
ML Unit 4 Notes - NJ
No ratings yet
ML Unit 4 Notes - NJ
15 pages
An Efficient Enhanced K-Means Clustering Algorithm
No ratings yet
An Efficient Enhanced K-Means Clustering Algorithm
8 pages
Clustering
No ratings yet
Clustering
125 pages
Unit 4 Clustering - K-Means and Hierarchical
No ratings yet
Unit 4 Clustering - K-Means and Hierarchical
40 pages
Clustering
0% (1)
Clustering
127 pages
ML Module 4 2022 1 PDF
No ratings yet
ML Module 4 2022 1 PDF
31 pages
Broadly, There Are 3 Types of Machine Learning Algorithms.
No ratings yet
Broadly, There Are 3 Types of Machine Learning Algorithms.
33 pages
UNIT1
No ratings yet
UNIT1
38 pages
Data Mining: Concepts and Techniques: Cluster Analysis
No ratings yet
Data Mining: Concepts and Techniques: Cluster Analysis
97 pages
Optical Flow and Motion Analysis
No ratings yet
Optical Flow and Motion Analysis
12 pages
Cluster
100% (1)
Cluster
72 pages
Linear Algebra Fundamentals
From Everand
Linear Algebra Fundamentals
Kartikeya Dutta
No ratings yet

Survey of Clustering Algorithms

Uploaded by

Survey of Clustering Algorithms

Uploaded by

Survey of Clustering Algorithms

DEBABRAT DAS- R010113014

ANIKET ROY- R010113007

BRIEF INTRODUCTION TO WEKA

PROBLEM STATEMENT AND AIM

CLUSTERING: THE PROCESS OF GROUPING PHYSICAL OR ABSTRACT OBJECTS INTO

CLUSTERING IS THE PROCESS OF GROUPING A SET OF PHYSICAL OR

HIERARCHICAL CLUSTERING ALGORITHM

PARTITIONING BASED CLUSTERING ALGORITHM

Initially each point in the data set is assigned as a cluster.

Then we repeatedly combine two nearest points in a single cluster.

Three important questions:

Centroid- Centroid (mean) in i th dimension =

How do you determine the

Nearest inter-cluster distance (Euclidian

When to stop combining

3. When min(inter-cluster distance d[i]) for a point

EXAMPLE: HIERARCHICAL CLUSTERING

//For The Distance matrix:

//Clustering the points

Each group contains at least one object.

Each object must belong to exactly one group.

FOR A GIVEN NUMBER OF PARTITIONS (SAY K), THE PARTITIONING METHOD

EXAMPLE: ASSIGNING CLUSTERS

Clusters after round 1

EXAMPLE: ASSIGNING CLUSTERS

Clusters after round 2

EXAMPLE: ASSIGNING CLUSTERS

Clusters at the end

GETTING THE K RIGHT

WITH A LARGE NUMBER OF VARIABLES, K-MEANS MAY BE COMPUTATIONALLY FASTER

DIFFICULTY IN COMPARING QUALITY OF THE CLUSTERS PRODUCED (E.G. FOR

WE WOULD LIKE TO SOLVE THE COMPLEXITY PROBLEM OF HIERARCHAL

ON THE BASIS OF THIS RESEARCH WE FOUND THAT K-MEANS CLUSTERING ALGORITHM IS

WE CANT REQUIRED DEEP KNOWLEDGE OF ALGORITHMS FOR WORKING IN WEKA.

You might also like