K-means Clustering (1)

The document outlines the K-means clustering algorithm, detailing the steps for dataset preparation, training, and plotting results for different values of K. It emphasizes the importance of not using certain libraries and provides instructions for submission along with limitations and guidance on choosing the optimal K based on inertia. The report should include plots for K values of 2, 4, 6, and 7, along with the corresponding inertia values.

Uploaded by

zarifahmed180


Offline 3

4: K-means Clustering
Dataset preparation:
Use dataset ‘dataset.txt’ in the given folder.
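The handout does not describe the layout of 'dataset.txt', so the loader below is only a sketch under an assumption: one sample per line, with numeric values separated by whitespace or commas. The function name `load_dataset` is hypothetical. It uses plain Python lists, in line with the restriction on pandas.

```python
def load_dataset(path):
    """Read one sample per line into a 2D list of floats ("Data")."""
    data = []
    with open(path) as f:
        for line in f:
            # Assumption: fields are separated by whitespace and/or commas.
            fields = line.replace(",", " ").split()
            if fields:  # skip blank lines
                data.append([float(x) for x in fields])
    return data
```

If the real file has a header row or a different delimiter, the parsing line would need to change accordingly.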

Train:
1. K = 4
2. Load dataset into 2D list "Data"
3. Randomly select K different data points from "Data" and store them into 2D list "Centers"
4. Initialize a 2D list named "Clusters" which contains K 1D lists for the K centers
5. for each sample/data point "S" in "Data":
6.     Identify the center "C_i" that is the closest to "S"
7.     Append "S" to the "i"th list of "Clusters"
8. itr = 1, "Shift" = 0
9. while True:
10.     for each 1D list "L" in "Clusters":
11.         Determine the average of the data points in "L". This is the new center of this list.
12.         Update the center of this list in "Centers"
13.     if itr > 1 and "Shift" < 50: break (convergence)
14.     "Shift" = 0
15.     Initialize a 2D list named "Temp_Clusters" which contains K 1D lists for the K centers
16.     for each sample/data point "S" in "Data":
17.         Identify the center "C_i" that is the closest to "S"
18.         Append "S" to the "i"th list of "Temp_Clusters"
19.         if "S" belongs to different clusters in "Clusters" and "Temp_Clusters" then
20.             "Shift" = "Shift" + 1
21.     Now the "Temp_Clusters" 2D list contains K 1D lists
22.     Assign "Temp_Clusters" to "Clusters"
23.     itr = itr + 1
24. "Clusters" will contain your desired clusters and "Centers" will contain your desired centers at the end of the loop
25. Plot them with appropriate colors
26. "inertia" = 0
27. for each 1D list "L" in "Clusters":
28.     "inertia" = "inertia" + sum of the squared distances of the data points of "L" from its center
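Steps 3 to 28 (excluding plotting) can be sketched in plain Python as follows. This is one possible reading of the pseudocode, not a reference solution. Several details are assumptions the handout leaves open: Euclidean distance, a parallel `labels` list to detect which points changed cluster, an empty cluster keeping its previous center, and all function names.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as lists."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(data, centers):
    """Steps 4-7 / 15-18: build K lists of points plus each point's cluster index."""
    clusters = [[] for _ in centers]
    labels = []
    for s in data:
        i = min(range(len(centers)), key=lambda j: squared_distance(s, centers[j]))
        clusters[i].append(s)
        labels.append(i)
    return clusters, labels


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def k_means(data, k, seed, shift_threshold=50):
    rng = random.Random(seed)
    # Step 3: sample K distinct positions in "Data" (duplicate rows in the
    # file could still give identical centers; that case is not handled here).
    centers = [list(p) for p in rng.sample(data, k)]
    clusters, labels = assign(data, centers)
    itr, shift = 1, 0                                   # step 8
    while True:                                         # step 9
        for i, members in enumerate(clusters):          # steps 10-12
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = mean_point(members)
        if itr > 1 and shift < shift_threshold:         # step 13
            break
        temp_clusters, temp_labels = assign(data, centers)
        # Steps 14, 19-20: count the points whose cluster index changed.
        shift = sum(1 for old, new in zip(labels, temp_labels) if old != new)
        clusters, labels = temp_clusters, temp_labels   # step 22
        itr += 1                                        # step 23
    # Steps 26-28: sum of squared distances of every point to its own center.
    inertia = sum(
        squared_distance(s, centers[i])
        for i, members in enumerate(clusters)
        for s in members
    )
    return clusters, centers, inertia
```

Tracking labels avoids searching the nested lists to find a point's previous cluster in step 19. Note that with fewer than 50 samples the threshold in step 13 is always met, so the loop stops after a single reassignment.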
Report:
Plot the data for K = 2, 4, 6, 7 and note down inertia.
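A minimal plotting sketch for the report, assuming two-dimensional samples and that matplotlib is permitted (the handout only names sklearn and pandas as forbidden). The names `COLORS`, `split_xy`, and `plot_clusters` are hypothetical. The function expects the "Clusters" and "Centers" lists and the inertia value produced by the training steps.

```python
# One color per cluster; enough for the largest requested K (7).
COLORS = ["tab:blue", "tab:orange", "tab:green", "tab:red",
          "tab:purple", "tab:brown", "tab:pink"]


def split_xy(points):
    """Split a list of [x, y] points into separate x and y lists."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs, ys


def plot_clusters(clusters, centers, inertia, filename=None):
    """Scatter each cluster in its own color and mark the centers."""
    import matplotlib.pyplot as plt  # imported lazily; assumed to be installed

    fig, ax = plt.subplots()
    for i, members in enumerate(clusters):
        xs, ys = split_xy(members)
        ax.scatter(xs, ys, s=10, color=COLORS[i % len(COLORS)], label=f"cluster {i}")
    cx, cy = split_xy(centers)
    ax.scatter(cx, cy, marker="x", s=100, color="black", label="centers")
    ax.set_title(f"K = {len(centers)}, inertia = {inertia:.2f}")
    ax.legend()
    if filename:
        fig.savefig(filename)  # save to a file for the PDF report
    else:
        plt.show()
```

Calling this once per K in 2, 4, 6, 7 and putting the inertia in the title keeps each plot paired with the value to be noted down.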

Instruction
● Submit a .ipynb file and a report as a .pdf file (use the report template).
● You must follow the given algorithm
● DO NOT USE LIBRARIES SUCH AS "sklearn" (scikit-learn) or "pandas" for this assignment
● Use your student id as seed
● Copying will result in -100% penalty
● Your marks will fully depend on your viva and understanding.
○ Full Algorithm: 16
○ Plotting: 4
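Seeding makes the random choice of initial centers reproducible, so the same notebook gives the same clusters on every run. A small sketch, assuming Python's standard `random` module; `STUDENT_ID` is a placeholder value and `pick_initial_centers` is a hypothetical helper name.

```python
import random

STUDENT_ID = 123456  # placeholder, not a real ID; substitute your own


def pick_initial_centers(data, k, seed):
    """Draw K samples from the data using a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [list(p) for p in rng.sample(data, k)]
```

Using a dedicated `random.Random(seed)` instance rather than the module-level `random.seed` keeps the result unaffected by other code in the notebook that also draws random numbers.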
Resources
k-means clustering

1. Select K random data points as the centers of K clusters
2. Assign each data point to the closest cluster (by calculating its distance from the centers)
3. While True:
4.     Recalculate the center of each cluster (the mean of its data points)
5.     Reassign each data point to the closest cluster
6.     If no data point changes cluster then
7.         break
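The seven steps above can be sketched compactly by storing one cluster index per data point instead of nested lists. This is an illustrative variant under assumptions (Euclidean distance, empty clusters keep their old center, hypothetical function name) that stops only when no point changes cluster, unlike the threshold of 50 used in the assignment's training steps.

```python
import random


def k_means_until_stable(data, k, seed=0):
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]        # step 1

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance.
        return min(range(k),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))

    labels = [nearest(p) for p in data]                     # step 2
    while True:                                             # step 3
        for i in range(k):                                  # step 4
            members = [p for p, lab in zip(data, labels) if lab == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
        new_labels = [nearest(p) for p in data]             # step 5
        if new_labels == labels:                            # step 6
            break                                           # step 7
        labels = new_labels
    return labels, centers


# Tiny usage example: two well-separated pairs of points.
labels, centers = k_means_until_stable([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)
```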

Limitations:
- Need to know K in advance
- Depends on the initial choice of the centers

How to choose K?

- Inertia measures how well a dataset was clustered by K-means. It is calculated by
measuring the distance between each data point and its centroid, squaring this distance,
and summing these squares over all data points in all clusters. Inertia generally
decreases as K grows, so a good model is one with low inertia AND a low number of
clusters (K).
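
Written as a formula, the inertia described above is the following, where the symbols are notation introduced here for illustration: $C_i$ is the set of data points in cluster $i$, $\mu_i$ is its center, and $\lVert \cdot \rVert$ is assumed to be the Euclidean norm.

```latex
\[
\text{inertia} \;=\; \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - \mu_i \rVert^{2},
\qquad
\mu_i \;=\; \frac{1}{|C_i|} \sum_{x \in C_i} x
\]
```

Comparing this value for K = 2, 4, 6, 7 shows where adding more clusters stops reducing inertia by much, which is one common way to settle on K.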
