The document discusses K-means clustering, an unsupervised learning technique where the model works independently to discover patterns in unlabeled data. It clusters data points into K groups based on their distance from initial cluster centers. The example shows 8 points clustered into 3 groups using K-means. It calculates distances from points to initial and new cluster centers over iterations, assigning points to the closest center each time, until cluster assignments stop changing.
K Means Clustering Algorithm in Machine Learning.pdf
K-Means Clustering is an unsupervised learning algorithm that groups an unlabeled
dataset into different clusters. Here K is the pre-defined number of clusters to be
created in the process: if K=2 there will be two clusters, if K=3 there will be three
clusters, and so on.
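Manhattan distance
The distance tables in the solution use the Manhattan distance: for two points (x1, y1) and (x2, y2), it is |x1 - x2| + |y1 - y2|.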
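As a minimal sketch of the distance measure used in this example, the snippet below computes the Manhattan distance between two points. The function name `manhattan` is an illustrative choice, not something taken from the slides.

```python
def manhattan(p, q):
    """Manhattan (city-block) distance between two points of equal dimension."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Point A1(2, 10) against the initial center (5, 8) of Cluster-02
print(manhattan((2, 10), (5, 8)))  # |2 - 5| + |10 - 8| = 5
```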
Problem
Cluster the following eight points (with (x, y) representing locations) into three clusters: A1(2, 10),
A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9). The initial cluster centers are
A1(2, 10), A4(5, 8) and A7(1, 2). Use the K-Means algorithm to find the three cluster centers after
the second iteration.
Solution
Iteration-01

| Given point | Distance from center (2, 10) of Cluster-01 | Distance from center (5, 8) of Cluster-02 | Distance from center (1, 2) of Cluster-03 | Point belongs to cluster |
|---|---|---|---|---|
| A1(2, 10) | 0 | 5 | 9 | C1 |
| A2(2, 5) | 5 | 6 | 4 | C3 |
| A3(8, 4) | 12 | 7 | 9 | C2 |
| A4(5, 8) | 5 | 0 | 10 | C2 |
| A5(7, 5) | 10 | 5 | 9 | C2 |
| A6(6, 4) | 10 | 5 | 7 | C2 |
| A7(1, 2) | 9 | 10 | 0 | C3 |
| A8(4, 9) | 3 | 2 | 10 | C2 |
From here, the new clusters are:
Cluster-01: A1(2, 10)
Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)
Cluster-03: A2(2, 5), A7(1, 2)
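The assignment step of Iteration-01 can be sketched as follows. The helper names `manhattan` and `assign` and the dictionary layout are assumptions made for illustration. On a tie, `min` keeps the first center in dictionary order, which is a choice of this sketch rather than a rule stated in the slides.

```python
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
# Initial centers given in the problem statement
centers = {"C1": (2, 10), "C2": (5, 8), "C3": (1, 2)}

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def assign(points, centers):
    """Map each point name to the label of its nearest center."""
    assignment = {}
    for name, p in points.items():
        # Ties go to the first center in dictionary order (sketch assumption)
        assignment[name] = min(centers, key=lambda c: manhattan(p, centers[c]))
    return assignment

assignment = assign(points, centers)
for name, label in assignment.items():
    distances = [manhattan(points[name], c) for c in centers.values()]
    print(name, distances, label)
```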
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01: We have only one point, A1(2, 10), in Cluster-01. So, the cluster center remains the
same.
For Cluster-02: Center of Cluster-02
= ((8 + 5 + 7 + 6 + 4)/5, (4 + 8 + 5 + 4 + 9)/5)
= (6, 6)
For Cluster-03: Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
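The update step above amounts to a coordinate-wise mean. A minimal sketch, assuming each cluster is a non-empty list of (x, y) tuples (the function name `centroid` is illustrative):

```python
def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of (x, y) points."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)

# Cluster members after Iteration-01
cluster_2 = [(8, 4), (5, 8), (7, 5), (6, 4), (4, 9)]
cluster_3 = [(2, 5), (1, 2)]

print(centroid(cluster_2))  # (6.0, 6.0)
print(centroid(cluster_3))  # (1.5, 3.5)
```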
Iteration-02

| Given point | Distance from center (2, 10) of Cluster-01 | Distance from center (6, 6) of Cluster-02 | Distance from center (1.5, 3.5) of Cluster-03 | Point belongs to cluster |
|---|---|---|---|---|
| A1(2, 10) | 0 | 8 | 7 | C1 |
| A2(2, 5) | 5 | 5 | 2 | C3 |
| A3(8, 4) | 12 | 4 | 7 | C2 |
| A4(5, 8) | 5 | 3 | 8 | C2 |
| A5(7, 5) | 10 | 2 | 7 | C2 |
| A6(6, 4) | 10 | 2 | 5 | C2 |
| A7(1, 2) | 9 | 9 | 2 | C3 |
| A8(4, 9) | 3 | 5 | 8 | C1 |
From here, the new clusters are:
Cluster-01: A1(2, 10), A8(4, 9)
Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4)
Cluster-03: A2(2, 5), A7(1, 2)
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01: Center of Cluster-01
= ((2 + 4)/2, (10 + 9)/2)
= (3, 9.5)
For Cluster-02: Center of Cluster-02
= ((8 + 5 + 7 + 6)/4, (4 + 8 + 5 + 4)/4)
= (6.5, 5.25)
For Cluster-03: Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
This completes Iteration-02.
After the second iteration, the centers of the three clusters are:
• C1(3, 9.5)
• C2(6.5, 5.25)
• C3(1.5, 3.5)
Iteration-03

| Given point | Distance from center (3, 9.5) of Cluster-01 | Distance from center (6.5, 5.25) of Cluster-02 | Distance from center (1.5, 3.5) of Cluster-03 | Point belongs to cluster |
|---|---|---|---|---|
| A1(2, 10) | 1.5 | 9.25 | 7 | C1 |
| A2(2, 5) | 5.5 | 4.75 | 2 | C3 |
| A3(8, 4) | 10.5 | 2.75 | 7 | C2 |
| A4(5, 8) | 3.5 | 4.25 | 8 | C1 |
| A5(7, 5) | 8.5 | 0.75 | 7 | C2 |
| A6(6, 4) | 8.5 | 1.75 | 5 | C2 |
| A7(1, 2) | 9.5 | 8.75 | 2 | C3 |
| A8(4, 9) | 1.5 | 6.25 | 8 | C1 |
From here, the new clusters are:
Cluster-01: A1(2, 10), A4(5, 8), A8(4, 9)
Cluster-02: A3(8, 4), A5(7, 5), A6(6, 4)
Cluster-03: A2(2, 5), A7(1, 2)
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01: Center of Cluster-01
= ((2 + 5 + 4)/3, (10 + 8 + 9)/3)
= (3.67, 9)
For Cluster-02: Center of Cluster-02
= ((8 + 7 + 6)/3, (4 + 5 + 4)/3)
= (7, 4.33)
For Cluster-03: Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
This completes Iteration-03.
After the third iteration, the centers of the three clusters are:
• C1(3.67, 9)
• C2(7, 4.33)
• C3(1.5, 3.5)
Iteration-04

| Given point | Distance from center (3.67, 9) of Cluster-01 | Distance from center (7, 4.33) of Cluster-02 | Distance from center (1.5, 3.5) of Cluster-03 | Point belongs to cluster |
|---|---|---|---|---|
| A1(2, 10) | 2.67 | 10.67 | 7 | C1 |
| A2(2, 5) | 5.67 | 5.67 | 2 | C3 |
| A3(8, 4) | 9.33 | 1.33 | 7 | C2 |
| A4(5, 8) | 2.33 | 5.67 | 8 | C1 |
| A5(7, 5) | 7.33 | 0.67 | 7 | C2 |
| A6(6, 4) | 7.33 | 1.33 | 5 | C2 |
| A7(1, 2) | 9.67 | 8.33 | 2 | C3 |
| A8(4, 9) | 0.33 | 7.67 | 8 | C1 |
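The assignments in Iteration-04 are the same as in Iteration-03, so the procedure has stopped changing. The whole walk-through can be sketched as one loop that alternates the assignment and mean-update steps until the assignments repeat. The function name `kmeans`, the `max_iter` safety cap, and the assumption that no cluster ever becomes empty are choices of this sketch, not part of the slides.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and mean update until assignments stop changing.

    Returns (assignment, centers, iterations). Assumes no cluster becomes
    empty; max_iter is a hypothetical safety cap.
    """
    previous = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest center for every point
        assignment = [
            min(range(len(centers)), key=lambda k: manhattan(p, centers[k]))
            for p in points
        ]
        if assignment == previous:
            return assignment, centers, iteration
        # Update step: each center moves to the mean of its members
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            new_centers.append((
                sum(x for x, _ in members) / len(members),
                sum(y for _, y in members) / len(members),
            ))
        centers = new_centers
        previous = assignment
    return previous, centers, max_iter

# A1..A8 in order, with the initial centers A1, A4 and A7
points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
assignment, centers, iterations = kmeans(points, [(2, 10), (5, 8), (1, 2)])

print(iterations)   # 4
print(assignment)   # [0, 2, 1, 0, 1, 1, 2, 0], i.e. C1, C3, C2, C1, C2, C2, C3, C1
print([(round(x, 2), round(y, 2)) for x, y in centers])
```

Index 0, 1, 2 in the output corresponds to C1, C2, C3, and the rounded centers match the values reached after the third iteration in the tables.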