
Cluster Analysis

Cluster analysis is used to group similar data objects into clusters. It finds patterns between data by analyzing characteristics within the data to group similar objects together. Key aspects of cluster analysis include measuring similarity between objects, determining how to form clusters, and deciding how many clusters to create. Hierarchical clustering is best for small datasets as it computes the distance between all cases. It can use either an agglomerative or divisive method, where agglomerative starts with each case as its own cluster and merges similar clusters, while divisive starts with all cases in one cluster and divides them into individual clusters.

Uploaded by

suhail memon
Copyright
© All Rights Reserved

Cluster Analysis

Cluster:
A set of objects that are similar to each other and
separated from the other objects.
Cluster analysis:
Finding similarities between data according to the
characteristics found in the data and grouping
similar data objects into clusters
Employee Opinion Surveys
Market Research
Factor Analysis: to find patterns within
variables
Cluster Analysis: to find patterns between
individuals
Discriminant Analysis: to look for differences
between groups
How to measure similarity?
How to form clusters?
How many clusters?
Key Terms:
Hierarchical clustering is best for small datasets
because this procedure computes a proximity
matrix of the distance/similarity of every case
with every other case in the dataset. An
agglomerative or divisive method can be used to
cluster cases.
Agglomerative method: It begins with each case
as a cluster by itself and repeatedly merges the
most similar clusters until all cases are in one
cluster.
Divisive method: It begins with every case in
one cluster and repeatedly splits clusters until
each case is a cluster by itself.
Ward's method: Computes the sum of squared
distances within clusters and merges the pair of
clusters that gives the smallest increase in it.
Centroid method: The distance between two
clusters is defined as the distance between their
centroids (cluster averages)
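The agglomerative procedure with the centroid method can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`centroid`, `euclidean`, `within_cluster_ss`, `agglomerate`), the stopping rule (a requested number of clusters), and the sample cases are all assumptions made for the example. `within_cluster_ss` only computes the quantity that Ward's method tracks; the merging here uses centroid distance.

```python
import math

def centroid(cluster):
    """Cluster average: the mean of each variable over the cases."""
    n = len(cluster)
    return [sum(case[k] for case in cluster) / n for k in range(len(cluster[0]))]

def euclidean(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def within_cluster_ss(cluster):
    """Sum of squared distances from each case to the cluster centroid."""
    c = centroid(cluster)
    return sum(euclidean(case, c) ** 2 for case in cluster)

def agglomerate(cases, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    clusters = [[case] for case in cases]  # each case starts as its own cluster
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

# Hypothetical data: five cases measured on two variables.
cases = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.5, 9.0), (1.0, 2.0)]
result = agglomerate(cases, 2)
for cluster in result:
    print(cluster, round(within_cluster_ss(cluster), 3))
```

Recomputing every centroid distance on each pass keeps the sketch short but is slow, which fits the note above that hierarchical clustering is best for small datasets.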
Euclidean Distance:
$$d_{ij} = \left( \sum_{k} (x_{ik} - x_{jk})^{2} \right)^{1/2}$$
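As a small worked example, the Euclidean distance and the proximity matrix used by hierarchical clustering might be computed as follows. Here each case is a tuple of values, one per variable k, and the distance between cases i and j is taken over all variables. The function names and the three sample cases are assumptions chosen for illustration.

```python
import math

def euclidean_distance(x_i, x_j):
    """Square root of the summed squared differences over all variables k."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def proximity_matrix(cases):
    """Distance of every case with every other case in the dataset."""
    n = len(cases)
    return [[euclidean_distance(cases[i], cases[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical data: three cases measured on two variables.
cases = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = proximity_matrix(cases)
for row in matrix:
    print(row)
# [0.0, 5.0, 10.0]
# [5.0, 0.0, 5.0]
# [10.0, 5.0, 0.0]
```

The matrix is symmetric with zeros on the diagonal, and its size grows with the square of the number of cases.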
