Clustering on DSS
By:
Nora Alharbi
Enaam Alotaibi
Outline:
• Importance of Cluster Analysis for Data
Mining
• What is Cluster Analysis
• Purposes of Cluster Analysis
• Main Types of Clustering
• Common Types of Clustering
• Cluster Algorithms
• Applications of clustering
• Conclusion
• References
Importance of Cluster Analysis for
Data Mining
• Used for automatic identification of
natural groupings of things
• Part of the machine-learning family
• Employs unsupervised learning
• Learns the clusters of things from past
data, then assigns new instances
• There is no output variable
• Also known as segmentation
What is Cluster Analysis?
Notion of a Cluster can be Ambiguous
QUESTION ??
How many clusters?
What is Cluster Analysis?
• Finding groups of objects such that the
objects in a group will be similar (or related)
to one another and different from (or
unrelated to) the objects in other groups
Purposes of Cluster Analysis
Understanding
Group related documents for
browsing, group genes and
proteins that have similar
functionality, or group stocks
with similar price fluctuations
Summarization
Reduce the size of large data
sets
Clustering precipitation in Australia
What is not Cluster Analysis?
Supervised classification
Have class label information
Simple segmentation
Dividing students into different registration groups
alphabetically, by last name
Results of a query
Groupings are a result of an external specification
Graph partitioning
Some mutual relevance and synergy, but areas are not
identical
Common Types of Clusters
• Well-separated clusters
• Center-based clusters
• Contiguous clusters
• Density-based clusters
• Property or Conceptual
• Described by an Objective Function
Types of Clusters: Well-Separated
• Well-Separated Clusters:
– A cluster is a set of points such that any point in a cluster is closer
(or more similar) to every other point in the cluster than to any
point not in the cluster.
3 well-separated clusters
Types of Clusters: Center-Based
• Center-based (prototype-based)
– A cluster is a set of objects such that an object in a cluster is
closer (more similar) to the “center” of a cluster, than to the
center of any other cluster
– The center of a cluster is often a centroid, the average of all the
points in the cluster, or a medoid, the most “representative”
point of a cluster
4 center-based clusters
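The difference between a centroid and a medoid can be shown with a small sketch. The points below are made up for illustration, and Euclidean distance is an assumption; the slides do not fix a distance measure.

```python
import math

def centroid(points):
    """Coordinate-wise average of the points (need not be a data point)."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def medoid(points):
    """The data point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

# Made-up cluster: four nearby points plus one far-away point.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (9.0, 9.0)]
print(centroid(cluster))  # (3.0, 3.0), pulled toward the far point
print(medoid(cluster))    # (2.0, 2.0), always an actual member of the cluster
```

The centroid is dragged toward the outlying point, while the medoid stays on a real data point, which is one reason medoids are sometimes preferred when outliers are present.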
Types of Clusters: Contiguity-Based
• Contiguous Cluster (Graph-based)
– A cluster is a set of points such that a point in a cluster is closer (or
more similar) to one or more other points in the cluster than to any
point not in the cluster. Cluster can be defined as a connected
component i.e. a group of objects that are connected to one
another, but that have no connection to objects outside the group.
8 contiguous clusters
Types of Clusters: Density-Based
• Density-based
– A cluster is a dense region of points, which is separated by low-density
regions, from other regions of high density.
– Used when the clusters are irregular or intertwined, and when noise
and outliers are present.
6 density-based clusters
Types of Clusters: Conceptual Clusters
• Shared Property or Conceptual Clusters
– Finds clusters that share some common property or represent a
particular concept.
2 Overlapping Circles
Types of Clusters: Objective Function
• Clusters Defined by an Objective Function
– Finds clusters that minimize or maximize an objective function.
– Enumerate all possible ways of dividing the points into clusters and
evaluate the “goodness” of each potential set of clusters by using the
given objective function.
– Can have global or local objectives.
• Hierarchical clustering algorithms typically have local objectives
• Partitional algorithms typically have global objectives
– A variation of the global objective function approach is to fit the data to a
parameterized model.
• Parameters for the model are determined from the data.
• Mixture models assume that the data is a “mixture” of a number of statistical
distributions.
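In symbols, a mixture model is commonly written as a weighted sum of component densities. The notation here ($K$ components, mixing weights $\pi_k$, component parameters $\theta_k$) is assumed for illustration and does not come from the slides.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, p(x \mid \theta_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```

Fitting the model means estimating the $\pi_k$ and $\theta_k$ from the data; each component then plays the role of one cluster.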
Types of Clusters: Objective Function (cont.)
– Map the clustering problem to a different domain
and solve a related problem in that domain
– The proximity matrix defines a weighted graph, where
the nodes are the points being clustered, and the
weighted edges represent the proximities between
points
– Clustering is equivalent to breaking the graph into
connected components, one for each cluster
– We want to minimize the edge weight between clusters
and maximize the edge weight within clusters
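A minimal sketch of the graph view, under simplifying assumptions: edges are kept only between points closer than a hypothetical threshold `max_dist`, and each connected component of the remaining graph becomes one cluster. This is the simplest way to break the graph apart; it does not optimize the edge-weight objective directly.

```python
import math

def threshold_components(points, max_dist):
    """Link points at most max_dist apart; label each connected component."""
    n = len(points)
    # Adjacency lists from the (Euclidean) proximity matrix, thresholded.
    neighbors = [[j for j in range(n)
                  if j != i and math.dist(points[i], points[j]) <= max_dist]
                 for i in range(n)]
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:                      # depth-first walk of one component
            i = stack.pop()
            for j in neighbors[i]:
                if labels[j] is None:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Made-up points: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 1), (5, 5), (5, 6), (10, 0)]
labels = threshold_components(points, max_dist=1.5)
print(labels)  # [0, 0, 0, 1, 1, 2]
```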
• A clustering is a set of clusters
• Important distinction between
hierarchical and partitional sets of
clusters
What is Clustering?
Main Types of Cluster Methods
• Hierarchical clustering methods can be either agglomerative or divisive.
• An agglomerative method starts with each point as a separate cluster,
and successively performs merging until a stopping criterion is met.
• A divisive method begins with all points in a single cluster and
performs splitting until a stopping criterion is met.
• The result of a hierarchical clustering method is a tree of clusters called
a dendrogram.
• A tree-like diagram that records the sequences of merges or splits
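[Figure: dendrogram with leaves ordered 1, 3, 2, 5, 4, 6 and a height axis from 0 to 0.2, shown next to the corresponding nested clusters of points 1 to 6]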
Hierarchical clustering
• Agglomerative (Bottom up)
• 1st to 5th iterations
[Figure sequence: one more numbered group appears in each iteration, from group 1 in the 1st iteration to groups 1 to 5 in the 5th]
• Finally k clusters left
[Figure: final frame showing groups numbered 1 to 9]
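The bottom-up merging can be sketched as follows. This is a minimal illustration on made-up 1-D values, assuming single-link distance (closest pair of members) and "k clusters left" as the stopping criterion; the slides do not say which linkage the figures use.

```python
def single_link_agglomerative(values, k):
    """Merge the two closest clusters until only k clusters remain (1-D data)."""
    clusters = [[v] for v in values]      # every point starts as its own cluster
    merges = []                           # merge history, as a dendrogram would record
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single link: distance between the closest pair of members.
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters], merges

clusters, merges = single_link_agglomerative([1, 2, 6, 7, 15], k=2)
print(clusters)  # [[1, 2, 6, 7], [15]]
for left, right, dist in merges:
    print(left, "+", right, "at distance", dist)
```

Each pass scans all cluster pairs, so the sketch is only practical for small inputs; it is meant to show the merge order, not to be efficient.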
Hierarchical clustering
• Divisive (Top-down)
Slide credit: Min Zhang
Hierarchical clustering
An example of hierarchical clustering methods is BIRCH (Balanced
Iterative Reducing and Clustering using Hierarchies)
Advantage: it typically finds a good clustering with a single scan of the data,
and can improve the quality further with a few additional scans.
Disadvantages: each node in its tree can hold only a limited number of entries due to its size,
so a node does not always correspond to what a user may consider a natural cluster.
Type of Cluster Method CONT.
• Partitional clustering methods:
• Determine a partition of the points into clusters, such that the
points in a cluster are more similar to each other than to points in
different clusters.
• They start with some arbitrary initial clusters and iteratively reassign points until
a stopping criterion is met.
Examples of partitional clustering algorithms include k-means, PAM,
CLARA, CLARANS and EM.
Disadvantages:
• They assume that the points to be clustered are all stored in main
memory.
• The run time of the algorithm is prohibitive on large datasets.
Partitional Clustering
Original Points A Partitional Clustering
Type of Cluster Method CONT.
• Density-based clustering methods try to find clusters based on the density of
points in regions. Dense regions that are reachable from each other are merged to
form clusters.
Examples of density-based clustering methods include DBSCAN, DBRS and DBRS+.
Advantages: They give very good results and are efficient on many datasets.
Disadvantages:
• If a dataset has clusters of widely varying densities, they cannot handle
it efficiently.
• They do not consider non-spatial attributes in the dataset.
• They are not suitable for finding approximate clusters in very large datasets.
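The idea of merging dense regions that are reachable from each other can be sketched with a simplified, quadratic-time version of DBSCAN. The parameter names `eps` and `min_pts`, their values, and the sample points are assumptions for illustration only.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch: one label per point, NOISE (-1) for outliers."""
    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE             # may later be claimed as a border point
            continue
        cluster += 1                      # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster       # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = region(j)
            if len(nbrs) >= min_pts:      # j is also core: keep expanding
                queue.extend(nbrs)
    return labels

# Made-up data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

A single `eps` is applied everywhere, which illustrates why clusters of widely varying densities are hard for this family of methods.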
Type of Cluster Method CONT.
Grid-based clustering methods quantize the clustering space into a finite number of
cells and then perform the required operations on the quantized space. Cells
containing more than a certain number of points are considered to be dense.
Contiguous dense cells are connected to form clusters.
Examples of grid-based clustering methods include CLIQUE and STING.
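A minimal sketch of the grid idea, not of CLIQUE or STING themselves: 2-D points are quantized into square cells, cells with more than `min_count` points are treated as dense, and dense cells that share an edge are joined. The cell size, threshold, 4-neighbour notion of contiguity and sample points are all assumptions.

```python
from collections import Counter

def grid_clusters(points, cell_size, min_count):
    """Quantize 2-D points into cells, keep dense cells, join contiguous ones."""
    # Count points per cell; a cell is identified by integer grid coordinates.
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    dense = {cell for cell, c in counts.items() if c > min_count}
    labels = {}
    current = 0
    for cell in sorted(dense):
        if cell in labels:
            continue
        labels[cell] = current
        stack = [cell]
        while stack:
            cx, cy = stack.pop()
            # Contiguous here means sharing an edge (4-neighbourhood).
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in dense and nb not in labels:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels                         # maps dense cell -> cluster id

points = [(0.1, 0.1), (0.2, 0.8), (0.9, 0.5),      # cell (0, 0)
          (1.1, 0.2), (1.5, 0.5), (1.9, 0.9),      # cell (1, 0), next to (0, 0)
          (5.2, 5.2), (5.5, 5.5), (5.8, 5.1),      # cell (5, 5)
          (3.5, 3.5)]                              # sparse cell, ignored
labels = grid_clusters(points, cell_size=1.0, min_count=2)
print(labels)  # {(0, 0): 0, (1, 0): 0, (5, 5): 1}
```

After the counting pass, all work is done on cells rather than points, which is where grid-based methods get their speed.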
Clustering Algorithms
K-means and its variants
Hierarchical clustering
Density-based clustering
K-Means
• Initial set of clusters randomly chosen.
• Iteratively, items are moved among sets of
clusters until the desired set is reached.
• High degree of similarity among elements
in a cluster is obtained.
• Given a cluster $K_i = \{t_{i1}, t_{i2}, \dots, t_{im}\}$, the cluster
mean is $m_i = \frac{1}{m}(t_{i1} + \dots + t_{im})$
K-Means Example
• Given: {2,4,10,12,3,20,30,11,25}, k=2
1- Randomly assign means: m1=3, m2=4
2- K1={2,3}, K2={4,10,12,20,30,11,25}, m1=2.5, m2=16
3- K1={2,3,4}, K2={10,12,20,30,11,25}, m1=3, m2=18
4- K1={2,3,4,10}, K2={12,20,30,11,25}, m1=4.75, m2=19.6
5- K1={2,3,4,10,11,12}, K2={20,30,25}, m1=7, m2=25
• Stop, since reassigning the points with these means leaves the clusters unchanged.
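The worked example can be reproduced with a short sketch of K-means on 1-D data. The function name and the `max_iter` safeguard are illustrative choices; the data and the initial means (3 and 4) are the ones from the example.

```python
def kmeans_1d(values, means, max_iter=100):
    """Plain K-means on 1-D data, starting from the given initial means."""
    clusters = None
    for _ in range(max_iter):
        # Assignment step: each value goes to the cluster with the nearest mean.
        new_clusters = [[] for _ in means]
        for v in values:
            nearest = min(range(len(means)), key=lambda i: abs(v - means[i]))
            new_clusters[nearest].append(v)
        if new_clusters == clusters:      # no value changed cluster: converged
            break
        clusters = new_clusters
        # Update step: each mean becomes the average of its cluster.
        means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
    return clusters, means

clusters, means = kmeans_1d([2, 4, 10, 12, 3, 20, 30, 11, 25], [3, 4])
print([sorted(c) for c in clusters])  # [[2, 3, 4, 10, 11, 12], [20, 25, 30]]
print(means)                          # [7.0, 25.0]
```

Different initial means can lead to a different final partition, since K-means only converges to a local optimum.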
K-Means Algorithm
Two different K-means Clusterings
[Figure: three scatter plots of the same data, axes x (-2 to 2) and y (0 to 3): Original Points, Optimal Clustering, Sub-optimal Clustering]
Which applications
use clustering?
Some applications of clustering
• Clustering in EBMT (Example-Based Machine
Translation)
• Clustering in Online shopping mall
• Clustering in Spatial database
Clustering in EBMT
• EBMT is a method of performing machine translation by imitating
translation examples of similar sentences.
• A large number of bi/multi-lingual translation examples (Greek-English)
are stored in a textual database.
• Input expressions are rendered in the target language by
retrieving from the database the example that is most similar to
the input.
But how can retrieval of the example that best matches the
input be made more efficient?
By using clustering.
Clustering in EBMT
• The k-means algorithm was applied.
• It enables the application to limit the search space
and locate the best available match in the database.
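How clustering limits the search space can be sketched as a two-stage lookup: find the nearest cluster centre first, then compare the input only with the examples in that cluster. The 2-D vectors, the example ids and the use of Euclidean distance are hypothetical stand-ins; the slides do not say how sentences are represented or compared.

```python
import math

def nearest(vec, candidates):
    """Index of the candidate vector closest to vec (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(vec, candidates[i]))

def clustered_lookup(query, centroids, clusters):
    """Search only the cluster whose centroid is nearest to the query."""
    c = nearest(query, centroids)
    members = clusters[c]                 # list of (example_id, vector) pairs
    best = nearest(query, [vec for _, vec in members])
    return members[best][0], len(members)

# Hypothetical feature vectors standing in for stored translation examples.
centroids = [(1.0, 1.0), (9.0, 9.0)]
clusters = [
    [("ex1", (0.5, 1.0)), ("ex2", (1.5, 1.2))],
    [("ex3", (8.5, 9.0)), ("ex4", (9.5, 9.5)), ("ex5", (9.0, 8.0))],
]
match, compared = clustered_lookup((9.4, 9.3), centroids, clusters)
print(match, compared)  # ex4 3
```

Only 3 of the 5 stored examples are compared with the query. The trade-off is that the true best match could sit in a neighbouring cluster and be missed.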
Clustering in Online shopping mall
• It combines the characteristics of Web 2.0 with a Decision
Support System for an online shopping mall.
• It lets customers share profiles and comments.
• The goal is to show customers the products they
may be interested in, according to the information they
provide.
• The framework consists of the search engine as an external
layer and the recommendation system as an internal layer.
• It uses K-means to cluster the products in the internal layer.
Clustering in Online shopping mall
Fig2: DSS framework for online shopping mall
Clustering in Online shopping mall
Fig3: Products clustering
Clustering in Spatial database
• Spatial databases hold large amounts of data obtained from
satellites for applications such as geo-marketing, traffic control,
and environmental studies.
• Many physical obstacles exist (rivers, lakes, highways),
and their presence may affect the result of clustering in spatial
databases.
Clustering in Spatial database
Fig4: map of western population of Canada with Highways and Rivers
Clustering in Spatial database
• However, most clustering algorithms cannot deal with obstacles, like
DBRS and DBRS+ (Density-Based Clustering with Random Sampling).
• Because it cannot efficiently handle clusters of widely varying densities in a
dataset.
• So a new clustering algorithm, MMO (mathematics morphology based
algorithm of obstacles clustering), is proposed for the problem of
clustering in the presence of obstacles.
Clustering in Spatial database
• The main contribution is two new operators (Connector and
Obstor) that are more accurate than the ordinary
operators (open and close).
Clustering in Spatial database
• Result: The performance tests show that MMO is
suitable for large data sets and more effective than
other algorithms.
Fig5: runtime of MMO with DBRS+
Conclusion
The main purpose of data mining is to provide decision support to
managers and business professionals through knowledge
discovery
- Analyzes vast stores of historical business data
- Tries to discover patterns, trends, and correlations hidden in the
data that can help a company improve its business performance
- Uses regression, decision trees, neural networks, cluster analysis, or
market basket analysis
- Cluster analysis is a common method in data mining tools
- Cluster methods and their algorithms are effective methods
used by many applications these days, and cluster methods are a good research area for
DSS fields.
References
1. Cranias, L., Papageorgiou, H., & Piperidis, S. (1994). Clustering: A
Technique for Search Space Reduction in Example-Based Machine
Translation. IEEE.
2. Pattabiraman, V. (2012). Optimization of spatial join using constraints based-clustering
techniques. Engineering and Computer Innovations.
3. Wang, X., Rostoker, C., & Hamilton, H. J. (2004). Density-Based Spatial Clustering
in the Presence of Obstacles and Facilitators.
4. Wang, X., & Hamilton, H. (n.d.). Using Clustering Methods in Geospatial
Information Systems.
5. Wenxing, H., Weng, Y., Xie, L., & Maoqing, L. (2009). Design and Implementation
of Web-based DSS for Online Shopping Mall. IEEE.
6. Zhang, Q. (2008). A Mathematics Morphology Based Algorithm of Obstacles
Clustering. International Conference on Computer Science and Software
Engineering.
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 

Clustering on DSS

  • 1. Clustering on DSS By: Nora Alharbi and Enaam Alotaibi
  • 2. Outline: • Importance of Cluster Analysis for Data Mining • What is Cluster Analysis • Purposes of Cluster Analysis • Main Types of Clustering • Common Types of Clustering • Cluster Algorithms • Applications of clustering • Conclusion • References
  • 3. Importance of Cluster Analysis for Data Mining • Used for automatic identification of natural groupings of things • Part of the machine-learning family • Employs unsupervised learning • Learns the clusters of things from past data, then assigns new instances • There is no output variable • Also known as segmentation
  • 4. What is Cluster Analysis?
  • 5. The notion of a cluster can be ambiguous. Question: how many clusters?
  • 6. What is Cluster Analysis? • Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
  • 7. Purposes of Cluster Analysis • Understanding: group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations • Summarization: reduce the size of large data sets. Figure: Clustering precipitation in Australia
  • 8. What is not Cluster Analysis? • Supervised classification: has class label information • Simple segmentation: dividing students into different registration groups alphabetically, by last name • Results of a query: groupings are the result of an external specification • Graph partitioning: some mutual relevance and synergy, but the areas are not identical
  • 9. Common Types of Clusters • Well-separated clusters • Center-based clusters • Contiguous clusters • Density-based clusters • Property or Conceptual • Described by an Objective Function
  • 10. Types of Clusters: Well-Separated • Well-Separated Clusters: – A cluster is a set of points such that any point in a cluster is closer (or more similar) to every other point in the cluster than to any point not in the cluster. Figure: 3 well-separated clusters
  • 11. Types of Clusters: Center-Based • Center-based (prototype-based) – A cluster is a set of objects such that an object in a cluster is closer (more similar) to the “center” of its cluster than to the center of any other cluster – The center of a cluster is often a centroid, the average of all the points in the cluster, or a medoid, the most “representative” point of a cluster. Figure: 4 center-based clusters
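The difference between the two kinds of center can be seen in a small sketch. The sample points and the use of Euclidean distance are assumptions made for illustration only; the slides do not specify a distance measure.

```python
from math import dist

def centroid(points):
    """Coordinate-wise mean of the points; it need not be an actual data point."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def medoid(points):
    """The data point with the smallest total distance to all the points."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

# Hypothetical cluster with one far-away point.
cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0), (10.0, 10.0)]
print(centroid(cluster))  # pulled towards the far-away point
print(medoid(cluster))    # always one of the original points
```

In this toy cluster the centroid is dragged towards the distant point, while the medoid stays on a real observation, which is one reason medoid-based methods are sometimes preferred when outliers are present.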
  • 12. Types of Clusters: Contiguity-Based • Contiguous Cluster (Graph-based) – A cluster is a set of points such that a point in a cluster is closer (or more similar) to one or more other points in the cluster than to any point not in the cluster. – A cluster can be defined as a connected component, i.e. a group of objects that are connected to one another but have no connection to objects outside the group. Figure: 8 contiguous clusters
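A minimal sketch of the connected-component view follows. It assumes two points are "connected" when their Euclidean distance is at most a threshold `eps`; that threshold, the function name and the sample points are illustrative choices, not something the slides define.

```python
from math import dist

def contiguous_clusters(points, eps):
    """Group point indices into connected components of the 'within eps' graph."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # every unvisited point within eps of point i joins the same component
            neighbours = [j for j in unvisited if dist(points[i], points[j]) <= eps]
            for j in neighbours:
                unvisited.remove(j)
            component.extend(neighbours)
            frontier.extend(neighbours)
        clusters.append(sorted(component))
    return sorted(clusters)

# Hypothetical points: a chain of three and a separate pair.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
print(contiguous_clusters(points, eps=1.5))
```

Points 0 and 2 are farther apart than `eps`, yet they land in the same cluster because point 1 links them, which is exactly what distinguishes contiguity-based clusters from well-separated ones.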
  • 13. Types of Clusters: Density-Based • Density-based – A cluster is a dense region of points that is separated from other regions of high density by low-density regions. – Used when the clusters are irregular or intertwined, and when noise and outliers are present. Figure: 6 density-based clusters
  • 14. Types of Clusters: Conceptual Clusters • Shared Property or Conceptual Clusters – Finds clusters that share some common property or represent a particular concept. Figure: 2 overlapping circles
  • 15. Types of Clusters: Objective Function • Clusters Defined by an Objective Function – Finds clusters that minimize or maximize an objective function. – Enumerate all possible ways of dividing the points into clusters and evaluate the "goodness" of each potential set of clusters by using the given objective function. – Can have global or local objectives. • Hierarchical clustering algorithms typically have local objectives • Partitional algorithms typically have global objectives – A variation of the global objective function approach is to fit the data to a parameterized model. • Parameters for the model are determined from the data. • Mixture models assume that the data is a "mixture" of a number of statistical distributions.
  • 16. Types of Clusters: Objective Function (cont.) – Map the clustering problem to a different domain and solve a related problem in that domain. – The proximity matrix defines a weighted graph, where the nodes are the points being clustered and the weighted edges represent the proximities between points. – Clustering is equivalent to breaking the graph into connected components, one for each cluster. – We want to minimize the edge weight between clusters and maximize the edge weight within clusters.
  • 17. What is Clustering? • A clustering is a set of clusters • Important distinction between hierarchical and partitional sets of clusters
  • 18. Main Types of Clustering Methods • Hierarchical clustering methods can be either agglomerative or divisive. • An agglomerative method starts with each point as a separate cluster and successively merges clusters until a stopping criterion is met. • A divisive method begins with all points in a single cluster and splits clusters until a stopping criterion is met. • The result of a hierarchical clustering method is a tree of clusters called a dendrogram: a tree-like diagram that records the sequence of merges or splits. Figure: example dendrogram
  • 20. Hierarchical clustering • Agglomerative (Bottom up) • 1st iteration
  • 21. Hierarchical clustering • Agglomerative (Bottom up) • 2nd iteration
  • 22. Hierarchical clustering • Agglomerative (Bottom up) • 3rd iteration
  • 23. Hierarchical clustering • Agglomerative (Bottom up) • 4th iteration
  • 24. Hierarchical clustering • Agglomerative (Bottom up) • 5th iteration
  • 25. Hierarchical clustering • Agglomerative (Bottom up) • Finally k clusters left
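The bottom-up iterations can be sketched in a few lines. This is only an illustration under stated assumptions: one-dimensional values, single-link distance (closest pair of members) as the merge criterion, and "k clusters left" as the stopping criterion. The slides do not say which linkage the pictured example uses, and the sample values are hypothetical.

```python
def agglomerative(values, k):
    """Merge the two closest clusters (single link) until k clusters remain."""
    clusters = [[v] for v in values]  # every point starts as its own cluster
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single-link distance: closest pair of members across the two clusters
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # one merge per iteration
        del clusters[b]
    return [sorted(c) for c in clusters]

values = [1, 2, 6, 7, 15]  # hypothetical data
print(agglomerative(values, k=3))
print(agglomerative(values, k=2))
```

Recording the order of the merges, rather than only the final clusters, is what would yield the dendrogram.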
  • 26. Hierarchical clustering • Divisive (Top-down) Slide credit: Min Zhang
  • 27. Hierarchical clustering • An example of a hierarchical clustering method is BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies). • Advantage: it typically finds a good clustering with a single scan of the data and improves the quality further with a few additional scans. • Disadvantages: it can hold only a limited number of entries due to its size, and it does not always correspond to what a user may consider a natural cluster.
  • 28. Types of Clustering Methods (cont.) • Partitional clustering methods determine a partition of the points into clusters, such that the points in a cluster are more similar to each other than to points in different clusters. • They start with some arbitrary initial clusters and iteratively reallocate points to clusters until a stopping criterion is met. • Examples of partitional clustering algorithms include k-means, PAM, CLARA, CLARANS and EM. Disadvantages: • They assume that the points to be clustered are all stored in main memory. • The run time of the algorithms is prohibitive on large datasets.
  • 29. Partitional Clustering (Figure panels: Original Points; A Partitional Clustering)
  • 30. Types of Clustering Methods (cont.) • Density-based clustering methods try to find clusters based on the density of points in regions. Dense regions that are reachable from each other are merged to form clusters. • Examples of density-based clustering methods include DBSCAN, DBRS and DBRS+. • Advantages: they give very good results and are efficient on many datasets. Disadvantages: • If a dataset has clusters of widely varying densities, they cannot handle it efficiently. • They do not consider non-spatial attributes in the dataset. • They are not suitable for finding approximate clusters in very large datasets.
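The idea of merging dense regions that are reachable from each other can be sketched with a simplified DBSCAN-style procedure. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbourhood size for a dense point), the brute-force neighbour search and the sample points are assumptions for illustration; this is not a tuned implementation of DBSCAN, DBRS or DBRS+.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def region(i):
        # indices of all points within eps of point i (including i itself)
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reachable from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:  # j is dense too: keep expanding
                queue.extend(j_neighbours)
        cluster_id += 1
    return labels

# Hypothetical data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

Because a single `eps` and `min_pts` apply to the whole dataset, a sketch like this also makes the first disadvantage above concrete: clusters of very different densities cannot all be captured by one setting.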
  • 31. Types of Clustering Methods (cont.) • Grid-based clustering methods quantize the clustering space into a finite number of cells and then perform the required operations on the quantized space. • Cells containing more than a certain number of points are considered to be dense. • Contiguous dense cells are connected to form clusters. • Examples of grid-based clustering methods include CLIQUE and STING.
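The three grid-based steps (quantize, mark dense cells, connect contiguous dense cells) can be sketched as follows. The sketch assumes 2-D points, square cells of side `cell_size`, and edge-sharing cells as "contiguous"; these choices and the sample data are illustrative and are not taken from CLIQUE or STING.

```python
from collections import Counter

def grid_clusters(points, cell_size, min_count):
    """Quantize 2-D points into square cells, keep dense cells, join adjacent ones."""
    # step 1: quantize each point to the integer coordinates of its cell
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    # step 2: a cell is dense if it holds more than min_count points
    dense = {cell for cell, n in counts.items() if n > min_count}
    # step 3: connect dense cells that share an edge
    clusters = []
    while dense:
        seed = dense.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            cx, cy = frontier.pop()
            for cell in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if cell in dense:
                    dense.remove(cell)
                    component.add(cell)
                    frontier.append(cell)
        clusters.append(sorted(component))
    return sorted(clusters)

# Hypothetical points: two adjacent dense cells, one separate dense cell, one sparse cell.
points = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.2),
          (1.1, 0.3), (1.5, 0.5), (1.8, 0.9),
          (5.2, 5.2), (5.5, 5.5), (5.7, 5.1),
          (3.5, 3.5)]
print(grid_clusters(points, cell_size=1.0, min_count=2))
```

The result is a list of clusters expressed as cells rather than points, reflecting that the work after quantization depends on the number of cells, not on the number of points.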
  • 32. Clustering Algorithms • K-means and its variants • Hierarchical clustering • Density-based clustering
  • 33. K-Means • The initial set of clusters is chosen randomly. • Iteratively, items are moved among the clusters until the desired set is reached. • A high degree of similarity among elements in a cluster is obtained. • Given a cluster $K_i = \{t_{i1}, t_{i2}, \dots, t_{im}\}$, the cluster mean is $m_i = \frac{1}{m}(t_{i1} + \dots + t_{im})$
  • 34. K-Means Example • Given: {2,4,10,12,3,20,30,11,25}, k=2 • 1. Randomly assign means: m1=3, m2=4 • 2. K1={2,3}, K2={4,10,12,20,30,11,25}, m1=2.5, m2=16 • 3. K1={2,3,4}, K2={10,12,20,30,11,25}, m1=3, m2=18 • 4. K1={2,3,4,10}, K2={12,20,30,11,25}, m1=4.75, m2=19.6 • 5. K1={2,3,4,10,11,12}, K2={20,30,25}, m1=7, m2=25 • Stop, because reassigning the points with these means leaves the clusters unchanged.
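The worked example can be replayed with a short sketch. It uses the data and the initial means m1=3, m2=4 from the slide; the function name, the tie-breaking rule (ties go to the lower-indexed cluster) and the assumption that no cluster ever becomes empty are choices made for this illustration.

```python
def kmeans_1d(values, means):
    """Repeat assign/update steps until the cluster means stop changing."""
    history = []
    while True:
        clusters = [[] for _ in means]
        for v in values:
            # assign v to the nearest mean (ties go to the lower-indexed cluster)
            nearest = min(range(len(means)), key=lambda i: abs(v - means[i]))
            clusters[nearest].append(v)
        # assumes every cluster is non-empty, which holds for this data
        new_means = [sum(c) / len(c) for c in clusters]
        history.append((clusters, new_means))
        if new_means == means:
            return history
        means = new_means

data = [2, 4, 10, 12, 3, 20, 30, 11, 25]
for clusters, means in kmeans_1d(data, [3, 4]):
    print([sorted(c) for c in clusters], means)
```

Each printed line corresponds to one step of the slide; the final pass reproduces the previous clusters and means, which is the stopping condition.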
  • 36. Two different K-means Clusterings (Figure panels: Original Points; Sub-optimal Clustering; Optimal Clustering; axes x and y)
  • 37. What are applications that use Clustering?
  • 38. Some applications of clustering • Clustering in EBMT (Example-Based Machine Translation) • Clustering in Online shopping mall • Clustering in Spatial database
  • 39. Clustering in EBMT • EBMT is a method of performing machine translation by imitating translation examples of similar sentences. • A large number of bi/multi-lingual translation examples (Greek-English) is stored in a textual database. • Input expressions are rendered in the target language by retrieving from the database the example that is most similar to the input. • But how can retrieval of the example that best matches the input be made more efficient? By using clustering.
  • 40. Clustering in EBMT • It applies the k-means algorithm. • This enables the application to limit the search space and locate the best available match in the database.
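How clustering limits the search space can be illustrated with a toy sketch. Everything concrete here is hypothetical: the slides do not say how sentences are represented or compared, so the sketch assumes each stored example already has a numeric feature vector, that clusters and centroids were computed beforehand (for instance with k-means), and that Euclidean distance measures similarity.

```python
from math import dist

def nearest_example(query, clusters):
    """clusters: list of (centroid, examples); each example is a (vector, text) pair."""
    # step 1: compare the query with the centroids only and pick the closest cluster
    _, examples = min(clusters, key=lambda c: dist(query, c[0]))
    # step 2: exhaustive search, but only inside that one cluster
    return min(examples, key=lambda e: dist(query, e[0]))[1]

# Hypothetical pre-clustered database of translation examples.
clusters = [
    ((0.0, 0.0), [((0.1, 0.0), "example A"), ((-0.2, 0.1), "example B")]),
    ((5.0, 5.0), [((5.1, 4.9), "example C"), ((4.0, 5.5), "example D")]),
]
print(nearest_example((4.9, 5.0), clusters))
```

With k clusters, a query is compared with k centroids plus the members of one cluster instead of every stored example. The trade-off is that the true best match can occasionally sit in a neighbouring cluster and be missed.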
  • 41. Clustering in Online shopping mall • It combines the characteristics of Web 2.0 with a Decision Support System for an online shopping mall. • It lets customers share profiles and comments. • The goal is to show customers the products they may be interested in, according to the information they provide. • The framework consists of the search engine as an external layer and the recommendation system as an internal layer. • It uses K-means to cluster the products in the internal layer.
  • 42. Clustering in Online shopping mall Fig2: DSS framework for online shopping mall
  • 43. Clustering in Online shopping mall Fig3: Products clustering
  • 44. Clustering in Spatial database • A spatial database holds large amounts of data obtained from satellites for applications such as geo-marketing, traffic control, and environmental studies. • Many physical obstacles exist (rivers, lakes, highways), and their presence may affect the result of clustering in spatial databases.
  • 45. Clustering in Spatial database Fig4: map of western population of Canada with Highways and Rivers
  • 46. Clustering in Spatial database • However, most clustering algorithms, such as DBRS and DBRS+ (Density-Based Clustering with Random Sampling), cannot deal with obstacles well, • because they cannot efficiently handle clusters of widely varying densities in a dataset. • So a new clustering algorithm, MMO (mathematics morphology based algorithm of obstacles clustering), is proposed for the problem of clustering in the presence of obstacles.
  • 47. Clustering in Spatial database • The main contribution is two new operators (Connector and Obstor) that are more accurate than the ordinary operators (open and close).
  • 48. Clustering in Spatial database • Result: the performance tests show that MMO is more suitable and effective for large data sets than other algorithms. Fig5: runtime of MMO and DBRS+
  • 49. Conclusion The main purpose of data mining is to provide decision support to managers and business professionals through knowledge discovery. – It analyzes vast stores of historical business data. – It tries to discover patterns, trends, and correlations hidden in the data that can help a company improve its business performance. – It uses regression, decision trees, neural networks, cluster analysis, or market basket analysis. – Cluster analysis is a common method in data mining tools. – Clustering methods and their algorithms are used effectively by many applications today, and clustering is a good research area for DSS fields.
  • 50. References • 1. Cranias, L., Papageorgiou, H., & Piperidis, S. (1994). Clustering: A Technique for Search Space Reduction in Example-Based Machine Translation. IEEE. • 2. Pattabiraman, V. (2012). Optimization of spatial join using constraints based-clustering techniques. Engineering and Computer Innovations. • 3. Wang, X., Rostoker, C., & Hamilton, H. J. (2004). Density-Based Spatial Clustering in the Presence of Obstacles and Facilitators. • 4. Wang, X., & Hamilton, H. (n.d.). Using Clustering Methods in Geospatial Information Systems. • 5. Wenxing, H., Weng, Y., Xie, L., & Maoqing, L. (2009). Design and Implementation of Web-based DSS for Online Shopping Mall. IEEE. • 6. Zhang, Q. (2008). A Mathematics Morphology Based Algorithm of Obstacles Clustering. International Conference on Computer Science and Software Engineering.