Summer School
“Achievements and Applications of Contemporary Informatics,
         Mathematics and Physics” (AACIMP 2011)
              August 8-20, 2011, Kiev, Ukraine




          Density Based Clustering

                                 Erik Kropat

                     University of the Bundeswehr Munich
                      Institute for Theoretical Computer Science,
                        Mathematics and Operations Research
                                Neubiberg, Germany
DBSCAN
Density based spatial clustering of applications with noise




Figure: arbitrarily shaped clusters surrounded by noise.
DBSCAN

DBSCAN is one of the most cited clustering algorithms in the literature.

Features
− Spatial data
     geomarketing, tomography, satellite images

− Discovery of clusters with arbitrary shape
     spherical, drawn-out, linear, elongated

− Good efficiency on large databases
     parallel programming

− Only two parameters required
− No prior knowledge of the number of clusters required.
DBSCAN

Idea
− Clusters have a high density of points.
− In the area of noise the density is lower
  than the density in any of the clusters.


Goal
− Formalize the notions of clusters and noise.
DBSCAN

Naïve approach
For each point in a cluster, its Eps-neighborhood contains
at least a minimum number (MinPts) of points.




Neighborhood of a Point

 Eps-neighborhood of a point p

   NEps(p) = { q ∈ D | dist (p, q) ≤ Eps }




Figure: circle of radius Eps around the point p.
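The Eps-neighborhood can be sketched in a few lines of Python. This is a minimal illustration, assuming points are coordinate tuples and dist is the Euclidean distance; the name eps_neighborhood is a hypothetical helper, not part of the slides.

```python
from math import dist  # Euclidean distance (assumed metric), Python 3.8+


def eps_neighborhood(D, p, eps):
    """Return all points q in D with dist(p, q) <= eps.

    If p is in D, it is part of its own neighborhood.
    """
    return [q for q in D if dist(p, q) <= eps]


# Small example database of 2-D points.
D = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
print(eps_neighborhood(D, (0.0, 0.0), 1.0))
```

A linear scan like this costs O(n) per query; a spatial index would be the usual choice for large databases.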
DBSCAN ‒ Data

Problem

• In each cluster there are two kinds of points:

     − points inside the cluster (core points)
     − points on the border      (border points)

An Eps-neighborhood of a border point contains significantly fewer points than
an Eps-neighborhood of a core point.
Better idea
For every point p in a cluster C there is a point q ∈ C,
so that
(1) p is inside the Eps-neighborhood of q       (border points are connected to core points)
and
(2) NEps(q) contains at least MinPts points     (core points = high density).

Figure: point p inside the Eps-neighborhood of point q.
Definition
A point p is directly density-reachable from a point q
with regard to the parameters Eps and MinPts, if
  1) p ∈ NEps(q)                (reachability)
  2) | NEps(q) | ≥ MinPts       (core point condition)




Example (figure): MinPts = 5, p ∈ NEps(q), and
| NEps(q) | = 6 ≥ 5 = MinPts, so q satisfies the core point condition.
Remark
Direct density-reachability is symmetric for pairs of core points.
It is not symmetric if one core point and one border point are involved.



Example (figure), parameter MinPts = 5:

− p is directly density-reachable from q:
     p ∈ NEps(q)
     | NEps(q) | = 6 ≥ 5 = MinPts   (core point condition)

− q is not directly density-reachable from p:
     | NEps(p) | = 4 < 5 = MinPts   (core point condition fails)
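The asymmetry between a core point and a border point can be checked on a small data set. The sketch below assumes Euclidean distance and tuple-valued points; the function names and the sample coordinates are illustrative choices, not taken from the slides.

```python
from math import dist  # Euclidean distance (assumed metric)


def eps_neighborhood(D, p, eps):
    """All points q in D with dist(p, q) <= eps."""
    return [q for q in D if dist(p, q) <= eps]


def directly_density_reachable(D, p, q, eps, min_pts):
    """True if p is in NEps(q) and q satisfies the core point condition."""
    n_q = eps_neighborhood(D, q, eps)
    return p in n_q and len(n_q) >= min_pts


q = (0.0, 0.0)  # core point: 6 points within distance 1
p = (1.0, 0.0)  # border point: only 3 points within distance 1
D = [q, p, (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]

print(directly_density_reachable(D, p, q, eps=1.0, min_pts=5))  # p from q
print(directly_density_reachable(D, q, p, eps=1.0, min_pts=5))  # q from p
```

With these coordinates the first call succeeds and the second fails, because p does not satisfy the core point condition.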
Definition
A point p is density-reachable from a point q
with regard to the parameters Eps and MinPts
if there is a chain of points p1, p2, ..., ps with p1 = q and ps = p
such that pi+1 is directly density-reachable from pi for all 1 ≤ i ≤ s-1.




Example (figure), MinPts = 5: chain q, p1, p with
     | NEps(q) | = 5 = MinPts        (core point condition)
     | NEps(p1) | = 6 ≥ 5 = MinPts   (core point condition)
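Density-reachability is the transitive closure of direct density-reachability, so it can be tested with a breadth-first search that only continues through core points. This is a sketch under the assumptions of Euclidean distance and tuple-valued points; all names and the sample data are hypothetical.

```python
from collections import deque
from math import dist  # Euclidean distance (assumed metric)


def eps_neighborhood(D, p, eps):
    """All points q in D with dist(p, q) <= eps."""
    return [q for q in D if dist(p, q) <= eps]


def density_reachable(D, p, q, eps, min_pts):
    """True if a chain q = p1, ..., ps = p of directly reachable points exists."""
    visited = {q}
    queue = deque([q])
    while queue:
        current = queue.popleft()
        neighbors = eps_neighborhood(D, current, eps)
        if len(neighbors) < min_pts:
            continue  # current is not a core point: the chain stops here
        for r in neighbors:
            if r == p:
                return True
            if r not in visited:
                visited.add(r)
                queue.append(r)
    return False


# Five points on a line (x = 0..4) plus one isolated point at x = 10.
# With eps = 1 and min_pts = 3, the points at x = 1, 2, 3 are core points.
D = [(float(x), 0.0) for x in range(5)] + [(10.0, 0.0)]
print(density_reachable(D, (4.0, 0.0), (1.0, 0.0), eps=1.0, min_pts=3))
print(density_reachable(D, (1.0, 0.0), (4.0, 0.0), eps=1.0, min_pts=3))
```

The border point at x = 4 is reachable from the core point at x = 1, but not the other way round, since no chain can start at a non-core point.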
Definition (density-connected)
A point p is density-connected to a point q
with regard to the parameters Eps and MinPts
if there is a point v such that both p and q are density-reachable from v.


Figure: points p and q, both density-reachable from a point v (MinPts = 5).




Remark: Density-connectivity is a symmetric relation.
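Density-connectivity can be sketched by collecting, for each candidate v, the set of points density-reachable from v and testing whether both p and q are in it. The code assumes Euclidean distance and tuple-valued points; reachable_set and density_connected are hypothetical helper names.

```python
from collections import deque
from math import dist  # Euclidean distance (assumed metric)


def eps_neighborhood(D, p, eps):
    """All points q in D with dist(p, q) <= eps."""
    return [q for q in D if dist(p, q) <= eps]


def reachable_set(D, v, eps, min_pts):
    """Set of all points density-reachable from v (empty if v is not core)."""
    reached, seen, queue = set(), {v}, deque([v])
    while queue:
        current = queue.popleft()
        neighbors = eps_neighborhood(D, current, eps)
        if len(neighbors) < min_pts:
            continue  # only core points extend the chain
        reached.update(neighbors)
        for r in neighbors:
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return reached


def density_connected(D, p, q, eps, min_pts):
    """True if some v in D exists from which both p and q are density-reachable."""
    return any(
        p in s and q in s
        for s in (reachable_set(D, v, eps, min_pts) for v in D)
    )


# Points at x = 0..4 on a line plus an isolated point at x = 10.
D = [(float(x), 0.0) for x in range(5)] + [(10.0, 0.0)]
a, b, far = (0.0, 0.0), (4.0, 0.0), (10.0, 0.0)
print(density_connected(D, a, b, eps=1.0, min_pts=3))
print(density_connected(D, a, far, eps=1.0, min_pts=3))
```

In this example the two border points at x = 0 and x = 4 are density-connected through the core points between them, although neither is density-reachable from the other.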
Definition (cluster)
A cluster with regard to the parameters Eps and MinPts
is a non-empty subset C of the database D with

  1) For all p, q ∈ D:                                    (Maximality)
      If p ∈ C      and q is density-reachable from p
      with regard to the parameters Eps and MinPts,
      then q ∈ C.

  2) For all p, q ∈ C:                                   (Connectivity)
      The point p is density-connected to q
      with regard to the parameters Eps and MinPts.
Definition (noise)
Let C1,...,Ck be the clusters of the database D
with regard to the parameters Eps_i and MinPts_i (i = 1,...,k).

The set of points in the database D not belonging to any cluster C1,...,Ck
is called noise:

      Noise = { p ∈ D | p ∉ Ci for all i = 1,...,k}




Figure: noise points lying outside all clusters.
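The noise set is simply the complement of the union of all clusters. A minimal sketch, assuming points are hashable tuples and clusters are given as sets; the function name noise_points and the sample data are illustrative.

```python
def noise_points(D, clusters):
    """Return the points of D that belong to none of the given clusters."""
    clustered = set().union(*clusters)  # union of C1, ..., Ck
    return [p for p in D if p not in clustered]


D = [(0, 0), (0, 1), (5, 5), (9, 9), (9, 8)]
clusters = [{(0, 0), (0, 1)}, {(9, 9), (9, 8)}]
print(noise_points(D, clusters))
```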
Two-Step Approach

If the parameters Eps and MinPts are given,
a cluster can be discovered in a two-step approach:

1) Choose an arbitrary point v from the database
   satisfying the core point condition as a seed.

2) Retrieve all points that are density-reachable from the seed
   obtaining the cluster containing the seed.
DBSCAN (algorithm)

(1) Start with an arbitrary point p from the database and
    retrieve all points density-reachable from p
    with regard to Eps and MinPts.

(2) If p is a core point, the procedure yields a cluster
    with regard to Eps and MinPts
    and the point is classified.

(3) If p is a border point, no points are density-reachable from p
    and DBSCAN visits the next unclassified point in the database.
Algorithm: DBSCAN
INPUT:      Database SetOfPoints, Eps, MinPts
OUTPUT: Clusters, region of noise

ClusterID := nextID(NOISE);
foreach p ∈ SetOfPoints do
     if p.classifiedAs == UNCLASSIFIED then
          if ExpandCluster(SetOfPoints, p, ClusterID, Eps, MinPts) then
               ClusterID++;
          endif
     endif
endforeach
SetOfPoints = the database or a discovered cluster from a previous run.
Function: ExpandCluster

INPUT:     SetOfPoints, p, ClusterID, Eps, MinPts
OUTPUT: True, if p is a core point; False, else.

seeds = NEps(p);
if seeds.size < MinPts then                 // p is not a core point
     p.classifiedAs = NOISE;
     return FALSE;
else                                        // all points in seeds are density-reachable from p
     foreach q ∈ seeds do
          q.classifiedAs = ClusterID;
     endforeach
     seeds = seeds \ {p};
     while seeds ≠ ∅ do
          currentP = seeds.first();
          result = NEps(currentP);
          if result.size ≥ MinPts then      // currentP is a core point
               foreach resultP ∈ result with
                       resultP.classifiedAs ∈ {UNCLASSIFIED, NOISE} do
                    if resultP.classifiedAs == UNCLASSIFIED then
                         seeds = seeds ∪ {resultP};
                    endif
                    resultP.classifiedAs = ClusterID;
               endforeach
          endif
          seeds = seeds \ {currentP};
     endwhile
     return TRUE;
endif

Source: A. Naprienko: Dichtebasierte Verfahren der Clusteranalyse raumbezogener Daten am Beispiel von DBSCAN und Fuzzy-DBSCAN.
        Universität der Bundeswehr München, student’s project, WT2011.
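The DBSCAN and ExpandCluster pseudocode above can be sketched in Python as follows. This is an illustrative translation, not the authors' implementation: it assumes Euclidean distance, represents points by their index in a list, and uses the hypothetical label values -2 (UNCLASSIFIED) and -1 (NOISE) with cluster IDs starting at 0.

```python
from math import dist  # Euclidean distance (assumed metric)

UNCLASSIFIED, NOISE = -2, -1  # assumed label encoding


def region_query(points, i, eps):
    """Indices of all points in the Eps-neighborhood of points[i]."""
    return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]


def expand_cluster(points, labels, i, cluster_id, eps, min_pts):
    """Grow a cluster from points[i]; return False if it is not a core point."""
    seeds = region_query(points, i, eps)
    if len(seeds) < min_pts:
        labels[i] = NOISE  # may later be relabeled as a border point
        return False
    for j in seeds:  # all seeds are density-reachable from points[i]
        labels[j] = cluster_id
    seeds = [j for j in seeds if j != i]
    while seeds:
        current = seeds.pop(0)
        result = region_query(points, current, eps)
        if len(result) >= min_pts:  # current is a core point
            for j in result:
                if labels[j] in (UNCLASSIFIED, NOISE):
                    if labels[j] == UNCLASSIFIED:
                        seeds.append(j)
                    labels[j] = cluster_id
    return True


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster ID (0, 1, ...) or NOISE."""
    labels = [UNCLASSIFIED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] == UNCLASSIFIED:
            if expand_cluster(points, labels, i, cluster_id, eps, min_pts):
                cluster_id += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
print(dbscan(points, eps=1.5, min_pts=3))
```

The example data consists of two square-shaped groups of four points and one isolated point, which ends up labeled as noise. The linear-scan region_query makes this sketch quadratic in the number of points.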
Density Based Clustering
 ‒ The Parameters Eps and MinPts ‒
Determining the parameters Eps and MinPts
The parameters Eps and MinPts can be determined by a heuristic.

Observation
• For points in a cluster, their k-th nearest neighbors are at roughly the same distance.
• Noise points have their k-th nearest neighbor at a greater distance.




⇒    Plot sorted distance of every point to its k-th nearest neighbor.
Determining the parameters Eps and MinPts

Procedure
• Define a function k-dist from the database to the real numbers,
  mapping each point to the distance from its k-th nearest neighbor.

• Sort the points of the database in descending order of their k-dist values.

Figure: sorted k-dist graph (vertical axis: k-dist, horizontal axis: points of the database).
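The k-dist values and their descending order can be computed with a brute-force sketch like the one below. It assumes Euclidean distance and that a point does not count as its own neighbor; the function names are hypothetical.

```python
from math import dist  # Euclidean distance (assumed metric)


def k_dist(D, k):
    """Distance from each point of D to its k-th nearest neighbor.

    Assumption: the point itself is excluded from its neighbors.
    """
    values = []
    for i, p in enumerate(D):
        distances = sorted(dist(p, q) for j, q in enumerate(D) if j != i)
        values.append(distances[k - 1])
    return values


def sorted_k_dist(D, k):
    """k-dist values in descending order, as plotted in the sorted k-dist graph."""
    return sorted(k_dist(D, k), reverse=True)


# Four points forming a unit square and one distant point.
D = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
print(sorted_k_dist(D, k=2))
```

In this example the distant point has a much larger 2-dist value than the four square points, so it appears first in the sorted list.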
Determining the parameters Eps and MinPts

Procedure
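• Choose an arbitrary point p,
        set Eps = k-dist(p),
        set MinPts = k.
• All points with an equal or smaller k-dist value will be cluster points.

Figure: sorted k-dist graph with threshold point p; points to the left of p
(larger k-dist) are noise, points to the right of p are cluster points.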
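Once a threshold point has been chosen, the split predicted by the heuristic is a simple comparison against Eps. The sketch below takes precomputed k-dist values as input; the function name and the sample numbers are made up for illustration.

```python
def split_by_threshold(k_dists, eps):
    """Split point indices by the heuristic: k-dist <= eps means cluster point."""
    cluster = [i for i, d in enumerate(k_dists) if d <= eps]
    noise = [i for i, d in enumerate(k_dists) if d > eps]
    return cluster, noise


# Hypothetical k-dist values for five points; the threshold point has k-dist 1.2.
k_dists = [6.4, 1.0, 1.0, 1.2, 5.9]
cluster, noise = split_by_threshold(k_dists, eps=1.2)
print(cluster, noise)
```

This only reproduces the expectation stated in the procedure; the final assignment of each point is made by running DBSCAN with the chosen Eps and MinPts.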
Determining the parameters Eps and MinPts



Idea: Derive the parameters from the point density of the least dense cluster in the data set.
Determining the parameters Eps and MinPts


• Find the threshold point p with the maximal k-dist value in the “thinnest cluster” of D.
• Set parameters     Eps = k-dist(p)      and   MinPts = k.

Figure: sorted k-dist graph with the threshold Eps separating noise (left)
from cluster 1 and cluster 2 (right).
Density Based Clustering
       ‒ Applications ‒
Automatic border detection in dermoscopy images




Sample images showing assessments of the dermatologist (red), automated frameworks DBSCAN (blue) and FCM (green).
Kockara et al. BMC Bioinformatics 2010 11(Suppl 6):S26 doi:10.1186/1471-2105-11-S6-S26
Literature
• M. Ester, H.P. Kriegel, J. Sander, X. Xu
  A density-based algorithm for discovering clusters in large spatial
  databases with noise.
  Proceedings of 2nd International Conference on Knowledge Discovery
  and Data Mining (KDD96).

• A. Naprienko
  Dichtebasierte Verfahren der Clusteranalyse raumbezogener Daten
  am Beispiel von DBSCAN und Fuzzy-DBSCAN.
  Universität der Bundeswehr München, student’s project, WT2011.

• J. Sander, M. Ester, H.P. Kriegel, X. Xu
  Density-based clustering in spatial databases: the algorithm GDBSCAN
  and its applications.
  Data Mining and Knowledge Discovery, Springer, Berlin, 2 (2): 169–194.
Literature
• J.N. Dharwa, A.R. Patel
  A Data Mining with Hybrid Approach Based Transaction Risk Score
  Generation Model (TRSGM) for Fraud Detection of Online Financial Transaction.
  International Journal of Computer Applications, Vol. 16, No. 1, 2011.
Thank you very much!
Ad

More Related Content

What's hot (20)

Density based clustering
Density based clusteringDensity based clustering
Density based clustering
YaswanthHariKumarVud
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
Krish_ver2
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
parry prabhu
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
Knoldus Inc.
 
K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
Kasun Ranga Wijeweera
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)
Pravinkumar Landge
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning
Mohammad Junaid Khan
 
K means clustering
K means clusteringK means clustering
K means clustering
keshav goyal
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clustering
Krish_ver2
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
mrizwan969
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
Carlos Castillo (ChaTo)
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
Saad Elbeleidy
 
Presentation on K-Means Clustering
Presentation on K-Means ClusteringPresentation on K-Means Clustering
Presentation on K-Means Clustering
Pabna University of Science & Technology
 
Clustering
ClusteringClustering
Clustering
M Rizwan Aqeel
 
Density based methods
Density based methodsDensity based methods
Density based methods
SVijaylakshmi
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
error007
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
Valerii Klymchuk
 
Hierachical clustering
Hierachical clusteringHierachical clustering
Hierachical clustering
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 
K Nearest Neighbor Algorithm
K Nearest Neighbor AlgorithmK Nearest Neighbor Algorithm
K Nearest Neighbor Algorithm
Tharuka Vishwajith Sarathchandra
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)
Cory Cook
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
Krish_ver2
 
K mean-clustering algorithm
K mean-clustering algorithmK mean-clustering algorithm
K mean-clustering algorithm
parry prabhu
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
Knoldus Inc.
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)
Pravinkumar Landge
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning
Mohammad Junaid Khan
 
K means clustering
K means clusteringK means clustering
K means clustering
keshav goyal
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clustering
Krish_ver2
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
mrizwan969
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
Saad Elbeleidy
 
Density based methods
Density based methodsDensity based methods
Density based methods
SVijaylakshmi
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han &amp; Kamber
error007
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
Valerii Klymchuk
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)
Cory Cook
 

Viewers also liked (18)

3.4 density and grid methods
3.4 density and grid methods3.4 density and grid methods
3.4 density and grid methods
Krish_ver2
 
Db Scan
Db ScanDb Scan
Db Scan
International Islamic University
 
Clique
Clique Clique
Clique
sk_klms
 
Difference between molap, rolap and holap in ssas
Difference between molap, rolap and holap  in ssasDifference between molap, rolap and holap  in ssas
Difference between molap, rolap and holap in ssas
Umar Ali
 
HR FUNCTIONS
HR FUNCTIONSHR FUNCTIONS
HR FUNCTIONS
British Council
 
Database aggregation using metadata
Database aggregation using metadataDatabase aggregation using metadata
Database aggregation using metadata
Dr Sandeep Kumar Poonia
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
Jason Rodrigues
 
Cure, Clustering Algorithm
Cure, Clustering AlgorithmCure, Clustering Algorithm
Cure, Clustering Algorithm
Lino Possamai
 
1.7 data reduction
1.7 data reduction1.7 data reduction
1.7 data reduction
Krish_ver2
 
Application of data mining
Application of data miningApplication of data mining
Application of data mining
SHIVANI SONI
 
Overview of human resource management system & function
Overview of human resource management  system & functionOverview of human resource management  system & function
Overview of human resource management system & function
Rita Choudhary
 
Apriori Algorithm
Apriori AlgorithmApriori Algorithm
Apriori Algorithm
International School of Engineering
 
Role of HR Manager
Role of HR ManagerRole of HR Manager
Role of HR Manager
CreativeHRM
 
hrm functions
hrm functionshrm functions
hrm functions
jain.pralabh
 
Functions and Activities of HRM
Functions and Activities of HRMFunctions and Activities of HRM
Functions and Activities of HRM
Sharon Geroquia
 
OLAP
OLAPOLAP
OLAP
Slideshare
 
Data Mining: Association Rules Basics
Data Mining: Association Rules BasicsData Mining: Association Rules Basics
Data Mining: Association Rules Basics
Benazir Income Support Program (BISP)
 
Hr functions and strategy ppt
Hr functions and strategy pptHr functions and strategy ppt
Hr functions and strategy ppt
LOLITA GANDIA
 
Ad

Similar to Density Based Clustering (20)

DBSCAN
DBSCANDBSCAN
DBSCAN
ssuseraef7e0
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
ZHAO Sam
 
density based method and expectation maximization
density based method and expectation maximizationdensity based method and expectation maximization
density based method and expectation maximization
Siva Priya
 
Dbscan
DbscanDbscan
Dbscan
RohitPaul52
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Intel® Software
 
Graph and Density Based Clustering
Graph and Density Based ClusteringGraph and Density Based Clustering
Graph and Density Based Clustering
AyushAnand105
 
Kernel estimation(ref)
Kernel estimation(ref)Kernel estimation(ref)
Kernel estimation(ref)
Zahra Amini
 
Core–periphery detection in networks with nonlinear Perron eigenvectors
Core–periphery detection in networks with nonlinear Perron eigenvectorsCore–periphery detection in networks with nonlinear Perron eigenvectors
Core–periphery detection in networks with nonlinear Perron eigenvectors
Francesco Tudisco
 
Approximate Tree Kernels
Approximate Tree KernelsApproximate Tree Kernels
Approximate Tree Kernels
Niharjyoti Sarangi
 
Parallel kmeans clustering in Erlang
Parallel kmeans clustering in ErlangParallel kmeans clustering in Erlang
Parallel kmeans clustering in Erlang
Chinmay Patel
 
Lecture5.pptx
Lecture5.pptxLecture5.pptx
Lecture5.pptx
ARVIND SARDAR
 
Neural Network
Neural NetworkNeural Network
Neural Network
samisounda
 
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Salah Amean
 
Machine Learning course Lecture number 2 - Supervised machine learning, part ...
Machine Learning course Lecture number 2 - Supervised machine learning, part ...Machine Learning course Lecture number 2 - Supervised machine learning, part ...
Machine Learning course Lecture number 2 - Supervised machine learning, part ...
hamedj21
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
IJMIT JOURNAL
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial Dataset
AlaaZ
 
KNN.pptx
KNN.pptxKNN.pptx
KNN.pptx
Rahul Halder
 
KNN.pptx
KNN.pptxKNN.pptx
KNN.pptx
dfgd7
 
International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)
IJMIT JOURNAL
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
IJMIT JOURNAL
 
Clustering: Large Databases in data mining
Clustering: Large Databases in data miningClustering: Large Databases in data mining
Clustering: Large Databases in data mining
ZHAO Sam
 
density based method and expectation maximization
density based method and expectation maximizationdensity based method and expectation maximization
density based method and expectation maximization
Siva Priya
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Intel® Software
 
Graph and Density Based Clustering
Graph and Density Based ClusteringGraph and Density Based Clustering
Graph and Density Based Clustering
AyushAnand105
 
Kernel estimation(ref)
Kernel estimation(ref)Kernel estimation(ref)
Kernel estimation(ref)
Zahra Amini
 
Core–periphery detection in networks with nonlinear Perron eigenvectors
Core–periphery detection in networks with nonlinear Perron eigenvectorsCore–periphery detection in networks with nonlinear Perron eigenvectors
Core–periphery detection in networks with nonlinear Perron eigenvectors
Francesco Tudisco
 
Parallel kmeans clustering in Erlang
Parallel kmeans clustering in ErlangParallel kmeans clustering in Erlang
Parallel kmeans clustering in Erlang
Chinmay Patel
 
Neural Network
Neural NetworkNeural Network
Neural Network
samisounda
 
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Data Mining: Concepts and techniques: Chapter 11,Review: Basic Cluster Analys...
Salah Amean
 
Machine Learning course Lecture number 2 - Supervised machine learning, part ...
Machine Learning course Lecture number 2 - Supervised machine learning, part ...Machine Learning course Lecture number 2 - Supervised machine learning, part ...
Machine Learning course Lecture number 2 - Supervised machine learning, part ...
hamedj21
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
IJMIT JOURNAL
 
Enhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial DatasetEnhance The K Means Algorithm On Spatial Dataset
Enhance The K Means Algorithm On Spatial Dataset
AlaaZ
 
KNN.pptx
KNN.pptxKNN.pptx
KNN.pptx
dfgd7
 
International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)International Journal of Managing Information Technology (IJMIT)
International Journal of Managing Information Technology (IJMIT)
IJMIT JOURNAL
 
An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...An improved spfa algorithm for single source shortest path problem using forw...
An improved spfa algorithm for single source shortest path problem using forw...
IJMIT JOURNAL
 
Ad

More from SSA KPI (20)

Germany presentation
Germany presentationGermany presentation
Germany presentation
SSA KPI
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energy
SSA KPI
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainability
SSA KPI
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable development
SSA KPI
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering education
SSA KPI
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginers
SSA KPI
 
DAAD-10.11.2011
DAAD-10.11.2011DAAD-10.11.2011
DAAD-10.11.2011
SSA KPI
 
Talking with money
Talking with moneyTalking with money
Talking with money
SSA KPI
 
'Green' startup investment
'Green' startup investment'Green' startup investment
'Green' startup investment
SSA KPI
 
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea wavesFrom Huygens odd sympathy to the energy Huygens' extraction from the sea waves
From Huygens odd sympathy to the energy Huygens' extraction from the sea waves
SSA KPI
 
Dynamics of dice games
Dynamics of dice gamesDynamics of dice games
Dynamics of dice games
SSA KPI
 
Energy Security Costs
Energy Security CostsEnergy Security Costs
Energy Security Costs
SSA KPI
 
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environmentsNaturally Occurring Radioactivity (NOR) in natural and anthropic environments
Naturally Occurring Radioactivity (NOR) in natural and anthropic environments
SSA KPI
 
Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5Advanced energy technology for sustainable development. Part 5
Advanced energy technology for sustainable development. Part 5
SSA KPI
 
Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4Advanced energy technology for sustainable development. Part 4
Advanced energy technology for sustainable development. Part 4
SSA KPI
 
Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3Advanced energy technology for sustainable development. Part 3
Advanced energy technology for sustainable development. Part 3
SSA KPI
 
Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2Advanced energy technology for sustainable development. Part 2
Advanced energy technology for sustainable development. Part 2
SSA KPI
 
Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1Advanced energy technology for sustainable development. Part 1
Advanced energy technology for sustainable development. Part 1
SSA KPI
 
Fluorescent proteins in current biology
Fluorescent proteins in current biologyFluorescent proteins in current biology
Fluorescent proteins in current biology
SSA KPI
 
Neurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functionsNeurotransmitter systems of the brain and their functions
Neurotransmitter systems of the brain and their functions
SSA KPI
 
Germany presentation
Germany presentationGermany presentation
Germany presentation
SSA KPI
 
Grand challenges in energy
Grand challenges in energyGrand challenges in energy
Grand challenges in energy
SSA KPI
 
Engineering role in sustainability
Engineering role in sustainabilityEngineering role in sustainability
Engineering role in sustainability
SSA KPI
 
Consensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable developmentConsensus and interaction on a long term strategy for sustainable development
Consensus and interaction on a long term strategy for sustainable development
SSA KPI
 
Competences in sustainability in engineering education
Competences in sustainability in engineering educationCompetences in sustainability in engineering education
Competences in sustainability in engineering education
SSA KPI
 
Introducatio SD for enginers
Introducatio SD for enginersIntroducatio SD for enginers
Introducatio SD for enginers
SSA KPI
 
DAAD-10.11.2011
DAAD-10.11.2011DAAD-10.11.2011
DAAD-10.11.2011
SSA KPI
 
Talking with money
Talking with moneyTalking with money
Talking with money
Density Based Clustering

  • 1. Summer School “Achievements and Applications of Contemporary Informatics, Mathematics and Physics” (AACIMP 2011) August 8-20, 2011, Kiev, Ukraine Density Based Clustering Erik Kropat University of the Bundeswehr Munich Institute for Theoretical Computer Science, Mathematics and Operations Research Neubiberg, Germany
  • 2. DBSCAN: Density-based spatial clustering of applications with noise. [Figure: arbitrarily shaped clusters surrounded by noise points]
  • 3. DBSCAN. DBSCAN is one of the most cited clustering algorithms in the literature. Features: − Spatial data (geomarketing, tomography, satellite images) − Discovery of clusters with arbitrary shape (spherical, drawn-out, linear, elongated) − Good efficiency on large databases (parallel programming) − Only two parameters required − No prior knowledge of the number of clusters required.
  • 4. DBSCAN Idea − Clusters have a high density of points. − In the area of noise the density is lower than the density in any of the clusters. Goal − Formalize the notions of clusters and noise.
  • 5. DBSCAN Naïve approach: For each point in a cluster, the Eps-neighborhood of that point contains at least a minimum number (MinPts) of points.
  • 6. Neighborhood of a Point. Eps-neighborhood of a point p: NEps(p) = { q ∈ D | dist(p, q) ≤ Eps } [Figure: circle of radius Eps centered at p]
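The Eps-neighborhood can be illustrated with a short Python sketch. It assumes points are coordinate tuples and that dist is the Euclidean distance; the function name `eps_neighborhood` and the toy database `D` are hypothetical and not taken from the slides.

```python
import math

def eps_neighborhood(points, p, eps):
    """Return every q in points with dist(p, q) <= eps.

    Assumption: Euclidean distance. p itself is returned when it is
    part of points, because dist(p, p) = 0 <= eps.
    """
    return [q for q in points if math.dist(p, q) <= eps]

# Hypothetical toy database: three nearby points and one far away.
D = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
neighbors = eps_neighborhood(D, (0.0, 0.0), eps=1.0)
print(neighbors)
```

This linear scan costs O(n) per query; a spatial index could replace it on large databases without changing the definition.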
  • 7. DBSCAN ‒ Data Problem • In each cluster there are two kinds of points: − points inside the cluster (core points) − points on the border (border points). An Eps-neighborhood of a border point contains significantly fewer points than an Eps-neighborhood of a core point.
  • 8. Better idea: For every point p in a cluster C there is a point q ∈ C, so that (1) p is inside the Eps-neighborhood of q (border points are connected to core points) and (2) NEps(q) contains at least MinPts points (core points = high density).
  • 9. Definition A point p is directly density-reachable from a point q with regard to the parameters Eps and MinPts, if 1) p ∈ NEps(q) (reachability) 2) | NEps(q) | ≥ MinPts (core point condition). Example (figure, MinPts = 5): | NEps(q) | = 6 ≥ 5 = MinPts, so q satisfies the core point condition.
  • 10. Remark Directly density-reachable is symmetric for pairs of core points. It is not symmetric if one core point and one border point are involved. Example (figure, MinPts = 5): p is directly density-reachable from q, since p ∈ NEps(q) and | NEps(q) | = 6 ≥ 5 = MinPts (core point condition holds). q is not directly density-reachable from p, since | NEps(p) | = 4 < 5 = MinPts (core point condition fails).
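The asymmetry between a core point and a border point can be checked with a minimal sketch. The helper names and the toy data are hypothetical, and Euclidean distance is assumed; the neighborhood sizes differ from those in the slide figure.

```python
import math

def eps_neighborhood(points, p, eps):
    """All q in points with dist(p, q) <= eps (Euclidean distance assumed)."""
    return [q for q in points if math.dist(p, q) <= eps]

def is_core(points, q, eps, min_pts):
    """Core point condition: |NEps(q)| >= MinPts."""
    return len(eps_neighborhood(points, q, eps)) >= min_pts

def directly_density_reachable(points, p, q, eps, min_pts):
    """True if p lies in NEps(q) and q satisfies the core point condition."""
    return math.dist(p, q) <= eps and is_core(points, q, eps, min_pts)

# Hypothetical data: a dense group around the origin plus one border point.
D = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5), (1.0, 0.0)]
q = (0.0, 0.0)   # core point: its neighborhood holds all 6 points
p = (1.0, 0.0)   # border point: its neighborhood holds only 3 points
eps, min_pts = 1.0, 5

print(directly_density_reachable(D, p, q, eps, min_pts))  # p from q
print(directly_density_reachable(D, q, p, eps, min_pts))  # q from p
```

Swapping the roles of p and q changes the answer because only the second argument has to pass the core point condition.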
  • 11. Definition A point p is density-reachable from a point q with regard to the parameters Eps and MinPts if there is a chain of points p1, p2, ..., ps with p1 = q and ps = p such that pi+1 is directly density-reachable from pi for all 1 ≤ i ≤ s−1. In the pictured example with MinPts = 5, both q and the intermediate point p1 satisfy the core point condition: | NEps(q) | = 5 = MinPts and | NEps(p1) | = 6 ≥ 5 = MinPts.
  • 12. Definition (density-connected) A point p is density-connected to a point q with regard to the parameters Eps and MinPts if there is a point v such that both p and q are density-reachable from v. Remark: Density-connectivity is a symmetric relation. [Figure: p and q both density-reachable from v, MinPts = 5]
  • 13. Definition (cluster) A cluster with regard to the parameters Eps and MinPts is a non-empty subset C of the database D with 1) Maximality: For all p, q ∈ D, if p ∈ C and q is density-reachable from p with regard to the parameters Eps and MinPts, then q ∈ C. 2) Connectivity: For all p, q ∈ C, the point p is density-connected to q with regard to the parameters Eps and MinPts.
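Density-reachability and density-connectivity can be sketched as a breadth-first search that only continues a chain through core points. This is an illustrative reading of the definitions, not code from the slides; the function names and the toy data are hypothetical, and Euclidean distance is assumed.

```python
import math
from collections import deque

def eps_neighborhood(points, p, eps):
    """All q in points with dist(p, q) <= eps (Euclidean distance assumed)."""
    return [q for q in points if math.dist(p, q) <= eps]

def density_reachable_set(points, q, eps, min_pts):
    """Set of points density-reachable from q (empty if q is not a core point)."""
    reached, expanded = set(), set()
    queue = deque([q])
    while queue:
        current = queue.popleft()
        if current in expanded:
            continue
        expanded.add(current)
        neighbors = eps_neighborhood(points, current, eps)
        if len(neighbors) < min_pts:
            continue  # not a core point: a chain may end here but not go on
        for n in neighbors:
            reached.add(n)
            if n not in expanded:
                queue.append(n)
    return reached

def density_connected(points, p, q, eps, min_pts):
    """True if some v exists from which both p and q are density-reachable."""
    return any({p, q} <= density_reachable_set(points, v, eps, min_pts)
               for v in points)

# Hypothetical data: a chain of five points and one isolated point.
D = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
print(density_reachable_set(D, (2.0, 0.0), 1.0, 3))    # from a core point
print(density_reachable_set(D, (0.0, 0.0), 1.0, 3))    # from a border point
print(density_connected(D, (0.0, 0.0), (4.0, 0.0), 1.0, 3))
```

In this toy data the two end points of the chain are border points: neither is density-reachable from the other, yet they are density-connected through the core points in the middle, which is why the cluster definition uses connectivity rather than reachability.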
  • 14. Definition (noise) Let C1,...,Ck be the clusters of the database D with regard to the parameters Eps_i and MinPts_i (i = 1,...,k). The set of points in the database D not belonging to any cluster C1,...,Ck is called noise: Noise = { p ∈ D | p ∉ Ci for all i = 1,...,k }
  • 15. Two-Step Approach If the parameters Eps and MinPts are given, a cluster can be discovered in a two-step approach: 1) Choose an arbitrary point v from the database satisfying the core point condition as a seed. 2) Retrieve all points that are density-reachable from the seed obtaining the cluster containing the seed.
  • 16. DBSCAN (algorithm) (1) Start with an arbitrary point p from the database and retrieve all points density-reachable from p with regard to Eps and MinPts. (2) If p is a core point, the procedure yields a cluster with regard to Eps and MinPts and the point is classified. (3) If p is a border point, no points are density-reachable from p and DBSCAN visits the next unclassified point in the database.
  • 17. Algorithm: DBSCAN
      INPUT: Database SetOfPoints, Eps, MinPts
      OUTPUT: Clusters, region of noise
      ClusterID := nextID(NOISE);
      foreach p ∈ SetOfPoints do
        if p.classifiedAs == UNCLASSIFIED then
          if ExpandCluster(SetOfPoints, p, ClusterID, Eps, MinPts) then
            ClusterID++;
          endif
        endif
      endforeach
      SetOfPoints = the database or a discovered cluster from a previous run.
  • 18. Function: ExpandCluster
      INPUT: SetOfPoints, p, ClusterID, Eps, MinPts
      OUTPUT: True, if p is a core point; False, else.
      seeds = NEps(p);
      if seeds.size < MinPts then // no core point
        p.classifiedAs = NOISE;
        return FALSE;
      else // all points in seeds are density-reachable from p
        foreach q ∈ seeds do
          q.classifiedAs = ClusterID;
        endforeach
  • 19. Function: ExpandCluster (continued)
        seeds = seeds \ {p};
        while seeds ≠ ∅ do
          currentP = seeds.first();
          result = NEps(currentP);
          if result.size ≥ MinPts then
            foreach resultP ∈ result with resultP.classifiedAs ∈ {UNCLASSIFIED, NOISE} do
              if resultP.classifiedAs == UNCLASSIFIED then
                seeds = seeds ∪ {resultP};
              endif
              resultP.classifiedAs = ClusterID;
            endforeach
          endif
          seeds = seeds \ {currentP};
        endwhile
        return TRUE;
      endif
      Source: A. Naprienko: Dichtebasierte Verfahren der Clusteranalyse raumbezogener Daten am Beispiel von DBSCAN und Fuzzy-DBSCAN. Universität der Bundeswehr München, student’s project, WT2011.
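The pseudocode on slides 17 to 19 can be turned into a small runnable Python sketch. It follows the same control flow but makes several assumptions that are not in the slides: points are coordinate tuples addressed by index, dist is Euclidean, the neighborhood query is a linear scan, and labels are integers with NOISE = 0 and cluster IDs starting at 1. All function names are hypothetical.

```python
import math

UNCLASSIFIED, NOISE = -1, 0   # assumed label encoding; cluster IDs start at 1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def expand_cluster(points, labels, i, cluster_id, eps, min_pts):
    """Grow a cluster from points[i]; return False if it is not a core point."""
    seeds = region_query(points, i, eps)
    if len(seeds) < min_pts:
        labels[i] = NOISE              # may later be relabeled as a border point
        return False
    for j in seeds:                    # all seeds are density-reachable from i
        labels[j] = cluster_id
    seeds = [j for j in seeds if j != i]
    while seeds:
        current = seeds.pop(0)
        result = region_query(points, current, eps)
        if len(result) >= min_pts:     # current is a core point: keep expanding
            for j in result:
                if labels[j] == UNCLASSIFIED:
                    seeds.append(j)
                    labels[j] = cluster_id
                elif labels[j] == NOISE:
                    labels[j] = cluster_id   # former noise joins as border point
    return True

def dbscan(points, eps, min_pts):
    """Return one label per point: 0 for noise, 1, 2, ... for clusters."""
    labels = [UNCLASSIFIED] * len(points)
    cluster_id = NOISE + 1
    for i in range(len(points)):
        if labels[i] == UNCLASSIFIED:
            if expand_cluster(points, labels, i, cluster_id, eps, min_pts):
                cluster_id += 1
    return labels

# Hypothetical data: two square groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0),
          (10.0, 0.0)]
labels = dbscan(points, eps=1.0, min_pts=3)
print(labels)
```

With the linear scan each neighborhood query costs O(n), so the sketch runs in O(n²) overall; it is meant to mirror the pseudocode, not to be efficient on large databases.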
  • 20. Density Based Clustering ‒ The Parameters Eps and MinPts ‒
  • 21. Determining the parameters Eps and MinPts The parameters Eps and MinPts can be determined by a heuristic. Observation • For points in a cluster, their k-th nearest neighbors are at roughly the same distance. • Noise points have their k-th nearest neighbor at a farther distance. ⇒ Plot the sorted distance of every point to its k-th nearest neighbor.
  • 22. Determining the parameters Eps and MinPts Procedure • Define a function k-dist from the database to the real numbers, mapping each point to the distance from its k-th nearest neighbor. • Sort the points of the database in descending order of their k-dist values. [Figure: sorted k-dist graph; vertical axis: k-dist, horizontal axis: points of the database]
  • 23. Determining the parameters Eps and MinPts Procedure • Choose an arbitrary point p, set Eps = k-dist(p) and set MinPts = k. • All points with an equal or smaller k-dist value will be cluster points; points with a larger k-dist value are treated as noise. [Figure: sorted k-dist graph with noise to the left of p and cluster points to the right]
  • 24. Determining the parameters Eps and MinPts Idea: Use the point density of the least dense cluster in the data set as parameters
  • 25. Determining the parameters Eps and MinPts • Find the threshold point p with the maximal k-dist value in the “thinnest cluster” of D. • Set parameters Eps = k-dist(p) and MinPts = k. [Figure: sorted k-dist graph with the threshold Eps separating noise from cluster 1 and cluster 2]
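The sorted k-dist values used by this heuristic can be computed with a brute-force sketch. It assumes Euclidean distance and that the k-th nearest neighbor is counted without the point itself; the function names and the toy data are hypothetical.

```python
import math

def k_dist(points, k):
    """Distance from each point to its k-th nearest neighbor.

    Assumption: the point itself is not counted as its own neighbor.
    """
    values = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        values.append(dists[k - 1])
    return values

def sorted_k_dist(points, k):
    """k-dist values in descending order, as plotted in the k-dist graph."""
    return sorted(k_dist(points, k), reverse=True)

# Hypothetical data: two square groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0),
          (10.0, 0.0)]
curve = sorted_k_dist(points, k=2)
print(curve)
```

In this toy data the isolated point has by far the largest 2-dist value while all grouped points share the same small value, so the drop in the sorted curve marks a candidate for Eps. On real data the threshold is usually read off the plotted curve by eye.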
  • 26. Density Based Clustering ‒ Applications ‒
  • 27. Automatic border detection in dermoscopy images Sample images showing assessments of the dermatologist (red), automated frameworks DBSCAN (blue) and FCM (green). Kockara et al. BMC Bioinformatics 2010 11(Suppl 6):S26 doi:10.1186/1471-2105-11-S6-S26
  • 28. Literature • M. Ester, H.P. Kriegel, J. Sander, X. Xu: A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD96). • A. Naprienko: Dichtebasierte Verfahren der Clusteranalyse raumbezogener Daten am Beispiel von DBSCAN und Fuzzy-DBSCAN. Universität der Bundeswehr München, student’s project, WT2011. • J. Sander, M. Ester, H.P. Kriegel, X. Xu: Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications. Data Mining and Knowledge Discovery, Springer, Berlin, 2 (2): 169–194.
  • 29. Literature • J.N. Dharwa, A.R. Patel: A Data Mining with Hybrid Approach Based Transaction Risk Score Generation Model (TRSGM) for Fraud Detection of Online Financial Transaction. International Journal of Computer Applications, Vol 16, No. 1, 2011.
  • 30. Thank you very much!