Clustering and Analysis in Data Mining

What is Clustering?
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering.

Why Clustering?
Clustering methods used in data mining are typically expected to meet the following requirements:
  • Scalability
  • Ability to deal with different types of attributes
  • Discovery of clusters with arbitrary shape
  • Minimal requirements for domain knowledge to determine input parameters
  • Ability to deal with noisy data
  • Incremental clustering and insensitivity to the order of input records
  • High dimensionality
  • Constraint-based clustering
  • Interpretability and usability

Data types in Cluster Analysis
  • Data matrix (or object-by-variable structure)
  • Interval-scaled variables
  • Binary variables
  • Categorical variables
  • Discrete ordinal variables
  • Ratio-scaled variables
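
The slide lists the variable types without saying how dissimilarity is measured for each. As a minimal sketch, assuming common textbook choices that the slide itself does not state (mean-absolute-deviation standardization and Euclidean distance for interval-scaled variables, simple matching for symmetric binary variables, and a Jaccard-style coefficient for asymmetric binary variables), the computations might look like this. All function names are illustrative.

```python
from math import sqrt

def standardize(values):
    """Standardize an interval-scaled variable using the mean absolute deviation."""
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(v - mean) for v in values) / n  # assumed non-zero
    return [(v - mean) / mad for v in values]

def euclidean(a, b):
    """Euclidean distance between two objects described by interval-scaled variables."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def simple_matching(a, b):
    """Symmetric binary variables: fraction of variables on which the objects differ."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)

def jaccard_dissimilarity(a, b):
    """Asymmetric binary variables: 0/0 matches are ignored as uninformative."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    both_one = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    denom = mismatches + both_one
    return mismatches / denom if denom else 0.0
```

Standardizing first keeps a variable measured in large units from dominating the distance; whether that is desirable depends on the application.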

Methods used in clustering:
  • Partitioning methods
  • Hierarchical methods
  • Density-based methods
  • Grid-based methods
  • Model-based methods

Hierarchical methods in clustering
There are two types of hierarchical clustering methods:
  • Agglomerative hierarchical clustering
  • Divisive hierarchical clustering

Agglomerative hierarchical clustering
This bottom-up strategy starts by placing each object in its own cluster and then merges these atomic clusters into larger and larger clusters, until all of the objects are in a single cluster or until certain termination conditions are satisfied.
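
The bottom-up merging can be sketched in a few lines. This is an illustrative version only: it assumes single linkage (the distance between two clusters is the distance between their closest members) and uses a target cluster count `k` as the termination condition, neither of which the slide specifies.

```python
from math import dist  # Python 3.8+

def agglomerative(points, k):
    """Merge the two closest clusters repeatedly until only k clusters remain."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: closest pair of members across the two clusters.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return clusters

result = agglomerative([(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)], k=3)
```

The exhaustive pair search is easy to follow but slow on large data sets; practical implementations cache the inter-cluster distances.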

Divisive hierarchical clustering
This top-down strategy does the reverse of agglomerative hierarchical clustering by starting with all objects in one cluster. It subdivides the cluster into smaller and smaller pieces, until each object forms a cluster on its own or until certain termination conditions are satisfied, such as reaching a desired number of clusters or the diameter of each cluster falling within a certain threshold.
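
A rough sketch of the top-down strategy with both termination conditions mentioned above follows. The splitting rule is an assumption made for illustration (take the two farthest-apart points of the widest cluster as seeds and assign every point to the nearer seed); it is not the rule of any particular published algorithm, and `max_clusters` and `max_diameter` are hypothetical parameter names.

```python
from itertools import combinations
from math import dist  # Python 3.8+

def diameter(cluster):
    """Largest pairwise distance inside a cluster (0.0 for a single object)."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)

def divisive(points, max_clusters, max_diameter):
    """Split the widest cluster until a count or diameter condition is met."""
    clusters = [list(points)]  # start with all objects in one cluster
    while len(clusters) < max_clusters:
        widest = max(clusters, key=diameter)
        if diameter(widest) <= max_diameter:
            break  # every cluster is already within the diameter threshold
        # Assumed split rule: the two farthest points act as seeds.
        s1, s2 = max(combinations(widest, 2), key=lambda pair: dist(*pair))
        left = [p for p in widest if dist(p, s1) <= dist(p, s2)]
        right = [p for p in widest if dist(p, s1) > dist(p, s2)]
        clusters.remove(widest)
        clusters += [left, right]
    return clusters

result = divisive([(0, 0), (0, 1), (10, 0), (10, 1)], max_clusters=4, max_diameter=1.5)
```

Here the diameter threshold stops the process after one split, before the cluster-count limit is reached.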

Density-Based methods in clustering
  • DBSCAN: A Density-Based Clustering Method Based on Connected Regions with Sufficiently High Density
  • OPTICS: Ordering Points to Identify the Clustering Structure
  • DENCLUE: Clustering Based on Density Distribution Functions
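
To make "connected regions with sufficiently high density" concrete, here is a compact sketch of the DBSCAN idea. It assumes the usual two parameters, a neighborhood radius `eps` and a minimum neighbor count `min_pts` (names chosen here for illustration), and uses a brute-force neighbor search rather than a spatial index.

```python
from math import dist  # Python 3.8+

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep growing
    return labels

labels = dbscan(
    [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)],
    eps=1.5,
    min_pts=3,
)
```

Because clusters grow through chains of dense neighborhoods, the method can follow arbitrary shapes and leaves isolated points labeled as noise.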

Grid-Based methods in clustering
STING: Statistical Information Grid
STING is a grid-based multiresolution clustering technique in which the spatial area is divided into rectangular cells.
WaveCluster: Clustering Using Wavelet Transformation
WaveCluster is a multiresolution clustering algorithm that first summarizes the data by imposing a multidimensional grid structure onto the data space. It then uses a wavelet transformation to transform the original feature space, finding dense regions in the transformed space.
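
The common first step of grid-based methods, summarizing points by the rectangular cell they fall into, can be sketched as below. This is a single-resolution illustration only: it assumes 2-D points and square cells of side `cell_size`, stores just a count and a mean per cell, and does not attempt STING's multiple resolution levels or WaveCluster's wavelet transform.

```python
from collections import defaultdict

def grid_summary(points, cell_size):
    """Map each (column, row) cell to the count and mean of the points inside it."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    summary = {}
    for key, members in cells.items():
        n = len(members)
        summary[key] = {
            "count": n,
            "mean": (sum(p[0] for p in members) / n, sum(p[1] for p in members) / n),
        }
    return summary

def dense_cells(summary, min_count):
    """Cells holding at least min_count points, in sorted order."""
    return sorted(key for key, stats in summary.items() if stats["count"] >= min_count)

summary = grid_summary(
    [(0.5, 0.5), (0.7, 0.2), (0.1, 0.9), (3.2, 3.8), (7.5, 0.1)], cell_size=1.0
)
dense = dense_cells(summary, min_count=2)
```

Once the summary exists, later steps work on cells rather than individual points, so their cost depends on the number of cells and not on the number of objects.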

Model-Based Clustering Methods
  • Expectation-Maximization
  • Conceptual Clustering
  • Neural Network Approach

Methods of Clustering High-Dimensional Data
CLIQUE: A Dimension-Growth Subspace Clustering Method
CLIQUE (CLustering In QUEst) was the first algorithm proposed for dimension-growth subspace clustering in high-dimensional space.
PROCLUS: A Dimension-Reduction Subspace Clustering Method
PROCLUS (PROjected CLUStering) is a typical dimension-reduction subspace clustering method. That is, instead of starting from single-dimensional spaces, it starts by finding an initial approximation of the clusters in the high-dimensional attribute space. Each dimension is then assigned a weight for each cluster, and the updated weights are used in the next iteration to regenerate the clusters.
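
The dimension-growth idea can be illustrated with one growth step: find dense intervals in each single dimension, then build two-dimensional candidate units only from pairs of dense one-dimensional units. The sketch below is a simplification under stated assumptions, not CLIQUE itself: every dimension shares the range `[lo, hi]`, is cut into `n_intervals` equal intervals, and a unit is called dense when it holds at least `min_count` points. All names are illustrative.

```python
from collections import Counter
from itertools import combinations

def dense_units(points, n_intervals, lo, hi, min_count):
    """Return the dense 1-D units and the dense 2-D units grown from them."""
    width = (hi - lo) / n_intervals

    def bin_of(v):
        # Interval index of value v; the upper bound falls into the last interval.
        return min(int((v - lo) / width), n_intervals - 1)

    dims = range(len(points[0]))
    # 1-D units are (dimension, interval) pairs.
    counts1 = Counter((d, bin_of(p[d])) for p in points for d in dims)
    dense1 = {unit for unit, c in counts1.items() if c >= min_count}

    # 2-D candidates: pairs of dense 1-D units taken from different dimensions.
    candidates = {
        (a, b) for a, b in combinations(sorted(dense1), 2) if a[0] != b[0]
    }
    counts2 = Counter()
    for p in points:
        for a, b in candidates:
            if bin_of(p[a[0]]) == a[1] and bin_of(p[b[0]]) == b[1]:
                counts2[(a, b)] += 1
    dense2 = {unit for unit, c in counts2.items() if c >= min_count}
    return dense1, dense2

pts = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 9), (8, 9), (1, 9)]
dense1, dense2 = dense_units(pts, n_intervals=2, lo=0, hi=10, min_count=3)
```

Restricting candidates to combinations of dense lower-dimensional units is what keeps the search tractable: a unit cannot be dense if one of its projections is not.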

Constraint-Based Cluster Analysis
Constraint-based clustering finds clusters that satisfy user-specified preferences or constraints. A few categories of constraints are:
  • Constraints on individual objects
  • Constraints on the selection of clustering parameters
  • Constraints on distance or similarity functions
  • User-specified constraints on the properties of individual clusters
  • Semi-supervised clustering based on “partial” supervision

Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace.
The tutorials section is free, self-guiding and will not involve any additional support.
Visit us at www.dataminingtools.net
