CLIQUE Algorithm Grid-Based Subspace Clustering

CLIQUE is a grid-based subspace clustering algorithm that identifies clusters in subspaces of high-dimensional data. It discretizes the data space with a grid and identifies dense units, that is, grid cells whose share of the data points exceeds a density threshold. It then determines clusters as sets of connected dense units within a subspace, searching the subspaces bottom-up with an Apriori-style approach, and provides a minimal description of each cluster. CLIQUE automatically discovers the relevant subspaces and scales well with dimensionality, though the quality of the result depends on the grid parameters. It is insensitive to the order of the input records and does not presume a canonical data distribution.

Uploaded by

cia rhaine

CLIQUE: CLustering in QUEst

Agrawal et al, SIGMOD 1998

France ROSE, Jan 2017

Clustering approaches
● “Clustering aims at dividing datasets into subsets (clusters), where objects in the same subset
are similar to each other with respect to a given similarity measure, whereas objects in
different clusters are dissimilar.”
● Clustering can be used:
○ To better understand the data: data mining, pattern recognition, information retrieval, machine learning
○ As a first step for different purposes: indexing, data compression

Kriegel et al, 2009

Context and concepts
● Clustering techniques: partitional (single level) or hierarchical

● Distance based (k-means) or connectivity based (graph-based or grid-based)

● Special case of high-dimensional data:

○ Irrelevance of distances: pairwise distances become nearly uniform, so near and far neighbors are hard to tell apart;

○ Sparsity of the data;

○ Local feature relevance: different features or a different correlation of features may be relevant
for varying clusters

Agrawal et al., 1998; Kriegel et al., 2009
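
The loss of contrast between distances can be illustrated with a small experiment. This is only a sketch under stated assumptions: points are drawn uniformly at random from the unit hypercube, and the helper `relative_contrast` and its parameters are hypothetical names chosen for this illustration, not taken from the cited papers.

```python
import math
import random

def relative_contrast(dim, n=200, seed=0):
    """Return (farthest - nearest) / nearest distance from one random
    query point to n uniform random points in the unit hypercube."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n)
    ]
    return (max(dists) - min(dists)) / min(dists)

# The contrast typically shrinks as the dimensionality grows.
for dim in (2, 20, 200):
    print(dim, round(relative_contrast(dim), 2))
```

A low relative contrast means the nearest and the farthest point are almost equally far away, which is why full-space distance measures become less informative in high dimensions.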

[Figures: Data case 1 and Data case 2; images not reproduced]
CLIQUE: Grid-Based Subspace Clustering

● CLIQUE is a density-based and grid-based subspace clustering algorithm

○ Grid-based: It discretizes the data space through a grid and estimates the density by counting the number of
points in a grid cell
○ Density-based: A cluster is a maximal set of connected dense units in a subspace
■ A unit is dense if the fraction of all data points it contains (its selectivity) exceeds the density threshold τ, an input parameter
● Subspace clustering: A subspace cluster is a set of neighboring dense cells in an arbitrary
subspace. It also discovers some minimal descriptions of the clusters
● It automatically identifies subspaces of a high dimensional data space that allow better
clustering than original space using the Apriori principle
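
The grid and density definitions above can be sketched in a few lines. This is a minimal illustration, assuming every attribute is scaled to [0, 1] and split into `xi` equal-width intervals; the function names `unit_of` and `dense_units` and the sample data are hypothetical.

```python
from collections import Counter

def unit_of(point, dims, xi, lo=0.0, hi=1.0):
    """Map a point to its grid unit in the subspace spanned by `dims`."""
    width = (hi - lo) / xi
    # Clamp so that a value equal to `hi` falls into the last interval.
    return tuple(min(int((point[d] - lo) / width), xi - 1) for d in dims)

def dense_units(points, dims, xi, tau, lo=0.0, hi=1.0):
    """Return the units of subspace `dims` whose fraction of points exceeds tau."""
    counts = Counter(unit_of(p, dims, xi, lo, hi) for p in points)
    n = len(points)
    return {unit for unit, count in counts.items() if count / n > tau}

# Hypothetical data: four of six points share a narrow range in dimension 0
# but are spread out in dimension 1.
points = [(0.10, 0.90), (0.12, 0.25), (0.15, 0.50),
          (0.18, 0.70), (0.82, 0.85), (0.55, 0.10)]

print(dense_units(points, dims=(0,), xi=5, tau=0.5))    # {(0,)}
print(dense_units(points, dims=(0, 1), xi=5, tau=0.5))  # set()
```

In this toy data the only dense unit lives in the one-dimensional subspace of dimension 0, while the full two-dimensional space contains none, which is the situation subspace clustering is meant to detect.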
Bottom-up approach

Apriori principle: If a collection of points S is a cluster in a k-dimensional space, then S is also part
of a cluster in any (k-1)-dimensional projection of this space
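
Read in terms of grid units, the principle implies that a k-dimensional unit can be dense only if all of its (k-1)-dimensional projections are dense. The sketch below shows how this could be used to generate and prune candidates bottom-up. It is an illustration, not the authors' implementation: the representation of a unit as a tuple of (dimension, interval) pairs sorted by dimension and the name `candidates` are assumptions made here.

```python
from itertools import combinations

def candidates(dense_prev):
    """Join (k-1)-dimensional dense units into k-dimensional candidates.

    Each unit is a tuple of (dimension, interval) pairs sorted by dimension.
    """
    dense_prev = set(dense_prev)
    out = set()
    for a, b in combinations(sorted(dense_prev), 2):
        # Join two units that agree on all but their last pair and whose
        # last pairs lie in different dimensions.
        if a[:-1] == b[:-1] and a[-1][0] < b[-1][0]:
            cand = a + (b[-1],)
            # Prune: every (k-1)-dimensional projection must itself be dense.
            if all(cand[:i] + cand[i + 1:] in dense_prev
                   for i in range(len(cand))):
                out.add(cand)
    return out

dense_1d = {((0, 1),), ((1, 3),), ((2, 0),)}
print(sorted(candidates(dense_1d)))
```

The surviving candidates are only candidates: each one still has to be checked against the data to see whether it really is dense before the next level is generated.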
Major Steps of the CLIQUE Algorithm
● Identify subspaces that contain clusters
○ Partition the data space and find the number of points that lie inside each cell of the partition
○ Identify the subspaces that contain clusters using the Apriori principle
● Identify clusters
○ Determine dense units in all subspaces of interest
○ Determine connected dense units in all subspaces of interest
● Generate minimal descriptions for the clusters
○ Determine maximal regions that cover a cluster of connected dense units for each cluster
○ Determine minimal cover for each cluster
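
The cluster identification step can be sketched as a connected-components search over the dense units of a single subspace. This is a minimal sketch, assuming units are tuples of interval indices and that two units are connected when they share a common face (they differ by one interval in exactly one dimension) or are linked through a chain of such units; the name `clusters` is hypothetical, and the minimal description step is not covered.

```python
def clusters(dense):
    """Group the dense units of one subspace into connected components."""
    dense = set(dense)
    seen, result = set(), []
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            u = stack.pop()
            component.add(u)
            for d in range(len(u)):
                for step in (-1, 1):
                    # Neighbor that shares a face with u along dimension d.
                    v = u[:d] + (u[d] + step,) + u[d + 1:]
                    if v in dense and v not in seen:
                        seen.add(v)
                        stack.append(v)
        result.append(component)
    return result

dense = {(0, 0), (0, 1), (1, 1), (3, 3)}
print(clusters(dense))
```

Units that touch only at a corner, such as (1, 1) and (2, 2), are not treated as connected in this sketch.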
Comments on CLIQUE
● Strengths
○ Automatically finds subspaces of the highest dimensionality as long as high density clusters exist in
those subspaces
○ Insensitive to the order of records in input and does not presume some canonical data distribution
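○ Scales linearly with the size of input and has good scalability as the number of dimensions in the
data increases: O(c^k + mk), where m is the number of input points, k the highest dimensionality of any dense unit, and c a constant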
○ Simple method and interpretability of results
● Weaknesses
○ As in all grid-based clustering approaches, the quality of the results crucially depends on the
appropriate choice of the number and width of the partitions and grid cells
References
● R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic Subspace Clustering of High
Dimensional Data for Data Mining Applications. SIGMOD'98
● C. Aggarwal. An Introduction to Clustering Analysis. In Aggarwal and Reddy (eds.), Data
Clustering: Algorithms and Applications (Chapter 1). CRC Press, 2014
● H.-P. Kriegel, P. Kröger, and A. Zimek. Clustering high-dimensional data. ACM
Transactions on Knowledge Discovery from Data, 3(1), 1–58, 2009
● Jiawei Han's video on CLIQUE (extract of a Coursera/UIUC MOOC)
https://www.youtube.com/watch?v=QqkHPJxAXoE
● ELKI framework https://elki-project.github.io/
