The document provides an introduction to trajectory clustering, defining trajectories as the paths followed by moving objects over time and discussing various clustering methods such as model-based, distance-based, and density-based approaches. It highlights the applications of trajectory clustering in areas like environmental data tracking, flock movement, and urban commuting patterns. The conclusion emphasizes the complexity and potential of trajectory data for knowledge extraction and its relevance in real-world applications.

Uploaded by

Oladayo Siyanbola

Introduction to Trajectory Clustering
By Yongli Zhang
Outline
1. Problem Definition

2. Clustering Methods for Trajectory data

3. Model-based Trajectory Clustering

4. Applications

5. Conclusions
1 Problem Definition
Trajectories:
• A trajectory is the path that a moving object follows through space as a function of time (Wikipedia).
• The sequence of spatial locations visited by the object, together with the timestamps of those visits, forms a trajectory. In other words, the whole history of a moving object is stored and available for analysis.
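
A minimal sketch of this definition as a data structure, assuming 2-D positions and numeric timestamps. The names `Sample`, `make_trajectory` and `position_at` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sample:
    t: float  # timestamp of the visit
    x: float  # spatial location
    y: float

def make_trajectory(samples):
    """A trajectory is the list of samples ordered by time."""
    return sorted(samples, key=lambda s: s.t)

def position_at(traj, t):
    """Position at time t, linearly interpolated between recorded samples."""
    if not traj[0].t <= t <= traj[-1].t:
        raise ValueError("t lies outside the recorded history")
    for a, b in zip(traj, traj[1:]):
        if a.t <= t <= b.t:
            if b.t == a.t:
                return (a.x, a.y)
            r = (t - a.t) / (b.t - a.t)
            return (a.x + r * (b.x - a.x), a.y + r * (b.y - a.y))
    return (traj[-1].x, traj[-1].y)  # trajectory with a single sample

traj = make_trajectory([Sample(2, 2, 0), Sample(0, 0, 0), Sample(1, 1, 1)])
```

Linear interpolation is only one possible way to read positions between timestamps; the slides do not prescribe one.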
• In dynamical systems, a trajectory is the set of points in state space that are the future states resulting from a given initial state.
• In a discrete dynamical system, a trajectory is a set of isolated points in state space.
• In a continuous dynamical system, a trajectory is a curve in state space.
• In discrete mathematics, a trajectory is a sequence of values calculated by the iterated application of a mapping f to an element x of its source.
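
The discrete-mathematics definition can be sketched in a few lines. The function name `orbit` and the halving map are illustrative assumptions, not part of the slides.

```python
def orbit(f, x, steps):
    """Trajectory of x under f: x, f(x), f(f(x)), ... for the given number of steps."""
    values = [x]
    for _ in range(steps):
        values.append(f(values[-1]))
    return values

halving = orbit(lambda v: v / 2, 8, 3)
```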

Trajectory clustering: Trajectories describe the movement behavior of objects, so clustering can be used to detect groups of objects that behaved in a similar way, for example:
• by following similar paths (possibly in different time periods),
• by moving consistently together (i.e., keeping close to each other for long time intervals),
• by sharing other properties of movement.

2 Clustering Methods for trajectory data

2.1 (descriptive and generative) Model-based clustering

Objective:
• Derive a global model capable of describing the whole dataset.
• Each cluster is modeled as a prototype function with some variability around the prototype, which yields a descriptive and interpretable model for each cluster.

2.2 Distance-based clustering

Step 1: Transform the complex data into feature vectors, i.e. multidimensional vectors in which each dimension represents a single characteristic of the object.

Step 2: Apply a generic clustering algorithm, such as K-means, to cluster the vectors.

Problem: Most methods require all vectors to be of the same length.
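
A minimal sketch of the two steps, assuming one-dimensional trajectories stored as `(time, value)` pairs. Resampling every trajectory to `m` evenly spaced points is one simple workaround for the equal-length requirement (an assumption, not something the slides prescribe), and the K-means below uses a naive initialization from the first `k` vectors.

```python
def resample(traj, m):
    """Step 1: turn a trajectory [(t, value), ...] (sorted by t, at least two
    samples) into a feature vector of m linearly interpolated values."""
    t0, t1 = traj[0][0], traj[-1][0]
    out, i = [], 0
    for j in range(m):
        t = t0 + (t1 - t0) * j / (m - 1)
        while i < len(traj) - 2 and traj[i + 1][0] < t:
            i += 1
        (ta, va), (tb, vb) = traj[i], traj[i + 1]
        out.append(va + (vb - va) * (t - ta) / (tb - ta))
    return out

def kmeans(vectors, k, iters=20):
    """Step 2: plain Lloyd's K-means; returns one cluster label per vector."""
    centers = [list(v) for v in vectors[:k]]  # naive init: first k vectors
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Four trajectories of different lengths: two rising, two falling.
trajs = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 10), (4, 6)],
    [(0, 0.2), (0.5, 0.6), (1, 1.1), (3, 3.0)],
    [(0, 9), (1, 8), (2, 7), (3, 6), (5, 5)],
]
labels = kmeans([resample(t, 4) for t in trajs], k=2)
```

Note that resampling discards the original time scale, which is one reason the later sections prefer a model that works directly on the raw measurements.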

2.3 Density-based clustering and the DBSCAN family

Objective: Use a density threshold around each object to distinguish the interesting data items from the noise.

DBSCAN:

Step 1: Visit the whole dataset and tag each object as a core object, a border object, or noise. (Noise means objects that are definitely outside any cluster.)

Step 2: Core objects that are close to each other are joined into a cluster.

The density threshold is defined by two parameters: a maximum radius ε around each object, and a minimum number of objects within that radius, MinPts.
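
The two steps can be sketched on plain 2-D points as follows. This is a brute-force illustration (every neighborhood query scans the whole dataset), and it assumes that an object counts as its own neighbor when comparing against MinPts.

```python
import math

def dbscan(points, eps, min_pts):
    """Return (labels, kinds): labels[i] is a cluster id or -1 for noise,
    kinds[i] is 'core', 'border' or 'noise'."""
    n = len(points)
    # Neighborhood query: all objects within radius eps (the object itself included).
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:  # join core objects that are close to each other
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:  # border objects join but do not expand
                        stack.append(q)
        cluster += 1
    kinds = ["core" if is_core[i] else ("border" if labels[i] != -1 else "noise")
             for i in range(n)]
    return labels, kinds

points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1),
          (10, 10), (10, 11), (11, 10), (50, 50)]
labels, kinds = dbscan(points, eps=1.5, min_pts=3)
```

For trajectories, `math.dist` would be replaced by a trajectory distance function, which is the choice discussed next.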

• Density-based methods rely strongly on an efficient implementation of the neighborhood query.
• How should the distance function be chosen? One option is temporal focusing: cluster the trajectories using all possible time intervals (time windows), evaluate the results, and keep the best clustering. For example, two trajectories may be very different if the whole time interval is considered, yet very similar if only a small sub-interval is considered.
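
A small illustration of why the time window matters, assuming trajectories stored as dictionaries from timestamp to `(x, y)`. The function `window_distance` is a hypothetical helper, and the mean Euclidean distance is just one possible choice of distance.

```python
import math

def window_distance(a, b, t_start, t_end):
    """Mean Euclidean distance between two trajectories over the timestamps
    they share inside the window [t_start, t_end]."""
    ts = [t for t in a if t in b and t_start <= t <= t_end]
    if not ts:
        return math.inf
    return sum(math.dist(a[t], b[t]) for t in ts) / len(ts)

# Two objects travel together until t = 4, then one of them drifts away.
a = {t: (t, 0.0) for t in range(10)}
b = {t: (t, 0.0 if t < 5 else 3.0 * (t - 4)) for t in range(10)}

whole = window_distance(a, b, 0, 9)    # distance over the whole interval
focused = window_distance(a, b, 0, 4)  # distance over a sub-interval
```

Over the whole interval the two trajectories look different, while on the sub-interval they coincide.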

2.4 Visual-aided approaches

Why introduce visualization techniques?

Automatic methods may discover behavioral patterns that are interesting with respect to the optimization function, yet these patterns may be trivial or wrong from the point of view of the domain expert who studies the phenomenon.

The visual analytics field tries to overcome this issue.

Advantages:

The analyst or domain expert can control the computational process by setting different input parameters, interpret the results, and direct the algorithm towards the solution that better describes the underlying phenomena.

More specifically, the analyst or domain expert can apply different distance functions (working with spatial, temporal, numerical or categorical variables) to the spatio-temporal data and so gain an understanding of the underlying data in a stepwise manner.

2.5 Micro clustering methods

• The trajectories are represented as piece-wise segments, possibly with missing intervals.
• The proposed method tries to determine a close time interval, i.e. a maximal time interval in which all the trajectories are pair-wise close to each other.
• The similarity of trajectories is based on the amount of time during which the trajectories are close.
• The mining problem is to find all the trajectory groups that are close within a given threshold.
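
The notion of a close time interval can be sketched as below, under simplifying assumptions: trajectories are dictionaries from timestamp to `(x, y)`, only timestamps present in every trajectory are compared, and `close_intervals` is a hypothetical name. It is not the method from the literature, only an illustration of the definition.

```python
import math
from itertools import combinations

def close_intervals(trajs, threshold):
    """Maximal runs (start, end) of consecutive shared timestamps at which
    all trajectories are pair-wise within the distance threshold."""
    shared = sorted(set.intersection(*(set(t) for t in trajs)))
    runs, start, prev = [], None, None
    for ts in shared:
        ok = all(math.dist(a[ts], b[ts]) <= threshold
                 for a, b in combinations(trajs, 2))
        if ok and start is None:
            start = ts
        if not ok and start is not None:
            runs.append((start, prev))
            start = None
        prev = ts
    if start is not None:
        runs.append((start, prev))
    return runs

a = {t: (t, 0.0) for t in range(8)}
b = {t: (t, 0.5 if 2 <= t <= 5 else 5.0) for t in range(8)}
c = {t: (t, 1.0 if 1 <= t <= 4 else 9.0) for t in range(8)}
runs = close_intervals([a, b, c], threshold=1.2)
```

The total length of the returned runs gives the amount of time in which the group is close, which is the similarity the slide describes.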
3 Model-based Trajectory Clustering

Problem with standard clustering algorithms (e.g. K-means):

They treat the trajectories yj as a set of n-dimensional vectors in an n-dimensional space and then apply any clustering method that operates in vector spaces.

This is not applicable when:

• the trajectories have different lengths,
• the trajectories are measured at different time points,
• y is multidimensional with no natural vector representation.

Algorithm: Mixtures of Regression Model

Standard mixture model clustering

The generative model is a linear combination of component models:

\[ f(y_j \mid \theta) = \sum_{k=1}^{K} w_k \, f_k(y_j \mid \theta_k) \]

K: the number of clusters; the paper (reference 2) assumes it is fixed

wk: the probability that an individual is assigned to cluster k

fk(yj | θk): given that an individual belongs to cluster k, the density function that generates the observed data yj of individual j

If we observe the yj and assume a particular functional form for the components fk, the problem becomes how to estimate the parameters wk and θk.

The EM (expectation-maximization) algorithm is used in this paper.

Assume x and y are one-dimensional, which gives the standard regression relationship:

\[ y = g_k(x) + e \]

e: Gaussian noise with mean 0 and standard deviation σk

gk(x): a deterministic function of x

fk(y | x, θk): given that y belongs to the kth group, the Gaussian density of y with mean gk(x) and standard deviation σk

y: vertical position of the hand

x: time (t)

For simplicity of notation, the noise level σk is assumed to be constant (it does not depend on x).
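
A short generative sketch of this relationship. The function names and the example line `g` are assumptions for illustration only.

```python
import math
import random

def component_density(y, x, g, sigma):
    """f_k(y | x): Gaussian density with mean g(x) and standard deviation sigma."""
    z = (y - g(x)) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def simulate_trajectory(g, sigma, xs, seed=0):
    """Draw y = g(x) + e at each time x, with e ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    return [g(x) + rng.gauss(0.0, sigma) for x in xs]

g = lambda x: 2 * x + 1          # hypothetical deterministic function g_k
xs = [0, 1, 2, 3]
ys = simulate_trajectory(g, sigma=0.5, xs=xs)
```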

Define the probability of a complete trajectory, given a particular model k (treating the measurements as conditionally independent given the model):

\[ P(y_j \mid x_j, \theta_k) = \prod_{i=1}^{n_j} f_k\bigl(y_j(i) \mid x_j(i), \theta_k\bigr) \tag{1} \]

yj: the trajectory of measurements for the jth individual

yj(i): the ith measurement of yj

xj: the times at which the measurements yj were taken

nj: the number of measurements in yj

θk: the set of parameters for component/cluster k

Define the cluster model for trajectories:

In practice, when clustering, we do not know which component generated a given trajectory, so the conditional density of the observed data is a mixture density:

\[ P(y_j \mid x_j, \theta) = \sum_{k=1}^{K} w_k \, f_k(y_j \mid x_j, \theta_k) \tag{2} \]

fk(yj | xj, θk): the mixture components, i.e. the probability of the complete trajectory under model k from (1)

wk: the mixing weights

θk: the set of parameters for component/cluster k

yj: the jth trajectory

k: the cluster index

Combining (1) and (2) gives the full joint density of all M trajectories, where nj is the number of measurements in trajectory j:

\[ P(Y \mid X, \theta) = \prod_{j=1}^{M} \sum_{k=1}^{K} w_k \prod_{i=1}^{n_j} f_k\bigl(y_j(i) \mid x_j(i), \theta_k\bigr) \tag{3} \]

The log-likelihood of the parameters θ given the data set S of M trajectories follows directly from Eq. (3), where nj is the number of measurements in trajectory j:

\[ L(\theta \mid S) = \sum_{j=1}^{M} \log \sum_{k=1}^{K} w_k \prod_{i=1}^{n_j} f_k\bigl(y_j(i) \mid x_j(i), \theta_k\bigr) \tag{4} \]
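
A minimal sketch of how this log-likelihood could be evaluated, assuming Gaussian components described by a pair `(g, sigma)`. The products over measurements are computed as sums of logs, and the sum over clusters uses the log-sum-exp trick to avoid underflow on long trajectories. All names are illustrative.

```python
import math

def log_gauss(y, mean, sigma):
    """Log of the Gaussian density with the given mean and standard deviation."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

def mixture_log_likelihood(trajs, comps, weights):
    """trajs: list of (xs, ys); comps: list of (g, sigma); weights: positive
    mixing weights that sum to 1."""
    total = 0.0
    for xs, ys in trajs:
        # log( w_k * product over i of f_k(y(i) | x(i)) ) for every cluster k
        terms = [math.log(w) + sum(log_gauss(y, g(x), s) for x, y in zip(xs, ys))
                 for (g, s), w in zip(comps, weights)]
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total

line = lambda x: 2 * x + 1
trajs = [([0, 1, 2], [1, 3, 5])]
ll_single = mixture_log_likelihood(trajs, [(line, 1.0)], [1.0])
ll_split = mixture_log_likelihood(trajs, [(line, 1.0), (line, 1.0)], [0.5, 0.5])
```

Splitting one component into two identical halves leaves the likelihood unchanged, which is a quick sanity check on the mixture sum.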

Next, the EM algorithm is used to pull the mixture components out of the joint density.

Hidden data problem: the group membership of each trajectory is unknown.

EM algorithm, a sketch:

estimate the hidden data -> work out the answer -> re-estimate the hidden data using the answers just computed -> repeat until some stabilization occurs

The EM framework gives a consistent way to estimate the hidden data so that the likelihood is guaranteed never to decrease.
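
The loop above can be sketched for the simplest case, straight-line components y = a0 + a1 x + e. This is an illustrative sketch, not the paper's implementation: the initial guesses are supplied by hand, a fixed number of iterations replaces a convergence test, a small floor on sigma is added for numerical safety, and it assumes no cluster ever becomes empty.

```python
import math

def log_gauss(y, mean, sigma):
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

def e_step(trajs, comps, weights):
    """Estimate the hidden data: h[j][k] = probability that trajectory j
    belongs to cluster k under the current parameters."""
    h = []
    for xs, ys in trajs:
        logs = [math.log(w) + sum(log_gauss(y, a0 + a1 * x, s) for x, y in zip(xs, ys))
                for (a0, a1, s), w in zip(comps, weights)]
        m = max(logs)
        p = [math.exp(v - m) for v in logs]
        h.append([v / sum(p) for v in p])
    return h

def m_step(trajs, h, k, min_sigma=1e-3):
    """Work out the answer: weighted least-squares line for cluster k, where
    every point of trajectory j carries weight h[j][k]."""
    sw = sx = sy = sxx = sxy = 0.0
    for (xs, ys), hj in zip(trajs, h):
        for x, y in zip(xs, ys):
            w = hj[k]
            sw += w; sx += w * x; sy += w * y; sxx += w * x * x; sxy += w * x * y
    a1 = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a0 = (sy - a1 * sx) / sw
    sse = sum(hj[k] * (y - a0 - a1 * x) ** 2
              for (xs, ys), hj in zip(trajs, h) for x, y in zip(xs, ys))
    sigma = max(math.sqrt(sse / sw), min_sigma)
    return (a0, a1, sigma), sum(hj[k] for hj in h) / len(trajs)

def em(trajs, comps, iters=50):
    weights = [1.0 / len(comps)] * len(comps)
    for _ in range(iters):
        h = e_step(trajs, comps, weights)
        updated = [m_step(trajs, h, k) for k in range(len(comps))]
        comps = [c for c, _ in updated]
        weights = [w for _, w in updated]
    h = e_step(trajs, comps, weights)
    labels = [max(range(len(comps)), key=lambda k: hj[k]) for hj in h]
    return comps, weights, labels

def noisy(line, xs):
    """Hypothetical test data: the line plus an alternating -1/+1 disturbance."""
    return (xs, [line(x) + (1 if i % 2 else -1) for i, x in enumerate(xs)])

line_a = lambda x: 120 + x
line_b = lambda x: 250 - 0.75 * x
trajs = [noisy(line_a, [0, 10, 20, 30]), noisy(line_b, [5, 15, 25, 35, 45]),
         noisy(line_a, [2, 4, 6]), noisy(line_b, [0, 20, 40, 60]),
         noisy(line_a, [1, 11, 21, 31, 41, 51])]
comps, weights, labels = em(trajs, [(100.0, 0.0, 20.0), (200.0, 0.0, 20.0)])
```

The example trajectories have different lengths and different measurement times, the situation in which vector-based clustering does not apply directly.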

Experiment:

The data set is sampled from 3 underlying polynomials (3 clusters):
y = 120 + x;  y = 10 + 2x + 0.1x^2;  y = 250 - 0.75x

The component model is extended from a straight line, y = a0 + a1 x + e, to a quadratic, y = a0 + a1 x + a2 x^2 + e.

Wikipedia's polynomial regression article illustrates the quadratic model with yield y as a function of temperature x: when the temperature is increased from x to x + 1 units, the expected yield changes by a1 + a2(2x + 1) (for an infinitesimal change in x, the rate of change is a1 + 2 a2 x). The fact that the change in yield depends on x is what makes the relationship nonlinear. This must not be confused with nonlinear regression: the model is linear in the parameters a0, a1 and a2, so it is still a case of linear regression.
Source: https://en.wikipedia.org/wiki/Polynomial_regression
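
To illustrate why the quadratic model is still linear regression, the sketch below fits a0, a1, a2 by ordinary least squares on the columns 1, x, x^2, solving the 3x3 normal equations by Gaussian elimination. The function name and the noise-free sample are assumptions for illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a0 + a1*x + a2*x^2; returns [a0, a1, a2]."""
    # The design-matrix columns are 1, x, x^2: the model is linear in the parameters.
    rows = [[1.0, x, x * x] for x in xs]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    for c in range(3):  # Gaussian elimination with partial pivoting
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, 3):
            f = A[r][c] / A[c][c]
            for j in range(c, 3):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, 3))) / A[i][i]
    return coef

curve = lambda x: 10 + 2 * x + 0.1 * x * x  # second polynomial of the experiment
xs = list(range(11))
a0, a1, a2 = fit_quadratic(xs, [curve(x) for x in xs])
step_change = curve(4) - curve(3)           # change when x goes from 3 to 4
```

With these coefficients, the change from x = 3 to x = 4 agrees with a1 + a2(2x + 1) evaluated at x = 3.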

Summary:

- Based on a principled method for probabilistic modeling of a set of trajectories as individual sequences of points generated from a finite mixture model consisting of regression model components.
- Unsupervised learning is carried out using maximum likelihood principles.
- The EM algorithm handles the hidden data problem (i.e. cluster membership).
4 Applications

4.1 Environmental Data

e.g. cyclone detection

• We can take the tracks of cyclones as trajectories.
• Apply trajectory clustering to locate them and track where they go.

4.2 Flocks and convoys

In some applications, there is a need to discover groups of objects that move together during a given period of time.

Examples: migrating animals, flocks of birds, convoys of vehicles, or marches (detecting military activity).

4.3 Movement data

A specific example:

The tracks of commuters are trajectories.

How to trace them? With GPS-based devices.

Trajectory clustering can reveal whether there are groups of commuters within a city that move from one area of the city to another within a particular time frame.

This kind of information and analysis can give meaningful hints to city planners on how to avoid regular traffic jams.
5 Conclusions

• Trajectories represent the most complex and promising (from a knowledge extraction viewpoint) form of data among those based on point-wise information.
• Clustering is one of the general approaches to descriptive modeling of large amounts of data, allowing the analyst to focus on a higher-level representation of the data.
• Trajectory clustering has many meaningful real-world applications.
References

1. Kisilevich, Slava, et al. "Spatio-temporal clustering." Springer US, 2009.

2. Gaffney, Scott, and Padhraic Smyth. "Trajectory clustering with mixtures of regression models." Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1999.
