Pattern Matching With Acceleration Data: Pramod Vemulapalli
Outline
50% Tutorial and 50% Research Results
Basics
Literature Survey
Acceleration Data
Preliminary Results
Conclusions
What Is a Time-Series Subsequence?
[Figure: a time series plotted in two panels, samples 0 to 5000 on the horizontal axis, amplitude from about -80 to 40]
What Is Time-Series Subsequence Matching?
Given a query signal, find the most appropriate match in a database
[Figure: a query signal and a database time series, samples 0 to 5000 on the horizontal axis, amplitude from about -80 to 40]
Applications for TSSM
Data Analytics
Scientific Data
Financial Data
Audio Data (Shazam on iPhone)
SETI Data
There is a lot of time-series data in this universe and in similar parallel universes
Every time you ask questions such as these:
When was the last time I saw data like this?
Is there any other data like this?
Is this pattern a rarity or something that occurs frequently?
Brute Force
Sliding Window Method
Extract a signal template
Compare with the query signal
Store the distance metric (Euclidean)
[Figure: a window sliding along a time series (samples 0 to 5000, amplitude from about -80 to 40) and being compared with the query]
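The sliding window steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the slides; the function name `sliding_window_match` and the synthetic random series are assumptions made for the example.

```python
import numpy as np

def sliding_window_match(series, query):
    """Brute-force search: Euclidean distance from the query to every window."""
    series = np.asarray(series, dtype=float)
    query = np.asarray(query, dtype=float)
    m = len(query)
    n_windows = len(series) - m + 1
    if n_windows < 1:
        raise ValueError("query is longer than the series")
    distances = np.empty(n_windows)
    for start in range(n_windows):
        window = series[start:start + m]                    # extract a signal template
        distances[start] = np.linalg.norm(window - query)   # store the distance
    return int(np.argmin(distances)), distances

# Synthetic data (assumption): the query is cut directly out of the series.
rng = np.random.default_rng(0)
series = rng.normal(size=500)
query = series[120:170].copy()
best_start, distances = sliding_window_match(series, query)
```

The cost is one distance computation per window position, which is what makes the method expensive for long series and large query sets.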
History: Faloutsos 1994
Preprocessing
Fourier Transform of the signal gives a feature vector: 10.0 9.5 60
Database Matching
Compare the feature vector against the feature vectors stored in the database:
11.3 9.0 6.0
10.0 9.5 60
12.3 10.0 11.0
[Figure: signal plotted over samples 0 to 5000]
Post Processing
Take the candidate matches from the process above and check each one against the Euclidean distance criterion on the entire signal
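The preprocess, match, and post-process pipeline can be sketched as a filter-and-refine search. This is a simplified illustration under stated assumptions: it scans the windows linearly where the 1994 paper uses a spatial index, and the names `dft_features` and `filter_and_refine`, the choice `k=3`, and the threshold `eps` are hypothetical. With an orthonormal DFT, Parseval's theorem guarantees that the distance on the first few coefficients never exceeds the full Euclidean distance, so the filter step cannot discard a true match.

```python
import numpy as np

def dft_features(window, k=3):
    """First k orthonormal DFT coefficients as a real vector (real, then imaginary parts)."""
    coeffs = np.fft.fft(window, norm="ortho")[:k]
    return np.concatenate([coeffs.real, coeffs.imag])

def filter_and_refine(series, query, eps, k=3):
    """Return (start, distance) for every window within eps of the query."""
    series = np.asarray(series, dtype=float)
    query = np.asarray(query, dtype=float)
    m = len(query)
    q_feat = dft_features(query, k)
    matches = []
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        # Filter: the feature distance is a lower bound on the true distance.
        if np.linalg.norm(dft_features(window, k) - q_feat) > eps:
            continue
        # Post-processing: check the Euclidean criterion on the entire window.
        d = np.linalg.norm(window - query)
        if d <= eps:
            matches.append((start, d))
    return matches

# Synthetic data (assumption): the query is a slightly noisy copy of one window.
rng = np.random.default_rng(1)
series = rng.normal(size=1000)
query = series[200:264] + 0.01 * rng.normal(size=64)
matches = filter_and_refine(series, query, eps=1.0)
```

In an indexed version, the feature vectors of the database windows would be computed once and stored, so only the query needs transforming at search time.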
Subsequent Work
A number of subsequent papers followed this model
Discrete Fourier Transform 1994 (1)
Singular Value Decomposition 1994 (1)
Discrete Cosine Transform 1997 (2)
Discrete Wavelet Transform 1999 (3)
Piecewise Aggregate Approximation 2001 (5)
Locally Adaptive Piecewise Approximation 2001 (4)
1) C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In SIGMOD Conference, 1994.
2) F. Korn, H. V. Jagadish, and C. Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In SIGMOD Conference, 1997.
3) K.-P. Chan and A. W.-C. Fu. Efficient Time Series Matching by Wavelets. In ICDE, 1999.
4) E. J. Keogh, K. Chakrabarti, S. Mehrotra, and M. J. Pazzani. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In SIGMOD Conference, 2001.
5) E. J. Keogh, K. Chakrabarti, M. J. Pazzani, and S. Mehrotra. Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. Knowl. Inf. Syst., 3(3), 2001.
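As one concrete instance of the reductions listed above, Piecewise Aggregate Approximation replaces each equal-length segment of a signal by its mean. The sketch below is a minimal version that assumes the signal length is divisible by the number of segments; the function names `paa` and `paa_distance` are chosen for this example. The scaled distance between two reduced signals is a lower bound on the Euclidean distance between the originals, which is the property that lets it be used as a filter.

```python
import numpy as np

def paa(x, n_segments):
    """Mean of each of n_segments equal-length segments of x."""
    x = np.asarray(x, dtype=float)
    if len(x) % n_segments != 0:
        raise ValueError("length must be divisible by n_segments in this sketch")
    return x.reshape(n_segments, -1).mean(axis=1)

def paa_distance(a, b, original_length):
    """Lower bound on the Euclidean distance between the two original signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    segment_length = original_length / len(a)
    return np.sqrt(segment_length) * np.linalg.norm(a - b)

# Synthetic data (assumption): two random signals of length 128, reduced to 8 values each.
rng = np.random.default_rng(2)
x = rng.normal(size=128)
y = rng.normal(size=128)
reduced_distance = paa_distance(paa(x, 8), paa(y, 8), original_length=128)
true_distance = np.linalg.norm(x - y)
```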
Drawbacks: Euclidean Distance Metric
Not robust to temporal distortion
Not robust to outliers
Example:
[Figure: illustration with points labeled 1 to 6]
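Both drawbacks can be reproduced on a synthetic sine wave. This is an illustrative sketch, not data from the talk; the signal, the half-period delay, and the size of the outlier are assumptions. A delayed copy of the same pattern ends up farther from the reference than a flat line does, and a single corrupted sample has the same effect.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 200, endpoint=False)
reference = np.sin(2 * np.pi * 5 * t)

# Temporal distortion: the same 5 Hz pattern delayed by half a period.
shifted = np.sin(2 * np.pi * 5 * (t - 0.1))

# No pattern at all: a flat line.
flat = np.zeros_like(reference)

# Outlier: the reference with one corrupted sample.
spiked = reference.copy()
spiked[50] += 20.0

d_shifted = np.linalg.norm(reference - shifted)
d_flat = np.linalg.norm(reference - flat)
d_spiked = np.linalg.norm(reference - spiked)
```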
Same Road = Same Acceleration?
Acceleration Data
Route: constant
Driver behavior: variable
Traffic conditions: variable
Which time-series subsequence matching technique to use?
Local pattern matching: robust to amplitude and temporal distortion
Very memory intensive, especially for large query sets
Avoid sliding window
Very computationally intensive
Isolate information content
Isolate Information Content?
Take a wavelet transform
Obtain dyadic frequency bands
Better frequency resolution at lower frequencies
Better time resolution at higher frequencies
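The dyadic band structure can be seen in a small Haar wavelet decomposition. The Haar wavelet is used here only because it fits in a few lines; the slides do not say which wavelet was used, and the function name `haar_dwt` is an assumption. Each level halves the remaining band: the first detail band keeps the most coefficients (fine time resolution, wide high-frequency band), and each later level keeps half as many (coarser time resolution, narrower low-frequency band).

```python
import numpy as np

def haar_dwt(x, levels):
    """Return (approximation, [detail_1, ..., detail_levels]) for the Haar wavelet."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("length must be divisible by 2**levels")
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # upper half of the current band
        approx = (even + odd) / np.sqrt(2.0)         # lower half, split again next level
    return approx, details

# Synthetic data (assumption): a 64-sample sine wave, decomposed into 3 dyadic bands.
x = np.sin(2 * np.pi * np.arange(64) / 16.0)
approx, details = haar_dwt(x, levels=3)
band_sizes = [len(d) for d in details]
```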
Avoid Sliding Window?
Take a wavelet transform
Take wavelet maxima
Maxima can be used to completely reconstruct the signal
Maxima are a stable and unique representation of a signal
Avoid sliding window by just trying to match the wavelet maxima from signals
1) S. Mallat. A Wavelet Tour of Signal Processing. New York: Academic, 1999.
2) S. Mallat and S. Zhong. Characterization of Signals from Multiscale Edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992.
3) C. J. Kicey and C. J. Lennard. Unique Reconstruction of Band-Limited Signals by a Mallat-Zhong Wavelet Transform. Journal of Fourier Analysis and Applications, Birkhäuser Boston, 1997.
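A rough sketch of extracting maxima at dyadic scales follows. It swaps in a derivative-of-Gaussian kernel for the quadratic spline wavelet of Mallat and Zhong, so it illustrates the idea and is not their transform; the function names, the scales, and the `min_rel_height` threshold are assumptions. At each scale the signal is convolved with the kernel, and the local maxima of the absolute response are kept.

```python
import numpy as np

def gaussian_derivative_kernel(scale):
    """Derivative-of-Gaussian kernel (stand-in wavelet), normalized to unit absolute sum."""
    half = int(4 * scale)
    t = np.arange(-half, half + 1, dtype=float)
    kernel = -t * np.exp(-t ** 2 / (2.0 * scale ** 2))
    return kernel / np.abs(kernel).sum()

def wavelet_maxima(x, scales=(2, 4, 8), min_rel_height=0.1):
    """Map each scale to the indices where the absolute response has a local maximum."""
    x = np.asarray(x, dtype=float)
    maxima = {}
    for s in scales:
        kernel = gaussian_derivative_kernel(s)
        half = len(kernel) // 2
        padded = np.pad(x, half, mode="edge")       # avoid artificial edges at the ends
        mag = np.abs(np.convolve(padded, kernel, mode="valid"))
        is_peak = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
        peaks = np.flatnonzero(is_peak) + 1
        # Drop numerically insignificant peaks in flat regions.
        maxima[s] = peaks[mag[peaks] >= min_rel_height * mag.max()]
    return maxima

# Synthetic data (assumption): a single step edge at sample 100.
step = np.zeros(200)
step[100:] = 1.0
step_maxima = wavelet_maxima(step)
```

For a single step, every scale reports one maximum at the edge, which is the sense in which maxima give a sparse description that can be matched without sliding a window.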
Compare Wavelet Maxima?
Create a feature vector that encodes the relative distances of the maxima
Common vision technique
Encode the distance by incorporating the necessary invariance
More invariance =>
More robust to noise
Less unique for matching
Increase uniqueness by encoding many points
Less robust to outliers
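One hypothetical way to encode relative distances with a built-in invariance is to describe each maximum by the gaps to the maxima that follow it, divided by the first gap. This is not the feature used in the talk, whose exact definition the slides do not give; the function name `relative_gap_features` and the parameter `n_points` are assumptions. The ratios are unchanged by a time shift or a uniform stretch of the signal, and raising `n_points` makes each descriptor more distinctive at the cost of being corrupted by any one spurious or missing maximum.

```python
import numpy as np

def relative_gap_features(positions, n_points=4):
    """One descriptor per maximum: later gaps divided by the first gap."""
    positions = np.sort(np.asarray(positions, dtype=float))
    features = []
    for i in range(len(positions) - n_points + 1):
        gaps = np.diff(positions[i:i + n_points])   # distances between consecutive maxima
        features.append(gaps[1:] / gaps[0])         # ratios cancel shift and uniform stretch
    return np.array(features)

# Synthetic data (assumption): maxima positions, then the same positions stretched and shifted.
positions = np.array([10.0, 20.0, 40.0, 45.0, 70.0])
stretched = 3.0 * positions + 7.0
features = relative_gap_features(positions)
features_stretched = relative_gap_features(stretched)
```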
Multi Scale Extrema Features
Matching Process
Feature vectors:
1.2 2.3 3.5 2.0
1.0 -5 -2 1.2
3.6 2.5 3.3 3.6
[Figure: signals plotted over samples 0 to 5000, amplitude from about -80 to 40]
Preliminary Test: Find the most appropriate feature for acceleration data
Collect data in convoy formation
Use data from one of the vehicles to create the database
Data from the other vehicles is used as query data
[Figure: accuracy (%) versus query signal length (seconds, 0 to 2000); curves: Multi Scale Extrema Features, Euclidean]
Results
[Figure: accuracy (%) versus query signal length (seconds, 0 to 2000); curves: Multi Scale Extrema Features, Euclidean]
Results
[Figure: accuracy (%) versus query signal length (seconds, 0 to 2000); curves: Amp Bias, Euclidean]
Results
[Figure: accuracy (%) versus query signal length (seconds, 0 to 2000); curves: Amp Bias, Euclidean]
Conclusions & Future Work
Multiscale Extrema Features work better with non-convoy data
The Euclidean distance measure works well with convoy data for short query lengths