l8 Signal Extraction
[Figure: time series tasks: visualization, novelty detection, and querying a time series database (queries, matches, and the best match for Q11).]
Kordik, CTU Prague, FIT, MI-PDD 6
Time series similarity
Similarity at the level of shape
[Figure: series C and Q.]
$$D(Q, C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2}$$
Early Abandon
[Figure: series C and Q plotted over 0 to 100; the calculation is abandoned at the marked point.]
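Early abandoning can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name and the `best_so_far` parameter are assumptions, and the sketch assumes the goal is a nearest-neighbor search in which a candidate can be dropped as soon as its partial sum of squared differences exceeds the best distance found so far.

```python
import math

def euclidean_early_abandon(q, c, best_so_far=math.inf):
    """Return D(Q, C), or math.inf if the calculation was abandoned.

    best_so_far is the smallest distance seen so far (hypothetical
    parameter name); with the default, nothing is ever abandoned.
    """
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    limit = best_so_far ** 2
    total = 0.0
    for qi, ci in zip(q, c):
        total += (qi - ci) ** 2
        if total > limit:
            # The partial sum already exceeds the best distance:
            # the calculation is abandoned at this point.
            return math.inf
    return math.sqrt(total)
```

Comparing squared sums against the squared limit avoids taking a square root inside the loop.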
Offset translation: subtract the mean of each series before computing the distance.
Q = Q - mean(Q)
C = C - mean(C)
D(Q,C)
Amplitude scaling: subtract the mean and divide by the standard deviation of each series before computing the distance.
Q = (Q - mean(Q)) / std(Q)
C = (C - mean(C)) / std(C)
D(Q,C)
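The two normalizations can be written as small helpers. This is a sketch under assumptions: the function names are made up for illustration, and the population standard deviation (`statistics.pstdev`) is used because the slides do not say which estimator `std` refers to.

```python
import statistics

def remove_offset(series):
    """Q = Q - mean(Q): shift the series so that its mean is zero."""
    mean = statistics.fmean(series)
    return [x - mean for x in series]

def z_normalize(series):
    """Q = (Q - mean(Q)) / std(Q): zero mean and unit standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)  # assumption: population std
    if std == 0:
        raise ValueError("a constant series cannot be amplitude-scaled")
    return [(x - mean) / std for x in series]
```

After `z_normalize`, two series that differ only in offset and amplitude become identical, so D(Q,C) measures shape alone.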
Noise: smooth each series before computing the distance.
Q = smooth(Q)
C = smooth(C)
D(Q,C)
[Figure: diagram with elements labeled C1 to C4 (or W1 to W4), W5, and W(1,4).]
Basic idea
Consider X = x1, x2, ..., xn and Y = y1, y2, ..., yn
We are allowed to extend each sequence by repeating elements
The Euclidean distance is then calculated between the extended sequences X' and Y'
Matrix M, where m_ij = d(x_i, y_j)
[Figure: matrix M with X and Y on its axes; the warping path stays between the lines j = i + w and j = i - w.]
Restrictions on Warping Paths
Monotonicity
Path should not go down or to the left
Continuity
No elements may be skipped in a sequence
Warping Window
|i - j| <= w
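The restrictions above map directly onto a dynamic-programming table. The following is a minimal sketch, assuming a squared difference as the local cost d(x_i, y_j) and a square root at the end so that the result matches the Euclidean distance when no warping is allowed; the function name is hypothetical.

```python
import math

def dtw_distance(x, y, w):
    """Dynamic time warping distance with a warping window |i - j| <= w."""
    n, m = len(x), len(y)
    w = max(w, abs(n - m))  # the window must at least reach the last cell
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Warping window: only cells with |i - j| <= w are filled.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Monotonicity and continuity: a step comes from the left,
            # from below, or from the diagonal, never skipping a cell.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])
```

With `w = 0` the path is forced onto the diagonal and the result equals the Euclidean distance; larger windows allow elements to be repeated.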
x1, x2, ..., xi
y1, y2, ..., yj
Gap skipped
X = 3, 2, 5, 7, 4, 8, 10, 7
Y = 2, 5, 4, 7, 3, 10, 8, 6
LCS = 2, 5, 7, 10
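The example can be checked with the textbook dynamic program for the longest common subsequence. This sketch uses exact equality of values, as in the example; matching real-valued series within a tolerance would be an extension that is not shown here. The function name is an assumption.

```python
def longest_common_subsequence(x, y):
    """Return one longest common subsequence of x and y."""
    n, m = len(x), len(y)
    # length[i][j] = LCS length of the prefixes x[:i] and y[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Walk back through the table to recover one optimal subsequence.
    result = []
    i, j = n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            result.append(x[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]

X = [3, 2, 5, 7, 4, 8, 10, 7]
Y = [2, 5, 4, 7, 3, 10, 8, 6]
lcs = longest_common_subsequence(X, Y)
```

For these X and Y the longest common subsequence has length 4. Several subsequences of that length exist (2, 5, 7, 10 is one of them), so the one returned depends on how ties are broken during the walk back.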
[Figure: series C (0 to 120) and its approximations: DFT, PLA, Haar, APCA.]
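[Figure: Euclidean distance between series C and Q, plotted over 0 to 120.]
PAA distance between the reduced series, each with w segments:
$$DR(\bar{Q}, \bar{C}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2}$$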
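A small sketch of the PAA distance, assuming that the series length n is divisible by the number of segments w and that each PAA coefficient is the mean of its segment. The function names are hypothetical.

```python
import math

def paa(series, w):
    """Piecewise aggregate approximation: the mean of each of w segments."""
    n = len(series)
    if n % w != 0:
        raise ValueError("this sketch assumes n is divisible by w")
    size = n // w
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(w)]

def paa_distance(q, c, w):
    """DR = sqrt(n / w) * sqrt(sum of squared differences of PAA means)."""
    n = len(q)
    q_bar, c_bar = paa(q, w), paa(c, w)
    return math.sqrt(n / w) * math.sqrt(
        sum((a - b) ** 2 for a, b in zip(q_bar, c_bar)))

q = [1, 2, 3, 4, 5, 6, 7, 8]
c = [2, 2, 2, 2, 6, 6, 6, 6]
reduced = paa_distance(q, c, 2)
exact = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))
```

Here `reduced` is about 1.41 and `exact` is about 3.46, so the reduced distance stays below the Euclidean one for this pair.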
[Figure: PAA approximations of C and Q, plotted over 0 to 120.]
The PAA distance lower-bounds the Euclidean distance.
C = baabccbc
$$MINDIST(\hat{Q}, \hat{C}) = \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} \left(\mathrm{dist}(\hat{q}_i, \hat{c}_i)\right)^2}$$
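MINDIST on symbolic words can be sketched as below. Several things here are assumptions rather than content of the slides: the breakpoints are taken as equiprobable quantiles of the standard normal distribution, dist() between two symbols is zero for equal or adjacent symbols and otherwise the gap between the breakpoints separating them, and the query word and the original length n = 128 are made up for the example.

```python
import math
from statistics import NormalDist

def breakpoints(alphabet_size):
    """Equiprobable breakpoints of N(0, 1) for the given alphabet size."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]

def symbol_dist(a, b, beta):
    """dist() between two symbols, given the breakpoint list beta."""
    r, c = ord(a) - ord("a"), ord(b) - ord("a")
    if abs(r - c) <= 1:
        return 0.0  # equal or adjacent symbols contribute nothing
    return beta[max(r, c) - 1] - beta[min(r, c)]

def mindist(q_word, c_word, n, alphabet_size):
    """MINDIST between two words of equal length w for series of length n."""
    w = len(q_word)
    beta = breakpoints(alphabet_size)
    total = sum(symbol_dist(a, b, beta) ** 2 for a, b in zip(q_word, c_word))
    return math.sqrt(n / w) * math.sqrt(total)

c_word = "baabccbc"   # word from the slide
q_word = "babcacca"   # hypothetical query word
d = mindist(q_word, c_word, n=128, alphabet_size=3)
```

Only positions where the two symbols are at least two letters apart add to the distance.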
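Lower bounding: the distance computed on the reduced representation never exceeds the true distance.
$$D_{LB}(Q, S) \le D(Q, S)$$
$$D(Q, S) = \sqrt{\sum_{i=1}^{n} (q_i - s_i)^2}$$
$$D_{LB}(Q, S) = \sqrt{\sum_{i=1}^{M} (sr_i - sr_{i-1})(qv_i - sv_i)^2}$$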
$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$$
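The autocorrelation coefficient at lag k can be computed directly from its definition. A minimal sketch with a hypothetical function name; list indices start at 0, so the sum over t = k+1, ..., n becomes a loop over indices k, ..., n-1.

```python
def autocorrelation(y, k):
    """Autocorrelation coefficient r_k of the series y at lag k."""
    n = len(y)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(y)")
    mean = sum(y) / n
    denominator = sum((v - mean) ** 2 for v in y)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return numerator / denominator
```

For y = 1, 2, 3, 4, 5 the deviations from the mean are -2, -1, 0, 1, 2, so r_1 = 4 / 10 = 0.4, and r_0 is always 1.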
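Autoregressive model of order p:
$$Y_t = \sum_{r=1}^{p} \phi_r Y_{t-r} + w_t$$
Moving average model of order q:
$$Y_t = \mu + w_t + \sum_{r=1}^{q} \theta_r w_{t-r}$$
The moving average model should not be confused with the moving average smoothing procedures.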
[Figure: units per month (0 to 400), Apr-01 to Aug-13, with "today" marked; label: 1/n.]
$$F_t = a D_t + (1 - a) F_{t-1}$$
Thus, the new forecast is a weighted sum of the old forecast and the actual demand.
Notes:
Only 2 values (D_t and F_{t-1}) are required, compared with n for the moving average
Parameter a is determined empirically (whatever works best)
Rule of thumb: a < 0.5
Typically, a = 0.2 or a = 0.3 works well
a = 0.2
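The recurrence F_t = a D_t + (1 - a) F_{t-1} needs only the previous forecast, as this short sketch shows. The function name and the choice of the first demand value as the default initial forecast are assumptions; the slides do not say how the first forecast is seeded.

```python
def exponential_smoothing(demand, a, initial_forecast=None):
    """Return the forecasts F_1, ..., F_n for the demand values D_1, ..., D_n."""
    if not 0 < a <= 1:
        raise ValueError("a must lie in (0, 1]")
    # Assumption: seed F_0 with the first demand value unless one is given.
    forecast = demand[0] if initial_forecast is None else initial_forecast
    forecasts = []
    for d in demand:
        # New forecast = weighted sum of actual demand and old forecast.
        forecast = a * d + (1 - a) * forecast
        forecasts.append(forecast)
    return forecasts

forecasts = exponential_smoothing([100, 200], a=0.2, initial_forecast=100)
```

With a = 0.2 and an initial forecast of 100, the demands 100 and 200 give forecasts of 100 and 120: the jump in demand moves the forecast by only a fifth of the gap.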
Vincent Stanford
Complex Systems Test Bed project
August 31, 2007
Simple nonlinear systems
can exhibit chaotic behavior
[Figure: series segments (each plotted over 0 to 120) labeled with the symbols U, C, and D; symbolic string UUCUCUCD.]
[Figure: representations compared: DFT, DWT, SVD, APCA, PAA, PLA, SYM.]
Spectrum and Shape Descriptors
[Figure: descriptors (Centroid, Rolloff, Flux, Bandwidth, Moments, ...) combined into a feature vector F in the feature space (labeled M).]
How does it work?
[Figure: basis functions plotted as value over time 0, 1, ..., n-1: freq. f=0; freq. f=1, sin(t * 2π/n); freq. f=2; cosine, f=1; sine, f=1.]
$$X_f = \frac{1}{\sqrt{n}} \sum_{t=0}^{n-1} x_t \exp(-j 2\pi t f / n), \quad j = \sqrt{-1}$$
inverse DFT
$$x_t = \frac{1}{\sqrt{n}} \sum_{f=0}^{n-1} X_f \exp(+j 2\pi t f / n)$$
$$A_f^2 = \mathrm{Re}^2(X_f) + \mathrm{Im}^2(X_f)$$
Amplitude
Intuition: strength of frequency f
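The transform and its amplitude spectrum can be tried out with a direct O(n^2) implementation. This sketch assumes the symmetric 1/sqrt(n) scaling in both directions and a negative exponent in the forward transform; the function names are hypothetical, and a real application would use an FFT library instead.

```python
import cmath
import math

def dft(x):
    """Forward DFT with 1/sqrt(n) scaling (assumed convention)."""
    n = len(x)
    scale = 1 / math.sqrt(n)
    return [scale * sum(x[t] * cmath.exp(-2j * math.pi * t * f / n)
                        for t in range(n))
            for f in range(n)]

def inverse_dft(coeffs):
    """Inverse DFT: sums over the frequencies f, with a positive exponent."""
    n = len(coeffs)
    scale = 1 / math.sqrt(n)
    return [scale * sum(coeffs[f] * cmath.exp(2j * math.pi * t * f / n)
                        for f in range(n))
            for t in range(n)]

def amplitude(coeffs):
    """A_f = sqrt(Re(X_f)^2 + Im(X_f)^2), the strength of frequency f."""
    return [abs(v) for v in coeffs]

# One full cosine cycle over n = 8 samples.
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = amplitude(dft(signal))
```

For this signal all the amplitude sits at f = 1 and its mirror f = 7; every other entry of `spectrum` is zero up to rounding error.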
[Figure: left, count over time; right, amplitude A_f over frequency f, with freq: 12 marked.]
DFT: Amplitude spectrum
Excellent approximation, with only 2 frequencies!
[Figure: value over time.]
Step 1: The wavelet is placed at the beginning of the signal, and the scale is set to s = 1 (the most compressed wavelet);
Step 2: The wavelet function at scale 1 is multiplied by the signal and integrated over all times;
Step 3: Shift the wavelet to t = τ, and get the transform value at t = τ and s = 1;
Step 4: Repeat the procedure until the wavelet reaches the end of the signal;
Step 5: Scale s is increased by a sufficiently small value, and the above procedure is repeated for all s;
Step 6: Each computation for a given s fills a single row of the time-scale plane;
Step 7: The CWT is obtained once all s have been calculated.
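The seven steps can be mirrored almost line by line in a naive implementation. This is only a sketch under assumptions: the Mexican hat wavelet (unnormalized) stands in for whatever wavelet is used, the integral is replaced by a sum over samples, the wavelet is shifted one sample at a time, and each value is scaled by 1/sqrt(s). All names are hypothetical.

```python
import math

def mexican_hat(t):
    """Unnormalized Mexican hat wavelet (an assumed choice of wavelet)."""
    return (1 - t * t) * math.exp(-t * t / 2)

def cwt(signal, scales, wavelet=mexican_hat):
    """Return the time-scale plane: one row per scale, one column per shift."""
    n = len(signal)
    plane = []
    for s in scales:                      # Step 5: go through all scales
        row = []
        for tau in range(n):              # Steps 1, 3, 4: shift along the signal
            # Step 2: multiply the wavelet by the signal and sum over time.
            value = sum(signal[t] * wavelet((t - tau) / s)
                        for t in range(n)) / math.sqrt(s)
            row.append(value)
        plane.append(row)                 # Step 6: one row per scale
    return plane                          # Step 7: all scales calculated

# A single spike in the middle of an otherwise flat signal.
spike = [0.0] * 11
spike[5] = 1.0
plane = cwt(spike, scales=[1, 2])
```

Each row of `plane` responds most strongly at the shift where the wavelet sits on the spike, and the response becomes wider as the scale grows.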
[Figure: Haar wavelet decomposition of x0, x1, ..., x7 over time t: level-1 coefficients d1,0, s1,0, d1,1, s1,1, ...; level-2 coefficients s2,0, d2,0; etc. Each pair of values is combined with "+" and "-".]
Q: baritone/silence/soprano - DWT?
[Figure: value over time.]
Q: baritone/soprano - DWT?
[Figure: value over time.]
Observation 1:
"+" can be some weighted addition
"-" is the corresponding weighted difference
(Quadrature mirror filters)
Observation 2: unlike DFT/DCT, there are *many* wavelet bases: Haar, Daubechies-4, Daubechies-6, ...
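For the Haar basis, the weighted addition and weighted difference can be sketched as follows. The weight 1/sqrt(2), which makes the transform orthonormal, the sign convention of the difference (first value minus second), and the function names are assumptions; the input length is assumed to be a power of two.

```python
import math

def haar_step(values):
    """One level: weighted sums (smooth) and weighted differences (detail)."""
    w = 1 / math.sqrt(2)
    smooth = [w * (values[i] + values[i + 1]) for i in range(0, len(values), 2)]
    detail = [w * (values[i] - values[i + 1]) for i in range(0, len(values), 2)]
    return smooth, detail

def haar_dwt(x):
    """Return (final smooth coefficient, detail lists from fine to coarse)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length")
    smooth = list(x)
    details = []
    while len(smooth) > 1:
        smooth, detail = haar_step(smooth)
        details.append(detail)
    return smooth[0], details

top, details = haar_dwt([4, 6, 10, 12, 8, 6, 5, 5])
```

For eight input values this produces four, then two, then one detail coefficient, plus a single overall smooth coefficient. With the 1/sqrt(2) weights the sum of squares of all coefficients equals the sum of squares of the input.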