Finding Periodicity in Space and Time

This technical report presents an algorithm for detecting, segmenting, and characterizing spatiotemporal periodicity using periodicity templates. The approach utilizes a spectral method that is computationally efficient and robust against noise, distinguishing itself from optical flow-based techniques. The algorithm is demonstrated through real-world examples, particularly focusing on periodic motion in image sequences, such as the movement of walking individuals.

M.I.T Media Laboratory Perceptual Computing Section Technical Report No. 435
Proceedings of the International Conference on Computer Vision,
Bombay, India, January 4-7, 1998

Finding Periodicity in Space and Time 


Fang Liu and Rosalind W. Picard
The Media Laboratory, E15-383
Massachusetts Institute of Technology, Cambridge, MA 02139
[email protected], [email protected]
This work was supported in part by IBM and NEC.

Abstract

An algorithm for simultaneous detection, segmentation, and characterization of spatiotemporal periodicity is presented. The use of periodicity templates is proposed to localize and characterize temporal activities. The templates not only indicate the presence and location of a periodic event, but also give an accurate quantitative periodicity measure. Hence, they can be used as a new means of periodicity representation. The proposed algorithm can also be considered as a "periodicity filter," a low-level model of periodicity perception. The algorithm is computationally simple, and is shown to be more robust than optical flow based techniques in the presence of noise. A variety of real-world examples are used to demonstrate the performance of the algorithm.

1 Introduction

Periodicity is common in the natural world. It is also a salient cue in human perception. Information regarding the nature of a periodic phenomenon, such as its location, strength, and frequency, is important for our understanding of the environment. Techniques for periodicity detection and characterization can assist in many applications requiring object and activity recognition and representation.

Although surface patterns may come to mind first, periodicity often involves both space and time, such as cyclic motion. The main body of work on periodic motion is model-based (e.g., [1][2]). More recently there is work on motion recognition directly using low-level features of motion information (e.g., [3][4][5]). However, to date, there has not been a method which uses low-level features to detect and systematically characterize periodicity in space and time. In this work, we attempt to tackle this problem by using periodicity templates to incorporate the location, strength, and other characteristic information of a periodic phenomenon. The templates are useful in applications such as periodic motion representation and action recognition. The template generating procedure provides a tool for detecting and segmenting regions of periodicity. The proposed method is spectral based, and is computationally efficient.

1.1 Our Approach

The approach presented here is motivated by theory for textured image modeling that assumes an underlying random field representation of the data [6]. In particular, 1-D signals along the temporal dimension are considered as stochastic processes. When assuming stationarity, a stochastic signal can be decomposed into deterministic (periodic) and indeterministic (random) components. This is known as Wold decomposition [7]. In the frequency domain, the deterministic and the indeterministic components correspond respectively to the singular and the continuous part of the signal's Fourier spectrum. In practical applications, this is to say that the repetitive structure in the signal contributes only to the spectral harmonic peaks and the random behavior to the smooth part of the spectrum. Therefore, the energy contained in the spectral harmonic peaks is a good measure of signal periodicity.

Applying the above analysis to the temporal dimension of an image sequence, the ratio between the harmonic energy and the total energy of the temporal signal is used here as a measure of the strength of signal periodicity. As a component of the periodicity template of the image sequence, this measure plays an important role in detecting and characterizing periodicity in space and time.

The approach described above assumes that signal periodicity is observable along lines parallel to the temporal (T) axis. In other words, the moving objects need to be tracked just like we fixate on a walking person. Typically, optical flow based techniques are used for object tracking. We present here a non-flow-based frame alignment procedure for tracking, and show that it is more robust to noise than flow based methods.

In this paper, examples of walking people are used to illustrate the technique. However, it should be stressed that the purpose of this work is not to detect and segment a moving object, but to detect and characterize in three-dimensional (3-D) data those regions that exhibit periodicity. We do not expect the algorithm to segment out the walking person. Instead, regions of legs and arms and the outline of the bouncing head and shoulder should be identified.

1.2 Related Work

The work of Polana and Nelson on periodic motion detection [4] is perhaps the most relevant to the approach presented in this paper. In their work, reference curves, which are lines parallel to the trajectory of the motion flow centroid, are extracted and their power spectra computed. The periodicity measure pf of each reference curve is defined as the normalized difference between the sum of the
spectral energy at the highest amplitude frequency and its multiples and the sum of the energy at the frequencies halfway between. Besides the value of the periodicity measure itself, there is no checking on the signal harmonicity along the curve, which is a weakness of the method. The periodicity measure for an entire sequence is the maximum of pf averaged among pixels whose highest power spectrum values appear on the same frequency. The final periodicity measure is used to distinguish periodic and non-periodic motion by thresholding.

In [3], flow based algorithms are used to transform the image sequence so that the object in consideration is stabilized at the center of the image frame. Then flow magnitudes in tessellated frame areas of periodic motion were used as feature vectors for motion classification. In this paper, we show that flow based methods are very sensitive to noise.

This work differs from the above in the following ways: 1) the harmonic relationship among spectral peaks is explicitly verified; 2) a more accurate measure of periodicity in the form of harmonic energy ratios is proposed; 3) multiple fundamentals can be extracted along a temporal line; 4) the values of fundamental frequencies are used in processing to help distinguish periodicity of different activities; 5) regions of periodicity are actually segmented; and 6) the proposed algorithm does not use optical flow, and is robust to noise.

2 Method

The algorithm for periodicity detection and segmentation consists of two stages: (1) object tracking by frame alignment; (2) simultaneous detection and segmentation of regions of periodicity. Object tracking is by itself a research area. Decoupling object tracking and periodicity detection conceptually modularizes the analysis and allows the use of other tracking algorithms.

Throughout this section, an image sequence Walker will be used to illustrate the technical points. More challenging examples are given in Section 3.

2.1 Frame Alignment

In this work, two types of image sequences are considered for frame alignment. In practice, a large number of image sequences can be categorized into one of these two types: (I) the area of interest, typically a moving object, is as a whole stationary to the camera, but the background can be moving; (II) little ego-motion is involved and each moving object as a whole is moving approximately frontoparallel to the camera along a straight line and at a constant speed.

Four frames of a sequence with a person walking across the image plane are shown in Figure 1. This is a typical type II sequence. Although there are no re-occurring scenes, we experience the notion of repetitiveness when viewing the sequence. This is due to our ability to fixate on the moving person, so that the person appears to be walking in place. The effect of fixating can be accomplished computationally by realigning the image frames. Obviously, frame alignment is not necessary for type I sequences, but in fact is a process of transforming type II sequences into type I.

Figure 1: Frames 20, 40, 60, and 80 of the 97 frame Walker sequence, with frame size 320 × 240.

In the following, the term data cube is used to refer to the 3-D (X: horizontal; Y: vertical; and T: temporal) data volume formed by stacking all the frames in a sequence, one in front of the other. The XT and YT slices of the data cube reveal the temporal behavior usually hidden from the viewer. Figure 2 shows the head and ankle level XT slices of the Walker sequence. In (a), the head leaves a non-periodic straight track while the walking ankles in (b) make a crisscross periodic pattern. As it is, the periodicity in (b) is difficult to characterize. It will be shown that frame alignment transforms data into a form in which periodicity can be easily detected and measured.

Figure 2: Head and ankle level XT slices of Walker sequence. (a) Head level. (b) Ankle level. As it is, the periodicity in (b) is difficult to characterize.

To align a sequence to a particular moving object, the trajectory of the object is first detected. A filtering method similar to the one in [8] is used here to avoid the noise sensitivity of the optical flow based methods (demonstrated in Section 3). Applying 1-D median filtering along the temporal dimension of the sequence (filter length 11 was used for Walker), the resulting sequence has mostly the background. The difference sequence between the original and the background contains mainly the moving objects. Since the object trajectories in consideration are approximately linear, the projections of the trajectories onto the XT and YT planes (averaged XT and YT images of the difference sequence) are straight lines. These lines can be detected via a Hough transform to give the X or Y positions of the moving objects in each frame. We call these position values alignment indices. The averaged XT image of the Walker difference sequence and the line found by the Hough transform method are shown in Figure 3. Each horizontal line represents a frame, and the diagonal white line marks the object X location in each frame. Note that multiple object trajectories can be detected simultaneously using this procedure, as will be shown in Section 3.1.

Using the alignment indices, image frames in a sequence
can be repositioned to center a moving object to any specified position in the XY plane. After alignment, the object should appear to be moving in place. This in effect is equivalent to fixating on an object when viewing a sequence in which the object's position changes frame by frame. The aligned sequences are passed to the second stage of the algorithm.

Figure 3: (a) Averaged XT image of the Walker sequence after background removal. (b) Line found in (a) by using a Hough transform method.

Figure 4: (a) Averaged XY image of aligned Walker difference sequence. The area of interest is clearly shown. (b) Aligned and cropped Walker sequence with splits near the center of the frames to show the inside of the data cube.

2.2 Finding Regions of Periodicity

In the second stage, 1-D Fourier transforms are performed along the temporal dimension of an aligned sequence. The spectral harmonic peaks are detected and used to compute the temporal signal harmonic energy. A periodicity template is generated by using the extracted fundamental frequencies and the ratios between the harmonic energy and the total energy at each frame pixel location. The original sequence is then masked for regions of periodicity.

To save computation and storage, an aligned sequence can be cropped to limit processing to the area of interest. The cropping does not affect the periodicity detection. The location and size of the cropping window can be estimated from the average XY image of the aligned difference sequence. Figure 4 shows such an XY image of the Walker sequence and the aligned and cropped original sequence with splits near the center of the frames to show the inside of the data cube.

Now consider an aligned and cropped data cube. Frame pixels with the same X and Y locations form straight lines in the cube. Call these lines the temporal lines. If the cropped frame size is Nx by Ny, then there are Nx × Ny temporal lines in the data cube. In the aligned sequence, the object of interest moves in place. If the object is moving cyclically in any manner, the periodicity will be reflected in some of the temporal lines. Figure 5 (a1) and (b1) show the head and the ankle level XT slices of 64 frames (Frame 17 to 80) of the data cube in Figure 4 (b). Each column in the images is a temporal line. These images are the aligned and cropped version of the two XT slices in Figure 2. Columns in Figure 5 (a2) and (b2) are the 1-D power spectra of the corresponding columns in (a1) and (b1), normalized among all temporal lines in the data cube. Figure 5 (c1) and (c2) show details along the white vertical lines in (b1) and (b2). While the head level slice in (a1) shows no harmonicity, the periodicity of the moving ankles in (b1) is reflected by the spectral harmonic peaks in (c2). We refer to the spectral energy corresponding to the harmonic peaks as the temporal harmonic energy and propose using the temporal harmonic energy ratio, which is the ratio between the harmonic energy and the total energy along a temporal line, as a measure of temporal periodicity at the corresponding frame pixel location.

Figure 5: Signals and their power spectra along temporal lines (columns in images). (a1) and (b1): head and ankle level XT slices of aligned and cropped Walker sequence. (a2) and (b2): each column is the 1-D power spectra of the corresponding column in (a1) and (b1). (c1) and (c2): details along the white vertical lines in (b1) and (b2). Periodicity in (b1) is reflected by the spectral harmonic peaks in (b2).
[Plot axes: (c1) Gray Scale versus Time (by frame number); (c2) Power Spectra versus Frequency.]

For spectral harmonic peak detection, we adapt the 2-D peak detection algorithm in [6] for 1-D signals. The signal along a temporal line is first zero-meaned and Gaussian tapered, and then its power spectrum is computed via a fast Fourier transform. To locate the harmonic peaks, local maxima of the power spectrum are found using a size 7 neighborhood and excluding values below 10% of the entire spectral range. A local maximum marks the location of a spectral harmonic peak when its frequency is either a fundamental or a harmonic. A fundamental is defined as a frequency that can be used to linearly express the frequencies of some other local maxima. A harmonic is a frequency that can be expressed as a linear combination of some fundamentals. Starting from the lowest frequency to the highest, each local maximum is checked first for its harmonicity (whether its frequency can be expressed as a linear combination of the existing fundamentals), and then for its fundamentality (whether the multiples of its frequency, combined with the multiples of existing fundamentals, coincide with the frequency of another local maximum). A tolerance of one sample point is used in the frequency matching. Note that multiple fundamental frequencies can exist along a temporal line.

Due to the nature of the temporal signal and the effect of the Gaussian taper, a spectral harmonic peak usually does not appear as a single impulse. In this work, a peak support region is determined by growing from the detected peak location outward along the frequency axis until the spectral value is below 5% of the spectrum range. After the spectral peaks and their supports are identified, it is straightforward to compute the harmonic energy ratio associated with a fundamental frequency and its harmonics.

The peak detection technique discussed above fails when a temporal line contains only one sinusoidal signal, which produces a single spectral peak. However, this situation arises only when the edge of a moving object has a sinusoidal profile. An example is a vertical sine grating pattern horizontally translating frontoparallel to the camera at a constant speed. Natural edges, patterns, and surfaces hardly ever have such a profile. Therefore, higher harmonics usually accompany the fundamentals of the temporal signals.

Applying the peak detection procedure to all temporal lines in a data cube, the periodicity template of the aligned sequence is built by registering the fundamental frequencies and the corresponding values of temporal harmonic energy ratio at each pixel location in a data structure array of frame size. At places where no periodicity is found, the template data structure has value zero. Under circumstances such as a noisy background, some speckles may appear in the template. Simple morphological closing and opening operations can be applied to remove the speckles.

Figure 6 (a) shows the temporal harmonic energy ratio values of the Walker sequence after one closing and one opening operation with a circular structuring element of diameter 3. The larger the energy ratio value, the more periodic energy is at the location. As expected, the brightest region is the wedge shape created by the walking legs. The head, the shoulder, and the outline of the backpack are detected because the walker bounces. The hands appear at the front of the body since in most parts of the sequence the walker was fixing his gloves and moving his hands in a rather periodic manner. Note that the moving background and parts of the walker do not appear in the template since there is no periodicity present in those areas.

Using the alignment indices generated at the first stage, the periodicity template of a sequence can be used to mask the original sequence for the regions of periodicity in each frame. Figure 6 (b) shows the four frames in Figure 1 after they are masked and then stacked together.

Figure 6: (a) Temporal harmonic energy ratio values of the aligned Walker sequence. High value indicates more periodic energy at the location. (b) Using the alignment indices, the four frames in Figure 1 are masked by the template shown in (a) and then stacked together.

Since the non-periodic activities of the background do not light up in the templates, it is clear that the sequence cropping for efficient computation does not affect the processing results.

3 Examples

In addition to the Walker sequence, four examples are used here to demonstrate the effectiveness of the proposed algorithm: Trio, Dog, Wheels, and Jumping Jack.

Figure 7: Left column: frames 40, 61, and 88 of the Trio sequence. Right column: frames in the left column masked by the periodicity templates.

The Walker and Trio sequences were recorded by a hand-held consumer-grade camcorder. The Dog and Wheels sequences were taken by the same camera set on a tripod. The Jumping Jack sequence was recorded by a fixed Betacam camera in an indoor setting. Except for the Jumping Jack, none of the subjects in the sequences was aware of the filming; hence the activities are natural and exhibit natural irregularities. All original sequences have 320 × 240 frame size.

These examples are used to demonstrate 1) the effectiveness of the new algorithm in finding and characterizing periodicity in various settings; 2) the robustness of the algorithm under noisy conditions; and 3) the noise sensitivity of optical flow based estimation methods, which have been used for trajectory detection in many existing works, but are avoided by the method proposed here.
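As a rough illustration of the peak detection and energy ratio described in Section 2.2, the following Python sketch processes a single temporal line. It is not the authors' implementation. The function name `harmonic_energy_ratio`, the Gaussian taper width (a quarter of the signal length), the reading of "spectral range" as maximum minus minimum of the power spectrum, and the restriction to a single fundamental are all assumptions made here for brevity.

```python
import numpy as np

def harmonic_energy_ratio(signal, nbhd=7, peak_frac=0.10, support_frac=0.05, tol=1):
    """Return (ratio, fundamental_bin) for one temporal line; (0.0, None) if no harmonics."""
    x = np.asarray(signal, dtype=float)
    n = x.size
    x = x - x.mean()                                   # zero-mean
    t = np.arange(n) - (n - 1) / 2.0
    x = x * np.exp(-0.5 * (t / (n / 4.0)) ** 2)        # Gaussian taper (width is an assumption)
    power = np.abs(np.fft.rfft(x)) ** 2                # one-sided power spectrum
    lo, span = power.min(), power.max() - power.min()
    if span <= 0:
        return 0.0, None                               # flat spectrum: nothing periodic

    # Local maxima in a size-`nbhd` neighborhood, above 10% of the spectral range.
    half = nbhd // 2
    peaks = [k for k in range(1, power.size)
             if power[k] > lo + peak_frac * span
             and power[k] == power[max(0, k - half):k + half + 1].max()]

    # Lowest peak whose integer multiples (within `tol` bins) hit other peaks.
    for f in peaks:
        harmonics = [p for p in peaks
                     if p > f and abs(p - f * round(p / f)) <= tol]
        if harmonics:
            break
    else:
        return 0.0, None                               # e.g. a single sinusoidal peak

    # Grow each peak's support until the spectrum drops below 5% of the range.
    floor = lo + support_frac * span
    support = np.zeros(power.size, dtype=bool)
    for p in [f] + harmonics:
        a = b = p
        while a > 0 and power[a - 1] >= floor:
            a -= 1
        while b < power.size - 1 and power[b + 1] >= floor:
            b += 1
        support[a:b + 1] = True
    return float(power[support].sum() / power.sum()), int(f)
```

For a pulse train with an 8-frame cycle observed over 64 frames, this sketch reports a fundamental at bin 8 (8 cycles per 64 frames) with most of the spectral energy inside the harmonic supports. A pure sinusoid yields a ratio of zero, which mirrors the single-peak limitation noted in Section 2.2. Handling several fundamentals per line, as the paper does, would need an extension of the loop over peaks.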

Figure 8: (a) Averaged XT image of the Trio sequence
after background removal. (b) Lines found in (a) by using
the Hough transform method.
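The background removal and trajectory estimation behind averaged XT images such as this one (Section 2.1) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: the function names are hypothetical, a single moving object is assumed, and a least-squares line through the brightest column of each XT row is swapped in for the Hough transform used in the paper. The median filter only recovers the background if the object covers any given pixel for fewer than half of the frames in the filter window.

```python
import numpy as np

def alignment_indices(cube, filt_len=11):
    """cube: array of shape (T, Y, X). Returns one X position per frame."""
    cube = np.asarray(cube, dtype=float)
    T = cube.shape[0]
    half = filt_len // 2
    # 1-D temporal median filter at every pixel; the result is mostly background.
    padded = np.pad(cube, ((half, half), (0, 0), (0, 0)), mode="reflect")
    background = np.stack([np.median(padded[t:t + filt_len], axis=0)
                           for t in range(T)])
    diff = np.abs(cube - background)          # mainly the moving object
    xt = diff.mean(axis=1)                    # averaged XT image, shape (T, X)
    # Stand-in for the Hough transform: least-squares line through the
    # brightest column of each row of the averaged XT image.
    frames = np.arange(T)
    slope, intercept = np.polyfit(frames, xt.argmax(axis=1), 1)
    return slope * frames + intercept

def align(cube, indices, center=None):
    """Shift each frame horizontally so the tracked X position sits at `center`."""
    cube = np.asarray(cube)
    center = cube.shape[2] // 2 if center is None else center
    shifts = center - np.rint(indices).astype(int)
    # np.roll wraps pixels around the frame edge, a simplification made here.
    return np.stack([np.roll(frame, s, axis=1) for frame, s in zip(cube, shifts)])
```

A line fit handles only one trajectory per sequence. Detecting several trajectories at once, as in the Trio example, is where a Hough accumulator becomes necessary.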
3.1 Trio

Trio is a 156 frame sequence of three people walking and passing each other. Frames 40, 61, and 88 of the sequence are shown in the left column of Figure 7. As in the Walker example, the averaged XT image is computed after the background removal. The lines in the XT image are detected via Hough transform. Figure 8 shows the averaged XT image and the detected lines. These lines provide the alignment indices of each object. Note that the alignment indices of three objects are estimated simultaneously.

To generate the periodicity templates, the original sequence is aligned and cropped for each moving person. All aligned sequences contain 64 frames. Figure 9 shows example frames of each aligned sequence and the harmonic energy ratio values of the periodicity templates. Again, the goal here is not to segment out the people, but to detect and characterize regions of periodicity, such as legs, arms, the outline of bouncing head and shoulder, and even the dangling straps of the backpack. Finally, the templates are used to mask the original sequence. Examples are shown in the right column of Figure 7.

Figure 9: Example frames of aligned sequences and the harmonic energy ratio values of the periodicity templates for each individual of the Trio sequence. First two columns: example frames. Right column: harmonic energy ratio values.

Notice that, besides the center person, there is a second or even a third person passing through in all three aligned sequences. However, these passersby have no effect on the results of periodicity detection since they are one-time events on a temporal line, and therefore do not contribute to the temporal harmonic energy. The Trio example demonstrates that the proposed algorithm is well suited for the detection of multiple periodicities, even under the circumstances of temporary object occlusion.

3.2 Dog

Dog is a 104 frame sequence where a person walks two dogs in front of a picket fence. Figure 10 (a) shows frame 46 of the original sequence, and (b) shows frame 13 of the 64-frame aligned sequence. Images (c1) and (c2) show the first and second fundamental frequencies in the periodicity template, while (e1) and (e2) are the corresponding harmonic energy ratios. Note that there are double fundamentals at many pixel locations.

The complication here is the picket fence. In the original sequence, the fence is part of the fixed background, exhibiting pure spatial periodicity. However, when the sequence is aligned to the person and the dogs, the fence starts to move in the background, leaving a periodic signature on many temporal lines. As shown in (c1) and (e1), the fence area lights up in the periodicity template.

Figure 10 (d) shows the fundamentals with value 0.875, which is the temporal frequency of the fence in the aligned sequence. The fundamental frequency values are used to extract the fence. Figure 10 (f) shows the harmonic energy ratios in the template after the fence frequency is removed. The fence region of the frame in (a) is shown in (g) while other regions of periodicity are shown in (h).

3.3 Wheels

The examples shown so far all involve walking. However, the algorithm is not limited to periodicity caused by human activities, but works in general for any periodic space-time phenomenon.

Wheels is a 64 frame sequence of a car passing by a building. Near the top of the building, two spinning wheels are connected by a figure 8 belt. One side of the belt is patterned and appears periodic. Every region with periodicity should be captured: the hub caps, the wheels, and one side of the belt. As shown in Figure 11, the algorithm accomplishes just that.

3.4 Jumping Jack

There is no translatory motion in the Jumping Jack sequence, and the background is smooth. This sequence and its noisy versions (corrupted by additive Gaussian white noise (AGWN) of variance 100 and 400) are used to demonstrate the robustness of the new algorithm in the presence of noise, and also to show the noise sensitivity of the optical flow based motion estimation. The length of the sequences used here is 128 due to the cycle of the jumping motion.
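A small numerical experiment can make the effect of AGWN on a temporal spectrum concrete. The Python sketch below is illustrative only: the temporal line is synthetic (a hypothetical gray level toggling with a 16-frame cycle over 128 frames), the fundamental bin is assumed to be known in advance, and `harmonic_fraction` is a simplified stand-in for the harmonic energy ratio that sums energy at exact multiples of the fundamental without any peak detection.

```python
import numpy as np

def harmonic_fraction(signal, fundamental_bin):
    """Share of spectral energy at multiples of a known fundamental bin."""
    x = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    return float(power[fundamental_bin::fundamental_bin].sum() / power.sum())

rng = np.random.default_rng(0)
frames = np.arange(128)
# Hypothetical temporal line: gray level toggling with a 16-frame cycle,
# so the fundamental falls in bin 128 / 16 = 8.
clean = 100.0 + 60.0 * ((frames % 16) < 8)

for variance in (0, 100, 400):
    noisy = clean + rng.normal(0.0, np.sqrt(variance), size=clean.shape)
    power = np.abs(np.fft.rfft(noisy - noisy.mean())) ** 2
    # Print the noise variance, the strongest bin, and the harmonic share.
    print(variance, int(power.argmax()), round(harmonic_fraction(noisy, 8), 3))
```

White noise of variance σ² adds on average about Nσ² to every bin of an N-point unnormalized power spectrum, whereas a periodic component of fixed amplitude grows in its harmonic bins roughly as N². Under these assumptions the harmonic bins still stand out clearly at variance 400.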

Figure 10: Dog sequence. (a) Frame 46 of original sequence. (b) Frame 13 of aligned sequence. (c1) and (c2): first and second fundamental frequencies in periodicity template. (e1) and (e2): harmonic energy ratios corresponding to the frequencies in (c1) and (c2). (d) Fundamentals with the fence frequency. (f) Harmonic energy ratios after the fence frequency is removed. (g) Frame 46 masked to show fence region. (h) Frame 46 masked to show other regions of periodicity.

Figure 11: Wheel sequence. Shown in the first three rows, the algorithm captures all regions with periodicity: the hub caps, the wheels, and one side of the belt. Bottom row: details of spinning wheels and car.

Most of the related work uses flow based methods to locate moving objects in a sequence. However, the noise sensitivity of the flow based method can be a drawback. The optical flow magnitudes shown here were obtained by using the hierarchical least-squares algorithm [9], which is based on a gradient approach described by [10] [11]. Two pyramids are built, one for each of the two consecutive frames, and motion parameters are progressively refined by residual motion estimation from coarse images to the higher resolution images. This algorithm is representative of the existing optical flow estimation techniques. The optical flow magnitudes of the Jumping Jack frame 61 are shown in the second row of Figure 12. Given a clean input, the flow magnitudes can be used to segment the moving object. However, the algorithm is mostly ineffective under the noisy conditions.

The third row of Figure 12 shows the 57th TY (not YT!) image of each sequence, revealing the tracks left by the right hand and leg. The rows in these images are temporal lines, and the corresponding power spectra are shown in the fourth row of the figure. The periodicity templates can be found in the bottom row. Although the noise causes some degradation in the arm regions, the templates are well preserved overall. The reason why the proposed algorithm is not affected by large amounts of white noise in the input is that white noise only contributes to the relatively smooth part of the power spectrum. As long as the noise energy is not so high that it overwhelms the spectral harmonic peaks, the algorithm works.

3.5 Walker

The detection results of the Walker sequence were shown in Section 2. Here we show the results from noisy inputs (original sequence corrupted by AGWN of variance 100 and 400), using 64 frames. The resulting periodicity templates in Figure 13 show that, unlike optical flow based methods, the proposed algorithm is robust in the presence of noise.

4 Discussion

Compared to the method used in [4], the periodicity measure proposed here in the form of the temporal harmonic energy ratio is a more accurate and more reliable measure
of signal periodicity.

The periodicity templates also provide the fundamental frequencies of the temporal signals. Using this information, areas involved in periodic activities with different cycles can be distinguished easily.

The proposed algorithm can be considered as a "periodicity filter". Given a sequence of a street with cars and pedestrians, the algorithm will find the moving legs of the pedestrians and filter out the cars and other non-periodic activities. Periodicity is a salient feature to human visual perception. The proposed algorithm provides a model of low-level periodicity perception, even though it may not work exactly like the human visual system.

The method presented here is computationally efficient. The most machine intensive part of the algorithm is the 1-D fast Fourier transform used in power spectrum computation. When the activity cycle is reasonably short, such as walking at normal speed, a sequence length of 64 frames suffices. Cropping of aligned sequences provides additional speed-up.

In the current work, assumptions were made on the data. The steady background condition for data type II is mainly for the background subtraction. The algorithm in fact tolerates small camera movement quite well. When an object is not translating with respect to the camera, its trajectory will not be linear in the data cube and a scheme more sophisticated than the Hough line detection will have to be used for the frame alignment. If the object is not moving frontoparallel to the camera, the perspective effect will change the size of the object in the sequence. However, this change should not be significant during the period of 64 frames when the distance between the camera and the object is sufficiently large. In practical situations, this is often the case.

Figure 12: Jumping Jack sequence (frame size 155 × 170). Left column: original sequence. Middle and right columns: corrupted sequences (AGWN Var=100 and AGWN Var=400). Row 1: frame 61. Row 2: frame 61 optical flow magnitudes. Row 3: TY slice 57, showing the tracks left by right hand and leg; each row of these images is a temporal line. Row 4: temporal power spectra of TY slice 57. Row 5: harmonic energy ratios of periodicity templates.

Figure 13: Periodicity templates of the Walker sequence. (a) from original sequence; (b) and (c): with AGWN of variance 100 and 400 respectively. The proposed algorithm is robust in the presence of noise.

4.1 Applications

The proposed algorithm can be applied to motion classification and recognition. In [5], the shape of the active region in a sequence was used for activity recognition. In [3], the sum of the flow magnitudes in tessellated frame areas of periodic motion was used for motion classification. The periodicity templates produced by the proposed algorithm can provide not only distinct shapes of regions of periodic motion, such as the wedge for the walking motion and the snow angel for the jumping jack, but also accurate pixel-level description of a periodic action in the form of temporal harmonic energy ratios and motion fundamental frequencies.

The characterization of periodicity is also important to video database related applications. The presence, position, strength, and frequency information of periodic activities can be used for video representation and retrieval.

In general, periodicity is a salient attention-getting feature. The proposed algorithm can be used in numerous surveillance applications for detecting ambulatory activity without having to do full-person recognition.
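The use of fundamental frequencies to tell apart activities with different cycles, as done for the picket fence in the Dog example, can be sketched in Python as below. The storage layout is an assumption made here, not something the paper specifies: the template is held as two arrays of shape (K, Y, X), one layer per fundamental at each pixel, with zeros where no periodicity was found. The function names and the matching tolerance are hypothetical.

```python
import numpy as np

def split_by_fundamental(fundamentals, ratios, target, tol=1e-6):
    """Split a template into the part at `target` frequency and the rest.

    fundamentals, ratios: arrays of shape (K, Y, X); zeros mean "no periodicity".
    Returns two (Y, X) maps of harmonic energy ratios.
    """
    fundamentals = np.asarray(fundamentals, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    hit = (ratios > 0) & (np.abs(fundamentals - target) <= tol)
    selected = np.where(hit, ratios, 0.0).max(axis=0)    # e.g. the fence
    remaining = np.where(hit, 0.0, ratios).max(axis=0)   # everything else
    return selected, remaining

def mask_frame(frame, template):
    """Keep only the pixels of an aligned frame where the template is non-zero."""
    return np.where(np.asarray(template) > 0, frame, 0)
```

Called with `target=0.875`, the value reported for the fence in Section 3.2, `selected` would play the role of the fence-only map and `remaining` that of the template after the fence frequency is removed. Masking an unaligned original frame would additionally require shifting the template by that frame's alignment index.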

5 Summary

A new algorithm for finding periodicity in space and time is presented. The algorithm consists of two main parts: 1) object tracking by frame alignment, which transforms data into a form in which periodicity can be easily detected and measured; 2) Fourier spectral harmonic peak detection and energy computation to identify regions of periodicity and measure its strength. This method allows simultaneous detection, segmentation, and characterization of spatiotemporal periodicity, and is computationally efficient. The effectiveness of the technique and its robustness to noise over optical flow based methods are demonstrated using a variety of real-world video examples.

Periodicity templates are proposed as a new way of characterizing spatiotemporal periodicity. The templates contain information such as the fundamental frequencies and the temporal harmonic energy ratios at each frame pixel location. The periodicity templates and the template generating algorithm are useful tools for applications such as action recognition, video databases, and video surveillance.

Acknowledgments

The authors would like to thank Aaron Bobick and Sandy Pentland for insightful discussions, Jim Davis for the Jumping Jack sequence, and John Wang for the hierarchical optical flow estimation program.

References

[1] D.D. Hoffman and B.E. Flinchbaugh. The interpretation of biological motion. Biological Cybernetics, pages 195-204, 1982.
[2] M. Allmen and C.R. Dyer. Cyclic motion detection using spatiotemporal surface and curves. In Proc. ICPR, pages 365-370, 1990.
[3] R. Polana and R. Nelson. Low level recognition of human motion. In IEEE Workshop on Motion of Non-rigid and Articulated Objects, pages 77-82, Austin, TX, Nov. 11-12 1994.
[4] R. Polana and R. C. Nelson. Detecting activities. In Proc. CVPR, pages 2-7, New York, NY, June 1993.
[5] A.F. Bobick and J.W. Davis. Real-time recognition of activity using temporal templates. In Proc. Third IEEE Workshop on Appl. of Comp. Vis., pages 39-42, Sarasota, FL, Dec. 1996.
[6] F. Liu and R. W. Picard. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE T. Pat. Analy. and Machine Intel., 18(7):722-733, July 1996.
[7] H. Wold. A Study in the Analysis of Stationary Time Series. Stockholm, Almqvist & Wiksell, 1954.
[8] S. A. Niyogi and E. H. Adelson. Analyzing gait with spatiotemporal surfaces. In IEEE Workshop on Motion of Non-rigid and Articulated Objects, pages 64-69, Austin, Texas, Nov. 11-12 1994.
[9] J.Y.A. Wang. Layered Image Representation: Identification of Coherent Components in Image Sequences. PhD thesis, Dept. of EECS, MIT, Cambridge, Sept. 1996.
[10] J. Bergen et al. Hierarchical model-based motion estimation. In Proc. ECCV, pages 237-252, 1992.
[11] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proc. Image Understanding Workshop, pages 121-131, 1981.
