Chapter 9
Processing Sequential Sensor Data
John Krumm
Contents
9.1 Introduction
9.2 Tracking Example
9.3 Mean and Median Filters
9.4 Kalman Filter
9.4.1 Linear, Noisy Measurements
9.4.2 Linear, Noisy Dynamics
9.4.3 All Parameters
9.4.4 Kalman Filter
9.4.5 Discussion
9.5 Particle Filter
9.5.1 Problem Formulation
9.5.2 Particle Filter
9.5.3 Discussion
9.6 Hidden Markov Model
9.6.1 Problem Formulation
9.6.2 Hidden Markov Model
9.6.3 Discussion
9.7 Presenting Performance
9.7.1 Presenting Continuous Performance Results
9.7.2 Presenting Discrete Performance Results
9.8 Conclusion
References
9.1 Introduction
Ubiquitous computing (ubicomp) applications are normally envisioned to
be sensitive to context, where context can include a person’s location, activ-
ity, goals, resources, state of mind, and nearby people and things. Context
is often inferred with sensors that periodically measure some aspect of the
user’s state. For instance, a global positioning system (GPS) sensor can
repeatedly measure a person’s location at some interval in time. This is an
example of sequential sensor data in that it is a sequence of sensor readings
of the same entity spread out over time. Unfortunately, sensors are never
perfect in terms of noise or accuracy. For example, a GPS sensor gives
noisy latitude/longitude measurements, and sometimes the measurements
are wildly inaccurate (outliers). In addition, sensors often do not measure
the necessary state variables directly. Although a GPS sensor can be used
to infer a person’s velocity and even mode of transportation (Patterson
et al., 2003), it cannot directly measure these states. This chapter is aimed
at introducing fundamental techniques for processing sequential sensor
data to reduce noise and infer context beyond what the sensor actually
measures. The techniques discussed are not necessarily on the cutting
edge of signal processing, but they are well-accepted approaches that have
proven to be fundamentally useful in ubicomp research. Because of their
wide acceptance and usefulness, you should feel comfortable using them
in your own ubicomp work. Specifically, this chapter discusses mean and
median filters, the Kalman filter, the particle filter, and the hidden Markov
model (HMM). Each of these techniques processes sequential sensor data,
but they all have different assumptions and representations, which are
highlighted to help the reader make an intelligent choice.
This chapter concentrates on processing sequential sensor measure-
ments, because it is usually necessary to make repeated measurements
to keep up with possibly changing context in ubicomp applications. For
instance, a person’s location usually changes with time, so location must
be measured repeatedly. Sequential measurements mean that processing
techniques can take advantage of both a sense of the past and a sense of the
future, which will be explained next.
A sense of the past is useful because context does not change completely
randomly, but instead shows some coherence over time. Although an iso-
lated sensor measurement might lead to an uncertain conclusion about
a person’s context, repeated measurements give more certainty in spite
of noisy measurements, partly because context normally does not change very quickly.
9.2 Tracking Example
The problem presented in this chapter focuses on tracking a moving person
in the (x,y) plane based on noisy (x,y) measurements taken at an interval of
1 second. The simulated actual path and simulated noisy measurements
are shown in Figure 9.1, where the unit of measurement is 1 meter. The
path starts at the center of the spiral. Pretend that the person is carrying
a location sensor, possibly a GPS, if he is outside. There are 1000 measure-
ments spread evenly in time from beginning to end. The measurements
are noisy versions of the actual path points. As is common in modeling
noisy measurements, the measurements are taken as the actual values plus
added Gaussian noise with zero mean. The Gaussian probability distribu-
tion is the familiar bell-shaped curve. “Noise” is not noise in the audible
sense, but represents random errors in the measurements. Engineers and
researchers use Gaussian noise, as opposed to other probability distribu-
tions, partly because it is often an adequate model and partly because it
is theoretically convenient. The zero mean assumption implies that the
sensor is unbiased (no constant offset error), and this is easily realizable in practice.
FIGURE 9.1 The actual path, in black, starts at the center of the spiral.
The noisy, measured points are gray. One goal of processing the measured
points is to estimate the actual path. These data form the basis of the run-
ning example in this chapter.
The measurements $z_i$ are modeled as the actual values $x_i$ plus random noise $v_i$:

$$z_i = x_i + v_i \qquad (9.1)$$

Here, $v_i$ is a random Gaussian noise vector with zero mean and a diagonal
covariance matrix. Since the covariance matrix is diagonal, this is the same as
adding independent Gaussian noise to each element of $x_i$. Note that Equation
(9.1) is not part of any algorithm for inferring xi, but just an assumption on the
relationship between the measurements zi and actual xi values. It also describes
the simulated measurement data used in this chapter’s running example.
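For concreteness, here is a minimal sketch, in Python with NumPy, of how measurements like those in Figure 9.1 might be simulated under this additive-noise model. The spiral path, the noise level sigma = 3.0, and the function name are illustrative assumptions, not values taken from the chapter.

```python
import numpy as np

def simulate_measurements(path, sigma, seed=0):
    """Return noisy measurements: each actual (x, y) point plus zero-mean,
    independent Gaussian noise with standard deviation sigma (Equation 9.1)."""
    rng = np.random.default_rng(seed)
    return path + rng.normal(0.0, sigma, size=path.shape)

# A toy spiral path sampled once per second (1000 points), then measured with noise
t = np.arange(1000)
actual = np.column_stack((50 + 0.04 * t * np.cos(0.05 * t),
                          45 + 0.04 * t * np.sin(0.05 * t)))
measured = simulate_measurements(actual, sigma=3.0)
```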
This example is intentionally simple so that it is easy to visualize and so that
some of the simpler techniques will work on it. This chapter will show how each
of the techniques performs on these simulated data.
9.3 Mean and Median Filters
A simple way to reduce the noise in the measurements is to average the last n of them:

$$\hat{x}_i = \frac{1}{n} \sum_{j=i-n+1}^{i} z_j \qquad (9.2)$$
Here x̂i represents the estimated value of xi. Equation (9.2) represents a
sliding window of width n, where n is the number of values to use to com-
pute the mean. This filter is simple to implement and works well to reduce
noise, as shown in Figure 9.2, where n = 10. The primary disadvantage is
that it introduces lag in the estimate, because the average is taken over
mostly measurements that come before zi. One way to reduce the lag is to
use a weighted average whose weights decrease in value for increasingly
older measurements. Note that the mean filter in Equation (9.2) is a “causal
filter,” because it does not look ahead in time to future measurements to
estimate xi. This is true of all techniques discussed in this chapter.
In addition to lag, another potential problem with the mean filter is its
sensitivity to outliers. In fact, just a single measured point, placed far enough
away, can move the mean to any location. The median is a more robust ver-
sion of the mean, and it still works if up to half the data is outliers.
$$\hat{x}_i = \mathrm{median}\{z_{i-n+1}, z_{i-n+2}, \ldots, z_{i-1}, z_i\} \qquad (9.3)$$
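As a concrete sketch, the two filters can be written directly from Equations (9.2) and (9.3). This assumes the measured array from the earlier snippet, or any (N, 2) array of measurements; for the first few samples, where fewer than n measurements exist, these versions simply use whatever is available.

```python
import numpy as np

def mean_filter(z, n=10):
    """Sliding-window mean of the last n measurements (Equation 9.2)."""
    return np.array([z[max(0, i - n + 1): i + 1].mean(axis=0) for i in range(len(z))])

def median_filter(z, n=10):
    """Sliding-window, component-wise median of the last n measurements (Equation 9.3)."""
    return np.array([np.median(z[max(0, i - n + 1): i + 1], axis=0) for i in range(len(z))])

mean_path = mean_filter(measured, n=10)
median_path = median_filter(measured, n=10)
```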
FIGURE 9.2 The actual path is in black. The paths estimated by the mean
and median filters are in dark gray and light gray, respectively. The median
filter produces fewer large excursions from the true path, because it is more
robust to outliers.
Figure 9.2 shows the estimated path based on the median. There are
places where the mean drifts relatively far from the actual path due to an
outlier, whereas the median stays closer.
The mean and median filters are simple, yet effective techniques for pro-
cessing sequential sensor data. However, they do suffer from lag, and they
do not intrinsically estimate any higher level variables such as speed. One
could estimate speed with a numerical derivative, but this is very sensitive
to the noise in the original measurements. The remaining techniques dis-
cussed in the chapter work to reduce lag with a dynamic model, and the
Kalman and particle filters (Sections 9.4 and 9.5) can estimate higher level
state variables in a principled way.
9.4 Kalman Filter
The Kalman filter is a big step up in sophistication from the mean and
median filters discussed above. It explicitly accounts for sensor noise (as
long as the noise is additive Gaussian), and it explicitly models the sys-
tem’s dynamics. The Kalman filter introduces probability to the problem: it is a
Bayesian technique whose noise models are Gaussian, whose measurement and
dynamic models are linear, and which runs online, processing each measurement
as it arrives.
9.4.1 Linear, Noisy Measurements
For the tracking example, the measurement vector and state vector are

$$z_i = \begin{pmatrix} z_i^{(x)} \\ z_i^{(y)} \end{pmatrix}, \qquad x_i = \begin{pmatrix} x_i \\ y_i \\ s_i^{(x)} \\ s_i^{(y)} \end{pmatrix} \qquad (9.4)$$

where $z_i^{(x)}$ and $z_i^{(y)}$ are the measured coordinates, $x_i$ and $y_i$ are the
actual (unmeasured) position coordinates, and $s_i^{(x)}$ and $s_i^{(y)}$ are
the unknown velocity vector's x and y components. The state vector contains
variables that are not directly measured (velocity, in this case).
To express the full relationship between the actual state and measure-
ments, the Kalman filter adds a measurement matrix H and zero-mean
Gaussian noise:
$$z_i = H_i x_i + v_i, \qquad v_i \sim N(0, R_i) \qquad (9.5)$$
Here, the measurement matrix Hi translates between the state vector and the
measurement vector. Because the measurements are related to the actual state
by a matrix, the measurements are said to be linear. In the example, Hi simply
deletes the unmeasured velocity and passes through the (x,y) coordinates:
$$H_i = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \qquad (9.6)$$
The noise vector vi has the same dimensions as the measurement vector
zi, and is distributed as zero-mean Gaussian noise with a covariance matrix
Ri. In the example, the noise covariance is independent of i. Because
this example is a simulation, the Kalman filter has the advantage of know-
ing the exact noise covariance, from Equation (9.1):
$$R_i = \begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} \qquad (9.7)$$
9.4.2 Linear, Noisy Dynamics
The Kalman filter models the system dynamics as a linear function of the previous
state plus zero-mean Gaussian noise:

$$x_i = \phi_{i-1} x_{i-1} + w_i, \qquad w_i \sim N(0, Q_i) \qquad (9.8)$$

The system matrix $\phi_{i-1}$ gives the linear relationship between the state at
time i − 1 and i. For the example, the system matrix is
$$\phi_{i-1} = \begin{pmatrix} 1 & 0 & \Delta t_i & 0 \\ 0 & 1 & 0 & \Delta t_i \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (9.9)$$
Here, $\Delta t_i$ is the time elapsed between measurements i − 1 and i. With this
matrix and the state vector of the example, the equation $x_i = \phi_{i-1} x_{i-1}$ says
that $x_i = x_{i-1} + \Delta t_i\, s_{i-1}^{(x)}$, and similarly for $y_i$. This is just standard physics for
a particle moving in a straight line at constant velocity. The system equa-
tion also says that the velocity stays constant over time. Of course, this
is not true for the example, and is usually not true for any system that
controls its own trajectory. In general, the system noise wi helps account
for the fact that $\phi_{i-1}$ is not an exact model of the system. The term $w_i$ is zero-mean,
Gaussian noise with covariance Qi. For the example, Qi is
$$Q_i = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \sigma_s^2 & 0 \\ 0 & 0 & 0 & \sigma_s^2 \end{pmatrix} \qquad (9.10)$$
This implies that the physical model connecting position and veloc-
ity is correct (which it is). But it also implies that velocity is subject to
some noise between updates, which helps compensate for the fact that
otherwise velocity is assumed to be constant. Setting the actual values of
Qi is not straightforward. For the example, $\sigma_s^2$ was chosen to represent
the variance in the actual velocity from measurement to measurement.
Qi is modeling the fact that velocity is not constant in the example. If it
were, the trajectory would have to be straight. With Qi given as Equation
(9.10), Equation (9.8) states that the velocity changes randomly between
time steps, with the changes distributed as a zero mean Gaussian.
9.4.3 All Parameters
Implementing the Kalman filter requires the creation of the measurement
model, system model, and initial conditions. Specifically, it requires the
measurement matrix $H_i$, the measurement noise covariance $R_i$, the system
matrix $\phi_{i-1}$, the system noise covariance $Q_i$, an initial state estimate
$\hat{x}_0$, and an initial state error covariance $P_0$.
The initial state estimate can usually be derived from the first few
measurements. For this chapter's example, the initial position came from
$z_0$, and the initial velocity was taken as zero. A reasonable estimate of $P_0$
for this example is
$$P_0 = \begin{pmatrix} \sigma^2 & 0 & 0 & 0 \\ 0 & \sigma^2 & 0 & 0 \\ 0 & 0 & \sigma_s^2 & 0 \\ 0 & 0 & 0 & \sigma_s^2 \end{pmatrix} \qquad (9.11)$$
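As a sketch of this setup for the tracking example, the code below builds these quantities with NumPy, reusing the measured array from the earlier snippet. The numerical values of sigma and sigma_s are placeholders; the chapter does not state the exact values used to produce its figures.

```python
import numpy as np

sigma = 3.0    # assumed measurement noise standard deviation (meters)
sigma_s = 0.5  # assumed velocity noise standard deviation (meters/second)
dt = 1.0       # time between measurements (seconds)

H = np.array([[1.0, 0.0, 0.0, 0.0],            # measurement matrix, Equation (9.6)
              [0.0, 1.0, 0.0, 0.0]])
R = sigma**2 * np.eye(2)                        # measurement noise covariance, Equation (9.7)
Phi = np.array([[1.0, 0.0, dt,  0.0],           # system matrix, Equation (9.9)
                [0.0, 1.0, 0.0, dt ],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])
Q = np.diag([0.0, 0.0, sigma_s**2, sigma_s**2]) # system noise covariance, Equation (9.10)

x_hat = np.array([measured[0, 0], measured[0, 1], 0.0, 0.0])  # initial state from z_0, zero velocity
P = np.diag([sigma**2, sigma**2, sigma_s**2, sigma_s**2])     # initial error covariance, Equation (9.11)
```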
9.4.4 Kalman Filter
The Kalman filter proceeds in two steps for each new measurement $z_i$. The
result of the two steps is the mean and covariance of the estimated state,
$\hat{x}_i^{(+)}$. The first step is to extrapolate the state and the state error covariance
from the previous estimates. These are pure extrapolations with no regard
for the measurement, and they depend only on the system model. The (−)
superscript indicates an extrapolated value.

The second step is to update the extrapolations with the new measurement,
giving the mean and covariance of the new state estimate, $\hat{x}_i^{(+)}$ and $P_i^{(+)}$.
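The two steps follow the standard Kalman filter equations (see, e.g., Gelb, 1974). The sketch below continues from the setup above; it is a minimal illustration, not the exact code behind Figure 9.3.

```python
kalman_estimates = []
for z in measured:
    # Step 1: extrapolate the state and its covariance using only the system model
    x_pred = Phi @ x_hat              # x_hat_i^(-) = Phi x_hat_{i-1}^(+)
    P_pred = Phi @ P @ Phi.T + Q      # P_i^(-)     = Phi P_{i-1}^(+) Phi^T + Q

    # Step 2: update the extrapolation with the new measurement z_i
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # Kalman gain
    x_hat = x_pred + K @ (z - H @ x_pred)                   # x_hat_i^(+)
    P = (np.eye(4) - K @ H) @ P_pred                        # P_i^(+)

    kalman_estimates.append(x_hat.copy())

kalman_estimates = np.array(kalman_estimates)  # columns: x, y, s_x, s_y
```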
FIGURE 9.3 The Kalman filter result, in dark gray, tends to overshoot turns
because its dynamic model assumes a single, straight line path. The lighter
gray line is a version of the Kalman filter tuned to be more sensitive to the
data. It follows turns better, but is also more sensitive to noise.
9.4.5 Discussion
The major difference between the Kalman filter and the mean and median
filters is that the Kalman filter has a dynamic model of the system to keep up
with changes over time. Besides fixing the lag problem, the dynamic model
counterbalances the measurement model to give a tunable tradeoff between
a designer’s belief in the predictive dynamics versus sensor readings. This
tuning is reflected in the choice of the elements of the measurement and sys-
tem covariance matrices. Although the measurement covariance is relatively
simple to estimate based on sensor characteristics, the system covariance
is more difficult, and it represents an opportunity for tuning the tradeoff
between dynamics and measurement. Figure 9.3 shows two results of the
Kalman filter: the less accurate one in dark gray reflects a principled choice
of $\sigma_s$, which specifies the noise in the predictive velocity model. The inferred
path tends to have trouble tracking curves and corners, because the model is
biased too much toward straight line paths. The more accurate inferred path
in light gray comes from an optimal choice of $\sigma_s$ based on an exhaustive
sequence of test values of $\sigma_s$. The price of this more responsive model is a
wigglier path, because this filter is more sensitive to the measurements. Of
course, using a textbook example makes such a search easy. The problem of
tuning is more difficult for real applications.
The dynamic model can include parameters beyond the ones that are
measured. In the example, the state vector for the Kalman filter added
velocity as a state parameter, even though the measurements had location
only. An obvious extension to the example would be to include accelera-
tion. The ability to track nonmeasured parameters is a benefit, because the
nonmeasured parameters may be useful for context inference.
The Kalman filter can also make sensor fusion straightforward. For the
example in this section, an accelerometer could add valuable data about
location and velocity. This would involve augmenting the state vector, sys-
tem model, and measurement model with acceleration.
Another advantage of the Kalman filter is that it gives an estimate of its
own uncertainty in the form of the covariance matrix $P_i^{(+)}$. A knowledge
of uncertainty is useful for ubicomp systems. For instance, a Kalman filter
tracker might indicate that a person is equally likely to be in the kitchen or the
living room, which could affect whether certain automatic actions are triggered.
One of the main limitations of the Kalman filter is the linearity of the
dynamic model. Some processes are inherently linear, such as the radar
traces of ballistic missiles to which Kalman filters were applied a long time
ago. System dynamics are often not linear. For instance, a ball bouncing
off a wall could not be represented by the state system matrix $\phi_{i-1}$ in the
Kalman filter. Likewise, the combinations of distances and angles in pose
estimation are not linear. The extended Kalman filter can sometimes solve
nonlinear problems by linearizing the system around the estimated state.
Until ubicomp advances to the ballistic missile stage, applications of the
basic Kalman filter in the field will be relatively rare.
The Kalman filter is also unsuitable for representing discrete state vari-
ables such as a person’s mode of transportation, knowledge of the world, or
goals. Fortunately, for nonlinear problems with a mix of continuous and
discrete state variables, the particle filter is a practical, although computa-
tionally more expensive, solution. This is the topic of the next section.
9.5 Particle Filter
The particle filter is a more general version of the Kalman filter, with less
restrictive assumptions and, because of that, more computational demand.
Unlike the Kalman filter, the particle filter does not require a linear model
for the process in question, and it does not assume Gaussian noise. One of
the main advantages of the particle filter for ubicomp applications is that it
can easily represent an arbitrary mix of continuous and discrete variables
along with a rich model of how these variables interact and affect sensors.
In terms of the list of properties given for the Kalman filter, the particle
filter is Bayesian, non-Gaussian, nonlinear, and online.
One good example of a particle filter for a ubicomp application is the work
of Patterson et al. (2003). They created a rich model of a person’s location,
velocity, transportation mode, GPS error, and the presence of a parking lot or
bus stop. The only sensor used was GPS for location and velocity. Using a par-
ticle filter, they could infer all the other variables. Note that their state space
contained both continuous variables (location, velocity, and GPS error) and
discrete variables (transportation mode, presence of parking lot or bus stop).
The particle filter has its name because it represents a multitude of pos-
sible state vectors, which can be thought of as particles. Each particle can
be considered a hypothesis about the true state, and its plausibility is a
function of the current measurement. After each measurement, the most
believable particles survive, and they are subject to random change in state
according to a probabilistic dynamic model. An easy-to-understand intro-
duction to particle filtering is the chapter by Doucet et al. (2001), and this
chapter uses their notation. Hightower and Borriello (2004) give a useful
case study of particle filters for location tracking in ubicomp.
As in the Kalman filter section (Section 9.4.4), the subsections below
explain how to implement a particle filter using this chapter’s tracking
example.
9.5.1 Problem Formulation
The particle filter is based on a sequence of unknown state vectors, xi, and
measurement vectors, zi, which is the same as the Kalman filter, both in
general and in this chapter’s example. As a reminder, for the example, the
state vector represents location and velocity, and the measurement vec-
tor represents a noisy version of location. The particle filter parallels the
Kalman filter with a probabilistic model for measurements and dynamics,
although both are more general than the Kalman filter.
The probability distribution p(zi|xi), which you must provide, models
the noisy measurements. This can be any probability distribution, making it
more general than the Kalman filter formulation, where $z_i = H_i x_i + v_i$.
The measurement probability distribution gives the probability of a measurement
$z_i$ given the actual state $x_i$. This models the sensor noise. For the example,
the measurement model is a Gaussian centered on the actual position:
$$p(z_i \mid x_i) = N\!\left((x_i, y_i)^T, R_i\right) \qquad (9.17)$$
The second distribution you must provide is the dynamic model, $p(x_i \mid x_{i-1})$,
which gives the probability of the new state given the previous one. For the example, it is

$$\begin{aligned} x_{i+1} &= x_i + s_i^{(x)} \Delta t_i \\ y_{i+1} &= y_i + s_i^{(y)} \Delta t_i \\ s_{i+1}^{(x)} &= s_i^{(x)} + w_i^{(x)}, \quad w_i^{(x)} \sim N(0, \sigma_s^2) \\ s_{i+1}^{(y)} &= s_i^{(y)} + w_i^{(y)}, \quad w_i^{(y)} \sim N(0, \sigma_s^2) \end{aligned} \qquad (9.18)$$
This is the same as the assumed dynamic model for the Kalman filter
in Equation (9.8), but expanded for each component of the state vector
to emphasize that the update does not need to be linear. It also does not
need to be Gaussian. For consistency with the example, however, this
dynamic model is both linear and Gaussian. The generality of p(xi|xi−1)
would allow for more domain knowledge in the model. For instance,
with knowledge of a map, $p(x_i \mid x_{i-1})$ could express the fact that a person is
more likely to accelerate going downhill and decelerate going uphill.
The third distribution you must provide is a prior over the initial state, $p(x_0)$.
For the example, it is centered on the first measurement:

$$p(x_0) = N(z_0, R_i) \qquad (9.19)$$
Although the variables in the example are all continuous, note that
these distributions could involve a mix of continuous and discrete vari-
ables, and they could interact in interesting ways. For instance, consider
appending a discrete home activity variable to the state vector xi, where
the activities can be {sleeping, eating, studying}. Assuming that the subject
engages in only one of these activities at a time, the measurement model
of the particle filter p(zi|xi) could express the fact that certain activities are
more likely to occur in or near certain rooms of the house.
9.5.2 Particle Filter
The particle filter works with a population of N particles representing the
state vector: $x_i^{(j)}$, j = 1, …, N. Unlike the other methods in this chapter,
the particle filter actually involves generating random numbers, which are
necessary to instantiate the values of the particles. This sampling means
writing a routine that generates sample random numbers adhering to the
relevant probability distributions.
Although there are several variations of the particle filter, one of the
most popular is called the bootstrap filter, introduced by Gordon (1994). It
starts by initializing N samples of $x_0^{(j)}$ generated from the prior distribu-
tion p(x0). In the example, a plot of these particles would cluster around
the first measurement $z_0$, because the prior term in Equation (9.19) is a
normal distribution with mean $z_0$.
With the initialization complete, for i > 1, the first step is “importance
sampling,” which uses the dynamic model to generate random $x_i^{(j)}$ from
p(xi|xi−1). This propagates the particles forward according to the assumed
dynamic model, but without any guidance from the new measurement zi.
The next step computes “importance weights” for each particle accord-
ing to the measurement model:
$$w_i^{(j)} = p\!\left(z_i \mid x_i^{(j)}\right) \qquad (9.20)$$
These weights are then normalized so they sum to 1 across the particles. The
state estimate is then the weighted mean of the particles:

$$\hat{x}_i = \sum_{j=1}^{N} w_i^{(j)} x_i^{(j)} \qquad (9.21)$$

The final step is to resample: a new population of N particles is drawn, with
replacement, from the current particles in proportion to their weights, so that
the most believable particles survive into the next time step.
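Putting the steps together, below is a minimal bootstrap particle filter for the tracking example, reusing sigma, sigma_s, dt, and measured from the earlier sketches. The particle count and the simple multinomial resampling are illustrative choices, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000                                     # number of particles (assumed)

# Initialize particles from the prior p(x_0): positions around z_0, zero velocity
particles = np.zeros((N, 4))                 # columns: x, y, s_x, s_y
particles[:, :2] = measured[0] + rng.normal(0.0, sigma, size=(N, 2))

pf_estimates = []
for z in measured[1:]:
    # Importance sampling: propagate particles with the dynamic model p(x_i | x_{i-1})
    particles[:, 0] += particles[:, 2] * dt
    particles[:, 1] += particles[:, 3] * dt
    particles[:, 2:] += rng.normal(0.0, sigma_s, size=(N, 2))

    # Importance weights from the measurement model p(z_i | x_i), Equation (9.20)
    sq_dist = np.sum((particles[:, :2] - z) ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist / sigma**2)
    w /= w.sum()                             # normalize so the weights sum to 1

    # State estimate: weighted mean of the particles, Equation (9.21)
    pf_estimates.append(w @ particles)

    # Resample in proportion to the weights; believable particles survive
    particles = particles[rng.choice(N, size=N, p=w)]

pf_estimates = np.array(pf_estimates)        # columns: x, y, s_x, s_y
```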
9.5.3 Discussion
The main problem with the particle filter is computation time. In general,
using more particles (larger N) helps the algorithm work better, because
it can more completely represent and explore the space of possible state
vectors. But often, adding enough particles to make a noticeable improve-
ment in the results also causes a significant increase in computation time.
Fox (2003) gives a method for choosing N based on bounding the approxi-
mation error.
Because the particle filter allows such a rich state representation,
it is tempting to add state variables. In this chapter’s example, it would
be interesting to add state variables governing the mode of transporta-
tion (e.g., walking vs. driving), intended route, and intended destina-
tion. However, increasing the dimensionality of the state vector usually
requires adding more particles to account for the larger state space, lead-
ing to increased computation time. One interesting remedy for this is the
Rao-Blackwellized particle filter (Murphy and Russell, 2001). It uses a
more conventional filter, such as Kalman, for the state variables where it
is appropriate, and it uses a particle filter for the other state variables. For
instance, Kalman could cover location and velocity, and a particle filter
could track the higher level states.
9.6 Hidden Markov Model
The hidden Markov model (HMM) estimates a sequence of discrete states from a
sequence of measurements.

9.6.1 Problem Formulation
As with the previous filtering methods, the measurements are represented
by zi, where the subscript represents the time the measurement was taken.
Instead of a continuous state $x_i$, the HMM works with discrete states $X_i^{(j)}$.
Here, the subscript again refers to time. The superscript indexes through
the M possible states. In an activity recognition system, $X_i^{(j)}$ could represent
different modes of transportation at time i, that is, $X_i^{(j)} \in$ {bus, foot,
car}, as in Patterson et al. (2003). Here, j = 1 means “bus,” j = 2 means
“foot,” and j = 3 means “car,” and there are M = 3 possible states.
In this chapter’s example, the natural way to represent the state is with
a continuous coordinate (xi,yi)T. For the HMM, however, with its require-
ment for discrete state, the example splits the coordinate plane into small
cells, as shown in Figure 9.8. Each cell in the example is a square whose
sides are 1 meter long. Since the space extends to 100 meters along both
axes, there are M = 10,000 cells. The goal of the HMM is to estimate which
of these cells contains the tracked object after each continuous measure-
ment zi.
Both the Kalman filter and particle filter have a measurement model.
In the Kalman filter, the measurement model, given by Equation (9.5),
says that the measurement is a linear function of the state plus additive
Gaussian noise. In the particle filter, the measurement model is a general,
conditional probability p(zi|xi). The measurement model for the HMM is
similar to the one for the particle filter, except that it gives the discrete
probability of each state, given the measurement: $P(X_i^{(j)} \mid z_i)$. P (uppercase)
indicates a discrete probability distribution. In the HMM terminology,
the measurement model is used to compute the “observation probabili-
ties.” Given a measurement zi, there is one observation probability value
for each possible state $X_i^{(j)}$. These observation probabilities must sum to
1; that is,
$$\sum_{j=1}^{M} P\!\left(X_i^{(j)} \mid z_i\right) = 1 \qquad (9.22)$$
One way to compute these observation probabilities, and the one used in the
example here, is to compute the value of the Gaussian at the
center of each cell and then normalize the probabilities so they sum to 1.
Note that since the Gaussian goes on forever (infinite support), all cells
have some nonzero probability.
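As a concrete sketch, the code below evaluates a Gaussian centered on a measurement at the center of every cell of a 100 × 100 grid of 1 × 1 meter cells and normalizes the result to sum to 1. The grid layout matches the example; the function name and the reuse of sigma and measured from the earlier snippets are illustrative assumptions.

```python
import numpy as np

GRID = 100                                   # 100 x 100 cells, each 1 x 1 meter
cx, cy = np.meshgrid(np.arange(GRID) + 0.5,  # cell-center coordinates
                     np.arange(GRID) + 0.5, indexing="ij")

def observation_probabilities(z, sigma):
    """P(X_i^(j) | z_i) for every cell j: a Gaussian centered on the measurement z,
    evaluated at each cell center and normalized to sum to 1 (Equation 9.22)."""
    sq_dist = (cx - z[0]) ** 2 + (cy - z[1]) ** 2
    p = np.exp(-0.5 * sq_dist / sigma**2)
    return p / p.sum()

obs = observation_probabilities(measured[0], sigma=3.0)  # shape (100, 100), sums to 1
```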
Both the Kalman filter and particle filter have a dynamic model. For the
Kalman filter, the dynamic model says that the new state is a linear func-
tion of the old state, plus additive Gaussian noise, as given by Equation
(9.8). The particle filter is more general, specifying the dynamic model as a
conditional probability distribution p(xi|xi−1). The dynamic model for the
HMM is expressed in terms of transition probabilities, $P(X_{i+1}^{(k)} \mid X_i^{(j)}) = a_{jk}$.
This gives the probability of transitioning from state j to state k between
successive measurements. In the {bus, foot, car} example, the transi-
tion probabilities could reflect the fact that it is improbable to transition
directly from a bus to a car, because there is usually some foot travel in
between. The transition probabilities going from one state to all the others
must sum to 1:
$$\sum_{k=1}^{M} a_{jk} = 1 \qquad (9.23)$$
(Figure: the two-state “still”/“moving” HMM discussed in Section 9.6.3, with self-transition probabilities of 0.99989 and cross-transition probabilities of 0.00011.)
FIGURE 9.6 The HMM transition probabilities for each cell look like this,
with the most probable transition being back to the same cell. The only
other nonzero transition probabilities are to the immediate neighbors.
For this chapter’s tracking example, there are M = 10,000 states, one for
each cell. The transition probabilities reflect the fact that it is more likely to
transition to a nearby cell than a distant cell. In fact, an examination of the
actual path shows that the probability of staying in the same cell from time i to
i + 1 is about 0.40. The only other nonzero transition probabilities are into the
cells immediately surrounding the current cell, as shown in Figure 9.6. The
example uses the values in Figure 9.6 to fill in the transition probabilities $a_{jk}$.
The last element of the HMM is a set of initial state probabilities,
$P(X_0^{(j)})$. With no knowledge of the initial state, it is reasonable to specify a
uniform distribution over all states, that is, $P(X_0^{(j)}) = 1/M$. For the tracking
example, the initial state probabilities were spread in a Gaussian pattern
around the first measurement.
9.6.2 Hidden Markov Model
Given the observation probabilities, the transition probabilities, and the initial
state probabilities, the HMM produces the most probable sequence of states via
the Viterbi algorithm (Rabiner, 1989).

FIGURE 9.7 HMM considers all possible states at each time step. This HMM
has M = 8 states and N = 3 measurements. The Viterbi algorithm is an effi-
cient way to find the most probable path through the observation probabili-
ties and transition probabilities. The dark lines represent one possible path.
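The chapter's full development of the algorithm is not reproduced here, but the Viterbi recursion named in the caption is standard (Rabiner, 1989). The sketch below is a generic log-space implementation for a dense M × M transition matrix; for the 10,000-cell tracking example one would exploit the sparsity of the transitions (only the nine entries per cell shown in Figure 9.6 are nonzero) rather than store a dense matrix.

```python
import numpy as np

def viterbi(initial_probs, transition_probs, observation_probs):
    """Most probable state sequence for an HMM.

    initial_probs     : (M,)   initial state probabilities P(X_0^(j))
    transition_probs  : (M, M) matrix with a_jk = P(X_{i+1}^(k) | X_i^(j))
    observation_probs : (T, M) observation probabilities, one row per measurement
    Returns a length-T array of state indices.
    """
    T, M = observation_probs.shape
    tiny = 1e-300                                  # guards against log(0)
    log_a = np.log(transition_probs + tiny)
    log_delta = np.log(initial_probs + tiny) + np.log(observation_probs[0] + tiny)
    back = np.zeros((T, M), dtype=int)

    for t in range(1, T):
        # For each state k, pick the predecessor j that maximizes delta_j + log a_jk
        scores = log_delta[:, None] + log_a       # scores[j, k]
        back[t] = np.argmax(scores, axis=0)
        log_delta = scores.max(axis=0) + np.log(observation_probs[t] + tiny)

    # Trace the best path backward from the most probable final state
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(log_delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```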
9.6.3 Discussion
The HMM is sometimes a very helpful add-on to a system that infers dis-
crete states as a means of reducing the frequency of transitions between
states. For example, one part of the Locadio system (Krumm and Horvitz, 2004) attempted to infer whether
a user was walking or still, based on the variance of measured WiFi sig-
nal strengths. The raw measurements of variance indicated very frequent
transitions between the two states. The system used a simple two-state
HMM with transition probabilities that approximated the realistic prob-
abilities of making such a transition, leading to a much smoother output,
as shown in Figure 9.9.
FIGURE 9.8 The HMM works with discrete state variables. For this exam-
ple, each discrete state is one of 10,000 1 × 1 meter cells. The gray line
shows the result of the HMM.
FIGURE 9.9 An HMM can be a useful add-on to a process that infers the
discrete state of an object. Here, an HMM was used to smooth the inferred
transitions between the states of “still” and “moving” inferred from WiFi
signal strengths for a person carrying a laptop inside a building.
9.7 Presenting Performance
There are a few standard ways to present the performance of algorithms
for processing sequential sensor data. Having de facto standards is impor-
tant, because it allows a fair comparison of different research results. This
section discusses performance measures for both continuous and discrete
state measurements.
9.7.1 Presenting Continuous Performance Results
A common way to summarize continuous tracking performance is to report the
mean and median error between the estimated and actual values, as shown in
Figure 9.10.
FIGURE 9.10 The mean and median error of six different processing algo-
rithms applied to the noisy data in Figure 9.1.
Another useful presentation is the cumulative error distribution, which gives,
for each error value e*, how often the resulting error is less than or equal to e*.
Figure 9.11 shows the cumulative error distributions for the example algorithms
in this chapter.
A steeply rising curve indicates better performance. The data used for this
type of plot are also used to find percentile errors.

FIGURE 9.11 The cumulative error distribution gives a more detailed view of
how the errors are spread. It is also useful for reading the percentile errors.
For instance, the 90th percentile error for HMM is about 2.5 meters, meaning
that the error is less than or equal to 2.5 meters 90% of the time.

For example, the 90th percentile error is the error value where the cumulative error curve crosses
0.9, and it means that the errors will be less than or equal to this value 90%
of the time. The 50th percentile error is the same as the median, and the
100th percentile error is the maximum error.
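These summary numbers are easy to compute from the estimated and actual paths. The sketch below assumes (N, 2) arrays of positions, as in the earlier snippets; the function names are illustrative.

```python
import numpy as np

def error_summary(estimated, actual):
    """Per-time-step Euclidean error plus the summary statistics discussed above."""
    errors = np.linalg.norm(estimated - actual, axis=1)
    return errors, {
        "mean": errors.mean(),
        "median": np.median(errors),              # the 50th percentile error
        "90th percentile": np.percentile(errors, 90),
        "maximum": errors.max(),                  # the 100th percentile error
    }

def cumulative_error_distribution(errors):
    """Sorted error values and the fraction of time the error is <= each value,
    suitable for plotting a curve like the one in Figure 9.11."""
    e = np.sort(errors)
    return e, np.arange(1, len(e) + 1) / len(e)

errors, stats = error_summary(median_path, actual)   # e.g., evaluate the median filter
```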
In addition to a quantitative assessment of performance, a qualitative
assessment is also useful. For instance, using this chapter’s example, a
qualitative assessment would include how well the algorithm works on
straight segments, smooth curves, and sharp turns. Although the qualita-
tive performance is hard to compare from algorithm to algorithm, it can
be important, because sometimes qualitative differences have an effect on
the quality of the upper level application. For instance, in the tracking
example, sharp turns might be indicative of a subject’s erratic behavior,
and therefore, for certain applications, an algorithm that tracks such turns
could be better than one that tracks more accurately overall.
Note that the results presented in this section are merely illustrations of
how to present performance. They should not be used to declare that one
processing algorithm is better than another. Each algorithm can be tuned
for better performance, and the results presented here are not necessarily
indicative of each algorithm’s potential performance.
9.7.2 Presenting Discrete Performance Results
For algorithms that estimate discrete states, such as activity classifiers, results
are commonly presented as a confusion matrix, which tabulates, for each actual
state, the percentage of time it was classified as each possible state. For example,
such a table for recognizing activities like riding an elevator up or brushing teeth
shows one row per actual activity and one column per classified activity (data
adapted from Lester et al., 2006, with kind permission of Springer Science +
Business Media). If the classification worked perfectly, all the off-diagonal
entries would be 0%, and the diagonal entries would be 100%.
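A minimal sketch of how such a table might be computed from parallel sequences of actual and classified labels; the activity names and sequences below are made up for illustration and are not the data from Lester et al. (2006).

```python
import numpy as np

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes,
    with each row expressed as percentages that sum to 100."""
    index = {label: k for k, label in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)))
    for a, p in zip(actual, predicted):
        counts[index[a], index[p]] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    return 100.0 * counts / np.maximum(row_totals, 1)  # avoid dividing by zero

labels = ["walking", "elevator up", "brushing teeth"]              # hypothetical labels
actual_seq    = ["walking", "walking", "elevator up", "brushing teeth"]
predicted_seq = ["walking", "elevator up", "elevator up", "brushing teeth"]
print(confusion_matrix(actual_seq, predicted_seq, labels))
```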
9.8 Conclusion
This chapter examined some algorithms for processing sequential sen-
sor measurements. The algorithms were the mean and median filters, the
Kalman filter, the particle filter, and the HMM. The process being mea-
sured is assumed to show some continuity in time, and the algorithms
presented take advantage of this by looking at past measurements to help
make a state estimate. For all but the mean and median filters, the algo-
rithms use some type of dynamic model to improve accuracy. Researchers
in signal processing and machine learning continue to develop new meth-
ods to process sequential sensor data, but the well-established methods
in this chapter have proven useful for many ubicomp tasks. They serve as
at least a good starting point for new sequential inference tasks in ubi-
comp. Part of the fun of using them is determining how your problem fits
with the assumptions and limitations of each technique.
References
Doucet, A., Freitas, N. D., and Gordon, N., An introduction to sequential Monte
Carlo methods, in Sequential Monte Carlo Methods in Practice, Doucet, A.,
Freitas, N. D., and Gordon, N., Eds., Springer, New York, 2001.
Fox, D., Adapting the sample size in particle filters through KLD-sampling,
International Journal of Robotics Research 22(12): 985–1003, 2003.
Gelb, A., Applied Optimal Estimation, Analytical Sciences Corporation, MIT Press,
Cambridge, MA, 1974.
Gordon, N. J., Bayesian Methods for Tracking, Imperial College, University of
London, London, 1994.
Hightower, J., and Borriello, G., Particle filters for location estimation in ubiquitous
computing: A case study, in Ubiquitous Computing, Springer, Nottingham,
UK, 2004, pp. 88–106.
Krumm, J., and Horvitz, E., Locadio: Inferring motion and location from
Wi-Fi signal strengths, First Annual International Conference on Mobile
and Ubiquitous Systems: Networking and Services (Mobiquitous 2004),
Cambridge, MA, USA, 2004, pp. 4–13.
Lester, J., Choudhury, T., and Borriello, G., A practical approach to recognizing
physical activities, in Pervasive Computing, LNCS 3968, Springer-Verlag,
Dublin, Ireland, 2006, pp. 1–16.
Murphy, K., and Russell, S., Rao-Blackwellised particle filtering for dynamic
Bayesian networks, in Sequential Monte Carlo Methods in Practice, Doucet,
A., Freitas, N. D., and Gordon, N., Eds., Springer-Verlag, New York, 2001,
pp. 499–515.
Patterson, D., et al., Inferring high-level behavior from low-level sensors, in
Ubiquitous Computing, Ubicomp 2003, Seattle, WA, USA, 2003, pp. 73–89.
Rabiner, L. R., A tutorial on hidden Markov models and selected applications in
speech recognition, Proceedings of IEEE 77(2): 257–286, 1989.