
Estimating The Number of People in Buildings Using Visual Information

This document discusses developing statistical methods for estimating the number of people in buildings using visual information from surveillance cameras. It addresses this problem across three levels of increasing difficulty: 1) Counting the number of people in the field of view of a single camera (closed system). The authors implemented people counting in two scenarios using background subtraction and clustering algorithms. 2) Counting the number of people in a large area partially covered by cameras (semi-closed system). This involves estimating the total using counts from the camera views. 3) Estimating the total number of people in a building (open system), which is difficult due to partial monitoring of entrances/exits and recirculating pedestrian flows.


2007 Information, Decision and Control

Estimating the Number of People in Buildings Using Visual Information
Subhash Challa, Khalid Aboura, Konda Ravikanth, Suhrud Deshpande
Networked Sensors Technologies (NeST) Labs
Computer Systems Engineering Department
University of Technology Sydney, Australia
[email protected]
Abstract - Real time pedestrian flow information and the count of people in determined areas are essential for a multitude of management and monitoring functions. The range of applications is wide and the focus here is on the development of statistical methods for people volume estimation in buildings. To count people, visual information from surveillance cameras is used. This is the first of a series of reports, where the problem is divided into estimation classes of increasing difficulty. In the initial stage, probability models are derived for the basic counting scenario, using data gathered from a single camera. The methodology is extended to more complex problems, making use of several cameras covering the same field of view. The consideration of all possible classes of estimation problems leads to solutions that culminate in the assessment of the total number of people in a building.

Keywords: Visual Surveillance, Crowd Estimation, Fusion of Information, Bayesian Estimation, Non-Homogeneous Counting Processes.

In this first article, we address the basic problem of estimating the number of people in the field of view of a single camera. We develop probability models for statistics sought by managers of facilities. A Bayesian methodology is applied for the statistical estimation of the probability models. We state directions of research for fusing information from several cameras covering the same field of view. By using the data from several cameras, the reliability of the count of the number of people present is increased. As the ultimate problem is the crowd estimation in a large building, we introduce three classes of estimation problems to define the generic problem.

Introduction

Counting the number of people in real time is essential for a multitude of management and monitoring functions. We choose to focus on estimating real time pedestrian flow in buildings. The implications for management, economic optimization and security of buildings are of significant value.

Designers and managers of large buildings conduct manual surveys or use counting devices that register the passage of a person through a gated area. These methods are labor intensive and can be impractical and unreliable. Since digital video surveillance became widespread, much research has been conducted on automated methods for counting people using camera surveillance data. By collecting useful information from video frames, one can provide exact counts and estimates of the number of people in locations of interest. A large body of research in computer vision is currently devoted to the problem [1]. However, despite some pioneering systems, the results remain constrained to specific scenarios and research has not been extended to the estimation of the total number of people in a building. Most large buildings cannot possibly be covered entirely by surveillance devices, and estimating the number of people inside the building at all times is a difficult problem.

1-4244-0902-0/07/$20.00 2007 IEEE

1.1 Problem Definition

The aim is to estimate the number of people in a building with a significant pedestrian flow. After reviewing all possible scenarios, using the main tower building of the University of Technology of Sydney as a typical site with hundreds of surveillance cameras, the problem was divided into three classes: (i) counting the total number of people in the building [Open System], (ii) counting the number of people in a large area [Semi-Closed System], and (iii) counting the number of people in the field of view [Closed System]. In this article, we present a solution implemented for (iii) and the research being conducted to improve that solution, and we outline the statistical methodology for (i) and (ii), to be presented in following papers.
1.1.1 Open System

Counting the total number of people in a large building is a difficult problem to solve. Most large buildings have many entrances that are not monitored. In addition, the flow of people in some instances is hard to identify. For example, people coming out of an elevator that accesses the parking level could also be coming from another floor. Recirculating flow mixes with the incoming flow. It is hard in such cases to separate the two, and estimation becomes essential. The solutions that industry currently provides are restricted to areas where counting devices can be deployed. For example, counting systems are installed above entrances to determine the count of people entering and leaving. Such solutions are restricted to the area where the counting systems are placed and do not provide an overall estimate of the total number of people in the building.

Authorized licensed use limited to: Universiti Malaysia Perlis. Downloaded on February 18, 2009 at 02:57 from IEEE Xplore. Restrictions apply.
1.1.2 Semi-Closed System

Counting the number of people in some areas of the building is also an involved problem. This applies to commonly used places such as lobbies, building floors, coffee shops, lounges, reception areas and recreational areas. For security reasons, cameras are directed at these selected spaces. However, the fields of view of the cameras do not usually cover the entirety of the area. One would like to estimate the number of people in these areas at all times using the data received over the surveillance video channels. Using the count of people in the area covered by the surveillance cameras, the problem is to estimate the total number of people in the larger area. This estimation is done through probabilistic modeling.

1.1.3 Closed System

Counting the number of people in an area under the view of surveillance cameras is the basic scenario. The solution to this problem is the building block of the global solution. A set of cameras pointed appropriately provides sufficient information for a reasonably accurate count of the number of people in the viewed area at all times. A typical scenario is that of a room covered by surveillance cameras. In general, we consider any area under the view of one or several surveillance cameras as a closed system. In the next sections, we describe the solution we implemented to count the number of people in a closed system, and how it can be used to solve the semi-closed and open system problems.

Counting in a Closed System

We consider a location that is either covered completely by surveillance cameras or has all entrances and exits covered by cameras. This problem can be addressed with a number of techniques available in computer vision (see [1]). We implemented two scenarios, one in a lunch room with one camera covering the whole room, and one in a computer lab with a camera covering the two entrances of the lab. To count the number of people using the video data, we used the background subtraction method for motion segmentation of Stauffer and Grimson [2], based on the generalized mixture of Gaussian probability distributions. The background estimation using a mixture of Gaussian distributions in [2] proved effective and has been applied widely [Figures 1 and 2].

Figure 1: One author walking in the computer lab

Figure 2: Foreground image of Figure 1

In addition to the motion segmentation algorithm of Stauffer and Grimson [2], we introduced a clustering algorithm and put a rectangular box around clusters in the foreground image [Figures 2 and 3]. Using classification logic based on the expected size of the boxes for the clusters in the segmented images, we counted the number of people in the tea room at all times with a fairly high degree of reliability. For the computer lab, we implemented pixel gates around the entrances in the foreground images, and counted the number of times these gates are crossed by clusters representing people entering and leaving the lab. This provided us with the exact number of people in and out of the computer lab, and therefore the number of people in the closed system at all times.

Figure 3: Boxed cluster image
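The pixel-gate idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each foreground cluster has already been tracked into a sequence of centroid x-coordinates, and that the gate is a single vertical line with the lab interior on the side x >= gate_x. The function names, the gate position and the sample tracks are all hypothetical.

```python
def count_gate_crossings(track, gate_x):
    """Count entries and exits for one cluster track.

    track  -- centroid x-coordinates of a cluster, one per processed frame
    gate_x -- x-coordinate of the (hypothetical) vertical pixel gate;
              the lab interior is assumed to lie at x >= gate_x
    """
    entries = exits = 0
    for prev, curr in zip(track, track[1:]):
        if prev < gate_x <= curr:        # moved from outside to inside
            entries += 1
        elif curr < gate_x <= prev:      # moved from inside to outside
            exits += 1
    return entries, exits


def occupancy(tracks, gate_x, initial=0):
    """Number of people inside after all tracks have been processed."""
    total_in = total_out = 0
    for track in tracks:
        entered, left = count_gate_crossings(track, gate_x)
        total_in += entered
        total_out += left
    return initial + total_in - total_out


# Illustrative tracks only, not data from the paper.
tracks = [
    [10, 40, 90, 130],    # walks in
    [20, 60, 110, 150],   # walks in
    [140, 105, 70, 30],   # walks out
    [80, 95, 99, 90],     # lingers outside the gate, never crosses
]
print(occupancy(tracks, gate_x=100))
```

Counting a crossing only when consecutive positions fall on opposite sides of the gate keeps a cluster that hovers near the entrance from being counted repeatedly, as long as it does not actually cross the line.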

2.1 The Number of People $N(t)$

For a closed system, the number of people can be counted at all times. Although there is an error associated with the assessment of the number of people in the field of view of a camera, one can achieve a reliability high enough that the result can be considered the count of the number of people in the closed system. The count results in $N(t)$, the number of people at time $t$ [Figure 4]. The units of time in Figure 4 are based on the camera video frame instants. Although a 30 frames per second real time video sequence was used, the data was processed at a rate of 4 frames per second.

Figure 4: $N(t)$, the number of people at time $t$

$N(t)$ is a stochastic process that is the compounding of many processes, each representing the arrival, stay and departure of a person. Starting a cycle at time $t = 0$, where $N(t) = 0$, the first person arrives to the closed system at time $T_1$ and stays a time $S_1$. The second person arrives at time $T_1 + T_2$, $T_2 > 0$, and stays a time $S_2$, etc. $T_1, T_2, T_3, \dots$ are the inter-arrival times. $S_1, S_2, S_3, \dots$ are the times spent in the system. As one would suspect, depending on the location of the closed system, we found that the probabilistic nature of the $T_i$'s and $S_i$'s differs according to the time of day, day of the week and season.
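To make the construction of $N(t)$ from the $T_i$'s and $S_i$'s concrete, the following sketch simulates one cycle. It is an illustration under stated assumptions, not the authors' procedure: both the inter-arrival times and the stay times are drawn from exponential distributions purely for convenience, and the function names and parameter values are hypothetical.

```python
import random


def simulate_events(n_people, arrival_rate, mean_stay, seed=0):
    """Return time-sorted (time, step) events for one cycle starting empty.

    Assumption: inter-arrival times T_i and stay times S_i are both drawn
    from exponential distributions; the stay-time model is a placeholder.
    """
    rng = random.Random(seed)
    clock = 0.0
    events = []
    for _ in range(n_people):
        clock += rng.expovariate(arrival_rate)     # T_i, inter-arrival time
        stay = rng.expovariate(1.0 / mean_stay)    # S_i, time spent inside
        events.append((clock, +1))                 # arrival
        events.append((clock + stay, -1))          # departure
    return sorted(events)


def occupancy_path(events):
    """Accumulate the event steps into the step function N(t)."""
    count = 0
    path = []
    for time, step in events:
        count += step
        path.append((time, count))
    return path


path = occupancy_path(simulate_events(n_people=20, arrival_rate=0.5, mean_stay=6.0))
print(max(n for _, n in path), path[-1])
```

Each person contributes a +1 step at arrival and a -1 step at departure, so the path returns to zero once everyone has left, which closes the cycle.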

2.2 Fusion of Camera Information

Determining $N(t)$, the number of people at time $t$ in a closed system, is the building block for estimating the number of people in semi-closed systems and open systems such as buildings. In many situations, a single camera offers enough information to determine $N(t)$. However, in more crowded settings, factors such as occlusion occur and affect the reliability of the count. In this case, the use of more than one camera improves the results. By having more than one camera pointed at the closed system, and fusing the information from their video outputs, problems due to occlusion are reduced and the reliability of the count of people is improved.

2.2.1 Likelihood Based Fusion

This method is based on the development of the likelihood function of the number of people in a cluster. The idea is taken from the mathematics of the 3D projection of the closed system into the 2D image created by the camera. For the same number of people in a cluster, the width of the cluster gets smaller as the group of people moves away from the camera. This decrease along a straight line path follows a polynomial function, due to the sine and cosine elements of the angles of the real world position relative to the position of the camera. The camera is often in a high position [see Figure 3], without offering a total aerial view. Ultimately, one can derive the likelihood function of the number of people in a cluster by positioning clusters of people for all possible combinations and locations on the floor. However, this is not needed. This likelihood function can be determined with a finite number of scene points.

Using this likelihood function, one can fuse the information from all the cameras pointing at the scene to obtain the number of people in a cluster, and therefore $N(t)$ as the sum over all the clusters. Let a cluster be determined by its data $C^i$ in the foreground image of camera $i$. Let $L$ be the location of the cluster in the real world, that is, on the floor of the scene. Out of the joint data of the $m$ cameras $C^1, C^2, \dots, C^m$, we extract the information we need, that is $(w_1, w_2, \dots, w_m, L)$, where $w_i$, $i = 1, \dots, m$, is the width of the box in the foreground image of camera $i$. $L$ is obtained through stereo computations, requiring only two cameras, and refined with more cameras. Here, we ignore other elements of the problem, such as the orientation of a person in the cluster, for the sake of simplicity in the exposition of the approach. However, such elements can be used in practice to obtain a good assessment of the likelihood function. Let $N_C$ be the number of people in the cluster. Then:

$$
\begin{aligned}
p(N_C \mid C^1, \dots, C^m) &= p(N_C \mid w_1, \dots, w_m, L) \\
&\propto p(w_1, \dots, w_m \mid N_C, L)\, p(N_C \mid L) \\
&= p(w_1 \mid N_C, L) \cdots p(w_m \mid N_C, L)\, p(N_C \mid L)
\end{aligned}
$$

$p(w_i \mid N_C, L)$ is the likelihood of the number of people in the cluster mentioned above. It is the probability of the observed width of the box in the segmented image, given the number of people in that cluster at location $L$. The last step treats the widths observed by the different cameras as conditionally independent given $N_C$ and $L$. $p(N_C \mid L)$ is the a priori probability of the number of people in that cluster at location $L$. A priori to the current set of video frames means that we can determine that probability from our assessment of the number of people at that location in the previous set of frames. This approach provides a way to fuse the information from the cameras to determine $N(t)$. The more accurate the likelihood $p(w_i \mid N_C, L)$ is made, the more effective the approach is.
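The fusion rule can be sketched numerically. The paper does not give a form for the width likelihood, so this example assumes one for illustration only: the box width seen by camera $i$ is taken to be Gaussian around a per-camera pixel scale times $N_C$. That likelihood, the pixel scales, the noise level and the function names are all hypothetical placeholders for a likelihood calibrated from scene points.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a Normal(mean, std**2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def fuse_widths(widths, px_per_person, prior, std=8.0):
    """Posterior over N_C given the box widths from m cameras.

    widths        -- observed box width w_i in each camera (pixels)
    px_per_person -- assumed expected width per person for each camera at
                     the cluster location L (hypothetical calibration)
    prior         -- dict {n: p(N_C = n | L)}
    std           -- assumed standard deviation of the width noise (pixels)
    """
    unnormalised = {}
    for n, p_n in prior.items():
        likelihood = 1.0
        for w, scale in zip(widths, px_per_person):
            likelihood *= gaussian_pdf(w, n * scale, std)   # p(w_i | N_C, L)
        unnormalised[n] = likelihood * p_n                  # times p(N_C | L)
    total = sum(unnormalised.values())
    return {n: value / total for n, value in unnormalised.items()}


prior = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
posterior = fuse_widths(widths=[62.0, 85.0], px_per_person=[30.0, 42.0], prior=prior)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

Multiplying the per-camera likelihoods is exactly the conditional independence step in the last line of the derivation; a camera whose view is occluded could be down-weighted by giving it a larger noise level.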


Incoming Flow and Time Spent in a Closed System

We discussed counting the number of people in a closed system at all times. We now derive probability models for the arrival process to the system and the time spent by each person in the system. The statistics of these two processes are of significant importance to building managers. For example, in the case of a closed system like a computer lab or a study room, knowing the utilization of such spaces leads to better economic management. In addition, determining the arrival process in a closed system allows us to estimate the number of people in a semi-closed system or in the whole system, such as the building.

3.1 Arrival Process

The first part of our research was to look at the data and determine windows of homogeneity for the arrival process. Within those windows of homogeneity, we determined the probability model for the inter-arrival and stay times. Based on preliminary studies, we observed that the $T_i$'s have Exponential distributions, $T_i \sim \mathrm{Exp}(\lambda)$ with mean $1/\lambda$ [Figure 5]. This results in the arrival stochastic process being a Poisson process within a homogeneous period. The more general candidate for the arrival process throughout the day is the non-homogeneous Poisson process (NHPP) [3], where $\lambda$ varies with time and is not a constant. This accounts for the variability due to the time of the day and the effect of the day of the week. Seasons are also a factor. We will see how that effect can be incorporated into a model.
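Within a homogeneous window, one standard way to estimate the rate of exponential inter-arrival times is a conjugate Gamma update. The sketch below assumes $\lambda$ is a rate with a Gamma prior of shape $a$ and rate $b$, so that $n$ observed inter-arrival times give a Gamma posterior with shape $a + n$ and rate $b + \sum_i T_i$. The prior values and the data are illustrative only.

```python
def update_gamma_prior(a, b, inter_arrivals):
    """Conjugate update for the rate of exponential inter-arrival times.

    a, b           -- shape and rate of the Gamma prior on lambda (assumed)
    inter_arrivals -- observed inter-arrival times T_i in one window
    Returns the shape and rate of the Gamma posterior.
    """
    return a + len(inter_arrivals), b + sum(inter_arrivals)


# Illustrative inter-arrival times in seconds, not data from the paper.
a_post, b_post = update_gamma_prior(2.0, 10.0, [4.0, 6.5, 3.0, 8.5, 5.0])
posterior_mean_rate = a_post / b_post   # E[lambda | data] for a Gamma(shape, rate)
print(a_post, b_post, posterior_mean_rate)
```

Because the posterior is again a Gamma distribution, it can serve as the prior for the next batch of inter-arrival times, which is how the estimate gets refined as more data arrives.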

Let the counting process $\{\nu(t), t \ge 0\}$ formed by the inter-arrival times be such that

$$
\nu(t) = \Big\{ n \;\Big|\; \sum_{i=1}^{n} T_i \le t,\ \sum_{i=1}^{n+1} T_i > t \Big\}
$$

We consider $\{\nu(t), t \ge 0\}$ to be a non-homogeneous Poisson process with intensity $\lambda(t)$, $t \ge 0$. If we let $m(t) = \int_0^t \lambda(s)\,ds$, then

$$
p\{\nu(t+s) - \nu(t) = n\} = \frac{e^{-(m(t+s) - m(t))}\,\big(m(t+s) - m(t)\big)^n}{n!}, \qquad n = 0, 1, \dots
$$

That is, $\nu(t+s) - \nu(t)$ is Poisson distributed with mean $m(t+s) - m(t)$, and $\nu(t)$ is Poisson distributed with mean $m(t)$. After arrival data is collected for a long enough period of time, it is classified according to the day of the week, within a particular season. Let $z_i = \nu(t_i + s) - \nu(t_i)$ be the number of arrivals in the corresponding time interval, for a carefully chosen $s$. We assume that $m(t) = \int_0^t \lambda(s)\,ds$ obeys a functional form with parameters $(\theta_1, \theta_2)$. Then

$$
p(\theta_1, \theta_2 \mid z_1, z_2, \dots, z_m) \propto p(z_1, z_2, \dots, z_m \mid \theta_1, \theta_2)\, p(\theta_1, \theta_2)
$$

$p(z_1, z_2, \dots, z_m \mid \theta_1, \theta_2)$ is the likelihood function and it is given by the NHPP, where the independent increments property and the Poisson distribution of the NHPP provide the elements of the likelihood in a product of Poisson terms. $p(\theta_1, \theta_2)$ is a prior distribution for the parameters $\theta_1$ and $\theta_2$.
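The product-of-Poisson-terms likelihood can be sketched as follows. The paper does not state the functional form of $m(t)$, so a power law $m(t) = \text{scale} \cdot t^{\text{shape}}$ is assumed here purely as an example; the function names and parameter names are hypothetical.

```python
import math


def mean_function(t, scale, shape):
    """Assumed power-law mean function m(t) = scale * t**shape."""
    return scale * t ** shape


def increment_pmf(n, t, s, scale, shape):
    """P{n arrivals in (t, t + s]} for an NHPP with the mean function above."""
    mu = mean_function(t + s, scale, shape) - mean_function(t, scale, shape)
    return math.exp(-mu) * mu ** n / math.factorial(n)


def log_likelihood(counts, starts, s, scale, shape):
    """Log of the product of Poisson terms over disjoint intervals.

    counts -- observed increments z_i
    starts -- interval start times t_i (intervals assumed non-overlapping,
              so the independent-increments property applies)
    """
    return sum(
        math.log(increment_pmf(z, t, s, scale, shape))
        for z, t in zip(counts, starts)
    )


# With shape = 1 the process is homogeneous with rate `scale`.
print(increment_pmf(0, t=4.0, s=1.5, scale=2.0, shape=1.0))
print(log_likelihood([3, 5, 2], [0.0, 2.0, 4.0], s=1.5, scale=2.0, shape=1.0))
```

Evaluating this log-likelihood over a grid of parameter values and adding a log-prior gives an unnormalised posterior of the kind described above.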

Figure 5: Exponential fit for the inter-arrival times (axes: Data, Density; series: data, fit)

Many standard techniques can be used to estimate $\lambda$. Using the inter-arrival times and their distribution, $T_i \sim \mathrm{Exp}(\lambda)$, one can obtain an accurate estimate of $\lambda$ that gets refined with time. We use a Bayesian approach with a conjugate Gamma prior distribution for $\lambda$, which leads to a Gamma posterior distribution. This first approach is one way to provide a probability model for windows of homogeneity. However, it results in too many models. A more comprehensive approach is to develop a model for the whole day, taking into account the data for the day of the week and the season to which the day belongs. For example, in the case of a university building, the class semesters create a seasonal effect that must be taken into consideration. An appropriate model that generalizes the Poisson process is the non-homogeneous Poisson process (NHPP), where $\lambda$ varies with time and is not a constant.

A different approach is to include the effect of the day of the week and the season in the probability model. Following [4], we consider a model within a particular season where the days are numbered $1, 2, \dots, J$. Let $N_{j,k}$ be the number of arrivals on day $j$, $j = 1, \dots, J$, in the interval $[t_{k-1}, t_k]$, $k = 1, \dots, K$, with $N_{j,k} \sim \mathrm{Poisson}(\lambda_{j,k})$. Let $V_j$ be the daily total number of arrivals. Let $d_j$ represent the day of the week, where 1 is for Monday, 2 for Tuesday, ..., 7 for Sunday. First the following model is considered:

$$
\lambda_{j,k} = \lambda_{d_j}(t_k)\, V_j + v'
$$

where $\lambda_{d_j}(t_k)$ is the proportion of arrivals in the corresponding time interval, and $v'$ is an error, with $\sum_{k=1}^{K} \lambda_{d_j}(t_k) = 1$ for $d_j = 1, 2, \dots, 7$. We use the variance stabilizing transformation for Poisson data, which is based on the result of Brown et al. [5]: if $N$ is $\mathrm{Poisson}(\lambda)$, then $Y = \sqrt{N + 1/4}$ has approximately mean $\sqrt{\lambda}$ and variance $1/4$. In addition, as $\lambda \to \infty$, $Y$ is approximately Normal. Letting $y_{j,k} = \sqrt{N_{j,k} + 1/4}$, we have the model

$$
y_{j,k} = u_{d_j}(k)\, x_j + v
$$

where $x_j = \sqrt{V_j}$ and $u_{d_j}(k) = \sqrt{\lambda_{d_j}(t_k)}$ and now $v \sim N(0, \sigma^2)$. We next assume an autoregressive model for the daily effect $x_j$,

$$
x_j - \alpha_{d_j} = \beta\,(x_{j-1} - \alpha_{d_{j-1}}) + w
$$

where $w \sim N(0, \psi^2)$. Then, using the Bayesian recursive approach, starting with initial estimates, we derive

$$
p(x, \alpha, \beta, \sigma, \psi \mid y) \propto p(y \mid x, \sigma)\, p(x \mid \alpha, \beta, \psi)\, p(\alpha, \beta, \sigma, \psi)
$$

3.2 Time Spent in the System

$S_1, S_2, S_3, \dots$ are the times spent by each person arriving into the system, that is, the field of view. We cannot observe the $S_i$'s directly unless we track each person. In this research, the image processing task provides only $N(t)$, the number of people in the system at time $t$. Using $N(t)$, we determine exact estimates of the average time spent in the system and probability models for the $S_i$'s. It can easily be shown that for a cycle, where $N(t)$ starts at $N(t_1) = 0$ and returns to $N(t_2) = 0$,

$$
\int_{t_1}^{t_2} N(t)\,dt = \sum_{i=1}^{n} S_i ,
$$

where $n$ is the number of people who arrived to the system between $t_1$ and $t_2$, that is $n = \nu(t_2) - \nu(t_1)$, with $\nu(t)$ the arrival counting process. By collecting $\bar{S} = \sum_{i=1}^{n} S_i / n$ for all cycles of the same homogeneous time period, and taking the average, we obtain the exact sample average time spent in the system by a person, and an excellent estimator of the time spent in the system. Often, this estimate is enough for practical purposes. We further derive probability models for prediction.

3.2.1 The Probability Distribution of the Time Spent in the System

$\sum_{i=1}^{n} S_i$ is observed for all cycles. Consider the variables $Y = \sum_{i=1}^{n} S_i$. Using the collected data $(Y_1, Y_2, \dots)$ for the same $n$, and assuming a period of homogeneity, that is, the data are independent and identically distributed (IID), one can fit a probability distribution to the $Y_i$'s. From such a distribution, one would derive the distribution of the $S_i$'s. For example, if the $S_i$'s are Gamma distributed, then the sum $\sum_{i=1}^{n} S_i$ will be Gamma distributed. As it turns out, our studies have shown that the data fit a Gamma distribution well. However, this approach requires collecting a considerable amount of data, as one needs large samples $(Y_1, Y_2, \dots)$ all having the same number $n$ of people that have entered and stayed in the system. Instead, we prefer to use a more efficient Bayesian approach that makes use of all collected cycle data. Let $(Y_1, Y_2, \dots, Y_m)$ be the collected statistics for $m$ cycles, where

$$
Y_i = \sum_{j=1}^{n_i} S_{i,j}, \qquad i = 1, 2, \dots, m
$$

Using $(Y_1, Y_2, \dots, Y_m)$, we want to assess the probability distribution of $S_{i,j}$, the time spent in the system by a person. We assume that the $\{S_{i,j},\ i, j = 1, 2, \dots\}$ are independent and identically distributed random variables, $S_{i,j} \sim \mathrm{Gamma}(\alpha, \beta)$, $i, j = 1, 2, \dots$. That is

$$
f_{S_{i,j}}(x) = \frac{\beta^{\alpha}\, x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}
$$

We want to calculate

$$
p(S_{i,j} \mid y_1, y_2, \dots, y_m),
$$

where $(y_1, y_2, \dots, y_m)$ are the realizations of $(Y_1, Y_2, \dots, Y_m)$. Using the Chapman-Kolmogorov equation or the Law of Total Probability, we condition and average over all possible values of $\alpha$ and $\beta$. To do so, we use probability models for these two parameters, and make a few model simplifying assumptions. Let $(\alpha_1, \alpha_2, \dots, \alpha_K)$ be the most likely values for the shape parameter $\alpha$ of the $\mathrm{Gamma}(\alpha, \beta)$ distribution of $S_{i,j}$. Let $\mathrm{Gamma}(a, b)$ be the distribution of the scale parameter $\beta$. It is a natural conjugate prior distribution. Having discretized $\alpha$, let $p(\alpha)$ be the ensuing discrete prior distribution. If prior information is available on $\alpha$, it can be used to construct the discrete distribution $p(\alpha)$, either directly or through an Expert Opinion procedure [6, 7]. Otherwise, a flat discrete prior can be used, that is the Uniform distribution over the set $(\alpha_1, \alpha_2, \dots, \alpha_K)$. Further assume that, to start the procedure, $\alpha$ and $\beta$ are independent. This assumption is not a strong one, as the two parameters $\alpha$ and $\beta$ do not remain independent once the data is used. We have $p(S_{i,j} \mid y_1, \dots, y_m)$ equal to

$$
\sum_{\alpha} \int p(S_{i,j} \mid \alpha, \beta, y_1, \dots, y_m)\, p(\alpha, \beta \mid y_1, y_2, \dots, y_m)\, d\beta
$$

Given $(\alpha, \beta)$, $S_{i,j}$ is conditionally independent of $(y_1, \dots, y_m)$ and has the $\mathrm{Gamma}(\alpha, \beta)$ distribution. It remains to evaluate $p(\alpha, \beta \mid y_1, \dots, y_m)$, the posterior distribution of $(\alpha, \beta)$. Using Bayes theorem,

$$
\begin{aligned}
p(\alpha, \beta \mid y_1, \dots, y_m) &= \frac{1}{\kappa}\, p(y_1, \dots, y_m \mid \alpha, \beta)\, p(\alpha, \beta) \\
&= \frac{1}{\kappa} \prod_{i=1}^{m} p(y_i \mid \alpha, \beta)\, p(\alpha)\, p(\beta)
\end{aligned}
$$

where $\kappa$ is the normalizing factor,

$$
\kappa = \sum_{\alpha} \int p(y_1, \dots, y_m \mid \alpha, \beta)\, p(\alpha, \beta)\, d\beta
$$

Since $Y_i = \sum_{j=1}^{n_i} S_{i,j}$, $i = 1, 2, \dots$, is the sum of the independent Gamma random variables $S_{i,j}$, then $Y_i \sim \mathrm{Gamma}(n_i \alpha, \beta)$. $\kappa$ is computed in an inexpensive summation over the $K$ values of $\alpha$.
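The discretized posterior can be sketched as follows. The example assumes that $\beta$ enters the Gamma density as a rate, as in the density $f_{S_{i,j}}$ above, and that its $\mathrm{Gamma}(a, b)$ prior has shape $a$ and rate $b$. Under those assumptions $\beta$ integrates out in closed form, which leaves only the sum over the $K$ grid values of $\alpha$. The grid, the prior values and the cycle data are illustrative, and the function names are hypothetical.

```python
import math


def log_marginal(alpha, ys, ns, a, b):
    """log of the integral over beta of prod_i p(y_i | alpha, beta) * p(beta).

    Uses Y_i ~ Gamma(n_i * alpha, beta) with beta a rate, and a Gamma prior
    on beta with shape a and rate b (both parameterisations are assumptions).
    """
    total_shape = a + alpha * sum(ns)
    out = a * math.log(b) - math.lgamma(a)
    out += math.lgamma(total_shape) - total_shape * math.log(b + sum(ys))
    for y, n in zip(ys, ns):
        out += (n * alpha - 1.0) * math.log(y) - math.lgamma(n * alpha)
    return out


def alpha_posterior(alphas, ys, ns, a=1.0, b=1.0, prior=None):
    """Posterior p(alpha_k | y_1..y_m) on the grid, beta integrated out."""
    if prior is None:
        prior = [1.0 / len(alphas)] * len(alphas)      # flat discrete prior
    logs = [
        math.log(p) + log_marginal(alpha, ys, ns, a, b)
        for alpha, p in zip(alphas, prior)
    ]
    peak = max(logs)                                   # guard against underflow
    weights = [math.exp(value - peak) for value in logs]
    kappa = sum(weights)                               # the K-term summation
    return [w / kappa for w in weights]


# Illustrative cycle totals y_i (minutes) and head counts n_i.
ys = [34.0, 12.5, 51.0, 20.0]
ns = [3, 1, 4, 2]
alphas = [0.5, 1.0, 2.0, 4.0]
posterior = alpha_posterior(alphas, ys, ns)
print(dict(zip(alphas, posterior)))
```

Working in log space and subtracting the largest term before exponentiating keeps the normalization stable when the cycle totals are large.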


Semi-Closed System

A semi-closed system is a location that contains a closed system or several closed systems. The aim is to use the count of the number of people in the contained closed systems to estimate the overall number of people in the semi-closed system. Let $N(t)$ be the number of people in the semi-closed system at time $t$. Let $m(t)$ be the number of people in the contained closed systems at time $t$. $N(t) = f(m(t))$ is the relationship sought. In our research, we collect data on $N(t)$ and $m(t)$ by looking at a closed system containing other closed systems. We treat the system of interest as if it were semi-closed and as if we needed to estimate $N(t)$, which in fact we observe. This gives us data on both $N(t)$ and $m(t)$ and allows us to study their relationship.
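The paper does not specify the form of $f$. Purely as an illustration of how paired observations of $N(t)$ and $m(t)$ could be used, the sketch below assumes the simplest candidate, a proportional relationship $N(t) \approx c \cdot m(t)$, fitted by least squares through the origin. The model choice, the function names and the data are hypothetical and are not the authors' method.

```python
def fit_proportional(m_counts, n_counts):
    """Least-squares factor c in N ~ c * m (line through the origin)."""
    numerator = sum(m * n for m, n in zip(m_counts, n_counts))
    denominator = sum(m * m for m in m_counts)
    return numerator / denominator


def estimate_total(m_now, factor):
    """Estimate N(t) in the larger area from the camera-covered count m(t)."""
    return factor * m_now


# Illustrative paired observations, not data from the paper.
m_counts = [1, 2, 3]
n_counts = [2, 4, 6]
factor = fit_proportional(m_counts, n_counts)
print(factor, estimate_total(5, factor))
```

A probabilistic model for $f$, as the text calls for, would also quantify the uncertainty of the estimate; the point estimate here only shows the role of the paired data.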

Open System

An open system is a system that is not entirely covered by surveillance cameras and whose entrances and exits are not all covered by cameras. This means that it is not possible to know the exact count of the people in the system, and one can only estimate it. For example, it can be the entire building. There is no easy answer to this problem. There are two directions of research:

1. Studying the arrival process
2. Studying a model that takes into consideration all inside closed systems

Suppose there are $n$ entries to the open system and that $k$ are covered by cameras. By studying the arrival process $\nu(t)$ at each of the $k$ entrances and relating them in a probabilistic manner, we will be able to derive a model of the relationship between $N(t)$ and the $k$ entrance/exit processes $\{\nu_j(t), Z_j(t), j = 1, \dots, k\}$. It is clear that if $k$ were to equal $n$, then the system would become a closed system. $Z(t)$ is just the reverse of an arrival process. It is the arrival to the outside. Hence it will be modeled in the same manner as $\nu(t)$, using homogeneous and non-homogeneous stochastic processes. Now suppose that there are $l$ closed systems inside the open system under consideration. Let $N_1(t), N_2(t), \dots, N_l(t)$ be the exact count of people in those systems at time $t$. Then part of the research focuses on drawing a relationship between $N(t)$ and $N_1(t), N_2(t), \dots, N_l(t)$.

References

[1] W. Hu, T. Tan, L. Wang, S. Maybank, A Survey on Visual Surveillance of Object Motion and Behaviors. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, vol. 34, no. 3, pp. 334-352, Aug. 2004.
[2] C. Stauffer, W. Grimson, Adaptive background mixture models for real-time tracking. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 246-252, 1999.
[3] D. L. Snyder, M. I. Miller, Random Point Processes in Time and Space. Springer-Verlag, NJ, 1991.
[4] J. Weinberg, L. D. Brown, J. R. Stroud, Bayesian forecasting of an Inhomogeneous Poisson Process with applications to call center data. Technical Report, The Wharton School, Uni. of Pennsylvania, 2006.
[5] L. D. Brown, R. Zhang, L. Zhao, Root un-root methodology for non parametric density estimation. Technical Report, University of Pennsylvania.
[6] D. V. Lindley, Reconciliation of Probability Distributions. Operations Research, vol. 31, pp. 866-880, 1983.
[7] D. V. Lindley, N. D. Singpurwalla, Reliability and Fault Tree Analysis Using Expert Opinion. Journal of the American Statistical Association, vol. 81, pp. 87-90, 1986.
