Advances in Internet of Things, 2013, 3, 73-78

http://dx.doi.org/10.4236/ait.2013.34010 Published Online October 2013 (http://www.scirp.org/journal/ait)

Detection of Objects in Motion—A Survey of


Video Surveillance
Jamal Raiyn
Computer Science Department, Alqasemi College, Baka El Gariah, Israel
Email: [email protected]

Received August 1, 2013; revised September 4, 2013; accepted September 13, 2013

Copyright © 2013 Jamal Raiyn. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT
Video surveillance is one of the most important issues in the homeland security field. It is used as a security system because of its ability to track and detect a particular person. To overcome the limitations of the conventional video surveillance system, which is based on human perception, we introduce a novel cognitive video surveillance system (CVS) that is based on mobile agents. CVS offers important attributes such as suspect object detection and smart camera cooperation for people tracking. According to many studies, an agent-based approach is appropriate for distributed systems, since mobile agents can transfer copies of themselves to other servers in the system.

Keywords: Video Surveillance; Object Detection; Image Analysis

1. Introduction

Various papers in the literature have focused on computer vision problems in the context of multi-camera surveillance systems. The main problems highlighted in these papers are object detection and tracking, and site-wide, multi-target, multi-camera tracking. The importance of accurate detection and tracking is obvious, since the extracted tracking information can be used directly for site activity/event detection. Furthermore, tracking data is needed as a first step toward controlling a set of security cameras to acquire high-quality imagery and toward, for example, building biometric signatures of the tracked targets automatically. The security camera is controlled to track and capture one target at a time, with the next target chosen as the one nearest to the current target. These heuristics-based algorithms provide a simple and tractable way of computing.

Conventional video surveillance systems have many limitations. First, they have difficulty in automatically tracking a great number of people located at different positions at the same time. Second, the number of people that can be targeted is limited by the extent of the users' involvement in manually switching the view from one video camera to another. With a cognitive video surveillance system, mobile agent technologies are more effective and efficient than conventional video surveillance systems, assuming that a large number of servers with video cameras are installed. If one mobile agent can track one person, then multiple mobile agents can track numerous people at the same time, and the server balances the processing load of the mobile agents operating on each server with a camera.

We consider a scenario in which the smart camera captures two similar objects (e.g., twins), and each object then takes a different path. The tracking process becomes confused. Furthermore, a smart camera is limited to covering a certain zone in a public (indoor) place. The next section introduces many solutions that have been suggested for the above problem. The suggested solutions extend the conventional video surveillance system in various ways.

One group of approaches uses an active camera to track a person automatically, so that the security camera moves in synchrony with the projected movement of the targeted person. These approaches are capable of locating and tracking a small number of people. Another common approach is to position cameras at strategic surveillance locations. This is not always feasible, because the number of cameras necessary for full coverage may exceed the available resources. A third approach is to identify and track numerous targeted people at the same time, involving image processing and installation of video cameras at any designated location,

Copyright © 2013 SciRes. AIT


since the image processing increases server load.

The limitation of the human perception system in conventional video surveillance increases the demand for developing a cognitive surveillance system. Many of the proposed video surveillance systems are expensive and lack the capabilities of a cognitive monitoring system, such as image analysis. As a result, such systems cannot send warning signals autonomously in real time, before incidents happen. Furthermore, it is difficult and might take a long time for people to locate suspects in the video after the incidents happen. The problem may become more complex in larger-scale surveillance systems. The next generation of video surveillance systems is expected to solve not only the issues of detection and tracking but also the issue of human body analysis. Many references on the development of sophisticated video surveillance systems can be found in the literature. In this paper, we introduce the cognitive video surveillance system (CVS). CVS aims to offer meaningful characteristics such as automation, autonomy, and real-time surveillance, including face recognition, suspect object and target detection, and the use of cooperative smart cameras. Many face recognition systems take a video sequence as input. Those systems may be required not only to detect but also to track faces. Face tracking is essentially a motion estimation problem. Face tracking can be performed using many different methods, e.g., head tracking, feature tracking, image-based tracking, and model-based tracking. These are different ways to classify these algorithms.

2. Review of Human Body Analysis

This section introduces various approaches to object detection and object tracking in the video surveillance field [1-3]. The analysis of human body movements can be applied in a variety of application domains, such as video surveillance, video retrieval, human-computer interaction systems, and medical diagnoses. In some cases, the results of such analysis can be used to identify people acting suspiciously and other unusual events directly from videos. Many approaches have been proposed for video-based human movement analysis [4-6].

In [7], Oliver et al. developed a visual surveillance system that models and recognizes human behavior using hidden Markov models (HMMs) and a trajectory feature. The authors of [8-10] proposed a probabilistic posture classification scheme to identify several types of movement, such as walking, running, squatting, or sitting. In [11], the negative minimum curvatures along body contours were traced to segment body parts, and body postures were then identified using a modified Iterative Closest Point (ICP) algorithm. In addition, [12,13] used different morphological operations to extract skeletal features from postures and then identified movements using an HMM framework. Another approach used to analyze human behavior is the Gaussian probabilistic model. In [14], the real-time Pfinder system for detecting and tracking humans is described. In [15], a shape-based approach to object classification is applied after background subtraction based on frame differencing. The goal is to detect humans for threat assessment.

In [16], a method to detect and track a human body in a video is presented. First, background subtraction is performed to detect the foreground object, which involves temporal differencing of consecutive frames. In [17], a novel approach to detecting pedestrians is presented, which is shown to work well in an indoor environment. It makes use of a new sensing device, which gives depth information and image information simultaneously. The method proposed in [18] deals with the direct detection of humans from static images as well as video, using a classifier trained on human shape and motion features. The training dataset consists of images and videos of human and non-human examples. In [19], the use of mobile agents for multi-node wireless video cooperation is suggested in order to reduce the redundancy that results from repeated information collection in overlapping regions. In [20], an automatic human tracking system based on a video surveillance system enhanced with mobile agent technologies is introduced. In [21-23], a composite approach to human detection is proposed, which first uses skin color and motion information to find the candidate foreground objects and then uses a more sophisticated technique to classify the objects. Other approaches extract human postures or body parts (such as the head, hands, torso, or feet) to analyze human behavior.

Motion Detection

This section aims to present the state of the art of the different motion detection and estimation techniques. Many studies have addressed the subject, and the literature in this area is plentiful. The idea is to give an overview of the most commonly used methods and approaches. The most widely used algorithms for moving object detection are based on background subtraction. Background subtraction compares the current video frame (containing the foreground objects) with one of the previous frames, which is sometimes called the background.

3. Video Surveillance System

In this section we introduce the system model of the video surveillance system. The video surveillance system is used for monitoring, real-time image capturing, processing, and analysis of surveillance information. The infrastructure of the system model is divided into
three main layers: mobile agents that are used to track suspect objects, cognitive video surveillance (CVS) management, and a communication protocol, as shown in Figure 1. Each end device, a smart camera, covers a certain zone or cell. The smart camera is used for collecting parameters of the human face.
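
As a minimal sketch of the frame-differencing form of background subtraction described under Motion Detection above, the following Python fragment marks a pixel as moving when its intensity differs from a reference (background) frame by more than a threshold. The function names, the nested-list grayscale frame format, and the threshold value of 25 are illustrative assumptions and are not taken from the system described in this paper.

```python
from typing import List

Frame = List[List[int]]  # grayscale image: rows of 0-255 intensities (assumed format)


def difference_mask(background: Frame, current: Frame, threshold: int = 25) -> Frame:
    """Return a binary mask with 1 where the current frame differs from the background."""
    same_shape = len(background) == len(current) and all(
        len(b) == len(c) for b, c in zip(background, current)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")
    return [
        [1 if abs(c - b) > threshold else 0 for b, c in zip(bg_row, cur_row)]
        for bg_row, cur_row in zip(background, current)
    ]


def moving_pixel_count(mask: Frame) -> int:
    """Count the pixels flagged as foreground (moving)."""
    return sum(sum(row) for row in mask)


# Hypothetical 3x4 frames: a bright object appears in the middle row.
background = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
current = [[10, 10, 10, 10], [10, 200, 210, 10], [10, 10, 12, 10]]
mask = difference_mask(background, current)
```

The small change from 10 to 12 in the last row stays below the threshold, which illustrates why a threshold is usually needed to suppress sensor noise; choosing it well is scene dependent.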

3.1. Communication Protocol

Two communication protocols are introduced in the system model. The first is an agent-to-agent protocol, which agents use to communicate with one another. It is based on message exchange, as shown in Figure 2, and its goal is to keep the agents updated. The second protocol is used for communication between the CVS and the mobile agents.
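
The message-based agent-to-agent protocol can be illustrated with a small sketch. Since the message layout of Figure 2 is not reproduced here, everything below is hypothetical: the `Message` fields, the `"UPDATE"` message type, and the agent and zone names are assumptions chosen only to show how an exchange of messages might keep agents updated.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    kind: str                 # hypothetical message type, e.g. "UPDATE"
    payload: Dict[str, str]   # hypothetical content: suspect id -> zone


@dataclass
class Agent:
    name: str
    state: Dict[str, str] = field(default_factory=dict)
    inbox: List[Message] = field(default_factory=list)

    def send_update(self, other: "Agent", payload: Dict[str, str]) -> None:
        """Deliver an UPDATE message to another agent's inbox."""
        other.inbox.append(Message(self.name, other.name, "UPDATE", payload))

    def process_inbox(self) -> int:
        """Apply all pending UPDATE messages to the local state; return how many were handled."""
        handled = 0
        for msg in self.inbox:
            if msg.kind == "UPDATE":
                self.state.update(msg.payload)
                handled += 1
        self.inbox.clear()
        return handled


agent_a = Agent("agent_cam1")
agent_b = Agent("agent_cam2")
agent_a.send_update(agent_b, {"suspect_42": "zone_1"})
handled = agent_b.process_inbox()
```

In a real deployment the inbox would be replaced by a network transport between smart camera stations; the in-memory list is used here only to keep the sketch self-contained.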

3.2. Mobile Agent Features

Mobile agents are placed in smart camera stations. A mobile agent aims to track the suspect object from one smart camera station to another. Mobile agents offer various characteristics, e.g., negotiation, decision making, roaming, and cloning.

Figure 2. Agent protocol.
3.3. Cognitive Video Surveillance Management

Cognitive video surveillance (CVS) manages mobile agent handoff in wireless networks. CVS provides the mobile agent with information. Based on the received information, mobile agents decide when and where to move to the next smart camera station.

3.4. Tracking Moving Objects

In order to track moving objects, we introduce two strategies. The first strategy is based on a messaging protocol (msg_protocol). The goal of this msg_protocol is to inform the mobile agent about the position of the suspect object. The second strategy uses the protocol to help the mobile agent roam from one point to another.

4. Methodology

Cognitive video surveillance (CVS) uses a database of images. Pixels are described by a set of binary sequences. Each sequence represents certain properties (color). The database is divided into two separate sets of pixels: the training set and the test set. Both sets contain pixels that belong to a certain family of colors (attributes) and sequences that do not belong.

$$TP:\ X = \{X_1, X_2, \ldots, X_n\}$$

$$TN:\ Y = \{Y_1, Y_2, \ldots, Y_n\}$$

Each image is then divided into frames, a frame being a subset of pixels from the sequence. The number of pixels in each frame is variable and is set dynamically to obtain optimal results.

$$X_1 = \{x_1^1, x_2^1, \ldots, x_n^1\}$$

$$X_2 = \{x_1^2, x_2^2, \ldots, x_n^2\}$$

$$\vdots$$

$$X_n^m = \{x_1^m, x_2^m, \ldots, x_n^m\}$$
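
The representation described in this section, pixels as binary sequences and frames as subsets of consecutive pixels, can be sketched as follows. The 8-bit-per-channel RGB encoding, the function names, and the choice of overlapping windows that advance by one pixel are assumptions made for illustration; the paper states only that the frame size is variable and set dynamically.

```python
from typing import List, Sequence, Tuple


def pixel_to_bits(rgb: Tuple[int, int, int], bits_per_channel: int = 8) -> str:
    """Encode one RGB pixel as a binary sequence (assumed 8 bits per channel)."""
    return "".join(format(channel, f"0{bits_per_channel}b") for channel in rgb)


def sliding_frames(pixels: Sequence[str], frame_size: int) -> List[List[str]]:
    """Cut a pixel sequence into overlapping frames that advance by one pixel."""
    if frame_size <= 0 or frame_size > len(pixels):
        raise ValueError("frame_size must be between 1 and the number of pixels")
    return [
        list(pixels[start:start + frame_size])
        for start in range(len(pixels) - frame_size + 1)
    ]


encoded = pixel_to_bits((255, 0, 1))

# Twelve placeholder pixels labelled "1".."12", cut into frames of ten pixels.
labels = [str(i) for i in range(1, 13)]
frames = sliding_frames(labels, frame_size=10)
```

With twelve pixels and a frame size of ten, the sketch yields three frames covering pixels 1 to 10, 2 to 11, and 3 to 12. Making `frame_size` a parameter reflects the statement that the number of pixels per frame is tuned rather than fixed.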
Figure 1. System model.

If, for example, a certain frame is comprised of 200 segments, the frames might consist of pixels 1 to 10, 2 to 11, 3 to 12, etc. Statistical methods are then applied to find correlations between certain properties of the frame. The basic logic of statistical differentiation of pixels is known and widely used in many prediction systems.

$$J = X\ Y$$

$$J = \begin{cases} 1 & \text{if } x \ \ y \\ 0 & \text{otherwise} \end{cases}$$

A large number of correlating factors is defined by CVS and grouped in sets. A number is linked with each correlating factor. Each factor is then turned into a single number that represents the strength of the correlation for each frame with respect to the probability that the frame belongs to the given family or not. As a result, we have a large number of frames, and for each frame we have a number that is correlated with the probability that the frame belongs to a certain attribute (color similarity) or does not.

$$J = \{J_1^1, J_2^2, J_3^3, J_4^4\}$$

Optimization of J:

$$J_{\text{Prediction}} \ \ J_1^{*} \ \ J_{\text{demand}} \ \ k \ \ J$$

In addition to the statistical method, an innovative method of logical XOR multiplication of matrices is applied to enrich the number of frames that potentially contribute to the prediction model.

CVS can be implemented in a dynamic environment: when the training databases are modified, the prediction mechanism is modified as well, with improved prediction capabilities.

5. Smoothing EMA

In this section we introduce a detection model that is based on a moving average scheme. There are three types of moving average: simple moving average (SMA), weighted moving average (WMA), and exponential moving average (EMA). In this study, an exponential moving average is considered. An exponential moving average uses a weighting, or smoothing, factor: the weight of each older data point decreases exponentially, giving much more importance to recent observations while not discarding the older observations entirely. The detection phase focuses on the analysis of the collected data. To increase the accuracy of the forecast model, the abnormal events in the collected data should be considered. The forecast scheme is based on the exponential moving average. The exponential smoothing forecast is robust and accurate. Its accuracy depends on the smoothing factor alpha, the weight given to the current demand. To determine the optimal value of alpha, curve fitting has been used.

6. Performance Analysis

We used the object-oriented programming language C# to represent the image in the binary system, as shown in Figure 3. The binary vectors are then processed in the WEKA platform. WEKA stands for Waikato Environment for Knowledge Analysis; it implements many machine learning and data mining algorithms. As shown in Figures 4(a) and (b), the image analysis in visual form is based on color classification. WEKA considers the color of the image. The colors are represented in the binary system. WEKA clusters the binary vectors. Each cluster represents certain attributes. As shown in Figure 5, the comparison between the simple moving average (SMA), weighted moving average (WMA), and exponential moving average (EMA) is based on the mean absolute error (MAE). Furthermore, we compared the actual observations with the EMA model, as shown in Figure 6. The results indicate that all three moving average methods have more or less similar performance in short-term forecasting. However, as one would expect, the method using optimized weights produced slightly better forecasts at a higher computational cost. The quality of the forecast diminishes as the time for which forecasts are made lies farther in the future. Moving average methods overestimate travel speeds in slow-downs and underestimate them when the congestion is clearing up and speeds are increasing.

Figure 3. Image representation in binary system.

Figure 5. Comparison between MA schemes (vertical axis: MAE, 0.00 to 8.00; horizontal axis: SMA, WMA, EMA).
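
The comparison of the three moving average schemes by MAE, described in Sections 5 and 6, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used for Figures 5 and 6: the window length of 3, the linear WMA weights, the made-up observation series, and all function names are hypothetical, and a simple search over candidate alpha values stands in for the curve fitting mentioned in Section 5.

```python
from typing import Callable, Dict, Sequence


def sma_forecast(history: Sequence[float], window: int = 3) -> float:
    """Simple moving average of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def wma_forecast(history: Sequence[float], window: int = 3) -> float:
    """Weighted moving average; linear weights 1..k are an assumed choice."""
    recent = history[-window:]
    weights = range(1, len(recent) + 1)  # newest observation gets the largest weight
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def ema_forecast(history: Sequence[float], alpha: float = 0.5) -> float:
    """Exponential moving average: level = alpha * x + (1 - alpha) * level."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def one_step_mae(data: Sequence[float],
                 forecaster: Callable[[Sequence[float]], float],
                 warmup: int = 3) -> float:
    """Mean absolute error of one-step-ahead forecasts, each made from data[:t]."""
    if len(data) <= warmup:
        raise ValueError("need more observations than the warm-up length")
    errors = [abs(data[t] - forecaster(data[:t])) for t in range(warmup, len(data))]
    return sum(errors) / len(errors)


def best_alpha(data: Sequence[float], candidates: Sequence[float], warmup: int = 3) -> float:
    """Pick the candidate alpha with the lowest one-step MAE (stand-in for curve fitting)."""
    return min(
        candidates,
        key=lambda a: one_step_mae(data, lambda h: ema_forecast(h, a), warmup),
    )


observations = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]  # made-up data
alpha = best_alpha(observations, [0.1, 0.3, 0.5, 0.7, 0.9])
mae_by_scheme: Dict[str, float] = {
    "SMA": one_step_mae(observations, sma_forecast),
    "WMA": one_step_mae(observations, wma_forecast),
    "EMA": one_step_mae(observations, lambda h: ema_forecast(h, alpha)),
}
```

Because the alpha is selected on the same series it is evaluated on, the EMA error in such a comparison is optimistic; a held-out portion of the data would give a fairer estimate.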

7. Conclusion

In this paper, we discussed several methods from the recent literature for human detection from video. We have organized them into techniques that use background subtraction and techniques that operate directly on the input. In the first category, we have ordered the techniques based on the type of background subtraction used and the model used to represent a human. In the second category, we have ordered the techniques based on the human model and classifier model used. Overall, there seems to be an increasing trend in the recent literature towards robust methods that operate directly on the image rather than those that require background subtraction as a first step. The EMA model can be used for predicting human behavior.

Figure 6. Actual observation vs. forecasting model.
Figure 4. (a) Image analysis; (b) Color classification.

REFERENCES
[1] R. T. Collins, A. J. Lipton, T. Kanade, H. Fujiyoshi, D. Duggins, Y. Tsin, D. Tolliver, N. Enomoto, O. Hasegawa, P. Burt and L. Wixson, “A System for Video Surveillance and Monitoring,” Robotics Institute, Carnegie Mellon University, Pittsburgh, 2000.
[2] I. Haritaoglu, D. Harwood and L. S. Davis, “W4: Real-Time Surveillance of People and Their Activities,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, 2000, pp. 809-830. http://dx.doi.org/10.1109/34.868683
[3] S. Kwak and H. Byun, “Detection of Dominant Flow and Abnormal Events in Surveillance Video,” Optical Engineering, Vol. 50, No. 2, 2011, pp. 1-8.

[4] Z. Xu and H. R. Wu, “Smart Video Surveillance System,” Proceedings of the IEEE International Conference on Industrial Technology, 14-17 March, pp. 285-290.
[5] S. Aramvith, et al., “Video Processing and Analysis for Surveillance Applications,” International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2009), 7-9 January 2009, Kanazawa, pp. 607-610.
[6] P. Bottoni, “A Dynamic Environment for Surveillance,” Proceedings of the 12th IFIP TC 13 International Conference on Human-Computer Interaction, Uppsala, 24-28 August, 2009, pp. 892-895.
[7] N. M. Oliver, B. Rosario and A. P. Pentland, “A Bayesian Computer Vision System for Modeling Human Interactions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, 2000, pp. 831-843. http://dx.doi.org/10.1109/34.868684
[8] D. Weinland, R. Ronfard and E. Boyer, “A Survey of Vision-Based Methods for Action Representation, Segmentation and Recognition,” Computer Vision and Image Understanding, Vol. 115, No. 2, 2011, pp. 224-241. http://dx.doi.org/10.1016/j.cviu.2010.10.002
[9] I. Karaulova, P. Hall and A. Marshall, “A Hierarchical Model of Dynamics for Tracking People with a Single Video Camera,” Proceedings of the British Machine Vision Conference, 2000, pp. 262-352.
[10] Y. Ren, et al., “Detection and Tracking of Multiple Target Based on Video Processing,” 2009 Second International Conference on Intelligent Computation Technology and Automation, Changsha, 10-11 October 2009, pp. 586-589.
[11] M. B. Augustin, S. Juliet and S. Palanikumar, “Motion and Feature Based Person Tracking in Surveillance Videos,” Proceedings of ICETECT 2011, Tamil Nadu, 23-24 March 2011, pp. 605-609.
[12] T. J. Broida and R. Chellappa, “Estimation of Object Motion Parameters from Noisy Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 8, No. 1, 1986, pp. 90-99. http://dx.doi.org/10.1109/TPAMI.1986.4767755
[13] Y. Su, et al., “Surveillance Video Sequence Segmentation Based on Moving Object Detection,” 2009 Second International Workshop on Computer Science and Engineering, Qingdao, 28-30 October 2009, pp. 534-537.
[14] C. Wren, A. Azarbayejani, T. Darrell and A. Pentland, “Pfinder: Real-Time Tracking of the Human Body,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, 1997, pp. 780-785. http://dx.doi.org/10.1109/34.598236
[15] M. Ahmad and S.-W. Lee, “HMM-Based Human Action Recognition Using Multi View Image Sequences,” International Conference on Pattern Recognition, Vol. 1, 2006, pp. 263-266.
[16] Y. Kuno, T. Watanabe, Y. Shimosakoda and S. Nakagawa, “Automated Detection of Human for Visual Surveillance System,” Proceedings of the 13th International Conference on Pattern Recognition, Vienna, 25-29 August 1996, pp. 865-869. http://dx.doi.org/10.1109/ICPR.1996.547291
[17] H. Gou, et al., “Implementation and Analysis of Moving Objects Detection in Video Surveillance,” Proceedings of the 2010 IEEE International Conference on Information and Automation, Harbin, 20-23 June 2010, pp. 154-158.
[18] S. Wang, et al., “A Mobile Agent Based Multi-Node Wireless Video Collaborative Monitoring System,” The 3rd International Conference on Advanced Computer Theory and Engineering, Chengdu, 20-22 August 2010, pp. 35-39.
[19] H. Kakiuch, et al., “Detection Methods Improving Reliability of Automatic Human Tracking System,” 2010 4th International Conference on Emerging Security Information, Systems and Technologies, Washington DC, 2010, pp. 240-246.
[20] W. Y. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld, “Face Recognition: A Literature Survey,” ACM Computing Surveys, Vol. 35, No. 4, 2003, pp. 399-458. http://dx.doi.org/10.1145/954339.954342
[21] T. S. Ling, L. K. Meng, L. M. Kuan, Z. Kadim and A. A. Baha Al-Deen, “Colour Based Object Tracking in Surveillance Application,” Proceedings of the International Multi-Conference of Engineers and Computer Scientists, Hong Kong, 18-20 March 2009, pp. 459-464.
[22] B. Schiele, “Model-Free Tracking of Cars and People Based on Color Regions,” Image and Vision Computing, Vol. 24, No. 11, 2006, pp. 1172-1178. http://dx.doi.org/10.1016/j.imavis.2005.06.003
[23] Z. Zhang, “Head Detection for Video Surveillance Based on Categorical Hair and Skin Colour Models,” The 16th IEEE International Conference on Image Processing, Cairo, 7-10 November 2009, pp. 1137-1140.
