A Survey On Vehicle Detection and Tracking Algorithms in Real Time Video Surveillance
Abstract: Automated surveillance systems are of critical importance in traffic management and in monitoring unwanted activities. The intelligent transportation system plays a crucial role in traffic management by providing an efficient and reliable transportation system. One of its applications is to detect and track vehicles accurately. Image processing algorithms have been widely developed to monitor the motion of vehicles, humans and other objects. The main aim is to detect and recognize moving objects in real surveillance videos in order to avoid congestion on highways and in parking areas and to prevent accidents. In comparison with still images, every video frame provides rich information about vehicles in scenarios that change over time. Many algorithms have been developed to improve efficient real-time detection of incidents, and it remains a challenging task for researchers to determine driver behaviour under the diversity of vehicles, weather and lighting conditions. In this paper, a detailed overview of object motion detection, classification and tracking algorithms is presented, and the strengths and weaknesses of the various algorithms are discussed.
Index Terms: Intelligent Transportation System, Vehicle detection, Foreground detection, Object classification, Tracking, Feature Extraction, Occlusion,
Surveillance Systems
————————————————————
and store the information for future verification. The progression of technology in video surveillance systems and safer driver-assistance systems enables researchers to understand these scenarios more effectively; some scenarios and applications are described below.

1.1 A SURVEY IN VIDEO SURVEILLANCE
This survey discusses many research works on object classification, detection, and tracking. Such a system is required for preventing crime and accidents and for ensuring the safety of the public.

2 MOVING OBJECT DETECTION
Each application that benefits from smart video processing has different needs and thus requires different handling of objects. However, they all have something in common: moving objects. In every vision system, detecting moving objects such as people and vehicles in the video is a common first step. Moving object detection consists of preprocessing, feature extraction, classification, detection, and tracking. Algorithms such as background subtraction, optical flow methods, statistical methods, frame differencing and temporal differencing are the most frequently used techniques and are described below.

2.1 BACKGROUND SUBTRACTION
The background subtraction technique is widely used for motion segmentation in many applications. It finds the moving regions in images by subtracting the pixels of the current image from a reference background image, which is formed by averaging images. If the subtracted pixel value is greater than the threshold, the pixel is marked as foreground. To enhance the detected regions, post-processing operations such as dilation, erosion, and closing are performed to reduce the noise level. Approaches to background subtraction differ in how they perform foreground detection, background maintenance, and post-processing. Heikkilä and Silvén used the simplest version, in which a pixel at location (x, y) in the current image I_t is marked as foreground if

|I_t(x, y) − B_t(x, y)| > τ    (1)

where B_t is the background image and τ is a predefined threshold [1]. An Infinite Impulse Response (IIR) filter was used to update the background image:

B_{t+1} = α I_t + (1 − α) B_t    (2)

Foreground pixel maps are then created by eliminating small-sized regions and applying morphological closing. Though background subtraction techniques are effective, their performance degrades with dynamic changes such as stationary objects uncovering the background (e.g. a parked bus moving out of a parking space) or sudden light changes.
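As a concrete illustration of equations (1) and (2), the following minimal Python/OpenCV sketch maintains a running background with an IIR update and thresholds the absolute difference between the current frame and the background; the learning rate, threshold and kernel size are illustrative values, not taken from [1].

    import cv2
    import numpy as np

    def background_subtraction(video_path, alpha=0.05, tau=30):
        cap = cv2.VideoCapture(video_path)
        ok, frame = cap.read()
        background = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
            # Eq. (1): a pixel is foreground if |I_t - B_t| > tau
            foreground = (np.abs(gray - background) > tau).astype(np.uint8) * 255
            # Post-processing: morphological closing suppresses small noisy regions
            foreground = cv2.morphologyEx(foreground, cv2.MORPH_CLOSE, kernel)
            # Eq. (2): IIR background update, B <- alpha*I + (1 - alpha)*B
            background = alpha * gray + (1 - alpha) * background
            cv2.imshow("foreground mask", foreground)
            if cv2.waitKey(30) & 0xFF == 27:
                break
        cap.release()

As the section notes, such a pipeline still fails when a stationary object leaves the scene or the illumination changes abruptly, because the background model adapts only slowly.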
2.3 TEMPORAL DIFFERENCING
In temporal differencing, the reference image is the previous frame. A new image is obtained when the difference between the previous frame and the current frame is greater than the threshold value. This method is highly recommended for dynamic scene environments, but it has some disadvantages in the detection of moving objects. For example, if an object is of a single colour, it fails to detect the whole pixel region even when the object is moving, and it is ineffective for static scenes. For high-level processing and to detect stopped objects, other techniques must be incorporated. Lipton et al. [3] presented two-frame differencing, in which the foreground is defined by

|I_t(x, y) − I_{t−1}(x, y)| > τ    (4)

In order to resolve the defects of two-frame differencing, Collins et al. [4] used a hybrid three-frame differencing method. Video Surveillance and Monitoring (VSAM) is the highly recommended technique for observing moving objects in a sequence of images. This hybrid algorithm performs motion segmentation by combining background subtraction with a three-frame differencing technique, and it detects moving objects quickly.

2.4 OPTICAL FLOW
Optical flow is based on motion segmentation and detects moving objects even when the camera is moving. However, it is computationally complex and sensitive to noise.

2.5 SHADOW AND LIGHT CHANGE DETECTION
The motion detection algorithms described above have been used for real-time surveillance for years and perform well in indoor and outdoor environments. However, without special handling, most of these algorithms are vulnerable to both local illumination changes (e.g. shadows and highlights) and global illumination changes (e.g. the sun being covered or uncovered by clouds). Motion detection is inaccurate when the moving objects are followed by shadows, and object classification also fails in the presence of shadows and sudden light changes. In the background subtraction and shadow detection method, pixels are represented by a colour model that separates brightness from the chromaticity component. The pixels in the image are divided into four types (background, shaded background or shadow, highlighted background, and moving foreground object) by calculating the distortion of brightness and chromaticity between the background scene and the current image pixels. Shadows can be detected by two kinds of methods: statistics based and video based. In statistics-based methods, the observation that intensity values in the shadow region are lower than those of the background is analysed. In video-based methods, the differences between neighbouring pixels in intensity, geometry, colour and brightness are calculated.

3 OBJECT CLASSIFICATION
The regions detected in the video contain different moving objects such as vehicles, humans, animals, and so on. To track them without difficulty, we need to distinguish the types of objects in the detected video so that they can be analysed properly.

3.1 MOTION BASED CLASSIFICATION
Motion-based classification usually distinguishes non-rigid objects (e.g. humans) from rigid objects (e.g. vehicles) by temporal motion features. The method is based on the temporal self-similarity of a moving object: if an object exhibits periodic motion, its self-similarity measure also shows periodicity, and this cue is used to categorize moving objects by periodicity. Rigid and non-rigid objects can also be identified by optical flow analysis. A. J. Lipton [3] proposed a method which applies a local optical flow technique to the regions of detected objects. High residual flow is present in non-rigid objects (humans), whereas rigid objects (vehicles) show little residual flow. The motion of pedestrians has a periodicity due to the generated residual flow, and by using this property they can be differentiated from other objects such as vehicles.

3.2 FIRE DETECTION
Discussions of fire detection are rare in the computer vision literature. Many methods exploit colour and extract the motion features of fire, but they also generate false alarms in the presence of fire-coloured segments. Spectral, spatial and temporal models are defined to detect fire in video: the spectral model is a pixel colour probability density of fire, the spatial model describes the spatial structure of the fire region, and the temporal model captures changes of that spatial structure.

3.3 OBJECT DETECTION
The most important technique in the field of intelligent transportation systems is object detection, in which targets such as cars and traffic signs are detected. For detection, the shape of the car and spatial and temporal information from traffic signs are the extracted features [5], [6]. Optical flow is a technique for motion segmentation and detection; it detects moving objects even when the camera is moving and can detect and track moving objects in aerial views with better accuracy than background subtraction. However, optical flow methods are complex to process and sensitive to noise. The results of the Horn and Schunck method [7] were found to perform better than the Lucas and Kanade method [8] for detecting motion between frames in aerial views. A lot of research has been carried out on the Horn-Schunck and Lucas-Kanade methods for enhancing optical flow, and many algorithms have been proposed to detect motion in different scenarios [9], [10], [11], [12], [13]. In [11], optical flow with indoor fixed cameras is used for detecting objects in video streams under the existing brightness; it was applied in motion detection software for analysing the motion level, the motion region and the number of objects, and its only drawback is its sensitivity to changes in object velocity and lighting in the area. With static cameras, tracking of moving objects is done by combining motion segmentation and an optical flow algorithm in [9]. Optical flow does not depend on foreground or background regions; it performs segmentation using pixel-by-pixel classification. In [14], the optical flow algorithm was used on silhouette regions with a 2-way ANOVA, and brightness change was minimized by object segmentation. In [15], crowd monitoring in videos was done with the Horn-Schunck method; the proposed technique detects and tracks outdoor scenes based on optical flow. In [12], outdoor scenes are detected with edge detection and gradient-based optical flow; edge-detection-based techniques are more robust and are not vulnerable to light changes. For cameras in motion, moving objects are detected by classification and motion clustering in [10].
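To make the Lucas-Kanade side of this comparison concrete, the sketch below tracks sparse corner features between two consecutive grayscale frames with OpenCV's pyramidal Lucas-Kanade implementation; the corner and window parameters are illustrative, and the displacement threshold used to flag points as "moving" is an assumption rather than a value reported in the cited works.

    import cv2
    import numpy as np

    def lucas_kanade_motion(prev_gray, curr_gray, min_displacement=1.0):
        # Select Shi-Tomasi corner features in the previous (grayscale) frame
        p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                     qualityLevel=0.01, minDistance=7)
        if p0 is None:
            return np.empty((0, 2)), np.empty((0, 2))
        # Pyramidal Lucas-Kanade: estimate where each corner moved to
        p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None,
                                                 winSize=(21, 21), maxLevel=3)
        good_old = p0[status.flatten() == 1].reshape(-1, 2)
        good_new = p1[status.flatten() == 1].reshape(-1, 2)
        # Points whose displacement exceeds the threshold are treated as moving
        displacement = np.linalg.norm(good_new - good_old, axis=1)
        moving = displacement > min_displacement
        return good_old[moving], good_new[moving]

This local, windowed formulation is exactly why Lucas-Kanade behaves well when the flow is roughly constant within each neighbourhood, whereas the global smoothness assumption of Horn-Schunck favours scenes with smoothly varying flow.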
In [9], a fusion of Horn-Schunck flow fields uses small squares in aerial colour images to estimate the flow field in each colour plane, and those fields are fused together. In [16], the Lucas-Kanade optical flow method is combined with a stereo camera for UAVs. In [17], [18], a combined fusion control scheme was used for urban-area navigation. Lucas-Kanade provides promising results if the flow is constant within the pixel neighbourhood in [19], [20]; the equation is derived from the local neighbourhood by least squares in [17], [21]. The Horn-Schunck technique gives the best results if the flow is smooth throughout the entire frame, e.g. when the object motion is not restricted within some neighbourhood, in [22], [23]. Most researchers use hybrid methods for motion detection; a hybrid method combines two or more methods of different kinds to get rid of motion detection problems. In [17], two optical flow techniques were compared and their performance was evaluated. In [23], eight optical flow algorithms were tested on synthetically generated data with added noise and high complexity, and the authors conclude that the method in [8] provided the best results. The area of a moving object was detected by a hybrid method of temporal differencing and optical flow in [24]: the difference between frames is calculated by the temporal difference method, the differential image is filtered using a low-pass filter and edge detection techniques, and the optical flow algorithm is used to find the velocity from the spatiotemporal derivatives of image intensity. In [24], the results from the temporal difference and optical flow technique were quite promising for a static camera but not for a camera in motion.

Many motion detection algorithms have been proposed, and most of them use the simple operation of thresholding the difference in image intensity, e.g. the initial frame is compared with background frames from consecutive frames of the video; these depend on algorithms of the simplest form, yet their performance is not promising [25]. In [26], [27], [28], [29], statistical and probabilistic models are used to increase the performance of background subtraction. The performance of these algorithms depends mainly on the threshold value, and various methods of threshold adaptation are described in [25]. By choosing a Markov random field (MRF) in a Bayesian framework [29], the most promising detection results are found by frame differencing and modelling of change labels. In [30], a Bayesian Markov random field (MRF) method was used to increase performance through the use of the shape of the detected objects and noise reduction; however, the most important work is to remove the background and the pixel correlation between frames. A Bayesian algorithm [29] was used to extract the shape of the moving object; it deals very well with duplicate motions of objects, variation in light, noise reduction and shadow removal, and the results and performance of the algorithm are promising. In [31], a Sobel filtering method was used so that low-quality web cameras in laptops can process moving images while using the same low-end hardware; the initial algorithm is quite fast and the second one deals with edge detection of the object, and the result shows 45.5% of the object-detection time with 14% memory use while maintaining the same level of accuracy. In [32], an enhanced Dynamic Bayesian Network (DBN) technique is used for vehicle detection in aerial surveillance, and this method is found to be flexible. In [33], multiple objects are tracked using a spatial and second-derivative detection and tracking model, but it cannot track multiple objects in low-quality videos.

Speeded Up Robust Features (SURF) optimizes the Scale Invariant Feature Transform (SIFT) [34], but the SURF processing time is too long. The Oriented FAST and Rotated Binary Robust Independent Elementary Features (ORB) algorithm [6] performs feature extraction in an outdoor environment. For binary descriptors, the Local Difference Binary (LDB) method is used, and image descriptors are matched by K-nearest neighbours (KNN) [35]. Local invariant features are extracted by BRIEF, the basis of ORB, in [36]; the BRIEF technique is fast but suffers from noise. ORB addresses two major problems of BRIEF: it detects corners with the help of the Harris method and uses the intensity centroid to calculate the rotation of the object direction [37]. The researchers of [38] developed an algorithm to compute real-time traffic details by classification, counting of vehicles and segmentation; the important goal of this technique is that it can cope with sudden changes in light conditions by using a feature-based counting technique in the detection and tracking of vehicles [38].
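The ORB/BRIEF pipeline referred to in [36], [37] can be illustrated with the generic OpenCV sketch below (ORB keypoints, Hamming-distance k-nearest-neighbour matching with a Lowe-style ratio test); this is not the exact configuration used in the cited works, and the feature count and ratio are illustrative.

    import cv2

    def orb_match(image_a, image_b, ratio=0.75):
        # ORB: FAST corners with orientation from the intensity centroid,
        # described by rotated BRIEF binary strings
        orb = cv2.ORB_create(nfeatures=1000)
        kp_a, des_a = orb.detectAndCompute(image_a, None)
        kp_b, des_b = orb.detectAndCompute(image_b, None)
        if des_a is None or des_b is None:
            return []
        # Binary descriptors are compared with the Hamming distance;
        # k-nearest-neighbour matching keeps the two best candidates per feature
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
        knn_matches = matcher.knnMatch(des_a, des_b, k=2)
        # The ratio test rejects ambiguous matches
        return [m[0] for m in knn_matches
                if len(m) == 2 and m[0].distance < ratio * m[1].distance]

Because the descriptors are binary, the matching step is fast enough for the kind of real-time, feature-based vehicle counting described in [38], at the cost of some sensitivity to noise.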
Comparison of representative detection and classification approaches:

[8] Lucas and Kanade — Image Registration Technique. Description: the gradient of spatial intensity is used to match the objects in the images with Newton-Raphson iteration. Advantages: it can find the match even with fewer details and can detect objects even when they are rotated, scaled or sheared. Limitations: image quality should be higher for matching of objects.

[25] J. M. McHugh, J. Konrad, V. Saligrama, P. Jodoin — Adaptive background subtraction with Markov Random Field. Description: object detection is done by detecting the change in a series of images; spatial coherence is improved by an MRF on the change labels of the thresholds. Advantages: better performance is achieved by adapting a statistical model, a non-parametric background model and an MRF model to vary the thresholds. Limitations: when the foreground is added to the background, the detected regions grow rather than shrink.

[32] Hsu-Yung Cheng, Chih-Chia Weng, Yi-Ying Chen — Enhanced Dynamic Bayesian Network. Description: automatic vehicle detection is proposed using a spatial and second-derivative detection and tracking model. Advantages: objects are detected based on colours, shape and the extracted pixel-intensity features, and edge detection is then done by a Canny edge detector. Limitations: it cannot track multiple objects in videos of low quality.

[54] Denis Kleyko, Roland Hostettler, Wolfgang Birk and Evgeny Osipov — Comparison of vehicle classification techniques by machine learning on roadside sensors. Description: a dataset of 3074 samples is processed for vehicle classification by D. Kleyko et al. using different machine learning algorithms; various classification techniques are used, such as SVM, neural networks and logistic regression. Advantages: logistic regression shows high performance compared with the other machine learning methods, with a classification rate of 93.4%. Limitations: the main difficulty of this method is the use of datasets focused mainly on a single class, which makes classification difficult.
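As a purely illustrative counterpart to the comparison summarized for [54], the sketch below scores an SVM, a small neural network and logistic regression on the same feature matrix with scikit-learn; the roadside-sensor feature extraction and dataset of [54] are not reproduced here, so X and y are assumed to be supplied by the caller.

    from sklearn.svm import SVC
    from sklearn.neural_network import MLPClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def compare_classifiers(X, y):
        # X: feature matrix (n_samples x n_features), y: vehicle class labels
        models = {
            "SVM": SVC(kernel="rbf", C=1.0),
            "Neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
            "Logistic regression": LogisticRegression(max_iter=500),
        }
        # 5-fold cross-validated classification rate for each technique
        return {name: cross_val_score(model, X, y, cv=5).mean()
                for name, model in models.items()}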
5 VEHICLE TRACKING APPROACHES
Object tracking in a video processing system is a significant step for tracking the motion of objects in visual-based surveillance systems, and it has been a challenging task for many researchers [60].

… matching and gradient-based matching produced more accurate results for tracking the moving vehicles, classification of objects, foreground and background detection of an object, vehicle flow, vehicle count, vehicle velocity and vehicle recognition. Tracking and classification of vehicles in real-time traffic video surveillance are illustrated in [66]: in this work, counting and classification of vehicles and detection of traffic lane changes, direction and vehicle speed are performed. Multiple moving vehicles in heavy traffic are detected and tracked even under various weather conditions and with occluding objects such as trees or shadows. For tracking and locating moving objects, a Kalman filter, background subtraction methods and morphological processing operations are used to extract and identify the vehicle's contour.
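A minimal sketch of the Kalman-filter stage described above, assuming a constant-velocity model on the centroid of a detected vehicle blob (state = [x, y, dx, dy], measurement = [x, y]); OpenCV's cv2.KalmanFilter is used, and all noise parameters are illustrative.

    import cv2
    import numpy as np

    def make_centroid_tracker():
        # State: [x, y, dx, dy]; measurement: [x, y] (blob centroid)
        kf = cv2.KalmanFilter(4, 2)
        kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                        [0, 1, 0, 1],
                                        [0, 0, 1, 0],
                                        [0, 0, 0, 1]], np.float32)
        kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                         [0, 1, 0, 0]], np.float32)
        kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
        kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
        return kf

    def track_step(kf, centroid):
        # Predict the vehicle position, then correct with the measured centroid
        prediction = kf.predict()
        if centroid is not None:  # detection may fail under occlusion
            kf.correct(np.array(centroid, np.float32).reshape(2, 1))
        return prediction[:2].flatten()

In a full pipeline of the kind described above, the centroid would come from the background-subtraction and morphological contour-extraction step, and the prediction bridges frames in which the vehicle is occluded.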
The CamShift (Continuously Adaptive Mean Shift) algorithm is based on a 1D histogram of the object, is used mainly for detecting faces, and produces poor performance when the foreground object is similar to the background or its colour varies significantly [74], [76]. The SURF (Speeded Up Robust Features) algorithm is based on two-dimensional Haar wavelet responses and gives a better solution than CamShift when the foreground object is similar to the background; however, as its computation time is high, it may not be suitable for tracking objects in real time. The optical flow technique is used to differentiate multiple foreground objects from the background in an image, and it depends on the distance between the moving objects and the scene [77]. The Kalman filter is used to track multiple moving objects very effectively. The authors of [78], [79] proposed a deep tracking framework to track objects for a mobile robot travelling in crowded scenarios. In [80], the authors developed outdoor tracking of moving vehicles based on a deep learning framework: the image features are learned by pre-training a stacked denoising autoencoder, k-sparse constraints are then added to the stacked denoising autoencoder (kSSDAE), and it is linked with a classification layer to enhance the classification neural network. It is applied to an online tracker, and after fine tuning the evaluation shows good vehicle tracking performance after verification.
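A minimal CamShift sketch, assuming the vehicle's initial bounding box is known from a detection step: the tracker back-projects a hue histogram of the region and lets cv2.CamShift adapt the search window, which is exactly where the method degrades when foreground and background colours are similar. Parameter values are illustrative.

    import cv2
    import numpy as np

    def init_camshift(frame, bbox):
        x, y, w, h = bbox
        hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
        # 1D hue histogram of the tracked region (the basis of CamShift)
        roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
        cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
        return roi_hist

    def camshift_step(frame, roi_hist, track_window):
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        # Stop after 10 iterations or when the window centre moves < 1 pixel
        criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
        rotated_box, track_window = cv2.CamShift(back_proj, track_window, criteria)
        return rotated_box, track_window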
[24] … Optical Flow Field, Electronic Commerce and Security, 2009. ISECS'09. Second International Symposium on, Vol. 2, IEEE, pp. 85–88.
[25] J. M. McHugh, J. Konrad, V. Saligrama, P. Jodoin (2009), Foreground-adaptive background subtraction, IEEE Signal Process. Lett. 16, 390–393.
[26] C. R. Wren, A. Azarbayejani, T. Darrell, A. Pentland, Pfinder: real-time tracking of the human body, IEEE Trans. Pattern Anal. Mach. Intell. 18(7), 780–785, 1997.
[27] A. Elgammal, R. Duraiswami, D. Harwood, L. Davis, Background and foreground modeling using nonparametric kernel density for visual surveillance, Proc. IEEE 90(7), 1151–1163, 2002.
[28] T. Aach, A. Kaup (1995), Bayesian algorithms for adaptive change detection in image sequences using Markov random fields, Signal Process. Image Comm., Vol. 7, No. 2, 147–160.
[29] T. Aach, A. Kaup, R. Mester (1993), Statistical model-based change detection in moving video, Signal Process. 31, 165–180.
[30] Elham Kermani and Davud Asemani (2014), A robust adaptive algorithm of moving object detection for video surveillance, EURASIP Journal on Image and Video Processing 2014, 2014:27, pp. 2–9.
[31] Roy, A.; Shinde, S.; Kang, K.-D. (2012), An Approach for Efficient Real Time Moving Object Detection, International Journal of Signal Processing, Image Processing and Pattern Recognition, 5(3), 2012.
[32] Cheng, H.-Y.; Weng, C.-C.; Chen, Y.-Y. (2012), Vehicle Detection in Aerial Surveillance Using Dynamic Bayesian Networks, IEEE Transactions on Image Processing, 21(4): 2152–2159, 2012.
[33] Philip, F. M.; Mukesh, R. (2016), Hybrid tracking model for multiple object videos using second derivative based visibility model and tangential weighted spatial tracking model, International Journal of Computational Intelligence Systems, 9(5): 888–899, 2016.
[34] H. Bay, T. Tuytelaars, L. Van Gool, "SURF: Speeded up robust features," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2006, pp. 404–417.
[35] X. Yang and K.-T. Cheng (2014), "Local difference binary for ultrafast and distinctive feature description," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 1, pp. 188–194, Jan. 2014.
[36] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary robust independent elementary features," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2010, pp. 778–792.
[37] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," in Proc. Int. Conf. Comput. Vis. (ICCV), Barcelona, Spain, Nov. 2011, pp. 2564–2571.
[38] Zakaria Moutakki, Imad Mohamed Ouloul, Karim Afde, Abdellah Amghar (2018), "Real-Time System Based On Feature Extraction For Vehicle Detection And Classification," Transport and Telecommunication, 2018, volume 19, no. 2, 93–102.
[39] Lai, J., Huang, S. and Tseng, C. (2010), Image-Based Vehicle Tracking and Classification on the Highway, Green Circuits and Systems (ICGCS), 2010 International Conference on, IEEE, pp. 666–670.
[40] Liu, X., Dai, B. and He, H. (2011), Real-Time On-Road Vehicle Detection Combining Specific Shadow Segmentation and SVM Classification, Digital Manufacturing and Automation (ICDMA), 2011 Second International Conference on, IEEE, pp. 885–888.
[41] Witten, D. M. and Tibshirani, R. (2011), Penalized Classification using Fisher's Linear Discriminant, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73(5): 753–772.
[42] Sonka, M., Hlavac, V. and Boyle, R. (1999), Image Processing, Analysis, and Machine Vision, PWS Pub.
[43] Han, F., Shan, Y., Cekander, R., Sawhney, H. and Kumar, R. (2006), A Two-Stage Approach to People and Vehicle Detection with HOG-Based SVM, Performance Metrics for Intelligent Systems Workshop in conjunction with the IEEE Safety, Security, and Rescue Robotics Conference, pp. 133–140.
[44] Ramakrishnan, V., Prabhavathy, A. K. and Devishree, J. (2012), A Survey on Vehicle Detection Techniques in Aerial Surveillance, International Journal of Computer Applications 55(18).
[45] Chen, Z., Pears, N., Freeman, M. and Austin, J. (2009), Road vehicle classification using support vector machines, Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on, Vol. 4, IEEE, pp. 214–218.
[46] Asha, G., Kumar, K. A. and Kumar, D. D. N. P. (2012), A Real Time Video Object Tracking Using SVM, International Journal of Engineering Science and Innovative Technology (IJESIT).
[47] Cao, X., Wu, C., Yan, P. and Li, X. (2011), Linear SVM Classification using Boosting HOG Features for Vehicle Detection in Low-Altitude Airborne Videos, Image Processing (ICIP), 2011 18th IEEE International Conference on, IEEE, pp. 2421–2424.
[48] Gallo, I. and Nodari, A. (2011), Learning Object Detection Using Multiple Neural Networks, Proceedings of International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, INSTICC Press.
[49] Park, J., Choi, H. and Oh, S. (2010), Real-Time Vehicle Detection in Urban Traffic Using AdaBoost, Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, IEEE, pp. 3598–3603.
[50] Zohrevand, A.; Ahmadyfard, A.; Pouyan, A.; Imani, Z. (2014), A SIFT based object recognition using contextual information, Iranian Conference on Intelligent Systems (ICIS), 1–4, 2014.
[51] Li, Y.; Su, G. (2015), Simplified histograms of oriented gradient features extraction algorithm for the hardware implementation, International Conference on Computers, Communications and Systems (ICCCS), 192–195, 2015.
[52] G. Jemilda, S. Baulkani (2018), Moving Object Detection and Tracking using Genetic Algorithm Enabled Extreme Learning Machine, International Journal of Computers Communications & Control, ISSN 1841-9836, 13(2), 162–174, April 2018.
[53] Shingade, A.; Ghotkar, A. (2014), Survey of Object Tracking and Feature Extraction Using Genetic Algorithm, International Journal of Computer Science and Technology, 5(1), 2014.
[54] D. Kleyko, R. Hostettler, W. Birk, E. Osipov, "Comparison of Machine Learning Techniques for Vehicle Classification Using Road Side Sensors," 2015 IEEE 18th Int. Conf. Intell. Transp. Syst., pp. 572–577, 2015.
[55] Z. Chen, T. Ellis and S. A. Velastin, "Vehicle type