Color Based Object Tracking With OpenCV A Survey
Volume 5 Issue 3, March-April 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
1Computer Engineering Department, L.J. Institute Engineering and Technology, Ahmedabad, Gujarat, India
2Instrumentation & Control Engineering Department, Government Polytechnic, Ahmedabad, Gujarat, India
1. INTRODUCTION
1.1. OpenCV
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV is built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code.

The library comes with more than 2500 optimized algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc.

Along with well-established companies like Google, Yahoo, Microsoft, Intel, IBM, Sony, Honda and Toyota that employ the library, there are many startups such as Applied Minds, VideoSurf and Zeitera that make extensive use of OpenCV. OpenCV's deployed uses span the range from stitching street-view images together, detecting intrusions in surveillance video in Israel, monitoring mine equipment in China, detecting swimming pool drowning accidents in Europe, running interactive art in Spain and New York, checking runways for debris in Turkey, and inspecting labels on products in factories around the world, to rapid face detection in Japan. Thus there are many uses of object detection techniques, some of which are mentioned above.

OpenCV has C++, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS. Full-featured CUDA and OpenCL interfaces are being actively developed. OpenCV is written natively in C++ and has a templated interface that works seamlessly with STL containers.

1.2. Object Recognition
Object recognition is the area of artificial intelligence (AI) concerned with the ability of AI and robot implementations to recognize various things and entities. Object recognition allows AI programs and robots to pick out and identify objects from inputs like video and still camera images. Methods used for object identification include 3D models, component identification, edge detection and analysis of appearances from different angles.

Object recognition sits at the convergence of robotics, machine vision, neural networks and AI. Google and Microsoft are among the companies working in this area; for example, Google's driverless car and Microsoft's Kinect system both use object recognition. Robots that understand their environments can perform more complex tasks better, and major advances in object recognition stand to revolutionize AI and robotics. MIT has created neural networks, based on our
understanding of how the brain works, that allow software to identify objects almost as quickly as primates do. Visual data gathered through cloud robotics can allow multiple robots to learn tasks associated with object recognition faster. Robots can also reference massive databases of known objects, and that knowledge can be shared among all connected robots.

But all of this needed to be started at some point, and that is where OpenCV comes into play. OpenCV is the tool used here to accomplish the task of object detection.

2. Mathematics of Object Detection
2.1. Histograms
Before getting into the mathematics behind object detection, we first need to understand how an image is represented in the computer. An image is nothing but an array of numbers, specifically the value of the color to be displayed for each pixel. The color can be represented in any form: RGB, HSV, Gray Scale, etc. A histogram is a graphical display of data using bars of different heights, where each bar covers a specific range of values; taller bars show that more data falls in that range. Figure-1 below shows a histogram; it displays the shape and spread of continuous sample data.

2.2.2. How Back Projection works
In order to understand how Back Projection works, let's take the example of detecting skin. Suppose we have a skin histogram (Hue-Saturation) as shown below in figure-2. This histogram is the histogram model of the skin (which we know represents a sample of skin); a mask was also applied so that the histogram captures only the skin region.

Figure 2: Skin Image Histogram Model

Now we want to test another sample of skin, like the one shown below in figure-3.
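To make the histogram model and the back-projection step concrete, here is a minimal sketch using OpenCV's Python interface. The file names, the Hue-Saturation bin counts and the mask bounds are illustrative assumptions, not values taken from the paper.

```python
import cv2
import numpy as np

# Build a Hue-Saturation histogram model from a sample image of the target
# (e.g. a skin patch). File names and bin counts are placeholder assumptions.
sample = cv2.imread("skin_sample.jpg")
hsv_sample = cv2.cvtColor(sample, cv2.COLOR_BGR2HSV)

# Optional mask so the histogram captures only reasonably bright, saturated pixels
mask = cv2.inRange(hsv_sample, np.array((0., 60., 32.)), np.array((180., 255., 255.)))

# 2-D histogram over the H (0-179) and S (0-255) channels
hist = cv2.calcHist([hsv_sample], [0, 1], mask, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

# Back-project the model onto a new image: bright pixels mark colors that
# occur frequently in the model, dark pixels mark colors that do not.
test = cv2.imread("test.jpg")
hsv_test = cv2.cvtColor(test, cv2.COLOR_BGR2HSV)
back_proj = cv2.calcBackProject([hsv_test], [0, 1], hist, [0, 180, 0, 256], scale=1)

cv2.imshow("back projection", back_proj)
cv2.waitKey(0)
```

The back-projected image is the grayscale map that the MeanShift and CamShift steps discussed next operate on.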
Again, it applies MeanShift with the newly scaled search window and the previous window location, and this process continues until the required accuracy is met. An example of CamShift is shown below in figure-7.
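In OpenCV this adaptive search is exposed as cv2.CamShift. Below is a small sketch of a single per-frame update, assuming a back-projected image back_proj, the previous window track_window and a termination criterion term_crit prepared as in the sketch above; the helper names are made up for illustration.

```python
import cv2
import numpy as np

def camshift_step(back_proj, track_window, term_crit):
    """One CamShift update on a back-projected frame (illustrative helper)."""
    # CamShift first runs MeanShift, then re-estimates the window size and
    # orientation from the pixels it now covers, repeating until convergence.
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term_crit)
    return rot_rect, track_window

def draw_rotated_box(frame, rot_rect):
    """Draw the rotated rectangle returned by CamShift onto the frame."""
    pts = cv2.boxPoints(rot_rect).astype(np.int32)   # its four corner points
    cv2.polylines(frame, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
```

Unlike plain MeanShift, whose window size stays fixed, the rotated rectangle returned here grows, shrinks and rotates with the tracked object.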
inRange: This method is used to filter an image using a given range of colors. It is also a built-in method of the OpenCV library.

3. Applying the above Math to track an Object
3.1. How Object Tracking Works at the Backend
Now that we have seen the math behind object tracking, it's time to apply it to a video to track an object. The video given as input will be analyzed frame by frame, so we can control the speed of the video by controlling the frames per second (fps).

In order to detect an object, the following steps have to be performed:
1. First of all, the target object is detected in the video.
2. Then this detected object is tracked frame by frame as the video goes on.

First of all we need to detect the target object which we want to keep track of in the video, and in order to detect an object we need an image of that target object. This image is used to create the histogram model of the target object. After creating the histogram model of the target image, we feed that model to the Back Projection algorithm. As a result we get a box's starting coordinate in the 2-D space of the image, along with its width and height. This box shows the target object in the current frame of the video.

Once we have the target object's coordinates in the starting frame, we just need to keep track of it as the video continues to play frame by frame. To achieve that, we first calculate the back projection of the current video frame with the histogram model of the target object, which gives us a back-projected image of the frame. If we display the back-projected image, the whole image is black except for the target object, which is white. We then give that image as input to the MeanShift or CamShift algorithm. If we give it to MeanShift, it returns the new starting coordinate of the box that shows our target object in that video frame, along with its width and height. The width and height it returns are the same every time, equal to the ones we gave as input while detecting the target object in the first step. This process repeats as long as there are more frames in the video.

After calculating MeanShift in every iteration we need to show the target object, which can be achieved by highlighting the target object with a rectangle whose coordinates are the ones we get as output from the MeanShift or CamShift algorithm.
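As a concrete illustration of the loop described above, here is a minimal end-to-end sketch using OpenCV's Python interface. The file names target.jpg and video.mp4, the bin counts, and the initial window (taken here simply as the size of the target image placed at the top-left corner) are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

# Histogram model of the target object, built from a reference image of it
# ("target.jpg" is an assumed file name).
target = cv2.imread("target.jpg")
hsv_target = cv2.cvtColor(target, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_target], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

cap = cv2.VideoCapture("video.mp4")                       # assumed input video
track_window = (0, 0, target.shape[1], target.shape[0])   # assumed initial (x, y, w, h)

# Termination criterion for MeanShift: at most 10 iterations or a shift < 1 px
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:                                            # no more frames
        break

    # Back-project the model onto the current frame: target-colored pixels light up
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0, 1], roi_hist, [0, 180, 0, 256], 1)

    # MeanShift moves the fixed-size window to the densest nearby region
    _, track_window = cv2.meanShift(back_proj, track_window, term_crit)
    x, y, w, h = track_window

    # Highlight the tracked object with a rectangle and show the frame
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(30) & 0xFF == 27:                      # Esc quits; the delay sets playback fps
        break

cap.release()
cv2.destroyAllWindows()
```

Note that meanShift never changes the window size, which matches the observation above that the returned width and height always equal the initial ones; swapping in cv2.CamShift, as sketched earlier, lets the window rescale and rotate with the object.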
4. Pros and Cons of Color based Object Tracking
Pros
1. This technique is based on color and not on the features of the target object, so it requires relatively little computational power to detect the object in every frame.
2. With this method it is possible to get a pretty good frame rate while detecting the object in video.
3. This method is easy to understand and implement, so it is a good way for beginners to get started with machine learning and object tracking using OpenCV.

Cons
1. This method does not recognize the features of the target object while detecting it in the video, so it has rather low accuracy.
2. This method goes into an ambiguous state if there is more than one object in the frame with the same color as our target object.
3. If our target object goes out of the frame, the method won't understand that there is no target object in the frame; instead it will treat whichever other object's histogram best matches the target histogram model as the target object.
4. Compared to deep learning algorithms, this method cannot understand context from past experience.

5. Scope of improvement
Now, due to some requirement or just out of curiosity, if we have to use color based tracking, there are some techniques, tricks and tips that can be applied to get better results.

Other than Back Projection there are other methods to scan the pixel pattern of the target object in a video frame. We all know that if the noise is reduced in a sample, we can analyze it better. The same concept is applied here: if we remove the noise from the frame, we can analyze and detect the object better. Here the noise is low-light pixels, that is, bad image quality due to low or insufficient light; in other words, we apply a threshold to every pixel before accepting it as input.

The above concept can be implemented using a built-in method from the OpenCV library known as the inRange function. The basic idea behind this function is to filter out a range of values from the frame. As input we give the image on which we want to apply the threshold, and then the range of color that we want to accept from the image; here the range is given in HSV format. For every frame we first apply the inRange method and filter out the low-light pixels, which could potentially be recognized as noise. After this we apply the back projection with the histogram model of the target object.

By this small trick we can see a pretty good increase in the accuracy of object detection and tracking, with only a slight increase in processing needs. Suitable HSV color ranges can be found online.

Another way to increase the accuracy is quite obvious. The MeanShift/CamShift step takes a termination criterion as a required input, which defines when the algorithm stops and gives its result. In this criterion we give an epsilon value and a maximum number of iterations; by tweaking these two parameters we can improve the accuracy without much increase in processing needs.
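Both improvements can be sketched as follows, assuming a histogram model roi_hist and a window track_window prepared as in the tracking loop above. The HSV bounds and the epsilon/iteration values are illustrative assumptions, and track_step is a made-up helper name.

```python
import cv2
import numpy as np

# Assumed HSV range: keep only reasonably bright, saturated pixels so that
# dim, noisy regions do not contribute to the back projection.
LOWER = np.array((0, 60, 32), dtype=np.uint8)
UPPER = np.array((180, 255, 255), dtype=np.uint8)

# Termination criterion for meanShift/CamShift: stop after at most 15
# iterations or once the window center moves by less than 2 pixels.
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 15, 2)

def track_step(frame, roi_hist, track_window):
    """One tracking step with a low-light pre-filter (illustrative helper)."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # inRange builds a binary mask of accepted pixels ...
    mask = cv2.inRange(hsv, LOWER, UPPER)
    # ... which we use to suppress the rejected pixels before back projection.
    hsv_filtered = cv2.bitwise_and(hsv, hsv, mask=mask)

    back_proj = cv2.calcBackProject([hsv_filtered], [0, 1], roi_hist,
                                    [0, 180, 0, 256], 1)
    _, track_window = cv2.meanShift(back_proj, track_window, term_crit)
    return track_window
```

Raising the iteration count or shrinking epsilon lets the window settle more precisely on the target, at the cost of a little extra work per frame.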
6. Conclusion
From the above discussion we can conclude that OpenCV is an easy to understand and quite powerful tool. Here we have discussed methods for tracking an object in a video. To recap: first we make a histogram for the target object; then, for small portions of every frame in the video, we compare the histogram of the target object with the histogram of that portion. We do the same continuously for each and every frame of the video, while highlighting where the histogram of the target object matches best in the frame. This is how we keep track of an object in a video.

But, as we know, nothing is perfect, so this method comes with some flaws. As it is a color based tracking technique, it can easily confuse the target object with another object of the same color or a similar histogram. Also, if the video is playing and the target object isn't in the frame, it cannot tell whether the object was there and later went out of frame, or was never in the video from the beginning; that is, it cannot understand context from the past.
But along with all these disadvantages there is one major advantage of this method: its low need for computational power. The method can achieve pretty good frame rates on an average machine, so it can be used to analyze video footage where accuracy is a secondary factor but time is the limiting factor.

Hence, any beginner can use OpenCV with this method of tracking objects to learn the basics of object tracking without any deep mathematical understanding. Also, OpenCV is an open source library, so no licensing is needed in order to use it, which makes the learning process even more enjoyable.