Object Detection Using YOLO

Object detection using YOLO: challenges, architectural successors, datasets and applications

Tausif Diwan & G. Anirudh & Jitendra V. Tembhurne

Abstract

Object detection is one of the predominant and challenging problems in computer vision. Over
the past decade, with the rapid evolution of deep learning, researchers have experimented extensively
and contributed to the performance enhancement of object detection and related tasks such as object
classification, localization, and segmentation using underlying deep models. Broadly, object detectors
are classified into two categories, viz. two stage and single stage object detectors. Two stage detectors
mainly rely on a selective region proposal strategy via a complex architecture, whereas single stage
detectors consider all spatial region proposals for the possible detection of objects via a relatively
simpler architecture in one shot. The performance of any object detector is evaluated through detection
accuracy and inference time. Generally, two stage detectors achieve higher detection accuracy than
single stage object detectors, whereas single stage detectors have lower inference time.
Moreover, with the advent of YOLO (You Only Look Once) and its architectural
successors, detection accuracy has improved significantly and is sometimes better than that of two stage
detectors. YOLOs are adopted in various applications mainly because of their faster inference rather than
their detection accuracy. As an example, the detection accuracies are 63.4 and 70 for YOLO and
Fast-RCNN, respectively; however, inference is around 300 times faster in the case of YOLO. In this paper, we
present a comprehensive review of single stage object detectors, especially YOLOs, covering their regression
formulation, architectural advancements, and performance statistics. Moreover, we summarize the
comparative illustration between two stage and single stage object detectors and among different versions
of YOLOs, applications based on two stage detectors and on different versions of YOLOs, along with
future research directions.

Keywords Object detection · Convolutional neural networks · YOLO · Deep learning · Computer vision

Object detection is an important field in the domain of computer vision. Various machine learning (ML)
and deep learning (DL) models are employed to enhance the performance of object
detection and related tasks. Earlier, two stage object detectors were quite popular and
effective. With recent developments in single stage object detection and its underlying algorithms,
single stage detectors have become significantly better than most two stage object detectors. Moreover,
since the advent of YOLOs, various applications have used them for object detection and recognition
in various contexts, where they have performed very well in comparison with their two stage
counterparts. This motivates us to write a dedicated review of YOLO and its architectural successors,
presenting their design details, the optimizations proposed in the successors, and the tough competition
they pose to two stage object detectors.

Object classification and localization

Image classification is the task of classifying an image, or an object in
an image, into one of a set of predefined categories. This problem is generally solved with
supervised machine learning or deep learning algorithms, wherein the model is trained on a large
labelled dataset. Commonly used machine learning models for this task include ANN, SVM,
decision trees, and KNN [66]. On the deep learning side, CNNs and their architectural successors
and variants dominate other deep models for classifying images and related tasks. Apart from
well-defined machine learning and deep learning models, other approaches such as fuzzy logic and
genetic algorithms are also used for this task.
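
To make the supervised setting concrete, the following is a minimal sketch of a k-nearest-neighbour (KNN) classifier, one of the classical models named above. It is an illustration only: the two-dimensional feature vectors, the labels "sky" and "forest", and the function name knn_classify are all hypothetical and do not come from the paper; a real image classifier would operate on pixel data or learned features.

```python
from collections import Counter
from math import dist


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs. The features are
    hypothetical 2-D image descriptors, not real image data.
    """
    # Sort the labelled examples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Majority vote over the labels of the k closest examples.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled dataset (made-up features, e.g. mean brightness and edge density).
TRAIN = [
    ((0.90, 0.10), "sky"),
    ((0.80, 0.20), "sky"),
    ((0.85, 0.15), "sky"),
    ((0.20, 0.90), "forest"),
    ((0.10, 0.80), "forest"),
    ((0.15, 0.85), "forest"),
]

print(knn_classify(TRAIN, (0.75, 0.25)))
```

The labelled list plays the role of the training set; the prediction for an unseen query is decided by its closest labelled neighbours.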

Object localization is the task of determining the position of one or more objects in an image/frame
with the help of a rectangular box around each object, commonly known as a bounding box. Image
segmentation, in contrast, is the process of partitioning an image into multiple segments, wherein a segment
may contain a complete object or a part of an object. Image segmentation is commonly used to locate
objects, lines, and curves, viz. the boundaries of an object or segment in an image. Generally, the pixels in a
segment share a set of common characteristics such as intensity, texture, etc. The main motive behind
image segmentation is to transform the image into a more meaningful representation. Moreover, object
detection can be considered a combination of classification, localization, and segmentation. It is the
task of correctly classifying and efficiently localizing single or multiple objects in an image, generally with
the help of supervised algorithms given a sufficiently large labelled training set. Figure 1 illustrates
classification, localization, and segmentation for single and multiple objects in an
image in the context of object detection.
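
The relationship between these tasks can be sketched in terms of their outputs: segmentation yields a per-pixel mask, localization yields a bounding box, and detection pairs a class label with a box. The sketch below is an assumption-laden illustration, not the paper's method: the names BoundingBox, Detection, and mask_to_box, the inclusive pixel-coordinate convention, and the label "cat" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundingBox:
    # Inclusive pixel coordinates of an axis-aligned rectangle (assumed convention).
    x_min: int
    y_min: int
    x_max: int
    y_max: int


@dataclass
class Detection:
    # A detection pairs a class label (classification) with a box (localization).
    label: str
    box: BoundingBox


def mask_to_box(mask: List[List[int]]) -> BoundingBox:
    """Return the tightest bounding box around the nonzero pixels of a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no object pixels")
    return BoundingBox(min(cols), min(rows), max(cols), max(rows))


# Toy binary segmentation mask: 1 marks pixels belonging to the object.
MASK = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

box = mask_to_box(MASK)
detection = Detection(label="cat", box=box)  # "cat" is a made-up label
print(detection)
```

The mask carries strictly more spatial detail than the box, which is why a box can be derived from a mask but not the reverse.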
