You Only Look Once

path to design a detector

Feng Wang

AIRD, Coretronic Co.

Apr 17, 2019
The slides and a list of references can be found at
https://github.com/fwcore/object-detection
Outline

• Concepts in object detection

• A brief history of object detection

• YOLO
  - design
  - loss function
  - training
  - weaknesses
Classification vs detection/recognition
Common tasks on images

https://medium.com/@nikasa1889/the-modern-history-of-object-recognition-infographic-aea18517c318
Bounding box proposal
Also called: region of interest, region proposal, box proposal

[Figure: a ground-truth box and a proposed bounding box]

A proposed box has 5 parameters
• w, h
• x, y
• confidence score: how likely it contains an object & accuracy of the box
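How good: Intersection over Union (IOU)

$$\mathrm{IOU} = \frac{\text{Overlap Area}}{\text{Union Area}}$$

Examples
• IOU = 0: the two boxes do not overlap
• IOU = 1: the two boxes coincide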
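The IOU definition above can be sketched in a few lines of Python. The corner format `(x_min, y_min, x_max, y_max)` and the function name `iou` are assumptions made for this illustration; the slides do not fix a box format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this corner format is an
    assumption of the sketch, not something the slides prescribe.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    overlap = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - overlap
    return overlap / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (3, 3, 5, 5)))  # disjoint boxes
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # boxes shifted by half a width
```

The first two calls reproduce the two extreme cases on the slide (0 and 1); the third gives an intermediate value.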
A brief history of object detection

https://stats385.github.io
A brief history of object detection

• Before CNNs, people used handcrafted features to locate and classify objects (the results were not too bad).

• CNNs boosted the accuracy of classification.

[Figure: ImageNet]
A brief history of object detection

Region proposal -> classification
• e.g. RCNN
• accurate
• slow

Single shot: region proposal + classification
• e.g. YOLO, SSD
• fast
• less accurate
YOLO: you only look once

Look once -> Results
• x, y, w, h
• confidence score: contains an object & box accuracy
• class score: belongs to a class

Let's use a CNN, since it's good.
Why not regress? They are just numbers.
Let's go to CNN

YOLO v1's CNN: GoogLeNet variant, 24 convolutional layers

YOLO v2's CNN: darknet-19, 19 convolutional layers

YOLO v3's CNN: darknet-53, 53 convolutional layers

Let's do regression
-- wait, wait, how many bounding boxes? Where are they initially?
• Maybe set N as a large number?
• Maybe initially put them randomly?

Better solution: using grids

Results for one box
• x, y, w, h
• confidence score: contains an object & box accuracy
• class score: belongs to a class

Note: N is large, but much smaller than the number of R-CNN's region proposals.
Let's do regression with non-maximal suppression

We can use a CNN to extract features, and finally perform a regression to detect objects.
• YOLO v1: fully connected layers
• v2 & v3: convolutional layers

Output of the regression, one row per grid cell:

Grid cell   Proposed box 1                 Proposed box 2                 Class scores
Grid 1      x, y, w, h, confidence score   x, y, w, h, confidence score   class 1, class 2, ..., class 20
...         ...                            ...                            ...
Grid SxS    x, y, w, h, confidence score   x, y, w, h, confidence score   class 1, class 2, ..., class 20

Vector size: SxSx(5x2+20)

arXiv: 1506.02640, 1612.08242, 1804.02767
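As a rough illustration of the vector size SxSx(5x2+20), the sketch below computes the output length and splits one grid cell's slice into its two boxes and its class scores. The value S = 7 and the ordering "boxes first, then class scores" are assumptions of this sketch (the ordering simply follows the columns above); `output_size` and `split_cell` are hypothetical helper names.

```python
S, B, C = 7, 2, 20  # B = 2 boxes and C = 20 classes as on the slide; S = 7 is assumed


def output_size(s, b, c):
    """Length of the regression output: s * s cells, each with 5 numbers per box plus c class scores."""
    return s * s * (5 * b + c)


def split_cell(cell_vector, b=B, c=C):
    """Split one cell's vector into b boxes (x, y, w, h, confidence) and c class scores.

    The layout (all boxes first, class scores last) is an assumption of this sketch.
    """
    boxes = [tuple(cell_vector[5 * k: 5 * k + 5]) for k in range(b)]
    class_scores = list(cell_vector[5 * b: 5 * b + c])
    return boxes, class_scores


print(output_size(S, B, C))  # 7 * 7 * 30 = 1470

boxes, class_scores = split_cell(list(range(30)))
print(boxes)              # two 5-tuples
print(len(class_scores))  # 20
```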
Loss function
Problems
• One object is partially/fully covered by several boxes.
• Most boxes have no objects.
• Multi-task training problem: location & class
• Small objects need more accurate location & box size.

Solution
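The equation shown on this slide did not survive extraction. For reference, the loss as stated in the YOLO v1 paper (arXiv:1506.02640) is reproduced below. All symbols come from that paper rather than from the slide: $\mathbb{1}_{ij}^{\text{obj}}$ is 1 when box $j$ of cell $i$ is responsible for an object, $\mathbb{1}_{i}^{\text{obj}}$ is 1 when an object falls in cell $i$, hats mark the other member of each prediction/target pair, and the paper sets $\lambda_{\text{coord}} = 5$ and $\lambda_{\text{noobj}} = 0.5$.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```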
Oh, no math please. Let's speak human language

Problem 1: One object is partially/fully covered by several boxes.

• Each true object has one proposed box "responsible" for it.
  Rule: the one with the highest overlap with the ground-truth box.
• At inference, we use non-maximal suppression to select the best among the proposals.
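A minimal sketch of greedy non-maximal suppression, assuming boxes in `(x_min, y_min, x_max, y_max)` format and an IOU threshold of 0.5. Both choices, and the function names, are assumptions for illustration and are not taken from the slides.

```python
def iou(a, b):
    """IOU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    overlap = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - overlap
    return overlap / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first.

    Greedy rule: visit boxes by decreasing score and drop any box whose
    IOU with an already kept box exceeds the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # the second box overlaps the first and is dropped
```

In practice this is applied per class, so that overlapping boxes of different classes do not suppress each other.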
Human language

Problem 2: Most boxes have no objects.
• Fix: weight the confidence loss of boxes without objects by 0.5.
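To make the 0.5 weight concrete, here is a small sketch of a confidence loss in which boxes without an object contribute half as much as boxes with one. The function name and the flat list inputs are hypothetical; only the weight 0.5 comes from the slide.

```python
LAMBDA_NOOBJ = 0.5  # weight for boxes that contain no object (value from the slide)


def confidence_loss(pred_conf, target_conf, has_object, lambda_noobj=LAMBDA_NOOBJ):
    """Sum of squared confidence errors, with empty boxes down-weighted."""
    loss = 0.0
    for p, t, obj in zip(pred_conf, target_conf, has_object):
        weight = 1.0 if obj else lambda_noobj
        loss += weight * (p - t) ** 2
    return loss


# One box with an object, one without; both are off by 0.2.
print(confidence_loss([0.8, 0.2], [1.0, 0.0], [True, False]))
```

Without the weight, the many empty boxes would dominate the gradient and push all confidences toward zero.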
Human language

Problem 3: Multi-task training problem: location & class.
• Weighted sum: here the problem is left untouched.
Human language

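Problem 4: Small objects need more accurate location & box size.
• Fix: sqrt. The loss compares the square roots of width and height, so the same absolute error costs more on a small box.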
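A quick numeric check of the square-root idea, as a sketch: the same absolute width error of 0.05 is applied to a small box and to a large one. The widths are made-up illustrative values (fractions of the image width), and `size_loss` is a hypothetical name.

```python
import math


def size_loss(w_pred, w_true):
    """Squared error on square roots, as in the width/height term of the loss."""
    return (math.sqrt(w_pred) - math.sqrt(w_true)) ** 2


def plain_loss(w_pred, w_true):
    """Plain squared error, for comparison."""
    return (w_pred - w_true) ** 2


small = size_loss(0.09, 0.04)  # small box, width off by 0.05
large = size_loss(0.69, 0.64)  # large box, width off by 0.05
print(small, large)            # the small box is penalized more
print(plain_loss(0.09, 0.04), plain_loss(0.69, 0.64))  # plain squared error: (almost) identical
```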
Other problems
• x, y can fall outside the grid cell
• smaller objects can be localized worse than larger ones

• probability can fall outside [0, 1]

Fix them in YOLO v2
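One way to see the v2 fix is the box parameterization described in the YOLO v2 paper (arXiv:1612.08242): the raw outputs are squashed with a logistic function so the center stays inside its grid cell and the confidence stays in [0, 1], while width and height scale a pre-defined box size. The sketch below follows that description; the function and argument names are hypothetical, and coordinates are in grid-cell units.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def decode_box(t_x, t_y, t_w, t_h, t_o, cell_x, cell_y, prior_w, prior_h):
    """Turn raw network outputs (t_*) into a box.

    cell_x, cell_y: top-left corner of the grid cell.
    prior_w, prior_h: pre-defined box size.
    """
    b_x = cell_x + sigmoid(t_x)    # center stays inside the cell
    b_y = cell_y + sigmoid(t_y)
    b_w = prior_w * math.exp(t_w)  # size is a positive multiple of the prior
    b_h = prior_h * math.exp(t_h)
    confidence = sigmoid(t_o)      # always within [0, 1]
    return b_x, b_y, b_w, b_h, confidence


print(decode_box(0, 0, 0, 0, 0, cell_x=3, cell_y=4, prior_w=1.5, prior_h=2.0))
```

With all raw outputs at zero, the box sits at the center of its cell and has exactly the pre-defined size.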

Pre-defined box size

Pre-defined box: anchor
• Naturally, objects have characteristic aspect ratios and sizes.
  - This can be a good starting point.
  - We don't need randomly initialized box shapes.

• Handcrafted box sizes vs clustering algorithms

• Boxes can reshape during training.

• The number of pre-defined boxes is a hyperparameter
  - v2 uses 5
  - v3 uses 9

[Figure: anchors used in YOLO v2]

Anchor-free detection is a research topic; see https://arxiv.org/abs/1904.01355 for an example.
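The clustering option can be sketched as k-means over the (width, height) pairs of the ground-truth boxes, assigning each box to the centroid it overlaps most (highest IOU), which is the distance the YOLO v2 paper describes. Updating centroids by the mean, the fixed seed, and the toy box list are assumptions of this sketch.

```python
import random


def wh_iou(a, b):
    """IOU of two boxes given only as (w, h), aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (w, h) pairs into k anchor shapes using IOU as similarity."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda c: wh_iou(box, centroids[c]))
            clusters[best].append(box)
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:  # mean width and height of the cluster
                new_centroids.append((sum(w for w, _ in members) / len(members),
                                      sum(h for _, h in members) / len(members)))
            else:        # keep an empty cluster's centroid unchanged
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


# Toy data: three small, roughly square boxes and three tall, large ones.
toy_boxes = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 10.0), (5.5, 9.0), (4.5, 11.0)]
print(kmeans_anchors(toy_boxes, k=2))
```

The number of clusters `k` plays the role of the hyperparameter mentioned above (5 in v2, 9 in v3).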
Improvements (in v2)
• Multi-scale training: resize the input images randomly during training, to one of {320, 352, ..., 608}
  - The CNN only reduces an image by a constant factor (here 32), hence it is robust to the input image size.
  - Resize every 10 batches (per the YOLO v2 paper).
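The size set {320, 352, ..., 608} is simply every multiple of 32 in that range. The sketch below builds it and draws a new size at a fixed interval; the helper names, the seed, and the interval being a plain parameter counted in training steps are assumptions of this illustration.

```python
import random

STRIDE = 32  # total downsampling factor of the CNN
SIZES = list(range(320, 608 + 1, STRIDE))  # 320, 352, ..., 608


def grid_size(input_size, stride=STRIDE):
    """Number of grid cells per side for a given input size."""
    return input_size // stride


def size_schedule(num_steps, resize_interval=10, seed=0):
    """Input size used at each training step, redrawn every resize_interval steps."""
    rng = random.Random(seed)
    size = rng.choice(SIZES)
    schedule = []
    for step in range(num_steps):
        if step > 0 and step % resize_interval == 0:
            size = rng.choice(SIZES)
        schedule.append(size)
    return schedule


print(SIZES)
print([grid_size(s) for s in SIZES])  # grids from 10x10 up to 19x19
print(size_schedule(30))
```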

• Passthrough layer
  - the reshaping loses no information

• Odd number of grid cells

[Figure: feature map]
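The passthrough idea can be sketched as a space-to-depth rearrangement: every 2x2 block of spatial positions is stacked into the channel dimension, so a 26x26x512 map becomes 13x13x2048 (the sizes given in the YOLO v2 paper) without discarding any value. The nested-list representation and the channel ordering below are assumptions of this sketch and may differ from the actual darknet layer.

```python
def passthrough(feature_map, block=2):
    """Rearrange an H x W x C map into (H/block) x (W/block) x (C * block * block).

    feature_map is a nested list indexed as [row][col][channel].
    """
    height, width = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, height, block):
        row = []
        for j in range(0, width, block):
            stacked = []
            for di in range(block):
                for dj in range(block):
                    stacked.extend(feature_map[i + di][j + dj])
            row.append(stacked)
        out.append(row)
    return out


# A 4 x 4 map with one channel whose value encodes its position.
fm = [[[r * 4 + c] for c in range(4)] for r in range(4)]
out = passthrough(fm)
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 4
print(out[0][0])                              # the top-left 2 x 2 block: [0, 1, 4, 5]
```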
Training
YOLO is trained in two steps:

Step 1: ImageNet (classification dataset)
• train classification backbone

Step 2 (transfer learning): COCO/PASCAL VOC (detection dataset)
• remove head layers
• add regression as new head
• fine-tune backbone & train head

Training tricks
• decaying learning rate
• batch normalization
• data augmentation
Performance
Generalizability

Picasso & People-Art dataset

But ... no free lunch
• YOLO is not as accurate as RCNN-series models
  - multi-task problem: YOLO makes fewer background errors, but more localization errors.

• YOLO is poor at detecting small objects
  - CNN: training on ImageNet may not generalize well to small objects (classification)
  - the loss function weights location errors equally for small & large objects (localization)

• YOLO is not good at crowded objects
  - cause: non-maximal suppression. See an improvement: Adaptive NMS (arXiv:1904.03629)

• YOLO is bad when encountering unusual aspect ratios
  - cause: pre-defined anchors, or anchors learned from data. Go anchor-free (arXiv:1904.01355).
Security
CNNs (classification) can be fooled, and so can YOLO; the issues can be even worse.

Non-maximal suppression can be fooled.

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples. arXiv:1902.02067
Is there anything helpful to improve?
Darwin's evolution

arXiv: 1807.05511
