Dlcv2017d2l4objectdetection 170622143747

This document summarizes an object detection lecture. It discusses object detection tasks like assigning labels and bounding boxes to objects in images. Popular object detection datasets with varying numbers of categories and images are presented. Object detection is described as classifying objects at different positions and scales, which is computationally expensive using convolutional networks. Region proposal methods are introduced to select potential object regions efficiently before classification. The R-CNN, Fast R-CNN, and Faster R-CNN models are summarized, which improved object detection speed and performance by sharing convolutional features between proposals and learning region proposals end-to-end.

Uploaded by

SriramGudimella


#DLUPC

Day 2 Lecture 4
Object Detection

Amaia Salvador

PhD Candidate
Universitat Politècnica de Catalunya

[course site]
Object Detection

The task of assigning a label and a bounding box to all objects in the image

CAT, DOG, DUCK

2
Object Detection: Datasets

20 categories          80 categories           200 categories
6k training images     200k training images    456k training images
6k validation images   60k val + test images   60k validation + test images
10k test images

3
Object Detection as Classification

Classes = [cat, dog, duck]

Cat ? NO

Dog ? NO

Duck? NO

4
Object Detection as Classification

Classes = [cat, dog, duck]

Cat ? YES

Dog ? NO

Duck? NO

6
Object Detection as Classification

Problem:
Too many positions & scales to test

Solution: If your classifier is fast enough, go for it
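To get a feel for "too many positions & scales", here is a rough sketch that counts the crops a sliding-window detector would have to classify. The image size, window sizes and stride are illustrative assumptions, not values from the lecture, and only square windows are counted (each extra aspect ratio would add more).

```python
def count_windows(img_w, img_h, win_sizes, stride):
    """Count square sliding-window positions summed over all window sizes."""
    total = 0
    for win in win_sizes:
        if win > img_w or win > img_h:
            continue  # this window size does not fit in the image
        cols = (img_w - win) // stride + 1
        rows = (img_h - win) // stride + 1
        total += cols * rows
    return total


# Assumed setup: 800 x 600 image, three window sizes, stride of 8 pixels.
n_windows = count_windows(800, 600, win_sizes=[64, 128, 256], stride=8)
print(n_windows)  # 14460 classifier evaluations for a single image
```

Even this coarse setting needs over ten thousand classifier calls per image, which is why a slow classifier cannot be applied exhaustively.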

8
Object Detection with ConvNets?

ConvNets are computationally demanding. We can’t test all positions & scales!

Solution: Look at a tiny subset of positions. Choose them wisely :)

9
Region Proposals

● Find “blobby” image regions that are likely to contain objects

● “Class-agnostic” object detector
● Look for “blob-like” regions

Slide Credit: CS231n 10

Region Proposals

Selective Search (SS) Multiscale Combinatorial Grouping (MCG)

[SS] Uijlings et al. Selective search for object recognition. IJCV 2013

[MCG] Arbeláez, Pont-Tuset et al. Multiscale combinatorial grouping. CVPR 2014 11

Object Detection with Convnets: R-CNN

Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014

12
R-CNN

1. Train network on proposals

2. Post-hoc training of SVMs & Box regressors on fc7 features

Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014

13
R-CNN

We expect: We get:

14
R-CNN

1. Train network on proposals

2. Post-hoc training of SVMs & Box regressors on fc7 features

3. Non Maximum Suppression + score threshold
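Step 3 can be sketched as greedy non-maximum suppression preceded by a score threshold. This is a minimal single-class illustration: the box format (x1, y1, x2, y2) and both threshold values are assumptions, and in practice the procedure is usually run separately for each class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.5, score_thresh=0.3):
    """Return indices of kept boxes, highest score first."""
    # Score threshold: drop low-confidence detections up front.
    order = sorted((i for i, s in enumerate(scores) if s >= score_thresh),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260), (0, 0, 20, 20)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

Box 1 is suppressed because it overlaps the higher-scoring box 0, and box 3 falls below the score threshold.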

Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014

15
R-CNN

Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014

16
R-CNN: Problems

1. Slow at test-time: need to run full forward pass of CNN for each region proposal

2. SVMs and regressors are post-hoc: CNN features not updated in response to SVMs and regressors

3. Complex multistage training pipeline

Slide Credit: CS231n 17

Fast R-CNN

R-CNN Problem #1: Slow at test-time: need to run full forward pass of CNN for each region proposal

Solution: Share computation of convolutional layers between region proposals for an image

Girshick. Fast R-CNN. ICCV 2015 18

Fast R-CNN: Sharing features

1. Hi-res input image: 3 x 800 x 600, with region proposal
2. Convolution and pooling
3. Hi-res conv features: C x H x W, with region proposal
4. Max-pool within each grid cell
5. RoI conv features: C x h x w, for region proposal
6. Fully-connected layers (they expect low-res conv features: C x h x w)
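The "max-pool within each grid cell" step can be illustrated with a small pure-Python sketch. It handles a single channel and takes the region in feature-map coordinates; the function name, the end-exclusive region format and the way cell boundaries are rounded are assumptions made for illustration, not details from the slides.

```python
def roi_max_pool(feat, roi, out_h, out_w):
    """Max-pool a region of a 2D feature map into an out_h x out_w grid.

    feat: list of rows (H x W), one channel only for brevity.
    roi:  (r0, c0, r1, c1) in feature-map coordinates, end-exclusive.
    """
    r0, c0, r1, c1 = roi
    roi_h, roi_w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        # Row span of grid cell i (always at least one row).
        rs = r0 + (i * roi_h) // out_h
        re = max(rs + 1, r0 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            # Column span of grid cell j (always at least one column).
            cs = c0 + (j * roi_w) // out_w
            ce = max(cs + 1, c0 + ((j + 1) * roi_w) // out_w)
            row.append(max(feat[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


feat = [[r * 6 + c for c in range(6)] for r in range(5)]  # toy 5 x 6 feature map
pooled = roi_max_pool(feat, roi=(1, 1, 5, 5), out_h=2, out_w=2)
print(pooled)  # [[14, 16], [26, 28]]
```

Whatever the size of the region, the output is always out_h x out_w, which is what lets proposals of different shapes feed the same fully-connected layers.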

Girshick. Fast R-CNN. ICCV 2015. Slide Credit: CS231n 19

Fast R-CNN

R-CNN Problems #2 & #3: SVMs and regressors are post-hoc. Complex training.

Solution: Train it all together, end-to-end (E2E)

Girshick. Fast R-CNN. ICCV 2015 20

Fast R-CNN

                       R-CNN        Fast R-CNN
Training time          84 hours     9.5 hours      Faster!
  (Speedup)            1x           8.8x
Test time per image    47 seconds   0.32 seconds   FASTER!
  (Speedup)            1x           146x
mAP (VOC 2007)         66.0         66.9           Better!

Using VGG-16 CNN on Pascal VOC 2007 dataset

Slide Credit: CS231n 21

Fast R-CNN: Problem

Test-time speeds don’t include region proposals

                                            R-CNN        Fast R-CNN
Test time per image                         47 seconds   0.32 seconds
  (Speedup)                                 1x           146x
Test time per image with Selective Search   50 seconds   2 seconds
  (Speedup)                                 1x           25x

Slide Credit: CS231n 22

Faster R-CNN
Learn proposals end-to-end sharing parameters with the classification network

[Figure: architecture diagram. Conv layers (up to Conv5_3) feed the Region Proposal Network; the RPN proposals go to RoI Pooling, followed by FC6, FC7, FC8 and class probabilities.]

Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015
23
Faster R-CNN
Learn proposals end-to-end sharing parameters with the classification network

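[Figure: architecture diagram. Conv layers (up to Conv5_3) feed the Region Proposal Network; the RPN proposals go to RoI Pooling, followed by FC6, FC7, FC8 and class probabilities. Part of the diagram is labeled "Fast R-CNN".]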

Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015
24
Region Proposal Network

Bounding Box Regression

Objectness scores
(object/no object)

In practice, k = 9 (3 different scales and 3 aspect ratios)
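As a sketch of where k = 9 comes from, the snippet below enumerates one anchor shape per (scale, aspect ratio) pair. The scale and ratio values are the defaults reported in the Faster R-CNN paper; defining the ratio as height / width and keeping the area fixed per scale are assumptions of this sketch.

```python
import math


def make_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchor shapes as (w, h) pairs.

    Each anchor keeps an area of scale**2 while its ratio h / w varies.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((w, h))
    return anchors


anchors = make_anchors()
print(len(anchors))  # 9 anchors per feature-map position
```

At every position of the conv feature map, the RPN then predicts an objectness score and a box regression for each of these k anchors.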

Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015
25
Faster R-CNN: Training
RoI Pooling is not differentiable w.r.t box coordinates. Solutions:
● Alternate training
● Ignore gradient of classification branch w.r.t proposal coordinates
● Make pooling function differentiable (spoiler D3L6)

[Figure: architecture diagram. Conv layers (up to Conv5_3) feed the Region Proposal Network; the RPN proposals go to RoI Pooling, followed by FC6, FC7, FC8 and class probabilities.]

Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015
26
Faster R-CNN

                                       R-CNN        Fast R-CNN   Faster R-CNN
Test time per image (with proposals)   50 seconds   2 seconds    0.2 seconds
  (Speedup)                            1x           25x          250x
mAP (VOC 2007)                         66.0         66.9         66.9

Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015
Slide Credit: CS231n 27
Faster R-CNN

● Faster R-CNN is the basis of the winners of COCO and ILSVRC 2015 & 2016 object detection competitions.

He et al. Deep residual learning for image recognition. CVPR 2016

28
YOLO: You Only Look Once

Proposal-free object detection pipeline

Redmon et al. You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016 29
YOLO: You Only Look Once

Redmon et al. You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016 30
YOLO: You Only Look Once
Each cell predicts:

- For each bounding box:
  - 4 coordinates (x, y, w, h)
  - 1 confidence value
- Some number of class probabilities
For Pascal VOC:

- 7x7 grid
- 2 bounding boxes / cell
- 20 classes
31
7 x 7 x (2 x 5 + 20) = 7 x 7 x 30 tensor = 1470 outputs
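The arithmetic above can be checked with a few lines of Python. The helper names and the ordering of values inside a cell's vector (all boxes first, then the class probabilities) are assumptions for illustration; the actual memory layout depends on the implementation.

```python
def yolo_output_shape(grid=7, boxes_per_cell=2, num_classes=20):
    """Shape of the YOLO output tensor: grid x grid x depth."""
    per_box = 5  # x, y, w, h and one confidence value
    depth = boxes_per_cell * per_box + num_classes
    return (grid, grid, depth)


def split_cell(vec, boxes_per_cell=2, num_classes=20):
    """Split one cell's vector into per-box values and class probabilities."""
    boxes = [vec[5 * b:5 * b + 5] for b in range(boxes_per_cell)]
    class_probs = vec[5 * boxes_per_cell:5 * boxes_per_cell + num_classes]
    return boxes, class_probs


shape = yolo_output_shape()
num_outputs = shape[0] * shape[1] * shape[2]
print(shape, num_outputs)  # (7, 7, 30) 1470
```

Note that the class probabilities are shared by all boxes of a cell, which is why the depth is 2 x 5 + 20 rather than 2 x (5 + 20).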
YOLO: You Only Look Once
Predict class probability for each cell

[Figure: grid cells labeled with their predicted class: Bicycle, Car, Dog, Dining Table]

Redmon et al. You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016 32
YOLO: You Only Look Once

+ NMS
+ Score threshold

Redmon et al. You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016 33
SSD: Single Shot MultiBox Detector
Same idea as YOLO, + several predictors at different stages in the network

Liu et al. SSD: Single Shot MultiBox Detector, ECCV 2016 34

YOLOv2

Redmon & Farhadi. YOLO9000: Better, Faster, Stronger. CVPR 2017

35
YOLOv2

Results on Pascal VOC 2007

YOLOv2

Results on COCO test-dev 2015

Summary

Proposal-based methods
● R-CNN
● Fast R-CNN
● Faster R-CNN
● SPPnet
● R-FCN
Proposal-free methods
● YOLO, YOLOv2
● SSD
38
Resources
● Official implementations:
○ Faster R-CNN [caffe]
○ YOLOv2 [darknet]
○ SSD [caffe]
○ R-FCN [caffe][MxNet]

● Unofficial ports to other frameworks are likely to exist… e.g. type “yolo tensorflow” in your browser and pick the one you like best.
● Or… use the newly released Object detection API by Google: SSD, R-FCN & Faster R-CNN (code & pretrained models in tensorflow)

Object detection tutorials (project ideas maybe?):

● Toy object detection (squares, circles, etc.) (keras)
● Object detection (pets dataset) (tensorflow)

39
Questions?
YOLO: Training

For training, each ground truth bounding box is matched into the right cell
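In the YOLO paper, the "right cell" is the grid cell that contains the center of the ground truth box. A minimal sketch of that lookup follows; the function name, the pixel box format (x1, y1, x2, y2) and the 448 x 448 example image are assumptions for illustration.

```python
def responsible_cell(box, img_w, img_h, grid=7):
    """Return (row, col) of the grid cell containing the box center.

    box: (x1, y1, x2, y2) in pixels.
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    # Clamp so a center lying exactly on the right or bottom edge stays in range.
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col


cell = responsible_cell((100, 200, 300, 400), img_w=448, img_h=448)
print(cell)  # (4, 3)
```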

Slide credit: YOLO Presentation @ CVPR 2016 41

YOLO: Training

Optimize class prediction in that cell:
dog: 1, cat: 0, bike: 0, ...

Slide credit: YOLO Presentation @ CVPR 2016 43

YOLO: Training

Predicted boxes for this cell

Slide credit: YOLO Presentation @ CVPR 2016 44

YOLO: Training

Find the predicted box that best matches the ground truth bounding box (in the YOLO paper, the one with the highest IoU) and optimize it, i.e. adjust its coordinates to be closer to the ground truth’s coordinates
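Picking the best predicted box for a cell can be sketched as an argmax over IoU with the ground truth. The box format (x1, y1, x2, y2) and the example coordinates are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_box(pred_boxes, gt_box):
    """Index of the predicted box with the highest IoU with the ground truth."""
    return max(range(len(pred_boxes)), key=lambda j: iou(pred_boxes[j], gt_box))


preds = [(0, 0, 50, 50), (40, 40, 120, 120)]  # the two boxes predicted by a cell
gt = (50, 50, 110, 110)
j = best_box(preds, gt)
print(j, iou(preds[j], gt))  # 1 0.5625
```

Only the selected box receives the coordinate regression signal for this ground truth; the other box of the cell is treated as non-matched.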

Slide credit: YOLO Presentation @ CVPR 2016 45

YOLO: Training

Increase the matched box’s confidence, decrease the non-matched boxes’ confidence

Slide credit: YOLO Presentation @ CVPR 2016 46

YOLO: Training

For cells with no ground truth detections:
● Confidences of all predicted boxes are decreased
● Class probabilities are not adjusted

Slide credit: YOLO Presentation @ CVPR 2016 49

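YOLO: Training, formally

[Equation: training loss with three annotated groups of terms]
● Bounding box coordinate regression: weighted by an indicator that is 1 if box j and cell i are matched together, 0 otherwise
● Bounding box score prediction: also uses an indicator that is 1 if box j and cell i are NOT matched together
● Class score prediction: weighted by an indicator that is 1 if cell i has an object present

Slide credit: YOLO Presentation @ CVPR 2016 50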
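For reference, the annotated terms appear to correspond to the loss given in the YOLO paper (Redmon et al., CVPR 2016), reproduced below as a reading aid rather than as a transcription of the slide. S is the grid size and B the number of boxes per cell; hatted and unhatted symbols pair each prediction with its target; the indicator with superscript obj is 1 if box j and cell i are matched together, the one with superscript noobj is 1 if they are not, and the single-index indicator is 1 if cell i has an object present. The paper sets the weights to 5 for coord and 0.5 for noobj.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
  &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
           + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
  &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
  &+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
      \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
  &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
      \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The first two lines are the bounding box coordinate regression, the next two the bounding box score (confidence) prediction, and the last line the class score prediction.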
