Presentation1

The document discusses various algorithms for 2D and 3D object detection, categorizing them into traditional, anchor-based, anchor-free, and transformer-based methods. It highlights the strengths and weaknesses of each approach, such as the speed and accuracy of YOLO and SSD, as well as the computational demands of DETR and ViT. The comparison emphasizes the trade-offs between accuracy, speed, and complexity in object detection models.

Uploaded by

yayesiy213

Different Algorithms of 2D and 3D Object Detection
Team 3
Jua
Traditional Object Detection

Three stages:
• Region proposal
  • Regions of Interest (RoI)
  • Different window sizes
• Feature extraction
  • Local Binary Pattern (LBP)
  • Histogram of Oriented Gradients (HOG)
  • Scale-Invariant Feature Transform (SIFT)
• Classification and regression
  • Classification
  • Bounding box regression

Flaws:
• Slow speed
• Low accuracy
• High computational overhead
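
The region proposal stage with different window sizes can be sketched as a sliding-window generator. This is a minimal illustration, not a reference implementation: the function name `sliding_windows`, the window sizes, and the stride are assumptions chosen for the example.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def sliding_windows(image_w: int, image_h: int,
                    window_sizes=((64, 64), (128, 128)),  # assumed sizes
                    stride: int = 32) -> Iterator[Box]:
    """Yield candidate regions by sliding windows of several sizes over the image."""
    for win_w, win_h in window_sizes:
        # Stop early enough that every window lies fully inside the image.
        for y in range(0, image_h - win_h + 1, stride):
            for x in range(0, image_w - win_w + 1, stride):
                yield (x, y, x + win_w, y + win_h)


proposals = list(sliding_windows(256, 256))
print(len(proposals), proposals[0])
```

Every yielded window would then go through feature extraction and classification, which is why this style of detector carries a high computational overhead.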
Classification of algorithms

• Anchor-based: RCNN, YOLO, SSD
• Anchor-free: Keypoint-based, Anchor-point-based
• Transformer-based: DETR, ViT

RCNN (Region-Based CNN)

RCNN Overview:
• Introduced by Ross Girshick et al.
• Region Proposal: Uses selective search to propose regions (bounding boxes)
• CNN-Based Feature Extraction: Extracts features for each region
• Classification: Uses classifiers like SVM for object classification
Limitations:
• Slow due to separate steps for region proposal, feature extraction, and classification
• Not real-time
Fast RCNN and Faster RCNN
Fast RCNN:
• Combines feature extraction and classification in a single forward pass
• Uses RoI Pooling for faster computation
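
RoI Pooling converts a region of any size on the shared feature map into a fixed-size grid, so one classifier head can serve every proposal. The sketch below is a simplified pure-Python version on a single-channel feature map. The function name `roi_max_pool`, the end-exclusive RoI convention, and the integer bin boundaries are assumptions for illustration; the RoI is assumed to be non-empty and to lie inside the feature map.

```python
from typing import List, Tuple


def roi_max_pool(feature_map: List[List[float]],
                 roi: Tuple[int, int, int, int],
                 out_size: int = 2) -> List[List[float]]:
    """Max-pool an RoI (x0, y0, x1, y1; end-exclusive, in feature-map cells)
    into a fixed out_size x out_size grid."""
    x0, y0, x1, y1 = roi
    roi_w, roi_h = x1 - x0, y1 - y0
    pooled = []
    for i in range(out_size):
        # Row range of bin i; each bin keeps at least one cell.
        ys = y0 + (i * roi_h) // out_size
        ye = max(ys + 1, y0 + ((i + 1) * roi_h) // out_size)
        row = []
        for j in range(out_size):
            xs = x0 + (j * roi_w) // out_size
            xe = max(xs + 1, x0 + ((j + 1) * roi_w) // out_size)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(roi_max_pool(fmap, (0, 0, 4, 4)))  # 4x4 region -> 2x2 output
print(roi_max_pool(fmap, (0, 0, 3, 3)))  # 3x3 region -> 2x2 output
```

Both calls return a 2x2 grid even though the input regions differ in size, which is the property that lets all proposals share one forward pass.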
Faster RCNN:
• Introduces the Region Proposal Network (RPN) to generate proposals
• Significantly faster than RCNN
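
The RPN scores a fixed set of reference boxes (anchors) at every feature-map location. A minimal sketch of generating those anchors for one location follows. The function name `make_anchors` is hypothetical, and the scales and aspect ratios are illustrative defaults (three of each, giving nine anchors per location).

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def make_anchors(center_x: float, center_y: float,
                 scales=(128, 256, 512),        # anchor side length at ratio 1
                 ratios=(0.5, 1.0, 2.0)) -> List[Box]:  # ratio = height / width
    """Return one anchor per (scale, ratio) pair, centered on the given point."""
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep the area at scale**2 while changing the aspect ratio.
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((center_x - w / 2, center_y - h / 2,
                            center_x + w / 2, center_y + h / 2))
    return anchors


anchors = make_anchors(0.0, 0.0)
print(len(anchors))
print(anchors[1])  # the square anchor at the smallest scale
```

The network then predicts, for each anchor, an objectness score and offsets that refine the anchor into a proposal.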
YOLO (You Only Look Once)
YOLO Overview:
• Single-stage detector: Predicts bounding boxes and class probabilities in one pass, with no separate region proposal step
• Speed: Real-time object detection
• Divides the image into grid cells and predicts bounding boxes for each cell
Strengths:
• Extremely fast
• Real-time detection for video processing
Weaknesses:
• Struggles with detecting small or overlapping objects
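
The grid-cell idea can be sketched by mapping a ground-truth box to the cell that contains its center. This is a simplified illustration, not the exact target encoding of any particular YOLO version. The function name `encode_box_to_grid`, the 7x7 grid, and the 448x448 image are assumptions for the example.

```python
from typing import Tuple


def encode_box_to_grid(box: Tuple[float, float, float, float],
                       image_w: float, image_h: float,
                       grid_size: int = 7):
    """Map a box (x_min, y_min, x_max, y_max) to the responsible grid cell.

    Returns (row, col, x_offset, y_offset, width, height), where the offsets
    are relative to the cell and width/height are relative to the image.
    """
    x_min, y_min, x_max, y_max = box
    # Box center in grid units, e.g. 3.125 means 12.5% into column 3.
    gx = (x_min + x_max) / 2 / image_w * grid_size
    gy = (y_min + y_max) / 2 / image_h * grid_size
    col = min(int(gx), grid_size - 1)
    row = min(int(gy), grid_size - 1)
    return (row, col, gx - col, gy - row,
            (x_max - x_min) / image_w, (y_max - y_min) / image_h)


print(encode_box_to_grid((100, 200, 300, 400), 448, 448))
```

Because each cell is responsible only for objects centered in it, several small or overlapping objects whose centers fall in the same cell compete for the same predictions, which is one way to read the weakness listed above.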
SSD (Single Shot MultiBox Detector)
SSD Overview:
• Combines YOLO’s speed with better accuracy for small objects
• Predicts objects at multiple scales using feature maps from different layers
• No need for a separate region proposal network like in Faster RCNN
Strengths:
• Good balance between speed and accuracy
• Multi-scale detection improves performance for small objects
Weaknesses:
• Still not as accurate as two-stage detectors like Faster RCNN
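
Multi-scale prediction can be illustrated by giving each feature map its own default box scale, spaced linearly between a minimum and a maximum fraction of the image size. The sketch below assumes six feature maps and a range of 0.2 to 0.9; the function names and these values are illustrative assumptions, and `num_maps` must be at least 2.

```python
import math
from typing import List, Tuple


def default_box_scales(num_maps: int = 6,
                       s_min: float = 0.2,
                       s_max: float = 0.9) -> List[float]:
    """Linearly spaced box scales, one per feature map (shallow to deep)."""
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


def default_box_size(scale: float, aspect_ratio: float) -> Tuple[float, float]:
    """Width and height (as fractions of the image) for one default box."""
    root = math.sqrt(aspect_ratio)  # aspect_ratio = width / height
    return scale * root, scale / root


for k, s in enumerate(default_box_scales()):
    w, h = default_box_size(s, 2.0)
    print(f"feature map {k}: scale={s:.2f}, wide box={w:.2f}x{h:.2f}")
```

Shallow, high-resolution feature maps get the small scales and deeper maps get the large ones, which is how a single pass can cover both small and large objects.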
Anchor-Free Keypoint-Based Detection
Overview:
• Anchor-Free: Does not use predefined anchor boxes
• Detects objects by keypoints (like center points or object corners)
• Examples: CornerNet, CenterNet
Strengths:
• Eliminates the complexity of anchor design
• More flexible for varying object shapes
Weaknesses:
• May struggle with overlapping objects or cluttered scenes
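
Center-point detection can be sketched as finding local maxima in a predicted heatmap: a cell counts as an object center if it scores above a threshold and is at least as high as all of its 3x3 neighbours. The function name `heatmap_peaks`, the threshold, and the toy heatmap are assumptions for illustration.

```python
from typing import List, Tuple


def heatmap_peaks(heatmap: List[List[float]],
                  threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) for every local maximum at or above the threshold."""
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # 3x3 neighbourhood, clipped at the borders (includes the cell itself).
            neighbourhood = [heatmap[ny][nx]
                             for ny in range(max(0, y - 1), min(h, y + 2))
                             for nx in range(max(0, x - 1), min(w, x + 2))]
            if score >= max(neighbourhood):
                peaks.append((x, y, score))
    return peaks


heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.0, 0.1],
]
print(heatmap_peaks(heatmap))
```

Two objects whose centers land in the same or adjacent cells would merge into a single peak here, which matches the weakness with overlapping objects noted above.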
Anchor-Free Anchor-Point-Based Detection
Overview:
• Instead of using predefined anchors, anchor points are selected dynamically
• Faster and simpler as it removes the need for a predefined grid of anchors
• Examples: FCOS, CrossDet
Strengths:
• Improves the efficiency of object detection
• Reduces false positives from misaligned anchors
Weaknesses:
• May lose some localization accuracy compared to anchor-based models
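
In an FCOS-style detector, each point inside a ground-truth box regresses its distances to the four box sides, and a centerness score down-weights points far from the box center. The sketch below computes those targets for a single point; the function name `fcos_targets` is hypothetical, and points on or outside the box border are treated as negatives by assumption.

```python
import math
from typing import Optional, Tuple


def fcos_targets(px: float, py: float,
                 box: Tuple[float, float, float, float]
                 ) -> Optional[Tuple[Tuple[float, float, float, float], float]]:
    """Return ((left, top, right, bottom), centerness) for a point, or None
    if the point is not strictly inside the box."""
    x_min, y_min, x_max, y_max = box
    left, top = px - x_min, py - y_min
    right, bottom = x_max - px, y_max - py
    if min(left, top, right, bottom) <= 0:
        return None  # point is a negative sample for this box
    centerness = math.sqrt((min(left, right) / max(left, right)) *
                           (min(top, bottom) / max(top, bottom)))
    return (left, top, right, bottom), centerness


print(fcos_targets(50, 50, (0, 0, 100, 100)))   # box center
print(fcos_targets(25, 50, (0, 0, 100, 100)))   # off-center point
print(fcos_targets(150, 50, (0, 0, 100, 100)))  # outside the box
```

Centerness is 1 at the exact box center and falls toward 0 near the edges, so low-quality boxes predicted from edge points can be suppressed at inference time.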
DETR (DEtection TRansformer)
• Overview:
  • Transformer-based approach for object detection
  • Uses Transformers to model object detection as a set prediction problem
  • No need for non-maximum suppression or anchor boxes
• Strengths:
  • Simplified architecture
  • Strong performance in detecting objects with complex relationships
• Weaknesses:
  • Requires large amounts of data and computational resources
  • Slower training convergence than CNN-based models
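
Treating detection as set prediction means each prediction is matched to at most one ground-truth object by minimising a total matching cost, which is why duplicate boxes do not need to be removed by non-maximum suppression afterwards. The sketch below finds that one-to-one assignment by brute force over permutations. This is a stand-in for the Hungarian algorithm that is only practical for very small sets; the function name `match_predictions` and the cost values are assumptions for illustration.

```python
from itertools import permutations
from typing import List, Tuple


def match_predictions(cost: List[List[float]]) -> Tuple[List[int], float]:
    """Find the one-to-one assignment with the lowest total cost.

    cost[i][j] is the cost of assigning prediction i to target j; the matrix
    is assumed square (targets padded with "no object" entries).
    Brute force over all permutations: O(n!), for illustration only.
    """
    n = len(cost)
    best_assignment, best_total = [], float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_assignment, best_total = list(perm), total
    return best_assignment, best_total


# Hypothetical costs (e.g. a mix of class and box distance terms).
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
    [0.6, 0.4, 0.3],
]
assignment, total = match_predictions(cost)
print(assignment, round(total, 3))
```

Here `assignment[i]` is the target index matched to prediction `i`; the training loss would then be computed only between matched pairs.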
ViT (Vision Transformer)
• Overview:
  • Vision Transformer applies the transformer architecture (originally for NLP) to image data
  • Divides images into patches and processes them like sequences of words
• Strengths:
  • Strong performance when trained on large datasets
  • Captures long-range dependencies in images
• Weaknesses:
  • Requires substantial training data
  • Computationally expensive
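
The patch step can be sketched as cutting an image into non-overlapping squares and flattening each one into a vector, so the image becomes a sequence of tokens. For example, a 224x224 image with 16x16 patches gives (224 / 16)² = 196 tokens. The function name `split_into_patches` and the toy single-channel image are assumptions for illustration; a real model would also apply a learned linear projection and add position embeddings.

```python
from typing import List


def split_into_patches(image: List[List[float]],
                       patch_size: int) -> List[List[float]]:
    """Split a 2D single-channel image into flattened, non-overlapping patches,
    ordered left to right, top to bottom."""
    h, w = len(image), len(image[0])
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patches.append([image[y][x]
                            for y in range(top, top + patch_size)
                            for x in range(left, left + patch_size)])
    return patches


image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = split_into_patches(image, 2)
print(len(tokens), tokens[0], tokens[-1])
```

The sequence length grows with the square of the image side divided by the patch size, and self-attention cost grows with the square of the sequence length, which is one source of the computational expense listed above.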
Comparison of 2-D Object Detection Models

• RCNN: Accurate but slow
• Faster RCNN: Balanced speed and accuracy
• YOLO: Real-time but less accurate
• SSD: More accurate than YOLO on small objects, but less accurate than Faster RCNN
• Anchor-Free Methods: Flexible and efficient, but struggle with clutter
• DETR: Simplified, no anchors, but computationally heavy
• ViT: Best for large-scale data but expensive to train