Imaging and Vision
Pathfinding
Perry Lea
ACM Distinguished Lectures
Floating Point
Application domains: 2D Graphics, 3D Graphics, Vision, Computational Photography, Physics

Domains  Kernel                   Floating Point Requirement
X        Color Space Conversion   Fixed Point
X        Gaussian Blur            Fixed Point
X        Sobel Edge Detection     Fixed Point
X        Bilateral Filters        Fixed Point
X        Bilinear Interpolation   Fixed Point
X        Bicubic Interpolation    Half or Single Precision
X        Image Signal Processor   Fixed Point
X X      Exposure Compensation    Single Precision
X X      Image Blending           Fixed Point
X X      Scaling                  Fixed (for binary scaling)
X        Texture Mapping          Fixed Point
X        Pixel Shading            Single / Double Precision
X        Z-Buffer Depth Test      Single
X        Compositing              Fixed Point or Half Precision
X        Ray Tracing              Single Precision
X        3D Vertex Shading        Double Precision
X        Fluid Dynamics           Single / Double Precision
X        JPEG Compression         Fixed Point
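To make the "Fixed Point" entries concrete, here is a minimal sketch of a color space conversion done with integer arithmetic only. The slide does not say which conversion it has in mind; the BT.601 luma weights and the function name `rgb_to_luma_q8` are assumptions chosen for illustration.

```python
def rgb_to_luma_q8(r: int, g: int, b: int) -> int:
    """Approximate luma from 8-bit RGB using 8-bit fixed-point weights.

    Assumption: BT.601 weights (0.299, 0.587, 0.114), scaled by 256 and
    rounded to 77, 150, 29 so that the weights sum to exactly 256.
    """
    # Adding 128 before the shift rounds to nearest instead of truncating.
    return (77 * r + 150 * g + 29 * b + 128) >> 8


if __name__ == "__main__":
    for rgb in [(0, 0, 0), (255, 255, 255), (255, 0, 0)]:
        print(rgb, "->", rgb_to_luma_q8(*rgb))
```

The result can differ from the floating-point value by one code level, which is the usual trade for avoiding a floating-point unit.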
Vision
Vision Segments
ADAS and Automotive
Medical Imaging
Consumer Electronics and Gaming
Industrial Automation & Robotics
Security, Surveillance, Intelligence, Defense
Facial Recognition
Why Vision
Vision Market
Tractica Research: 42% CAGR, $33B market by 2019
Market to Market: 22.6% CAGR, $22.2B market by 2020
Human Vision
February 11, 2018 | Micron Confidential
What do you see here?
Do you see lines between the circles?
• Guess what: there are none.
Rule 1: Sensory input does not contain enough information to explain our perception.
What did you just see?
• Did you see the people on the bridge?
• Did you see the church?
• Did you see the tunnel?
Rule 2: There is too much sensory input to include in our coherent perceptions at any single moment.
Human Visual Dataflow
Human vision interprets images bottom-up and top-down:
Bottom-up: based on raw sensory data (pixels)
Top-down: based on feature extraction
Find the Target
Human Brain Visual System from Ganglion to Cortex
How Human Vision Works
Humans are born with a nearly fully developed vision system.
Cortical pathways are reinforced and restructured within the first year of development.
Vision starts at ganglion cells and follows the optic nerve.
Some receptors are excited by light intensity; others are inhibited.
Feature Extraction
When a collection of photoreceptors is organized into a center-surround field, the brain can easily perceive light and dark regions.
Edges force ganglion cells to deliver reinforced or diminished signals.
The visual system does an extraordinary job of throwing away information.
Figure: Ganglion Cell Signal Strength
Computer Vision
Vision Principles
SIFT in 6 slides!
Just as the human brain perceives image data top-down and bottom-up, so do typical vision algorithms.
• Features are “interesting” parts of an image, and we will rely on the same edges, corners, and ridges. To be useful, feature points must:
  • Be numerous
  • Be repeatable
  • Represent orientation and scale
  • Be fast to extract and match
Typical Feature Extraction Algorithm
Detector
• Find Scale Space Extrema
• Keypoint Localization: improve keypoints and throw out bad ones.
Descriptor
• Orientation Assignment (remove effects of rotation and scale)
• Create Descriptor: use histograms of orientations
Image Signal Processor (Front End): Lens, Lens Correction, White Balance, Noise Reduction, Demosaic, Color Correction, Tone Mapping, Sharpening, Gamma Correction, 3A Stats, RGB2YUV, Scaler, DRAM
Feature Extraction (Back End): Preprocess, Scan Image, Filter Feature Locations, Generate Signature, Post Process Descriptors
12-megapixel image (RAW10 = 15 MB to 37 MB; at 30 fps = 450 MB/s)
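The memory-traffic figures for the 12-megapixel RAW10 frame can be checked with a few lines of arithmetic. This is a sketch that assumes exactly 12,000,000 pixels, tightly packed 10-bit samples, and decimal megabytes; the helper name `frame_bytes` is hypothetical. The 37 MB upper figure is not reproduced because the slide does not state its pixel format.

```python
def frame_bytes(pixels: int, bits_per_pixel: int) -> float:
    """Size of one frame in bytes, assuming tightly packed samples."""
    return pixels * bits_per_pixel / 8


PIXELS = 12_000_000       # assumed: exactly 12 megapixels
FRAMES_PER_SECOND = 30    # from the slide

raw10_mb = frame_bytes(PIXELS, 10) / 1e6          # decimal megabytes per frame
bandwidth_mb_s = raw10_mb * FRAMES_PER_SECOND     # sustained write rate to DRAM

if __name__ == "__main__":
    print(f"RAW10 frame: {raw10_mb} MB")
    print(f"At {FRAMES_PER_SECOND} fps: {bandwidth_mb_s} MB/s")
```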
Finding Scale Space
Finds keypoints in the image.
The image is convolved with Gaussians at different scales (a variant of blob detection).
The best way to do this is a Laplacian of Gaussian (LoG).
• But a LoG is really computationally expensive (hmmm)
• So we’ll cheat and use a Difference of Gaussian blurs (DoG).
• Convolved images are grouped into “octaves”, each spanning a doubling of the scale. We convolve a certain number of images per octave, with adjacent scales separated by a constant factor k.
• Take the difference of adjacent convolved images within each octave.
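The Difference of Gaussians step can be sketched in plain Python for one octave. The starting sigma of 1.6, three scales per octave, the 3-sigma kernel radius, and clamped borders are assumptions borrowed from common SIFT practice rather than from the slides, and the function names are illustrative only.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping at the image borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [[sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
                 for k in range(-radius, radius + 1))
             for x in range(width)]
            for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable 2D blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def dog_octave(img, sigma0=1.6, scales_per_octave=3):
    """Return the DoG images for one octave (adjacent blur differences)."""
    k = 2 ** (1 / scales_per_octave)   # constant scale factor between levels
    blurred = [gaussian_blur(img, sigma0 * k ** i)
               for i in range(scales_per_octave + 3)]
    return [[[b - a for a, b in zip(row_lo, row_hi)]
             for row_lo, row_hi in zip(lo, hi)]
            for lo, hi in zip(blurred, blurred[1:])]


if __name__ == "__main__":
    image = [[0.0] * 17 for _ in range(17)]
    image[8][8] = 1.0                   # a single bright dot
    dog = dog_octave(image)
    print(len(dog), "DoG images; response at the dot:", dog[0][8][8])
```

A flat image yields an all-zero DoG stack, which is why only blob-like and edge-like structures survive this step.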
Finding Scale Space
Find Extrema
Choose all extrema within a 3x3x3 neighborhood.
• This is done by comparing each pixel in the DoG images to its eight neighbors at the same scale and the nine corresponding neighboring pixels in each of the two neighboring scales. If the pixel value is the maximum or minimum among all compared pixels, it is selected as a candidate keypoint.
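The 26-neighbor comparison can be written directly. This sketch assumes the DoG stack is a nested list indexed as `dog[scale][row][col]` and uses strict comparisons so that flat regions produce no candidates; both choices are assumptions for illustration.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or below all 26 neighbors."""
    center = dog[s][y][x]
    neighbors = [dog[s + ds][y + dy][x + dx]
                 for ds in (-1, 0, 1)
                 for dy in (-1, 0, 1)
                 for dx in (-1, 0, 1)
                 if (ds, dy, dx) != (0, 0, 0)]
    return center > max(neighbors) or center < min(neighbors)


def find_extrema(dog):
    """Scan interior pixels of interior scales for candidate keypoints."""
    n_scales, height, width = len(dog), len(dog[0]), len(dog[0][0])
    return [(s, y, x)
            for s in range(1, n_scales - 1)
            for y in range(1, height - 1)
            for x in range(1, width - 1)
            if is_extremum(dog, s, y, x)]


if __name__ == "__main__":
    stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    stack[1][1][1] = 1.0
    print(find_extrema(stack))
```

The first and last scale of each octave have only one neighboring scale, so they are skipped as centers and serve only as comparison layers.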
Keypoint Localization
Scale space extrema produce too many candidates.
Minimize:
• Use a Taylor series expansion to get the true extrema
Reject:
• Points with low contrast
• Points with a strong edge response in only one direction
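The two rejection tests can be sketched as follows. The contrast threshold of 0.03 and the curvature ratio r = 10 are assumed values commonly quoted for SIFT, not figures from the slides, and the Taylor-series refinement is left out. The edge test compares the principal curvatures of a DoG image through the trace and determinant of its 2x2 Hessian, estimated with finite differences.

```python
def passes_contrast_test(value, threshold=0.03):
    """Keep only candidates whose DoG response is strong enough."""
    return abs(value) >= threshold


def passes_edge_test(d, y, x, r=10.0):
    """Reject points whose curvature is large in only one direction.

    d is a single DoG image (list of rows); (y, x) must be an interior pixel.
    """
    dxx = d[y][x + 1] - 2 * d[y][x] + d[y][x - 1]
    dyy = d[y + 1][x] - 2 * d[y][x] + d[y - 1][x]
    dxy = (d[y + 1][x + 1] - d[y + 1][x - 1]
           - d[y - 1][x + 1] + d[y - 1][x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign, or one is zero: not a blob.
        return False
    return trace * trace / det < (r + 1) ** 2 / r


if __name__ == "__main__":
    blob = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    ridge = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
    print("blob kept:", passes_edge_test(blob, 1, 1))
    print("ridge kept:", passes_edge_test(ridge, 1, 1))
```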
Orientation Assignment
Remove effects of rotation
Create a histogram of gradient orientations (36 bins)
Weighted by gradient magnitude and a Gaussian window
Any peak within 80% of the highest creates a new keypoint
A parabola is fit to the 3 histogram values closest to each peak
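A rough sketch of the orientation step: build the 36-bin histogram, keep every local peak within 80% of the highest, and refine each peak with a parabola through its three nearest bins. The input format (a list of magnitude, angle in degrees, Gaussian weight triples) and the function names are assumptions for illustration.

```python
def orientation_histogram(gradients, bins=36):
    """Accumulate magnitude * gaussian_weight into orientation bins."""
    hist = [0.0] * bins
    bin_width = 360.0 / bins
    for magnitude, angle_deg, weight in gradients:
        hist[int((angle_deg % 360.0) / bin_width) % bins] += magnitude * weight
    return hist


def dominant_orientations(hist, ratio=0.8):
    """Return refined angles (degrees) for peaks within ratio of the maximum."""
    bins = len(hist)
    bin_width = 360.0 / bins
    highest = max(hist)
    angles = []
    for i, value in enumerate(hist):
        left, right = hist[(i - 1) % bins], hist[(i + 1) % bins]
        if value >= ratio * highest and value > left and value > right:
            # Vertex of the parabola through (i-1, left), (i, value), (i+1, right).
            offset = 0.5 * (left - right) / (left - 2 * value + right)
            angles.append(((i + 0.5 + offset) * bin_width) % 360.0)
    return angles


if __name__ == "__main__":
    samples = [(10.0, 45.0, 1.0), (9.0, 205.0, 1.0)]
    print(dominant_orientations(orientation_histogram(samples)))
```

Each returned angle would spawn its own keypoint at the same location and scale, which is how one image point can carry several orientations.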
Keypoint Descriptor
We now want to compute a descriptor for each keypoint to make it distinctive across various illuminations, 3D views, etc.
Similar to human biological vision
Neurons respond to gradients at certain frequencies
A 4x4 grid of gradient windows, each with an 8-bin orientation histogram built from 4x4 samples = 4x4x8 = 128-element feature vector
Lighting gains will not affect descriptors
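The 4x4x8 layout, and the reason a lighting gain drops out, can be illustrated with a simplified descriptor. This sketch assumes a 16x16 patch of precomputed gradient magnitudes and angles, and it omits the Gaussian weighting and bin interpolation a full implementation would use. Normalizing to unit length is what cancels a multiplicative gain.

```python
import math


def sift_like_descriptor(magnitudes, angles_deg, cells=4, bins=8):
    """Build a cells x cells x bins histogram vector and normalize it."""
    size = len(magnitudes)            # assumed 16, giving 4x4 samples per cell
    cell_size = size // cells
    bin_width = 360.0 / bins
    desc = [0.0] * (cells * cells * bins)
    for y in range(size):
        for x in range(size):
            cell = (y // cell_size) * cells + (x // cell_size)
            b = int((angles_deg[y][x] % 360.0) / bin_width) % bins
            desc[cell * bins + b] += magnitudes[y][x]
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


if __name__ == "__main__":
    mags = [[1.0 + ((x * y) % 5) for x in range(16)] for y in range(16)]
    angs = [[float((x * 23 + y * 7) % 360) for x in range(16)] for y in range(16)]
    brighter = [[2.0 * m for m in row] for row in mags]   # simulated lighting gain
    d1 = sift_like_descriptor(mags, angs)
    d2 = sift_like_descriptor(brighter, angs)
    print(len(d1), max(abs(a - b) for a, b in zip(d1, d2)))
```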
Feature Detection Algorithms
Edge Detection:
– Canny, Sobel, Prewitt, Differential
Corner Detectors:
– Harris, FAST, SUSAN
Blob Detectors:
– Laplacian of Gaussian
– Difference of Gaussian
– Determinant of Hessian
Transforms:
– Ridge, Hough, Structure Tensor
Affine Invariants:
– Affine shape adaptation
– Harris Affine
– Hessian Affine
Feature Descriptors:
– SIFT, SURF, GLOH, HOG, BRIEF, ORB, BRISK, FREAK
Other Vision Challenges
Segmentation
Meaningful partitioning of an image or video into non-overlapping regions and subvolumes. Ability to handle multi-modal data of varying complexity.
Figure: Color Image Segmentation Output (original image courtesy of University of California at Berkeley; courtesy RIT)
Other Vision Challenges
Super Resolution
Utilizing multiple images of a given scene to obtain a high-resolution image with improved image quality
Other Vision Challenges
Hierarchical Scale Space
Using information at various scales to determine the semantic structure of an image. Utilize probabilistic modeling of image content to build a dynamic hierarchical tree for high-resolution remote sensing.
Courtesy RIT
Other Vision Challenges
Computational Photography
Computational photography combines plentiful computing, digital sensors, modern optics, actuators, and smart lights to escape the limitations of traditional film cameras and enable novel imaging applications. Unbounded dynamic range; variable focus, resolution, and depth of field; hints about shape, reflectance, and lighting; and new interactive forms of photos that are partly snapshots and partly videos are just some of the new applications found in computational photography.
• Light Field Arrays
• Massive Image Stitching/Warping
• Computational Optics
• Holographic Imaging

Computer vision series
