Point Feature Detection and Matching: Davide Scaramuzza
Lecture 05
Point Feature Detection and Matching
Davide Scaramuzza
http://rpg.ifi.uzh.ch
Lab Exercise 3 - Today afternoon
Room ETH HG E 1.1 from 13:15 to 15:00
Work description: implement the Harris corner detector and tracker
Outline
• Filters for Feature detection
• Point-feature extraction: today and next lecture
Filters for Feature Detection
• In the last lecture, we used filters to reduce noise or
enhance contours
Filters for Template Matching
• Find locations in an image that are similar to a template
• If we look at filters as templates, we can use correlation (like convolution but
without flipping the filter) to detect these locations
Template
[Figure: detected template location and the corresponding correlation map]
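The correlation-based template detection sketched above can be written in a few lines of NumPy. This is a minimal illustration (not the lecture's reference code); the function name `correlation_map` is ours:

```python
import numpy as np

def correlation_map(image, template):
    """Slide the template over the image and record the correlation
    (sum of element-wise products; like convolution, but the filter
    is NOT flipped) at every valid location."""
    th, tw = template.shape
    H, W = image.shape
    scores = np.zeros((H - th + 1, W - tw + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            scores[i, j] = np.sum(image[i:i+th, j:j+tw] * template)
    return scores

# Toy example: the peak of the correlation map sits at the template's location.
image = np.zeros((8, 8))
image[3:5, 4:6] = 1.0                # a bright 2x2 patch at row 3, col 4
template = np.ones((2, 2))
scores = correlation_map(image, template)
row, col = np.unravel_index(np.argmax(scores), scores.shape)   # -> (3, 4)
```

On real images, plain correlation is biased toward bright regions, which is why the normalized variants discussed next are preferred.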
Where’s Waldo?
Template
Scene
Summary of filters
• Smoothing filter:
– has positive values
– sums to 1 → preserves the brightness of constant regions
– removes “high-frequency” components: “low-pass” filter
• Derivative filter:
– has opposite signs → used to get a high response in regions of high contrast
– sums to 0 → no response in constant regions
– highlights “high-frequency” components: “high-pass” filter
• Filters as templates:
– highest response for regions that “look similar to the filter”
Template Matching
• What if the template is not identical to the object we want to detect?
• Template matching only works if the scale, orientation, illumination, and, in general, the appearance of the template and of the object to detect are very similar. What about the pixels in the template background (object-background problem)?
Correlation as Scalar Product
• Consider images $H$ and $F$ as vectors; their correlation is:

$\langle H, F \rangle = \|H\| \, \|F\| \cos\theta$

• In Normalized Cross Correlation (NCC), we consider the unit vectors of $H$ and $F$; hence, we measure their similarity based on the angle $\theta$. If $H$ and $F$ are identical, then NCC = 1.
Other Similarity Measures
• Normalized Cross Correlation (NCC): takes values between −1 and +1 (+1 = identical)

$NCC = \dfrac{\sum_{u=-k}^{k}\sum_{v=-k}^{k} H(u,v)\, F(u,v)}{\sqrt{\sum_{u=-k}^{k}\sum_{v=-k}^{k} H(u,v)^2}\;\sqrt{\sum_{u=-k}^{k}\sum_{v=-k}^{k} F(u,v)^2}}$
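The NCC formula maps directly to code. A short illustrative sketch (not the official exercise code):

```python
import numpy as np

def ncc(H, F):
    """Normalized Cross-Correlation of two equal-sized patches.
    Returns a value in [-1, +1]; +1 means identical up to a positive gain."""
    H = np.asarray(H, dtype=float).ravel()
    F = np.asarray(F, dtype=float).ravel()
    return np.sum(H * F) / (np.sqrt(np.sum(H ** 2)) * np.sqrt(np.sum(F ** 2)))

H = np.array([[1.0, 2.0], [3.0, 4.0]])
print(round(ncc(H, H), 6))       # identical patches -> 1.0
print(round(ncc(H, 3 * H), 6))   # a pure gain change leaves NCC at 1.0
```

Note that plain NCC is invariant to a multiplicative gain but not to an additive offset; the zero-mean variants below handle that case.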
Zero-mean SAD, SSD, NCC
To account for the difference in the average intensity of two images (typically caused by additive illumination changes, $I'(x,y) = \alpha I(x,y) + \beta$), we subtract the mean value of each image:

$\mu_H = \frac{1}{N}\sum_{u=-k}^{k}\sum_{v=-k}^{k} H(u,v), \qquad \mu_F = \frac{1}{N}\sum_{u=-k}^{k}\sum_{v=-k}^{k} F(u,v)$,

where $N$ is the number of pixels of $H$ or $F$. For example, the zero-mean NCC (ZNCC) is:

$ZNCC = \dfrac{\sum_{u}\sum_{v} \big(H(u,v)-\mu_H\big)\big(F(u,v)-\mu_F\big)}{\sqrt{\sum_{u}\sum_{v} \big(H(u,v)-\mu_H\big)^2}\;\sqrt{\sum_{u}\sum_{v} \big(F(u,v)-\mu_F\big)^2}}$
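The zero-mean variants can be sketched as follows (an illustration only; the function names are ours):

```python
import numpy as np

def zero_mean(P):
    """Subtract the patch mean, cancelling additive illumination offsets."""
    P = np.asarray(P, dtype=float)
    return P - P.mean()

def zsad(H, F):
    """Zero-mean Sum of Absolute Differences (lower = more similar)."""
    return np.sum(np.abs(zero_mean(H) - zero_mean(F)))

def zssd(H, F):
    """Zero-mean Sum of Squared Differences (lower = more similar)."""
    return np.sum((zero_mean(H) - zero_mean(F)) ** 2)

def zncc(H, F):
    """Zero-mean NCC: +1 = identical up to an affine change I' = a*I + b, a > 0."""
    h, f = zero_mean(H), zero_mean(F)
    return np.sum(h * f) / (np.sqrt(np.sum(h ** 2)) * np.sqrt(np.sum(f ** 2)))

H = np.array([[10.0, 20.0], [30.0, 40.0]])
F = 2.0 * H + 50.0                # affine illumination change
print(zsad(H, H + 5.0))           # additive offset only -> 0.0
print(round(zncc(H, F), 6))       # affine change -> 1.0
```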
Census Transform
Advantages:
• No square roots or divisions are required, thus very efficient to implement, especially on FPGA
• Intensities are considered relative to the center pixel of the patch, making it invariant to monotonic intensity changes
Outline
• Filters for feature extraction
• Point-feature (or keypoint) extraction: today and next lecture
Keypoint extraction and matching - Example
Video from “Forster, Pizzoli, Scaramuzza, SVO: Semi-Direct Visual Odometry, ICRA’14”
https://www.youtube.com/watch?v=2YnIMfw6bJY
Why do we need keypoints?
Recall the Visual-Odometry flow chart:
[Flow chart: Image sequence → Feature detection → Feature matching (tracking) → Motion estimation → Local optimization]
Why do we need keypoints?
Keypoint extraction is the key ingredient of motion estimation!
[Flow chart: Image sequence → Feature detection (keypoints $\boldsymbol{u}_i$, $\boldsymbol{u}'_i$) → Feature matching (tracking) → Motion estimation ($T_{k,k-1} = ?$ from 2D-2D, 3D-3D, or 3D-2D correspondences $\boldsymbol{p}_i$) → Local optimization]
Keypoints are also used for:
• Panorama stitching
• Object recognition
• 3D reconstruction
• Place recognition
• Indexing and database retrieval (e.g., Google Images or http://tineye.com)
Image matching: why is it challenging?
Example: panorama stitching
Local features and alignment
• Detect point features in both images
• Find corresponding pairs
• Use these pairs to align the images
Matching with Features
• Problem 1:
– Detect the same points independently in both images; otherwise, there is no chance to match!
We need a repeatable feature detector. Repeatable means that the detector should be able to re-detect the same feature in different images of the same scene. This property is called the Repeatability of a feature detector.
Matching with Features
• Problem 2:
– For each point, identify its correct correspondence in the other
image(s)
Illumination changes
$I'(x,y) = \alpha I(x,y) + \beta$
Invariant local features
A subset of local feature types is designed to be invariant to common geometric and photometric transformations.
Basic steps:
1) Detect repeatable and distinctive interest points
2) Extract invariant descriptors
Main questions
• What features are repeatable and distinctive?
• How to describe a feature?
• How to establish correspondences, i.e., compute matches?
What is a Repeatable & Distinctive feature?
• Consider the image pair below with extracted patches
• Notice how some patches can be localized or matched with higher accuracy
than others
Corners:
• A corner is defined as the intersection of two or more edges
• Corners have high localization accuracy → corners are good for VO
• Corners are less distinctive than blobs
• E.g., Harris, Shi-Tomasi, SUSAN, FAST

Blobs:
• A blob is any other image pattern that is not a corner and differs significantly from its neighbors (e.g., a connected region of pixels with similar color, a circle, etc.)
• Blobs have lower localization accuracy than corners
• Blobs are more distinctive than corners → blobs are better for place recognition
• E.g., MSER, LoG, DoG (SIFT), SURF, CenSurE, BRIEF, BRISK, ORB, FREAK, etc.
Corner detection
• Key observation: in the region around a corner, image gradient
has two or more dominant directions
• Corners are repeatable and distinctive
The Moravec Corner detector (1980)
• How do we identify corners?
• Look at a region of pixels through a small window
• Shifting a window in any direction should give large intensity changes (e.g., in SSD) in at least 2 directions
H. Moravec, Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover, PhD thesis, Chapter 5, Stanford University, Computer Science Department, 1980.
The Moravec Corner detector (1980)
Consider the reference patch centered at $(x, y)$ and the shifted window centered at $(x + \Delta x, y + \Delta y)$. The patch has size $P$. The Sum of Squared Differences between them is:

$SSD(\Delta x, \Delta y) = \sum_{(x,y)\in P} \big( I(x,y) - I(x+\Delta x,\, y+\Delta y) \big)^2$
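Moravec's test can be sketched directly from the SSD definition. A toy illustration with simplified window handling (the function name and test image are ours):

```python
import numpy as np

def moravec_ssd(I, x, y, dx, dy, half=1):
    """SSD between the patch centred at (x, y) and the window shifted by
    (dx, dy). The image is indexed as I[row, col] = I[y, x]."""
    ref = I[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    sh = I[y + dy - half:y + dy + half + 1,
           x + dx - half:x + dx + half + 1].astype(float)
    return np.sum((ref - sh) ** 2)

# On a flat region every shift gives ~0; at a corner all shifts give a large SSD.
I = np.zeros((9, 9))
I[4:, 4:] = 10.0      # step corner with its tip near (x, y) = (4, 4)
print(moravec_ssd(I, 2, 2, 1, 0))                                  # flat -> 0.0
print(moravec_ssd(I, 4, 4, 1, 0) > 0 and moravec_ssd(I, 4, 4, 0, 1) > 0)  # True
```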
The Harris Corner detector (1988)
• It implements the Moravec corner detector without having to physically shift the window; instead, it just looks at the patch itself, using differential calculus.
C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” Proceedings of the 4th Alvey Vision Conference, 1988, pages 147-151.
How do we implement this?
• Consider the reference patch centered at $(x, y)$ and the shifted window centered at $(x + \Delta x, y + \Delta y)$. The patch has size $P$.

$SSD(\Delta x, \Delta y) = \sum_{(x,y)\in P} \big( I(x,y) - I(x+\Delta x,\, y+\Delta y) \big)^2$

• Let $I_x = \frac{\partial I(x,y)}{\partial x}$ and $I_y = \frac{\partial I(x,y)}{\partial y}$. Approximating with a 1st-order Taylor expansion:

$SSD(\Delta x, \Delta y) \approx \sum_{(x,y)\in P} \big( I_x(x,y)\,\Delta x + I_y(x,y)\,\Delta y \big)^2$

This is a simple quadratic function in two variables $(\Delta x, \Delta y)$.
How do we implement this?

$SSD(\Delta x, \Delta y) \approx \sum_{(x,y)\in P} \big( I_x(x,y)\,\Delta x + I_y(x,y)\,\Delta y \big)^2$

$SSD(\Delta x, \Delta y) \approx \sum_{(x,y)\in P} \begin{bmatrix} \Delta x & \Delta y \end{bmatrix} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}$

$SSD(\Delta x, \Delta y) \approx \begin{bmatrix} \Delta x & \Delta y \end{bmatrix} M \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}, \qquad M = \sum_{(x,y)\in P} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$
Notice that the entries of $M$, i.e., $\sum I_x^2$, $\sum I_y^2$, $\sum I_x I_y$, are NOT matrix products but pixel-wise products!
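Building $M$ from the gradient patches takes one line per entry; note the pixel-wise products. A sketch with hypothetical gradient patches (the function name is ours):

```python
import numpy as np

def second_moment_matrix(Ix, Iy):
    """Sum the PIXEL-WISE products Ix*Ix, Ix*Iy, Iy*Iy over the patch
    to form the 2x2 second-moment matrix M."""
    return np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                     [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

# A patch with horizontal gradient only (e.g., a vertical edge):
Ix = np.ones((3, 3))
Iy = np.zeros((3, 3))
M = second_moment_matrix(Ix, Iy)   # [[9, 0], [0, 0]]: one eigenvalue is 0 -> not a corner
```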
• We can visualize $\begin{bmatrix} \Delta x & \Delta y \end{bmatrix} M \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \text{const}$ as an ellipse, with axis lengths determined by the eigenvalues of $M$ and the two axes’ orientations determined by $R$ (i.e., the eigenvectors of $M$)
• The two eigenvectors identify the directions of largest and smallest change of SSD
[Figure: iso-SSD ellipse with axis lengths $(\lambda_{\max})^{-1/2}$ and $(\lambda_{\min})^{-1/2}$; the eigenvectors give the directions of the fastest and slowest change of SSD]
Example
• First, consider an edge or a flat region:

Edge: $M = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & \lambda_2 \end{bmatrix}$

Flat region: $M = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

• We can conclude that if at least one of the eigenvalues $\lambda$ is close to 0, then this is not a corner.
• Now, let’s consider an axis-aligned corner. Diagonalizing gives $M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$ with $R = \begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix}$
• We can observe that the dominant gradient directions are at 45 degrees with the $x$ and $y$ axes
• We can also conclude that if both eigenvalues are much larger than 0, then this is a corner
How to compute λ1, λ2, R from M
Eigenvalue/eigenvector review
• You can easily prove that $\lambda_1$, $\lambda_2$ are the eigenvalues of $M$.
• The eigenvectors $x$ and eigenvalues $\lambda$ of a square matrix $A$ satisfy: $Ax = \lambda x$
• The scalar $\lambda$ is the eigenvalue corresponding to $x$
– The eigenvalues are found by solving: $\det(A - \lambda I) = 0$
– In our case, $A = M$ is a 2×2 matrix, so we have $\det \begin{bmatrix} m_{11} - \lambda & m_{12} \\ m_{21} & m_{22} - \lambda \end{bmatrix} = 0$
– The solution is: $\lambda_{1,2} = \frac{1}{2}\Big[ (m_{11} + m_{22}) \pm \sqrt{4\, m_{12} m_{21} + (m_{11} - m_{22})^2} \Big]$
– Once you know $\lambda$, you find the two eigenvectors $x$ (i.e., the two columns of $R$) by solving: $\begin{bmatrix} m_{11} - \lambda & m_{12} \\ m_{21} & m_{22} - \lambda \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0$
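The closed-form eigenvalue formula maps directly to code. A sketch (for a symmetric $M$ the discriminant is always non-negative):

```python
import numpy as np

def eig2x2(M):
    """Eigenvalues of a 2x2 matrix via the closed-form formula from the slide."""
    m11, m12, m21, m22 = M[0, 0], M[0, 1], M[1, 0], M[1, 1]
    disc = np.sqrt(4.0 * m12 * m21 + (m11 - m22) ** 2)
    lam1 = 0.5 * ((m11 + m22) + disc)
    lam2 = 0.5 * ((m11 + m22) - disc)
    return lam1, lam2

M = np.array([[2.0, 1.0], [1.0, 2.0]])
lam1, lam2 = eig2x2(M)    # 3.0 and 1.0 for this symmetric example
```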
Visualization of 2nd moment matrices
NB: the ellipses here are plotted proportionally to the eigenvalues, not as iso-SSD ellipses as explained before. So small ellipses denote a flat region, and big ones a corner.
Interpreting the eigenvalues
• Classification of image points using the eigenvalues of $M$:
– “Flat” region: $\lambda_1$ and $\lambda_2$ are small
– “Edge”: $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$
– “Corner”: $\lambda_1$ and $\lambda_2$ are both large ⇒ SSD increases in all directions
• A corner can then be identified by checking whether the minimum of the two eigenvalues of $M$ is larger than a certain user-defined threshold:
⇒ $R = \min(\lambda_1, \lambda_2) > \text{threshold}$
• $R$ is called the “cornerness function”
• The corner detector using this criterion is called the «Shi-Tomasi» detector
J. Shi and C. Tomasi, “Good Features to Track,” Proceedings of the 9th IEEE Conference on Computer Vision and Pattern Recognition, June 1994.
Harris Detector: Workflow
• Compute corner response $R$
• Find points with large corner response: $R > \text{threshold}$
• Take only the points of local maxima of the thresholded $R$ (non-maxima suppression)
Harris (or Shi-Tomasi) Corner Detector Algorithm
Algorithm:
1. Compute derivatives in the x and y directions ($I_x$, $I_y$), e.g., with the Sobel filter
2. Compute $I_x^2$, $I_y^2$, $I_x I_y$
3. Convolve $I_x^2$, $I_y^2$, $I_x I_y$ with a box filter to get $\sum I_x^2$, $\sum I_y^2$, $\sum I_x I_y$, which are the entries of the matrix $M$ (optionally, use a Gaussian filter instead of a box filter to avoid aliasing and give more “weight” to the central pixels)
4. Compute the corner measure $R$ (according to Shi-Tomasi or Harris)
5. Find points with a large corner response ($R$ > threshold)
6. Take the points of local maxima of $R$
From now on, whenever you hear “Harris corner detector”, we will be referring to either the original Harris detector (1988) or to its modification by Shi-Tomasi (1994). Shi-Tomasi, despite being a bit more expensive, has a small advantage… see the next slides.
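The six steps above can be sketched end-to-end in NumPy. This is a minimal, unoptimized illustration using the Shi-Tomasi response; the parameter values are arbitrary and the helper names are ours:

```python
import numpy as np

def conv2(I, k):
    """'Same'-size 2D convolution with zero padding (sufficient for this sketch)."""
    kh, kw = k.shape
    P = np.pad(I.astype(float), ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    kf = k[::-1, ::-1]                       # convolution flips the kernel
    out = np.zeros(I.shape)
    for i in range(I.shape[0]):
        for j in range(I.shape[1]):
            out[i, j] = np.sum(P[i:i + kh, j:j + kw] * kf)
    return out

def shi_tomasi(I, patch=3, thresh=1.0):
    # 1. derivatives with Sobel filters
    sobel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    Ix, Iy = conv2(I, sobel), conv2(I, sobel.T)
    # 2.-3. box-filter the pixel-wise products to get the entries of M
    box = np.ones((patch, patch))
    Sxx, Syy, Sxy = conv2(Ix * Ix, box), conv2(Iy * Iy, box), conv2(Ix * Iy, box)
    # 4. cornerness: the smaller eigenvalue of M, in closed form
    tr, det = Sxx + Syy, Sxx * Syy - Sxy ** 2
    R = 0.5 * (tr - np.sqrt(np.maximum(tr ** 2 - 4.0 * det, 0.0)))
    # 5.-6. threshold, then keep only 3x3 local maxima (non-maxima suppression)
    corners = [(i, j)
               for i in range(1, I.shape[0] - 1)
               for j in range(1, I.shape[1] - 1)
               if R[i, j] > thresh and R[i, j] == R[i-1:i+2, j-1:j+2].max()]
    return R, corners
```

On a synthetic white square, the strongest responses appear at the square’s four corners; on real images one would pre-smooth with a Gaussian and tune the threshold.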
Harris vs. Shi-Tomasi
[Figure: the Shi-Tomasi operator vs. the Harris operator applied to the same image]
Harris Detector: Some Properties
How does the size of the Harris detector affect the performance?
Repeatability:
• How does the Harris detector behave with geometric changes? Which means,
can it re-detect the same corners when the image exhibits changes in
• Rotation,
• Scale (zoom),
• View-point,
• Illumination
Harris Detector: Some Properties
• Rotation invariance
The ellipse rotates, but its shape (i.e., its eigenvalues) remains the same
Harris Detector: Some Properties
• But: non-invariant to image scale!
Harris Detector: Some Properties
• Quality of Harris detector for different scale changes
Repeatability=