Lecture 03
• Relevant reading:
– Szeliski’s book (1st edition Chapter 4 or 2nd edition Chapter 7)
– David Lowe’s article (2004)
http://www.cs.ubc.ca/~lowe/keypoints/
Acknowledgement: many slides from Svetlana Lazebnik, Steve Seitz, David Lowe,
Kristen Grauman, and others (detailed credits on individual slides)
Edge detection
• An edge is a place of rapid change in the
image intensity function
• Along a horizontal scanline of the image, edges correspond to extrema of the first derivative of the intensity function
Source: S. Lazebnik
Derivatives with convolution
For 2D function f(x,y), the partial derivative is:
∂f(x, y)/∂x = lim_{ε→0} [f(x + ε, y) − f(x, y)] / ε
For discrete data, we can approximate using finite
differences:
∂f(x, y)/∂x ≈ f(x + 1, y) − f(x, y)
To implement the above as convolution, what would be
the associated filter?
Source: K. Grauman
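A minimal NumPy sketch of the answer (the scanline values are illustrative): since convolution flips the kernel before sliding it, the filter [1, −1] implements the forward difference f(x+1) − f(x).

```python
import numpy as np

f = np.array([0., 0., 1., 3., 6., 6., 6.])   # a 1D intensity scanline
fwd = f[1:] - f[:-1]                          # forward differences f(x+1) - f(x)
# Convolution flips the kernel, so [1, -1] implements the forward difference
# (cross-correlation with [-1, 1] gives the same result):
conv = np.convolve(f, [1., -1.], mode='valid')
print(conv)
```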
Partial derivatives of an image
∂f(x, y)/∂x and ∂f(x, y)/∂y
• Filters: a horizontal [−1 1] kernel for ∂f/∂x and a vertical [−1 1]ᵀ kernel for ∂f/∂y (or the sign-flipped [1 −1] versions)
• Which derivative image shows changes with respect to x?
Source: S. Lazebnik
Image gradient
f ∗ g
d/dx (f ∗ g)
• To find edges, look for peaks in d/dx (f ∗ g)
Source: S. Seitz
Derivative theorem of convolution
• Differentiation is convolution, and convolution is associative: d/dx (f ∗ g) = f ∗ (d/dx g)
• This saves us one operation: convolve f directly with the derivative of the smoothing filter, f ∗ (d/dx g)
Source: S. Seitz
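A quick numerical check of the theorem (a minimal NumPy sketch; the noisy signal and the Gaussian width are made-up choices):

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.cumsum(rng.standard_normal(200))    # a noisy 1D "scanline"

x = np.arange(-9, 10).astype(float)
g = np.exp(-x**2 / 2)                      # Gaussian smoothing kernel
g /= g.sum()
d = np.array([1.0, -1.0])                  # finite-difference (derivative) filter

lhs = np.convolve(np.convolve(f, g), d)    # d/dx (f * g): smooth, then differentiate
rhs = np.convolve(f, np.convolve(g, d))    # f * (d/dx g): one convolution saved
print(np.allclose(lhs, rhs))               # True: associativity makes them equal
```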
Derivative of Gaussian filters
[figures: derivative-of-Gaussian filters in the x-direction and y-direction]
Derivative filters
• Derivatives of Gaussian
• Can the values of a derivative filter be negative?
• What should the values sum to?
– Zero: no response in constant regions
Source: S. Lazebnik
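The zero-sum property is easy to verify numerically (NumPy sketch; the constant value 7.0 is arbitrary):

```python
import numpy as np

deriv = np.array([1.0, -1.0])            # derivative filter: values sum to zero
flat = np.full(10, 7.0)                  # constant-intensity region
resp = np.convolve(flat, deriv, mode='valid')
print(deriv.sum(), np.abs(resp).max())   # 0.0 0.0: no response in constant regions
```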
Keypoint extraction: Corners
Source: S. Lazebnik
Why extract keypoints?
• Motivation: panorama stitching
• We have two images – how do we combine them?
Source: S. Lazebnik
Characteristics of good keypoints
• Repeatability
• The same keypoint can be found in several images despite geometric
and photometric transformations
• Saliency
• Each keypoint is distinctive
• Compactness and efficiency
• Many fewer keypoints than image pixels
• Locality
• A keypoint occupies a relatively small area of the image; robust to
clutter and occlusion
Source: S. Lazebnik
Applications
Keypoints are used for:
• Image alignment
• 3D reconstruction
• Motion tracking
• Robot navigation
• Indexing and database retrieval
• Object recognition
Source: S. Lazebnik
Corner Detection: Basic Idea
• We should easily recognize the point by
looking through a small window
• Shifting a window in any direction should
give a large change in intensity
• The error of shifting a window W by (u, v):
E(u, v) = Σ_{(x,y)∈W} [I(x + u, y + v) − I(x, y)]²
Source: S. Lazebnik
Corner Detection: Mathematics
• First-order Taylor approximation for small
motions [u, v]:
I(x + u, y + v) ≈ I(x, y) + Ix u + Iy v
• Let’s plug this into E(u,v):
E(u, v) ≈ Σ_{(x,y)∈W} [Ix u + Iy v]² = Σ_{(x,y)∈W} (Ix² u² + 2 Ix Iy uv + Iy² v²)
Source: S. Lazebnik
E(u, v) ≈ [u v] M [u v]ᵀ
where M is a second moment matrix computed from image
derivatives:
M = [ Σx,y Ix²    Σx,y Ix Iy
      Σx,y Ix Iy  Σx,y Iy²  ]
Source: S. Lazebnik
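The quadratic approximation can be sanity-checked numerically. A minimal NumPy sketch (the Gaussian test image, the window, and the shift are illustrative choices):

```python
import numpy as np

# Smooth synthetic image: a broad Gaussian blob
yy, xx = np.mgrid[:64, :64].astype(float)
img = np.exp(-((xx - 32)**2 + (yy - 32)**2) / (2 * 8.0**2))

Iy, Ix = np.gradient(img)
W = (slice(20, 45), slice(20, 45))     # window of pixels

# Second moment matrix summed over the window
M = np.array([[np.sum(Ix[W] * Ix[W]), np.sum(Ix[W] * Iy[W])],
              [np.sum(Ix[W] * Iy[W]), np.sum(Iy[W] * Iy[W])]])

u, v = 1, 0                            # a small test shift
# Direct error: sum over W of [I(x+u, y+v) - I(x, y)]^2
shifted = np.roll(img, shift=(-v, -u), axis=(0, 1))
E_direct = np.sum((shifted[W] - img[W])**2)
E_quad = np.array([u, v]) @ M @ np.array([u, v])
print(E_direct, E_quad)                # close for small shifts
```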
Interpreting the second moment matrix
Consider a horizontal “slice” of E(u, v): [u v] M [u v]ᵀ = const
This is the equation of an ellipse.
Diagonalization of M: M = R⁻¹ [λ1 0; 0 λ2] R
The axis lengths of the ellipse are determined by the
eigenvalues and the orientation is determined by R
[figure: ellipse whose short axis, of length (λmax)^(-1/2), lies along the direction of fastest change, and whose long axis, of length (λmin)^(-1/2), lies along the direction of slowest change]
Source: S. Lazebnik
Visualization of second moment matrices
Source: S. Lazebnik
Interpreting the eigenvalues
Classification of image points using eigenvalues
of M:
• λ1 and λ2 both small: “flat” region
• λ2 >> λ1 (or λ1 >> λ2): “edge”
• λ1 and λ2 both large, λ1 ~ λ2: “corner”, E increases in all directions
Source: S. Lazebnik
Corner response function
R = det(M) − α trace(M)² = λ1λ2 − α(λ1 + λ2)²
α: constant (0.04 to 0.06)
• “Corner”: R > 0
• “Edge”: R < 0
• “Flat” region: |R| small
Source: S. Lazebnik
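The response function is straightforward to compute from image gradients. A minimal NumPy sketch of a Harris-style response (the box window, `np.gradient` derivatives, and the square test image are illustrative simplifications; the original detector uses a Gaussian window):

```python
import numpy as np

def box_sum(a, win=3):
    # Sum over a win x win neighborhood (separable box filter)
    k = np.ones(win)
    a = np.apply_along_axis(np.convolve, 0, a, k, 'same')
    a = np.apply_along_axis(np.convolve, 1, a, k, 'same')
    return a

def harris_response(img, alpha=0.05, win=3):
    Iy, Ix = np.gradient(img.astype(float))   # image gradients
    Sxx = box_sum(Ix * Ix, win)               # per-pixel entries of M
    Syy = box_sum(Iy * Iy, win)
    Sxy = box_sum(Ix * Iy, win)
    # R = det(M) - alpha * trace(M)^2
    return Sxx * Syy - Sxy**2 - alpha * (Sxx + Syy)**2

img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0                          # white square on black
R = harris_response(img)
print(R[8, 8] > 0, R[8, 16] < 0, abs(R[0, 0]) < 1e-12)  # corner, edge, flat
```

As predicted by the eigenvalue picture: R is positive at the square's corners, negative in the middle of its edges, and zero in flat regions.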
With a window function w(x, y), the second moment matrix becomes:
M = [ Σx,y w(x, y) Ix²    Σx,y w(x, y) Ix Iy
      Σx,y w(x, y) Ix Iy  Σx,y w(x, y) Iy²  ]
Source: S. Lazebnik
• Affine intensity change I → a I + b: only derivatives are used, so the shift b has no effect
• Intensity scaling I → a I: R is scaled too, so corners selected by thresholding R can change
• Image translation and rotation: the corner response is preserved
• Scaling: a corner at one scale may look like a set of edges at another, so the detector is not scale invariant
Basic idea
• Convolve the image with a “blob filter” at
multiple scales and look for extrema of filter
response in the resulting scale space
Blob detection
[figure: image ∗ blob filter = response map, with marked minima and maxima]
Blob filter
Laplacian of Gaussian: Circularly symmetric
operator for blob detection in 2D
∇²g = ∂²g/∂x² + ∂²g/∂y²
Source: S. Lazebnik
Edge
• f: signal with a step edge
• d/dx g: derivative of Gaussian
• f ∗ d/dx g: edge = maximum of the derivative
Source: S. Seitz
Edge
• f: signal with a step edge
• d²/dx² g: second derivative of Gaussian (Laplacian)
• f ∗ d²/dx² g: edge = zero crossing of the second derivative
Source: S. Seitz
Scale selection
• We want to find the characteristic scale of the
blob by convolving it with Laplacians at several
scales and looking for the maximum response
• However, Laplacian response decays as scale
increases:
Scale normalization
• The response of a derivative of Gaussian
filter to a perfect step edge decreases as σ
increases
• Its maximum response magnitude is 1/(σ√(2π))
Source: S. Lazebnik
Scale normalization
• To keep response the same (scale-invariant),
must multiply Gaussian derivative by σ
• Laplacian is the second Gaussian derivative,
so it must be multiplied by σ2
Source: S. Lazebnik
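This 1/σ decay, and its repair by multiplying by σ, can be checked numerically (NumPy sketch; the σ values are arbitrary):

```python
import numpy as np

def gauss_deriv(sigma):
    # Sampled first derivative of a Gaussian with standard deviation sigma
    half = int(6 * sigma)
    x = np.arange(-half, half + 1).astype(float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -x / sigma**2 * g

step = np.concatenate([np.zeros(100), np.ones(100)])   # perfect step edge
peaks = {s: np.abs(np.convolve(step, gauss_deriv(s), mode='same')).max()
         for s in (2.0, 4.0)}
print(peaks[2.0] / peaks[4.0])             # raw response roughly halves when sigma doubles
print(2.0 * peaks[2.0], 4.0 * peaks[4.0])  # sigma * response stays roughly constant
```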
Blob detection in 2D
• Scale-normalized Laplacian of Gaussian:
∇²norm g = σ² (∂²g/∂x² + ∂²g/∂y²)
Source: S. Lazebnik
Blob detection in 2D
• At what scale does the Laplacian achieve a maximum
response to a binary circle of radius r?
Source: S. Lazebnik
Blob detection in 2D
• At what scale does the Laplacian achieve a maximum
response to a binary circle of radius r?
• To get maximum response, the zeros of the Laplacian
have to be aligned with the circle
• The Laplacian is given by (up to scale):
(x² + y² − 2σ²) e^(−(x² + y²)/(2σ²))
• Therefore, the maximum response occurs at σ = r/√2.
[figure: Laplacian whose zero crossings align with a circle of radius r]
Source: S. Lazebnik
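A numerical check of the σ = r/√2 prediction (NumPy sketch; the disk radius and the σ grid are arbitrary choices):

```python
import numpy as np

def norm_log_kernel(sigma, half):
    # Scale-normalized Laplacian of Gaussian, sampled on a (2*half+1)^2 grid
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r2 = xx**2 + yy**2
    g = np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return sigma**2 * (r2 - 2 * sigma**2) / sigma**4 * g

size, radius = 81, 8
yy, xx = np.mgrid[:size, :size] - size // 2
img = (xx**2 + yy**2 <= radius**2).astype(float)    # binary circle (disk)

c = size // 2
sigmas = np.arange(3.0, 9.01, 0.25)
resp = []
for s in sigmas:
    half = int(4 * s)
    k = norm_log_kernel(s, half)
    patch = img[c - half:c + half + 1, c - half:c + half + 1]
    resp.append(abs(np.sum(patch * k)))             # |response| at the disk center
best = sigmas[int(np.argmax(resp))]
print(best, radius / np.sqrt(2))   # maximum near r / sqrt(2) ≈ 5.66
```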
Scale-space blob detector: Example
Source: S. Lazebnik
Scale-space blob detector
1. Convolve image with scale-normalized
Laplacian at several scales
2. Find maxima of squared Laplacian response
in scale-space
Source: S. Lazebnik
Scale-space blob detector: Example
Source: S. Lazebnik
Eliminating edge responses
• Laplacian has strong response along edge
Source: S. Lazebnik
L = σ² (Gxx(x, y, σ) + Gyy(x, y, σ))  (Laplacian)
DoG = G(x, y, kσ) − G(x, y, σ)  (Difference of Gaussians)
Source: S. Lazebnik
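The DoG is a close approximation to the scale-normalized Laplacian: from the diffusion equation ∂G/∂σ = σ∇²G, one gets G(x, y, kσ) − G(x, y, σ) ≈ (k − 1)σ²∇²G. A NumPy sketch (σ, k, and the grid size are arbitrary choices):

```python
import numpy as np

sigma, k = 4.0, 1.2
half = 20
yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
r2 = xx**2 + yy**2

def gauss2d(s):
    return np.exp(-r2 / (2 * s**2)) / (2 * np.pi * s**2)

dog = gauss2d(k * sigma) - gauss2d(sigma)               # Difference of Gaussians
log = (r2 - 2 * sigma**2) / sigma**4 * gauss2d(sigma)   # Laplacian of Gaussian
approx = (k - 1) * sigma**2 * log                       # (k-1) * scale-normalized LoG
corr = np.corrcoef(dog.ravel(), approx.ravel())[0, 1]
print(corr)   # nearly 1: the two filters have almost identical shape
```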
Efficient implementation
David G. Lowe.
"Distinctive image features from scale-invariant keypoints.” IJCV 60
(2), pp. 91-110, 2004.
Source: S. Lazebnik
From feature detection to feature description
• Scaled and rotated versions of the same
neighborhood will give rise to blobs that are related by
the same transformation
• What to do if we want to compare the appearance of
these image regions?
• Normalization: transform these regions into same-size circles
• Problem: rotational ambiguity
Source: S. Lazebnik
Eliminating rotation ambiguity
• To assign a unique orientation to circular
image windows:
• Create histogram of local gradient directions in the patch
• Assign canonical orientation at peak of smoothed histogram
[figure: gradient orientation histogram over 0 to 2π]
Source: S. Lazebnik
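A minimal sketch of the orientation assignment (NumPy; the 36-bin histogram follows Lowe, the ramp patch is an illustrative example, and the smoothing of the histogram is omitted):

```python
import numpy as np

# Patch whose intensity increases to the right: gradient points along +x
patch = np.tile(np.arange(16, dtype=float), (16, 1))
Iy, Ix = np.gradient(patch)
mag = np.hypot(Ix, Iy)                     # gradient magnitude
ang = np.arctan2(Iy, Ix)                   # gradient direction in [-pi, pi)

# 36-bin orientation histogram weighted by gradient magnitude
hist, edges = np.histogram(ang, bins=36, range=(-np.pi, np.pi), weights=mag)
peak = np.argmax(hist)
dominant = 0.5 * (edges[peak] + edges[peak + 1])   # canonical orientation
print(dominant)   # within one bin of 0 (the +x direction)
```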
SIFT features
• Detected features with characteristic scales
and orientations:
David G. Lowe.
"Distinctive image features from scale-invariant keypoints.” IJCV 60
(2), pp. 91-110, 2004. Source: S. Lazebnik
From feature detection to feature description
Detection is covariant:
features(transform(image)) = transform(features(image))
Description is invariant:
features(transform(image)) = features(image)
Source: S. Lazebnik
SIFT descriptors
• Inspiration: complex neurons in the primary
visual cortex
1. Scale-space extrema detection
Search over multiple scales and image locations.
2. Keypoint localization
Fit a model to determine location and scale.
Select keypoints based on a measure of stability.
3. Orientation assignment
Compute best orientation(s) for each keypoint region.
4. Keypoint description
Use local image gradients at selected scale and rotation
to describe each keypoint region.
Source: D. Hoiem
A hard keypoint matching problem
Source: S. Lazebnik
Answer below (look for tiny colored squares…)