
Lecture10-Featurebased Image Matching (2)

The document discusses feature-based methods for image matching, including the Bag of Visual Words approach and feature descriptors like SIFT and SURF. It covers geometric consistency checks, vocabulary trees, and the RANSAC algorithm for robust mapping. Additionally, it highlights the importance of comparing feature histograms and the applications of these methods in visual search datasets.

Uploaded by

Tân Hoàng
Copyright
© © All Rights Reserved

Feature-based methods for image matching

• Bag of Visual Words approach
• Feature descriptors
  • SIFT descriptor
  • SURF descriptor
• Geometric consistency check
• Vocabulary tree
A Bag of Words

[Figure: word cloud with the terms self-evident, truths, Liberty, happiness, endowed, Creator, inalienable, pursuit, Life]
Representing a Text as a “Bag of Words”

We hold these truths to be self-evident, that all men are created equal,
that they are endowed by their Creator with certain unalienable Rights,
that among these are Life, Liberty and the pursuit of Happiness. That to
secure these rights, Governments are instituted among Men, deriving
their just powers from the consent of the governed, That whenever any
Form of Government becomes destructive of these ends, it is the Right of
the People to alter or to abolish it, and to institute new Government, laying
its foundation on such principles and organizing its powers in such form,
as to them shall seem most likely to effect their Safety and Happiness.
Prudence, indeed, will dictate that Governments long established should
not be changed for light and transient causes; and accordingly all
experience hath shewn, that mankind are more disposed to suffer, while
evils are sufferable, than to right themselves by abolishing the forms to
which they are accustomed. But when a long train of abuses and
usurpations, pursuing invariably the same Object evinces a design to
reduce them under absolute Despotism, it is their right, it is their duty, to
throw off such Government, and to provide new Guards for their future
security.

[Figure: the same word cloud next to the text: self-evident, truths, Liberty, happiness, endowed, Creator, inalienable, pursuit, Life]
Representing an Image as a “Bag of Visual Words”
Feature descriptors

• Represent local pattern around a keypoint by a vector (“feature descriptor”)
• Establish feature correspondences by finding the nearest neighbor in descriptor space
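Nearest-neighbor matching in descriptor space can be illustrated with a minimal NumPy sketch. The function name `match_nearest_neighbor`, the Euclidean distance, and the toy data are assumptions made for illustration; the slides do not prescribe a particular distance or search structure.

```python
import numpy as np

def match_nearest_neighbor(desc_a, desc_b):
    """For each descriptor in desc_a, return the index of its nearest
    neighbor in desc_b (Euclidean distance) and that distance."""
    # Pairwise squared distances via broadcasting: shape (len(a), len(b)).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    nn_index = np.argmin(dist2, axis=1)
    nn_dist = np.sqrt(dist2[np.arange(len(desc_a)), nn_index])
    return nn_index, nn_dist

# Toy data (hypothetical): desc_b is a shuffled, slightly perturbed copy of desc_a.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 8))
perm = np.array([3, 0, 4, 1, 2])
desc_b = desc_a[perm] + 0.01 * rng.normal(size=(5, 8))

nn_index, nn_dist = match_nearest_neighbor(desc_a, desc_b)
print(nn_index)  # recovers the inverse of the shuffle
```

The brute-force distance matrix is fine for a handful of descriptors; for large databases an approximate search structure would usually replace it.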
Scale/rotation invariant feature descriptors

• Scale invariance: extract features at scale provided by keypoint detection
• Rotation invariance:
  • Detect dominant orientation
    • Average gradient direction
    • Peak detection in gradient direction histogram
  • Rotate coordinate system to dominant orientation
  • Multiple strong orientation peaks: generate second feature point
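Peak detection in a gradient direction histogram might look like the following sketch. The 36 bins and the 0.8 peak ratio are assumptions borrowed from common SIFT practice, not values stated on the slide, and `dominant_orientations` is a hypothetical helper name. Each returned angle would give rise to its own feature point.

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Return the orientations (radians, bin centers) of all strong peaks
    in the magnitude-weighted gradient direction histogram of a patch."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)
    bins = np.floor(angle / (2 * np.pi) * num_bins).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=num_bins)
    # A peak is a circular local maximum that is close to the global maximum.
    left, right = np.roll(hist, 1), np.roll(hist, -1)
    is_peak = (hist > left) & (hist >= right) & (hist >= peak_ratio * hist.max())
    centers = (np.arange(num_bins) + 0.5) * 2 * np.pi / num_bins
    return centers[is_peak]

# Toy patch: intensity ramp along x, so every gradient points in direction 0.
patch = np.tile(np.arange(16.0), (16, 1))
peaks = dominant_orientations(patch)
print(peaks)  # a single orientation, the center of the first bin
```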
SIFT descriptors
• SIFT - Scale-Invariant Feature Transform [Lowe, 1999, 2004]
• Sample thresholded image gradients at 16x16 locations in scale space (in local coordinate system for rotation and scale invariance)
• For each of the 4x4 subregions, generate an orientation histogram with 8 directions; each observation is weighted with the magnitude of the image gradient and a window function
• 4 x 4 x 8 = 128-dimensional feature vector
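The 4x4x8 layout can be made concrete with a simplified sketch. It assumes the 16x16 patch has already been resampled into the keypoint's local coordinate system, uses a Gaussian window with sigma equal to half the patch width, and interprets "thresholded" as clipping the normalized vector at 0.2 followed by renormalization; these choices follow common descriptions of SIFT and are assumptions here. The interpolation between neighboring bins used in full implementations is omitted, and `sift_like_descriptor` is a hypothetical name.

```python
import numpy as np

def sift_like_descriptor(patch, threshold=0.2):
    """128-D descriptor from a 16x16 patch given in local coordinates."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)
    # Gaussian window centered on the patch (assumed sigma = 8 samples).
    coords = np.arange(16) - 7.5
    window = np.exp(-(coords[:, None] ** 2 + coords[None, :] ** 2)
                    / (2 * 8.0 ** 2))
    weight = magnitude * window
    direction = np.floor(angle / (2 * np.pi) * 8).astype(int) % 8
    # One 8-bin orientation histogram per 4x4-sample subregion.
    descriptor = np.zeros((4, 4, 8))
    for row in range(16):
        for col in range(16):
            descriptor[row // 4, col // 4, direction[row, col]] += weight[row, col]
    descriptor = descriptor.ravel()
    # Normalize, clip large entries, renormalize.
    norm = np.linalg.norm(descriptor)
    if norm > 0:
        descriptor = np.minimum(descriptor / norm, threshold)
        descriptor /= np.linalg.norm(descriptor)
    return descriptor

rng = np.random.default_rng(0)
sift_vec = sift_like_descriptor(rng.random((16, 16)))
print(sift_vec.shape)  # (128,)
```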
SURF descriptors
• SURF – Speeded Up Robust Features [Bay et al. 2006]
• Compute horizontal and vertical pixel differences, dx, dy (in local coordinate system for rotation and scale invariance, window size 20σ x 20σ, where σ is feature scale)
• Sum dx, dy, and |dx|, |dy| over 4x4 subregions (SURF-64) or 3x3 subregions (SURF-36)
• Normalize vector for gain invariance, but distinguish bright blobs and dark blobs based on sign of Laplacian (trace of Hessian matrix)
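The summation over subregions can be sketched as follows. The sketch assumes the patch is already sampled in the local coordinate system and substitutes `np.gradient` for the pixel differences, so it is an illustration of the layout rather than a reference implementation. `surf_like_descriptor` is a hypothetical name.

```python
import numpy as np

def surf_like_descriptor(patch, grid=4):
    """Sum dx, dy, |dx|, |dy| over grid x grid subregions and normalize.
    grid=4 gives 64 values (SURF-64), grid=3 gives 36 values (SURF-36)."""
    dy, dx = np.gradient(patch.astype(float))
    features = []
    for rows in np.array_split(np.arange(patch.shape[0]), grid):
        for cols in np.array_split(np.arange(patch.shape[1]), grid):
            sub_dx = dx[np.ix_(rows, cols)]
            sub_dy = dy[np.ix_(rows, cols)]
            features += [sub_dx.sum(), sub_dy.sum(),
                         np.abs(sub_dx).sum(), np.abs(sub_dy).sum()]
    descriptor = np.array(features)
    # Unit length makes the vector invariant to a gain (contrast) change.
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor

rng = np.random.default_rng(0)
surf_patch = rng.random((20, 20))
surf_64 = surf_like_descriptor(surf_patch, grid=4)
surf_36 = surf_like_descriptor(surf_patch, grid=3)
surf_gain = surf_like_descriptor(3.0 * surf_patch + 7.0, grid=4)
print(len(surf_64), len(surf_36))  # 64 36
```

Scaling the patch intensities by a constant and adding an offset leaves the normalized vector unchanged, which is the gain invariance named in the last bullet.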
Computing feature descriptors

[Figure: descriptor computation pipeline. Labels: Color, Gray; Filters Dxx, Dyy, Dxy; Blob Response DxxDyy - (0.9Dxy)^2; Maxima; Orient along dominant gradient; Oriented Patch; Gradient Field; SURF Descriptor (Σdx, Σdy, Σ|dx|, Σ|dy|); SIFT Descriptor]
“Bag of Visual Words” Matching
Pairwise Comparison
Which of the following statements are true?

(a) A bag of visual words representation is robust against partial occlusions of an object.

(b) The SIFT descriptor can only be calculated for SIFT keypoints. Similarly, the SURF descriptor can only be calculated for SURF keypoints.

(c) Both SIFT and SURF descriptors only depend on image gradients.

(d) The SIFT descriptor is more robust against image rotation since it uses an orientation histogram.
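Geometric mapping
• Notation:
  • Inhomogeneous coordinates: $\mathbf{x} = (x \;\; y)^T$
  • Homogeneous coordinates: $\bar{\mathbf{x}} = (x \;\; y \;\; 1)^T$
  • Reference image: $\mathbf{x}$; target image: $\mathbf{x}'$
• Translation
  $$\mathbf{x}' = \mathbf{x} + \mathbf{t}, \qquad \mathbf{t} = (t_x \;\; t_y)^T$$
• Euclidean transformation (rotation and translation)
  $$\mathbf{x}' = \begin{bmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \end{bmatrix} \bar{\mathbf{x}}$$
• Scaled rotation (similarity transform)
  $$\mathbf{x}' = \begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \end{bmatrix} \bar{\mathbf{x}}$$
Geometric mapping
• Affine transformation
  $$\mathbf{x}' = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{bmatrix} \bar{\mathbf{x}}$$
  • Motion of planar surface in 3d under orthographic projection
  • Parallel lines are preserved
Geometric mapping
• Motion of planar surface in 3d under perspective projection
• Homography
  $$\bar{\mathbf{x}}' \sim \begin{bmatrix} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & h_{22} \end{bmatrix} \bar{\mathbf{x}}$$
• Inhomogeneous coordinates (after normalization)
  $$x' = \frac{h_{00} x + h_{01} y + h_{02}}{h_{20} x + h_{21} y + h_{22}}, \qquad y' = \frac{h_{10} x + h_{11} y + h_{12}}{h_{20} x + h_{21} y + h_{22}}$$
• Straight lines are preserved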
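The mappings above differ only in the matrix applied to the homogeneous point, which a short NumPy sketch can make explicit. The helper names (`to_homogeneous`, `similarity_matrix`, `apply_affine`, `apply_homography`) and the example values are assumptions for illustration.

```python
import numpy as np

def to_homogeneous(points):
    """Append a 1 to each (x, y) row: shape (N, 2) -> (N, 3)."""
    return np.hstack([points, np.ones((len(points), 1))])

def similarity_matrix(s, theta, tx, ty):
    """2x3 scaled rotation; s = 1 gives the Euclidean transformation."""
    c, d = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -d, tx],
                     [d,  c, ty]])

def apply_affine(A, points):
    """Map points with a 2x3 matrix A acting on homogeneous coordinates."""
    return to_homogeneous(points) @ A.T

def apply_homography(H, points):
    """Map points with a 3x3 matrix H, then divide by the third coordinate."""
    mapped = to_homogeneous(points) @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
A = similarity_matrix(s=2.0, theta=np.pi / 2, tx=1.0, ty=0.0)
affine_result = apply_affine(A, points)

# A homography whose last row is (0, 0, 1) reproduces the affine result.
H = np.vstack([A, [0.0, 0.0, 1.0]])
homography_result = apply_homography(H, points)
print(np.round(affine_result, 6))
```

With a 90 degree rotation, scale 2, and a shift of 1 along x, the points (0,0), (1,0), (0,1) land on (1,0), (1,2), (-1,0).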
RANSAC
• RANdom SAmple Consensus [Fischler, Bolles, 1981]
• Randomly select subset of k correspondences
• Compute geometric mapping parameters by linear regression
• Apply geometric mapping to all keypoints
• Count no. of inliers (closer than ε from the corresponding keypoint, typical ε = 1…3 pixels)
• Repeat process S times, keep geometric mapping with largest no. of inliers
• Required number of trials
  $$S = \frac{\log(1 - P)}{\log(1 - q^k)}$$
  where P is the total probability of success and q is the probability of a valid correspondence
• Use small number of correspondences
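The RANSAC loop can be sketched for an affine model as follows. The trial count uses the standard expression S = log(1 - P) / log(1 - q^k), where P is the total probability of success and q the probability of a valid correspondence. Function names, the synthetic correspondences, and the parameter defaults are assumptions for illustration only.

```python
import numpy as np

def required_trials(p_success, q_valid, k):
    """S = log(1 - P) / log(1 - q^k), rounded up."""
    return int(np.ceil(np.log(1.0 - p_success) / np.log(1.0 - q_valid ** k)))

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A with dst ~ A [x, y, 1]^T."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    solution, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return solution.T

def ransac_affine(src, dst, k=3, epsilon=2.0, trials=100, seed=0):
    """Keep the affine mapping with the largest number of inliers."""
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])
    best_A, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(trials):
        subset = rng.choice(len(src), size=k, replace=False)
        A = fit_affine(src[subset], dst[subset])
        # Apply the mapping to all keypoints and count those within epsilon.
        error = np.linalg.norm(src_h @ A.T - dst, axis=1)
        inliers = error < epsilon
        if inliers.sum() > best_inliers.sum():
            best_A, best_inliers = A, inliers
    return best_A, best_inliers

# Synthetic correspondences (hypothetical): 30 valid, 10 random outliers.
rng = np.random.default_rng(1)
true_A = np.array([[1.2, -0.3, 10.0], [0.4, 0.9, -5.0]])
src = rng.uniform(0, 100, size=(40, 2))
dst = np.hstack([src, np.ones((40, 1))]) @ true_A.T
dst[:10] = rng.uniform(0, 100, size=(10, 2))

min_trials = required_trials(p_success=0.99, q_valid=0.75, k=3)
# The default of 100 trials leaves a generous margin over min_trials.
A_est, inliers = ransac_affine(src, dst)
print(min_trials, inliers.sum())
```

Because q is raised to the power k, the required number of trials grows quickly with the sample size, which is why a small number of correspondences per trial is preferred.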
RANSAC with Affine Model
RANSAC with Homography
SURF features & affine RANSAC
Pairwise Comparison
Which of the following statements are true?

(a) RANSAC is resilient against missing features, extraneous features, and noisy correspondences in a bag of visual words matching scenario.

(b) An affine model contains a homography as a special case.

(c) RANSAC can only be applied if the number of inliers is larger than the number of outliers.

(d) For a fixed number of iterations in RANSAC, using a model with a larger number of parameters always increases the probability of success.
Comparing Feature Histograms
• Speed up by comparing histograms of features: pairwise image comparison only for similar histograms
• Histogram intersection of query histogram Q and histogram D of database entry [Swain, Ballard 1991]
  $$\frac{\sum_{i=1}^{n} \min(Q_i, D_i)}{\sum_{i=1}^{n} D_i}$$
• Equivalent to mean absolute difference, if both histograms contain same number of samples
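The equivalence with the absolute difference follows from min(a, b) = (a + b - |a - b|) / 2: if both histograms contain N samples, the intersection equals 1 - Σ|Q_i - D_i| / (2N). A small numerical check, with made-up histograms and a hypothetical function name:

```python
import numpy as np

def histogram_intersection(query, database):
    """Sum of bin-wise minima, normalized by the database histogram total."""
    return np.minimum(query, database).sum() / database.sum()

# Two toy histograms with the same number of samples (N = 10).
query = np.array([4, 0, 3, 3])
database = np.array([2, 2, 5, 1])

similarity = histogram_intersection(query, database)

# Same value from the sum of absolute differences.
sum_abs_diff = np.abs(query - database).sum()
via_abs_diff = 1.0 - sum_abs_diff / (2 * database.sum())
print(similarity, via_abs_diff)  # 0.6 0.6
```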
Growing Vocabulary Tree

k=3

[Nistér and Stewenius, 2006]
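Growing and querying a vocabulary tree can be sketched as hierarchical k-means followed by a greedy descent. This is a minimal sketch under assumptions: the dictionary-based tree, the plain Lloyd iterations, the random descriptors, and all function names are illustrative and not taken from the cited paper.

```python
import numpy as np

def assign(data, centers):
    """Index of the nearest center for every row of data."""
    dist2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

def kmeans(data, k, rng, iterations=10):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iterations):
        labels = assign(data, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return centers, assign(data, centers)

def grow_tree(data, k, depth, rng):
    """Recursively split the descriptors into k clusters per level."""
    if depth == 0 or len(data) < k:
        return {"children": None}  # leaf node = one visual word
    centers, labels = kmeans(data, k, rng)
    children = [grow_tree(data[labels == j], k, depth - 1, rng)
                for j in range(k)]
    return {"centers": centers, "children": children}

def query_path(tree, descriptor):
    """Descend to the nearest child at each level; the path names the leaf."""
    path = []
    while tree["children"] is not None:
        j = int(assign(descriptor[None, :], tree["centers"])[0])
        path.append(j)
        tree = tree["children"][j]
    return tuple(path)

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(300, 8))   # stand-in for real descriptors
tree = grow_tree(descriptors, k=3, depth=2, rng=rng)
paths = [query_path(tree, d) for d in descriptors]
print(len(set(paths)))  # number of distinct visual words that were hit
```

A query descriptor needs only k distance computations per level, so the cost of finding its visual word grows with the depth of the tree rather than with the number of leaves.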


Querying Vocabulary Tree

Query
Querying: Hard Binning vs. Soft Binning

[Figure: two panels, each showing a query descriptor at distances d1, d2, d3 from node 1, node 2, node 3. Left panel: Hard Binning [Nistér and Stewenius, CVPR 2006]. Right panel: Soft Binning [Philbin et al., CVPR 2008].]
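The difference between the two binning schemes can be sketched with the three distances d1, d2, d3 from the figure. Hard binning gives the full vote to the nearest node. For soft binning, the Gaussian weighting exp(-d^2 / (2σ^2)) with normalization is an assumption modeled on common descriptions of soft assignment; the distance values, σ, and function names are made up for illustration.

```python
import numpy as np

def hard_binning(distances):
    """Assign the whole vote to the nearest node."""
    weights = np.zeros(len(distances))
    weights[np.argmin(distances)] = 1.0
    return weights

def soft_binning(distances, sigma=1.0):
    """Spread the vote over nearby nodes with Gaussian weights (assumed form)."""
    weights = np.exp(-np.asarray(distances) ** 2 / (2 * sigma ** 2))
    return weights / weights.sum()

distances = np.array([1.0, 1.2, 3.0])   # d1, d2, d3 (hypothetical values)
hard = hard_binning(distances)
soft = soft_binning(distances)
print(hard)               # [1. 0. 0.]
print(np.round(soft, 3))
```

When d1 and d2 are nearly equal, hard binning discards node 2 entirely, while soft binning still gives it a substantial share of the vote.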
Stanford Mobile Visual Search Dataset
Querying: Hard Binning vs. Soft Binning

SURF features
6-level tree
1M leaf nodes
3269 query images
100 top tree results
Applications
Matlab help: Local Feature Detection and Extraction
