
Feature and Object Descriptions
Computer Vision
Outline
• Feature and Object Descriptions
• Descriptors for contours
• Descriptors for areas
• Descriptors for points
• Points matching
• Model matching

Slide 2 of 103
Feature and Object Descriptions
Matching problem
How do we match features across different images?

To compare features, we must first describe them.

Slide 4 of 103
Description of the contours
The result of detectors and segmentation algorithms is a set of special points for which it
is necessary to construct a mathematical description.
Contour (chain) codes
Starting from the first point, the contour is traversed clockwise, with each subsequent
point being encoded with a number from 0 to 7, depending on its location.

Curve coding example: 771210766711076771122334.
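
A minimal Python sketch of 8-directional chain coding, assuming the contour is given as an ordered list of pixel coordinates (the direction-to-digit mapping below is one common convention and may differ from the one used on the slide):

import itertools

# Map a step between neighboring pixels to a digit 0..7
# (here 0 = east, counting counterclockwise; one common convention).
DIRS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
        (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(contour):
    # contour: ordered list of (x, y) pixel coordinates of the traversal.
    return [DIRS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(contour, contour[1:])]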

Slide 5 of 103
Description of the contours
Contour (chain) codes

Disadvantages:
• dependence on the starting point of the encoding;
• lack of invariance to rotation;
• instability to noise: local changes in the contour can lead to different encodings.

Slide 6 of 103
Description of the contours
Piecewise polynomial approximation
Search for a curve passing near a given set of contour points.
The curve is divided by separate nodes into segments, while the approximating function
on each of the segments looks like:

$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$,

where $a_i$ are the coefficients of the polynomial, determined on each segment.

Slide 7 of 103
Description of the contours
Piecewise linear approximation
For each pair of nodes, it is necessary to determine two coefficients $a_0$ and $a_1$; the total number of coefficients to be determined is $2(n+1)$, where $n$ is the total number of nodes.
For piecewise linear approximation, an iterative algorithm for the selection of endpoints
can be used:
1. The end points of the contour, A and B, are connected by a straight line.
2. Distances to the line AB are calculated for the remaining points.
3. The point C with the greatest deviation from the line AB is taken as an additional node.
4. The curve is replaced by the two segments AC and CB.
The procedure repeats until the maximum deviation of the points is below the specified threshold. The accuracy of the piecewise linear approximation is determined by this threshold value.
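
This iterative selection of endpoints is essentially the Ramer-Douglas-Peucker algorithm; below is a recursive Python sketch (function names are illustrative, and the endpoints A and B are assumed to be distinct):

import math

def point_line_dist(p, a, b):
    # Perpendicular distance from point p to the line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.hypot(bx - ax, by - ay)

def simplify(points, threshold):
    # Split at the point C farthest from segment AB until the maximum
    # deviation falls below the threshold.
    a, b = points[0], points[-1]
    dists = [point_line_dist(p, a, b) for p in points[1:-1]]
    if not dists or max(dists) < threshold:
        return [a, b]
    c = 1 + max(range(len(dists)), key=dists.__getitem__)
    left = simplify(points[:c + 1], threshold)
    right = simplify(points[c:], threshold)
    return left[:-1] + right  # drop the duplicated node C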
Slide 8 of 103
Description of the contours
Piecewise linear approximation

Iterative selection of end points: on the left, the first stage; in the center, the second stage; on the right, the third stage.

Disadvantages:
• the approximating function is not smooth (the first derivatives are discontinuous at the
grid nodes);
• dependence of the approximation results on the initial experimental data.

Slide 9 of 103
Description of the contours
Spline fit
• In practice, cubic splines are often used for approximation.
• Cubic splines give high approximation accuracy and smoothness of the function.
• If the function being approximated has strong inflections, a cubic spline can overshoot in some cases.
• A first-degree spline does not overshoot in this situation, but it is difficult to achieve the required approximation accuracy with it.
• Significant difficulties arise when approximating functions with large curvature values.
• Both cubic and first-degree splines require a large number of interpolation nodes.

Slide 10 of 103
Description of the contours
Rational splines
They combine the properties of first-degree and cubic splines and allow approximating functions with large curvature values and with breakpoints.
A rational spline is a function $S_R(x)$ which on each segment $[x_i, x_{i+1}]$ has the form:

$S_R(x) = a_i t + b_i (1 - t) + \frac{c_i t^3}{1 + p_i (1 - t)} + \frac{d_i (1 - t)^3}{1 + q_i t}$,

where $t = \frac{x - x_i}{x_{i+1} - x_i}$, and $p_i, q_i$ are given numbers with $0 < p_i, q_i < \infty$.
The parameters $p_i, q_i$ define the properties of rational splines:
1. If $p_i, q_i$ are close to zero, the rational spline approaches a cubic spline;
2. if the parameters $p_i, q_i$ are large enough, the spline error estimates are comparable to those of a first-degree spline.
In most cases, it is customary to assume $p_i = q_i$.
Slide 11 of 103
Description of the contours
Natural curve representation
• The natural representation of the curve implies the absence of connection points and branches on the contours.
• The contour is represented as a one-dimensional function of an attribute over the arc length.
• The arc length $l_j$ of the discrete contour at the point $P(j) = (x_j, y_j)$ can be approximated as follows:

$l_j = \sum_{i=1}^{j-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}$

• The contour is often represented by the curvature function $K(l)$, calculated by the formula:

$K(l) = K(x(l), y(l)) = \frac{f'_x f''_y - f''_x f'_y}{\left( (f'_x)^2 + (f'_y)^2 \right)^{3/2}}$,

where $f'_x, f'_y$ are the first derivatives with respect to $x$ and $y$ respectively;
$f''_x, f''_y$ are the second derivatives with respect to $x$ and $y$.
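
A NumPy sketch of the cumulative arc length and the curvature formula above (derivative estimates would come, e.g., from a fitted polynomial, as discussed on slide 16):

import numpy as np

def arc_length(xs, ys):
    # l_j: cumulative sum of segment lengths up to point j.
    dx, dy = np.diff(xs), np.diff(ys)
    return np.concatenate(([0.0], np.cumsum(np.hypot(dx, dy))))

def curvature(dx, dy, ddx, ddy):
    # K = (f'_x f''_y - f''_x f'_y) / (f'_x^2 + f'_y^2)^(3/2),
    # given first and second derivative estimates along the contour.
    return (dx * ddy - ddx * dy) / (dx**2 + dy**2) ** 1.5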
Slide 12 of 103
Description of the contours
Natural curve representation

Advantages of the curvature function:
• shift and rotation invariance.

Disadvantages:
• lack of invariance to scale;
• rectilinear (straight) contours cannot be represented as a function of curvature;
• the need to approximate curves for accurate calculation of derivatives at a point.

Curvature function

Slide 13 of 103
Description of the contours
Natural curve representation
• An analogue of curvature is the amount of contour inflection at a point.
• To obtain the inflection value, no curve approximation is required; a discrete representation of the curve as a sequence of pixel coordinates of the contour points is used instead.

To calculate the value of the inflection at the point $P(i)$ you must:
• choose two points of the sequence, $P(i-k)$ and $P(i+k)$, equidistant from $P(i)$ by $k$ points;
• determine the slopes to the left ($K_L$) and to the right ($K_R$) of the point $P(i)$:

$K_L = \alpha_1 = \arctan\frac{y_i - y_{i-k}}{x_i - x_{i-k}}, \quad K_R = \alpha_2 = \arctan\frac{y_{i+k} - y_i}{x_{i+k} - x_i}$

• calculate the difference between the tilt angles $K_L$ and $K_R$:

$K' = \alpha = K_L - K_R$,

where $K'$ is the inflection value at the point.
Calculating the inflection at a point
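
A Python sketch of the inflection computation at contour point i, following the three steps above (atan2 is used instead of the arctangent of the ratio to avoid division by zero; assumes k <= i < len(contour) - k):

import math

def inflection(contour, i, k):
    # Slopes to the left (K_L) and right (K_R) of P(i), then their difference.
    xi, yi = contour[i]
    xl, yl = contour[i - k]
    xr, yr = contour[i + k]
    k_left = math.atan2(yi - yl, xi - xl)
    k_right = math.atan2(yr - yi, xr - xi)
    return k_left - k_right  # K', the inflection value at P(i)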
Slide 14 of 103
Description of the contours
Natural curve representation
If the contour does not contain branch (connection) points, then it can be represented as a one-dimensional inflection function $K'(l)$.

Calculating the inflection at a point

Inflection function

Slide 15 of 103
Description of the contours
Singular points of contours
• The number and positions of contour singular points (points of maximum inflection,
local extrema of the curvature function, end points, branch points) can be used as
characteristic features.
• First of all, try to select corner points on the contour, because endpoints and branch points are not reliable enough and are highly susceptible to noise.
• A reliable way to identify special points is to search for the extreme values of some contour attribute, for example, the extrema of the curvature function, which requires:
1. Perform piecewise polynomial approximation of the contour;
2. Construct a curvature function;
3. Find all local extrema of curvature.

Slide 16 of 103
Description of the contours
Singular points of contours
Piecewise polynomial approximation of the curve allows more accurate calculation of the first two directional derivatives at the points, and consequently of the curvature value itself.

Local extrema of the curvature function

Slide 17 of 103
Description of areas
Description of selected areas
• Descriptors are feature vectors describing a point's neighborhood.
• Features are built on the basis of information about the intensity, color and texture of special points.
• Each point of interest must be described by a certain set of parameters.

Characteristic point in the image

Slide 18 of 103
Description of areas
Typical feature set for areas (images)
1. Topological features:
• the number of connected components: the number of separate objects in the image;
• the number of holes: whether there are holes inside the object;
• Euler's number (Euler characteristic): the number of objects minus the number of holes.
2. Geometric features (characterize the shape of the image):
• the area of the image $S$, calculated as the number of nonzero elements of the image;
• the position of the center of gravity of the image, calculated in terms of static moments:

$x_c = \frac{\iint_\Omega B(x, y)\, x \, dx\, dy}{\iint_\Omega B(x, y)\, dx\, dy}, \quad y_c = \frac{\iint_\Omega B(x, y)\, y \, dx\, dy}{\iint_\Omega B(x, y)\, dx\, dy}$,

where $\Omega$ is the image in the Cartesian coordinate system $(x, y)$;
$B(x, y)$ is the value of the intensity function at the point $(x, y)$.

Slide 19 of 103
Description of areas
Typical feature set for areas
• position of the center of gravity of the area of a binary image:

$x_c = \frac{\sum_\Omega x}{S}, \quad y_c = \frac{\sum_\Omega y}{S}$

• for a grayscale image:

$x_c = \frac{\sum_\Omega x\, B(x, y)}{\sum_\Omega B(x, y)}, \quad y_c = \frac{\sum_\Omega y\, B(x, y)}{\sum_\Omega B(x, y)}$

• the perimeter of the area is equal to the sum of the moduli of the elementary vectors of the contour connecting neighboring elements (by 8-connectivity):

$P = \sum_{k=1}^{N_1} |P_1| + \sqrt{2} \sum_{k=N_1+1}^{N} |P_2|$,

where $P_1$ and $P_2$ are elementary vectors oriented along the grid and at an angle of 45°, respectively;
• the ratio of the square of the perimeter to the area of the image;
Slide 20 of 103
Description of areas
Typical feature set for areas
• format feature: the ratio of the sides of the circumscribed rectangle.
To calculate the value of the F (format) feature, a scattering matrix is built from the contour points of the image:

$E = \begin{pmatrix} S_{20} & S_{11} \\ S_{11} & S_{02} \end{pmatrix}, \quad S_{pq} = \sum_{(x, y) \in D_\Omega} (x - x_c)^p (y - y_c)^q$,

and the eigenvalues of the scattering matrix are considered:

$\lambda_i = \frac{s_{20} + s_{02}}{2} \pm \sqrt{\frac{(s_{20} - s_{02})^2}{4} + s_{11}^2}$

The eigenvalues satisfy $\lambda_{1,2} \ge 0$, with one of them equal to 0 when the image is a straight line. The format is calculated by the formula ($\lambda_1 \ge \lambda_2$):

$F = \frac{\lambda_1}{\lambda_2}$
Slide 21 of 103
Description of areas
Typical feature set for areas
• compactness is calculated by the formula:

$Z = \frac{S}{S_u}$,

where $S$ is the area of the image and $S_u$ is the area of the circumscribed rectangle oriented as the equivalent ellipse.

• To determine the orientation, the eigenvectors of the scattering matrix are found:

$\begin{pmatrix} S_{20} - \lambda_i & S_{11} \\ S_{11} & S_{02} - \lambda_i \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0$
Slide 22 of 103
Description of areas
Typical feature set for areas
• The magnitude of the projection of the contour point $(x, y)$ of the image onto one of the eigenvectors (for example, $(x_1, y_1)$, corresponding to the eigenvalue $\lambda_1$) is determined by the formula:

$R = \sqrt{x^2 + y^2} \, \sin\!\left( \arctan\frac{y}{x} - \arctan\frac{y_1}{x_1} \right)$

Substituting the values of the eigenvectors, we get:

$R_1 = \frac{1}{\sqrt{x_1^2 + y_1^2}} \left( y - \frac{\lambda_1 - S_{20}}{S_{11}}\, x \right), \quad R_2 = \frac{1}{\sqrt{x_2^2 + y_2^2}} \left( y - \frac{\lambda_2 - S_{02}}{S_{11}}\, x \right)$,

where $R_1$ and $R_2$ are the sides of the circumscribed rectangle oriented along the eigenvectors (the projections of the image onto the eigenvectors).

Slide 23 of 103
Description of areas
Typical feature set for areas
• the perimeter and area of the circumscribed minimum-area rectangle;
• the ratio of the area of the circumscribed rectangle to the area of the image;
• the ratio of the square of the perimeter of the circumscribed rectangle to its area;
• the format of the circumscribed rectangle:

$F_1 = \frac{T_1}{T_2}$,

where $T_1$ and $T_2$ are the sides of the circumscribed rectangle;
• the relative width and height of the image:

$P_3 = \frac{P}{T_1}, \quad P_4 = \frac{P}{T_2}$

Slide 24 of 103
Description of areas
Typical feature set for areas
3. Moments (a computational sketch follows after this list):

$m_{\alpha\beta} = \iint_\Omega B(x, y)\, x^\alpha y^\beta \, dx\, dy$

For the discrete case:

$m_{pq} = \sum_{(x, y) \in \Omega} x^p y^q B(x, y)$

• moments invariant to displacement (central moments): $\mu_{pq} = \sum_{(x, y) \in \Omega} (x - x_C)^p (y - y_C)^q B(x, y)$

• scale-invariant moments: $\eta_{pq} = \frac{\mu_{pq}}{\sum_{i+j=p+q} \mu_{ij}}$

• moments invariant to rotation: $M_1 = \eta_{02} + \eta_{20}$, etc.

4. Textural features
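
A NumPy sketch of raw and central moments for a grayscale image B (the scale normalization shown here is the usual mu00-based one; the slide normalizes by the sum of moments of the same order instead):

import numpy as np

def raw_moment(B, p, q):
    y, x = np.mgrid[:B.shape[0], :B.shape[1]]
    return float(np.sum(x**p * y**q * B))

def central_moment(B, p, q):
    m00 = raw_moment(B, 0, 0)
    xc, yc = raw_moment(B, 1, 0) / m00, raw_moment(B, 0, 1) / m00  # centroid
    y, x = np.mgrid[:B.shape[0], :B.shape[1]]
    return float(np.sum((x - xc)**p * (y - yc)**q * B))

def eta(B, p, q):
    # Scale-invariant moment (mu00-based normalization).
    return central_moment(B, p, q) / central_moment(B, 0, 0) ** (1 + (p + q) / 2)

def M1(B):
    # First rotation-invariant moment: M1 = eta02 + eta20.
    return eta(B, 0, 2) + eta(B, 2, 0)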
Slide 25 of 103
Description of special points
Simple Neighborhood Approach
The simplest approach to identifying features:
• Take square neighborhoods with sides parallel to the rows and columns of the image.
• Pixel intensities serve as the features.
• When comparing points on images, we compare their neighborhoods pixel by pixel, as images.
• Such a neighborhood is invariant only to image shift.

Pixel-by-pixel comparison of the neighborhoods of singular points
Slide 26 of 103
Description of special points
Intensity invariance
To achieve invariance to intensity, the image intensities are normalized as follows:

$I' = \frac{I - \mu}{\sigma}$,

where $I'$ is the normalized intensity,
$\mu$ is the mean value,
$\sigma$ is the standard deviation.

An example of normalizing the intensities of a neighborhood area
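
A one-line NumPy sketch of this normalization for a neighborhood patch (the epsilon term is an added guard against flat patches):

import numpy as np

def normalize_patch(patch, eps=1e-8):
    # I' = (I - mu) / sigma
    return (patch - patch.mean()) / (patch.std() + eps)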

Slide 27 of 103
Description of special points
Simple Neighborhood Approach
Disadvantages:
• The point detector is rotation invariant, but the neighborhood is not;
• Small shifts, i.e., errors in locating a point, make pixel-by-pixel comparison impossible.

Detector invariance and descriptor non-invariance

Slide 28 of 103
Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)
• This descriptor is used with a DoG detector, which determines the position and scale of a feature, and is resistant to lighting changes and small shifts.
• The search for the orientation of a singular point is based on the idea of finding the main direction of the pixel gradients in the vicinity of the point. Algorithm:
1. Calculate the histogram of gradient directions. Each point's contribution is weighted with a Gaussian centered at the singular point.
2. Rotate the fragment so that the dominant direction of the gradient points upwards.
3. If there are several local maxima, we assume that there are several points with different orientations.
David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60(2), pp. 91-110, 2004.
Slide 29 of 103
Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)
For each feature found, we now know the characteristic scale and orientation.
• Let's select the corresponding rectangular neighborhood (Rotation Invariant Frame)
• Bring the neighborhood to a standard size (scale).

Neighborhood features
Slide 30 of 103
Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)

An example of a description of local features

Slide 31 of 103
Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)
Algorithm for constructing a SIFT descriptor:
1. Calculate the direction of the gradient at each pixel;
2. Quantize the gradient orientations into 8 bins (directions);
• Tag each pixel with its bin number;
3. Calculate the histogram of gradient directions;
• For each bin, count the pixels with the same gradient direction;
• Each point's contribution is weighted with a Gaussian centered at the center of the neighborhood.

Gradient Orientation Histogram
Slide 32 of 103


Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)
4. Take local properties into account. Divide the neighborhood into blocks by a grid, and in each block calculate its own gradient histogram.
• Usually a 4×4 grid, each histogram with 8 bins.
• The standard length of a descriptor vector is 128 (4 × 4 × 8).
• Compare as vectors (different metrics).
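
A simplified NumPy sketch of steps 1-4 for a 16×16 patch (the full SIFT descriptor additionally applies Gaussian weighting and trilinear interpolation between bins, which are omitted here):

import numpy as np

def sift_like_descriptor(patch):
    # patch: 16x16 grayscale array, already rotated to the dominant
    # orientation and scaled to the standard size.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    bins = (ang / (2 * np.pi) * 8).astype(int) % 8     # 8 orientation bins
    desc = np.zeros((4, 4, 8))
    for by in range(4):                                # 4x4 grid of blocks
        for bx in range(4):
            b_bins = bins[4*by:4*by+4, 4*bx:4*bx+4]
            b_mag = mag[4*by:4*by+4, 4*bx:4*bx+4]
            for k in range(8):
                desc[by, bx, k] = b_mag[b_bins == k].sum()
    v = desc.ravel()                                   # 4*4*8 = 128 values
    return v / (np.linalg.norm(v) + 1e-8)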

Gradient histograms
Slide 33 of 103


Description of special points
Comparison of SIFT features
A feature vector with a length of 128 is essentially a histogram. The following metrics are used for comparison:
1. Standard metrics $L_1$, $L_2$.
2. Metrics specific to histograms:
• Intersection of histograms:

$D(h_1, h_2) = \sum_{i=1}^{N} \min\big(h_1(i), h_2(i)\big)$

• Chi-square distance $\chi^2$:

$D(h_1, h_2) = \sum_{i=1}^{N} \frac{\big(h_1(i) - h_2(i)\big)^2}{h_1(i) + h_2(i)}$
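
Both metrics in NumPy (a sketch; h1 and h2 are nonnegative histograms of equal length, and the epsilon term guards against empty bins):

import numpy as np

def hist_intersection(h1, h2):
    return float(np.minimum(h1, h2).sum())

def chi_square(h1, h2, eps=1e-8):
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))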

Slide 34 of 103
Description of special points
Various SIFT modifications are used for color images.
RGB-SIFT
Computes 3 SIFT descriptors, one for each channel.
C-SIFT
Uses the opponent channels $O_1$ and $O_2$:

$O_1 = \frac{R - G}{\sqrt{2}}, \quad O_2 = \frac{R + G - 2B}{\sqrt{6}}, \quad O_3 = \frac{R + G + B}{\sqrt{3}}$

rgSIFT
Uses the normalized channels $r$ and $g$:

$r = \frac{R}{R + G + B}, \quad g = \frac{G}{R + G + B}, \quad b = \frac{B}{R + G + B}$

Koen E. A. van de Sande, Theo Gevers and Cees G. M. Snoek, "Evaluating Color Descriptors for Object and Scene Recognition," IEEE PAMI, 2010.
Slide 35 of 103
Description of special points
SIFT Descriptor (Scale-Invariant Feature Transform)

Advantages:
1. The SIFT descriptor is distinctive and resistant to changes in lighting and small shifts.
2. The SIFT scheme (detector, neighborhood selection, descriptor) is a very efficient tool for image analysis.
3. It has become widespread.

Slide 36 of 103
Description of special points
PCA-SIFT descriptor (Principal Components Analysis-SIFT)
• For each feature point, a 41 × 41 neighborhood is considered.
• This gives neighborhood gradient vectors containing 2 × 39 × 39 = 3042 elements.
• The vectors are reduced to 32 elements by means of principal component analysis (PCA).

Y. Ke and R. Sukthankar, "PCA-SIFT: A More Distinctive Representation for Local Image Descriptors," CVPR, 2004.
Slide 37 of 103
Description of special points
GLOH descriptor (Gradient location-orientation histogram)
• A polar grid is used for dividing the neighborhood into bins: 3 radial blocks with
radii of 6, 11 and 15 pixels and 8 sectors.
• The result is a vector containing 272 components, which is projected into a space
of dimension 128 using principal component analysis (PCA).

K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 10, pp. 1615-1630, Oct. 2005.
Slide 38 of 103
Description of special points
DAISY descriptor
• Works on a dense set of pixels throughout the image.
• Runs 66 times faster than SIFT computed on the same dense set of pixels.
• Uses the ideas behind the SIFT and GLOH descriptors.
• As with GLOH, a circular neighborhood of the singular point is selected, but the bins are represented by circles rather than partial sectors.
• For each bin, the same actions are performed as in the SIFT algorithm, but the weighted sum of the gradient magnitudes is replaced by the convolution of the original image with derivatives of the Gaussian filter taken in 8 directions.
• The constructed descriptor retains invariance while solving the matching problem for the case when all pixels are considered special, and requires less computation.
Tola, Engin, Vincent Lepetit, and Pascal Fua. "A fast local descriptor for dense matching." 2008 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2008.
Slide 39 of 103
Description of special points
BRIEF descriptor (Binary Robust Independent Elementary Features)
Scheme for constructing feature vectors:
1. The image is split into patches (separate overlapping areas). Let's say patch $P$ has dimensions $S \times S$ pixels.
2. A set of pairs of pixels $\{(X, Y)\}$, $X, Y = (u, v)$ in the neighborhood, is selected from the patch, for which a set of binary tests is constructed:

$\tau(P, X, Y) = \begin{cases} 1, & I(X) < I(Y) \\ 0, & \text{otherwise} \end{cases}$

where $I(X)$ is the intensity of the pixel $X$.
3. For each patch, a set containing $n_d$ pairs of points $(X_i, Y_i)$ is selected, uniquely defining a set of binary tests.
4. Based on these tests, a binary string is built:

$f_{n_d}(P) = \sum_{i=1}^{n_d} 2^{i-1} \tau(P, X_i, Y_i)$
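
A NumPy sketch of BRIEF extraction and comparison (uniformly random test pairs are used here for simplicity; the paper evaluates several sampling strategies, and coordinates are treated directly as array indices):

import numpy as np

def brief_pairs(S, nd, seed=0):
    # nd random test pairs (X, Y) inside an S x S patch.
    rng = np.random.default_rng(seed)
    return rng.integers(0, S, size=(nd, 2, 2))

def brief_descriptor(patch, pairs):
    # tau(P, X, Y) = 1 if I(X) < I(Y), else 0, for every pair.
    return np.array([patch[xu, xv] < patch[yu, yv]
                     for (xu, xv), (yu, yv) in pairs], dtype=np.uint8)

def hamming(d1, d2):
    # Similarity measure between two binary strings.
    return int(np.count_nonzero(d1 != d2))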
Calonder, Michael, et al. "BRIEF: Computing a local binary descriptor very fast." IEEE Transactions on Pattern Analysis and Machine Intelligence 34.7 (2011): 1281-1298.
Slide 40 of 103
Description of special points
BRIEF descriptor (Binary Robust Independent Elementary Features)
• Provides recognition of the same areas of an image shot from different points of view.
• The recognition algorithm reduces to building a random forest or a naive Bayesian classifier on a certain training set of images and then classifying areas of the test images.
• In a simplified version, the nearest neighbor method can be used to find the most similar patch in the training set.
• A small number of operations is achieved by representing the feature vector as a binary string and, as a consequence, using the Hamming metric as the measure of similarity.

Calonder, Michael, et al. "BRIEF: Computing a local binary descriptor very fast." IEEE Transactions on Pattern Analysis and Machine Intelligence 34.7 (2011): 1281-1298.
Slide 41 of 103
Neighborhood normalization
Perspective distortion

Different fragments fall into the circular neighborhood: in the left image, half of the letter G is inside the circle; in the right, it barely enters it.

Slide 42 of 103
Neighborhood normalization
Perspective distortion
It is necessary to find the appropriate neighborhoods, and describe them with an ellipse,
taking affine transformations into account.

Elliptic neighborhood of the characteristic point


Slide 43 of 103
Neighborhood normalization
Neighborhood normalization
To facilitate comparison of image fragments, it is necessary to find the parameters of the
ellipse around the point of interest or area and bring the ellipses to the "canonical" form –
the "common denominator".

Normalizing ellipsoids

Slide 44 of 103
Neighborhood normalization
Affine adaptation
• The matrix $M$ can be represented as an ellipse whose axis lengths are determined by the eigenvalues and whose orientation is determined by the matrix $R$.
• The main problem is that we compute the matrix $M$ over a round (square) neighborhood. In different images the content will not match, and we will not be able to select the same areas (ellipses).

$E(u, v) \approx (u \ v)\, \mathbf{M} \begin{pmatrix} u \\ v \end{pmatrix}$,

$\mathbf{M} = \sum_{x, y} w(x, y) \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} = R^{-1} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} R$

Slide 45 of 103
Neighborhood normalization
Affine adaptation
Solution: iterative neighborhood adaptation.
In the case of affine distortions, the problem is that the matrix of second moments determined by the weights $w(x, y)$ should be calculated from the characteristic shape of the region.
The iterative refinement algorithm is as follows:
1. Calculate the matrix of moments using a round window.
2. Apply affine adaptation to obtain an elliptical window.
3. Recalculate the moment matrix over the normalized neighborhood. Go to step 1.

Slide 46 of 103
Neighborhood normalization
Affine adaptation

An example of affine adaptation: on the left, scale-invariant regions (blobs); on the right, refined blob neighborhoods.

Slide 47 of 103
Neighborhood normalization
Normalization
Normalize the neighborhood by converting the ellipses to circles of unit radius. Moreover,
the ellipse of the second moments can be considered the "characteristic shape" of the
region.

You can rotate and mirror a unit circle and it will remain a unit circle. This property can be
used to find the desired orientation of the neighborhood. After normalization, calculate
the dominant gradient and rotate the neighborhood.

Slide 48 of 103
Neighborhood normalization
Normalization

Slide 49 of 103
Points matching
Comparison
We have a set of selected singular points and their descriptors.
How to match the same points in different images?

Slide 50 of 103
Points matching
Comparison
To match the points, it is necessary to generate candidate pairs: for each patch in one image, we find the patches in the other image that are most similar in terms of the selected metric.
Methods for selecting pairs of candidate points (see the sketch below):
1. Full search:
• For each feature, we calculate the distances to all features of the second image and take the best one.
2. Accelerated approximate methods:
• Hierarchical structures (kd-trees, vocabulary trees).
• Hashing.
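
A sketch of both strategies (SciPy is assumed available for the kd-tree; descriptors are rows of the arrays desc1 and desc2):

import numpy as np
from scipy.spatial import cKDTree

def match_bruteforce(desc1, desc2):
    # Full search: for each feature of image 1, the L2-nearest in image 2.
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    return d.argmin(axis=1)

def match_kdtree(desc1, desc2):
    # Accelerated search with a hierarchical structure (kd-tree).
    tree = cKDTree(desc2)
    _, idx = tree.query(desc1, k=1)
    return idx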

Slide 51 of 103
Description of objects
Geometric models of structures
Local features belong to specific geometric structures:
1. the corners of the windows lie on straight lines;
2. the edges of the windows lie on straight lines.
Based on these structures, their geometric models can be calculated.

On the left, highlighted features; on the right, geometric models.
Slide 52 of 103
Description of objects
Simple Models - Parametric Curves

On the left are lines, on the right are circles.

Complex model - car

Slide 53 of 103
Description of objects
Parametric curves
$F(x, a) = 0$ is a parametric model,
where $a$ are the parameters of the model,
$x$ is a vector corresponding to a point in space,
$X = \{x_i\}$ is a set of vectors corresponding to points in space.

Straight line: $F(x, a) = a_1 x_1 + a_2 x_2 + a_3 = 0$.
Circle: $F(x, a) = (x_1 - a_1)^2 + (x_2 - a_2)^2 - a_3 = 0$.
Conic: $F(x, a) = a_1 x_1^2 + a_2 x_1 x_2 + a_3 x_2^2 + a_4 x_1 + a_5 x_2 + a_6 = 0$
(a conic section: the intersection of a plane with a circular cone, i.e., a second-order curve).

Slide 54 of 103
Description of objects
Parametric curves
In the case of a two-dimensional plane, $x_1 = x$, $x_2 = y$.

Tasks for estimating the parameters of the model in the image:
• Points satisfying the model are given. It is necessary to calculate the parameters of the model.
• A model is given. Determine which points satisfy it and which do not.
• Points are given, some of which satisfy the model (inliers) and some of which do not (outliers). Calculate the parameters of the model and split the data into inliers and outliers.

Model fitting
Slide 55 of 103
Direct linear transformation
Anomalies are called outliers.
The points that fit the model are inliers.

Slide 56 of 103
Direct linear transformation
Direct linear transformation (DLT)
Problem: a set of points with coordinates $(x_i, y_i)$, $i = 1, \dots, n$, is given on a two-dimensional image. It is necessary to find the line that best approximates them.
Solution:
• using the least squares method, calculate the straight line with the smallest possible sum of squared distances from the points to the line;
• probabilistic formulation: search for the maximum likelihood with respect to distance:

$\hat{l} = \arg\max_l P\big(\{(x_i, y_i)\} \mid l\big)$

Slide 57 of 103
Direct linear transformation
Model of a straight line with Gaussian noise perturbing the points perpendicular to the line:

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} + \varepsilon \begin{pmatrix} a \\ b \end{pmatrix}$,

where the vector $(u, v)^T$ is a point on the line,
$\varepsilon$ is normally distributed Gaussian noise with zero mean and standard deviation $\sigma$,
$(a, b)^T$ is the normal vector.

Slide 58 of 103
Direct linear transformation
Probabilistic approach
• It is necessary to find the line parameters with the maximum likelihood of the points lying on the line (maximum likelihood).

• The likelihood of the points given the parameters $a, b, d$ is calculated as follows:

$P(x_1, \dots, x_n \mid a, b, d) = \prod_{i=1}^{n} P(x_i \mid a, b, d) \propto \prod_{i=1}^{n} e^{-\frac{(a x_i + b y_i - d)^2}{2\sigma^2}}$

• Taking the natural logarithm:

$L(x_1, \dots, x_n \mid a, b, d) = -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (a x_i + b y_i - d)^2$

Slide 59 of 103
Direct linear transformation
Geometrical approach
• Since the distance from the point $(x_i, y_i)$ to the line along the normal is $|a x_i + b y_i - d|$, it is necessary to find parameters $a, b, d$ that minimize the function $E$:

$E = \sum_{i=1}^{n} (a x_i + b y_i - d)^2$

Slide 60 of 103
Direct linear transformation
We differentiate the function $E$ with respect to $d$ and equate to zero:

$\frac{\partial E}{\partial d} = \sum_{i=1}^{n} -2 (a x_i + b y_i - d) = 0$,

and express $d$:

$d = \frac{a}{n} \sum_{i=1}^{n} x_i + \frac{b}{n} \sum_{i=1}^{n} y_i = a\bar{x} + b\bar{y}$

Substitute the resulting expression into the function $E$:

$E = \sum_{i=1}^{n} \big( a (x_i - \bar{x}) + b (y_i - \bar{y}) \big)^2 = \left\| \begin{pmatrix} x_1 - \bar{x} & y_1 - \bar{y} \\ \vdots & \vdots \\ x_n - \bar{x} & y_n - \bar{y} \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \right\|^2 = (AN)^T (AN)$

Slide 61 of 103
Direct linear transformation
Differentiate $(AN)^T (AN)$ with respect to $N$:

$\frac{dE}{dN} = 2 (A^T A) N = 0$

The solution of this matrix equation is the eigenvector of $A^T A$ corresponding to the minimum eigenvalue, subject to $\|N\|^2 = 1$.

The expression $A^T A$ allows us to find the singular values of the matrix $A$.

Slide 62 of 103
Direct linear transformation
SVD procedure
To simplify the search for singular values, consider the Singular Value Decomposition (SVD).

The matrix can be decomposed as

$A = U D V^T$,

where $U$ and $V$ are unitary matrices ($U U^T = I$),
$D$ is a diagonal matrix consisting of the singular values.

The following relation holds:

$A^T A = V D U^T U D V^T = V D D V^T = V D^2 V^T$

Slide 63 of 103
Direct linear transformation
SVD procedure
We use this decomposition to compute the least squares solution. Let the equation

$A p = 0$

be given, where the norm of the vector $p$ is $\|p\| = 1$.

To find the minimum singular value, it is necessary to minimize the norm $\|U D V^T p\|$.

Given the equality on the previous slide:

$\|U D V^T p\| = \|D V^T p\|, \quad \|V^T p\| = \|p\|$

Slide 64 of 103
Direct linear transformation
SVD procedure
Since $\|V^T p\| = 1$, it is necessary to minimize

$\|D V^T p\|$.

We denote $y = V^T p$; then it is necessary to minimize

$\|D y\|$ subject to $\|y\| = 1$,

where the singular values on the diagonal of $D$ are ordered in descending order.

In this case $y = (0, \dots, 0, 1)^T$,

and $p = V y$ is the last column of the matrix $V$.
Slide 65 of 103
Direct linear transformation
We use the OLS (Ordinary Least Squares) and SVD (Singular Value Decomposition) procedures to construct lines.

Let a set of points $(x_i, y_i)$ be given.

To draw a line $ax + by = d$, you must:
1. Calculate the mean point $(\bar{x}, \bar{y})$;
2. Form the matrix $A$ containing the offsets from the mean point;
3. Execute the Singular Value Decomposition (SVD) procedure $A = U D V^T$;
4. Take the parameters $a$ and $b$ from the last column of the matrix $V$;
5. Find $d = a\bar{x} + b\bar{y}$.
This least squares method for finding the parameters of models is called DLT (Direct Linear Transform).
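
The five steps above as a NumPy sketch:

import numpy as np

def fit_line_dlt(points):
    # points: (n, 2) array of (x, y). Returns (a, b, d) with ax + by = d.
    mean = points.mean(axis=0)              # 1. mean point
    A = points - mean                       # 2. offsets from the mean
    _, _, Vt = np.linalg.svd(A)             # 3. SVD: A = U D V^T
    a, b = Vt[-1]                           # 4. last column of V
    d = a * mean[0] + b * mean[1]           # 5. d = a*mean_x + b*mean_y
    return a, b, d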

Slide 66 of 103
Direct linear transformation
DLT problems:
• Often, some of the points obtained are not generated by the model $F(x, a)$.
• In such a situation, the least squares estimate can be arbitrarily far from the true one.
• For example, we have a set of pixels selected by a threshold and build a straight line based on them:

Slide 67 of 103
M-estimators
To reduce the influence of distant points, we parameterize the line in polar coordinates:

$x \cos\theta + y \sin\theta = R$,

then the objective function takes the form:

$(\theta, R) = \arg\min_{\theta, R} \sum_i (x_i \cos\theta + y_i \sin\theta - R)^2$

We denote $\varepsilon_i = x_i \cos\theta + y_i \sin\theta - R$ and modify the objective function:

$(\theta, R) = \arg\min_{\theta, R} \sum_i \rho(\varepsilon_i)$,

where in the case $\rho(\varepsilon) = \varepsilon^2$ we obtain the least squares method.

Slide 68 of 103
M-estimators
The following function is usually minimized:

$\sum_i \rho\big(r_i(x_i, \theta), \sigma\big)$,

where $r_i(x_i, \theta)$ is the residual of the $i$-th point given the model parameters $\theta$,
$\rho$ is a robust function with scale $\sigma$.

The robust function $\rho$ behaves like the squared distance for small values of $u$ and flattens out as the value of $u$ increases.

Slide 69 of 103
M-estimators
The most frequently used variants of the robust function $\rho$ are:
1. Tukey's function:

$\rho(\varepsilon) = \begin{cases} \frac{K^2}{6} \left( 1 - \left( 1 - \left( \frac{\varepsilon}{K} \right)^2 \right)^3 \right), & \text{if } |\varepsilon| \le K \\ \frac{K^2}{6}, & \text{if } |\varepsilon| > K \end{cases}$

2. Cauchy function:

$\rho(\varepsilon) = \frac{c^2}{2} \log\left( 1 + \left( \frac{\varepsilon}{c} \right)^2 \right)$,

where $K$ and $c$ are tuning constants.
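
Both robust functions in NumPy (a sketch, using the tuning constants quoted on the next slide):

import numpy as np

def tukey_rho(e, K=4.685):
    e = np.asarray(e, dtype=float)
    core = (K**2 / 6) * (1 - (1 - (e / K) ** 2) ** 3)
    return np.where(np.abs(e) <= K, core, K**2 / 6)

def cauchy_rho(e, c=2.385):
    e = np.asarray(e, dtype=float)
    return (c**2 / 2) * np.log(1 + (e / c) ** 2)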

Slide 70 of 103
M-estimators

On the left is the Tukey function for $K = 4.685$; on the right is the Cauchy function for $c = 2.385$.

Comparison of the least squares method and a robust estimation modification


Slide 71 of 103
M-estimators
How do we find the minimum of the objective function? With certain robust functions, this is extremely difficult to do.

Methods:
1. Nonlinear optimization methods;
2. Weighted least squares;
3. Iteratively reweighted least squares.

Slide 72 of 103
Weighted Least Squares
Using the example of searching for straight lines:

$(a, b, d) = \arg\min_{a, b:\, a^2 + b^2 = 1} \sum_i w_i (a x_i + b y_i + d)^2, \quad \sum_i w_i = 1$,

where $w_i$ is the weight of each point.

Let's construct a covariance matrix of the points, which is a square, symmetric, nonnegative definite matrix.
• The main diagonal contains the variances of the coordinates of the points.
• The off-diagonal elements are the covariances between the coordinates.

$Cov = \begin{pmatrix} \sum_i w_i (x_i - \bar{x})^2 & \sum_i w_i (x_i - \bar{x})(y_i - \bar{y}) \\ \sum_i w_i (x_i - \bar{x})(y_i - \bar{y}) & \sum_i w_i (y_i - \bar{y})^2 \end{pmatrix}$

Slide 73 of 103
Weighted Least Squares
In the covariance matrix:
• the eigenvector of $Cov$ with the maximum eigenvalue specifies the direction of the straight line,
• the eigenvector with the minimum eigenvalue specifies the direction of the normal $(a, b)$.

The line passes through the weighted mean point $(\bar{x}, \bar{y})$, where

$\bar{x} = \frac{\sum_i w_i x_i}{\sum_i w_i}, \quad \bar{y} = \frac{\sum_i w_i y_i}{\sum_i w_i}, \quad d = -(a\bar{x} + b\bar{y})$

Slide 74 of 103
Iteratively reweighted least squares
1. Get the initial approximation of the model by the least squares method:

$\Theta^{(0)} = \big( \rho^{(0)}, \theta^{(0)} \big)$

2. Set the iteration number $t = 1$.

3. For $\Theta^{(t-1)}$, calculate the current noise estimate

$\sigma^{(t)} = 1.4826 \, \mathrm{median}_i \left| r_i\big(x_i, \Theta^{(t-1)}\big) \right|$

as an unbiased (for a normal distribution) robust estimate of the mean error.
Slide 75 of 103
Iteratively reweighted least squares
4. Calculate the weights of the points $w_i^{(t)}$ taking into account the function $\rho$:
a. in general:

$w_i = \frac{\rho'\!\left( \frac{\varepsilon_i}{\sigma} \right)}{\frac{\varepsilon_i}{\sigma}}$

b. in the case of the Tukey function:

$w\!\left( \frac{\varepsilon}{\sigma} \right) = \begin{cases} \left( 1 - \left( \frac{\varepsilon}{\sigma K} \right)^2 \right)^2, & \text{if } \left| \frac{\varepsilon}{\sigma} \right| \le K \\ 0, & \text{if } \left| \frac{\varepsilon}{\sigma} \right| > K \end{cases}$

c. in the case of the Cauchy function:

$w\!\left( \frac{\varepsilon}{\sigma} \right) = \frac{1}{1 + \left( \frac{\varepsilon}{c \sigma} \right)^2}$
Slide 76 of 103
Iteratively reweighted least squares
5. Using weighted least squares, get $\Theta^{(t)}$.
6. If the desired tolerance is not achieved, i.e.

$\left\| \Theta^{(t)} - \Theta^{(t-1)} \right\| > \varepsilon^*$,

where $\varepsilon^*$ is the maximum desired deviation, then increase $t$ and go to step 3.
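
A sketch of the loop for line fitting with Cauchy weights, using the weighted fit from slides 73-74 (helper names are illustrative):

import numpy as np

def weighted_line_fit(pts, w):
    # Minimum eigenvector of the weighted covariance gives the normal (a, b).
    xc, yc = np.average(pts, axis=0, weights=w)
    C = np.cov((pts - (xc, yc)).T, aweights=w, bias=True)
    eigvals, eigvecs = np.linalg.eigh(C)
    a, b = eigvecs[:, 0]              # eigenvector of the smallest eigenvalue
    return a, b, -(a * xc + b * yc)   # d = -(a*xc + b*yc)

def irls(pts, c=2.385, iters=20):
    w = np.ones(len(pts))
    for _ in range(iters):
        a, b, d = weighted_line_fit(pts, w)
        r = pts @ np.array([a, b]) + d              # residuals a*x + b*y + d
        sigma = 1.4826 * np.median(np.abs(r)) + 1e-12
        w = 1.0 / (1.0 + (r / (c * sigma)) ** 2)    # Cauchy weights
    return a, b, d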

Slide 77 of 103
Iteratively reweighted least squares
Method results

The influence of the tuning constant $c$ on the construction of the line:
the red line is the first step,
green: $c = 1.5$,
blue: $c = 3.5$.

Slide 78 of 103
M-estimators
Disadvantages:

1. The need for a good first approximation;

2. The weights must be computed correctly for the algorithm to work with sufficient accuracy.

Slide 79 of 103
Random Sample Consensus
Random Sample Consensus - RANSAC

The idea: estimate the model not from all the data, but from a small sample that does not contain outliers.

• Since it is not known in advance which points are outliers and which are not, many samples can be constructed at random.
• Then, for each of the samples, we build a hypothesis. After that, we choose the hypothesis that best agrees with all the data.

Slide 80 of 103
Random Sample Consensus
The main problem: the number of such samples is huge, so hypotheses must be built on the minimum sample size.

• For example, when fitting a straight line to a set of points on a plane, this method takes as a basis only the two points necessary to construct a straight line and uses them to build a model.
• After that, it is checked how many points correspond to the model, using an estimation function with a given threshold.

Slide 81 of 103
Random Sample Consensus
Example

The dataset to which the line must be fitted.

Slide 82 of 103
Random Sample Consensus
Example

Two minimal samples (two points each) with cutoff along the proposed line.
On the left image, 11 points fell into the area, on the right - 4.

Slide 83 of 103
Random Sample Consensus
Example

The left sample describes the straight line more adequately (it received more "votes") and is accordingly the correct solution.

Slide 84 of 103
Random Sample Consensus
The basic scheme of the RANSAC algorithm is a loop with N iterations:

1. Build a sample $S \subset X$, $x_i \in X$. Typically, the sample size is the smallest size sufficient to estimate the model parameters.
2. Put forward a hypothesis $\Theta$ based on the sample $S$.
3. Evaluate the degree of agreement between the hypothesis $\Theta$ and the set of input data $X$. Each point is labeled as an "outlier" or an "inlier".
4. After checking all points, it is checked whether the current hypothesis is "the best" so far by comparing it with the previous "best" one. If so, it replaces the previous "best" hypothesis.

At the end of the loop, the last best hypothesis remains, from which the parameters of the model can be determined, as well as the points marked as "outliers" and "inliers".
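
A NumPy sketch of this loop for line fitting (sample size 2; the hypothesis is simply the normalized line through the two sampled points, and T is the agreement threshold):

import numpy as np

def ransac_line(pts, n_iters=100, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, None
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)  # 1. minimal sample
        (x1, y1), (x2, y2) = pts[i], pts[j]
        a, b = y1 - y2, x2 - x1                             # 2. hypothesis: unit normal
        norm = np.hypot(a, b)
        if norm == 0:
            continue
        a, b = a / norm, b / norm
        d = a * x1 + b * y1
        resid = np.abs(pts @ np.array([a, b]) - d)          # 3. agreement with the data
        inliers = resid <= T
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (a, b, d), inliers   # 4. keep the best hypothesis
    return best_model, best_inliers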

Slide 85 of 103
Random Sample Consensus
• To obtain a model built without outliers with a given probability $p$, the number of iterations $N$ of the loop can be calculated if the fraction of outliers $e$ can be specified.
• The number of samples $N$ is chosen so that the probability of choosing at least one sample without outliers is not lower than the given one (for example, 0.99). Thus:

$p = 1 - \big( 1 - (1 - e)^s \big)^N \;\Rightarrow\; N = \frac{\log(1 - p)}{\log\big( 1 - (1 - e)^s \big)}$,

where $N$ is the number of samples (the number of iterations),
$p$ is the probability of getting a good sample in $N$ iterations,
$s$ is the number of elements (points) in the sample,
$e$ is the fraction of outliers.
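
The formula as a small Python function (for example, p = 0.99, e = 0.5, s = 2 gives N = 17):

import math

def ransac_iterations(p=0.99, e=0.5, s=2):
    # N = log(1 - p) / log(1 - (1 - e)^s)
    return math.ceil(math.log(1 - p) / math.log(1 - (1 - e) ** s))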
Slide 86 of 103
Random Sample Consensus
To assess the degree of agreement between hypotheses, consider the functions for evaluating hypotheses:
1. RANSAC

$R(\Theta) = \sum_i p\big( \varepsilon_i^2(\Theta) \big), \quad p\big( \varepsilon_i^2(\Theta) \big) = \begin{cases} 1, & \text{if } \varepsilon_i^2 \le T^2 \\ 0, & \text{if } \varepsilon_i^2 > T^2 \end{cases}, \quad i = 1, \dots, n$

where $\varepsilon_i(\Theta)$ is the residual of the $i$-th point with respect to the estimated hypothesis;
$p$ is the score (1 for an "inlier", 0 for an "outlier");
$T$ is the threshold, chosen so that the probability of an "inlier" is $p \approx 0.95$.
Typically, a Gaussian noise model with zero mean is used, such that $T^2 = 3.84\sigma^2$.

2. LMS (Least Median of Squares)

$R(\Theta) = \mathrm{median}_i \, \varepsilon_i^2(\Theta), \quad i = 1, \dots, n$
Slide 87 of 103
Random Sample Consensus
The number of samples grows rapidly with the sample size and the proportion of outliers.

Dependence of the number of samples on the sample size and the proportion of "outliers"

How can the number of samples be minimized if the fraction of outliers is not known in advance?

Slide 88 of 103
Random Sample Consensus
You can start the algorithm with a rough estimate of the outlier fraction, say 50%, and then refine the number of samples sequentially.
Adaptive termination of the RANSAC algorithm (the slide's pseudocode rewritten as a Python sketch; the model fitting and inlier counting of the basic RANSAC step are elided, as on the slide):

import math

N = float('inf')   # required number of samples, initially unknown
sample_count = 0
p = 0.99           # desired probability of an outlier-free sample
s = 2              # number of points in a sample (e.g. 2 for a line)
while N > sample_count:
    # Basic RANSAC step: build a sample of s points, put forward a
    # hypothesis, and count the inliers among all points -> n_inliers
    e = 1 - n_inliers / n_points        # current outlier-fraction estimate
    N = math.log(1 - p) / math.log(1 - (1 - e) ** s)
    sample_count += 1

Slide 89 of 103
Random Sample Consensus
• One of the disadvantages of the RANSAC method is the uncertainty in the choice of
the threshold.
• Both large and small thresholds lead to incorrect results.
• Let a set of points be given:

Slide 90 of 103
Random Sample Consensus
Threshold selection problem

Large threshold: the correct solution.
Large threshold: an incorrect solution, equivalent to the correct one.

Slide 91 of 103
Random Sample Consensus
Threshold selection problem

Small threshold: an incorrect solution.
The Least Median of Squares (LMS) problem: the median error is the same for both solutions.

Slide 92 of 103
Random Sample Consensus
To assess the degree of agreement between hypotheses, consider the hypothesis
evaluation function:
3. M-SAC (M-estimator Sample Consensus)

$R(\Theta) = \sum_i p\big( \varepsilon_i^2(\Theta) \big), \quad p\big( \varepsilon_i^2(\Theta) \big) = \begin{cases} \varepsilon_i^2, & \text{if } \varepsilon_i^2 \le T^2 \\ T^2, & \text{if } \varepsilon_i^2 > T^2 \end{cases}, \quad i = 1, \dots, n$

which is similar to the RANSAC function except for the modification of the scoring function.

This method gives a more accurate estimate without increasing computational complexity
and guarantees the correct solution.

Slide 93 of 103
Random Sample Consensus
An example of using RANSAC: matching the same feature points.

When matching singular points by descriptors, quite a lot of false pairs will be detected.

Slide 94 of 103
Random Sample Consensus
Matching algorithm using RANSAC:

1. Pairs of points $(x, x')$ are given on the images $I$ and $I'$.

2. Calculate the transformation model $T$ between images $I$ and $I'$ from the key points.
• Use the RANSAC scheme to build the model $T$ on the sets of points.
3. Filter the outliers in $(x, x')$.
4. Refine the model using the remaining points:
• either by the iterative least squares method,
• or by nonlinear minimization.

Slide 95 of 103
Random Sample Consensus

500 feature points were found in the two images.
Of these, 117 outliers and 268 matches.
After filtering, 151 good matches were selected.

Slide 96 of 103
Random Sample Consensus
Use case: building panoramas from a set of photos.

Slide 97 of 103
Random Sample Consensus
Use case: building a panorama from an unordered set of photos; it is necessary to determine which of them belong to one panorama and which to another.

Slide 98 of 103
Random Sample Consensus
RANSAC advantages:

• A simple and general method applicable to a variety of tasks;


• Works well in practice;
• Able to give a reliable estimate of the model parameters with high accuracy, even if there is a significant number of outliers in the original dataset.

Slide 99 of 103
Random Sample Consensus
RANSAC disadvantages:

• Many customizable parameters;


• It is not always possible to estimate the parameters well from the minimum sample;
• Sometimes it takes too many iterations;
• Does not work at a very high outlier ratio;
• The method samples points from a uniform probability distribution;
• There is no upper bound on the time required to calculate the parameters of the model;
• The RANSAC method can define only one model for a given dataset. As with any single-model approach, there is the following problem: when there are two (or more) models in the source data, RANSAC may fail to find either.

Slide 100 of 103


Test
Lecture 5-6 Test

Please scan the code to start the test


Slide 102 of 103
THANK YOU
FOR YOUR TIME!

Andrei Zhdanov
[email protected]
