Unit-3
The output value of a pixel from the histogram equalization operation is then proportional to the CDF of the image evaluated at that pixel's original intensity. Once the original histogram is known, the pixel values are remapped according to their original probabilities so that the intensities are spread more evenly across the histogram.
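As a minimal sketch, assuming an 8-bit grayscale image stored as a NumPy array, the equalization mapping can be built directly from the CDF:

```python
import numpy as np

def equalize_histogram(img):
    """Map each pixel through the normalized CDF of its intensity (8-bit image assumed)."""
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum() / hist.sum()             # cumulative distribution function in [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)   # output level = (L - 1) * CDF(input level)
    return lut[img]                              # apply the lookup table to every pixel
```

OpenCV provides the same operation for 8-bit single-channel images as cv2.equalizeHist.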
3.2 Canny edge detection
Canny edge detection is a popular and widely used edge detection technique that aims to identify and extract
the edges of objects within an image. It was developed by John F. Canny in 1986 and has since become a
fundamental tool in computer vision and image analysis. Edge detection is a technique used in image
processing to find the boundaries of the objects within the image. An edge is defined as a sudden change in
pixel intensity within an image. Edges represent the boundaries between distinct objects or regions with
varying intensity levels.
The Canny edge detection algorithm is a multistage process that helps to identify the edges in an image by
reducing noise and preserving important edge features.
Step 1: Grayscale conversion. Grayscale images have a single channel representing the intensity of each
pixel, which simplifies the edge detection process and reduces computational complexity. Grayscale
conversion removes the color information from the image while preserving the relative brightness levels.
Step 2: Noise reduction. The next step is to apply a Gaussian filter to the input image. The Gaussian filter is a smoothing
operation that helps to reduce noise in the image. Noise can introduce false edges, which could compromise
the accuracy of the edge detection process.
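A short sketch of these first two steps with OpenCV (the file name, kernel size, and sigma are illustrative):

```python
import cv2

img = cv2.imread("input.jpg")                   # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # Step 1: single-channel intensity image
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)   # Step 2: 5x5 Gaussian kernel, sigma = 1.4
```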
Step 3: Gradient calculation. The gradient measures how fast the intensity changes at each pixel’s location.
The algorithm uses the concept of derivatives, typically the Sobel operator, to determine both the gradient
magnitude and orientation for each pixel. The gradient magnitude indicates the strength of the intensity
change, while the gradient orientation specifies the direction of the steepest change.
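A sketch of the gradient computation using the Sobel operator (file name and kernel size are illustrative):

```python
import numpy as np
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input image
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)          # smoothed image from Step 2
gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)    # horizontal derivative
gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=3)    # vertical derivative
magnitude = np.hypot(gx, gy)                          # strength of the intensity change
orientation = np.arctan2(gy, gx)                      # direction of the steepest change (radians)
```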
Step 4: Non-maximum suppression. This step effectively thins out the edges and produces a cleaner
representation of the actual edges in the image. It works by examining each pixel's gradient magnitude and
orientation and comparing it with the neighbouring pixels along the gradient direction. If the central pixel's
gradient magnitude is the largest among its neighbours, it means that this pixel is likely part of an edge, and
we keep it. If not, we suppress it by setting its intensity to zero and removing it from consideration as an
edge pixel.
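A simple, unoptimized sketch of non-maximum suppression, with the gradient direction quantized to one of four neighbour pairs (this loop-based version is for illustration only):

```python
import numpy as np

def non_max_suppression(magnitude, orientation):
    """Keep a pixel only if it is the local maximum along its gradient direction."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    angle = np.rad2deg(orientation) % 180              # fold directions into [0, 180)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:                 # ~0 degrees: compare left/right
                p, q = magnitude[i, j - 1], magnitude[i, j + 1]
            elif a < 67.5:                             # ~45 degrees: compare one diagonal pair
                p, q = magnitude[i - 1, j + 1], magnitude[i + 1, j - 1]
            elif a < 112.5:                            # ~90 degrees: compare up/down
                p, q = magnitude[i - 1, j], magnitude[i + 1, j]
            else:                                      # ~135 degrees: compare the other diagonal pair
                p, q = magnitude[i - 1, j - 1], magnitude[i + 1, j + 1]
            if magnitude[i, j] >= p and magnitude[i, j] >= q:
                out[i, j] = magnitude[i, j]            # local maximum: keep it
    return out                                         # suppressed pixels stay at zero
```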
Step 5: Double thresholding. The next step involves double thresholding to categorize edges into three
categories: strong edges, weak edges, and non-edges.
A high threshold and a low threshold are used for this purpose.
Pixels with gradient magnitudes above the high threshold are considered strong edges, indicating significant
intensity changes. Pixels with gradient magnitudes between the low and high thresholds are classified as weak edges. These weak edges may represent real edges or noise and need further verification.
Pixels with gradient magnitudes below the low threshold are considered non-edges and are discarded.
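A small sketch of this classification, taking the non-maximum-suppressed magnitude image as input (the threshold ratios are illustrative assumptions):

```python
import numpy as np

def double_threshold(suppressed, low_ratio=0.1, high_ratio=0.2):
    """Classify NMS output into strong edges, weak edges, and non-edges."""
    high = high_ratio * suppressed.max()
    low = low_ratio * suppressed.max()
    strong = suppressed >= high               # definite edges
    weak = (suppressed >= low) & ~strong      # kept only if later linked to a strong edge
    return strong, weak                       # everything else is discarded as a non-edge
```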
Step 6: Edge tracking by hysteresis. Hysteresis means 'remembering the past' to make our edges more
accurate and reliable. This step aims to link weak edges that are likely part of real edges to the strong edges.
Starting from each strong edge pixel, the algorithm traces the edge by considering its neighbouring weak
edge pixels that are connected. If a weak edge pixel is connected to a strong edge pixel, it is also considered
part of the edge and retained. This process continues until no more weak edges are connected. This ensures
that the edges are continuous and well-defined.
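As a minimal sketch, the whole six-step pipeline is available as a single OpenCV call (the file name and threshold values are illustrative):

```python
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
edges = cv2.Canny(img, 100, 200)                       # low threshold = 100, high threshold = 200
cv2.imwrite("edges.png", edges)                        # binary edge map
```

The two numeric arguments are the low and high hysteresis thresholds described in steps 5 and 6.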
3.3.1 LoG Filter Function
The LoG filter relies on zero crossings. A zero crossing is a point where the sign of a mathematical function changes in the graph of the function. In image processing, edge detection with the Laplacian filter works by marking the points where the filter response crosses zero as potential edge points. This method works well for finding edges in all directions, but it performs poorly when the image is noisy. So we usually smooth the image with a Gaussian filter before applying the Laplacian; the combination is often termed the Laplacian of Gaussian (LoG) filter, since the Gaussian and Laplacian operations can be combined into a single kernel.
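Assuming a zero-mean Gaussian with standard deviation σ, a commonly quoted closed form of this combined kernel is:

\[
\mathrm{LoG}(x, y) = -\frac{1}{\pi\sigma^{4}}\left[1 - \frac{x^{2}+y^{2}}{2\sigma^{2}}\right]e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}
\]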
3.4 Hough transform
The Hough Transform calculates the matching curves in parameter space for each edge point in the image. This is done by iterating over all possible parameter values and recording, for each edge point, the parameter combinations whose curve passes through that point. These "votes", or intersections in parameter space, are accumulated in an accumulator array.
Finally, the algorithm finds peaks in the accumulator array; the peaks correspond to the parameters of the detected shapes, indicating whether the image contains lines, circles, or other shapes.
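As an illustrative sketch, line detection with the probabilistic Hough transform in OpenCV looks like this (the file name and parameter values are assumptions):

```python
import numpy as np
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
edges = cv2.Canny(img, 100, 200)                       # the Hough transform operates on an edge map
# rho resolution 1 px, theta resolution 1 degree, accumulator threshold 100 votes
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100,
                        minLineLength=50, maxLineGap=10)
```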
3.5 Harris corner detection
A corner is a point whose local neighbourhood is characterized by large intensity variation in all directions.
Corners are important features in computer vision because they are points that remain stable under changes of viewpoint and illumination.
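A minimal OpenCV sketch of Harris corner detection (the block size, Sobel aperture, and k value are illustrative):

```python
import numpy as np
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical input image
response = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)
corners = response > 0.01 * response.max()               # threshold the corner response map
```

Here k is the Harris detector free parameter, and thresholding the response map picks out the corner pixels.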
3.6 An Image Pyramid
An Image Pyramid is a multiresolution representation of an image, which is a hierarchy of images with
different resolutions. An image pyramid can be constructed by repeatedly down sampling (or up sampling)
an image and creating a set of images at different resolutions. The resulting images are referred to as "levels" of the pyramid, with the highest resolution image at the bottom (the base of the pyramid) and the lowest resolution image at the top (the apex).
Gaussian Pyramids: This type of pyramid is constructed by repeatedly applying a Gaussian blur filter to an image and down-sampling it by a factor of two. The resulting images are smoother and have lower resolution than the original image because Gaussians are low-pass filters. Gaussian pyramids are used, for example, for safe down-sampling and for coarse-to-fine (multi-scale) processing.
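A short sketch of building a Gaussian pyramid by repeated blurring and 2x down-sampling with OpenCV (the file name and number of levels are illustrative):

```python
import cv2

img = cv2.imread("input.jpg")          # hypothetical input image, used as the pyramid base
pyramid = [img]
for _ in range(4):                     # four additional, progressively smaller levels
    img = cv2.pyrDown(img)             # Gaussian blur, then drop every other row and column
    pyramid.append(img)
```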
3.7 Matching
Applications of Feature Detection and Matching:
• Automate object tracking
• Point matching for computing disparity
• Stereo calibration (estimation of the fundamental matrix)
• Motion-based segmentation
• Recognition
• 3D object reconstruction
• Robot navigation
• Image retrieval and indexing
3.8 Feature
A feature is a piece of information which is relevant for solving the computational task related to a certain
application. Features may be specific structures in the image such as points, edges or objects. Features may
also be the result of a general neighbourhood operation or feature detection applied to the image. The features
can be classified into two main categories:
The features that are in specific locations of the images, such as mountain peaks, building corners, doorways,
or interestingly shaped patches of snow. These kinds of localized features are often called keypoint
features (or even corners) and are often described by the appearance of patches of pixels surrounding the
point location.
The features that can be matched based on their orientation and local appearance (edge profiles) are
called edges and they can also be good indicators of object boundaries and occlusion events in the image
sequence.
Main Components of Feature Detection and Matching are:
Detection: Interest points (keypoints) are identified at distinctive, repeatable locations in each image.
Description: The local appearance around each feature point is described in some way that is (ideally) invariant under changes in illumination, translation, scale, and in-plane rotation. We typically end up with a descriptor vector for each feature point.
Matching: Descriptors are compared across the images to identify similar features. For two images we may get a set of pairs (Xi, Yi) ↔ (Xi', Yi'), where (Xi, Yi) is a feature in one image and (Xi', Yi') its matching feature in the other image.
Feature Descriptor
A feature descriptor is an algorithm which takes an image and outputs feature descriptors/ feature vectors.
Feature descriptors encode interesting information into a series of numbers and act as a sort of numerical
“fingerprint” that can be used to differentiate one feature from another.
Ideally, this information would be invariant under image transformations, so we can find the feature again even if the image is transformed in some way. After detecting interest points, we go on to compute a descriptor for each of them. Descriptors can be categorized into two classes:
Local Descriptor: A local descriptor is a compact representation of a point's local neighbourhood. Local descriptors capture shape and appearance only in a local neighbourhood around a point and are therefore well suited for matching.
Global Descriptor: A global descriptor describes the whole image. Global descriptors are generally not very robust, because a change in one part of the image alters the resulting descriptor and can cause matching to fail.
Algorithms:
• SIFT (Scale-Invariant Feature Transform)
• SURF (Speeded Up Robust Features)
• BRISK (Binary Robust Invariant Scalable Keypoints)
• BRIEF (Binary Robust Independent Elementary Features)
• ORB (Oriented FAST and Rotated BRIEF)
Generally, the performance of matching methods based on interest points depends on both the properties of
the underlying interest points and the choice of associated image descriptors. Thus, detectors and descriptors appropriate for the image content should be used in applications. For instance, if an image contains bacteria cells,
the blob detector should be used rather than the corner detector. But, if the image is an aerial view of a city,
the corner detector is suitable to find man-made structures. Furthermore, selecting a detector and a descriptor
that addresses the image degradation is very important.
Given a feature in I1, how do we find the best match in I2?
1. Define a distance function that compares two descriptors.
2. Test all the features in I2 and find the one with the minimum distance.
A simple way to define the difference between two features f1, f2 is the L2 distance, || f1 - f2 ||. However, it can give small distances for ambiguous (incorrect) matches.
The basic idea of feature matching is to calculate the sum of squared differences (SSD) between two feature descriptors; a feature is then matched to the feature with the minimum SSD value.
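A minimal NumPy sketch of this nearest-neighbour SSD matching, assuming each descriptor set is a 2-D array with one row per feature:

```python
import numpy as np

def match_ssd(desc1, desc2):
    """For each row of desc1, return the index of the desc2 row with minimum SSD."""
    # pairwise sum of squared differences, shape (len(desc1), len(desc2))
    ssd = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
    return ssd.argmin(axis=1)
```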
Algorithms:
Brute-Force Matcher: in a brute-force matcher we match the descriptor of every feature in one image against the descriptors of all features in the other image. This exhaustive comparison is computationally expensive; it is guaranteed to find the descriptor with the smallest distance, but the smallest-distance match is not guaranteed to be a correct correspondence.
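As an illustrative sketch, OpenCV exposes this exhaustive comparison as cv2.BFMatcher (the file names and the choice of ORB descriptors are assumptions, not part of the text above):

```python
import cv2

img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical image pair
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()                                   # ORB produces binary descriptors
kp1, desc1 = orb.detectAndCompute(img1, None)
kp2, desc2 = orb.detectAndCompute(img2, None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)    # Hamming distance suits binary descriptors
matches = sorted(bf.match(desc1, desc2), key=lambda m: m.distance)  # best matches first
```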
-----------
SIFT stands for Scale-Invariant Feature Transform and was first presented in 2004 by D. Lowe of the University of British Columbia. SIFT is invariant to image scale and rotation. The algorithm was patented, so older OpenCV releases included it in the non-free module; the patent has since expired.
Locality: features are local, so they are robust to occlusion and clutter (no prior segmentation is required).
Extensibility: the approach can easily be extended to a wide range of different feature types, with each adding robustness.
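A minimal usage sketch (the file name is illustrative; in OpenCV 4.4 and later SIFT is available in the main module as cv2.SIFT_create):

```python
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)        # hypothetical input image
sift = cv2.SIFT_create()                                    # cv2.xfeatures2d.SIFT_create() in older builds
keypoints, descriptors = sift.detectAndCompute(img, None)   # one 128-dimensional descriptor per keypoint
```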