
UNIT-3

3.1 Histogram Equalization

Histogram equalization improves contrast by spreading an image's pixel intensities over the full available range. The output value of each pixel is proportional to the cumulative distribution function (CDF) of the image evaluated at that pixel's original intensity: once the original histogram is known, the pixel values are remapped based on their original probabilities and spread across the histogram.
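A minimal sketch of this remapping in Python with OpenCV and NumPy, assuming an 8-bit grayscale input (the file name "input.png" is only illustrative):

import cv2
import numpy as np

# Load an 8-bit grayscale image (file name is illustrative).
img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Manual equalization: map each intensity through the normalized CDF.
hist = np.bincount(img.ravel(), minlength=256)      # original histogram
cdf = hist.cumsum() / hist.sum()                    # normalized CDF in [0, 1]
lut = np.round(255 * cdf).astype(np.uint8)          # lookup table: s_k = 255 * CDF(r_k)
equalized = lut[img]                                # remap every pixel

# OpenCV provides the same operation directly:
equalized_cv = cv2.equalizeHist(img)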
3.2 Canny edge detection
Canny edge detection is a popular and widely used edge detection technique that aims to identify and extract
the edges of objects within an image. It was developed by John F. Canny in 1986 and has since become a
fundamental tool in computer vision and image analysis. Edge detection is a technique used in image
processing to find the boundaries of the objects within the image. An edge is defined as a sudden change in
pixel intensity within an image. Edges represent the boundaries between distinct objects or regions with
varying intensity levels.

The Canny edge detection algorithm is a multistage process that helps to identify the edges in an image by
reducing noise and preserving important edge features.

Step 1: Grayscale conversion. Grayscale images have a single channel representing the intensity of each pixel, which simplifies the edge detection process and reduces computational complexity. Grayscale conversion removes the color information from the image while preserving the relative brightness levels.

Step 2: Noise reduction. A Gaussian filter is applied to the input image. The Gaussian filter is a smoothing operation that helps to reduce noise in the image. Noise can introduce false edges, which could compromise the accuracy of the edge detection process.
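A minimal sketch of these first two steps with OpenCV in Python (the file name and the kernel size/sigma are illustrative choices, not values prescribed by the algorithm):

import cv2

img = cv2.imread("scene.jpg")                    # illustrative file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # Step 1: grayscale conversion
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)    # Step 2: Gaussian smoothing (5x5 kernel, sigma = 1.4)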

Step 3: Gradient calculation. The gradient measures how fast the intensity changes at each pixel’s location.
The algorithm uses the concept of derivatives, typically the Sobel operator, to determine both the gradient
magnitude and orientation for each pixel. The gradient magnitude indicates the strength of the intensity
change, while the gradient orientation specifies the direction of the steepest change.
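A sketch of the gradient computation with Sobel derivatives, continuing from the blurred image in the previous sketch:

import cv2
import numpy as np

gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)   # horizontal derivative
gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=3)   # vertical derivative
magnitude = np.hypot(gx, gy)                         # strength of the intensity change
orientation = np.degrees(np.arctan2(gy, gx))         # direction of the steepest change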

Step 4: Non-maximum suppression. This step effectively thins out the edges and produces a cleaner
representation of the actual edges in the image. It works by examining each pixel's gradient magnitude and
orientation and comparing it with the neighbouring pixels along the gradient direction. If the central pixel's
gradient magnitude is the largest among its neighbours, it means that this pixel is likely part of an edge, and
we keep it. If not, we suppress it by setting its intensity to zero and removing it from consideration as an
edge pixel.
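A simplified, loop-based sketch of non-maximum suppression with the gradient direction quantized into four cases (illustrative only, not an optimized implementation):

import numpy as np

def non_max_suppression(magnitude, orientation):
    # Keep a pixel only if its magnitude is the largest along its gradient direction.
    H, W = magnitude.shape
    out = np.zeros_like(magnitude)
    angle = orientation % 180                     # fold directions into [0, 180)
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:            # roughly horizontal gradient
                n1, n2 = magnitude[i, j - 1], magnitude[i, j + 1]
            elif a < 67.5:                        # roughly 45 degrees
                n1, n2 = magnitude[i - 1, j + 1], magnitude[i + 1, j - 1]
            elif a < 112.5:                       # roughly vertical gradient
                n1, n2 = magnitude[i - 1, j], magnitude[i + 1, j]
            else:                                 # roughly 135 degrees
                n1, n2 = magnitude[i - 1, j - 1], magnitude[i + 1, j + 1]
            if magnitude[i, j] >= n1 and magnitude[i, j] >= n2:
                out[i, j] = magnitude[i, j]       # local maximum along the gradient: keep
    return out

suppressed = non_max_suppression(magnitude, orientation)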

Step 5: Double thresholding. The next step involves double thresholding to categorize edges into three
categories: strong edges, weak edges, and non-edges.

A high threshold and a low threshold are used for this purpose.

Pixels with gradient magnitudes above the high threshold are considered strong edges, indicating significant intensity changes. Pixels with gradient magnitudes between the low and high thresholds are classified as weak edges; these may represent real edges or noise and need further verification. Pixels with gradient magnitudes below the low threshold are considered non-edges and are discarded.
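A sketch of the double thresholding step using NumPy boolean masks, where suppressed is the output of non-maximum suppression (the threshold values are illustrative and would normally be tuned per image):

import numpy as np

high_thresh = 0.2 * suppressed.max()    # illustrative: 20% of the peak gradient magnitude
low_thresh = 0.5 * high_thresh          # illustrative: half of the high threshold

strong = suppressed >= high_thresh                               # strong edges
weak = (suppressed >= low_thresh) & (suppressed < high_thresh)   # weak edges needing verification
# Everything below low_thresh is treated as a non-edge and discarded.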

Step 6: Edge tracking by hysteresis. Hysteresis means 'remembering the past' to make our edges more
accurate and reliable. This step aims to link weak edges that are likely part of real edges to the strong edges.
Starting from each strong edge pixel, the algorithm traces the edge by considering its neighbouring weak
edge pixels that are connected. If a weak edge pixel is connected to a strong edge pixel, it is also considered
part of the edge and retained. This process continues until no more weak edges are connected. This ensures
that the edges are continuous and well-defined.
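In practice, OpenCV's cv2.Canny performs the gradient computation, non-maximum suppression, double thresholding and hysteresis in a single call; a minimal usage sketch with an illustrative file name and thresholds:

import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)        # illustrative file name
blurred = cv2.GaussianBlur(img, (5, 5), 1.4)               # smooth first to suppress noise
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)  # low and high hysteresis thresholds
cv2.imwrite("edges.png", edges)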
3.3.1 LoG Filter
The LoG filter uses the zero-crossing property. A zero crossing is a point where the sign of a mathematical function changes in its graph. In image processing, edge detection with the Laplacian filter works by marking the points where the filter response crosses zero as potential edge points. This method finds edges in all directions, but it performs poorly when the image contains noise. We therefore usually smooth the image with a Gaussian filter before applying the Laplacian filter; the combination is often termed the Laplacian of Gaussian (LoG) filter. The Gaussian and Laplacian operations can be combined into a single filter, whose mathematical representation is as follows:
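LoG(x, y) = −(1 / (π σ⁴)) · [1 − (x² + y²) / (2σ²)] · exp(−(x² + y²) / (2σ²))

where σ is the standard deviation of the Gaussian. A minimal sketch of the smooth-then-Laplacian approach with OpenCV (the file name, kernel size and σ are illustrative choices):

import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)     # illustrative file name
smoothed = cv2.GaussianBlur(img, (5, 5), sigmaX=1.5)    # suppress noise first
log_response = cv2.Laplacian(smoothed, cv2.CV_64F)      # second-derivative response
# Edge candidates lie where log_response changes sign (zero crossings).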

3.3.2 Difference of Gaussian (DoG)


The DoG performs edge detection by first applying a Gaussian blur to an image at a specified sigma (standard deviation), producing a blurred version of the source image. A second blur is then performed with a smaller sigma, which blurs the image less than the first. The final image is calculated by taking the difference between the two blurred images at each pixel and detecting where the values cross zero, i.e. where negative becomes positive and vice versa. The resulting zero crossings are concentrated at edges or in areas whose pixels show some variation in their surrounding neighbourhood. DoG is simpler to compute than LoG and is a good approximation of it.
Both filters also help in finding blobs (BLOBs), which are used in feature descriptors.
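A minimal DoG sketch using two Gaussian blurs with different sigmas (the sigma values and file name are illustrative):

import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE).astype("float32")
blur_wide = cv2.GaussianBlur(img, (0, 0), sigmaX=2.0)    # stronger blur (larger sigma)
blur_narrow = cv2.GaussianBlur(img, (0, 0), sigmaX=1.0)  # weaker blur (smaller sigma)
dog = blur_narrow - blur_wide                            # difference of Gaussians
# Zero crossings of dog mark edge / blob candidates.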

3.4 Hough Transform


The Hough Transform is a feature extraction method used to find basic shapes in an image, such as lines, circles, and ellipses. Fundamentally, it transfers the representation of these shapes from the spatial domain to a parameter space, which allows effective detection even in the presence of distortions such as noise or occlusion. The Hough Transform first creates an accumulator array, also referred to as the parameter space or Hough space. This space represents the possible parameter values of the shapes being detected. In the line detection case, for instance, the parameters could be the slope (m) and y-intercept (b) of a line; in practice the polar form ρ = x·cos θ + y·sin θ is usually used, since it avoids infinite slopes for vertical lines.

For each edge point in the image, the Hough Transform computes the corresponding curve in the parameter space. This is done by iterating over the possible parameter values and recording, in the accumulator array, a "vote" for every combination of parameters whose curve passes through that point.

In the end, the program finds peaks in the accumulator array; these peaks correspond to the parameters of the detected shapes and indicate whether the image contains lines, circles, or other shapes.
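A minimal line detection sketch with OpenCV's Hough transform, run on a Canny edge map (the file name, resolutions and accumulator threshold are illustrative):

import cv2
import numpy as np

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # illustrative file name
edges = cv2.Canny(img, 50, 150)                       # the Hough transform works on an edge map

# rho resolution = 1 pixel, theta resolution = 1 degree, accumulator threshold = 150 votes
lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)
if lines is not None:
    for rho, theta in lines[:, 0]:
        print(f"line: rho = {rho:.1f}, theta = {np.degrees(theta):.1f} deg")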
3.5 Harris corner detection
A corner is a point whose local neighbourhood is characterized by large intensity variation in all directions. Corners are important features in computer vision because they are stable under changes of viewpoint and illumination.
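The Harris detector builds a matrix of local image gradients around each pixel and scores corners with the response R = det(M) − k·(trace M)², where M is that gradient (structure) matrix. A minimal usage sketch with OpenCV (the file name, block size, Sobel aperture and k are illustrative values):

import cv2
import numpy as np

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # illustrative file name
gray = np.float32(img)

# blockSize = neighbourhood size, ksize = Sobel aperture, k = Harris free parameter
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

# Keep only strong responses, e.g. above 1% of the maximum (illustrative threshold).
corners = np.argwhere(response > 0.01 * response.max())
print(f"found {len(corners)} corner pixels")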
3.6 An Image Pyramid
An Image Pyramid is a multiresolution representation of an image: a hierarchy of images at different resolutions. An image pyramid can be constructed by repeatedly downsampling (or upsampling) an image, creating a set of images at different resolutions. The resulting images are referred to as "levels" of the pyramid, with the highest-resolution image at the base and the lowest-resolution image at the top.

Gaussian Pyramids: This type of pyramid is constructed by repeatedly applying a Gaussian blur filter to an image and downsampling it by a factor of two; a construction sketch follows the list below. The resulting images are smoother and lower in resolution than the original image, because Gaussians are low-pass filters. Gaussian pyramids are used for:

• up- or down-sampling images
• multi-resolution image analysis
• looking for an object over various spatial scales
• coarse-to-fine image processing: form a blur estimate or perform motion analysis on a very low-resolution image, then upsample and repeat. This is often a successful strategy for avoiding local minima in complicated estimation tasks.
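A minimal sketch of building a Gaussian pyramid with OpenCV's pyrDown, which blurs and then downsamples by a factor of two at each level (the file name and number of levels are illustrative):

import cv2

img = cv2.imread("scene.jpg")            # illustrative file name
pyramid = [img]                          # level 0: the original image (base of the pyramid)
for level in range(1, 4):                # build three coarser levels (illustrative)
    img = cv2.pyrDown(img)               # Gaussian blur + downsample by 2
    pyramid.append(img)

for level, im in enumerate(pyramid):
    print(f"level {level}: {im.shape[1]}x{im.shape[0]}")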

3.7 Matching
Applications of Feature Detection and Matching:
• Automated object tracking
• Point matching for computing disparity
• Stereo calibration (estimation of the fundamental matrix)
• Motion-based segmentation
• Recognition
• 3D object reconstruction
• Robot navigation
• Image retrieval and indexing

3.8 Feature
A feature is a piece of information which is relevant for solving the computational task related to a certain
application. Features may be specific structures in the image such as points, edges or objects. Features may
also be the result of a general neighbourhood operation or feature detection applied to the image. The features
can be classified into two main categories:

The features that are in specific locations of the images, such as mountain peaks, building corners, doorways,
or interestingly shaped patches of snow. These kinds of localized features are often called keypoint
features (or even corners) and are often described by the appearance of patches of pixels surrounding the
point location.

The features that can be matched based on their orientation and local appearance (edge profiles) are
called edges and they can also be good indicators of object boundaries and occlusion events in the image
sequence.
Main Components of Feature Detection and Matching are:

Detection: Identify the interest points.

Description: The local appearance around each feature point is described in some way that is (ideally)
invariant under changes in illumination, translation, scale, and in-plane rotation. We typically end up with a
descriptor vector for each feature point.

Matching: Descriptors are compared across the images to identify similar features. For two images we may get a set of pairs (Xi, Yi) ↔ (Xi′, Yi′), where (Xi, Yi) is a feature in one image and (Xi′, Yi′) is its matching feature in the other image.

Feature Descriptor

A feature descriptor is an algorithm which takes an image and outputs feature descriptors (feature vectors). Feature descriptors encode interesting information into a series of numbers and act as a sort of numerical "fingerprint" that can be used to differentiate one feature from another.

Ideally, this information is invariant under image transformations, so we can find the feature again even if the image is transformed in some way. After detecting interest points, we compute a descriptor for each of them. Descriptors can be categorized into two classes:

Local Descriptor: A compact representation of a point's local neighbourhood. Local descriptors capture shape and appearance only in a local neighbourhood around a point and are therefore very suitable for matching.
Global Descriptor: A global descriptor describes the whole image. Global descriptors are generally not very robust, because a change in one part of the image affects the resulting descriptor and may cause matching to fail.

Algorithms:
• SIFT (Scale-Invariant Feature Transform)
• SURF (Speeded-Up Robust Features)
• BRISK (Binary Robust Invariant Scalable Keypoints)
• BRIEF (Binary Robust Independent Elementary Features)
• ORB (Oriented FAST and Rotated BRIEF)

3.9 Features Matching


Feature matching, or more generally image matching, is a part of many computer vision applications such as image registration, camera calibration and object recognition, and is the task of establishing correspondences between two images of the same scene or object. A common approach to image matching consists of detecting a set of interest points, each associated with an image descriptor computed from the image data. Once the features and their descriptors have been extracted from two or more images, the next step is to establish some preliminary feature matches between these images.

Generally, the performance of matching methods based on interest points depends both on the properties of the underlying interest points and on the choice of associated image descriptors. Detectors and descriptors appropriate for the image content should therefore be used. For instance, if an image contains bacteria cells, a blob detector should be used rather than a corner detector; but if the image is an aerial view of a city, a corner detector is suitable for finding man-made structures. Furthermore, selecting a detector and descriptor that address the image degradation present is very important.
Given a feature in I1, how do we find the best match in I2?
1. Define a distance function that compares two descriptors.
2. Test all the features in I2 and find the one with the minimum distance.
A simple way to define the difference between two features f1 and f2 is the L2 distance ||f1 − f2||. It can, however, give small distances for ambiguous (incorrect) matches.

The basic idea of feature matching is to calculate the sum of squared differences (SSD) between two feature descriptors; a feature is then matched to the feature with the minimum SSD value.

SSD = Σ (f1 − f2)², where f1 and f2 are two feature descriptors.
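A minimal sketch of SSD-based matching between two descriptor sets using NumPy (the descriptor arrays are assumed to have shape (number of features, descriptor length); the data here is random and only illustrative):

import numpy as np

def match_ssd(desc1, desc2):
    # For each descriptor in desc1, return the index in desc2 with the minimum SSD.
    diffs = desc1[:, None, :] - desc2[None, :, :]   # pairwise differences
    ssd = np.sum(diffs ** 2, axis=2)                # SSD matrix: entry (i, j) compares desc1[i], desc2[j]
    return np.argmin(ssd, axis=1)

d1 = np.random.rand(100, 128).astype(np.float32)    # illustrative descriptors
d2 = np.random.rand(120, 128).astype(np.float32)
matches = match_ssd(d1, d2)                         # matches[i] = best match in d2 for d1[i]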

Algorithms:

Brute-Force Matcher: a brute-force matcher compares the descriptor of every feature in one image with the descriptors of all features in the other image and keeps the closest one. Because the search is exhaustive it always finds the best match under the chosen distance, but it is expensive when the number of features is large.
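A minimal sketch using OpenCV's BFMatcher with ORB binary descriptors (the file names are illustrative; Hamming distance is the usual choice for ORB):

import cv2

img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # illustrative file names
img2 = cv2.imread("train.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # exhaustive matching
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} matches found")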

FLANN (Fast Library for Approximate Nearest Neighbors) Matcher: FLANN performs approximate nearest-neighbour search (for example with KD-trees) instead of an exhaustive comparison, which makes it much faster than the brute-force matcher for large descriptor sets, at the cost of occasionally missing the exact best match.
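A minimal sketch using OpenCV's FlannBasedMatcher with SIFT descriptors and Lowe's ratio test (the file names, index/search parameters and ratio threshold are common illustrative choices; cv2.SIFT_create assumes an OpenCV build where SIFT is available):

import cv2

img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # illustrative file names
img2 = cv2.imread("train.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

index_params = dict(algorithm=1, trees=5)              # 1 = KD-tree index
search_params = dict(checks=50)                        # number of leaves to visit
flann = cv2.FlannBasedMatcher(index_params, search_params)

matches = flann.knnMatch(des1, des2, k=2)              # two nearest neighbours per descriptor
good = [m for m, n in matches if m.distance < 0.7 * n.distance]  # Lowe's ratio test
print(f"{len(good)} good matches")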

Algorithm for Feature Detection and Matching:

1. Find a set of distinctive keypoints.
2. Define a region around each keypoint.
3. Extract and normalize the region content.
4. Compute a local descriptor from the normalized region.
5. Match local descriptors.

-----------

SIFT stands for Scale-Invariant Feature Transform and was first presented in 2004 by D. Lowe, University of British Columbia. SIFT is invariant to image scale and rotation. The algorithm was patented, so it used to be included in the non-free module of OpenCV; the patent has since expired and SIFT is available in the main module of recent OpenCV versions.
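A minimal keypoint detection and description sketch, assuming an OpenCV version in which SIFT is available as cv2.SIFT_create (the file name is illustrative):

import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)    # illustrative file name
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
print(f"{len(keypoints)} keypoints, descriptor shape {descriptors.shape}")  # 128-dimensional descriptors

# Visualize the detected keypoints with their scale and orientation.
vis = cv2.drawKeypoints(img, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("sift_keypoints.png", vis)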

Major advantages of SIFT are:

Locality: features are local, so robust to occlusion and clutter (no prior segmentation)

Distinctiveness: individual features can be matched to a large database of objects

Quantity: many features can be generated for even small objects

Efficiency: close to real-time performance

Extensibility: can easily be extended to a wide range of different feature types, with each adding robustness
