
NANDHA ENGINEERING COLLEGE

COMPUTER VISION
17AIX07

UNIT - 2 FEATURE DETECTION AND FEATURE-BASED ALIGNMENT

January | 2024
CONTENTS
Points and Patches 01

Edges 02

Lines 03

Segmentation 04

Split and Merge 05

Mean Shift and Mode Finding 06
CONTENTS
Normalized Cuts 07

Graph Cuts and Energy-Based Methods 08

2D and 3D Feature-Based Alignment 09

Pose Estimation 10

Geometric Intrinsic Calibration 11


Points and Patches

Points

 Interest Points: Interest points, also known as keypoints or feature points, are specific locations in an image that have unique and discernible features. These points are selected based on characteristics like corners, edges, or texture variations.

 Corner Points: Corner points are locations where the intensity changes in multiple directions. These points are often identified using corner detection algorithms like the Harris corner detector.

 Edge Points: Edge points represent locations where there is a significant intensity gradient, indicating an abrupt change in pixel values.

 Scale-Invariant Feature Transform (SIFT) Points: SIFT points are interest points that are invariant to scale, rotation, and illumination changes. SIFT is a feature extraction algorithm that identifies keypoints and their descriptors.
Points and Patches

Patches

 Image Patches: Image patches are small rectangular or square regions extracted from an image. These patches capture local information and can be used for analysis or feature extraction.

 Patch Matching: Patch matching involves finding correspondences between similar patches in different images. This is often used in tasks like image stitching, stereo vision, and template matching.

 Texture Patches: Texture patches represent regions of an image with consistent and repetitive patterns. Analyzing these patches provides information about the texture characteristics.

 Descriptor Patches: Descriptor patches are regions around keypoints that capture information about the local image structure. Descriptors, such as those used in SIFT, provide a compact representation of the local neighborhood.
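To make the keypoint and descriptor ideas above concrete, here is a minimal sketch using OpenCV's SIFT implementation; the input file name image.jpg is a placeholder and the detector settings are the library defaults.

```python
import cv2

# Load an image in grayscale (the path is a placeholder for illustration)
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# Create a SIFT detector and extract keypoints plus their descriptors.
# Each descriptor is a 128-dimensional summary of the patch around a keypoint.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

print(f"Detected {len(keypoints)} keypoints")
print(f"Descriptor array shape: {descriptors.shape}")  # (num_keypoints, 128)

# Visualize keypoints with their scale and orientation
vis = cv2.drawKeypoints(img, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("keypoints.jpg", vis)
```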
Applications

1. Object Recognition

2. Image Stitching

3. Structure from Motion (SfM)

4. Machine Learning

5. Texture Analysis
Edges

Definition

In computer vision, edges refer to boundaries or transitions between different textures, colors, or intensities in an image. Detecting edges is a fundamental step in image processing, providing crucial information for tasks such as object recognition, image segmentation, and feature extraction.
Key Concepts

Intensity Gradient
Edges are often characterized by a rapid change in intensity. The intensity gradient at a point indicates the rate of change of intensity, and high gradients are indicative of potential edge locations.

Types of Edges
• Step Edges: Sharp transitions between different intensity levels.
• Ramp Edges: Gradual transitions from one intensity level to another.
• Roof Edges: Combine aspects of both step and ramp edges.

Edge Detection Operators
• Sobel Operator: Applies convolution with Sobel kernels to calculate the gradient in both horizontal and vertical directions.
• Prewitt Operator: Similar to the Sobel operator but uses different convolution kernels.
• Canny Edge Detector: Employs multi-stage algorithms, including gradient computation, non-maximum suppression, and edge tracking by hysteresis, to detect edges.

Edge Strength and Orientation
Edge strength represents the magnitude of the gradient at an edge, while the orientation indicates the direction of the gradient. Both are crucial for characterizing edges.

Edge Linking
Connecting edge segments to form continuous boundaries is part of the edge detection process. This is often done through techniques like the Hough transform or edge tracking algorithms.

Thresholding
After edge detection, thresholding is often applied to identify significant edges and discard noise. It helps in emphasizing the most salient features.
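The Sobel and Canny operators described above can be exercised with a short OpenCV sketch; the file name and the Canny hysteresis thresholds (100, 200) are illustrative assumptions rather than recommended values.

```python
import cv2
import numpy as np

# Grayscale input image (the path is a placeholder)
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# Sobel operator: horizontal and vertical gradients
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Edge strength (gradient magnitude) and orientation
magnitude = np.sqrt(gx**2 + gy**2)
orientation = np.arctan2(gy, gx)

# Canny detector: the two values are the hysteresis thresholds
edges = cv2.Canny(img, 100, 200)

cv2.imwrite("sobel_magnitude.jpg", np.uint8(np.clip(magnitude, 0, 255)))
cv2.imwrite("canny_edges.jpg", edges)
```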
Applications

1. Object Recognition

2. Image Segmentation

3. Feature Extraction

4. Image Stitching

5. Robotics and Autonomous Vehicles

6. Medical Image Analysis

7. Quality Inspection

8. Gesture Recognition
Lines

Definition

In computer vision, lines are significant geometric features that represent straight or curved
paths within an image. Detecting lines is a fundamental aspect of image processing and
computer vision, contributing to tasks such as object recognition, scene analysis, and image
understanding.
Key Concepts

Hough Transform
The Hough Transform is a popular technique for line detection. It represents lines in polar coordinates and accumulates votes for each line parameter in a parameter space.

RANSAC (Random Sample Consensus)
RANSAC is a robust method used for fitting lines to a set of data points contaminated by outliers. It is commonly employed for line detection in the presence of noise.

Line Segments vs. Straight Lines
Line detection can involve identifying complete straight lines or breaking down longer lines into line segments, especially when dealing with complex scenes.

Types of Lines
• Horizontal and Vertical Lines: Often targeted for tasks like text detection or scene layout analysis.
• Diagonal Lines: Important for recognizing structures and patterns in images.
• Curved Lines: Detected in scenarios where straight lines are not sufficient to represent features.

Edge-Based Line Detection
Lines are often detected based on the edges in an image. Edge detection algorithms, such as the Canny edge detector, provide input for subsequent line detection processes.
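As a minimal sketch of edge-based line detection, the probabilistic Hough transform in OpenCV can be run on a Canny edge map; the file name and the vote threshold, minimum segment length, and gap parameters are illustrative assumptions.

```python
import cv2
import numpy as np

# An edge map from Canny feeds the Hough transform (path is a placeholder)
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)

# Probabilistic Hough transform: returns line segments as (x1, y1, x2, y2).
# rho and theta set the resolution of the parameter space; threshold is
# the minimum number of accumulator votes a line needs.
segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                           minLineLength=50, maxLineGap=10)

vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
if segments is not None:
    for x1, y1, x2, y2 in segments[:, 0]:
        cv2.line(vis, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imwrite("hough_lines.jpg", vis)
```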
Applications

1. Lane Detection in Autonomous Vehicles

2. Document Analysis

3. Industrial Quality Control

4. Building and Structure Recognition

5. Cartography and GIS

6. Robotics

7. Barcode and QR Code Reading

8. Gesture Recognition
Segmentation

Definition

Segmentation in computer vision refers to the process of partitioning an image into distinct and meaningful regions based on certain criteria, such as color, intensity, texture, or spatial proximity. The goal is to group pixels or regions that share similar visual characteristics and separate them from the background or other objects.
Key Concepts

01 Pixel-Level Segmentation
Divides an image into individual pixels, assigning each pixel to a specific class or label based on its characteristics.

02 Region-Based Segmentation
Groups adjacent pixels into larger regions based on common features, forming coherent and homogeneous segments within the image.

03 Thresholding
Involves setting intensity or color thresholds to separate regions of interest from the background. Pixels above or below the threshold are classified accordingly.

04 Clustering
Utilizes clustering algorithms, such as k-means, to group pixels with similar features into clusters, representing different segments.

05 Edge-Based Segmentation
Detects edges and boundaries between different regions in an image, defining the limits of distinct segments.

06 Watershed Segmentation
Mimics the behavior of water flowing along gradients in an image. Watershed lines mark boundaries between segments.
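A minimal sketch of clustering-based segmentation (concept 04 above) groups pixel colors with k-means; the file name and the choice of k = 4 clusters are illustrative assumptions.

```python
import cv2
import numpy as np

# Color image to segment (the path is a placeholder)
img = cv2.imread("image.jpg")

# Reshape pixels to an (N, 3) array of BGR values for clustering
pixels = img.reshape(-1, 3).astype(np.float32)

# k-means clustering of pixel colors into k segments
k = 4
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 5,
                                cv2.KMEANS_RANDOM_CENTERS)

# Replace every pixel with its cluster center to visualize the segments
segmented = centers[labels.flatten()].reshape(img.shape).astype(np.uint8)
cv2.imwrite("kmeans_segments.jpg", segmented)
```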
Applications

1. Object Recognition

2. Medical Image Analysis

3. Image Editing and Augmentation

4. Autonomous Vehicles

5. Satellite Image Analysis

6. Facial Recognition

7. Video Surveillance

8. Robotics
Split and Merge

Definition

Split and Merge is a hierarchical image segmentation technique used in computer vision. It involves recursively dividing an image into smaller regions and merging them based on certain criteria. This process continues until the desired segmentation level is achieved, ensuring homogeneous regions with respect to predefined characteristics.
Key Concepts

Recursive Splitting
The image is initially split into smaller sub-regions. This splitting process is applied recursively until certain criteria are met or a predefined segmentation level is reached.

Homogeneity Criteria
The decision to split or merge regions is based on homogeneity criteria, which can include color consistency, intensity similarity, or texture uniformity within a region.

Splitting
• The image is initially considered as a single region. If the region does not meet the homogeneity criteria, it is split into smaller sub-regions.
• The splitting process is applied recursively until stopping criteria are met.

Merging
• After the splitting phase, regions are examined for homogeneity.
• Adjacent regions that satisfy the homogeneity criteria are merged to form larger, more homogeneous segments.
• The merging process continues until stopping criteria are satisfied.

Region Merging
After splitting, regions are merged if they satisfy the homogeneity criteria. Merging involves combining adjacent regions to create larger, more coherent segments.

Stopping Criteria
The process continues until stopping criteria are met. Stopping criteria could include reaching a specified number of segments, achieving a desired level of homogeneity, or other application-specific requirements.
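The following sketch illustrates only the recursive splitting half of the technique, using a quadtree split driven by intensity variance as the homogeneity criterion; the variance threshold and minimum region size are illustrative assumptions, and the merging of adjacent homogeneous quadrants is omitted for brevity.

```python
import numpy as np

def split_and_merge(region, min_size=8, var_thresh=100.0):
    """Quadtree-style splitting sketch: a region counts as homogeneous when
    its intensity variance is below var_thresh; otherwise it is recursively
    split into four quadrants (thresholds are illustrative assumptions)."""
    h, w = region.shape
    if region.var() <= var_thresh or h <= min_size or w <= min_size:
        # Homogeneous (or too small to split): fill with the mean intensity
        return np.full_like(region, int(region.mean()))
    h2, w2 = h // 2, w // 2
    out = np.empty_like(region)
    out[:h2, :w2] = split_and_merge(region[:h2, :w2], min_size, var_thresh)
    out[:h2, w2:] = split_and_merge(region[:h2, w2:], min_size, var_thresh)
    out[h2:, :w2] = split_and_merge(region[h2:, :w2], min_size, var_thresh)
    out[h2:, w2:] = split_and_merge(region[h2:, w2:], min_size, var_thresh)
    return out

# Example usage: approximate a grayscale image by homogeneous blocks
# out = split_and_merge(cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE))
```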
Applications

1. Image Segmentation

2. Medical Image Analysis

3. Satellite Image Processing

4. Texture Analysis

5. Object Recognition

6. Computer-Aided Design (CAD)

7. Remote Sensing
Mean Shift and Mode Finding

Definition

Mean Shift is a non-parametric clustering and mode-seeking algorithm used in computer vision and image processing. It is particularly effective for tasks such as image segmentation, object tracking, and feature extraction.
Key Concepts

Mode Seeking
Mean Shift seeks the modes or peaks in the probability density function of a dataset. These modes represent clusters or regions of interest in the data.

Bandwidth Parameter
The bandwidth parameter of the kernel determines the size of the region around each data point within which other points contribute to the shift.

Kernel Function
Mean Shift operates by iteratively shifting the data points towards the mode of the underlying distribution. The shifting is influenced by a kernel function that gives more weight to nearby points.

Convergence
The process continues until convergence, where data points settle around the modes. Mean Shift is adaptive to the shape and density of the data distribution.
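A minimal sketch of mode seeking with scikit-learn's MeanShift on synthetic 2D data; the bandwidth is estimated from the data, and the two Gaussian blobs stand in for dense regions of an image feature space.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Synthetic 2D data with two dense regions (a stand-in for image features)
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])

# The bandwidth controls the kernel radius; estimate_bandwidth picks one
# from the data instead of hand-tuning it
bandwidth = estimate_bandwidth(data, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth).fit(data)

print("Modes found:", ms.cluster_centers_)          # one mode per dense region
print("Cluster labels for first 5 points:", ms.labels_[:5])
```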
Mean Shift and Mode Finding

Definition

Mode finding refers to the process of identifying modes in a dataset, representing the values or configurations that occur most frequently. In the context of computer vision, this can relate to finding dominant colors, shapes, or patterns.
Key Concepts

Histogram Analysis
Mode finding often involves analyzing histograms to identify peaks, which correspond to the modes in the distribution of pixel values or feature responses.

Intensity Peaks
For grayscale images, mode finding can identify intensity levels that occur most frequently, providing insights into the overall brightness or contrast.

Dominant Colors
In image processing, mode finding can be applied to identify dominant colors within an image. This is useful in tasks like color-based image segmentation.
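As a small sketch of histogram-based mode finding, the most frequent grayscale intensity can be read off as the tallest histogram peak; the file name is a placeholder.

```python
import cv2
import numpy as np

# Grayscale image (the path is a placeholder)
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# 256-bin intensity histogram
hist = cv2.calcHist([img], [0], None, [256], [0, 256]).flatten()

# The mode is the most frequent intensity level (the tallest histogram peak)
mode_intensity = int(np.argmax(hist))
print(f"Most frequent intensity: {mode_intensity} "
      f"({int(hist[mode_intensity])} pixels)")
```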
Normalized Cuts

Definition

Normalized Cuts is a graph-based image segmentation algorithm used in computer vision. It aims to partition an image into coherent and visually meaningful segments by considering both the similarities between pixels within segments and the dissimilarities between segments. The method is based on spectral graph theory and has proven effective for various segmentation tasks.
Key Concepts

Graph Representation
Represent the image as a graph, where pixels are nodes, and edges
represent pairwise relationships. Edge weights typically capture the
similarity between pixel intensities or feature vectors.

Affinity Matrix
Construct an affinity matrix based on the pairwise similarities. Common
choices for similarity measures include Gaussian functions or gradients.

Spectral Decomposition
Decompose the affinity matrix into its eigenvectors and eigenvalues. The
eigenvectors capture the structural information of the graph.

Feature Space Embedding


Embed pixels into a lower-dimensional feature space using the eigenvectors.
This process aims to reveal the underlying structure of the data.

Normalized Cuts Criterion


Formulate the segmentation problem as an optimization task. The goal is to
find partitions that maximize the ratio of the sum of within-segment affinities
to the sum of total affinities.
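A minimal sketch of the pipeline above (graph construction, affinities, spectral partitioning) using scikit-learn's spectral_clustering, which solves a relaxation of the normalized cuts objective; the tiny synthetic image and the exponential affinity scaling are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_extraction.image import img_to_graph
from sklearn.cluster import spectral_clustering

# Tiny synthetic grayscale image: a bright square on a dark background
img = np.zeros((30, 30))
img[8:22, 8:22] = 1.0

# Build the pixel graph; affinities decay with the intensity gradient,
# so similar neighboring pixels get high edge weights
graph = img_to_graph(img)
graph.data = np.exp(-graph.data / (graph.data.std() + 1e-6))

# Spectral partitioning of the affinity graph into 2 segments
labels = spectral_clustering(graph, n_clusters=2, eigen_solver="arpack",
                             assign_labels="discretize", random_state=0)
segments = labels.reshape(img.shape)
print(np.unique(segments, return_counts=True))
```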
Applications

1. Image Segmentation

2. Object Recognition

3. Biomedical Image Analysis

4. Video Segmentation

5. Scene Understanding

6. Image Editing
Graph Cuts and Energy-Based Methods

Definition

Graph cuts refer to the partitioning of a graph into disjoint sets by removing the minimum possible edge weights. In computer vision, graph cuts are extensively used for image segmentation and various other applications.
Key Concepts

Graph Representation
Represent an image as a graph, where nodes correspond to pixels, and edges capture pairwise relationships. Edge weights typically represent the dissimilarity or cost between adjacent pixels.

Min-Cut/Max-Flow Algorithm
Utilize graph algorithms like the min-cut/max-flow algorithm to find the optimal partitioning that minimizes the total cut cost. The cut defines the boundary between segmented regions.

Source and Sink Nodes
Introduce source and sink nodes representing the foreground and background, respectively. Connect these nodes to image pixels with appropriate edge weights.

Energy Minimization
The graph cut problem is often formulated as an energy minimization problem, where the objective is to find the partitioning that minimizes an energy function.
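As one concrete instance of min-cut based segmentation, OpenCV's GrabCut builds a graph with source/sink terminals and solves a min-cut to separate foreground from background; the file name and the initial rectangle are placeholders.

```python
import cv2
import numpy as np

# Color image and a rough bounding box around the foreground object
# (both the path and the rectangle are placeholders)
img = cv2.imread("image.jpg")
rect = (50, 50, 200, 200)  # x, y, width, height

# GrabCut builds a pixel graph with source/sink terminals and solves a
# min-cut to split foreground from background
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels labeled definite or probable foreground form the segmented object
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)
cv2.imwrite("foreground.jpg", img * fg[:, :, None])
```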
Applications

1. Image Segmentation

2. Interactive Segmentation

3. Video Segmentation

4. Stereo Matching

5. Object Recognition
Graph Cuts and Energy-Based Methods

Definition

Energy-based methods involve defining an energy function that assigns a cost to different configurations. In computer vision, these methods are often used for tasks such as image denoising, object recognition, and optimization problems.
Key Concepts

Energy Function
Define an energy function that quantifies the goodness or badness of a particular configuration. The energy function typically consists of data and smoothness terms.

Data Term
Measures the agreement between the observed data and the predicted configuration. Lower energy is assigned to configurations that better match the observed data.

Smoothness Term
Encodes the preference for smooth configurations. It penalizes abrupt changes and encourages coherence in neighboring regions.

Optimization
Minimize the energy function by searching for the configuration that achieves the lowest energy. This can be done using optimization techniques such as gradient descent or graph cuts.
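A minimal sketch of energy minimization with a data term plus a smoothness term, applied to 1D signal denoising by gradient descent; the smoothness weight, step size, and iteration count are illustrative assumptions, and discrete labeling problems would use graph cuts instead.

```python
import numpy as np

def denoise_energy_descent(observed, lam=2.0, step=0.1, iters=200):
    """Minimize E(x) = sum_i (x_i - y_i)^2  (data term)
                 + lam * sum_i (x_{i+1} - x_i)^2  (smoothness term)
    by gradient descent; lam, step, and iters are illustrative choices."""
    x = observed.copy()
    for _ in range(iters):
        data_grad = 2.0 * (x - observed)
        # Gradient of the smoothness term (a discrete Laplacian of x)
        smooth_grad = np.zeros_like(x)
        smooth_grad[1:-1] = 2.0 * lam * (2 * x[1:-1] - x[:-2] - x[2:])
        smooth_grad[0] = 2.0 * lam * (x[0] - x[1])
        smooth_grad[-1] = 2.0 * lam * (x[-1] - x[-2])
        x -= step * (data_grad + smooth_grad)
    return x

# Noisy step signal; the result stays close to the data but is smoother
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(50), np.ones(50)]) + rng.normal(0, 0.2, 100)
print(denoise_energy_descent(y)[:5])
```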
Applications

1. Image Denoising

2. Object Recognition

3. Image Restoration

4. Motion Estimation

5. Image Inpainting
2D and 3D Feature-Based Alignment

Definition

2D feature-based alignment involves aligning images or patterns in two-dimensional space based on distinctive features. These features are identifiable points or descriptors that allow for matching and registration.
Key Concepts

Feature Extraction
Identify and extract distinctive features from images. Common features include corners, keypoints, or local descriptors like SIFT (Scale-Invariant Feature Transform) or ORB (Oriented FAST and Rotated BRIEF).

Feature Matching
Match corresponding features between images. Matching is typically performed using techniques such as nearest-neighbor matching or robust methods like RANSAC (Random Sample Consensus) to handle outliers.

Transformation Estimation
Estimate the transformation (translation, rotation, scaling) that aligns the features in one image with those in another. Common transformations include affine or homography transformations.

Image Alignment
Apply the estimated transformation to align the images. This ensures that corresponding features in the two images are brought into registration.
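The full 2D pipeline above (extraction, matching, RANSAC-based transformation estimation, alignment) can be sketched with OpenCV's ORB features and homography estimation; the two file names and the RANSAC reprojection threshold are illustrative assumptions.

```python
import cv2
import numpy as np

# Two overlapping views to align (paths are placeholders)
img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# Feature extraction with ORB and brute-force descriptor matching
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Homography estimation with RANSAC to reject outlier matches
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp the first image into the second image's frame (registration)
aligned = cv2.warpPerspective(img1, H, (img2.shape[1], img2.shape[0]))
cv2.imwrite("aligned.jpg", aligned)
```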
2D and 3D Feature-Based Alignment

Definition

3D feature-based alignment extends the concept to three-dimensional space, aligning 3D models or scenes based on distinctive features.
Key Concepts

3D Feature Extraction
Extract features from 3D models or point clouds. Features may include keypoints, surface normals, or local descriptors.

Feature Matching in 3D
Match corresponding features between different 3D models or scenes. Matching is often based on geometric constraints and feature descriptors.

Transformation Estimation
Estimate the 3D transformation parameters (translation, rotation, scaling) that align one 3D model or scene with another.

3D Alignment
Apply the estimated transformation to align the 3D models or scenes. This ensures that corresponding features in the two 3D representations are aligned.
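A minimal sketch of 3D transformation estimation under the assumption that point correspondences are already known from feature matching: the SVD-based (Kabsch-style) solution for the rigid rotation and translation between two matched point sets, checked on synthetic data.

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Estimate the rotation R and translation t that best align matched
    3D points src onto dst in the least-squares sense (Kabsch algorithm).
    Assumes correspondences are already known."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # fix a possible reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

# Check on synthetic data: rotate and translate a small point cloud
rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 3))
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
moved = pts @ R_true.T + np.array([1.0, 2.0, 3.0])
R_est, t_est = estimate_rigid_transform(pts, moved)
print(np.allclose(R_est, R_true), np.round(t_est, 3))
```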
Pose Estimation

Definition

Pose estimation in computer vision refers to the process of determining the position and orientation (pose) of an object or a scene in a given environment. It involves extracting information about the translation and rotation of an object with respect to a reference coordinate system.
Key Concepts

Camera Calibration
Accurate pose estimation often begins with calibrating the camera to understand its intrinsic parameters (focal length, principal point, distortion) and extrinsic parameters (position and orientation).

Feature Extraction
Identify and extract features from the object or scene that can be matched across images. Common features include keypoints, corners, or distinct patterns.

Feature Matching
Match corresponding features between the reference image and the observed image. This matching step helps establish correspondences for pose computation.

Perspective-n-Point (PnP) Problem
Pose estimation is often formulated as the Perspective-n-Point problem, where the goal is to solve for the camera pose given the 3D coordinates of points in the object's reference frame and their 2D projections in the image (a minimal code sketch follows this list).

RANSAC (Random Sample Consensus)
To handle outliers and noisy data, robust methods like RANSAC are often employed during feature matching and pose estimation.

Camera Pose Computation
Once correspondences are established, algorithms such as the Direct Linear Transform (DLT) method or iterative optimization techniques are used to compute the camera pose.

Homography Transformation
For planar objects or scenes, a homography transformation can be used to estimate the pose.

Bundle Adjustment
In more complex scenarios involving multiple views or a sequence of images, bundle adjustment can refine the pose estimation by optimizing the entire set of camera parameters.
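A minimal sketch of the PnP formulation referenced above, using OpenCV's solvePnP; the marker geometry, the detected pixel coordinates, and the intrinsic matrix are placeholder values from an assumed prior calibration.

```python
import cv2
import numpy as np

# 3D corners of a 10 cm square marker in its own coordinate frame (metres)
object_points = np.array([[0, 0, 0], [0.1, 0, 0],
                          [0.1, 0.1, 0], [0, 0.1, 0]], dtype=np.float64)

# Their detected 2D projections in the image (placeholder pixel values)
image_points = np.array([[320, 240], [420, 245],
                         [415, 345], [318, 340]], dtype=np.float64)

# Intrinsics from a prior calibration (illustrative values)
K = np.array([[800, 0, 320],
              [0, 800, 240],
              [0, 0, 1]], dtype=np.float64)
dist = np.zeros(5)  # assume negligible lens distortion

# Solve the PnP problem for the object's rotation and translation
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
print("Rotation matrix:\n", R)
print("Translation (m):", tvec.ravel())
```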
Applications

1. Augmented Reality (AR)

2. Robotics

3. 3D Reconstruction

4. Object Tracking

5. Autonomous Vehicles

6. Medical Imaging
Geometric Intrinsic Calibration

Definition

Geometric intrinsic calibration in computer vision involves determining the internal parameters of a camera that define its geometry and optics. These parameters include focal length, principal point, and lens distortion coefficients. Calibration is essential to ensure accurate geometric relationships between the 3D world and the 2D image captured by the camera.
Key Concepts

01 Focal Length (f)
Focal length is the distance between the camera's optical center and the image plane. It determines how much the camera can "zoom in" or "zoom out" and influences the scale of the captured scene.

02 Principal Point
The principal point is the location of the optical axis intersection with the image plane. It defines the center of the image and affects the distortion characteristics.

03 Camera Matrix (K)
The camera matrix is a 3x3 matrix that combines the focal length, principal point, and skew. It is a crucial component in the perspective projection from 3D to 2D.

04 Calibration Patterns
Intrinsic calibration often involves capturing images of known calibration patterns, such as checkerboards or grids, at different orientations. These patterns provide the necessary information for calibration.
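A minimal sketch of checkerboard-based intrinsic calibration with OpenCV; the 9x6 inner-corner pattern, 25 mm square size, and calib_*.jpg file pattern are illustrative assumptions.

```python
import cv2
import numpy as np
import glob

# Inner-corner count of the checkerboard and its square size (assumptions)
pattern = (9, 6)
square = 0.025  # metres

# 3D corner coordinates of the board in its own plane (Z = 0)
board = np.zeros((pattern[0] * pattern[1], 3), np.float32)
board[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for path in glob.glob("calib_*.jpg"):  # placeholder file pattern
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(board)
        img_points.append(corners)

# Recover the camera matrix K and the lens distortion coefficients
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("Reprojection RMS error:", rms)
print("Camera matrix K:\n", K)
print("Distortion coefficients:", dist.ravel())
```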
Applications

1. Augmented Reality (AR)

2. Robotics

3. 3D Reconstruction

4. Computer Vision Systems

Challenges

Accurate Pattern Detection
Reliable feature extraction from calibration patterns can be challenging, especially in the presence of noise.

Variability in Environment
Changes in lighting conditions or environmental factors can impact the accuracy of calibration.

Lens Distortion Modeling
Accurate modeling of lens distortion is crucial for precise calibration and correction.
THANK YOU