CH-4 COMPUTER VISION
The region-splitting method is a top-down approach for image segmentation. It starts with
the entire image as a single region and recursively splits it into smaller regions based on
predefined criteria, such as differences in intensity, color, or texture.
1. Initial Region: The entire image starts as a single region.
2. Splitting Criterion: A region is split if it does not meet a homogeneity
condition, such as uniformity in pixel intensity.
3. Recursive Splitting: The process continues for each subregion until all regions satisfy
the homogeneity condition.
This method helps in segmenting an image into meaningful regions based on similarity but
can be computationally expensive.
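The recursive splitting step can be sketched as a quadtree subdivision. This is a minimal illustration, not a production implementation; the homogeneity test (peak-to-peak intensity under a threshold) and the toy image are assumptions chosen for the example.

```python
import numpy as np

def split_region(img, x, y, h, w, threshold, regions):
    """Recursively split a region until its intensity range is below threshold."""
    block = img[y:y + h, x:x + w]
    # Homogeneity test: peak-to-peak intensity within the region
    if np.ptp(block) <= threshold or h <= 1 or w <= 1:
        regions.append((x, y, h, w))
        return
    h2, w2 = h // 2, w // 2
    # Split into four quadrants (quadtree) and recurse on each
    for dy, dx, hh, ww in [(0, 0, h2, w2), (0, w2, h2, w - w2),
                           (h2, 0, h - h2, w2), (h2, w2, h - h2, w - w2)]:
        split_region(img, x + dx, y + dy, hh, ww, threshold, regions)

# Toy image: bright square in the top-left corner, dark elsewhere
img = np.zeros((8, 8), dtype=np.uint8)
img[:4, :4] = 200
regions = []
split_region(img, 0, 0, 8, 8, threshold=10, regions=regions)
```

After one split, each of the four quadrants is already homogeneous, so the recursion stops at depth one with four regions.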
The mean-shift method segments an image by shifting each pixel toward the densest region (mode) of its feature space.
1. Feature Space Representation:
An image is represented in a feature space, where each pixel is a point in the space
defined by features like color (RGB values), intensity, or texture.
A window or kernel (typically circular or Gaussian) is placed around each pixel, with
a specific radius (bandwidth) that defines the region of influence.
2. Iterative Process:
For each pixel, the mean of all the pixels within the window (kernel) is computed.
The mean is calculated based on the feature space (e.g., color, intensity).
The pixel is then moved to this mean position, shifting the kernel to the new
location.
3. Convergence:
This process is repeated iteratively. In each iteration, the window shifts toward the
region with the highest density of points in the feature space (the mode).
The process stops when the shift is smaller than a threshold, or after a set number
of iterations. This results in a "mode" of the feature space where each pixel has
moved toward its local peak.
4. Cluster Formation:
Once convergence is achieved, each pixel is assigned to the mode (or cluster) it
belongs to based on the final location of the mean.
Pixels with similar features (intensity, color, etc.) will converge to the same mode and
thus belong to the same region or segment.
Advantages:
Non-parametric: No need to define the number of segments in advance.
Adaptability: It can adapt to different shapes and densities of regions in the feature
space.
Flexibility: Works well with various features like color, texture, and intensity.
Disadvantages:
Computationally Expensive: It requires multiple iterations and computations for each
pixel, making it slow for large images.
Bandwidth Sensitivity: The choice of bandwidth (radius of the window) can significantly
affect the segmentation result. Small bandwidths may lead to over-segmentation, while
large bandwidths may cause under-segmentation.
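The iterative process above can be sketched in a 1-D intensity feature space, using a flat (uniform) kernel instead of a Gaussian; the sample intensities and bandwidth are assumptions chosen for illustration.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=50, tol=1e-3):
    """Shift every point toward the mean of its neighbours until convergence."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            # Flat kernel: neighbours of p within `bandwidth` in feature space
            nbrs = points[np.abs(points - p) <= bandwidth]
            new_p = nbrs.mean()
            moved = max(moved, abs(new_p - p))
            shifted[i] = new_p
        if moved < tol:  # shift smaller than threshold -> converged
            break
    return shifted

# Two well-separated intensity clusters; each point converges to its mode
intensities = np.array([10., 12., 11., 200., 202., 201.])
modes = mean_shift(intensities, bandwidth=20)
```

Pixels whose shifted positions land on the same mode would then be assigned to the same segment.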
Explain image segmentation in brief. Also discuss various
approaches for image segmentation.
1. Threshold-Based Segmentation:
A basic segmentation technique where pixels are classified into different groups
based on their intensity values.
Global Thresholding: A single threshold value is chosen for the entire image.
Adaptive Thresholding: Threshold values are calculated for smaller regions in the
image, adjusting to local variations.
2. Edge-Based Segmentation:
Edge Detection: Techniques like the Sobel operator, Canny edge detector, or
Laplacian of Gaussian (LoG) are used to identify edges.
Once edges are identified, regions are formed based on these boundaries.
3. Region-Based Segmentation:
Region Growing: Starts with a seed pixel and grows the region by adding
neighboring pixels that meet a similarity criterion.
Region Splitting and Merging: Initially splits the image into uniform regions and
then merges adjacent regions if they satisfy a homogeneity criterion.
4. Clustering-Based Segmentation:
Uses clustering algorithms to group similar pixels together based on their features
(e.g., color, texture).
K-means Clustering: Pixels are classified into a predefined number of clusters based
on their feature similarities.
5. Graph-Based Segmentation:
The image is represented as a graph where pixels are nodes, and edges represent
the similarity between neighboring pixels.
Graph Cuts: Cuts the graph into two disjoint sets to separate different regions in the
image.
6. Watershed Transform:
The watershed algorithm treats the image as a topographic surface and "floods" it
from regional minima; the catchment basins that form become the segmented regions.
7. Deep Learning-Based Segmentation:
Uses convolutional neural networks (CNNs) to learn features and patterns directly
from data for more complex and accurate segmentation tasks.
Fully Convolutional Networks (FCNs) and U-Net are popular architectures for
image segmentation, especially in medical imaging and other detailed tasks.
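The threshold-based approach above can be sketched in a few lines of numpy; the block size and toy image below are assumptions chosen for illustration.

```python
import numpy as np

def global_threshold(img, t):
    """Global thresholding: a single cutoff for the whole image."""
    return (img > t).astype(np.uint8)

def adaptive_threshold(img, block=3, offset=0):
    """Adaptive thresholding: each pixel is compared against the mean of its
    (block x block) neighbourhood, so the cutoff tracks local variations."""
    pad = block // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    out = np.zeros(img.shape, dtype=np.uint8)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            local_mean = padded[y:y + block, x:x + block].mean()
            out[y, x] = 1 if img[y, x] > local_mean - offset else 0
    return out

# Toy image: dark left columns, bright right column
img = np.array([[10, 10, 200],
                [10, 10, 200],
                [10, 10, 200]], dtype=np.uint8)
mask_global = global_threshold(img, t=100)
mask_adaptive = adaptive_threshold(img)
```

On this simple image both variants agree; adaptive thresholding pays off when illumination varies across the image.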
Summary:
Thresholding is simple but effective for binary segmentation.
Deep learning approaches offer advanced, highly accurate segmentation by learning
features directly from images.
Each approach has its strengths and is chosen based on the specific requirements of the
segmentation task, such as the complexity of the image and the desired output.
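The clustering-based approach can likewise be sketched with a minimal k-means (Lloyd's algorithm) on 1-D pixel intensities; the sample values and k = 2 are assumptions chosen for the example.

```python
import numpy as np

def kmeans_segment(pixels, k, n_iter=20, seed=0):
    """Cluster 1-D pixel intensities into k groups (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(pixels, size=k, replace=False).astype(float)
    for _ in range(n_iter):
        # Assign each pixel to its nearest cluster centre
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Recompute each centre as the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels, centers

# Two clear intensity groups; each pixel is labelled by its cluster
pixels = np.array([12., 10., 11., 240., 242., 241.])
labels, centers = kmeans_segment(pixels, k=2)
```

For color images the same loop runs over RGB feature vectors instead of scalar intensities.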
Image Segmentation:
Image segmentation is the process of partitioning an image into multiple segments or
regions, each representing a different part of the image that has similar characteristics such
as color, intensity, or texture. The goal of segmentation is to simplify the representation of
the image or make it more meaningful and easier to analyze. It is a fundamental task in
computer vision, helping to identify objects, boundaries, and structures within an image.
Applications:
1. Medical Imaging:
Organ Segmentation: It helps in segmenting organs such as the brain, liver, and
lungs in medical images, aiding diagnosis and treatment planning.
2. Remote Sensing:
Land Use and Land Cover Classification: Satellite images are segmented to classify
different types of land cover, such as forests, water bodies, and urban areas.
4. Agriculture:
5. Industrial Automation:
6. Augmented Reality (AR): Segmentation allows the overlay of virtual objects onto
real-world scenes, making interactions more immersive.
7. Biometrics:
8. License Plate Recognition: Segmentation aids in identifying the license plate area
from vehicle images for tasks like toll collection and surveillance.
9. Video Surveillance:
Tracking: Segmentation is crucial for tracking moving objects or people across
frames in surveillance video.
Summary:
Image segmentation plays a crucial role in various fields by isolating meaningful regions in
an image, facilitating easier analysis and understanding. Its applications span from medical
imaging and autonomous vehicles to industrial automation and video surveillance, making it
an essential tool in computer vision and related areas.
Watershed Segmentation:
Basic Concept:
In watershed segmentation, an image is treated as a topographic surface, with pixel intensity
values representing elevation. The goal is to "flood" the image, where regions (or catchment
basins) are formed around minima, and the boundaries between regions are defined by
"watershed lines."
Steps Involved:
1. Gradient Calculation:
The image’s gradient (or edge map) is calculated to identify the intensity changes,
which helps in finding boundaries.
2. Flooding:
Imagine the image being submerged in water, with water rising from regional
minima (dark areas in the gradient image).
As water rises, it fills the basins, and the flooding process continues until the water
from different basins meets at certain points.
3. Watershed Line Formation:
The points where the waters from different basins meet are identified as watershed
lines (or boundaries).
4. Region Formation:
The regions bounded by watershed lines represent the segmented parts of the
image.
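The flooding steps above can be sketched on a 1-D intensity profile, where regional minima seed the basins and lower elevations are flooded first; the profile values are assumptions chosen for illustration.

```python
import heapq

def watershed_1d(profile):
    """Flood a 1-D intensity profile from its regional minima. Each sample
    gets the label of the basin that reaches it first; watershed lines lie
    where adjacent labels differ."""
    n = len(profile)
    labels = [0] * n
    heap = []
    next_label = 1
    # Seed one label per regional minimum
    for i in range(n):
        left = profile[i - 1] if i > 0 else float('inf')
        right = profile[i + 1] if i < n - 1 else float('inf')
        if profile[i] < left and profile[i] < right:
            labels[i] = next_label
            heapq.heappush(heap, (profile[i], i))
            next_label += 1
    # Flood in order of increasing elevation, growing each basin outward
    while heap:
        _, i = heapq.heappop(heap)
        for j in (i - 1, i + 1):
            if 0 <= j < n and labels[j] == 0:
                labels[j] = labels[i]
                heapq.heappush(heap, (profile[j], j))
    return labels

# Two valleys (minima at indices 1 and 5) separated by a ridge at index 3
profile = [3, 1, 2, 5, 2, 0, 4]
labels = watershed_1d(profile)
```

A 2-D implementation follows the same priority-queue flooding, just with four or eight neighbours per pixel.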
Key Features:
Marker-Based Watershed: This method can be improved by defining initial markers
(seed points) to guide the flooding process, reducing over-segmentation and improving
accuracy.
Advantages:
Effective for segmenting regions based on intensity and gradient.
Disadvantages:
Sensitive to noise, leading to over-segmentation in low-quality images.
In summary, the watershed segmentation method uses the flooding principle to separate an
image into regions based on pixel intensity, making it a useful technique for detecting
boundaries and segmenting complex structures.
The snake method (also known as active contour models) is a popular technique for image
segmentation, where an energy-minimizing curve (the "snake") is used to delineate the
boundary of an object in an image. The snake evolves through the image, driven by forces
that push it towards object boundaries while maintaining smoothness.
Concept:
The snake is an initial curve (polygonal or spline) that deforms iteratively to fit the boundary
of an object in the image. The deformation is controlled by an energy function, which is
minimized during the evolution process.
Energy Function:
The energy function of a snake is composed of three terms:
Internal energy ensures that the snake remains smooth and can be modeled as a spring
system. It consists of two components:
E_internal = ∫₀¹ ( α |dX(s)/ds|² + β |d²X(s)/ds²|² ) ds

Where:
X(s) is the parametric snake curve for s ∈ [0, 1],
α weights the first-derivative (elasticity) term, which penalizes stretching,
β weights the second-derivative (rigidity) term, which penalizes bending.
External energy pulls the snake towards desired features, like object boundaries or regions
of interest in the image. It is derived from image features such as gradients or edges. A
common external energy term is related to the image gradient (the rate of intensity
change), which helps in identifying edges.
E_external = − ∫₀¹ F_external ⋅ (dX(s)/ds) ds
Where Fexternal is the force derived from the image's gradient or other features like edges.
Fexternal = ∇I(X)
Where I(X) is the image intensity at position X, and ∇I(X) is the gradient (edge
information) at that point.
The image energy term can be a potential energy function derived from the image's intensity
values or edges. For example, it can be based on edge detection operators such as the
gradient magnitude or edge strength.
E_image = ∫₀¹ (λ ⋅ ∇I(X(s)))² ds
Where:
λ is a weighting factor that adjusts the influence of the image energy term,
∇I(X(s)) is the gradient of the image at the point X(s).
Snake Evolution:
The snake evolves by minimizing the total energy function. This can be done by solving the
Euler-Lagrange equation or using numerical methods like the finite difference method.
X(s, t+1) = X(s, t) + Δt ⋅ ( − ∂E/∂X(s) )

Where:
X(s, t) is the position of the snake at iteration t,
Δt is the step size of the update,
∂E/∂X(s) is the gradient of the total energy with respect to the curve point X(s).
Summary of Forces:
Internal forces: Encourage smoothness and continuity of the snake.
External forces: Pull the snake toward edges or other desired features.
Image forces: Help in detecting object boundaries by guiding the snake along edges or
image gradients.
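The evolution rule can be sketched as an explicit gradient-descent update. The external force used below (`toward_circle`, which pulls every point onto a circle of radius 2) is a hypothetical stand-in for an image-derived force such as ∇I; α, Δt, and the initial contour are illustrative choices.

```python
import numpy as np

def evolve_snake(X, external_force, alpha=0.1, dt=0.1, n_iter=500):
    """Explicit scheme for X(s, t+1) = X(s, t) + dt * (-dE/dX(s)).
    Internal force: discrete second difference alpha * (X[s-1] - 2X[s] + X[s+1])
    (the membrane/smoothing term); the external force is supplied per point."""
    X = X.astype(float).copy()
    for _ in range(n_iter):
        # Second difference along the closed contour = smoothing force
        internal = alpha * (np.roll(X, 1, axis=0) - 2 * X + np.roll(X, -1, axis=0))
        X = X + dt * (internal + external_force(X))
    return X

def toward_circle(X):
    """Hypothetical external force: points inward if r > 2, outward if r < 2."""
    r = np.linalg.norm(X, axis=1, keepdims=True)
    return (2.0 / r - 1.0) * X

# Initial contour: 16 points on a circle of radius 3 around the origin
theta = np.linspace(0, 2 * np.pi, 16, endpoint=False)
X0 = 3.0 * np.column_stack([np.cos(theta), np.sin(theta)])
Xf = evolve_snake(X0, toward_circle)
```

The snake settles slightly inside radius 2 because the internal smoothing force keeps pulling the closed contour inward, balancing the external force.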
Applications:
Object Boundary Detection: Detects the boundary of objects in images.
Object Tracking: In video sequences, snakes can track moving objects by adapting to
changes in the image.
In summary, the snake method is an active contour model that uses an energy function
composed of internal, external, and image energies to guide a curve towards an object
boundary in an image, providing a powerful technique for image segmentation.
Here are the steps involved in Region Split and Merge segmentation:
1. Initial Splitting:
The image is recursively divided into smaller sub-regions (quadrants, halves, etc.).
2. Homogeneity Check:
If a region is non-homogeneous, it is further split into smaller sub-regions, and the
splitting process continues.
3. Splitting Criteria:
The splitting continues recursively until all regions meet the homogeneity criterion or
until the region is small enough to be considered homogeneous by default.
4. Region Merging:
After splitting the image into homogeneous regions, the merging phase begins.
Adjacent regions that are homogeneous and have similar properties (such as pixel
values) are merged into a single region.
The merging is done by checking whether the boundaries of neighboring regions satisfy
the homogeneity criteria.
5. Termination:
The process terminates when no further splitting or merging is required, meaning that
the regions are both homogeneous and distinct from one another.
Summary of Steps:
1. Split the image into smaller sub-regions recursively.
2. Check each sub-region against the homogeneity criterion and split further if needed.
3. Merge adjacent regions that are similar enough to satisfy the criterion.
4. Stop when no further splitting or merging is possible.
Key Concepts:
Recursive Splitting: Divide the image into smaller regions until each region satisfies the
homogeneity criterion.
Region Merging: Merge regions that are homogeneous and close to each other,
reducing over-segmentation.
Homogeneity Criterion: A predefined rule (e.g., pixel intensity, texture) used to decide
whether a region is uniform or needs further splitting/merging.
This method effectively segments an image by both splitting and merging based on the
consistency of the regions, making it suitable for detecting structures with distinct
boundaries in images.
An Active Contour, also known as a snake, is a curve that is initialized near the boundary of
an object in an image and evolves iteratively to align itself with the true boundary of the
object. This technique is widely used in image segmentation to detect and outline object
boundaries in an image.
1. Initialization: The contour is placed near the boundary of the object of interest.
2. Energy Minimization: The contour moves towards the object's boundary by minimizing
an energy function that combines forces from the image and the curve's smoothness.
3. Internal Forces: These forces help the contour remain smooth and resist bending or
stretching.
4. External Forces: These forces pull the contour towards edges or other significant
features (such as gradients) in the image, guiding it to the object boundary.
Energy Function:
The energy function E used to guide the active contour is composed of:
Internal Energy: Ensures that the contour remains smooth and well-behaved,
preventing excessive bending or stretching.
External Energy: Guides the contour to move towards edges or object boundaries,
usually derived from image gradients.
The contour evolves by minimizing the total energy, typically using iterative optimization
methods.
At each step, the contour is adjusted to reduce both internal and external energy,
eventually settling on the true object boundary.
Applications:
Object Detection: To find the boundaries of objects in an image.
In summary, an active contour (snake) is a curve that deforms to fit the shape of an object
boundary by minimizing an energy function, making it a powerful tool for image
segmentation and boundary detection.
Region Splitting and Merging is a region-based image segmentation technique that divides
an image into homogeneous regions and then merges them based on specific criteria. This
method aims to segment an image by first splitting it into smaller regions and then merging
adjacent regions to form larger, homogeneous areas.
Steps Involved:
1. Initial Split:
The image is recursively divided into smaller sub-regions (often using a quadtree
structure).
This division continues until each sub-region meets a certain homogeneity criterion
(e.g., pixel intensity, color, texture).
2. Homogeneity Check:
If a region is not homogeneous, it is further split into smaller regions until they are
homogeneous.
3. Region Merging:
After the image has been split into small homogeneous regions, the next step is
merging.
Adjacent regions are merged if they are found to be similar enough (i.e., meet the
homogeneity criteria).
The merging process continues iteratively, combining small regions into larger,
homogeneous regions.
4. Termination:
The process of splitting and merging continues until no further changes are needed,
i.e., the image is divided into distinct, non-overlapping regions that are
homogeneous.
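The merging phase can be sketched with a union-find over region labels; the 4x4 toy image, its initial quadrant labels, and the tolerance are assumptions chosen for the example (updating the merged mean as a simple average is a simplification; a size-weighted mean would be more accurate).

```python
import numpy as np

def merge_regions(labels, img, tol):
    """Merge phase of split-and-merge: fuse 4-adjacent regions whose mean
    intensities differ by at most `tol` (union-find over region labels)."""
    means = {l: img[labels == l].mean() for l in np.unique(labels)}
    parent = {l: l for l in means}

    def find(l):
        while parent[l] != l:
            parent[l] = parent[parent[l]]  # path compression
            l = parent[l]
        return l

    h, w = labels.shape
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y + 1, x), (y, x + 1)):  # down and right neighbours
                if ny < h and nx < w and labels[ny, nx] != labels[y, x]:
                    a, b = find(labels[y, x]), find(labels[ny, nx])
                    if a != b and abs(means[a] - means[b]) <= tol:
                        parent[b] = a
                        means[a] = (means[a] + means[b]) / 2
    return np.vectorize(find)(labels)

# Four quadrant regions; the two left quadrants have nearly equal intensity
img = np.array([[10, 10, 90, 90],
                [10, 10, 90, 90],
                [12, 12, 50, 50],
                [12, 12, 50, 50]], dtype=float)
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2],
                   [3, 3, 4, 4],
                   [3, 3, 4, 4]])
merged = merge_regions(labels, img, tol=5)
```

Here only the two left quadrants (means 10 and 12) satisfy the homogeneity criterion and are fused, leaving three regions.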
Advantages:
Adaptive segmentation: The method adapts to the structure of the image by splitting
and merging regions based on local image properties.
Flexible: Can work with different types of homogeneity criteria (color, intensity, texture).
Disadvantages:
Over-segmentation: If the homogeneity criterion is too strict, the image might be split
into many small regions.
Summary:
Region Splitting: Divides the image into smaller sub-regions based on a homogeneity
criterion.
Region Merging: Combines adjacent regions that satisfy the homogeneity criterion into
larger uniform regions.
This method is useful in scenarios where image objects have distinct boundaries and
uniform characteristics.
Explain graph-based segmentation in brief.
1. Graph Construction:
Nodes: Each pixel (or superpixel) in the image is represented as a node in the graph.
Edges: An edge between two nodes represents the similarity (or dissimilarity)
between the corresponding pixels. The weight of the edge can be determined based
on the pixel intensity difference, color similarity, texture, or other features.
2. Edge Weights:
The weight of an edge between two nodes reflects the similarity between the
pixels they represent. A high weight suggests the pixels are very similar, while a low
weight indicates dissimilarity.
W(i, j) = exp( − ∥I(i) − I(j)∥² / (2σ²) )
where I(i) and I(j) are the pixel values (or features) at nodes i and j , and σ
controls the sensitivity of the similarity measure.
3. Graph Cut:
The graph is partitioned into disjoint sets of nodes by removing (cutting) edges,
typically choosing the cut that minimizes the total weight of the removed edges.
4. Normalized Cut:
One popular approach is the normalized cut (Ncut), which minimizes the similarity
between different segments while maximizing the similarity within each segment:

Ncut(A, B) = Cut(A, B)/Assoc(A, V) + Cut(A, B)/Assoc(B, V)
where:
Cut(A, B) is the total weight of the edges between the two segments A and B,
Assoc(A, V ) and Assoc(B, V ) are the total edge weights connecting each
segment to the rest of the graph.
5. Segmentation Result:
The graph is segmented based on the minimum cut, where each segment
corresponds to a group of pixels (nodes) that are similar to each other.
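The edge-weight and Ncut formulas above can be evaluated directly on a toy set of pixels; the intensities, σ, and candidate partitions below are assumptions chosen for illustration (a real Ncut solver finds the best partition via an eigenvector problem rather than by inspection).

```python
import numpy as np

def affinity(I, sigma):
    """Gaussian similarity W(i, j) = exp(-||I(i) - I(j)||^2 / (2 sigma^2))."""
    d2 = (I[:, None] - I[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def ncut_value(W, A):
    """Ncut(A, B) = Cut(A,B)/Assoc(A,V) + Cut(A,B)/Assoc(B,V)
    for a boolean membership vector A (B is its complement)."""
    B = ~A
    cut = W[np.ix_(A, B)].sum()      # total edge weight between A and B
    assoc_A = W[A].sum()             # edge weight from A to all of V
    assoc_B = W[B].sum()
    return cut / assoc_A + cut / assoc_B

# Six pixels forming two clear intensity groups
I = np.array([10., 11., 12., 200., 201., 202.])
W = affinity(I, sigma=10.0)
good = ncut_value(W, np.array([True, True, True, False, False, False]))
bad = ncut_value(W, np.array([True, False, True, False, True, False]))
```

The partition that separates the two intensity groups yields a much smaller Ncut value than one that mixes them, which is exactly what the minimization exploits.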
Advantages:
Effective for complex images: It works well on images with complex structures and
boundaries.
Flexibility: Can use various image features like color, texture, and spatial information.
Disadvantages:
Computationally expensive: The graph-based approach can be computationally
expensive, especially for large images.
Parameter tuning: The method often requires careful tuning of parameters (like σ ) for
good segmentation results.
Applications:
Object Recognition: Helps in identifying and segmenting objects in images.
Medical Imaging: Used for segmenting regions like tumors or organs in medical
images.
Summary:
In graph-based segmentation, the image is modeled as a graph, and a segmentation is
achieved by partitioning the graph based on pixel similarity, often using graph cut
algorithms. It is an effective technique for obtaining accurate, meaningful image segments,
especially in images with complex structures.