Chap_9_Representation_and_Description.pdf

© Dr. Dafda
Introduction to Image
Representation and Description

• Image Representation involves converting an image into a suitable
format or representation that can be used for further analysis or
processing.
• Image Description refers to extracting meaningful information or
features from the image for various tasks like object recognition,
classification, or retrieval.

Types of Digital Image Processing
(1) Low-level processing: Primitive operations such as noise reduction, image
sharpening and enhancement. Both the input and the output are images.
(2) Mid-level processing: Image segmentation, classification of individual objects, etc.
The inputs are images, but the outputs are attributes extracted from them, e.g. edges.
(3) High-level processing: Making sense of the recognized objects and performing the
functions associated with vision, e.g. automatic character recognition, military
recognition, autonomous navigation.

❖ Introduction
❑ What is Image Segmentation?
• Image segmentation divides an image into regions that are connected and
have some similarity within the region and some difference between
adjacent regions.
• The goal is usually to find individual objects in an image.
• There are two fundamental approaches to segmentation, based on
discontinuity and on similarity.
– Similarity may be in pixel intensity, color or texture.
– Discontinuities are sudden changes in any of these properties, especially
sudden changes in intensity along a boundary line, which is called an
edge.

• After an image is segmented into regions, the regions are represented and described in a
form suitable for computer processing (descriptors).
• Region representation:
1. In terms of its external characteristics (boundary)
2. In terms of its internal characteristics (pixel values)
Example: A region might be represented by the length of its boundary.
– External representations are used when the focus is on shape of the region.
– Internal representations are used when the focus is on color and texture.
– Representations should be insensitive to rotation and translation.
• Region description:
1. Boundary descriptors, such as boundary length, diameter, curvature, etc.
2. Regional descriptors, such as area, perimeter, compactness, mean value, etc.
Generally, an external representation is chosen when a primary focus is on shape
characteristics. An internal representation is selected when a primary focus is on reflectivity
properties, such as color or texture.
❖ Preview

• Representation: Make a decision whether the data should be represented as a
boundary or as a complete region. It almost always follows the output of a
segmentation stage.
- Boundary Representation: Focus on external shape characteristics, such as
corners and inflections.
- Region Representation: Focus on internal properties, such as texture or skeleton
shape.
• Choosing a representation is only part of the solution for transforming raw data
into a form suitable for subsequent computer processing (mainly recognition).
• Description, also called feature selection, deals with extracting attributes that
yield information of interest.
❖ Representation and Description

• Multiple thresholding:
$$
g(x, y) = \begin{cases}
a, & \text{if } f(x, y) > T_2 \\
b, & \text{if } T_1 < f(x, y) \le T_2 \\
c, & \text{if } f(x, y) \le T_1
\end{cases}
$$
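
The multiple-thresholding rule can be sketched in a few lines of Python. This is only an illustration: the function name `multi_threshold` and the default output levels chosen for a, b and c are assumptions, not values from the slides.

```python
def multi_threshold(image, t1, t2, a=255, b=128, c=0):
    """Map each pixel f(x, y) to a, b or c using two thresholds t1 < t2."""
    if not t1 < t2:
        raise ValueError("expected t1 < t2")
    out = []
    for row in image:
        out_row = []
        for f in row:
            if f > t2:
                out_row.append(a)      # f(x, y) > T2
            elif f > t1:
                out_row.append(b)      # T1 < f(x, y) <= T2
            else:
                out_row.append(c)      # f(x, y) <= T1
        out.append(out_row)
    return out


image = [[10, 90, 200],
         [50, 120, 255]]
labels = multi_threshold(image, t1=50, t2=120)
```

Pixels equal to a threshold fall into the lower class, matching the "≤" in the definition.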

Boundary (Border) following and
Chain Codes in Representation for
DIP and its implementation in
MATLAB

• Boundary following, also known as contour tracing or contour following, is a
technique used in digital image processing to extract the boundary or
contour of an object within an image.
• The boundary of an object is the set of pixels that form its outermost edge,
separating it from the background or other objects in the image.
• The process of boundary following involves traversing along the object's
boundary, pixel by pixel, in a systematic manner until the entire boundary is
traced.
• The tracing can use either 4-connectivity or 8-connectivity.
• The choice between the two depends on whether diagonal pixels are
considered neighbors during the tracing process.
❖ Boundary following

❖ Steps of the Boundary following process:
1) Select a Starting Point: The process begins by
selecting a starting point on the object's
boundary. This can be any boundary pixel of
the object; a common choice is the uppermost,
leftmost object pixel.
2) Tracing the Boundary: Starting from the
chosen starting point, the algorithm follows a
set of predefined rules to trace the boundary.
The rules determine the order in which
neighboring pixels are examined and selected
to continue tracing.
a) For 4-connected boundary following,
the algorithm examines the four immediate
neighbors (top, bottom, left, right).
b) For 8-connected boundary following,
the algorithm examines all eight neighbors,
including the diagonals.

3) Decision Rules: At each step of the tracing
process, the algorithm makes decisions based on
the following rules:
• It selects the next boundary pixel from the
adjacent neighbors in a specific order (e.g.,
clockwise or counterclockwise).
• It ensures that the selected neighbor is part of
the object (i.e., its pixel intensity corresponds
to the object's intensity).
4) Tracing Completion: The process continues
until the algorithm returns to the starting point,
forming a closed loop and completing the
boundary trace.
5) Representation: As the boundary is traced, the
coordinates of the boundary pixels are recorded.
These coordinates can then be used to represent
the contour of the object.
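
The steps above can be sketched in Python as an 8-connected, Moore-style tracer. This is a minimal sketch under stated assumptions: the image is a list of rows of 0/1 values, only the first object found is traced, and the names `OFFSETS`, `pixel` and `trace_boundary` are illustrative.

```python
# Neighbour offsets (row, col), clockwise as seen on screen, starting at west.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]


def pixel(img, r, c):
    """Pixel value, treating everything outside the image as background."""
    inside = 0 <= r < len(img) and 0 <= c < len(img[0])
    return img[r][c] if inside else 0


def trace_boundary(img):
    """Return the outer boundary of the first object found, in clockwise order."""
    # Step 1: the uppermost, leftmost object pixel is the starting point.
    start = next(((r, c) for r, row in enumerate(img)
                  for c, v in enumerate(row) if v), None)
    if start is None:
        return []
    boundary = [start]
    b, back = start, 0          # back: index of the background neighbour to scan from
    first_next = None
    while True:
        # Steps 2-3: examine the 8 neighbours clockwise, take the first object pixel.
        for i in range(8):
            k = (back + i) % 8
            nxt = (b[0] + OFFSETS[k][0], b[1] + OFFSETS[k][1])
            if pixel(img, *nxt):
                break
        else:
            return boundary     # isolated pixel: no object neighbour
        # Step 4: stop when back at the start and about to repeat the first move.
        if b == start and nxt == first_next:
            break
        if first_next is None:
            first_next = nxt
        # The neighbour examined just before nxt is background; scan from it next.
        prev = (b[0] + OFFSETS[k - 1][0], b[1] + OFFSETS[k - 1][1])
        back = OFFSETS.index((prev[0] - nxt[0], prev[1] - nxt[1]))
        boundary.append(nxt)    # Step 5: record the boundary coordinates
        b = nxt
    return boundary[:-1]        # drop the repeated starting point


square = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
contour = trace_boundary(square)
```

The interior pixel (2, 2) is never visited, so `contour` holds only the eight outer pixels of the square.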

• In digital image processing, chain codes are a simple and compact method
used to represent the boundary or contour of an object in an image. The
chain code represents the sequence of pixel transitions along the object's
boundary, and it is particularly useful for describing shapes and contours in
binary images.
• Chain codes work on the principle that the boundary of an object is a
connected chain of pixels. The chain code provides a way to encode the
connectivity of these pixels in a concise manner. Each pixel in the boundary
is represented by a specific code that indicates the direction to move from
the current pixel to the next one in the chain.
❖ Chain codes

1) Starting Point: The chain code starts at a
designated starting pixel on the object's
boundary. This pixel is usually chosen to be the
first pixel encountered while traversing the
boundary in a particular direction.
2) Tracing the Boundary: The chain code
follows the object's boundary pixel by pixel,
recording the direction of movement from one
pixel to the next. The movement can be in any
of the eight possible directions, including
horizontal, vertical, and diagonal.
• In 8-connected chain codes, all eight
neighboring pixels are considered for the
next move.
• In 4-connected chain codes, only the four
immediate neighbors (top, bottom, left,
right) are considered.
❖ Working of Chain codes:

3) Encoding Directions: Each direction is
assigned a code or symbol. The specific
code assignments may vary, but a common
convention is to use integers from 0 to 7 for
8-connected chain codes and integers from
0 to 3 for 4-connected chain codes. These
codes correspond to the eight or four
possible directions, respectively.
4) Recording the Chain: As the chain code
progresses along the boundary, the
sequence of codes is recorded. The
resulting sequence of codes forms the chain
code representation of the object's
boundary.
❖ Working of Chain codes:
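
A short Python sketch of the encoding step. The direction numbering used here (0 = east, increasing counterclockwise in 45° steps) is one common convention and is an assumption, as are the names `DIRECTIONS_8` and `chain_code`; coordinates are (row, col) with rows increasing downward.

```python
# (row step, col step) -> 8-direction code; 0 = east, counted counterclockwise.
DIRECTIONS_8 = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
                (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}


def chain_code(boundary):
    """Encode a closed, ordered boundary as a list of direction codes."""
    codes = []
    n = len(boundary)
    for i in range(n):
        r0, c0 = boundary[i]
        r1, c1 = boundary[(i + 1) % n]      # wrap around to close the contour
        codes.append(DIRECTIONS_8[(r1 - r0, c1 - c0)])
    return codes


# Clockwise boundary of a 3x3 square whose top-left pixel is at (1, 1).
boundary = [(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1)]
codes = chain_code(boundary)
```

Consecutive boundary points must be 8-neighbors; any other step raises a `KeyError` in this sketch.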

• The boundary code formed as a sequence of such directional numbers is
referred to as a Freeman chain code.
• Different starting points give different chain codes.
• To normalize the chain code with respect to the starting point, treat it as a
circular sequence of direction numbers.
• Redefine the starting point so that the resulting sequence of numbers forms
the integer of smallest magnitude; that sequence is taken as the chain code.
❖ Chain codes
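
The starting-point normalization can be sketched in Python as a search over all circular rotations. The function name `normalize_start` is illustrative.

```python
def normalize_start(codes):
    """Return the circular rotation of `codes` that forms the smallest integer."""
    if not codes:
        return []
    rotations = [codes[i:] + codes[:i] for i in range(len(codes))]
    # For equal-length digit lists, lexicographic order equals numeric order.
    return min(rotations)


# The same square boundary traced from two different starting points.
code_a = [0, 0, 6, 6, 4, 4, 2, 2]
code_b = [6, 4, 4, 2, 2, 0, 0, 6]
normalized_a = normalize_start(code_a)
normalized_b = normalize_start(code_b)
```

Both inputs normalize to the same sequence, which is what makes codes of the same shape comparable regardless of where tracing began.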


• Compactness: Chain codes can significantly reduce the storage space
required to represent a boundary compared to storing the individual
coordinates of boundary pixels.
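• Translation Invariance: Chain codes are translation invariant since they represent
only the relative directions between boundary pixels, not their absolute
positions. Rotating the object changes the codes; normalizing for rotation
(in multiples of the direction step) requires the first difference of the code.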
• Simple Processing: Chain codes are straightforward to generate and process,
making them useful in various image processing tasks, such as pattern
recognition, shape matching, and object tracking.
❖ Advantages of Chain codes:

Boundary Following:
Boundary following, also known as contour tracing or contour following, is a process or algorithm
used to extract the boundary of an object from an image. The result of boundary following is a
sequence of coordinates (or the actual boundary pixels' positions) that represent the object's
contour. Boundary following does not specify any particular encoding scheme; it provides the
actual boundary information. Various algorithms can be used for boundary following, such as
4-connected and 8-connected approaches.
Chain Codes:
Chain codes are a specific representation technique used to encode the boundary information
obtained from boundary following. Instead of storing the actual coordinates of boundary pixels,
chain codes represent the boundary as a sequence of codes that denote the directions of
movement from one pixel to the next in the contour. Chain codes are compact representations, as
they only store the relative direction information, which also makes them translation-invariant.
Chain codes are particularly useful when compact representation is needed, like in storage or
communication applications.
❖ Boundary Following vs Chain codes:

Polygonal (Boundary) approximations
using Minimum-Perimeter Polygons (MPP)
in Representation for DIP and its
implementation in MATLAB

❖ Image processing main steps:

❖ Introduction
Representation converts the raw output of segmentation into a form suitable for
computer processing.
Image Representation
• Chain codes
• Polygonal Approximation: It is used to simplify the representation of
shapes or boundaries while preserving their important features. It captures
the essence of the boundary shape with the fewest possible polygonal segments.
– Minimum Perimeter Polygons
– Merging techniques: Based on average error. Points along the boundary are
merged until the least-square-error line fit of the points merged so far
exceeds a preset threshold. Disadvantage: vertices do not always correspond
to inflections in the original boundary.
– Splitting techniques: A segment is subdivided successively into two parts
until a specified criterion is satisfied. This technique has the advantage
of finding prominent inflection points.
• Boundary segments
• Signature
• Skeletons

❖ Minimum Perimeter Polygon(MPP):
• The goal of polygon approximation is to
represent an object boundary by a polygon.
• The minimum perimeter polygon consists of
line segments that minimize distances between
boundary pixels.
• Here the boundary is enclosed by a set of
concatenated cells. The enclosure has two walls
corresponding to the inside and outside
boundaries of the strip of cells.
• Think of the object boundary as a rubber band
contained within the walls.
• The rubber band shrinks and produces a
polygon of minimum perimeter that fits the
geometry established by the cell strip.

❖ MPP Algorithm:
• The MPP algorithm produces a simpler polygon that follows
a digital boundary. The boundary is enclosed by a strip of
small squares (cells), called a cellular complex, and the
observations below guide the construction.
• Using the terms "white" (W) and "black" (B) to represent
convex and mirrored concave vertices, respectively:
1. The MPP bounded by a simply connected cellular
complex is not self-intersecting.
2. Every convex vertex of the MPP is a W vertex, but not
every W vertex of a boundary is a vertex of the MPP.
3. Every mirrored concave vertex of the MPP is a B vertex,
but not every B vertex of a boundary is a vertex of the
MPP.
4. All B vertices are on or outside the MPP, and all W
vertices are on or inside the MPP.
5. The uppermost, leftmost vertex in a sequence of vertices
contained in a cellular complex is always a W vertex of
the MPP.

• In essence, polygon approximation with minimum-perimeter polygons helps
strike a balance between accurately representing a shape and simplifying it
for practical purposes.
• It's like creating a simpler version of a complex shape by connecting a few
key points with the shortest line possible, while still keeping the main
characteristics of the shape intact.
• This technique is commonly used in various fields of image processing,
computer graphics, and geographic information systems.
❖ Advantages of MPP Approximations:

❖ Merging technique of Polygonal approximation:
• Merge points along the boundary, fitting a least-square-error line to the merged
points, until a preset error threshold is exceeded.
• Then start a new line and repeat the above step.
• When the start point is reached, the intersections of adjacent lines are the
vertices of the polygon.
• The disadvantage is that the vertices do not always correspond to
inflections (corners) of the original boundary.

❖ Splitting Technique of Polygon approximation:
• Find the line joining the two extreme points
of the boundary. Choose a threshold (e.g. half
the length of this line).
• Find the boundary point with the largest
perpendicular distance from this line. If
this distance exceeds the threshold, the
point becomes a vertex and the segment is
subdivided into two sub-segments;
otherwise the line is left unchanged.
• Repeat the above step for each sub-segment
until no point exceeds the threshold.
• This technique has the advantage of
finding prominent inflection points.
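
The splitting steps can be sketched in Python for one open run of boundary points. This is only a sketch: the names `point_line_distance` and `split_segment` are illustrative, and a closed boundary is assumed to have been cut at its two extreme points first, with each half processed separately.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        return math.hypot(x - x1, y - y1)
    return abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / length


def split_segment(points, threshold):
    """Recursively split an ordered run of points; return the polygon vertices."""
    a, b = points[0], points[-1]
    if len(points) < 3:
        return [a, b]
    dists = [point_line_distance(p, a, b) for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1   # farthest point
    if dists[i - 1] <= threshold:
        return [a, b]                   # line is good enough, no new vertex
    left = split_segment(points[:i + 1], threshold)
    right = split_segment(points[i:], threshold)
    return left[:-1] + right            # the shared vertex appears once


# An L-shaped run of boundary points; the corner at (3, 0) should be kept.
run = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
vertices = split_segment(run, threshold=0.5)
```

A larger threshold keeps fewer vertices; with a threshold above the largest deviation only the two end points remain.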

Signatures in Image Representation for
DIP and its implementation in MATLAB

❖ Introduction
Image Representation
• Chain codes
• Boundary segments: A type of boundary representation in which a boundary is
decomposed into segments. It is used when the boundary has one or more
concavities, i.e. curves that carry shape information.
• Skeletons: The representation of the structural shape of a plane region,
reducing it to a graph. This reduction is obtained by extracting the
skeleton of the region via a thinning algorithm.
• Signature: The sequence of normal contour distances. It is a 1-D functional
representation of a boundary.

❖ Signatures:
• A signature is the sequence of normal contour distances. It is a 1-D functional representation of a 2-D boundary.
Different shapes generally produce different signatures, so it can be used to differentiate between
objects.
• No matter how a signature is generated, the main concept is to reduce the 2-D boundary to a simpler 1-D
function that is easier to describe than the original 2-D shape.
• The simplest signature is a plot of the distance from the centroid to the boundary as a function of angle,
r(θ).
• For a circle, point C is the centroid, and as the
phasor r sweeps through 360° we get a one-dimensional
representation r(θ) that is constant and equal to the
radius of the circle.
• Consider another boundary in the
shape of a square with centroid C. In this case r(θ) is
given by
r(θ) = A sec θ for 0 ≤ θ ≤ π/4
r(θ) = A csc θ for π/4 < θ ≤ π/2
where A is half the side length; the pattern repeats in the other quadrants.
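
A minimal Python sketch of the distance-versus-angle signature. Two assumptions are made here: the centroid is taken as the mean of the boundary points (not of the filled region), and the name `signature` is illustrative.

```python
import math


def signature(boundary):
    """Return (theta, r) pairs sorted by angle, measured from the centroid."""
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    pairs = []
    for x, y in boundary:
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)   # angle in [0, 2*pi)
        r = math.hypot(x - cx, y - cy)                       # distance to centroid
        pairs.append((theta, r))
    return sorted(pairs)


# Circle of radius 3 sampled at 12 angles: r(theta) should be constant.
circle = [(5 + 3 * math.cos(2 * math.pi * k / 12),
           5 + 3 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
circle_sig = signature(circle)

# Square with half side length A = 1: corners and side midpoints.
square = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
square_sig = signature(square)
```

For the square, the sample at θ = π/4 has r = √2, which agrees with A sec θ for A = 1.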

❖ Signatures:
• A signature is a translation-invariant representation
that allows an object to be compared with a
standard prototype by cyclically shifting the
signature of one with respect to the other in steps,
while checking for the best match.
• Signatures are invariant to translation, but
they do depend on rotation and scaling.
• To make a signature invariant to rotation, select
the same starting point regardless of the
orientation.
– One way is to select the starting point farthest from
the centroid (if unique).
– Another way is to obtain the chain code of the
boundary and normalize it with respect to the starting point.
• To make it invariant to scaling, normalize r(θ)
to a particular range, e.g. [0, 1].

Boundary segments in Representation for

❖ Boundary Segments:
• The aim of boundary segments is to partition an object boundary into segments. When a boundary
contains major concavities that carry shape information, it can be worthwhile to decompose it into
segments.
• Boundary segments are useful for extracting information from the concave parts of objects. A good way to
achieve this is to calculate the convex hull of the region enclosed by the boundary (the minimal enclosing
convex region).
-> Smooth the boundary prior to the convex hull calculation.
-> Calculate the convex hull on a polygonal approximation.
• The convex hull H of an arbitrary set S is the smallest
convex set containing S. The set difference H − S is called
the convex deficiency D of the set S.
• Consider figure (a), which shows an object (set S) and its
convex deficiency (shaded regions).
• The region boundary can be partitioned by following
the contour of S and marking the points at which a
transition is made into or out of a component of the
convex deficiency D.
• Figure (b) shows the result in this case. One advantage
is that this scheme is independent of region size
and orientation. A second is that it can also be used to describe a
region and its boundary. The disadvantage is that it
is sensitive to noise, so smoothing may be required.

❖ Convex Hull
• Given a set of points in the plane, the convex hull of the set is the smallest convex
polygon that contains all of its points.
• A set A is said to be convex if the straight line segment joining any two points in
A lies entirely within A.
• The convex hull H of an arbitrary set S is the smallest convex set containing S.
• The set difference H − S is called the convex deficiency of S. The convex hull and
convex deficiency are useful for object description.
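
The convex hull H and the convex deficiency H − S can be sketched in Python for a set of integer pixel coordinates. The hull is computed with Andrew's monotone chain method, which is a choice made for this sketch and not a method named in the slides; the function names are illustrative.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convex_deficiency(pixels):
    """Integer points inside the hull H that are not in the set S (H - S)."""
    hull = convex_hull(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    n = len(hull)
    inside = {(x, y)
              for x in range(min(xs), max(xs) + 1)
              for y in range(min(ys), max(ys) + 1)
              if all(cross(hull[i], hull[(i + 1) % n], (x, y)) >= 0
                     for i in range(n))}
    return inside - set(pixels)


# A U-shaped set S: the gap between the two arms is its convex deficiency.
S = [(0, 0), (1, 0), (2, 0), (0, 1), (2, 1), (0, 2), (2, 2)]
H = convex_hull(S)
D = convex_deficiency(S)
```

Here `H` contains the four corners of the enclosing square and `D` contains the two pixels of the concavity.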

Skeletons in Image Representation for

❖ Skeletonization (Skeleton Extraction)
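• It is another way to reduce binary objects to thin strokes that retain important structural
information about the shapes of the original objects.
• The skeleton of A can be expressed in terms of erosions and openings as follows:
$$ S(A) = \bigcup_{k=0}^{K} S_k(A) $$
where
$$ S_k(A) = (A \ominus kB) - (A \ominus kB) \circ B $$
where B is a structuring element, and (A ⊖ kB) indicates k successive erosions of A:
$$ (A \ominus kB) = ((\ldots((A \ominus B) \ominus B) \ominus \ldots) \ominus B) $$
and K is the last iterative step before A erodes to an empty set.
The figure below illustrates an example of extracting a skeleton of an object in a binary image.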
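
The erosion-and-opening formulation can be sketched in Python on sets of pixel coordinates. The 3×3 structuring element `B` is an assumption made for this sketch, and the function names are illustrative.

```python
# 3x3 square structuring element centred on the origin (an assumed choice of B).
B = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def erode(A, B):
    """A eroded by B: points whose whole B-neighbourhood lies in A."""
    return {(r, c) for r, c in A
            if all((r + dr, c + dc) in A for dr, dc in B)}


def dilate(A, B):
    """A dilated by B (B is symmetric, so no reflection is needed)."""
    return {(r + dr, c + dc) for r, c in A for dr, dc in B}


def opening(A, B):
    return dilate(erode(A, B), B)


def skeleton(A):
    """Union over k of (A eroded k times) minus its opening by B."""
    result = set()
    eroded = set(A)                       # k = 0: A itself
    while eroded:                         # stop once A has eroded to the empty set
        result |= eroded - opening(eroded, B)
        eroded = erode(eroded, B)         # next k
    return result


# A 3-row by 5-column rectangle of object pixels.
rect = {(r, c) for r in range(3) for c in range(5)}
skel = skeleton(rect)
```

For this rectangle the skeleton is the three interior pixels of the middle row.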

❖ Skeletons :
• An important approach to representing structural shape of a plane region is to reduce it to a
graph. This may be accomplished by obtaining the skeleton of the region via a thinning
algorithm. The skeleton is a thin line that shows the shape and how it is connected.
• This has applications in automated inspection of PCB and many other fields.
• Definition of skeleton is based on medial axis transformation (MAT).
• MAT of a region R with border B: for each point p in R, find the closest neighbor in B. If p has
more than one such neighbor, it belongs to medial axis (skeleton).
• MAT is based on “prairie fire concept (an uncontrolled fire in a grassy area)”.
• Medial axis transformation (MAT) generates a "skeleton" of a region.

❖ Skeletons :
• MAT algorithm:
(1) for each point in the region we find its closest point in boundary,
(2) if a point has more than one such a neighbor --> a point belongs to the medial axis
(skeleton) of the region
• the results of MAT operation depend on the distance measure:
For pixels p = (x, y) and q = (s, t):
(a) Euclidean distance: $D_e(p, q) = \sqrt{(x - s)^2 + (y - t)^2}$
(b) D4 distance (city-block distance): $D_4(p, q) = |x - s| + |y - t|$
(c) D8 distance (chessboard distance): $D_8(p, q) = \max(|x - s|, |y - t|)$
• Direct implementation of MAT is computationally expensive.
• Alternative algorithms have been proposed that “thin” the boundary of a region until the
skeleton is left.
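
The three distance measures can be written directly in Python; the function names are illustrative.

```python
import math


def d_euclidean(p, q):
    """Euclidean distance between pixels p = (x, y) and q = (s, t)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def d4(p, q):
    """City-block distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def d8(p, q):
    """Chessboard distance."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


p, q = (0, 0), (3, 4)
distances = (d_euclidean(p, q), d4(p, q), d8(p, q))
```

The same pair of pixels is 5 apart in the Euclidean sense, 7 in D4 and 4 in D8, which is why the medial axis depends on the measure chosen.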

❖ Skeletons :
• The algorithm thins binary regions (region points are assumed to be 1 and background points
0). It iteratively deletes boundary points of a region subject to the constraints
that deletion of these points (1) does not remove end points, (2) does not break connectivity,
and (3) does not cause excessive erosion of the region.
• The algorithm has two steps, which are applied to all the pixels on the contour of the region.
• A contour point is any pixel with value 1 that has at least one 8-neighbor valued 0.
• In each step the boundary points that satisfy a set of conditions are flagged and then deleted.
• The 8-neighbors of p1 are labeled clockwise, starting from the pixel above it:
p9 p2 p3
p8 p1 p4
p7 p6 p5
• Step 1 flags a contour point p1 if the following conditions are satisfied:
(a) 2 ≤ N(p1) ≤ 6
(b) T(p1) = 1
(c) p2 · p4 · p6 = 0
(d) p4 · p6 · p8 = 0
• N(p1): number of nonzero neighbors of p1.
• T(p1): number of 0 to 1 transitions in the ordered sequence p2, p3, …, p8, p9, p2.

❖ Skeletons :
• After step 1 is applied to all border points, those that are flagged are deleted (changed to 0).
• In step 2, conditions (a) and (b) remain the same but (c) and (d) are changed to:
(c') p2 · p4 · p8 = 0
(d') p2 · p6 · p8 = 0
• After step 2 is applied to all border points remaining after step 1, those that are flagged are
deleted (changed to 0).
• This procedure is applied iteratively until no further points are deleted.
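
The two-step thinning procedure can be sketched in Python as follows. This sketch assumes the usual neighbor labeling (p2 above p1, then clockwise to p9) and the standard conditions 2 ≤ N(p1) ≤ 6, T(p1) = 1 plus the two product conditions of each step; the function names are illustrative.

```python
def neighbours(img, r, c):
    """Return [p2, ..., p9]: the 8 neighbours clockwise, starting above p1."""
    return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
            img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]


def thin(image):
    """Iteratively delete flagged contour points until nothing changes."""
    h, w = len(image), len(image[0])
    # Pad with a background border so every pixel has 8 neighbours.
    img = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in image] + [[0] * (w + 2)]
    changed = True
    while changed:
        changed = False
        for step in (1, 2):
            flagged = []
            for r in range(1, h + 1):
                for c in range(1, w + 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(img, r, c)
                    p2, p3, p4, p5, p6, p7, p8, p9 = n
                    count = sum(n)                                   # N(p1)
                    transitions = sum(1 for a, b in zip(n, n[1:] + n[:1])
                                      if a == 0 and b == 1)          # T(p1)
                    if step == 1:
                        cd = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cd = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= count <= 6 and transitions == 1 and cd:
                        flagged.append((r, c))
            for r, c in flagged:        # delete only after the whole step is done
                img[r][c] = 0
            changed = changed or bool(flagged)
    return [row[1:-1] for row in img[1:-1]]


bar = [[1] * 7 for _ in range(3)]       # a 3-pixel-thick horizontal bar
thinned = thin(bar)
```

Flagged points are removed only after a full pass, so the decisions within one step never depend on deletions made earlier in that step.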

Boundary Descriptors in
Image Description for DIP
and its implementation in MATLAB

❖ Introduction
While Image Representation involves converting an image into a suitable format or representation that can
be used for further analysis or processing, Image Description refers to extracting meaningful information or
features from the image for various tasks like object recognition, classification, or retrieval.
Image Description
• Boundary Descriptors, such as boundary length, diameter, curvature etc.
1. Simple descriptors (length, curvature)
2. Shape numbers (the first difference of smallest magnitude)
3. Fourier descriptors (uses DFT for description)
• Regional Descriptors, such as area, perimeter, compactness etc.
1. Simple descriptors (Area, Perimeter, Compactness,
Circularity ratio, Rectangularity etc.)
2. Topological descriptors (measure the shape's structure,
which is unchanged by deformations)
3. Texture (It provides measures of smoothness, coarseness and
regularity. There are 3 approaches: Statistical, Structural and
Spectral)

❖ (1) Simple boundary descriptors :
• Length (perimeter): The number of pixels along a boundary gives a rough approximation of its length. For a
chain-coded curve with unit spacing in both directions, the number of vertical and
horizontal components plus √2 times the number of diagonal components gives its
exact length.

• Diameter: $\mathrm{Diam}(B) = \max_{i,j} [D(p_i, p_j)]$ = length of the major axis,
where p_i and p_j are points on the boundary B and D is a distance measure (Euclidean, city-block or chessboard).

• Major axis: The line segment connecting the two extreme points that comprise the
diameter.
• Minor axis: The line perpendicular to the major axis, of such length that a box passing
through the two extreme points on the minor axis and the two extreme points on the
major axis completely encloses the boundary.
• Eccentricity: The ratio of the major axis to the minor axis is called the eccentricity.

• Curvature: It is defined as the rate of change of slope. Curvature at the point of
intersection of two adjacent boundary segments is defined as the difference between
their slopes. To calculate the curvature, the boundary is traversed in the clockwise
direction; a vertex point P belongs to a convex segment if the change in slope
at P is non-negative, otherwise P is said to belong to a concave segment.
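
Length and diameter can be sketched in Python as follows. The sketch assumes an 8-direction chain code in which even codes are horizontal or vertical moves and odd codes are diagonal moves, and it uses the Euclidean distance for the diameter; the function names are illustrative.

```python
import math
from itertools import combinations


def chain_length(codes):
    """Boundary length from an 8-direction chain code with unit pixel spacing."""
    straight = sum(1 for c in codes if c % 2 == 0)   # horizontal / vertical moves
    diagonal = len(codes) - straight                 # diagonal moves
    return straight + math.sqrt(2) * diagonal


def diameter(boundary):
    """Return (length, (p_i, p_j)) for the two farthest-apart boundary points."""
    p, q = max(combinations(boundary, 2), key=lambda pq: math.dist(*pq))
    return math.dist(p, q), (p, q)


square_codes = [0, 0, 6, 6, 4, 4, 2, 2]
diamond_codes = [1, 3, 5, 7]
corners = [(0, 0), (2, 0), (2, 2), (0, 2)]
length = chain_length(square_codes)
diam, major_axis = diameter(corners)
```

The brute-force search over all pairs is quadratic in the number of boundary points, which is acceptable for a sketch but slow for long boundaries.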

❖ (2) Shape number :
• The shape number of a boundary is the first difference (difference code) of its chain code,
rotated circularly so that it forms the integer of smallest magnitude. The order n of a shape
number is the number of digits in its representation; n is even for a closed boundary.
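
A short Python sketch of the first difference and the shape number, assuming a 4-direction chain code and counting direction changes counterclockwise, with the code treated as circular. The function names are illustrative.

```python
def first_difference(codes, directions=4):
    """Circular first difference: direction changes between adjacent codes."""
    return [(codes[i] - codes[i - 1]) % directions for i in range(len(codes))]


def shape_number(codes, directions=4):
    """First difference rotated to form the integer of smallest magnitude."""
    diff = first_difference(codes, directions)
    if not diff:
        return []
    return min(diff[i:] + diff[:i] for i in range(len(diff)))


unit_square = [0, 3, 2, 1]            # 4-direction chain code, order 4
rectangle = [0, 0, 3, 2, 2, 1]        # 4-direction chain code, order 6
sn_square = shape_number(unit_square)
sn_rect = shape_number(rectangle)
```

Because the first difference only records changes of direction, rotating the shape by a multiple of 90° leaves the shape number unchanged.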

❖ (3) Fourier descriptors :
• Represent the boundary as a sequence of K coordinate pairs (x(k), y(k)), k = 0, 1, …, K − 1. Fourier
descriptors convert the object's boundary into a set of numbers in the frequency domain, capturing its
essential shape features.
• Treat each coordinate pair as a complex number, s(k) = x(k) + j y(k).
• The DFT of the complex sequence s(k) gives the Fourier descriptors (the complex
coefficients a(u)).

❖ Fourier descriptors :
• The IDFT of these coefficients restores
s(k).
• We can create an approximate
reconstruction of s(k) if we use only the
first P Fourier coefficients.
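
Fourier descriptors and the truncated reconstruction can be sketched in Python with a direct DFT. The normalization used here (no factor in the forward transform, 1/K in the inverse) is an assumed convention, and the function names are illustrative.

```python
import cmath


def fourier_descriptors(boundary):
    """a(u) = sum over k of s(k) * exp(-j*2*pi*u*k/K), with s(k) = x(k) + j*y(k)."""
    K = len(boundary)
    s = [complex(x, y) for x, y in boundary]
    return [sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K))
            for u in range(K)]


def reconstruct(a, P):
    """Approximate s(k) from the first P coefficients (P = K is exact)."""
    K = len(a)
    points = []
    for k in range(K):
        z = sum(a[u] * cmath.exp(2j * cmath.pi * u * k / K) for u in range(P)) / K
        points.append((z.real, z.imag))
    return points


boundary = [(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1)]
a = fourier_descriptors(boundary)
exact = reconstruct(a, P=len(a))
coarse = reconstruct(a, P=1)
```

With P = 1 only a(0) is used, so every reconstructed point collapses to the centroid of the boundary; larger P restores progressively finer detail.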


❖ (1) Simple Regional Descriptors :
• Area: The number of pixels in the region defines its area.
• Perimeter: The number of pixels on the boundary of the shape.

• Compactness or Circularity: How closely packed the shape is. It is the ratio
perimeter²/area. The most compact shape is a circle, with compactness 4π. All other shapes have a
compactness larger than 4π.

• Circularity ratio: The ratio of the area of the region to the area of the circle (the
most compact shape) having the same perimeter. The area of a circle with
perimeter length P is P²/(4π), so
Rc = 4 × π × area / perimeter²

• Rectangularity: The ratio of the area of the region to the area of the
bounding rectangle.

• Elongatedness: The ratio between the length and width of the minimum
bounding rectangle of the region.
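
The simple regional descriptors can be sketched in Python for a binary mask. Two simplifications are assumed: a boundary pixel is an object pixel with at least one 4-neighbor outside the region, and the axis-aligned bounding box stands in for the minimum bounding rectangle. The function name is illustrative.

```python
import math


def regional_descriptors(mask):
    """Compute simple descriptors of the nonzero region of a binary mask."""
    region = {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}
    if not region:
        raise ValueError("mask contains no object pixels")
    area = len(region)
    # Perimeter: object pixels with at least one 4-neighbour outside the region.
    perimeter = sum(
        1 for r, c in region
        if any((r + dr, c + dc) not in region
               for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))))
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": area,
        "perimeter": perimeter,
        "compactness": perimeter ** 2 / area,
        "circularity_ratio": 4 * math.pi * area / perimeter ** 2,
        "rectangularity": area / (height * width),
        "elongatedness": max(height, width) / min(height, width),
    }


rectangle = [[1] * 6 for _ in range(4)]     # 4 rows by 6 columns, all object
d = regional_descriptors(rectangle)
```

On small digital shapes the pixel-count perimeter underestimates the true boundary length, so the computed compactness can fall below 4π; the bound holds for continuous shapes.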

❖ (2) Topological descriptors :
• Topology = The study of the properties of a figure that are unaffected by any deformation.
• Topological descriptors are features that describe the properties of a shape or a region in an
image that are invariant to deformations such as stretching, bending or twisting, as long as the
shape or region is not torn or joined.
• One example of a topological descriptor is the Euler number, which is defined as the number of
connected components minus the number of holes in a region. The Euler number is invariant
to translation, rotation and scaling.
• The Euler number can be computed using connected-component labeling algorithms.
• Euler number:
➢ Number of holes in a region, H
➢ Number of connected components, C
➢ Euler number, E = C – H

❖ (2) Topological descriptors :
• Regions represented by straight line
segments (polygonal networks) have a
particularly simple interpretation in
terms of the Euler number, given by the Euler formula:
V − Q + F = C − H = E
V: Number of vertices
Q: Number of edges
F: Number of faces
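
The Euler number E = C − H can be sketched in Python with a simple flood-fill labeling. The sketch assumes 8-connectivity for the object and 4-connectivity for the background, and counts a hole as a background component that does not reach the image border; the function names are illustrative.

```python
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def count_components(cells, offsets):
    """Count connected components in a set of (row, col) cells."""
    remaining = set(cells)
    count = 0
    while remaining:
        stack = [remaining.pop()]       # seed a new component
        count += 1
        while stack:
            r, c = stack.pop()
            for dr, dc in offsets:
                q = (r + dr, c + dc)
                if q in remaining:
                    remaining.remove(q)
                    stack.append(q)
    return count


def euler_number(mask):
    """E = C - H for a binary mask given as a list of rows."""
    h, w = len(mask), len(mask[0])
    fg = {(r, c) for r in range(h) for c in range(w) if mask[r][c]}
    # Background plus a one-pixel padding ring, so the outside is one component.
    bg = {(r, c) for r in range(-1, h + 1) for c in range(-1, w + 1)} - fg
    components = count_components(fg, N8)       # C
    holes = count_components(bg, N4) - 1        # H: all but the outer background
    return components - holes


one_hole = [[1, 1, 1, 1, 1],
            [1, 0, 0, 0, 1],
            [1, 1, 1, 1, 1]]
two_holes = [[1, 1, 1, 1, 1, 1, 1],
             [1, 0, 0, 1, 0, 0, 1],
             [1, 1, 1, 1, 1, 1, 1]]
e_one = euler_number(one_hole)
e_two = euler_number(two_holes)
```

A region shaped like the letter "A" has one hole and a region shaped like "B" has two, so the two masks above play the roles of those letters.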

❖ (3) Textures :
• In image processing, an image region composed of
repeated elements is called a
"texture". Textures can be very valuable when
describing objects. For example, the images
show smooth, coarse and regular textures.
• There are 3 approaches to describing textures:
(1) Statistical texture descriptors:
• Histogram based
• Co-occurrence based (statistical
moments, uniformity, entropy, ...)
(2) Structural texture descriptors
• Deal with the arrangement of image
primitives, such as the description of a
texture based on regularly spaced
parallel lines.
(3) Spectral texture descriptors
• Use the Fourier transform