
UNIT 11 OBJECT DETECTION

Structure


11.1 Introduction
Objectives
11.2 Object Detection
11.3 Image Segmentation
11.3.1 Image Segmentation Techniques
11.4 Edge Detection
11.4.1 Gradient Operators
11.4.2 Laplacian Operator
11.4.3 Line Detection
11.5 Region Detection
11.6 Boundary Detection
11.7 Feature Extraction
11.8 Summary
11.9 Solutions and Answers

11.1 INTRODUCTION
The development of computer vision led to various means of understanding
images automatically. One of the important applications is object detection in
an image. Using image processing algorithms, we can detect semantic objects
of a certain class (like individuals, constructions, animals or vehicles) from
digital images or videos.

Mainly, an object detection algorithm aims to detect objects of a known class,
such as vehicles, human beings, animals, trees, etc. In general, only a small
number of objects may be present in a single image, but objects of the same
class can be present in a large number of images with various backgrounds.

Such aspects are explored in object detection. For example, Figure 1 shows the
output of an object detection algorithm within a room and for a scene on the
road. We can see objects like bottles, glasses, a laptop, a chair, a bag, etc.
inside the room, whereas in the outdoor environment we can see a car,
two-wheelers, a bus, etc.

Figure 1 Object detection


Sometimes we are interested in the analysis and interpretation of various
features which are parts of the image. Object detection involves image
segmentation, which is the process of partitioning an image into groups of
pixels which are homogeneous with respect to the feature under consideration.
We shall begin this unit by discussing object detection in Sec. 11.2, and image
segmentation and its applications in Sec. 11.3. We shall discuss edge
detection, region detection and boundary detection, along with their
applications, in Secs. 11.4, 11.5 and 11.6 respectively.

And now, we will list the objectives of this unit. After going through the unit,
please read this list again to make sure you have achieved the objectives.

Objectives
After studying this unit, you should be able to

• define object detection and image segmentation techniques;
• apply edge based segmentation and line based segmentation;
• apply region based segmentation and boundary detection.
We begin the unit by discussing ‘what is image segmentation?’ in the
following section.

11.2 OBJECT DETECTION


Object detection in an image means identifying objects based on their features.
Human beings identify objects instantaneously, without any effort. In computer
vision, identifying objects is based on object models. Object detection models
are based on the characteristic features of the image. The block diagram is
shown in Fig. 2.

Figure 2: The object detection process

In short, the basic features of an object class are to be defined and included in
a database of object models. Using a feature extraction process, specific
features of the object we are looking for are identified and matched with the
database to identify the object class.

We can represent an object in multiple ways and, accordingly, features can be
extracted for object identification. The inner region of the object can be
represented by features like gradient, moments, texture, etc. Similarly, the
boundary can be identified based on the pattern of pixels, Fourier descriptors,
etc.

Object detection methods are broadly classified into (1) edge detection,
(2) region detection and (3) boundary detection.

Object detection involves image segmentation, which further involves the
division of an image into meaningful structures. These structures depict the
objects that constitute the image. Segmentation is the first step in image
analysis and pattern recognition. Different features in images are identified
and extracted by segmentation algorithms. The accuracy of the extracted
features decides the accuracy of automatic recognition algorithms. Thus, a
suitable and rugged segmentation algorithm should be chosen very carefully.
Selection of a suitable algorithm is highly application dependent. Image
segmentation is one of the most difficult tasks in image processing. Generally,
many image processing tasks are aimed at finding a group of pixels in an image
that are similar or connected to each other in some way. The steps of image
analysis, object representation, visualization, understanding and classification
are shown in Fig. 3.

Fig. 3: Image Processing Steps (object acquisition → preprocessing:
enhancement, restoration, etc. → image analysis: segmentation, feature
extraction, etc. → recognition and interpretation of the object, supported by a
knowledge base)

An image scene is processed for autonomous machine perception and is
mapped to knowledge. Here, the system tries to imitate human recognition and
the ability to make decisions according to the information contained in the
image, as shown in Fig. 4.

Fig. 4: Image to Knowledge Mapping

Image segmentation is the fundamental step in image analysis, understanding,
interpretation and recognition tasks. It is the process of decomposing a scene
into different components. Segmentation partitions an image into multiple
homogeneous regions with respect to some characteristics. In practice, it
groups the pixels having similar attributes into one group. The result of image
segmentation is a set of regions that collectively cover the entire image, or a
set of contours extracted from the image. Each pixel in a particular region is
similar to the other pixels with respect to some characteristics such as colour,
edge, texture, etc. Segmentation is an intermediate stage between low level and
high level image processing tasks. Low level tasks manipulate pixel values for
irregularity correction or for image enhancement, whereas high level tasks
manipulate and analyse a group of pixels that convey some information.

Segmentation is the most important step in automated recognition systems,
which have numerous applications; some of them are discussed below:

1. Medical Imaging

Segmentation is used to locate tumours and measure tissue volumes, and it
supports computer aided surgery, diagnosis, treatment planning, the study of
anatomical structures, etc. Fig. 5 shows the segmented portion of a brain.

Fig.5: Example of Image Segmentation in Medical Imaging

2. Satellite Imaging
Segmentation is used to locate objects like roads, forest etc. in satellite images
for monitoring of ecological resources like seasonal dynamics of vegetation,
deforestation, mapping of underground minerals etc. Segmentation methods
based on texture, colour, wavelength etc. are generally used here.

3. Movement Detection
Monitoring crowded urban environments is a goal of modern vision systems.
Knowledge of the size of a crowd and tracking its motion can be used to
monitor a traffic intersection. Intelligent walk signal systems can be designed
based on the number of people waiting to cross the road. Knowledge of the
size of the crowd is helpful in general safety, crowd control and planning of
urban environments.

4. Security and Surveillance


Security of national assets such as bridges, dams, tunnels, etc. is critical in
today's world. Automated smart systems to detect 'suspicious' movements or
activities, or to detect left baggage or vehicles, are crucial for safety.
Automated face detection systems try to match a criminal's face in a crowded
place.

5. License Plate Recognition (LPR)

Automated license plate reading is a very useful and practical approach as it
helps in monitoring existing and illegally acquired license plates. LPR can be
used in private parking management, traffic monitoring, automatic traffic
ticket issuing, automatic toll payment, surveillance and security enforcement.
Fig. 6 shows the segmented license plate.

Fig. 6: Example of image segmentation in LPR

6. Industrial Inspection and Automation

Here the objective is to find missing components (e.g., Integrated Circuits
(ICs)) on a Printed Circuit Board (PCB), or a missing pill in a packet of
medicine, or to check the quality of baked biscuits, etc. These automated
systems help in reducing manpower in the plants and in reducing errors
committed by humans due to repetitive tasks.

7. Robot Navigation

Robots are used in a variety of industrial, medical and research applications.
Pre-assigned tasks can be repeatedly and efficiently done by robots in
hazardous environments. Image analysis and understanding help robots
navigate and execute the given task.

Try the following exercise.

E1) What do you mean by segmentation?

E2) Give two applications of image segmentation techniques.

Now we shall discuss the classification of segmentation in the following
section.

11.3 IMAGE SEGMENTATION

Image segmentation helps in simplifying the tasks and goals of computer vision
and image processing techniques. Segmenting an image is often considered the
first step for image analysis. Image segmentation is done by splitting an image
into multiple parts, based on similar characteristics of the pixels, for
identifying objects.

There are several techniques through which an image can be segmented, based
on dividing and grouping specific pixels, which can then be assigned labels and
classified according to these labels. The generated labels can be used in several
supervised, semi-supervised and unsupervised training and testing tasks in
machine learning and deep learning applications.

Image segmentation is of vital importance in computer vision and has several
applications in various industries and research. Some of the common
applications are facial recognition, number plate recognition, image search,
analysis of medical images, etc.
Classification of image segmentation methods: Researchers have been working
on image segmentation for over a decade. A commonly used classification is
based on the method of identification, where images can be segmented either
by grouping similar pixels or by differentiating them by identifying the
boundary. The region based identification method and the boundary based
identification method of segmentation are discussed below:

• Region based Identification: In this method, similar pixels are selected
based on a predefined threshold. These selected pixels are then grouped
together using algorithms like k-means clustering, nearest neighbour, SVM,
etc. based on similar attributes or features. These attributes or features can
be used for grouping similar pixels for region merging, region growing,
region spreading, etc.

• Boundary-based Identification: In this method, the separation of regions is
done based on the dissimilar features of pixels. The dissimilar pixels are
identified using edge detection algorithms, line detection algorithms, etc.

Other image segmentation methods include thresholding segmentation and
edge based segmentation, which are discussed below:
1. Thresholding Segmentation

Thresholding based segmentation is considered the simplest method of
segmentation. It groups the pixels of an image based on a predefined threshold
value of pixel intensity. The threshold value of pixel intensity can be
determined adaptively, depending on the task.

The threshold value (T) can be a constant value or a dynamic value, based on
the application. A constant threshold value helps in differentiating the
background from the object being segmented when the image has little noise.
A greyscale image can be converted into a binary image using the thresholding
technique, as shown in Fig. 7.

Figure 7
https://scikit-image.org/docs/stable/auto_examples/applications/plot_thresholding.html
(Source: Internet)

The thresholding based segmentation can be further classified as:
1. Simple Thresholding, and
2. Adaptive Thresholding

Simple Thresholding: In the simple thresholding method (also known as global
thresholding), all pixels are converted into white or black based on a reference
pixel intensity value. If the intensity value is less than the reference (threshold)
value, the pixel is converted into black, and if it is greater, the pixel is
converted into white.

Algorithm

1. Initial estimate of T.
2. Segmentation using T: G1 = pixels brighter than T; G2 = pixels darker
   than (or equal to) T.
3. Compute the average intensities m1 and m2 of G1 and G2.
4. New threshold value: Tnew = (m1 + m2)/2.
5. If |T − Tnew| > ΔT, go back to step 2; otherwise stop.

Global thresholding, using an appropriate threshold T:
g(x, y) = 1, if f(x, y) > T
g(x, y) = 0, if f(x, y) ≤ T
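Below is a minimal sketch of this iterative global-thresholding procedure,
assuming a greyscale image held in a NumPy array; the function name and the
default ΔT are illustrative choices, not part of the original algorithm statement.

import numpy as np

def iterative_threshold(image, delta_t=0.5):
    """Estimate a global threshold T by repeatedly averaging class means."""
    t = image.mean()                      # step 1: initial estimate of T
    while True:
        g1 = image[image > t]             # step 2: pixels brighter than T
        g2 = image[image <= t]            # pixels darker than (or equal to) T
        m1 = g1.mean() if g1.size else t  # step 3: class mean intensities
        m2 = g2.mean() if g2.size else t
        t_new = (m1 + m2) / 2.0           # step 4: updated threshold
        if abs(t - t_new) <= delta_t:     # step 5: stop when the change is small
            return t_new
        t = t_new

# Usage: binary = (gray_image > iterative_threshold(gray_image)).astype(np.uint8)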

● Otsu's Binarization

In general, a constant threshold value is often chosen through a hit-and-trial
method to perform simple thresholding based segmentation. This constant
threshold value can vary according to the application and may not be robust or
efficient for two different applications. In Otsu's binarization method, the
threshold value is decided from the histogram as the value that best separates
its two peaks. The main limitation is that it works only on bimodal images,
i.e., images whose histograms contain only two peaks. The main applications
of this method are scanning documents, removing unwanted colours, pattern
recognition, etc.

Algorithm

1. Otsu's method is aimed at finding the optimal value for the global threshold.
2. It is based on maximization of the between-class (interclass) variance.
3. Well thresholded classes have well discriminated intensity values.
4. For an M × N image histogram with L intensity levels, [0, ..., L − 1]:
5. ni is the number of pixels of intensity i.
6. Normalized histogram: pi = ni / (M × N).
7. Calculate the between-class variance value for each candidate threshold.
8. The final threshold is the one that gives the maximum between-class
   variance value.
Example:

Figure 8
https://www.researchgate.net/publication/263608069_Medical_Image_Segmentation_Methods_Algorithms_and_Applications/figures?lo=1&utm_source=google&utm_medium=organic

a) Original image
b) Histogram of the image in a)
c) Simple threshold, taking T = 0.169, η = 0.467
d) Otsu's method, taking T = 181, η = 0.944.
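A minimal sketch of Otsu's threshold search is given below, assuming an 8-bit
greyscale image held as a NumPy array. The helper name is illustrative;
scikit-image's threshold_otsu computes an equivalent threshold and can be
used instead.

import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold that maximizes the between-class variance."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()                        # normalized histogram p_i
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()        # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * p[:t]).sum() / w0   # class mean intensities
        m1 = (np.arange(t, levels) * p[t:]).sum() / w1
        var_between = w0 * w1 * (m0 - m1) ** 2   # between-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Usage: binary = (gray_image > otsu_threshold(gray_image)).astype(np.uint8)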
Adaptive Thresholding

In this method, we can decide different threshold values for different sections
of the image. It is mainly used for images having different backgrounds /
properties. Using this method, different lighting conditions can be
differentiated and threshold values can be adaptively decided.

- Variable thresholding, if T can change over the image.
- Local or regional thresholding, if T depends on a neighbourhood of (x, y).
- Adaptive thresholding, if T is a function of (x, y).
- Multiple thresholding:
  g(x, y) = a, if f(x, y) > T2
            b, if T1 < f(x, y) ≤ T2
            c, if f(x, y) ≤ T1
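A minimal sketch of adaptive (local) thresholding is shown below, assuming a
greyscale NumPy image; it uses scikit-image's threshold_local, which computes
a per-pixel threshold T(x, y) from a neighbourhood, and the block size and
offset values are purely illustrative.

from skimage.filters import threshold_local

def adaptive_binarize(image, block_size=35, offset=10):
    # block_size (odd) defines the neighbourhood used to compute T(x, y)
    t_local = threshold_local(image, block_size, offset=offset)
    return image > t_local          # foreground where f(x, y) > T(x, y)

# Usage: mask = adaptive_binarize(gray_image)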
After Thresholding Segmentation, the next is Edge Based Segmentation, which is discussed below:

2. Edge-Based Segmentation

In this method, objects are identified based on edge detection. Edge detection
is done based on pixel properties like texture, contrast, colour, saturation,
intensity, etc. The results of edge-based image segmentation are shown in
Figure 9.

Figure 9 Source: Internet


There are two commonly used methods for edge detection:
● Search-based edge method: In this method, edge detection is done based on
edge strength, calculated from the local directional maxima of the gradient
magnitude together with a computed estimate of the edge's local orientation.
● Zero-crossing based edge method: In this method, the edges are identified
based on the zero crossings in a derivative expression computed from the
image. Most popular edge detection methods, like Canny, Prewitt, Deriche and
Roberts cross, use the same procedure. The local edges are grouped to segment
the image, with subsequent binarization; the detected edges should overlap the
binary image for object detection.
Marr and Hildreth edge detector (Laplacian of Gaussian)

• The Laplacian is rarely used on its own for edge detection because it is very
sensitive to noise.

• The Laplacian-of-Gaussian (LoG) uses a Gaussian filter to blur the image and
a Laplacian to enhance edges.

• Edge localisation is done by finding zero crossings.

Algorithm
- Convolve the image with a two-dimensional Gaussian function.
- Compute the Laplacian of the convolved image → L.
- Identify edge pixels as those for which there is a zero-crossing in L.

A radially-symmetric 2D Gaussian:
G(x, y) = e^(−(x² + y²)/2σ²), where σ is the standard deviation.

The Laplacian of this is:
∇²G(x, y) = ((x² + y² − 2σ²)/σ⁴) e^(−(x² + y²)/2σ²)
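A minimal sketch of the Marr-Hildreth procedure follows, assuming a 2-D
greyscale image; scipy's gaussian_laplace combines the Gaussian smoothing and
the Laplacian, and the zero-crossing test here is a simple sign-change check,
one of several possible localisation rules.

import numpy as np
from scipy import ndimage

def marr_hildreth_edges(image, sigma=2.0):
    log = ndimage.gaussian_laplace(image.astype(float), sigma)  # LoG response
    signs = np.sign(log)
    edges = np.zeros_like(log, dtype=bool)
    # A pixel is marked as an edge if the LoG changes sign against a neighbour.
    edges[:-1, :] |= (signs[:-1, :] * signs[1:, :]) < 0   # vertical zero crossings
    edges[:, :-1] |= (signs[:, :-1] * signs[:, 1:]) < 0   # horizontal zero crossings
    return edges

# Usage: edge_map = marr_hildreth_edges(gray_image, sigma=2.0)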

Canny edge detector

The Canny edge detector is based on the fact that, for edge detection, there is a
trade-off between noise reduction (smoothing) and edge localisation.

Algorithm

- Smooth the image with a Gaussian filter

- Compute the gradient magnitude and orientation

- Apply non-maximal suppression to the gradient magnitude image

- Use hysteresis thresholding to detect and link edges


3. Region-Based Segmentation

This algorithm identifies groups of pixels with specific characteristics,
starting either from a small section or from a bigger portion of the input
image, called the seed point. The algorithm then adds more pixels to (or
shrinks) each region based on the specific characteristics of the pixels around
the seed points. Thus, we get a segmented image. It is further classified into
region growing and region splitting methods.

Figure 10
https://www.doc.ic.ac.uk/~dfg/vision/v02d01.gif
(Source:Internet)
● Region Growing

In this method, pixels are merged according to particular similarity conditions:
at first, a small set of pixels is grouped together and merged. This process
continues iteratively until all the pixels are grouped and merged with one
another. Basically, the algorithm picks a pixel randomly, finds matching
neighbouring pixels, adds them, and continues in the same way until it finds a
dissimilar pixel. After that it finds another seed point and continues.

To avoid overfitting, these algorithms grow multiple regions simultaneously.
Such algorithms work even with noisy images.

Figure. 11
https://towardsdatascience.com/image-segmentation-part-2-8959b609d268
Source: Internet

● Region Splitting and Merging

This algorithm focuses on splitting and merging portions of the image. It splits
the image based on attributes and then merges regions based on similar
attributes. While splitting, the whole image is considered, whereas in region
growing the algorithm concentrates on specific seed points. This algorithm is
also called the split-merge algorithm.

Figure. 12
https://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/ImageProcessing-html/topic3.htm

Pictorial representation of how the split and merge algorithm works:

https://towardsdatascience.com/image-segmentation-part-2-8959b609d268
Figure 13

|Zmax − Zmin| ≤ threshold
Zmax = maximum pixel intensity value in a region
Zmin = minimum pixel intensity value in a region
Some properties that must be satisfied in region based segmentation:

- Completeness: the segmentation must be complete, i.e., ∪ Ri = R (every
  pixel must be in a region).
- Connectedness: the points of a region must be connected.
- Disjointness: regions must be disjoint, Ri ∩ Rj = Ø, for all i ≠ j,
  i, j = 1, 2, ..., n.
- Satisfiability: pixels of a region must satisfy at least one common property
  P, i.e., P(Ri) = TRUE for all i (any region must satisfy a homogeneity
  predicate P).
- Segmentability: different regions satisfy different properties, i.e.,
  P(Ri ∪ Rj) = FALSE (any two adjacent regions cannot be merged into a
  single region).
Example: Segment the image given in Fig. 14 by the split and merge algorithm.

Figure 14
4. Neural Networks for Segmentation

Neural networks have become very popular for image segmentation tasks as
they not only provide automation, but also provide robust specificity and
sensitivity in the developed segmentation masks. Researchers have introduced
and analyzed convolutional neural networks, generative adversarial networks,
deep belief networks, extreme learning machines, etc. to perform excellent
image segmentation for various applications in healthcare, traffic monitoring,
satellite visualization, bakery, plant diseases, etc.

Segmentation through neural networks, especially a convolutional network, is
done by generating a feature map for the input image data. A region based
filter is then applied to generate the mask according to one's objectives and
applications. Bounding boxes play a very important role in image segmentation.
They can be generated through various techniques and consist of the
coordinates of the segmented part.

A great example of a neural network for image segmentation has been released
by the experts at Facebook AI Research (FAIR), who created a deep learning
architecture called Mask R-CNN which can produce a pixel-wise mask for every
object present in an image. It is an enhanced version of the Faster R-CNN
object detection architecture. Faster R-CNN outputs two pieces of data for
every object in an image: the bounding box coordinates and the class of the
object. With Mask R-CNN, you get an additional output: the object mask
produced after performing the segmentation.

11.3.1 IMAGE SEGMENTATION TECHNIQUES


Image segmentation partitions an image into a set of regions. In many
applications, regions represent meaningful areas in an image. In other
applications, regions might be sets of border pixels grouped into structures
such as line segments, edges, etc. The level of partitioning of an image depends
on the problem to be solved. Segmentation should stop when the objects of
interest have been isolated. Segmentation has two objectives:

a) To decompose an image into regions for further analysis.


b) To perform a change of representation of an image for faster analysis.

There are a number of segmentation techniques available. Segmentation is
application dependent. A single segmentation technique may not be suitable
for different applications. Hence, these techniques have to be combined with
domain knowledge in order to effectively solve the problem. Generally, all
segmentation algorithms are based on one of the two properties of the gray
level values of pixels:

a) Discontinuity

In this approach, images are partitioned based on the difference or
discontinuity of the gray level values. This means the segmentation contour
starts from the point where pixels appear to have very different gray values.
Edge based segmentation methods fall in this category.

b) Similarity

Images are partitioned based on the similarity of the gray level values of the
pixels according to a pre-defined criterion. This means that as long as pixels
have gray values close to each other, they are considered to be in the same
segmentation block. Thresholding, region based clustering and matching based
segmentation techniques fall in this category.

We categorize segmentation techniques as follows:

i) Edge Based Segmentation: Edge based methods are commonly used to
detect boundaries and discontinuities in an image. With this technique,
detected edges are assumed to represent object boundaries and are used to
identify objects. The assumption here is that every part of the object is
sufficiently uniform so that objects can be separated on the basis of
discontinuity alone.

ii) Region Based Segmentation: Edge based techniques find the object
boundaries and then locate the objects, whereas region based techniques look
for uniformity within a sub-region based on a suitable property like intensity,
colour, texture, etc. Region based segmentation starts in the middle of an
object and then 'grows' outwards till it meets the object boundary.

Try an exercise.

E3) Segmentation algorithms are generally based on which two properties


of intensity?

In the following section, we shall discuss the various techniques of image


segmentation.

11.4 EDGE DETECTION – EDGE BASED SEGMENTATION

As an object can be fully represented by its edges, segmentation of an image
into separate objects can be achieved by locating the edges of these objects.
Edge detection is a fundamental tool in image processing and computer vision.
The aim is to identify points in an image at which the brightness changes
sharply. Changes in image brightness can be due to discontinuities in depth,
discontinuities in surface orientation, changes in material property or variation
in scene illumination. The result of an edge detection algorithm is a set of
connected curves that may indicate object boundaries. This significantly
reduces the amount of data to be processed, making the overall object
representation algorithm simpler and quicker.

If the edge detection step is successful, the subsequent task of interpreting the
information is also successful. A typical approach to segmentation using edges
is:

a) Compute an edge image, containing all edges of an original image.
b) Process the edge image so that only closed object boundaries remain.
c) Transform the result to a segmented image by filling in the object
boundaries.

The first step of edge detection is discussed in the following sections.
Difficulty lies in the second step, which often requires removal of edges that
are caused by noise or other artifacts, bridging the gaps at locations where no
edge was detected, and a decision to connect the edges that make up a single
object.

We have already discussed edges in earlier sections. You may recall that an
edge may be loosely defined as a line of pixels showing an 'observable'
difference. For example, consider the two sub-images shown in Fig. 15. In the
sub-image of Fig. 15(b), there is a clear difference between the gray levels in
the second and third columns which can be easily picked by the human eye,
whereas in the sub-image of Fig. 15(a) no such difference can be seen.

51 52 53 59 50 53 150 160

54 52 53 62 51 53 150 180
50 52 53 68 58 55 154 170
55 52 53 55 54 56 156 155
(a) (b)
Fig. 15: 2 Sub-Images

Different edge models have been defined based on their intensity profiles. Fig.
16(a) shows a ‘step’ edge which involves transition between two intensity
values over a distance of one pixel. This is an ideal edge where no additional
processing is needed for identification. A ‘ramp’ edge is shown in Fig. 16(b),
where the transition between two intensity levels takes place over several
pixels. In practice, all the edges get blurred and noisy because of focusing
limitations and inherent noise present in electronic components. A 'point' (Fig.
16(c)) is defined as only one or two isolated pixels having gray level values
different from those of their neighbours, whereas a roof edge (Fig. 16(d)) is
defined as multiple pixels having the same or similar gray level values which
are different from those of their neighbours.

(a) ‘Step’ edge (b) ‘Ramp’ edge

(c) ‘Isolated’ point (d) ‘Roof’ edge

Fig.16: Gray Levels Across Edges

As we know, the first order derivative at a point x of a one-dimensional
function f(x) is given by

∂f/∂x = f'(x) = f(x + 1) − f(x)          (1)

The second order derivative of f(x) is given by

∂²f/∂x² = f''(x) = f(x + 1) + f(x − 1) − 2f(x)          (2)

Let the values of the ramp edge in Fig. 16(b) from left to right be

f(x) = 20 20 20 20 20 50 100 180 180 180 180 180

first derivative  f'(x) = 0 0 0 0 +30 +50 +80 0 0 0 0          [using Eqn. (1)]

second derivative f''(x) = 0 0 0 0 +30 +20 +30 −80 0 0 0          [using Eqn. (2)]

Generally, for the implementation of the first or second derivative, masks are
generated. These masks are convolved with the image to get the result. For the
3 × 3 mask shown in Fig. 17, the output is calculated by

g(x, y) = Σ (i = −1 to 1) Σ (j = −1 to 1) f(x + i, y + j) w(i, j)          (3)

where f(x, y) is the image with which the mask w is being multiplied.

w(−1,−1)  w(0,−1)  w(1,−1)
w(−1, 0)  w(0, 0)  w(1, 0)
w(−1, 1)  w(0, 1)  w(1, 1)

Fig. 17: 3 × 3 mask
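A minimal sketch of Eq. (3) is shown below, using the sub-image of Fig. 15(a)
and a Prewitt-style mask as an illustrative choice of w; scipy's correlate
applies exactly this sum of products over the 3 × 3 neighbourhood.

import numpy as np
from scipy import ndimage

f = np.array([[51, 52, 53, 59],
              [54, 52, 53, 62],
              [50, 52, 53, 68],
              [55, 52, 53, 55]], dtype=float)   # sub-image of Fig. 15(a)

w = np.array([[-1, -1, -1],
              [ 0,  0,  0],
              [ 1,  1,  1]], dtype=float)       # example 3 x 3 mask w(i, j)

g = ndimage.correlate(f, w, mode='nearest')     # g(x, y) as in Eq. (3)
print(g)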

Now, before we discuss the edge detection approaches, let us discuss line
detection.

A line can be a small number of pixels of a different colour or gray level on an
otherwise unchanging background. For the sake of simplicity it is assumed that
the line is only a single pixel thick. Fig. 18 shows line detection masks for
various directions. The first mask w1 in Fig. 18(a) responds strongly to lines
(one pixel thick) oriented horizontally, and the second mask w2 in Fig. 18(b)
responds to vertical lines. The third mask w3 and the fourth mask w4 in Fig.
18(c) and Fig. 18(d) respond to lines at +45° and −45° respectively.

 1 1 1 1 21 1 1 2 2 1 1
2 2 2 1 21 1 2 1 1 2 1
1 1 1 1 21 2 1 1 1 1 2
 
(a) Horizontal (w1) (b) Vertical (w 2 ) (c)  45 ( w 5 ) (W3) (d)  45 ( w1 ) (W4)

Fig.18: Line Detection Mask

Let g1, g2, g3 and g4 be the responses of each of these masks from left to right
respectively. If |g1| > |gj|, j = 2, 3, 4, then that point is more likely to be
associated with a horizontal line. Consider the electronic circuit shown in Fig.
19(a); the results of applying the masks of Fig. 18(a) to Fig. 18(d) on this
circuit are shown in Fig. 19(b) to Fig. 19(g).

Now let us discuss edge detection approaches.

Edge detection is the most common approach used in segmentation. A typical
edge may be the border between a block of red colour and a block of yellow.
An edge can also be the boundary between two regions with relatively distinct
gray-level properties. Computation of a local derivative operator can enhance
gray-level properties. Computation of a local derivative operator can enhance
edges. Fig.20 shows the response of first and second derivative to light strip on
dark background (a) and dark strip on light background (b). Edge is the smooth
change in grey levels. First derivative is positive at the leading edge of
transition and negative at the trailing edge of transition. It is zero in areas of
constant gray level.
The response of second derivative is different from first derivative. It is
positive for dark side, negative for light side and zero in constant area.
‘Magnitude’ of first derivative is used to detect the presence of an edge.
‘Sign’ of second derivative is used to determine whether edge pixel lies on dark
side or on light side of edge. ‘Zero crossing’ is at the midpoint of a transition
in gray level.
Edge detection is a non-trivial task; it is not as simple as it looks in the earlier
section. In practice, edges are corrupted by noise and blurring. This can be
illustrated by the following examples of edge detection on a one-dimensional
array. Looking at Fig. 21(a), we can intuitively say that there is an edge
between the 4th and 5th pixels. But if the intensity difference between the 4th
and 5th pixels is smaller because of noise, and the intensity difference between
the 5th and 6th pixels is higher, as in Fig. 21(b), it would not be easy to
identify the location of the edge precisely. As the edges are corrupted by noise,
we observe multiple edges instead of a single edge.

Fig. 19: (a) The original image; (b) result of the −45° mask; (c) −45° lines;
(d) result of the +45° mask; (e) +45° lines; (f) result of the horizontal mask;
(g) horizontal lines

(a) (b)
Fig. 20: First and second derivative for (a) a light strip on a dark background,
(b) a dark strip on a light background

(a) (b)
Fig.21: Example of Edge Detection

11.4.1 Gradient Operator

There are many edge detection methods; broadly, we can classify them into two
categories: first order gradient based search methods, and Laplacian based zero
crossing methods. The first order derivative based search methods detect edges
by computing the gradient magnitude and then searching for local directional
maxima of the gradient magnitude. The zero crossing based methods search for
zero crossings in a second order derivative computed from the image to find
edges.

The gradient is a first order derivative and is defined for an image f(x, y) as

∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ          (4)

It points in the direction of the maximum rate of change of f at the point (x, y).

The magnitude of the gradient is mag(∇f) = (Gx² + Gy²)^(1/2).

This can be approximated as mag(∇f) ≈ |Gx| + |Gy|, and the direction of the
gradient is given by α(x, y) = tan⁻¹(Gy/Gx), where α is measured with respect
to the x-axis.

Now, we discuss various gradient operators such as the Prewitt operator and
the Sobel operator.

i) Prewitt Operator

It uses 3 × 3 masks that approximate the first derivative. The x direction mask
and y direction mask are shown in Fig. 22. The approach used is

Gx = (Z7 + Z8 + Z9) − (Z1 + Z2 + Z3)
Gy = (Z3 + Z6 + Z9) − (Z1 + Z4 + Z7)

The convolution result with the greatest magnitude indicates the gradient
direction. These are also called 'compass operators' because of their ability to
determine the gradient direction.

−1 −1 −1      −1  0  1
 0   0   0      −1  0  1
 1   1   1      −1  0  1

Fig. 22: Prewitt Operator

ii) Sobel Operator


The Sobel operator is similar to Prewitt's, but it uses a weight of 2 in the
centre coefficient to give it a little more prominence. The Sobel x and y
direction masks are given in Fig. 23.

Gx = (Z7 + 2Z8 + Z9) − (Z1 + 2Z2 + Z3)
Gy = (Z3 + 2Z6 + Z9) − (Z1 + 2Z4 + Z7)

−1 −2 −1      −1  0  1
 0   0   0      −2  0  2
 1   2   1      −1  0  1

Fig. 23: Sobel Operator

The Sobel operator has the advantage of providing both a derivative and a
smoothing effect. This smoothing effect has noise suppression characteristics.
Diagonal edges can be detected by Prewitt and Sobel masks obtained by
rotating the earlier masks by 45° counter-clockwise. Fig. 24 shows the two sets
of masks.

−1 −1  0      0  1  1      0  1  2     −2 −1  0
−1  0  1     −1  0  1     −1  0  1     −1  0  1
 0  1  1     −1 −1  0     −2 −1  0      0  1  2

(a) Prewitt                  (b) Sobel

Fig. 24: Prewitt and Sobel masks for diagonal edge detection.
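A minimal sketch of gradient-based edge detection with Sobel masks, assuming
a 2-D greyscale image; scipy's sobel computes the derivative along one axis,
and the magnitude uses the |Gx| + |Gy| approximation given above.

import numpy as np
from scipy import ndimage

def sobel_gradient(image):
    img = image.astype(float)
    gx = ndimage.sobel(img, axis=0)          # derivative along axis 0 (rows)
    gy = ndimage.sobel(img, axis=1)          # derivative along axis 1 (columns)
    magnitude = np.abs(gx) + np.abs(gy)      # approximation of mag(grad f)
    direction = np.arctan2(gy, gx)           # gradient direction
    return magnitude, direction

# Usage: mag, ang = sobel_gradient(gray_image); edges = mag > mag.mean()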
Now, let us discuss the laplacian operator.

11.4.2 The Laplacian Operator

The Laplacian, a second order derivative, is defined for a 2D function f(x, y) as

∇²f = ∂²f/∂x² + ∂²f/∂y²

Two Laplacian masks are shown in Fig. 25. For Fig. 25(a), the Laplacian
equation is

∇²f = 4Z5 − (Z2 + Z4 + Z6 + Z8)

and for Fig. 25(b), the Laplacian equation is

∇²f = 8Z5 − (Z1 + Z2 + Z3 + Z4 + Z6 + Z7 + Z8).

The centre coefficients of these operators are positive and the outer
coefficients are negative. The sum of all the coefficients is zero.

 0 −1  0      −1 −1 −1
−1  4 −1      −1  8 −1
 0 −1  0      −1 −1 −1

(a)           (b)

Fig. 25: 3 × 3 Laplacian masks

The Laplacian is hardly used in practice for edge detection because it is very
sensitive to noise and produces double edges. Edge direction is not detected by
the Laplacian. It is used to find the location of edges using zero crossing
detection. When the image contains noise, the best approach is to use the
Laplacian of a 2D Gaussian function for edge detection.


The 2D Gaussian filter is given by H(x, y) = e^(−(x² + y²)/2σ²), or
H(r) = e^(−r²/2σ²),

where r² = x² + y² and σ is the standard deviation.

The first derivative of the Gaussian filter is H'(r) = −(1/σ²) r e^(−r²/2σ²).

The second derivative of the Gaussian filter is
H''(r) = (1/σ²)((r²/σ²) − 1) e^(−r²/2σ²).

After returning to the original coordinates x, y and introducing a normalizing
coefficient C, a convolution mask of the LOG (Laplacian of Gaussian) operator
is given by

H(x, y) = C ((x² + y² − σ²)/σ⁴) e^(−(x² + y²)/2σ²).

A 5 × 5 LOG mask is given in Fig. 26. Due to its shape, the LOG is also known
as the 'Mexican hat'. Computing the second derivative in this way is robust and
efficient. Zero crossings are obtained at r = ±σ.

 0   0  −1   0   0
 0  −1  −2  −1   0
−1  −2  16  −2  −1
 0  −1  −2  −1   0
 0   0  −1   0   0

Fig. 26: LOG as an image, LOG 3D plot, 5 × 5 LOG mask

Fig. 27(a) is the input image, Fig. 27(b) is the output of the Prewitt filter,
Fig. 27(c) is the output of the Roberts filter, and Fig. 27(d), Fig. 27(e) and
Fig. 27(f) are the outputs of the Laplacian, Canny and Sobel filters
respectively. As is clear from the figures, each filter extracts different edges in
the image. The Laplacian and Canny filters extract a lot of inner details while
the Sobel and Roberts filters extract only the boundary. The Prewitt filter
extracts the entire boundary of the flower without any gaps in the boundary.

Fig. 27: (a) Original image; (b) output of the Prewitt filter; (c) output of the
Roberts filter; (d) output of the Laplacian filter; (e) output of the Canny
filter; (f) output of the Sobel filter
Try the following exercises.

E4) What is an edge?


E5) List the properties of the second derivative around an edge.
E6) Define the gradient operator.
E7) Why is the Laplacian generally not used in its original form for edge
detection?
E8) Give a 5 × 5 LOG mask.

11.4.3 LINE DETECTION

Line detection algorithms follow the definition of a line in mathematics. They
select edge points and detect all lines lying on these edge points.
Fundamentally, line detection algorithms are based on the 'Hough Transform'
(HT) and convolution [2].

• Any line can be represented by the equation (1):

y = a·x + b          (1)

But it is difficult to represent a vertical line using equation (1). So in the HT,
the angle of a line and its distance from the origin are used to represent lines.
If 'r' and 'θ' represent the distance of the line from the origin and its angle,
then

r = x·cos(θ) + y·sin(θ)          (2)

Figure 28 (Source: Internet)

We can represent any line using equation (2), where θ ∈ [0, 360°) and r ≥ 0.
a) Hough Transform (HT): The main constraint of any image processing
algorithm is the amount of data. We need to reduce the data while preserving
the relevant information about objects. Edge detection can do this work
effectively, but the output of an edge detector alone cannot identify lines. The
HT was initially developed for line detection and was later extended to shape
detection too.

When we represent lines in the form y = ax + b there is one problem. In this
form, the algorithm won't be able to detect vertical lines, because the slope a
is undefined/infinite for vertical lines. This would mean a computer would need
an infinite amount of memory to represent all possible values of a.

So, in the Hough transform we use the parametric form to describe lines, i.e.,
ρ = r cos(θ) + c sin(θ), where ρ is the normal distance of the line from the
origin, and θ is the angle that the normal to the line makes with the positive
direction of the x-axis.

The Hough space thus has two parameters, θ and ρ, and a line is represented by
a single point, defined by the values of these two coordinates.

https://towardsdatascience.com/lines-detection-with-hough-transform-84020b3b1549

Figure 29: Representation of a straight line in the Hough space.



https://towardsdatascience.com/lines-detection-with-hough-transform-84020b3b1549
Figure 30: Intersection point in the Hough space (Source: Internet)

A dot in the Hough space indicates that a line exists, identified by its θ and ρ
values.

Edge points produce cosine curves in the Hough space. These curves are
generated by mapping all edge points from an edge image onto the Hough
space. If two edge points lie on the same line, their corresponding cosine
curves intersect each other at a specific (ρ, θ) pair (the intersection points in
Figure 30).

Mapping all the edge points from an edge image onto the Hough space
generates a number of cosine curves.

1. The first step is to decide the range for ρ and θ. Usually, the range of θ
   is [0, 180] degrees and ρ is [−d, d], where d is the length of the edge
   image's diagonal.
2. Then create a 2D accumulator array indexed by (ρ, θ); for each edge pixel
   (r, c) and each value of θ, calculate ρ = r cos(θ) + c sin(θ) and
   increment the corresponding cell of the array.
3. Finally, take the highest values in the above array. These correspond to
   the strongest lines in the image, and can be converted back to the
   y = ax + b form.
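A minimal sketch of this voting procedure is shown below, assuming a binary
edge image in a NumPy array; the array names and the θ sampling are
illustrative choices.

import numpy as np

def hough_lines(edges, n_thetas=180):
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))                 # length of the diagonal d
    thetas = np.deg2rad(np.arange(0, n_thetas))         # theta in [0, 180) degrees
    rhos = np.arange(-diag, diag + 1)                   # rho in [-d, d]
    accumulator = np.zeros((len(rhos), len(thetas)), dtype=int)
    ys, xs = np.nonzero(edges)                          # edge pixel coordinates (r, c)
    for r, c in zip(ys, xs):
        rho = r * np.cos(thetas) + c * np.sin(thetas)   # rho = r cos(theta) + c sin(theta)
        idx = np.round(rho).astype(int) + diag          # shift so rho = -d maps to index 0
        accumulator[idx, np.arange(len(thetas))] += 1   # vote
    return accumulator, thetas, rhos

# The strongest lines correspond to the largest accumulator cells, e.g.:
# i, j = np.unravel_index(accumulator.argmax(), accumulator.shape)
# rho, theta = rhos[i], thetas[j]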

Hough transform: The Hough transform is a powerful tool that lets you identify
not just lines, but other shapes as well.

Example: Using the Hough transform, show that the points (1,1), (2,2) and
(3,3) are collinear, and find the equation of the line.

Solution: The equation of a line is y = mx + c. In order to perform the Hough
transform we need to map each point from the (x, y) plane to a line in the
(m, c) plane. The equation in the (m, c) plane is c = −mx + y.

Step 1: y = mx + c
For (1,1): 1 = m + c, i.e., c = −m + 1 (if c = 0 then m = 1; if m = 0 then c = 1).
Similarly for the other points:
For (2,2): c = −2m + 2 (if c = 0 then m = 1; if m = 0 then c = 2).
For (3,3): c = −3m + 3 (if c = 0 then m = 1; if m = 0 then c = 3).

Step 2: Plot the three lines in the (m, c) plane. They all intersect at the point
(m, c) = (1, 0).

Step 3: The original equation of the line is y = mx + c. Putting the values
m = 1 and c = 0 into this equation gives y = x.

Hence the points (1,1), (2,2) and (3,3) are collinear, lying on the line y = x.
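A quick numeric check of this example, assuming the mapping c = −m·x + y of
each point to a line in the (m, c) plane; the sampling of m is purely
illustrative.

import numpy as np

points = [(1, 1), (2, 2), (3, 3)]
m = np.linspace(-2, 4, 601)                 # sample the m axis
curves = [(-m * x + y) for x, y in points]  # c = -m*x + y for each (x, y)

# At m = 1 every curve gives c = 0, so the three points are collinear (y = x).
i = np.argmin(np.abs(m - 1.0))
print([c[i] for c in curves])               # -> approximately [0.0, 0.0, 0.0]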


b) Convolution Based Technique

Convolution masks are used to detect lines in this technique. Basically, there
are four different varieties of convolution masks: horizontal, vertical, oblique
(+45 degrees) and oblique (−45 degrees).

Horizontal (R1):     Vertical (R3):     Oblique +45° (R2):     Oblique −45° (R4):
−1 −1 −1             −1  2 −1           −1 −1  2                2 −1 −1
 2  2  2             −1  2 −1           −1  2 −1               −1  2 −1
−1 −1 −1             −1  2 −1            2 −1 −1               −1 −1  2

Lines are detected, using equation (3), from the responses obtained after
convolving these masks with the image:

R(x, y) = max(|R1(x, y)|, |R2(x, y)|, |R3(x, y)|, |R4(x, y)|)          (3)

If R(x, y) > T, then there is a discontinuity (a line) at (x, y).

Convolution means multiplying the corresponding values and accumulating the
results. This method will detect light lines against a dark background.
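A minimal sketch of this convolution-based detector follows, assuming a
greyscale NumPy image; the threshold value is an illustrative choice.

import numpy as np
from scipy import ndimage

MASKS = {
    "horizontal": np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]]),
    "vertical":   np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]]),
    "oblique+45": np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]]),
    "oblique-45": np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]]),
}

def detect_lines(image, threshold):
    img = image.astype(float)
    responses = [ndimage.correlate(img, m.astype(float)) for m in MASKS.values()]
    r = np.max(np.abs(responses), axis=0)       # R(x, y) = max_k |R_k(x, y)|
    return r > threshold                         # discontinuity where R(x, y) > T

# Usage: line_mask = detect_lines(gray_image, threshold=200)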

In the following section, we discuss the region based segmentation.

11.5 REGION DETECTION

Region detection is an important image processing tool for extracting
semantically significant spatial information from images. Matching establishes
similarity between visual entities, which is crucial for recognition.

In a picture, a region is a collection of connected pixels with comparable
features. Because they may correspond to objects in a scene, regions are
significant for picture interpretation. There may be multiple objects in an
image, and multiple regions corresponding to various portions of an object.
Therefore, it is necessary to partition an image into several regions that
correspond to objects or parts of objects in order to interpret it accurately. In
general, pixels in a region will have similar features. Pixels belonging to a
specific object can be identified by testing the following conditions:

A. The mean of the grey value of pixels of an image and the mean of the
grey value of pixels of a specific object in the image will be different
B. The standard deviation of the grey value of pixels belonging to a
specific object in the image will lie within a specific range.
C. The texture of the pixels of an object in the image will have a unique
property

But the connection between regions and objects is not perfect due to
segmentation errors. Therefore, we need to apply object-specific knowledge in
later stages for image interpretation.

Region-based segmentation and boundary estimation using edge detection are


two methods for splitting a picture into areas.

Further, in boundary detection, semantic boundaries are considered to find


different objects or sections of an image. It is different from edge detection
because it does not use boundaries between light and dark pixels in an image.

The brief discussion on Region-based segmentation, boundary estimation, and


boundary detection is given below:

a) Region based segmentation: In region-based segmentation, all pixels
of an image belonging to the same area are grouped and labelled together.
Here, pixels are assigned to areas based on characteristic features which
differ from those of other parts of the image. Value similarity and spatial
closeness are two important features of this segmentation process. If two
pixels are very close to one another and have similar intensity
characteristics, they may be allocated to the same region. For example,
similar grey values can indicate similar pixels, and the Euclidean distance
can represent the closeness of the pixels.

The similarity and proximity concepts are based on the idea that points
on the same object will project to pixels in the image that have similar
grey values. We can also assume to group pixels in the image and then
employ domain-dependent information to match areas to object models.
In simple cases, thresholding and component labelling can be used to
segment data.

b) Boundary estimation method: In this method, segmentation is done


by finding the pixels (called edges) lying on a region boundary. The
difference between neighbouring pixels can be used to determine the
region boundary as regions on either side of the boundary may have
different grey levels. A large number of edge detectors rely on intensity
characteristics to detect edges instead of using derived properties such
as texture and motion.

Figure 31 Boundary identified by grouping similar pixels

c) Boundary Detection: In boundary detection, semantic boundaries are
considered. It is different from edge detection because it does not use
boundaries between light and dark pixels in an image. A zebra, for example, has
several internal boundaries between black and white stripes that humans
would not consider part of the zebra's boundary. The focus is more on
approximate boundary detection method using training data because a
perfect solution requires high-level semantic knowledge about the scene
in the image.

Figure 32

In previous sections, the object boundary was detected to locate an object. In
this section, the object region will be used to locate the object. In theory,
locating an object by locating either its boundary or its region should result
in the same object, as boundary and region are just different representations of
the same object. But in practice, an edge based segmentation approach may give
a totally different result from a region based approach.

Imperfect images and imperfect methods can be the reason that the result of
object detection using a boundary based approach differs from a region based
approach. Region based methods rely on the assumption that neighbouring
pixels within one region have similar values. The common approach is to
compare a pixel with its neighbours. A pixel belongs to the same cluster if the
pixel and its neighbours satisfy the same similarity criterion. Region based
segmentation methods employ two basic steps:

a) Merging
b) Splitting

Image segmentation using merging has the following steps:

Step 1: Obtain the initial segmentation of the image.
Step 2: Merge two adjacent segments to form a single segment if they are
similar in some way.
Step 3: Repeat step 2 until no segment to be merged remains.

The initial segmentation can be all individual pixels. The basic idea is to
combine two pixels (regions) if they are similar. The similarity criteria can be
based on grey level similarity, the texture of the segment, etc.

Image segmentation using splitting has the following steps:

Step 1: Obtain an initial segmentation of an image.
Step 2: Split each segment that is inhomogeneous in some way.
Step 3: Repeat step 2 until all segments are homogeneous.

Here, the initial segmentation may be the entire image (no segmentation). The
criterion for inhomogeneity of a segment may be the variance of its gray levels
or the difference in its textures, etc. Splitting and merging may seem to be the
top-down and bottom-up versions of the same method, but there is a basic
difference: merging two segments is straightforward, whereas in splitting we
need to know the sub-segment boundary.

Let us discuss region growing.

Region growing is a process of merging adjacent pixel segments into one
segment. It is one of the simplest and most popular methods of segmentation
and is used in many applications. It needs a set of starting pixels called 'seed'
points. The process consists of picking a seed from the set, examining all 4- or
8-connected neighbours of this seed, and merging similar neighbours with the
seed, as shown in Fig. 33(a). The seed point is modified based on all merged
neighbours, Fig. 33(b). The algorithm continues until the seed set is empty.

(a) Start of Growing a Region (b) Growing Process After a Few Iterations

Fig. 33: Region Growing

A region growing algorithm for a single seed point, using the same grey level
value as the similarity measure to merge pixels, is as follows. The algorithm
uses a stack data structure to keep track of seed points. Two operations, push
and pop, are used here. Push: puts the pixel coordinates on the top of the
stack. Pop: takes a pixel from the top of the stack. The input to the algorithm
is an image f. The initial seed has coordinates (x, y) and grey level value g at
(x, y), f(x, y) = g. The goal is to grow a region with all pixels having grey
level value g, assigning them grey level k = 1 (k ≠ g). Let (x, y) be the
coordinates of the initial seed, and let (a, b) be the coordinates of the pixel
under investigation.

The algorithm

Push (x, y)
Do till the stack is not empty
    Pop (a, b)            /* take a point from the top of the stack */
    If f(a, b) = g        /* if input = desired value g */
        Set f(a, b) = 1   /* the segmented pixel is assigned the value 1 */
        Push (a, b − 1)   /* test all four neighbours of (a, b) by */
        Push (a, b + 1)   /* pushing them on the top of the stack */
        Push (a − 1, b)
        Push (a + 1, b)
End

This algorithm is recursive in nature (the explicit stack emulates recursion).
The final region is extracted by selecting all pixels having grey level value 1
(k). The algorithm can be modified by changing the similarity measure to
incorporate a range of values for merging: the statement 'If f(a, b) = g' can be
changed to

g1 ≤ f(a, b) ≤ g2.

Thus, if the grey level value of pixel (a, b) is between g1 and g2, then it is
segmented. The algorithm can be further modified to incorporate multiple seed
points. In the above algorithm, only four neighbours are considered. It can be
modified for an eight-neighbourhood: instead of using four push instructions,
eight push instructions can be used with the coordinates of all eight
neighbours.
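Below is a minimal sketch of this stack-based region-growing procedure in
Python, including the (g1, g2) range and eight-neighbourhood modifications
just described; the function and argument names are illustrative.

import numpy as np

def region_grow(image, seed, g1, g2, eight_connected=False):
    h, w = image.shape
    label = np.zeros((h, w), dtype=np.uint8)     # 1 marks segmented pixels
    stack = [seed]                               # seed = (row, col)
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    while stack:
        a, b = stack.pop()                       # Pop (a, b)
        if 0 <= a < h and 0 <= b < w and label[a, b] == 0 and g1 <= image[a, b] <= g2:
            label[a, b] = 1                      # assign the pixel to the region
            for da, db in offsets:               # Push all neighbours of (a, b)
                stack.append((a + da, b + db))
    return label

# Example 2 below (7 x 7 image, seed at the centre, |f - 60| <= 5) would be run as:
# label4 = region_grow(img, (3, 3), 55, 65)                       # 4-neighbourhood
# label8 = region_grow(img, (3, 3), 55, 65, eight_connected=True) # 8-neighbourhood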

There are several issues about region growing.

1) Suitable selection of seed points is important. Seed point selection is user
dependent. The histogram and image properties can be taken into account by
the user to select the seed point.

2) The selection criterion for the similarity measure is important. It generally
depends on the original image and the object to be segmented. A band of
intensity values, colour, texture and shape are some of the similarity measures
generally used.

3) Formulation of a stopping rule is also very important in region growing.
Growing of a region should stop when no more pixels satisfy the similarity
criteria. The algorithm can be made more powerful if the size and shape of the
grown region are also considered.

The main advantage of the region growing algorithm is that it correctly
separates the regions based on the property defined by the user. The algorithm
is very simple to implement. The only inputs needed are the seed points and the
selection criterion. Multiple criteria can also be applied. The algorithm works
well in a noisy environment also.

The major disadvantage of region growing is that the seed points are user
dependent. Selection of wrong seed points can lead to wrong segmentation
results. The algorithm is highly iterative and requires high computational time
and power.

Example 1: In the image segment given in Fig. 34(a), seed points are given at
(3, 2) and (3, 4). The similarity criterion is the grey level difference. Find the
segmented image if a) T = 3 and b) T = 8.

Solution: For T = 3, region growing starts with pixel (3, 2). All the pixels
having grey level difference < 3 are assigned as a and denoted as region R1.
Another region growing starts at (3, 4). All pixels with grey level difference
< 3 are assigned as b and denoted as region R2. The output is shown in Fig.
34(b). For T = 8, all the pixels have grey level difference less than 8, so only
one region is formed, with all pixels being assigned as 'a'. The output is shown
in Fig. 34(c).
1 2 3 4 5
1 0 0 5 6 7
2 1 1 5 8 7
3 0 1 6 7 7

4 2 0 7 6 6

5 0 1 5 6 5

(a) Input Image Segment for Example 1

(b) Output for T = 3 (R1 = pixels marked a, R2 = pixels marked b):
a a b b b
a a b b b
a a b b b
a a b b b
a a b b b

(c) Output for T = 8:
a a a a a
a a a a a
a a a a a
a a a a a
a a a a a

Fig.34

***
Example 2: Use region growing to segment the object in the image given in
Fig. 35. The seed is the centre pixel of the image. The region is grown in the
following directions:

a) in the horizontal and vertical directions only (4-neighbourhood);
b) in the horizontal, vertical and diagonal directions (8-neighbourhood).

The similarity criterion is that the difference between two pixel values is less
than or equal to 5.
10 10 10 10 10 10 10
10 10 10 69 70 10 10
59 10 60 64 59 66 60
10 59 10 60 70 63 62
10 60 59 65 67 10 65
10 10 10 10 10 10 10
10 10 10 10 10 10 10

Fig.35: Input image segment

Solution: a) Region growing starts with the seed point, the pixel with grey
value 60 in the centre. It moves horizontally and vertically to check how much
each neighbouring pixel value differs from 60. If the difference is less than or
equal to 5, the pixel is assigned as 'a' and merged with the region; otherwise it
is assigned as 'b'. Fig. 36(a) shows the output.

b) If diagonal elements are also included, then the region grows more, as
shown in Fig. 36(b).

(a) Output for 4-neighbourhood:
b b b b b b b
b b b b b b b
b b a a a b b
b a b a b b b
b a a a b b b
b b b b b b b
b b b b b b b

(b) Output for 8-neighbourhood:
b b b b b b b
b b b b b b b
a b a a a b a
b a b a b a a
b a a a b b a
b b b b b b b
b b b b b b b


Fig. 36
***
Now, let us discuss Split and Merge Method.
As mentioned before, there is a fundamental problem in the splitting procedure:
suitable sub-segments need to be established before performing splitting, so
the problem of how to split has to be solved. We can decide whether a
particular segment needs to be split or not by checking if

• the grey level difference (variance) exceeds a threshold, or
• the variance of a texture measure exceeds a threshold, or
• the histogram entropy or any other histogram measure exceeds a threshold, or
• edge pixels exist in the segment.

We start by splitting the entire image into four quadrants. The object can be in
any, some, or all of the four quadrants, as the object can be anywhere in the
image.

Then, further subdivision of the quadrants is done. A merging operation is
added to the segmentation process, and recursive splitting and merging of
image segments is done as shown in Fig. 37. The original image is shown in
Fig. 37(a), and Fig. 37(b) shows the entire image split into four segments.
Fig. 37(c) shows a further split of the four segments. In Fig. 37(d), further
segmentation is done if the grey level variance in a sub-block is non-zero.
Merging of the sub-blocks having the same grey levels is shown in Fig. 37(e).
Continuing this process, we end up with two segments, one being the object
and the other being the background.

(a) (b) (c) (d) (e)


Fig.37: Example of Split and Merge Segmentation

The split and merge algorithm uses a 'quad tree' for representing segments.
Fig. 38 shows that there is a one to one relation between splitting an image
recursively into quadrants and the corresponding quad tree representation.
However, there is a limitation in this representation, as we cannot model the
merging of two segments at different levels of the pyramid.

Let R represent the entire image. A homogeneity criterion H is selected. If the
region R is not homogeneous (H(R) = FALSE), then split R into four quadrants
R1, R2, R3, R4. Any four regions with the same parent can be merged into a
single homogeneous region if they are homogeneous.
The steps of the Split and Merge algorithm are as follows:
Step 1: Define an initial segmentation into regions, a homogeneity criterion and a pyramid data structure.
Step 2: If any region R in the pyramid data structure is not homogeneous (H(R) = False), split it into four child regions.
Step 3: When no further splitting is possible, merge any two adjacent regions Ri and Rj which are homogeneous (H(Ri U Rj) = True).
Step 4: Stop when no further merging is possible.

(a) Naming convention of quadrants (b) Quadtree representation


Fig. 38: Example of recursive splitting of an image by a quad tree.
Several modifications to this basic algorithm are possible. For example, in Step 2, merging of homogeneous regions is allowed, which results in a simpler and faster algorithm.
The major advantage of this algorithm is that the image can be split progressively according to the required resolution, because the number of splitting levels is decided by the user. The major disadvantage is that it may produce blocky segments, as splitting is done in rectangular quadrants. This problem can be reduced by splitting at a higher level, but this increases the computational time.
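Purely for illustration, the splitting part of the method can be sketched in a few lines of Python/NumPy. The sketch below assumes a square image whose side is a power of two, uses the grey level variance as the homogeneity criterion, and omits the merging step and the quad tree bookkeeping; the function name split_regions is our own.

```python
import numpy as np

def split_regions(img, r0=0, c0=0, size=None, var_thresh=0.0, min_size=1):
    """Recursively split the image into quadrants until every block is homogeneous
    (grey level variance <= var_thresh) or has reached min_size.
    Returns the homogeneous blocks as (row, col, size) tuples."""
    if size is None:
        size = img.shape[0]                      # assumes a square, power-of-two sized image
    block = img[r0:r0 + size, c0:c0 + size]
    if block.var() <= var_thresh or size <= min_size:
        return [(r0, c0, size)]                  # homogeneous: keep as a single segment
    half = size // 2
    regions = []
    for dr, dc in [(0, 0), (0, half), (half, 0), (half, half)]:
        regions += split_regions(img, r0 + dr, c0 + dc, half, var_thresh, min_size)
    return regions

# Toy 8 x 8 image: a bright 4 x 4 object in the top-left corner on a dark background.
img = np.zeros((8, 8), dtype=float)
img[:4, :4] = 200
for r, c, s in split_regions(img):
    print(f"homogeneous block at ({r}, {c}) of size {s} x {s}")
```

A full implementation would additionally merge adjacent homogeneous blocks having the same mean grey level, as in Step 3 of the algorithm above.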

Example 3: Segment the image in Fig. 39(a) by the split and merge algorithm. The homogeneity criterion is the grey levels of the pixels.

Solution: The image is divided into four quadrants. Fig. 39(b) shows the image and its quad tree. Quadrants 2 and 3 are homogeneous, so no further splitting is done there. Quadrants 1 and 4 are non-homogeneous; hence they are each divided further into four quadrants. Fig. 39(c) shows this splitting and the corresponding quad tree. Now only one quadrant, 12, is still non-homogeneous, and it is subdivided further.

Fig. 39: Split and merge algorithm: (a) input image, (b) first split into quadrants 1 to 4 with its quad tree, (c) further split of quadrants 1 and 4, (d) final split of quadrant 12, (e) final segmentation after merging.

Fig. 39(d) shows the segmented image and its final quad tree structure. Now all regions are homogeneous and hence no further splitting is possible.

Now the merging operation takes place between adjacent quadrants. Quadrants 43 and 44 are merged into a single region, quadrants 123 and 124 are merged, and quadrants 11 and 14 are merged. Finally, all adjacent quadrants which are homogeneous are merged. Fig. 39(e) is the final segmented image after merging is complete.
***
Now try the following exercises.

E9) Distinguish between image segmentation based on thresholding and image segmentation based on region-growing techniques.
E10) Consider the image segment given below and segment it using thresholding based on its histogram.

128 128 128 64 64 32 32 8


64 64 128 128 128 8 32 32
32 8 64 128 128 64 64 64
8 128 128 64 64 8 64 64
128 64 64 64 128 128 8 8
64 64 64 128 128 128 32 32
8 128 32 64 64 128 128 128
8 8 64 64 128 128 64 64
E11) What are the advantages/disadvantages if we use more than one seed in
a region-growing technique?

In the following section, we shall discuss boundary detection.


11.6 BOUNDARY DETECTION
After performing image segmentation, a region may be represented in terms of

– external characteristics (boundaries).


– internal characteristics (texture).

A region may be described by its boundary in terms of features such as its length, the orientation of the straight line joining its extreme points and the number of concavities in the boundary. Many algorithms require the points on the boundary of a region to be ordered in a clockwise (or anti-clockwise) direction. For boundary detection (also called boundary following or tracking), the following assumptions are made:

1) The image is binary, where 1 = foreground and 0 = background.

2) The image is padded with a border of 0's, so an object cannot merge with the border.

3) We limit the discussion to single regions; the extension to multiple regions is straightforward.
Moore Boundary Tracking Algorithm:

Given a binary region R or its boundary:

Step 1: Let the starting point b0 be the uppermost, leftmost point in the image labelled 1.

Step 2: Denote by c0 the west neighbour of b0. c0 is always a background point.

Step 3: Examine the 8-neighbours of b0, starting at c0 and proceeding in a clockwise direction.

Step 4: Let b1 denote the first neighbour encountered whose value is 1.

Step 5: Let c1 denote the background point immediately preceding b1 in the sequence.

Step 6: Store the locations of b0 and b1 for use in Step 10.

Step 7: Let b ← b1 and c ← c1.

Step 8: Let the 8-neighbours of b, starting at c and proceeding clockwise, be denoted n1, n2, ..., n8. Find the first nk which is foreground (i.e., a "1").

Step 9: Let b ← nk and c ← nk-1.

Step 10: Repeat Steps 8 and 9 until b = b0 and the next boundary point found is b1, that is, until we have reached the first point again.

The sequence of b points found when the algorithm stops is the set of ordered boundary points.
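The algorithm translates almost directly into code. Below is a minimal illustrative sketch in Python/NumPy for a single binary region containing more than one pixel, assuming (as stated above) that the image is padded with a border of 0's; the function names are our own.

```python
import numpy as np

def moore_boundary(img):
    """Trace the boundary of a single binary region (1 = foreground, 0 = background).
    Returns the ordered list of boundary pixel coordinates (row, col)."""
    # 8-neighbour offsets of a pixel, in clockwise order starting at its west neighbour.
    nbrs = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]

    rows, cols = np.nonzero(img)
    b0 = (int(rows[0]), int(cols[0]))        # Step 1: uppermost, leftmost foreground pixel
    c0 = (b0[0], b0[1] - 1)                  # Step 2: its west neighbour (a background point)

    def next_point(b, c):
        """Steps 3-5 / 8-9: first foreground 8-neighbour of b, scanning clockwise from c,
        together with the background point immediately preceding it."""
        start = nbrs.index((c[0] - b[0], c[1] - b[1]))
        for k in range(1, 9):
            dr, dc = nbrs[(start + k) % 8]
            cand = (b[0] + dr, b[1] + dc)
            if img[cand] == 1:
                pr, pc = nbrs[(start + k - 1) % 8]
                return cand, (b[0] + pr, b[1] + pc)

    b1, c1 = next_point(b0, c0)              # Steps 4 to 6
    boundary = [b0, b1]
    b, c = b1, c1                            # Step 7
    while True:                              # Steps 8 to 10
        nb, nc = next_point(b, c)
        if b == b0 and nb == b1:             # stopping rule of Step 10
            break
        if nb != b0:                         # do not list the starting point twice
            boundary.append(nb)
        b, c = nb, nc
    return boundary

# Small test: a 4 x 4 square of 1's padded with a border of 0's.
img = np.zeros((6, 6), dtype=int)
img[1:5, 1:5] = 1
print(moore_boundary(img))                   # 12 ordered boundary points
```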

Fig. 40: Illustration of the Boundary Following Algorithm: (a) input binary region, (b) Steps 1 and 2, with the starting point b0 and its west neighbour c0 marked, (c) Steps 3 to 9, (d) Step 10.
There is a need for the stopping rule stated in Step 10. If we stopped as soon as we reached the initial point, without also checking the next boundary point, we would include only the spur at the right and miss the rest of the boundary. Starting from the topmost, leftmost point in Fig. 41(a) results in Fig. 41(b). In Fig. 41(c) the algorithm has returned to the starting point, and the rest of the boundary could not be traced.

Fig. 41: Example of an erroneous result of the Boundary Detection Algorithm, panels (a) to (c).

Now, we shall discuss the chain codes.

Chain codes are used to represent a boundary by a connected sequence of straight line segments of specified length and direction. Freeman codes [1961] code the direction of each segment using a numbering scheme based on 4- or 8-connectivity, as shown in Fig. 42.
In the 4-direction code, the numbers 0, 1, 2 and 3 denote the directions right, up, left and down respectively; in the 8-direction code, the numbers 0 to 7 denote the eight neighbouring directions, counted anti-clockwise starting from the right.

Fig. 42: Direction numbers: (a) 4-direction chain code, (b) 8-direction chain code
Images are acquired and processed in a grid format with equal spacing in the x
and y- directions, so a chain code can be generated by following a boundary in
clockwise direction and assigning a direction to the segment connecting every
pair of pixels. To avoid noise degradation and long chains a resampling of the
image grid is commonly used to describe the boundary at a coarser level as

shown in Fig. 43(a). Fig. 43(b) shows the resampled points and Fig. 43(c) shows the corresponding 8-direction chain code.

Fig. 43: (a) Digital boundary with resampling grid, (b) result of resampling, (c) 8-direction chain-coded boundary
If we start from the topmost, leftmost corner, the chain code for Fig. 28 is 0 7 6 6 6 6 6 4 5 3 3 2 1 2 1 2.
The chain code depends on the starting point. To normalize it, we treat the code as a circular sequence of direction numbers and redefine the starting point so that the resulting sequence forms an integer of minimum magnitude. To account for rotation, we use the first differences of the chain code instead of the code itself.

The first difference is obtained by counting the number of direction changes, in the counter-clockwise direction, that separate two adjacent elements of the code.

For the boundary in the example of Fig. 28, the chain code is 0 7 6 6 6 6 6 4 5 3 3 2 1 2 1 2 and its first difference is 6 7 7 0 0 0 0 6 1 6 0 7 7 1 7 1.

The first value is calculated by treating the code as a circular sequence of integers: counting counter-clockwise from the last element 2 to the first element 0 gives 6, and so on.
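Both operations are easy to verify in code. A small illustrative Python helper (the function names are our own) is given below.

```python
def first_difference(code, directions=8):
    """First difference of a chain code: counter-clockwise direction changes between
    adjacent elements, treating the code as a circular sequence."""
    return [(code[i] - code[i - 1]) % directions for i in range(len(code))]

def normalize_start(code):
    """Rotate the circular code so that it forms the integer of minimum magnitude."""
    return min(code[i:] + code[:i] for i in range(len(code)))

chain = [0, 7, 6, 6, 6, 6, 6, 4, 5, 3, 3, 2, 1, 2, 1, 2]
print(first_difference(chain))   # first difference of the 8-direction code above
print(normalize_start(chain))    # starting-point normalised version of the code
```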
Example 4: Find the chain code and first difference of the boundary shapes shown in Fig. 44, panels (a) to (d).

Fig. 44: Boundary shapes for Example 4
Solution: Using the 4-direction code (0 = right, 1 = up, 2 = left, 3 = down):

Chain code: 0 0 3 3 2 2 1 1 0 0 3 2 2 1
Difference: 3 0 3 0 3 0 3 0 3 0 3 3 0 3

Chain code: 0 0 3 2 2 1 0 0 0 3 2 2 2 1
Difference: 3 0 3 3 0 3 3 0 0 3 3 0 0 3

4-directional codes are used in this example. First differences are computed by
treating the chain as a circular sequence.
***
Now, try an exercise.

E12) Find chain code and first difference of the following boundary shape.

11.7 FEATURE EXTRACTION


Feature extraction reduces a large set of raw data to a smaller, more manageable set of features for quicker processing. Such data sets contain a large number of variables, which require a large amount of processing power. Feature extraction obtains the most informative features by selecting and combining variables into features.

Applications of Feature Extraction

Image Processing, Auto-encoders and Bag of words are some applications of


Feature Extraction.

1. Image Processing: In image processing, extracted features such as edges, corners and textures are used to analyse and understand images better.

2. Auto-encoders: Auto-encoders perform efficient unsupervised data coding. Feature extraction may be used here to discover significant features in the data, by learning a compact code from the original data set and generating new representations from it.

3. Bag of Words: Feature extraction is an important part of this process. This


technique is widely used for natural language processing (NLP). Here,
words/features are extracted from a document, website or sentence and are
classified according to their frequency of use.

Traditional methods of feature detection

Traditional methods of feature detection include the following:

1. Harris Corner Detection - It detects corners by computing a corner score from the local image derivatives with respect to direction.

2. SIFT (Scale-Invariant Feature Transform) - Detects and describes keypoints in a way that is invariant to scale and rotation.

3. SURF (Speeded-Up Robust Features)- This technique is a simplified
variant of SIFT.

4. FAST (Features from Accelerated Segment Test)- In comparison to SURF,


this is a substantially faster corner detecting algorithm.

5. BRIEF (Binary Robust Independent Elementary Features) - This feature descriptor can be used with any feature detector. By representing the descriptor as a binary string instead of a vector of floating point numbers, this approach minimizes memory utilization.

6. Oriented FAST and Rotated BRIEF (ORB) —This OpenCV algorithm uses
FAST key-point detector and BRIEF descriptor. It is an alternative to SIFT and
SURF.
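As a quick illustration of the traditional pipeline, the sketch below detects ORB keypoints and descriptors with OpenCV. It assumes OpenCV (cv2) is installed and that a test image file named 'scene.jpg' exists; both the file name and the parameter values are placeholders.

```python
import cv2

# Load a grayscale test image ('scene.jpg' is only a placeholder file name).
img = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

# Create the ORB detector (FAST keypoints + rotated BRIEF binary descriptors).
orb = cv2.ORB_create(nfeatures=500)

# Detect keypoints and compute their descriptors in one call.
keypoints, descriptors = orb.detectAndCompute(img, None)
print(len(keypoints), "keypoints detected")
print("descriptor matrix shape:", None if descriptors is None else descriptors.shape)

# Draw the keypoints for visual inspection and save the result.
vis = cv2.drawKeypoints(img, keypoints, None, color=(0, 255, 0))
cv2.imwrite('scene_orb.jpg', vis)
```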

Deep Learning Techniques for feature extraction


Convolutional neural networks (CNNs) can replace traditional feature extractors because of their ability to efficiently extract complex features that express more detailed parts of an image, and because they can learn task-specific features.

1. SuperPoint: It detects interest points and computes descriptors using a fully convolutional neural network. The input is encoded by a VGG-style encoder, and two decoders then generate the interest points and the descriptors respectively.

Figure 45: SuperPoint structure

The SuperPoint structure is an example of a fully convolutional neural network architecture. It operates on a full-sized image and produces interest point detections together with fixed-length descriptors in a single forward pass. The model uses a single, shared encoder to process the input image and reduce its dimensionality. After the encoder, the architecture splits into two "heads", referred to as decoders: one head is responsible for locating the interest points, while the other is in charge of describing them. Most of the network's parameters are shared between the two tasks. This is unlike traditional systems, which locate interest points first and then compute descriptors, and which therefore cannot share computation and representation between the two tasks.

As a consequence, the resulting system is effective for tasks such as homography estimation, which require geometric matching between images [1].
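Purely to illustrate this shared-encoder, two-head design, the following is a much simplified PyTorch sketch. The layer sizes, the class name TinySuperPointLike and the 8 x 8-cell decoding are illustrative assumptions, not the actual SuperPoint implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySuperPointLike(nn.Module):
    """Shared encoder followed by a detector head and a descriptor head (illustrative only)."""
    def __init__(self, desc_dim=256):
        super().__init__()
        # Shared VGG-style encoder: three 2x2 poolings reduce H x W to H/8 x W/8.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Detector head: 65 channels = one score per pixel of an 8x8 cell plus a "dustbin".
        self.det_head = nn.Conv2d(128, 65, 1)
        # Descriptor head: one desc_dim-dimensional descriptor per 8x8 cell.
        self.desc_head = nn.Conv2d(128, desc_dim, 1)

    def forward(self, image):                 # image: (B, 1, H, W), H and W divisible by 8
        feat = self.encoder(image)            # (B, 128, H/8, W/8), shared by both heads
        scores = F.softmax(self.det_head(feat), dim=1)[:, :-1]   # drop the dustbin channel
        heatmap = F.pixel_shuffle(scores, 8)  # (B, 1, H, W) per-pixel interest point scores
        desc = F.normalize(self.desc_head(feat), p=2, dim=1)     # unit-length coarse descriptors
        return heatmap, desc

model = TinySuperPointLike()
heatmap, desc = model(torch.rand(1, 1, 64, 64))
print(heatmap.shape, desc.shape)              # (1, 1, 64, 64) and (1, 256, 8, 8)
```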

2. D2-Net: It is a trainable CNN-based local feature detector and dense feature descriptor (a descriptor is available at every spatial location of the feature map).

Figure 46: Detect and Describe (D2) network

It is a fully convolutional neural network (FCNN) used for extracting feature maps that serve a double purpose:

i) the local descriptor d_ij at a given spatial position (i, j) is obtained simply by reading off the values of all n feature maps D_k at that position;

ii) keypoint detection scores s_ij are calculated during training using a soft local-maximum score and a ratio-to-maximum score for each descriptor, and detections are generated by performing non-maximum suppression on the feature map [2].

3. LF-Net: This approach is trained on image pairs with known relative pose and depth maps.

Figure 47: LF-Net

It can be thought of as both a feature detector and a dense feature descriptor at the same time. The keypoints obtained with this method are more stable than their traditional equivalents, which rely on the early detection of low-level structures; this is achieved by delaying the detection to a later stage. The authors demonstrate that the model can be trained directly from pixel data [3].

Now let us summarise what we have discussed in this unit.

11.8 SUMMARY
In this unit, we have discussed the following:

1. image segmentation techniques;
2. edge based segmentation;
3. line based segmentation;
4. various region based segmentation techniques;
5. boundary detection algorithms and chain codes; and
6. feature extraction techniques.

11.9 SOLUTIONS AND ANSWERS

E1) Image segmentation partitions an image into a set of regions. In many applications, regions represent meaningful areas in an image. In other applications, regions might be sets of border pixels grouped into structures such as line segments, edges etc. The level of partitioning of an image depends on the problem to be solved: segmentation should stop when the objects of interest have been isolated. Segmentation has two objectives:

a) To decompose an image into regions for further analysis.

b) To perform a change of representation of an image for faster analysis.
E2) Segmentation is the most important step in an automated recognition system, which has numerous applications, some of which are listed below:

1. Medical Imaging
2. Satellite Imaging
3. Movement Detection
4. Security and Surveillance
5. License Plate Recognition (LPR)
6. Industrial Inspection and Automation
7. Robot Navigation

E3) Generally, all segmentation algorithms are based on one of two properties of the gray level values of pixels.

i. Discontinuity

In this approach, images are partitioned based on the difference or discontinuity of the gray level values. Edge based segmentation methods fall in this category.
ii. Similarity

Images are partitioned based on the similarity of the gray level values of the pixels according to a pre-defined criterion. Thresholding, region based clustering and matching based segmentation techniques fall in this category.

E4) A typical edge may be the border between a block of red colour and a block of yellow. An edge can also be the boundary between two regions with relatively distinct gray-level properties.

E5) The sign of the second derivative is used to determine whether an edge pixel lies on the dark side or the light side of the edge. The zero crossing is at the midpoint of a transition in gray level.

E6) Gradient Operator

The gradient is a first order derivative and is defined for an image $f(x, y)$ as

$$\nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix}$$

It points in the direction of the maximum rate of change of $f$ at the point $(x, y)$.

The magnitude of the gradient is $\mathrm{mag}(\nabla f) = [G_x^2 + G_y^2]^{1/2}$, which can be approximated as $\mathrm{mag}(\nabla f) \approx |G_x| + |G_y|$.

The direction of the gradient is given by $\alpha(x, y) = \tan^{-1}(G_y / G_x)$, where $\alpha$ is measured with respect to the x-axis.
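For illustration, Gx and Gy are usually approximated with Sobel operators; a brief sketch using OpenCV and NumPy is given below (the file name 'test.png' is only a placeholder).

```python
import cv2
import numpy as np

img = cv2.imread('test.png', cv2.IMREAD_GRAYSCALE).astype(np.float64)

# Sobel approximations of the partial derivatives Gx and Gy.
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

magnitude = np.sqrt(gx**2 + gy**2)    # mag = [Gx^2 + Gy^2]^(1/2)
approx    = np.abs(gx) + np.abs(gy)   # cheaper approximation |Gx| + |Gy|
direction = np.arctan2(gy, gx)        # gradient direction (radians) w.r.t. the x-axis
```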

E7) The Laplacian is hardly used by itself for edge detection in practice because it is very sensitive to noise, it produces double edges, and it does not give the edge direction. It is instead used to find the location of an edge using zero crossing detection.

E8) The 5 x 5 LoG mask is

 0   0    1   0   0
 0   1    2   1   0
 1   2  -16   2   1
 0   1    2   1   0
 0   0    1   0   0

E9) Image segmentation based on thresholding applies a single, fixed criterion to all pixels in the image simultaneously; hence it is rigid. On the other hand, image segmentation based on the region-growing approach is more flexible: it is possible to adjust the acceptance criteria in the course of the region-growing process, so that they can depend, for instance, on the shape of the growing regions if desired.

E10) Step 1: Compute the histogram of the input image. The histogram gives the frequency of occurrence of each gray level.

The histogram threshold is fixed at 32. The input image is then divided into two regions as follows:

Region 1: Gray level ≤ 32

Region 2: Gray level > 32

The input image after this decision is given as

2 2 2 2 2 1 1 1
2 2 2 2 2 1 1 1
1 1 2 2 2 2 2 2
1 2 2 2 2 1 2 2
2 2 2 2 2 2 1 1
2 2 2 2 2 2 1 1
1 2 1 2 2 2 2 2
1 1 2 2 2 2 2 2
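As a quick check, this labelling can be reproduced with a couple of NumPy lines (a sketch using the image segment above):

```python
import numpy as np

img = np.array([[128, 128, 128,  64,  64,  32,  32,   8],
                [ 64,  64, 128, 128, 128,   8,  32,  32],
                [ 32,   8,  64, 128, 128,  64,  64,  64],
                [  8, 128, 128,  64,  64,   8,  64,  64],
                [128,  64,  64,  64, 128, 128,   8,   8],
                [ 64,  64,  64, 128, 128, 128,  32,  32],
                [  8, 128,  32,  64,  64, 128, 128, 128],
                [  8,   8,  64,  64, 128, 128,  64,  64]])

labels = np.where(img <= 32, 1, 2)   # Region 1: grey level <= 32, Region 2: grey level > 32
print(labels)
```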

E11) The advantage of using more than one seed is that better segmentation
of the image can be expected, since more seeds lead to more
homogeneous regions.

The drawback of using more than one seed is that the probability of
splitting a homogeneous region in two or more segments increases.

E12)

Chain code: 0 3 0 3 2 2 1 1

Difference: 3 3 1 3 3 0 3 0
REFERENCES

[1] https://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w9/DeTone_SuperPoint_Self-Supervised_Interest_CVPR_2018_paper.pdf
[2] https://openaccess.thecvf.com/content_CVPR_2019/papers/Dusmanu_D2-Net_A_Trainable_CNN_for_Joint_Description_and_Detection_of_CVPR_2019_paper.pdf
[3] https://www.semanticscholar.org/paper/D2-Net%3A-A-Trainable-CNN-for-Joint-Description-and-Dusmanu-Rocco/162d660eaaa1eb2144d8030102f3e6be1e80ce50
[4] https://web.ipac.caltech.edu/staff/fmasci/home/astro_refs/HoughTrans_lines_09.pdf
[5] https://www2.ph.ed.ac.uk/~wjh/teaching/dia/documents/edge-ohp.pdf
[6] https://en.wikipedia.org/wiki/Line_detection
[7] https://towardsdatascience.com/image-feature-extraction-traditional-and-deep-learning-techniques-ccc059195d04
[8] https://morioh.com/p/14d27a725a0e and https://www.researchgate.net/figure/An-example-of-edge-based-segmentation-18_fig2_283447261
