Unit 11
IMAGE SEGMENTATION
11.1 INTRODUCTION
The development of computer vision has led to various means of understanding
images automatically. One of the important applications is object detection in
an image. Using image processing algorithms, we can detect semantic objects
of a certain class (like individuals, buildings, animals or vehicles) in
digital images or videos.
Mainly, an object detection algorithm aims to detect objects from a known class
such as vehicles, human beings, animals, trees, etc. In general, only a small
number of objects may be present in a single image, but the same class can
appear in a large number of images with various backgrounds.
Such cases are to be explored in object detection. For example, Figure 1 shows
the output of an object detection algorithm within a room and for a scene on the road.
We can see objects like bottles, glasses, a laptop, a chair, a bag, etc. inside the room,
whereas in the outdoor environment we can see a car, two-wheelers, a bus, etc.
Objectives
After studying this unit, you should be able to
In short, the basic features of an object class are to be defined and included in a
database of object models. Using a feature extraction process, the specific features of
the object we are looking for are identified and matched against the database
to identify the object class.
Object detection approaches are broadly classified as (1) Edge Detection, (2) Region
Detection and (3) Boundary Detection.
[Figure: typical image processing pipeline — preprocessing (enhancement, restoration, etc.), segmentation, and image analysis (feature extraction, etc.)]
1. Medical Imaging
2. Satellite Imaging
Segmentation is used to locate objects like roads, forest etc. in satellite images
for monitoring of ecological resources like seasonal dynamics of vegetation,
deforestation, mapping of underground minerals etc. Segmentation methods
based on texture, colour, wavelength etc. are generally used here.
3. Movement Detection
Monitoring crowded urban environments is a goal of modern vision systems.
Knowledge of the size of the crowd and tracking of its motion can be used to
monitor traffic intersections. An intelligent walk signal system can be designed
based on the number of people waiting to cross the road. Knowledge of the size
of the crowd is also helpful in general safety, crowd control and planning the urban
environment.
7. Robot Navigation
Image segmentation helps in simplifying the tasks and goals of computer vision and image
processing techniques. Segmenting an image is often considered the first step for image
analysis. Image segmentation is done by splitting an image into multiple parts based on
similar characteristics of pixels for identifying objects.
There are several techniques through which an image can be segmented, based on
dividing and grouping specific pixels which can then be assigned labels and
classified according to these labels. The generated labels can be used in several
supervised, semi-supervised and unsupervised training and testing tasks in
machine learning and deep learning applications.
Image segmentation plays a vital role in computer vision and has several
applications across industries and research. Some of the common applications are
facial recognition, number plate recognition, image search, analysis of medical images, etc.
Classification of image segmentation methods: Researchers have been working
on image segmentation for over a decade. The commonly used classification is
based on the method of identification: images can be segmented either by
grouping similar pixels or by differentiating them by identifying the boundary.
The region based identification method and the boundary based identification
method of segmentation are discussed below.
Figure 7
https://scikit-image.org/docs/stable/auto_examples/applications/plot_thresholding.html
(Source: Internet)
The thresholding based segmentation can be further classified as:
1. Simple Thresholding, and
2. Adaptive Thresholding
Simple Thresholding: In the Simple Thresholding method (also known as global
thresholding), all pixels are converted into white or black based on a reference
pixel intensity value. If the intensity value is less than the reference (threshold) value,
the pixel is converted into black, and if it is greater, the pixel is converted
into white.
Algorithm
1. Initial estimate of T.
2. Segmentation using T: G1, pixels brighter than T; G2, pixels darker than (or equal to) T.
3. Computation of the average intensities m1 and m2 of G1 and G2.
4. New threshold value: Tnew = (m1 + m2)/2.
5. If |T − Tnew| > ΔT, go back to step 2; otherwise stop.
Global thresholding, using an appropriate threshold T:
g(x, y) = 1, if f(x, y) > T
g(x, y) = 0, if f(x, y) ≤ T
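A minimal NumPy sketch of this iterative procedure is given below; the image array f and the tolerance value are assumptions made for illustration.

import numpy as np

def iterative_global_threshold(f, delta_T=0.5):
    # Step 1: initial estimate of T (here: the mean intensity)
    T = f.mean()
    while True:
        G1 = f[f > T]                    # Step 2: pixels brighter than T
        G2 = f[f <= T]                   # pixels darker than (or equal to) T
        m1, m2 = G1.mean(), G2.mean()    # Step 3: average intensities of G1 and G2
        T_new = (m1 + m2) / 2.0          # Step 4: new threshold value
        if abs(T - T_new) <= delta_T:    # Step 5: stop when the change is small
            return T_new
        T = T_new                        # otherwise go back to step 2

# g(x, y) = 1 if f(x, y) > T, else 0:
# g = (f > iterative_global_threshold(f)).astype(np.uint8)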
● Otsu’sBinarization
In general, a constant threshold value is often chosen through trial and error
to perform simple thresholding-based segmentation. This constant threshold value
can vary according to the application and may not be robust or efficient across
two different applications. In Otsu's Binarization method, the threshold value
is decided automatically from the two peaks obtained from the histogram. The main
limitation is that it works well only on bimodal images, i.e., images containing
only two peaks in the histogram. The main applications of this method are scanning
documents, removing unwanted colours, pattern recognition, etc.
Algorithm
1. Otsu's method is aimed at finding the optimal value for the global threshold.
2. It is based on interclass (between-class) variance maximization.
3. Well thresholded classes have well discriminated intensity values.
4. M × N image histogram: L intensity levels, [0, ..., L − 1].
5. ni — number of pixels of intensity i.
6. Normalized histogram: pi = ni / (M·N).
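For illustration, a short NumPy sketch of Otsu's threshold selection, maximizing the between-class variance over all candidate thresholds, could look as follows; the 8-bit greyscale input img is an assumption. OpenCV offers the same operation through cv2.threshold with the cv2.THRESH_OTSU flag.

import numpy as np

def otsu_threshold(img):
    # Normalized histogram p_i = n_i / (M*N) of an 8-bit image
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()          # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        m0 = (levels[:t] * p[:t]).sum() / w0       # mean of the dark class
        m1 = (levels[t:] * p[t:]).sum() / w1       # mean of the bright class
        var_between = w0 * w1 * (m0 - m1) ** 2     # between-class variance
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t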
Figure 8
https://www.researchgate.net/publication/263608069_Medical_Image_Segmentation_Methods_Algorithms_and_Applications/figures?lo=1&utm_source=google&utm_medium=organic
a) Original image
b) Histogram of the image in a)
c) Simple threshold, taking T = 0.169, η = 0.467
d) Otsu's method, taking T = 181, η = 0.944.
Adaptive Thresholding
In this method, we can decide different threshold values for different sections of
the image. It is mainly used for images having different backgrounds or
properties across regions. Using this method, different lighting conditions can be
handled and threshold values can be decided adaptively.
Variable thresholding: T can change over the image.
- Local or regional thresholding, if T depends on a neighbourhood of (x, y).
- Adaptive thresholding, if T is a function of (x, y).
- Multiple thresholding:
  g(x, y) = a, if f(x, y) > T2
  g(x, y) = b, if T1 < f(x, y) ≤ T2
  g(x, y) = c, if f(x, y) ≤ T1
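A minimal OpenCV sketch of adaptive thresholding, where each pixel is compared against a threshold computed from its own neighbourhood, is shown below; the file name, block size and constant C are assumptions.

import cv2

img = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

# Threshold each pixel against the mean of its 11x11 neighbourhood, minus a constant C = 2
binary_mean = cv2.adaptiveThreshold(
    img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Same idea with a Gaussian-weighted neighbourhood
binary_gauss = cv2.adaptiveThreshold(
    img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)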
After thresholding-based segmentation, the next method is edge-based segmentation, which is discussed below.
2. Edge-Based Segmentation
In this method, the objects are identified based on edge detection. The edge
detection is done based on pixel properties like texture, contrast, colour,
saturation, intensity, etc. The results of edge-based image segmentation are shown
in Figure 9.
The Laplacian is rarely used on its own for edge detection because it is very
sensitive to noise.
The Laplacian-of-Gaussian (LoG) uses a Gaussian filter to blur the image and a
Laplacian to enhance edges.
Algorithm
- Convolve the image with a two-dimensional Gaussian function.
- Compute the Laplacian of the convolved image → L.
- Identify edge pixels as those for which there is a zero-crossing in L.
A radially-symmetric 2D Gaussian is
G(r) = e^(−r²/2σ²), where r² = x² + y².
The Laplacian of this is
∇²G(r) = ((r² − 2σ²)/σ⁴) e^(−r²/2σ²).
- Example
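As an illustration, a small SciPy sketch of the LoG procedure (smooth with a Gaussian, take the Laplacian, mark zero-crossings) is given below; the value of sigma is an assumption.

import numpy as np
from scipy import ndimage

def log_edges(img, sigma=2.0):
    # Gaussian smoothing followed by the Laplacian (done in one pass)
    log = ndimage.gaussian_laplace(img.astype(float), sigma=sigma)
    # Edge pixels are where the LoG response changes sign (zero-crossings)
    edges = np.zeros(log.shape, dtype=bool)
    edges[:-1, :] |= (log[:-1, :] * log[1:, :]) < 0   # sign change between vertical neighbours
    edges[:, :-1] |= (log[:, :-1] * log[:, 1:]) < 0   # sign change between horizontal neighbours
    return edges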
The Canny edge detector is based on the fact that, for edge detection, there is a
trade-off between noise reduction (smoothing) and edge localisation.
Algorithm
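The detailed steps (Gaussian smoothing, gradient computation, non-maximum suppression and hysteresis thresholding) are implemented in OpenCV; a minimal sketch, with arbitrarily chosen hysteresis thresholds, is:

import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
edges = cv2.Canny(img, 100, 200)                      # lower and upper hysteresis thresholds (assumed values)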
Region-based algorithms identify groups of pixels with specific characteristics,
starting either from a small section or a larger portion of the input image,
called the seed point. The algorithm then adds more pixels to, or shrinks, the
region based on specific characteristics of the pixels around all the seed points.
Thus, we get a segmented image. It is further classified into Region Growing and
Region Splitting methods.
Figure 10
https://www.doc.ic.ac.uk/~dfg/vision/v02d01.gif
(Source: Internet)
● Region Growing
Figure. 11
https://towardsdatascience.com/image-segmentation-part-2-8959b609d268
Source: Internet
● Region Splitting and Merging
This algorithm focuses on splitting and merging portions of the image. It splits
the image based on attributes and then merges regions based on similar attributes.
While splitting, the whole image is considered, whereas region growing concentrates
on specific points. This algorithm is also called a split-merge algorithm.
Figure 12
https://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/ImageProcessing-html/topic3.htm
Pictorial representation of how the Split and Merge algorithm works
https://towardsdatascience.com/image-segmentation-part-2-8959b609d268
Figure 13
|Zmax − Zmin| ≤ threshold
Zmax — maximum pixel intensity value in a region; Zmin — minimum pixel intensity value in a region
Some properties that must be followed in region based segmentation:
- Completeness: the segmentation must be complete, i.e., ∪ Ri = R; every pixel must be in a region.
- Connectedness: the points of a region must be connected.
- Disjointness: regions must be disjoint: Ri ∩ Rj = Ø, for all i ≠ j.
- Satisfiability: pixels of a region must satisfy at least one common property P, i.e., P(Ri) = TRUE for all i; any region must satisfy a homogeneity predicate P.
- Segmentability: different regions satisfy different properties, i.e., P(Ri ∪ Rj) = FALSE; any two adjacent regions cannot be merged into a single region.
Example: Segment the image given by the split and merge algorithm.
Figure 14
3. Neural Networks for Segmentation
Neural networks have become very popular for image segmentation tasks as they not only
provide automation, but also provide robust specificity and sensitivity in the developed
segmentation masks. Researchers have introduced and analyzed convolutional neural
networks, generative adversarial networks, deep belief networks, extreme learning
machines, etc., to perform excellent image segmentation for various applications in
healthcare, traffic monitoring, satellite visualization, bakery, plant diseases, etc.
Segmentation through neural networks, especially a convnet, is done by generating a
feature map for the input image data. Then a region based filter is applied to generate
the mask according to one's objectives and applications. Bounding boxes play a very
important role in image segmentation. They can be generated through various techniques
and consist of coordinates of the segmented part.
A great example of a neural network for image segmentation has been released by the
experts at Facebook AI Research (FAIR), who created a deep learning architecture called
Mask R-CNN which can produce a pixel-wise mask for every object present in an image.
It is an enhanced version of the Faster R-CNN object detection architecture. Faster
R-CNN produces two pieces of data for every object in an image: the bounding box
coordinates and the class of the object. With Mask R-CNN, you get an additional output:
Mask R-CNN outputs the object mask after performing the segmentation.
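As an illustration (not a reproduction of FAIR's code), a pretrained Mask R-CNN can be run with a recent torchvision roughly as follows; the image path and score threshold are assumptions.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Mask R-CNN pretrained on COCO (Faster R-CNN backbone plus a mask head)
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("street.jpg").convert("RGB"))   # hypothetical input image
with torch.no_grad():
    out = model([img])[0]                 # dict with boxes, labels, scores and masks

keep = out["scores"] > 0.5                # keep confident detections (assumed threshold)
boxes = out["boxes"][keep]                # bounding-box coordinates per object
labels = out["labels"][keep]              # class of each object
masks = out["masks"][keep] > 0.5          # pixel-wise binary mask per object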
a) Discontinuity
b) Similarity
Images are partitioned based on the similarity of the gray level values of the
pixels according to a pre-defined criterion. This means that as long as pixels have
gray values close to each other, they are considered to be in the same
segmentation block. Thresholding, region based clustering and matching based
segmentation techniques fall in this category.
ii) Region Based Segmentation: Edge based techniques find the object
boundaries and then locate the objects, whereas region based techniques
look for uniformity within a sub-region based on a suitable property like
intensity, colour, texture, etc. Region based segmentation starts in the
middle of an object and then 'grows' outwards till it meets the object
boundary.
Try an exercise.
(a)               (b)
51 52 53 59       50 53 150 160
54 52 53 62       51 53 150 180
50 52 53 68       58 55 154 170
55 52 53 55       54 56 156 155
Fig. 15: 2 Sub-Images
Different edge models have been defined based on their intensity profiles. Fig.
16(a) shows a ‘step’ edge which involves transition between two intensity
values over a distance of one pixel. This is an ideal edge where no additional
processing is needed for identification. A ‘ramp’ edge is shown in Fig. 16(b),
where the transition between two intensity levels takes place over several
pixels. In practice, all the edges get blurred and noisy because of focusing
limitations and inherent noise present in electronic components. A ‘point’ (Fig.
16(c)) is defined as only one or two isolated pixels having different gray level
values as compared to its neighbours. Whereas a roof edge (Fig. 16(d)) is
defined as multiple pixels having same or similar gray level values which are
different from their neighbours.
Fig.17: 3 × 3 mask
Now, before we discuss the edge detection approaches, let us discuss line
detection.
(a) Horizontal (w1):    (b) Vertical (w2):    (c) +45° (w3):    (d) −45° (w4):
-1 -1 -1                -1  2 -1              -1 -1  2            2 -1 -1
 2  2  2                -1  2 -1              -1  2 -1           -1  2 -1
-1 -1 -1                -1  2 -1               2 -1 -1           -1 -1  2
Let g1, g2, g3 and g4 be the responses of each of these masks from left to right
respectively. If |g1| > |gj| for j = 2, 3, 4, then that point is more likely to be
associated with a horizontal line. Consider the electronic circuit shown in Fig. 19(a);
the results of applying the masks of Fig. 18(a) to Fig. 18(d) on this circuit are
shown in Fig. 19(a) to Fig. 19(g).
and if the intensity difference between the 5th and 6th pixels is higher, as in Fig. 21(b),
it would not be easy to identify the location of the edge precisely. As the
edges are corrupted by noise, we observe multiple edges instead of a single
edge.
(a) (b)
Fig. 20: First and second derivative of a) light strip on dark background b)
dark strip on light background
(a) (b)
Fig.21: Example of Edge Detection
There are many edge detection methods; broadly, we can classify them into two
categories: first order gradient based search methods, and Laplacian based zero
crossing methods. The first order derivative based search methods detect edges
by computing the gradient magnitude and then searching for local directional
maxima of the gradient magnitude. The zero crossing based methods search for
a zero crossing in a second order derivative computed from the image to find
edges.
Magnitude of the gradient is mag(∇f) = (Gx² + Gy²)^(1/2).
i) Prewitt Operator
It uses 3×3 masks that approximate the first derivative. The x direction mask
and the y direction mask are shown in Fig. 22. The approach used is
Gx = (z7 + z8 + z9) − (z1 + z2 + z3)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7)
x direction:       y direction:
-1 -1 -1           -1  0  1
 0  0  0           -1  0  1
 1  1  1           -1  0  1
Fig. 22:Prewitt Operator
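A small OpenCV sketch applying the two Prewitt masks of Fig. 22 and combining the responses into the gradient magnitude; the input file name is an assumption.

import cv2
import numpy as np

kx = np.array([[-1, -1, -1],
               [ 0,  0,  0],
               [ 1,  1,  1]], dtype=np.float32)    # x-direction Prewitt mask
ky = np.array([[-1,  0,  1],
               [-1,  0,  1],
               [-1,  0,  1]], dtype=np.float32)    # y-direction Prewitt mask

img = cv2.imread("flower.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)   # hypothetical input
Gx = cv2.filter2D(img, -1, kx)                     # response to the x mask
Gy = cv2.filter2D(img, -1, ky)                     # response to the y mask
mag = np.sqrt(Gx**2 + Gy**2)                       # gradient magnitude (Gx^2 + Gy^2)^(1/2)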
Masks for detecting oblique (±45°) edges (Prewitt forms on the left, Sobel forms on the right):
-1 -1  0      0  1  1      0  1  2     -2 -1  0
-1  0  1     -1  0  1     -1  0  1     -1  0  1
 0  1  1     -1 -1  0     -2 -1  0      0  1  2
Two Laplacian masks are shown in Fig. 25. For Fig. 25(a), the Laplacian
equation is
∇²f = 4z5 − (z2 + z4 + z6 + z8)
The centre coefficients of these operators are positive and the outer coefficients
are negative. The sum of all the coefficients is zero.
 0 -1  0        -1 -1 -1
-1  4 -1        -1  8 -1
 0 -1  0        -1 -1 -1
(a)             (b)
The first derivative of the Gaussian filter is H′(r) = −(1/σ²) r e^(−r²/2σ²).
The second derivative of the Gaussian filter is H″(r) = (1/σ²)(r²/σ² − 1) e^(−r²/2σ²).
In two dimensions, H(x, y) = c ((x² + y² − 2σ²)/σ⁴) e^(−(x²+y²)/2σ²).
A 5 × 5 LoG mask is given in Fig. 26. Due to its shape, the LoG is also known as the
'Mexican hat'. Computing the second derivative in this way is robust and efficient.
Zero crossings are obtained at r = ±σ.
 0  0 -1  0  0
 0 -1 -2 -1  0
-1 -2 16 -2 -1
 0 -1 -2 -1  0
 0  0 -1  0  0
Fig. 27(a) is the input image, Fig. 27(b) is the output of the Prewitt filter, Fig. 27(c)
is the output of the Roberts filter, and Fig. 27(d), Fig. 27(e) and Fig. 27(f) are the
outputs of the Laplacian, Canny and Sobel filters respectively. As is clear from the
figures, each filter extracts different edges in the image. The Laplacian and Canny
filters extract a lot of inner detail, while the Sobel and Roberts filters extract only
the boundary. The Prewitt filter extracts the entire boundary of the flower without
any gaps in the boundary.
Fig.27
Try the following exercises.
We can represent any line using equation (2), where θ ∈ [0°, 360°) and r ≥ 0.
a) Hough Transform (HT): The main constraint for any image processing
algorithm is the amount of data. We need to reduce the data while preserving the
relevant information about objects. Edge detection can do this work effectively, but
the output of an edge detector alone cannot identify lines. The HT was initially
developed for line detection and was later extended to shape detection too.
In the Hough transform we use the parametric (normal) form to describe lines, i.e. ρ
= r cos(θ) + c sin(θ), where ρ is the normal distance of the line from the origin,
and θ is the angle that the normal to the line makes with the positive direction
of the x-axis.
The Hough space thus has two parameters, θ and ρ, and a line is represented by a
single point defined by the values of these two coordinates.
https://towardsdatascience.com/lines-detection-with-hough-transform-84020b3b1549
Figure 30: Intersection point in Hough Space (Source: Internet)
The dot in the Hough Space represents that the line exists and is identified by
the θ and ρ values.
Edge points produce cosine curves in the Hough Space. A cosine curve can be
generated by mapping each edge point from an edge image onto the Hough
Space. If two edge points lie on the same line, their corresponding cosine
curves will intersect each other at a specific (ρ, θ) pair (intersection points in
Figure 30).
Mapping all the edge points from an edge image onto the Hough Space, will
generate a number of cosine curves.
1. The first step is to decide the range for ρ and θ. Usually, the range of θ
is [0, 180] degrees and ρ is [−d, d], where d is the length of the edge
image's diagonal.
2. Then create a 2D accumulator array indexed by (ρ, θ), calculate
r cos(θ) + c sin(θ) for each edge pixel (r, c) and for each value of θ, and
increment the corresponding cells of the array.
3. Finally, take the highest values in the array. These correspond to the
strongest lines in the image and can be converted back to the y = ax + b form.
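A compact NumPy sketch of these three steps (range selection, accumulator voting and peak picking) might look as follows; the number of angle bins and the number of peaks kept are assumptions. OpenCV provides a ready-made implementation in cv2.HoughLines.

import numpy as np

def hough_lines(edge_img, n_theta=180, n_peaks=5):
    rows, cols = np.nonzero(edge_img)                   # edge pixel coordinates (r, c)
    d = int(np.ceil(np.hypot(*edge_img.shape)))         # diagonal length, so rho lies in [-d, d]
    thetas = np.deg2rad(np.arange(n_theta))             # theta in [0, 180) degrees
    acc = np.zeros((2 * d + 1, n_theta), dtype=int)     # 2-D accumulator over (rho, theta)

    for r, c in zip(rows, cols):                        # one vote per (rho, theta) cell
        rhos = np.round(r * np.cos(thetas) + c * np.sin(thetas)).astype(int)
        acc[rhos + d, np.arange(n_theta)] += 1

    # The highest accumulator values correspond to the strongest lines
    peaks = np.argsort(acc.ravel())[::-1][:n_peaks]
    rho_idx, theta_idx = np.unravel_index(peaks, acc.shape)
    return list(zip(rho_idx - d, np.rad2deg(thetas[theta_idx])))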
Hough transform: The Hough transform is an incredible tool that lets you identify lines, and
not just lines but other shapes as well.
Example: Using the Hough transform, show that the points (1,1), (2,2) and (3,3) are collinear,
and find the equation of the line.
Solution: The equation of a line is y = mx + c. To perform the Hough transform we map each
point from the (x, y) plane to a line in the (m, c) plane, c = −xm + y.
Step 1:
For (1,1): c = −m + 1. If c = 0 then m = 1; if m = 0 then c = 1, so the line passes through
(m, c) = (1, 0) and (0, 1).
Similarly for the other points:
For (x, y) = (2, 2): c = −2m + 2, passing through (1, 0) and (0, 2).
For (x, y) = (3, 3): c = −3m + 3, passing through (1, 0) and (0, 3).
Step 2: All three lines intersect at the point (m, c) = (1, 0). Hence the points are collinear,
with m = 1 and c = 0, and the line is y = x.
Convolution masks are used to detect lines in this technique. Basically, there
are 4 different varieties of convolution masks: horizontal, vertical, oblique
(+45 degrees) and oblique (−45 degrees).
Lines are detected using equation (3), i.e., from the responses obtained after
convolving these masks with the image.
R(x, y) = max(|R1 (x, y)|, |R2 (x, y)|, |R3 (x, y)|, |R4 (x, y)|) (3)
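A brief sketch of equation (3): convolve the image with the four standard line masks and keep, at every pixel, the largest absolute response. The input file name is an assumption.

import cv2
import numpy as np

masks = [
    np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]], np.float32),   # horizontal
    np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]], np.float32),   # vertical
    np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]], np.float32),   # oblique +45 degrees
    np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]], np.float32),   # oblique -45 degrees
]

img = cv2.imread("circuit.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # hypothetical input
responses = [np.abs(cv2.filter2D(img, -1, k)) for k in masks]
R = np.max(responses, axis=0)          # R(x, y) = max(|R1|, |R2|, |R3|, |R4|)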
A. The mean of the grey value of pixels of an image and the mean of the
grey value of pixels of a specific object in the image will be different
B. The standard deviation of the grey value of pixels belonging to a
specific object in the image will lie within a specific range.
C. The texture of the pixels of an object in the image will have a unique
property
But the connection between regions and objects is not perfect due to
segmentation errors. Therefore, we need to apply object-specific knowledge in
later stages for image interpretation.
The similarity and proximity concepts are based on the idea that points
on the same object will project to pixels in the image that have similar
grey values. We can also assume to group pixels in the image and then
employ domain-dependent information to match areas to object models.
In simple cases, thresholding and component labelling can be used to
segment data.
Figure 32
Imperfect images and imperfect methods can cause the result of object detection
using a boundary based approach to differ from that of a region based approach.
Region based methods rely on the assumption that neighbouring pixels within
one region have similar values. The common approach is to compare a pixel with
its neighbours. A pixel belongs to the same cluster if the pixel and its neighbours
satisfy the same similarity criterion.
Region based segmentation methods employ two basic steps:
a) Merging
b) Splitting
The initial segmentation can be all individual pixels. The basic idea is to
combine two pixels (regions) if they are similar. The similarity criteria can be
based on grey level similarity, texture of the segment, etc.
Image segmentation using splitting has the following steps:
(a) Start of Growing a Region (b) Growing Process After a Few Iterations
A region growing algorithm for a single seed point, using the same grey level value
as the similarity measure to merge pixels, is as follows:
The algorithm uses a stack data structure to keep track of seed points. Two
operations, push and pop, are used here. Push: puts the pixel coordinates on the
top of the stack. Pop: takes a pixel from the top of the stack. Input to the
algorithm is an image f. The initial seed has coordinates (x, y) and grey level
value g at (x, y), i.e. f(x, y) = g. The goal is to grow a region with all pixels
having grey level value g and to assign them a new grey level k, k ≠ g. Let (x, y)
be the coordinates of the initial seed, and let (a, b) be the coordinates of the pixel
under investigation.
The algorithm
Push (x, y)
Do till stack is not empty:
    Pop (a, b)
    If f(a, b) = g, set f(a, b) = k and push the four neighbours
    (a−1, b), (a+1, b), (a, b−1), (a, b+1)
End
The similarity criterion can also be a grey-level range, g1 ≤ f(a, b) ≤ g2.
A major disadvantage of region growing is that the seed points are user
dependent. Selection of wrong seed points can lead to wrong segmentation
results. The algorithm is highly iterative and requires high computational time
and power.
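A minimal Python sketch of the stack-based region growing procedure described above (4-connectivity, exact grey-level match g, relabelling grown pixels to k):

import numpy as np

def region_grow(f, seed, k):
    f = f.copy()
    g = f[seed]                                   # grey level of the initial seed (x, y)
    stack = [seed]                                # Push(x, y)
    while stack:                                  # do till stack is not empty
        a, b = stack.pop()                        # Pop the pixel under investigation
        if 0 <= a < f.shape[0] and 0 <= b < f.shape[1] and f[a, b] == g:
            f[a, b] = k                           # merge (a, b) into the region, relabel to k (k != g)
            stack += [(a - 1, b), (a + 1, b), (a, b - 1), (a, b + 1)]   # push the 4-neighbours
    return f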
Example 1: In the image segment given in Fig. 34(a), seed points are given at
(3, 2) and (3, 4). The similarity criterion is the grey level difference. Find the
segmented image if a) T = 3 and b) T = 8.
Solution: For T = 3, region growing starts with pixel (3, 2). All the pixels
having grey level difference < 3 are assigned 'a' and denoted as region R1.
Another region growing starts at (3, 4). All pixels with grey level difference < 3
are assigned 'b' and denoted as region R2. The output is shown in Fig. 34(b). For
T = 8, all the pixels have grey level difference less than 8, so only one region
is formed, with all pixels being assigned 'a'. The output is shown in Fig. 34(c).
1 2 3 4 5
1 0 0 5 6 7
2 1 1 5 8 7
3 0 1 6 7 7
4 2 0 7 6 6
5 0 1 5 6 5
(b) T = 3:         (c) T = 8:
a a b b b          a a a a a
a a b b b          a a a a a
a a b b b          a a a a a
a a b b b          a a a a a
a a b b b          a a a a a
 R1  R2
***
Example 2: Use region growing to segment the object in the image given in
Fig. 20. The seed is the centre pixel of the image. The region is grown in the
following directions: a) horizontally and vertically; b) horizontally, vertically
and diagonally. The similarity criterion is that the difference between two pixel
values is less than or equal to 5.
10 10 10 10 10 10 10
10 10 10 69 70 10 10
59 10 60 64 59 66 60
10 59 10 60 70 63 62
10 60 59 65 67 10 65
10 10 10 10 10 10 10
10 10 10 10 10 10 10
Solution: a) Region growing starts with the seed point pixel with grey value 60
in the centre. It moves horizontally (left and right) and vertically (up and down)
to check how much a given pixel value differs from 60. If the difference is less
than or equal to 5, the pixel is assigned 'a' and merged with the region; else it
is assigned 'b'. Fig. 36(a) shows the output.
b) If diagonal elements are also included then the region grows more as
shown in Fig. 36(b).
(a)                  (b)
b b b b b b b        b b b b b b b
b b b b b b b        b b b b b b b
b b a a a b b        a b a a a b a
b a b a b b b        b a b a b a a
b a a a b b b        b a a a b b a
b b b b b b b        b b b b b b b
b b b b b b b        b b b b b b b
We start by splitting the entire image into four quadrants. The object can be in
any, some, or all of the four quadrants, as the object can be anywhere in the
image.
The split and merge algorithm uses a 'quad tree' for representing segments. Fig. 38
shows that there is a one to one relation between splitting an image recursively
into quadrants and the corresponding quad tree representation. However, there
is a limitation in this representation, as we cannot model the merging of two
segments at different levels of the pyramid.
Let R represent the entire image. A homogeneity criterion H is selected. If
the region R is not homogeneous (H(R) = false), then split R into four quadrants
R1, R2, R3, R4. Any four regions with the same parent can be merged into a
single homogeneous region if they are homogeneous.
The steps of the split and merge algorithm are as follows:
Step 1: Define an initial segmentation into regions, a homogeneity criterion
and a pyramid data structure.
Step 2: If any region R in the pyramid data structure is not
homogeneous (H(R) = False), split it into four child regions.
Step 3: When no further splitting is possible, merge any two adjacent regions
Ri and Rj which are homogeneous (H(Ri ∪ Rj) = True).
Step 4: Stop when no further merging is possible.
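A condensed recursive sketch of the splitting phase (Steps 1 and 2), using the |Zmax − Zmin| ≤ threshold homogeneity predicate of Fig. 13, is given below; the threshold value is an assumption and the merging phase (Step 3) is omitted for brevity.

import numpy as np

def is_homogeneous(region, threshold=10):
    # H(R): a region is homogeneous if |Zmax - Zmin| <= threshold
    return region.max() - region.min() <= threshold

def split(region, threshold=10):
    # Recursively split a region into quadrants until every leaf is homogeneous
    if is_homogeneous(region, threshold) or min(region.shape) <= 1:
        return [region]                                    # leaf of the quad tree
    h, w = region.shape
    quadrants = [region[:h // 2, :w // 2], region[:h // 2, w // 2:],   # R1, R2
                 region[h // 2:, :w // 2], region[h // 2:, w // 2:]]   # R3, R4
    leaves = []
    for q in quadrants:
        leaves += split(q, threshold)
    return leaves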
Solution: The image is divided into four quadrants. Fig. 39(b) shows the image
and its quad tree. Quadrants 2 and 3 are homogeneous, so no further splitting
is done there. Quadrants 1 and 4 are non-homogeneous, hence they are divided further
into 4 quadrants each. Fig. 39(c) shows the splitting and the corresponding quad tree.
Now only one quadrant, 12, is still non-homogeneous; hence it is further subdivided.
[Fig. 39(a)-(c): the image split into quadrants 1-4; quadrants 1 and 4 further split into sub-quadrants 11-14 and 41-44, with the corresponding quad tree]
(e)
Fig.39: Split and merge algorithm
Fig. 39(d) shows the segmented image and its final quad-tree
structure. Now all regions are homogeneous and hence no further splitting is
possible.
The sequence of b points found when the algorithm stops is the set of ordered
boundary points.
[Figure: binary object (1-pixels) with the starting points b0 and c0 marked for the boundary-following algorithm]
[Three binary grids showing the boundary-following procedure, with the points b0, c0, b and c marked]
(a) (b) (c)
Fig.41: Example of erroneous result of Boundary Detection Algorithm.
[Diagrams of the direction numbers for the 4-direction code (0-3) and the 8-direction code (0-7)]
(a) 4-direction chain code (b) 8-direction chain code
Fig.42: Direction Numbers
Images are acquired and processed in a grid format with equal spacing in the x
and y directions, so a chain code can be generated by following a boundary in the
clockwise direction and assigning a direction to the segment connecting every
pair of pixels. To avoid noise degradation and long chains, a resampling of the
image grid is commonly used to describe the boundary at a coarser level, as
shown in Fig. 43 (a). Fig. 43(b) shows resampled points and Fig. 43(c)
shows 8-direction chain code.
Fig.43
If we start from the topmost, leftmost corner, the chain code for Fig. 43(c) is
0766666453321212
The chain code depends on the starting point. To normalize it, we treat the code
as a circular sequence of direction numbers and redefine the starting point so
that the resulting sequence forms an integer of minimum magnitude. To account
for rotation, we use the first differences of the chain code instead of the code
itself.
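A short Python sketch of these two operations for a 4-directional code, computing the circular first difference as (c[i] − c[i−1]) mod 4:

def first_difference(code, n_dirs=4):
    # Circular first difference: (c[i] - c[i-1]) mod n_dirs, with i = 0 wrapping to the last element
    return [(code[i] - code[i - 1]) % n_dirs for i in range(len(code))]

def normalize_start(code):
    # Rotate the code so that the resulting sequence forms the integer of minimum magnitude
    return min(code[i:] + code[:i] for i in range(len(code)))

# For the first chain code example below:
# first_difference([0, 0, 3, 3, 2, 2, 1, 1, 0, 0, 3, 2, 2, 1])
# gives [3, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 3, 0, 3]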
Chain code: 0 0 3 3 2 2 1 1 0 0 3 2 2 1
Difference: 3 0 3 0 3 0 3 0 3 0 3 3 0 3
Chain code: 0 0 3 2 2 1 0 0 0 3 2 2 2 1
Difference: 3 0 3 3 0 3 3 0 0 3 3 0 0 3
4-directional codes are used in this example. First differences are computed by
treating the chain as a circular sequence.
***
Now, try an exercise.
E12) Find chain code and first difference of the following boundary shape.
6. Oriented FAST and Rotated BRIEF (ORB): This OpenCV algorithm uses the
FAST key-point detector and the BRIEF descriptor. It is an alternative to SIFT and
SURF.
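A minimal OpenCV sketch of ORB keypoint detection and description; the image path and feature count are assumptions.

import cv2

img = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)       # hypothetical input image
orb = cv2.ORB_create(nfeatures=500)                        # FAST keypoints + rotated BRIEF descriptors
keypoints, descriptors = orb.detectAndCompute(img, None)   # up to 500 keypoints, 32-byte descriptors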
D2-Net: It is a trainable CNN based local feature detector and dense feature
descriptor (the feature descriptor has minimum non-zero values).
2.
It's a fully convolutional neural network (FCNN) used for extracting feature
maps with a double purpose:
ii) Keypoint detection scores sij are calculated during training using a soft
local-maximum score and a ratio-to-maximum score for each descriptor,
and detections are generated by executing a non-local-maximum
suppression on a feature map.[2]
3. LF-Net: This approach uses training image pairs with relative pose and
depth maps
11.8 SUMMARY
In this unit, we have discussed the following:
1. Medical Imaging:
2. Satellite
3. Movement Detection:
4. Security and Surveillance:
5. License Plate Recognition (LPR)
6. Industrial Inspection and Automation
7. Robot Navigation
i. Discontinuity
E4) A typical edge may be the border between a block of red colour and a
block of yellow. An edge can also be the boundary between two regions
with relatively distinct gray-level properties.
E5) The 'sign' of the second derivative is used to determine whether an edge pixel
lies on the dark side or on the light side of an edge. The 'zero crossing' is at
the midpoint of a transition in gray level.
E6) The gradient components are Gx = ∂f/∂x and Gy = ∂f/∂y.
E7) The Laplacian is hardly used in practice for edge detection because it is very
sensitive to noise, and produces double edges. Edge direction is not
detected by the Laplacian. It is used to find the location of an edge using zero
crossing detection.
 0  0 -1  0  0
 0 -1 -2 -1  0
-1 -2 16 -2 -1
 0 -1 -2 -1  0
 0  0 -1  0  0
The histogram threshold is fixed as 32. Now the input image is divided
into two regions as follows:
2 2 2 2 2 1 1 1
2 2 2 2 2 1 1 1
1 1 2 2 2 2 2 2
1 2 2 2 2 1 2 2
2 2 2 2 2 2 1 1
2 2 2 2 2 2 1 1
1 2 1 2 2 2 2 2
1 1 2 2 2 2 2 2
E11) The advantage of using more than one seed is that better segmentation
of the image can be expected, since more seeds lead to more
homogeneous regions.
The drawback of using more than one seed is that the probability of
splitting a homogeneous region in two or more segments increases.
E12)
Chain code: 0 3 0 3 2 2 1 1
Difference: 3 3 1 3 3 0 3 0
REFERENCES
[1] https://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w9/DeTone_SuperPoint_Self-Supervised_Interest_CVPR_2018_paper.pdf
[2] https://openaccess.thecvf.com/content_CVPR_2019/papers/Dusmanu_D2-Net_A_Trainable_CNN_for_Joint_Description_and_Detection_of_CVPR_2019_paper.pdf
[3] https://www.semanticscholar.org/paper/D2-Net%3A-A-Trainable-CNN-for-Joint-Description-and-Dusmanu-Rocco/162d660eaaa1eb2144d8030102f3e6be1e80ce50
[4] https://web.ipac.caltech.edu/staff/fmasci/home/astro_refs/HoughTrans_lines_09.pdf
[5] https://www2.ph.ed.ac.uk/~wjh/teaching/dia/documents/edge-ohp.pdf
[6] https://en.wikipedia.org/wiki/Line_detection
[7] https://towardsdatascience.com/image-feature-extraction-traditional-and-deep-learning-techniques-ccc059195d04
[8] https://morioh.com/p/14d27a725a0e
    https://www.researchgate.net/figure/An-example-of-edge-based-segmentation-18_fig2_283447261