Clustering Techniques for Image Segmentation

Fasahat Ullah Siddiqui • Abid Yahya

Fasahat Ullah Siddiqui
CQUniversity
Melbourne, Victoria, Australia

Abid Yahya
Botswana International University of Science and Technology (BIUST)
Palapye, Botswana
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
I dedicate this book to my family.
Fasahat Ullah Siddiqui
I dedicate this book to my family for their
love, support, and sacrifice along the path of
my academic pursuits.
Abid Yahya
Preface
Chapter 4 discusses how the evaluation methods work and provides practical knowledge of the existing evaluation methods. The hard and soft clustering techniques have membership functions whose chief objective is to converge the final solution at the global optimum. This chapter discusses the existing quantitative analysis methods used to demonstrate the segmentation performance of clustering techniques. The earliest quantitative analysis methods for clustering techniques generally used the MSE (mean square error), inter-cluster variation, and the VXB function, which measure the local cluster similarity only. Compared to these former methods, the three modern methods measure both the local cluster similarity and the global homogeneity of segmented images without any human interaction or predefined threshold settings.
This book arose from the research work conducted by the authors at Universiti Sains
Malaysia, Malaysia. The authors would like to express their special gratitude and
thanks to Universiti Sains Malaysia (USM); Botswana International University of
Science and Technology (BIUST); Karakoram International University (KIU),
Gilgit, Pakistan; University of Peradeniya, Sri Lanka; Sarhad University of Science
& Information Technology (SUIT), Pakistan; City University of Science &
Information Technology (CUSIT), Pakistan; and Shaheed Benazir Bhutto Women
University Peshawar for giving us such attention, time, and opportunity to publish
this book.
The authors would also like to take this opportunity to express their gratitude to
all those people who have provided invaluable help in the writing and publication of
this book.
The digital image processing system has three phases, that is, image processing, image analysis, and decision-making. Figure 1.1 illustrates the main steps of the digital image processing system. Several tasks can be performed in the image processing phase, depending on the computer vision application's requirements. In general, image enhancement, noise filtering, and image compression are implemented in the image processing phase to increase the image's quality and ease the subsequent phases of the digital image processing system. In the image analysis phase, image segmentation and image representation play a fundamental role before images are passed to the higher-level operations of the decision-making phase, such as image classification and image matching. The prime focus of this book is the clustering technique for image segmentation; interested readers can refer to other literature to learn the remaining steps of the digital image processing system in detail (Gonzalez et al., 2003; Shih, 2010).
Many of us confuse image classification with image segmentation, and some consider them similar because segmentation techniques, like clustering, are implemented for pattern recognition. In pattern recognition, clustering and classification are two significant techniques. The classification technique needs prior knowledge of class labels, whereas the clustering technique does not require such information. However, image segmentation and image classification are two different steps of the digital image processing system. Image classification's primary role is to recognize predefined objects in an image, while the role of image segmentation is limited to simplifying an image into homogeneous regions.
Several image segmentation techniques have appeared in the recent literature. Image segmentation techniques can be classified into four general categories: thresholding, clustering, edge-based segmentation, and region-based segmentation, as shown in Fig. 1.4. The remarkable modifications in the basic concepts of thresholding, clustering, first-order and second-order edge detection, region growing, and split-and-merge techniques are discussed in the following sections.
1.4 Thresholding
Fig. 1.5 Implementation of the thresholding technique for various threshold values (a) original
image, (b) histogram of an image, (c) resultant image if the threshold is set to 111, (d) resultant
image if the threshold is set to 157, and (e) resultant image if the threshold is set to 226
Based on the selected threshold value, the image pixels are grouped into two different regions. According to Sahoo et al., thresholding is mainly classified into local and global thresholding (Sahoo et al., 1988).
P-tile is the earliest existing method of global thresholding (Doyle, 1962). This method assumes that the object area's percentage (Pb%) in a gray scale image is known in advance and that the object is brighter than the background. In this case, the threshold must be set at the gray level that maps Pb% of the image pixels into the object and the remaining pixels into the background. This method will not work if the area of the object is unknown (Sahoo et al., 1988). Recently, this method has been combined with an edge detector to assist the thresholding process for images with an unknown object area (Samopa & Asano, 2009; Taghizadeh & Hajipoor, 2011).
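For illustration, the P-tile rule can be sketched as follows for an 8-bit gray scale image held in a NumPy array; the function name and the bright-object assumption are illustrative only, not code from the cited works.

```python
import numpy as np

def p_tile_threshold(image, p_object):
    """P-tile sketch: choose the gray level so that roughly p_object
    percent of the pixels (the assumed bright object) lie above it."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    target = image.size * p_object / 100.0  # desired object pixel count
    cum = 0
    # Scan from the brightest level downward until the accumulated
    # count reaches the target percentage.
    for t in range(255, -1, -1):
        cum += hist[t]
        if cum >= target:
            return t
    return 0
```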
The mode method finds peaks and valleys in the histogram by measuring the local minima between two peaks or modes (Sahoo et al., 1988). This method cannot be applied to images with unequal peaks (noisy images) or those with flat valleys (low-contrast images). Sezgin proposed a new method that finds the peaks and valleys by convolving the histogram function with a smoothing kernel (Sezgin, 2004). The gray levels at which peaks start, end, and attain their maxima are thereby estimated, and the histogram is reduced to a two-lobe function. The threshold must be set somewhere between the end of the first lobe and the start of the second lobe. Variations of this method have been proposed, where the cumulative distribution of the image histogram is first expanded in terms of Chebyshev polynomial functions and then subjected to curvature analysis (Boukharouba et al., 1985; Sezgin, 2004). Polynomial functions of different degrees are used in the curvature-fitting step, where the objective is to find the coefficients that best fit the curve to the data. The critical points of the resultant curve, that is, minima or zero points, are selected as threshold points. The Gaussian filter is also used for smoothing the
histogram. Tsai employed it with curvature analysis to find peaks and valleys. The instantaneous rate of change of angle is measured to locate the threshold point. However, this method only works well with an appropriate selection of the Gaussian filter's window size (Tsai, 1995). Large windows might over-smooth the histogram and skip the desired peaks.
On the other hand, more peaks than desired are obtained in the histogram if too small a window is selected. Olivo considered a multi-scale analysis of the probability mass function, using the wavelet of the smoothed histogram, which is the second derivative of the smoothing function, to locate the threshold. This threshold is then adjusted using a coarse-to-fine approach (Olivo, 1994; Sezgin, 2004): the adjustment starts from a threshold at the lowest resolution, links the corresponding thresholds at higher resolutions, and updates their locations backward (Olivo, 1994). The main disadvantage of these peak-and-valley finding methods is their disregard for spatial information (Hemachander et al., 2007).
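The smoothing-kernel idea can be sketched as below, assuming a 256-bin histogram of an 8-bit image; the Gaussian width sigma is a free parameter, and the fallback value for a valley-free histogram is arbitrary.

```python
import numpy as np

def smoothed_valley_threshold(image, sigma=3.0):
    """Smooth the histogram with a Gaussian kernel, then threshold at
    the deepest local minimum (valley) of the smoothed curve."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    # Build a normalized 1-D Gaussian kernel.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    smooth = np.convolve(hist.astype(float), kernel, mode='same')
    # A valley is lower than both of its neighbors.
    valleys = [i for i in range(1, 255)
               if smooth[i] < smooth[i - 1] and smooth[i] < smooth[i + 1]]
    return min(valleys, key=lambda i: smooth[i]) if valleys else 128
```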
Another method automatically finds the threshold by analyzing the histogram's concavity through the construction of its convex hull, the smallest polygon envelope containing the histogram. The concavity is measured as the difference between the polygon envelope and the histogram heights, and the threshold must be set at the histogram's deepest concavity (Rosenfeld & De La Torre, 1983). A variation of this method has been proposed, where the convex hull is constructed through the exponential of the histogram. The exponential characteristic of this method can indicate the upper concavities more precisely (Whatmough, 1991). Figure 1.6 illustrates the convex hull construction and exponential convex hull.
Fig. 1.6 The convex hull and the exponential convex hull
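A sketch of the concavity analysis, assuming a 256-bin histogram; the monotone-chain construction used here is one standard way to build the upper convex hull and is not necessarily the exact procedure of Rosenfeld and De La Torre.

```python
import numpy as np

def concavity_threshold(hist):
    """Build the upper convex hull of the histogram and threshold at
    the deepest concavity (hull height minus histogram height)."""
    h = np.asarray(hist, dtype=float)
    hull = []  # upper hull vertices as (gray level, height) pairs
    for i in range(len(h)):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle vertex if the new point lies above the chord.
            if (x2 - x1) * (h[i] - y1) - (i - x1) * (y2 - y1) >= 0:
                hull.pop()
            else:
                break
        hull.append((i, h[i]))
    xs, ys = zip(*hull)
    hull_heights = np.interp(np.arange(len(h)), xs, ys)
    return int(np.argmax(hull_heights - h))
```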
Fig. 1.7 Approximations of the histogram (a) at the first iteration and (b) after some iterations
In this approach, the histogram data undergoes clustering analysis. The given dataset is divided into two clusters with an initial parameter set based on an assumption or guess; here, the initial parameters may be the threshold position and the cluster class values. Based on the assumption that the optimum location is near the initially defined value, the clustering analysis searches for the optimum location. It stops if the threshold position remains unchanged during the analysis or if the minimum class variance has been attained. Other objective functions are used in different clustering analysis methods to search for the optimal threshold position in the histogram data. However, clustering has some severe problems: it is sensitive to initialization, a class centroid (the class-representing point) may not be updated during the process, and it is sensitive to outliers or noise. Moreover, these problems are interrelated and are discussed in detail in the subsequent chapter.
In early iterative thresholding methods, the initial threshold is set by assuming that the four corner pixels belong to the background and the remainder are object pixels. The class mean values are then measured for the foreground (object) and background regions, respectively. The threshold value is updated iteratively by calculating the average of both class means (Ridler & Calvard, 1978). Here, the objective function is similar to the k-means objective (i.e., it minimizes the class variance). This threshold updating process stops when the error between the current and previous thresholds is very small. On completion, this method ensures that the Gaussian mixture distribution of the histogram is grouped into two distinct classes with the lowest possible class variance, and the threshold is set between them. Trussell generalized this method by proposing a formula to compute the new threshold value (Trussell, 1979):
$$T_k = \frac{\sum_{g=0}^{T_{k-1}} g \cdot n(g)}{2\sum_{g=0}^{T_{k-1}} n(g)} + \frac{\sum_{g=T_{k-1}+1}^{N_g} g \cdot n(g)}{2\sum_{g=T_{k-1}+1}^{N_g} n(g)} \qquad (1.2)$$
where g is the gray level, n(g) is the number of pixels at gray level g, N_g is the total number of gray levels, and T_k is the threshold point at iteration k.
The summation of these two quantities defines the new threshold value (Trussell, 1979). In other studies, the threshold is set by calculating the midpoint of the two peaks. Initially, the midpoint is measured as the average of the pixel with the maximum value and the pixel with the lowest value. This midpoint is updated in later iterations using the means of the two class peaks (Sezgin, 2004; Yanni & Horne, 1994).
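A sketch of the iterative selection loop; for brevity, the initial guess here is the global mean rather than the four-corner heuristic described above.

```python
import numpy as np

def iterative_threshold(image, eps=0.5):
    """Ridler-Calvard style iteration: the threshold is updated to the
    average of the two class means until it stabilizes."""
    t = image.mean()  # simplified initial guess
    while True:
        fg, bg = image[image > t], image[image <= t]
        m_fg = fg.mean() if fg.size else t  # guard against empty classes
        m_bg = bg.mean() if bg.size else t
        t_new = 0.5 * (m_fg + m_bg)
        if abs(t_new - t) < eps:
            return t_new
        t = t_new
```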
Lloyd (1985) proposed a method that assumes equal-variance Gaussian density functions for the two regions and minimizes the total misclassification error iteratively (Lloyd, 1985). The process starts with a guess of the optimal points, and the final solution is obtained iteratively by minimizing the error of the Gaussian functions. The threshold value is calculated by taking the average of the two class means. Relaxing the equal-variance assumption, a new Gaussian density function was introduced that minimizes the total misclassification error by fitting the Gaussian model to the histogram such that the histogram is clustered into two lobes (two normal distributions) with smaller overlap (Kittler & Illingworth, 1986). However, the distributions' tails are truncated, and this biases the fitted model's mean and variance (Cho et al., 1989). This bias becomes noticeable when the two distributions are not distinguishable.
The Otsu method begins with a guess of the final solution points. It searches iteratively for the optimal threshold point by maximizing the weighted variance between the background and foreground classes (Otsu, 1975). The class weights are measured using probability theory. The idea is that the variance between the classes (separability) is maximized by minimizing the within-class variance (similarity). The threshold is selected as the optimal threshold of the histogram data when the desired maximum between-class variance is obtained. Unfortunately, noise always occurs in practical applications and affects this method's accuracy (Lang et al., 2008). The one-dimensional histogram is insufficient to overcome this problem; therefore, the two-dimensional (2D) Otsu method was proposed (Jianzhuang et al., 1991). The 2D histogram presents the original image pixel distribution on one axis and the average neighborhood image on the other. The resultant threshold therefore becomes a vector quantity, which improves the segmentation results; however, the computational cost increases. The diagonal areas of the 2D histogram represent the background and foreground, and most of the time is consumed calculating the triangle areas. Lang et al. represent the 2D histogram with three integral images (i.e., the pixel number integral image, the original image intensity integral image, and the average intensity integral image). Instead of using the vector value (i.e., two values) in all calculations, the three integral image values are directly substituted to calculate only the mean value (Lang et al., 2008).
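A minimal sketch of the one-dimensional Otsu search over a 256-bin histogram; the exhaustive scan favors clarity over speed.

```python
import numpy as np

def otsu_threshold(image):
    """Otsu: pick the threshold maximizing the between-class variance
    w0 * w1 * (mu0 - mu1)^2 of the gray-level histogram."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist / hist.sum()          # gray-level probabilities
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()      # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * p[:t]).sum() / w0  # class means
        mu1 = (levels[t:] * p[t:]).sum() / w1
        var_b = w0 * w1 * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t
```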
Fuzzy clustering with soft membership can reduce the initialization problem. Starting from assumed optimal points, the data are distributed partially to the classes, and new optimal points are calculated as the weighted (membership) sum of the data. This continues until the difference between the membership functions of the current and previous iterations is minimized, and the resulting optimal points are taken as the desired points. On the other hand, Jawahar et al. formulate the distance function according to the Kittler et al. minimization function. It assumes that the optimal threshold point can be found by considering the normal distributions of object and background in a histogram (Jawahar et al., 1997). The memberships for background and object are calculated by randomly setting the initial threshold value, computing the mean value of both regions, and updating the memberships according to a new function derived from the Kittler et al. minimization function based on the Gaussian function. This iterative process continues until there is no appreciable change in the regions' memberships. In another approach, a weight based on the neighborhood is calculated (Yong et al., 2004), on the premise that a pixel's intensity level combined with neighborhood information is less prone to the noise effect. After initializing the memberships and partially assigning the gray levels to the clusters, the probability of a gray level along the neighboring clusters is calculated as an additional weight. The new degree is measured by multiplying the weight by the membership value. The process continues until a very small change in the new degree is obtained, and the threshold point is finally set at the midpoint of the clusters after calculating the cluster means.
Pal & Rosenfeld (1988) highlighted the improper selection of the threshold position in an image's fuzzy region. They argued that the threshold is the point or boundary where the data is segmented in a crisp way (1 or 0), and they regraded the fuzziness levels into 0 and 1 using subset theory (Pal & Rosenfeld, 1988). A cross-over point with a value of 0.5 is defined for a specific region of the histogram. The crisp version of an image is obtained by setting the membership to 1 if the gray level in the region or window exceeds the cross-over point, and to 0 otherwise. The fuzziness index is the average difference between the gray levels and the obtained binary version. By varying the cross-over point over the gray levels in the window, different fuzziness indexes are calculated, and the cross-over point at which the fuzziness index is minimized is selected as the optimal threshold. However, this geometrical method has no theoretical justification for choosing the constant that selects the bandwidth of the region or window. Murthy et al. were the first to choose the constant's value theoretically to define the region (Murthy & Pal, 1990). They set the known maximum value of subset "0" as the minimum limit and the known minimum value of subset "1" as the region's maximum limit; their average is therefore equal to the cross-over gray level. In another study, Huang & Wang (1995) measure the fuzziness index using the mean or median operator. The fuzziness index is calculated with the assumption that set "0" has no elements and the other set has them all, and the threshold starts at gray level 0 of the histogram. The threshold point is shifted iteratively, and the fuzziness index (within the range of 0.5 to 1) is calculated at each step. The point on the histogram where the fuzziness index is minimized is selected as the optimum threshold point. In addition, a fuzzy range is defined for measuring a fuzziness index equal to or less than a tolerance; this ensures that the threshold lies on the deepest valley.
In the entropy view, the histogram data can be converted into a binary stream by exploiting the redundancies in the statistical distribution of the gray levels (l) to reduce the size of the stream as much as possible. Entropic thresholding-based methods maximize the entropy of the thresholded image. Assuming the prior entropy of the gray-level histogram, Pun determined the optimal threshold by maximizing the posterior entropies associated with a gray scale image's bright and dark pixels (Pun, 1981; Sahoo et al., 1988):
$$H_b = -\sum_{i=0}^{t} p_i \log_2 p_i \qquad (1.4)$$

$$H_d = -\sum_{i=t+1}^{l-1} p_i \log_2 p_i \qquad (1.5)$$

$$H = H_b + H_d \qquad (1.6)$$
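A sketch of the entropy-maximizing search follows. Note that the class distributions are normalized here, as in Kapur et al.'s variant of the entropic criterion, so that H_b + H_d actually varies with the candidate threshold; Pun's original coefficient-based estimate (Eq. 1.7 below) differs in detail.

```python
import numpy as np

def entropic_threshold(image):
    """Maximize the sum of the dark- and bright-class entropies over
    all candidate thresholds t (Kapur-style normalized variant)."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, 255):
        w_b, w_d = p[:t + 1].sum(), p[t + 1:].sum()
        if w_b == 0 or w_d == 0:
            continue
        q_b = p[:t + 1][p[:t + 1] > 0] / w_b  # normalized class pmfs
        q_d = p[t + 1:][p[t + 1:] > 0] / w_d
        h = -(q_b * np.log2(q_b)).sum() - (q_d * np.log2(q_d)).sum()
        if h > best_h:
            best_t, best_h = t, h
    return best_t
```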
In his next work, Pun defined a coefficient (α) to estimate the threshold:
$$\alpha = \frac{\sum_{i=0}^{m} p_i \log p_i}{\sum_{i=0}^{l-1} p_i \log p_i} \qquad (1.7)$$

$$\sum_{i=0}^{m} p_i \ge 0.5 \qquad (1.8)$$
$$T(x,y) = 0.5\,\bigl(I_{\max}(i,j) + I_{\min}(i,j)\bigr) \qquad (1.10)$$

$$C(i,j) = I_{\max}(i,j) - I_{\min}(i,j) \ge 15 \qquad (1.11)$$
or

$$T(x,y) = m(x,y)\left[1 + k\left(\frac{s(x,y)}{R} - 1\right)\right] \qquad (1.13)$$
In Eqs. 1.12 and 1.13, R is the maximum value of the standard deviation (i.e., 128 for a gray scale image) and k is a positive constant (typically in the range [0.2, 0.5]), while m(x, y) is the mean and s(x, y) the standard deviation of the image pixels in a w × w window centered on the pixel I(x, y). In other words, the local threshold value is set based on the contrast of the pixel intensities in a window. Local threshold measurement takes considerable time when binarizing an image. One way to speed up the binarization process of local adaptive thresholding is to use a small window, which reduces the computational cost of the threshold calculation. Another way is to measure the local mean and standard deviation in a window of any size using the integral sum image technique. Viola and Jones implemented the integral image for computer vision (Viola & Jones, 2004). Let Ig be the integral image of an image I, where the value Ig(x, y) is obtained by adding the intensities of all pixels above and to the left of position (x, y) in image I. It can be expressed mathematically by:
$$I_g(x,y) = \sum_{i=0}^{x}\sum_{j=0}^{y} I(i,j) \qquad (1.14)$$
The local mean and standard deviation for any window size can then be computed using two addition and one subtraction operations instead of summing the pixel values in a local window (see Eqs. 1.15 and 1.16). Therefore, integrating the integral image approach into local adaptive thresholding makes it flexible to use any window size without increasing the computational cost.
$$m(x,y) = \frac{1}{w^2}\Bigl[\bigl(I_g(x+\tfrac{w}{2},\,y+\tfrac{w}{2}) + I_g(x-\tfrac{w}{2},\,y-\tfrac{w}{2})\bigr) - \bigl(I_g(x+\tfrac{w}{2},\,y-\tfrac{w}{2}) + I_g(x-\tfrac{w}{2},\,y+\tfrac{w}{2})\bigr)\Bigr] \qquad (1.15)$$

$$s^2(x,y) = \frac{1}{w^2}\sum_{i=x-w/2}^{x+w/2}\;\sum_{j=y-w/2}^{y+w/2}\bigl(I(i,j) - m(x,y)\bigr)^2 \qquad (1.16)$$
Another variant replaces the standard deviation with the mean absolute deviation δ(x, y), which is computed directly from the local mean, as shown in Eq. 1.18:

$$T(x,y) = m(x,y)\left[1 + k\left(\frac{\delta(x,y)}{1-\delta(x,y)} - 1\right)\right] \qquad (1.18)$$
Several other methods use global image information while measuring a local threshold; interested readers can consult (Ismail et al., 2018) for more details.
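Putting Eqs. 1.13–1.16 together, the sketch below computes a Sauvola-style local threshold map using integral images of the intensities and squared intensities, so each window sum costs a fixed number of operations regardless of w; the parameter values are illustrative.

```python
import numpy as np

def local_threshold_map(image, w=15, k=0.3, R=128.0):
    """Local threshold T(x,y) = m(x,y)[1 + k(s(x,y)/R - 1)] computed
    via integral images, with cost independent of the window size w."""
    img = image.astype(np.float64)
    # Integral images, padded with a leading zero row and column.
    ii = np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    ii2 = np.pad((img ** 2).cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, wd = img.shape
    r = w // 2
    T = np.empty_like(img)
    for y in range(h):
        for x in range(wd):
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(wd, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            # Window sums from the four corners of the integral images.
            s1 = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
            s2 = ii2[y1, x1] - ii2[y0, x1] - ii2[y1, x0] + ii2[y0, x0]
            m = s1 / area
            s = np.sqrt(max(s2 / area - m * m, 0.0))
            T[y, x] = m * (1 + k * (s / R - 1))
    return T
```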
1.5 Clustering
Clustering is a pixel-based process that organizes the raw data into clusters or groups whose members (pixels) are similar in some sense (Fig. 1.8). This can be achieved using either unsupervised or supervised approaches. The unsupervised clustering approach classifies the image into subgroups without any learning from data. It uses a predefined objective function to partition the image into disjoint classes or clusters, such that the pixels within a cluster are as similar as possible and the pixels across clusters are as dissimilar as possible. The unsupervised clustering approach's main advantage is its simplicity and ease of implementation (Hamerly & Elkan, 2002; Vantaram & Saber, 2012). However, the clustering technique's accuracy depends strongly on the initialization (i.e., the initial cluster centroid positions). Furthermore, the clustering process becomes more difficult as the dimensionality of the feature space increases, e.g., for color images and texture.
Classical hierarchical clustering algorithms such as ROCK rely on random sampling to handle extensive data (Almeida et al., 2007; Guha et al., 2000). Like ROCK, the random sample strategy is used by CURE to handle large data. CURE combines the complete and single linkage methods to choose more than one representative per cluster. In each step or iteration, the two clusters with the closest pair of representative data points or objects are merged. The scattered points are thus shrunk toward the cluster mean point, making the method less sensitive to outliers (Guha et al., 2001).
On the other hand, Chameleon clustering uses a novel approach: it measures the similarity between each pair of clusters by looking at both their relative inter-connectivity and their relative closeness. Chameleon merges the pair of clusters for which both factors are high; that is, it merges clusters that are well inter-connected and close together relative to the internal inter-connectivity and closeness of the clusters themselves. This allows Chameleon to handle high-dimensional feature data while reducing the outlier effect (Karypis et al., 1999).
Noticing the restrictions of previous hierarchical clustering algorithms, BIRCH was proposed with a new data structure called the cluster feature tree. The cluster feature tree stores a summary of each cluster that is much smaller than the original data, which allows it to process multidimensional data quickly without scanning the whole dataset in its original format (Xu & Wunsch, 2005; Zhang et al., 1997). Initially, the cluster features of the data objects or leaf nodes are calculated; a cluster feature contains the information needed to compute intra-cluster distances (i.e., the number of data objects, the linear sum of the data objects, and the square sum of the data objects). Through the predefined parameters, that is, T as a threshold and B as the possible number of entries (sub-clusters), the capacity of the root nodes at a particular level to absorb sub-clusters is determined. The values of T and B are iteratively refined such that outliers do not become part of a node and the maximum number of similar leaf nodes is grouped under the closest root node. This process is performed at each level until a level with a single root node is obtained. Consequently, the root nodes are constrained to the same size at an individual level, which is not realistic in practice.
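The cluster feature itself can be sketched as a small class: the triple (N, LS, SS) summarizes a cluster, absorbing a point is simple addition, and the centroid and radius needed for intra-cluster distances follow directly from the triple. The class and method names here are illustrative.

```python
import numpy as np

class ClusterFeature:
    """BIRCH-style cluster feature: (N, LS, SS) summarizes a cluster
    without storing its member points."""
    def __init__(self, point):
        p = np.asarray(point, dtype=float)
        self.n = 1                       # number of data objects
        self.ls = p.copy()               # linear sum of the data objects
        self.ss = float((p * p).sum())   # square sum of the data objects

    def absorb(self, point):
        p = np.asarray(point, dtype=float)
        self.n += 1
        self.ls += p
        self.ss += float((p * p).sum())

    def centroid(self):
        return self.ls / self.n

    def radius(self):
        # Root-mean-square distance of members from the centroid,
        # computed from (N, LS, SS) alone.
        c = self.centroid()
        return float(np.sqrt(max(self.ss / self.n - (c * c).sum(), 0.0)))
```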
Partitioning clustering starts with an initial partition of the data into several clusters and then iteratively tries to swap data from one group to another so that the objective function is optimized. Generally, partitioning clustering is classified into two categories, that is, hard partitioning clustering and fuzzy partitioning clustering, as shown in Fig. 1.10.
This approach reduces the problems of centroid trapping at non-active regions and dead centroids. Enhanced versions of AMKM and MKM, called Enhanced Moving K-Means (EMKM), have been proposed (Siddiqui & Mat Isa, 2011). The key difference between the two versions is the range of elements that are transferred from the cluster with the highest fitness value to the nearest cluster, which addresses the wrong selection of elements in AMKM and MKM. This not only overcomes the problems listed above but also improves the stability of the system.
Unlike hard partitioning clustering, which assigns each pixel to a single cluster, fuzzy partitioning clustering allows the pixels to belong to all clusters partially. Fuzzy C-means (FCM) clustering is one of the earliest methods to partially assign pixels to clusters (Bezdek, 1980). It employs a soft membership function derived from its objective function:

$$FCM(X,C) = \sum_{i=1}^{N}\sum_{j=1}^{k} u_{ij}^{m}\,\|x_i - c_j\|^2 \qquad (1.19)$$
where k is the number of clusters, N is the number of pixels, and u_ij is the membership function, defined by:

$$u_{ij} = \frac{\|x_i - c_j\|^{2/(1-m)}}{\sum_{l=1}^{k}\|x_i - c_l\|^{2/(1-m)}} \qquad (1.20)$$
The cluster centroid value is measured by averaging the pixel values with the different degrees specified by the membership function. This process continues until the difference between the membership functions of two iterations is less than a predefined value in the range of 0 to 1, at which point the process is assumed to have converged to the optimum location. The fuzzy concept allows overlapping clusters, which reduces initialization sensitivity, but FCM remains sensitive to outliers (Dixon et al., 2009; Yang et al., 2004). According to Kersten (1999), the Euclidean distance in the membership formula is highly sensitive to outliers: it puts considerable weight on outlying pixels, which pull the cluster's centroid away from its optimum location (Kersten, 1999). To overcome this, FCM was generalized, and Lp Norm Fuzzy C-Means (Lp Norm FCM) was proposed. It uses the Lp norm distance (i.e., 0 < p < 1) instead of the Euclidean distance (norm 2) in the membership function (Hathaway et al., 2000). It works better than FCM if the exact value of p is known for the particular data or image.
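A minimal FCM sketch for gray-level data following Eqs. 1.19 and 1.20, with Euclidean distance and fuzzifier m; the random initialization and stopping rule are simplified.

```python
import numpy as np

def fcm(pixels, k=3, m=2.0, tol=1e-4, max_iter=100, seed=0):
    """Fuzzy C-means for 1-D intensities: alternate the membership
    update (Eq. 1.20) and the weighted centroid update."""
    rng = np.random.default_rng(seed)
    x = pixels.reshape(-1, 1).astype(float)
    c = rng.choice(x.ravel(), size=k, replace=False).reshape(1, k)
    u_prev = None
    for _ in range(max_iter):
        d = np.abs(x - c) + 1e-12        # pixel-to-centroid distances
        dm = d ** (2.0 / (1.0 - m))      # d^{2/(1-m)} as in Eq. 1.20
        u = dm / dm.sum(axis=1, keepdims=True)
        um = u ** m                      # weights for the centroid update
        c = ((um * x).sum(axis=0) / um.sum(axis=0)).reshape(1, k)
        if u_prev is not None and np.abs(u - u_prev).max() < tol:
            break
        u_prev = u
    return c.ravel(), u
```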
Wang et al. (2004) also addressed the Euclidean distance's sensitivity for data with more than one feature space and proposed the feature-weighted Euclidean distance (Wang et al., 2004). The feature-weighted Euclidean distance replaces the Euclidean distance of FCM to reduce its sensitivity to outliers. The feature weight is calculated using mean similarity indexes on the Euclidean space and an image's feature space. It assumes that the weighted feature function is minimized when the similarity index on the feature space attains the value 0 or 1. This minimized feature weight value is used to calculate the weighted distance in every repetition of the process. However, this is a complex way to measure the feature weight (Hung et al., 2008). Hung et al. proposed the bootstrap, a statistical approach, to measure the feature weight; in this approach, the normalized value of variability defines the feature weight.
In another approach, the distance is calculated between the centroid and weighted pixels. With this modification of conventional FCM, the Lagrange multiplier method is introduced to enhance the objective function of FCM (Chih-Cheng et al., 2011). The Lagrange multiplier term is added to or subtracted from the function such that the objective function is minimized to zero and the optimum location is obtained. Therefore, two extra quantities, that is, the Lagrange multiplier and the mean of the weighted pixels, are calculated before measuring the membership function. The cluster fitness checking concept is applied to FCM: if the fitness values of the clusters differ greatly from each other, membership is transferred from the cluster with a high fitness value to the nearest cluster. However, the problem of centroids trapped at local minima remains unsolved.
Later, neighboring pixel (spatial) information was incorporated into the calculation of the pixels' membership values (Chuang et al., 2006). A window is defined to consider a specific number of neighboring pixels. The spatial function that measures the neighboring pixels' influence approaches one if the window is homogeneous (the pixel has similar neighboring pixels). This work was generalized, and the spatial function was also included in the distance calculation (Huynh Van & Jong-Myon, 2009). This method assumes that the effect of salt-and-pepper and Gaussian noise on segmentation is reduced by using a proper window size and a constant value that controls the influence of neighboring pixels on the membership of an individual pixel.
Fig. 1.11 Implementation of the region-growing technique: (a) location of the initial seed pixel; (b) and (c) the growing process of the seed pixel to form a region
Fig. 1.12 An example of a region-growing application: (a) original image; (b) segmented image (where the extracted background region is converted to black)
An image can be segmented into regions by detecting boundaries based on discontinuities in intensity (gray levels). The first step of boundary detection is to detect point, line, and edge discontinuities in an image. Points and lines are discontinuities detected simply by convolving a window or mask with the image. An example is shown in Fig. 1.14, where a 3 × 3 mask is depicted, and each element of the mask has a certain value. As the mask is convolved over the image, it computes the sum of the products of its elements with the image's intensities:
$$M = \sum_{i=1}^{9} E_i I_i \qquad (1.21)$$
For detecting points, the center element is set to a positive integer and the rest of the elements to negative integers. In other words, the center value is differenced against the neighboring values under the mask. A threshold T is defined such that if |M| > T, the image pixel at the mask's center is detected as a point.
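The point detector of Eq. 1.21 can be sketched as follows; the absolute-value test and the 8-as-center mask weights are the common textbook choice and are illustrative here.

```python
import numpy as np

def detect_points(image, T):
    """Slide a 3x3 mask (positive center, negative neighbors) over the
    image and mark pixels where |M| exceeds the threshold T."""
    E = np.array([[-1, -1, -1],
                  [-1,  8, -1],
                  [-1, -1, -1]], dtype=float)
    img = image.astype(float)
    h, w = img.shape
    points = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = img[y - 1:y + 2, x - 1:x + 2]
            M = (E * window).sum()  # sum of element-wise products (Eq. 1.21)
            points[y, x] = abs(M) > T
    return points
```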
Fig. 1.15 Values of all elements of the 3 × 3 masks at 0°, 45°, 90°, and −45°
Fig. 1.16 Intensity graph of four main types of edges found at any boundary. (a) Step edge (abrupt
change in intensity), (b) Ramp edge (gradual change in intensity), (c) Roof edge (gradual change
in intensity), (d) Line edge (abrupt change in intensity)
Lines are detected in an image similarly: a few changes to the mask elements make it capable of detecting lines. For example, in Fig. 1.15, the elements of the mask along the horizontal, vertical, inclined (+45°), or declined (−45°) direction are set as positive integers, and the rest of the elements are set as negative integers, to detect horizontal, vertical, and ±45° lines in an image. A pixel may be detected as part of multiple line types. This is resolved by comparing the M values of the line-type masks; the line type with the highest M value is assigned to the pixel. Mainly four types of edges may be observed in an intensity image: step, ramp, roof, and line edges (Fig. 1.16). Line and point detection methods are not robust enough to detect all types of edges; therefore, several advancements have been made in point and line detection methods to detect all kinds of edges.
$$G(x,y) = \frac{\partial I(x,y)}{\partial x}\cos\theta + \frac{\partial I(x,y)}{\partial y}\sin\theta \qquad (1.22)$$
Here, Gr(i, j) is the row gradient and Gc(i, j) is the column gradient. The direction of the edge gradient with respect to the row gradient is measured by:

$$\theta(i,j) = \arctan\!\left(\frac{G_c(i,j)}{G_r(i,j)}\right) \qquad (1.24)$$
As shown in Fig. 1.17, the magnitude of the first-order derivative of a ramp edge changes from 0 to 1 where a change in intensity occurs. This property of the first-order derivative is useful for identifying edges. Many methods, like the Roberts, Sobel, Prewitt, Kirsch, and Canny edge detectors, compute the derivatives (row gradient and column gradient) for an entire image using a 2 × 2 or larger mask.
The gradient operator determines the changes in intensity between adjacent pixels, and the resultant gradient magnitude emphasizes pixels with significant local intensity changes. Some basic methods that use gradient operators are discussed in this section. Roberts edge detection convolves two 2 × 2 masks or operators (shown in Fig. 1.18) with the image in its x and y directions and takes little time to measure the gradient. The image pixels I(i, j) are input to the operators, and the output pixels G(i, j) are the magnitude of the gradient. In this method, the masks or operators are designed to produce the maximum gradient magnitude for diagonal edges in an image.
$$G(i,j) = |G_x(i,j)| + |G_y(i,j)| \qquad (1.27)$$
Fig. 1.19 Gradient value of all elements of the Sobel, Prewitt, and Robinson masks at different
directions
Fig. 1.20 Threshold at the absolute value of the first-order derivative (i.e., gradient value)
Pixels are marked as edge points if their gradient value is greater than the threshold value. Marking edge points by thresholding in gradient-based methods causes thick edge detection. Refer to Fig. 1.20 for illustration, where the edge points Pa and Pb are observed at the two intensity transition states of the brighter region after thresholding. This large number of points causes thick edges.
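A sketch of first-order edge detection with the Sobel masks and the |Gx| + |Gy| magnitude of Eq. 1.27; the explicit loops favor clarity over speed.

```python
import numpy as np

def sobel_edges(image, T):
    """Compute Sobel row/column gradients, approximate the magnitude
    by |Gx| + |Gy|, and mark pixels above the threshold T as edges."""
    Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    Ky = Kx.T
    img = image.astype(float)
    h, w = img.shape
    G = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = img[y - 1:y + 2, x - 1:x + 2]
            G[y, x] = abs((Kx * win).sum()) + abs((Ky * win).sum())
    return G > T  # thresholding yields the thick edges discussed above
```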
$$g(x,y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2+y^2}{2\sigma^2}} \qquad (1.28)$$
Fig. 1.22 Value of all elements of the 3 × 3 mask for Laplacian edge method
$$g(x,y) = e^{-\frac{x^2+y^2}{2\sigma^2}} \qquad (1.29)$$

$$\nabla^2 I = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2} \qquad (1.30)$$
As shown in Fig. 1.22, there are mainly two 3 × 3 masks, with either a four- or an eight-neighbor Laplacian. The Laplacian edge method uses one of them alone to detect edges. The mask with the four-neighbor Laplacian focuses on the vertical and horizontal directions, whereas the other mask covers all directions, including the diagonals. Laplacian of Gaussian (LoG) based methods use the Gaussian filter when computing the zero-crossing detection step.
$$\text{LoG} = \nabla^2 S = \nabla^2 (g * I) = (\nabla^2 g) * I \qquad (1.31)$$

$$\nabla^2 g = -\frac{1}{2\pi\sigma^3}\left(2 - \frac{x^2+y^2}{\sigma^2}\right) e^{-\frac{x^2+y^2}{2\sigma^2}} \qquad (1.32)$$

$$g(x,y) = \left(2 - \frac{x^2+y^2}{\sigma^2}\right) e^{-\frac{x^2+y^2}{2\sigma^2}} \qquad (1.33)$$
Fig. 1.23 Threshold at the absolute value of the second-order derivative (i.e., gradient value)
Unlike the first-order gradient techniques, the Canny edge detection technique utilizes gradient orientation along with gradient magnitude. The gradient is very sensitive to noise; therefore, a Gaussian filter is applied before the gradient analysis to remove noise. Canny edge detection measures the gradient and its direction by applying the Sobel orthogonal operators. Next, a non-maximum suppression process is applied to the detected edges. It locates an optimal edge by minimizing the distance between the detected edge and the real edge. The real edge points may be located at neighboring pixels along the direction of the detected edge; the direction of the edge is at 90° with respect to the gradient direction. The gradient magnitude is analyzed at the neighboring points of a detected edge in the gradient direction, and the point with the greatest gradient magnitude is assigned as a real edge point. This analysis is performed from one end of the detected edge to the other and ensures thin edge detection. An example is shown in Fig. 1.24, where the edge in green, located at the maximum gradient, is set as a real edge after applying the non-maximum suppression process.
Not all real edges are true edges; therefore, the real edges are reanalyzed using a double-threshold method called hysteresis thresholding. As shown in Fig. 1.25, this method histograms the magnitudes of all real edges and uses two thresholds, a high and a low threshold, to group them into true and false edges. For true edges, the following conditions should be fulfilled (a sketch of this step follows the list):
(a) The gradient magnitude of a real edge is greater than the high threshold.
(b) If the gradient magnitude of an edge is less than the high threshold but greater than the low threshold, its connectivity to an edge with high gradient magnitude confirms its rank as a true edge.
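A minimal sketch of conditions (a) and (b), assuming a gradient-magnitude map that has already passed non-maximum suppression; 8-connectivity is assumed for the connectivity test.

```python
import numpy as np
from collections import deque

def hysteresis(grad_mag, low, high):
    """Keep strong edges (>= high) and promote weak edges
    (low <= g < high) only if they connect to a strong edge."""
    strong = grad_mag >= high
    weak = (grad_mag >= low) & ~strong
    true_edges = strong.copy()
    q = deque(zip(*np.nonzero(strong)))  # flood fill from strong edges
    h, w = grad_mag.shape
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx]:
                    weak[ny, nx] = False       # promote each pixel once
                    true_edges[ny, nx] = True
                    q.append((ny, nx))
    return true_edges
```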
The sharp edges detected by the Canny technique are not by themselves enough to segment regions in an image, but they help extract the boundaries of regions. There are two major techniques for linking detected edges to generate region boundaries: local edge linkers and global edge linkers. In local edge linkers, a filter analyzes the image pixels' properties in a defined local neighborhood. Eqs. 1.34 and 1.35 measure a similarity index for the pixels between two edges; these pixels are assigned as edge pixels if their gradient and angle are similar, within some threshold, to those of the adjacent edges' pixels.
Fig. 1.24 A non-maximum suppression process to detect real edges. Here, the detected edge is highlighted in black, and possible real edges are highlighted in green and purple
Fig. 1.25 Hysteresis thresholding method, wherein two thresholds are used to group true and false edges

Hough transformation is a robust global edge-linking strategy that extracts the perpendicular and parallel lines/edges from an entire image (Cui et al., 2012). It is a linear transform method that is little affected by noise. It considers a line's characteristics in image space, i.e., its slope (m) and intercept (b), instead of individual line points (x, y). The Hough transformation maps the image space into the parameter space of m and b. However, vertical lines cause a problem, that is, unbounded values in the parameter space. Therefore, m and b are replaced with r (the perpendicular distance from the origin to the line) and θ (the angle of the vector from the origin to the line). Figure 1.26 describes these parameters, where the line is defined as:
$$y = -\frac{\cos\theta}{\sin\theta}\,x + \frac{r}{\sin\theta} \qquad (1.36)$$
This method stores the parameters of each line in a matrix called an accumulator (the parameter space). One dimension of the matrix is the distance, and the other is the angle. All lines with the same values of the parameters r and θ are examined, and the line with the highest number of pixels or points is selected to represent the image edges. Moreover, the parameter space is used to find parallel and perpendicular lines. As Eq. 1.36 shows, each point traces a sinusoid in parameter space; the peaks of parallel lines in image space are aligned vertically in parameter space, and the peaks of perpendicular lines in image space are separated by about π/2 in parameter space. Using the Hough transformation, all parallel and perpendicular edges are extracted. In addition, the ratio of gray values between the two sides of an edge is measured, and if it is less than a predefined threshold, the edge is removed (Cui et al., 2012).
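The accumulator voting can be sketched as follows for a binary edge map; peaks in the returned accumulator correspond to the dominant (r, θ) lines, and the one-degree angular resolution is illustrative.

```python
import numpy as np

def hough_accumulator(edge_map, n_theta=180):
    """Each edge pixel votes for every (r, theta) line through it:
    r = x cos(theta) + y sin(theta)."""
    h, w = edge_map.shape
    thetas = np.deg2rad(np.arange(n_theta))             # 0..179 degrees
    diag = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)  # rows index r
    for y, x in zip(*np.nonzero(edge_map)):
        r = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + diag, np.arange(n_theta)] += 1
    return acc, thetas, diag
```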
References
Abad, F., Garcia-Consuegra, J., & Cisneros, G. (2000). Merging regions based on the VDM dis-
tance. In IGARSS 2000. IEEE 2000 International Geoscience and Remote Sensing Symposium.
Taking the Pulse of the Planet: The Role of Remote Sensing in Managing the Environment.
Proceedings (Cat. No. 00CH37120) (Vol. 2, pp. 615–617). IEEE.
Almeida, J., Barbosa, L., Pais, A., & Formosinho, S. (2007). Improving hierarchical cluster
analysis: A new method with outlier detection and automatic clustering. Chemometrics and
Intelligent Laboratory Systems, 87(2), 208–217.
Anil, P. N., & Natarajan, S. (2010). Automatic road extraction from high resolution imagery based
on statistical region merging and skeletonization. International Journal of Engineering Science
and Technology, 2(3), 165–171.
Bernsen, J. (1986). Dynamic thresholding of gray-level images. Paper presented at the Proc. Eighth
Int’l conf. Pattern Recognition, Paris, 1986.
Bezdek, J. C. (1980). A convergence theorem for the fuzzy ISODATA clustering algorithms. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 2(1), 1–8.
Bins, L. S. A., Fonseca, L. G., Erthal, G. J., & Ii, F. M. (1996). Satellite imagery segmentation: a
region growing approach. Simpósio Brasileiro de Sensoriamento Remoto, 8(1996), 677–680.
Boukharouba, S., Rebordao, J. M., & Wendel, P. L. (1985). An amplitude segmentation method
based on the distribution function of an image. Computer vision, graphics, and image process-
ing, 29(1), 47–59. https://doi.org/10.1016/s0734-189x(85)90150-1
Chih-Cheng, H., Kulkarni, S., & Bor-Chen, K. (2011). A new weighted fuzzy c-means clustering
algorithm for remotely sensed image classification. IEEE Journal of Selected Topics in Signal
Processing, 5(3), 543–553. https://doi.org/10.1109/jstsp.2010.2096797
Cho, S., Haralick, R., & Yi, S. (1989). Improvement of Kittler and Illingworth’s minimum error
thresholding. Pattern recognition, 22(5), 609–617.
Chuang, K.-S., Tzeng, H.-L., Chen, S., Wu, J., & Chen, T.-J. (2006). Fuzzy c-means clustering with
spatial information for image segmentation. Computerized Medical Imaging and Graphics,
30(1), 9–15. https://doi.org/10.1016/j.compmedimag.2005.10.001
Cui, S., Yan, Q., & Reinartz, P. (2012). Complex building description and extraction based on
Hough transformation and cycle detection. Remote Sensing Letters, 3(2), 151–159.
Dare, P. M. (2005). Shadow analysis in high-resolution satellite imagery of urban areas.
Photogrammetric Engineering & Remote Sensing, 71(2), 169–177.
Derrien, M., & Le Gléau, H. (2007). Temporal-differencing and region-growing techniques to improve
twilight low cloud detection from SEVIRI data. In Proceedings of the Joint 2007 EUMETSAT
Meteorological Satellite Conference and the 15th Satellite Meteorology and Oceanography Conference
of the American Meteorological Society (Vol. 2428, p. 2428), Amsterdam: The Netherlands.
Devereux, B. J., Amable, G. S., & Posada, C. C. (2004). An efficient image segmentation algorithm
for landscape analysis. International Journal of Applied Earth Observation and Geoinformation, 6(1), 47–61. https://doi.org/10.1016/j.jag.2004.07.007
Dixon, S. J., Heinrich, N., Holmboe, M., Schaefer, M. L., Reed, R. R., Trevejo, J., & Brereton,
R. G. (2009). Use of cluster separation indices and the influence of outliers: Application of two
new separation indices, the modified silhouette index and the overlap coefficient to simulated
data and mouse urine metabolomic profiles. Journal of Chemometrics, 23(1), 19–31.
Doyle, W. (1962). Operations useful for similarity-invariant pattern recognition. Journal of the
ACM (JACM), 9(2), 259–267.
Du, Q., & Gunzburger, M. (2002). Grid generation and optimization based on centroidal Voronoi
tessellations. Applied Mathematics and Computation, 133(2), 591–607.
Faruquzzaman, A. B. M., Paiker, N. R., Arafat, J., & Ali, M. A. (2008). A survey report on image
segmentation based on split and merge algorithm. IETECH Journal of Advanced Computations,
2(2), 86–101.
Faruquzzaman, A. B. M., Paiker, N. R., Arafat, J., Ali, M. A., & Sorwar, G. (2009). Robust Object
Segmentation using Split-and-Merge. International Journal of Signal and Imaging Systems
Engineering, 2(1/2), 70. https://doi.org/10.1504/IJSISE.2009.029332
Gonzalez, R. C., Woods, R. E., & Eddins, S. L. (2003). Digital Image Processing Using MATLAB.
Guha, S., Rastogi, R., & Shim, K. (2000). ROCK: A robust clustering algorithm for categorical
attributes. Information Systems, 25(5), 345–366.
Guha, S., Rastogi, R., & Shim, K. (2001). Cure: An efficient clustering algorithm for large data-
bases. Information Systems, 26(1), 35–58.
Hamerly, G., & Elkan, C. (2002). Alternatives to the k-means algorithm that find better clusterings.
Paper presented at the Proceedings of the Eleventh International Conference on Information
and Knowledge Management, McLean, Virginia, USA.
Hathaway, R. J., Bezdek, J. C., & Hu, Y. (2000). Generalized fuzzy c-means clustering strategies
using Lp norm distances. IEEE Transactions on Fuzzy Systems, 8(5), 576–582.
Hemachander, S., Verma, A., Arora, S., & Panigrahi, P. K. (2007). Locally adaptive block thresh-
olding method with continuity constraint. Pattern Recognition Letters, 28(1), 119–124.
Horowitz, S. L., & Pavlidis, T. (1976). Picture Segmentation by a Tree Traversal Algorithm.
Journal of the ACM, 23(2), 368–388. https://doi.org/10.1145/321941.321956
Huang, L. K., & Wang, M. J. J. (1995). Image thresholding by minimizing the measures of fuzzi-
ness. Pattern recognition, 28(1), 41–51.
Hung, W. L., Yang, M. S., & Chen, D. H. (2008). Bootstrapping approach to feature-weight selec-
tion in fuzzy c-means algorithms with an application in color image segmentation. Pattern
Recognition Letters, 29(9), 1317–1325.
Huynh Van, L., & Jong-Myon, K. (2009, August 20–24). A generalized spatial fuzzy c-means
algorithm for medical image segmentation. Paper presented at the Fuzzy Systems, 2009. IEEE
International Conference on FUZZ-IEEE 2009.
Isa, N. A. M., Salamah, S. A., & Ngah, U. K. (2009). Adaptive fuzzy moving K-means cluster-
ing algorithm for image segmentation. IEEE Transactions on Consumer Electronics, 55(4),
2145–2153.
Ismail, S. M., Abdullah, S. N. H. S., & Fauzi, F. (2018). Statistical binarization techniques for
document image analysis. Journal of Computer Science, 14(1), 23–36.
Jawahar, C., Biswas, P., & Ray, A. (1997). Investigations on fuzzy thresholding based on fuzzy
clustering. Pattern Recognition, 30(10), 1605–1613.
Jianzhuang, L., Wenqing, L., & Yupeng, T. (1991). Automatic thresholding of gray-level pictures
using two-dimension Otsu method. Paper presented at the Circuits and Systems, 1991. 1991
International Conference on Conference Proceedings, China.
Jin, Z., Lou, Z., Yang, J., & Sun, Q. (2007). Face detection using template matching and skin-
color information. Neurocomputing, 70(4–6), 794–800. https://doi.org/10.1016/j.neucom.2006.10.043
Jinhai, C., & Zhi-Qiang, L. (1998, August 16–20). A new thresholding algorithm based on all-pole
model. Paper presented at the Pattern Recognition, 1998. Fourteenth International Conference
on Proceedings.
Kampke, T., & Kober, R. (1998, August 16–20). Nonparametric optimal binarization. Paper pre-
sented at the Pattern Recognition, 1998. Fourteenth International Conference on Proceedings.
Karypis, G., Han, E. H., & Kumar, V. (1999). Chameleon: Hierarchical clustering using dynamic
modeling. Computer, 32(8), 68–75.
Kersten, P. R. (1999). Fuzzy order statistics and their application to fuzzy clustering. IEEE
Transactions on Fuzzy Systems, 7(6), 708–712.
Khoshelham, K., Li, Z., & King, B. (2005). A Split-and-Merge Technique for Automated
Reconstruction of Roof Planes. Photogrammetric Engineering & Remote Sensing, 71(7),
855–862. https://doi.org/10.14358/PERS.71.7.855
Khotanzad, A., & Bouarfa, A. (1990). Image segmentation by a parallel, non-parametric histogram
based clustering algorithm. Pattern Recognition, 23(9), 961–973.
Kittler, J., & Illingworth, J. (1986). Minimum error thresholding. Pattern Recognition, 19(1), 41–47.
Lang, X., Zhu, F., Hao, Y., & Ou, J. (2008). Integral image based fast algorithm for two-
dimensional Otsu thresholding. Paper presented at the Image and Signal Processing, 2008.
Congress on CISP’08.
Lloyd, D. (1985). Automatic target classification using moment invariant of image shapes. IDN AW126, RAE, Farnborough, UK.
Lucieer, A., & Stein, A. (2002). Existential uncertainty of spatial objects segmented from satellite
sensor imagery. IEEE Transactions on Geoscience and Remote Sensing, 40(11), 2518–2521.
https://doi.org/10.1109/TGRS.2002.805072
Manousakas, I. N., Undrill, P. E., Cameron, G. G., & Redpath, T. W. (1998). Split-and-Merge
Segmentation of Magnetic Resonance Medical Images: Performance Evaluation and Extension
to Three Dimensions. Computers and Biomedical Research, 31(6), 393–412. https://doi.
org/10.1006/cbmr.1998.1489
Mashor, M. Y. (2000). Hybrid training algorithm for RBF network. International Journal of The
Computer, The Internet and Management, 8(2), 50–65.
Mazonakis, M., Damilakis, J., Varveris, H., Prassopoulos, P., Gourtsoyiannis, N. (2001). Image
segmentation in treatment planning for prostate cancer using the region growing technique.
The British Journal of Radiology, 74(879), 243–249. https://doi.org/10.1259/bjr.74.879.740243
Meethongjan, K., Dzulkifli, M., Rehman, A., & Saba, T. (2010). Face recognition based on fusion
of Voronoi diagram automatic facial and wavelet moment invariants. International Journal
Video Process Image Process Netw Secur, 10(4), 1–8.
Murthy, C. A., & Pal, S. K. (1990). Fuzzy thresholding: Mathematical framework, bound functions
and weighted moving average technique. Pattern Recognition Letters, 11(3), 197–206.
Nedzved, A., Ablameyko, S., & Pitas, I. (2000). Morphological segmentation of histology cell
images. In, 2000. Published by the IEEE Computer Society.
Olivo, J. C. (1994). Automatic threshold selection using the wavelet transform. CVGIP: Graphical
Models and Image Processing, 56(3), 205–218.
Olson, C. F. (1995). Parallel algorithms for hierarchical clustering. Parallel Computing, 21(8),
1313–1325.
Otsu, N. (1975). A threshold selection method from gray-level histograms. Automatica,
11(285-296), 23–27.
Pal, S. K., & Rosenfeld, A. (1988). Image enhancement and thresholding by optimization of fuzzy
compactness. Pattern Recognition Letters, 7(2), 77–86.
Pavlidis, T., & Horowitz, S. L. (1974). Segmentation of plane curves. IEEE Transactions on Computers, C-23(8), 860–870. https://doi.org/10.1109/T-C.1974.224041
Pham, D. L., Xu, C., & Prince, J. L. (2000). Current methods in medical image segmentation 1.
Annual Review of Biomedical Engineering, 2(1), 315–337.
Pohle, R., & Toennies, K. D. (2001). A new approach for model-based adaptive region growing in
medical image analysis. In International conference on computer analysis of images and pat-
terns (pp. 238–246). Springer, Berlin, Heidelberg.
Pun, T. (1981). Entropic thresholding, a new approach. Computer Graphics and Image Processing,
16(3), 210–239.
Ramesh, N., Yoo, J. H., & Sethi, I. K. (1995). Thresholding based on histogram approxima-
tion. IEE Proceedings - Vision, Image and Signal Processing, 142(5), 271–279. https://doi.
org/10.1049/ip-vis:19952007
Rau, J.-Y., & Chen, L.-C. (2003). Robust Reconstruction of Building Models from Three-
Dimensional Line Segments. Photogrammetric Engineering & Remote Sensing, 69(2),
181–188. https://doi.org/10.14358/PERS.69.2.181
Ridler, T., & Calvard, S. (1978). Picture thresholding using an iterative selection method. IEEE
Transactions on Systems, Man and Cybernetics, 8(8), 630–632.
Rosenfeld, A., & De La Torre, P. (1983). Histogram concavity analysis as an aid in threshold selec-
tion. IEEE Transactions on Systems, Man and Cybernetics, SMC-13(2), 231–235. https://doi.
org/10.1109/tsmc.1983.6313118
Sahoo, P. K., Soltani, S., & Wong, A. K. (1988). A survey of thresholding techniques. Computer
Vision, Graphics, and Image Processing, 41(2), 233–260.
Samopa, F., & Asano, A. (2009). Hybrid image thresholding method using edge detection.
International Journal of Computer Science and Network Security, 9(4), 292–299.
Sengupta, K., Shiqin, W., Ko, C. C., & Burman, P. (2000). Automatic face modeling from mon-
ocular image sequences using modified non parametric regression and an affine camera
model. In Proceedings Fourth IEEE International Conference on Automatic Face and Gesture
Recognition (Cat. No. PR00580) (pp. 524–529).
Sezgin, M. (2004). Survey over image thresholding techniques and quantitative performance eval-
uation. Journal of Electronic Imaging, 13(1), 146–168.
Shih, F. Y. (2010). Image processing and pattern recognition: Fundamentals and techniques. Wiley.
Siddiqui, F. U. (2012). Enhanced clustering algorithms for gray-scale image segmentation. Master
dissertation, Universiti Sains Malaysia.
Siddiqui, F. U., & Mat Isa, N. A. (2011). Enhanced moving K-means (EMKM) algorithm for
image segmentation. IEEE Transactions on Consumer Electronics, 57(2), 833–841. https://doi.
org/10.1109/tce.2011.5955230
Sun, Y., & Bhanu, B. (2009). Symmetry integrated region-based image segmentation. In 2009
IEEE Conference on Computer Vision and Pattern Recognition, pp. 826–831.
Taghizadeh, M., & Hajipoor, M. (2011). A hybrid algorithm for segmentation of MRI images based
on edge detection. Paper presented at the Soft Computing and Pattern Recognition (SoCPaR),
2011 international conference of.
Tizhoosh, H. R. (2005). Image thresholding using type II fuzzy sets. Pattern Recognition, 38(12),
2363–2372.
Trussell, H. (1979). Comments on "Picture thresholding using an iterative selection method." IEEE Transactions on Systems, Man and Cybernetics, 9(5), 311.
Tsai, D. M. (1995). A fast thresholding selection procedure for multimodal and unimodal histo-
grams. Pattern Recognition Letters, 16(6), 653–666.
Vantaram, S. R., & Saber, E. (2012). Survey of contemporary trends in color image segmentation. Journal of Electronic Imaging, 21(4), 040901.
Viola, P., & Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer
Vision, 57(2), 137–154.
Wang, X., Wang, Y., & Wang, L. (2004). Improving fuzzy c-means clustering based on feature-
weight learning. Pattern Recognition Letters, 25(10), 1123–1132.
Whatmough, R. (1991). Automatic threshold selection from a histogram using the “exponential
hull”. CVGIP: Graphical Models and Image Processing, 53(6), 592–600.
Xiao, Y., Cao, Z., & Zhuo, W. (2011). Type-2 fuzzy thresholding using GLSC histogram of human
visual nonlinearity characteristics. Optics Express, 19(11), 10656–10672.
Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on Neural
Networks, 16(3), 645–678.
Yang, M. S., Hwang, P. Y., & Chen, D. H. (2004). Fuzzy clustering algorithms for mixed feature
variables. Fuzzy Sets and Systems, 141(2), 301–317.
Yanni, M., & Horne, E. (1994). A new approach to dynamic thresholding. Paper presented at the
EUSIPCO’94: 9th European Conf. Sig. Process.
Yong, Y., Chongxun, Z., & Pan, L. (2004, September 14–16). Image thresholding based on spatially weighted fuzzy c-means clustering. Paper presented at the Fourth International Conference on Computer and Information Technology (CIT '04).
Zhang, T., Ramakrishnan, R., & Livny, M. (1997). BIRCH: A new data clustering algorithm and
its applications. Data Mining and Knowledge Discovery, 1(2), 141–182.
Zhao, Y., Karypis, G., & Fayyad, U. (2005). Hierarchical clustering algorithms for document data-
sets. Data Mining and Knowledge Discovery, 10(2), 141–168.
Chapter 2
Partitioning Clustering Techniques
2.2 k-means Clustering

Consider an image I of size N × M (where N and M represent the number of rows and columns, respectively) to be clustered into k clusters. Let p_i(x, y) be the ith pixel and c_j be the jth centroid, where i = 1, 2, 3, …, (N × M) and j = 1, 2, 3, …, k. According to the study (Weisstein, 2004), k-means clustering is implemented as follows.
For better illustration, the flow diagram of the k-means clustering is depicted in Fig. 2.1.
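The step listing itself is not reproduced in this excerpt. The following minimal NumPy sketch of the standard k-means loop (the function and variable names are ours, not the book's) illustrates the procedure that the flow diagram summarizes:

```python
import numpy as np

def kmeans(pixels, k, max_iter=100, seed=0):
    """Plain k-means on a 1-D array of gray-level pixel intensities."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k pixel values.
    centroids = rng.choice(pixels, size=k, replace=False).astype(float)
    for _ in range(max_iter):
        # Assign each pixel to the nearest centroid (Euclidean distance).
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if members.size > 0:       # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean()
        if np.allclose(new_centroids, centroids):
            break                      # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels
```

Segmenting the image then amounts to replacing each pixel with the centroid of its cluster.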
Generally, k-means is not robust to outliers and initialization settings. It is quite common with k-means that different initial centroid locations yield empty clusters and different final solutions (Bradley & Fayyad, 1998). In most cases, the final solution converges to a poor local minimum (Kaufman & Rousseeuw, 2009; Ester et al., 1995; Sulaiman & Isa, 2010). Kaufman and Rousseeuw (2009) consider that labeling a cluster by the mean value of its pixels is responsible for the empty-cluster problem. In other words, the oversensitivity of the k-means centroid to outliers can cause the nearest cluster to become an empty cluster, or a dead centroid; thus, the final solution may converge to a poor local minimum. To overcome outlier sensitivity, Partitioning Around Medoids (PAM) was proposed by Kaufman and Rousseeuw (1990). A medoid is defined as a representative member of a cluster whose average dissimilarity to all the other members of the cluster is minimal (Kaufman & Rousseeuw, 2009). This study concludes that labeling a cluster using the medoid value is far more effective than using the mean value in reducing outlier sensitivity. This solution outperforms k-means clustering on smaller datasets but does not work well when the data is large (Barioni et al., 2008).
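To make the medoid idea concrete, here is a hedged one-function sketch (ours, not PAM itself) that picks the cluster member whose average dissimilarity to all other members is minimal; the O(n²) pairwise-distance matrix it builds is also why medoid-based methods scale poorly to large data:

```python
import numpy as np

def medoid(members):
    """Return the member with minimal average absolute distance to the rest."""
    # Full pairwise distance matrix: O(n^2) in time and memory.
    d = np.abs(members[:, None] - members[None, :])
    return members[np.argmin(d.mean(axis=1))]
```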
As mentioned above, k-means clustering assigns each pixel to its respective cluster based on the minimum Euclidean distance. However, there is a high chance of a poor pixel assignment if a pixel has the same minimum Euclidean distance to two or more adjacent clusters (Siddiqui & Isa, 2012). Consequently, the pixel may be assigned to the higher-variance cluster instead of the lower-variance cluster. The lower-variance cluster then has little chance of taking further part in the clustering process and may be trapped at a poor local minimum or end up as an empty cluster, also known as a dead center. Figure 2.2 visualizes this phenomenon, where points A, B, and C are the centroids of the clusters adjacent to point D. The Euclidean distance between point D and its adjacent clusters A, B, and C is the same, so point D may be assigned to the higher-variance cluster. The lower-variance cluster can easily become stuck at its present value, with no further chance of updating, or be converted into an empty cluster without ever being updated during the clustering process.
For a better illustration of the empty-cluster and trapped-centroid limitations of k-means clustering, a "Bridge" image from a publicly available dataset is segmented by k-means, as shown in Fig. 2.3. Figure 2.3 (c) and (e) show the image segmented into three and six clusters by k-means, respectively. In addition, the histograms of the test image before and after the segmentation process with different numbers of clusters (i.e., three and six) are shown in Fig. 2.3 (b), (d), and (f), respectively. Comparing the histograms of the test image and the k-means segmented image with three clusters confirms that k-means produced inappropriate centroid positioning: the final centroids are not located at the appropriate peaks of the original histogram, as shown in Fig. 2.3 (d). In other words, the final centroids are trapped in non-active regions of the data, and incomplete or inadequate detail from the image is segmented. For example, the bridge's ropes in Fig. 2.3 (c) are not segmented with sharp edges, while unnecessary background information, like the cloud's dark shade, is clustered. In Fig. 2.3 (e), the image has been segmented into five clusters, although the initial setting was six clusters. Figure 2.3 (f) also proves that k-means clustering cannot avoid empty clusters: at the end of the clustering process, only five clusters are assigned to the "Bridge" image rather than the initially set value of six clusters.
Besides the qualitative analysis, a quantitative analysis is also conducted to assess the performance of k-means clustering, and metrics such as MSE, INTER, F(I), F′(I), and Q(I) are measured for the segmented image. From the tabulated results in Table 2.1, the large value of MSE and the small value of INTER confirm the poor convergence of k-means clustering, while the large values of F(I), F′(I), and Q(I) indicate a less homogeneous segmented image because of the empty cluster.
Fig. 2.3 Applying k-means clustering on "Bridge" image (a) test image, (b) histogram of the test image, (c) segmented image at k = 3, (d) histogram of the segmented image, (e) segmented image at k = 6, and (f) histogram of the segmented image
Table 2.1 Quantitative analysis of k-means clustering using the "Bridge" image

2.3 Moving K-means Clustering
In 2000, the moving k-means (MKM) clustering was proposed to overcome the limitations of k-means clustering (Mashor, 2000). The research concluded that a centroid trapped in a non-active region is the main reason for producing a dead centroid (i.e., an empty cluster) and for causing the k-means solution to converge at a poor local, rather than the global, optimum (Mashor, 2000). In detail, empty clusters are generated because of poor initial positioning of centroids, for instance, centroids initialized in non-active regions. Because of this phenomenon, such centroids may never be updated in the clustering process and may remain trapped in the non-active region. MKM clustering reduces this unsuitable partitioning of the data by imposing a fitness criterion on clusters to keep them in an active region of the data. Based on this criterion, the fitness is continuously verified during the calculation of centroids. Whenever the condition is violated, the cluster with the lowest fitness value gives up all its member pixels and moves towards the active region by taking over part of the members of the cluster with the highest fitness value. This relocation of pixels between the clusters maintains a comparable variance across all clusters, and therefore MKM becomes less sensitive to the initial settings.
In MKM clustering, the cluster with the smallest fitness value (c_s) drops all of its member pixels, and c_s moves toward the active region by receiving those members of the cluster with the largest fitness value (c_l) whose intensity values are less than the centroid of c_l. The members of c_l having values greater than the centroid of c_l remain members of c_l (Siddiqui & Isa, 2011). Based on Eqs. 2.3 and 2.4, c_s and c_l are updated at the end of every iteration of the clustering process.
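Equations 2.3 and 2.4 are not reproduced in this excerpt, so the sketch below is only our reading of the verbal description above, with the fitness taken as the within-cluster squared error and an assumed nearest-cluster re-homing of the dropped members of c_s:

```python
import numpy as np

def mkm_transfer_step(pixels, labels, centroids):
    """One MKM transfer step, following the verbal description above.

    Fitness is taken as the within-cluster squared error; how the dropped
    members of c_s are re-homed is our assumption (nearest other cluster),
    since Eqs. 2.3 and 2.4 are not reproduced in this excerpt."""
    k = len(centroids)
    fitness = np.array([np.sum((pixels[labels == j] - centroids[j]) ** 2)
                        for j in range(k)])
    cs, cl = int(np.argmin(fitness)), int(np.argmax(fitness))
    # c_s drops all its members: send each to its nearest other cluster.
    others = [j for j in range(k) if j != cs]
    for i in np.where(labels == cs)[0]:
        labels[i] = others[int(np.argmin([abs(pixels[i] - centroids[j])
                                          for j in others]))]
    # c_s then receives the members of c_l whose value lies below c_l's centroid.
    labels[(labels == cl) & (pixels < centroids[cl])] = cs
    # Update the two affected centroids at the end of the step.
    for j in (cs, cl):
        members = pixels[labels == j]
        if members.size:
            centroids[j] = members.mean()
    return labels, centroids
```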
The transferring concept of the MKM clustering process has some major limitations, which are as follows:
• Although the fitness condition is applied to keep centroids from being trapped in the non-active region, the same condition can create a problematic scenario in the cluster c_s (Siddiqui & Isa, 2011). In detail, the cluster c_s may contain members of very similar value (i.e., close to the cluster's centroid value). This produces small Euclidean distances between the cluster's centroid and its members and hence a low fitness value for that cluster. Forcing the cluster c_s to drop its members and become an empty cluster then opposes clustering's main objective, i.e., dividing the data into a defined number of clusters. This increases the intra-cluster variance of the clusters and produces poor segmentation.
• The transferring process of the c_l members in MKM clustering may cause poor segmentation. In this process, the members of c_l are divided into two groups: (a) members with a value greater than the centroid value of c_l and (b) members with a value less than the centroid value of c_l (Siddiqui & Isa, 2011). Members of the first group remain members of that cluster even if they are located far away from the centroid, whereas members of the second group are transferred to other clusters even if they are located close to the centroid. This inappropriate transferring of members also increases the clusters' intra-cluster variance and produces poor segmentation.
• Furthermore, MKM clustering has a pair of fitness conditions to avoid creating an empty cluster in the clustering process (Siddiqui & Isa, 2011). However, these fitness conditions are not robust enough to distinguish between empty clusters and clusters whose members all have similar values, because both types of cluster have similar fitness values (i.e., zero). Thus, transferring members to a cluster with similar-valued members can leave an actual empty cluster without being updated during the clustering process.
Fig. 2.5 Applying MKM Clustering on Harbor House image; (a) test image, (b) histogram of the
test image, (c) segmented image, and (d) histogram of the segmented image
To illustrate the MKM clustering limitations, the Harbor House image is selected as the test image (Fig. 2.5). The initial parameter settings of the MKM clustering are as follows: the number of clusters is set to three, and the constant value for α_a = α_b = α_o is set to 0.3. According to theory, each constant is set in the range $0 < \alpha < \frac{1}{3}$ (Mashor, 2000). The segmented image and its histogram after applying the MKM clustering are depicted in Fig. 2.5 (c) and (d), respectively. Like k-means clustering, the MKM clustering is also unsuccessful in segmenting the regions of interest with sharp edges and detailed information, as shown in Fig. 2.5 (c). For instance, the fence at the front of the house is not segmented. This is also confirmed by Fig. 2.5 (d), where the histogram presents the segmented image's inappropriate results.
Table 2.2 Quantitative analysis of the MKM clustering using Harbor House image

Initialized number of clusters | Final number of clusters | MSE | INTER | F(I) (1×10⁷) | F′(I) | Q(I)
3 | 3 | 423.00 | 67.33 | 17.1935 | 13.286 | 230.667
The centroids are not located at the prominent peaks of the histogram of the test image; for example, the centroid with an intensity value of 150 is located far from the active region. To confirm this limitation, a quantitative analysis is performed: the large value of the MSE (which represents intra-cluster variance) and the small value of the INTER (which represents inter-cluster variance) show the poor performance of MKM clustering (Table 2.2). Furthermore, the large values of the F(I), F′(I), and Q(I) confirm the non-homogeneous segmentation produced by the MKM clustering.
2.4 Adaptive Moving k-means Clustering

AMKM clustering follows the MKM clustering principle with some changes in its pixel transferring approach (Isa et al., 2009). AMKM considers that transferring member pixels of the cluster with the highest fitness to the cluster with the lowest fitness value is the main reason for MKM clustering to converge to poor local minima. Therefore, AMKM clustering uses a modified transferring process: the member pixels of the cluster with the highest fitness value are transferred to the nearest cluster instead of to the cluster with the lowest fitness value.
The flow diagram for the implementation of the AMKM clustering is as shown
in Fig. 2.6.
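Since the only change relative to MKM is the destination of the departing pixels, the rule can be sketched in a few lines (a hedged helper of ours, not the authors' code):

```python
import numpy as np

def amkm_destination(pixel_value, centroids, exclude):
    """AMKM variant of the transfer: a pixel leaving the highest-fitness
    cluster goes to its *nearest* cluster rather than to the lowest-fitness
    one. `exclude` is the index of the cluster the pixel is leaving."""
    candidates = [j for j in range(len(centroids)) if j != exclude]
    return candidates[int(np.argmin([abs(pixel_value - centroids[j])
                                     for j in candidates]))]
```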
The Gantry Crane image is selected as a test image to demonstrate the AMKM clustering limitations. Here, except for the number of clusters, which is set to four, the initial parameter settings are the same as defined earlier for the MKM clustering. The test and segmented images with their histograms are shown in Fig. 2.7, where the centroid at intensity value 140 in the histogram of the segmented image is trapped in a region far from one of the most active regions (i.e., the region in the intensity range 0 to 50). Because of this trapped centroid, the AMKM clustering fails to segment the switch panel on the rig structure, as shown in Fig. 2.7 (c). The quantitative analysis is
Fig. 2.7 Applying the AMKM clustering on Gantry Crane image; (a) test image, (b) histogram of
the test image, (c) segmented image, and (d) histogram of the segmented image
Table 2.3 Quantitative analysis of the AMKM clustering using Gantry Crane image

Initialized number of clusters | Final number of clusters | MSE | INTER | F(I) (1×10⁷) | F′(I) | Q(I)
4 | 4 | 258.194 | 77 | 2.707 | 3.558 | 36.971
tabulated in Table 2.3 to illustrate the AMKM clustering performance on the test image. The large MSE value and small INTER value confirm the false location of the centroids, i.e., in non-active regions, while the large values of the F(I), F′(I), and Q(I) functions show the poor segmentation result of AMKM clustering (i.e., the image is segmented into non-homogeneous regions). All this occurs because of the centroid trapped in a non-active region.
2.5 Fuzzy c-means Clustering

In contrast to k-means clustering, fuzzy c-means clustering has a soft membership by which each pixel partially belongs to every cluster instead of completely becoming part of a single cluster. Its main aim is to iteratively minimize a defined objective function (Bezdek, 1981). Fuzzy c-means clustering starts with an assumed membership value of each pixel with respect to all clusters. Next, each cluster's centroid is computed by taking a weighted average of all pixels according to their degrees of membership (determined in each iteration by executing the soft membership function). This iterative process continues until no change is observed in the pixels' membership values. The fuzzy concept makes fuzzy c-means clustering more flexible in keeping the centroids as close as possible to active regions, and it becomes less sensitive to initial settings. However, fuzzy c-means clustering has no specific boundary between the clusters to avoid cluster overlapping (Sulaiman & Isa, 2010).

Fuzzy c-means clustering thus segments the pixels of an image into overlapping regions. It can be implemented as follows:
The flow chart for the implementation of the fuzzy c-means clustering is shown
in Fig. 2.8.
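The step listing and Eq. 2.6 are not reproduced in this excerpt; the sketch below uses the textbook fuzzy c-means updates, which we assume match the book's Eq. 2.6 (all names are ours):

```python
import numpy as np

def fuzzy_c_means(pixels, k, m=2.0, tol=1e-3, max_iter=100, seed=0):
    """Standard fuzzy c-means on 1-D gray levels (a generic sketch)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(pixels), k))
    u /= u.sum(axis=1, keepdims=True)      # memberships sum to 1 per pixel
    for _ in range(max_iter):
        # Weighted centroid update using the fuzzified memberships.
        centroids = (u ** m).T @ pixels / (u ** m).sum(axis=0)
        d = np.abs(pixels[:, None] - centroids[None, :]) + 1e-12
        # Membership update: u_ij = 1 / sum_p (d_ij / d_ip)^(2/(m-1)).
        ratio = d[:, :, None] / d[:, None, :]
        new_u = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)
        if np.abs(new_u - u).max() < tol:  # stop when memberships settle
            u = new_u
            break
        u = new_u
    return centroids, u
```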
In fuzzy c-means clustering, each pixel partially belongs to all clusters with some degree, i.e., the pixel's membership value for each cluster. The membership is measured using Eq. 2.6. The fuzzy characteristic of fuzzy c-means restricts the membership value of a pixel to the range between 0 and 1, and the sum of the membership values of each pixel over all clusters is equal to 1 (Siddiqui et al., 2013). This characteristic gives a sufficient membership value to outliers (pixels located far from the centroid) to become cluster members, which increases the intra-cluster variance and reduces the inter-cluster variance (Thomas et al., 2009). Moreover, increased overlapping between the clusters in the data increases the sensitivity of fuzzy c-means to outliers (Dixon et al., 2009; Yang et al., 2004). Kersten (1999) argued that the Euclidean distance used for measuring the membership value in fuzzy c-means clustering is sensitive to outliers; this is confirmed by another study (Hathaway et al., 2002), which considers that the Euclidean distance assigns high memberships to outliers, indirectly pulling the cluster away from the optimum location.
Besides, the convergence speed of fuzzy c-means clustering is very slow in comparison with k-means clustering. An earlier study on fuzzy clustering (Wei & Xie, 2000) presents a convergence speed comparison between fuzzy c-means clustering and the proposed Rival Checked Fuzzy C-Means (RCFCM) clustering. The RCFCM clustering speeds up convergence by magnifying the biggest membership of a pixel and suppressing the second-biggest membership. However, it is found in the literature that this approach often does not converge to the optimum global location (Fan et al., 2003). The main problem is that it ignores all pixel memberships other than the two highest ones. Besides, a parameter α plays a significant role in controlling the clustering process: any improper selection of the α value can disturb the modification of the second-highest membership and the ordering of memberships, and the unbalanced ordering of memberships leads the final solution to converge at poor local minima. Another modification of fuzzy clustering, called Suppressed Fuzzy C-means (S-FCM) clustering, was proposed by Fan et al. (2003). It magnifies the biggest membership without disturbing the order of memberships. S-FCM clustering also employs the controlling parameter α: a value equal to 0 converts S-FCM into k-means clustering, while a value equal to 1 converts S-FCM back into fuzzy clustering (Fan et al., 2003). The performance of S-FCM clustering therefore relies entirely on the manual setting of the α value.
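The suppression step itself is compact; the sketch below is our paraphrase of the S-FCM rule as commonly stated (winner magnified, all rivals scaled by α):

```python
import numpy as np

def suppress_memberships(u, alpha):
    """S-FCM suppression: per pixel, scale all memberships by alpha and lift
    the largest one to 1 - alpha + alpha*u_max (alpha=1 -> plain FCM,
    alpha=0 -> hard assignment, i.e., k-means behaviour)."""
    winners = np.argmax(u, axis=1)
    rows = np.arange(len(u))
    out = alpha * u
    out[rows, winners] = 1.0 - alpha + alpha * u[rows, winners]
    return out   # each row still sums to 1
```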
In fuzzy c-means clustering, a pixel effectively becomes a member of the cluster with the highest membership value; the ideal condition is that this membership value equals 1, or at least exceeds the sum of the pixel's membership values for the other clusters (Siddiqui et al., 2013). However, in some cases, pixels located between two clusters have a considerable membership value with faraway clusters; such pixels act as outliers for those clusters. This phenomenon typically occurs when a range of data is distributed among more than two neighboring clusters. Fuzzy c-means clustering was applied to segment manually generated data with intensities ranging between 1 and 120 into three regions (Siddiqui et al., 2013). In the
segmented result of fuzzy c-means clustering, the range of data is divided into three regions (i.e., C1, C2, and C3) with their membership functions (i.e., u1, u2, and u3, respectively). In Fig. 2.9, the membership functions are represented with distinct line textures and colors, i.e., green dotted (u1), red dashed (u2), and blue solid (u3) lines. For instance, under the membership function u1, data in the range 0 to 60 are located far from cluster C1 but still receive a considerable membership value and become outliers for cluster C1. A similar pattern is observed in the other membership functions, i.e., u2 and u3. Therefore, these outliers can pull the centroids away from their active regions and generate clusters with large intra-cluster variance and low inter-cluster variance.
In Fig. 2.10, a Football image is selected as the test image to illustrate the effect of outliers on fuzzy c-means clustering. In the initial settings, the number of clusters and the termination criterion are set to 3 and 0.001, respectively. The histogram information in Fig. 2.10 (b) suggests that the ideal locations of the three final centroids on the intensity scale are 50, 140, and 255. However, two of the final centroids are located very close to each other (i.e., 50 and 77), and the region with intensity values close to 255 (i.e., the pixels of the football laces) is not segmented. This poor representation of the data is caused by the outliers that localize the centroids at false or non-active regions. The quantitative analysis also confirms the poor segmentation, or poor convergence, of the final solution: large values of the MSE and VXB and a small value of the INTER are obtained. As tabulated in Table 2.4, the large values of the F(I), F′(I), and Q(I) functions confirm the non-homogeneous segmentation of the image by fuzzy c-means clustering.
Fig. 2.9 Mapping of membership functions u1, u2, and u3 of fuzzy c-means clustering over regions C1, C2, and C3, using data ranged between 1 and 120 (Siddiqui et al., 2013)
Fig. 2.10 Applying the fuzzy c-means clustering on Football image; (a) test image, (b) histogram
of the test image, (c) segmented image, and (d) histogram of the segmented image
2.6 Adaptive Fuzzy Moving k-means Clustering

Adaptive fuzzy moving k-means (AFMKM) clustering was introduced for image segmentation (Isa et al., 2009). In AFMKM clustering, the cluster-member transferring concept of AMKM and the fuzzy concept of fuzzy c-means clustering are integrated to improve the transferring process. The pixels with mass (i.e., the fuzzy weight of a pixel in a cluster) less than the centroid value of the cluster with the highest fitness are moved from that cluster to the nearest cluster (Isa et al., 2009). Based on the study (Sulaiman & Isa, 2010), this approach produces significantly better segmentation performance than AMKM clustering.
The AFMKM clustering is a modified version of the AMKM clustering that incorporates the fuzzy concept. For illustration, the complete implementation of the AFMKM clustering is as follows:
In AFMKM clustering, the fuzzy concept of fuzzy c-means clustering is fused with the transferring concept of MKM clustering. As it is a well-proven argument that the fuzzy concept is highly sensitive to outliers (Hathaway et al., 2002) and leads the solution to converge at a non-active region or poor local minimum, the fuzzy concept makes AFMKM clustering highly sensitive to outliers as well. AFMKM has two further limitations inherited from MKM clustering: (a) the transferring of an unsuitable range of members from the cluster with the largest fitness value toward the nearest clusters, and (b) a fitness condition that is itself incapable of distinguishing between clusters with similar-valued members and empty clusters (Siddiqui & Isa, 2011).
To illustrate the limitations of AFMKM clustering, the Light House image is selected as the test image. In the initial settings, the number of clusters is set to three. The histogram information in Fig. 2.12 (b) suggests that the ideal locations of the three centroids on the intensity scale are 80, 130, and 250. Based on Fig. 2.12 (d), the AFMKM
Fig. 2.12 Applying the AFMKM clustering on Light House image; (a) test image (b) histogram
of the test image, (c) segmented image, and (d) histogram of the segmented image
Table 2.5 Quantitative analysis of the AFMKM clustering using Light House image

Initialized number of clusters | Final number of clusters | MSE | INTER | VXB | F(I) (1×10⁷) | F′(I) | Q(I)
3 | 3 | 4599.9 | 23.33 | 9.154 | 15.28 | 12.99 | 382.8
generates centroids located at 25, 55, and 60. Because of this false centroid localization, the regions are segmented with incomplete shapes and blurred edges (Fig. 2.12 (c)). Based on the tabulated quantitative analysis in Table 2.5, the large values of the MSE, F(I), F′(I), and Q(I) and the small value of INTER confirm the empty-cluster and trapped-centroid limitations of AFMKM clustering.
2.7 Adaptive Fuzzy k-means Clustering

Sulaiman and Isa (2010) proposed the adaptive fuzzy k-means (AFKM) clustering, an advanced technique that fuses the fuzzy concept with a novel belongingness concept. Sulaiman and Isa (2010) considered that membership alone is not the best approach for assigning pixels to clusters. Therefore, a novel belongingness concept is derived from the membership function of fuzzy clustering to ensure a strong relationship between the clusters and their members.

As discussed earlier, the AFKM clustering integrates the belongingness concept and the fuzzy concept to strengthen the relationship between clusters and their members. Based on the literature (Sulaiman & Isa, 2010), AFKM clustering is implemented as follows:
The Monument Building image is selected as the test image to illustrate the limitations of AFKM clustering. In the initial setting, the number of clusters is set to four; however, the image is segmented into only three clusters, as confirmed by the tabulated results in Table 2.6. The histograms of the test image and the segmented image are shown in Fig. 2.14 (b) and (d), respectively. They confirm the formation of an empty cluster in the segmented result: as shown in Fig. 2.14 (d), only three centroids instead of four are obtained. In addition, the segmented image is completely blurred, and much useful information fails to be segmented by the AFKM clustering (Fig. 2.14 (c)). The quantitative results of AFKM clustering are tabulated in Table 2.6, where the large values of MSE, F(I), F′(I), and Q(I) confirm the poor convergence property of the AFKM clustering. The infinite value of VXB proves that the clustering process ends with two centroids having similar intensity: one of them holds most of the pixels, while the other has no pixels and becomes an empty cluster.
Table 2.6 Quantitative analysis of the AFKM clustering using Monument Building image

Initialized number of clusters | Final number of clusters | MSE | INTER | VXB | F(I) (1×10⁷) | F′(I) | Q(I)
4 | 3 | 1217.6 | 52.16 | ∞ | 31.22 | 22.66 | 330.9
Fig. 2.14 Applying the AFKM clustering on the Monument Building image; (a) test image (b)
histogram of the test image, (c) segmented image, and (d) histogram of the segmented image
References
Anderson, B. J., Gross, D. S., Musicant, D. R., Ritz, A. M., Smith, T. G., & Steinberg, L. E. (2006).
Adapting k-medians to generate normalized cluster centers. Paper presented at the Proceedings
of the 2006 SIAM International Conference on Data Mining.
Barioni, M. C. N., Razente, H. L., Traina, A. J. M., & Traina, C., Jr. (2008). Accelerating k-medoid-
based algorithms through metric access methods. Journal of Systems and Software, 81(3),
343–355. https://doi.org/10.1016/j.jss.2007.06.019
Bezdek, J. (1981). Pattern recognition with fuzzy objective function algorithms. Plenum Press.
Bradley, P. S., & Fayyad, U. M. (1998). Refining initial points for k-means clustering. Paper pre-
sented at the ICML.
Dixon, S. J., Heinrich, N., Holmboe, M., Schaefer, M. L., Reed, R. R., Trevejo, J., & Brereton,
R. G. (2009). Use of cluster separation indices and the influence of outliers: Application of
two new separation indices, the modified silhouette index and the overlap coefficient to simu-
lated data and mouse urine metabolomic profiles. Journal of Chemometrics: A Journal of the
Chemometrics Society, 23(1), 19–31.
Ester, M., Kriegel, H.-P., & Xu, X. (1995). A database interface for clustering in large spatial
databases. Inst. für Informatik.
Fan, J.-L., Zhen, W.-Z., & Xie, W.-X. (2003). Suppressed fuzzy c-means clustering algorithm.
Pattern recognition letters, 24(9-10), 1607–1612.
Hathaway, R., Bezdek, J., & Hu, Y. (2002). Generalized fuzzy c-means clustering strategies using
Lp norm distances. IEEE Transactions on Fuzzy Systems, 8(5), 576–582.
Isa, N. A. M., Salamah, S. A., & Ngah, U. K. (2009). Adaptive fuzzy moving K-means cluster-
ing algorithm for image segmentation. IEEE Transactions on Consumer Electronics, 55(4),
2145–2153.
Kaufman, L., & Rousseeuw, P. J. (1990). Partitioning around medoids (program pam). Finding
groups in data: an introduction to cluster analysis, 344, 68–125.
Kaufman, L., & Rousseeuw, P. J. (2009). Finding groups in data: An introduction to cluster analy-
sis (Vol. 344). Wiley.
Kersten, P. R. (1999). Fuzzy order statistics and their application to fuzzy clustering. IEEE
Transactions on Fuzzy Systems, 7(6), 708–712.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations.
In L. M. LeCam & J. Neyman (Eds.), Proceedings of the fifth Berkeley symposium on math-
ematical statistics and probability.
Mashor, M. (2000). Hybrid training algorithm for RBF network. International Journal of The
Computer, The Internet and Management, 8(2), 50–65.
Siddiqui, F. U., & Isa, N. A. M. (2011). Enhanced moving K-means (EMKM) algorithm for image
segmentation. IEEE Transactions on Consumer Electronics, 57(2), 833–841.
Siddiqui, F., & Isa, N. M. (2012). Optimized K-means (OKM) clustering algorithm for image
segmentation. Opto-Electronics Review, 20(3), 216–225.
Siddiqui, F. U., Isa, N. A. M., & Yahya, A. (2013). Outlier rejection fuzzy c-means (ORFCM)
algorithm for image segmentation. Turkish Journal of Electrical Engineering & Computer
Sciences, 21(6).
Sulaiman, S. N., & Isa, N. A. M. (2010). Adaptive fuzzy-K-means clustering algorithm for image
segmentation. IEEE Transactions on Consumer Electronics, 56(4), 2661–2668.
Thomas, B., Raju, G., & Sonam, W. (2009). A modified fuzzy c-means algorithm for natural data
exploration. World Academy of Science, Engineering and Technology, 49.
Wei, L.-m., & Xie, W.-x. (2000). Rival checked fuzzy c-means algorithm. Acta Electronica Sinica,
28(7), 63–66.
Weisstein, E. W. (2004). K-Means Clustering Algorithm. MathWorld–A Wolfram Web Resource.
Yang, M.-S., Hwang, P.-Y., & Chen, D.-H. (2004). Fuzzy clustering algorithms for mixed feature
variables. Fuzzy Sets and Systems, 141(2), 301–317.
Chapter 3
Novel Partitioning Clustering
Although the partitioning clustering techniques can simplify an image with less
complexity, they have major problems that lead the final solution of clustering tech-
niques to converge at poor local minima. These problems include initialization,
trapped centroid at the non-active region, empty cluster, and outlier sensitivity. This
chapter addresses the modifications in clustering techniques proposed in the litera-
ture by the authors of this book (Siddiqui & Isa, 2012; Siddiqui & Isa, 2011; Siddiqui
et al., 2013) to overcome all the mentioned limitations.
The k-means and fuzzy c-means clustering techniques represent the hard-
membership function-based clustering and fuzzy-membership function-based clus-
tering techniques, respectively. The pixel assigning process of k-means was modified
in the optimized k-means clustering (Siddiqui & Isa, 2012), and the working steps
of this technique are discussed in detail in this chapter. It also covers the working
steps of the enhanced moving k-means clustering technique (Siddiqui & Isa, 2011),
where the transferring process of moving k-means clustering technique is modified
to avoid empty cluster, dead centroid, and centroid trapping at non-active region
problems of k-means clustering. Finally, the working steps of outlier rejection fuzzy
c-means clustering technique are discussed in detail. This technique reduces the outlier sensitivity of fuzzy c-means clustering. Overall, the modifications made in the mentioned clustering techniques lead the techniques' final solutions to the optimum global location. A few sample images are used in this chapter to discuss the performance of the modified clustering techniques.
3.2 Optimized k-means Algorithm

Like k-means clustering, the OKM clustering assigns pixels to their nearest clusters using the Euclidean distance, but OKM has an advanced process for pixels having the same Euclidean distance to two or more adjacent clusters. Such pixels are known as conflict pixels. The cluster fitness value is measured using Eq. 3.1, and the conflict pixels are assigned to the cluster with the lowest fitness value among those adjacent clusters (Siddiqui & Isa, 2012). This ensures that a low-variance cluster will move to the active region during the clustering process.
$$f(c_j) = \sum_{i \in c_j} \left( p_i(x,y) - c_j \right)^2 \qquad (3.1)$$
When the adjacent clusters of the conflict pixels include empty, zero-variance, and positive-variance clusters, the pixel-assigning process performs a further analysis to ensure that the conflict pixels are assigned to the empty clusters, so that the empty-centroid and trapped-centroid problems will not occur in the next iteration of the clustering process. In detail, the intensities of the conflict pixels are arranged in ascending order of their distance from the cluster with the highest fitness value. The intensities are stored in an array denoted E_g, where g = 1, 2, …, (k−1); if the image is segmented into k clusters, k−1 is the maximum number of stored intensity levels. Empty clusters and zero-variance clusters are difficult to differentiate by k-means clustering; therefore, the OKM clustering considers both the fitness value and the number of pixels of each adjacent cluster. First, all the clusters are arranged in ascending order of their fitness values, denoted F_q, where q = 1, 2, …, v; the fitness value of each cluster is measured using Eq. 3.1. Since zero-fitness clusters can be either zero-variance or empty clusters, the zero-fitness clusters among the adjacent clusters are rearranged according to the number of pixels they contain. The sorted array F_v is renamed H_w, where w = 1, 2, …, z (z is the number of clusters with zero variance and no pixels). Finally, the transfer of pixels between the E_g and H_w arrays begins: the pixels with the intensity value E_g are transferred to the cluster H_w. This transferring process continues until either w equals z or g equals (k−1) (Siddiqui & Isa, 2012). If the transferring process terminates with g equal to z and z less than k−1, the remaining pixels of E_g are assigned to the adjacent cluster with the lowest fitness value. After completing the assignment of the pixels in E_g, all the centroids are updated using Eq. 3.2.
$$c_j = \frac{1}{n_j} \sum_{i \in c_j} p_i(x,y) \qquad (3.2)$$
The clustering process described above is repeated until the difference in mean square error (MSE) between consecutive iterations is less than α, where 0 < α ≤ 1 (Siddiqui & Isa, 2012); to obtain good segmentation performance, the value of α should typically be close to 0. The terminating criterion can be defined as

$$\left| \mathrm{MSE}_{t-1} - \mathrm{MSE}_{t} \right| < \alpha \qquad (3.3)$$
According to the study (Siddiqui & Isa, 2012), the optimized k-means clustering
is implemented as follows:
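The step listing is not reproduced in this excerpt; the sketch below shows only the core of the OKM assignment rule for conflict pixels (the full E_g/H_w empty-cluster bookkeeping described above is omitted, and all names are ours):

```python
import numpy as np

def okm_assign(pixels, centroids):
    """OKM assignment sketch: tied ("conflict") pixels go to the adjacent
    cluster with the lowest fitness (Eq. 3.1) instead of an arbitrary one."""
    d = np.abs(pixels[:, None] - centroids[None, :])   # distances to centroids
    labels = np.argmin(d, axis=1)                      # provisional assignment
    fitness = np.array([np.sum((pixels[labels == j] - centroids[j]) ** 2)
                        for j in range(len(centroids))])
    dmin = d.min(axis=1, keepdims=True)
    conflicts = np.where((d == dmin).sum(axis=1) > 1)[0]
    for i in conflicts:
        tied = np.where(d[i] == d[i].min())[0]         # equally near clusters
        labels[i] = tied[int(np.argmin(fitness[tied]))]
    return labels
```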
The flow diagram of the OKM clustering technique is shown in Fig. 3.1.
Comparatively, k-means clustering has a short processing time, and its time complexity is defined in (Leibe et al., 2006). The only additional parameter in the time complexity expression of OKM compared with k-means is b, the number of intensity values of the conflict pixels that must be assigned to their respective clusters. The big-O notation in both expressions describes the growth rate of the corresponding function. The time complexity of OKM clustering does not increase significantly because of the extra parameter b, as the advanced process is executed only if conflict pixels and empty clusters are encountered during the clustering process.
The main objective of the optimized k-means (OKM) clustering is to avoid the empty-cluster and trapped-centroid problems. To illustrate the OKM performance, the Bridge image is selected, as shown in Fig. 3.2 (a). Figure 3.2 (c) and (e) show the image segmented into three and six regions (clusters) by the OKM clustering. In addition, the histograms of the test image before and after the segmentation process with different numbers of clusters (i.e., three and six) are shown in Fig. 3.2 (b), (d), and (f), respectively. Comparing the histograms of the test image and the OKM segmented images with three and six clusters confirms that OKM places the final centroids at the appropriate peaks of the original histogram, as shown in Fig. 3.2 (d) and (f), and their final solutions converge near the global optimum location. In Fig. 3.2 (c) and (e), the likely object regions such as steel ropes, deck, pillars, mountains, and sky are well segmented. These results show that OKM can prevent centroids from being trapped in non-active regions. In addition, both segmented images have no dead centers, being clustered into three and six regions, respectively; thus, the results confirm that OKM clustering is also capable of avoiding the empty-cluster problem. Table 3.1 tabulates the values of the MSE, INTER, F(I), F′(I), and Q(I) functions, where the small value of the
Fig. 3.2 Applying the OKM clustering on Bridge image; (a) original image, (b) histogram of the
original image, (c) segmented image at k=3, (d) histogram of the segmented image at k=3, (e)
segmented image at k=6, and (f) histogram of the segmented image at k=6
MSE and large value of the INTER confirm that the images are homogeneously segmented and their solutions converge to the optimum global location. Furthermore, the small values of F(I), F′(I), and Q(I) also confirm that the OKM clustering succeeds in segmenting the images into homogeneous regions.
Table 3.1 Results produced by the OKM clustering technique on Bridge and Light House images
Fig. 3.3 Transferring process of EMKM-1 for the cluster with the highest fitness value, where members of c_l outside the range (shown in the light blue region) are transferred to the nearest cluster only
3.3 Enhanced Moving k-means Clustering

The enhanced moving k-means (EMKM) clustering was introduced by Siddiqui and Isa (2011); it employs a hard membership function, where each pixel is assigned to a single cluster only. After assigning all pixels to clusters, the positions of the centroids are updated in each iteration of the clustering process according to Eq. 3.2. The EMKM clustering technique measures each cluster's fitness in the same fashion as the conventional MKM and AMKM clustering techniques. The clusters are arranged in ascending order, and the clusters with the smallest and largest fitness values are denoted c_s and c_l, respectively. The relationship between c_s and c_l must fulfil the condition f(c_s) ≥ α_a f(c_l) to secure a final solution at the optimum location; the variance among clusters is not considered similar if this condition is not fulfilled. However, the MKM and AMKM clustering techniques have an inappropriate transferring process that fails to keep the variance balanced among the clusters.
Thus, two versions, called EMKM-1 and EMKM-2, were introduced in the literature to keep the cluster variance in a reasonable range and avoid the clusters being trapped at a local optimum. In the EMKM-1 technique, the cluster c_l keeps only the members within the range $\frac{1}{2}c_l(r)$, where c_l(r) is the radius of the cluster c_l (Siddiqui & Isa, 2011). As shown in Fig. 3.3, the members farther from the centroid than the threshold $\frac{1}{2}c_l(r)$ are assigned to the nearest cluster.
Fig. 3.4 Transferring process of EMKM-2 for the cluster with the highest fitness value, where bordered members of the highest-fitness cluster (members located in the yellow and light blue regions) are transferred to their closest clusters
where i = 1, 2, …, N × M and j = 1, 2, …, k. According to the literature (Siddiqui & Isa, 2011), the EMKM-1 and EMKM-2 clustering techniques are implemented as follows (Figs. 3.5 and 3.6):
(a) EMKM-1
(b) EMKM-2
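Neither step listing is reproduced in this excerpt; as a stand-in for both, the sketch below captures our reading of the EMKM-1 transferring rule, with the cluster radius c_l(r) assumed to be the largest member-to-centroid distance:

```python
import numpy as np

def emkm1_transfer(pixels, labels, centroids, fitness):
    """EMKM-1 transfer sketch: the highest-fitness cluster c_l keeps only
    members within (1/2)*c_l(r) of its centroid; members beyond this
    threshold are re-assigned to their nearest other cluster."""
    cl = int(np.argmax(fitness))
    dists = np.abs(pixels[labels == cl] - centroids[cl])
    if dists.size == 0:
        return labels
    radius = dists.max()          # c_l(r) assumed: largest member distance
    outside = (labels == cl) & (np.abs(pixels - centroids[cl]) > 0.5 * radius)
    others = [j for j in range(len(centroids)) if j != cl]
    for i in np.where(outside)[0]:
        labels[i] = others[int(np.argmin([abs(pixels[i] - centroids[j])
                                          for j in others]))]
    return labels
```

EMKM-2 differs in that only the bordered members of the highest-fitness cluster are moved to their closest adjacent clusters, as Fig. 3.4 illustrates.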
The enhanced versions of the moving k-means clustering, EMKM-1 and EMKM-2, are capable of avoiding the empty-cluster and trapped-centroid problems, which ensures that the final solution converges to the optimum global location. These superior abilities of the EMKM-1 and EMKM-2 clustering techniques were verified using the Gantry Crane image with the number of clusters set to four. The segmented images produced by the EMKM-1 and EMKM-2 clustering techniques are shown in Fig. 3.7 (c) and (e), respectively. In addition, the histograms of the original image and the segmented images produced by the EMKM-1 and EMKM-2 techniques are also plotted; refer to Fig. 3.7 (d) and (f). In the last chapter, AMKM was applied to the same image, and the switch panel was not homogeneously segmented because of a centroid trapped in the non-active region. In contrast, EMKM-1 and EMKM-2 homogeneously segment the switch panel, particularly EMKM-2, which produces remarkable results with homogeneous segmentation of the switch panel region. This is also confirmed by the histograms in Fig. 3.7 (d) and (f), where a centroid clearly represents the active region in the range 0–50 on the intensity scale. Furthermore, Table 3.2 presents the quantitative results for the selected image when testing the performance of EMKM-1 and EMKM-2. The small MSE value and large INTER value confirm that the final centroids are located in the active regions. In addition, the small values of the F(I), F′(I), and Q(I) show that the EMKM-1 and EMKM-2 techniques segment the image homogeneously. In comparison, EMKM-2 has smaller values of F(I), F′(I), and Q(I) than its variant EMKM-1, confirming the robust performance of the advanced transferring process of EMKM-2, where only the bordered members of the highest-fitness cluster are transferred to the closest of the adjacent clusters.
3.4 Outlier Rejection Fuzzy c-means Clustering

Like fuzzy c-means, the ORFCM clustering minimizes the fuzzy objective function

$$J = \sum_{i=1}^{n} \sum_{j=1}^{k} u_{ij}^{m} \left\| x_i - c_j \right\|^2$$

subject to:

$$u_{ij} \in [0,1] \quad \text{for } i = 1, 2, \dots, n; \; j = 1, 2, \dots, k$$

$$\sum_{j=1}^{k} u_{ij} = 1, \quad i = 1, 2, \dots, n$$

where u_ij is the degree of membership of pixel x_i in the jth cluster and m is the degree of fuzziness, typically equal to 2. The major change made to the membership function of the conventional fuzzy c-means clustering, to overcome the outlier sensitivity, is that the original Euclidean distance ‖x_i − c_j‖ is replaced with ‖x_i − c_j‖^β. According to Siddiqui et al. (2013), the modified membership equation is

$$u_{ij} = \frac{1}{\sum_{p=1}^{k} \left( \dfrac{\left\| x_i - c_j \right\|^{\beta}}{\left\| x_i - c_p \right\|^{\beta}} \right)^{2/(m-1)}} \qquad (3.8)$$
The exponent variable β limits the partial distribution of points to the two neighboring clusters rather than to all clusters, and it is defined as:

$$\beta = \frac{\text{Range of intensity in an image} + 1}{\text{Maximum range of intensity} + 1} + 1 \qquad (3.9)$$

For an 8-bit gray-scale image, this becomes

$$\beta = \frac{I_{\max} - I_{\min} + 1}{256} + 1 \qquad (3.10)$$

where I_max is the maximum intensity and I_min is the minimum intensity in the gray image. The value of β lies between 1 and 2, and it is close to 2 if the image intensity range is at its maximum. A membership with β = 2 restricts the partial distribution of points to the two adjacent clusters, whereas a membership with β = 1 (which occurs if the image intensity range is lowest) is more flexible in partially distributing points among the adjacent clusters.

Fig. 3.7 Applying the EMKM-1 and EMKM-2 clustering techniques on Gantry Crane image; (a) original image, (b) histogram of the original image, (c) EMKM-1 segmented image, (d) histogram of EMKM-1 segmented image, (e) EMKM-2 segmented image, and (f) histogram of EMKM-2 segmented image

Table 3.2 Results produced by the EMKM-1 and EMKM-2 clustering techniques on Gantry Crane image
Let I denote an image with n pixels (i.e., p_i(x, y), i ∈ {1, 2, …, n}) to be partitioned into k clusters, where 2 ≤ k ≤ n, and let c_j (for j = 1, 2, …, k) be the jth cluster. Consider the matrix U = (u_ij)_{k×n}, called the fuzzy partition matrix, in which each element u_ij indicates the membership degree of each pixel in the jth cluster, c_j. The implementation of ORFCM is as follows:
The flow diagram of the outlier rejection fuzzy c-means (ORFCM) clustering is
shown in Fig. 3.8.
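Again the step listing is not reproduced here; a minimal sketch of the ORFCM membership update, combining Eqs. 3.8 and 3.10 (the names are ours):

```python
import numpy as np

def orfcm_membership(pixels, centroids, m=2.0):
    """ORFCM membership sketch implementing Eqs. 3.8 and 3.10: the Euclidean
    distance is raised to the exponent beta derived from the image's
    intensity range, which pushes outlier memberships towards zero."""
    beta = (pixels.max() - pixels.min() + 1) / 256.0 + 1.0       # Eq. 3.10
    d = np.abs(pixels[:, None] - centroids[None, :]) ** beta + 1e-12
    ratio = d[:, :, None] / d[:, None, :]                        # d_ij / d_ip
    return 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)        # Eq. 3.8
```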
To illustrate the performance of the ORFCM clustering technique, two data ranges were employed, i.e., 1 to 120 and 1 to 60. After applying the ORFCM technique to both datasets, the final regions c1, c2, and c3 with their respective membership functions u1, u2, and u3 are plotted in Fig. 3.9 (a) and (b). In both graphs, the data are partially distributed between two clusters only. The graphs also confirm that
outliers are not effective in the clustering process; for example, the outliers in the graphs have almost zero membership values. More importantly, the data with the large intensity range (i.e., 1–120) are partially distributed between the two adjacent clusters, and the data with the small intensity range (i.e., 1–60) are not only partially distributed between two adjacent clusters, but their membership between the two adjacent clusters is also more flexible. Thus, the effects of outliers on the clustering process have been successfully
reduced, which also reduces the chance of the centroid trapping problem for data with a small intensity range.

Fig. 3.9 Membership function generated by the ORFCM clustering for data of intensity range (a) 1–120 and (b) 1–60 (Siddiqui et al., 2013)
The Football image is selected to further evaluate the performance of the ORFCM clustering technique. In the initial setting, three clusters are set. As shown in Fig. 3.10 (d), the ORFCM segments the Football image accurately into the predefined three clusters, with the final centroids positioned at levels 50, 140, and 245 of the intensity histogram. By avoiding the outlier effect on the clustering process, regions like the laces of the football are homogeneously segmented, as shown in Fig. 3.10. The success of ORFCM is further confirmed by quantitative analysis: the small values of the MSE and VXB and the large value of the INTER indicate the better performance of ORFCM (as tabulated in Table 3.3). In addition, the small values of the F(I), F′(I), and Q(I) functions confirm the homogeneous segmentation of the regions in the image by the ORFCM clustering technique.
Fig. 3.10 Applying the ORFCM clustering on Football image; (a) original image, (b) histogram
of the original image, (c) segmented image, and (d) histogram of the segmented image
Table 3.3 Results produced by the ORFCM Clustering technique on Football image
References
Leibe, B., Mikolajczyk, K., & Schiele, B. (2006). Efficient clustering and matching for object class
recognition. In BMVC, pp. 789–798.
Siddiqui, F. U., & Isa, N. A. M. (2011). Enhanced moving K-means (EMKM) algorithm for image
segmentation. IEEE Transactions on Consumer Electronics, 57(2), 833–841.
Siddiqui, F., & Isa, N. M. (2012). Optimized K-means (OKM) clustering algorithm for image
segmentation. Opto-Electronics Review, 20(3), 216–225.
Siddiqui, F. U., Isa, N. A. M., & Yahya, A. (2013). Outlier rejection fuzzy c-means (ORFCM)
algorithm for image segmentation. Turkish Journal of Electrical Engineering & Computer
Sciences, 21(6), 1801.
Chapter 4
Quantitative Analysis Methods
of Clustering Techniques
4.1 Analysis Methods of Clustering Techniques

Currently, two analysis methods are available for comparing the performance of clustering techniques, i.e., qualitative and quantitative analyses.
Fig. 4.1 Segmented images in 3 and 4 clusters for the Building and Plane images after applying k-means and EMKM-1 clustering techniques (Siddiqui, 2012)
Fig. 4.2 Segmented images in 5 and 6 clusters for the House and Fruit-table images after applying k-means and EMKM-1 clustering techniques (Siddiqui, 2012)
Quantitative analysis is another procedure for examining the performance of clustering techniques. It measures two characteristics of the extracted regions, i.e., the similarity index and the homogeneity of the extracted regions. Compared with qualitative analysis, quantitative analysis has many advantages, such as having no human dependency; therefore, a quick and fair comparison is possible for image segmentation with any number of clusters.
4.1.2.1 MSE

The regions of real-world images have rich texture and extremely irregular boundaries. In such a complex environment, similarity index measurement evaluates the quality without analyzing contrast, shape, homogeneity, or edges. MSE (mean square error) is the simplest similarity-index-based quantitative method; it measures the squared error between the pixels of the original and resultant images using Eq. 4.1.
$$\mathrm{MSE} = \frac{1}{n} \sum_{j=1}^{k} \sum_{i \in c_j} \left( p_i(x,y) - c_j \right)^2 \qquad (4.1)$$
where c_j is the jth cluster and p_i(x, y) is a pixel belonging to the jth cluster. It is implemented as follows:
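A minimal sketch of Eq. 4.1 for a gray-scale image stored as a flat array (the names are ours):

```python
import numpy as np

def mse(pixels, labels, centroids):
    """Eq. 4.1: mean squared distance of every pixel to its cluster centroid."""
    return float(np.mean((pixels - centroids[labels]) ** 2))
```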
Table 4.1 Average MSE value for 103 benchmark images after applying different clustering techniques

Techniques | k=3 | k=4 | k=5 | k=6
KM | 303.52 | 181.39 | 115.68 | 84.715
FCM | 297.45 | 171.66 | 109.74 | 76.746
MKM | 494.59 | 316.83 | 229.63 | 173.64
AMKM | 337.94 | 192.98 | 131.91 | 91.622
AFMKM | 950.30 | 456.17 | 260.01 | 96.242
AFKM | 3745.3 | 3542.6 | 3543.1 | 3593.0
OKM | 300.06 | 179.12 | 121.72 | 85.388
EMKM-1 | 321.26 | 185.07 | 119.62 | 83.853
EMKM-2 | 315.59 | 180.08 | 115.31 | 80.734
ORFCM | 294.88 | 169.42 | 108.81 | 77.310
According to Eq. 4.1, a lower difference between the resultant and original images produces a smaller value of MSE. This indicates that all the pixels in an image are clustered to their nearest clusters, such that each cluster contains pixels of the most similar intensity values. Furthermore, a smaller value of MSE confirms that the final solution has converged to the optimum global location. For example, the average value of MSE over 103 benchmark images is tabulated in Table 4.1; the smallest average MSE value indicates the best-performing clustering technique for a particular number of clusters (Table 4.2).
Table 4.2 Average INTER value for 103 benchmark images after applying the clustering techniques

Techniques | k=3 | k=4 | k=5 | k=6
KM | 85.106 | 83.197 | 83.197 | 80.888
FCM | 83.585 | 80.402 | 80.402 | 75.845
MKM | 71.734 | 68.111 | 68.111 | 66.978
AMKM | 84.135 | 81.190 | 81.190 | 79.605
AFMKM | 76.530 | 84.817* | 84.817* | 88.143*
AFKM | 56.161 | 46.647 | 43.471 | 41.513
OKM | 85.650 | 83.051 | 83.051 | 79.144
EMKM-1 | 78.679 | 76.079 | 76.079 | 72.746
EMKM-2 | 82.990 | 80.490 | 80.490 | 82.990
ORFCM | 83.074 | 80.694 | 80.694 | 75.290

*The largest INTER value for clustering techniques that fail to segment the images into the defined number of clusters
4.1.2.2 INTER
The similarity index can also be measured by calculating the variance among the clusters, which distinguishes the differences between adjacent clusters. It is measured using the following equation:

$$\mathrm{INTER} = \operatorname{mean}_{\forall r \neq q} \left( \left\| c_r - c_q \right\|^2 \right) \qquad (4.2)$$

where r = 1, 2, …, (k−1) and q = (r+1), …, k. Here, the inter-cluster variance is measured by taking the mean of the differences among the clusters' centroids. The INTER is implemented as follows:
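A minimal sketch of Eq. 4.2 (the names are ours):

```python
import numpy as np

def inter(centroids):
    """Eq. 4.2: mean squared difference between all distinct centroid pairs."""
    k = len(centroids)
    diffs = [(centroids[r] - centroids[q]) ** 2
             for r in range(k - 1) for q in range(r + 1, k)]
    return float(np.mean(diffs))
```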
A large value of INTER shows that the data grouped in each cluster are significantly different from those in the other clusters and that the obtained centroids converged to their optimum locations. Unlike MSE, which produces a consistent output in all situations, the INTER function produces a higher value if the dead centroid intensity is zero and a lower value if the dead centroid intensity is similar to another cluster's centroid intensity. As a higher number of clusters favors the occurrence of dead centroids, INTER is not the best quantitative analysis for images segmented into more than two clusters.
4.1.2.3 VXB
Xie and Beni (1991) introduced the VXB function (Pakhira et al., 2004; Xie & Beni, 1991). The VXB is mainly used to analyze the outlier sensitivity of fuzzy-based clustering techniques, which cannot be analyzed by the MSE and INTER quantitative analyses. The VXB function measures the compactness and separation of the pixels clustered by fuzzy-based clustering techniques. It is defined as:

$$V_{XB} = \frac{\sum\limits_{i=1}^{n} \sum\limits_{j=1}^{k} u_{ij}^{2} \left\| p_i(x,y) - c_j \right\|^2}{n \cdot \min_{\forall r \neq q} \left( \left\| c_q - c_r \right\|^2 \right)} \qquad (4.3)$$

where u_ij is the fuzzy membership of pixel i in the jth cluster. The VXB is implemented as follows:
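A minimal sketch of Eq. 4.3, assuming a membership matrix u of shape (n, k) (the names are ours):

```python
import numpy as np

def vxb(pixels, centroids, u):
    """Eq. 4.3: Xie-Beni index = compactness / (n * min centroid separation)."""
    n, k = len(pixels), len(centroids)
    d2 = (pixels[:, None] - centroids[None, :]) ** 2
    compactness = np.sum((u ** 2) * d2)
    separation = min((centroids[q] - centroids[r]) ** 2
                     for r in range(k) for q in range(k) if r != q)
    return float(compactness / (n * separation))
```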
The ratio of compactness to separation will be smaller if the following three conditions hold:
• the clusters are comparatively less overlapping
• the pixels within each cluster are as similar as possible
• the pixels across different clusters are as dissimilar as possible
Based on the tabulated information in Table 4.3, AFKM shows an infinite value because of dead centroid generation in the clustering process. The best values of
Table 4.3 Average VXB value for 103 benchmark images after applying the fuzzy-based clustering techniques

Techniques | k=3 | k=4 | k=5 | k=6
FCM | 0.3349 | 0.3862 | 0.4199 | 0.4617
AFMKM | 2.4320 | 2.8931 | 1.8536 | 0.8685
AFKM | ∞ | ∞ | ∞ | ∞
ORFCM | 0.2593 | 0.2797 | 0.2921 | 0.3052
VXB are shown in bold in the table. Based on the results, ORFCM produces less overlapped clusters; the technique is less sensitive to outliers than the other fuzzy-based clustering techniques.
4.1.2.4 F(I)
In 1994, a novel function called F(I) was introduced (Liu & Yang, 1994). It measures the homogeneity of the segmented image without any human interaction or any predefined threshold value. In addition, it indirectly measures the shape and edge information of the segmented regions, and it also measures the similarity index of the segmented regions. In detail, the function F(I) considers three basic conditions, as follows:
• the regions should be uniform and homogeneous
• the regions' interiors should have uniform characteristics, without many holes
• adjacent regions must present significantly different values
The evaluation function is defined as:

$$F(I) = \sqrt{R} \sum_{i=1}^{R} \frac{e_i^2}{\sqrt{A_i}} \qquad (4.5)$$
where I is the image to be segmented, R is the number of regions in the segmented image, A_i is the size of the ith segmented region, and e_i is the average intensity error, defined as the sum of the Euclidean distances between the intensities of the pixels of the segmented region and the original image. In Eq. 4.5, two terms penalize the segmentation if the segmented image does not fulfil the three conditions mentioned above: the term $\sqrt{R}$ penalizes segmentations that form too many regions, whereas the term $e_i^2 / \sqrt{A_i}$ penalizes small regions with large intensity errors. A smaller value of F(I) indicates a better segmentation result. Based on the experiment conducted in (Liu & Yang, 1994), a segmentation with many regions is penalized only by the former term. Moreover, the F(I) function favors noisy segmentation, as e_i² is often close to zero for small regions with low intensity error (Borsotti et al., 1998). The F(I) is implemented as follows:
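A minimal sketch of Eq. 4.5, with the simplifying assumption that each cluster forms one region (the original definition uses connected regions):

```python
import numpy as np

def f_index(pixels, labels, centroids):
    """Eq. 4.5 sketch: F(I) = sqrt(R) * sum_i e_i^2 / sqrt(A_i), where e_i^2
    is the squared intensity error of region i and A_i its size."""
    total = 0.0
    regions = np.unique(labels)
    for j in regions:
        members = pixels[labels == j]
        e2 = np.sum((members - centroids[j]) ** 2)   # squared intensity error
        total += e2 / np.sqrt(len(members))
    return float(np.sqrt(len(regions)) * total)
```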
Table 4.4 Average F(I) value for 103 benchmark images after applying the clustering techniques

Average F(I) value (1×10⁷) for different k
Techniques | k=3 | k=4 | k=5 | k=6
KM | 17.020 | 19.953 | 20.146 | 20.677
FCM | 18.243 | 20.465 | 21.825 | 21.958
MKM | 19.658 | 22.999 | 26.898 | 26.993
AMKM | 18.064 | 19.612 | 21.246 | 21.326
AFMKM | 21.329 | 21.937 | 22.715 | 22.093
AFKM | 129.60 | 113.83 | 131.60 | 25.890
OKM | 16.471 | 18.893 | 19.964 | 20.053
EMKM-1 | 19.571 | 21.937 | 22.494 | 22.098
EMKM-2 | 17.862 | 21.028 | 21.486 | 21.767
ORFCM | 17.480 | 19.885 | 21.668 | 21.642
Based on the tabulated information in Table 4.4, the best value of F(I) for the different numbers of clusters is produced by the OKM clustering technique. Similarly, ORFCM produces the best (smallest) values in the fuzzy-based clustering techniques category. The smaller value indicates the segmentation of images into homogeneous regions with a high similarity index.
4.1.2.5 F′(I)
F′(I) is an improved version of the F(I) function. Unlike the F(I) function, F′(I) penalizes a segmentation with many small regions of the same size. Region size is the key factor that allows the homogeneity criteria of a segmentation to be ranked fairly. Therefore, the term $\sum_{A=1}^{\max}\left[R(A)\right]^{1+1/A}$ gives small regions more impact on the result. If few small regions are segmented, the value of this newly introduced term is equal to $R$, and F′(I) behaves like the F(I) function. The reformed evaluation function is defined as:
$$F'(I) = \frac{1}{1000(N \times M)}\,\sqrt{\sum_{A=1}^{\max}\left[R(A)\right]^{1+1/A}} \times \sum_{i=1}^{R}\frac{e_i^{2}}{\sqrt{A_i}} \qquad (4.6)$$
The F′(I) function can be implemented as follows:
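A corresponding sketch of Eq. (4.6), under the same assumptions as the F(I) sketch above (the helper name `f_prime_measure` is illustrative):

```python
import numpy as np
from collections import Counter

def f_prime_measure(original, labels):
    """F'(I) of Eq. (4.6), with the 1/(1000(N x M)) normalization."""
    original = np.asarray(original, dtype=float)
    n_pixels = original.size  # N x M
    region_ids, areas = np.unique(labels, return_counts=True)
    # R(A): number of regions that have exactly area A.
    r_of_area = Counter(areas.tolist())
    size_term = sum(count ** (1.0 + 1.0 / a) for a, count in r_of_area.items())
    total = 0.0
    for rid, area in zip(region_ids, areas):
        mask = labels == rid
        err2 = ((original[mask] - original[mask].mean()) ** 2).sum()
        total += err2 / np.sqrt(area)
    return np.sqrt(size_term) * total / (1000.0 * n_pixels)
```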
4.1.2.6 Q(I)
The function Q(I) is another modified version of F(I); it evaluates a clustering technique based on the homogeneity of the regions, their intensity error, and their size (Borsotti et al., 1998). Compared to the conventional F(I) function, Q(I) gives greater weight to the size of small regions when computing the result. This evaluation function is mathematically defined as:
$$Q(I) = \frac{1}{1000(N \times M)}\,\sqrt{R}\,\sum_{i=1}^{R}\left[\frac{e_i^{2}}{1+\log A_i} + \left(\frac{R(A_i)}{A_i}\right)^{2}\right] \qquad (4.7)$$
where $1/(1000(N \times M))$ is a normalizing factor, $N \times M$ is the size of the image, and $R(A_i)$ is the number of regions having area $A_i$.
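For completeness, a minimal sketch of Eq. (4.7) under the same assumptions as the earlier sketches (the log base is not stated in the text, so the natural logarithm is assumed here, and `q_measure` is an illustrative name):

```python
import numpy as np
from collections import Counter

def q_measure(original, labels):
    """Q(I) of Eq. (4.7)."""
    original = np.asarray(original, dtype=float)
    n_pixels = original.size  # N x M
    region_ids, areas = np.unique(labels, return_counts=True)
    r_of_area = Counter(areas.tolist())  # R(A_i): regions with area A_i
    total = 0.0
    for rid, area in zip(region_ids, areas):
        mask = labels == rid
        err2 = ((original[mask] - original[mask].mean()) ** 2).sum()
        # Log base unspecified in the text; natural log assumed.
        total += err2 / (1.0 + np.log(area)) + (r_of_area[area] / area) ** 2
    return np.sqrt(len(region_ids)) * total / (1000.0 * n_pixels)
```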
Table 4.5 Average F′ (I) value for 103 benchmark images after applying the clustering techniques
According to the results generated for the clustering techniques in Table 4.6, AFKM shows smaller values because dead centroids are generated during its clustering process (see the footnote to Table 4.6). The best Q(I) values are set in bold. Based on the Q(I) analysis, AFMKM and OKM produce uniform segmented regions with higher similarity indexes.
Table 4.6 Average Q(I) value for 103 benchmark images after applying the clustering techniques
Average Q(I) value for different k
Techniques k=3 k=4 k=5 k=6
KM 2001.26 12211.3 49382.8 112888
FCM 2040.25 12889.4 70652.0 204829
MKM 2217.80 24071.7 141857 333357
AMKM 2191.53 10281.9 65727.3 206732
AFMKM 1355.15 9288.63 31316.0 116325
AFKM 3027.52 1639.44* 5698.71* 43040.4*
OKM 1471.09 10533.9 37135.0 104516
EMKM-1 2434.62 15739.9 84585.5 190539
EMKM-2 1682.25 15081.1 55717.4 157287
ORFCM 2180.09 15203.5 93469.7 184625
*The smallest Q(I) value for the algorithms that fail to segment the images into a defined number
of clusters
References
Borsotti, M., Campadelli, P., & Schettini, R. (1998). Quantitative evaluation of color image seg-
mentation results. Pattern Recognition Letters, 19(8), 741–747.
Liu, J., & Yang, Y.-H. (1994). Multiresolution color image segmentation. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 16(7), 689–700.
Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001, July). A database of human segmented natural
images and its application to evaluating segmentation algorithms and measuring ecological
statistics. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV
2001 (Vol. 2, pp. 416–423). IEEE.
Pakhira, M. K., Bandyopadhyay, S., & Maulik, U. (2004). Validity index for crisp and fuzzy clus-
ters. Pattern Recognition, 37(3), 487–501.
Siddiqui, F. U. (2012). Enhanced clustering algorithms for gray-scale image segmentation. Master
dissertation, Universiti Sains Malaysia.
Xie, X. L., & Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 13(8), 841–847.
Index
A
Active region, 18, 39, 41, 44, 46, 49, 50, 54, 56, 69, 70, 73, 83
Adjacent regions, 100
AFMKM clustering, 55–58
AMKM clustering, 46–50, 55, 56, 76

C
Canny edge, 28
Centroid, 8, 14–19, 35–38, 41, 44, 46, 47, 49, 50, 53, 55, 59, 65, 69, 70, 73, 77, 83, 98, 99, 104
Clustering
  fuzzy partitioning, 17, 35
  hard partitioning, 17, 18, 35
  hierarchical, 15, 16
  partitioning, 17, 35
  unsupervised, 14
Convex hull, 6

D
Dead centroid, 17, 37, 41, 98
Digital image processing system, 1, 2

E
Edge-based segmentation technique, 3
Edge linkers
  global, 29
  local, 29
EMKM clustering, 76
Empty cluster, 70
Entropy, 11
Euclidean distance, 18, 35, 38, 44, 53, 70, 77, 84, 100

F
F′(I), 39, 46, 50, 54, 55, 59, 65, 73, 83, 90, 91, 102, 103
F(I) function, 100, 102
First order-based edge detection, 3
First-order derivative edge detection, 23
Fitness condition, 44, 47, 56, 77
Fuzzy clustering, 9–11
Fuzzy c-means, 50, 52–56, 69, 84, 86–88, 90

G
Global thresholding, 4, 5
Gradient operator, 24–26

H
Highest fitness value, 17, 41, 46, 55, 70, 76, 77, 84
Histogram, 4–11, 29, 39, 40, 45, 49, 55, 58, 66, 73, 75, 83, 85, 90
  exponential, 7
  function, 5
  gray level, 11

K
k-means, 8, 35–41, 45–50, 53, 55–66, 69–71, 73, 76, 77, 83, 85, 93–96

L
Laplacian edge method, 27
Local adaptive thresholding, 12, 13
Local mean, 13, 14
Local minima, 5, 17, 19, 37, 38, 46, 53, 56, 65, 69
  poor, 65
Lowest fitness value, 17, 41, 46, 70

M
Mean square error (MSE), 71, 96
Membership function, 10, 11, 18, 19, 35, 50, 54, 59, 69, 76, 84, 88
Modelling, 7
Moving k-means, 69

N
Noise filtering, 1, 26

O
OKM clustering, 70, 71, 73–76
Optimal threshold, 8–11
ORFCM clustering, 84, 89–91
Otsu method, 9
Outliers, 8, 15, 16, 18, 19, 37, 38, 53, 54, 56, 65, 69, 84, 89, 90, 99, 100

R
RCFCM clustering, 53
Region-based segmentation technique, 3
Region growing, 3, 19–21

S
Second order-based edge detection, 3
Second-order derivative edge detection, 26–28
S-FCM clustering, 53
Similarity index, 15, 29, 96, 98, 100, 102, 103
Soft membership, 9, 18, 50
Split-and-merge technique, 3, 21
Standard deviation, 13

T
Thresholding, 3–5, 12–14
Time complexity, 73
Transferring process, 44, 46, 47, 69, 70, 73, 76, 77, 84
Trapped centroid, 19, 49

V
Variance, 7–9, 12, 15–17, 38, 41, 44, 46, 47, 53, 54, 70, 76, 77, 98
VXB function, 99

W
Windows, 6, 10, 12, 13, 19, 22