CHAPTER 5
IMAGE CLASSIFICATION
5.1 INTRODUCTION
Several factors affect the design of the classification procedure and the quality of the classification results. The major steps of image classification may include image preprocessing, feature extraction, selection of training samples, selection of suitable classification approaches, post-classification processing, and accuracy assessment.
[Figure: general steps of image classification: definition of classes, selection of features, sampling of training data, finding a proper decision rule, classification, and verification of results.]
Step 5: Classification
Depending upon the decision rule, each pixel is assigned to a single class. There are two approaches: pixel-by-pixel classification and per-field classification, which operates on segmented areas.
In unsupervised classification, the analyst identifies the cluster content from general experience or personal familiarity with the area imaged.
The K-Means algorithm repeats the following three steps until convergence, minimizing the objective function defined in Equation (5.1); a small code sketch follows the equation.
A = \sum_{i=1}^{m} \sum_{j=1}^{n} \left\| x_{ij} - c_i \right\|^{2}    (5.1)

where \left\| x_{ij} - c_i \right\|^{2} is a distance measure between a data point x_{ij} and the cluster centre c_i.
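As an illustration, the following is a minimal NumPy sketch of the three steps (assignment to the nearest centre, centroid recalculation, and a convergence check on the objective of Equation (5.1)); the function name and parameter defaults are illustrative rather than taken from the implementation described here.

```python
import numpy as np

def kmeans(pixels, k, max_iter=100, tol=1e-4, seed=0):
    """Minimal K-Means sketch minimizing the objective of Equation (5.1).

    pixels : (n, d) array of feature vectors; k : number of clusters.
    """
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 1: assign every pixel to its nearest cluster centre.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: recompute each centroid as the mean of its assigned pixels.
        new_centers = np.array([pixels[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        # Step 3: stop once the centroids no longer move (convergence).
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return labels, centers
```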
[Figure: K-Means clustering loop: calculate the centroid and minimize the objective function.]
[Figure: system design: unsupervised classification through PCA-based K-Means clustering, calculation of the clustering coverage area, and separation into water body and non-water body.]
First, the given image values are preprocessed using PCA. Using the most important principal components, the image information is mapped into a new feature space. The K-Means algorithm is then applied to the data in this feature space, with the final objective of distinguishing the different clusters using eigenvalues. A cluster is split if its standard deviation exceeds a threshold and its pixel count is at least twice the minimum number of pixels. The main intention of the K-Means algorithm is to reduce cluster variability. The objective function is the sum of squared distances between each cluster centre and its assigned pixel values, as defined in Equation (5.2).
F = \sum_{i=0}^{n} \left\| x_i - C(x_i) \right\|^{2}    (5.2)
where x_i is a pixel value and C(x_i) is the mean value (centre) of the cluster to which x_i is assigned. The error is then determined as defined in Equation (5.3); minimizing this error is equivalent to minimizing the sum of squared distances.
\mathrm{Error} = \frac{\sum_{i=0}^{n} \left\| x_i - C(x_i) \right\|^{2}}{N - C}    (5.3)

where N is the total number of pixels and C is the number of clusters.
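A sketch of the PCA-based K-Means procedure described above might look as follows, using scikit-learn; the number of retained components and the assumption that the darker cluster corresponds to water are illustrative choices, not prescribed by the method.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def pca_kmeans_water_mask(image, n_components=3, n_clusters=2):
    """Map pixels onto the leading principal components, then cluster them.

    image : (rows, cols, bands) array; returns a boolean water/non-water mask.
    """
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    # Map the image information into the new feature space spanned by the
    # most important principal components.
    features = PCA(n_components=n_components).fit_transform(pixels)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    # Assumption: the cluster with the lower mean intensity is the water body.
    means = [pixels[labels == c].mean() for c in range(n_clusters)]
    return (labels == int(np.argmin(means))).reshape(rows, cols)
```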
The Fuzzy C-Means algorithm partitions the data by minimizing the objective function defined in Equation (5.4):

A_m = \sum_{i=1}^{N} \sum_{j=1}^{C} a_{ij}^{m} \left\| x_i - c_j \right\|^{2}, \quad 1 \le m < \infty    (5.4)

where a_{ij} is the degree of membership of pixel x_i in cluster j, updated according to Equation (5.5):

a_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\left\| x_i - c_j \right\|}{\left\| x_i - c_k \right\|} \right)^{2/(m-1)}}    (5.5)
The iteration is repeated until convergence or until a predefined number of steps is reached. Figure 5.5 depicts the system design for implementing Fuzzy C-Means clustering.
[Figure 5.5: system design for Fuzzy C-Means clustering: acquire the Landsat image, minimize the objective function, fuzzy partitioning by optimization.]
The FCM function is then called with the pixel array and the number of clusters as input.
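A minimal NumPy sketch of such an FCM function, implementing Equations (5.4) and (5.5), is given below; the stopping tolerance and fuzzifier default are illustrative.

```python
import numpy as np

def fuzzy_c_means(pixels, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal Fuzzy C-Means sketch following Equations (5.4) and (5.5).

    pixels : (n, d) data; c : number of clusters; m : fuzzifier (m > 1).
    """
    rng = np.random.default_rng(seed)
    u = rng.random((len(pixels), c))
    u /= u.sum(axis=1, keepdims=True)          # random initial fuzzy partition
    for _ in range(max_iter):
        um = u ** m
        # Update cluster centres as membership-weighted means.
        centers = (um.T @ pixels) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-10)            # avoid division by zero
        # Membership update of Equation (5.5).
        inv = dist ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        if np.linalg.norm(u_new - u) < tol:    # stop when the partition settles
            u = u_new
            break
        u = u_new
    return centers, u
```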
Given a data set, there are n training samples. The values of the parameter X = \{x_1, x_2, \dots, x_n\} are sorted to obtain Y = \mathrm{sort}(X) = \{y_1, y_2, \dots, y_n\}. The distance between adjacent sorted values is defined in Equation (5.6):

d_i = Y_{i+1} - Y_i, \quad i = 1, 2, \dots, n-1    (5.6)
S_i(CP_s) = \begin{cases} 1, & d_i < CP_s \\ 0, & \text{otherwise} \end{cases}    (5.7)

where CP_s is the cut-point threshold on the gap d_i.
\text{IF } S_i \text{ THEN } Y_i, Y_{i+1} \in CP_i \quad \text{ELSE } Y_i \in CP_i, \; Y_{i+1} \in CP_{i+1}    (5.8)

where CP_i and CP_{i+1} denote two dissimilar classes for the input or output parameter.
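A small sketch of this gap-based splitting, under the assumption (following Equation (5.7)) that S_i = 1 marks a gap smaller than the cut-point threshold, is shown below; the function and variable names are illustrative.

```python
import numpy as np

def split_classes(values, cp_threshold):
    """Group sorted parameter values into classes at large gaps.

    Sketch of Equations (5.6) to (5.8): adjacent values whose gap d_i stays
    below the cut-point threshold share a class; a larger gap opens a new one.
    """
    y = np.sort(np.asarray(values, dtype=float))   # Y = sort(X)
    d = np.diff(y)                                 # Equation (5.6)
    classes, current = [], [y[0]]
    for y_next, gap in zip(y[1:], d):
        if gap < cp_threshold:       # S_i = 1: same class (Equation 5.8)
            current.append(y_next)
        else:                        # S_i = 0: start a new class
            classes.append(current)
            current = [y_next]
    classes.append(current)
    return classes
```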
[Figure: SOM architecture with an input layer and a computational layer.]
Step 1: Initialization – Choose the initial synaptic weight vectors with small random values.
Step 2: Sampling – Draw a sample input vector x from the input space. Here the network takes five texture features as input to classify the two classes.
Step 3: Matching – Find the weight vector closest to the input vector; the corresponding winning neuron I(x) is defined as in Equation (5.9):

I(x) = \arg\min_{j} \sum_{i=1}^{n} \left( x_i - w_{ji} \right)^{2}    (5.9)
Step 4: Updating – Adjust the synaptic weight vectors according to Equation (5.10):

\Delta w_{ji} = \eta \, T_{j, I(x)} \left( x_i - w_{ji} \right)    (5.10)

where T_{j, I(x)} is a Gaussian neighbourhood function centred on the winning neuron and \eta is the learning rate.
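The following NumPy sketch combines the four steps with the quantization error used later in Equation (5.11); the fixed learning rate and neighbourhood width are simplifications (in practice both decay over time), and the grid size is an illustrative choice.

```python
import numpy as np

def som_train(data, grid=(10, 10), epochs=20, eta=0.5, sigma=2.0, seed=0):
    """Minimal SOM sketch: matching (5.9), updating (5.10), error (5.11)."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h * w, data.shape[1]))     # Step 1: random init
    coords = np.array([(r, c) for r in range(h) for c in range(w)], float)
    for _ in range(epochs):
        for x in rng.permutation(data):              # Step 2: sampling
            # Step 3: matching, the winning neuron minimizes Equation (5.9).
            winner = np.argmin(((x - weights) ** 2).sum(axis=1))
            # Step 4: updating with a Gaussian neighbourhood, Equation (5.10).
            grid_dist2 = ((coords - coords[winner]) ** 2).sum(axis=1)
            T = np.exp(-grid_dist2 / (2 * sigma ** 2))
            weights += eta * T[:, None] * (x - weights)
    # Quantization error D of Equation (5.11).
    winners = np.argmin(((data[:, None, :] - weights[None]) ** 2).sum(-1), axis=1)
    D = ((data - weights[winners]) ** 2).sum()
    return weights, D
```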
Once the SOM algorithm has converged, the feature map displays important statistical characteristics of the input space. Given an input vector x, the feature map provides a winning neuron I(x) in the output space, and the weight vector W_{I(x)} provides the coordinates of the image of that input vector in the feature map.
Figure 5.7 Topological ordering and density matching of the feature map
The main aim of using a self-organizing map is to encode a large set of input vectors x by finding a smaller set of “representatives”, or approximations, to the original input space. This is the basic idea of vector quantization theory, the motivation of which is dimensionality reduction or data compression. In effect, the error of the vector quantization approximation is the total squared distance between the input vectors x and their representatives W_{I(x)}, as defined in Equation (5.11).
D = \sum_{x} \left\| x - w_{I(x)} \right\|^{2}    (5.11)
[Figure: SVM hyperplane separating the two classes.]
For each given input, the SVM determines whether the input is a member of the water body or non-water body class. This makes the SVM a linear classifier. The goal is to separate the two classes, without loss of generality, by a function induced from the available training data sets. The task is to produce a classifier that generalizes well to unseen data. Applying the SVM to the present problem means minimizing the error by maximizing the margin, i.e., maximizing the distance between the hyperplane and the nearest data point of each class. SVMs are known to generalize well even in high-dimensional feature spaces.
The SVM takes a set of input data and predicts, for each given input, which of two possible classes it belongs to. Given a set of training data, each sample marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new data to one category or the other.
\mathbf{w} \cdot \mathbf{x} + b = 0    (5.12)
Given such a hyperplane (w, b) that separates the data, classification is performed with the function defined in Equation (5.13):

f(x) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b)    (5.13)

which correctly classifies the training data (and, hopefully, other “testing” data it has not seen yet). However, a given hyperplane represented by (w, b) is equally expressed by all pairs \{\lambda \mathbf{w}, \lambda b\} for \lambda \in \mathbb{R}^{+}. So we define the canonical hyperplane to be the one that separates the data from the hyperplane by a “distance” of at least 1. That is, consider those points that satisfy Equations (5.14) and (5.15):

\mathbf{x}_i \cdot \mathbf{w} + b \ge +1 \quad \text{when } y_i = +1    (5.14)
\mathbf{x}_i \cdot \mathbf{w} + b \le -1 \quad \text{when } y_i = -1    (5.15)
or, more compactly:

y_i \left( \mathbf{x}_i \cdot \mathbf{w} + b \right) \ge 1 \quad \forall i    (5.16)
All such points satisfying equality have a functional distance of 1 (that is, the function’s value is 1). This should not be confused with the geometric or Euclidean distance (also known as the margin). For a given hyperplane (w, b), all pairs \{\lambda \mathbf{w}, \lambda b\} define the exact same hyperplane, but each has a different functional distance to a given data point. To obtain the geometric distance from the hyperplane to a data point, the functional distance is normalized by the magnitude of w.
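As an illustration, a linear SVM of this kind can be trained with scikit-learn as sketched below; the feature and label files are hypothetical placeholders for the five GLCM texture features and the water/non-water labels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Hypothetical arrays: X holds the five GLCM texture features per sample,
# y holds 1 for water body and 0 for non-water body.
X = np.load("texture_features.npy")    # shape (n_samples, 5), assumed file
y = np.load("labels.npy")              # shape (n_samples,), assumed file

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)
# A linear kernel learns the maximum-margin hyperplane w.x + b = 0 of (5.12);
# C trades off margin width against training errors.
clf = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
print("classified rate:", clf.score(X_test, y_test))   # Equation (5.17)
```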
\text{ClassifiedRate} = \frac{\text{Classified samples}}{\text{Total number of samples}}    (5.17)
Figure 5.10 shows the texture features, namely energy, entropy, contrast, inverse difference moment and directional moment, extracted using the GLCM, along with the corresponding classes, water body and non-water body.
Figure 5.10 Input data consisting of five texture features with the
corresponding classes
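These five features can be computed from a grey-level co-occurrence matrix as sketched below using scikit-image; note that “directional moment” has no single standard definition, so the |i - j|-weighted variant used here is an assumption.

```python
import numpy as np
from skimage.feature import graycomatrix

def glcm_features(patch, levels=256):
    """Compute the five texture features of Figure 5.10 from a GLCM.

    patch : 2-D uint8 array of grey levels.
    """
    glcm = graycomatrix(patch, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]                 # normalized co-occurrence matrix
    i, j = np.indices(p.shape)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    contrast = ((i - j) ** 2 * p).sum()
    idm = (p / (1.0 + (i - j) ** 2)).sum()           # inverse difference moment
    directional_moment = (np.abs(i - j) * p).sum()   # assumed definition
    return energy, entropy, contrast, idm, directional_moment
```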
Then the input samples are trained using a neural network. The number of input training samples is 1026. The network is designed with 5 input neurons, 30 hidden neurons and 1 output neuron. The activation function of each hidden neuron is a radial function, i.e., a monotonically decreasing function.
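A minimal sketch of such an RBF network (Gaussian hidden units with centres placed by K-Means and a least-squares output layer) is given below; the centre-placement strategy and the width parameter are illustrative assumptions, not details taken from the trained network reported here.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rbf(X, y, n_hidden=30, sigma=1.0):
    """RBF network sketch: 5 inputs, 30 Gaussian hidden units, 1 output."""
    centers = KMeans(n_clusters=n_hidden, n_init=10).fit(X).cluster_centers_

    def hidden(Xb):
        # Gaussian (monotonically decreasing) radial activations.
        d2 = ((Xb[:, None, :] - centers[None]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    # Fit the linear output layer (with bias) by least squares.
    H = np.hstack([hidden(X), np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda Xb: np.hstack([hidden(Xb), np.ones((len(Xb), 1))]) @ w

# Usage sketch: predict = train_rbf(X_train, y_train)
#               labels = predict(X_test) > 0.5
```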
The term ‘Performance goal met’ indicates that the training data satisfies the necessary and sufficient conditions to achieve maximum accuracy. Training thus completed successfully with 100% accuracy, as shown in Figure 5.11 and Figure 5.12.
The support vector machine classifies the images into water body and non-water body, and the results show that the resolution enhanced image gives better accuracy than the denoised image. Figure 5.13 to Figure 5.17 show the water body and non-water body classification results for the Kochi, Kanyakumari, Kolkata, Visakhapatnam and Sydney regions. In the output images, white denotes the water body and black denotes the non-water body region.
The classification accuracies of the denoised and resolution enhanced images are compared in Table 5.1.
Table 5.1 Classification accuracy of HDL denoised and resolution enhanced images, with columns: Sl. No; Region Name; HDL Denoised Image (Classified Rate, Correct Rate, Error Rate); and Resolution Enhanced Image (Classified Rate, Correct Rate, Error Rate).
The ROC curve can compute values for various criteria to plot on either the X-axis or the Y-axis: it plots the false positive rate (1 - specificity) on the X-axis and the sensitivity, or true positive rate, on the Y-axis. All such criteria are derived from a confusion matrix.
C = \begin{pmatrix} TP & FN \\ FP & TN \end{pmatrix}    (5.18)
The first row of the confusion matrix describes how the classifier identifies instances of the positive class: C(1,1) is the count of correctly classified positive instances (true positives), and C(1,2) is the count of positive instances misclassified as negative (false negatives).
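These quantities can be computed with scikit-learn as sketched below; the label and score arrays are hypothetical. Note that scikit-learn orders the matrix entries [[TN, FP], [FN, TP]], an arrangement different from Equation (5.18).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Hypothetical data: true labels, decision scores and hard predictions.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.3, 0.8, 0.4, 0.2, 0.6, 0.7, 0.1])
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                 # true positive rate (Y-axis)
specificity = tn / (tn + fp)                 # 1 - false positive rate (X-axis)
accuracy = (tp + tn) / (tp + tn + fp + fn)

fpr, tpr, _ = roc_curve(y_true, y_score)     # points of the ROC curve
print(sensitivity, specificity, accuracy, auc(fpr, tpr))
```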
Figure 5.18 Performance analysis (a) confusion matrix and (b) ROC
curve of Kochi region
Figure 5.19 Performance analysis (a) confusion matrix and (b) ROC
curve of Kanyakumari Region
Figure 5.20 Performance analysis (a) confusion matrix and (b) ROC
curve of Kolkata Region
Figure 5.21 Performance analysis (a) confusion matrix and (b) ROC
curve of Vishakhapatnam region
Figure 5.22 Performance analysis (a) confusion matrix and (b) ROC
curve of Sydney region
The ROC analysis results show that the Kochi region has a sensitivity of 80%, a specificity of 86.7% and an accuracy of 85%; the Kanyakumari region has a sensitivity of 62.5%, a specificity of 91.7% and an accuracy of 80%; the Kolkata and Vishakhapatnam regions have a sensitivity of 71.4%, a specificity of 92.3% and an accuracy of 85%; and the Sydney region has a sensitivity of 83.3%, a specificity of 92.9% and an accuracy of 90%.
[Chart: error rate comparison (values between 0 and 0.08) for the Kochi, Kanyakumari, Kolkata, Vishakhapatnam and Sydney regions.]
Figure 5.24 Input Landsat images of coastal landscape taken during 2005
Figure 5.25 Input Landsat images of coastal landscape taken during 2010
Figure 5.28 Classification Using SVM
Global warming and the resulting glacier meltdown have become a serious issue among the various environmental bodies of the world. Efficient analysis of LANDSAT images can provide valuable results and information to help these bodies strategize preventive steps. The main aim here is to compare the K-Means and Fuzzy C-Means clustering techniques and to detect change in glacier classification by processing images taken over different time frames.
Figure 5.34 and Figure 5.35 portray the June 2005 and June 2010 images after applying Fuzzy C-Means clustering. The percentage change of glacier cover from the June 2005 image to the June 2010 image is 8%. This agrees with the real-world data and gives confidence that the approach yields reliable results. The sketch below shows how such a change percentage can be computed.
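The percentage change can be computed directly from the two classified masks, as in the following sketch; treating the cluster labelled as glacier in each image as a boolean mask is an assumed convention.

```python
import numpy as np

def glacier_change_percent(mask_2005, mask_2010):
    """Percentage change in glacier cover between two classified images.

    mask_2005, mask_2010 : boolean arrays where True marks glacier pixels
    (for example, the Fuzzy C-Means cluster identified as glacier).
    """
    area_2005 = mask_2005.sum()
    area_2010 = mask_2010.sum()
    return 100.0 * (area_2005 - area_2010) / area_2005

# Usage sketch: a result of 8 indicates the 2010 glacier extent
# is 8% smaller than the 2005 extent.
```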
Figure 5.34 June 2005 image after applying Fuzzy C-Means clustering
Figure 5.35 June 2010 image after applying Fuzzy C-Means clustering
5.9 SUMMARY