
ECV5511

Mapping Methods
(Spectral-based machine learning
classification using EnMAP-Box Plugin)

Dr. Helmi Z. M. Shafri


Overview

⚫ Machine Learning algorithms
• Support Vector Machine (SVM)
• Decision Tree (DT)
• Random Forest (RF)
• XGBoost
• Artificial Neural Network (ANN)
• Deep Learning (DL)
⚫ Accuracy Assessment
Objectives

 To discuss some common and advanced spectral-based classifiers used for multispectral/hyperspectral classification
 Specific focus on Support Vector Machine (SVM), Random Forest (RF), and XGBoost
Classification is a mapping from data to labels (symbols), producing a thematic map.
QGIS EnMAP-Box plugin workflow

1. Data: acquisition; preprocessing; labelling (ground truth); training/testing samples
2. Feature extraction/selection: feature clustering (hierarchical clustering); feature ranking (permutation feature importance); dimensionality reduction; band selection
3. Model development: algorithm selection (SVM, RF, DL, etc.); hyperparameter optimization
4. Performance analysis: k-fold cross-validation; testing samples; performance measures
5. Final output/deployment: prediction; post-classification editing; map production; delivery; input for GIS
P. Ghamisi et al., "Advances in Hyperspectral Image and Signal Processing: A Comprehensive
Overview of the State of the Art," in IEEE Geoscience and Remote Sensing Magazine, vol. 5, no.
4, pp. 37-78, Dec. 2017, doi: 10.1109/MGRS.2017.2762087.
Surface compositional mapping

 Standard mapping methods
⚫ Produce surface compositional information
⚫ By comparing unknown pixel spectra with known lab/library spectra
⚫ Using ground truth information
⚫ Spectral matching-based image classification (e.g. SAM)
⚫ Statistics-based image classification (e.g. SVM/RF) – machine learning
Spectral angle mapper
The spectral similarity between the test (or pixel) spectrum, t, and the reference (or laboratory) spectrum, r, is expressed in terms of the angle, α, between the two spectra, computed over all n channels, i, as

$$\alpha = \cos^{-1}\left(\frac{\sum_{i=1}^{n} t_i\, r_i}{\sqrt{\sum_{i=1}^{n} t_i^{2}}\;\sqrt{\sum_{i=1}^{n} r_i^{2}}}\right)$$

Spectral Angle Mapper (SAM) is a physically based spectral classification method that uses an n-dimensional angle to match pixels to reference spectra.
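As a minimal NumPy sketch of this formula (an illustration, not the EnMAP-Box or ENVI implementation; `pixel_spectra` and `references` are hypothetical arrays):

import numpy as np

def spectral_angle(t, r):
    # Angle (radians) between test spectrum t and reference spectrum r
    cos_angle = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    # Clip to guard against floating-point values just outside [-1, 1]
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))

def sam_classify(pixel_spectra, references):
    # pixel_spectra: (n_pixels, n_bands); references: (n_classes, n_bands)
    # Assign each pixel to the reference with the smallest spectral angle
    angles = np.array([[spectral_angle(p, r) for r in references]
                       for p in pixel_spectra])
    return np.argmin(angles, axis=1)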
Spectral Angle Mapping

[Figure: SAM rule images for Alunite, Buddingtonite, Calcite, Zeolites, Illite, Silica, and Kaolinite. Dark pixels are more similar to the reference spectrum; white pixels are less similar.]
Spectral angle mapping

[Figure: SAM classification map. Legend: white = zeolites, green = calcite, yellow = alunite, red = kaolinite, dark green = illite, blue = silica, maroon = buddingtonite.]

SAM advantage: comparatively insensitive to albedo and illumination effects.
Machine Learning-based Classification

Data Science/Machine Learning Pipeline


Machine Learning
Algorithms
 Traditional classifiers are parametric – e.g. maximum likelihood – and depend on a pre-defined statistical model, e.g. Gaussian.
 They suffer from the 'curse of dimensionality' (also known as the Hughes phenomenon) when the data dimensionality is very high, as in the hyperspectral (HRS) case.
 Thus, non-parametric algorithms were designed – such as SVM, ANN, DT, and RF.
 These involve the use of ground truth (training & testing data) and statistical learning theory (AI/ML).
- SVM and RF have lower computational complexity and higher interpretability compared to deep learning models.
- SVM is notable for its ability to tackle the problems of high dimensionality and limited training samples.
- RF holds its position due to its ease of use (i.e., it does not need much hyperparameter fine-tuning).
SVM
 Currently considered one of the best algorithms in RS classification due to its efficiency despite being relatively simple
 Can perform well with limited training samples
 Its foundations (statistical learning theory) were laid by Vapnik and his group in the 1970s; the modern SVM formulation followed in the 1990s.
 SVM, in its basic form, is a linear binary classifier, which identifies a single boundary between two classes
SVM
 supervised classification
 provides good classification results from complex and noisy data
 separates the classes with a decision surface that maximizes the
margin between the classes.
 The surface is often called the optimal hyperplane, and the data
points closest to the hyperplane are called support vectors.
 The support vectors are the critical elements of the training set.
[Figure: two classes of training points (labelled 1 and 2) separated by the optimum hyperplane; the margin is bounded by the support vectors, the training points closest to the hyperplane.]
Factors affecting SVM accuracy

 Design: linearly separable, linearly non-separable, and non-linear types
 Binary (2 classes) vs multiclass classification (the RS case)
 Parameters: penalty value (C value), kernel, multiclass strategy, optimizers
 Four types of kernels: linear, polynomial, radial basis function (RBF), and sigmoid
The EnMAP-Box is a Python plugin for QGIS, designed to process and visualise remote sensing data.
SVM in EnMAP

 Select the kernel type
 Specify the regularisation/penalty parameter C for the SVM algorithm to use
⚫ A large value of C might cause overfitting to the training data
 Use a 'grid search' technique to find the optimal parameters (g and C), or use trial and error
 A widely used kernel function is the Gaussian radial basis function (RBF) kernel
 SVM training requires the estimation of the kernel parameter g and the regularization parameter C
 The two parameters are usually determined by a grid search, testing possible combinations of C and g
 The best combinations are selected based on cross-validation
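As an illustration of this procedure outside EnMAP-Box, a minimal scikit-learn sketch of the grid search with cross-validation (the candidate value ranges and the `X_train`/`y_train` arrays are assumptions, not plugin defaults):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Candidate values for C (regularization) and gamma (the kernel parameter g)
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1]}

# Test all (C, gamma) combinations with 5-fold cross-validation
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X_train, y_train)  # X_train: training spectra, y_train: class labels
print(search.best_params_, search.best_score_)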
k-fold Cross Validation

Hyperparameter optimization/tuning

Techniques for ground truth data splitting (for training and accuracy assessment)

 Direct train-test split (e.g. 70-30)
 Split into train, validation, and test sets
 K-fold cross-validation (train/test split into k groups) – adopted by EnMAP-Box as it is more practical (see the sketch after this list)
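A minimal scikit-learn sketch of these three splitting strategies (`X` and `y` are hypothetical NumPy arrays of ground-truth spectra and labels):

from sklearn.model_selection import train_test_split, KFold

# 1) Direct train-test split (e.g. 70-30), stratified by class
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# 2) Train / validation / test: split the training part once more
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.2, stratify=y_train, random_state=42)

# 3) K-fold cross-validation: rotate the train/test role over k groups
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, test_idx in kf.split(X):
    X_tr, X_te = X[train_idx], X[test_idx]
    y_tr, y_te = y[train_idx], y[test_idx]
    # ...train and evaluate a classifier on each fold...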
EnMAP-Box Plugin in QGIS
Japanese airborne hyperspectral data classification

[Figure: original image with ground-truth ROIs (left) and the resulting classified map (right).]
Examples of SVM analysis

Mustafa Ustuner, Fusun Balik Sanli & Barnali Dixon (2015) Application of Support Vector Machines for Landuse Classification Using High-Resolution RapidEye Images: A Sensitivity Analysis, European Journal of Remote Sensing, 48:1, 403-422.
Some results from previous work
Decision Tree (DT)
 A decision tree is a type of multistage classifier that can be applied to a single image or a stack of images.
 It is made up of a series of binary decisions that are used to determine the correct category for each pixel.
 The decisions can be based on any available characteristics of the dataset.
 For example, you may have an elevation image (e.g. lidar data) and two different multispectral images collected at different times, and any of those images can contribute to decisions within the same tree.
Decision Tree Learning
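A minimal scikit-learn sketch of decision tree learning on a per-pixel feature stack (e.g. spectral bands plus a lidar elevation layer); `features` and `labels` are hypothetical arrays:

from sklearn.tree import DecisionTreeClassifier, export_text

# features: (n_pixels, n_features), e.g. spectral bands + lidar elevation
tree = DecisionTreeClassifier(max_depth=5, random_state=42)
tree.fit(features, labels)

# Inspect the learned series of binary decisions (feature thresholds)
names = [f"feature_{i}" for i in range(features.shape[1])]
print(export_text(tree, feature_names=names))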
Random Forest (RF)
 A random forest (RF) classifier is an ensemble classifier that produces
multiple decision trees.
 RF is an ensemble learning approach, developed by L. Breiman in
2001, for solving classification and regression problems
The EnMAP-Box default RF settings use 100 trees, the square root of the number of features at each split, and the Gini coefficient as the impurity function.
 The RF classifier is suitable for classifying hyperspectral
data, where the curse of dimensionality and highly
correlated data pose major challenges to other
classification methodologies.
 Other advantages:
1) handling thousands of input variables without variable
deletion;
2) reducing the variance without increasing the bias of the
predictions;
3) being robust to outliers and noise; and
4) being computationally lighter compared to other tree ensemble methods, e.g., boosting.
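A scikit-learn sketch mirroring these EnMAP-Box defaults (100 trees, square-root feature subsetting, Gini impurity); `X_train`, `y_train`, and `X_test` are hypothetical arrays:

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    n_estimators=100,     # number of trees (EnMAP-Box default)
    max_features='sqrt',  # square root of the feature count at each split
    criterion='gini',     # Gini impurity function
    n_jobs=-1,
    random_state=42)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)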
XGBoost
XGBoost (Extreme Gradient Boosting) is a gradient boosting machine learning algorithm that can be applied to hyperspectral image classification tasks. It works by building an ensemble of decision trees, where each tree is constructed to correct the errors made by the previous ones. Here's how XGBoost works for hyperspectral image classification:

1. **Data Preparation**:
- Start by collecting and preparing your labeled hyperspectral dataset, where each pixel in the hyperspectral image is associated with a class label.

2. **Flattening the Hyperspectral Data**:
- Hyperspectral data is typically represented as a 3D cube (height, width, spectral bands). To use XGBoost, you'll need to flatten this cube into a 2D dataset, where each row corresponds to a pixel, and each column represents a spectral band. This transformation converts the hyperspectral data into a feature matrix.

3. **XGBoost Model Creation**:
- Create an XGBoost model for classification using the `XGBClassifier` class in Python. You can specify various hyperparameters to control the model's behavior, such as the learning rate, maximum tree depth, and the number of trees in the ensemble.

4. **Training**:
- Split your dataset into training and testing subsets using techniques like train-test split. Then, train the XGBoost model on the training data. The training process
involves the following:
- XGBoost builds an initial decision tree, and its predictions are compared to the true labels.
- Errors are calculated for the misclassified samples, and a new decision tree is built to reduce these errors.
- The new tree is added to the ensemble, and this process is repeated multiple times to create a collection of decision trees.

5. **Prediction**:
- Use the trained XGBoost model to make predictions on the testing data. Each tree in the ensemble contributes to the final score for each class, and the class with the highest aggregated score is chosen as the predicted class.

6. **Evaluation**:
- Evaluate the performance of the XGBoost model on the testing data using classification metrics such as accuracy, precision, recall, F1-score, and confusion
matrices.

XGBoost is well-suited for hyperspectral image classification for several reasons:

- It can handle high-dimensional data and is capable of modeling complex relationships between spectral bands.
- XGBoost is an ensemble learning algorithm, which combines multiple decision trees to improve classification accuracy.
- You can adjust the hyperparameters to optimize the model's performance for your specific hyperspectral data.

For hyperspectral image classification, it's essential to preprocess the data appropriately and choose or engineer relevant features to provide meaningful input to the
model. The quality of feature extraction, the choice of hyperparameters, and the size and quality of the training dataset all play a significant role in the success of the
XGBoost-based classification.
To make predictions on the entire hyperspectral image and generate a classification map, you can follow these steps:

1. Flatten the entire hyperspectral image.
2. Use the trained XGBoost model to predict class labels for each flattened pixel.
3. Reshape the prediction results to the original image dimensions.

Here's a sketch of the code to accomplish this (assuming the `spectral` Python library for ENVI file I/O):

import numpy as np
from xgboost import XGBClassifier
import spectral.io.envi as envi

# Load your flattened hyperspectral training data and labels
# You should have "flattened_hyperspectral_image" (n_pixels x n_bands) and
# "labels" (integer class codes 0, 1, ..., K-1) ready

# Create and train an XGBoost classifier
# (the scikit-learn wrapper infers the number of classes from the labels)
classifier = XGBClassifier(objective='multi:softmax', random_state=42)
classifier.fit(flattened_hyperspectral_image, labels)

# Load the entire hyperspectral image for prediction
entire_hyperspectral_image_file = "entire_hyperspectral_image.hdr"  # Replace with your image file
img = envi.open(entire_hyperspectral_image_file)
entire_image = img.load()  # read the full cube into a numpy array
n_rows, n_cols, n_bands = entire_image.shape

# Flatten the entire hyperspectral image (one row per pixel)
flattened_entire_image = entire_image.reshape(-1, n_bands)

# Make predictions on the entire image
predictions = classifier.predict(flattened_entire_image)

# Reshape the predictions to the original image dimensions
classification_map = predictions.reshape(n_rows, n_cols).astype(np.uint8)

# Optionally, you can save the classification map as an ENVI format file
output_classification_map_file = "classification_map.hdr"
envi.save_classification(output_classification_map_file, classification_map, force=True)
Artificial Neural Network (ANN)

 A supervised classification in which the net is trained on a set of input ROIs.
 Applies a layered feed-forward neural network classification technique.
 Uses standard backpropagation.
 You select the number of hidden layers to use and choose between a logistic or hyperbolic tangent activation function.
 Learning occurs by adjusting the weights in the nodes to minimize the difference between the output node activation and the target output.
 The error is backpropagated through the network and weight adjustments are made using a recursive method.
 The basis for Deep Learning.
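A minimal sketch of such a feed-forward, backpropagation-trained network using scikit-learn's MLPClassifier (the layer size and the `X_train`/`y_train` arrays are assumptions; this is not the ENVI implementation):

from sklearn.neural_network import MLPClassifier

# One hidden layer of 32 nodes; 'logistic' is the sigmoid activation
# ('tanh' would be the hyperbolic alternative)
ann = MLPClassifier(hidden_layer_sizes=(32,), activation='logistic',
                    max_iter=500, random_state=42)
ann.fit(X_train, y_train)  # X_train: ROI spectra, y_train: class labels
y_pred = ann.predict(X_test)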
Band selection/feature reduction

 Band selection, a commonly used dimension reduction technique, is the selection of an optimal band combination from the original bands that attempts to remove the redundancy between bands while maintaining good classification ability.

Yang, R., Su, L., Zhao, X., Wan, H., & Sun, J. (2017). Representative band selection for hyperspectral image classification.
Journal of Visual Communication and Image Representation, 48, 396-403.
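For illustration, one band-ranking approach listed in the workflow earlier, permutation feature importance, sketched with scikit-learn (the fitted `rf` classifier and `X_test`/`y_test` arrays are assumed from the examples above; the cut-off of 30 bands is arbitrary):

import numpy as np
from sklearn.inspection import permutation_importance

# Shuffle each band in turn and measure the resulting drop in accuracy
result = permutation_importance(rf, X_test, y_test,
                                n_repeats=10, random_state=42)

# Rank bands by mean importance and keep, e.g., the top 30
ranking = np.argsort(result.importances_mean)[::-1]
selected_bands = ranking[:30]
X_reduced = X_test[:, selected_bands]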
[Figure: variation of classification accuracy with the number of features, for analyses based on training sets of differing size, using a hyperspectral data set.]

(Pal, M., & Foody, G. M. (2010). Feature selection for classification of hyperspectral data by SVM. IEEE Transactions on Geoscience and Remote Sensing, 48(5), 2297-2307)
Accuracy Assessment
 Many factors affect the performance of a classifier, including the selection of training and testing data samples as well as input variables.
 Accuracy assessment is based on ground reference (ground truth) maps, created using various maps, expert knowledge, and field surveys carried out with a spectrometer and handheld GPS.
 Training and test data/pixels can be formed using a random pixel selection approach in accordance with the ground reference map.
⚫ Stratified / equalized / random selection
⚫ The equalized random sampling plan is quite popular
 The selected pixels are divided into two parts in order to remove any bias from using the same pixels for training and testing.
 Two widely used accuracy measures are overall accuracy and the kappa coefficient.
 Overall accuracy has the advantage of being directly interpretable as the proportion of pixels classified correctly.
 The kappa coefficient allows for a statistical test of the significance of the difference between two algorithms.
 According to Anderson et al. (1976), the minimum level of interpretation accuracy in the identification of land use and LULC categories from remote sensing data should be at least 85%.
 The McNemar test can be performed to assess the significance of differences between classified maps.
 Anderson, J.R.; Hardy, E.E.; Roach, J.T.; Witmer, R.E. A Land Use and Land Cover Classification System for Use with Remote
Sensor Data. Government Printing Office: Washington, DC, USA, 1976.
Comparing different classifiers' performance

Producer accuracy is the probability that a pixel in the classification image is put into class x given that the ground truth class is x.

User accuracy is the probability that the ground truth class is x given that a pixel is put into class x in the classification image.
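A sketch computing these measures with scikit-learn and statsmodels (`y_test` and the predictions `y_pred_a`, `y_pred_b` from two classifiers are hypothetical NumPy arrays):

import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix
from statsmodels.stats.contingency_tables import mcnemar

# Overall accuracy and kappa coefficient for classifier a
print("OA:", accuracy_score(y_test, y_pred_a))
print("Kappa:", cohen_kappa_score(y_test, y_pred_a))

# Producer's and user's accuracy from the confusion matrix
# (rows = ground truth, columns = predicted)
cm = confusion_matrix(y_test, y_pred_a)
producer_acc = np.diag(cm) / cm.sum(axis=1)  # per ground-truth class
user_acc = np.diag(cm) / cm.sum(axis=0)      # per predicted class

# McNemar test: is the difference between classifiers a and b significant?
a_ok = (y_pred_a == y_test)
b_ok = (y_pred_b == y_test)
table = [[np.sum(a_ok & b_ok), np.sum(a_ok & ~b_ok)],
         [np.sum(~a_ok & b_ok), np.sum(~a_ok & ~b_ok)]]
print(mcnemar(table, exact=False, correction=True))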
Comparing different datasets' performance
Current trends…

 Advanced machine learning approaches – e.g. deep learning
 Object-based image analysis (OBIA)
 Combination of spectral and spatial information (data and information fusion)
⚫ MS/HRS + lidar/radar
⚫ MS/HRS spectral + spatial

OBIA for hyperspectral
HZM Shafri, A Hamedianfar (2015) Mapping of intra-urban land covers using pixel-based and object-based classifications from
airborne hyperspectral imagery, 2nd International Conference on Information Science and Security (ICISS), South Korea, 2015

[Figure: classification maps with overall accuracies of 74.29% and 88.83%.]

[Figure: CASI Houston hyperspectral data: (a) a color composite representation of the data, using bands 70, 50, and 20 as R, G, and B, respectively; (b) training samples; (c) test samples; and (d) a legend of the different classes. Classifiers compared: RF - random forest, SVM - support vector machine, BP - backpropagation NN, KELM - kernel extreme learning machine, MLR - multinomial logistic regression, CNN - convolutional neural network (deep learning).]
Deep Learning for hyperspectral

 Gaps of research?
 Deep learning opens a new window for future research, showcasing the deep learning-based methods' huge potential (Ghamisi et al. 2017).
 The most important aspect is that DL-based methods learn features automatically, while many current state-of-the-art methods rely on handcrafted design of features (Petersson et al. 2016).
DL models
Data Fusion
 Example: hyperspectral + lidar
Project design

Workflow: Airborne Data Collection → Data Processing / Data Fusion → Analysis / Feature Extraction → Validation & Verification

- Airborne data collection: CASI & GPS/IMU (hyperspectral); airborne LIDAR (lidar); ground truth (spectral, biology, etc.)
- Data processing / data fusion: sensor/data fusion, radiometric and atmospheric correction, mosaicking; DEM generation (bare earth/canopy)
- Analysis / feature extraction: spectral classifications
- Validation & verification
- PRODUCTS: 3-D land cover classifications
Conclusions

 There is no classifier that consistently provides the best performance among the considered metrics (particularly from the viewpoint of classification accuracy).
 Instead, different solutions depend on the complexity of the analysis scenario (e.g., the availability of training samples, processing requirements, tuning parameters, and speed of the algorithm) and on the considered application domain.
The demo…

 Demo: Japanese airborne hyperspectral image classification using the EnMAP-Box Plugin in QGIS

The airborne hyperspectral dataset was taken by a Headwall Hyperspec-VNIR-C imaging sensor over agricultural and urban areas in Chikusei, Ibaraki, Japan, on July 29, 2014. The hyperspectral dataset has 128 bands in the spectral range from 363 nm to 1018 nm.
References
 M. Sheykhmousa, M. Mahdianpari, H. Ghanbari, F. Mohammadimanesh, P. Ghamisi and
S. Homayouni, "Support Vector Machine Versus Random Forest for Remote Sensing
Image Classification: A Meta-Analysis and Systematic Review," in IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 6308-
6325, 2020, doi: 10.1109/JSTARS.2020.3026724.

 Mountrakis, G., Im, J., & Ogole, C. (2011). Support vector machines in remote sensing: A
review. ISPRS Journal of Photogrammetry and Remote Sensing, 66(3), 247-259.

 Mas J. F. & J. J. Flores (2008) The application of artificial neural networks to the analysis of
remotely sensed data, International Journal of Remote Sensing, 29:3, 617-663.

 Sharma, R., Ghosh, A., & Joshi, P. K. (2013). Decision tree approach for classification of
remotely sensed satellite data using open source support. Journal of Earth System Science,
122(5), 1237-1247.

 Rodriguez-Galiano, V. F., Ghimire, B., Rogan, J., Chica-Olmo, M., & Rigol-Sanchez, J. P.
(2012). An assessment of the effectiveness of a random forest classifier for land-cover
classification. ISPRS Journal of Photogrammetry and Remote Sensing, 67, 93-104.

 Belgiu, M., & Drăguţ, L. (2016). Random forest in remote sensing: A review of applications and future directions. ISPRS Journal of Photogrammetry and Remote Sensing, 114, 24-31. doi: 10.1016/j.isprsjprs.2016.01.011.
THANK YOU
