Graph Sample and Aggregate-Attention Network For Hyperspectral Image Classification

This document presents a new graph neural network called SAGE-A for hyperspectral image classification. SAGE-A uses a graph sample and aggregate (graphSAGE) network to flexibly aggregate information from neighbor nodes. It also uses an attention mechanism to characterize the importance of spatial relationships between regions and learn global and contextual information. The network was tested on several hyperspectral data sets and showed improved performance over state-of-the-art methods.

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 19, 2022 5504205

Graph Sample and Aggregate-Attention Network for Hyperspectral Image Classification

Yao Ding, Xiaofeng Zhao, Zhili Zhang, Wei Cai, and Nengjun Yang

Abstract: Graph convolutional networks (GCNs) have shown potential in hyperspectral image (HSI) classification. However, GCN is a transductive learning method, which makes it difficult to aggregate new nodes. In addition, the available GCN-based methods fail to capture the global and contextual information of the graph. To address these deficiencies, a novel semisupervised network based on graph sample and aggregate-attention (SAGE-A) is proposed for HSI classification. Different from GCN-based methods, SAGE-A adopts a multilevel graph sample and aggregate (graphSAGE) network, as it can flexibly aggregate new neighbor nodes among arbitrarily structured non-Euclidean data and capture long-range contextual relations. Inspired by the convolution neural network (CNN) self-attention mechanism, the proposed network uses a graph attention mechanism to characterize the importance among spatially neighboring regions, so that the deep contextual and global information of the graph can be learned automatically by focusing on important spatial targets. Extensive experimental results on different real hyperspectral data sets demonstrate the performance of the proposed method compared with state-of-the-art methods.

Index Terms: Global and contextual information, graph convolution neural network, hyperspectral image (HSI) classification.

Manuscript received February 8, 2021; accepted February 22, 2021. Date of publication March 15, 2021; date of current version December 28, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant 41404022 and in part by the National Natural Science Foundation of Shanxi Province Grant 2015JM4128. (Xiaofeng Zhao is co-first author.) (Corresponding author: Xiaofeng Zhao.)
The authors are with the Xi’an Research Institute of High Technology, Xi’an 710000, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/LGRS.2021.3062944
1558-0571 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Southeast University. Downloaded on May 16,2023 at 06:45:21 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Hyperspectral images (HSIs) provide detailed spectral information through hundreds of (narrow) spectral channels, which can be used to accurately classify diverse materials of interest [1], [2]. However, the increased dimensionality of such data poses a challenge to conventional techniques, and hyperspectral classification has great research value.

In the past few decades, significant effort has been devoted to HSI classification. Existing approaches can be summarized into two categories: traditional methods and neural network methods. Traditional methods have explored more discriminative feature representations, such as morphological features and texture features [3]. Apart from these, subspace learning, sparse learning algorithms, and machine learning methods, such as random forest and support vector machine (SVM) [4], [5], have received great attention in the community. However, traditional methods extract features incompletely and may suffer from overfitting because of the shortage of training samples.

Deep learning has achieved great success in many applications, and it has also greatly promoted the technological progress of HSI classification. For example, in [6] and [7], the HSIs were classified using convolutions of different dimensions; Liu et al. [8] introduced a spectral–spatial feature extraction method based on the long short-term memory (LSTM) network. Zhang et al. [9] used image semantic context to classify HSIs. He et al. [10] adopted residual networks to learn spatial and spectral characteristics of the image to improve the classification rate. In [11], an unsupervised spectral–spatial feature extraction network was proposed. However, a convolution neural network (CNN) needs a large number of training labels and a large amount of computation. In addition, the CNN only performs convolution on regular regions. Furthermore, the size of the CNN convolution kernel is fixed, which leads to the loss of edge information during feature extraction [12].

To ameliorate these issues, extensive research has been conducted on classification using graph convolution networks (GCNs). The GCN conducts semisupervised learning on graph-structured data and can operate on graph signals directly via a variant of CNNs. Sha et al. [13] applied the graph attention network to hyperspectral classification. Mou et al. [14] proposed a nonlocal graph convolution network, which constructs a graph by calculating the relationship between nonadjacent pixels to improve the classification accuracy. Hong et al. [15] proposed a graph convolution classification method combining GCN and CNN to increase the classification accuracy. Wan et al. [16] used a multiscale graph convolutional network to extract multiscale graph features. Wan et al. [17] adopted a context-aware mechanism to learn the local contextual information of the graph. All of the methods mentioned above are GCN-based. However, GCN is a transductive, whole-graph training method, which makes it difficult to aggregate new nodes and incurs a huge amount of computation.

The main contributions in this letter are as follows: 1) incorporation of sample and aggregate (SAGE) (first time) for extracting contextual relations among superpixels; 2) utilization of a multilevel graph projection and flexible reprojection framework for extracting long-range contextual relations and producing truthful local-region features; and 3) adoption of attention mechanism graph refinement for characterizing global and contextual relations and accurately finding precise region representations.

II. RELATED WORK

Many researchers have published methods to classify HSIs. In this part, we mainly introduce the graph neural network (GNN) method, which is closely related to our work.

Fig. 1. Overview of the SAGE-A network. (a) HSI. (b) Superpixels segmented by the SLIC algorithm. (c) Circles and lines represent the superpixel (graph node) and edges, respectively. Different colors of the nodes represent different land-cover types and the input of the network is the spectral characteristics of each node. (d) Multilevel new node embedding using SAGE and attention mechanism. (e) Output of the network.

A. Spectral Graph Convolution

For a graph $G = (V, E)$, the Laplacian matrix $L$ is expressed as

$$ L = D - A \tag{1} $$

where $V$ represents the vertex set, $E$ represents the edge set, $D$ is the degree matrix of the graph, and $A$ is the adjacency matrix of the graph.

The corresponding symmetrically normalized Laplacian $L_n$ is

$$ L_n = I - D^{-1/2} A D^{-1/2} \tag{2} $$

where $I$ is an identity matrix.
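As an illustration of (1) and (2), the following is a minimal sketch in plain Python, not the authors' implementation. The function name `laplacians` and the three-node path graph are hypothetical, and mapping isolated nodes (degree 0) to a zero entry in $D^{-1/2}$ is an assumption made only to avoid division by zero.

```python
import math


def laplacians(adj):
    """Return (L, L_n) for a symmetric adjacency matrix given as nested lists."""
    n = len(adj)
    # Diagonal entries of the degree matrix D.
    deg = [float(sum(row)) for row in adj]
    # Eq. (1): L = D - A.
    lap = [[(deg[i] if i == j else 0.0) - adj[i][j] for j in range(n)]
           for i in range(n)]
    # Diagonal of D^{-1/2}; degree-0 nodes map to 0.0 (assumption, see lead-in).
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    # Eq. (2): L_n = I - D^{-1/2} A D^{-1/2}.
    lap_n = [[(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
              for j in range(n)]
             for i in range(n)]
    return lap, lap_n


# Hypothetical toy graph: a path 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L, L_n = laplacians(A)
```

For this path graph, every row of `L` sums to zero, and the off-diagonal entries of `L_n` between connected nodes equal $-1/\sqrt{2}$ because the two end nodes have degree 1 and the middle node has degree 2.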
To conduct node embedding for $G$, a filter $g_\theta = \mathrm{diag}(\theta)$ parameterized by $\theta \in \mathbb{R}^{N}$ is defined. Filtering can be expressed as the multiplication of a signal $x$ (a scalar for each node) with $g_\theta$ in the Fourier domain, that is,

$$ g_\theta * x = U g_\theta(\Lambda) U^{T} x \tag{3} $$

where $U$ is the matrix of eigenvectors of $L_n$, obtained from $L_n = U \Lambda U^{-1}$, and $\Lambda$ is the diagonal matrix of eigenvalues of $L_n$. $g_\theta$ can be understood as a function of the eigenvalues, that is, $g_\theta(\Lambda)$. However, evaluating (3) requires explicitly computing this eigendecomposition. We can instead approximate $g_\theta(\Lambda)$ by a truncated expansion in terms of Chebyshev polynomials up to the $K$th order, which leads to

$$ g_\theta(\Lambda) \approx \sum_{k=0}^{K} \theta_k T_k(L_n) \tag{4} $$

where $T_k$ denotes the Chebyshev polynomial of order $k$.

Setting $K = 1$ in (4), which is the first-order Chebyshev approximation, and taking the largest eigenvalue $\lambda_{\max} \approx 2$, (4) can be rewritten as

$$ g_\theta * x \approx \theta (I_N + L) x = \theta \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} x. \tag{5} $$

Here, following the usual GCN convention, $\tilde{A} = A + I_N$ is the adjacency matrix with self-loops and $\tilde{D}$ is its degree matrix. With the activation function $\sigma$, the GCN propagation rule is as follows:

$$ H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right) \tag{6} $$

where $H^{(l+1)}$ and $H^{(l)}$ denote the outputs of layers $l + 1$ and $l$, respectively, and $W^{(l)}$ is the weight matrix.

B. Spatial Graph Convolution

Following the traditional CNN operation on images, the spatial-based GNN defines the graph convolution operator based on the spatial relationship of a node. The image is regarded as a special graph in which each pixel represents a node; because the adjacent nodes have a fixed order, the trained weights can be shared across different local regions. Spatial-based GNN methods have better efficiency, flexibility, and versatility than GCN. For details of these algorithms, readers can refer to the relevant papers.

III. PROPOSED METHOD

In this section, we present SAGE-attention (SAGE-A) for HSI classification (Fig. 1). It is mainly composed of three parts: pixel-to-region assignment (Section III-A), contextual relations refinement (Sections III-B and III-C), and region-to-pixel assignment (Section III-D).

A. Pixel-to-Region Assignment

An HSI contains a large number of pixels in the spatial dimension, so classifying it pixel by pixel requires a huge amount of computation, which is sometimes unacceptable. To ameliorate this issue, we observe that neighboring pixels have a high probability of belonging to the same land-cover type. Therefore, simple linear iterative clustering (SLIC) is adopted to segment the entire image into a small number of local regions, each consisting of pixels with strong spectral–spatial similarity. Concretely, SLIC conducts image region segmentation by iteratively growing local clusters using a k-means algorithm. In this letter, the local regions are treated as graph nodes, which significantly reduces the number of graph nodes and improves the computational efficiency. Here, the average spectral signature of the pixels in a node (local region) is taken as the feature vector of the node.

B. GraphSAGE

Traditional GCN models assume that adjacent pixels on a graph are more likely to share the same label, and thus the label information can be propagated from labeled samples to unlabeled samples via graph Laplacian regularization. However, transductive learning requires all nodes to participate in training to obtain node embeddings, and it cannot quickly produce embeddings for new nodes. Therefore, GCN can only learn the information about neighboring nodes and cannot naturally generalize to unknown vertices. In this letter, the graph SAGE (graphSAGE) algorithm is adopted to learn more spatial scale information, which improves the generalization ability of the model for new nodes. The forward propagation rule is expressed as Algorithm 1.

Algorithm 1 GraphSAGE Embedding Generation (i.e., Forward Propagation) Algorithm
Input: Graph $G = (V, E)$; input features $\{x_v, \forall v \in V\}$; the number of layers of the network $K$; weight matrices $W^{k}, \forall k \in \{1, \ldots, K\}$; non-linearity $\sigma$; mean aggregator functions AGG; neighborhood function $\mathcal{N}: v \to 2^{V}$
Output: Vector representations for all $v \in V$
1: $h_v^{0} \leftarrow x_v, \forall v \in V$;
2: for $k = 1, \ldots, K$ do
3:   for $v \in V$ do
4:     $h_{\mathcal{N}(v)}^{k} \leftarrow \mathrm{AGG}(\{h_u^{k-1}, \forall u \in \mathcal{N}(v)\})$;
5:     $h_v^{k} \leftarrow \sigma(W^{k} \cdot \mathrm{CONCAT}(h_v^{k-1}, h_{\mathcal{N}(v)}^{k}))$
6:   end
7:   $h_v^{k} \leftarrow h_v^{k} / \|h_v^{k}\|_2, \forall v \in V$
8: end
Output: $z_v \leftarrow h_v^{K}, \forall v \in V$

In Algorithm 1, $K$ is the number of layers of the network, which also represents the number of hops of adjacent points that can be aggregated at each vertex, because each additional layer can aggregate the information about the neighbors of
a further layer, $h_u$ is the feature vector of node $u$, $\{h_u^{k-1}, \forall u \in \mathcal{N}(v)\}$ denotes the embeddings of the neighbors $u$ of node $v$ at layer $k - 1$, and $h_{\mathcal{N}(v)}^{k}$ represents the aggregated feature of all neighbors of node $v$ at layer $k$. In this letter, the aggregator function (AGG) is the mean

$$ \mathrm{AGG} = \frac{\sum_{u \in \mathcal{N}(v)} h_u^{k-1}}{|\mathcal{N}(v)|}. $$

From Algorithm 1, at each iteration or search depth, the nodes collect information from their local neighbors, and as the process iterates, the nodes gradually gain more and more information from farther reaches of the graph. Thus, long-range contextual relations are extracted.

C. Graph Attention

In the experiments, we find that the degree of association differs between different pairs of nodes. To better extract the global and contextual information, a graph attention mechanism is added to the network so that important node information receives greater weight. The graph attention mechanism can obtain global geometric features by calculating the relationship between any two nodes in the graph. To transform the input into the output, the input features must pass through at least one linear transformation. A weight matrix is trained for all nodes, $W \in \mathbb{R}^{F' \times F}$, which relates the $F$ input features to the $F'$ output features. Node-to-node correlation can be learned through the network layer

$$ e_{ij} = \mathrm{LeakyReLU}\left(a^{T} [W h_i \,\|\, W h_j]\right). \tag{7} $$

Equation (7) gives the importance of node $j$ to node $i$, where $a^{T} \in \mathbb{R}^{2F'}$ is the parameter vector of the network, $\|$ denotes the concatenation operation, and $\mathrm{LeakyReLU}(\cdot)$ is a nonlinear layer.

Then, $e_{ij}$ is normalized and converted to a probability output $a_{ij}$ through a softmax function

$$ a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T} [W h_i \,\|\, W h_j]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(a^{T} [W h_i \,\|\, W h_k]\right)\right)}. \tag{8} $$

Therefore, the graph convolution output of each node can be expressed as follows:

$$ h_i^{l} = \sigma\left(\sum_{j \in \mathcal{N}_i} a_{ij} \cdot W^{T} h_j^{l-1}\right) \tag{9} $$

where $\sigma$ denotes the activation function, and $a_{ij}$ is the learned attention weight.

Fig. 2 shows the working process of the graph attention mechanism in SAGE-A. By learning the importance weight of each node to the classified node, the graph attention mechanism gives the important nodes greater weight, and hence, global and contextual information can be learned from the graph via an attention mechanism.

Fig. 2. Schematic of the graph attention mechanism [see (9)]. l denotes the lth layer of the network. Different colors of the nodes represent different land-cover type features.

D. Region-to-Pixel Assignment

Multiscale information has been widely shown to be very useful for HSI classification [18], [19]. Ground objects have different geometric features, and multilevel feature extraction can fully learn the contextual information of the image. The network uses multilayer graphSAGE to learn the relationship between superpixels of different scales, as in Algorithm 1. Fig. 3 shows the 1-hop and 2-hop neighbors of a central example A to illustrate the multilevel design. Then, the receptive field of A at scale $s$ is formed as

$$ H_s(x_i) = H_1(H_{s-1}(x_i), x_{s-1}) \tag{10} $$

where $H_0(x_i) = x_i$, and $H_1(x_i)$ is the new node embedding of the 1-hop neighbors of $x_i$. Considering that the degree of information association differs between nodes, we use (10) to analyze the association degree of the learned information. The network output is expressed as follows:

$$ O = A(H_s(x_i)) \tag{11} $$

where $A$ is the attention mechanism, and $O$ is the output of SAGE-A. In our network, the cross-entropy error is adopted to penalize the difference between the network output and the labels of the original labeled examples, namely

$$ \mathcal{L} = -\sum_{z \in y_G} \sum_{f=1}^{C} Y_{zf} \ln O_{zf} \tag{12} $$

where $y_G$ is the set of labeled examples, $C$ denotes the number of classes, and $Y_{zf}$ is the label matrix. The details of our SAGE-A are shown in Algorithm 2. The input features of SAGE-A are the average spectral signatures of the graph nodes, which enables the network to process the spectral information of HSIs. At the same time, the SAGE method processes the spatial relationships of the nodes in the graph, so that the model can learn the long-range spatial information of the HSIs, and the graph attention mechanism processes the overall information of the graph to learn the global and contextual information of each node.

Fig. 3. Network multilevel information learning.

Algorithm 2 Proposed SAGE-A for HSI Classification
Input: Input image; number of epochs T = 2000; learning rate = 0.0001; number of graph convolutional layers L = 3; dropout = 0.2; Adam gradient descent; python = 3.8; pytorch = 1.6.0.
1: Segment the whole image into superpixels via the SLIC algorithm;
2: Extract the superpixel input features (average spectral signatures);
3: // Construct the graph and train the SAGE-A model
4: for t = 1 to T do
5:   Graph convolution of node features;
6:   Perform graph learning on the spatial features of adjacent points by Algorithm 1;
7:   Batch normalization, dropout, and ReLU;
8:   Perform graph learning at the spatial level of adjacent and farther points by Algorithm 1;
9:   Batch normalization and ReLU;
10:  // Graph convolution attention mechanism
11:  Perform graph learning at the global level by Eq. (9) and output the spatial and spectral features;
12:  Calculate the error term according to Eq. (12) and update the weight matrices using Adam gradient descent;
13: end for
14: Conduct label prediction based on the trained network;
Output: Predicted label for each pixel.

IV. EXPERIMENTAL RESULTS

A. Data Set Description and Implementation

Two real hyperspectral data sets, Pavia University (PU) and Houston 2013, are adopted to verify the classification performance of our proposed method. The first data set, PU, contains 610 × 340 pixels and 103 bands. It includes a large number of background pixels, and 42 776 pixels can be used for classification. The scene contains nine land-cover classes. The second data set, Houston 2013, was used in the 2013 Geoscience and Remote Sensing Society (GRSS) Data Fusion Contest. The Houston 2013 scene is composed of 144 spectral bands and 349 × 1905 pixels. These pixels are divided into 15 categories.

TABLE I
ACCURACY COMPARISONS FOR THE PU SCENE. BOLD NUMBERS INDICATE THE BEST PERFORMANCE

TABLE II
ACCURACY COMPARISONS FOR THE HOUSTON 2013 SCENE. BOLD NUMBERS INDICATE THE BEST PERFORMANCE

B. Experimental Setting

For the two HSI data sets described in Section IV-A, 30 labeled pixels in each class are randomly selected for
network training, and the remaining unlabeled pixels are used for network testing. The hyperparameter settings of our SAGE-A are listed in Algorithm 2. To validate the performance of SAGE-A, six other state-of-the-art image classification methods are used for comparison. Specifically, our network is compared with two CNN-based methods, that is, convolution autoencoder (CAE) [11] and convolutional recurrent neural network (CRNN) [8], and two GCN-based methods, that is, context-aware dynamic graph convolutional network (CAD-GCN) [17] and multiscale dynamic graph convolutional network (MDGCN) [16]. Meanwhile, two traditional machine learning methods are also adopted, namely, multiband compact texture unit (MBCTU) [3] and joint collaborative representation and SVM with decision fusion (JSDF) [4]. Overall accuracy (OA), average accuracy (AA), kappa coefficient (κ), and per-class accuracy are used as evaluation indices.

C. Comparisons With Other Methods

From Tables I and II, we conclude that the proposed SAGE-A achieves better results than the other state-of-the-art models in OA, AA, and κ, which validates the effectiveness of the proposed multilevel graphSAGE network with an attention mechanism. It is also notable that the GCN-based methods perform better than MBCTU, JSDF, CAE, and CRNN. This is because the graph convolution network can learn the relations among neighbor nodes automatically, which is suitable for classification with limited labeled training samples. However, GCN is a transductive learning method, which makes it difficult to aggregate new nodes. In other words, the GCN-based methods have difficulty learning the long-range contextual information of the graph, whereas our proposed SAGE-A aggregates nodes (superpixels) at different levels by adjusting the node embedding layers of the network. The attention mechanism enables SAGE-A to automatically learn the global and contextual information of the graph.

D. Impact of Parameters/Hyperparameters

Many significant parameters/hyperparameters should be tuned in the proposed SAGE-A architecture. In this experiment, the sensitivity of the classification performance to different hyperparameter settings is evaluated in detail.

Fig. 4 shows the classification performance of the seven algorithms with different numbers of labeled examples (i.e., pixels) used for training. We vary the number of labeled examples per class from 5 to 30 with an interval of 5 and report the OA achieved by the seven algorithms on the PU and Houston 2013 data sets. From the results, we find that the OA of each method on the PU and Houston 2013 data sets improves significantly as the number of labeled examples per class increases. Besides, the proposed SAGE-A model performs better than the competing algorithms throughout, which shows the effectiveness of multilevel spatial information for HSI classification. Furthermore, the proposed SAGE-A can automatically learn global contextual features based
on classified land cover, which is more robust than using a precomputed fixed graph.

Fig. 4. OAs of various methods under different numbers of labeled examples per class. (a) University of Pavia data set. (b) Houston 2013 data set.

The impact of convolution layers l and segment scales S on the two data sets is shown in Fig. 5. We can conclude that both l and S have a significant impact on the classification accuracies. The best result is usually reached with l = 3 convolution layers. A multilevel network is able to learn more spatial information at a larger scale. However, features learned at greater iteration or search depth can inhibit classification because of their low correlation. For S, the classification accuracies increase as the segment scale increases. However, the amount of calculation also increases exponentially, which may be unacceptable under limited experimental conditions. In our proposed method, the segment scale S is 30 000, which has reached the limits of computing.

Fig. 5. Parametric sensitivity of l and S. (a) PU data set. (b) Houston 2013 data set.

E. Ablation Study

In this experiment, we investigate the ablative effect of the SAGE-based attention mechanism. For the sake of comparison, we record the classification results produced without using an attention mechanism, and the simplified model is denoted as “SAGE.” The experimental setting is kept identical to Section IV-B. The comparative results are shown in Table III. As shown in the table, the SAGE-based attention mechanism plays an important role in the improvement of learning efficiency.

TABLE III
OA, AA (%), AND KAPPA COEFFICIENT ACHIEVED BY DIFFERENT DATA SETS. MODEL SETTINGS ON PU AND HOUSTON 2013 DATA SETS

V. CONCLUSION

In this letter, a novel SAGE-A for HSI classification is proposed. To extract long-range contextual relations, we go beyond the regular image grids by adopting the pixel-to-region (superpixel) assignment and further encode the contextual relations among local regions, so that originally long-range local nodes in the 2-D space can be connected by multilevel SAGE. Moreover, we learn the importance weight of each node to the classified node, and therefore, global and contextual relations among pixels can be gradually refined, and local-region features can be represented precisely.

REFERENCES

[1] B. Rasti et al., “Feature extraction for hyperspectral imagery: The evolution from shallow to deep: Overview and toolbox,” IEEE Geosci. Remote Sens. Mag., vol. 8, no. 4, pp. 60–88, Dec. 2020, doi: 10.1109/MGRS.2020.2979764.
[2] P. Zhong, Z. Gong, and J. Shan, “Multiple instance learning for multiple diverse hyperspectral target characterizations,” IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 1, pp. 246–258, Jan. 2020.
[3] K. Djerriri, A. Safia, R. Adjoudj, and M. S. Karoui, “Improving hyperspectral image classification by combining spectral and multiband compact texture features,” in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2019, pp. 465–468.
[4] C. Bo, H. Lu, and D. Wang, “Hyperspectral image classification via JCR and SVM models with decision fusion,” IEEE Geosci. Remote Sens. Lett., vol. 13, no. 2, pp. 177–181, Feb. 2016.
[5] L. Wang, S. Hao, Q. Wang, and Y. Wang, “Semi-supervised classification for hyperspectral imagery based on spatial-spectral label propagation,” ISPRS J. Photogramm. Remote Sens., vol. 97, pp. 123–137, Nov. 2014.
[6] W. Hu, Y. Huang, L. Wei, F. Zhang, and H. Li, “Deep convolutional neural networks for hyperspectral image classification,” J. Sensors, vol. 2015, pp. 1–12, Jul. 2015.
[7] K. Makantasis, K. Karantzalos, A. Doulamis, and N. Doulamis, “Deep supervised learning for hyperspectral data classification through convolutional neural networks,” in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2015, pp. 4959–4962.
[8] Q. Liu, F. Zhou, R. Hang, and X. Yuan, “Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification,” Remote Sens., vol. 9, no. 12, p. 1330, Dec. 2017.
[9] M. Zhang, W. Li, and Q. Du, “Diverse region-based CNN for hyperspectral image classification,” IEEE Trans. Image Process., vol. 27, no. 6, pp. 2623–2634, Jun. 2018.
[10] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 1–9.
[11] R. Kemker and C. Kanan, “Self-taught feature learning for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 55, no. 5, pp. 2693–2705, May 2017, doi: 10.1109/TGRS.2017.2651639.
[12] D. Hong, N. Yokoya, J. Chanussot, and X. X. Zhu, “An augmented linear mixing model to address spectral variability for hyperspectral unmixing,” IEEE Trans. Image Process., vol. 28, no. 4, pp. 1923–1938, Apr. 2019.
[13] A. Sha, B. Wang, X. Wu, and L. Zhang, “Semisupervised classification for hyperspectral images using graph attention networks,” IEEE Geosci. Remote Sens. Lett., vol. 18, no. 1, pp. 157–161, Jan. 2021.
[14] L. Mou, X. Lu, X. Li, and X. X. Zhu, “Nonlocal graph convolutional networks for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 58, no. 12, pp. 8246–8257, Dec. 2020, doi: 10.1109/TGRS.2020.2973363.
[15] D. Hong, L. Gao, J. Yao, B. Zhang, A. Plaza, and J. Chanussot, “Graph convolutional networks for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., early access, Aug. 18, 2020, doi: 10.1109/TGRS.2020.3015157.
[16] S. Wan, C. Gong, P. Zhong, B. Du, L. Zhang, and J. Yang, “Multiscale dynamic graph convolutional network for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 58, no. 5, pp. 3162–3177, May 2020.
[17] S. Wan, C. Gong, P. Zhong, S. Pan, G. Li, and J. Yang, “Hyperspectral image classification with context-aware dynamic graph convolutional network,” IEEE Trans. Geosci. Remote Sens., vol. 59, no. 1, pp. 597–612, Jan. 2021, doi: 10.1109/TGRS.2020.2994205.
[18] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, Jun. 2014.
[19] S. Zhang and S. Li, “Spectral-spatial classification of hyperspectral images via multiscale superpixels based sparse representation,” in Proc. IEEE IGARSS, Jul. 2016, pp. 2423–2426.
No ratings yet
Learning high-level spectral-spatial features for hyperspectral image classification with insufficient labeled samples
9 pages
A Multiscale Dual-Branch Feature Fusion and Attention Network For Hyperspectral Images Classification
No ratings yet
A Multiscale Dual-Branch Feature Fusion and Attention Network For Hyperspectral Images Classification
13 pages
1-s2.0-S1110982324000048-main
No ratings yet
1-s2.0-S1110982324000048-main
17 pages
PSASL Pixel-Level and Superpixel-Level Aware Subspace Learning For Hyperspectral Image Classification
No ratings yet
PSASL Pixel-Level and Superpixel-Level Aware Subspace Learning For Hyperspectral Image Classification
16 pages
Major Project Report
No ratings yet
Major Project Report
30 pages
A Fast 3D CNN For Hyperspectral Image Classification: Muhammad Ahmad
No ratings yet
A Fast 3D CNN For Hyperspectral Image Classification: Muhammad Ahmad
5 pages
Koumoutsou 2020
No ratings yet
Koumoutsou 2020
8 pages
Full Document - Hyperspectral PDF
No ratings yet
Full Document - Hyperspectral PDF
96 pages
Hyperspectral Image Classification With Spectral-Spatial Feature Integration and Ensemble Learning
No ratings yet
Hyperspectral Image Classification With Spectral-Spatial Feature Integration and Ensemble Learning
12 pages
Sun 2022
No ratings yet
Sun 2022
25 pages
Liu 2017
No ratings yet
Liu 2017
11 pages
Remotesensing 12 01257
No ratings yet
Remotesensing 12 01257
26 pages
Paper 2
No ratings yet
Paper 2
12 pages
HybridCNN Based Hyperspectral Image Classification Using Multiscalespatiospectral Features
No ratings yet
HybridCNN Based Hyperspectral Image Classification Using Multiscalespatiospectral Features
10 pages
sensors-23-03515-v2
No ratings yet
sensors-23-03515-v2
18 pages
Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review
No ratings yet
Deep Learning Meets Hyperspectral Image Analysis: A Multidisciplinary Review
32 pages
Remotesensing 14 04373 v2
No ratings yet
Remotesensing 14 04373 v2
32 pages
Radiometric Indices-Based Spectro-Spatial Approach For Hyperspectral Image Classification
100% (1)
Radiometric Indices-Based Spectro-Spatial Approach For Hyperspectral Image Classification
15 pages
Retracted-Advances in Hyperspectral Image Classification With A Bottleneck Attention Mechanism Based On 3D-FCNN Model and Imaging Spectrometer Sensor
No ratings yet
Retracted-Advances in Hyperspectral Image Classification With A Bottleneck Attention Mechanism Based On 3D-FCNN Model and Imaging Spectrometer Sensor
17 pages
Going Deeper With Contextual CNN For Hyperspectral Image Classification
No ratings yet
Going Deeper With Contextual CNN For Hyperspectral Image Classification
14 pages
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
No ratings yet
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
70 pages
Sample EIP-II Report
No ratings yet
Sample EIP-II Report
7 pages
Deep Feature Extraction and Classification of Hyperspectral Images Based On Convolutional Neural Networks
No ratings yet
Deep Feature Extraction and Classification of Hyperspectral Images Based On Convolutional Neural Networks
38 pages
Sensors: Comparison of CNN Algorithms On Hyperspectral Image Classification in Agricultural Lands
No ratings yet
Sensors: Comparison of CNN Algorithms On Hyperspectral Image Classification in Agricultural Lands
17 pages
When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework
No ratings yet
When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework
13 pages
2018 Recent Advances on Spectral–Spatial Hyperspectral Image Classification An Overview and New Guidelines
No ratings yet
2018 Recent Advances on Spectral–Spatial Hyperspectral Image Classification An Overview and New Guidelines
19 pages
Deep Convolutional Neural Networks For The Classification of Snapshot Mosaic Hyperspectral Imagery
No ratings yet
Deep Convolutional Neural Networks For The Classification of Snapshot Mosaic Hyperspectral Imagery
6 pages
HyperSpecTral Image Classification
No ratings yet
HyperSpecTral Image Classification
17 pages
Mesh Generation: Advances and Applications in Computer Vision Mesh Generation
From Everand
Mesh Generation: Advances and Applications in Computer Vision Mesh Generation
Fouad Sabry
No ratings yet
Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
No ratings yet
Understanding Multimodal Popularity Prediction of Social Media Videos With Self-Attention
11 pages
Image-Steganography-with-CNNs (1)
No ratings yet
Image-Steganography-with-CNNs (1)
8 pages
LSTM Paper
No ratings yet
LSTM Paper
5 pages
Mapua Institute of Technology at Laguna Academic Year 2019-2020
No ratings yet
Mapua Institute of Technology at Laguna Academic Year 2019-2020
17 pages
333 10week3
No ratings yet
333 10week3
5 pages
Differentiable Quantization of Deep Neural Networks: Equal Contribution
No ratings yet
Differentiable Quantization of Deep Neural Networks: Equal Contribution
21 pages
Si-Lang Translator With Image Processing
No ratings yet
Si-Lang Translator With Image Processing
4 pages
Convolution Neural Network
No ratings yet
Convolution Neural Network
4 pages
Ait401 DL Syllubus
100% (1)
Ait401 DL Syllubus
13 pages
Autotune DNN Nov5 PDF
No ratings yet
Autotune DNN Nov5 PDF
5 pages
CVPR 2022 MainConference ProgramGuide Final
No ratings yet
CVPR 2022 MainConference ProgramGuide Final
70 pages
Automatic Field Monitoring and Detection of Plant Diseases Using IoT
No ratings yet
Automatic Field Monitoring and Detection of Plant Diseases Using IoT
20 pages
Deep Learning
No ratings yet
Deep Learning
189 pages
Xception Net
No ratings yet
Xception Net
8 pages
StonPrehistoric Stone Tool Identification Android App For Archaeological Researchers
No ratings yet
StonPrehistoric Stone Tool Identification Android App For Archaeological Researchers
6 pages
Fetal Brain Ultrasound Image Classification Using Deep Learning
100% (1)
Fetal Brain Ultrasound Image Classification Using Deep Learning
5 pages
List of Inhouse Projects
No ratings yet
List of Inhouse Projects
105 pages
1 s2.0 S0957417422019844 Main
No ratings yet
1 s2.0 S0957417422019844 Main
15 pages
Feature Extraction in TorchVision Using Torch FX - PyTorch
No ratings yet
Feature Extraction in TorchVision Using Torch FX - PyTorch
9 pages
19Vol102No20
No ratings yet
19Vol102No20
12 pages
Review 2 Capstone
No ratings yet
Review 2 Capstone
13 pages
Research Methods in Machine Learning: A Content Analysis: Jackson Kamiri Geoffrey Mariga
No ratings yet
Research Methods in Machine Learning: A Content Analysis: Jackson Kamiri Geoffrey Mariga
14 pages
Don't Give Me The Details, Just The Summary! Topic-Aware Convolutional Neural Networks For Extreme Summarization
No ratings yet
Don't Give Me The Details, Just The Summary! Topic-Aware Convolutional Neural Networks For Extreme Summarization
11 pages
Deep NN - Theory, Tutorial and Survey
No ratings yet
Deep NN - Theory, Tutorial and Survey
32 pages
Water Bottle Defect Detection System Using Convolutional Neural Network
No ratings yet
Water Bottle Defect Detection System Using Convolutional Neural Network
6 pages
A 12.08 - TOPS W All-Digital Time-Domain CNN Engine Using Bi-Directional Memory Delay Lines For Energy Efficient Edge Computing
No ratings yet
A 12.08 - TOPS W All-Digital Time-Domain CNN Engine Using Bi-Directional Memory Delay Lines For Energy Efficient Edge Computing
16 pages
1 s2.0 S0957417423014720 Main
No ratings yet
1 s2.0 S0957417423014720 Main
15 pages
CNN Architectures For Large-Scale Audio Classification
No ratings yet
CNN Architectures For Large-Scale Audio Classification
5 pages
High Fidelity Neural Audio Compression: Alexandre Défossez
No ratings yet
High Fidelity Neural Audio Compression: Alexandre Défossez
19 pages
Icaiccit 720
No ratings yet
Icaiccit 720
6 pages



Fig. 3. Network multilevel information learning.
