
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 60, 2022 5504615

UnDIP: Hyperspectral Unmixing Using Deep Image Prior

Behnood Rasti, Senior Member, IEEE, Bikram Koirala, Graduate Student Member, IEEE,
Paul Scheunders, Senior Member, IEEE, and Pedram Ghamisi, Senior Member, IEEE

Manuscript received November 27, 2020; revised January 30, 2021 and March 3, 2021; accepted March 17, 2021. Date of publication March 31, 2021; date of current version December 13, 2021. The work of Behnood Rasti was supported by the Alexander-von-Humboldt-Stiftung/foundation. The work of Bikram Koirala was supported by the Belgian Science Policy Office (BELSPO) in the frame of the STEREO III program under Project GEOMIX SR/06/357. (Corresponding author: Behnood Rasti.)

Behnood Rasti is with Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Machine Learning Group, 09599 Freiberg, Germany (e-mail: [email protected]; behnood.rasti@gmail.com).

Bikram Koirala and Paul Scheunders are with Imec-Visionlab, University of Antwerp (CDE), B-2610 Antwerp, Belgium (e-mail: [email protected]; [email protected]).

Pedram Ghamisi is with Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Machine Learning Group, 09599 Freiberg, Germany, and also with the Institute of Advanced Research in Artificial Intelligence (IARAI), 1030 Vienna, Austria (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TGRS.2021.3067802

1558-0644 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Abstract: In this article, we introduce a deep learning-based technique for the linear hyperspectral unmixing problem. The proposed method contains two main steps. First, the endmembers are extracted using a geometric endmember extraction method, i.e., a simplex volume maximization in the subspace of the data set. Then, the abundances are estimated using a deep image prior. The main motivation of this work is to boost the abundance estimation and make the unmixing problem robust to noise. The proposed deep image prior uses a convolutional neural network to estimate the fractional abundances, relying on the extracted endmembers and the observed hyperspectral data set. The proposed method is evaluated on simulated data and three real remote sensing data sets for a range of SNR values (i.e., from 20 to 50 dB). The results show considerable improvements compared to state-of-the-art methods. The proposed method was implemented in Python (3.8) using PyTorch as the platform for the deep network and is available online: https://github.com/BehnoodRasti/UnDIP.

Index Terms: Convolutional neural network, deep learning, deep prior, endmember extraction, hyperspectral image, unmixing.

I. INTRODUCTION

SPECTRAL unmixing is one of the major hyperspectral image analysis tasks. Hyperspectral cameras have the ability to capture the spectral signature of materials. This ability allows us to distinguish different materials within a scene. However, due to the limited spatial resolution and scattering of the light, a pixel spectrum is generally a complex mixture of the pure spectra of its constituent materials, i.e., the endmember spectra [1], [2]. Unmixing is the task of estimating the fractional abundances of those endmembers within the spectral pixels. From a modeling point of view, unmixing techniques can be divided into two main groups: linear unmixing and nonlinear unmixing [3], [4].

In linear unmixing, it is assumed that the light only interacts with one pure material before reaching the sensor. In remote sensing applications, hyperspectral images are of low spatial resolution, and pixels typically contain large homogeneous regions of single materials. For this situation, the linear mixture model is a good approximation [3]. In microscopic scenarios (i.e., close-range scenarios), the pure materials are intimately mixed within the pixel, and the light undergoes multiple reflections by several materials. In these situations, the linear approximation often fails, and one has to apply nonlinear models [3].

In this article, we aim at remote sensing applications and focus on the linear hyperspectral unmixing methods. The linear unmixing methods can be categorized as unsupervised, supervised, and semisupervised. Unsupervised methods either sequentially extract the endmembers from the image and then estimate the fractional abundances or simultaneously estimate both endmembers and abundances from the image (so-called blind unmixing). Supervised methods only estimate the abundances from the image, assuming that the endmembers are known a priori. If they are not known a priori, the endmembers need to be extracted from a large endmember library, and one refers to this as semisupervised unmixing (so-called sparse unmixing). In the latter case, the number of endmembers need not be known a priori [1], [3].

Endmembers can be extracted from hyperspectral images based on geometrical principles. This can be done by relying on either the existence of pure spectra for each material, located at the vertices of the data simplex, or the existence of sufficient spectra on the facets of the data simplex, which allows the vertices of the data simplex to be located geometrically. Approaches such as pixel purity index (PPI) [5], N-FINDR [6], and the vertex component analysis (VCA) algorithm [7] use geometrical concepts for endmember extraction. After endmember extraction, the abundance fractions are generally estimated by using optimization algorithms, such as nonnegative constrained least-squares [8], satisfying the abundance nonnegativity constraint (ANC), or fully constrained least-squares [9], satisfying both ANC and the abundance sum-to-one constraint (ASC). This step is also referred to as inversion in the literature [3].

In blind unmixing, both endmembers and abundances are estimated simultaneously. Two major paradigms in blind unmixing are constrained penalized (or regularized)


least-squares (CPLS) methods, such as [10], and statistical approaches, such as [11]. Examples of CPLS algorithms are minimum volume simplex analysis [12], simplex identification via variable splitting and augmented Lagrangian (SISAL), and collaborative nonnegative matrix factorization (CoNMF) [13]. They often solve a penalized least-squares problem, subject to (either or both) ASC and ANC. These algorithms often involve a data-fitting term and a minimum volume-based regularization term. A major issue with these algorithms is that the regularization parameter needs to be tuned. In [14], a geometrical constraint (the square of the simplex volume) was enforced as a regularizer to the fully constrained least-squares problem to simultaneously estimate the abundances and endmembers. In [15], the regularization parameter for the minimum volume-based regularization term was automatically selected by determining the simplex that encloses the whole data. The statistical approaches often formulate the unmixing problem in a Bayesian way and use different estimators, such as the joint maximum a posteriori (MAP) estimator in [16]. It is worth mentioning that both groups are related, as a CPLS can be derived using a MAP estimator [17]. Due to the inherent nonconvexity of blind unmixing methods, they are highly vulnerable to the initialization, and therefore, they are always initialized using a geometrical endmember extraction approach.

In sparse unmixing, the fractional abundances are estimated using sparse regression techniques. These methods describe each spectrum as a sparse linear combination of the elements of a rich library of pure spectra, a problem that can be generally formulated using CPLS. This results in either a convex or a nonconvex problem, depending on the selected sparsity-promoting penalty to be applied on the abundances [18]. Sparse unmixing by variable splitting and augmented Lagrangian (SUnSAL), constrained SUnSAL (C-SUnSAL) [19], and collaborative sparse unmixing [20] are examples of sparse unmixing methods. Both SUnSAL and C-SUnSAL apply an $\ell_1$ penalty on the fractional abundances. SUnSAL utilizes $\ell_2$ for the fidelity term, while C-SUnSAL assumes a constraint to enforce the data fidelity. Collaborative sparse unmixing is similar to SUnSAL but applies $\ell_{2,1}$ (i.e., the sum of $\ell_2$ norms on the abundances) to promote the sparsity on the abundances. SUnSAL was improved in [21] by incorporating spatial information through applying a total variation penalty on the abundances (SUnSAL-TV).

The spectral variability of the endmembers (i.e., the intraclass variability of the materials) is taken into account by using a dictionary of endmember bundles, generated from the data (as opposed to the abovementioned sparse regression-based techniques where the dictionary is made from spectral libraries and does not rely on the data itself). When using endmember bundles, four different penalties, the group least absolute shrinkage and selection operator (LASSO) [22], the collaborative LASSO [20], the elitist LASSO [23], and the fractional LASSO [24], were proposed in the framework of sparse regression in [24], where all take the ASC into account. The main difference between those techniques is the selection of the penalty term applied to the abundances.

Deep learning-based networks are the state of the art in machine learning and computer vision applications. Inevitably, most of the remote sensing applications involving machine learning and image processing have been inspired by deep networks [25]. Recently, a variety of deep neural networks has been proposed for hyperspectral unmixing, mainly based on variations of deep encoder–decoder networks, for which the inputs are the spectra and the outputs are the abundances. The abundances are then decoded to the spectra again using linear layers, with the endmembers as the weights. EndNet [26], SNSA [27], DAEN [28], DeepGUn [29], and uDAS [30] are a few examples of such unmixing techniques. EndNet proposes a loss function with several terms, including a Kullback–Leibler divergence term, a SAD similarity, and a sparsity term, which makes the parameter selection very challenging. SNSA utilizes a stack of nonnegative sparse autoencoders from which the last one performs the task of unmixing and the others are exploited to improve the robustness with respect to the outliers. DAEN exploits a stacked autoencoder to initialize a variational autoencoder that performs the unmixing task. In [29], a variational autoencoder is used to generate the endmembers. uDAS exploits an additional denoising constraint on the decoder and an $\ell_{2,1}$ sparsity constraint on the decoder. In all these methods, the spatial information is ignored.

The advantage of incorporating the spatial information for spectral unmixing has been confirmed in the literature. Training a network based on a single spectrum at a time ignores the spatial information. Therefore, patchwise or cubewise CNNs were proposed to utilize the spatial information. First, the image is spatially divided into patches, and then, the convolution is applied on small patches of spectra. In [31], it was shown that cubewise CNN outperforms pixelwise CNN. In [32], spatial information has been exploited for unmixing by improving the encoder–decoder architecture proposed in [33] and by applying parallel encoder–decoders on HSI patches. In [34], a CNN was proposed based on a spatial-spectral model, which is trained using HSI patches. Most recently, a convolutional autoencoder was proposed for supervised hyperspectral unmixing in [35], exploiting 3-D convolutional filters. The patchwise approach was found useful for endmember estimation since it supports the idea of endmember bundles and captures the variability of the spectra. However, it degrades (and blurs) the estimated abundances [34] since small patches do not contain enough structure for the convolutions (filters) to perform better than merely a mean filter.

The supervised CNN exploited in the abovementioned techniques requires spectral signatures to train the CNN. In this article, we propose an unsupervised CNN that does not need spectral signatures for training. The convolutional encoder–decoder network proposed in this article is a more general network than the autoencoders often used in the literature, in the sense that the input can have any distribution regardless of the output.

A. Contributions and Novelties

The main motivation of this work is to improve the abundance estimation and make the unmixing problem robust to noise. Hence, we propose a method called “hyperspectral


unmixing using deep image prior” (UnDIP) that utilizes a conventional geometrical approach for endmember extraction and a new UnDIP using a deep convolutional neural network for abundance estimation. The main novelty of this article is the introduction of a new unmixing deep prior for the inversion task. Deep image prior (DIP) was recently proposed for conventional inverse problems in the area of image processing, such as denoising, inpainting, and super-resolution [36], [37]. In [38], DIP was applied for hyperspectral image denoising, inpainting, and super-resolution. In this work, the DIP is adjusted to the unmixing problem to generate fractional abundances. Starting from input noise, the abundances are generated by iteratively minimizing an implicitly regularized loss function. The proposed network is applicable in supervised unmixing scenarios, where the endmembers are available.

UnDIP has the following attributes that distinguish it from the other deep learning-based unmixing techniques proposed in the literature.

1) It uses DIP as a deep learning procedure. UnDIP is designed to solve a regularized inverse problem, in which the regularizer is implicitly incorporated in the cost function. This controls the overfitting of the fidelity term and makes the method robust to noise.

2) It incorporates spatial information globally, unlike the pixelwise or patchwise (convolutional) autoencoder-based approaches in the literature. UnDIP does not need spectral signatures for training. The input of the network has the same spatial size as the observed image and is given by Gaussian noise, which is fixed throughout the learning process. Then, the network iteratively learns to map that input to abundance maps. This unsupervised learning framework has the advantage that the convolutional network can be applied globally on the entire spatial domain of an image, which leads to sharper abundance maps and enhances the robustness to noise.

3) It combines a geometrical endmember estimation approach with deep unmixing. The majority of the proposed blind unmixing techniques, including deep techniques, need to be initialized by a geometrical endmember estimation approach, confirming the importance of this step. Here, for the first time, UnDIP proposes a collaborative framework, in which a geometrical endmember estimation is performed prior to the deep unmixing. The endmembers are then used in the loss function for training the deep network. In this way, the deep network can focus on the improvement of the abundance estimation, while the endmembers remain fixed.

The remainder of this article is organized as follows. The unmixing methodology is explained in detail in Section II. The experimental results are shown and discussed in Section III. Section IV concludes this article.

II. METHODOLOGY

A. Notation

Before discussing the proposed methodology, we explain the notations used in the article. Matrices, column vectors, and scalars are denoted by bold capital letters, bold letters, and plain letters, respectively. $\hat{\mathbf{X}}$ represents the estimate of the variable $\mathbf{X}$. $\|\cdot\|_F$ and $|\cdot|$ denote the Frobenius norm and the absolute value. $\mathbf{x}_{(i)}$ and $\mathbf{x}_i^T$ denote the $i$th column and the $i$th row of matrix $\mathbf{X}$, respectively. $X_{ij}$ denotes the matrix element located at the $i$th row and the $j$th column. $\mathbf{1}_n$ is an $n$-component column vector of ones. The notation $(r)!$ denotes the factorial of the positive integer $r$, and $\det(\mathbf{X})$ indicates the determinant of matrix $\mathbf{X}$.

B. Hyperspectral Modeling

We assume a linear model for HSI

$$\mathbf{Y} = \mathbf{X} + \mathbf{N} \tag{1}$$

where $\mathbf{Y} \in \mathbb{R}^{p \times n}$ is the observed HSI, with $n$ pixels and $p$ bands, $\mathbf{X} \in \mathbb{R}^{p \times n}$ is the unknown image to be estimated, and $\mathbf{N} \in \mathbb{R}^{p \times n}$ is the model error, including noise. In spectral unmixing, we assume that

$$\mathbf{Y} = \mathbf{E}\mathbf{A} + \mathbf{N} \tag{2}$$

where $\mathbf{E} \in \mathbb{R}^{p \times r}$ and $\mathbf{A} \in \mathbb{R}^{r \times n}$, $r \ll p$, contain the $r$ endmembers and their fractional abundances, respectively. The main goal is to estimate the fractional abundances $\mathbf{A}$; however, this is not possible without either having prior knowledge about the endmembers $\mathbf{E}$ or estimating/extracting them from the image.

C. Endmember Extraction

When the endmembers are extracted from the data, one often relies on the geometry of the data. Due to spectral redundancy, an HSI often lives in a low-dimensional subspace [39], [40]. Therefore, the data can be projected onto an $(r-1)$-dimensional subspace and represented by an $(r-1)$-dimensional simplex whose vertices are the endmembers $\mathbf{e}_m$ ($m = 1, \ldots, r$). When pure spectra are available in the data, the endmembers can be extracted by maximizing the volume of the data simplex [41]

$$\arg\max_{\mathbf{E}} V(\mathbf{E}) = \arg\max_{\mathbf{E}} \frac{1}{(r-1)!} \left| \det \begin{bmatrix} 1 & \cdots & 1 \\ \mathbf{e}_{(1)} & \cdots & \mathbf{e}_{(r)} \end{bmatrix} \right| \tag{3}$$

where $\mathbf{E} = [\mathbf{e}_{(i)}]$. In this article, we use an algorithm called simplex volume maximization (SiVM) [42] to extract the endmembers from the data set. SiVM selects the endmembers by iteratively maximizing the simplex volume of the data

$$\arg\max_{\mathbf{E}} V(\mathbf{E}) = \arg\max_{\mathbf{E}} \sqrt{\frac{(-1)^r \cdot \operatorname{cmd}(\mathbf{E})}{2^{r-1}(r-1)!}} \tag{4}$$

where cmd is the Cayley–Menger determinant

$$\operatorname{cmd}(\mathbf{E}) = \det \begin{bmatrix} 0 & 1 & 1 & 1 & \cdots & 1 \\ 1 & 0 & d_{1,2}^2 & d_{1,3}^2 & \cdots & d_{1,r}^2 \\ 1 & d_{2,1}^2 & 0 & d_{2,3}^2 & \cdots & d_{2,r}^2 \\ 1 & d_{3,1}^2 & d_{3,2}^2 & 0 & \cdots & d_{3,r}^2 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & d_{r,1}^2 & d_{r,2}^2 & d_{r,3}^2 & \cdots & 0 \end{bmatrix}.$$
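The volume in (4) depends only on pairwise distances, which makes it easy to check numerically. The sketch below is a simplified greedy variant written for illustration, not the exact SiVM procedure of [42]: the function names, the initialization heuristic (start from the pixel farthest from the data mean), and the omission of any subspace projection are assumptions. It uses the standard Cayley–Menger normalization, $V^2 = (-1)^r \operatorname{cmd}(\mathbf{E}) / (2^{r-1}((r-1)!)^2)$; the constant factor does not change the arg max.

```python
import math
import numpy as np

def cayley_menger_volume(points):
    """Volume of the simplex whose k vertices are the rows of `points`."""
    k = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)            # squared pairwise distances
    cm = np.ones((k + 1, k + 1))               # bordered distance matrix
    cm[0, 0] = 0.0
    cm[1:, 1:] = d2
    v2 = (-1) ** k * np.linalg.det(cm) / (2 ** (k - 1) * math.factorial(k - 1) ** 2)
    return math.sqrt(max(v2, 0.0))             # clip tiny negative round-off

def greedy_simplex_vertices(Y, r):
    """Greedily pick r columns of Y (p x n) that maximize the simplex volume."""
    X = Y.T                                    # one pixel per row
    # Assumed initialization: the pixel farthest from the data mean.
    first = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    while len(chosen) < r:
        vols = [-1.0 if j in chosen else cayley_menger_volume(X[chosen + [j]])
                for j in range(X.shape[0])]
        chosen.append(int(np.argmax(vols)))    # pixel that enlarges the simplex most
    return chosen
```

On noise-free linear mixtures that contain one pure pixel per endmember, this greedy search returns the indices of the pure pixels, because each step maximizes a convex function over the data simplex.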


Here, $d_{i,j}$ is the Euclidean distance between endmembers $\mathbf{e}_i$ and $\mathbf{e}_j$. Since (4) does not take into account nuisances, such as noise, we first project the data onto the subspace obtained by the spectral eigenvectors of a singular value decomposition.

D. Deep Image Prior

In this section, we first explain the general concept of DIP, and in Section II-E, we adapt this concept to the unmixing problem. CNNs are the most popular deep learning networks for inverse problems, such as image restoration. They show excellent performance as long as a large training data set is available.

Recently, DIP was proposed as an unsupervised deep learning alternative, in which the network is entirely trained based on the observed image. DIP generates an image $\mathbf{X}$ using a random initialization $\mathbf{Z}$ and utilizing the deep network as a parametric function $\mathbf{X} = f_\theta(\mathbf{Z})$. Then, the network is optimized over its parameters (i.e., $\theta$) to generate the optimal image $\hat{\mathbf{X}} = f_{\hat{\theta}}(\mathbf{Z})$.

Generally, inverse image reconstruction tasks, such as denoising, super-resolution, and inpainting, can be formulated as an optimization problem

$$\hat{\mathbf{X}} = \arg\min_{\mathbf{X}} Q(\mathbf{Y}, \mathbf{X}) + \lambda R(\mathbf{X}) \tag{5}$$

where the function $Q$ often controls the fidelity toward the observed data and is chosen to fit the reconstruction task. $R$ is a regularizer (or penalty) function selected based on prior knowledge, and $\lambda$ is the tuning parameter that trades off the two terms. One major drawback of this framework is that the selection of a good regularizer depends on the application and the available prior knowledge, which can vary considerably in the case of natural images. A widely used regularizer is total variation, which promotes piecewise smoothness on $\mathbf{X}$. In [37], it was shown that the regularizer can be implicitly substituted by a deep network

$$\hat{\theta} = \arg\min_{\theta} Q(\mathbf{Y}, f_\theta(\mathbf{Z})) \quad \text{s.t.} \quad \hat{\mathbf{X}} = f_{\hat{\theta}}(\mathbf{Z}) \tag{6}$$

where the selection of a proper regularizer is taken off the hands of the user and the optimization is shifted toward optimizing the network parameters, i.e., weights and biases. The minimization problem (6) is solved using the network’s optimizer, e.g., gradient descent, applied to the network’s parameters $\theta$. A common choice for the function $Q$ is the least-squares term, and hence, the problem to solve becomes

$$\hat{\theta} = \arg\min_{\theta} \frac{1}{2}\|\mathbf{Y} - f_\theta(\mathbf{Z})\|_F^2 \quad \text{s.t.} \quad \hat{\mathbf{X}} = f_{\hat{\theta}}(\mathbf{Z}). \tag{7}$$

E. Abundance Estimation Using DIP

In this section, we adapt DIP to solve the unmixing problem. Unlike the majority of the deep learning-based unmixing techniques proposed in the literature, we propose to use a deep network for estimating the abundances $\mathbf{A}$ only, given fixed endmembers $\mathbf{E}$. The widely used classical method to estimate the abundances is to solve the optimization problem

$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \frac{1}{2}\|\mathbf{Y} - \mathbf{E}\mathbf{A}\|_F^2 \quad \text{s.t.} \quad \mathbf{A} \geq 0, \; \mathbf{1}_r^T \mathbf{A} = \mathbf{1}_n^T \tag{8}$$

i.e., the fully constrained least-squares unmixing (FCLSU), so called because of the use of both the ASC and ANC. It has been shown that the regularized (or penalized) least-squares techniques can take into account prior knowledge of the data and, therefore, provide a better estimation of the abundances [3]

$$\hat{\mathbf{A}} = \arg\min_{\mathbf{A}} \frac{1}{2}\|\mathbf{Y} - \mathbf{E}\mathbf{A}\|_F^2 + \lambda R(\mathbf{A}) \quad \text{s.t.} \quad \mathbf{A} \geq 0, \; \mathbf{1}_r^T \mathbf{A} = \mathbf{1}_n^T \tag{9}$$

where $R(\mathbf{A})$ is the regularizer or penalty term and $\lambda$ is the regularization parameter. The choice of $R$ is dependent on the available prior knowledge that can vary considerably in remote sensing images. However, the regularizer can be implicitly substituted by a deep network, and the problem is transformed into an optimization of the network’s parameters

$$\hat{\theta} = \arg\min_{\theta} \frac{1}{2}\|\mathbf{Y} - \mathbf{E} f_\theta(\mathbf{Z})\|_F^2 \quad \text{s.t.} \quad \hat{\mathbf{A}} = f_{\hat{\theta}}(\mathbf{Z}). \tag{10}$$

Therefore, problem (10) can be solved using a deep network. The only issue left to solve is to enforce the constraints. The constraints in (9) can be easily enforced by using a softmax function in the final layer of the network, given by

$$\operatorname{softmax}(\mathbf{A})_{ij} = \frac{e^{A_{ij}}}{\sum_{k=1}^{r} e^{A_{kj}}} \quad \forall i, j. \tag{11}$$

As a result, the unmixing problem (8) can be solved using DIP. Fig. 1 depicts the concept of UnDIP. The random input image $\mathbf{Z}$ is fixed. $f_\theta$ is a deep network with parameter $\theta$, which is initialized using random weights $\theta_0$ and updated through the learning process. The core idea of UnDIP is to map $\mathbf{Z}$ to $\hat{\mathbf{A}}$, using a deep network $f_\theta$ such that $\hat{\mathbf{A}} = f_{\hat{\theta}}(\mathbf{Z})$. Therefore, $\hat{\theta}$ should be estimated. As can be seen from Fig. 1, UnDIP optimizes the network’s parameters $\theta$ iteratively by computing the gradient of the loss function (10), which relies on the endmembers ($\mathbf{E}$) extracted by SiVM.

When a network is overtrained, overfitting occurs, and the network will not reach the optimal solution for a test set. Since the design of UnDIP is not based on training and testing sets, UnDIP is robust to overfitting of the network. The optimization is done by iterating based on a fixed input and by optimizing the output until the loss function has converged. On the other hand, since UnDIP is an iterative algorithm, the stopping point becomes an important hyperparameter, which will be discussed in Section II-G.

F. Convolutional Neural Network for UnDIP

DIP requires the selection of a network. The description of DIP in Section II-D did not specify a particular network. In [37], the convolutional encoder–decoder network was suggested as the best option for DIP. Here, we discuss in detail the network (i.e., $f_\theta$) shown in Fig. 2 used for UnDIP. The CNN, $f_\theta$, in UnDIP has a few major differences with the other deep (convolutional) networks typically used for unmixing. First, the entire network is only used for the abundance estimation, as the endmembers are extracted using a geometrical approach and are fixed throughout the unmixing. This framework allows using an unsupervised CNN for unmixing where the convolutions can be applied globally


Fig. 1. Schematic of UnDIP. UnDIP maps a random noise input image Z to  using a deep network f θ such that  = f θ̂ (Z). To estimate the network’s
parameters θ̂ , UnDIP starts with randomized weights (θ0 ) and optimizes θ iteratively by computing the gradient of the loss function (10), which utilizes the
endmembers (E), extracted by SiVM.
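The procedure in Fig. 1 can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: the function name `undip_fit`, the small stand-in network in the usage part, the four noise channels, and the averaging weight `ema` are hypothetical. What follows the text is the loss (10), the softmax output that enforces ANC and ASC, the fixed Gaussian noise input, and the Adam optimizer with a learning rate of 0.001.

```python
import torch
from torch import nn

def undip_fit(Y, E, net, z, n_iter=3000, lr=1e-3, ema=0.99):
    """Minimize 0.5 * ||Y - E f_theta(Z)||_F^2 over the network parameters.

    Y: (p, n) observed pixels, E: (p, r) fixed endmembers.
    net maps the fixed input z (1, c, H, W) to abundances (1, r, H, W), n = H * W.
    """
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    A_avg, losses = None, []
    for _ in range(n_iter):
        optimizer.zero_grad()
        A = net(z)[0].flatten(start_dim=1)          # (r, n) abundance matrix
        loss = 0.5 * torch.sum((Y - E @ A) ** 2)    # loss function (10)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        # Exponentially weighted average of the outputs (weight is an assumption).
        A_det = A.detach()
        A_avg = A_det if A_avg is None else ema * A_avg + (1 - ema) * A_det
    return A_avg, losses

# Usage on a small synthetic scene (all sizes are illustrative).
torch.manual_seed(0)
p, r, H, W = 20, 3, 8, 8
E = torch.rand(p, r)
A_true = torch.softmax(torch.randn(r, H * W), dim=0)
Y = E @ A_true
z = torch.randn(1, 4, H, W)                         # fixed Gaussian noise input
net = nn.Sequential(                                # stand-in network, not Fig. 2
    nn.Conv2d(4, 16, 3, padding=1, padding_mode="reflect"),
    nn.BatchNorm2d(16),
    nn.LeakyReLU(0.1),
    nn.Conv2d(16, r, 1),
    nn.Softmax(dim=1),                              # abundances sum to one per pixel
)
A_hat, losses = undip_fit(Y, E, net, z, n_iter=200)
```

Because the average is a convex combination of softmax outputs, the returned abundances stay nonnegative and sum to one per pixel.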

Fig. 2. Proposed convolutional network architecture with one skip connection. This network is used as f θ for UnDIP in the experiments. Different layers
in the network are shown with specific colors.
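A rough PyTorch sketch of a network with this layout is given below. It follows the layer types named in the text (3 × 3 forward convolutions, a 1 × 1 skip branch, batch normalization, leaky ReLU with slope 0.1, stride for downsampling, bilinear upsampling, reflection padding, and a final softmax), but Table I is not reproduced in this copy, so the class name, the position of the strided convolution, the concatenation of the skip branch, and the 1 × 1 output convolution are assumptions.

```python
import torch
from torch import nn
import torch.nn.functional as F

def conv_block(c_in, c_out, kernel, stride=1):
    """Conv -> BN -> leaky ReLU (slope 0.1); reflection padding for 3x3 kernels."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel, stride=stride, padding=kernel // 2,
                  padding_mode="reflect" if kernel > 1 else "zeros"),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1),
    )

class UnDIPNetSketch(nn.Module):
    def __init__(self, in_channels, n_endmembers, width=256, skip_width=4):
        super().__init__()
        # Two forward blocks; placing the stride in the first one is an assumption.
        self.down = nn.Sequential(
            conv_block(in_channels, width, 3, stride=2),
            conv_block(width, width, 3),
        )
        self.skip = conv_block(in_channels, skip_width, 1)   # 1x1 skip branch
        self.head = nn.Sequential(
            conv_block(width + skip_width, width, 3),
            conv_block(width, width, 3),
            nn.Conv2d(width, n_endmembers, 1),               # assumed 1x1 output conv
        )

    def forward(self, z):
        x = self.down(z)
        # Bilinear upsampling back to the input size compensates for the stride.
        x = F.interpolate(x, size=z.shape[-2:], mode="bilinear", align_corners=False)
        x = torch.cat([x, self.skip(z)], dim=1)              # merge skip branch (assumed concat)
        return torch.softmax(self.head(x), dim=1)            # enforces ANC and ASC
```

Interpolating to the input size, rather than by a fixed factor of two, keeps the output aligned with images whose height or width is odd, such as the 95 × 95 Samson scene.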

on the entire spatial domain to extract the spatial information. Second, the autoencoder network generally used for deep spectral unmixing reconstructs spectra as the output of the network using the observed spectra as the input of the network. To do so, different loss functions, such as spectral angle distance and mean squared error, were used to minimize the reconstruction error (RE). As we will show later in the experiments, minimizing the RE with respect to both endmembers and abundances does not necessarily provide a good abundance estimation, which is the main goal in unmixing. On the other hand, in UnDIP, the input is Gaussian noise, and the output is given by the abundance maps. The network is trained to minimize the loss function with respect to the abundances solely.

Fig. 3. Comparison of the network architecture of DIP versus UnDIP, applied on the Jasper Ridge data: (a) loss function value and (b) abundance MAE.

The core of the UnDIP network is based on the convolutional encoder–decoder (also called hourglass) with some skip connections, as proposed in [37], however, with two major differences. First, UnDIP uses only one downsampling block, one upsampling block, and one skip block, while DIP uses five blocks for each. From our experiments, we found that the use of several downsampling blocks degrades the spatial resolution for the unmixing application. In addition, as can be observed in Fig. 3, the UnDIP network converges much faster and leads to better abundance estimations than DIP. The other main difference is the activation function used in the final layer of UnDIP. While the leaky ReLU activation function is used in all layers of DIP, UnDIP uses the leaky rectified linear unit (ReLU) activation function for all the layers except the final


layer. For the final layer, UnDIP exploits a softmax activation function to enforce the constraints, as discussed before.

The main part of the forward pass (the plain network without the skip connection) starts with two blocks of three layers: a convolution layer (Conv), a batch normalization (BN) layer, and a leaky ReLU nonlinear activation layer, which are followed by a bilinear upsampling layer to account for the stride factor used in the convolutions. This type of three-layer block (i.e., Conv, BN, and activation) is the most common one used in the CNN architectures in the literature. The convolutional layers extract different spatial features by using different filters. The BN speeds up the learning process and also provides more robustness in terms of the hyperparameter selection. The activation layer promotes the nonlinearity of the prediction in every layer. Deep networks are hard to train due to vanishing gradients. The skip connection is a solution to this problem: it makes it possible to train a deep network by taking an activation from one layer and adding it to a deeper layer. In this way, the network can easily learn the identity function when the parameters become zero. The network exploits two more blocks of convolution, batch normalization, and leaky ReLU, followed by a convolution layer and softmax, which, finally, generates the abundances.

G. Network Component and Hyperparameter Selection

In this work, leaky ReLU was used as the activation function (except in the last layer), which often speeds up the learning process since the derivative is either one or close to zero. We compared the performance of leaky ReLU with the use of Sigmoid, ELU, and ReLU activation functions and found that both leaky ReLU and ReLU provide the best results. Leaky ReLU was selected since it is the default for the DIP network. The negative slope of leaky ReLU was set to 0.1, which is also the default value in the DIP network. For the filter size of the convolutional layers, we used the default values proposed in [37], i.e., 3 × 3 in the forward connections and 1 × 1 in the skip connections. Downsampling is often applied using pooling and/or stride inside the CNN. For downsampling, we only used the stride within the convolution module, as is the default in [37]. For upsampling, we experimented with both nearest-neighbor and bilinear interpolation and found that bilinear interpolation performs the best. Reflection padding was used in the convolution to preserve the size of the image. The number of filters used is 4 in the skip connection and 256 in the forward connections. The hyperparameters of the network are listed in Table I. We should emphasize that we do not optimize the hyperparameters according to the data set and/or the SNR since this would be unfair to the competing methods used in the experiments. Therefore, the values mentioned in Table I are not optimal, and careful tuning according to the noise level and data set could possibly lead to better results and probably faster convergence. Since UnDIP is an iterative algorithm (as opposed to the other CNN-based algorithms that use training sets for learning), the stopping point or the number of iterations becomes an important hyperparameter to set. To deal with this issue, we use (as also suggested in [37]) exponentially weighted averaging over the outputs and set the number of iterations to a large number (3000). This makes the algorithm very robust to this parameter since the overall average is very close to the minimum solution, even if there is a considerable jump in the loss function at the stopping iteration. Finally, an Adam optimizer was used with a learning rate of 0.001, and PyTorch was used as the platform for the network implementation.

TABLE I
HYPERPARAMETERS USED IN THE EXPERIMENTS FOR UnDIP

III. EXPERIMENTAL RESULTS

The experiments were performed on a simulated data set and three real data sets. The description of the data sets is given as follows.

A. Hyperspectral Data Description

1) Simulated Data Set: A data set of 60 × 75 pixels is simulated by generating linear mixtures of three minerals, i.e., Fe2O3, SiO2, and CaO. The endmembers, which are shown in Fig. 4(a), were measured by an AgriSpec spectrometer [manufactured by Analytical Spectral Devices (ASD)] and contain 200 reflection values in the wavelength range [1000–2500] nm. The ground-truth abundance maps are shown in Fig. 4(b). These contain 20 squares of 5 × 5 pixels with different binary and ternary linear mixtures. The background contains binary mixtures of 50% of Fe2O3 and 50% of SiO2.

Fig. 4. Simulated image. (a) Endmembers. (b) Abundance maps.

2) Samson Image: The Samson hyperspectral data set is shown in Fig. 5(a) and contains 95 × 95 pixels. The spectral signatures contain 156 bands in the wavelength range [401–889] nm. There are three main materials (i.e., soil, tree, and water). The ground-truth endmembers were extracted using SiVM, and the ground-truth fractional abundances were

Authorized licensed use limited to: Space Applications Centre (SAC). Downloaded on December 12,2024 at 05:43:32 UTC from IEEE Xplore. Restrictions apply.
RASTI et al.: UnDIP: HYPERSPECTRAL UNMIXING USING DIP 5504615

Fig. 5. Samson image. (a) True-color image (red: 571.01 nm, green: 539.53 nm, and blue: 432.48 nm). (b) Endmembers. (c) Abundance maps.

Fig. 6. Jasper Ridge image. (a) True-color image (red: 570.14 nm, green: 532.11 nm, and blue: 427.53 nm). (b) Endmembers. (c) Abundance maps.

generated using FCLSU. Both are shown in Fig. 5(b) and (c), respectively.
3) Jasper Ridge Image: The Jasper Ridge data set contains 100 × 100 pixels and is shown in Fig. 6(a). The data set contains 224 bands, covering the wavelength range [380–2500] nm. The water absorption bands (1–3, 108–112, 154–166, and 220–224) were removed, and 198 channels were retained. There are four endmembers [i.e., tree, water, soil, and road, as shown in Fig. 6(b)], which are extracted using SiVM. The ground-truth fractional abundances [see Fig. 6(c)] were estimated using FCLSU.

Fig. 7. Apex image. (a) True-color image (red: 572.2 nm, green: 532.3 nm, and blue: 426.5 nm). (b) Endmembers.
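The band count quoted for the Jasper Ridge data set can be checked with a few lines of Python. This is only an illustrative sketch: it assumes the listed band ranges are 1-based and inclusive, and the variable names are hypothetical, not taken from the authors' code.

```python
# Water-absorption band ranges for Jasper Ridge as listed in the text.
# Assumption: the ranges are 1-based and inclusive.
removed_ranges = [(1, 3), (108, 112), (154, 166), (220, 224)]
total_bands = 224

# Expand the ranges into a set of band numbers to drop.
removed = {b for lo, hi in removed_ranges for b in range(lo, hi + 1)}

# 0-based indices of the channels that are kept, e.g. for slicing a data cube.
kept = [b - 1 for b in range(1, total_bands + 1) if b not in removed]

print(len(removed), len(kept))  # 26 198
```

The 26 removed bands leave 198 channels, which agrees with the number stated for this data set.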
4) Apex Data Set: The cropped image used in the article contains 111 × 122 pixels [as shown in Fig. 7(a)] and 285 bands that cover the wavelength range [413–2420] nm. In this data set, there are four ground measured endmembers [i.e., water, grass, road, and roof, as shown in Fig. 7(b)]. The scene is influenced by variable illumination conditions and contains a shadow-covered area. Therefore, to create the ground-truth fractional abundances, we added a shadow endmember (a zero spectrum) to the list of ground-truth endmembers and then applied FCLSU.
5) Washington DC Mall Data Set: Washington DC Mall is an airborne hyperspectral image, captured over the Washington DC Mall using the HYDICE sensor. The cropped image [see Fig. 8(a)] used in this article contains 319 × 292 pixels in 191 bands over the spectral range from 0.4 to 2.4 μm. The ground truth is available online¹ and contains seven classes: grass, tree, roof, road, water, trail, and shadow. The ground-truth endmembers are selected manually for this data set [as shown in Fig. 8(b)], and FCLSU was used to estimate the ground-truth fractional abundances.

¹ https://engineering.purdue.edu/~landgreb/Hyperspectral.Ex.html

B. Experimental Setup
Seven unmixing methods from different categories were used as competing methods in the experiments:
1) the baseline FCLSU [9];
2) a blind unmixing method: NMF-QMV [15];


3) a sparse unmixing method Collab, which is based on a group sparsity inducing mixed norm using the collaborative LASSO [24];
4) three deep unmixing methods: uDAS [30], SNSA [27], and DAEN [28].

All the parameters for the competing methods were selected according to the reported default values.

Fig. 8. Washington DC Mall image. (a) True-color image (red: 572.7 nm, green: 530.1 nm, and blue: 425.0 nm). (b) Endmembers.

Hyperspectral images generally contain different levels and types of noise [43]. It has been shown that hyperspectral unmixing techniques are often remarkably robust to noise and can be used as denoisers [44]. To compare the robustness of the techniques with respect to the image SNR, we added white zero-mean Gaussian noise to the data to generate the observed data Y. Images are generated with SNR = 20, 30, 40, and 50 dB, on all data sets, except for the Apex and the Washington DC Mall images. All experiments are repeated five times with random noise realizations. Mean results and standard deviations are shown.
For all the data sets, ground-truth abundance maps are available, and therefore, quality assessment metrics are applied to compare the results. In the experiments, the results are compared based on the abundance mean absolute error (MAE), the RE, the spectral RMSE, and the spectral angle distance (SAD). All results, except for SAD are reported as percentages. The abundance MAE is given by the mean of the absolute errors (in percent) between the estimated abundances and the ground-truth abundances

$$\text{Abundance MAE} = \frac{1}{rn} \sum_{k=1}^{r} \sum_{i=1}^{n} \left| \hat{A}_{ki} - A_{ki} \right| \times 100. \tag{12}$$

The RE is the RMSE (in percent) between the obtained reconstructed image X̂ and the observed (noisy) image Y

$$\text{RE} = \sqrt{\frac{1}{pn} \sum_{j=1}^{p} \sum_{i=1}^{n} \left( \hat{X}_{ji} - Y_{ji} \right)^{2}} \times 100. \tag{13}$$

The spectral RMSE is the RMSE (in percent) between the obtained reconstructed image X̂ and the original noise-free image X

$$\text{spectral RMSE} = \sqrt{\frac{1}{pn} \sum_{j=1}^{p} \sum_{i=1}^{n} \left( \hat{X}_{ji} - X_{ji} \right)^{2}} \times 100. \tag{14}$$

SAD (in degree) is used to measure the SAD between an estimated and the ground-truth endmember as

$$\text{SAD}\left(e_{(i)}, \hat{e}_{(i)}\right) = \arccos\left( \frac{\left\langle e_{(i)}, \hat{e}_{(i)} \right\rangle}{\left\| e_{(i)} \right\| \left\| \hat{e}_{(i)} \right\|} \right) \frac{180}{\pi}.$$

We should note that, although a lower Abundance MAE denotes a better abundance estimation and a lower spectral RMSE denotes a better signal reconstruction, a lower RE does not necessarily mean a better abundance estimation performance or a better signal reconstruction. According to the linear model, the RE depends on the linear combination of the endmembers and abundances. The multiplication of both may be close to the observed spectra, but, individually, they might not represent the true endmembers and abundances. In addition, the RE includes model errors (nonlinearities) and noise. Only if the data contain insignificant levels of model errors and noise, a lower RE denotes an improved performance, and then, the RE will be close to the spectral RMSE since the observed data are close to the original data. The RE should be interpreted along with the Abundance MAE. If the abundance estimation is satisfactory, then a lower RE indicates a better performance. Otherwise, the spectral RMSE is more informative for validating the performance.

C. Unmixing Experiments
1) Experiments on Simulated Data Set: Fig. 9 shows the results of the unmixing techniques applied on the simulated data. As can be observed from Fig. 9(a), UnDIP and FCLSU obtain the lowest Abundance MAE for all SNR values. DAEN slightly outperforms the remaining techniques and collaborative LASSO provides the poorest results for 20 dB. The RE for all techniques is low, despite the poor abundance estimation of some of the methods, e.g., sparse unmixing. Therefore, the spectral RMSE is more informative [see Fig. 9(c)]. In the case of simulated data, the error is only induced by the noise since no other model errors were simulated. SNSA, UnDIP, FCLSU, and NMF-QMV obtain the lowest RMSE, confirming, along with the good abundance estimation performance, that these methods are able to reconstruct the data. Fig. 9(d) shows the performance of the endmember estimation by the different techniques, in terms of SAD. Both UnDIP and FCLSU apply SiVM for the extraction of the endmembers. It can be observed that SiVM outperforms the other techniques in terms of SAD for all SNRs.
Fig. 10 visually compares the obtained abundance maps using the different unmixing techniques for SNR = 20 dB. The visual comparison reveals that UnDIP is less sensitive to noise than the other techniques and generates abundance maps that are very close to the ground-truth abundances, even for SNR values as low as SNR = 20 dB. In supervised CNN, image patches are extracted to train the network, and therefore, the convolutional operator is only applied on a spatial subset of the data. Depending on the size of the patches, the spatial information can be considerably degraded. On the other hand, UnDIP applies the convolutional operator on the entire spatial domain since it is an unsupervised CNN. As can be seen from Fig. 10, the proposed method successfully preserves the structures and provides better abundance estimations.
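The quality metrics used in these comparisons, i.e., the abundance MAE in (12), the RE in (13), the spectral RMSE in (14), and the SAD, can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: matrices are assumed to be nested lists (abundances as r rows of n pixels, images as p rows of n pixels), and all function names are hypothetical.

```python
import math


def abundance_mae(A_hat, A):
    """Mean absolute abundance error in percent, following (12)."""
    r, n = len(A), len(A[0])
    total = sum(abs(A_hat[k][i] - A[k][i]) for k in range(r) for i in range(n))
    return 100.0 * total / (r * n)


def rmse_percent(X_hat, X_ref):
    """RMSE in percent between a reconstruction and a reference image.

    With X_ref = Y (noisy observation) this is the RE in (13);
    with X_ref = X (noise-free image) it is the spectral RMSE in (14).
    """
    p, n = len(X_ref), len(X_ref[0])
    sq = sum((X_hat[j][i] - X_ref[j][i]) ** 2 for j in range(p) for i in range(n))
    return 100.0 * math.sqrt(sq / (p * n))


def sad_degrees(e, e_hat):
    """Spectral angle distance in degrees between two endmember spectra."""
    dot = sum(a * b for a, b in zip(e, e_hat))
    norms = math.sqrt(sum(a * a for a in e)) * math.sqrt(sum(b * b for b in e_hat))
    # Clamp to [-1, 1] to guard against rounding slightly outside the arccos domain.
    cos_angle = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cos_angle))


# Toy example: 2 endmembers, 2 pixels, every abundance off by 0.1.
A_true = [[0.5, 1.0], [0.5, 0.0]]
A_est = [[0.6, 0.9], [0.4, 0.1]]
print(round(abundance_mae(A_est, A_true), 6))

# The SAD ignores scaling: a spectrum and its scaled copy have an angle near zero.
print(round(sad_degrees([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 3))
print(round(sad_degrees([1.0, 0.0], [1.0, 1.0]), 3))
```

Because the RE and the spectral RMSE differ only in the reference image, a single function serves for both. The scale invariance of the SAD is also visible here, which is one reason why it is reported in degrees rather than as a percentage.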


Fig. 9. Simulated data—results of unmixing in terms of (a) abundance MAE, (b) RE, (c) spectral RMSE, and (d) SAD (in degree) with respect to different
noise levels of the observed image (in SNR).
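The noise levels on the horizontal axes correspond to white zero-mean Gaussian noise added to the noise-free data. A minimal sketch of how such noise could be scaled to a target SNR is given below. It assumes that the SNR is defined as the ratio of the mean signal power to the noise variance, expressed in dB; this definition is an assumption, since it is not spelled out here, and the function names are hypothetical.

```python
import math
import random


def add_gaussian_noise(X, snr_db, seed=0):
    """Return Y = X + N, with N white zero-mean Gaussian noise at the given SNR.

    X is a nested list with p rows (bands) and n columns (pixels).
    Assumption: SNR (dB) = 10 * log10(mean signal power / noise variance).
    """
    rng = random.Random(seed)
    p, n = len(X), len(X[0])
    signal_power = sum(v * v for row in X for v in row) / (p * n)
    sigma = math.sqrt(signal_power / (10.0 ** (snr_db / 10.0)))
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in X]


def empirical_snr_db(X, Y):
    """Measure the SNR actually realized by a noisy observation Y of X."""
    signal = sum(v * v for row in X for v in row)
    noise = sum((y - x) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))
    return 10.0 * math.log10(signal / noise)


# Toy image: 50 bands, 200 pixels, constant reflectance of 0.5.
X = [[0.5] * 200 for _ in range(50)]
for snr in (20, 30, 40, 50):
    Y = add_gaussian_noise(X, snr, seed=snr)
    print(snr, round(empirical_snr_db(X, Y), 1))
```

Repeating the call with different seeds gives independent noise realizations, in the spirit of the five repetitions over which the results are averaged.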

Fig. 10. Simulated data—abundance maps obtained by applying different unmixing techniques (20 dB).

Fig. 11. Samson data set—results of unmixing in terms of (a) abundance MAE, (b) RE, (c) spectral RMSE, and (d) SAD (in degree) with respect to different
noise level of the observed image (in SNR).

Fig. 12. Samson data set—abundance maps obtained by applying different unmixing techniques (20 dB).

2) Experiments on Samson Data Set: Fig. 11 shows the results of the unmixing experiments applied on the Samson data set, and Fig. 12 shows the estimated abundance maps. It can be observed that FCLSU, UnDIP, and NMF-QMV obtain the best abundance estimation performances [see Fig. 11(a)] and produce similar abundance maps, close to the ground truth (see Fig. 12). However, NMF-QMV is more sensitive to noise. Both UnDIP and NMF-QMV obtain a lower RE


Fig. 13. Jasper Ridge data set—results of unmixing in terms of (a) abundance MAE, (b) RE, (c) spectral RMSE, and (d) SAD (in degree) with respect to
the different noise levels of the observed image (in SNR).

Fig. 14. Jasper Ridge data set—abundance maps obtained by applying different unmixing techniques (20 dB).

and spectral RMSE than FCLSU. The Abundance MAE of uDAS increases with increasing noise power although the RE and spectral RMSE remain low. One can conclude that uDAS performs better as a denoiser than as an unmixer. This is due to the denoising constraint applied on the encoder in the uDAS network. DAEN performs better in terms of abundance estimation than uDAS for low SNR but worse for high SNR. SNSA obtains a moderate abundance estimation and the poorest of all methods for 20 dB, which shows that it is not robust with respect to noise. The abundance estimation performance of collaborative unmixing is poor for all SNRs, which makes it very sensitive to noise (notice the large variance for 20 dB), as can also be observed from the abundance maps in Fig. 12. Fig. 11(d) shows that SiVM and uDAS perform better for the estimation of endmembers than the other methods and show robustness to the noise. In terms of SAD, DAEN, SNSA, and NMF-QMV show sensitivity to the noise power. A very low SAD is obtained by collaborative unmixing for 20 dB, but the abundance MAE and the visual comparison in Fig. 12 reveal a poor abundance estimation. The good performance of collaborative unmixing in terms of SAD can be attributed to the averaging effect of endmember bundles that considerably helps to decrease the SAD.
3) Experiments on Jasper Ridge Data Set: All the unmixing techniques were applied to the Jasper Ridge image. The results are given in Fig. 13, and the abundance maps are shown in Fig. 14. For this data set, FCLSU and UnDIP perform the best in terms of Abundance MAE. FCLSU, however, obtains poor RE and spectral RMSE. DAEN, SNSA, and NMF-QMV obtain lower RE and spectral RMSE but are less performant in terms of abundance estimation. Collaborative unmixing obtains the poorest abundance estimation. SNSA is not robust to the noise, despite very low RE and spectral RMSE. As can be observed from Fig. 14, uDAS mixes the Water and Road classes. Collaborative unmixing can hardly distinguish Soil from Road. The Water and Tree abundance maps are well estimated by all techniques, which can be attributed to their unique endmembers. From Fig. 13(d), one can observe that SiVM outperforms the other techniques with respect to endmember extraction. Both NMF-QMV and collaborative unmixing give poor results. uDAS and SNSA have a similar moderate performance.
4) Experiments on Apex Data Set: To further evaluate the unmixing techniques, they were applied to the Apex data set, for which ground-truth endmembers are available. In this experiment, we did not add artificial noise to the data set.
The results of abundance estimations are given in Table II, and abundances are compared visually in Fig. 15. The lowest overall MAE is obtained by UnDIP, which also obtained the best estimations of the abundances for Road and Shadow. Collaborative unmixing also performs well (0.2% higher error than UnDIP) and obtains the best estimations for Water and Grass. uDAS and FCLSU perform similarly with 0.9 and 0.8% higher error than UnDIP, respectively. NMF-QMV, DAEN,


Fig. 15. Apex data set—abundance maps obtained by applying different unmixing techniques.

and SNSA obtain abundance errors that are considered poor compared to the other competing techniques.
The visual comparison in Fig. 15 confirms the results reported in the table. Although collaborative unmixing provides the lowest MAE for Water, a visual comparison reveals that it is the only technique that considerably mixes Water with Shadow, while, for Grass, it shows the best performance, also visually. UnDIP shows the best performance for Road, while all the other techniques mix the abundances of Road and Roof. UnDIP outperforms the others on Shadow, also visually. The performances on the endmember estimation are compared in Table IV. It can be observed that SiVM outperforms the other techniques in terms of SAD. NMF-QMV gives the highest SAD.
5) Experiments on Washington DC Mall Data Set: The unmixing techniques were applied on the Washington DC Mall data set, and the results are compared in Table III. Collaborative unmixing provides the best MAE. SNSA, UnDIP, and FCLSU perform similarly in terms of MAE and can be considered as the second best results in the table. uDAS provides the worst results on this data set. The visual comparison in Fig. 16 reveals that all the methods fail to adequately estimate the abundances. This is due to the poor endmember estimation or extraction, as can be observed in Table IV.

TABLE II
ABUNDANCE MEAN ABSOLUTE ERROR (IN %) OF THE APEX DATA SET. THE BEST PERFORMANCES ARE SHOWN IN BOLD

TABLE III
ABUNDANCE MEAN ABSOLUTE ERROR (IN %) OF THE WASHINGTON DC DATA SET. BEST PERFORMANCES ARE SHOWN IN BOLD

TABLE IV
SAD OF THE APEX AND WASHINGTON DC MALL DATA SETS. BEST PERFORMANCES ARE SHOWN IN BOLD

D. Discussion
Here, we summarize and discuss the results obtained from the experiments.
1) In all experiments, a very low Abundance MAE, RE, and spectral RMSE was obtained by UnDIP compared to all competing methods. This can be partially attributed to its ability to globally incorporate spatial information, as can visually be observed from, e.g., the abundance maps of the simulated data. The results also clearly indicate that UnDIP is very robust to noise, which is due to the implicit application of a regularizer in the network. The incorporation of a geometrical endmember estimation approach assures that it is entirely devoted to the abundance estimation. Other methods that jointly estimate the endmembers and the abundances obtain low RE values but do not necessarily perform well on the abundance estimation. Since the abundance estimation highly depends on the quality of the endmembers, a poor endmember estimation evidently leads to a poor abundance estimation.


Fig. 16. Washington DC Mall data set—abundance maps obtained by applying different unmixing techniques.

2) FCLSU performs equally well for estimating fractional abundances, but obtains higher RE and spectral RMSE, making it more sensitive to noise compared to UnDIP. We should note that FCLSU is used to generate ground-truth abundances from the noiseless images. Therefore, the Abundance MAE of FCLSU can be considered as the benchmark.
3) uDAS and NMF-QMV obtain moderate results. On the simulated data set, they perform equally well. On the Samson data set, NMF-QMV performs better, while uDAS performs better on Jasper Ridge and Apex. NMF-QMV is more robust to noise and obtains lower spectral RMSE. This can be attributed to the regularization term for which the regularization parameter was optimally selected. uDAS provides low RE and moderate spectral RMSE, which can be attributed to the denoising constraint inside the deep network. Although uDAS is designed to optimize the RE, the experimental results show that this does not guarantee an optimal abundance estimation.
4) SNSA obtains good spectral RMSE but is not as robust as the competing methods for abundance estimation. SNSA is based on stacked encoder–decoders and does not exploit the spatial information. Moreover, the tuning parameter of the minimum volume regularizer in the cost function is fixed and not automatically selected and cannot perform well for all the noise levels.
Overall, DAEN performs moderately. DAEN utilizes a variational auto encoder–decoder to improve the abundance and endmember estimation by employing a regularizer into the loss function. In addition, DAEN exploits stacked encoder–decoders to reduce the sensitivity to the noise, which can be clearly observed in the experimental results.
5) Collaborative unmixing obtains the worst results and is shown to be very sensitive to noise throughout the experiments. This may be attributed to the fact that the endmember bundles are not available a priori but rather are generated from the data.
6) The reported results in terms of SAD reveal the significant role of the estimated/extracted endmembers on the abundance estimation. The results confirm that poor endmember estimation leads to poor abundance estimation. SiVM consistently outperforms the other techniques in all the experiments performed in this article and shows robustness with respect to the noise power. However, for both Apex and Washington DC data sets, none of the methods could estimate/extract the endmembers satisfactorily. This can be attributed to the occurrence of highly mixed pixels and nonlinearities in those data sets.
7) Notice that all reported standard deviations are very small, except in some cases at 20 dB. It seems that all randomness, from different noise realizations, and initializations (all methods except UnDIP use VCA to initialize the endmembers, and the random initialization of the UnDIP network) are well overcome by the applied methods. In particular, almost always, the same


endmembers were extracted, irrespective of the noise level.

E. Sensitivity Analysis to Hyperparameters
In the concept of the DIP, it is important that all the hyperparameters are tuned with respect to the application to obtain a better performance [37]. Here, we evaluate the performance of UnDIP with respect to the hyperparameters of the network. The results for the Jasper Ridge data set (50 dB) are depicted in Fig. 17. Fig. 17(a) shows the performance of UnDIP with respect to the spatial size of the convolutional filter. It can be seen that the size of 3 × 3 is optimal. 5 × 5 filters perform similarly in terms of MAE but at a higher computational cost. Fig. 17(b) plots the MAE values as a function of the number of convolutional filters. As can be seen, the use of 256 filters provides the best result. Fig. 17(c) plots the loss function as a function of the number of iterations for three different learning rates (LRs). It can be seen that a learning rate of LR = 0.001 provides the fastest convergence for the proposed algorithm. Fig. 17(d) compares the performance of UnDIP in terms of MSE for different activation functions. Both leaky ReLU and ReLU outperform the Sigmoid and ELU activation functions.

Fig. 17. Sensitivity of UnDIP to the hyperparameters of the network. Experiments were performed on Jasper data set (50 dB). (a) Filter size. (b) Number of filters. (c) Number of iterations. (d) Activation function.

F. Processing Time
Table V reports the processing times for the different unmixing techniques applied to the Apex and Washington DC Mall data sets. All the algorithms were implemented in MATLAB (2020b), except UnDIP, which was implemented in Python (3.8). The reported processing times were obtained using a computer with an Intel Core i9-10980HK processor (2.4 GHz), 32 GB of memory, a 64-bit Operating System, and an NVIDIA GEFORCE RTX (2080 Super) graphical processing unit. The results are averaged over five experiments. From the table, it can be observed that, partially due to the efficiency of GPU programming, the proposed deep learning method is very competitive to geometric, blind, and sparse unmixing, in terms of computational time.

TABLE V
PROCESSING TIME (IN SECONDS) OF THE UNMIXING TECHNIQUES APPLIED TO THE APEX AND WASHINGTON DC MALL DATA SETS

IV. CONCLUSION
In this article, we proposed a deep prior unmixing technique called UnDIP. UnDIP first extracts the endmembers using a geometrical SiVM technique. Relying on the extracted endmembers, UnDIP estimates the fractional abundances using a deep convolutional network. The network is inspired by the theory behind the DIP that implicitly induces a regularizer on the cost function via the network parameters. Experiments were carried out on a simulated data set and three real data sets. Comparative assessments were performed using sparse, geometrical, deep, and blind unmixing methods. Experimental results confirm that UnDIP outperforms all the other techniques used in the experiments based on quality metrics and visual assessment. In addition, the experiments showed that UnDIP not only performs very well on abundance estimation but also successfully reconstructs the data. Moreover, UnDIP is considerably robust to the noise power and does not rely on any spectral library. The experimental results also showed that UnDIP is computationally very competitive to the conventional methods used in the experiments due to the efficiency of GPU programming.

ACKNOWLEDGMENT
The authors would like to thank Prof. Jun Li and Dr. Yuanchao Su for providing the MATLAB code of SNSA and DAEN.

REFERENCES
[1] J. M. Bioucas-Dias, A. Plaza, G. Camps-Valls, P. Scheunders, N. Nasrabadi, and J. Chanussot, “Hyperspectral remote sensing data analysis and future challenges,” IEEE Geosci. Remote Sens. Mag., vol. 1, no. 2, pp. 6–36, Jun. 2013.
[2] P. Ghamisi et al., “Advances in hyperspectral image and signal processing: A comprehensive overview of the state of the art,” IEEE Geosci. Remote Sens. Mag., vol. 5, no. 4, pp. 37–78, Dec. 2017.
[3] J. M. Bioucas-Dias et al., “Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 2, pp. 354–379, Apr. 2012.
[4] N. Dobigeon, J.-Y. Tourneret, C. Richard, J. C. M. Bermudez, S. McLaughlin, and A. O. Hero, “Nonlinear unmixing of hyperspectral images: Models and algorithms,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 82–94, Jan. 2014.
[5] J. Boardman, F. A. Kruse, and R. Green, “Mapping target signatures via partial unmixing of AVIRIS data: In summaries,” in Proc. JPL Airborne Earth Sci. Workshop, 1995, pp. 23–26.
[6] E. M. Winter, “N-FINDR: An algorithm for fast autonomous spectral end-member determination in hyperspectral data,” in Proc. SPIE, 5th Imag. Spectrometry, vol. 3753, M. R. Descour and S. S. Shen, Eds. Denver, CO, USA: International Society for Optics and Photonics, Jul. 1999, pp. 266–275.


[7] J. Nascimento and J. Bioucas-Dias, “Vertex component analysis: A fast algorithm to extract endmembers spectra from hyperspectral data,” in Pattern Recognition and Image Analysis, F. J. Perales, A. J. C. Campilho, N. P. de la Blanca, and A. Sanfeliu, Eds. Berlin, Germany: Springer, 2003, pp. 626–635.
[8] C.-I. Chang and D. C. Heinz, “Constrained subpixel target detection for remotely sensed imagery,” IEEE Trans. Geosci. Remote Sens., vol. 38, no. 3, pp. 1144–1159, May 2000.
[9] D. C. Heinz and C. I. Chang, “Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 3, pp. 529–545, Mar. 2001.
[10] J. Sigurdsson, M. O. Ulfarsson, and J. R. Sveinsson, “Blind hyperspectral unmixing using total variation and ℓq sparse regularization,” IEEE Trans. Geosci. Remote Sens., vol. 54, no. 11, pp. 6371–6384, Nov. 2016.
[11] N. Dobigeon, S. Moussaoui, M. Coulon, J.-Y. Tourneret, and A. O. Hero, “Joint Bayesian endmember extraction and linear unmixing for hyperspectral imagery,” IEEE Trans. Signal Process., vol. 57, no. 11, pp. 4355–4368, Nov. 2009.
[12] J. Li and J. M. Bioucas-Dias, “Minimum volume simplex analysis: A fast algorithm to unmix hyperspectral data,” in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), vol. 3, Jul. 2008, pp. III-250–III-253.
[13] J. Li, J. M. Bioucas-Dias, and A. Plaza, “Collaborative nonnegative matrix factorization for remotely sensed hyperspectral unmixing,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., Jul. 2012, pp. 3078–3081.
[14] L. Miao and H. Qi, “Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 3, pp. 765–777, Mar. 2007.
[15] L. Zhuang, C. Lin, M. A. T. Figueiredo, and J. M. Bioucas-Dias, “Regularization parameter selection in minimum volume hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 12, pp. 9858–9877, Dec. 2019.
[16] L. C. Parra, C. Spence, P. Sajda, A. Ziehe, and K.-R. Müller, “Unmixing hyperspectral data,” in Proc. Adv. Neural Inf. Process. Syst. Cambridge, MA, USA: MIT Press, 2000, pp. 942–948.
[17] M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” Inverse Problems, vol. 23, no. 3, pp. 947–968, Apr. 2007.
[18] M.-D. Iordache, J. Bioucas-Dias, and A. Plaza, “Sparse unmixing of hyperspectral data,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 6, pp. 2014–2039, Jun. 2011.
[19] J. M. Bioucas-Dias and M. A. T. Figueiredo, “Alternating direction algorithms for constrained sparse regression: Application to hyperspectral unmixing,” in Proc. 2nd Workshop Hyperspectral Image Signal Process., Evol. Remote Sens., 2010, pp. 1–4.
[20] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, “Collaborative sparse regression for hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp. 341–354, Jan. 2014.
[21] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, “Total variation spatial regularization for sparse hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 11, pp. 4484–4502, Nov. 2012.
[22] L. Meier, S. Van De Geer, and P. Bühlmann, “The group lasso for logistic regression,” J. Roy. Stat. Soc. B, Stat. Methodol., vol. 70, no. 1, pp. 53–71, Jan. 2008.
[23] M. Kowalski and B. Torrésani, “Sparsity and persistence: Mixed norms provide simple signal models with dependent coefficients,” Signal, Image Video Process., vol. 3, no. 3, pp. 251–264, Sep. 2009.
[24] L. Drumetz, T. R. Meyer, J. Chanussot, A. L. Bertozzi, and C. Jutten, “Hyperspectral image unmixing with endmember bundles and group sparsity inducing mixed norms,” IEEE Trans. Image Process., vol. 28, no. 7, pp. 3435–3450, Jul. 2019.
[25] B. Rasti et al., “Feature extraction for hyperspectral imagery: The evolution from shallow to deep: Overview and toolbox,” IEEE Geosci. Remote Sens. Mag., vol. 8, no. 4, pp. 60–88, Dec. 2020.
[26] S. Ozkan, B. Kaya, and G. B. Akar, “EndNet: Sparse AutoEncoder network for endmember extraction and hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 1, pp. 482–496, Jan. 2019.
[27] Y. Su, A. Marinoni, J. Li, J. Plaza, and P. Gamba, “Stacked nonnegative sparse autoencoders for robust hyperspectral unmixing,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 9, pp. 1427–1431, Sep. 2018.
[28] Y. Su, J. Li, A. Plaza, A. Marinoni, P. Gamba, and S. Chakravortty, “DAEN: Deep autoencoder networks for hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 7, pp. 4309–4321, Jul. 2019.
[30] Y. Qu and H. Qi, “UDAS: An untied denoising autoencoder with sparsity for spectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 3, pp. 1698–1712, Mar. 2019.
[31] X. Zhang, Y. Sun, J. Zhang, P. Wu, and L. Jiao, “Hyperspectral unmixing via deep convolutional neural networks,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 11, pp. 1755–1759, Nov. 2018.
[32] B. Palsson, J. R. Sveinsson, and M. O. Ulfarsson, “Spectral–spatial hyperspectral unmixing using multitask learning,” IEEE Access, vol. 7, pp. 148861–148872, 2019.
[33] B. Palsson, J. Sigurdsson, J. R. Sveinsson, and M. O. Ulfarsson, “Hyperspectral unmixing using a neural network autoencoder,” IEEE Access, vol. 6, pp. 25646–25656, 2018.
[34] B. Palsson, M. O. Ulfarsson, and J. R. Sveinsson, “Convolutional autoencoder for spectral-spatial hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 59, no. 1, pp. 535–549, Jan. 2020.
[35] F. Khajehrayeni and H. Ghassemian, “Hyperspectral unmixing using deep convolutional autoencoders in a supervised scenario,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13, pp. 567–576, 2020.
[36] V. Lempitsky, A. Vedaldi, and D. Ulyanov, “Deep image prior,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 9446–9454.
[37] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Deep image prior,” Int. J. Comput. Vis., vol. 128, no. 7, pp. 1867–1888, Mar. 2020.
[38] O. Sidorov and J. Y. Hardeberg, “Deep hyperspectral prior: Single-image denoising, inpainting, super-resolution,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. Workshop (ICCVW), Oct. 2019, pp. 3844–3851.
[39] J. M. Bioucas-Dias and J. M. P. Nascimento, “Hyperspectral subspace identification,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 8, pp. 2435–2445, Aug. 2008.
[40] B. Rasti, M. O. Ulfarsson, and J. R. Sveinsson, “Hyperspectral subspace identification using SURE,” IEEE Geosci. Remote Sens. Lett., vol. 12, no. 12, pp. 2481–2485, Dec. 2015.
[41] M. D. Craig, “Minimum-volume transforms for remotely sensed data,” IEEE Trans. Geosci. Remote Sens., vol. 32, no. 3, pp. 542–552, May 1994.
[42] R. Heylen, D. Burazerovic, and P. Scheunders, “Fully constrained least squares spectral unmixing by simplex projection,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 11, pp. 4112–4122, Nov. 2011.
[43] B. Rasti, P. Scheunders, P. Ghamisi, G. Licciardi, and J. Chanussot, “Noise reduction in hyperspectral imagery: Overview and application,” Remote Sens., vol. 10, no. 3, p. 482, Mar. 2018.
[44] B. Rasti, B. Koirala, P. Scheunders, and P. Ghamisi, “How hyperspectral image unmixing and denoising can boost each other,” Remote Sens., vol. 12, no. 11, p. 1728, May 2020.

Behnood Rasti (Senior Member, IEEE) received the B.Sc. and M.Sc. degrees both in electronics-electrical engineering from the Electrical Engineering Department, University of Guilan, Rasht, Iran, in 2006 and 2009, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Iceland, Reykjavik, Iceland, in 2014. In 2015 and 2016, he worked as a Post-Doctoral Researcher and a Seasonal Lecturer with Electrical and Computer Engineering Department, University of Iceland. From 2016 to 2019, he has been a Lecturer with the Center of Engineering Technology and Applied Sciences, Department of Electrical and Computer Engineering, University of Iceland, where he has taught several core courses such as linear systems, control systems, sensors and actuators, data acquisition and processing, circuit theories, electronics, and PLC programming. His research interests include signal and image processing, machine/deep learning, remote sensing image fusion, hyperspectral feature extraction and classification, spectral unmixing, remote sensing image denoising, and restoration.
Dr. Rasti won the prestigious “Alexander von Humboldt Research Fellowship Grant” in 2019 and started his work in 2020 as a Humboldt Research Fellow with Machine Learning Group, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Freiberg, Germany. He was the Valedictorian as an M.Sc. Student in 2009 and he won the Doctoral Grant of The University of Iceland Research Fund and was awarded “The Eimskip University fund,” in 2013.
[29] R. A. Borsoi, T. Imbiriba, and J. C. M. Bermudez, “Deep generative end- He serves as an Associate Editor for the IEEE G EOSCIENCE AND R EMOTE
member modeling: An application to unsupervised spectral unmixing,” S ENSING L ETTERS (GRSL) and Remote Sensing (Multidisciplinary Digital
IEEE Trans. Comput. Imag., vol. 6, pp. 374–384, 2020. Publishing Institute).


Bikram Koirala (Graduate Student Member, IEEE) received the B.S. degree in geomatic engineering from the Purbanchal University, Biratnagar, Nepal, and the M.S. degree in geomatic engineering from the University of Stuttgart, Stuttgart, Germany, in 2011 and 2016, respectively.
In 2017, he joined Vision Lab, Department of Physics, University of Antwerp, Antwerp, Belgium, as a Ph.D. Researcher. His research interest includes machine learning and hyperspectral image processing.

Paul Scheunders (Senior Member, IEEE) received the M.S. and Ph.D. degrees in physics, with work in the field of statistical mechanics, from the University of Antwerp, Antwerp, Belgium, in 1986 and 1990, respectively.
In 1991, he became a Research Associate with the Vision Lab, Department of Physics, University of Antwerp, where he is a Full Professor. His research interest includes remote sensing and hyperspectral image processing. He has authored over 200 papers in international journals and proceedings in the field of image processing, pattern recognition, and remote sensing.
Dr. Scheunders is an Associate Editor of the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING and has served as a program committee member for numerous international conferences.

Pedram Ghamisi (Senior Member, IEEE) received the Ph.D. degree in electrical and computer engineering from the University of Iceland, Reykjavik, Iceland, in 2015.
He works as the Head of the Machine Learning Group at Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Freiberg, Germany, and as the CTO, co-founder of VasoGnosis Inc., Milwaukee, WI, USA, and Visiting Professor at Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria. He is also the Co-Chair of the IEEE Image Analysis and Data Fusion Committee (IEEE IADF). His research interests include interdisciplinary research on machine (deep) learning, image and signal processing, and multisensor data fusion.
Dr. Ghamisi was a recipient of the IEEE Mikio Takagi Prize for winning the Student Paper Competition at IEEE International Geoscience and Remote Sensing Symposium (IGARSS) in 2013, the first prize of the data fusion contest organized by the IEEE IADF in 2017, the Best Reviewer Prize of IEEE GEOSCIENCE AND REMOTE SENSING LETTERS in 2017, and the IEEE Geoscience and Remote Sensing Society 2020 Highest-Impact Paper Award. For detailed info, please see http://pedram-ghamisi.com/.
