2023.TCSVT.See SIFT in a Rain
This work was supported in part by the National Natural Science Foundation of China, in part by the 111 Project, and in part by the High-Performance Computing Platform of Xidian University. (Corresponding author: Wei Wu.) Wei Wu and Hao Chang are with the State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China (e-mail: [email protected]; [email protected]). Zhu Li is with the Department of Computer Science & Electrical Engineering, University of Missouri, Kansas City (e-mail: [email protected]).
I. INTRODUCTION

RAIN, a common and widespread weather phenomenon, occurs in most parts of the world [1]-[3]. Because rain falls from the sky, a rain streak layer is inevitably superimposed on the clean scene when an image is captured outdoors in such weather, producing a rainy image. Owing to their specular highlights, rain streaks introduce complicated pixel intensity changes that mask background information [4]. As a result, the additional gradients introduced by rain streaks largely obstruct the extraction of valuable image features. Fig. 1 compares the Scale-Invariant Feature Transform (SIFT) key points extracted from the backgrounds of a rainy image and its corresponding clean version. From this figure, it can be seen that far fewer useful image features can be collected from the rainy image than from the clean one, leading to serious performance degradation in image matching and other feature-based applications. Therefore, it is critical to remove rain streaks from a single rainy image so as to recover image features.

Two well-known image quality assessment indices, the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM), are usually employed to evaluate a derained image. In essence, these two criteria are based on the human visual system (HVS). Additionally, derained images generally serve as input to follow-up computer vision (CV) tasks, e.g. image matching, target recognition and tracking, image fusion, 3-D
reconstruction, and change detection, and these CV tasks rely mainly on image features. Fig. 2 illustrates two derained images generated by two different image deraining algorithms. From this figure, one can see that although the derained image in Fig. 2 (a) has higher PSNR and SSIM values, far fewer SIFT key points are recovered from it, indicating that the objective and subjective qualities of a derained image may not directly reflect how well its image features are recovered. Hence, to make derained images better suited to subsequent tasks, the number of recovered image features, a task-driven indicator, can be adopted as an important restoration assessment criterion in image deraining.

So far, many methods have been proposed to clear up rain from a single rainy image. Y. Chen et al. [5] described a low-rank appearance model to capture spatio-temporally correlated rain streaks. In [6], a single image deraining framework was designed to get rid of rain streaks by learning the context information of an input. In [7], Y. Luo et al. sparsely modelled a rain layer with a mutually exclusive learning dictionary to differentiate rain streaks from the background. Y. Wang et al. [8] proposed a tensor-based low-rank model that exploits the repetitive pattern of rain streaks to accomplish rain elimination.

Recently, with the brilliant success of deep learning in various language and vision tasks, learning-based algorithms have also been advanced to tackle the problem of image deraining. X. Fu et al. [9] developed an end-to-end deep network architecture focusing on high-frequency details to obtain a derained image. Based on the first dataset containing rain-density labels, H. Zhang et al. [10] proposed a density-aware multi-stream densely connected CNN-based algorithm for joint rain density estimation and deraining. X. Li et al. [11] presented a contextual information-based recurrent neural network (RNN) to remove rain from individual images. A progressive recurrent network (PReNet) [12] was proposed to serve as a suitable baseline for image deraining research. H. Zhang et al. [13] introduced conditional generative adversarial networks (CGANs) to clear up rain streaks. However, all the approaches mentioned above are devoted to improving the objective and subjective qualities of a derained image, and thus neglect how to enhance image feature recovery.

To alleviate this issue, in this study we propose a task-driven image deraining algorithm to improve the image features recovered from derained images. SIFT [14], one of the most well-known image feature description methods, has been widely employed in a great number of image processing applications [15]-[19]. Even in the era of deep learning, SIFT retains a strong and solid signal-domain interpretation. Indeed, in big challenges such as Google landmark recognition, SIFT-based object re-identification still shows its advantage as an excellent handcrafted feature solution compared with learning-based strategies [20]. Moreover, it has been observed that although deep features currently perform best in terms of accuracy, SIFT-like features remain highly competitive in the balance among accuracy, storage, efficiency, and hardware-software flexibility [21]. Therefore, due to the extensive use and strong practicability of SIFT, we focus on how to derain a single rainy image so as to recover as many SIFT key points as possible.

In this paper, we propose a task-driven approach, namely Image Deraining for SIFT Recovery (IDSR). In the proposed algorithm, we first introduce a divide-and-conquer strategy that uses two separate networks, i.e. a difference-of-Gaussian (DoG) pyramid recovery network (DPRNet) and a gradients of Gaussian images recovery network (GGIRNet), each dedicated to one task. This design is inspired by a key idea of SIFT: the DoG pyramid and the gradient information of Gaussian images are exclusively employed for scale-space extrema detection and for key point description, respectively. So the first task is to recover the DoG pyramid, especially for detecting SIFT key points, whereas the second is to recover the gradients of derained Gaussian images, especially for generating the descriptors of those detected key points. Second, in the DPRNet an alternative interest point (ALP) loss is proposed based on the notable ALP detector in [22]. Besides this novel loss, several channel spatial attention residual blocks (CSARBs) are adopted to forge a derained image. Third, in the GGIRNet we put forward a gradient attention module (GAM) to exploit important gradient information. Using this GAM together with a gradient-wise loss, we construct a channel gradient attention residual block (CGARB) to produce another derained image. Finally, from these two derained images produced by the DPRNet and GGIRNet, we calculate the DoG pyramid and the gradients of Gaussian images, respectively, which are further applied to establish the recovered SIFT key points.

Our contributions can be summarized as follows:
• To the best of our knowledge, this is the first study to directly recover SIFT from a single rainy image instead of reconstructing pixels to improve image quality. Different from existing HVS-driven image deraining methods, which aim to improve objective and subjective image quality, our proposed IDSR is a task-driven approach developed to strengthen image feature supply for subsequent feature-based vision applications.
• We propose a divide-and-conquer strategy using two separate networks that concentrate on DoG pyramid recovery and Gaussian image gradient recovery, respectively.
• We propose an ALP loss function based on a scale-space response extrema detection model to accurately locate SIFT key points, especially designed to accomplish the image feature recovery goal.
• We advance a new attention mechanism, dubbed GAM, to generate an attention mask in the gradient domain that adaptively selects important gradient regions.
• Compared with state-of-the-art (SOTA) methods, our proposed IDSR achieves better performance in both the number of recovered SIFT key points and their accuracy.
This paper is an extension of our prior work [4], in which we make the following significant improvements: 1) We propose the GAM, a novel attention mechanism that collects useful information to capture gradient-wise relationships. The GAM is utilized specifically to help the GGIRNet regain derained Gaussian images well. 2) In the current version we also apply a loss function in the gradient domain, instead of the combination of the L1 and SSIM losses used in [4], to establish the gradients of derained Gaussian images more accurately. 3) In [4], five parallel network branches output five corresponding derained Gaussian images, resulting in a large number of parameters. To reduce the number of network parameters, in this work we replace them with a single network that creates a derained image rather than those Gaussian images. 4) Different from our preliminary work in [4], where for simplicity only partial data of the Rain1200 and Rain1400 datasets were randomly selected, we conduct extensive experiments on all the images of each of these two datasets for a more rigorous analysis. 5) Besides the two synthetic datasets, we also select a well-known real-world rainy image dataset, i.e. SPA-Data, to evaluate the performance of our proposed algorithm. Experimental results demonstrate that the proposed IDSR recovers more key points than our conference version.

II. RELATED WORKS

In recent years, there has been an increasing amount of literature on image deraining, which can be broadly divided into two categories, i.e. traditional and deep learning-based approaches.

A. Traditional Image Rain Removal Approaches

Traditional image deraining algorithms generally take advantage of prior knowledge of rain to remove it from a single rainy image. By formulating rain removal as a morphological component analysis based image decomposition problem, Y. Fu et al. [23] proposed a single image rain removal framework using bilateral filters. Since the direct use of learned rain and non-rain dictionaries produced unwanted edge artifacts, C.-H. Son et al. [24] developed an image deraining method that shrinks sparse codes, generating shrinkage maps and correlation matrices to reduce those artifacts. D.-Y. Chen et al. [25] proposed a monochromatic image-based rain elimination framework, which first decomposed a color image into low-frequency and high-frequency parts, and then used a learned dictionary with sparse coding to decompose the high-frequency part into rain and non-rain components. In [26], several common features of rain and snow were outlined, and a combination of rain-and-snow detection and a low-pass filter was used to produce low-frequency and high-frequency information, so that those two degradations could be separated from the input image. However, as rain takes very complicated forms, with various locations, sizes, and orientations, these traditional algorithms achieve rather limited performance.

B. Deep Learning-based Image Deraining Approaches

Over recent years, the continued development of deep learning has brought many new powerful solutions to image deraining. Inspired by deep residual networks, X. Fu et al. [9] proposed an end-to-end deep network that changes the mapping form to reduce the mapping range between input and output, making the learning process easier. Based on the first dataset containing rain-density labels, H. Zhang et al. [10] proposed a density-aware multi-stream densely connected CNN-based algorithm, which efficiently leverages features of different scales for joint rain density estimation and deraining. X. Li et al. [11] presented a recurrent image deraining network based on contextual information for a single rainy image, assigning different alpha values to various rain streak layers by incorporating squeeze-and-excitation blocks. In [12], PReNet was proposed to exploit recursive computation and the dependency of deep features across stages by unfolding a shallow residual network. H. Zhang et al. [13] proposed a CGAN-based framework that adopted a densely connected generator to clear up rain, and a multi-scale discriminator to decide whether the corresponding rain-removed image was real or fake.

In addition, Y. Ye et al. [27] proposed an algorithm that jointly learns rain generation (forward) and rain removal (inverse) in a unified framework, since learning the physical degradation can better approximate real rainfall in an implicit manner. In [28], an adaptive dilated network was proposed to remove rain patterns by constructing an efficient adaptive dilated block, exploiting the importance of different scale features and modelling the interdependence of adaptively dilated features. Y. Yang et al. [29] proposed a progressive residual detail supplement based end-to-end rain removal network to progressively dislodge rain layers. In [30], a residual multi-scale image deraining method was proposed, in which the residual between the reconstructed image and the input rainy image is treated as an attention map, aiding rain pattern recognition and background recovery. S. Zheng et al. [31] proposed a segmentation-aware progressive network via contrastive learning, with three sub-networks for supervised rain removal, unsupervised background segmentation, and perceptual contrast, respectively. L. Cai et al. [32] extracted depth and density information from rainy images, based on which a conditional generative adversarial network is utilized to accomplish rain removal. In [33], an attention-guided rain removal network was constructed to simultaneously learn and model multiple rain streak layers in different phases. X. Cui et al. [34] developed a semi-supervised image deraining network with knowledge distillation (SSID-KD) to better exploit real-world rainy images. C.-Y. Lin et al. [35] proposed a two-stage deep neural network for rain removal, where the rain streak component predicted by the first stage becomes the input to the second stage to further localize possible rain pixels. Y. Wei et al. [36] proposed a generative adversarial network-based rain removal method that combines supervised and unsupervised processes in a unified form. K. Jiang et al. [37] proposed a multi-level memory compensation network for rain removal via a divide-and-conquer strategy.
[Fig. 3. Overall framework of the proposed IDSR. A rainy image x is fed into two parallel branches: the DPRNet, a chain of CSARBs followed by a convolutional layer, which outputs a derained image ỹ_DPRNet and is trained with the ALP loss by comparing its DoG pyramid against that of the ground truth; and the GGIRNet, a chain of CGARBs followed by a convolutional layer, which outputs another derained image ỹ_GGIRNet and is trained with the gradient loss by comparing the gradient information of its Gaussian images against that of the ground truth.]
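To make the data flow in Fig. 3 concrete, the sketch below shows how the two derained outputs could be turned into the SIFT-related quantities named in the figure, using the five Gaussian scales listed in Section IV. The helper functions and the use of OpenCV are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the IDSR inference flow in Fig. 3, assuming y_dprnet and
# y_ggirnet are grayscale arrays already produced by the two trained networks.
import cv2
import numpy as np

SCALES = [1.6000, 2.2627, 3.2000, 4.5255, 6.4000]  # Gaussian scales used in the paper

def gaussian_images(img_gray, scales=SCALES):
    """Blur the image at each pre-set scale to form the Gaussian images."""
    return [cv2.GaussianBlur(img_gray.astype(np.float32), (0, 0), s) for s in scales]

def dog_pyramid(img_gray, scales=SCALES):
    """Difference-of-Gaussian pyramid: adjacent Gaussian images subtracted."""
    g = gaussian_images(img_gray, scales)
    return [g[j + 1] - g[j] for j in range(len(g) - 1)]

def gaussian_gradients(img_gray, scales=SCALES):
    """Horizontal/vertical gradients of each Gaussian image (for descriptors)."""
    grads = []
    for g in gaussian_images(img_gray, scales):
        gx = cv2.Sobel(g, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(g, cv2.CV_32F, 0, 1)
        grads.append((gx, gy))
    return grads

# dog = dog_pyramid(y_dprnet)            -> scale-space extrema detection
# grads = gaussian_gradients(y_ggirnet)  -> gradient histograms for SIFT descriptors
```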
Within the CSARB, the SAM is a region selection mechanism that assigns each spatial region a weight. First, the input feature F^i_SAM of the SAM goes through two identical groups in order, each of which consists of a convolutional layer and a ReLU activation. Second, a convolutional layer is applied to forge a new feature whose size is changed from C×H×W to 1×H×W. Subsequently, a sigmoid activation is taken to construct a spatial attention map, which is then applied via element-wise multiplication, building the SAM output F^o_SAM. These steps can be represented as:

M_S = \sigma( W_5 \mathrm{ReLU}( W_4 \mathrm{ReLU}( W_3 F^{i}_{SAM} + b_3 ) + b_4 ) + b_5 ),    (7)

F^{o}_{SAM} = F^{i}_{SAM} \odot M_S,    (8)

where M_S is the spatial attention map, \sigma(·) is the sigmoid function, \odot denotes element-wise multiplication, and W_q and b_q, q = 3, 4, 5, are the convolution matrices and bias vectors used in the SAM, respectively.
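As a reference point for the GAM that follows, here is a minimal PyTorch sketch of the SAM in Eqs. (7)-(8); the 3×3 kernel sizes are assumptions, since the excerpt does not specify them.

```python
# A minimal sketch of the SAM, Eqs. (7)-(8); kernel sizes are assumed 3x3.
import torch
import torch.nn as nn

class SAM(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),  # W3, b3
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),  # W4, b4
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),         # W5, b5: CxHxW -> 1xHxW
            nn.Sigmoid(),                                 # spatial attention map M_S
        )

    def forward(self, f_in: torch.Tensor) -> torch.Tensor:
        m_s = self.body(f_in)   # Eq. (7)
        return f_in * m_s       # Eq. (8): element-wise reweighting
```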
C. GGIRNet

In comparison with the DPRNet, our GGIRNet employs several consecutive CGARBs instead of CSARBs to capture deep features, as shown in Fig. 3. Fig. 5 provides the diagram of the CGARB, from which one can see that it has a similar structure to the CSARB but a different attention module. Because the CGARB is committed to recovering the gradients of derained Gaussian images, we propose the GAM to replace the SAM specifically for that purpose.

The schematic illustration of our proposed GAM is presented in Fig. 6. First, we apply a Sobel convolution (SC) to calculate the gradient features G_X ∈ R^{C×H×W} and G_Y ∈ R^{C×H×W} of the GAM input F^i_GAM in the x-axis and y-axis directions, respectively. Then, G_X and G_Y are passed through two block groups to generate G'_X and G'_Y, respectively, where each group is composed of a 1×1 convolutional layer that changes the feature size from C×H×W to 1×H×W, a 5×5 convolutional layer, a ReLU, a 5×5 convolutional layer, and a ReLU, in sequence. Next, we concatenate G'_X and G'_Y, followed by a convolutional layer and a sigmoid activation, resulting in a gradient attention map M_G. Finally, we multiply F^i_GAM by M_G element by element to forge the GAM output F^o_GAM. The mechanism of the novel GAM can be expressed as:

G'_X = \mathrm{ReLU}( W_8 \mathrm{ReLU}( W_7 ( W_6 G_X + b_6 ) + b_7 ) + b_8 ),    (9)

G'_Y = \mathrm{ReLU}( W_{11} \mathrm{ReLU}( W_{10} ( W_9 G_Y + b_9 ) + b_{10} ) + b_{11} ),    (10)

M_G = \sigma( W_{12} [ G'_X, G'_Y ] + b_{12} ),    (11)

F^{o}_{GAM} = F^{i}_{GAM} \odot M_G,    (12)

where [·, ·] represents the concatenation operation, and W_q and b_q, q = 6, 7, ..., 12, are the convolution matrices and bias vectors adopted in the GAM, respectively.
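A PyTorch sketch of the GAM in Eqs. (9)-(12) follows. The fixed Sobel convolution, the 1×1 channel-squeezing layer, and the two 5×5 conv+ReLU pairs follow the text; the kernel size of the fusion convolution W_12 and all other hyper-parameters are assumptions.

```python
# A minimal sketch of the GAM, Eqs. (9)-(12); W12's kernel size is assumed 3x3.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAM(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        # Depthwise Sobel kernels for x- and y-direction gradients.
        self.register_buffer("kx", sobel_x.view(1, 1, 3, 3).repeat(channels, 1, 1, 1))
        self.register_buffer("ky", sobel_x.t().view(1, 1, 3, 3).repeat(channels, 1, 1, 1))
        self.channels = channels

        def branch():
            # W6 (1x1, C->1), then W7 and W8 (5x5), each 5x5 followed by ReLU.
            return nn.Sequential(
                nn.Conv2d(channels, 1, 1),
                nn.Conv2d(1, 1, 5, padding=2), nn.ReLU(inplace=True),
                nn.Conv2d(1, 1, 5, padding=2), nn.ReLU(inplace=True),
            )

        self.branch_x = branch()   # Eq. (9)
        self.branch_y = branch()   # Eq. (10)
        self.fuse = nn.Sequential(nn.Conv2d(2, 1, 3, padding=1), nn.Sigmoid())  # Eq. (11)

    def forward(self, f_in: torch.Tensor) -> torch.Tensor:
        gx = F.conv2d(f_in, self.kx, padding=1, groups=self.channels)  # Sobel conv
        gy = F.conv2d(f_in, self.ky, padding=1, groups=self.channels)
        m_g = self.fuse(torch.cat([self.branch_x(gx), self.branch_y(gy)], dim=1))
        return f_in * m_g          # Eq. (12): gradient-guided reweighting
```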
To bring the recovered gradients closer to those of the ground truth (GT), in the GGIRNet we also adopt a loss function in the gradient domain:

L_{grad} = \sum_{j=1}^{5} ( \| \nabla_h y_{g,j} - \nabla_h \tilde{y}^{g,j}_{GGIRNet} \|_1 + \| \nabla_v y_{g,j} - \nabla_v \tilde{y}^{g,j}_{GGIRNet} \|_1 ),    (13)

where y_{g,j} and \tilde{y}^{g,j}_{GGIRNet}, j = 1, 2, ..., 5, are the five Gaussian images of the GT y and of ỹ_GGIRNet, respectively, \nabla_h and \nabla_v are the horizontal and vertical gradient operators, respectively, and \|·\|_1 is the ℓ1 norm.
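Eq. (13) can be sketched as follows, assuming the five Gaussian images are produced by blurring at the scales given in Section IV and that simple forward differences serve as the gradient operators; the paper's exact operators may differ.

```python
# A minimal sketch of the gradient-domain loss in Eq. (13).
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

SCALES = [1.6000, 2.2627, 3.2000, 4.5255, 6.4000]

def gradients(img: torch.Tensor):
    """Forward-difference horizontal/vertical gradients of an NCHW tensor."""
    dh = img[..., :, 1:] - img[..., :, :-1]
    dv = img[..., 1:, :] - img[..., :-1, :]
    return dh, dv

def grad_loss(y_clean: torch.Tensor, y_derained: torch.Tensor) -> torch.Tensor:
    loss = 0.0
    for sigma in SCALES:
        k = int(2 * round(3 * sigma) + 1)   # odd kernel size covering ~3 sigma
        g_clean = gaussian_blur(y_clean, [k, k], [sigma, sigma])
        g_der = gaussian_blur(y_derained, [k, k], [sigma, sigma])
        for d_c, d_d in zip(gradients(g_clean), gradients(g_der)):
            loss = loss + F.l1_loss(d_c, d_d)   # L1 terms of Eq. (13)
    return loss
```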
D. Proposed ALP Loss

The ALP detector [22] is an efficient and powerful key point detector submitted to the 106th MPEG meeting and later adopted as an alternative scale-space extrema detection solution in the MPEG CDVS (Compact Descriptors for Visual Search) standard. It is quite distinct from prior art and has two clear advantages: it is faster than most detectors, and its accuracy is also very good. In this paper, based on this excellent detector, we develop an ALP loss that encourages a derained image to yield as many key points as its corresponding clean image possesses.

In this ALP solution, Laplacian of Gaussian (LoG) filtering is performed to sample the scale space response (SSR) at pre-set scales; LoG can detect local extremum points, and the DoG filter used in the SIFT method is an approximation of the LoG filter. The SSR is modelled as a polynomial function fitted with the following response:

h(m, n, \sigma) \approx \sum_{k=1}^{K} \lambda_k(\sigma) h(m, n, \sigma_k),    (14)

where h(m, n, σ) is the LoG kernel at location (m, n) and scale σ, λ_k(σ) are functions of scale σ, and σ_k (k = 1, 2, 3, 4) are four pre-defined scales.

Since λ_k(σ) is smooth and can be approximated by low-degree polynomials, the following third-degree polynomial of scale σ is used to represent λ_k(σ):

\lambda_k(\sigma) \approx a_k \sigma^3 + b_k \sigma^2 + c_k \sigma + d_k,    (15)

where a_k, b_k, c_k, and d_k are coefficients.

In general, an image I is directly convolved with a LoG kernel to detect scale-invariant features and to search for extrema
across multiple scales as key points, which can be expressed as:

(I * h)[u, v, \sigma] = \sum_{m=-w}^{w} \sum_{n=-w}^{w} I[u-m, v-n] \sum_{k=1}^{K} \lambda_k(\sigma) h(m, n, \sigma_k).    (16)

Substituting (15) into (16) turns the SSR at each pixel (u, v) into a third-degree polynomial of σ with coefficients α_j(u, v), j = 0, 1, 2, 3, whose extrema over scale satisfy

3\alpha_3(u,v)\sigma^2 + 2\alpha_2(u,v)\sigma + \alpha_1(u,v) = 0 \;\Rightarrow\; \sigma(u,v) = \frac{-\alpha_2(u,v) \pm \sqrt{\alpha_2(u,v)^2 - 3\alpha_1(u,v)\alpha_3(u,v)}}{3\alpha_3(u,v)}.    (17)

To make a derained image and its clean version share the same scale-space extrema, their polynomial coefficients should agree:

[ 3\alpha_{y,3}(u,v) - 3\alpha_{\tilde{y}_{DPRNet},3}(u,v) ]\sigma^2 + [ 2\alpha_{y,2}(u,v) - 2\alpha_{\tilde{y}_{DPRNet},2}(u,v) ]\sigma + [ \alpha_{y,1}(u,v) - \alpha_{\tilde{y}_{DPRNet},1}(u,v) ] \to 0,    (18)

where \alpha_{\tilde{y}_{DPRNet},j}(u,v) and \alpha_{y,j}(u,v), j = 1, 2, 3, represent the functions α_j(u, v) computed for ỹ_DPRNet and y, respectively.

To achieve (18), for each α_j(u, v) the difference between these two images should be close to zero:

\alpha_{y,1}(u,v) - \alpha_{\tilde{y}_{DPRNet},1}(u,v) \to 0,
\alpha_{y,2}(u,v) - \alpha_{\tilde{y}_{DPRNet},2}(u,v) \to 0,    (19)
\alpha_{y,3}(u,v) - \alpha_{\tilde{y}_{DPRNet},3}(u,v) \to 0.
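Under the reconstruction above, an ALP-style loss can be sketched as follows: the per-pixel SSR samples at the four pre-set scales are fitted with a cubic in σ, and the coefficient differences of (19) are penalized. Since the excerpt does not show the final loss definition, the ℓ1 aggregation below is an assumption.

```python
# A hedged sketch of an ALP-style loss built from Eqs. (14)-(19).
import torch

def cubic_coeffs(ssr: torch.Tensor, sigmas: torch.Tensor) -> torch.Tensor:
    """ssr: (..., K) LoG responses at K=4 scales; returns (..., 4) coefficients
    [alpha_3, alpha_2, alpha_1, alpha_0] of the per-pixel cubic in sigma."""
    # Vandermonde matrix of the pre-set scales: rows [s^3, s^2, s, 1].
    V = torch.stack([sigmas**3, sigmas**2, sigmas, torch.ones_like(sigmas)], dim=1)
    # Solve V @ alpha = ssr per pixel (an exact fit when K == 4).
    return torch.linalg.solve(V, ssr.unsqueeze(-1)).squeeze(-1)

def alp_loss(ssr_clean: torch.Tensor, ssr_derained: torch.Tensor,
             sigmas: torch.Tensor) -> torch.Tensor:
    a_c = cubic_coeffs(ssr_clean, sigmas)
    a_d = cubic_coeffs(ssr_derained, sigmas)
    # Eq. (19): drive alpha_3, alpha_2, alpha_1 of the two images together
    # (alpha_0 does not affect the extremum location in Eq. (17)).
    return (a_c[..., :3] - a_d[..., :3]).abs().mean()
```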
In Rain1200, each rainy image is synthesized by artificially adding rain streaks to its corresponding clean version. Three kinds of rain streaks, heavy, medium, and light, in a ratio of 1:1:1, are added to 4000 clean images, producing the training set of Rain1200 with 12,000 rainy/clean image pairs. Rain1200 also provides 1200 rainy/clean image pairs for testing. Moreover, Rain1400 contains 14,000 rainy/clean image pairs, in which the rainy images are synthesized from 1000 clean images by adding rain streaks of different scales and orientations. In our experiments, we choose 12,600 image pairs for training and the remaining 1400 image pairs for testing. Furthermore, since rain streaks in synthetic datasets cannot fully represent real rain, we also adopt the real-world dataset SPA-Data [51] for evaluation.
A number of representative deraining algorithms are compared, including DDN (deep detail network) [9], ECNet (rain embedding consistency and layered LSTM) [42], MOSS (memory oriented transfer learning for semi-supervised deraining) [44], MPRNet (multi-stage progressive image restoration) [45], SAPNet (segmentation-aware progressive network) [31], ROMNet (Rain O'er Me) [46], SSDRNet (sequential dual attention network) [35], PReNet (progressive image deraining network) [12], BRN (bilateral recurrent network) [47], NLEDN (non-locally enhanced encoder-decoder network) [48], UMRL (uncertainty guided multi-scale residual learning) [49], MAXIM (multi-axis MLP) [50], and CODE-Net (continuous density-guided network) [41].

In the experiments we first employ the available and runnable code of each SOTA baseline approach to produce a derained image for every tested rainy image. Subsequently, the well-known SIFT method is performed on each derained image to extract its SIFT key points. To generate the derained Gaussian images, the scales of the Gaussian images are set to 1.6000, 2.2627, 3.2000, 4.5255, and 6.4000, respectively.
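To count recovered SIFT key points, the key points of each derained image are matched against those of its corresponding clean version (the protocol is detailed in the next subsection). A minimal OpenCV sketch is given below; Lowe's ratio test with a 0.75 threshold is an assumption, as the excerpt does not state the matching criterion.

```python
# A minimal sketch of counting "recovered" SIFT key points via matching.
import cv2

def recovered_keypoints(derained_bgr, clean_bgr, ratio=0.75) -> int:
    sift = cv2.SIFT_create()
    gray_d = cv2.cvtColor(derained_bgr, cv2.COLOR_BGR2GRAY)
    gray_c = cv2.cvtColor(clean_bgr, cv2.COLOR_BGR2GRAY)
    kp_d, des_d = sift.detectAndCompute(gray_d, None)
    kp_c, des_c = sift.detectAndCompute(gray_c, None)
    if des_d is None or des_c is None:
        return 0
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_d, des_c, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good)
```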
C. Qualitative Results Compared with SOTA Methods

Figs. 7, 8, 9, and 10 show the qualitative results obtained by the evaluated methods on four image pairs chosen from Rain1200. To determine the recovered SIFT key points, we match the key points extracted from every derained image with those of its corresponding clean version; the matched SIFT key points are exactly the recovered ones we are looking for. From Fig. 7, we can observe that some deraining methods, e.g. DDN and SAPNet, not only fail to remove rain streaks well but also regain few key points. Moreover, by comparing Fig. 7 (b) with Fig. 7 (c), it can be seen that PReNet obtains a derained image with more rain streak residue yet recovers more SIFT key points than UMRL, implying that the quality of a derained image does not directly reflect its image feature recovery effect. This is the very reason that we develop a deraining algorithm especially for SIFT key point recovery from a single rainy image. Furthermore, from Figs. 8 (k) and 8 (l), one can find that MAXIM gets rid of rain streaks rather well and outputs some restored key points. However, in comparison with MAXIM, our proposed IDSR achieves a similar subjective rain removal quality while recovering many more key points. Similar results are also obtained from Figs. 9 and 10.

Figs. 11 and 12 give the qualitative results of the evaluated methods on two image pairs chosen from
Rain1400, respectively. From Fig. 11, it can be observed that rain streaks still remain in the derained images produced by both DDN and ROMNet. Comparatively, the latter removes rain from the input rainy image better but produces fewer restored SIFT key points. Moreover, NLEDN not only eliminates more rain-degraded components but also recovers more key points than ROMNet. Furthermore, our proposed IDSR establishes a derained image with very good subjective quality and the most recovered key points among all the evaluated methods. Similar phenomena can also be found in Fig. 12.

The qualitative results on two images selected from SPA-Data are given in Figs. 13 and 14, respectively. From Fig. 13, one can find that some rain stripes are not eliminated and still remain in each of the derained images obtained by PReNet, BRN, SSDRNet, MPRNet, and SAPNet, resulting in insufficient recovered SIFT key points. Moreover, by comparing Fig. 13 (e) with Fig. 13 (g), although CODE-Net removes rain streaks well and generates a derained image of good quality, its recovered key points are fewer than those of our IDSR. Furthermore, it can be seen from Fig. 14 that the derained images of the evaluated methods have similar subjective qualities; however, our proposed IDSR yields the most recovered SIFT points among all the approaches.

The above qualitative results confirm that a derained image may have better quality and yet yield fewer recovered SIFT key points. The key cause is that luminance and structure losses are adopted when learning rain streak components, so as to output a derained image with high objective quality and satisfying subjective quality, respectively. However, to recover more local image features, it is necessary to concentrate on the gradient difference between a derained image and its corresponding clean version, because gradient information plays a very important role in SIFT. Accordingly, our proposed IDSR employs the novel ALP loss, the proposed GAM, and the gradient-wise loss to boost its key point recovery ability.

TABLE I
AVERAGE QUANTITATIVE RESULTS IN TERMS OF THE NUMBERS OF AVERAGE RECOVERED SIFT KEY POINTS AND PSNR/SSIM OF THE EVALUATED SOTA METHODS AND OUR PROPOSED ALGORITHM ON THE RAIN1200 DATASET, WHERE THE BEST, THE SECOND BEST, AND THE THIRD BEST RESULTS ARE MARKED WITH RED, BLUE, AND GREEN, RESPECTIVELY.

Evaluated Methods                    SIFT Points    PSNR/SSIM
DDN [9] (CVPR 2017)                  83.53          23.96/0.740
NLEDN [48] (ACMMM 2018)              199.01         33.32/0.923
PReNet [12] (CVPR 2019)              173.83         29.22/0.854
UMRL [49] (CVPR 2019)                183.49         30.83/0.901
BRN [47] (TIP 2020)                  177.47         29.58/0.854
ROMNet [46] (TIP 2020)               132.03         29.10/0.879
SSDRNet [35] (TIP 2020)              194.73         32.98/0.919
MPRNet [45] (CVPR 2021)              185.28         31.57/0.898
MOSS [44] (CVPR 2021)                151.83         28.75/0.872
ECNet [42] (WACV 2022)               100.40         27.49/0.831
SAPNet [31] (WACVW 2022)             83.88          26.84/0.834
MAXIM [50] (CVPR 2022)               190.31         31.05/0.903
Our preliminary [4] (VCIP 2021)      150.55         30.37/0.901
Proposed                             211.97         33.51/0.925

TABLE II
AVERAGE QUANTITATIVE RESULTS IN TERMS OF THE NUMBERS OF AVERAGE RECOVERED SIFT KEY POINTS AND PSNR/SSIM OF THE EVALUATED SOTA METHODS AND OUR PROPOSED ALGORITHM ON THE RAIN1400 DATASET, WHERE THE BEST, THE SECOND BEST, AND THE THIRD BEST RESULTS ARE MARKED WITH RED, BLUE, AND GREEN, RESPECTIVELY.

Evaluated Methods                    SIFT Points    PSNR/SSIM
DDN [9] (CVPR 2017)                  89.90          25.11/0.794
NLEDN [48] (ACMMM 2018)              195.90         30.82/0.911
PReNet [12] (CVPR 2019)              201.51         31.28/0.924
BRN [47] (TIP 2020)                  207.88         31.43/0.926
ROMNet [46] (TIP 2020)               139.92         29.00/0.888
SSDRNet [35] (TIP 2020)              212.61         31.32/0.921
MOSS [44] (CVPR 2021)                172.79         28.30/0.893
SAPNet [31] (WACVW 2022)             100.02         27.11/0.856
Proposed                             225.31         32.13/0.928

TABLE III
AVERAGE QUANTITATIVE RESULTS IN TERMS OF THE NUMBERS OF AVERAGE RECOVERED SIFT KEY POINTS AND PSNR/SSIM OF THE EVALUATED SOTA METHODS AND OUR PROPOSED ALGORITHM ON SPA-DATA, WHERE THE BEST, THE SECOND BEST, AND THE THIRD BEST RESULTS ARE MARKED WITH RED, BLUE, AND GREEN, RESPECTIVELY.

Evaluated Methods                    SIFT Points    PSNR/SSIM
PReNet [12] (CVPR 2019)              40.56          30.79/0.934
BRN [47] (TIP 2020)                  46.87          31.70/0.939
SSDRNet [35] (TIP 2020)              116.16         36.66/0.963
MPRNet [45] (CVPR 2021)              101.54         33.19/0.945
SAPNet [31] (WACVW 2022)             55.26          32.93/0.943
CODE-Net [41] (TMM 2023)             143.10         39.32/0.979
Proposed                             161.31         44.74/0.989

TABLE IV
AVERAGE QUANTITATIVE RESULTS IN TERMS OF THE NUMBERS OF AVERAGE RECOVERED SIFT KEY POINTS OF NLEDN, SSDRNET, AND OUR PROPOSED ALGORITHM ON THE TWO SYNTHETIC DATASETS, WHERE THE BEST AND THE SECOND BEST RESULTS ARE MARKED WITH RED AND BLUE, RESPECTIVELY.

Evaluated Methods                    Rain1200    Rain1400    Average
NLEDN [48] (ACMMM 2018)              199.01      195.90      197.45
SSDRNet [35] (TIP 2020)              194.73      212.61      203.67
Proposed                             211.97      225.31      218.64

D. Quantitative Results Compared with SOTA Methods

Table I presents the average quantitative evaluation results of the SOTA methods and our proposed IDSR on Rain1200. From the results in this table, we can see that both DDN and SAPNet restore relatively few SIFT key points, only about
83, as some degraded rain components are not removed by these two methods. Moreover, NLEDN, SSDRNet, and MAXIM each recover more than 190 key points on average, meaning that these three SOTA image deraining schemes perform rather well in restoring local image features. Furthermore, our proposed IDSR recovers 211.97 SIFT key points on average, which is not only the largest among all the evaluated methods but also much larger than that of our conference version [4], showing that the improvements over the method in [4] are highly necessary. In addition, compared with MPRNet, MAXIM yields derained images with a lower PSNR but more recovered SIFT key points. From the results in this table, it can be seen that our proposed IDSR achieves the best average PSNR and SSIM values while recovering the most SIFT key points.

The average recovered SIFT key point results of the SOTA methods and our proposed IDSR on Rain1400 are reported in Table II. It should be noted that some evaluated methods have special requirements on the tested image size, e.g. multiples of 8 or 32. The Rain1400 dataset does not satisfy this size demand, so we only use the methods without such an image size limitation to compute recovery results, as shown in Table II. Despite the second best result in Table I, NLEDN obtains a good SIFT feature recovery effect on Rain1200 but merely takes fifth place on Rain1400. Moreover, although SSDRNet restores fewer average key points on Rain1200 than NLEDN, it ranks second on Rain1400. Furthermore, the number of average recovered key points on Rain1400 via our proposed IDSR stands at 225.31, the best recovery performance among all the evaluated approaches. Additionally, Table II also gives the average PSNR and SSIM results of these algorithms on Rain1400. One can see that although BRN attains higher average PSNR and SSIM values than SSDRNet, it recovers fewer key points. Therefore, it is necessary to design a task-driven image deraining method from the new perspective of image feature recovery. Compared with SOTA methods, our proposed IDSR not only recovers more key points but also generates derained images of better quality.

To further validate the effectiveness of the proposed algorithm, we conducted experiments on the real-world dataset SPA-Data and compared our IDSR with SOTA algorithms. The average recovered SIFT key points and average quality results are shown in Table III. From this table, it can be seen that the proposed IDSR attains both the best average PSNR and the best average SSIM among all the methods. Moreover, the average key point numbers of CODE-Net and SSDRNet rank second and third, respectively. Compared with these two methods, our proposed IDSR restores more average key points, with gaps of 161.31 − 143.10 = 18.21 and 161.31 − 116.16 = 45.15, respectively.

Finally, we compare our proposed IDSR with the second best methods on the two synthetic datasets, i.e. NLEDN and SSDRNet, and show the related results in Table IV. From the results in this table, one can find that our proposed IDSR recovers the most SIFT key points on average over these two datasets, and the gap between IDSR and NLEDN reaches 218.64 − 197.45 = 21.19.

TABLE V
AVERAGE QUANTITATIVE RESULTS IN TERMS OF MSE OF THE ABLATION STUDY FOR THE PROPOSED DIVIDE-AND-CONQUER STRATEGY ON THE RAIN1200 DATASET, WHERE THE BEST IS MARKED WITH RED.

Derained Gaussian images    One-task    Proposed
Gau1.6000                   9.2174      8.7313
Gau2.2627                   7.2645      6.8562
Gau3.2000                   5.5713      5.2370
Gau4.5255                   4.1544      3.8822
Gau6.4000                   3.0912      2.8581

TABLE VI
AVERAGE QUANTITATIVE RESULTS IN TERMS OF MSE OF THE ABLATION STUDY FOR THE PROPOSED ALP LOSS ON THE RAIN1400 DATASET, WHERE THE BEST IS MARKED WITH RED.

Derained DoG images         L2 loss     Proposed
DoG1.6000,2.2627            2.7088      2.5240
DoG2.2627,3.2000            2.3377      2.1601
DoG3.2000,4.5255            1.9810      1.8157
DoG4.5255,6.4000            1.7485      1.5955

TABLE VII
AVERAGE QUANTITATIVE RESULTS IN TERMS OF MSE OF THE ABLATION STUDY FOR THE PROPOSED GAM ON THE RAIN1400 DATASET, WHERE THE BEST IS MARKED WITH RED.

Derained Gaussian images    w/o GAM     Proposed
Gau1.6000                   8.4561      8.3768
Gau2.2627                   6.6430      6.5325
Gau3.2000                   5.1625      4.9898
Gau4.5255                   3.9333      3.7591
Gau6.4000                   3.0928      2.8713

E. Ablation Studies

In this section, we conduct the following ablation studies to demonstrate the effectiveness of each of our proposed components.

1) Ablation Study for the Proposed Divide-and-conquer Strategy: We verify the validity of our proposed divide-and-conquer strategy by removing the GGIRNet from the overall framework in Fig. 3. As the contrast, we employ a one-task learning network, training only the DPRNet to create ỹ_DPRNet. With the Gaussian images of the corresponding GT as the anchor, we compute the mean square errors (MSEs) of the five derained Gaussian images obtained by the one-task network and by our proposed IDSR, respectively, as sketched below.
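A minimal sketch of this per-scale MSE computation follows, assuming the Gaussian images are produced by blurring at the five pre-set scales; the blur implementation is an assumption, not the authors' released code.

```python
# One MSE per scale, matching rows Gau1.6000 ... Gau6.4000 in Tables V and VII.
import cv2
import numpy as np

SCALES = [1.6000, 2.2627, 3.2000, 4.5255, 6.4000]

def per_scale_mse(derained_gray: np.ndarray, gt_gray: np.ndarray) -> list:
    mses = []
    for s in SCALES:
        g_d = cv2.GaussianBlur(derained_gray.astype(np.float32), (0, 0), s)
        g_t = cv2.GaussianBlur(gt_gray.astype(np.float32), (0, 0), s)
        mses.append(float(np.mean((g_d - g_t) ** 2)))
    return mses
```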
The average quantitative results of this ablation study conducted on Rain1200 are shown in Table V. From this table, we can see that each derained Gaussian image produced by our proposed IDSR has a lower MSE than that produced by the one-task method. In particular, for the image with a scale of 1.6, the difference between the two schemes reaches its maximum. These results show that our proposed IDSR generates more accurate derained Gaussian images than the one-task scheme, indicating that it is necessary to split the SIFT recovery problem into two sub-problems and then employ the divide-and-conquer strategy to solve them separately.

2) Ablation Study for the Proposed ALP Loss: To verify the superiority of our proposed ALP loss, we conduct an ablation study on this novel loss with two different settings. First, the widely used L2 loss is adopted to train the DPRNet instead of the ALP loss. Second, the proposed ALP loss is employed in the DPRNet as described in Section III. With the DoG pyramid of the corresponding GT as the anchor, we calculate the MSEs of the four derained DoG images obtained by the two methods, respectively.

Table VI presents the average quantitative results of this ablation study conducted on Rain1400. From this table, it can be seen that for each derained DoG image the proposed ALP loss attains a smaller MSE value than the L2 loss, and thus considerably improves the accuracy of SIFT key point locations.

3) Ablation Study for the Proposed GAM: To demonstrate the performance of our proposed GAM, we conduct a corresponding ablation study for two configurations of this module. In the first, the proposed GAM is not adopted in the CGARB at all (denoted by w/o (without) GAM); in this case, after the CAM, the generated feature is directly added to the input feature to forge the output feature. In the second, the novel GAM is fully used in the CGARB, as illustrated in Fig. 5.

Table VII reports the average MSEs of the five derained Gaussian images for this ablation study conducted on Rain1400. From the table, it can be observed that the proposed algorithm produces all five images with smaller MSE values than the method without the GAM. These results indicate that more precise derained Gaussian images are established by using the GAM, implying that the proposed GAM benefits gradient extraction.

V. CONCLUSION

Different from existing HVS-driven image deraining approaches, in this paper we proposed an image deraining algorithm for SIFT key point recovery, dubbed IDSR. The proposed IDSR is a task-driven approach designed to bolster image feature supply for follow-up feature-based applications. Considering the essence of SIFT, we divide the recovery issue into two sub-problems, i.e. one being how to generate the DoG pyramid of a derained image, and the other being how to construct the gradients of derained Gaussian images. Consequently, we propose a divide-and-conquer strategy using two separate deep learning networks, the DPRNet and the GGIRNet, to solve these two sub-problems, respectively. Moreover, based on the notable ALP, an efficient and powerful key point detector adopted in the MPEG CDVS standard, an ALP loss is advanced in the DPRNet for accurate SIFT extrema detection. In the ALP detector, the SSR at each pixel is modelled as a polynomial of scale, and key points are detected by finding the extreme points of all the polynomials. Minimizing the ALP loss is thus designed to keep the polynomials of an output derained image as consistent as possible with those of its corresponding clean version. Furthermore, for the precise description of the detected scale-space extrema, in the GGIRNet we put forward a new attention mechanism, i.e. the GAM, which collects useful information to capture gradient-wise relationships. Finally, with the two different derained images generated by the DPRNet and the GGIRNet, we compute the DoG pyramid and the gradient information of the Gaussian images, respectively, which are further applied to produce the restored key points. Experimental results on both quantitative and qualitative tests demonstrate that, compared with SOTA methods, our proposed scheme recovers more SIFT key points.

Local image features are conventionally divided into hand-crafted and data-driven ones according to the process used for extracting them. SIFT is one of the most famous hand-crafted feature description methods, and it has a strong and solid signal-domain interpretation. In this study, our proposed IDSR was especially developed to restore SIFT key points from a rainy image. However, data-driven deep features have demonstrated their superiority in terms of accuracy, and the emergence of deep learning has provided a new way, with great potential, to extract such features. Therefore, further research should be undertaken to explore how to design an image deraining algorithm that recovers deep features as well as possible.

REFERENCES

[1] Q. Wu, L. Wang, S. Huang, K. N. Ngan, H. Li, F. Meng, and L. Xu, "Subjective and objective de-raining quality assessment towards authentic rain image," IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 11, pp. 3883-3897, Nov. 2020.
[2] K. Jiang, Z. Wang, P. Yi, C. Chen, Z. Han, T. Lu, B. Huang, and J. Jiang, "Decomposition makes better rain removal: An improved attention-guided deraining network," IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 10, pp. 3981-3995, Oct. 2021.
[3] L. Zhu, Z. Deng, X. Hu, H. Xie, X. Xu, J. Qin, and P.-A. Heng, "Learning gated non-local residual for single-image rain streak removal," IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 6, pp. 2147-2159, Jun. 2021.
[4] P. Wang, W. Wu, Z. Li, and Y. Liu, "See SIFT in a rain: Divide-and-conquer SIFT key point recovery from a single rainy image," in Proc. IEEE Vis. Commun. Image Process., Munich, Germany, Dec. 2021, pp. 1-5.
[5] Y.-L. Chen and C.-T. Hsu, "A generalized low-rank appearance model for spatio-temporally correlated rain streaks," in Proc. IEEE Int. Conf. Comput. Vis., Sydney, NSW, Australia, Dec. 2013, pp. 1968-1975.
[6] T.-X. Jiang, T.-Z. Huang, X.-L. Zhao, L.-J. Deng, and Y. Wang, "FastDeRain: A novel video rain streak removal method using directional gradient priors," IEEE Trans. Image Process., vol. 28, no. 4, pp. 2089-2102, Apr. 2019.
[7] Y. Luo, Y. Xu, and H. Ji, "Removing rain from a single image via discriminative sparse coding," in Proc. IEEE Int. Conf. Comput. Vis., Santiago, Chile, Dec. 2015, pp. 3397-3405.
[8] Y. Wang and T.-Z. Huang, "A tensor-based low-rank model for single-image rain streaks removal," IEEE Access, vol. 7, pp. 83437-83448, Jun. 2019.
[9] X. Fu, J. Huang, D. Zeng, Y. Huang, X. Ding, and J. Paisley, "Removing rain from single images via a deep detail network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, USA, Jul. 2017, pp. 3855-3863.
[10] H. Zhang and V. M. Patel, "Density-aware single image de-raining using a multi-stream dense network," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018, pp. 695-704.
[11] X. Li, J. Wu, Z. Lin, H. Liu, and H. Zha, "Recurrent squeeze-and-excitation context aggregation net for single image deraining," in Proc. Eur. Conf. Comput. Vis., Munich, Germany, Sep. 2018, pp. 262-277.
[12] D. Ren, W. Zuo, Q. Hu, P. Zhu, and D. Meng, "Progressive image deraining networks: A better and simpler baseline," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Long Beach, CA, USA, Jun. 2019, pp. 3937-3946.
[13] H. Zhang, V. Sindagi, and V. M. Patel, "Image de-raining using a conditional generative adversarial network," IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 11, pp. 3943-3956, Nov. 2020.
[14] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91-110, 2004.
[15] M. Ghahremani, Y. Liu, and B. Tiddeman, "FFD: Fast feature detector," IEEE Trans. Image Process., vol. 30, pp. 1153-1168, Dec. 2020.
[16] W. Zhou, L. Zhang, S. Gao, and X. Lou, "Gradient-based feature extraction from raw Bayer pattern images," IEEE Trans. Image Process., vol. 30, pp. 5122-5137, May 2021.
[17] Y. Yao, Y. Zhang, Y. Wan, X. Liu, X. Yan, and J. Li, "Multi-modal remote sensing image matching considering co-occurrence filter," IEEE Trans. Image Process., vol. 31, pp. 2584-2597, Mar. 2022.
[18] T. H. Phuoc and N. Guan, "A novel key-point detector based on sparse coding," IEEE Trans. Image Process., vol. 29, pp. 747-756, Aug. 2019.
[19] A. Mustafa, H. Kim, and A. Hilton, "MSFD: Multi-scale segmentation-based feature detection for wide-baseline scene reconstruction," IEEE Trans. Image Process., vol. 28, no. 3, pp. 1118-1132, Mar. 2019.
[20] J. Ma, X. Jiang, A. Fan, et al., "Image matching from handcrafted to deep features: A survey," Int. J. Comput. Vis., vol. 129, pp. 23-79, 2020.
[21] F. Bellavia and C. Colombo, "Is there anything new to say about SIFT matching?" Int. J. Comput. Vis., vol. 128, no. 7, pp. 1847-1866, 2020.
[22] G. Francini, M. Balestri, and S. Lepsoy, "CDVS: Telecom Italia's response to CE1 – interest point detection," ISO/IEC JTC1/SC29/WG11, M31369, Geneva, Switzerland, Oct. 2013.
[23] Y. Fu, L. Kang, C. Lin, and C. Hsu, "Single-frame-based rain removal via image decomposition," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Prague, Czech Republic, May 2011, pp. 1453-1456.
[24] C.-H. Son and X.-P. Zhang, "Rain removal via shrinkage of sparse codes and learned rain dictionary," in Proc. IEEE Int. Conf. Multimedia Expo Workshops, Seattle, WA, USA, Jul. 2016, pp. 1-6.
[25] D.-Y. Chen, C.-C. Chen, and L.-W. Kang, "Visual depth guided color image rain streaks removal using sparse coding," IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 8, pp. 1430-1455, Aug. 2014.
[26] Y. Wang, S. Liu, C. Chen, and B. Zeng, "A hierarchical approach for rain or snow removing in a single color image," IEEE Trans. Image Process., vol. 26, no. 8, pp. 3936-3950, Aug. 2017.
[27] Y. Ye, Y. Chang, H. Zhou, and L. Yan, "Closing the loop: Joint rain generation and removal via disentangled image translation," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Nashville, TN, USA, Jun. 2021, pp. 2053-2062.
[28] Y. Yang and H. Lu, "A fast and efficient network for single image deraining," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Toronto, ON, Canada, Jun. 2021, pp. 2030-2034.
[29] Y. Yang, J. Guan, S. Huang, W. Wan, Y. Xu, and J. Liu, "End-to-end rain removal network based on progressive residual detail supplement," IEEE Trans. Multimedia, vol. 24, pp. 1622-1636, 2022.
[30] Y. Zheng, X. Yu, M. Liu, and S. Zhang, "Single-image deraining via recurrent residual multiscale networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 3, pp. 1310-1323, Mar. 2022.
[31] S. Zheng, C. Lu, Y. Wu, and G. Gupta, "SAPNet: Segmentation-aware progressive network for perceptual contrastive deraining," in Proc. IEEE/CVF Winter Conf. Appl. Comput. Vis. Workshops, Waikoloa, HI, USA, Jan. 2022, pp. 52-62.
[32] L. Cai, Y. Fu, T. Zhu, Y. Xiang, Y. Zhang, and H. Zeng, "Joint depth and density guided single image de-raining," IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 7, pp. 4108-4121, Jul. 2022.
[33] K. Jiang, Z. Wang, P. Yi, C. Chen, Y. Yang, X. Tian, and J. Jiang, "Attention-guided deraining network via stage-wise learning," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, May 2020, pp. 2618-2622.
[34] X. Cui, C. Wang, D. Ren, Y. Chen, and P. Zhu, "Semi-supervised image deraining using knowledge distillation," IEEE Trans. Circuits Syst. Video Technol., early access, Jul. 2022.
[35] C.-Y. Lin, Z. Tao, A.-S. Xu, L.-W. Kang, and F. Akhyar, "Sequential dual attention network for rain streak removal in a single image," IEEE Trans. Image Process., vol. 29, pp. 9250-9265, Sep. 2020.
[36] Y. Wei, Z. Zhang, Y. Wang, H. Zhang, M. Zhao, M. Xu, and M. Wang, "Semi-DerainGAN: A new semi-supervised single image deraining," in Proc. IEEE Int. Conf. Multimedia Expo, Shenzhen, China, Jul. 2021, pp. 1-6.
[37] K. Jiang, Z. Wang, P. Yi, C. Chen, X. Wang, J. Jiang, and Z. Xiong, "Multi-level memory compensation network for rain removal via divide-and-conquer strategy," IEEE J. Sel. Topics Signal Process., vol. 15, no. 2, pp. 216-228, Feb. 2021.
[38] Z. Su, Y. Zhang, J. Shi, and X.-P. Zhang, "Recurrent network knowledge distillation for image rain removal," IEEE Trans. Cogn. Dev. Syst., early access, 2021, doi: 10.1109/TCDS.2021.3131045.
[39] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee, "Enhanced deep residual networks for single image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, Honolulu, HI, USA, Jul. 2017, pp. 136-144.
[40] X. Lin, L. Ma, B. Sheng, Z.-J. Wang, and W. Chen, "Utilizing two-phase processing with FBLS for single image deraining," IEEE Trans. Multimedia, vol. 23, pp. 664-676, Apr. 2021.
[41] L. Yu, B. Wang, J. He, G.-S. Xia, and W. Yang, "Single image deraining with continuous rain density estimation," IEEE Trans. Multimedia, vol. 25, pp. 443-456, 2023.
[42] Y. Li, Y. Monno, and M. Okutomi, "Single image deraining network with rain embedding consistency and layered LSTM," in Proc. IEEE/CVF Winter Conf. Appl. Comput. Vis., Waikoloa, HI, USA, Jan. 2022, pp. 3957-3966.
[43] H. Wang, Q. Xie, Q. Zhao, and D. Meng, "A model-driven deep neural network for single image rain removal," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2020, pp. 3100-3109.
[44] H. Huang, A. Yu, and R. He, "Memory oriented transfer learning for semi-supervised image deraining," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Nashville, TN, USA, Jun. 2021, pp. 7728-7737.
[45] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, M.-H. Yang, and L. Shao, "Multi-stage progressive image restoration," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Nashville, TN, USA, Jun. 2021, pp. 14816-14826.
[46] H. Lin, Y. Li, X. Fu, X. Ding, Y. Huang, and J. Paisley, "Rain o'er me: Synthesizing real rain to derain with data distillation," IEEE Trans. Image Process., vol. 29, pp. 7668-7680, Jul. 2020.
[47] D. Ren, W. Shang, P. Zhu, Q. Hu, D. Meng, and W. Zuo, "Single image deraining using bilateral recurrent network," IEEE Trans. Image Process., vol. 29, pp. 6852-6863, May 2020.
[48] G. Li, X. He, W. Zhang, et al., "Non-locally enhanced encoder-decoder network for single image de-raining," in Proc. ACM Multimedia Conf., Oct. 2018, pp. 1056-1064.
[49] R. Yasarla and V. M. Patel, "Uncertainty guided multi-scale residual learning using a cycle spinning CNN for single image de-raining," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Long Beach, CA, USA, Jun. 2019, pp. 8397-8406.
[50] Z. Tu, H. Talebi, H. Zhang, F. Yang, P. Milanfar, A. Bovik, and Y. Li, "MAXIM: Multi-axis MLP for image processing," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., New Orleans, LA, USA, Jun. 2022.
[51] T. Wang, X. Yang, K. Xu, S. Chen, Q. Zhang, and R. W. H. Lau, "Spatial attentive single-image deraining with a high quality real rain dataset," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2019, pp. 12262-12271.