A Hierarchical Approach For Rain or Snow Removing in A Single Color Image
Abstract—In this paper, we propose an efficient algorithm to remove rain or snow from a single color image. Our algorithm takes advantage of two popular techniques employed in image processing, namely, image decomposition and dictionary learning. At first, a combination of rain/snow detection and a guided filter is used to decompose the input image into a complementary pair: (1) the low-frequency part that is almost completely free of rain or snow and (2) the high-frequency part that contains not only the rain/snow component but also some or even many details of the image. Then, we focus on the extraction of image details from the high-frequency part. To this end, we design a 3-layer hierarchical scheme. In the first layer, an over-complete dictionary is trained and three classifications are carried out to classify the high-frequency part into rain/snow and non-rain/snow components, in which some common characteristics of rain/snow have been utilized. In the second layer, another combination of rain/snow detection and guided filtering is performed on the rain/snow component obtained in the first layer. In the third layer, the sensitivity of variance across color channels (SVCC) is computed to enhance the visual quality of the rain/snow-removed image. The effectiveness of our algorithm is verified through both subjective (the visual quality) and objective (through rendering rain/snow on some ground-truth images) approaches, which shows a superiority over several state-of-the-art works.

Index Terms—Rain and snow removal, image decomposition, dictionary learning, guided filtering, sparse representation.

I. INTRODUCTION

based on some features in the image or video. As compared to the de-haze problem, where some excellent solutions have been achieved (e.g., [2]), removing rain or snow is much more challenging.

Though belonging to the dynamic weather category, rain and snow still have some differences when appearing in an image or video. First, rain is semi-transparent. Because of this, the objects behind it will not be occluded completely, but some blurring may appear. Second, pixels with different intensities will be affected by rain differently. When a pixel's original intensity is relatively low, rain will increase its intensity; when a high-intensity pixel is affected by rain, its intensity will become lower. That is to say, rain-affected pixels tend to have a similar intensity because the reflection by rain dominates in this scenario. On the other hand, snow is opaque and can largely occlude the object behind it. In addition, snow has a bright, white color, and its reflection is strong. Consequently, snow often possesses high intensity values in an image and is hardly affected by the background.

Rain/snow removal from a video or a single image has been an active research topic over the past decade. Today, it continues to draw attention in outdoor vision systems (e.g., surveillance) where the ultimate goal is to produce a clear and clean image or video. Here, the most critical task is to separate rain/snow components from the other part. To this end, a low-
• A combination of rain/snow detection and a guided filter is used to decompose the input image, in which the guided filter serves as the low-pass filter, while the corresponding high-frequency part is made complementary to the low-frequency part.

• A 3-layer hierarchy for extracting image details from the high-frequency part has been designed. Specifically, the first layer performs three classifications based on a trained (over-complete) dictionary, the second layer applies another combination of rain/snow detection and a guided filter, and the third layer utilizes the SVCC to enhance the visual quality of the rain/snow-removed image.

The rest of our paper is organized as follows. Section II presents some related works. In particular, several very recent works will be discussed on their pros and cons, which therefore motivates us to develop our new hierarchical approach. Section III discusses some common characteristics of rain and snow, based on which the SVCC and PDIP will be defined. The details of our proposed rain/snow removal algorithm are presented in Sections IV and V. In Section VI, we show some experimental results and present comparisons between our algorithm and several state-of-the-art works. Finally, Section VII summarizes our work and concludes the paper.

II. RELATED WORKS

The earliest work on dynamic weather such as rain and snow dates back to the study of their statistical characteristics in the atmospheric science in 1948 [3]. Then, Nayar et al. studied the visual manifestations of different weather conditions, including rain and snow [4]. A pioneering work on rain removal was proposed by Garg and Nayar in 2004 [1], in which they built a dynamic model for rain and a physics-based motion blur model. Based on these models, they detected and removed rain in videos. In a follow-up work [5], Garg and Nayar proposed a new rain streak appearance model, based on a raindrop oscillation model from the atmospheric science, to build a database of rain appearance, and through this database they tried to render rain in an image.

In the meantime, Zhang et al. studied the temporal and chromatic properties of rain in a video [6], and showed some interesting observations, e.g., a pixel at a fixed location is unlikely to be covered by rain throughout the entire video, and the changes of red (R), green (G), and blue (B) values of a rain-affected pixel are approximately the same. Based on these two properties, they detected and removed rain in a video obtained by a stationary camera.

Later on, Garg and Nayar designed an algorithm to render rain in images [7] by utilizing the raindrop's size and velocity as well as the camera's parameters. In [8] and [9], Barnum et al. detected and removed rain and snow in videos by creating a global effect model of rain and snow in the frequency domain, which is obtained by combining their streak model and statistical characteristics. In 2008, Brewer and Liu analyzed the rain shape property and how to find the direction of rain streaks in videos [10]. In 2009, Roser and Geiger proposed to use a photometric raindrop model to detect raindrops in videos to improve image registration [11]. In 2011, Bossu et al. proposed to use the histogram of orientation of streaks (HOS) to detect rain and snow in image sequences [12].

All algorithms mentioned above deal with the rain and snow removal problem in videos. Nevertheless, rain and snow removal from a single image seems more useful in practice, but it is also more challenging. To the best of our knowledge, Halimeh and Roser, for the first time, detected raindrops on the windshield in a single image, modeled the geometric shape of raindrops, and utilized the photometric property to construct a relationship between the raindrop and the environment [13]. Later on, some learning-based image decomposition methods were proposed to remove rain in a single image [14]–[18]. In the meantime, a guided filter based method was used by Xu et al. to remove rain or snow from a single image by designing a rain/snow-free guidance image [19]. Kim et al. detected rain streaks with a rain shape model and removed rain in a single image by nonlocal means filtering [20]. In [21], because rain streaks always show a similar pattern, Chen et al. proposed a low-rank appearance model to remove rain streaks in images/videos. More recently, Ding et al. removed rain and snow in a single image by an L0 smoothing filter [22], Luo et al. separated an image into the rain layer and de-rained layer by discriminative sparse codes that are based on a non-linear generative model of the rain image [23], and Li et al. made use of some patch-based priors for both the background layer and the rain layer for the rain removal task [24].

In general, the recent methods for rain/snow removal from a single image can be classified into three categories. The first category is simply filtering-based, where a nonlocal mean filter or guided filter is often used [19], [20], [22]. Because only a simple filter is used, the implementation is very fast. However, it can hardly produce a satisfactory performance consistently: either the output image is left with some rain streaks (snowflakes) or quite a few image details are lost so that the output image becomes blurred. The second category builds models for rain streaks or snowflakes [21], [23], [24]. These models can discriminate rain streaks or snowflakes from the background. However, it often happens that some details of the image will be mistreated as rain streaks or snowflakes. The third category, which seems more reasonable, is to form a 2-step processing [14], [15], [18]. Specifically, a well-designed filtering is first used to decompose a rain/snow image into the low-frequency part and high-frequency part. While the low-frequency part can be made free of rain or snow as much as possible, the model-based processing can be applied on the high-frequency part to further extract the image details to be added back into the low-frequency part.

We follow this 2-step approach in our work. As compared to the existing 2-step methods [14], [15], [18], the novelty of our proposed approach is two-fold. In the first step, instead of simply applying a low-pass filtering, we combine a rain/snow detection together with a guided filter. By doing this, we can achieve a much improved balance between removing rain/snow components and preserving image details: the resulting low-frequency part becomes almost completely free of rain or snow and, at the same time, contains the image details to a reasonable extent. Unavoidably, some details of the image that have similar characteristics to rain streaks or snowflakes will still be left over into the corresponding high-frequency part. In the second step, our design of a new
TABLE I: NOTATION USED IN THIS PAPER.
III. COMMON CHARACTERISTICS OF RAIN AND SNOW

To facilitate an easier reading, we list in Table I all important mathematical symbols that are employed in this paper. In our work, we define "rain or snow" in an image as the dynamic components and other non-rain/snow contents as the non-dynamic components. As mentioned in Section I, rain and snow have many differences in shape, size, and intensity, and they also influence images differently. In this section, however, we try to find some common characteristics of these dynamic components.

First of all, because of strong reflections by rain/snow, high intensity values tend to result at pixels that are affected by rain/snow. Therefore, the values of rain/snow pixels in an image are usually larger than those of their neighboring non-rain/snow pixels.

Secondly, edge jumps usually exist in natural images between rain streaks or snowflakes and their horizontal neighbors. Therefore, an image patch that includes rain/snow will usually produce larger average absolute horizontal gradients. Fig. 1 shows a rain image and a snow image, respectively, where the above two characteristics can be observed clearly.

Thirdly, let us decompose a rain or snow image into the low-frequency and high-frequency parts and use {R_L(i, j), G_L(i, j), B_L(i, j)} and {R_H(i, j), G_H(i, j), B_H(i, j)} to denote the three color values of a pixel I(i, j) in these two parts. Fig. 2 shows the decomposed results for the images presented in Fig. 1, where the detailed decomposition will be described in the next section. It can be seen that rain/snow pixels in the high-frequency part are gray or shallow white. Moreover, the three color channels of a rain/snow pixel in the high-frequency part tend to have very close values.

A. Sensitivity of variance of color channels (SVCC)

Based on the third characteristic of rain/snow described above, the variance of the color vector corresponding to a rain/snow pixel in the high-frequency part tends to be very small, while the variance of the color vector corresponding to a non-rain/snow pixel is usually big. This implies that the variance of a pixel's color channels can be used to discriminate the rain/snow part from the non-rain/snow part. Here, we define the sensitivity of variance of color channels (SVCC) to capture this difference between the dynamic components and the other contents of an image.

For a pixel at location (i, j) in a given image I, the color vector is formed as:

I(i, j) = [R(i, j), G(i, j), B(i, j)]^T.   (1)

We first calculate the mean vector A(i, j) of an image patch centered at (i, j) to form matrix A,

A(i, j) = (1/|W|) Σ_{(m,n) ∈ W(i,j)} I(m, n)   (2)

where W is a window centered at (i, j) and |W| stands for the window size. In principle, the window size should be larger than the width of a rain streak or snowflake to ensure that all rain/snow can be detected. In most images tested in our work, we found that the width of the majority of rain/snow is about 3 to 5 pixels. As a result, we fix the window size to be 7 × 7. In order to remove singular values, a median filtering whose size is also 7 × 7 is applied on matrix A to obtain Ã. Then, for each element in Ã, we calculate the variance across its three color channels

Ṽ(i, j) = var(Ã(i, j)).   (3)
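To make this construction concrete, the following Python sketch computes the SVCC map of Eqs. (1)-(3) together with the power-law normalization that is introduced in Eq. (4) below; the function name, the use of SciPy's uniform and median filters, and the assumption of a float RGB image in [0, 1] are our own choices rather than details given by the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter

def svcc_map(img, win=7, gamma=1.1):
    """Sensitivity of variance of color channels (SVCC) for an RGB image.
    img: float array of shape (H, W, 3) with values in [0, 1]."""
    # Eq. (2): per-channel mean over a win x win window centered at each pixel.
    A = np.stack([uniform_filter(img[..., c], size=win) for c in range(3)], axis=-1)
    # 7 x 7 median filtering of A to suppress singular values.
    A_med = np.stack([median_filter(A[..., c], size=win) for c in range(3)], axis=-1)
    # Eq. (3): variance across the three color channels at each pixel.
    V = A_med.var(axis=-1)
    # Eq. (4): normalize by the maximum variance and apply the power gamma.
    return (V / (V.max() + 1e-12)) ** gamma
```

Rain/snow pixels, whose channel values are nearly equal in the high-frequency part, end up with values close to zero in such a map, which matches the behaviour visible in Fig. 4.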
Fig. 4. (a) The SVCC map of a rain image. (b) The SVCC map of a snow image. In order to have a clear visualization, we have employed pseudo colors to display the 2-D SVCC map, where the energy bar is placed on the right end.

Fig. 5. (a) A snow image in which snowflakes have different shapes but many snowflakes have a consistent falling direction. (b) The same rain image as Fig. 1(a), where the PDIP is highlighted visually for one image patch.
The variance matrix Ṽ is finally assigned as the SVCC map for I, which is then normalized into the range [0, 1] with each element being calculated by

V(i, j) = (Ṽ(i, j) / Ṽ_max)^γ   (4)

where Ṽ_max stands for the maximum color channel variance and γ is a power function parameter to expand or compress the contrast of the SVCC map. Fig. 4 shows the SVCC maps visually for the rain and snow images of Fig. 1, where γ = 1.1. It can be observed that rain or snow areas possess low values (deep blue stands for low values according to the energy bar) in the SVCC map, while the non-rain/snow objects whose color variances are relatively large lead to high values (red areas and bright blue areas) in the SVCC map.

How to make use of the SVCC map for our task of rain/snow removal will be described in Section V, together with some discussions on how to choose γ.

B. Principal direction of an image patch (PDIP)

Referring to Fig. 1(a), rain streaks often have consistent falling directions. Therefore, the histogram of oriented gradients (HOG) proposed by Dalal et al. [25] can be used to separate rain streaks from the image [15], [18].

However, snowflakes in an image do not always have consistent falling directions. Snowflakes with a high falling speed may follow nearly consistent falling directions, as in Fig. 5(a); but point-like snowflakes are often perceived when snow is falling down slowly, such as the example shown in Fig. 1(b). Obviously, HOG will fail when encountering point-like snow.

If an image patch contains rain streaks or snowflakes with a consistent falling direction, its HOG often forms an impulse at the angle corresponding to the rain or snow direction. By the K-means method, we can then classify rain or snow from an image. Therefore, we register the angle corresponding to the HOG bin that has the maximum value as the principal direction of an image patch (PDIP) to identify rain/snow in our work. One example is shown in Fig. 5(b).
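A minimal sketch of this computation for one patch is given below: it builds a gradient-orientation histogram (an HOG-style descriptor) and returns the angle of its dominant bin. The gradient operator, the number of orientation bins, and the magnitude weighting are our own choices and are not specified by the paper.

```python
import numpy as np

def pdip(patch, n_bins=36):
    """Principal direction of an image patch (PDIP): the angle of the
    dominant bin of a gradient-orientation histogram.
    patch: 2-D float array (a grayscale patch)."""
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)                               # gradient magnitude
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)  # orientation in [0, 180)
    hist, edges = np.histogram(ang, bins=n_bins, range=(0.0, 180.0), weights=mag)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])               # bin center, in degrees
```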
IV. OUR PROPOSED ALGORITHM

The pipeline of our proposed rain/snow removal is shown in Fig. 6. Specifically, our algorithm consists of two steps. In the first step, the input image is decomposed into the low-frequency part I_L and the high-frequency part I_H. Note that I_L

Fig. 6. The simplified pipeline of our algorithm - the details of each step will be shown later.

Fig. 8. (a) Detection result of rain. (b) Detection result of snow.
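As a rough, hedged sketch of how such a decomposition can be realized, the code below applies a guided filter (the edge-preserving low-pass filter cited as [26] later in the paper) to each color channel and takes the complement as the high-frequency part. The plain luminance guide and the radius/epsilon values are placeholders; in the paper, the guidance is built with the help of the rain/snow detection illustrated in Fig. 8.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=15, eps=1e-3):
    """Single-channel guided filter used here as the low-pass filter."""
    mean = lambda x: uniform_filter(x, size=2 * radius + 1)
    m_I, m_p = mean(guide), mean(src)
    var_I = mean(guide * guide) - m_I * m_I
    cov_Ip = mean(guide * src) - m_I * m_p
    a = cov_Ip / (var_I + eps)
    b = m_p - a * m_I
    return mean(a) * guide + mean(b)

def decompose(img):
    """Split a rain/snow image (float RGB in [0, 1]) into I_L and I_H = img - I_L."""
    guide = img.mean(axis=-1)   # simple luminance guide (a placeholder)
    I_L = np.stack([guided_filter(guide, img[..., c]) for c in range(3)], axis=-1)
    return I_L, img - I_L
```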
Fig. 9. The detailed flow chart of the second step in our algorithm.

Fig. 10. The trained dictionaries for the high-frequency part of (a) the rain image and (b) the snow image shown earlier in Fig. 1.
Fig. 15. (a) Non-rain component I_H^{ND3} extracted by SVCC. (b) Non-snow component I_H^{ND3} extracted by SVCC.

where the least angle regression (LARS) [34] is utilized to solve it.

In our work, we trained a dictionary with 1024 atoms (i.e., n = 1024). To this end, the high-frequency part I_H is divided into 16 × 16 × 3 cubes to form the training samples, and each cube is arranged into a column vector whose size is 768 × 1 (i.e., m = 16 × 16 × 3 = 768). We generate an initial dictionary D_0 by selecting 1024 training samples randomly and choose the parameter λ = 0.15. After obtaining the dictionary, we reshape the atoms back into 16 × 16 × 3 cubes. Fig. 10(a) and (b) show these atoms for the rain and snow images of Fig. 1.
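A compact sketch of this training stage is given below, with scikit-learn's MiniBatchDictionaryLearning (an implementation of online dictionary learning with a LARS-based sparse coder) used as a stand-in for the learner employed in the paper; the number of sampled cubes and the helper name are our own assumptions.

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

def train_dictionary(I_H, n_atoms=1024, patch=16, lam=0.15, n_samples=20000, seed=0):
    """Learn an over-complete dictionary from 16 x 16 x 3 cubes of the
    high-frequency part I_H (float array of shape (H, W, 3))."""
    cubes = extract_patches_2d(I_H, (patch, patch), max_patches=n_samples, random_state=seed)
    X = cubes.reshape(len(cubes), -1)      # each sample becomes a 768-dimensional vector
    learner = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=lam,
                                          fit_algorithm='lars', random_state=seed)
    learner.fit(X)
    # Reshape the atoms back into 16 x 16 x 3 cubes, as displayed in Fig. 10.
    return learner.components_.reshape(n_atoms, patch, patch, 3)
```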
B. Layer-1 Extraction

From the dictionary obtained above, dynamic components and non-dynamic components can be separated by the dictionary atoms. Namely, some dictionary atoms stand for dynamic components and others for non-dynamic components. To this end, three classifications of the dictionary atoms are implemented, as shown in Fig. 11. In the end, the dynamic component I_H^D and the non-dynamic component I_H^{ND1} are obtained by a sparse reconstruction.

First classification. According to the third characteristic discussed in Section III, dictionary atoms standing for dynamic components will have a smaller sum of pixel color channel variances. Thus, we calculate the sum of pixel color channel variances (S_k) of each dictionary atom D_k (k = 1, 2, ..., 1024) as:

S_k = Σ_{(i,j) ∈ D_k} var(p(i, j))   (12)

where p(i, j) is the color vector of a pixel in D_k. According to these results, the first classification is performed as follows: we choose a threshold T_1 to identify dynamic components from the other part of the image. If S_k < T_1, D_k stands for dynamic components. Once classified, a sparse coding is applied to obtain the coefficients x_k of each dictionary atom D_k, k = 1, 2, ..., 1024.
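In code, this first test on the trained atoms can be sketched as follows (the atom layout matches the 16 × 16 × 3 cubes above; the helper name is ours):

```python
import numpy as np

def first_classification(atoms, T1):
    """Eq. (12): an atom is marked dynamic (rain/snow) when the sum of its
    per-pixel color-channel variances S_k falls below the threshold T1.
    atoms: array of shape (n_atoms, 16, 16, 3)."""
    S = atoms.var(axis=-1).sum(axis=(1, 2))   # S_k for every atom
    return S < T1                             # boolean mask of dynamic atoms
```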
Then, all the decomposed components y_k can be obtained as follows:

y_k = D_k x_k,  k = 1, 2, ..., 1024.   (13)

At last, we add up all components corresponding to the dynamic dictionary atoms to obtain the dynamic part I_{C1}^D of the first classification. Meanwhile, the non-dynamic part I_{C1}^{ND} can also be obtained.

Second classification. After the first classification, we have extracted non-dynamic components that usually contain color values quite different from those of dynamic components. However, non-dynamic components whose colors are similar to dynamic components would still remain in I_{C1}^D. To solve this problem, we propose to perform the second classification using the dictionary atoms that correspond to I_{C1}^D only. Specifically, we calculate the average of the absolute horizontal gradients of the pixels in these dictionary atoms, in which we only select the pixels with non-zero gradient values. The atoms with a large mean of absolute horizontal gradients represent dynamic atoms. Here, we need a threshold T_2, and the classification is performed in the same way as the first one. After the second classification, the dynamic component is denoted as I_{C2}^D and the non-dynamic component as I_{C2}^{ND}, respectively.
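A hedged sketch of this second test is shown below; the exact gradient operator is not spelled out in the text, so a simple horizontal difference is used here.

```python
import numpy as np

def second_classification(atoms, T2):
    """Mean of absolute horizontal gradients over non-zero-gradient pixels;
    atoms with a large mean are treated as dynamic (threshold T2).
    atoms: array of shape (n_atoms, 16, 16, 3)."""
    grad = np.abs(np.diff(atoms, axis=2))   # horizontal differences within each atom
    means = np.array([g[g > 0].mean() if np.any(g > 0) else 0.0 for g in grad])
    return means > T2                       # boolean mask of dynamic atoms
```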
Third classification. For the non-dynamic details in I_{C2}^D, we further propose to find the PDIPs of the dictionary atoms corresponding to I_{C2}^D. Here, we treat a dictionary atom as a patch. The majority of the texture components in I_{C2}^D are dynamic. Hence, a large number of the dictionary atoms corresponding to I_{C2}^D are dynamic atoms, while only a small part of the atoms corresponds to non-dynamic components whose textures are very different from the dynamic weather. Since the PDIPs of dynamic atoms are nearly consistent and have a small variance, we can calculate the variance of the PDIPs:

Σ = (1/n_1) Σ_{j=1}^{n_1} (PDIP_j − (1/n_1) Σ_{i=1}^{n_1} PDIP_i)^2   (14)

where n_1 is the number of atoms corresponding to I_{C2}^D and D_j is the j-th dictionary atom corresponding to I_{C2}^D. If the following condition

Σ_j = (PDIP_j − (1/n_1) Σ_{i=1}^{n_1} PDIP_i)^2 > Σ + ρ   (15)

holds, atom D_j is classified as non-dynamic; otherwise it is viewed as dynamic. Here, ρ is a control parameter used to get accurate results. After the third classification and a reconstruction which is similar to the first two reconstruction processes, we obtain the dynamic component I_{C3}^D and the non-dynamic component I_{C3}^{ND}.
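Given the PDIP of every remaining atom (obtained, e.g., with an orientation-histogram computation like the sketch in Section III), the test of Eqs. (14)-(15) can be written as:

```python
import numpy as np

def third_classification(pdips, rho):
    """Eqs. (14)-(15): an atom whose squared PDIP deviation exceeds the mean
    squared deviation by more than rho is classified as non-dynamic.
    pdips: 1-D array of PDIP angles, one per atom."""
    dev2 = (pdips - pdips.mean()) ** 2   # squared deviation of each atom
    sigma = dev2.mean()                  # Eq. (14)
    return dev2 > sigma + rho            # True -> non-dynamic atom
```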
Eventually, after the three classifications, we obtain the non-dynamic component I_H^{ND1} and the dynamic component I_H^D as follows:

I_H^{ND1} = I_{C1}^{ND} + I_{C2}^{ND} + I_{C3}^{ND},   I_H^D = I_{C3}^D.   (16)

Some non-dynamic-weather results of the three classifications are shown in Fig. 12 and Fig. 13 for rain and snow, respectively, and the dynamic part I_H^D is shown in Fig. 14(a) and (c) for rain and snow, respectively.

C. Layer-2 Extraction

A minority of non-dynamic details still exist in I_H^D. In order to get more image details, we detect the dynamic components in I_H^D again by the method described in Section III (i.e., a combination of rain/snow detection and a guided filter) and employ the newly calculated location map M_H to fill the holes. Then, by applying the guided filter, we get the non-dynamic part I_H^{ND2} (Layer-2) as

I_H^{ND2} = F_g{F_m{I_H^D ◦ M_H}}   (17)

where F_m and F_g have the same meanings as described in Section III. The results are shown in Fig. 14(b) and (d) for rain and snow, respectively.

D. Layer-3 Extraction

After I_H^{ND1} and I_H^{ND2} are obtained, the rain-removed or snow-removed results are still a little blurred. In order to further enhance the visual quality, we calculate the SVCC map V_H of the high-frequency part I_H and then use it to filter the high-frequency part to get some non-dynamic components I_H^{ND3} (Layer-3) as follows:

I_H^{ND3} = I_H ◦ V_H.   (18)

The result is shown in Fig. 15(a) and (b) for rain and snow, respectively.

The principle for choosing γ in the computation of the SVCC map is as follows. If rain/snow is bright (i.e., with high intensity values), we should choose γ > 1 to minimize the rain/snow trace in the final result. On the other hand, for rain/snow with a very low intensity, γ could be smaller than 1. In this case, a little rain/snow trace remains, which nevertheless is hard to recognize visually because the rain/snow has a very low intensity. A γ that is either too big or too small would defeat this purpose. In our experiments, we choose γ = 1.1 after trials on different values to fit the majority of test images. However, we would like to point out that, for a specific image, this value can be fine-tuned to obtain a better result.

At last, we sum up all the non-dynamic components to obtain the rain/snow-removed image. The result is shown in Fig. 16(d) for rain and Fig. 17(d) for snow, respectively.
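Putting the pieces together, the Layer-3 filtering of Eq. (18) and the final summation can be sketched as follows; the array names are ours, and the three non-dynamic components are assumed to be available from the previous layers.

```python
import numpy as np

def compose_output(I_L, I_H, nd1, nd2, svcc):
    """Final rain/snow-removed image: the low-frequency part plus the three
    non-dynamic components, with Layer-3 given by Eq. (18).
    I_L, I_H, nd1, nd2: float arrays of shape (H, W, 3); svcc: (H, W) map."""
    nd3 = I_H * svcc[..., None]   # Eq. (18): element-wise weighting by the SVCC map
    return I_L + nd1 + nd2 + nd3
```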
E. Individual Contributions

In order to show the individual contribution of each of the three layers described above, we present the resulting images in Fig. 16 for rain and Fig. 17 for snow, respectively. The results show that the Layer-1 extraction provides a more significant contribution than Layer-2. This is because only very few non-rain/snow details still exist in I_H^D. In the meantime, however, the Layer-3 extraction seems to play a very positive role. It can be seen from Fig. 16(d) and 17(d) that the contrast and color textures have been improved a lot by using the SVCC map. Notice that, for some specific images that have high-intensity rain/snow, SVCC will leave a little
rain/snow trace in the final results. This problem can be partially solved by fine-tuning the parameter γ.

To show their individual contributions quantitatively, we have synthesized a rain image and a snow image, i.e., a ground-truth image is known and rain or snow is rendered on the ground truth. We list in Table II the contribution of every layer by computing the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). It can be observed from this table that the above-described results (i.e., the individual contributions from the three layers) are verified quantitatively. Finally, Fig. 18 and 19 show the visual results.

VI. EXPERIMENTAL RESULTS

In this section, we demonstrate the rain/snow-removing effectiveness of our proposed algorithm by comparisons with several state-of-the-art works. In our experiments, the three parameters T_1, T_2, and ρ used in the classification of dictionary atoms are chosen to be {0.1, 0.02, 1.5} for rain and {0.12, 0.03, 2} for snow, respectively. We would like to point out that, for a specific rain/snow image, these parameters could be fine-tuned to achieve a better performance. Figs. 20 and 21 show, respectively, some rain-removed results and snow-removed results by different algorithms. In order to assess these results fairly, we first present the subjective evaluations in the following.

A. User Study

To conduct a visual (subjective) evaluation of the performances of different methods, we invited 20 viewers (12 males and 8 females) to evaluate the visual quality of different methods in terms of the following three aspects: (1) less rain/snow residual, (2) the maintenance of the image details, and (3) overall perception.

In the evaluation, 10 groups of rain-removed results are selected and every group involves the results by Ding et al.,
TABLE II
THE CONTRIBUTION OF EACH LAYER TO PSNR AND SSIM (TOP: PSNR, BOTTOM: SSIM) OF SYNTHESIZED RAIN AND SNOW IMAGES AGAINST THE GROUND TRUTHS.

              | I_L   | I_L + I_H^{ND1} | I_L + I_H^{ND1} + I_H^{ND2} | I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}
a rain image  | 29.92 | 31.63           | 31.98                       | 32.55
              | 0.601 | 0.843           | 0.852                       | 0.893
a snow image  | 31.98 | 33.21           | 33.63                       | 35.26
              | 0.672 | 0.821           | 0.830                       | 0.860

TABLE III
USER STUDY RESULT. THE NUMBERS ARE THE PERCENTAGES OF VOTES OBTAINED BY EACH METHOD.

           |                   rain                    |              snow
Method     | [22]    | [23]   | [18]   | [24]   | Ours   | [22]    | [19]  | [18]  | Ours
Percentage | 12.40%  | 3.78%  | 9.75%  | 8.57%  | 65.50% | 12.50%  | 0%    | 0%    | 87.50%

TABLE IV
RAIN IMAGE PERFORMANCES (TOP: PSNR, BOTTOM: SSIM) OF DIFFERENT METHODS (ROWS) ON 11 SYNTHESIZED RAIN IMAGES (COLUMNS) AGAINST GROUND TRUTHS.

      | Image 1 | Image 2 | Image 3 | Image 4 | Image 5 | Image 6 | Image 7 | Image 8 | Image 9 | Image 10 | Image 11
[22]  | 38.93   | 37.16   | 35.93   | 40.29   | 32.78   | 33.50   | 34.59   | 35.22   | 34.00   | 35.89    | 34.23
      | 0.948   | 0.861   | 0.835   | 0.796   | 0.811   | 0.875   | 0.808   | 0.862   | 0.857   | 0.787    | 0.895
[23]  | 33.06   | 33.73   | 29.45   | 35.95   | 29.45   | 30.43   | 31.63   | 32.99   | 27.52   | 32.10    | 31.33
      | 0.899   | 0.808   | 0.801   | 0.784   | 0.790   | 0.829   | 0.804   | 0.833   | 0.814   | 0.707    | 0.877
[18]  | 34.90   | 34.95   | 32.55   | 38.58   | 31.84   | 32.11   | 34.59   | 34.15   | 34.39   | 35.44    | 35.13
      | 0.873   | 0.774   | 0.824   | 0.775   | 0.802   | 0.804   | 0.854   | 0.784   | 0.703   | 0.755    | 0.825
[24]  | 36.84   | 32.27   | 33.34   | 31.13   | 30.39   | 29.52   | 30.31   | 32.33   | 32.36   | 31.21    | 29.90
      | 0.945   | 0.691   | 0.748   | 0.754   | 0.682   | 0.669   | 0.686   | 0.785   | 0.747   | 0.706    | 0.597
Ours  | 42.78   | 38.38   | 36.03   | 40.31   | 32.94   | 34.42   | 34.91   | 35.53   | 34.80   | 35.93    | 38.63
      | 0.950   | 0.897   | 0.882   | 0.846   | 0.854   | 0.883   | 0.846   | 0.866   | 0.869   | 0.811    | 0.928

TABLE V
SNOW IMAGE PERFORMANCES (TOP: PSNR, BOTTOM: SSIM) OF DIFFERENT METHODS (ROWS) ON 11 SYNTHESIZED SNOW IMAGES (COLUMNS) AGAINST GROUND TRUTHS.

      | Image 1 | Image 2 | Image 3 | Image 4 | Image 5 | Image 6 | Image 7 | Image 8 | Image 9 | Image 10 | Image 11
[22]  | 34.11   | 30.09   | 31.98   | 33.60   | 34.25   | 34.73   | 33.51   | 33.42   | 32.65   | 33.60    | 35.23
      | 0.794   | 0.809   | 0.809   | 0.808   | 0.739   | 0.820   | 0.820   | 0.736   | 0.712   | 0.756    | 0.830
[19]  | 30.12   | 32.07   | 30.66   | 31.68   | 34.05   | 32.79   | 33.08   | 33.22   | 32.21   | 34.44    | 34.57
      | 0.629   | 0.599   | 0.588   | 0.798   | 0.676   | 0.623   | 0.572   | 0.593   | 0.634   | 0.730    | 0.613
[18]  | 30.14   | 33.01   | 31.36   | 32.56   | 30.98   | 33.70   | 34.38   | 33.06   | 33.81   | 34.48    | 33.55
      | 0.692   | 0.779   | 0.652   | 0.826   | 0.631   | 0.725   | 0.744   | 0.611   | 0.693   | 0.686    | 0.640
Ours  | 35.41   | 35.96   | 32.41   | 38.91   | 36.43   | 35.26   | 38.52   | 38.75   | 36.70   | 38.11    | 38.53
      | 0.804   | 0.861   | 0.813   | 0.885   | 0.886   | 0.860   | 0.851   | 0.819   | 0.833   | 0.865    | 0.862
Luo et al., Chen et al., Li et al. and our method. Another 10 groups of snow-removed results are selected and every group involves the results by Ding et al., Xu et al., Chen et al. and our method. To ensure fairness, the results in each group are arranged randomly. For each group, the viewers are asked to select only the one result which they like most.

The evaluation result is shown in Table III. It is clear that our rain/snow removal results are favored by a huge majority of viewers (65.50% for rain and 87.50% for snow).

are shown in Fig. 22 and 23. In Fig. 23, the first two rows are streak-like snow and the last two rows are point-like snow. Here, PSNR and SSIM are adopted as the quantitative metrics to assess the performance of different methods. In Tables IV and V, we list the PSNR and SSIM values of the 11 rain images and 11 snow images, respectively. According to these results, it can be observed that our method outperforms the selected state-of-the-art works for removing rain or snow. In particular, our method produces much better results for snow images.
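For reference, the two metrics can be computed with scikit-image as sketched below (assuming 8-bit RGB arrays; the channel_axis argument follows recent scikit-image versions):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(ground_truth, restored):
    """PSNR and SSIM of a rain/snow-removed image against its ground truth."""
    psnr = peak_signal_noise_ratio(ground_truth, restored, data_range=255)
    ssim = structural_similarity(ground_truth, restored, channel_axis=-1, data_range=255)
    return psnr, ssim
```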
method has produced very good results for light rain images, such as the first, second and third images. However, when the intensity of rain pixels is large, i.e., for relatively heavy rain (such as the fifth one), or when the edges of rain streaks are blurry (e.g., the sixth one), this method fails. The third column in Fig. 20 shows the results by Luo et al. [23]. This method can produce acceptable rain-removal results for some images, like the third one. However, this work cannot remove rain for relatively heavy rain images (the fifth and sixth ones). In general, it is more suitable for light and slender rain, as shown in their paper. The results by Ding et al. [22] are displayed in the second column of Fig. 20. This method produces satisfactory results for the fifth and sixth images. However, it only removes the rain streaks and cannot revise the shadows produced by the rain streaks (the second one). The results of our proposed algorithm are shown in the last column. By comparison, it can be observed that our method is suitable for all rain images tested in our experiments and produces a highly competitive performance. For some relatively heavy rain images, a hazy effect will appear in the rain-removed results. We can further implement the de-haze algorithm [24] to solve this problem to a certain extent.

Fig. 21 shows some experimental results of the state-of-the-art snow removal methods. First, the images in the fourth column show the snow-removed results obtained by the rain removal method developed in [18]. The results reveal that a good rain-removal method may not be suitable for snow removal. The results by [19] are shown in the third column. It is found that this method causes a lot of blur.

The second column shows the results by [22]. For some snow images, such as the third image in Fig. 21, this method can produce good snow-removal results and keep an acceptable image quality. However, when snow is heavy and has a larger size (e.g., the second image), it cannot recognize the snow. Another defect of this work is that it will mistreat small image details as snow, as in the fourth image. The last column presents our results, showing a much better snow-removal performance. It can be seen that we have successfully removed the majority of the snow in the images and maintained most image details at the same time, leading to a better visual quality in the snow-removed images.

We analyze the performance of each method as follows. The work in [18] only uses a low-pass filter to separate the rain/snow image into the low-frequency and high-frequency parts. When the intensity of rain/snow is large, it is difficult to obtain a rain/snow-free low-frequency part. Hence, this work
cannot remove rain/snow with high intensity. Besides, this work uses HOG as the descriptor to identify rain streaks. When the edges of a rain streak are blurry, the performance of HOG will drop. Finally, the HOG descriptor cannot identify snow because snow usually does not possess the shape of a rain streak. Therefore, this work is not applicable to snow removal.

For the work [23], when rain streaks are wide or a little heavy, the discriminative power of the proposed non-linear generative model will decrease. Hence, it is only suitable for handling light rain streaks. The background prior and rain prior used in the work [24] cannot separate rain from small image details. Hence, when encountering a rain image with small image details, this work will lose many image details. By utilizing the property of the guided filter with L0 gradient minimization, the work [22] preserves the edges whose corresponding locations in the guidance image have large gradient magnitudes, and smooths other edges. Hence, this work can only remove rain/snow with low intensity, but rain/snow with high intensity cannot be removed. Besides, some image details that have a similar shape and intensity to low-intensity rain/snow will also be removed. Finally, the work by Xu et al. designs a rain/snow-free guidance image, cooperating with the guided filter [26], to remove rain/snow from images. Even though the guided filter is a good edge-preserving low-pass filter, it is inevitable that the processed image gets blurred.

D. Complexity Analysis

We implement our algorithm using MATLAB on an Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.5 GHz (2 processors) with 64 GB RAM. We test the run time on a 256 × 256 image. The total time consumed by our method is 82.60 seconds, where the detection takes 5.71 seconds and the SVCC takes 1.85 seconds. The majority of the time is spent in the dictionary learning part, which takes 60.82 seconds. The classifications and sparse reconstruction spend 13.81 seconds. The remaining run
time is consumed by the intermediate steps of our algorithm. The times consumed by the works of Chen et al. [18], Luo et al. [23], Ding et al. [22], Xu et al. [19], and Li et al. [24] are 95.17, 68.66, 1.18, 0.27 and 1260.40 seconds, respectively.

The run time changes with the size of the image. Therefore, we analyze the complexity of the major steps as follows. Suppose that the size of a given rain/snow image is M × N and l is the number of windows we use to identify rain/snow pixels in the given rain/snow image (which is 5 in this paper). The computational complexity of the rain/snow detection is O(M × N × l). For the online dictionary learning, the number of training samples is K, the size of every training sample is L × 1, the number of dictionary atoms is Q, the size of every dictionary atom is L × 1 (L < Q ≪ K), S is the target sparsity, and T is the iteration number. Then, the complexities of the online dictionary learning and the sparse reconstruction are O(T × K × (N × S^2 + 2 × L × Q)) and O(Q), respectively.

We use the same dictionary learning method as the work by Chen et al. [18]. Therefore, the dictionary learning and sparse reconstruction have equal computational complexity. On the other hand, we implement 3 classifications. Every classification has its own feature descriptors to describe the dictionary atoms. The complexity of each classification is O(Q).

Without the dictionary learning and sparse reconstruction steps, the works in [22] and [19] spend much less time. Though our results are better, optimizing our algorithm and shortening the consumed time are necessary. In our future work, we will focus on finding new methods to optimize the dictionary learning.

E. Limitations

Our proposed method uses some universal characteristics of rain and snow and develops accurate descriptors to represent rain and snow, so that some very good results have been obtained. However, some shortcomings still exist in our work. First, for some relatively heavy rain images, such as the fourth image in Fig. 20, our method still produces blurring. Second, we notice that the parameters selected in our work are suitable for the majority of rain and snow images, except for some special images such as very blurred rain images or heavy snow images. In this situation, we believe that the parameters need to be fine-tuned for a better result. Finally, a little snow trace can still be seen in some of our results when the size of the snowflakes is large. Our future work will focus on solving these problems and obtaining better rain/snow-removed images.
Yinglong Wang (S'16) received his B.S. degree from the School of Information Science and Engineering, Lanzhou University, Lanzhou, China, in 2011. He is currently working toward the Ph.D. degree at the University of Electronic Science and Technology of China (UESTC), Chengdu, China. His research interests focus on image processing and machine learning.

Shuaicheng Liu received his Ph.D. and M.S. degrees from the National University of Singapore (NUS), Singapore, in 2014 and 2010, respectively, and his B.E. from Sichuan University, Chengdu, China, in 2008. In 2014, he joined the University of Electronic Science and Technology of China (UESTC) and is currently an Associate Professor with the Institute of Image Processing, School of Electronic Engineering. His research interests include image and video processing, computational photography, computer graphics and vision. He is a member of IEEE.

Bing Zeng (M'91-SM'13-F'16) received his BEng and MEng degrees in electronic engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 1983 and 1986, respectively, and his PhD degree in electrical engineering from Tampere University of Technology, Tampere, Finland, in 1991.

He worked as a postdoctoral fellow at the University of Toronto from September 1991 to July 1992 and as a Researcher at Concordia University from August 1992 to January 1993. He then joined The Hong Kong University of Science and Technology (HKUST). After 20 years of service at HKUST, he returned to UESTC in the summer of 2013, through China's "1000-Talent-Scheme". At UESTC, he leads the Institute of Image Processing to work on image and video processing, 3D and multi-view video technology, and visual big data.

During his tenure at HKUST and UESTC, he graduated more than 30 Master and PhD students, received about 20 research grants, filed 8 international patents, and published more than 200 papers. Three representative works are as follows: one paper on fast block motion estimation, published in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) in 1994, has so far been SCI-cited more than 1000 times (Google-cited more than 2100 times) and currently stands at the 7th position among all papers published in this Transactions; one paper on smart padding for arbitrarily-shaped image blocks, published in IEEE TCSVT in 2001, led to a patent that has been successfully licensed to companies; and one paper on the directional discrete cosine transform (DDCT), published in IEEE TCSVT in 2008, received the 2011 IEEE TCSVT Best Paper Award. He also received the best paper award at ChinaCom three times (2009, 2010, and 2012).

He served as an Associate Editor of IEEE TCSVT for 8 years and received the Best Associate Editor Award in 2011. He was General Co-Chair of IEEE VCIP-2016, held in Chengdu, China, in November 2016. Currently, he is on the Editorial Board of the Journal of Visual Communication and Image Representation and serves as General Co-Chair of PCM-2017. He received a 2nd Class Natural Science Award (as the first recipient) from the Chinese Ministry of Education in 2014 and was elected as an IEEE Fellow in 2016 for contributions to image and video coding.