
A Hierarchical Approach for Rain or Snow Removing in a Single Color Image

Yinglong Wang, Student Member, IEEE, Shuaicheng Liu, Member, IEEE, Chen Chen, Student Member, IEEE, and Bing Zeng, Fellow, IEEE

IEEE Transactions on Image Processing (accepted), DOI 10.1109/TIP.2017.2708502

Abstract—In this paper, we propose an efficient algorithm to remove rain or snow from a single color image. Our algorithm takes advantage of two popular techniques employed in image processing, namely, image decomposition and dictionary learning. At first, a combination of rain/snow detection and a guided filter is used to decompose the input image into a complementary pair: (1) the low-frequency part that is almost completely free of rain or snow and (2) the high-frequency part that contains not only the rain/snow component but also some or even many details of the image. Then, we focus on the extraction of image details from the high-frequency part. To this end, we design a 3-layer hierarchical scheme. In the first layer, an over-complete dictionary is trained and three classifications are carried out to classify the high-frequency part into rain/snow and non-rain/snow components, in which some common characteristics of rain/snow are utilized. In the second layer, another combination of rain/snow detection and guided filtering is performed on the rain/snow component obtained in the first layer. In the third layer, the sensitivity of variance across color channels (SVCC) is computed to enhance the visual quality of the rain/snow-removed image. The effectiveness of our algorithm is verified through both subjective (visual quality) and objective (rendering rain/snow on ground-truth images) approaches, which shows its superiority over several state-of-the-art works.

Index Terms—Rain and snow removal, image decomposition, dictionary learning, guided filtering, sparse representation.

Manuscript received xxx, revised yyy, accepted zzz. This work has been supported by the National Natural Science Foundation of China (No. 61370148 and No. 61505079) and the "111" Projects (No. B17008). The authors are with the Institute of Image Processing, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China. C. Chen is also with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong, China. All correspondence to S. C. Liu (e-mail: [email protected]) and B. Zeng (e-mail: [email protected]).

I. INTRODUCTION

It is well known that bad weather, e.g., haze, rain, or snow, severely affects the quality of captured images or videos, which consequently degrades the performance of many image processing and computer vision algorithms such as object detection, tracking, recognition, and surveillance. A study by Garg et al. [1] reveals that rain and snow belong to the dynamic weather category - they contain constituent particles of relatively large sizes so that they can be captured easily by cameras. On the other hand, haze belongs to the steady weather category - its particles are much smaller in size and can hardly be filmed. As a result, rain or snow leads to complex pixel variations and obscures the information that is conveyed in the image or video. The degradation of an algorithm's performance would be especially severe if the algorithm is based on some features in the image or video. As compared to the de-haze problem, where some excellent solutions have been achieved (e.g., [2]), removing rain or snow is much more challenging.

Though both belong to the dynamic weather category, rain and snow still show some differences when appearing in an image or video. First, rain is semi-transparent. Because of this, the objects behind it are not occluded completely, but some blurring may appear. Second, pixels with different intensities are affected by rain differently. When a pixel's primary intensity is relatively low, rain will enhance its intensity; when a high-intensity pixel is affected by rain, its intensity will become lower. That is to say, rain-affected pixels tend to have the same intensity because the reflection by rain is dominating under this scenario. On the other hand, snow is opaque and can largely occlude the object behind it. In addition, snow has a bright, white color and a strong reflection. Consequently, snow often possesses high intensity values in an image, which are hardly affected by the background.

Rain/snow removal from a video or a single image has been an active research topic over the past decade. Today, it continues to draw attention in outdoor vision systems (e.g., surveillance) where the ultimate goal is to produce a clear and clean image or video. Here, the most critical task is to separate rain/snow components from the other part. To this end, a low-pass filtering is often used. However, the low-frequency part produced by the low-pass filter would still contain some rain streaks or snowflakes if the filter is not strong enough. To avoid this, the low-pass filtering needs to be rather strong. Nevertheless, such a strong filtering will unavoidably remove some or even many details of the image at the same time. This implies that the high-frequency part (the residual) contains not only rain/snow components but also some or even many details of the image. Then, some learning-based methods can be designed to further classify the rain/snow components in the high-frequency part.

In our work, we consider rain/snow removal from a single color image, in which several new designs are introduced. The main contributions of our work are summarized as follows:
• We have outlined several common characteristics of rain and snow, from which two metrics are defined, namely, the sensitivity of variance across color channels (SVCC) and the principal direction of an image patch (PDIP).
• A low-frequency part that is almost completely free of rain or snow has been generated, thanks to the use of a combination of rain/snow detection and a guided filter (as the low-pass filter), while the corresponding high-frequency part is made complementary to the low-frequency part.


• A 3-layer hierarchy for extracting the image's details from the high-frequency part has been designed. Specifically, the first layer is a 3-times classification that is based on a trained (over-complete) dictionary, the second layer applies another combination of rain/snow detection and a guided filter, and the third layer utilizes the SVCC to enhance the visual quality of the rain/snow-removed image.

The rest of our paper is organized as follows. Section II presents some related works. In particular, several very recent works will be discussed on their pros and cons, which motivates us to develop our new hierarchical approach. Section III discusses some common characteristics of rain and snow, based on which the SVCC and PDIP will be defined. The details of our proposed rain/snow removal algorithm are presented in Sections IV and V. In Section VI, we show some experimental results and present comparisons between our algorithm and several state-of-the-art works. Finally, Section VII summarizes our work and concludes the paper.

II. RELATED WORKS

The earliest work on dynamic weather such as rain and snow can be dated back to the study of their statistical characteristics in the atmospheric science in 1948 [3]. Then, Nayar et al. studied the visual manifestations of different weather conditions, including rain and snow [4]. A pioneering work on rain removing was proposed by Garg and Nayar in 2004 [1], in which they built a dynamic model for rain and a physics-based motion blur model. Based on these models, they detected and removed rain in videos. In a follow-up work [5], Garg and Nayar proposed a new rain streak appearance model, based on a raindrop oscillation model from the atmospheric science, to build a database of rain appearance, and through this database they tried to render rain in an image.

In the meantime, Zhang et al. studied the temporal and chromatic properties of rain in a video [6] and showed some interesting observations, e.g., a pixel at a fixed location is unlikely to be covered by rain throughout the entire video, and the changes of the red (R), green (G), and blue (B) values of a rain-affected pixel are approximately the same. Based on these two properties, they detected and removed rain in video obtained by a stationary camera.

Later on, Garg and Nayar designed an algorithm to render rain in images [7] by utilizing the raindrop's size and velocity as well as the camera's parameters. In [8] and [9], Barnum et al. detected and removed rain and snow in videos by creating a global effect model of rain and snow in the frequency domain, which is obtained by combining their streak model and statistical characteristics. In 2008, Brewer and Liu analyzed the rain shape property and how to find the direction of rain streaks in videos [10]. In 2009, Roser and Geiger proposed to use a photometric raindrop model to detect raindrops in videos to improve image registration [11]. In 2011, Bossu proposed to use the histogram of orientation of streaks (HOS) to detect rain and snow in image sequences [12].

All algorithms mentioned above deal with the rain and snow removing problem in videos. Nevertheless, rain and snow removal from a single image seems more useful in practice, but is also more challenging. To the best of our knowledge, Halimeh and Roser, for the first time, detected raindrops on the windshield in a single image, modeled the geometric shape of raindrops, and utilized the photometric property to construct a relationship between the raindrop and the environment [13]. Later on, some learning-based image decomposition methods were proposed to remove rain in a single image [14]–[18]. In the meantime, a guided-filter-based method was used by Xu et al. to remove rain or snow from a single image by designing a rain/snow-free guidance image [19]. Kim et al. detected rain streaks with a rain shape model and removed rain in a single image by nonlocal means filtering [20]. In [21], because rain streaks always show similar patterns, Chen et al. proposed a low-rank appearance model to remove rain streaks in images/videos. More recently, Ding et al. removed rain and snow in a single image by an L0 smoothing filter [22], Luo et al. separated an image into the rain layer and the de-raining layer by discriminative sparse codes that are based on a non-linear generative model of the rain image [23], and Li et al. made use of some patch-based priors for both the background layer and the rain layer for the rain removal task [24].

In general, the recent methods for rain/snow removal from a single image can be classified into three categories. The first category is simply filtering-based, where a nonlocal mean filter or guided filter is often used [19], [20], [22]. Due to the simple use of a filter, its implementation is very fast. However, it can hardly produce a satisfactory performance consistently - either the output image is left with some rain streaks (snowflakes), or quite a few of the image's details are lost so that the output image becomes blurred. The second category builds models for rain streaks or snowflakes [21], [23], [24]. These models can discriminate rain streaks or snowflakes from the background. However, it often happens that some details of the image are mistreated as rain streaks or snowflakes. The third category, which seems more reasonable, is to form a 2-step processing [14], [15], [18]. Specifically, a well-designed filtering is first used to decompose a rain/snow image into the low-frequency part and the high-frequency part. While the low-frequency part can be made as free of rain or snow as possible, the model-based processing can be applied on the high-frequency part to further extract the image's details, which are then added back into the low-frequency part.

We follow this 2-step approach in our work. As compared to the existing 2-step methods [14], [15], [18], the novelty of our proposed approach is two-fold. In the first step, instead of simply applying a low-pass filtering, we combine a rain/snow detection together with a guided filter. By doing this, we can achieve a much improved balance between removing rain/snow components and preserving the image's details - the resulting low-frequency part becomes almost completely free of rain or snow and at the same time contains the image's details to a reasonable extent. Unavoidably, some details of the image that have similar characteristics to rain streaks or snowflakes will still be left over in the corresponding high-frequency part. In the second step, our design of a new 3-layer hierarchy for extracting the image's details will prove to be more effective than the extraction methods proposed in [15] and [18], though the method in [18] also consists of 3 layers.


TABLE I
NOTATION USED IN THIS PAPER.

Symbol                                  Meaning
I                                       Input color image
I(i,j) = [R(i,j), G(i,j), B(i,j)]^T     Color values of the pixel at (i,j)
I_L / I_H                               Low/high-frequency part of I
{R_H(i,j), G_H(i,j), B_H(i,j)}          Pixel values at (i,j) in I_H
V(i,j)                                  Value of the SVCC map at (i,j)
I_H^{ND1}                               Layer-1 non-rain/snow components
I_H^{ND2}                               Layer-2 non-rain/snow components
I_H^{ND3}                               Layer-3 non-rain/snow components
I_H^D                                   Rain/snow components by classification
M_I                                     Rain or snow location in I
M_H                                     Rain or snow location in I_H
S_k                                     Sum of color channel variance in D_k
PDIP                                    Principal direction of a dictionary atom
Î                                       Rain/snow-removed image

Fig. 1. A rain image and a snow image are shown in (a) and (b), respectively, from which we can observe the first two characteristics clearly.

III. COMMON CHARACTERISTICS OF RAIN AND SNOW

To facilitate an easier reading, we list in Table I all important mathematical symbols that are employed in this paper. In our work, we define "rain or snow" in an image as the dynamic components and the other non-rain/snow contents as the non-dynamic components. As mentioned in Section I, rain and snow have many differences in shape, size, and intensity, and they also influence images differently. In this section, however, we try to find some common characteristics of these dynamic components.

First of all, because of strong reflections by rain/snow, high intensity values tend to result at pixels that are affected by rain/snow. Therefore, the values of rain/snow pixels in an image are usually larger than those of their neighboring non-rain/snow pixels.

Secondly, edge jumps usually exist in natural images between rain streaks or snowflakes and their horizontal neighbors. Therefore, an image patch that includes rain/snow will usually produce larger average absolute horizontal gradients.

Fig. 1 shows a rain image and a snow image, respectively, where the above two characteristics can be observed clearly.

Thirdly, let us decompose a rain or snow image into the low-frequency and high-frequency parts and use {R_L(i,j), G_L(i,j), B_L(i,j)} and {R_H(i,j), G_H(i,j), B_H(i,j)} to denote the three color values of a pixel I(i,j) in these two parts. Fig. 2 shows the decomposed results for the images presented in Fig. 1, where the detailed decomposition will be described in the next section. It can be seen that rain/snow pixels in the high-frequency part are gray or shallow white. Moreover, the three color channels of a rain/snow pixel in the high-frequency part have nearly the same value, i.e., R_H(i,j) ≈ G_H(i,j) ≈ B_H(i,j).

We utilize the rain image in Fig. 1(a) and the snow image in Fig. 1(b) as examples to verify this consistency. From their high-frequency parts, we have randomly selected 500 rain/snow pixels as well as 500 non-rain/snow pixels with high color variances. We calculate the variance of each {R_H(i,j), G_H(i,j), B_H(i,j)}. The results are shown in Fig. 3. It is clear that a huge majority of the color variances of the selected rain/snow pixels is indeed clustered near zero, while the variance of the selected non-rain/snow pixels spans a much larger range.

A. Sensitivity of variance of color channels (SVCC)

Based on the third characteristic of rain/snow described above, the variance of the color vector corresponding to a rain/snow pixel in the high-frequency part tends to be very small, while the variance of the color vector corresponding to a non-rain/snow pixel is usually big. This implies that the variance of a pixel's color channels can be used to discriminate the rain/snow part from the non-rain/snow part.

Here, we define the sensitivity of variance of color channels (SVCC) as the difference between the dynamic component and the other contents of an image.

For a pixel at location (i,j) in a given image I, the color vector is formed as:

I(i,j) = [R(i,j), G(i,j), B(i,j)]^T.   (1)

We first calculate the mean vector A(i,j) of an image patch centered at (i,j) to form a matrix A,

A(i,j) = \frac{1}{|W|} \sum_{(m,n) \in W(i,j)} I(m,n)   (2)

where W is a window centered at (i,j) and |W| stands for the window size. In principle, the window size should be larger than the width of a rain streak or snowflake to ensure all rain/snow can be detected. In most images tested in our work, we found that the width of the majority of rain/snow is about 3 to 5 pixels. As a result, we fix the window size to be 7 × 7. In order to remove singular values, a median filtering whose size is also 7 × 7 is applied on the matrix A to obtain Ã. Then, for each element in Ã, we calculate the variance across its three color channels

\tilde{V}(i,j) = \mathrm{var}(\tilde{A}(i,j)).   (3)


Fig. 2. (a) Low-frequency component of a rain image. (b) High-frequency component of a rain image. (c) Low-frequency component of a snow image. (d) High-frequency component of a snow image.

Fig. 3. (a) Distribution of variances for the selected 500 rain pixels. (b) Distribution of variances for the selected 500 non-rain pixels. (c) Distribution of variances for the selected 500 snow pixels. (d) Distribution of variances for the selected 500 non-snow pixels.

The variance matrix Ṽ is finally assigned as the SVCC map of I, which is then normalized into the range [0,1], with each element calculated by

V(i,j) = \left( \frac{\tilde{V}(i,j)}{\tilde{V}_{\max}} \right)^{\gamma}   (4)

where Ṽ_max stands for the maximum color channel variance and γ is a power function parameter to expand or compress the contrast of the SVCC map. Fig. 4 shows the SVCC maps for the rain and snow images of Fig. 1, where γ = 1.1. It can be observed that rain or snow areas possess low values (deep blue stands for low value according to the energy bar) in the SVCC map, while the non-rain/snow objects, whose color variances are relatively large, lead to high values (red areas and bright blue areas).

Fig. 4. (a) The SVCC map of a rain image. (b) The SVCC map of a snow image. In order to have a clear visualization, we have employed pseudo colors to display the 2-D SVCC map, where the energy bar is placed on the right end.

How to make use of the SVCC map for our task of rain/snow removal will be described in Section V, together with some discussions on how to choose γ.
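The SVCC computation of Eqs. (1)-(4) maps directly onto array operations. Below is a minimal sketch in Python/NumPy rather than the MATLAB used in our implementation; the function name svcc_map is hypothetical, and the use of scipy.ndimage filters for the 7 × 7 mean and median steps is our assumption:

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter

def svcc_map(img, win=7, gamma=1.1):
    # img: H x W x 3 float array (the image whose SVCC map is wanted)
    img = img.astype(np.float64)
    # Eq. (2): per-channel mean over a win x win window -> matrix A
    A = np.stack([uniform_filter(img[..., c], size=win) for c in range(3)],
                 axis=-1)
    # 7x7 median filtering of A to remove singular values -> A~
    A_med = np.stack([median_filter(A[..., c], size=win) for c in range(3)],
                     axis=-1)
    # Eq. (3): variance across the three color channels
    V_tilde = A_med.var(axis=-1)
    # Eq. (4): normalize by the maximum variance and apply the power gamma
    return (V_tilde / V_tilde.max()) ** gamma
```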
B. Principal direction of an image patch (PDIP)

Referring to Fig. 1(a), rain streaks often have consistent falling directions. Therefore, the histogram of oriented gradients (HOG) proposed by Dalal et al. [25] can be used to separate rain streaks from the image [15], [18].

However, snowflakes in an image do not always have consistent falling directions. Snowflakes with a high falling speed may follow nearly consistent falling directions, as in Fig. 5(a); but point-like snowflakes are often perceived when snow is falling down slowly, such as the example shown in Fig. 1(b). Obviously, HOG will fail when encountering point-like snow.

Fig. 5. (a) A snow image in which snowflakes have different shapes but many snowflakes have a consistent falling direction. (b) The same rain image as Fig. 1(a), where the PDIP is highlighted visually for one image patch.

If an image patch contains rain streaks or snowflakes with a consistent falling direction, its HOG often forms an impulse at the angle corresponding to the rain or snow direction. By the K-means method, we can then separate rain or snow from the rest of the image. Therefore, we register the angle corresponding to the HOG bin that has the maximum value as the principal direction of an image patch (PDIP) to identify rain/snow in our work. One example is shown in Fig. 5(b).
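A PDIP can be sketched as the dominant bin of a magnitude-weighted orientation histogram, a stripped-down stand-in for the full HOG descriptor of [25]; the 36-bin resolution and the grayscale input below are our assumptions, not values from the paper:

```python
import numpy as np

def pdip(patch, n_bins=36):
    # patch: 2-D (grayscale) array, e.g., one dictionary atom viewed as a patch
    gy, gx = np.gradient(patch.astype(np.float64))
    mag = np.hypot(gx, gy)                                # gradient magnitude
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)   # orientation in [0,180)
    hist, edges = np.histogram(ang, bins=n_bins, range=(0.0, 180.0),
                               weights=mag)
    k = int(np.argmax(hist))                  # HOG bin with the maximum value
    return 0.5 * (edges[k] + edges[k + 1])    # its center angle = the PDIP
```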
IV. OUR PROPOSED ALGORITHM

The pipeline of our proposed rain/snow removal is shown in Fig. 6. Specifically, our algorithm consists of two steps. In the first step, the input image is decomposed into the low-frequency part I_L and the high-frequency part I_H.


Fig. 6. The simplified pipeline of our algorithm - the details of each step will be shown later.

Fig. 7. The flow chart of the first step: I is the input rain/snow image; M_I is the location map; I_M is the Hadamard product of I and M_I; I_L and I_H are, respectively, the low-frequency and high-frequency parts obtained after the decomposition.

Fig. 8. (a) Detection result of rain. (b) Detection result of snow.

Note that I_L is almost completely free of rain or snow but usually blurred, while I_H contains the rain/snow components and some or even many details of the image. In the second step, we design a 3-layer hierarchy to extract the non-dynamic components (i.e., the image's details) from I_H, which are denoted as I_H^{ND1}, I_H^{ND2}, and I_H^{ND3}, respectively. The final rain/snow-removed image is obtained as:

\hat{I} = I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}.   (5)

In this section, we pay attention to the first step; the details of the second step are described in the next section. Fig. 7 shows the details of the first step. First, a rain/snow detection is performed to produce a binary location map M_I, and the Hadamard product between I and M_I yields an output image I_M. Because the location map is binary, holes appear at the rain/snow locations. Then, we fill each hole with the mean value of its neighboring non-rain/snow pixels. At last, a guided filter is utilized to generate the low-frequency part I_L, and the high-frequency part is obtained as I_H = I − I_L.

A. Detection of Dynamic Components

In general, some low-pass filter (e.g., the guided filter) can be used to decompose a rain or snow image into the low-frequency part and the high-frequency part. However, such a low-pass filtering can hardly filter out all dynamic components (i.e., rain or snow). To solve this problem, we propose to first perform a rain/snow detection to obtain the coarse locations of these dynamic components and then apply a guided filter to obtain a low-frequency part that is almost completely free of rain or snow.
Rain/snow detection belongs to the category of object detection, for which many algorithms have been developed, including several very recent ones by Pang et al. [27]–[29]. In this part of our work, we wish to keep the detection as simple as possible, which can be achieved by utilizing some intrinsic characteristics of rain/snow, as described below.

For the input rain or snow image I, we make use of the first characteristic described in Section III to detect rain/snow. For each pixel I(i,j), we calculate 5 mean values Ī^{(k)} (k = 1, 2, 3, 4, 5) over five 7 × 7 windows ω^{(k)}, with the pixel I(i,j) located at the center, top-left, top-right, bottom-left, and bottom-right of the window, respectively. If the inequalities

I(i,j) > \bar{I}^{(k)} = \frac{1}{|\omega^{(k)}|} \sum_{(m,n) \in \omega^{(k)}} I(m,n), \quad k = 1, 2, 3, 4, 5,   (6)

where |ω^{(k)}| stands for the window size, are satisfied for all color channels, I(i,j) is recognized as a dynamic pixel and the corresponding entry M_I(i,j) in the location map M_I is set to 0; otherwise, M_I(i,j) is set to 1.

By placing the pixel I(i,j) at different positions of the window, we can avoid mistreating big white non-dynamic objects (such as white buildings) as dynamic components. Fig. 8 shows the detection result M_I for the rain and snow images of Fig. 1.

Notice that the rain/snow detection used here is a very strong one and will unavoidably lead to some over-detection mistakes. Nevertheless, such a detection usually includes all rain streaks for rain images or all snowflakes for snow images, especially those with high intensities. On the other hand, rain streaks or snowflakes with low intensities may be missed by our detection. However, even if missed, this kind of rain streaks or snowflakes can easily be filtered out by a low-pass filter.

A much more challenging problem associated with over-detection mistakes is that some or even many details of the image are detected as rain/snow components because they also have high intensities as compared with their neighbors. Consequently, these mistreated non-rain/snow components will appear in the high-frequency part. How to extract them effectively will be discussed in Section V.
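The five-window test of Eq. (6) can be vectorized by computing one window-mean image and shifting it so that each shift corresponds to one placement of the pixel (center and four corners). A sketch under that reading follows; the wrap-around border handling of np.roll is a simplification:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def detect_dynamic(img, win=7):
    # Returns the binary location map M_I: 0 = dynamic (rain/snow), 1 = not.
    img = img.astype(np.float64)
    h = win // 2
    dynamic = np.ones(img.shape[:2], dtype=bool)
    for c in range(img.shape[-1]):
        ch = img[..., c]
        mean = uniform_filter(ch, size=win)   # win x win mean at every pixel
        exceeds = np.ones_like(ch, dtype=bool)
        # (0,0) is the centered window; the four (+/-h, +/-h) shifts emulate
        # windows whose corners sit on the pixel, per Eq. (6)
        for dy, dx in [(0, 0), (h, h), (h, -h), (-h, h), (-h, -h)]:
            exceeds &= ch > np.roll(np.roll(mean, dy, axis=0), dx, axis=1)
        dynamic &= exceeds                    # must hold in every color channel
    M_I = np.ones(img.shape[:2], dtype=np.uint8)
    M_I[dynamic] = 0
    return M_I
```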
B. Image Decomposition

For a given image I, we calculate the Hadamard product of I and the binary location matrix M_I as

I_M = I \circ M_I.   (7)

Since M_I is binary, holes exist in the image I_M at the locations of all detected rain streaks or snowflakes. To fill these holes, the value of each dynamic pixel I_M(i,j) is substituted with the mean value of the non-dynamic pixels in the patch centered at I_M(i,j). Then, we use the guided filter [26] to further filter out the remaining dynamic components with low intensities from I_M and get the low-frequency part I_L as

I_L = F_g\{F_m\{I_M\}\} = F_g\{F_m\{I \circ M_I\}\}   (8)

where F_m and F_g represent the operations of filling holes with the mean of non-rain/snow values and of guided filtering, respectively.


Through this process, it is found that we obtain a better low-frequency part than by directly applying the guided filter, in the sense that it preserves more details of the image and retains nearly no trace of rain or snow. Finally, the high-frequency part I_H is obtained as I_H = I − I_L, i.e., I_L and I_H are completely complementary to each other. The low/high-frequency parts that have been shown earlier in Fig. 2 are obtained by exactly following this complementary decomposition. Note that the guided filter [26] and the bilateral filter [30] are both very good smoothing filters that preserve edges. In our work, we choose the guided filter for the image decomposition.
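Putting the two pieces of the first step together (Eqs. (7)-(8)), a compact sketch is given below. The single-channel guided filter stands in for the implementation of He et al. [26]; the radius r and regularizer eps are our assumptions, as the paper does not report them:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def _box(x, r):
    return uniform_filter(x, size=2 * r + 1)

def guided_filter(guide, src, r=8, eps=0.04):
    # Plain guided filter: output q = mean(a) * guide + mean(b)
    m_I, m_p = _box(guide, r), _box(src, r)
    a = (_box(guide * src, r) - m_I * m_p) / (_box(guide * guide, r) - m_I**2 + eps)
    b = m_p - a * m_I
    return _box(a, r) * guide + _box(b, r)

def decompose(img, M_I, win=7):
    # Eq. (7): Hadamard product I o M_I, then F_m (hole filling with the
    # local mean of surviving pixels) and F_g (guided filtering), Eq. (8)
    img = img.astype(np.float64)
    m = M_I.astype(np.float64)
    I_L = np.empty_like(img)
    for c in range(img.shape[-1]):
        masked = img[..., c] * m
        local_sum = uniform_filter(masked, size=win)
        local_cnt = uniform_filter(m, size=win) + 1e-8
        filled = np.where(m > 0, img[..., c], local_sum / local_cnt)  # F_m
        I_L[..., c] = guided_filter(filled, filled)                   # F_g
    return I_L, img - I_L        # I_L and the complementary I_H = I - I_L
```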
Fig. 9. The detailed flow chart of the second step in our algorithm.

Fig. 10. The trained dictionaries for the high-frequency part of (a) the rain image and (b) the snow image shown earlier in Fig. 1.

Fig. 11. The flow chart of the classifications of dictionary atoms and the sparse reconstruction.
V. A 3-LAYER HIERARCHY OF EXTRACTING IMAGE DETAILS IN I_H

After the first step, almost all rain/snow components remain in the high-frequency part, but some or even many details of the image are also included in this part. Our second step is to recover these image details as much as possible so that they can be added back to the low-frequency part to obtain the final rain/snow-removed image. This job is further split into three layers, as shown in Fig. 9:
• a dictionary learning and a classification of dictionary atoms are used to separate the dynamic components (i.e., rain or snow) from the non-dynamic components so that the first-layer non-dynamic part I_H^{ND1} can be extracted,
• the classified dynamic component I_H^D is processed by another combination of rain/snow detection and guided filtering to produce the second-layer recovery of image details I_H^{ND2}, and
• the SVCC defined earlier is employed to produce the third-layer recovery of image details I_H^{ND3}.

A. Dictionary Learning for I_H

Because the location of rain/snow in the image is random, it is difficult to accurately separate rain/snow from other non-rain/snow components by normal detection methods. Dictionary learning is an excellent image decomposition method, which can decompose an image into many components: some are rain/snow components and the others are non-rain/snow components. In this subsection, we try to represent I_H by a sparse coding that is based on learning an over-complete dictionary. Recently, many dictionary-based sparse representations have been proposed, including some classical methods such as MCA [31] and K-SVD [32].

In our work, we choose the on-line dictionary learning method by Mairal et al. [33] to learn an over-complete dictionary for factorizing I_H. This method addresses the factorization problem with a new on-line optimization algorithm that is based on stochastic approximation. We choose this method because it suits both small and large data sets and can train adaptive over-complete dictionaries iteratively. By contrast, K-SVD can do a similar job but is slower, whereas MCA utilizes a different transform to construct an over-complete dictionary that is not adaptive. In the following, we introduce this on-line dictionary learning method briefly.

Suppose that u_i is the i-th training sample in a training set U. In order to obtain the trained dictionary D, the following loss function needs to be solved iteratively:

D_k = \arg\min_{D \in C} \frac{1}{k} \sum_{i=1}^{k} \left( \frac{1}{2} \| u_i - D\alpha_i \|_2^2 + \lambda \| \alpha_i \|_1 \right)   (9)

where k = 1, 2, ..., K, K is the number of training samples and also the number of iterations, D_k is the dictionary obtained after the k-th iteration, and the dictionary D_K obtained after the K-th iteration is the final dictionary D.

To avoid overly large values, D_k is subjected to the following constraint:

C = \{ D \in \mathbb{R}^{m \times n} \ \mathrm{s.t.} \ \forall j = 1, ..., n, \ d_j^T d_j \le 1 \}   (10)

where d_j is the j-th column vector of the dictionary D, named the j-th dictionary atom, m is the dimension of an atom, and n is the number of atoms in D.

Here, the parameter α_i in Eq. (9) is the sparse coefficient obtained by solving the following loss function:

\alpha_i = \arg\min_{\alpha \in \mathbb{R}^n} \frac{1}{2} \| u_i - D_{i-1}\alpha \|_2^2 + \lambda \| \alpha \|_1   (11)

where the least angle regression (LARS) [34] is utilized to solve it.

In our work, we trained a dictionary with 1024 atoms (i.e., n = 1024). To this end, the high-frequency part I_H is divided into 16 × 16 × 3 cubes to form the training samples, and each cube is arranged into a column vector of size 768 × 1 (i.e., m = 16 × 16 × 3 = 768). We generate an initial dictionary D_0 by selecting 1024 training samples randomly and choose the parameter λ = 0.15. After obtaining the dictionary, we reshape the atoms back to 16 × 16 × 3 cubes. Fig. 10(a) and (b) show these atoms for the rain and snow images of Fig. 1.
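The training setup can be approximated with scikit-learn's mini-batch dictionary learner, which implements the on-line algorithm of Mairal et al. [33]; LARS is available as its sparse-coding routine, matching Eq. (11). The patch sampling below (20000 random cubes) is our assumption - the paper does not state how many 16 × 16 × 3 cubes are drawn from I_H:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

def train_dictionary(I_H, n_atoms=1024, lam=0.15):
    # 16x16x3 cubes of I_H, each flattened to an m = 768 training vector
    patches = extract_patches_2d(I_H, (16, 16), max_patches=20000,
                                 random_state=0)
    X = patches.reshape(len(patches), -1)
    learner = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=lam,
                                          transform_algorithm="lars",
                                          random_state=0)
    learner.fit(X)                            # on-line training, Eq. (9)
    atoms = learner.components_.reshape(n_atoms, 16, 16, 3)  # back to cubes
    return learner, atoms
```

Calling learner.transform(X) afterwards returns the sparse coefficients of each patch, which is what the Layer-1 reconstruction below operates on.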


Fig. 12. Classification results for I_H of a rain image: (a) non-rain component I_{C1}^{ND}; (b) non-rain component I_{C2}^{ND}; (c) non-rain component I_{C3}^{ND}; and (d) non-rain component I_H^{ND1}.

Fig. 13. Classification results for I_H of a snow image: (a) non-snow component I_{C1}^{ND}; (b) non-snow component I_{C2}^{ND}; (c) non-snow component I_{C3}^{ND}; and (d) non-snow component I_H^{ND1}.

Fig. 14. (a) Rain component I_H^D. (b) Non-rain component I_H^{ND2}. (c) Snow component I_H^D. (d) Non-snow component I_H^{ND2}.

Fig. 15. (a) Non-rain component I_H^{ND3} extracted by SVCC. (b) Non-snow component I_H^{ND3} extracted by SVCC.

B. Layer-1 Extraction

With the dictionary obtained above, dynamic components and non-dynamic components can be separated by dictionary atoms. Namely, some dictionary atoms stand for dynamic components and others for non-dynamic components. To this goal, three classifications of dictionary atoms are implemented, as shown in Fig. 11. In the end, the dynamic component I_H^D and the non-dynamic component I_H^{ND1} are obtained by a sparse reconstruction.
First classification. According to the third characteristic discussed in Section III, dictionary atoms standing for dynamic components will have a smaller sum of pixel color channel variances. Thus, we calculate the sum of pixel color channel variances S_k of each dictionary atom D_k (k = 1, 2, ..., 1024) as:

S_k = \sum_{(i,j) \in D_k} \mathrm{var}(p(i,j))   (12)

where p(i,j) is the color vector of a pixel in D_k. According to these results, the first classification is performed as follows: we choose a threshold T_1 to identify the dynamic components from the other part of the image. If S_k < T_1, D_k stands for dynamic components. Once classified, a sparse coding is applied to obtain the coefficients x_k of each dictionary atom D_k, k = 1, 2, ..., 1024.


Then, all decomposed components y_k can be obtained as follows:

y_k = D_k \times x_k, \quad k = 1, 2, ..., 1024.   (13)

At last, we add up all components corresponding to the dynamic dictionary atoms to obtain the dynamic part I_{C1}^D of the first classification. Meanwhile, the non-dynamic part I_{C1}^{ND} can also be obtained.
Second classification. After the first classification, we have extracted non-dynamic components that usually contain color values quite different from those of dynamic components. However, non-dynamic components whose colors are similar to dynamic components would still remain in I_{C1}^D. To solve this problem, we propose to do the second classification using only the dictionary atoms that correspond to I_{C1}^D. Specifically, we calculate the average of the absolute horizontal gradients of the pixels in these dictionary atoms, in which we only select the pixels with non-zero gradient values. The atoms with a large mean of absolute horizontal gradients represent dynamic atoms. Here, we need a threshold T_2, and the classification is done in the same way as the first one. After the second classification, the dynamic component is denoted as I_{C2}^D and the non-dynamic component as I_{C2}^{ND}, respectively.
which we only select the pixels with non-zero gradient values.
The atoms with a large mean of absolute horizontal gradient D. Layer-3 Extraction
represent dynamic atoms. Here, we need a threshold T2 and N D1
After IH and IHN D2
are obtained, the rain-removed or
the classification is the same as the first one. After the second snow-removed results are still a little blurred. In order to
D
classification, the dynamic component is denoted as IC2 and further enhance the visual quality, we calculate the SVCC
ND
the non-dynamic component as IC2 , respectively. map VH of the high-frequency part IH and then use it to filter
D
Third classification. For the non-dynamic details in IC2 , the high-frequency part to get some non-dynamic components
we further propose to find the PDIPs of dictionary atoms N D3
IH (Layer-3) as follows:
D
corresponding to IC2 . Here, we treat a dictionary atom as N D3
a patch. Majority of texture components in IC2 D
is dynamic. IH = IH ◦ VH (18)
Hence, a large number of dictionary atoms corresponding to The result is shown in Fig. 15(a) and (b) for rain and snow,
D
IC2 are dynamic atoms, while only a small part of atoms respectively.
corresponds to non-dynamic components whose textures are The principle for choosing γ in the computation of the
very different from the dynamic weather. While the PDIPs of SVCC map is as follows. If rain/snow is bright (i.e., with
dynamic atoms are nearly consistent and have small variance, high intensity values), we should choose γ > 1 to minimize
we can calculate the PDIPs’ variance: the rain/snow trace in the final result. On the other hand, for
n1 n1
!2 rain/snow with very low intensity, γ could be smaller than 1. In
1 X 1 X
Σ= P DIPj − P DIPi (14) this case, a little rain/snow trace remains, which nevertheless
n1 j=1 n1 i=1
is hard to recognize visually because rain/snow has a very
D low intensity. Either too big or too small γ would destroy this
where n1 is the number of atoms corresponding to IHC2 and
th D purpose. In our experiments, we choose γ = 1.1 after trials
Dj is the j dictionary atom corresponding to IC2 . If the
on different values to fit the majority of test images. However,
following condition:
!2 we would like to point out that for a specific image, this value
n1 can be fine-tuned to obtain a better result.
1 X
Σj = P DIPj − P DIPi >Σ+ρ (15) At last, we sum up all non-dynamic components to obtain
n1 i=1
the rain/snow-removed image. The result is shown in Fig.
holds, atom Dj is classified as non-dynamic; otherwise it is 16(d) for rain and Fig. 17(d) for snow, respectively.
viewed as dynamic. Here, ρ is the control parameter to get
accurate results. After the third classification and reconstruc- E. Individual Contributions
tion which is similar to the first two reconstruction processes, In order to show the individual contribution of each of the
D
we obtain the dynamic component IC3 and non-dynamic three layers described above, we present the resulted images in
ND
component IC3 . Fig. 16 for rain and Fig. 17 for snow, respectively. The results
Eventually, after the three classifications, we obtain the non-dynamic component I_H^{ND1} and the dynamic component I_H^D as follows:

I_H^{ND1} = I_{C1}^{ND} + I_{C2}^{ND} + I_{C3}^{ND}, \quad I_H^D = I_{C3}^D.   (16)

Some non-dynamic results of the three classifications are shown in Fig. 12 and Fig. 13 for rain and snow, respectively, and the dynamic part I_H^D is shown in Fig. 14(a) and (c) for rain and snow, respectively.

C. Layer-2 Extraction

A minority of non-dynamic details still exists in I_H^D. In order to recover more image details, we detect dynamic components in I_H^D again by the method described in Section IV (i.e., a combination of rain/snow detection and a guided filter) and employ the newly calculated location map M_H to fill the holes. Then, by applying the guided filter, we get the non-dynamic part I_H^{ND2} (Layer-2) as

I_H^{ND2} = F_g\{F_m\{I_H^D \circ M_H\}\}   (17)

where F_m and F_g have the same meanings as described in Section IV. The results are shown in Fig. 14(b) and (d) for rain and snow, respectively.

D. Layer-3 Extraction

After I_H^{ND1} and I_H^{ND2} are obtained, the rain-removed or snow-removed results are still a little blurred. In order to further enhance the visual quality, we calculate the SVCC map V_H of the high-frequency part I_H and then use it to filter the high-frequency part to get some non-dynamic components I_H^{ND3} (Layer-3) as follows:

I_H^{ND3} = I_H \circ V_H.   (18)

The result is shown in Fig. 15(a) and (b) for rain and snow, respectively.

The principle for choosing γ in the computation of the SVCC map is as follows. If rain/snow is bright (i.e., with high intensity values), we should choose γ > 1 to minimize the rain/snow trace in the final result. On the other hand, for rain/snow with very low intensity, γ could be smaller than 1. In this case, a little rain/snow trace remains, which nevertheless is hard to recognize visually because the rain/snow has a very low intensity. Either a too big or a too small γ would defeat this purpose. In our experiments, we choose γ = 1.1 after trials on different values to fit the majority of test images. However, we would like to point out that, for a specific image, this value can be fine-tuned to obtain a better result.

At last, we sum up all non-dynamic components to obtain the rain/snow-removed image. The result is shown in Fig. 16(d) for rain and Fig. 17(d) for snow, respectively.

E. Individual Contributions

In order to show the individual contribution of each of the three layers described above, we present the resulting images in Fig. 16 for rain and Fig. 17 for snow, respectively. The results show that the Layer-1 extraction provides a more significant contribution as compared to Layer-2. This is because only very few non-rain/snow details still exist in I_H^D. In the meantime, however, the Layer-3 extraction seems to play a very positive role. It can be seen from Figs. 16(d) and 17(d) that the contrast and color textures have been improved a lot by using the SVCC map. Notice that, for some specific images that have high-intensity rain/snow, SVCC will leave a little rain/snow trace in the final results. This problem can be solved partially by fine-tuning the parameter γ.
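Assuming the hypothetical helpers sketched above (detect_dynamic, decompose, and svcc_map), the remaining two layers and the final composition of Eqs. (5), (17), and (18) chain together as follows; this is an illustrative outline, not the authors' implementation:

```python
import numpy as np

def remove_rain_snow_tail(I_L, IH_ND1, IH_D, I_H, gamma=1.1):
    # Layer-2, Eq. (17): re-detect on I_H^D, fill holes, guided-filter;
    # the low-frequency output of decompose() is I_H^{ND2}
    M_H = detect_dynamic(IH_D)
    IH_ND2, _ = decompose(IH_D, M_H)
    # Layer-3, Eq. (18): weight I_H by its SVCC map V_H
    V_H = svcc_map(I_H, gamma=gamma)
    IH_ND3 = I_H * V_H[..., None]
    # Eq. (5): final rain/snow-removed image
    return I_L + IH_ND1 + IH_ND2 + IH_ND3
```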


Fig. 16. Rain-removed results: (a) the input image; (b) rain-removed image obtained by I_L + I_H^{ND1}; (c) rain-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2}; and (d) rain-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}.

Fig. 17. Snow-removed results: (a) the input image; (b) snow-removed image obtained by I_L + I_H^{ND1}; (c) snow-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2}; and (d) snow-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}.

Fig. 18. Rain-removed results: (a) ground-truth; (b) the synthesized rain image; (c) low-frequency part I_L; (d) rain-removed image obtained by I_L + I_H^{ND1}; (e) rain-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2}; (f) rain-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}.

Fig. 19. Snow-removed results: (a) ground-truth; (b) the synthesized snow image; (c) low-frequency part I_L; (d) snow-removed image obtained by I_L + I_H^{ND1}; (e) snow-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2}; (f) snow-removed image obtained by I_L + I_H^{ND1} + I_H^{ND2} + I_H^{ND3}.

To show their individual contributions quantitatively, we have synthesized a rain image and a snow image, i.e., a ground-truth image is known and rain or snow is rendered on the ground-truth. We list in Table II the contribution of every layer by computing the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM). It can be observed from this table that the above-described results (i.e., the individual contributions of the three layers) are verified quantitatively. Finally, Figs. 18 and 19 show the visual results.

VI. EXPERIMENTAL RESULTS

In this section, we demonstrate the rain/snow-removing effectiveness of our proposed algorithm by comparisons with several state-of-the-art works. In our experiments, the three parameters T_1, T_2, and ρ used in the classification of dictionary atoms are chosen to be {0.1, 0.02, 1.5} for rain and {0.12, 0.03, 2} for snow, respectively. We would like to point out that, for a specific rain/snow image, these parameters could be fine-tuned to achieve a better performance. Figs. 20 and 21 show, respectively, some rain-removed and snow-removed results by different algorithms. In order to assess these results fairly, we first present the subjective evaluations in the following.

A. User Study

To conduct a visual (subjective) evaluation of the performances of different methods, we invited 20 viewers (12 males and 8 females) to evaluate the visual quality of the different methods in terms of the following three aspects: (1) less rain/snow residual, (2) the maintenance of image details, and (3) overall perception.


TABLE II
THE CONTRIBUTION OF EACH LAYER TO PSNR AND SSIM (TOP: PSNR, BOTTOM: SSIM) OF SYNTHESIZED RAIN AND SNOW IMAGES AGAINST THE GROUND-TRUTHS.

              I_L     I_L+I_H^{ND1}   I_L+I_H^{ND1}+I_H^{ND2}   I_L+I_H^{ND1}+I_H^{ND2}+I_H^{ND3}
a rain image  29.92   31.63           31.98                     32.55
              0.601   0.843           0.852                     0.893
a snow image  31.98   33.21           33.63                     35.26
              0.672   0.821           0.830                     0.860
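The PSNR/SSIM numbers reported in Tables II, IV, and V can be reproduced for any ground-truth/result pair with the reference implementations in scikit-image, assuming 8-bit RGB arrays (channel_axis requires a recent scikit-image; older versions use multichannel=True):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(ground_truth, restored):
    # Both inputs: H x W x 3 uint8 arrays
    psnr = peak_signal_noise_ratio(ground_truth, restored, data_range=255)
    ssim = structural_similarity(ground_truth, restored,
                                 channel_axis=-1, data_range=255)
    return psnr, ssim
```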

TABLE III
USER STUDY RESULT. THE NUMBERS ARE THE PERCENTAGES OF VOTES OBTAINED BY EACH METHOD.

            rain                                       snow
Method      [22]     [23]    [18]    [24]    Ours      [22]     [19]   [18]   Ours
Percentage  12.40%   3.78%   9.75%   8.57%   65.50%    12.50%   0%     0%     87.50%

TABLE IV
RAIN IMAGE PERFORMANCES (TOP: PSNR, BOTTOM: SSIM) OF DIFFERENT METHODS (ROWS) ON 11 SYNTHESIZED RAIN IMAGES (COLUMNS) AGAINST GROUND-TRUTHS.

        Img 1   Img 2   Img 3   Img 4   Img 5   Img 6   Img 7   Img 8   Img 9   Img 10  Img 11
[22]    38.93   37.16   35.93   40.29   32.78   33.50   34.59   35.22   34.00   35.89   34.23
        0.948   0.861   0.835   0.796   0.811   0.875   0.808   0.862   0.857   0.787   0.895
[23]    33.06   33.73   29.45   35.95   29.45   30.43   31.63   32.99   27.52   32.10   31.33
        0.899   0.808   0.801   0.784   0.790   0.829   0.804   0.833   0.814   0.707   0.877
[18]    34.90   34.95   32.55   38.58   31.84   32.11   34.59   34.15   34.39   35.44   35.13
        0.873   0.774   0.824   0.775   0.802   0.804   0.854   0.784   0.703   0.755   0.825
[24]    36.84   32.27   33.34   31.13   30.39   29.52   30.31   32.33   32.36   31.21   29.90
        0.945   0.691   0.748   0.754   0.682   0.669   0.686   0.785   0.747   0.706   0.597
Ours    42.78   38.38   36.03   40.31   32.94   34.42   34.91   35.53   34.80   35.93   38.63
        0.950   0.897   0.882   0.846   0.854   0.883   0.846   0.866   0.869   0.811   0.928

TABLE V
SNOW IMAGE PERFORMANCES (TOP: PSNR, BOTTOM: SSIM) OF DIFFERENT METHODS (ROWS) ON 11 SYNTHESIZED SNOW IMAGES (COLUMNS) AGAINST GROUND-TRUTHS.

        Img 1   Img 2   Img 3   Img 4   Img 5   Img 6   Img 7   Img 8   Img 9   Img 10  Img 11
[22]    34.11   30.09   31.98   33.60   34.25   34.73   33.51   33.42   32.65   33.60   35.23
        0.794   0.809   0.809   0.808   0.739   0.820   0.820   0.736   0.712   0.756   0.830
[19]    30.12   32.07   30.66   31.68   34.05   32.79   33.08   33.22   32.21   34.44   34.57
        0.629   0.599   0.588   0.798   0.676   0.623   0.572   0.593   0.634   0.730   0.613
[18]    30.14   33.01   31.36   32.56   30.98   33.70   34.38   33.06   33.81   34.48   33.55
        0.692   0.779   0.652   0.826   0.631   0.725   0.744   0.611   0.693   0.686   0.640
Ours    35.41   35.96   32.41   38.91   36.43   35.26   38.52   38.75   36.70   38.11   38.53
        0.804   0.861   0.813   0.885   0.886   0.860   0.851   0.819   0.833   0.865   0.862

In the evaluation, 10 groups of rain-removed results are selected, and every group involves the results by Ding et al., Luo et al., Chen et al., Li et al., and our method. Another 10 groups of snow-removed results are selected, and every group involves the results by Ding et al., Xu et al., Chen et al., and our method. To ensure fairness, the results in each group are arranged randomly. For each group, the viewers are asked to select the one result which they like most.

The evaluation result is shown in Table III. It is clear that our rain/snow removal results are favored by a huge majority of viewers (65.50% for rain and 87.50% for snow).

B. Objective Assessment

To facilitate the objective assessment, we render rain or snow on ground-truth images¹. Several examples and their corresponding rain/snow-removed results by different methods are shown in Figs. 22 and 23. In Fig. 23, the first two rows are streak-like snow and the last two rows are point-like snow. Here, PSNR and SSIM are adopted as the quantitative metrics to assess the performance of the different methods. In Tables IV and V, we list the PSNR and SSIM values of 11 rain images and 11 snow images, respectively. According to these results, it can be observed that our method outperforms the selected state-of-the-art works for removing rain or snow. In particular, our method produces much better results for snow images.

¹ Here, we follow [23] to render rain streaks and also use it for rendering snowflakes with a different setting of parameters.

C. Result Analysis

The fifth column in Fig. 20 shows the results by Li et al. [24]. This work can obtain an excellent rain-removed result for a rain image that has few small details, while for an image with many small details (the second and third ones) it will lose some image details. The fourth column in Fig. 20 shows the results by Chen et al. [18]. It is found that this method produces very good results for light rain images, such as the first, second, and third images.


Fig. 20. Rain-removal results: (a) rain image; (b) results by Ding et al. [22]; (c) results by Luo et al. [23]; (d) results by Chen et al. [18]; (e) results by Li et al. [24]; (f) results by our method.

However, when the intensity of the rain pixels is large, i.e., for relatively heavy rain (such as the fifth image), or when the edges of the rain streaks are blurry (e.g., the sixth one), this method fails. The third column in Fig. 20 shows the results by Luo et al. [23]. This method can produce acceptable rain-removal results for some images, like the third one. However, it cannot remove rain from relatively heavy rain images (the fifth and sixth ones). In general, it is more suitable for light and slender rain, as shown in their paper. The results by Ding et al. [22] are displayed in the second column of Fig. 20. This method produces satisfactory results for the fifth and sixth images. However, it only removes the rain streaks and cannot revise the shadows produced by the rain streaks (the second one). The results of our proposed algorithm are shown in the last column. By comparison, it can be observed that our method is suitable for all rain images tested in our experiments and produces a highly competitive performance. For some relatively heavy rain images, a hazy effect will appear in the rain-removed results. We can further implement a de-haze algorithm [24] to solve this problem to a certain extent.

Fig. 21 shows some experimental results of the state-of-the-art snow removal methods. First, the images in the fourth column show the snow-removed results obtained by the rain removal method developed in [18]. The results reveal that a good rain-removal method may not be suitable for snow removal. The results by [19] are shown in the third column. It is found that this method causes a lot of blur.

The second column shows the results by [22]. For some snow images, such as the third image in Fig. 21, this method can produce good snow-removal results and keep an acceptable image quality. However, when snow is heavy and has a larger size (e.g., the second image), it cannot recognize the snow. Another defect of this work is that it mistreats small image details as snow, as in the fourth image. The last column presents our results, showing a much better snow-removal performance. It can be seen that we have successfully removed the majority of the snow in the images and maintained most image details at the same time, leading to a better visual quality in the snow-removed images.

We analyze the performance of each method as follows. The work in [18] only uses a low-pass filter to separate the rain/snow image into the low-frequency and high-frequency parts. When the intensity of rain/snow is large, it is difficult to obtain a rain/snow-free low-frequency part. Hence, this work cannot remove rain/snow with high intensity.


Fig. 21. Snow-removal results: (a) snow image; (b) results by Ding et al. [22]; (c) results by Xu et al. [19]; (d) results by Chen et al. [18]; (e) results by our method.

can not remove rain/snow with high intensity. Besides, this rain/snow with low intensity, but rain/snow with high intensity
work uses HOG as the descriptor to identify rain streaks. When can not be removed. Besides, some image details that have
edges of a rain streak is blurry, the performance of HOG will similar shape and intensity with low-intensity rain/snow will
fall. Finally, HOG descriptor can not identify snow because also be removed. Finally, the work by Xu et al. designs a
some snow usually does not possess the shape of a rain streak. rain/snow-free guidance image, cooperated with the guided
Therefore, this work is not applicable to snow-removal. filter [26], to remove rain/snow from images. Even though the
guided filtering is a good edge-preserving low-pass filter, it is
For the work [23], when rain streaks are wide or a little inevitable that the processed image gets blurred.
heavy, the discrimination of the proposed non-linear generative
model will decrease. Hence, it is only suitable for handling D. Complexity Analysis
light rain streaks. The background prior and rain prior used in We implement our algorithm using MATLAB on an Intel
the work [24] can not separate rain from small image details. (R) Xeon (R) CPU E5-2643 v2 @ 3.5 GHz 3.5 GHz (2 pro-
Hence, when encountered with rain image with small image cessors) with 64G RAM. We test the run time on a 256 × 256
details, this work will loss many image details. By utilizing the image. The total time consumed by our method is 82.60
property of guided filter with L0 gradient minimization, the seconds, where the detection takes 5.71 seconds and the SVCC
work [22] reserves the edges whose corresponding location takes 1.85 seconds. Majority of time is spent in the dictionary
in the guidance image is of large gradient magnitudes, and learning part, which is 60.82 seconds. Classifications and
smooths other edges. Hence, this work can only remove sparse reconstruction spend 13.81 seconds. The remaining run
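To make this decomposition step concrete, we give a minimal sketch below. It is written in Python purely for illustration (our implementation is in MATLAB); the gray-scale guided filter follows the box-filter formulation of [26], while the window radius r, the regularization eps, and the function names are illustrative choices rather than the exact settings of our experiments.

import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, r=8, eps=1e-2):
    # Gray-scale guided filter of [26]: src is smoothed while following the
    # edges of guide; both are float arrays scaled to [0, 1].
    mean = lambda x: uniform_filter(x, size=2 * r + 1)  # box filter of radius r
    mean_g, mean_s = mean(guide), mean(src)
    cov_gs = mean(guide * src) - mean_g * mean_s
    var_g = mean(guide * guide) - mean_g * mean_g
    a = cov_gs / (var_g + eps)        # per-pixel linear coefficients
    b = mean_s - a * mean_g
    return mean(a) * guide + mean(b)  # averaged coefficients give the output

def decompose(image, guide):
    # Low-frequency part via guided filtering; the residual is the
    # high-frequency part that carries rain/snow plus fine details.
    low = guided_filter(guide, image)
    return low, image - low

When the guidance image is made rain/snow-free beforehand (as in [19], or via a detection step), the rain/snow edges are not preserved by the filter and therefore stay out of the low-frequency part.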


Fig. 22. (a) Ground truths. (b) Original synthesized rain images. (c) Results by Ding et al. [22]; (d) results by Luo et al. [23]; (e) results by Chen et al. [18]; (f) results by Li et al. [24]; (g) results by our method. The values at the top left corner are PSNR/SSIM.

Fig. 23. (a) Ground truths. (b) Original synthesized snow images. (c) Results by Ding et al. [22]; (d) results by Xu et al. [19]; (e) results by Chen et al. [18]; (f) results by our method. The values at the top left corner are PSNR/SSIM.
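The values at the top-left corners of Figs. 22 and 23 are obtained by comparing each output with its ground truth. The following sketch shows one way to compute such a PSNR/SSIM pair [35]; the scikit-image calls (recent versions; older ones use multichannel=True instead of channel_axis) and the file names are illustrative stand-ins, not the exact tools of our evaluation.

import numpy as np
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names; both images must have the same size.
truth = io.imread('ground_truth.png').astype(np.float64) / 255.0
result = io.imread('rain_removed.png').astype(np.float64) / 255.0

psnr = peak_signal_noise_ratio(truth, result, data_range=1.0)
ssim = structural_similarity(truth, result, data_range=1.0, channel_axis=-1)  # RGB input
print('%.2f/%.4f' % (psnr, ssim))  # matches the "PSNR/SSIM" caption format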

D. Complexity Analysis

We implement our algorithm using MATLAB on an Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.5 GHz (2 processors) with 64 GB of RAM, and we test the run time on a 256 × 256 image. The total time consumed by our method is 82.60 seconds, where the detection takes 5.71 seconds and the SVCC takes 1.85 seconds. The majority of the time is spent on the dictionary learning part, which takes 60.82 seconds; the classifications and the sparse reconstruction spend 13.81 seconds. The remaining run time is consumed by the intermediate steps of our algorithm. The times consumed by the works of Chen et al. [18], Luo et al. [23], Ding et al. [22], Xu et al. [19], and Li et al. [24] are 95.17, 68.66, 1.18, 0.27, and 1260.40 seconds, respectively.

The run time changes with the size of the image; therefore, we analyze the complexity of the major steps as follows. Suppose that the size of a given rain/snow image is M × N and that l is the number of windows we use to identify rain/snow pixels in the given image (l = 5 in this paper). The computational complexity of the rain/snow detection is then O(M × N × l). For the online dictionary learning, the number of training samples is K, the size of every training sample is L × 1, the number of dictionary atoms is Q, the size of every dictionary atom is L × 1 (L < Q ≪ K), S is the target sparsity, and T is the iteration number. Then, the complexities of the online dictionary learning and the sparse reconstruction are O(T × K × (N × S² + 2 × L × Q)) and O(Q), respectively.
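To relate these symbols to practice, the sketch below performs the same kind of online dictionary learning [33] and sparse reconstruction on K training patches of length L, learning Q atoms with target sparsity S. The scikit-learn estimator and all numeric values are illustrative stand-ins for our MATLAB implementation.

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

K, L, Q, S = 20000, 64, 256, 5  # illustrative: samples, patch length, atoms, sparsity
X = np.random.randn(K, L)       # stand-in for high-frequency image patches

learner = MiniBatchDictionaryLearning(
    n_components=Q,              # learn Q dictionary atoms, each of length L
    transform_algorithm='omp',   # sparse coding with at most S nonzero coefficients
    transform_n_nonzero_coefs=S,
)
D = learner.fit(X).components_   # learned dictionary, shape (Q, L)
codes = learner.transform(X)     # sparse reconstruction coefficients, shape (K, Q)

Since dictionary learning dominates the run time (60.82 of the 82.60 seconds above), it is the natural target for optimization.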
We use the same dictionary learning method as the work by Chen et al. [18]; therefore, the dictionary learning and sparse reconstruction have the same computational complexity in both works. On the other hand, we implement 3 classifications, and every classification has its own feature descriptors to describe the dictionary atoms; the complexity of each classification is O(Q). Without the dictionary learning and sparse reconstruction steps, the works in [22] and [19] spend much less time. Though our results are better, optimizing our algorithm and shortening the consumed time are necessary. In our future works, we will focus on finding new methods to optimize the dictionary learning.

E. Limitations

Our proposed method uses some universal characteristics of rain and snow and develops accurate descriptors to represent them, so that some very good results have been obtained. However, some shortcomings still exist in our work. First, for some relatively heavy rain images, such as the fourth image in Fig. 20, our method still produces blurring. Second, we notice that the parameters selected in our work are suitable for the majority of rain and snow images, except for some special images such as very blurred rain images or heavy snow images; in such situations, we believe that the parameters need to be fine-tuned for a better result. Finally, a little snow trace can still be seen in some of our results when the snowflakes are large. Our future work will focus on solving these problems and obtaining better rain/snow-removed images.


VII. CONCLUSIONS

This paper has attempted to solve the problem of removing rain/snow from a single color image by utilizing the common characteristics of rain and snow. To this end, we defined the principal direction of an image patch (PDIP) and the sensitivity of variance of color channel (SVCC) to describe the difference of rain or snow from other image components. We acquired the low- and high-frequency parts by implementing a rain/snow detection and applying a guided filter. For the high-frequency part, a dictionary learning and three classifications of dictionary atoms are implemented to decompose it into non-dynamic components and dynamic (rain or snow) components, where some common characteristics of rain/snow defined earlier in our work are utilized. Moreover, we have designed two additional layers of extracting image details from the high-frequency part, which are based on, respectively, the SVCC map and another combination of a rain/snow detection and a guided filtering. Finally, we have presented a large set of results to show that our method can remove rain or snow from images effectively, leading to an enhanced visual quality in the rain/snow-removed images.

REFERENCES

[1] K. Garg and S. K. Nayar, "Detection and removal of rain from videos," IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2004), pp. 528-535, Washington DC, USA, June 27-July 2, 2004.
[2] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341-2353, Dec. 2011.
[3] J. S. Marshall and W. Mc K. Palmer, "The distribution of raindrops with size," Journal of the Atmospheric Sciences, vol. 5, no. 4, pp. 165-166, 1948.
[4] S. K. Nayar and S. G. Narasimhan, "Vision in bad weather," IEEE International Conference on Computer Vision (ICCV-1999), vol. 2, pp. 820-827, Kerkyra, Greece, Sep. 20-27, 1999.
[5] K. Garg and S. K. Nayar, "Photorealistic rendering of rain streaks," ACM Transactions on Graphics, vol. 25, no. 3, pp. 996-1002, July 2006.
[6] X. Zhang, H. Li, Y. Qi, W. K. Leow, and T. K. Ng, "Rain removal in video by combining temporal and chromatic properties," IEEE International Conference on Multimedia and Expo (ICME-2006), pp. 461-464, Toronto, Ontario, Canada, July 9-12, 2006.
[7] K. Garg and S. K. Nayar, "Vision and rain," International Journal of Computer Vision, vol. 75, no. 1, pp. 3-27, 2007.
[8] P. Barnum, T. Kanade, and S. Narasimhan, "Spatio-temporal frequency analysis for removing rain and snow from videos," International Workshop on Photometric Analysis For Computer Vision (PACV-2007), Rio de Janeiro, Brazil, Oct. 2007.
[9] P. C. Barnum, S. Narasimhan, and T. Kanade, "Analysis of rain and snow in frequency space," International Journal of Computer Vision, vol. 86, no. 2, pp. 256-274, 2010.
[10] N. Brewer and N. Liu, "Using the shape characteristics of rain to identify and remove rain from video," Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, vol. 5342, pp. 451-458, Orlando, USA, Dec. 2008.
[11] M. Roser and A. Geiger, "Video-based raindrop detection for improved image registration," IEEE International Conference on Computer Vision Workshops (ICCV Workshops 2009), pp. 570-577, Kyoto, Japan, Sept. 29-Oct. 2, 2009.
[12] J. Bossu, N. Hautiere, and J. P. Tarel, "Rain or snow detection in image sequences through use of a histogram of orientation of streaks," International Journal of Computer Vision, vol. 93, no. 3, pp. 348-367, July 2011.
[13] J. C. Halimeh and M. Roser, "Raindrop detection on car windshields using geometric-photometric environment construction and intensity-based correlation," Intelligent Vehicles Symposium, Xi'an, China, June 3-5, 2009.
[14] Y. H. Fu, L. W. Kang, C. W. Lin, and C. T. Hsu, "Single-frame-based rain removal via image decomposition," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2011), pp. 1453-1456, Prague, Czech Republic, May 22-27, 2011.
[15] L. W. Kang, C. W. Lin, and Y. H. Fu, "Automatic single-image-based rain streaks removal via image decomposition," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1742-1755, April 2012.
[16] D. A. Huang, L. W. Kang, M. C. Yang, C. W. Lin, and Y. C. F. Wang, "Context-aware single image rain removal," IEEE International Conference on Multimedia and Expo (ICME-2012), pp. 164-169, Melbourne, Australia, July 9-13, 2012.
[17] D. A. Huang, L. W. Kang, Y. C. F. Wang, and C. W. Lin, "Self-learning based image decomposition with applications to single image denoising," IEEE Transactions on Multimedia, vol. 16, no. 1, pp. 83-93, January 2014.
[18] D. Y. Chen, C. C. Chen, and L. W. Kang, "Visual depth guided color image rain streaks removal using sparse coding," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 8, pp. 1430-1455, Aug. 2014.
[19] J. Xu, W. Zhao, P. Liu, and X. Tang, "An improved guidance image based method to remove rain and snow in a single image," Computer and Information Science, vol. 5, no. 3, pp. 49-55, May 2012.
[20] J. H. Kim, C. Lee, J. Y. Sim, and C. S. Kim, "Single-image deraining using an adaptive nonlocal means filter," IEEE International Conference on Image Processing (ICIP-2013), Melbourne, Australia, Sep. 15-18, 2013.
[21] Y. L. Chen and C. T. Hsu, "A generalized low-rank appearance model for spatio-temporally correlated rain streaks," IEEE International Conference on Computer Vision (ICCV-2013), pp. 1968-1975, Sydney, Australia, Dec. 1-8, 2013.
[22] X. H. Ding, L. Q. Chen, X. H. Zheng, Y. Huang, and D. L. Zeng, "Single image rain and snow removal via guided l0 smoothing filter," Multimedia Tools and Applications, vol. 24, no. 8, pp. 1-16, 2014.
[23] Y. Luo, Y. Xu, and H. Ji, "Removing rain from a single image via discriminative sparse coding," IEEE International Conference on Computer Vision (ICCV-2015), pp. 3397-3405, Santiago, Chile, Dec. 7-13, 2015.
[24] Y. Li, R. T. Tan, X. Guo, J. Lu, and M. S. Brown, "Rain streak removal using layer priors," IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2016), pp. 2736-2744, Las Vegas, Nevada, USA, June 26-July 1, 2016.
[25] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2005), vol. 1, pp. 886-893, San Diego, CA, USA, June 20-25, 2005.
[26] K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397-1409, June 2013.
[27] Y. Pang, H. Zhu, X. Li, and J. Pan, "Motion blur detection with an indicator function for surveillance machines," IEEE Transactions on Industrial Electronics, vol. 63, no. 9, pp. 5592-5601, 2016.
[28] Y. Pang, J. Cao, and X. Li, "Learning sampling distributions for efficient object detection," IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1-13, Jan. 2016.
[29] Y. Pang, H. Zhu, X. Li, and X. Li, "Classifying discriminative features for blur detection," IEEE Transactions on Cybernetics, vol. 46, no. 10, pp. 2220-2227, 2016.
[30] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," IEEE International Conference on Computer Vision (ICCV-1998), pp. 839-846, Bombay, India, 1998.
[31] J. L. Starck, M. Elad, and D. Donoho, "Redundant multiscale transforms and their application for morphological component analysis," Advances in Imaging and Electron Physics, vol. 132, no. 4, pp. 287-348, 2004.
[32] M. Aharon, M. Elad, and A. M. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, Nov. 2006.
[33] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online learning for matrix factorization and sparse coding," Journal of Machine Learning Research, vol. 11, pp. 19-60, Jan. 2010.
[34] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "Least angle regression," Annals of Statistics, vol. 32, no. 2, pp. 407-499, 2004.
[35] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004.


Yinglong Wang (S'16) received his B.S. degree from the School of Information Science and Engineering, Lanzhou University, Lanzhou, China, in 2011. He is currently working toward the Ph.D. degree at the University of Electronic Science and Technology of China (UESTC), Chengdu, China. His research interests focus on image processing and machine learning.

Shuaicheng Liu received his Ph.D. and M.S. degrees from the National University of Singapore (NUS), Singapore, in 2014 and 2010, respectively, and his B.E. from Sichuan University, Chengdu, China, in 2008. In 2014, he joined the University of Electronic Science and Technology of China (UESTC) and is currently an Associate Professor with the Institute of Image Processing, School of Electronic Engineering. His research interests include image and video processing, computational photography, and computer graphics and vision. He is a member of IEEE.

Chen Chen (S'15) received the B.S. degree in electronic engineering from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 2012. He is currently working toward the Ph.D. degree at The Hong Kong University of Science and Technology (HKUST), Kowloon, Hong Kong, China. He received the Top 10% paper award at IEEE VCIP 2016. He has been invited to be a peer reviewer for IEEE DSP 2015, 2016, and 2017, IEEE VCIP 2015, IJCS, and IEEE TCSVT. His research interests focus on video compression and multimedia processing.

Bing Zeng (M'91-SM'13-F'16) received his BEng and MEng degrees in electronic engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 1983 and 1986, respectively, and his PhD degree in electrical engineering from Tampere University of Technology, Tampere, Finland, in 1991.

He worked as a postdoctoral fellow at the University of Toronto from September 1991 to July 1992 and as a Researcher at Concordia University from August 1992 to January 1993. He then joined The Hong Kong University of Science and Technology (HKUST). After 20 years of service at HKUST, he returned to UESTC in the summer of 2013 through China's "1000-Talent-Scheme". At UESTC, he leads the Institute of Image Processing, working on image and video processing, 3D and multi-view video technology, and visual big data.

During his tenure at HKUST and UESTC, he has graduated more than 30 Master's and Ph.D. students, received about 20 research grants, filed 8 international patents, and published more than 200 papers. Three representative works are as follows: one paper on fast block motion estimation, published in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) in 1994, has so far been SCI-cited more than 1000 times (Google-cited more than 2100 times) and currently stands at the 7th position among all papers published in this Transactions; one paper on smart padding for arbitrarily-shaped image blocks, published in IEEE TCSVT in 2001, led to a patent that has been successfully licensed to companies; and one paper on the directional discrete cosine transform (DDCT), published in IEEE TCSVT in 2008, received the 2011 IEEE TCSVT Best Paper Award. He also received the best paper award at ChinaCom three times (2009, 2010, and 2012).

He served as an Associate Editor of IEEE TCSVT for 8 years and received the Best Associate Editor Award in 2011. He was General Co-Chair of IEEE VCIP-2016, held in Chengdu, China, in November 2016. Currently, he is on the Editorial Board of the Journal of Visual Communication and Image Representation and serves as General Co-Chair of PCM-2017. He received a 2nd Class Natural Science Award (as the first recipient) from the Chinese Ministry of Education in 2014 and was elected an IEEE Fellow in 2016 for contributions to image and video coding.
