
DehazeNet: An End-to-End System for Single Image Haze Removal

Bolun Cai, Xiangmin Xu, Member, IEEE, Kui Jia, Member, IEEE, Chunmei Qing, Member, IEEE, and Dacheng Tao, Fellow, IEEE

arXiv:1601.07661v2 [cs.CV] 17 May 2016

B. Cai, X. Xu (corresponding author) and C. Qing are with the School of Electronic and Information Engineering, South China University of Technology, Wushan RD., Tianhe District, Guangzhou, P.R. China. E-mail: {[email protected], [email protected], [email protected]}.
K. Jia is with the Department of Electrical and Computer Engineering, Faculty of Science and Technology, University of Macau, Macau 999078, China. E-mail: {[email protected]}.
D. Tao is with the Centre for Quantum Computation & Intelligent Systems, Faculty of Engineering & Information Technology, University of Technology Sydney, 235 Jones Street, Ultimo, NSW 2007, Australia. E-mail: {[email protected]}.

Abstract—Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to get plausible dehazing solutions. The key to achieving haze removal is to estimate a medium transmission map for an input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet for medium transmission estimation. DehazeNet takes a hazy image as input, and outputs its medium transmission map that is subsequently used to recover a haze-free image via the atmospheric scattering model. DehazeNet adopts a Convolutional Neural Network (CNN) based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing. Specifically, layers of Maxout units are used for feature extraction, which can generate almost all haze-relevant features. We also propose a novel nonlinear activation function in DehazeNet, called Bilateral Rectified Linear Unit (BReLU), which is able to improve the quality of the recovered haze-free image. We establish connections between components of the proposed DehazeNet and those used in existing methods. Experiments on benchmark images show that DehazeNet achieves superior performance over existing methods, yet keeps efficient and easy to use.

Keywords—Dehaze, image restoration, deep CNN, BReLU.

I. INTRODUCTION

HAZE is a traditional atmospheric phenomenon where dust, smoke and other dry particles obscure the clarity of the atmosphere. Haze causes issues in the area of terrestrial photography, where the light penetration of dense atmosphere may be necessary to image distant subjects. This results in the visual effect of a loss of contrast in the subject, due to the effect of light scattering through the haze particles. For these reasons, haze removal is desired in both consumer photography and computer vision applications.

Haze removal is a challenging problem because the haze transmission depends on the unknown depth, which varies at different positions. Various techniques of image enhancement have been applied to the problem of removing haze from a single image, including histogram-based [1], contrast-based [2] and saturation-based [3] methods. In addition, methods using multiple images or depth information have also been proposed. For example, polarization-based methods [4] remove the haze effect through multiple images taken with different degrees of polarization. In [5], multi-constraint based methods are applied to multiple images capturing the same scene under different weather conditions. Depth-based methods [6] require some depth information from user inputs or known 3D models. In practice, depth information or multiple hazy images are not always available.

Single image haze removal has made significant progress recently, due to the use of better assumptions and priors. Specifically, under the assumption that the local contrast of the haze-free image is much higher than that of the hazy image, a local contrast maximizing method [7] based on Markov Random Fields (MRF) is proposed for haze removal. Although the contrast maximizing approach is able to achieve impressive results, it tends to produce over-saturated images. In [8], Independent Component Analysis (ICA) based on minimal input is proposed to remove the haze from color images, but the approach is time-consuming and cannot be used to deal with dense-haze images. Inspired by the dark-object subtraction technique, the Dark Channel Prior (DCP) [9] is discovered based on empirical statistics of experiments on haze-free images, which shows that at least one color channel has some pixels with very low intensities in most non-haze patches. With the dark channel prior, the thickness of haze is estimated and removed by the atmospheric scattering model. However, DCP loses dehazing quality in sky images and is computationally intensive. Some improved algorithms have been proposed to overcome these limitations. To improve dehazing quality, Nishino et al. [10] model the image with a factorial MRF to estimate the scene radiance more accurately; Meng et al. [11] propose an effective regularization dehazing method to restore the haze-free image by exploring the inherent boundary constraint. To improve computational efficiency, standard median filtering [12], the median of median filter [13], guided joint bilateral filtering [14] and the guided image filter [15] are used to replace the time-consuming soft matting [16]. In recent years, haze-relevant priors have been investigated in a machine learning framework. Tang et al. [17] combine four types of haze-relevant features with Random Forests to estimate the transmission. Zhu et al. [18] create a linear model for estimating the scene depth of the hazy image under the color attenuation prior and learn the parameters of the model with a supervised method. Despite the remarkable progress, these state-of-the-art methods are limited by the very same haze-relevant priors or heuristic cues: they are often less effective for some images.
Haze removal from a single image is a difficult vision task. In contrast, the human brain can quickly identify the hazy area in natural scenery without any additional information. One might be tempted to propose biologically inspired models for image dehazing, following the success of bio-inspired CNNs for high-level vision tasks such as image classification [19], face recognition [20] and object detection [21]. In fact, there have been a few (convolutional) neural network based deep learning methods recently proposed for low-level vision tasks of image restoration/reconstruction [22], [23], [24]. However, these methods cannot be directly applied to single image haze removal.

Note that, apart from the estimation of a global atmospheric light magnitude, the key to achieving haze removal is to recover an accurate medium transmission map. To this end, we propose DehazeNet, a trainable CNN-based end-to-end system for medium transmission estimation. DehazeNet takes a hazy image as input, and outputs its medium transmission map that is subsequently used to recover the haze-free image by a simple pixel-wise operation. The design of DehazeNet borrows ideas from established assumptions/principles in image dehazing, while the parameters of all its layers can be automatically learned from training hazy images. Experiments on benchmark images show that DehazeNet gives superior performance over existing methods, yet keeps efficient and easy to use. Our main contributions are summarized as follows.

1) DehazeNet is an end-to-end system. It directly learns and estimates the mapping relations between hazy image patches and their medium transmissions. This is achieved by the special design of its deep architecture to embody established image dehazing principles.
2) We propose a novel nonlinear activation function in DehazeNet, called Bilateral Rectified Linear Unit (BReLU)¹. BReLU extends the Rectified Linear Unit (ReLU) and demonstrates its significance in obtaining accurate image restoration. Technically, BReLU uses a bilateral restraint to reduce the search space and improve convergence.
3) We establish connections between components of DehazeNet and the assumptions/priors used in existing dehazing methods, and explain that DehazeNet improves over these methods by automatically learning all these components from end to end.

The remainder of this paper is organized as follows. In Section II, we review the atmospheric scattering model and haze-relevant features, which provides background knowledge to understand the design of DehazeNet. In Section III, we present details of the proposed DehazeNet, and discuss how it relates to existing methods. Experiments are presented in Section IV, before the conclusion is drawn in Section V.

¹During the preparation of this manuscript (in December, 2015), we found that a nonlinear activation function called adjustable bounded rectifier was proposed in [25] (arXived in November, 2015), which is almost identical to BReLU. Adjustable bounded rectifier is motivated to achieve the objective of image recognition. In contrast, BReLU is proposed here to improve image restoration accuracy. It is interesting that we come to the same activation function from completely different initial objectives. This may also suggest the general usefulness of the proposed BReLU.

II. RELATED WORKS

Many image dehazing methods have been proposed in the literature. In this section, we briefly review some important ones, paying attention to those proposing the atmospheric scattering model, which is the basic underlying model of image dehazing, and those proposing useful assumptions for computing haze-relevant features.

A. Atmospheric Scattering Model

To describe the formation of a hazy image, the atmospheric scattering model was first proposed by McCartney [26] and further developed by Narasimhan and Nayar [27], [28]. The atmospheric scattering model can be formally written as

I(x) = J(x) t(x) + α (1 − t(x)), (1)

where I(x) is the observed hazy image, J(x) is the real scene to be recovered, t(x) is the medium transmission, α is the global atmospheric light, and x indexes pixels in the observed hazy image I. Fig. 1 gives an illustration. There are three unknowns in equation (1), and the real scene J(x) can be recovered after α and t(x) are estimated.

The medium transmission map t(x) describes the portion of the light that is not scattered and reaches the camera. t(x) is defined as

t(x) = e^(−β d(x)), (2)

where d(x) is the distance from the scene point to the camera, and β is the scattering coefficient of the atmosphere. Equation (2) suggests that when d(x) goes to infinity, t(x) approaches zero. Together with equation (1) we have

α = I(x), d(x) → ∞. (3)

In practical imaging of a distant view, d(x) cannot be infinity, but rather a long distance that gives a very low transmission t0. Instead of relying on equation (3) to get the global atmospheric light α, it is more stably estimated based on the following rule:

α = max_{y ∈ {x | t(x) ≤ t0}} I(y). (4)

The discussion above suggests that to recover a clean scene (i.e., to achieve haze removal), the key is to estimate an accurate medium transmission map.

B. Haze-relevant Features

Image dehazing is an inherently ill-posed problem. Based on empirical observations, existing methods propose various assumptions or prior knowledge that are utilized to compute intermediate haze-relevant features. Final haze removal can then be achieved based on these haze-relevant features.
Fig. 1. Imaging in hazy weather and the atmospheric scattering model. (a) The process of imaging in hazy weather: the transmission attenuation J(x)t(x), caused by the reduction in reflected energy, leads to low brightness intensity; the airlight α(1 − t(x)), formed by the scattering of the environmental illumination, enhances the brightness and reduces the saturation. (b) Atmospheric scattering model: the observed hazy image I(x) is generated by the real scene J(x), the medium transmission t(x) and the global atmospheric light α.

1) Dark Channel: The dark channel prior is based on a wide observation on outdoor haze-free images: in most haze-free patches, at least one color channel has some pixels whose intensity values are very low and even close to zero. The dark channel [9] is defined as the minimum of all pixel colors in a local patch:

D(x) = min_{y ∈ Ω_r(x)} ( min_{c ∈ {r,g,b}} I^c(y) ), (5)

where I^c is an RGB color channel of I and Ω_r(x) is a local patch centered at x with the size of r × r. The dark channel feature has a high correlation to the amount of haze in the image, and is used to estimate the medium transmission as t(x) ∝ 1 − D(x) directly.

2) Maximum Contrast: According to the atmospheric scattering, the contrast of the image is reduced by the haze transmission, as Σ_x ||∇I(x)|| = t Σ_x ||∇J(x)|| ≤ Σ_x ||∇J(x)||. Based on this observation, the local contrast [7] is defined as the variance of pixel intensities in an s × s local patch Ω_s with respect to the center pixel, and the contrast feature is the local maximum of the local contrast values in an r × r region Ω_r:

C(x) = max_{y ∈ Ω_r(x)} sqrt( (1 / |Ω_s(y)|) Σ_{z ∈ Ω_s(y)} ||I(z) − I(y)||² ), (6)

where |Ω_s(y)| is the cardinality of the local neighborhood. The correlation between the contrast feature and the medium transmission t is visually obvious, so the visibility of the image can be enhanced by maximizing the local contrast shown in (6).

3) Color Attenuation: The saturation I^s(x) of a patch decreases sharply while the color of the scene fades under the influence of the haze, and the brightness value I^v(x) increases at the same time, producing a high value for their difference. According to the color attenuation prior [18], the difference between the brightness and the saturation is utilized to estimate the concentration of the haze:

A(x) = I^v(x) − I^s(x), (7)

where I^v(x) and I^s(x) can be expressed in the HSV color space as I^v(x) = max_{c ∈ {r,g,b}} I^c(x) and I^s(x) = (max_{c ∈ {r,g,b}} I^c(x) − min_{c ∈ {r,g,b}} I^c(x)) / max_{c ∈ {r,g,b}} I^c(x). The color attenuation feature is proportional to the scene depth, d(x) ∝ A(x), and is easily used for transmission estimation.

4) Hue Disparity: The hue disparity between the original image I(x) and its semi-inverse image, I_si(x) = max[I^c(x), 1 − I^c(x)] with c ∈ {r, g, b}, has been used to detect haze. For haze-free images, pixel values in the three channels of their semi-inverse images will not all flip, resulting in large hue changes between I_si(x) and I(x). In [29], the hue disparity feature is defined as

H(x) = | I_si^h(x) − I^h(x) |, (8)

where the superscript "h" denotes the hue channel of the image in the HSV color space. According to (8), the medium transmission t(x) is inversely proportional to H(x).
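The four features above reduce to per-patch extremum and color-space operations. The following sketch (ours, not the authors' code; SciPy's minimum/maximum/uniform filters stand in for the patch operators, and the patch sizes r = 15, s = 5 are illustrative) computes (5)-(8) for a float RGB image in [0, 1]:

```python
import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter, uniform_filter
from skimage.color import rgb2hsv

def dark_channel(I, r=15):
    """Equation (5): min over color channels, then min over an r x r patch."""
    return minimum_filter(I.min(axis=2), size=r)

def max_contrast(I, s=5, r=15):
    """Equation (6): local RMS deviation around each pixel, then a local
    maximum over an r x r region."""
    mean_sq = uniform_filter(I**2, size=(s, s, 1)).sum(axis=2)
    mean = uniform_filter(I, size=(s, s, 1))
    # E_z ||I(z) - I(y)||^2 = E||I(z)||^2 - 2 I(y).E[I(z)] + ||I(y)||^2
    var = mean_sq - 2.0 * (I * mean).sum(axis=2) + (I**2).sum(axis=2)
    return maximum_filter(np.sqrt(np.maximum(var, 0.0)), size=r)

def color_attenuation(I):
    """Equation (7): brightness minus saturation in HSV space."""
    hsv = rgb2hsv(I)
    return hsv[..., 2] - hsv[..., 1]

def hue_disparity(I):
    """Equation (8): hue difference between I and its semi-inverse
    (hue wrap-around is ignored for brevity)."""
    I_si = np.maximum(I, 1.0 - I)
    return np.abs(rgb2hsv(I_si)[..., 0] - rgb2hsv(I)[..., 0])
```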
III. THE PROPOSED DEHAZENET

The atmospheric scattering model in Section II-A suggests that estimation of the medium transmission map is the most important step to recover a haze-free image. To this end, we propose DehazeNet, a trainable end-to-end system that explicitly learns the mapping relations between raw hazy images and their associated medium transmission maps. In this section, we present the layer designs of DehazeNet, and discuss how these designs are related to ideas in existing image dehazing methods. The final pixel-wise operation to get a recovered haze-free image from the estimated medium transmission map will be presented in Section IV.

A. Layer Designs of DehazeNet

The proposed DehazeNet consists of cascaded convolutional and pooling layers, with appropriate nonlinear activation functions employed after some of these layers. Fig. 2 shows the architecture of DehazeNet. The layers and nonlinear activations of DehazeNet are designed to implement four sequential operations for medium transmission estimation, namely, feature extraction, multi-scale mapping, local extremum, and nonlinear regression. We detail these designs as follows.

Fig. 2. The architecture of DehazeNet. DehazeNet conceptually consists of four sequential operations (feature extraction, multi-scale mapping, local extremum and non-linear regression), which are constructed by 3 convolution layers, a max-pooling, a Maxout unit and a BReLU activation function.

1) Feature Extraction: To address the ill-posed nature of the image dehazing problem, existing methods propose various assumptions, and based on these assumptions they are able to extract haze-relevant features (e.g., dark channel, hue disparity, and color attenuation) densely over the image domain. Note that densely extracting these haze-relevant features is equivalent to convolving an input hazy image with appropriate filters, followed by nonlinear mappings. Inspired by the extremum processing in the color channels of those haze-relevant features, an unusual activation function called the Maxout unit [30] is selected as the non-linear mapping for dimension reduction. The Maxout unit is a simple feed-forward nonlinear activation function used in multi-layer perceptrons or CNNs. When used in CNNs, it generates a new feature map by taking a pixel-wise maximization operation over k affine feature maps. Based on the Maxout unit, we design the first layer of DehazeNet as follows:

F1^i(x) = max_{j ∈ [1,k]} f1^{i,j}(x), f1^{i,j} = W1^{i,j} ∗ I + B1^{i,j}, (9)

where W1 = {W1^{i,j}}_{(i,j)=(1,1)}^{(n1,k)} and B1 = {B1^{i,j}}_{(i,j)=(1,1)}^{(n1,k)} represent the filters and biases respectively, and ∗ denotes the convolution operation. Here, there are n1 output feature maps in the first layer. W1^{i,j} ∈ R^{3×f1×f1} is one of the total k × n1 convolution filters, where 3 is the number of channels in the input image I(x), and f1 is the spatial size of a filter (detailed in Table I). The Maxout unit maps each kn1-dimensional vector into an n1-dimensional one, and extracts the haze-relevant features by automatic learning rather than the heuristic ways of existing methods.

2) Multi-scale Mapping: In [17], multi-scale features have been proven effective for haze removal, where features of an input image are densely computed at multiple spatial scales. Multi-scale feature extraction is also effective to achieve scale invariance. For example, the inception architecture in GoogLeNet [31] uses parallel convolutions with varying filter sizes, and better addresses the issue of aligning objects in input images, resulting in state-of-the-art performance in ILSVRC14 [32]. Motivated by these successes of multi-scale feature extraction, we choose to use parallel convolutional operations in the second layer of DehazeNet, where the size of any convolution filter is among 3 × 3, 5 × 5 and 7 × 7, and we use the same number of filters for these three scales. Formally, the output of the second layer is written as

F2^i = W2^{⌈i/3⌉,(i∖3)} ∗ F1 + B2^{⌈i/3⌉,(i∖3)}, (10)

where W2 = {W2^{p,q}}_{(p,q)=(1,1)}^{(3,n2/3)} and B2 = {B2^{p,q}}_{(p,q)=(1,1)}^{(3,n2/3)} contain n2 pairs of parameters that are broken up into 3 groups. n2 is the output dimension of the second layer, and i ∈ [1, n2] indexes the output feature maps. ⌈·⌉ denotes rounding up to an integer and ∖ denotes the remainder operation.

3) Local Extremum: To achieve spatial invariance, the cortical complex cells in the visual cortex receive responses from the simple cells for linear feature integration. Lampl et al. [33] proposed that the spatial integration properties of complex cells can be described by a series of pooling operations. According to the classical architecture of CNNs [34], the neighborhood maximum is considered under each pixel to overcome local sensitivity. In addition, the local extremum is in accordance with the assumption that the medium transmission is locally constant, and it is commonly used to overcome the noise of transmission estimation. Therefore, we use a local extremum operation in the third layer of DehazeNet:

F3^i(x) = max_{y ∈ Ω(x)} F2^i(y), (11)

where Ω(x) is an f3 × f3 neighborhood centered at x, and the output dimension of the third layer is n3 = n2. In contrast to max-pooling in CNNs, which usually reduces the resolution of feature maps, the local extremum operation here is densely applied to every feature map pixel, and is able to preserve resolution for use in image restoration.
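A compact way to see how (9)-(11) compose is the following PyTorch sketch (our illustration, not the authors' released Caffe model; filter counts follow Table I, while the padding choices are ours to keep the parallel maps aligned):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HazeFeatures(nn.Module):
    """Layers 1-3 of DehazeNet as described in (9)-(11): Maxout feature
    extraction, multi-scale mapping, and a dense (stride-1) local extremum."""

    def __init__(self, k=4, n1=4, n2=48):
        super().__init__()
        self.k, self.n1 = k, n1
        self.conv1 = nn.Conv2d(3, k * n1, kernel_size=5)     # k*n1 affine maps
        self.conv2_3 = nn.Conv2d(n1, n2 // 3, 3, padding=1)  # parallel scales
        self.conv2_5 = nn.Conv2d(n1, n2 // 3, 5, padding=2)
        self.conv2_7 = nn.Conv2d(n1, n2 // 3, 7, padding=3)

    def forward(self, x):
        f1 = self.conv1(x)                                   # equation (9)
        b, _, h, w = f1.shape
        f1 = f1.view(b, self.n1, self.k, h, w).max(dim=2).values  # Maxout
        f2 = torch.cat([self.conv2_3(f1), self.conv2_5(f1),
                        self.conv2_7(f1)], dim=1)            # equation (10)
        # Dense local extremum, equation (11): stride-1 max pooling that
        # preserves the feature map resolution.
        return F.max_pool2d(f2, kernel_size=7, stride=1)
```

On a 3 × 16 × 16 training patch this reproduces the 48 × 6 × 6 tensor that the non-linear regression layer consumes (Table I).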
Fig. 3. Rectified Linear Unit (ReLU) and Bilateral Rectified Linear Unit (BReLU). (a) ReLU; (b) BReLU.

Fig. 4. Filter weights and the Maxout unit in the first layer operation F1. (a) Opposite filter; (b) all-pass filter; (c) round filter; (d) Maxout; (e) the actual kernels learned by DehazeNet.

4) Non-linear Regression: Standard choices of nonlinear activation functions in deep networks include the Sigmoid [35] and the Rectified Linear Unit (ReLU). The former more easily suffers from the vanishing gradient, which may lead to slow convergence or poor local optima in network training. To overcome the problem of the vanishing gradient, ReLU is proposed [36], which offers sparse representations. However, ReLU is designed for classification problems and is not perfectly suitable for regression problems such as image restoration. In particular, ReLU inhibits values only when they are less than zero. This might lead to response overflow, especially in the last layer, because for image restoration the output values of the last layer are supposed to be both lower and upper bounded in a small range. To this end, we propose a Bilateral Rectified Linear Unit (BReLU) activation function, shown in Fig. 3, to overcome this limitation. Inspired by Sigmoid and ReLU, BReLU as a novel linear unit keeps the bilateral restraint and local linearity. Based on the proposed BReLU, the feature map of the fourth layer is defined as

F4 = min(t_max, max(t_min, W4 ∗ F3 + B4)), (12)

where W4 = {W4} contains a filter with the size of n3 × f4 × f4, B4 = {B4} contains a bias, and t_min and t_max are the marginal values of BReLU (t_min = 0 and t_max = 1 in this paper). According to (12), the gradient of this activation function can be shown as

∂F4(x)/∂F3 = ∂(W4 ∗ F3 + B4)(x)/∂F3 if t_min ≤ F4(x) < t_max, and 0 otherwise. (13)

The above four layers are cascaded together to form a CNN-based trainable end-to-end system, where the filters and biases associated with the convolutional layers are the network parameters to be learned. We note that the designs of these layers can be connected with expertise in existing image dehazing methods, which we specify in the next subsection.

B. Connections with Traditional Dehazing Methods

The first-layer feature F1 in DehazeNet is designed for haze-relevant feature extraction. Take the dark channel feature [9] as an example. If the weight W1 is an opposite filter (a sparse matrix with the value −1 at the center of one channel, as in Fig. 4(a)) and B1 is a unit bias, then the maximum output of the feature map is equivalent to the minimum of the color channels, which is similar to the dark channel [9] (see Equation (5)). In the same way, when the weight is a round filter as in Fig. 4(c), F1 is similar to the maximum contrast [7] (see Equation (6)); when W1 includes all-pass filters and opposite filters, F1 is similar to the maximum and minimum feature maps, which are atomic operations of the color space transformation from RGB to HSV, so the color attenuation [18] (see Equation (7)) and hue disparity [29] (see Equation (8)) features are extracted. In conclusion, upon the success of the filter learning shown in Fig. 4(e), almost all haze-relevant features can potentially be extracted from the first layer of DehazeNet. On the other hand, Maxout activation functions can be considered as piece-wise linear approximations to arbitrary convex functions. In this paper, we choose the maximum across four feature maps (k = 4) to approximate an arbitrary convex function, as shown in Fig. 4(d).

White-colored objects in an image are similar to heavy haze scenes, which usually have high brightness and low saturation. Therefore, almost all haze estimation models tend to consider white-colored scene objects as being distant, resulting in inaccurate estimation of the medium transmission. Based on the assumption that the scene depth is locally constant, a local extremum filter is commonly used to overcome this problem [9], [18], [7]. In DehazeNet, the local maximum filters of the third layer operation remove this local estimation error. In addition, the direct attenuation term J(x)t(x) can be very close to zero when the transmission t(x) is close to zero, so the directly recovered scene radiance J(x) is prone to noise. In DehazeNet, we propose BReLU to restrict the values of the transmission between t_min and t_max, thus alleviating the noise problem. Note that BReLU is equivalent to the boundary constraints used in traditional methods [9], [18].
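For reference, BReLU in (12) is a hard clamp whose gradient matches the piecewise form of (13): identity inside [t_min, t_max] and zero outside. A minimal PyTorch sketch (ours) that also completes the regression layer of Table I:

```python
import torch
import torch.nn as nn

class BReLU(nn.Module):
    """Bilateral ReLU, equation (12): clamp into [t_min, t_max]. Autograd
    through clamp gives the piecewise gradient of equation (13)."""

    def __init__(self, t_min=0.0, t_max=1.0):
        super().__init__()
        self.t_min, self.t_max = t_min, t_max

    def forward(self, x):
        return torch.clamp(x, self.t_min, self.t_max)

# Non-linear regression layer of Table I: one 6x6 filter over 48 maps
# followed by BReLU, mapping a 48x6x6 tensor to a single transmission value.
regression = nn.Sequential(nn.Conv2d(48, 1, kernel_size=6), BReLU())
```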
C. Training of DehazeNet

1) Training Data: It is in general costly to collect a vast amount of labelled data for training deep models [19]. For the training of DehazeNet, it is even more difficult, as pairs of hazy and haze-free images of natural scenes (or pairs of hazy images and their associated medium transmission maps) are not massively available. Instead, we resort to synthesized training data based on the physical haze formation model [17]. More specifically, we synthesize training pairs of hazy and haze-free image patches based on two assumptions [17]: first, image content is independent of medium transmission (the same image content can appear at any depth of a scene); second, medium transmission is locally constant (image pixels in a small patch tend to have similar depths). These assumptions suggest that we can assume an arbitrary transmission for an individual image patch. Given a haze-free patch J^P(x), the atmospheric light α, and a random transmission t ∈ (0, 1), a hazy patch is synthesized as I^P(x) = J^P(x) t + α (1 − t). To reduce the uncertainty in variable learning, the atmospheric light α is set to 1.

In this work, we collect haze-free images from the Internet, and randomly sample from them patches of size 16 × 16. Different from [17], these haze-free images include not only those capturing people's daily life, but also those of natural and city landscapes, since we believe that this variety of training samples can be learned into the filters of DehazeNet. Fig. 5 shows examples of our collected haze-free images.

Fig. 5. Example haze-free training images collected from the Internet.

2) Training Method: In DehazeNet, supervised learning requires the mapping relationship F between RGB values and medium transmissions. The network parameters Θ = {W1, W2, W4, B1, B2, B4} are obtained by minimizing the loss function between the training patch I^P(x) and the corresponding ground truth medium transmission t. Given a set of hazy image patches and their corresponding medium transmissions, where the hazy patches are synthesized from haze-free patches as described above, we use the Mean Squared Error (MSE) as the loss function:

L(Θ) = (1/N) Σ_{i=1}^{N} || F(I_i^P; Θ) − t_i ||². (14)

Stochastic gradient descent (SGD) is used to train DehazeNet. We implement our model using the Caffe package [37]. Detailed configurations and parameter settings of our proposed DehazeNet (as shown in Fig. 2) are summarized in Table I, which includes 3 convolutional layers and 1 max-pooling layer, with Maxout and BReLU activations used respectively after the first and last convolutional operations.

TABLE I. THE ARCHITECTURE OF THE DEHAZENET MODEL

Formulation           | Type    | Input Size   | Num n | Filter f×f | Pad
Feature Extraction    | Conv    | 3 × 16 × 16  | 16    | 5 × 5      | 0
                      | Maxout  | 16 × 12 × 12 | 4     | –          | 0
Multi-scale Mapping   | Conv    | 4 × 12 × 12  | 16    | 3 × 3      | 1
                      |         |              | 16    | 5 × 5      | 2
                      |         |              | 16    | 7 × 7      | 3
Local Extremum        | Maxpool | 48 × 12 × 12 | –     | 7 × 7      | 0
Non-linear Regression | Conv    | 48 × 6 × 6   | 1     | 6 × 6      | 0
                      | BReLU   | 1 × 1        | 1     | –          | 0

IV. EXPERIMENTS

To verify the architecture of DehazeNet, we analyze its convergence and compare it with state-of-the-art methods, including FVR [38], DCP [9], BCCR [11], ATM [39], BPNN [40], RF [17] and CAP [18].

Regarding the training data, 10,000 haze-free patches are sampled randomly from the images collected from the Internet. For each patch, we uniformly sample 10 random transmissions t ∈ (0, 1) to generate 10 hazy patches. Therefore, a total of 100,000 synthetic patches are generated for DehazeNet training. In DehazeNet, the filter weights of each layer are initialized by drawing randomly from a Gaussian distribution (with mean µ = 0 and standard deviation σ = 0.001), and the biases are set to 0. The learning rate decreases by half from 0.005 to 3.125e-4 every 100,000 iterations. Based on the parameters above, DehazeNet is trained (for 500,000 iterations with a batch size of 128) on a PC with an Nvidia GeForce GTX 780 GPU.

Based on the transmission estimated by DehazeNet and the atmospheric scattering model, haze-free images are restored as in traditional methods. Because of the local extremum in the third layer, blocking artifacts appear in the transmission map obtained from DehazeNet. To refine the transmission map, guided image filtering [15] is used to smooth the image. Referring to Equation (4), the boundary value at the 0.1 percent intensity is chosen as t0 in the transmission map, and we select the highest-intensity pixel in the corresponding hazy image I(x) among x ∈ {y | t(y) ≤ t0} as the atmospheric light α. Given the medium transmission t(x) and the atmospheric light α, the haze-free image J(x) is recovered easily. For convenience, Equation (1) is rewritten as follows:

J(x) = (I(x) − α (1 − t(x))) / t(x). (15)

Although DehazeNet is based on CNNs, the lightened network effectively guarantees realtime performance and runs without GPUs. The entire dehazing framework is tested in MATLAB 2014A with only a CPU (Intel i7 3770, 3.4GHz), and it processes a 640 × 480 image in approximately 1.5 seconds.
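Putting the estimated transmission to use, the following NumPy sketch (ours; the lower bound on t(x) is our own safeguard against division by near-zero values, and the guided-filter refinement of [15] is omitted) implements the airlight selection described above and the recovery in (15):

```python
import numpy as np

def recover_scene(I, t, t_floor=0.1):
    """Recover J(x) from a hazy image and its transmission map, equation (15).

    I: hazy image in [0, 1], shape (H, W, 3); t: transmission, shape (H, W).
    """
    t0 = np.percentile(t, 0.1)         # 0.1-percent boundary value as t0
    alpha = I[t <= t0].max(axis=0)     # brightest pixel in the opaque region

    t_clip = np.clip(t, t_floor, 1.0)[..., None]   # avoid division blow-up
    J = (I - alpha * (1.0 - t_clip)) / t_clip      # equation (15)
    return np.clip(J, 0.0, 1.0), alpha
```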
Fig. 6. The training process with different low-dimensional mappings in F1.

Fig. 7. The training process with different activation functions in F4.

TABLE II. THE RESULTS OF USING DIFFERENT FILTER NUMBERS OR SIZES IN DEHAZENET (×10−2)

Filter               | Architecture | Train MSE | Test MSE | #Param
Number (n1-n2)       | 4-(16×3)     | 1.090     | 1.190    | 8,240
                     | 8-(32×3)     | 0.972     | 1.138    | 27,104
                     | 16-(64×3)    | 0.902     | 1.112    | 96,704
F2 Size (f1-f2-f3-f4)| 5-3-7-6      | 1.184     | 1.219    | 4,656
                     | 5-5-7-6      | 1.133     | 1.225    | 7,728
                     | 5-7-7-6      | 1.021     | 1.184    | 12,336
                     | 5-M-7-6      | 1.090     | 1.190    | 8,240
F4 Size (f1-f2-f3-f4)| 5-M-6-7      | 1.077     | 1.192    | 8,864
                     | 5-M-7-6      | 1.090     | 1.190    | 8,240
                     | 5-M-8-5      | 1.103     | 1.201    | 7,712

A. Model and performance

In DehazeNet, there are two layers with special designs for transmission estimation: the feature extraction F1 and the non-linear regression F4. To prove the effectiveness of DehazeNet, two traditional CNNs (SRCNN [41] and CNN-L [23]) with the same number of 3 layers are regarded as baseline models. The numbers of parameters of DehazeNet, SRCNN, and CNN-L are 8,240, 18,400, and 67,552 respectively.

1) Maxout unit in feature extraction F1: The activation unit in F1 is a non-linear dimension reduction that approximates traditional haze-relevant feature extraction. In the field of image processing, low-dimensional mapping is a core procedure for discovering principal attributes and for reducing pattern noise. For example, PCA [42] and LDA [43], as classical linear low-dimensional mappings, are widely used in computer vision and data mining. In [23], a non-linear sparse low-dimensional mapping with ReLU is used for high-resolution reconstruction. As an unusual low-dimensional mapping, the Maxout unit maximizes over feature maps to discover the prior knowledge of hazy images. Therefore, the following experiment is designed to confirm the validity of the Maxout unit. Following [23], a linear unit maps a 16-dimensional vector into a 4-dimensional vector, which is equivalent to applying 4 filters with a size of 16 × 1 × 1; the sparse low-dimensional mapping connects a ReLU to the linear unit.

Fig. 6 presents the training process of DehazeNet with the Maxout unit, compared with ReLU and the linear unit. We observe in Fig. 6 that the speed of convergence for the Maxout network is faster than that for ReLU and the linear unit. In addition, the values in brackets give the convergence results: the performance of Maxout is improved by approximately 0.30e-2 compared with ReLU and the linear unit. The reason is that the Maxout unit provides the equivalent function of almost all haze-relevant features, and alleviates the disadvantage of simple piecewise functions such as ReLU.

2) BReLU in non-linear regression F4: BReLU is a novel activation function that is useful for image restoration and reconstruction. Inspired by ReLU and Sigmoid, BReLU is designed with a bilateral restraint and local linearity. The bilateral restraint applies an a priori constraint to reduce the scale of the solution space; the local linearity overcomes the vanishing gradient to gain better precision. In the contrast experiment, ReLU and Sigmoid are used to take the place of BReLU in the non-linear regression layer. For ReLU, F4 can be rewritten as F4 = max(0, W4 ∗ F3 + B4), and for Sigmoid, it can be rewritten as F4 = 1/(1 + exp(−W4 ∗ F3 − B4)).

Fig. 7 shows the training process using different activation functions in F4. BReLU has a better convergence rate than ReLU and Sigmoid, especially during the first 50,000 iterations. The convergence precisions show that the performance of BReLU is improved by approximately 0.05e-2 compared with ReLU and by 0.20e-2 compared with Sigmoid. Fig. 8 plots the predicted transmission versus the ground truth transmission on the test patches. Clearly, the predicted transmission centers around the 45 degree line in the BReLU result. However, the predicted transmission of ReLU is always higher than the true transmission, and there are some predicted transmissions over the limit value t_max = 1. Due to the curvature of the Sigmoid function, the predicted transmission is far away from the true transmission, close to 0 and 1. The MSE on the test set with BReLU is 1.19e-2, and those of ReLU and Sigmoid are 1.28e-2 and 1.46e-2, respectively.
Fig. 8. Predicted versus ground-truth transmission for different activation functions in the non-linear regression F4.

B. Filter number and size

To investigate the best trade-off between performance and parameter size, we progressively modify the parameters of DehazeNet. Based on the default settings of DehazeNet, two experiments are conducted: (1) one with a larger filter number, and (2) the other with a different filter size. Similar to Sec. III-C2, these models are trained on the same dataset, and Table II shows the training/testing MSEs with the corresponding parameter settings.

In general, the performance improves when increasing the network width. It is clear that superior performance can be achieved by increasing the number of filters. However, if a fast dehazing speed is desired, a small network is preferred, which can still achieve better performance than other popular methods. In this paper, the lightened network is adopted in the following experiments.

In addition, we examine the network sensitivity to different filter sizes. The default network setting, whose specifics are shown in Table I, is denoted as 5-M-7-6. We first analyze the effects of varying filter sizes in the second layer F2. Table II indicates that a reasonably larger filter size in F2 can grasp richer structural information, which in turn leads to better results. The multi-scale feature mapping with filter sizes of 3/5/7 is also adopted in F2 of DehazeNet, and it achieves a testing MSE similar to that of the single-scale case with a 7 × 7 filter. Moreover, we demonstrate in Section IV-D that the multi-scale mapping is able to improve the scale robustness.

We further examine networks with different filter sizes in the third and fourth layers. Keeping the receptive field of the network the same, the filter sizes in F3 and F4 are adjusted simultaneously. It is shown that a larger filter size in the non-linear regression F4 enhances the fitting effect, but may lead to over-fitting. The local extremum in F3 improves the robustness on the testing dataset. Therefore, we find the best filter setting of F3 and F4 to be 5-M-7-6 in DehazeNet.

C. Quantitative results on synthetic patches

In recent years, three methods based on learning frameworks have been proposed for haze removal. In [18], dehazing parameters are learned by a linear model to estimate the scene depth under the Color Attenuation Prior (CAP). A back propagation neural network (BPNN) [40] is used to mine the internal link between color and depth from the training samples. In [17], Random Forests (RF) are used to investigate haze-relevant features for haze-free image restoration. All of the above methods and DehazeNet are trained with the same method as RF. According to the testing measure of RF, 2,000 image patches are randomly sampled from haze-free images, each with 10 random transmissions t ∈ (0, 1), to generate 20,000 hazy patches for testing. We run DehazeNet and CAP on the same testing dataset to measure the mean squared error (MSE) between the predicted transmission and the true transmission. DCP [9], a classical dehazing method, is used as a comparison baseline.

TABLE III. MSE BETWEEN PREDICTED TRANSMISSIONS AND GROUND TRUTH ON SYNTHETIC PATCHES

Methods     | DCP [9] | BPNN [40] | CAP [18] | RF [17] | DehazeNet
MSE (×10−2) | 3.18    | 4.37      | 3.32     | 1.26    | 1.19

Table III shows the MSE between the predicted transmissions and the ground-truth transmissions on the testing patches. DehazeNet achieves the best state-of-the-art score of 1.19e-2; the MSE gap between our method and the next best result in the literature (RF [17]) is 0.07e-2. This is because in [17], the feature values of patches are sorted to break the correlation between the haze-relevant features and the image content; in contrast, the content information exploited by DehazeNet is useful for the transmission estimation of sky regions and white objects. Moreover, CAP [18] achieves satisfactory results in the follow-on experiments but poor performance in this experiment, due to some outlier values (greater than 1 or less than 0) in its linear regression model.

D. Quantitative results on synthetic images

To verify the effectiveness on complete images, DehazeNet is tested on hazy images synthesized from stereo images with a known depth map d(x), and it is compared with DCP [9], FVR [38], BCCR [11], ATM [39], CAP² [18], and RF [17]. There are 12 pairs of stereo images collected from the Middlebury Stereo Datasets (2001-2006) [44], [45], [46]. In Fig. 9, the hazy images are synthesized from the haze-free stereo images based on (1), and they are restored to haze-free images by DehazeNet.

To quantitatively assess these methods, we use a series of evaluation criteria based on the difference between each pair of haze-free image and dehazing result. Apart from the widely used mean square error (MSE) and structural similarity (SSIM) [47] indices, we use additional evaluation metrics, namely the peak signal-to-noise ratio (PSNR) and the weighted peak signal-to-noise ratio (WPSNR) [48]. We define one-pass evaluation (OPE) as the conventional method, which we run with standard parameters and report the average measure for the performance assessment. In Table IV, DehazeNet is compared with six state-of-the-art methods on all of the hazy images by OPE (the hazy images are synthesized with the single scattering coefficient β = 1 and the pure-white atmospheric airlight α = 1).

²The results outside the parentheses are obtained with the code implemented by the authors [18], and the results in the parentheses are from our re-implementation.
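The OPE protocol boils down to averaging per-image full-reference metrics. A brief sketch (ours, using scikit-image's implementations; the WPSNR of [48] has no standard library function and is omitted):

```python
import numpy as np
from skimage.metrics import (mean_squared_error, peak_signal_noise_ratio,
                             structural_similarity)

def one_pass_evaluation(haze_free, dehazed):
    """Average MSE/PSNR/SSIM over pairs of ground-truth and dehazed images,
    both given as float RGB arrays in [0, 1]."""
    scores = {"MSE": [], "PSNR": [], "SSIM": []}
    for J, J_hat in zip(haze_free, dehazed):
        scores["MSE"].append(mean_squared_error(J, J_hat))
        scores["PSNR"].append(peak_signal_noise_ratio(J, J_hat,
                                                      data_range=1.0))
        scores["SSIM"].append(structural_similarity(J, J_hat,
                                                    channel_axis=2,
                                                    data_range=1.0))
    return {k: float(np.mean(v)) for k, v in scores.items()}
```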
TABLE IV. THE AVERAGE RESULTS OF MSE, SSIM, PSNR AND WPSNR ON THE SYNTHETIC IMAGES (β = 1 AND α = 1)

Metric | ATM [39] | BCCR [11] | FVR [38] | DCP [9] | CAP² [18]         | RF [17] | DehazeNet
MSE    | 0.0689   | 0.0243    | 0.0155   | 0.0172  | 0.0075 (0.0068)   | 0.0070  | 0.0062
SSIM   | 0.9890   | 0.9963    | 0.9973   | 0.9981  | 0.9991 (0.9990)   | 0.9989  | 0.9993
PSNR   | 60.8612  | 65.2794   | 66.5450  | 66.7392 | 70.0029 (70.6581) | 70.0099 | 70.9767
WPSNR  | 7.8492   | 12.6230   | 13.7236  | 13.8508 | 16.9873 (17.7839) | 17.1180 | 18.0996

Fig. 9. Synthetic images based on the Middlebury Stereo Datasets and the DehazeNet results.

It is exciting that, although DehazeNet is optimized by the MSE loss function, it also achieves the best performance on the other types of evaluation metrics.

The dehazing effectiveness is sensitive to the haze density, and the performance with a different scattering coefficient β can become much worse or better. Therefore, we propose an evaluation to analyze the dehazing robustness to the scattering coefficient β ∈ {0.75, 1.0, 1.25, 1.5}, which we call the coefficient robustness evaluation (CRE). As shown in Table V, CAP [18] achieves better performance on mist (β = 0.75), but the dehazing performance reduces gradually when the amount of haze increases. The reason is that CAP estimates the medium transmission based on the predicted scene depth and an assumed scattering coefficient (β = 1). In [17], 200 trees are used to build random forests for non-linear regression, which shows greater coefficient robustness. However, the high computational cost of running random forests at every pixel constrains its practicality. For DehazeNet, the medium transmission is estimated directly by a non-linear activation function (Maxout) in F1, resulting in excellent robustness to the scattering coefficient.

Due to the color offset of haze particles and light sources, the atmospheric airlight is not a proper pure-white. An airlight robustness evaluation (ARE) is proposed to analyze the dehazing methods for different atmospheric airlights α. Although DehazeNet is trained on samples generated by setting α = 1, it also achieves greater robustness for the other values of atmospheric airlight. In particular, DehazeNet performs better than the other methods when the sunlight haze is [1.0, 1.0, 0.9]. Therefore, DehazeNet could also be applied to remove halation, which is a bright ring surrounding a source of light, as shown in Fig. 10.

Fig. 10. Image enhancement for anti-halation by DehazeNet.

View field transformations and image zoom occur often in real-world applications. The scale robustness evaluation (SRE) is used to analyze the influence of scale variation. Compared against the same state-of-the-art methods as in OPE, 4 scale coefficients s are selected from 0.4 to 1.0 to generate images of different scales for SRE. In Table V, DehazeNet shows excellent robustness to scale variation due to the multi-scale mapping in F2. The single scale used in CAP [18], DCP [9] and ATM [39] results in a different prediction accuracy at a different scale. When an image shrinks, an excessively large-scale processing neighborhood will lose the image's details. Therefore, the multi-scale mapping in DehazeNet provides a variety of filters to merge multi-scale features, and it achieves the best scores under all of the different scales.

In most situations, noise is randomly produced by the sensor or camera circuitry, which brings in estimation error. We also discuss the influence of varying degrees of image noise on our method. As a basic noise model, additive white Gaussian (AWG) noise with standard deviation σ ∈ {10, 15, 20, 25, 30} is used for the noise robustness evaluation (NRE). Benefiting from the Maxout suppression in F1 and the local extremum in F3, DehazeNet performs more robustly in NRE than the others do. RF [17] has a good performance in most of the evaluations but fails in NRE, because the feature values of patches are sorted to break the correlation between the medium transmission and the image content, which also magnifies the effect of outliers.
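Each robustness evaluation in Table V re-runs OPE under one perturbed synthesis parameter. For instance, a CRE sweep can be sketched as follows (ours; `synthesize_haze` and `one_pass_evaluation` are the earlier sketches in this paper, and `dehaze` is a placeholder for whichever method is under test):

```python
def coefficient_robustness(J_images, depths, dehaze):
    """CRE: synthesize haze at several beta values and report the mean MSE
    of the dehazed results for each scattering coefficient."""
    results = {}
    for beta in (0.75, 1.0, 1.25, 1.5):
        hazy = [synthesize_haze(J, d, beta=beta)[0]
                for J, d in zip(J_images, depths)]
        restored = [dehaze(I) for I in hazy]
        results[beta] = one_pass_evaluation(J_images, restored)["MSE"]
    return results
```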
TABLE V. THE MSE ON THE SYNTHETIC IMAGES FOR DIFFERENT SCATTERING COEFFICIENTS, IMAGE SCALES AND ATMOSPHERIC AIRLIGHTS

Evaluation | Setting         | ATM [39] | BCCR [11] | FVR [38] | DCP [9] | CAP² [18]       | RF [17] | DehazeNet
CRE (β =)  | 0.75            | 0.0581   | 0.0269    | 0.0122   | 0.0199  | 0.0043 (0.0042) | 0.0046  | 0.0063
           | 1.00            | 0.0689   | 0.0243    | 0.0155   | 0.0172  | 0.0077 (0.0068) | 0.0070  | 0.0062
           | 1.25            | 0.0703   | 0.0230    | 0.0219   | 0.0147  | 0.0141 (0.0121) | 0.0109  | 0.0084
           | 1.50            | 0.0683   | 0.0219    | 0.0305   | 0.0134  | 0.0231 (0.0201) | 0.0152  | 0.0127
CRE Average|                 | 0.0653   | 0.0254    | 0.0187   | 0.0177  | 0.0105 (0.0095) | 0.0094  | 0.0084
ARE (α =)  | [1.0, 1.0, 1.0] | 0.0689   | 0.0243    | 0.0155   | 0.0172  | 0.0075 (0.0068) | 0.0070  | 0.0062
           | [0.9, 1.0, 1.0] | 0.0660   | 0.0266    | 0.0170   | 0.0210  | 0.0073 (0.0069) | 0.0071  | 0.0072
           | [1.0, 0.9, 1.0] | 0.0870   | 0.0270    | 0.0159   | 0.0200  | 0.0070 (0.0067) | 0.0073  | 0.0074
           | [1.0, 1.0, 0.9] | 0.0689   | 0.0239    | 0.0152   | 0.0186  | 0.0081 (0.0069) | 0.0083  | 0.0062
ARE Average|                 | 0.0727   | 0.0255    | 0.0159   | 0.0192  | 0.0075 (0.0068) | 0.0074  | 0.0067
SRE (s =)  | 0.40            | 0.0450   | 0.0238    | 0.0155   | 0.0102  | 0.0137 (0.0084) | 0.0089  | 0.0066
           | 0.60            | 0.0564   | 0.0223    | 0.0154   | 0.0137  | 0.0092 (0.0071) | 0.0076  | 0.0060
           | 0.80            | 0.0619   | 0.0236    | 0.0155   | 0.0166  | 0.0086 (0.0066) | 0.0074  | 0.0062
           | 1.00            | 0.0689   | 0.0243    | 0.0155   | 0.0172  | 0.0077 (0.0068) | 0.0070  | 0.0062
SRE Average|                 | 0.0581   | 0.0235    | 0.0155   | 0.0144  | 0.0098 (0.0072) | 0.0077  | 0.0062
NRE (σ =)  | 10              | 0.0541   | 0.0138    | 0.0150   | 0.0133  | 0.0065 (0.0070) | 0.0086  | 0.0059
           | 15              | 0.0439   | 0.0144    | 0.0148   | 0.0104  | 0.0072 (0.0074) | 0.0112  | 0.0061
           | 20              | –        | 0.0181    | 0.0151   | 0.0093  | 0.0083 (0.0085) | 0.0143  | 0.0058
           | 25              | –        | 0.0224    | 0.0150   | 0.0082  | 0.0100 (0.0092) | 0.0155  | 0.0051
           | 30              | –        | 0.0192    | 0.0151   | 0.0085  | 0.0119 (0.0112) | 0.0191  | 0.0049
NRE Average|                 | –        | 0.0255    | 0.0150   | 0.0100  | 0.0088 (0.0087) | 0.0137  | 0.0055

E. Qualitative results on real-world images

Fig. 11 shows the dehazing results and depth maps restored by DehazeNet; more results and comparisons can be found at http://caibolun.github.io/DehazeNet/. Because all of the dehazing algorithms can obtain truly good results on general outdoor images, it is difficult to rank them visually. To compare them, this paper focuses on 5 challenging images identified in related studies [9], [17], [18]. These images have large white or gray regions that are hard to handle, because most existing dehazing algorithms are sensitive to white colors. Fig. 12 shows a qualitative comparison with six state-of-the-art dehazing algorithms on the challenging images. Fig. 12(a) depicts the hazy images to be dehazed, and Fig. 12(b-g) shows the results of ATM [39], BCCR [11], FVR [38], DCP [9], CAP [18] and RF [17], respectively. The results of DehazeNet are given in Fig. 12(h).

The sky region in a hazy image is a challenge for dehazing, because clouds and haze are similar natural phenomena governed by the same atmospheric scattering model. As shown in the first three figures, most of the haze is removed in the (b-d) results, and the details of the scenes and objects are well restored. However, these results significantly suffer from over-enhancement in the sky region. Overall, the sky region of these images is much darker than it should be, or is oversaturated and distorted. Haze generally exists only in the atmospheric surface layer, and thus the sky region almost does not require handling. Based on the learning framework, CAP and RF avoid color distortion in the sky, but non-sky regions are enhanced poorly because of their non-content regression models (for example, the rock-soil of the first image and the green flatlands in the third image). DehazeNet appears to be capable of finding the sky region to keep its color, and assures a good dehazing effect in the other regions. The reason is that the patch attribute can be learned in the hidden layers of DehazeNet, and it contributes to the dehazing effects in the sky.

Transmission estimation based on priors is a type of statistics, which might not work for certain images. The fourth and fifth figures are determined to be failure cases in [9]. When the scene objects are inherently similar to the atmospheric light (such as the fair-skinned complexion in the fourth figure and the white marble in the fifth figure), the transmission estimated based on priors (DCP, BCCR, FVR) is not reliable, because the dark channel has bright values near such objects, and FVR and BCCR are based on DCP, which has an inherent problem of overestimating the transmission. CAP and RF, learned from regression models, are free from oversaturation, but underestimate the haze degree in the distance (see the brown hair in the fourth image and the red pillar in the fifth image). Compared with the six algorithms, our results avoid image oversaturation and retain the dehazing validity, due to the non-linear regression of DehazeNet.

V. CONCLUSION

In this paper, we have presented a novel deep learning approach for single image dehazing. Inspired by traditional haze-relevant features and dehazing methods, we show that medium transmission estimation can be reformulated into a trainable end-to-end system with a special design, where the feature extraction layer and the non-linear regression layer are distinguished from those of classical CNNs. In the first layer F1, the Maxout unit is proved similar to the prior-based methods, and it is more effective at learning haze-relevant features. In the last layer F4, a novel activation function called BReLU is used instead of ReLU or Sigmoid to keep the bilateral restraint and local linearity for image restoration. With this lightweight architecture, DehazeNet achieves dramatically higher efficiency and better dehazing effects than the state-of-the-art methods.

Although we successfully applied a CNN to haze removal, there is still extensibility research to be carried out. First, the atmospheric light α cannot always be regarded as a global constant, and it could be learned together with the medium transmission in a unified network. Moreover, we believe the atmospheric scattering model can also be learned in a deeper neural network, in which an end-to-end mapping between hazy and haze-free images could be optimized directly without the medium transmission estimation. We leave these problems for future research.
Fig. 11. The haze-free images and depth maps restored by DehazeNet.

Fig. 12. Qualitative comparison of different methods on real-world images: (a) hazy image; (b) ATM [39]; (c) BCCR [11]; (d) FVR [38]; (e) DCP [9]; (f) CAP [18]; (g) RF [17]; (h) DehazeNet.

REFERENCES

[1] T. K. Kim, J. K. Paik, and B. S. Kang, "Contrast enhancement system using spatially adaptive histogram equalization with temporal filtering," IEEE Transactions on Consumer Electronics, vol. 44, no. 1, pp. 82–87, 1998.
[2] J. A. Stark, "Adaptive image contrast enhancement using generalizations of histogram equalization," IEEE Transactions on Image Processing, vol. 9, no. 5, pp. 889–896, 2000.
[3] R. Eschbach and B. W. Kolpatzik, "Image-dependent color saturation correction in a natural scene pictorial image," Sep. 12 1995, US Patent 5,450,217.
[4] Y. Y. Schechner, S. G. Narasimhan, and S. K. Nayar, "Instant dehazing of images using polarization," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, 2001, pp. I–325.
[5] S. G. Narasimhan and S. K. Nayar, "Contrast restoration of weather degraded images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 6, pp. 713–724, 2003.
[6] J. Kopf, B. Neubert, B. Chen, M. Cohen, D. Cohen-Or, O. Deussen, M. Uyttendaele, and D. Lischinski, "Deep photo: Model-based photograph enhancement and viewing," in ACM Transactions on Graphics (TOG), vol. 27, no. 5, 2008, p. 116.
[7] R. T. Tan, "Visibility in bad weather from a single image," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–8.
[8] R. Fattal, "Single image dehazing," in ACM Transactions on Graphics (TOG), vol. 27, no. 3, 2008, p. 72.
[9] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2011.
[10] K. Nishino, L. Kratz, and S. Lombardi, "Bayesian defogging," International Journal of Computer Vision, vol. 98, no. 3, pp. 263–278, 2012.
[11] G. Meng, Y. Wang, J. Duan, S. Xiang, and C. Pan, "Efficient image dehazing with boundary constraint and contextual regularization," in IEEE International Conference on Computer Vision (ICCV), 2013, pp. 617–624.
[12] K. B. Gibson, D. T. Vo, and T. Q. Nguyen, "An investigation of dehazing effects on image and video coding," IEEE Transactions on Image Processing, vol. 21, no. 2, pp. 662–673, 2012.
[13] J.-P. Tarel and N. Hautiere, "Fast visibility restoration from a single color or gray level image," in IEEE International Conference on Computer Vision, 2009, pp. 2201–2208.
[14] J. Yu, C. Xiao, and D. Li, "Physics-based fast single image fog removal," in IEEE International Conference on Signal Processing (ICSP), 2010, pp. 1048–1052.
[15] K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397–1409, 2013.
[16] A. Levin, D. Lischinski, and Y. Weiss, "A closed-form solution to natural image matting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 228–242, 2008.
[17] K. Tang, J. Yang, and J. Wang, "Investigating haze-relevant features in a learning framework for image dehazing," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2995–3002.
[18] Q. Zhu, J. Mai, and L. Shao, "A fast single image haze removal algorithm using color attenuation prior," IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3522–3533, 2015.
[19] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[20] C. Ding and D. Tao, "Robust face recognition via multimodal deep face representation," IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2049–2058, 2015.
[21] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 580–587.
[22] C. J. Schuler, H. C. Burger, S. Harmeling, and B. Scholkopf, "A machine learning approach for non-blind image deconvolution," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 1067–1074.
[23] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 99, pp. 1–1, 2015.
[24] D. Eigen, D. Krishnan, and R. Fergus, "Restoring an image taken through a window covered with dirt or rain," in IEEE International Conference on Computer Vision (ICCV), 2013, pp. 633–640.
[25] Z. Wu, D. Lin, and X. Tang, "Adjustable bounded rectifiers: Towards deep binary representations," arXiv preprint arXiv:1511.06201, 2015.
[26] E. J. McCartney, "Optics of the atmosphere: scattering by molecules and particles," New York, John Wiley and Sons, Inc., vol. 1, 1976.
[27] S. K. Nayar and S. G. Narasimhan, "Vision in bad weather," in IEEE International Conference on Computer Vision, vol. 2, 1999, pp. 820–827.
[28] S. G. Narasimhan and S. K. Nayar, "Contrast restoration of weather degraded images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 6, pp. 713–724, 2003.
[29] C. O. Ancuti, C. Ancuti, C. Hermans, and P. Bekaert, "A fast semi-inverse approach to detect and remove the haze from a single image," in Computer Vision–ACCV 2010, 2011, pp. 501–514.
[30] I. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, "Maxout networks," in Proceedings of the 30th International Conference on Machine Learning (ICML-13), 2013, pp. 1319–1327.
[31] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.
[32] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein et al., "Imagenet large scale visual recognition challenge," International Journal of Computer Vision, pp. 1–42, 2014.
[33] I. Lampl, D. Ferster, T. Poggio, and M. Riesenhuber, "Intracellular measurements of spatial integration and the max operation in complex cells of the cat primary visual cortex," Journal of Neurophysiology, vol. 92, no. 5, pp. 2704–2713, 2004.
[34] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[35] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504–507, 2006.
[36] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 807–814.
[37] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional architecture for fast feature embedding," in Proceedings of the ACM International Conference on Multimedia. ACM, 2014, pp. 675–678.
[38] J.-P. Tarel and N. Hautiere, "Fast visibility restoration from a single color or gray level image," in IEEE International Conference on Computer Vision, 2009, pp. 2201–2208.
[39] M. Sulami, I. Glatzer, R. Fattal, and M. Werman, "Automatic recovery of the atmospheric light in hazy images," in IEEE International Conference on Computational Photography (ICCP), 2014, pp. 1–11.
[40] J. Mai, Q. Zhu, D. Wu, Y. Xie, and L. Wang, "Back propagation neural network dehazing," in IEEE International Conference on Robotics and Biomimetics (ROBIO), 2014, pp. 1433–1438.
[41] C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a deep convolutional network for image super-resolution," in Computer Vision–ECCV 2014. Springer, 2014, pp. 184–199.
[42] T.-H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng, and Y. Ma, "PCANet: A simple deep learning baseline for image classification?" IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5017–5032, 2015.
[43] B. Schölkopf and K.-R. Müller, "Fisher discriminant analysis with kernels," Neural Networks for Signal Processing IX, vol. 1, p. 1, 1999.
[44] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7–42, 2002.
[45] D. Scharstein and R. Szeliski, "High-accuracy stereo depth maps using structured light," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2003, pp. I–195.
[46] D. Scharstein and C. Pal, "Learning conditional random fields for stereo," in IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.
[47] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[48] J. L. Mannos and D. J. Sakrison, "The effects of a visual fidelity criterion of the encoding of images," IEEE Transactions on Information Theory, vol. 20, no. 4, pp. 525–536, 1974.