Lightweight Image Super-Resolution with Information Multi-distillation Network

Zheng Hui, Xinbo Gao, Yunchu Yang, Xiumei Wang∗
School of Electronic Engineering, Xidian University, Xi’an, China
[email protected], [email protected], [email protected], [email protected]

arXiv:1909.11856v1 [eess.IV] 26 Sep 2019

ABSTRACT
In recent years, single image super-resolution (SISR) methods using deep convolutional neural networks (CNNs) have achieved impressive results. Thanks to the powerful representation capabilities of deep networks, numerous previous methods can learn the complex non-linear mapping between low-resolution (LR) image patches and their high-resolution (HR) versions. However, excessive convolutions limit the application of super-resolution technology on devices with low computing power. Besides, super-resolution at an arbitrary scale factor is a critical issue in practical applications that has not been well solved by previous approaches. To address these issues, we propose a lightweight information multi-distillation network (IMDN) built from cascaded information multi-distillation blocks (IMDB), each of which contains a distillation part and a selective fusion part. Specifically, the distillation module extracts hierarchical features step by step, and the fusion module aggregates them according to the importance of the candidate features, which is evaluated by the proposed contrast-aware channel attention mechanism. To process real images of any size, we develop an adaptive cropping strategy (ACS) to super-resolve block-wise image patches using the same well-trained model. Extensive experiments suggest that the proposed method performs favorably against state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. Code is available at https://github.com/Zheng222/IMDN.

CCS CONCEPTS
• Computing methodologies → Computational photography; Reconstruction; Image processing.

KEYWORDS
image super-resolution; lightweight network; information multi-distillation; contrast-aware channel attention; adaptive cropping strategy

ACM Reference Format:
Zheng Hui, Xinbo Gao, Yunchu Yang, and Xiumei Wang. 2019. Lightweight Image Super-Resolution with Information Multi-distillation Network. In Proceedings of the 27th ACM International Conference on Multimedia (MM ’19), October 21–25, 2019, Nice, France. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3343031.3351084

∗ Corresponding author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
MM ’19, October 21–25, 2019, Nice, France
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6889-6/19/10. . . $15.00
https://doi.org/10.1145/3343031.3351084

1 INTRODUCTION
Single image super-resolution (SISR) aims at reconstructing a high-resolution (HR) image from its low-resolution (LR) observation, which is inherently ill-posed because many HR images can be downsampled to an identical LR image. To address this problem, numerous image SR methods [11, 12, 25, 27, 36, 38] based on deep neural architectures [7, 9, 23] have been proposed and have shown prominent performance.

Dong et al. [4, 5] first developed a three-layer network (SRCNN) to establish a direct relationship between LR and HR. Then, Wang et al. [31] proposed a neural network based on the conventional sparse coding framework and further designed a progressive upsampling style to produce better SR results at large scale factors (e.g., ×4). Inspired by the VGG model [23] used for ImageNet classification, Kim et al. [12, 13] first pushed the depth of the SR network to 20, and their model outperformed SRCNN by a large margin. This indicates that a deeper model helps enhance the quality of generated images. To accelerate the training of the deep network, the authors introduced global residual learning with a high initial learning rate. At the same time, they also presented a deeply-recursive convolutional network (DRCN), which applied recursive learning to the SR problem and can significantly reduce the number of model parameters. Similarly, Tai et al. proposed two novel networks: a deep recursive residual network (DRRN) [24] and a persistent memory network (MemNet) [25]. The former mainly utilized recursive learning to economize parameters. The latter tackled the long-term dependency problem in previous CNN architectures with several memory blocks stacked in a densely connected structure [9]. However, these two algorithms required a long time and huge graphics memory consumption in both the training and testing phases. The primary reason is that the inputs sent to these two models are interpolated versions of the LR images and the networks do not adopt any downsampling operations. This scheme brings about a huge computational cost. To shorten the testing time, Shi et al. [22] first performed most of the mappings in low-dimensional space and designed an efficient sub-pixel convolution to upsample the resolution of the feature maps at the end of SR models.

To the same end, Dong et al. proposed fast SRCNN (FSRCNN) [6], which employed a learnable upsampling layer (transposed convolution) to accomplish post-upsampling SR. Afterward, Lai et al. presented the Laplacian pyramid super-resolution network (LapSRN) [14] to progressively reconstruct higher-resolution images. Other works such as MS-LapSRN [15] and progressive SR (ProSR) [29] also adopt this progressive upsampling SR framework and achieve relatively high performance. EDSR [18] made a significant breakthrough in terms of SR performance and won the NTIRE 2017 competition [1, 26]. Its authors removed some unnecessary modules (e.g., Batch Normalization) from SRResNet [16] to obtain better results. Based on EDSR, Zhang et al. incorporated the densely connected block [9, 27] into the residual block [7] to construct a residual dense network (RDN). Soon they exploited the residual-in-residual architecture for a very deep model and introduced the channel attention mechanism [8] to form the very deep residual attention networks (RCAN) [36]. More recently, Zhang et al. also introduced spatial attention (non-local module) into the residual block and then constructed the residual non-local attention network (RNAN) [37] for various image restoration tasks.

The major trend of these algorithms is adding more convolution layers to improve performance as measured by PSNR and SSIM [30]. As a result, most of them suffer from large numbers of parameters, huge memory footprints, and slow training and testing speeds. For instance, EDSR [18] has about 43M parameters and 69 layers, and RDN [38], which achieved comparable performance, has about 22M parameters and over 128 layers. Another typical network is RCAN [36], whose depth is up to 400 although it has only about 15.59M parameters. However, these methods are still not suitable for resource-constrained equipment. For mobile devices, the desired practice is to pursue SR performance as high as possible when the available memory and inference time are constrained to a certain range. Many cases, such as video applications, edge devices, and smartphones, require not only high performance but also high execution speed. Accordingly, it is important to devise a lightweight but efficient model to meet such demands.

Concerning the reduction of parameters, many approaches adopted a recursive manner or a parameter sharing strategy, such as [13, 24, 25]. Although these methods did reduce the size of the model, they increased the depth or the width of the network to make up for the performance loss caused by the recursive module. This leads to a great deal of computation time when performing SR. To address this issue, a better way is to design lightweight and efficient network structures that avoid the recursive paradigm. Ahn et al. developed CARN-M [2] for mobile scenarios through a cascading network architecture, but at the cost of a substantial reduction in PSNR. Hui et al. [11] proposed an information distillation network (IDN) that explicitly divided the preceding extracted features into two parts, one of which was retained while the other was further processed. In this way, IDN achieved good performance at a moderate size. But there is still room for improvement in terms of performance.

Another factor that affects the inference speed is the depth of the network. In the testing phase, each layer depends on the previous one: the computation of the current layer must wait until the previous computation is completed. In contrast, the multiple convolutional operations within a layer can be processed in parallel. Therefore, the depth of the model architecture is an essential factor affecting time performance. This point will be verified in Section 4.

As for solving the SR problem at different scale factors (×2, ×3, ×4) with a single model, previous solutions pre-scaled an image to the desired size and used a fully convolutional network without any downsampling operations. This inevitably leads to a substantial increase in the amount of computation.

To address the above issues, we propose a lightweight information multi-distillation network (IMDN) for better balancing performance against applicability. Unlike most previous small-parameter models that use a recursive structure, we elaborately design an information multi-distillation block (IMDB) inspired by [11]. The proposed IMDB extracts features at a granular level, retaining partial information and further processing the other features at each step (layer), as illustrated in Figure 2. To aggregate the features distilled by all steps, we devise a contrast-aware channel attention layer, specifically related to low-level vision tasks, to enhance the collected refined information. Concretely, we exploit more useful features (edges, corners, textures, etc.) for image restoration. In order to handle SR at an arbitrary scale factor with a single model, we scale the input image to the target size and then employ the proposed adaptive cropping strategy (see Figure 4) to obtain image patches of appropriate size for a lightweight SR model with downsampling layers.

The contributions of this paper can be summarized as follows:

• We propose a lightweight information multi-distillation network (IMDN) for fast and accurate image super-resolution. Thanks to our information multi-distillation block (IMDB) with its contrast-aware attention (CCA) layer, we achieve competitive results with a modest number of parameters (refer to Figure 6).
• We propose the adaptive cropping strategy (ACS), which allows a network that includes downsampling operations (e.g., a convolution layer with a stride of 2) to process images of arbitrary size. By adopting this scheme, the computational cost, memory occupation, and inference time can be dramatically reduced when treating SR of indefinite magnification.
• We explore factors affecting actual inference time through experiments and find that the depth of the network is related to the execution speed. This can serve as a guideline for lightweight network design. Our model achieves an excellent balance among visual quality, inference speed, and memory occupation.
2 RELATED WORK
2.1 Single image super-resolution
With the rapid development of deep learning, numerous methods based on convolutional neural networks (CNNs) have become the mainstream in SISR. The pioneering work in SR, named SRCNN, was proposed by Dong et al. [4, 5]. SRCNN upscaled the LR image with bicubic interpolation before feeding it into the network, which caused substantial unnecessary computational cost. To address this issue, the authors removed this pre-processing and upscaled the image at the end of the network to reduce the computation in [6]. Lim et al. [18] modified SRResNet [16] to construct a deeper and wider residual network denoted as EDSR. With its smart topology and a significantly large number of learnable parameters, EDSR dramatically advanced SR performance. Zhang et al. [38] introduced channel attention [8] into the residual block to further boost a very deep network (more than 400 layers without considering the depth of the channel attention modules). Liu [19] explored the effectiveness of the non-local module applied to image restoration. Similarly, Zhang et al. [37] utilized non-local attention to better guide feature extraction in their trunk branch for better performance. Very recently, Li et al. [17] exploited a feedback mechanism that enhances low-level representations with high-level ones.

For lightweight networks, Hui et al. [11] developed the information distillation network to better exploit hierarchical features by separately processing the current feature maps. Ahn et al. [2] designed an architecture that implements a cascading mechanism on a residual network to boost performance.

2.2 Attention model
Attention models, which aim at concentrating on the more useful information in features, have been widely used in various computer vision tasks. Hu et al. [8] introduced the squeeze-and-excitation (SE) block, which models channel-wise relationships in a computationally efficient manner and enhances the representational ability of the network, showing its effectiveness on image classification. CBAM [32] modified the SE block to exploit both spatial and channel-wise attention. Wang et al. [28] proposed the non-local module, which generates a wide attention map by calculating the correlation matrix between each spatial point in the feature map; the attention map then guides dense contextual information aggregation.

3 METHOD
3.1 Framework
In this section, we describe our proposed information multi-distillation network (IMDN) in detail; its graphical depiction is shown in Figure 1(a). The upsampler (see Figure 1(b)) includes one 3 × 3 convolution with 3 × s² output channels and a sub-pixel convolution. Given an input LR image $I^{LR}$ and its corresponding target HR image $I^{HR}$, the super-resolved image $I^{SR}$ can be generated by

$$I^{SR} = H_{IMDN}\left(I^{LR}\right), \tag{1}$$

where $H_{IMDN}(\cdot)$ is our IMDN. Following most previous works [2, 11, 18, 36, 38], it is optimized with the mean absolute error (MAE) loss. Given a training set $\left\{I_i^{LR}, I_i^{HR}\right\}_{i=1}^{N}$ that has N LR-HR pairs, the loss function of our IMDN can be expressed by

$$\mathcal{L}(\Theta) = \frac{1}{N} \sum_{i=1}^{N} \left\| H_{IMDN}\left(I_i^{LR}\right) - I_i^{HR} \right\|_1, \tag{2}$$

where $\Theta$ indicates the updateable parameters of our model and $\|\cdot\|_1$ is the $l_1$ norm. We now give more details about the entire framework.

We first conduct LR feature extraction with one 3 × 3 convolution with 64 output channels. Then, the key component of our network stacks multiple information multi-distillation blocks (IMDBs) and assembles all intermediate features, which are fused by a 1 × 1 convolution layer. This scheme, intermediate information collection (IIC), helps guarantee the integrity of the collected information and further boosts SR performance while adding very few parameters. The final upsampler consists of only one learnable layer and a non-parametric operation (sub-pixel convolution) to save as many parameters as possible.

3.2 Information multi-distillation block
As depicted in Figure 2, our information multi-distillation block (IMDB) is constructed from a progressive refinement module, a contrast-aware channel attention (CCA) layer, and a 1 × 1 convolution that is used to reduce the number of feature channels. The whole block adopts a residual connection. The main idea of this block is to extract useful features little by little, like DenseNet [9]. We now give more details on these modules.

Table 1: PRM architecture. The columns represent layer, kernel size, stride, input channels, and output channels. The symbols C and L denote a convolution layer and Leaky ReLU (α = 0.05), respectively.

Layer   Kernel   Stride   Input_channel   Output_channel
CL      3        1        64              64
CL      3        1        48              64
CL      3        1        48              64
CL      3        1        48              16

3.2.1 Progressive refinement module. As labeled with the gray box in Figure 2, the progressive refinement module (PRM) first adopts a 3 × 3 convolution layer to extract input features for multiple subsequent distillation (refinement) steps. At each step, we apply a channel split operation to the preceding features, which produces two parts. One part is preserved and the other is fed into the next calculation unit. The retained part can be regarded as the refined features. Given the input features $F^{n}_{in}$, this procedure in the n-th IMDB can be described as

$$
\begin{aligned}
F^{n}_{\mathrm{refined\_1}},\ F^{n}_{\mathrm{coarse\_1}} &= \mathrm{Split}^{n}_{1}\left(CL^{n}_{1}\left(F^{n}_{in}\right)\right), \\
F^{n}_{\mathrm{refined\_2}},\ F^{n}_{\mathrm{coarse\_2}} &= \mathrm{Split}^{n}_{2}\left(CL^{n}_{2}\left(F^{n}_{\mathrm{coarse\_1}}\right)\right), \\
F^{n}_{\mathrm{refined\_3}},\ F^{n}_{\mathrm{coarse\_3}} &= \mathrm{Split}^{n}_{3}\left(CL^{n}_{3}\left(F^{n}_{\mathrm{coarse\_2}}\right)\right), \\
F^{n}_{\mathrm{refined\_4}} &= CL^{n}_{4}\left(F^{n}_{\mathrm{coarse\_3}}\right),
\end{aligned} \tag{3}
$$

where $CL^{n}_{j}$ denotes the j-th convolution layer (including Leaky ReLU) of the n-th IMDB, and $\mathrm{Split}^{n}_{j}$ indicates the j-th channel split layer of the n-th IMDB.
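As a rough illustration of the distillation steps in Equation 3, the channel widths listed in Table 1 already imply how many channels each split retains. The sketch below only performs this bookkeeping in plain Python and runs no convolutions. The names `TABLE1` and `prm_split_plan` are hypothetical and not taken from the authors' code, and the rule "coarse width equals the next layer's input width" is an assumption read off the table.

```python
# Rows of Table 1 as (input_channels, output_channels) for the four CL layers.
TABLE1 = [(64, 64), (48, 64), (48, 64), (48, 16)]


def prm_split_plan(layers):
    """Return (refined, coarse) channel counts implied by the table for each step.

    Assumption: the coarse part of step j must match the input width of
    layer j + 1, and the rest of the layer's output is the refined part.
    """
    plan = []
    for j, (_, out_ch) in enumerate(layers):
        if j + 1 < len(layers):
            coarse = layers[j + 1][0]    # channels fed to the next convolution
            refined = out_ch - coarse    # channels retained by the channel split
        else:
            coarse, refined = 0, out_ch  # last layer: no split, everything is kept
        plan.append((refined, coarse))
    return plan


plan = prm_split_plan(TABLE1)
total_refined = sum(refined for refined, _ in plan)
print(plan)           # [(16, 48), (16, 48), (16, 48), (16, 0)]
print(total_refined)  # 64
```

Under this reading, each step retains 16 channels and passes 48 on, so the four refined parts together again hold 64 channels.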

In Equation 3, $F^{n}_{\mathrm{refined\_j}}$ represents the j-th refined (preserved) features, and $F^{n}_{\mathrm{coarse\_j}}$ denotes the j-th coarse features to be further processed. The hyperparameters of the PRM architecture are shown in Table 1. The next stage concatenates the refined features from each step. It can be expressed by

$$F^{n}_{\mathrm{distilled}} = \mathrm{Concat}\left(F^{n}_{\mathrm{refined\_1}},\ F^{n}_{\mathrm{refined\_2}},\ F^{n}_{\mathrm{refined\_3}},\ F^{n}_{\mathrm{refined\_4}}\right), \tag{4}$$

where Concat denotes the concatenation operation along the channel dimension.

Figure 1: The architecture of the information multi-distillation network (IMDN): (a) IMDN and (b) the upsampler. In (a), the orange box represents the Leaky ReLU activation function and the details of the IMDB are shown in Figure 2. In (b), s represents the upscale factor.

Figure 2: The architecture of our proposed information multi-distillation block (IMDB). Here, 64, 48, and 16 all represent the output channels of the convolution layer. “Conv-3” denotes the 3 × 3 convolutional layer, and “CCA Layer” indicates the proposed contrast-aware channel attention (CCA) that is depicted in Figure 3. Each convolution is followed by a Leaky ReLU activation function except for the last 1 × 1 convolution. We omit them for conciseness.

3.2.2 Contrast-aware channel attention layer. The initial channel attention was employed in the image classification task and is well known as the squeeze-and-excitation (SE) module. In high-level vision, the importance of a feature map depends on its activated high-value areas, since these regions favor classification or detection. Accordingly, global average/maximum pooling is utilized to capture the global information in these high-level or mid-level vision tasks. Although average pooling can indeed improve the PSNR value, it lacks the information about structures, textures, and edges that is propitious to enhancing image details (related to SSIM). As depicted in Figure 3, the contrast-aware channel attention module is specific to low-level vision, e.g., image super-resolution and enhancement. Specifically, we replace global average pooling with the sum of the standard deviation and the mean (evaluating the contrast degree of a feature map). Let $X = [x_1, \ldots, x_c, \ldots, x_C]$ denote the input, which has C feature maps with spatial size H × W. The contrast information value can then be calculated by

$$z_c = H_{GC}(x_c) = \sqrt{\frac{1}{HW} \sum_{(i,j) \in x_c} \left( x_c^{i,j} - \frac{1}{HW} \sum_{(i,j) \in x_c} x_c^{i,j} \right)^{2}} + \frac{1}{HW} \sum_{(i,j) \in x_c} x_c^{i,j}, \tag{5}$$

where $z_c$ is the c-th element of the output and $H_{GC}(\cdot)$ indicates the global contrast (GC) information evaluation function. With the assistance of the CCA module, our network can steadily improve the accuracy of SISR.

Figure 3: Contrast-aware channel attention module.

3.3 Adaptive cropping strategy
The adaptive cropping strategy (ACS) is designed for super-resolving images of arbitrary size. Meanwhile, it can also deal with the SR problem at any scale factor with a single model (see Figure 5). We slightly modify the original IMDN by introducing two downsampling layers and construct IMDN_AS (IMDN for any scales). Here, the LR and HR images have the same spatial size (height and width). To handle images whose height and width are not divisible by 4, we first cut the entire image into 4 parts and then feed them into our IMDN_AS. As illustrated in Figure 4, we can obtain 4 overlapped image patches through ACS. Take the first patch in the upper left corner as an example, and we give the
details about ACS. This image patch must satisfy

$$
\begin{aligned}
\left( \left\lfloor \frac{H}{2} \right\rfloor + \Delta l_H \right) \% 4 &= 0, \\
\left( \left\lfloor \frac{W}{2} \right\rfloor + \Delta l_W \right) \% 4 &= 0,
\end{aligned} \tag{6}
$$

where $\Delta l_H$ and $\Delta l_W$ are extra increments of height and width, respectively. They can be computed by

$$
\begin{aligned}
\Delta l_H &= padding_H - \left( \left\lfloor \frac{H}{2} \right\rfloor + padding_H \right) \% 4, \\
\Delta l_W &= padding_W - \left( \left\lfloor \frac{W}{2} \right\rfloor + padding_W \right) \% 4,
\end{aligned} \tag{7}
$$

where $padding_H$ and $padding_W$ are preset additional lengths. In general, their values are set by

$$padding_H = padding_W = 4k, \quad k \geq 1. \tag{8}$$

Here, k is an integer greater than or equal to 1. These four patches can be processed in parallel (they have the same size), after which the outputs are pasted to their original locations, and the extra increments ($\Delta l_H$ and $\Delta l_W$) are discarded.

Figure 4: Diagrammatic sketch of the adaptive cropping strategy (ACS): (a) the first image patch and (b) the last image patch. The cropped image patches are shown in the green dotted boxes.

Figure 5: The network structure of our IMDN_AS. “s2” represents a stride of 2.

4 EXPERIMENTS
4.1 Datasets and metrics
In our experiments, we use the DIV2K dataset [1], which contains 800 high-quality RGB training images and is widely used in image restoration tasks [18, 36–38]. For evaluation, we use five widely used benchmark datasets: Set5 [3], Set14 [33], BSD100 [20], Urban100 [10], and Manga109 [21]. We evaluate the performance of the super-resolved images using two metrics: peak signal-to-noise ratio (PSNR) and structure similarity index (SSIM) [30]. As with existing works [2, 11, 12, 18, 24, 36, 38], we calculate the values on the luminance channel (i.e., the Y channel of the YCbCr channels converted from the RGB channels).

Additionally, for any/unknown scale factor experiments, we use the RealSR dataset from the NTIRE2019 Real Super-Resolution Challenge¹. It is a novel dataset of real low- and high-resolution paired images. The training data consists of 60 real low- and high-resolution paired images, and the validation data contains 20 LR-HR pairs. It is noteworthy that the LR and HR images have the same size.

4.2 Implementation details
To obtain LR DIV2K training images, we downscale the HR images with the scaling factors (×2, ×3, and ×4) using bicubic interpolation in MATLAB R2017a. HR image patches with a size of 192 × 192 are randomly cropped from the HR images as the input of our model, and the mini-batch size is set to 16. For data augmentation, we perform random horizontal flips and 90-degree rotations. Our model is trained with the ADAM optimizer with the momentum parameter β₁ = 0.9. The initial learning rate is set to 2 × 10⁻⁴ and halved every 2 × 10⁵ iterations. We set the number of IMDBs to 6 in our IMDN and IMDN_AS. We implement the proposed network with the PyTorch framework on a desktop computer with a 4.2GHz Intel i7-7700K CPU, 64G RAM, and an NVIDIA TITAN Xp GPU (12G memory).

4.3 Model analysis
In this subsection, we investigate the model parameters, the effectiveness of IMDB, the intermediate information collection scheme, and the adaptive cropping strategy.

[Plot: PSNR (dB) versus number of parameters (K, ×10³) for SRCNN, FSRCNN, VDSR, LapSRN, MemNet, DRRN, IDN, CARN, EDSR-baseline, and IMDN.]
Figure 6: Trade-off between performance and number of parameters on Set5 ×4 dataset.

4.3.1 Model parameters. To construct a lightweight SR model, the number of network parameters is vital. From Table 5, we can observe that our IMDN achieves comparable or better performance with fewer parameters than other state-of-the-art methods, such as EDSR-baseline (CVPRW’17), IDN (CVPR’18), SRMDNF (CVPR’18), and CARN (ECCV’18). We also visualize the trade-off between performance and model size in Figure 6. We can see that our IMDN achieves a better trade-off between performance and model size.
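To make the parameter budget concrete, the sketch below counts the parameters of the four PRM convolutions listed in Table 1. It uses the usual count for a convolution layer: input channels × output channels × kernel area, plus one bias per output channel. This is the same per-layer count as Equation 9 in Section 4.5.1. Whether every layer in the released model carries a bias is an assumption here, and the names `PRM_LAYERS` and `conv_params` are illustrative only.

```python
# Rows of Table 1 as (kernel_size, input_channels, output_channels).
PRM_LAYERS = [(3, 64, 64), (3, 48, 64), (3, 48, 64), (3, 48, 16)]


def conv_params(kernel, in_ch, out_ch, bias=True):
    """Weights (in_ch * out_ch * kernel^2) plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


# Total for the progressive refinement module of one IMDB.
prm_total = sum(conv_params(k, i, o) for k, i, o in PRM_LAYERS)

# For comparison only: three plain 3x3 convolutions with 64 channels each,
# the replacement for the PRM described in the ablation of Section 4.3.2.
plain_total = 3 * conv_params(3, 64, 64)

print(prm_total)    # 99280
print(plain_total)  # 110784
```

The comparison counts only the 3 × 3 layers of a single block. It does not include the 1 × 1 fusion convolution, the CCA layer, or the upsampler, so it should not be read as a reproduction of the totals reported in the tables.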
¹ http://www.vision.ee.ethz.ch/ntire19/

Table 2: Investigations of CCA module and IIC scheme. Each result cell is PSNR / SSIM.

Scale   PRM   CCA   IIC   Params   Set5             Set14            BSD100           Urban100         Manga109
×4      ✗     ✗     ✗     510K     31.86 / 0.8901   28.43 / 0.7775   27.45 / 0.7320   25.63 / 0.7711   29.92 / 0.9003
×4      ✓     ✗     ✗     480K     32.01 / 0.8927   28.49 / 0.7792   27.50 / 0.7338   25.81 / 0.7773   30.16 / 0.9038
×4      ✓     ✓     ✗     482K     32.10 / 0.8934   28.51 / 0.7794   27.52 / 0.7341   25.89 / 0.7793   30.25 / 0.9050
×4      ✓     ✓     ✓     499K     32.11 / 0.8934   28.52 / 0.7797   27.53 / 0.7342   25.90 / 0.7797   30.28 / 0.9054

Table 3: Comparison with original channel attention (CA) and the presented contrast-aware channel attention (CCA).

Module                 Set5      Set14     BSD100    Urban100
IMDN_basic_B4 + CA     32.0821   28.5086   27.5124   25.8829
IMDN_basic_B4 + CCA    32.0964   28.5118   27.5185   25.8916

Figure 7: The structure of IMDN_basic_B4.

4.3.2 Ablation studies of CCA module and IIC scheme. To quickly validate the effectiveness of the contrast-aware attention (CCA) module and the intermediate information collection (IIC) scheme, we adopt 4 IMDBs to conduct the following ablation study and name this model IMDN_B4. When the CCA module and IIC scheme are removed, IMDN_B4 becomes IMDN_basic_B4, as illustrated in Figure 7. From Table 2, we can see that the CCA module leads to a performance improvement (PSNR: +0.09dB, SSIM: +0.0012 for ×4 Manga109) while adding only 2K parameters (an increase of 0.4%). The results compared with the CA module are given in Table 3. To study the efficiency of PRM in IMDB, we replace it with three cascaded 3 × 3 convolution layers (64 channels) and remove the final 1 × 1 convolution (used for fusion). The compared results are given in Table 2. Although this network has more parameters (510K), its performance is much lower than that of our IMDN_basic_B4 (480K), especially on the Urban100 and Manga109 datasets.

Table 4: Quantitative evaluation of VDSR and our IMDN_AS in PSNR, SSIM, LPIPS, running time, and memory occupation.

Method      PSNR    SSIM     LPIPS [35]   Time     Memory
VDSR [12]   28.75   0.8439   0.2417       0.0290   7,855M
IMDN_AS     29.35   0.8595   0.2147       0.0041   3,597M

4.3.3 Investigation of ACS. To verify the efficiency of the proposed adaptive cropping strategy (ACS), we use the RealSR training images to train VDSR [12] and our IMDN_AS. The results, evaluated on the RealSR RGB validation dataset, are shown in Table 4, and we can easily observe that the presented IMDN_AS achieves better performance in terms of image quality, execution speed, and memory footprint. Accordingly, this also suggests that the proposed ACS is effective in addressing the SR problem at any scale.

4.4 Comparison with state-of-the-arts
We compare our IMDN with 11 state-of-the-art methods: SRCNN [4, 5], FSRCNN [6], VDSR [12], DRCN [13], LapSRN [14], DRRN [24], MemNet [25], IDN [11], EDSR-baseline [18], SRMDNF [34], and CARN [2]. Table 5 shows quantitative comparisons for ×2, ×3, and ×4 SR. Our IMDN performs favorably against the other compared approaches on most datasets, especially at the scaling factor of ×2.

Figure 8 shows ×2, ×3 and ×4 visual comparisons on the Set5 and Urban100 datasets. For the “img_67” image from Urban100, we can see that the grid structure is recovered better by our method than by the others. This also demonstrates the effectiveness of our IMDN.

4.5 Running time
4.5.1 Complexity analysis. As the proposed IMDN mainly consists of convolutions, the total number of parameters can be computed as

$$\mathrm{Params} = \sum_{l=1}^{L} \Big( \underbrace{n_{l-1} \cdot n_l \cdot f_l^{2}}_{\text{conv}} + \underbrace{n_l}_{\text{bias}} \Big), \tag{9}$$

where l is the layer index, L denotes the total number of layers, and f represents the spatial size of the filters. The number of convolutional kernels belonging to the l-th layer is $n_l$, and its number of input channels is $n_{l-1}$. Suppose that the spatial size of the output feature maps is $m_l \times m_l$; the time complexity can then be roughly calculated by

$$O\left( \sum_{l=1}^{L} n_{l-1} \cdot n_l \cdot f_l^{2} \cdot m_l^{2} \right). \tag{10}$$

We assume that the size of the HR image is m × m, and then the computational costs can be calculated by Equation 10 (see Table 7).

4.5.2 Running Time. We use the official codes of the compared methods to test their running time in a feed-forward process. From Table 6, we can see that the actual execution time is related to the depth of the networks. Although EDSR has a large number of parameters (43M), it runs very fast. The only drawback is that it takes up more graphics memory. The main reason should be that the convolution computations within each layer are parallel. RCAN has only 16M parameters, but its depth is up to 415, which results in very slow inference speed. Compared with CARN [2] and EDSR-baseline [18],
Table 5: Average PSNR/SSIM for scale factor ×2, ×3 and ×4 on datasets Set5, Set14, BSD100, Urban100, and Manga109. Best and second best results are highlighted and underlined.

Method               Scale   Params    Set5             Set14            BSD100           Urban100         Manga109
                                       PSNR / SSIM      PSNR / SSIM      PSNR / SSIM      PSNR / SSIM      PSNR / SSIM
Bicubic              ×2      -         33.66 / 0.9299   30.24 / 0.8688   29.56 / 0.8431   26.88 / 0.8403   30.80 / 0.9339
SRCNN [4]            ×2      8K        36.66 / 0.9542   32.45 / 0.9067   31.36 / 0.8879   29.50 / 0.8946   35.60 / 0.9663
FSRCNN [6]           ×2      13K       37.00 / 0.9558   32.63 / 0.9088   31.53 / 0.8920   29.88 / 0.9020   36.67 / 0.9710
VDSR [12]            ×2      666K      37.53 / 0.9587   33.03 / 0.9124   31.90 / 0.8960   30.76 / 0.9140   37.22 / 0.9750
DRCN [13]            ×2      1,774K    37.63 / 0.9588   33.04 / 0.9118   31.85 / 0.8942   30.75 / 0.9133   37.55 / 0.9732
LapSRN [14]          ×2      251K      37.52 / 0.9591   32.99 / 0.9124   31.80 / 0.8952   30.41 / 0.9103   37.27 / 0.9740
DRRN [24]            ×2      298K      37.74 / 0.9591   33.23 / 0.9136   32.05 / 0.8973   31.23 / 0.9188   37.88 / 0.9749
MemNet [25]          ×2      678K      37.78 / 0.9597   33.28 / 0.9142   32.08 / 0.8978   31.31 / 0.9195   37.72 / 0.9740
IDN [11]             ×2      553K      37.83 / 0.9600   33.30 / 0.9148   32.08 / 0.8985   31.27 / 0.9196   38.01 / 0.9749
EDSR-baseline [18]   ×2      1,370K    37.99 / 0.9604   33.57 / 0.9175   32.16 / 0.8994   31.98 / 0.9272   38.54 / 0.9769
SRMDNF [34]          ×2      1,511K    37.79 / 0.9601   33.32 / 0.9159   32.05 / 0.8985   31.33 / 0.9204   38.07 / 0.9761
CARN [2]             ×2      1,592K    37.76 / 0.9590   33.52 / 0.9166   32.09 / 0.8978   31.92 / 0.9256   38.36 / 0.9765
IMDN (Ours)          ×2      694K      38.00 / 0.9605   33.63 / 0.9177   32.19 / 0.8996   32.17 / 0.9283   38.88 / 0.9774

Bicubic              ×3      -         30.39 / 0.8682   27.55 / 0.7742   27.21 / 0.7385   24.46 / 0.7349   26.95 / 0.8556
SRCNN [4]            ×3      8K        32.75 / 0.9090   29.30 / 0.8215   28.41 / 0.7863   26.24 / 0.7989   30.48 / 0.9117
FSRCNN [6]           ×3      13K       33.18 / 0.9140   29.37 / 0.8240   28.53 / 0.7910   26.43 / 0.8080   31.10 / 0.9210
VDSR [12]            ×3      666K      33.66 / 0.9213   29.77 / 0.8314   28.82 / 0.7976   27.14 / 0.8279   32.01 / 0.9340
DRCN [13]            ×3      1,774K    33.82 / 0.9226   29.76 / 0.8311   28.80 / 0.7963   27.15 / 0.8276   32.24 / 0.9343
LapSRN [14]          ×3      502K      33.81 / 0.9220   29.79 / 0.8325   28.82 / 0.7980   27.07 / 0.8275   32.21 / 0.9350
DRRN [24]            ×3      298K      34.03 / 0.9244   29.96 / 0.8349   28.95 / 0.8004   27.53 / 0.8378   32.71 / 0.9379
MemNet [25]          ×3      678K      34.09 / 0.9248   30.00 / 0.8350   28.96 / 0.8001   27.56 / 0.8376   32.51 / 0.9369
IDN [11]             ×3      553K      34.11 / 0.9253   29.99 / 0.8354   28.95 / 0.8013   27.42 / 0.8359   32.71 / 0.9381
EDSR-baseline [18]   ×3      1,555K    34.37 / 0.9270   30.28 / 0.8417   29.09 / 0.8052   28.15 / 0.8527   33.45 / 0.9439
SRMDNF [34]          ×3      1,528K    34.12 / 0.9254   30.04 / 0.8382   28.97 / 0.8025   27.57 / 0.8398   33.00 / 0.9403
CARN [2]             ×3      1,592K    34.29 / 0.9255   30.29 / 0.8407   29.06 / 0.8034   28.06 / 0.8493   33.50 / 0.9440
IMDN (Ours)          ×3      703K      34.36 / 0.9270   30.32 / 0.8417   29.09 / 0.8046   28.17 / 0.8519   33.61 / 0.9445

Bicubic              ×4      -         28.42 / 0.8104   26.00 / 0.7027   25.96 / 0.6675   23.14 / 0.6577   24.89 / 0.7866
SRCNN [4]            ×4      8K        30.48 / 0.8628   27.50 / 0.7513   26.90 / 0.7101   24.52 / 0.7221   27.58 / 0.8555
FSRCNN [6]           ×4      13K       30.72 / 0.8660   27.61 / 0.7550   26.98 / 0.7150   24.62 / 0.7280   27.90 / 0.8610
VDSR [12]            ×4      666K      31.35 / 0.8838   28.01 / 0.7674   27.29 / 0.7251   25.18 / 0.7524   28.83 / 0.8870
DRCN [13]            ×4      1,774K    31.53 / 0.8854   28.02 / 0.7670   27.23 / 0.7233   25.14 / 0.7510   28.93 / 0.8854
LapSRN [14]          ×4      502K      31.54 / 0.8852   28.09 / 0.7700   27.32 / 0.7275   25.21 / 0.7562   29.09 / 0.8900
DRRN [24]            ×4      298K      31.68 / 0.8888   28.21 / 0.7720   27.38 / 0.7284   25.44 / 0.7638   29.45 / 0.8946
MemNet [25]          ×4      678K      31.74 / 0.8893   28.26 / 0.7723   27.40 / 0.7281   25.50 / 0.7630   29.42 / 0.8942
IDN [11]             ×4      553K      31.82 / 0.8903   28.25 / 0.7730   27.41 / 0.7297   25.41 / 0.7632   29.41 / 0.8942
EDSR-baseline [18]   ×4      1,518K    32.09 / 0.8938   28.58 / 0.7813   27.57 / 0.7357   26.04 / 0.7849   30.35 / 0.9067
SRMDNF [34]          ×4      1,552K    31.96 / 0.8925   28.35 / 0.7787   27.49 / 0.7337   25.68 / 0.7731   30.09 / 0.9024
CARN [2]             ×4      1,592K    32.13 / 0.8937   28.60 / 0.7806   27.58 / 0.7349   26.07 / 0.7837   30.47 / 0.9084
IMDN (Ours)          ×4      715K      32.21 / 0.8948   28.58 / 0.7811   27.56 / 0.7353   26.04 / 0.7838   30.45 / 0.9075

Table 6: Memory consumption (MB) and average inference time (seconds).

Method               Scale   Params   Depth   BSD100            Urban100          Manga109
                                              Memory / Time     Memory / Time     Memory / Time
EDSR-baseline [18]   ×4      1.6M     37      665 / 0.00295     2,511 / 0.00242   1,219 / 0.00232
EDSR [18]            ×4      43M      69      1,531 / 0.00580   8,863 / 0.00416   3,703 / 0.00380
RDN [38]             ×4      22M      150     1,123 / 0.01626   3,335 / 0.01325   2,257 / 0.01300
RCAN [36]            ×4      16M      415     777 / 0.09174     2,631 / 0.55280   1,343 / 0.72250
CARN [2]             ×4      1.6M     34      945 / 0.00278     3,761 / 0.00305   2,803 / 0.00383
IMDN (Ours)          ×4      0.7M     34      671 / 0.00285     1,155 / 0.00284   895 / 0.00279
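Table 6 reports an average inference time per image. As a rough illustration of how such an average could be measured, the sketch below times repeated forward calls of an arbitrary callable. The names `average_inference_time`, `toy_model`, and the `warmup` parameter are hypothetical and are not taken from the paper or from the official codes; on a GPU, the device would additionally need to be synchronized before the clock is read, which this plain-Python sketch does not do.

```python
import time
from typing import Callable, Sequence


def average_inference_time(
    model: Callable[[object], object],
    inputs: Sequence[object],
    warmup: int = 1,
) -> float:
    """Return the mean wall-clock seconds per forward call over `inputs`."""
    if not inputs:
        raise ValueError("at least one input is required")
    # Untimed warm-up calls so that one-off setup costs do not skew the average.
    for sample in inputs[:warmup]:
        model(sample)
    start = time.perf_counter()
    for sample in inputs:
        model(sample)
    elapsed = time.perf_counter() - start
    return elapsed / len(inputs)


# Hypothetical stand-in for a network: doubles every value of a flat "image".
def toy_model(image):
    return [2 * value for value in image]


images = [[0.0] * 1024 for _ in range(100)]
print(f"{average_inference_time(toy_model, images):.6f} s per image")
```

Averaging over a whole dataset, as in Table 6, smooths out per-image variation caused by differing image sizes.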
[Figure 8 panels, PSNR/SSIM of each method against the HR reference]

Urban100 (2×), img_67:
VDSR [12] 24.10/0.9537   DRCN [13] 23.64/0.9493   DRRN [24] 24.73/0.9594   LapSRN [14] 23.80/0.9527
MemNet [25] 24.98/0.9613   IDN [11] 24.68/0.9594   EDSR-baseline [18] 26.01/0.9695   CARN [2] 25.96/0.9692   IMDN (Ours) 27.75/0.9773

Urban100 (3×), img_76:
VDSR [12] 24.75/0.8284   DRCN [13] 24.82/0.8277   DRRN [24] 24.80/0.8312   LapSRN [14] 24.89/0.8337
MemNet [25] 24.97/0.8359   IDN [11] 24.95/0.8332   EDSR-baseline [18] 25.85/0.8565   CARN [2] 25.92/0.8583   IMDN (Ours) 26.19/0.8610

Figure 8: Visual comparisons of IMDN with other SR methods on Set5 and Urban100 datasets.
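The parameter count of Equation 9 and the time complexity of Equation 10 (Section 4.5.1) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `ConvLayer` class, the two function names, and the toy three-layer network are assumptions introduced here, and the toy network is not the IMDN configuration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConvLayer:
    """One convolutional layer in the notation of Equations 9 and 10."""
    in_channels: int   # n_{l-1}
    out_channels: int  # n_l
    kernel_size: int   # f_l
    out_size: int = 1  # m_l, side length of the square output feature map


def count_params(layers: Sequence[ConvLayer]) -> int:
    """Equation 9: sum of n_{l-1} * n_l * f_l^2 (conv) + n_l (bias)."""
    return sum(
        layer.in_channels * layer.out_channels * layer.kernel_size ** 2
        + layer.out_channels
        for layer in layers
    )


def time_complexity(layers: Sequence[ConvLayer]) -> int:
    """Equation 10: sum of n_{l-1} * n_l * f_l^2 * m_l^2."""
    return sum(
        layer.in_channels * layer.out_channels
        * layer.kernel_size ** 2 * layer.out_size ** 2
        for layer in layers
    )


# Hypothetical toy network (not IMDN): 3 -> 64 -> 64 -> 3 channels,
# all 3x3 filters, every output feature map 48x48.
toy = [
    ConvLayer(3, 64, 3, 48),
    ConvLayer(64, 64, 3, 48),
    ConvLayer(64, 3, 3, 48),
]

print(count_params(toy))     # 40451
print(time_complexity(toy))  # 92897280
```

Because every term of Equation 10 carries the factor $m_l^2$, costs that share a common output size can be reported with that factor omitted.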
Table 7: The computational costs. For conciseness, we omit the factor $m^2$. Least and second least computational costs are highlighted and underlined.

Scale   LapSRN [14]   IDN [11]   EDSR-b [18]   CARN [2]   IMDN
×2      112K          175K       341K          157K       173K
×3      76K           75K        172K          90K        78K
×4      76K           51K        122K          76K        45K

[Figure 9 plot: PSNR (dB) versus execution time (sec) for IMDN, CARN, EDSR-baseline, IDN, DRRN_B1U9, LapSRN, DRCN, and VDSR.]

Figure 9: Trade-off between performance and running time on Set5 ×4 dataset. VDSR, DRCN, and LapSRN were implemented with MatConvNet, while DRRN and IDN employed the Caffe package. The rest (EDSR-baseline, CARN, and our IMDN) utilized PyTorch.

Our IMDN achieves dominant performance in terms of memory usage and time consumption.

For more intuitive comparisons with other approaches, we provide the trade-off between running time and performance on the Set5 dataset for ×4 SR in Figure 9. It shows that our IMDN achieves comparable execution time and the best PSNR value.

5 CONCLUSION
In this paper, we propose an information multi-distillation network for lightweight and accurate single image super-resolution. We construct a progressive refinement module to extract hierarchical features step by step. By cooperating with the proposed contrast-aware channel attention module, the SR performance is significantly and steadily improved. Additionally, we present the adaptive cropping strategy to solve the SR problem of an arbitrary scale factor, which is critical for the application of SR algorithms in actual scenes. Numerous experiments have shown that the proposed method achieves a commendable balance between factors affecting practical use, including visual quality, execution speed, and memory consumption. In the future, this approach will be explored to facilitate other image restoration tasks such as image denoising and enhancement.

ACKNOWLEDGMENTS
This work was supported in part by the National Natural Science Foundation of China under Grant 61432014, 61772402, U1605252, 61671339 and 61871308, in part by the National Key Research and Development Program of China under Grant 2016QY01W0200, and in part by the National High-Level Talents Special Support Program of China under Grant CS31117200001.
REFERENCES
[1] Eirikur Agustsson and Radu Timofte. 2017. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW). 126–135.
[2] Namhyuk Ahn, Byungkon Kang, and Kyung-Ah Sohn. 2018. Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network. In European Conference on Computer Vision (ECCV). 252–268.
[3] Marco Bevilacqua, Aline Roumy, Christine Guillemot, and Marie Line Alberi-Morel. 2012. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In British Machine Vision Conference (BMVC).
[4] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. 2014. Learning a deep convolutional network for image super-resolution. In European Conference on Computer Vision (ECCV). 184–199.
[5] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. 2016. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 38, 2 (2016), 295–307.
[6] Chao Dong, Chen Change Loy, and Xiaoou Tang. 2016. Accelerating the super-resolution convolutional neural network. In European Conference on Computer Vision (ECCV). 391–407.
[7] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778.
[8] Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-Excitation Networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 7132–7141.
[9] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q Weinberger. 2017. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 4700–4708.
[10] Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. 2015. Single image super-resolution from transformed self-exemplars. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 5197–5206.
[11] Zheng Hui, Xiumei Wang, and Xinbo Gao. 2018. Fast and Accurate Single Image Super-Resolution via Information Distillation Network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 723–731.
[12] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. 2016. Accurate image super-resolution using very deep convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1646–1654.
[13] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. 2016. Deeply-recursive convolutional network for image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1637–1645.
[14] Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. 2017. Deep laplacian pyramid networks for fast and accurate super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 624–632.
[15] Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. 2018. Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018).
[16] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, and Andrew Cunningham. 2017. Photo-Realistic single image super-resolution using a generative adversarial network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 4681–4690.
[17] Zhen Li, Jinglei Yang, Zheng Liu, Xiaoming Yang, Gwanggil Jeon, and Wei Wu. 2019. Feedback Network for Image Super-Resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. 2017. Enhanced Deep Residual Networks for Single Image Super-Resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW). 136–144.
[19] Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, and Thomas S Huang. 2018. Non-Local Recurrent Network for Image Restoration. In Advances in Neural Information Processing Systems (NeurIPS). 1680–1689.
[20] David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In International Conference on Computer Vision (ICCV). 416–423.
[21] Yusuke Matsui, Kota Ito, Yuji Aramaki, Azuma Fujimoto, Toru Ogawa, Toshihiko Yamasaki, and Kiyoharu Aizawa. 2017. Sketch-based manga retrieval using manga109 dataset. Multimedia Tools and Applications 76, 20 (2017), 21811–21838.
[22] Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. 2016. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1874–1883.
[23] Karen Simonyan and Andrew Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations (ICLR).
[24] Ying Tai, Jian Yang, and Xiaoming Liu. 2017. Image super-resolution via deep recursive residual network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3147–3155.
[25] Ying Tai, Jian Yang, Xiaoming Liu, and Chunyan Xu. 2017. MemNet: A Persistent Memory Network for Image Restoration. In IEEE International Conference on Computer Vision (ICCV). 4539–4547.
[26] Radu Timofte, Shuhang Gu, Jiqing Wu, Luc Van Gool, Lei Zhang, et al. 2018. NTIRE 2018 Challenge on Single Image Super-Resolution: Methods and Results. In IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW). 965–976.
[27] Tong Tong, Gen Li, Xiejie Liu, and Qinquan Gao. 2017. Image Super-Resolution Using Dense Skip Connections. In IEEE International Conference on Computer Vision (ICCV). 4799–4807.
[28] Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. 2018. Non-local Neural Networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 7794–7803.
[29] Yifan Wang, Federico Perazzi, Brian McWilliams, Alexander Sorkine-Hornung, Olga Sorkine-Hornung, and Christopher Schroers. 2018. A Fully Progressive Approach to Single-Image Super-Resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW). 977–986.
[30] Zhou Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13, 4 (2004), 600–612.
[31] Zhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, and Thomas Huang. 2015. Deep networks for image super-resolution with sparse prior. In IEEE International Conference on Computer Vision (ICCV). 370–378.
[32] Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. 2018. CBAM: Convolutional Block Attention Module. In European Conference on Computer Vision (ECCV). 3–19.
[33] Roman Zeyde, Michael Elad, and Matan Protter. 2010. On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces (ICCS). 711–730.
[34] Kai Zhang, Wangmeng Zuo, and Lei Zhang. [n. d.]. Learning a Single Convolutional Super-Resolution Network for Multiple Degradations. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3262–3271.
[35] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. 2018. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 586–595.
[36] Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, and Yun Fu. 2018. Image Super-Resolution Using Very Deep Residual Channel Attention Networks. In European Conference on Computer Vision (ECCV). 286–301.
[37] Yulun Zhang, Kunpeng Li, Kai Li, Bineng Zhong, and Yun Fu. 2019. Residual Non-local Attention Networks for Image Restoration. In International Conference on Learning Representations (ICLR).
[38] Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, and Yun Fu. 2018. Residual Dense Network for Image Super-Resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2472–2481.
