Images Super-Resolution Using Improved Generative Adversarial Networks
Abstract—Image super-resolution (ISR) is an important image processing technology for improving image resolution in computer vision tasks. The purpose of this paper is to study the super-resolution reconstruction of a single image based on the deep learning method. Aiming at the problem that existing pixel-loss-based super-resolution image reconstruction algorithms have a poor reconstruction effect on high-frequency details, such as textures, a lighter algorithm is proposed on the basis of the existing deep learning method (SRGAN). Firstly, a feedback structure is applied in the generator to process the feedback information and enhance the high-frequency information of the image. Secondly, a general residual feature aggregation framework (RFA) is applied to make full use of the residual information of each layer to improve the quality of the SR image. Finally, the solution space of the function is further reduced and the image quality is improved by using a new loss function. The algorithms are implemented on the pytorch framework. The experimental results on the VOC2012 data set show that, compared with the original SRGAN algorithm, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the proposed algorithm on the benchmark data set Set5 are improved by 0.83dB and 0.028, respectively; on Set14, the PSNR and SSIM are improved by 0.56dB and 0.009; on Urban100, by 0.51dB and 0.031; and on BSD100, by 0.33dB and 0.014. Compared with other improved algorithms, the proposed algorithm also performs better.

Keywords—image super-resolution; deep learning; convolutional neural networks (CNN); generative adversarial network (GAN); pytorch

I. INTRODUCTION

Image super-resolution (SR) [1,2], which refers to the process of recovering high-resolution (HR) images from low-resolution (LR) images, is an important class of image processing techniques in computer vision and image processing. It enjoys a wide range of real-world applications, such as medical imaging, surveillance and security. In addition to improving the perceived quality of the images, it also helps improve other computer vision tasks. In general, this problem is very challenging and inherently ill-posed because there are always multiple HR images corresponding to a single LR image.

The existing research on SR technology is roughly divided into two categories: SR based on traditional features and SR based on deep learning. In recent years, with the rapid development of deep learning technology, SR models based on deep learning have been actively explored and have achieved the best performance on various SR benchmarks. From the early methods based on convolutional neural networks (CNN) (e.g., SRCNN [3,4]) to the methods based on generative adversarial networks (GAN) [5] proposed in recent years (e.g., SRGAN [6]), various deep learning methods have been applied to deal with SR tasks.

Generally speaking, SR algorithms using deep learning techniques differ from each other in the following main aspects: different types of network structures [7-9], different types of loss functions [10-12], and different types of learning principles and strategies. The work of this paper mainly focuses on designing a new network structure and finding a new loss function.

The basic processing flow of a deep-learning-based SR task [3] is shown in figure 1: the HR reconstruction process is mainly divided into three stages: low-resolution image feature extraction, nonlinear mapping, and high-resolution reconstruction [1]. With the development of deep models, the residual structure [13] has emerged as a representative structure, mainly to reduce the training difficulty of the model. The general residual model is a recursive stack of residual structures, and its basic structure is shown in figure 2.
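As a concrete reading of the basic residual structure in figure 2, the following is a minimal pytorch sketch, assuming the Conv-BN-PReLU-Conv-BN body with an identity skip connection used in SRGAN-style residual blocks; the channel width and block count are illustrative, not values from this paper.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: two conv layers plus an identity skip connection."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.PReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)  # identity skip: output = input + residual

# A "recursive stack" of residual blocks, as sketched in figure 2.
trunk = nn.Sequential(*[ResidualBlock(64) for _ in range(16)])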
Fig. 2. Basic Architecture of residual model

In recent years, the method based on generative adversarial networks (GAN) [6,14-18] has been applied to SR. The basic idea is a zero-sum game confrontation. The image details are improved through the adversarial training of the generator network and the discriminator network, and good results are achieved. GAN will lower the peak signal-to-noise ratio of the image to some extent, but its application does improve the image details and gain people's visual trust. The optimization problem of GAN is a minimax problem, and the objective function of GAN can be described as follows [5,19,20].
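In the standard form of [5], with generator G and discriminator D, the objective is

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]

where x denotes real samples and z the generator input; in the SR setting the LR image I^{LR} plays the role of z.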
Compared with SRGAN, the reconstruction effect is greatly improved. It is mainly improved in two aspects: the structure of the model and the loss function. The discriminator uses the same structure as SRGAN.

A. Network Architecture

In order to further improve the restored image quality of SRGAN, this paper mainly makes two improvements to the structure of the generator G: 1) apply a feedback structure in the residual structure to deal with the feedback information flow effectively; 2) apply a general residual feature aggregation framework (RFA) to improve the accuracy of SR images. The generator structure is shown in figure 3.
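As a rough illustration of the residual feature aggregation idea [21], the sketch below concatenates the local residual features of a group of blocks and fuses them with a 1x1 convolution; the block count, width, and interaction with the feedback structure are assumptions rather than the exact configuration of figure 3.

import torch
import torch.nn as nn

class RFAGroup(nn.Module):
    """Residual feature aggregation: fuse the residual of every block, not just the last one."""
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.PReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            for _ in range(num_blocks)
        )
        # 1x1 convolution fuses the concatenated residual features back to `channels`.
        self.fuse = nn.Conv2d(channels * num_blocks, channels, kernel_size=1)

    def forward(self, x):
        residuals, feat = [], x
        for block in self.blocks:
            r = block(feat)          # local residual feature of this block
            residuals.append(r)
            feat = feat + r          # usual residual connection feeds the next block
        aggregated = self.fuse(torch.cat(residuals, dim=1))
        return x + aggregated        # aggregated residual is added to the group input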
Fig. 3. Structure of the generator (blocks labeled in the figure: Conv, BN, PReLU, Sum, RFA, Upsampling; LR input, SR output)
forward way. For the feedback structure, we use the basic convolution-activation structure and remove the BN layer. Some studies have shown that removing the BN layer can improve the model performance and reduce the computational complexity [14]. When the statistics of the training data set and the test data set differ greatly, the BN layer easily introduces unpleasant artifacts and limits the generalization ability.
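A minimal sketch of the convolution-activation block without BN described here; reusing one block across several feedback iterations, in the style of the feedback networks of [22], is an assumption for illustration rather than the paper's exact design.

import torch
import torch.nn as nn

class ConvAct(nn.Module):
    """Plain Conv + PReLU block with no BatchNorm, as discussed for the feedback branch."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.PReLU()

    def forward(self, x):
        return self.act(self.conv(x))

block = ConvAct(64)
# Feedback-style reuse: the same block processes the feature map for several steps,
# each step taking the previous step's output as input.
state = torch.zeros(1, 64, 24, 24)
for _ in range(4):
    state = block(state)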
L_1 = \frac{1}{N} \sum_{t=1}^{N} \left\| I_t^{HR} - I_t^{SR} \right\|_1  (3)

L_{cont} = \frac{1}{N} \sum_{t=1}^{N} \left\| \phi_{5,4}\left(I_t^{HR}\right) - \phi_{5,4}\left(I_t^{SR}\right) \right\|_1  (4)

L_{adv} = \frac{1}{N} \sum_{t=1}^{N} \left( 1 - D\left(G\left(I_t^{LR}\right)\right) \right)  (5)

L_{TV} = \frac{1}{N} \sum_{t=1}^{N} \sum_{i,j} \left( \left(I_{t,i,j+1} - I_{t,i,j}\right)^2 + \left(I_{t,i+1,j} - I_{t,i,j}\right)^2 \right)^{1/2}  (6)
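A rough pytorch sketch of the four terms in (3)-(6), averaged per pixel as well as per image; the VGG-19 feature extractor standing in for \phi_{5,4} and the weights used to combine the terms into the final generator loss are illustrative assumptions, not values taken from this paper.

import torch
import torchvision

# Feature extractor for the content loss (4): VGG19 features up to conv5_4, as in SRGAN.
# Inputs are assumed to be ImageNet-normalized RGB tensors.
vgg54 = torchvision.models.vgg19(weights="IMAGENET1K_V1").features[:35].eval()
for p in vgg54.parameters():
    p.requires_grad_(False)

def l1_loss(hr, sr):                       # eq. (3), pixel loss
    return (hr - sr).abs().mean()

def content_loss(hr, sr):                  # eq. (4), feature-space (content) loss
    return (vgg54(hr) - vgg54(sr)).abs().mean()

def adversarial_loss(d_out_fake):          # eq. (5), d_out_fake = D(G(I_LR)) in [0, 1]
    return (1.0 - d_out_fake).mean()

def tv_loss(sr):                           # eq. (6), total variation loss
    dh = sr[:, :, :, 1:] - sr[:, :, :, :-1]   # horizontal differences
    dv = sr[:, :, 1:, :] - sr[:, :, :-1, :]   # vertical differences
    return torch.sqrt(dh[:, :, :-1, :] ** 2 + dv[:, :, :, :-1] ** 2 + 1e-8).mean()

def generator_loss(hr, sr, d_out_fake):
    # Combination weights are illustrative assumptions, not the paper's values.
    return (l1_loss(hr, sr) + 6e-3 * content_loss(hr, sr)
            + 1e-3 * adversarial_loss(d_out_fake) + 2e-8 * tv_loss(sr))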
C. Training Algorithm
The training algorithm is shown in Algorithm 1.
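Algorithm 1 alternates discriminator and generator updates; the loop below is a generic sketch of that pattern, using hypothetical generator, discriminator and dataloader objects and the loss terms above, and is not a transcription of Algorithm 1.

import torch

# Hypothetical models, data loader and optimizers; the discriminator is assumed
# to output a probability in [0, 1].
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = torch.nn.BCELoss()

for lr_img, hr_img in dataloader:
    # 1) Update the discriminator: real HR images vs. generated SR images.
    sr_img = generator(lr_img).detach()
    d_real, d_fake = discriminator(hr_img), discriminator(sr_img)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2) Update the generator with the combined loss from (3)-(6).
    sr_img = generator(lr_img)
    loss_g = generator_loss(hr_img, sr_img, discriminator(sr_img))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()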
III. EXPERIMENTS

A. Training details and parameters

The experiments are based on the pytorch framework [23]. The experimental environment runs an NVIDIA GeForce GTX 3090 GPU on a host with an Intel(R) Core(TM) i7-9700 CPU @ 3.70GHz and 32GB of memory. In the experiment, the VOC2012 data set is used as the training set and verification set. The training set contains 16700 images of animals, plants, environment, people, buildings and other scenes, and the test set
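The text does not specify here how the LR inputs are produced from the VOC2012 images; a minimal sketch assuming the common practice of bicubic downsampling of HR crops, as in SRGAN [6], with illustrative crop size and scale factor:

import torchvision.transforms.functional as TF
from torchvision.transforms import InterpolationMode
from PIL import Image

def make_pair(path, crop_size=96, scale=4):
    """Cut an HR crop and produce its bicubic-downsampled LR counterpart."""
    hr = Image.open(path).convert("RGB")
    hr = TF.center_crop(hr, [crop_size, crop_size])
    lr = TF.resize(hr, [crop_size // scale, crop_size // scale],
                   interpolation=InterpolationMode.BICUBIC)
    return TF.to_tensor(lr), TF.to_tensor(hr)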
Fig. 6. The PSNR of the algorithm on Set5

TABLE I. PSNR values using several different algorithms (dB)
Algorithms   Set5    Set14   Urban100   BSD100
Bicubic      28.42   26.00   23.14      25.96
SRGAN        28.68   25.51   23.50      25.64
ISRGAN       28.72   25.66   23.61      25.75
Ours         29.51   26.07   24.01      25.97

TABLE II. SSIM values using several different algorithms
Algorithms   Set5    Set14   Urban100   BSD100
Bicubic      0.810   0.702   0.657      0.657
SRGAN        0.828   0.731   0.716      0.697
ISRGAN       0.830   0.733   0.719      0.699
Ours         0.856   0.740   0.747      0.711

TABLE III. Number of parameters of several different algorithms (M)
Algorithms   Bicubic   SRGAN   ISRGAN   Ours
Params(M)    -         1.6     8.7      0.5

Through the comparison of the results, it can be found that the number of parameters of this algorithm is much smaller than that of the other algorithms, and the PSNR and SSIM on each benchmark data set are improved. On Set5, the PSNR and SSIM of the proposed algorithm improve by 0.83dB and 0.028 over SRGAN, and by 0.79dB and 0.026 over ISRGAN, respectively; on Set14, the PSNR and SSIM improve by 0.56dB and 0.009 over SRGAN, and by 0.41dB and 0.007 over ISRGAN; on Urban100, the PSNR and SSIM improve by 0.51dB and 0.031 over SRGAN, and by 0.40dB and 0.028 over ISRGAN. On BSD100, the PSNR and SSIM of the proposed algorithm improve by 0.33dB and 0.014 over SRGAN, and by 0.22dB and 0.012 over ISRGAN.
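The PSNR and SSIM values in Tables I and II can be computed with standard implementations; below is a minimal sketch assuming 8-bit RGB images as numpy arrays, with the choice of border cropping or Y-channel evaluation left open.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr: np.ndarray, hr: np.ndarray):
    """PSNR (dB) and SSIM between a reconstructed SR image and its HR reference."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, data_range=255, channel_axis=-1)
    return psnr, ssim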
2) Visual Effects

Two images of different scenes are selected for the comparison of image super-resolution reconstruction effects, as shown in Figure 7: (a) represents the image to be processed, (b) represents the local effect of the LR image, (c) represents the local effect of the image using the Bicubic method, (d) represents the local effect using the SRGAN method, (e) represents the local effect using the algorithm in this paper, and (f) represents the local effect of the HR image. Through the comparison, we can see that the processing result of this paper is closer to the HR image.
[7] Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1646-1654.
[8] Lai W S, Huang J B, Ahuja N, et al. Deep laplacian pyramid networks for fast and accurate super-resolution[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 624-632.
[9] Ahn N, Kang B, Sohn K A. Fast, accurate, and lightweight super-resolution with cascading residual network[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 252-268.
[10] Sajjadi M S M, Scholkopf B, Hirsch M. Enhancenet: Single image super-resolution through automated texture synthesis[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 4491-4500.
[11] Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution[C]//European Conference on Computer Vision. Springer, Cham, 2016: 694-711.
[12] Bulat A, Tzimiropoulos G. Super-fan: Integrated facial landmark localization and super-resolution of real-world low resolution faces in arbitrary poses with gans[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 109-117.
[13] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
[14] Wang X, Yu K, Wu S, et al. Esrgan: Enhanced super-resolution generative adversarial networks[C]//Proceedings of the European Conference on Computer Vision (ECCV) Workshops. 2018.
[15] Guo Y, Chen J, Wang J, et al. Closed-loop matters: Dual regression networks for single image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 5407-5416.
[16] Yuan Y, Liu S, Zhang J, et al. Unsupervised image super-resolution using cycle-in-cycle generative adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2018: 701-710.
[17] Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2223-2232.
[18] Karnewar A, Wang O. Msg-gan: Multi-scale gradients for generative adversarial networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 7799-7808.
[19] Yilun Lin, Xingyuan Dai, Li Li, et al. The New Frontier of AI Research: Generative Adversarial Networks[J]. Acta Automatica Sinica, 2018, 44(5): 775-792.
[20] Kunfeng Wang, Chao Gou, Yanjie Duan, et al. Generative Adversarial Networks: The State of the Art and Beyond[J]. Acta Automatica Sinica, 2017, 43(3): 321-332.
[21] Liu J, Zhang W, Tang Y, et al. Residual feature aggregation network for image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 2359-2368.
[22] Li Z, Yang J, Liu Z, et al. Feedback network for image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3867-3876.
[23] Huanxin Cheng, Wenhan Liu. Image super-resolution using improved generative adversarial networks[J]. Electronic Messaging Technology, 2020, 43(14): 137-140.