Images Super-Resolution Using Improved Generative Adversarial Networks
Abstract—Image super-resolution (ISR) is an important image processing technology for improving image resolution in computer vision tasks. The purpose of this paper is to study the super-resolution reconstruction of a single image based on the deep learning method. Aiming at the problem that existing pixel-loss-based super-resolution image reconstruction algorithms have a poor reconstruction effect on high-frequency details, such as textures, a lighter algorithm is proposed on the basis of the existing deep learning method (SRGAN). Firstly, a feedback structure is applied in the generator to process the feedback information and enhance the high-frequency information of the image. Secondly, a general residual feature aggregation framework (RFA) is applied to make full use of the residual information of each layer to improve the quality of the SR image. Finally, the solution space of the function is further reduced and the image quality is improved by using a new loss function. The algorithms are implemented on the pytorch framework. The experimental results on the VOC2012 data set show that, compared with the original SRGAN algorithm, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the proposed algorithm on the benchmark data set Set5 are improved by 0.83dB and 0.028, respectively; on Set14, the PSNR and SSIM are improved by 0.56dB and 0.009; on Urban100, by 0.51dB and 0.031; and on BSD100, by 0.33dB and 0.014. Compared with other improved algorithms, the proposed algorithm also performs better.

Keywords—image super-resolution; deep learning; convolutional neural networks (CNN); generative adversarial network (GAN); pytorch

I. INTRODUCTION

Image super-resolution (SR) [1,2], which refers to the process of recovering high-resolution (HR) images from low-resolution (LR) images, is an important class of image processing techniques in computer vision and image processing. It enjoys a wide range of real-world applications, such as medical imaging, surveillance and security. In addition to improving the perceived quality of the images, it also helps improve other computer vision tasks. In general, this problem is very challenging and inherently ill-posed because there are always multiple HR images corresponding to a single LR image.

The existing research on SR technology is roughly divided into two categories: SR based on traditional features and SR based on deep learning. In recent years, with the rapid development of deep learning technology, SR models based on deep learning have been actively explored and have achieved the best performance on various SR benchmarks. From the early methods based on convolutional neural networks (CNN) (e.g., SRCNN [3,4]) to the methods based on generative adversarial networks (GAN) [5] proposed in recent years (e.g., SRGAN [6]), various deep learning methods have been applied to deal with SR tasks.

Generally speaking, SR algorithms using deep learning techniques differ from each other in the following main aspects: different types of network structures [7-9], different types of loss functions [10-12], and different types of learning principles and strategies. The work of this paper mainly focuses on designing a new network structure and finding a new loss function.

The basic processing flow of a deep-learning-based SR task [3] is shown in figure 1: the HR reconstruction process is mainly divided into three stages: low-resolution image feature extraction, nonlinear mapping, and high-resolution reconstruction [1]. With the development of deep models, the residual structure [13] has emerged as a representative structure, mainly to reduce the training difficulty of the model. The general residual model is a recursive stack of residual structures, and its basic structure is shown in figure 2.
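As a concrete reading of the basic residual structure in figure 2, the following is a minimal pytorch sketch, assuming the Conv-BN-PReLU-Conv-BN body with an identity skip connection used in SRGAN-style residual blocks; the channel width and block count are illustrative, not values from this paper.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: two conv layers plus an identity skip connection."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.PReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)  # identity skip: output = input + residual

# A "recursive stack" of residual blocks, as sketched in figure 2.
trunk = nn.Sequential(*[ResidualBlock(64) for _ in range(16)])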
Fig. 2. Basic Architecture of residual model

In recent years, the method based on generative adversarial networks (GAN) [6,14-18] has been applied to SR. The basic idea is a zero-sum game confrontation. The image details are improved through the adversarial training of the generator network and the discriminator network, and good results are achieved. GAN will lower the peak signal-to-noise ratio of the image to some extent, but its application does improve the image details and gain people's visual trust. The optimization problem of GAN is a minimax problem, and the objective function of GAN can be described as follows [5,19,20].
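In the standard form of [5], with generator G and discriminator D, the objective is

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]

where x denotes real samples and z the generator input; in the SR setting the LR image I^{LR} plays the role of z.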
Compared with SRGAN, the reconstruction effect is greatly improved. It is mainly improved in two aspects: the structure of the model and the loss function. The discriminator uses the same structure as SRGAN.

A. Network Architecture

In order to further improve the restored image quality of SRGAN, this paper mainly makes two improvements to the structure of the generator G: 1) apply a feedback structure in the residual structure to deal with the feedback information flow effectively; 2) apply a general residual feature aggregation framework (RFA) to improve the accuracy of SR images. The generator structure is shown in figure 3.
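As a rough illustration of the residual feature aggregation idea [21], the sketch below concatenates the local residual features of a group of blocks and fuses them with a 1x1 convolution; the block count, width, and interaction with the feedback structure are assumptions rather than the exact configuration of figure 3.

import torch
import torch.nn as nn

class RFAGroup(nn.Module):
    """Residual feature aggregation: fuse the residual of every block, not just the last one."""
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.PReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            for _ in range(num_blocks)
        )
        # 1x1 convolution fuses the concatenated residual features back to `channels`.
        self.fuse = nn.Conv2d(channels * num_blocks, channels, kernel_size=1)

    def forward(self, x):
        residuals, feat = [], x
        for block in self.blocks:
            r = block(feat)          # local residual feature of this block
            residuals.append(r)
            feat = feat + r          # usual residual connection feeds the next block
        aggregated = self.fuse(torch.cat(residuals, dim=1))
        return x + aggregated        # aggregated residual is added to the group input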
Fig. 3. Structure of the generator (blocks labeled in the figure: Conv, BN, PReLU, Sum, RFA, Upsampling; LR input, SR output)
forward way. For the feedback structure, we use the basic convolution-activation structure and remove the BN layer. Some studies have shown that removing the BN layer can improve the model performance and reduce the computational complexity [14]. When the statistics of the training data set and the test data set differ greatly, the BN layer easily introduces unpleasant artifacts and limits the generalization ability.
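A minimal sketch of the convolution-activation block without BN described here; reusing one block across several feedback iterations, in the style of the feedback networks of [22], is an assumption for illustration rather than the paper's exact design.

import torch
import torch.nn as nn

class ConvAct(nn.Module):
    """Plain Conv + PReLU block with no BatchNorm, as discussed for the feedback branch."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.PReLU()

    def forward(self, x):
        return self.act(self.conv(x))

block = ConvAct(64)
# Feedback-style reuse: the same block processes the feature map for several steps,
# each step taking the previous step's output as input.
state = torch.zeros(1, 64, 24, 24)
for _ in range(4):
    state = block(state)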
L_1 = \frac{1}{N} \sum_{t=1}^{N} \left\| I_t^{HR} - I_t^{SR} \right\|_1  (3)

L_{cont} = \frac{1}{N} \sum_{t=1}^{N} \left\| \phi_{5,4}\left(I_t^{HR}\right) - \phi_{5,4}\left(I_t^{SR}\right) \right\|_1  (4)

L_{adv} = \frac{1}{N} \sum_{t=1}^{N} \left( 1 - D\left(G\left(I_t^{LR}\right)\right) \right)  (5)

L_{TV} = \frac{1}{N} \sum_{t=1}^{N} \sum_{i,j} \left( \left(I_{t,i,j+1} - I_{t,i,j}\right)^2 + \left(I_{t,i+1,j} - I_{t,i,j}\right)^2 \right)^{1/2}  (6)
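A rough pytorch sketch of the four terms in (3)-(6), averaged per pixel as well as per image; the VGG-19 feature extractor standing in for \phi_{5,4} and the weights used to combine the terms into the final generator loss are illustrative assumptions, not values taken from this paper.

import torch
import torchvision

# Feature extractor for the content loss (4): VGG19 features up to conv5_4, as in SRGAN.
# Inputs are assumed to be ImageNet-normalized RGB tensors.
vgg54 = torchvision.models.vgg19(weights="IMAGENET1K_V1").features[:35].eval()
for p in vgg54.parameters():
    p.requires_grad_(False)

def l1_loss(hr, sr):                       # eq. (3), pixel loss
    return (hr - sr).abs().mean()

def content_loss(hr, sr):                  # eq. (4), feature-space (content) loss
    return (vgg54(hr) - vgg54(sr)).abs().mean()

def adversarial_loss(d_out_fake):          # eq. (5), d_out_fake = D(G(I_LR)) in [0, 1]
    return (1.0 - d_out_fake).mean()

def tv_loss(sr):                           # eq. (6), total variation loss
    dh = sr[:, :, :, 1:] - sr[:, :, :, :-1]   # horizontal differences
    dv = sr[:, :, 1:, :] - sr[:, :, :-1, :]   # vertical differences
    return torch.sqrt(dh[:, :, :-1, :] ** 2 + dv[:, :, :, :-1] ** 2 + 1e-8).mean()

def generator_loss(hr, sr, d_out_fake):
    # Combination weights are illustrative assumptions, not the paper's values.
    return (l1_loss(hr, sr) + 6e-3 * content_loss(hr, sr)
            + 1e-3 * adversarial_loss(d_out_fake) + 2e-8 * tv_loss(sr))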
C. Training Algorithm
The training algorithm is shown in Algorithm 1.
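Algorithm 1 alternates discriminator and generator updates; the loop below is a generic sketch of that pattern, using hypothetical generator, discriminator and dataloader objects and the loss terms above, and is not a transcription of Algorithm 1.

import torch

# Hypothetical models, data loader and optimizers; the discriminator is assumed
# to output a probability in [0, 1].
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = torch.nn.BCELoss()

for lr_img, hr_img in dataloader:
    # 1) Update the discriminator: real HR images vs. generated SR images.
    sr_img = generator(lr_img).detach()
    d_real, d_fake = discriminator(hr_img), discriminator(sr_img)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2) Update the generator with the combined loss from (3)-(6).
    sr_img = generator(lr_img)
    loss_g = generator_loss(hr_img, sr_img, discriminator(sr_img))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()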
III. EXPERIMENTS

A. Training details and parameters

The experiments are based on the pytorch framework [23]. The experimental environment runs an NVIDIA GeForce GTX 3090 GPU on a host with an Intel(R) Core(TM) i7-9700 CPU @ 3.70GHz and 32GB of memory. In the experiment, the VOC2012 data set is used as the training set and verification set. The training set contains 16700 images of animals, plants, environment, people, buildings and other scenes, and the test set
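The text does not specify here how the LR inputs are produced from the VOC2012 images; a minimal sketch assuming the common practice of bicubic downsampling of HR crops, as in SRGAN [6], with illustrative crop size and scale factor:

import torchvision.transforms.functional as TF
from torchvision.transforms import InterpolationMode
from PIL import Image

def make_pair(path, crop_size=96, scale=4):
    """Cut an HR crop and produce its bicubic-downsampled LR counterpart."""
    hr = Image.open(path).convert("RGB")
    hr = TF.center_crop(hr, [crop_size, crop_size])
    lr = TF.resize(hr, [crop_size // scale, crop_size // scale],
                   interpolation=InterpolationMode.BICUBIC)
    return TF.to_tensor(lr), TF.to_tensor(hr)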
Fig. 6. The PSNR of the algorithm on Set5

TABLE I. PSNR values using several different algorithms (dB)
Algorithms   Set5    Set14   Urban100   BSD100
Bicubic      28.42   26.00   23.14      25.96
SRGAN        28.68   25.51   23.50      25.64
ISRGAN       28.72   25.66   23.61      25.75
Ours         29.51   26.07   24.01      25.97

TABLE II. SSIM values using several different algorithms
Algorithms   Set5    Set14   Urban100   BSD100
Bicubic      0.810   0.702   0.657      0.657
SRGAN        0.828   0.731   0.716      0.697
ISRGAN       0.830   0.733   0.719      0.699
Ours         0.856   0.740   0.747      0.711

TABLE III. Number of parameters of several different algorithms (M)
Algorithms   Bicubic   SRGAN   ISRGAN   Ours
Params(M)    -         1.6     8.7      0.5

Through the comparison of the results, it can be found that the number of parameters of this algorithm is much smaller than that of the other algorithms, and the PSNR and SSIM on each benchmark data set are improved. On Set5, the PSNR and SSIM of the proposed algorithm improve by 0.83dB and 0.028 over SRGAN, and by 0.79dB and 0.026 over ISRGAN, respectively; on Set14, the PSNR and SSIM improve by 0.56dB and 0.009 over SRGAN, and by 0.41dB and 0.007 over ISRGAN; on Urban100, the PSNR and SSIM improve by 0.51dB and 0.031 over SRGAN, and by 0.40dB and 0.028 over ISRGAN. On BSD100, the PSNR and SSIM of the proposed algorithm improve by 0.33dB and 0.014 over SRGAN, and by 0.22dB and 0.012 over ISRGAN.
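The PSNR and SSIM values in Tables I and II can be computed with standard implementations; below is a minimal sketch assuming 8-bit RGB images as numpy arrays, with the choice of border cropping or Y-channel evaluation left open.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr: np.ndarray, hr: np.ndarray):
    """PSNR (dB) and SSIM between a reconstructed SR image and its HR reference."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, data_range=255, channel_axis=-1)
    return psnr, ssim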
2) Visual Effects

Two images of different scenes are selected for the comparison of image super-resolution reconstruction effects, as shown in Figure 7: (a) represents the image to be processed, (b) represents the local effect of the LR image, (c) represents the local effect of the image using the Bicubic method, (d) represents the local effect using the SRGAN method, (e) represents the local effect using the algorithm in this paper, and (f) represents the local effect of the HR image. Through the comparison, we can see that the processing result of this paper is closer to the HR image.
[7] Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1646-1654.
[8] Lai W S, Huang J B, Ahuja N, et al. Deep laplacian pyramid networks for fast and accurate super-resolution[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 624-632.
[9] Ahn N, Kang B, Sohn K A. Fast, accurate, and lightweight super-resolution with cascading residual network[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 252-268.
[10] Sajjadi M S M, Scholkopf B, Hirsch M. Enhancenet: Single image super-resolution through automated texture synthesis[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 4491-4500.
[11] Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution[C]//European Conference on Computer Vision. Springer, Cham, 2016: 694-711.
[12] Bulat A, Tzimiropoulos G. Super-fan: Integrated facial landmark localization and super-resolution of real-world low resolution faces in arbitrary poses with gans[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 109-117.
[13] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
[14] Wang X, Yu K, Wu S, et al. Esrgan: Enhanced super-resolution generative adversarial networks[C]//Proceedings of the European Conference on Computer Vision (ECCV) Workshops. 2018.
[15] Guo Y, Chen J, Wang J, et al. Closed-loop matters: Dual regression networks for single image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 5407-5416.
[16] Yuan Y, Liu S, Zhang J, et al. Unsupervised image super-resolution using cycle-in-cycle generative adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2018: 701-710.
[17] Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2223-2232.
[18] Karnewar A, Wang O. Msg-gan: Multi-scale gradients for generative adversarial networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 7799-7808.
[19] Yilun Lin, Xingyuan Dai, Li Li, et al. The New Frontier of AI Research: Generative Adversarial Networks[J]. Acta Automatica Sinica, 2018, 44(5): 775-792.
[20] Kunfeng Wang, Chao Gou, Yanjie Duan, et al. Generative Adversarial Networks: The State of the Art and Beyond[J]. Acta Automatica Sinica, 2017, 43(3): 321-332.
[21] Liu J, Zhang W, Tang Y, et al. Residual feature aggregation network for image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 2359-2368.
[22] Li Z, Yang J, Liu Z, et al. Feedback network for image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3867-3876.
[23] Huanxin Cheng, Wenhan Liu. Image super-resolution using improved generative adversarial networks[J]. Electronic Messaging Technology, 2020, 43(14): 137-140.