PhD Thesis Proposal

Thermal Image Super-resolution (TISR) using Deep Learning Techniques

PhD Programme: Applied Computer Science

Author: Rafael E. Rivadeneira (1)

Director: Angel D. Sappa (1, 2)

Co-director: Boris X. Vintimilla (1)

(1) Escuela Superior Politécnica del Litoral, ESPOL, FIEC, CIDIS,
Campus Gustavo Galindo, 09-01-5863, Guayaquil, Ecuador.

(2) Computer Vision Center, Edifici O, Campus UAB,
08193-Bellaterra, Barcelona, Spain.

{rrivaden, asappa, boris.vintimilla}@espol.edu.ec

Table of Contents

1 Work Motivation
2 Literature Review
   2.1 Thermal Image Application Areas
   2.2 Image Super-resolution
   2.3 Datasets
   2.4 Evaluation Metrics
3 Research Objectives
4 Research Questions
5 Research Methodology
   5.1 Convolutional Neural Networks
   5.2 Generation of Datasets
6 Required Resources
7 Expected Results
   7.1 SR through Deep CNN
   7.2 SR using CycleGAN
8 Workplan
9 Conclusions

1 Work Motivation

All objects emit infrared radiation on their own, independently of any external energy
source; the wavelength they emit depends on their temperature and lies in the
long-wavelength infrared (i.e., thermal) part of the spectrum. The electromagnetic spectrum
is divided into several regions, such as X-rays, ultraviolet (UV), visible, infrared and
radar, among others. Different scientific fields use different sub-division schemes for the
infrared region; a common scheme, shown in Fig. 1, has five regions: the near (NIR:
near-infrared), short (SWIR: short-wavelength infrared), middle (MWIR: mid-wavelength
infrared), long (LWIR: long-wavelength infrared) and far (FIR: far-infrared) spectral
bands, where the mid- and long-wavelength infrared bands are also referred to as thermal
infrared (TIR) [1]. The research work in this proposal focuses on the use of TIR images,
motivated by the facts mentioned below.

Fig. 1. The electromagnetic spectrum with sub-divided infrared spectrum.

New camera sensor technology can capture spectral bands from the ultraviolet to the
near-infrared. Visual cameras capture visible light as grayscale or RGB images. The
information in the TIR spectral band can be captured by passive sensors such as thermal
cameras, which capture the thermal-infrared radiation emitted by every object with a
temperature above absolute zero. Since they rely on the heat emitted by the object, no
external illumination, whether natural or artificial, is required. Thermal information can
provide valuable information in addition to the visible one (e.g., RGB camera), because
visible images can be affected by poor lighting conditions; for instance, in security and
object recognition applications, nothing can be captured in total darkness. Thermal cameras
are not affected by this lack of illumination and do not depend on any external energy
source. Fig. 2 shows an example of the same scene captured with both a visible spectrum
camera and a thermal camera; thermal images are represented as grayscale images, with dark
pixels for cold spots and white pixels for hot spots. In this example, a person sitting
inside the garage is clearly distinguishable in the thermal image, while in the visible
spectrum image the person is almost impossible to distinguish.

Fig. 2. Visible and thermal images of the same scene at night: (left) captured with an RGB
camera, and (right) captured with a thermal camera.

In recent years, the infrared imaging field has grown considerably; nowadays, there is a
large set of infrared cameras available on the market (e.g., FLIR, https://www.flir.com;
Axis, https://www.axis.com; among others) with different technical specifications, lenses
and costs. Innovative use of infrared imaging technology can, therefore, play an important
role in a wide range of applications, such as medicine, military defense, surveillance and
security, agriculture, building inspection and fire detection, among others, as well as the
detection, tracking and recognition of humans. Depending on the thermal camera's
specifications, the cost can vary between $200 and more than $20000; the latter are based
on active technology with a cooled detector integrated with a cryocooler, providing better
resolution and a higher frame rate. In contrast, cheap thermal cameras have a lower
resolution than commercial RGB cameras (e.g., 160×120 versus 1280×1024, respectively). This
low resolution, at a moderate price, is a big limitation when thermal cameras need to be
used for general-purpose solutions.
Image super-resolution (SR) is an ill-posed problem that refers to the estimation of a
high-resolution (HR) image/video from a low-resolution (LR) one of the same scene, usually
by means of digital image processing and Machine Learning (ML) techniques. SR has important
applications in a wide range of domains, such as surveillance and security (e.g., [2], [3],
[4]), medical imaging (e.g., [5], [6], [7]) and object detection in scenes (e.g., [8]),
among others. Obtaining an SR image has been largely explored in the visible spectrum
domain, where different super-resolution approaches have been proposed, starting from
conventional interpolation (e.g., [9], [10], [11]). More recently, deep learning techniques
have made remarkable progress, achieving state-of-the-art performance on various SR
benchmarks; most of these state-of-the-art approaches focus on the visible spectrum domain.
Learning-based super-resolution methods generally work by down-sampling the given image and
adding both noise and blur to it. These noisy and blurred poor-quality images, together
with the given images, which are considered as the Ground Truths (GT), are used in the
learning process. The approaches mentioned above have been the most common way to tackle
the super-resolution problem; however, there are a few recent contributions where the
learning process is based on pairs of images (low- and high-resolution) obtained from
different cameras. In addition, due to physical limitations of the technology and the high
cost of thermal cameras, thermal images tend to have a poor resolution. This poor
resolution could be improved by using new learning-based super-resolution methods.
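
As an illustration of the degradation procedure described above, the following minimal
sketch builds an LR/GT training pair from a single image. The box blur, the ×2 decimation,
the Gaussian noise level and the function names (`box_blur`, `make_lr`) are assumptions
chosen for brevity; they are not the exact degradation model of the works cited in this
proposal.

```python
import numpy as np

def box_blur(img, k=3):
    """Blur a 2-D image with a k x k mean filter (edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def make_lr(gt, scale=2, noise_sigma=2.0, seed=0):
    """Synthesize an LR image from a GT image: blur, decimate, add noise."""
    rng = np.random.default_rng(seed)
    blurred = box_blur(gt.astype(np.float64))
    lr = blurred[::scale, ::scale]                      # keep every scale-th pixel
    lr = lr + rng.normal(0.0, noise_sigma, size=lr.shape)
    return np.clip(lr, 0, 255).astype(np.uint8)

# Synthetic 48 x 64 grayscale ramp standing in for a GT thermal image.
gt = np.tile(np.arange(0, 256, 4, dtype=np.uint8), (48, 1))
lr = make_lr(gt)
print(gt.shape, lr.shape)
```

The pair (`lr`, `gt`) is registered by construction, which is what makes a pixel-wise
training loss straightforward in this setting.
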
Although the use of thermal imaging is not new, and conventional super-resolution
techniques (e.g., bicubic interpolation) have been used for many years in the visible
spectrum, the use of deep learning techniques for super-resolution has only emerged in
recent years, and most of these techniques focus only on the visible spectrum. This
proposal seeks to use deep learning techniques to obtain super-resolution in the thermal
spectrum, enabling the use of high-resolution images in any of the applications mentioned
above.
The remainder of this document is organized as follows: Section 2 describes work related to
the present proposal. Research objectives, questions and methodology are presented in
Sections 3, 4 and 5, respectively. Section 6 lists the resources required for the research.
The expected results are presented in Section 7. Section 8 shows the proposed work plan.
Finally, the conclusions of the present proposal are given in Section 9.

2 Literature Review
In this section, work related to the research topics covered in this proposal is reviewed.
It starts with thermal image applications in Section 2.1. Then, the main techniques for
image super-resolution, which will be the main topic of the PhD thesis, are summarized in
Section 2.2. Note that most of the reviewed approaches are intended for visible spectrum
images, while the current work focuses on thermal images. A background on thermal image
datasets is given in Section 2.3. Finally, evaluation metrics are discussed in Section 2.4.

2.1 Thermal Image Application Areas


Being able to ‘see’ the temperature in a scene is a great advantage in many applications
(e.g., detection, tracking, medicine, agriculture, among others), since it provides
information different from that of visible images.
In studies of warm-blooded wild animals, thermal imaging is useful for the diagnosis of
diseases: the temperature distribution over the animal's body changes with blood
circulation and can be detected with thermal cameras [12]; it can also reveal inflammation
in some part of the animal's body (e.g., the leg) [13]. In the food industry, temperature
is a non-invasive and non-contact measurement that provides additional knowledge about
quality, such as damage and bruises in fruits and vegetables [14].
Thermal images of inanimate objects depend on the energy that generates heat in them. They
are mainly used in applications such as building inspection [15] for detecting heat loss,
industrial applications for automatic fault detection in electrical installations [16] or
object diagnosis, fire detection, and military applications for locating gunfire or hidden
people [17].
Thermal images are also widely used in the detection and tracking of humans, which is the
first step in many surveillance applications, because they make it possible to "see" at
night. A trespasser detection system [18] can be built by adjusting a thermal camera to
detect objects in the human temperature range and then classifying the objects based on
their shape. Even a fall detection system is possible using just a low-quality thermal
camera [19].

2.2 Image Super-resolution


Although not intended for super-resolution, image enhancement tackles the process of
restoring an original image in order to obtain a more suitable image for a desired
application. Generally, image enhancement covers many techniques for improving the visual
appearance of an original image, either for machine analysis or for human viewing. The main
point of image enhancement is to restore, at the same resolution, an image that suffers
from noise, blur or loss of color, among others.
Like image enhancement, single image super-resolution (SISR) aims to restore an image; the
input is not necessarily a noisy or blurred image, but a low-resolution image to be
transformed into a high-resolution one. SISR has been extensively studied in the literature
for decades; recently, deep learning techniques have obtained better results than
conventional methods. The use of convolutional neural networks (CNNs) has shown a great
capability to improve the quality of SR results. Dong et al. [20], [21] first proposed
SRCNN to learn an end-to-end mapping between interpolated LR images and their HR
counterparts. It uses traditional methods (e.g., bicubic interpolation) to bring the input
to the same size as the HR output image; deep CNNs are then applied to these images to
reconstruct high-quality details, which reduces the learning difficulty. This first
approach achieved state-of-the-art performance and became one of the most popular
frameworks. However, the use of predefined traditional up-sampling methods introduces side
effects such as noise amplification and even some blurring, and the cost in time and space
is much higher than in other frameworks. To improve performance and efficiency, FSRCNN [22]
implements a fast SRCNN that extracts feature maps from the low-resolution image and
up-samples the image at the last layer. The authors introduce a deconvolution layer at the
end of the network and adopt smaller filter sizes but more mapping layers. Inspired by
SRCNN, deeper networks started to appear, stacking more convolutional layers with residual
learning (e.g., [23], [24]). The authors of EDSR [25], in order to speed up the training
process, remove the batch-normalization layers and take advantage of residual learning
[26]. Yamanaka et al. [27] propose a CNN-based approach named DCSCN, which uses a deep CNN
with residual net, skip connections and network in network, and achieves a computational
complexity at least 10 times smaller than the state of the art (e.g., RED [28] and DRCN
[29]) while reaching similar results.
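
The higher cost of pre-upsampling frameworks can be illustrated with a rough operation
count. The sketch below counts the multiply-accumulate operations of a single convolutional
layer applied either after interpolation (as in SRCNN) or directly on the LR grid (as in
FSRCNN). The layer configuration (3×3 kernel, 64 input and output channels), the 160×120
input size and the helper name `conv_macs` are hypothetical values used only for
illustration.

```python
def conv_macs(height, width, kernel, c_in, c_out):
    """Multiply-accumulates of one convolution layer whose output is height x width."""
    return height * width * kernel * kernel * c_in * c_out

lr_h, lr_w, scale = 120, 160, 2   # assumed LR image size and scale factor

# The same hypothetical layer evaluated on the interpolated grid and on the LR grid.
pre_upsampling = conv_macs(lr_h * scale, lr_w * scale, 3, 64, 64)
post_upsampling = conv_macs(lr_h, lr_w, 3, 64, 64)

print(pre_upsampling // post_upsampling)  # ratio equals scale ** 2
```

Under these assumptions, every layer that runs on the interpolated image costs scale² times
more than the same layer running on the LR image, which is one reason for moving the
up-sampling step to the end of the network.
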
A CNN-based approach for enhancing thermal images has been introduced by Choi et al. in
[30], inspired by the proposal in [22]. The authors in [30] compare the accuracy of a
network trained on different image spectral bands to find the best representation for
thermal enhancement. They conclude that a network trained on grayscale images provides
better enhancement than the MWIR-based network for thermal image enhancement. On the other
hand, Lee et al. [31] also propose a convolutional neural network for thermal image
enhancement. The authors evaluate four RGB-based domains, namely gray, lightness, intensity
and V (from the HSV color space), with a residual-learning technique. The approach improves
the enhancement performance and the speed of convergence. The authors conclude that the V
representation is the best one for enhancing thermal images. In [32] the authors propose
parallelized 1×1 CNNs, named Network in Network, to perform image enhancement with a low
computational cost for image reconstruction. In most of the previous approaches, thermal
images have not been considered during the training stage, although the approaches are
intended for thermal image enhancement. Instead, the authors train their CNN-based
approaches using images from the visible spectrum in different color space representations
(e.g., grayscale, HSV).
In addition, most of the aforementioned CNNs aim at minimizing the mean-square error (MSE)
between SR and GT images, which tends to over-smooth the high-frequency details in images.
In other words, a supervised training process, using pairs of images, is followed. The main
drawback of such approaches lies in the need for pixel-wise registered SR and GT images to
compute the MSE. As mentioned above, in most cases the SR image is obtained from an image
down-sampled from the GT.
In recent literature, different unsupervised training processes have been presented for
applications such as style transfer [33], image colorization [34], image enhancement [35]
and feature estimation [36], among others. All these approaches are based on two-way GANs
(CycleGAN), networks that can learn from unpaired datasets [37]. CycleGAN can be used to
learn how to map images from one domain (source domain) into another domain (target
domain); this functionality makes the CycleGAN model appropriate for image SR estimation
when there is no pixel-wise registration.
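
The constraint that allows training without registered pairs is the cycle-consistency term:
an image mapped to the other domain and back should match the original. The following
sketch computes this L1 term only; the adversarial terms of the full CycleGAN objective are
omitted. The two lambda "generators" are toy stand-ins for trained networks, and all names
are hypothetical.

```python
import numpy as np

def cycle_consistency_loss(x_low, x_high, g_lh, g_hl):
    """L1 cycle loss: |G_HL(G_LH(x_low)) - x_low| + |G_LH(G_HL(x_high)) - x_high|."""
    forward = np.mean(np.abs(g_hl(g_lh(x_low)) - x_low))
    backward = np.mean(np.abs(g_lh(g_hl(x_high)) - x_high))
    return forward + backward

# Toy stand-ins for the two generators (real ones would be trained CNNs).
g_lh = lambda img: img * 1.1 + 2.0        # "low -> high" domain mapping
g_hl = lambda img: (img - 2.0) / 1.1      # its exact inverse

rng = np.random.default_rng(0)
x_low = rng.uniform(0, 255, size=(8, 8))   # unpaired samples, one per domain
x_high = rng.uniform(0, 255, size=(8, 8))

consistent = cycle_consistency_loss(x_low, x_high, g_lh, g_hl)
inconsistent = cycle_consistency_loss(x_low, x_high, g_lh, lambda img: img)
print(consistent, inconsistent)
```

Note that `x_low` and `x_high` are never compared with each other, so they do not need to
show the same scene; the loss is close to zero only when the two mappings invert each
other.
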

2.3 Datasets
Focusing on the SR problem, there is a large variety of datasets available in the visible
spectrum; recently, the authors of [38] released DIV2K, a high-quality (2K resolution)
dataset for visible image restoration, which is split into 800 images for training, 100 for
testing and 100 for validation. Most of the approaches in the literature use common
benchmark datasets for evaluating their performance, all of which focus on the visible
spectrum domain (e.g., Set5 [39], Set14 [40], BSD300 [41], BSD500 [42], Urban100 [43],
Manga109 [44], among others). These datasets provide HR images under different categories
(e.g., animal, building, food, landscape, people, flora, fauna, car, among others) with
different resolutions and numbers of images. Some of them even include LR and HR image
pairs.
Regarding thermal images, some datasets have been published in recent years, due to the
different benefits and applications of this kind of image. In [45], the authors generate a
dataset of 284 thermal images with a resolution of 360×240. The dataset was acquired with a
Raytheon 300D on their university campus, at a walkway and street intersection, capturing
images at several times of day and under several weather conditions. In [46], a dataset of
15224 thermal images with a resolution of 164×129 is proposed. This dataset was acquired
with an Indigo Omega imager mounted on a vehicle driven in outdoor urban scenarios. In
[47], a FLIR-A35 is used to acquire more than 41500 thermal images with a resolution of
320×256. An HR dataset was presented in [48]; it contains seven different scenes, most of
them collected with a FLIR SC8000 with a full resolution of 1024×1024. The dataset consists
of 63782 frames with thousands of recorded objects; it is one of the largest collections of
HR thermal images available.
Most of the thermal image datasets mentioned above are designed for object detection and
tracking; some others are designed for applications in the biometric or medical domains
(e.g., [49], [50]); and just a few of them are intended for super-resolution tasks. Also,
most of them contain low-resolution images, and others are captured in a single scenario,
which results in poor variability.

2.4 Evaluation Metrics


Deep convolutional neural networks need evaluation metrics to guide and assess the learning
of the end-to-end mapping between an input and an output image; in this particular case,
the mapping from a low-resolution to a high-resolution image. The most widely used
quantitative metrics for super-resolution evaluation are: i) Peak Signal-to-Noise Ratio
(PSNR), which is commonly used to measure the reconstruction quality of lossy
transformations; and ii) Structural Similarity Index Metric (SSIM) [51], which is based on
independent comparisons of luminance, contrast and structure and has a higher correlation
with human perception. A higher PSNR or SSIM score means better restoration fidelity. Image
quality assessments (IQA) focused on the perception of human viewers are avoided in the
current proposal because they are not necessarily consistent in the case of thermal images;
furthermore, they are expensive and time-consuming. Even though these images are acquired
in the LWIR spectral band, they are represented as grayscale images, so PSNR and SSIM can
be used.
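
Both metrics can be sketched for 8-bit grayscale images as follows. `psnr` follows the
usual definition, 10·log10(MAX² / MSE). `global_ssim` evaluates the SSIM expression over
the whole image as a single window, which is a simplifying assumption; the metric in [51]
averages the same expression over local windows, so its values will generally differ. The
function names and the test images are illustrative.

```python
import numpy as np

def psnr(gt, sr, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB between a GT image and an SR image."""
    mse = np.mean((gt.astype(np.float64) - sr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def global_ssim(gt, sr, max_val=255.0):
    """SSIM computed once over the whole image (simplified, single window)."""
    x, y = gt.astype(np.float64), sr.astype(np.float64)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2   # stabilizing constants
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

gt = np.arange(64, dtype=np.uint8).reshape(8, 8) * 3   # synthetic 8-bit image
sr = gt + 5                                            # constant error of 5 gray levels

print(psnr(gt, sr), global_ssim(gt, sr))
```

Since both functions only need two registered arrays of the same size, they apply to
thermal images stored as 8-bit grayscale in the same way as to visible images.
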

3 Research Objectives
This thesis proposal focuses on the analysis of images from the long-wavelength infrared
band of the electromagnetic spectrum, which can be exploited to solve some existing
problems in the field of computer vision. To this end, different deep learning
architectures will be designed for use with thermal images. The most important research
activities are listed below:
1. Evaluation of convolutional network architectures used in the visual spectrum.
2. Design and implementation of deep learning techniques focused on thermal images.
3. Generation of databases for the validation of implemented techniques.
4. Validation of implemented models.

4 Research Questions
The topics to be addressed during the research activities are related to the analysis of
different techniques in the computer vision field. The design and implementation of
architectures that make use of computational intelligence through deep learning networks
are considered for thermal images. The following research questions have been formulated to
fulfill the objectives of this thesis proposal:
1. How can a Convolutional Neural Network architecture be designed to obtain a
high-resolution thermal image from a low-resolution image?
2. Is it possible to use a Convolutional Neural Network architecture designed for visible
spectrum images to improve thermal image resolution?
3. Can thermal images from the low-resolution domain be transformed into a high-resolution
domain without having paired images?

5 Research Methodology

In this section, the techniques that will be used in this proposal are described. Section
5.1 justifies the use of CNNs; then, the datasets needed to carry out this proposal are
described in Section 5.2.

5.1 Convolutional Neural Networks

In order to answer the research questions, convolutional neural networks will be used. This
decision is based on the fact that CNNs provide good representations of image features,
which is central to computer vision.
Convolutional Neural Networks are a class of deep neural networks mostly applied to image
analysis. Convolutional layers are the core operation of a CNN: they act as feature
extractors and consist of a set of learnable kernels. Input images are two-dimensional data
(matrices); a convolution computes the dot product between each kernel and local regions of
the input, producing a feature map that highlights the most representative characteristics
at each spatial position of the input. Fig. 3 illustrates a typical CNN for classification.
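
The dot-product view of a convolutional layer can be made concrete with a short sketch. The
kernel below is set by hand for illustration, whereas in a CNN its values would be learned;
as in most deep learning frameworks, the kernel is not flipped (i.e., the operation is a
cross-correlation). The function name `conv2d_valid` is an assumption.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output value is one dot product."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((5, 5))
image[:, 2:] = 1.0                              # vertical step edge
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)       # hand-set horizontal-gradient kernel

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The resulting feature map responds strongly where the window covers the edge and is zero in
the flat region, which is the sense in which a kernel extracts a feature at each spatial
position.
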
The down-sampling layer (a.k.a. pooling layer) can use several non-linear functions, the
most common being max pooling: it divides the two-dimensional input into several sub-areas
and takes the maximum value of each sub-area. These layers reduce the spatial dimensions
and, therefore, the number of parameters, which is reflected in the computational cost.
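
A minimal sketch of max pooling over non-overlapping sub-areas is given below. The window
size of 2 and the function name `max_pool2d` are assumptions; rows and columns that do not
fill a complete window are dropped.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Max pooling over non-overlapping size x size sub-areas of a 2-D input."""
    h = x.shape[0] - x.shape[0] % size     # crop to a multiple of the window size
    w = x.shape[1] - x.shape[1] % size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 0, 5, 6],
              [1, 2, 7, 8]])

pooled = max_pool2d(x)
print(pooled)   # [[4 2]
                #  [2 8]]
```

The 4×4 input becomes a 2×2 output, so every following layer processes four times fewer
spatial positions.
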

Fig. 3. The architecture of a typical convolutional neural network, alternating between
convolution and down-sampling layers. The feature maps of the final subsampling layer are
fed to fully connected layers. The output layer usually uses a softmax activation function.

5.2 Generation of Datasets

In order to test the work developed in the thesis, to overcome the limitations mentioned in
Section 2.3, and due to the lack of thermal image datasets, different datasets are going to
be generated in the thesis. This work has already started, and two kinds of datasets have
been acquired for use in preliminary experiments.
The first dataset has been acquired using a single TAU2 thermal camera with a 13mm lens and
a resolution of 640×512, in indoor and outdoor scenarios; a set of 101 images (PNG format
with a depth of 8 bits) has been acquired. This dataset can be used to train CNNs following
the traditional method of down-sampling the images, so that the SR and GT images are
registered.
Regarding the second dataset, unlike the first one, the main idea is to have three
semi-paired images of the same outdoor scenario acquired with three different thermal
cameras. Each resolution has to be double the previous one (e.g., 160×120, 320×240 and
640×480). As with the previous dataset, the collection of this dataset has already started,
using Axis Domo P1290, Axis Q2901-E and FC-6320 FLIR thermal cameras for low (LR), mid (MR)
and high (HR) resolution, respectively. Acquired images are recorded in PNG format with a
depth of 8 bits. The idea of this dataset is to develop an architecture able to transform a
real low-resolution image into a high-resolution image and, thanks to the semi-paired
images, to have real GT images to compare with.
The images in the acquired datasets are represented in grayscale. A data augmentation
process, which rotates the images and flips them from top to bottom and from left to right,
can be applied to all images in order to increase variability and avoid overfitting in any
network implementation. Examples of images are depicted in Fig. 4.
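
The augmentation step can be sketched with NumPy as follows. The choice of 90-degree
rotation steps and the function name `augment` are assumptions; the text above only states
that images are rotated and flipped.

```python
import numpy as np

def augment(img):
    """Return rotated and flipped variants of one grayscale image."""
    variants = [np.rot90(img, k) for k in range(4)]   # 0, 90, 180 and 270 degrees
    variants.append(np.flipud(img))                   # top-to-bottom flip
    variants.append(np.fliplr(img))                   # left-to-right flip
    return variants

img = np.arange(12).reshape(3, 4)      # small stand-in for a thermal image
augmented = augment(img)
print(len(augmented), [v.shape for v in augmented])
```

For non-square images, the 90- and 270-degree rotations swap height and width, so batches
that mix orientations would need a crop or a fixed patch size.
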

Fig. 4. Examples of thermal images used for training. (top row) LR images from Axis Domo
P1290. (middle row) MR images from Axis Q2901-E. (bottom row) HR images from FC-6320 FLIR.

6 Required Resources
The deep learning-based approach developed in this thesis will be implemented on a machine
with a GPU such as an NVIDIA Titan Xp and at least 64GB of RAM. The algorithms will be
implemented in Python with the TensorFlow framework and the Keras library.

7 Expected Results
In the proposed thesis, different deep learning techniques will be evaluated. The models
will apply, in the thermal spectrum, different techniques originally proposed to tackle the
SR problem in the visible spectrum. In addition, real thermal images of different
resolutions (low, mid and high) will be used. Preliminary results have been obtained in
this direction, and some of them are briefly presented below.

7.1 SR through Deep CNN


With the first acquired dataset mentioned above, preliminary results have been obtained
with a super-resolution approach using a deep convolutional neural network architecture,
see Fig. 5. The main idea is to use a CNN to enhance a low-resolution thermal image at
different scales, training it separately on two kinds of datasets, visible and thermal
images, in order to determine which kind of dataset is better for thermal image SR. Typical
down-sampling techniques are applied to the GT images in order to have paired input and
output images.

Fig. 5. Proposed convolutional neural network architecture.

Preliminary quantitative results, see Table 1, show that, on thermal images, the network
model trained with the thermal image dataset performs better than the one trained with just
a visible image dataset. These results also show that using a CNN to obtain SR thermal
images is possible. Qualitative results are depicted in Fig. 6.

Table 1. Preliminary results for ×2 scale SR: comparison between visible- and thermal-based
models.

Dataset     Bicubic   Visible   Thermal
Thermal6    39.59     40.88     41.24
SET5        33.64     37.69     37.46

Fig. 6. Qualitative results of the first preliminary experiment.

7.2 SR using CycleGAN


Using the second acquired dataset, a CycleGAN architecture for thermal image
super-resolution has been proposed for preliminary results, see Fig. 7, with a ResNet with
6 residual blocks (ResNet6) as generator and PatchGAN as discriminator. The inputs and the
outputs should have the same resolution, so bicubic interpolation is applied to the real
low-resolution image. The network is trained to perform SR at a ×2 scale in two scenarios:
generating mid-resolution images from low-resolution ones, and generating high-resolution
images from mid-resolution ones. The main idea of CycleGAN is to transform images from one
domain to another without having paired sets of images.

Fig. 7. Proposed CycleGAN architecture using a ResNet with 6 residual blocks; G_{L-H} and
G_{H-L} represent the generators from lower to higher and from higher to lower resolution,
respectively. D_H and D_L represent the discriminators for each resolution.

Experimental results are shown in Table 2, and some illustrations are depicted in Fig. 8.
These results indicate that it is possible to transform images from low resolution to high
resolution; they also indicate that the proposed approach obtains better results than
traditional approaches based on just using a CNN.

Table 2. Results on LR set in a ×2 scale factor, compared with its MR registered validation
set.

×2 scale                  LR to MR             MR to HR
Method                    PSNR     SSIM        PSNR     SSIM
Bicubic Interpolation     16.46    0.6695      20.17    0.7553
Previous CNN              17.01    0.6704      20.24    0.7492
Current experiment        21.50    0.7218      22.42    0.7989

Fig. 8. SR preliminary results on real-world LR images with a ×2 scale factor; these
illustrations correspond to the 80% centered area cropped from the images. (top row)
Bicubic interpolation image, (middle row) super-resolution results (SR_LR), (bottom row)
ground truth MR image.

8 Workplan

September 2019 to September 2020
  Preliminary investigation (100% complete)
  PhD proposal
    Writing
    Writing modifications
    Presentation
    Submission
  PhD thesis research
    Guided readings
      Papers reading
      Report writing
      Submission
End of 2nd year

September 2020 to September 2021
  PhD thesis research
    Conference paper
      Related work reading
      Design experiments
      Perform experiments
      Paper writing
      Submission
  PhD proposal defense
    Writing modifications
    Write exam
    Oral defense
    Defense accepted
  Conference paper
    Related work reading
    Design experiments
    Perform experiments
    Paper writing
    Submission
End of 3rd year

September 2021 to September 2022
  PhD thesis research
    Journal
      Related work reading
      Design experiments
      Perform experiments
      Journal writing
      Submission
  PhD thesis
    Write PhD thesis
    PhD thesis defense
    Graduation
End of 4th year

9 Conclusions
This document presents a research proposal on thermal image super-resolution and its
different applications. The proposed models will be designed and implemented in order to
achieve better results than the state of the art. Deep learning techniques will be used, in
particular variants of convolutional neural networks and generative adversarial networks.
For each of the approaches described above, preliminary experiments have been carried out
to demonstrate the validity of the designed model architecture. During the PhD research,
more exhaustive comparisons and evaluations will be carried out.

References
1. Gade, R., Moeslund, T.B.: Thermal cameras and applications: A survey. Machine Vision and Applications 81 (2014) 89–96
2. Zhang, L., Zhang, H., Shen, H., Li, P.: A super-resolution reconstruction algorithm for surveillance images. Signal Processing 90 (2010) 848–859
3. Rasti, P., Uiboupin, T., Escalera, S., Anbarjafari, G.: Convolutional neural network super resolution for face recognition in surveillance monitoring. In: International Conference on Articulated Motion and Deformable Objects, Springer (2016) 175–184
4. Shamsolmoali, P., Zareapoor, M., Jain, D.K., Jain, V.K., Yang, J.: Deep convolution network for surveillance records super-resolution. Multimedia Tools and Applications 78 (2019) 23815–23829
5. Mudunuri, S.P., Biswas, S.: Low resolution face recognition across variations in pose and illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence 38 (2015) 1034–1040
6. Robinson, M.D., Chiu, S.J., Toth, C.A., Izatt, J.A., Lo, J.Y., Farsiu, S.: New applications of super-resolution in medical imaging. In: Super-Resolution Imaging. CRC Press (2017) 401–430
7. Huang, Y., Shao, L., Frangi, A.F.: Simultaneous super-resolution and cross-modality synthesis in magnetic resonance imaging. In: Deep Learning and Convolutional Neural Networks for Medical Imaging and Clinical Informatics. Springer (2019) 437–457
8. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 38 (2015) 142–158
9. Duchon, C.E.: Lanczos filtering in one and two dimensions. Journal of Applied Meteorology 18 (1979) 1016–1022
10. Keys, R.: Cubic convolution interpolation for digital image processing. IEEE Transactions on Acoustics, Speech, and Signal Processing 29 (1981) 1153–1160
11. Watson, D., Philip, G.: Neighborhood-based interpolation. Geobyte 2 (1987) 12–16
12. Cilulko, J., Janiszewski, P., Bogdaszewski, M., Szczygielska, E.: Infrared thermal imaging in studies of wild animals. European Journal of Wildlife Research 59 (2013) 17–23
13. Yanmaz, L.E., Okumus, Z., Dogan, E.: Instrumentation of thermography and its applications in horses. J Anim Vet Adv 6 (2007) 858–62
14. Gowen, A., Tiwari, B., Cullen, P., McDonnell, K., O’Donnell, C.: Applications of thermal imag-
ing in food quality and safety assessment. Trends in food science & technology 21 (2010)
190–200
15. Grinzato, E., Bison, P., Marinetti, S.: Monitoring of ancient buildings by the thermal method.
Journal of Cultural Heritage 3 (2002) 21–29
16. Jadin, M.S., Ghazali, K.H., Taib, S.: Thermal condition monitoring of electrical installations
based on infrared image analysis. In: 2013 Saudi International Electronics, Communications
and Photonics Conference, IEEE (2013) 1–6
17. Price, J., Maraviglia, C., Seisler, W., Williams, E., Pauli, M.: System capabilities, requirements
and design of the GDL gunfire detection and location system. In: 33rd Applied Imagery Pattern
Recognition Workshop (AIPR’04), IEEE (2004) 257–262
18. Wong, W.K., Tan, P.N., Loo, C.K., Lim, W.S.: An effective surveillance system using thermal
camera. In: 2009 International Conference on Signal Acquisition and Processing, IEEE (2009)
13–17
19. Sixsmith, A., Johnson, N.: A smart sensor to detect the falls of the elderly. IEEE Pervasive
computing 3 (2004) 42–47
20. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-
resolution. In: European conference on computer vision, Springer (2014) 184–199
21. Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional net-
works. IEEE transactions on pattern analysis and machine intelligence 38 (2015) 295–307
22. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network.
In: European conference on computer vision, Springer (2016) 391–407
23. Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep con-
volutional networks. In: Proceedings of the IEEE conference on computer vision and pattern
recognition. (2016) 1646–1654
24. Zhang, K., Zuo, W., Gu, S., Zhang, L.: Learning deep CNN denoiser prior for image restora-
tion. In: Proceedings of the IEEE conference on computer vision and pattern recognition. (2017)
3929–3938
25. Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single
image super-resolution. In: Proceedings of the IEEE conference on computer vision and pattern
recognition workshops. (2017) 136–144
26. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings
of the IEEE conference on computer vision and pattern recognition. (2016) 770–778
27. Yamanaka, J., Kuwashima, S., Kurita, T.: Fast and accurate image super resolution by deep CNN
with skip connection and network in network. CoRR abs/1707.05425 (2017)
28. Mao, X., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder
networks with symmetric skip connections. In: Advances in neural information processing sys-
tems. (2016) 2802–2810
29. Kim, J., Kwon Lee, J., Mu Lee, K.: Deeply-recursive convolutional network for image super-
resolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition.
(2016) 1637–1645
30. Choi, Y., Kim, N., Hwang, S., Kweon, I.S.: Thermal image enhancement using convolutional
neural network. In: Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Con-
ference on, IEEE (2016) 223–230
31. Lee, K., Lee, J., Lee, J., Hwang, S., Lee, S.: Brightness-based convolutional neural network for
thermal image enhancement. IEEE Access 5 (2017) 26867–26879
32. Lin, M., Chen, Q., Yan, S.: Network in network. In: International Conference on Learning Repre-
sentations (ICLR). (2014)
33. Chang, H., Lu, J., Yu, F., Finkelstein, A.: PairedCycleGAN: Asymmetric style transfer for applying
and removing makeup. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition. (2018) 40–48
34. Mehri, A., Sappa, A.D.: Colorizing near infrared images through a cyclic adversarial approach
of unpaired samples. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition Workshops. (2019)
35. Chen, Y.S., Wang, Y.C., Kao, M.H., Chuang, Y.Y.: Deep photo enhancer: Unpaired learning for
image enhancement from photographs with GANs. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition. (2018) 6306–6314
36. Suarez, P.L., Sappa, A.D., Vintimilla, B.X., Hammoud, R.I.: Image vegetation index through
a cycle generative adversarial network. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops. (2019)
37. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-
consistent adversarial networks. In: Proceedings of the IEEE international conference on com-
puter vision. (2017) 2223–2232
38. Timofte, R., Agustsson, E., Van Gool, L., Yang, M.H., Zhang, L.: NTIRE 2017 challenge on
single image super-resolution: Methods and results. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition Workshops. (2017) 114–125
39. Bevilacqua, M., Roumy, A., Guillemot, C., Alberi-Morel, M.L.: Low-complexity single-image
super-resolution based on nonnegative neighbor embedding. In: British Machine Vision Conference (BMVC). (2012)
40. Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In:
International conference on curves and surfaces, Springer (2010) 711–730
41. Martin, D., Fowlkes, C., Tal, D., Malik, J., et al.: A database of human segmented natural images
and its application to evaluating segmentation algorithms and measuring ecological statistics.
In: IEEE International Conference on Computer Vision (ICCV), Vancouver (2001)
42. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmen-
tation. IEEE Trans. Pattern Anal. Mach. Intell. 33 (2011) 898–916
43. Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-
exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recog-
nition. (2015) 5197–5206
44. Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-
based manga retrieval using Manga109 dataset. Multimedia Tools and Applications 76 (2017)
21811–21838
45. Davis, J.W., Keck, M.A.: A two-stage template approach to person detection in thermal imagery.
In: 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION’05),
Volume 1, IEEE (2005) 364–369
46. Olmeda, D., Premebida, C., Nunes, U., Armingol, J.M., de la Escalera, A.: Pedestrian detection
in far infrared images. Integrated Computer-Aided Engineering 20 (2013) 347–360
47. Hwang, S., Park, J., Kim, N., Choi, Y., So Kweon, I.: Multispectral pedestrian detection: Bench-
mark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and
pattern recognition. (2015) 1037–1045
48. Wu, Z., Fuller, N., Theriault, D., Betke, M.: A thermal infrared video benchmark for visual
analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops. (2014) 201–208
49. Ring, E., Ammer, K.: Infrared thermal imaging in medicine. Physiological measurement 33
(2012) R33
50. Cho, Y., Bianchi-Berthouze, N., Marquardt, N., Julier, S.J.: Deep thermal imaging: Proximate
material type recognition in the wild through deep learning of spatial surface temperature pat-
terns. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems,
ACM (2018) 2
51. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., et al.: Image quality assessment: from
error visibility to structural similarity. IEEE transactions on image processing 13 (2004) 600–612
