Doctoral Thesis Proposal DCCA
1 Work Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Thermal Image Application Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Image Super-resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.4 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
5 Research Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.1 Convolutional Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.2 Generation of Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
6 Required Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7 Expected Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.1 SR through Deep CNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.2 SR using CycleGAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
8 Workplan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Thermal Image Super-resolution 3
1 Work Motivation
All objects emit infrared radiation by themselves, independently of any external energy
source, and depending on their temperature they emit different wavelengths in the long-
wavelength infrared spectrum (i.e., thermal). The electromagnetic spectrum is divided into
several regions, such as X-rays, ultraviolet, visible, infrared and radar, among others. Different
sub-division schemes exist for the infrared region across scientific fields, but the common
scheme is shown in Fig. 1, with five regions: the near (NIR: Near-infrared), short (SWIR:
Short-wavelength infrared), middle (MWIR: Mid-wavelength infrared), long (LWIR: Long-
wavelength infrared) and far (FIR: Far-infrared) spectral bands, where the mid-wavelength
infrared is also known as thermal infrared (TIR) [1]. The research work in this proposal
focuses on the usage of TIR images, motivated by the facts mentioned below.
New-technology camera sensors can capture from the ultraviolet to the near-infrared
spectral bands. Visible cameras capture visible light as grayscale or RGB images. The
information of the TIR spectral band can be captured by passive sensors, such as thermal
cameras; they capture the thermal-infrared radiation emitted by all objects with a temperature
above absolute zero, based on the objects' heat emission, so no external illumination, either
natural or artificial, is required. Thermal information can provide valuable extra information
to the visible one (e.g., from an RGB camera), since visible images can be affected by poor
lighting conditions; for instance, in security and object recognition, nothing can be captured
in total darkness. Thermal cameras are not affected by this lack of illumination, and they
do not depend on any external energy source. Fig. 2 shows an example of the same scene
captured with both a visible-spectrum and a thermal camera, where thermal images are
represented as grayscale images, with dark pixels for cold spots and white ones for hot
spots. In the image mentioned above, a sitting person inside the garage is clearly visible
in the thermal image, while in the visible spectrum it is almost impossible to distinguish them.
In recent years, the infrared imaging field has grown considerably; nowadays, there is a
large set of infrared cameras available on the market (e.g., FLIR1 , Axis2 , among others)
with different technical specifications, lenses and costs. Innovative use of infrared imaging
technology can, therefore, play an important role in a wide range of applications, such as
medicine, military defense, surveillance and security, agriculture, building inspection and
fire detection, as well as detection, tracking and human recognition. Depending on the
thermal camera's specifications, the cost can vary between $200 and
1 https://www.flir.com
2 https://www.axis.com
4 Author: Rafael E. Rivadeneira
Fig. 2. Visible and thermal image of the same scene at night: (left) captured with an RGB camera,
and (right) captured with a thermal camera.
more than $20000; the latter are based on active technology with a cooled detector
integrated using a cryocooler, providing better resolution and a higher frame rate. On the
contrary, cheap thermal cameras have a smaller resolution than commercial RGB cameras
(e.g., 160×120 versus 1280×1024, respectively). This low resolution, at a moderate
price, is a big limitation when thermal cameras need to be used for general-purpose
solutions.
Image super-resolution (SR) is an ill-posed problem that refers to the estimation of a
high-resolution (HR) image/video from a low-resolution (LR) one of the same scene, usually
with the use of digital image processing and Machine Learning (ML) techniques. SR also
has important applications in a wide range of domains, such as surveillance and security
(e.g., [2], [3], [4]), medical imaging (e.g., [5], [6], [7]) and object detection in scenes (e.g., [8]),
among others. The possibility of obtaining a SR image has been largely exploited in the
visible spectrum domain, where different super-resolution approaches have been proposed,
from conventional interpolation (e.g., [9], [10], [11]) to the recently developed deep learning
techniques, which have achieved remarkable progress on various SR benchmarks; most of
the state of the art is focused on the visible spectrum domain.
Learning-based super-resolution methods generally work by down-sampling the given
image and adding both noise and blur to it. These noisy and blurred poor-quality images,
together with the given images, which are considered as the Ground Truth (GT), are used
in the learning process. The approaches mentioned above have mostly been used to tackle
the super-resolution problem; however, there are a few recent contributions where the
learning process is based on the usage of a pair of images (low- and high-resolution)
obtained from different cameras. Due to physical limitations of the technology and the
high cost of thermal cameras, thermal images tend to have a poor resolution. This poor
resolution could be improved by using new learning-based super-resolution algorithms,
allowing an increase in image resolution.
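The degradation procedure described above can be sketched in a few lines. This is a minimal illustration, not the exact pipeline of any cited method: a 2×2 box average stands in for blur plus decimation at a fixed ×2 scale, followed by Gaussian noise.

```python
import numpy as np

def make_lr(hr, noise_std=2.0, seed=0):
    """Synthesize a low-resolution training image from a ground-truth (GT)
    high-resolution one: blur + down-sample (a 2x2 box average does both at
    once for a x2 scale), then add Gaussian noise."""
    rng = np.random.default_rng(seed)
    img = hr.astype(np.float64)
    lr = (img[0::2, 0::2] + img[0::2, 1::2] +
          img[1::2, 0::2] + img[1::2, 1::2]) / 4.0   # box filter + decimation
    lr += rng.normal(0.0, noise_std, lr.shape)        # sensor-like noise
    return np.clip(lr, 0, 255).astype(np.uint8)

hr = np.tile(np.arange(256, dtype=np.uint8), (128, 1))  # toy 128x256 gradient image
lr = make_lr(hr)
print(hr.shape, lr.shape)  # (128, 256) (64, 128)
```

The LR/GT pairs produced this way are, by construction, pixel-wise registered, which is exactly the assumption that breaks down when LR and HR images come from different physical cameras.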
Although the use of thermal imaging is not something new, and conventional super-
resolution techniques (e.g., bicubic interpolation) have been used for many years in the
visible spectrum, the use of deep learning techniques for super-resolution has emerged only
in recent years, with most work focused on the visible spectrum. This proposal seeks to use
deep learning techniques to obtain super-resolution in the thermal spectrum, allowing the
use of high-resolution images in any of the applications mentioned above.
The remaining part of this document consists of the following sections: Section 2
describes work related to the present proposal. Research objectives, questions and
methodology are found in Sections 3, 4 and 5, respectively. Section 6 mentions the
resources necessary to carry out the research. The expected results are depicted in
Section 7. In Section 8, the proposed workplan calendar is shown. Finally, the conclusion
of the present proposal is given in Section 9.
2 Literature Review
In this section, work related to the research topics covered in this proposal is reviewed.
It starts with thermal image applications, see Section 2.1. Then, the main techniques for
image super-resolution, which will be the main topic of the PhD thesis, are summarized in
Section 2.2. Note that most of the reviewed approaches are intended for visible spectrum
images, while the current work is focused on thermal images. Then, a background on
thermal image datasets is given in Section 2.3.
image at the last layer. The authors introduce a deconvolution layer at the end of the
network and adopt a smaller filter size but more mapping layers. Inspired by SRCNN,
deeper networks started to appear, stacking more convolutional layers with residual
learning (e.g., [23], [24]). The authors of EDSR [25], in order to speed up the training
process, remove the batch-normalization layer and take advantage of residual learning [26].
Yamanaka et al. [27] propose a CNN-based approach named DCSCN, which uses a deep
CNN with residual nets, skip connections and Network in Network, achieving a computational
complexity at least 10 times smaller than the state of the art (e.g., RED [28] and DRCN [29])
while reaching similar results.
A CNN-based approach for enhancing thermal images has been introduced by Choi
et al. in [30], inspired by the proposal in [22]. The authors in [30] compare the accuracy
of a network trained on different spectral bands to find the best representation for
thermal enhancement. They conclude that a grayscale-trained network provides better
enhancement than the MWIR-based network for thermal image enhancement. On the other
hand, Lee et al. [31] also propose a convolutional neural network for the enhancement
of thermal images. The authors evaluate four RGB-based domains, namely gray,
lightness, intensity and V (from the HSV color space), with a residual-learning technique.
The approach improves enhancement performance and speed of convergence. The authors
conclude that the V representation is the best one for enhancing thermal images. In
[32], the authors propose parallelized 1×1 CNNs, named Network in Network, to perform
image enhancement with a low computational cost for image reconstruction. In most of the
previous approaches, thermal images have not been considered during the training stage,
although they are intended for thermal image enhancement. They propose to train their
CNN-based approaches using images from the visible spectrum in different color space
representations (e.g., grayscale, HSV).
In addition, most of the aforementioned CNNs aim at minimizing the mean-square
error (MSE) between the SR and GT images, tending to discard the high-frequency details
in images. In other words, a supervised training process, using a pair of images, is followed.
The main drawback of such approaches lies in the need for pixel-wise registered SR
and GT images to compute the MSE. As mentioned above, in most cases the SR image is
obtained from an image down-sampled from the GT.
In recent literature, different unsupervised training processes have been presented for
applications such as style transfer [33], image colorization [34], image enhancement
[35] and feature estimation [36], among others. All these approaches are based on two-way
GAN (CycleGAN) networks that can learn from unpaired datasets [37]. A CycleGAN can
be used to learn how to map images from one domain (source domain) into another
domain (target domain); this functionality makes the CycleGAN model appropriate for
image SR estimation when there is no pixel-wise registration.
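The key mechanism that enables learning from unpaired data is the cycle-consistency loss. The following is an illustrative sketch with hypothetical generator functions standing in for the trained networks, not actual CycleGAN training code:

```python
import numpy as np

def cycle_loss(x_lr, x_hr, g_lr2hr, g_hr2lr, lam=10.0):
    """L1 cycle-consistency loss: an image mapped to the other domain and
    back should return to itself. No paired LR/HR ground truth is needed,
    only samples from each domain."""
    fwd = np.abs(g_hr2lr(g_lr2hr(x_lr)) - x_lr).mean()  # LR -> HR -> LR
    bwd = np.abs(g_lr2hr(g_hr2lr(x_hr)) - x_hr).mean()  # HR -> LR -> HR
    return lam * (fwd + bwd)

# Toy check: identity generators reconstruct perfectly, so the loss is zero.
identity = lambda x: x
x = np.random.rand(8, 8)
print(cycle_loss(x, x, identity, identity))  # 0.0
```

In the full model this term is added to the adversarial losses of the two discriminators, one per domain.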
2.3 Datasets
Focusing on the SR problem, there is a large variety of datasets available in the visible
spectrum; recently, [38] released DIV2K, a high-quality (2K resolution) dataset for visible
image restoration, which is split into 800 images for training, 100 for testing and 100
for validation. Most of the approaches in the literature use common benchmark datasets for
evaluating their performance, all of them focused on the visible spectrum domain (e.g., Set5 [39],
Set14 [40], BSD300 [41], BSD500 [42], Urban100 [43], Manga109 [44], among others).
These datasets provide HR images in different categories (e.g., animal, building, food,
landscape, people, flora, fauna, car, among others) with different resolutions and numbers
of images. Some of them even include LR and HR image pairs.
Regarding thermal images, in recent years some datasets have been published, due to
the different benefits and applications of these kinds of images. In [45], the authors generate
a dataset with 284 thermal images, with a resolution of 360×240. They acquired this dataset
with a Raytheon 300D, on their university campus at a walkway and street intersection,
capturing images at several times of day and under different weather conditions. In [46], a
dataset of 15224 thermal images has been proposed, with a resolution of 164×129. This
dataset was acquired with an Indigo Omega imager mounted on a vehicle, driven in outdoor
urban scenarios. In [47], a FLIR-A35 is used to acquire more than 41500 thermal images,
with a resolution of 320×256. A HR dataset was presented in [48]; it contains seven
different scenes, most of them collected with a FLIR SC8000, with a full resolution of
1024×1024. The dataset consists of 63782 frames with thousands of recorded objects; it is
one of the largest collections of HR thermal images available.
Most of the thermal image datasets mentioned above are designed for object detection
and tracking; some others for applications in the biometric or medical domains (e.g., [49],
[50]); and just a few of them are intended for super-resolution tasks. Also, most of them
contain low-resolution images, and others are from the same scenario, which gives poor
variability.
3 Research Objectives
This thesis proposal focuses on the analysis of images from the long-wavelength band of
the electromagnetic spectrum, which can be exploited to solve some existing problems in the
field of computer vision; for this purpose, different deep learning architectures will be
designed to be used with thermal images. The most important research activities are
mentioned below:
1. Evaluation of convolutional network architectures used in the visual spectrum.
2. Design and implementation of deep learning techniques focused on thermal images.
3. Generation of databases for the validation of implemented techniques.
4. Validation of implemented models.
4 Research Questions
The topics to be addressed during the research activities are related to the analysis of
different techniques in the computer vision field. The design and implementation of
architectures that make use of computational intelligence through deep learning networks
are considered for thermal images. The following research questions have been formulated
to fulfill the objectives of this thesis proposal:
1. How can a Convolutional Neural Network architecture be designed to obtain a high-
resolution thermal image from a low-resolution one?
2. Is it possible to use a Convolutional Neural Network architecture designed for visible
spectrum images to improve thermal image resolution?
3. Can thermal images from the low-resolution domain be transformed into a high-resolution
domain without having paired images?
5 Research Methodology
In this section, the techniques that will be used in this proposal are mentioned. In Section
5.1, the justification for using CNNs is given; then, the datasets needed to achieve this
proposal are described in Section 5.2.

5.1 Convolutional Neural Networks
In order to answer the research questions, convolutional neural networks will be used. This
decision has been made based on the fact that CNNs allow obtaining good representations
of image features, which is the main goal of computer vision.

Convolutional Neural Networks are a class of deep neural network, mostly applied to
analyze images. The convolutional layers are the core operations of a CNN; they act as a
feature extractor and consist of a set of learnable kernels. Input images are two-dimensional
data (matrices); convolutions take dot products with the kernels, producing a feature map
with the most representative characteristics at each spatial position in the input. The
illustration in Fig. 3 represents a typical CNN for classification.
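As a concrete illustration of the convolution operation just described, the following sketch slides a single hand-crafted kernel over a toy image and takes the dot product at every position, producing one feature map; in a trained CNN the kernel values would be learned rather than fixed.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D correlation: slide the kernel over the image and take the
    dot product at each position, producing one feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel responds strongly where intensity jumps left-to-right.
img = np.zeros((5, 6))
img[:, 3:] = 1.0                      # dark left half, bright right half
edge = np.array([[-1.0, 1.0]] * 3)    # 3x2 kernel (learnable in a real CNN)
fmap = conv2d(img, edge)
print(fmap.shape)  # (3, 5)
```

The feature map peaks exactly at the column where the intensity step occurs, which is the sense in which a kernel "extracts" a characteristic of the input.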
The down-sampling layer (a.k.a. pooling layer) can use several non-linear functions, the
most common being max pooling; it divides the two-dimensional input into several
sub-areas and, for each sub-area, takes the maximum value. These layers reduce the
spatial dimension, reducing the number of parameters, which is reflected in the
computational cost.
Fig. 3. The architecture of a typical convolutional neural network, alternating between convolution
and down-sampling layers. The feature maps of the final subsampling layer are fed to fully con-
nected layers. The output layer usually uses softmax activation functions.
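The max-pooling operation described above can be sketched with plain array reshaping; a minimal ×2 sketch:

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling: split the input into 2x2 sub-areas and keep the
    maximum of each, halving both spatial dimensions."""
    h, w = x.shape
    trimmed = x[:h - h % 2, :w - w % 2]               # drop odd borders
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
print(max_pool2x2(x))
# [[ 5.  7.]
#  [13. 15.]]
```

A 4×4 input becomes 2×2, illustrating the parameter reduction mentioned above: each pooling stage quarters the number of spatial positions the next layer must process.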
5.2 Generation of Datasets

In order to test the development of the thesis work, trying to overcome the limitations
mentioned in Section 2.3, and due to the lack of thermal image datasets, different datasets
are going to be generated during the thesis. This work has already started, and two kinds
of datasets have been acquired to be tested in preliminary results.
The first dataset has been acquired using a single TAU2 thermal camera with a 13mm
lens, with a resolution of 640×512, in indoor and outdoor scenarios; a set of 101
images (PNG format with a depth of 8 bits) has been acquired. This dataset can be used
to train CNNs following the traditional method of down-sampling the images to have
registered SR and GT images.
Regarding the second dataset, unlike the first one, the main idea is to have three
semi-paired images of the same outdoor scenario acquired with three different thermal
cameras. Each resolution has to be double the previous one (e.g., 160×120, 320×240 and
640×480). As with the previous dataset, this collection has already started, using an
Axis Domo P1290, an Axis Q2901-E and a FLIR FC-6320 thermal camera for LR, MR and
HR, respectively. Acquired images are recorded in PNG format with a depth of 8 bits. The
idea of this dataset is to develop an architecture able to transform a real low-resolution
image into a high-resolution image and, with a semi-paired image, to have real
GT images to compare with.
The images in the acquired datasets are represented in grayscale, and a data
augmentation process, rotating and flipping all images from top to bottom and from left
to right, can be performed in order to have more variability and avoid overfitting of any
network implementation. Examples of images are depicted in Fig. 4.
Fig. 4. Examples of thermal images used for training. (top row) LR images from the Axis Domo
P1290. (middle row) MR images from the Axis Q2901-E. (bottom row) HR images from the FLIR FC-6320.
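The rotation-and-flip augmentation described above can be sketched as follows; this is a minimal NumPy version of the idea, not the exact pipeline that will be used:

```python
import numpy as np

def augment(img):
    """Eight variants of one image: the four 90-degree rotations, each also
    flipped left-to-right (which together cover the top-to-bottom flips),
    used to enlarge the training set and reduce overfitting."""
    out = []
    for k in range(4):                   # 0, 90, 180, 270 degree rotations
        rot = np.rot90(img, k)
        out.append(rot)
        out.append(np.fliplr(rot))       # mirrored copy of each rotation
    return out

img = np.arange(12).reshape(3, 4)        # toy grayscale image
variants = augment(img)
print(len(variants))  # 8
```

Since thermal images carry no chromatic information, these geometric transforms are the natural augmentations; photometric ones (hue, saturation) do not apply to 8-bit grayscale data.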
6 Required Resources
The deep learning-based approaches developed in this thesis will be implemented
using a GPU such as an NVIDIA Titan XP, on a machine with at least 64GB of RAM. The
algorithm implementation will be done using Python with the TensorFlow framework and
the Keras library.
7 Expected Results
In the proposed thesis, different deep learning techniques will be evaluated. The models
will consider the usage, in the thermal spectrum, of different techniques proposed to tackle
the SR problem in the visible spectrum. In addition, real thermal images of different
resolutions (low, mid and high) will be used. Preliminary results have been obtained in
this direction, and some of them are briefly presented below.
7.1 SR through Deep CNN

Preliminary quantitative results, see Table 1, show that the network model trained with a
thermal image dataset is better than one trained using just a visible image dataset. These
results also show that using a CNN to obtain SR thermal images is possible. Qualitative
results are depicted in Fig. 6.
Table 1. Preliminary results of ×2 scale SR comparison between visible- and thermal-based models.
7.2 SR using CycleGAN

Fig. 7. Proposed CycleGAN architecture using 6 ResNet residual blocks; G_L→H and G_H→L
represent the generators from lower to higher and from higher to lower resolution, respectively.
D_H and D_L represent the discriminators for each resolution.
Experimental results are shown in Table 2, and some illustrations are depicted in Fig. 8.
These results indicate that it is possible to transform from low resolution to high resolution,
and that this approach obtains better results than traditional approaches based on just
using a CNN.
Table 2. Results on the LR set at a ×2 scale factor, compared with its MR registered validation set.

Method                 | LR to MR       | MR to HR
                       | PSNR    SSIM   | PSNR    SSIM
Bicubic Interpolation  | 16.46   0.6695 | 20.17   0.7553
Previous CNN           | 17.01   0.6704 | 20.24   0.7492
Current experiment     | 21.50   0.7218 | 22.42   0.7989
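For reference, the PSNR values reported above can be computed as in the following minimal NumPy sketch (SSIM is more involved and is usually computed with a library implementation):

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference (GT) image and a
    reconstructed (SR) image; higher is better."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

gt = np.full((32, 32), 128, dtype=np.uint8)   # toy 8-bit thermal image
noisy = gt.copy()
noisy[0, 0] = 129                             # one pixel off by one
print(round(psnr(gt, noisy), 2))              # very high PSNR: tiny error
```

Because PSNR is driven by pixel-wise MSE, it requires registered image pairs, which is why the table above compares against the MR registered validation set.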
8 Workplan
The proposed workplan spans September 2019 to September 2020 and comprises the
following activities:

1. Preliminary investigation (100% complete).
2. PhD proposal: writing, writing modifications, presentation and submission.
3. PhD thesis research.
4. Guided readings and paper reading.
5. Report writing and submission.
9 Conclusions
This document presents a research proposal on thermal image super-resolution and its
different applications. The proposed models will be designed and implemented in order to
achieve better results than the state of the art. Deep learning techniques will be used, in
particular variants of convolutional neural networks and generative adversarial networks.
For each of the proposals described above, preliminary experiments have been carried out
to demonstrate the validity of the architecture of the designed model. During the PhD
investigation, more exhaustive comparisons and evaluations will be carried out.
References
1. Gade, R., Moeslund, T.B.: Thermal cameras and applications: A survey. Machine Vision and
Applications 81 (2014) 89–96
2. Zhang, L., Zhang, H., Shen, H., Li, P.: A super-resolution reconstruction algorithm for surveil-
lance images. Signal Processing 90 (2010) 848–859
3. Rasti, P., Uiboupin, T., Escalera, S., Anbarjafari, G.: Convolutional neural network super resolu-
tion for face recognition in surveillance monitoring. In: International conference on articulated
motion and deformable objects, Springer (2016) 175–184
4. Shamsolmoali, P., Zareapoor, M., Jain, D.K., Jain, V.K., Yang, J.: Deep convolution network
for surveillance records super-resolution. Multimedia Tools and Applications 78 (2019) 23815–
23829
5. Mudunuri, S.P., Biswas, S.: Low resolution face recognition across variations in pose and illu-
mination. IEEE transactions on pattern analysis and machine intelligence 38 (2015) 1034–1040
6. Robinson, M.D., Chiu, S.J., Toth, C.A., Izatt, J.A., Lo, J.Y., Farsiu, S.: New applications of
super-resolution in medical imaging. In: Super-Resolution Imaging. CRC Press (2017) 401–430
7. Huang, Y., Shao, L., Frangi, A.F.: Simultaneous super-resolution and cross-modality synthesis
in magnetic resonance imaging. In: Deep Learning and Convolutional Neural Networks for
Medical Imaging and Clinical Informatics. Springer (2019) 437–457
8. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Region-based convolutional networks for ac-
curate object detection and segmentation. IEEE transactions on pattern analysis and machine
intelligence 38 (2015) 142–158
9. Duchon, C.E.: Lanczos filtering in one and two dimensions. Journal of applied meteorology 18
(1979) 1016–1022
10. Keys, R.: Cubic convolution interpolation for digital image processing. IEEE transactions on
acoustics, speech, and signal processing 29 (1981) 1153–1160
11. Watson, D., Philip, G.: Neighborhood-based interpolation. Geobyte 2 (1987) 12–16
12. Cilulko, J., Janiszewski, P., Bogdaszewski, M., Szczygielska, E.: Infrared thermal imaging in
studies of wild animals. European Journal of Wildlife Research 59 (2013) 17–23
13. Yanmaz, L.E., Okumus, Z., Dogan, E.: Instrumentation of thermography and its applications in
horses. J Anim Vet Adv 6 (2007) 858–62
14. Gowen, A., Tiwari, B., Cullen, P., McDonnell, K., O’Donnell, C.: Applications of thermal imag-
ing in food quality and safety assessment. Trends in food science & technology 21 (2010)
190–200
15. Grinzato, E., Bison, P., Marinetti, S.: Monitoring of ancient buildings by the thermal method.
Journal of Cultural Heritage 3 (2002) 21–29
16. Jadin, M.S., Ghazali, K.H., Taib, S.: Thermal condition monitoring of electrical installations
based on infrared image analysis. In: 2013 Saudi International Electronics, Communications
and Photonics Conference, IEEE (2013) 1–6
17. Price, J., Maraviglia, C., Seisler, W., Williams, E., Pauli, M.: System capabilities, requirements
and design of the gdl gunfire detection and location system. In: 33rd Applied Imagery Pattern
Recognition Workshop (AIPR’04), IEEE (2004) 257–262
18. Wong, W.K., Tan, P.N., Loo, C.K., Lim, W.S.: An effective surveillance system using thermal
camera. In: 2009 International Conference on Signal Acquisition and Processing, IEEE (2009)
13–17
19. Sixsmith, A., Johnson, N.: A smart sensor to detect the falls of the elderly. IEEE Pervasive
computing 3 (2004) 42–47
20. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-
resolution. In: European conference on computer vision, Springer (2014) 184–199
21. Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional net-
works. IEEE transactions on pattern analysis and machine intelligence 38 (2015) 295–307
22. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network.
In: European conference on computer vision, Springer (2016) 391–407
23. Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep con-
volutional networks. In: Proceedings of the IEEE conference on computer vision and pattern
recognition. (2016) 1646–1654
24. Zhang, K., Zuo, W., Gu, S., Zhang, L.: Learning deep cnn denoiser prior for image restora-
tion. In: Proceedings of the IEEE conference on computer vision and pattern recognition. (2017)
3929–3938
25. Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single
image super-resolution. In: Proceedings of the IEEE conference on computer vision and pattern
recognition workshops. (2017) 136–144
26. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings
of the IEEE conference on computer vision and pattern recognition. (2016) 770–778
27. Yamanaka, J., Kuwashima, S., Kurita, T.: Fast and accurate image super resolution by deep CNN
with skip connection and network in network. CoRR abs/1707.05425 (2017)
28. Mao, X., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder
networks with symmetric skip connections. In: Advances in neural information processing sys-
tems. (2016) 2802–2810
29. Kim, J., Kwon Lee, J., Mu Lee, K.: Deeply-recursive convolutional network for image super-
resolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition.
(2016) 1637–1645
30. Choi, Y., Kim, N., Hwang, S., Kweon, I.S.: Thermal image enhancement using convolutional
neural network. In: Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Con-
ference on, IEEE (2016) 223–230
31. Lee, K., Lee, J., Lee, J., Hwang, S., Lee, S.: Brightness-based convolutional neural network for
thermal image enhancement. IEEE Access 5 (2017) 26867–26879
32. Lin, M., Chen, Q., Yan, S.: Network in network. In: International Conference on Learning
Representations (ICLR). (2014)
33. Chang, H., Lu, J., Yu, F., Finkelstein, A.: Pairedcyclegan: Asymmetric style transfer for applying
and removing makeup. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition. (2018) 40–48
34. Mehri, A., Sappa, A.D.: Colorizing near infrared images through a cyclic adversarial approach
of unpaired samples. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition Workshops. (2019)
35. Chen, Y.S., Wang, Y.C., Kao, M.H., Chuang, Y.Y.: Deep photo enhancer: Unpaired learning for
image enhancement from photographs with gans. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition. (2018) 6306–6314
36. Suarez, P.L., Sappa, A.D., Vintimilla, B.X., Hammoud, R.I.: Image vegetation index through
a cycle generative adversarial network. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops. (2019)
37. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-
consistent adversarial networks. In: Proceedings of the IEEE international conference on com-
puter vision. (2017) 2223–2232
38. Timofte, R., Agustsson, E., Van Gool, L., Yang, M.H., Zhang, L.: Ntire 2017 challenge on
single image super-resolution: Methods and results. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition Workshops. (2017) 114–125
39. Bevilacqua, M., Roumy, A., Guillemot, C., Alberi-Morel, M.L.: Low-complexity single-image
super-resolution based on nonnegative neighbor embedding. In: British Machine Vision Confer-
ence (BMVC). (2012)
40. Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In:
International conference on curves and surfaces, Springer (2010) 711–730
41. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images
and its application to evaluating segmentation algorithms and measuring ecological statistics.
In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), Vancouver (2001)
42. Arbeláez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmen-
tation. IEEE Trans. Pattern Anal. Mach. Intell 33 (2011) 898–916
43. Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-
exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recog-
nition. (2015) 5197–5206
44. Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-
based manga retrieval using manga109 dataset. Multimedia Tools and Applications 76 (2017)
21811–21838
45. Davis, J.W., Keck, M.A.: A two-stage template approach to person detection in thermal imagery.
In: 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION’05)-
Volume 1. Volume 1., IEEE (2005) 364–369
46. Olmeda, D., Premebida, C., Nunes, U., Armingol, J.M., de la Escalera, A.: Pedestrian detection
in far infrared images. Integrated Computer-Aided Engineering 20 (2013) 347–360
47. Hwang, S., Park, J., Kim, N., Choi, Y., So Kweon, I.: Multispectral pedestrian detection: Bench-
mark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and
pattern recognition. (2015) 1037–1045
48. Wu, Z., Fuller, N., Theriault, D., Betke, M.: A thermal infrared video benchmark for visual
analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops. (2014) 201–208
49. Ring, E., Ammer, K.: Infrared thermal imaging in medicine. Physiological measurement 33
(2012) R33
50. Cho, Y., Bianchi-Berthouze, N., Marquardt, N., Julier, S.J.: Deep thermal imaging: Proximate
material type recognition in the wild through deep learning of spatial surface temperature pat-
terns. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems,
ACM (2018) 2
51. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., et al.: Image quality assessment: from
error visibility to structural similarity. IEEE transactions on image processing 13 (2004) 600–612