1 Department of Computing, Electronics and Mechatronics, Universidad de las Américas Puebla, Sta. Catarina
Martir, San Andrés Cholula 72810, Mexico; [email protected]
2 Department of Immunology, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de
México, Mexico City 04510, Mexico; [email protected]
* Correspondence: [email protected]
Abstract: Breast cancer is one of the leading causes of death for women worldwide, and early
detection can help reduce the death rate. Infrared thermography has gained popularity as a non-
invasive and rapid method for detecting this pathology and can be further enhanced by applying
neural networks to extract spatial and even temporal data derived from breast thermographic images
if they are acquired sequentially. In this study, we evaluated hybrid convolutional-recurrent neural
network (CNN-RNN) models based on five state-of-the-art pre-trained CNN architectures coupled
with three RNNs to discern tumor abnormalities in dynamic breast thermographic images. The
hybrid architecture that achieved the best performance for detecting breast cancer was VGG16-LSTM,
which showed accuracy (ACC), sensitivity (SENS), and specificity (SPEC) of 95.72%, 92.76%, and
98.68%, respectively, with a CPU runtime of 3.9 s. However, the hybrid architecture that showed
the fastest CPU runtime was AlexNet-RNN with 0.61 s, although with lower performance (ACC:
80.59%, SENS: 68.52%, SPEC: 92.76%), but still superior to AlexNet (ACC: 69.41%, SENS: 52.63%,
SPEC: 86.18%) with 0.44 s. Our findings show that hybrid CNN-RNN models outperform stand-alone
CNN models, indicating that temporal data recovery from dynamic breast thermographs is possible
without significantly compromising classifier runtime.
Figure 1. Diagram of the proposed methodology for binary breast cancer classification using hybrid
CNN-RNN-based deep learning models.
2.2. Dataset
The data were acquired from a public dataset from Antonio Pedro University Hospital known as the Database for Mastology Research with Infrared Image (DMR-IR) [41], comprising 267 healthy volunteers and 44 sick volunteers. This database contains thermal breast images that were acquired using static and dynamic protocols; however, in the present work, dynamic images were used to extract the desired temporal features. Thermal images were obtained using the FLIR SC-620 IR camera. This camera possesses an image resolution of 640 × 480 pixels with an image frequency of 30 Hz; the spectral range detected by this camera is 7.5–13 µm, and the temperature range is from −40 °C to 500 °C. As part of the DIT acquisition protocol, a fan was used to apply a thermal stimulus to the volunteer until the thorax temperature reached an average of 30.5 °C. A sequence of frontal images was then taken every 15 s for 5 min. The data were stored in a txt file containing the spatial temperature, in degrees Celsius, of the heat map captured by the thermal camera.
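As an illustration of this data format, the following minimal sketch (assuming each txt file holds a whitespace-separated 480 × 640 matrix of temperatures; the file name is hypothetical) loads one frame with NumPy and rescales it to an 8-bit grayscale image for later processing:

```python
import numpy as np

def load_thermogram(txt_path):
    """Load one frame stored as a whitespace-separated matrix of temperatures (degrees C)
    and min-max normalize it to an 8-bit grayscale image."""
    temps = np.loadtxt(txt_path)                                   # assumed shape: (480, 640)
    scaled = (temps - temps.min()) / (temps.max() - temps.min() + 1e-8)
    return (scaled * 255).astype(np.uint8)

# Hypothetical usage: frame = load_thermogram("volunteer_028/frame_01.txt")
```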
Figure 2. Sample grayscale thermograms from volunteers for a breast study: (a) The image is clear, so it is selected; (b) The image is blurry, so it is not selected; (c) The image contains material (bandaged breast) that covers the study region, so it is not selected.
2.4. Segmentation

A fully automated segmentation was implemented using the U-Net architecture to remove noise from thermographic images, such as necks, stomachs, and armpits, in accordance with the methodology described by Mohamed et al. [22]. The U-Net architecture was employed to reduce the time that would be required to segment each image manually with editing software in all the thermal images. It has been reported that this network can be trained using a limited number of samples [43]. This network is convenient for biomedical data due to the large number of feature channels in its layers, which increases the resolution of the output [44]. We conducted a sample number sweep to train the U-Net and obtain the number of frames needed to segment the breast thermal images properly, starting the model training at 20 images with 5-frame increments. The appropriate number of samples to train U-Net for automatic segmentation of breast thermal images was 40. To segment the training images, we used the open-source software ITK-SNAP (version 3.8) for interactive image visualization and semi-automatic segmentation of medical images to crop only the region of interest (ROI), which, in this case, is the breasts.

U-Net Architecture

The U-Net is a fully convolutional network that automatically segments medical images [45]; it comprises 23 convolutional layers distributed in two network steps. The first part consists of the contracting path, with a repeated convolution of 3 × 3 followed by a rectified linear unit (ReLU) and a max pooling for downsampling of the input. The next step is the expansive path; in this part, there is a concatenation with the feature maps of the contracting path to upsample the signal and crop the original image for the automated segmentation. The U-Net network is shown in Figure 3, where the convolutional layers are used to encode-decode the input data and crop the regions of the images in the trained network.
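A reduced sketch of this contracting/expansive structure in TensorFlow/Keras is shown below; the filter counts, depth, and input size are illustrative assumptions and not the exact configuration trained in this work:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU, as used along both U-Net paths.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def small_unet(input_shape=(224, 224, 1)):
    inputs = layers.Input(shape=input_shape)
    # Contracting path: convolution blocks followed by max pooling (downsampling).
    c1 = conv_block(inputs, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    p2 = layers.MaxPooling2D()(c2)
    b = conv_block(p2, 128)                     # bottleneck
    # Expansive path: upsampling and concatenation with matching feature maps.
    u2 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), 64)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), 32)
    # Per-pixel breast/background probability map used to crop the ROI.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return tf.keras.Model(inputs, outputs)

model = small_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
```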
Figure 3. U-Net architecture. The contracting path is on the left side of the U-shape, and the expanding path is on the right. The blue boxes represent multi-channel feature maps. The number of channels is indicated on the top of the box. The x-y size is shown at the bottom left edge of the box. An orange arrow indicates each operation.

2.5. Data Augmentation
Once the thermal images were cropped to the ROIs, the data augmentation was applied using four different transformations: a horizontal flip, a 15° rotation, a 30° rotation, and a 15% zoom. Data augmentation was implemented to obtain more samples and to train the hybrid DL model, since the ANNs require a large amount of data to function correctly. In this step, the sequences of images are adjusted to obtain an input tensor of 224 × 224 × 20 per patient. Figure 4 shows an example of the segmentation and data augmentation process for thermal images.
Figure 4. Example of a grayscale thermogram of the volunteer with ID 28: (a) selected image by data cleansing; (b) thermal image segmented using U-Net; (c) data augmentation using the transformations of horizontal flip, rotation 15°, rotation 30°, and zoom 15%.
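A minimal sketch of these four transformations using OpenCV is given below; the rotation and zoom helpers are illustrative assumptions rather than the exact pipeline used:

```python
import cv2

def rotate(img, angle_deg):
    # Rotate about the image center without rescaling.
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def zoom(img, factor=1.15):
    # Enlarge by the zoom factor and center-crop back to the original size.
    h, w = img.shape[:2]
    zoomed = cv2.resize(img, None, fx=factor, fy=factor)
    y0, x0 = (zoomed.shape[0] - h) // 2, (zoomed.shape[1] - w) // 2
    return zoomed[y0:y0 + h, x0:x0 + w]

def augment(img):
    """Return the four augmented versions: horizontal flip, 15 deg and 30 deg rotations, 15% zoom."""
    return [cv2.flip(img, 1), rotate(img, 15), rotate(img, 30), zoom(img, 1.15)]
```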
2.6. Hybrid Deep Learning Model (CNN-RNN)

2.6.1. Convolutional Neural Network

CNNs have been studied to improve the performance of image classification, image recognition, object detection, and other tasks [46]. CNNs are the most widely used networks for visual image classification, as they allow the extraction of information from extensive data, such as images with pixels [47]. Currently, there are different imaging modalities for breast cancer diagnosis, such as mammography, ultrasound, and MRI, and the evaluation of these images is mainly performed with deep learning models such as CNNs [48].
The CNNs are composed of three main layers: the convolutional layer, the pooling
layer, and the fully connected layer. The convolutional layer consists of feature learning.
Once inputs are in the network, they are used to extract local characteristics from the image
at different positions. These convolutions are computed with a kernel to extract several
features depending on the values of these small matrices. The results of these convolutions
are passed into a nonlinear activation function, i.e., sigmoid, rectified linear unit (ReLU),
tanh, or softplus, to obtain a continuous signal [49]. The activation function is an essential
part of neural networks as it allows the output to be nonlinear and continuous, enabling
the training of the model for either classification or logistic regression [50]. The next main
computation is the pooling layer; it extracts features by reducing the dimensions of the
feature maps. The most common pooling operations are the average and the max pooling.
Finally, the fully connected layer connects all the previous values of the feature vector to
apply linear transformations to obtain the product after an activation function. For the
classification, the SoftMax regression is the most used in multiclass probability distribution.
There is also a procedure known as dropout. It consists of inhibiting a certain number of
neurons to retrain the network and ensure robust training. The process of feeding an input
into the neural network to obtain the probability distribution in the output layer is known
as forward propagation. When there is an error in the regression, this value is considered to
retrain the layer in the CNN architecture through the back propagation algorithm. Figure 5
shows the implementation of classifying tissue heterogeneity using CNN architectures. In
this figure, it is possible to visualize all the layers and the fully connected layers to obtain a
binary classification in the thermal images for the BC disease.
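To make these components concrete, the following small Keras classifier is a minimal sketch (the layer sizes are illustrative assumptions and do not correspond to the architectures evaluated in this work):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

cnn = models.Sequential([
    layers.Input(shape=(224, 224, 1)),
    # Convolution + ReLU activation: local feature extraction at different positions.
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    # Max pooling: reduces the dimensions of the feature maps.
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    # Fully connected layer with dropout for robust training.
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    # SoftMax output turning scores into a class probability distribution.
    layers.Dense(2, activation="softmax"),
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```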
Figure 5. CNN model for the binary classification of breast tissue heterogeneity (normal or abnormal) in thermographic images.
For image classification, there are five CNN architectures in the state-of-the-art: Inception-V3, VGG-16, ResNet101, GoogLeNet, and AlexNet [51]. These architectures were used for automated feature extraction in the hybrid DL model and for classification in the single DL model:
• Inception-v3: It consists of a network of 48 layers with approximately 24 million parameters to train. It was developed to improve the performance of the GoogLeNet architecture.
• VGG-16: This network consists of a structure of 16 layers, 13 convolutional layers and 3 fully connected layers. It is more accurate than the AlexNet architecture,
bases were used as feature extractors, with weights frozen during training to leverage
their pre-trained capabilities. In order to integrate the thermographic image sequences,
the input layer of the combined model was adapted to accept a sequence of 20 frames
per sample, each resized to 224 × 224 pixels and normalized to a range of [0, 1]. The
CNNs were implemented as part of a TimeDistributed layer, enabling feature extraction
from each frame independently before passing the extracted features to the RNN layers.
On the output side, the classification task was binary (healthy vs. sick), so the final layer
was modified to include a single dense neuron with a softmax activation function for
probabilistic binary classification. A sequential training process was used to train the CNN
and RNN components of the model. The extracted features from the CNNs were processed
by different types of RNNs, including simple RNNs, GRUs, and LSTM networks, to capture
temporal dependencies in the thermographic image sequences. Each RNN configuration
consisted of two stacked layers, each with 64 units. The first recurrent layer was configured
to return sequences, enabling the second layer to process the full temporal information
of the thermographic data. For the GRU and LSTM models, the forget and update gate
mechanisms allowed for effective learning of long-range dependencies while mitigating the
vanishing gradient problem typically encountered in standard RNNs. All RNNs utilized
the ReLU activation function for hidden units to improve stability during training. The
recurrent networks were trained from scratch, with random weight initialization provided
by TensorFlow using the Glorot uniform initialization method to ensure stable gradient
propagation. Input sequences consisted of 20 thermographic frames per patient, normalized
to a range of [0, 1]. The model was trained using the ADAM optimizer with a softmax
activation function in the last layer. The learning rate was set to 0.001 per iteration, with
a batch size of 16 samples. The number of epochs was set at 30, which ensured model
convergence while avoiding overfitting.
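A simplified sketch of this coupling for one of the combinations (VGG16 + LSTM) is shown below. It assumes grayscale frames replicated to three channels for the pre-trained backbone, and it uses a single-unit sigmoid output, which is the usual Keras equivalent of the single-neuron probabilistic output described above:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_vgg16_lstm(frames=20, size=224):
    # Frozen ImageNet-pretrained VGG16 used only as a per-frame feature extractor.
    backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                           input_shape=(size, size, 3), pooling="avg")
    backbone.trainable = False

    model = models.Sequential([
        layers.Input(shape=(frames, size, size, 3)),
        # TimeDistributed applies the CNN independently to each of the 20 frames.
        layers.TimeDistributed(backbone),
        # Two stacked recurrent layers with 64 units; the first returns sequences.
        layers.LSTM(64, activation="relu", return_sequences=True),
        layers.LSTM(64, activation="relu"),
        # Single output unit for the binary healthy/sick decision.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Hypothetical training call with sequences shaped (n_samples, 20, 224, 224, 3):
# model = build_vgg16_lstm(); model.fit(x_train, y_train, batch_size=16, epochs=30)
```

Swapping the two LSTM layers for SimpleRNN or GRU layers would give the other recurrent variants evaluated here.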
To monitor the performance of the models during training, the validation accuracy
was evaluated at the end of each epoch. In this way, we were able to track the model’s
learning progress and detect overfitting potential. In Appendix A, Figures A1–A5 illustrate
the validation accuracy of the different models.
\[ \mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}, \tag{2} \]
Like the Dice coefficient, its values range from 0 to 1, where higher values indicate
better segmentation accuracy.
The metrics obtained were as follows:
• Average Dice coefficient: 0.9347 ± 0.0138.
• Average Jaccard Index: 0.8776 ± 0.0242.
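As a minimal illustration (assuming binary NumPy masks for the predicted segmentation and the ground truth), both overlap metrics can be computed as:

```python
import numpy as np

def dice_and_iou(pred, truth):
    """Dice coefficient and Jaccard index (IoU) for two binary segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    dice = 2 * inter / (pred.sum() + truth.sum() + 1e-8)
    iou = inter / (union + 1e-8)
    return dice, iou
```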
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{3} \]
\[ \text{Sensitivity} = \frac{TP}{TP + FN}, \tag{4} \]
\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{5} \]
where TP is the prediction of a sample for the sick class when the real class is sick, TN is for
a healthy predicted class when the real class is healthy, FP is the prediction of a sick class
when the real is healthy, and FN is for the prediction of a healthy class when the real class
is sick. The metrics were computed from the resultant confusion matrix. A CPU execution
time was also calculated, which shows the time in seconds required to predict a class using
a single and hybrid DL model, considering the input complexity, CNN architecture, and
system requirements.
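A small sketch of these computations from the confusion matrix, using scikit-learn (which is among the libraries listed below; variable names are illustrative), is:

```python
from sklearn.metrics import confusion_matrix

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = sick, 0 = healthy)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```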
In this methodology, leave-one-out cross-validation (LOOCV) was employed to assess
the model’s performance. For each iteration, the model was trained using all data samples
except one, which was reserved as the test set. This procedure was repeated for every
sample in the dataset, ensuring that each thermal image sequence was used as a test
case once. The performance metrics from all iterations were then averaged to obtain a
comprehensive evaluation of the model’s generalization capability.
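A hedged sketch of this procedure using scikit-learn's LeaveOneOut (assuming NumPy arrays of sequences and labels; build_model is a placeholder for any of the model constructors above) could look like this:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

def loocv_accuracy(x, y, build_model, epochs=30, batch_size=16):
    """Train on all sequences but one, test on the held-out sequence, and average."""
    scores = []
    for train_idx, test_idx in LeaveOneOut().split(x):
        model = build_model()                              # fresh model for each fold
        model.fit(x[train_idx], y[train_idx], epochs=epochs,
                  batch_size=batch_size, verbose=0)
        pred = (model.predict(x[test_idx]) > 0.5).astype(int).ravel()
        scores.append(float(pred[0] == y[test_idx][0]))
    return np.mean(scores)                                 # averaged over all folds
```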
RAM: 8 GB
CPU: 1.60 GHz processor, Core-i5, 8th Gen
GPU: Nvidia 1050
Language: Python 3.8
OS: 64-bit Windows
Libraries: NumPy, Pandas, OpenCV, Scikit-learn, TensorFlow
3. Results
The CNN architectures—Inception-v3, VGG16, ResNet101, GoogLeNet, and
AlexNet—were assessed when coupled with RNN, LSTM, and GRU to classify abnor-
malities in the breasts with thermal images in sequence. The DL models were evaluated
through performance metrics, and the validation used was the LOOCV.
The viability of selecting several datasets to acquire more samples was studied, but
none of them included the DIT acquisition approach. In the Visual Lab DMR dataset,
the number of sequences obtained was 38 for each class, resulting from the maximum
number of samples in the class labelled as sick. Moreover, the data were balanced with a
random sampling of 38 sequences from volunteers labelled as the healthy class. Balancing
the data was performed to train the model with the same number of samples from each
class. In addition, the data was increased with the transformations of data augmentation
techniques because of the small number of samples in the training of DL models using a
horizontal flip, a 15◦ rotation, a 30◦ rotation, and a 15% zoom. Table 2 depicts the number of
thermal images in sequence from the healthy and sick classes using filtered and augmented
data, and Table 3 shows the performance from the single and hybrid DL models, whose
metric values are derived from the confusion matrices of each model (see Appendix B,
Figures A6–A10).
Table 2. Number of thermal images acquired after application of the filters and transformations.
                      Healthy    Sick
Data Cleansing           38        38
Data Augmentation       152       152
Table 3. Performance metrics and CPU execution time of the evaluated CNN architectures coupled
with RNNs or classifying with the fully connected layer.
The proposed CNN-RNN binary classifier obtained the highest metrics when VGG16
is used with the LSTM layers, reaching a total of 95.72%, 92.76%, and 98.68% in accuracy,
sensitivity, and specificity, respectively (Table 3). On the other hand, the worst performance
was achieved from the single DL model AlexNet with 69.41%, 52.63%, and 86.18% in
accuracy, sensitivity, and specificity, respectively. According to Table 3, the architecture with
the fastest CPU execution time is the AlexNet, a single CNN, with a time of 0.45 s. However,
its performance metrics are below those of other models, with accuracy, sensitivity, and
specificity of 69.41%, 52.63%, and 86.18%, respectively. On the other hand, the model that
obtained the best performance metrics (VGG16-LSTM) takes almost nine times longer in
CPU execution time than the single AlexNet model.
Additionally, the results show that coupling any of the CNN architectures used in this work into an LSTM-based hybrid DL model increases the performance metrics compared with the single DL model or the remaining RNN cells; moreover, all the hybrid DL models tested achieved better performance than the single DL models (see Table 4).
Table 4. Performance metrics of the model coupled after the CNN architectures. Each value represents
the mean of each model from Table 3.
Figure 6 presents a visual representation of the performance metrics with the different
DL models. Here, we compare the results from various CNN architectures with their
respective RNN cells or the FCL. Figure 7 shows a comparison of the CPU execution time
in seconds of the different models evaluated, either with the deep learning model alone or
coupled to an RNN.
Figure 6. Performance evaluation (accuracy, sensitivity, and specificity) of the different hybrid CNN-RNN architectures to classify the presence or absence of a tumor in breast thermographic images: (a) The independent CNN model; (b) The hybrid CNN-RNN model; (c) The hybrid CNN-LSTM model; (d) The hybrid CNN-GRU model. Inception-V3, VGG16, ResNet101, GoogLeNet, and AlexNet are the five CNN models that are coupled to the three sequential networks (RNN, LSTM, and GRU).
Figure 7. CPU execution time of different coupled CNN-RNN deep learning architectures for breast
cancer classification in images acquired using the DIT acquisition protocol.
The performance metrics calculated by the single CNN models, as well as by CNN models coupled with RNN, LSTM, and GRU, were resampled using the bootstrap method.
This approach allowed for the calculation of confidence intervals for each class. The results
showed a confidence interval of 69.08 to 71.5 for the simple CNN model, while for the
coupled models, the intervals were 72.7 to 80.92, 85.53 to 95.72, and 76.32 to 89.8 for CNN
with RNN, LSTM, and GRU, respectively. An ANOVA (analysis of variance) was performed
to assess whether there were significant differences between the performance metrics of
the models. ANOVA is a statistical method used to compare the means of three or more
groups to determine if at least one group differs significantly from the others. In this case, the p-value was far below the common significance threshold of 0.05, indicating a highly significant difference between the models.
Therefore, we can confidently reject the null hypothesis and conclude that the models have
statistically different performances.
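For illustration, the confidence intervals and the ANOVA can be reproduced along these lines (a sketch assuming per-fold accuracy arrays for each model family; the percentile bootstrap on the mean is one common choice and is an assumption here):

```python
import numpy as np
from scipy.stats import f_oneway

def bootstrap_ci(values, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a metric."""
    rng = np.random.default_rng(seed)
    means = [rng.choice(values, size=len(values), replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Hypothetical per-fold accuracies for each model family (placeholders):
# acc_cnn, acc_rnn, acc_lstm, acc_gru = ...
# print(bootstrap_ci(acc_lstm))
# f_stat, p_value = f_oneway(acc_cnn, acc_rnn, acc_lstm, acc_gru)  # one-way ANOVA
```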
4. Discussion
In this study, breast tissue thermographic image sequences were assessed using a
hybrid DL model to identify abnormalities that may indicate BC disease. The hybrid model
incorporates a CNN to extract spatial features, a RNN to extract temporal features, and a
fully connected layer to determine whether the samples belong to a healthy or sick patient.
The few studies on breast cancer classification that use the dynamic acquisition protocol for thermal imaging with machine learning and deep learning models [11,20,58–60] have shown a lower false negative rate than SIT. In the last decade, neural networks have attracted much attention from researchers due to the increase in computational capabilities
and their application in the detection of complex patterns automatically, as in the case of
thermal imaging [12]. For instance, Ekici and Jawzal [9] developed software to extract
breast features based on bio-data, image analysis, and image statistics. A CNN model opti-
mized using the Bayes algorithm was used to classify the features, resulting in an accuracy
of 98.95%. However, this metric is not adequate since they worked with an unbalanced
database, and the CNN architecture does not provide reproducibility information. In the
study by Cabıoğlu and Oğul [61], it was shown that by performing transfer learning on
the AlexNet architecture, the accuracy for classifying breast thermal images can increase
from 89.5% to 94.3% if the database is balanced. However, there was no segmentation
process in the images, which increases noise caused by non-interest regions [22]. Although
CNNs have gained prominence due to their ability to extract features through pixel-based
pattern recognition [22], RNNs perform more effectively when images are sequenced (over
time-captured images) [22], making them ideal for temporal feature extraction from images
acquired by DIT.
Several studies have reported the use of coupled CNN + RNN networks for the
classification of breast cancer disease in different imaging modalities. Wang et al. [34]
assessed breast histological images using a CNN + GRU model and obtained an accuracy of
86.21%, while a single DL model achieved an accuracy of 80%. A later study conducted by
Srikantamurthy et al. [35] reported that for binary classification of histopathology images
of BC, the single DL model had an accuracy of 98.6%, while the hybrid DL model of CNN-
RNN reached an accuracy of 99.75%. Likewise, Atrey et al. [37] applied hybrid models
(CNN + LSTM) to dual-modality mammography and ultrasound images to improve early
detection of breast cancer, leading to an increase in classification accuracy from 88.73%
to 99.35%.
In this context, this study evaluated the efficiency of coupled deep learning models
based on convolutional and recurrent neural networks for classifying breast cancer disease
in thermal images obtained by the DIT acquisition protocol. Our findings indicate that
coupled models can improve the accuracy of dynamic breast thermographic images, since
the accuracy of the stand-alone CNN model (single CNN) was 70.56%, while CNN + RNN,
CNN + GRU, and CNN + LSTM were 76.84%, 82.23%, and 88.56%, respectively (see Table 4).
The LSTM model performed best when coupled with a pretrained CNN model, which
corresponds to a hybrid VGG16-LSTM architecture. However, it has been reported that
this type of sequential network is computationally expensive when compared to RNNs or
GRUs [35]. Therefore, we have addressed not only the performance metrics for classification
but also the CPU execution time associated with binary classification to compare the
different combinations of coupled models between the pre-trained CNN architectures and
the three sequential architectures (RNN, LSTM, and GRU) (see Figure 6). Thus, the hybrid
VGG16-LSTM architecture, which developed the best performance metrics (Acc = 95.72%,
Sens = 92.76%, Spec = 98.68%), showed a CPU execution time of 3.89 s, making it the
second hybrid architecture that required the most CPU time to complete the classification
process (the ResNet101-LSTM model took 4.13 s) (see Table 3). This result is due to the
higher number of parameters and layers in VGG16 and ResNet101, unlike Inception-v3,
AlexNet, and GoogLeNet [35]. These results are consistent with models that took less time,
such as AlexNet, which had a CPU execution time of 0.44 s. However, the classification
statistics of this stand-alone CNN model are lower (ACC: 69.41%, SENS: 52.63%, SPEC:
86.18%) than other models (see Table 3). This same pre-trained CNN architecture coupled
with the LSTM network improved the classification performance (ACC: 85.53%, SENS:
74.34%, SPEC: 96.71%) as well as CPU execution time (1.16 s). Additionally, the stand-alone
CNN architecture, known as GoogLeNet (ACC: 72.70%, SENS: 55.26%, SPEC: 90.13%),
also demonstrated high classification performance when combined with the sequential
LSTM neural network (ACC: 94.08%, SENS: 90.13%, SPEC: 98.03%), requiring only 0.15 s of
additional CPU execution time over the single CNN model (GoogLeNet). This added time is negligible
when compared to the increase in binary classification performance metrics for determining
whether a breast thermographic image contains a tumor.
A limitation of this study is the relatively small dataset, which comprises only 38 se-
quences per class after balancing. While data augmentation is a commonly used approach
to expand sample size and reduce overfitting [39], medical images present a unique chal-
lenge due to their inherent complexity and variability. As a result of these characteristics,
more advanced techniques are required to ensure that the model can effectively capture
the specific features and variability of medical conditions. One possible solution is to use
deep generative models, such as VAEs, GANs, and DMs, which have shown promise in
generating realistic, diverse images that can improve training by better representing the
underlying distribution of the dataset [62].
In the present study, however, the limitation of the small data set persists, since our
approach involves analyzing sequential thermography images, and the only dataset with
such images (DIT) is the DMR dataset from Visual Lab. In view of restricted access to
patient data and the complexity of collecting thermal imaging data, we were not able to
create a larger dataset. Thus, the current model is not suitable for widespread clinical
application. Nevertheless, with further data collection, this model may contribute to the
early detection of breast cancer by aiding clinicians in identifying areas of concern in
thermal images, along with other diagnostic tools. The integration of this model into
existing clinical workflows is also a critical issue. In spite of the fact that our model has
not yet been applied in clinical settings, we consider it to be a potential supplementary
tool for radiologists and clinicians. It may be possible to provide additional insights into
breast cancer diagnosis by analyzing thermal images alongside other diagnostic methods.
However, it would be necessary to address a number of issues to make the model suitable
for clinical use, including the processing of real-time data, the design of user interfaces, and
the compatibility with existing medical technology.
5. Conclusions
Deep learning plays an important role in detecting complex patterns in medical im-
ages, making them more reliable, accurate, and faster for diagnosing diseases. In this
study, we address the challenge of analyzing sequential thermal images of the breast using
hybrid deep learning models. Unlike static protocols, which capture steady-state images
at a single point in time, our approach benefits from additional information obtained over time
through dynamic acquisition. A comprehensive evaluation of stand-alone and coupled
deep learning models using pre-trained CNN architectures and RNN cells to classify se-
quential thermal breast images revealed that the best architecture for classification was
VGG16 + LSTM. However, other coupled models, such as GoogLeNet and AlexNet with
LSTM, achieved high classification accuracy with a shorter CPU execution time than the more accurate VGG16 + LSTM. The findings suggest that coupled CNN-RNN deep learning
models improve classification performance in thermographic breast images obtained by
dynamic acquisition protocol without significantly affecting the execution time to distin-
guish normal or abnormal breast tissue, making it a promising option for preventative
breast cancer diagnosis with a reasonable time to obtain its result. This suggests that
hybrid deep learning models may be implemented in dynamic breast thermography so
that spatial (with CNN models) and temporal (with sequential models) features can be
extracted for subsequent radiological assessment to determine whether tumor tissue
exists or is absent. It would be interesting to investigate optimizing features extracted from
thermal images in sequence to reduce the computational cost since neural networks require
systems to support model computations, particularly when training.
Author Contributions: Conceptualization, A.M.-S., J.H.E.-R. and I.V.; methodology, A.M.-S. and
J.H.E.-R.; validation, A.M.-S.; investigation, A.M.-S. and J.H.E.-R.; writing—original draft preparation,
A.M.-S. and J.H.E.-R.; writing—review and editing, A.M.-S., J.H.E.-R. and I.V.; supervision, J.H.E.-R.
All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: This article does not contain any studies with human participants or
animals performed by any of the authors.
Data Availability Statement: No new data were created or analyzed in this study. Data sharing is
not applicable to this article.
Acknowledgments: A.M.-S. wishes to acknowledge the support of the National Council of Humani-
ties, Sciences and Technologies (CONAHCyT) of Mexico and Universidad de las Américas Puebla
(UDLAP) for his PhD scholarship.
Appendix A
Validation Accuracy
Comparison of validation accuracy over epochs for various combinations of CNNs and RNNs. The architectures include Inception-V3, VGG16, ResNet101, GoogLeNet, and AlexNet, each coupled with RNN, LSTM, and GRU layers. The graphs illustrate the performance of each model in capturing both spatial and temporal patterns from breast cancer thermographic image sequences, highlighting the differences in convergence rates and overall accuracy.

Figure A1. Validation accuracy of InceptionV3 using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A2. Validation accuracy of VGG16 using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A3. Validation accuracy of ResNet101 using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A4. Validation accuracy of AlexNet using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A5. Validation accuracy of GoogLeNet using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Appendix B

Confusion Matrix

Confusion matrices illustrating the performance of various CNN-RNN architectures on the classification of breast cancer thermographic image sequences. The models include combinations of Inception-V3, VGG16, ResNet101, GoogLeNet, and AlexNet with RNN, LSTM, and GRU layers. Each matrix provides a detailed breakdown of true positive, true negative, false positive, and false negative predictions, showcasing the ability of each architecture to correctly classify healthy and diseased cases.

Figure A7. Confusion matrix of VGG16 using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A8. Confusion matrix of ResNet101 using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A9. Confusion matrix of GoogLeNet using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.

Figure A10. Confusion matrix of AlexNet using (a) single CNN; (b) coupled RNN; (c) coupled LSTM; and (d) coupled GRU.
References
1. GLOBOCAN Cancer Today. Available online: https://gco.iarc.fr/today/en (accessed on 6 August 2024).
2. Singh, D.; Singh, A.K. Role of Image Thermography in Early Breast Cancer Detection-Past, Present and Future. Comput. Methods Programs Biomed. 2020, 183, 105074. [CrossRef] [PubMed]
3. Mahoro, E.; Akhloufi, M.A. Breast Cancer Classification on Thermograms Using Deep CNN and Transformers. Quant. Infrared Thermogr. J. 2024, 21, 30–49. [CrossRef]
4. Gonzalez-Hernandez, J.L.; Recinella, A.N.; Kandlikar, S.G.; Dabydeen, D.; Medeiros, L.; Phatak, P. Technology, Application and Potential of Dynamic Breast Thermography for the Detection of Breast Cancer. Int. J. Heat Mass Transf. 2019, 131, 558–573. [CrossRef]
5. Tsietso, D.; Yahya, A.; Samikannu, R. A Review on Thermal Imaging-Based Breast Cancer Detection Using Deep Learning. Mob. Inf. Syst. 2022, 2022, 8952849. [CrossRef]
6. Rodrigues, A.L.; de Santana, M.A.; Azevedo, W.W.; Bezerra, R.S.; Barbosa, V.A.F.; de Lima, R.C.F.; dos Santos, W.P. Identification of Mammary Lesions in Thermographic Images: Feature Selection Study Using Genetic Algorithms and Particle Swarm Optimization. Res. Biomed. Eng. 2019, 35, 213–222. [CrossRef]
7. Gershenson, M.; Gershenson, J. Dynamic Vascular Imaging Using Active Breast Thermography. Sensors 2023, 23, 3012. [CrossRef] [PubMed]
8. Lozano, A.; Hassanipour, F. Infrared Imaging for Breast Cancer Detection: An Objective Review of Foundational Studies and Its Proper Role in Breast Cancer Screening. Infrared Phys. Technol. 2019, 97, 244–257. [CrossRef]
9. Ekici, S.; Jawzal, H. Breast Cancer Diagnosis Using Thermography and Convolutional Neural Networks. Med. Hypotheses 2020, 137, 109542. [CrossRef] [PubMed]
10. Mashekova, A.; Zhao, Y.; Ng, E.Y.K.; Zarikas, V.; Fok, S.C.; Mukhmetov, O. Early Detection of the Breast Cancer Using Infrared Technology—A Comprehensive Review. Therm. Sci. Eng. Prog. 2022, 27, 101142. [CrossRef]
11. Ohashi, Y.; Uchida, I. Applying Dynamic Thermography in the Diagnosis of Breast Cancer: Techniques for Improving Sensitivity of Breast Thermography. IEEE Trans. Biomed. Eng. 2000, 47, 42–51. [CrossRef]
12. D’Alessandro, G.; Tavakolian, P.; Sfarra, S. A Review of Techniques and Bio-Heat Transfer Models Supporting Infrared Thermal Imaging for Diagnosis of Malignancy. Appl. Sci. 2024, 14, 1603. [CrossRef]
13. Rautela, K.; Kumar, D.; Kumar, V. A Systematic Review on Breast Cancer Detection Using Deep Learning Techniques. Arch. Comput. Methods Eng. 2022, 29, 4599–4629. [CrossRef]
14. Olota, M.; Alsadoon, A.; Alsadoon, O.H.; Dawoud, A.; Prasad, P.W.C.; Islam, R.; Jerew, O.D. Modified Anisotropic Diffusion and Level-Set Segmentation for Breast Cancer. Multimed. Tools Appl. 2024, 83, 13503–13525. [CrossRef]
15. Acharya, U.R.; Ng, E.Y.K.; Tan, J.H.; Sree, S.V. Thermography Based Breast Cancer Detection Using Texture Features and Support Vector Machine. J. Med. Syst. 2012, 36, 1503–1510. [CrossRef] [PubMed]
16. de Santana, M.A.; Pereira, J.M.S.; da Silva, F.L.; de Lima, N.M.; de Sousa, F.N.; de Arruda, G.M.S.; de Lima, R.d.C.F.; de Silva, W.W.A.; dos Santos, W.P. Breast Cancer Diagnosis Based on Mammary Thermography and Extreme Learning Machines. Res. Biomed. Eng. 2018, 34, 45–53. [CrossRef]
17. Gaber, T.; Ismail, G.; Anter, A.; Soliman, M.; Ali, M.; Semary, N.; Hassanien, A.E.; Snasel, V. Thermogram Breast Cancer Prediction Approach Based on Neutrosophic Sets and Fuzzy C-Means Algorithm. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS 2015, Milan, Italy, 25–29 August 2015.
18. Sánchez-Ruiz, D.; Olmos-Pineda, I.; Olvera-López, J.A. Automatic Region of Interest Segmentation for Breast Thermogram Image
Classification. Pattern Recognit. Lett. 2020, 135, 72–81. [CrossRef]
19. Kufel, J.; Bargieł-Łączek, K.; Kocot, S.; Koźlik, M.; Bartnikowska, W.; Janik, M.; Czogalik, Ł.; Dudek, P.; Magiera, M.; Lis, A.; et al.
˛ K.; Kocot, S.; Koźlik, M.; Bartnikowska, W.; Janik, M.; Czogalik, Ł.; Dudek, P.; Magiera, M.; Lis, A.; et al.
What Is Machine Learning, Artificial Neural Networks and Deep Learning?—Examples of Practical Applications in Medicine.
Diagnostics 2023, 13, 2582. [CrossRef] [PubMed]
20. Farooq, M.A.; Corcoran, P. Infrared Imaging for Human Thermography and Breast Tumor Classification Using Thermal Images.
In Proceedings of the 2020 31st Irish Signals and Systems Conference, ISSC 2020, Letterkenny, Ireland, 11–12 June 2020.
21. Ensafi, M.; Keyvanpour, M.R.; Shojaedini, S.V. A New Method for Promote the Performance of Deep Learning Paradigm in
Diagnosing Breast Cancer: Improving Role of Fusing Multiple Views of Thermography Images. Health Technol. 2022, 12, 1097–1107.
[CrossRef]
22. Mohamed, E.A.; Rashed, E.A.; Gaber, T.; Karam, O. Deep Learning Model for Fully Automated Breast Cancer Detection System
from Thermograms. PLoS ONE 2022, 17, e0262349. [CrossRef]
23. Jafari, Z.; Karami, E. Breast Cancer Detection in Mammography Images: A CNN-Based Approach with Feature Selection.
Information 2023, 14, 410. [CrossRef]
24. Yadav, S.S.; Jadhav, S.M. Deep Convolutional Neural Network Based Medical Image Classification for Disease Diagnosis. J. Big
Data 2019, 6, 113. [CrossRef]
25. Goncalves, C.B.; Souza, J.R.; Fernandes, H. Classification of Static Infrared Images Using Pre-Trained CNN for Breast Cancer
Detection. In Proceedings of the IEEE Symposium on Computer-Based Medical Systems, Aveiro, Portugal, 7–9 June 2021.
26. Fourcade, A.; Khonsari, R.H. Deep Learning in Medical Image Analysis: A Third Eye for Doctors. J. Stomatol. Oral. Maxillofac.
Surg. 2019, 120, 279–288. [CrossRef] [PubMed]
27. Khandakar, A.; Chowdhury, M.E.H.; Reaz, M.B.I.; Ali, S.H.M.; Kiranyaz, S.; Rahman, T.; Chowdhury, M.H.; Ayari, M.A.; Alfkey,
R.; Bakar, A.A.A.; et al. A Novel Machine Learning Approach for Severity Classification of Diabetic Foot Complications Using
Thermogram Images. Sensors 2022, 22, 4249. [CrossRef] [PubMed]
28. Yoo, H.; Han, S.; Chung, K. Diagnosis Support Model of Cardiomegaly Based on CNN Using ResNet and Explainable Feature
Map. IEEE Access 2021, 9, 55802–55813. [CrossRef]
29. Barnawi, A.; Chhikara, P.; Tekchandani, R.; Kumar, N.; Alzahrani, B. Artificial Intelligence-Enabled Internet of Things-Based
System for COVID-19 Screening Using Aerial Thermal Imaging. Future Gener. Comput. Syst. 2021, 124, 119–132. [CrossRef]
[PubMed]
30. Grigore, M.A.; Neagoe, V.E. A Deep CNN Approach Using Thermal Imagery for Breast Cancer Diagnosis. In Proceedings of the
13th International Conference on Electronics, Computers and Artificial Intelligence, ECAI 2021, Pitesti, Romania, 1–3 July 2021.
31. Li, F.; Liu, M. A Hybrid Convolutional and Recurrent Neural Network for Hippocampus Analysis in Alzheimer’s Disease. J.
Neurosci. Methods 2019, 323, 108–118. [CrossRef] [PubMed]
32. Patil, R.S.; Biradar, N. Automated Mammogram Breast Cancer Detection Using the Optimized Combination of Convolutional
and Recurrent Neural Network. Evol. Intell. 2021, 14, 1459–1474. [CrossRef]
33. Soni, K.M.; Gupta, A.; Jain, T. Supervised Machine Learning Approaches for Breast Cancer Classification and a High Performance
Recurrent Neural Network. In Proceedings of the 3rd International Conference on Inventive Research in Computing Applications,
ICIRCA 2021, Coimbatore, India, 2–4 September 2021.
34. Wang, X.; Ahmad, I.; Javeed, D.; Zaidi, S.A.; Alotaibi, F.M.; Ghoneim, M.E.; Daradkeh, Y.I.; Asghar, J.; Eldin, E.T. Intelligent
Hybrid Deep Learning Model for Breast Cancer Detection. Electronics 2022, 11, 2767. [CrossRef]
35. Srikantamurthy, M.M.; Rallabandi, V.P.S.; Dudekula, D.B.; Natarajan, S.; Park, J. Classification of Benign and Malignant Subtypes
of Breast Cancer Histopathology Imaging Using Hybrid CNN-LSTM Based Transfer Learning. BMC Med. Imaging 2023, 23, 19.
[CrossRef]
36. Ahmad, S.; Ullah, T.; Ahmad, I.; Al-Sharabi, A.; Ullah, K.; Khan, R.A.; Rasheed, S.; Ullah, I.; Uddin, M.N.; Ali, M.S. A Novel
Hybrid Deep Learning Model for Metastatic Cancer Detection. Comput. Intell. Neurosci. 2022, 2022, 8141530. [CrossRef]
37. Atrey, K.; Singh, B.K.; Bodhey, N.K.; Bilas Pachori, R. Mammography and Ultrasound Based Dual Modality Classification of
Breast Cancer Using a Hybrid Deep Learning Approach. Biomed. Signal Process Control 2023, 86, 104919. [CrossRef]
38. Zhao, T.; Fu, C.; Song, W.; Sham, C.W. RGGC-UNet: Accurate Deep Learning Framework for Signet Ring Cell Semantic
Segmentation in Pathological Images. Bioengineering 2024, 11, 16. [CrossRef]
39. Salehi, A.W.; Khan, S.; Gupta, G.; Alabduallah, B.I.; Almjally, A.; Alsolai, H.; Siddiqui, T.; Mellit, A. A Study of CNN and Transfer
Learning in Medical Imaging: Advantages, Challenges, Future Scope. Sustainability 2023, 15, 5930. [CrossRef]
40. Mohammed, F.A.; Tune, K.K.; Assefa, B.G.; Jett, M.; Muhie, S. Medical Image Classifications Using Convolutional Neural
Networks: A Survey of Current Methods and Statistical Modeling of the Literature. Mach. Learn. Knowl. Extr. 2024, 6, 699–735.
[CrossRef]
41. Silva, L.F.; Saade, D.C.M.; Sequeiros, G.O.; Silva, A.C.; Paiva, A.C.; Bravo, R.S.; Conci, A. A New Database for Breast Research
with Infrared Image. J. Med. Imaging Health Inform. 2014, 4, 92–100. [CrossRef]
42. Sánchez-Cauce, R.; Pérez-Martín, J.; Luque, M. Multi-Input Convolutional Neural Network for Breast Cancer Detection Using
Thermal Images and Clinical Data. Comput. Methods Programs Biomed. 2021, 204, 106045. [CrossRef]
43. Siddique, N.; Paheding, S.; Elkin, C.P.; Devabhaktuni, V. U-Net and Its Variants for Medical Image Segmentation: A Review of
Theory and Applications. IEEE Access 2021, 9, 82031–82057. [CrossRef]
44. Du, G.; Cao, X.; Liang, J.; Chen, X.; Zhan, Y. Medical Image Segmentation Based on U-Net: A Review. J. Imaging Sci. Technol. 2020,
64, jist0710. [CrossRef]
45. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image
Computing and Computer-Assisted Intervention—MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany,
5–9 October 2015; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics); Springer: Cham, Switzerland, 2015; Volume 9351.
46. Guo, T.; Dong, J.; Li, H.; Gao, Y. Simple Convolutional Neural Network on Image Classification. In Proceedings of the 2017 IEEE
2nd International Conference on Big Data Analysis, ICBDA 2017, Beijing, China, 10–12 March 2017.
47. Jalloul, R.; Chethan, H.K.; Alkhatib, R. A Review of Machine Learning Techniques for the Classification and Detection of Breast
Cancer from Medical Images. Diagnostics 2023, 13, 2460. [CrossRef] [PubMed]
48. Mahoro, E.; Akhloufi, M.A. Applying Deep Learning for Breast Cancer Detection in Radiology. Curr. Oncol. 2022, 29, 8767–8793.
[CrossRef] [PubMed]
49. Wang, Y.; Li, Y.; Song, Y.; Rong, X. The Influence of the Activation Function in a Convolution Neural Network Model of Facial
Expression Recognition. Appl. Sci. 2020, 10, 1897. [CrossRef]
50. Rasamoelina, A.D.; Adjailia, F.; Sincak, P. Deep Convolutional Neural Network for Robust Facial Emotion Recognition. In
Proceedings of the IEEE International Symposium on INnovations in Intelligent SysTems and Applications, INISTA 2019, Sofia,
Bulgaria, 3–5 July 2019.
51. Oh, H.M.; Lee, H.; Kim, M.Y. Comparing Convolutional Neural Network(CNN) Models for Machine Learning-Based Drone and
Bird Classification of Anti-Drone System. In Proceedings of the International Conference on Control, Automation and Systems,
Jeju, Republic of Korea, 15–18 October 2019.
52. Zhang, H.; Qie, Y. Applying Deep Learning to Medical Imaging: A Review. Appl. Sci. 2023, 13, 10521. [CrossRef]
53. Azizi, S.; Bayat, S.; Yan, P.; Tahmasebi, A.; Kwak, J.T.; Xu, S.; Turkbey, B.; Choyke, P.; Pinto, P.; Wood, B.; et al. Deep Recurrent
Neural Networks for Prostate Cancer Detection: Analysis of Temporal Enhanced Ultrasound. IEEE Trans. Med. Imaging 2018, 37,
2695–2703. [CrossRef]
54. Pan, Q.; Zhang, Y.; Chen, D.; Xu, G. Character-Based Convolutional Grid Neural Network for Breast Cancer Classification. In
Proceedings of the 2017 International Conference on Green Informatics, ICGI 2017, Fuzhou, China, 15–17 August 2017.
55. Fang, W.; Chen, Y.; Xue, Q. Survey on Research of RNN-Based Spatio-Temporal Sequence Prediction Algorithms. J. Big Data 2021,
3, 97–110. [CrossRef]
56. da Queiroz, K.F.F.C.; de Queiroz Júnior, J.R.A.; Dourado, H.; de Lima, R.d.C.F. Automatic Segmentation of Region of Interest for
Breast Thermographic Image Classification. Res. Biomed. Eng. 2023, 39, 199–208. [CrossRef]
57. Rezaei, Z. A Review on Image-Based Approaches for Breast Cancer Detection, Segmentation, and Classification. Expert. Syst.
Appl. 2021, 182, 115204. [CrossRef]
58. De Freitas Oliveira Baffa, M.; Grassano Lattari, L. Convolutional Neural Networks for Static and Dynamic Breast Infrared Imaging
Classification. In Proceedings of the 31st Conference on Graphics, Patterns and Images, SIBGRAPI 2018, Paraná, Brazil, 29
October–1 November 2018.
59. Mambou, S.J.; Maresova, P.; Krejcar, O.; Selamat, A.; Kuca, K. Breast Cancer Detection Using Infrared Thermal Imaging and a
Deep Learning Model. Sensors 2018, 18, 2799. [CrossRef]
60. Chatterjee, S.; Biswas, S.; Majee, A.; Sen, S.; Oliva, D.; Sarkar, R. Breast Cancer Detection from Thermal Images Using a
Grunwald-Letnikov-Aided Dragonfly Algorithm-Based Deep Feature Selection Method. Comput. Biol. Med. 2022, 141, 105027.
[CrossRef]
61. Cabıoğlu, Ç.; Oğul, H. Computer-Aided Breast Cancer Diagnosis from Thermal Images Using Transfer Learning. In Bioinformatics
and Biomedical Engineering, Proceedings of the 8th International Work-Conference, IWBBIO 2020, Granada, Spain, 6–8 May 2020; Lecture
Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics);
Springer: Cham, Switzerland, 2020; Volume 12108, LNBI.
62. Kebaili, A.; Lapuyade-Lahorgue, J.; Ruan, S. Deep Learning Approaches for Data Augmentation in Medical Imaging: A Review. J.
Imaging 2023, 9, 81. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.