An Improved Densenet Deep Neural Network Model For Tuberculosis Detection Using Chest X-Ray Images
ABSTRACT Tuberculosis (TB) is a highly contagious and life-threatening infectious disease that affects
millions of people worldwide. Early diagnosis of TB is essential for prompt treatment and control of the
spread of the disease. In this paper, a new deep learning model called CBAMWDnet is proposed for the
detection of TB in chest X-ray (CXR) images. The model is based on the Convolutional Block Attention
Module (CBAM) and the Wide Dense Net (WDnet) architecture, which has been designed to effectively
capture spatial and contextual information in the images. The performance of the proposed model is evaluated on a large dataset of chest X-ray images and compared with several state-of-the-art models.
The results show that the proposed model outperforms the other models in terms of accuracy (98.80%),
sensitivity (94.28%), precision (98.50%), specificity (95.7%) and F1 score (96.35%). Additionally, our
model demonstrates excellent generalization ability, with consistent performance on different datasets. In
conclusion, the proposed CBAMWDnet model is a promising tool for the early diagnosis of TB, with superior
performance compared to other state-of-the-art models, as evidenced by the evaluation metrics of accuracy,
sensitivity, and specificity.
INDEX TERMS Chest X-ray, convolutional neural network, deep learning, disease diagnosis, tuberculosis.
Polymerase chain reaction (PCR) is a widely used nucleic acid amplification test that can detect tuberculosis in clinical specimens such as sputum, blood, bone marrow, and biopsy samples. However, PCR techniques are expensive and may not be available at all medical institutions. It is important to consider the benefits and limitations of different techniques when determining the most appropriate method for identifying tuberculosis cases.

Although Convolutional Neural Networks (CNNs) were originally introduced more than 25 years ago [2], [3], improvements in computer hardware and network structure, which enabled the training of truly deep CNNs, have recently made them the dominant machine learning approach for visual object recognition. The increasing number of both the layers and the size of each layer in modern networks amplifies the differences between architectures and motivates the exploration of different connectivity patterns and the revisiting of old research ideas, from LeNet [4] with only 5 layers to ResNet [5] with more than 100 layers, and even deep networks with stochastic depth that contain more than 1200 layers [6].

Deep learning techniques are currently being employed to address numerous critical problems. Speech emotion recognition (SER) has received a lot of attention in recent years [7], [8], and researchers have achieved excellent results thanks to the use of CNNs [9], [10], [11]. Another challenge is the analysis of non-stationary signals, which are characterized by a time-varying frequency spectrum, particularly in noisy environments. Machine learning-based techniques have been utilized for a range of tasks, including denoising of gravitational-wave data, estimation of parameters, classification of detector glitches, and detection of gravitational waves [12], [13], [14].

Deep convolutional networks have also shown outstanding performance in tuberculosis diagnosis, surpassing other competing approaches. Pasa et al. [15] developed a specialized neural network architecture that significantly reduced computational, memory, and power requirements. Melendez et al. [16] demonstrated that Computer-Aided Detection (CAD) techniques can be enhanced with clinical information to improve accuracy and specificity. Vajda et al. [17] developed an automatic system that detects abnormal lungs with multiple tuberculosis manifestations by selecting features based on wrappers in order to minimize classification error. Lopes and Valiati [18] investigated the use of pre-trained CNNs for tuberculosis detection and proposed three different CNN architectures for training an SVM classifier. Rahman et al. [19] and Rajaraman and Antani [20] proposed reliable deep learning-based methods for detecting tuberculosis from chest X-ray images that achieve high accuracy and outperform traditional methods. They also noted that large datasets are necessary to train deep neural networks, although collecting them is a costly and time-consuming process.

There are several ways to improve the performance of deep neural networks, but one of the most straightforward is to increase their size. This can be done by increasing both the depth (e.g., using more layers) and the width of the network [21]. This method is easy and relatively safe, especially when there is a large amount of labeled training data available. However, it has two main drawbacks: the increased number of parameters may make the network more prone to overfitting, particularly when the training set is small, and the increased size also requires more computational resources. Another method, which is becoming increasingly popular, is to scale up the model by increasing the resolution of the images [22]. In the past, it was common to scale only one dimension (depth, width, or image size), but scaling all three can be more effective, though it requires manual tuning and may not always achieve optimal accuracy and efficiency. For example, to use 2^N times more computational resources, one can increase the network depth by α^N, the width by β^N, and the image size by γ^N, where α, β, and γ are constants determined by a small grid search on the original, smaller model (a small numerical illustration is given after the contribution list below).

Despite the potential benefits of utilizing deep learning for tuberculosis detection, there are significant challenges to its implementation. One major obstacle is the need for large datasets of labeled medical images to train the models effectively, which can be difficult to obtain due to privacy concerns. Additionally, not all deep learning models are designed specifically for tuberculosis detection, which can result in suboptimal performance. There is also a risk of bias in the data, which may lead to inaccurate predictions or unfairly discriminate against certain populations.

The purpose of this study is to give physicians a tool that assists in diagnosing tuberculosis by developing a deep learning architecture tailored to tuberculosis diagnosis. The main contributions of this paper are as follows:
• Data: No matter how effective the algorithms of machine learning models are, the quality of data is just as critical as the quantity. In this research, we spent a great deal of time searching for open datasets that provide both the quality and the quantity of information we are seeking. The resulting dataset helps deep learning models achieve high recognition performance without requiring much adjustment.
• Deep learning model: This paper presents a new deep-learning architecture, named CBAMWDNet, that is tailored to the diagnosis of tuberculosis. It is observed that CBAMWDNet, while increasing the computational and memory requirements only slightly, can significantly improve classification performance.
• Comparison: Since applying deep learning to medicine is no longer a revolutionary concept, many researchers are working on this topic and have already made considerable progress. To demonstrate the superiority of the proposed model, we compare and evaluate it against other deep learning models.
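As a purely numerical illustration of the compound-scaling rule mentioned above, the sketch below uses the constants reported in the EfficientNet paper (α = 1.2, β = 1.1, γ = 1.15); these values are an assumption for illustration only, not values used in this study.

```python
# Illustrative compound scaling; alpha/beta/gamma are assumed example constants.
alpha, beta, gamma = 1.2, 1.1, 1.15  # depth, width, resolution coefficients
N = 2                                # scale compute by roughly 2**N = 4x

depth_mult = alpha ** N        # number of layers grows by ~1.44x
width_mult = beta ** N         # number of channels grows by ~1.21x
resolution_mult = gamma ** N   # input image size grows by ~1.32x

print(f"depth x{depth_mult:.2f}, width x{width_mult:.2f}, "
      f"resolution x{resolution_mult:.2f}")
```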
TABLE 1. Details of total dataset, training set and validation set for classification problem.
This paper contributes significantly to deep learning in medical image analysis. Using its novel approach, it uses more comprehensive X-ray images for detecting tuberculosis than previous methods. Moreover, the proposed CBAMWDNet outperformed other state-of-the-art algorithms in terms of classification accuracy, sensitivity, and specificity in this study. The model also achieved satisfactory performance with fewer training epochs, resulting in significant savings in training time. Through these findings, it may be possible to improve tuberculosis detection and other medical image analysis tasks.

The paper is organized as follows: Section II provides details on the experimental setup employed in this study, including an introduction to the image dataset, the proposed CNN, and the validation methods used for the deep learning classification algorithms. Section III presents the experimental results, which are elaborated upon in detail. Finally, Section IV summarizes the concluding remarks.

II. METHODS AND MATERIALS
A. IMAGE DATASET
We populated a dataset from different publicly available data repositories as follows:
• Montgomery dataset: The CXR dataset from Montgomery County (MC) contains 138 frontal chest X-rays collected as part of Montgomery County's tuberculosis screening program. 80 of the images are classified as "normal" and 58 as showing "TB manifestations" [23].
• Shenzhen dataset: The Shenzhen dataset contains 662 frontal CXR images. Of these, 326 are classified as normal and 336 as showing manifestations of tuberculosis [23].
• Tuberculosis (TB) chest X-ray dataset: This dataset was compiled by Qatar University, Doha, Qatar, and the University of Dhaka, Bangladesh, together with their collaborators from Malaysia and medical doctors from Hamad Medical Corporation and Bangladesh. It consists of 700 CXR images diagnosed with tuberculosis and 3500 normal CXR images [19].
As a result, we merged these repositories and removed any corrupted images to populate a combined dataset containing 5000 images. An overview of the resulting dataset is presented in Table 1.
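The merging step can be illustrated with a short, hypothetical preprocessing script. The directory names below are assumptions made for illustration; they are not the exact layout or pipeline used in this study.

```python
import shutil
from pathlib import Path
from PIL import Image

# Hypothetical layout: each source repository is already sorted into normal/ and tb/.
SOURCES = [Path("montgomery"), Path("shenzhen"), Path("tb_chest_xray")]
MERGED = Path("merged_dataset")

def is_readable(img_path: Path) -> bool:
    """Treat images that Pillow cannot parse as corrupted and drop them."""
    try:
        with Image.open(img_path) as img:
            img.verify()
        return True
    except Exception:
        return False

for src in SOURCES:
    for label in ("normal", "tb"):
        out_dir = MERGED / label
        out_dir.mkdir(parents=True, exist_ok=True)
        for img_path in (src / label).glob("*.png"):
            if is_readable(img_path):
                # Prefix with the repository name to avoid file-name collisions.
                shutil.copy(img_path, out_dir / f"{src.name}_{img_path.name}")
```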
TABLE 2. Parameter details for CBAMWDNet.

Attention. Instead of using GAP and GMP in the channel
dimension of feature maps, GAP and GMP are processed in
the spatial dimension of feature maps. Figure 4 shows the
detail of the Spatial Attention in CBAM.
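The spatial-attention step described above can be sketched in PyTorch as follows. This follows the published CBAM formulation [24] rather than this paper's exact implementation, so the kernel size and other details are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: pool across channels, then conv + sigmoid."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_pool = torch.mean(x, dim=1, keepdim=True)    # (B, 1, H, W)
        max_pool, _ = torch.max(x, dim=1, keepdim=True)  # (B, 1, H, W)
        attn = self.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * attn  # re-weight every spatial location of the feature map

# Example: refine a feature map of shape (batch, channels, height, width).
features = torch.randn(4, 64, 28, 28)
print(SpatialAttention()(features).shape)  # torch.Size([4, 64, 28, 28])
```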
There are several reasons why convolutional block atten-
tion module (CBAM) can be useful in a convolutional neural
network (CNN):
• Improve model performance: By allowing the network to
focus on certain parts of the input data and ignore others,
CBAM can help improve the performance of the CNN on
tasks such as image classification and object detection.
• Reduce overfitting: Attention mechanisms can help
reduce overfitting by allowing the network to focus on
the most relevant parts of the input data and ignore noise
or other irrelevant information.
• Reduce the number of parameters: CBAM has a rela-
tively simple structure and requires fewer parameters
than some other attention mechanisms, which can make
it easier to train and less prone to overfitting.
• Improve the interpretability of the model: By visualizing
the attention map produced by CBAM, it is possible to
understand which parts of the input data the network is
focusing on and why. This can be useful for understand-
ing the model’s decision-making process and improving
its transparency.
Wide: Wide ResNets are a type of deep learning neural
network that have been shown to be more effective than
traditional ResNets in some tasks, particularly in image clas-
sification. This is because Wide ResNets are able to capture
more detailed features from the input data, resulting in better
performance on the target task. One of the key differences
between Wide ResNets and traditional ResNets is the number
of filters used in the convolutional layers. Wide ResNets use
a larger number of filters, which allows them to learn more
detailed features from the input data. Additionally, Wide
ResNets typically use skip connections, which help improve
the flow of information through the network and make it
easier for the network to learn complex patterns in the data.
Our model also applies this idea by doubling the growth rate and halving the number of layers.
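As an illustration of this widening idea (not the exact CBAMWDNet configuration, which is detailed in Table 2), a DenseNet can be made wider but shallower by doubling the growth rate and halving the number of layers in each dense block, for example with torchvision:

```python
from torchvision.models import DenseNet

# Baseline DenseNet-121-style configuration.
baseline = DenseNet(growth_rate=32, block_config=(6, 12, 24, 16), num_classes=2)

# Hypothetical "wide" variant: double the growth rate, halve the layers per block.
wide = DenseNet(growth_rate=64, block_config=(3, 6, 12, 8), num_classes=2)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"baseline: {count(baseline):,} params, wide: {count(wide):,} params")
```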
Dense blocks: Dense block is a type of layer in a CNN that
consists of multiple convolutional layers with dense connec-
tions between them. Dense connections refer to the fact that
each layer in the block receives input from all of the previous
layers in the block, rather than just the directly preceding
layer. This allows the dense block to propagate information
throughout the entire block and helps the network learn more
abstract features. Some other models (such as ResNet and Wide ResNet) use the residual block, which is also a type of layer
in a CNN that consists of two or more convolutional layers
with a skip connection between them (A skip connection is a
shortcut that allows the output of a layer to be added directly
to the output of a preceding layer, bypassing any intermediate
layers). Both dense blocks and residual blocks can be used to improve the performance of CNNs, but they are used in different ways. Dense blocks are typically used to allow the network to learn more abstract features, while residual blocks are used to allow the network to learn residuals and improve the flow of information through the network.
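A minimal dense block illustrating this concatenation-based connectivity (a simplified sketch, not the exact block used in CBAMWDNet) might look like:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all previous feature maps."""
    def __init__(self, in_channels: int, growth_rate: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # dense connectivity
            features.append(out)
        return torch.cat(features, dim=1)

block = DenseBlock(in_channels=64, growth_rate=32, num_layers=4)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 192, 56, 56])
```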
CBAMWDNet is a convolutional neural network specially created for the purpose of tuberculosis classification. Due to its deep architecture, the model comprises a large number of parameters, amounting to 8,159,134. This extensive parameter count reflects the model's capability of effectively extracting intricate features and patterns from the input images. A detailed distribution of the parameter count across the different layers of the network is presented in Table 2.

C. EVALUATION
The performance of the different CNNs on the testing dataset is evaluated after the completion of the training and validation phases. This evaluation is conducted to assess the effectiveness of the different CNNs in predicting the outcome or class of the samples in the testing dataset. The performance of the CNNs is compared using six performance metrics: accuracy, sensitivity, specificity, precision, negative predictive value, and F1 score. These metrics are chosen because they are widely used to evaluate the performance of classification models, and each of them captures a different aspect of the model's performance. The results of this evaluation allow researchers to compare the performance of different CNNs and determine which one is the most effective in predicting the outcome or class of the samples in the testing dataset.

Hyperparameters play a crucial role in the performance of deep learning models. However, selecting appropriate values for these parameters can be challenging for different tasks. In this study, we have used default values for some of the most common hyperparameters in deep learning models, since our focus is on the deep learning process itself. The batch size is set to 16, and we use the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.001 and a momentum of 0.9. Additionally, we employ a decaying learning rate with a step size of 7 and a gamma value of 0.1. Other hyperparameters, such as input size and dropout rate, are selected based on the specific model used for detection.

In this study, true positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts are used to evaluate the performance of the different CNNs in detecting tuberculosis. TP refers to the number of tuberculosis images that are correctly identified as tuberculosis, TN refers to the number of normal images that are correctly identified as normal, FP refers to the number of normal images that are incorrectly identified as tuberculosis, and FN refers to the number of tuberculosis images that are incorrectly identified as normal. These performance metrics are commonly used in the evaluation of classification models, and they allow researchers to understand the model's ability to correctly classify samples as either positive or negative for the outcome or class being predicted.

1) ACCURACY
Accuracy is a commonly used measure to evaluate the performance of a deep learning model in detecting tuberculosis. It is the proportion of correctly classified samples out of all samples, and can be calculated as the number of true positive and true negative predictions divided by the total number of predictions made. For example, if a deep learning model correctly classifies 95 out of 100 samples as either positive or negative for tuberculosis, its accuracy is 95%.

Accuracy = (TP + TN) / (TP + FN + FP + TN)    (1)

2) SENSITIVITY OR RECALL OR TRUE POSITIVE RATE (TPR)
TPR is another important measure to consider when evaluating the performance of a deep learning model in detecting tuberculosis. It is the proportion of positive samples that are correctly classified as positive, and can be calculated as the number of true positive predictions divided by the total number of actual positive samples. For example, if a deep learning model correctly classifies 90 out of 100 positive samples as positive for tuberculosis, its sensitivity is 90%.

Sensitivity (TPR) = TP / (TP + FN)    (2)

3) SPECIFICITY OR SELECTIVITY OR TRUE NEGATIVE RATE (TNR)
TNR is a measure that is important to consider when evaluating the performance of a deep learning model in detecting tuberculosis. It is the proportion of negative samples that are correctly classified as negative, and can be calculated as the number of true negative predictions divided by the total number of actual negative samples. For example, if a deep learning model correctly classifies 95 out of 100 negative samples as negative for tuberculosis, its specificity is 95%.

Specificity (TNR) = TN / (TN + FP)    (3)
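A small helper function, shown only as a sketch, that computes the three metrics defined in Eqs. (1)–(3) from the confusion-matrix counts; precision, NPV, and the F1 score introduced next follow the same pattern.

```python
def basic_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Confusion-matrix metrics of Eqs. (1)-(3)."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Eq. (1)
        "sensitivity": tp / (tp + fn),                # Eq. (2), TPR / recall
        "specificity": tn / (tn + fp),                # Eq. (3), TNR
    }

# Hypothetical counts, for illustration only.
print(basic_metrics(tp=90, tn=95, fp=5, fn=10))
```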
4) PRECISION OR POSITIVE PREDICTIVE VALUE (PPV)
Precision is a measure of the proportion of positive predictions that are actually correct. It can be calculated as the number of true positive predictions divided by the total number of positive predictions made. For example, if a deep learning model makes 100 positive predictions for tuberculosis and 90 of them are correct, its precision is 90%.

Precision (PPV) = TP / (TP + FP)    (4)

5) NEGATIVE PREDICTIVE VALUE (NPV)
NPV is a measure of the proportion of negative predictions that are actually correct. It can be calculated as the number of true negative predictions divided by the total number of negative predictions made. For example, if a deep learning model makes 100 negative predictions for tuberculosis and 90 of them are correct, its negative predictive value is 90%.

NPV = TN / (TN + FN)    (5)

6) F1 SCORE
The F1 score is a measure that combines both precision and recall (sensitivity). It is calculated as the harmonic mean of precision and recall, and is a useful metric when there is a need to balance both false positives and false negatives. A high F1 score indicates a good balance between precision and recall.

F1 score = (2 × TP) / (2 × TP + FN + FP)    (6)

In addition to evaluating the performance of the different CNNs using these metrics, the networks are also compared in terms of the processing time required for training over 50 epochs (δe50). This processing time is the amount of time it takes for a network to complete the training epochs, and it is measured in seconds. The networks are compared using the time between the start and end of the training epochs, where t1 and t2 represent the start and end times, respectively. This allows researchers to understand the efficiency of different CNNs in terms of the time required to train the model.

δe50 = t2 − t1    (7)

III. EXPERIMENTAL RESULTS
In Figure 5 and Figure 6, the training process of our model is presented over 50 epochs for the Tuberculosis (TB) Chest X-ray Dataset and for the Total dataset, which combines data from three different datasets, respectively. These figures provide a visual representation of the model's performance during training, enabling us to understand how well the model is learning and adapting to the data.

FIGURE 5. Accuracy over 50 epochs of training on the Tuberculosis (TB) Chest X-ray Dataset.

Tables 3 and 4 present the training times of 7 models over 50 epochs for the Tuberculosis (TB) Chest X-ray dataset and the Total dataset, respectively. These tables provide a summary of the training times for each model and allow us to compare the performance of the different models on these datasets. The results indicate that our model may take longer to train compared to other deep learning models. This is because we have employed CBAM (Convolutional Block Attention Module), a more complex architecture that has been demonstrated to enhance model performance in certain tasks. While CBAM does improve model performance, it also requires more computational resources and a longer training time due to its increased complexity.

The performance evaluations are presented in Table 5 and Table 6. Table 5 shows the overall performance of our model on the validation dataset, including metrics such as accuracy, true positive rate, false positive rate, positive predictive value, negative predictive value, and F1 score, for the Tuberculosis (TB) Chest X-ray Dataset only. Table 6 shows the same metrics for the total dataset, which combines the Shenzhen dataset, the Montgomery dataset, and the Tuberculosis (TB) Chest X-ray Dataset. Both tables provide a clear and comprehensive view of the performance of our model and demonstrate its ability to effectively learn and generalize from the training data. It is believed that these results convincingly demonstrate the superiority of our model and the effectiveness of our training process.

The following are some important values from the experiments:

A. ACCURACY
Our model has achieved a satisfactory accuracy of 98.80% for the TB dataset and 97.00% for the total dataset, which is significantly higher than the accuracy of the other deep learning models. This means that our model is able to correctly classify a larger proportion of the input data, which is a key indicator of its effectiveness.

B. TRUE POSITIVE RATE
The true positive rate of our model is also significantly higher than that of the other deep learning models. Our model achieves 94.28% for the TB dataset and 95.43% for the total dataset. This means that our model is able to accurately identify a larger proportion of positive cases, which is important for a task such as tuberculosis diagnosis.

C. TRUE NEGATIVE RATE
In contrast to the other deep learning models, our model has the highest true negative rate. It achieves 99.71% for the TB dataset and 97.43% for the total dataset. This means that it is able to accurately identify a large proportion of negative cases, which is important for avoiding false alarms and unnecessary interventions.

D. POSITIVE PREDICTIVE VALUE
The positive predictive value of our model is also higher than that of the other deep learning models. Our model achieves 98.50% for the TB dataset and 91.26% for the total dataset. This means that our model is able to accurately predict a larger proportion of positive cases, which is important for disease diagnosis tasks such as tuberculosis detection.

E. NEGATIVE PREDICTIVE VALUE
Our model has a very high negative predictive value, which is significantly higher than that of the other deep learning models. It achieves 98.86% for the TB dataset and 98.70% for the total dataset. This means that it is able to accurately predict a large proportion of negative cases, which is important for avoiding false alarms and unnecessary interventions.

F. F1 SCORE
The F1 score of our model is also significantly higher than that of the other deep learning models. It achieves 96.35% for the TB dataset and 93.30% for the total dataset. This is a measure of the model's ability to balance precision and recall, and a high F1 score indicates that our model is able to accurately identify a large proportion of positive cases while also avoiding false alarms.
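Tables 3 and 4 report training times measured as in Eq. (7). For reproducibility, the training configuration of Section II-C (batch size 16, SGD with learning rate 0.001 and momentum 0.9, step decay with step size 7 and gamma 0.1) and the timing measurement can be sketched roughly as follows; model and train_loader are placeholders for the network and the data pipeline.

```python
import time
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR

def train_and_time(model, train_loader, device="cuda", epochs=50):
    """Train with the hyperparameters reported in Section II-C and return delta_e50."""
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    scheduler = StepLR(optimizer, step_size=7, gamma=0.1)

    t1 = time.time()                            # start time, as in Eq. (7)
    for _ in range(epochs):
        for images, labels in train_loader:     # batches of size 16
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
    t2 = time.time()                            # end time, as in Eq. (7)
    return t2 - t1                              # delta_e50, in seconds
```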
TABLE 3. Training time for each model on the Tuberculosis (TB) Chest X-ray dataset.
FIGURE 7. ROC curve for the Total Dataset.

FIGURE 8. ROC curve for the TB Dataset.

G. ROC
Figures 7 and 8 demonstrate that our model achieved the highest AUC (Area Under the Curve) compared to the other six deep learning models. This result shows the effectiveness of our model in accurately predicting the outcome of a binary classification task. The AUC is a widely used metric in machine learning, and a high AUC value indicates that the model can effectively distinguish between positive and negative classes.

Overall, our model has consistently outperformed the other deep learning models across a wide range of metrics, including accuracy, true positive rate, false positive rate, positive predictive value, negative predictive value, F1 score, and AUC.

IV. CONCLUSION
As a result of our study, it is found that the diagnostic performance of a supervised machine learning model depends on the dataset. This is because of the varying technical specifications of CXR images, as well as the distribution of disease severity in different populations. Not only CBAMWDNet but also the other competing models maintain high diagnostic accuracy for the training, validation, and test images using the total dataset and the Tuberculosis (TB) chest X-ray dataset. There is no doubt that the quality of data is as significant as the quantity of data, regardless of how advanced the implemented algorithm is. Moreover, our model has achieved a satisfactory level of accuracy. What makes this achievement particularly noteworthy is that we are able to reach this level of accuracy without using any pre-training and with only 50 epochs of training. This is worth mentioning, as it typically takes many more epochs for other models to achieve such high levels of accuracy. The fact that our model is able to learn and generalize so effectively from the training data in such a short amount of time is a testament to its efficiency and the effectiveness of our training process. With such high levels of accuracy, we are confident that the model will perform well on unseen data, and we look forward to analyzing how this model performs in real-world situations.

There are a few limitations to this study that need to be considered. The first difficulty is that CBAMWDNet is computationally intensive because of its large number of parameters, which can pose a challenge for researchers who do not have access to high-performance computing resources. Furthermore, the lack of labeled data may also pose a limitation. Limited availability of validated open-source data may hinder the performance of the model, resulting in lower accuracy and reliability. It may be possible to resolve this issue in the future when more valid open-source data becomes available.

For future study, the proposed CBAMWDNet can also be applied to other disease classification tasks. Alternatively, incremental learning can enable continuous training with new data while retaining knowledge from previous sessions, resulting in improved accuracy. Additionally, an enhanced weighted model ensemble strategy may be considered for further optimizing the performance of the model. Lastly, multi-objective optimization techniques may be utilized in order to determine the optimal model weights and possibly enhance the overall performance of CBAMWDNet.

DATA AVAILABILITY STATEMENT
The Tuberculosis (TB) Chest X-ray Database was collected by Rahman et al. [19] and can be accessed from https://www.kaggle.com/datasets/tawsifurrahman/tuberculosis-tb-chest-xray-dataset
The Montgomery dataset was collected by Jaeger et al. [23] and can be accessed from https://www.kaggle.com/datasets/raddar/tuberculosis-chest-xrays-montgomery
The Shenzhen dataset was also collected by Jaeger et al. [23] and can be accessed from https://www.kaggle.com/datasets/raddar/tuberculosis-chest-xrays-shenzhen

REFERENCES
[1] N. Salazar-Austin, C. Mulder, G. Hoddinott, T. Ryckman, C. F. Hanrahan, K. Velen, L. Chimoyi, S. Charalambous, and V. N. Chihota, "Preventive treatment for household contacts of drug-susceptible tuberculosis patients," Pathogens, vol. 11, no. 11, p. 1258, Oct. 2022.
[2] S. N. Cho, "Current issues on molecular and immunological diagnosis of tuberculosis," Yonsei Med. J., vol. 48, no. 3, pp. 347–359, 2007.
[3] B. Zhang, Q. Zhao, W. Feng, and S. Lyu, "AlphaMEX: A smarter global pooling method for convolutional neural networks," Neurocomputing, vol. 321, pp. 36–48, Dec. 2018.
[4] Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time series," in The Handbook of Brain Theory and Neural Networks, vol. 3361, no. 10. Cambridge, MA, USA: MIT Press, 1995.
[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[6] G. Huang, "Deep networks with stochastic depth," in Proc. 14th Eur. Conf. Comput. Vis. (ECCV), Oct. 2016, pp. 646–661.
[7] M. Maithri, U. Raghavendra, A. Gudigar, J. Samanth, P. D. Barua, M. Murugappan, Y. Chakole, and U. R. Acharya, "Automated emotion recognition: Current trends and future perspectives," Comput. Methods Programs Biomed., vol. 215, Mar. 2022, Art. no. 106646.
[8] R. A. Khalil, E. Jones, M. I. Babar, T. Jan, M. H. Zafar, and T. Alhussain, "Speech emotion recognition using deep learning techniques: A review," IEEE Access, vol. 7, pp. 117327–117345, 2019.
[9] Z. Zhao, Z. Bao, Y. Zhao, Z. Zhang, N. Cummins, Z. Ren, and B. Schuller, "Exploring deep spectrum representations via attention-based recurrent and convolutional neural networks for speech emotion recognition," IEEE Access, vol. 7, pp. 97515–97525, 2019.
[10] L. Yang, K. Xie, C. Wen, and J.-B. He, "Speech emotion analysis of netizens based on bidirectional LSTM and PGCDBN," IEEE Access, vol. 9, pp. 59860–59872, 2021.
[11] S. Zhong, B. Yu, and H. Zhang, "Exploration of an independent training framework for speech emotion recognition," IEEE Access, vol. 8, pp. 222533–222543, 2020.
[12] W. Wei and E. A. Huerta, "Gravitational wave denoising of binary black hole mergers with deep learning," Phys. Lett. B, vol. 800, Jan. 2020, Art. no. 135081.
[13] A. J. K. Chua and M. Vallisneri, "Learning Bayesian posteriors with neural networks for gravitational-wave inference," Phys. Rev. Lett., vol. 124, no. 4, Jan. 2020, Art. no. 041102.
[14] N. Lopac, F. Hrzic, I. P. Vuksanovic, and J. Lerga, "Detection of non-stationary GW signals in high noise from Cohen's class of time–frequency representations using deep learning," IEEE Access, vol. 10, pp. 2408–2428, 2022.
[15] F. Pasa, V. Golkov, F. Pfeiffer, D. Cremers, and D. Pfeiffer, "Efficient deep network architectures for fast chest X-ray tuberculosis screening and visualization," Sci. Rep., vol. 9, no. 1, p. 6268, Apr. 2019.
[16] J. Melendez, C. I. Sánchez, R. H. H. M. Philipsen, P. Maduskar, R. Dawson, G. Theron, K. Dheda, and B. van Ginneken, "An automated tuberculosis screening strategy combining X-ray-based computer-aided detection and clinical information," Sci. Rep., vol. 6, no. 1, p. 25265, Apr. 2016.
[17] S. Vajda, A. Karargyris, S. Jaeger, K. C. Santosh, S. Candemir, Z. Xue, S. Antani, and G. Thoma, "Feature selection for automatic tuberculosis screening in frontal chest radiographs," J. Med. Syst., vol. 42, no. 8, pp. 1–11, Aug. 2018.
[18] U. K. Lopes and J. F. Valiati, "Pre-trained convolutional neural networks as feature extractors for tuberculosis detection," Comput. Biol. Med., vol. 89, pp. 135–143, Oct. 2017.
[19] T. Rahman, A. Khandakar, M. A. Kadir, K. R. Islam, K. F. Islam, R. Mazhar, T. Hamid, M. T. Islam, S. Kashem, Z. B. Mahbub, M. A. Ayari, and M. E. H. Chowdhury, "Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization," IEEE Access, vol. 8, pp. 191586–191601, 2020.
[20] S. Rajaraman and S. K. Antani, "Modality-specific deep learning model ensembles toward improving TB detection in chest radiographs," IEEE Access, vol. 8, pp. 27318–27326, 2020.
[21] S. Zagoruyko and N. Komodakis, "Wide residual networks," 2016, arXiv:1605.07146.
[22] H. Liu, J. Xu, Y. Wu, Q. Guo, B. Ibragimov, and L. Xing, "Learning deconvolutional deep neural network for high resolution medical image reconstruction," Inf. Sci., vol. 468, pp. 142–154, Nov. 2018.
[23] S. Jaeger, S. Candemir, S. Antani, Y. X. Wang, P. X. Lu, and G. Thoma, "Two public chest X-ray datasets for computer-aided screening of pulmonary diseases," Quant. Imag. Med. Surg., vol. 4, p. 475, Dec. 2014.
[24] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," 2018, arXiv:1807.06521.
[25] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 7132–7141, doi: 10.1109/CVPR.2018.00745.

VO TRONG QUANG HUY was born in Da Nang, Vietnam, in 1994. He received the master's degree in information management from Yuan Ze University, Taoyuan, Taiwan, in 2019, where he is currently pursuing the Ph.D. degree in electrical engineering. His research interests include deep learning, neural networks, fuzzy logic control, and intelligent control systems.

CHIH-MIN LIN was born in Changhua, Taiwan, in 1959. He received the B.S. and M.S. degrees from the Department of Control Engineering, National Chiao Tung University, Hsinchu, Taiwan, in 1981 and 1983, respectively, and the Ph.D. degree from the Institute of Electronics Engineering, National Chiao Tung University, in 1986. He is currently a Chair Professor with Yuan Ze University, Taoyuan, Taiwan. His research interests include fuzzy neural networks, cerebellar model articulation controllers, intelligent control systems, adaptive signal processing, and classification problems. He serves as an Associate Editor for IEEE TRANSACTIONS ON CYBERNETICS and IEEE TRANSACTIONS ON FUZZY SYSTEMS.