5.Automated-detection-of-vertebral-fractures-from-X-ray-images--A_2024_Neuroco
Neurocomputing
journal homepage: www.elsevier.com/locate/neucom
Keywords: Thoracolumbar X-ray image; Compression fracture; Burst fracture; Vertebral body segmentation; Machine learning model; Artificial intelligence in medicine

Vertebral fractures are a common problem, and compression and burst fractures are the most prevalent types in the thoracolumbar region. However, vertebral fractures are difficult to diagnose: an experienced orthopedist or radiologist is required to detect and determine the type of vertebral fracture. Thus, artificial intelligence methods for diagnosing vertebral fractures are clinically useful.
On the basis of a review of 12 studies in the literature, the earliest of which was published in 2020, we propose a machine learning model that detects and determines the type of vertebral fracture on the basis of X-ray data. In this method, YOLOv4 and ResUNet are used to segment vertebral bodies from X-ray images. In evaluation experiments, our method had a precision of 99%, 74%, and 94% in identifying healthy vertebrae, compression fractures, and burst fractures, respectively.
1. Introduction

Spine fractures can be caused by traumatic injury, especially in patients with osteoporosis. Spine fractures tend to occur at the thoracolumbar junction; thoracolumbar fractures are classified using various clinical classification systems. Some of these systems are based on the type and mechanism of fracture [1,2], and some are based on the state of the anatomical structure and nerves affected by the fracture [3,4]. The simplest and most widely used classification system is that based on the three-column theory proposed by Denis [2]. In this system, the spine is divided into the anterior, middle, and posterior columns.
Compression and burst fractures are the most common types of vertebral fractures [5]. Burst fractures are common in high-energy trauma and are most commonly associated with falls and traffic accidents; 10% of spine fractures are burst fractures [6]. The thoracolumbar spine, located at the transitional zone between the thoracic rib cage and lumbar spine, is susceptible to injury and fracture because it undergoes greater motion relative to other parts of the spine. A burst fracture is a thoracolumbar fracture involving the anterior and middle columns of the vertebral body. Burst fractures are usually unstable and often accompanied by neurological symptoms and spinal instability. Thus, a definitive diagnosis on the basis of computed tomography (CT) or magnetic resonance imaging (MRI) is necessary [7], and immediate
✩ An extended abstract, under the title ‘‘Automated Diagnosis of Vertebral Fractures Using Radiographs and Machine Learning,’’ appeared in the Proceedings
of the 18th International Conference on Intelligent Computing, ICIC 2022, August 2022, Xi’an, China. LNCS 13393, pp.726-738, 2022.
∗ Corresponding author at: Department of Computer Science and Information Engineering, Institute of Medical Information, Institute of Manufacturing
Information and Systems, National Cheng Kung University., No. 1, University Road, Tainan, 70101, Taiwan.
E-mail addresses: [email protected] (L.-W. Cheng), [email protected] (H.-H. Chou), [email protected] (Y.-X. Cai),
[email protected] (K.-Y. Huang), [email protected] (C.-C. Hsieh), [email protected] (P.-L. Chu), [email protected]
(I.-S. Cheng), [email protected] (S.-Y. Hsieh).
1 Li-Wei Cheng, Hsin-Hung Chou, and Yu-Xuan Cai contributed equally to this work.
2 Kuo-Yuan Huang and Sun-Yuan Hsieh contributed equally to this work.
https://doi.org/10.1016/j.neucom.2023.126946
Received 25 April 2023; Received in revised form 4 September 2023; Accepted 16 October 2023
Available online 30 October 2023
0925-2312/© 2023 Elsevier B.V. All rights reserved.
L.-W. Cheng et al. Neurocomputing 566 (2024) 126946
Table 1
Recent studies on the detection of vertebral fractures using artificial intelligence models.

| Reference (year) | Task | Dataset | Size of dataset | Models used | Performance | Weakness (a) |
| --- | --- | --- | --- | --- | --- | --- |
| Murata et al. (2020) [14] | Detection of vertebral fractures | A level-3 trauma medical center | 300 | DCNN | Accuracy, sensitivity, specificity = 86.0%, 84.7%, 87.3% | (A) |
| Kim et al. (2021) [13] | Vertebra detection and segmentation | Severance Hospital, Yonsei University College of Medicine | 797 | M-net | Mean Dice similarity metric of 91.60 ± 2.22% | (A)(B)(C) |
| Li et al. (2021) [35] | Detection of vertebral fractures | Taipei Veterans General Hospital | 941 | You Only Look Once version 3 (YOLOv3) | AI model with ensemble method: accuracy 93%, sensitivity 91%, specificity 93%; AI model with bootstrapping method: 89%, 83%, 95% | (A)(D) |
| Kim et al. (2021) [36] | Measurement of the vertebral compression ratio | Gachon Gil Hospital | 339 | Multi-dilated recurrent residual U-Net (MDR2-UNet) | Sensitivity: 0.937, specificity: 0.995, accuracy: 0.992, Dice similarity coefficient: 0.929, AUC: 0.987, precision–recall curve: 0.916 | (A)(B)(C) |
| Chen et al. (2021) [37] | Identification of vertebral fractures | National Taiwan University Hospital HsinChu Branch | 1306 | DCNN | Accuracy: 73.59%, sensitivity: 73.81%, specificity: 73.02%, AUC: 0.72 | (A) |
a (A) The classification task remains simplistic: it is limited to identifying the presence or absence of fractures, or to distinguishing a single fracture type, and therefore cannot recognize multiple fracture types; (B) vertebra detection covers only the lumbar region and omits the thoracic vertebrae, which overlooks the susceptibility of older individuals to thoracic fractures and the risk of existing fractures in that region; (C) the computation of vertebral height compression ratios overlooks consecutive vertebral fractures and fractures caused by compression of only the middle or posterior segment of the vertebra, so the computed ratios are imprecise and fractures can be missed; (D) height measurements and the Genant method do not account for cases in which the anterior, middle, and posterior sections of the vertebra are compressed proportionately, in which case the anterior, middle, and posterior height ratios fail to detect the fracture.
contrast enhancement is applied using adaptive histogram equalization. Second, pose-driven learning is used to identify each of the five lumbar vertebrae. Third, a multistream network (M-net) is used to segment the individual lumbar vertebra. Finally, the segmentation results are fine-tuned by being combined using the level-set method. The PoseNet, M-net, and level-set methods are combined for segmentation. In evaluation experiments, this combined method far outperformed the traditional U-net or M-net methods, achieving a Dice coefficient, precision, and specificity of 91.6 ± 2.22%, 84.57 ± 3.64%, and 99.5 ± 0.17%, respectively.
Kim et al. [36] proposed a multidilated recurrent residual U-Net (MDR2-UNet) architecture for vertebral segmentation; the architecture features a multidilated residual block and recurrent residual block, and dilated convolution is used to increase the receptive field. In that study, 339 X-ray images were divided into training, validation, and testing sets at a ratio of 6:2:2. In evaluation experiments, the aforementioned method had a final Dice similarity coefficient of 92.9%.
Chen et al. [37] proposed a deep CNN for fracture detection. They collected data from 1306 patients from National Taiwan University Hospital (NTUH), Hsinchu Branch, from 2015 to 2018, and the data were labeled using the semiquantitative method of Genant. ImageNet was used for pretraining, and Grad-CAM was used to visualize the attention heatmap of the model for transparency. Different from other studies, this study used plain radiographic data of the frontal abdominal region for training.
Li et al. (2021) [35] tested YOLO version 3 (YOLOv3) against human experts in vertebral fracture detection on a data set of 941 plain lateral radiographs from 941 older adult patients (average age: 76 ± 12 years); this data set was collected between 2016 and 2018 and featured 1101 vertebral fractures in the thoracic vertebrae T7 to lumbar vertebrae L5. YOLOv3 had a final accuracy of 93%.
In the method of Chou et al. (2022) [15], YOLOv3 is used for vertebral body localization and segmentation. The contrast and size of the images are adjusted as part of data preprocessing, and an ensemble model is used to determine the presence and grade of fractures per the Genant classification scheme; the fracture is graded on the basis of injury height at the anterior, middle, and posterior thirds of the vertebral body into grade 1 (<25%), grade 2 (26% to 40%), and grade 3 (>40%) fractures. Chou et al. tested their model on data from young patients and older adult patients and recommended that different models be used for young versus older adult patients. Their method achieved a final accuracy of 93.36%.
Experts have recently set a range for the determination of vertebral fractures, with osteoporotic compression fractures being the most common. Fractures in older adult patients are usually osteoporotic fractures or osteoporosis-induced compression fractures; data on these fractures are frequently used in clinical tests, but the clinical detection rate is low. Hong et al. [41] tested a model on clinical data and on 29,307 X-ray images of the lateral spine; this data set was collected between 2007 and 2018 and covered 10,341 patients aged 40 years or older. Their model had F1 scores of 0.92 for vertebral fractures and 0.78 for osteoporosis.
Dong et al. [43] formulated a GoogLeNet model that detects (1) moderate to severe fractures and (2) normal, trace, or mild fractures, defined on the basis of the Genant standard. They used a data set comprising spine radiographs collected from 19,985 male patients, and 100,409 vertebral bodies were extracted. In evaluation experiments, the model achieved a sensitivity of 59.8% and an F1 score of 0.72.
Chen et al. [42] trained and evaluated an ensemble model on ground truth MRI data that were of newly formed compression fractures or old fractures. In evaluation experiments, the performance results of the model for the identification of newly formed vertebral compression fractures were AUC = 0.80, accuracy = 74%, sensitivity = 80%, and specificity = 68%.
Kong et al. [39] proposed a CNN-based DeepSurv algorithm that predicts the occurrence of a fracture on the basis of lumbar spine X-ray images and clinical characteristics, such as age, gender, weight, glucocorticoid use, and secondary osteoporosis status. The training data covered mostly middle-aged to older adult women (74.4% female; average age: 60.5 years). The model outperformed the Fracture Risk Assessment Tool (FRAX) and a Cox proportional hazards model.
Rosenberg et al. [40] focused on the detection of traumatic thoracolumbar fractures, which is difficult because the presence of many organs in this chest region complicates fracture detection. ResNet18 and VGG16 were used for fracture prediction and comparison, respectively. The aforementioned research group used a data set comprising 630 sagittal radiographs of the vertebra of 151 patients; vertebral fractures were present and absent in 302 and 328 radiographs, respectively.
Table 2
Recent studies (published in 2022) on the detection of vertebral fractures using artificial intelligence models.

| Reference (year) | Task | Dataset | Size of dataset | Models used | Performance | Weakness |
| --- | --- | --- | --- | --- | --- | --- |
| Chou et al. (2022) [15] | Vertebral fracture detection | Private dataset from 1 center | 941 | YOLOv3, ResNet34, DenseNet121, DenseNet201 | Older adult population: accuracy, sensitivity, specificity = 93.36%, 88.97%, 94.26%; younger adult population: 93.75%, 65.00%, 98.49% | (A)(D) |
| Xiao et al. (2022) [38] | Detection of compressive vertebral fracture | Private dataset from 15 centers | 5970 | Ofeye 1.0 | Specificity of 97.1%, sensitivity of 86%, and accuracy of 93.9% | (A)(D) |
| Kong et al. (2022) [39] | Predicting osteoporotic fracture | Seoul National University Hospital | 1595 | DeepSurv (CNN-based) | C-index values: DeepSurv, 0.612; FRAX, 0.547; CoxPH, 0.594 | (A) |
| Rosenberg et al. (2022) [40] | Detection of traumatic thoracolumbar fractures | Spine Surgery Reference Center | 630 | ResNet18 and VGG16 | ResNet18: sensitivity 91%, specificity 89%, accuracy 88%; VGG16: 90%, 83%, 86% | (A) |
| Hong et al. (2022) [41] | Detection of vertebral fractures and osteoporosis | Severance Hospital, Seoul | 29 307 | DNN | Vertebral fractures: AUROC 0.94, sensitivity 0.88, specificity 0.85, F1 score 0.92; osteoporosis: AUROC 0.86, sensitivity 0.81, specificity 0.73, F1 score 0.78 | (A) |
| Chen et al. (2022) [42] | Identifying vertebral compression fractures | The Second Affiliated Hospital of Chongqing Medical University | 1099 | DL model | AUC: 0.80; accuracy: 74%; sensitivity: 80%; specificity: 68% | (A) |
| Dong et al. (2022) [43] | Detection of osteoporotic compression fractures | The Osteoporotic Fractures in Men | 19 985 | GoogLeNet | Sensitivity of 59.8%, PPV of 91.2%, F score of 0.72; AUC-ROC and the precision–recall curve were 0.99 and 0.82 | (A)(D) |
ResNet18 had the highest performance in evaluation experiments and achieved 91% sensitivity and 88% accuracy.
In addition to being difficult to characterize, osteoporotic or compression fractures in the lumbar spine and thoracic spine are characterized in a similar manner. Thus, Xiao et al. [38] proposed a system that detects and classifies osteoporosis-related compressive vertebral fractures on lateral chest radiographs; these fractures are difficult to detect in practice. They applied their system to data on older adult women. Fractures with a vertebral height loss of <25%, 25%–40%, 40%–67%, and >67% were characterized as mild, moderate, severe, and vertebral collapse fractures, respectively. Their system had 93.9% accuracy in evaluation experiments.

3. Methods

We present our method for detecting fractures and distinguishing between compression and burst fractures on the basis of X-ray images. We applied our method to lateral X-ray images of the thoracolumbar region. In this method, X-ray images are preprocessed, YOLO version 4 (YOLOv4) [31] is used for preliminary segmentation, and ResUNet [44] is used for actual segmentation. The random forest approach is then used to determine whether the image shows no fracture, a burst fracture, or a compression fracture.

3.1. Model architecture

To solve the problem of low contrast in X-ray images, we used adaptive histogram equalization in data preprocessing [13,45]. To solve the problem of input images having different sizes, we used YOLOv4 [31] for preliminary segmentation. The overall model architecture is presented in Fig. 2.
Adaptive histogram equalization [45] is a method for contrast enhancement with broad applicability and demonstrable effectiveness. A previous study demonstrated that a machine learning network can identify the location of each thoracolumbar vertebral body in input images of different sizes [13]. Therefore, we used the YOLOv4 model [31], which is fast and accurate, to preliminarily segment vertebral bodies from T1 to L5 in the thoracolumbar region.
In YOLO [46], a neural network processes an entire image once to determine the location and confidence of bounding boxes and the category to which the object in the bounding box belongs. The architecture of the YOLOv4 model is presented in Fig. 3. The confidence score of the model is provided in Fig. 5.
U-net [48] is useful for biomedical image segmentation [49–51], and ResNet [52] has achieved outstanding performance in image recognition. ResUNet [44], in which residual blocks are combined with a U-net architecture (Fig. 4), is a combination of both methods. The residual blocks, also known as ResBlocks, can mitigate gradient diffusion (vanishing gradients).
Generally, the size of the receptive field is closely related to the amount of contextual information that a CNN can capture. Because the amount of global information is usually insufficient, global average pooling is used to provide more global information to the CNN. However, in some data sets, spatial relationships may be lost, causing ambiguity. Thus, Zhao et al. [53] proposed a hierarchical global prior, named the Pyramid Scene Parsing Network. In this method, features from different levels generated by pyramid pooling are connected into a fully connected layer for classification; this allows CNN classification of images of different sizes and reduces the loss of information on the relationships between different regions.
After hierarchical global pooling is applied, vertebral bodies with a screw or with bone cement can be detected (Fig. 6). Vertebral bodies with a screw and with no prior surgery are marked in cyan and blue, respectively.

3.2. Feature analysis

The lateral X-ray images of the thoracolumbar region in the data set were annotated by an orthopedic doctor with regard to whether a burst or compression fracture was present. These fractures can be diagnosed on the basis of the height and proportion of the anterior, middle, and posterior columns of the vertebral body. The ratios of the corresponding heights of the lower or upper vertebral bodies must also be considered. We extracted features from images that we segmented using our method, and these features were used to train machine learning models that identify the vertebral body type. We evaluated these models and used the best performing model as our feature analysis model.
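As a rough illustration of the height-based features used in this subsection (listed in Table 3), the following Python sketch computes the within-body ratios, the ratios to adjacent vertebral bodies, and the 0.8 threshold flags from three measured heights. The function names, the dictionary layout, and the choice to leave the flag at 0 when no neighbour exists are assumptions made for this example; the measurement of the heights from the segmentation masks is not shown.

```python
from typing import Dict, Optional

HEIGHTS = ("Ha", "Hm", "Hp")  # anterior, middle, posterior heights (e.g. in pixels)
THRESHOLD = 0.8               # cutoff for the *_encode flags in Table 3

Body = Dict[str, float]


def neighbour_ratio(height: float, neighbour_height: Optional[float]) -> float:
    """Height divided by the same height of an adjacent body; -1 if there is none."""
    if neighbour_height is None:
        return -1.0
    return height / neighbour_height


def vertebra_features(body: Body,
                      upper: Optional[Body] = None,
                      lower: Optional[Body] = None) -> Dict[str, float]:
    """Build the Table 3 feature set for one vertebral body (hypothetical helper)."""
    feats: Dict[str, float] = dict(body)
    feats["R_HmHa"] = body["Hm"] / body["Ha"]
    feats["R_HpHa"] = body["Hp"] / body["Ha"]
    feats["R_HmHp"] = body["Hm"] / body["Hp"]
    for name in HEIGHTS:
        for side, neighbour in (("lower", lower), ("upper", upper)):
            ratio = neighbour_ratio(body[name], neighbour[name] if neighbour else None)
            key = f"R_{name}_{side}"
            feats[key] = ratio
            # Assumption: the -1 sentinel (no neighbour) is never flagged.
            feats[key + "_encode"] = 1 if 0.0 <= ratio < THRESHOLD else 0
    return feats


# Toy example: a body whose three heights all shrink by 30%.
upper_body = {"Ha": 30.0, "Hm": 28.0, "Hp": 31.0}
intact = {"Ha": 31.0, "Hm": 29.0, "Hp": 32.0}
squashed = {k: 0.7 * v for k, v in intact.items()}
example = vertebra_features(squashed, upper=upper_body, lower=None)
```

In this toy case the within-body ratios are unchanged by the uniform compression, whereas the ratio of the anterior height to that of the upper neighbour falls to about 0.72 and its flag is set, which is why ratios to adjacent vertebral bodies are considered alongside the within-body ratios.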
Compression fractures and burst fractures are primarily distinguished by the presence of damage to the middle column of the vertebral body; they are also distinguished by the height of the anterior and posterior vertebral bodies. The vertebral bodies are the main weight-bearing structures in the human body, and their heights typically increase gradually from top to bottom as a result of healthy growth. Therefore, physicians should first measure the anterior and posterior heights of three consecutive vertebral bodies, including the fractured vertebral body, when determining whether the fracture is a compression fracture or a burst fracture.
If a vertebral body collapses, its anterior height is expected to be discontinuous with the vertebral body above or below it. In this scenario, a compression or burst fracture can be detected. If the posterior height of a vertebral body is similar to that of the vertebral body above or below it, the middle column may be undamaged; in this situation, a burst fracture diagnosis can be excluded. If the ratio of the anterior to the posterior height of this vertebral body is less than 0.8 and the posterior height has not collapsed, a compression fracture is likely to have occurred. Normally, the height of the posterior vertebral body gradually increases from top to bottom. If it does not gradually increase
3.3. Data
Table 4
Demographic characteristics of patients.

| Characteristic | Value |
| --- | --- |
| Patients with vertebral fracture | 390 |
| Age (years) | 75.12 ± 10.03 |
| Gender: no. of patients (%) | |
| Male | 88 (22.6) |
| Female | 302 (77.4) |
| Single/multi-level spinal fractures: no. of patients (%) | |
| Single level | 271 (69.5) |
| 2-4 levels | 93 (23.8) |
| 5+ levels | 26 (6.7) |
Table 3
Features of segmented images.

| Attribute | Description |
| --- | --- |
| Ha | Anterior height of the vertebral body |
| Hm | Middle height of the vertebral body |
| Hp | Posterior height of the vertebral body |
| R_HmHa | Ratio of Hm and Ha (= Hm / Ha) |
| R_HpHa | Ratio of Hp and Ha (= Hp / Ha) |
| R_HmHp | Ratio of Hm and Hp (= Hm / Hp) |
| R_Hi_lower | = Hi / (Hi of the adjacent lower vertebral body), Hi = Ha, Hm, or Hp (= −1 when no lower vertebral body is present) |
| R_Hi_upper | = Hi / (Hi of the adjacent upper vertebral body), Hi = Ha, Hm, or Hp (= −1 when no upper vertebral body is present) |
| R_Hi_lower_encode | = 1 when R_Hi_lower < 0.8 |
| R_Hi_upper_encode | = 1 when R_Hi_upper < 0.8 |

Fig. 8. Vertebral body labels.

Vertebral bodies at T1 to L5 were given one of four labels: normal, compression fracture, burst fracture, and others; the ‘‘others’’ label was given if the vertebral body had bone cement or a screw. The labeled images were verified against CT or MRI images by an experienced orthopedic doctor. VGG Image Annotator [54] software was used for labeling. Examples of vertebral bodies corresponding to each of the four labels are presented in Fig. 8. Vertebral bodies were also labeled on the basis of whether they were at the upper thoracic, thoracolumbar, or lower lumbar regions [12]. The data set had images of 3634 vertebral bodies, and the numbers of vertebral bodies corresponding to each of the aforementioned labels are presented in Table 5.

3.4. Setup of evaluation experiments

Standard five-fold cross-validation was used in evaluation experiments [55]; this method allows for all observations to be used for training and testing and for each observation to be used for testing only once. In this method, the data set is split into five subsets (four for training and one for testing), and training and testing are executed over five iterations. At each iteration, a different subset is used for testing. The mean performance result is used to indicate the performance of the model.
We used Python [56] as the programming language. Experiments were conducted in TensorFlow [57] and Keras [58], and image processing was implemented using OpenCV [59] and VGG Image Annotator [54]. The experiments were run on a computer with an AMD Ryzen 5 3600 CPU, 32 GB of DDR4 RAM, and an NVIDIA GeForce RTX 2070 SUPER GPU.
The following metrics were used. First, the Dice coefficient was used to evaluate vertebral body segmentation quality. The mean of the Dice coefficients for all vertebral bodies was used. The Dice coefficient is defined as follows.

\[ \mathrm{Dice} = \frac{2 \times |P \cap G|}{|P| + |G|} \]

where P denotes the predicted segmentation results, G denotes the corresponding ground truth segmentation, and the operator |.| returns the number of labeled voxels.
Accuracy, precision, recall, and F1 score were used to evaluate diagnostic performance. These metrics are defined as follows.

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]

\[ \mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \]

where TP, FP, TN, and FN represent the numbers of true positives, false positives, true negatives, and false negatives, respectively.
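The metric definitions above translate directly into code. The following is a minimal Python sketch, not the implementation used in the experiments: masks are represented as sets of (row, column) pixel coordinates, the function names are chosen for this example, and the toy masks and counts are made up purely to exercise the formulas.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]


def dice(pred: Set[Pixel], truth: Set[Pixel]) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|); taken as 1.0 when both masks are empty."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def _safe_div(num: float, den: float) -> float:
    """Return 0.0 instead of raising when a denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Precision, recall, F1 score, and accuracy from confusion counts."""
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy masks: two 4x4 squares offset by one column (12 shared pixels out of 16 each).
predicted = {(r, c) for r in range(4) for c in range(4)}
ground_truth = {(r, c) for r in range(4) for c in range(1, 5)}
toy_dice = dice(predicted, ground_truth)  # 2 * 12 / (16 + 16)

# Toy confusion counts (illustrative only, not taken from the experiments).
toy_metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

For the three-class setting, these functions would be applied per label in a one-versus-rest fashion, with the counts read from the corresponding row and column of the confusion matrix.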
Table 5
Number of vertebral bodies for each label.

| Level | Total | Normal | Compression | Burst | Others |
| --- | --- | --- | --- | --- | --- |
| Upper thoracic level (T1-T9) | 729 | 634 | 38 | 41 | 16 |
| Thoracolumbar level (T10-L2) | 1809 | 1268 | 171 | 288 | 82 |
| Lower lumbar level (L3-L5) | 1096 | 979 | 16 | 3 | 98 |
| Total | 3634 | 2881 | 225 | 332 | 196 |
We used a multiclass confusion matrix to present the diagnostic results. The model outputs from X-ray images were verified against model outputs from CT or MRI images; CT and MRI images of a compression fracture and burst fracture are presented in Fig. 9. This was done to verify the feasibility of applying this model to X-ray images for diagnosis.

4. Results

We used 390 vertebral X-ray images in the sagittal view of 3634 vertebral bodies in evaluation experiments. Manual and model segmentation results are presented in Fig. 10 for comparison. The average Dice coefficient for segmentation was 0.852.
We compared our model against support vector machine, extreme gradient boosting (XGBoost), random forest, multilayer perceptron, and k-nearest-neighbor models. Their performance results are presented in Table 6 and Fig. 11 for comparison. We used a multiclass confusion matrix to present our results (Fig. 12). We chose the random forest model for feature analysis after the models were compared. Our model had precision scores of 99%, 74%, and 94% for the identification of no fractures, compression fractures, and burst fractures, respectively. For the determination of whether a fracture (in general) was present or absent, our model had an accuracy, precision, recall, and F1-score of 92.0%, 93.2%, 95.7% and 94.4%, respectively.
Our model took 30 s to output an annotated image, taking 1–2 s for each step.
Fig. 10. The images from left to right are original, manually segmented, and model segmented images, respectively. The rightmost image has the manual segmentation and model
segmentation results overlaid on top of each other.
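Table 6
Performance results of models for burst or compression fracture identification in the thoracolumbar region.

| Model | Total: Accuracy | Normal: Precision | Normal: Recall | Normal: F1-score |
| --- | --- | --- | --- | --- |
| RF | 0.98 | 0.94 | 0.99 | 0.96 |
| KNN | 0.89 | 0.90 | 0.98 | 0.94 |
| SVM | 0.89 | 0.91 | 0.98 | 0.94 |
| MLP | 0.89 | 0.89 | 0.99 | 0.94 |
| XGBoost | 0.96 | 0.98 | 0.99 | 0.99 |

| Model | Compression: Precision | Compression: Recall | Compression: F1-score | Burst: Precision | Burst: Recall | Burst: F1-score |
| --- | --- | --- | --- | --- | --- | --- |
| RF | 0.77 | 0.74 | 0.75 | 0.86 | 0.94 | 0.90 |
| KNN | 0.75 | 0.71 | 0.73 | 0.93 | 0.87 | 0.90 |
| SVM | 0.60 | 0.50 | 0.55 | 0.86 | 0.70 | 0.78 |
| MLP | 0.83 | 0.45 | 0.59 | 0.91 | 0.69 | 0.78 |
| XGBoost | 0.85 | 0.73 | 0.79 | 0.86 | 0.86 | 0.86 |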
Fig. 11. Performance results of various models in diagnosing burst or compression fractures in the thoracolumbar region.
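As a small worked check of how the columns of Table 6 relate, the F1-score of each class can be recomputed from its precision and recall. The sketch below does this for the RF rows only; the dictionary layout and the tolerance of 0.01 (to allow for the two-decimal rounding of the tabulated values) are choices made for this example.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-score) for the RF rows of Table 6.
rf_rows = {
    "normal": (0.94, 0.99, 0.96),
    "compression": (0.77, 0.74, 0.75),
    "burst": (0.86, 0.94, 0.90),
}

recomputed = {label: f1_from(p, r) for label, (p, r, _) in rf_rows.items()}
consistent = {
    label: abs(recomputed[label] - reported) < 0.01
    for label, (_, _, reported) in rf_rows.items()
}

for label in rf_rows:
    print(f"{label}: recomputed F1 = {recomputed[label]:.3f}, "
          f"matches table within rounding: {consistent[label]}")
```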
Data availability
[17] D.-S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Vol. 201, Publishing House of Electronic Industry of China, Beijing, 1996.
[18] D.-S. Huang, Radial basis probabilistic neural networks: Model and application, Int. J. Pattern Recognit. Artif. Intell. 13 (07) (1999) 1083–1101.
[19] D.-S. Huang, J.-X. Du, A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks, IEEE Trans. Neural Netw. 19 (12) (2008) 2099–2115.
[20] D.-S. Huang, W. Jiang, A general CPL-AdS methodology for fixing dynamic parameters in dual environments, IEEE Trans. Syst. Man Cybern. B 42 (5) (2012) 1489–1500.
[21] D.-S. Huang, W.-B. Zhao, Determining the centers of radial basis probabilistic neural networks by recursive orthogonal least square algorithms, Appl. Math. Comput. 162 (1) (2005) 461–473.
[22] F. Han, Q.-H. Ling, D.-S. Huang, Modified constrained learning algorithms incorporating additional functional constraints into neural networks, Inform. Sci. 178 (3) (2008) 907–919.
[23] J.-X. Du, D.-S. Huang, G.-J. Zhang, Z.-F. Wang, A novel full structure optimization algorithm for radial basis probabilistic neural networks, Neurocomputing 70 (1–3) (2006) 592–596.
[24] J.-X. Du, D.-S. Huang, X.-F. Wang, X. Gu, Shape recognition based on neural networks trained by differential evolution algorithm, Neurocomputing 70 (4–6) (2007) 896–903.
[25] W.-B. Zhao, D.-S. Huang, J.-Y. Du, L.-M. Wang, Genetic optimization of radial basis probabilistic neural networks, Int. J. Pattern Recognit. Artif. Intell. 18 (08) (2004) 1473–1499.
[26] Y. Xie, W. Zhang, C. Li, S. Lin, Y. Qu, Y. Zhang, Discriminative object tracking via sparse representation and online dictionary learning, IEEE Trans. Cybern. 44 (4) (2013) 539–553.
[27] A. Yilmaz, O. Javed, M. Shah, Object tracking: A survey, ACM Comput. Surv. (CSUR) 38 (4) (2006) 13–es.
[28] Q. Wang, L. Zhang, L. Bertinetto, W. Hu, P.H. Torr, Fast online object tracking and segmentation: A unifying approach, in: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2019, pp. 1328–1338.
[29] X. Lu, W. Wang, C. Ma, J. Shen, L. Shao, F. Porikli, See more, know more: Unsupervised video object segmentation with co-attention siamese networks, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 3623–3632.
[30] L. Bertinetto, J. Valmadre, J.F. Henriques, A. Vedaldi, P.H. Torr, Fully-convolutional siamese networks for object tracking, in: Computer Vision–ECCV 2016 Workshops: Amsterdam, the Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part II 14, Springer, 2016, pp. 850–865.
[31] A. Bochkovskiy, C.-Y. Wang, H.-Y.M. Liao, Yolov4: Optimal speed and accuracy of object detection, 2020, arXiv preprint arXiv:2004.10934.
[32] X.-F. Wang, D.-S. Huang, H. Xu, An efficient local Chan–Vese model for image segmentation, Pattern Recognit. 43 (3) (2010) 603–618.
[33] X.-F. Wang, D.-S. Huang, A novel density-based clustering framework by using level set method, IEEE Trans. Knowl. Data Eng. 21 (11) (2009) 1515–1531.
[34] Y. Zhao, D.-S. Huang, W. Jia, Completed local binary count for rotation invariant texture classification, IEEE Trans. Image Process. 21 (10) (2012) 4492–4497.
[35] Y.-C. Li, H.-H. Chen, H.H.-S. Lu, H.-T.H. Wu, M.-C. Chang, P.-H. Chou, Can a deep-learning model for the automated detection of vertebral fractures approach the performance level of human subspecialists? Clin. Orthop. Relat. Res. 479 (7) (2021) 1598–1612.
[36] D.H. Kim, J.G. Jeong, Y.J. Kim, K.G. Kim, J.Y. Jeon, Automated vertebral segmentation and measurement of vertebral compression ratio based on deep learning in X-ray images, J. Digit. Imaging 34 (4) (2021) 853–861.
[37] H.-Y. Chen, B.W.-Y. Hsu, Y.-K. Yin, F.-H. Lin, T.-H. Yang, R.-S. Yang, C.-K. Lee, V.S. Tseng, Application of deep learning algorithm to detect and visualize vertebral fractures on plain frontal radiographs, PLoS One 16 (1) (2021) e0245992.
[38] B.-H. Xiao, M.S. Zhu, E.-Z. Du, W.-H. Liu, J.-B. Ma, H. Huang, J.-S. Gong, D. Diacinti, K. Zhang, B. Gao, et al., A software program for automated compressive vertebral fracture detection on elderly women’s lateral chest radiograph: Ofeye 1.0, Quant. Imaging Med. Surg. 12 (8) (2022) 4259.
[42] W. Chen, X. Liu, K. Li, Y. Luo, S. Bai, J. Wu, W. Chen, M. Dong, D. Guo, A deep-learning model for identifying fresh vertebral compression fractures on digital radiography, Eur. Radiol. 32 (3) (2022) 1496–1505.
[43] Q. Dong, G. Luo, N.E. Lane, L.-Y. Lui, L.M. Marshall, D.M. Kado, P. Cawthon, J. Perry, S.K. Johnston, D. Haynor, et al., Deep learning classification of spinal osteoporotic compression fractures on radiographs using an adaptation of the genant semiquantitative criteria, Acad. Radiol. (2022).
[44] F.I. Diakogiannis, F. Waldner, P. Caccetta, C. Wu, ResUNet-a: A deep learning framework for semantic segmentation of remotely sensed data, ISPRS J. Photogramm. Remote Sens. 162 (2020) 94–114.
[45] S.M. Pizer, E.P. Amburn, J.D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J.B. Zimmerman, K. Zuiderveld, Adaptive histogram equalization and its variations, Comput. Vis. Graph. Image Process. 39 (3) (1987) 355–368.
[46] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
[47] I. Pacal, D. Karaboga, A robust real-time deep learning based automatic polyp detection system, Comput. Biol. Med. (ISSN: 0010-4825) 134 (2021) 104519, http://dx.doi.org/10.1016/j.compbiomed.2021.104519, URL: https://www.sciencedirect.com/science/article/pii/S0010482521003139.
[48] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234–241.
[49] D. Shen, G. Wu, H.-I. Suk, Deep learning in medical image analysis, Annu. Rev. Biomed. Eng. 19 (2017) 221–248.
[50] G. Litjens, T. Kooi, B.E. Bejnordi, A.A.A. Setio, F. Ciompi, M. Ghafoorian, J.A. Van Der Laak, B. Van Ginneken, C.I. Sánchez, A survey on deep learning in medical image analysis, Med. Image Anal. 42 (2017) 60–88.
[51] J. Ker, L. Wang, J. Rao, T. Lim, Deep learning applications in medical image analysis, IEEE Access 6 (2017) 9375–9389.
[52] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[53] H. Zhao, J. Shi, X. Qi, X. Wang, J. Jia, Pyramid scene parsing network, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2881–2890.
[54] A. Dutta, A. Zisserman, The VIA annotation software for images, audio and video, in: Proceedings of the 27th ACM International Conference on Multimedia, 2019, pp. 2276–2279.
[55] R. Kohavi, et al., A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Ijcai, Vol. 14, Montreal, Canada, 1995, pp. 1137–1145.
[56] G. Van Rossum, F.L. Drake, Python 3 reference manual, 2009, https://www.python.org/.
[57] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: Large-scale machine learning on heterogeneous systems, 2015, URL: https://www.tensorflow.org/, Software available from tensorflow.org.
[58] F. Chollet, et al., Keras, 2015, https://keras.io.
[59] Itseez, Open source computer vision library, 2015, https://github.com/itseez/opencv.
[60] R.F. Heary, S. Kumar, Decision-making in burst fractures of the thoracolumbar and lumbar spine, Indian J. Orthop. 41 (4) (2007) 268.
[61] L. Yi, B. Jingping, J. Gele, T. Wu, X. Baoleri, Operative versus non-operative treatment for thoracolumbar burst fractures without neurological deficit, Cochrane Database Syst. Rev. (4) (2006).

Li-Wei Cheng received the MS degree from the Institute of Medical Informatics, National Cheng Kung University
[39] S.H. Kong, J.-W. Lee, B.U. Bae, J.K. Sung, K.H. Jung, J.H. Kim, C.S. Shin, in 2021. His research interests include machine learning,
Development of a spine X-ray-based fracture prediction model using a deep algorithm design and bioinformatics.
learning algorithm, Endocrinol. Metab. 37 (4) (2022) 674–683.
[40] G.S. Rosenberg, A. Cina, G.R. Schiró, P.D. Giorgi, B. Gueorguiev, M. Alini,
P. Varga, F. Galbusera, E. Gallazzi, Artificial intelligence accurately detects
traumatic thoracolumbar fractures on sagittal radiographs, Medicina 58 (8)
(2022) 998.
[41] N. Hong, S.W. Cho, S. Shin, S.A. Jang, S. Roh, Y. Rhee, H. Kim, K.M. Kim,
S.R. Cummings, Deep learning-based algorithms to detect vertebral fractures and
osteoporosis using lateral spine X-ray radiograph, Bone Rep. 16 (2022) 101576.
L.-W. Cheng et al. Neurocomputing 566 (2024) 126946
Hsin-Hung Chou is an associate professor at the Department of Computer Science and Information Engineering, National Chi Nan University. He received the Ph.D. degree in Computer Science and Information Engineering from National Taiwan University in 2004. His research interests include computer games, machine learning, algorithm design, and cloud computing. Over the past decade, Dr. Chou has focused on developing machine learning techniques for computer games. He received the Special Outstanding Researcher award from the Ministry of Science and Technology in 2015. His computer Othello program Mothello won the silver medal at the Computer Olympiad 2015 held by ICGA (International Computer Game Association), and his 9×9 computer Go program Wingo has won the bronze medal several times at the computer game contests held by TAAI (Association of Taiwan Artificial Intelligent) and TCGA (Taiwan Computer Game Association).

Po-Lun Chu received the Bachelor of Science (B.S.) degree from the Department of Computer Science and Information Engineering, National Cheng Kung University in 2022. His research interests include computer vision and deep learning.

I-Szu Cheng received the Doctor of Medicine (MD) degree from the Department of Medicine, National Cheng Kung University in 2023. Her research interests include intelligent medicine and analysis of health insurance research databases.