Fracture Identification on Facial Bone X-Ray using Transfer Learning ( YOLO V8 Algorithm )
ABSTRACT Facial bone fractures are not the most common type of bone fracture, but they are significant because they can affect essential functions, so detecting them early and accurately is crucial. These fractures often occur due to trauma, accidents, or weak bones. Diagnosis typically involves taking X-rays of the affected area, followed by confirmation from a medical professional or radiologist. Although X-rays are the primary tool, minor fractures may be overlooked, which makes timely detection challenging. Skilled radiologists who can detect fractures promptly are not always available, leading to delayed diagnosis and compromised recovery. To overcome these challenges, our approach combines deep learning with transfer learning, using the YOLOv8 algorithm for better accuracy and efficiency in detecting facial bone fractures. We trained our model on a dataset of facial bone X-ray images categorized into three types of fractures. We then compared our model's performance with existing ones to assess its effectiveness in detecting facial bone fractures.
INDEX TERMS Facial bone fractures, X-rays, Deep learning, Transfer learning, YOLOv8 Algorithm,
Dataset, Performance evaluation
I. INTRODUCTION
Facial bones are a crucial component of the human skeletal system, consisting of various types such as the mandibular (lower jaw), maxillary (upper jaw), nasal (small paired bones that form the bridge of the nose), lacrimal, vomer, zygomatic, and palatine bones. These bones not only provide structural support to the face but also play vital roles in functions such as chewing, breathing, and facial expression. Fractures within the facial skeleton, categorized into many types such as linear fractures, comminuted fractures, depressed fractures, and Le Fort fractures, can significantly impact an individual's appearance, sensory functions, and overall well-being.

Timely and precise identification of facial bone fractures is crucial for promptly starting suitable treatment and reducing the risk of complications. Nevertheless, traditional approaches that hinge on human analysis of X-ray images are susceptible to subjectivity and fluctuations in accuracy. Moreover, certain fracture types, particularly hairline or complex fractures, may evade detection through conventional means.

Within our project, we focus solely on identifying fractures in the mandibular, maxillary, and nasal bones, as these regions are commonly affected by trauma and fractures. To address the challenges associated with fracture detection, we introduce a deep-learning approach utilizing transfer learning for facial bone fracture identification in X-ray images. By harnessing pre-trained models, this approach enhances the model's ability to recognize intricate patterns indicative of fractures across a diverse range of X-ray images. Through this methodology, we aim not only to improve the precision of fracture identification but also to expedite the diagnostic process, leading to enhanced patient outcomes.

Through extensive training on a comprehensive dataset comprising various types of facial bone fractures, specifically of the mandibular, maxillary, and nasal bones, this research aims to offer a valuable diagnostic resource for healthcare professionals and radiologists. Our goal is to advance the field of medical imaging analysis, facilitating better treatment planning and improving recovery outcomes for individuals affected by facial bone fractures.
The nasal bone exhibits the highest frequency of facial bone fractures (50%), followed by the mandibular bone, while the maxillary bone has the lowest incidence rate. The male-to-female ratio stands at about 7:3, and facial bone fractures were more common among patients aged 10 to 29 years. This article embraces the following points:

• An easy way to detect facial bone fractures, specifically three types: mandibular, maxillary, and nasal fractures.
• The dataset used for this experiment is a customized dataset of facial X-ray images.
• Transfer learning is applied to a deep learning model aimed at improving fracture detection in facial bone X-rays.

The remaining sections are organized as follows. Section II covers related work, that is, the literature review for our proposed plan. Section III covers the dataset, preprocessing, and existing methodologies. Section IV covers the proposed model. Section V covers the experiment, analysis of results, and comparison of results. Section VI covers the final remarks and the future scope of the study.

II. LITERATURE REVIEW

Our approach combines deep learning with transfer learning. We use the YOLOv8 algorithm for better accuracy and efficiency in detecting facial bone fractures; it gives a prediction percentage together with the fracture type.

Rizwan Qureshi et al. [2] explore how You Only Look Once (YOLO) is predominantly applied in medical object detection tasks, such as identifying anomalies, classifying skin anomalies, and detecting cardiac abnormalities. The paper highlights YOLO's strong performance but notes its requirements for large, balanced datasets and significant computational resources. The review suggests areas for improvement and future research to enhance YOLO's effectiveness in medical object detection, potentially leading to improved detection and outcomes for patients in clinical settings.

Robert Lindsey et al. [20] describe a deep neural network developed to aid emergency medicine clinicians in detecting and localizing fractures in radiographs, trained on annotations from senior orthopedic surgeons. Clinicians demonstrated improved sensitivity (91.5%) and specificity (93.9%) with the model's assistance, resulting in a 47.0% reduction in misinterpretation rates. These results highlight the capacity of deep learning to elevate diagnostic precision and improve patient care in emergency situations.

Yeonjin Jeong et al. [5] proposed the CA-FBFD system, which addresses challenges in facial bone fracture diagnosis by leveraging YOLOX-S object detection with CT image augmentation. While it achieves an average precision of 69.80% and higher sensitivity than the baseline, its evaluation is limited to nasal fractures, and reliance on CT imaging raises potential accessibility issues. Further validation across diverse cases is essential for clinical applicability.

Kandel et al. [1] studied the use of transfer learning, a deep learning technique in computer vision, to classify musculoskeletal images in order to detect bone fractures. Various CNN architectures were used, such as VGG, Xception, ResNet, GoogleNet, InceptionResNet, and DenseNet, and accuracy and Kappa statistics were employed to gauge the effectiveness of all networks, with the model achieving an impressive confidence score of 95%.

Atsuyuki Inui et al. [4] assess the detection precision of YOLOv8, a deep-learning model, for identifying elbow osteochondritis dissecans (OCD) lesions in ultrasound images. Results show high accuracy in classifying normal and OCD images, with precision and recall rates exceeding 99%. The model demonstrates effectiveness in both image classification and object detection, suggesting potential for widespread screening in medical examinations for baseball elbow.

Rakesh et al. [3] conducted a project to detect bone fractures using Canny edge detection and the Bernsen algorithm. The methods were evaluated on a Kaggle-obtained dataset with 15000 samples of wrist fracture RGB images. Canny detection and the Bernsen algorithm achieved 76.85% and 75.29%, respectively. Rakesh et al. also concluded that Canny edge detection is less sensitive to partial blockage.

Young-Dae Jeon et al. [21] introduce an advanced machine learning framework leveraging the "You Only Look Once" (YOLO) v4 algorithm to identify and illustrate fracture regions within three-dimensional skeletal images. The model, assessed using precision-recall curves and intersection over union metrics, shows average precision values above 0.60 and promising performance in detecting fractures in the tibia and elbow. This existing system provides a clear presentation of fractured regions by overlaying distinctive red masks onto 3D reconstructed bone images, potentially aiding orthopedic surgeons in quick and accurate trauma diagnosis.
Surendar Reddy Vinta et al. [6] developed a hybrid model utilizing the deep learning algorithms YOLO NAS, EfficientDet, and DETR3 to detect hand bone and joint fractures in X-rays. The model, developed using a dataset containing 4736 X-ray images of hand bones, achieves precise object detection across six fracture classes, offering improved accuracy compared to existing methods. This approach addresses the challenges of missed fractures and delayed diagnosis, enhancing the effectiveness of fracture detection in hand X-ray imaging.

III. MATERIALS AND METHODOLOGIES

A. DATASET DESCRIPTION

The success of any model heavily depends on the quality and diversity of its dataset. Our model leverages a meticulously curated custom dataset of X-ray images sourced from diverse channels and carefully selected based on their quality and relevance.

DATASET SAMPLES

Encompassing various fracture types, including nasal, mandibular, and maxillary fractures, each image is meticulously annotated for precise identification. The dataset is carefully partitioned into three subsets, namely training, validation, and testing, allowing for complete model evaluation. Its simplicity and clarity render it an invaluable resource for researchers and practitioners in the fracture detection domain.
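As a rough illustration of the three-way partition described above, the sketch below shuffles a list of image file names and splits it into training, validation, and testing subsets. The 70/20/10 ratios, the fixed seed, and the file names are assumptions made for illustration; the proportions actually used are not stated here.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle items and split them into train/val/test lists.

    The fractions are illustrative assumptions. Whatever remains after
    the train and val slices goes to the test subset.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for testing")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# Hypothetical file names standing in for annotated X-ray images.
images = [f"xray_{i:03d}.png" for i in range(100)]
splits = split_dataset(images)
```

Splitting once with a fixed seed keeps the test subset untouched across experiments, so every compared model is scored on the same images.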
TABLE 1: Comparison table of Related Work
Surendar Reddy Vinta and co-authors | 2024 | YOLO NAS | Public | Detection Only | Hand Fracture
The process of data preprocessing is essential for preparing raw data effectively for real-time detections. Raw data often contains noise and inconsistencies, making preprocessing crucial to optimize its quality and suitability for machine learning and deep learning tasks. The primary steps involved in data preprocessing include cleaning, which involves removing noise and filling in missing values; transformation, which organizes the dataset for better understanding; integration, which merges data from different sources; reduction, which removes inefficient and irrelevant data; and normalization, which facilitates comparison across different classes or features. In our study, we utilized a public dataset comprising images from various sources. To enhance the dataset's quality and reliability.

ANNOTATED IMAGES

To evaluate the model's performance, we conducted experiments using various established algorithms. The dataset created for this study was used consistently across all implementations of existing methodologies.

1) CNN MODEL – Inception V3

InceptionV3 is widely recognized for its balance between accuracy and processing efficiency, making it suitable for various image recognition tasks. However, its application in detecting facial bone fractures has limitations. While pre-training on large datasets like ImageNet provides a rich set of features, the model may struggle to capture the intricacies of medical imaging data. Additionally, its focus on image classification may not directly translate to precise fracture localization in facial bone X-rays. The computational complexity of InceptionV3, particularly with high-resolution medical images, can lead to longer processing times and increased resource requirements, hindering real-time applications. Furthermore, overfitting may occur when the model is trained on limited medical imaging data, affecting generalization performance on unseen datasets. These constraints underscore the necessity for specialized architectures tailored explicitly for facial bone fracture detection.
Moreover, YoloX utilizes a decoupled head structure to reduce interference between bounding box regression and class classification, thereby enhancing the accuracy of predictions. Unlike coupled head structures, which output results together, decoupled heads independently produce classification and regression results, reducing the trade-off between the two tasks. Additionally, YoloX implements a multi-positive strategy to tackle data imbalance challenges prevalent in medical datasets such as fracture data. This sampling strategy treats predicted bounding boxes within a 3x3 region around the fracture location as positive samples, thus resolving the issue of limited positive samples.

However, YoloX also has limitations. Its computational demands may present obstacles for real-time implementations or platforms with limited resources. Additionally, while the anchor-free strategy is beneficial for certain tasks like nasal bone fracture detection, it may not perform optimally in scenarios requiring precise localization of objects. Moreover, the multi-positive strategy, while effective in mitigating data imbalance, may introduce noise or false positives in the predictions, especially in cases of overlapping fractures or complex anatomical structures. These limitations should be considered when deploying YoloX for fracture detection tasks.

3) Yolo V4

In utilizing Yolo V4 for facial fracture detection, several limitations need consideration. Firstly, while Yolo V4 is known for its high accuracy and efficiency in object detection tasks, its performance may vary depending on factors such as dataset quality, model configuration, and training methodology. Secondly, the effectiveness of Yolo V4 in detecting facial fractures may be impacted by the complexity and variability of fracture patterns, particularly in cases where fractures are small, subtle, or obscured by other anatomical structures.

Additionally, Yolo V4's reliance on anchor boxes for object localization may pose challenges in accurately delineating fractured regions, especially in scenarios where fractures exhibit irregular shapes or orientations. Furthermore, the computational resources required for training and inference with Yolo V4 may be significant, limiting its accessibility for resource-constrained environments or real-time applications. Moreover, Yolo V4 may struggle with detecting facial fractures in images with poor quality or low resolution, potentially leading to false negatives or inaccurate predictions. Finally, the generalization ability of Yolo V4 to detect facial fractures across diverse patient populations, imaging modalities, and clinical settings remains to be fully evaluated, highlighting the necessity for thorough validation studies in real-world healthcare settings. These limitations underscore the importance of careful consideration and evaluation when employing Yolo V4 for facial fracture detection tasks.

4) Yolo V8

Ultralytics' YOLOv8 model represents a significant advancement in object identification and image segmentation, offering improvements in speed, accuracy, and adaptability compared to its predecessors. One notable enhancement is the adoption of anchor-free detection, which eliminates the reliance on pre-defined anchor boxes for predicting object bounding boxes. This approach enhances performance efficiency by allowing for more flexible detection of objects with varying sizes and shapes. Additionally, YOLOv8 achieves high accuracy while maintaining fast inference speeds, rendering it applicable to a diverse array of object detection tasks.

As a single-stage detector, YOLOv8 predicts bounding boxes and class probabilities for an image in a single forward pass, with predictions made at each cell of a grid over the image. Following prediction, the model employs a non-maximum suppression (NMS) algorithm to remove redundant bounding boxes, ensuring that only the most confident predictions are kept. In the context of facial bone fracture identification, this streamlined process enables accurate detection based on annotated images from the dataset, leveraging convolutional and pooling layers within the model's architecture.

Activation functions significantly impact the model's performance. In the Ultralytics implementation, YOLOv8's convolutional blocks use SiLU (Swish) as the default activation. Compared with saturating functions such as Tanh, non-saturating activations such as ReLU, Leaky ReLU, and Swish mitigate the vanishing gradient issue, improving gradient flow and promoting efficient learning, which in turn supports precise identification of facial bone fractures.

Overall, YOLOv8's advancements in speed, accuracy, and adaptability, coupled with its streamlined architecture and effective use of activation functions, make it a powerful tool for facial bone fracture identification tasks, offering significant advantages over previous models.
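The non-maximum suppression step mentioned for YOLOv8 can be sketched in a few lines. This is a minimal greedy version for illustration, assuming boxes in (x1, y1, x2, y2) pixel format and an IoU threshold of 0.5; the box coordinates and scores are invented, and production implementations typically run the same procedure separately for each class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one region plus one separate detection.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.90, 0.75, 0.60]
kept = nms(boxes, scores)  # indices of the boxes that survive
```

Here the second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap either and is kept.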
IV. PROPOSED MODEL
1. Description:
The proposed model for fracture identification in facial bone X-rays integrates deep learning methodologies, specifically employing the YOLOv8 algorithm. YOLOv8 stands out as a cutting-edge object detection and image segmentation model known for its rapid processing, precision, and versatility. In this instance, the YOLOv8 algorithm will be trained on a dataset composed of facial bone X-ray images, categorized into three distinct fracture types: mandibular, maxillary, and nasal fractures. Through transfer learning, the pre-trained YOLOv8 model will be fine-tuned, aiming to enhance accuracy and efficiency in identifying facial bone fractures compared to conventional approaches. Furthermore, the proposed model will undergo comprehensive evaluation to gauge its proficiency in precisely detecting fractures within facial bone X-rays.
2. Proposed Architecture:
Feature Extraction: After preprocessing, feature extraction techniques are applied to capture relevant information from the X-ray images. This involves identifying key features such as bone structures, density variations, and fracture patterns using methods like edge detection, texture analysis, and morphological operations with labeling.
Model Creation: With the extracted features, a deep learning model is constructed to learn patterns and relationships within the data for fracture detection. The model architecture, such as YOLOv8, is selected based on factors like performance and efficiency. Subsequently, it undergoes training on a labeled dataset of medical images (X-rays), during which the proposed model learns to correlate extracted features with fracture labels and adjusts its parameters to improve the accuracy of its predictions.
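To make the fine-tuning setup more concrete, the sketch below generates the kind of dataset configuration file that YOLO-style training tools read: a root path, the three split folders, and the class names. The directory layout and root path are hypothetical; only the three class names come from the text above.

```python
# The three fracture classes named in the proposed model description.
CLASS_NAMES = ["mandibular", "maxillary", "nasal"]


def build_dataset_yaml(root, class_names):
    """Return the text of a YOLO-style dataset config.

    The images/train, images/val, and images/test layout is an assumed
    convention for illustration, not the authors' actual layout.
    """
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    # Class indices must match the indices used in the annotation files.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


config_text = build_dataset_yaml("datasets/facial_xray", CLASS_NAMES)
```

With the Ultralytics package installed, a file holding this text could then be passed to a pre-trained checkpoint for fine-tuning, for example `YOLO("yolov8n.pt").train(data="facial_xray.yaml", epochs=10)`, where the checkpoint size, file name, and epoch count are again assumptions.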
OUTPUTS
Prediction: Once the model completes its training, it is ready to generate predictions on new, unseen X-ray images. The trained model receives preprocessed X-ray images and provides predictions about fracture presence and location. These predictions are represented as bounding boxes or segmentation masks overlaid on the original images, providing valuable insights for healthcare professionals in diagnosing and treating facial bone fractures.
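Overlaying a predicted box on the original image requires pixel coordinates. The sketch below assumes the box arrives in the normalized center format used by YOLO annotation files (center x, center y, width, height, each in the range 0 to 1) and converts it to corner coordinates; the image size, class name, and confidence value are invented for illustration.

```python
def _clamp_to_int(value, upper):
    """Clamp value to [0, upper] and round to the nearest whole pixel."""
    return int(round(min(max(value, 0.0), upper)))


def yolo_to_pixel_box(cx, cy, w, h, img_width, img_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    x1 = (cx - w / 2) * img_width
    y1 = (cy - h / 2) * img_height
    x2 = (cx + w / 2) * img_width
    y2 = (cy + h / 2) * img_height
    return (
        _clamp_to_int(x1, img_width),
        _clamp_to_int(y1, img_height),
        _clamp_to_int(x2, img_width),
        _clamp_to_int(y2, img_height),
    )


def format_label(class_name, confidence):
    """Text drawn next to the box: fracture type plus prediction percentage."""
    return f"{class_name} {confidence:.0%}"


box = yolo_to_pixel_box(0.5, 0.5, 0.25, 0.5, img_width=640, img_height=480)
label = format_label("nasal", 0.87)
```

The resulting corner coordinates and label string are what a drawing routine would need to render the box and the fracture type with its prediction percentage.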
Precision – Recall Curve
Our model, built on the latest version of YOLOv8, achieves an average precision of 89.9%, which outperforms the existing models we compared against.
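For readers unfamiliar with how an average precision figure is derived from a precision-recall curve, the sketch below computes the area under the curve using all-point interpolation. This is one common convention and an assumption here; the exact interpolation behind the reported value is not specified, and the sample points are invented.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve with all-point interpolation.

    recalls must be sorted in increasing order, with precisions holding
    the precision measured at each of those recall levels.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Replace each precision with the maximum precision to its right,
    # which removes the zig-zag pattern of the raw curve.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall levels.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Invented curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
ap = average_precision([0.5, 1.0], [1.0, 0.5])
```

For this invented curve the area is 0.5 × 1.0 + 0.5 × 0.5 = 0.75. Mean average precision (mAP) is then the mean of such values over the fracture classes.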
GRAPHS:
We trained the model for 10 iterations and report the overall loss parameters and precision values from each iteration.
In the training phase, the model undergoes continuous evaluation through various loss and metric calculations. The training loss includes box loss, representing regression errors in bounding boxes; class loss, indicating errors in class predictions; and distribution focal loss (DFL), which refines bounding box regression by learning a distribution over box edge offsets. Additionally, metrics such as recall, mean average precision (mAP), and precision at various Intersection over Union (IoU) thresholds are estimated to assess the model's performance. Validation loss metrics, comprising class loss (classification), DFL loss (regression), and box loss (localization), are closely monitored to ensure the model's ability to generalize to unseen data. Furthermore, adjustments in learning rates for various parameter groups are implemented iteratively during the training process to enhance convergence and mitigate overfitting.
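Of the three losses, DFL is the least self-explanatory, so a small sketch may help. It treats the distance to one box edge as a distribution over integer bins and applies cross-entropy to the two bins that bracket the true distance, weighted by proximity. The bin count of 16 and the sample values are assumptions for illustration, and this covers a single edge only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distribution_focal_loss(logits, target):
    """DFL for one box edge.

    logits: scores over integer bins 0..len(logits)-1 for the edge offset.
    target: true continuous offset, within [0, len(logits) - 1].
    """
    probs = softmax(logits)
    left = min(int(math.floor(target)), len(logits) - 2)
    right = left + 1
    weight_left = right - target   # larger when the target is near the left bin
    weight_right = target - left   # larger when the target is near the right bin
    return -(weight_left * math.log(probs[left])
             + weight_right * math.log(probs[right]))


# A prediction with no preference among 16 bins, for a true offset of 3.3.
uniform_loss = distribution_focal_loss([0.0] * 16, 3.3)
```

A uniform prediction gives a loss of ln(16), roughly 2.77, while a prediction that concentrates its probability on the bins next to the true offset drives the loss toward zero.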
VI. CONCLUSION

The project signifies a notable progression in facial bone fracture detection by creatively merging deep learning with transfer learning techniques. By leveraging the YOLOv8 algorithm, the model demonstrates remarkable accuracy and efficiency in identifying mandibular, maxillary, and nasal fractures, addressing a critical gap in existing methodologies.

The precision percentages obtained for mandibular, maxillary, and nasal fractures further validate the model's effectiveness in fracture detection. Furthermore, the confidence curve, alongside an extensive analysis of loss metrics and performance indicators, furnishes valuable insights into the model's performance and learning behavior. In summary, the project's findings highlight its potential as a dependable and feasible solution for automated fracture detection in facial bone X-rays, with implications for enhancing detection precision and patient care in clinical settings.

REFERENCES

[1] Kandel I, Castelli M, Popovič A. Musculoskeletal Images Classification for Detection of Fractures Using Transfer Learning. J Imaging. 2020 Nov. DOI: 10.3390/jimaging6110127. PMID: 34460571; PMCID: PMC8321195.

[2] Rizwan Qureshi, Mohammad Gamal Ragab, Said Jaded Abdelkader. A Comprehensive Systematic Review of YOLO for Medical Object Detection (2018 to 2023). DOI: 10.36227/techrxiv.23681679.v1

[3] Rakesh, Y., & Akilandeswari, A. (2023). Evaluation of Bernsen algorithm for bone fracture detection based on edge detection. Journal of Survey in Fisheries Sciences, 10(1S), 2060-2068.

[4] Inui A, Mifune Y, Nishimoto H, Mukohara S, Fukuda S, Kato T, Furukawa T, Tanaka S, Kusunose M, Takigami S, et al. Detection of Elbow OCD in the Ultrasound Image by Artificial Intelligence Using YOLOv8. Applied Sciences. 2023. https://doi.org/10.3390/app13137623

[5] Moon, G., Kim, S., Kim, W., Kim, Y., Jeong, Y., & Choi, H.-S. (2022). Computer aided facial bone fracture diagnosis (CA-FBFD) system based on object detection model. Medical Imaging Review, 10(3), 123-135. DOI: 10.1109/ACCESS.2022.3192389

[6] Medaramatla, S. C., Samhitha, C. V., Pandey, S. D., & Vinta, S. R. (2024). Detection of Hand Bone Fractures in X-ray Images using Hybrid YOLO NAS. Medical Imaging Review, 15(2), 150-165. DOI: 10.1109/ACCESS.2024.3379760

[7] R. Lindsey, A. Daluiski, S. Chopra, A. Lachapelle, M. Mozer, S. Sicular, D. Hanel, M. Gardner, A. Gupta, R. Hotchkiss et al., “Deep neural network improves fracture detection by clinicians,” Proceedings of the National Academy of Sciences, vol. 115, no. 45, pp. 11591–11596, 2018.

[8] J. Olczak, F. Emilson, A. Razavian, T. Antonsson, A. Stark, and M. Gordon, “Ankle fracture classification using deep learning: automating detailed AO Foundation/Orthopedic Trauma Association (AO/OTA) 2018 malleolar fracture identification reaches a high degree of correct classification,” Acta Orthopaedica, pp. 1–7, 2020.

[9] N. Tajbakhsh, Y. Hu, J. Cao, X. Yan, Y. Xiao, Y. Lu, J. Liang, D. Terzopoulos, and X. Ding, “Surrogate supervision for medical image analysis: Effective deep learning from limited quantities of labeled data,” in IEEE Intl. Symposium on Biomedical Imaging. IEEE, 2019, pp

[10] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Intl. Conf. on Medical Image Computing and Computer-assisted Intervention. Springer, 2015, pp. 234–241.

[11] J. Prijs, Z. Liao, M.-S. To, J. W. Verjans, P. Jutte, V. Stirler, J. Olczak, M. Gordon, D. Guss, C. DiGiovanni, R. Jaarsma, F. IJpma, and J. Doornberg, “Development and external validation of automated detection, classification, and localization of ankle fractures: Inside the black box of a convolutional neural network (CNN),” Submitted work, 2021.

[12] S. Pereira, R. Meier, V. Alves, M. Reyes, and C. A. Silva, “Automatic brain tumor grading from MRI data using convolutional neural networks and quality assessment,” in Understanding and Interpreting Machine Learning in Medical Image Computing Applications. Springer, 2018, pp. 106–114.

[13] S. Soffer, A. Ben-Cohen, O. Shimon, M. M. Amitai, H. Greenspan, and E. Klang, “Convolutional neural networks for radiologic images: a radiologist’s guide,” Radiology, vol. 290, no. 3, pp. 590–606, 2019.

[14] E. Ozkaya, F. E. Topal, T. Bulut, M. Gursoy, M. Ozuysal, and Z. Karakaya, “Evaluation of an artificial intelligence system for diagnosing scaphoid fracture on direct radiography,” European Journal of Trauma and Emergency Surgery, pp. 1–8, 2020.
[15] D. W. Langerhuizen, A. E. J. Bulstra, S. J. Janssen, D.
Ring, G. M. Kerkhoffs, R. L. Jaarsma, and J. N.
Doornberg, “Is deep learning on par with human
observers for detection of radiographically visible and
occult fractures of the scaphoid?” Clinical Orthopaedics
and Related Research, 2020.
[16] Qi, Y., Zhao, J., Shi, Y., Zuo, G., Zhang, H., Long, Y., Wang, F., & Wang, W. (2020). Ground Truth Annotated Femoral X-Ray Image Dataset and Object Detection Based Method for Fracture Types Classification. IEEE, Volume 8, 2020. DOI: 10.1109/ACCESS.2020.3029039