Bone Fracture Detection Based on Faster R-CNN with Bi-Directional Feature Pyramid Module
Abstract—With the widespread application of deep learning technology in medical image processing, research on using convolutional neural networks for fracture detection has steadily increased. This paper proposes a Faster R-CNN network based on a bidirectional feature transfer mechanism (Bi-FPM), aiming to improve the accuracy of fracture detection. Through experimental comparison on the Kaggle Bone Fracture dataset, this study explores the performance of different feature pyramid structures in the Faster R-CNN framework. Using ResNext-101 as the backbone, the proposed Bi-FPM method improves accuracy in the fracture detection task by 5.8 percentage points over the baseline Faster R-CNN and by 4.1 percentage points over the traditional Feature Pyramid Network (FPN). Experimental results show that Bi-FPM's bidirectional feature transfer mechanism can effectively integrate deep and shallow features, improve the model's ability to identify fracture features, and achieve better performance in fracture detection tasks. Through testing on the Kaggle Bone Fracture dataset containing 350 test samples and 150 training samples, this paper confirms the effectiveness and superiority of the proposed method.

Keywords—Deep learning; bone fracture detection; feature pyramid; detection algorithm

I. INTRODUCTION
In the field of medical imaging, accurately detecting and diagnosing fractures is critical to patient treatment and recovery. In recent years, the application of deep learning technology in medical image analysis has made significant progress, and the region-based convolutional neural network (R-CNN) and its variants, such as Faster R-CNN, have become the mainstream methods for object detection tasks [1]. However, when processing medical images with complex textures and different sizes, especially in the fine-grained task of fracture detection, the performance of traditional Faster R-CNN is often limited by the effectiveness of feature extraction and the generalization ability of the model. In addition, although the existing Feature Pyramid Network (FPN) can extract multi-scale features, there is still room for improvement in feature fusion and information transfer. To solve these problems, this study proposes a Faster R-CNN model based on bidirectional feature transfer, aiming to improve the accuracy of fracture detection through a more efficient feature fusion mechanism [2].

In medical image analysis, especially fracture detection, traditional machine learning methods such as support vector machines (SVM) and random forests are recognized for their low computational complexity and reasonable performance on small-scale data sets, but their shortcomings in automatic feature extraction and in processing high-dimensional data are obvious compared with deep learning methods [3]. Deep learning methods, especially convolutional neural networks (CNN) and their derived structures such as U-Net, AlexNet, VGG, and ResNet, stand out for their powerful feature learning capabilities and high accuracy on large data sets, although these models usually require a large amount of annotated data and substantial computing resources [4]. Region-proposal-based detectors such as Faster R-CNN perform well at locating key regions in images but can have difficulty processing dense or small objects and are not fast enough for real-time use. The Feature Pyramid Network (FPN) and its variants effectively capture multi-scale features by combining different levels of semantic and detailed information. Although the model structure may be more complex, it shows advantages in detecting targets of different sizes. Therefore, when selecting an applicable algorithm, researchers must comprehensively consider the application scenario, data volume, computing resources, and performance requirements, as well as the algorithm's ability to provide interpretable medical diagnostic results [5].

In our study, we adopt a novel bidirectional feature transfer method to enhance feature extraction through a forward feature transfer pyramid and a backward feature transfer pyramid. The forward feature transfer pyramid transfers detailed information into layer-by-layer decreasing feature maps through downsampling and element-wise addition operations, while the backward feature transfer pyramid transfers contextual information into layer-by-layer increasing feature maps through upsampling. The combination of these two sets of feature maps provides Faster R-CNN with a rich feature representation to improve the accuracy of fracture detection. Finally, non-maximum suppression is used to refine the detected candidate regions to ensure high accuracy and efficiency of the detection results [6].

We selected the Kaggle Bone Fracture data set as the research object. This data set contains 150 training samples and
Authorized licensed use limited to: VIT University- Chennai Campus. Downloaded on August 21,2024 at 19:26:36 UTC from IEEE Xplore. Restrictions apply.
only improves the expressive power of features but also provides more comprehensive information for the detection algorithm, thereby effectively improving the performance of Faster R-CNN in the fracture detection task. Through this bidirectional feature transfer, the model can more accurately locate and identify fractures of different types and sizes, ensuring efficient and reliable detection results.

Fig. 1. Forward feature transfer pyramid.

In the Faster R-CNN model based on bidirectional feature transfer that we studied, the model architecture is carefully designed to enhance the accuracy of fracture detection. The architecture uses ResNext-101 as the backbone network, which is responsible for extracting the original features from images. Next come two pyramid structures, the forward feature transfer pyramid and the backward feature transfer pyramid, which are respectively responsible for downsampling low-level detailed information and merging it into high-level features, and for upsampling high-level semantic information and merging it into low-level features. The resulting hierarchical features ensure an effective representation at different scales. The Region Proposal Network (RPN) then locates potential fracture regions on these fused features and generates candidate detection boxes. Subsequently, ROI Pooling converts these candidate boxes into fixed-size feature maps for use by the classifier and the bounding box regressor, which classify fracture types and accurately locate fracture areas. Finally, non-maximum suppression (NMS) is used to select the best detection boxes and eliminate overlapping ones to form the final detection results, as shown in Figure 3 and Figure 4. This overall architecture optimizes the information flow, so that the model improves fracture detection performance while remaining efficient in processing speed.
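As a rough illustration of two steps in the architecture described above, the following minimal sketch applies bidirectional feature transfer to toy single-channel feature maps and then runs greedy non-maximum suppression on a few boxes. It is not the authors' implementation: the 2x2 average pooling used for downsampling, the nearest-neighbour upsampling, the element-wise sum that combines the forward and backward pyramids, the 0.5 IoU threshold, and all function names are assumptions made for illustration.

```python
# Illustrative sketch only: operators and names are assumptions, not the paper's code.

def downsample(fm):
    """Halve a 2D map (even height and width) with 2x2 average pooling."""
    return [[(fm[r][c] + fm[r][c + 1] + fm[r + 1][c] + fm[r + 1][c + 1]) / 4.0
             for c in range(0, len(fm[0]), 2)]
            for r in range(0, len(fm), 2)]


def upsample(fm):
    """Double a 2D map with nearest-neighbour repetition."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def add(a, b):
    """Element-wise addition of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bidirectional_fusion(levels):
    """Fuse backbone maps ordered fine to coarse, each half the size of the previous one."""
    # Forward pyramid: carry detail from fine maps down into coarser ones.
    forward = [levels[0]]
    for level in levels[1:]:
        forward.append(add(level, downsample(forward[-1])))
    # Backward pyramid: carry context from coarse maps up into finer ones.
    backward = [levels[-1]]
    for level in reversed(levels[:-1]):
        backward.insert(0, add(level, upsample(backward[0])))
    # Assumed combination: element-wise sum of the two pyramids at each scale.
    return [add(f, b) for f, b in zip(forward, backward)]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Three toy backbone levels: 4x4, 2x2 and 1x1 maps filled with ones.
    levels = [[[1.0] * 4 for _ in range(4)], [[1.0] * 2 for _ in range(2)], [[1.0]]]
    fused = bidirectional_fusion(levels)
    print([len(m) for m in fused])  # each fused level keeps its input resolution

    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    print(nms(boxes, [0.9, 0.8, 0.7]))  # indices of the boxes that survive
```

In a real detector the maps would be multi-channel tensors and the fused levels would feed the RPN; the sketch keeps only the data flow, so that each fused level receives both detail from finer levels and context from coarser ones.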
Fig. 2. Backward feature transfer pyramid.

V. EXPERIMENT
The experiment compared the performance of different feature pyramid structures combined with Faster R-CNN on the Kaggle fracture data set. The evaluation metrics are average precision (AP) at IoU = 0.5, reported as a percentage (AP_0.5), and frames processed per second (FPS). Whether ResNet-50, ResNet-101, or ResNext-101 is used as the backbone, the proposed bidirectional feature transfer pyramid (Bi-FPM) structure outperforms both the baseline Faster R-CNN model and the traditional Feature Pyramid Network (FPN).
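The two metrics named above can be illustrated with a minimal sketch. The source does not state which AP variant was used, so the all-point interpolated form common in PASCAL VOC-style evaluation is an assumption here, as are the function names, the single-class setting, and the data layout (detections as `(image_id, score, box)` tuples, ground truth as a dictionary of boxes per image).

```python
# Illustrative sketch only: AP variant, names and data layout are assumptions.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Single-class AP; a detection is a true positive if it claims an unmatched box."""
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    tp_flags = []
    # Visit detections from most to least confident.
    for img, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt_box in enumerate(ground_truth.get(img, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_threshold and not matched[img][best_j]:
            matched[img][best_j] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each detection in ranked order.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / total_gt if total_gt else 0.0)

    # All-point interpolation: precision becomes non-increasing in recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def frames_per_second(num_images, elapsed_seconds):
    """Throughput: images processed divided by wall-clock time."""
    return num_images / elapsed_seconds


if __name__ == "__main__":
    ground_truth = {"img1": [(0, 0, 10, 10)], "img2": [(5, 5, 15, 15)]}
    detections = [
        ("img1", 0.9, (1, 1, 10, 10)),    # overlaps the img1 box well
        ("img1", 0.8, (20, 20, 30, 30)),  # overlaps nothing
        ("img2", 0.7, (5, 5, 14, 15)),    # overlaps the img2 box well
    ]
    print(100.0 * average_precision(detections, ground_truth))  # AP_0.5 as a percentage
    print(frames_per_second(3, 10.0))
```

Multiplying the returned value by 100 gives the percentage form in which AP_0.5 is reported in the experiments.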
81.8% to 87.2%. For the ResNext-101 model, Bi-FPM achieves 88.4%, compared with 82.6% for the original model and 84.3% with FPN added, indicating the advantage of Bi-FPM in improving detection accuracy.

In all cases, FPS remains unchanged at 0.3, indicating that although accuracy has improved, detection speed has not been affected. This shows that the proposed Bi-FPM structure can significantly improve the accuracy of fracture detection without sacrificing speed. These results highlight the advantages of Bi-FPM in feature extraction, especially its ability to capture fracture features in images for effective detection.

Mathematical and algorithmic analysis of the experimental results reveals significant performance improvements of the Bidirectional Feature Transfer Pyramid (Bi-FPM) over the baseline Faster R-CNN and the Feature Pyramid Network (FPN). In terms of average precision (AP_0.5), Bi-FPM demonstrates higher detection accuracy. From an algorithmic point of view, this is because it combines features from different convolutional layers more effectively, through multi-scale analysis and feature cascading, to enhance the expressive power of the model. At the same time, the constant frame rate (FPS) in the experiment shows that the accuracy improvement has not reduced the processing speed of the model. In addition, the advantage in recognition capability of deeper network architectures such as ResNext-101 over ResNet-50 or ResNet-101 also reflects the importance of higher-level features in deep learning. Overall, these improvements reflect the algorithmic optimization achieved by Bi-FPM through effective feature fusion and better use of the feature space in the fracture detection task.

VI. CONCLUSION
In this study, we propose a Faster R-CNN framework based on bidirectional feature transfer, aiming to improve the accuracy of fracture detection. Through experimental verification on the Kaggle fracture data set, our model structure shows better performance than traditional Faster R-CNN and the Feature Pyramid Network (FPN). Experimental results show that our bidirectional feature transfer pyramid (Bi-FPM) not only achieves higher average precision (AP_0.5) on different backbone networks but also maintains processing speed (FPS), reflecting the model's balance between efficiency and effectiveness. In the future, we want to examine whether Bi-FPM can help other medical image analysis tasks in order to automate medical diagnosis.

ACKNOWLEDGMENT
The corresponding author is Zhihao Su. Funding was provided by Yuezhi Yang and Zhihao Su.

REFERENCES
[1] Su Z, Adam A, Nasrudin M F, et al. Skeletal fracture detection with deep learning: A comprehensive review[J]. Diagnostics, 2023, 13(20): 3245.
[2] Shi J, Ni L, Su Z. English-speaking teaching system with natural language processing technology based on artificial intelligence algorithms[C]//2022 10th International Conference on Information and Education Technology (ICIET). IEEE, 2022: 53-57.
[3] Ni L, Huang Q, Ye J, et al. Vocational education development model based on intelligent management system[C]//2022 11th International Conference on Educational and Information Technology (ICEIT). IEEE, 2022: 76-79.
[4] Ni L, Shi J, Han B, et al. Classroom Roll Call System Based on Face Detection Technology[C]//2022 10th International Conference on Information and Education Technology (ICIET). IEEE, 2022: 42-46.
[5] Jixian L, An G, Zhihao S, et al. Social Media Multimodal Information Analysis based on the BiLSTM-Attention-CNN-XGBoost Ensemble Neural Network[J]. International Journal of Advanced Computer Science and Applications, 2022, 13(12).
[6] Yang Q, Wu W, Yang Y, et al. The Improved Fully Convolutional Network applied in Segmentation and Detection for Pavement Crack[C]//Proceedings of the 2023 7th International Conference on Deep Learning Technologies. 2023: 21-26.
[7] Chang X, Su Z, Ni L, et al. Person Re-identification Based on Deep Learning Algorithms with Manual Extracted Colour and Texture Features[C]//2022 7th International Conference on Big Data Analytics (ICBDA). IEEE, 2022: 243-247.
[8] Liu C, Chang X, Cao Z, et al. Welding Seam Recognition Technology of Welding Robot Based on A Novel Multi-Path Neural Network Algorithm[C]//2022 14th International Conference on Machine Learning and Computing (ICMLC). 2022: 407-412.
[9] Ahmad W S H M W, Fauzi M F A, Hasan M J, et al. Multi-configuration analysis of densenet architecture for whole slide image scoring of er-ihc[J]. IEEE Access, 2023.
[10] Verma H C, Ahmed T, Rajan S, et al. Development of LR-PCA based fusion approach to detect the changes in mango fruit crop by using landsat 8 OLI images[J]. IEEE Access, 2022, 10: 85764-85776.
[11] Shahrani S, Zainal N F A, Abd Rahman R, et al. The development of a web-based infographic for the introduction of palm industry to young learners[J]. Journal of ICT in Education, 2023, 10(2): 155-167.
[12] Yang S, Yin B, Cao W, et al. Diagnostic accuracy of deep learning in orthopaedic fractures: a systematic review and meta-analysis[J]. Clinical Radiology, 2020, 75(9): 713.e17-713.e28.
[13] Kalmet P H S, Sanduleanu S, Primakov S, et al. Deep learning in fracture detection: a narrative review[J]. Acta orthopaedica, 2020, 91(2): 215-220.
[14] Olczak J, Fahlberg N, Maki A, et al. Artificial intelligence for analyzing orthopedic trauma radiographs: deep learning algorithms—are they on par with humans for diagnosing fractures?[J]. Acta orthopaedica, 2017, 88(6): 581-586.
[15] Ren M, Yi P H. Deep learning detection of subtle fractures using staged algorithms to mimic radiologist search pattern[J]. Skeletal Radiology, 2022: 1-9.
[16] Badgeley M A, Zech J R, Oakden-Rayner L, et al. Deep learning predicts hip fracture using confounding patient and healthcare variables[J]. NPJ digital medicine, 2019, 2(1): 31.
[17] Pranata Y D, Wang K C, Wang J C, et al. Deep learning and SURF for automated classification and detection of calcaneus fractures in CT images[J]. Computer methods and programs in biomedicine, 2019, 171: 27-37.
[18] Chen W, Liu X, Li K, et al. A deep-learning model for identifying fresh vertebral compression fractures on digital radiography[J]. European Radiology, 2022: 1-10.
[19] Guy S, Jacquet C, Tsenkoff D, et al. Deep learning for the radiographic diagnosis of proximal femur fractures: Limitations and programming issues[J]. Orthopaedics & Traumatology: Surgery & Research, 2021, 107(2): 102837.
[20] Gao Y, Soh N Y T, Liu N, et al. Application of a deep learning algorithm in the detection of hip fractures[J]. Iscience, 2023, 26(8).
[21] Girshick R. Fast r-cnn[C]//Proceedings of the IEEE international