
Proceedings of the 39th Chinese Control Conference

July 27-29, 2020, Shenyang, China

Research on FOD Detection for Airport Runway based on YOLOv3


Peng Li, Huajian Li
College of Automation, Harbin Engineering University, Harbin 150001, P. R. China
E-mail: [email protected], [email protected]

Abstract: Foreign object debris (FOD) on airport runways is a threat to aircraft taking off and landing. Accurate
detection of foreign object debris is important to ensure aircraft flight safety. In this paper, a detection algorithm for
foreign object debris based on YOLOv3 (You Only Look Once) is presented. The method employs a deep residual network to
extract features and multi-scale feature fusion to detect small-scale FOD. Sample datasets of foreign object debris are
established to validate the proposed method. The experiments show that the YOLOv3-based detection algorithm effectively
detects foreign object debris, with good accuracy and robustness.
Key Words: foreign object debris, object detection, multi-scale feature fusion, YOLOv3

1 Introduction
Metal pieces, debris, and other foreign object debris on
the airport runway are a threat to aircraft take-off and
landing [1]. Effective detection of foreign objects on the
airport runway is of great significance for their subsequent
removal. Researchers in China and abroad have mainly
used layered detection, traditional BP neural networks,
Support Vector Machines (SVM), and other feature
extraction and classification methods to detect foreign
object debris [2]. To address the feature extraction problem,
we adopt a feature extraction method combining the
YOLOv3 network with a deep residual network.
Experiments show that this method can effectively extract
features of foreign object debris. In addition, we adopt a
multi-scale feature fusion method to improve the detection
rate of small-scale foreign object debris. Comparative
simulation experiments show that the YOLOv3-based
foreign object debris detection used in this paper not only
improves the detection effect but also achieves higher
detection accuracy.
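As a minimal illustration of the residual feature-extraction idea adopted above (a plain-Python sketch with toy stand-in layers, not the paper's actual convolutional network), a residual block adds its input back onto the output of its stacked layers:

```python
# Sketch of a residual block: output H(x) = F(x) + x, where F is the
# stacked nonlinear branch and x passes through an identity shortcut.
# The "layers" here are toy elementwise functions standing in for convolutions.

def residual_block(x, layers):
    """Apply the residual branch F to x, then add the identity shortcut."""
    fx = x
    for layer in layers:
        fx = [layer(v) for v in fx]
    return [f + v for f, v in zip(fx, x)]  # H(x) = F(x) + x

# Toy branch F(x) = 0.5*x + 1 built from two stacked "layers"
branch = [lambda v: 0.5 * v, lambda v: v + 1.0]

h = residual_block([1.0, 2.0, 3.0], branch)
# H(x) = 1.5*x + 1 elementwise -> [2.5, 4.0, 5.5]
```

Because the shortcut carries the input through unchanged, the branch only has to learn the residual F(x), which Sec. 2.1 exploits to deepen the feature extractor.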
2 FOD Detection Based on YOLOv3
2.1 Darknet-53 and Deep Residual Learning
The YOLOv3 algorithm is an end-to-end, one-stage
object detection method. The backbone network of
YOLOv3 is Darknet-53 [3], as shown in Fig. 1. Compared
with Darknet-19 in YOLOv2, Darknet-53 has no pooling
layers; instead, convolutions with stride 2 are used to halve
the length and width of the feature map. Darknet-53 also
has no fully connected layer, which effectively reduces the
number of parameters and reduces the complexity and time
of model training.

Fig. 1: Structure of Darknet-53

Inspired by deep residual learning [4], a residual network
is added to the YOLOv3 network in this paper, which
effectively increases the depth of the network and its ability
to extract foreign object debris features. The residual
network consists of a series of residual blocks, and each
residual block contains two branches: an identity mapping
and a residual branch. As shown in Fig. 2, denote the
desired underlying mapping as

H(x) = F(x) + x    (1)

and let the stacked nonlinear layers fit another mapping

F(x) = H(x) − x    (2)

It is easier to optimize the residual mapping than to
optimize the original mapping [4].

Fig. 2: Structure of Residual Blocks

In Fig. 1, the leftmost numbers 1, 2, 8, 8, and 4 represent
the numbers of residual blocks in each stage. Adding the
deep residual network effectively increases the depth of the
network, giving it strong feature extraction capability.

2.2 Multi-scale Feature Fusion
In this paper, FPN (feature pyramid network) [5] is
added to the YOLOv3 network to perform multi-scale
feature fusion. Regression is performed directly on feature
maps of three scales to predict the locations and classes of
foreign object debris. As shown in Fig. 3, if the input image
is 416×416 pixels, the output feature maps of the three
scales are 13×13, 26×26, and 52×52; the 26×26 and 52×52
maps are obtained by upsampling at twice the size. To
enhance the representation capability of the feature
pyramid, each up-sampled feature map is concatenated with
the corresponding feature map of the deep residual network
from the previous section. Prediction is then performed
independently on the feature maps of the three scales. In
the prediction phase, the input picture is divided into S×S
grids, and three prediction boxes of different sizes are
predicted for each grid. Each prediction box contains four
coordinates and one confidence value, so the output tensor
is S×S×[3×(4+1+C)], where C is the number of classes.
The multi-scale feature fusion method effectively improves
the detection of small-scale targets.

Fig. 3: Structure of YOLOv3 (Darknet-53 backbone with residual stages ×1, ×2, ×8, ×8, ×4 on a 416×416 input; detection outputs at 13×13, 26×26, and 52×52, the latter two formed by 2× upsampling and concatenation)
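The grid and output-tensor bookkeeping described in Sec. 2.2 can be checked with a short sketch (plain Python; the class count 19 is taken from the paper's FOD dataset and the 3 boxes per cell from the text):

```python
# Per-scale output shape: S x S x [3 * (4 + 1 + C)]
# (3 boxes per grid cell; each box = 4 coordinates + 1 confidence + C classes)

def output_shape(s, num_classes, boxes_per_cell=3):
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return (s, s, depth)

C = 19  # classes in the paper's FOD dataset
shapes = [output_shape(s, C) for s in (13, 26, 52)]
# depth = 3 * (4 + 1 + 19) = 72 at every scale

# Total predicted boxes per image across the three scales:
total_boxes = sum(s * s * 3 for s in (13, 26, 52))
# (169 + 676 + 2704) * 3 = 10647
```

The per-image box count computed here is the figure quoted in Sec. 2.3.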

2.3 Prediction of FOD Bounding Boxes

To efficiently predict the bounding boxes of objects with
different scales and aspect ratios, Faster R-CNN [6] uses
anchor boxes as references at multiple scales, instead of the
traditional image pyramid and feature pyramid, which
reduces the complexity of model training and improves the
running speed. YOLOv2, YOLOv3, and SSD [7] all
adopted the anchor mechanism and achieved significant
results. As mentioned before, the feature map is divided
into S×S grids. Three anchor boxes are predicted in each
grid, so each input picture yields 10647 prediction boxes,
which can significantly improve the accuracy of foreign
object debris localization. In this paper, k-means clustering
is used to select nine prior boxes. The distance metric is as
follows:

d(box, centroid) = 1 − IOU(box, centroid)    (3)

Selecting an appropriate k can achieve a good balance
between model complexity and recall; in this paper, we
choose k = 9.

YOLOv3 obtains the bounding box of a foreign object by
directly predicting the coordinate position relative to the
grid. Each bounding box is predicted from four coordinates
tx, ty, tw, th. The predictions correspond to:

bx = σ(tx) + cx    (4)
by = σ(ty) + cy    (5)
bw = pw e^tw    (6)
bh = ph e^th    (7)

Among them, cx and cy represent the horizontal and
vertical distances between the grid cell and the upper left
corner of the image, and pw and ph represent the width and
height of the prior box.
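The box decoding of Eqs. (4)-(7) can be sketched as a stand-alone function (variable names follow the equations; the grid offsets and prior size in the example are illustrative):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Map raw predictions (tx, ty, tw, th) to a box, per Eqs. (4)-(7)."""
    bx = sigmoid(tx) + cx   # Eq. (4): x offset within the grid cell
    by = sigmoid(ty) + cy   # Eq. (5): y offset within the grid cell
    bw = pw * math.exp(tw)  # Eq. (6): scale the prior width
    bh = ph * math.exp(th)  # Eq. (7): scale the prior height
    return bx, by, bw, bh

# Zero predictions from cell (3, 4) with a 16x30 prior:
# sigmoid(0) = 0.5 and exp(0) = 1, so the box sits at the cell
# center with exactly the prior's size.
box = decode_box(0.0, 0.0, 0.0, 0.0, 3, 4, 16, 30)
# -> (3.5, 4.5, 16.0, 30.0)
```

The sigmoid keeps the predicted center inside its grid cell, while the exponential lets the prior width and height be scaled by any positive factor.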
2.4 FOD Detection Process Based on YOLOv3

The YOLOv3-based FOD detection process is shown in
Fig. 4. The video signal is converted into images. The
input image is normalized to the scale required by the
model and then divided into S×S grids. If the center of a
target object falls in a grid, that grid is responsible for
detecting the target. The detection process predicts both
the target bounding boxes and the target classes: with C
target classes in the image, each grid cell predicts B
bounding boxes and their confidence scores.

Fig. 4: FOD detection process with YOLOv3 (input pictures → normalization → dividing the image into S×S grids → predicting bounding boxes and classes → confidence score → threshold comparison → Non-Maximum Suppression or discard → output of predicted bounding boxes and classes)

Each predicted target bounding box contains five
parameters (x, y, w, h, score). Here score is the confidence
of the predicted bounding box, which reflects how
confident the model is that the box contains an object and
how accurate it thinks the predicted box is [3]. The
confidence score is defined as

score = P(object) × IOU    (8)

Among them, P(object) indicates the probability that a
target is present in the predicted bounding box: if no object
exists in the cell, P(object) = 0; otherwise P(object) = 1.
IOU is the intersection over union between the predicted
bounding box and the ground truth, reflecting the accuracy
of the predicted box. The score is then compared with a
specified threshold (generally set to 0.5). If it is greater
than the threshold, the corresponding predicted bounding
box is retained; otherwise it is discarded. Finally,
Non-Maximum Suppression (NMS) is used to filter the
remaining predicted bounding boxes, and the network
outputs the final bounding boxes and their classes.

3 Training

3.1 Dataset

A picture dataset of foreign object debris on airport
runways is established. The dataset has nineteen classes of
objects, such as screws, plastic bottles, screwdrivers, and
cartons, with about 100 images per class, for 2000 original
pictures in total. We use LabelImg to manually label the
dataset in the PASCAL VOC format. After an image is
labeled, an .xml file with the same file name is generated,
recording the position and class information of the label
boxes. To reduce over-fitting of the model, the dataset is
expanded with data augmentation: the pictures are
horizontally flipped, vertically flipped, rotated, corrupted
with Gaussian noise, adjusted in contrast and brightness,
and blurred. The final dataset consists of 40,000 samples.

3.2 Pre-training

The FOD detection model is pre-trained on the public
dataset PASCAL VOC2007. The initial learning rate is set
to 0.001, reduced to 0.0001 after 1,000 iterations, 0.00005
after 5,000 iterations, and 0.00001 after 10,000 iterations.
After 100,000 training iterations, the trained model and
weights were evaluated on the test set. The mean average
precision (mAP) is 79.3%, indicating good pre-training.

3.3 Training on FOD Dataset

Using transfer learning [8], the weight parameters
obtained in Sec. 3.2 are used as initial values, and
fine-tuning is performed on this basis. Momentum is used
for learning optimization; its advantage is that it can speed
up learning for small, continuous gradients containing a lot
of noise [9].

The input image is uniformly scaled to 416×416.
Through the clustering method, the nine anchor boxes are
set to 10×13, 16×30, 33×23, 30×61, 62×45, 59×119,
116×90, 156×198, and 373×326. The learning rate is set as
in Sec. 3.2. The loss function consists of three parts: the
x, y coordinate error and the class error are calculated with
the binary cross-entropy loss, and the w, h error is
calculated with the smooth L1 loss. The total loss is the
sum of the three. The curve of the total loss against the
number of iterations is shown in Fig. 5.

Fig. 5: Loss curve
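The three-part loss described in Sec. 3.3 can be sketched per element (a simplified scalar version under the standard smooth L1 definition; the actual training loss sums these terms over all grid cells and boxes):

```python
import math

def bce(p, y):
    """Binary cross-entropy, used for the x, y coordinate error and class error."""
    eps = 1e-12  # guard against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def smooth_l1(pred, target):
    """Smooth L1, used for the w, h error: quadratic near 0, linear beyond |d| = 1."""
    d = abs(pred - target)
    return 0.5 * d * d if d < 1.0 else d - 0.5

def total_loss(xy_pred, xy_true, cls_pred, cls_true, wh_pred, wh_true):
    """Total loss = x,y coordinate error + class error + w,h error."""
    l_xy = sum(bce(p, y) for p, y in zip(xy_pred, xy_true))
    l_cls = sum(bce(p, y) for p, y in zip(cls_pred, cls_true))
    l_wh = sum(smooth_l1(p, t) for p, t in zip(wh_pred, wh_true))
    return l_xy + l_cls + l_wh
```

The smooth L1 term keeps width/height gradients bounded for large errors while staying smooth near zero, which suits the exponential parameterization of Eqs. (6)-(7).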
4 Experiments

The training and testing experiments in this paper were
carried out on a computer with an Intel i5-9600K CPU, an
NVIDIA GTX 2080 GPU (8 GB memory), 16 GB of RAM,
and the Ubuntu 18.04 LTS operating system. The software
is written in Python on the PyTorch deep learning
framework and uses CUDA (compute unified device
architecture) to accelerate operations.

4.1 Experimental Results of Detection

The trained model and parameters are tested on the FOD
dataset. Partial detection results are shown in Fig. 6, which
shows that the detection method proposed in this paper can
effectively determine the coordinates and classes of foreign
object debris.

Fig. 6: FOD detection with YOLOv3

4.2 Comparative Experiments

A comparative performance experiment is also
performed against SSD300 and Faster R-CNN on the same
training and test datasets. mAP and recall are used as the
evaluation criteria for the detection effect. The comparative
results are shown in Table 1. Under the same hardware and
software environment, the YOLOv3-based FOD detection
designed in this paper achieves detection accuracy 8.6%
and 11.9% higher than SSD300 and Faster R-CNN,
respectively, and detection speed 33.3% and 255% higher,
respectively.

Table 1: Performance of different algorithms

Algorithm      | mAP (%) | recall (%) | FPS
SSD300         | 83.6    | 85.3       | 24
Faster R-CNN   | 80.1    | 82.3       | 9
This paper     | 92.2    | 93.1       | 32

5 Conclusion

In this paper, a method of foreign object debris detection
based on YOLOv3 is presented. The method combines a
deep residual network with multi-scale feature fusion,
which enhances the representation ability of the feature
maps and the detection accuracy. The deep convolutional
network is constructed and tested on the airport runway
foreign object sample dataset. The experimental results
show that the method can effectively detect foreign object
debris in images, with detection accuracy higher than that
of SSD300 and Faster R-CNN.

References
[1] B. Wang, Z. Lan, A Hierarchical Foreign Object Debris Detection Method Using Millimeter Wave Radar, Journal of Electronics & Information Technology, 40(11): 2676-2683, 2018.
[2] B. Niu, H. Gu, Research of FOD Recognition Based on Gabor Wavelets and SVM Classification, Journal of Information and Computational Science, 10(6): 1633-1640, 2013.
[3] J. Redmon, A. Farhadi, YOLOv3: An Incremental Improvement, arXiv:1804.02767v1, 2018.
[4] K. He, X. Zhang, S. Ren, Deep Residual Learning for Image Recognition, IEEE Conference on Computer Vision and Pattern Recognition, 770-778, 2016.
[5] T.-Y. Lin, P. Dollár, R. Girshick, Feature Pyramid Networks for Object Detection, IEEE Conference on Computer Vision and Pattern Recognition, 936-944, 2017.
[6] S. Ren, K. He, R. Girshick, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, IEEE Transactions on Pattern Analysis & Machine Intelligence, 39(6): 1137-1149, 2015.
[7] W. Liu, D. Anguelov, D. Erhan, et al., SSD: Single Shot MultiBox Detector, European Conference on Computer Vision (ECCV), Springer, 2016: 9905.
[8] E. Tzeng, J. Hoffman, K. Saenko, Adversarial Discriminative Domain Adaptation, IEEE Conference on Computer Vision and Pattern Recognition, 2962-2971, 2017.
[9] J. Xia, K. Qian, Fast Planar Grasp Pose Detection for Robot Based on Cascaded Deep Convolutional Neural Networks, ROBOT, 40(6): 794-802, 2018.
