0% found this document useful (0 votes)

12 views

Image-Based Pothole Detection Using Multi-Scale Feature Network and Risk Assessment

The study presents the development of the SPFPN-YOLOv4 tiny algorithm for real-time pothole detection and risk assessment using 2D images. By combining spatial pyramid pooling and feature pyramid networks with YOLOv4 tiny, the algorithm achieved a 2-5% improvement in detection accuracy compared to previous YOLO models. Additionally, it introduces a risk assessment method that classifies pothole severity based on size, enhancing road safety measures.

Uploaded by

vinita2025r

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

12 views

Image-Based Pothole Detection Using Multi-Scale Feature Network and Risk Assessment

Uploaded by

vinita2025r

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 22

electronics

Article
Image-Based Pothole Detection Using Multi-Scale Feature
Network and Risk Assessment
Dong-Hoe Heo 1 , Ji-Yoon Choi 1 , Sang-Baeg Kim 2 , Tae-Oh Tak 1, * and Sheng-Peng Zhang 1, *

1 Department of Mechanical and Biomedical Engineering, Kangwon National University,

Chuncheon City 24341, Gangwon-do, Republic of Korea
2 Kai Networks Corporation, Suwon City 16463, Gyoenggi-do, Republic of Korea
* Correspondence: [email protected] (T.-O.T.); [email protected] (S.-P.Z.)

Abstract: Potholes on road surfaces pose a serious hazard to vehicles and passengers due to the
difficulty detecting them and the short response time. Therefore, many government agencies are
applying various pothole-detection algorithms for road maintenance. However, current methods
based on object detection are unclear in terms of real-time detection when using low-spec hardware
systems. In this study, the SPFPN-YOLOv4 tiny was developed by combining spatial pyramid pooling
and feature pyramid network with CSPDarknet53-tiny. A total of 2665 datasets were obtained via
data augmentation, such as gamma regulation, horizontal flip, and scaling to compensate for the lack
of data, and were divided into training, validation, and test of 70%, 20%, and 10% ratios, respectively.
As a result of the comparison of YOLOv2, YOLOv3, YOLOv4 tiny, and SPFPN-YOLOv4 tiny, the
SPFPN-YOLOv4 tiny showed approximately 2–5% performance improvement in the mean average
precision (intersection over union = 0.5). In addition, the risk assessment based on the proposed
SPFPN-YOLOv4 tiny was calculated by comparing the tire contact patch size with pothole size by
applying the pinhole camera and distance estimation equation. In conclusion, we developed an end-
to-end algorithm that can detect potholes and classify the risks in real-time using 2D pothole images.

Keywords: pothole; 2D object detection; convolutional neural network; multi-scale detection;

risk assessment

Citation: Heo, D.-H.; Choi, J.-Y.;

Kim, S.-B.; Tak, T.-O.; Zhang, S.-P.
Image-Based Pothole Detection Using
1. Introduction
Multi-Scale Feature Network and 1.1. Problem Definition and Motivation
Risk Assessment. Electronics 2023, 12, According to a press release by the Ministry of Road Transport & Highways—“Road
826. https://doi.org/10.3390/ accidents in India—2021”—[1], a total of 7189 pothole-related accidents, with 2952 deaths
electronics12040826 and 6167 injuries, occurred from 2020 to 2021, posing a significant threat to the road users.
Academic Editor: Chunjie Zhang Furthermore, the cost that the governments must cover for the loss of human lives and road
accidents resulting from potholes is rapidly increasing [2]. Recently, potholes have been
Received: 31 December 2022 increasing due to the more severe weather brought on by global warming [3]. Accordingly,
Revised: 28 January 2023
government agencies in each country ought to maintain road surfaces in good condition to
Accepted: 3 February 2023
guarantee health and safety; hence the timely detection of asphalt cracks and potholes is
Published: 6 February 2023
essential. In particular, various pothole detection methods have been introduced as part
of road data collection for road surface maintenance. Methods for detecting potholes [4]
include 3D reconstruction, vibration, and 2D image-based pothole detection.
Copyright: © 2023 by the authors.
The 3D image reconstruction method can analyze the depth and width of the pothole
Licensee MDPI, Basel, Switzerland. by converting the road surface into a three-dimensional image using stereo vision-based
This article is an open access article camera and laser scanner [5,6]. However, developing a 3D sensor system is expensive and
distributed under the terms and very sensitive to vibrations transmitted from road surfaces to vehicles, resulting in low
conditions of the Creative Commons accuracy [7].
Attribution (CC BY) license (https:// On the other hand, the vibration sensor-based method can estimate the depth of
creativecommons.org/licenses/by/ a pothole and detect its position by sensing its impact or motion using an acceleration
4.0/). sensor [8]. However, this method cannot predict potholes in advance because it detects

Electronics 2023, 12, 826. https://doi.org/10.3390/electronics12040826 https://www.mdpi.com/journal/electronics

Electronics 2023, 12, 826 2 of 22

potholes only when the vehicle passes directly through the road surface. In addition,
potholes cannot be distinguished when they pass through manholes or speed bumps.
The 2D Image-based detection method recognizes the textures and characteristics of
potholes based on images of the road surface acquired using a camera [9]. This method
helps prevent accidents in advance and facilitates real-time detection because potholes can
be identified from a relatively long distance. In addition, it can be used with a camera sensor,
which is much less expensive than the laser scanner used in the 3D method. Therefore,
various 2D image-based detection methods that can operate with high efficiency and low
cost have been developed [10].
Moreover, with the rapid development of deep learning technology, convolutional
neural networks (CNNs) that can automatically learn object features from images have
been introduced. The object-detection method classifies the types of objects to be detected
and draws a square-shaped bounding box around the classified objects to indicate their
locations. Among the various object detection techniques, the You Only Look at Once
(YOLO) algorithm has the fastest computational speed among existing deep-learning-based
object detection techniques, and enables real-time detection. However, real-time object
detection cannot be guaranteed in embedded systems with low-performance GPUs and
CPUs, even if the computation speed of YOLO is fast [11]. Therefore, the YOLO-tiny
series algorithm, advantageous for real-time detection in embedded systems, has been
widely applied to pothole detection [12]. However, YOLO-tiny series algorithms have
lower accuracy than general YOLO algorithms because the network size is reduced for
computation speed.
Not only that the importance of pothole detection, but it is also necessary to assess
the risk for potholes of various sizes and shapes because potholes force different impacts
on tires and suspensions depending on their size [13]. Therefore, many researchers and
experts have recently conducted studies to evaluate the risk of potholes based on their
size and volume [14]. Typical techniques for assessing the severity of potholes include
size-based methods [15], and deep learning-based methods [16]. The size-based pothole
risk assessment method evaluates risk by comparing depth, volume, area, and size with
predefined classification criteria. However, most risk classification conditions used to
compare sizes are highly subjective depending on the country, experts, and even researchers.
Deep learning-based pothole risk assessment methods classify risk into low, medium, and
high stages using data acquired by sensors [17]. However, this method is hard to use in a
real-time application, and it is even more challenging to know the risk classification criteria
trained by the deep neural network.

1.2. Contributions
To address the above problem, we developed the SPFPN-YOLOv4 tiny algorithm,
which has a higher accuracy than the general YOLO algorithm, and enables real-time
detection by improving the existing YOLOv4 tiny. Spatial pyramid pooling (SPP) and
feature pyramid network (FPN) are added to CSPDarknet53-tiny, which is the backbone
of YOLOv4 tiny. In the stage of detection performance comparison, SPFPN-YOLOv4
tiny achieved the highest performance in the mean average precision (mAP) (intersection
over union (IoU = 0.5)), F1-score, and frames per second (FPS). Additionally, previous
studies have reported that it is crucial not only to detect potholes, but also to judge their
severity according to their size, as potholes have a negative impact on suspension, tire
damage, and drivers depending on their size [13]. Gupta et al. [18] conducted a deep
neural network study using thermal image images and proposed the need for image-based
severity classification as an agenda for future research. In addition, previously studied
pothole detection algorithms only focus on detecting potholes on the road surface and
cannot distinguish large potholes potentially affecting traffic accidents. To overcome the
limitations of existing detection algorithms, we present a novel risk assessment algorithm
that can be practically adapted to drivers and passengers with visualization signals. This
risk assessment algorithm classifies risk into three classes of severity: danger, safety, and
Electronics 2023, 12, 826 3 of 22

caution, by comparing the tire footprint size grounded on the road with the actual pothole
size calculated using the pinhole camera and distance equations. The key contributions of
this study are as follows:
1. Assuming a low-spec hardware system, we developed a lightweight pothole detection
algorithm capable of real-time detection without a GPU.
2. By applying data augmentation techniques, such as gamma regulation, horizontal
flip, and scaling, we compensate for the deficient dataset and increase the adaptation
of deep learning models in various environments (i.e., overexposed, underexposed,
unpaved, asphalt pavement)
3. We proposed a new risk assessment criterion to compare the size of potholes with the
tire contact patch size to identify large potholes that are likely to affect traffic accidents.
The remainder of this paper is organized as follows: Section 2 presents a study of
detection algorithms and risk assessment associated with potholes. Section 3 presents the
data preprocessing, SPFPN-YOLOv4 tiny, and the risk assessment algorithm applied in this
study. Section 4 presents the results of the object detection performance comparison and
the validation of the risk assessment algorithm. Section 5 presents conclusions, limitations,
and future research objectives.

2. Related Work
2.1. Pothole Object Detection
Early 2D pothole detection techniques were machine-learning-based classification
algorithms. A machine learning-based pothole detection algorithm distinguished the road
surface into defective and non-defective areas by setting a feature classifier subjectively
defined by the user [19]. These shape-based feature classifiers often detect the shaded part of
the road surface as a pothole. Powell and Sateshjumar [20] detected potholes by considering
its shape to be approximately elliptical using a machine learning-based pothole detection
algorithm combined with shadow removal algorithms. However, machine-learning-based
pothole detection algorithms can only classify potholes from images and cannot display
their locations.
As deep learning technology has developed rapidly, these machine-learning-based
limitations have been overcome with the development of object detection techniques in
computer vision. These object detection techniques are classified mainly into two-stage
detectors and one-stage detectors. The two-stage detector extracts numerous candidates in
the area where the object exists and sequentially processes the classification and localization
processes; thus, the accuracy is high, but the computation speed is slow. However, as the
one-stage detector processes the classification and localization processes simultaneously, the
accuracy is low, but the computation speed is fast. YOLO has the fastest computation speed
among existing one-stage detector-type object detection models, and most researchers are
currently developing YOLO-based 2D pothole detection algorithms.
Ukhwah et al. attached a camera for road surface detection to the rear of a vehicle
to capture road surface images in East Java Province [21] and trained YOLOv3 [22],
YOLOv3 tiny [23], and YOLOv3 SPP [24] using selected 448 images. As a result of
training various YOLO models, each model achieved a mAP of 83.43%, 79.33%, and
88.93%, respectively. Through this result, the author argued that the YOLOv3 SPP com-
bined with SPP achieved the highest accuracy. Park et al. [25] split a 665 pothole image
dataset into training, validation, and test datasets by selecting 70%, 20%, and 10% of
the total sample, respectively, then trained 598 pothole image datasets to YOLOv4 [26],
YOLOv4 tiny [27], and YOLO v5s [28] models. As a result of the performance eval-
uation of the model, YOLOv4, YOLOv4 tiny, and YOLOv5s achieved 77.7%, 78.7%,
and 74.8% mAP(IoU = 0.5), respectively. In conclusion, the author noted that YOLOv4
tiny achieved the highest accuracy, and there is a need to extend the backbone network
of YOLOv4 tiny to improve accuracy. Bucko et al. [2] trained 1052 images collected
differently under clear, rainy, sunset, evening, and night weather conditions. Especially
Electronics 2023, 12, 826 4 of 22

under clear weather conditions, YOLOv3, YOLOv3 SPP, and Sparse R-CNN achieved
77%, 79.1%, and 72.6% mAP (IoU = 0.5), respectively. Dharnesshkar et al. [29] trained a
dataset of 1500 Indian road images collected from Coimbatore, Idukki, and Kymily on the
YOLOv2, YOLOv3, and YOLOv3 tiny models, achieving mAP(IoU = 0.5) 45.33%, 38.41%,
and 49.71%, respectively. Asad et al. [30] trained 665 pothole datasets on the YOLOv4
tiny, YOLOv4, and YOLOv5 models, and verified the real-time detection possibility in a
low-end embedded system. An OAK-D camera with a single main board (Raspberry Pi)
was used to detect potholes. One of the limitations was that only a YOLOv4 tiny model
achieved real-time detection of 31.76 FPS among various YOLO models.
However, although the existing computation speed of the YOLOv4 tiny model is fast,
this model needs to extract richer object features to be detected. Therefore, in this study,
we attempted to improve the accuracy by combining multi-scale feature networks, such
as SPP and FPN, with the YOLOv4 tiny architecture to extract rich object features. The
multi-scale feature networks are used to improve the detection performance. SPP extracts
pothole features of various sizes by applying multiple max-pool filters to compensate for the
spatial information loss of the image that occurs during max-pooling, which is a part of the
convolutional neural network process [31]. The FPN compensates for spatial information
loss by combining high-level feature maps generated in deep convolutional layers with
low-level features generated in shallow convolutional layers [32]. This mechanism of
combining various feature maps makes greater use of the spatial information of the image
by extracting the rich features of the object, thereby improving the overall performance of
the deep learning model. Table 1 summarizes selected studies on pothole detection and
shows their contribution, models, hardware systems, FPS, and mAP (IoU = 0.5).

Table 1. A summary of recent road pothole detection algorithm performance.

References
Contribution Model CPU GPU FPS mAP([email protected])
& Year
YOLOv3 83.43%
Ukhwah et al. [21], Intel ® Xeon (R) Tesla T4
Detection of potholes and area estimation YOLOv3 tiny 0.4 79.33%
2019 [email protected] (13 GB)
YOLOv3 SPP 88.93%
YOLOv4 77.7%
Park et al. [25], Tesla K80
Comparison with various YOLO models YOLOv4 tiny − − 78.7%
2021 (12 GB)
YOLOv5s 74.8%
YOLOv2 3.20 81.21%
YOLOv3 2.39 83.60%
Asad et al. [30], Detection of pothole using ARM Cortex-
YOLOv4 − 1.98 85.48%
2022 Raspberry Pi 4 [email protected]
YOLOv4 tiny 31.76 80.04%
YOLOv5 18.25 95.00%
Sparse R-CNN − 72.6%
Bucko et al. [2], Detection potholes under adverse
YOLOv3 − − 28.57 77.1%
2022 weather condition
YOLOv3-SPP 27.78 79.1%
YOLOv2 45.33%
Dharnesshkar et al. [29],
Detection potholes of the India road YOLOv3 − GeForce GTX 1060 − 38.41%
2020
YOLOv3 tiny 49.71%

2.2. Risk Assessment Methods

Due to potholes impacting vehicles differently depending on their size, government
agencies in the U.S. and China have set standards to classify the risk of potholes according
to their depth and area using acceleration sensors, stereo vision cameras, and laser sensors.
China’s Highway Performance Assessment Standard (HPAS) states that if a pothole area is less
than 0.1 m2 , the risk is low, and if the area is greater than 0.1 m2 , the risk is high. The Long-term
Pavement Performance Program (LTTP) in the United States defines pothole depth as low-risk
when the depth is less than 25 mm, medium-risk when the depth is greater than 25 mm and
less than 50 mm, and high-risk when the depth is greater than 50 mm. According to a press
release by the Royal Automobile Club (RAC) Foundation [33], Local Highway Authorities
(LHA) in the United Kingdom applies a “risk-based” road maintenance method that evaluates
the risk depending on the width and depth of the pothole. Solanke et al. [14] distributed
questionnaires to road-related experts to classify risk levels to determine the severity of
Electronics 2023, 12, 826 5 of 22

potholes for road surface maintenance. However, this type of pothole risk assessment method
is subjective and wastes human resources when measuring the pothole size. In addition, this
is not practical because the size cannot be measured when the vehicle is driving at speed.
Table 2 summarizes the risk classification criteria and conditions of potholes identified in the
LTTP (United States) and HPAS (China) [5].

Table 2. Pothole risk classification specified in LTTP and HPAS.

Property Degree of Risk LTTP (USA) HPAS (China)

Low Below 25 mm
Depth Between 25 and −
Moderate
50 mm
High Greater than 50 mm
Low Below 0.1 m2
Area −
High Greater than 0.1 m2

According to the literature study above, the pothole size significantly correlates with
severity of the road. If the pothole detection algorithm is applied to risk assessment, the
size of the pothole can be measured in real-time, which can be used for road maintenance
and quickly recognized in driving situations. In Korea, no criteria exist for evaluating
pothole risk, but a few studies have been conducted. Chung-Hyun et al. [34] attached
an acceleration sensor to a vehicle. They presented a risk index divided into four stages:
Caution,
Electronics 2023, 12, x FOR PEERAlert,
REVIEWWarning, and Danger, according to the Z-axis acceleration value6generated of 24

when passing through the pothole. However, this method evaluates the degree of risk
based on a sensor that detects vibrations and can only detect potential hazards when the
3. Methods
vehicle directly passes through the road surface with the potholes. Therefore, this method
This study developed the SPFPN-YOLOv4 tiny model by combining multi-scale fea-
has disadvantages because it cannot recognize dangerous potholes in advance.
ture networks, such as SPP and FPN, with the YOLOv4 tiny model suitable for embedded
In contrast,
systems. A new 2Da pothole
applying risk assessment requires
risk assessment standardusing 2D pothole
was proposed images
to visually that can
indicate
predict dangerous
risk signals to the driver by comparing the size of the pothole detected using the devel-al. [35]
potholes before a vehicle passes through the pothole. Kortmann et
performed severity
oped modelclassification
with the size using deep
of the tire learning
contact based
patch area, on 2Dinroad
as shown Figureimages
1. to establish
First, a pothole dataset comprising 665 images and
a route plan for an autonomous vehicle. The author classified road hazards annotations was used to traininto
var- three
ious deep-learning models. Data augmentation techniques, such as gamma regulation,
stages: low, medium, and high, depending on the road features, potholes, and depths of the
scaling, and horizontal flip, were randomly applied to the original data to obtain 2665
road surface.datasets
According to this risk
to compensate assessment,
for the road hazards
insufficient dataset. were2400
Subsequently, classified as ‘high’
training and vali- risk
even if theredation
weredatasets
small potholes
were used on the the
to train road thatand
model, could not affect265
the remaining thetest
vehicle.
datasets Therefore,
were
used for model
this study presents a newperformance
2D pothole evaluation after the training
risk assessment thatwas completed.
compares theToactual
verify the
pothole
size and tireperformance
contact patch of the SPFPN-YOLO v4 tiny developed in this study, YOLOv2, YOLOv3,
area grounded on a road surface to address the limitations of
and YOLOv4 tiny were selected as benchmarks, and the improvement in accuracy was
previous pothole risk
verified assessment
by measuring studies. = 0.5), precision, recall, and F1-score for each model.
the mAP(IoU
The FPS values of the models were compared without a GPU to confirm real-time detec-
3. Methods tion, even on low-end hardware.
This study Indeveloped
addition, thethebounding box of the detected
SPFPN-YOLOv4 tinypothole
modelwas byconverted
combiningto themulti-scale
actual
size using the pinhole camera and distance estimation equations to determine the risk
feature networks, such
according to as
theSPP and FPN,
relationship withthe
between thepothole
YOLOv4 tinycontact
and tire model suitable
patch for evalu-
size. Risk embedded
systems. A new 2D pothole risk assessment standard was proposed to visually
ation criteria categorized the potholes into three stages: hazard if they were 1.2 times indicate
risk signals to the than
larger drivertheby
tirecomparing
contact patchthe size
size, of the
safety pothole
if they were 1.2detected using
times smaller thethe
than developed
tire
model with thecontact patch
size size,tire
of the andcontact
caution ifpatch
they were theas
area, same size. in Figure 1.
shown

Figure 1. Research and risk assessment procedure using SPFPN YOLOv4 tiny network.
Figure 1. Research and risk assessment procedure using SPFPN YOLOv4 tiny network.
3.1. Dataset Preprocessing
3.1.1. Dataset Preparation and Data Augmentation
The dataset comprises 665 image and annotation files in the Pascal VOC format pro-
vided by Kaggle [36]. The dataset’s images were captured under various environmental
conditions, such as public roads, unpaved roads, and potholes, including rainwater.
Electronics 2023, 12, 826 6 of 22

First, a pothole dataset comprising 665 images and annotations was used to train
various deep-learning models. Data augmentation techniques, such as gamma regula-
tion, scaling, and horizontal flip, were randomly applied to the original data to obtain
2665 datasets to compensate for the insufficient dataset. Subsequently, 2400 training and
validation datasets were used to train the model, and the remaining 265 test datasets
were used for model performance evaluation after the training was completed. To
verify the performance of the SPFPN-YOLO v4 tiny developed in this study, YOLOv2,
YOLOv3, and YOLOv4 tiny were selected as benchmarks, and the improvement in
accuracy was verified by measuring the mAP(IoU = 0.5), precision, recall, and F1-score
for each model. The FPS values of the models were compared without a GPU to confirm
real-time detection, even on low-end hardware.
In addition, the bounding box of the detected pothole was converted to the actual
size using the pinhole camera and distance estimation equations to determine the risk
according to the relationship between the pothole and tire contact patch size. Risk
evaluation criteria categorized the potholes into three stages: hazard if they were
1.2 times larger than the tire contact patch size, safety if they were 1.2 times smaller
than the tire contact patch size, and caution if they were the same size.

3.1. Dataset Preprocessing

3.1.1. Dataset Preparation and Data Augmentation
The dataset comprises 665 image and annotation files in the Pascal VOC format pro-
vided by Kaggle [36]. The dataset’s images were captured under various environmental
conditions, such as public roads, unpaved roads, and potholes, including rainwater.
Ground truth information in Pascal VOC format was converted into a YOLO annotation
format. The supervised deep learning model depends on the quality and quantity of
the dataset; therefore, data augmentation, such as horizontal flip, scaling, and gamma
regulation was applied to the original image datasets so that insufficient datasets and
models could adapt strongly to low-quality photos, light reflection, overexposed, and
underexposed data, as shown in Figure 2. The scale was multiplied by 0.7 for each
pixel size (width, height) in the image, gamma regulation was randomly multiplied by
0.5–2.0 for each channel in the image, and the image completely flipped horizontally.
After image transformation, the annotation files of the dataset were transformed. As
shown in Figure 2, the data augmentation technique for the ground truth was applied
only to images with horizontal flip and scaling, where objects were deformed, and
not to images with gamma regulation, which did not change the shape of the object.
The original ground truth is marked in green, and the ground truth to which each
data augmentation was applied is marked in red. Subsequently, 2665 datasets with
augmented techniques were divided into training, validation, and test datasets at ratios
of 70%, 20%, and 10%, respectively. Two thousand four hundred datasets were used for
model training, and 265 test datasets were used for test image inference. As potholes
mainly cause serious incidents in highway environments where vehicles drive at high
speeds [37], we manually picked 50 pothole images in the asphalt pavement environ-
ment similar to the highway road when measuring the Precision, Recall, F1-score, and
FPS after training the model.
marked in green, and the ground truth to which each data augmentation was applied is
marked in red. Subsequently, 2665 datasets with augmented techniques were divided into
training, validation, and test datasets at ratios of 70%, 20%, and 10%, respectively. Two
thousand four hundred datasets were used for model training, and 265 test datasets were
used for test image inference. As potholes mainly cause serious incidents in highway en-
Electronics 2023, 12, 826 vironments where vehicles drive at high speeds [37], we manually picked 50 pothole im- 7 of 22
ages in the asphalt pavement environment similar to the highway road when measuring
the Precision, Recall, F1-score, and FPS after training the model.

Figure 2.
Figure 2. Pothole
Pothole images
imagessubset after
subset data
after augmentation.
data augmentation.

3.1.2.
3.1.2. Dataset Configuration
Dataset Configuration
As
As shown Figure3,
shown in Figure 3,the
thetotal
totalnumber
numberofofdatasets
datasets after
after data
data augmentation
augmentation waswas ap- applied
plied
was was divided
2665, 2665, divided into 1865,
into 1865, 535, and
535, and 265 images
265 images for for
the the training,
training, validation,and
validation, andtest data,
test data, respectively.
respectively. Subsequently,Subsequently,
additionaladditional
datasetdataset
splits splits
were were performed.
performed. In order
In order to prevent
to prevent the deep learning model from overfitting to the original data
the deep learning model from overfitting to the original data set, it was first distributed set, it was first to
distributed
60%, 20%, and to 60%,
20%,20%, and 20%, respectively,
respectively, then redistributed
then redistributed to 60%, 18%, to 60%,
and18%,
22%,and 22%,
respectively, by
respectively, by moving a small amount of data. Additionally, for improving model’s de-
moving a small amount of data. Additionally, for improving model’s detection accuracy
tection accuracy under challenging condition, different ratios were applied to the data set
under challenging condition, different ratios were applied to the data set to which the data
to which the data augmentation method was applied and divided into training, valida-
augmentation method was applied and divided into training, validation, and test data, as
tion, and test data, as shown in Figure 3. The scaling-applied dataset is a data augmenta-
shown in Figure
tion method to train 3. small
The scaling-applied dataset
potholes well, divided intois540,
a data
120, augmentation
and 40 data for themethodtrain, to train
small
valid,potholes well, divided
and test datasets. into 540,
The Gamma 120, and 40 data
regulation-applied for the
dataset is train, valid,
a method for and test datasets.
precisely
The Gamma
detecting regulation-applied
potholes datasetdivided
in a noisy environment, is a method
into 415, for145,
precisely detecting
and 40 data for the potholes
train, in a
noisy
valid,environment,
and test datasets. divided into 415, flip-applied
The Horizontal 145, and 40dataset
data for the train,tovalid,
is intended and test
compensate fordatasets.
The Horizontal
the deficient flip-applied
dataset, divided intodataset is intended
510, 150, and 40 datato compensate for theand
for the train, valid, deficient
test da- dataset,
tasets. Since
divided intothe data
510, augmentation-applied
150, and 40 data for the dataset
train,was applied
valid, andtotest
improve the detection
datasets. Since the data
performance of the model
augmentation-applied under an
dataset wasadverse
appliedenvironment
to improve andtheprevent overfitting,
detection it was of the
performance
allocated
Electronics 2023, 12, x FOR PEER REVIEW to 40 sheets for the test dataset.
model under an adverse environment and prevent overfitting, it was allocated to 40 sheets 8 of 24
for the test dataset.

Figure 3. Pothole dataset configuration after data augmentation.

Figure 3. Pothole dataset configuration after data augmentation.
3.2. Development SPFPN-YOLOv4 Tiny
This section is divided into two sections. Section 3.2.1 describes the importance and
functionality of detection algorithms based on the multi-scale feature network used for
developing of the SPFPN-YOLOv4 tiny proposed in this work. Additionally, we show
3.2.2. SPFPN YOLOv4 Tiny
Several researchers and road maintenance organizations have recently adopted the
YOLO model, which has the fastest computation speed and highest accuracy among the
Electronics 2023, 12, 826 one-stage detector-type object detection models. YOLO detects and predicts by selecting 8 of 22
the prediction bounding box with the highest confidence score by dividing the input im-
age into an N × N grid cell. In addition, YOLOv4 tiny is the simplified version of YOLOv4,
which showed a high
3.2. Development performanceTiny
SPFPN-YOLOv4 of 43.5% average precision (AP) on the MS COCO data
sets [40]. YOLOv4 has a smaller CNN structure than YOLOv4; therefore, the detection
This section is divided into two sections. Section 3.2.1 describes the importance and
speed is fast, but with lower accuracy. Even though the YOLO-tiny series showed low
functionality of detection algorithms based on the multi-scale feature network used for
accuracy,
developing they canSPFPN-YOLOv4
of the achieve real-time detection
tiny proposedeven in low-spec
in this embedded systems,
work. Additionally, we showsuchhow
as cameras and Raspberry Pi, which lack hardware specifications, unlike
the applied SPP and FPN are coupled to the existing YOLOv4 tiny together in Figureordinary com-4.
puters.
Section Therefore, YOLOv4
3.2.2 describes tiny was
K-means++ adoptedfor
clustering in this study applying
efficiently because itofisYOLO’s
a highlydetection
suitable
deep-learning model for detecting potholes
mechanism, anchor box, as the network changes. while attaching a camera to the vehicle.

Figure 4. SPFPN YOLOv4 tiny overall architecture.

3.2.1.YOLOv4 tinyFeature
Multi-Scale detectsNetwork
objects based on CSPDarknet53-tiny, which applies CSPNet
[41] and one FPN, which is a type of multi-scale network [27]. Figure 4 shows the overall
The performance of the object detection model depends on the amount of core infor-
structure
mation aboutof the SPFPN-YOLOv4
the objects developed
in the convolutional in this
neural study.Multi-scale
network. The blackfeature
dotted line is
networks
CSPDarknet53-tiny which is the backbone of YOLOv4 and consists of three
are CNN-based structures that allow the independent detection of feature maps of different CSP blocks
and
sizestwo convolutional
fusing layers. The
between high-level area
and markedfeature
low-level by the maps
blue dotted linea represents
to create new one with the more
orig-
inal YOLOv4 tiny head, and the object is finally detected in a feature map
information on objects. This method is an important area of research, in which various with sizes of 13
×multi-scale
13 and 26 object
× 26. The FPN consisting of an upsampling layer, and a convolutional
detection methods have been introduced to achieve better performance layer
was usedyears.
in recent to connect
Objectthe 13 × 13 feature
detection based onmaps and 26 detection
multi-scale × 26 feature maps. divided into (1) in-
is mainly
Furthermore, 13 × 13 feature maps are high-level feature
dependent detection of several feature maps extracted from different layersmaps containing
of the the most
network,
intensive feature
and (2) fusion information
of several of objects,
feature and 26 ×from
maps extracted 26 maps can be
different considered
layers middle-level
of the network.
In case (1), the data was first applied to a single-shot detection network (SSD) for
object detection in independent feature maps generated from different layers of CNN,
and detecting small objects from feature maps combined with high-level and low-level
recorded better accuracy than detecting small objects from feature maps extracted from
low-level [38]. In case (2), Li et al. [39] first reported that state-of-the-art performance
in pedestrian detection using scale-aware mechanisms was obtained by combining and
adding output feature maps using large and small sub-networks for the fusion of several
feature maps. He et al. [31] increased the detection performance by applying several
max-pool filters and SPP, which can create feature maps regardless of the input image
size. Lin et al. [32] developed FPN and a top-down side connection structure integrate
information between high-level and low-level feature maps by combining local area features
of different sizes.
This literature review demonstrated that multi-scale feature networks are effective
in detecting small objects using even detection in independent feature maps and achieve
high accuracy when using feature fusion methods by connecting high-level and low-level
feature maps. Therefore, in this study, we combined the backbone network of the YOLOv4
Electronics 2023, 12, 826 9 of 22

tiny model with SPP and FPN, which are part of a multi-scale feature network type, to
improve detection performance and detect small potholes barely detected in asphalt roads.

3.2.2. SPFPN YOLOv4 Tiny

Several researchers and road maintenance organizations have recently adopted the
YOLO model, which has the fastest computation speed and highest accuracy among the
one-stage detector-type object detection models. YOLO detects and predicts by selecting the
prediction bounding box with the highest confidence score by dividing the input image into
an N × N grid cell. In addition, YOLOv4 tiny is the simplified version of YOLOv4, which
showed a high performance of 43.5% average precision (AP) on the MS COCO data sets [40].
YOLOv4 has a smaller CNN structure than YOLOv4; therefore, the detection speed is fast,
but with lower accuracy. Even though the YOLO-tiny series showed low accuracy, they
can achieve real-time detection even in low-spec embedded systems, such as cameras and
Raspberry Pi, which lack hardware specifications, unlike ordinary computers. Therefore,
YOLOv4 tiny was adopted in this study because it is a highly suitable deep-learning model
for detecting potholes while attaching a camera to the vehicle.
YOLOv4 tiny detects objects based on CSPDarknet53-tiny, which applies CSPNet [41]
and one FPN, which is a type of multi-scale network [27]. Figure 4 shows the over-
all structure of the SPFPN-YOLOv4 developed in this study. The black dotted line is
CSPDarknet53-tiny which is the backbone of YOLOv4 and consists of three CSP blocks and
two convolutional layers. The area marked by the blue dotted line represents the original
YOLOv4 tiny head, and the object is finally detected in a feature map with sizes of 13 × 13
and 26 × 26. The FPN consisting of an upsampling layer, and a convolutional layer was
used to connect the 13 × 13 feature maps and 26 × 26 feature maps.
Furthermore, 13 × 13 feature maps are high-level feature maps containing the most
intensive feature information of objects, and 26 × 26 maps can be considered middle-
level feature maps with less intensive spatial information than the 13 × 13 feature
maps. As the size of the feature map decreased, the shape of the smaller object became
more distorted, making it difficult to detect small potholes. Therefore, we added FPN
to improve the small object detection accuracy by utilizing low-level feature maps of
52 × 52 size. Moreover, an SPP network is added to the third CSP block, which has a
high probability of creating small distorted objects. The SPP network divides image
sizes into various images by applying max-pool filters, then merges them, resulting in
less information loss than the conventional max-pooling process. Additionally, [1 × 1],
[3 × 3], and [5 × 5] max-pool filters are applied in added SPP network, and a total of
13 × 13 × 2048 feature maps are generated after passing this network. The feature maps
passing the third CSP block connected to the SPP are first connected to the high-level
feature map of 13 × 13, and the high-level feature map is sequentially connected to
the middle- and low-level feature maps through two FPN. Finally, 52 × 52 low-level
feature maps responsible for small object features are connected with 13 × 13 and 26 × 26
feature maps that train low-level features, such as edges, shapes, and textures, and reflect
intensive features, such as key information representative of potholes.

3.3. K-Means++ Clustering

An anchor box has a high probability of a previously defined object being present. The
YOLO object detection method draws an accurately predicted bounding box by dividing
the image into a regular grid and calculating the error between the ground truth and
predefined anchor boxes. The number of objects that can be detected in one grid is the same
as the number of anchor boxes defined in advance, and the detection accuracy is high when
the size of the anchor box is similar to that of the ground truth box. Therefore, K-means ++
clustering [42] was applied to improve the IoU score between the prediction bounding box
and the ground truth box. The K-means++ clustering method was applied as follows:
Electronics 2023, 12, 826 10 of 22

(1) One of the data points was randomly selected and designated as the first center point.
(2) The distance to the first center point for the remaining points was calculated.
(3) A data point is placed as far away as possible from the defined center point and
then designated as the next center point.
(4) Steps 2 and 3 are repeated until there are K centers, using the following equation:

d = max( j:1→m) k xi − centroid j k2 (1)

where xi is a point in the dataset, centroid j is any data selected as the centroid, d is the
longest distance between xi and centroid j , and m is the total number of centroids K. Focus-
ing on the K-means ++ clustering mechanism, the farthest distance between centroids as K
and points were calculated using the following equation:

d(Ground truth, centroid) = 1 − IoU ( Ground truth, centroid) (2)

The IoU was calculated by dividing the overlapping area between the anchor box and
ground truth box by the sum of the areas of the two boxes. The lower the IoU value, the
farthest distance between the centroids and the point is calculated; therefore, it is possible
to set the center point farthest from the point. Anchor box values applied with K-means
++ clustering are [35, 21], [67, 44], [116, 64], [125, 126], [182, 210], [130, 110], [110, 60],
[72, 48] and [40, 22]. By setting the number of anchor boxes to 9, an IoU score of 71.02%
was achieved.

3.4. Object Detection Performance Evaluation Metrics

For the performance evaluation of SPFPN-YOLOv4tiny developed in this study, AP,
precision or mAP, recall, and F1-score calculated using confusion matrix index, which
is widely used as a computer vision performance evaluation technique, was used as an
evaluation index. Table 3 defines the confusion matrix index as follows [25].

Table 3. Confusion matrix index.

Prediction
Actual Predicted as Positive Predicted as Negative

Positive True Positive (TP) False Negative (FN)

Negative False Positive (FP) True Negative (TN)

As shown in Table 3, a true positive (TP) is the correct detection of the model. A
false positive is the object’s false detection, which does not exist in the image. Precision is
defined as the ratio of the number of true positives (TP) among (TP + FP), and recall is the
ratio of true positives (TP) out of those that are positive (TP + FN), which can be calculated
using Equations (3) and (4). Both precision and recall values are affected by IoU, which
refers to the overlapping ratio between the prediction bounding box and ground truth, as
shown in Equation (5).
TP
Precision = (3)
TP + FP
TP
Recall = (4)
TP + FN
A P ∩ AGT
IoU = (5)
A P ∪ AGT
where A P is the area of the prediction bounding box, and AGT is the area of the ground
truth box. With these precision and recall variables, the F1-score and AP can be calculated
using Equations (3) and (4). The F1-score is the harmonic average of the precision and
recall. The AP is the weighted sum of the precedence at each IoU threshold, and the weight
Electronics 2023, 12, 826 11 of 22

at this time is calculated as the difference between the recall value at the previous threshold
and the current threshold.
Precision × Recall
F1 score = 2 × (6)
Precision + Recall
M
AP@IoU = ∑ ( Rn − Rn−1 )( Pn ) (7)
i =1

1 N
N i∑
mAP@IoU = APi (8)
=1
The mAP is the average AP value for each class among the indices used to evaluate
object detection performance, but it is regarded as the same as mAP because only one class
was used in this study. AP and mAP can evaluate the reliability of the model by setting the
IoU threshold. The IoU threshold when measuring the model performance evaluation was
set to 0.5. In the training phase performance evaluation, mAP(IoU = 0.5) was applied, and
precision, recall, and F1-score were applied to model performance evaluation for 50 images
in the test dataset. In addition, FPS was also applied to measure the reference speed of the
deep learning model.

3.5. Risk Assessment

3.5.1. Pothole Distance Estimation
Before calculating the actual pothole size using 2D images, the distance between
the camera and pothole was calculated by assuming a monocular camera based on the
distance estimation equation, as shown in Figure 5 [43]. The prerequisites for this method
are as follows: (1) the detected object is attached to the ground, and (2) the distance is
estimated using only the detected pothole. The distance between the detected pothole and
Electronics 2023, 12, x FOR PEER REVIEW 14 of 24
the vehicle with a camera attached can only be determined under the situation where these
two conditions are satisfied.

Figure5.5.Distance
Figure Distanceestimation
estimationbetween
betweenobject
objectand
andcamera.
camera.

When a pothole enters the camera’s field of view, it is detected with a prediction
bounding box using an object-detection algorithm. Assuming that the pothole is in contact
Electronics 2023, 12, 826 12 of 22

with the ground, the horizontal distance can be expressed using Equation (9) by applying
the trigonometric ratio.
D ·tan(θ0 ) = H· tan(θc + β) (9)
where H is the height from the ground to the camera lens; θc is the angle of inclination from
the vertical axis pixel height of the image; β is the angle from the camera to the detected
pothole; and θ0 is the angle from the vertical axis to the pothole. θ0 can be calculated by
following Equation (10):
D
tan(θ0 ) = (10)
H
where dc is the horizontal distance between the actual position and camera. β can be
calculated using Equation (11):
hi
− di
tan( β) = 2 (11)
f
where hi is the pixel height of the image, di is the horizontal distance from the actual
position to the camera, and f is the focal length of the lens, which can be calculated using
Equation (12):
hi
f= (12)
2· tan α2
where α is the horizontal viewing angle of the lens. Finally, substituting all the variables
obtained through Equations (10)–(12), Equation (10) can be expressed as in Equation (13):
hi
" !! #
− d i
α
D = H· tan θc + tan−1 2 − tan θc − (13)
2· tan α2 2

3.5.2. Pinhole Camera Model

Figure 6 illustrates the principle of the pinhole camera and shows the image plane
projected through the pinhole of the lens. The pinhole is a simplified camera model that
projects an external image onto an image plane and considers image pixels and absolute
coordinates. In Figure 6, the axis passing through the hole of the lens is called the optical
axis, and the distance from the hole to the location where the image on the back is formed is
called the focal length. The focal length calibrated for each of the x- and y-axes of the camera
lens are called the x-axis focal length ( f x ) and y-axis focal length ( f y ), respectively. The
central point in the image plane on which the image is projected is called the principal point
(Cx , Cy ). The pinhole camera parameters are expressed as a camera matrix in Equation (14):
 
 x
cx 0  W
  
u fx γ
yW 
D = λ v  =  0 fy c y 0 
 ZW

 (14)
1 0 0 1 0
1

where u, v are the pixel coordinates in which the image is projected onto the image plane;
( xw , yw , zw ) are the 3D coordinates of the detected pothole; c x , cy are the principal point;
γ is the distortion rate, which is set to 0, assuming that there is no distortion rate; λ is the 2D
coordinate after camera calibration. Expanding Equation (14) yields Equations (15)–(17):

λu = f x xW + γyW + c x zW (15)

λ v = f x yW + c x zW (16)
λ = zW = D (17)
where λu is the x coordinate in the image; λv is the y coordinate in the image; λ is the
z-axis coordinate of the object similar to D, which is the distance between the lens and the
Electronics 2023, 12, 826 13 of 22

object. As the distortion rate is 0, the 3D coordinates of the object can be calculated using
Equations (18)–(20):
(u − c x )
xW = D (18)
fx

u − cy
yW = D (19)
fy
Figure 5. Distance estimation between object
ZW and
=camera.
D (20)

Figure 6. Pinhole camera model.

4. Results
Using this pinhole camera formula, we attempted to convert the detected pothole size
into
4.1.its actualDetection
Object size. By Algorithm
substituting the D values
Performance obtained from the distance estimation equa-
Comparison
tion inDeep
Equation (14) into Equations (18)–(20),
learning model development and trainingthe edge were
coordinate valueusing
performed of thethe
predicted
Darknet
bounding box of the detected pothole can be converted into 3D coordinates.
repository, a neural network framework developed by Joseph Redmond [44]. The com- Therefore, the
width and height were calculated by calculating the difference between
puter environment was Google Colab Pro, which provides high-performance computing the values of the
top-left corner (x,
specifications y) and
with right-bottom
Tesla corner
K80 with 12GB (x, y)
GPU of the bounding
memory. box the
Table 4 lists converted
library into 3D
environ-
coordinate values.
ments such as CUDA, CUDNN, and TensorFlow.
The learning parameters were set to a learning rate of 0.9 momentum, decay of 0.0005,
4. Results
batch size of 32, and a subdivision of 16. The input image size was fixed at 416 × 416, and
4.1. Object Detection Algorithm Performance Comparison
the mix-up augmentation during the training phase was set to True so that input image
Deep learning model development and training were performed using the Darknet
size could be randomly changed during the training phase. We set the confidence value
repository, a neural network framework developed by Joseph Redmond [44]. The computer
as 0.25 and the IoU threshold as 0.5 for the proposed SPFPN-YOLOv4 tiny, and other
environment was Google Colab Pro, which provides high-performance computing specifi-
YOLO models to detect potholes accurately. Comparing the training mAP for each model
cations with Tesla K80 with 12GB GPU memory. Table 4 lists the library environments such
as shown in Table 5, it was confirmed that SPFPN-YOLOv4 tiny achieved not only the
as CUDA, CUDNN, and TensorFlow.

Table 4. Training hardware environment.

Environment Model Version

CPU Intel(R) Xeon(R) CPU @ 2.30 GHZ
GPU Tesla K80 (12 GB)
IDE Google Colab Pro
CUDA 10.1
Tensorflow 2.2.1
Python 3.8

The learning parameters were set to a learning rate of 0.9 momentum, decay of 0.0005,
batch size of 32, and a subdivision of 16. The input image size was fixed at 416 × 416, and
the mix-up augmentation during the training phase was set to True so that input image
Electronics 2023, 12, 826 14 of 22

size could be randomly changed during the training phase. We set the confidence value as
0.25 and the IoU threshold as 0.5 for the proposed SPFPN-YOLOv4 tiny, and other YOLO
models to detect potholes accurately. Comparing the training mAP for each model as
shown in Table 5, it was confirmed that SPFPN-YOLOv4 tiny achieved not only the highest
mAP(IoU = 0.5) but also showed 5.8 h of training, which is shorter than that of YOLOv2
and YOLOv3.

Table 5. mAP(IoU = 0.5) comparison with various YOLO models.

Model mAP([email protected]) Training Hours

YOLOv2 74.8% 8.8
YOLOv3 77.8% 9.4
YOLOv4 tiny 72.7% 4.2
SPFPN-YOLO v4 tiny 79.6% 5.8

Figures 7–10 shows the mAP(IoU = 0.5) of YOLOv2, YOLOv3, YOLOv4 tiny, and the
proposed SPFPN-YOLOv4 tiny and training hours during the training phase. The blue line
in each figure indicates the loss, and the red line indicates the recorded mAP(IoU = 0.5)
per 100 times from 1000 iterations. Figure 7 of YOLOv2 shows a rapidly converging loss from
approximately 20 times, and shows that it achieves 74.8% mAP(IoU = 0.5) with a conver-
gence of 1 after the training. Figure 8 of YOLOv3 shows a stable convergence pattern after
600 iterations and achieves 77.8% mAP(IoU = 0.5) with a convergence of almost zero at the
2000 iterations. Figure 9 of YOLOv4 tiny shows a stable loss pattern and converges to almost
zero at 2000 iterations. Figure 7 of the proposed SPFPN-YOLOv4 tiny shows an irregular loss
pattern compared to other YOLO models but achieves the highest mAP(IoU = 0.5). The train-
Electronics 2023, 12, x FOR PEER REVIEW
ing epochs of each model were set to 2000, and additional detection performance16 comparisons,
of 24

such as precision, recall, F1-score, and FPS, were performed at the end of training.

Figure 7. YOLOv2 training graph.

Figure 7. YOLOv2 training graph.
Electronics 2023, 12, 826 15 of 22

Figure 7. YOLOv2 training graph.

Electronics 2023, 12, x FOR PEER REVIEW 17 of 24

Figure8.8.YOLOv3
Figure YOLOv3training
traininggragh.
gragh.

Figure 9. YOLOv4 tiny training gragh.

Figure 9. YOLOv4 tiny training gragh.
Electronics 2023, 12, 826 16 of 22

Figure 9. YOLOv4 tiny training gragh.

Figure 10. SPFPN-YOLOv4 tiny training graph.

Figure 10. SPFPN-YOLOv4 tiny training graph.
Table 6. Performance comparison on the pothole test dataset.
Additionally, because the difficulty of pothole detection stems from the appearance
of the model not trained on an actual road, an evaluation of the test dataset was also
performed. Fifty images of the test dataset were manually selected to simulate potholes on
the highway asphalt road surface. Moreover, the testing detection performance of the test
dataset was compared using only a CPU (Intel i7-10700k [email protected]) without a GPU or
GPU accelerator to assume embedded systems. Table 6 shows the precision, recall, F1-score,
and FPS of each model for 50 images that are manually picked. According to the results,
the proposed SPFPN-YOLOv4 tiny recorded the highest precision, recall, and F1-score. In
addition, SPFPN-YOLOv4 tiny slightly exceeded the real-time detection standard of 30 FPS
by achieving 38 FPS.

Table 6. Performance comparison on the pothole test dataset.

Model Precision Recall F1-Score FPS

YOLOv2 83% 74% 78.2% 26
YOLOv3 88% 72% 79.2% 28
YOLOv4 tiny 74% 76% 72.5% 56
Proposed SPFPN YOLOv4 tiny 89% 84% 86.4% 38

Figure 11 shows an example of the results for the test dataset of the proposed SPFPN-
YOLOv4 tiny, YOLOv4 tiny, YOLOv2, and YOLOv3. The green rectangle is the ground truth,
and red is the prediction bounding box predicted by each YOLO model. The ground truths
presented in pictures (a), (b), and (c) are 3, 5, and 3, respectively. In the image (a), SPFPN-
YOLOv4 tiny successfully detected all three potholes, YOLOv4 tiny detected one pothole,
and YOLOv2 and YOLOv3 detected two potholes. In the image (b), SPFPN-YOLOv4 tiny,
YOLOv4 tiny, YOLOv2, and YOLOv3 detect four, three, two, and three potholes, respectively.
In the image (C), SPFPN-YOLOv4 tiny detected all three potholes, YOLOv4 tiny and YOLOv2
detected only one pothole, and YOLOv3 detected two potholes. The values for the IoU score
achieved by each YOLO model for pictures (a), (b), and (c) are summarized in Table 7.
truth, and red is the prediction bounding box predicted by each YOLO model. The ground
truths presented in pictures (a), (b), and (c) are 3, 5, and 3, respectively. In the image (a),
SPFPN-YOLOv4 tiny successfully detected all three potholes, YOLOv4 tiny detected one
pothole, and YOLOv2 and YOLOv3 detected two potholes. In the image (b), SPFPN-
YOLOv4 tiny, YOLOv4 tiny, YOLOv2, and YOLOv3 detect four, three, two, and three
Electronics 2023, 12, 826 potholes, respectively. In the image (C), SPFPN-YOLOv4 tiny detected all three potholes, 17 of 22
YOLOv4 tiny and YOLOv2 detected only one pothole, and YOLOv3 detected two pot-
holes. The values for the IoU score achieved by each YOLO model for pictures (a), (b), and
(c) are summarized in Table 7.

Figure 11. Detection result sample on the test dataset.

Table 7. IoU comparison according to each YOLO models.

No. of Detected
Picture Model Accuracy(%) IoU
Potholes Potholes
YOLOv2 2 66 0.54
YOLOv3 2 66 0.52
(a) 3
YOLOv4 tiny 1 33 0.61
Proposed SPFPN-YOLOv4 tiny 3 100 0.63
YOLOv2 2 40 0.48
YOLOv3 3 60 0.44
(b) 5
YOLOv4 tiny 3 60 0.47
Proposed SPFPN-YOLOv4 tiny 4 80 0.53
YOLOv2 1 33 0.45
YOLOv3 2 66 0.63
(c) 3
YOLOv4 tiny 1 33 0.52
Proposed SPFPN-YOLOv4 tiny 3 100 0.58

Potholes occur in a variety of shapes and characteristics owing to various factors (envi-
ronment, physical stress, and deterioration of road surfaces), thus detectability under poor
conditions should also be verified. This study attempts to solve this problem by applying
data augmentation (horizontal flip, gamma regulation, and scaling). As shown in Figure 12,
several detected pothole images representative of adverse conditions, such as noisy data,
overexposed, underexposed, and poor quality of photos, were selected. Image (A) in Figure 12
shows a noisy road image and detects all three potholes with shallow depths and distinct
boundaries at the center. In this image (A), the proposed SPFPN-YOLOv4 tiny achieved
high confidence scores of 0.99, 0.98, and 0.96, respectively. Image (B) shows the results of
underexposed and overexposed pothole detection at a high average confidence score of 0.95
Electronics 2023, 12, 826 18 of 22

in a road environment of poor quality with asphalt cracks and potholes. It verifies precise
classification between cracks and potholes. Image (C) shows the number of pothole images
with accumulated rainwater and strong light reflection at the center of each pothole. In the
detection result of image (C), the proposed SPFPN-YOLOv4 tiny shows the detection of six
potholes despite the presence of light reflection, but achieved a relatively lower confidence
score than other images. Image (D) shows paved roads after snowfall without lane boundaries
and an environment where pothole discrimination is difficult. The proposed SPFPN-YOLOv4
tiny shows that if the characteristics of the boundary and center of the road surfaces are
shadowy, they are detected as potholes in image (D). However, if the distinction between
the center of the road surface and the boundary is ambiguous, they are not detected as pot-
holes. Image (E) shows underexposed potholes filled with rainwater at the road surface. In
this image, four potholes were detected, with a confidence score of 0.81, 0.86, 0.9, and 0.72,
respectively. The result shows the detectability in a relatively long distance. In addition,
the road surface area, which shows a large amount of rainwater next to the white-dotted
Electronics 2023, 12, x FOR PEER REVIEW
lane, is not detected as a pothole. Image (F) shows the results of the20pothole of 24
detection of the
proposed SPFPN-YOLOv4 tiny on shadowed unpaved roads. This result demonstrates the
detectability of potholes that can be easily found on unpaved roads while verifying that they
potholes that can be easily found on unpaved roads while verifying that they can be de-
can be detected in environments
tected in environments with less light reflection.
with less light reflection.

Figure 12.
Figure 12. Detection
Detectionresult in thein
result challenging condition. condition.
the challenging
4.2. Risk Assessment
The three-dimensional size of the potholes was calculated using pinhole cameras and
a distance equation. The variables required to compute the distance used in the pinhole
camera equation are the height of the camera lens, distance to the center point, inclination
angle of the camera lens, and height of the image plane. We set the height of the camera
Electronics 2023, 12, 826 19 of 22

4.2. Risk Assessment

Electronics 2023, 12, x FOR PEER REVIEW 21 of 24
The three-dimensional size of the potholes was calculated using pinhole cameras
and a distance equation. The variables required to compute the distance used in the
pinhole camera equation are the height of the camera lens, distance to the center point,
(1520 mm),
inclination the of
angle distance to the
the camera center
lens, and point
height(2900
of themm),
imagethe size We
plane. of the
set image plane
the height (416 ×
of the
camera (1520
416), and themm), the distance
inclination angle to ofthe
thecenter
camera point
lens(2900
(60°).mm), the size of the image plane
(416 ×These
416), and the inclination
variables angle of the
can be measured camera lens
by attaching (60◦ ). to the vehicle. However, the
a camera
These were
variables variables can be set
arbitrarily measured
to verify bythe
attaching
validitya camera to the vehicle.
of this study. Moreover, However, the
it is desirable
variables
to measurewerethearbitrarily
tire contactset patch
to verifysizethe
forvalidity
comparisonof thiswith
study. Moreover,
a pothole; thisitstudy
is desirable
replaced
toitmeasure
with thethe
tiretire contactvalue
footprint patchused size for comparison
in other studieswith a pothole;
to verify this study
its validity. Thereplaced
size of the
ittire
with the tire
contact footprint
patch valueasused
is defined 100inmm other studies
(width) to verify
× 280 its validity.
mm (height) when The
thesize
air of the
pressure
tire contact patch is defined as 100 mm ( width ) × 280 mm ( height ) when
is 1.2 bar, and the weight under the axis load is 4383 N. Based on these values, the width the air pressure
isand
1.2 bar, andofthe
height theweight underpatch
tire contact the axis
wereload is 4383 N.with
compared Based on these
those of thevalues,
pothole thewidth
widthand
and height of the tire contact patch were compared with those of the
height. It is defined as dangerous when the size of the pothole is 1.2 times larger than pothole width andthe
height. It is defined as dangerous when the size of the pothole is 1.2 times larger than the
tire patch size, cautious when the two values are the same, and safe when the pothole size
tire patch size, cautious when the two values are the same, and safe when the pothole size
is smaller than the tire contact patch size. Figure 13 shows the process of determining the
is smaller than the tire contact patch size. Figure 13 shows the process of determining the
risk using a pothole-report video. After detecting the pothole, the width and height values
risk using a pothole-report video. After detecting the pothole, the width and height values
of the bounding box are calculated and compared with the predefined tire contact patch
of the bounding box are calculated and compared with the predefined tire contact patch
value so that the safety case can be checked in green and the dangerous case in red.
value so that the safety case can be checked in green and the dangerous case in red.

Figure13.
Figure 13.Risk
Riskassessment
assessment through
through detected
detected pothole
pothole using
using SPFPN-YOLOv4
SPFPN-YOLOv4 tiny.
tiny.

5.5.Discussion
Discussion
Potholes
Potholesarearechallenging
challengingtotodetect
detectbecause
because of of
their uneven
their uneven shape
shapeowing
owingto various
to various
factors,
factors, and they should be detected quickly, even in embedded systems. In thisstudy,
and they should be detected quickly, even in embedded systems. In this study, a
amulti-scale
multi-scale detection
detection network
network waswas combined
combined withwith CSPDarknet53-tiny,
CSPDarknet53-tiny, the the backbone
backbone of
ofYOLOv4
YOLOv4tiny,tiny,sosothat
that the
the characteristics
characteristics of of
thethe object
object could
could be be richly
richly extracted.
extracted. As aAs a
result,
result, the highest
the highest mAP(IoU mAP(IoU
= 0.5)= 0.5) compared
compared to YOLOv2,
to YOLOv2, YOLOv3,
YOLOv3, andand YOLOv4
YOLOv4 tiny
tiny was
was achieved in the training phase. The highest precision, recall, and F1-score
achieved in the training phase. The highest precision, recall, and F1-score were also rec- were also
recorded
orded forforthe
thefifty
fiftyimages
images manually
manually picked
pickedin inthe
thetest
testdatasets.
datasets.AsAsa result of of
a result measuring
measuring
the reference speed without utilizing the GPU to assume an embedded
the reference speed without utilizing the GPU to assume an embedded system, system, 38 38
FPSFPS
was achieved, exceeding the real-time detection criterion of 30 FPS. In Figure 11, which
was achieved, exceeding the real-time detection criterion of 30 FPS. In Figure 11, which
shows the inference results of each model for the test data, we showed that the proposed
shows the inference results of each model for the test data, we showed that the proposed
SPFPN-YOLOv4 detected tiny potholes with an IoU score of more than 0.5 for all images.
SPFPN-YOLOv4 detected tiny potholes with an IoU score of more than 0.5 for all images.
In addition, Figure 12 shows that detection under poor academic conditions, such as noisy,
In addition, Figure 12 shows that detection under poor academic conditions, such as
shadowy, poor-quality, underexposed, and overexposed, is possible.
noisy, shadowy, poor-quality, underexposed, and overexposed, is possible.
In addition, this study developed a pothole risk assessment algorithm based on 2D
images that have yet to be addressed in previous studies, focusing on the fact that potholes
should be evaluated differently in their risk depending on their size. If the risk can be
evaluated in real-time, it can not only be used for road maintenance, but also for detecting
large potholes that are not recognizable by drivers in driving situations.
Electronics 2023, 12, 826 20 of 22

In addition, this study developed a pothole risk assessment algorithm based on 2D

images that have yet to be addressed in previous studies, focusing on the fact that potholes
should be evaluated differently in their risk depending on their size. If the risk can be
evaluated in real-time, it can not only be used for road maintenance, but also for detecting
large potholes that are not recognizable by drivers in driving situations.

6. Conclusions
For embedded systems with a low-grade computing performance, a new 2D pothole
detection algorithm was proposed. SPFPN-YOLOv4 tiny was developed by combining
CSPDarknet53-tiny with SPP and FPN to extract rich features from images.
Data augmentation techniques, such as gamma regulation, horizontal flip, and
scaling, were applied to compensate for insufficient datasets. The height and width
of the anchor box were calculated by applying the K-means++ algorithm to optimize
the IoU score. After applying data augmentation, 2665 image datasets were obtained,
which were then divided into training, validation, and testing data at 70%, 20%, and
10% ratios, respectively. A total of 2400 images were used for the training and validation
phases, and 265 images were used to evaluate the detection performance of the various
YOLO models on the test dataset. As a result of the training, SPFPN-YOLOv4 tiny,
YOLOv4 tiny, YOLO 2, and YOLOv3 models achieved 79.6%, 72.7%, 74.8%, and 77.8%,
respectively. Subsequently, 50 pothole images were manually selected from the test
dataset to reflect the highway asphalt pavement environment. On measuring precision,
recall, and F1-score, the Proposed SPFPN-YOLOv4 tiny showed the highest values. In
addition, real-time detection was verified by achieving 38 FPS as a result of inferring
images using only a CPU(Intel i7-10700k [email protected]).
In conclusion, we developed a lightweight pothole detection algorithm that enables
real-time detection, even in an environment without a GPU, and that improves accuracy
over general YOLO series, such as YOLOv2 and YOLOv3. In addition, we proposed a novel
risk assessment algorithm that predicts the actual size of large potholes that are likely to
have a negative impact on the vehicle. Moreover, it can determine the risk before a vehicle
comes into contact with the pothole by comparing it with the tire contact patch size. Thus,
this algorithm can not only be applied to road maintenance in various countries evaluating
the risk considering pothole size but also help the driver discover largely unrecognized
potholes in advance. As the number of pothole-related accidents is rapidly increasing,
continuous attention and response are required to mitigate them. Therefore, if the pothole
detection algorithm developed in this study is applied, the number of accidents resulting
from potholes can be reduced. In future research, it will be necessary to verify real-time
detection in an actual road environment by attaching an embedded system to the vehicle.
In this research, although dangerous potholes are marked as red bounding boxes so that
they can only be visually identified, the signal processing method will be combined with
our new risk assessment to indicate risk signals, such as vibration and haptic signals that
the driver can feel later.

Author Contributions: Conceptualization, T.-O.T.; Methodology, D.-H.H.; Software, D.-H.H.;

Validation, D.-H.H.; Resources, T.-O.T.; Data Curation, J.-Y.C. and S.-B.K.; Writing—Original
Draft Preparation, D.-H.H.; Writing—Review & Editing, S.-P.Z.; Visualization, D.-H.H. and S.-P.Z.;
Supervision, S.-P.Z.; Project Administration, S.-P.Z.; Funding Acquisition, T.-O.T.; Investigation,
D.-H.H. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the Ministry of Trade, Industry, and Energy (20011729).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data reported in this study are included in the article.
Electronics 2023, 12, 826 21 of 22

Acknowledgments: The authors would like to thank Yongjun Pan, a mechanical and vehicle engi-
neering professor at Chongqing university, for his constructive comments and extensive English
review and writing, which significantly strengthened this paper. This work was supported by the
Korea Institute of Industrial Technology Evaluation and Management grant funded by the Korea
government (Ministry of Trade, Industry and Energy) in 2022. (No.20011729, Development of Inter-
national Standard for Pedestrian Collision Prevention System Performance Evaluation Technology
for Autonomous Vehicles).
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Ministry of Road Transport and Highway. 2022. Available online: https://morth.nic.in/road-accident-in-india (accessed on 28
December 2022).
2. Bučko, B.; Lieskovská, E.; Zábovská, K.; Zábovský, M. Computer Vision Based Pothole Detection under Challenging Conditions.
Sensors 2022, 22, 8878. [CrossRef]
3. Qiao, Y.; Santos, J.; Stoner, A.M.; Flinstch, G. Climate change impacts on asphalt road pavement construction and maintenance:
An economic life cycle assessment of adaptation measures in the State of Virginia, United States. J. Ind. Ecol. 2020, 24, 342–355.
[CrossRef]
4. Zhang, F.; Hamdulla, A. Research on Pothole Detection Method for Intelligent Driving Vehicle. In Proceedings of the 2022 3rd
International Conference on Pattern Recognition and Machine Learning (PRML), Chengdu, China, 22–24 July 2022; pp. 124–130.
5. She, X.; Hongwei, Z.; Wang, Z.; Yan, J. Feasibility study of asphalt pavement pothole properties measurement using 3D line laser
technology. Int. J. Transp. Sci. Technol. 2021, 10, 83–92. [CrossRef]
6. Fan, R.; Ozgunalp, U.; Wang, Y.; Liu, M.; Pitas, I. Rethinking road surface 3-d reconstruction and pothole detection: From
perspective transformation to disparity map segmentation. IEEE Trans. Cybern. 2021, 52, 5799–5808. [CrossRef] [PubMed]
7. Kaushik, V.; Kalyan, B.S. Pothole Detection System: A Review of Different Methods Used for Detection. In 2022 Second International
Conference on Computer Science, Engineering and Applications (ICCSEA); IEEE: Piscataway, NJ, USA, 2022; pp. 1–4.
8. Wang, H.-W.; Chen, C.-H.; Cheng, D.-Y.; Lin, C.-H.; Lo, C.-C. A real-time pothole detection approach for intelligent transportation
system. Math. Probl. Eng. 2015, 2015, 869627. [CrossRef]
9. Kim, Y.-M.; Kim, Y.-G.; Son, S.-Y.; Lim, S.-Y.; Choi, B.-Y.; Choi, D.-H. Review of Recent Automated Pothole-Detection Methods.
Appl. Sci. 2022, 12, 5320. [CrossRef]
10. Ahmed, K.R. Smart Pothole Detection Using Deep Learning Based on Dilated Convolution. Sensors 2021, 21, 8406. [CrossRef]
[PubMed]
11. Chen, W.-H.; Hsu, H.-J.; Lin, Y.-C. Implementation of a Real-time Uneven Pavement Detection System on FPGA Platforms.
In Proceedings of the 2022 IEEE International Conference on Consumer Electronics-Taiwan, Taipei, Taiwan, 6–8 July 2022;
pp. 587–588.
12. Musa, A.; Hamada, M.; Hassan, M.A. Theoretical Framework Towards Building a Lightweight Model for Pothole Detection using
Knowledge Distillation Approach. In SHS Web of Conferences; EDP Sciences: Les Ulis, France, 2022; p. 03002.
13. Kırbaş, U. Effects of pothole type pavement distress on whole-body vibration. Road Mater. Pavement Des. 2022, 1–22. [CrossRef]
14. Solanke, V.L.; Patil, D.D.; Patkar, A.S.; Tamrale, G.S.; Kale, A.G. Analysis of existing road surface on the basis of pothole
characteristics. Glob. J. Res. Eng. 2019, 19, 17–23.
15. Romero-Chambi, E.; Villarroel-Quezada, S.; Atencio, E.; Muñoz-La Rivera, F. Analysis of optimal flight parameters of unmanned
aerial vehicles (UAVs) for detecting potholes in pavements. Appl. Sci. 2020, 10, 4157. [CrossRef]
16. Jana, S.; Thangam, S.; Kishore, A.; Sai Kumar, V.; Vandana, S. Transfer learning based deep convolutional neural network model
for pavement crack detection from images. Int. J. Nonlinear Anal. Appl. 2022, 13, 1209–1223.
17. Pandey, A.K.; Iqbal, R.; Maniak, T.; Karyotis, C.; Akuma, S.; Palade, V. Convolution neural networks for pothole detection of
critical road infrastructure. Comput. Electr. Eng. 2022, 99, 107725. [CrossRef]
18. Gupta, S.; Sharma, P.; Sharma, D.; Gupta, V.; Sambyal, N. Detection and localization of potholes in thermal images using deep
neural networks. Multimed. Tools Appl. 2020, 79, 26265–26284. [CrossRef]
19. Koch, C.; Brilakis, I. Pothole detection in asphalt pavement images. Adv. Eng. Inform. 2011, 25, 507–515. [CrossRef]
20. Powell, L.; Satheeshkumar, K. Automated road distress detection. In Proceedings of the 2016 International Conference on
Emerging Technological Trends (ICETT), Kollam, India, 21–22 October 2016; pp. 1–6.
21. Ukhwah, E.N.; Yuniarno, E.M.; Suprapto, Y.K. Asphalt pavement pothole detection using deep learning method based on YOLO
neural network. In Proceedings of the 2019 International Seminar on Intelligent Technology and Its Applications (ISITIA), JW
Marriott Hotel, Surabaya, Indonesia, 28–29 August 2019; pp. 35–40.
22. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
23. Adarsh, P.; Rathi, P.; Kumar, M. YOLO v3-Tiny: Object Detection and Recognition using one stage improved model. In Proceedings
of the 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India,
6–7 March 2020; pp. 687–694.
Electronics 2023, 12, 826 22 of 22

24. Huang, Z.; Wang, J.; Fu, X.; Yu, T.; Guo, Y.; Wang, R. DC-SPP-YOLO: Dense connection and spatial pyramid pooling based YOLO
for object detection. Inf. Sci. 2020, 522, 241–258. [CrossRef]
25. Park, S.-S.; Tran, V.-T.; Lee, D.-E. Application of various yolo models for computer vision-based real-time pothole detection. Appl.
Sci. 2021, 11, 11229. [CrossRef]
26. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
27. Wang, L.; Zhou, K.; Chu, A.; Wang, G.; Wang, L. An improved light-weight traffic sign recognition algorithm based on YOLOv4-
tiny. IEEE Access 2021, 9, 124963–124971. [CrossRef]
28. Malta, A.; Mendes, M.; Farinha, T. Augmented reality maintenance assistant using yolov5. Appl. Sci. 2021, 11, 4758. [CrossRef]
29. Dharneeshkar, J.; Aniruthan, S.; Karthika, R.; Parameswaran, L. Deep Learning based Detection of potholes in Indian roads using
YOLO. In Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India,
26–28 February 2020; pp. 381–385.
30. Asad, M.H.; Khaliq, S.; Yousaf, M.H.; Ullah, M.O.; Ahmad, A. Pothole Detection Using Deep Learning: A Real-Time and
AI-on-the-Edge Perspective. Adv. Civ. Eng. 2022, 2022. [CrossRef]
31. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans.
Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [CrossRef] [PubMed]
32. Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 19 April 2017; pp. 2117–2125.
33. RAC Foundation. 2019. Available online: https://www.racfoundation.org/media-centre/potholes-does-size-matter (accessed
on 18 January 2019).
34. Yang, C.H.; Kim, J.G.; Shin, S.P. Road hazard assessment using pothole and traffic data in South Korea. J. Adv. Transp. 2021,
2021, 1–10. [CrossRef]
35. Kortmann, F.; Fassmeyer, P.; Funk, B.; Drews, P. Watch out, pothole! featuring road damage detection in an end-to-end system for
autonomous driving. Data Knowl. Eng. 2022, 142, 102091. [CrossRef]
36. Kaggle, Pothole Detection. Available online: https://www.kaggle.com/datasets/andremvd/pothole-detection (accessed on 8
June 2020).
37. Alhussan, A.A.; Khafaga, D.S.; El-Kenawy, E.-S.M.; Ibrahim, A.; Eid, M.M.; Abdelhamid, A.A. Pothole and Plain Road Classifica-
tion Using Adaptive Mutation Dipper Throated Optimization and Transfer Learning for Self Driving Cars. IEEE Access 2022, 10,
84188–84211. [CrossRef]
38. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of
the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
39. Li, J.; Liang, X.; Shen, S.; Xu, T.; Feng, J.; Yan, S. Scale-aware fast R-CNN for pedestrian detection. IEEE Trans. Multimed. 2017, 20,
985–996. [CrossRef]
40. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in
context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
41. Wang, C.-Y.; Liao, H.-Y.M.; Wu, Y.-H.; Chen, P.-Y.; Hsieh, J.-W.; Yeh, I.-H. CSPNet: A new backbone that can enhance learning
capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle,
WA, USA, 14–19 June 2020; pp. 390–391.
42. Arthur, D.; Vassilvitskii, S. k-Means++: The Advantages of Careful Seeding; Stanford University: Stanford, CA, USA, 2006.
43. Rezaei, M.; Terauchi, M.; Klette, R. Robust vehicle detection and distance estimation under challenging lighting conditions. IEEE
Trans. Intell. Transp. Syst. 2015, 16, 2723–2743. [CrossRef]
44. Minca, C. The determination and analysis of tire contact surface geometric parameters. Rev. Air Force Acad. 2015, 1, 149.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

Newtronic Motorcycle Ignition Systems: Realise The Power and Potential of Your Classic Bike!
No ratings yet
Newtronic Motorcycle Ignition Systems: Realise The Power and Potential of Your Classic Bike!
5 pages
Music Business Plan
100% (6)
Music Business Plan
7 pages
Pothole Detection System Using 2D LiDAR and Camera
No ratings yet
Pothole Detection System Using 2D LiDAR and Camera
3 pages
Aashto Cfrp-Prestressed Concrete Design Training Course
100% (1)
Aashto Cfrp-Prestressed Concrete Design Training Course
93 pages
Application of Various YOLO Models For Computer Vi
No ratings yet
Application of Various YOLO Models For Computer Vi
14 pages
7-2025
No ratings yet
7-2025
13 pages
Pothole Severity Prediction Using Monocular Depth (3) (1) - 2
No ratings yet
Pothole Severity Prediction Using Monocular Depth (3) (1) - 2
15 pages
Pothole Detection Model For Road Safety Using Computer Vision and Machine Learning
No ratings yet
Pothole Detection Model For Road Safety Using Computer Vision and Machine Learning
8 pages
applsci-11-03725
No ratings yet
applsci-11-03725
16 pages
An Image-Based Convolutional Neural Network System For Road Defects Detection
No ratings yet
An Image-Based Convolutional Neural Network System For Road Defects Detection
8 pages
Pothole 2023
No ratings yet
Pothole 2023
12 pages
Heiwins Paper
No ratings yet
Heiwins Paper
5 pages
Pothole Detection and Dimension Estimation by Deep
No ratings yet
Pothole Detection and Dimension Estimation by Deep
14 pages
CPaper 12
No ratings yet
CPaper 12
6 pages
paper-3
No ratings yet
paper-3
7 pages
7-2024-Q1-IEEE ACCESS -Deep_Learning_Enhanced_Feature_Extraction_of_Potholes_Using_Vision_and_LiDAR_Data_for_Road_Maintenance
No ratings yet
7-2024-Q1-IEEE ACCESS -Deep_Learning_Enhanced_Feature_Extraction_of_Potholes_Using_Vision_and_LiDAR_Data_for_Road_Maintenance
9 pages
PotholePaper (1)-1
No ratings yet
PotholePaper (1)-1
5 pages
Paper Icmlas 2025
No ratings yet
Paper Icmlas 2025
5 pages
Chitale Et Al (2020)
No ratings yet
Chitale Et Al (2020)
6 pages
IJSRED-V8I2P63
No ratings yet
IJSRED-V8I2P63
6 pages
paper-6
No ratings yet
paper-6
6 pages
Pothole Detection Using Machine Learning
No ratings yet
Pothole Detection Using Machine Learning
6 pages
jimaging-10-00227
No ratings yet
jimaging-10-00227
22 pages
Bridging the Day-Night Gap in Pothole Detection Using Generative Models and Deep Learning-Based Object Detection
No ratings yet
Bridging the Day-Night Gap in Pothole Detection Using Generative Models and Deep Learning-Based Object Detection
4 pages
Real Time Pothole Detection
No ratings yet
Real Time Pothole Detection
6 pages
A Deep Learning-Based Pothole Detection System Using Unmanned Aerial Vehicle Images
No ratings yet
A Deep Learning-Based Pothole Detection System Using Unmanned Aerial Vehicle Images
31 pages
IEEE
No ratings yet
IEEE
5 pages
abdo-2024-ijca-924325
No ratings yet
abdo-2024-ijca-924325
6 pages
Rastogi 2020
No ratings yet
Rastogi 2020
6 pages
Nighttime_Pothole_Detection_A_Benchmark
No ratings yet
Nighttime_Pothole_Detection_A_Benchmark
19 pages
Smart System of Potholes Detection
No ratings yet
Smart System of Potholes Detection
6 pages
Yolov8-Based Visual Detection of Road Hazards: Potholes, Sewer Covers, and Manholes
No ratings yet
Yolov8-Based Visual Detection of Road Hazards: Potholes, Sewer Covers, and Manholes
7 pages
Detection of Road Cracks and Potholes Using IOT Device
No ratings yet
Detection of Road Cracks and Potholes Using IOT Device
6 pages
focus (pathole detection
No ratings yet
focus (pathole detection
7 pages
Drone Based Potholes Detection Using Machine Learning on Various Edge AI Devices in Real-Time
No ratings yet
Drone Based Potholes Detection Using Machine Learning on Various Edge AI Devices in Real-Time
5 pages
Project Feasibility Presentation Ver2[1][2]
No ratings yet
Project Feasibility Presentation Ver2[1][2]
13 pages
1908.00894v3
No ratings yet
1908.00894v3
12 pages
1 Sustainable Road Pothole Detection A Crowdsourcing Based MultiSensors Fusion ApproachSustainability Switzerland
No ratings yet
1 Sustainable Road Pothole Detection A Crowdsourcing Based MultiSensors Fusion ApproachSustainability Switzerland
23 pages
Pothole Detection and Geological Mapping Through Aerial Vehicle
No ratings yet
Pothole Detection and Geological Mapping Through Aerial Vehicle
6 pages
ProjectProposal TresMarias
No ratings yet
ProjectProposal TresMarias
3 pages
Assessing The Performance of Yolov5, Yolov6, and Yolov7 in Road Defect Detection and Classification: A Comparative Study
No ratings yet
Assessing The Performance of Yolov5, Yolov6, and Yolov7 in Road Defect Detection and Classification: A Comparative Study
11 pages
Kuch Bhi
No ratings yet
Kuch Bhi
5 pages
Classification of Different Size of Potholes Based
No ratings yet
Classification of Different Size of Potholes Based
25 pages
Jpeodx Pveng-1194
No ratings yet
Jpeodx Pveng-1194
18 pages
Lightweight Detection Method of Pavement Potholes Based on Binocular Stereo Vision
No ratings yet
Lightweight Detection Method of Pavement Potholes Based on Binocular Stereo Vision
14 pages
Smart System For Potholes Detection Using Computer Vision With Transfer Learning
No ratings yet
Smart System For Potholes Detection Using Computer Vision With Transfer Learning
9 pages
Prototype of Vehicles Potholes Detection - Dewiani - DKK
No ratings yet
Prototype of Vehicles Potholes Detection - Dewiani - DKK
7 pages
Paper
No ratings yet
Paper
9 pages
Ae - 01fe20bei007
No ratings yet
Ae - 01fe20bei007
4 pages
What Is Visual Pothole Detection
No ratings yet
What Is Visual Pothole Detection
2 pages
Sensors 23 07395
No ratings yet
Sensors 23 07395
18 pages
EasyChair-Preprint-9151
No ratings yet
EasyChair-Preprint-9151
13 pages
Result Paper
No ratings yet
Result Paper
8 pages
Road Damage Detection Algorithm For Improved YOLOv5
No ratings yet
Road Damage Detection Algorithm For Improved YOLOv5
12 pages
HEYOLOv5s Efficient Road Defect Detection Networkentropy 25 01280
No ratings yet
HEYOLOv5s Efficient Road Defect Detection Networkentropy 25 01280
16 pages
IJRPR13505
No ratings yet
IJRPR13505
5 pages
Sensors: An Automated Machine-Learning Approach For Road Pothole Detection Using Smartphone Sensor Data
No ratings yet
Sensors: An Automated Machine-Learning Approach For Road Pothole Detection Using Smartphone Sensor Data
23 pages
Pothole Detection Using Machine Learning
No ratings yet
Pothole Detection Using Machine Learning
5 pages
A Deep Learning Approach For Road Damage Detection From Smartphone Images
No ratings yet
A Deep Learning Approach For Road Damage Detection From Smartphone Images
4 pages
Article pp 353-365_c
No ratings yet
Article pp 353-365_c
13 pages
Underwater Computer Vision: Exploring the Depths of Computer Vision Beneath the Waves
From Everand
Underwater Computer Vision: Exploring the Depths of Computer Vision Beneath the Waves
Fouad Sabry
No ratings yet
Object Detection: Advances, Applications, and Algorithms
From Everand
Object Detection: Advances, Applications, and Algorithms
Fouad Sabry
No ratings yet
Automatic Target Recognition: Advances in Computer Vision Techniques for Target Recognition
From Everand
Automatic Target Recognition: Advances in Computer Vision Techniques for Target Recognition
Fouad Sabry
No ratings yet
Multiple Losing Streak Probability Calculator
No ratings yet
Multiple Losing Streak Probability Calculator
5 pages
De Thi HSG Hoa Hoc 9 Ha Noi Tu 2006-2019
No ratings yet
De Thi HSG Hoa Hoc 9 Ha Noi Tu 2006-2019
26 pages
Print Edition: 26 May 2014
No ratings yet
Print Edition: 26 May 2014
21 pages
Self Reliance in India
No ratings yet
Self Reliance in India
11 pages
Youth in Agriculture Regional Workshop
100% (1)
Youth in Agriculture Regional Workshop
40 pages
DMO 2025.1_User Guide_EN
No ratings yet
DMO 2025.1_User Guide_EN
58 pages
Partnership Review Notes
No ratings yet
Partnership Review Notes
3 pages
An Overview of Multimedia
No ratings yet
An Overview of Multimedia
14 pages
BE Project Diary
No ratings yet
BE Project Diary
8 pages
Timing Belts: Optibelt 8M Sections
No ratings yet
Timing Belts: Optibelt 8M Sections
3 pages
00 Creating - The - ALLTASKS - Role
No ratings yet
00 Creating - The - ALLTASKS - Role
4 pages
2021-cash
No ratings yet
2021-cash
1 page
M.U THESIS Documents
No ratings yet
M.U THESIS Documents
5 pages
Constructing Smooth Hot Mix Asphalt HMA Pavements ASTM Special Technical Publication 1433 Mary Stroup-Gardiner download
No ratings yet
Constructing Smooth Hot Mix Asphalt HMA Pavements ASTM Special Technical Publication 1433 Mary Stroup-Gardiner download
71 pages
Samarth Egov
No ratings yet
Samarth Egov
2 pages
BEL Technical Manual 4 End Port L
No ratings yet
BEL Technical Manual 4 End Port L
27 pages
User's Manual of Pressure Tank
No ratings yet
User's Manual of Pressure Tank
15 pages
Insert Your Essay Here: My Ideal Electrician
No ratings yet
Insert Your Essay Here: My Ideal Electrician
1 page
Orvis Switch
No ratings yet
Orvis Switch
1 page
CS101 PRACTICE QUESTIONS UPDATED
No ratings yet
CS101 PRACTICE QUESTIONS UPDATED
7 pages
Interim Report Replaceable Unbonded Tendons For Post-Tensioned Dridges BDV-31-977-15-Task1
No ratings yet
Interim Report Replaceable Unbonded Tendons For Post-Tensioned Dridges BDV-31-977-15-Task1
97 pages
Public Interest Litigation
No ratings yet
Public Interest Litigation
8 pages
Week 3 - DLL
No ratings yet
Week 3 - DLL
4 pages
Pricelist FATEK EUR
No ratings yet
Pricelist FATEK EUR
1 page
Brochure Conference Isiem 2 PDF
No ratings yet
Brochure Conference Isiem 2 PDF
6 pages
Computer Graphics World 2004 12
No ratings yet
Computer Graphics World 2004 12
56 pages
Marketing Strategy Text and Cases 6th Edition Ferrell Solutions Manual Download
100% (23)
Marketing Strategy Text and Cases 6th Edition Ferrell Solutions Manual Download
9 pages

Image-Based Pothole Detection Using Multi-Scale Feature Network and Risk Assessment

Uploaded by

Image-Based Pothole Detection Using Multi-Scale Feature Network and Risk Assessment

Uploaded by

electronics

1 Department of Mechanical and Biomedical Engineering, Kangwon National University,

Keywords: pothole; 2D object detection; convolutional neural network; multi-scale detection;

Citation: Heo, D.-H.; Choi, J.-Y.;

Electronics 2023, 12, 826. https://doi.org/10.3390/electronics12040826 https://www.mdpi.com/journal/electronics

Table 1. A summary of recent road pothole detection algorithm performance.

2.2. Risk Assessment Methods

Table 2. Pothole risk classification specified in LTTP and HPAS.

Property Degree of Risk LTTP (USA) HPAS (China)

3.1. Dataset Preprocessing

Figure 3. Pothole dataset configuration after data augmentation.

Figure 4. SPFPN YOLOv4 tiny overall architecture.

3.2.2. SPFPN YOLOv4 Tiny

3.3. K-Means++ Clustering

d = max( j:1→m) k xi − centroid j k2 (1)

d(Ground truth, centroid) = 1 − IoU ( Ground truth, centroid) (2)

3.4. Object Detection Performance Evaluation Metrics

Table 3. Confusion matrix index.

Positive True Positive (TP) False Negative (FN)

3.5. Risk Assessment

3.5.2. Pinhole Camera Model

Figure 6. Pinhole camera model.

Table 4. Training hardware environment.

Environment Model Version

Table 5. mAP(IoU = 0.5) comparison with various YOLO models.

Model mAP([email protected]) Training Hours

Figure 7. YOLOv2 training graph.

Figure 7. YOLOv2 training graph.

Electronics 2023, 12, x FOR PEER REVIEW 17 of 24

Figure 9. YOLOv4 tiny training gragh.

Figure 9. YOLOv4 tiny training gragh.

Figure 10. SPFPN-YOLOv4 tiny training graph.

Table 6. Performance comparison on the pothole test dataset.

Model Precision Recall F1-Score FPS

Figure 11. Detection result sample on the test dataset.

Table 7. IoU comparison according to each YOLO models.

4.2. Risk Assessment

In addition, this study developed a pothole risk assessment algorithm based on 2D

Author Contributions: Conceptualization, T.-O.T.; Methodology, D.-H.H.; Software, D.-H.H.;

You might also like