Structural Contr Hlth - 2022 - Huyan - Pixelwise Asphalt Concrete Pavement Crack Detection via Deep Learning‐Based
DOI: 10.1002/stc.2974
RESEARCH ARTICLE
1 School of Transportation, Southeast University, Nanjing, China
2 School of Information Engineering, Chang'an University, Xi'an, China
Correspondence
Tao Ma, School of Transportation, Southeast University, Southeast University Road, Jiangning District, Nanjing 211189, China.
Email: [email protected]
Funding information
Fundamental Research Funds for the Central University of China, Grant/Award Numbers: 300102249306, 300102249301; National Key R&D Program of China, Grant/Award Numbers: 2021YFB2600600, 2021YFB2600604; National Natural Science Foundation of China, Grant/Award Number: 52108403
Summary
Explicit gaps exist between advanced deep learning technologies and the less satisfactory pixel-level crack detection algorithms. This research sought to bridge this gap by outlining deep neural network models for pixelwise pavement crack detection. Two state-of-the-art deep neural network models are constructed for the semantic segmentation of crack images. The first architecture, VGGCrackU-net, is composed of ten 3 × 3 convolutional layers, 4 max-pooling layers, 4 up-sampling layers, and 4 concatenate operations. The second architecture, ResCrackU-net, is composed of 7-level residual units with a total of 22 convolutional layers. Asphalt concrete (AC) pavement crack images are collected by smartphones, action cameras, and automatic pavement monitoring systems from diverse functional classes of AC pavements. The crack images are manually labeled and double-checked by trained operators for quality assurance. After that, 500 crack images are randomly divided into training, validation, and test datasets according to the ratio of 3:1:1. Both architectures are trained on the GPU-accelerated Keras platform with Python 3.5 and demonstrated fast convergence. Results show that the proposed models exhibit significant advantages for pixelwise crack detection when compared with the widely used FCN and PSPnet. Meanwhile, ResCrackU-net slightly outperforms VGGCrackU-net, which nevertheless provides acceptable results as well. Although significant false negative and false positive errors are observed in both network models, the contributions are noticeable and can provide guidance for future work on solving these problems.
KEYWORDS
deep learning, pavement cracks, pixel-level crack detection, semantic segmentation, U-net
1 | INTRODUCTION
Serving as a critical component of the transportation system, pavement provides the most basic infrastructure for the whole transportation network. In other words, the flexibility, the performance levels, and the network advancement of a country's transportation systems can be regarded as symbols of the comprehensive strength of that nation. Therefore, as a fundamental part of the transportation system, pavement should be constructed based on scientific design specifications with consideration of long-life serviceability, monitored periodically with
Struct Control Health Monit. 2022;29:e2974. wileyonlinelibrary.com/journal/stc © 2022 John Wiley & Sons, Ltd. 1 of 21
https://doi.org/10.1002/stc.2974
reliable condition evaluations, and maintained with comprehensive consideration of safety assurance, serviceability level requirements, and budget allocations.1–3 Upon opening to traffic, the long-life service objective of a pavement can only be achieved through timely condition monitoring and proper maintenance and rehabilitation treatments.
Flexible pavements, which usually consist of a mixture of bituminous materials and aggregates, are the major pavement type on Canadian highways. The primary function of pavement, that is, load transfer capability, is achieved by the combined functioning of all components. Therefore, multiple layers are designed to achieve better load transmitting capability, which, however, increases the complexity of the pavement structure and the difficulty of damage detection. Under the influence of internal material aging and external traffic, environmental, climatic, and other damage, pavement performance tends to deteriorate over time.4–6 The initial manifestations of pavement performance deterioration are the occurrence of various surface distresses, rutting, and so on.
Cracks are a frequent surface distress on asphalt concrete pavement surfaces and may lead to permanent performance degradation in the long run. An effective crack analytical approach can provide public sectors with reliable information for overall pavement performance rating and repair work planning.7,8 Advanced computer-aided technologies, namely deep learning-based intelligent solutions, have achieved eye-catching results in target detection, object recognition, semantic segmentation, and so on. Nevertheless, current algorithms for automatic pixel-level pavement crack detection are criticized for being less efficient and robust. The primary reason is that each traditional method can only interpret a specific aspect of the crack image, whereas cracks have multiple features that are hard to explain with simple models.9,10 Moreover, traditional manual pavement surface crack evaluation cannot meet the accuracy and efficiency requirements of pavement management systems. Data-driven pavement crack characterization and quantitative modeling have become the main trend of algorithms used in automated pavement monitoring systems.11–13
The most notable characteristics of data-driven automated approaches are objectiveness and high efficiency compared with manual methods.14,15 The basic inputs for data-driven modeling are pavement surface condition data. Current automated pavement surface crack data collection systems can be classified into two-dimensional (2D) systems and three-dimensional (3D) systems. 3D systems are more advanced and usually outperform 2D systems in terms of accuracy, efficiency, and data quality. Crack detection algorithms usually perform better when embedded in 3D systems. However, both the configuration of the system and the development of 3D crack analysis algorithms are much more challenging than for 2D-based systems. In light of those obstacles, most pavement management departments are incapable of providing 3D-based pavement crack evaluations. Specifically, the main weaknesses of current methodologies are that (1) minor or hairline cracks are hard to detect; (2) successive processing steps break initially complete cracks into groups of fractured crack segments; (3) pavement shoulder drop-offs or other “crack-like” regions are regarded as cracks; and (4) most algorithms are only suitable for detecting cracks collected by specific systems, that is, they are less robust.
There are two aspects of improvement that private-public endeavors can consider to achieve highly reliable pavement crack analysis: (1) develop more effective and robust crack analytical approaches that can obtain satisfying results on 2D systems; and (2) allocate more capital investment to upgrade current 2D systems into 3D systems.16,17 Both aspects have undergone extensive research over the past decades. The former seems to be gradually replaced by the latter because of the inherent limitations of traditional methods. Nevertheless, the development of machine learning technologies, especially deep learning techniques, gives researchers a promising prospect of achieving remarkable breakthroughs in intelligent pavement crack analytical approaches based on 2D crack images.
Therefore, this research aims to evaluate the effectiveness of deep learning techniques for pixelwise pavement crack detection. State-of-the-art deep learning semantic segmentation methodologies are compared for flexible pavement crack detection. The main contributions of this research can be summarized as follows:
1. designed deep network architectures for asphalt concrete pavement crack characterization: VGGCrackU-net and
ResCrackU-net;
2. compared the pixelwise crack segmentation performance of the proposed VGGCrackU-net, ResCrackU-net, PSPnet,
and FCN; and
3. addressed the false negative and false positive errors facing the automatic flexible pavement crack detections.
2 | RELATED WORK
Cracks are key pavement surface manifestations that should be accurately characterized with reliable measurements
when conducting periodic pavement condition surveys.
Traditionally, automatic pavement crack analyzers18,19 rely on digital image processing algorithms.20 Representative methodologies include (1) crack image segmentation, such as thresholding and region growing methods21,22; (2) crack edge detection,23 such as edge detector and transformation methods; and (3) heuristic analysis-based approaches and hybrid methods for crack feature extraction, such as genetic algorithms,24 Particle Swarm Optimization (PSO), and Simulated Annealing (SA) methods. Crack segmentation assumes that cracks and the pavement background have significantly different intensities within an image. Intuitively, these approaches should be able to delineate the cracked regions of an image directly. They are, however, sensitive to the quality of the collected crack images, and most of them obtain satisfactory results only on low-noise crack images. Crack contour detection assumes that there is a distinctive variation between the crack boundary and the pavement background. On this assumption, the primary objective is to correctly find the sharply changing parts, which are regarded as the crack boundaries. This purpose can be achieved in both the image space domain and the frequency domain. Wavelet transformation,25,26 Fourier transformation, Haar transformation, and Hough transformation27 are the classical approaches. Several studies have demonstrated the effectiveness of transformation-based crack edge extraction. However, much edge information tends to be lost in blurred crack images, whereas much spurious information tends to be extracted from high-noise crack images. The reason is that under those conditions the sharp changes at the crack boundary are degraded, resulting in inaccurate crack edge detection. Heuristic methods are developed by imitating the intelligence that animals show in surviving, behaving, and learning. Hybrid methods usually combine or fuse several basic methods to achieve better crack feature extraction results. Generally, both traditional heuristic and hybrid methodologies can provide acceptable pavement crack detection. However, those methods still depend strongly on the quality of the crack image, and their computational complexity is much higher.
A body of recent literature focuses on machine learning-based pavement surface crack detection.28–31 Researchers around the world are trying to take advantage of this technology to upgrade their automatic pavement management systems. As a typical data-driven modeling method, machine learning first builds a model architecture and then uses a prepared training dataset with ground truth to train it, guiding it to learn hierarchical features of the given dataset. Machine learning technologies have been used for pavement crack analysis since their early days, for example, in pavement crack analytical methods based on Back Propagation Neural Networks,32 Support Vector Machines,33 Random Forest, Clustering algorithms,34 and Boosting methods.35 Those methods employ the data-driven knowledge discovery principle of machine learning. However, pavement crack images contain complex (1) background conditions, which refer to pavement types such as asphalt concrete pavement,36,37 Portland Cement Concrete pavement, and Composite pavement38,39; and (2) noise conditions, which may include noise from both the data collection systems and the roadway surroundings. Therefore, the practical on-site usability of those pavement crack analysis methodologies is often doubted by pavement management personnel.
Most recently, deep learning techniques have attracted considerable attention in several fields. The main reason is that deep learning promotes the development of artificial intelligence technologies, especially in robotic systems, medical disease diagnosis, self-driving cars, navigation systems, object detection, and so on. The term “deep” usually refers to the hierarchical structure of deep neural networks.17,40,41 The intelligent aspect of deep learning techniques is achieved by the end-to-end neural network architecture, which can provide the desired output without human intervention. The deep convolutional neural network, a representative kind of deep model, is developed for intelligent object detection, recognition, and classification. Cracks are regarded as the target object in a pavement surface image. Therefore, the foremost task of pavement crack detection is to separate the cracked regions from the pavement background with the best possible performance.42,43 In a convolutional network, each filter kernel serves as a feature extraction tool, which can extract multi-level feature maps of the input crack image. Thus, multi-scale crack features are extracted by multi-size filters with multiple fields of view. That is the main reason why deep network models outperform traditional methods.44–46
In terms of automatic pavement crack detection, most current deep analytical models are for block-level or grid-level detection. The objective of block-level crack detection is to train deep convolutional neural networks that can find the best bounding boxes of the cracks in a pavement image.47,48 Grid-level crack detection49–51 describes the cracks by a group of small boxes along the crack distribution area. Both levels are meaningful for different purposes of crack analysis and have been extensively studied by scholars. However, few studies have addressed intelligent pixel-level (pixelwise) crack detection,52–57 which can provide pixel-perfect characterizations for precise evaluations. Pixel-perfect crack detection and characterization are challenging tasks since they require comprehensive interpretation of the crack images. Because both private and public sectors emphasize precise crack detection, significant effort is still required in this area of research.
Based on the contributions made by scholars, together with the significant research gaps in pixelwise crack detection, the primary objective of this research is to develop end-to-end deep network models for pixelwise crack detection. Meanwhile, this paper compares the performance of the proposed networks with currently popular semantic segmentation models to outline the strengths and weaknesses of each method with respect to pavement crack characterization.
3 | METHODOLOGY
3.1.1 | Deconvolution
Convolution is the key computational operation for image feature extraction in deep convolutional neural networks. During the convolution process, each convolutional kernel aims to extract a specific image feature. Therefore, multiple features of the input image can be extracted by diverse convolutional filters. Deconvolution is also called transposed convolution, a name that describes the computation more accurately.
As aforementioned, convolutional operations can obtain multi-level feature maps of the input image. The deconvolution operation cannot recover the original image; instead, it aims to produce an output of the same size as the input image while maintaining the extracted feature information. Deconvolution operations have therefore proved to be effective for semantic segmentation. Generally, they are conducted in the decoder part of the network and act as a kind of up-sampling method.
All the up-sampling operations in the proposed architectures are realized by deconvolution. Given an input image of size $N$, a filter size of $F$, and a stride of 2 pixels, the size of the deconvolutional output can theoretically be calculated as $\text{Output} = (N - 1) \times 2 + F$. In this paper, a 2 × 2 kernel with a stride of 2 pixels is selected for deconvolution. Figure 1 demonstrates the detailed operations in deconvolution processing. As can be seen, the deconvolution operation mainly includes two steps: (1) pixel-by-pixel convolution and (2) feature fusion based on the specified stride. In deconvolution, the stride value is therefore used in the fusion stage to control the size of the output image. In Figure 1, the input image has a size of 3 × 3, the filter size is 2 × 2, and the stride is 2 pixels. After deconvolution, the size of the output image is 6 × 6, which agrees with the theoretical calculation: $(3 - 1) \times 2 + 2 = 6$.
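The two-step deconvolution described above (pixel-by-pixel scaling of the kernel, followed by stride-controlled fusion) can be illustrated with a minimal pure-Python sketch. The function names `transposed_conv_output_size` and `transposed_conv2d`, the single-channel setting, and the all-ones kernel are illustrative assumptions and are not taken from the authors' Keras implementation.

```python
def transposed_conv_output_size(n, f, stride=2):
    """Output size of an unpadded transposed convolution: (N - 1) * stride + F."""
    return (n - 1) * stride + f


def transposed_conv2d(image, kernel, stride=2):
    """Naive single-channel 2D transposed convolution without padding."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = transposed_conv_output_size(n_rows, k_rows, stride)
    out_cols = transposed_conv_output_size(n_cols, k_cols, stride)
    out = [[0.0] * out_cols for _ in range(out_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            # Step 1: scale the kernel by one input pixel.
            # Step 2: fuse (add) it into the output at a stride-controlled offset.
            for a in range(k_rows):
                for b in range(k_cols):
                    out[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    return out


# Hypothetical 3 x 3 input and 2 x 2 kernel, matching the sizes used in Figure 1.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 1], [1, 1]]
result = transposed_conv2d(image, kernel, stride=2)
print(len(result), len(result[0]))  # spatial size of the up-sampled output
```

Because the kernel size equals the stride here, the scaled kernels do not overlap in the fusion step, and each input pixel simply fills its own 2 × 2 block of the output.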
Batch normalization (BN) was proposed by Google deep learning researchers58 and lays the foundation for effectively training deeper neural networks. Here, “batch” refers to a group of data, usually called a “mini-batch,” that is used together in a single update step of Stochastic Gradient Descent (SGD)-based model training. Figure 2 shows the algorithm of BN.
As Figure 2 shows, BN is realized through four steps.
BN operations are used in the ResCrackU-net architecture, which is trained with a mini-batch size of 5 crack
images. Detailed model architecture and model training are discussed in Sections 3.3 and 4.2, respectively.
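The BN algorithm in the original BN paper58 consists of four steps: computing the mini-batch mean, computing the mini-batch variance, normalizing each value, and applying a learnable scale and shift. A minimal sketch for a mini-batch of five scalar activations is given below. The names `gamma`, `beta`, and `eps`, their default values, and the sample activations are illustrative assumptions, not values reported in this paper.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization of a mini-batch of scalar activations."""
    m = len(batch)
    # Step 1: mini-batch mean.
    mean = sum(batch) / m
    # Step 2: mini-batch variance.
    var = sum((x - mean) ** 2 for x in batch) / m
    # Step 3: normalize (eps avoids division by zero).
    normalized = [(x - mean) / math.sqrt(var + eps) for x in batch]
    # Step 4: scale and shift with the learnable parameters gamma and beta.
    return [gamma * x_hat + beta for x_hat in normalized]


# Hypothetical activations for a mini-batch of 5, the batch size used in this paper.
activations = [2.0, 4.0, 6.0, 8.0, 10.0]
out = batch_norm(activations)
out_mean = sum(out) / len(out)
out_var = sum((y - out_mean) ** 2 for y in out) / len(out)
print(out_mean, out_var)
```

With the default `gamma` and `beta`, the output has approximately zero mean and unit variance, which is the property that stabilizes the training of deeper networks.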
VGGNet was developed by the Oxford Visual Geometry Group (VGG),59 which belongs to the Robotics Research Group. VGGNet was shown to outperform earlier object detection network models owing to three changes:
1. Smaller convolutional filter size. A fixed filter size of 3 × 3 is used in place of larger filter sizes such as 5 × 5 and 7 × 7. This revision not only reduces the computational complexity but also provides more comprehensive interpretations of the input image;
2. Smaller pooling filter size. A fixed 2 × 2 pooling size is used instead of a larger size, which helps to maintain significant information extracted by convolution operations;
3. Deeper network layers. Since smaller convolution and pooling filter sizes are used, more network layers are employed.
Following the principles of VGGNet, this paper proposes VGGCrackU-net for pixel-level crack segmentation. The architecture of VGGCrackU-net is shown in Figure 3. It can be seen that VGGCrackU-net follows the “U” shape design principle, with a nearly symmetric encoder-decoder architecture. As Figure 3 shows, VGGCrackU-net includes 10 convolutional layers, 4 max-pooling layers, 4 up-sampling layers, and 4 concatenate operations. A fixed 3 × 3 filter size and a 2 × 2 max-pooling size are used. Hence, the side length of the highest-level feature map is 1/16 of that of the input image. In this research, we used 320 × 320 (pixel) as the input size, so the size of the highest-level feature map is 20 × 20 (pixel). A concatenate operation is conducted after each up-sampling to introduce the feature information obtained in the encoding procedure into the decoding process. Thus, more interpretative knowledge is included in the decoding procedure, which improves the efficiency of model training.
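The feature-map sizes along the “U” can be checked with a short calculation. The sketch below assumes that the 3 × 3 convolutions preserve the spatial size (an assumption about padding that is not stated for VGGCrackU-net), so that only the four 2 × 2 max-pooling layers and the four 2 × 2, stride-2 up-sampling layers change the size. The function name `encoder_decoder_sizes` is hypothetical.

```python
def encoder_decoder_sizes(input_size=320, levels=4, pool=2):
    """Side lengths of the feature maps along the encoder and the decoder."""
    encoder = [input_size]
    for _ in range(levels):
        # Each 2 x 2 max-pooling layer halves the side length.
        encoder.append(encoder[-1] // pool)
    decoder = []
    size = encoder[-1]
    for _ in range(levels):
        # Each 2 x 2, stride-2 transposed convolution gives (N - 1) * 2 + 2 = 2N.
        size = (size - 1) * 2 + 2
        decoder.append(size)
    return encoder, decoder


encoder, decoder = encoder_decoder_sizes()
print(encoder)  # side lengths after each pooling step
print(decoder)  # side lengths after each up-sampling step
```

Each decoder size matches an encoder size at the same level, which is what allows the concatenate operations to join encoder and decoder feature maps.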
Ever since researchers realized the huge benefits brought by deeper network structures, they have sought the best possible performance by designing extremely deep network architectures. However, they noticed two significant problems with deeper network architectures, which are
1. Gradient vanishing
2. Performance degradation
Gradient vanishing happens because the gradients shrink as they are propagated backward through successive layers. Thus, the loss minimization stagnates, causing the training to stop at a high error value. The degradation problem happens when the network is too deep (too many layers). Under this circumstance, the accuracy tends to saturate and then fall rapidly. This accuracy degradation is caused neither by gradient vanishing nor by overfitting; it arises because the network is too complicated. Thus, training an unconstrained stack of layers is incapable of achieving the desired error rate.
The gradient vanishing problem can be solved by batch normalization as explained in Section 3.1. However, the deg-
radation problem still exists because the model architecture has not been improved. Therefore, inspired by the
consideration that identity mapping might solve the degradation problem, He's team proposed a deep network characterized by the residual structure, called ResNet,60 which solved the problems of deeper network architectures. Hence, the residual structure is adopted in this research. Meanwhile, for the purpose of pixel-level segmentation, or semantic segmentation, the “encoder-decoder” character of the U-net architecture is utilized as well.
Hence, this research developed another deep architecture for pixelwise pavement crack segmentation, called ResCrackU-net, which integrates the strengths of ResNet and U-net. Figure 4 shows the structure of the residual unit; ResCrackU-net is formed by stacking multiple such units. The residual unit in Figure 4 is inspired by He's innovation.
The output of this residual unit can be calculated with the general form of Equation (1).
where $x_t$ and $x_{t+1}$ are the input and the output of the $t$th residual unit. ReLU is the activation function.
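As an illustration only, a residual unit can be sketched under the assumption that it follows the general form of He's ResNet,60 $x_{t+1} = \mathrm{ReLU}(x_t + F(x_t))$, where $F$ is the residual branch. The function `toy_branch` is a hypothetical stand-in for the stacked convolutional layers of Figure 4, and the exact placement of BN and ReLU inside the unit used in this paper may differ.

```python
def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def residual_unit(x_t, residual_branch):
    """Assumed general form: x_{t+1} = ReLU(x_t + F(x_t))."""
    f_x = residual_branch(x_t)
    # The identity shortcut adds the input back onto the branch output.
    return relu([x + f for x, f in zip(x_t, f_x)])


def toy_branch(x):
    """Hypothetical residual branch standing in for the convolutional layers."""
    return [-2.0 * v for v in x]


def zero_branch(x):
    """A branch that has learned nothing: the unit reduces to ReLU(x_t)."""
    return [0.0] * len(x)


x_next = residual_unit([1.0, -3.0, 0.5], toy_branch)
x_same = residual_unit([1.0, 2.0], zero_branch)
print(x_next, x_same)
```

When the residual branch outputs zero, a non-negative input passes through unchanged, which is the identity mapping that motivates the residual design.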
The architecture of ResCrackU-net is illustrated in Figure 5. It is characterized by 7-level residual units stacked in the “U” shape structure. Besides the encoder-decoder “U” shape architecture, which is similar to that of VGGCrackU-net introduced in the previous section, three other main features can be observed:
1. There are 7-level residual units, each containing 3 convolutional layers; including the last convolutional layer, the network has 22 convolutional layers. It is thus much deeper than VGGCrackU-net.
2. All convolution filters have a size of 3 × 3 with 1-pixel padding, which facilitates model training.
3. There are no pooling layers in ResCrackU-net. The pooling layers in traditional network architectures aim to reduce the size of feature maps. However, a large amount of information is lost through successive pooling operations. Therefore, 3 × 3 convolutions with 1-pixel padding and a 2-pixel stride are employed to achieve size reduction.
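The size reduction achieved by strided convolution, and the layer count given in the first item, can be verified with the standard convolution output-size formula $\lfloor (N + 2P - F)/S \rfloor + 1$. This formula is general knowledge about convolutional layers rather than something stated in this paper, and the function name `conv_output_size` is hypothetical.

```python
def conv_output_size(n, f=3, padding=1, stride=1):
    """Standard convolution output size: floor((N + 2P - F) / S) + 1."""
    return (n + 2 * padding - f) // stride + 1


# A 3 x 3 convolution with 1-pixel padding and stride 1 keeps the size.
same = conv_output_size(320, f=3, padding=1, stride=1)
# The same convolution with a 2-pixel stride halves the size, replacing pooling.
halved = conv_output_size(320, f=3, padding=1, stride=2)
# 7 residual units with 3 convolutional layers each, plus the last layer.
total_conv_layers = 7 * 3 + 1
print(same, halved, total_conv_layers)
```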
The cross-entropy function is used as the loss to evaluate the difference between the network output and the ground truth values. With the loss function defined, the models are trained with the Adam optimization algorithm to minimize the total loss. The cross-entropy loss function is defined in Equation (2).
$$J(\theta) = -\frac{1}{M}\sum_{i=1}^{M}\left[p_i \ln q_i + (1 - p_i)\ln(1 - q_i)\right] \quad (2)$$
where $J(\theta)$ denotes the loss function, $\theta$ denotes the parameters of the network architecture, $p_i$ denotes the ground-truth value of the $i$th pixel, $q_i$ denotes the network output value of the $i$th pixel, and $M$ denotes the total number of pixels in the image.
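A minimal sketch of the pixelwise cross-entropy loss of Equation (2) is given below for a flattened image. The clipping constant `eps` is an assumption added for numerical safety and is not part of the equation; the function name `cross_entropy_loss` and the sample values are illustrative.

```python
import math


def cross_entropy_loss(ground_truth, predicted, eps=1e-12):
    """Mean binary cross-entropy over the M pixels of a flattened image."""
    m = len(ground_truth)
    total = 0.0
    for p, q in zip(ground_truth, predicted):
        # Clip q away from 0 and 1 so that the logarithms stay finite (assumption).
        q = min(max(q, eps), 1.0 - eps)
        total += p * math.log(q) + (1 - p) * math.log(1 - q)
    return -total / m


# Hypothetical 4-pixel example: 1 = crack pixel, 0 = background pixel.
labels = [1, 0, 1, 0]
confident = cross_entropy_loss(labels, [0.9, 0.1, 0.8, 0.2])
uncertain = cross_entropy_loss(labels, [0.5, 0.5, 0.5, 0.5])
print(confident, uncertain)
```

A prediction of 0.5 for every pixel gives a loss of $\ln 2$, and predictions closer to the labels give a smaller loss, which is what the optimizer exploits during training.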
In the model training procedure, the Adam algorithm is selected to update the model parameters while minimizing the total loss.
The good performance of the Adam algorithm benefits from involving the second-moment estimate of the gradients when updating the parameters. To find the best parameters for the CNN model, investigations on the key parameters have been conducted. Crack segmentation accuracy and processing speed are used as performance assessment metrics, and the results are shown in Table 1.
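The role of the second-moment estimate can be seen in a minimal sketch of one Adam update for a single scalar parameter. The values `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the commonly used defaults of the Adam algorithm and are assumptions here, since the paper reports only the initial learning rate. The toy objective $f(\theta) = \theta^2$ and the larger learning rate of 0.1 are chosen purely for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at time step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    # The step is scaled by the square root of the second-moment estimate.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy problem (illustrative only): minimize f(theta) = theta^2 from theta = 1.
theta, m, v = 1.0, 0.0, 0.0
history = []
for t in range(1, 101):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
    history.append(theta)
print(history[0], history[-1])
```

In the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.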
According to the investigation results shown in Table 1, the proposed models use a fixed kernel size of 3 × 3 with stride = 1 and padding = 1. This choice is based on two considerations. (1) We designed the neural network architecture with the primary objective of achieving the best crack segmentation performance, especially minimizing the false positive and false negative errors. Thus, the model's capability of detecting minor crack information is critical, and we did not choose large values so as to avoid missing crack information. Moreover, cracks, especially minor cracks, are localized information, which is better captured by detectors of similar scale, such as smaller convolutional kernels. Meanwhile, a smaller kernel size introduces fewer parameters than large kernels and stride sizes, which reduces the computational complexity of the model. (2) We also designed the model architecture with the visual effect in mind, that is, the beauty and symmetry of the architecture. Once the functional requirements are met, we want our model to have the simplest architecture. This is another reason why we did not use overly complicated designs.
4 | IMPLEMENTATION DETAILS
The asphalt concrete pavement surface crack images used in this paper are collected by multiple data collection devices from different asphalt pavement surfaces to minimize the probability of model overfitting. As Figure 6 shows, pavement crack images are collected by handheld smartphones, action cameras, and automatic pavement monitoring systems.41 In the smartphone-based image collection, each crack image is captured by 3 trained operators, with one locating the correct crack position and the other two holding the smartphone while taking the pictures. In the second method, the action camera is mounted on the moving vehicle at a vertical distance of 1.3 m from the pavement surface. The action camera starts to record the pavement monitoring video when the condition survey begins. The automatic pavement condition detection system collects pavement surface images using a camera mounted at the rear of the vehicle at a vertical distance of 2 m from the pavement surface.
As Figure 6 shows, the input of Labelme is the collected pavement surface crack image. With this tool, the cracked regions of each image are manually labeled pixel by pixel. After this procedure, the labeled crack images, which are usually binary images with only cracked regions and the pavement background, are exported as ground truth files. For quality assurance, each image label has been double-checked by the researchers.
We used a Gocator 3100 sensor to conduct automatic pavement crack data collection. The main parameters and the corresponding values are shown in Table 2. Meanwhile, the crack image data collected in this research are described in Table 3. All these images are labeled at the pixel level for model training and testing.
The proposed models are implemented in Keras,61 which is one of the most popular deep learning programming platforms. The models are trained by mini-batch stochastic gradient descent with the Adam optimization algorithm, minimizing the loss function. Five hundred asphalt pavement crack images collected from multiple sources are manually labeled using the labeling tool Labelme. Those crack images are randomly divided into a training set, a validation set, and a test set according to the ratio of 3:1:1. Theoretically, the proposed models can deal with input images of arbitrary size. However, considering the computational burden on the GPU, the size of the input images is fixed to 320 × 320 (pixel).
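The random 3:1:1 division of the 500 images can be sketched as follows. The file names, the fixed random seed, and the function name `split_dataset` are hypothetical; the paper does not describe how the random division was implemented.

```python
import random


def split_dataset(items, ratios=(3, 1, 1), seed=0):
    """Randomly split items into training, validation, and test sets."""
    items = list(items)
    # A fixed seed (an assumption) makes the random split reproducible.
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 500 labeled crack images.
image_ids = [f"crack_{i:03d}.png" for i in range(500)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))
```

With 500 images and a 3:1:1 ratio, this yields 300 training, 100 validation, and 100 test images with no image shared between the sets.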
As noted, the training images with the corresponding label images are randomly selected from the prepared dataset.
Then, the “original image-labeled image” pairs are fed into the network models to train them to learn parameters
through repetitive loss minimizations. The models are trained with a mini-batch size of 5 crack images. Other batch sizes can be selected depending on the GPU capability; for example, some scholars used a mini-batch size of 8 to achieve shorter training time with larger onboard memory. The learning rate is initialized to 0.0001 and adaptively updated using the Adam optimization algorithm. Moreover, to avoid the early stagnation of the gradient descent operations, the learning rate is reduced to $10^{-1}$ of the original value if the loss stops decreasing for 10 consecutive epochs.
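The learning-rate reduction rule can be sketched as a small plateau scheduler. This is an illustrative pure-Python sketch, not the authors' code: Keras offers a `ReduceLROnPlateau` callback with `factor` and `patience` arguments that behaves similarly, but the paper does not state whether it was used. The sketch assumes that each reduction multiplies the current learning rate by 0.1; the class name `PlateauScheduler` and the loss values are hypothetical.

```python
class PlateauScheduler:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr=1e-4, factor=0.1, patience=10):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def step(self, loss):
        """Update the scheduler with the loss of one epoch and return the learning rate."""
        if loss < self.best:
            # The loss improved: remember it and reset the waiting counter.
            self.best = loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                # No improvement for `patience` consecutive epochs: reduce the rate.
                self.lr *= self.factor
                self.wait = 0
        return self.lr


# Hypothetical loss curve: three improving epochs followed by a 10-epoch plateau.
scheduler = PlateauScheduler()
losses = [1.0, 0.8, 0.7] + [0.7] * 10
rates = [scheduler.step(loss) for loss in losses]
print(rates[0], rates[-1])
```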
Based on the above strategy, training and validation are conducted to compare the proposed VGGCrackU-net and ResCrackU-net with the widely used FCN and PSPnet. To outline the effect of the number of training epochs on the performance, two training periods, the first period (first 100 epochs) and the last period (last 100 epochs), are observed separately. The curves of the training loss, validation loss, and learning rate in those two training periods are shown in Figures 7 and 8. In both figures, the rows from first to last show the results of FCN, PSPnet, VGGCrackU-net, and ResCrackU-net, respectively. The overall trends show that the results of the last training period (Figure 8) outperform those of the first training period (Figure 7). Meanwhile, the performance of ResCrackU-net and VGGCrackU-net appears comparable, and both outperform FCN and PSPnet.
FIGURE 7 Train loss, validation loss, and learning rate of the first period
FIGURE 8 Train loss, validation loss, and learning rate of the last period
Model performance metrics provide straightforward conclusions on the performance levels of various approaches. Hence, three overall evaluation metrics are employed to measure the prediction efficiency, pixelwise similarity, and average error of those methods. Those metrics are defined in Equations (3)–(5).
$$F_1 = \frac{2PR}{P + R} \tag{3}$$

$$J_{crd} = J(I, I') = \frac{|I \cap I'|}{|I \cup I'|} \tag{4}$$

where $J_{crd}$, that is, $J(I, I')$, denotes the Jaccard similarity coefficient of image matrices $I$ and $I'$, which represent the predicted image matrix and the ground truth image matrix, respectively; $\cap$ represents the intersection operation and $\cup$ represents the union operation.
$$\mathrm{Ave.loss} = \frac{\sum_{i} \left| I_i - I'_i \right|}{M} \tag{5}$$

where Ave.loss denotes the average loss value, $M$ denotes the total number of pixels in the crack image, and $I_i$ and $I'_i$ denote the values of the $i$-th pixel in the predicted and the ground truth images, respectively.
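The three metrics can be computed for a pair of binary masks as in the sketch below. It assumes that P and R in Equation (3) are the pixel-level precision and recall of the crack class, and that masks are stored as nested lists with 1 for crack pixels and 0 for background. The function name `crack_metrics` and the toy masks are illustrative only.

```python
def crack_metrics(pred, truth):
    """Compute F1, Jaccard similarity, and average loss for two binary masks.

    pred and truth are equally sized 2D lists: 1 marks a crack pixel, 0 background.
    """
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    tp = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(p, t) if a == 1 and b == 0)
    fn = sum(1 for a, b in zip(p, t) if a == 0 and b == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    union = tp + fp + fn  # pixels marked as crack in either mask
    jaccard = tp / union if union else 1.0
    ave_loss = sum(abs(a - b) for a, b in zip(p, t)) / len(p)
    return f1, jaccard, ave_loss


pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]]
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0]]
f1, jaccard, ave_loss = crack_metrics(pred, truth)
print(round(f1, 3), round(jaccard, 3), round(ave_loss, 3))  # 0.75 0.6 0.167
```

In the toy example, three crack pixels match, one is spuriously predicted, and one is missed, so the Jaccard coefficient is 3/5 and the average loss is 2/12.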
Table 4 shows the calculated metrics of the methods used in the comparative analysis for both the first and the last 100 epochs. The following conclusions can be drawn from this table:
1. The overall pixelwise crack detection performance of the compared methods follows the sequence ResCrackU-net > VGGCrackU-net > CrackU-net > Unet > PSPnet > FCN, from higher to lower;
2. Both proposed models, ResCrackU-net and VGGCrackU-net, achieve satisfactory overall performance and significantly outperform the other methods;
3. For each individual method, a proper increase in the number of training epochs can improve the crack detection performance.
The FCN model is considered the beginning of deep learning-based pixelwise segmentation. Specifically, the main difference between the FCN model and traditional models is that FCN achieved an end-to-end, pixels-to-pixels network architecture by converting the fully connected layers of traditional models to convolutional layers. Moreover, it applied deconvolution to the feature maps of the third and the fourth convolutional layers. However, the segmentation results are not accurate, because the deconvolution operations are not fine enough to capture small objects. Meanwhile, the FCN model does not consider the contextual relationship of the object. All these facts led to the less satisfactory performance of the model. The PSPnet model considered the hierarchical global prior and the multi-scale information differences between sub-regions. In other words, PSPnet combined the features of 4 different pyramid scales and also concatenated information before the pooling operation. With these optimizations, PSPnet performs much better than the FCN model. Inspired by the architectures of FCN and PSPnet, the models proposed in this paper, that is, VGGCrackU-net and ResCrackU-net, fuse VGG and ResNet with a newly designed U-shaped part, which considers both pyramid scales and the contextual relationship among regions, and also achieves end-to-end segmentation. These are the main differences between FCN, PSPnet, and the models proposed in this paper.
5 | TESTING AND DISCUSSION
To compare the performance of the proposed models with that of existing representative segmentation models, namely FCN, PSPnet, Unet, and CrackU-net, a 10-fold cross-validation has been conducted on the whole dataset, that is, the combined training, validation, and test datasets. The performance of FCN, PSPnet, Unet, CrackU-net, VGGCrackU-net, and ResCrackU-net is compared, and the results are shown in Table 5. The comparison indicates that the proposed methods significantly outperform the other models.
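A 10-fold split over the whole dataset, together with per-model summary statistics, might look like the sketch below. The helper names are hypothetical. Reading "COV" as the coefficient of variation (standard deviation divided by mean, here using the population standard deviation) is an assumption, since the text does not define it. The per-fold scores in the example are made-up placeholders, not results from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, held_out_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        held_out = folds[i]
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, held_out


def mean_and_cov(scores):
    """Return the mean and coefficient of variation of per-fold scores."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, (variance ** 0.5) / mean


splits = list(k_fold_indices(500))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 450 50

# Placeholder per-fold scores, for illustration only.
mean, cov = mean_and_cov([0.8, 1.0])
print(round(mean, 3), round(cov, 3))
```

Each image is held out exactly once, so every model is evaluated on all 500 images across the ten folds.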
Metrics
Model           Mean  COV   Max   Min  | Mean  COV   Max   Min  | Mean  COV   Max   Min  | Mean  COV   Max   Min
FCN             0.89  0.08  0.91  0.86 | 0.89  0.05  0.90  0.86 | 0.86  0.06  0.90  0.84 | 0.85  0.07  0.90  0.82
PSPnet          0.89  0.06  0.92  0.88 | 0.91  0.04  0.91  0.92 | 0.89  0.09  0.90  0.85 | 0.87  0.05  0.89  0.86
Unet            0.88  0.07  0.90  0.88 | 0.85  0.06  0.90  0.86 | 0.87  0.07  0.90  0.82 | 0.86  0.06  0.89  0.83
CrackU-net      0.90  0.05  0.93  0.89 | 0.90  0.04  0.91  0.88 | 0.87  0.06  0.92  0.85 | 0.87  0.05  0.91  0.86
VGGCrackU-net   0.95  0.03  0.96  0.92 | 0.91  0.03  0.91  0.89 | 0.92  0.04  0.95  0.90 | 0.91  0.05  0.94  0.90
ResCrackU-net   0.96  0.02  0.95  0.94 | 0.96  0.03  0.94  0.93 | 0.95  0.05  0.97  0.94 | 0.95  0.02  0.96  0.94

Figure 9 shows some representative testing results of asphalt pavement crack detection using the proposed methodologies. The first row shows the original crack images in the testing database. The second and third rows show the corresponding pixelwise crack detection results of VGGCrackU-net and ResCrackU-net, respectively. ResCrackU-net can extract complete crack regions even under low-quality background conditions, as observed in the third row of Figure 9. VGGCrackU-net provides acceptable crack detection but may lose some minor cracking information, as observed in the second row of Figure 9. This is likely caused by the pooling layers, which tend to discard many informative features of the crack images.
Some significant errors are also observed in the crack detection results. Those errors can be classified into two categories: false-positive errors and false-negative errors. Figures 10 and 11 show some examples of those errors, where the false-positive errors are marked by rectangles and the false-negative errors are marked by circles. To outline the effectiveness and errors of the methods on different types of cracks, as well as to verify the strengths of the proposed methods against FCN and PSPnet, comparative results for both linear and complex cracks are shown in Figures 10 and 11. Cross-comparing the results in these two figures, the following findings can be drawn:
1. Comparing FCN and PSPnet with VGGCrackU-net and ResCrackU-net, the latter two methods clearly perform better for both linear and complex cracks. Thus, the errors of FCN and PSPnet are not marked in the figures.
2. Generally, more errors are observed in the detection of complex cracks than of linear cracks, which is reasonable since complex cracks contain many minor or hairline cracks whose features are hard for the models to learn;
3. More false-negative errors than false-positive errors are observed in flexible pavement crack images. As Figures 10 and 11 show, false-negative errors (circled regions) usually lead to incomplete crack fragments, which affect the correctness of the quantitative crack analysis. Apart from the results of FCN and PSPnet, which exhibit significant errors, VGGCrackU-net shows more false-negative errors than ResCrackU-net, as seen by comparing the last two rows of Figures 10 and 11. The main reason is likely that successive pooling operations lead to the loss of minor crack features. Thus, hairline or minor cracks that closely resemble the road background cannot be interpreted by the model.
4. The false-positive errors in flexible pavement cracks are mainly caused by the noise contained in the corresponding images. This problem can be mitigated by conducting effective quality assurance during data collection or by applying reliable noise-filtering pre-processing operations.
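The two error categories in the findings above can be made concrete at the pixel level. The sketch below labels every pixel of a predicted mask against its ground truth and counts each category. It assumes binary masks stored as nested lists, and the helper names and toy masks are hypothetical rather than taken from the paper.

```python
def error_map(pred, truth):
    """Label each pixel 'TP', 'FP', 'FN', or 'TN' for two binary masks."""
    labels = {(1, 1): "TP", (1, 0): "FP", (0, 1): "FN", (0, 0): "TN"}
    return [[labels[(p, t)] for p, t in zip(prow, trow)]
            for prow, trow in zip(pred, truth)]


def count_errors(pred, truth):
    """Count how many pixels fall into each category."""
    flat = [lab for row in error_map(pred, truth) for lab in row]
    return {name: flat.count(name) for name in ("TP", "FP", "FN", "TN")}


# Toy case: the main crack (column 0) is found, a short branch at (1, 1)
# is missed (false negative), and an isolated noise pixel at (1, 3) is
# reported as crack (false positive).
pred = [[1, 0, 0, 0],
        [1, 0, 0, 1],
        [1, 0, 0, 0]]
truth = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [1, 0, 0, 0]]
counts = count_errors(pred, truth)
print(counts)  # {'TP': 3, 'FP': 1, 'FN': 1, 'TN': 7}
```

Counting the categories separately shows whether a model mainly breaks cracks into fragments (many false negatives) or mainly responds to noise (many false positives).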
As mentioned above, false-negative errors are the key problem for both VGGCrackU-net-based and ResCrackU-net-based flexible pavement crack detection. False-positive errors are also observed, but they are no more significant than false-negative errors, and potential remedies are straightforward to identify.
To support this analysis, the impact of successive pooling layers has been investigated. The results are shown in Figure 12. It can be observed that adding more pooling layers dramatically reduces the segmentation accuracy, especially for minor cracks.
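A toy experiment on a binary mask gives some intuition for why successive pooling hurts thin cracks. It is not the experiment behind Figure 12 and does not involve learned feature maps. It only shows that once a one-pixel-wide line has passed through several 2 × 2 max-pooling steps, nearest-neighbour up-sampling can no longer recover its original width. All function names and the 16 × 16 mask are illustrative assumptions.

```python
def max_pool_2x2(mask):
    """2x2 max pooling with stride 2 on a 2D list whose sides are even."""
    return [[max(mask[r][c], mask[r][c + 1], mask[r + 1][c], mask[r + 1][c + 1])
             for c in range(0, len(mask[0]), 2)]
            for r in range(0, len(mask), 2)]


def upsample_2x(mask):
    """Nearest-neighbour up-sampling by a factor of 2."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def jaccard(a, b):
    """Jaccard similarity of two binary masks of equal size."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 1.0


def pool_and_restore(mask, levels):
    """Apply `levels` pooling steps, then the same number of up-sampling steps."""
    out = mask
    for _ in range(levels):
        out = max_pool_2x2(out)
    for _ in range(levels):
        out = upsample_2x(out)
    return out


# 16x16 mask with a one-pixel-wide vertical hairline crack in column 5.
size = 16
hairline = [[1 if c == 5 else 0 for c in range(size)] for _ in range(size)]
scores = [round(jaccard(pool_and_restore(hairline, k), hairline), 3)
          for k in range(4)]
print(scores)  # [1.0, 0.5, 0.25, 0.125]
```

In this toy setting the Jaccard similarity halves with each pooling level, because the restored crack is 2, 4, and then 8 pixels wide instead of 1.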
6 | CONCLUSIONS
Considering the advanced development and successful applications of deep learning techniques in several fields, as well as the urgent need of the private and public sectors for highly efficient and robust pixel-perfect pavement crack detection, this research addressed pixelwise asphalt pavement crack detection through deep neural network modeling. The raw crack image datasets were collected by smartphones, action cameras, and automatic pavement condition monitoring systems from several flexible pavements within 2 years. Then, the images were labeled using the pixelwise labeling tool Labelme, which is a widely accepted labeling approach in the literature. With the well-prepared crack images and the label files, two state-of-the-art automatic crack detection deep network models are
constructed. Then, the models are trained and tested on an Ubuntu system equipped with an Nvidia 1080Ti GPU (11 GB memory) using the Keras platform. The Python version is 3.5.
The first model is called VGGCrackU-net, which integrates the strengths of the VGG network and U-net. This network is composed of 10 convolutional layers, 4 pooling layers, 4 up-sampling layers, and 4 concatenate operations. The second model is called ResCrackU-net, which is inspired by the ideas of ResNet and U-net. This network includes a total of 22 convolutional layers. The key structure of ResCrackU-net is the 7-level residual units, which are hierarchically connected following the encoder-decoder “U” shape for feature extraction and semantic segmentation. A 3 × 3 kernel size is used for all the convolutional layers. However, the feature size reductions are achieved by max-pooling operations in VGGCrackU-net, whereas ResCrackU-net uses stride-2 3 × 3 convolutional layers to reduce the size of the feature maps. Meanwhile, batch normalization is used in ResCrackU-net to avoid the vanishing gradient phenomenon. Thus, ResCrackU-net exhibits better pixelwise crack detection performance than VGGCrackU-net. However, the advantage of VGGCrackU-net is that it achieves acceptable performance with fewer parameters and a simpler network structure than ResCrackU-net. The effectiveness of VGGCrackU-net and ResCrackU-net is compared with that of FCN and PSPnet for flexible pavement crack detection. The experimental results clearly show outstanding crack detection
performances of VGGCrackU-net and ResCrackU-net for all types of cracks, which can be observed from both the test-
ing images and the F1-measure values (higher than 0.90).
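The two down-sampling strategies mentioned above, max-pooling and stride-2 convolution, shrink the feature maps at the same rate, which the standard output-size formula makes explicit. The sketch below traces a 320 × 320 input through four halving stages. The 2 × 2 pooling window and the padding of 1 for the 3 × 3 convolution are assumptions, and four stages are used for both variants only for a side-by-side comparison, since this section does not state the number of stride-2 stages in ResCrackU-net.

```python
def conv_output_size(n, kernel, stride, padding):
    """Spatial size after a convolution or pooling layer (floor formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_sizes(n, kernel, stride, padding, stages):
    """Return the feature-map side length before and after each stage."""
    sizes = [n]
    for _ in range(stages):
        n = conv_output_size(n, kernel, stride, padding)
        sizes.append(n)
    return sizes


# Max-pooling with an assumed 2x2 window and stride 2, four stages.
pooled = trace_sizes(320, kernel=2, stride=2, padding=0, stages=4)
# 3x3 convolution with stride 2; the padding of 1 is an assumption.
strided = trace_sizes(320, kernel=3, stride=2, padding=1, stages=4)
print(pooled)   # [320, 160, 80, 40, 20]
print(strided)  # [320, 160, 80, 40, 20]
```

Both variants halve the resolution at every stage. The difference is that a strided convolution has learnable weights for the reduction, whereas max-pooling applies a fixed rule.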
However, it should be noted that even though significant improvements have been made in crack detection accuracy compared with traditional methods, these methods have limitations. Firstly, significant errors are observed in both network models. The critical errors are the false-negative and false-positive errors observed in detecting flexible pavement images with many minor cracks. Secondly, most of the crack images used in this research have relatively good quality and do not contain oil, shadows, and so on. Therefore, these models may fail on poor-quality crack images collected under poor road or environmental conditions. Another potential limitation is that the models may exhibit different results and performance when detecting PCC pavement cracks. Therefore, these limitations should be the focus of future work to meet the requirements of real-life applications.
AUTHOR CONTRIBUTIONS
Ju Huyan contributed to the conceptualization, methodology, investigation, data curation, experiments, original draft writing, revision of the paper, and validation. Tao Ma contributed to the conceptualization, methodology, supervision, validation, revision, and editing. Wei Li contributed to the supervision, method discussion, review, and editing. Handuo Yang contributed to the method discussion, review, and editing. Zhengchao Xu contributed to the method discussion and data collection of this paper.
ORCID
Tao Ma https://orcid.org/0000-0002-7963-9370
Wei Li https://orcid.org/0000-0003-4508-3076
How to cite this article: Huyan J, Ma T, Li W, Yang H, Xu Z. Pixelwise asphalt concrete pavement crack
detection via deep learning-based semantic segmentation method. Struct Control Health Monit. 2022;29(8):e2974.
doi:10.1002/stc.2974