Tuning of Data Augmentation Hyperparamet

This paper presents a methodology for tuning data augmentation hyperparameters in deep learning for building construction image classification, particularly focusing on vegetation recognition. The study utilized Logistic Regression to evaluate the performance of Convolutional Neural Networks trained on 128 combinations of image transformations, achieving accuracies of 95.6% and 93.3% in two case studies. The recommended hyperparameters include height and width shift ranges of 0.2 and a zoom range of 0.2, demonstrating significant improvements in classification accuracy with small datasets.

International Journal of Machine Learning and Cybernetics

https://doi.org/10.1007/s13042-022-01555-1

ORIGINAL ARTICLE

Tuning of data augmentation hyperparameters in deep learning to building construction image classification with small datasets
André Luiz C. Ottoni1,2 · Raphael M. de Amorim2 · Marcela S. Novo3 · Dayana B. Costa4

Received: 24 May 2021 / Accepted: 22 March 2022


© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022

Abstract
Deep Learning methods have important applications in the building construction image classification field. One challenge of this application is the adoption of Convolutional Neural Networks with small datasets. This paper proposes a rigorous methodology for tuning Data Augmentation hyperparameters in Deep Learning for building construction image classification, especially vegetation recognition on facades and roof structure analysis. To that end, Logistic Regression models were used to analyze the performance of Convolutional Neural Networks trained with 128 combinations of image transformations. Experiments were carried out with three Deep Learning architectures from the literature using the Keras library. The results show that the recommended configuration (Height Shift Range = 0.2; Width Shift Range = 0.2; Zoom Range = 0.2) reached an accuracy of 95.6% in the test step of the first case study. In addition, the hyperparameters recommended by the proposed method also achieved the best test result for the second case study: 93.3%.

Keywords Deep learning · Convolutional neural networks · Hyperparameter tuning · Data augmentation · Building
construction image classification

* André Luiz C. Ottoni
[email protected]
Raphael M. de Amorim
[email protected]
Marcela S. Novo
[email protected]
Dayana B. Costa
[email protected]

1 Technologic and Exact Center, Federal University of Recôncavo da Bahia, Cruz das Almas, Brazil
2 Electrical Engineering Graduate Program, Federal University of Bahia, Salvador, Brazil
3 Department of Electrical and Computer Engineering, Federal University of Bahia, Salvador, Brazil
4 Department of Structural and Construction Engineering, Federal University of Bahia, Salvador, Brazil

1 Introduction

Deep Learning methods have important applications in the Digital Image Processing field [4, 23, 42, 47]. One possible application area of Deep Learning is building construction [8, 10, 14, 41, 53]. The recent literature reports several applications in this research field, such as: crack detection [8, 54], road crack classification [53], safety guardrail detection [22], structural damage recognition [11], safety helmet detection [41], safety harness detection [10], classification of rock fragments [50], damage detection of a steel bridge [1], tunnel lining defects [49] and facade defects classification [14].

Deep Learning methods can also be applied to the recognition of vegetation in images of building facades [32]. In fact, the growth of biological manifestations on building facades may indicate the deterioration and degradation of constructions [2, 24]. In addition, the detection of this pathology in inspection images can assist in the conservation of historic buildings [7, 21, 24, 37]. In this sense, [32] proposes a Deep Learning approach for recognizing vegetation in buildings.

Another possibility is to use Deep Learning for the analysis of roof structures [33]. In the literature, there are several examples of works that investigated the efficiency of roof structures [5, 12, 33, 43]. For example, in a recent study, [33] proposes a methodology for tuning two hyperparameters (learning rate and optimizer) of Neural Networks in building roof image classification. It is also worth noting that one of the relevant factors in [32] and [33] was the experimentation with Data Augmentation. [32] verified the improvement

in validation accuracy when using Data Augmentation to increase the training database.

In fact, Data Augmentation techniques play an important role in the application of Machine Learning to small datasets [4, 9, 42, 51, 52]. This is because the generation of artificial images directly contributes to increasing the generalization capacity of the Deep Learning model and thus decreasing the chance of overfitting [4, 9]. In this respect, one of the challenges of using Data Augmentation is defining which transformations (such as zoom, rotation, flip) will be applied to the images [6, 28, 34, 44, 48]. In terms of Machine Learning, this problem can be treated as one of Hyperparameter Tuning [19, 20, 27, 30, 31, 39, 40].

In the literature, some studies have analyzed the influence of Data Augmentation hyperparameter combinations in different applications, such as: plant classification [34], transmission line inspection [44] and the covid-19 diagnostic process in chest X-ray radiological imaging [28]. In [48], different types of Data Augmentation methods were analyzed for crack detection in constructions. However, the literature lacks proposals to optimize the combinations of Data Augmentation hyperparameters for the application of Deep Learning in building construction image classification, especially in the recognition of vegetation on building facades and the classification of roof defects.

The objective of this paper is to propose a rigorous methodology for tuning Data Augmentation hyperparameters in Deep Learning for building construction image classification with small datasets. For this, two case studies are observed: vegetation recognition in facades [32] and roof structure analysis [33]. To that end, Logistic Regression models [16] are used to analyze the performance of Convolutional Neural Networks (CNN) [4, 9] trained with 128 combinations of image transformations. For comparison purposes, three CNN architectures from the literature are also adopted: MobileNet [17], DenseNet-121 [18] and CNN8 [32].

This paper is organized into five sections. Section 2 presents theoretical concepts of CNNs and Logistic Regression. Section 3 presents the proposed methodology. Sections 4 and 5 describe the results and conclusions, respectively.

2 Theoretical foundation

2.1 Convolutional neural networks

Convolutional neural networks (CNNs) are Deep Learning methods that have been widely studied in the computer vision field [4, 9, 15, 23]. One of the main factors that make CNNs a relevant Machine Learning technique is the ability to automatically extract features from processed images [9]. In addition, another important point is the use of layers and elements with different functionalities in the network architecture, such as [4, 9]:

– Input layer: receives input signals (e.g., an image).
– Weights: adjusted during the training process (trainable parameters).
– Convolutional filters (kernels): have a set of weights, according to their size. For example, if the kernel size is 3 × 3, then the filter contains 9 trainable weights.
– Activation function: transforms a signal into a limited output. Some examples are the ReLU, softmax and sigmoid functions.
– Convolutional layer: applies the convolution operation between the filters and the input matrix of the layer. As an output, new matrices are generated (feature maps), according to the number of kernels in the layer.
– Pooling: applies a transformation to decrease the dimensions of the input matrix. For this, statistical functions can be used: maximum (max) or average (avg).
– Flatten: transforms the matrices resulting from convolutional operations into a single vector.
– Dropout: randomly disconnects a set of neurons at each training step.
– Fully connected layers: similar to the structures of traditional Artificial Neural Networks, in which all neurons of adjacent layers are connected.
– Output layer: shows the output of the CNN, such as a binary classifier neuron.

Thus, in view of the complexity of CNN architectures, an important factor is the use of tools for efficient implementation [4, 9]. In this line, it is worth mentioning the Keras library [4]. Keras is available through an R interface1 and in the Python language for the development of Deep Learning applications. In addition, it allows execution on CPU or GPU. Another relevant factor is the simplicity of using Data Augmentation methods. For these reasons, the Keras library was adopted in this work, as described in Sect. 3.

1 https://keras.rstudio.com/.

2.2 Logistic regression

Methods based on Linear Regression (simple and multiple) aim to model a continuous output from one or more independent variables [29]. On the other hand, Logistic Regression is a technique for the analysis of categorical data [13, 16]. In Logistic Regression, the response variable is binary (dichotomous).

The Logistic Regression model can be represented by Eq. (1) [13]:

$$p(x) = \frac{\exp\left(\beta_0 + \sum_{k=1}^{m} \beta_k x_k\right)}{1 + \exp\left(\beta_0 + \sum_{k=1}^{m} \beta_k x_k\right)} \qquad (1)$$

where $\beta_0$ and $\beta_k$ ($k = 1, \dots, m$) are the coefficients of the regression model; $x_k$ are the independent variables; and $p(x)$ is the probability of success. Thus, if $p(x)$ is the probability of an event occurring, then the expression $1 - p(x)$ represents the probability of the event not occurring. The ratio between $p(x)$ and $1 - p(x)$ is called the chance, or odds (Eq. (2)) [13]:

$$\text{chance} = \frac{p(x)}{1 - p(x)} \qquad (2)$$

In this line, the natural (Napierian) logarithm of the chance provides a linear model, according to Eq. (3) [13]:

$$\ln\left[\frac{p(x)}{1 - p(x)}\right] = \beta_0 + \sum_{k=1}^{m} \beta_k x_k \qquad (3)$$

This equation is called the logit and is a linearized form of the Logistic Regression model.

3 Methodology

3.1 Database of the first case study

In this study, the database presented by [32] was used for training, validation and testing of Convolutional Neural Networks in the recognition of vegetation on building facades. The dataset has 390 images, divided into two classes:

1. Class 0: without vegetation on the building's facade.
2. Class 1: with vegetation on the building's facade.

According to [32], the images of the training and validation datasets were defined from photographs adapted from The Zurich Urban Micro Aerial Vehicle Dataset [26]. These images were recorded in 2015 by an unmanned aerial vehicle (UAV) over the urban streets of Zurich (Switzerland). On the other hand, the images of the test dataset were selected from the website Pixabay.2 The dataset analyzed during the current study is available from the corresponding author on request or at the web link:

– drive.google.com/file/d/1l6KA80mZdKqxlpfpenIH57mCYESL3uyq/

2 www.pixabay.com.

Figures 1 and 2 present examples of images from the database of classes 0 and 1, respectively.

Fig. 1 Examples of images of the class 0 - without vegetation on the facade

The database (390 images) was partitioned following the same structure proposed by [32]:

– Training (250 images): 125 images in class 0 (without vegetation) and 125 images in class 1 (with vegetation).
– Validation (50 images): 25 images in class 0 (without vegetation) and 25 images in class 1 (with vegetation).
– Test (90 images): 45 images in class 0 (without vegetation) and 45 images in class 1 (with vegetation).

3.2 Data augmentation

Data preprocessing is an important stage in the Machine Learning field [9]. This is because this step can decrease the learning complexity and improve the accuracy results

[9]. In this line, Data Augmentation techniques can be used in the training of CNNs [4, 9, 42, 51, 52].

Fig. 2 Examples of images of the class 1 - with vegetation on the facade

Data Augmentation is an approach applied mainly to learning from small datasets [4, 42]. In this sense, Data Augmentation methods generate more data for training from the existing images [4]. The aim is to increase the CNN model's ability to generalize and to avoid overfitting [4, 9]. For this, artificial images are created from random transformations of the original data [4]. Zoom, rotation and flip are some examples of possible transformations for the generation of augmented images [9].

In this paper, Data Augmentation methods were applied with the Keras library in the R software [4]. For this, the image_data_generator() method was adopted [4]. The function image_data_generator() generates batches of data with new modified images from the original data. In this regard, the following random transformations3 were used to increase the training data [4]:

– Rotation range: an integer number that defines the degree range for random rotations. Rotation is a circular movement around a fixed point. The processed images will have random rotations within a predefined range of degrees, according to the input value.
– Horizontal flip: if this input is "true", then the images will be randomly mirrored in the horizontal direction (left-right).
– Vertical flip: if this input is "true", then the images will be randomly mirrored in the vertical direction (top-bottom).
– Shear range: distorts the image along an axis to create or rectify perception angles. There are two shear transformations: X-shear, which shifts the X coordinate values, and Y-shear, which shifts the Y coordinate values.
– Width shift range: shifts the image randomly to the left or to the right (horizontal shifts). If the value is a float ≤ 1, it is taken as a fraction of the total width. For example, for an image 100 pixels wide and width_shift_range = 1.0, the image is shifted randomly between −100% and 100%, that is, between −100 px and 100 px. Positive values shift the image to the right and negative values shift it to the left.
– Height shift range: shifts the image randomly up or down (vertical shifts). If the value is a float ≤ 1, it is taken as a fraction of the total height. For example, for an image 100 pixels high and height_shift_range = 1.0, the image is shifted randomly between −100% and 100%, that is, between −100 px and 100 px. Positive values shift the image upward and negative values shift it downward.
– Zoom range: randomly zooms the image, adding new pixel values where needed. It can be specified as a single float (the zoom fraction) or as a range given by an array. For example, if zoom_range = 0.4, the range will be [0.6, 1.4], that is, from 60% (zoom in) to 140% (zoom out).

3 https://keras.rstudio.com/reference/.

Figure 3 presents examples of images generated by Data Augmentation with the Keras library.

The number of artificially generated images depends on the training settings: batch_size, steps_per_epoch and epoch. For example, in this study, these parameters were defined in the first phase of experiments as: batch_size = 32; steps_per_epoch = 100; and epoch = 10. Thus, around 32,000 new images (32 × 100 × 10) were randomly generated for training in each simulation. This value is more than 100 times greater than the number of original photographs for training (250).
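The arithmetic behind these settings can be checked with a short sketch. It is written in Python with the standard library only (the paper itself works in R), and the helper name `augmentation_summary` and its argument names are hypothetical, introduced only for illustration. It assumes, as described above, that a fractional shift range is a fraction of the image size and that a single zoom value z gives the interval [1 − z, 1 + z]; the 50-pixel image size is the network input size reported in Sect. 3.3.

```python
def augmentation_summary(batch_size, steps_per_epoch, epochs, n_original,
                         image_size_px, shift_range, zoom_range):
    """Illustrative helper (not part of Keras): arithmetic of Sect. 3.2."""
    # One batch of augmented images is produced per training step.
    generated = batch_size * steps_per_epoch * epochs
    # A fractional shift range is interpreted relative to the image size.
    max_shift_px = shift_range * image_size_px
    # A single zoom value z is interpreted as the interval [1 - z, 1 + z].
    zoom_interval = (1.0 - zoom_range, 1.0 + zoom_range)
    return {
        "generated_images": generated,
        "ratio_to_original": generated / n_original,
        "shift_px": (-max_shift_px, max_shift_px),
        "zoom_interval": zoom_interval,
    }


# First-phase settings reported in the text, with the recommended 0.2 ranges.
summary = augmentation_summary(
    batch_size=32, steps_per_epoch=100, epochs=10, n_original=250,
    image_size_px=50, shift_range=0.2, zoom_range=0.2,
)
for key, value in summary.items():
    print(key, value)
```

With these inputs the sketch reproduces the 32,000 generated images quoted above, which is 128 times the 250 original training photographs.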
Fig. 3 Examples of images generated by Keras data augmentation: a original image; b–d rotation range (R = 40); e horizontal flip (H = TRUE); f vertical flip (V = TRUE); g and h height shift range (He = 0.2); i shear range (S = 0.2); j–l width shift range (W = 0.2); m–p zoom range (Z = 0.2)

3.3 Neural network architectures

In this paper, three CNN architectures were adopted: CNN-8 [32], DenseNet-121 [18] and MobileNet [17]. These structures (or variations of them) have recently been discussed in papers on building construction image processing with deep learning, in tasks such as: crack detection (DenseNet) [46], structural health monitoring (FC-DenseNet) [38], safety helmet detection (SSD MobileNet) [41], road damage detection (SSD MobileNet) [25] and recognition of vegetation in buildings (CNN-8) [32].

In this study, the CNN architectures were used for binary classification, for example, class 0 (without vegetation on the building's facade) and class 1 (with vegetation on the building's facade) [32]. For this, the keras_model_sequential() method in the Keras library was used [4], as described below:

– CNN-8: CNN architecture used by [32] for vegetation image recognition in buildings. The structure has 8 layers and 3,985,345 trainable parameters. In addition, this architecture is based on a model proposed by [4], originally with 12 layers.
– DenseNet-121: the Dense Convolutional Network is an architecture proposed by [18]. This structure is characterized by connecting each layer to all other layers (dense connection). Moreover, it has 7,479,169 trainable parameters. To use this architecture, the application_densenet121() method in the Keras library was adopted.
– MobileNet: CNN architecture proposed by [17] for mobile and embedded vision applications. The structure uses depthwise separable convolutions (factorized convolutions). In addition, it has 28 layers and 3,732,289 trainable parameters. To use this architecture, the application_mobilenet() method in the Keras library was adopted.

In all experiments, the CNN architectures were trained with the adagrad optimizer and a learning rate of 0.01. In addition, the dimensions 50 × 50 × 3 were standardized as input to the neural network: input_shape = c(50, 50, 3). It is also noteworthy that all three architectures were configured with the last two layers as fully connected, the last one being the binary classifier neuron with a sigmoid activation function [4].

3.4 Hyperparameter tuning

3.4.1 Design of experiments

In this section, the design of experiments for tuning data augmentation hyperparameters with Logistic Regression [16] is described. The simulations of the Convolutional Neural Network models were conducted in the R software [35] with the Keras library [4]. For this, an Intel Core i7-8565 (CPU) and an NVIDIA GeForce MX110 (GPU) were used.

Three metrics were used to evaluate the experiments: accuracy in validation or testing (Acc), number of images correctly classified (C) and number of images incorrectly classified, that is, errors (E). Equations (4) to (6) present these formulas.

$$Acc = \frac{TP + TN}{TP + TN + FP + FN}, \qquad (4)$$

$$C = TP + TN, \qquad (5)$$

$$E = FP + FN, \qquad (6)$$

where,

– TP: true positives, that is, correct classifications in class 1 (facade with vegetation).
– FN: false negatives, that is, incorrect classifications in class 1 (facade with vegetation).
– TN: true negatives, that is, correct classifications in class 0 (facade without vegetation).
– FP: false positives, that is, incorrect classifications in class 0 (facade without vegetation).

The experiments were conducted in three stages:

1. Data Augmentation Hyperparameters.
2. Data Augmentation and CNN Architectures.
3. Test Experiments.

In the first phase, seven hyperparameters were defined for adjustment, each with two treatment levels (0 - without transformation and 1 - with transformation):

– Rotation Range (R): 0 or 40.
– Horizontal Flip (H): FALSE or TRUE.
– Vertical Flip (V): FALSE or TRUE.
– Height Shift Range (He): 0 or 0.2.
– Shear Range (S): 0 or 0.2.
– Width Shift Range (W): 0 or 0.2.
– Zoom Range (Z): 0 or 0.2.

Thus, a total of 128 ($2^7$) combinations of data augmentation hyperparameters were analyzed in the first stage. For each configuration, five CNN models (repetitions) were trained for 10 epochs with 100 steps per epoch, adopting the MobileNet architecture [17]. The metrics observed in this phase were the total number of images correctly classified (C) and errors (E) in the validation dataset, which were used to adjust a Logistic Regression model.

In the second stage of experiments, the best combinations of the first phase were used. In addition, three CNN architectures from the literature were adopted: MobileNet [17], DenseNet-121 [18] and CNN8 [32]. For each combination (data augmentation configuration × architecture), 5 repetitions were performed with 20 epochs. The total correct classifications (C) and errors (E) were observed in the validation dataset. Furthermore, the accuracy (Acc) in the validation step was also analyzed.

Finally, in the third phase of experiments, the performance of the hyperparameter combinations was analyzed on the test dataset. In this sense, new trainings were carried out with the selected data augmentation configurations, in 5 repetitions with 30 epochs. For each of the CNN models trained in this phase, the accuracy in the classification of the test database was analyzed.

3.4.2 Logistic regression method

In this paper, the method for hyperparameter tuning uses Logistic Regression [16]. The objective is to evaluate the probability of hits and errors in building construction image classification, according to the data augmentation settings. For this, the response variable (y) is binary and was modeled as follows:

$$y = \begin{cases} 1: \text{correct image classification}. \\ 0: \text{incorrect image classification}. \end{cases}$$
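As an illustration of the experimental design of Sect. 3.4.1 and of the binary response coding, the following minimal sketch enumerates the 128 configurations and computes Acc, C and E from confusion counts. It is written in Python with the standard library only, whereas the paper works in R; the names `LEVELS`, `full_factorial` and `classification_metrics` are hypothetical, and the confusion counts are invented purely to exercise Eqs. (4) to (6).

```python
from itertools import product

# Two treatment levels per hyperparameter, as listed in Sect. 3.4.1.
LEVELS = {
    "R": (0, 40),
    "H": (False, True),
    "V": (False, True),
    "He": (0.0, 0.2),
    "S": (0.0, 0.2),
    "W": (0.0, 0.2),
    "Z": (0.0, 0.2),
}


def full_factorial(levels):
    """Enumerate every combination of the two-level hyperparameters."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


def classification_metrics(tp, tn, fp, fn):
    """Acc, C and E as defined in Eqs. (4) to (6)."""
    correct = tp + tn
    errors = fp + fn
    return {"Acc": correct / (correct + errors), "C": correct, "E": errors}


configs = full_factorial(LEVELS)
print(len(configs))  # 2**7 = 128 configurations

# Hypothetical confusion counts for one model on the 50 validation images.
metrics = classification_metrics(tp=23, tn=22, fp=3, fn=2)
print(metrics)

# Binary response y: 1 for each correct classification, 0 for each error.
y = [1] * metrics["C"] + [0] * metrics["E"]
```

In this reading, each configuration contributes its totals C and E over the repetitions as the successes and failures of a binomial response, which is the form the Logistic Regression model is fitted to.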
For the first step, the explanatory variables ($x_1$, $x_2$, $x_3$, $x_4$, $x_5$, $x_6$ and $x_7$) refer to the seven hyperparameters analyzed:

$$\begin{cases} x_1: \text{Rotation Range (R)} \\ x_2: \text{Horizontal Flip (H)} \\ x_3: \text{Vertical Flip (V)} \\ x_4: \text{Height Shift Range (He)} \\ x_5: \text{Shear Range (S)} \\ x_6: \text{Width Shift Range (W)} \\ x_7: \text{Zoom Range (Z)} \end{cases}$$

Equation (7), in turn, presents the Logistic Regression model (logit form) proposed for the recommendation of hyperparameters:

$$\ln\left[\frac{p(x)}{1 - p(x)}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 \qquad (7)$$

The coefficients ($\beta$) of Eq. (7) can be obtained by the maximum likelihood method [13]. Then, a hypothesis test must be performed on the regression coefficients. In this sense, the significance of the effect of each variable present in the model is analyzed through two hypotheses:

$$\begin{cases} H_0: \beta_k = 0, \\ H_a: \beta_k \neq 0. \end{cases}$$

When the null hypothesis ($H_0$) is not rejected ($p > 0.05$), the variable $x_k$ (associated with the coefficient $\beta_k$) does not have statistical significance in the model. On the other hand, if the alternative hypothesis is accepted ($p < 0.05$), the hyperparameter (k) has significance in the Logistic Regression model.

The adjusted coefficients ($\beta$) can also be used to calculate the odds associated with each hyperparameter configuration. In this aspect, the OR metric represents the odds ratio of correct classification between the two levels of a hyperparameter. For example, the odds ratio for hyperparameter 1 (Rotation Range) is given by Eq. (8):

$$OR_1 = \exp(\beta_1), \qquad (8)$$

where $OR_1$ is the odds ratio of level 1 ($R = 40$) in relation to level 0 ($R = 0$) of hyperparameter 1 (Rotation Range). Thus, if $OR_1 > 1$, the chance of success when adopting $R = 40$ is greater than with $R = 0$. Otherwise ($OR_1 < 1$), the chance of the CNN correctly classifying an image is greater if it is trained without the rotation transformation. A similar analysis can be made after calculating the odds ratio of the other analyzed hyperparameters. Thus, Eq. (9) presents the general formulation for the odds ratio:

$$OR_k = \exp(\beta_k). \qquad (9)$$

Therefore, odds ratio indices are used to define hyperparameter configurations for the sequence of experiments, as will be described in the next subsection (HPtuningLogReg Algorithm).

Logistic regression models are also used to analyze the results of the second stage of experiments. In this case, the objective is to evaluate the influence of the selected hyperparameter combinations for the three CNN architectures adopted (CNN8 [32], DenseNet-121 [18] and MobileNet [17]). For this, three logistic regression models are adjusted (one per architecture) and the odds ratio indices for the hyperparameter configurations are observed.

3.4.3 HPtuningLogReg algorithm

Algorithm 1 presents the method proposed in the R language for tuning data augmentation hyperparameters with Logistic Regression: the HPtuningLogReg Algorithm. The code is divided into four steps: data input, adjustment of the logistic regression model, hyperparameter tuning and summary.

In the first step (lines 1 to 14), the results of the experiments are read and prepared for the rest of the method. The "correct" and "error" vectors store the number of correct and incorrect classifications, respectively, for each of the observed hyperparameter configurations.

Then, in line 16, the function glm of the R language is used to adjust the Logistic Regression model. In addition, the anova method (line 17) is used to perform statistical tests of analysis of variance and to calculate the p-value ("paov"). Next, the odds ratio measures are calculated from the adjusted coefficients ("modelglm$coefficients") (line 18).

In step 3 (lines 19 to 45), the Logistic Regression model is used for hyperparameter tuning. For this, a loop is performed over the hyperparameter index (data augmentation transformation type): 1 - R, 2 - H, 3 - V, 4 - He, 5 - S, 6 - W and 7 - Z. In line 21, the statistical significance of the variable present in the model is analyzed. If the alternative hypothesis ($H_a$) is accepted ("paov$`Pr(>Chi)`[i+1] < 0.05"), the coefficient $\beta_k$ is significant, that is, there is a statistical difference between the two treatments of hyperparameter k. In this case, the value of the odds ratio is presented and the hyperparameter level with the greatest chance of success in the validation dataset image classification is recommended (lines 22 to 35). On the other hand, if the null hypothesis ($H_0$) is not rejected (lines 36 to 45), there is no statistically significant difference between the two values of hyperparameter k ($p > 0.05$). In this case, the default treatment ("h[i,2]") is recommended, that is, the transformation k should not be applied in the data augmentation process.

In step 4, a summary of the recommended hyperparameter values is presented. In this case, only the hyperparameters
whose decision variables received level 1 (C[i] == 1) in step 3 are shown.

1 # Step 1: Data Input
2 dat <- read.delim("data.txt")
3 attach(dat)
4 datf <- as.data.frame(cbind(correct, error, R, H, V, He, S, W, Z))
5 attach(datf)
6 h1 <- c("Hyper1 - Rotation Range (R)", "0", "40")  # name, level 0, level 1
7 h2 <- c("Hyper2 - H. Flip (H)", "FALSE", "TRUE")
8 h3 <- c("Hyper3 - V. Flip (V)", "FALSE", "TRUE")
9 h4 <- c("Hyper4 - Height Shift R. (He)", "0", "0.2")
10 h5 <- c("Hyper5 - Shear Range (S)", "0", "0.2")
11 h6 <- c("Hyper6 - Width Shift R. (W)", "0", "0.2")
12 h7 <- c("Hyper7 - Zoom Range (Z)", "0", "0.2")
13 h <- rbind(h1, h2, h3, h4, h5, h6, h7)
14 C <- c(0, 0, 0, 0, 0, 0, 0)  # decision variables (1 = apply transformation)
15 # Step 2: Logistic Regression Model
16 modelglm <- glm(cbind(strtoi(correct), strtoi(error)) ~ R + H + V + He + S + W + Z,
       family = binomial(link = "logit"), data = datf)
17 paov <- anova(modelglm, test = "Chisq")
18 OR <- exp(modelglm$coefficients)
19 # Step 3: Hyperparameter Tuning
20 for (i in 1:7) {
21   if (paov$`Pr(>Chi)`[i+1] < 0.05) {
22     print("Hyperparameter Tuning Reglog")
23     print(h[i, 1])
24     print("With statistical significance (p < 0.05)")
25     print("Beta:")
26     print(modelglm$coefficients[i+1])
27     print("Odds Ratio:")
28     print(OR[i+1])
29     print("Recommended hyperparameter:")
30     if (OR[i+1] < 1) {
31       print(h[i, 2])
32     } else {
33       print(h[i, 3])
34       C[i] <- 1
35     }
36   } else {
37     print("Hyperparameter Tuning Reglog")
38     print(h[i, 1])
39     print("No statistical significance (p > 0.05)")
40     print("Beta:")
41     print(modelglm$coefficients[i+1])
42     print("Recommended hyperparameter:")
43     print(h[i, 2])
44   }
45 }
46 # Step 4: Hyperparameter Tuning - Summary
47 for (i in 1:7) {
48   if (C[i] == 1) {
49     print("Hyperparameter Tuning - Summary")
50     print(h[i, 1])
51     print(h[i, 3])
52   }
53 }

Algorithm 1: HPtuningLogReg in R language.

3.5 Hyperparameter tuning to second small dataset

In this case study, another type of problem in buildings was analyzed: gutter integrity and cleanliness pathology in roofs [43]. For this, images were used from the database presented and described by [33, 43, 45] and made available by the Research Group in Construction Technology and Management (School of Engineering - UFBA).4 These images were captured during roof inspections with an unmanned aerial vehicle.

4 getec.eng.ufba.br.

In a previous study, [33] used this dataset in experiments for tuning two CNN hyperparameters (learning rate and optimizer). For this, the images were divided into two classes: (0) roofs with clean gutters and (1) roofs with dirty gutters. Thus, the database adopted by [33] has 220 images, separated for the training, validation and test phases:

– Training (160 images): 80 images in class 0 and 80 images in class 1.
– Validation (30 images): 15 images in class 0 and 15 images in class 1.
– Test (30 images): 15 images in class 0 and 15 images in class 1.

Figures 4 and 5 present examples of images (two classes) of the second small dataset.

In this sense, the Hyperparameter Tuning methodology (Section 3.4) was adopted in new experiments with this second case study. Then, HPtuningLogReg was applied to the tuning of Data Augmentation hyperparameters.

The dataset analyzed during the second case study is available from the corresponding author on request or at the web link:

– https://abre.ai/dataset2

4 Results

4.1 Results of first small dataset

This section presents the results for the first case study: recognition of vegetation on building facades.

4.1.1 Stage 1: Data augmentation hyperparameters

In stage 1, the HPtuningLogReg algorithm was adopted to adjust the logistic regression model and to tune the Data Augmentation hyperparameters. For this, results of 128
13
hyperparameter combinations analyzed were used. Equation (10) presents the fitted linear model (logit).

\[
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 \\
  &= 1.524 - 0.005 x_1 - 0.118 x_2 - 0.130 x_3 + 0.179 x_4 - 0.0085 x_5 + 0.114 x_6 + 0.111 x_7
\end{aligned}
\tag{10}
\]

Fig. 4 Examples of images of class 0 (roofs with clean gutters) in the second small dataset

Fig. 5 Examples of images of class 1 (roofs with dirty gutters) in the second small dataset

Table 1 shows the results of the test statistic (p), recommended values and odds ratio (OR) per hyperparameter.

Table 1 Results of Data Augmentation hyperparameter tuning with logistic regression in stage 1

Hyperparameter        β        p      Recommended value   x   OR
Rotation R. (R)       -0.005   0.86   0                   0   0.995
Hor. Flip (H)         -0.118   0.00   False               0   0.888
Vertical Flip (V)     -0.130   0.00   False               0   0.878
Height S. R. (He)      0.179   0.00   0.2                 1   1.196
Shear Range (S)       -0.008   0.79   0                   0   0.992
Width S. R. (W)        0.114   0.00   0.2                 1   1.121
Zoom Range (Z)         0.111   0.00   0.2                 1   1.118

Bold values indicate the hyperparameters with OR > 1 and p < 0.05

The results in Table 1 show that the Rotation Range and Shear Range transformations have no statistically significant effect (p > 0.05). Thus, the algorithm recommended the default value for these hyperparameters (R = 0 and S = 0). On the other hand, the effects of Horizontal Flip and Vertical Flip were statistically significant (p < 0.05). However, the odds ratios for these hyperparameters were less than 1 (OR < 1). Thus, the HPtuningLogReg algorithm recommended H = 0 and V = 0.

Table 1 also presents the results for the other three transformations: Height Shift Range, Width Shift Range and Zoom Range. The effects of these variables were statistically significant (p < 0.05). The recommended value of Height Shift Range was 0.2, with an estimated odds ratio of 1.196. In this respect, the fitted model indicates that adopting the level xHe = 1 gives around 20% higher odds of success in image classification than not adopting this transformation in the training base. The recommended values for Width Shift Range and Zoom Range were also 0.2, with odds ratios of 1.121 and 1.118, respectively. Thus, it is estimated that the odds of correct image classification when using W = 0.2 or Z = 0.2 are around 12% greater than when training without these Data Augmentation effects.

Thus, from the Logistic Regression results, the HPtuningLogReg algorithm recommended three image transformations for the training process: Height Shift Range, Width Shift Range and Zoom Range. These hyperparameters, analyzed at two levels each, result in eight combinations of Data Augmentation transformations (2 × 2 × 2). Table 2 presents these combinations and the respective levels of the decision variables.

Table 2 Hyperparameter combinations of data augmentation selected in stage 1

Comb.   He    W     Z     xHe   xW   xZ
1       0     0     0     0     0    0
2       0     0     0.2   0     0    1
3       0     0.2   0     0     1    0
4       0     0.2   0.2   0     1    1
5       0.2   0     0     1     0    0
6       0.2   0     0.2   1     0    1
7       0.2   0.2   0     1     1    0
8       0.2   0.2   0.2   1     1    1

The hyperparameter combinations presented in Table 2 were used in the next stage of experiments, as shown in the following section.

4.1.2 Stage 2: Data augmentation and CNN architectures

In stage 2, the hyperparameter combinations defined in the previous stage were evaluated in conjunction with three architectures from the literature: CNN8 [32], DenseNet-121 [18] and MobileNet [17]. Table 3 presents the validation accuracy and the statistical metrics of the logistic regression models (OR and p).

Table 3 Results of validation accuracy (%) and statistical metrics for each method (CNN architecture + data augmentation combination)

Arch.          Comb.   Rep. 1   Rep. 2   Rep. 3   Rep. 4   Rep. 5   Mean   OR      p
CNN8           1       78.0     78.0     78.0     76.0     82.0     78.4   1.000   –
CNN8           2       86.0     82.0     78.0     78.0     76.0     80.0   1.102   0.66
CNN8           3       86.0     82.0     86.0     86.0     84.0     84.8   1.537   0.07
CNN8           4       92.0     88.0     86.0     84.0     92.0     88.4   2.099   0.00
CNN8           5       84.0     90.0     80.0     86.0     86.0     85.2   1.586   0.05
CNN8           6       84.0     84.0     84.0     86.0     82.0     84.0   1.445   0.11
CNN8           7       90.0     92.0     88.0     90.0     88.0     89.6   2.374   0.00
CNN8           8       88.0     88.0     86.0     90.0     86.0     87.6   1.946   0.01
DenseNet-121   1       88.0     84.0     90.0     88.0     84.0     86.8   1.000   –
DenseNet-121   2       90.0     86.0     94.0     92.0     84.0     89.2   1.256   0.41
DenseNet-121   3       92.0     92.0     88.0     90.0     86.0     89.6   1.310   0.33
DenseNet-121   4       90.0     88.0     86.0     90.0     92.0     89.2   1.256   0.41
DenseNet-121   5       86.0     90.0     94.0     88.0     86.0     88.8   1.206   0.49
DenseNet-121   6       90.0     88.0     90.0     88.0     92.0     89.6   1.310   0.33
DenseNet-121   7       90.0     94.0     92.0     92.0     90.0     91.6   1.658   0.09
DenseNet-121   8       92.0     96.0     96.0     92.0     94.0     94.0   2.382   0.01
MobileNet      1       76.0     74.0     76.0     78.0     76.0     76.0   1.000   –
MobileNet      2       84.0     88.0     88.0     88.0     90.0     87.6   2.231   0.00
MobileNet      3       82.0     84.0     90.0     82.0     78.0     83.2   1.564   0.05
MobileNet      4       92.0     88.0     90.0     88.0     90.0     89.6   2.720   0.00
MobileNet      5       82.0     80.0     78.0     84.0     82.0     81.2   1.364   0.16
MobileNet      6       84.0     86.0     86.0     86.0     90.0     86.4   2.006   0.00
MobileNet      7       88.0     92.0     88.0     86.0     94.0     89.6   2.721   0.00
MobileNet      8       92.0     90.0     90.0     92.0     90.0     90.8   3.117   0.00

Bold values indicate the data augmentation combinations with p < 0.05

From Table 3, the highest mean accuracy (89.6%) for the CNN8 architecture was achieved by combination 7 (He = 0.2; W = 0.2; Z = 0). In this case, adopting configuration 7 gives approximately 2.4 times the odds of success (OR = 2.374) in classification relative to the reference combination (He = 0; W = 0; Z = 0). It is also noteworthy that four combinations (4, 5, 7 and 8) showed statistical significance (p ≤ 0.05). On the other hand, for DenseNet-121 in Table 3, the highest mean accuracy (94.0%) was obtained by combination 8. In addition, only this configuration (He = 0.2; W = 0.2; Z = 0.2) was statistically significant (5% level). The experiments with the MobileNet architecture revealed that adopting configuration 8 gives around 3 times the odds of success (OR = 3.117) in image classification. Moreover, six combinations (2, 3, 4, 6, 7 and 8) are statistically different from the reference (p ≤ 0.05).

In this sense, configuration 8 (He = 0.2; W = 0.2; Z = 0.2) was the only one to present statistical significance for the three architectures. In addition, the odds ratio for this combination was over 1.9 for CNN8, DenseNet-121 and MobileNet. Thus, combination 8 was selected for the subsequent experiments in the test stage.

To illustrate, Figs. 6 and 7 present samples of images generated by Data Augmentation, adopting combination 8: He = 0.2; W = 0.2; Z = 0.2.
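The selection logic of stages 1 and 2 can be sketched in a few lines. This is only an illustration in Python, not the authors' R implementation (HPtuningLogReg): it assumes that the odds ratio is obtained as exp(β), which agrees with the values in Table 1 up to rounding, and it works on the rounded β and p values printed in Tables 1 and 3. The function and variable names are hypothetical.

```python
import math

ALPHA = 0.05

# Stage 1: (beta, p) per transformation, copied from Table 1 (rounded values).
stage1 = {
    "R": (-0.005, 0.86),
    "H": (-0.118, 0.00),
    "V": (-0.130, 0.00),
    "He": (0.179, 0.00),
    "S": (-0.008, 0.79),
    "W": (0.114, 0.00),
    "Z": (0.111, 0.00),
}

def recommended(table, alpha=ALPHA):
    """Keep a transformation only if it is significant and its odds ratio exceeds 1."""
    keep = []
    for name, (beta, p) in table.items():
        odds_ratio = math.exp(beta)  # assumption: OR = exp(beta)
        if p < alpha and odds_ratio > 1:
            keep.append(name)
    return keep

# Stage 2: p-values of combinations 2..8 against combination 1, from Table 3.
stage2_p = {
    "CNN8": {2: 0.66, 3: 0.07, 4: 0.00, 5: 0.05, 6: 0.11, 7: 0.00, 8: 0.01},
    "DenseNet-121": {2: 0.41, 3: 0.33, 4: 0.41, 5: 0.49, 6: 0.33, 7: 0.09, 8: 0.01},
    "MobileNet": {2: 0.00, 3: 0.05, 4: 0.00, 5: 0.16, 6: 0.00, 7: 0.00, 8: 0.00},
}

def significant_everywhere(p_values, alpha=ALPHA):
    """Return the combinations with p <= alpha for every architecture."""
    per_arch = [
        {comb for comb, p in table.items() if p <= alpha}
        for table in p_values.values()
    ]
    return sorted(set.intersection(*per_arch))

stage1_choice = recommended(stage1)
stage2_choice = significant_everywhere(stage2_p)
print(stage1_choice)  # ['He', 'W', 'Z']
print(stage2_choice)  # [8]
```

With the tabulated values, the sketch reproduces the two decisions reported in the text: He, W and Z are kept in stage 1, and combination 8 is the only one significant for all three architectures in stage 2.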


Fig. 6 Examples of images generated by data augmentation (He = 0.2; W = 0.2; Z = 0.2) for class 0 (without vegetation on the building facade)

Fig. 7 Examples of images generated by data augmentation (He = 0.2; W = 0.2; Z = 0.2) for class 1 (with vegetation on the building facade)
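To make the meaning of He = 0.2, W = 0.2 and Z = 0.2 concrete, the following sketch draws random transformation parameters for one image. It assumes the usual Keras conventions for these hyperparameters (a shift range below 1 is a fraction of the image size, and a scalar zoom range z means a factor in [1 - z, 1 + z]); it only samples the parameters and does not transform any image. The 150 × 150 image size, the single shared zoom factor and the function name are illustrative assumptions, not values from the paper.

```python
import random

def sample_augmentation(height, width, he=0.2, w=0.2, z=0.2, rng=None):
    """Draw one random (vertical shift, horizontal shift, zoom) triple."""
    rng = rng or random.Random()
    return {
        # Shift ranges are fractions of the image size (assumed convention).
        "dy_pixels": rng.uniform(-he, he) * height,
        "dx_pixels": rng.uniform(-w, w) * width,
        # A scalar zoom range z is read as the interval [1 - z, 1 + z].
        "zoom": rng.uniform(1.0 - z, 1.0 + z),
    }

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_augmentation(150, 150, rng=rng) for _ in range(5)]
for s in samples:
    print(s)
```

Under these assumptions, each augmented image is shifted by at most 20% of its height and width and zoomed by a factor between 0.8 and 1.2.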

Table 4 Maximum accuracy (%) in the test step in each of the repetitions and respective mean accuracy (M) for first small dataset

Arch.       Config.   Rep. 1   Rep. 2   Rep. 3   Rep. 4   Rep. 5   M
CNN8        P         95.6     87.8     92.2     93.3     91.1     92.0
CNN8        L         91.1     80.0     93.3     91.1     93.3     89.8
DenseNet    P         77.8     77.8     73.3     83.3     65.6     75.6
DenseNet    L         87.8     77.8     83.3     71.1     70.0     78.0
MobileNet   P         71.1     65.6     81.1     74.4     63.3     71.1
MobileNet   L         54.4     87.8     58.9     55.6     53.3     62.0

Bold values indicate the best result of test accuracy and mean accuracy
Comparison between methods recommended by the data augmentation configurations (Proposed (P) and Literature (L)) and architectures
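The figures reported for the first dataset can be cross-checked with simple arithmetic. The sketch below uses the CNN8 + P row of Table 4 and the confusion-matrix counts that the paper reports for the best repetition (TP = 43, FN = 2, FP = 2, TN = 43, out of 90 test photographs). The helper name and variable names are hypothetical.

```python
def accuracy(tp, fn, fp, tn):
    """Fraction of correctly classified images in a binary confusion matrix."""
    return (tp + tn) / (tp + fn + fp + tn)

# Best repetition of CNN8 + P: counts reported in the paper's confusion matrix.
best_run = accuracy(tp=43, fn=2, fp=2, tn=43)

# Each class has 45 test images, of which 2 were misclassified.
per_class_error = 2 / (43 + 2)

# Mean over the five repetitions of CNN8 + P (Table 4).
cnn8_proposed = [95.6, 87.8, 92.2, 93.3, 91.1]
mean_accuracy = sum(cnn8_proposed) / len(cnn8_proposed)

print(f"{100 * best_run:.1f}")         # 95.6
print(f"{100 * per_class_error:.2f}")  # 4.44
print(f"{mean_accuracy:.1f}")          # 92.0
```

The three printed values match the 95.6% best-run accuracy, the roughly 4.44% per-class error and the 92.0% mean accuracy quoted in the text.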


4.1.3 Test results

In this step, simulations were carried out on the test dataset with three architectures (CNN8, DenseNet-121 and MobileNet) and two data augmentation configurations:

– Proposed in this paper (P) - defined from stages 1 and 2 (Hyperparameter Tuning): (He = 0.2; W = 0.2; Z = 0.2).
– Literature (L) - presented in [4] and used by [32] for the same dataset of this study: (R = 40; H = TRUE; He = 0.2; S = 0.2; W = 0.2; Z = 0.2).

Table 4 presents the test accuracy for each analyzed configuration.

From Table 4, the highest mean accuracy (92.0%) was achieved by the CNN8 architecture with the proposed data augmentation combination. Moreover, this configuration (CNN8 + P) also resulted in the highest accuracy value in a single repetition: 95.6%. This value corresponds to the correct classification of 86 images out of a total of 90 photographs in the test dataset. In this sense, Table 5 presents the confusion matrix for CNN8 + P (Repetition 1).

Table 5 Confusion matrix with the best results for the test step (first small dataset)

TP = 43   FN = 2
FP = 2    TN = 43

Bold values indicate the number of true positives (TP) and true negatives (TN)

It can be seen in Table 5 that the CNN model correctly classified 43 images in the positive class and 43 images in the negative class (accuracy of 95.6%). Thus, for each class, the CNN misclassified only 2 images in the test dataset (error of around 4.44%). It is also worth noting that, in the study of [32], the maximum accuracy achieved was 90% for the same test images. This indicates that careful adjustment of the Data Augmentation hyperparameters can improve classification results.

Table 6 summarizes the recommended hyperparameters for the analyzed database.

Table 6 Selected hyperparameters for first small dataset

Hyperparameter            Recommendation
Architecture              CNN8
Height Shift Range (He)   0.2
Width Shift Range (W)     0.2
Zoom Range (Z)            0.2

4.2 Results of second small dataset

This section reports the results of applying the proposed methodology to a second small dataset: gutter integrity in roof structures. Equation (11) presents the linear model fitted for this case study:

\[
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 \\
  &= 1.750 - 0.245 x_1 + 0.071 x_2 - 0.027 x_3 - 0.008 x_4 + 0.089 x_5 - 0.024 x_6 - 0.091 x_7
\end{aligned}
\tag{11}
\]

Table 7 shows the results of the test statistic (p), recommended values and odds ratio (OR) per hyperparameter (second small dataset).

Table 7 Results of data augmentation hyperparameter tuning with logistic regression (second small dataset)

Hyperparameter        β        p      Recommended value   x   OR
Rotation R. (R)       -0.245   0.00   0                   0   0.783
Hor. Flip (H)          0.071   0.07   False               0   1.074
Vertical Flip (V)     -0.027   0.48   False               0   0.973
Height S. R. (He)     -0.008   0.85   0                   0   0.992
Shear Range (S)        0.089   0.02   0.2                 1   1.094
Width S. R. (W)       -0.024   0.53   0                   0   0.976
Zoom Range (Z)        -0.091   0.02   0                   0   0.913

Bold value indicates the hyperparameters with OR > 1 and p < 0.05

Table 7 shows that the only hyperparameter recommended by the HPtuningLogReg algorithm for the second case study was Shear Range (S), because p < 0.05 and OR > 1. On the other hand, four transformations did not reach statistical significance (p > 0.05): Horizontal Flip (H), Vertical Flip (V), Height Shift Range (He) and Width Shift Range (W). In addition, two hyperparameters had a statistically significant effect (p < 0.05) but obtained OR < 1: Rotation Range (R) and Zoom Range (Z).

In this regard, the test stage was carried out with two Data Augmentation configurations:

– Proposed in this paper (P) - Hyperparameter Tuning for second small dataset: S = 0.2.
– Literature (L) - presented in [4] and used by [33]: R = 40; H = TRUE; He = 0.2; S = 0.2; W = 0.2; Z = 0.2.

Table 8 presents the test accuracy results for the second small dataset.

Table 8 Maximum accuracy (%) in the test step in each of the repetitions and respective mean accuracy (M) for second small dataset

Arch.       Config.   Rep. 1   Rep. 2   Rep. 3   Rep. 4   Rep. 5   M
CNN8        P         83.3     93.3     93.3     86.7     83.3     88.0
CNN8        L         56.7     70.0     66.7     56.7     73.3     64.7
DenseNet    P         70.0     83.3     90.0     90.0     70.0     80.7
DenseNet    L         86.7     86.7     90.0     80.0     86.7     86.0
MobileNet   P         70.0     70.0     60.0     70.0     63.3     66.7
MobileNet   L         70.0     80.0     73.3     83.3     70.0     75.3

Bold values indicate the best result of test accuracy and mean accuracy
Comparison between methods recommended by the data augmentation configurations (Proposed (P) and Literature (L)) and architectures

Table 8 shows that the highest mean accuracy (88.0%) for the second case study was achieved by the CNN8 architecture with the proposed configuration. Moreover, this Data Augmentation combination (S = 0.2) achieved the maximum accuracy value in one repetition: 93.3%.

4.3 Comparison with other studies

In this section, a comparative study is carried out between the present proposal and other recent works in the literature: I [32], II [3], III [48], IV [36] and V [54]. For this, four features were observed: CNN application (classification, detection or segmentation), type of problem, analyzed hyperparameters and tuning of Data Augmentation methods. All analyzed papers applied Deep Learning models to image processing of building construction. Table 9 presents the comparison results.

Table 9 Comparison of this proposal with different papers that applied CNNs in the image processing of building construction: I [32], II [3], III [48], IV [36] and V [54]

Feature                       Item                           Proposed   I [32]   II [3]   III [48]   IV [36]   V [54]
CNN application               Classification                 ✓          ✓        –        –          –         –
CNN application               Detection                      –          –        ✓        ✓          ✓         –
CNN application               Segmentation                   –          –        –        –          ✓         ✓
Problem                       Crack detection                –          –        –        ✓          ✓         ✓
Problem                       Bridge inspection              –          –        ✓        –          –         –
Problem                       Roofs defects classification   ✓          –        –        –          –         –
Problem                       Vegetation in facades          ✓          ✓        –        –          –         –
Analyzed hyperparameters      Rotation Range                 ✓          ✓        ✓        ✓          ✓         ✓
Analyzed hyperparameters      Horizontal Flip                ✓          ✓        ✓        –          –         ✓
Analyzed hyperparameters      Vertical Flip                  ✓          –        –        –          –         ✓
Analyzed hyperparameters      Height Shift Range             ✓          ✓        –        –          –         –
Analyzed hyperparameters      Shear Range                    ✓          ✓        –        –          ✓         –
Analyzed hyperparameters      Width Shift Range              ✓          ✓        –        –          –         –
Analyzed hyperparameters      Zoom Range                     ✓          ✓        –        –          –         –
Analyzed hyperparameters      Others                         –          –        –        ✓          ✓         –
Tuning of Data Augmentation   Yes                            ✓          –        ✓        ✓          –         –
Tuning of Data Augmentation   No                             –          ✓        –        –          ✓         ✓

From Table 9, it is confirmed that the main contribution of this paper is the proposal of a methodology for tuning data augmentation hyperparameters for building construction image classification, especially in vegetation recognition and roof defect classification. By contrast, most of the other studies in this area are dedicated to the problem of crack detection (or segmentation).

Furthermore, this proposal innovates by analyzing 128 (2^7) combinations of hyperparameters from seven Data Augmentation transformations: rotation range, horizontal flip, vertical flip, height shift range, shear range, width shift range and zoom range. In general, other papers analyze fewer combinations and transformations of Data Augmentation in the building construction image processing field.

Another important contribution is the proposal to apply logistic regression models to hyperparameter tuning. The papers by [3] and [48] also present methodologies for recommending Data Augmentation hyperparameters. However, these studies are applied to other problems (crack detection or bridge inspection) and do not adopt logistic regression methods.
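The size of the experimental design follows directly from the two-level factorial structure: seven transformations at two levels each give 2^7 = 128 combinations, and fixing the four non-recommended transformations at their defaults leaves the 2 × 2 × 2 = 8 combinations of stage 2. A minimal Python sketch of this enumeration is given below. The default levels (0 or False) come from Table 1; the second levels are assumed here to be the literature values quoted in the text (R = 40, 0.2 for the ranges, True for the flips), with the vertical-flip level in particular being an assumption. The Keras-style argument names are used only as labels.

```python
from itertools import product

# Two levels per transformation: (default, assumed alternative level).
levels = {
    "rotation_range": (0, 40),
    "horizontal_flip": (False, True),
    "vertical_flip": (False, True),
    "height_shift_range": (0, 0.2),
    "shear_range": (0, 0.2),
    "width_shift_range": (0, 0.2),
    "zoom_range": (0, 0.2),
}

# Full factorial design: every combination of the two levels of each factor.
design = [dict(zip(levels, combo)) for combo in product(*levels.values())]

# Stage 2 subset: only He, W and Z vary; the other factors stay at their defaults.
stage2_factors = ("height_shift_range", "width_shift_range", "zoom_range")
stage2_design = [
    d for d in design
    if all(d[k] == levels[k][0] for k in levels if k not in stage2_factors)
]

print(len(design), len(stage2_design))  # 128 8
```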


5 Conclusion

The objective of this paper was to propose a rigorous methodology for tuning Data Augmentation hyperparameters in Deep Learning for small datasets. In this sense, the main contributions of this study are:

– Careful analysis of Data Augmentation transformations in the application of Deep Learning to building image classification, especially in the recognition of vegetation on facades and the classification of roof defects.
– Design of experiments with 128 combinations of Data Augmentation using the Keras library and the R software.
– Proposal of the HPtuningLogReg method, which uses Logistic Regression to tune Data Augmentation hyperparameters.
– Comparison of Data Augmentation configurations across three Convolutional Neural Network architectures from the literature.

Regarding the results, in the first stage of experiments three Data Augmentation transformations were recommended for the first case study: Height Shift Range (He), Width Shift Range (W) and Zoom Range (Z). According to the Logistic Regression model, adopting He = 0.2 increases the odds of correct image classification by around 20%, while adopting W = 0.2 or Z = 0.2 increases them by approximately 12%. Moreover, in the second stage of experiments, the configuration (He = 0.2; W = 0.2; Z = 0.2) was the only one to present statistical significance (p ≤ 0.05) for the three CNN architectures analyzed. Finally, in the testing stage, the selected Data Augmentation configuration reached the highest mean accuracy (92%) with the CNN8 architecture. In addition, this combination also resulted in the highest accuracy in a single repetition: 95.6%. This value corresponds to the correct classification of 86 images out of a total of 90 photographs in the test dataset of the first case study.

For the second case study, the logistic regression model recommended the Shear Range transformation for Data Augmentation. In this sense, the hyperparameters selected for this application also achieved the best results in the test phase: 93.3%.

In future work, it is expected to analyze other Data Augmentation transformations. In addition, it is also suggested to test more levels for specific hyperparameters, for example, Zoom Range ranging from 10% to 50%. It is also worth highlighting the importance of investigating possible limitations of the logistic regression model; for example, the proposed approach did not account for interactions among predictor variables. With interaction effects, each predictor (hyperparameter) influences the others, which could lead to other tuning solutions. Another important point will be the adoption of the HPtuningLogReg method to tune Data Augmentation settings in other small-dataset applications of building construction image classification.

Acknowledgements The authors are grateful to the Research Group in Construction Technology and Management (GETEC) - School of Engineering (UFBA) - for providing the second image dataset, the Robotics & Perception Group (University of Zurich) for providing "The Zurich Urban Micro Aerial Vehicle Dataset", UFBA and UFRB.

Declarations

Conflict of interest The authors listed in this article declare that they have no conflict of interest.

Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.

References

1. Ali R, Cha Y-J (2019) Subsurface damage detection of a steel bridge using deep learning and uncooled micro-bolometer. Constr Build Mater 226:376–387
2. Barberousse H, Lombardo RJ, Tell G, Couté A (2006) Factors involved in the colonisation of building facades by algae and cyanobacteria in France. Biofouling 22(02):69–77
3. Bianchi E, Abbott AL, Tokekar P, Hebdon M (2021) Coco-bridge: Structural detail data set for bridge inspections. J Comput Civ Eng 35(3):04021003
4. Chollet F, Allaire JJ (2018) Deep learning with R. Manning Publications
5. Conceição J, Poça B, De Brito J, Flores-Colen I, Castelo A (2017) Inspection, diagnosis, and rehabilitation system for flat roofs. J Perform Constr Facil 31(6):04017100
6. Cubuk ED, Zoph B, Shlens J, Le QV (2020) Randaugment: Practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp 702–703
7. da Silva GR, Valões DC, Nascimento CF, SNA A., Candeia MA, Santiago H, Oliveira DV, Everton G, Lima JC, Souza JM (2021) Elaboration of a damage map the facades of a public building in the city of triunfo/pe. Int J Adv Eng Res Sci 8:237–244
8. Dung CV et al (2019) Autonomous concrete crack detection using deep fully convolutional neural network. Autom Constr 99:52–58
9. Elgendy M (2020) Deep learning for vision systems. Manning Publications
10. Fang W, Ding L, Luo H, Love PE (2018) Falls from heights: a computer vision-based approach for safety harness detection. Autom Constr 91:53–61
11. Gao Y, Mosalam KM (2018) Deep transfer learning for image-based structural damage recognition. Comput-Aid Civ Infrastruct Eng 33(9):748–768
12. Garcez N, Lopes N, de Brito J, Silvestre J (2012) System of inspection, diagnosis and repair of external claddings of pitched roofs. Constr Build Mater 35:1034–1044


13. Giolo SR (2017) Introduction to categorical data analysis with applications (in Portuguese). Editora Blucher
14. Guo J, Wang Q, Li Y, Liu P (2020) Façade defects classification from imbalanced dataset using meta learning-based convolutional neural network. Computer-Aided Civil and Infrastructure Engineering
15. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 770–778
16. Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. John Wiley & Sons
17. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861
18. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2261–2269
19. Hutter F, Hoos H, Leyton-Brown K (2014) An efficient approach for assessing hyperparameter importance. In: Proceedings of International Conference on Machine Learning 2014 (ICML 2014), pp 754–762
20. Hutter F, Kotthoff L, Vanschoren J (eds) (2019) Automated Machine Learning: Methods, Systems, Challenges. Springer. In press, available at http://automl.org/book
21. Kaamin M, Ahmad N, Razali S, Mokhtar M, Ngadiman N, Masri D, Hussin I, Asri L (2020) Visual inspection of heritage mosques using unmanned aerial vehicle (uav) and condition survey protocol (csp) 1 matrix: A case study of tengkera mosque and kampung kling mosque, melaka. vol 1529
22. Kolar Z, Chen H, Luo X (2018) Transfer learning and deep convolutional neural networks for safety guardrail detection in 2d images. Autom Constr 89:58–70
23. Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
24. Loukma M, Stefanidou M (2018) Causes of deterioration of ottoman mosques. WIT Transactions on The Built Environment 177:173–180
25. Maeda H, Sekimoto Y, Seto T, Kashiyama T, Omata H (2018) Road damage detection and classification using deep neural networks with smartphone images. Computer-Aided Civil and Infrastructure Engineering 33(12):1127–1141
26. Majdik AL, Till C, Scaramuzza D (2017) The zurich urban micro aerial vehicle dataset. The International Journal of Robotics Research 36(3):269–273
27. Mantovani RG, Rossi AL, Alcobaça E, Vanschoren J, de Carvalho AC (2019) A meta-learning recommender system for hyperparameter tuning: Predicting when tuning improves svm classifiers. Inf Sci 501:193–221
28. Monshi MMA, Poon J, Chung V, Monshi FM (2021) Covidxraynet: Optimizing data augmentation and cnn hyperparameters for improved covid-19 detection from cxr. Computers in Biology and Medicine 133:104375
29. Myers RH, Montgomery DC, Anderson-Cook CM (2016) Response surface methodology: process and product optimization using designed experiments. John Wiley & Sons
30. Neary P (2018) Automatic hyperparameter tuning in deep convolutional neural networks using asynchronous reinforcement learning. In: 2018 IEEE International Conference on Cognitive Computing (ICCC), pp 73–77
31. Ottoni ALC, Nepomuceno EG, de Oliveira MS, de Oliveira DCR (2020) Tuning of reinforcement learning parameters applied to sop using the scott-knott method. Soft Comput 24:4441–4453
32. Ottoni ALC, Novo MS (2021) A deep learning approach to vegetation images recognition in buildings: a hyperparameter tuning case study. IEEE Lat Am Trans 19(12):2062–2070
33. Ottoni ALC, Novo MS, Costa DB (2021) Hyperparameter tuning of convolutional neural networks for building construction image classification. The Visual Computer
34. Pawara P, Okafor E, Schomaker L, Wiering M (2017) Data augmentation for plant classification. In: International Conference on Advanced Concepts for Intelligent Vision Systems, pp 615–626. Springer
35. R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria
36. Ren Y, Huang J, Hong Z, Lu W, Yin J, Zou L, Shen X (2020) Image-based concrete crack detection in tunnels using deep fully convolutional networks. Construction and Building Materials 234:117367
37. Rocha E, Macedo J, Correia P, Monteiro E (2018) Adaptation of a damage map to historical buildings with pathological problems: Case study at the church of carmo in olinda, pernambuco. Revista ALCONPAT 8(1):51–63
38. Sajedi SO, Liang X (2021) Uncertainty-assisted deep vision structural health monitoring. Computer-Aided Civil and Infrastructure Engineering 36(2):126–142
39. Schratz P, Muenchow J, Iturritxa E, Richter J, Brenning A (2019) Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data. Ecol Model 406:109–120
40. Shankar K, Zhang Y, Liu Y, Wu L, Chen C-H (2020) Hyperparameter tuning deep learning for diabetic retinopathy fundus image classification. IEEE Access 8:118164–118173
41. Shen J, Xiong X, Li Y, He W, Li P, Zheng X (2021) Detecting safety helmet wearing on construction sites with bounding-box regression and deep transfer learning. Computer-Aided Civil and Infrastructure Engineering 36(2):180–196
42. Shorten C, Khoshgoftaar T (2019) A survey on image data augmentation for deep learning. Journal of Big Data 6(1)
43. Silveira B, Melo R, Costa DB (2021) Using uas for roofs structure inspections at post-occupational residential buildings. In: Toledo Santos E, Scheer S (eds) Proceedings of the 18th International Conference on Computing in Civil and Building Engineering, pp 1055–1068. Springer International Publishing, Cham
44. Song C, Xu W, Wang Z, Yu S, Zeng P, Ju Z (2020) Analysis on the impact of data augmentation on target recognition for uav-based transmission line inspection. Complexity 2020
45. Staffa LB, Sa LSV, Lima MISC, Costa DB (2020) Use of image processing techniques for inspection of building roof structures for technical assistance purposes (in Portuguese). ENTAC - National Meeting of the Built Environment Technology
46. Wang J-J, Liu Y-F, Nie X, Mo Y (2022) Deep convolutional neural networks for semantic segmentation of cracks. Structural Control and Health Monitoring 29(1)
47. Wang X, Zhao Y, Pourpanah F (2020) Recent advances in deep learning. Int J Mach Learn Cybern 11:747–750
48. Wang Z, Yang J, Jiang H, Fan X (2020) Cnn training with twenty samples for crack detection via data augmentation. Sensors 20(17):4849
49. Xue Y, Li Y (2018) A fast detection method via region-based fully convolutional neural networks for shield tunnel lining defects. Computer-Aided Civil and Infrastructure Engineering 33(8):638–654
50. Yang Z, He B, Liu Y, Wang D, Zhu G (2021) Classification of rock fragments produced by tunnel boring machine using convolutional neural networks. Automation in Construction 125:103612


51. Younis MC, Keedwell E (2019) Semantic segmentation on small datasets of satellite images using convolutional neural networks. Journal of Applied Remote Sensing 13(4):046510
52. Zeng S, Zhang B, Zhang Y, Gou J (2020) Dual sparse learning via data augmentation for robust facial image classification. Int J Mach Learn Cybern 11(8):1717–1734
53. Zhou S, Song W (2020) Deep learning-based roadway crack classification using laser-scanned range images: A comparative study on hyperparameter selection. Automation in Construction 114
54. Zhou S, Song W (2021) Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction 125:103605

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
