Catena 196 (2021) 104886

Contents lists available at ScienceDirect

Catena
journal homepage: www.elsevier.com/locate/catena

Ensemble learning-based classification models for slope stability analysis

Khanh Pham a,b, Dongku Kim c, Sangyeong Park c, Hangseok Choi c,⁎
a Department of Civil Engineering, International University, Ho Chi Minh City, Viet Nam
b Vietnam National University, Ho Chi Minh City, Viet Nam
c School of Civil, Environmental, and Architectural Engineering, Korea University, Seoul, Republic of Korea

ARTICLE INFO

Keywords:
Ensemble classifier
Ensemble learning
Slope stability analysis
Machine learning

ABSTRACT

In this study, ensemble learning was applied to develop a classification model capable of accurately estimating slope stability. Two prominent ensemble techniques (parallel learning and sequential learning) were applied to implement the ensemble classifiers. Additionally, for comparison, eight versatile machine learning algorithms were utilized to formulate the single-learning classification models. These classification models were trained and evaluated on the well-established global database of slope cases documented from 1930 to 2005. The performance of these classification models was measured by considering the F1 score, accuracy, receiver operating characteristic (ROC) curve and area under the ROC curve (AUC). Furthermore, K-fold cross-validation was employed to fairly assess the generalization capacity of these models. The obtained results demonstrated the advantage of ensemble classifiers over single-learning classification models. When ensemble learning was used instead of single learning, the average F1 score, accuracy, and AUC of the models increased by 2.17%, 1.66%, and 6.27%, respectively. In particular, the ensemble classifiers with sequential learning exhibited better performance than those with parallel learning. The ensemble classifier on the extreme gradient boosting framework (XGB-CM) clearly provided the best performance on the test set, with the highest F1 score, accuracy, and AUC of 0.914, 0.903, and 0.95, respectively. The excellent performance on the spatially well-distributed database, along with its capacity for distributed computing, indicates the significant potential applicability of the presented ensemble classifiers, particularly the XGB-CM, for landslide risk assessment and management on a global scale.

Abbreviations: ML, machine learning; KNN, K-nearest neighbor; SVM, support vector machine; SGD, stochastic gradient descent; GP, Gaussian process; QDA, quadratic discriminant analysis; GNB, Gaussian naïve Bayes; DT, decision tree; XGB, extreme gradient boosting; ANN, artificial neural network; RF, random forest; AB, adaptive boost; GB, gradient boosting; ROC, receiver operating characteristic; AUC, area under the curve; FP, false positive; FN, false negative; TP, true positive; TN, true negative; TPR, true positive rate; FPR, false positive rate; STD, standard deviation; TNR, true negative rate
⁎ Corresponding author.
E-mail address: [email protected] (H. Choi).

https://doi.org/10.1016/j.catena.2020.104886
Received 13 January 2020; Received in revised form 27 August 2020; Accepted 28 August 2020
0341-8162/ © 2020 Elsevier B.V. All rights reserved.

1. Introduction

Landslides are one of the most severe disasters that cause considerable damage to human lives and economies. Therefore, understanding the collapse mechanisms and accurately estimating the slope stabilities are crucial for landslide risk assessment and management. Owing to the uncertainty and nonlinear nature of geomaterials, it is difficult to reliably evaluate the safety of the slope using conventional physics-based models (e.g., slice method and numerical approach). Michalowski (1995) raised concerns on the accuracy of the slice methods, which have been routinely used in practice, because it is impossible to determine the static admissibility of stress fields or the kinematic admissibility of the collapse mechanism using these methods. The numerical approach, which incorporates computational techniques with the shear strength reduction algorithm (Dawson et al., 1999; Griffiths et al., 1999; Matsui and San, 1992), outperforms the slice methods by eliminating the assumptions on interslice forces and providing essential information for tracing progressive failure (Lechman and Griffiths, 2000). However, this approach requires a deep understanding of soil behavior, which can be ideally described by sophisticated constitutive laws. Moreover, obtaining the solution normally requires prior assumptions and simplifications, which subsequently govern the accuracy of the employed approach (Keaton, 2007).

The remarkable development of machine learning (ML) algorithms along with the extensive data accumulation in this field provides a valuable alternative to the physics-based models that learns and recognizes the failure patterns of slopes under different circumstances. Several ML-based models have been proposed to estimate the safety of slopes with a certain level of success (Kang et al., 2017; Lin et al., 2009; Manouchehrian et al., 2014; Samui, 2013). In addition, various
optimization algorithms (e.g., differential evolutions, firefly algorithm, and particle swarm optimization) have been integrated with these ML models to improve their performance (Das et al., 2011; Hoang and Pham, 2016; Xue, 2017). However, the prediction accuracy of these ML-based models might be excessively sensitive to data properties (e.g., distribution of attributes and data volume) owing to the mathematical assumptions of the utilized ML algorithm (Lin et al., 2018; Qi and Tang, 2018a).

Ensemble learning, inspired by the law of large numbers, has been validated as one of the most efficient approaches for improving the performance of ML models (Hansen and Salamon, 1990; Kuncheva, 2004; Zhou et al., 2002). Fundamentally, ensemble learning enhances the performance of ML models by embracing the diversity of its base predictors to learn different perspectives of the database. This diverse set of predictors can be obtained either by employing different learning algorithms (i.e., heterogeneous ensemble learning) or utilizing one single-learning algorithm trained with random subsets of the training set (i.e., homogeneous ensemble learning).

Although ensemble learning has demonstrated its overwhelming advantages in solving actual problems in various fields, its application is relatively limited in the context of slope stability analysis as only heterogeneous ensemble learning has been applied to evaluate the safety of slope (Qi et al., 2018; Qi and Tang, 2018b). Recently, Bragagnolo et al. (2020) utilized homogeneous ensemble learning to formulate a framework for mapping landslide susceptibility. Although these studies worked on different types of data, they supported the superiority of ensemble learning in significantly enhancing the accuracy of predictions. In addition to parallel learning (i.e., homogeneous and heterogeneous ensembles), ensemble learning possesses a variety of abilities that have not been recognized in this field.

By analyzing the performance of ensemble classifiers using various learning concepts (i.e., parallel learning and sequential learning), this study attempts to systematically examine the ensemble learning used in estimating the safety of the slope. Furthermore, single-learning classification models formulated on the framework of versatile ML algorithms were also presented for comparison. The classification models in the examination were trained and evaluated on a published database of 153 slope cases collected from field investigations at different locations worldwide. The K-fold cross-validation was also applied for a fair assessment.

2. Machine learning algorithms and ensemble learning

2.1. Machine learning algorithms

Complete details of the eight ML algorithms that are considered herein have been well presented in literature (Bishop, 2006; Rasmussen and Williams, 2006; Shalev-Shwartz and Ben-David, 2013). In this section, the advantages and disadvantages of each algorithm are summarized, along with the approaches for controlling their performance.

K-nearest neighbor (KNN) is a non-parametric algorithm that does not assume the underlying data distribution. The training phase of KNN is high-speed; however, prediction with this algorithm is computationally expensive. The critical factors governing the performance of KNN include the number of nearest neighbors (k) of each query point and the leaf size (leaf_size) that affects the speed of construction and query (Pedregosa et al., 2011).

Support vector machine (SVM) is a non-parametric algorithm proposed by Cortes and Vapnik (1995). SVM usually performs well even with high-dimensional data. However, it requires significant amounts of computer resources and occasionally suffers from numerical instability when determining solutions to optimization problems. The generalization error of SVM can be controlled by the regularization parameters, C and γ.

Stochastic gradient descent (SGD) is an iterative method used to optimize a differentiable cost function. Although SGD is considerably fast and enables training with massive training sets, its final solution (i.e., the model parameters) is satisfactory, but not optimal (Géron, 2017). The SGD uses two crucial hyperparameters including the number of epochs (n_iter) and α, which regulates the learning rate, to control the overfitting issue experienced during the training phase.

Gaussian process (GP) is a generic supervised learning method used for binary classification. The main working principle involves placing a GP over the latent function beforehand, which is then squeezed through a link function to obtain a probabilistic classification (Rasmussen and Williams, 2006). Because the prediction is made in terms of Gaussian probability, the empirical confidence intervals can be computed and used to refit the prediction in a region of interest. However, by utilizing the information of entire samples, GP presumably loses efficiency in the case of high-dimensional spaces (Pedregosa et al., 2011).

Quadratic discriminant analysis (QDA) is a modification of the linear discriminant analysis; QDA eliminates the assumption of equal covariance matrices among the groups. This algorithm is desirable because it provides closed-form solutions that are generally suitable in practice. Furthermore, no hyperparameters are required to tune QDA.

Gaussian naïve Bayes (GNB) is formulated based on Bayes' theorem along with the naïve assumption of conditional independence among the input features. Although its underlying assumption is rarely applicable in real data, GNB performs quite well in practice. Furthermore, GNB requires a relatively small amount of training data and yields a high-speed computation. Moreover, GNB can alleviate severe problems related to the curse of dimensionality (Pedregosa et al., 2011).

Decision tree (DT) is a non-parametric model capable of fitting complex datasets. During the training phase, a simple decision rule is inferred from data features to formulate the model. DT is a white-box model, in which all the information regarding model behaviors and influential variables is available. Two vital elements of the white-box model include interpretable model features and the transparent learning process. DT does not require strict data preparation and is commonly used as the base predictor for ensemble learning (e.g., random forest, adaptive boosting, gradient boosting, and extreme gradient boosting). Nevertheless, DT is susceptible to overfitting and instability; generally, the learning phase terminates at a reasonably good solution, not the optimal one. In this study, two hyperparameters were employed to control the overfitting issue: max_depth to determine the maximum depth of the DT, and max_leaf_nodes to define the maximum number of leaf nodes.

Artificial neural network (ANN) combines multiple neurons connected in the form of a network. The advantage of ANN is its ability to learn nonlinear models. However, ANN has the disadvantages of requiring the optimal architecture to be determined and the hyperparameters to be tuned. Furthermore, ANN is sensitive to input feature scaling.

2.2. Ensemble learning

Ensemble learning is a technique used in aggregating individual learning algorithms, known as base predictors, to yield a potentially superior predictor. The diversity among the base predictors plays a vital role in governing the final prediction accuracy (Kuncheva and Whitaker, 2003). Additionally, Minaei-Bidgoli et al. (2014) demonstrated the effect of the resampling method and its adaptation to the efficacy of clustering ensemble. Kuncheva (2004) comprehensively presented the algorithms to combine pattern classifiers. Fundamentally, ensemble learning can be sorted into two groups according to their learning concept: parallel ensemble and sequential ensemble.

2.2.1. Parallel ensemble

Parallel ensemble trains the base predictors in parallel to utilize the characteristics of independence between them. One of the advantages of parallel ensemble is the ability to utilize different CPU cores or machines to execute training and predicting simultaneously. The base predictors can be different learning algorithms (i.e., heterogeneous
ensemble) or a single-learning algorithm (i.e., homogeneous ensemble). Owing to their different mathematical backgrounds, heterogeneous ensemble exploits the diversity of base predictors to increase the probability of different error types. Consequently, the overall prediction accuracy can be improved. For the heterogeneous ensemble classifier, the eight ML algorithms, as mentioned earlier, were employed in this study; their hyperparameters were tuned via the grid search algorithm. Fig. 1 presents the flowchart of a heterogeneous ensemble classifier.

Fig. 1. Flowchart of heterogeneous ensemble classifier.

Besides, a set of diverse classifiers can be achieved using one learning algorithm for all base predictors; however, it must be trained with different random subsets of the training set. Sampling can be carried out either with replacement (i.e., bagging) or without replacement (i.e., pasting). Although these two techniques allow the training data to be sampled several times across the base predictors, only the bagging technique permits sampling data several times for the same predictor. Fig. 2 illustrates the flowchart of a homogeneous ensemble classifier. Random forest (RF) is a well-known example of a homogeneous ensemble, in which the DT is used as the base predictor along with bootstrap sampling. Despite its simplicity, RF is considered as one of the most powerful conventional ML algorithms used in practice. In addition to RF, this study adopted the bagging technique along with utilizing SVM, ANN, and KNN as the base predictors to formulate the homogeneous ensemble classifiers.

Final predictions of both the heterogeneous and homogeneous ensemble classifiers can be obtained by hard-voting or soft-voting. Hard-voting aggregates the predictions of each base predictor, and the class with majority votes is determined as the final decision. In soft-voting, the final prediction is the class with the highest probability averaged over the base predictors. Generally, soft-voting performs better than hard-voting because the former places more weight on highly reliable votes (Géron, 2017).

2.2.2. Sequential ensemble

The sequential ensemble, also known as boosting, trains the base predictors sequentially, with each newly added predictor attempting to correct its predecessor. The predictor focuses more on challenging cases to improve prediction accuracy. Fig. 3 illustrates the flowchart of the sequential ensemble.

One of the most popular sequential ensemble algorithms originally proposed by Freund and Schapire (1997) is Adaptive Boost (AB), in which a new predictor attempts to correct its predecessor by adding more weight on the unfitted training instances. This study briefly summarized the AB learning concept; details regarding this implementation can be found in Freund and Schapire (1997). In the initial state, each instance weight $w^{j}$ ($j = 1, \dots, m$, where $m$ is the number of training instances) is set equal to $1/m$, and the subsequent steps of the training process are as follows:

\[
\begin{aligned}
&\textbf{while } \text{stop criteria} = \text{false}: \\
&\quad \text{Train base predictor } i \text{; its prediction for instance } j \text{ is } \hat{y}_i^{\,j} \\
&\quad \text{Error rate: } r_i = \frac{\sum_{j=1,\; \hat{y}_i^{\,j} \neq y^{j}}^{m} w^{j}}{\sum_{j=1}^{m} w^{j}} \quad (y^{j}: \text{ground truth}) \\
&\quad \text{Predictor weight: } \alpha_i = \eta \log \frac{1 - r_i}{r_i} \quad (\eta: \text{learning rate}) \\
&\quad \text{Update instance weight: } w^{j} \leftarrow
\begin{cases}
w^{j} & \text{if } \hat{y}_i^{\,j} = y^{j} \\
w^{j} e^{\alpha_i} & \text{if } \hat{y}_i^{\,j} \neq y^{j}
\end{cases} \\
&\quad \text{Normalize: } w^{j} \leftarrow \frac{w^{j}}{\sum_{j=1}^{m} w^{j}} \\
&\quad \text{Next base predictor: } i \mathrel{+}= 1
\end{aligned}
\tag{1}
\]

where the stop criterion is met when the number of predictors $n_{predictor}$ is reached.

For the new input $x$, all base predictors of AB are executed to obtain the predictions along with their predictor weights. The class with the majority of weighted votes is determined as the final prediction $\hat{y}(x)$, as expressed in Eq. (2). In particular, the overfitting issue can be controlled by tuning $n_{predictor}$ or using the efficiently regularized base predictors.

\[
\hat{y}(x) = \arg\max_{k} \sum_{i=1,\; \hat{y}_i(x) = k}^{n_{predictor}} \alpha_i
\tag{2}
\]
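The AB training loop of Eq. (1) and the weighted vote of Eq. (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this study: labels are assumed to be encoded as -1/+1, the helper names (train_stump, adaboost_fit, adaboost_predict) and the toy data are hypothetical, and an exhaustively searched one-feature decision stump stands in for the DT base predictor.

```python
import math


def train_stump(X, y, w):
    """Weighted decision stump: choose the (feature, threshold, polarity)
    with the lowest weighted error. Stand-in for a depth-1 decision tree."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                pred = [polarity if row[f] <= t else -polarity for row in X]
                err = sum(wj for wj, p, yj in zip(w, pred, y) if p != yj)
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    _, f, t, polarity = best
    return lambda row: polarity if row[f] <= t else -polarity


def adaboost_fit(X, y, n_predictor=3, eta=1.0):
    """Follow Eq. (1): train, compute r_i and alpha_i, re-weight, normalize."""
    m = len(X)
    w = [1.0 / m] * m                      # initial instance weights 1/m
    ensemble = []
    for _ in range(n_predictor):           # stop criterion: n_predictor reached
        h = train_stump(X, y, w)
        pred = [h(row) for row in X]
        r = sum(wj for wj, p, yj in zip(w, pred, y) if p != yj) / sum(w)
        r = min(max(r, 1e-10), 1 - 1e-10)  # guard against log(0) (assumption)
        alpha = eta * math.log((1 - r) / r)
        # Boost the weights of misclassified instances only, then normalize.
        w = [wj * math.exp(alpha) if p != yj else wj
             for wj, p, yj in zip(w, pred, y)]
        total = sum(w)
        w = [wj / total for wj in w]
        ensemble.append((alpha, h))
    return ensemble


def adaboost_predict(ensemble, row, classes=(-1, 1)):
    """Follow Eq. (2): the class with the largest sum of predictor weights."""
    votes = {k: 0.0 for k in classes}
    for alpha, h in ensemble:
        votes[h(row)] += alpha
    return max(votes, key=votes.get)


# Toy 1-D data that no single stump can separate (hypothetical values).
X_toy = [[1], [2], [3], [4], [5], [6]]
y_toy = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X_toy, y_toy, n_predictor=3)
print([adaboost_predict(model, row) for row in X_toy])
```

On this toy set a single stump misclassifies two of the six instances, whereas three boosted stumps reproduce all labels, which illustrates how each new predictor concentrates on the cases its predecessor missed.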
Breiman (1997) recast AB in the statistical framework, which was then developed into gradient boosting (GB) by Friedman (2001). Conceptually, GB is the combination of AB and weighted minimization, in which the residual errors made by the previous predictor are fed to the new predictor. The objective of GB is to minimize the loss of a model by sequentially adding new base predictors using a gradient descent-like procedure. The difference between GB and AB is that GB freezes the weights of the predecessor whenever the new predictor is added. The three principal components of GB are loss function, weak learner, and additive model. For the classification task, the logarithmic loss and DT are broadly used. Mason et al. (2000) proposed a functional-gradient-descent algorithm that aids the addition of new predictors in the direction of reducing the residual loss of the model. The new predictors are added until the predefined number is achieved or the loss in the validation set does not improve on the next iterations. GB overcomes the overfitting problem by employing more regularized base predictors or adjusting the learning rate for updating the weights. Additionally, random sampling can be used to reduce the variance, in which the
subsets of the training set are sampled randomly to train each base predictor.

Fig. 2. Flowchart of homogeneous ensemble classifier.

Fig. 3. Flowchart of sequential ensemble learning.

The most significant drawback of the sequential ensemble is that it cannot be parallelized, which makes it unscalable (Géron, 2017). However, Chen and Guestrin (2016) developed extreme gradient boosting (XGB), a scalable tree boosting system, on the framework of GB to increase computational speed. The advantages of XGB are evident in its ability to support distributed training and integrate with the cloud dataflow system. Distributed training on multiple machines is implemented using built-in interfaces to integrate them with distributed computing frameworks (e.g., DASK) to perform feature engineering or allocating base predictors. Furthermore, XGB utilizes more regularized models to settle the overfitting issue, due to which it performs better than GB (Chen and Guestrin, 2016). Consequently, XGB dominates most of the other learning algorithms in recent ML comparisons. In addition to AB and GB, this study applied XGB to formulate boosting classifiers.

3. Database

This study analyzed the database of 153 slope cases documented in published literature (Chen et al., 2011; Feng, 2000; Sakellariou and Ferentinou, 2005; Wang et al., 2005; Xu et al., 1999). The database contains information regarding the geological conditions, geometry, stability status, and location of the slope obtained from field investigations during the period from 1930 to 2005. Fig. 4 illustrates the locations of the slope cases considered in this study. It is observed that these slope locations were relatively well-distributed across different areas (e.g., Europe, Asia, and North America) around the world. The database of Sakellariou and Ferentinou (2005) consists of 46 slope cases, which were adjusted from the original database of Sah et al. (1994). The 53 rock slope cases investigated by Chen et al. (2011) were obtained from a mountainous area in Guizhou, a southwestern province of China. The other 26 slope cases explored and statistically summarized by Wang et al. (2005) represented typical and large-scale slopes with a high probability of failure, located in the Qing river basin region, China. The remaining slope cases obtained from the databases of Xu et al. (1999) and Feng (2000) were rock slopes.

Fig. 4. Approximate locations of slope cases in consideration.

Five factors defined the geological and geometry conditions of slopes, namely: unit weight γ (kN/m3), cohesion c (kPa), internal friction angle φ (rad), slope height H (m), and slope inclination angle β (rad). These five factors were then utilized as the input features of the classification models. The stability status of the slope cases was identified as either stable (S) or failure (F). Fig. 5 compares the numbers of failure and stable cases, indicating a slight skew toward the stable slopes.

Table 1 summarizes the statistical descriptions of the five factors in the examined database. According to statistics, the values of these
factors have broad ranges, thereby indicating that the database consists of diverse soil types and slope conditions. Fig. 6 illustrates the distribution of these factors in their ranges.

Fig. 5. Comparison of number of failures and stable cases in the examined database.

Table 1
Statistical descriptions of input features of database.

Statistical descriptions   γ (kN/m3)   c (kPa)   φ (rad)   β (rad)   H (m)
Number                     153         153       153       153       153
Mean                       22.60       34.73     0.50      0.60      97.26
Standard Deviation         3.90        43.19     0.16      0.18      115.46
Min                        12          0         0         0.17      3.66
Q1 (25%)                   20.41       11.97     0.43      0.49      30.50
Median - Q2 (50%)          22.40       29.30     0.53      0.61      50
Q3 (75%)                   26.20       40.00     0.61      0.74      108
Max                        28.44       300       0.79      1.03      511

Fig. 6. Distribution of five factors of database on their ranges.

The unit weight exhibited a possible bimodal distribution, ranging from 12 to 28.44 kN/m3, which was mostly centered in the range 20.41–26.20 kN/m3. The cohesion had a broad range (0–300 kPa), which skewed toward the smaller value ranging from 11.97 to 40 kPa. Only 25% of the slope cases had a cohesion higher than 40 kPa, with nine outliers exhibiting the cohesion greater than 100 kPa. The internal friction angle showed a nearly unimodal distribution in the range 0–0.79 rad, which was majorly centered in the range 0.43–0.61 rad. The slope inclination angle also showed an approximately normal distribution in the range 0.17–1.03 rad, which indicated that the database contains a variety of slope geometries (i.e., slight to steep inclination). In the case of slope height, this value had the broadest range among the other factors (i.e., 3.66–511 m), representing different slopes from relatively low to extremely high slopes. The slope height, which was mainly centered around 30.5–108 m, skewed toward the lower value. Only 25% of the slope cases had a slope height greater than 108 m. In particular, 20 outliers with slope height greater than 239 m were detected.

These five factors have different scales that can significantly affect the performance of SVM and ANN. The standardization technique was applied to adjust the input features to the same scale before implementing these two models. Furthermore, besides the internal friction angle and the slope inclination angle, the other histograms are tail-heavy or bimodal distributions; the outliers were also detected. These two properties of the database make it difficult for ML algorithms to learn the patterns.

Moreover, the highly correlated input features could degrade the performance of ML algorithms. Therefore, the Pearson correlation coefficient (P) was briefly applied to examine the correlation between each pair of input features. Fig. 7 illustrates the heatmap of the Pearson correlation coefficient.

Fig. 7. Heatmap of correlation of each pair of input features. Notice: X1: γ (kN/m3); X2: c (kPa); X3: φ (rad); X4: β (rad); and X5: H (m).

According to the obtained results, the γ and φ pair (P = 0.57) is the most correlated among the pairs of input features examined. The correlation of this pair is consistent with that of the physical interpretation. However, the level of correlation of this pair might be considered as a medium, which could be adopted to classify the stability status of slopes.

4. Methodology

4.1. Dataset partition

ML models conduct specific tasks in accordance with the patterns extracted from the given databases. The process of recognizing the regularity and pattern of the database is called the learning or training process. Once the learning phase is completed, the trained model can
Table 3
Layout of confusion matrix to visualize performance of a classifier.
Predicted

Failure (Negative) Stable (Positive)

Actual Failure (Negative) True negative (TN) False positive (FP)

Stable (Positive) False negative (FN) True positive (TP)

where TN: number of failure slopes classified correctly as the failure class; FP:
number of failure slopes classified incorrectly as the stable class; FN: number of
stable slopes classified incorrectly as the failure class; TP: number of stable
slopes classified correctly as the stable class.

convenience, the harmonic means of precision and recall, known as the

F1 score, was employed to measure the performance of classification
models, as expressed in Eq. (3). Furthermore, other concise metrics
including accuracy, receiver operating characteristic (ROC) curve, and
Fig. 8. Histogram of slope height categories. area under the ROC (AUC) were also employed for efficient evaluation,
as expressed in Eq. (3). The accuracy is determined as the ratio of in
properly execute a given task on the previously unseen inputs, and this stances that are correctly classified. The ROC curve plots the true po
ability is known as generalization. This study used 80% of the database sitive rate (TPR) against the false positive rate (FPR). A good classifi
to train the classification models, and the remaining 20% to evaluate cation model should have an AUC value close to 1.
the generalization capacity.
Because the volume of the examined database is relatively small, F1 = 1
2
+
1 ( precision = TP
TP + FP
recall = TPR =
TP
TP + FN )
with 153 slope cases alone, stratified sampling was applied to ensure precision recall
FP
the test set is representative. In other words, the database was divided FPR = FP + TN
into homogenous subgroups, called strata, and the sampling process Accuracy =
TP + TN
was performed on each stratum. Among the five input features, the TP + TN + FN + FP (3)
slope height is considered to be the most sensitive to slope stability (Lin For fair assessment, this study applied K-fold cross-validation
et al., 2018). Therefore, this study conducted stratified sampling for the (Stone, 1974) to examine the performance of classification models. The
test set according to the slope height categories. Although the slope cross-validation split the training set into k subsets called the folds.
height was experimentally categorized, sufficient instances were still Thereafter, the examined models were trained and evaluated k times.
ensured in each stratum to avoid sampling bias. Fig. 8 illustrates the Each time, k-1 folds were picked for training, and the remaining folds
histogram of slope height categories, and Table 2 summarizes the range were used to evaluate the classification model. The results of the K-fold
of slope heights corresponding to each category. The obtained results cross-validation are expressed as an array containing k evaluation
show a relatively similar portion of the slope height categories between scores. In this study, considering the computational time, k was set as
the test set and overall database. 10.

4.2. Performance measurement and K-fold cross-validation 4.3. Hyperparameters tuning

This study used the confusion matrix to visualize the performance of As mentioned earlier, the ML algorithms considered in this study
classification models. Each row and column in the confusion matrix provide a set of hyperparameters to control the gap between the
represents actual and predicted classes, respectively. Table 3 presents a training and test errors. However, manually tuning the hyperpara
layout of the confusion matrix. meters to determine the best set of hyperparameter values is tedious
False positive (FP) and false negative (FN) evidently represent dif and time-consuming work. Instead, this study applied a grid search
ferent insights in Table 3. In the context of slope stability analysis, as approach to automatically tune hyperparameters. All possible combi
well as risk assessment, significant attention should be paid to the FP nations of hyperparameters were generated from a predefined grid of
because the cost of misclassifying negative samples (e.g., unstable hyperparameters, which, in turn, depend on each algorithm. Each
slopes) could be more than that of omitting positive samples (e.g., combination of hyperparameters was then evaluated using the cross-
stable slopes). However, a high number of positive samples detected validation, in which the F1 score was chosen to rate its performance.
incorrectly (i.e., high FN) could mislead the decision in the resource The grid search results in the best combination of hyperparameters for
management or cost-benefit analysis. Because the databases in most the algorithm that provides the highest F1 score. Table 4 summarizes
practical applications contain noise, adjusting classification models can the results of tuning the grid search hyperparameters for all the algo
either increase the ratio of positive instances correctly detected (i.e., rithms considered in this study.
recall) or the accuracy of the positive prediction (i.e., precision). This Fig. 9 presents a concise flowchart of the procedure applied in this
phenomenon is the renowned precision-recall tradeoff situation. For study, including the processing of data, implementation of classification
models, and evaluation steps.
Table 2
Slope height categories and their portion in test set and overall dataset. 5. Results
Category Range (m) Number Overall Stratified Sampling
5.1. F1 score and accuracy
1 3.66–22 33 0.216 0.226
2 26–50 45 0.294 0.290 Fig. 10 presents the F1 score and accuracy obtained from the K-fold
3 51–75 23 0.150 0.161
cross-validation, including the evaluation of the training and test set.
4 76.81–100 13 0.085 0.065
5 108–511 39 0.255 0.258 Table 5 summarizes the confusion matrix, F1 score and accuracy.
A close observation of the results from the K-fold cross-validation


revealed a permutation in the rank order of the performance of the single-learning classification models based on the F1 score and accuracy. However, both metrics supported the superiority of SVM-CM compared to other single-learning classification models, as shown in Fig. 10(a).

By adopting ensemble learning, the averaged values of the F1 score and accuracy of single-learning classification models increased by 2.17% and 1.66%, respectively. According to the significant increase in the F1 score and accuracy, ranging from 1.56 to 2.93% and 0.92 to 2.81%, respectively, it was evident that the boosting technique efficiently enhanced the performance of the classification models compared to the other two techniques (i.e., homogeneous and heterogeneous ensemble). Among them, the XGB-CM provided the highest values for both the F1 score and accuracy.

The evaluation results of the test set obtained from both metrics yielded the same rank order of performance for the classification models, which was relatively different from the results of the K-fold cross-validation. The GP-CM outperformed the other single-learning classification models with its highest F1 score of 0.893 and accuracy (correctly classifying 26 out of 31 slopes) on the test set.

The correlation of the F1 score and accuracy between the ensemble classifiers and single-learning classification models was similar to the observation from the K-fold cross-validation mentioned earlier. The averaged F1 score and accuracy of ensemble classifiers were approximately 5.58% and 5.82%, respectively, higher than those of the single-learning classification models. Furthermore, with its highest F1 score of 0.914 and accuracy (correctly classifying 28 out of 31 slopes) on the test set, the XGB-CM still outperformed the other classification models considered in this study. It is noted that, compared to the examined classification models, the GB-CM was the most susceptible to overfitting, as shown in Fig. 10(b) and (d).

Table 4
Best combination of hyperparameters for each ML algorithm.
Machine learning algorithms   Hyperparameters
SVM                           kernel: linear, c = 1.1, γ = 0.001
KNN                           leaf_size = 2, n_neighbors = 11
SGD                           α = 0.1, max_iter = 1640
DT                            max_depth = 2, max_leaf_nodes = 5
NN                            α = 0.1, number of neurons in hidden layer: 2, early stopping: True
RF                            Bootstrap: True, n_estimators = 130, base_predictor: DT
AB                            n_estimators = 20, base_predictor: DT
GB                            n_estimators = 20, base_predictor: DT
XGB                           n_estimators = 150, base_predictor: DT
B-SVM                         n_estimators = 40, base_predictor: SVM
B-NN                          n_estimators = 39, base_predictor: NN
B-KNN                         n_estimators = 39, base_predictor: KNN

Fig. 9. Flowchart for procedure of data processing, ML model implementation, and evaluation.

5.2. ROC-AUC

Fig. 11 illustrates the ROC-AUC for the performance of the classification models; a significant difference was observed in the evaluation results. For convenience, the general rule proposed by Hosmer and Lemeshow (2000), as shown in Table 6, was adopted to categorize the performance of the classification models according to their AUCs.

Table 7 summarizes the AUC obtained from the K-fold cross-validation ($\mathrm{AUC}_{K\text{-fold}}$), along with the evaluation on the training and test set. In the case of single-learning classification models, $\mathrm{AUC}_{K\text{-fold}}$ ranging from 0.846 to 0.956 covers two discrimination categories in Table 6 (i.e., excellent and outstanding discrimination). In particular, among these single-learning classification models, GP-CM provided the highest $\mathrm{AUC}^{\mathrm{GP}}_{K\text{-fold}}$ of 0.956, sequentially followed by KNN-CM and SVM-CM. Moreover, according to the standard deviation (STD) of their K-fold cross-validation results, the performance of the former model was relatively more stable than that of the latter ones.

In the case of ensemble classifiers, all the performance belonged to the outstanding discrimination category, with $\min(\mathrm{AUC}_{K\text{-fold}})$ equal to 0.93. The averaged $\mathrm{AUC}_{K\text{-fold}}$ of the ensemble classifiers was 6.27% higher than that of the single-learning classification models. Applying the bagging technique (e.g., B-KNN, B-ANN, and B-SVM) improved the performance of these classification models, in terms of both the AUC, which increased by 3.09% to 6.64%, and the stability expressed by the reduction in the value of the STD.

The boosting technique efficiently enhanced the performance of the classification models compared with that of the other two techniques. Among the ensemble classifiers, the highest AUC and smallest STD were obtained in the case of B-KNN, followed by those of the XGB-CM, as shown in Fig. 11(a) and Table 7.

The evaluation results of the test set were similar to those of the K-fold cross-validation. Generally, the averaged $\mathrm{AUC}_{\mathrm{Test}}$ of the ensemble classifiers was 12.45% higher than that of the single-learning classification models. The performance of single-learning classification models crossed all the discrimination categories in Table 6. The outstanding performance of GP-CM and KNN-CM was still superior to the remaining single-learning classification models considered in this study. However, when using the test set, the performance of SVM-CM was classified as acceptable discrimination. Additionally, in the case of ensemble classifiers, compared to the remaining classification models, the XGB-CM and heterogeneous ensemble provided the highest AUC of the test set, with $\mathrm{AUC}_{\mathrm{Test}}$ equal to 0.95, as shown in Fig. 11(b).

The additional metric of specificity, also known as the true negative rate (TNR) and defined in Eq. (4), was adopted to further analyze the performance of the classification models along with TPR, which is known as sensitivity or recall. The specificity is defined as the ratio of negative instances that are correctly detected (Géron, 2017).

$$\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \tag{4}$$

Fig. 12 presents the correlation of TNR and TPR corresponding to each classification model on the test set. Most of the examined classification models belonged to zone 2, which tended to classify slope cases as stable (positive class) more often than as failure (negative class), particularly in the AB-CM. The classification models implemented with SVM, GNB, ANN, RF, and heterogeneous ensemble (i.e., group III in


Fig. 12) showed relatively similar predictions for both the stable and failure slopes, which were expressed by the relatively approximate values of TPR and TNR. XGB-CM evidently exhibited outstanding performance in detecting the stable slopes, and a significantly high accuracy in detecting the failure slopes, as shown in Fig. 12; XGB-CM detected 16 out of the 16 stable slopes correctly, resulting in a TPR of 1, and also estimated 12 out of the 15 failure slopes correctly, resulting in a TNR of 0.8.

The classification models implemented along with KNN, B-KNN, B-SVM, GP, and SGD classified slopes as failures more often than as stable. In particular, the SGD-CM detected 14 out of 15 unstable slopes correctly; however, it could correctly classify only 8 out of 16 stable slopes.

Fig. 10. F1 score and accuracy of classification models: (a) F1 score from K-fold cross-validation; (b) F1 score from evaluation on training and test set; (c) accuracy from K-fold cross-validation; and (d) accuracy from evaluation on training and test set.

Table 5
General rule for classifying the discrimination according to the AUC.
AUC values          Discrimination categories
AUC = 0.5           No discrimination
0.7 ≤ AUC < 0.8     Acceptable
0.8 ≤ AUC < 0.9     Excellent
0.9 ≤ AUC           Outstanding

6. Discussion

The input requirements of the classification models presented in this study were simplified into five fundamental parameters among crucial factors (Kavzoglu et al., 2014; Lee et al., 2018) to evaluate the safety of the slope. The statistical descriptions summarized in Table 1 demonstrate the diversity of these features in representing the geological and geometric conditions of slopes at different locations worldwide, as shown in Fig. 4. Consequently, the models trained with such a database have high applicability for landslide risk assessment and management on a global scale. Further improvement could be achieved in accordance with the availability of databases to address additional factors determining the stable state of slopes, such as rainfall (Matsushi et al., 2006; Pham et al., 2018) and earthquake (Alfaro et al., 2012; Rodríguez et al., 1999).

The results of rigorous evaluation summarized in Tables 5 and 7

Fig. 11. ROC-AUC of classification models (a) ROC-AUC from K-fold cross-validation; (b) ROC-AUC from evaluation on training and test set.


Table 6
Result of performance measurements of classification models.
Classification   Confusion matrix   F1                             Accuracy
models           on test set*       K-fold         Train   Test    K-fold         Train   Test
KNN              13   2             0.852 ± 0.13   0.876   0.839   0.869 ± 0.08   0.877   0.839
                  3  13
SVM              12   3             0.864 ± 0.08   0.880   0.812   0.869 ± 0.06   0.877   0.806
                  3  13
SGD              14   1             0.809 ± 0.11   0.667   0.640   0.853 ± 0.09   0.820   0.710
                  8   8
GP               13   2             0.849 ± 0.14   0.930   0.893   0.861 ± 0.1    0.926   0.839
                  3  13
QDA              11   4             0.852 ± 0.09   0.859   0.788   0.853 ± 0.07   0.852   0.774
                  3  13
GNB              12   3             0.852 ± 0.09   0.877   0.812   0.853 ± 0.07   0.869   0.806
                  3  13
DT               11   4             0.831 ± 0.14   0.885   0.788   0.852 ± 0.1    0.885   0.774
                  3  13
ANN              12   3             0.852 ± 0.1    0.868   0.821   0.844 ± 0.1    0.861   0.806
                  3  13
B-KNN            14   1             0.874 ± 0.1    0.924   0.867   0.877 ± 0.09   0.918   0.871
                  3  13
B-SVM            14   1             0.844 ± 0.13   0.876   0.867   0.86 ± 0.08    0.877   0.871
                  3  13
B-ANN            12   3             0.877 ± 0.11   0.962   0.848   0.877 ± 0.09   0.959   0.839
                  2  14
RF               12   3             0.844 ± 0.13   0.905   0.812   0.861 ± 0.09   0.902   0.806
                  3  13
AB               11   4             0.866 ± 0.09   0.948   0.857   0.869 ± 0.08   0.943   0.839
                  1  15
GB               11   4             0.854 ± 0.1    0.953   0.788   0.862 ± 0.08   0.951   0.774
                  3  13
XGB              12   3             0.883 ± 0.1    0.993   0.914   0.886 ± 0.08   0.992   0.903
                  0  16
Het. ensemble    12   3             0.866 ± 0.1    0.928   0.812   0.876 ± 0.1    0.926   0.806
                  3  13

*Note: the layout of the confusion matrix follows Table 3.
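The test-set F1 score, accuracy, TPR, and TNR can be recomputed from a confusion matrix, which is a convenient cross-check on the tabulated values. The sketch below assumes that the first matrix row holds the actual unstable (negative) slopes as (TN, FP) and the second row holds the actual stable (positive) slopes as (FN, TP); this layout is inferred from the statements in the text about XGB-CM and SGD-CM, not taken from Table 3 itself. The class and property names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    # Assumed layout: row 1 = actual unstable (TN, FP), row 2 = actual stable (FN, TP).
    tn: int
    fp: int
    fn: int
    tp: int

    @property
    def accuracy(self) -> float:
        return (self.tn + self.tp) / (self.tn + self.fp + self.fn + self.tp)

    @property
    def tpr(self) -> float:
        """Sensitivity (recall): share of stable slopes classified as stable."""
        return self.tp / (self.tp + self.fn)

    @property
    def tnr(self) -> float:
        """Specificity, Eq. (4): share of unstable slopes classified as unstable."""
        return self.tn / (self.tn + self.fp)

    @property
    def f1(self) -> float:
        """Harmonic mean of precision and recall, written in terms of counts."""
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


xgb = ConfusionMatrix(tn=12, fp=3, fn=0, tp=16)
sgd = ConfusionMatrix(tn=14, fp=1, fn=8, tp=8)

for name, cm in (("XGB", xgb), ("SGD", sgd)):
    print(name, round(cm.f1, 3), round(cm.accuracy, 3), round(cm.tpr, 3), round(cm.tnr, 3))
# XGB 0.914 0.903 1.0 0.8
# SGD 0.64 0.71 0.5 0.933
```

Under this assumed layout, the XGB and SGD rows reproduce the tabulated test-set F1 and accuracy values as well as the TPR of 1 and TNR of 0.8 quoted for XGB-CM.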

Table 7
ROC-AUC of the classification models.
Classification model   ROC-AUC
                       K-fold         Train   Test
KNN                    0.935 ± 0.07   0.968   0.931
SVM                    0.906 ± 0.06   0.916   0.796
SGD                    0.846 ± 0.09   0.763   0.688
GP                     0.956 ± 0.05   0.988   0.933
QDA                    0.88 ± 0.07    0.892   0.817
GNB                    0.85 ± 0.06    0.890   0.775
DT                     0.887 ± 0.09   0.952   0.829
ANN                    0.873 ± 0.06   0.935   0.817
B-KNN                  0.973 ± 0.03   0.986   0.938
B-SVM                  0.934 ± 0.06   0.972   0.892
B-ANN                  0.931 ± 0.06   0.994   0.933
RF                     0.962 ± 0.04   0.985   0.904
AB                     0.952 ± 0.05   0.994   0.910
GB                     0.933 ± 0.06   0.994   0.929
XGB                    0.965 ± 0.04   0.999   0.950
Het. ensemble          0.93 ± 0.08    0.987   0.950
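As a worked check on the averaged AUC comparison, the sketch below recomputes the relative gain of the ensemble classifiers over the single-learning models from the K-fold and test columns of Table 7, and maps each AUC to a discrimination category following the Hosmer and Lemeshow (2000) thresholds quoted in the paper. The function names are illustrative, and the "below acceptable" label for AUC values under 0.7 is an assumption, since the rule as quoted lists no category for that range.

```python
from statistics import mean

# (K-fold AUC, test AUC) copied from Table 7.
single = {
    "KNN": (0.935, 0.931), "SVM": (0.906, 0.796), "SGD": (0.846, 0.688),
    "GP": (0.956, 0.933), "QDA": (0.88, 0.817), "GNB": (0.85, 0.775),
    "DT": (0.887, 0.829), "ANN": (0.873, 0.817),
}
ensemble = {
    "B-KNN": (0.973, 0.938), "B-SVM": (0.934, 0.892), "B-ANN": (0.931, 0.933),
    "RF": (0.962, 0.904), "AB": (0.952, 0.910), "GB": (0.933, 0.929),
    "XGB": (0.965, 0.950), "Het. ensemble": (0.93, 0.950),
}


def relative_gain(column: int) -> float:
    """Percentage by which the mean ensemble AUC exceeds the mean single-learning AUC."""
    base = mean(values[column] for values in single.values())
    ens = mean(values[column] for values in ensemble.values())
    return 100 * (ens / base - 1)


def discrimination(auc: float) -> str:
    """Category per the quoted rule; the label for AUC < 0.7 is an assumption."""
    if auc >= 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "below acceptable"


print(round(relative_gain(0), 2), round(relative_gain(1), 2))  # 6.27 12.45
print(sorted({discrimination(k_fold) for k_fold, _ in single.values()}))
# ['excellent', 'outstanding']
```

The two gains agree with the 6.27% and 12.45% figures reported for the K-fold and test evaluations, and the K-fold categories for the single-learning models match the two categories named in the text.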

Fig. 12. Correlation between TPR and TNR of classification models on test set.
agree with previous studies regarding the high potential of ensemble learning in enhancing the performance of ML. Furthermore, ensemble classifiers work efficiently with big data (e.g., landscapes at high resolution) owing to their ability to distribute training on multiple machines or servers. In the context of landslide susceptibility mapping, although high-resolution mapping can help improve estimation accuracy by providing additional details regarding landslide features, most of the current ML-based approaches restrict mapping at a reasonably high resolution owing to computational efficiency (Huang and Zhao, 2018; Wu et al., 2014). Consequently, the ensemble learning models presented in this study could be promising for landslide susceptibility mapping, particularly when dealing with large landscapes.

The systematic measurement of multiple metrics (i.e., F1 score, accuracy, ROC-AUC, TNR, and TPR) demonstrated the reliability of ensemble classifiers in estimating the safety of the slope with consistently high accuracy, particularly in the case of XGB-CM. However, owing to the bias of the database toward positive samples, the trained ensemble classifiers considered in this study tended to classify more slopes as stable than as unstable, as shown in Fig. 5. This tendency may reduce


the reliability of such models in landslide assessments, which primarily focus on detecting unstable slopes. However, this limitation could be eliminated by increasing the volume of data.

7. Conclusion

This study developed classification models using ensemble learning to estimate the stability status of slopes. Two prominent ensemble techniques were employed to implement these ensemble classifiers, including the parallel ensemble with both homogeneous and heterogeneous ensembles and the sequential ensemble. Additionally, for comparison, eight versatile learning algorithms (i.e., KNN, SVM, GP, GNB, QDA, ANN, DT, and SGD) widely used in literature for slope stability analyses were considered. The grid search algorithm was applied to tune the hyperparameters of each learning algorithm. Furthermore, K-fold cross-validation was employed to fairly evaluate the performance of the classification models.

The obtained results demonstrated the superiority of ensemble classifiers over the single-learning classification models in evaluating the stability status of slopes. When ensemble learning was applied to implement the classification model instead of the single-learning algorithm, the averaged F1 score, accuracy, and AUC of the classification models from the K-fold cross-validation increased by 2.17%, 1.66%, and 6.27%, respectively. In particular, boosting learning significantly improved the performance of the classification models. The highest F1 score, accuracy, and AUC of 0.914, 0.903, and 0.95, respectively, on the test set were obtained by XGB-CM. Furthermore, XGB-CM detected 16 out of the 16 stable slopes and 12 out of the 15 failure slopes correctly, thereby indicating its outstanding performance. Consequently, ensemble learning, particularly XGB, is strongly suggested to develop reliable models for estimating slope stability status as well as for further applications in geotechnical fields.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1A2C2086647).

References

Alfaro, P., Delgado, J., García-Tortosa, F.J., Lenti, L., López, J.A., López-Casado, C., Martino, S., 2012. Widespread landslides induced by the Mw 5.1 earthquake of 11 May 2011 in Lorca, SE Spain. Eng. Geol. 137–138, 40–52. https://doi.org/10.1016/j.enggeo.2012.04.002.
Bishop, C.M., 2006. Pattern Recognition and Machine Learning, Springer-Verlag, New York. https://doi.org/10.1016/B978-044452701-1.00059-4.
Bragagnolo, L., da Silva, R.V., Grzybowski, J.M.V., 2020. Artificial neural network ensembles applied to the mapping of landslide susceptibility. CATENA 184, 104240. https://doi.org/10.1016/J.CATENA.2019.104240.
Breiman, L., 1997. Arcing the edge. Statistics (Ber).
Chen, C., Xiao, Z., Zhang, G., 2011. Stability assessment model for epimetamorphic rock slopes based on adaptive neuro-fuzzy inference system. Electron. J. Geotech. Eng. 16 A, 93–107.
Chen, T., Guestrin, C., 2016. XGBoost: A scalable tree boosting system. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, https://doi.org/10.1145/2939672.2939785.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273–297. https://doi.org/10.1023/A:1022627411411.
Das, S.K., Biswal, R.K., Sivakugan, N., Das, B., 2011. Classification of slopes and prediction of factor of safety using differential evolution neural networks. Environ. Earth Sci. 64, 201–210. https://doi.org/10.1007/s12665-010-0839-1.
Dawson, E.M., Roth, W.H., Drescher, A., 1999. Slope stability analysis by strength reduction. Géotechnique 49, 835–840. https://doi.org/10.1680/geot.1999.49.6.835.
Feng, X.-T., 2000. Introduction of intelligent rock mechanics.
Freund, Y., Schapire, R.E., 1997. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139. https://doi.org/10.1006/jcss.1997.1504.
Friedman, J.H., 2001. Greedy function approximation: A gradient boosting machine. Ann. Stat. https://doi.org/10.2307/2699986.
Géron, A., 2017. Hands-on Machine Learning with Scikit-Learn & TensorFlow.
Griffiths, D.V., Lane, P.a., Hyatt, M., Way, C., Station, F., Handbook, M.M., Nakamura, A., Cai, F., Ugai, K., Lau, Y.C.C.K., Roth, W.H., Dawson, E.M., Drescher, A., He, B., Zhang, H., Matthews, C., Farook, Z., Stability, T., Cruikshank, K.M., 1999. Slope stability analysis by Finite elements. Geotechnique 49, 387–403. https://doi.org/10.1680/geot.1999.49.6.835.
Hansen, L.K., Salamon, P., 1990. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. https://doi.org/10.1109/34.58871.
Hoang, N.D., Pham, A.D., 2016. Hybrid artificial intelligence approach based on metaheuristic and machine learning for slope stability assessment: A multinational data analysis. Expert Syst. Appl. 46, 60–68. https://doi.org/10.1016/j.eswa.2015.10.020.
Hosmer, D.W., Lemeshow, S., 2000. Applied logistic regression second edition. Appl. Logist. Regress. https://doi.org/10.1002/0471722146.
Huang, Y., Zhao, L., 2018. Review on landslide susceptibility mapping using support vector machines. Catena 165, 520–529. https://doi.org/10.1016/j.catena.2018.03.003.
Kang, F., Xu, B., Li, J., Zhao, S., 2017. Slope stability evaluation using Gaussian processes with various covariance functions. Appl. Soft Comput. J. 60, 387–396. https://doi.org/10.1016/j.asoc.2017.07.011.
Kavzoglu, T., Sahin, E.K., Colkesen, I., 2014. Landslide susceptibility mapping using GIS-based multi-criteria decision analysis, support vector machines, and logistic regression. Landslides 11, 425–439. https://doi.org/10.1007/s10346-013-0391-7.
Keaton, J.R., 2007. Rock slope engineering. Environ. Eng. Geosci. https://doi.org/10.2113/gseegeosci.13.4.369.
Kuncheva, L.I., 2004. Combining Pattern Classifiers: Methods and Algorithms. John Wiley & Sons.
Kuncheva, L.I., Whitaker, C.J., 2003. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach. Learn. https://doi.org/10.1023/A:1022859003006.
Lechman, J.B., Griffiths, D.V., 2000. Analysis of the progression of failure of earth slopes by finite elements. Slope Stability 2000, 250–265.
Lee, J.H., Sameen, M.I., Pradhan, B., Park, H.J., 2018. Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods. Geomorphology 303, 284–298. https://doi.org/10.1016/j.geomorph.2017.12.007.
Lin, H.M., Chang, S.K., Wu, J.H., Juang, C.H., 2009. Neural network-based model for assessing failure potential of highway slopes in the Alishan, Taiwan Area: Pre- and post-earthquake investigation. Eng. Geol. 104, 280–289. https://doi.org/10.1016/j.enggeo.2008.11.007.
Lin, Y., Zhou, K., Li, J., 2018. Prediction of slope stability using four supervised learning methods. IEEE Access 6, 31169–31179. https://doi.org/10.1109/ACCESS.2018.2843787.
Manouchehrian, A., Gholamnejad, J., Sharifzadeh, M., 2014. Development of a model for analysis of slope stability for circular mode failure using genetic algorithm. Environ. Earth Sci. 71, 1267–1277. https://doi.org/10.1007/s12665-013-2531-8.
Mason, L., Baxter, J., Bartlett, P., Frean, M., 2000. Boosting algorithms as gradient descent. In: Advances in Neural Information Processing Systems.
Matsui, T., San, K.-C., 1992. Finite element slope stability analysis by shear strength reduction technique. Soils Found. 32, 59–70. https://doi.org/10.3208/sandf1972.32.59.
Matsushi, Y., Hattanji, T., Matsukura, Y., 2006. Mechanisms of shallow landslides on soil-mantled hillslopes with permeable and impermeable bedrocks in the Boso Peninsula, Japan. Geomorphology 76, 92–108. https://doi.org/10.1016/j.geomorph.2005.10.003.
Michalowski, R.L., 1995. Slope stability analysis: a kinematical approach. Géotechnique 45, 283–293. https://doi.org/10.1680/geot.1995.45.2.283.
Minaei-Bidgoli, B., Parvin, H., Alinejad-Rokny, H., Alizadeh, H., Punch, W.F., 2014. Effects of resampling method and adaptation on clustering ensemble efficacy. Artif. Intell. Rev. 41, 27–48. https://doi.org/10.1007/s10462-011-9295-x.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., 2011. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Pham, K., Kim, D., Choi, H.J., Lee, I.M., Choi, H., 2018. A numerical framework for infinite slope stability analysis under transient unsaturated seepage conditions. Eng. Geol. 243, 36–49. https://doi.org/10.1016/j.enggeo.2018.05.021.
Qi, C., Fourie, A., Ma, G., Tang, X., 2018. A hybrid method for improved stability prediction in construction projects: A case study of stope hangingwall stability. Appl. Soft Comput. J. 71, 649–658. https://doi.org/10.1016/j.asoc.2018.07.035.
Qi, C., Tang, X., 2018a. Slope stability prediction using integrated metaheuristic and machine learning approaches: A comparative study. Comput. Ind. Eng. 118, 112–122. https://doi.org/10.1016/j.cie.2018.02.028.
Qi, C., Tang, X., 2018b. A hybrid ensemble method for improved prediction of slope stability. Int. J. Numer. Anal. Methods Geomech. 42, 1823–1839. https://doi.org/10.1002/nag.2834.
Rasmussen, C.E., Williams, C.K.I., 2006. Gaussian Processes for Machine Learning. MIT Press.
Rodriguez, C.E., Bommer, J.J., Chandler, R.J., 1999. Earthquake-induced landslides: 1980–1997. Soil Dyn. Earthq. Eng. 18, 325–346. https://doi.org/10.1016/S0267-7261(99)00012-3.
Sah, N.K., Sheorey, P.R., Upadhyaya, L.N., 1994. Maximum likelihood estimation of slope stability. Int. J. Rock Mech. Min. Sci. Geomech. Abstr. 31, 47–53. https://doi.org/10.1016/0148-9062(94)92314-0.


Sakellariou, M.G., Ferentinou, M.D., 2005. A study of slope stability prediction using neural networks. Geotech. Geol. Eng. 23, 419–445. https://doi.org/10.1007/s10706-004-8680-5.
Samui, P., 2013. Support vector classifier analysis of slope. Geomatics. Nat. Hazards Risk 4, 1–12. https://doi.org/10.1080/19475705.2012.684725.
Shalev-Shwartz, S., Ben-David, S., 2013. Understanding machine learning: From theory to algorithms, Understanding Machine Learning: From Theory to Algorithms. https://doi.org/10.1017/CBO9781107298019.
Stone, M., 1974. Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Ser. B. Methodol. https://doi.org/10.2307/2984809.
Wang, H.B., Xu, W.Y., Xu, R.C., 2005. Slope stability evaluation using back propagation neural networks. Eng. Geol. 80, 302–315. https://doi.org/10.1016/j.enggeo.2005.06.005.
Wu, X., Ren, F., Niu, R., 2014. Landslide susceptibility assessment using object mapping units, decision tree, and support vector machine models in the Three Gorges of China. Environ. Earth Sci. 71, 4725–4738. https://doi.org/10.1007/s12665-013-2863-4.
Xu, W., Xie, S., Jean-Pascal, D., Nicolas, B., Imbert, P., 1999. Slope stability analysis and evaluation with probabilistic artificial neural network method. Site Investig. Sci. Technol. 3, 19–21.
Xue, X., 2017. Prediction of slope stability based on hybrid PSO and LSSVM. J. Comput. Civ. Eng. 31, 04016041. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000607.
Zhou, Z.H., Wu, J., Tang, W., 2002. Ensembling neural networks: Many could be better than all. Artif. Intell. 137, 239–263. https://doi.org/10.1016/S0004-3702(02)00190-X.

No ratings yet
(2021) Accelerating Geostatistical Modeling Using Geostatistics-Informed Machine Learning
30 pages
Support Vector Machine: Fundamentals and Applications
From Everand
Support Vector Machine: Fundamentals and Applications
Fouad Sabry
No ratings yet
Acceptance-Rejection Sampling and Multi-dimensional Monte Carlo Integrations Utilizing Mathematica®
From Everand
Acceptance-Rejection Sampling and Multi-dimensional Monte Carlo Integrations Utilizing Mathematica®
SUJAUL CHOWDHURY
No ratings yet
K Nearest Neighbor Algorithm: Fundamentals and Applications
From Everand
K Nearest Neighbor Algorithm: Fundamentals and Applications
Fouad Sabry
No ratings yet
Data Analysis Python Read The Docs Io en Latest
No ratings yet
Data Analysis Python Read The Docs Io en Latest
79 pages
Data Skills Framework 2023
No ratings yet
Data Skills Framework 2023
11 pages
Homework 3 Solutions
No ratings yet
Homework 3 Solutions
4 pages
Using Machine Learning Classifiers To Predict Stock Exchange Index
No ratings yet
Using Machine Learning Classifiers To Predict Stock Exchange Index
6 pages
Heart Disease Prediction Final
No ratings yet
Heart Disease Prediction Final
7 pages
Information Sciences: Decui Liang, Bochun Yi
No ratings yet
Information Sciences: Decui Liang, Bochun Yi
18 pages
Information Communication Technology (Ict) : UNIT IV: Programming
No ratings yet
Information Communication Technology (Ict) : UNIT IV: Programming
255 pages
Lenskart Report
No ratings yet
Lenskart Report
47 pages
053 - Jason Dwi Rendrahadi Putra Astono - RESUME
No ratings yet
053 - Jason Dwi Rendrahadi Putra Astono - RESUME
4 pages
2024 Week 5 - Jupyter Notebook
No ratings yet
2024 Week 5 - Jupyter Notebook
3 pages
ICT Assignment # 5
No ratings yet
ICT Assignment # 5
13 pages
Mckenzie Jones: 2323 San Antonio St. Suite #2122, Austin, TX 78705 (512) - 925-6350 Mckenziejones@Utexas - Edu
No ratings yet
Mckenzie Jones: 2323 San Antonio St. Suite #2122, Austin, TX 78705 (512) - 925-6350 Mckenziejones@Utexas - Edu
3 pages
How To Configure PXM Using M580
No ratings yet
How To Configure PXM Using M580
13 pages
How to Easy… Create a Search Help in ALV OOPS editable field _ SAP Blogs
No ratings yet
How to Easy… Create a Search Help in ALV OOPS editable field _ SAP Blogs
5 pages
DBMS End Term
No ratings yet
DBMS End Term
27 pages
ALUCOBOND AXCENT Programs Color Chart
No ratings yet
ALUCOBOND AXCENT Programs Color Chart
4 pages
Ce3361 Surveying and Levelling Laboratory
No ratings yet
Ce3361 Surveying and Levelling Laboratory
1 page
Lumipix9uhe_A4_DATASHEET
No ratings yet
Lumipix9uhe_A4_DATASHEET
1 page
TPG1370YXA
No ratings yet
TPG1370YXA
2 pages
0 Dumpacore 3rd Com - Samsung.android - App.contacts
No ratings yet
0 Dumpacore 3rd Com - Samsung.android - App.contacts
2,528 pages
Full Download Test Bank For Cognitive Psychology, 8th Edition: Solso PDF
100% (4)
Full Download Test Bank For Cognitive Psychology, 8th Edition: Solso PDF
41 pages
Market Development Strategy for Brandon, MB — A UAP / NAPA Auto Parts Business Case
No ratings yet
Market Development Strategy for Brandon, MB — A UAP / NAPA Auto Parts Business Case
30 pages
White Paper
No ratings yet
White Paper
35 pages
Fedora
No ratings yet
Fedora
18 pages
Sandvik Jaw Eng PDF
100% (1)
Sandvik Jaw Eng PDF
12 pages
ZSHRC
No ratings yet
ZSHRC
3 pages
Cisco Adaptive Security Virtual Appliance (Asav)
No ratings yet
Cisco Adaptive Security Virtual Appliance (Asav)
8 pages
Analysis of Digital Marketing Activities On Event Organizer in Marketing Services (Description Study of GMP Organizer & Entertainment)
No ratings yet
Analysis of Digital Marketing Activities On Event Organizer in Marketing Services (Description Study of GMP Organizer & Entertainment)
7 pages
2016 - An Overview of Microgrid Protection Methods and The Factors Involved
No ratings yet
2016 - An Overview of Microgrid Protection Methods and The Factors Involved
13 pages
Pipe Dimensions
No ratings yet
Pipe Dimensions
11 pages
Citizen E650
No ratings yet
Citizen E650
37 pages
Output Representation
No ratings yet
Output Representation
27 pages


