Developing a New Model for Drilling Rate of Penetration Prediction Using Convolutional Neural Network
https://doi.org/10.1007/s13369-022-06765-x
Received: 23 September 2021 / Accepted: 28 February 2022 / Published online: 6 April 2022
© King Fahd University of Petroleum & Minerals 2022
Abstract
Before adjustable parameters of drilling can be optimized, it is necessary to have a high-accuracy model for predicting the
rate of penetration (ROP), which can represent the effects of drilling parameters and formation-related factors on the ROP.
Accordingly, the present research attempts to use different algorithms, including convolutional neural network (CNN), simple
form of least square support vector machine (LSSVM) and its hybrid forms with either particle swarm optimization (PSO),
cuckoo optimization algorithm (COA), or genetic algorithm (GA), and also hybrids of multilayer extreme learning machine
with either of COA, PSO, or GA, to model ROP based on mud-logging and petrophysical data along two wells (Wells A
and B). For this purpose, median filtering was first applied to denoise the data. Next, petrophysical logs
were upscaled to match their scale to that of the mud-logging data. Then, the nondominated sorting genetic algorithm
(NSGA-II) was combined with a multilayer perceptron (MLP) neural network to select the set of the most significant features
for estimating the ROP on the data from Well A. Feature selection results showed that the accuracy of the estimator models
increases with the number of parameters to a maximum of seven, beyond which only subtle enhancements were seen in the
modeling accuracy. Accordingly, the ROP was modeled using depth, bit rotary speed, mud weight, weight on bit, compressive
wave slowness, total flow rate, and neutron porosity. Training the hybrid, CNN, and LSSVM algorithms using the training
data from Well A showed that the model built upon the CNN algorithm tends to produce the smallest root-mean-square error
(RMSE 1.7746 ft/hr), as compared to the other models. In addition, the smaller difference in error between the training and
testing phases (RMSE 2.5356 ft/hr) for this model indicates its high generalizability, as confirmed by the lower
estimation error of this model in predicting the ROP at Well B, as compared to the other models.
Keywords ROP prediction · Deep learning · Petrophysical logs · Mud logging · Feature selection
Mahdi Bajolvand (corresponding author) · [email protected]
Morteza Matinkia · [email protected]
Amirhossein Sheykhinasab · [email protected]
Soroush Shojaei · [email protected]
Ali Vojdani Tazeh Kand · [email protected]
Arad Elmi · [email protected]
Mohammad Mehrad · [email protected]

1 Department of Petroleum Engineering, Omidiyeh Branch, Islamic Azad University, Omidiyeh, Iran
2 ACECR Institute of Higher Education (Isfahan Branch), Isfahan, Iran
3 Department of Petroleum Engineering, Semnan University, Semnan, Iran
4 Islamic Azad University Central Tehran Branch, Hashemi Rafsanjani Complex, Tehran, Iran
5 Faculty of Engineering, Islamic Azad University of Quchan, Quchan, Iran
6 Faculty of Mining, Petroleum and Geophysics Engineering, Shahrood University of Technology, Shahrood, Iran
Arabian Journal for Science and Engineering (2022) 47:11953–11985
1 Introduction

Rate of penetration (ROP) is among the most important performance indicators used in the drilling of oil and gas wells. A large number of factors affect the ROP in a particular drilling operation [1]. These can be grouped into several general categories, namely rig productivity, hydraulic parameters of the bit, drilling fluid properties, human factors, and formation properties. A known challenge in the evaluation of ROP is the relatively large number of influential factors, some of which are even impossible to measure. The ROP is important because of the close association between drilling depth, ROP, and the cost of the drilling operation, not to mention that drilling cost increases exponentially with depth [2]. Accordingly, researchers have attempted to formulate the relationship between controllable operating parameters (e.g., weight on bit (WOB), bit rotary speed (RPM), and total flow rate (TFLO)) and non-controllable operating parameters (e.g., formation drillability and intact rock resistivity along the drilling trajectory) to predict the ROP and hence find the optimum conditions for it [3]. A review of the existing literature shows that, despite the widespread efforts made to develop relationships and models for estimating the ROP, studies in the field remain active, and various new methods are implemented and analyzed by academics and industrial practitioners. Without any doubt, achieving a comprehensive ROP prediction model that provides adequate accuracy and generalizability is the most important objective of such studies. According to a broad classification scheme, ROP prediction models can be categorized into two main classes, namely physics-based models (PBMs) and machine learning-based intelligent methods (MLBIMs).

The PBMs for ROP prediction are mathematical equations that were originally developed on particular datasets. In this regard, these models explicitly reflect the relationships among different parameters and the way they affect the ROP. Popular examples of PBMs for ROP prediction include those proposed by Burgoyne and Young (1994) and Warren et al. (1986) for roller-cone bits, those published by Harreland and Rampersad (1999), Motahhari et al. (2008), and Al-AbdulJabbar et al. (2018) for polycrystalline diamond cutter (PDC) bits, and the one developed by Bingham (1996) for both bit types [4–9]. These models are described in detail in "Appendix A". However, the implementation of these models has always been accompanied by various challenges. PBMs are developed on the basis of sets of experimental results and well data. In many cases, this has led to reduced accuracy because of differences in formation type [10]. Furthermore, the need for parameters that can only be measured in the laboratory (e.g., formation drillability or geometrical parameters of the bit), or that are commonly not measured or lack a standard measurement method in the field (e.g., bit abrasion), has greatly limited the applicability of such models. Another challenge of using such models (which is especially the case for the Burgoyne and Young model) is the relatively high number of constants that significantly affect the overall modeling accuracy, as reflected by the large number of studies reporting on the optimization of such coefficients [11–14].

The MLBIMs constitute another group of methods used for estimating the ROP. Thanks to their high flexibility in terms of input parameters and their powerful capability for modeling complex nonlinear associations, these methods have attracted increasing attention from researchers in the recent past. On top of this, these approaches can take a larger number of parameters into account than the analytic models. By exploring the patterns within the data, they may lead to models of higher accuracy and generalizability than analytical models [15]. Bilgesu et al. were the first to use an artificial neural network (ANN) for ROP estimation [16]. Multilayer perceptron (MLP) and radial basis function (RBF) ANNs have been widely used by researchers in this field [17, 18]. As a variant of neural networks (NNs), the extreme learning machine (ELM) has been used for ROP estimation in various research works because of its high computation speed coupled with excellent accuracy in modeling the nonlinear behavior of variables, usually producing higher accuracy than its competitors [19–22]. Likewise, Ahmed et al. showed that the least squares support vector machine (LSSVM) can yield high-accuracy estimates of ROP [23]. Bani Mustafa et al. used second-order multivariate regression to establish a mathematical equation among different controllable drilling parameters (WOB, RPM, and TFLO) based on the drilling data along several wells drilled in southern Iraq. They then implemented the response surface methodology (RSM) to identify the optimal range of each controllable parameter for maximum ROP. The outcomes of this study further highlighted the significant role of the estimator model in determining the optimal values of the drilling parameters and hence the cost and time of the drilling operation [24]. Tewari et al. used the data from three adjacent wells in a hydrocarbon field in Norway to develop a model for estimating the ROP, based on which the best operating parameters for bit selection could be determined. In this study, the authors used ANN and RSM to develop the predictor model. They further applied the artificial bee colony algorithm and the genetic algorithm (GA) to select the best possible values of the controllable parameters. The results of this research suggested that the development of a high-accuracy model can serve as a key in the bit selection process, a factor that plays a determinant role in the final drilling operation cost [25].

In MLBIMs, the presence of controllable parameters that can be set by the user considering the conditions of the specific problem at hand is of crucial importance. Some of such
parameters (e.g., the number of hidden layers in an MLP ANN) are linked to the structure of such methods. In contrast, some others (e.g., the weights and biases of the neurons in NNs and the hyperparameters in LSSVM) have more to do with their initialization and their evolution toward converging to the solution. Many researchers have employed different algorithms to optimize these parameters. Elkatatny optimized the structure of an ANN, in terms of the number of hidden layers, the number of neurons in each layer, and the transfer function, using a methodology known as self-adaptive differential evolution. He then developed, based on the weights, biases, and layers of this ANN, an analytic model to estimate the ROP. The results showed the superior accuracy of this model compared to previous analytic approaches [26]. Anemangely et al. employed particle swarm optimization (PSO) and the cuckoo optimization algorithm (COA) to set the weights and biases of the neurons so as to achieve minimum error on an MLP ANN. Comparing the results of the simple MLP ANN with the optimized versions of the network showed that the MLP-COA hybrid approach can serve as an efficient tool for high-accuracy prediction of ROP [1]. Ashrafi et al. compared the performance of PSO, GA, the imperialist competitive algorithm (ICA), and biogeography-based optimization (BBO) in optimizing MLP and RBF NNs. The results indicated the superior performance of MLP-PSO compared to the other approaches [27].

Seeking to optimize the hyperparameters of an LSSVM model, Mehrad et al. used PSO, COA, and GA. In this research, for the sake of comparison, hybrid models of MLP coupled with the mentioned algorithms were also built and compared to SVR and random forest (RF) models. The study was based simultaneously on petrophysical parameters (density (RHOB), neutron porosity (NPHI), and gamma ray (GR)), the uniaxial compressive strength (UCS) of the rock, and drilling data. The results showed that the ROP model developed with LSSVM-COA exhibited the highest accuracy [28]. In another piece of work, Ansari et al. developed an ROP model using an SVR model optimized with ICA, with the results clearly indicating the higher accuracy of the optimized model compared to the simple SVR model [29].

A review of the literature published so far shows that research on developing ROP estimation models is still in progress because of the importance of such models in optimizing adjustable drilling parameters, not to mention the significance of achieving a comprehensive model of high accuracy coupled with widespread generalizability. On the other hand, the application of the multilayer extreme learning machine (MELM) and the convolutional neural network (CNN) to other engineering problems, such as pore pressure prediction [30], shear wave velocity estimation [31], and gas flow rate prediction [32], has shown their enormous potential for developing high-accuracy, widely generalizable models. Moreover, hybridizing optimization algorithms with estimators has highlighted the greater capabilities of hybrid methodologies compared to simple algorithms. Therefore, in this study, we used CNN and hybrid forms of MELM to model the ROP, with the results compared to those of one of the most powerful algorithms extensively used in the literature, i.e., LSSVM in both hybrid and simple forms.

2 Studied Wells

In this study, the data acquired along two wells in an oilfield in SW Iran were used for ROP modeling based on petrophysical logs and drilling parameters. Respecting the confidentiality of the collected data, we herein refer to the two wells as Wells A and B. The petrophysical parameters acquired over the two wells include GR, NPHI, RHOB, specific resistivity (RT), compressive wave slowness (DTCO), and the photoelectric log (PEF). In terms of drilling data, we were provided with the actual ROP, WOB, torque (TRQ), TFLO, returned mud weight (MW-OUT), equivalent mud weight (ECD), and the d-exponent parameter (d-exp). An overview of the petrophysical logs and drilling data over the studied wells is presented in Figs. 1, 2, 3 and 4. Table 1 gives descriptive statistics for the different parameters at the studied wells. Formation tops over the two wells are described in Table 2.

3 Methodology

Preprocessing plays a crucial role in obtaining high-accuracy models. Accordingly, in this work, we began by denoising the collected data. Next, to address the difference in scale between the petrophysical and mud-logging data, a resampling task was performed to arrive at a uniform sampling interval. Subsequently, the data corresponding to each well were organized into a database before proceeding to feature selection on the data from Well A. The selected features were then used for modeling by means of intelligent algorithms (IAs). Finally, the developed models were validated on the data from Well B. Figure 5 presents the flowchart through which the present study was performed. Each step is explained in the following.

3.1 Preprocessing

Identification and extraction of patterns in the data cannot be properly accomplished unless one is provided with sufficient data, in terms of quantity and quality, and proper feature selection is performed using a machine learning (ML) approach. These conditions not only increase the chances of achieving a high-accuracy, highly generalizable model, but also accelerate the main processing of the data. Therefore,
prior to modeling, it is necessary to subject the collected data to proper preprocessing.

3.1.1 Denoising

One of the most important factors affecting the performance of an ROP estimation model is the quality of the input data. Noise is an inevitable part of any real data and is present at a minimum level of about 5% even under fully controlled conditions [33, 34]. Noisy data are known to result in the extraction of unrealistic rules from the data, degrading the generalizability of the associated models for predicting unforeseen data [35, 36].
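To make this point concrete, here is a toy illustration (not from the paper; the data, the linear model, and the exaggerated spike magnitude are all invented for demonstration): the same least-squares fit, trained once on clean data and once on spike-contaminated data, predicts noticeably worse on unseen samples in the contaminated case.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic depth/response relation; the last quarter plays the role of unforeseen data
x = np.linspace(0.0, 1.0, 400)
y_true = 3.0 * x + 1.0
train, test = slice(0, 300), slice(300, 400)

def unseen_rmse(y_train):
    """Fit a least-squares line on the training span; report RMSE on the unseen span."""
    A = np.column_stack([x[train], np.ones(300)])
    slope, intercept = np.linalg.lstsq(A, y_train, rcond=None)[0]
    pred = slope * x[test] + intercept
    return float(np.sqrt(np.mean((pred - y_true[test]) ** 2)))

y_clean = y_true[train] + rng.normal(0.0, 0.05, 300)   # mild measurement noise
y_noisy = y_clean.copy()
y_noisy[rng.choice(300, 15, replace=False)] += 10.0    # ~5% gross spikes (exaggerated)

rmse_clean, rmse_noisy = unseen_rmse(y_clean), unseen_rmse(y_noisy)
# The spike-contaminated fit extrapolates worse onto the unseen span
```

The spikes pull the fitted line away from the true trend, so the error on data the model has never seen grows, which is exactly the generalizability loss described above.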
In the fields of signal and image processing, noise identification plays a crucial role in selecting an appropriate noise-attenuation method. Considering the problem conditions and the structure of petrophysical logs and drilling data, median filtering (MF) can serve as an efficient tool for denoising such data. One-dimensional (1D) MF is a well-known technique in image processing for addressing random, Gaussian, and salt-and-pepper noise. In this technique, each data point is replaced by the median of a predefined neighborhood around the original data point. For this purpose, we begin by sorting all data points within the neighborhood by value/intensity/magnitude before their median can be identified. The original data point is then replaced by the median only if it differs from it significantly, with
Table 1 Descriptive statistics of the petrophysical and drilling data over the studied wells (min / mean / max)

Petrophysical logs | Unit | Well A | Well B
Depth | ft | 7519.68 / 10,087.33 / 12,660.76 | 7716.53 / 9683.39 / 11,646.98
BS | inch | 12.25 / – / 12.25 | 8.5 / – / 8.5
CGR | GAPI | 3.66 / 32.36 / 110.54 | 16.13 / 41.57 / 109.59
PEF | b/e | 2.07 / 4.42 / 6.1 | 2.17 / 4.73 / 5.94
RHOB | g/cc | 1.45 / 2.51 / 2.82 | 1.97 / 2.54 / 2.82
RT | ohm·m | 0.05 / 2.99 / 73.35 | 0.06 / 1.22 / 13.03
NPHI | V/V | 0 / 0.16 / 0.5 | 0 / 0.11 / 0.46
DTCO | μs/ft | 47 / 68.75 / 127.98 | 48.42 / 65.98 / 117.71
Count | – | 10,283 | 7865
Record rate | ft | 0.5 | 0.5

Drilling parameters | Unit | Well A | Well B
ROP | ft/h | 2.88 / 19 / 93.27 | 0.21 / 23.28 / 121.86
WOB | klbf | 1.1 / 13.97 / 33.1 | 4.55 / 15.32 / 29.82
RPM | rpm | 92 / 153.69 / 176 | 60.19 / 122.67 / 170.98
TRQ | klbf·ft | 1.62 / 2.94 / 4.72 | 2.71 / 4.89 / 7.08
TFLO | gpm | 476 / 588.97 / 764 | 478 / 528.11 / 570
MW-OUT | pcf | 78.03 / 82.27 / 89.5 | 78.28 / 81.3 / 85.03
ECD | pcf | 64.67 / 83.04 / 90.41 | 78.38 / 81.68 / 85.82
d-exp | – | 0.026 / 0.07 / 1.54 | 0 / 1.21 / 1.61
Count | – | 1566 | 1199
Record rate | ft | 3.28 | 3.28
the original data point remaining unchanged otherwise [37]. With such a procedure, this technique provides different levels of smoothing depending on the filtering window. It is nevertheless a very effective method when the database contains a few points with abruptly different values from the others.

Table 2 Formation tops over the trajectories of Wells A and B

Formation | Top depth (m) | Data points (Well A) | Data points (Well B)
Pabedeh | 2292 | 60 | –
Gourpi | 2352 | 264 | 264
Ilam | 2609 | 83 | 83
Laffan | 2693 | 8 | 8
Sarvak | 2702 | 616 | 616
Kazhdumi | 3319 | 201 | 201
Dariyan | 3522 | 184 | 27
Gadvan | 3707 | 150 | –
Total | – | 1566 | 1199

3.1.2 Upscaling

Since conventional petrophysical logs and drilling data are originally sampled at different rates, they cannot be incorporated into a single ROP prediction model unless their scales are made uniform. Two different approaches have been proposed for this problem, namely downscaling the drilling data (which are usually sampled at a lower rate, e.g., 1 sample every 3.28 ft) and upscaling the petrophysical logs (which are usually sampled at a higher rate). Since downscaling the drilling data usually increases the error, the common practice is instead to upscale the petrophysical parameters. One of the most popular techniques for this purpose is averaging, which can be expressed as follows:

a_upscaled^j = (1/n) Σ_{i=1}^{n} a_i   (1)

where a_i refers to the values of the data points corresponding to equally spaced depths, and n and a_upscaled^j are the number of such points and their upscaled value, respectively.
3.1.3 Feature Selection

Inclusion of all, or even the largest possible number of, features in developing an estimator model does not necessarily improve its accuracy. Through feature selection, one can evaluate different features and eliminate redundant ones, not only increasing the processing speed of ML-based methods but also enhancing the accuracy and generalizability of the model.

Given the large number of parameters affecting the ROP, it is always very important to identify the minimum number of parameters that can provide acceptably high accuracy and reliability of the resultant model. Feature selection refers to the identification of the smallest subset of features that has the largest impact on reducing the model complexity while increasing the prediction accuracy [38]. Conventional ML techniques require feature selection as a prerequisite, a task that needs to be performed by a relevant expert [39]. For this task, multi-objective optimization algorithms have shown superiority over single-objective optimizers [40]. Accordingly, for the feature selection problem, one can adopt a multi-objective optimization algorithm with two objectives: minimization of the number of input parameters and minimization of the fitting error. Therefore, the non-dominated sorting genetic algorithm-II (NSGA-II) is used in this study. Figure 6 shows the flowchart through which the NSGA-II approaches the optimal solutions. As shown in the figure, the NSGA-II begins by randomly generating an initial population. Next, each solution is evaluated using an objective function (sometimes called a cost function). Among the solutions comprising each generation, a given number are selected through binary tournament selection. This approach begins by randomly selecting two solutions out of the population and then identifies the better of the two. The comparison is based on two criteria, namely the rank and the crowding distance of the solutions [41]. Accordingly, the best solution is the one that exhibits the smallest rank coupled with the largest crowding distance. Once this selection is finished, a predefined subset of the original generation is obtained. One may then apply the mutation operator to one subgroup of the selected individuals and the crossover operator to another subgroup to produce a population of offspring and mutated individuals. This new population is combined with the original population, and the members are sorted in increasing order of rank and then in decreasing order of crowding distance. At this stage, we have a population whose members are sorted primarily by rank and secondarily by crowding distance. We then select as many of the top-ranked individuals as the population size of the original population. These finally
(Fig. 6: Flowchart of the NSGA-II, comprising non-dominated sorting, crowding-distance calculation, binary tournament selection, crossover and mutation with cost-function evaluation, combination of the parent and child populations, truncation of the combined population, and a stopping criterion that terminates at the Pareto-optimal solutions.)
selected individuals then represent the lowest ranks and the largest crowding distances among the members of the combined population. The selected individuals constitute the next generation, and the remaining members of the combined population are simply discarded. This cycle is repeated until a predefined stopping criterion is met. The non-dominated solutions obtained for a multi-objective optimization problem comprise a Pareto front. None of the solutions on a Pareto front is preferable over the others, and each of them can represent an optimal selection under the considered set of conditions [41].

3.2 ROP Modeling with AIs

Respecting the superior performance of hybrid ML algorithms compared to simple models and PBMs for ROP estimation, we opted for hybrid intelligent models in this research. These include two broad categories of models, namely predictor and optimizer models, as explained in the following. Subsequently, we present the procedure for building a hybrid algorithm by combining the predictor and optimizer models.

3.2.1 Predictors

So far, numerous researchers have tried to estimate ROP based on petrophysical logs and drilling data by means of a versatile set of predictor algorithms. The MELM and LSSVM have long been among the most accurate methods for this purpose. In the meantime, to the best of our knowledge, no one has used deep learning (DL) algorithms for this purpose, despite the successful application of DL algorithms (e.g., CNN) in solving complex problems. Therefore, we herein utilize DL algorithms to estimate ROP from a compilation of petrophysical logs and drilling data, as explained in the following.

• MELM algorithm

The ELM network is a fast yet relatively accurate algorithm for solving highly complex problems. First introduced by Liang et al. [42], it has been proven efficient for ROP estimation as well. In this algorithm, a hidden layer is utilized to avoid time-intensive iterations during the learning process, so as to achieve significantly faster processing times. In this structure, the weights of the intermediate layer are randomly drawn from a uniform distribution. The output weights of the ELM network are determined from the
Moore–Penrose generalized inverse of the hidden-layer outputs [43]. Figure 7 indicates the structure of an ELM. Inspired by single-hidden-layer feedforward networks (SLFNs), this processing machine is very fast yet very simple. Huang et al. showed, theoretically, that the hidden nodes of an SLFN can be assigned randomly without iterative tuning [44]. The linearity of the output layer allows the output weights to be optimized with minimum regression error. This aspect of the network eliminates the need for iterative simulations in the back-propagation stage, a step that cannot be avoided in an MLP network [43]. To solve the complex nonlinear equations associated with clustering or regression, one can use this fast learning machine with multiple hidden layers in the form of a MELM. The flowchart shown in Fig. 8 depicts the main steps of problem solving via MELM. As far as prediction is concerned, the MELM is much more accurate than a single-layer machine.
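The training recipe just described (random hidden weights, then a Moore–Penrose pseudoinverse for the output weights) can be sketched for a single hidden layer as follows; this is a toy illustration in which the layer size, activation, and data are assumptions, not the authors' configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, y, n_hidden=50):
    """Single-hidden-layer ELM: hidden weights are random and never trained."""
    W = rng.uniform(-1.0, 1.0, (X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, n_hidden)
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y           # Moore-Penrose solution for output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression target standing in for ROP as a function of two inputs
X = rng.uniform(-1.0, 1.0, (200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

W, b, beta = elm_fit(X, y)
rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2)))
```

A MELM stacks several hidden layers in this spirit; the sketch keeps one layer for brevity, with only the output weights solved in closed form.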
• LSSVM algorithm

The LSSVM was first proposed as an alternative to SVMs [45, 46]. Compared to the SVM, the LSSVM provides high-speed training. This can be attributed to its use of equality constraints instead of the inequality constraints encountered in quadratic programming problems. In the meantime, sparseness and robustness are known challenges of this approach. Among the main disadvantages of such methods, one may refer to increased training time and output model error on industrial data, particularly when the input data suffer from heteroscedasticity and/or an imbalanced distribution [47].

Assuming a training dataset {x_i, y_i}, i = 1, 2, ..., N, where the input data are n-dimensional (x_i ∈ R^n) and the output is one-dimensional (y_i ∈ R), the regression version of the LSSVM takes the following general form:

y = w^T φ(x) + b   (2)

where w is the weight vector, b is the bias term, and φ(x) is a mapping function through which the input data are mapped into a feature space of higher dimension. Various kernel functions have been defined for the LSSVM, including linear, polynomial, RBF, and MLP kernels (Table 3). Equation (3) expresses the objective function and constraints of an LSSVM:

min C(w, e) = (1/2) w^T w + (1/2) γ Σ_{i=1}^{N} e_i²   (3)
subject to: y_i = w^T φ(x_i) + b + e_i

in which γ is a regularization parameter that must be optimized before a high-accuracy model can be achieved. This parameter establishes a balance between lower learning error and smoothness. To solve Eq. (3) with the help of the Lagrangian method, one needs to define Lagrangian coefficients α_i (which can be positive or negative) for each x_i:

L(w, b, e, α) = (1/2) ‖w‖² + (1/2) γ Σ_{i=1}^{N} e_i² − Σ_{i=1}^{N} α_i [w^T φ(x_i) + b + e_i − y_i]   (4)

Table 3 Most popular kernel functions used with LSSVM [28, 49]
RBF kernel: K(x_i, x) = exp(−‖x − x_i‖² / σ²), with σ the kernel width
MLP kernel: K(x_i, x) = tanh(k x_i^T x + θ), with k a scale parameter and θ a bias

In this equation, the Lagrangian coefficients are determined using the Karush–Kuhn–Tucker (KKT) equality conditions, as expressed in Eqs. (5) through (8) [48]:

∂L/∂w = 0 → w = Σ_{i=1}^{N} α_i φ(x_i)   (5)
∂L/∂b = 0 → Σ_{i=1}^{N} α_i = 0   (6)
∂L/∂e_i = 0 → α_i = γ e_i,  i = 1, 2, ..., N   (7)
∂L/∂α_i = 0 → w^T φ(x_i) + b + e_i − y_i = 0,  i = 1, 2, ..., N   (8)

Considering these equations, the weight vector can be obtained from Eq. (9):

w = Σ_{i=1}^{N} α_i φ(x_i) = Σ_{i=1}^{N} γ e_i φ(x_i)   (9)

The weights are herein defined as linear combinations of the Lagrangian coefficients of the input data in the learning process. Substituting Eq. (9) into Eq. (2) yields Eq. (10), where K(x_i, x) = φ(x_i)^T φ(x) is the kernel function:

y = Σ_{i=1}^{N} α_i K(x_i, x) + b   (10)

The performance of an LSSVM is largely determined by the chosen kernel function and hyperparameter values [50].

• CNN algorithm

Representing a special case of ANN, CNNs have been introduced in different forms for DL networks. The widespread application of CNN in complicated image processing tasks
such as image classification, object identification, and speech detection is primarily due to its remarkable performance in such areas [51, 52]. DL comprises different assessment levels, with each level capable of learning several different features [53]. A major advantage of CNN is its automatic feature selection on the raw input data for solving the problem. Note that the selection of superior features is a determinant stage of any ML modeling process, significantly affecting the final processing time and modeling accuracy. Another advantage of CNN over conventional network structures is its extended generalizability: by weighting different parameters it lowers the number of input parameters, a fact that facilitates the implementation of large-scale networks, which are usually difficult to handle with conventional NNs [51]. Nevertheless, a major problem with this methodology is the relatively large number of adjustable parameters, including not only the weights and biases but also the convolutional parameters.

Figure 9 shows the structure of a CNN. As the figure suggests, a CNN has multiple parallel filters that can be adjusted for feature selection. The input vector is filtered by each of the convolutional layers, and each convolutional layer produces its own specific output vector. The dimensionality therefore increases with the number of convolutional layers, so a pooling layer is used to reduce the dimensions and normalize them. Finally, the outputs of all pooling layers are compiled and fed into a compact layer to produce the ultimate outputs. This compact layer (much like an MLP NN) is composed of several neurons, with the number of neurons set by the user.

3.2.2 Optimizers

Metaheuristic optimization algorithms have been widely combined with predictor algorithms to solve a versatile spectrum of problems. COA and PSO are among the most widely used optimization algorithms in the previous literature. In this work, we use these two algorithms in combination with predictor algorithms to develop models of wide generalizability and high accuracy. Each algorithm is explained in the following subsections.

• COA

COA is a population-based algorithm originally developed to solve the continuous multidimensional nonlinear problems usually encountered in optimization [54]. In an optimization problem, one begins by presenting the decision variables in the form of an array; in COA, this array is referred to as a habitat. That is, each habitat represents a candidate solution that is improved through an evolutionary process. In an optimization problem with N_var decision variables, a habitat is defined as a 1 × N_var array (i.e., habitat = [x_1, x_2, x_3, ..., x_Nvar]) indicating the spatial coordinates of the corresponding cuckoo in the decision space. The decision variables take decimal values, and the profitability of each habitat is determined by evaluating a profit function (f_p) at that habitat (i.e., profit = f_p(habitat)). Given that this algorithm was originally formulated for maximization problems, maximizing the profit function expressed in the form of Eq. (11) ensures the minimization of the cost function (f_c):

Profit = −cost(habitat) = −f_c(x_1, x_2, x_3, ..., x_Nvar)   (11)

The evolutionary process of COA is as follows. Given an initial population of cuckoos (N_pop), the journey begins by forming an N_pop × N_var matrix of candidate habitats. Next, a randomly selected number of eggs is assigned to each habitat. In nature, this value ranges from 5 to 20, on average; these values set the lower and upper bounds on the number of eggs at each habitat. In the real world, cuckoos lay their eggs within a maximum distance from their actual habitats. In COA, this maximum distance is referred to as the egg-laying radius (ELR). For each cuckoo, the ELR is set based on the overall number of eggs and the lower and upper bounds (Var_low and Var_high, respectively) of the decision variables (Eq. (12)). Notably, the parameter α (an integer value) is used to control the ELR:

ELR = α × (number of current cuckoo's eggs / total number of eggs) × (Var_high − Var_low)   (12)

At the next step, each cuckoo lays its eggs, on a random basis, in the nests of host birds within its ELR. Once the egg-laying is finished, the eggs least similar to those of the host bird are identified and thrown away by the host bird. In this way, each round of egg-laying is followed by the elimination of a certain percentage of the cuckoos' eggs (usually 10%): those laid in habitats with insufficient nutrition, which would keep the algorithm from converging to an optimal solution. Meanwhile, the other eggs mature in the nests of the host birds, hatch, and are fed by the host bird. Grown into maturity, the young cuckoos continue to live in their habitats for some time. However, as the egg-laying time arrives, they migrate to places where they can find more food and host birds whose eggs look much like the cuckoos' eggs. Indeed, the cuckoos migrate to the destinations of highest profit (i.e., the largest supply of food and the best environmental conditions for living). Since
represents a candidate solution for the optimization prob- the matured cuckoos are now well scattered among different
lem. The candidate solutions converge to a global optimum habitats, it is now pretty difficult to identify which cuckoo
11964 Arabian Journal for Science and Engineering (2022) 47:11953–11985
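The egg-laying radius of Eq. (12), and a random egg placement within it, can be sketched in a few lines (an illustrative sketch: the function names and the example values are ours, not the paper's settings):

```python
import random

def elr(alpha, cuckoo_eggs, total_eggs, var_low, var_high):
    """Egg-laying radius of one cuckoo, per Eq. (12)."""
    return alpha * (cuckoo_eggs / total_eggs) * (var_high - var_low)

def lay_egg(habitat, radius):
    """Lay one egg at a uniformly random offset within the ELR of the habitat."""
    return [x + random.uniform(-radius, radius) for x in habitat]

# Example: a cuckoo holding 5 of 50 eggs, decision variables bounded in [0, 10]
radius = elr(alpha=1, cuckoo_eggs=5, total_eggs=50, var_low=0.0, var_high=10.0)
egg = lay_egg([2.0, 7.5], radius)  # a new candidate habitat near [2.0, 7.5]
```

Each candidate habitat is then scored by the profit of Eq. (11), i.e., the negative of the cost function, so maximizing profit and minimizing cost are equivalent.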
• PSO
• GA

In 1975, Holland proposed the GA based on Darwin's theory of biological evolution [59–61]. Based on an initial population of chromosomes (i.e., solutions), this algorithm mirrors the natural selection process, where the individuals of highest fitness are selected for reproduction to generate the next-generation offspring. The offspring inherit characteristics of their parents and even transmit them to the succeeding generation. Should the parents exhibit better fitness, the offspring will exhibit higher fitness as well, which translates into a higher chance of survival compared to their parents. Figure 12 shows the flowchart of GA.

Fig. 12 Flowchart of GA

On this basis, in GA, the journey toward the optimal solution starts from an initial population of chromosomes that are initialized randomly. The size of this initial population is linked to the nature and complexity of the problem at hand and remains unchanged throughout the different iterations of the algorithm. The values assigned to each chromosome are analyzed by a fitness function. Thus, parent chromosomes are selected from the group of chromosomes with the highest fitness values compared to other chromosomes. This is realized by means of crossover and mutation operators. The crossover operator randomly swaps parts of one chromosome with those of another. The result is an offspring that inherits certain properties from each of the parent chromosomes rather than exactly resembling either of them.

• Hybridizing MELM with optimization algorithms

In the MELM, the weights and biases are all set on a random basis. This implies that the model output may exhibit different degrees of accuracy. At the other end of the spectrum, it is usually costly and time-intensive to develop a high-accuracy model via a purely trial-and-error approach, and even that cannot guarantee the realization of a model with the highest possible accuracy. Under such circumstances, one can opt for optimization algorithms to determine the optimal numbers of layers and neurons per layer, weights, and biases for a MELM, so as to achieve a model with the highest possible level of accuracy.
For such a purpose, it is necessary to identify the optimum structure of the algorithm itself. Accordingly, the weights and biases of the algorithm are considered as decision variables of an optimization algorithm. At each iteration of the optimization algorithm, a MELM is built with the best values of the decision variables at that iteration and then tested on the training data. Subsequently, the optimization algorithm improves the values of the weights and biases based on the feedback received from the model built at the previous iteration, and this process continues until the stopping criterion of the optimizer is satisfied, upon which the optimal values of the weights and biases are reported. Finally, using these optimal values, the built MELM is assessed on the testing data. Figure 13 depicts this process in the form of a flowchart.

• Hybridizing LSSVM with optimization algorithms

The performance of the LSSVM algorithm has much to do with the choice of the kernel function and the regularization parameter. Accordingly, before this method can be implemented properly, one must identify the most appropriate
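Because LSSVM performance hinges on the kernel and regularization settings, the following minimal sketch shows the standard LSSVM training system with an RBF kernel (our own illustrative code on toy data; the `gamma` and `sigma` values are assumptions, not the paper's tuned parameters):

```python
import math

def rbf(x, z, sigma):
    """Gaussian (RBF) kernel between two scalar samples."""
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))

def lssvm_fit(xs, ys, gamma, sigma):
    """Solve the LSSVM linear system [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y]."""
    n = len(xs)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        A[0][i + 1] = 1.0
        A[i + 1][0] = 1.0
        rhs[i + 1] = ys[i]
        for j in range(n):
            A[i + 1][j + 1] = rbf(xs[i], xs[j], sigma) + (1.0 / gamma if i == j else 0.0)
    # Gaussian elimination with partial pivoting, then back-substitution.
    for col in range(n + 1):
        piv = max(range(col, n + 1), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, n + 1):
            f = A[r][col] / A[col][col]
            rhs[r] -= f * rhs[col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    sol = [0.0] * (n + 1)
    for r in range(n, -1, -1):
        s = rhs[r] - sum(A[r][c] * sol[c] for c in range(r + 1, n + 1))
        sol[r] = s / A[r][r]
    return sol[0], sol[1:]  # bias b and support values a_i

def lssvm_predict(x, xs, b, alphas, sigma):
    """y(x) = b + sum_i a_i * K(x, x_i)."""
    return b + sum(a * rbf(x, z, sigma) for a, z in zip(alphas, xs))

# Toy 1D data: with weak regularization the model should reproduce the samples
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]
b, alphas = lssvm_fit(xs, ys, gamma=1e6, sigma=1.0)
```

A larger `gamma` weakens the ridge term and fits the training data more tightly, which is exactly the trade-off the optimizers are asked to tune.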
Fig. 15 Comparison between noisy and denoised mud-logging data in the studied interval of Well A
data, but rather provides for a higher degree of reliability. Therefore, continuing with the research, the average value of the petrophysical logs over each meter of the studied depth was reported as the corresponding value on the upscaled log. Figure 17 compares the original and upscaled petrophysical logs along Well A. Upon scaling the petrophysical logs to the scale of the mud-logging data, a database was built for each well to perform further processing with the final aim of ROP modeling.
Investigation of how the different independent variables affect one another, as well as the dependent parameter, can provide a good basis for understanding the inter-parameter associations before proceeding to intelligent modeling. Figures 18 and 19 show heat maps corresponding to the correlation matrices of the petrophysical logs and mud-logging data at Wells A
Fig. 18 Heat map of correlation matrices of the petrophysical logs and mud-logging data at Well A
Fig. 19 Heat map of correlation matrices of the petrophysical logs and mud-logging data at Well B
and B, respectively. As the figures suggest, at Well A, in contrast with Well B, most of the investigated parameters were found to be inversely correlated with the ROP. At both wells, compared to the other studied parameters, depth and mud weight exhibited much more significant relationships with the ROP. It was also evident that the ROP increases with RPM and DT. Interestingly, WOB was found to exhibit no association with the ROP at Well B, whereas an increase in WOB at Well A tended to decrease the ROP. Given the high resistivity of the rocks drilled at both wells, the more significant effect of RPM, rather than WOB, on the ROP in the studied intervals was expected.
As indicated earlier, numerous parameters affect the ROP. Considering all of these parameters in ROP modeling can drastically increase the complexity of the developed model and, at the same time, limit the applicability of the model due to the need for preparing such a vast amount of input data. On the other hand, since all field-measured data are contaminated with some level of noise, the introduction of such a huge volume of noisy data into a model can considerably attenuate its accuracy and generalizability. This highlights the need to identify the most significant parameters affecting the ROP in order to develop a simple yet accurate and widely generalizable prediction model. In this respect, the hybrid MLP-NSGA-II algorithm was applied to the data from Well A to select the features most strongly correlated with the ROP. In this algorithm, the MLP NN tries to model the ROP based on the set of features introduced by NSGA-II. Differences in the measurement scales of the different parameters could lead to weights and/or biases that are irrelevant to the effect of the corresponding features on the ROP. Accordingly, the parameters had to be normalized before the more appropriate features could be selected. For this purpose, setting a lower bound (Xmin) and an upper bound (Xmax) on the values (X) of each feature, Eq. (15) was used to normalize the values to the range [−1, 1].

Xnorm = 2 × (X − Xmin)/(Xmax − Xmin) − 1    (15)

Table 4 Optimal values of the adjustable parameters of the NSGA-II algorithm

Parameter              Value
Number of iterations   100
Population ratio       75
Cross-over ratio       0.62
Mutation ratio         0.05

In order to achieve the best results with the hybrid MLP-NSGA-II algorithm, it was necessary to optimize its adjustable parameters. For this purpose, firstly, MLP NNs
with different structures were used to model the ROP on the data from Well A. The tenfold cross-validation algorithm was used to reach a stable accuracy in the modeling. The results of this sensitivity analysis showed that an MLP NN with three hidden layers containing 6, 5, and 6 neurons, respectively, could lead to highly accurate models. As a next step, the adjustable parameters of the NSGA-II algorithm were optimized to realize two objectives, namely minimizing the number of input variables into the MLP NN and minimizing the root-mean-square error (RMSE) of modeling with the NSGA-II. A sensitivity analysis was performed on the number of iterations, population size, and mutation and crossover ratios for the NSGA-II. Table 4 reports the optimal values of these parameters upon conducting the sensitivity analysis.
Once finished with setting the adjustable parameters of the MLP-NSGA-II algorithm, the algorithm can be applied to the data from Well A. Figure 20 shows the variations of RMSE and the coefficient of variation (COV) for the best set of variables with different numbers of input features into the MLP-NSGA-II algorithm. As this figure suggests, the modeling error decreases while the COV increases with increasing numbers of input parameters. But the rate of this improvement decreases as the number of input parameters grows
larger, so that no significant reduction in the ROP modeling error was observed for any number of input parameters beyond 7. In the meantime, the learning process with 8 input parameters took some 30% more time than that with 7 input parameters. Table 5 reports the best sets of features obtained by the MLP-NSGA-II when considering different numbers of input parameters for the purpose of ROP modeling. The table shows that, setting the number of input parameters to 7, the best set of features included DEPTH, RPM, MW, WOB, DT, TFLO, and NPHI. Accordingly, these seven parameters were used for ROP modeling in this work.

Table 6 Best values of the adjustable parameters of the PSO algorithms for optimizing the MELM structure

Controllable parameter   PSO for optimization of     PSO for optimization of the number
                         the weights and biases      of hidden layers and nodes
Population               80                          60
Iteration                200                         60
C1                       2.05                        2.05
C2                       2.05                        2.05
Inertia weight           0.97                        0.96
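The C1, C2, and inertia-weight coefficients of Table 6 enter the standard PSO velocity and position update, which can be sketched as follows (a generic sketch of the canonical update rule, not the authors' code):

```python
import random

def pso_step(x, v, pbest, gbest, w=0.97, c1=2.05, c2=2.05):
    """One velocity/position update of standard PSO.

    w is the inertia weight, and c1/c2 are the cognitive/social coefficients,
    matching the roles of the parameters listed in Table 6.
    """
    r1, r2 = random.random(), random.random()
    v_new = [w * vi + c1 * r1 * (p - xi) + c2 * r2 * (g - xi)
             for xi, vi, p, g in zip(x, v, pbest, gbest)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

# One particle in 2D drifts toward the personal/global best positions
x, v = pso_step([0.0, 0.0], [0.0, 0.0], pbest=[1.0, 1.0], gbest=[1.0, 1.0])
```

With an inertia weight close to 1, particles retain momentum and explore; smaller values accelerate convergence toward the best-known positions.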
4.2 Developing Predictor Models
In order to undertake modeling with a MELM hybridized with an optimization algorithm, it is necessary to start by determining the optimal numbers of layers and nodes per layer. Herein, we used PSO to optimize the structure of the MELM. That is, the numbers of layers and nodes of the MELM were determined by the PSO before proceeding to ROP modeling using the MELM. In this process, the PSO algorithm improves its results based on the modeling error. On the other hand, the modeling error with a MELM is highly dependent on the values of the hyperparameters of the MELM. Therefore, in order to come up with an appropriate judgment about different structures of the MELM, PSO was deployed to optimize the weights and biases of the MELM. Figure 21 shows how the MELM structure was optimized by combining two different PSO algorithms. In order to implement the PSO-MELM-PSO algorithm, one should properly adjust the controllable parameters of the optimization algorithms as well as the MELM. The PSO algorithms had their parameters adjusted through a trial-and-error approach, with the results reported in Table 6. The incorporation of two optimization algorithms into the structure of the MELM would increase the processing time considerably. In order to tackle this problem, we limited the allowable ranges of the numbers of layers and nodes for the PSO that was aimed at optimizing the model structure. This was done by testing MELM-PSO models with different numbers of layers containing the same number of neurons in each layer, proceeding to ROP modeling, and finally storing the resultant RMSE, with the results shown in Table 7 for the data from Well A. The table proves that the estimator model had its accuracy increased with the numbers of layers and nodes per layer. But models with 8 layers exhibited somewhat higher RMSEs than those with
Fig. 23 Variations in error at different iterations of the hybrid algorithms of (a) MELM and (b) LSSVM in the training process on the data from Well A
LSSVM models. As the figure suggests, COA provided a higher convergence rate than the GA and PSO and could further reach a global solution rather than being trapped in local optima (which was the case with the GA and PSO). A comparison between the two estimator algorithms, namely MELM and LSSVM, in this figure shows that the hybridized MELM algorithms outperformed the hybridized LSSVM algorithms in ROP prediction.
Shown in Fig. 24 are cross-plots of measured ROP against the estimation results of the hybrid algorithms as well as LSSVM and CNN in the training stage. The figure indicates that all of the models underestimated the ROP at higher ROP levels (as shown by the deviation of the best-fit line to the left of the Y = T line). This can be linked to the lower availability of data at higher ROP values. The intensity of this problem is, however, lower with the CNN model than with the other models, while the simple form of LSSVM is associated with the largest deviations at higher ROPs. This leads us to the conclusion that the hybrid models are generally more accurate than the simple form of LSSVM. Table 9 compares the developed models based on various error indices and the coefficient of determination (COD) based on the data from
Fig. 24 Cross-plots of measured ROP against estimation results of the hybrid algorithms and DL method on the data from Well A
Well A. As is evident from the table, the CNN exhibited much lower RMSEs than the competing models. Focusing on the hybrid models, MELM-COA and LSSVM-COA showed higher accuracies than the others.
Figure 25 presents cross-plots of measured ROP against the estimation results of the trained hybrid algorithms on the data from Well A. Once again, the figure shows that the models underestimated the ROP at higher ROP levels. As this effect was already observed in the training phase, this outcome in the test phase was anticipated. The smallest and largest deviations of the best-fit line from the Y = T line at higher
ROPs were seen with the CNN and LSSVM models, respectively. Table 10 compares the results of the trained models on the data from Well A based on a handful of error indices and the COD. As the table suggests, the CNN provided significantly lower levels of error than the other models. Moreover, the hybrid models produced lower levels of error than their simple forms, with the highest accuracies among the hybrid models exhibited by the MELM-COA and LSSVM-COA models. The smaller difference between the RMSEs of the training and testing phases for the CNN model, compared to the other studied models, could indicate the better generalizability of this model to similar formations at the wells within the studied field.

4.3 Models Validation

In order to put the generalizability of the trained models to the test, they were used to estimate the ROP in similar formations at Well B. Figure 26 shows the histograms of error onto which a normal distribution is fitted for the different ROP prediction models at Well B. This figure shows some rightward skewness in the error histograms, which refers to the underestimation of the ROP at higher ROP values. This is, however, less significant in the results of CNN modeling. The error histogram of the CNN model showed lower values of the mean (μ) and standard deviation (σ) compared to the other models, following a more-or-less normal distribution. Focusing on the hybrid algorithms, the MELM-COA and LSSVM-COA showed close-to-normal distributions. Table 11 presents the results of a comparison between the different trained models in ROP prediction at Well B based on various error indices and the COD. The table suggests that the CNN tends to produce the smallest RMSE compared to all other models, indicating the broad generalizability of this model. Noteworthily, the MELM-COA and LSSVM-COA hybrid models produced RMSEs well comparable to those of the CNN, proving their acceptable generalizability. Figure 27 compares the estimation results of the best models (i.e., MELM-COA, LSSVM-COA, and CNN) against the measured ROPs over the studied interval at Well B in the form of depth profiles. The figure shows that all three models succeeded in predicting the changes in ROP over the drilled depth.
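The error indices used in Tables 9–11 and the histogram statistics of Fig. 26 follow standard definitions, which can be computed as below (a generic sketch; the function names are ours):

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def cod(y_true, y_pred):
    """Coefficient of determination (COD, i.e., R^2)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def error_stats(y_true, y_pred):
    """Mean (mu) and standard deviation (sigma) of the residuals, the
    quantities summarized by the fitted normal curves of Fig. 26."""
    errs = [p - t for t, p in zip(y_true, y_pred)]
    mu = sum(errs) / len(errs)
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errs) / len(errs))
    return mu, sigma
```

A positively skewed residual distribution with nonzero mean, as reported for the non-CNN models, signals the systematic underestimation at high ROP discussed above.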
Fig. 25 Cross-plots of measured ROP against estimation results of the hybrid algorithms and DL method based on the data from Well A
Fig. 26 Error histogram and fitted normal distribution (red line) for eight models on the data from Well B
• The hybridized form of the LSSVM model was found to be more accurate than its simple form, proving the superior performance of the metaheuristic algorithms in terms of realizing more accurate results.
• The hybridized forms of the MELM algorithm produced lower RMSEs than the corresponding LSSVM-based algorithms.
• Application of the models to predicting the ROP at Well B showed that the CNN-based model could produce much more accurate results than the other models, further proving the high generalizability of this model.
• Given the better results of the proposed methodology in this study, application of this methodology for predicting the ROP in vertical wells penetrating other fields is strongly recommended, provided an adequate volume of data is available.
The coefficients a1 through a8 are constants applied to the corresponding parameters, including the formation drillability, normal compaction trend, under-compaction exponent, differential pressure exponent, WOB, rotational speed component, bit wear exponent, and hydraulic exponent, respectively. The ranges of the coefficients a1 through a8 are set according to the drilling conditions and shall be determined considering previous drilling reports. Table 12 reports the ranges of the coefficients presented for the Bourgoyne and Young model (1974) in different studies.

6.1.2 Warren Model

Warren proposed a model for roller-cone bits, as expressed in Eq. (25) [5].

ROP = [(a × UCS^2 × db^2)/(RPM^b × WOB^2) + c/(RPM × db)]^(−1)    (25)

where WOB, RPM, db, and UCS refer to the weight-on-bit, rotational speed, bit diameter, and unconfined compressive strength of the rock, respectively. Moreover, a, b, c, and d are constant coefficients.

6.1.3 Harreland and Rampersad Model

The Harreland and Rampersad model was originally proposed for PDC bits, as written in Eq. (26) [6].

ROP = τ × [(80 × nt × m × RPM^α)/(db^2 × tan^2(ω))] × [WOB/(100 × nt × UCS)]^2    (26)

where WOB, RPM, db, and UCS refer to the weight-on-bit, rotational speed, bit diameter, and unconfined compressive strength of the rock, respectively. Moreover, m refers to the number of inserts on the bit, nt is the number of inserts in contact with the rock, ω is the cutting angle of the formation rock, and τ is called the comprehensive factor. The parameter α in this model is a constant.

6.1.4 Motahhari et al. Model

Motahhari et al. proposed an ROP prediction model for PDC bits, as given in Eq. (27) [7].

ROP = Wf × (G × RPM^γ × WOB^β)/(db × CCS)    (27)

in which WOB, RPM, db, and CCS refer to the weight-on-bit, rotational speed, bit diameter, and confined compressive strength of the rock, respectively. Moreover, Wf is the bit wear function, and β, γ, and G are constant coefficients. A challenge encountered when trying to implement this model is the method of evaluating the bit wear function, which requires careful assessment of the bit wear at the well site.

6.1.5 Al-Abduljabbar et al. Model

In 2018, Al-Abduljabbar et al. proposed a special model for predicting the ROP of PDC bits, as per Eq. (28) [8].

ROP = 16.96 × (TRQ × SPP × TFLO × RPM × WOB^f)/(db^2 × ρ × PV × UCS^e)    (28)

In this model, WOB, RPM, db, and UCS refer to the weight-on-bit, rotational speed, bit diameter, and unconfined compressive strength of the rock, respectively. Moreover, SPP denotes the standpipe pressure, TFLO is the fluid flow rate, TRQ is the torque, PV is the plastic viscosity of the mud, and ρ is the mud density, with e and f being constant coefficients.
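As a quick illustration, Eqs. (27) and (28) can be evaluated as plain functions (an illustrative sketch: the argument names are ours, and the inputs and constant coefficients G, γ, β, e, and f are placeholders that must be calibrated to field data):

```python
def rop_motahhari(wob, rpm, db, ccs, wf, g, gamma, beta):
    """Motahhari et al. PDC-bit model, Eq. (27)."""
    return wf * g * rpm ** gamma * wob ** beta / (db * ccs)

def rop_al_abduljabbar(trq, spp, tflo, rpm, wob, db, rho, pv, ucs, e, f):
    """Al-Abduljabbar et al. PDC-bit model, Eq. (28)."""
    return 16.96 * trq * spp * tflo * rpm * wob ** f / (db ** 2 * rho * pv * ucs ** e)
```

Both models scale the ROP up with rotary speed and weight-on-bit and down with bit diameter and rock strength, which is the qualitative behavior the data-driven models in this study must also capture.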
6.1.6 Bingham Model

The Bingham model was proposed for both common types of bits, as per Eq. (29) [9].

ROP = Fd × RPM × (WOB/db)^A5    (29)

in which WOB, RPM, and db refer to the weight-on-bit, rotational speed, and bit diameter, respectively. In addition, Fd denotes the formation drillability, and A5 is the WOB exponent. A common problem with this method is the determination of A5, a task that can be accomplished only based on in-lab tests on the bit.

6.2 Performance Evaluation of the Denoising Step

In order to evaluate the effect of the noise attenuation process on the performance of the models, the algorithms adopted in this study were applied to the noisy data. For this purpose, the petrophysical logs that were selected through feature selection by the MLP-NSGA-II were declared as inputs to the algorithms in the training phase. Similar to the training phase, the controllable parameters of the algorithms were determined based on the denoised data. Given that the range of the input data could affect the values of the considered criteria, the models that were trained on the noisy data were applied to the denoised data of the training and testing phases to predict the ROP. Table 13 reports the results of applying the models that were trained on the noisy data to the denoised data of the training and testing phases in terms of the estimation error and coefficients of determination. A comparison between this table and Tables 9 and 10, which refer to the models that were trained on the denoised data in the training and testing phases, highlights that the models trained on the denoised data provided higher levels of accuracy. In addition, the smaller difference in estimation error between
the training and testing phases for the denoised data, as compared to the noisy data, indicated the higher reliability of the models trained on the denoised data. A similar performance evaluation step was conducted in the validation phase. The ROP estimation error generated upon training the model on the noisy data and then applying the trained model to the denoised data from Well B (Table 14) was compared to that for the model trained on the denoised data in the validation phase (Table 11), indicating the higher generalizability of the models trained on the denoised data.

Authors' Contributions Morteza Matinkia took part in data curation and visualization; Amirhossein Sheykhinasab was involved in visualization and writing—original draft; Soroush Shojaei was involved in methodology, formal analysis, validation, and visualization; Ali Vojdani Tazeh Kand took part in investigation and writing—review & editing; Arad Elmi was involved in writing—original draft and formal analysis; Mahdi Bajolvand took part in investigation and visualization; Mohammad Mehrad was involved in project administration and code development.

Funding There is no funding for this study.

Declarations

Conflict of interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Availability of Data and Material Due to the nature of this research, the participants of this study did not agree for their data to be shared publicly, so supporting data are not available.

Code Availability Developed codes are not available.

Consent to Participate All authors participated in this research.

Consent for Publication All authors agreed to submit and publish this manuscript in the Arabian Journal for Science and Engineering.

References

1. Anemangely, M.; Ramezanzadeh, A.; Tokhmechi, B.: Drilling rate prediction from petrophysical logs and mud logging data using an optimized multilayer perceptron neural network. J. Geophys. Eng. 15, 1146–1159 (2018)
2. Augustine, C.; Tester, J.W.; Anderson, B.: A comparison of geothermal with oil and gas well drilling costs. In: Proceedings, Stanford University, Stanford, California, p. 16 (2006)
3. Amar, K.; Ibrahim, A.: Rate of penetration prediction and optimization using advances in artificial neural networks: a comparative study. In: Proceedings of the 4th International Joint Conference on Computational Intelligence, SciTePress—Science and Technology Publications, Barcelona, Spain, pp. 647–652 (2012). https://doi.org/10.5220/0004172506470652
4. Bourgoyne, A.T.J.; Young, F.S.J.: A multiple regression approach to optimal drilling and abnormal pressure detection. Soc. Pet. Eng. J. 14, 371–384 (1974). https://doi.org/10.2118/4238-PA
5. Warren, T.M.: Penetration rate performance of roller cone bits. SPE Drill. Eng. 2, 9–18 (1987). https://doi.org/10.2118/13259-PA
6. Hareland, G.; Rampersad, P.R.: Drag-bit model including wear. Society of Petroleum Engineers (1994)
7. Motahhari, H.R.; Hareland, G.; James, J.A.; Bartlomowicz, M.: Improved drilling efficiency technique using integrated PDM and PDC bit parameters. Petroleum Society of Canada (2008)
8. Al-AbdulJabbar, A.; Elkatatny, S.; Mahmoud, M.: A robust rate of penetration model for carbonate formation. J. Energy Resour. Technol. 141, 042903 (2019)
9. Bingham, G.: A new approach to interpreting rock drillability. Tech. Man. Reprint, Oil Gas J., 93 p. (1965)
10. Etesami, D.; Zhang, W.J.; Hadian, M.: A formation-based approach for modeling of rate of penetration for an offshore gas field using artificial neural networks. J. Nat. Gas Sci. Eng., 104104 (2021). https://doi.org/10.1016/j.jngse.2021.10410
11. Eren, T.: Real time optimization of drilling parameters during drilling operations. PhD thesis, Middle East Technical University (2015)
12. Kutas, D.T.; Nascimento, A.; Elmgerbi, A.M.; Roohi, A.; Prohaska, M.; Thonhauser, G.; Mathias, M.H.: A study of the applicability of Bourgoyne and Young ROP model and fitting reliability through regression. Paper presented at the International Petroleum Technology Conference, Doha, Qatar, December 2015. https://doi.org/10.2523/IPTC-18521-MS
13. Anemangely, M.; Ramezanzadeh, A.; Tokhmechi, B.: Determination of constant coefficients of Bourgoyne and Young drilling rate model using a novel evolutionary algorithm. J. Min. Environ. 8, 693–702 (2017). https://doi.org/10.22044/jme.2017.842
14. Bahari, M.H.; Bahari, A.; Moharrami, F.N.; Naghabi Sistani, M.: Determination of Bourgoyne and Young model coefficients using genetic algorithm to predict drilling rate. J. Appl. Sci. (2008). https://doi.org/10.3923/jas.2008.3050.3054
15. Hegde, C.; Gray, K.E.: Use of machine learning and data analytics to increase drilling efficiency for nearby wells. J. Nat. Gas Sci. Eng. 40, 327–335 (2017). https://doi.org/10.1016/j.jngse.2017.02.019
16. Bilgesu, H.I.; Tetrick, L.T.; Altmis, U.: A new approach for the prediction of rate of penetration (ROP) values. Society of Petroleum Engineers (1997)
17. Pollock, J.; Stoecker-Sylvia, Z.; Veedu, V.: Machine learning for improved directional drilling. In: Offshore Technology Conference (2018)
18. Sabah, M.; Mohsen, T.; Wood, D.A.; Khosravanian, R.; Anemangely, M.; Younesi, A.: A machine learning approach to predict drilling rate using petrophysical and mud logging data. Earth Sci. Inform. (2019). https://doi.org/10.1007/s12145-019-00381-4
19. Gan, C.; Cao, W.; Wu, M.: Prediction of drilling rate of penetration (ROP) using hybrid support vector regression: a case study on the Shennongjia area, Central China. J. Pet. Sci. Eng., 106200 (2019)
20. Ahmed, O.; Adeniran, A.; Samsuri, A.: Rate of penetration prediction utilizing hydromechanical specific energy. IntechOpen (2018). https://doi.org/10.5772/intechopen.76903
21. Nascimento, A.; Elmgerbi, A.; Roohi, A.: Reverse engineering: a new well monitoring and analysis methodology approaching playing-back drill-rate tests in real-time for drilling optimization. J. Energy Resour. Technol. (2017). https://doi.org/10.1115/1.4033067
22. Gandelman, R.A.: Predição da ROP e otimização em tempo real de parâmetros operacionais na perfuração de poços de petróleo offshore [ROP prediction and real-time optimization of operational parameters in offshore oil well drilling]. Ph.D. thesis, Federal University of Rio de Janeiro (2012)
23. Ahmed, O.S.; Adeniran, A.A.; Samsuri, A.: Computational intelligence based prediction of drilling rate of penetration: a comparative study. J. Pet. Sci. Eng. 172, 1–12 (2019)
24. Bani Mustafa, A.; Abbas, A.K.; Alsaba, M.: Improving drilling performance through optimizing controllable drilling parameters. J. Petrol. Explor. Prod. Technol. 11, 1223–1232 (2021). https://doi.org/10.1007/s13202-021-01116-2
25. Tewari, S.; Dwivedi, U.D.; Biswas, S.: Intelligent drilling of oil and gas wells using response surface methodology and artificial bee colony. Sustainability 13, 1664 (2021). https://doi.org/10.3390/su13041664
26. Elkatatny, S.: Development of a new rate of penetration model using self-adaptive differential evolution-artificial neural network. Arab. J. Geosci. (2019). https://doi.org/10.1007/s12517-018-4185-z
27. Ashrafi, S.B.; Anemangely, M.; Sabah, M.; Ameri, M.J.: Application of hybrid artificial neural networks for predicting rate of penetration (ROP): a case study from Marun oil field. J. Pet. Sci. Eng. 175, 604–623 (2019)
28. Mehrad, M.; Bajolvand, M.; Ramezanzadeh, A.; Neycharan, J.G.: Developing a new rigorous drilling rate prediction model using a machine learning technique. J. Pet. Sci. Eng. 192, 107338 (2020). https://doi.org/10.1016/j.petrol.2020.107338
29. Ansari, H.R.; Sarbaz Hosseini, M.J.; Amirpour, M.: Drilling rate of penetration prediction through committee support vector regression based on imperialist competitive algorithm. Carbonates Evaporites 32, 205–213 (2017). https://doi.org/10.1007/s13146-016-0291-8
30. Matinkia, M.; Amraeiniya, A.; Behboud, M.M.; Mehrad, M.; Bajolvand, M.; Gandomgoun, M.H.; Gandomgoun, M.: A novel approach to pore pressure modeling based on conventional well logs using convolutional neural network. J. Pet. Sci. Eng. 211, 110156 (2022). https://doi.org/10.1016/j.petrol.2022.110156
31. Mehrad, M.; Ramezanzadeh, A.; Bajolvand, M.; Reza Hajsaeedi, M.: Estimating shear wave velocity in carbonate reservoirs from petrophysical logs using intelligent algorithms. J. Pet. Sci. Eng. 212, 110254 (2022). https://doi.org/10.1016/j.petrol.2022.110254
32. Abad, A.R.B.; Ghorbani, H.; Mohamadian, N.; Davoodi, S.; Mehrad, M.; Aghdam, S.K.; Nasriani, H.R.: Robust hybrid machine learning algorithms for gas flow rates prediction through wellhead chokes in gas condensate fields. Fuel 308, 121872 (2022). https://doi.org/10.1016/j.fuel.2021.121872
33. Maletic, J.I.; Marcus, A.: Data cleansing: beyond integrity analysis. In: Conference on Information Quality, pp. 200–209 (2000)
34. Wu, X.: Knowledge Acquisition from Databases. Intellect Books (1995)
35. García, L.P.; de Carvalho, A.C.; Lorena, A.C.: Noisy data set identification. In: International Conference on Hybrid Artificial Intelligence Systems, pp. 629–638. Springer (2013)
36. Lorena, A.C.; de Carvalho, A.C.: Evaluation of noise reduction techniques in the splice junction recognition problem. Genet. Mol. Biol. 27(4), 665–672 (2004)
37. Gonzalez, R.; Woods, R.: Digital Image Processing. Pearson/Prentice Hall (2008). Available: http://books.google.com/books?id=8uGOnjRGEzoC
38. Osman, H.; Ghafari, M.; Nierstrasz, O.: The impact of feature selection on predicting the number of bugs. arXiv:1807.04486 [cs] (2018)
39. Lee, K.B.; Cheon, S.; Kim, C.O.: A convolutional neural network for fault classification and diagnosis in semiconductor manufacturing processes. IEEE Trans. Semicond. Manuf. 30, 135–142 (2017)
40. Liu, Y.; Chen, G.: Optimal parameters design of oilfield surface pipeline systems using fuzzy models. Inf. Sci. 120(1–4), 13–21 (1999)
41. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 182–197 (2002). https://doi.org/10.1109/4235.996017
42. Liang, N.Y.; Huang, G.B.; Saratchandran, P.; Sundararajan, N.: A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans. Neural Netw. 17, 1411–1423 (2006)
43. Yeom, C.U.; Kwak, K.C.: Short-term electricity-load forecasting using a TSK-based extreme learning machine with knowledge representation. Energies 10(10), 1613 (2017). https://doi.org/10.3390/en10101613
44. Huang, G.B.; Wang, D.H.; Lan, Y.: Extreme learning machines: a survey. Int. J. Mach. Learn. Cybern. 2, 107–122 (2011). https://doi.org/10.1007/s13042-011-0019-y
45. Vapnik, V.: The Nature of Statistical Learning Theory. Springer Science & Business Media, Berlin (2013)
46. Wang, H.; Hu, D.: Comparison of SVM and LS-SVM for regression. In: International Conference on Neural Networks and Brain. IEEE, pp. 279–283 (2005)
47. Si, G.; Shi, J.; Guo, Z.; Jia, L.; Zhang, Y.: Reconstruct the support vectors to improve LSSVM sparseness for mill load prediction. Math. Probl. Eng. (2017). https://doi.org/10.1155/2017/4191789
48. Sabah, M.; Mehrad, M.; Ashrafi, S.B.; Wood, D.A.; Fathi, S.: Hybrid machine learning algorithms to enhance lost-circulation prediction and management in the Marun oil field. J. Pet. Sci. Eng. 198, 108125 (2021). https://doi.org/10.1016/j.petrol.2020.108125
49. Anemangely, M.; Ramezanzadeh, A.; Amiri, H.; Hoseinpour, S.A.: Machine learning technique for the prediction of shear wave velocity using petrophysical logs. J. Pet. Sci. Eng. (2019). https://doi.org/10.1016/j.petrol.2018.11.032
50. Duan, K.; Keerthi, S.S.; Poo, A.N.: Evaluation of simple performance measures for tuning SVM hyperparameters. Neurocomputing 51, 41–59 (2003). https://doi.org/10.1016/S0925-2312(02)00601-X
51. Indolia, S.; Goswami, A.K.; Mishra, S.P.; Asopa, P.: Conceptual understanding of convolutional neural network—a deep learning approach. Procedia Comput. Sci. 132, 679–688 (2018). https://doi.org/10.1016/j.procs.2018.05.069
52. Nebauer, C.: Evaluation of convolutional neural networks for visual recognition. IEEE Trans. Neural Netw. 9, 685–696 (1998)
53. Mrazova, I.; Kukacka, M.: Can deep neural networks discover meaningful pattern features? Procedia Comput. Sci. 12, 194–199 (2012)
54. Rajabioun, R.: Cuckoo optimization algorithm. Appl. Soft Comput. 11, 5508–5518 (2011). https://doi.org/10.1016/j.asoc.2011.05.008
55. Kennedy, J.; Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN'95—International Conference on Neural Networks. IEEE, pp. 1942–1948 (1995)
56. de Moura Meneses, A.A.; Machado, M.D.; Schirru, R.: Particle swarm optimization applied to the nuclear reload problem of a pressurized water reactor. Prog. Nucl. Energy 51, 319–326 (2009). https://doi.org/10.1016/j.pnucene.2008.07.002
57. Pedersen, M.E.H.; Chipperfield, A.J.: Simplifying particle swarm optimization. Appl. Soft Comput. 10, 618–628 (2010)
58. Coello, C.C.; Lamont, G.B.; van Veldhuizen, D.A.: Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd edn. Springer, US (2007)
59. Katoch, S.; Chauhan, S.S.; Kumar, V.: A review on genetic algorithm: past, present, and future. Multimed. Tools Appl. 1–36 (2020)
60. Kunjur, A.; Krishnamurty, S.: Genetic algorithms in mechanism synthesis. J. Appl. Mech. Robot. 4, 18–24 (1997)
61. Michalewicz, Z.; Schoenauer, M.: Evolutionary algorithms for constrained parameter optimization problems. Evol. Comput. 4, 1–32 (1996)