A Data Driven Approach of ROP Prediction and Drilling Performance

This paper develops a new data-driven approach for predicting rate of penetration (ROP) and estimating drilling performance using machine learning techniques. The model uses a large volume of data from wells in southern China, including well logging, mud logging, geological information, and daily reports. Compared to conventional empirical models, the new approach considers ROP as a continuous process and improves prediction accuracy by relating ROP at a given time to drilling rate in the previous time step. The machine learning model provides new insights and can help optimize drilling plans and enhance drilling efficiency.


IPTC-19430-MS

A Data Driven Approach of ROP Prediction and Drilling Performance Estimation

Jiahang Han, Yanji Sun, and Shaoning Zhang, PCITC

Copyright 2019, International Petroleum Technology Conference

This paper was prepared for presentation at the International Petroleum Technology Conference held in Beijing, China, 26 – 28 March 2019.

This paper was selected for presentation by an IPTC Programme Committee following review of information contained in an abstract submitted by the author(s).
Contents of the paper, as presented, have not been reviewed by the International Petroleum Technology Conference and are subject to correction by the author(s). The
material, as presented, does not necessarily reflect any position of the International Petroleum Technology Conference, its officers, or members. Papers presented at
IPTC are subject to publication review by Sponsor Society Committees of IPTC. Electronic reproduction, distribution, or storage of any part of this paper for commercial
purposes without the written consent of the International Petroleum Technology Conference is prohibited. Permission to reproduce in print is restricted to an abstract of
not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of where and by whom the paper was presented.
Write Librarian, IPTC, P.O. Box 833836, Richardson, TX 75083-3836, U.S.A., fax +1-972-952-9435.

Abstract
Given the expense of drilling in resource development, exploring optimal drilling operations, and especially
improving the rate of penetration (ROP), has become increasingly important. The prediction of ROP is
challenging due to the complex bit and rock interactions. Several experience-based models have attempted
to predict ROP, but few of them give accurate estimates.
In this paper, a new ROP prediction methodology is developed using machine learning techniques. Growing
computation capacity and large volumes of data provide a new way to solve the problem. Wells from
southern China are used to demonstrate our approach. Well logging, mud logging
data, geological information, daily well reports, and other data are included in the model
development. Parameters previously considered irrelevant to ROP calculation show their
impact in the model.
The machine learning approach brings new insights into ROP prediction. Compared with the
conventional approach, our method treats drilling as a continuous process: ROP at the current time is related
to the drilling rate at the previous time step. In our case, the model improves ROP estimation accuracy. In
addition, engineers can use this tool to arrange better drilling plans during operation. Looking deeper into the
data, the acceleration and deceleration of ROP show a close relationship to weight on bit (WOB) and mud
properties. Improving the mud rheology and operation strategies can keep the ROP in an optimal range, which
may enhance drilling efficiency.
With the assistance of machine learning techniques, a new method of ROP prediction has been developed.
Compared with conventional approaches, this method considers more parameters and improves
accuracy. Factors that may enhance drill bit performance are also explored. This new data-driven
method can be beneficial for drilling efficiency.

Introduction
Drilling is one of the most important aspects of resource development. A smooth drilling process achieves
safe and fast well construction and better investment efficiency. An accurate ROP estimation benefits
well planning and helps prevent unexpected drilling accidents. Many parameters influence the instantaneous
ROP, including formation properties, mud rheology, drill bits, and bit/rock interactions. String vibrations,
deformations, and bit fatigue can also affect the rate of bit penetration.
The prediction of ROP has been explored for many years. Most prediction models are based on
analytical or empirical algorithms. Galle and Woods (1963) developed an analytical model to calculate
ROP from weight on bit, rotary speed, and bit type. The model created a new method of
calculating ROP; however, it fails to consider formation rock properties. Mechem and Fullerton (1965)
improved ROP calculation by taking more parameters into consideration, such as formation tightness,
depth, and mud hydrostatic pressure. At this early stage, only limited drilling parameters were
considered, and these models could easily lose accuracy in complicated drilling environments.
Bourgoyne and Young (1974) developed a multiple regression model to estimate ROP. Weight on bit, rotary
speed, bit wear, bit size, hydraulics, formation depth, formation strength, and over/under balance conditions
were used for the rate of penetration calculation. Rock/bit interactions, bit properties, and fluid properties
were also considered in their model. However, the Bourgoyne and Young model requires many inputs which are
difficult to obtain, and some of the parameters are empirical. Al-Betairi et al. (1988) conducted sensitivity
analyses to evaluate how the operation parameters influence ROP. However, the optimized ROP was poorly
defined in the paper, and the conclusions were questionable. Alum and Egbon (2011) found that pressure
losses affect ROP significantly. However, the empirical parameters used in their models cannot adjust to
changing drilling situations. Hegde et al. (2015) and Hegde and Gray (2017) derived models purely from
data, which brought new ideas to ROP prediction. However, the data pool for their models is still limited.
In this paper, we develop a purely data-driven ROP prediction model.

Methodology
In the quest for better solutions to complex engineering problems, the artificial neural network (ANN)
technique was developed. This methodology aims to mimic the behavior of biological
neurons (Fausett, 1994). A neural network consists of elements (neurons), each of which is basically a sigmoid
function that processes its inputs and passes values to the following layer.
A typical example of an interconnected neural network architecture is shown in Fig. 1, which uses a
three-layer system with four inputs and three outputs. Each neural network
processes information by assigning weights to the inputs; each input parameter in Fig. 1 has its own weight.
There are essentially two types of training, known as supervised and unsupervised
training. Supervised training has a clear target, so that each output unit knows what its desired
response to the inputs ought to be. An important issue in supervised learning is the problem of error
convergence, i.e. the minimization of the error between the desired and computed unit values. The aim is to
determine a set of weights which gives the least error. ANN is an example of supervised learning, which tries
to find proper weights for each neuron so that the inputs are converted to the desired output. The inputs and
outputs are iterated through the neural elements until they converge to the final target.

Figure 1—Typical ANN architecture
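
The forward pass through a small network of this kind can be sketched in a few lines of Python. This is only an illustration of the idea of weighted inputs passed through sigmoid units: the layer sizes follow the four-input, three-output example, while the hidden layer width and all weight and bias values are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical weights: 4 inputs -> 2 hidden neurons -> 3 outputs.
hidden_weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [[0.7, -0.2], [0.1, 0.3], [-0.5, 0.6]]
output_biases = [0.0, 0.0, 0.0]

inputs = [1.0, 0.5, -1.0, 2.0]
hidden = layer_forward(inputs, hidden_weights, hidden_biases)
outputs = layer_forward(hidden, output_weights, output_biases)
print(outputs)
```

Training would consist of adjusting the weight lists so that `outputs` approaches the desired targets.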

Weights and transfer functions (which are the input-output functions) are the most important aspects of a
neural network. Transfer functions are basically categorized into three groups: sigmoid, linear,
and threshold. The sigmoid function converts inputs to a continuously varying output. Compared with
linear and threshold functions, sigmoid units behave more like real neurons, but all three transfer
functions can only be treated as approximations. For the linear transfer function, the output is proportional
to the input, while for the threshold transfer function the output only accepts values within the threshold
boundaries. For the network to achieve good prediction accuracy, the model has to be trained to reduce the
difference (error) between the prediction and the target. Each neuron unit calculates how the error changes as
the weights vary. By adjusting the weights, the ANN converges to a final approximation. The back
propagation algorithm is the most widely used method for determining the error derivative with respect to
the weights.
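
As a minimal sketch of these ideas, the code below defines the three transfer function families and performs gradient descent on a single sigmoid neuron, using the chain rule to obtain the error derivative with respect to the weight. The threshold unit is written here as a simple step function, and the learning rate, step count, input, and target are all illustrative assumptions rather than settings from the paper.

```python
import math

def sigmoid(z):
    """Smooth, continuously varying output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear(z, slope=1.0):
    """Output proportional to the input."""
    return slope * z

def threshold(z, limit=0.0):
    """Step function (assumed form): 1 at or above the limit, else 0."""
    return 1.0 if z >= limit else 0.0

def train_neuron(x, target, w, lr=0.5, steps=200):
    """Gradient descent on the squared error of y = sigmoid(w * x)."""
    for _ in range(steps):
        y = sigmoid(w * x)
        # dE/dw for E = 0.5 * (y - target)^2, obtained with the chain rule
        grad = (y - target) * y * (1.0 - y) * x
        w -= lr * grad
    return w

w0 = 0.0
w_trained = train_neuron(x=1.0, target=0.9, w=w0)
error_before = abs(sigmoid(w0 * 1.0) - 0.9)
error_after = abs(sigmoid(w_trained * 1.0) - 0.9)
print(error_before, error_after)
```

In a multi-layer network the same derivative is propagated backward layer by layer, which is what the back propagation algorithm automates.
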
The drilling process is a complicated problem because the ROP changes with time. It requires a time-series
dependent model as the bit goes deeper and the rocks become more tightly consolidated. To bring time effects
into the neural network, the recurrent neural network (RNN) was developed (Fig. 2). An RNN can take
inputs in time sequence. Its recurrent structure lets it use the features at time (t-1) along
with the features at the current time (t) to predict the output at time (t): the hidden layer's state at the
previous time step is included as an input to the current time step calculation. RNNs provide solutions to
time-series problems, but they can suffer from the vanishing gradient problem. Especially for complicated
problems, it may take many iterations to converge to an acceptable prediction. When the derivative of the
activation function is smaller than 1, the chain rule applied during error propagation multiplies these
derivatives together and drives the gradients toward 0. This phenomenon can consume a large amount of
computation before the model converges.

Figure 2—Typical RNN architecture
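
The vanishing gradient effect can be illustrated numerically. The sketch below multiplies the per-step derivative of a sigmoid activation along a chain of time steps, as the chain rule does during error propagation. It assumes a recurrent weight of 1 and pre-activations of 0 (where the sigmoid derivative takes its maximum value of 0.25); both are simplifying assumptions chosen only to show the trend.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def gradient_factor(pre_activations, recurrent_weight=1.0):
    """Product of the per-step factors w * sigmoid'(z) along the time axis."""
    factor = 1.0
    for z in pre_activations:
        factor *= recurrent_weight * sigmoid_derivative(z)
    return factor

short_chain = gradient_factor([0.0] * 5)    # 5 time steps
long_chain = gradient_factor([0.0] * 50)    # 50 time steps
print(short_chain, long_chain)
```

Even in this best case for the sigmoid, the factor shrinks geometrically with the number of time steps, so early time steps receive almost no gradient signal.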

The long short-term memory (LSTM) model addresses the vanishing gradient problem by incorporating a new
network architecture (Fig. 3). The LSTM architecture comprises a memory unit and three gates: an input gate, an output
gate, and a forget gate. The input gate controls which values enter the LSTM cell, the forget gate controls
whether the values stay in the cell, and the output gate controls which values are used to compute the outputs.
During LSTM network training, the weights assigned to each unit are learned, and the gates' operating rules are
determined.

Figure 3—Typical LSTM architecture
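
The role of the three gates can be made concrete with a single time step of a scalar LSTM cell. The sketch below uses the widely used formulation with sigmoid gates and a tanh candidate value; the paper does not state which LSTM variant was used, and the parameter names and values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each of "input", "forget", "output", and "candidate" to a
    (w_x, w_h, bias) tuple. All values used below are hypothetical.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))        # how much new information enters the cell
    f = sigmoid(pre("forget"))       # how much of the old cell state is kept
    o = sigmoid(pre("output"))       # how much of the cell state is exposed
    g = math.tanh(pre("candidate"))  # candidate value to store

    c = f * c_prev + i * g           # updated memory (cell state)
    h = o * math.tanh(c)             # updated hidden state (cell output)
    return h, c

params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.6, 0.2, 0.0),
}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

Because the cell state is updated additively, with the forget gate scaling the previous state, gradients can pass through many time steps without being repeatedly squashed by an activation derivative.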

Model Development and Application


LSTM deep learning was applied in the ROP prediction model. Data engineering has to be
conducted first to develop the model. In this paper, we used a well in southern China.
The drilling section covers 900 meters, drilled without changing the drill bit. The first 500 meters were used
for training and validation, while the remainder was used for prediction. The well's "Dlis" file, which contains
all the available drilling information, and the daily reports completed by the field engineer were used as the
original source data. The "Dlis" file records drilling data every 5 seconds. There are about 139 parameters
(features) in the Dlis file, including depth, bit status, rotation speed, hook load, standpipe pressure, and so
on. Lithology data represented by gamma, conductivity, and resistivity were also included. Basically, any
data associated with the drilling process was recorded in the "Dlis" file. The first and most important task
in machine learning model development is to clean the data. Null and erroneous values occur throughout
the original data pool. Engineering judgment had to be applied to obtain good data to feed the
model training.
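
The depth-based split described above can be sketched as follows. Because the data form a sequence, the split is made at a depth cutoff rather than by random shuffling. The record layout, field names, starting depth, and sample values are hypothetical; only the 500-meter training length comes from the text.

```python
def split_by_depth(records, start_depth, train_length=500.0):
    """Split depth-ordered records into a training/validation part and a prediction part."""
    cutoff = start_depth + train_length
    train = [r for r in records if r["depth"] < cutoff]
    predict = [r for r in records if r["depth"] >= cutoff]
    return train, predict

# Hypothetical records: one sample per 100 m over a section starting at 2000 m.
records = [{"depth": 2000.0 + 100.0 * k, "rop": 10.0 + k} for k in range(10)]
train, predict = split_by_depth(records, start_depth=2000.0)
print(len(train), len(predict))
```
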
The parameter "ROP" is the target, which should not be negative. In our processing, all negative ROP values
and their associated features were deleted as erroneous inputs. Features with minimal variance also have to be
eliminated. Minimal-variance features are normally treated as constants, which have very
limited influence on the changing target. After eliminating those data, the features have to be extended by
feature upscaling techniques: more features are created in order to cover all possible
feature combinations. Because many of the created features have minimal influence on changes in the target,
a relevance study has to be employed to reduce the feature pool to an acceptable size. The number of features
is closely related to the quantity of inputs and the processing capacity. A reduced feature set can improve the
model's convergence speed without losing accuracy, and only requires a reasonable number of inputs.
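
The two cleaning rules described above, dropping samples with negative ROP and dropping near-constant features, can be sketched with the Python standard library. The feature names, sample values, and variance threshold are illustrative assumptions, not values from the paper.

```python
from statistics import pvariance

def drop_negative_rop(rows, target="rop"):
    """Remove samples whose target value is negative (treated as erroneous)."""
    return [row for row in rows if row[target] >= 0]

def drop_low_variance_features(rows, target="rop", min_variance=1e-6):
    """Remove features whose variance over all samples is below a threshold."""
    feature_names = [name for name in rows[0] if name != target]
    keep = [
        name for name in feature_names
        if pvariance([row[name] for row in rows]) > min_variance
    ]
    return [{name: row[name] for name in keep + [target]} for row in rows]

# Hypothetical samples; bit_size never changes, and one ROP value is negative.
rows = [
    {"wob": 10.0, "rpm": 60.0, "bit_size": 8.5, "rop": 12.0},
    {"wob": 12.0, "rpm": 65.0, "bit_size": 8.5, "rop": -3.0},
    {"wob": 14.0, "rpm": 70.0, "bit_size": 8.5, "rop": 15.0},
    {"wob": 11.0, "rpm": 62.0, "bit_size": 8.5, "rop": 13.0},
]

clean = drop_low_variance_features(drop_negative_rop(rows))
print(clean)
```
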
A data correlation study was conducted. A strong correlation coefficient (close to 1) indicates a linear
relationship, which means duplicated features, while a weak correlation coefficient (close to 0) represents a
non-influential value which can be treated as a constant. Both the duplicated values and non-influential ones
have to be removed from the feature pool. In our case, 32 features were kept for further model development,
including depth, rotation speed, STP, bottom-hole temperature, and so on.
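
One possible reading of this correlation screening is sketched below: a feature is dropped when its Pearson correlation with the target is close to 0, or when it is almost perfectly correlated with a feature that has already been kept. This interpretation, the thresholds `low` and `high`, and the sample data are assumptions made for illustration; the paper does not give the exact procedure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, low=0.05, high=0.98):
    """features maps a name to a list of values; returns the names to keep."""
    kept = []
    for name, values in features.items():
        if abs(pearson(values, target)) < low:
            continue  # almost no linear relation to the target
        if any(abs(pearson(values, features[other])) > high for other in kept):
            continue  # near-duplicate of a feature already kept
        kept.append(name)
    return kept

# Hypothetical data: wob_copy duplicates wob, noise is unrelated to the target.
features = {
    "wob": [1.0, 2.0, 3.0, 4.0, 5.0],
    "wob_copy": [2.0, 4.0, 6.0, 8.0, 10.0],
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],
    "rpm": [5.0, 3.0, 4.0, 1.0, 2.0],
}
target = [2.0, 4.1, 5.9, 8.0, 10.0]
selected = select_features(features, target)
print(selected)
```
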
Thus the network's training and testing data sets were created for the input and output ranges. A
back propagation (BP) neural network was built, which includes the input layer, three hidden layers, and the output layer
(Fig. 4). This BP neural network uses both the tansig and logsig transfer functions. The first hidden layer
receives weighted inputs from the input layer. The last layer
corresponds to the targets. Errors from each layer are transferred back to the previous layers to
reduce the differences. The conventional BP ANN model developed here is shown in the following
figure.

Figure 4—ANN Model for ROP prediction

To evaluate the performance of the trained neural network, percentage errors were compared. The
error was calculated by comparing the values predicted by the ANN model with the actual values.
The percentage error is calculated as

where X is the value generated by the network and R is the actual data gathered from the field. The error
of the BP model is about 14%. The results are shown in the following figures (Fig. 5). As shown
in the figures, the blue curve is the real field data while the red curve is derived from the BP ANN. Most
of the ROP trend has been captured by the BP model. However, a few of the ROP predictions
deviate from the real values.
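
A common way to compute such a percentage error is the mean absolute percentage error, sketched below with X as the predicted values and R as the actual values. Whether the paper averages the absolute relative error in exactly this way is an assumption, and the sample numbers are made up for illustration.

```python
def mean_percentage_error(predicted, actual):
    """Mean of |X - R| / |R| over all samples, expressed in percent.

    Samples whose actual value is zero are skipped to avoid division by zero.
    """
    pairs = [(x, r) for x, r in zip(predicted, actual) if r != 0]
    if not pairs:
        raise ValueError("no samples with a non-zero actual value")
    return 100.0 * sum(abs(x - r) / abs(r) for x, r in pairs) / len(pairs)

# Hypothetical ROP values.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
error = mean_percentage_error(predicted, actual)
print(error)
```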

Figure 5a—ROP prediction results from ANN (Blue curve represents the real data, the red curve is predicted from ANN)

Figure 5b—ROP prediction results from ANN (Blue curve represents the real data, the red curve is predicted from ANN)

To improve the model accuracy and overcome the drawbacks of the BP neural network, an LSTM
recurrent neural network model was developed. The results are shown in the following figures (Fig. 6). The
orange curve is the prediction from the LSTM model. Compared with the conventional BP neural network,
the LSTM model gives a better prediction, with 7% error. Many of the ROP spikes are predicted,
as shown in Figure 6.

Figure 6a—ROP prediction results from LSTM (Blue curve represents the real data, the orange curve is predicted from LSTM)

Figure 6b—ROP prediction results from LSTM (Blue curve represents the real data, the orange curve is predicted from LSTM)

Discussion
Both models were applied to the same well; the results are shown in Figure 7.
The LSTM model gives a better estimation than the BP neural network model. In particular, for the drilling
section around a depth of 2520 meters, the BP neural network model predicted about half of the real field ROP
value, while the LSTM model gives a good ROP estimate. The main improvement from the BP neural network to
the LSTM model is that the LSTM deep learning model considers time sequence effects. During the drilling process,
rock cutting is a continuous process, and the previous drill bit/rock interaction may have a strong effect on
the following drilling performance. In our case, when we took the time effects into consideration, the model
prediction accuracy improved.

Figure 7a—ROP prediction results from ANN (Blue curve represents the real data, the red curve is predicted from ANN)

Figure 7b—ROP prediction results from LSTM (Blue curve represents the real data, the orange curve is predicted from LSTM)
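
The time sequence effect discussed above depends on how the samples are presented to the model. A minimal sketch of one way to build sequence inputs is given below: each step carries the drilling features at that step together with the ROP observed at the previous step, and the target is the ROP at the last step of the window. The window length and the sample data are hypothetical, and the paper does not describe its exact input layout.

```python
def make_sequences(features, rop, window=3):
    """Build (input sequence, target) pairs for a sequence model.

    features is a list of per-step feature lists and rop the matching ROP
    series. Each input step holds the features at step t plus rop[t - 1].
    """
    samples = []
    for end in range(window, len(rop)):
        sequence = [
            features[t] + [rop[t - 1]]
            for t in range(end - window + 1, end + 1)
        ]
        samples.append((sequence, rop[end]))
    return samples

# Hypothetical data: one feature per step and a slowly increasing ROP.
features = [[10.0 + t] for t in range(6)]
rop = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
samples = make_sequences(features, rop)
print(samples[0])
```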

More experiments were conducted on the data. As shown in the following figure (Fig. 8), the drilling
performance was divided into a "high speed zone" (masked in green) and a "low speed zone" (masked in red)
according to ROP value. We then explored how to improve ROP by controlling the drilling operations.

Figure 8—Drilling performance zone


As shown in Fig. 9, the whole drilling process was divided into zones using a moving average
technique. The orange, yellow, and grey lines divide ROP into three zones. The area between the orange and
yellow lines has a faster penetration rate than the area between the yellow and grey lines.

Figure 9—Drilling performance zone
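
A simplified version of moving-average zoning can be sketched as follows. The code smooths the ROP series with a trailing moving average and labels each point as "high" or "low" relative to the overall mean ROP. The two-zone rule, the window length, and the sample values are assumptions for illustration; the boundaries actually used in Fig. 9 are not specified in the text.

```python
def moving_average(values, window):
    """Trailing moving average; early points average over what is available."""
    averaged = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        averaged.append(running / min(i + 1, window))
    return averaged

def label_zones(rop, window=3):
    """Label each point by comparing its smoothed ROP with the overall mean."""
    smoothed = moving_average(rop, window)
    overall_mean = sum(rop) / len(rop)
    return ["high" if s >= overall_mean else "low" for s in smoothed]

# Hypothetical ROP series with a fast interval in the middle.
rop = [4.0, 5.0, 6.0, 12.0, 14.0, 13.0, 5.0, 4.0]
zones = label_zones(rop)
print(zones)
```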

The driving factors that differentiate the high and low ROP zones were analyzed. As shown in Fig. 10, the six
most influential factors for classifying ROP into the high or low region are listed. The positive or
negative signs indicate whether the correlation is positive or negative. As shown in the tornado chart, weight on
bit has the largest positive influence, while hook load has the largest negative effect. From this point of
view, ROP is most likely driven by the force applied on the drill bit. To keep the ROP in the high
zone, increasing the weight on bit or decreasing the hook load could be very effective for this well. Azimuth
and inclination are the next two parameters for ROP classification, which may be due to the in-situ stress
and the bit-rock contact angle. Bottom hole mud resistivity and bottom hole pressure may be closely related to
wellbore cleaning, which would affect the rate of drill bit penetration. However, we lack detailed mud
properties during drilling, and a thorough study is needed to confirm this observation.

Conclusion
In this paper, a new ROP prediction methodology is developed using machine learning techniques. The
model is based on recorded drilling data and an LSTM deep neural network. Compared with the conventional
BP neural network method, this model treats drill bit penetration as a continuous process. It takes the
drilling data in sequence and predicts ROP continuously. In our case, the new approach achieves better ROP
prediction results than the BP neural network model.
The driving factors that can maintain a high ROP were explored. In our case, the weight on bit, penetration
direction, and mud properties showed a strong influence on keeping ROP at a high level. Controlling
these operational factors may bring a faster penetration rate and better drilling efficiency.

References
Al-Betairi, E.A., Moussa, M.M., and Al-Otaibi, S. 1988. Multiple Regression Approach to Optimize Drilling Operations
in the Arabian Gulf Area. SPE Drill. Eng. 3 (1). SPE-13694-PA. https://doi.org/10.2118/13694-PA
Alum, M.A. and Egbon, F. 2011. Semi-analytical Models on the Effect of Drilling Fluid Properties on Rate of Penetration
(ROP). Presented at the Nigeria Annual International Conference and Exhibition, Abuja, Nigeria, 30 July – 3 August.
SPE-150806-MS. https://doi.org/10.2118/150806-MS
Bourgoyne, A.T. and Young, F.S. 1974. A Multiple Regression Approach to Optimal Drilling and Abnormal Pressure
Detection. SPE J. 14 (4): 371–384. SPE-4238-PA. https://doi.org/10.2118/4238-PA
Fausett, L. 1994. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice-Hall,
Englewood Cliffs, New Jersey.
Galle, E.M. and Woods, H.B. 1963. Best Constant Weight and Rotary Speed for Rotary Rock Bits. API Drilling and
Production Practice, New York, 1 January. API-63-048.
Hegde, C., Wallace, S., and Gray, K. 2015. Using Trees, Bagging, and Random Forests to Predict Rate of Penetration
during Drilling. Presented at the SPE Middle East Intelligent Oil and Gas Conference and Exhibition, Abu Dhabi, UAE,
15-16 September. https://doi.org/10.2118/176792-MS
Hegde, C. and Gray, K. 2017. Use of Machine Learning and Data Analytics to Increase Drilling Efficiency for Nearby
Wells. Journal of Natural Gas Science and Engineering 40: 327–335. https://doi.org/10.1016/j.jngse.2017.02.019
Mechem, O.E. and Fullerton, H.B. Jr. 1965. Computers invade the rig floor. Oil Gas J., p. 14.
Ritto, T.G., Soize, C., and Sampaio, R. 2011. Robust optimization of the rate of penetration of a drill-string using
a stochastic nonlinear dynamical model. Comput. Mech. 45 (5): 415–427. https://doi.org/10.1007/s00466-010-0473-5
