OceanofPDF - Com Data Science and Machine Learning - Daniel Asante Otcher
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences
of their use. The authors and publishers have attempted to trace the copyright holders of
all material reproduced in this publication and apologize to copyright holders if permission
to publish in this form has not been obtained. If any copyright material has not been
acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted,
reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means,
now known or hereafter invented, including photocopying, microfilming, and recording, or in
any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please
contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and
are used only for identification and explanation without intent to infringe.
To God, who has been my rock and guide throughout my life’s journey,
I dedicate this book with all my heart.
To my dad, Retired Assistant Commissioner of Ghana Police, Daniel
Asante Otchere, thank you for instilling in me a strong work ethic and
dedication to excellence. You have always been a source of inspiration, and
I am grateful for your unwavering support.
To my mum, Jane Duah Otchere, your love and encouragement have been
a constant source of strength. Thank you for being my pillar of support and for
believing in me even when I doubted myself.
To my beloved, Annabelle, thank you for being my partner, my best friend,
and my biggest cheerleader. Your unwavering love and support have been
instrumental in my success, and I am grateful for your presence in my life.
To my siblings, Nicholas, Elliot, Yvonne, and Thelma, thank you for your
love, support, and unwavering belief in me. You have been a constant source
of motivation, and I am grateful for the joy and laughter you bring into my life.
This book is a testament to the love, support, and guidance of these
amazing individuals in my life. Thank you all for being part of my life’s
journey and for being my inspiration.
Foreword
It is with great pleasure that I write the foreword for this book on Data Science
and Machine Learning Applications in Subsurface Engineering.
The field of subsurface engineering is a critical aspect of the global energy
industry, and it has undergone significant transformations in recent years with
the advent of data science and machine learning. This has led to increased
efficiency, improved decision-making, and reduced costs in the exploration,
production, and development of subsurface resources. Applying data science
and machine learning techniques to subsurface engineering is transforming
how we approach and solve complex problems in this domain. The
combination of vast amounts of data generated by sensors and other sources
and the development of powerful algorithms and computing capabilities has
enabled us to extract valuable insights and make informed decisions.
The edited book on Data Science and Machine Learning Applications
in Subsurface Engineering is an important contribution to this rapidly
evolving field. This book brings together a collection of chapters highlighting
innovative and impactful applications of these technologies in subsurface
engineering. From reservoir characterisation to production optimisation
and drilling engineering, the contributors provide various perspectives,
demonstrating potentials and challenges. The expert and knowledgeable
contributors come from diverse backgrounds, including academia, industry,
and government. The diversity of their perspectives enriches the discussion
and highlights the need for cross-disciplinary approaches to solving the
complex challenges of subsurface engineering.
The chapters in this book cover a wide range of application areas, including
data-driven workflows for subsurface characterisation at the well scale and the
reservoir scale, machine learning techniques for well performance analysis,
and smart completions. Diverse data types are analyzed including well flow
data, wireline log data and seismic images. A variety of machine learning
algorithms are described, from traditional multivariate statistical methods
and tree-based methods to artificial neural networks and deep convolutional
neural networks. This book offers valuable insights into the ongoing research
and development in the field of subsurface engineering.
Data science and machine learning have revolutionised various industries, and
the subsurface engineering industry is no exception. Subsurface engineering
involves the exploration, development, and production of natural resources
like oil and gas, and is critical to meet the world’s energy demands. The
application of data science and machine learning techniques in subsurface
engineering has the potential to significantly improve the efficiency, accuracy,
and safety of operations in this industry.
This research book on Data Science and Machine Learning Applications
in Subsurface Engineering provides a comprehensive overview of this field’s
latest research and developments. The book is intended for professionals,
researchers, and students interested in understanding the potential of data
science and machine learning in subsurface engineering. I hope that readers
value each chapter in this book because subsurface engineering plays a vital
role in meeting the world’s energy demands, and the application of data
science and machine learning techniques has the potential to significantly
improve the efficiency, accuracy, and safety of operations in this industry.
This book brings together a collection of chapters that showcase some
of the most innovative and impactful applications of these technologies in
subsurface engineering. The contributors to this book represent a diverse
range of backgrounds and expertise, from academic researchers to industry
professionals. Each chapter offers a unique perspective on using data science
and machine learning in subsurface engineering. The success of this book
is a testament to our contributors’ hard work and dedication. I extend my
deepest gratitude to Ramez, Daniel, Eric Thompson, Ayoub, Halim, and
Nikita for their contributions. The contributions of our authors have made this
book possible, and their research provides valuable insights into the potential
of data science and machine learning in subsurface engineering. I want to
sincerely thank each of our contributors for their hard work, dedication, and
commitment to advancing knowledge in this field.
Throughout the book, several key themes emerge, including the importance
of interdisciplinary collaboration, the need for robust and transparent data
management practices, and the challenges of implementing data science and
Dedication
Foreword
Preface
1. Introduction
2. Enhancing Drilling Fluid Lost-circulation Prediction: Using Model Agnostic and Supervised Machine Learning
1. Introduction
2. Background of Machine Learning Regression Models
3. Data Collection and Description
4. Methodology
4.1 Data Analysis and Visualisation
4.2 Machine Learning Model Application
4.3 Explainable AI
4.3.1 Permutation Feature Importance
4.3.2 Shapley Values
5. Results and Discussion
5.1 Evaluation of Model Performance
5.2 Model Agnostic Results
5.3 Analysis of Features Using Model Agnostic Metrics
5.4 Analysis of Features Using Shapley Values Model Agnostic Metrics
5.5 Evaluation of Top Features
5.6 Model Optimisation
5.7 Sensitivity Analysis
6. Conclusions
Acknowledgement
Data Availability
References
3. Methodology
3.1 Regional Geological Overview of the Opunake Field
3.2 Local Geological Overview of the Opunake Field
3.3 Deep Convolutional Neural Network in Seismic Image Resolution
3.3.1 Simplified Architecture of Residual U-net
3.4 Training and Testing Process
3.5 Criteria for Model Evaluation
4. Results and Discussion
4.1 Conditioned Seismic Volume
4.2 Model Evaluation
5. Conclusions
Data and Software Availability
Acknowledgement
References
10. Petroleum Reservoir Characterisation: A Review from Empirical to Computer-Based Applications
1. Introduction
2. Empirical Models for Petrophysical Property Prediction
2.1 Porosity and Permeability Prediction Models
2.2 Saturation Prediction Models
3. Fractal Analysis in Reservoir Characterisation
4. Application of Artificial Intelligence in Petrophysical Property Prediction
4.1 Artificial Neural Networks (ANNs)
4.1.1 ANN Application in Petrophysical Reservoir Prediction
4.2 Support Vector Machine (SVM)
4.2.1 Machine Learning (ML) Application in Petrophysical Reservoir Prediction
5. Lithology and Facies Analysis
5.1 AI Applications in Lithology and Facies Analysis
6. Seismic Guided Petrophysical Property Prediction
7. Hybrid Models of AI for Petrophysical Property Prediction
8. Summary
9. Challenges and Perspectives
9.1 AI Perspective
9.2 Rock Physics Perspective
10. Conclusions
References
11. Artificial Lift Design for Future Inflow and Outflow Performance for Jubilee Oilfield: Using Historical Production Data and Artificial Neural Network Models
1. Introduction
2. Methodology
2.1 Artificial Lift Screening Techniques
2.2 Inflow Performance Relationship Production Forecast
2.3 Outflow Performance Relationship Production Forecast
2.4 PROSPER Procedure for Well Model Set-Up
2.4.1 Deviation Survey Data Input
2.4.2 Surface Equipment Data Input
2.4.3 Downhole Equipment Data Input
2.4.4 Average Heat Capacities Data Input
2.5 Artificial Neural Networks
2.5.1 Back Propagation Neural Network
2.5.2 Radial Basis Function Neural Network
2.5.3 ANN Procedure
3. Results and Discussion
3.1 Production and Well Data of the Study Area
3.1.1 Base Case Flow Rates
3.2 Artificial Lift Screening
3.3 PROSPER Simulation Results
3.3.1 IPR Curves
3.3.2 Vertical Lift Performance Correlations
3.3.3 Desired Flow Rates
3.4 Gas Lift Results
3.4.1 Optimum Production Rates
3.5 ANN Results
3.5.1 ANN Architecture
3.5.2 Model Visualization
3.6 Discussion
4. Conclusions
Acknowledgment
References
12. Modelling Two-phase Flow Parameters Utilizing Machine-learning Methodology
1. Introduction
2. Data Sources and Existing Correlations
3. Methodology
“Data is the new oil”—this buzz phrase has become a ubiquitous adage
in today’s digital age. Data science and machine learning have become
indispensable tools for extracting insights and value from vast amounts of
data. The field of subsurface engineering is no exception to this trend, and
this book brings together a collection of chapters from experts in this field,
exploring various data science and machine learning applications in subsurface
engineering.
In this book, we delve into the potential of these technologies for subsurface
engineering, including topics such as data-driven reservoir characterisation,
machine learning in drilling operations, and computer vision application in
seismic image processing and interpretation. The book is divided into several
chapters, each offering a unique perspective and case studies on the different
applications of data science and machine learning in subsurface engineering.
Chapter 2 focuses on predicting drilling fluid loss in the Marun oil
field using machine learning models. The research begins by assessing
the importance of input features through model agnostic metrics. Once a
suitable dataset is established, several machine learning models will be used,
with the best-performing one optimised using the Bayesian Optimisation
algorithm. This research aims to generate new insights into the generalisation
of individual features to explain the target. The study’s main contributions
are twofold. Firstly, it provides a global explanation of variables in mud-loss
prediction using explainable artificial intelligence (AI). Secondly, it develops
a machine learning workflow that utilises explainable AI to enhance drilling
fluid lost circulation prediction. Overall, this research offers valuable insights
into applying machine learning to subsurface engineering, and the potential
benefits of explainable AI in improving drilling fluid loss prediction.
Chapter 3 describes a study aimed at developing an AI-based model that
can predict porosity and producible pore volume fraction using wireline logs
and NMR-measured total porosity and free fluid index. Wireline logs are
volumes into each reservoir when ICV settings formula is discordant rather
than recalibrating empirical correlation, which leads to interruptions in field
production. The study employs eight machine learning models to estimate each
reservoir unit’s volume of production fluids based on operational parameters,
providing a simple and accurate approach to reservoir management plans. The
method can also provide real-time estimates of produced volumes, allowing
daily production operational changes to meet critical targets and develop
domestic oil resources responsibly. This initiative has tremendous research
possibilities, and the proposed approach could be applied to oil production
wells with various input parameters upon acceptable results.
Chapter 7 of this book explores using Carbon Dioxide Low Salinity
Water Alternating Gas (CO2 LSWAG) flooding as an Enhanced Oil Recovery
(EOR) technique in carbonate reservoirs. The chapter highlights the benefits
of this technique, including a high recovery factor and improved displacement
efficiencies. The authors use a compositional simulator with geochemical
models to develop proxy models for predicting the oil recovery factor.
Multivariate Adaptive Regression Splines (MARS) and Group Method of Data
Handling (GMDH) machine learning methods are used in the study to develop
these proxy models. The authors advocate for using machine learning proxy
models as prediction tools to enhance the efficient full-field implementation
of this technique and reduce the computational time associated with numerical
simulations in carbonate reservoirs.
Chapter 8 showcases the application of transfer learning to a convolutional
neural network pre-trained with synthetic labels, generating salt probability
models for use in seismic imaging and velocity modelling phases. The use of
deep learning techniques in object and edge detection has succeeded in various
fields, making them a promising approach for seismic salt mapping. The
study’s main contributions are improved accuracy in salt segmentation, time
and cost savings, enhanced data analysis, and potential for future research.
By using transfer learning to automate the process of salt segmentation in
seismic images, we can save time and reduce costs while improving accuracy,
ultimately improving our understanding of the subsurface geology and
advancing the energy transition journey. The findings of this study will provide
new insights into the use of transfer learning for salt segmentation in seismic
images, highlighting the potential for developing new energy resources, and
mitigating climate change.
Chapter 9 explores advanced AI techniques to improve seismic image
resolution and quality, which is crucial for accurate subsurface exploration
and analysis. This study focuses on using a pre-trained deep convolutional
neural network (DCNN) to enhance the signal-to-noise ratio (SNR) and
vertical resolution in seismic images of the Opunake field. The study compares
topics covered in this book, readers will gain a deeper understanding of the
transformative power of data science and machine learning in subsurface
engineering and the challenges and opportunities of integrating these
technologies into subsurface workflows.
In conclusion, this book offers a comprehensive overview of the different
applications of data science and machine learning in subsurface engineering,
demonstrating the potential of these technologies for unlocking new
insights and value from subsurface data. The book’s central themes include
the importance of interdisciplinary collaboration, the need for robust and
transparent data management practices, and the challenges of implementing
data science and machine learning techniques in real-world environments.
In addition, the book highlights the importance of transparency and ethical
considerations in using data science and machine learning in subsurface
engineering, emphasising the need for responsible and ethical practices in this
rapidly evolving field.
By bringing together experts from various fields, this book offers a
unique and interdisciplinary perspective on applying data science and
machine learning in subsurface engineering, highlighting the potential
of these technologies for transforming the way we explore, produce, and
manage subsurface resources. The book’s contributions to the field include
case studies, best practices, and critical analyses of the opportunities and
challenges of implementing data science and machine learning techniques in
subsurface engineering. Overall, this book serves as a valuable resource for
researchers, practitioners, and students in the field of subsurface engineering,
offering insights and perspectives that are critical for staying up-to-date with
the latest developments in this rapidly evolving field.
Chapter 2
Enhancing Drilling Fluid
Lost-circulation Prediction
Using Model Agnostic and Supervised
Machine Learning
Daniel Asante Otchere,1,2,*
Mohammed Ayoub Abdalla Mohammed,3 Hamoud Al-Hadrami,4
and Thomas Boahen Boakye5
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
3 Chemical and Petroleum Engineering, UAE University, Sheik Khalifa Street at Tawam R/A, Maqam
District, Al Ain, United Arab Emirates.
4 Department of Petroleum and Chemical Engineering, Sultan Qaboos University, Muscat, Oman.
5 Subsurface Department, Tullow Ghana Limited, North Dzorwuku, Off George Bush Highway,
PMB, Accra, Ghana.
* Corresponding author: [email protected]
1. Introduction
Drilling is a complex and high-risk element of oil and gas field development
and production (Zarrouk and McLean, 2019). Drilling fluid serves several
purposes in rotary drilling, where the drilling mud is cycled through the
drill string to remove cuttings and improve drill bit performance (Alkinani
et al., 2020). The size, shape, and density of the cuttings, as well as the
annular velocity of the drilling fluid, all affect the fluid’s ability to remove
accumulated cuttings. One of the most prevalent difficulties in the drilling
industry is lost returns of drilling fluid. This problem occurs whenever the
volume of mud injected during drilling partially or totally filters into the
formation rather than flowing back up to the surface (Toreifi et al., 2014).
Based on the volume of mud lost during drilling, Pilehvari and Nyshadham (2002)
proposed three classifications according to severity: seepage (0.5 to
10 bbls/hr), partial (10 to 500 bbls/hr), and complete (above 500 bbls/hr),
in which all the fluid is lost. The losses can be attributed to the formation
type and the structural and petrophysical properties of the formation
(Moazzeni et al., 2012). The loss of these drilling fluids leads to increased
drilling costs, differential sticking, blowouts, damage to reservoir intervals,
and, most seriously, loss of the well (Alkinani et al., 2019a; Feng and Gray,
2017; Sabah et al., 2019; Toreifi et al., 2014).
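The severity classes attributed to Pilehvari and Nyshadham (2002) can be expressed as a small lookup. The sketch below is illustrative only: the function name `classify_mud_loss` and the "none" label for rates below 0.5 bbls/hr are assumptions, and because the quoted ranges share their end points, the choice to assign exactly 10 bbls/hr to seepage and exactly 500 bbls/hr to partial is also an assumption.

```python
def classify_mud_loss(rate_bbl_per_hr: float) -> str:
    """Map a mud-loss rate (bbls/hr) to a severity class.

    Thresholds follow the ranges quoted in the text; the handling of the
    shared boundaries (10 and 500) and the "none" label are assumptions.
    """
    if rate_bbl_per_hr < 0:
        raise ValueError("loss rate cannot be negative")
    if rate_bbl_per_hr < 0.5:
        return "none"      # below the seepage threshold (assumed label)
    if rate_bbl_per_hr <= 10:
        return "seepage"   # 0.5 to 10 bbls/hr
    if rate_bbl_per_hr <= 500:
        return "partial"   # 10 to 500 bbls/hr
    return "complete"      # above 500 bbls/hr


if __name__ == "__main__":
    for rate in (0.1, 5, 100, 750):
        print(rate, classify_mud_loss(rate))
```

A rule of this kind is only a labelling convenience; the regression models discussed in this chapter predict the loss rate itself.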
Given these severe consequences, minimising the loss of drilling fluids is of
the utmost importance, and a range of preventive and remedial treatments has
been developed for this purpose. Before deciding on the appropriate lost
circulation treatment, the severity of the mud loss should be determined. It
should, however, be noted that finding a single solution to reduce lost
circulation is challenging. Consequently, several lost circulation remedial
alternatives are available, including but not limited to fibrous materials,
high-viscosity pills, granular materials, brittle materials, and cement
sludges. Each treatment method or material is chosen depending on how much of
the mud is lost, the time and expense involved, the drilling phase, the fluid
type, and the thief zone (Alkinani et al., 2020). The main purpose of these
treatments is to bridge existing fractures and prevent the development of new
fractures (Alkinani et al., 2019b).
Several factors lead to this drilling problem, the most significant being
formation-related, drilling-operation-related, and time-dependent factors.
Since these factors are high-dimensional and have complex relationships,
traditional mathematical techniques have proven ineffective in predicting
drilling fluid loss (Sabah et al., 2019; Toreifi et al., 2014). As a result, a
significant amount of effort and money is spent harnessing technical
breakthroughs in data acquisition to limit its likelihood. One method for
accomplishing this is accurately forecasting lost circulation using drilling
parameters. Artificial Intelligence (AI) techniques are now widely employed in
the oil and gas sector, resulting in substantial advancements in their use to
increase accuracy (Otchere et al., 2021a). Several studies have also employed
machine learning models to forecast lost circulation, demonstrating their
efficacy in identifying patterns among complicated interactions between
drilling parameters (Alkinani et al., 2020; Sabah et al., 2019; Toreifi et al.,
2014).
Sabah et al. (2021) introduced a hybrid machine learning approach that
enhances the prediction of lost circulation using the Multilayer Perceptron
Neural Network (MLP-NN) and the Least Squares Support Vector Machine
(LSSVM) models. To enhance the accuracy of the predictions, feature
selection was employed, which helped to eliminate input parameters that
were not beneficial to the machine learning model. The researchers applied
a Savitzky–Golay (SG) filter to reduce noise recorded in the data. From their
results, the wrapper method, the most efficient feature selection technique, was
used in conjunction with several evolutionary algorithms to improve model
performances. Their study concluded that the LSSVM-Cuckoo Optimisation
Algorithm recorded the highest prediction accuracy of 0.94 compared to all
the hybrid models developed. Alkinani et al. (2020) also employed an artificial
neural network to estimate lost circulation in induced and natural fractures.
Their work entailed separating the datasets into three categories: 60% for
training, 20% for testing, and 20% for validating. Databases, consisting of
nine input features for each fracture type, were created for both natural and
induced fractures to give room for data normalisation. The application of
feature selection justified the selection of the Levenberg–Marquardt function
to train the dataset as it resulted in the highest accuracy. Their results achieved
accuracies of 0.96 and 0.93 for natural and induced fracture networks,
respectively.
Abbas et al. (2019b) assessed the potential of machine learning in mining
and analysing drilling data to predict lost circulation while drilling. For the
selection of input parameters, the work made use of the feature ranking method
to reduce the dimensionality of the dataset. Eighteen of the 23 studied
parameters were selected as input variables to predict lost circulation.
The feature ranking method enhanced the efficiency of the dataset, thereby
improving computational processing time. The results showed that the
Gaussian kernel support vector machine (SVM) recorded the highest accuracy
of 0.92. Their work led to more research, such as using two different machine
learning models that consider geological and operational parameters in
making lost-circulation predictions (Abbas et al., 2019a). Toreifi et al. (2014)
also proposed a new technique using the modular neural network (MNN) and
particle swarm optimisation (PSO) capable of predicting drilling fluid loss.
Data normalisation was used to scale the data to a range between 0 and 1 to
develop a more efficient model, with 60% of the data used for training,
20% for testing, and 20% for validation. Their results concluded that
PSO achieved a more accurate output by optimising the parameter variation
process. Other influential research and review work on integrating
AI with drilling activities includes Aalizad and Rashidinejad (2012), Abbas
et al. (2019c), Ahmadi (2016), Al-Baiyat and Heinze (2012), Barbosa et al.
(2019), and Brankovic et al. (2021).
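The two preprocessing steps that recur in the studies reviewed above, scaling to the 0 to 1 range and a 60/20/20 split, can be sketched in plain Python. This is a minimal illustration, not the cited authors' code: the helper names `min_max_scale` and `split_60_20_20`, the random seed, and the sample values are all assumptions.

```python
import random


def min_max_scale(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_60_20_20(rows, seed=0):
    """Shuffle rows and split them 60% train, 20% test, 20% validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_train = (6 * len(rows)) // 10
    n_test = (2 * len(rows)) // 10
    train = [rows[i] for i in idx[:n_train]]
    test = [rows[i] for i in idx[n_train:n_train + n_test]]
    val = [rows[i] for i in idx[n_train + n_test:]]
    return train, test, val


if __name__ == "__main__":
    print(min_max_scale([2, 4, 6, 10]))
    train, test, val = split_60_20_20(list(range(10)))
    print(len(train), len(test), len(val))
```

In practice the scaling limits are usually computed from the training portion only and then reused for the test and validation portions, so that no information leaks from held-out data.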
Considering different machine learning models and techniques used
in predicting drilling fluid loss, the main difference was the type of model
being used and the input variables. Supervised machine learning models are
only as efficient as the information they are trained with. Therefore, when
irrelevant data is included as input, the model’s performance suffers (Otchere
a suitable model that can handle all these problems has become necessary since
drilling data are commonly highly dimensional, either small or large, with
different data distributions. Based on these assertions, this study establishes
relevant features that can lead to lower prediction errors. Improving the
accuracy in estimating mud loss has a massive impact on drilling operations
and the integrity of a well. Hence, the application of machine learning models,
although dependent on data type, can be applied to drilling operations with
similar input features. The most common issue with machine learning models
and their varying performance is centred on data. Explaining the input
variables and determining causation is the central aim of this study,
as new approaches are created to solve this issue. The slightest
gain in accuracy is critical in improving decision-making in the petroleum
business, making this field of research critical. Table 2.1 presents a summary
analysis of the models reviewed in this research.
Table 2.1. Summary of algorithms used in this study and corresponding authors (Otchere
et al., 2022b).
4. Methodology
4.1 Data Analysis and Visualisation
Data analysis and visualisation were used to aid in understanding how the
input features relate to the output. The degree of correlation between each
pair of input variables and the output was quantified using a Spearman’s rho
correlation matrix. From the heatmap, the pump pressure was the only feature
to show a moderate correlation with the target. The pair plot also depicted the
nonlinear relationship between some of the inputs and the target. Based on
the nonlinear distribution depicted in Fig. 2.1, it is evident that none of the
features showed a linear correlation with the target. Hence, nonlinear models
were deemed appropriate for this research.
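Spearman's rho is the Pearson correlation computed on ranks, which is why it captures monotonic but nonlinear relationships of the kind described above. The following is a minimal sketch in plain Python; the function names and the sample pressure and loss values are invented for illustration and are not taken from the Marun field data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    pressure = [1, 2, 3, 4, 5]   # hypothetical feature values
    loss = [100, 50, 20, 5, 1]   # hypothetical target, falling nonlinearly
    print(spearman_rho(pressure, loss))
```

Applying `spearman_rho` to every pair of columns gives the matrix that a heatmap would display.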
Table 2.2. Summary statistics of some input and target variables.
Table 2.3. Assigned numbers for the Marun Field Formations (Sabah et al., 2021).
Fig. 2.1. Pair plot distribution of all variables colour-coded against the formation type.
4.3 Explainable AI
In most cases, the application of feature selection techniques is unsupervised;
hence, there is no right or wrong answer, and each technique may behave
differently on each dataset. For this research, model agnostic methods will be
used to analyse the importance of input features so that the explanations are
separate from the machine learning model. The desirable characteristics sought
are model representation and explanation flexibility, which are not limited to
a specific model type and make sense in the context of the model being
explained. For this purpose, Permutation Feature Importance (PFI) and Shapley
values will be used to explain the results generated by the models. The inputs
required are the model, the feature vectors, the target, and the error metric.
After applying these techniques, features that do not explain the target will
be removed from the input feature vector.
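Permutation Feature Importance takes exactly the inputs listed above: a fitted model, the feature vectors, the target, and an error metric. The sketch below is a minimal plain-Python illustration under stated assumptions: the model is any callable `predict(row)`, MAE is used as the error metric, and the names, seed, repeat count, and toy data are hypothetical rather than taken from this study.

```python
import random


def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MAE when each feature column is shuffled in turn.

    predict: callable taking one feature row (a list) and returning a number.
    X: list of feature rows (lists); y: list of target values.
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            err = mean_absolute_error(y, [predict(row) for row in X_perm])
            increases.append(err - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


if __name__ == "__main__":
    # Toy model that depends only on the first feature.
    X = [[i, (i * 7) % 5] for i in range(20)]
    y = [3 * row[0] for row in X]
    print(permutation_importance(lambda row: 3 * row[0], X, y))
```

A feature whose shuffling leaves the error unchanged scores zero, which is the basis for removing it from the input feature vector.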
Fig. 2.3. AIC results comparing all the models on the test data.
Table 2.4. Train and test correlation coefficient score of all models used in this study.
Fig. 2.4. Cross-plot of predicted vs actual mud loss for all models based on test data.
reliability. The Root Mean Squared Error (RMSE) results for all the models
indicate that the Extra Trees model is the most reliable when compared
against the actual mud loss values. The Mean Absolute Error (MAE) also suggests
that the Extra Trees model is the most accurate model for predicting mud loss. In
selecting the best model, the ranking criteria used in this study are as follows:
MAE, RMSE, AIC, and R².
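The four ranking criteria can be computed from the same residuals. The sketch below is illustrative: the chapter excerpt does not state which form of the Akaike Information Criterion was used, so the least-squares form AIC = n ln(RSS/n) + 2k is an assumption, as are the function name, the parameter count `n_params`, and the sample values.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return MAE, RMSE, AIC, and R2 for a set of predictions.

    AIC uses the least-squares form n*ln(RSS/n) + 2k (an assumption here);
    it is undefined when the residual sum of squares is zero.
    """
    n = len(y_true)
    residuals = [a - b for a, b in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)
    mean_y = sum(y_true) / n
    tss = sum((a - mean_y) ** 2 for a in y_true)
    return {
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(rss / n),
        "AIC": n * math.log(rss / n) + 2 * n_params,
        "R2": 1 - rss / tss,
    }


if __name__ == "__main__":
    # Hypothetical actual and predicted mud-loss values (bbls/hr).
    print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 5], n_params=2))
```

Lower MAE, RMSE, and AIC indicate a better model, while a higher R² does, so any combined ranking has to account for the direction of each metric.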
Fig. 2.5. Comparison of test data prediction errors of all models based on RMSE and MAE.
Fig. 2.6. Permutation importance plot indicating the importance of all the input variables.
Fig. 2.7. XGBoost Feature Importance of the input variables to the target.
15–10 features, and the result is illustrated in Table 2.5. The results indicate
that the optimal number of features to predict mud loss is thirteen.
Fig. 2.9. SHAP feature importance measured as the mean absolute Shapley values.
informative but provides no other information outside the significance. The bee
swarm technique, a more informative plot, is used for analysis.
Figures 2.10 and 2.11 visualise the Shapley values as absolute values in a
bee swarm summary plot for the train and test data, respectively. The y-axis
is determined by the feature, and the x-axis by the Shapley value. The feature
importance and the feature effects are combined into global explanations,
whereby a matrix of Shapley values is obtained with one row per instance. In
the summary plot, the first indications of the positive or negative relationship
between the value of a feature and its impact on the target are identified. The
data points, made up of all the training data points, overlap in a scattered
manner along the y-axis. This type of visualisation depicts the distribution of
Shapley values per feature. The features are arranged in descending order of
significance. In the plots, low feature values are represented by blue, while
high feature values are denoted by red. In analysing the influence of pump
pressure, it is observed that low values predict high mud-loss values, whereas
high values predict low mud-loss values. The wide spread of the pump pressure
data points also indicates that this feature contributes to a global explanation
of the entire model behaviour. This analysis confirms that this correlation on
its own cannot be termed causation. This conclusion reinforces the importance
of, and need for, model-agnostic techniques to understand the influence the other
features have on mud-loss prediction. From the bee swarm plot, it can be observed
that about 12 features can globally explain how the predictions were made.
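To make the idea of a Shapley value and of the mean absolute Shapley importance more concrete, the following is a minimal sketch rather than the procedure used in this chapter. It computes exact Shapley values for a hypothetical three-feature toy model by averaging marginal contributions over all feature orderings. The function `toy_mud_loss`, the feature names other than pump pressure, the baseline, and all numbers are assumptions made purely for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features that are absent from a coalition keep their baseline value.
    """
    n = len(x)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for j in order:
            current[j] = x[j]                # feature j joins the coalition
            value = model(current)
            totals[j] += value - previous    # marginal contribution of feature j
            previous = value
    return [t / len(orders) for t in totals]


def toy_mud_loss(features):
    """Hypothetical model: mud loss falls as pump pressure rises."""
    pump_pressure, mud_viscosity, flow_rate = features
    return 50.0 - 2.0 * pump_pressure + 0.5 * mud_viscosity * flow_rate


# Hypothetical reference point and instances to explain.
baseline = [5.0, 2.0, 1.0]
instances = [[10.0, 4.0, 2.0], [3.0, 1.0, 1.5], [7.0, 3.0, 0.5]]

per_instance = [shapley_values(toy_mud_loss, x, baseline) for x in instances]

# Global importance: mean absolute Shapley value per feature.
importance = [
    sum(abs(phi[j]) for phi in per_instance) / len(per_instance)
    for j in range(len(baseline))
]

print("Shapley values, first instance:", per_instance[0])
print("Mean |Shapley| per feature:", importance)
```

In this toy setting the Shapley values of each instance sum to the difference between its prediction and the baseline prediction, and the sign of the pump pressure contribution is negative when the pressure is above the baseline, which mirrors the kind of relationship a bee swarm plot displays.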
The results of the bee swarm plot confirmed that the same features have high
importance in both the train and test results and exhibit their importance in the
Enhancing Drilling Fluid Lost-circulation Prediction 25
Fig. 2.10. A bee swarm summary plot of SHAP value impact on model target for the train data.
Fig. 2.11. A bee swarm summary plot of SHAP value impact on model target for the test data.
global explanation of the target. From the demonstrated results, the following
analysis was derived:
1. Feature importance: The features are ranked in descending order, and
from the train and test data plot, the pump pressure is ranked first. The
meterage feature has close to zero importance because it does not have
any causal effect in predicting mud loss.
2. Impact: The horizontal location of the data points shows that the pump
pressure has a negative correlation and a high prediction effect in general.
Fig. 2.12. Comparing the estimation errors of top-performing models and Sabah et al. (2021) models
based on RMSE and MAE.
explain the target and not downgrade the results of the other researchers. From
the error analysis performed, it was observed that the features selected based
on the Shapley values significantly reduced the model's error. The initial
Extra Trees model recorded an MAE of 12.6 bbls/hr, while the MAE from the
Shapley-selected features was 0.3 bbls/hr. This result represents about a 97%
reduction in MAE.
Similarly, the RMSE for the Extra Trees model using all 16 features was
24.0 bbls/hr, while the RMSE from the Shapley-selected features was
1.4 bbls/hr. Again, this represents about a 94% reduction in this error metric.
Using hyperparameter tuning, BO–ET further reduced the Extra Trees model's
mean absolute error (MAE) and root mean squared error (RMSE) to 0.2 and
1.2 bbls/hr, respectively. The superiority of the Extra Trees model is mainly
attributed to the bias-variance concept used to build this model, which makes it
resilient to outliers. Based on all the evaluation criteria, it was determined that
the Shapley-selected features are highly relevant, offer a global generalisation
of the target, and improve model efficiency.
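The percentage reductions quoted above follow from the reported error values. The short check below is only an illustrative calculation, with the helper name `percent_reduction` chosen here for convenience; it uses the MAE and RMSE figures stated in the text.

```python
def percent_reduction(before, after):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (before - after) / before


# Errors in bbls/hr: all 16 features versus the Shapley-selected features.
mae_reduction = percent_reduction(12.6, 0.3)
rmse_reduction = percent_reduction(24.0, 1.4)

print(f"MAE reduction:  {mae_reduction:.1f}%")
print(f"RMSE reduction: {rmse_reduction:.1f}%")
```

The computed values are in line with the approximate reductions quoted for MAE and RMSE.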
Fig. 2.13. Kernel density estimation for BO–ET model mud-loss prediction demonstrating the
closeness of projected values to actual values.
6. Conclusions
This study demonstrated a methodology for identifying significant variables
for drilling fluid lost-circulation estimations. The effectiveness of this approach
has been assessed to surpass previously reported techniques based on the same
dataset. The evaluation criteria discussed indicated how likely the models are
to fit held-out data. The accuracy and reliability
of the models were also assessed based on the R2, MAE, and RMSE on the
out-of-sample data. The PFI and Shapley values show how the models predict
the target and have made the input features explainable, as the
following traits are displayed:
1. Fairness: The results have ensured that the predictions are neutral and
are not biased against any input variable, either implicitly or overtly.
An easy-to-interpret model can explain why a particular prediction was
made, making it more straightforward to determine which input variable
had a more significant influence.
2. Reliability: The results help capture the differences in input variables that
result in significant changes in the target and vice versa.
3. Causality: Causal relationships were identified, and the pump pressure
feature for training and test data is ranked high amongst all input variables.
4. Trust: The results have yielded some level of trust in input features used
for the final prediction. This analysis is based on how the employed
techniques, PFI and Shapley values, explain their judgements on the input
features to the target.
The application of Explainable AI enhanced model prediction accuracy
by 97% and 94% in terms of MAE and RMSE, respectively. The comparison
with other published results based on the same data resulted in a 98% and
95% reduction in MAE and RMSE, respectively. This analysis is proof that
correlation does not mean causation. Without the PFI and Shapley values, the
XGBoost feature importance alone would not have identified the features that cause
over- or underfitting, nor would it have provided a global interpretation of the features.
Acknowledgement
The authors express their sincere appreciation to Universiti Teknologi
Petronas, the Centre of Research in Enhanced Oil recovery, and the Centre for
Subsurface Seismic Imaging for supporting this work.
Data Availability
The drilling data was obtained from Sabah et al. (2021).
References
Aalizad, S.A. and Rashidinejad, F. 2012. Prediction of penetration rate of rotary-percussive
drilling using artificial neural networks: A case study/Prognozowanie postępu wiercenia
przy użyciu wiertła udarowo-obrotowego przy wykorzystaniu sztucznych sieci
neuronowych – studium przypadku. Archives of Mining Sciences 57: 715–728.
https://doi.org/10.2478/v10267-012-0046-x.
Abbas, A.K., Al-haideri, N.A. and Bashikh, A.A. 2019a. Implementing artificial neural networks
and support vector machines to predict lost circulation. Egyptian Journal of Petroleum
28: 339–347. https://doi.org/10.1016/J.EJPE.2019.06.006.
Abbas, A.K., Bashikh, A.A., Abbas, H. and Mohammed, H.Q. 2019b. Intelligent decisions
to stop or mitigate lost circulation based on machine learning. Energy 183: 1104–1113.
https://doi.org/10.1016/J.ENERGY.2019.07.020.
Abbas, A.K., Flori, R., Almubarak, H., Dawood, J., Abbas, H. and Alsaedi, A. 2019c. Intelligent
prediction of stuck pipe remediation using machine learning algorithms. In: SPE Annual
Technical Conference and Exhibition. SPE, Calgary. https://doi.org/10.2118/196229-MS.
Ahmadi, M.A. 2016. Toward reliable model for prediction Drilling Fluid Density at wellbore
conditions: A LSSVM model. Neurocomputing 211: 143–149.
https://doi.org/10.1016/j.neucom.2016.01.106.
Al-Baiyat, I. and Heinze, L. 2012. Implementing artificial neural networks and support vector
machines in stuck pipe prediction. In: Kuwait International Petroleum Conference and
Exhibition. SPE. https://doi.org/10.2118/163370-MS.
Alkinani, H.H., Al-Hameedi, A.T.T., Dunn-Norman, S., Flori, R.E., Alsaba, M.T., Amer, A.S.
and Hilgedick, S.A. 2019a. Using data mining to stop or mitigate lost circulation. J. Pet.
Sci. Eng. 173: 1097–1108. https://doi.org/10.1016/J.PETROL.2018.10.078.
Alkinani, H.H., Al-Hameedi, A.T.T., Dunn-Norman, S., Flori, R.E., Hilgedick, S.A., Al-Maliki,
M.A., Alshawi, Y.Q., Alsaba, M.T. and Amer, A.S. 2019b. Examination of the relationship
between rate of penetration and mud weight based on unconfined compressive strength of
the rock. J. King Saud. Univ. Sci. 31: 966–972. https://doi.org/10.1016/j.jksus.2018.07.020.
Alkinani, H.H., Al-Hameedi, A.T.T. and Dunn-Norman, S. 2020. Artificial neural network
models to predict lost circulation in natural and induced fractures. SN Appl. Sci. 2: 1980.
https://doi.org/10.1007/s42452-020-03827-3.
Barbosa, L.F.F.M., Nascimento, A., Mathias, M.H. and de Carvalho, J.A. 2019. Machine learning
methods applied to drilling rate of penetration prediction and optimization: A review.
J. Pet. Sci. Eng. 183: 106332. https://doi.org/10.1016/j.petrol.2019.106332.
Brankovic, A., Matteucci, M., Restelli, M., Ferrarini, L., Piroddi, L., Spelta, A. and Zausa, F.
2021. Data-driven indicators for the detection and prediction of stuck-pipe events in oil
& gas drilling operations. Upstream Oil and Gas Technology 7: 100043.
https://doi.org/10.1016/j.upstre.2021.100043.
Breiman, L. 2001. Random forests. Mach. Learn. 45: 5–32.
https://doi.org/10.1023/A:1010933404324.
Chen, T. and Guestrin, C. 2016. XGBoost: A scalable tree boosting system. pp. 785–794.
In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining. ACM, New York, NY, USA. https://doi.org/10.1145/2939672.2939785.
Feng, Y. and Gray, K.E. 2017. Review of fundamental studies on lost circulation and
wellbore strengthening. J. Pet. Sci. Eng. 152: 511–522.
https://doi.org/10.1016/J.PETROL.2017.01.052.
Geurts, P., Ernst, D. and Wehenkel, L. 2006. Extremely randomized trees. Mach. Learn.
63: 3–42. https://doi.org/10.1007/s10994-006-6226-1.
Gordon, A.D., Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. 1984. Classification
and regression trees. Biometrics 40: 874. https://doi.org/10.2307/2530946.
Hoerl, A.E. and Kennard, R.W. 1970. Ridge regression: biased estimation for nonorthogonal
problems. Technometrics 12: 55–67. https://doi.org/10.1080/00401706.1970.10488634.
Moazzeni, A., Nabaei, M. and Jegarluei, S.G. 2012. Decision making for reduction of
nonproductive time through an integrated lost circulation prediction. Petroleum Science
and Technology 30(20): 2097–2107. https://doi.org/10.1080/10916466.2010.495961.
Otchere, D.A., Arbi Ganat, T.O., Gholami, R. and Ridha, S. 2021a. Application of supervised
machine learning paradigms in the prediction of petroleum reservoir properties:
Comparative analysis of ANN and SVM models. J. Pet. Sci. Eng. 200: 108182.
https://doi.org/10.1016/j.petrol.2020.108182.
Otchere, D.A., Ganat, T.O.A., Gholami, R. and Lawal, M. 2021b. A novel custom ensemble
learning model for an improved reservoir permeability and water saturation prediction.
J. Nat. Gas Sci. Eng. 91: 103962. https://doi.org/10.1016/j.jngse.2021.103962.
Otchere, D.A., Ganat, T.O.A., Ojero, J.O., Taki, M.Y. and Tackie-Otoo, B.N. 2021c. Application
of gradient boosting regression model for the evaluation of feature selection techniques
in improving reservoir characterisation predictions. J. Pet. Sci. Eng. 109244.
https://doi.org/10.1016/J.PETROL.2021.109244.
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942.
https://doi.org/10.3390/app12062942.
Otchere, D.A., Aboagye, M., Mohammed, M.A.A. and Boakye, T.B. 2022b. Enhancing drilling
fluid lost-circulation prediction using model agnostic and supervised machine learning.
SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4085366.
Otchere, D.A., Ganat, T.O.A., Nta, V., Brantson, E.T. and Sharma, T. 2022c. Data analytics
and Bayesian Optimised Extreme Gradient Boosting approach to estimate cut-offs from
wireline logs for net reservoir and pay classification. Appl. Soft Comput. 120: 108680.
https://doi.org/10.1016/j.asoc.2022.108680.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,
Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,
Brucher, M., Perrot, M. and Duchesnay, É. 2011. Scikit-learn: Machine learning in Python.
Journal of Machine Learning Research 12: 2825–2830.
Pilehvari, A.A. and Nyshadham, V.R. 2002. Effect of material type and size distribution on
performance of loss/seepage control material. Paper presented at the International
Symposium and Exhibition on Formation Damage Control, Lafayette, Louisiana.
pp. 863–875. In: All Days. SPE. https://doi.org/10.2118/73791-MS.
Sabah, M., Talebkeikhah, M., Agin, F., Talebkeikhah, F. and Hasheminasab, E. 2019.
Application of decision tree, artificial neural networks, and adaptive neuro-fuzzy inference
system on predicting lost circulation: A case study from Marun oil field. J. Pet. Sci. Eng.
177: 236–249. https://doi.org/10.1016/J.PETROL.2019.02.045.
Sabah, M., Mehrad, M., Ashrafi, S.B., Wood, D.A. and Fathi, S. 2021. Hybrid machine learning
algorithms to enhance lost-circulation prediction and management in the Marun oil field.
J. Pet. Sci. Eng. 198: 108125. https://doi.org/10.1016/j.petrol.2020.108125.
Shapley, L.S. 1953. A value for n-person games. pp. 307–318. In: Contributions to the
Theory of Games (AM-28), Volume II. Princeton University Press.
https://doi.org/10.1515/9781400881970-018.
Tarafder, S., Badruddin, N., Yahya, N. and Egambaram, A. 2021. EEG-based drowsiness
detection from ocular indices using ensemble classification. pp. 21–24. In: 2021 IEEE 3rd
Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability (ECBIOS).
IEEE. https://doi.org/10.1109/ECBIOS51820.2021.9510848.
Tarafder, S., Badruddin, N., Yahya, N. and Nasution, A.H. 2022. Drowsiness detection using
ocular indices from EEG signal. Sensors 22: 4764. https://doi.org/10.3390/s22134764.
Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. Journal of the
Royal Statistical Society: Series B (Methodological) 58: 267–288.
https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.
Toreifi, H., Rostami, H. and Manshad, A.K. 2014. New method for prediction and solving
the problem of drilling fluid loss using modular neural network and particle swarm
optimization algorithm. J. Pet. Explor. Prod. Technol. 4: 371–379.
https://doi.org/10.1007/s13202-014-0102-5.
Vapnik, V. and Lerner, A. 1963. Pattern recognition using generalised portrait method.
Automation and Remote Control 24: 774–780.
Zarrouk, S.J. and McLean, K. 2019. Geothermal wells. Geothermal Well Test Analysis 39–61.
https://doi.org/10.1016/B978-0-12-814946-1.00003-7.
Chapter 3
Application of a Novel Stacked
Ensemble Model in Predicting Total
Porosity and Free Fluid Index via
Wireline and NMR Logs
Daniel Asante Otchere1,2
1. Introduction
With the growth in energy needs and advances in drilling and hydraulic
fracturing techniques, the innovative extraction of hydrocarbons from
reservoirs has become a focus for exploration and exploitation in recent
years (Li et al., 2022). Therefore, a complete understanding of bound and
moveable fluids is critical for accurately evaluating net pay reservoirs and
optimising completion, production, and enhanced oil recovery activities
(Otchere et al., 2022b). In reservoir terms, pore fluids refer to the fluid
distribution within the pore structure of the reservoir rock. There are two
main types of pore volume fractions: bound fluid, the fluid that is physically
trapped within the pore structure and cannot be easily displaced; and free fluid
volume, the fluid that can easily move within the pore structure. Knowing
the specific distinction between the proportion of bound and free fluid in a
reservoir is essential for predicting the ease of fluid flow and production (Cai
and Hu, 2019). For example, a reservoir with a higher proportion of bound
fluid may require more aggressive enhanced recovery methods to increase
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
Email: [email protected]
production. Understanding pore fluid type and volume fractions is crucial for
effective reservoir management and optimisation (Cai and Hu, 2019; Otchere
et al., 2021c).
Numerous approaches have been formulated to ascertain the fluid
distribution in a reservoir. The most appropriate and accurate method to
measure pore types in reservoirs depends on the specific characteristics of
the reservoir being studied. One standard method is the mercury injection
capillary pressure (MICP) analysis, which measures the amount of mercury
that can be injected into the pore structure at a given pressure (Peng et al.,
2017). The amount of mercury that can be injected into the pore structure is
directly related to the proportion of free fluid in the reservoir. MICP is a widely
used method for measuring fluid distribution, and it is considered one of the
most accurate and reliable methods for measuring free fluid volume (Dugan,
2015; Mitchell et al., 2008). Another method is nuclear magnetic resonance
(NMR), which can provide a detailed description of the pore structure and fluid
distribution within the reservoir (Xu et al., 2015). NMR is particularly useful
for complex pore structures and can provide information on the distribution
of multiple fluid phases. Additionally, NMR is a non-destructive method that
uses magnetic fields and radio waves to measure the magnetic susceptibility
of the pore fluids (Branco and Gil, 2017; Otchere et al., 2022a). The magnetic
susceptibility of the pore fluids is used to distinguish between the bound and
free fluid in the reservoir, making it a valuable tool for measuring pore fluid
types and volume fractions on a large scale (Heaton et al., 2000).
Both methods, however, have their inherent set of challenges. A
fundamental challenge in measuring or determining pore type is the
complexity of the pore structure. Reservoirs can have a variety of pore sizes
and shapes, and different methods may be more effective for different types of
pore structures. One challenge associated with using MICP is that it requires
a core sample from the reservoir, which can be costly and time-consuming.
Additionally, the availability of core samples may be limited, making it
difficult to obtain enough samples to provide a representative analysis of the
reservoir (Wu et al., 2021). Also, MICP is a destructive method, where the
core sample is destroyed after analysis, which could make it difficult to repeat
the analysis for the same sample if needed. Another challenge is the sensitivity
of MICP to the saturation of the fluid, requiring the fluid to be in a state of
equilibrium, which may not be the case in some reservoirs. This challenge
can affect the accuracy of the results obtained from MICP analysis (Wu
et al., 2020). On the other hand, NMR is sensitive to the reservoir’s porosity
and cannot provide quantitative measurements of pore types. Furthermore,
NMR is responsive to the existence of clay minerals, which can impact the
precision of the outcomes (Elsayed et al., 2020). Table 3.1 summarises the
unique advantages and limitations of these two techniques.
Application of a Novel Stacked Ensemble Model in Predicting Total Porosity 35
Table 3.1. Summary of MICP and NMR techniques identifying key advantages, limitations, and
measurement principles.
incurs significant expenses and may not appear practical to execute in every
well during periods of low oil prices (Otchere et al., 2022a). By combining
these two types of data, AI can be used to predict porosity and volume fractions
with a high degree of accuracy. An ideal way to accomplish this is through the
use of machine learning and data analytical techniques, which can be trained
on a large dataset of wireline logs and NMR porosity and volume fraction
measurements. These techniques exhibit resilience to noise and demonstrate a
high efficacy in identifying intricate or nonlinear patterns or features indicative
of specific input data and pore volume fraction (Otchere et al., 2021a). Once
trained, the AI model can then be used to predict total porosity and pore volume
fractions for new reservoirs based on the wireline logs.
Utilising wireline logs to assist in estimating this indispensable property
is comparatively more economical and time-saving than running NMR logs in
all wells, and it provides real-time information. As such, considering current
oil prices, the demand for an inexpensive and reliable field-scale technique
for quantitative porosity and pore volume fraction measurement has been
of interest. Hence, this study endeavours to develop an AI-based model that
can accurately predict the porosity and producible pore volume fraction in
a carbonate gas reservoir using wireline logs as input and NMR measured
total porosity and FFV as the target output. This study aims to provide a more
efficient and accurate method for reservoir characterisation and management.
Fig. 3.1. The excitation and relaxation of polarised hydrogen atomic nuclei in response to an external
magnetic field (Otchere et al., 2022a).
The concept of using NMR to predict pore fluid types is to use the magnetic
properties of certain nuclei to provide information about a reservoir’s fluid
composition and rock properties. This prediction is achieved by measuring
the proton density and relaxation times of the fluids in the reservoir, which
are unique to different fluid types, such as water, oil, and gas. When CMR is
used for porosity partitioning, the total porosity is calculated by measuring the
magnetic susceptibility of the pore fluids and using the relationship between
magnetic susceptibility and fluid volume fraction. This method is based on
the assumption that the magnetic susceptibility of the bound fluid is different
from the free fluid, which is a key concept in CMR. The following equations
are used to calculate total porosity, BFV, and FFV in reservoirs:
ϕT = ϕB + ϕF (3.1)
where the total porosity is represented as ϕT, which is a dimensionless value
between 0 and 1, representing the fraction of the rock sample that is made up
of pores, ϕB is the bound fluid volume fraction, and ϕF is the FFV fraction.
BFV = ϕB × Vp (3.2)
where Vp is the pore volume.
FFV = ϕF × Vp (3.3)
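As a small worked illustration of Equations 3.1 to 3.3, the sketch below evaluates the three relations for one set of inputs. The function name `partition_porosity`, the input fractions, and the pore volume (in arbitrary volume units) are hypothetical values chosen only to show the arithmetic.

```python
def partition_porosity(phi_b, phi_f, pore_volume):
    """Apply Eqs. 3.1-3.3 to bound and free fluid fractions."""
    phi_t = phi_b + phi_f            # Eq. 3.1: total porosity
    if not 0.0 <= phi_t <= 1.0:
        raise ValueError("total porosity must lie between 0 and 1")
    bfv = phi_b * pore_volume        # Eq. 3.2: bound fluid volume
    ffv = phi_f * pore_volume        # Eq. 3.3: free fluid volume
    return phi_t, bfv, ffv


# Hypothetical inputs: fractions are dimensionless, pore volume in arbitrary units.
phi_t, bfv, ffv = partition_porosity(phi_b=0.04, phi_f=0.07, pore_volume=100.0)

print(f"total porosity = {phi_t:.2f}")
print(f"BFV = {bfv:.1f}, FFV = {ffv:.1f}")
```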
3. Methodology
3.1 Data Collection and Description
Conventional wireline logs from a vertical exploration gas well were used in
this study. The CMR tool was logged in the well. The objective of the CMR
logging was to evaluate the formation’s total porosity, bound and free fluid,
and permeability. CMR was logged over a carbonate section capped by shale.
Over the zone of interest, we do not observe borehole effects on the CMR signal
due to a relatively smooth borehole combined with moderate mud salinity
(approximately 58 ppk equivalent NaCl using chart GEN-9 with Rmf = 0.0694
ohmm at 21 deg C). No anomalies were observed on the raw echoes, T2
distribution, total porosity, and bin porosities. Since the well is drilled with
water-based mud, the T2 distribution reflects the pore size distribution. The data
used in this study underwent quality control and reprocessing. The parameters
selected for the CMR reprocessing are listed in Table 3.2.
Figure 3.2 shows the incremental increase in porosity and multiple pore size
distribution with favourable connectivity between pores, identified through the
multi-exponential decay time analysis of NMR T2 distribution in 100% brine
saturated samples. The partitioning of the porosity was done as follows:
1. Small Pore Porosity (T2 min to 3 ms) – this can be considered as
water-filled porosity associated with clay (water resistivity of Rwb in the
Dual Water Saturation model).
2. Capillary Bound Fluid (3 ms to T2 cutoff) – this can be considered part
of effective porosity. In clastic rocks, this would be expected to be water
Parameter                                Value

Processing Parameters
Porosity Algorithm
Starting Echo                            2nd
Number of averaging levels               3
T2 minimum                               0.5 msec
T2 maximum                               6000 msec
Number of Spectral Components            30
T1/T2 Ratio Minimum                      1
T1/T2 Ratio Maximum                      3
Polarization Correction Threshold        0.015 v/v

Producibility Parameters
Free Fluid Cut-off                       0.02 v/v

Porosity Parameters
T2 Cut-off                               100 msec (default for Carbonate)
Taper Cut-off Start                      8 msec
T1/T2 Ratio                              2
Bin Porosity T2 Cutoffs (ms)             1, 3, 10, 33, 100, 300, 1000, 3000
Small Pore Porosity T2 Cutoffs (ms)      1 to 3
Capillary Bound Fluid T2 Cutoffs (ms)    3 to 100
Fig. 3.2. NMR T2 multimodal pore size distribution of different core samples at 100% water saturation
(Otchere et al., 2022a).
Statistic   TCMR   CMFF
mean        0.11   0.07
std         0.03   0.04
min         0.01   0.00
25%         0.09   0.04
50%         0.11   0.07
75%         0.13   0.09
max         0.31   0.16
7. The model was trained using the Adam optimiser with MAE as the loss
function.
8. The model was trained using early stopping to prevent overfitting.
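The early stopping rule mentioned in the last step can be sketched in a few lines. This is a generic illustration under stated assumptions, not the configuration used in the study: the function name `early_stopping`, the patience value, and the validation MAE values are all hypothetical.

```python
def early_stopping(val_losses, patience):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. Epochs are counted from 0.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch     # new best: reset the wait
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch                # patience exhausted
    return best_epoch, len(val_losses) - 1          # ran to the final epoch


# Hypothetical validation MAE per epoch.
val_mae = [0.030, 0.024, 0.021, 0.022, 0.023, 0.025, 0.020]
best_epoch, stop_epoch = early_stopping(val_mae, patience=3)

print("best epoch:", best_epoch, "stopped at epoch:", stop_epoch)
```

With these numbers training halts before the final epoch is reached, and the weights from the best epoch would be the ones to keep; this trades a possible later improvement for protection against overfitting.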
Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
models are types of Recurrent Neural Network (RNN) models that are
particularly well-suited for sequential data such as time series, speech, and
text. A GRU model is similar to an LSTM model but has fewer parameters,
which can be useful when working with limited data or computational
resources. The key feature of GRU and LSTM is the ability to remember
information over a more extended period of time by using gates to control
the flow of information in and out of the cell. This model type can be useful
when working with sequential data where the order of the data is essential.
However, this model type can be computationally intensive and requires a
large amount of data for training. To build the GRU and LSTM model for this
study, the following steps were taken:
1. The input data was transformed into a suitable format for GRU/LSTM,
usually in the form of a 3D array where the first dimension represents the
number of samples, the second dimension represents the time steps, and
the third dimension represents the number of features.
2. The model was composed of one or more GRU/LSTM layers, which are
responsible for processing the input data and extracting features.
3. The output of the GRU/LSTM layers was then passed through one or
more fully connected layers to make the final predictions.
4. The model was then trained using the training dataset and evaluated using
the holdout dataset.
5. The model was fine-tuned by adjusting the number of layers, the number
of neurons in each layer, and the dropout rate to achieve the desired level
of performance.
6. The model was trained with regularisation techniques such as dropout
and early stopping to prevent overfitting.
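The first step above, arranging the data as samples by time steps by features, can be illustrated without any deep learning library. The sketch below assumes overlapping windows of consecutive depth samples, which is one common way to build such an input and is not confirmed as the approach used here; `make_windows`, the two log columns, and the values are hypothetical.

```python
def make_windows(rows, time_steps):
    """Group consecutive rows into overlapping windows of length `time_steps`.

    The result is nested as [sample][time step][feature].
    """
    if time_steps < 1 or time_steps > len(rows):
        raise ValueError("time_steps must be between 1 and the number of rows")
    return [rows[i:i + time_steps] for i in range(len(rows) - time_steps + 1)]


def shape(windows):
    """Return (samples, time steps, features) of a nested window list."""
    return len(windows), len(windows[0]), len(windows[0][0])


# Hypothetical log readings: one row per depth sample, two features per row.
logs = [
    [2.45, 0.12],
    [2.47, 0.11],
    [2.50, 0.10],
    [2.48, 0.13],
    [2.44, 0.14],
]

windows = make_windows(logs, time_steps=3)
print(shape(windows))
```

Each window would then be paired with the target measured at a chosen position in the window, for example its last row, before being passed to the recurrent layers.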
Table 3.5. Model architecture stacking ensemble models as base learners for the Custom Ensemble.
Adapted from Otchere et al. (2022a).
which was used as input to train a meta-model using linear regression. The
meta-model was trained on the combined predictions and the target variables
of the training dataset, and it was then used to make predictions on the test
set. This approach can improve the model’s performance by combining the
strengths of different models and reducing the prediction variance (Tarafder
et al., 2022). Table 3.5 summarises the function used to develop the stacked
ensemble using the Extra Trees, Random Forest, and XGBoost models, while
Fig. 3.3 summarises the entire workflow.
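The stacking step described above, in which base-learner predictions become the inputs of a linear meta-model, can be sketched in simplified form. The example below is an assumption-laden illustration, not the chapter's implementation: it uses two stand-in prediction vectors instead of the Extra Trees, Random Forest, and XGBoost learners, fits a least-squares meta-model without an intercept, and all names and numbers are hypothetical.

```python
def fit_meta_weights(pred_a, pred_b, target):
    """Least-squares weights (w_a, w_b) so that w_a*pred_a + w_b*pred_b fits target."""
    s_aa = sum(a * a for a in pred_a)
    s_ab = sum(a * b for a, b in zip(pred_a, pred_b))
    s_bb = sum(b * b for b in pred_b)
    s_at = sum(a * t for a, t in zip(pred_a, target))
    s_bt = sum(b * t for b, t in zip(pred_b, target))
    det = s_aa * s_bb - s_ab * s_ab       # normal equations solved by Cramer's rule
    return (s_at * s_bb - s_bt * s_ab) / det, (s_bt * s_aa - s_at * s_ab) / det


def combine(weights, pred_a, pred_b):
    """Meta-model prediction: weighted sum of the base predictions."""
    w_a, w_b = weights
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]


def sse(pred, target):
    """Sum of squared errors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# Hypothetical training targets and base-learner predictions.
target = [0.10, 0.12, 0.08, 0.14, 0.11]
pred_a = [0.12, 0.13, 0.10, 0.15, 0.13]   # stand-in learner that over-predicts
pred_b = [0.09, 0.10, 0.07, 0.12, 0.10]   # stand-in learner that under-predicts

weights = fit_meta_weights(pred_a, pred_b, target)
stacked = combine(weights, pred_a, pred_b)

print("meta-model weights:", weights)
print("SSE base A:", sse(pred_a, target))
print("SSE base B:", sse(pred_b, target))
print("SSE stacked:", sse(stacked, target))

# The fitted meta-model is then applied to base predictions for unseen samples.
new_stacked = combine(weights, [0.11, 0.14], [0.08, 0.11])
```

On the data used to fit it, the least-squares combination cannot have a larger squared error than either base prediction alone; whether that advantage carries over to new data has to be checked on a holdout set, as done in this study.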
Many combinations of base models, model parameters, and meta-models can be
explored to find the best hybrid ensemble for a specific task, dataset, and
desired performance.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \qquad (3.5)$$
2. Root Mean Squared Error (RMSE): This assessment criterion is the
standard deviation of models’ prediction errors and indicates how close
predicted data is to actual data. This is written as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (3.6)$$
3. Akaike Information Criterion (AIC): This assessment metric is based on
a frequentist probability approach, which scores a model according to
its maximum likelihood estimate while penalising the number of parameters.
It is used to compare the quality of candidate models, with a lower AIC
indicating a better expected match between the model and new data. AIC is
statistically written as:
$$\mathrm{AIC} = 2K - 2\ln(\hat{L}) \qquad (3.7)$$
where K is the number of estimated model parameters and $\hat{L}$ is the
maximised value of the likelihood function.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \qquad (3.8)$$
Besides R2, low errors are indicative of good model performance.
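The evaluation metrics in Equations 3.5 to 3.8 can be computed directly from their definitions. The following is a minimal sketch using only the standard library; the function names, the sample values, and the log-likelihood passed to `aic` are hypothetical and serve only to show the calculations.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, Eq. 3.5."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, Eq. 3.6."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def aic(num_params, log_likelihood):
    """Akaike Information Criterion, Eq. 3.7."""
    return 2 * num_params - 2 * log_likelihood


def r2(y_true, y_pred):
    """Coefficient of determination, Eq. 3.8."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted porosity values.
y_true = [0.10, 0.12, 0.08, 0.14]
y_pred = [0.11, 0.10, 0.08, 0.15]

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2  :", r2(y_true, y_pred))
print("AIC :", aic(num_params=3, log_likelihood=-10.0))
```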
the hybrid model, which indicates that the Extra Trees model is not able to
generalise well to new data. On the other hand, the hybrid model combines the
predictions from multiple models, which reduces the variance and increases
the robustness of the final prediction. Therefore, it is less likely to overfit the
training data and more likely to generalise well to new data. Based on this
alone, it is vital to note that the hybrid model performed more robustly than
the other models. Additional analysis was conducted to assess the prediction
errors of all the models.
Error measurements were used to assess the consistency and precision
of the models and are presented in Fig. 3.5. The MAE and RMSE results indicate
the hybrid model's superiority over other models, indicating its consistency
and precision in multioutput prediction. The Extra Trees Regression model's
performance was comparable, with an RMSE of 0.0143 and an MAE of
0.0102. Its AIC score is also higher than that of the hybrid ensemble model,
indicating that the hybrid model has a higher probability of fitting new data. The
Random Forest and XGBoost models performed relatively well, with RMSE values
of 0.0152 and 0.0155, respectively. Their MAE was higher than that of the hybrid
ensemble model.
On the other hand, the LSTM and GRU models performed poorly
compared to the other base models, with RMSE of 0.0231 and 0.0202 and
MAE of 0.0176 and 0.0151, respectively. The AIC scores are also higher than
the other models, indicating that they are less likely to generalise well to new
data. A comparison of the two RNN models indicates that the GRU model recorded
lower errors than the LSTM model. The worst performing model, the CNN
model, achieved an MAE of 0.0273 and RMSE of 0.034, indicating that it is
not able to fit the data well. The relatively high AIC score also indicates that
it is not the best model for this task.
The results suggest that the hybrid ensemble model is the best performing
model for this study, as it has the highest train and test scores, the best AIC
score, and the lowest MAE and RMSE. The Extra Trees Regression, Random
Forest, and XGBoost models performed well, but some overfitting issues were
observed. The ensemble models should be further investigated and optimised
to improve their performance. Moreover, it would be interesting to explore
other ensemble methods, such as bagging and boosting, to improve the model
performance. Furthermore, a larger dataset and more computational resources
may be necessary to enhance the performance of the LSTM and GRU models
(Fig. 3.5).
Figures 3.6 and 3.7 illustrate the kernel density estimation (KDE)
of the Hybrid, GRU, and ET predicted and actual total porosity and FFV. The
hybrid ensemble’s predicted outputs are markedly closer
to the actual total porosity and free fluid volume (FFV) data. This result suggests
that the hybrid model is better equipped to capture a wide range of values
Fig. 3.5. Comparison of the models’ estimation errors based on RMSE, MAE, and AIC on the
holdout data.
Fig. 3.6. KDE indicating the closeness of estimated to actual total porosity where (a) is hybrid model
prediction, (b) is GRU model prediction, and (c) is ET model prediction.
Fig. 3.7. KDE showing the closeness of estimated to actual FFV where (a) is hybrid model prediction,
(b) is GRU model prediction, and (c) is ET model prediction.
Fig. 3.8. Joint plot of the hybrid model predicted vs actual values where total porosity is on the left
and FFV is on the right.
5. Conclusions
To obtain a detailed understanding of the reservoir, it is crucial to quantify the
surface wettability at an early stage. This information is essential for enhanced
oil recovery (EOR), particularly chemical EOR, to maximise production.
This study aimed to develop a suitable model for predicting total porosity
and the free fluid volume using wireline logs as input. The study evaluated
several models, including a Hybrid ensemble model, Extra Trees Regression,
Random Forest, XGBoost, LSTM, GRU, and CNN. The performance of
the models was evaluated using several metrics, including train score, test
score, AIC, MAE, and RMSE. The incorporation of these computational
methods provides a more accurate depiction of surface wettability when core
wettability data and NMR logs are unavailable.
In light of the results obtained in this study, the following inferences
can be made:
1. The hybrid ensemble model achieved a high train and test accuracy score,
indicating that it is able to fit the training data well and generalise well
to new data. This is further supported by the recorded lowest MAE, and
RMSE in its predictions.
2. The likelihood of the model fitting new data is also high, as indicated by
the AIC score, which is the lowest among all the models. AIC estimates the
relative quality of a model by balancing goodness of fit against the number
of parameters, so a lower AIC score indicates that the model is more likely
to generalise well to new data.
3. The DL models (LSTM, CNN, and GRU) performed poorly, likely because of
their high complexity relative to the volume of data used.
4. The hybrid ensemble model performed better than the others because it
combined the strengths of multiple base models, Extra Trees Regression,
Random Forest, and XGBoost, and used a stacking ensemble technique.
This technique involves training multiple base models on the input data
and then using their predictions as input to a higher-level meta-model that
makes the final predictions. This allows the hybrid model to capture both
the low-level and high-level features of the data, which leads to improved
performance.
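The stacking procedure described in item 4 can be sketched in a few lines. This is a minimal illustration only, not the study's implementation: two toy base learners (a least-squares line and a one-split stump) stand in for Extra Trees, Random Forest, and XGBoost, the meta-model is a hypothetical one-parameter blend, and all function names are assumptions introduced here.

```python
from statistics import mean

def fit_line(xs, ys):
    """Base learner 1: least-squares straight line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_stump(xs, ys):
    """Base learner 2: split at the median x, predict the mean on each side."""
    thr = sorted(xs)[len(xs) // 2]
    left = [y for x, y in zip(xs, ys) if x < thr] or ys
    right = [y for x, y in zip(xs, ys) if x >= thr] or ys
    lo, hi = mean(left), mean(right)
    return lambda x: lo if x < thr else hi

def out_of_fold(fit, xs, ys, k=4):
    """Predict each sample with a model that never saw it during training."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds

def fit_stack(xs, ys, k=4):
    """Meta-model (assumed form): y ~ w * line + (1 - w) * stump."""
    p1 = out_of_fold(fit_line, xs, ys, k)
    p2 = out_of_fold(fit_stump, xs, ys, k)
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    # Least-squares weight fitted on the out-of-fold base predictions
    w = 0.5 if den == 0 else sum(
        (y - b) * (a - b) for y, a, b in zip(ys, p1, p2)) / den
    line, stump = fit_line(xs, ys), fit_stump(xs, ys)  # refit on all data
    return w, (lambda x: w * line(x) + (1 - w) * stump(x))

xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]      # synthetic, exactly linear target
w, predict = fit_stack(xs, ys)
```

On this synthetic linear target the meta-model learns to rely almost entirely on the line, which shows how the second level weighs the base learners according to their out-of-fold performance.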
The results of this study suggest that the hybrid ensemble model is a
robust fit for this problem and is suitable for predicting total porosity and FFV.
It is able to achieve high accuracy, consistency, and precision, and generalise
well to new data. This makes it a valuable tool for the petroleum industry in
estimating the porosity of reservoirs, determining the pore volume fraction of
a reservoir that can be produced, and evaluating the potential of a prospect.
This can help improve hydrocarbon exploration and production efficiency,
ultimately leading to increased productivity and reduced costs.
Acknowledgement
The authors express their sincere appreciation to Universiti Teknologi Petronas
and the Centre of Research for Subsurface Seismic Imaging for supporting
this work.
References
Amani, M., Al-Jubouri, M.B., Khadr, S. and Sayed, A.M. 2017. A comprehensive review on
the use of NMR technology in formation evaluation.
https://www.semanticscholar.org/paper/A-Comprehensive-Review-on-the-Use-of-NMR-Technology-Amani-Al-Jubouri/fe15a754ea33ae2a3c1f2fa689b81f067b398fef.
Bloch, F. 1946. Nuclear induction. Physical Review 70: 460.
https://doi.org/10.1103/PhysRev.70.460.
Branco, F.R. and Gil, N.A. 2017. NMR study of carbonates wettability. J. Pet. Sci. Eng.
157: 288–294. https://doi.org/10.1016/j.petrol.2017.06.023.
Cai, J. and Hu, X. 2019. Petrophysical Characterization and Fluids Transport in Unconventional
Reservoirs. pp. 1–332. https://doi.org/10.1016/C2018-0-00934-2.
Coates, G.R., Xiao, L. and Prammer, M.G. 1999. NMR Logging Principles and Applications.
Houston: Halliburton Energy Services Publication.
Dugan, B. 2015. Data report: Porosity and pore size characteristics of sediments from Site
C0002 of the Nankai Trough determined by mercury injection.
https://doi.org/10.2204/IODP.PROC.338.202.2015.
Elsayed, M., Glatz, G., El-Husseiny, A., Alqubalee, A., Adebayo, A., Al-Garadi, K. and Mahmoud,
M. 2020. The effect of clay content on the spin–spin NMR relaxation time measured in
porous media. ACS Omega 5: 6545–6555. https://doi.org/10.1021/acsomega.9b04228.
Gu, M., Xie, R. and Jin, G. 2021. A machine-learning based quantitative evaluation of the fluid
components on T2-D spectrum. Mar. Pet. Geol. 134: 105353.
https://doi.org/10.1016/J.MARPETGEO.2021.105353.
Gubelin, G. and Boyd, A. 1997. Total porosity and bound-fluid measurements from an NMR
tool. Journal of Petroleum Technology 49: 718–718. https://doi.org/10.2118/39096-JPT.
Heaton, N., Minh, C.C., Freedman, R. and Flaum, C. 2000. High-resolution bound-fluid, free-fluid
and total porosity with fast NMR logging. Paper presented at the SPWLA 41st Annual
Logging Symposium, Dallas, Texas (June). https://www.slideshare net>armhaggag>high
resolute…
Li, C., Liu, X., You, F., Wang, P., Feng, X. and Hu, Z. 2022. Pore size distribution characterization
by joint interpretation of MICP and NMR: A case study of Chang 7 tight sandstone in the
Ordos Basin. Processes 10: 1941. https://doi.org/10.3390/pr10101941.
Li, H., Misra, S. and He, J. 2020. Neural network modeling of in situ fluid-filled pore size
distributions in subsurface shale reservoirs under data constraints. Neural Comput. Appl.
32: 3873–3885. https://doi.org/10.1007/S00521-019-04124-W.
Masroor, M., Emami Niri, M., Rajabi-Ghozloo, A.H., Sharifinasab, M.H. and Sajjadi, M.
2022. Application of machine and deep learning techniques to estimate NMR-derived
permeability from conventional well logs and artificial 2D feature maps. Journal of
Petroleum Exploration and Production Technology 12(3): 2937–2953.
https://doi.org/10.1007/S13202-022-01492-3.
Mitchell, P., Al-Hosani, I., Mehairi, Y. Al and Kalam, M.Z. 2008. Importance of mercury
injection capillary pressure (MICP) measurements at pseudo reservoir conditions.
Society of Petroleum Engineers – 13th Abu Dhabi International Petroleum Exhibition and
Conference. ADIPEC 2008 2: 952–962. https://doi.org/10.2118/117945-MS.
Otchere, D.A., Arbi Ganat, T.O., Gholami, R. and Ridha, S. 2021a. Application of supervised
machine learning paradigms in the prediction of petroleum reservoir properties:
Comparative analysis of ANN and SVM models. J. Pet. Sci. Eng. 200: 108182.
https://doi.org/10.1016/j.petrol.2020.108182.
Otchere, D.A., Ganat, T.O.A., Ojero, J.O., Taki, M.Y. and Tackie-Otoo, B.N. 2021b. Application
of gradient boosting regression model for the evaluation of feature selection techniques
in improving reservoir characterisation predictions. J. Pet. Sci. Eng. 109244.
https://doi.org/10.1016/J.Petrol.2021.109244.
Otchere, D.A., Hodgetts, D., Ganat, T.A.O., Ullah, N. and Rashid, A. 2021c. Static reservoir
modeling comparing inverse distance weighting to kriging interpolation algorithm in
volumetric estimation. Case study: Gullfaks Field. In: Offshore Technology Conference.
OnePetro, Virtual and Houston, Texas. https://doi.org/10.4043/30919-MS.
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942.
https://doi.org/10.3390/app12062942.
Otchere, D.A., Ganat, T.O.A., Nta, V., Brantson, E.T. and Sharma, T. 2022b. Data analytics
and Bayesian Optimised Extreme Gradient Boosting approach to estimate cut-offs from
wireline logs for net reservoir and pay classification. Appl. Soft. Comput. 120: 108680.
https://doi.org/10.1016/j.asoc.2022.108680.
Peng, S., Zhang, T., Loucks, R.G. and Shultz, J. 2017. Application of mercury injection capillary
pressure to mudrocks: Conformance and compression corrections. Mar. Pet. Geol.
88: 30–40. https://doi.org/10.1016/J.MARPETGEO.2017.08.006.
Purcell, W.R. 1949. Capillary pressures – their measurement using mercury and the calculation
of permeability therefrom. Journal of Petroleum Technology 1: 39–48.
https://doi.org/10.2118/949039-G.
Rezaee, R. 2022. Synthesizing Nuclear Magnetic Resonance (NMR) outputs for clastic rocks
using machine learning methods, examples from North West Shelf and Perth Basin, Western
Australia. Energies (Basel) 15: 518. https://doi.org/10.3390/en15020518.
Tamoto, H., Gioria, R. dos S. and Carneiro, de C. 2023. Prediction of nuclear magnetic resonance
porosity well-logs in a carbonate reservoir using supervised machine learning models.
J. Pet. Sci. Eng. 220: 111169. https://doi.org/10.1016/j.petrol.2022.111169.
Tarafder, S., Badruddin, N., Yahya, N. and Nasution, A.H. 2022. Drowsiness detection using
ocular indices from EEG signal. Sensors 22: 4764. https://doi.org/10.3390/s22134764.
Wu, B., Xie, R., Wang, X., Wang, T. and Yue, W. 2020. Characterization of pore structure of tight
sandstone reservoirs based on fractal analysis of NMR echo data. J. Nat. Gas Sci. Eng.
81: 103483. https://doi.org/10.1016/j.jngse.2020.103483.
Wu, B., Xie, R., Xu, C., Wei, H., Wang, S. and Liu, J. 2021. A new method for predicting capillary
pressure curves based on NMR echo data: Sandstone as an example. J. Pet. Sci. Eng.
202: 108581. https://doi.org/10.1016/j.petrol.2021.108581.
Xu, H., Tang, D., Zhao, J. and Li, S. 2015. A precise measurement method for shale porosity with
low-field nuclear magnetic resonance: A case study of the Carboniferous–Permian strata in
the Linxing area, eastern Ordos Basin, China. Fuel 143: 47–54.
https://doi.org/10.1016/j.fuel.2014.11.034.
Chapter 4
Compressional and Shear Sonic Log Determination Using Data-Driven Machine Learning Techniques
Daniel Asante Otchere,1,2,* Raoof Gholami,3 Vanessa Nta4 and Tarek Omar Arbi Ganat5
1. Introduction
Shear (Vs) and compressional (Vp) sonic wave velocities are significant parameters
in subsurface engineering. They provide valuable information for reservoir
exploration, development, recovery and fluid sequestration (Azizia et al.,
2017; Zoveidavianpoor et al., 2013). These parameters, or their ratio, are useful
in reflection seismology, lithologic identification, and formation evaluation
(Castagna et al., 1985), and in deriving pore fluid and pore pressure information (Duffaut and
Landrø, 2007; Rojas, 2008), reservoir characterisation (Eberli et al., 2003;
Pickett, 1963), geophysics (Phadke et al., 2000; Waluyo et al., 1995), and
geomechanical properties (Asoodeh and Bagheripour, 2014; Rasouli et al.,
2011). There are several ways of measuring these parameters, as summarised
in Table 4.1.
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
3 Department of Energy Resources, University of Stavanger, Kjell Arholms gate 41, Stavanger, 4021,
Norway.
4 Shell Oil Company, 150 N Dairy Ashford Rd, Houston, TX, 77079, United States of America.
5 Department of Petroleum and Chemical Engineering, Sultan Qaboos University, Muscat, Oman.
* Corresponding author: [email protected]
2. Literature Review
Prior to the influx of data, empirical and laboratory techniques were used
to estimate Vs from Vp. One of the pioneering articles, published by
Pickett (1963), established the concept of using the ratio of compressional to shear
wave velocities as a lithology indicator. Figure 4.1, which has been
modified from Pickett’s study, demonstrates the clear distinction in Vp/Vs that
exists between dolomites, limestones, and clean sandstones. According to
Castagna et al. (1985), the Vp/Vs ratio tends to change fairly linearly between
the velocity ratios of the end members with increasing composition in binary
combinations of quartz and carbonates. The results of this study served as the
foundation of several empirical approaches. This study will discuss two of the
most commonly used techniques for estimating Vs.
Fig. 4.1. Vs and Vp sonic velocities from in-situ sonic and field seismic measurements for mudrocks.
Modified after Castagna et al. (1985).
Castagna et al. (1985) concluded that Vs
is directly related to Vp for both water-saturated and
dry siliciclastic sedimentary rocks, as shown in Fig. 4.1. Shales tend to have a
relatively greater Vp/Vs than clean sandstones, given a similar Vp. From their
results, Vp/Vs is almost constant for dry sandstones. Wet sandstones
and mudstones exhibited an inverse monotonic relationship between Vp
and Vp/Vs. The Vs values measured in water-saturated sandstone
agreed with those predicted by Gassmann’s equations. This empirical linear
relationship between Vp and Vs in brine-saturated clastic
silicate rocks is known as the mudrock line and is expressed from in-situ sonic
and field seismic measurements as:
$$V_p = 1.16\,V_s + 1.36 \tag{4.1}$$
This linear equation is partly explained by the location of the clay point
near a line joining the quartz point with the water point. The equation was
formulated assuming that Vp and Vs decrease linearly towards the water point
as the porosity of pure clay increases. Similarly, when pure clay is mixed with
quartz, velocities increase monotonically as the quartz point is approached.
These constraints correspond with those deduced from the empirical relations
of Tosaya and Nur (1982), except for high porosity behaviour.
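As a worked illustration of Eq. 4.1, the mudrock line can be inverted to estimate Vs from a measured Vp. The sketch below assumes velocities in km/s, the units of the original Castagna et al. (1985) relation. The function names and the slowness conversion helper (for logs recorded in µs/ft) are hypothetical additions for illustration.

```python
def vs_from_vp_mudrock(vp_km_s):
    """Invert Eq. 4.1 (Vp = 1.16 Vs + 1.36, velocities in km/s) for Vs."""
    return (vp_km_s - 1.36) / 1.16

def slowness_to_velocity(dt_us_per_ft):
    """Convert sonic slowness in microseconds per foot to velocity in km/s.

    1 ft = 0.3048 m, so v [km/s] = (1e6 / dt) * 0.3048 / 1000 = 304.8 / dt.
    """
    return 304.8 / dt_us_per_ft

vp = slowness_to_velocity(100.0)      # a 100 us/ft compressional reading
vs = vs_from_vp_mudrock(vp)           # mudrock-line estimate of Vs in km/s
```

Because the relation is linear, the estimate is only as good as the assumption that the rock is a brine-saturated clastic silicate. This limitation motivates the data-driven alternatives discussed in this chapter.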
Greenberg and Castagna (1992) proposed a correlation, the
Greenberg-Castagna formulation, for pure unit completely brine-saturated
Author | Model Used | Lithology | Input Variables | Output Variable(s) | Study Area
Bagheripour et al. (2015) | Support Vector Regression | Carbonate | Compressional sonic; Density; Neutron; True resistivity; Shallow resistivity; Photoelectric factor; Gamma ray | Shear sonic | Iran Gas fields
Bukar et al. (2019) | Exponential Gaussian Process Regression Model | Sandstone | Compressional sonic; Caliper; Bulk density correction | Shear sonic | Australia Gas fields
Data from three wells in the Volve oil field is used in this study. The input
features and their relevance will be evaluated using the laboratory-verified
justification of their influence on Vs and Vp prediction. After establishing
a suitable dataset of relevant input variables, different machine learning
models will be employed to predict Vp and Vs. The best-performing model
based on several statistical metrics will then be optimised using the Bayesian
Optimisation (BO) algorithm.
technique known as recursive binary splitting is utilised for the data partition.
Recursive binary splitting is a statistical approach in which all values are
lined up, and several candidate split points are explored and assessed using an
objective function. The split that results in the lowest cost is then chosen.
Based on the objective function, all input parameters and feasible splits are
assessed and selected in a greedy manner.
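The greedy search described above can be sketched for a single input feature. This is a minimal illustration, assuming a regression target and the sum of squared errors as the objective function. The names `sse`, `best_split`, `grow`, and `predict` are hypothetical, and a full CART implementation would also loop over every input feature at each node.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, cost) of the cheapest split, or (None, cost) if none helps."""
    pairs = sorted(zip(xs, ys))
    best_thr, best_cost = None, sse(ys)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between identical feature values
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if cost < best_cost:
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_thr, best_cost

def grow(xs, ys, depth=0, max_depth=3):
    """Recursively split until no split lowers the cost or max_depth is reached."""
    thr, _ = best_split(xs, ys)
    if thr is None or depth == max_depth:
        return sum(ys) / len(ys)  # leaf: mean of the targets in this node
    left = [(x, y) for x, y in zip(xs, ys) if x < thr]
    right = [(x, y) for x, y in zip(xs, ys) if x >= thr]
    return {
        "thr": thr,
        "left": grow([x for x, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([x for x, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, dict):
        tree = tree["left"] if x < tree["thr"] else tree["right"]
    return tree

xs = [1, 2, 3, 10, 11, 12]
ys = [5, 5, 5, 20, 20, 20]
tree = grow(xs, ys)
```

Each call considers only the best split for the current node and never revisits earlier decisions, which is what makes the procedure greedy.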
The Disjunctive Normal Form concept, also known as the Sum of
Products (SOP), is the foundation of the CART model. Each branch that extends
from the tree’s root to a leaf node of a given class is a product (conjunction)
of attribute tests, and the various branches that end in the same class combine
to form a sum (disjunction). In implementing the decision
tree, determining which attribute should be regarded as the root node at each
level is the primary issue. Addressing this is referred to as attribute selection.
Different attribute selection techniques determine the attribute designated as
the root node at each level.
The choice of where to split has a major impact on a tree’s
accuracy, and there are specific selection principles for CART models. To
determine whether a node should be split into multiple sub-nodes, different
algorithms are used by CART models. The formation of sub-nodes improves
the homogeneity of the newly created sub-nodes (Gordon et al., 1984); in other
words, the purity of a node increases with respect to the target variable. The CART model
evaluates splits on all available parameters and then chooses the split
that produces the most homogeneous sub-nodes. The algorithm is also
chosen depending on the type of target data.
$$\mathrm{Variance} = \frac{\sum (X - \bar{X})^2}{n} \tag{4.8}$$
where $\bar{X}$ is the mean of the values, $X$ is the actual value, and $n$ is the
number of values.
6. Chi-Squared Automatic Interaction Detector (CHAID): CHAID is a
tree classification approach that determines the statistically significant
differences between sub-nodes and the parent node. It is calculated as the sum of squares
of the standardised differences between the observed and expected
frequencies of the target. CHAID works with a categorical
target variable, such as ‘Success’ or ‘Failure’, and can perform two or more splits at a node.
The greater the Chi-Square value, the greater the statistical significance
of the differences between the sub-node and the parent node. Chi-squared is
denoted mathematically as:
$$\chi^2 = \sum \frac{(O - E)^2}{E} \tag{4.9}$$
where $\chi^2$ represents the obtained Chi-Square, $O$ is the observed score, and
$E$ denotes the expected score.
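Eq. 4.9 can be illustrated with a small worked example. The sketch below assumes that the expected counts in each sub-node follow the class proportions of the parent node, which is a common convention but is not stated explicitly in the text. The function names and counts are hypothetical.

```python
def chi_square(observed, expected):
    """Eq. 4.9: sum of (O - E)^2 / E over the classes of one sub-node."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def split_chi_square(child_counts):
    """Total chi-square of a split.

    child_counts holds one list of class counts per sub-node,
    e.g. [[8, 2], [2, 8]] for ['Success', 'Failure'] in two sub-nodes.
    Expected counts assume each sub-node mirrors the parent's class mix.
    """
    n_classes = len(child_counts[0])
    class_totals = [sum(c[j] for c in child_counts) for j in range(n_classes)]
    grand_total = sum(class_totals)
    total = 0.0
    for counts in child_counts:
        size = sum(counts)
        expected = [size * t / grand_total for t in class_totals]
        total += chi_square(counts, expected)
    return total

# Parent node: 10 'Success' and 10 'Failure'
informative = split_chi_square([[8, 2], [2, 8]])   # sub-nodes differ from parent
uninformative = split_chi_square([[5, 5], [5, 5]])  # sub-nodes mirror the parent
```

For the informative split each sub-node expects 5 of each class, so each contributes (8 - 5)²/5 + (2 - 5)²/5 = 3.6, giving a total of 7.2. The uninformative split scores zero.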
These criteria are used to calculate a selection value for each attribute.
The values are ranked, and the attributes are placed in the tree in that order. The
attribute with the highest value (for example, the highest information gain) is placed at the root. A
categorical output is assumed when utilising IG as a criterion, whereas GI
assumes continuous attributes. CART models are simple to understand since
they result in rules, but one main disadvantage is the tendency to
overfit. Overfitting generally happens when CART builds many branches
due to outliers and irregularities in the data, but it can be reduced using
pre- and post-pruning approaches. Pre-pruning terminates tree construction
early: a node is not split if its goodness-of-split measure falls
below a certain threshold. However, deciding on an acceptable stopping point
is challenging. Post-pruning begins by growing the tree to its full depth.
If the tree exhibits overfitting, branches are then pruned back.
To evaluate the efficacy of pruning, cross-validation is used to check whether
expanding a node improves performance. If there is
an improvement, the node can be expanded. However, if performance
degrades, the node should not be expanded and is converted to a
terminal node (Breiman, 2001; Gordon et al., 1984).
where ŷ is the output, h(X) is the number of trees, X is the input vector, and n
and k are the overall numbers of trees grown where n is greater than k and k
is greater than 1.
The out-of-bag (OOB) score is another essential aspect of the RF algorithm.
When bootstrapping, some samples are excluded from the subsample used to train
each base learner. These out-of-sample examples may be used to
assess the learner and generate an OOB score, acting as a pseudo-validation
subset for the random forest model. To obtain the OOB score, set oob_score=True
when initialising the random forest object. Another critical aspect of
the RF algorithm is its feature selection capability, which identifies the
most relevant features among the input variables in the training dataset. Feature
selection is an essential part of the machine learning workflow.
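To make the out-of-bag idea concrete, the sketch below draws bootstrap samples and counts how many rows each draw leaves out. It is a standard-library illustration, not a random forest. The function names are hypothetical, and in scikit-learn the equivalent bookkeeping is enabled by the `oob_score=True` argument of `RandomForestRegressor`.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one tree's training sample)."""
    return [rng.randrange(n) for _ in range(n)]

def oob_indices(n, in_bag):
    """Rows never drawn for this tree; they can act as its validation set."""
    chosen = set(in_bag)
    return [i for i in range(n) if i not in chosen]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_rows, n_trees = 1000, 50
fractions = []
for _ in range(n_trees):
    bag = bootstrap_indices(n_rows, rng)
    fractions.append(len(oob_indices(n_rows, bag)) / n_rows)

mean_oob_fraction = sum(fractions) / len(fractions)
expected_fraction = (1 - 1 / n_rows) ** n_rows   # tends to 1/e, about 0.368
```

Roughly a third of the rows are left out of each tree's sample, so every row is out-of-bag for a sizeable share of the trees. The OOB score is obtained by predicting each row using only those trees.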
The primary disadvantage of random forest is that a large number of trees may slow
the model down and make it inefficient for predictions. In general, the model
tends to learn rapidly but makes slow predictions. A more precise prediction
necessitates more trees, resulting in a computationally expensive model.
Extra-Trees exhibit low variance, and their prediction accuracy improves with
an increasing number of trees (Ernst et al., 2006).
Fig. 4.3. Wireline plot of Well B showing the input logs and the shear (DTS) and compressional (DT) logs.
Table 4.5. Descriptive statistics of the outputs and input variables for the combined Wells A and B.
5. Methodology
5.1 Data Analysis and Visualisation
The data for this study were analysed and visualised to help understand
how the input variables correlate with both targets. A Kendall correlation
heatmap was used to quantify the degree of correlation between
the inputs and the targets. NPHI, RHOB, and PHIF were the parameters
most strongly correlated with both targets. The nonlinear relationship between the
inputs and the targets is also illustrated in the pair plot in Fig. 4.4. Based on
this plot, none of the input parameters had a
linear relationship with both targets, which motivated the selection of nonlinear
models for this study.
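The Kendall coefficient used for the heatmap is rank-based, so it rewards any monotonic relationship rather than only linear ones. The sketch below is a plain-Python version of the tau-b variant. Which variant the authors used is not stated, so this choice and the function name are assumptions.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall tau-b: concordant minus discordant pairs, corrected for ties."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                  # tied in both: excluded from both terms
            elif dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom

monotone = kendall_tau_b([1, 2, 3, 4], [1, 4, 9, 16])     # nonlinear but monotonic
reversed_ = kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1])
mixed = kendall_tau_b([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])    # 7 concordant, 3 discordant
```

The first pair is perfectly rank-correlated despite being quadratic. This makes a rank-based measure a reasonable screening tool when the inputs relate nonlinearly to Vp and Vs.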
Fig. 4.4. Pair plot distribution of all input features colour-coded using sand flag.
absolute percentage error (MAPE) and R². The models were also compared
using the Akaike Information Criterion (AIC), a frequentist probability
framework that scores models based on the maximum likelihood of fitting
unknown data. After the best model had been selected based on these criteria,
Bayesian Optimisation (BO) hyperparameter tuning, a sequential model-based
optimisation procedure, was used to enhance the model’s prediction accuracy.
The BO technique is a convenient and dynamic paradigm that uses Bayes’ theorem
to guide the search of a global optimisation problem
towards the extrema of the objective function (Otchere et al., 2022b). The technique
operates by repeatedly training a probabilistic (Bayesian) surrogate model of the
objective function based on previous evaluations. An acquisition function is then
evaluated on this probabilistic model to choose promising samples for
evaluation on the real objective function. Hyperparameter tuning is the process
of adjusting the model’s hyperparameters to improve the learning algorithm’s
efficiency and optimise model performance by reaching a sufficiently reduced
cost function (Otchere et al., 2022b).
The optimised model is then deployed on a new well from the same field to
predict Vs and Vp from the wireline logs. The predicted Vs and Vp logs will
be compared to those obtained from empirical correlations. Figure 4.5 shows
the research workflow.
procedure in using this metric to evaluate the models suggests that the lower
the AIC value, the greater the likelihood of fitting new data. The following
considerations were applied in this study when comparing and selecting the best
model. There are four levels of significance for differences in
AIC between models: negligible (< 20), moderate (21–50), substantial
(51–100) and extremely strong (> 100). The Extra-Trees model had the
highest probability of fit for Vp and Vs predictions, recording the lowest
AIC value. When the Extra-Trees model was compared to the Random Forest
model (the two models with the lowest AIC), the difference was
180 AIC units. The result is illustrated in Fig. 4.6. Based on this outcome, the
difference in favour of the Extra-Trees model is extremely strong. AIC
indicates how closely the model converges to the actual data. However, over- or
underfitting may persist. As such, further model assessment is required using
other error metrics.
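The comparison can be sketched as follows. The AIC expression used here, n·ln(RSS/n) + 2k, is the common least-squares form under Gaussian errors and is an assumption, since this section does not give the exact formula or how parameters were counted for the tree models. The function names are hypothetical, and the handling of values that fall exactly on a boundary (for example 20) is also a choice made for this sketch.

```python
import math

def aic_least_squares(y_true, y_pred, n_params):
    """AIC = n * ln(RSS / n) + 2k, up to an additive constant (RSS must be > 0)."""
    n = len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return n * math.log(rss / n) + 2 * n_params

def delta_aic_level(delta):
    """Map an AIC difference to the four significance levels used in the text."""
    delta = abs(delta)
    if delta <= 20:
        return "negligible"
    if delta <= 50:
        return "moderate"
    if delta <= 100:
        return "substantial"
    return "extremely strong"

y_true = [1.0, 2.0, 3.0, 4.0]
close_fit = aic_least_squares(y_true, [1.1, 1.9, 3.2, 3.8], n_params=2)
poor_fit = aic_least_squares(y_true, [2.0, 1.0, 4.0, 3.0], n_params=2)
level = delta_aic_level(180)    # the Extra-Trees vs Random Forest gap reported above
```

With the same number of parameters, the closer fit yields the lower AIC. The 180-unit gap reported above falls in the highest of the four levels.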
Out-of-sample data results were used to evaluate the models’
performance. Since the models were trained using the training dataset, any
attempt to reproduce those data will be highly accurate. As a result, the models
are expected to have a high training accuracy score. However, a high
training accuracy might indicate an unrepresentative model that has overfitted
the data by matching its inherent noise. The accuracy of the
predictions generated by the models on the test data is used to assess model
performance on data that have not yet been observed. A high test accuracy is
highly desirable in the prediction of Vp and Vs to confirm model accuracy and
robustness on new data, since inaccurate estimates can have long-term
adverse effects on many operations. In this study, an R² greater than 0.9 was regarded
as an acceptable compromise between bias and variance. As can be seen in
Fig. 4.7, the Extra-Trees model performed the best in terms of estimating Vp
and Vs during testing.
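The error metrics used in this comparison can be computed as in the sketch below. It assumes that R² denotes the coefficient of determination and that MAPE is expressed in per cent. The function name and sample values are hypothetical.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (per cent) and R2 for paired observations."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": sqrt(ss_res / n),
        "MAPE": 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2": 1 - ss_res / ss_tot,
    }

# Hypothetical held-out values, not data from this study
test_scores = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 10.0])
meets_threshold = test_scores["R2"] > 0.9   # the acceptance level used in this study
```

Computing the same dictionary on the training and test splits and comparing the two R² values gives a quick indication of overfitting: a large gap between them suggests the model has matched noise in the training data.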
Fig. 4.6. AIC results comparing all the models on the test data.
Fig. 4.7. Train and test correlation coefficient score of all models used in this study.
Fig. 4.8. Comparison of test data prediction errors of all models based on MAE, RMSE, and MAPE.
Fig. 4.9. A joint plot of Predicted vs Actual Vs (left) and Vp (right) on test data showing cross plot,
data distribution, and confidence interval. (A) is Decision Tree, (B) is Random Forest and (C) is
Extra-Trees.
Table 4.6. Performance of all models compared to the Bayesian Optimised Extra-Trees.
Fig. 4.10. Cross-plot of test and predicted data indicating confidence interval for BO–ET model based
on formation type.
Fig. 4.11. Wireline log comparison of empirical and BO–ET predicted values against actual.
Fig. 4.12. Wireline log illustration of the BO–ET predicted Vp and Vs for the deployment Well C.
7. Conclusions
This study demonstrated the use of data-driven domain knowledge and an
optimised Extra-Trees model for predicting shear and compressional sonic
logs in wells where they are absent. This approach was found to be more accurate than
empirical correlations that rely on the compressional sonic log to predict the
shear sonic log. The assessment criterion indicated the
models’ likelihood of fitting held-out data. The models’ accuracy, precision,
and dependability were further evaluated based on the MAE, MAPE, RMSE,
and R² on the held-out data. The findings are outlined as follows:
1. The input data were clearly justified to understand their influence on
sonic velocity.
2. The Extra-Trees model outperformed the Random Forest and Decision
Tree models in terms of precision, robustness, and probability of fit.
3. The BO algorithm made a marginal but meaningful improvement in the
accuracy of the Extra-Trees model.
4. In the multi-output regression prediction, the predicted Vs in the sandstone
interval exhibited the highest error on the test data.
5. The error in the sandstone interval is due to the influence of different
formation fluids in the reservoir zone. This error was minimised by
retraining the model with all the data from Wells A and B.
6. After deploying the model on Well C, the predicted Vs and Vp matched
the influence of different fluid saturations in the three different
formation types.
The proposed methodology offers the geothermal, fluid sequestration,
and oil and gas industries a cheaper and more time-efficient alternative to running the
sonic logging tool in every drilled well. It also provides a way to estimate Vs
and Vp in older wells where they were not previously measured.
Acknowledgement
The authors express their sincere appreciation to Universiti Teknologi PETRONAS
and the Centre of Research for Subsurface Seismic Imaging for supporting this work.
Data Availability
The wireline log data used was obtained from Equinor Volve Field Datasets at
https://www.equinor.com/en/how-and-why/digitalisation-in-our-dna/volve-field-data-village-download.html
([Dataset], 2018).
References
Ali, M., Jiang, R., Ma, H., Pan, H., Abbas, K., Ashraf, U. and Ullah, J. 2021. Machine learning: A
novel approach of well logs similarity based on synchronization measures to predict shear
sonic logs. J. Pet. Sci. Eng. 203: 108602. https://doi.org/10.1016/j.petrol.2021.108602.
Anemangely, M., Ramezanzadeh, A., Amiri, H. and Hoseinpour, S.A. 2019. Machine learning
technique for the prediction of shear wave velocity using petrophysical logs. J. Pet. Sci.
Eng. 174: 306–327. https://doi.org/10.1016/j.petrol.2018.11.032.
Asoodeh, M. and Bagheripour, P. 2012. Prediction of compressional, shear, and stoneley wave
velocities from conventional well log data using a committee machine with intelligent
systems. Rock Mech. Rock Eng. 45: 45–63. https://doi.org/10.1007/s00603-011-0181-2.
Asoodeh, M. and Bagheripour, P. 2014. ACE stimulated neural network for shear wave velocity
determination from well logs. J. Appl. Geophy. 107: 102–107.
https://doi.org/10.1016/j.jappgeo.2014.05.014.
Azizia, H., Siahkoohi, H.R., Evans, B., Farajkhah, N.K. and Kazemzadeh, E. 2017. A comparison
between estimated shear wave velocity and elastic modulus by empirical equations and that
of laboratory measurements at reservoir pressure condition. Journal of Sustainable Energy
Engineering 5: 29–46. https://doi.org/10.7569/JSEE.2017.629502.
Bagheripour, P., Gholami, A., Asoodeh, M. and Vaezzadeh-Asadi, M. 2015. Support vector
regression based determination of shear wave velocity. J. Pet. Sci. Eng. 125: 95–99.
https://doi.org/10.1016/j.petrol.2014.11.025.
Breiman, L. 2001. Random forests. Mach. Learn. 45: 5–32.
https://doi.org/10.1023/A:1010933404324.
Brocher, T.M. 2005. Empirical relations between elastic wavespeeds and density in the earth’s
crust. Bulletin of the Seismological Society of America 95: 2081–2092.
https://doi.org/10.1785/0120050077.
Bukar, I., Adamu, M.B. and Hassan, U. 2019. A machine learning approach to shear sonic
log prediction. In: Society of Petroleum Engineers–SPE Nigeria Annual International
Conference and Exhibition 2019, NAIC 2019. Society of Petroleum Engineers, Lagos.
https://doi.org/10.2118/198764-MS.
Carroll, R.D. 1969. The determination of the acoustic parameters of volcanic rocks from
compressional velocity measurements. International Journal of Rock Mechanics and
Mining Sciences & Geomechanics Abstracts 6: 557–579.
https://doi.org/10.1016/0148-9062(69)90022-9.
Castagna, J.P., Batzle, M.L. and Eastwood, R.L. 1985. Relationships between compressional‐wave
and shear‐wave velocities in clastic silicate rocks. Geophysics 50: 571–581.
Castagna, J.P., Batzle, M.L., Kan, T.K. and Backus, M.M. 1993. Rock physics—The link
between rock properties and AVO response. Offset-dependent reflectivity—Theory and
practice of AVO analysis. SEG 8: 135–171.
Chaikine, I.A. and Gates, I.D. 2020. A new machine learning procedure to generate highly
accurate synthetic shear sonic logs in unconventional reservoirs. In: SPE Annual Technical
Conference and Exhibition. SPE. https://doi.org/10.2118/201453-MS.
Close, D., Cho, D., Horn, F. and Edmundson, H. 2009. The sound of sonic: A historical
perspective and introduction to acoustic logging. Canadian Society of Exploration
Geophysicists Recorder 34: 34–43.
[Dataset]. 2018. Volve field data village download - data 2008-2016 - equinor.com [WWW
Document]. URL https://www.equinor.com/en/what-we-do/digitalisation-in-our-dna/
volve-field-data-village-download.html (accessed 5.27.21).
Dinov, I.D. 2018. Data Science and Predictive Analytics. Springer International Publishing,
Cham. https://doi.org/10.1007/978-3-319-72347-1.
Duffaut, K. and Landrø, M. 2007. Vp/Vs ratio versus differential stress and rock consolidation—A
comparison between rock models and time-lapse AVO data. Geophysics 72: C81–C94.
https://doi.org/10.1190/1.2752175.
Dvorkin, J. and Mavko, G. 2014. V S predictors revisited. The Leading Edge 33: 288–296.
https://doi.org/10.1190/tle33030288.1.
Eberli, G.P., Baechle, G.T., Anselmetti, F.S. and Incze, M.L. 2003. Factors controlling elastic
properties in carbonate sediments and rocks. The Leading Edge 22: 654–660. https://doi.
org/10.1190/1.1599691.
Compressional and Shear Sonic Log Determination 85
Elkatatny, S., Tariq, Z., Mahmoud, M., Mohamed, I. and Abdulraheem, A. 2018. Development
of new mathematical model for compressional and shear sonic times from wireline log data
using artificial intelligence neural networks (White Box). Arab J. Sci. Eng. 43: 6375–6389.
https://doi.org/10.1007/s13369-018-3094-5.
Ernst, P., Wehenkel, D., Govindaraju, L. and Rao, R.S. 2006. Extremely randomized trees,
Artificial Neural Network in Hydrology. Kluwer.
Folkestad, A. and Satur, N. 2008. Regressive and transgressive cycles in a rift-basin:
Depositional model and sedimentary partitioning of the Middle Jurassic Hugin Formation,
Southern Viking Graben, North Sea. Sediment Geol. 207: 1–21. https://doi.org/10.1016/j.
sedgeo.2008.03.006.
Gamal, H., Alsaihati, A. and Elkatatny, S. 2022. Predicting the rock sonic logs while drilling
by random forest and decision tree-based algorithms. J. Energy Resour. Technol. 144.
https://doi.org/10.1115/1.4051670.
Gassmann, F. 1951. Elastic waves through a packing of spheres. Geophysics 16: 673–685.
https://doi.org/10.1190/1.1437718.
Geurts, P., Ernst, D. and Wehenkel, L. 2006. Extremely randomized trees. Machine Learning
63(1): 3–42. https://doi.org/10.1007/s10994-006-6226-1.
Gordon, A.D., Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. 1984. Classification
and regression trees. Biometrics 40: 874. https://doi.org/10.2307/2530946.
Greenberg, M.L. and Castagna, J.P. 1992. Shear-wave velocity estimation in porous rocks:
theoretical formulation, preliminary verification and applications. Geophys. Prospect.
40: 195–209. https://doi.org/10.1111/j.1365-2478.1992.tb00371.x.
Hamada, G. and Joseph, V. 2020. Developed correlations between sound wave velocity and
porosity, permeability and mechanical properties of sandstone core samples. Petroleum
Research 5: 326–338. https://doi.org/10.1016/j.ptlrs.2020.07.001.
Hatampour, A. and Ghiasi-Freez, J. 2013. A fuzzy logic model for predicting dipole shear
sonic imager parameters from conventional well logs. Pet. Sci. Technol. 31: 2557–2568.
https://doi.org/10.1080/10916466.2011.603005.
LeCompte, B., Majekodunmi, T., Staines, M., Taylor, G., Zhang, B., Evans, R. and Chang, N.
2021. Machine learning prediction of formation evaluation logs in the Gulf of Mexico.
In: Offshore Technology Conference. OTC. https://doi.org/10.4043/31093-MS.
Nourafkan, A. and Kadkhodaie-Ilkhchi, A. 2015. Shear wave velocity estimation from
conventional well log data by using a hybrid ant colony–fuzzy inference system: A
case study from Cheshmeh–Khosh oilfield. J. Pet. Sci, Eng. 127: 459–468. https://doi.
org/10.1016/j.petrol.2015.02.001.
Onalo, D., Adedigba, S., Khan, F., James, L.A. and Butt, S. 2018. Data driven model for
sonic well log prediction. J. Pet. Sci. Eng. 170: 1022–1037. https://doi.org/10.1016/j.
petrol.2018.06.072.
Otchere, D.A., Arbi Ganat, T.O., Gholami, R. and Lawal, M. 2021a. A novel custom ensemble
learning model for an improved reservoir permeability and water saturation prediction.
J. Nat. Gas Sci. Eng. 91: 103962. https://doi.org/10.1016/j.jngse.2021.103962.
Otchere, D.A., Arbi Ganat, T.O., Gholami, R. and Ridha, S. 2021b. Application of supervised
machine learning paradigms in the prediction of petroleum reservoir properties:
Comparative analysis of ANN and SVM models. J. Pet. Sci. Eng. 200: 108–182.
https://doi.org/10.1016/j.petrol.2020.108182.
Otchere, D.A., Ganat, T.O.A., Ojero, J.O., Taki, M.Y. and Tackie-Otoo, B.N. 2021c. Application
of gradient boosting regression model for the evaluation of feature selection techniques
in improving reservoir characterisation predictions. J. Pet. Sci. Eng. 109244. https://doi.
org/10.1016/J.PETROL.2021.109244.
86 Data Science and Machine Learning Applications in Subsurface Engineering
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942. https://doi.
org/10.3390/app12062942.
Otchere, D.A., Ganat, T.O.A., Nta, V., Brantson, E.T. and Sharma, T. 2022b. Data analytics
and Bayesian Optimised Extreme Gradient Boosting approach to estimate cut-offs from
wireline logs for net reservoir and pay classification. Appl. Soft. Comput. 120: 108680.
https://doi.org/10.1016/j.asoc.2022.108680.
Phadke, S., Bhardwaj, D. and Yerneni, S. 2000. Marine Synthetic Seismograms Using Elastic
Wave Equation. 2000 SEG Annual Meeting, Calgary, Alberta.
Pickett, G.R. 1963. Acoustic character logs and their applications in formation evaluation.
Journal of Petroleum Technology 15: 659–667. https://doi.org/10.2118/452-PA.
Rasouli, V., Pallikathekathil, Z.J. and Mawuli, E. 2011. The influence of perturbed stresses near
faults on drilling strategy: A case study in Blacktip field, North Australia. J. Pet. Sci. Eng.
76: 37–50. https://doi.org/10.1016/j.petrol.2010.12.003.
Rojas, E. 2008. Vp-Vs ratio sensitivity to pressure, fluid, and lithology changes in tight gas
sandstones. First Break 26. https://doi.org/10.3997/1365-2397.26.1117.27907.
Safaei-Farouji, M., Hasannezhad, M., Rahimzadeh Kivi, I. and Hemmati-Sarapardeh, A. 2022.
An advanced computational intelligent framework to predict shear sonic velocity with
application to mechanical rock classification. Sci. Rep. 12: 5579. https://doi.org/10.1038/
s41598-022-08864-z.
Simm, R. and Bacon, M. 2014. Seismic Amplitude. Cambridge University Press. https://doi.
org/10.1017/CBO9780511984501.
Sneider, J.S., Clarens, P. de and Vail, P.R. 1995. Sequence stratigraphy of the middle to upper
jurassic, viking graben, North sea. Norwegian Petroleum Society Special Publications
5: 167–197. https://doi.org/10.1016/S0928-8937(06)80068-8.
Suleymanov, V., Gamal, H., Glatz, G., Elkatatny, S. and Abdulraheem, A. 2021.
Real-time prediction for sonic slowness logs from surface drilling data using machine
learning techniques. In: SPE Annual Caspian Technical Conference. SPE. https://doi.
org/10.2118/207000-MS.
Tarafder, S., Badruddin, N., Yahya, N. and Egambaram, A. 2021. EEG-based drowsiness
detection from ocular indices using ensemble classification. pp. 21–24. In: 2021 IEEE 3rd
Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability (ECBIOS).
IEEE, https://doi.org/10.1109/ECBIOS51820.2021.9510848.
Tarafder, S., Badruddin, N., Yahya, N. and Nasution, A.H. 2022. Drowsiness detection using
ocular indices from EEG signal. Sensors 22: 4764. https://doi.org/10.3390/s22134764.
Tosaya, C. and Nur, A. 1982. Effects of diagenesis and clays on compressional velocities in
rocks. Geophys. Res. Lett. 9: 5–8. https://doi.org/10.1029/GL009i001p00005.
Waluyo, W., Uren, N.F. and McDonald, J.A. 1995. Poisson’s ratio in transversely isotropic
media and its effects on amplitude response: An investigation through physical modelling
experiments. pp. 585–588. In: SEG Technical Program Expanded Abstracts 1995. Society
of Exploration Geophysicists. https://doi.org/10.1190/1.1887420.
Zoveidavianpoor, M., Samsuri, A. and Shadizadeh, S.R. 2013. Adaptive neuro fuzzy inference
system for compressional wave velocity prediction in a carbonate reservoir. J. Appl.
Geophy. 89: 96–107. https://doi.org/10.1016/j.jappgeo.2012.11.010.
Chapter 5
Data-Driven Virtual Flow
Metering Systems
Ramez Abdalla* and Philip Jaeger
1. Introduction
Accurate estimation of multiphase flow rates in oil and gas production systems
is essential for monitoring and optimising production. Hence, production
testing is one of the routine well tests. It is usually conducted on a
schedule to monitor liquid rates, water cut, and gas-oil ratio (GOR). The
test uses a test separator to compare the actual production rate with the
theoretical one, and it is the most common form of production and reservoir
surveillance. However, this technique has its limitations. The main one is
insufficient resolution or repeatability to identify trends in liquid rate and
water cut over short periods of time. Another potential problem is test
duration, particularly in low-rate and deep wells, which require several
time-consuming complete liquid holdup periods. Later, an alternative to
production testing was developed: multiphase physical flow meters (MPFMs).
This technology estimates multiphase flow rates indirectly, without separating
the phases, by tracking supplementary measurements of fluid phase properties,
such as velocity and phase fractions, inside the device. These meters are
usually installed at the wellhead so that the multiphase flow rates of a
particular well can be tracked in real time. One disadvantage of MPFMs is
their higher capital cost (CAPEX) and operating cost (OPEX), and
they require frequent calibration.
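As a minimal sketch of the quantities a production test monitors, the snippet below derives liquid rate, water cut, and GOR from separator phase rates. The function name, the units (STB/day for liquids, scf/day for gas), and the sample rates are illustrative assumptions, not values from this chapter.

```python
def production_test_summary(q_oil, q_water, q_gas):
    """Summarise one separator test (hypothetical helper).

    q_oil, q_water: liquid rates, assumed to be in STB/day.
    q_gas: gas rate, assumed to be in scf/day.
    """
    q_liquid = q_oil + q_water
    if q_oil <= 0 or q_liquid <= 0:
        raise ValueError("oil and liquid rates must be positive")
    return {
        "liquid_rate": q_liquid,          # total liquid, STB/day
        "water_cut": q_water / q_liquid,  # fraction of liquid that is water
        "gor": q_gas / q_oil,             # gas-oil ratio, scf/STB
    }


# Illustrative (made-up) test-separator readings.
summary = production_test_summary(q_oil=800.0, q_water=200.0, q_gas=400_000.0)
print(summary)
```

Comparing such summaries between consecutive tests is one simple way to see the resolution problem described above: trends between test dates are invisible.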
model can conduct quick and precise real-time metering if it has been
properly trained and the operating conditions fall within the training range.
This method can build models more affordably than mechanistic modelling since
it does not require much in-depth production engineering domain expertise. In
the following section, we present various data-driven VFM applications
on different oil well systems.
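The condition that inputs must fall within the training range can be checked before a prediction is trusted. The sketch below is one simple way to do this, assuming per-feature minimum and maximum bounds; the feature names and values are hypothetical.

```python
def training_ranges(rows):
    """Return the (min, max) of every feature over the training rows."""
    names = rows[0].keys()
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows)) for n in names}


def out_of_range(sample, ranges):
    """List the features of a new sample that lie outside the training range."""
    return [n for n, (lo, hi) in ranges.items() if not lo <= sample[n] <= hi]


# Hypothetical training data: wellhead pressure, wellhead temperature, choke opening.
train = [
    {"whp_bar": 20.0, "wht_degC": 60.0, "choke_pct": 30.0},
    {"whp_bar": 35.0, "wht_degC": 75.0, "choke_pct": 55.0},
    {"whp_bar": 28.0, "wht_degC": 68.0, "choke_pct": 42.0},
]
ranges = training_ranges(train)

# A new operating point with a wellhead pressure above anything seen in training.
flags = out_of_range({"whp_bar": 40.0, "wht_degC": 70.0, "choke_pct": 50.0}, ranges)
print(flags)  # ['whp_bar']
```

A per-feature box check is deliberately crude: it does not detect unusual combinations of inputs that are individually in range.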
underground pump, consists of a prime mover (usually an electric motor) and,
normally, a beam fixed to a pivotal post. The post is called a Samson post
and the beam is called a walking beam.
Several sensors can provide measurements of sucker-rod pump operations.
The main measurements are the loads on the pump, which form what are called
dynamometer cards. These diagnostic cards plot the load on the top rod (the
polished rod) against the polished rod position as the pumping unit moves
through each stroke cycle. The plot of polished rod load versus position is
known as the surface card. Subsequently, a wave equation solution is used to
derive the downhole card from the surface card. The downhole card is a plot of
load versus position on the pump's plunger. Also, on the surface, continuous
measurements of wellhead pressures, temperatures, and motor power are
reported, in addition to the normal production test. Figure 5.3 shows the
sub-modules of a sucker-rod pumped well with relevant tests, namely example
dynamometer and fluid level tests.
Fig. 5.3: A schematic graph of an SRP well.
The applications of virtual sensing on sucker-rod pump systems known so far
are limited. This may be due to the limited ability of these pumps to produce
high fluid rates. Some attempts have been made in this regard, with various
objectives. The first objective is predicting multiphase flow rates or the
dynamic fluid level in the annulus using dynamometer cards and wellhead
pressure and temperature as inputs. The second objective is inferring the
dynamometer cards from electrical power data.
3.2.1 Virtual Flow Meter on Rod Pumping Systems
The challenge in predicting oil, gas, and water flow is to find a function that
describes the multiphase flow rates. As a solution, data-driven algorithms are
used to find a relation between the pump operational parameters and the
produced oil, gas, and water. Peng et al. (2019a) used deep autoencoders to
derive features from dynamometer cards and generate a predictive model for
production. Combining this model with pump and production data leads to
abstract features that agree well with the history data (Peng et al., 2019a).
Soft sensing can replace the traditional detection method for modelling the
dynamic liquid level of the sucker-rod pumping system. Haitao et al. (2014)
proposed a method to calculate the dynamic fluid level that takes the
submerged pressure as a common solution node to analyse both the plunger load
variation, which is obtained from the pump dynamometer card, and the pressure
distribution in the annulus. Li et al. (2013) presented a simulated annealing
approach based on Gaussian process regression modelling of the dynamic liquid
level.
measured and some assumptions of the values made the card calculation
inaccurate and unstable. Thus, a machine learning model for dynamometer
card calculation in the rod pumping lift process is used to formulate the
complicated process. In these examples, deep neural networks are used to find
good weight combinations that eventually allow the model to learn a mapping
from the input data (electrical parameters) to the target data
(dynamometer cards).
Such a study first extracts power features and arranges them into a feature
vector in chronological order over one period. Afterwards, the dynamometer
card data and shape curve image are extracted according to the coordinates and
load data. Subsequently, the power features and dynamometer diagram features
are normalized by rows, that is, mapped to the range between 0 and 1.
Finally, a sequence-to-sequence algorithm is used to infer dynamometer cards
from the power curve features (Peng et al., 2019b; Zhu et al., 2021).
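The row-wise normalisation step mentioned above can be sketched as a min-max scaling of each row to the range 0 to 1. This is a minimal illustration, assuming each row holds one period of a power curve or one dynamometer card; the function name and sample values are hypothetical, and the cited studies may scale differently.

```python
def min_max_rows(matrix):
    """Scale every row of a list-of-lists to the range [0, 1]."""
    scaled = []
    for row in matrix:
        lo, hi = min(row), max(row)
        span = hi - lo
        # A constant row has no spread; map it to zeros to avoid dividing by zero.
        scaled.append([0.0 if span == 0 else (v - lo) / span for v in row])
    return scaled


# Two made-up rows of motor power samples (kW), one per pumping period.
power_kw = [[4.0, 6.0, 8.0], [10.0, 10.0, 10.0]]
print(min_max_rows(power_kw))  # [[0.0, 0.5, 1.0], [0.0, 0.0, 0.0]]
```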
Initially, they applied wide data analytics and then input data to the models
that were compared to each other and to other empirical models. They could
predict oil rates with an accuracy exceeding 98% [28].
3.4. Virtual Sensing for Gas Wells and Plunger Lifted Wells
Conventional plunger lifting is a transient process that consists of cyclic
openings and closings of a gas well. Because of this complex behaviour,
using traditional physics-based models to simulate the coupled behaviour of
reservoir and wellbore performance is computationally demanding and
challenging. Therefore, machine learning methodology would help in
which are also called controller set-points. The main triggers to close a well
include a fixed timer for the after-flow stage, the difference between the gas
and the calculated critical flow rates, as well as various relations between
casing, tubing, and line pressures. The main triggers to open a well include a
fixed timer for total "off time", calculated average plunger rise velocity,
and different relations between the pressure measurements. Figure 5.5 shows the
control volumes of a plunger lifted well.
Fig. 5.5: A schematic graph of a plunger lifted well.
As aforementioned, virtual flow metering includes physics-based and
data-driven methods. When it comes to the plunger lift application, since
the process is extremely transient, the application of physics-based methods
is extremely complex and unviable (Akhiiartdinov et al., 2020). Regarding
data-driven models, Andrianov (2018) (Garcia et al., 2010) and Shoeibi
Omrani et al. (2018) (Loh and Omrani, 2018) demonstrated the application
of ANNs to simulate the transient behaviour of severe slugging and
liquid-loaded gas wells. On the other hand, Akhiiartdinov et al. (2020)
attempted to model a VFM on plunger lifted wells. In this study, the objective
was to optimize the "on" and "off" periods of the control valve, which serve
as parameters for building the response surface.
[Figure: 1st iteration (E1) through 10th iteration (E10).]
A common algorithm used for building VFM models is the ANN, which builds a
function that connects the inputs to the final output. Generally, an ANN
consists of an input layer that reads the input features, which in the case of
a VFM are usually pressure, temperature, choke opening, or other production
system parameters.
The second component of the ANN is the hidden layer, where nonlinear
functions build the connection between the inputs and the prediction. The
final component is the output layer, where the results of the hidden layers
pass through an activation function to reach the final prediction. The ANN
showed good results when applied to steady-state flow, but not in the case of
transient flow.
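To make the three components concrete, the sketch below runs a forward pass through a network with one hidden layer. It assumes tanh hidden units and a linear (identity) output activation, a common choice for regression; the weights and the scaled inputs are arbitrary illustrative numbers, not a trained VFM.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input layer -> tanh hidden layer -> linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Inputs assumed already scaled to [0, 1]: pressure, temperature, choke opening.
x = [0.5, 0.2, 0.8]
w_hidden = [[0.4, -0.1, 0.9], [-0.3, 0.7, 0.2]]  # two hidden neurons
b_hidden = [0.0, 0.1]
w_out = [1.5, -0.5]
b_out = 0.2

rate = forward(x, w_hidden, b_hidden, w_out, b_out)
print(rate)
```

Training consists of adjusting the weights and biases so that the output matches measured flow rates; that step is omitted here.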
Al-Qutami et al. showed in three publications the application of
ANNs in constructing a VFM model (Al-Qutami et al., 2017a, 2017b, 2017c).
Their work provides significant insight for improving VFM models. In the
latest methodology applied in their research, an ANN model was trained using
the Levenberg-Marquardt optimisation algorithm, with K-fold cross-validation
used to select the number of neurons. This work concluded that the gas
flow rate is most sensitive to choke opening and that the bottom hole pressure
is the most critical feature for the flow rate prediction.
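K-fold cross-validation for model selection can be sketched as follows. For brevity, the two candidates compared here are a mean predictor and a least-squares line rather than networks with different neuron counts, and the data are synthetic; both are stand-in assumptions, and the procedure itself is the same for any candidate model.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, fit, predict):
    """Return the mean squared error on each held-out fold."""
    errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        squared = [(predict(model, xs[i]) - ys[i]) ** 2 for i in fold]
        errors.append(sum(squared) / len(squared))
    return errors


def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Synthetic, exactly linear data: choke opening (%) versus flow rate.
choke = [float(c) for c in range(10, 60, 5)]
rate = [2.0 * c + 1.0 for c in choke]

e_mean = cross_validate(choke, rate, 5, fit_mean, predict_mean)
e_line = cross_validate(choke, rate, 5, fit_line, predict_line)
print(sum(e_mean) / 5, sum(e_line) / 5)
```

The candidate with the lowest average held-out error is retained; with neural networks the candidates would differ in the number of hidden neurons.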
Loh and Omrani (2018) implemented feed-forward neural networks on
simulated and field data to predict oil and gas flow rates. The results showed
that the NN model was effective for steady-state flow but produced imprecise
predictions for transient flow. Furthermore, a sensitivity study on the input
features showed that the NN model performed well even with noisy inputs.
Finally, they introduced a new back-allocation technique to predict the flow
rate using separator measurements. Alajmi et al. (2015) introduced an NN model
to predict the oil flow rate through the choke, adding an empirical
correlation for critical choke flow to the input features. The study
demonstrated a noticeable improvement in NN model performance compared to the
mechanistic models.
To conclude, in this section we have reviewed the main building blocks
of creating VFMs, starting with data gathering, preprocessing, and feature
engineering. We have also presented the validation techniques. Finally, we
presented some NN applications to VFM as examples of the algorithms
used for data-driven VFMs.
References
Adrien, M.H., Younes, T., Deffous, J-F., Couput, A.J-P., Caulier, R. and Vrielynck, B. 2012.
Smart metering: An online application of data validation and reconciliation approach.
Paper presented at SPE Intelligent Energy International Utrecht, The Netherlands, March.
pp. SPE–149908–MS.
Akhiiartdinov, A., Pereyra, E., Sarica, C. and Jose Severino, J. (eds.). 2020. Data analytics
application for conventional plunger lift modelling and optimisation, volume Day 1 Tue.,
10 November. SPE Artificial Lift Conference and Exhibition – Americas.
AlAjmi, Mohammed D., Abdulraheem, A., Mishkhes, A.T. and Al-Shammari, M.J. (eds.).
2015. Profiling downhole casing integrity using artificial intelligence, volume Day 1 Tue.,
03 March SPE Digital Energy Conference and Exhibition.
Alhashem, M. (ed.). 2020. Machine learning classification model for multiphase flow regimes in
horizontal pipes, volume Day 2 Tue., 14 January IPTC International Petroleum Technology
Conference.
Al-Jasmi, A., Goel, H.K., Nasr, H., Querales, M., Rebeschini, J., Villamizar, M.A., Carvajal,
G.A., Knabe, S., Rivas, F. and Saputelli, L. 2013. Short-term production prediction in
real time using intelligent techniques. Paper presented at EAGE Annual Conference and
Exhibition at London, UK. In: All Days, SPE. pp. SPE–164813–MS.
Al-Qutami, T.A., Ibrahim, R., Ismail, I. and Ishak, M.A. 2017a. Development of soft sensor to
estimate multiphase flow rates using neural networks and early stopping. International
Journal of Smart Sensing and Intelligent Systems 10(1): 1–24.
Al-Qutami, T.A., Ibrahim, R. and Ismail, I. 2017b. Hybrid neural network and regression
tree ensemble pruned by simulated annealing for virtual flow metering application.
pp. 304–309. In: 2017 IEEE International Conference on Signal and Image Processing
Applications (ICSIPA).
Al-Qutami, T.A., Ibrahim, R., Ismail, I. and Ishak, M.A. 2017c. Radial basis function network
to predict gas flow rate in multiphase flow. pp. 141–146. In: Proceedings of the 9th
International Conference on Machine Learning and Computing, ACM.
Al Sebaiti et al. 2020. Robust data-driven well performance optimisation assisted by machine
learning techniques for natural flowing and gas-lift wells in Abu Dhabi, volume Day 4
Thu., 29 October, of SPE Annual Technical Conference and Exhibition. D041S046R002.
Arteaga-Arteaga, H.D., Mora-Rubio, A., Florez, F., Murcia-Orjuela, N., Diaz-Ortega, C.E.,
Orozco-Arias, S., delaPava, M., Bravo-Ortíz, M.A., Robinson, M., Guillen-Rondon,
P. and Tabares-Soto, R. 2021. Machine learning applications to predict two-phase flow
patterns. PeerJ Computer Science 7: e798.
Bello, O., Ade-Jacob, S. and Kun Yuan, K. (eds.). 2014. Development of hybrid intelligent
system for virtual flow metering in production wells, volume All Days. SPE Intelligent
Energy International Conference and Exhibition.
Binder, B.J.T., Pavlov, A. and Johansen, T.A. 2015. Estimation of flow rate and
viscosity in a well with an electric submersible pump using moving horizon estimation.
IFAC-PapersOnLine 48(6): 140–146. 2nd IFAC Workshop on Automatic Control in
Offshore Oil and Gas Production OOGP 2015.
Camilleri, L. and Zhou, W. (eds.). 2011. Obtaining real-time flow rate, water cut, and reservoir
diagnostics from ESP gauge data, volume All Days. SPE Offshore Europe Conference and
Exhibition, 2011.
Camilleri, L., El Gindy, M., Rusakov, A. and Adoghe, S. (eds.). 2015. Converting ESP real-time
data to flow rate and reservoir information for a remote oil well, volume Day 2 Wed.,
16 September. SPE Middle East Intelligent Oil and Gas Symposium.
Camilleri, L., El Gindy, M. and Rusakov, A. (eds.). 2016a. ESP real-time data enables well testing
with high frequency, high resolution, and high repeatability in an unconventional well,
volume All Days. SPE/AAPG/SEG Unconventional Resources Technology Conference.
Camilleri, L., El-Gindy, M., Rusakov, A., Bosia, F., Salvatore, P. and Rizza, G. (eds.). 2016b.
Testing the Untestable… Delivering Flowrate Measurements with High Accuracy on a
Remote ESP Well, volume Day 2 Tue., 08 November. Abu Dhabi International Petroleum
Exhibition and Conference, 2016.
Camilleri, L., El Gindy, M., Rusakov, A., Ginawi, I., Abdelmotaal, H., Sayed, E., Edris, T. and
Karam, M. (eds.). 2017. Increasing production with high frequency and high-resolution
flow rate measurements from ESPs, volume Day 2 Tue., 25 April SPE Gulf Coast Section
Electric Submersible Pumps Symposium, 2017.
Carpenter, C. 2019. Analytics solution helps identify rod-pump failure at the wellhead. Journal
of Petroleum Technology 71(05): 63–64.
Cramer, R. and Goh, K-C. (eds.). 2009. Data driven surveillance and optimisation for gas,
subsea and multizone wells, volume All Days. SPE Digital Energy Conference and
Exhibition, 2009.
Cramer, R., Griffiths, W.N., Kinghorn, P., Schotanus, D., Brutz, J.M. and Mueller, K. (eds.).
2011. Virtual measurement value during start up of major offshore projects, volume All
Days. IPTC International Petroleum Technology Conference, 2011.
Denney, T., Wolfe, B. and Zhu, D. 2013. Benefit evaluation of keeping an integrated
model during real-time ESP operations. In: All Days, SPE Digital Conference, SPE,
pp. SPE–163704–MS.
Garcia, A., Almeida, I., Singh, G., Purwar, S., Monteiro, M., Carbone, L. and Herdeiro, M. 2010.
An Implementation of On-Line Well Virtual Metering of Oil Production.
Goh, K-C., Dale-Pine, B., Yong, I.H.W., Peter, V. and Lauwerys, C. (eds.). 2008. Production
surveillance and optimisation for multizone smart wells with data driven models, volume
All Days. SPE Intelligent Energy International Conference and Exhibition, 2008.
Grimstad, B., Gunnerud, V., Sandnes, A., Shamlou, S., Skrondal, I.S., Uglane, V., Ursin-Holm,
S. and Foss, B. (eds.). 2016. A simple data-driven approach to production estimation and
optimization, volume All Days. SPE Intelligent Energy International Conference and
Exhibition, 2016.
Haitao, Y. et al. 2014. Real time calculation of fluid level using dynamometer card of sucker rod
pump well, volume All Days. IPTC International Petroleum Technology Conference, Dec.
IPTC-17773-MS.
Haouche, M., Tessier, A., Deffous, Y. and Authier, J-F. (eds.). 2012. Virtual flow meter pilot:
based on data validation and reconciliation approach, volume All Days. SPE International
Production and Operations Conference and Exhibition, 2012.
Khan, M.K., Tariq, Z. and Abdulraheem, A. 2020. Application of artificial intelligence to
estimate oil flow rate in gas-lift wells. Natural Resources Research 29(6): 4017–4029.
Krikunov, D., Kosyachenko, S., Lukovkin, D., Kunchinin, A., Tolmachev, R. and Chebotarev,
R. (eds.). 2019. AI-based ESP optimal control solution to optimize oil flow across multiple
wells, volume Day 2 Tue., 22 October. SPE Gas, 2019.
Law, H.Y., Phua, P.H., Briers, J. and Kong, J. (eds.). 2018. Extending virtual metering to
provide real time exception based analytics for optimising well management and chemical
injection, volume Day 2 Wed., 21 March. Offshore Technology Conference Asia, 2018.
Li, X., Gao, X., Cui, Y. and Li, K. 2013. Dynamic liquid level modelling of sucker-rod pumping
systems based on Gaussian process regression. pp. 917–922. In: 2013 Ninth International
Conference on Natural Computation (ICNC).
Loh, K. and Omrani, P.S. 2018. Deep Learning and Data Assimilation for Real-Time Production
Prediction in Natural Gas Wells. ArXiv, abs/1802.05141.
Manikonda, K., Hasan, A.R., Obi, C.E., Islam, R., Sleiti, A.K., Abdelrazeq, M.W. and Rahman,
M.A. 2018. Application of machine learning classification algorithms for two-phase
gas-liquid flow regime identification. Paper presented at Abu Dhabi International
Petroleum Exhibition and Conference. In: Day 4 Thu., 18 November, SPE. D041S121R004.
Mask, G., Wu, X. and Ling, K. 2019. An improved model for gas liquid flow pattern prediction
based on machine learning. In: Day 2 Wed., 27 March. Paper presented at the International
Petroleum Technology Conference, Beijing China, IPTC. D021S026R005.
Olivares, G., Escalona, C. and Gimenez, E. 2012. Production monitoring using artificial
intelligence, APLT asset. Paper presented at SPE Intelligent Energy International Utrecht,
The Netherlands. March. In: All Days, SPE. pp. SPE–149594–MS.
Peng, Y., Xiong, C., Zhang, J., Zhang, Y., Gan, Q., Xu, G., Zhang, X., Zhao, R., Shi, J., Liu,
M., Wang, C. and Chen, G. 2019a. Innovative deep autoencoder and machine learning
algorithms applied in production metering for sucker-rod pumping wells, volume Day
1 Mon., 22 July Unconventional Resources Technology Conference. SPE/AAPG/SEG.
D013S011R004.
Peng, Y., Zhao, R., Zhang, X., Shi, J., Chen, S., Gan, Q., Li, G., Zhen, X. and Han, T. (eds.).
2019b. Innovative Convolutional Neural Networks Applied in Dynamometer Cards
Generation. Paper presented at the SPE Western Regional Meeting, San Jose, California,
USA, April 2019. doi: https://doi.org/10.2118/195264-MS.
Poulisse, H., Van Overschee, P., Briers, J., Moncur, C. and Goh, K-C. 2006. Continuous well
production flow monitoring and surveillance. Paper presented at SPE Intelligent Energy
International Amsterdam, The Netherlands, 11–13 April.
Sabaa, A., El Ela, M.A., El-Banbi, A.H. and Sayyouh, Mohamed H.M. 2022. Artificial Neural
Network model to predict production rate of electrical submersible pump wells. SPE
Production & Operations, pp. 1–10. Sept.
Takacs, G. 2018. Chapter 1 - Introduction. pp. 1–10. In: Gabor Takacs (ed.). Electrical Submersible
Pumps Manual (Second Edition). Houston, Texas: Gulf Professional Publishing.
Vinogradov, D. and Vorobev, D. (eds.). 2020. Virtual flowmetering novyport field examples,
volume Day 3, Wed., 28 October. SPE Russian Petroleum Technology Conference.
Zhang, S. and Tang, Y. 2008. Indirect measurement of dynamometer card of pumping unit.
pp. 952–4955. In: 2008 7th World Congress on Intelligent Control and Automation.
Zhu, D., Alyamkin, S., Sesack, L., Bridges, J. and Letzig, J. (eds.). 2016. Electrical submersible
pump operation optimization with time series production data analysis, volume All Days.
SPE Intelligent Energy International Conference and Exhibition.
Zhu, D., Luo, X., Zhang, Z., Li, X., Peng, G., Zhu, L. and Jin, X. 2021. Full reproduction
of surface dynamometer card based on periodic electric current data. SPE Production &
Operations 36(03): 594–603.
Zhu, K., Wang, L., Du, Y., Jiang, C. and Sun, Z. 2020. Deeplog: Identify tight gas reservoir using
multi-log signals by a fully convolutional network. IEEE Geoscience and Remote Sensing
Letters 17(4): 568–571.
Ziegel, P., Shirzadi, S., Wang, S., Bailey, R., Griffiths, P., Ghuwalela, K., Ogedengbe, A. and
Johnson, D. 2014. A data-driven approach to modelling and optimisation for a North Sea
asset using real-time data. Paper presented at the SPE Intelligent Energy Conference &
Exhibition, Utrecht, The Netherlands, April 2014. doi: https://doi.org/10.2118/167850-MS.
Chapter 6
Data-driven and Machine Learning
Approach in Estimating Multi-zonal
ICV Water Injection Rates in a Smart
Well Completion
Daniel Asante Otchere1,2,* and
Mohammed Ayoub Abdalla Mohammed 3
1. Introduction
The ‘smart’ or ‘intelligent’ well is one of the most highly developed types
of nonconventional well. These wells, introduced in recent years, can gather
and analyse data, monitor and control the production process, and adjust
production in response to changing reservoir conditions. The terms ‘smart’
and ‘intelligent’ highlight their high level of automation and use of advanced
technology. A typical smart well features a customised completion in which
packers or sealing components partition the wellbore, with downhole sensors
and pressure control valves fitted on the production tubing across separate
reservoir intervals in a heterogeneous formation. Downhole sensors allow for
continuous monitoring of temperature and pressure across the reservoir and
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
3 Chemical and Petroleum Engineering, UAE University, Sheik Khalifa Street at Tawam R/A, Maqam
District, Al Ain, United Arab Emirates.
* Corresponding author: [email protected]
Machine Learning Approach in Estimating Multi-zonal ICV Rates 105
control valves, which can be used to approximate zonal flow rates.
In contrast, downhole valves such as Inflow Control Valves (ICVs) offer flexible
control of zonal flow rates and serve as a control variable when optimising
wells to enhance recovery. A smart well can be multilateral, with an ICV
controlling each lateral, or a single-bore well, with an ICV controlling each
zone. Most new oil and gas field developments include smart wells because
they achieve the desired output while lowering capital and operating costs.
Smart well completion is a technology used to optimise oil and gas reservoir
production. It involves collecting, transmitting, and analysing completion,
production, and reservoir data, allowing for remote selective zonal control.
This technology improves production and ultimate recovery while reducing
capital and operating expenditures. Smart well completion systems are
designed to operate even in the most difficult conditions. They comprise a
variety of components, such as permanent monitoring and downhole control
systems, zonal isolation
and interval control devices, distributed temperature sensing systems, surface
control and monitoring systems, data acquisition and management software,
as well as system accessories. Multi-zonal reservoirs present a unique set of
challenges for oil and gas production, and smart well completion technology
is well suited to meet these challenges. By dividing the well completion into
multiple production zones that can be controlled independently, intelligent
well completion allows for selective control of production from different
zones in the reservoir. This enables effective management of water injection,
gas and water breakthrough, and individual zone productivity, which can help
to increase ultimate recovery and reduce capital and operating expenditures.
By remotely controlling the ICVs, operators can manage the flow of fluids
from different zones in the reservoir, which reduces water production,
improves production efficiency, and lowers costs. Intelligent well completion
technology also reduces the number of wells required for field development,
which decreases drilling and completion costs. Furthermore, managing water
through remote zonal control reduces the size and complexity of surface
handling facilities.
Operators of a multi-zone intelligent injection well face a problem
in estimating and managing fluid distribution through zonal ICVs in
each reservoir zone. Based on available wellbore, fluid, and well-string
data, including the setting of the zonal ICVs, geometry size, production
string, injection fluid properties, wellhead P/T data, and zonal reservoir
pressure/injectivity data, a simple yet precise empirical correlation has been
106 Data Science and Machine Learning Applications in Subsurface Engineering
to the minimum and maximum production rates. The ICV region was
segmented into these settings to provide a quantitative representation of
the valve’s opening, where 0 denotes fully closed and 10 denotes
fully open.
3. According to downhole pressure monitoring, the maximum
production rate is achieved by adjusting the ICV to its highest setting
range (7–10), whereas the minimum ICV setting range (0–3) results in a
production rate of zero.
4. If production rates differed significantly between zones, differently sized
ICVs were used as appropriate, although this may not always be a viable technique.
5. The new ICV settings were tested on different base cases: all zones
producing, and one zone producing individually. If more than half of
the ICV settings in a zone gave the same production rate, the maximum
production rate was reduced. The simulated cases were also intended to
capture instances where a particular zone experienced production issues
(gas/water breakthroughs) and had to be shut in. Although the data
were applied to an injection well, this approach was simulated for a
production well to capture various scenarios and production issues.
ICV valves. These techniques have shown promising results in finding the
optimal ICV settings and improving the performance of oil and gas reservoirs.
Overall, the literature review suggests that combining ICV data and machine
learning techniques can be an effective approach for optimising production
in oil and gas reservoirs. Machine learning algorithms can be used to model
the relationship between ICV settings and production rates and to predict the
optimal ICV settings for a given set of conditions. This can help to increase
ultimate recovery, reduce capital and operating expenditures, and improve the
performance of oil and gas reservoirs.
3. Methodology
In the context of ICV design, the process usually relies on the production
capacity of individual laterals. Notwithstanding, some oil companies opt for
a simplified approach by utilising the mean production rate of the field rather
than individual lateral rates.
1. Data acquisition: For this research, data were acquired from the
production technology team in the subsurface department. The data
required from multiple wells are SPFM WI Rate, Annulus Pressure,
Tubing Pressure, ICV dP, ICV position, and Annulus Temperature for the
separate reservoir zones (lower, middle and upper), together with MPFM
Rate, THT (tubing head temperature), Annulus Pressure, THP (tubing head
pressure), and Tubing dP (differential pressure). Having some ICV mappings
in the wells of interest is also essential, because the mapping data help
to gain insight and to train the machine learning model on accurate data.
The data were grouped per hour and span a two-year range.
2. Data preprocessing and preparation: The collected data were cleaned
using preprocessing techniques. Missing values and data variables were
removed to prepare the data for further analysis. After the removal of
missing data, 3,767 data points remained, based on the descriptive
statistics.
3. Exploratory data analysis: The most critical and time-consuming part
of any machine learning project is the exploratory data analysis. It is
the first direct encounter with the data towards understanding it. Initial
data investigations were carried out to discover patterns in data, spot
anomalies, and check assumptions with the help of summary statistics,
heatmaps, and pair plot visualisation.
4. Explainable AI and feature selection: In this step, the various input
features are inspected and checked for their importance. Based on the
results from the data analysis, feature selection is to be performed using
explainable AI techniques (model agnostic metrics). The approach is
line produced by the algorithm. No single statistic can evaluate
all forms of data. The metrics are affected by several factors,
including the presence of outliers, the machine learning algorithm
used, the ease with which derivatives can be computed, and the
prediction confidence. Therefore, the model errors were evaluated using the
following criteria to determine whether the models used in this research
were appropriate:
1. Mean Absolute Error (MAE): The MAE is a well-established evaluation
metric used to quantify the accuracy of predictive models. It corresponds
to the L1 loss and is the mean of the absolute differences between
the predicted output and the target variable. By measuring the average
magnitude of errors without taking their direction into account, the
MAE provides a reliable measure of model performance. Because every
error contributes in proportion to its magnitude, the metric is useful
in applications where errors of all sizes are equally important. The MAE
is expressed in the same units as the target variable and is less
sensitive to outliers than squared-error metrics such as the RMSE.
Mathematically, the MAE is written as:
\[ \mathrm{MAE} = \frac{\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert}{n} \tag{6.1} \]
2. Root Mean Squared Error (RMSE): This performance evaluation metric is
popular because it is interpretable as the standard deviation of the model’s
prediction errors and specifies the closeness of predicted data to actual
data. This function is written as follows;
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{6.2} \]
3. Akaike Information Criterion (AIC): This evaluation standard was
developed using a frequentist probability framework that assigns a model
a score based on its maximum likelihood estimation. It rewards goodness
of fit through the maximised log-likelihood and penalises model
complexity through the number of estimated parameters K, so that a
lower score indicates a better trade-off. This criterion is expressed as:
\[ \mathrm{AIC} = 2K - 2\ln(\hat{L}) \tag{6.3} \]
where \(\ln(\hat{L})\) is the maximised log-likelihood of the model.
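The three criteria can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names and the sample rates are hypothetical, and the AIC variant assumes Gaussian residuals, for which the maximised log-likelihood is -(n/2)(ln(2π·RSS/n) + 1).

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (6.1)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, Eq. (6.2)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

def aic_gaussian(y_true, y_pred, k):
    """AIC = 2K - 2 ln(L), Eq. (6.3), ASSUMING Gaussian residuals."""
    n = len(y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Maximised Gaussian log-likelihood with variance estimated as RSS / n.
    log_likelihood = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * k - 2 * log_likelihood

# Hypothetical zonal rates in bbl, for illustration only.
actual = [100.0, 250.0, 0.0, 400.0]
predicted = [110.0, 240.0, 5.0, 380.0]

print(mae(actual, predicted))   # 11.25
print(rmse(actual, predicted))  # 12.5
print(aic_gaussian(actual, predicted, k=2))
```

For the same residuals, each extra parameter raises the AIC by exactly 2, which is how the criterion penalises models that use more input features without fitting better.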
Various evaluation metrics are employed to assess the accuracy of
prediction models based on the number of input features. Except
for R², a lower value of a performance evaluation criterion indicates
better model performance. If a conclusion cannot be drawn based on these
Per domain knowledge, these features relate to the total injected volume, the
zonal ICV position, which indicates that a particular zone is closed or open,
and the tubing pressures of each zone.
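The idea behind a model-agnostic importance score such as permutation feature importance can be sketched as follows. This is an illustrative pure-Python version, not the study's implementation: `toy_model`, the three feature columns and all numbers are hypothetical, and MAE is assumed as the scoring metric.

```python
import random

def mean_abs_error(y_true, y_pred):
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_abs_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = mean_abs_error(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical stand-in for a fitted model: the rate depends on ICV position
# (feature 0) and tubing pressure (feature 1); feature 2 is ignored.
def toy_model(row):
    icv_position, tubing_pressure, _unused = row
    return 50.0 * icv_position + 0.1 * tubing_pressure

data_rng = random.Random(42)
rows = [[data_rng.randint(0, 10), data_rng.uniform(1000, 3000), data_rng.random()]
        for _ in range(200)]
targets = [toy_model(r) for r in rows]

scores = permutation_importance(toy_model, rows, targets)
print(scores)
```

Shuffling a feature the model relies on degrades the predictions and yields a positive score, whereas the ignored third feature scores zero, which is the signal used to drop irrelevant inputs.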
Fig. 6.5. Comparison of RF and ET models using the seven relevant features versus all input features.
Fig. 6.6. Kernel density estimate demonstrating the similarity between the estimated and actual test
upper zonal rates.
Fig. 6.7. Kernel density estimate demonstrating the similarity between the estimated and actual test
middle zonal rate.
Fig. 6.8. Kernel density estimate demonstrating the similarity between the estimated and actual test
lower zonal rate.
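A kernel density estimate of the kind shown in Figs. 6.6 to 6.8 can be sketched with the standard library alone. The sketch below assumes a Gaussian kernel and a normal-reference rule-of-thumb bandwidth (1.06·σ·n^(-1/5)); the rates are hypothetical and the plotting step is omitted.

```python
import math
import statistics

def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth: 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)

def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates the Gaussian KDE of `samples` at x."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(samples)
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density

# Hypothetical actual and estimated zonal rates in bbl.
actual_rates = [0, 50, 120, 130, 200, 210, 400, 420]
estimated_rates = [5, 45, 115, 140, 190, 220, 390, 430]

kde_actual = gaussian_kde(actual_rates)
kde_estimated = gaussian_kde(estimated_rates)

# Evaluate both curves on a common grid; similar curves indicate that the
# estimated rates follow the same distribution as the actual rates.
grid = range(0, 451, 50)
for x in grid:
    print(x, round(kde_actual(x), 6), round(kde_estimated(x), 6))
```

Comparing the two curves on a shared grid is the numerical counterpart of overlaying them in a plot.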
level tends to deteriorate as high volumes of water were predicted for the
upper zone, which is analysed further. For the ET model, most data after
200,000 bbls (barrels) were widely scattered due to insufficient high injected
volume data into that zone. The ET model, however, successfully captured the
trends despite insufficient data from this analysis. Although the lower zone
experienced similar data inadequacy in high injected volumes, the model
did not experience low confidence in its prediction. This observation can be
attributed to the variation in the training data and the wide spread of test data
for the lower zone compared to the upper zone with a high concentration of
low injection volumes.
Fig. 6.12. Total water injection rate vs Combined predicted zonal rates.
Fig. 6.13. Comparison of actual and predicted upper zonal ICV rates against ICV position.
Fig. 6.14. Comparison of actual and predicted middle zonal ICV rates against ICV position.
Fig. 6.15. Comparison of actual and predicted lower zonal ICV rates against ICV position.
they make predictions based on ICV positions. This is to confirm that all the
predicted zonal rates went to the actual zone where the ICV was opened.
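Two sanity checks implied here can be sketched as a small function: the combined predicted zonal rates should match the total injected rate, and a zone whose ICV is closed should receive essentially no flow. The function name, the zone labels and both tolerances are hypothetical choices for illustration.

```python
def check_zonal_predictions(total_rate, zonal_rates, icv_positions,
                            rel_tol=0.05, closed_rate_tol=1.0):
    """Return a list of issues found in one hourly record (empty if consistent)."""
    issues = []
    combined = sum(zonal_rates.values())
    # Check 1: combined zonal rates versus the measured total injection rate.
    if abs(combined - total_rate) > rel_tol * max(abs(total_rate), 1.0):
        issues.append(
            f"combined zonal rate {combined:.1f} differs from total {total_rate:.1f}")
    # Check 2: a closed ICV (position 0) should carry no predicted flow.
    for zone, position in icv_positions.items():
        if position == 0 and abs(zonal_rates[zone]) > closed_rate_tol:
            issues.append(
                f"{zone} ICV is closed but predicted rate is {zonal_rates[zone]:.1f}")
    return issues

positions = {"upper": 0, "middle": 10, "lower": 6}

consistent = check_zonal_predictions(
    1000.0, {"upper": 0.0, "middle": 600.0, "lower": 390.0}, positions)
inconsistent = check_zonal_predictions(
    1000.0, {"upper": 150.0, "middle": 600.0, "lower": 390.0}, positions)

print(consistent)    # []
print(inconsistent)  # two issues: total mismatch and flow through a closed ICV
```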
5. Conclusions
In summary, eight supervised machine learning models were employed to
predict multi-zonal rates for a smart water injection well in an oil and gas
production setting. The models were first trained and evaluated on all
features, and PFI, a model-agnostic metric, was used to identify the relevant
features. Seven features were found to be relevant and were used as input
to retrain the top six models from the initial evaluation. The Extra Trees
model achieved the highest accuracy and consistency on the test data, with
an RMSE of 708 bbls and an MAE of 68 bbls. The Extra Trees predicted zonal
rates were also visualised using KDE and joint plots to confirm their accuracy.
Upon satisfactory results, the Extra Trees model was deployed on a new
well with five months of hourly data, and the combined predicted zonal
rates matched the total injected rate. Additionally, the model could predict
zonal rates in instances where the ICVs were in the closed and fully opened
positions. Overall, the results of this research demonstrate the potential of
machine learning in predicting multi-zonal rates in oil and gas production and
highlight the use of the Extra Trees model as a robust and effective tool for
this task.
This research provides preliminary results for the approach employed in
water injection wells. When used in producer wells, this method could provide
a simple and accurate data-driven approach to estimating each reservoir
unit’s volume of produced fluids based on downhole parameters. The method
can also provide real-time estimates of produced volumes, aiding advanced
formulation of a reservoir management plan, daily production operational
changes to meet critical targets, and the operator’s policy of developing
domestic oil resources responsibly. The success of this project will not
eliminate ICV mapping but will drastically reduce the number of mappings
performed over the life of the field. One limitation of this study is that the
model is only as good as the data used to train it. When new data fall outside
the range of the training data, ICV mapping will be required, and the model
will need to be retrained with the new data to fine-tune its predictions. This
limitation, however, does not diminish the considerable advantages of this
approach, which is recommended to save time and money during production
using a smart well completion.
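The limitation above suggests a simple guard before trusting a prediction: compare each incoming record with the range of the training data and flag anything outside it as a candidate for ICV mapping and retraining. The sketch below is one hypothetical way to do this; the feature names and values are illustrative only.

```python
def fit_feature_ranges(training_rows):
    """Record the minimum and maximum of every feature seen during training."""
    names = training_rows[0].keys()
    return {name: (min(r[name] for r in training_rows),
                   max(r[name] for r in training_rows))
            for name in names}

def out_of_range_features(row, ranges):
    """List the features of a new record that fall outside the training range."""
    return [name for name, (low, high) in ranges.items()
            if not low <= row[name] <= high]

# Hypothetical training records (units assumed: psi and ICV setting 0-10).
training = [
    {"tubing_pressure": 1800.0, "icv_position": 0},
    {"tubing_pressure": 2400.0, "icv_position": 10},
    {"tubing_pressure": 2100.0, "icv_position": 5},
]
ranges = fit_feature_ranges(training)

new_record = {"tubing_pressure": 2650.0, "icv_position": 7}
flagged = out_of_range_features(new_record, ranges)
print(flagged)  # ['tubing_pressure']
```

A per-feature range check is deliberately crude: it does not detect unusual combinations of in-range values, so it should be read as a first screen rather than a full novelty detector.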
Code Availability
The Jupyter Notebook used in this study is hosted at
https://github.com/ascotjnr/Smart-Well-Completion.
Acknowledgement
The authors wish to thank Universiti Teknologi PETRONAS and the Centre
for Subsurface Seismic Imaging and Hydrocarbon Prediction for supporting
this work.
References
Alhuthali, A.H., Datta-Gupta, A., Yuen, B. and Fontanilla, J.P. 2010. Field applications
of waterflood optimisation via optimal rate control with smart wells. SPE Reservoir
Evaluation & Engineering 13: 406–422. https://doi.org/10.2118/118948-PA.
Behrouz, T., Rasaei, M.R. and Masoudi, R. 2016. A novel integrated approach to oil production
optimisation and limiting the water cut using intelligent well concept: using case studies.
Iranian Journal of Oil and Gas Science and Technology 5: 27–41.
https://doi.org/10.22050/IJOGST.2016.13827.
Breiman, L. 1996. Bagging predictors. Mach. Learn. 24: 123–140.
https://doi.org/10.1007/bf00058655.
Breiman, L. 2001. Random forests. Mach. Learn. 45: 5–32.
https://doi.org/10.1023/A:1010933404324.
Brouwer, D.R. and Jansen, J.D. 2002. Dynamic optimisation of water flooding with smart
wells using optimal control theory. In: European Petroleum Conference. SPE, Aberdeen.
https://doi.org/10.2118/78278-MS.
Chen, T. and Guestrin, C. 2016. XGBoost: A scalable tree boosting system. pp. 785–794.
In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining. ACM, New York, NY, USA. https://doi.org/10.1145/2939672.2939785.
Geurts, P., Ernst, D. and Wehenkel, L. 2006. Extremely randomised trees. Mach. Learn.
63: 3–42. https://doi.org/10.1007/s10994-006-6226-1.
Gordon, A.D., Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. 1984. Classification
and regression trees. Biometrics 40: 874. https://doi.org/10.2307/2530946.
Huang, Z., Li, Y., Peng, Y., Shen, Z., Zhang, W. and Wang, M. 2011. Study of the intelligent
completion system for liaohe oil field. Procedia Eng. 15: 739–746.
https://doi.org/10.1016/j.proeng.2011.08.138.
Jalali, Y., Bussear, T. and Sharma, S. 1998. Intelligent completion systems: the reservoir
rationale. In: All Days. SPE. https://doi.org/10.2118/50587-MS.
Kuhn, M. and Johnson, K. 2013. Applied Predictive Modeling. New York, NY: Springer New
York. https://doi.org/10.1007/978-1-4614-6849-3.
Malakooti, R., Ayop, A.Z., Maulianda, B., Muradov, K. and Davies, D. 2020. Integrated
production optimisation and monitoring of multi-zone intelligent wells. J. Pet. Explor.
Prod. Technol. 10: 159–170. https://doi.org/10.1007/s13202-019-0719-5.
Mubarak, S.M., Pham, T.R., Shamrani, S.S. and Shafiq, M. 2008. Case study: the use of
downhole control valves to sustain oil production from the first maximum reservoir contact,
multilateral, and smart completion well in ghawar field. SPE Production & Operations
23: 427–430. https://doi.org/10.2118/120744-PA.
Naus, N.M.J.J., Dolle, N. and Jansen, J.-D. 2006. Optimisation of commingled production using
infinitely variable inflow control valves. SPE Production & Operations 21: 293–301.
https://doi.org/10.2118/90959-PA.
Otchere, D.A., Arbi Ganat, T.O., Gholami, R. and Lawal, M. 2021. A novel custom ensemble
learning model for an improved reservoir permeability and water saturation prediction.
J. Nat. Gas Sci. Eng. 91: 103962. https://doi.org/10.1016/j.jngse.2021.103962.
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942.
https://doi.org/10.3390/app12062942.
Otchere, D.A., Ganat, T.O.A., Nta, V., Brantson, E.T. and Sharma, T. 2022b. Data analytics
and Bayesian Optimised Extreme Gradient Boosting approach to estimate cut-offs from
wireline logs for net reservoir and pay classification. Appl. Soft Comput. 120: 108680.
https://doi.org/10.1016/j.asoc.2022.108680.
Otchere, D.A., Tackie-Otoo, B.N., Mohammad, M.A.A., Ganat, T.O.A., Kuvakin, N.,
Miftakhov, R., Efremov, I. and Bazanov, A. 2022c. Improving seismic fault mapping
through data conditioning using a pre-trained deep convolutional neural network: A
case study on Groningen field. J. Pet. Sci. Eng. 213: 110411.
https://doi.org/10.1016/J.PETROL.2022.110411.
Silverman, B.W. and Jones, M.C. 1989. E. Fix and J.L. Hodges (1951): An important contribution
to nonparametric discriminant analysis and density estimation. International Statistical
Review 57: 233–247.
Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal
Statistical Society. Series B (Methodological) 58: 267–288.
Yeten, B., Durlofsky, L.J. and Aziz, K. 2002. Optimisation of smart well control. In: SPE
International Thermal Operations and Heavy Oil Symposium and International Well
Technology Conference. Calgary.
Yeten, B. 2003. Optimum Deployment of Nonconventional Wells (PhD). Stanford University,
California.
Chapter 7
Carbon Dioxide Low Salinity Water
Alternating Gas (CO2 LSWAG)
Oil Recovery Factor Prediction in
Carbonate Reservoir
Using Supervised Machine Learning Models
Eric Thompson Brantson,1,* Zainab Ololade Iyiola,1
Yao Yevenyo Ziggah,2 Alexander Ofori Mensah,1
Daniel Asante Otchere,3,4 Efua Eduamba Abakah-Paintsil1
and Emmanuel Karikari Duodu1
1. Introduction
With rising global energy demand and diminishing oil reserves, enhanced oil
recovery (EOR) from existing brownfields is becoming increasingly important
(Sheng, 2011). As production from the petroleum reservoir increases, the
reservoir’s inherent primary energy depletes, resulting in insufficient reservoir
pressure to drive the oil to the surface. As a result, secondary recovery methods
(water or gas injection) are needed to boost production output. The oil
recovered through both primary and secondary processes ranges from about
1 Department of Petroleum and Natural Gas Engineering, School of Petroleum Studies, University of
Mines and Technology, Tarkwa, Ghana.
2 Department of Geomatic Engineering, Faculty of Geosciences and Environmental Studies,
University of Mines and Technology, Tarkwa, Ghana.
3 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
4 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
* Corresponding author: [email protected]
20%–40% of the original oil in place (OOIP) (Stalkup, 1984). Following the
application of primary and secondary oil recovery techniques, two-thirds of
the OOIP remains in the reservoir (Gbadamosi et al., 2019).
According to Brantson et al. (2019), tertiary oil recovery methods are
capable of retrieving a greater amount of oil than primary or secondary
recovery methods. In this study, EOR synergy methods involving the injection
of low salinity water (LSW) and carbon dioxide (CO2) gas were employed
to enhance oil recovery at the field scale. The use of LSW in field trials has
demonstrated a significant improvement in oil recovery (Robertson, 2007;
Webb et al., 2004; Vledder et al., 2010), making it a favourable option
over conventional chemical EOR methods in terms of chemical costs,
environmental impact, and field process implementation (Dang et al., 2014).
On the other hand, CO2-EOR has the potential to recover an additional
15%–20% of the remaining oil in place (Ahmed, 2018). The two main
mechanisms behind CO2 injection schemes are oil swelling and viscosity
reduction (AlQuraishi et al., 2019). All reservoir lithologies,
including siliciclastic, carbonate, and others, are suitable for CO2-EOR
provided they have an adequate seal to contain hydrocarbons and interconnected
pore space for fluid accumulation and flow (Verma, 2015). Estimating the
ultimate hydrocarbon recovery factor provides more detailed
insights into oilfield development strategies (Roustazadeh et al., 2022).
Material balance, decline curve analysis, and dynamic numerical
simulation are used to estimate recovery factors, but they are time-consuming.
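The percentages quoted in this section can be turned into a back-of-the-envelope estimate. The sketch below is illustrative arithmetic only: the OOIP of 100 MMbbl and the 30% primary-plus-secondary recovery are hypothetical values, and "remaining oil in place" is assumed to mean the oil left after primary and secondary recovery.

```python
def incremental_recovery(ooip, primary_secondary_fraction, eor_fraction_of_remaining):
    """Return (remaining oil, incremental EOR oil, overall recovery factor)."""
    recovered = ooip * primary_secondary_fraction
    remaining = ooip - recovered
    incremental = remaining * eor_fraction_of_remaining
    total_recovery_factor = (recovered + incremental) / ooip
    return remaining, incremental, total_recovery_factor

# Hypothetical field: OOIP of 100 MMbbl, 30% recovered by primary and
# secondary methods, CO2-EOR recovering 15% to 20% of the remaining oil.
low_case = incremental_recovery(100.0, 0.30, 0.15)
high_case = incremental_recovery(100.0, 0.30, 0.20)

print(low_case)
print(high_case)
```

Under these assumptions about 70 MMbbl remains after secondary recovery, CO2-EOR adds roughly 10.5 to 14 MMbbl, and the overall recovery factor rises from 30% to about 40.5% to 44% of OOIP.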
Over the years, there has been increasing awareness and concern over
the continuous buildup of greenhouse gases in the atmosphere. This is due to
human activities such as burning of fossil fuels, deforestation, and industrial
processes that emit greenhouse gases such as carbon dioxide (CO2), methane,
and nitrous oxide. The accumulation of these gases has led to an increase
in global temperatures, causing significant changes in the climate, which
poses a threat to the entire world. One of the consequences of climate change
is rising sea levels, which have been observed to be a direct result of the
increasing levels of CO2 and other greenhouse gases in the atmosphere. The
rising sea levels have led to devastating consequences, such as flooding of
coastal areas, displacement of communities, loss of habitats, and destruction
of infrastructure (Santos et al., 2014). Therefore, it is crucial to find ways to
mitigate the impact of greenhouse gas emissions on the environment and the
world at large.
In recent years, there has been considerable attention given to the potential
use of CO2-EOR as a long-term anthropogenic CO2 storage application
(Mandadige et al., 2016). CO2-EOR is a technique used in the oil and gas
industry to extract more oil from wells by injecting CO2 into the reservoir.
This technique not only increases the amount of recoverable oil but also has
CO2 LSWAG Oil Recovery Factor Prediction in Carbonate Reservoir 127
the added benefit of storing the CO2 underground. This can be an effective way
to mitigate the impact of greenhouse gas emissions on the environment and
provide a solution to the problem of global warming. Therefore, it is important
for researchers, policymakers, and stakeholders to continue exploring the
potential of CO2-EOR as a long-term anthropogenic CO2 storage application,
while also working towards reducing greenhouse gas emissions and finding
sustainable solutions to climate change.
The dominant cause of observed anthropogenic global warming has been
unrestricted CO2 emissions from fossil fuel combustion, driven by the rising
use of fossil fuels in electricity generation and manufacturing processes
(Kharecha et al., 2008). Over the last 35 years, the oil and gas industry has
developed numerous technologies and operational practices for
injecting CO2 for enhanced oil recovery (Yong et al., 2016). CO2 flooding has
emerged as one of the most promising EOR methods because it uses readily
available, naturally occurring CO2 from reservoirs (Sun et al., 2017). Typical
incremental oil recovery by CO2 flooding ranges between 5% and 25% (Wu
et al., 2021). The CO2 injection method has traditionally been used in reservoirs
with an oil gravity of less than 25 (Stosur, 2003).
In the quest to find the optimal use of CO2 in enhanced oil recovery for
secondary and tertiary modes, several techniques have been employed. These
techniques include miscible and immiscible CO2 flooding, CO2 huff-n-puff,
and CO2-foam injection, among others (Christensen et al., 1998). Amongst
these techniques, combined EOR methods have proven to improve oil
recovery significantly through synergy (Afzali et al., 2018; Teklu
et al., 2016). Many CO2-LSWAG experiments (AlQuraishi et al., 2011;
Dang et al., 2014; Naderi and Simjoo, 2019; Pourafshary and Moradpour,
2019; Zolfaghari et al., 2013) in both carbonate and sandstone reservoirs have
reported incremental oil recovery relative to the initial oil in place, but
little work has been done at reservoir scale.
Hybrid low-salinity gas flooding has garnered significant attention from
researchers in recent years (Jiang et al., 2010; Kulkarni et al., 2004) due to its
many advantages, such as low cost, low minimum miscibility pressure, and
environmental friendliness. CO2 is frequently used as the injection gas in this
hybrid approach because of these benefits (Ma and James, 2022). The injection
of LSW changes the gas solubility in water, affecting gas/oil interactions
and ultimately enhancing oil recovery. Despite the potential benefits, there
are varying observations regarding the effectiveness of this hybrid method,
with some researchers reporting improved recovery. Others, however, have
found no improvement in recovery over continuous gas injection (CGI).
Furthermore, CO2 partitioning between oil and water in the reservoir (Dang
et al., 2014) can lead to different recovery rates depending on the reservoir
conditions.
improving oil recovery rates during miscible flooding of sandstones containing
highly viscous crude oil. This is a critical finding that highlights the importance of considering
the impact of water salinity on WAG performance during miscible flooding.
The team’s results demonstrate that the use of LSW can significantly increase
oil recovery rates, providing further evidence of the potential benefits of LSW in
enhanced oil recovery techniques. These findings have important implications
for the petroleum industry, as they provide valuable insights into how water
salinity can affect WAG performance, and how this knowledge can be used to
optimise oil recovery rates in reservoirs with highly viscous crude oil.
In their experimental study, Al-Abri et al. (2019) investigated the effect
of hybrid injection of immiscible CO2 and smart water on sandstone core
samples. The study found that the synergy between gas injection and the
various ions present in the water samples led to a significant improvement
in oil recovery. To conduct the study, the researchers utilised three synthetic
brines, containing 5000 ppm of MgCl2, NaCl, and KCl, respectively.
The research revealed that while MgCl2-containing water had the highest
solubility of CO2 in brine, it resulted in the lowest oil recovery among the
three tests. Furthermore, the multicomponent ion exchange of the smart water
altered the rock’s wettability, making it more water-wet, thus improving oil
recovery without the need for gas injection.
The effectiveness of hybrid injection of LSW and miscible CO2 for EOR
in carbonates was investigated through a simulation study by Al-Shalabi
et al. (2016) using the UTCOMP reservoir simulator. The study aimed to
compare the performance of CGI with that of the hybrid method. The results
showed that the CGI method achieved a high recovery of 98.9%, while the
hybrid approach only increased it to 99.7% by controlling viscous fingering.
Therefore, the study concluded that the hybrid method may not be suitable
for conditions where gas miscibility is the primary mechanism for EOR. The
study’s findings could be useful in optimising the selection of EOR techniques
for carbonate reservoirs to achieve maximum recovery with minimum effort
and cost.
Consideration of the initial rock wettability is critical in hybrid LSW/gas
methods. The alteration of wettability from oil-wet to water-wet is particularly
significant in sandstone, as it positively affects the performance of LSW.
Conversely, if the rock is initially water-wet, the LSW and hybrid methods
will not be effective. Ramanathan et al. (2015) conducted an experimental
study of seawater alternating gas (SeaWAG) and LSWAG injection for oil
recovery from water-soaked sandstone. The study found that LSWAG had a
lower recovery factor than SeaWAG due to the initial high water saturation of
the rocks. In contrast, recovery by WAG increased from 76% to over 97% in
an aged oil-wet core when low-salinity brine replaced seawater.
130 Data Science and Machine Learning Applications in Subsurface Engineering
AlQuraishi et al. (2017) found that low-salinity water alternating miscible
CO2 was not effective in clay-free sandstones but recovered 35.1% of the
OOIP when clays were present. Yang et al. (2005) showed that CO2 decreases
the oil-brine interfacial tension (IFT) at constant temperature and pressure,
which can contribute to additional oil recovery in the hybrid method. Teklu
et al. (2016) and Ramanathan et al. (2016) reported that this decrease in IFT
was less than 10 dynes/cm, whereas Kumar et al. (2016) reported comparatively
high IFT in the presence of CO2. Despite these mixed findings, the change in
IFT is relatively small and is not considered the primary mechanism by which
the hybrid LSW/gas approach increases oil recovery. Other mechanisms, such as
the reduction of residual oil saturation and the improvement of displacement
efficiency, are believed to play a more significant role. Nonetheless, the
potential of CO2 to reduce IFT remains an important research topic in the oil
and gas industry.
The main issue with pure gas flooding is an unfavourable mobility ratio,
which results in viscous fingering and reduced volumetric sweep efficiency.
The WAG approach helps to solve this problem while requiring less gas for
EOR projects. Conventional CO2 WAG, however, typically delays oil
production, an issue that CO2 LSWAG can alleviate. By overcoming the late
production that conventional WAG commonly faces, CO2 LSWAG allows its
several recovery mechanisms to act together sooner.
According to previous studies (Dang et al., 2014; Pourafshary
and Moradpour, 2019; Sheng, 2014), the primary oil recovery mechanism in
CO2-LSWAG has been proposed to be wettability alteration to a more water-
wet condition.
However, modelling studies of CO2-LSWAG, both in a 1D homogeneous model
and at field scale in sandstone reservoirs (Dang et al., 2014), are
characterised by high computational cost (Belazreg and Mahmood, 2020; Jaber
et al., 2019). As Table 7.1 shows, there has been little or no application of
fast and reliable machine learning models to forecast the performance of
CO2-LSWAG in secondary and tertiary modes. Although most of the CO2-LSWAG
studies in Table 7.1 indicate an improvement in oil recovery, some
core-sample studies (Jiang et al., 2010; Ramanathan et al., 2015) reported
negative or neutral outcomes because the initial wettability was strongly
water-wet, which is not favourable for effective low-salinity water
injection. Furthermore, the review by Ma and James (2022) of CO2-LSWAG
laboratory experiments reported no machine learning models.
Table 7.1. CO2-LSWAG studies based on field, experimental, numerical simulation and machine learning models for oil recovery factor predictions.
Reference | Type of Porous Media | Type of Injection Fluid | Injection Scheme | Ultimate Oil Recovery, % OOIP | Experiment/Field/Numerical Simulation | Machine Learning Model
Al Quraisha et | Berea and Bentheimer | LSW + CO2 | LSW + HSW in | 82.40% | Experiment | No ML
ML = Machine Learning, LSW = Low Salinity Water, OOIP = Original Oil In Place
Al-Jifri et al. (2021) developed two new empirical equations for predicting
the oil recovery factor in waterflooded heterogeneous reservoirs from the
water injection rate, permeability anisotropy, water viscosity, and reservoir
heterogeneity. No such proxy correlations currently exist for CO2-LSWAG,
which this study strives to provide. Furthermore, Roustazadeh et al. (2022)
used three regression-based methods, namely support vector machine (SVM),
extreme gradient boosting (XGBoost), and stepwise multiple linear regression
(MLR), together with various combinations of three databases, to construct
machine learning (ML) models that estimate the oil and/or gas recovery factor
(RF). Other authors (Aliyuda et al., 2020; Alpak et al., 2019; Chen et al.,
2020; Ibrahim et al., 2022; Sharma et al., 2010; Tahmasebi et al., 2020) also
applied ML algorithms to predict hydrocarbon recovery factors for different
reservoirs, but none addressed CO2-LSWAG, the subject of the present study.
In this study, a carbonate field model based on CO2-LSWAG flooding
was simulated using a compositional simulator with geochemical models
incorporated, and then Multivariate Adaptive Regression Splines (MARS) and
Group Method of Data Handling (GMDH) machine learning methods were
used to develop proxy models for predicting the oil recovery factor. This
study therefore advocates the injection of low-salinity water alternating
CO2 as an EOR technique because of its high recovery factor, which stems
from improved microscopic and macroscopic displacement efficiencies. Using
machine learning proxy models as prediction tools will also support efficient
full-field implementation of this technique in carbonate reservoirs by
reducing the computational time associated with numerical simulations (Amar
et al., 2021; Kalam et al., 2021).
This paper is organised as follows. Section 2 describes the methodology,
which couples a compositional simulator with fluid flow and geochemical
modelling and applies machine learning tools. Section 3 presents the results
and discussion, analysing the proposed method and comparing it with the use
of a conventional simulator to optimise operational conditions. Section 4
concludes the study and summarises its major findings.
2. Methodology
2.1 Modelling of CO2-LSWAG
This paper’s numerical method used a compositional simulator to generate
various scenarios of the CO2-LSWAG in a carbonate reservoir with SO4
CO2 LSWAG Oil Recovery Factor Prediction in Carbonate Reservoir 133
concentrations of the water injected for the CO2 LSWAG. The datasets used
were taken from the simulation results and then divided into training and
testing data. Figure 7.1 shows the general workflow followed by all the
supervised machine learning algorithms used in this study for CO2-LSWAG ORF
prediction, which is treated as a single-objective problem.
where ORF is the oil recovery factor, the target variable. Several parameters
must be taken into account when modelling the interaction between two or more
variables in a reservoir system: the intercept (Co), the number of basis
function terms (M), and the unknown coefficient of the kth basis function
(Ck), where k ranges from 1 to M. The basis functions themselves are denoted
by βk, and X represents the input variables for the reservoir parameters.
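For orientation, the symbols defined above fit the standard additive MARS expansion. The form below is a sketch assembled from those definitions and from the general MARS formulation in Friedman (1991); it is an assumption about the shape of the model, not a copy of the chapter's own numbered equation:

```latex
\mathrm{ORF} = C_o + \sum_{k=1}^{M} C_k \, \beta_k(X)
```

Read this way, each basis function contributes one weighted term, and M controls how many such terms the fitted model carries.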
According to Friedman’s (1991) findings, the MARS technique uses
multivariate spline functions as basis functions, which are represented in
Eq. (7.11):
$$
\beta_k(X) = \prod_{z=1}^{Z_n} \left[ p_{zn} \cdot \left( X_{m(z,n)} - s_{zn} \right) \right]_{+}, \tag{7.11}
$$
$$
\mathrm{GCV}(\phi) = \frac{\sum_{i=1}^{N} \left( O_i - P_i \right)^2}{\left[ 1 - \dfrac{M(\phi)}{N} \right]^2}, \tag{7.12}
$$
A tuning parameter, denoted φ, is used to balance the size of the model
against its fit to the dataset. N is the number of observations in the
training dataset, while Oi and Pi denote the observed and predicted values,
respectively. Adjusting φ regulates the trade-off between the complexity of
the model and its accuracy in capturing the patterns in the dataset. M(φ) is
defined in Eq. (7.13) as the effective number of parameters employed in the
model:
M(φ) = (φ + 1) + d × φ, (7.13)
The parameter d serves as a penalty or smoothing factor in the
non-parametric MARS algorithm, where its value determines the number
of basis functions and the smoothness of the estimated functions. A higher
value of d leads to fewer basis functions and smoother estimates, whereas a
lower value results in a larger model with more basis functions. For
guidance on selecting d and a comprehensive explanation of the non-parametric
MARS algorithm, see Friedman (1991).
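The GCV criterion and the penalty term can be sketched in a few lines of Python. This is a minimal illustration that follows Eqs. (7.12) and (7.13) as printed in this chapter; the function names and the sample values are hypothetical, and many references additionally divide the numerator by N, which is not done here.

```python
def effective_parameters(phi: float, d: float) -> float:
    """Effective number of parameters, M(phi) = (phi + 1) + d * phi (Eq. 7.13)."""
    return (phi + 1) + d * phi


def gcv(observed, predicted, phi: float, d: float) -> float:
    """GCV score following Eq. (7.12) as printed (no 1/N factor in the numerator)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    m = effective_parameters(phi, d)
    if m >= n:
        # The denominator (1 - M/N)^2 is only meaningful when M < N.
        raise ValueError("effective number of parameters must be smaller than N")
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sse / (1 - m / n) ** 2


# Hypothetical values, used only to show the effect of the penalty d.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 5.0]
print(gcv(obs, pred, phi=1, d=0.0))
print(gcv(obs, pred, phi=1, d=0.5))
```

For the same residuals, a larger d raises M(φ) and therefore the GCV score, which is why higher penalties favour smaller models.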
The MARS basis function for univariate linear regression on the pth piece
of variable xk is expressed in terms of max(.) in Eq. (7.14), which shows how
BFp(.) is represented within the MARS framework.
BFp(xk) = max(0, xk – ap) or BFp(xk) = max(0, ap – xk). (7.14)
In mathematical notation, max(.) indicates that only the positive portion of
the argument is retained while the negative portion is set to zero, as
presented in Eq. (7.15). Piecewise linear functions can therefore be
represented in the form max(0, xk – ap), with the knot located at a
particular value ap.
$$
\max(0, x_k - a_p) =
\begin{cases}
x_k - a_p, & x_k \ge a_p \\
0, & \text{otherwise}
\end{cases}, \tag{7.15}
$$
The max(.) function is used to build a local linear regression that is
continuous at the knots. The best knot locations are chosen through recursive
spline fitting and splitting. Note that BFp(xk) in Eq. (7.14) is non-zero
only when the second argument of max(.) exceeds zero. Additionally, when
modelling the interaction of two variables, the basis function BFps(.) can be
expressed as the product of two univariate basis functions for xk and xi, as
shown in Eq. (7.16):
BFps(xk, xi) = BFp(xk) × BFs(xi). (7.16)
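The hinge functions of Eqs. (7.14) to (7.16) are simple enough to write out directly. The sketch below is illustrative only; the function names, the `mirrored` flag and the numeric knots are assumptions introduced for the example and do not come from the fitted model.

```python
def hinge(x: float, knot: float, mirrored: bool = False) -> float:
    """Return max(0, x - knot), or max(0, knot - x) when mirrored (Eq. 7.14)."""
    return max(0.0, knot - x) if mirrored else max(0.0, x - knot)


def interaction(x_k: float, knot_p: float, x_i: float, knot_s: float) -> float:
    """Product of two univariate hinge functions (Eq. 7.16)."""
    return hinge(x_k, knot_p) * hinge(x_i, knot_s)


# Hypothetical knots at 3.0 and 1.0, chosen only for illustration.
print(hinge(5.0, 3.0))                   # right of the knot: positive part kept
print(hinge(2.0, 3.0))                   # left of the knot: set to zero
print(hinge(2.0, 3.0, mirrored=True))    # mirrored hinge is active left of the knot
print(interaction(5.0, 3.0, 4.0, 1.0))   # non-zero only when both hinges are active
```

Because each hinge is zero on one side of its knot, an interaction term contributes only in the region where both variables lie beyond their respective knots.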
connectivity, network size, and the coefficient for the optimum model with
model reduction with less human intervention (Ivakhnenko, 1971). The
input-output relationship is expressed in polynomial form, with the model
automatically selecting the most influential parameters (Farlow, 1984). In
this study, GMDH is used to develop a mathematical model for predicting the
CO2-LSWAG oil recovery factor. Ivakhnenko (1971) applied the
Kolmogorov-Gabor polynomial to compute the output parameter, as expressed in
Eq. (7.17):
$$
\mathrm{ORF}_{\mathrm{GMDH}} = a_o + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j x_i x_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_i a_j a_k x_i x_j x_k + \cdots \tag{7.17}
$$
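$$
r = \frac{\sum_{i=1}^{n} \left( \mathrm{ORF}_{\text{observed}} - \mathrm{ORF}_{\text{observed mean}} \right) \times \left( \mathrm{ORF}_{\text{predicted}} - \mathrm{ORF}_{\text{predicted mean}} \right)}{\sqrt{\sum_{i=1}^{n} \left( \mathrm{ORF}_{\text{observed}} - \mathrm{ORF}_{\text{observed mean}} \right)^2 \times \sum_{i=1}^{n} \left( \mathrm{ORF}_{\text{predicted}} - \mathrm{ORF}_{\text{predicted mean}} \right)^2}} \tag{7.21}
$$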
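As a worked illustration of Eq. (7.21), the correlation coefficient between observed and predicted ORF values can be computed as below. This sketch assumes r is the standard Pearson coefficient, with the square root over the product of the two sums of squares; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(observed, predicted) -> float:
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    # Numerator: sum of products of deviations from the respective means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    # Denominator: square root of the product of the two sums of squares.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    denom = math.sqrt(ss_obs * ss_pred)
    if denom == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / denom


# Hypothetical ORF values (in %), used only to exercise the function.
print(pearson_r([20.0, 35.0, 50.0, 65.0], [21.0, 34.0, 52.0, 63.0]))
```

Values of r close to 1 indicate that the predictions rise and fall with the observations, although r alone does not reveal a constant offset between them.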
Fig. 7.2. Petrophysical properties of the model: (a) Porosity model, (b) Permeability model.
Fig. 7.3. Low and high salinity water and oil relative permeability curves (set #1 is for the original
relative permeability plot while set #2 is for the LSW relative permeability plot).
carbonates with SO4²⁻ simple relative permeability interpolation. The
component wizard used linear interpolation for SO4²⁻ concentration to effect
the change.
Upon completion of the reservoir simulation model, the wells were
completed. In this study, the inverse five-spot pattern was determined to be
the most suitable well arrangement for the reservoir simulation. All six grid
layers were perforated in both the production and injection wells, all of
which were vertical. Placing the injection well at the centre allowed the
injected fluid to sweep the reservoir efficiently, moving the oil towards the
production wells at the corners and resulting in a high areal sweep
efficiency, which played a significant role in improving oil recovery.
Parameter Value
Reference pressure, psi 3337
Reference depth, ft 8596
Water-Oil contact depth, ft 8950
Permeability, mD 29.76–269.40
Porosity, % 9.5–29.
Top of Reservoir Sand, ft 8596
Swcon 0.076
Soirw 0.1434
1-Sorw 0.75
1-Soirw 0.8566
pH 5.22
Water Injection Rate, bbl/day 7,000
Gas Injection Rate, MMft3/day 30
Injector Bottomhole Pressure, psi 6,710
Formation Water Salinity, ppm 90,044.10
Equation of State (EOS) Peng Robinson
Table 7.3. Base case for the formation and injected water.
Table 7.4. Dataset ranges for CO2-LSWAG oil recovery factor prediction.
Fig. 7.4. Simulation Parameters: (a) Average pressure base case and CO2 LSWAG, (b) LSW and CO2 injection rate.
Fig. 7.5. Oil saturation maps for the simulation period: (a) oil saturation at 5 years, (b) oil saturation at 10 years, (c) oil saturation at 15 years.
are 20, 1 and 3, respectively. Hence, the relative importances of the five
input variables used are oil cumulative production (100%), gas cumulative
production (2.22%), cumulative gas injection (2.36%), water injection rate
(1.63%), and average pressure (1.46%). The relevance factors computed for
all the automatically chosen parameters indicate a positive impact on the
ORF, the target. The final MARS model developed for CO2-LSWAG is expressed
in Eq. (7.23), and Table 7.5 lists the basis functions and corresponding
equations of this ORF MARS model:
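$$
\begin{aligned}
\mathrm{ORF} = {} & 25.7653 + 1.84934 \times 10^{-6} \times \mathrm{BF1} - 1.57532 \times 10^{-6} \times \mathrm{BF2} + 1.00632 \times 10^{-8} \times \mathrm{BF4} \\
& - 2.33007 \times 10^{-9} \times \mathrm{BF5} + 2.36461 \times 10^{-9} \times \mathrm{BF7} + 0.00204046 \times \mathrm{BF10} \\
& - 3.65011 \times 10^{-9} \times \mathrm{BF12} + 3.67347 \times 10^{-6} \times \mathrm{BF13} + 0.0775678 \times \mathrm{BF15} \\
& - 3.02324 \times 10^{-7} \times \mathrm{BF17} - 4.68672 \times 10^{-7} \times \mathrm{BF19}
\end{aligned}
\tag{7.23}
$$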
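Once the basis-function values are known, Eq. (7.23) reduces to a weighted sum. The sketch below encodes the intercept and coefficients exactly as printed in Eq. (7.23); the function name and the dictionary layout are assumptions for illustration, and the basis-function values themselves must be computed from the definitions in Table 7.5, which are not reproduced here.

```python
MARS_INTERCEPT = 25.7653

# Coefficients of Eq. (7.23), keyed by basis-function label.
MARS_COEFFICIENTS = {
    "BF1": 1.84934e-6,
    "BF2": -1.57532e-6,
    "BF4": 1.00632e-8,
    "BF5": -2.33007e-9,
    "BF7": 2.36461e-9,
    "BF10": 0.00204046,
    "BF12": -3.65011e-9,
    "BF13": 3.67347e-6,
    "BF15": 0.0775678,
    "BF17": -3.02324e-7,
    "BF19": -4.68672e-7,
}


def mars_orf(basis_values: dict) -> float:
    """Evaluate Eq. (7.23) from a mapping of basis-function label to value."""
    missing = sorted(set(MARS_COEFFICIENTS) - set(basis_values))
    if missing:
        raise KeyError(f"missing basis-function values: {missing}")
    return MARS_INTERCEPT + sum(
        coeff * basis_values[name] for name, coeff in MARS_COEFFICIENTS.items()
    )


# Placeholder input: every basis function inactive (all hinges equal to zero).
inactive = {name: 0.0 for name in MARS_COEFFICIENTS}
print(mars_orf(inactive))
```

With every hinge inactive the prediction collapses to the intercept, which makes the intercept a convenient sanity check when wiring up the basis functions.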
Figures 7.6a to 7.6d show the cross plots for the MARS training and
testing models used in this study. Both the training and testing data points
lie close to the ideal line, which indicates the robustness of the MARS
model. Three independent datasets (low salinities of 573.86 ppm, 1250.51 ppm,
and 2949.15 ppm) were used to test the MARS model and verify its accuracy and
reliability. The CO2-LSWAG recovery factor prediction can also be assessed
from the statistical performance reported in Table 7.6. The boxplot in
Fig. 7.6e shows the residual distributions and outliers for both the MARS
training and testing sets. The residuals do not vary significantly from zero,
and their mean and median lie close to zero, as expected of a good model. The
trained model can therefore predict the testing datasets with acceptable
accuracy.
Fig. 7.6. MARS training and testing models: (a) Crossplot for the MARS training model,
(b) Crossplot for the MARS testing for 573.86 ppm, (c) Crossplot for the MARS testing for 1250.51
ppm, (d) Crossplot for the MARS testing for 2949.15 ppm, (e) Boxplot residuals for MARS training
and testing.
Fig. 7.7. GMDH training and testing models: (a) Crossplot for the GMDH training model,
(b) Crossplot for the GMDH testing for 573.86 ppm, (c) Crossplot for the GMDH testing for
1250.51 ppm, (d) Crossplot for the GMDH testing for 2949.15 ppm, (e) Boxplot residuals for GMDH
training and testing.
Table 7.7. Training and testing results for the GMDH model.
performance. Table 7.7 also shows that the GMDH model performed better on
the testing data than the MARS model. The boxplot in Fig. 7.7e shows the
residual distributions for both the GMDH training and testing sets, with the
fewest outliers occurring in GMDH testing. The residuals do not vary
significantly from zero, and their mean and median lie close to zero, as
expected of a good model. The trained model can therefore predict the testing
datasets with acceptable accuracy.
Figure 7.8 shows the GMDH topology used to develop the ORF equations.
The ORF was formed from three variables from the input layer. The OCP
variable combines with CLSWI and LSWIR in the input layer to form two
variables in the hidden layer before combining them to build the target layer.
The summary of the GMDH model’s equations for two layers is expressed
in Eqs. (7.24) to (7.26) as:
Layer #1
Number of neurons: 2
A = a0 + a1 × OCP + a2 × CLSWI + a3 × OCP × CLSWI + a4 × (OCP)²
+ a5 × (CLSWI)² (7.24)
a0 = 0.03088 a1 = 0.94845 a2 = 0.04723
a3 = 0.05133 a4 = 0.02253 a5 = 0.00212
B = –b0 + b1 × LSWIR + b2 × OCP – b3 × OCP × LSWIR – b4 × (LSWIR)²
+ b5 × (OCP)² (7.25)
b0 = 0.00396 b1 = 0.00360 b2 = 0.99753
b3 = 0.00431 b4 = 0.00041 b5 = 0.01058
Layer #2
Number of neurons: 1
ORF = –c0 + c1 × B + c2 × A + c3 × A × B – c4 × (B)² + c5 × (OCP)² (7.26)
c0 = 0.00038 c1 = 0.61360 c2 = 0.37942
c3 = 409.55682 c4 = 205.30898 c5 = 204.23896
where A and B are virtual independent inputs or nodal variables used in the
GMDH neural network.
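The two-layer structure of Eqs. (7.24) to (7.26) can be chained in code as follows. This is a sketch that transcribes the equations and coefficients as printed, including the (OCP)² term in the output neuron; the function names are hypothetical, and the inputs are assumed to be scaled in whatever way the training data were, which the text here does not specify.

```python
def neuron_a(ocp: float, clswi: float) -> float:
    """Layer 1 neuron A, Eq. (7.24)."""
    a0, a1, a2, a3, a4, a5 = 0.03088, 0.94845, 0.04723, 0.05133, 0.02253, 0.00212
    return a0 + a1 * ocp + a2 * clswi + a3 * ocp * clswi + a4 * ocp**2 + a5 * clswi**2


def neuron_b(lswir: float, ocp: float) -> float:
    """Layer 1 neuron B, Eq. (7.25), with the signs as printed."""
    b0, b1, b2, b3, b4, b5 = 0.00396, 0.00360, 0.99753, 0.00431, 0.00041, 0.01058
    return -b0 + b1 * lswir + b2 * ocp - b3 * ocp * lswir - b4 * lswir**2 + b5 * ocp**2


def gmdh_orf(ocp: float, clswi: float, lswir: float) -> float:
    """Layer 2 output neuron, Eq. (7.26), combining A and B."""
    c0, c1, c2, c3, c4, c5 = 0.00038, 0.61360, 0.37942, 409.55682, 205.30898, 204.23896
    a = neuron_a(ocp, clswi)
    b = neuron_b(lswir, ocp)
    # The final term uses OCP, following Eq. (7.26) as printed.
    return -c0 + c1 * b + c2 * a + c3 * a * b - c4 * b**2 + c5 * ocp**2


# Placeholder inputs (assumed already scaled); not taken from the dataset.
print(gmdh_orf(ocp=0.5, clswi=0.2, lswir=0.1))
```

Keeping each neuron as its own function mirrors the topology in Fig. 7.8, where OCP feeds both hidden neurons before they are combined in the target layer.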
Fig. 7.9. Simulation time for machine learning and numerical models.
4. Conclusion
The problem addressed in this study is the lack of attention and research
on the use of machine learning techniques to model CO2-LSWAG for EOR in
carbonate reservoirs. The quality and composition of the water used in most
EOR processes are not adequately considered, and there is a need for fast
computational methods for predicting future recovery factors. We used
multiphase multicomponent flow equations, geochemical modelling, and
compositional simulation datasets to build proxy models for predicting
oil recovery factors in carbonate reservoirs for CO2-LSWAG EOR. Two
supervised machine learning models (MARS and GMDH) were used to
Acknowledgment
We express our appreciation to the anonymous reviewers whose valuable inputs
have significantly contributed to the success of this research. Additionally, our
gratitude goes to the Computer Modelling Group (CMG) for providing us
with the commercial software that was instrumental to the completion of this
project. Lastly, we acknowledge the indispensable support of the University of
Mines and Technology, Tarkwa, Ghana, GNPC School of Petroleum Studies,
and the Petroleum and Natural Gas Engineering Department.
References
Afzali, S., Rezaei, N. and Zendehboudi, S.A. 2018. Comprehensive review on enhanced oil recovery by water alternating gas (WAG) injection. Fuel 227: 218–246. https://doi.org/10.1016/j.fuel.2018.04.015.
Ahmed, T. 2018. Reservoir Engineering Handbook (4th Edn.). Elsevier Inc.
Al-Abri, H., Pourafshary, P., Mosavat, N. and Al Hadrami, H. 2019. A study of the performance of LSWA CO2 EOR technique on improvement of oil recovery in sandstones. Petroleum 5(1): 58–66. https://doi.org/10.1016/j.petlm.2018.07.003.
Alaidan, A. and Mamore, D. 2010. SWACO2 and WACO2 efficiency improvement in carbonate cores by lowering water salinity. In: Canadian Unconventional Resources and International Petroleum Conference. https://doi.org/10.2118/137548-MS.
Aleidan, A. and Mamora, D. 2011. Miscible CO2 injection in highly heterogeneous carbonate cores: experimental and numerical simulation studies. In: SPE Middle East Oil and Gas Show and Conference. https://doi.org/10.2118/141469-MS.
Aliyuda, K., Howell, J. and Humphrey, E. 2020. Impact of geological variables in controlling oil-reservoir performance: an insight from a machine-learning technique. SPE Reservoir Evaluation & Engineering 23(04): 1314–1327. https://doi.org/10.2118/201196-PA.
Al-Jifri, M., Al-Attar, H. and Boukadi, F. 2021. New proxy models for predicting oil recovery factor in waterflooded heterogeneous reservoirs. Journal of Petroleum Exploration and Production 11(3): 1443–1459. https://doi.org/10.1007/s13202-021-01095-4.
Alpak, F.O, Araya–Polo, M. and Onyeagoro, K. 2019. Simplified dynamic modeling of faulted turbidite reservoirs: a deep-learning approach to recovery-factor forecasting for exploration. In: SPE Reservoir Evaluation & Engineering 22(04): 1240–55. https://doi.org/10.2118/197053-PA.
AlQuraishi, A.A. and Shokir, E.M. El-M. 2011. Experimental investigation of miscible CO2 flooding, Journal of Petroleum Science and Technology 29(19). https://doi.org/10.1080/10916461003662976.
AlQuraishi, A.A., Amao, A.M., Al-Zahrani, N.I., AlQarni, M.T. and AlShamrani, S.A. 2019. Low salinity water and CO2 miscible flooding in Berea and Bentheimer sandstones. Journal of King Saud University – Engineering Sciences 31(3): 286–295. https://doi.org/10.1016/j.jksues.2017.04.001.
Al-Shalabi, E.W., Sepehrnoori, K. and Pope, G. 2014. Geochemical investigation of the combined effect of injecting low salinity water and carbon dioxide on carbonate reservoirs. Energy Procedia 63: 7663–7676. https://doi.org/10.1016/j.egypro.2014.11.800.
Al-Shalabi, E., Sepehrnoori, K. and Pope, G. 2016. Numerical modelling of combined low salinity water and carbon dioxide in carbonate cores. Journal of Petroleum Science and Engineering 137: 157–171. https://doi.org/10.1016/j.petrol.2015.11.021.
Amar, M.N., Ghahfarokhi, A.J., Ng, C.S.W. and Zeraibi, N. 2021. Optimization of WAG in real geological field using rigorous soft computing techniques and nature-inspired algorithms. Journal of Petroleum Science and Engineering 206(109038): 1–13. https://doi.org/10.1016/j.petrol.2021.109038.
Belazreg, L. and Mahmood, S.M. 2020. Water alternating gas incremental recovery factor prediction and WAG pilot lessons learned. Journal of Petroleum Exploration and Production Technology 10: 249–269. https://doi.org/10.1007/s13202-019-0694-x.
Brantson, E.T., Ju, B., Omisore, O.O., Wu, D., Aphu, E.S. and Liu, N. 2018a. Development of machine learning predictive models for history matching tight gas carbonate reservoir production profiles. Journal of Geophysics and Engineering 15(5): 7–12. https://doi.org/10.1088/1742-2140/aaca44.
Brantson, E.T., Ju, B., Omisore, B.O., Wu, D., Selase, A.E. and Liu, N. 2018b. Development of machine learning predictive models for history matching tight gas carbonate reservoir production profiles. Journal of Geophysics and Engineering 15(5): 2235–2251.
Brantson, E.T., Ju, B., Ziggah, Y.Y., Akwensi, P.H., Sun, Y., Wu, D. and Addo, B.J. 2019. Forecasting of horizontal gas well production decline in unconventional reservoirs using productivity, soft computing and swarm intelligence models. Natural Resources Research 28: 717–756.
Chen, Y., Zhu, Z., Lu, Y., Hu, C., Gao, F., Li, W., Sun, N. and Feng, T. 2020. Reservoir recovery estimation using data analytics and neural network based analogue study. In: SPE/IATMI Asia Pacific Oil & Gas Conference and Exhibition. https://doi.org/10.2118/196487-MS.
Christensen, J.R., Stenby, E.H. and Skauge, A. 1998. Review of WAG field experience. In: Proceedings of the International Petroleum Conference and Exhibition of Mexico. https://doi.org/10.2118/90589-MS.
Dang, C.T.Q, Nghiem, L.X, and Chen, Z. 2014. CO2 low salinity water alternating gas: a new promising approach for enhanced oil recovery. In: SPE Improved Oil Recovery Symposium, 1–19. https://doi.org/10.2118/169071-MS.
Farlow, S.J. 1984. Self-Organizing Methods in Modeling GMDH Type Algorithms. New York: Marcel-Dekker. CRC Press.
Friedman, J.H. 1991. Estimating Functions of Mixed Ordinal and Categorical Variables using Adaptive Splines (pp. 1–42). Department of Statistics, Stanford Univ.
Gbadamosi, A.O., Radzuan, J., Manan, M.A., Agi, A. and Adeyinka, S.Y. 2019. An overview of chemical enhanced oil recovery: recent advances and prospects. Int. Nano Lett. 9: 3–10. https://doi.org/10.1007/s40089-019-0272-8.
Hamouda, A.A. and Pranoto, A. 2016. Synergy between low salinity water flooding and CO2 for EOR in chalk reservoirs. In: Proceedings of the SPE EOR Conference at Oil and Gas West Asia, 2016; Society of Petroleum Engineers. TX, USA: Richardson.
Ibrahim, A.F., Alarifi, S.A. and Elkatatny, S. 2022. Application of machine learning to predict estimated ultimate recovery for multistage hydraulically fractured wells in Niobrara shale formation. Computational Intelligence and Neuroscience 1–10. http://dx.doi.org/10.1155/2022/7084514.
Ivakhnenko, A.G. 1966. Group method of data handling a rival of the method of stochastic approximation. Soviet Automatic Control 1(13): 43–71.
Ivakhnenko, A.G. 1971. Polynomial theory of complex system. IEEE Transactions on System, Man and Cybernetics 1(4): 364–378. 10.1109/TSMC.1971.4308320.
Jaber, A.K., Alhuraishawy, A.K. and AL-Bazzaz, W.H. 2019. A data-driven model for rapid evaluation of miscible CO2-WAG flooding in heterogeneous clastic reservoirs. In: Proceedings of the SPE Kuwait Oil & Gas Show and Conference, 13–16. https://doi.org/10.2118/198013-MS.
Jiang, H., Nuryaningsih, L. and Adidharma, H. 2010. The effect of salinity of injection brine on water alternating gas performance in tertiary miscible carbon dioxide flooding: experimental study. SPE Western Regional Meeting. https://doi.org/10.2118/132369-MS.
Kalam, S., Khan, R.A., Khan, S., Faizan, M., Amin, M., Ajaib, R. and Abu-Khamsin, S.A. 2021. Data-driven modelling approach to predict the recovery performance of low-salinity waterfloods. Natural Resources Research 30: 1697–1717. https://doi.org/10.1007/s11053-020-09803-3.
Kharecha, P.A. and Hansen, J.E. 2008. Implications of peak oil for atmospheric CO2 and climate. Global Biogeochem 22: 6–10. https://doi.org/10.48550/arXiv.0704.2782.
Kulkarni, M.M. and Rao, D.N. 2004. Experimental investigation of various methods of tertiary gas injection. Paper Presented at the SPE Annual Technical Conference and Exhibition. https://doi.org/10.2118/90589-MS.
Kumar, H., Shehata, A. and Nasr-El-Din, H. 2016. Effectiveness of low salinity and CO2 flooding hybrid approaches in low permeability sandstone reservoirs. In: SPE Trinidad and Tobago Section Energy Resources Conference. https://doi.org/10.2118/180875-MS.
Lv, Q., Zhou, T., Zheng, R., Nakhaei-Kohani, R., Riazi, M., Hemmati-Sarapardeh, A., Li, J. and Wang, W. 2023. Application of group method of data handling and gene expression programming for predicting solubility of CO2-N2 gas mixture in brine. Fuel 332(6): 126025, 10.1016/j.fuel.2022.126025.
Ma, S. and James, L.A. 2022. Literature review of hybrid CO2 low salinity water-alternating-gas injection and investigation on hysteresis effect. Energies 15(21): 7891. https://doi.org/10.3390/en15217891.
Mandadige, S.P., Ranjith, P.G., Tharaka, D.R., Ashani, S.R., Koay, A. and Choi, X. 2016. A review of CO2-enhanced oil recovery with a simulated sensitivity analysis. Energies 9(7): 7–22. https://doi.org/10.3390/en9070481.
Naderi, S. and Simjoo, M. 2019. Numerical study of low salinity water alternating CO2 injection for enhancing oil recovery in a sandstone reservoir: coupled geochemical and fluid flow modeling. Journal of Petroleum Science and Engineering 173: 279–286. https://doi.org/10.1016/j.petrol.2018.10.009.
Pourafshary, P. and Moradpour, N. 2019. Hybrid EOR methods utilizing low-salinity water. Enhanc. Oil Recovery Process. New Technol. 8: 25.
Ramanathan, R., Shehata, A. and Nasr-El-Din, H. 2015. Water alternating CO2 injection process - does modifying the salinity of injected brine improve oil recovery? In: Proceedings of the OTC Brasil, Rio de Janeiro, Brazil, 27 October; Offshore Technology Conference: Rio de Janeiro, Brazil. https://doi.org/10.4043/26253-MS.
Ramanathan, R., Shehata, A. and Nasr-El-Din, H. 2016. Effect of rock aging in oil recovery during water-alternating-CO2 injection process. In: SPE Improved Oil Recovery Conference. https://doi.org/10.2118/179674-MS.
Robertson, E.P. 2007. Low-Salinity Waterflooding to Improve Oil Recovery-Historical Field Evidence. Paper Presented at the Annual Technical Conference and Exhibition, 11–14, https://doi.org/10.2118/109965-MS.
Roustazadeh, A., Ghanbarian, B., Shadmand, M.B., Taslimitehrani, V. and Lake, L.W. 2022. Estimating Oil and Gas Recovery Factors via Machine Learning: Database-Dependent Accuracy and Reliability. Preprint. http://dx.doi.org/10.48550/arXiv.2210.12491.
Santos, R., Loh, W., Bannwart, A. and Trevisan, O. 2014. An overview of heavy oil properties and its recovery and transportation methods. Brazilian Journal of Chemical Engineering 31(3): 571–576. http://dx.doi.org/10.1590/0104-6632.20140313s00001853.
Saxena, K. 2017. Low Salinity Water Alternate Gas Injection Process for Alaskan Viscous Oil EOR [Master’s Thesis, University of Alaska]. University of Alaska Fairbanks. http://hdl.handle.net/11122/7638.
Sharma, A., Srinivasan, S. and Lake, L.W. 2010. Classification of oil and gas reservoirs based on recovery factor: a data-mining approach. In: SPE Annual Technology Conference & Exhibition 1: 50–70. https://doi.org/10.2118/130257-MS.
Sheng, J.J. 2011. Modern Chemical Enhanced Oil Recovery: Theory and Practice. Elsevier Publishing Corporation.
Sheng, J.J. 2013. Enhanced Oil Recovery Field Case Studies. Waltham, Mass.: Gulf Professional Publishing.
Stalkup, F.I. 1984. Miscible Displacement (SPE Monograph Series). Society of Petroleum Engineers.
Stosur, G.J. 2003. EOR: Past, Present, and What the Next 25 Years May Bring. Paper Presented at the SPE International Improved Oil Recovery Conference in Asia Pacific, Kuala Lumpur, Malaysia. https://doi.org/10.2118/84864-MS.
Sun, X., Zhang, Y., Chen, G. and Gai, Z. 2017. Application of nanoparticles in enhanced oil recovery: a critical review of recent progress. Energies 10(3): 1–6. https://doi.org/10.3390/en10030345.
Tahmasebi, P., Kamrava, S., Bai, T. and Sahimi, M. 2020. Machine learning in geo-and environmental sciences: From small to large scale. Advances in Water Resources 142(11): 103619. http://dx.doi.org/10.1016/j.advwatres.2020.103619.
Teklu, T.W., Alameri, W., Graves, R.M., Kazemi, H. and AlSumaiti, A.M. 2016. Low salinity water-surfactant-CO2 EOR. Journal of Petroleum Science Engineering 3(3): 309–320. https://doi.org/10.1016/j.petlm.2017.03.003.
Verma, M.K. 2015. Fundamentals of carbon dioxide enhanced oil recovery (CO2-EOR): A supporting document of the assessment methodology for hydrocarbon recovery using CO2-EOR associated with carbon sequestration. U.S. Geological Survey 15: 1–19. https://doi.org/10.3133/ofr20151071.
Vledder, P., Fonseca, J.C., Wells, T., Gonzalez, I. and Ligthelm, D. 2010. Low Salinity Water Flooding: Proof of Wettability Alteration on a Field Wide Scale. Presented at the SPE Improved Oil Recovery Symposium, 24–28. https://doi.org/10.2118/129564-MS.
158 Data Science and Machine Learning Applications in Subsurface Engineering
Webb, K.J., Black, C.J.J. and Al-Ajeel, H. 2004. Low Salinity Oil Recovery Log-Inject-
Log. Paper presented at the SPE/DOE Symposium on Improved Oil Recovery, 17–21.
https://doi.org/10.2118/89379-MS.
Wu, D., Brantson, E.T. and Ju, B. 2021. Numerical simulation of water alternating gas flooding
(WAG) using CO2 for high-salt argillaceous dolomite reservoir considering the impact of
stress sensitivity and threshold pressure gradient. Acta Geophysica 69(4): 1349–1365.
https://doi.org/10.1007/s11600-021-00601-w.
Yang, D., Tontiwachukwuthikul, P. and Gu, Y. 2005. Interfacial tensions of the crude oil +
reservoir brine + CO2 systems at pressures up to 31 MPa and temperatures of 27°C and
58°C. Journal of Chemical Engineering Data 50(4): 1242–1249. https://doi.org/10.1021/
je0500227.
Yong, T., Zhengywan, S., Jibo, H. and Fulin, Y. 2016. Numerical simulation and optimization
of enhanced oil recovery by the in situ generated CO2 huff-n-puff process with compound
surfactant. Journal of Chemistry 206: 13. https://doi.org/10.1155/2016/6731848.
Zekri, A., Al-Attar, H., Al-Farisi, O., Almehaideb, R. and Lwisa, E.G. 2015. Experimental
investigation of the effect of injection water salinity on the displacement efficiency
of miscible carbon dioxide WAG flooding in a selected carbonate reservoir. Journal of
Petroleum Exploration and Production Technology 5: 363–373. https://doi.org/10.1007/
s13202-015-0155-0.
Zolfaghari, H., Zebarjadi, A., Shahrokhi, O. and Ghazanfari, M.H. 2013. An experimental study
of CO2-low salinity water alternating gas injection in sandstone heavy oil reservoirs. Iranian
Journal of Oil and Gas Science and Technology 2(3): 37–47. https://doi.org/10.22050/
ijogst.2013.3643.
Chapter 8
Improving Seismic Salt Mapping
through Transfer Learning Using
A Pre-trained Deep Convolutional
Neural Network
A Case Study on Groningen Field
Daniel Asante Otchere,1,2,* Abdul Halim Latiff,1 Nikita Kuvakin,3
Ruslan Miftakhov,3 Igor Efremov 3 and Andrey Bazanov 3
1. Introduction
The interpretation of seismic data is a critical aspect of geological exploration.
It enables geologists and engineers to identify and delineate subsurface
structures such as faults, reservoirs, and geological formations. However,
traditional approaches to seismic interpretation, such as manual picking and
horizon tracking, are laborious and time-consuming, particularly in areas
with complex geology and numerous faults. Furthermore, these methods are
susceptible to noise and other stratigraphic challenges, which makes them less reliable
and accurate (Otchere et al., 2022c). Getting a reliable velocity model using
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
3 GridPoint Dynamics, 77 Hopton Road, London, SW162EL, United Kingdom.
* Corresponding author: [email protected]
significant implications for the oil and gas industry. However, deploying the
method in new fields can give inaccurate mapping interpretations. One way of
boosting deep learning model performance in a new field is through transfer
learning.
In recent years, there has been a growing interest in using transfer
learning to improve the accuracy of salt segmentation in seismic images.
Transfer learning can substantially enhance our ability to identify and map
subsurface salt structures, which in turn affects the development of
new energy resources and climate change mitigation. Transfer learning is the
use of specialised knowledge learned in one context to solve problems in
another, and it is a promising approach to address disparities between
model prediction and ground truth (Li et al., 2020; Pan and Yang, 2010). The
idea is to exploit the plethora of available interpreted data to gain insight
into a comparable problem for which less information is available and the
initial model prediction is unsatisfactory. This calls for attention to the distinction between
the model prediction and ground truth, reflected in the different distributions of
their features and boundaries. In particular, the feature-based transfer learning
strategy known as “domain adaptation” is a potential option (Ben-David et al.,
2007; Pan et al., 2011). Domain adaptation first determines a feature space in which
source domain data retain their intrinsic structure while the distribution
disparity across domains is minimised. Transfer component analysis is a pioneering
technique of this kind described by Pan et al. (2011). Minimisation of maximum mean
discrepancy (MMD) has emerged as a popular method for domain adaptation
in machine learning. This method aims to learn transferable features between
different domains while preserving the variation in the source domain. By
representing the feature space as a reproducing kernel Hilbert space (RKHS),
MMD minimisation enables the computation of distances between probability
distributions, facilitating knowledge transfer from the source to the target
domain. In this way, MMD-based domain adaptation has been successfully
applied to various problems, including image classification, object detection,
and natural language processing. An implementation with multiple kernels
has been developed using this method (Duan et al., 2012).
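The MMD idea described above can be illustrated with a minimal sketch. It assumes a single Gaussian (RBF) kernel with a hand-picked bandwidth parameter `gamma` and uses the biased empirical estimate of the squared MMD. The function names and the synthetic two-dimensional samples are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

def rbf_kernel(x, y, gamma):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-gamma * ||x_i - y_j||^2)."""
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def mmd2(source, target, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two samples."""
    k_ss = rbf_kernel(source, source, gamma)
    k_tt = rbf_kernel(target, target, gamma)
    k_st = rbf_kernel(source, target, gamma)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())

# Synthetic stand-ins for source- and target-domain features (assumed data).
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 2))
same_domain = rng.normal(0.0, 1.0, size=(200, 2))
shifted_domain = rng.normal(2.0, 1.0, size=(200, 2))

mmd_same = mmd2(source, same_domain)
mmd_shifted = mmd2(source, shifted_domain)
print(f"MMD^2, same distribution:    {mmd_same:.4f}")
print(f"MMD^2, shifted distribution: {mmd_shifted:.4f}")
```

A domain adaptation method would minimise a quantity of this kind over a learned feature mapping. Here the features are fixed, so the sketch only shows that the statistic grows when the two distributions differ.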
Deep learning models have revolutionised the field of machine learning
by enabling computers to automatically learn complex representations of data.
The ability to learn features from data has made deep learning particularly
useful in domains where large amounts of data are available, such as computer
vision and natural language processing. Recently, there has been an increasing
interest in combining deep learning with transfer learning, which aims to
leverage knowledge from related tasks to improve performance on new tasks.
To this end, Ghifary et al. (2014) introduced a domain adaptive neural network
(DaNN) that includes a maximum mean discrepancy (MMD) term in the loss
transfer learning to enhance the detection of salt bodies and evaluate the
effectiveness of the pre-trained CNN. These prior investigations demonstrate
that transfer learning can be used to improve the accuracy of CNNs for image
recognition tasks, even when the target dataset differs from the source dataset.
The domain shift (the discrepancy between predicted and actual values) between
the prediction and the ground truth can be reduced to an acceptable level in a
predetermined feature space using domain adaptation through transfer learning.
This study demonstrates the application of transfer learning, using
actual field interpretation, to a CNN pre-trained with synthetic labels in order to
generate salt probability models that can serve as a valuable property in the
seismic imaging and velocity modelling phases. Object and edge detection based
on transfer learning and deep learning have shown promising
success in various domains, making them an appealing approach for seismic
salt mapping. These techniques can potentially enhance the detection
of subsurface salt bodies and improve the accuracy of salt mapping, resulting
in more efficient and effective exploration and production of hydrocarbon
reservoirs. Hence, the main contributions of this research are:
1. Improved salt segmentation accuracy: Transfer learning can significantly
enhance our ability to identify and map subsurface salt structures in
seismic images. When the model is retrained on a labelled dataset, using semantic
segmentation to improve the salt probability volume, it learns to
recognise salt structures with greater accuracy and efficiency, which
is critical for the energy industry and our understanding of the Earth’s
subsurface processes.
2. Cost and time savings: Salt segmentation in seismic images is
time-consuming and labour-intensive, requiring significant expertise and
resources. By using transfer learning to automate this process, we can save
time and reduce costs while improving the accuracy of the segmentation.
3. Enhanced data analysis: Accurate segmentation of salt structures in
seismic images is essential for various applications, including hydrocarbon
exploration, CO2 storage, and geohazard assessment. By improving the
accuracy of salt segmentation, we can enhance our understanding of the
subsurface geology and make more informed decisions about energy
exploration and production.
4. Potential for future research: Using transfer learning for salt segmentation
in seismic images is a relatively new area of research with significant
potential for future exploration and development. By demonstrating the
effectiveness of this approach, we can inspire new research in the field
and help to advance our understanding of subsurface processes and the
energy transition journey.
These contributions will help reduce the time and resources seismic
interpreters spend interpreting seismic features. The findings in this study
are of great importance for the energy industry and the broader scientific
community, as they provide new insights into the use of transfer learning for
improving the accuracy of salt segmentation in seismic images. The results of
this study will also highlight the potential of this approach for developing new
energy resources and mitigating climate change by improving our ability to
identify and map subsurface salt structures accurately.
2. Method
2.1 Collection and Description of Data
For this work, the Groningen seismic field data was selected as the case study.
The study ran on a workstation with the Windows 10 operating system (OS),
AMD Radeon Pro™ graphics, and NVIDIA Quadro®
professional graphics capable of 2-petaFLOPS tensor performance. A single
NVIDIA RTX 8000 GPU and a quad-core Intel(R) Xeon(R) W-2223 CPU
running at 3.60 GHz, with 32.0 GB of DDR4-2666 MHz DRAM, were
used. This research used a pre-trained DCNN model trained using
synthetic and real data. The proposed approach consisted of two main parts,
generating a salt probability model and improving the model output through
transfer learning, as illustrated in Fig. 8.1. The optimised DCNN model for
salt body probability prediction can be reused to map new salt bodies, thus
enhancing interpretation efficiency. The model estimation section of the
workflow represents an end-to-end model that can automatically map salt
bodies once several parameters have been selected.
Fig. 8.2. (a) Accurate segmentation of a synthetic volume fragment, and (b) projected result produced
using a pre-trained neural network.
Fig. 8.3. Seismic volume showing Inline 8768, Xline 8367, and depth 2800.
Fig. 8.4. Illustration of the application framework of the DCNN in creating the salt probability
volume.
Fig. 8.6. Expert interpreted salt bodies showing: (a) Inline 9090, (b) Inline 7820, (c) Inline 8271 and (d) Inline 9108.
Fig. 8.7. Expert interpreted salt bodies showing: (a) Crossline 9034, (b) Crossline 7960, and
(c) Crossline 8188.
complete overlap between two sets of binary image segmentation results, A and
B target regions (Fig. 8.8). The DSC is defined as (Yao et al., 2020):
\[ \mathrm{DSC}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \tag{8.1} \]
where A ∩ B is the intersection of the two target regions and |A| and |B| are
the cardinalities of the two sets. When there is a much higher number of background voxels
than target voxels, the DSC can be conceptualised as a special
case of the kappa statistic, which is typically applied in reliability
analyses. This relationship was shown previously by Zou et al. (2004).
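Equation 8.1 can be evaluated directly on voxel arrays. The sketch below is a minimal illustration, not the software used in this study. It assumes that the salt probability volume is converted to a binary mask with a threshold of 0.5, and it returns 1.0 when both masks are empty. The threshold, the empty-mask convention, the function name, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np

def dice_coefficient(probability, truth, threshold=0.5):
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), for two masks.

    `probability` is thresholded to obtain the predicted mask A
    (the 0.5 default is an assumption); `truth` is the binary mask B.
    """
    a = np.asarray(probability) >= threshold
    b = np.asarray(truth).astype(bool)
    denominator = a.sum() + b.sum()
    if denominator == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / denominator)

# Toy 4 x 4 section: the expert label covers 4 voxels, the prediction covers 6.
truth = np.zeros((4, 4))
truth[1:3, 1:3] = 1.0
probability = np.zeros((4, 4))
probability[1:3, 1:4] = 0.9

dsc = dice_coefficient(probability, truth)
print(f"DSC = {dsc:.2f}")  # 2 * 4 / (6 + 4) = 0.80
```

The same function applies unchanged to 3D volumes, because it only counts voxels.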
Fig. 8.9. Comparison of predicted salt probability volume (left) and expert interpretation (right) of
(a) Crossline 8188, (b) Inline 9090, and (c) Crossline 9034.
Fig. 8.10. Transfer learned salt probability volume and expert interpretation (solid black line)
showing: (a) Crossline 9034, (b) Crossline 7960, and (c) Crossline 8188.
Fig. 8.11. Transfer learned salt probability volume and expert interpretation (solid black line) showing: (a) Inline 9090, (b) Inline 7820, (c) Inline 8271 and
(d) Inline 9108.
able to successfully retrain itself on the new labelled data set by readjusting its
weights in the deep learning process. As a result, the accuracy of the predicted
salt probability bodies increased compared with the initial results obtained
before the CNN was retrained. The results showed a statistically significant
improvement when iterations were increased in steps of 100 from 100 to
10,000. The best combination of input data and iterations was determined
by comparing the pair-wise DSC metrics across all iterations while increasing
the number of labelled input sections. DSC results increased with
successive increments in labelled training data and iterations until they reached
their maximum, assessed case by case. There was no evidence of a
learning curve phenomenon, and the segmentations changed little
beyond 10,000 iterations or with more than five labelled input sections.
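One simple way to read a plateau from per-iteration DSC values is to take the earliest iteration whose score is within a small tolerance of the best score. The sketch below illustrates that idea only. The function name, the tolerance, and the scores are invented for illustration and are not results or procedures from this study.

```python
def earliest_plateau(dsc_by_iteration, tolerance=0.01):
    """Return (iteration, DSC) for the first iteration within `tolerance` of the best DSC."""
    best = max(dsc_by_iteration.values())
    for iteration in sorted(dsc_by_iteration):
        if best - dsc_by_iteration[iteration] <= tolerance:
            return iteration, dsc_by_iteration[iteration]

# Hypothetical validation scores, invented purely for illustration.
example_scores = {100: 0.40, 200: 0.55, 300: 0.70, 400: 0.80, 500: 0.805}

plateau = earliest_plateau(example_scores)
print(plateau)  # (400, 0.8): later iterations gain less than the tolerance
```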
This research demonstrated the functionality of DSC as a straightforward
criterion for validating the consistency and spatial overlap accuracy of
manual and automated segmentations. The corresponding DSC values for
iteration 10,000 using seven labelled input sections for training and validation
(four inlines and three crosslines) are shown in Fig. 8.12. The probabilistic
fractional segmentation showed a wide range of spatial overlap (0.6–0.95)
with the corresponding estimated ground truth (expert interpreted sections).
This result indicates that further improvements can be achieved.
In the inline sections, the reproducibility was significantly higher because
more labelled training inputs were available. This improvement motivated the
increment of the labelled data size in subsequent crossline investigations to
develop a suitable matching approach to register these salt body boundaries
Fig. 8.12. Learning transfer metrics showing Dice Similarity Coefficient results at each iteration.
Fig. 8.13. Transfer learned salt probability volume and expert interpretation (solid black line) showing
the deployed sections of Inline 8271, crossline 8188 and depth 2048.
4. Conclusions
The seismic interpretation field has faced significant challenges due to the
limited availability of labelled seismic data. These challenges have created a
bottleneck in developing deep learning algorithms to aid seismic interpretation.
Accurate and high-quality seismic data are critical to successfully applying
machine learning algorithms to interpret subsurface geological structures.
Without sufficient labelled data, deep learning models are unable to learn the
patterns and correlations necessary to accurately predict subsurface features,
such as salt bodies. In order to solve this issue, a CNN that had previously
been trained on synthetic labels was considered a potential source of relevant
data for developing an adequate salt segmentation model. The application
of transfer learning was required to reduce the visual discrepancy across the
seismic volume because of the inevitable disparity between the model and
the actual data. The application of this process in the Groningen field has
demonstrated the effectiveness of integrating DNNs and transfer learning
for salt body detection. In its conceptualisation, DSC is a particular instance
of the kappa statistic, a well-known and pragmatic reliability and agreement
index. It offers the metric to analyse and measure the model’s performance
and the adjustments necessary for the retraining process. The transfer learning
process might involve several repetitions. Depending on the data quality and
the geological complexity, it may be necessary to complete several iterations
to achieve a representative model for the deployed seismic field. The transfer
learning technique in this study achieved a Dice similarity index of 0.99 and
0.92 on the training and validation sections, respectively. This result shows
that the transfer learned model can automatically capture subtle salt bodies
from 3D seismic with minimal manual input.
This work supports continued research into, and implementation of, deep learning
techniques in seismic interpretation. Seismic interpretation is largely
manual and often requires considerable expert knowledge, so it
would be beneficial to have labelled, interpreted seismic data from different
fields. Although a large portion of the training data is synthetic,
successful knowledge transfer still requires some field interpretation samples
from actual seismic volumes. Once labelled data are available, the time
needed to fully interpret salt boundaries or other attributes on seismic
volumes will be drastically reduced, because the generalised DCNN
model can accurately map and interpret different seismic volumes.
Acknowledgement
The authors wish to extend their genuine gratitude to Universiti Teknologi
PETRONAS and the Centre for Subsurface Seismic Imaging for their support and
valuable contributions to this research and to Geoplat AI for providing the
software for this work.
References
Ben-David, S., Blitzer, J., Crammer, K. and Pereira, F. 2007. Analysis of representations for
domain adaptation. pp. 137–144. In: Advances in Neural Information Processing Systems
19. The MIT Press. https://doi.org/10.7551/mitpress/7503.003.0022.
Chopra, S. and Marfurt, K.J. 2007. Volumetric curvature attributes for fault/fracture
characterisation. First Break 25: 35–46. https://doi.org/10.3997/1365-2397.2007019.
Duan, L., Tsang, I.W. and Xu, D. 2012. Domain transfer multiple kernel learning. IEEE
Trans. Pattern Anal. Mach. Intell. 34: 465–479. https://doi.org/10.1109/TPAMI.2011.114.
Duffy, O.B., Hudec, M., Peel, F., Apps, G., Bump, A., Moscardelli, L., Dooley, T., Bhattacharya,
S., Wisian, K. and Shuster, M. 2022. The Role of Salt Tectonics in the Energy Transition: An
Overview and Future Challenges. Earth Arxiv Preprint. https://doi.org/10.31223/X5363J.
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M.
and Lempitsky, V. 2016. Domain-adversarial training of neural networks. Advances in
Computer Vision and Pattern Recognition 189–209.
https://doi.org/10.1007/978-3-319-58347-1_10.
Ghifary, M., Kleijn, W.B. and Zhang, M. 2014. Domain adaptive neural networks for object
recognition. pp. 898–904. In: Pacific Rim International Conference on Artificial
Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_76.
Li, W., Gu, S., Zhang, X. and Chen, T. 2020. Transfer learning for process fault diagnosis:
Knowledge transfer from simulation to physical processes. Comput. Chem. Eng.
139: 106904. https://doi.org/10.1016/j.compchemeng.2020.106904.
Liu, B., Jing, H., Li, J., Li, Y., Qu, G. and Gu, R. 2019. Image segmentation of salt deposits
using deep convolutional neural network. pp. 3304–3309. In: 2019 IEEE International
Conference on Systems, Man and Cybernetics (SMC). IEEE.
https://doi.org/10.1109/SMC.2019.8913858.
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942.
https://doi.org/10.3390/app12062942.
Otchere, D.A., Ganat, T.O.A., Nta, V., Brantson, E.T. and Sharma, T. 2022b. Data analytics
and Bayesian Optimised Extreme Gradient Boosting approach to estimate cut-offs from
wireline logs for net reservoir and pay classification. Appl. Soft Comput. 120: 108680.
https://doi.org/10.1016/j.asoc.2022.108680.
Otchere, D.A., Tackie-Otoo, B.N., Mohammad, M.A.A., Ganat, T.O.A., Kuvakin, N.,
Miftakhov, R., Efremov, I. and Bazanov, A. 2022c. Improving seismic fault mapping
through data conditioning using a pre-trained deep convolutional neural network: A
case study on Groningen field. J. Pet. Sci. Eng. 213: 110411.
https://doi.org/10.1016/J.PETROL.2022.110411.
Pan, S.J. and Yang, Q. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng.
22: 1345–1359. https://doi.org/10.1109/TKDE.2009.191.
Pan, S.J., Tsang, I.W., Kwok, J.T. and Yang, Q. 2011. Domain adaptation via transfer
component analysis. IEEE Trans. Neural Netw. 22: 199–210.
https://doi.org/10.1109/TNN.2010.2091281.
Shi, Y., Wu, X. and Fomel, S. 2019. SaltSeg: Automatic 3D salt segmentation using a deep
convolutional neural network. Interpretation 7: SE113–SE122.
https://doi.org/10.1190/INT-2018-0235.1.
Yan, Z., Zhang, Z. and Liu, S. 2021. Improving performance of seismic fault detection by
fine-tuning the convolutional neural network pre-trained with synthetic samples. Energies
(Basel) 14: 3650. https://doi.org/10.3390/en14123650.
Yao, A.D., Cheng, D.L., Pan, I. and Kitamura, F. 2020. Deep learning in neuroradiology: a
systematic review of current algorithms and approaches for the new wave of imaging
technology. Radiol. Artif. Intell. 2: e190026. https://doi.org/10.1148/ryai.2020190026.
Zou, K.H., Warfield, S.K., Bharatha, A., Tempany, C.M.C., Kaus, M.R., Haker, S.J., Wells, W.M.,
Jolesz, F.A. and Kikinis, R. 2004. Statistical validation of image segmentation quality
based on a spatial overlap index. Acad. Radiol. 11: 178–189.
https://doi.org/10.1016/S1076-6332(03)00671-8.
Chapter 9
Super-Vertical-Resolution
Reconstruction of Seismic
Volume Using A Pre-trained Deep
Convolutional Neural Network
A Case Study on Opunake Field
Daniel Asante Otchere,1,2,* Abdul Halim Latiff,1 Nikita Kuvakin,3
Ruslan Miftakhov,3 Igor Efremov 3 and Andrey Bazanov 3
1. Introduction
Seismic surveys are essential for subsurface structure exploration and analysis
in the oil and gas and civil engineering sectors (Farfour et al., 2021; Talwani
and Kessinger, 2003). Reflection seismic surveys are acquired to map potential
subsurface features and structural and stratigraphic hydrocarbon traps (Haldar,
2018). Reflections arise where the acoustic impedance of the subsurface rocks
changes, in response to acoustic waves generated artificially near
the surface (Aminzadeh and Dasgupta, 2013; Selley and Sonnenberg, 2015).
Although the seismic energy travels in the form of elastic waves, most imaging
procedures presume that the recorded waves are exclusively composed of
compressional waves (Gray, 2014). Seismic image resolution is the ability
to accurately and finely depict subsurface structures from the Earth’s
1 Centre of Research for Subsurface Seismic Imaging, Universiti Teknologi PETRONAS, 32610, Seri
Iskandar, Perak Darul Ridzuan, Malaysia.
2 Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA,
USA.
3 GridPoint Dynamics, 77 Hopton Road, London, SW162EL, United Kingdom.
* Corresponding author: [email protected]
2. Brief Overview
Seismic imaging is a well-known geophysical technique for visualising
underlying structures and features in the Earth’s subsurface. It is extensively
used in the oil and gas sector to discover and map hydrocarbon reservoirs and
in civil engineering for site characterisation and underground utility mapping.
Despite its widespread use, seismic imaging has intrinsic limitations in terms
of image resolution and quality, which affect the accuracy and dependability
of the ensuing interpretation. Several factors can limit the resolution of seismic
images in the oil and gas industry (Deng et al., 2021; Sun et al., 2022; Yan and
Chunqin, 2008):
1. Seismic wavelength: The resolution of a seismic image is directly related
to the wavelength of the seismic signal transmitted into the subsurface and
to the finite size of the seismic source. Shorter wavelengths result in
higher-resolution images, while longer wavelengths result in lower-resolution images.
2. Seismic source frequency and depth: The frequency of the seismic
source (e.g., air gun, vibrator) also influences image resolution. Higher
frequencies produce higher-resolution images but are attenuated more strongly
and may not penetrate the subsurface as deeply. The resolution of a
seismic image often decreases as depth increases due to the increased
absorption of seismic energy as it passes through the subsurface.
3. Geology: The composition and structure of the Earth’s subsurface can
also affect the resolution of the seismic image. Harder, more consolidated
materials produce clearer images than softer, more porous materials.
Seismic energy can be scattered by complex geology, such as folds or
faults, resulting in a lower-resolution image.
4. Data processing: The resolution of an image can be influenced by the
quality of the data processing techniques employed. Image resolution can
be improved using advanced processing techniques such as inversion and
migration. Some techniques, such as filtering, can help to minimise noise
and improve resolution, while others can reduce the resolution.
5. Receiver spacing: The spacing between the seismic receivers that collect
data can also impact the image’s resolution. Closer spacing results in
higher-resolution images, but it also increases the cost of the survey.
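The link between wavelength, frequency, and resolution in the list above can be made concrete with a short calculation. The sketch assumes the commonly quoted quarter-wavelength rule of thumb for vertical resolution, which is not stated in this chapter, together with the relation wavelength = velocity / frequency. The velocities and dominant frequencies are illustrative values, not measurements from the Opunake Field.

```python
def vertical_resolution_m(velocity_m_s, dominant_frequency_hz):
    """Approximate vertical resolution as one quarter of the dominant wavelength.

    The quarter-wavelength criterion is a rule of thumb assumed for this sketch.
    """
    wavelength_m = velocity_m_s / dominant_frequency_hz
    return wavelength_m / 4.0

# Illustrative shallow case: slower rock, higher dominant frequency.
shallow = vertical_resolution_m(velocity_m_s=2000.0, dominant_frequency_hz=50.0)
# Illustrative deep case: faster rock, lower dominant frequency after absorption.
deep = vertical_resolution_m(velocity_m_s=4000.0, dominant_frequency_hz=20.0)

print(f"shallow: about {shallow:.0f} m, deep: about {deep:.0f} m")
```

Under these assumed values, the thinnest resolvable bed grows from about 10 m to about 50 m with depth, which is consistent with the loss of resolution described in items 1 and 2.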
Table 9.1. Summary of articles reviewed on seismic image resolution and image resolution using AI
in oil and gas, civil engineering, and medical fields.
3. Methodology
3.1 Regional Geological Overview of the Opunake Field
The Opunake Field is located in the Taranaki region of New Zealand, on
the North Island’s western side. The Taranaki Basin is a large sedimentary
basin formed approximately 50–60 million years ago during the early to
mid-Cenozoic Era. It is known for its active volcanoes and tectonic activity,
as shown in Fig. 9.1. These structural activities have shaped the geology of
the area (Stagpoole and Nicol, 2008). A complex mixture of sedimentary
Fig. 9.1. Location map of Taranaki Basin showing simplified structural elements modified after
Douglas (2005) and Rajabi et al. (2016).
and over time, the remains of these organisms were buried and compacted
into sedimentary rocks. These rocks, which include sandstone, shale, and
limestone, are the oldest in the area and can be found at the base of the
Opunake Field.
During the Mesozoic period, the area continued to experience tectonic
activity, which caused the land to rise and form mountains. These mountains,
now long gone, were made up of sandstone, shale, and limestone. Later, these
rocks were buried in sedimentary layers made up of shale, sandstone, and coal.
During the Cenozoic period, the area underwent further tectonic activity
and several periods of erosion. These activities reshaped the land,
exposing the sedimentary layers at the surface. The area is
now home to a variety of rock formations, including sandstone, shale, and
limestone, as well as coal and oil deposits.
Overall, the Opunake Field is a complex and
dynamic geological system that reflects millions of years of tectonic activity,
volcanic eruptions, and sedimentary processes in the Taranaki region. These
activities have produced a wide range of rock formations and sedimentary
layers. These rocks and sediments provide a
rich resource for geological study and exploration. The Opunake Field is a
valuable oil and gas production resource that has also played an important role
in developing New Zealand’s energy industry (McBeath, 1977).
This study was performed in a software environment consisting of a
Windows 10 Pro 64-bit operating system, CUDA
Toolkit 10.2, and the PyTorch 2.0 framework. The hardware
environment comprised AMD Radeon Pro™ graphics and
NVIDIA Quadro® professional graphics capable of
2-petaFLOPS tensor performance. The study was run on a single NVIDIA
RTX 8000 GPU and a quad-core Intel(R) Xeon(R) W-2223 CPU running at
3.60 GHz, with 32.0 GB of DDR4-2666 MHz DRAM. Throughout
this research, the pre-trained DCNN model of the Geoplat AI software was
used. The model was pre-trained using both synthetic and real data.
Fig. 9.2. Seismic volume showing Inline 2700, Xline 5500 and depth 2040.
a large and varied 3D synthetic seismic dataset image library that considers
various complex geological factors and imaging quality. The model’s accuracy
was verified and successfully applied to the Opunake Field to generate a
conditioned probability volume. The widely known, pre-trained ResU-Net
architecture was used to segment the 3D seismic
images obtained from the Opunake Field. A signal gain was added to amplify
and restore regions with amplitude expression coefficient loss. This additional
gain aided in the processing of low-quality seismic data. Figure 9.2 displays
the inline, crossline, and depth sections of the Opunake Field.
the model to access features from multiple scales, which helps improve the
model’s accuracy.
The key difference between a standard CNN and a residual neural network
lies in how information flows through the network. In a standard CNN, the
layers learn the actual mapping between the input and the output. In a residual
neural network, the layers learn the residual mapping between the input and
the output, which is then added to the original input to produce the final output.
This residual network helps reduce the number of parameters in the model and
improves its learning ability. Overall, the ResU-Net architecture combines
the ability of the U-Net to capture multi-scale features with the ability of
residual blocks to reduce the number of parameters and improve learning.
This technique augments the convolutional neural network architecture and
makes it a powerful tool for image resolution tasks.
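The difference between a standard layer and a residual layer can be illustrated with a minimal sketch. It is not the Geoplat AI or ResU-Net implementation: the helper names (`dense`, `plain_block`, `residual_block`) and the toy weights are assumptions chosen for illustration, and a single fully connected layer stands in for the convolutions of a real network.

```python
def dense(x, weights, bias):
    """Affine layer: one output value per row of `weights` (hypothetical helper)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def plain_block(x, weights, bias):
    """Standard layer: learns the full mapping from input to output."""
    return relu(dense(x, weights, bias))


def residual_block(x, weights, bias):
    """Residual layer: learns F(x) and adds the input back through a skip connection."""
    fx = relu(dense(x, weights, bias))
    return [f + v for f, v in zip(fx, x)]


# Toy input and all-zero weights (illustrative values only).
x = [1.0, -2.0, 0.5]
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3

plain_out = plain_block(x, zero_w, zero_b)        # the input is lost
residual_out = residual_block(x, zero_w, zero_b)  # the input passes through unchanged
print(plain_out, residual_out)
```

With all weights at zero the plain block discards its input, whereas the residual block reduces to the identity. This is one way to see why residual layers only need to learn the correction to the input rather than the whole mapping.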
followed by a fully connected layer. The GAN model, on the other hand,
was built using a generator and a discriminator network. The generator
network was used to generate new high-quality seismic volumes. In
contrast, the discriminator network was used to differentiate between the
generated and the original high-quality seismic volumes.
3. Model training: The models were trained using the training set. The
conventional CNN model was trained to improve low-quality seismic
volumes by lowering noise and increasing resolution. The GAN model
was trained to generate new high-quality seismic volumes that were
similar to the original high-quality seismic volumes.
4. Model evaluation: Once the models were trained, they were evaluated
using the test set. The performance of the models was evaluated using
three commonly used image quality metrics: PSNR, SSIM, and SNR.
PSNR measures the ratio between the maximum possible power of a
signal and the power of the noise that corrupts the signal. SSIM measures
the structural similarity between two images, while SNR measures the
strength of a signal relative to the amount of noise present in the signal.
5. Model comparison: The performance of the models was compared based
on the results of the evaluation. The PSNR, SSIM, and SNR values for
each model were calculated for the test set and compared to determine
which model performed better.
6. Model selection: Based on the comparison of the two models, the one that
performed better regarding the image quality metrics was selected as the
model of choice.
It is important to note that this process of training and testing models
using a dataset is an iterative process. The model architecture, parameters and
dataset itself can be fine-tuned and optimised. Additionally, visual inspection
of the enhanced images can also be used to evaluate the results.
SNR to get a complete evaluation of image quality, such as the SSIM and
PSNR, which are more robust and take into account both luminance and
structural information. The formula for SNR is:
SNR = 10 × log10(Peak signal power/Mean square error) (9.4)
2. PSNR measures the quality of a reconstructed image compared to a
reference image. It is defined as the ratio of the peak signal power to
the power of the noise in the reconstructed image. This metric is
mathematically expressed as:
$$\mathrm{PSNR} = 10 \times \log_{10}\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right) \qquad (9.5)$$
where MAX is the maximum possible pixel value of the image, and MSE
is the mean squared error between the conditioned and original volume.
3. SSIM is a measure of the similarity between two images. It considers
the luminance, contrast, structure, mean, standard deviation and
cross-covariance of the image pixels and gives an understanding of
the structural similarity between two images. SSIM is mathematically
written as:
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^{2} + \mu_y^{2} + C_1)(\sigma_x^{2} + \sigma_y^{2} + C_2)} \qquad (9.6)$$
where µx and µy are the mean pixel values of images x and y, respectively,
σx and σy are the standard deviations of the pixel values of images x and y,
respectively, σxy is the covariance of the pixel values of images x and y, and
C1 and C2 are constants.
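The PSNR and SSIM definitions above can be sketched in a few lines. This is a simplified illustration, not the evaluation code used in the study: the images are flat lists of pixel values, SSIM is evaluated once over the whole image rather than in sliding windows, and the constants C1 = (0.01 MAX)^2 and C2 = (0.03 MAX)^2 are commonly used defaults assumed here because the text does not state them. The sample values are invented.

```python
import math


def mean(values):
    return sum(values) / len(values)


def mse(reference, test):
    """Mean squared error between two equally sized images (flat lists)."""
    return mean([(r - t) ** 2 for r, t in zip(reference, test)])


def psnr(reference, test, max_value):
    """Eq. (9.5): 10 * log10(MAX^2 / MSE), in dB. Returns inf for identical images."""
    error = mse(reference, test)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def global_ssim(x, y, max_value, k1=0.01, k2=0.03):
    """Eq. (9.6) computed over the whole image; k1 and k2 are assumed defaults."""
    c1, c2 = (k1 * max_value) ** 2, (k2 * max_value) ** 2
    mx, my = mean(x), mean(y)
    var_x = mean([(a - mx) ** 2 for a in x])
    var_y = mean([(b - my) ** 2 for b in y])
    cov_xy = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + c1) * (2 * cov_xy + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Invented 8-bit sample: the test image is the reference shifted by 5 grey levels.
reference = [0.0, 50.0, 100.0, 150.0, 200.0, 250.0]
shifted = [v + 5.0 for v in reference]

print(psnr(reference, shifted, 255.0))
print(global_ssim(reference, shifted, 255.0))
```

A constant shift leaves the structure (variance and covariance) untouched, so SSIM stays close to 1 while PSNR still registers the error. This is one illustration of why the metrics are best read together.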
Fig. 9.4. Inline 2000 comparison of (a) original, (b) mean resolution conditioned, and
(c) super-resolution conditioned volumes.
196 Data Science and Machine Learning Applications in Subsurface Engineering
Fig. 9.5. Xline 3000 comparison of (a) original, (b) mean resolution conditioned, and
(c) super-resolution conditioned volumes.
Super-Vertical-Resolution Reconstruction of Seismic Volume 197
Fig. 9.6. Depth 1400 comparison of (a) original, (b) mean resolution conditioned, and
(c) super-resolution conditioned volumes.
be high, the visual perception of the image can be different. That is why it is
good practice to use multiple metrics to get a more comprehensive evaluation
and to consider the specific application for which the image will be used.
Hence, when the goal is to maximise image resolution, the super-resolution
model performs better, especially in the shallower part of the seismic
volume. However, when the goal is to enhance the seismic volume for
interpretation, the mean resolution model is the better of the two because
it preserved more geological structures and features while exhibiting fewer
falsely enhanced areas or areas of less confident resolution in the deeper
parts of the subsurface.
Figure 9.9 presents the spectral analysis results performed on the
original, mean, and super-resolution volumes along inline 2000. Our
findings indicate that the original volumes had an SNR of 70.9, which was
significantly improved upon by the mean resolution volume achieving an
SNR of 212.5. Interestingly, despite the improvement in spatial resolution, the
Fig. 9.9. Comparison of spectral analysis for (a) original seismic, (b) super-resolution volume, and
(c) mean resolution volume.
Fig. 9.10. Super-resolution conditioned volume showing Inline 2700, crossline 5500, and depth 2040.
Fig. 9.11. Mean resolution conditioned volume showing Inline 2700, crossline 5500, and depth 2040.
5. Conclusions
Insufficient labelled samples have hindered the progress of deep learning
application in seismic interpretation. In this study, we aimed to enhance an
image using AI by building and comparing two different CNN models. The
first model was a traditional CNN, while the second model was a more recent
architecture known as a GAN.
The conventional CNN (super-resolution) model was trained using a
dataset of low-quality images and their corresponding high-quality versions.
The model was then used to enhance the test images by increasing their
resolution and reducing noise. On the other hand, the GAN (mean resolution)
model was trained using a dataset of high-quality images. The generator
network of the GAN was used to generate new high-quality images, which
were then compared with the original test images.
Acknowledgement
The authors thank Universiti Teknologi PETRONAS and the Centre for
Subsurface Seismic Imaging for supporting this work, and Geoplat AI for
providing the software used in this work.
References
Aminzadeh, F. and Dasgupta, S.N. 2013. Fundamentals of Petroleum Geophysics. pp. 37–92.
https://doi.org/10.1016/B978-0-444-50662-7.00003-2.
An, Y., Guo, J., Ye, Q., Childs, C., Walsh, J. and Dong, R. 2021. Deep convolutional neural
network for automatic fault recognition from 3D seismic datasets. Comput. Geosci.
153: 104776. https://doi.org/10.1016/j.cageo.2021.104776.
204 Data Science and Machine Learning Applications in Subsurface Engineering
Cengiz, E., Kelek, M.M., Oğuz, Y. and Yılmaz, C. 2022. Classification of breast cancer with
deep learning from noisy images using wavelet transform. Biomedical Engineering/
Biomedizinische Technik 67: 143–150. https://doi.org/10.1515/bmt-2021-0163.
Chen, Y., Xie, Y., Zhou, Z., Shi, F., Christodoulou, A.G. and Li, D. 2018. Brain MRI super
resolution using 3D deep densely connected neural networks. 2018 IEEE 15th
International Symposium on Biomedical Imaging (ISBI 2018) April, pp. 739–742.
https://doi.org/10.1109/ISBI.2018.8363679.
Cronin, S.J., Zernack, A.v., Ukstins, I.A., Turner, M.B., Torres-Orozco, R., Stewart, R.B., Smith,
I.E.M., Procter, J.N., Price, R., Platz, T., Petterson, M., Neall, V.E., McDonald, G.S.,
Lerner, G.A., Damaschcke, M. and Bebbington, M.S. 2021. The geological history and
hazards of a long-lived stratovolcano, Mt. Taranaki, New Zealand. New Zealand Journal of
Geology and Geophysics 64: 456–478. https://doi.org/10.1080/00288306.2021.1895231.
Deng, M.-D., Jia, R.-S., Sun, H.-M. and Zhang, X.-L. 2021. Super-resolution reconstruction of
seismic section image via multi-scale convolution neural network. E3S Web of Conferences
303: 01058. https://doi.org/10.1051/e3sconf/202130301058.
Douglas, A. 2005. Slow slip on the northern Hikurangi subduction interface, New Zealand.
Geophys. Res. Lett. 32: L16305. https://doi.org/10.1029/2005GL023607.
Farfour, M., Gaci, S., El-Ghali, M. and Mostafa, M. 2021. A review about recent seismic
techniques in shale-gas exploration. pp. 65–80. In: Methods and Applications in Petroleum
and Mineral Exploration and Engineering Geology. Elsevier. https://doi.org/10.1016/
B978-0-323-85617-1.00012-6.
Gondara, L. 2016. Medical image denoising using convolutional denoising autoencoders.
pp. 241–246. In: 2016 IEEE 16th International Conference on Data Mining Workshops
(ICDMW). IEEE. https://doi.org/10.1109/ICDMW.2016.0041.
Gray, S.H. 2014. Seismic imaging. pp. S1-1-S1-16. In: Encyclopedia of Exploration Geophysics.
Society of Exploration Geophysicists. https://doi.org/10.1190/1.9781560803027.entry4.
Haldar, S.K. 2018. Exploration geophysics. pp. 103–122. In: Mineral Exploration. Elsevier.
https://doi.org/10.1016/B978-0-12-814022-2.00006-X.
Higgs, K.E., King, P.R., Raine, J.I., Sykes, R., Browne, G.H., Crouch, E.M. and Baur, J.R.
2012. Sequence stratigraphy and controls on reservoir sandstone distribution in an Eocene
marginal marine-coastal plain fairway, Taranaki Basin, New Zealand. Mar. Pet. Geol.
32: 110–137. https://doi.org/10.1016/J.MARPETGEO.2011.12.001.
Jiang, K., Wang, Z., Yi, P., Wang, G., Lu, T. and Jiang, J. 2019. Edge-enhanced GAN for remote
sensing image super-resolution. IEEE Transactions on Geoscience and Remote Sensing
57: 5799–5812. https://doi.org/10.1109/TGRS.2019.2902431.
Jo, Y., Choi, Y., Seol, S.J. and Byun, J. 2022. Machine learning-based vertical resolution
enhancement considering the seismic attenuation. J. Pet. Sci. Eng. 208: 109657.
https://doi.org/10.1016/j.petrol.2021.109657.
Ker, S., Marsset, B., Garziglia, S., le Gonidec, Y., Gibert, D., Voisset, M. and Adamy, J. 2010.
High-resolution seismic imaging in deep sea from a joint deep-towed/OBH reflection
experiment: Application to a mass transport complex offshore Nigeria. Geophys. J. Int.
182: 1524–1542. https://doi.org/10.1111/j.1365-246X.2010.04700.x.
Lan, T., Zeng, Z., Han, L. and Zeng, J. 2023. Seismic data denoising based on wavelet transform
and the residual neural network. Applied Sciences 13(1): 655. https://doi.org/10.3390/
app13010655.
Li, J., Wu, X. and Hu, Z. 2022. Deep learning for simultaneous seismic image super-resolution
and denoising. IEEE Transactions on Geoscience and Remote Sensing 60. https://doi.
org/10.1109/TGRS.2021.3057857.
Super-Vertical-Resolution Reconstruction of Seismic Volume 205
McBeath, D.M. 1977. Gas-condensate fields of the Taranaki basin, New Zealand. New Zealand
Journal of Geology and Geophysics 20: 99–127. https://doi.org/10.1080/00288306.1977.
10431594.
Mukherjee, S., Bell, R.S., Barkhouse, W.N., Adavani, S., Lelièvre, P.G. and Farquharson,
C.G. 2022. High-resolution imaging of subsurface infrastructure using deep learning
artificial intelligence on drone magnetometry. The Leading Edge 41: 462–471. https://doi.
org/10.1190/TLE41070462.1.
Nicol, A., Mazengarb, C., Chanier, F., Rait, G., Uruski, C. and Wallace, L. 2007. Tectonic
evolution of the active Hikurangi subduction margin, New Zealand, since the Oligocene.
Tectonics 26(4): 1–24. https://doi.org/10.1029/2006TC002090.
Nodder, S.D. 1993. Neotectonics of the offshore Cape Egmont Fault Zone, Taranaki Basin, New
Zealand. New Zealand Journal of Geology and Geophysics 36: 167–184. https://doi.org/1
0.1080/00288306.1993.9514566.
Orozco-del-Castillo, M.G., Ortiz-Alemán, C., Urrutia-Fucugauchi, J., Martin, R., Rodriguez-
Castellanos, A. and Villaseñor-Rojas, P.E. 2014. A genetic algorithm for filter design
to enhance features in seismic images. Geophys. Prospect. 62: 210–222. https://doi.
org/10.1111/1365-2478.12026.
Otchere, D.A., Abdalla Ayoub Mohammed, M., Ganat, T.O.A., Gholami, R. and Aljunid
Merican, Z.M. 2022a. A novel empirical and deep ensemble super learning approach
in predicting reservoir wettability via well logs. Applied Sciences 12: 2942. https://doi.
org/10.3390/app12062942.
Otchere, D.A., Tackie-Otoo, B.N., Mohammad, M.A.A., Ganat, T.O.A., Kuvakin, N.,
Miftakhov, R., Efremov, I. and Bazanov, A. 2022b. Improving seismic fault mapping
through data conditioning using a pre-trained deep convolutional neural network: A
case study on Groningen field. J. Pet. Sci. Eng. 213: 110411. https://doi.org/10.1016/J.
PETROL.2022.110411.
Picetti, F., Lipari, V., Bestagini, P. and Tubaro, S. 2019. Seismic image processing through the
generative adversarial network. Interpretation 7: SF15–SF26. https://doi.org/10.1190/INT-
2018-0232.1.
Rajabi, M., Ziegler, M., Tingay, M., Heidbach, O. and Reynolds, S. 2016. Contemporary
tectonic stress pattern of the Taranaki Basin, New Zealand. J. Geophys. Res. Solid Earth
121: 6053–6070. https://doi.org/10.1002/2016JB013178.
Roy Chowdhury, K. 2011. Seismic Data Acquisition and Processing. Dordrecht: Springer,
pp. 1081–1097. https://doi.org/10.1007/978-90-481-8702-7_52.
Selley, R.C. and Sonnenberg, S.A. 2015. Methods of exploration. pp. 41–152. In: Elements of
Petroleum Geology. Elsevier. https://doi.org/10.1016/B978-0-12-386031-6.00003-5.
Shi, F., Cai, N., Gu, Y., Hu, D., Ma, Y., Chen, Y. and Chen, X. 2019. DeSpecNet: A CNN-based
method for speckle reduction in retinal optical coherence tomography images. Phys. Med.
Biol. 64: 175010. https://doi.org/10.1088/1361-6560/AB3556.
Stagpoole, V. and Nicol, A. 2008. Regional structure and kinematic history of a large subduction
back thrust: Taranaki Fault, New Zealand. J. Geophys. Res. Solid Earth 113. https://doi.
org/10.1029/2007JB005170.
Sun, Q.F., Xu, J.Y., Zhang, H.X., Duan, Y.X. and Sun, Y.K. 2022. Random noise suppression and
super-resolution reconstruction algorithm of seismic profile based on GAN. J. Pet. Explor.
Prod. Technol. 12: 2107–2119. https://doi.org/10.1007/S13202-021-01447-0/FIGURES/7.
Talwani, M. and Kessinger, W. 2003. Exploration geophysics. pp. 709–726. In: Encyclopedia
of Physical Science and Technology. Elsevier. https://doi.org/10.1016/B0-12-227410-
5/00238-6.
206 Data Science and Machine Learning Applications in Subsurface Engineering
Voggenreiter, W.R. 1993. Structure and evolution of the Kapuni Anticline, Taranaki Basin, New
Zealand: Evidence from the Kapuni 3D seismic survey. New Zealand Journal of Geology
and Geophysics 36: 77–94. https://doi.org/10.1080/00288306.1993.9514556.
Wang, E. and Nealon, J. 2019. Applying machine learning to 3D seismic image denoising and
enhancement. Interpretation 7: SE131–SE139. https://doi.org/10.1190/INT-2018-0224.1.
Yan, F. and Chunqin, Z. 2008. Seismic data denoising based on second wavelet transform.
pp. 186–189. In: 2008 International Conference on Advanced Computer Theory and
Engineering. IEEE. https://doi.org/10.1109/ICACTE.2008.118.
Chapter 10
Petroleum Reservoir Characterisation: A Review from Empirical to
Computer-Based Applications
Ebenezer Ansah,1,* Anthony Ewusi,2 Eric Thompson Brantson3
and Jerry S.Y. Kuma2
1. Introduction
In the early days, geological studies entailed reservoir description by
integrating geologic, geophysical, and well logging data (Yu et al., 2011).
Given this, practical problems related to geology and engineering highlighted
the importance of integrated reservoir characterisation. That led to immense
research and application of various reservoir characterisations in the oil and
gas industries.
According to Ma (2011), reservoir characterisation is the study of the
properties of a reservoir using various field specialisations (geological,
geophysical, petrophysical, and engineering), including uncertainty analysis
of geological, engineering data, and spatial variations. There exists a
correlation among these sparse data sources, and analysing them can bring
out a great understanding of the reservoir. The most used data for reservoir
characterisation are seismic data (2D, 3D, and 4D), well logs, and core data.
Recent advancement has seen increased development in the application of
1 Department of Petroleum Geosciences and Engineering, School of Petroleum Studies, University of Mines and Technology, Tarkwa, Ghana.
2 Department of Geological Engineering, Faculty of Geosciences and Environmental Studies, University of Mines and Technology, Tarkwa, Ghana.
3 Department of Petroleum and Natural Gas Engineering, School of Petroleum Studies, University of Mines and Technology, Tarkwa, Ghana.
* Corresponding author: [email protected]
where there is a decrease in K with a rapid decrease in Phi. To account for this,
Schwartz and Kimminau (1987) introduced a consolidated model to survey
the effect of multiple accumulations of cementation on Phi. Bourbie et al.
(1987) also proposed a solution by introducing a variable power on the Phi in
the Kozeny-Carman equation (Eq. (10.1)).
$$K = B\,\phi^{n} d^{2} \qquad (10.1)$$
where K is the permeability, ϕ is the porosity, d is the diameter, and n is
the variable power introduced by Bourbie et al. (1987). The value of n varies
from about 3 for higher porosities to 7–8 for lower porosities. Building on
these solutions, Mavko and Nur (1997) proposed a solution by accounting for
the porosity percolation threshold. Their research included the percolation
threshold in the Kozeny-Carman relation (Eq. (10.2)) to accurately fit the
observed permeability to a well-sorted material and also extend the scope of
the model.
$$K = B\,(\phi - \phi_c)^{n} d^{2} \qquad (10.2)$$
where ϕc is the minimum threshold porosity and n is the power from Eq. (10.1).
Note that (ϕ – ϕc) replaces ϕ in the Kozeny-Carman relation to cater for the
percolation threshold.
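The effect of the percolation threshold can be sketched numerically. The function below assumes that (ϕ – ϕc) simply replaces ϕ in Eq. (10.1), as the text states, so that setting `phi_c = 0` recovers Eq. (10.1). The function name, the value of the constant B, the grain diameter, and the sample porosities are illustrative assumptions and are not taken from the cited studies.

```python
def kozeny_carman_permeability(phi, d, b, n=3.0, phi_c=0.0):
    """Permeability K = B * (phi - phi_c)**n * d**2.

    phi   : porosity (fraction)
    d     : grain diameter (any consistent length unit)
    b     : geometric constant B (assumed value, set by calibration)
    n     : porosity exponent (about 3 at high porosity, 7-8 at low porosity)
    phi_c : percolation threshold porosity; 0 gives Eq. (10.1)
    """
    if phi <= phi_c:
        # Below the threshold the pore network is taken to be disconnected.
        return 0.0
    return b * (phi - phi_c) ** n * d ** 2


# Illustrative values only.
B = 1.0
D = 1.0
k_without = kozeny_carman_permeability(0.20, D, B, n=3.0)
k_with = kozeny_carman_permeability(0.20, D, B, n=3.0, phi_c=0.02)
k_below = kozeny_carman_permeability(0.01, D, B, n=3.0, phi_c=0.02)
print(k_without, k_with, k_below)
```

Including the threshold lowers the predicted permeability at a given porosity and forces it to zero once the porosity falls to ϕc, which is the behaviour the modified relation is meant to capture.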
Sw model: Characteristics

Archie's model (Archie 1942): Archie's equation (Archie 1942) relates water
saturation to the true resistivity of the permeable formation, the formation
porosity and the formation water resistivity. The challenge arises when shale
is present in the reservoir: shale is a conductive medium, which violates the
original assumption of Archie's equation, namely a clean sandstone reservoir
(Archie 1942). The presence of shale distorts the total resistivity reading of
the reservoir and causes Archie's equation to overestimate the water
saturation (Archie 1942).

Simandoux (Simandoux, 1963): The Simandoux model was developed to study the
volumetric effects of reducing clay volume on the conductivity of the rock
matrix and the overall saturation.
The Simandoux model is applied to fine siltstones and clay-rich formations,
regardless of the specific distribution of the clay, and to shaly sandstones.
Simandoux experimented on only four synthetic samples using one type of clay
(montmorillonite) with a constant value of porosity. Hence, the model leads to
optimistic results when the porosity is less than 20%, and it cannot be relied
on in low porosity situations.
Also, the model does not maintain a volumetric balance between the sandstone
volume and the clay volume: the clay term lacks a shale formation factor,
which makes the clay correction in the model too large and hence reduces the
estimated water saturation (Sam-Marcus et al., 2018).

Waxman-Smits (Waxman-Smits 1968): The Waxman–Smits model is based on
laboratory measurements of resistivity, porosity and saturation of real rocks.
The major assumptions of the Waxman–Smits model about clay formation and its
properties are as follows: clay surface conductivity is assumed to be directly
proportional to the factor Qv (defined as the milli-equivalents of
exchangeable clay counterions per unit volume of pore space), and the F* term
is replicated in both the sandstone resistivity term and the shale resistivity
term.
This model served as the premise of the widely used dual water model. The
Waxman–Smits equation is often used as a standard against other methods
because of its strong experimental backing, but determining the CEC (Cation
Exchange Capacity) is a time-consuming experiment, and this is the major
limitation of the Waxman–Smits model.

Indonesian (Poupon and Leveaux, 1971): The Indonesian equation remains a
benchmark for field-based models that work reliably with log-based analysis
regardless of special core analysis data. The Indonesian equation also does
not assume any specific shale distribution. The Indonesian model has an extra
feature as the only model that considers the saturation exponent (n).
According to Shedid and Saad (2017), the results from the Indonesian predictor
have been obtained with a simpler equation, which is more convenient for quick
interpretation.

Dual water (Clavier et al. 1977): Dual water is an improved form of
Waxman-Smits which contains irreducible water saturation and free water. In
this method, it is proposed that the contribution of clay minerals to the
resistivity of reservoir rock is caused by the presence of free water within
the pore spaces and the bound water within the clay matrix.
Dual water was developed to account for the conductivity at the surface of a
clay mineral within the volume of shaly sandstone.
$$S_w = \sqrt[n]{\frac{a\,R_w}{R_t\,\phi_e^{m}}} \qquad (10.3)$$
$$\frac{1}{R_t} = \frac{\phi^{m} S_w^{n}}{a\,R_w} + \frac{V_{sh}\,S_w}{R_{sh}} \qquad (10.4)$$
$$\frac{1}{R_t} = \frac{\phi_T^{m^*} S_{wT}^{n^*}}{a\,R_w}\left(1 + \frac{B\,Q_v\,R_w}{S_{wT}}\right) \qquad (10.5)$$
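As a rough illustration of how shale conductivity changes the saturation estimate, the sketch below evaluates Archie's relation (Eq. (10.3)) in closed form and solves the Simandoux relation (Eq. (10.4)) for Sw by bisection. The function names, the default parameters a = 1 and m = n = 2, and the log readings are assumptions chosen for illustration, and no distinction is made here between total and effective porosity.

```python
def archie_sw(rt, rw, phi, a=1.0, m=2.0, n=2.0):
    """Archie water saturation: Sw = ((a * Rw) / (Rt * phi**m)) ** (1 / n)."""
    return ((a * rw) / (rt * phi ** m)) ** (1.0 / n)


def simandoux_conductivity(sw, rw, phi, vsh, rsh, a=1.0, m=2.0, n=2.0):
    """Right-hand side of the Simandoux relation, i.e. the modelled 1 / Rt."""
    return (phi ** m * sw ** n) / (a * rw) + vsh * sw / rsh


def simandoux_sw(rt, rw, phi, vsh, rsh, a=1.0, m=2.0, n=2.0, tol=1e-10):
    """Solve the Simandoux relation for Sw in [0, 1] by bisection.

    The modelled conductivity increases monotonically with Sw, so a simple
    bracketing search is sufficient.
    """
    target = 1.0 / rt
    low, high = 0.0, 1.0
    if simandoux_conductivity(high, rw, phi, vsh, rsh, a, m, n) <= target:
        return 1.0  # fully water saturated within the model
    while high - low > tol:
        mid = 0.5 * (low + high)
        if simandoux_conductivity(mid, rw, phi, vsh, rsh, a, m, n) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical log readings: Rt = 20 ohm.m, Rw = 0.05 ohm.m, porosity = 0.20.
sw_archie = archie_sw(rt=20.0, rw=0.05, phi=0.20)
sw_clean = simandoux_sw(rt=20.0, rw=0.05, phi=0.20, vsh=0.0, rsh=5.0)
sw_shaly = simandoux_sw(rt=20.0, rw=0.05, phi=0.20, vsh=0.2, rsh=5.0)
print(sw_archie, sw_clean, sw_shaly)
```

With zero shale volume the Simandoux estimate coincides with Archie's. Adding shale lowers the estimated water saturation for the same resistivity reading, because part of the measured conductivity is attributed to the shale.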
Fig. 10.1. Influence of petrophysical properties on the net present value (NPV). Modified after Bowers
and Fitz (2000).
that exists between the estimated reservoir elastic properties from the wireline
logs and corresponding sampled fluid information. The application of
computer-based intelligence provides a different perspective on the
implementation of well logs and also brings to light more details contained
in these input logs (e.g., the relationship between Gamma Ray (GR), Deep
resistivity (ILD), and Density (RHOB) with formation K). In this section,
AI techniques such as ANN, SVM and fuzzy logic (FL) will be discussed by
providing the main theoretical framework and some applications in reservoir
characterisation.
ML algorithms have been applied to several quantitative analyses of well
logs for the estimation of reservoir properties (Helle et al., 2001; Huang
et al., 1996; Huang and Williamson, 1997). This learning approach has proved
to be simple and has provided accurate solutions for the evaluation of
reservoir properties using several well logs. Nikravesh et al. (2003) noted
that the computational estimation of reservoir parameters seems to be more
reliable because it is independent of the uncertainties that come with the
logging process.
For the various AI concepts and theories, Otchere et al. (2021a) have
presented an in-depth report on the mathematical structure of some AI
algorithms, such as ANN, SVM, and Relevance Vector Machine (RVM).
Likewise, Saikia et al. (2020) and Otchere et al. (2021a) provide a detailed
overview of the structure of AI algorithms and their advantages and limitations.
where xi is the input signal from the axon terminal arriving at synapse i; wi
represents the weight of the synapse i; a is the bias term, and j is the summation
of all inputs received from the dendrites. The output signal after activation
is given by the equation:
y = f (j) (10.9)
where y represents the neuron output signal transmitted through an axon to
another perceptron, and f represents the activation transform function.
The various categories of the ANN and the activation function employed
on the input variable give room for the possible design of an ANN as shown in
Fig. 10.4. Zendehboudi et al. (2018) presented in detail some of the possible
and popular ANN designs.
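The single-neuron computation described above can be sketched as follows. The weighted-sum form j = Σ wi xi + a is assumed from the symbol definitions in the text, and the sigmoid is only one possible choice for the activation function f; the function names and numerical values are illustrative.

```python
import math


def sigmoid(j):
    """One common activation function; any other f could be substituted."""
    return 1.0 / (1.0 + math.exp(-j))


def neuron_output(inputs, weights, bias, activation):
    """Compute y = f(j), where j is the weighted sum of the inputs plus the bias a."""
    j = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(j)


# Two input signals with illustrative synaptic weights and zero bias.
y = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0, activation=sigmoid)
print(y)
```

Here the two weighted inputs cancel, so j = 0 and the sigmoid returns its midpoint value. Changing the weights or the bias shifts j and therefore the signal passed on to the next perceptron.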
Fig. 10.2. Mathematical framework of a single-neuron ANN showing the flow of data.
Fig. 10.3. Example of (a) conceptual model for the BPNN modelled by Mohaghegh et al. (1996), and
(b) conceptual model framework for the Ensemble Learning Framework modelled after Anifowose
et al. (2017).
Fig. 10.4. Types of ANN based on different criteria (modified after Zendehboudi et al., 2018).
Fig. 10.5. SVM architecture showing: (a) regression SVM/RVM in a high dimensional space; and
(b) supervised support vector creating a linear relationship between two non-linear data making use
of optimal hyperplane.
| Author(s) | Input Variables | Predicted Output | Model Used | Activation Functions | Output | Statistical Measures | Remarks |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Avseth and Mukerji (2002) | Well logs (GR, RHOZ, DT) | Lithofacies | Rock Physics, Mahalanobis Discriminant Analysis (MLDA), Probability Density Function (PDF), Classical Neural Network (NN) | MLDA: Mean and Covariance. NN: Sigmoid Transfer Function | Results demonstrate a somewhat better forecast utilizing the NN as compared with the PDF and the MLDA, yet the MLDA ended up being viable for the grouping of well log into discrete lithofacies | All applied models had a success rate of about 80% | The predictive performance of the model would have been improved if additional inputs such as photoelectric factor and spontaneous potential were considered |
| Tang (2008) | Well Logs (GR, SP, DT, NPHI, RHOZ, PE, | Electrofacies | Probabilistic Neural Network (PNN) | Radial Basis function | PNN had a good classification performance | PNN attained a prediction accuracy above 70% | Although the PNN achieved a good classification, comparing it with other statistical |
classification was based on the six different facies groups from the North Sea
turbidite reservoir, taking into consideration the clay content, grain size, and
bedding configuration. After data filtering, the authors used a neural network
and various multivariate statistical techniques to classify the lithofacies. The
Mahalanobis Linear Discriminant Analysis (MLDA), Probability Density
Function (PDF) classifier, and a Multilayer Feed-forward design with
Back-propagation (MLFN-BP) mass adaptation methodology were used based
on their ability to establish a link between facies and their physical properties.
The results emphasised the use of PDF for the identification of new facies
using holdout cross-validation; however, the MLFN-BP classifier was the best
algorithm for dealing with the multidimensional cluster boundary. In terms of
model performance, Otchere et al. (2021a) suggested that there would have been
an improvement if a stronger selection technique had been employed to select
only pertinent input variables, and outlined a selection bias in the model.
Conditioning the input well log data with the rock physics model would also
have increased the performance of the model. Ameur-Zaimeche
et al. (2020) conducted additional research on the feed-forward neural network
using the MLP to reconstruct the lithofacies breaks in the Sif Fatima oil field
in Algeria. Linking MLP to cluster analysis, their outcomes showed that MLP
is better suited for forecasting the non-cored lithofacies.
Tang (2008) defined carbonate lithofacies with some well logs using a
Probabilistic Neural Network (PNN) by analysing the multidimensional
correlation between the variables. Because of its ability to analyse the
multidimensionality that exists between well logs and distinct facies, the PNN
outperformed discriminant analysis and multi-logistic statistical algorithms.
To assess the reliability of the PNN lithofacies prediction, the model was
integrated with two log lithology indicators and mapped zonally using simple
kriging. The zonal facies map correlated well with the conceptual reservoir
model, which indicates that it is suitable for reservoir modelling and flow
simulation. Al-Mudhafar (2017b) used well logs to investigate the use
of the Kernel Support Vector Machine (KSVM) to model the distribution of
lithofacies. Because of its ability to recognize distinct classes with a decision
function defined by additional subgroups of supporting vectors, the author
chose KSVM. To further validate the model, it was cross-validated using the
known lithofacies. The validation yielded a 99.55% accuracy using the KSVM
algorithm. Also in recent years, there has been increasing documentation of
improved facies and lithology analysis (e.g., Asante-Okyere et al., 2020b;
Otchere et al., 2022b; Shen et al., 2019; Xie et al., 2021).
8. Summary
Tables 10.2 to 10.5 show a summary of the works done concerning the
application of AI techniques in reservoir characterisation. The summary
comprises the input data, predicted output, AI architecture, activation function,
output performance, and the statistical measure for the AI algorithm. An
overview of the advantages and limitations of some AI algorithms is discussed
in detail by Saikia et al. (2020). It can be observed from Tables 10.2 to 10.5
that all works done considered supervised and unsupervised AI algorithms to
achieve a good prediction for their developed models. Most operators generate
ample data during their operations, but only a small amount of core data is
acquired to confirm the actual ground truth and validate the model or the
empirical relation used. With so little core information, training a
supervised model becomes difficult, and a semi-supervised algorithm becomes
useful for exploiting the few core data available. The semi-supervised
algorithm first uses the small dataset in a supervised manner and later applies
an unsupervised model to optimise the final prediction. From Tables 10.2 to
10.5, it can be seen that there is limited work done concerning unsupervised
techniques and hybrid unsupervised techniques in the field of reservoir
characterisation. Furthermore, it has also been observed that with the various
9.1 AI Perspective
Literature has highlighted various modifications of the ANN algorithm, such as
the functional network (FNN), PNN, and Radial Basis Function (RBF), that
help remove the various limitations of ANN algorithms. These modifications
have made it possible to address nonlinearity between inputs that empirical
correlations could not capture. AI has produced improved prediction and
classification in different reservoir characterisation tasks through hybrid
modelling (Amiri et al., 2015; Anifowose et al., 2011; Anifowose et al.,
2017; Saemi et al., 2007) and ensemble models (Anifowose et al., 2015), which
help provide good hyperparameters for modelling.
Feature extraction and selection by AI algorithms is one of the main
areas that most of the literature addresses to improve the prediction capability
of these AI tools. Feature selection examines the individual inputs to the AI
algorithm and separates the necessary features from the unnecessary ones, which
in turn affects the efficiency of the algorithm. Since conventional AI algorithms
lack the power of feature extraction, most research has looked at hybridising
AI algorithms to make up for that deficiency. Considering the various feature
selection tools (PCA, forward feature selection (FFS), linear discriminant
Petroleum Reservoir Characterisation 249
analysis (LDA), etc.), fuzzy-aided hybrid models have performed well in much
of the reviewed literature.
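The idea behind forward feature selection can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the scoring model (a leave-one-out 1-nearest-neighbour regressor), the function names, and the toy data are all assumptions chosen so that the example runs with the Python standard library only.

```python
from typing import List, Sequence

Row = Sequence[float]


def loo_1nn_mse(X: List[Row], y: List[float], cols: List[int]) -> float:
    """Leave-one-out mean squared error of a 1-nearest-neighbour regressor
    that only looks at the feature columns listed in `cols`."""
    total = 0.0
    for i, xi in enumerate(X):
        best_dist, best_j = float("inf"), -1
        for j, xj in enumerate(X):
            if i == j:
                continue
            dist = sum((xi[c] - xj[c]) ** 2 for c in cols)
            if dist < best_dist:
                best_dist, best_j = dist, j
        total += (y[i] - y[best_j]) ** 2
    return total / len(X)


def forward_feature_selection(X, y, score=loo_1nn_mse, tol=1e-9):
    """Greedy wrapper: repeatedly add the single feature that lowers the
    score the most; stop when no candidate improves it by more than `tol`."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    best = float("inf")
    while remaining:
        trial = {c: score(X, y, selected + [c]) for c in remaining}
        c_best = min(trial, key=trial.get)
        if best - trial[c_best] <= tol:
            break
        best = trial[c_best]
        selected.append(c_best)
        remaining.remove(c_best)
    return selected, best


# Hypothetical toy data: column 0 drives the target, column 1 is a
# shuffled column that carries no information about it.
n = 12
X = [[i / (n - 1), ((5 * i) % n) / (n - 1)] for i in range(n)]
y = [row[0] for row in X]

selected, err = forward_feature_selection(X, y)
print("selected columns:", selected, "LOO MSE:", err)
```

On this toy set the wrapper keeps only the informative column, because adding the shuffled column cannot lower the leave-one-out error. With real well logs the scoring function would normally be replaced by the cross-validated error of the model under study.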
Because the datasets used to train AI algorithms are high-dimensional,
transforming them from a high-dimensional space to a low-dimensional space
while preserving the high-dimensional structure helps improve the prediction
power of ANN and ML algorithms. This process of reducing dimensionality is
termed feature extraction. Deep learning algorithms address the problem of
feature extraction because they can extract valuable information directly from
the primary data input. The literature reviewed here indicates that soft
computing methods are usually applied with trial-and-error selection of the
parameter settings to obtain high accuracy. Future studies should therefore
employ automatic parameter selection for soft computing methods to avoid the
long time spent searching for optimal parameters.
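As a hedged illustration of dimensionality reduction, the sketch below finds the first principal component of a small dataset by power iteration on its covariance matrix and projects the samples onto it. The function names and the two-column toy data are assumptions made for illustration; in practice a library PCA routine would be used instead.

```python
import math
from typing import List


def first_principal_component(X: List[List[float]], iters: int = 200) -> List[float]:
    """Unit vector of the first principal component of X (rows = samples),
    found by power iteration on the sample covariance matrix."""
    n, d = len(X), len(X[0])
    means = [sum(row[k] for row in X) / n for k in range(d)]
    Xc = [[row[k] - means[k] for k in range(d)] for row in X]
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d          # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                   # the data have no variance at all
            break
        v = [x / norm for x in w]
    return v


def project(X: List[List[float]], v: List[float]) -> List[float]:
    """Score of each mean-centred sample along the direction v."""
    n, d = len(X), len(v)
    means = [sum(row[k] for row in X) / n for k in range(d)]
    return [sum((row[k] - means[k]) * v[k] for k in range(d)) for row in X]


# Hypothetical toy data: two perfectly correlated input columns,
# so one component carries all of the variance.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
pc1 = first_principal_component(X)
scores = project(X, pc1)
print("direction:", pc1)
print("1-D scores:", scores)
```

Here the two-dimensional samples are replaced by a single score each without losing information, which is the low-dimensional representation described above.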
In recent years, the oil industry has seen a high volume of data generated
by progressive sensors (Saikia et al., 2020), which has led to the development
of advanced modelling and estimation techniques. One such technique is the
application of a deep learning algorithm (e.g., CNN) to handle feature selection
(Shaheen et al., 2016) and extract useful information from the input data
through its hidden neurons (Saikia et al., 2020). Deep learning (DL) has
not been well exploited in the domain of reservoir characterisation but has
yielded much success in image and speech recognition (Bae et al., 2016;
Fujiyoshi et al., 2019; He et al., 2016; Pouyanfar et al., 2018) and in other
fields such as the medical sector (Chen et al., 2018; Fang et al., 2019).
Although DL has in recent years been applied to various learning tasks,
training the model is difficult (Rere et al., 2016); with improved data
availability, however, the performance of DL models has improved. Various
algorithms such as Stochastic Gradient Descent (SGD), Conjugate Gradient (CG),
Hessian-free optimization (HFO), and Krylov Subspace Descent (KSD) have
been implemented over the years to ease this training difficulty. These
algorithms have limitations of their own: SGD requires several manual tuning
schemes, CG is slow, and HFO and KSD consume more memory (Rere et al., 2016).
Hybridising metaheuristics with DL is a promising way to resolve some of these
issues. Metaheuristic optimisation techniques have been successfully applied to
many optimisation problems in engineering, the sciences, and related
industries. Research on hybridising DL with metaheuristic algorithms
is scarce in the field of reservoir characterisation. Moreover, another
important capability of DL, learning from unlabelled data, has also not been
properly exploited in reservoir characterisation. Likewise, with the
availability of core information coupled with well log data, the application
of semi-supervised deep learning approaches can lead to greater success in
Fig. 10.6. Proposed framework for the semi-supervised learning Deep Neural Network.
Fig. 10.7. Application of Rock Physics for well conditioning and its contribution to reservoir
characterisation.
seismic-guided approach coupled with rock physics has been the best
method. Since geological location has a great influence on the acquired
data (geophysical, core, and petrophysical), rock physics models can be used
to describe the characteristics (elastic moduli) of the geology and to
understand the geophysical data response of that geology (Fig. 10.7). In
Fig. 10.7, rock physics models are used to condition the various
input logs (such as density and sonic logs), which are then used to
enhance reservoir characterisation (deterministic inversion, facies probability
mapping, and others). This technique helps in understanding the
geological complexity and improving the well log response, and it can achieve
a considerable level of accuracy with fewer data samples.
Therefore, future work should exploit rock physics models
alongside AI models to condition the inputs used for the prediction
of reservoir properties, giving more accurate and precise results for future
property modelling.
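To make the link between density and sonic logs and the elastic moduli concrete, the sketch below applies the standard isotropic relations, shear modulus = density × Vs² and bulk modulus = density × (Vp² − 4/3 Vs²). It is not the conditioning workflow of Fig. 10.7; the function names, the log units (g/cc and µs/ft), and the sample values are assumptions.

```python
from typing import Dict

FT_TO_M = 0.3048  # metres per foot


def slowness_to_velocity(dt_us_per_ft: float) -> float:
    """Convert sonic slowness (microseconds per foot) to velocity (m/s)."""
    return FT_TO_M * 1.0e6 / dt_us_per_ft


def elastic_moduli(rhob_g_cc: float, dtc_us_ft: float, dts_us_ft: float) -> Dict[str, float]:
    """Isotropic elastic properties from bulk density and compressional and
    shear sonic slowness. Moduli are returned in GPa."""
    rho = rhob_g_cc * 1000.0                  # g/cc -> kg/m^3
    vp = slowness_to_velocity(dtc_us_ft)      # P-wave velocity, m/s
    vs = slowness_to_velocity(dts_us_ft)      # S-wave velocity, m/s
    shear = rho * vs ** 2                     # Pa
    bulk = rho * (vp ** 2 - 4.0 / 3.0 * vs ** 2)
    poisson = (vp ** 2 - 2.0 * vs ** 2) / (2.0 * (vp ** 2 - vs ** 2))
    return {
        "vp_m_s": vp,
        "vs_m_s": vs,
        "bulk_GPa": bulk / 1.0e9,
        "shear_GPa": shear / 1.0e9,
        "poisson": poisson,
    }


# Hypothetical log sample: 2.0 g/cc, 101.6 us/ft (P) and 203.2 us/ft (S).
sample = elastic_moduli(rhob_g_cc=2.0, dtc_us_ft=101.6, dts_us_ft=203.2)
print(sample)
```

Moduli computed this way at each depth could then be compared with a rock physics model to flag or correct poor log readings before the logs are passed to an AI model.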
10. Conclusions
The past years have seen the petroleum industry generate abundant data for
its resource estimation. However, under current economic conditions, most
field data acquisition has been limited, so reservoir characterisation relies
on the data already acquired and on the use of complex models.
Integration of all available data (petrophysics, geology, geophysics, rock
mechanics, and engineering) plays an important role in achieving a good
reservoir characterisation. The integration of the available data provides high
lateral resolution (seismic data) and high vertical resolution (well logs), while
core data provide a physical measurement of the reservoir. Also, integrating
these data sources helps to improve inter-well estimation by taking the seismic
data into account. Reservoir characterisation has been challenged by the fact
that the various models have to handle the complex nonlinearity present in the
available datasets as well as the uncertainties present in the datasets and
models. These flaws are mitigated by using strong data conditioning methods
for dimensionality reduction, denoising, and identifying appropriate features
for prediction to achieve accurate and reliable reservoir characterisation. The
conclusions for reservoir characterisation are highlighted as follows:
1. Most ML paradigms employed for petrophysical seismic characterisation
have been reviewed. These models have been used to address the
issues facing the empirical approaches used to determine petrophysical
parameters. Most of the literature favours SVM over ANN models
since the former gives the least error, handles small datasets, and has
faster processing times.
2. There has been an increase in the use of hybridised models in the field
of reservoir characterisation. The hybrid models have helped to solve
various problems facing conventional AI algorithms such as parameter
optimisation, weight adjustment, computational time, uncertainty in data,
and feature selection.
3. In the literature, the selection of most hyperparameters for AI models
applied to reservoir characterisation relies on a manual trial-and-error
technique, which is tedious and time-consuming and often ends in suboptimally
trained models. Hence, future studies should explore more efficient
strategies that use metaheuristic algorithms for the automatic selection
of optimal hyperparameters and for improving computational time.
4. In a complex application, deep neural networks outperform traditional
neural networks in handling high data complexity with automatic feature
extraction. As a result, this method avoids the need for a separate feature
extraction and selection phase.
5. Finally, while deep neural networks and hybridised metaheuristic models
are widely used in many classification problems, they have not
been applied much to predicting reservoir properties. Seismic-driven
integration models with deep learning, and hybrid metaheuristic models
with DL, are yet to be explored in depth.
6. While data conditioning improves the overall prediction of various AI
algorithms, the use of rock physics to help condition the various log
inputs in this domain is yet to be explored in depth.
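The automatic hyperparameter selection called for in conclusion 3 can be sketched with a basic particle swarm optimiser, one of many possible metaheuristics. Everything specific here is an assumption for illustration: the function names, the swarm settings, and above all `validation_loss`, which is a smooth stand-in for the cross-validated error of a real model. One particle starts at a hand-tuned setting, so the result is never worse than that starting point.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def pso_minimise(loss: Callable[[List[float]], float], bounds: Bounds,
                 start: Sequence[float], n_particles: int = 15,
                 n_iters: int = 60, w: float = 0.7, c1: float = 1.5,
                 c2: float = 1.5, seed: int = 0) -> Tuple[List[float], float]:
    """Basic global-best particle swarm optimisation within box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x: float, k: int) -> float:
        return min(max(x, bounds[k][0]), bounds[k][1])

    # First particle sits at the hand-tuned start; the rest are random.
    pos = [list(start)] + [[rng.uniform(lo, hi) for lo, hi in bounds]
                           for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=p_loss.__getitem__)
    g_best, g_loss = p_best[g][:], p_loss[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (w * vel[i][k]
                             + c1 * r1 * (p_best[i][k] - pos[i][k])
                             + c2 * r2 * (g_best[k] - pos[i][k]))
                pos[i][k] = clip(pos[i][k] + vel[i][k], k)
            current = loss(pos[i])
            if current < p_loss[i]:              # update personal best
                p_best[i], p_loss[i] = pos[i][:], current
                if current < g_loss:             # update global best
                    g_best, g_loss = pos[i][:], current
    return g_best, g_loss


def validation_loss(params: List[float]) -> float:
    """Hypothetical stand-in for a model's validation error as a function of
    (log10 learning rate, log10 regularisation strength)."""
    log_lr, log_reg = params
    return (log_lr + 2.0) ** 2 + 0.5 * (log_reg + 3.0) ** 2


BOUNDS = [(-5.0, 0.0), (-5.0, 0.0)]
START = [-1.0, -1.0]                             # hand-tuned first guess
best, best_loss = pso_minimise(validation_loss, BOUNDS, START)
print("best hyperparameters (log10):", best, "loss:", best_loss)
```

In a real study each call to the loss would train and validate a model, so the number of particles and iterations has to be balanced against the computational time mentioned in the same conclusion.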
References
Al-Anazi, A. and Gates, I.D. 2010. A support vector machine algorithm to classify
lithofacies and model permeability in heterogeneous reservoirs. Engineering Geology
114(3-4): 267–277. https://doi.org/10.1016/j.enggeo.2010.05.005.
Al-Anazi, A.F. and Gates, I.D. 2012. Support vector regression to predict porosity and
permeability: Effect of sample size. Computers and Geosciences 39: 64–76.
https://doi.org/10.1016/j.cageo.2011.06.011.
Al-Bulushi, N., King, P.R., Blunt, M.J. and Kraaijveld, M. 2009. Development of artificial neural
network models for predicting water saturation and fluid distribution. Journal of Petroleum
Science and Engineering 68(3–4): 197–208. https://doi.org/10.1016/j.petrol.2009.06.017.
Al-Mudhafar, W.J. 2017a. Integrating well log interpretations for lithofacies classification
and permeability modeling through advanced machine learning algorithms. Journal
of Petroleum Exploration and Production Technology 7(4): 1023–1033.
https://doi.org/10.1007/s13202-017-0360-0.
Al-Mudhafar, W.J. 2017b. Integrating kernel support vector machines for efficient rock facies
classification in the main pay of Zubair formation in South Rumaila oil field, Iraq. Modeling
Earth Systems and Environment 3: 1–8. https://doi.org/10.1007/s40808-017-0277-0.
Alzubaidi, F., Mostaghimi, P., Swietojanski, P., Clark, S.R. and Armstrong, R.T. 2021. Automated
lithology classification from drill core images using convolutional neural networks.
Journal of Petroleum Science and Engineering 197: 107933.
https://doi.org/10.1016/j.petrol.2020.107933.
Ameur-Zaimeche, O., Zeddouri, A., Heddam, S. and Kechiched, R. 2020. Lithofacies prediction
in non-cored wells from the Sif Fatima oil field (Berkine basin, southern Algeria): a
comparative study of multilayer perceptron neural network and cluster analysis-based
approaches. Journal of African Earth Sciences 166: 103826.
https://doi.org/10.1016/j.jafrearsci.2020.103826.
Amiri, M., Ghiasi-Freez, J., Golkar, B. and Hatampour, A. 2015. Improving water saturation
estimation in a tight shaly sandstone reservoir using artificial neural network optimised
by imperialist competitive algorithm: A case study. Journal of Petroleum Science and
Engineering 127: 347–358. https://doi.org/10.1016/j.petrol.2015.01.013.
Anifowose, F. and Abdulraheem, A. 2011. Fuzzy logic-driven and SVM-driven hybrid
computational intelligence models applied to oil and gas reservoir characterization.
Journal of Natural Gas Science and Engineering 3(3): 505–517.
https://doi.org/10.1016/j.jngse.2011.05.002.
Anifowose, F.A., Abdulraheem, A., Al-Shuhail, A. and Schmitt, D.P. 2013. Improved
permeability prediction from seismic and log data using artificial intelligence techniques.
pp. 2190–2196. In: SPE Middle East Oil and Gas Show and Conference. March. OnePetro.
https://doi.org/10.2118/164465-MS.
Anifowose, F.A., Labadin, J. and Abdulraheem, A. 2015. Ensemble model of non-linear
feature selection-based extreme learning machine for improved natural gas reservoir
characterisation. Journal of Natural Gas Science and Engineering 26: 1561–1572.
https://doi.org/10.1016/j.jngse.2015.02.012.
Anifowose, F.A., Labadin, J. and Abdulraheem, A. 2017. Hybrid intelligent systems in
petroleum reservoir characterisation and modelling: The journey so far and the challenges
ahead. Journal of Petroleum Exploration and Production Technology 7: 251–263.
https://doi.org/10.1007/s13202-016-0257-3.
Archie, G.E. 1952. Classification of carbonate reservoir rocks and petrophysical considerations.
AAPG Bulletin 36(2): 278–298.
https://doi.org/10.1306/3D9343F7-16B1-11D7-8645000102C1865D.
Asante-Okyere, S., Shen, C., Ziggah, Y.Y., Rulegeya, M.M. and Zhu, X. 2020a. Principal
component analysis (PCA) based hybrid models for the accurate estimation of reservoir
water saturation. Computers and Geosciences 145: 104555.
https://doi.org/10.1016/j.cageo.2020.104555.
Asante-Okyere, S., Shen, C., Ziggah, Y.Y., Rulegeya, M.M. and Zhu, X. 2020b. A novel hybrid
technique of integrating gradient-boosted machine and clustering algorithms for lithology
classification. Natural Resources Research 29: 2257–2273.
https://doi.org/10.1007/s11053-019-09576-4.
Avseth, P. and Mukerji, T. 2002. Seismic lithofacies classification from well logs using statistical
rock physics. Petrophysics: The SPWLA Journal of Formation Evaluation and Reservoir
Description 43(02): 70–81.
Bae, H.S., Lee, H.J. and Lee, S.G. 2016. Voice recognition based on adaptive MFCC and deep
learning. pp. 1542–1546. In: 2016 IEEE 11th Conference on Industrial Electronics and
Applications (ICIEA), June. IEEE. https://doi.org/10.1109/ICIEA.2016.7603830.
Bestagini, P., Lipari, V. and Tubaro, S. 2017. A machine learning approach to facies classification
using well logs. pp. 2137–2142. In: SEG Technical Program Expanded Abstracts 2017.
Society of Exploration Geophysicists. https://doi.org/10.1190/segam2017-17729805.1.
Bian, H., Xia, Y., Lu, C., Qin, X., Meng, Q. and Lu, H. 2020. Pore structure fractal characterization
and permeability simulation of natural gas hydrate reservoir based on CT images. Geofluids
2020: 1–9. https://doi.org/10.1155/2020/6934691.
Bourbié, T., Coussy, O. and Zinszner, B. 1987. Acoustics of Porous Media. Gulf Publ. Co.,
translated by N. Marshall from French, Acoustique des Milieux Poreux.
https://doi.org/10.1121/1.402899.
Bowers, M.C. and Fitz, D.E. 2000. A probabilistic approach to determine uncertainty in
calculated water saturation. In: SPWLA 41st Annual Logging Symposium. June. OnePetro.
Brantson, E.T., Sibil, S., Osei, H., Owusu, E.B., Takyi, B. and Ansah, E. 2022. A new approach
for saturation height modelling in a clastic reservoir using response surface methodology
and artificial neural network. Upstream Oil and Gas Technology 9: 100081.
https://doi.org/10.1016/j.upstre.2022.100081.
Carman, P.C. 1937. Fluid flow through granular beds. Trans. Inst. Chem. Eng. 15: 150–166.
https://doi.org/10.1016/S0263-8762(97)80003-2.
Carman, P.C. 1956. Flow of gases through porous media. Butterworths, London.
https://doi.org/10.1016/S0263-8762(97)80003-2.
Chai, X., Nie, W., Lin, K., Tang, G., Yang, T., Yu, J. and Cao, W. 2022. An open-source package
for deep-learning-based seismic facies classification: benchmarking experiments on the
SEG 2020 Open Data. IEEE Transactions on Geoscience and Remote Sensing 60: 1–19.
https://doi.org/10.1109/TGRS.2022.3144666.
Chen, H., Engkvist, O., Wang, Y., Olivecrona, M. and Blaschke, T. 2018. The rise of deep learning
in drug discovery. Drug Discovery Today 23(6): 1241–1250.
https://doi.org/10.1016/j.drudis.2018.01.039.
Chi, X.G. and Han, D.H. 2009. Lithology and fluid differentiation using a rock physics template.
The Leading Edge 28(1): 60–65. https://doi.org/10.1190/1.3064147.
Costa, A. 2006. Permeability‐porosity relationship: A re-examination of the Kozeny‐Carman
equation based on a fractal pore‐space geometry assumption. Geophysical Research
Letters 33(2). https://doi.org/10.1029/2005GL025134.
De Matos, M.C., Osorio, P.L. and Johann, P.R. 2007. Unsupervised seismic facies analysis
using wavelet transform and self-organising maps. Geophysics 72(1): 9–21.
https://doi.org/10.1190/1.2392789.
Dong, S., Xu, L., Dai, Z., Xu, B.I.N., Yu, Q., Yin, S., Zhang, X., Zhang, C., Zang, X.,
Zhou, X. and Zhang, Z. 2020. A novel fractal model for estimating permeability in
Hewett, T.A. 1986. Fractal distributions of reservoir heterogeneity and their influence on fluid
transport. In: SPE Annual Technical Conference and Exhibition. October. OnePetro.
https://doi.org/10.2118/15386-MS.
Hewett, T.A. and Behrens, R.A. 1990. Conditional simulation of reservoir heterogeneity with
fractals. SPE Formation Evaluation 5(03): 217–225. https://doi.org/10.2118/18326-PA.
Hu, Y., Yu, X. and Chen, G. 2012. Classification of the average capillary pressure function
and its application in calculating fluid saturation. Petroleum Exploration and Development
39(6): 778–784. https://doi.org/10.1016/S1876-3804(12)60104-9.
Huang, Z., Shimeld, J., Williamson, M. and Katsube, J. 1996. Permeability prediction with
artificial neural network modelling in the Venture gas field, offshore eastern Canada.
Geophysics 61(2): 422–436. https://doi.org/10.1190/1.1443970.
Huang, Z. and Williamson, M.A. 1997. Determination of porosity and permeability in reservoir
intervals by artificial neural network modelling, offshore Eastern Canada. Petroleum
Geoscience 3(3): 245–258. https://doi.org/10.1144/petgeo.3.3.245.
Iturrarán-Viveros, U. and Muñoz-García, M.A. 2018. Porosity and water saturation in sands
or shales using Artificial Neural Networks and seismic attributes in a clastic reservoir in
Colombia. pp. 1282–1285. In: International Geophysical Conference, Beijing, China,
24–27 April 2018. December. Society of Exploration Geophysicists and Chinese Petroleum
Society. https://doi.org/10.1190/IGC2018-314.
Kaydani, H., Mohebbi, A. and Baghaie, A. 2012. Neural fuzzy system development for the
prediction of permeability from wireline data based on fuzzy clustering. Petroleum Science
and Technology 30(19): 2036–2045. https://doi.org/10.1080/10916466.2010.531345.
Kozeny, J. 1927. Über kapillare Leitung des Wassers im Boden-Aufstieg, Versickerung und
Anwendung auf die Bewässerung, Sitzungsberichte der Akademie der Wissenschaften
Wien. Mathematisch Naturwissenschaftliche Abteilung 136: 271–306.
Li, D. and Lake, L.W. 1995. Scaling fluid flow through heterogeneous permeable media. SPE
Advanced Technology Series 3(01): 188–197. https://doi.org/10.2118/26648-PA.
Lian, P.Q., Tan, X.Q., Ma, C.Y., Feng, R.Q. and Gao, H.M. 2016. Saturation modelling in a
carbonate reservoir using capillary pressure based saturation height function: a case study
of the Svk reservoir in the Y Field. Journal of Petroleum Exploration and Production
Technology 6(1): 73–84. https://doi.org/10.1007/s13202-015-0159-9.
Lim, J.S. and Kim, J. 2004. Reservoir porosity and permeability estimation from well logs
using fuzzy logic and neural networks. In: SPE Asia Pacific Oil and Gas Conference and
Exhibition. OnePetro. https://doi.org/10.2118/88476-MS.
Luo, H., Tang, D. and Tang, Y. 2015. Study on prediction of oil water contact in carbonate
reservoir with capillary pressure data. Editorial Department of Petroleum Geology and
Recovery Efficiency 20(2): 71–73.
Ma, Y.Z. 2011. Uncertainty analysis in reservoir characterization and management: How
much should we know about what we don’t know? In: Ma, Y.Z. and P.R. La Pointe
(eds.). Uncertainty Analysis and Reservoir Modelling, AAPG Memoir. 96: 1–15.
https://doi.org/10.1306/13301404M963458.
Mavko, G. and Nur, A. 1997. The effect of a percolation threshold in the Kozeny-Carman
relation. Geophysics 62(5): 1480–1482. https://doi.org/10.1190/1.1444251.
McCulloch, W.S. and Pitts, W. 1943. A logical calculus of the ideas immanent in nervous
activity. The Bulletin of Mathematical Biophysics 5: 115–133.
https://doi.org/10.1007/BF02478259.
Mehana, M. and El-monier, I. 2016. Shale characteristics impact on Nuclear Magnetic
Resonance (NMR) fluid typing methods and correlations. Petroleum 2(2): 138–147.
https://doi.org/10.1016/j.petlm.2016.02.002.
Meshri, I.D. 1986. On the reactivity of carbonic and organic acids and generation of secondary
porosity: Roles of organic matter in sediment diagenesis. In: Gautier, D.L. (ed.). Roles of
Organic Matter in Sediment Diagenesis. SEPM Special Publication. 38: 123–128.
Miller, R.S., Rhodes, S., Khosla, D. and Nino, F. 2019. Application of artificial intelligence for
depositional facies recognition-permian basin. In: Unconventional Resources Technology
Conference, Denver, Colorado, 22–24 July, Society of Exploration Geophysicists:
pp. 4410–4415. https://doi.org/10.15530/urtec-2019-193.
Moghadasi, L., Ranaee, E., Inzoli, F. and Guadagnini, A. 2017. Petrophysical well log
analysis through intelligent methods. In: SPE Bergen One Day Seminar. April. OnePetro.
https://doi.org/10.2118/185922-MS.
Mohaghegh, S., Arefi, R., Ameri, S., Aminian, K. and Nutter, R. 1996. Petroleum reservoir
characterisation with the aid of artificial neural networks. Journal of Petroleum Science
and Engineering 16(4): 263–274. https://doi.org/10.1016/S0920-4105(96)00028-9.
Mohamed, E., Elsayed, M., Hassan, A., Mahmoud, M. and El-Husseiny, A. 2022. A machine
learning approach to predict the permeability from NMR T2 relaxation time distribution
for various reservoir rock types. In: ADIPEC. October. OnePetro.
https://doi.org/10.2118/211624-MS.
Mukerji, T., Avseth, P., Mavko, G., Takahashi, I. and González, E.F. 2001. Statistical rock
physics: Combining rock physics, information theory, and geostatistics to reduce
uncertainty in seismic reservoir characterization. The Leading Edge 20(3): 313–319.
https://doi.org/10.1190/1.1438938.
Nelson, P.H. 1994. Permeability-porosity relationships in sedimentary rocks. The Log Analyst
35(03).
Nikravesh, M. and Aminzadeh, F. 2001. Past, present and future intelligent reservoir
characterisation trends. Journal of Petroleum Science and Engineering 31(2–4): 67–79.
https://doi.org/10.1016/S0920-4105(01)00121-8.
Nikravesh, M., Zadeh, L.A. and Aminzadeh, F. 2003. Soft Computing and Intelligent Data
Analysis in Oil Exploration. Elsevier.
Nyein, C.Y., Ghareb, M., Hamada and Ahmed Elsakka. 2018. Artificial Neural Network (ANN)
prediction of porosity and water saturation of shaly sandstone reservoirs. AAPG Asia
Pacific Region, the 4th AAPG/EAGE/MGS Myanmar Oil and Gas Conference: Myanmar:
A Global Oil and Gas Hotspot: Unleashing the Petroleum Systems Potential.
Okon, A.N., Adewole, S.E. and Uguma, E.M. 2021. Artificial neural network model for reservoir
petrophysical properties: Porosity, permeability and water saturation prediction. Modeling
Earth Systems and Environment 7(4): 2373–2390.
https://doi.org/10.1007/s40808-020-01012-4.
Olakunle, I., Chinedu, A., Udoka, N. and Muyiwa, E. 2015. Saturation height modelling in a
partially appraised gas field using analogue field core data: An optimisation case study
of ZAN field in the Niger Delta. In: SPE Nigeria Annual International Conference and
Exhibition. OnePetro. https://doi.org/10.2118/178374-MS.
Olson, T.M. 1998. Porosity and permeability prediction in low-permeability gas reservoirs from
well logs using neural networks. In: SPE Rocky Mountain Regional/Low-Permeability
Reservoirs Symposium. April. OnePetro. https://doi.org/10.2118/39964-MS.
Otchere, D.A., Ganat, T.O.A., Gholami, R. and Ridha, S. 2021a. Application of supervised
machine learning paradigms in the prediction of petroleum reservoir properties: Comparative
analysis of ANN and SVM models. Journal of Petroleum Science and Engineering
200: 108182. https://doi.org/10.1016/j.petrol.2020.108182.
Otchere, D.A., Ganat, T.O.A., Gholami, R. and Lawal, M. 2021b. A novel custom ensemble
learning model for an improved reservoir permeability and water saturation prediction.
Sebtosheikh, M.A. and Salehi, A. 2015. Lithology prediction by support vector classifiers using
inverted seismic attributes data and petrophysical logs as a new approach and investigation
of training data set size effect on its performance in a heterogeneous carbonate reservoir.
Journal of Petroleum Science and Engineering 134: 143–149.
https://doi.org/10.1016/j.petrol.2015.08.001.
Shaheen, F., Verma, B. and Asafuddoula, M. 2016. Impact of automatic feature extraction in
deep learning architecture. pp. 1–8. In: 2016 International Conference on Digital Image
Computing: Techniques and Applications (DICTA). IEEE. https://doi.org/10.1109/DICTA.2016.7797053.
Shedid, S.A. and Saad, M.A. 2017. Comparison and sensitivity analysis of water saturation
models in shaly sandstone reservoirs using well logging data. Journal of Petroleum Science
and Engineering 156: 536–545. https://doi.org/10.1016/j.petrol.2017.06.005.
Shen, C., Asante-Okyere, S., Yevenyo Ziggah, Y., Wang, L. and Zhu, X. 2019. Group
method of data handling (GMDH) lithology identification based on wavelet analysis
and dimensionality reduction as well log data pre-processing techniques. Energies
12(8): 1509. https://doi.org/10.3390/en12081509.
Shook, M., Li, D. and Lake, L.W. 1992. Scaling immiscible flow through permeable media by
inspectional analysis. In Situ (United States) 16(4).
Singh, S., Kanli, A.I. and Sevgen, S. 2016. A general approach for porosity estimation using
artificial neural network method: A case study from Kansas gas field. Studia Geophysica et
Geodaetica 60: 130–140. https://doi.org/10.1007/s11200-015-0820-2.
Soleimani, F., Hosseini, E. and Hajivand, F. 2020. Estimation of reservoir porosity using
analysis of seismic attributes in an Iranian oil field. Journal of Petroleum Exploration and
Production Technology 10(4): 1289–1316. https://doi.org/10.1007/s13202-020-00833-4.
Srisutthiyakorn, N. 2016. Deep-learning methods for predicting permeability from 2D/3D
binary-segmented images. pp. 3042–3046. In: SEG Technical Program Expanded Abstracts 2016.
Society of Exploration Geophysicists. https://doi.org/10.1190/segam2016-13972613.1.
Tang, H. 2008. Improved carbonate reservoir facies classification using artificial neural network
method. In: Canadian International Petroleum Conference. June. OnePetro.
https://doi.org/10.2118/2008-122.
Timur, A. 1968. An investigation of permeability, porosity, and residual water saturation
relationships. In: SPWLA 9th Annual Logging Symposium. June. OnePetro.
Tran, T.V., Ngo, H.H., Hoang, S.K., Tran, H.N. and Lambiase, J.J. 2020. Depositional facies
prediction using artificial intelligence to improve reservoir characterisation in a mature
field of Nam Con Son Basin, offshore Vietnam. In: Offshore Technology Conference Asia.
October. OnePetro. https://doi.org/10.4043/30086-MS.
Verma, A.K., Cheadle, B.A., Routray, A., Mohanty, W.K. and Mansinha, L. 2012. Porosity
and permeability estimation using neural network approach from well log data. pp. 1–6.
In: SPE Annual Technical Conference and Exhibition (May).
Walker, R.G. 1992. Facies, facies models, and modern stratigraphic concepts. Facies Models:
Response to Sea Level Change: p. 454. https://doi.org/10.4116/jaqua.34.19.
Wang, D., Peng, J., Yu, Q., Chen, Y. and Yu, H. 2019. Support vector machine algorithm
for automatically identifying depositional microfacies using well logs. Sustainability
11(7): 1919. https://doi.org/10.3390/su11071919.
Wong, K.W., Fung, C.C., Ong, Y.S. and Gedeon, T.D. 2005. Reservoir characterization using
support vector machines. pp. 354–359. In: International Conference on Computational
Intelligence for Modelling, Control and Automation and International Conference on
Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06),
November. IEEE. Vol. 2. https://doi.org/10.1109/CIMCA.2005.1631494.
Wyllie, M.R.J. and Rose, W.D. 1950. Some theoretical considerations related to the quantitative
evaluation of the physical characteristics of reservoir rock from electrical log data. Journal
of Petroleum Technology 2(04): 105–118. https://doi.org/10.2118/950105-G.
Xie, Y., Zhu, C., Zhou, W., Li, Z., Liu, X. and Tu, M. 2018. Evaluation of machine learning
methods for formation lithology identification: A comparison of tuning processes and
model performances. Journal of Petroleum Science and Engineering 160: 182–193.
https://doi.org/10.1016/j.petrol.2017.10.028.
Xie, Y., Zhu, C., Hu, R. and Zhu, Z. 2021. A coarse-to-fine approach for intelligent logging
lithology identification with extremely randomised trees. Mathematical Geosciences
53: 859–876. https://doi.org/10.1007/s11004-020-09885-y.
Yu, B. and Cheng, P. 2002. A fractal permeability model for bi-dispersed porous media.
International Journal of Heat and Mass Transfer 45(14): 2983–2993.
https://doi.org/10.1016/S0017-9310(02)00014-5.
Yu, X., Ma, Y.Z., Psaila, D. et al. 2011. Reservoir characterisation and modelling: A look back
to see the way forward. In: Ma, Y.Z. and P.R. La Pointe (eds.). Uncertainty Analysis and
Reservoir Modelling, AAPG Memoir. 96: 289–309. https://doi.org/10.1306/13301421M963458.
Yu, X.H. 2008. Hydrocarbon Reservoir Sedimentology of Clastic Sandstone (2nd Edn.) (in
Chinese): Beijing, China: Petroleum Industry Press: p. 551.
Yu, Y., Lin, L., Zhai, C., Chen, H., Wang, Y., Li, Y. and Deng, X. 2019. Impacts of lithologic
characteristics and diagenesis on reservoir quality of the 4th member of the Upper
Triassic Xujiahe Formation tight gas sandstones in the western Sichuan Basin,
southwest China. Marine and Petroleum Geology 107: 1–19.
https://doi.org/10.1016/j.marpetgeo.2019.04.040.
Zabihi, R., Schaffie, M., Nezamabadi-Pour, H. and Ranjbar, M. 2011. Artificial neural network
for permeability damage prediction due to sulfate scaling. Journal of Petroleum Science
and Engineering 78(3–4): 575–581. https://doi.org/10.1016/j.petrol.2011.08.007.
Zendehboudi, S., Rezaei, N. and Lohi, A. 2018. Applications of hybrid models in chemical,
petroleum, and energy systems: A systematic review. Applied Energy 228: 2539–2566.
https://doi.org/10.1016/j.apenergy.2018.06.051.
Zerrouki, A.A., Aifa, T. and Baddari, K. 2014. Prediction of natural fracture porosity from well
log data by means of fuzzy ranking and an artificial neural network in Hassi Messaoud
oil field, Algeria. Journal of Petroleum Science and Engineering 115: 78–89.
https://doi.org/10.1016/j.petrol.2014.01.011.
Zheng, B. and Li, J.H. 2015. A new fractal permeability model for porous media based on
Kozeny-Carman equation. Natural Gas Geoscience 26(1): 193–198.
https://doi.org/10.1155/2022/8088151.
Zhong, Z. and Carr, T.R. 2019. Application of a new hybrid particle swarm optimization-mixed
kernels function-based support vector machine model for reservoir porosity prediction:
A case study in Jacksonburg-Stringtown oil field, West Virginia, USA. Interpretation
7(1): 97–112. https://doi.org/10.1190/INT-2018-0093.1.
Chapter 11
Artificial Lift Design for Future Inflow and Outflow Performance for Jubilee Oilfield
Using Historical Production Data and Artificial Neural Network Models
Solomon Adjei Marfo,1 Eric Thompson Brantson,2,*
Eric Mensah Amarfio,2 Abakah-Paintsil Efua Eduamba,2
Iyiola Zainab Ololade,2 Alexander Mensah Ofori,2
Ebenezer Ansah3 and Emmanuel Karikari Duodu2
1. Introduction
Forecasting production parameters such as production rate, oil recovery,
or reserves estimation is an essential aspect that enables operators to
determine the economic profitability of a petroleum venture. Inflow and
Tubing Performance Relationships (IPR and TPR) are the two mathematical
techniques used to analyse and predict the performance of a well. AL-Dogail
et al. (2018) used artificial intelligence (AI) techniques (a backpropagation network and
fuzzy logic) as an alternative way to predict the IPR of a gas reservoir
for effective reservoir management.
1 Department of Chemical and Petrochemical Engineering, GNPC School of Petroleum Studies, University of Mines and Technology, Tarkwa, Ghana.
2 Department of Petroleum and Natural Gas Engineering, GNPC School of Petroleum Studies, University of Mines and Technology, Tarkwa, Ghana.
3 Department of Petroleum Geosciences and Engineering, GNPC School of Petroleum Studies, University of Mines and Technology, Tarkwa, Ghana.
* Corresponding author: [email protected]
2. Methodology
2.1 Artificial Lift Screening Techniques
Before an artificial lift technique is applied to a field, screening must be done to ensure
that the right artificial lift system is selected for the field in question. Factors
considered during the screening process include location, depth,
estimated production, and reservoir properties. Screening is the initial
procedure for evaluating the suitability of a given artificial
lift system (Brown, 1982). The goal is to eliminate inappropriate systems
progressively, reducing the selection to a few contenders for the next selection
stage. The optimum system is then chosen by comparing the
design parameters with existing methodologies and charts (Lea and Nickens,
1999). In this study, field parameters were compared to the parameters on the
screening chart (Takacs, 2015) and the appropriate lift method was selected.
$$
q = \frac{J_f^{*}\,\bar{p}_f}{1.8}\left[1 - 0.2\left(\frac{p_{wf}}{\bar{p}_f}\right) - 0.8\left(\frac{p_{wf}}{\bar{p}_f}\right)^{2}\right] \tag{11.2}
$$
where,
$\bar{p}_f$ = reservoir pressure at a future time
$J_f^{*}$ = future productivity index
$J_p^{*}$ = present productivity index
$k_{ro}$ = relative permeability to oil
$B_o$ = oil formation volume factor
$\mu_o$ = oil viscosity
$p_{wf}$ = bottomhole pressure
$q$ = flowrate
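As a worked illustration of Eq. (11.2), the sketch below evaluates the future flow rate for a given bottomhole pressure. The function names and the numerical inputs in the usage lines are hypothetical and are not field data. The productivity-index update uses Standing's usual relation, which is assumed here to be the relation that the symbols J*p, kro, Bo and µo refer to.

```python
def oil_mobility_term(k_ro, b_o, mu_o):
    """Return kro / (Bo * mu_o), the group used to update the productivity index."""
    return k_ro / (b_o * mu_o)


def future_productivity_index(j_present, mobility_present, mobility_future):
    """Scale the present productivity index J*_p to a future value J*_f.

    Assumption (not shown in this section): J*_f = J*_p * M_future / M_present,
    where M = kro / (Bo * mu_o).
    """
    if mobility_present <= 0:
        raise ValueError("present mobility term must be positive")
    return j_present * mobility_future / mobility_present


def future_ipr_rate(j_future, p_future, p_wf):
    """Flow rate from Eq. (11.2) for a bottomhole pressure p_wf.

    j_future: future productivity index (STB/D/psi)
    p_future: future average reservoir pressure (psi)
    p_wf:     bottomhole flowing pressure (psi), 0 <= p_wf <= p_future
    """
    if not 0.0 <= p_wf <= p_future:
        raise ValueError("p_wf must lie between 0 and the reservoir pressure")
    ratio = p_wf / p_future
    return j_future * p_future / 1.8 * (1.0 - 0.2 * ratio - 0.8 * ratio ** 2)


# Hypothetical inputs, chosen only to exercise the functions.
m_now = oil_mobility_term(k_ro=0.8, b_o=1.6, mu_o=0.7)
m_later = oil_mobility_term(k_ro=0.6, b_o=1.5, mu_o=0.8)
j_f = future_productivity_index(j_present=8.0, mobility_present=m_now,
                                mobility_future=m_later)

# Tabulate a future IPR curve from shut-in (q = 0) down to zero bottomhole pressure.
p_f = 5000.0
for p_wf in (5000.0, 4000.0, 3000.0, 2000.0, 1000.0, 0.0):
    print(f"p_wf = {p_wf:7.1f} psi  ->  q = {future_ipr_rate(j_f, p_f, p_wf):10.1f} STB/D")
```

Setting the bottomhole pressure to zero gives the absolute open flow, J*f p̄f / 1.8, which is the intercept of the curve on the rate axis.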
| Data | Remarks |
|---|---|
| Options | The options menu is used to define the characteristics of the well. Well characteristics such as fluid type, well completion, and desired lift method are defined here. |
| PVT Model | The Black Oil model was selected. This model is suitable for a wide variety of applications and hydrocarbon fluid systems. As a minimum, the solution GOR, oil viscosity and formation water salinity are required. PVT data are input, and the correlations that best correspond to the location or oil type are chosen. |
| Fluid Description | Oil and Water |
| Temperature Model | Rough approximation |
| Flow Type | Tubing flow |
| Well Type | Producer |
| Prediction | Pressure and temperature offshore model |
| Well Completion | Cased hole |
| Reservoir Type | Single branch reservoir |
| Artificial Lift Design | Gas lift |

| Parameters | Values |
|---|---|
| Datum at Christmas Tree (ft) | 0 |
| Measured Depth (ft) | 12519 |
| True Vertical Depth (ft) | 12300 |

| Parameters | Values |
|---|---|
| Manifold/Christmas Tree (True Vertical Depth) | 0 |
| Temperature of Surroundings | 60 °F |
| Overall Heat Transfer | 8 BTU/hr/ft²/°F |

| Parameters | Values |
|---|---|
| Cp oil | 0.53 BTU/lb/°F |
| Cp gas | 0.51 BTU/lb/°F |

| Parameters | Values |
|---|---|
| Minimum valve spacing | 250 ft |
| Kill fluid gradient | 0.465 psi/ft |
| Operating Injection Pressure | 1300 psi |
| Kick off pressure | 1900 psi |
| Gas lift valve type | R-20 Monel (Port Size: 32; R-value: 0.25) |
| Differential pressure dP across valve | 250 psi |
| Vertical Lift Performance correlation | Petroleum Experts 2 |
| Surface Equipment correlation | Beggs and Brill |
| Well Depth | 12300 ft |
| Water Cut | 14% |
| Maximum Liquid rate | 12472 STB/D |
| Maximum Gas Available | 10 MMScf/day |
least amount of error. Computational time was also calculated to know which
model works faster. Figure 11.1 shows the process followed to obtain results
from the ANN models.
| Well | A | B |
|---|---|---|
| Reservoir Pressure (psia) | 6,014.7 | 6,014.7 |
| Temperature (°F) | 210 | 210 |
| Oil Formation Volume Factor (Bo) (bbl/stb) | 1.595 | 1.595 |
| Solution Gas Oil Ratio (GOR) (scf/stb) | 1243 | 1 43 |
| Oil Viscosity µo (cP) | 0.7 | 0.7 |
shows the results from the calculation. This will serve as the basis for checking if there has been an improvement in wellbore deliverability after implementing the continuous gas lift method. Figs. 11.3 and 11.4 show the daily production profiles for Wells A and B, with average base case flow rates of 8,161 BOPD and 9,850 BOPD, respectively.

Table 11.12. Screening results.
| Field Parameters | Values |
|---|---|
| Depth | 3657.6 m |
| Production Volume | 24,000 bbl/day |
| Temperature | 98.89–106.11°C |
Table 11.12 shows the field parameters that were compared with the parameters on the
chart by Takacs (2015); the gas lift method, which satisfies the chart, was selected.
The following are the results obtained during screening for an
appropriate artificial lift method for the field under investigation in this work.
Since both wells have a high productivity index, a screening chart for
high-productivity wells was used. Sucker rod and progressive cavity pumps
were not applicable in this field because they are restricted to depths of not more than
3,050 m. Electrical Submersible Pump (ESP) and gas lift were next in
line in the screening process. At the end of the screening process, gas lift
was selected based on the following criteria. Gas lift can handle
wells with a fluid gravity greater than 15° API. The temperature
of all the wells, 98.89–106.11°C, was within the acceptable range for gas lift. The
required operating volumes for the wells were also within the acceptable range
for gas lift, which is
200–30,000 BOPD. Based on screening by advantages, gas lift is an excellent
choice for offshore applications due to its high efficiency when used
offshore. In addition, compressors are already available onsite.
For these reasons, gas lift was chosen as the artificial lift technique
for the wells under investigation in this research. Table 11.12 shows the field
data used for the screening.
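The screening logic described above can be sketched as a simple rule check. This is a minimal illustration that uses only the limits quoted in the text (the 3,050 m depth limit for sucker rod and progressive cavity pumps, and the 15° API and 200–30,000 BOPD criteria for gas lift). The function name is hypothetical, the ESP criteria are not stated in this section and are therefore left out, and the API gravity in the usage line is a placeholder rather than a reported field value.

```python
# Limits quoted in the screening discussion.
ROD_AND_PCP_MAX_DEPTH_M = 3050.0
GAS_LIFT_MIN_API = 15.0
GAS_LIFT_RATE_RANGE_BOPD = (200.0, 30000.0)


def screen_lift_methods(depth_m, rate_bopd, api_gravity):
    """Return a dict mapping each lift method to True (passes) or False (fails).

    Only the criteria stated in the text are checked. A real screening chart
    covers many more parameters (temperature, gas-liquid ratio, solids, etc.).
    """
    rod_pcp_ok = depth_m <= ROD_AND_PCP_MAX_DEPTH_M
    low, high = GAS_LIFT_RATE_RANGE_BOPD
    gas_lift_ok = api_gravity > GAS_LIFT_MIN_API and low <= rate_bopd <= high
    return {
        "sucker rod pump": rod_pcp_ok,
        "progressive cavity pump": rod_pcp_ok,
        "gas lift": gas_lift_ok,
    }


# Depth and production volume are taken from Table 11.12.
# api_gravity=35.0 is a placeholder value, not a figure from the study.
result = screen_lift_methods(depth_m=3657.6, rate_bopd=24000.0, api_gravity=35.0)
for method, passed in result.items():
    print(f"{method}: {'pass' if passed else 'fail'}")
```

Encoding the chart limits as named constants keeps each elimination step traceable to the criterion that caused it.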
indicates the shut-in reservoir pressure whereas the intercept on the abscissa
indicates the flowrate in stock tank barrels per day. The AOF for Well A is
28,486.5 STB/D and that of Well B is 37,826.4 STB/D. Future IPR curves
generated through sensitivity analysis can be seen in Fig. 11.6 and 11.8 for
well A and B, respectively.
rate which is 17,000 BOPD (Fig. 11.10) still falls outside the intersection
indicating that the rate of production for that well must be changed since
that value is not optimum. It can be observed that the test point falls within
where the outflow performance curve and inflow performance curve intersect
meaning the desired test rate is feasible. It can be deduced that the well will
cease to flow and become a dead well if the bottom hole pressure falls below
1,400 psi at the prevailing reservoir conditions since the inflow performance
curves and the tubing performance curves cease to intersect. This observation
was made for Well A. The sensitivity plot obtained for Well B indicates a
maximum objective flowrate of 22,000 BOPD and a test rate of 3,026 STB/D
at a pressure of 4,872 psi. From the graph in Fig. 11.11, it can be deduced that
below 2,472 psi, Well B becomes a dead well since the inflow performance
curves and the tubing performance curves cease to intersect.
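The idea that a well flows only while its inflow and tubing performance curves intersect can be sketched numerically. The example below is a simplified illustration, not the PROSPER workflow used in the study: the inflow curve is Vogel's relation inverted for pressure, and the tubing curve is a hypothetical quadratic (static head plus a friction term) whose coefficients are invented for the demonstration. The reservoir pressure and AOF in the usage lines are the Well A values quoted in this chapter.

```python
import math


def vogel_pwf(q, p_res, q_max):
    """Bottomhole pressure at which Vogel's IPR delivers rate q (0 <= q <= q_max)."""
    # Solve 0.8 r^2 + 0.2 r - (1 - q/q_max) = 0 for r = p_wf / p_res.
    disc = 0.04 + 3.2 * (1.0 - q / q_max)
    return p_res * (-0.2 + math.sqrt(disc)) / 1.6


def tpr_pwf(q, p_static, k_friction):
    """Hypothetical tubing curve: pressure needed at the bottom of the tubing."""
    return p_static + k_friction * q ** 2


def operating_point(p_res, q_max, p_static, k_friction, tol=1e-6):
    """Return (rate, pressure) where the two curves cross, or None for a dead well."""
    def gap(q):
        return vogel_pwf(q, p_res, q_max) - tpr_pwf(q, p_static, k_friction)

    lo, hi = 0.0, q_max
    if gap(lo) <= 0.0:
        # Reservoir pressure cannot overcome the tubing requirement: no intersection.
        return None
    # gap() decreases monotonically with q, so bisection finds the single crossing.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    return q, vogel_pwf(q, p_res, q_max)


# p_static and k_friction are invented values for illustration only.
print(operating_point(p_res=6014.7, q_max=28486.5, p_static=2500.0, k_friction=4e-6))
print(operating_point(p_res=2000.0, q_max=28486.5, p_static=2500.0, k_friction=4e-6))
```

In the second call the reservoir pressure is below the pressure the tubing requires at zero rate, so the function returns `None`, which corresponds to the dead-well condition described above.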
Table 11.13. Gas lift results for Well A and B (Optimum Production Rates).
of the data used in the model training. Therefore, it is essential to carefully evaluate the quality of the data
used and the choice of basis functions to ensure the accuracy of the RBFNN model predictions.
Additionally, further research can explore the use of other AI techniques to complement RBFNN and
improve the accuracy of the predictions.
Fig. 11.17. Testing plot for BPNN model.
4. Conclusions
In summary, this research has contributed to the development of an optimised
artificial lift system for the Jubilee Field, resulting in increased deliverability
of all wells. Additionally, the application of AI techniques, such as BPNN
and RBFNN, has demonstrated improved prediction accuracy for well
performance. These findings have important implications for the optimisation
of oil field production, and the methodology developed in this study can be
applied to other fields to improve efficiency and productivity. The following
conclusions can be drawn from the study.
• An optimum artificial lift system (gas lift) was successfully screened and
designed for the field under investigation.
• Comparing the average daily base flow rates calculated from the
production history of the field with the results obtained from gas
lift modelling shows an increase in the deliverability of all the wells:
83.63% for Well A and 61.64% for Well B.
• RBFNN gave a better prediction of the tubing head pressure in the
Jubilee Field than the BPNN model, with an R of 96.290% and a
MAPE of 0.02764.
• BPNN required less computational time than the RBFNN model due to the
number of neurons used.
• An artificial lift method is recommended from the start of production for the
nearby Pecan Field, operated by Aker
Energy, to avoid the technical problems faced in the Jubilee Field.
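The quantities quoted in these conclusions (percentage increase in deliverability, correlation coefficient R and MAPE) can be computed as in the sketch below. The function names are illustrative, the sample pressures are made-up numbers rather than Jubilee data, and MAPE is returned as a fraction, which is an assumption about the convention behind the reported value.

```python
import math


def percent_increase(base_rate, new_rate):
    """Percentage increase of new_rate over base_rate."""
    return (new_rate - base_rate) / base_rate * 100.0


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up tubing head pressures (psi), used only to exercise the functions.
measured = [1510.0, 1495.0, 1480.0, 1472.0, 1460.0]
modelled = [1502.0, 1499.0, 1475.0, 1480.0, 1455.0]
print("MAPE:", mape(measured, modelled))
print("R:", pearson_r(measured, modelled))
print("Increase (%):", percent_increase(100.0, 150.0))
```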
Acknowledgment
We would like to thank anonymous reviewers for their contribution to this
research. We are also grateful to Ghana National Petroleum Corporation for
providing access to data for this research to be successful. We also acknowledge
PETEX for making the educational licence of the software available. Finally,
we thank the University of Mines and Technology, Tarkwa, Ghana for their
immense support.
References
Agyeman, B.K. 2020. Design of Gas Lift and Electric Submersible Pump: A Case Study at
Jubilee Field. Unpublished BSc Project Report, University of Mines and Technology,
Tarkwa, pp. 1–118.
Akwensi, P.H., Brantson, E.T., Niipele, J.N. and Ziggah, Y.Y. 2021. Performance evaluation
of artificial neural networks for natural terrain classification. Applied Geomatics
13(3): 453–465.
AL-Dogail, A.S., Baarimah, S.O. and Basfar, S.A. 2018. Prediction of inflow performance
relationship of a gas field using artificial intelligence techniques. In: SPE Kingdom of Saudi
Arabia Annual Technical Symposium and Exhibition. April. OnePetro.
Baudoin, C.R. 2016. Deploying the industrial Internet in oil & gas: Challenges and opportunities.
In: SPE Intelligent Energy International Conference and Exhibition. OnePetro.
Bhatia, A. and McAllister, S. 2014. Artificial Lift: Focus on Hydraulic Submersible Pumps,
ClydeUnion Pumps, Tech 101(10): 3, 29–31.
Boyun, G., William, C.L. and Ali, G. 2007. Petroleum Production Engineering, A Computer
Assisted Approach. Oxford, UK: Elsevier Ltd., pp. 186–300.
Brantson, E.T., Ju, B., Omisore, B.O., Wu, D., Aphu, E.S. and Liu, N. 2018. Development of
machine learning predictive models for history matching tight gas carbonate reservoir
production profiles. Journal of Geophysics and Engineering 15(5): 22–35.
Brantson, E.T., Ju, B., Opoku Appau, P., Akwensi, H.P., Agyare Peprah G., Liu, N., Aphu, E.S.,
Annan Boah, E. and Aidoo Borsah, A. 2019a. Development of low salinity water polymer
flooding numerical reservoir simulator and smart proxy modeling for hybrid chemical
enhanced oil recovery (CEOR). Journal of Petroleum Science and Engineering 187: 1–23.
Brantson, E.T., Ju, B., Ziggah, Y.Y., Akwensi, P.H., Sun, Y., Wu, D. and Addo, B.J. 2019b.
Forecasting of horizontal gas well production decline in unconventional reservoirs using
productivity, soft computing and swarm intelligence models. Natural Resources Research
28(3): 717–756.
Brantson, E.T., Osei, H., Aidoo, M.S.K., Appau, P.O., Issaka, F.N., Liu, N., Ejeh, C.J. and
Kouamelan, K.S. 2022. Coconut oil and fermented palm wine biodiesel production for oil
spill cleanup: Experimental, numerical, and hybrid metaheuristic modelling approaches.
Environmental Science and Pollution Research, 1–19.
Brill, J.P. and Beggs, H.D. 1991. Two-Phase Flow in Pipes, Houston, 6th Edition, 1991,
pp. 1–15.
Brown, K.E. 1982. Overview of artificial lift systems. Journal of Petroleum Technology, SPE
9979-PA, 13 pp.
Chen, X., Wang, D.Y., Tang, J.B., Ma, W.C. and Liu, Y. 2021. Geotechnical stability analysis
considering strain softening using micro-polar continuum finite element method. Journal
of Central South University 28(1): 297–310.
Flatern, R.V. 2015. Oilfield Review, Artificial Lift Systems, www.slb.com, Accessed November
2019.
Fleshman, R. and Lekic, H.O. 1999. Artificial lift for high production. Oilfield Review Spring
49–63.
Hill, T. and Remus, W. 1993. Neural network models for intelligent support of managerial
decision making. Hawaii University 11(5): 449–459.
Lea, J.F. and Nickens, H.V. 1999. Selection of Artificial Lift. SPE 52157, Oklahoma City,
Oklahoma 30 pp.
Mohammed, A.G.H. and Nasr, G.G. 2016. Gas lift optimisation to improve well performance.
World Academy of Science, Engineering and Technology, International Journal of
Mechanical and Mechatronics Engineering 10(3): 512–520.
Neely, B., Gipson, F., Capps, B., Clegg, J. and Wilson, P. 1981. Selection of Artificial Lift
Method. SPE 10337 Dallas, Texas. 1 p.
Pennel, M., Hsiung, J. and Putcha, V.B. 2018. Detecting failures and optimising performance in
artificial lift using machine learning models. In: SPE Western Regional Meeting. OnePetro.
Renpu, W. 2011. Selection and Determination of Tubing and Production Casing Sizes. Advanced
Well Completions Engineering, 3rd Edn. Houston: Gulf Professional Publishing,
pp. 117–170.
Schempf, F.J. 2011. Jubilee brings Ghana to the fore among West African deepwater regions.
Offshore (Conroe, Tex.) 71(4).
Takacs, G. 2015. Sucker-rod Pumping Handbook: Production Engineering Fundamentals and
Long-stroke Rod Pumping. Houston: Gulf Professional Publishing.
Woods, J.D. and Lea, J.F. 2017. What Is New In Artificial Lift? World Oil Production Technology:
Artificial Lift, USA: Gulf Publishing, pp. 43–50.
Wu, D., Ju, B. and Brantson, E.T. 2016. Investigation of productivity decline in tight gas wells
due to formation damage and Non-Darcy effect: Laboratory, mathematical modelling and
application. Journal of Natural Gas Science and Engineering 34: 779–791.
Chapter 12
Modelling Two-phase Flow Parameters Utilizing Machine-learning Methodology
Longtong Dafyak* and Buddhika Hewakandamby
1. Introduction
Two-phase flow is predominant in naturally occurring and industrial processes,
from water droplets entrained in the air to domestic water distribution
lines, combustion engines, and power generation plants, amongst others. In the
exploration industry, gas-liquid flow in pipes is common due to
the coexistence of these fluids in the subsurface (McCain, 1994; Soloveichik
et al., 2022). Furthermore, pipelines are still the safest and most economical
means of transporting fluids over long distances (Green and Jackson, 2015;
Canada Energy Report, 2020).
When gases and liquids flow simultaneously in pipes, unique flow
configurations are formed, which are referred to as flow patterns. These flow
patterns were classified as bubbly, dispersed bubbly, plug, slug, stratified, and
annular flows by Hewitt and Roberts (1969) and Mandhane et al. (1974) as
illustrated by Shoham (2006) in Fig. 12.1. The flow patterns formed depend on
the operating parameters, pipe geometric variables, and physical properties of
the fluids (Lu, 2015). Each flow pattern is associated with specific interactions
between the phases and the pipe walls, thus making one flow pattern entirely
different from another (Nie et al., 2022). The differences in momentum, phase
distribution, and velocity distribution make gas-liquid flow dynamics much
more complex than single-phase flow. However, some flow patterns share
similar intermittent behaviour and characteristic parameters.
University of Nottingham, UK.
* Corresponding author: [email protected]
Fig. 12.1. Flow patterns in (a) horizontal and (b) vertical pipes (Shoham, 2006).
multiphase flow research. Azizi et al. (2016) utilized ANN to predict liquid
fraction using 468 experimental data points for oil-water mixtures. Kim et al.
(2020) and Abdul-Majeed et al. (2022) proposed ML models to specifically
model slug flow parameters. More recently, the ANN model proposed by
Aliyu et al. (2023) for entrained liquid fraction in annular gas-liquid flows
showed good performance, with a Root Mean Squared Error (RMSE) and R2 of
0.005 and 0.97, respectively. These studies also compared existing correlations
and demonstrated the potential of ANN, random forest, and support vector
regression to outperform traditional two-phase flow correlations. Despite
the contributions of these studies to data-driven modelling, studies that
compare the capabilities of different ML algorithms in two-phase flow
modelling are sparse. Thus, this study focuses on evaluating the performance
of several ML algorithms in two-phase flow predictions and assessing the
applicability of these ML models in comparison to traditional empirical
correlations.
In this study, the interest is limited to the mean void fraction and structure
velocity of intermittent flow regimes. This is because intermittent flows
(slug, plug/elongated bubble, and churn) are the most persistent and often the
most challenging flow patterns in two-phase flow systems. These two
parameters are also generic characteristics of most two-phase flow patterns;
thus, the developed models are applicable across a wide range of two-phase
flow conditions. The two-phase flow variables considered in this study are
the operating conditions (the gas and liquid superficial velocities, the pipe
diameter and inclination) and the fluid properties (the gas and liquid
viscosities and densities).
3. Methodology
The sequential procedure for data processing and model development in
this study comprises data acquisition and sourcing, data pre-processing, model
development, hyperparameter tuning, and model evaluation and validation.
All analysis was carried out in the Python environment using open-source
libraries: Pandas, NumPy, Matplotlib, and Seaborn for data manipulation and
visualization, and Keras and scikit-learn (Sklearn) for predictive modelling.
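The workflow outlined above (data split, model development, hyperparameter tuning and evaluation) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions: the data are synthetic stand-ins generated in the script, not the experimental data of this study, and the random forest model, the parameter grid and the 80/20 split are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in features: superficial velocities, pipe diameter, inclination.
n = 400
u_sg = rng.uniform(0.1, 5.0, n)        # gas superficial velocity (m/s)
u_sl = rng.uniform(0.1, 2.0, n)        # liquid superficial velocity (m/s)
d = rng.uniform(0.025, 0.1, n)         # pipe diameter (m)
theta = rng.uniform(0.0, np.pi / 2, n)  # inclination (rad)
X = np.column_stack([u_sg, u_sl, d, theta])

# Synthetic target with noise; it only mimics the shape of a velocity correlation.
y = 1.2 * (u_sg + u_sl) + 0.35 * np.sin(theta) * np.sqrt(9.81 * d)
y = y + rng.normal(0.0, 0.05, n)

# Hold out a test set for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameter tuning by cross-validated grid search on the training set only.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 8]},
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the tuned model on data it has not seen.
y_pred = search.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = float(r2_score(y_test, y_pred))
print("best parameters:", search.best_params_)
print("test RMSE:", rmse)
print("test R2:", r2)
```

Tuning on the training folds and scoring once on the held-out set keeps the reported error independent of the hyperparameter search.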
The experimental data for this study were sourced from previous studies,
as elaborated in Section 2. Data pre-processing and feature engineering
encompass all the steps taken to prepare the data for optimal model
development and evaluation. The input parameters, referred to as features, and
the output parameters, referred to as target values, were identified
(Fig. 12.2). Correlation analysis was used to quantify the strength of
the relationship between the input variables and the target variables. This helps to
select the most relevant variables for model development, eliminate
irrelevant variables to reduce noise, avoid multicollinearity, and improve the
overall performance of the model. Prior to model development, the data was
split into training and test data for model development and model evaluation,
respectively. The same number of data points was used to train and test all the
ML algorithms considered in this study. The supervised learning algorithms
utilized in this study are Multiple Linear Regression (MLR), Polynomial
Regression (PR), Random Forest (RF), Support Vector Regression (SVR),
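Authors | Correlations

Mean void fraction

Lockhart and Martinelli (1949):
$$
\varepsilon = \left[1 + 0.28\left(\frac{1-x}{x}\right)^{0.64}\left(\frac{\rho_G}{\rho_L}\right)^{0.36}\left(\frac{\mu_L}{\mu_G}\right)^{0.07}\right]^{-1}
$$

Bendiksen (1984):
$$
\varepsilon = \frac{U_{SG}}{1.2\,U_m + 0.35\sin\theta\sqrt{gd} + 0.54\cos\theta\sqrt{gd}}
$$

Where
$$
U_{GM} = \frac{1.18}{\sqrt{\rho_L}}\left(g\sigma(\rho_L-\rho_G)\right)^{0.25} \quad \text{for } P < 12.7
$$

$$
\varepsilon = \left[1 + A_{PRM}\left(\frac{1-x}{x}\right)\left(\frac{\rho_G}{\rho_L}\right)\right]^{-1}
$$
Where
$$
A_{PRM} = 1 + F_1\left(\frac{y}{1+yF_2} - yF_2\right)^{0.5},
$$
$$
F_1 = 1.578\,Re_L^{-0.19}\left(\frac{\rho_G}{\rho_L}\right)^{0.22},\qquad
F_2 = 0.0273\,We_L\,Re_L^{-0.51}\left(\frac{\rho_G}{\rho_L}\right)^{-0.08},
$$
$$
y = \left[\left(\frac{1-x}{x}\right)\left(\frac{\rho_G}{\rho_L}\right)\right]^{-1},\qquad
We_L = \frac{G^{2}D}{\sigma\rho_L},\qquad
Re_L = \frac{GD}{\mu_L}
$$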
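The Bendiksen (1984) and Lockhart and Martinelli (1949) expressions listed above can be evaluated directly, as in the sketch below. Two assumptions are made and should be checked against the original papers: the gravity terms are taken as √(gd), the usual form of the Bendiksen correlation, and the mixture velocity is taken as the sum of the gas and liquid superficial velocities. Function names and the sample inputs are illustrative only.

```python
import math

G = 9.81  # acceleration due to gravity (m/s^2)


def bendiksen_structure_velocity(u_sg, u_sl, d, theta_rad):
    """Structure (translational) velocity: 1.2*Um + (0.35 sin(theta) + 0.54 cos(theta))*sqrt(g*d).

    Assumes Um = U_SG + U_SL and that the drift terms scale with sqrt(g*d).
    """
    u_m = u_sg + u_sl
    drift = (0.35 * math.sin(theta_rad) + 0.54 * math.cos(theta_rad)) * math.sqrt(G * d)
    return 1.2 * u_m + drift


def bendiksen_void_fraction(u_sg, u_sl, d, theta_rad):
    """Mean void fraction as gas superficial velocity over structure velocity."""
    return u_sg / bendiksen_structure_velocity(u_sg, u_sl, d, theta_rad)


def lockhart_martinelli_void_fraction(x, rho_g, rho_l, mu_g, mu_l):
    """Void fraction from quality x (0 < x <= 1) and the gas/liquid property ratios."""
    if not 0.0 < x <= 1.0:
        raise ValueError("quality x must satisfy 0 < x <= 1")
    term = ((1.0 - x) / x) ** 0.64 * (rho_g / rho_l) ** 0.36 * (mu_l / mu_g) ** 0.07
    return 1.0 / (1.0 + 0.28 * term)


# Illustrative air-water style inputs (SI units); not data from this study.
print(bendiksen_structure_velocity(u_sg=1.0, u_sl=1.0, d=0.05, theta_rad=0.0))
print(bendiksen_void_fraction(u_sg=1.0, u_sl=1.0, d=0.05, theta_rad=math.radians(30)))
print(lockhart_martinelli_void_fraction(x=0.01, rho_g=1.2, rho_l=998.0,
                                        mu_g=1.8e-5, mu_l=1.0e-3))
```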
| ML Algorithm | Description |
|---|---|
| Linear regression | Linear regression assumes a linear relationship between a dependent variable and independent variable(s). Linear regression can be either simple (one independent variable) or multiple (2 or more independent variables). Linear regression estimates the coefficients (slopes and intercept) that minimize the loss function between the actual values and predicted values (Montgomery et al., 2012). |
| Polynomial regression | Polynomial regression models nonlinear relationships between dependent and independent variables. Polynomial models can fit complex relationships by adding higher-order terms; however, the appropriate degree of polynomial is an important consideration to minimize overfitting (James et al., 2013). |
| Random forest | Random forest is an ensemble learning method developed as a combination of multiple decision trees. RF models are developed following this sequence: (1) bootstrapping and tree building: randomly sampling the original dataset to create multiple subsets of the data; (2) pruning: removing nodes or branches that do not improve the model's performance; (3) majority voting: aggregating the predictions of multiple decision trees to make a final prediction (Breiman, 2001). |
| Support vector regression | Support vector regression learns the function that maps input features to continuous output values by finding a hyperplane that maximally separates the input data while minimizing the error on the training data. SVR uses kernel functions to transform input data into a higher dimensional space, where it is easier to find a linear decision boundary (Chang and Lin, 2012). |
| Deep neural network | The architecture of a deep neural network consists of multiple layers of nodes. Each node performs a different type of computation and learns complex relationships between input and output variables. The first layer of the network receives the input variables, and subsequent layers process the data while the output layer provides the final prediction. The weight and bias of the nodes in the network are iteratively adjusted to minimize the difference between the predicted and the actual output (Goodfellow et al., 2016). |

4. Results and Discussions
Data pre-processing
Fig. 12.4. Cross plot of predicted and test data for Ɛ and UTB.
(2007) correlation which takes into account the surface tension, gas, and
liquid densities.
Table 12.5 also shows the performance for the structure velocity
correlations. Across all statistical error parameters, the Bendiksen (1984)
correlation displays the best performance. For horizontal flows, drift flux
correlations developed based on the assumption of zero drift velocity tend to
predict the structure velocity with lower accuracy compared to correlations
that assume the opposite (Woldesemayat and Ghajar, 2007). Thus, it is not
surprising that the Bendiksen (1984) correlation performs better than the
Hasan and Kabir (1988) correlation. Although the Woldesemayat and Ghajar
(2007) correlation is developed based on the drift-flux model, assumes a
non-zero drift velocity in the horizontal pipe, and considers some fluid
properties, it fails to meet the predictive capabilities of the Bendiksen (1984)
correlation for the data set explored in this study.
To further validate the ML models developed, only the best-performing
correlations are compared with the ML models. For the
experimental conditions and two-phase flow parameters explored in this study,
the Bendiksen (1984) and Woldesemayat and Ghajar (2007) models show
the best performance for both the structure velocity and mean void fraction.
Comparing the statistical error parameters in Tables 12.4 and 12.5, all the ML
algorithms except MLR outperform the traditional empirical correlations in
predicting the mean void fraction. In the case of the structure velocity, all the
ML models surpass the correlations considered in this study by at least 8% in terms of
the RMSE, APPE, APE, or R2. This comparative assessment demonstrates the
capability of ML algorithms to produce data-driven models with
substantially higher accuracy than the existing empirical models.
Nomenclature
A      area of pipe
Cd     drift coefficient
Co     distribution coefficient
D      diameter of the pipe
g      acceleration due to gravity
Re     Reynolds number
Ud     drift velocity
Um     mixture velocity
UTB    translational velocity
x      quality
θ      pipe inclination
ε      void fraction
σ      surface tension
ρL     liquid density
ρG     gas density
μG     gas viscosity
μL     liquid viscosity
References
Abdul-Majeed, G.H., Kadhim, F.S., Almahdawi, F.H.M., Al-Dunainawi, Y., Arabi, A. and
Al-Azzawi, W.K. 2022. Application of artificial neural network to predict slug liquid
holdup. International Journal of Multiphase Flow 150(January): 104004. https://doi.
org/10.1016/j.ijmultiphaseflow.2022.104004.
Al-Naser, M., Elshafei, M. and Al-Sarkhi, A. 2016. Artificial neural network application for
multiphase flow patterns detection: A new approach. Journal of Petroleum Science and
Engineering 145: 548–564. https://doi.org/10.1016/j.petrol.2016.06.029.
Aliyu, A.M., Choudhury, R., Sohani, B., Atanbori, J., Ribeiro, J.X.F., Ahmed, S.K.B. and
Mishra, R. 2023. An artificial neural network model for the prediction of entrained droplet
fraction in annular gas-liquid two-phase flow in vertical pipes. International Journal of
Multiphase Flow 164: 104452. https://doi.org/10.1016/j.ijmultiphaseflow.2023.104452.
Alves, I.N. and Shoham, O. 1991. Slug flow phenomena in inclined pipes. Petroleum
Engineering, University of Tulsa, USA. Doctor of (July 2007).
Azizi, S., Awad, M.M. and Ahmadloo, E. 2016. Prediction of water holdup in vertical and
inclined oil-water two-phase flow using artificial neural network. International Journal
of Multiphase Flow 80: 181–187. https://doi.org/10.1016/j.ijmultiphaseflow.2015.12.010.
Bendiksen, K.H. 1984. An experimental investigation of the motion of long bubbles in
inclined tubes. International Journal of Multiphase Flow 10(4): 467–483.
https://doi.org/10.1016/0301-9322(84)90057-0.
Breiman, L. 2001. Random forests. Machine Learning 45: 5–32.
Cai, S., Toral, H., Qiu, J. and Archer, J.S. 1994. Neural network based objective flow regime
identification in air‐water two phase flow. The Canadian Journal of Chemical Engineering
72(3): 440–445. https://doi.org/10.1002/cjce.5450720308.
Canada. Energy Report. 2020. Oil Sands: Pipeline Safety.
https://www.nrcan.gc.ca/energy/publications/18754.
Chang, C.C. and Lin, C.J. 2012. LIBSVM: A library for support vector machines. ACM
Transactions on Intelligent Systems and Technology (TIST) 2(3): 27.
Dafyak, L. 2022. Hydrodynamics of Viscous Slug Flow in Inclined Pipes (Issue December).
University of Nottingham.
Dafyak, L.A., Hewakandamby, B., Fayyaz, A. and Hann, D. 2021. Taylor Bubbles of Viscous
Slug Flow in Inclined Pipes. Paper presented at the Offshore Technology Conference,
Virtual and Houston, Texas, August 2021. https://doi.org/10.4043/31238-MS.
dos Santos Ambrosio, J., Lazzaretti, A.E., Pipa, D.R. and da Silva, M.J. 2022. Two-phase
flow pattern classification based on void fraction time series and machine learning. Flow
Measurement and Instrumentation 83(November 2021): 102084.
https://doi.org/10.1016/j.flowmeasinst.2021.102084.
Dukler, A.E. and Hubbard, M.G. 1975. A model for gas-liquid slug flow in horizontal and near
horizontal tubes. Industrial & Engineering Chemistry Fundamentals 14(4): 337–347.
https://doi.org/10.1021/i160056a011.
Escrig, J. 2017. Influence of Geometrical Parameters on Gas-liquid Intermittent Flows. PhD
thesis, University of Nottingham.
Escrig, J., Hewakandamby, B. and Azzopardi, B. 2017. Influence of Diameter and Inclination
of the Pipes on the Velocity of Periodic Structures in Gas-Liquid Intermittent Flows.
November.
Fernandes, R.C., Semiat, R. and Dukler, A.E. 1983. Hydrodynamic model for gas‐liquid slug flow
in vertical tubes. AIChE Journal 29(6): 981–989. https://doi.org/10.1002/aic.690290617.
Godbole, P.V., Tang, C.C. and Ghajar, A.J. 2011. Comparison of void fraction correlations for
different flow patterns in upward vertical two-phase flow. Heat Transfer Engineering
32(10): 843–860. https://doi.org/10.1080/01457632.2011.548285.
Goodfellow, I., Bengio, Y. and Courville, A. 2016. Deep Learning (Vol. 1). MIT Press.
Green, K.P. and Jackson, T. 2015. Safety in the Transportation of Oil and Gas: Pipelines or Rail?
Fraser Institute: 1–14.
Hasan, A.R. and Kabir, C.S. 1988. Predicting multiphase flow behavior in a deviated well. SPE
Production Engineering 3(4): 474–482. https://doi.org/10.2118/15449-PA.
Hewitt, G.F. and Roberts, B.N. 1969. Studies of two-phase flow patterns by simultaneous
X-ray and flash photography. United Kingdom Atomic Energy Authority Research Group,
Berkshire.
James, G., Witten, D., Hastie, T. and Tibshirani, R. 2013. Introduction to Statistical Learning. An
Introduction to Statistical Learning with Applications in R: 176–178.
Kim, T.W., Kim, S. and Lim, J.T. 2020. Modeling and prediction of slug characteristics utilizing
data-driven machine-learning methodology. Journal of Petroleum Science and Engineering
195(June): 107712. https://doi.org/10.1016/j.petrol.2020.107712.
Liu, Y., Tong, T.A., Ozbayoglu, E., Yu, M. and Upchurch, E. 2020. An improved drift-flux
correlation for gas-liquid two-phase flow in horizontal and vertical upward inclined
wells. Journal of Petroleum Science and Engineering 195(June): 107881.
https://doi.org/10.1016/j.petrol.2020.107881.
Lockhart, R.W. and Martinelli, R.C. 1949. Proposed Correlation of Data for Isothermal
Two-phase, Two-component Flow in Pipes (pp. 45, 39–48). Chem. Eng. Progr. Univ. of
Berkeley California.
Lu, M. 2015. Experimental and Computational Study of Two-phase Slug Flow. PhD Thesis,
Imperial College of London, June.
Mandhane, J.M., Gregory, G.A. and Aziz, K. 1974. A flow pattern map for gas-liquid flow in
horizontal pipes. International Journal of Multiphase Flow 1(4): 537–553.
https://doi.org/10.1016/0301-9322(74)90006-8.
Mask, G., Wu, X. and Ling, K. 2019. An improved model for gas-liquid flow pattern
prediction based on machine learning. Journal of Petroleum Science and Engineering
183(August): 106370. https://doi.org/10.1016/j.petrol.2019.106370.
McCain, Jr., W.D. 1933. The Properties of Petroleum Fluids. PennWell Publishing Company,
Tulsa Oklahoma. ISBN 0-87814-335-1.
Montgomery, D.C., Peck, E.A. and Vining, G.G. 2012. Introduction to Linear Regression
Analysis. Wiley.
Nicklin, D.J. 1962. Two-phase bubble flow. Chemical Engineering Science 17(9): 693–702.
https://doi.org/10.1016/0009-2509(62)85027-1.
Nie, F., Wang, H., Song, Q., Zhao, Y., Shen, J. and Gong, M. 2022. Image identification for
two-phase flow patterns based on CNN algorithms. International Journal of Multiphase
Flow 152(March): 104067. https://doi.org/10.1016/j.ijmultiphaseflow.2022.104067.
Premoli, A., Francesco, D. and Prima, A. 1970. An empirical correlation for evaluating
two-phase mixture density under adiabatic conditions. In: European Two-Phase Flow
Group Meeting, Milan, Italy.
Rosa, E.S., Salgado, R.M., Ohishi, T. and Mastelari, N. 2010. Performance comparison
of artificial neural networks and expert systems applied to flow pattern identification
in vertical ascendant gas-liquid flows. International Journal of Multiphase Flow
36(9): 738–754. https://doi.org/10.1016/j.ijmultiphaseflow.2010.05.001.
Rouhani, S.Z. and Axelsson, E. 1970. Calculation of void volume fraction in the subcooled and
quality boiling regions. International Journal of Heat and Mass Transfer 13(2): 383–393.
https://doi.org/10.1016/0017-9310(70)90114-6.
Shoham, Ovadia. 2006. Mechanistic Modeling of Gas-liquid Two-phase Flow in Pipes. Society
of Petroleum Engineers. https://doi.org/10.2118/9781555631079.
Soloveichik, Y.G., Persova, M.G., Grif, A.M., Ovchinnikova, A.S., Patrushev, I.I., Vagin,
D.V. and Kiselev, D.S. 2022. A method of FE modeling multiphase compressible flow
in hydrocarbon reservoirs. Computer Methods in Applied Mechanics and Engineering
390: 114468. https://doi.org/10.1016/j.cma.2021.114468.
Woldesemayat, M.A. and Ghajar, A.J. 2007. Comparison of void fraction correlations for
different flow patterns in horizontal and upward inclined pipes. International Journal of
Multiphase Flow 33(4): 347–370. https://doi.org/10.1016/j.ijmultiphaseflow.2006.09.004.
Yan, Y., Wang, L., Wang, T., Wang, X., Hu, Y. and Duan, Q. 2018. Application of soft
computing techniques to multiphase flow measurement: A review. Flow Measurement and
Instrumentation 60(February): 30–43. https://doi.org/10.1016/j.flowmeasinst.2018.02.017.