Spwla 2019 CC
DOI: 10.30632/T60ALS-2019_CC
Copyright 2019, held jointly by the Society of Petrophysicists and Well Log Analysts (SPWLA) and the submitting authors.
This paper was prepared for presentation at the SPWLA 60th Annual Logging Symposium held in The Woodlands, TX, USA June 17-19, 2019.

ABSTRACT

Quantitative gas saturation determination for reservoir monitoring purposes became possible with the introduction of a new generation of multi-detector pulsed neutron tools and interpretation algorithms. One distinctive feature of these interpretation algorithms is that they rely heavily on modeling of tool responses for the given completions and fluid types present in the system. This modeling is usually achieved through nuclear Monte Carlo simulations and involves long computing times, significant computer resources, and human intervention. However, despite the time and cost drawbacks of this approach, an associated benefit is the ever-growing library of models being computed for wells with different attributes. The existence of such Monte Carlo computed model libraries lends itself to the deployment of machine learning to substitute the lengthy and expensive Monte Carlo-based model building process. As a result, the associated cost and time management cease to be an issue in the data acquisition planning and interpretation for gas saturation determination.

Machine learning is a sub-branch of artificial intelligence, and encompasses a category of statistical algorithms that can “learn” from existing data without explicit programming. These algorithms can be used to build models to predict the outcome for a given set of conditions. In this specific instance, the conditions are completion, formation, and fluid parameters. For example, borehole size, number of casing strings, presence of cement, annular fluid parameters, lithology, porosity and fluid types in the pore space are all needed to predict the response of an instrument designed for reservoir monitoring. The ratios of count rates from two detectors placed at two distances from the pulsed neutron source are typical outcomes from a Monte Carlo modeling exercise. The machine-learning activity is a substitute for this process, providing fast and accurate inelastic and thermal gate ratio values for gas saturation determination. Various machine-learning algorithms such as random forest and extreme gradient boosting were applied to the data to generate prediction models for the ratios mentioned above. Results showed that over 90% accuracy can be achieved between the predictions from the machine-learning models and the ratios calculated from the Monte Carlo simulations on a validation data set.

The paper will first discuss the Monte Carlo-based model building and the existing model libraries used in quantitative gas saturation analysis along with the data processing methodology used to generate input data for the machine-learning algorithms. It will be followed by a discussion of various machine-learning models applied and their prediction accuracies along with variable values. Next, the trained machine-learning models will be deployed on blind test datasets (i.e., completion, lithology and formation parameter sets that the model has never encountered before), and the performance of the models on these completely new datasets will be demonstrated by comparing the predictions with those of the Monte Carlo-based models. Finally, the success of the trained machine-learning model will be demonstrated by deploying it on an actual gas saturation log, thereby showcasing the time and cost benefits of having data-driven models that can accurately predict inelastic and thermal gate ratio values.

INTRODUCTION

Reservoir performance monitoring has traditionally been based on measurement techniques using high-energy neutrons. This situation is especially true for cased holes because fast neutrons can penetrate through casing material easily and penetrate the formation beyond the casing and cement layer. Over the years, various measurement techniques have been developed using fast-neutron beams. Most techniques employ pulsed-neutron generators emitting neutrons with enough energy to induce inelastic scattering interactions in the target zones. The capture of the thermalized neutrons plays a role as well.
SPWLA 60th Annual Logging Symposium, June 17-19, 2019
One popular measurement method is the pulsed-neutron-capture (PNC) log, which provides an accurate thermal-neutron-capture cross section of the formation surrounding the borehole. This cross section, also known as the ‘sigma’ (Σ) value of the formation, is based on the time spectrum, and it can be used to distinguish between formations saturated with hydrocarbons or high-salinity water. The thermal-neutron-capture cross section of the formation (and fluid in the pore space) determines the rate of change of the thermal neutron flux in the formation. Another well-known pulsed neutron measurement is the carbon/oxygen ratio, or C/O, measurement. This is based on the energy spectrum of the gamma rays from inelastic scattering interactions and is used to distinguish between formations saturated with hydrocarbons and water (regardless of salinity).

In recent years, a new measurement technique was developed for determination of quantitative gas saturations in cased wells. Some details of the method were published by Trcka, et al. (2006, 2008a, 2008b, 2008c), and Inanc, et al. (2009, 2014). Correspondingly, the PNC tool has a third gamma-ray detector located farther from the neutron source than the position of the typical long-spaced detectors. This location provides the necessary large dynamic range for the gas-saturation measurements that might not be available from tools with conventional detector spacing. An important differentiating aspect of this quantitative gas-saturation measurement method is that it uses a combination of log data and modeled data generated for the specific completion, mineralogy and fluids using Monte Carlo (MC) simulations. Incorporating the modeled data into the process accounts for the impact of borehole fluids, completion, mineralogy, formation and fluids on the measurements.

The relative error for MC simulation is normally proportional to 1/√N, where N is the number of particles being tracked. As a result, to reduce the statistical error of the simulation, a large number of particles (often hundreds of millions) needs to be tracked. Therefore, MC simulations are usually computationally expensive. On the other hand, once finished, the simulation results can be added to the model library and labelled with associated model attributes (such as completion, mineralogy, etc.). Over time, with enough models accumulated, this ever-growing model library provides an ideal data source for developing a machine-learning scheme to substitute the lengthy and expensive MC-based model building process.

In this paper, the methodology for MC-based gas saturation analysis will be discussed first. This will be followed by a discussion of the well configurations selected for machine-learning model development with input and output variables used in the implementation. In the next steps, the machine-learning models trained on the training datasets and their prediction accuracies will be presented. Later, the trained machine-learning models are utilized on blind test datasets (i.e., the datasets that the model has never seen before) to measure the success of the machine-learning models. This is accomplished by comparing the predictions with those of the MC-based models, and the metrics for the performance of the models are provided. Finally, the trained machine-learning model will be applied to an actual gas saturation log. There, gas saturation interpretations based on the MC and machine-learning developed models will be presented side by side to demonstrate the merits of the machine-learning approach. This field example showcases benefits of using machine-learning models that can accurately predict inelastic gate ratio values and provide highly comparable gas saturation values. The paper thereby demonstrates that data-driven machine-learning models can reduce cost and time while maintaining accuracy of the saturation prediction.

GAS SATURATION PREDICTION METHODOLOGY AND MONTE CARLO-BASED MODEL BUILDING PROCESS

The instrument used for gas-saturation measurements reported in the present paper, the Baker Hughes Reservoir Performance Monitor (RPM™), employs a pulsed-neutron generator and three sodium iodide (NaI) scintillation detectors. The neutron generator is pulsed at a 1-kHz rate with corresponding cycle duration of 1,000 µsec. The generator is activated during the first 60 µsec of the cycle. In each cycle, a rapid increase is observed in the count rates when the pulsed-neutron generator (PNG) is activated. The count rates reduce significantly when the generator is inactive. The photon counts recorded when the generator is active are primarily from inelastic interactions, with some contribution from thermal neutron capture events; when the source is inactive, the counts come from neutron-capture interactions. The cycle described above is shown in Figure 1.
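The 1/√N scaling of the Monte Carlo relative error noted earlier can be illustrated with a short sketch. It assumes a simple model in which the relative error equals c/√N; the constant `c` and the function names are illustrative placeholders, not values or routines from the paper.

```python
import math


def relative_error(n_particles: int, c: float = 1.0) -> float:
    """Relative statistical error modeled as c / sqrt(N).

    c is a problem-dependent constant; 1.0 is a placeholder assumption.
    """
    if n_particles <= 0:
        raise ValueError("n_particles must be positive")
    return c / math.sqrt(n_particles)


def particles_needed(target_error: float, c: float = 1.0) -> int:
    """Approximate particle count at which c / sqrt(N) reaches target_error."""
    if target_error <= 0:
        raise ValueError("target_error must be positive")
    return math.ceil((c / target_error) ** 2)


# Quadrupling the particle count halves the relative error.
err_100m = relative_error(100_000_000)
err_400m = relative_error(400_000_000)
print(f"N=1e8: {err_100m:.1e}, N=4e8: {err_400m:.1e}")
```

Because the error falls only with the square root of N, each further halving of the error costs four times as many tracked particles, which is why the simulations are expensive.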
[Figure 1: PNG Induced Gamma Ray Time Spectra. Typical normalized count rate for Det. 1 (SS), Det. 2 (LS) and Det. 3 (XLS).]

[Equation (1): expression for RIN13 not recovered in the source text.]

where each data point, Ni, is collected in a sequence of 10-microsecond windows. This is called an inelastic ratio, RIN13, because the counts are obtained from the early phase of the cycle, as indicated by the counters in the equation. The early phase of the data acquisition is dominated by the inelastic gamma rays, or photons. Although there could be other ratios, this one is specific to the ratio of inelastic-gamma-ray counts between the first or short-spaced (SS) and third or extra-long spaced (XLS) detectors.

The RATO13 ratio is given by equation (2). The acquisition time for the SS detector is shorter than that used for XLS detector.

[Equation (2): expression for RATO13 not recovered in the source text.]

[Figure 2: Wellbore configuration information required for the modeling. RPM tool (1.7 in). Lithology: sandstone, limestone or dolomite. Borehole/completion configuration: bit size, casing size, tubing size, cement type, borehole fluid. Formation fluids: water salinity, oil density, gas composition, gas density.]
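The inelastic ratio described above can be sketched from time-binned counts as follows. The 10-µs window width and the 60-µs generator-on time come from the text; the choice of windows summed, the SS-over-XLS orientation, the function name and the sample counts are illustrative assumptions and should not be read as the paper's equation (1).

```python
WINDOW_US = 10  # window width stated in the text
BURST_US = 60   # generator-on time stated in the text


def inelastic_ratio(ss_counts, xls_counts, first=0, last=BURST_US // WINDOW_US):
    """Ratio of summed SS counts to summed XLS counts over windows [first, last).

    The default range (the burst-on windows) is an assumption made for
    illustration; the actual counters of equation (1) are not reproduced here.
    """
    ss_total = sum(ss_counts[first:last])
    xls_total = sum(xls_counts[first:last])
    if xls_total == 0:
        raise ZeroDivisionError("no XLS counts in the selected windows")
    return ss_total / xls_total


# Hypothetical counts per 10-µs window (not measured data).
ss = [900, 1000, 1000, 1000, 1000, 1000, 200, 150]
xls = [9, 10, 10, 10, 10, 10, 4, 3]
rin13_example = inelastic_ratio(ss, xls)
print(rin13_example)
```

A capture-type ratio could reuse the same function with a later window range, possibly with different ranges per detector, since the text notes that the SS and XLS acquisition times differ.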
Modeling provides predictions of the various measurements used for gas saturation determination. Figure 2 shows the specific wellbore configuration information required for the modeling. The borehole sizes, cement thickness, casing and tubing sizes are specified for each case. Additional information includes borehole fluids and formation fluid properties, such as densities and gas composition. Formation lithology and porosity can be obtained from well log data or general knowledge of the formation, while fluid parameters can be obtained through pressure, temperature, and produced or in-situ fluid samples. As might be expected, the borehole completion has significant impact on the signal. Casing size and thickness, presence of tubing, and the various borehole fluids within the annulus and tubing can all modify the logged signal.

The modeling is performed for formations fully saturated with gas, water and oil for a range of porosity values, typically from 0 to 40 porosity units (p.u.), and the results are used to form theoretical response, or “fan”, charts as shown in Figures 3a and 3b. Each chart has 100-percent gas, oil and water saturation lines given as functions of effective porosity. Each curve in the charts typically has 6 modeled data points, at porosity levels of 0, 5, 10, 20, 30 and 40 p.u. This totals 18 Monte Carlo runs, each requiring about one day of computing time. If enough computational resources are available, all 18 runs can be executed in parallel on a cluster, so that the full set completes in about one day.

[Figure 3b plot: Variation of RATO13 with Gas Density. RATO13 versus porosity (pu) for water, oil, and gas at 0.247, 0.189 and 0.096 g/cc.]
Figure 3b: - Modeled RATO13 values for water, oil, and three different densities of natural gas. Conditions are the same as for Figure 3a (Inanc, et al., 2009).

The implementation of the gas-saturation method follows the basic flow diagram shown in Figure 4. The method uses Monte Carlo models based on pure mineralogy and effective porosity, and then introduces corrections to compensate for the presence of clays in shale. An example is shown at the end of this paper.

[Figure 4: Flow diagram inputs. Log data: time-based pulsed neutron gamma data. Volumetric data: porosity, shale volume, mineral volume. Modeling data: completion details, borehole fluids, formation fluids, lithology.]
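The fan-chart modeling grid described above (three fluids at six porosity levels, 18 runs) can be enumerated in a few lines. The sketch then shows one way such a chart might be read: a linear interpolation of a measured ratio between the 100-percent water and 100-percent gas lines at a given porosity. That interpolation rule, the function names and the curve values are simplifying assumptions for illustration only; they are not modeled results and not necessarily the interpretation algorithm used in the paper.

```python
from itertools import product

FLUIDS = ("gas", "oil", "water")
POROSITIES_PU = (0, 5, 10, 20, 30, 40)

# One Monte Carlo run per (fluid, porosity) pair: 3 x 6 = 18 runs.
runs = list(product(FLUIDS, POROSITIES_PU))


def interp(x, xs, ys):
    """Piecewise-linear interpolation of ys(xs) at x; xs must be ascending."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the modeled porosity range")
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def gas_saturation(measured, porosity_pu, water_curve, gas_curve):
    """Assumed linear interpolation between the water and gas lines, clipped to [0, 1]."""
    water = interp(porosity_pu, POROSITIES_PU, water_curve)
    gas = interp(porosity_pu, POROSITIES_PU, gas_curve)
    if gas == water:
        raise ValueError("water and gas lines coincide at this porosity")
    return min(1.0, max(0.0, (measured - water) / (gas - water)))


# Hypothetical ratio values at the six modeled porosities (not modeled data).
water_line = (30, 28, 26, 22, 18, 14)
gas_line = (30, 32, 36, 46, 60, 80)
sg_example = gas_saturation(34.0, 20, water_line, gas_line)
print(len(runs), sg_example)
```

The lines converge at zero porosity in this toy chart, so the function refuses to interpolate there rather than divide by zero.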
DATA PROCESSING FOR MACHINE-LEARNING MODELS

For the MC-based gas saturation modeling, the MC simulations usually involve long computing times, significant computer resources, and human intervention. To reduce computational resources and human intervention, high-performance parallel computing, automatic input and output data processing as well as web-based IT infrastructure for model transfer (between geosciences centers and computing center analysts) were implemented. Even so, turnaround time for models can be 3 to 4 days, which includes communication of the model request to nuclear modelers, preparation of MC input files, run time (about 1 day if enough computational resources are available), post-processing time and communication of the model from nuclear modelers to geoscientists.

Nevertheless, after the MC simulations are performed for some specific conditions (such as completion, mineralogy, etc.), the model results are deposited into the model library and tagged with these associated conditions. Consequently, there will be an ever-growing model library with models having various conditions/attributes. Over the years, more than 2,500 GasView™ models have been archived in the model library. From a machine-learning point of view, this ever-growing model library provides an ideal data source for developing a machine-learning scheme. Once developed, such a scheme can then be used to substitute the lengthy and expensive MC-based modeling process. As a result, cost and time management will no longer be an issue in determining the gas saturation.

As discussed earlier, well configuration has a large impact on the RIN13 and RATO13 values. Among all archived models, two types of well configuration, namely the 4-layer well configuration and the 6-layer well configuration, are much more common than other types. Figures 5a and 5b illustrate a 4-layer well configuration and a 6-layer well configuration, respectively, with layer numbers shown as well. Essentially, a 4-layer well configuration represents a cased hole configuration with single casing surrounded by cement, and a 6-layer well configuration represents a cased hole configuration with production tubing and one casing surrounded by cement and formation.

Figure 5a: - A 4-layer well configuration.
Figure 5b: - A 6-layer well configuration.
Table 1a: - Completion layer information, input and output variables for 4-layer well completion.
Table 1b: - Layer information, input and output variables for 6-layer well completion.
Tables 1a and 1b list layer information, input and output variables for a 4-layer well configuration and a 6-layer well configuration, respectively. Here, RIN12 and RATO12 are the ratio of inelastic-gamma-ray counts and the ratio of capture-gamma-ray counts, respectively, between the first or short spaced (SS) and second or long spaced (LS) detectors.

A script was written to search for a suitable subset of the entire model library/database and identify models for 4-layer and 6-layer well configuration. Then the input and output files for these identified models were extracted as the inputs to the machine-learning model. Next, outlier models were removed. The outlier models generally fall into the following categories:
· The results (RIN13 and RATO13 values) are outside of the normal result range due to various reasons (such as higher modeling uncertainty).
· The number of cases with some specific input parameter is too small for model training purposes. For example, the fluid type in the tubing is normally oil, water or gas. However, there are cases where CO2 is the tubing fluid. Since the number of such cases is too low for training the machine-learning model, those cases were omitted.

… because the target variables (oil RIN13, gas RIN13 and water RIN13) are continuous. Furthermore, separate ML models are built and tuned for each of the three target variables.

Data cleansing is one of the most critical steps in the workflow for building a predictive model. If the quality of the data fed to the ML model is poor, the results will be correspondingly poor. The features that go into the machine-learning predictive model include borehole fluid density, casing diameter, formation density, formation gas, water and oil densities and borehole size. All those features show some level of variability, but some of the archived models have features that can be classified as outliers. One example of the variability can be seen in Figure 6. That histogram shows the variability of the borehole size encountered in the archive. While most of the borehole sizes are grouped around 6 inch and 8½ inch, some are quite different. For example, a borehole around 15 inch is clearly an outlier. Figure 7 provides a histogram showing the casing diameter distribution encountered in the database. This distribution follows the borehole size trends seen in Figure 6. In either case, the sampling frequency of some sizes is relatively low, with implications for training of the machine-learning models.
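The two outlier categories described above can be sketched as two filters over archived model records. This is a minimal illustration, not the authors' script: the record fields, the acceptable result range and the minimum case count are hypothetical, and the sample records are invented for the example.

```python
from collections import Counter


def drop_out_of_range(models, key, low, high):
    """Keep only models whose result for `key` lies within [low, high]."""
    return [m for m in models if low <= m[key] <= high]


def drop_rare_categories(models, key, min_count):
    """Drop models whose categorical value for `key` occurs fewer than min_count times."""
    counts = Counter(m[key] for m in models)
    return [m for m in models if counts[m[key]] >= min_count]


# Hypothetical archive records (field names and values are illustrative only).
archive = [
    {"tubing_fluid": "oil", "borehole_in": 8.5, "gas_rin13": 95.0},
    {"tubing_fluid": "oil", "borehole_in": 6.0, "gas_rin13": 88.0},
    {"tubing_fluid": "water", "borehole_in": 8.5, "gas_rin13": 90.0},
    {"tubing_fluid": "water", "borehole_in": 6.0, "gas_rin13": 85.0},
    {"tubing_fluid": "water", "borehole_in": 8.5, "gas_rin13": 900.0},  # result out of range
    {"tubing_fluid": "gas", "borehole_in": 8.5, "gas_rin13": 92.0},
    {"tubing_fluid": "gas", "borehole_in": 6.0, "gas_rin13": 87.0},
    {"tubing_fluid": "CO2", "borehole_in": 8.5, "gas_rin13": 91.0},     # too few cases
]

in_range = drop_out_of_range(archive, "gas_rin13", low=0.0, high=300.0)
cleaned = drop_rare_categories(in_range, "tubing_fluid", min_count=2)
print(len(archive), len(in_range), len(cleaned))
```

Applying the range filter first means category counts are taken over models that would actually be used for training; the opposite order is equally possible and is a design choice.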
MACHINE-LEARNING MODELS
Figure 12b: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 8.5-in. hole and 5.5-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.1 g/cm3).
Figure 12e: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 9.625-in. hole and 7.625-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.15 g/cm3).
FIELD EXAMPLE
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Vol. 2. New York: Springer.

Chen, T., Guestrin, C., 2016. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, USA, August 13-17, 2016.

ABOUT THE AUTHORS

Yagna Deepika Oruganti is a data scientist at Baker Hughes. Yagna is a graduate of the University of Texas at Austin with a master’s degree in Petroleum Engineering. She holds a bachelor’s degree in Chemical Engineering from the Indian Institute of Technology Madras. In her current role at Baker Hughes, she focuses on application of data analytics and machine-learning models for drilling, wireline and completions problems. Her work experience includes reservoir simulations for unconventional plays, refrac candidate selection in unconventional reservoirs, heavy oil thermal simulations, full field models for conventional naturally fractured carbonate reservoirs, decline curve analysis, single-well/multi-well reservoir models and well spacing analysis for tight oil, shale oil/gas plays.

Peng Yuan is a scientist at the Baker Hughes Houston Technology Center. He got his PhD in Mechanical Engineering from University of Pittsburgh in 2005. After graduation, he worked at Westinghouse Electric Company R&D department, where he performed CFD and thermal simulations related to nuclear reactor safety, nuclear fuel design and small modular reactor design. He joined Baker Hughes, a GE company’s Houston Technology Center in 2014 as a scientist in the formation evaluation research group. He has published more than 20 papers in various journals and conferences.

Yavuz Kadioglu is Sr. Director at Integrated Technical Services within Baker Hughes Oil Field Services. He has more than 24 years of experience in engineering analysis, R&D management and new product development in Power Generation and Oil & Gas industries. His experience includes drilling tools, surface systems, drill bits and downhole nuclear sensors. More recently, he managed the engineering team developing digital insights by combining Data Science/Machine Learning/Analytics with deep domain expertise in Oil and Gas. He earned a BS in Mechanical Engineering from Middle East Technical University, MS in Theoretical and Applied Mechanics and PhD in Mechanical Engineering both from University of Illinois at Urbana. He also has an MBA from NYU, Stern School of Business.

David Chace is Manager for Cased Hole Petrophysics in the Global Geoscience and Petroleum Engineering group at Baker Hughes, a GE company. He has 38 years of industry experience, with a focus on pulsed-neutron cased hole formation evaluation and production logging technologies. He has held various positions in research, engineering, operations, geoscience and marketing, including international assignments in the Middle East and Southeast Asia. He has authored or co-authored more than 20 professional technical papers and holds 20 US patents. He received a BSc in Physics from the University of Rhode Island, and he is a member of the SPWLA and the SPE.