SPWLA 60th Annual Logging Symposium, June 17-19, 2019

DOI: 10.30632/T60ALS-2019_CC

ROLE OF MACHINE LEARNING IN BUILDING MODELS FOR GAS SATURATION PREDICTION
Yagna Deepika Oruganti, Peng Yuan, Feyzi Inanc, Yavuz Kadioglu, and David Chace

Baker Hughes, a GE Company

Copyright 2019, held jointly by the Society of Petrophysicists and Well Log Analysts (SPWLA) and the submitting authors.
This paper was prepared for presentation at the SPWLA 60th Annual Logging Symposium held in The Woodlands, TX, USA June 17-19, 2019.

ABSTRACT

Quantitative gas saturation determination for reservoir monitoring purposes became possible with the introduction of a new generation of multi-detector pulsed neutron tools and interpretation algorithms. One distinctive feature of these interpretation algorithms is that they rely heavily on modeling of tool responses for the given completions and fluid types present in the system. This modeling is usually achieved through nuclear Monte Carlo simulations and involves long computing times, significant computer resources, and human intervention. However, despite the time and cost drawbacks of this approach, an associated benefit is the ever-growing library of models being computed for wells with different attributes. The existence of such Monte Carlo computed model libraries lends itself to the deployment of machine learning as a substitute for the lengthy and expensive Monte Carlo-based model building process. As a result, the associated cost and time management cease to be an issue in the data acquisition planning and interpretation for gas saturation determination.

Machine learning is a sub-branch of artificial intelligence, and encompasses a category of statistical algorithms that can “learn” from existing data without explicit programming. These algorithms can be used to build models to predict the outcome for a given set of conditions. In this specific instance, the conditions are completion, formation, and fluid parameters. For example, borehole size, number of casing strings, presence of cement, annular fluid parameters, lithology, porosity and fluid types in the pore space are all needed to predict the response of an instrument designed for reservoir monitoring. The ratios of count rates from two detectors placed at two distances from the pulsed neutron source are typical outcomes from a Monte Carlo modeling exercise. The machine-learning activity is a substitute for this process, providing fast and accurate inelastic and thermal gate ratio values for gas saturation determination. Various machine-learning algorithms such as random forest and extreme gradient boosting were applied to the data to generate prediction models for the ratios mentioned above. Results showed that over 90% accuracy can be achieved between the predictions from the machine-learning models and the ratios calculated from the Monte Carlo simulations on a validation data set.

The paper will first discuss the Monte Carlo-based model building and the existing model libraries used in quantitative gas saturation analysis along with the data processing methodology used to generate input data for the machine-learning algorithms. It will be followed by a discussion of various machine-learning models applied and their prediction accuracies along with variable values. Next, the trained machine-learning models will be deployed on blind test datasets (i.e., completion, lithology and formation parameter sets that the model has never encountered before), and the performance of the models on these completely new datasets will be demonstrated by comparing the predictions with those of the Monte Carlo-based models. Finally, the success of the trained machine-learning model will be demonstrated by deploying it on an actual gas saturation log, thereby showcasing the time and cost benefits of having data-driven models that can accurately predict inelastic and thermal gate ratio values.

INTRODUCTION

Reservoir performance monitoring has traditionally been based on measurement techniques using high-energy neutrons. This situation is especially true for cased holes because fast neutrons can penetrate through casing material easily and penetrate the formation beyond the casing and cement layer. Over the years, various measurement techniques have been developed using fast-neutron beams. Most techniques employ pulsed-neutron generators emitting neutrons with enough energy to induce inelastic scattering interactions in the target zones. The capture of the thermalized neutrons plays a role as well.


One popular measurement method is the pulsed-neutron-capture (PNC) log, which provides an accurate thermal-neutron-capture cross section of the formation surrounding the borehole. This cross section, also known as the ‘sigma’ (Σ) value of the formation, is based on the time spectrum, and it can be used to distinguish between formations saturated with hydrocarbons or high-salinity water. The thermal-neutron-capture cross section of the formation (and fluid in the pore space) determines the rate of change of the thermal neutron flux in the formation. Another well-known pulsed neutron measurement is the carbon/oxygen ratio, or C/O, measurement. This is based on the energy spectrum of the gamma rays from inelastic scattering interactions and is used to distinguish between formations saturated with hydrocarbons and water (regardless of salinity).

In recent years, a new measurement technique was developed for determination of quantitative gas saturations in cased wells. Some details of the method were published by Trcka, et al. (2006, 2008a, 2008b, 2008c), and Inanc, et al. (2009, 2014). Correspondingly, the PNC tool has a third gamma-ray detector located farther from the neutron source than the position of the typical long-spaced detectors. This location provides the necessary large dynamic range for the gas-saturation measurements that might not be available from tools with conventional detector spacing. An important differentiating aspect of this quantitative gas-saturation measurement method is that it uses a combination of log data and modeled data generated for the specific completion, mineralogy and fluids using Monte Carlo (MC) simulations. Incorporating the modeled data into the process accounts for the impact of borehole fluids, completion, mineralogy, formation and fluids on the measurements.

The relative error for MC simulation is normally proportional to 1/√N, where N is the number of particles being tracked. As a result, to reduce the statistical error of the simulation, a large number of particles (often in the hundreds of millions) needs to be tracked. Therefore, MC simulations are usually computationally expensive. On the other hand, once finished, the simulation results can be added to the model library and labelled with associated model attributes (such as completion, mineralogy, etc.). Over time, with enough models accumulated, this ever-growing model library provides an ideal data source for developing a machine-learning scheme to substitute for the lengthy and expensive MC-based model building process.

In this paper, the methodology for MC-based gas saturation analysis will be discussed first. This will be followed by a discussion of the well configurations selected for machine-learning model development with the input and output variables used in the implementation. Next, the machine-learning models trained on the training datasets and their prediction accuracies will be presented. Later, the trained machine-learning models are applied to blind test datasets (i.e., datasets that the model has never seen before) to measure the success of the machine-learning models. This is accomplished by comparing the predictions with those of the MC-based models, and the metrics for the performance of the models are provided. Finally, the trained machine-learning model will be applied to an actual gas saturation log. There, gas saturation interpretations based on the MC and machine-learning developed models will be presented side by side to demonstrate the merits of the machine-learning approach. This field example showcases the benefits of using machine-learning models that can accurately predict inelastic gate ratio values and provide highly comparable gas saturation values. This paper therefore demonstrates that data-driven machine-learning models can reduce cost and time while maintaining the accuracy of the saturation prediction.

GAS SATURATION PREDICTION METHODOLOGY AND MONTE CARLO-BASED MODEL BUILDING PROCESS

The instrument used for gas-saturation measurements reported in the present paper, the Baker Hughes Reservoir Performance Monitor (RPM™), employs a pulsed-neutron generator and three sodium iodide (NaI) scintillation detectors. The neutron generator is pulsed at a 1-kHz rate with a corresponding cycle duration of 1,000 µsec. The generator is activated during the first 60 µsec of the cycle. In each cycle, a rapid increase is observed in the count rates when the pulsed-neutron generator (PNG) is activated. The count rates reduce significantly when the generator is inactive. The photon counts recorded while the generator is active come primarily from inelastic interactions, with some contribution from thermal-neutron-capture events; the counts recorded while the source is inactive come from neutron-capture interactions. The cycle described above is shown in Figure 1.
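The 1/√N behaviour of the MC statistical error noted in the Introduction can be illustrated with a toy calculation. The sketch below is not an MCNP computation: it assumes each tracked particle scores in a detector with a fixed probability (a Bernoulli trial), and both the function name and the detection probability `p_detect` are hypothetical values chosen only for illustration.

```python
import math


def expected_relative_error(n_particles: int, p_detect: float) -> float:
    """Relative standard error of a Monte Carlo detection-fraction estimate.

    With N independent particles each scoring with probability p, the
    estimated fraction has standard deviation sqrt(p * (1 - p) / N), so its
    relative error is sqrt((1 - p) / (p * N)), which scales as 1/sqrt(N).
    """
    return math.sqrt((1.0 - p_detect) / (p_detect * n_particles))


if __name__ == "__main__":
    p = 1e-4  # hypothetical fraction of source particles that reach a detector
    for n in (10**6, 10**8):
        err = expected_relative_error(n, p)
        print(f"N = {n:>11,d}  relative error ~ {err:.4f}")
```

Under these assumptions, tracking 100 times more particles reduces the relative error by a factor of 10, which is why low-error simulations become expensive.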


[Figure: PNG Induced Gamma Ray Time Spectra. Typical normalized count rate (logarithmic scale, 0.00001 to 1) versus time (0 to 1000 microseconds) for Det. 1 (SS), Det. 2 (LS) and Det. 3 (XLS).]
Figure 1: - Typical time spectra recorded by detectors in a given time sequence.

The rate of change in the count rates and the differences between the first (short-spaced or SS) detector, second (long-spaced or LS) and third (extra-long spaced or XLS) detector are functions of borehole configuration, borehole fluids, formation properties and formation fluids. The impact of these parameters is different in the earlier and later stages of the time spectra. This difference plays an important role in the measurement.

Two ratio curves used in the gas saturation prediction are the short-spaced to extra-long-spaced gamma-ray count-rate ratios from inelastic scattering and thermal capture, which are RIN13 and RATO13, respectively. These two curves are provided by the tool and modeled by the Monte Carlo simulations. This paper describes using a machine-learning concept to predict those two quantities without relying on the MC computations.

The inelastic ratio curve, RIN13, is given in equation (1).

[Equation (1): expression for RIN13, not recoverable from the source text]

where each data point, Ni, is collected in a sequence of 10-microsecond windows. This is called an inelastic ratio, RIN13, because the counts are obtained from the early phase of the cycle, as indicated by the counters in the equation. The early phase of the data acquisition is dominated by the inelastic gamma rays, or photons. Although there could be other ratios, this one is specific to the ratio of inelastic-gamma-ray counts between the first or short-spaced (SS) and third or extra-long spaced (XLS) detectors.

The RATO13 ratio is given by equation (2). The acquisition time for the SS detector is shorter than that used for the XLS detector.

[Equation (2): expression for RATO13, not recoverable from the source text]

When this capture photon ratio is used for measurements, the SS and XLS detectors are used as in RIN13. Therefore, this ratio is called the RATO13 curve.

The log data can be processed to form RIN13 and RATO13 ratios with count rates from the SS and XLS detectors for inelastic-scattering and thermal-capture photons, respectively. The borehole content, tubulars, cement and formation composition all contribute to the behavior of the measured ratios. If the contribution of the borehole content, tubulars and cement can be determined and then de-convolved from the measured signal using a priori knowledge, the remaining portion of the signal can be used to determine the formation properties. One way of doing this is to introduce modeling into the process. In general, the radiation transport calculations can be done through either deterministic methods based on the Boltzmann transport equation or through MC methods. In this paper, the Monte Carlo N-Particle Transport Code (MCNP®) developed by Los Alamos National Laboratory has been used exclusively for solving radiation transport (Werner et al.). Therefore, hereinafter MC and MCNP are used interchangeably. The nature of the well logging measurements, and especially those involving pulsed neutron instruments, is a good fit for MC-based modeling and simulations.

[Figure: RPM tool (1.7 in) in a cased borehole, annotated with the modeling inputs.
Lithology: sandstone, limestone or dolomite.
Borehole/Completion Configuration: bit size, casing size, tubing size, cement type, borehole fluid.
Formation Fluids: water salinity, oil density, gas composition, gas density.]
Figure 2: - Typical input for Monte Carlo modeling.
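As an illustration of how a gate ratio such as RIN13 is formed from time spectra binned in 10-microsecond windows, the following minimal sketch sums SS and XLS counts over an early gate and takes their ratio. The gate boundaries are an assumption: the paper's equation (1) is not available here, so the sketch simply uses the six windows that cover the 60-microsecond burst. The function name and the synthetic spectra are hypothetical, and the sketch does not model the different SS and XLS acquisition times mentioned for RATO13.

```python
WINDOW_US = 10          # window width stated in the paper
CYCLE_US = 1000         # cycle duration at a 1-kHz pulse rate
BURST_US = 60           # generator-on time at the start of each cycle


def gate_ratio(ss_counts, xls_counts, first_window, last_window):
    """Ratio of summed SS counts to summed XLS counts over an inclusive
    range of 10-microsecond windows (windows are indexed from 0)."""
    ss = sum(ss_counts[first_window:last_window + 1])
    xls = sum(xls_counts[first_window:last_window + 1])
    if xls == 0:
        raise ValueError("XLS gate contains no counts")
    return ss / xls


if __name__ == "__main__":
    n_windows = CYCLE_US // WINDOW_US        # 100 windows per cycle
    burst_windows = BURST_US // WINDOW_US    # 6 windows during the burst
    # Synthetic spectra: high counts during the burst, low counts afterwards.
    ss = [100] * burst_windows + [10] * (n_windows - burst_windows)
    xls = [20] * burst_windows + [1] * (n_windows - burst_windows)
    # Assumed inelastic gate: the windows covering the burst.
    print(gate_ratio(ss, xls, 0, burst_windows - 1))
```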


Modeling provides predictions of the various measurements used for gas saturation determination. Figure 2 shows the specific wellbore configuration information required for the modeling. The borehole sizes, cement thickness, casing and tubing sizes are specified for each case. Additional information includes borehole fluids and formation fluid properties, such as densities and gas composition. Formation lithology and porosity can be obtained from well log data or general knowledge of the formation, while fluid parameters can be obtained through pressure, temperature, and produced or in-situ fluid samples. As might be expected, the borehole completion has a significant impact on the signal. Casing size and thickness, the presence of tubing, and the various borehole fluids within the annulus and tubing can all modify the logged signal.

The modeling is performed for formations fully saturated with gas, water and oil for a range of porosity values, typically from 0 to 40 porosity units (p.u.), and the results are used to form theoretical response, or “fan”, charts as shown in Figures 3a and 3b. Each chart has 100-percent gas, oil and water saturation lines given as functions of effective porosity. Each curve in the charts typically has 6 modeled data points, at porosity levels of 0, 5, 10, 20, 30 and 40 p.u. This totals 18 Monte Carlo runs, each requiring about one day of computing time. If enough computational resources are available, all 18 runs can be executed in parallel on a cluster, so the full set completes in about one day.

[Figure: Variation of RIN13 with Gas Density. RIN13 (about 30 to 90) versus porosity (0 to 40 pu) for water, oil, and gas at 0.247, 0.189 and 0.096 g/cc.]
Figure 3a: - Modeled RIN13 values for water, oil, and three different densities of natural gas. The formation is sandstone with an 8.5-inch borehole and a 7-inch casing. Formation oil density is 0.8 g/cm3, formation water density is 1.03 g/cm3, and the borehole fluid is 1.619 g/cm3 OBM. Gas densities correspond to pressures ranging from 2000 psi up to 6000 psi (Inanc, et al., 2009).

[Figure: Variation of RATO13 with Gas Density. RATO13 (0 to 120) versus porosity (0 to 40 pu) for water, oil, and gas at 0.247, 0.189 and 0.096 g/cc.]
Figure 3b: - Modeled RATO13 values for water, oil, and three different densities of natural gas. Conditions are the same as for Figure 3a (Inanc, et al., 2009).

The implementation of the gas-saturation method follows the basic flow diagram shown in Figure 4. The method uses Monte Carlo models based on pure mineralogy and effective porosity, and then introduces corrections to compensate for the presence of clays in shale. An example is shown at the end of this paper.

[Figure: flow chart with three inputs and one final step.
Log Data (time-based pulsed neutron gamma data): compute RIN13 and/or RATO13 curves.
Volumetric Data (porosity, shale volume, mineral volume): generate variable gas and liquid response lines (DGE) for each depth in the logged interval; compensate DGE for clay content.
Modeling Data (completion details, borehole fluids, formation fluids, lithology): compute DGE using Monte Carlo radiation transport simulations.
Final step: normalize RIN13 and RATO13 to DGE in wet or gas zone; compute gas saturation based on RIN13 and/or RATO13 position within DGE.]
Figure 4: - A flow chart showing the steps in the gas-saturation processing (Inanc, et al., 2014).
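The fan-chart idea can be sketched in a few lines: the modeled gas and liquid responses are interpolated to the porosity at a given depth, and the measured ratio is located between them. The paper states only that saturation is computed from the position of the ratio within the response envelope; the linear interpolation between the wet and gas lines used below is an assumption, as are the function names and the response values, which are invented for illustration.

```python
from bisect import bisect_right

POROSITY_NODES = [0, 5, 10, 20, 30, 40]   # p.u., the six modeled porosity levels
FLUIDS = ["gas", "oil", "water"]          # one MC run per fluid per porosity node


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of ys(xs) at x, clamped to the node range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)
    x0, x1 = xs[j - 1], xs[j]
    y0, y1 = ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def gas_saturation(ratio_measured, ratio_wet, ratio_gas):
    """Fractional position of the measured ratio between the wet (Sg = 0)
    and gas (Sg = 1) responses, clamped to [0, 1]. Linear scaling is an
    assumption made for this sketch."""
    if ratio_gas == ratio_wet:
        raise ValueError("gas and wet responses coincide; saturation undefined")
    fraction = (ratio_measured - ratio_wet) / (ratio_gas - ratio_wet)
    return min(1.0, max(0.0, fraction))


if __name__ == "__main__":
    print(len(FLUIDS) * len(POROSITY_NODES), "MC runs per fan chart")
    # Hypothetical modeled RIN13 responses at the porosity nodes.
    wet_curve = [80, 78, 76, 72, 68, 64]
    gas_curve = [80, 74, 68, 56, 44, 32]
    porosity = 15.0
    wet = interpolate(porosity, POROSITY_NODES, wet_curve)
    gas = interpolate(porosity, POROSITY_NODES, gas_curve)
    print(gas_saturation(68.0, wet, gas))
```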


DATA PROCESSING FOR MACHINE-LEARNING MODELS

For the MC-based gas saturation modeling, the MC simulations usually involve long computing times, significant computer resources, and human intervention. To reduce computational resources and human intervention, high-performance parallel computing, automatic input and output data processing as well as web-based IT infrastructure for model transfer (between geosciences centers and computing center analysts) were implemented. Even so, turnaround time for models can be 3 to 4 days, which includes communication of the model request to nuclear modelers, preparation of MC input files, run time (about 1 day if enough computational resources are available), post-processing time and communication of the model from nuclear modelers to geoscientists.

Nevertheless, after the MC simulations are performed for some specific conditions (such as completion, mineralogy, etc.), the model results are deposited into the model library and tagged with these associated conditions. Consequently, there is an ever-growing model library with models having various conditions/attributes. Over the years, more than 2500 GasView™ models have been archived in the model library. From a machine-learning point of view, this ever-growing model library provides an ideal data source for developing a machine-learning scheme. Once developed, such a scheme can then be used to substitute for the lengthy and expensive MC-based modeling process. As a result, cost and time management will no longer be an issue in determining the gas saturation.

As discussed earlier, well configuration has a big impact on the RIN13 and RATO13 values. Among all archived models, two types of well configuration, namely the 4-layer well configuration and the 6-layer well configuration, are much more common than other types. Figures 5a and 5b illustrate a 4-layer well configuration and a 6-layer well configuration, respectively, with layer numbers shown as well. Essentially, a 4-layer well configuration represents a cased-hole configuration with a single casing surrounded by cement and formation, and a 6-layer well configuration represents a cased-hole configuration with production tubing and one casing surrounded by cement and formation.

[Figure: cross sections of the two well configurations with layers numbered 1 to 4 (left) and 1 to 6 (right).]
Figure 5a: - A 4-layer well configuration. Figure 5b: - A 6-layer well configuration.

Table 1a: - Completion layer information, input and output variables for 4-layer well completion.


Table 1b: - Layer information, input and output variables for 6-layer well completion.

Tables 1a and 1b list layer information, input and output variables for a 4-layer well configuration and a 6-layer well configuration, respectively. Here, RIN12 and RATO12 are the ratio of inelastic-gamma-ray counts and the ratio of capture-gamma-ray counts between the first or short-spaced (SS) and second or long-spaced (LS) detectors, respectively.

A script was written to search for a suitable subset of the entire model library/database and identify models for 4-layer and 6-layer well configurations. Then the input and output files for these identified models were extracted as the inputs to the machine-learning model. Next, outlier models were removed. The outlier models generally fall into the following categories:

• The results (RIN13 and RATO13 values) are outside of the normal result range due to various reasons (such as higher modeling uncertainty).
• The number of cases with some specific input parameter is too small for model training purposes. For example, the fluid type in the tubing is normally oil, water or gas. However, there are cases where CO2 is the tubing fluid. Since the number of such cases is too low for training the machine-learning model, those cases were omitted.

After removing the outlier cases, we ended up with about 270 cases for 4-layer well configurations. These cases are used for developing the machine-learning model as discussed in the following section. A 6-layer ML workflow would follow the same procedures described in this paper.

MACHINE-LEARNING MODELS

The goal of the current study is to predict, with reasonable accuracy, oil, water and gas RIN13 values using machine learning (ML). The results from the machine-learning model are then compared with the MC-based RIN13 results to determine the performance metrics of the ML approach. This is a regression problem because the target variables (oil RIN13, gas RIN13 and water RIN13) are continuous. Furthermore, separate ML models are built and tuned for each of the three target variables.

Data cleansing is one of the most critical steps in the workflow for building a predictive model. If the quality of the data fed to the ML model is poor, the results will be correspondingly poor. The features that go into the machine-learning predictive model include borehole fluid density, casing diameter, formation density, formation gas, water and oil densities and borehole size. All those features show some level of variability, but some of the archived models have features that can be classified as outliers. One example of the variability can be seen in Figure 6. That histogram shows the variability of the borehole size encountered in the archive. While most of the borehole sizes are grouped around 6 inch and 8½ inch, some are quite different. For example, a borehole around 15 inch is clearly an outlier. Figure 7 provides a histogram showing the casing diameter distribution encountered in the database. This distribution follows the borehole size trends seen in Figure 6. In either case, the sampling frequency of some sizes is relatively low, with implications for training of the machine-learning models.

Figure 6: Borehole size distribution before data cleansing step.
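The cleansing rules described in this section can be sketched as a simple filter. The borehole-fluid density limits (0.6 to 1.2 g/cm3) are the ones quoted in the paper; everything else is an assumption made for illustration, including the field names, the borehole-size cutoff, the minimum number of cases per category, and the example records.

```python
from collections import Counter


def cleanse(cases, density_range=(0.6, 1.2), max_borehole_in=13.0,
            min_category_count=3):
    """Drop outlier cases, then drop categories too rare to train on.

    density_range follows the limits quoted in the paper (g/cm3);
    max_borehole_in and min_category_count are hypothetical thresholds.
    """
    low, high = density_range
    in_range = [
        c for c in cases
        if low <= c["borehole_fluid_density"] <= high
        and c["borehole_size_in"] <= max_borehole_in
    ]
    # Count how often each categorical value occurs among the remaining cases.
    counts = Counter(c["tubing_fluid"] for c in in_range)
    return [c for c in in_range if counts[c["tubing_fluid"]] >= min_category_count]


if __name__ == "__main__":
    def case(fluid, density, size):
        return {"tubing_fluid": fluid, "borehole_fluid_density": density,
                "borehole_size_in": size}

    library = [
        case("water", 1.0, 8.5), case("water", 1.0, 6.0), case("water", 1.1, 8.5),
        case("CO2", 1.0, 8.5),     # rare tubing fluid
        case("water", 1.6, 8.5),   # very heavy borehole fluid
        case("water", 0.2, 8.5),   # gas-like borehole fluid
        case("water", 1.0, 15.0),  # unusually large borehole
    ]
    print(len(cleanse(library)), "of", len(library), "cases kept")
```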

The distribution of the borehole fluid density (Layer 1) is shown in Figure 8. Any case in the dataset with a density higher than 1.2 g/cm3 is considered an exotic material and excluded from the dataset. The same is done with fluids lighter than 0.6 g/cm3, which would be in the gas or air classification, and are also removed from the training set. Only cases with high data quality were selected as input for the predictive model.

Figure 7: Casing diameter distribution prior to data cleansing.

Figure 8: Borehole fluid density distribution including outlier gas and very heavy mud data points.

The histograms in Figures 6, 7 and 8 show that the data tend to be clustered around certain values. For example, Figure 8 shows that the most common borehole fluid densities are centered around 1.0 g/cm3. This can present a challenge in predicting RIN13 from the ML model for input parameter ranges outside the most commonly occurring values, because the more data the training set contains for a given range of input parameters, the better the expected prediction accuracy of the ML model in that range.

After the data is cleansed, machine learning is used to predict the RIN13 values. The data is generally divided into a training set, a validation set and a test set. The ML model is trained on the training set. The validation set is used for tuning the hyperparameters of the ML model and provides an unbiased evaluation of a model fit on the training set. Finally, the test set is used to measure the generalizability of the predictive model (i.e., how the model would perform on hitherto unseen data).

However, because the resulting cleansed data in this study is limited in size, we apply a technique called K-fold cross-validation. Here, there is no explicit validation set defined. Instead, the training data set is divided into K folds (or groups), and the model is fit K times, each time on (K-1) folds with validation performed on the remaining fold. The cross-validation metric is the average of the K cross-validation scores. In this work, K was chosen to be 5 (i.e., 5-fold cross-validation). In this study, the train-test split was 80%-20% of the data, respectively.

Multiple machine-learning models, including ensemble methods, were evaluated (Hastie, et al., 2009), such as Decision Tree, Random Forest, Gradient Boosting, AdaBoost, and XGBoost (eXtreme Gradient Boosting). Hyperparameter tuning was performed using the Grid Search technique, and the model with the best 5-fold cross-validation score (XGBoost) was selected as the final predictive model. XGBoost is an advanced implementation of the Gradient Boosting algorithm (Chen, et al., 2016). The performance metrics of the XGBoost model for the RIN13 curves are shown in Table 2.
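The evaluation scheme described above (an 80%-20% train-test split, 5-fold cross-validation on the training portion, and a grid search that keeps the hyperparameter value with the best cross-validation score) can be sketched with the standard library alone. To stay self-contained, the sketch swaps in a one-dimensional k-nearest-neighbour regressor for XGBoost and scores by mean absolute error; the model, the metric, the function names and the synthetic data are all assumptions for illustration and are not the authors' implementation.

```python
import random
from statistics import mean


def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_folds(indices, k=5):
    """Partition the indices into k disjoint folds."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Stand-in regressor: average target of the n nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return mean(train_y[i] for i in order[:n_neighbors])


def cross_val_mae(xs, ys, train_idx, n_neighbors, k=5):
    """Average mean-absolute-error over k folds, each held out once."""
    fold_scores = []
    for fold in k_folds(train_idx, k):
        held_out = set(fold)
        fit_idx = [i for i in train_idx if i not in held_out]
        fit_x = [xs[i] for i in fit_idx]
        fit_y = [ys[i] for i in fit_idx]
        errors = [abs(knn_predict(fit_x, fit_y, xs[i], n_neighbors) - ys[i])
                  for i in fold]
        fold_scores.append(mean(errors))
    return mean(fold_scores)


def grid_search(xs, ys, train_idx, grid, k=5):
    """Return the grid value with the lowest cross-validation error."""
    scores = {n: cross_val_mae(xs, ys, train_idx, n, k) for n in grid}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [i / 10 for i in range(50)]               # synthetic feature
    ys = [2.0 * x + rng.gauss(0, 0.3) for x in xs]  # synthetic target
    train_idx, test_idx = train_test_split(len(xs))
    best, scores = grid_search(xs, ys, train_idx, grid=[1, 3, 5])
    print("best n_neighbors:", best)
```

The test indices are deliberately left untouched during the search, so they remain available for a final check on unseen data.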


Table 2: -Machine-learning model performance metrics

Machine-learning models consist of parameters and hyperparameters. Model parameters are “learnt” during the training process. For example, in tree-based models (such as decision trees, random forest and XGBoost), the learnable parameters are the choice of decision variables and the numeric thresholds at each node. Hyperparameters, on the other hand, are set manually. Often, the Grid Search technique is used to scan through a set of hyperparameters and choose the combination that yields the best cross-validation score.

In tree-based models, hyperparameters typically include the maximum depth of the tree (max_depth), the number of trees to grow (n_estimators), the learning rate (learning_rate), the minimum sum of weights for all observations required in a child (min_child_weight), the fraction of observations to be randomly sampled for each tree (subsample), and the fraction of columns to be randomly sampled for each tree (colsample_bytree), among others. Table 3 shows the range of hyperparameters used in tuning the XGBoost model.

Table 3: -Range of hyperparameters used in tuning the XGBoost model

Figure 9: - Relative feature importance of the input features to the model

Figure 9 shows the relative feature importance of the input features used in the XGBoost model. This analysis helps in gaining a better understanding of which features affect the predictive model more. The figure shows that borehole fluid density, casing diameter, porosity, and formation fluid densities are some of the more important features. This is in accordance with the physics of the problem. Usually, the borehole fluid has a strong influence on neutron slowing down, and that shows up in the RIN13 values, especially for larger-diameter boreholes. The influence of the casing diameter, which is correlated to the borehole size, is clearly seen. Figure 10 shows the correlation coefficients between each pair of variables. It also shows that oil and water RIN13 are strongly correlated. This is due to the similarity of hydrogen content in fresh water and typical 0.8 g/cm3 oil. The correlation among other features is low, in general.

Figure 10: Features correlation coefficient matrix
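A correlation matrix of the kind shown in Figure 10 can be computed as below. The paper does not say which correlation coefficient was used; the sketch assumes the Pearson coefficient, and the column names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name to values."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


if __name__ == "__main__":
    # Hypothetical values, one entry per modeled case.
    data = {
        "oil_RIN13": [52.0, 58.0, 61.0, 70.0],
        "water_RIN13": [53.0, 59.5, 62.0, 71.5],
        "casing_diameter_in": [4.5, 7.0, 5.5, 7.0],
    }
    matrix = correlation_matrix(data)
    for name, row in matrix.items():
        print(name, {k: round(v, 2) for k, v in row.items()})
```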


Figures 11a, 11b and 11c show, respectively, the XGBoost model predictions plotted against the actual gas, oil and water RIN13 values from MC simulations for each porosity point in the independent test set. The samples in the test set were not included in the training process, and act as a measure of generalizability of the model to unseen data. Figure 11 and Table 2 show that the performance of the predictive model on the test set is very good.

Figure 11a: - Machine-learning model predictions vs. actual values (from MC simulations) for Gas RIN13, for each porosity point in the independent test set cases.

Figure 11b: - Machine-learning model predictions vs. actual values (from MC simulations) for Oil RIN13, for each porosity point in the independent test set cases.

Figure 11c: - Machine-learning model predictions vs. actual values (from MC simulations) for Water RIN13, for each porosity point in the independent test set cases.

PREDICTION OF MCNP MODELS USING MACHINE LEARNING

The ML model was used to predict MC water, oil and gas response curves for a variety of completion sizes and fluid properties (various densities of water, oil and gas). Some examples are shown in Figures 12a-12e.

Figure 12a: - Oil, water and gas RIN13 curves from MC simulations and ML model predictions, 6-in. hole and 4.5-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.1 g/cm3).
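Agreement between ML predictions and MC values, as plotted in Figures 11 and 12, is usually summarized with a few scalar metrics. The contents of Table 2 are not available here, so the two metrics below (coefficient of determination and mean absolute percentage error) are an assumption about what such a comparison might report, and the numbers are invented for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination of predictions against actual values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical MC values and ML predictions for one response curve.
    mc_values = [80.0, 74.0, 68.0, 56.0, 44.0, 32.0]
    ml_values = [79.0, 75.0, 67.5, 57.0, 43.0, 33.0]
    print(f"R^2  = {r_squared(mc_values, ml_values):.4f}")
    print(f"MAPE = {mean_abs_pct_error(mc_values, ml_values):.2f} %")
```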


Figure 12b: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 8.5-in. hole and 5.5-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.1 g/cm3).

Figure 12c: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 8.5-in. hole and 7-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.1 g/cm3).

Figure 12d: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 3.25-in. hole and 2.375-in. casing OD (ρw=1 g/cm3, ρo=0.73 g/cm3, ρg=0.2 g/cm3).

Figure 12e: - Oil, water and gas RIN13 curves from MC simulations and ML predictions, 9.625-in. hole and 7.625-in. casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.15 g/cm3).

In general, for the range of completion sizes and fluid properties evaluated, the machine-learning predictions of MC simulations were very good. However, results were not as satisfactory when predicting MC responses in areas where the existing MC database used for ML training was sparsely represented. To improve this implementation, MC simulations would need to be performed over a wider range of completion sizes and fluid properties, and with higher resolution. This would yield a more robust ML approach for predicting MC responses.

Usually, training a predictive model can be relatively time consuming. For this specific implementation, the training set size was on the order of hundreds of cases and the training time was on the order of hours. Once a predictive model has been trained and production software has been prepared for prediction only, the runtime is comparable to the time needed to enter the feature information (e.g., minutes).

FIELD EXAMPLE

The following field example is from a North American tight gas field. Porosity is typically 10 percent or less. Due to the fresh formation water and low-porosity environments, PNC Sigma logging has not been a viable measurement technique. The detector ratio-based gas saturation method, however, is salinity independent and demonstrates high sensitivity in low porosity, and has been successfully utilized in thousands of wells. Figure 13 shows a comparison of the MC and ML predicted RIN13 responses for the example well, which contains 4.5-inch casing cemented inside a 7.5-inch openhole.

Figure 13: - Oil, water and gas RIN13 curves from MC


simulations and ML predictions, 7.5-inch hole and 4.5-
inch casing OD (ρw=1 g/cm3, ρo=0.8 g/cm3, ρg=0.1
g/cm3)

Figure 14 shows a comparison of saturation profiles
calculated using the MC and machine-learning (ML)
predicted response models.

The first track contains Sg (gas saturation) from MC, Sg
from ML, and Sw (water saturation) from Archie (plotted
inversely). The first conclusion is that the MC and ML
results agree so closely that it is difficult to see a
difference. The second conclusion is that the GasView
ratio-based saturation results are in excellent agreement
with the water saturation computed from openhole log
data.
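
As a rough illustration of how a ratio-based saturation can be read from a gas-liquid response envelope, the sketch below places a measured RIN13 value between a predicted liquid (wet) response and a predicted gas response at the same porosity. The linear interpolation, the function name, and the numerical values are assumptions made for illustration; the exact interpolation used in the analysis is not stated here.

```python
def gas_saturation(rin13_measured, rin13_wet, rin13_gas):
    """Estimate gas saturation (fraction, 0 to 1) by linear interpolation.

    rin13_wet: predicted response for liquid-filled pore space (Sg = 0)
    rin13_gas: predicted response for gas-filled pore space (Sg = 1)
    Works whether the gas response lies above or below the wet response.
    """
    span = rin13_gas - rin13_wet
    if span == 0:
        raise ValueError("gas and wet responses coincide; saturation is undefined")
    sg = (rin13_measured - rin13_wet) / span
    # Clip readings that fall outside the envelope (e.g. due to statistical noise).
    return min(1.0, max(0.0, sg))


# Hypothetical envelope values at one depth (not taken from the field example)
print(round(gas_saturation(1.15, rin13_wet=1.00, rin13_gas=1.25), 3))
```

In practice the wet and gas responses would be evaluated at each depth from the response model (MC or ML) as a function of formation porosity, so the envelope width, and with it the sensitivity of the estimate, changes with porosity.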
The second track contains the gas-liquid response
envelope (as a function of formation porosity) generated
from the ML model, together with the measured RIN13
response. The predicted liquid (or wet) response is shown
as a blue curve, the predicted gas response is shown as a
red curve, and the measured RIN13 response is shown as a
green curve. Gas saturation is determined from the
position of the normalized RIN13 curve relative to the
predicted gas and liquid (wet) responses. The third track
is a duplicate of the second track, but presents the
gas-liquid response envelope generated from the MC
model. RIN13 is also shown. The envelopes generated by
machine learning and MC are in excellent agreement.

The fourth track presents the volumetric distribution of
shale, sandstone, porosity and fluids.

Figure 14: Gas saturation analysis comparison using
MC and ML response predictions.

CONCLUSIONS

This paper demonstrates that ML can be used to generate
the predicted instrument responses required for the
described gas saturation analysis methodology.

In doing that, we have shown how to cleanse the data
before it can be used in training the machine-learning
models. An existing database of MC-developed models
was utilized for training the machine-learning models.
Most of the cases were clustered around the more popular
completion configurations, borehole fluids and formation
pore space fluids. Because of this clustering, the trained
models showed higher uncertainty when the input differed
significantly from such cases. This made apparent the
need for better sampling of the input variables in the
training sets.
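
One common way to obtain a more even spread of input variables than a historically grown database provides is a space-filling design such as Latin hypercube sampling. The sketch below is a minimal, assumption-laden illustration: the variable names and ranges are hypothetical, and this is not the sampling scheme used to build the existing MC database.

```python
import random


def latin_hypercube(ranges, n_samples, seed=0):
    """Return n_samples cases in which each variable's range is split into
    n_samples equal bins and every bin is used exactly once per variable."""
    rng = random.Random(seed)
    columns = []
    for low, high in ranges.values():
        width = (high - low) / n_samples
        # One random point inside each bin, then shuffle the order of the bins
        # so that the variables are paired at random.
        values = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(values)
        columns.append(values)
    names = list(ranges)
    return [dict(zip(names, row)) for row in zip(*columns)]


# Hypothetical input ranges for new MC simulation cases
RANGES = {
    "hole_size_in": (3.25, 12.25),
    "casing_od_in": (2.375, 9.625),
    "gas_density_g_cm3": (0.05, 0.30),
}

design = latin_hypercube(RANGES, n_samples=10)
for case in design[:3]:
    print(case)
```

Because the variables are sampled independently, some combinations will not be physical (for example, a casing OD larger than the hole size); such cases would need to be filtered out or the variables re-parameterized before any simulations are run.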


Another observation made in the training and testing
process is related to how well the material compositions
were captured in the existing database. The database
includes models generated over a nearly 15-year period.
Although the completion material densities (and
compositions) and borehole and formation fluid densities
are well known, the database includes a small number
of models that were generated for oils and gases with
H/C ratios different from the standard CH2 and CH4,
respectively. Unfortunately, these H/C ratios were not
searchable in the database, so a few of these models
were likely included in the training process. This may
have added to the uncertainty in the machine-learning
predictions, and it highlights the importance of
comprehensiveness and consistency in the data used for
training the machine-learning model.

Regardless of the uneven spread of the training cases and
the uncertainty in some material compositions, the
results predicted by the machine-learning model matched
the MC-developed models well. The test set R² values
were high, and the comparisons between the MC and ML
models for a set of example cases showed favorable
agreement.

In addition to the model comparisons, a field example
was given in which the match between the actual
MC-developed model and the machine-learning model was
very good, resulting in nearly identical gas saturation
values.

Using machine learning is technically feasible,
especially if the training and test cases cover the whole
input ranges adequately and there is good control over
the data accompanying those cases. With such a
machine-learning predictive model, it would be possible
to move model-building activity to locations where MCNP
cannot be used because of export control issues.

With the availability of this predictive model, the
turnaround time for the MC modeling process will cease to
be an issue, because building a model will be possible
within minutes at most and can be performed on site
without requiring support from an analysis center capable
of performing MC modeling.

Another outcome would be a significant reduction in the
costs associated with generating the MC-based models
currently employed.

ACKNOWLEDGEMENTS

The authors gratefully acknowledge Baker Hughes, a GE
company, for permission to publish this paper.

REFERENCES

Trcka, D.E., Gilchrist, W.A., Riley, S., Bruner, M.,
Esfandiari, T., Ly, T., Shearin, D., Patino, T., Murray,
H., Potter, J., Rose, R., Chen, J., Olsen, S., Lovera, O.,
McCants, D., Berger, A., Ellsworth, K., Barolak, G.,
Martain, R., McFall, A., and Guo, P., 2006, Field Trials
of a New Method for the Measurement of Formation Gas
Using Pulsed Neutron Instrumentation, Paper SPE-102350,
presented at the SPE Annual Technical Conference and
Exhibition, San Antonio, Texas, USA, 24–27 September.

Trcka, D.E., Riley, S., and Guo, P., 2008a, Measurement
of Formation Gas Pressure in Cased Wellbores Using
Pulsed Neutron Instrumentation, U.S. Patent No.
7,361,887, April 22, 2008.

Trcka, D.E., Riley, S., and Guo, P., 2008b, Measurement
of Formation Gas Saturation in Cased Wellbores Using
Pulsed Neutron Instrumentation, U.S. Patent No.
7,365,308, April 29, 2008.

Trcka, D.E., Guo, P., Riley, S., Barolak, G.B., and
Chace, D.M., 2008c, Determination of Gas Pressure and
Saturation Simultaneously, U.S. Patent No. 7,372,018,
May 13, 2008.

Inanc, F., Gilchrist, W.A., Ansari, R., and Chace, D.,
2009, Physical Basis, Modeling, and Interpretation of a
New Gas Saturation Measurement for Cased Wells,
Paper M, Transactions, SPWLA 50th Annual Logging
Symposium, The Woodlands, Texas, USA, 21–24 June.

Inanc, F., Gilchrist, W.A., Ansari, R., and Chace, D.,
2014, Physical Basis for a Cased-Well Quantitative
Gas-Saturation Analysis Method, Petrophysics, Vol. 55,
No. 6.

Werner, C.J., et al., 2017, MCNP User's Manual, Code
Version 6.2, LA-UR-17-29981.

Chen, T., and Guestrin, C., 2016, XGBoost: A Scalable
Tree Boosting System, Proceedings of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery
and Data Mining, San Francisco, California, USA,
13–17 August.


Hastie, T., Tibshirani, R., and Friedman, J., 2009, The
Elements of Statistical Learning: Data Mining,
Inference, and Prediction, Vol. 2, New York, Springer.

ABOUT THE AUTHORS

Yagna Deepika Oruganti is a data scientist at Baker
Hughes. Yagna is a graduate of the University of Texas
at Austin with a master's degree in Petroleum
Engineering. She holds a bachelor's degree in Chemical
Engineering from the Indian Institute of Technology
Madras. In her current role at Baker Hughes, she focuses
on the application of data analytics and machine-learning
models to drilling, wireline and completions problems.
Her work experience includes reservoir simulations for
unconventional plays, refrac candidate selection in
unconventional reservoirs, heavy oil thermal
simulations, full field models for conventional naturally
fractured carbonate reservoirs, decline curve analysis,
single-well/multi-well reservoir models and well spacing
analysis for tight oil and shale oil/gas plays.

Peng Yuan is a scientist at the Baker Hughes Houston
Technology Center. He received his PhD in Mechanical
Engineering from the University of Pittsburgh in 2005.
After graduation, he worked in the Westinghouse Electric
Company R&D department, where he performed CFD and
thermal simulations related to nuclear reactor safety,
nuclear fuel design and small modular reactor design. He
joined the Baker Hughes, a GE company, Houston
Technology Center in 2014 as a scientist in the formation
evaluation research group. He has published more than
20 papers in various journals and conferences.

Feyzi Inanc is a technical advisor at the Baker Hughes
Houston Technology Center. He earned a BS in
metallurgical engineering, followed by MS and PhD
degrees in nuclear engineering from Iowa State
University. Following a postdoctoral position at Iowa
State University, he worked as an assistant and
associate professor at Marmara University from 1990 to
1995. He later worked at Iowa State University as a
research scientist at the Center for Nondestructive
Evaluation from 1995 to 2007. He joined Baker Hughes
in 2007 as a scientist. In his career, he has published
about 70 technical articles, has been granted more than
20 patents, and has licensed various software.

Yavuz Kadioglu is Sr. Director at Integrated Technical
Services within Baker Hughes Oil Field Services. He has
more than 24 years of experience in engineering analysis,
R&D management and new product development in the
Power Generation and Oil & Gas industries. His
experience includes drilling tools, surface systems, drill
bits and downhole nuclear sensors. More recently, he
managed the engineering team developing digital
insights by combining Data Science/Machine
Learning/Analytics with deep domain expertise in Oil
and Gas. He earned a BS in Mechanical Engineering
from Middle East Technical University, and an MS in
Theoretical and Applied Mechanics and a PhD in
Mechanical Engineering, both from the University of
Illinois at Urbana. He also has an MBA from the NYU
Stern School of Business.

David Chace is Manager for Cased Hole Petrophysics in
the Global Geoscience and Petroleum Engineering group
at Baker Hughes, a GE company. He has 38 years of
industry experience, with a focus on pulsed-neutron
cased hole formation evaluation and production logging
technologies. He has held various positions in research,
engineering, operations, geoscience and marketing,
including international assignments in the Middle East
and Southeast Asia. He has authored or co-authored
more than 20 professional technical papers and holds 20
US patents. He received a BSc in Physics from the
University of Rhode Island, and he is a member of the
SPWLA and the SPE.