Two Factor Sequential Accepted Version
Document Version:
Peer reviewed version
Published In:
Journal of the Operational Research Society
Keywords: Sequential Exploration, Capital budgeting, Markov decision processes, Investments, Real
options
ABSTRACT
In a group of exploration prospects with common geological features, drilling a well reveals information
about chances of success in others. In addition, oil prices vary during the exploration campaign and
with them so do the economics of wells and the optimal decision to drill. With these dependencies and
price dynamics, where do we drill first and what comes next given success or failure in previous wells?
The solution to this valuation problem should compare the value of learning (drilling wells that provide
valuable information) with the uncertain value of earning (drilling wells that have large payoffs, yet
uncertain). We calculate a joint distribution for geological outcomes by applying information-theoretic
methods and construct a two-dimensional binomial sequence to represent a two-factor stochastic price
process. We then propose a Markov decision process that solves the optimal exploration problem. An
Excel® VBA software implementation of this algorithm accompanies this paper.
1. INTRODUCTION
Motivated by a petroleum exploration campaign in the Barents Sea, off the northern coasts of Norway,
we came to revisit a solution to a prominent exploration problem. When prospects are geologically
dependent, what is the optimal sequence of drilling? A discovery or a dry hole in one well will affect
the chances of success in the neighbouring prospects. Hence, the exploration decisions should consider
informational synergies between prospects. Furthermore, drilling in arctic waters and then evaluating
the results takes a long time, perhaps up to a year. By then the economic valuations for the upcoming
wells expire and a new round of analysis may completely change the drilling decisions. The
management faces a new problem; with such dynamic uncertainties and dependencies, what is the value
of taking this exploration campaign and which well (if any) comes first?
An optimal exploration policy should account for the geological dependencies; yet considering the
recent downturn in the markets that deeply affected the exploration business, a policy cannot ignore the
outlook of prices. Price uncertainty makes this problem resemble a restless bandit with multiple
correlated arms: the state of the system changes not only because of the decision maker’s actions, but
also because of external, random factors. Such problems can be solved through the underlying
Markov decision process (Puterman, 2014). In this paper, we apply a stochastic model that describes
the dynamics of prices and devise an aggregate algorithm for solving a moderate-sized problem of ten
exploration prospects.
Similar problems, from developing pharmaceutical products to selecting R&D projects, benefit from a
solution to the sequential exploration problem. When developing correlated compounds for products
that arrive later in the market, or when selecting dependent projects with delayed outcomes, the decision
makers deal with restless bandits with correlated arms. They face a trade-off between earning (drilling
high-value wells) and learning (drilling prospects with most valuable information), but then values are
not stationary; they change by the next decision epoch. We believe our valuation algorithm could also
be useful for these other applications.
To describe the time-related dynamics of a project value, we use a dynamic model for commodity
forward curves. Contrary to common belief, the financial benefit from discovering oil and gas is not
a lump sum reward; it is the expected net present value of a stream of cash flows materializing over
years in the future. To show how a project value varies with prices, we first need to show how the
outlook of prices varies. In our valuation, we employ cash flow models for each well to estimate the
effect of changing forward prices on the value of a discovery. Assuming oil prices follow the two-factor
process of Schwartz and Smith (2000), we apply the discrete binomial formulation of Hahn and Dyer
(2011) to represent prices in our Markov decision process. For a group of exploration prospects, we do
the following:
• We apply an information-theoretic approach, previously used in Bickel and Smith (2006), to
  generate a joint probability distribution incorporating marginal chances of success and
  geological dependencies.
• We model prices as a two-factor price process (Schwartz and Smith, 2000), and use the
  approach in Hahn and Dyer (2011) to construct dual-binomial lattices. The cash flow model
  takes the forward curves originating from each node of the lattice and estimates the value of
  discovery with respect to varying prices.
• We construct a Markov decision process representing the sequential exploration problem given
  the joint probabilities and binomial lattices. A recursive algorithm, incorporating transition
  probabilities and rewards from earlier steps, returns the value of optimal sequential
  exploration.
In this paper, we build on the strand of literature describing sequential exploration. Bickel and Smith
(2006) discussed optimal exploration of six prospects with outcomes “dry” or “wet”; their dynamic
programming model led to 3^6 = 729 states. They developed a spreadsheet model to handle the
valuation. Bickel, Smith and Meyer (2008) extended the previous model to more intricate geological
uncertainties, with three “layers” of uncertainty that could each take “fail” or “success” states. Solving for
five wells, their dynamic programming model had to handle around 59,000 states. Brown and Smith
(2013) and Martinelli et al. (2013) considered even larger problems: clusters of exploration prospects,
each having many targets. Eidsvik et al. (2018) further discussed this general category of sequential
information gathering decisions; they applied the methods to CO2 sequestration and mining projects.
For larger problems, approximate methods were used to address the curse of dimensionality.
In petroleum exploration, while these models are prevalent and insightful at early phases of screening,
we believe the effect of well economics gains importance as decision makers continue towards
commitment to investments. Our valuation algorithm expands the model in Bickel and Smith (2006)
and further includes the effect of stochastic prices on dynamic decisions. We also supply a modular,
open source software application that performs the valuation algorithm.
Our work also contributes to valuation of real options and applications of the two-factor price model in
Schwartz and Smith (2000); a model realistic enough to reflect the dynamics of prices in the markets
and simple enough to provide decision insights. In the oil and gas industry, Jafarizadeh and Bratvold
(2012, 2015) simulated the two-factor price process to evaluate real options. Yet in our discrete
Markov decision process, simulation would be prohibitive; we need a finite number of discrete states for
prices. Originally developed by Cox et al. (1979) to approximate geometric Brownian motions and
later adapted by Hahn and Dyer (2008) for mean-reverting processes, binomial lattices are
well suited to our sequential exploration model. We use the dual-binomial lattice developed
by Hahn and Dyer (2011) to approximate the two-factor price process.
In the next section, we discuss the details of constructing dual binomial lattices for prices. Then in
section 3, we use these lattices along with the joint geological probability distribution in a recursive
algorithm that solves the Markov decision process. This section also has a brief description of the
information-theoretic approach to calculate the joint probability distribution. In section 4, we describe
how we implemented the valuation algorithm in Excel VBA. Using the software, we solve a problem
with ten exploration targets, and perform sensitivity analyses that support decisions (with more details
about the software and example in appendices).
¹ This model is in fact equivalent to the stochastic convenience yield model of Gibson and Schwartz (1990).
Mean-reversion as an appropriate assumption for commodities is discussed in e.g. Laughton and Jacoby (1993), Cortazar
and Schwartz (1994), and Dixit and Pindyck (1994). Schwartz (1997) discusses mean-reversion in stochastic price
models and their ability to price existing future contracts, as well as financial and real assets.
² Previous attempts to discretize two-factor price diffusions, notably the dual trinomial lattice approach in Hull
and White (1994) or the improved version of Tseng and Lin (2007), worked only under a specific range of
correlation values and had computational limitations.
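State for $\ln S_t$                                   Probability
$\xi_t + \Delta\xi + \chi_t + \Delta\chi$             $p_{uu}$        (3)
$\xi_t + \Delta\xi + \chi_t - \Delta\chi$             $p_{ud}$        (4)
$\xi_t - \Delta\xi + \chi_t + \Delta\chi$             $p_{du}$        (5)
$\xi_t - \Delta\xi + \chi_t - \Delta\chi$             $p_{dd}$        (6)
where the increment for each factor is
$$\Delta\xi = \sigma_\xi \sqrt{\Delta t}, \tag{7}$$
$$\Delta\chi = \sigma_\chi \sqrt{\Delta t}. \tag{8}$$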
The probabilities of moving to each state, $p_{uu}$ to $p_{dd}$, are easier to calculate if we consider them as joint
probabilities: the product of the marginal probability of a move in $\xi_t$ and the conditional probability of a
move in $\chi_t$. For example, $p_{uu} = p_u \times p_{u|u}$, where $p_u$ is the marginal probability of an “up” move in the
long-term factor and $p_{u|u}$ is the conditional probability of an “up” move in the short-term factor given an
“up” move in the long-term factor. The marginal and conditional probabilities for the four transitions are
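$$p_u = \tfrac{1}{2} + \frac{\mu_\xi \Delta t}{2\Delta\xi}, \tag{9}$$
$$p_d = 1 - p_u, \tag{10}$$
$$p_{u|u} = \frac{\Delta\xi(\Delta\chi + \Delta t\,\nu_\chi) + \Delta t(\Delta\chi\,\mu_\xi + \rho\sigma_\xi\sigma_\chi)}{2\Delta\chi(\Delta\xi + \Delta t\,\mu_\xi)}, \tag{11}$$
$$p_{d|u} = \frac{\Delta\xi(\Delta\chi - \Delta t\,\nu_\chi) + \Delta t(\Delta\chi\,\mu_\xi - \rho\sigma_\xi\sigma_\chi)}{2\Delta\chi(\Delta\xi + \Delta t\,\mu_\xi)}, \tag{12}$$
$$p_{u|d} = \frac{\Delta\xi(\Delta\chi + \Delta t\,\nu_\chi) - \Delta t(\Delta\chi\,\mu_\xi + \rho\sigma_\xi\sigma_\chi)}{2\Delta\chi(\Delta\xi - \Delta t\,\mu_\xi)}, \tag{13}$$
$$p_{d|d} = \frac{\Delta\xi(\Delta\chi - \Delta t\,\nu_\chi) - \Delta t(\Delta\chi\,\mu_\xi - \rho\sigma_\xi\sigma_\chi)}{2\Delta\chi(\Delta\xi - \Delta t\,\mu_\xi)}. \tag{14}$$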
We assumed $\nu_\chi = -\kappa\chi_t$ to simplify the equations. Furthermore, because equations (11) to (14)
can return values outside the unit interval, we bound the probabilities for the short-term factor between zero
and one using the equation $p^{\text{bounded}} = \max(0, \min(1, p^{\text{unbounded}}))$.
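The one-step transition probabilities in equations (7) to (14) can be illustrated with a short Python sketch. This is not the accompanying VBA code: the function names are hypothetical, each conditional probability is taken as a joint probability divided by the marginal, and the "down" conditionals are computed as complements of the bounded "up" conditionals so that every pair sums to one. The values used for $\sigma_\chi$ and $\kappa$ in the usage lines are placeholders, not the paper's parameters.

```python
import math


def clamp(p):
    """Bound a probability to the unit interval."""
    return max(0.0, min(1.0, p))


def branch_probabilities(chi_t, dt, mu_xi, sigma_xi, sigma_chi, kappa, rho):
    """Joint one-step probabilities; the first letter refers to the long-term factor xi."""
    d_xi = sigma_xi * math.sqrt(dt)      # increment of the long-term factor, eq. (7)
    d_chi = sigma_chi * math.sqrt(dt)    # increment of the short-term factor, eq. (8)
    nu_chi = -kappa * chi_t              # mean-reverting drift of the short-term factor
    cross = rho * sigma_xi * sigma_chi

    p_u = 0.5 + mu_xi * dt / (2.0 * d_xi)
    p_d = 1.0 - p_u

    # Probability of an up move in chi, given an up / down move in xi.
    up_given_up = clamp(
        (d_xi * (d_chi + dt * nu_chi) + dt * (d_chi * mu_xi + cross))
        / (2.0 * d_chi * (d_xi + dt * mu_xi))
    )
    up_given_down = clamp(
        (d_xi * (d_chi + dt * nu_chi) - dt * (d_chi * mu_xi + cross))
        / (2.0 * d_chi * (d_xi - dt * mu_xi))
    )
    return {
        "uu": p_u * up_given_up,
        "ud": p_u * (1.0 - up_given_up),
        "du": p_d * up_given_down,
        "dd": p_d * (1.0 - up_given_down),
    }


# Placeholder inputs: sigma_chi and kappa are illustrative assumptions.
probs = branch_probabilities(chi_t=0.0, dt=1.0, mu_xi=0.0, sigma_xi=0.2,
                             sigma_chi=0.3, kappa=0.5, rho=0.3)
```

With zero drifts and a unit time step, the sketch gives equal weight to the two "same direction" branches and equal weight to the two "opposite direction" branches, with the split governed by the correlation.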
This will result in a dual binomial lattice for evolution of spot prices. At each node, the short- and long-
term factors each can have an “up” or “down” tick, resulting in two connected binomial sequences.
Although more comprehensible in three dimensional plots, we can still show the results in the lattice of
figure 1 assuming four branches originate from each node and using parameter values in table 1. Here
for example $S_1^{++} = e^{\xi_0 + \Delta\xi + \chi_0 + \Delta\chi}$ represents a move in spot price from $t = 0$ to $t = 1$ where both the
short- and long-term factors have up ticks. The quadrinomial lattice shown in black solid lines generates
four price states at $t = 1$ and nine at $t = 2$.
Table 1 Parameter values for the two-factor price process
Parameter          Value
$\xi_0$            4.1
$\mu_\xi$          0
$\sigma_\xi$       20%
$\rho_{\xi\chi}$   0.3
Because cash flows of exploration projects appear years into the future, spot prices are often irrelevant
to the economics of these decisions. Instead, we are interested in the information that spot prices provide
about the future trends. Forward curves³ provide such information by showing a riskless expectation of
future price trends; as in Jafarizadeh and Bratvold (2012, 2015), we can theoretically reconstruct
them at each node of the lattice using the assumptions of the two-factor process.
In commodity markets, a forward oil contract is an agreement to buy or sell a specific amount of oil at
a specific price in the future. With such contractual specifications, these contracts provide information
about the risk-free expectation of future prices. The price of a forward contract $F_{t,T}$ at time $t$ for delivery at
time $T$ is theoretically related to the parameters of the spot price process:
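$$\ln F_{t,T} = e^{-\kappa(T-t)}\chi_t + \xi_t + \mu_\xi (T-t) + \frac{A(T-t)}{2}, \tag{15}$$
$$A(T-t) = \left(1 - e^{-2\kappa(T-t)}\right)\frac{\sigma_\chi^2}{2\kappa} + \sigma_\xi^2 (T-t) + 2\left(1 - e^{-\kappa(T-t)}\right)\frac{\rho_{\xi\chi}\sigma_\xi\sigma_\chi}{\kappa}.$$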
We built a dual binomial lattice for spot prices using equations (3) to (8). Each node of the lattice
represents a different spot price scenario. In addition, for each spot price scenario, using equation (15),
we calculate forward prices for any maturity. Figure 1 shows the evolution of spot and forward prices.
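As a rough illustration of equation (15), the Python sketch below evaluates the theoretical forward curve at one lattice node. The function names are hypothetical, the parameters are assumed to be the ones that enter equation (15) directly, and the numerical values for $\sigma_\chi$, $\kappa$ and $\chi_t$ are placeholders chosen only to show the two curve shapes.

```python
import math


def log_forward(chi_t, xi_t, tau, mu_xi, sigma_xi, sigma_chi, kappa, rho):
    """Log forward price for time to maturity tau = T - t, following eq. (15)."""
    a = ((1.0 - math.exp(-2.0 * kappa * tau)) * sigma_chi ** 2 / (2.0 * kappa)
         + sigma_xi ** 2 * tau
         + 2.0 * (1.0 - math.exp(-kappa * tau)) * rho * sigma_xi * sigma_chi / kappa)
    return math.exp(-kappa * tau) * chi_t + xi_t + mu_xi * tau + a / 2.0


def forward_curve(chi_t, xi_t, maturities, **params):
    """Forward prices for a list of times to maturity at one lattice node."""
    return [math.exp(log_forward(chi_t, xi_t, tau, **params)) for tau in maturities]


# Placeholder parameters (sigma_chi and kappa are illustrative assumptions).
params = dict(mu_xi=0.0, sigma_xi=0.2, sigma_chi=0.3, kappa=1.0, rho=0.3)
high_spot = forward_curve(0.5, 4.1, [0.0, 1.0, 2.0], **params)    # short-term factor above zero
low_spot = forward_curve(-0.5, 4.1, [0.0, 1.0, 2.0], **params)    # short-term factor below zero
```

At zero time to maturity the forward price equals the spot price $e^{\xi_t + \chi_t}$. With these placeholder inputs, a positive short-term deviation produces a downward-sloping curve near the front and a negative deviation produces an upward-sloping one, which mirrors the range of shapes described for the lattice nodes.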
Figure 1 Dual binomial lattice showing the dynamics of spot prices (dark solid lines) and resulting forward
curves (dashed lines). The line in solid red is the forward curve fitted to the observed forward prices at $t = 0$.
Originating from each node of the lattice in figure 1, the dashed lines in red represent the theoretical
forward curves. When 𝑡 = 0, the theoretical forward curve (shown in solid red) fits the observed
forward prices in the market. Later, as spot prices vary in the lattice, so do the corresponding forward
curves. By $t = 2$ we will have thirteen curves in different shapes: from contango, e.g. the curve
originating from the node $S_2^{----}$, to normal backwardation as in the curve originating from $S_2^{++++}$.
³ In commodity markets, a forward curve is a function that defines prices for a set of forward contracts; all
contracts are identical except for their varying maturities.
3. VALUATION ALGORITHM
The sequential exploration problem resembles a restless multi-armed bandit with dependent arms
(Puterman, 2014). Drilling each well provides information about chance of geological success in other
locations. Yet, economic success is a matter of success in geology and a desirable price outlook. Wells
that seem economically viable now may become uneconomical by the end of period because the forward
curve moved to an unfavourable position. The solution is to make drilling decisions by considering both
the geological learning and the stochastic property of prices.
The next section discusses a method of integrating geological dependencies in the decision model. We
then incorporate these dependencies along with dual binomial sequence of prices into a Markov decision
process for the grand problem.
Where $\lambda_0$ is the unit multiplier, $\lambda_i$ and $\lambda_{ij}$ are the Lagrangian multipliers associated with $p_i$ and $p_{ij}$,
and the vector $\boldsymbol{\lambda}$ represents all these elements. This problem has only $1 + n + n(n-1)/2$ variables
and no constraints. For our ten-well problem, the automated Excel® Solver in VBA reaches a solution
for this optimization within a few seconds.
⁴ For our ten-well application, there will be 1024 unknown joint probabilities ($2^n$, $n = 10$) and 56 constraints
($1 + n + n(n-1)/2$, $n = 10$). Bickel and Smith (2006) explain the details of the Kullback-Leibler procedure and its
Lagrangian dual.
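To make the dual formulation more concrete, the following Python sketch fits a joint distribution for three wells from marginal chances of success and pairwise correlations. It is an illustration under stated assumptions, not the paper's procedure: the joint distribution is assumed to have the exponential form implied by one multiplier per well and one per pair, the normalization is handled by dividing by the sum of weights instead of a separate unit multiplier, and the multipliers are found by cyclic coordinate updates (iterative scaling) in place of Excel Solver. The three-well inputs are made up for the example.

```python
import math
from itertools import combinations, product


def pairwise_joint(p_i, p_j, rho):
    """P(both succeed) for two binary outcomes with given marginals and correlation."""
    return p_i * p_j + rho * math.sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))


def fit_joint(marginals, corr, sweeps=1000):
    """Joint distribution over 2^n outcomes matching marginals and pairwise moments."""
    n = len(marginals)
    outcomes = list(product((0, 1), repeat=n))
    # One feature per well and one per pair, each with its target expectation.
    features = [((i,), marginals[i]) for i in range(n)]
    features += [((i, j), pairwise_joint(marginals[i], marginals[j], corr[i][j]))
                 for i, j in combinations(range(n), 2)]
    lam = [0.0] * len(features)   # 1 + n + n(n-1)/2 multipliers, minus the normalizer

    def active(idx, w):
        return all(w[m] == 1 for m in idx)

    def distribution():
        weights = [math.exp(sum(l for l, (idx, _) in zip(lam, features) if active(idx, w)))
                   for w in outcomes]
        z = sum(weights)
        return [x / z for x in weights]

    for _ in range(sweeps):
        for k, (idx, target) in enumerate(features):
            probs = distribution()
            current = sum(p for p, w in zip(probs, outcomes) if active(idx, w))
            # Exact update that makes feature k match its target, others held fixed.
            lam[k] += math.log(target * (1 - current) / (current * (1 - target)))
    return dict(zip(outcomes, distribution()))


# Hypothetical three-well example.
corr = [[1.0, 0.2, 0.2], [0.2, 1.0, 0.2], [0.2, 0.2, 1.0]]
joint = fit_joint([0.3, 0.4, 0.5], corr)
```

The returned dictionary maps each outcome tuple such as `(1, 0, 1)` to its probability. For ten wells the same loop would run over 1024 outcomes and 55 multipliers; a production implementation would likely use a proper optimizer.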
nine decision periods. This decision tree, however, turns out to become unmanageably large in its
complete form, with almost four quadrillion end-nodes⁵.
Figure 2 A partial decision tree showing the decisions and uncertainties in a sequential exploration problem
We can simplify this decision model because multiple end-nodes associate with identical information
and future cash flows. For example, if we were successful in well 5 and then failed when we drilled
well 8, the future geological probabilities are going to be the same regardless of the order in which we
drilled the wells. The price at that point in time is also independent of its historical path; the forward
curve we use to evaluate cash flows will only be a function of spot price. We describe the Markov
decision model that draws on this recombining feature.
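The saving from recombination can be checked with simple counting. The Python sketch below reproduces the end-node count of the full decision tree (10 wells, 2 geological outcomes and 4 price moves per epoch) and compares it with a count of recombined states. The recombined count is our own illustration and rests on two assumptions: exactly one well is drilled per epoch, and the dual lattice has $(t+1)^2$ price nodes after $t$ steps.

```python
import math


def full_tree_end_nodes(n_wells, geo_outcomes=2, price_moves=4):
    """End nodes of the non-recombining tree: n! orderings times 8 outcomes per epoch."""
    return math.factorial(n_wells) * (geo_outcomes * price_moves) ** n_wells


def recombined_states(n_wells):
    """States when drilling order and price path are forgotten (illustrative count)."""
    total = 0
    for t in range(n_wells + 1):
        well_states = math.comb(n_wells, t) * 2 ** t   # which t wells were drilled, and their outcomes
        price_nodes = (t + 1) ** 2                     # recombining dual binomial lattice
        total += well_states * price_nodes
    return total


tree = full_tree_end_nodes(10)     # on the order of 10**15
states = recombined_states(10)     # a few million
```

Summing the well states alone over all epochs gives $3^{10}$, since each well is either undrilled, dry, or successful.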
We use a recursive algorithm, similar to the logic of solving a decision tree, to infer the optimal drilling
decisions. In the final decision epoch, we have drilled all wells except for one. The decision to drill this
last well depends on its conditional probability of success given the previous outcomes as well as its
expected cash flows given the four prevailing forward curves at that point. After we determined the
optimal decision for the scenarios of last epoch, we move backwards and calculate the optimal action
in previous epochs. The transition probabilities (the probability of moving from one state to another at
each decision epoch) will be composed of conditional geological probabilities and probabilities for price
ticks.
To describe the state of wells at each decision epoch, we define $\boldsymbol{\omega} = (\omega_1, \dots, \omega_i, \dots, \omega_{10})$, where $\omega_i$ =
“0”, “1” or “–”. Here, “0” means “failure”, “1” means “success”, and “–” represents the case where we
have not drilled the well yet. With this notation, $\boldsymbol{\omega} = (-, 0, 1, -, \dots, -)$ for example represents the state
where we have drilled wells 2 and 3; well 2 was a dry hole and well 3 was a success. Also, as we can
only drill one well per epoch, the eight available alternatives at this state are all the wells except wells 2
and 3.
The recursive algorithm selects the well that yields the highest expected value given the conditional
chance of success and price levels. In fact, the algorithm looks beyond the immediate drilling results
and considers the expected payoff that follows from this drilling decision; we refer to this as
the continuation value for price $S$ and denote it by $v^S(\boldsymbol{\omega})$.
⁵ In the first decision epoch, we have 10 alternative wells, each with 2 geologic outcomes and 4 price moves. The
next epoch has 9 wells, each with 8 outcomes. Continuing this trend, we will have $10! \times 8^{10}$ total outcomes.
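If we are in state $\boldsymbol{\omega}$ and well $i$ is not drilled yet (therefore $\omega_i$ = “–”), the expected value for well $i$,
accounting for all possible outcomes given we observe price $S$ and drill well $i$, is denoted by $v_i^S(\boldsymbol{\omega})$.
This value depends on the immediate reward ($d_i^S$), immediate cost of failure ($f_i$), and the expected
continuation value to the next decision epoch. In other words,
$$\begin{aligned}
v_i^S(\boldsymbol{\omega}) = {} & P(\boldsymbol{\omega}^{1_i} \mid \boldsymbol{\omega}) \left( d_i^S + \delta \left( p_{uu}\, v^{S^{++}}(\boldsymbol{\omega}^{1_i}) + p_{ud}\, v^{S^{+-}}(\boldsymbol{\omega}^{1_i}) + p_{du}\, v^{S^{-+}}(\boldsymbol{\omega}^{1_i}) + p_{dd}\, v^{S^{--}}(\boldsymbol{\omega}^{1_i}) \right) \right) \\
& + P(\boldsymbol{\omega}^{0_i} \mid \boldsymbol{\omega}) \left( f_i + \delta \left( p_{uu}\, v^{S^{++}}(\boldsymbol{\omega}^{0_i}) + p_{ud}\, v^{S^{+-}}(\boldsymbol{\omega}^{0_i}) + p_{du}\, v^{S^{-+}}(\boldsymbol{\omega}^{0_i}) + p_{dd}\, v^{S^{--}}(\boldsymbol{\omega}^{0_i}) \right) \right).
\end{aligned} \tag{17}$$
Here, the immediate reward $d_i^S$ is the expected discounted future value of a discovery given that we
observe price $S$. In addition, the continuation value for state $\boldsymbol{\omega}$ is the maximum expected value for all
drilling alternatives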
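Where $w$, $w_4$ and $w_5$ range over {0, 1}. The probabilities of success and failure for well $i$ conditional on
the state $\boldsymbol{\omega}$ (the transition probabilities, where well $i$ has not yet been drilled and $\omega_i$ = “–”) would be
$P(\boldsymbol{\omega}^{1_i} \mid \boldsymbol{\omega}) = P(\boldsymbol{\omega}^{1_i})/P(\boldsymbol{\omega})$ and $P(\boldsymbol{\omega}^{0_i} \mid \boldsymbol{\omega}) = P(\boldsymbol{\omega}^{0_i})/P(\boldsymbol{\omega})$, respectively.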
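The recursion in equation (17), together with the conditional probabilities above, can be sketched in a few lines of Python. This is a simplified stand-in for the VBA routine, with several assumptions that are ours: the decision maker may stop drilling at a value of zero, a price node is indexed by the net number of up moves `(a, b)` in the long- and short-term factors, and the reward, failure costs and price-move probabilities are supplied by the caller through the hypothetical arguments `reward`, `fail_cost` and `move_probs`.

```python
from functools import lru_cache


def optimal_value(joint, reward, fail_cost, move_probs, delta):
    """Value of optimal sequential drilling from the undrilled state at price node (0, 0)."""
    n = len(fail_cost)

    def prob(state):
        # Probability of all joint outcomes consistent with the wells drilled so far.
        return sum(p for w, p in joint.items()
                   if all(s == "-" or s == x for s, x in zip(state, w)))

    @lru_cache(maxsize=None)
    def v(state, a, b):
        best = 0.0   # assumption: stopping is allowed and is worth zero
        p_state = prob(state)
        for i, s in enumerate(state):
            if s != "-":
                continue
            value_i = 0.0
            for outcome, payoff in ((1, reward(i, a, b)), (0, fail_cost[i])):
                nxt = state[:i] + (outcome,) + state[i + 1:]
                p = prob(nxt) / p_state          # transition probability P(next | state)
                if p == 0.0:
                    continue
                cont = sum(q * v(nxt, a + da, b + db)
                           for (da, db), q in move_probs(a, b).items())
                value_i += p * (payoff + delta * cont)
            best = max(best, value_i)
        return best

    return v(("-",) * n, 0, 0)


# Toy example: two perfectly correlated wells, flat prices, no discounting.
joint = {(1, 1): 0.5, (0, 0): 0.5}
flat = lambda a, b: {(1, 1): 0.25, (1, -1): 0.25, (-1, 1): 0.25, (-1, -1): 0.25}
value = optimal_value(joint, lambda i, a, b: 10.0, [-12.0, -12.0], flat, 1.0)
```

In the toy example each well has an expected value of -1 in isolation, yet the sequential policy is worth 4: a success in the first well makes the second a certain discovery, and a failure lets the decision maker stop. Memoizing on the state and price node is what exploits the recombining structure.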
$$d_i^S = \sum_{\tau=0}^{n} \frac{F_{t,\tau}^{\xi_t,\chi_t}\, q_i^{\tau} - c_i^{\tau}}{(1+r)^{\tau}}. \tag{20}$$
In the above equation, 𝑟 is the risk-free interest rate and 𝑛 is the length of project given discovery.
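Equation (20) is a discounted sum over the project life, which the short Python sketch below spells out. The function and argument names are hypothetical; the sketch assumes that the quantity multiplying the forward price is the production in year $\tau$ and that the subtracted term is the cost in year $\tau$, and the numbers in the usage line are invented.

```python
def discovery_value(forward_prices, production, costs, r):
    """Discounted value of a discovery: sum over years of (F * q - c) / (1 + r)^tau."""
    return sum((f * q - c) / (1.0 + r) ** tau
               for tau, (f, q, c) in enumerate(zip(forward_prices, production, costs)))


# Invented two-year profile: investment in year 0, production in year 1.
value = discovery_value([50.0, 60.0], [0.0, 2.0], [100.0, 10.0], r=0.10)
```

In the lattice, the list of forward prices would come from the forward curve at the node where the well is drilled, so each node yields its own value of discovery.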
4.2 Example
We return to our arctic-circle exploration problem: a large, multi-prospect play with subsurface
dependencies that requires a long time to explore. The management believes that, with available
resources, drilling a well and then analysing and interpreting the results will take at least a year.
Assuming all wells require the same amount of resources, a complete exploration of this region would
perhaps take a decade. In the meantime, dramatic variations in prices could sway the optimal policy.
Hence, although at prevailing price projections the prospects are marginally uneconomical, the decision
makers are interested in the expected value that an aggregate optimal exploration policy would bring
about.
⁶ Although this application seems ideal for array programming languages, these are arguably not the ideal choice for
users in academia and industry. For example, prohibitively high software license fees and scarce availability of
programming skills hamper MATLAB® implementations of valuation algorithms (e.g. exploration waiting option
in Jafarizadeh and Bratvold, 2015). Yet Microsoft Excel is perhaps the platform of choice for small and medium
scale analysis tasks in industry and dissemination of an open-source VBA application may be more beneficial.
Applying our valuation algorithm along with the assumptions about price process and geological
correlations (with more details in Appendix A) reveals that the optimal sequential exploration of this
region would have a significant positive expected value. In other words, while each prospect was not
economically viable in isolation, a sequential drilling that considers both prices and geological learning
would become a sound investment. Considering the value of information and price option makes all the
difference.
Figure 3 shows the valuation in four different versions; first, ignoring geological dependencies and
variability of prices leads to the value of zero. Next, we assume prices vary. When prices follow a two-
factor process, the expected value of the exploration play becomes positive. In our third version, we
assume we have geological dependencies, but prices are stationary; in addition, to show the effect of
stationary price assumptions we have considered three different price scenarios. At long-run prices
of USD 60 (low scenario), 70 (expected scenario) and 80 (high scenario), the value of the geologic learning
option could be quite different. Finally, in version 4, we include geological dependencies along with
price dynamics and calculate the total value.
Figure 3 Expected value of sequential exploration. Although values are not additive, in this example geological
dependencies and variability of oil prices both drove the total value
Using the sensitivity analysis subroutine, we can take our valuation further by showing how key factors
affect the value of sequential exploration. The univariate sensitivity analyses in figure 4 reveal that, for
example, everything else unchanged, higher discount rates result in significantly reduced expected
values.
For correlated inputs, multi-dimensional sensitivities are more insightful. In a two-way analysis of price
volatility (figure 5), we notice that value is much more sensitive to 𝜎𝜉 , the volatility in the long-term
factor. This is perhaps because of the long-term nature of investing in sequential exploration. With one-
year intervals between drilling, the campaign takes up to ten years to conclude. Furthermore, production
revenue of any discovery will take years to materialize.
Figure 5 Sensitivity analysis of value with respect to volatility in long- and short-term price factors
In a more comprehensive analysis, we can even gain insight on the interlinked nature of price options
and geological learning. Effectively, many parameters influence the expected value: these include
individual project parameters, development solution given discovery, the price process and shape of the
forward curves, and the configuration of prospects and their pairwise correlations. While evaluating the
effect of such a large variable set is prohibitive, we could still identify key elements and examine their
effect.
In general, we expect higher price volatility and stronger geologic correlations to generate higher values.
So, what is the minimum volatility that makes a group of exploration targets (with a common
correlation, 𝜌) valuable? In other words, we are looking for break-even volatility given various levels
of common correlation. We can run the valuation algorithm for this group of targets and vary the price
volatilities $\sigma_\chi$ and $\sigma_\xi$ (while keeping all other parameters fixed) until we reach $v^{S_0}(\boldsymbol{\omega}) = 0$. This would
be the minimum 𝜎𝜒 or 𝜎𝜉 to have a positive value for sequential exploration. A simple goal-seeking
routine expedites the process. Figure 6 shows combinations of break-even volatility and common
geologic correlation at specific spot price scenarios.
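The goal-seeking step can be illustrated with a plain bisection search. The Python sketch below is a generic routine and does not reproduce the spreadsheet's goal-seek: it assumes that value increases with volatility over the search interval, consistent with the expectation stated above, and the quadratic `toy_value` function is only a placeholder for a full run of the valuation algorithm.

```python
def break_even(value, low, high, tol=1e-8):
    """Find x in [low, high] where value(x) crosses zero, assuming value is increasing."""
    if value(low) > 0.0 or value(high) < 0.0:
        raise ValueError("zero is not bracketed by the search interval")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if value(mid) < 0.0:
            low = mid      # still negative: break-even volatility is higher
        else:
            high = mid     # non-negative: break-even volatility is at or below mid
    return 0.5 * (low + high)


# Placeholder for the valuation algorithm as a function of one volatility.
toy_value = lambda sigma: sigma ** 2 - 0.04
sigma_star = break_even(toy_value, 0.0, 1.0)
```

Repeating the search for several levels of common correlation would trace out one contour of break-even volatility against correlation.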
Figure 6 Sensitivity analysis of value with respect to price and average geological correlation; each contour line
represents combinations of $\sigma_\xi$ and $\rho$ (solid lines) or $\sigma_\chi$ and $\rho$ (dashed lines) that make $v^{S_0}(\boldsymbol{\omega}) = 0$ at a specific
spot price.
In addition to a value estimate, we can also determine the starting well in this optimal sequence of
decisions. Because in this valuation, the interactions between geological learning and stochastic prices
are intricate, a complete strategy map would be neither feasible nor beneficial for decision makers.
However, running the software at any point in time and learning about the next well in the optimal
strategy would be enough to make value-maximizing decisions.
5. CONCLUSIONS
This paper provides an algorithmic solution to the Markov decision process of sequential exploration.
We combine the simplicity of binomial lattices with the power of recursive solutions and implement
our method in an effective computer application. Furthermore, we solve a problem of sequential
exploration consisting of ten wells and show how sensitivity analyses can provide deeper insights into
exploration decisions. We integrate all the modelling tools in a single package but note that each can
also work independently; for example, the subroutine for binomial lattice is also suitable for valuation
of commodity options, the embedded functions for forward curve work elsewhere within the
spreadsheet, and the recursion subroutine is applicable to other restless bandit problems.
Our valuation model goes beyond applications in the oil and gas industry. Comparable problems, for
example developing drugs from common compounds in the pharmaceutical industry or sequential R&D
projects, have similar characteristics. In these problems, once we understand the relationship between
value of a project and market uncertainties, the application of the model is straightforward.
Finally, we note an extension of the problem that can readily use this evaluation framework. In some
contexts, individual discoveries may be too small to justify a development solution. Success in two or
more wells could be bundled in a project that uses common production and export facilities to drain this
cluster of discoveries. We can adjust the rewards for our Markov decision process and handle these
functional synergies.
The Excel® VBA software is available at the following link:
https://www.dropbox.com/s/mrtr5tkl7ow2khr/Sequential_Exploration%202.1.xlsm?dl=0
REFERENCES
Bickel, J. E., & Smith, J. E. (2006). Optimal sequential exploration: A binary learning model. Decision
Analysis, 3(1), 16-32.
Bickel, J. E., Smith, J. E., & Meyer, J. L. (2008). Modeling dependence among geologic risks in
sequential exploration decisions. SPE Reservoir Evaluation & Engineering, 11(02), 352-361.
Brown, D. B., & Smith, J. E. (2013). Optimal sequential exploration: Bandits, clairvoyants, and
wildcats. Operations Research, 61(3), 644-665.
Cortazar, G., & Schwartz, E. S. (1994). The valuation of commodity contingent claims. Journal of
Derivatives, 1(4), 27-39.
Cox, J. C., Ross, S. A., & Rubinstein, M. (1979). Option pricing: A simplified approach. Journal of
Financial Economics, 7(3), 229-263.
Dixit, A. K., & Pindyck, R. S. (1994). Investment under uncertainty. Princeton University Press.
Eidsvik, J., Martinelli, G., & Bhattacharjya, D. (2018). Sequential information gathering schemes for
spatial risk and decision analysis applications. Stochastic Environmental Research and Risk
Assessment, 32(4), 1163-1177.
Hahn, W. J., & Dyer, J. S. (2008). Discrete time modeling of mean-reverting stochastic processes for
real option valuation. European Journal of Operational Research, 184(2), 534-548.
Hahn, W. J., & Dyer, J. S. (2011). A discrete time approach for modeling two-factor mean-reverting
stochastic processes. Decision Analysis, 8(3), 220-232.
Hull, J., & White, A. (1994). Numerical procedures for implementing term structure models II: Two-
factor models. Journal of Derivatives, 2(2), 37-48.
Jafarizadeh, B., & Bratvold, R. (2012). Two-factor oil-price model and real option valuation: An
example of oilfield abandonment. SPE Economics & Management, 4(03), 158-170.
Jafarizadeh, B., & Bratvold, R. B. (2015). Oil and gas exploration valuation and the value of
waiting. The Engineering Economist, 60(4), 245-262.
Jaynes, E. T. (1968). Prior probabilities. IEEE Transactions on Systems Science and Cybernetics, 4(3),
227-241.
Laughton, D. G., & Jacoby, H. D. (1993). Reversion, timing options, and long-term decision-
making. Financial Management, 225-240.
Martinelli, G., Eidsvik, J., & Hauge, R. (2013). Dynamic decision making for graphical models applied
to oil exploration. European Journal of Operational Research, 230(3), 688-702.
Nelson, D. B., & Ramaswamy, K. (1990). Simple binomial processes as diffusion approximations in
financial models. The Review of Financial Studies, 3(3), 393-430.
Puterman, M. L. (2014). Markov decision processes: Discrete stochastic dynamic programming. John
Wiley & Sons.
Schwartz, E. S. (1997). The stochastic behavior of commodity prices: Implications for valuation and
hedging. The Journal of Finance, 52(3), 923-973.
Schwartz, E., & Smith, J. E. (2000). Short-term variations and long-term dynamics in commodity
prices. Management Science, 46(7), 893-911.
Tseng, C. L., & Lin, K. Y. (2007). A framework using two-factor price lattices for generation asset
valuation. Operations Research, 55(2), 234-251.
These wells are not attractive in isolation. Their negative expected value shows that with current level
of information, they will not create value. However, geologists in the company believe the wells are
geologically dependent according to table 3.
Table 3 Geological correlations
Wells↓→  1     2     3     4     5     6     7     8     9     10
1        1     0.1   0.2   0.1   0.2   0.2   0.1   0.2   0.2   0.2
2              1     0.2   0.3   0.4   0.3   0.3   0.4   0.3   0.3
3                    1     0.1   0.2   0.1   0.1   0.2   0.4   0.4
4                          1     0.1   0.2   0.1   0.2   0.2   0.2
5                                1     0.1   0.3   0.4   0.3   0.3
6                                      1     0.1   0.2   0.1   0.1
7                                            1     0.2   0.1   0.1
8                                                  1     0.2   0.2
9                                                        1     0.3
10                                                             1
APPENDIX B: SOFTWARE DETAILS
This section describes the operation of the Excel® VBA functions and subroutines. In brief, the software
collects data from the spreadsheet and performs the Kullback-Leibler procedure on probabilities and
correlations. It then constructs the dual binomial lattices for prices and outcomes and, finally, carries
out the recursive algorithm. The result of these tasks is the value of the group of prospects under the optimal
exploration strategy.
As discussed before, the generalized modules in the program each perform a specific task and then pass
the necessary arguments to the next units. This open-source structure allows users to manipulate and
construct other special-purpose programs such as the sensitivity analysis subroutine we discussed in
section 4. This appendix expands on this modular structure and provides further details on the specific
functions and subroutines.
The program is composed of two main subroutines, “DP Main” and “KL Main”, that act like control
centres; they have module-level variables that pass to other subroutines to perform tasks. In the first
stage, “KL Main” calls a subroutine that collects the input data from the spreadsheet. All the data is
stored in arrays and then passed to the subroutine that performs the Kullback-Leibler optimization using
Excel’s Solver. The result is the joint probability distribution that passes to “DP Main” subroutine; it
solves the Markov decision process using our recursive algorithm. In fact, this second stage in the
program calls subroutines that generate price lattice and the scenario probabilities, and later, passes
these results to a subroutine called “DP Recursion” that performs the recursive valuation algorithm.
The result of this process is the value of the optimal sequential strategy.
The above flow chart shows the structure of the program and the series of tasks that the subroutines
perform. Each subroutine may also utilize functions (not shown in the flowchart) that perform part of
the processing. Table 4 shows a list of functions and subroutines in this software.
Table 4 List of functions and subroutines
Name              Type         Description
Pdthenu()         Function     Returns the conditional probability $p_{u|d}$ in a dual lattice
Pdthend()         Function     Returns the conditional probability $p_{d|d}$ in a dual lattice
KL Main           Subroutine   Main subroutine for Kullback-Leibler procedure
Data Collection   Subroutine   Collects input data from the spreadsheet and stores it in arrays.
KL Solution       Subroutine   Converts the correlations to pairwise joint probabilities, then
                               arranges probabilities into arrays, transforms the arrays to the
                               spreadsheet and runs the Excel Solver to complete the KL
                               procedure.
DP Main           Subroutine   Main subroutine for solving the Markov decision process
DP Price          Subroutine   Constructs the three-dimensional arrays to represent the
                               binomial lattice for $\chi_t$ and $\xi_t$. It then generates a forward curve
                               for pairs of $\chi_t$ and $\xi_t$ and calculates the then-NPV of discovery
                               for each well.
DP Probability    Subroutine   Uses the joint probability matrix generated in previous stage
                               and constructs the array for probability scenario.
DP Recursion      Subroutine   Uses arrays generated in previous stages and performs the
                               backward recursion algorithm. Then returns the value of
                               optimal sequential drilling and the first well in the sequence.
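(equations 11 to 14) lead to expected prices equal to forward prices at time $T$, i.e. if $E(S_T^*) = F_{0,T}$, then
we would know that the discretization works well.
Assuming $\mu_\xi = 0$ and $\sigma_\xi = 0$, and replacing binomial terms (equations 7 to 14) in equation 21 we have
$$\ln(E(S_T^*)) = \left(\tfrac{1}{2} + \sqrt{T-0}\,\frac{-\kappa\chi - \tfrac{1}{2}\sigma^2}{2\sigma}\right) \times \left(\xi + \chi + \sqrt{T-0}\,\sigma\right) + \left(\tfrac{1}{2} - \sqrt{T-0}\,\frac{-\kappa\chi - \tfrac{1}{2}\sigma^2}{2\sigma}\right) \times \left(\xi + \chi - \sqrt{T-0}\,\sigma\right).$$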
Figure 8 A comparison between the theoretical forward curve (solid line) and forward prices resulting from the
expected prices in the risk-neutral binomial lattice