2nd Year Internship at LAMSIN: Optimal stochastic
control problem with financial applications
Asma BEN SLIEMENE
ENSIIE
asma.ben-slimene@polytechnique.fr
from June 2016 to September 2016
Overview
1 Optimal stochastic problem theory
Dynamic Programming Principle
Hamilton Jacobi Bellman equation
2 Resolution methods
Probabilistic approach
Numerical/Deterministic approach with PDEs
3 Financial applications
Merton portfolio allocation Problem
Investment/consumption Problem
4 Numerical results on C++ and Scilab
For the investment problem
For the investment/consumption problem
LAMSIN
Training objective: an open door into financial mathematics research
Located at École Nationale d'Ingénieurs de Tunis (Tunisia).
Comprises 83 researchers, including 40 doctoral students. Each year,
6 to 8 students complete their Master's theses within the laboratory.
1983: creation of a research group in numerical analysis at ENIT.
2001: became a research laboratory associated with INRIA (e-didon team).
July 2003: selected by the Agence Universitaire de la Francophonie (AUF)
as a regional center of excellence in Applied Mathematics.
Research fields: inverse problems, financial mathematics including
optimization and control problems, etc.
Optimal stochastic problem theory
Resolution methods
Financial applications
Numerical results on C++ and Scilab
Dynamic Programming Principle
Hamilton Jacobi Bellman equation
I) Introduction to optimal stochastic problem
1 Optimal stochastic problem theory
2 Applications in finance
3 Dynamic programming principle
4 Hamilton Jacobi Bellman equation
1 State of the system: X_t(ω) and its dynamics through an SDE
dX_t = b(X_t, α_t) dt + σ(X_t, α_t) dW_t,  (1)
2 Control: a process α = (α_t)_t that satisfies some constraints and belongs
to A, the set of admissible controls.
3 Performance/cost criterion: maximize (or minimize) over all admissible
controls J(x, α).
Consider objective functionals of the form
E[ ∫_0^T f(X_s, α_s) ds + g(X_T) | X_0 = x ],  on a finite horizon T,
and
E[ ∫_0^∞ e^{−βs} f(X_s, α_s) ds | X_0 = x ],  on an infinite horizon.
f is a running profit function, g is a terminal reward function, and β > 0 is
a discount factor.
Objective: find the value function v(x) = sup_{α∈A} J(x, α).
Optimal stochastic problem theory
Resolution methods
Financial applications
Numerical results on C++ and Scilab
Dynamic Programming Principle
Hamilton Jacobi Bellman equation
Portfolio allocation
Production-consumption model
Irreversible investment model
Quadratic hedging of options
Superreplication cost in uncertain volatility
Optimal selling of an asset
Valuation of natural resources
Ergodic and risk-sensitive control problems
Superreplication under gamma constraints
Robust utility maximization problem and risk measures
Forward performance criterion
Definition
Bellman's principle of optimality
"An optimal policy has the property that whatever the initial state and initial
decision are, the remaining decisions must constitute an optimal policy with
regard to the state resulting from the first decision."
Mathematical formulation of Bellman's principle, or Dynamic
Programming Principle (DPP)
The usual version of the DPP is written as
v(t, x) = sup_{α∈A(t,x)} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ]
for any stopping time θ ∈ T_{t,T} (the set of stopping times valued in [t, T]).
Usual version of the DPP
(1) Finite horizon: let (t, x) ∈ [0, T] × R^n. Then
v(t, x) = sup_{α∈A(t,x)} sup_{θ∈T_{t,T}} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ]  (2)
        = sup_{α∈A(t,x)} inf_{θ∈T_{t,T}} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ]  (3)
(2) Infinite horizon: let x ∈ R^n. Then, for all stopping times θ ∈ T, we have
v(x) = sup_{α∈A(x)} sup_{θ∈T} E[ ∫_0^θ e^{−βs} f(X_s^x, α_s) ds + e^{−βθ} v(X_θ^x) ]  (4)
     = sup_{α∈A(x)} inf_{θ∈T} E[ ∫_0^θ e^{−βs} f(X_s^x, α_s) ds + e^{−βθ} v(X_θ^x) ]  (5)
Strong version of the DPP
Lemma (Dynamic programming principle)
(i) For all α ∈ A(t, x) and θ ∈ T_{t,T}:
v(t, x) ≥ E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ]  (6)
(ii) For all ε > 0, there exists α^ε ∈ A(t, x) such that for all θ ∈ T_{t,T}:
v(t, x) − ε ≤ E[ ∫_t^θ f(s, X_s^{t,x}, α_s^ε) ds + v(θ, X_θ^{t,x}) ]  (7)
Combining (i) and (ii), we deduce:
v(t, x) = sup_{α∈A(t,x)} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ]  (8)
for any stopping time θ ∈ T_{t,T}.
Proof of the DPP
Formal derivation of HJB
Assume that the value function is smooth enough (i.e. C²) to apply Itô's
formula.
For any α ∈ A and the controlled process X^{t,x}, apply Itô's formula to
v(s, X_s^{t,x}) between s = t and s = t + h:
v(t+h, X_{t+h}^{t,x}) = v(t, x) + ∫_t^{t+h} ( ∂v/∂t + L^{α_s} v )(s, X_s^{t,x}) ds + (local) martingale,
where, for a ∈ A, L^a is the second-order operator associated with the
diffusion X under the constant control a:
L^a w = b(x, a) · ∇_x w + (1/2) tr( σ(x, a) σᵀ(x, a) ∇²_x w )
Plug into the DPP:
divide by h, send h to zero, and obtain by the mean-value theorem the
so-called HJB equation.
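The plug-in step can be written out explicitly; this is a sketch of the formal limit, under the same smoothness assumption:

```latex
% Substituting the Ito expansion into the DPP with \theta = t + h:
0 = \sup_{\alpha \in \mathcal{A}} \mathbb{E}\!\left[\int_t^{t+h}
      \Big( f(s, X^{t,x}_s, \alpha_s)
          + \frac{\partial v}{\partial t}(s, X^{t,x}_s)
          + \mathcal{L}^{\alpha_s} v(s, X^{t,x}_s) \Big)\, ds \right].
% Divide by h and let h \to 0; by the mean-value theorem the integrand
% is evaluated at (t, x) with a constant control a, giving
\frac{\partial v}{\partial t}(t, x)
  + \sup_{a \in A}\big[ \mathcal{L}^a v(t, x) + f(t, x, a) \big] = 0 .
```

Moving everything to one side recovers the parabolic HJB equation.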
Formal derivation of HJB
The parabolic HJB equation
−∂v/∂t (t, x) − H₁(t, x, ∇_x v(t, x), ∇²_x v(t, x)) = 0, ∀(t, x) ∈ [0, T) × R^n,  (9)
where ∀(t, x, p, M) ∈ [0, T] × R^n × R^n × S^n:
H₁(t, x, p, M) = sup_{a∈A} [ b(x, a) · p + (1/2) tr(σ(x, a) σᵀ(x, a) M) + f(t, x, a) ].  (10)
The elliptic HJB equation
βv(x) − H₂(x, ∇_x v(x), ∇²_x v(x)) = 0, ∀x ∈ R^n,
where ∀(x, p, M) ∈ R^n × R^n × S^n:
H₂(x, p, M) = sup_{a∈A} [ b(x, a) · p + (1/2) tr(σ(x, a) σᵀ(x, a) M) + f(x, a) ].
Optimal stochastic problem theory
Resolution methods
Financial applications
Numerical results on C++ and Scilab
Probabilistic approach
Numerical/Deterministic approach with PDEs
II) Resolution methods
1 Probabilistic approach
2 PDE approach
Probabilistic approach
Approximate the process X_t by a Markov chain (ξ_n)_n with ξ_0 = x. Under
some conditions, (ξ_n) converges in law to X_t.
Monte Carlo algorithms are among the methods most widely used to obtain a
numerical approximation.
Case g = 0: let X^(1), ..., X^(n) be an i.i.d. sample drawn from the
distribution of X_T^{t,x}, and compute the mean:
v̂_n(t, x) := (1/n) Σ_{i=1}^n f(X^(i)).
Law of Large Numbers: v̂_n(t, x) → v(t, x) P-a.s.
Central Limit Theorem:
√n ( v̂_n(t, x) − v(t, x) ) → N(0, Var f(X_T^{t,x})) in distribution.
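As a hedged illustration (not code from the internship), the mean estimator can be sketched in C++. The geometric Brownian motion dynamics, the Euler step, and the choice f(x) = √x (matching the deck's terminal condition v_j^N = √x_j) are assumptions made for this example:

```cpp
#include <cmath>
#include <random>

// Minimal sketch: estimate v(0, x0) = E[f(X_T)] by the Monte Carlo mean,
// simulating dX = mu X dt + sigma X dW with an Euler scheme.
// All parameter names here are assumptions for the illustration.
double monte_carlo_value(double x0, double mu, double sigma, double T,
                         int n_steps, int n_paths, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> N01(0.0, 1.0);
    const double dt = T / n_steps, sqdt = std::sqrt(dt);
    auto f = [](double x) { return std::sqrt(x); };  // running example f(x) = sqrt(x)
    double sum = 0.0;
    for (int i = 0; i < n_paths; ++i) {
        double x = x0;
        for (int k = 0; k < n_steps; ++k)
            x += mu * x * dt + sigma * x * sqdt * N01(gen);  // Euler step
        sum += f(x);
    }
    return sum / n_paths;  // by the LLN, converges to E[f(X_T)] a.s.
}
```

The CLT quoted above gives the usual O(1/√n) error bar on this estimate.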
Steps
The PDE approach is based on:
Step 1: discretization of the time and space sets / approximation of the derivatives
Step 2: discretization of the boundary conditions (Dirichlet/Neumann)
Step 3: solving the discrete problem (policy/value iteration, Howard)
Outputs: the value function v and the optimal control strategy/stopping time.
Time and space discretization
Let Ω = [0, 1], Δt = T/N with N ∈ N*, t_k := kΔt for k = 0, ..., N, h the step
in space, x_j = jh. Ω_h, L_h^α, v_j^k, b_j^{k,α}, a_j^{k,α} approximate Ω, L^α,
v(t_k, x_j), b(t_k, x_j, α), a(t_k, x_j, α).
Approximation of the first derivative:
∂v/∂x (t_k, x_j) ≈ (v_{j+1}^k − v_{j−1}^k) / (2h)  (11)
∂v/∂x (t_k, x_j) ≈ (v_{j+1}^k − v_j^k) / h  (12)
or
∂v/∂x (t_k, x_j) ≈ (v_j^k − v_{j−1}^k) / h  (13)
Approximation of the second derivative:
∂²v/∂x² (t_k, x_j) ≈ (v_{j+1}^k − 2v_j^k + v_{j−1}^k) / h²  (14)
Approximation of the time derivative:
∂v/∂t (t_k, x_j) ≈ (v_j^k − v_j^{k−1}) / Δt  (15)
or
∂v/∂t (t_k, x_j) ≈ (v_j^{k+1} − v_j^k) / Δt  (16)
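The difference quotients (11)-(16) translate directly to C++; below is a minimal sketch on a uniform grid (the `Grid` layout `v[k][j] ~ v(t_k, x_j)` is an assumption of this example):

```cpp
#include <cmath>
#include <vector>

using Grid = std::vector<std::vector<double>>;  // v[k][j] ~ v(t_k, x_j)

// Centered first derivative in space, quotient (11).
double centered_dx(const Grid& v, int k, int j, double h) {
    return (v[k][j+1] - v[k][j-1]) / (2.0 * h);
}
// Forward first derivative in space, quotient (12).
double forward_dx(const Grid& v, int k, int j, double h) {
    return (v[k][j+1] - v[k][j]) / h;
}
// Backward first derivative in space, quotient (13).
double backward_dx(const Grid& v, int k, int j, double h) {
    return (v[k][j] - v[k][j-1]) / h;
}
// Second derivative in space, quotient (14).
double second_dx(const Grid& v, int k, int j, double h) {
    return (v[k][j+1] - 2.0 * v[k][j] + v[k][j-1]) / (h * h);
}
// Backward time derivative, quotient (15).
double backward_dt(const Grid& v, int k, int j, double dt) {
    return (v[k][j] - v[k-1][j]) / dt;
}
```

For a quadratic v(t, x) = x² these quotients reproduce ∂v/∂x = 2x and ∂²v/∂x² = 2 exactly, which makes them convenient sanity checks.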
Dirichlet boundary conditions: v = g₁ on ∂Ω × [0, T)
Neumann boundary conditions: ∂v/∂x = g₂ on ∂Ω × [0, T)
In the case f = 0 and g(x) = x^p / p, p ∈ (0, 1):
v_j^N = g_j = x_j^p / p
and
(v_M^k − v_{M−1}^k) / h = (p / x_M) v_M^k, since g′(x) = x^{p−1} = p g(x)/x,
for k ∈ 0, ..., N − 1, j ∈ 0, ..., M.
Alternatively:
v_M^k = v_{M−1}^k
or v_M^k = 0, and v_0^k = 0.
NB: in the portfolio allocation problem → Black-Scholes-Merton model of stocks:
dS_t = μ S_t dt + σ S_t dW_t,
dS_t^0 = r S_t^0 dt
Optimal stochastic problem theory
Resolution methods
Financial applications
Numerical results on C++ and Scilab
Merton portfolio allocation Problem
Investment/consumption Problem
III) Financial applications
1 Merton portfolio allocation Problem
2 Investment/consumption Problem
Applications 1: Merton portfolio allocation problem in
finite horizon
An agent invests at any time t a proportion α_t of his wealth X in a stock of
price S and 1 − α_t in a bond of price S⁰ with interest rate r.
The dynamics of the controlled wealth process are:
dX_t = (X_t α_t / S_t) dS_t + (X_t (1 − α_t) / S_t^0) dS_t^0
"Utility maximization problem at a finite horizon T":
v(t, x) = sup_{α∈A} E[ U(X_T^{t,x}) ], ∀(t, x) ∈ [0, T] × (0, ∞).
HJB equation for Merton's problem
v_t + r x v_x + sup_{a∈A} [ a(μ − r) x v_x + (1/2) x² a² σ² v_xx ] = 0  (17)
v(T, x) = U(x)  (18)
Utility function
U is C¹, strictly increasing and concave on (0, ∞), and satisfies the Inada
conditions:
U′(0) = ∞, U′(∞) = 0.
Convex conjugate of U:
Û(y) := sup_{x>0} [ U(x) − xy ]
We use the CRRA utility function:
U(x) = x^p / p, p < 1, p ≠ 0
Relative risk aversion (RRA): −x U″(x)/U′(x) = 1 − p.
→ If the agent experiences an increase in wealth, he/she will choose to
increase (or keep unchanged, or decrease) the fraction of the portfolio
held in the risky asset if relative risk aversion is decreasing (or constant, or
increasing).
Investment/consumption problem on infinite horizon
The SDE governing the wealth process:
dX_t = X_t (α_t μ + (1 − α_t) r − c_t) dt + X_t α_t σ dW_t
The goal is to maximize over strategies (α, c) the expected utility from
intertemporal consumption up to a random time horizon τ:
v(x) = sup_{(α,c)∈A×C} E[ ∫_0^τ e^{−βt} u(c_t X_t^x) dt ].
τ is independent of F_∞; denote by F(t) = P[τ ≤ t] = P[τ ≤ t | F_∞] the
distribution function of τ.
Assume an exponential distribution for the random time horizon:
1 − F(t) = e^{−λt}
for some positive constant λ.
Infinite horizon problem:
v(x) = sup_{(α,c)∈A×C} E[ ∫_0^∞ e^{−(β+λ)t} u(c_t X_t^x) dt ]
The associated HJB equation is
β̂ v(x) − sup_{a∈A, c≥0} [ L^{a,c} v(x) + u(cx) ] = 0, x ≥ 0,  (19)
where β̂ = β + λ and
L^{a,c} v(x) = x(aμ + (1 − a)r − c) v′(x) + (1/2) x² a² σ² v″(x).
Explicit solution
The discount factor β must satisfy: β > ρ − λ.
v(x) = K u(x) solves the HJB equation, where
K = ( (1 − p) / (β + λ − ρ) )^{1−p} and ρ = (μ − r)² / (2σ²) · p / (1 − p) + rp
The optimal controls are constant, given by (â, ĉ):
â = argmax_{a∈A} [ a(μ − r) + r − (1/2) a² (1 − p) σ² ]
ĉ = (1/x) (v′(x))^{1/(p−1)}.
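These closed-form constants are easy to check numerically. A small C++ sketch follows, using the unconstrained maximizer â = (μ − r)/((1 − p)σ²) obtained from the first-order condition of the argmax above; the struct name and the parameter values in the usage are illustrative assumptions:

```cpp
#include <cmath>

// Illustrative container for the model parameters (name is an assumption).
struct MertonParams { double mu, r, sigma, p, beta, lambda; };

// rho = (mu - r)^2 / (2 sigma^2) * p / (1 - p) + r p
double rho(const MertonParams& m) {
    return (m.mu - m.r) * (m.mu - m.r) / (2.0 * m.sigma * m.sigma)
               * m.p / (1.0 - m.p)
           + m.r * m.p;
}
// K = ((1 - p) / (beta + lambda - rho))^{1 - p}, requires beta > rho - lambda
double K(const MertonParams& m) {
    return std::pow((1.0 - m.p) / (m.beta + m.lambda - rho(m)), 1.0 - m.p);
}
// Unconstrained maximizer of a(mu - r) + r - a^2 (1 - p) sigma^2 / 2.
double a_hat(const MertonParams& m) {
    return (m.mu - m.r) / ((1.0 - m.p) * m.sigma * m.sigma);
}
// With v = K x^p / p, v'(x) = K x^{p-1}, so c_hat = K^{1/(p-1)} (constant).
double c_hat(const MertonParams& m) { return std::pow(K(m), 1.0 / (m.p - 1.0)); }
```

For example, with μ = 0.08, r = 0.02, σ = 0.2, p = 0.5, β = 0.10, λ = 0.05 one gets ρ = 0.055 and â = 3 (a levered position, which would be clipped if A is bounded).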
Why the Markov chain approach?
Solving the discretized system requires some conditions on the matrix A
of the differential operator L^α.
When A is not positive definite, we can still obtain a discretized
system that satisfies the "discrete maximum principle".
Under a specific condition on the space step h we get a convergent
Markov chain [page 89, J.-P. Chancelier, A. Sulem, Méthode numérique en
contrôle stochastique].
The convergence of the scheme can be established using the standard
arguments of H.J. Kushner [Numerical Methods for Stochastic Control
Problems in Continuous Time].
NB: depending on the sign of the drift b of X_t, we use the right-hand
(forward) upwind scheme when b is positive and the left-hand (backward)
upwind scheme when b is negative, so that the coefficients can be read as
transition probabilities (∈ [0, 1]).
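The transition-probability reading can be sketched as follows. The three weights are those of the explicit upwind scheme used later in the deck; under the CFL-type condition (Δt/h)|b| + (Δt/h²)a ≤ 1 (an assumption stated here, not a quote from the slides) they are nonnegative and sum to one:

```cpp
#include <algorithm>
#include <cmath>

// Upwind weights on {x_j, x_{j+1}, x_{j-1}} for drift b and diffusion
// coefficient a (= sigma^2 term) at one grid node.
struct Probs { double p0, pp, pm; };

Probs upwind_probs(double b, double a, double dt, double h) {
    double bp = std::max(b, 0.0);   // b^+ : moves mass to the right
    double bm = std::max(-b, 0.0);  // b^- : moves mass to the left
    Probs q;
    q.pp = dt / h * bp + 0.5 * dt / (h * h) * a;
    q.pm = dt / h * bm + 0.5 * dt / (h * h) * a;
    q.p0 = 1.0 - dt / h * std::fabs(b) - dt / (h * h) * a;
    return q;  // p0 + pp + pm = 1 since b^+ + b^- = |b|
}
```

If the CFL condition fails, p0 becomes negative and the probabilistic interpretation (and the discrete maximum principle) is lost, which is exactly why the condition on h and Δt appears above.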
Optimal stochastic problem theory
Resolution methods
Financial applications
Numerical results on C++ and Scilab
For the investment problem
For the investment/consumption problem
IV) Numerical results on C++ and Scilab
1. Results for the investment problem
Approximated scheme
Resolution method/Coding
Results
2. Results for the investment/consumption problem
Approximated scheme
Resolution method/Coding
Results
Approximated scheme
Approximated scheme: two different schemes were used.
The forward upwind scheme
The approximated HJB equation is:
v_j^{k−1} = sup_α { [1 − (Δt/h)|b_j^{k,α}| − (Δt/h²) a_j^{k,α}] v_j^k
  + [(Δt/h)(b_j^{k,α})⁺ + (1/2)(Δt/h²) a_j^{k,α}] v_{j+1}^k
  + [(Δt/h)(b_j^{k,α})⁻ + (1/2)(Δt/h²) a_j^{k,α}] v_{j−1}^k }
v_j^N = g_j
Denote
p_j^α = p(x_j, x_j | α), p_{j,+}^α = p(x_j, x_{j+1} | α), p_{j,−}^α = p(x_j, x_{j−1} | α)
the transition probabilities that define the transition matrix A^α.
Matrix notation: v^{k−1} = sup_α (I − Δt A^α) v^k
The explicit solution is given in [1].
Algorithm C++
Algorithm of the forward scheme
Initialization: ∀j ∈ {0, ..., M}, v_j^N = √x_j
For k from N − 1 down to 0:
  v_0^k = 0
  For j in 1, ..., M − 1, compute v_j^k := v(t_k, x_j) = sup_{α_i} w(t_k, x_j, α_i):
    for each α_i in [α̂ − ε, α̂ + ε]:
      compute (b_j^{α_i})⁺ and (b_j^{α_i})⁻
      solve
      v_j^k = sup_{α_i} { [1 − (Δt/h)|b_j^{α_i}| − (Δt/h²) a_j^{α_i}] v_j^{k+1}
        + [(Δt/h)(b_j^{α_i})⁺ + (1/2)(Δt/h²) a_j^{α_i}] v_{j+1}^{k+1}
        + [(Δt/h)(b_j^{α_i})⁻ + (1/2)(Δt/h²) a_j^{α_i}] v_{j−1}^{k+1} }
  v_M^k = v_{M−1}^k
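One backward-in-time sweep of this explicit scheme can be sketched in C++. This is a hedged reconstruction, not the internship code: the Merton coefficients b(x, a) = x(r + a(μ − r)) and diffusion term x²a²σ², the control grid, the Dirichlet condition at x = 0, and the Neumann condition v_M = v_{M−1} are the assumptions used here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One sweep v^k <- sup_a (explicit upwind operator applied to v^{k+1}),
// on the grid x_j = j h, j = 0..M, for the Merton wealth dynamics.
std::vector<double> sweep(const std::vector<double>& vnext, double h, double dt,
                          double mu, double r, double sigma,
                          const std::vector<double>& controls) {
    const int M = static_cast<int>(vnext.size()) - 1;
    std::vector<double> v(vnext.size());
    v[0] = 0.0;                                  // Dirichlet at x = 0
    for (int j = 1; j < M; ++j) {
        double x = j * h, best = -1e300;
        for (double a : controls) {              // maximize over the control grid
            double b  = x * (r + a * (mu - r));          // drift
            double aa = x * x * a * a * sigma * sigma;   // diffusion coefficient
            double bp = std::max(b, 0.0), bm = std::max(-b, 0.0);
            double cand =
                  (1.0 - dt/h * std::fabs(b) - dt/(h*h) * aa) * vnext[j]
                + (dt/h * bp + 0.5 * dt/(h*h) * aa) * vnext[j+1]
                + (dt/h * bm + 0.5 * dt/(h*h) * aa) * vnext[j-1];
            best = std::max(best, cand);
        }
        v[j] = best;
    }
    v[M] = v[M-1];                               // homogeneous Neumann at x_M
    return v;
}
```

Starting from v_j^N = √x_j and calling `sweep` N times reproduces the loop described above; Δt must satisfy the CFL-type condition for the weights to stay in [0, 1].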
Results
The shapes of the approximated value function and the explicit solution are
very close at time 0.
A very small difference is observed at the boundary x = x_M.
Results
Error in the value function (of order 10⁻³).
The implementation requires a large number of points (the larger N is, the
larger M must be).
Results
Control: the results are satisfying.
The error grows from one time step to the next near the boundary of Ω.
Results
The error is estimated at 2·10⁻².
The shape of the value function
We can plot the approximated value function as a function of time and space,
since the different values are stored in an Excel file.
Backward scheme
The backward upwind scheme
The approximated HJB equation is:
v_j^k = v_j^{k+1} + sup_α { [−(Δt/h)|b_j^α| − (Δt/h²) a_j^α] v_j^k
  + [(Δt/h)(b_j^α)⁺ + (1/2)(Δt/h²) a_j^α] v_{j+1}^k
  + [(Δt/h)(b_j^α)⁻ + (1/2)(Δt/h²) a_j^α] v_{j−1}^k }
v_j^N = g_j
(v_M^k − v_{M−1}^k) / h = (p / x_M) v_M^k, k ∈ 0, ..., N − 1, j ∈ 0, ..., M
Denote
p_j^α = −(Δt/h)|b_j^α| − (Δt/h²) a_j^α,
p_{j,+}^α = (Δt/h)(b_j^α)⁺ + (1/2)(Δt/h²) a_j^α,
p_{j,−}^α = (Δt/h)(b_j^α)⁻ + (1/2)(Δt/h²) a_j^α
the transition probabilities that define a Markov chain with the transition
matrix A^α.
Matrix notation: sup_α (I + Δt A_h^α) v^{k+1} − v^k = 0
Algorithm in Scilab
Howard's algorithm
We set up Howard's algorithm [3][7], which solves min_{α∈A} (B(α)x − b),
where B(α) is defined by B(α)_ij = (I + δt A(α_i))_ij.
1. Initialize α⁰ in A.
2. Iterate for k ≥ 0:
(i) find x^k ∈ R^N solution of B(α^k) x^k = b;
(ii) α^{k+1} := argmin_{α∈A^n} (B(α) x^k − b);
(iii) set k = k + 1.
Note that at each iteration we have to find the control value of α.
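A minimal sketch of Howard's iteration on a toy 2-state, 2-control discounted problem of the form (I − γP(α))x = c(α); the matrices, costs, and discount factor are illustrative assumptions, not data from the slides:

```cpp
#include <array>
#include <cmath>

// Toy problem data (all values are assumptions for the illustration):
// transition probabilities P[a][i][j] and per-state costs C[a][i].
constexpr double gamma_ = 0.9;
constexpr double P[2][2][2] = {{{0.8, 0.2}, {0.3, 0.7}},
                               {{0.1, 0.9}, {0.6, 0.4}}};
constexpr double C[2][2] = {{2.0, 1.0}, {0.5, 3.0}};

// Step (i): solve the 2x2 linear system (I - gamma P(alpha)) x = c(alpha).
std::array<double, 2> solve(const std::array<int, 2>& al) {
    double a11 = 1.0 - gamma_ * P[al[0]][0][0], a12 = -gamma_ * P[al[0]][0][1];
    double a21 = -gamma_ * P[al[1]][1][0], a22 = 1.0 - gamma_ * P[al[1]][1][1];
    double b1 = C[al[0]][0], b2 = C[al[1]][1];
    double det = a11 * a22 - a12 * a21;          // Cramer's rule
    return {(b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det};
}

std::array<double, 2> howard() {
    std::array<int, 2> al = {0, 0};              // 1. initialize alpha^0
    for (int it = 0; it < 50; ++it) {            // 2. iterate
        std::array<double, 2> x = solve(al);     // (i) linear solve
        std::array<int, 2> next = al;
        for (int i = 0; i < 2; ++i) {            // (ii) greedy improvement
            double best = 1e300;
            for (int a = 0; a < 2; ++a) {
                double q = C[a][i]
                         + gamma_ * (P[a][i][0] * x[0] + P[a][i][1] * x[1]);
                if (q < best) { best = q; next[i] = a; }
            }
        }
        if (next == al) return x;                // policy stable -> optimal
        al = next;                               // (iii) next iteration
    }
    return solve(al);
}
```

At termination the policy is stable, so the returned x satisfies the Bellman optimality condition x_i = min_a [c_i(a) + γ(P(a)x)_i]; with two states and two controls, at most four policies exist, so the loop stops quickly.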
Results: Value function
The approximated value function is very close to the optimal solution.
Results: Error between value functions
Let us illustrate the error between the two functions: it is of order 10⁻³.
The error increases near the boundary in x; this can be explained by the
boundary conditions used in the model.
Results: Optimal control α
The shape of the optimal control α compared to the explicit solution.
The same comments apply regarding the terminal condition imposed on x.
Results: Error between control solutions
In the Howard algorithm, both Dirichlet-type and Neumann-type boundary
conditions were used ⇒ the Neumann conditions give better results.
Introduction to the Markov chain approach
There exist k > 0 and a Markov matrix M_h^α verifying
A_h^α = −β̂ I_h + (1/k)(M_h^α − I_h), i.e. M_h^α = I_h + k(A_h^α + β̂ I_h)  (20)
Hence
(M_h^α)_ij = 1 + k(β̂ + (A_h^α)_ii) if i = j, and k(A_h^α)_ij if i ≠ j.
We choose k such that k ≤ 1 / (β̂ + |(A_h^α)_ii|), ∀i = 1, ..., d, which makes
all matrix coefficients (M_h^α)_ij positive; the row sums satisfy
Σ_j (M_h^α)_ij = 1 in the Neumann case, < 1 in the Dirichlet case.
(20) can be rewritten as: sup_{α∈A} [ (M_h^α − I_h − β̂ k I_h) v_h + k û_h ] = 0
⇒ the HJB equation of a control problem for a Markov chain with discount
rate β̂_h, instantaneous cost k û_h and transition matrix M_h^α.
Explicit value function
The shape of the explicit solution of the problem using the CRRA utility
function.
Approximated value function
At the terminal set, the value function goes to infinity.
The shapes of the explicit and approximated solutions, disregarding the
terminal set in x: the results are not bad!
Error
The error is estimated at 5·10⁻² and is larger at the terminal boundary in x.
Conclusion
Optimal stochastic control: an interesting field of research.
Merton portfolio allocation, without and with consumption, as classic
examples.
Numerical methods (forward and backward schemes, Howard and policy
iteration) approximate the optimal solutions and must verify stability,
consistency and convergence ⇒ a controlled Markov chain has been used.
The numerical results were satisfying despite the errors related to the
sophisticated boundary conditions.
The DPP supposes a minimum of smoothness of the value function to apply
Itô's formula! This is not always the case ⇒ the viscosity-solution approach
is widely used in finance.
Imagine more complicated problems, such as investment problems with
transaction costs (singular optimal control problems): what methods should
be used to model their solutions?
References
D. Lamberton and B. Lapeyre.
Introduction au Calcul Stochastique Appliqué à la Finance.
Éditions Eyrolles, 1997.
H. Pham.
Continuous-time Stochastic Control and Optimization with Financial Applications.
Springer, 2008.
Jean-Philippe Chancelier and Agnès Sulem.
Méthode numérique en contrôle stochastique.
Le Cermics, 22 February 2005.
H.J. Kushner and P. Dupuis.
Numerical Methods for Stochastic Control Problems in Continuous Time.
Springer Verlag, 1992.
S. Crépey.
Financial Modeling.
Springer, 2013.
http://www.cmap.polytechnique.fr/~touzi/Fields-LN.pdf
http://www.math.fsu.edu/~pgarreau/files/merton.pdf
The END
 
Numerical smoothing and hierarchical approximations for efficient option pric...
Chiheb Ben Hammouda
 
Reinforcement learning Markov decisions process mdp ppt
GandikotaVivekvardha
 
Financial engineering3478
artipradhan
 
Impulsive and Stochastic Hybrid Systems in the Financial Sector with Bubbles
SSA KPI
 
Biosight: Quantitative Methods for Policy Analysis: Stochastic Dynamic Progra...
IFPRI-EPTD
 
Ad

Recently uploaded (20)

PPTX
Commercial Bank Economic Capsule - June 2025
Commercial Bank of Ceylon PLC
 
PDF
RETRIEVE YOUR SCAM CRYPTO NOW WITH SOLACE CYBER WORKSTATIONS
vgraham775
 
PPTX
Poverty alleviation and Women empowerment. Bandhan Bank .pptx
61ShifaMahat
 
PDF
Eni 2023 First Quarter Results - Delivering on Performance and Strategy
Eni
 
PDF
The Pensions Review - final recommendations with cover.pdf
Henry Tapper
 
PDF
Teaching Thursday_ Understanding On-Chain Metrics
CIFDAQ Blockchain
 
PDF
Rachel Reeves Gift to UK Retirement Planning Tears of Joy July 2025 (1) revis...
Henry Tapper
 
PDF
Option trees - BDT tree - mean reversion - short rate model - risk management
luc faucheux
 
PDF
Understanding Gap Funds Bridging the Innovation Valley of Death.pdf
Innovosource
 
PDF
CDC-social-housing-report-2025-1.pdf report
Henry Tapper
 
PDF
Tech-Powered Compassion: Dr. Tran Quoc Bao and the Rise of Vietnam’s Digital ...
Gorman Bain Capital
 
PDF
Creo_Final Presentation.pdfhhkskdjkekedkdk
lovepreetgee
 
PDF
Western Copper and Gold_Investor_Presentation_July 2025.pdf
cmagee4
 
PDF
Safety Rules 10 Essential Guidelines for a Secure Workplace in 2024.pdf
visionary vogues magazine
 
PPTX
Latest guidelines Pay_Fixation_26th_June.pptx
andupandu139
 
PDF
cp25-17.pdf FCA target support document download
Henry Tapper
 
PDF
Our added value in Software & financial services sector 0725.pdf
dpioux
 
PDF
The Longevity Bow Wave. What it is and how to determine it.
Better Financial Education
 
PPTX
macro lecture 2.pptx....................
raniamoawad1
 
PDF
econ210 wrwrwerdsdfsfsdfsfsdfsfwerwrertret
UlkerProgrammer
 
Commercial Bank Economic Capsule - June 2025
Commercial Bank of Ceylon PLC
 
RETRIEVE YOUR SCAM CRYPTO NOW WITH SOLACE CYBER WORKSTATIONS
vgraham775
 
Poverty alleviation and Women empowerment. Bandhan Bank .pptx
61ShifaMahat
 
Eni 2023 First Quarter Results - Delivering on Performance and Strategy
Eni
 
The Pensions Review - final recommendations with cover.pdf
Henry Tapper
 
Teaching Thursday_ Understanding On-Chain Metrics
CIFDAQ Blockchain
 
Rachel Reeves Gift to UK Retirement Planning Tears of Joy July 2025 (1) revis...
Henry Tapper
 
Option trees - BDT tree - mean reversion - short rate model - risk management
luc faucheux
 
Understanding Gap Funds Bridging the Innovation Valley of Death.pdf
Innovosource
 
CDC-social-housing-report-2025-1.pdf report
Henry Tapper
 
Tech-Powered Compassion: Dr. Tran Quoc Bao and the Rise of Vietnam’s Digital ...
Gorman Bain Capital
 
Creo_Final Presentation.pdfhhkskdjkekedkdk
lovepreetgee
 
Western Copper and Gold_Investor_Presentation_July 2025.pdf
cmagee4
 
Safety Rules 10 Essential Guidelines for a Secure Workplace in 2024.pdf
visionary vogues magazine
 
Latest guidelines Pay_Fixation_26th_June.pptx
andupandu139
 
cp25-17.pdf FCA target support document download
Henry Tapper
 
Our added value in Software & financial services sector 0725.pdf
dpioux
 
The Longevity Bow Wave. What it is and how to determine it.
Better Financial Education
 
macro lecture 2.pptx....................
raniamoawad1
 
econ210 wrwrwerdsdfsfsdfsfsdfsfwerwrertret
UlkerProgrammer
 
Ad

Research internship on optimal stochastic theory with financial application using finite differences method for a numerical resolution

  • 1. 2nd Year Internship at LAMSIN: Optimal stochastic control problem with financial applications Asma BEN SLIEMENE ENSIIE asma.ben-slimene@polytechnique.fr from June 2016 to September 2016
  • 2. Overview 1 Optimal stochastic problem theory Dynamic Programming Principle Hamilton Jacobi Bellman equation 2 Resolution methods Probabilistic approach Numerical/Deterministic approach with PDEs 3 Financial applications Merton portfolio allocation Problem Investment/consumption Problem 4 Numerical results on C++ and Scilab For the investment problem For the investment/consumption problem
  • 3. LAMSIN Training objective: an open door into financial mathematics research. Located at École Nationale d'Ingénieurs de Tunis (Tunisia). Comprises 83 researchers, including 40 doctoral students; each year, 6 to 8 students complete their Master's theses within the laboratory. 1983: creation of a research group in numerical analysis at ENIT. 2001: becomes a Research Laboratory associated with INRIA (e-didon team). July 2003: selected by the Agence Universitaire de la Francophonie (AUF) as a regional center of excellence in Applied Mathematics. Fields of research: inverse problems, financial mathematics including optimization/control problems, etc.
  • 4. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Dynamic Programming Principle Hamilton Jacobi Bellman equation I) Introduction to optimal stochastic problem 1 Optimal stochastic problem theory 2 Applications in finance 3 Dynamic programming principle 4 Hamilton Jacobi Bellman equation 4 / 74
  • 6. 1 State of the system: X_t(ω) and its dynamics through an SDE dX_t = b(X_t, α_t) dt + σ(X_t, α_t) dW_t, (1) 2 Control: a process α = (α_t)_t that satisfies some constraints and is valued in A, the set of admissible controls. 3 Performance/cost criterion: maximize (or minimize) over all admissible controls J(X, α). Consider objective functionals of the form E[ ∫_0^T f(X_s, ω, α_s) ds + g(X_T, ω) | X = x ] on a finite horizon T, and E[ ∫_0^∞ e^{−βs} f(X_s, ω, α_s) ds | X = x ] on an infinite horizon, where f is a running profit function, g is a terminal reward function, and β > 0 is a discount factor. Objective: find the value function v(x) = sup_α J(x, α).
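As an illustration of the controlled dynamics (1), a minimal Euler–Maruyama sketch in C++: the drift b, the diffusion σ and the constant control used in the test are placeholder assumptions, not the coefficients studied in the internship.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <random>
#include <vector>

// Euler-Maruyama discretization of the controlled SDE
//   dX_t = b(X_t, a_t) dt + sigma(X_t, a_t) dW_t,  X_0 = x0,
// for a given control path t -> a_t.  Returns the simulated trajectory.
std::vector<double> simulate_path(double x0, double T, int N,
                                  const std::function<double(double, double)>& b,
                                  const std::function<double(double, double)>& sigma,
                                  const std::function<double(double)>& control,
                                  unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    const double dt = T / N;
    std::vector<double> path(N + 1);
    path[0] = x0;
    for (int k = 0; k < N; ++k) {
        const double a = control(k * dt);              // control at time t_k
        const double dW = std::sqrt(dt) * gauss(gen);  // Brownian increment
        path[k + 1] = path[k] + b(path[k], a) * dt + sigma(path[k], a) * dW;
    }
    return path;
}
```

With σ ≡ 0 the scheme reduces to an explicit Euler ODE step, which makes the trajectory easy to check by hand.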
  • 7. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Dynamic Programming Principle Hamilton Jacobi Bellman equation I) Introduction to optimal stochastic problem 1 Optimal stochastic problem theory 2 Applications in finance 3 Dynamic programming principle 4 Hamilton Jacobi Bellman equation 7 / 74
  • 8. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Dynamic Programming Principle Hamilton Jacobi Bellman equation Portfolio allocation Production-consumption model Irreversible investment model Quadratic hedging of options Superreplication cost in uncertain volatility Optimal selling of an asset Valuation of natural resources Ergodic and risk-sensitive control problems Superreplication under gamma constraints Robust utility maximization problem and risk measures Forward performance criterion 8 / 74
  • 11. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Dynamic Programming Principle Hamilton Jacobi Bellman equation I) Introduction to optimal stochastic problem 1 Optimal stochastic problem theory 2 Applications in finance 3 Dynamic programming principle 4 Hamilton Jacobi Bellman equation 11 / 74
  • 12. Definition Bellman's principle of optimality: "An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision." Mathematical formulation of Bellman's principle, or Dynamic Programming Principle (DPP): the usual version of the DPP is written as v(t, x) = sup_{α∈A(t,x)} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ] for any stopping time θ ∈ T_{t,T} (the set of stopping times valued in [t, T]).
  • 13. Usual version of the DPP (1) Finite horizon: let (t, x) ∈ [0, T] × R^n. Then v(t, x) = sup_{α∈A(t,x)} sup_{θ∈T_{t,T}} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ] (2) = sup_{α∈A(t,x)} inf_{θ∈T_{t,T}} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ] (3). (2) Infinite horizon: let x ∈ R^n. Then v(x) = sup_{α∈A(x)} sup_{θ∈T} E[ ∫_0^θ e^{−βs} f(X_s^x, α_s) ds + e^{−βθ} v(X_θ^x) ] (4) = sup_{α∈A(x)} inf_{θ∈T} E[ ∫_0^θ e^{−βs} f(X_s^x, α_s) ds + e^{−βθ} v(X_θ^x) ] (5).
  • 14. Strong version of the DPP Lemma (Dynamic programming principle) (i) For all α ∈ A(t, x) and θ ∈ T_{t,T}: v(t, x) ≥ E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ] (6). (ii) For all ε > 0, there exists α^ε ∈ A(t, x) such that for all θ ∈ T_{t,T}: v(t, x) − ε ≤ E[ ∫_t^θ f(s, X_s^{t,x}, α_s^ε) ds + v(θ, X_θ^{t,x}) ] (7). We can then conclude that v(t, x) = sup_{α∈A(t,x)} E[ ∫_t^θ f(s, X_s^{t,x}, α_s) ds + v(θ, X_θ^{t,x}) ] (8) for any stopping time θ ∈ T_{t,T}.
  • 16. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Dynamic Programming Principle Hamilton Jacobi Bellman equation I) Introduction to optimal stochastic problem 1 Optimal stochastic problem theory 2 Applications in finance 3 Dynamic programming principle 4 Hamilton Jacobi Bellman equation 16 / 74
  • 17. Formal derivation of HJB Assume that the value function is smooth enough (i.e. C^2) to apply Itô's formula. For any a ∈ A and a controlled process X^{t,x}, apply Itô's formula to v(s, X_s^{t,x}) between s = t and s = t + h: v(t+h, X_{t+h}^{t,x}) = v(t, x) + ∫_t^{t+h} (∂v/∂t + L^a v)(s, X_s^{t,x}) ds + (local) martingale, where for a ∈ A, L^a is the second-order operator associated with the diffusion X under the constant control a: L^a w = b(x, a)·∇_x w + (1/2) tr(σ(x, a)σ'(x, a) ∇_x^2 w). Plug into the DPP, divide by h, send h to zero, and obtain by the mean-value theorem the so-called HJB equation.
  • 18. Formal derivation of HJB The parabolic HJB equation: −∂v/∂t(t, x) + H_1(t, x, ∇_x v(t, x), ∇_x^2 v(t, x)) = 0, ∀(t, x) ∈ [0, T[ × R^n, (9) where for all (t, x, p, M) ∈ [0, T[ × R^n × R^n × S^n: H_1(t, x, p, M) = sup_{a∈A} { −b(x, a)·p − (1/2) tr(σσ'(x, a) M) − f(t, x, a) }. (10) The elliptic HJB equation: βv(x) − H_2(x, ∇_x v(x), ∇_x^2 v(x)) = 0, ∀x ∈ R^n, where for all (x, p, M) ∈ R^n × R^n × S^n: H_2(x, p, M) = sup_{a∈A} { b(x, a)·p + (1/2) tr(σ(x, a)σ'(x, a) M) + f(x, a) }.
  • 20. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Probabilistic approach Numerical/Deterministic approach with PDEs II) Resolution methods 1 Probabilistic approach 2 PDE approach 20 / 74
  • 22. Probabilistic approach Approximate the process X_t with a Markov chain (ξ_n)_n such that ξ_0 = x; under some conditions, (ξ_n) converges in law to X_t. Monte Carlo algorithms are among the methods most widely used to obtain a numerical approximation. Case g = 0: let X^{(1)}, ..., X^{(n)} be an i.i.d. sample drawn from the distribution of X_T^{t,x}, and compute the mean v̂_n(t, x) := (1/n) Σ_{i=1}^n f(X^{(i)}). Law of Large Numbers: v̂_n(t, x) → v(t, x) P-a.s. Central Limit Theorem: √n (v̂_n(t, x) − v(t, x)) → N(0, Var f(X_T^{t,x})) in distribution.
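The LLN/CLT statements above translate directly into an estimator with a standard error. A small C++ sketch follows; the sample values are assumed to be the draws f(X_T^{(i)}) produced by a simulator elsewhere.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Monte Carlo estimate of E[f(X_T)] together with the CLT-based standard
// error sqrt(Var/n), so that mean +/- 1.96 * std_error gives an
// approximate 95% confidence interval.
struct McResult { double mean; double std_error; };

McResult monte_carlo(const std::vector<double>& samples) {
    double sum = 0.0, sum_sq = 0.0;
    for (double v : samples) { sum += v; sum_sq += v * v; }
    const double n = static_cast<double>(samples.size());
    const double mean = sum / n;
    // unbiased sample variance via the shifted second moment
    const double var = (sum_sq / n - mean * mean) * n / (n - 1.0);
    return {mean, std::sqrt(var / n)};
}
```

The 1/√n decay of the standard error is exactly the convergence rate predicted by the CLT on the slide.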
  • 27. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Probabilistic approach Numerical/Deterministic approach with PDEs II) Resolution methods 1 Probabilistic approach 2 PDE approach 27 / 74
  • 28. Steps the PDE approach is based on: Step 1: discretization of the time and space sets / approximation of the derivatives. Step 2: discretization of the boundary conditions (Dirichlet/Neumann). Step 3: solving the problem (policy/value iteration, Howard) for v, the value function, and the optimal control strategy/stopping time. 28 / 74
  • 30. Time and space discretization Let Ω = [0, 1], ∆t = T/N with N ∈ N*, t_k := k∆t for k = 0..N, h the step in space, x_j = jh. Ω_h, L_h^α, v_j^k, b_j^{k,α}, a_j^{k,α} approximate Ω, L^α, v(t_k, x_j), b(t_k, x_j, α), a(t_k, x_j, α). Approximation of the first derivative: ∂v/∂x(t_k, x_j) := (v_{j+1}^k − v_{j−1}^k)/(2h) (11), ∂v/∂x(t_k, x_j) := (v_{j+1}^k − v_j^k)/h (12), or ∂v/∂x(t_k, x_j) := (v_j^k − v_{j−1}^k)/h (13). Approximation of the second derivative: ∂²v/∂x²(t_k, x_j) := (v_{j+1}^k − 2v_j^k + v_{j−1}^k)/h² (14). Approximation of the time derivative: ∂v/∂t(t_k, x_j) := (v_j^k − v_j^{k−1})/∆t (15) or ∂v/∂t(t_k, x_j) := (v_j^{k+1} − v_j^k)/∆t (16).
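The stencils (11)–(14) can be checked on a smooth test function; a short C++ sketch (the quadratic test function in the test is an arbitrary choice):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// First-derivative stencils (11)-(13) and second-derivative stencil (14)
// from the slide.  Central and second differences are O(h^2) accurate,
// the one-sided differences are O(h).
double central_diff(const std::function<double(double)>& v, double x, double h) {
    return (v(x + h) - v(x - h)) / (2.0 * h);   // (11)
}
double forward_diff(const std::function<double(double)>& v, double x, double h) {
    return (v(x + h) - v(x)) / h;               // (12)
}
double backward_diff(const std::function<double(double)>& v, double x, double h) {
    return (v(x) - v(x - h)) / h;               // (13)
}
double second_diff(const std::function<double(double)>& v, double x, double h) {
    return (v(x + h) - 2.0 * v(x) + v(x - h)) / (h * h);  // (14)
}
```

In the upwind schemes later in the deck, (12) or (13) is selected according to the sign of the drift.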
  • 32. Dirichlet boundary conditions: v = g on ∂Ω × [0, T[. Neumann boundary conditions: ∂v/∂x = g₂ on ∂Ω × [0, T[. In the case f = 0 and g = x^p/p, p ∈ ]0, 1[: v_j^N = g_j = x_j^p/p, and (v_M^k − v_{M−1}^k)/h = (p/x_M) v_M^k ≈ x_M^{p−1}, k ∈ 0..N−1, j ∈ 0..M; alternatively v_M^k = v_{M−1}^k or v_M^k = 0, and v_0^k = 0. NB: in the portfolio allocation problem → Black–Scholes–Merton model for the stock: dS_t = µS_t dt + σS_t dW_t, dS_t^0 = rS_t^0 dt. 32 / 74
  • 33. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Merton portfolio allocation Problem Investment/consumption Problem III) Financial applications 1 Merton portfolio allocation Problem 2 Investment/consumption Problem 33 / 74
  • 35. Application 1: Merton portfolio allocation problem in finite horizon An agent invests at any time t a proportion α_t of his wealth X in a stock of price S, and 1 − α_t in a bond of price S^0 with interest rate r. The dynamics of the controlled wealth process is: dX_t = (X_t α_t / S_t) dS_t + (X_t (1 − α_t) / S_t^0) dS_t^0. Utility maximization problem at a finite horizon T: v(t, x) = sup_{α∈A} E[ U(X_T^{t,x}) ], ∀(t, x) ∈ [0, T] × (0, ∞). HJB equation for Merton's problem: v_t + rx v_x + sup_{a∈A} { a(µ − r) x v_x + (1/2) x² a² σ² v_xx } = 0 (17), v(T, x) = U(x) (18).
  • 36. Utility function U is C^1, strictly increasing and concave on (0, ∞), and satisfies the Inada conditions: U'(0) = ∞, U'(∞) = 0. Convex conjugate of U: Û(y) := sup_{x>0} [U(x) − xy]. We use the CRRA utility function: U(x) = x^p/p, p < 1, p ≠ 0. Relative Risk Aversion (RRA): −xU''(x)/U'(x) = 1 − p. → If the person experiences an increase in wealth, he/she will choose to increase (or keep unchanged, or decrease) the fraction of the portfolio held in the risky asset if relative risk aversion is decreasing (or constant, or increasing).
  • 37. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab Merton portfolio allocation Problem Investment/consumption Problem III) Financial applications 1 Merton portfolio allocation Problem 2 Investment/consumption Problem 37 / 74
  • 38. Investment/consumption problem on infinite horizon The SDE governing the wealth process: dX_t = X_t (α_t µ + (1 − α_t) r − c_t) dt + X_t α_t σ dW_t. The goal is to maximize over strategies (α, c) the expected utility from intertemporal consumption up to a random time horizon τ: v(x) = sup_{(α,c)∈A×C} E[ ∫_0^τ e^{−βt} u(c_t X_t^x) dt ]. τ is independent of F_∞; denote by F(t) = P[τ ≤ t] = P[τ ≤ t | F_∞] the distribution function of τ. Assume an exponential distribution for the random time horizon: 1 − F(t) = e^{−λt} for some positive constant λ. Infinite horizon problem: v(x) = sup_{(α,c)∈A×C} E[ ∫_0^∞ e^{−(β+λ)t} u(c_t X_t^x) dt ].
  • 39. The associated HJB equation is β̂ v(x) − sup_{a∈A, c≥0} [L^{a,c} v(x) + u(cx)] = 0, x ≥ 0, (19) where β̂ = β + λ and L^{a,c} v(x) = x(aµ + (1 − a)r − c) v'(x) + (1/2) x² a² σ² v''(x). Explicit solution The discount factor β must satisfy β > ρ − λ. Then v(x) = K u(x) solves the HJB equation, where K = ((1 − p)/(β + λ − ρ))^{1−p} and ρ = ((µ − r)²/(2σ²)) · p/(1 − p) + rp. The optimal controls are the constants (â, ĉ): â = argmax_{a∈A} [a(µ − r) + r − (1/2) a² (1 − p) σ²], ĉ = (1/x) (v'(x))^{1/(p−1)}.
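These closed-form quantities are easy to tabulate. A C++ sketch of the formulas above follows; the parameter values in the test are arbitrary, and the argmax â is taken unconstrained (first-order condition), ignoring the constraint set A.

```cpp
#include <cassert>
#include <cmath>

// Closed-form quantities of the infinite-horizon investment/consumption
// problem with CRRA utility u(x) = x^p / p, transcribed from the slide.
// Well-posedness requires beta > rho - lambda.
struct MertonSolution { double rho, K, a_hat, c_hat; };

MertonSolution solve(double mu, double r, double sigma, double p,
                     double beta, double lambda) {
    const double rho =
        (mu - r) * (mu - r) / (2.0 * sigma * sigma) * p / (1.0 - p) + r * p;
    const double K = std::pow((1.0 - p) / (beta + lambda - rho), 1.0 - p);
    // unconstrained maximizer of a(mu-r) + r - 0.5 a^2 (1-p) sigma^2
    const double a_hat = (mu - r) / ((1.0 - p) * sigma * sigma);
    // with v = K x^p / p, c_hat = (1/x)(v'(x))^{1/(p-1)} = K^{1/(p-1)}
    const double c_hat = std::pow(K, 1.0 / (p - 1.0));
    return {rho, K, a_hat, c_hat};
}
```

Note that ĉ comes out constant, matching the slide's claim that the optimal controls are constants.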
  • 43. Why the Markov chain approach? Solving the discretized system requires some conditions on the matrix A of the differential operator L^α. In the case where A is not positive definite, we can obtain a discretization scheme that satisfies the "discrete maximum principle". Under a specific condition on the space discretization step h we get a convergent Markov chain [page 89, A. Sulem, J.-P. Philippe, Méthodes numériques en contrôle stochastique]. The convergence of the scheme can be established using the standard arguments of H. J. Kushner [Numerical Methods for Stochastic Control Problems in Continuous Time]. NB: depending on the sign of the drift b of X_t, we use the right-hand-side upwind scheme when b is positive and the left-hand-side upwind scheme when b is negative, so as to obtain genuine transition probabilities (in [0, 1]).
  • 44. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab For the investment problem For the investment/consumption problem IV) Numerical results on C++ and Scilab 1. Results for the investment problem Approximated scheme Resolution method/Coding Results 2. Results for the investment/consumption problem Approximated scheme Resolution method/Coding Results 44 / 74
  • 45. Approximated scheme Two different schemes were used. The forward upwind scheme: the approximated HJB equation is v_j^{k−1} = sup_α { [1 − (∆t/h)|b_j^{k,α}| − (∆t/h²) a_j^{k,α}] v_j^k + [(∆t/h)(b_j^{k,α})⁺ + (1/2)(∆t/h²) a_j^{k,α}] v_{j+1}^k + [(∆t/h)(b_j^{k,α})⁻ + (1/2)(∆t/h²) a_j^{k,α}] v_{j−1}^k }, v_j^N = g_j. Denote p_j^α = p(x_j, x_j | α), p_j^{α+} = p(x_j, x_{j+1} | α), p_j^{α−} = p(x_j, x_{j−1} | α) the transition probabilities that define the transition matrix A^α. Matrix notation: v^{k−1} = sup_α (I − ∆t A^α) v^k. The explicit solution is given in [1]:
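One sweep of this forward upwind scheme can be sketched in C++ as follows. The drift and diffusion coefficients are passed in as generic callables, and boundary nodes are simply frozen: a simplifying assumption, not the boundary treatment used in the internship code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// One explicit sweep v^k -> v^{k-1} of the forward upwind scheme: each
// interior node takes the sup over a control grid of a convex combination
// of v_j, v_{j+1}, v_{j-1} weighted by the scheme's "transition
// probabilities".  drift(x,a) plays the role of b, diff2(x,a) of a^{alpha}.
std::vector<double> upwind_step(const std::vector<double>& v, double dt, double h,
                                const std::vector<double>& x,
                                const std::vector<double>& controls,
                                const std::function<double(double, double)>& drift,
                                const std::function<double(double, double)>& diff2) {
    std::vector<double> out = v;  // boundary nodes are kept frozen here
    for (std::size_t j = 1; j + 1 < v.size(); ++j) {
        double best = -1e300;
        for (double a : controls) {
            const double b = drift(x[j], a), s = diff2(x[j], a);
            const double p0 = 1.0 - dt / h * std::fabs(b) - dt / (h * h) * s;
            const double pp = dt / h * std::max(b, 0.0) + 0.5 * dt / (h * h) * s;
            const double pm = dt / h * std::max(-b, 0.0) + 0.5 * dt / (h * h) * s;
            best = std::max(best, p0 * v[j] + pp * v[j + 1] + pm * v[j - 1]);
        }
        out[j] = best;
    }
    return out;
}
```

A CFL-type condition ∆t(|b|/h + a/h²) ≤ 1 keeps the three weights in [0, 1]; they sum to 1, so each node value is a genuine expectation under the approximating Markov chain.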
  • 47. Algorithm C++ Algorithm of the forward scheme. Initialization: ∀j ∈ 0..M, v_j^N = √x_j. For all k from N − 1 down to 0: set v_0^k = 0 and compute v_j^k := v(t_k, x_j) = sup_{α_i} w(t_k, x_j, α_i). For all j ∈ 1..M−1 and each α_i ∈ [â − ε, â + ε]: compute (b_j^{α_i})⁺ and (b_j^{α_i})⁻, then solve v_j^k = sup_{α_i} { [1 − (∆t/h)|b_j^{α_i}| − (∆t/h²) a_j^{α_i}] v_j^{k+1} + [(∆t/h)(b_j^{α_i})⁺ + (1/2)(∆t/h²) a_j^{α_i}] v_{j+1}^{k+1} + [(∆t/h)(b_j^{α_i})⁻ + (1/2)(∆t/h²) a_j^{α_i}] v_{j−1}^{k+1} } with the terminal condition v_j^N given above.
  • 49. Results The shapes of the approximated value function and of the explicit solution are very close at time 0. A very small difference is observed near the boundary x = x_M.
  • 50. Results Error in the value function (of order 10⁻³). The implementation requires a large number of points (the larger N is, the larger M must be).
  • 51. Results Control: results are satisfying. The error grows from one time step to the next near the boundary of the domain Ω.
  • 52. Results The error is estimated at 2 × 10⁻².
  • 53. The shape of the value function We can draw the shape of the approximated value function as a function of time and space, since we store the successive values in an Excel file.
  • 54. Backward scheme The backward upwind scheme: the approximated HJB equation is v_j^k = v_j^{k+1} + sup_α { [−(∆t/h)|b_j^α| − (∆t/h²) a_j^α] v_j^k + [(∆t/h)(b_j^α)⁺ + (1/2)(∆t/h²) a_j^α] v_{j+1}^k + [(∆t/h)(b_j^α)⁻ + (1/2)(∆t/h²) a_j^α] v_{j−1}^k }, with v_j^N = g_j and (v_N^k − v_{N−1}^k)/h = (p/x_N) v_N^k, for k ∈ 0..M−1, j ∈ 0..N. Denote p_j^α = −(∆t/h)|b_j^α| − (∆t/h²) a_j^α, p_j^{α+} = (∆t/h)(b_j^α)⁺ + (1/2)(∆t/h²) a_j^α, p_j^{α−} = (∆t/h)(b_j^α)⁻ + (1/2)(∆t/h²) a_j^α, the coefficients that define a Markov chain with transition matrix A^α. Matrix notation: sup_α { (I + ∆t A_h^α) v^{k+1} − v^k } = 0.
  • 56. Algorithm in Scilab The Howard algorithm [3][7] allows us to solve min_{α∈A} (B(α)x − b), where B(α) is defined by B(α)_ij = B(α_i)_ij = (I + δt A(α_i))_ij. 1. Initialize α⁰ in A. 2. Iterate for k ≥ 0: (i) find x^k ∈ R^N solution of B(α^k) x^k = b; (ii) α^{k+1} := argmin_{α∈A^N} (B(α) x^k − b). 3. k = k + 1. Note that at each iteration, we have to find the control value α.
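The Howard iteration above can be sketched on a toy problem. The following C++ is a minimal illustration with a small dense Gaussian solver; the matrices in the test are arbitrary, and this is not the Scilab code of the internship.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Howard (policy iteration) for min_alpha (B(alpha) x - b) = 0, where row i
// of B(alpha) is row i of the matrix attached to the control chosen at node i.
using Mat = std::vector<std::vector<double>>;

// Dense Gaussian elimination with partial pivoting, A x = rhs.
std::vector<double> gauss_solve(Mat A, std::vector<double> rhs) {
    const int n = static_cast<int>(rhs.size());
    for (int c = 0; c < n; ++c) {
        int piv = c;
        for (int r = c + 1; r < n; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[piv][c])) piv = r;
        std::swap(A[c], A[piv]); std::swap(rhs[c], rhs[piv]);
        for (int r = c + 1; r < n; ++r) {
            const double f = A[r][c] / A[c][c];
            for (int k = c; k < n; ++k) A[r][k] -= f * A[c][k];
            rhs[r] -= f * rhs[c];
        }
    }
    std::vector<double> x(n);
    for (int r = n - 1; r >= 0; --r) {
        double s = rhs[r];
        for (int k = r + 1; k < n; ++k) s -= A[r][k] * x[k];
        x[r] = s / A[r][r];
    }
    return x;
}

std::vector<double> howard(const std::vector<Mat>& B, const std::vector<double>& b) {
    const int n = static_cast<int>(b.size()), m = static_cast<int>(B.size());
    std::vector<int> policy(n, 0);
    for (int iter = 0; iter < 100; ++iter) {
        Mat A(n);
        for (int i = 0; i < n; ++i) A[i] = B[policy[i]][i];  // row-wise assembly
        std::vector<double> x = gauss_solve(A, b);
        std::vector<int> next(n);
        for (int i = 0; i < n; ++i) {            // pointwise policy improvement
            double best = 1e300;
            for (int a = 0; a < m; ++a) {
                double val = -b[i];
                for (int k = 0; k < n; ++k) val += B[a][i][k] * x[k];
                if (val < best) { best = val; next[i] = a; }
            }
        }
        if (next == policy) return x;            // policy fixed point reached
        policy = next;
    }
    return {};
}
```

Each pass performs one exact linear solve for the frozen policy and then one pointwise minimization, which is why Howard typically converges in very few iterations.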
  • 61. Results: Value function The approximated value function is very close to the optimal solution.
  • 62. Results: Error between value functions Let us illustrate the error between both functions: an error of around 10⁻³. The error increases near the boundary in x, which can be explained by the boundary conditions used in the model.
  • 63. Results: Optimal control α The shape of the optimal control α compared to the explicit solution. Same comments regarding the terminal condition imposed on x.
  • 64. Results: Error between control solutions In the Howard algorithm, Dirichlet boundary conditions and then Neumann boundary conditions were used ⇒ Neumann conditions give better results.
  • 65. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab For the investment problem For the investment/consumption problem IV) Numerical results on C++ and Scilab 1. Results for the investment problem Approximated scheme Resolution method/Coding Results 2. Results for the investment/consumption problem Approximated scheme Resolution method/Coding Results 65 / 74
  • 66. Introduction to the Markov chain approach There exist k > 0 and a Markov matrix Mα_h verifying Aα_h = −β̂ I_h + (1/k)(Mα_h − I_h), i.e. Mα_h = I_h + k(Aα_h + β̂ I_h) (20). Hence (Mα_h)_ij = 1 + k(β̂ + (Aα_h)_ii) if i = j, and k(Aα_h)_ij if i ≠ j. We choose k such that k ≤ 1/(β̂ + |(Aα_h)_ii|) for all i = 1, ..., d, which makes all matrix coefficients (Mα_h)_ij nonnegative; the row sums Σ_j (Mα_h)_ij equal 1 with Neumann conditions and are < 1 with Dirichlet conditions. Equation (20) can then be written as sup_{α∈A} (Mα_h − I_h − β̂ k I_h) v_h + k û_h = 0 ⇒ the HJB equation of a control problem for a Markov chain with discount rate β̂ h, instantaneous cost k û_h and transition matrix Mα_h.
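The construction Mα_h = I_h + k(Aα_h + β̂ I_h) and the effect of the boundary conditions on the row sums can be checked on a small example. The sketch below is illustrative only: it assumes a 1-D discrete Laplacian as Aα_h (not the deck's operator) and takes β̂ = 0 for simplicity; the names `laplacian` and `markov` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical 1-D discrete Laplacian generator on n points (h = 1):
// Neumann (reflecting) boundary rows have zero row sum, while
// Dirichlet (absorbing) boundary rows have a negative row sum.
Matrix laplacian(int n, bool neumann) {
    Matrix A(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        A[i][i] = -2.0;
        if (i > 0)     A[i][i - 1] = 1.0;
        if (i < n - 1) A[i][i + 1] = 1.0;
    }
    if (neumann) { A[0][0] = -1.0; A[n - 1][n - 1] = -1.0; }
    return A;
}

// M = I + k*A (beta-hat taken as 0 in this sketch). With
// k <= 1 / max_i |A_ii|, every entry of M is nonnegative, so M is a
// (sub)stochastic transition matrix as claimed on the slide.
Matrix markov(const Matrix& A, double k) {
    int n = static_cast<int>(A.size());
    Matrix M(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            M[i][j] = (i == j ? 1.0 : 0.0) + k * A[i][j];
    return M;
}

double row_sum(const Matrix& M, int i) {
    double s = 0.0;
    for (double v : M[i]) s += v;
    return s;
}
```

With k = 0.4 ≤ 1/2, the Neumann matrix has all row sums exactly 1 (a true Markov chain), while the Dirichlet one has boundary row sums strictly below 1, matching the dichotomy on the slide.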
  • 67. Explicit Value function The shape of the explicit solution of the problem using CRRA utility function:
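As a reminder of the utility behind this explicit solution, here is a minimal sketch of a CRRA (power) utility U(x) = x^p / p; the exponent p is illustrative, since the deck's exact parameter is not shown.

```cpp
#include <cassert>
#include <cmath>

// CRRA (power) utility U(x) = x^p / p for p in (0, 1).
// The value of p used here is illustrative only.
double crra(double x, double p) { return std::pow(x, p) / p; }

// Relative risk aversion -x U''(x) / U'(x) = 1 - p, constant in x,
// which is what the acronym CRRA (Constant Relative Risk Aversion) means.
double relative_risk_aversion(double p) { return 1.0 - p; }
```

The constancy of 1 − p is what makes the Merton value function separate into a power of wealth times a function of time, yielding the explicit solution plotted on this slide.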
  • 68. Approximated value function At the terminal set, the value function goes to infinity.
  • 69. The shapes of the explicit and approximated solutions, disregarding the terminal set of x: the results are satisfactory.
  • 70. Error The error is estimated at 5·10^−2 and is larger near the terminal set of x.
  • 71. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab For the investment problem For the investment/consumption problem Comments 71 / 74
  • 72. Optimal stochastic problem theory Resolution methods Financial applications Numerical results on C++ and Scilab For the investment problem For the investment/consumption problem Conclusion Optimal stochastic control is an interesting field of research. Merton portfolio allocation, without and with consumption, provides classic examples. Numerical methods (forward and backward methods, Howard/policy iteration) approximate the optimal solutions and must satisfy stability, consistency and convergence ⇒ a controlled Markov chain has been used. The numerical results were satisfactory despite the error related to the chosen boundary conditions. The DPP requires a minimum of smoothness of the value function in order to apply Itô's formula, which is not always the case ⇒ the viscosity-solution approach is widely used in finance. One can imagine more complicated problems, such as investment problems with transaction costs (singular optimal control problems): which methods should be used to model their solutions? 72 / 74
  • 73. References D. Lamberton and B. Lapeyre, Une Introduction au Calcul Stochastique Appliquée à la Finance. Éditions Eyrolles, 1997. H. Pham, Continuous-time Stochastic Control and Optimization with Financial Applications. Springer, 2008. Jean-Philippe Chancelier and Agnès Sulem, Méthode numérique en contrôle stochastique. Le Cermics, 22 February 2005. H. J. Kushner and P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time. Springer Verlag, 1992. S. Crépey, Financial Modeling. Springer, 2013. http://www.cmap.polytechnique.fr/ touzi/Fields-LN.pdf http://www.math.fsu.edu/ pgarreau/files/merton.pdf