Machine learning 1
1 Introduction
6 Numerical Methods.
M. Escobar-Anel (Western University), FM9590 Stochastic Processes with Applications in Finance and Actuarial Science, July 17, 2024
Introduction
Details
Lecture time and place:
Thursday 8:30 - 11:20 am, WSC248.
Recommended Textbooks:
Tomas Bjork (2009). Arbitrage Theory in Continuous Time. Oxford University
Press, Oxford.
Bingham & Kiesel (2004): Risk-neutral valuation, 2nd ed.
Note that these textbooks will only be used as a guide. The instructor will use
his own set of course notes during lectures.
Sept 9-13:
Stopping times, modes of convergence. Variations of paths, Quadratic
variation, p− variation, Quadratic covariation, differentiability. Definition of
Brownian motion (BM). Pertinent results from normal distribution.
d-dimensional Wiener process.
Sept 23-27:
Riemann integral versus Itô integral. Itô integral of an elementary integrand.
Properties of Itô integral: linearity and the martingale property. Itô isometry. Itô
integral of a general integrand.
Sept 30 - Oct 4:
Itô’s process. Itô’s differentiation rule. Derivation of Itô’s formula. Quadratic
variation of an Itô process. Quadratic variation of a geometric Brownian
motion. Itô’s formula in multidimensions. The product rule.
Oct 7-11:
Example of Itô process: Vasicek model, Geometric Brownian motion.
Existence and Uniqueness of solutions of Stochastic Differential Equations
(SDE). Change of measure (Girsanov’s theorem) and applications in finance.
Martingale Representation theorem.
Oct 14-18:
(Reading Week for undergrads).
Midterm, 17-Oct,
Nov 4-8:
The Black-Scholes model. Change of measure, EMM and completeness in the
BS-model. The Black-Scholes Formula. Proofs via PDE and Risk-neutral
valuation. The Greeks: Delta (-hedging), Gamma, Theta, Vega, Rho. Implied
volatility and volatility surface. Options on Dividend paying assets.
Nov 11-15:
Volatility modeling: volatility as a deterministic function, generalized BS.
stochastic volatility, Heston 1993 model. The Feller condition. The
characteristic function. stochastic volatility, Stein and Stein 1991 model.
stochastic volatility, Hull and White 1987 model.
Dec 2- 6:
Introduction to the pricing of default-free zero-coupon bonds. Term structure
modelling and example of classical models. The bond price in the Vasiček
model. Bond price solution via the PDE approach.
Dec 9 -22
Final Exam, Wednesday Dec 11, 9am - noon.
Overview
Chapter I
Results from statistics and stochastic
analysis
IP(A) = 0 ⇒ Q(A) = 0, ∀A ∈ F.
IP(A) = 0 ⇔ Q(A) = 0, ∀A ∈ F.
The probability measure IPX (B) = IP(X −1 (B)) = IP(X ∈ B), B ∈ B(R), is called
distribution of X .
Definition 7 (Density).
Let X be a random variable on (Ω, F, IP). The function F : R → [0, 1], with
Let X be a random variable and ϵ > 0 such that IE [etX ] < ∞ for all t with |t| < ϵ.
Then,
MX (t) := IE [etX ], |t| < ϵ,
is called the moment generating function of X .
ϕX (t) := IE [eitX ], t ∈ R,
Remark 1.
From the inverse Fourier transform:
$$F_X'(x) = f_X(x) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-itx}\,\phi_X(t)\,dt.$$
Lévy's theorem:
$$F_X(b) - F_X(a) = \frac{1}{2\pi}\lim_{T\to\infty}\int_{-T}^{T}\frac{e^{-ita} - e^{-itb}}{it}\,\phi_X(t)\,dt.$$
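$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad x \in \mathbb{R}.$$
[Figure: plot of the density f(x) against x.]
$$M_X(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right), \quad \phi_X(t) = \exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right), \quad t \in \mathbb{R}.$$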
Remark 2 (Properties).
Let X be a random vector on (Ω, F, IP). The function F : Rd → [0, 1], with
where t = (t1 , . . . , td ).
[Figure: surface plot of a two-dimensional density over (x1, x2).]
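$$f(x) = \frac{1}{\sigma x\sqrt{2\pi}}\exp\left(-\frac{(\log x-\mu)^2}{2\sigma^2}\right)\mathbf{1}_{\{x>0\}}.$$
[Figure: density f(x) for mu = 0 and σ = 0.25, 0.5, 1, 2.]
$$\mathbb{E}[X^n] = \exp\left(n\mu + \frac{n^2\sigma^2}{2}\right), \quad n \in \mathbb{N}.$$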
Mean and variance of a log-normal distribution are thus given by
$$\mathbb{E}[X] = e^{\mu+\sigma^2/2}, \quad \mathrm{Var}(X) = \left(e^{\sigma^2}-1\right)e^{2\mu+\sigma^2}.$$
$$f(k) = \mathbb{P}(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k \in \{0, 1, 2, \dots\}.$$
[Figure: probability mass function f(x) for lambda = 3.]
Remark 3 (Properties).
A Poisson distribution is often used to model the random number of discrete
occurrences during a given time-interval. The expected number of occurrences
is λ = IE [X ].
The mean and variance of the Poisson distribution are both equal to λ, i.e. µ = σ² = λ.
Moreover, skewness = $\lambda^{-1/2}$ and kurtosis = $3 + \lambda^{-1}$.
Take independent X ∼ Poi(λX ), Y ∼ Poi(λY ). Then X + Y ∼ Poi(λX + λY ).
Example 1.
The Poisson distribution appears, for example, in the following counting events:
The number of defaults, shocks to a stock, claims to an insurance company for
a given period.
The number of white blood cells found in a cubic centimetre of blood.
$$f(x) = \lambda e^{-\lambda x}\,\mathbf{1}_{\{x\ge 0\}}, \quad \lambda > 0.$$
[Figure: density f(x) for λ = 0.25, 0.5, 0.75, 2.]
Remark 4 (Properties).
Example 2.
Let {X (i) }i∈N be a sequence of i.i.d. random variables with IE [|X (1) |] < ∞. Then, the
sample average converges almost surely (a.s., to be defined later) to the expected
value, i.e.
$$\frac{1}{n}\sum_{i=1}^{n} X^{(i)} \xrightarrow{n\to\infty} \mathbb{E}[X^{(1)}] \quad \text{a.s.}$$
Let {X (i) }i∈N be a sequence of i.i.d. random variables with mean µ and variance
σ 2 < ∞. For Sn := X (1) + X (2) + . . . + X (n) , it holds that
$$Y^{(n)} := \frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{n\to\infty} N(0,1) \quad \text{in distribution.}$$
Remark 5.
Both theorems hold irrespective of the shape of the original distribution. Modes of
convergence are recalled later.
Theorem 3 (Radon-Nikodým).
Let IP and Q be measures on the measurable space (Ω, F), where IP is σ-finite and
Q is finite. Then Q ≪ IP holds, if and only if there exists an integrable, IP-a.s.
non-negative function f (i.e. the set where f is negative has IP-probability zero), such that
$$Q(A) = \int_A f(\omega)\,d\mathbb{P}(\omega), \quad \forall A \in \mathcal{F}.$$
Notation:
$$f = \frac{dQ}{d\mathbb{P}}.$$
Example 3 (Radon-Nikodým).
Let X be an integrable random variable on the probability space (Ω, F, IP) and let
G ⊂ F be a sub-σ-algebra of F. The conditional expectation of X under G is
defined as the IP-a.s. unique G-measurable function IE [X |G] satisfying
$$\int_A X(\omega)\,d\mathbb{P}(\omega) = \int_A \mathbb{E}[X|\mathcal{G}](\omega)\,d\mathbb{P}(\omega), \quad \forall A \in \mathcal{G}.$$
Example 4.
Let X be an integrable random variable on the probability space (Ω, F, IP) and let
G ⊂ F be a sub-σ-algebra of F. Then
1. IE [X |{∅, Ω}] = IE [X ].
2. IE [X |F] = X , IP-a.s.
3. If X is G-measurable, then IE [X |G] = X , IP-a.s.
4. Taking out what is known:
If Z is an integrable random variable and X is G-measurable (with ZX integrable), then
IE [ZX |G] = X IE [Z |G], IP-a.s.
Let X be an integrable random variable on the probability space (Ω, F, IP) and let
G ⊂ F be a sub-σ-algebra of F. Then
1. Tower property: For each sub-σ-algebra H ⊂ G ⊂ F we get
IE [IE [X |G]|H] = IE [X |H], IP-a.s.
2. Independence: If X is independent of G, then
IE [X |G] = IE [X ], IP-a.s.
Remark 7.
Regarding the interpretation of the conditional expectation, the particular σ-algebra
represents the level of information based on which the expectation is computed.
The σ-algebra F itself stands for complete information.
The sub-σ-algebra G ⊂ F stands for partial information.
The trivial σ-algebra G = {∅, Ω} contains no information.
Definition 22 (Filtration).
F = {Ft }t≥0 ,
satisfying
Fs ⊂ Ft ⊂ F, ∀ 0 ≤ s ≤ t < ∞.
The quadruple (Ω, F, F, IP) is called filtered probability space.
Remark 8.
Ft represents the level of information up to time t, whereas F = {Ft }t≥0 represents
the flow of information with respect to time.
If the usual conditions are not satisfied, the filtration can be completed:
In the case of a not completed filtration, this is called IP-completion.
In the case of a not right continuous filtration, it is called IP-augmentation.
X = {Xt }t≥0
Let (Ω, F, F, IP) be a filtered probability space and let X be a stochastic process,
adapted to F. The natural filtration FX is defined as the family of σ-algebras
Ft := σ(Xs : 0 ≤ s ≤ t), ∀ t ≥ 0.
It is the smallest σ-algebra (i.e. the intersection of all σ-algebras satisfying this
property) such that X is an adapted process.
Remark 10.
The intersection of (any number of) σ-algebras is again a σ-algebra.
The natural filtration is a generalization of the σ-algebra generated by a random variable.
Let (Ω, F, F, IP) be a filtered probability space. A stopping time (w.r.t. the filtration
F) is a B([0, ∞])-measurable random variable
τ : Ω → [0, ∞]
satisfying
{τ ≤ t} = {ω ∈ Ω : τ (ω) ≤ t} ∈ Ft , ∀ t ≥ 0.
Lemma 9.
Let τ1 and τ2 be stopping times. Then
min{τ1 , τ2 } =: τ1 ∧ τ2 , max{τ1 , τ2 } =: τ1 ∨ τ2
All random variables considered below are suitably integrable and defined on
(Ω, F, IP). A sequence {X (n) }n∈N of rvs converges to the rv X . . .
. . . almost surely (a.s.), if
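$$\mathbb{P}\left(\lim_{n\to\infty} X^{(n)} = X\right) = 1.$$
. . . in mean square, if
$$\mathbb{E}\left[(X^{(n)} - X)^2\right] \to 0 \quad (n \to \infty).$$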
Working on the probability space (Ω, F, IP), the following relations hold:
almost surely (a.s.) ⇒ in probability
in mean square ⇒ in probability
in probability ⇒ in distribution
Furthermore, given additional conditions, the following inversions hold:
in probability ⇒ almost surely, if for all ϵ > 0:
$$\lim_{n\to\infty}\mathbb{P}\left(\sup_{m\ge n}|X^{(m)} - X| > \epsilon\right) = 0.$$
$$\langle X, X\rangle^p_t := \lim_{\Delta t_k\to 0}\sum_{t_k\le t}|X_{t_{k+1}} - X_{t_k}|^p.$$
If p = 1, this is called the total variation; if it is finite, X is said to be of bounded (total) variation,
or simply of finite variation.
Write ⟨M, M⟩ for ⟨M⟩ and extend ⟨◦⟩ to the bilinear form (called quadratic
covariation) ⟨◦ , ◦⟩. For this, use the polarization identity
$$\langle M, N\rangle := \frac{1}{4}\left(\langle M+N, M+N\rangle - \langle M-N, M-N\rangle\right) = \frac{1}{4}\left(\langle M+N\rangle - \langle M-N\rangle\right).$$
⟨X , f ⟩t = 0.
We will study Brownian motion next. For two independent Brownian
motions $W^{(1)}$ and $W^{(2)}$ it holds that
$$\langle W^{(1)}, W^{(2)}\rangle_t = 0.$$
Given (Ω, F, F, IP), an adapted process W = {Wt }t≥0 is called Brownian motion, if
W satisfies the following properties.
1. W0 = 0, IP-a.s.
2. W has independent increments, i.e. the random variables
(Wt − Ws ), (Wv − Wu )
Theorem 4 (Wiener).
[Figure: sample path of a Brownian motion plotted against time t.]
Each path of the Brownian motion has infinitely many roots near the origin and
near infinity.
Theorem 5 (Lévy).
The quadratic variation of a Brownian path over [0, t] exists and equals t, i.e.
⟨W ⟩t = t.
Remark 12.
The total variation is unbounded,
$\langle W\rangle^1_t = \infty.$
Xt := −Wt , ∀t ≥ 0.
Self similarity:
$$X^c_t := \frac{1}{c}\,W_{c^2 t}, \quad c > 0, \ \forall t \ge 0.$$
Time-inverted Brownian motion:
Xt := tW1/t , ∀t > 0, X0 := 0.
Xt := Wτ +t − Wτ , ∀t ≥ 0,
Definition 32 (Martingale).
Remark 13.
Show that the following processes are martingales. Let W be a Wiener process on
(Ω, F, F, IP) and F = FW .
Xt := Wt and F = FW .
Xt := Wt2 − t and F = FW .
$X_t := \exp\left(\sigma W_t - \frac{1}{2}\sigma^2 t\right)$, for σ > 0 and F = FW .
Collecting information about a random variable: Let Y be an integrable
random variable on (Ω, F, F, IP). Then the stochastic process
Xt := IE [Y |Ft ], ∀ t ≥ 0,
is a martingale.
{τn }n∈N is said to reduce X and is called a localizing (or fundamental) sequence
for X .
Remark 14.
For example, processes that explode in finite time, e.g. with IE [|Xt |] = ∞ for some t < ∞, are
not martingales but may still be local martingales (such a process can be a martingale on [0, t) but not on
[0, ∞)).
A local martingale X is a martingale if and only if for all T > 0 the set
Then X is a martingale.
If IE [sups≥0 |Xs |] < ∞, then X is a uniformly integrable martingale. In particular,
a bounded local martingale is a uniformly integrable martingale.
Let (Ω, F, F, IP) be a filtered probability space and let X be an F-adapted stochastic
process. X is an $L^2[0,T]$-process, if X is progressively measurable and
$$\|X\|^2 := \mathbb{E}\left[\int_0^T X_t^2\,dt\right] < \infty.$$
The set of all L2 [0, T ]-processes defines a linear vector space, denoted by V T .
Further, write M ∈ M20 if M0 = 0. Finally, write cM2 , cM20 for the subclasses of
(pathwise) continuous M.
Xt2 = X02 + Mt + At ,
⟨X ⟩t := At
we start with the simplest possible integrand X and extend successively to more
complicated stochastic processes.
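If $X_t(\omega) := \mathbf{1}_{(a,b]}(t)$, where 0 ≤ a < b < ∞, there is only one plausible way to define
$\int_0^t X_s\,dW_s$. Define
$$\int_0^t X_s(\omega)\,dW_s(\omega) = \int_0^t \mathbf{1}_{(a,b]}(s)\,dW_s(\omega) := \begin{cases} 0, & \text{if } t \le a,\\ W_t - W_a, & \text{if } a < t \le b,\\ W_b - W_a, & \text{if } t \ge b, \end{cases}$$
which equals $\int_a^t dW_s(\omega)$ for a < t ≤ b.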
and uniformly bounded Ftk -measurable random variables (i.e. ∃ C ∈ R : |ξk | ≤ C for
all k = 0, . . . , n and ω) and if $X_t(\omega)$ can be written in the form
$$X_t(\omega) := \sum_{i=0}^{n-1}\xi_i(\omega)\,\mathbf{1}_{(t_i,t_{i+1}]}(t), \quad 0 \le t \le T.$$
Then, if $t_k \le t < t_{k+1}$,
$$I_t(X) := \int_0^t X_s\,dW_s = \sum_{i=0}^{k-1}\xi_i\,(W_{t_{i+1}} - W_{t_i}) + \xi_k\,(W_t - W_{t_k}) = \sum_{i=0}^{n-1}\xi_i\,(W_{t\wedge t_{i+1}} - W_{t\wedge t_i}).$$
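That is, if $X_t(\omega) = W_{t_0}(\omega)\mathbf{1}_{(t_0,t_1]}(t) + \dots + W_{t_{n-1}}(\omega)\mathbf{1}_{(t_{n-1},t_n]}(t)$, then
$$I_t(X) = \int_0^t X_s\,dW_s = \sum_{i=0}^{n-1} W_{t_i}\,(W_{t_{i+1}\wedge t} - W_{t_i\wedge t}).$$
Note that Stratonovich proposed instead to evaluate the integrand at the average of both endpoints:
$$I_t(X) = \int_0^t X_s\,dW_s = \sum_{i=0}^{n-1}\frac{W_{t_i} + W_{t_{i+1}}}{2}\,(W_{t_{i+1}\wedge t} - W_{t_i\wedge t}).$$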
Remark 17.
What could be an economic interpretation of this result?
In Finance, one does not know the future value of the stock but rather present
value, i.e. take n = T = 2, ti = i, i = 0, 1, 2, then:
one does not know what the price will be at t = 1 + ϵ for any ϵ > 0.
Remark 18.
Note that the quadratic variation of the process $I_t(X)$ is $\int_0^t X_s^2\,ds$. Hence the previous
result provides the expected quadratic variation of $I_t(X)$.
Remark 19.
The Itô isometry suggests that $\int_0^t X_s\,dW_s$ should be defined only for processes
X from $L^2[0,\infty)$, i.e. square integrable:
$$\mathbb{E}\left[\int_0^t X_s^2\,ds\right] < \infty, \quad \forall\, t \ge 0.$$
Remark 20.
So far, we know how to integrate simple processes. We now seek a class of
integrands, which can be suitably approximated by simple integrands. It turns out
that:
1. The suitable class of integrands is the class of (B([0, ∞)) ⊗ F)-measurable,
F-adapted processes X with $\int_0^t \mathbb{E}[X_s^2]\,ds < \infty$ for all t > 0.
2. Each such X may be approximated by a sequence of simple integrands $X^{(n)}$,
so that the stochastic integral $I_t(X) = \int_0^t X_s\,dW_s$ may be defined as the limit of
$I_t(X^{(n)}) = \int_0^t X^{(n)}_s\,dW_s$ as n tends to infinity.
3. The properties from both lemmas above remain true for the stochastic integral
$\int_0^t X_s\,dW_s$ defined by 1. and 2.
4. Details of this construction are given in Øksendal (2000).
Example 7.
Compute
$$\int_0^t W_s\,dW_s = \frac{1}{2}W_t^2 - \frac{1}{2}t.$$
Note the contrast with ordinary (Newton-Leibniz) calculus! ($\int_0^t x\,dx = \frac{1}{2}t^2$, more
generally $\int_0^t g(x)\,dg(x) = \frac{1}{2}g^2(t)$ when g(0) = 0.)
Itô calculus requires the second term on the right - the Itô correction term -
which arises from the quadratic variation of W .
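As a hedged numerical illustration (not part of the original notes), the following Python sketch evaluates the left-endpoint (Itô) sums and the averaged-endpoint sums attributed to Stratonovich above on one simulated path. The function name, the number of steps and the seed are illustrative assumptions.

```python
import math
import random


def ito_and_stratonovich_sums(T=1.0, n=10_000, seed=0):
    """Return (ito_sum, stratonovich_sum, W_T) for one simulated Brownian path."""
    rng = random.Random(seed)
    dt = T / n
    w = 0.0       # current value W_{t_i}
    ito = 0.0     # sum of W_{t_i} * (W_{t_{i+1}} - W_{t_i})
    strat = 0.0   # sum of (W_{t_i} + W_{t_{i+1}}) / 2 * (W_{t_{i+1}} - W_{t_i})
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito += w * dw
        strat += (w + 0.5 * dw) * dw
        w += dw
    return ito, strat, w


ito, strat, w_T = ito_and_stratonovich_sums()
print("Ito sum:          ", ito, " vs W_T^2/2 - T/2 =", 0.5 * w_T**2 - 0.5)
print("Stratonovich sum: ", strat, " vs W_T^2/2       =", 0.5 * w_T**2)
```

The averaged-endpoint sum telescopes to $W_T^2/2$, while the left-endpoint sum differs from it by half the realized quadratic variation, which is close to T/2 on a fine grid.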
is called Itô process. The functions b (drift) and σ (dispersion or volatility) map
from R+ × R to R. To simplify notation, this integral equation is often expressed
symbolically in differential form. In terms of the stochastic differential equation
(SDE)
dXt = bdt + σdWt , X0 = x0 ,
where the arguments of b and σ are suppressed for notational convenience.
is an FW -martingale.
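If $dX_t = X_t\sigma_t\,dW_t$ for a progressively measurable process σ and if Novikov's
condition $\mathbb{E}\left[e^{\frac{1}{2}\int_0^T\sigma_t^2\,dt}\right] < \infty$ holds for all T > 0, then X is a martingale which is given
by
$$X_t = X_0\exp\left(\int_0^t\sigma_s\,dW_s - \frac{1}{2}\int_0^t\sigma_s^2\,ds\right), \quad t \ge 0.$$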
This table can be used as a shorthand for the corresponding properties of the
quadratic (co-)variation.
· dt dWt
dt 0 0
dWt 0 dt
$$df(X_t) = f'(X_t)\,dX_t + \frac{1}{2}f''(X_t)\,d\langle X\rangle_t, \quad f(X_0) = f(x_0).$$
Writing out the integrals, we get
$$f(X_t) = f(x_0) + \int_0^t f'(X_s)\,dX_s + \frac{1}{2}\int_0^t f''(X_s)\,d\langle X\rangle_s.$$
where
$$d\langle X^{(i)}, X^{(j)}\rangle_t = \sum_{k=1}^{m}\sigma_{ik}(t, X_t)\,\sigma_{jk}(t, X_t)\,dt.$$
Then
$$d\left(X^{(1)}X^{(2)}\right)_t = X^{(1)}_t\,dX^{(2)}_t + X^{(2)}_t\,dX^{(1)}_t + d\langle X^{(1)}, X^{(2)}\rangle_t.$$
Again, note the difference to ordinary calculus, where the last term is missing.
Consider the following SDE, which is often used to model short-rates, for the
process r = {rt }t≥0 on (Ω, F, F, IP)
Assume that b and σ are continuous functions, satisfying that for arbitrary positive
constants T and N and for all x, y ∈ R, | x |, | y |≤ N and 0 ≤ t ≤ T ,
The Lipschitz condition:
$$|b(t,x) - b(t,y)| + |\sigma(t,x) - \sigma(t,y)| \le K\,|x - y|.$$
The growth condition:
$$b^2(t,x) + \sigma^2(t,x) \le K^2(1 + x^2).$$
For some constant K > 0 (depending possibly on T and N). Then there exists a
unique solution X , which is adapted to the filtration of the Brownian motion W .
The first condition holds if b and σ have continuous first partial derivatives w.r.t x,
and the second condition holds when they both have at most linear growth in x for
large x and bounded for arbitrarily small x.
This corrects Bachelier’s model of 1900 (a model without the factor St on the
right - missing the interpretation in terms of returns) which had a positive
probability for negative stock prices.
Modeling stock prices via a geometric Brownian motion was suggested by Paul
A. Samuelson (1965). In part for this, Samuelson received the Nobel Prize in
Economics in 1970.
The SDE for geometric Brownian motion has the unique solution
$$S_t = S_0\exp\left\{(\mu - 0.5\sigma^2)\,t + \sigma W_t\right\}.$$
Proof: We let
$$f(t,x) := S_0\exp\left\{(\mu - 0.5\sigma^2)\,t + \sigma x\right\}.$$
With $X_t = W_t$, one has $dX_t = dW_t$, $(dX_t)^2 = dt$. Itô's lemma gives
[Figure: sample path of a geometric Brownian motion plotted against time t.]
Then
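$$Q(Z_i \in dz_i, \forall i) = \exp\left\{\sum_{i=1}^{n}\gamma_i z_i - \frac{1}{2}\sum_{i=1}^{n}\gamma_i^2\right\}\mathbb{P}(Z_i \in dz_i, \forall i)$$
$$= \frac{1}{(2\pi)^{n/2}}\exp\left\{\sum_{i=1}^{n}\gamma_i z_i - \frac{1}{2}\sum_{i=1}^{n}\gamma_i^2 - \frac{1}{2}\sum_{i=1}^{n}z_i^2\right\}\prod_{i=1}^{n}dz_i$$
$$= \frac{1}{(2\pi)^{n/2}}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n}(z_i - \gamma_i)^2\right\}dz_1\dots dz_n.$$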
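The density calculation above says that, after reweighting, each $Z_i$ is normal with mean $\gamma_i$ and unit variance. A minimal one-dimensional Python sketch of this, under the assumption that Z is standard normal under IP and the weight is $L = \exp(\gamma Z - \gamma^2/2)$ (function name, sample size and seed are illustrative):

```python
import math
import random


def reweighted_moments(gamma, n=200_000, seed=0):
    """Estimate E_P[L * Z] and E_P[L] for L = exp(gamma * Z - gamma^2 / 2), Z ~ N(0, 1) under P."""
    rng = random.Random(seed)
    sum_weighted_z = 0.0
    sum_weight = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        weight = math.exp(gamma * z - 0.5 * gamma**2)
        sum_weighted_z += weight * z
        sum_weight += weight
    return sum_weighted_z / n, sum_weight / n


mean_under_q, total_mass = reweighted_moments(gamma=0.5)
print("E_Q[Z] estimate:", mean_under_q)  # should be close to gamma
print("E_P[L] estimate:", total_mass)    # should be close to 1
```

The second estimate checks that the weight integrates to one, i.e. that Q is again a probability measure.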
Define
$$\tilde{W}^{(i)}_t := W^{(i)}_t + \int_0^t\gamma^{(i)}_s\,ds.$$
Under the equivalent probability measure Q with Radon-Nikodým derivative
$$\frac{dQ}{d\mathbb{P}} = L_T,$$
$$\left.\frac{dQ}{d\mathbb{P}}\right|_{\mathcal{F}_t} = L_t$$
Let M = {Mt }t≥0 be a martingale on (Ω, F, F, IP) with respect to the Brownian
filtration F = FW . Then
$$M_t = M_0 + \int_0^t H_s\,dW_s, \quad \forall\, t \ge 0,$$
$$F_t + bF_x + \frac{1}{2}\sigma^2F_{xx} = 0$$
with final condition F (T , x) = h(x) has the stochastic representation
F (t, x) = IE [ h(XT )| Xt = x] ,
$$F_t + bF_x + \frac{1}{2}\sigma^2F_{xx} = rF$$
with final condition F (T , x) = h(x) has the stochastic representation
$$F(t,x) = \mathbb{E}\left[e^{-r(T-t)}h(X_T)\,\middle|\,X_t = x\right],$$
If $X_t$ is a vector then
$$F_t + \sum_i b_iF_{x_i} + \frac{1}{2}\sum_{i,j}\sigma^2_{i,j}F_{x_ix_j} = rF$$
Chapter II
Financial markets in continuous time
Assumptions
Throughout this chapter, let T > 0 be the terminal time horizon.
Uncertainty in the market is modeled by the filtered probability space
(Ω, F, F, IP). The filtration F is assumed to satisfy the usual conditions of
completeness and right-continuity.
Consider d + 1 basic assets, whose price processes are modeled by the
stochastic processes S (0) , . . . , S (d) .
Definition 45 (Numéraire).
Assumption:
Each $\varphi^{(i)}$ is sufficiently integrable such that $\int_0^t\varphi^{(i)}_s\,dS^{(i)}_s$ is well defined.
Interpretation:
$\varphi^{(i)}_t$ denotes the number of shares of asset i held in the portfolio at time t.
Why predictable?
This number has to be determined on the basis of information available before
time t, i.e. the investor selects her or his time t portfolio after observing the
prices St− .
V φ = {Vtφ }t∈[0,T ] is called the value (or wealth) process of the strategy φ.
The gains process Gφ , based on the strategy φ, is defined as
$$G^{\varphi}_t := \sum_{i=0}^{d}\int_0^t\varphi^{(i)}_s\,dS^{(i)}_s, \quad \forall\, t \in [0,T].$$
$$\tilde{S}_t := \frac{S_t}{S^{(0)}_t} = \left(1, \tilde{S}^{(1)}_t, \dots, \tilde{S}^{(d)}_t\right), \quad \forall\, t \in [0,T],$$
with $\tilde{S}^{(i)}_t = S^{(i)}_t/S^{(0)}_t$, for i = 1, . . . , d.
The discounted wealth process Ṽ φ is defined as
$$\tilde{V}^{\varphi}_t := \frac{V^{\varphi}_t}{S^{(0)}_t} = \varphi^{(0)}_t + \sum_{i=1}^{d}\varphi^{(i)}_t\tilde{S}^{(i)}_t, \quad \forall\, t \in [0,T].$$
$$\varphi^{(0)}_t = v_0 + \sum_{i=1}^{d}\int_0^t\varphi^{(i)}_s\,d\tilde{S}^{(i)}_s - \sum_{i=1}^{d}\varphi^{(i)}_t\tilde{S}^{(i)}_t, \quad \forall\, t \in [0,T].$$
V0φ = 0.
is a Q-martingale.
Assume that there exists an EMM Q ∼ IP. Then the market model does not contain
any arbitrage opportunities in Φ(Q).
The converse, i.e. no arbitrage implying the existence of an EMM, is basically true
as well. One requires, however, a more technical and stronger definition of no
arbitrage: No free lunch with vanishing risk (NFLVR), see Delbaen &
Schachermayer (1994). This establishes the so called first fundamental theorem
of asset pricing.
$$\frac{X}{S^{(0)}_T} =: \tilde{X} \in L^1(\Omega, \mathcal{F}_T, Q)$$
(i.e. $\mathbb{E}_Q[|\tilde{X}|] < \infty$)
Definition 53 (Attainable).
A market model that admits at least one EMM Q is complete if and only if Q is
unique.
The arbitrage price process of any attainable claim X is given by the risk-neutral
valuation formula
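$$\Pi^X_t = S^{(0)}_t\,\mathbb{E}_Q\left[\frac{X}{S^{(0)}_T}\,\middle|\,\mathcal{F}_t\right] = S^{(0)}_t\,\mathbb{E}_Q\left[\tilde{X}\,\middle|\,\mathcal{F}_t\right], \quad \forall\, t \in [0,T].$$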
$$\mathbb{E}_Q[Y|\mathcal{F}_s] = \frac{1}{Z_s}\,\mathbb{E}_{\mathbb{P}}[YZ_t|\mathcal{F}_s].$$
Let N = {Nt }t≥0 be a numéraire such that N/S (0) is a Q-martingale. Define the
new measure
$$\left.\frac{dQ_N}{dQ}\right|_{\mathcal{F}_t} := \eta_t = \frac{N_t}{S^{(0)}_t}\cdot\frac{S^{(0)}_0}{N_0}.$$
Chapter III
The Black-Scholes formula
St = S0 + bt + σWt ,
Using the bank account as numéraire, i.e. S̃t := St /Bt , one finds
$$d\tilde{S}_t = \tilde{S}_t\left((b - r)\,dt + \sigma\,dW_t\right).$$
GBM is a reasonable (but not perfect) model for stock price movements.
Shortfalls and extensions are discussed later on.
A major advantage of GBM is its analytical tractability. A vast number of exotic
options can be priced in closed form within this context. This tractability is
typically lost when extensions of the model are considered.
A surprising observation is that the model is robust for hedging. Even though
there might be better models for the pricing of derivatives, BS hedging
strategies turn out to perform very well.
We shall see that the BS-model is free of arbitrage and is complete. This is
extremely convenient for theoretical considerations. However, one should be
aware that as soon as more realistic extensions of the model are considered,
one typically loses the completeness of the model.
$$dB_t = r_tB_t\,dt, \quad B_0 = 1,$$
$$dS^{(i)}_t = S^{(i)}_t\left(b^{(i)}_t\,dt + \sum_{j=1}^{n}\sigma^{(ij)}_t\,dW^{(j)}_t\right), \quad S^{(i)}_0 = s^{(i)}_0 > 0,$$
$$\left.\frac{dQ}{d\mathbb{P}}\right|_{\mathcal{F}_t} = L_t,$$
with
$$L_t = \exp\left(-\int_0^t\gamma_s\,dW_s - \frac{1}{2}\int_0^t\gamma_s^2\,ds\right).$$
By Girsanov’s theorem:
dWt = d W̃t − γt dt,
where W̃ is a Q-Brownian motion.
Thus, the Q-dynamics for S̃ are
b − r − σγt = 0, ∀ t ∈ [0, T ].
$$\gamma_t \equiv \frac{b - r}{\sigma} =: \gamma,$$
where γ is called market price of risk.
This shows that the martingale measure Q is unique in this model.
The Q-dynamics of S are
$$dS_t = S_t\left(r\,dt + \sigma\,d\tilde{W}_t\right).$$
If n > d, the market is incomplete (not every claim has a replicating strategy) and the EMM is not unique
(there are no arbitrage opportunities). Examples: stochastic volatility models.
If n < d, there are infinitely many replicating strategies and in general no EMM (there are arbitrage opportunities, since there is no γ such
that µi − σi γ = r for all i).
3. From this, we can obtain the forward price F (t, T ) of a forward contract with
maturity T on the stock S. We obtain
F (t, T ) = er (T −t) St .
For a European call option with terminal payoff C(ST , T ) = (ST − K )+ , we obtain
the following formula. The Black-Scholes price process of a European call option
with strike K and maturity T > t is given by
Using similar arguments as in the following proofs (or the put-call parity)
Πt = C(St , t) − ξSt .
We use Itô’s formula to find the dynamics of the portfolio at time t. We obtain
(omitting arguments for notational simplicity)
$$d\Pi_t = \frac{\partial C}{\partial t}\,dt + \frac{\partial C}{\partial S}\,dS_t + \frac{1}{2}\sigma^2S_t^2\frac{\partial^2 C}{\partial S^2}\,dt - \xi\,dS_t.$$
dΠt = r Πt dt.
$$\frac{\partial C}{\partial t} + \frac{1}{2}\sigma^2S_t^2\frac{\partial^2 C}{\partial S^2} + rS_t\frac{\partial C}{\partial S} - rC = 0.$$
So far, we did not use that C is a call option. In fact, the same derivation holds
for all options! The specific payoff of C is now absorbed into the boundary
condition of the solution of the above PDE. It is required that
C(ST , T ) = (ST − K )+ .
We can now verify (which is done later in an exercise) that the suggested
BS-formula fulfills the required BS PDE and the boundary condition.
We have shown that pricing in the BS-market can be done using the
risk-neutral valuation formula.
Hence, the price of the call option is the discounted expected future payoff,
where the expectation is computed w.r.t. the martingale measure Q.
Recall that the Q-solution of dSt = St (rdt + σd W̃t ) is given by
$$S_T = S_0\exp\left(\left(r - \frac{1}{2}\sigma^2\right)T + \sigma\tilde{W}_T\right).$$
We now have to compute (w.l.o.g. t = 0)
$$C(S_0, 0) = \mathbb{E}_Q\left[e^{-rT}(S_T - K)^+\right].$$
A := {ω ∈ Ω : ST ≥ K }.
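$$Q(A) = Q(S_T > K) = Q\left(\sigma\tilde{W}_T > \log(K/S_0) - \left(r - \frac{\sigma^2}{2}\right)T\right) = \dots = \Phi\left(\frac{\log(S_0/K) + (r - 0.5\sigma^2)T}{\sigma\sqrt{T}}\right),$$
$$e^{-rT}\,\mathbb{E}_Q[S_T\mathbf{1}_A] = \dots = S_0R(A) = \dots = S_0\,\Phi\left(\frac{\log(S_0/K) + (r + 0.5\sigma^2)T}{\sigma\sqrt{T}}\right).$$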
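The two terms above can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard Black-Scholes call formula $C = S_0\Phi(d_1) - Ke^{-rT}\Phi(d_2)$ with $d_{1,2}$ equal to the two arguments of Φ above, and compares both terms with Monte Carlo estimates. Function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def bs_call(s0, k, r, sigma, T):
    """Return (call price, d1, d2) in the Black-Scholes model at time 0."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * PHI(d1) - k * math.exp(-r * T) * PHI(d2), d1, d2


def mc_terms(s0, k, r, sigma, T, n=200_000, seed=0):
    """Monte Carlo estimates of Q(S_T > K) and exp(-rT) * E_Q[S_T 1_A]."""
    rng = random.Random(seed)
    hits = 0
    stock_part = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        if s_T > k:
            hits += 1
            stock_part += s_T
    return hits / n, math.exp(-r * T) * stock_part / n


price, d1, d2 = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
prob_A, stock_term = mc_terms(100.0, 100.0, 0.05, 0.2, 1.0)
print("call price:", price)
print("Q(A):", prob_A, "vs Phi(d2) =", PHI(d2))
print("discounted E_Q[S_T 1_A]:", stock_term, "vs S0 * Phi(d1) =", 100.0 * PHI(d1))
```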
$$\Delta_C := \frac{\partial C}{\partial S_0} = \Phi(d_1) > 0.$$
Using the put-call parity, the Delta of the corresponding put option is found to be
$$\Delta_P := \frac{\partial P}{\partial S_0} = \Phi(d_1) - 1 < 0.$$
Delta is the most important and widely used short-term risk measure.
The Delta of an option shows how the option value changes when the price of
the underlying marginally varies (under the assumption that everything else
stays constant).
C = 154.18, P = 41.19.
[Figure: option value and Delta (two panels).]
Using the same options and parameters as in Example 12, the Gamma of the
put and call option at time t = 0 is
Γ = 0.0015.
This means that the Delta of both options changes by about 0.0015, if the stock
price increases by 1 unit of currency.
[Figure: Delta and Gamma plotted against the stock price.]
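$$\Theta_C := \frac{\partial C}{\partial T} = \frac{S_0\sigma\varphi(d_1)}{2\sqrt{T}} + Kre^{-rT}\Phi(d_2), \quad \Theta_P := \frac{\partial P}{\partial T} = \frac{S_0\sigma\varphi(d_1)}{2\sqrt{T}} - Kre^{-rT}\Phi(-d_2).$$
Caution: Some authors introduce Theta as the derivative with respect to t:
$$\frac{\partial C}{\partial t} = -\frac{S_t\sigma\varphi(d_1)}{2\sqrt{T-t}} - Kre^{-r(T-t)}\Phi(d_2), \quad \frac{\partial P}{\partial t} = -\frac{S_t\sigma\varphi(d_1)}{2\sqrt{T-t}} + Kre^{-r(T-t)}\Phi(-d_2).$$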
Theta is the sensitivity of the option price w.r.t. the time to maturity T (under
the assumption that everything else stays constant).
Theta measures the result of decay in time on an option.
It is especially important for short-maturity options, as the probability of a call
(or put) reaching moneyness falls with shorter maturity, which lowers the option
value.
Note: "at the money" means S0 = K , "out of the money" means S0 < K (for calls), and
"in the money" means S0 > K (for calls).
Black-Scholes Theta (ΘC , w.r.t. t), r = 0.1, K = 100, and σ = 0.15
[Figure: option value and Theta plotted against time to maturity, for at-the-money and in-the-money options.]
[Figure: option value and Vega plotted against volatility.]
$$\rho_C := \frac{\partial C}{\partial r} = TKe^{-rT}\Phi(d_2) > 0,$$
$$\rho_P := \frac{\partial P}{\partial r} = -TKe^{-rT}\Phi(-d_2) < 0.$$
Rho is the sensitivity of the option price w.r.t. the riskless interest rate.
Rho shows the change of the price under a marginally varying riskless interest
rate (under the assumption that everything else stays constant).
[Figure: option value and Rho plotted against the interest rate.]
A bank sells 2 000 calls with a Delta of 0.6 and a Gamma of 0.015 (per call).
The total Delta is −1 200, the total Gamma is −30.00.
Another option on the same stock is available in the market with a Delta of 0.5
and a Gamma of 0.02.
Gamma neutrality can be created by buying 1 500 units of the second option
(note that the Gamma per stock is obviously 0).
Delta neutrality can be created by additionally buying 450 stocks with Delta 1
and Gamma 0.
The total position satisfies:
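The totals can be verified with a few lines of arithmetic. The Python sketch below (names are illustrative) sums Delta and Gamma over the three legs described above: the short calls, the second option and the stock.

```python
def position_greeks(positions):
    """Sum Delta and Gamma over positions given as (units, delta_per_unit, gamma_per_unit)."""
    total_delta = sum(units * delta for units, delta, _ in positions)
    total_gamma = sum(units * gamma for units, _, gamma in positions)
    return total_delta, total_gamma


short_calls = (-2000, 0.6, 0.015)   # the bank is short 2 000 calls
hedge_options = (1500, 0.5, 0.02)   # long 1 500 units of the second option
stocks = (450, 1.0, 0.0)            # long 450 stocks

print("short calls only:     ", position_greeks([short_calls]))
print("after Gamma hedge:    ", position_greeks([short_calls, hedge_options]))
print("after Delta hedge too:", position_greeks([short_calls, hedge_options, stocks]))
```

Up to floating-point rounding, the three lines show a total Delta and Gamma of (−1 200, −30), then (−450, 0), and finally (0, 0).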
One of the main issues raised, when using the Black-Scholes formula, is the
question of modeling / selecting the volatility σ. All other required quantities
(S0 , T , r , K ) are (more or less) directly observable.
Before we can implement the Black-Scholes formula to price options, we first
have to specify σ.
Recall that Vega is given by
$$\mathcal{V} := \frac{\partial C}{\partial\sigma} = S_0\sqrt{T}\,\varphi(d_1) > 0.$$
The important thing to note is that Vega is always positive.
Next, since Vega is positive and C is differentiable, C is a strictly increasing
function of σ.
Turning this round, σ is a continuous (differentiable) strictly increasing function
of C.
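Because C is strictly increasing in σ, the volatility matching an observed call price can be found by bisection. The following Python sketch is one possible implementation, not the notes' own method: it assumes the standard Black-Scholes call formula, an observed price inside the no-arbitrage bounds, and a hypothetical search bracket [1e-6, 5].

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf


def bs_call(s0, k, r, sigma, T):
    """Black-Scholes call price at time 0."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * PHI(d1) - k * math.exp(-r * T) * PHI(d2)


def implied_vol(price, s0, k, r, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(sigma) = price for sigma by bisection on the bracket [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, k, r, mid, T) < price:
            lo = mid   # model price too low: volatility must be larger
        else:
            hi = mid   # model price too high (or equal): volatility is at most mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


target = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
print("recovered volatility:", implied_vol(target, 100.0, 100.0, 0.05, 1.0))
```

Bisection is chosen here for robustness; a Newton iteration using Vega would typically need fewer steps but requires a sensible starting value.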
[Figure: option price in a BS-model plotted against volatility; the implied volatility is read off at the observed price C = 11.2.]
[Figure: implied volatility surface of the EURO STOXX 50, as of April 16, 2010, S0 = 2 949.]
Dividends
Remark 48 (Options on dividend paying assets).
DSt dt.
Again, assume the time t value of the option to be a (smooth) function of the
current time and stock value, denoted by V (St , t).
Now consider a portfolio Π consisting of one option V and a short position in ξ
stocks. The value of this portfolio at time t is
Πt = V (St , t) − ξSt .
Remark 49 (Options on dividend paying assets - the new PDE).
We again use Itô’s formula to find the dynamics of the portfolio at time t. We
obtain
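$$d\Pi_t = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS_t + \frac{1}{2}\sigma^2S_t^2\frac{\partial^2 V}{\partial S^2}\,dt - \xi\,dS_t - D\xi S_t\,dt.$$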
Similar considerations as before lead to the following PDE for V :
$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2S^2\frac{\partial^2 V}{\partial S^2} + (r - D)S\frac{\partial V}{\partial S} - rV = 0.$$
The specific payoff of V is again absorbed into the boundary condition of the
solution of the above PDE.
The solution to the above PDE is the respective pricing formula under the
assumption of proportional dividends.
Remark 50 (Options on dividend paying assets - martingale pricing (1)).
Remark 51 (Options on dividend paying assets - martingale pricing (2)).
Remark 52 (Calls and puts on dividend paying assets).
Chapter IV
Stochastic volatility and jump models
The coefficients represent the instantaneous short rate r (t) ≥ 0, drift b(t), and
volatility σ(t) > 0 at time t, which satisfy certain technical conditions to guarantee
that the model is well-defined.
The respective BS-put option formula is found by applying the put-call parity:
$$P(S_t, t) = Ke^{-\int_t^T r(u)\,du}\,\Phi\left(-d_2(S_t, T-t)\right) - S_t\,\Phi\left(-d_1(S_t, T-t)\right).$$
Note: if σd and σy denote the volatility per day and per year, respectively, then $\sigma_y = \sigma_d\sqrt{250}$, e.g.
σd = 0.015 implies σy = 0.237.
A (continuous) stochastic volatility model is a stock price model where the stock
price volatility follows its own Itô-process, i.e.
Instead of modeling the volatility {σt }t≥0 , some models prefer postulating some
SDE for the variance process vt = σt2 , instead. Changing from one
specification to the other is done via Itô’s formula.
Important: Due to the additional risk of a stochastic volatility, these models are
no longer complete. In the following, we state all models w.r.t. some martingale
measure Q and do not discuss the (difficult!) question of a reasonable change
of measure from IP to Q.
If, w.r.t. Q, b(St , σt , t) = rSt , c(St , σt , t) = σt St , and if the processes WtS and Wtσ
are uncorrelated Brownian motions, the price of a European call option can be
calculated as the call price in a classical Black-Scholes model with the average
volatility over the lifetime of the option, integrated over the distribution of the
average volatility (using Tower property), i.e.
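$$C_{SV}(S_0, 0) = \int_0^\infty C_{BS}(S_0, 0; \bar{\sigma})\,h_{\bar{\sigma}}(\bar{\sigma})\,d\bar{\sigma},$$
where $C_{BS}(S_0, 0; \bar{\sigma})$ is the standard BS-price (with volatility being a function of
time), $\bar{\sigma} = \left(\frac{1}{T}\int_0^T\sigma_u^2\,du\right)^{1/2}$ the average volatility, and $h_{\bar{\sigma}}$ the density of the average
volatility.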
where the Brownian motions are correlated with parameter ρ ∈ (−1, 1), i.e.
[Figure: simulated path and Log-Return(t), plotted against time.]
The volatility process can reach zero, unless the Feller condition
holds:
$$\kappa\theta > \frac{1}{2}\alpha^2.$$
where
$$d = \sqrt{(\rho\alpha ui - \kappa)^2 + \alpha^2(iu + u^2)}, \quad g = \frac{\kappa - \rho\alpha iu - d}{\kappa - \rho\alpha iu + d}.$$
where the Brownian motions are correlated with parameter ρ ∈ (−1, 1), i.e.
[Figure: simulated path and Log-Return(t), plotted against time.]
where the Brownian motions are correlated with parameter ρ ∈ (−1, 1), i.e.
Chapter V
Numerical methods
This algorithm simulates a trajectory of the Brownian motion W = {Wt }t≥0 on the
equidistant grid 0 = t0 < t1 < · · · < tn = T , where ∆ ≡ ∆tk := tk +1 − tk .
1. Initialize t0 := 0, Wt0 := 0, and ∆ := T /n.
2. For j = 1, . . . , n do
2.1 $t_j := t_{j-1} + \Delta$.
2.2 Draw a N (0, 1)-distributed r.v. Z , independent of the past.
2.3 Set $W_{t_j} := W_{t_{j-1}} + Z\sqrt{\Delta}$.
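The steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name and the optional seed are assumptions, and the grid times are computed as j·∆ rather than by repeated addition to limit rounding drift.

```python
import math
import random


def simulate_brownian_path(T, n, seed=None):
    """Simulate W on the equidistant grid 0 = t_0 < ... < t_n = T; return (times, values)."""
    rng = random.Random(seed)
    delta = T / n                      # step 1: grid width
    times = [0.0]
    path = [0.0]                       # W_{t_0} := 0
    for j in range(1, n + 1):          # step 2
        z = rng.gauss(0.0, 1.0)        # step 2.2: fresh N(0, 1) draw
        times.append(j * delta)        # step 2.1
        path.append(path[-1] + z * math.sqrt(delta))  # step 2.3
    return times, path


times, w = simulate_brownian_path(T=1.0, n=10_000, seed=42)
realized_qv = sum((b - a) ** 2 for a, b in zip(w, w[1:]))
print("W_T =", w[-1], " realized quadratic variation =", realized_qv)
```

On a fine grid the realized quadratic variation printed at the end should be close to T, in line with the result ⟨W⟩t = t.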
where ∆ ≡ ∆tk := tk +1 − tk .
1. Initialize t0 := 0, Xt0 =: x0 , Wt0 := 0, and ∆ := T /n.
2. For j = 0, . . . , n − 1 do
2.1 $t_{j+1} := t_j + \Delta$.
2.2 Draw Z ∼ N (0, 1) (independent of the past) and set $\Delta W := Z\sqrt{\Delta}$.
2.3 Set $X_{t_{j+1}} := X_{t_j} + \mu(X_{t_j}, t_j)\Delta + \sigma(X_{t_j}, t_j)\Delta W$.
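A minimal Python sketch of this scheme is given below. The function name is an assumption, and the Vasicek-type coefficients in the usage example (speed 0.5, level 0.03, volatility 0.01, start 0.02) are purely hypothetical values chosen for illustration.

```python
import math
import random


def euler_maruyama(mu, sigma, x0, T, n, seed=None):
    """Euler scheme for dX = mu(X, t) dt + sigma(X, t) dW on an equidistant grid with n steps."""
    rng = random.Random(seed)
    delta = T / n
    t, x = 0.0, x0
    path = [x0]
    for _ in range(n):
        dw = rng.gauss(0.0, 1.0) * math.sqrt(delta)      # Brownian increment
        x = x + mu(x, t) * delta + sigma(x, t) * dw      # coefficients evaluated at (X_{t_j}, t_j)
        t += delta
        path.append(x)
    return path


# Usage: a Vasicek-type short rate dr = kappa * (theta - r) dt + vol dW (hypothetical parameters).
rates = euler_maruyama(
    mu=lambda r, t: 0.5 * (0.03 - r),
    sigma=lambda r, t: 0.01,
    x0=0.02, T=5.0, n=1_000, seed=1,
)
print("simulated short rate at T:", rates[-1])
```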
where XT is the true solution of the SDE and XT∆ is the approximation obtained
with the Euler method (with the same BM).
For the Euler method, one can show that the error decreases as
$$\epsilon(\Delta) \le c\cdot\Delta^{1/2}, \quad \text{i.e. } \epsilon(\Delta) \in O(\sqrt{\Delta}).$$
$$d\log(S_t) = \left(b - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW_t.$$
From this, we find the solution
$$S_t = S_0\exp\Big(\underbrace{\left(b - \tfrac{1}{2}\sigma^2\right)t + \sigma W_t}_{:=X_t}\Big).$$
If only the terminal value $S_T$ is required (do not use this method to produce a path
of S), it is enough to simulate Z ∼ N (0, 1) and to define
$$S_T \stackrel{d}{=} S_0\exp\left(\left(b - \frac{1}{2}\sigma^2\right)T + \sigma\sqrt{T}\,Z\right),$$
where $\stackrel{d}{=}$ means equal in distribution.
$$X_t := \left(b - \frac{1}{2}\sigma^2\right)t + \sigma W_t.$$
This process can be simulated (without discretisation bias!) via $X_{t_0} := 0$ and
$$X_{t_{i+1}} := X_{t_i} + \left(b - \frac{1}{2}\sigma^2\right)\Delta + \sigma\sqrt{\Delta}\,Z.$$
Finally, set Sti := S0 exp(Xti ), for i = 0, . . . , n.
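The exact scheme can be sketched in Python as follows. The function name and the parameter values in the usage line are illustrative assumptions; the generator is passed in so that many paths can share one random stream.

```python
import math
import random


def simulate_gbm_path(s0, b, sigma, T, n, rng):
    """Simulate S on an equidistant grid by exact simulation of X = log(S / S0)."""
    delta = T / n
    x = 0.0                  # X_{t_0} := 0
    path = [s0]
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x += (b - 0.5 * sigma**2) * delta + sigma * math.sqrt(delta) * z
        path.append(s0 * math.exp(x))   # S_{t_i} := S0 * exp(X_{t_i})
    return path


rng = random.Random(2024)
path = simulate_gbm_path(s0=100.0, b=0.05, sigma=0.2, T=1.0, n=250, rng=rng)
print("S_T =", path[-1])
```

Since the increments of X are exactly normal, the grid values carry no discretisation bias, whatever the step size.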
as an approximation for I.
It is important to notice that In itself is a random variable.
Therefore, we can apply probabilistic results to obtain statements about the
quality of the estimate In .
Lemma 26 (Properties of In ).
$$\mathbb{E}[I_n] = I,$$
$$\mathrm{Var}(I_n) = \frac{1}{n}\left(\int_0^1 f^2(x)\,dx - I^2\right).$$
We conclude that:
1. $I_n$ is an unbiased estimate for I.
2. The variance (resp. standard deviation) of $I_n$ decreases as 1/n (resp. $1/\sqrt{n}$) in
the number of samples drawn, that is n.
The strong law of large numbers can be used to show that In converges to I
with probability one, i.e. for a.e. ω ∈ Ω:
In → I, (n → ∞).
Using the central limit theorem we can even get the approximate distribution of
$I_n$. We find
$$I_n \stackrel{d}{\approx} N\left(I, \frac{1}{n}\mathrm{Var}\,f(U)\right), \quad U \sim \mathrm{Uni}[0,1].$$
This result is later used to derive (asymptotic) confidence intervals for I.
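As an illustration of these properties, the Python sketch below estimates the integral of a function over [0, 1] by averaging over uniform draws and reports the estimated standard error. The function name and the test integrand x² (with exact value 1/3) are assumptions made for this example.

```python
import math
import random


def mc_integrate(f, n, seed=None):
    """Return (I_n, estimated standard error) for the integral of f over [0, 1]."""
    rng = random.Random(seed)
    samples = [f(rng.random()) for _ in range(n)]
    mean = sum(samples) / n
    sample_var = sum((y - mean) ** 2 for y in samples) / (n - 1)
    return mean, math.sqrt(sample_var / n)


for n in (1_000, 100_000):
    estimate, std_err = mc_integrate(lambda x: x * x, n, seed=0)
    print(n, estimate, std_err)
```

Increasing n by a factor of 100 should shrink the reported standard error by roughly a factor of 10.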
For an option D with payoff $h(\{S_t\}_{t\in[0,T]})$ at time T , we find the price at time
t = 0 as
$$D(S_0, 0) = \mathbb{E}_Q\left[e^{-rT}h(\{S_t\}_{t\in[0,T]})\right].$$
This puts us back into the original MC-framework, where an expected value
(i.e. an integral) is to be computed.
$$\bar{D}_n := \frac{1}{n}\left(\hat{D}_1 + \dots + \hat{D}_n\right).$$
This is our estimate of the option price.
In many cases, D̄n is unbiased, i.e. IE [D̄n ] = D(S0 , 0), and strongly consistent,
i.e. D̄n − D(S0 , 0) → 0 a.s. (n → ∞). Only for path dependent options we
sometimes have to accept a small discretisation bias.
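The central limit theorem allows us to construct a confidence interval. Let
$$s_n^2 := \frac{1}{n-1}\sum_{i=1}^{n}(\hat{D}_i - \bar{D}_n)^2$$
denote the sample variance of the option payoff and let $z_{1-\delta}$ denote the 1 − δ
quantile of the standard normal distribution, i.e. $\Phi(z_{1-\delta}) = 1 - \delta$. Then
$$\left[\bar{D}_n - z_{1-\delta/2}\frac{s_n}{\sqrt{n}},\ \bar{D}_n + z_{1-\delta/2}\frac{s_n}{\sqrt{n}}\right]$$
is an asymptotic (1 − δ) confidence interval for D(S0 , 0).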
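A hedged Python sketch of this recipe for a European call in the Black-Scholes model is shown below. It assumes the terminal value is drawn exactly under Q; the function name, the parameter values and the 95% level are illustrative choices.

```python
import math
import random
from statistics import NormalDist


def mc_call_price(s0, k, r, sigma, T, n, level_delta=0.05, seed=None):
    """Return (estimate, (lower, upper)) with an asymptotic (1 - level_delta) confidence interval."""
    rng = random.Random(seed)
    discount = math.exp(-r * T)
    payoffs = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        payoffs.append(discount * max(s_T - k, 0.0))    # discounted payoff D_i
    mean = sum(payoffs) / n
    s_n = math.sqrt(sum((d - mean) ** 2 for d in payoffs) / (n - 1))
    quantile = NormalDist().inv_cdf(1.0 - level_delta / 2.0)   # z_{1 - delta/2}
    half_width = quantile * s_n / math.sqrt(n)
    return mean, (mean - half_width, mean + half_width)


price, (lower, upper) = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, n=200_000, seed=0)
print("estimate:", price, " 95% CI:", (lower, upper))
```

For a path-dependent payoff, the single draw of the terminal value would be replaced by a simulated path on a time grid.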
Unfortunately, the quantile function Φ−1 (x) of the normal distribution is not
known in closed form, i.e. it has to be approximated numerically.
Here, the algorithm of Box and Muller is more efficient and exact.
Let $U_1$ and $U_2$ be independent and Uni [0, 1]-distributed.
Set
$$Y_1 := \sqrt{-2\log(U_1)}\,\cos(2\pi U_2),$$
$$Y_2 := \sqrt{-2\log(U_1)}\,\sin(2\pi U_2).$$
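The transformation can be sketched in Python as follows. The function name is an assumption, and the first uniform is shifted to the interval (0, 1] as a practical guard so that the logarithm is always defined.

```python
import math
import random


def box_muller(rng):
    """Return one pair (Y1, Y2) generated from two independent uniforms."""
    u1 = 1.0 - rng.random()    # lies in (0, 1], avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(0)
draws = [y for _ in range(50_000) for y in box_muller(rng)]
mean = sum(draws) / len(draws)
variance = sum((y - mean) ** 2 for y in draws) / (len(draws) - 1)
print("sample mean:", mean, " sample variance:", variance)
```

The printed sample mean and variance should be close to 0 and 1, consistent with standard normal output.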
Given two unbiased estimates (that can be computed with similar effort) for the
same quantity, it is natural to prefer the one with smaller variance (as for
instance, confidence intervals obtained from this estimate are smaller).
In a Monte Carlo context, reducing the standard deviation by a factor of 10
corresponds to increasing the number of samples by a factor of 100.
The following examples for variance reduction methods in the context of option
pricing will be presented:
Antithetic variates;
Control variates;
Importance sampling.
Techniques that we do not cover are moment matching, stratified and Latin
hypercube sampling, and quasi-Monte Carlo methods.
and let Či be the realization of the option price obtained from −Zi .
Now consider the estimate
$$\bar{C}^{\mathrm{anti}}_n := \frac{1}{n}\sum_{i=1}^{n}\frac{\hat{C}_i + \check{C}_i}{2}.$$
A heuristic argument for this modification is that antithetic pairs are more
regularly distributed, as each realization has its antithetic analogue. Note that
the sample mean of antithetic pairs Z and −Z is always zero.
A first observation is that C̄nanti is unbiased if C̄n , the original estimate, is.
The next observation is that
This relation holds for our call example (and for most standard options).
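To make the effect concrete, the following Python sketch prices a European call in the Black-Scholes model with antithetic pairs and compares the sample variance of the pair averages with that of the plain payoffs. It is an illustration only; the function names and parameter values are assumptions.

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def antithetic_call(s0, k, r, sigma, T, n, seed=None):
    """Return (antithetic estimate, variance of pair averages, variance of plain payoffs)."""
    rng = random.Random(seed)
    discount = math.exp(-r * T)
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * math.sqrt(T)

    def payoff(z):
        return discount * max(s0 * math.exp(drift + vol * z) - k, 0.0)

    plain, pairs = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        c_hat, c_check = payoff(z), payoff(-z)   # realizations from Z and from -Z
        plain.append(c_hat)
        pairs.append(0.5 * (c_hat + c_check))
    return sum(pairs) / n, sample_variance(pairs), sample_variance(plain)


estimate, var_pairs, var_plain = antithetic_call(100.0, 100.0, 0.05, 0.2, 1.0, n=50_000, seed=0)
print("antithetic estimate:", estimate)
print("variance of pair averages:", var_pairs, " variance of plain payoffs:", var_plain)
```

Each pair costs two payoff evaluations, so a fair comparison is between the variance of a pair average and half the variance of a plain payoff.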