An Introduction to Stochastic Calculus with Applications to Finance
Ovidiu Calin
Department of Mathematics
Eastern Michigan University
Ypsilanti, MI 48197 USA
[email protected]
Preface
The goal of this work is to introduce elementary Stochastic Calculus to senior undergraduate
as well as master's students majoring in Mathematics, Economics and Business. The author's
goal was to capture as much as possible of the spirit of elementary Calculus, to which the
students have already been exposed at the beginning of their majors. This suggests a presentation
that mimics the corresponding properties of deterministic Calculus, which facilitates the understanding of
more complicated concepts of Stochastic Calculus. Since deterministic Calculus books usually
start with a brief presentation of elementary functions, and then continue with limits and other
properties of functions, we employ here a similar approach, starting with elementary stochastic
processes and different types of limits, and continuing with properties of stochastic processes. The
chapters regarding differentiation and integration follow the same pattern. For instance, there
is a product rule, a chain-type rule and an integration by parts in Stochastic Calculus, which
are modifications of the well-known rules from the elementary Calculus.
Since deterministic Calculus can be used for modeling regular business problems, in the
second part of the book we deal with stochastic modeling of business applications, such as
Financial Derivatives, whose modeling is based solely on Stochastic Calculus.
In order to make the book available to a wider audience, rigor was sacrificed for clarity.
Most of the time we assumed maximal regularity conditions for which the computations hold
and the statements are valid. This will be found attractive by both Business and Economics
students, who might otherwise get lost in a very profound mathematical textbook where the
forest's scenery is obscured by the sight of the trees.
An important feature of this textbook is the large number of solved problems and examples,
which will benefit both the beginner and the advanced student.
This book grew from a series of lectures and courses given by the author at the Eastern
Michigan University (USA), Kuwait University (Kuwait) and Fu-Jen University (Taiwan). Sev-
eral students read the first draft of these notes and provided valuable feedback, supplying a
list of corrections, which is by no means exhaustive. Any typos or comments regarding the present
material are welcome.
The Author,
Ann Arbor, October 2012
Contents

I Stochastic Calculus

1 Basic Notions
1.1 Probability Space
1.2 Sample Space
1.3 Events and Probability
1.4 Random Variables
1.5 Distribution Functions
1.6 Basic Distributions
1.7 Independent Random Variables
1.8 Integration in Probability Measure
1.9 Expectation
1.10 Radon-Nikodym's Theorem
1.11 Conditional Expectation
1.12 Inequalities of Random Variables
1.13 Limits of Sequences of Random Variables
1.14 Properties of Limits
1.15 Stochastic Processes

4 Stochastic Integration
4.0.3 Nonanticipating Processes
4.0.4 Increments of Brownian Motions
4.1 The Ito Integral
4.2 Examples of Ito integrals
4.2.1 The case Ft = c, constant
4.2.2 The case Ft = Wt
4.3 Properties of the Ito Integral
4.4 The Wiener Integral
4.5 Poisson Integration
4.5.1 A Work Out Example: the case Ft = Mt
4.6 The distribution function of X_T = ∫_0^T g(t) dN_t

5 Stochastic Differentiation
5.1 Differentiation Rules
5.2 Basic Rules
5.3 Ito's Formula
5.3.1 Ito's formula for diffusions
5.3.2 Ito's formula for Poisson processes
5.3.3 Ito's multidimensional formula

9 Martingales
9.1 Examples of Martingales
9.2 Girsanov's Theorem
Part I

Stochastic Calculus

Chapter 1

Basic Notions
0 if the outcome does not occur and a 1 if the outcome occurs. We obtain four possible
assignments, so the set of subsets of {H, T} can be represented as 4 sequences of length 2 formed with 0
and 1: (0, 0), (0, 1), (1, 0), (1, 1). These correspond, in order, to the sets Ø, {T}, {H}, {H, T},
which form the set 2^{H,T}.
Example 1.2.2 Pick a natural number at random. Any subset of the sample space corresponds
to a sequence formed with 0 and 1. For instance, the subset {1, 3, 5, 6} corresponds to the
sequence 10101100000 . . . , having 1 on the 1st, 3rd, 5th and 6th places and 0 elsewhere. It is known
that the number of these sequences is infinite and can be put into a bijective correspondence with
the set of real numbers R. This can also be written as |2^N| = |R|, and stated by saying that the set
of all subsets of the natural numbers N has the same cardinality as the set of real numbers R.
Any subset F of 2Ω that satisfies the previous three properties is called a σ-field. The sets
belonging to F are called events. This way, the complement of an event, or the union of events
is also an event. We say that an event occurs if the outcome of the experiment is an element
of that subset.
The chance of occurrence of an event is measured by a probability function P : F → [0, 1]
which satisfies the following two properties:
1. P (Ω) = 1;
2. For any mutually disjoint events A_1, A_2, · · · ∈ F,
P(A_1 ∪ A_2 ∪ · · · ) = P(A_1) + P(A_2) + · · · .
The triplet (Ω, F , P ) is called a probability space. This is the main setup in which the
probability theory works.
Example 1.3.1 In the case of flipping a coin, the probability space has the following elements:
Ω = {H, T}, F = {Ø, {H}, {T}, {H, T}} and P defined by P(Ø) = 0, P({H}) = 1/2, P({T}) = 1/2,
P({H, T}) = 1.
Example 1.3.2 Consider a finite sample space Ω = {s1 , . . . , sn }, with the σ-field F = 2Ω , and
probability given by P (A) = |A|/n, ∀A ∈ F . Then (Ω, 2Ω , P ) is called the classical probability
space.
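To make this construction concrete, here is a minimal computational sketch (our own illustration, not part of the text; the fair-die example with n = 6, the function name prob and the use of Python are assumptions):

# Classical probability space for a fair die: Omega = {1,...,6}, F = 2^Omega, P(A) = |A|/n.
Omega = set(range(1, 7))

def prob(A):
    # P(A) = |A| / |Omega|; A must be an event, i.e. a subset of Omega
    assert A <= Omega
    return len(A) / len(Omega)

print(prob({2, 4, 6}))   # P(even outcome) = 0.5
print(prob(Omega))       # P(Omega) = 1
print(prob(set()))       # P(empty set) = 0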
Figure 1.1: If any pullback X −1 (a, b) is known, then the random variable X : Ω → R is
2Ω -measurable.
Example 1.4.1 Let X(ω) be the number of people who want to buy houses, given the state of
the market ω. Is X predictable? This would mean that given two numbers, say a = 10, 000 and
b = 50, 000, we know all the market situations ω for which there are at least 10, 000 and at most
50, 000 people willing to purchase houses. Many times, in theory, it makes sense to assume that
we have enough knowledge to assume X predictable.
Example 1.4.2 Consider the experiment of flipping three coins. In this case Ω is the set of
all possible triplets. Consider the random variable X which gives the number of tails obtained.
For instance X(HHH) = 0, X(HHT ) = 1, etc. The sets
Example 1.4.3 A graph is a set of elements, called nodes, and a set of unordered pairs of
nodes, called edges. Consider the set of nodes N = {n_1, n_2, . . . , n_k} and the set of edges
E = {(n_i, n_j), 1 ≤ i, j ≤ k, i ≠ j}. Define the probability space (Ω, F, P), where
◦ the sample space is the complete graph, Ω = N ∪ E;
◦ the σ-field F is the set of all subgraphs of Ω;
◦ the probability is given by P(G) = n(G)/k, where n(G) is the number of nodes of the
graph G.
As an example of a random variable we consider Y : F → R, Y (G) = the total number of edges
of the graph G. Since given F , one can count the total number of edges of each subgraph, it
follows that Y is F -measurable, and hence it is a random variable.
It is worth observing that since X is a random variable, then the set {ω; X(ω) ≤ x} belongs to
the information set F .
The distribution function is non-decreasing and satisfies the limits
lim_{x→−∞} F_X(x) = 0,     lim_{x→+∞} F_X(x) = 1.
If we have
d/dx F_X(x) = p(x),
then we say that p(x) is the probability density function of X. A useful property which follows
from the Fundamental Theorem of Calculus is
P(a < X < b) = P(ω; a < X(ω) < b) = ∫_a^b p(x) dx.
In the case of discrete random variables the aforementioned integral is replaced by the following sum
P(a < X < b) = Σ_{a<x<b} P(X = x).
For more details the reader is referred to a traditional probability book, such as Wackerly et al.
[Figure 1.2: density plots. a: a normal density; c: gamma densities with (α, β) = (3, 2) and (4, 3); d: beta densities with (α, β) = (3, 9) and (8, 3).]
Normal distribution A random variable X is said to have a normal distribution if its probability density function is given by
p(x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)},
with µ and σ > 0 constant parameters, see Fig. 1.2a. The mean and variance are given by
E[X] = µ,     Var[X] = σ².
In this case we write X ∼ N(µ, σ²).
Exercise 1.6.1 Let α, β ∈ R. Show that if X is normally distributed, with X ∼ N(µ, σ²), then
Y = αX + β is also normally distributed, with Y ∼ N(αµ + β, α²σ²).
Log-normal distribution Let X be normally distributed with mean µ and variance σ 2 . Then
the random variable Y = eX is said to be log-normal distributed. The mean and variance of Y
are given by
E[Y] = e^{µ+σ²/2},
Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).
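These two formulas are easy to confirm by simulation. The following sketch (our own illustration; it assumes the NumPy library, and the parameters µ = 0.5, σ = 0.3 are arbitrary choices) samples Y = e^X and compares the empirical mean and variance with the formulas above:

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.5, 0.3
X = rng.normal(mu, sigma, size=1_000_000)   # X ~ N(mu, sigma^2)
Y = np.exp(X)                               # Y = e^X is log-normal

print(Y.mean(), np.exp(mu + sigma**2 / 2))                         # E[Y] = e^{mu + sigma^2/2}
print(Y.var(),  np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1))  # Var[Y]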
Exercise 1.6.2 Given that the moment generating function of a normally distributed random
variable X ∼ N(µ, σ²) is m(t) = E[e^{tX}] = e^{µt + t²σ²/2}, show that
(a) E[Y^n] = e^{nµ + n²σ²/2}, where Y = e^X.
(b) Show that the mean and variance of the log-normal random variable Y = e^X are
E[Y] = e^{µ+σ²/2},     Var[Y] = e^{2µ+σ²}(e^{σ²} − 1).
Gamma distribution A random variable X is said to have a gamma distribution with pa-
rameters α > 0, β > 0 if its density function is given by
p(x) = x^{α−1} e^{−x/β} / (β^α Γ(α)),     x ≥ 0,
where Γ(α) denotes the gamma function,1 see Fig.1.2c. The mean and variance are
E[X] = αβ, V ar[X] = αβ 2 .
The case α = 1 is known as the exponential distribution, see Fig.1.3a. In this case
p(x) = (1/β) e^{−x/β},     x ≥ 0.
The particular case when α = n/2 and β = 2 becomes the χ²-distribution with n degrees of
freedom. This is also the distribution of a sum of squares of n independent standard normal
random variables.
Beta distribution A random variable X is said to have a beta distribution with parameters
α > 0, β > 0 if its probability density function is of the form
p(x) = x^{α−1}(1 − x)^{β−1} / B(α, β),     0 ≤ x ≤ 1,
where B(α, β) denotes the beta function.² See Fig. 1.2d for two particular density functions.
In this case
E[X] = α/(α + β),     Var[X] = αβ/((α + β)²(α + β + 1)).
Poisson distribution A discrete random variable X is said to have a Poisson probability
distribution if
P(X = k) = (λ^k/k!) e^{−λ},     k = 0, 1, 2, . . . ,
with λ > 0 parameter, see Fig.1.3b. In this case E[X] = λ and V ar[X] = λ.
Pearson 5 distribution Let α, β > 0. A random variable X with the density function
p(x) = (1/(βΓ(α))) e^{−β/x} / (x/β)^{α+1},     x ≥ 0
is said to have a Pearson 5 distribution³ with positive parameters α and β. It can be shown
that
E[X] = β/(α − 1) if α > 1, and ∞ otherwise;
Var(X) = β²/((α − 1)²(α − 2)) if α > 2, and ∞ otherwise.
¹ Recall the definition of the gamma function: Γ(α) = ∫_0^∞ y^{α−1} e^{−y} dy; if α = n is an integer, then Γ(n) = (n − 1)!.
² Two equivalent formulas for the beta function are B(α, β) = Γ(α)Γ(β)/Γ(α + β) = ∫_0^1 y^{α−1}(1 − y)^{β−1} dy.
3 The Pearson family of distributions was designed by Pearson between 1890 and 1895. There are several
[Figure 1.3: a the exponential density (β = 3); b the Poisson distribution.]
The mode of this distribution is equal to β/(α + 1).
The Inverse Gaussian distribution Let µ, λ > 0. A random variable X has an inverse
Gaussian distribution with parameters µ and λ if its density function is given by
p(x) = √(λ/(2πx³)) e^{−λ(x−µ)²/(2µ²x)},     x > 0.     (1.6.1)
We shall write X ∼ IG(µ, λ). Its mean, variance and mode are given by
E[X] = µ,     Var(X) = µ³/λ,     Mode(X) = µ(√(1 + 9µ²/(4λ²)) − 3µ/(2λ)).
This distribution will be used to model the time instance when a Brownian motion with drift
exceeds a certain barrier for the first time.
Proposition 1.7.1 Let X and Y be independent random variables with probability density
functions pX (x) and pY (y). Then the joint probability density function of (X, Y ) is given by
pX,Y (x, y) = pX (x) pY (y).
Dropping the factor dxdy yields the desired result. We note that the converse holds true.
Each Ω_i is an event with the associated probability P(Ω_i). A simple function is a sum of
characteristic functions, f = Σ_{i=1}^n c_i χ_{Ω_i}. This means f(ω) = c_k for ω ∈ Ω_k. The integral of the
simple function f is defined by
∫_Ω f dP = Σ_{i=1}^n c_i P(Ω_i).
If X : Ω → R is a random variable such that there is a sequence of simple functions (fn )n≥1
satisfying:
1. f_n is fundamental in probability: ∀ǫ > 0, lim_{n,m→∞} P(ω; |f_n(ω) − f_m(ω)| ≥ ǫ) = 0;
2. f_n converges to X in probability: ∀ǫ > 0, lim_{n→∞} P(ω; |f_n(ω) − X(ω)| ≥ ǫ) = 0;
then the integral of X is defined as the following limit of integrals
∫_Ω X dP = lim_{n→∞} ∫_Ω f_n dP.
From now on, the integral notations ∫_Ω X dP and ∫_Ω X(ω) dP(ω) will be used interchangeably.
In the rest of the chapter the integral notation will be used formally, without requiring a direct
use of the previous definition.
1.9 Expectation
A random variable X : Ω → R is called integrable if
∫_Ω |X(ω)| dP(ω) = ∫_R |x| p(x) dx < ∞,
where p(x) denotes the probability density function of X. The previous identity is based on
changing the domain of integration from Ω to R.
The expectation of an integrable random variable X is defined by
E[X] = ∫_Ω X(ω) dP(ω) = ∫_R x p(x) dx.
Customarily, the expectation of X is denoted by µ and it is also called the mean. In general,
for any continuous function h : R → R, we have
E[h(X)] = ∫_Ω h(X(ω)) dP(ω) = ∫_R h(x) p(x) dx.
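The identity E[h(X)] = ∫ h(x) p(x) dx is also the basis of Monte Carlo estimation: averaging h over samples of X approximates the integral. A small sketch (our own illustration, assuming NumPy; the choice h(x) = cos x with X ∼ N(0, 1), for which E[cos X] = e^{−1/2}, is an assumption made for the example):

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal(2_000_000)   # samples of X ~ N(0, 1)
h = np.cos                           # a continuous function h

print(np.mean(h(X)))                 # Monte Carlo estimate of E[h(X)]
print(np.exp(-0.5))                  # exact value E[cos X] = e^{-1/2} ~ 0.6065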
Proposition 1.9.1 The expectation operator E is linear, i.e. for any integrable random vari-
ables X and Y
1. E[cX] = cE[X], ∀c ∈ R;
2. E[X + Y ] = E[X] + E[Y ].
Proof: It follows from the fact that the integral is a linear operator.
Proposition 1.9.2 Let X and Y be two independent integrable random variables. Then
E[XY ] = E[X]E[Y ].
Proof: This is a variant of Fubini’s theorem, which in this case states that a double integral
is a product of two simple integrals. Let pX , pY , pX,Y denote the probability densities of X, Y
and (X, Y ), respectively. Since X and Y are independent, by Proposition 1.7.1 we have
E[XY] = ∫∫ x y p_{X,Y}(x, y) dx dy = (∫ x p_X(x) dx)(∫ y p_Y(y) dy) = E[X] E[Y].
Proposition 1.10.1 Consider the probability space (Ω, F , P ), and let G be a σ-field included
in F . If X is a G-predictable random variable such that
∫_A X dP = 0,     ∀A ∈ G,
then X = 0 a.s.
Proof: In order to show that X = 0 almost surely, it suffices to prove that P(ω; X(ω) = 0) = 1.
We first show that X takes arbitrarily small values with probability one, i.e. ∀ǫ > 0
we have P(|X| < ǫ) = 1. To do this, let A = {ω; X(ω) ≥ ǫ}. Then
0 ≤ P(X ≥ ǫ) = ∫_A dP = (1/ǫ) ∫_A ǫ dP ≤ (1/ǫ) ∫_A X dP = 0,
and hence P(X ≥ ǫ) = 0. Similarly P(X ≤ −ǫ) = 0. Therefore P(|X| ≥ ǫ) = 0, i.e. P(|X| < ǫ) = 1.
Taking ǫ → 0 leads to P(|X| = 0) = 1. This can be formalized as follows. Let ǫ = 1/n and
consider B_n = {ω; |X(ω)| ≤ 1/n}, with P(B_n) = 1. Then
P(X = 0) = P(|X| = 0) = P(∩_{n=1}^∞ B_n) = lim_{n→∞} P(B_n) = 1.
then X = Y a.s.
Proof: Since ∫_A (X − Y) dP = 0, ∀A ∈ G, by Proposition 1.10.1 we have X − Y = 0 a.s.
1. E[X|G] is G-predictable;
2. ∫_A E[X|G] dP = ∫_A X dP,     ∀A ∈ G.
E[X|G] is called the conditional expectation of X given G.
We owe a few explanations regarding the correctness of the aforementioned definition. The
existence of the G-predictable random variable E[X|G] is assured by the Radon-Nikodym theo-
rem. The almost sure uniqueness is an application of Proposition 1.10.1 (see the discussion at
point 2 of Section 1.10).
It is worth noting that the expectation of X, denoted by E[X] is a number, while the
conditional expectation E[X|G] is a random variable. When are they equal and what is their
relationship? The answer is inferred by the following solved exercises.
Proof: We need to show that E[X] satisfies conditions 1 and 2. The first one is obviously
satisfied since any constant is G-predictable. The latter condition is checked on each set of G.
We have
∫_Ω X dP = E[X] = E[X] ∫_Ω dP = ∫_Ω E[X] dP,
∫_Ø X dP = 0 = ∫_Ø E[X] dP.
Example 1.11.2 Show that E[E[X|G]] = E[X], i.e. all conditional expectations have the same
mean, which is the mean of X.
Proof: Using the definition of expectation and taking A = Ω in the second relation of the
aforementioned definition, yields
E[E[X|G]] = ∫_Ω E[X|G] dP = ∫_Ω X dP = E[X],
Example 1.11.3 The conditional expectation of X given the total information F is the random
variable X itself, i.e.
E[X|F ] = X.
Proof: The random variables X and E[X|F ] are both F -predictable (from the definition of
the random variable). From the definition of the conditional expectation we have
∫_A E[X|F] dP = ∫_A X dP,     ∀A ∈ F.
Proposition 1.11.4 Let X and Y be two random variables on the probability space (Ω, F , P ).
We have
1. Linearity:
E[aX + bY |G] = aE[X|G] + bE[Y |G], ∀a, b ∈ R;
2. Factoring out the predictable part:
E[XY|G] = X E[Y|G], if X is G-predictable; in particular, E[c|G] = c for any constant c;
3. Tower property:
E[E[X|G]|H] = E[X|H], for any σ-field H ⊂ G;
4. Independence:
E[X|G] = E[X],
if X is independent of G.
Exercise 1.11.5 Prove the property 3 (tower property) given in the previous proposition.
Exercise 1.11.6 Let X be a random variable on the probability space (Ω, F, P), which is
independent of the σ-field G ⊂ F. Consider the characteristic function of a set A ⊂ Ω defined by
χ_A(ω) = 1 if ω ∈ A, and χ_A(ω) = 0 if ω ∉ A. Show the following:
(a) χA is G-predictable for any A ∈ G;
(b) P (A) = E[χA ];
(c) X and χA are independent random variables;
(d) E[χA X] = E[X]P (A) for any A ∈ G;
(e) E[X|G] = E[X].
ϕ(E[X]) ≤ E[ϕ(X)]
almost surely (i.e. the inequality might fail on a set of probability zero).
17
Figure 1.4: Jensen’s inequality ϕ(E[X]) < E[ϕ(X)] for a convex function ϕ.
Proof: We shall assume ϕ twice differentiable with ϕ′′ continuous. Let µ = E[X]. Expand ϕ
in a Taylor series about µ and get
ϕ(x) = ϕ(µ) + ϕ′(µ)(x − µ) + (1/2) ϕ′′(ξ)(x − µ)²,
with ξ in between x and µ. Since ϕ is convex, ϕ′′ ≥ 0, and hence
ϕ(x) ≥ ϕ(µ) + ϕ′(µ)(x − µ). Taking the expectation in this inequality yields
E[ϕ(X)] ≥ ϕ(µ) + ϕ′(µ)(E[X] − µ) = ϕ(E[X]).
For instance, if X is square integrable, choosing ϕ(x) = x² in Jensen's inequality yields E[X]² ≤ E[X²].
Since the right side is finite, it follows that E[X] < ∞, so X is integrable.
Application 1.12.3 If mX (t) denotes the moment generating function of the random variable
X with mean µ, then
m_X(t) ≥ e^{tµ}.
Proof: Applying Jensen's inequality with the convex function ϕ(x) = e^x to the random variable tX yields
e^{E[tX]} ≤ E[e^{tX}], i.e. e^{tµ} ≤ m_X(t).
Exercise 1.12.5 Prove that a non-constant random variable has a non-zero standard devia-
tion.
Exercise 1.12.6 Prove the following extension of Jensen’s inequality: If ϕ is a convex function,
then for any σ-field G ⊂ F we have
ϕ(E[X|G]) ≤ E[ϕ(X)|G].
Theorem 1.12.8 (Markov’s inequality) For any λ, p > 0, we have the following inequality:
P(ω; |X(ω)| ≥ λ) ≤ E[|X|^p] / λ^p.
Proof: Let A = {ω; |X(ω)| ≥ λ}. Then
E[|X|^p] = ∫_Ω |X(ω)|^p dP(ω) ≥ ∫_A |X(ω)|^p dP(ω) ≥ ∫_A λ^p dP(ω)
= λ^p ∫_A dP(ω) = λ^p P(A) = λ^p P(|X| ≥ λ).
Dividing by λ^p leads to the desired result.
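A quick numerical illustration of Markov's inequality (a sketch of our own, assuming NumPy; the exponential distribution and the values λ = 3, p = 2 are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
X = rng.exponential(scale=1.0, size=1_000_000)   # X >= 0, so |X| = X
lam, p = 3.0, 2

print(np.mean(X >= lam))        # P(|X| >= lam); exact value e^{-3} ~ 0.0498
print(np.mean(X**p) / lam**p)   # Markov bound E[|X|^p]/lam^p = 2/9 ~ 0.222, indeed larger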
Theorem 1.12.10 (Chernoff bounds) Let X be a random variable. Then for any λ > 0 we
have
1. P(X ≥ λ) ≤ E[e^{tX}] / e^{λt},     ∀t > 0;
2. P(X ≤ λ) ≤ E[e^{tX}] / e^{λt},     ∀t < 0.
Proof: 1. Let t > 0 and denote Y = etX . By Markov’s inequality
P(Y ≥ e^{λt}) ≤ E[Y] / e^{λt}.
Since Y ≥ e^{λt} if and only if X ≥ λ (as t > 0), the first inequality follows.
Then we have
P(X ≥ λ) ≤ m(t) / e^{λt} = e^{(µ−λ)t + t²σ²/2},     ∀t > 0,
which implies
P(X ≥ λ) ≤ e^{min_{t>0} [(µ−λ)t + t²σ²/2]}.
It is easy to see that the quadratic function f(t) = (µ − λ)t + t²σ²/2 attains its minimum
at t = (λ − µ)/σ². Since t > 0, λ needs to satisfy λ > µ. Then
min_{t>0} f(t) = f((λ − µ)/σ²) = −(λ − µ)²/(2σ²).
Substituting into the previous formula, we obtain the following result:
P(X ≥ λ) ≤ e^{−(λ−µ)²/(2σ²)},     for λ > µ.
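The quality of this bound can be checked directly. The sketch below (our own illustration; it uses only Python's standard math module, and the choice X ∼ N(0, 1) with λ = 2 is an assumption made for the example) compares the exact normal tail with the optimized Chernoff bound e^{−(λ−µ)²/(2σ²)}:

import math

mu, sigma, lam = 0.0, 1.0, 2.0
exact = 0.5 * math.erfc((lam - mu) / (sigma * math.sqrt(2)))   # P(X >= lam) = 1 - Phi((lam-mu)/sigma)
bound = math.exp(-(lam - mu)**2 / (2 * sigma**2))              # Chernoff bound, valid for lam > mu

print(exact)   # ~ 0.0228
print(bound)   # ~ 0.1353, an upper bound, as expected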
Markov’s, Tchebychev’s and Chernoff’s inequalities will be useful later when computing
limits of random variables.
The next inequality is called Tchebychev’s inequality for monotone sequences of numbers.
Lemma 1.12.13 Let (ai ) and (bi ) be two sequences of real numbers such that either
a1 ≤ a2 ≤ · · · ≤ an , b1 ≤ b2 ≤ · · · ≤ bn
or
a_1 ≥ a_2 ≥ · · · ≥ a_n,     b_1 ≥ b_2 ≥ · · · ≥ b_n.
If (λ_i) is a sequence of non-negative numbers such that Σ_{i=1}^n λ_i = 1, then
(Σ_{i=1}^n λ_i a_i)(Σ_{i=1}^n λ_i b_i) ≤ Σ_{i=1}^n λ_i a_i b_i.
Proof: Since the sequences (a_i) and (b_i) are either both increasing or both decreasing,
(a_i − a_j)(b_i − b_j) ≥ 0.
Multiplying by λ_i λ_j ≥ 0 and summing over i and j yields
Σ_j λ_j Σ_i λ_i a_i b_i − Σ_i λ_i a_i Σ_j λ_j b_j − Σ_j λ_j a_j Σ_i λ_i b_i + Σ_i λ_i Σ_j λ_j a_j b_j ≥ 0.
Using Σ_j λ_j = 1, the expression becomes
Σ_i λ_i a_i b_i ≥ (Σ_i λ_i a_i)(Σ_j λ_j b_j),
Proposition 1.12.14 Let X be a random variable and f and g be two functions, both increasing
or both decreasing. Then
E[f(X) g(X)] ≥ E[f(X)] E[g(X)].     (1.12.5)
Proof: Let x_0 < x_1 < · · · < x_n be a partition of the interval I, with ∆x = x_{k+1} − x_k. Using Lemma
1.12.13 with a_j = f(x_j), b_j = g(x_j), and λ_j = p(x_j)∆x, we obtain the following inequality between Riemann sums
Σ_j f(x_j) g(x_j) p(x_j)∆x ≥ (Σ_j f(x_j) p(x_j)∆x)(Σ_j g(x_j) p(x_j)∆x).
Taking the limit ‖∆x‖ → 0 we obtain (1.12.5), which is the desired result.
and we shall write ac-lim_{n→∞} X_n = X. An important example where this type of limit occurs is
the Strong Law of Large Numbers:
If X_n is a sequence of independent and identically distributed random variables with the
same mean µ, then ac-lim_{n→∞} (X_1 + · · · + X_n)/n = µ.
It is worth noting that this type of convergence is also known under the name of strong
convergence. This is the reason why the aforementioned theorem bears its name.
Example 1.13.1 Let Ω = {H, T} be the sample space obtained when a coin is flipped. Consider
the random variables X_n : Ω → {0, 1}, where X_n denotes the number of heads obtained at the n-th
flip. Obviously, the X_n are i.i.d., with the distribution given by P(X_n = 0) = P(X_n = 1) = 1/2,
and the mean E[X_n] = 0 · 1/2 + 1 · 1/2 = 1/2. Then X_1 + · · · + X_n is the number of heads obtained
after n flips of the coin. By the law of large numbers, (X_1 + · · · + X_n)/n tends to 1/2 strongly,
as n → ∞.
This limit will be abbreviated by ms-lim_{n→∞} X_n = X. The mean square convergence is useful
when defining the Ito integral.
Example 1.13.1 Consider a sequence X_n of random variables such that there is a constant k
with E[X_n] → k and Var(X_n) → 0 as n → ∞. Show that ms-lim_{n→∞} X_n = k.
Exercise 1.13.3 If Xn tends to X in mean square, with E[X 2 ] < ∞, show that:
(a) E[Xn ] → E[X] as n → ∞;
(b) E[Xn2 ] → E[X 2 ] as n → ∞;
(c) V ar[Xn ] → V ar[X] as n → ∞;
(d) Cov(Xn , X) → V ar[X] as n → ∞.
Exercise 1.13.4 If Xn tends to X in mean square, show that E[Xn |H] tends to E[X|H] in
mean square.
Proof: Let ms-lim_{n→∞} Y_n = Y. Let ǫ > 0 be arbitrarily fixed. Applying Markov's inequality with
X = Y_n − Y, p = 2 and λ = ǫ, yields
0 ≤ P(|Y_n − Y| ≥ ǫ) ≤ E[|Y_n − Y|²] / ǫ².
The right side tends to 0 as n → ∞, hence
lim_{n→∞} P(|Y_n − Y| ≥ ǫ) = 0,
which means that Y_n converges stochastically to Y. Similarly, Markov's inequality with p = 1 yields
0 ≤ P(ω; |X_n(ω)| ≥ ǫ) ≤ E[|X_n|] / ǫ.
Remark 1.13.7 The conclusion still holds true even in the case when there is a p > 0 such
that E[|Xn |p ] → 0 as n → ∞.
Limit in Distribution
We say the sequence X_n converges in distribution to X if for any continuous bounded function
ϕ(x) we have
lim_{n→∞} E[ϕ(X_n)] = E[ϕ(X)].
This type of limit is even weaker than the stochastic convergence, i.e. it is implied by it.
An application of the limit in distribution is obtained if we consider ϕ(x) = eitx . In this
case, if Xn converges in distribution to X, then the characteristic function of Xn converges to
the characteristic function of X. In particular, the probability density of Xn approaches the
probability density of X.
It can be shown that the convergence in distribution is equivalent to the condition lim_{n→∞} F_{X_n}(x) = F_X(x) at every point x where F_X is continuous.
Remark 1.13.8 The almost certain convergence implies the stochastic convergence, and the
stochastic convergence implies the limit in distribution. The proof of these statements is beyond
the goal of this book. The interested reader can consult a graduate text in probability theory.
1. ms-lim_{n→∞} (X_n + Y_n) = 0;
2. ms-lim_{n→∞} (X_n Y_n) = 0.
Proof: Since ms-lim_{n→∞} X_n = 0, we have lim_{n→∞} E[X_n²] = 0. Applying the Squeeze Theorem to the
inequality
0 ≤ E[X_n]² ≤ E[X_n²]
yields lim_{n→∞} E[X_n] = 0. Then
lim_{n→∞} Var[X_n] = lim_{n→∞} E[X_n²] − lim_{n→∞} E[X_n]² = 0.
Similarly, we have lim_{n→∞} E[Y_n²] = 0, lim_{n→∞} E[Y_n] = 0 and lim_{n→∞} Var[Y_n] = 0. Then
lim_{n→∞} σ_{X_n} = lim_{n→∞} σ_{Y_n} = 0. Using the correlation formula of the two random variables X_n and Y_n,
Corr(X_n, Y_n) = Cov(X_n, Y_n) / (σ_{X_n} σ_{Y_n}),
and the fact that |Corr(X_n, Y_n)| ≤ 1, we obtain |Cov(X_n, Y_n)| ≤ σ_{X_n} σ_{Y_n}.
Since lim_{n→∞} σ_{X_n} σ_{Y_n} = 0, from the Squeeze Theorem it follows that
lim_{n→∞} Cov(X_n, Y_n) = 0.
Proposition 1.14.2 If the sequences of random variables Xn and Yn converge in the mean
square, then
The evolution in time of a given state of the world ω ∈ Ω given by the function t ↦ X_t(ω) is
called a path or realization of X_t. The study of stochastic processes using computer simulations
is based on retrieving information about the process X_t given a large number of its realizations.
Consider that all the information accumulated until time t is contained by the σ-field Ft .
This means that Ft contains the information of which events have already occurred until time
t, and which did not. Since the information is growing in time, we have
Fs ⊂ Ft ⊂ F
Xt = E[X|Ft ].
From the definition of conditional expectation, the random variable Xt is Ft -predictable, and
can be regarded as the measurement of X at time t using the information Ft . If the accumulated
knowledge Ft increases and eventually equals the σ-field F , then X = E[X|F ], i.e. we obtain
the entire random variable. The process Xt is adapted to Ft .
Example 1.15.3 Don Joe is asking a doctor how long he still has to live. The age at which he
will pass away is a random variable, denoted by X. Given his medical condition today, which
is contained in Ft , the doctor infers that Mr. Joe will die at the age of Xt = E[X|Ft ]. The
stochastic process Xt is adapted to the medical knowledge Ft .
Remark 1.15.5 The first condition states that the unconditional forecast is finite: E[|X_t|] =
∫_Ω |X_t| dP < ∞. Condition 2 says that the value X_t is known, given the information set F_t.
This can be also stated by saying that Xt is Ft -predictable. The third relation asserts that the
best forecast of unobserved future values is the last observation on Xt .
Example 1.15.1 Let Xt denote Mr. Li Zhu’s salary after t years of work at the same company.
Since Xt is known at time t and it is bounded above, as all salaries are, then the first two
conditions hold. Being honest, Mr. Zhu expects today that his future salary will be the same as
today’s, i.e. Xs = E[Xt |Fs ], for s < t. This means that Xt is a martingale.
If Mr. Zhu is optimistic and believes as of today that his future salary will increase, then
Xt is a submartingale.
Exercise 1.15.8 Let Xt and Yt be martingales with respect to the filtration Ft . Show that for
any a, b, c ∈ R the process Zt = aXt + bYt + c is a Ft -martingale.
Exercise 1.15.10 Two processes Xt and Yt are called conditionally uncorrelated, given Ft , if
Let Xt and Yt be martingale processes. Show that the process Zt = Xt Yt is a martingale if and
only if Xt and Yt are conditionally uncorrelated. Assume that Xt , Yt and Zt are integrable.
Exercise 1.15.13 (a) Let X be a normally distributed random variable with mean µ 6= 0 and
variance σ 2 . Prove that there is a unique θ 6= 0 such that E[eθX ] = 1.
(b) Let (X_i)_{i≥0} be a sequence of independent, identically normally distributed random variables
with mean µ ≠ 0. Consider the sum S_n = Σ_{j=0}^n X_j. Show that Z_n = e^{θS_n} is a martingale, with θ defined
in part (a).
This chapter deals with the most commonly used stochastic processes and their basic properties.
The two main basic processes are the Brownian motion and the Poisson process. The other
processes described in this chapter are derived from the previous two.
Bt − Bs ∼ N (0, |t − s|).
The process Xt = x + Bt has all the properties of a Brownian motion that starts at x. Since
Bt − Bs is stationary, its distribution function depends only on the time interval t − s, i.e.
Bt ∼ N (0, t).
This implies also that the second moment is E[Bt2 ] = t. Let 0 < s < t. Since the increments
are independent, we can write
Condition 4 also has a physical explanation. A pollen grain suspended in water is kicked
by a very large number of water molecules. The influence of each molecule on the grain is
independent of the other molecules. These effects average out into a resultant increment
of the grain coordinate. According to the Central Limit Theorem, this increment has to be
normally distributed.
Proposition 2.1.2 A Brownian motion process Bt is a martingale with respect to the infor-
mation set Ft = σ(Bs ; s ≤ t).
where we used that B_s is F_s-predictable (and hence E[B_s|F_s] = B_s) and that the increment
B_t − B_s is independent of the previous values of the process contained in the information set
F_s = σ(B_u; u ≤ s).
A process with similar properties as the Brownian motion was introduced by Wiener.
E[(Wt − Ws )2 ] = t − s, s ≤ t;
The only property Bt has and Wt seems not to have is that the increments are normally
distributed. However, there is no distinction between these two processes, as the following
result states.
In stochastic calculus we often need to use infinitesimal notation and its properties. If dWt
denotes the infinitesimal increment of a Wiener process in the time interval dt, the aforemen-
tioned properties become dWt ∼ N (0, dt), E[dWt ] = 0, and E[(dWt )2 ] = dt.
1 This follows from E[Wt − Ws ] = E[Wt − Ws |Fs ] = E[Wt |Fs ] − Ws = Ws − Ws = 0.
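These infinitesimal rules can be seen numerically. The following simulation sketch (our own illustration, assuming NumPy; the step dt = 10⁻³ and the sample size are arbitrary) draws a large number of increments dW ∼ N(0, dt) and checks that E[dW] ≈ 0 and E[(dW)²] ≈ dt:

import numpy as np

rng = np.random.default_rng(3)
dt = 1e-3
dW = rng.normal(0.0, np.sqrt(dt), size=1_000_000)   # increments dW ~ N(0, dt)

print(dW.mean())            # approximately 0, i.e. E[dW] = 0
print((dW**2).mean(), dt)   # approximately dt, i.e. E[(dW)^2] = dt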
Proposition 2.1.6 If Wt is a Wiener process with respect to the information set Ft , then
Yt = Wt2 − t is a martingale.
Let s < t. Using that the increments Wt −Ws and (Wt −Ws )2 are independent of the information
set Fs and applying Proposition 1.11.4 yields
Since W_t is normally distributed with mean 0 and variance t, its density function is
φ_t(x) = (1/√(2πt)) e^{−x²/(2t)}.
Then its distribution function is
F_t(x) = P(W_t ≤ x) = (1/√(2πt)) ∫_{−∞}^x e^{−u²/(2t)} du.
Even if the increments of a Brownian motion are independent, their values are still corre-
lated.
² These types of processes are called Markov processes.
Cov(W_s, W_t) = Cov(W_s, W_s + W_t − W_s)
= Cov(W_s, W_s) + Cov(W_s, W_t − W_s)
= Var(W_s) + E[W_s(W_t − W_s)] − E[W_s] E[W_t − W_s]
= s + E[W_s] E[W_t − W_s] − E[W_s] E[W_t − W_s]
= s,
where we used that W_s and W_t − W_s are independent and that E[W_s] = 0.
We can also arrive at the same result starting from the formula
Using that conditional expectations have the same expectation, factoring the predictable part
out, and using that Wt is a martingale, we have
so Cov(Ws , Wt ) = s.
2. The correlation formula yields
Corr(W_s, W_t) = Cov(W_s, W_t) / (σ(W_s) σ(W_t)) = s / (√s √t) = √(s/t).
Remark 2.1.9 Removing the order relation between s and t, the previous relations can also be
stated as
The following exercises state the translation and the scaling invariance of the Brownian
motion.
Exercise 2.1.10 For any t0 ≥ 0, show that the process Xt = Wt+t0 − Wt0 is a Brownian
motion. This can be also stated as saying that the Brownian motion is translation invariant.
Exercise 2.1.11 For any λ > 0, show that the process X_t = (1/√λ) W_{λt} is a Brownian motion.
This says that the Brownian motion is invariant by scaling.
Figure 2.1: a Three simulations of the Brownian motion process Wt ; b Two simulations of the
geometric Brownian motion process eWt .
Exercise 2.1.12 Let 0 < s < t < u. Show the following multiplicative property
Exercise 2.1.14 (a) Use the martingale property of Wt2 − t to find E[(Wt2 − t)(Ws2 − s)];
(b) Evaluate E[Wt2 Ws2 ];
(c) Compute Cov(Wt2 , Ws2 );
(d) Find Corr(Wt2 , Ws2 ).
Exercise 2.1.16 The process X_t = |W_t| is called Brownian motion reflected at the origin.
Show that
(a) E[|W_t|] = √(2t/π);
(b) Var(|W_t|) = (1 − 2/π) t.
Exercise 2.1.19 Show that the following processes are Brownian motions
(a) Xt = WT − WT −t , 0 ≤ t ≤ T ;
(b) Yt = −Wt , t ≥ 0.
Proposition 2.2.2 The geometric Brownian motion Xt = eWt is log-normally distributed with
mean et/2 and variance e2t − et .
Proof: Since Wt is normally distributed, then Xt = eWt will have a log-normal distribution.
Using Lemma 2.2.1 we have
p(x) = d/dx F_{X_t}(x) = (1/(x√(2πt))) e^{−(ln x)²/(2t)}, if x > 0, and p(x) = 0 elsewhere.
W_{s_1} + · · · + W_{s_n} = n(W_{s_1} − W_0) + (n − 1)(W_{s_2} − W_{s_1}) + · · · + (W_{s_n} − W_{s_{n−1}})
= X_1 + X_2 + · · · + X_n.     (2.3.1)
Since the increments of a Brownian motion are independent and normally distributed, we have
X_1 ∼ N(0, n²∆s), X_2 ∼ N(0, (n − 1)²∆s), X_3 ∼ N(0, (n − 2)²∆s), . . . , X_n ∼ N(0, ∆s).
Theorem 2.3.1 If Xj are independent random variables normally distributed with mean µj
and variance σj2 , then the sum X1 +· · ·+Xn is also normally distributed with mean µ1 +· · ·+µn
and variance σ12 + · · · + σn2 .
Then
X_1 + · · · + X_n ∼ N(0, (1 + 2² + 3² + · · · + n²)∆s) = N(0, n(n + 1)(2n + 1)∆s/6),
with ∆s = t/n. Using (2.3.1) yields
(t/n)(W_{s_1} + · · · + W_{s_n}) ∼ N(0, (n + 1)(2n + 1)t³/(6n²)).
"Taking the limit" we get
Z_t ∼ N(0, t³/3).
Proposition 2.3.2 The integrated Brownian motion Zt has a normal distribution with mean
0 and variance t3 /3.
Remark 2.3.3 The aforementioned limit was taken heuristically, without specifying the type
of the convergence. In order to make this to work, the following result is usually used:
If Xn is a sequence of normal random variables that converges in mean square to X, then
the limit X is normal distributed, with E[Xn ] → E[X] and V ar(Xn ) → V ar(X), as n → ∞.
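Proposition 2.3.2 can also be checked by a direct simulation of the integrated Brownian motion (a sketch of our own, assuming NumPy; the Riemann-sum discretization, the value t = 2 and the grid size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(5)
t, n, N = 2.0, 1000, 20_000
dt = t / n
dW = rng.normal(0.0, np.sqrt(dt), size=(N, n))   # Brownian increments for N paths
W = np.cumsum(dW, axis=1)                        # W_s on the time grid
Z = W.sum(axis=1) * dt                           # Riemann sum for Z_t = int_0^t W_s ds

print(Z.mean())             # approximately 0
print(Z.var(), t**3 / 3)    # approximately t^3/3 = 8/3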
The mean and the variance can also be computed in a direct way as follows. By Fubini’s
theorem we have
E[Z_t] = E[∫_0^t W_s ds] = ∫_Ω (∫_0^t W_s ds) dP
= ∫_0^t (∫_Ω W_s dP) ds = ∫_0^t E[W_s] ds = 0,
where
D1 = {(u, v); u > v, 0 ≤ u ≤ t}, D2 = {(u, v); u < v, 0 ≤ u ≤ t}
The first integral can be evaluated using Fubini’s theorem
∫∫_{D_1} min{u, v} du dv = ∫∫_{D_1} v du dv = ∫_0^t (∫_0^u v dv) du = ∫_0^t u²/2 du = t³/6.
(b) Use the first part to find the mean and variance of Zt .
Exercise 2.3.5 Let s < t. Show that the covariance of the integrated Brownian motion is given
by
Cov(Z_s, Z_t) = s²(t/2 − s/6),     s < t.
Exercise 2.3.6 Show that
(a) Cov(Z_t, Z_t − Z_{t−h}) = (1/2) t² h + o(h), where o(h) denotes a quantity such that lim_{h→0} o(h)/h = 0;
(b) Cov(Z_t, W_t) = t²/2.
Exercise 2.3.7 Show that
E[e^{W_s + W_u}] = e^{(u+s)/2} e^{min{s,u}}.
Exercise 2.3.8 Consider the process X_t = ∫_0^t e^{W_s} ds.
(a) Find the mean of Xt ;
(b) Find the variance of Xt .
Exercise 2.3.9 Consider the process Z_t = ∫_0^t W_u du, t > 0.
(a) Show that E[ZT |Ft ] = Zt + Wt (T − t), for any t < T ;
(b) Prove that the process Mt = Zt − tWt is an Ft -martingale.
V_t = e^{Z_t}.
Var(V_t) = E[V_t²] − E[V_t]² = e^{2t³/3} − e^{t³/3},
Cov(V_s, V_t) = e^{(t+3s)/2}.
Exercise 2.4.1 Show that E[V_T | F_t] = V_t e^{(T−t)W_t + (T−t)³/6} for t < T.
using that the increments Wt − W0 and W1 − Wt are independent and normally distributed,
with
Wt − W0 ∼ N (0, t), W1 − Wt ∼ N (0, 1 − t),
it follows that Xt is normally distributed with
E[Yt ] = µt + E[Wt ] = µt
and variance
V ar[Yt ] = V ar[µt + Wt ] = V ar[Wt ] = t.
Exercise 2.6.1 Find the distribution and the density functions of the process Yt .
Proof: Since the Brownian motions W_1(t), . . . , W_n(t) are independent, their joint density
function is
f_{W_1(t)···W_n(t)}(x_1, . . . , x_n) = f_{W_1(t)}(x_1) · · · f_{W_n(t)}(x_n) = (1/(2πt)^{n/2}) e^{−(x_1² + · · · + x_n²)/(2t)},     t > 0.
In the next computation we shall use the following formula of integration that follows from
the use of polar coordinates
∫_{{|x|≤ρ}} f(x) dx = σ(S^{n−1}) ∫_0^ρ r^{n−1} g(r) dr,     (2.7.3)
where f(x) = g(|x|) is a function on R^n with spherical symmetry, and where
σ(S^{n−1}) = 2π^{n/2} / Γ(n/2)
is the area of the (n − 1)-dimensional sphere in R^n.
Let ρ ≥ 0. The distribution function of R_t is
F_R(ρ) = P(R_t ≤ ρ) = ∫_{{R_t ≤ ρ}} f_{W_1(t)···W_n(t)}(x_1, . . . , x_n) dx_1 · · · dx_n
= ∫_{x_1²+···+x_n² ≤ ρ²} (1/(2πt)^{n/2}) e^{−(x_1²+···+x_n²)/(2t)} dx_1 · · · dx_n
= ∫_0^ρ r^{n−1} (∫_{S(0,1)} (1/(2πt)^{n/2}) e^{−r²/(2t)} dσ) dr
= (σ(S^{n−1}) / (2πt)^{n/2}) ∫_0^ρ r^{n−1} e^{−r²/(2t)} dr.
Differentiating yields
p_t(ρ) = d/dρ F_R(ρ) = (σ(S^{n−1}) / (2πt)^{n/2}) ρ^{n−1} e^{−ρ²/(2t)}
= (2 / ((2t)^{n/2} Γ(n/2))) ρ^{n−1} e^{−ρ²/(2t)},     ρ > 0, t > 0.
It is worth noting that in the 2-dimensional case the aforementioned density becomes a
particular case of the Weibull distribution with parameters m = 2 and α = 2t, known as the Rayleigh
distribution
p_t(x) = (x/t) e^{−x²/(2t)},     x > 0, t > 0.
(1 − ρ²/(4t)) ρ²/(2t) < P(R_t ≤ ρ) < ρ²/(2t).
Exercise 2.7.4 Let X_t = R_t / t, t > 0, where R_t is a 2-dimensional Bessel process. Show that
X_t → 0 as t → ∞ in mean square.
where we used that Ns is Fs -predictable (and hence E[Ns |Fs ] = Ns ) and that the increment
Nt − Ns is independent of previous values of Ns and the information set Fs . Subtracting λt
yields
E[Nt − λt|Fs ] = Ns − λs,
or E[Mt |Fs ] = Ms . Since it is obvious that Mt is integrable and Ft -adapted, it follows that Mt
is a martingale.
It is worth noting that the Poisson process Nt is not a martingale. The martingale process
Mt = Nt − λt is called the compensated Poisson process.
Exercise 2.8.6 Compute E[Nt2 |Fs ] for s < t. Is the process Nt2 an Fs -martingale?
Exercise 2.8.7 (a) Show that the moment generating function of the random variable N_t is
m_{N_t}(x) = e^{λt(e^x − 1)}.
(b) Show that the first few moments are given by
E[N_t] = λt,
E[N_t²] = λ²t² + λt,
E[N_t³] = λ³t³ + 3λ²t² + λt,
E[N_t⁴] = λ⁴t⁴ + 6λ³t³ + 7λ²t² + λt.
(c) Show that the first few central moments are given by
E[N_t − λt] = 0,
E[(N_t − λt)²] = λt,
E[(N_t − λt)³] = λt,
E[(N_t − λt)⁴] = 3λ²t² + λt.
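The moment formulas of Exercise 2.8.7 are easy to test by sampling N_t ∼ Poisson(λt) directly (a sketch of our own, assuming NumPy; the values λ = 1.5 and t = 2 are arbitrary):

import numpy as np

rng = np.random.default_rng(6)
lam, t = 1.5, 2.0
Nt = rng.poisson(lam * t, size=1_000_000)        # N_t ~ Poisson(lam * t)

print(Nt.mean(), lam * t)                               # E[N_t] = lam t
print(np.mean(Nt**2), (lam*t)**2 + lam*t)               # E[N_t^2] = lam^2 t^2 + lam t
print(np.mean((Nt - lam*t)**4), 3*(lam*t)**2 + lam*t)   # fourth central moment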
Exercise 2.8.8 Find the mean and variance of the process Xt = eNt .
Exercise 2.8.9 (a) Show that the moment generating function of the random variable M_t is
m_{M_t}(x) = e^{λt(e^x − x − 1)}.
(b) Show that the first few central moments of the increment M_t − M_s are given by
E[M_t − M_s] = 0,
E[(M_t − M_s)²] = λ(t − s),
E[(M_t − M_s)³] = λ(t − s),
E[(M_t − M_s)⁴] = λ(t − s) + 3λ²(t − s)².
Proposition 2.8.11 The random variables Tn are independent and exponentially distributed
with mean E[Tn ] = 1/λ.
Proof: We start by noticing that the events {T_1 > t} and {N_t = 0} are the same, since both
describe the situation that no events occurred up to time t. Then
P(T_1 > t) = P(N_t = 0) = e^{−λt},
so T_1 is exponentially distributed with mean 1/λ.
In order to show that the random variables T_1 and T_2 are independent, it suffices to show that
P(T_2 ≤ t | T_1 = s) = P(T_2 ≤ t),
i.e. the distribution function of T_2 is independent of the values of T_1. We note first that from
the independent increments property
P 0 jumps in (s, s + t], 1 jump in (0, s] = P (Ns+t − Ns = 0, Ns − N0 = 1)
= P (Ns+t − Ns = 0)P (Ns − N0 = 1) = P 0 jumps in (s, s + t] P 1 jump in (0, s] .
Figure 2.2: The Poisson process Nt and the waiting times S1 , S2 , · · · Sn . The shaded rectangle
has area n(Sn+1 − t).
Exercise 2.8.12 Prove that d/dt F_{S_n}(t) = λ e^{−λt} (λt)^{n−1} / (n − 1)!.
Exercise 2.8.13 Using that the interarrival times T1 , T2 , · · · are independent and exponentially
distributed, compute directly the mean E[Sn ] and variance V ar(Sn ).
called the integrated Poisson process. The next result provides a relation between the process
Ut and the partial sum of the waiting times Sk .
Let Nt = n. Since Nt is equal to k between the waiting times Sk and Sk+1 , the process Ut ,
which is equal to the area of the subgraph of Nu between 0 and t, can be expressed as
U_t = ∫_0^t N_u du = 1·(S_2 − S_1) + 2·(S_3 − S_2) + · · · + n(S_{n+1} − S_n) − n(S_{n+1} − t).
Since S_n < t < S_{n+1}, the difference of the last two terms represents the area of the last
rectangle, which has length t − S_n and height n. Using associativity, a computation yields
U_t = nt − (S_1 + S_2 + · · · + S_n) = N_t t − Σ_{k=1}^{N_t} S_k,
where we replaced n by N_t.
The conditional distribution of the waiting times is provided by the following useful result.
Theorem 2.8.15 Given that N_t = n, the waiting times S_1, S_2, · · · , S_n have the joint density
function given by
f(s_1, s_2, · · · , s_n) = n! / t^n,     0 < s_1 ≤ s_2 ≤ · · · ≤ s_n < t.
This is the same as the density of an ordered sample of size n from a uniform distribution on
the interval (0, t). A naive explanation of this result is as follows. If we know that there will
be exactly n events during the time interval (0, t), since the events can occur at any time, each
of them can be considered uniformly distributed, with density f(s_k) = 1/t. Since it makes
sense to consider the events independent, taking into consideration all possible n! permutations,
the joint density function becomes f(s_1, · · · , s_n) = n! f(s_1) · · · f(s_n) = n!/t^n.
Exercise 2.8.16 Find the following means
(a) E[U_t];
(b) E[Σ_{k=1}^{N_t} S_k].
Exercise 2.8.17 Show that Var(U_t) = λt³/3.
Exercise 2.8.18 Can you apply a similar proof as in Proposition 2.3.2 to show that the inte-
grated Poisson process Ut is also a Poisson process?
Exercise 2.8.19 Let Y : Ω → N be a discrete random variable. Then for any random variable
X we have
E[X] = Σ_{y≥0} E[X|Y = y] P(Y = y).
Exercise 2.8.21 (a) Let T_k be the kth interarrival time. Show that
E[e^{−σT_k}] = λ/(λ + σ),     σ > 0.
(Hint: If we know that there are exactly n jumps in the interval [0, T], it makes sense to consider
the arrival times of the jumps T_i independent and uniformly distributed on [0, T].)
(d) Find the expectation E[e^{−σU_t}].
2.9 Submartingales
A stochastic process Xt on the probability space (Ω, F , P ) is called submartingale with respect
to the filtration Ft if:
(a) ∫_Ω |X_t| dP < ∞ (X_t integrable);
(b) X_t is adapted to the filtration F_t;
(c) E[X_{t+s}|F_t] ≥ X_t, ∀t, s ≥ 0 (future predictions exceed the present value).
Example 2.9.1 We shall prove that the process Xt = µt+ σWt , with µ > 0 is a submartingale.
The integrability follows from the inequality |Xt (ω)| ≤ µt + |Wt (ω)| and integrability of Wt .
The adaptability of Xt is obvious, and the last property follows from the computation:
Example 2.9.2 We shall show that the square of the Brownian motion, Wt2 , is a submartin-
gale.
The following result supplies examples of submartingales starting from martingales or sub-
martingales.
Proposition 2.9.3 (a) If Xt is a martingale and φ a convex function such that φ(Xt ) is
integrable, then the process Yt = φ(Xt ) is a submartingale.
(b) If Xt is a submartingale and φ an increasing convex function such that φ(Xt ) is integrable,
then the process Yt = φ(Xt ) is a submartingale.
Proof: (a) Using Jensen’s inequality for conditional probabilities, Exercise 1.12.6, we have
E[Y_{t+s}|F_t] = E[φ(X_{t+s})|F_t] ≥ φ(E[X_{t+s}|F_t]) = φ(X_t) = Y_t.
Corollary 2.9.4 (a) Let Xt be a martingale. Then Xt2 , |Xt |, eXt are submartingales.
(b) Let µ > 0. Then eµt+σWt is a submartingale.
P(sup_{s≤t} X_s ≥ x) ≤ E[X_t^+] / x,
Exercise 2.9.7 Show that st-lim_{t→∞} (sup_{s≤t} |W_s|) / t = 0.
Exercise 2.9.8 Show that for any martingale X_t we have the inequality
P(sup_{s≤t} X_s² > x) ≤ E[X_t²] / x,     ∀x > 0.
It is worth noting that Doob’s inequality implies Markov’s inequality. Since sup Xs ≥ Xt ,
s≤t
then P (Xt ≥ x) ≤ P (sup Xs ≥ x). Then Doob’s inequality
s≤t
E[Xt ]
P (sup Xs ≥ x) ≤
s≤t x
49
E[Xt ]
P (Xt ≥ x) ≤ .
x
Exercise 2.9.9 Let Nt denote the Poisson process and consider the information set Ft =
σ{Ns ; s ≤ t}.
(a) Show that Nt is a submartingale;
(b) Is Nt2 a submartingale?
Exercise 2.9.10 It can be shown that for any 0 < σ < τ we have the inequality
E[ sup_{σ≤t≤τ} (N_t/t − λ)² ] ≤ 4τλ/σ².
Using this inequality prove that ms-lim_{t→∞} N_t/t = λ.
Chapter 3
Properties of Stochastic Processes
Ft ⊂ Fs ⊂ F , ∀t < s.
Assume that the decision to stop playing a game before or at time t is determined by the
information Ft available at time t. Then this decision can be modeled by a random variable
τ : Ω → [0, ∞] which satisfies
{ω; τ (ω) ≤ t} ∈ Ft .
This means that given the information set Ft , we know whether the event {ω; τ (ω) ≤ t} had
occurred or not. We note that the possibility τ = ∞ is included, since the decision to continue
the game forever is also a possible event. However, we require the condition P(τ < ∞) = 1. A
random variable τ with the previous properties is called a stopping time.
The next example illustrates a few cases when a decision is or is not a stopping time. In order
to accomplish this, think of the situation that τ is the time when some random event related
to a given stochastic process occurs first.
Example 3.1.1 Let Ft be the information available until time t regarding the evolution of a
stock. Assume the price of the stock at time t = 0 is $50 per share. The following decisions are
stopping times:
(a) Sell the stock when it reaches for the first time the price of $100 per share;
(b) Buy the stock when it reaches for the first time the price of $10 per share;
(c) Sell the stock at the end of the year;
(d) Sell the stock either when it reaches for the first time $80 or at the end of the year.
(e) Keep the stock either until the initial investment doubles or until the end of the year;
The following decisions are not stopping times:
(f ) Sell the stock when it reaches the maximum level it will ever be;
(g) Keep the stock until the initial investment at least doubles.
Part (f) is not a stopping time because it requires information about the future, which is not
contained in F_t. For part (g), since the initial stock price is S_0 = $50, the general theory of
stock prices states
P(S_t ≥ 2S_0) = P(S_t ≥ 100) < 1,
i.e. there is a positive probability that the stock never doubles its value. This contradicts the
condition P(τ = ∞) = 0. In part (e) there are two conditions; the latter one occurs with
probability 1.
Exercise 3.1.2 Show that any positive constant, τ = c, is a stopping time with respect to any
filtration.
Exercise 3.1.3 Let τ (ω) = inf{t > 0; |Wt (ω)| > K}, with K > 0 constant. Show that τ is a
stopping time with respect to the filtration Ft = σ(Ws ; s ≤ t).
The random variable τ is called the first exit time of the Brownian motion Wt from the interval
(−K, K). In a similar way one can define the first exit time of the process Xt from the interval
(a, b):
τ(ω) = inf{t > 0; X_t(ω) ∉ (a, b)} = inf{t > 0; X_t(ω) > b or X_t(ω) < a}.
Let X_0 < a. The first entry time of X_t into the interval (a, b) is defined as
τ(ω) = inf{t > 0; X_t(ω) ∈ (a, b)} = inf{t > 0; a < X_t(ω) < b}.
Exercise 3.1.4 Let Xt be a continuous stochastic process. Prove that the first exit time of Xt
from the interval (a, b) is a stopping time.
We shall present in the following some properties regarding operations with stopping times.
Consider the notations τ1 ∨ τ2 = max{τ1 , τ2 }, τ1 ∧ τ2 = min{τ1 , τ2 }, τ̄n = supn≥1 τn and
τ n = inf n≥1 τn .
Proposition 3.1.5 Let τ1 and τ2 be two stopping times with respect to the filtration Ft . Then
1. τ1 ∨ τ2
2. τ1 ∧ τ2
3. τ1 + τ2
are stopping times.
Proof: 1. We have
{ω; τ_1 ∨ τ_2 ≤ t} = {ω; τ_1 ≤ t} ∩ {ω; τ_2 ≤ t} ∈ F_t.
Since
P{ω; τ_i = ∞} = 1 − P{ω; τ_i < ∞} = 0,     i = 1, 2,
it follows that P{ω; τ_1 ∨ τ_2 = ∞} = 0 and hence P{ω; τ_1 ∨ τ_2 < ∞} = 1. Then τ_1 ∨ τ_2 is
a stopping time.
2. The event {ω; τ_1 ∧ τ_2 ≤ t} belongs to F_t if and only if its complement {ω; τ_1 ∧ τ_2 > t} does. Since
{ω; τ_1 ∧ τ_2 > t} = {ω; τ_1 > t} ∩ {ω; τ_2 > t},
with {ω; τ_1 > t} ∈ F_t and {ω; τ_2 > t} ∈ F_t (the σ-algebra F_t is closed under complements), the claim follows. The
fact that τ_1 ∧ τ_2 < ∞ almost surely has a similar proof.
τ1 ≤ c, τ2 ≤ t − c.
since
{ω; τ1 ≤ c} ∈ Fc ⊂ Ft , {ω; τ2 ≤ t − c} ∈ Ft−c ⊂ Ft .
Writing
{ω; τ1 + τ2 = ∞} = {ω; τ1 = ∞} ∪ {ω; τ2 = ∞}
yields
P{ω; τ_1 + τ_2 = ∞} ≤ P{ω; τ_1 = ∞} + P{ω; τ_2 = ∞} = 0,
and hence P{ω; τ_1 + τ_2 < ∞} = 1. It follows that τ_1 + τ_2 is a stopping time.
A filtration F_t is called right-continuous if F_t = ∩_{n=1}^∞ F_{t+1/n}, for t > 0. This means that the
information available at time t is a good approximation for any future infinitesimal information
Ft+ǫ ; or, equivalently, nothing more can be learned by peeking infinitesimally far into the
future.
Exercise 3.1.6 (a) Let Ft = σ{Ws ; s ≤ t}, where Wt is a Brownian motion. Show that Ft is
right-continuous.
(b) Let Nt = σ{Ns; s ≤ t}, where Nt is a Poisson process. Is this filtration right-continuous?
Proposition 3.1.7 Let Ft be right-continuous and (τn )n≥1 be a sequence of bounded stopping
times.
(a) Then sup_n τ_n and inf_n τ_n are stopping times.
(b) If the sequence (τn )n≥1 converges to τ , τ 6= 0, then τ is a stopping time.
Proof: (a) The fact that τ̄_n is a stopping time follows from
{ω; τ̄_n ≤ t} = ∩_{n≥1} {ω; τ_n ≤ t} ∈ F_t,
(b) Let lim τn = τ . Then there is an increasing (or decreasing) subsequence τnk of stopping
n→∞
times that tends to τ , so sup τnk = τ (or inf τnk = τ ). Since τnk are stopping times, by part
(a), it follows that τ is a stopping time.
The condition that τ_n is bounded is significant, since if we take τ_n = n as stopping times, then
sup_n τ_n = ∞ with probability 1, which does not satisfy the stopping time definition.
Exercise 3.1.8 Let τ be a stopping time.
(a) Let c ≥ 1 be a constant. Show that cτ is a stopping time.
(b) Let f : [0, ∞) → R be a continuous, increasing function satisfying f (t) ≥ t. Prove that
f (τ ) is a stopping time.
(c) Show that eτ is a stopping time.
Exercise 3.1.9 Let τ be a stopping time and c > 0 a constant. Prove that τ + c is a stopping
time.
Exercise 3.1.10 Let a be a constant and define τ = inf{t ≥ 0; Wt = a}. Is τ a stopping time?
Exercise 3.1.11 Let τ be a stopping time. Consider the following sequence τn = (m + 1)2−n
if m2−n ≤ τ < (m + 1)2−n (stop at the first time of the form k2−n after τ ). Prove that τn is
a stopping time.
Theorem 3.2.1 (Optional Stopping Theorem) Let (Mt )t≥0 be a right continuous Ft -martingale
and τ be a stopping time with respect to Ft . If either one of the following conditions holds:
1. τ is bounded, i.e. ∃N < ∞ such that τ ≤ N ;
2. ∃c > 0 such that E[|Mt |] ≤ c, ∀t > 0,
then E[Mτ ] = E[M0 ].
Proof: We shall sketch the proof for the case 1 only. Taking the expectation in relation
Mτ = Mτ ∧t + (Mτ − Mt )1{τ >t} ,
see Exercise 3.2.3, yields
E[Mτ ] = E[Mτ ∧t ] + E[Mτ 1{τ >t} ] − E[Mt 1{τ >t} ].
Since Mτ ∧t is a martingale, see Exercise 3.2.4 (b), then E[Mτ ∧t ] = E[M0 ]. The previous relation
becomes
E[Mτ ] = E[M0 ] + E[Mτ 1{τ >t} ] − E[Mt 1{τ >t} ], ∀t > 0.
Taking the limit yields
E[M_τ] = E[M_0] + lim_{t→∞} E[M_τ 1_{{τ>t}}] − lim_{t→∞} E[M_t 1_{{τ>t}}].     (3.2.1)
Both limits are equal to zero, since for t > N the indicator 1_{{τ>t}}, and hence the integrand, vanishes. Hence relation (3.2.1) yields E[M_τ] = E[M_0].
It is worth noting that the previous theorem is a special case of the more general Optional
Stopping Theorem of Doob:
Theorem 3.2.2 Let Mt be a right continuous martingale and σ, τ be two bounded stopping
times, with σ ≤ τ . Then Mσ , Mτ are integrable and
E[Mτ |Fσ ] = Mσ a.s.
Exercise 3.2.3 Show that
Mτ = Mτ ∧t + (Mτ − Mt )1{τ >t} ,
where
1_{{τ>t}}(ω) = 1 if τ(ω) > t, and 1_{{τ>t}}(ω) = 0 if τ(ω) ≤ t,
is the indicator function of the set {τ > t}.
Exercise 3.2.4 Let Mt be a right continuous martingale and τ be a stopping time. Show that
(a) Mτ is integrable;
(b) Mτ ∧t is a martingale.
Exercise 3.2.5 Show that letting σ = 0 in Theorem 3.2.2 yields Theorem 3.2.1.
[Figure: a sample path of the Brownian motion W_t reaching the level a for the first time at the time T_a.]
Lemma 3.3.1 Let Ta be the first time the Brownian motion Wt hits a. Then the distribution
function of Ta is given by
P(T_a ≤ t) = (2/√(2π)) ∫_{|a|/√t}^∞ e^{−y²/2} dy.
Proof: For any events A and B, the total probability formula gives
P(A) = P(A ∩ B) + P(A ∩ B^c) = P(A|B) P(B) + P(A|B^c) P(B^c).     (3.3.2)
Let a > 0. Using formula (3.3.2) for A = {ω; W_t(ω) ≥ a} and B = {ω; T_a(ω) ≤ t} yields
P(W_t ≥ a) = P(W_t ≥ a | T_a ≤ t) P(T_a ≤ t) + P(W_t ≥ a | T_a > t) P(T_a > t).
If Ta > t, the Brownian motion did not reach the barrier a yet, so we must have Wt < a.
Therefore
P (Wt ≥ a|Ta > t) = 0.
If Ta ≤ t, then WTa = a. Since the Brownian motion is a Markov process, it starts fresh at Ta .
Due to symmetry of the density function of a normal variable, Wt has equal chances to go up
or go down after the time interval t − Ta . It follows that
P(W_t ≥ a | T_a ≤ t) = 1/2.
P(T_a ≤ t) = 2 P(W_t ≥ a) = (2/√(2πt)) ∫_a^∞ e^{−x²/(2t)} dx = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy.
If a < 0, symmetry implies that the distribution of T_a is the same as that of T_{−a}, so we get
P(T_a ≤ t) = P(T_{−a} ≤ t) = (2/√(2π)) ∫_{−a/√t}^∞ e^{−y²/2} dy.
Remark 3.3.2 The previous proof is based on a more general principle called the Reflection
Principle: If τ is a stopping time for the Brownian motion Wt , then the Brownian motion
reflected at τ is also a Brownian motion.
Theorem 3.3.3 Let a ∈ R be fixed. Then the Brownian motion hits a (in a finite amount of
time) with probability 1.
The previous result stated that the Brownian motion hits the barrier a almost surely. The
next result shows that the expected time to hit the barrier is infinite.
The density of T_a is
p(t) = (|a|/√(2π)) e^{−a²/(2t)} t^{−3/2},     t > 0.
It has the mean E[T_a] = ∞ and the mode a²/3.
Proof: Differentiating in the formula of the distribution function¹
F_{T_a}(t) = P(T_a ≤ t) = (2/√(2π)) ∫_{a/√t}^∞ e^{−y²/2} dy
¹ One may use Leibniz's formula: d/dt ∫_{ϕ(t)}^{ψ(t)} f(u) du = f(ψ(t)) ψ′(t) − f(ϕ(t)) ϕ′(t).
[Figure: the density p(t) of the hitting time T_a; the maximum is attained at t = a²/3.]
Since T_a has a Pearson 5 distribution with α = 1/2 and β = a²/2, its mode is
β/(α + 1) = (a²/2)/(1/2 + 1) = a²/3.
Remark 3.3.5 The distribution has a peak at a2 /3. Then if we need to pick a small time
interval [t − dt, t + dt] in which the probability that the Brownian motion hits the barrier a is
maximum, we need to choose t = a2 /3.
Remark 3.3.6 The expected waiting time for Wt to reach the barrier a is infinite. However,
the expected waiting time for the Brownian motion Wt to hit either a or −a is finite, see Exercise
3.3.9.
Corollary 3.3.7 A Brownian motion process returns to the origin in a finite amount time with
probability 1.
Exercise 3.3.8 Try to apply the proof of Lemma 3.3.1 for the following stochastic processes
(a) Xt = µt + σWt , with µ, σ > 0 constants;
Z t
(b) Xt = Ws ds.
0
Where is the difficulty?
Exercise 3.3.9 Let a > 0 and consider the hitting time
τa = inf{t > 0; Wt = a or Wt = −a} = inf{t > 0; |Wt | = a}.
Prove that E[τa ] = a2 .
Exercise 3.3.10 (a) Show that the distribution function of the process
X_t = max_{s∈[0,t]} W_s
is given by
P(X_t ≤ a) = (2/√(2π)) ∫_0^{a/√t} e^{−y²/2} dy.
(b) Show that E[X_t] = √(2t/π) and Var(X_t) = (1 − 2/π) t.
Exercise 3.3.11 (a) Find the distribution of Y_t = |W_t|, t ≥ 0;
(b) Show that E[max_{0≤t≤T} |W_t|] = √(πT/2).
The fact that a Brownian motion returns to the origin or hits a barrier almost surely is
a property characteristic to the first dimension only. The next result states that in larger
dimensions this is no longer possible.
Theorem 3.3.12 Let (a, b) ∈ R2 . The 2-dimensional Brownian motion W (t) = W1 (t), W2 (t)
(with W1 (t) and W2 (t) independent) hits the point (a, b) with probability zero. The same result
is valid for any n-dimensional Brownian motion, with n ≥ 2.
However, if the point (a, b) is replaced by the disk Dǫ (x0 ) = {x ∈ R2 ; |x − x0 | ≤ ǫ}, there is
a difference in the behavior of the Brownian motion from n = 2 to n > 2.
Theorem 3.3.13 The 2-dimensional Brownian motion W (t) = W1 (t), W2 (t) hits the disk
Dǫ (x0 ) with probability one.
Theorem 3.3.14 Let n > 2. The n-dimensional Brownian motion W (t) = W1 (t), · · · Wn (t)
hits the ball Dǫ (x0 ) with probability
P = (|x_0|/ǫ)^{2−n} < 1.
The previous results can be stated by saying that the Brownian motion is transient in R^n,
for n > 2. If n = 2 the previous probability equals 1. We shall come back with proofs of the
aforementioned results in a later chapter.
Remark 3.3.15 If life spreads according to a Brownian motion, the aforementioned results
explain why life is more extensive on earth rather than in space. The probability for a form of
life to reach a planet of radius R situated at distance d is R/d. Since d is large the probability is
very small, unlike in the plane, where the probability is always 1.
Exercise 3.3.16 Is the one-dimensional Brownian motion transient or recurrent in R?
[Figure: a sample path of W_t; the times t_1 < t_2 and a level a are marked on the graph.]
Proposition 3.4.1 (a) If X : Ω → N is a discrete random variable, then for any subset A ⊂ Ω,
we have
P(A) = Σ_{x∈N} P(A|X = x) P(X = x).
Proof: (a) The sets X^{−1}(x) = {X = x} = {ω; X(ω) = x} form a partition of the sample space
Ω, i.e.:
(i) Ω = ∪_x X^{−1}(x);
(ii) X^{−1}(x) ∩ X^{−1}(y) = Ø for x ≠ y.
Then A = ∪_x (A ∩ X^{−1}(x)) = ∪_x (A ∩ {X = x}), and hence
P(A) = Σ_x P(A ∩ {X = x})
= Σ_x [P(A ∩ {X = x}) / P({X = x})] P({X = x})
= Σ_x P(A|X = x) P(X = x).
(b) In the case when X is continuous, the sum is replaced by an integral and the probability
P ({X = x}) by fX (x)dx, where fX is the density function of X.
Proof: Let A(a; t1 , t2 ) denote the event that the Brownian motion Wt takes on the value a
between t1 and t2 . In particular, A(0; t1 , t2 ) denotes the event that Wt has (at least) a zero
between t1 and t2 . Substituting A = A(0; t1 , t2 ) and X = Wt1 into the formula provided by
Proposition 3.4.1
Z
P (A) = P (A|X = x)fX (x) dx
yields
Z
P A(0; t1 , t2 ) = P A(0; t1 , t2 )|Wt1 = x fWt (x) dx (3.4.4)
1
Z ∞
1 − x2
= √ P A(0; t1 , t2 )|Wt1 = x e 2t1 dx
2πt1 −∞
Using the properties of Wt with respect to time translation and symmetry we have
P A(0; t1 , t2 )|Wt1 = x = P A(0; 0, t2 − t1 )|W0 = x
= P A(−x; 0, t2 − t1 )|W0 = 0
= P A(|x|; 0, t2 − t1 )|W0 = 0
= P A(|x|; 0, t2 − t1 )
= P T|x| ≤ t2 − t1 ,
the last identity stating that Wt hits |x| before t2 − t1 . Using Lemma 3.3.1 yields
Z ∞
2 − y2
P A(0; t1 , t2 )|Wt1 = x = p e 2(t2 −t1 ) dy.
2π(t2 − t1 ) |x|
Substituting into (3.4.4) we obtain
Z ∞ Z ∞ x2
1 2 − y2
P A(0; t1 , t2 ) = √ p e 2(t2 −t1 ) dy e− 2t1 dx
2πt1 −∞ 2π(t2 − t1 ) |x|
Z ∞Z ∞
1 − y2
−x
2
= p e 2(t2 −t1 ) 2t1 dydx.
π t1 (t2 − t1 ) 0 |x|
Exercise 3.4.5 Find the probability that a Brownian motion Wt does not take the value a in
the interval (t1 , t2 ).
Exercise 3.4.6 Let a 6= b. Find the probability that a Brownian motion Wt does not take any
of the values {a, b} in the interval (t1 , t2 ). Formulate and prove a generalization.
We provide below without proof a few similar results dealing with arc-sine probabilities. The
first result deals with the amount of time spent by a Brownian motion on the positive half-axis.
Rt
Theorem 3.4.7 (Arc-sine Law of Lévy) Let L+ +
t = 0 sgn Ws ds be the amount of time a
Brownian motion Wt is positive during the time interval [0, t]. Then
r
2 τ
P (L+t ≤ τ ) = arcsin .
π t
The next result deals with the Arc-sine law for the last exit time of a Brownian motion from
0.
Theorem 3.4.8 (Arc-sine Law of exit from 0) Let γt = sup{0 ≤ s ≤ t; Ws = 0}. Then
r
2 τ
P (γt ≤ τ ) = arcsin , 0 ≤ τ ≤ t.
π t
The Arc-sine law for the time the Brownian motion attains its maximum on the interval
[0, t] is given by the next result.
Theorem 3.4.9 (Arc-sine Law of maximum) Let Mt = max Ws and define
0≤s≤t
θt = sup{0 ≤ s ≤ t; Ws = Mt }.
Then r
2 s
P (θt ≤ s) = arcsin , 0 ≤ s ≤ t, t > 0.
π t
Theorem 3.5.1 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and
consider α, β > 0. Then
e2µβ − 1
P (Xt goes up to α before down to − β) = .
e2µβ − e−2µα
63
Proof: Let T = inf{t > 0; Xt = α or Xt = −β} be the first exit time of Xt from the interval
(−β, α), which is a stopping time, see Exercise 3.1.4. The exponential process
c2
Mt = ecWt − 2 t
, t≥0
is a martingale, see Exercise 2.2.4(c). Then E[Mt ] = E[M0 ] = 1. By the Optional Stopping
Theorem (see Theorem 3.2.1), we get E[MT ] = 1. This can be written as
1 2 1 2
1 = E[ecWT − 2 c T
] = E[ecXT −(cµ+ 2 c )T
]. (3.5.5)
Choosing c = −2µ yields E[e−2µXT ] = 1. Since the random variable XT takes only the values
α and −β, if let pα = P (XT = α), the previous relation becomes
e−2µα pα + e2µβ (1 − pα ) = 1.
e2µβ − 1 2βe2µβ β
lim = lim = .
µ→0 e2µβ − e−2µα µ→0 2βe2µβ + 2αe−2µα α+β
Hence
β
P (Wt goes up to α before down to − β) = .
α+β
P (Wt hits α) = 1.
If α = β we obtain
1
P (Wt goes up to α before down to − α) = ,
2
which shows that the Brownian motion is equally likely to go up or down an amount α in a
given time interval.
If Tα and Tβ denote the times when the process Xt reaches α and β, respectively, then the
aforementioned probabilities can be written using inequalities. For instance the first identity
becomes
e2µβ − 1
P (Tα ≤ T−β ) = 2µβ .
e − e−2µα
64
Exercise 3.5.2 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and
consider α > 0.
(a) If µ > 0 show that
P (Xt goes up to α) = 1.
(b) If µ < 0 show that
P (Xt goes up to α) = e2µα < 1.
P (sup(Wt + µt) ≥ α) = 1, µ ≥ 0,
t≥0
or
P (sup(Wt − γt) ≥ α) = e−2γα , γ > 0,
t≥0
which is known as one of the Doob’s inequalities. This can be also described in terms of stopping
times as follows. Define the stopping time τα = inf{t > 0; Wt − γt ≥ α}. Using
P (τα < ∞) = P sup(Wt − γt) ≥ α
t≥0
Exercise 3.5.3 Let Xt = µt + Wt denote a Brownian motion with nonzero drift rate µ, and
consider β > 0. Show that the probability that Xt never hits −β is given by
1 − e−2µβ , if µ > 0
0, if µ < 0.
Recall that T is the first time when the process Xt hits α or −β.
αe2µβ + βe−2µα − α − β
E[XT ] = .
e2µβ − e−2µα
(b) Find E[XT2 ];
(c) Compute V ar(XT ).
The next result deals with the time one has to wait (in expectation) for the process Xt =
µt + Wt to reach either α or −β.
Proposition 3.5.5 The expected value of T is
αe2µβ + βe−2µα − α − β
E[T ] = .
µ(e2µβ − e−2µα )
65
Proof: Using that Wt is a martingale, with E[Wt ] = E[W0 ] = 0, applying the Optional
Stopping Theorem, Theorem 3.2.1, yields
0 = E[WT ] = E[XT − µT ] = E[XT ] − µE[T ].
Then by Exercise 3.5.4(a) we get
E[XT ] αe2µβ + be−2µα − α − β
E[T ] = = .
µ µ(e2µβ − e−2µα )
Exercise 3.5.6 Take the limit µ → 0 in the formula provided by Proposition 3.5.5 to find the
expected time for a Brownian motion to hit either α or −β.
Exercise 3.5.7 Find E[T 2 ] and V ar(T ).
Exercise 3.5.8 (Wald’s identities) Let T be a finite stopping time for the Brownian motion
Wt . Show that
(a) E[WT ] = 0;
(b) E[WT2 ] = E[T ].
The previous techniques can be also applied to right continuous martingales. Let a > 0 and
consider the hitting time of the Poisson process of the barrier a
τ = inf{t > 0; Nt ≥ a}.
Proposition 3.5.9 The expected waiting time for Nt to reach the barrier a is E[τ ] = λa .
Proof: Since Mt = Nt −λt is a right continuous martingale, by the Optional Stopping Theorem
E[Mτ ] = E[M0 ] = 0. Then E[Nτ − λτ ] = 0 and hence E[τ ] = λ1 E[Nτ ] = λa .
Proposition 3.6.1 The moments of the first hitting time are all infinite E[τ n ] = ∞, n ≥ 1.
Another application involves the inverse Laplace transform to get the probability density.
This way we can retrieve the result of Proposition 3.3.4.
Proposition 3.6.2 The probability density of the hitting time τ is given by
|x| − x2
p(t) = √ e 2t , t > 0. (3.6.8)
2πt3
Proof: Let x > 0. The expectation
Z ∞
E[e−sτ ] = e−sτ p(τ ) dτ = L{p(τ )}(s)
0
is the Laplace transform of p(τ ). Applying the inverse Laplace transform yields
√
p(τ ) = L−1 {E[e−sτ ]}(τ ) = L−1 {e− 2sx
}(τ )
x x2
= √ e− 2τ , τ > 0.
2πτ 3
Proposition 3.6.4 (Chernoff bound) Let τ denote the first hitting time when the Brownian
motion Wt hits the barrier x, x > 0. Then
x2
P (τ ≤ λ) ≤ e− 2λ , ∀λ > 0.
Proof: Let s = −t in the part 2 of Theorem 1.12.10 and use (3.6.7) to get
E[etX ] E[e−sX ] √
P (τ ≤ λ) ≤ λt
= −λs
= eλs−x 2s , ∀s > 0.
e e
√
Then P (τ ≤ λ) ≤ emins>0 f (s) , where f (s) = λs − x 2s. Since f ′ (s) = λ − √x , then f (s)
2s
2
x
reaches its minimum at the critical point s0 = 2λ2 . The minimum value is
x2
min f (s) = f (s0 ) = − .
s>0 2λ
Substituting in the previous inequality leads to the required result.
Proposition 3.6.5 Let τ be the time the process Xt = µt + σWt hits x for the first time. Then
for s > 0, x > 0 we have √ 2 2
1
E[e−sτ ] = e σ2 (µ− 2sσ +µ )x . (3.6.10)
Proposition 3.6.6 Let τ be the time the process Xt = µt + σWt hits x, with x > 0 and µ > 0.
(a) Then the density function of τ is given by
x (x−µτ )2
p(τ ) = √ e− 2τ σ2 , τ > 0. (3.6.11)
σ 2πτ 3/2
(b) The mean and variance of τ are
x xσ 2
E[τ ] = , V ar(τ ) = .
µ µ3
Proof: (a) Let p(τ ) be the density function of τ . Since
Z ∞
E[e−sτ ] = e−sτ p(τ ) dτ = L{p(τ )}(s)
0
is the Laplace transform of p(τ ), applying the inverse Laplace transform yields
1
√ 2 2
p(τ ) = L−1 {E[e−sτ ]} = L−1 {e σ2 (µ− 2sσ +µ )x }
x (x−µτ )2
= √ e− 2τ σ2 , τ > 0.
σ 2πτ 3/2
(b) The moments are obtained by differentiating the moment generating function and taking
the value at s = 0
d d 1 √ 2 2
E[τ ] = − E[e−sτ ] = − e σ2 (µ− 2sσ +µ )x
ds s=0 ds s=0
x 1
√ 2 2
= p e σ2 (µ− 2sσ +µ )x
µ2 + 2sµ s=0
x
= .
µ
d2 d2 12 (µ−√2sσ2 +µ2 )x
E[τ 2 ] = (−1)2 E[e −sτ
] = eσ
ds2 s=0 ds2 s=0
xσ 2 x2
= + 2.
µ3 µ
Hence
xσ 2
V ar(τ ) = E[τ 2 ] − E[τ ]2 =.
µ3
It is worth noting that we can arrive at the formula E[τ ] = µx in the following heuristic way.
Taking the expectation in the equation µτ + σWτ = x yields µE[τ ] = x, where we used that
E[Wτ ] = 0 for any finite stopping time τ (see Exercise 3.5.8 (a)). Solving for E[τ ] yields the
aforementioned formula.
69
Exercise 3.6.9 Does 4t + 2Wt hit 9 faster (in expectation) than 5t + 3Wt hits 14?
Exercise 3.6.10 Let τ be the first time the Brownian motion with drift Xt = µt + Wt hits x,
where µ, x > 0. Prove the inequality
x2 +λ2 µ2
P (τ ≤ λ) ≤ e− 2λ +µx
, ∀λ > 0.
1 2 1
E[e−(cµ+ 2 c )T
]= ·
ecα pα + e−cβ (1 − pα )
If substitute s = cµ + 12 c2 , then
1
E[e−sT ] = √ √ · (3.6.12)
(−µ+ 2s+µ2 )α −(−µ+ 2s+µ2 )β
e pα + e (1 − pα )
The probability density of the stopping time T is obtained by taking the inverse Laplace trans-
form of the right side expression
n 1 o
p(T ) = L−1 √ √ (τ ),
2s+µ2 )α 2s+µ2 )β
e(−µ+ pα + e−(−µ+ (1 − pα )
an expression which is not feasible for having closed form solution. However, expression (3.6.12)
would be useful for computing the price for double barrier derivatives.
Exercise 3.6.12 Denote by Mt = Nt − λt the compensated Poisson process and let c > 0 be a
constant.
(a) Show that
c
Xt = ecMt −λt(e −c−1)
Limit in Distribution
We say that Xt converges in distribution to X if for any continuous bounded function ϕ(x) we
have
lim ϕ(Xt ) = ϕ(X).
t→∞
It is worth noting that the stochastic convergence implies the convergence in distribution.
Wt
Corollary 3.8.3 ms-lim = 0.
t→∞t
Rt
Application 3.8.4 Let Zt = 0 Ws ds. If β > 3/2, then
Zt
ms-lim = 0.
t→∞ tβ
Zt E[Zt ] 1 t3 1
Proof: Let Xt = β
. Then E[X t ] = β
= 0, and V ar[X t ] = 2β
V ar[Z t ] = 2β
= 2β−3 ,
t t t 3t 3t
1
for any t > 0. Since 2β−3 → 0 as t → ∞, applying Proposition 3.8.1 leads to the desired
3t
result.
then
E[Xt ] = 0
√
2 t
V ar[Xt ] = → 0, t → ∞.
t2
Apply Proposition 3.8.1 to get the desired result.
Remark 3.8.7 One of the strongest result regarding limits of Brownian motions is called the
law of iterated logarithms and was first proved by Lamperti:
Wt
lim sup p = 1,
t→∞ 2t ln(ln t)
almost certainly.
Wt
lim p = 1.
t→0+ 2t ln ln(1/t)
Wt
Application 3.8.9 We shall show that ac-lim = 0.
t→∞ t
Wt
From the law of iterated logarithms p < 1 for t large. Then
2t ln(ln t)
p p
Wt Wt 2 ln(ln t) 2 ln(ln t)
= p √ < √ .
t 2t ln(ln t) t t
p
2 ln(ln t) Wt
Let ǫt = √ . Then < ǫt for t large. As an application of the l’Hospital rule, it is
t t
not hard to see that ǫt satisfies the following limits
ǫ → 0, t→∞
√ t
ǫt t → ∞, t → ∞.
73
Wt
In order to show that ac-lim = 0, it suffices to prove
t→∞ t
Wt (ω)
P ω; < ǫt → 1, t → ∞. (3.8.13)
t
We have
Wt (ω) Z tǫt
1 − u2
P ω; < ǫt = P ω; −tǫt < Wt (ω) < tǫt = √ e 2t du
t −tǫt 2πt
Z √ Z
ǫt t ∞
1 v2 1 v2
= √
√ e− 2 dv → √ e− 2 dv = 1, t → ∞,
−ǫt t 2π −∞ 2π
√
where we used ǫt t → ∞, as t → ∞, which proves (3.8.13).
Proposition 3.8.10 Let Xt be a stochastic process. Then
Another convergence result can be obtained if we consider the continuous analog of Example
1.13.6:
Proposition 3.8.12 Let Xt be a stochastic process such that there is a p > 0 such that
E[|Xt |p ] → 0 as t → ∞. Then st-lim Xt = 0.
t→∞
The following result can be regarded as the L’Hospital’s rule for sequences:
Lemma 3.8.14 (Cesaró-Stoltz) Let xn and yn be two sequences of real numbers, n ≥ 1. If
xn+1 − xn
the limit lim exists and is equal to L, then the following limit exists
n→∞ yn+1 − yn
xn
lim = L.
n→∞ yn
74
Proof: (Sketch) Assume there are differentiable functions f and g such that f (n) = xn and
g(n) = yn . (How do you construct these functions?) From Cauchy’s theorem2 there is a
cn ∈ (n, n + 1) such that
f ′ (t)
lim = L.
t→∞ g ′ (t)
(Here one may argue against this, but we recall the freedom of choice for the functions f and
g such that cn can be any number between n and n + 1). By l’Hospital’s rule we get
f (t)
lim = L.
t→∞ g(t)
xn
Making t = n yields lim = L.
yn
t→∞
The next application states that if a sequence is convergent, then the arithmetic average of
its terms is also convergent, and the sequences have the same limit.
a1 + a2 + · · · + an
An =
n
be the arithmetic average of the first n terms. Then An is convergent and
lim An = L.
n→∞
then
xn+1 − xn
lim = lim an+1 = L.
n→∞ yn+1 − yn n→∞
2 This
says that if f and g are differentiable on (a, b) and continuous on [a, b], then there is a c ∈ (a, b) such
f (a) − f (b) f ′ (c)
that = ′ .
g(a) − g(b) g (c)
75
Gn = (b1 · b2 · · · · · bn )1/n
be the geometric average of the first n terms. Show that Gn is convergent and
lim Gn = L.
n→∞
The following result extends the Cesaró-Stoltz lemma to sequences of random variables.
Proposition 3.8.16 Let Xn be a sequence of random variables on the probability space (Ω, F , P ),
such that
Xn+1 − Xn
ac- lim = L.
n→∞ Yn+1 − Yn
Then
Xn
ac- lim = L.
n→∞ Yn
Xn (ω)
B = {ω ∈ Ω; lim = L}.
n→∞ Yn (ω)
Since for any given state of the world ω, the sequences xn = Xn (ω) and yn = Yn (ω) are
numerical sequences, Lemma 3.8.14 yields the inclusion A ⊂ B. This implies P (A) ≤ P (B).
Since P (A) = 1, it follows that P (B) = 1, which leads to the desired conclusion.
Example 3.8.17 Let Sn denote the price of a stock on day n, and assume that
ac-lim Sn = L.
n→∞
Then
S1 + · · · + Sn
ac-lim = L and ac-lim (S1 · · · · · Sn )1/n = L.
n→∞ n n→∞
This says that if almost all future simulations of the stock price approach the steady state
limit L, the arithmetic and geometric averages converge to the same limit. The statement is a
consequence of Proposition 3.8.16 and follows a similar proof as Example 3.8.1. Asian options
have payoffs depending on these type of averages, as we shall see in Part II.
Since E[|Xn |]2 ≤ E[Xn2 ], the condition (3.8.14) can be replaced by its stronger version
The next exrcise deals with a process that is a.c-convergent but is not ms-convergent.
Remark 3.8.22 The previous theorem remains valid if n is replaced by a continuous positive
parameter t.
Wt sin(Wt )
Example 3.8.2 Show that ac- lim = 0.
t→∞ t
Wt sin(Wt ) Wt
Proof: Consider the sequences Xt = 0, Yt = and Zt = . From Application 3.8.9
t t
we have ac-lim Zt = 0. Applying the Squeeze Theorem we obtain the desired result.
t→∞
77
Exercise 3.8.23 Use the Squeeze Theorem to find the following limits:
sin(Wt )
(a) ac-lim ;
t→∞ t
(b) ac-lim t cos Wt ;
t→0
(c) ac- lim et (sin Wt )2 .
t→−∞
Since the increments of a Brownian motion are independent, Proposition 4.0.7 yields
n−1
X n−1
X
E[Xn ] = E[(Wti+1 − Wti )2 ] = (ti+1 − ti )
i=0 i=0
= tn − t0 = T ;
n−1
X n−1
X
V ar(Xn ) = V ar[(Wti+1 − Wti )2 ] = 2(ti+1 − ti )2
i=0 i=0
2T 2
T 2
= n·2 = ,
n n
where we used that the partition is equidistant. Since Xn satisfies the conditions
E[Xn ] = T, ∀n ≥ 1;
V ar[Xn ] → 0, n → ∞,
by Proposition 3.8.1 we obtain ms-lim Xn = T , or
n→∞
n−1
X
ms- lim (Wti+1 − Wti )2 = T. (3.9.16)
n→∞
i=0
78
Exercise 3.9.2 Prove that the quadratic variation of the Brownian motion Wt on [a, b] is equal
to b − a.
while the left side can be regarded as a stochastic integral with respect to dWt2
Z T n−1
X
(dWt )2 = ms- lim (Wti+1 − Wti )2 .
0 n→∞
i=0
dWt2 = dt.
In fact, this expression also holds in the mean square sense, as it can be inferred from the next
exercise.
Exercise 3.9.3 Show that
(a) E[dWt2 − dt] = 0;
(b) V ar(dWt2 − dt) = o(dt);
(c) ms- lim (dWt2 − dt) = 0.
dt→0
Roughly speaking, the process dWt2 , which is the square of infinitesimal increments of a
Brownian motion, is totally predictable. This relation plays a central role in Stochastic Calculus
and will be useful when dealing with Ito’s lemma.
The following exercise states that dtdWt = 0, which is another important stochastic relation
useful in Ito’s lemma.
Exercise 3.9.4 Consider the equidistant partition 0 = t0 < t1 < · · · tn−1 < tn = T . Then
n−1
X
ms- lim (Wti+1 − Wti )(ti+1 − ti ) = 0. (3.9.18)
n→∞
i=0
79
Proposition 3.9.5 Let a < b and consider the partition a = t0 < t1 < · · · < tn−1 < tn = b.
Then
n−1
X
ms– lim (Mtk+1 − Mtk )2 = Nb − Na , (3.9.19)
k∆n k→0
k=0
Proof: For the sake of simplicity we shall use the following notations:
Let
Yk = (∆Mk )2 − ∆Nk = (∆Mk )2 − ∆Mk − λ∆tk .
It suffices to show that
h n−1
X i
E Yk = 0, (3.9.20)
k=0
h n−1
X i
lim V ar Yk = 0. (3.9.21)
n→∞
k=0
The first identity follows from the properties of Poisson processes (see Exercise 2.8.9)
h n−1
X i n−1
X n−1
X
E Yk = E[Yk ] = E[(∆Mk )2 ] − E[∆Nk ]
k=0 k=0 k=0
n−1
X
= (λ∆tk − λ∆tk ) = 0.
k=0
For the proof of the identity (3.9.21) we need to find first the variance of Yk .
where we used Exercise 2.8.9 and the fact that E[∆Mk ] = 0. Since Mt is a process with
independent increments, then Cov[Yk , Yj ] = 0 for i 6= j. Then
h n−1
X i n−1
X X n−1
X
V ar Yk = V ar[Yk ] + 2 Cov[Yk , Yj ] = V ar[Yk ]
k=0 k=0 k6=j k=0
n−1
X n−1
X
= 2λ2 (∆tk )2 ≤ 2λ2 k∆n k ∆tk = 2λ2 (b − a)k∆n k,
k=0 k=0
h n−1
X i
and hence V ar Yn → 0 as k∆n k → 0. According to the Example 1.13.1, we obtain the
k=0
desired limit in mean square.
The previous result states that the quadratic variation of the martingale Mt between a and
b is equal to the jump of the Poisson process between a and b.
n−1
X
ms- lim (Mtk+1 − Mtk )2 = Nb − Na . (3.9.22)
n→∞
k=0
while the left side can be regarded as a stochastic integral with respect to (dMt )2
Z b n−1
X
(dMt )2 := ms- lim (Mtk+1 − Mtk )2 .
a n→∞
k=0
This can be thought as a vanishing integral of the increment process dMt with respect to dt
Z b
dMt dt = 0, ∀a, b ∈ R.
a
Denote
n−1
X n−1
X
Xn = (tk+1 − tk )(Mtk+1 − Mtk ) = ∆tk ∆Mk .
k=0 k=0
1. E[Xn ] = 0;
2. lim V ar[Xn ] = 0.
n→∞
h n−1
X i n−1
X
E[Xn ] = E ∆tk ∆Mk = ∆tk E[∆Mk ] = 0.
k=0 k=0
Since the Poisson process Nt has independent increments, the same property holds for the
compensated Poisson process Mt . Then ∆tk ∆Mk and ∆tj ∆Mj are independent for k 6= j, and
using the properties of variance we have
h n−1
X i n−1
X n−1
X
V ar[Xn ] = V ar ∆tk ∆Mk = (∆tk )2 V ar[∆Mk ] = λ (∆tk )3 ,
k=0 k=0 k=0
where we used
V ar[∆Mk ] = E[(∆Mk )2 ] − (E[∆Mk ])2 = λ∆tk ,
see Exercise 2.8.9 (ii). If we let k∆n k = max ∆tk , then
k
n−1
X n−1
X
3 2
V ar[Xn ] = λ (∆tk ) ≤ λk∆n k ∆tk = λ(b − a)k∆n k2 → 0
k=0 k=0
dt dMt = 0. (3.9.25)
Since the Brownian motion Wt and the process Mt have independent increments and ∆Wk is
independent of ∆Mk , we have
n−1
X n−1
X
E[Yn ] = E[∆Wk ∆Mk ] = E[∆Wk ]E[∆Mk ] = 0,
k=0 k=0
where we used E[∆Wk ] = E[∆Mk ] = 0. Using also E[(∆Wk )2 ] = ∆tk , E[(∆Mk )2 ] = λ∆tk ,
and invoking the independence of ∆Wk and ∆Mk , we get
as n → ∞. Since Yn is a random variable with mean zero and variance decreasing to zero, it
follows that Yn → 0 in the mean square sense. Hence we proved that
The relations proved in this section will be useful in the Part II when developing the stochas-
tic model of a stock price that exhibits jumps modeled by a Poisson process.
Chapter 4
Stochastic Integration
This chapter deals with one of the most useful stochastic integrals, called the Ito integral.
This type of integral was introduced in 1944 by the Japanese mathematician K. Ito, and was
originally motivated by a construction of diffusion processes.
83
84
Remark 4.0.8 The infinitesimal version of the previous result is obtained by replacing t − s
with dt
1. E[dWt2 ] = dt;
2. V ar[dWt2 ] = 2dt2 .
We shall see in an upcoming section that in fact dWt2 and dt are equal in a mean square sense.
Divide the interval [a, b] into n subintervals using the partition points
We emphasize that the intermediate points are the left endpoints of each interval, and this is
the way they should be always chosen. Since the process Ft is nonanticipative, the random
variables Fti and Wti+1 − Wti are independent; this is an important feature in the definition of
the Ito integral.
The Ito integral is the limit of the partial sums Sn
85
Z b
ms-lim Sn = Ft dWt ,
n→∞ a
provided the limit exists. It can be shown that the choice of partition does not influence the
value of the Ito integral. This is the reason why, for practical purposes, it suffices to assume
the intervals equidistant, i.e.
(b − a)
ti+1 − ti = a + , i = 0, 1, · · · , n − 1.
n
The previous convergence is in the mean square sense, i.e.
h Z b 2 i
lim E Sn − Ft dWt = 0.
n→∞ a
Since
1
xy = [(x + y)2 − x2 − y 2 ],
2
letting x = Wti and y = Wti+1 − Wti yields
1 2 1 1
Wti (Wti+1 − Wti ) = W − W 2 − (Wti+1 − Wti )2 .
2 ti+1 2 ti 2
Then after pair cancelations the sum becomes
n−1 n−1 n−1
1X 2 1X 2 1X
Sn = W − W − (Wti+1 − Wti )2
2 i=0 ti+1 2 i=0 ti 2 i=0
n−1
1 2 1X
= Wtn − (Wti+1 − Wti )2 .
2 2 i=0
Using tn = T , we get
n−1
1 2 1X
Sn = WT − (Wti+1 − Wti )2 .
2 2 i=0
Since the first term is independent of n, using Proposition 3.9.1, we have
n−1
1 2 1X
ms- lim Sn = WT − ms- lim (Wti+1 − Wti )2 (4.2.2)
n→∞ 2 n→∞ 2
i=0
1 2 1
= W − T. (4.2.3)
2 T 2
We have now obtained the following explicit formula of a stochastic integral:
Z T
1 2 1
Wt dWt = W − T.
0 2 T 2
It is worth noting that the right side contains random variables depending on the limits of
integration a and b.
Proposition 4.3.1 Let f (Wt , t), g(Wt , t) be nonanticipating processes and c ∈ R. Then we
have
1. Additivity:
Z T Z T Z T
[f (Wt , t) + g(Wt , t)] dWt = f (Wt , t) dWt + g(Wt , t) dWt .
0 0 0
2. Homogeneity:
Z T Z T
cf (Wt , t) dWt = c f (Wt , t) dWt .
0 0
3. Partition property:
Z T Z u Z T
f (Wt , t) dWt = f (Wt , t) dWt + f (Wt , t) dWt , ∀0 < u < T.
0 0 u
n−1
X
Xn = f (Wti , ti )(Wti+1 − Wti )
i=0
n−1
X
Yn = g(Wti , ti )(Wti+1 − Wti ).
i=0
88
Z T Z T
Since ms-lim Xn = f (Wt , t) dWt and ms-lim Yn = g(Wt , t) dWt , using Proposition
n→∞ 0 n→∞ 0
1.14.2 yields
Z T
f (Wt , t) + g(Wt , t) dWt
0
n−1
X
= ms-lim f (Wti , ti ) + g(Wti , ti ) (Wti+1 − Wti )
n→∞
i=0
h n−1
X n−1
X i
= ms-lim f (Wti , ti )(Wti+1 − Wti ) + g(Wti , ti )(Wti+1 − Wti )
n→∞
i=0 i=0
= ms-lim (Xn + Yn ) = ms-lim Xn + ms-lim Yn
n→∞ n→∞ n→∞
Z T Z T
= f (Wt , t) dWt + g(Wt , t) dWt .
0 0
The proofs of parts 2 and 3 are left as an exercise for the reader.
Some other properties, such as monotonicity, do not hold in general. It is possible to have a
RT
nonnegative random variable Ft for which the random variable 0 Ft dWt has negative values.
Some of the random variable properties of the Ito integral are given by the following result:
Proposition 4.3.2 We have
1. Zero mean:
hZ b i
E f (Wt , t) dWt = 0.
a
2. Isometry:
h Z b 2 i hZ b i
E f (Wt , t) dWt =E f (Wt , t)2 dt .
a a
3. Covariance:
h Z b Z b i hZ b i
E f (Wt , t) dWt g(Wt , t) dWt =E f (Wt , t)g(Wt , t) dt .
a a a
We shall discuss the previous properties giving rough reasons why they hold true. The
detailed proofs are beyond the goal of this book.
Pn−1
1. The Ito integral is the mean square limit of the partial sums Sn = i=0 fti (Wti+1 − Wti ),
where we denoted fti = f (Wti , ti ). Since f (Wt , t) is a nonanticipative process, then fti is
independent of the increments Wti+1 − Wti , and hence we have
h n−1
X i n−1
X
E[Sn ] = E fti (Wti+1 − Wti ) = E[fti (Wti+1 − Wti )]
i=0 i=0
n−1
X
= E[fti ]E[(Wti+1 − Wti )] = 0,
i=0
because the increments have mean zero. Since each partial sum has zero mean, their limit,
which is the Ito Integral, will also have zero mean.
89
Their product is
n−1
X n−1
X
Sn Vn = fti (Wti+1 − Wti ) gtj (Wtj+1 − Wtj )
i=0 j=0
n−1
X n−1
X
= fti gti (Wti+1 − Wti )2 + fti gtj (Wti+1 − Wti )(Wtj+1 − Wtj )
i=0 i6=j
it follows that
n−1
X
E[Sn Vn ] = E[fti gti ]E[(Wti+1 − Wti )2 ]
i=0
n−1
X
= E[fti gti ](ti+1 − ti ),
i=0
90
Rb
which is the Riemann sum for the integral E[ft gt ] dt. a
Rb
From 1 and 2 it follows that the random variable a f (Wt , t) dWt has mean zero and variance
hZ b i hZ b i
V ar f (Wt , t) dWt = E f (Wt , t)2 dt .
a a
Corollary 4.3.3 (Cauchy’s integral inequality) Let f (t) = f (Wt , t) and g(t) = g(Wt , t).
Then
Z b 2 Z b Z b
E[ft gt ] dt ≤ E[ft2 ] dt E[gt2 ] dt .
a a a
Proof: It follows from the previous theorem and from the correlation formula |Corr(X, Y )| =
|Cov(X, Y )|
≤ 1.
[V ar(X)V ar(Y )]1/2
Let Ft be the information set at time t. This implies that fti and Wti+1 − Wti are known
n−1
X
at time t, for any ti+1 ≤ t. It follows that the partial sum Sn = fti (Wti+1 − Wti ) is
i=0
Ft -predictable. The following result states that this is also valid in mean square:
Rt
Proposition 4.3.4 The Ito integral 0 fs dWs is Ft -predictable.
The following two results state that if the upper limit of an Ito integral is replaced by the
parameter t we obtain a continuous martingale.
The process Yh has zero mean for any h > 0 and its variance tends to 0 as h → 0. Using a
convergence theorem yields that Yh tends to 0 in mean square, as h → 0. This is equivalent
with the continuity of Xt at t0 .
All properties of Ito integrals also hold for Wiener integrals. The Wiener integral is a random
variable with zero mean
hZ b i
E f (t) dWt = 0
a
and variance
h Z b 2 i Z b
E f (t) dWt = f (t)2 dt.
a a
However, in the case of Wiener integrals we can say something about their distribution.
Rb
Proposition 4.4.1 The Wiener integral I(f ) = a f (t) dWt is a normal random variable with
mean 0 and variance Z b
V ar[I(f )] = f (t)2 dt := kf k2L2 .
a
92
Proof: Since increments Wti+1 − Wti are normally distributed with mean 0 and variance
ti+1 − ti , then
f (ti )(Wti+1 − Wti ) ∼ N (0, f (ti )2 (ti+1 − ti )).
Since these random variables are independent, by the Central Limit Theorem (see Theorem
2.3.1), their sum is also normally distributed, with
n−1
X n−1
X
Sn = f (ti )(Wti+1 − Wti ) ∼ N 0, f (ti )2 (ti+1 − ti ) .
i=0 i=0
Z b
N 0, f (t)2 dt .
a
The previous convergence holds in distribution, and it still needs to be shown in the mean
square. However, we shall omit this essential proof detail.
RT
Exercise 4.4.2 Show that the random variable X = 1 √1t dWt is normally distributed with
mean 0 and variance ln T .
RT √
Exercise 4.4.3 Let Y = 1 t dWt . Show that Y is normally distributed with mean 0 and
variance (T 2 − 1)/2.
Rt
Exercise 4.4.4 Find the distribution of the integral 0 et−s dWs .
Rt Rt
Exercise 4.4.5 Show that Xt = 0 (2t − u) dWu and Yt = 0 (3t − 4u) dWu are Gaussian
processes with mean 0 and variance 37 t3 .
Z
1 t
Exercise 4.4.6 Show that ms- lim u dWu = 0.
t→0 t 0
Rt
Exercise 4.4.7 Find all constants a, b such that Xt = 0 a + bu
t dWu is normally distributed
with variance t.
where Fti− is the left-hand limit at ti− . For predictability reasons, the intermediate points are
the left-handed limit to the endpoints of each interval. Since the process Ft is nonanticipative,
the random variables Fti− and Mti+1 − Mti are independent.
The integral of Ft− with respect to Mt is the mean square limit of the partial sum Sn
Z T
ms-lim Sn = Ft− dMt ,
n→∞ 0
provided the limit exists. More precisely, this convergence means that
h Z b 2 i
lim E Sn − Ft− dMt = 0.
n→∞ a
1
Using xy = [(x + y)2 − x2 − y 2 ], by letting x = Mti− and y = Mti+1 − Mti , we get (Where
2
does a minus go?)
1 2 1 1
Mti− (Mti+1 − Mti ) = Mti+1 − Mt2i − (Mti+1 − Mti )2 .
2 2 2
After pair cancelations we have
n−1 n−1 n−1
1X 2 1X 2 1X
Sn = M − M − (Mti+1 − Mti )2
2 i=0 ti+1 2 i=0 ti 2 i=0
n−1
1 2 1X
= Mtn − (Mti+1 − Mti )2
2 2 i=0
Since tn = T , we get
n−1
1 2 1X
Sn = MT − (Mti+1 − Mti )2 .
2 2 i=0
The second term on the right is the quadratic variation of Mt . Using formula (3.9.19) yields
that Sn converges in mean square towards 21 MT2 − 12 NT , since N0 = 0.
94
hZ b i
Exercise 4.5.1 (a) Show that E Mt− dMt = 0,
a
hZ b i
(b) Find V ar Mt dMt .
a
Exercise 4.5.2 Let ω be a fixed state of the world and assume the sample path t → Nt (ω) has
a jump in the interval (a, b). Show that the Riemann-Stieltjes integral
Z b
Nt (ω) dNt
a
Exercise 4.5.3 Let Nt− denote the left-hand limit of Nt . Show that Nt− is predictable, while
Nt is not.
The previous exercises provide the reason why in the following we shall work with Mt− instead
Z b Z b
of Mt : the integral Mt dNt might not exist, while Mt− dNt does exist.
a a
The following integrals with respect to a Poisson process Nt are considered in the Riemann-
Stieltjes sense.
Proposition 4.5.6 For any continuous function f we have
hZ t i Z t
(a) E f (s) dNs = λ f (s) ds;
0 0
h Z t 2 i Z t Z t 2
(b) E f (s) dNs =λ f (s)2 ds + λ2 f (s) ds ;
h R t0 i R t f (s)
0 0
f (s) dN λ (e −1) ds
(c) E e 0 s
=e 0 .
95
Proof: (a) Consider the equidistant partition 0 = s0 < s1 < · · · < sn = t, with sk+1 − sk = ∆s.
Then
hZ t i h n−1
X i n−1
X h i
E f (s) dNs = lim E f (si )(Nsi+1 − Nsi ) = lim f (si )E Nsi+1 − Nsi
0 n→∞ n→∞
i=0 i=0
n−1
X Z t
= λ lim f (si )(si+1 − si ) = λ f (s) ds.
n→∞ 0
i=0
(b) Using that Nt is stationary and has independent increments, we have respectively
E[(Nsi+1 − Nsi )2 ] = E[Ns2i+1 −si ] = λ(si+1 − si ) + λ2 (si+1 − si )2
= λ ∆s + λ2 (∆s)2 ,
E[(Nsi+1 − Nsi )(Nsj+1 − Nsj )] = E[(Nsi+1 − Nsi )]E[(Nsj+1 − Nsj )]
= λ(si+1 − si )λ(sj+1 − sj ) = λ2 (∆s)2 .
Applying the expectation to the formula
n−1
X 2 n−1
X
f (si )(Nsi+1 − Nsi ) = f (si )2 (Nsi+1 − Nsi )2
i=0 i=0
X
+2 f (si )f (sj )(Nsi+1 − Nsi )(Nsj+1 − Nsj )
i6=j
yields
h n−1
X 2 i n−1
X X
E f (si )(Nsi+1 − Nsi ) = f (si )2 (λ∆s + λ2 (∆s)2 ) + 2 f (si )f (sj )λ2 (∆s)2
i=0 i=0 i6=j
n−1
X h n−1
X X i
= λ f (si )2 ∆s + λ2 f (si )2 (∆s)2 + 2 f (si )f (sj )(∆s)2
i=0 i=0 i6=j
n−1
X n−1
X 2
= λ f (si )2 ∆s + λ2 f (si ) ∆s
i=0 i=0
Z t Z t 2
→ λ f (s)2 ds + λ2 f (s) ds , as n → ∞.
0 0
(c) Using that Nt is stationary with independent increments and has the moment generating
k
function E[ekNt ] = eλ(e −1)t , we have
h Rt i h Pn−1 i h n−1
Y i
E e 0 f (s) dNs = lim E e i=0 f (si )(Nsi+1 −Nsi ) = lim E ef (si )(Nsi+1 −Nsi )
n→∞ n→∞
i=0
n−1
Y h i n−1
Y h i
= lim E ef (si )(Nsi+1 −Nsi ) = lim E ef (si )(Nsi+1 −si )
n→∞ n→∞
i=0 i=0
n−1
Y Pn−1
f (si )
(ef (si ) −1)(si+1 −si )
= lim eλ(e −1)(si+1 −si )
= lim eλ i=0
n→∞ n→∞
i=0
Rt
λ (ef (s) −1) ds
= e 0 .
96
Z t
Since f is continuous, the Poisson integral f (s) dNs can be computed in terms of the
0
waiting times Sk
Z t Nt
X
f (s) dNs = f (Sk ).
0 k=1
This formula can be used to give a proof for the previous result. For instance, taking the
expectation and using conditions over Nt = n, yields
hZ t i Nt
hX n
i X hX i
E f (s) dNs = E f (Sk ) = E f (Sk )|Nt = n P (Nt = n)
0 k=1 n≥0 k=1
XnZ t (λt)n −λt
Z t
1 X (λt)n
−λt
= f (x) dx e =e f (x) dx
t 0 n! 0 t (n − 1)!
n≥0 n≥0
Z t Z t
= e−λt f (x) dx λeλt = λ f (x) dx.
0 0
Exercise 4.5.7 Solve parts (b) and (c) of Proposition 4.5.6 using a similar idea with the one
presented above.
Proposition 4.5.11 Let Ft = σ(Ns ; 0 ≤ s ≤ t). Then for any constant c, the process
c
Mt = ecNt +λ(1−e )t
, t≥0
is an Ft -martingale.
s
λ x
E[e−sSn ] = e−n ln(1+ λ ) = .
λ+s
Since the expectation on the left side is the Laplace transform of the probability density of Sn ,
then
n λ x o
p(Sn ) = L−1 {E[e−sSn ]} = L−1
λ+s
e−tλ tn−1 λn
= ,
Γ(n)
RT
4.6 The distribution function of XT = 0 g(t) dNt
In this section we consider the function g(t) continuous. Let S1 < S2 < · · · < SNt denote the
waiting times until time t. Since the increments dNt are equal to 1 at Sk and 0 otherwise, the
integral can be written as
Z T
XT = g(t) dNt = g(S1 ) + · · · + g(SNt ).
0
RT
The distribution function of the random variable XT = 0 g(t) dNt can be obtained condition-
ing over the Nt
X
P (XT ≤ u) = P (XT ≤ u|NT = k) P (NT = k)
k≥0
X
= P (g(S1 ) + · · · + g(SNt ) ≤ u|NT = k) P (NT = k)
k≥0
X
= P (g(S1 ) + · · · + g(Sk ) ≤ u) P (NT = k). (4.6.5)
k≥0
Z
1 vol(Dk )
P g(S1 ) + · · · + g(Sk ) ≤ u = k
dx1 · · · dxk = ,
Dk T Tk
where
Dk = {g(x1 ) + g(x2 ) + · · · + g(xk ) ≤ u} ∩ {0 ≤ x1 , · · · , xk ≤ T }.
Substituting back in (4.6.5) yields
X
P (XT ≤ u) = P (g(S1 ) + · · · + g(Sk ) ≤ u) P (NT = k)
k≥0
X vol(Dk ) λk T k X λk vol(Dk )
−λT −λT
= e = e . (4.6.6)
Tk k! k!
k≥0 k≥0
In general, the volume of the k-dimensional solid Dk is not obvious easy to obtain. However,
there are simple cases when this can be computed explicitly.
A Particular Case. We shall do an explicit computation of the partition function of XT =
RT 2
s dNs . In this case the solid Dk is the intersection between the k-dimensional ball of radius
√0
u centered at the origin and the k-dimensional
√ cube [0, T ]k . There are three possible shapes
for Dk , which depend on the size of u:
√
(a) if 0 ≤ u < T , then Dk is a 21k -part of a k-dimensional sphere;
√ √
(b) if T ≤ u < T k, then Dk has a complicated shape;
√ √
(c) if T k ≤ u, then Dk is the entire k-dimensional cube, and then vol(Dk ) = T k .
π k/2 Rk
Since the volume of the k-dimensional ball of radius R is given by , then the volume
Γ( k2 + 1)
of Dk in case (a) becomes
π k/2 uk/2
vol(Dk ) = .
2k Γ( k2 + 1)
Substituting in (4.6.6) yields
X (λ2 πu)k/2 √
P (XT ≤ u) = e−λT , 0≤ u < T.
k≥0
k!Γ( k2 + 1)
√ √
It is worth noting that for u → ∞, the inequality T k ≤ u is satisfied for all k ≥ 0; hence
relation (4.6.6) yields
X λk T k
lim P (XT ≤ u) = e−λT = e−kT ekT = 1.
u→∞ k!
k≥0
Stochastic Differentiation
make sense in stochastic calculus. The only quantities allowed to be used are the infinitesimal
changes of the process, in our case dWt .
The infinitesimal change of a process
The change in the process Xt between instances t and t + ∆t is given by ∆Xt = Xt+∆t − Xt .
When ∆t is infinitesimally small, we obtain the infinitesimal change of a process Xt
dXt = Xt+dt − Xt .
d(c Xt ) = c dXt .
The verification follows from a straightforward application of the infinitesimal change formula
99
100
The proof follows from Ito’s formula and will be addressed in section 5.3.3.
When the process Yt is replaced by the deterministic function f (t), and Xt is an Ito diffusion,
then the previous formula becomes
X f (t)dX − X df (t)
t t t
d = .
f (t) f (t)2
101
Applying the product rule and the fundamental relation (dWt )2 = dt, yields
d(Wt2 ) = Wt dWt + Wt dWt + dWt dWt = 2Wt dWt + dt.
Example 5.2.2 Show that
d(Wt3 ) = 3Wt2 dWt + 3Wt dt.
Applying the product rule and the previous exercise yields
d(Wt3 ) = d(Wt · Wt2 ) = Wt d(Wt2 ) + Wt2 dWt + d(Wt2 ) dWt
= Wt (2Wt dWt + dt) + Wt2 dWt + dWt (2Wt dWt + dt)
= 2Wt2 dWt + Wt dt + Wt2 dWt + 2Wt (dWt )2 + dt dWt
= 3Wt2 dWt + 3Wt dt,
where we used (dWt )2 = dt and dt dWt = 0.
Example 5.2.3 Show that d(tWt ) = Wt dt + t dWt .
Using the product rule and dt dWt = 0, we get
d(tWt ) = Wt dt + t dWt + dt dWt
= Wt dt + t dWt .
Rt
Example 5.2.4 Let Zt = 0 Wu du be the integrated Brownian motion. Show that
dZt = Wt dt.
The infinitesimal change of Zt is
Z t+dt
dZt = Zt+dt − Zt = Ws ds = Wt dt,
t
df (x) = f ′ (x)dx.
We shall present a similar formula for the stochastic environment. In this case the deter-
ministic function x(t) is replaced by a stochastic process Xt . The composition between the
differentiable function f and the process Xt is denoted by Ft = f (Xt ). Since the increments
involving powers of dt2 or higher are neglected, we may assume that the same holds true for
the increment dXt . Then the expression (5.3.1) becomes
1 2
dFt = f ′ Xt dXt + f ′′ Xt dXt . (5.3.2)
2
In the computation of dXt we may take into the account stochastic relations such as dWt2 = dt,
or dt dWt = 0.
Theorem 5.3.1 (Ito’s formula for diffusions) If Xt is an Ito diffusion, and Ft = f (Xt ),
then
h b(Wt , t)2 ′′ i
dFt = a(Wt , t)f ′ (Xt ) + f (Xt ) dt + b(Wt , t)f ′ (Xt ) dWt . (5.3.3)
2
103
Proof: We shall provide a formal proof. Using relations dWt2 = dt and dt2 = dWt dt = 0, we
have
2
(dXt )2 = a(Wt , t)dt + b(Wt , t)dWt
= a(Wt , t)2 dt2 + 2a(Wt , t)b(Wt , t)dWt dt + b(Wt , t)2 dWt2
= b(Wt , t)2 dt.
1 ′′
dFt = f (Wt )dt + f ′ (Wt ) dWt . (5.3.4)
2
Particular cases
1. If f (x) = xα , with α constant, then f ′ (x) = αxα−1 and f ′′ (x) = α(α − 1)xα−2 . Then (5.3.4)
becomes the following useful formula
1
d(Wtα ) = α(α − 1)Wtα−2 dt + αWtα−1 dWt .
2
1
d(ekWt ) = kekWt dWt + k 2 ekWt dt.
2
Exercise 5.3.3 Use the previous rules to find the following increments
(a) d(Wt eWt )
(b) d(3Wt2 + 2e5Wt )
2
(c) d(et+Wt )
(d) d (t + Wt )n .
1 Z t
(e) d Wu du
t 0
1 Z t
(f ) d α eWu du , where α is a constant.
t 0
In the case when the function f = f (t, x) is also time dependent, the analog of (5.3.1) is
given by
1
df (t, x) = ∂t f (t, x)dt + ∂x f (t, x)dx + ∂x2 f (t, x)(dx)2 + O(dx)3 + O(dt)2 . (5.3.5)
2
Substituting x = Xt yields
1
df (t, Xt ) = ∂t f (t, Xt )dt + ∂x f (t, Xt )dXt + ∂x2 f (t, Xt )(dXt )2 . (5.3.6)
2
If Xt is an Ito diffusion we obtain an extra-term in formula (5.3.3)
h b(Wt , t)2 2 i
dFt = ∂t f (t, Xt ) + a(Wt , t)∂x f (t, Xt ) + ∂x f (t, Xt ) dt
2
+b(Wt , t)∂x f (t, Xt ) dWt . (5.3.7)
Proposition 5.3.6 Let F be a twice differentiable function. Then for any a < t we have
Z t X
Ft = Fa + F ′ (Ms− ) dMs + ∆F (Ms ) − F ′ (Ms− )∆Ms ,
a a<s≤t
We shall apply the aforementioned result for the case Ft = F (Mt ) = Mt2 . We have
Z t X
Mt2 = Ma2 + 2 Ms− dMs + Ms2 − Ms−2
− 2Ms− (Ms − Ms− ) . (5.3.8)
a a<s≤t
Since the jumps in Ns are of size 1, we have (∆Ns )2 = ∆Ns . Since the difference of the
processes Ms and Ns is continuous, then ∆Ms = ∆Ns . Using these formulas we have
2
Ms2 − Ms− − 2Ms− (Ms − Ms− ) = (Ms − Ms− ) Ms + Ms− − 2Ms−
= (Ms − Ms− )2 = (∆Ms )2 = (∆Ns )2
= ∆Ns = Ns − Ns− .
P
Since the sum of the jumps between s and t is a<s≤t ∆Ns = Nt −Na , formula (5.3.8) becomes
Z t
Mt2 = Ma2 + 2 Ms− dMs + Nt − Na . (5.3.9)
a
The differential form is
d(Mt2 ) = 2Mt− dMt + dNt ,
which is equivalent with
d(Mt2 ) = (1 + 2Mt− ) dMt + λdt,
since dNt = dMt + λdt.
Exercise 5.3.7 Show that Z T
1
Mt− dMt = (M 2 − NT ).
0 2 T
Exercise 5.3.8 Use Ito’s formula for the Poison process to find the conditional expectation
E[Mt2 |Fs ] for s < t.
The expression
1 ∂2f ∂2f
∆f = 2
+ 2
2 ∂x ∂y
is called the Laplacian of f . We can rewrite the previous formula as
∂f ∂f
dFt = dWt1 + dWt2 + ∆f dt
∂x ∂y
A function f with ∆f = 0 is called harmonic. The aforementioned formula in the case of
harmonic functions takes the simple form
∂f ∂f
dFt = dWt1 + dWt2 . (5.3.10)
∂x ∂y
Exercise 5.3.9 Let Wt1 , Wt2 be two independent Brownian motions. If the function f is har-
monic, show that Ft = f (Wt1 , Wt2 ) is a martingale. Is the converse true?
Exercise 5.3.10 Use the previous formulas to find dFt in the following cases
(a) Ft = (Wt1 )2 + (Wt2 )2
(b) Ft = ln[(Wt1 )2 + (Wt2 )2 ].
p
Exercise 5.3.11 Consider the Bessel process Rt = (Wt1 )2 + (Wt2 )2 , where Wt1 and Wt2 are
two independent Brownian motions. Prove that
1 W1 W2
dRt = dt + t dWt1 + t dWt2 .
2Rt Rt Rt
Example 5.3.1 (The product rule) Let Xt and Yt be two processes. Show that
d(Xt Yt ) = Yt dXt + Xt dYt + dXt dYt .
Consider the function f (x, y) = xy. Since ∂x f = y, ∂y f = x, ∂x2 f = ∂y2 f = 0, ∂x ∂y = 1, then
Ito’s multidimensional formula yields
d(Xt Yt ) = d f (X, Yt ) = ∂x f dXt + ∂y f dYt
1 1
+ ∂x2 f (dXt )2 + ∂y2 f (dYt )2 + ∂x ∂y f dXt dYt
2 2
= Yt dXt + Xt dYt + dXt dYt .
Example 5.3.2 (The quotient rule) Let Xt and Yt be two processes. Show that
X Y dX − X dY − dX dY Xt
t t t t t t t
d = + 2 (dYt )2 .
Yt Yt2 Yt
Consider the function f (x, y) = xy . Since ∂x f = y1 , ∂y f = − yx2 , ∂x2 f = 0, ∂y2 f = − yx2 , ∂x ∂y = 1
y2 ,
then applying Ito’s multidimensional formula yields
X
t
d = d f (X, Yt ) = ∂x f dXt + ∂y f dYt
Yt
1 1
+ ∂x2 f (dXt )2 + ∂y2 f (dYt )2 + ∂x ∂y f dXt dYt
2 2
1 Xt 1
= dXt − 2 dYt − 2 dXt dYt
Yt Yt Yt
Yt dXt − Xt dYt − dXt dYt Xt
= 2 + 2 (dYt )2 .
Yt Yt
Chapter 6
Computing a stochastic integral starting from the definition of the Ito integral is a quite ineffi-
cient method. Like in the elementary Calculus, several methods can be developed to compute
stochastic integrals. In order to keep the analogy with the elementary Calculus, we have called
them Fundamental Theorem of Stochastic Calculus and Integration by Parts. The integration
by substitution is more complicated in the stochastic environment and we have considered only
a particular case of it, which we called The method of heat equation.
since we canceled the terms in pairs. Substituting into formula (6.1.1) yields Xt = Xa +
Z t Z t
f (s, Ws )dWs , and hence dXt = d f (s, Ws )dWs , since Xa is a constant.
a a
Theorem 6.1.1 (The Fundamental Theorem of Stochastic Calculus)
(i) For any a < t, we have
Z t
d f (s, Ws )dWs = f (t, Wt )dWt .
a
107
108
Hence dXt = dYt , or d(Xt − Yt ) = 0. Since the process Xt − Yt has zero increments, then
Xt − Yt = c, constant. Taking t = 0, yields
Z 0 W2
0 0
c = X0 − Y0 = Ws dWs − − = 0,
0 2 2
and hence c = 0. It follows that Xt = Yt , which verifies the desired relation.
Consider the function f (t, x) = 31 x3 − tx, and let Ft = f (t, Wt ). Since ∂t f = −x, ∂x f = x2 − t,
and ∂x2 f = 2x, then Ito’s formula provides
1
dFt = ∂t f dt + ∂x f dWt + ∂x2 f (dWt )2
2
2 1
= −Wt dt + (Wt − t) dWt + 2Wt dt
2
= (Wt2 − t)dWt .
This formula is to be used when integrating a product between a function of t and a function
of the Brownian motion Wt , for which an antiderivative is known. The following two particular
cases are important and useful in applications.
1. If g(Wt ) = Wt , the aforementioned formula takes the simple form
Z b t=b
Z b
f (t) dWt = f (t)Wt − f ′ (t)Wt dt. (6.2.2)
a t=a a
Z T
Application 1 Consider the Wiener integral IT = t dWt . From the general theory, see
0
Proposition 4.4.1, it is known that I is a random variable normally distributed with mean 0
and variance Z T
T3
V ar[IT ] = t2 dt = .
0 3
Recall the definition of integrated Brownian motion
Z t
Zt = Wu du.
0
Formula (6.2.2) yields a relationship between I and the integrated Brownian motion
Z T Z T
IT = t dWt = T WT − Wt dt = T WT − ZT ,
0 0
and hence IT + ZT = T WT . This relation can be used to compute the covariance between IT
and ZT .
Cov(IT + ZT , IT + ZT ) = V ar[T WT ] ⇐⇒
V ar[IT ] + V ar[ZT ] + 2Cov(IT , ZT ) = T 2 V ar[WT ] ⇐⇒
T 3 /3 + T 3 /3 + 2Cov(IT , ZT ) = T 3 ⇐⇒
Cov(IT , ZT ) = T 3 /6,
where we used that V ar[ZT ] = T 3 /3. The processes It and Zt are not independent. Their
correlation coefficient is 0.5 as the following calculation shows
Cov(IT , ZT ) T 3 /6
Corr(IT , ZT ) = 1/2 =
T 3 /3
V ar[IT ]V ar[ZT ]
= 1/2.
x2
Application 2 If we let g(x) = 2 in formula (6.2.3), we get
Z b
Wb2 − Wa2 1
Wt dWt = − (b − a).
a 2 2
It is worth noting that letting a = 0 and b = T , we retrieve a formula that was proved by direct
methods in chapter 2
Z T
W2 T
Wt dWt = T − .
0 2 2
x3
Similarly, if we let g(x) = 3 in (6.2.3) yields
Z b Z b
W3 b
Wt2 dWt = t − Wt dt.
a 3 a a
111
Application 3
RT
Choosing f (t) = eαt and g(x) = cos x, we shall compute the stochastic integral 0 eαt cos Wt dWt
using the formula of integration by parts
Z T Z T
eαt cos Wt dWt = eαt (sin Wt )′ dWt
0 0
Z T Z
T 1 T αt
= eαt sin Wt − (eαt )′ sin Wt dt −
e (cos Wt )′′ dt
0 0 2 0
Z T Z
1 T αt
= eαT sin WT − α αt
e sin Wt dt + e sin Wt dt
0 2 0
Z
1 T αt
= eαT sin WT − α − e sin Wt dt.
2 0
1
The particular case α = 2 leads to the following exact formula of a stochastic integral
Z T
t T
e 2 cos Wt dWt = e 2 sin WT . (6.2.4)
0
RT
In a similar way, we can obtain an exact formula for the stochastic integral 0 eβt sin Wt dWt
as follows
Z T Z T
eβt sin Wt dWt = − eβt (cos Wt )′ dWt
0 0
T
Z T Z T
βt βt 1
= −e cos Wt +β e cos Wt dt − eβt cos Wt dt.
0 0 2 0
1
Taking β = 2 yields the closed form formula
Z T
t T
e 2 sin Wt dWt = 1 − e 2 cos WT . (6.2.5)
0
This formula is of theoretical value. In practice, the term dXt dYt needs to be computed using
the rules Wt2 = dt, and dt dWt = 0.
Exercise 6.2.1 (a) Use integration by parts to get
Z T Z T
1 −1 Wt
2 dWt = tan (WT ) + 2 2 dt, T > 0.
0 1 + Wt 0 (1 + Wt )
Exercise 6.2.4 (a) Let T > 0. Show the following relation using integration by parts
Z T Z T
2Wt 1 − Wt2
dWt = ln(1 + WT2 ) − dt.
0 1 + Wt2 0 (1 + Wt2 )2
(b) Show that for any real number x the following double inequality holds
1 1 − x2
− ≤ ≤ 1.
8 (1 + x2 )2
T
− ≤ E[ln(1 + WT2 )] ≤ T.
8
(e) Use Jensen’s inequality to get
1
∂t ϕ + ∂x2 ϕ = 0. (6.3.6)
2
This is called the heat equation without sources. The non-homogeneous equation
1
∂t ϕ + ∂x2 ϕ = G(t, x) (6.3.7)
2
is called heat equation with sources. The function G(t, x) represents the density of heat sources,
while the function ϕ(t, x) is the temperature at point x at time t in a one-dimensional wire. If
the heat source is time independent, then G = G(x), i.e. G is a function of x only.
Example 6.3.1 Find all solutions of the equation (6.3.6) of type ϕ(t, x) = a(t) + b(x).
1 ′′
b (x) = −a′ (t).
2
114
Since the left side is a function of x only, while the right side is a function of variable t, the
only where the previous equation is satisfied is when both sides are equal to the same constant
C. This is called a separation constant. Therefore a(t) and b(x) satisfy the equations
1 ′′
a′ (t) = −C, b (x) = C.
2
Integrating yields a(t) = −Ct + C0 and b(x) = Cx2 + C1 x + C2 . It follows that
ϕ(t, x) = C(x2 − t) + C1 x + C3 ,
Example 6.3.2 Find all solutions of the equation (6.3.6) of the type ϕ(t, x) = a(t)b(x).
ϕ(t, x) = a(t)b(x) = c1 x + c0 , c0 , c1 ∈ R
In particular, the functions x, x2 − t, ex−t/2 , e−x−t/2 , et/2 sin x and et/2 cos x, or any linear
combination of them are solutions of the heat equation (6.3.6). However, there are other
solutions which are not of the previous type.
115
Exercise 6.3.1 Prove that ϕ(t, x) = 31 x3 − tx is a solution of the heat equation (6.3.6).
2
Exercise 6.3.2 Show that ϕ(t, x) = t−1/2 e−x /(2t)
is a solution of the heat equation (6.3.6)
for t > 0.
x
Exercise 6.3.3 Let ϕ = u(λ), with λ = √ , t > 0. Show that ϕ satisfies the heat equation
2 t
(6.3.6) if and only if u′′ + 2λu′ = 0.
Z ∞ √
2 2
Exercise 6.3.4 Let erf c(x) = √ e−r dr. Show that ϕ = erf c(x/(2 t)) is a solution of
π x
the equation (6.3.6).
x2
Exercise 6.3.5 (the fundamental solution) Show that ϕ(t, x) = √ 1 e− 4t , t > 0 satisfies
4πt
the equation (6.3.6).
Sometimes it is useful to generate new solutions for the heat equation from other solutions.
Below we present a few ways to accomplish this:
(i) by linear combination: if ϕ1 and ϕ2 are solutions, then a1 ϕ1 + a1 ϕ2 is a solution, where
a1 , a2 constants.
(ii) by translation: if ϕ(t, x) is a solution, then ϕ(t − τ, x − ξ) is a solution, where (τ, ξ) is
a translation vector.
(iii) by affine transforms: if ϕ(t, x) is a solution, then ϕ(λτ, λ2 x) is a solution, for any
constant λ.
∂ n+m
(iv) by differentiation: if ϕ(t, x) is a solution, then n m ϕ(t, x) is a solution.
∂ x∂ t
(v) by convolution: if ϕ(t, x) is a solution, then so are
Z b
ϕ(t, x − ξ)f (ξ) dξ
a
Z b
ϕ(t − τ, x)g(τ ) dt.
a
For more detail on the subject the reader can consult Widder [?] and Cannon [?].
Theorem 6.3.6 Let ϕ(t, x) be a solution of the heat equation (6.3.6) and denote f (t, x) =
∂x ϕ(t, x). Then
Z b
f (t, Wt ) dWt = ϕ(b, Wb ) − ϕ(a, Wa ).
a
Choose the solution of the heat equation (6.3.6) given by ϕ(t, x) = x2 − t. Then f (t, x) =
∂x ϕ(t, x) = 2x. Theorem 6.3.6 yields
Z T Z T T
2Wt dWt = f (t, Wt ) dWt = ϕ(t, x) = WT2 − T.
0 0 0
Consider the function ϕ(t, x) = 13 x3 − tx, which is a solution of the heat equation (6.3.6), see
Exercise 6.3.1. Then f (t, x) = ∂x ϕ(t, x) = x2 − t. Applying Theorem 6.3.6 yields
Z T Z T T 1 3
(Wt2 − t) dWt = f (t, Wt ) dWt = ϕ(t, Wt ) = W − T WT .
0 0 0 3 T
λ2 t
Consider the function ϕ(t, x) = e− 2 ±λx , which is a solution of the homogeneous heat equation
λ2 t
(6.3.6), see Example 6.3.2. Then f (t, x) = ∂x ϕ(t, x) = ±λe− 2 ±λx . Apply Theorem 6.3.6 to
get
Z T Z T T
λ2 t λ2 T
±λe− 2 ±λx dWt = f (t, Wt ) dWt = ϕ(t, Wt ) = e− 2 ±λWT − 1.
0 0 0
λ2 t
From the Example 6.3.2 we know that ϕ(t, x) = e 2 sin(λx) is a solution of the heat equation.
λ2 t
Applying Theorem 6.3.6 to the function f (t, x) = ∂x ϕ(t, x) = λe 2 cos(λx), yields
Z T Z T T
λ2 t
λe 2 cos(λWt ) dWt = f (t, Wt ) dWt = ϕ(t, Wt )
0 0 0
λ2 t T λ2 T
= e 2 sin(λWt ) =e 2 sin(λWT ).
0
λ2 t
Choose ϕ(t, x) = e 2 cos(λx) to be a solution of the heat equation. Apply Theorem 6.3.6 for
λ2 t
the function f (t, x) = ∂x ϕ(t, x) = −λe 2 sin(λx) to get
Z T T
λ2 t
(−λ)e 2 sin(λWt ) dWt = ϕ(t, Wt )
0 0
λ2 T T λ2 T
= e 2 cos(λWt ) =e 2 cos(λWT ) − 1,
0
2
From Exercise 6.3.2 we have that ϕ(t, x) = t−1/2 e−x /(2t) is a solution of the homogeneous heat
2
equation. Since f (t, x) = ∂x ϕ(t, x) = −t−3/2 xe−x /(2t) , applying Theorem 6.3.6 yields to the
desired result. The reader can easily fill in the details.
Integration techniques will be used when solving stochastic differential equations in the next
chapter.
118
Exercise 6.3.14 Let ϕ(t, x) be a solution of the following non-homogeneous heat equation with
time-dependent and uniform heat source G(t)
1
∂t ϕ + ∂x2 ϕ = G(t).
2
Denote f (t, x) = ∂x ϕ(t, x). Show that
Z b Z b
f (t, Wt ) dWt = ϕ(b, Wb ) − ϕ(a, Wa ) − G(t) dt.
a a
Z T
λ2 t 1 λ2 T
9. e 2 cos(λWt ) dWt = e 2 sin(λWT );
0 λ
Z T
λ2 t 1 λ2 T
10. e 2 sin(λWt ) dWt = 1 − e 2 cos(λWT ) ;
0 λ
Z T
λ2 t 1 − λ2 T ±λWT
11. e− 2 ±λWt dWt = e 2 −1 ;
0 ±λ
Z b Wb2
3 Wt2 1
2
Wa 1
12. t− 2 Wt e− 2t dWt = a− 2 e− 2a − b− 2 e− 2b ;
a
Z t
13. d f (s, Ws ) dWs = f (t, Wt ) dWt ;
a
Z b
14. Yt dWt = Fb − Fa , if Yt dWt = dFt ;
a
Z b Z b
15. f (t) dWt = f (t)Wt |ba − f ′ (t)Wt dt;
a a
Z b b
Z b
′ 1
16. g (Wt ) dWt = g(Wt ) − g ′′ (Wt ) dt.
a a 2 a
120
Chapter 7
where the last integral is taken in the Ito sense. Relation (7.1.2) is taken as the definition for
the stochastic differential equation (7.1.1). However, since it is convenient to use stochastic
differentials informally, we shall approach stochastic differential equations by analogy with the
ordinary differential equations, and try to present the same methods of solving equations in the
new stochastic environment.
The functions a(t, Wt , Xt ) and b(t, Wt , Xt ) are called drift rate and volatility, respectively.
A process Xt is called a (strong) solution for the stochastic equation (7.1.1) if it satisfies the
equation. We shall start with an example.
Example 7.1.1 (The Brownian bridge) Let a, b ∈ R. Show that the process
Z t
1
Xt = a(1 − t) + bt + (1 − t) dWs , 0 ≤ t < 1
0 1 − s
is a solution of the stochastic differential equation
b − Xt
dXt = dt + dWt , 0 ≤ t < 1, X0 = a.
1−t
We shall perform a routine verification to show that Xt is a solution. First we compute the
b − Xt
quotient :
1−t
121
122
Z t
1
b − Xt = b − a(1 − t) − bt − (1 − t) dWs
0 1 − s
Z t
1
= (b − a)(1 − t) − (1 − t) dWs ,
0 1−s
Using
Z t
1 1
d dWs = dWt ,
0 1−s 1−t
the product rule yields
Z t Z t 1
1
dXt = a d(1 − t) + bdt + d(1 − t) dWs + (1 − t)d dWs
0 1−s 0 1−s
Z t
1
= b−a− dWs dt + dWt
0 1−s
b − Xt
= dt + dWt ,
1−t
where the last identity comes from (7.1.3). We just verified that the process Xt is a solution of
the given stochastic equation. The question of how this solution was obtained in the first place,
is the subject of study for the next few sections.
2. If a(t, Wt , Xt ) = α(t)Xt + β(t), with α(t) and β(t) continuous deterministic function,
then
d
E[Xt ] = α(t)E[Xt ] + β(t),
dt
which is a linear differential equation in E[Xt ]. Its solution is given by
Z t
E[Xt ] = eA(t) X0 + e−A(s) β(s) ds , (7.2.5)
0
Rt
where A(t) = 0 α(s) ds. It is worth noting that the expectation E[Xt ] does not depend on the
volatility term b(t, Wt , Xt ).
Exercise 7.2.1 If dXt = (2Xt + e2t )dt + b(t, Wt , Xt )dWt , then
E[Xt ] = e2t (X0 + t).
Proposition 7.2.2 Let Xt be a process satisfying the stochastic equation
dXt = α(t)Xt dt + b(t)dWt .
Then the mean and variance of Xt are given by
E[Xt ] = eA(t) X0
Z t
2A(t)
V ar[Xt ] = e e−A(s) b2 (s) ds,
0
Rt
where A(t) = 0
α(s) ds.
Proof: The expression of E[Xt ] follows directly from formula (7.2.5) with β = 0. In order to
compute the second moment we first compute
(dXt )2 = b2 (t) dt;
d(Xt2 ) = 2Xt dXt + (dXt )2
= 2Xt α(t)Xt dt + b(t)dWt + b2 (t)dt
= 2α(t)Xt2 + b2 (t) dt + 2b(t)Xt dWt ,
where we used Ito’s formula. If we let Yt = Xt2 , the previous equation becomes
p
dYt = 2α(t)Yt + b2 (t) dt + 2b(t) Yt dWt .
Applying formula (7.2.5) with α(t) replaced by 2α(t) and β(t) by b2 (t), yields
Z t
2A(t)
E[Yt ] = e Y0 + e−2A(s) b2 (s) ds ,
0
Remark 7.2.3 We note that the previous equation is of linear type. This shall be solved
explicitly in a future section.
The mean and variance for a given stochastic process can be computed by working out the
associated stochastic equation. We shall provide next a few examples.
Example 7.2.1 Find the mean and variance of ekWt , with k constant.
If we let f (t) = E[ekWt ], then differentiating the previous relations yields the differential equa-
tion
1
f ′ (t) = k 2 f (t)
2
2
with the initial condition f (0) = E[ekW0 ] = 1. The solution is f (t) = ek t/2
, and hence
2
E[ekWt ] = ek t/2
.
The variance is
2 2
V ar(ekWt ) = E[e2kWt ] − (E[ekWt ])2 = e4k t/2
− ek t
2 2
= ek t (ek t − 1).
We shall set up a stochastic differential equation for Wt eWt . Using the product formula and
Ito’s formula yields
Let f (t) = E[Wt eWt ]. Using E[eWs ] = es/2 , the previous integral equation becomes
Z t
1
f (t) = ( f (s) + es/2 ) ds,
0 2
(e−t/2 f (t))′ = 1.
(2k)! k
E[Wt2k ] = t , E[Wt2k+1 ] = 0.
2k k!
Using the initial values E[Wt ] = 0 and E[Wt2 ] = t, the method of mathematical induction
implies that E[Wt2k+1 ] = 0 and E[Wt2k ] = (2k)! k
2k k! t .
Let f (t) = E[sin Wt ]. Differentiating yields the equation f ′ (t) = − 12 f (t) with f (0) = E[sin W0 ] =
0. The unique solution is f (t) = 0. Hence
E[sin Wt ] = 0.
Exercise 7.2.7 Use the previous exercise and the definition of expectation to show that
Z ∞
2 π 1/2
(a) e−x cos x dx = 1/4 ;
−∞ e
Z ∞ r
−x2 /2 2π
(b) e cos x dx = .
−∞ e
Exercise 7.2.9 Using the result given by Example 7.2.3 show that
3
(a) E[cos(tWt )] = e−t /2
;
(b) E[sin(tWt )] = 0;
(c) E[etWt ] = 0.
127
For general drift rates we cannot find the mean, but in the case of concave drift rates we
can find an upper bound for the expectation E[Xt ]. The following result will be useful.
Lemma 7.2.10 (Gronwall’s inequality) Let f (t) be a non-negative function satisfying the
inequality
Z t
f (t) ≤ C + M f (s) ds
0
f (t) ≤ CeMt , 0 ≤ t ≤ T.
Proof: From the mean value theorem there is ξ ∈ (0, x) such that
where we used that a′ (x) is a decreasing function. Applying Jensen’s inequality for concave
functions yields
E[a(Xt )] ≤ a(E[Xt ]).
Combining with (7.2.6) we obtain E[a(Xt )] ≤ M E[Xt ]. Substituting in the identity (7.2.4)
implies
Z t
E[Xt ] ≤ X0 + M E[Xs ] ds.
0
Exercise 7.2.12 State the previous result in the particular case when a(x) = sin x, with 0 ≤
x ≤ π.
Not in all cases can the mean and the variance be obtained directly from the stochastic
equation. In these cases we need more powerful methods that produce closed form solutions.
In the next sections we shall discuss several methods of solving stochastic differential equation.
Exercise 7.3.2 Solve the following stochastic differential equations for t ≥ 0 and determine
the mean and the variance of the solution
(a) dXt = cos t dt − sin t dWt , X0 = 1.
√
(b) dXt = et dt + t dWt , X0 = 0.
t
(c) dXt = 1+t2 dt + t3/2 dWt , X0 = 1.
If the drift and the volatility depend on both variables t and Wt , the stochastic differential
equation
dXt = a(t, Wt )dt + b(t, Wt )dWt , t≥0
defines an Ito diffusion. Integrating yields the solution
Z t Z t
Xt = X0 + a(s, Ws ) ds + b(s, Ws ) dWs .
0 0
There are several cases when both integrals can be computed explicitly.
129
dXt = dt + Wt dWt , X0 = 1.
Integrating yields
Z t Z t
Xt = s2 ds + es/2 cos Ws dWs
0 0
3
t
= + et/2 sin Wt , (7.3.7)
3
where we used (6.3.9). Even if the process Xt is not Gaussian, we can still compute its mean
and variance. By Ito’s formula we have
1
d(sin Wt ) = cos Wt dWt − sin Wt dt
2
Integrating between 0 and t yields
Z t Z t
1
sin Wt = cos Ws dWs − sin Ws ds,
0 2 0
where we used that sin W0 = sin 0 = 0. Taking the expectation in the previous relation yields
hZ t i 1Z t
E[sin Wt ] = E cos Ws dWs − E[sin Ws ] ds.
0 2 0
130
From the properties of the Ito integral, the first expectation on the right side is zero. Denoting
µ(t) = E[sin Wt ], we obtain the integral equation
Z
1 t
µ(t) = − µ(s) ds.
2 0
Differentiating yields the differential equation
1
µ′ (t) = − µ(t)
2
with the solution µ(t) = ke−t/2 . Since k = µ(0) = E[sin W0 ] = 0, it follows that µ(t) = 0.
Hence
E[sin Wt ] = 0.
Taking expectation in (7.3.7) leads to
h t3 i t3
E[Xt ] = E + et/2 E[sin Wt ] = .
3 3
Since the variance of predictable functions is zero,
h t3 i
V ar[Xt ] = V ar + et/2 sin Wt = (et/2 )2 V ar[sin Wt ]
3
et
= et E[sin2 Wt ] = (1 − E[cos 2Wt ]). (7.3.8)
2
In order to compute the last expectation we use Ito’s formula
d(cos 2Wt ) = −2 sin 2Wt dWt − 2 cos 2Wt dt
and integrate to get
Z t Z t
cos 2Wt = cos 2W0 − 2 sin 2Ws dWs − 2 cos 2Ws ds
0 0
Taking the expectation and using that Ito integrals have zero expectation, yields
Z t
E[cos 2Wt ] = 1 − 2 E[cos 2Ws ] ds.
0
If we denote m(t) = E[cos 2Wt ], the previous relation becomes an integral equation
Z t
m(t) = 1 − 2 m(s) ds.
0
and find the distribution of the solution Xt and its mean and variance.
Dividing by et/2 , integrating between 0 and t, and using formula (6.3.8) yields
Z t Z t
Xt = e−s/2 ds + e−s/2+Ws dWs
0 0
= 2(1 − e−t/2 ) + e−t/2 eWt − 1
= 1 + e−t/2 (eWt − 2).
F (y) = P (Xt ≤ y) = P 1 + e−t/2 (eWt − 2) ≤ y
W 1
t
= P Wt ≤ ln 2 + et/2 (y − 1) = P √ ≤ √ ln 2 + et/2 (y − 1)
t t
1
t/2
= N √ ln 2 + e (y − 1) ,
t
Z u
1 2
where N (u) = √ e−s /2
ds is the distribution function of a standard normal distributed
2π −∞
random variable.
1
a(t, x) = ∂t f (t, x) + ∂x2 f (t, x) (7.4.10)
2
b(t, x) = ∂x f (t, x). (7.4.11)
1
et (1 + x2 ) = ∂t f (t, x) + ∂x2 f (t, x)
2
1 + 2et x = ∂x f (t, x).
Then ∂t f = et x2 + T ′ (t) and ∂x2 f = 2et . Substituting in the first equation yields
et (1 + x2 ) = et x2 + T ′ (t) + et .
The coefficient functions are a(t, x) = 2tx^3 + 3t^2(1 + x) and b(t, x) = 3t^2x^2 + 1. The associated system is given by
2tx^3 + 3t^2(1 + x) = \partial_t f(t, x) + \frac{1}{2}\partial_x^2 f(t, x)
3t^2x^2 + 1 = \partial_x f(t, x).
Then \partial_t f = 2tx^3 + T'(t) and \partial_x^2 f = 6t^2 x, and substituting into the first equation we get
2tx^3 + 3t^2(1 + x) = 2tx^3 + T'(t) + \frac{1}{2}\cdot 6t^2 x.
After cancellations we get T'(t) = 3t^2, so T(t) = t^3 + c. Then
Theorem 7.4.1 If the stochastic differential equation (7.4.9) is exact, then the coefficient func-
tions a(t, x) and b(t, x) satisfy the condition
\partial_x a = \partial_t b + \frac{1}{2}\partial_x^2 b. \qquad (7.4.12)
Proof: If the stochastic equation is exact, there is a function f(t, x) satisfying the system (7.4.10)–(7.4.11). Differentiating the first equation of the system with respect to x yields
\partial_x a = \partial_t \partial_x f + \frac{1}{2}\partial_x^2 \partial_x f.
Substituting b = \partial_x f yields the desired relation.
Remark 7.4.2 The equation (7.4.12) has the meaning of a heat equation. The function b(t, x) represents the temperature measured at x at the instant t, while \partial_x a is the density of heat sources. The function a(t, x) can be regarded as the potential from which the density of heat sources is derived by taking the gradient in x.
It is worth noting that equation (7.4.12) is just a necessary condition for exactness. This means that if this condition is not satisfied, then the equation is not exact. In that case we need to try a different method to solve the equation.
Is the stochastic differential equation dX_t = (1 + W_t^2)\,dt + (t^4 + W_t^2)\,dW_t exact?
Collecting the coefficients, we have a(t, x) = 1 + x2 , b(t, x) = t4 + x2 . Since ∂x a = 2x, ∂t b = 4t3 ,
and ∂x2 b = 2, the condition (7.4.12) is not satisfied, and hence the equation is not exact.
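The exactness test (7.4.12) is easy to automate symbolically. The sketch below (the helper name is illustrative) uses sympy to check the condition for the coefficients of the non-exact example above and for the exact example a = 2tx^3 + 3t^2(1+x), b = 3t^2x^2 + 1 treated earlier.

```python
import sympy as sp

# Sketch: test the exactness condition (7.4.12), d_x a = d_t b + (1/2) d_x^2 b.
t, x = sp.symbols('t x')

def is_exact(a, b):
    lhs = sp.diff(a, x)
    rhs = sp.diff(b, t) + sp.Rational(1, 2) * sp.diff(b, x, 2)
    return sp.simplify(lhs - rhs) == 0

print(is_exact(1 + x**2, t**4 + x**2))                          # False (not exact)
print(is_exact(2*t*x**3 + 3*t**2*(1 + x), 3*t**2*x**2 + 1))     # True  (exact)
```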
d\Big(\frac{X_t}{f(t)}\Big) = \frac{f(t)\, dX_t - X_t\, df(t)}{f(t)^2}.
For instance, if a stochastic differential equation can be written as
dX_t = f'(t)W_t\, dt + f(t)\, dW_t,
the product rule brings the equation into the exact form
dX_t = d\big(f(t)W_t\big),
which after integration yields the solution X_t = X_0 + f(t)W_t.
dXt = d(tWt2 ),
Exercise 7.5.1 Solve the following stochastic differential equations by the inspection method
(a) dX_t = (1 + W_t)\, dt + (t + 2W_t)\, dW_t, \quad X_0 = 0;
(b) t^2\, dX_t = (2t^3 - W_t)\, dt + t\, dW_t, \quad X_1 = 0;
(c) e^{-t/2}\, dX_t = \frac{1}{2}W_t\, dt + dW_t, \quad X_0 = 0;
(d) dX_t = 2tW_t\, dW_t + W_t^2\, dt, \quad X_0 = 0;
(e) dX_t = \Big(1 + \frac{1}{2\sqrt{t}}W_t\Big)dt + \sqrt{t}\, dW_t, \quad X_1 = 0.
Integrating yields
e^{-A(t)}X_t = X_0 + \int_0^t e^{-A(s)}\beta(s)\, ds + \int_0^t e^{-A(s)}b(s, W_s)\, dW_s
X_t = X_0 e^{A(t)} + e^{A(t)}\Big(\int_0^t e^{-A(s)}\beta(s)\, ds + \int_0^t e^{-A(s)}b(s, W_s)\, dW_s\Big).
The first integral within the previous parentheses is a Riemann integral, and the latter one is
an Ito stochastic integral. Sometimes, in practical applications these integrals can be computed
explicitly.
When b(t, Wt ) = b(t), the latter integral becomes a Wiener integral. In this case the solution
Xt is Gaussian with mean and variance given by
E[X_t] = X_0 e^{A(t)} + e^{A(t)}\int_0^t e^{-A(s)}\beta(s)\, ds
Var[X_t] = e^{2A(t)}\int_0^t e^{-2A(s)}b(s)^2\, ds.
Another important particular case is when \alpha(t) = \alpha \ne 0 and \beta(t) = \beta are constants and b(t, W_t) = b(t). The equation in this case is
dX_t = (\alpha X_t + \beta)\, dt + b(t)\, dW_t, \qquad t \ge 0,
and the solution takes the form
X_t = X_0 e^{\alpha t} + \frac{\beta}{\alpha}(e^{\alpha t} - 1) + \int_0^t e^{\alpha(t-s)}b(s)\, dW_s.
Integrating yields
e^{-t/2}X_t = X_0 + \int_0^t e^{-s/2}\, ds + \int_0^t e^{s/2}\cos W_s\, dW_s
is given by
X_t = m + (X_0 - m)e^{-t} + \alpha\int_0^t e^{s-t}\, dW_s. \qquad (7.6.14)
Hence
X_t = X_0 e^{-t} + m(1 - e^{-t}) + \alpha e^{-t}\int_0^t e^{s}\, dW_s = m + (X_0 - m)e^{-t} + \alpha\int_0^t e^{s-t}\, dW_s.
Since X_t is the sum of a deterministic function and a Wiener integral, we can use Proposition 4.4.1 and it follows that X_t is Gaussian, with
E[X_t] = m + (X_0 - m)e^{-t} + E\Big[\alpha\int_0^t e^{s-t}\, dW_s\Big] = m + (X_0 - m)e^{-t}
Var(X_t) = Var\Big(\alpha\int_0^t e^{s-t}\, dW_s\Big) = \alpha^2 e^{-2t}\int_0^t e^{2s}\, ds = \alpha^2 e^{-2t}\,\frac{e^{2t} - 1}{2} = \frac{\alpha^2}{2}(1 - e^{-2t}).
\lim_{t\to\infty} E[X_t] = m.
The variance also tends to zero exponentially, \lim_{t\to\infty} Var[X_t] = 0. According to Proposition 3.8.1, the process X_t tends to m in the mean square sense.
Proposition 7.6.3 (The Brownian bridge) For a, b ∈ R fixed, the stochastic differential
equation
dX_t = \frac{b - X_t}{1 - t}\, dt + dW_t, \qquad 0 \le t < 1, \quad X_0 = a
has the solution
X_t = a(1 - t) + bt + (1 - t)\int_0^t \frac{1}{1 - s}\, dW_s, \qquad 0 \le t < 1. \qquad (7.6.15)
E[X_t] = a(1 - t) + bt
Var(X_t) = Var(U_t) = t(1 - t).
It is worth noting that the variance is maximum at the midpoint t = 1/2 and zero at the endpoints t = 0 and t = 1.
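A short simulation of the bridge equation illustrates these moments. The sketch below (endpoints, step count and sample size are illustrative) runs an Euler scheme for the equation of Proposition 7.6.3 up to an intermediate time and compares the sample mean and variance with a(1-t)+bt and t(1-t).

```python
import numpy as np

# Sketch: Euler simulation of the Brownian bridge equation dX_t = (b - X_t)/(1 - t) dt + dW_t.
a, b, t_eval, n_steps, n_paths = 1.0, 3.0, 0.5, 500, 20_000
dt = t_eval / n_steps
rng = np.random.default_rng(11)

X = np.full(n_paths, a)
t = 0.0
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X + (b - X) / (1.0 - t) * dt + dW
    t += dt

print(X.mean(), a * (1 - t_eval) + b * t_eval)   # ~ 2.0
print(X.var(),  t_eval * (1 - t_eval))           # ~ 0.25
```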
Example 7.6.4 Find Cov(Xs , Xt ), 0 < s < t for the following cases:
(a) X_t is a mean-reverting Ornstein-Uhlenbeck process;
(b) Xt is a Brownian bridge process.
d(e−3t Xt ) = dNt .
Substituting the last term from the initial equation (7.7.17) yields
Example 7.7.1 Use the method of variation of parameters to solve the equation
dXt = Xt Wt dWt .
Dividing by Xt and converting the differential equation into the equivalent integral form, we
get
\int \frac{1}{X_t}\, dX_t = \int W_t\, dW_t.
The right side is a well-known stochastic integral given by
\int W_t\, dW_t = \frac{W_t^2}{2} - \frac{t}{2} + C.
The left side will be integrated “blindly” according to the rules of elementary Calculus
\int \frac{1}{X_t}\, dX_t = \ln X_t + C.
Equating the last two relations and solving for X_t we obtain the “pseudo-solution”
X_t = e^{\frac{W_t^2}{2} - \frac{t}{2} + c},
with c constant. In order to get a correct solution, we let c depend on t and W_t. We shall assume that c(t, W_t) = a(t) + b(W_t), so we are looking for a solution of the form
X_t = e^{\frac{W_t^2}{2} - \frac{t}{2} + a(t) + b(W_t)}.
This equation is satisfied if we are able to choose the functions a(t) and b(Wt ) such that the
coefficients of dt and dWt vanish
b'(W_t) = 0, \qquad a'(t) + \frac{1}{2}b''(W_t) = 0.
From the first equation b must be a constant. Substituting into the second equation it follows
that a is also a constant. It turns out that the aforementioned “pseudo-solution” is in fact a
solution. The constant c = a + b is obtained letting t = 0. Hence the solution is given by
X_t = X_0\, e^{\frac{W_t^2}{2} - \frac{t}{2}}.
Example 7.7.2 Use the method of variation of parameters to solve the stochastic differential
equation
dXt = µXt dt + σXt dWt ,
After dividing by Xt we bring the equation into the equivalent integral form
\int \frac{dX_t}{X_t} = \int \mu\, dt + \int \sigma\, dW_t.
ln Xt = µt + σWt + c,
Xt = eµt+σWt +c .
Assume the constant c is replaced by a function c(t), so we are looking for a solution of the
form
Xt = eµt+σWt +c(t) . (7.7.19)
dX_t = X_t\Big(\mu + c'(t) + \frac{\sigma^2}{2}\Big)dt + \sigma X_t\, dW_t.
Subtracting the initial equation yields
\Big(c'(t) + \frac{\sigma^2}{2}\Big)dt = 0,
which is satisfied for c'(t) = -\frac{\sigma^2}{2}, with the solution c(t) = -\frac{\sigma^2}{2}t + k, k \in \mathbb{R}. Substituting into (7.7.19) yields the solution
X_t = e^{\mu t + \sigma W_t - \frac{\sigma^2}{2}t + k} = e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t + k} = X_0\, e^{(\mu - \frac{\sigma^2}{2})t + \sigma W_t}.
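The closed-form geometric Brownian motion can be compared against a direct discretization of the equation on the same Brownian path. The sketch below (illustrative parameters) shows the two agreeing for a small step size; the discrepancy shrinks as dt decreases.

```python
import numpy as np

# Sketch: Euler discretization of dX_t = mu X_t dt + sigma X_t dW_t versus the
# closed form X_t = X_0 exp((mu - sigma^2/2) t + sigma W_t) on the same path.
mu, sigma, x0, T, n_steps = 0.1, 0.3, 1.0, 1.0, 100_000
dt = T / n_steps
rng = np.random.default_rng(3)

x_euler, W = x0, 0.0
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    x_euler += mu * x_euler * dt + sigma * x_euler * dW
    W += dW

x_exact = x0 * np.exp((mu - sigma**2 / 2) * T + sigma * W)
print(x_euler, x_exact)   # should agree to about two decimal places for small dt
```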
Exercise 7.8.1 Let α be a constant. Solve the following stochastic differential equations by the
method of integrating factors
(a) dXt = αXt dWt ;
(b) dXt = Xt dt + αXt dWt ;
(c) dX_t = \frac{1}{X_t}\, dt + \alpha X_t\, dW_t, \quad X_0 > 0.
Exercise 7.8.2 Let X_t be the solution of the stochastic equation dX_t = \sigma X_t\, dW_t, with \sigma constant. Let A_t = \frac{1}{t}\int_0^t X_s\, dW_s be the stochastic average of X_t. Find the stochastic equation satisfied by A_t, and the mean and variance of A_t.
We shall look for a solution of the type Xt = f (Wt ). Ito’s formula yields
dX_t = f'(W_t)\, dW_t + \frac{1}{2}f''(W_t)\, dt.
Equating the coefficients of dt and dWt in the last two equations yields
X_t = \frac{1}{a - W_t}.
Let Ta be the first time the Brownian motion Wt hits a. Then the process Xt is defined only
for 0 ≤ t < Ta . Ta is a random variable with P (Ta < ∞) = 1 and E[Ta ] = ∞, see section 3.3.
The following theorem is the analog of Picard’s uniqueness result from ordinary differential
equations:
Theorem 7.9.1 (Existence and Uniqueness) Consider the stochastic differential equation
The first condition says that the drift and the volatility increase no faster than a linear function in x. The second condition states that the functions are Lipschitz in the second argument.
where X_t = (X_t^1, \dots, X_t^n) satisfies the Ito diffusion (8.1.1) on components, i.e.,
Using the stochastic relations dt2 = dt dWk (t) = 0 and dWk (t) dWr (t) = δkr dt, a computation
provides
dX_t^i\, dX_t^j = \Big(b_i\, dt + \sum_k \sigma_{ik}\, dW_k(t)\Big)\Big(b_j\, dt + \sum_k \sigma_{jk}\, dW_k(t)\Big) = \sum_k \sigma_{ik}\, dW_k(t)\sum_r \sigma_{jr}\, dW_r(t) = \sum_{k,r}\sigma_{ik}\sigma_{jr}\, dW_k(t)\, dW_r(t) = \sum_k \sigma_{ik}\sigma_{jk}\, dt
dF_t = \Big[\frac{1}{2}\sum_{i,j}\frac{\partial^2 f}{\partial x_i\partial x_j}(X_t)(\sigma\sigma^T)_{ij} + \sum_i b_i(X_t)\frac{\partial f}{\partial x_i}(X_t)\Big]dt + \sum_{i,k}\frac{\partial f}{\partial x_i}(X_t)\,\sigma_{ik}(X_t)\, dW_k(t).
F_t = F_0 + \int_0^t\Big[\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i\partial x_j} + \sum_i b_i\frac{\partial f}{\partial x_i}\Big](X_s)\, ds + \sum_k\int_0^t\sum_i\frac{\partial f}{\partial x_i}\,\sigma_{ik}(X_s)\, dW_k(s).
Since F_0 = f(X_0) = f(x) and E(f(x)) = f(x), applying the expectation operator in the previous relation we obtain
E(F_t) = f(x) + E\Big[\int_0^t\Big(\frac{1}{2}\sum_{i,j}(\sigma\sigma^T)_{ij}\frac{\partial^2 f}{\partial x_i\partial x_j} + \sum_i b_i\frac{\partial f}{\partial x_i}\Big)(X_s)\, ds\Big]. \qquad (8.1.5)
Using the commutation between the operator E and the integral \int_0^t yields
The matrix \sigma is called the dispersion matrix and the product \sigma\sigma^T is called the diffusion matrix; these names are related to their physical significance. Substituting (8.1.6) in (8.1.5) we obtain the following formula
E[f(X_t)] = f(x) + E\Big[\int_0^t Af(X_s)\, ds\Big], \qquad (8.1.7)
Exercise 8.1.2 Find the generator operator associated with the n-dimensional Brownian mo-
tion.
Exercise 8.1.3 Find the Ito diffusion corresponding to the generator Af (x) = f ′′ (x) + f ′ (x).
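In one dimension the generator takes the form Af = \frac{1}{2}\sigma^2 f'' + b f' for the diffusion dX_t = b\,dt + \sigma\,dW_t, and Exercise 8.1.3 can be checked symbolically. The sketch below (the helper name `generator` is an illustrative choice) verifies that b = 1 and \sigma = \sqrt{2} produce the generator f'' + f'.

```python
import sympy as sp

# Sketch: one-dimensional generator A f = (1/2) sigma^2 f'' + b f' of dX_t = b dt + sigma dW_t.
x = sp.symbols('x')
f = sp.Function('f')

def generator(b, sigma, expr):
    return sp.Rational(1, 2) * sigma**2 * sp.diff(expr, x, 2) + b * sp.diff(expr, x)

# Exercise 8.1.3: with b = 1, sigma = sqrt(2) the generator is f'' + f'.
Af = generator(1, sp.sqrt(2), f(x))
print(sp.simplify(Af - (sp.diff(f(x), x, 2) + sp.diff(f(x), x))))   # 0
```

So a candidate answer to Exercise 8.1.3 is the diffusion dX_t = dt + \sqrt{2}\,dW_t.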
Lemma 8.2.1 Let g be a bounded measurable function and τ be a stopping time for Xt with
E[τ ] < ∞. Then
\lim_{k\to\infty} E\Big[\int_0^{\tau\wedge k} g(X_s)\, dW_s\Big] = E\Big[\int_0^{\tau} g(X_s)\, dW_s\Big]; \qquad (8.2.8)
\lim_{k\to\infty} E\Big[\int_0^{\tau\wedge k} g(X_s)\, ds\Big] = E\Big[\int_0^{\tau} g(X_s)\, ds\Big]. \qquad (8.2.9)
Proof: Let |g| < K. Using the properties of Ito integrals, we have
E\Big[\Big(\int_0^{\tau} g(X_s)\, dW_s - \int_0^{\tau\wedge k} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\Big(\int_{\tau\wedge k}^{\tau} g(X_s)\, dW_s\Big)^2\Big] = E\Big[\int_{\tau\wedge k}^{\tau} g^2(X_s)\, ds\Big] \le K^2 E[\tau - \tau\wedge k] \to 0, \quad k\to\infty.
Exercise 8.2.2 Assume the hypothesis of the previous lemma. Let 1_{\{s<\tau\}} be the characteristic function of the interval (-\infty, \tau)
1_{\{s<\tau\}}(u) = \begin{cases}1, & \text{if } u < \tau\\ 0, & \text{otherwise.}\end{cases}
Show that
(a) \int_0^{\tau\wedge k} g(X_s)\, dW_s = \int_0^{k} 1_{\{s<\tau\}}\, g(X_s)\, dW_s,
(b) \int_0^{\tau\wedge k} g(X_s)\, ds = \int_0^{k} 1_{\{s<\tau\}}\, g(X_s)\, ds.
Theorem 8.2.3 (Dynkin’s formula) Let f ∈ C02 (Rn ), and Xt be an Ito diffusion starting at
x. If τ is a stopping time with E[τ ] < ∞, then
E[f(X_\tau)] = f(x) + E\Big[\int_0^{\tau} Af(X_s)\, ds\Big], \qquad (8.2.10)
E[1_{\{s<\tau\}} f(X_k)] = 1_{\{s<\tau\}} f(x) + E\Big[\int_0^{k} A(1_{\{s<\tau\}} f)(X_s)\, ds\Big],
E\Big[\int_0^{k\wedge\tau} Af(X_s)\, ds\Big] \to E\Big[\int_0^{\tau} Af(X_s)\, ds\Big], \qquad k\to\infty,
8.3 Applications
In this section we shall present a few important results of stochastic calculus which can be
obtained as a consequence of Dynkin’s formula.
Theorem 8.3.1 (Kolmogorov’s backward equation) For any f ∈ C02 (Rn ) the function
v(t, x) = E[f (Xt )] satisfies the following Cauchy’s problem
\frac{\partial v}{\partial t} = Av, \qquad t > 0
v(0, x) = f(x),
E[Xτ ] = x0 (8.3.13)
Exercise 8.3.2 Prove relation (8.3.13) using the Optional Stopping Theorem for the martin-
gale Xt .
Let p_a = P(X_\tau = a) and p_b = P(X_\tau = b) be the exit probabilities from the interval (a, b). Obviously, p_a + p_b = 1, since the probability that the Brownian motion stays forever inside the bounded interval is zero. Using the expectation definition, relation (8.3.13) yields
apa + b(1 − pa ) = x0 .
E[\tau] = a^2 p_a + b^2 p_b - x_0^2 = a^2\,\frac{b - x_0}{b - a} + b^2\,\frac{x_0 - a}{b - a} - x_0^2 = \frac{1}{b - a}\big[ba^2 - ab^2 + x_0(b - a)(b + a)\big] - x_0^2 = -ab + x_0(b + a) - x_0^2 = (b - x_0)(x_0 - a). \qquad (8.3.16)
Exercise 8.3.3 (a) Show that the equation x^2 - (b - a)x + E[\tau] = 0 cannot have complex roots;
(b) Prove that E[\tau] \le \frac{(b - a)^2}{4};
(c) Find the point x_0 \in (a, b) such that the expectation of the exit time, E[\tau], is maximum.
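Formula (8.3.16) is easy to test by Monte Carlo. The sketch below (endpoints, starting point and step size are illustrative; the Euler discretization introduces a small positive bias that shrinks with dt) estimates the mean exit time of a Brownian motion from (a, b). With the starting point at the midpoint, the estimate should also be close to the maximal value (b-a)^2/4 from part (b) of the exercise above.

```python
import numpy as np

# Sketch: Monte Carlo check of E[tau] = (b - x0)(x0 - a) for the exit time from (a, b).
a, b, x0, dt, n_paths = -1.0, 2.0, 0.5, 0.01, 2_000
rng = np.random.default_rng(1)

exit_times = []
for _ in range(n_paths):
    x, t = x0, 0.0
    while a < x < b:
        x += rng.normal(0.0, np.sqrt(dt))
        t += dt
    exit_times.append(t)

print(np.mean(exit_times), (b - x0) * (x0 - a))   # both ~ 2.25
```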
that R > |a|. Consider the exit time of the process Xt from the ball B(0, R)
Assuming E[\tau] < \infty and letting f(x) = |x|^2 = x_1^2 + \cdots + x_n^2 in Dynkin's formula
E[f(X_\tau)] = f(x) + E\Big[\int_0^{\tau}\frac{1}{2}\Delta f(X_s)\, ds\Big]
yields
R^2 = |a|^2 + E\Big[\int_0^{\tau} n\, ds\Big],
and hence
E[\tau] = \frac{R^2 - |a|^2}{n}. \qquad (8.3.18)
In particular, if the Brownian motion starts from the center, i.e. a = 0, the expectation of the exit time is
E[\tau] = \frac{R^2}{n}.
(i) Since R^2/2 > R^2/3, the previous relation implies that it takes longer for a Brownian motion to exit a disk of radius R than a ball of the same radius.
(ii) The expected time for a Brownian motion to leave the interval (-R, R) is twice the expected time for a 2-dimensional Brownian motion to exit the disk B(0, R).
Exercise 8.3.5 Apply the Optional Stopping Theorem for the martingale M_t = W_t^2 - t to show that E[\tau] = R^2, where
τ = inf{t > 0; |Wt | > R}
is the first exit time of the Brownian motion from (−R, R).
where k > 0 is such that b \in A_k. Consider the process X_t = b + W(t) and let
\tau_k = \inf\{t > 0;\ X_t \notin A_k\}
yields
E[f(X_{\tau_k})] = f(b). \qquad (8.3.19)
This can be stated by saying that the value of f at a point b in the annulus is equal to the
expected value of f at the first exit time of a Brownian motion starting at b.
Since |X_{\tau_k}| is a random variable with two outcomes, we have
E[f(X_{\tau_k})] = p_k f(R) + q_k f(kR),
p_k = 1 - \frac{\ln(b/R)}{\ln k}.
Hence
P(\tau < \infty) = \lim_{k\to\infty} p_k = 1,
where \tau = \inf\{t > 0;\ |X_t| < R\} is the first time X_t hits the ball B(0, R). Hence in \mathbb{R}^2 a Brownian motion hits any ball with probability 1. This is stated equivalently by saying that the Brownian motion is recurrent in \mathbb{R}^2.
(ii) If n > 2 the equation (8.3.20) becomes
\frac{p_k}{R^{n-2}} + \frac{q_k}{k^{n-2}R^{n-2}} = \frac{1}{b^{n-2}}.
Taking the limit k \to \infty yields
\lim_{k\to\infty} p_k = \Big(\frac{R}{b}\Big)^{n-2} < 1.
Then in Rn , n > 2, a Brownian motion starting outside of a ball hits it with a probability less
than 1. This is usually stated by saying that the Brownian motion is transient.
3. We shall recover the previous results using the n-dimensional Bessel process
R_t = \mathrm{dist}\big(0, W(t)\big) = \sqrt{W_1(t)^2 + \cdots + W_n(t)^2}.
Consider the process Y_t = \alpha + R_t, with 0 \le \alpha < R. The generator of Y_t is the Bessel operator of order n
A = \frac{1}{2}\frac{d^2}{dx^2} + \frac{n-1}{2x}\frac{d}{dx},
see section 2.7. Consider the exit time
E[\tau] = \frac{R^2 - \alpha^2}{n},
which recovers (8.3.18) with \alpha = |a|.
In the following assume n ≥ 3 and consider the annulus
Consider the stopping time \tau = \inf\{t > 0;\ X_t \notin A_{r,R}\} = \inf\{t > 0;\ Y_t \notin (r, R)\}, where Y_0 = \alpha \in (r, R). Applying Dynkin's formula for f(x) = x^{2-n} yields E[f(Y_\tau)] = f(\alpha). This can be written as
p_r\, r^{2-n} + p_R\, R^{2-n} = \alpha^{2-n},
where
p_r = P(|X_\tau| = r), \qquad p_R = P(|X_\tau| = R), \qquad p_r + p_R = 1.
Solving for p_r and p_R yields
p_r = \frac{(R/\alpha)^{n-2} - 1}{(R/r)^{n-2} - 1}, \qquad p_R = \frac{(r/\alpha)^{n-2} - 1}{(r/R)^{n-2} - 1}.
The transience probability is obtained by taking the limit to infinity
p_r = \lim_{R\to\infty} p_{r,R} = \lim_{R\to\infty}\frac{\alpha^{2-n}R^{n-2} - 1}{r^{2-n}R^{n-2} - 1} = \Big(\frac{r}{\alpha}\Big)^{n-2},
where pr is the probability that a Brownian motion starting outside the ball of radius r will hit
the ball, see Fig. 8.2.
\frac{dX(s)}{ds} = a\big(s, X(s)\big), \qquad t \le s \le T
X(t) = x,
and define the cumulative cost between t and T along the solution ϕ
u(t, x) = \int_t^T c\big(s, \varphi(s)\big)\, ds, \qquad (8.4.21)
where c denotes a continuous cost function. Differentiate both sides with respect to t
\frac{\partial}{\partial t}\, u\big(t, \varphi(t)\big) = \frac{\partial}{\partial t}\int_t^T c\big(s, \varphi(s)\big)\, ds
\partial_t u + \partial_x u\,\varphi'(t) = -c\big(t, \varphi(t)\big).
It is worth mentioning that this is a variant of the method of characteristics.1 The curve given
by the solution ϕ(s) is called a characteristic curve.
Exercise 8.4.1 Using the previous method solve the following final boundary problems:
(a)
∂t u + x∂x u = −x
u(T, x) = 0.
(b)
∂t u + tx∂x u = ln x, x>0
u(T, x) = 0.
1 This is a well known method of solving linear partial differential equations.
Applying Ito's formula on one side and the Fundamental Theorem of Calculus on the other, we obtain
\partial_t u(t, X_t)\, dt + \partial_x u(t, X_t)\, dX_t + \frac{1}{2}\partial_x^2 u(t, X_t)\, dX_t^2 = -c(t, X_t)\, dt.
Taking the expectation E[\,\cdot\,|X_t = x] on both sides yields
\partial_t u(t, x)\, dt + \partial_x u(t, x)\, a(t, x)\, dt + \frac{1}{2}\partial_x^2 u(t, x)\, b^2(t, x)\, dt = -c(t, x)\, dt.
Hence, the expected cost is
u(t, x) = E\Big[\int_t^T c(s, X_s)\, ds \,\Big|\, X_t = x\Big].
(a)
\partial_t u + \partial_x u + \frac{1}{2}\partial_x^2 u = -x
u(T, x) = 0.
(b)
\partial_t u + \partial_x u + \frac{1}{2}\partial_x^2 u = e^x,
u(T, x) = 0.
(c)
\partial_t u + \mu x\,\partial_x u + \frac{1}{2}\sigma^2 x^2\,\partial_x^2 u = -x,
u(T, x) = 0.
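The probabilistic representation u(t, x) = E[\int_t^T c(s, X_s)\,ds\,|\,X_t = x] can also be evaluated by simulation. The sketch below (illustrative parameters) does this for part (c), where the associated diffusion is the geometric Brownian motion dX_s = \mu X_s\,ds + \sigma X_s\,dW_s, and compares the estimate with the closed form x(e^{\mu(T-t)} - 1)/\mu obtained later in the solutions to 8.4.2.

```python
import numpy as np

# Sketch: Monte Carlo for u(t, x) = E[ int_t^T X_s ds | X_t = x ], X_s a geometric BM.
mu, sigma, x, t, T, n_steps, n_paths = 0.2, 0.4, 1.0, 0.0, 1.0, 200, 20_000
dt = (T - t) / n_steps
rng = np.random.default_rng(5)

X = np.full(n_paths, x)
integral = np.zeros(n_paths)
for _ in range(n_steps):
    integral += X * dt                                       # accumulate int_t^T X_s ds
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X * np.exp((mu - sigma**2 / 2) * dt + sigma * dW)    # exact GBM increment

print(integral.mean(), x * (np.exp(mu * (T - t)) - 1) / mu)  # both ~ 1.107
```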
Chapter 9
Martingales
is an Ft -martingale.
where we used that \int_s^t v(\tau)\, dW_\tau is independent of F_s and the conditional expectation equals the usual expectation
E\Big[\int_s^t v(\tau)\, dW_\tau \,\Big|\, F_s\Big] = E\Big[\int_s^t v(\tau)\, dW_\tau\Big] = 0.
Example 9.1.2 Let X_t = \int_0^t v(s)\, dW_s be a process as in Example 9.1.1. Then
M_t = X_t^2 - \int_0^t v^2(s)\, ds
is an F_t-martingale.
The process Xt satisfies the stochastic equation dXt = v(t)dWt . By Ito’s formula
is an Ft -martingale.
is an Ft -martingale for 0 ≤ t ≤ T .
Consider the process U_t = \int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds. Then
dU_t = u(t)\, dW_t - \frac{1}{2}u^2(t)\, dt
(dU_t)^2 = u^2(t)\, dt.
E\Big[\int_s^t u(\tau)M_\tau\, dW_\tau \,\Big|\, F_s\Big] = E\Big[\int_s^t u(\tau)M_\tau\, dW_\tau\Big] = 0,
and hence
E[M_t|F_s] = E\Big[M_s + \int_s^t u(\tau)M_\tau\, dW_\tau \,\Big|\, F_s\Big] = M_s.
Remark 9.1.4 The condition that u(s) is continuous on [0, T] can be relaxed by asking only that
u \in L^2[0, T] = \Big\{u : [0, T] \to \mathbb{R};\ \text{measurable and } \int_0^T |u(s)|^2\, ds < \infty\Big\}.
It is worth noting that the conclusion still holds if the function u(s) is replaced by a stochastic process u(t, \omega) satisfying Novikov's condition
E\Big[e^{\frac{1}{2}\int_0^T u^2(s,\omega)\, ds}\Big] < \infty.
The previous process plays a distinguished role in the theory of martingales and will be useful in proving Girsanov's theorem.
Definition 9.1.5 Let u \in L^2[0, T] be a deterministic function. Then the stochastic process
M_t = e^{\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}
is an F_t-martingale. For instance, for u(s) = s this becomes M_t = e^{tW_t - \frac{t^3}{6} - Z_t}, where Z_t = \int_0^t W_s\, ds.
Example 9.1.6 Let X_t be a solution of dX_t = u(t)\, dt + dW_t, with u(s) a bounded function. Consider the exponential process
M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}. \qquad (9.1.2)
Then Y_t = M_t X_t is an F_t-martingale.
and hence
E[Yt |Fs ] = Ys .
Exercise 9.1.7 Prove that (W_t + t)\, e^{-W_t - \frac{1}{2}t} is an F_t-martingale.
Exercise 9.1.8 Let h be a continuous function. Using the properties of the Wiener integral and log-normal random variables, show that
E\Big[e^{\int_0^t h(s)\, dW_s}\Big] = e^{\frac{1}{2}\int_0^t h(s)^2\, ds}.
Exercise 9.1.9 Let M_t be the exponential process (9.1.2). Use the previous exercise to show that for any t > 0
(a) E[M_t] = 1; \qquad (b) E[M_t^2] = e^{\int_0^t u(s)^2\, ds}.
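Both identities are easy to verify numerically in the simplest case. The sketch below uses the illustrative choice u(s) = 1, for which M_t = e^{-W_t - t/2}, so that (a) and (b) predict E[M_t] = 1 and E[M_t^2] = e^{t}; since u is constant, no path simulation is needed.

```python
import numpy as np

# Sketch: sampling check of Exercise 9.1.9 with u(s) = 1, i.e. M_t = exp(-W_t - t/2).
t = 1.0
rng = np.random.default_rng(9)
W = rng.normal(0.0, np.sqrt(t), size=500_000)
M = np.exp(-W - t / 2)

print(M.mean())                      # ~ 1
print((M**2).mean(), np.exp(t))      # both ~ e
```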
Exercise 9.1.10 Let Ft = σ{Wu ; u ≤ t}. Show that the following processes are Ft -martingales:
(a) et/2 cos Wt ;
(b) et/2 sin Wt .
Recall that the Laplacian of a twice differentiable function f is defined by \Delta f(x) = \sum_{j=1}^{n}\partial_{x_j}^2 f.
Example 9.1.11 Consider the smooth function f : Rn → R, such that
(i) ∆f = 0;
(ii) E |f (Wt )| < ∞, ∀t > 0 and x ∈ R.
Then the process Xt = f (Wt ) is an Ft -martingale.
Proof: It follows from the more general Example 9.1.13.
Exercise 9.1.12 Let W1 (t) and W2 (t) be two independent Brownian motions. Show that Xt =
eW1 (t) cos W2 (t) is a martingale.
Example 9.1.13 Let f : Rn → R be a smooth function such that
(i) E |f (Wt )| < ∞;
(ii) E\Big[\int_0^t |\Delta f(W_s)|\, ds\Big] < \infty.
Then the process X_t = f(W_t) - \frac{1}{2}\int_0^t \Delta f(W_s)\, ds is a martingale.
Proof: For 0 \le s < t we have
E[X_t|F_s] = E[f(W_t)|F_s] - \frac{1}{2}E\Big[\int_0^t \Delta f(W_u)\, du \,\Big|\, F_s\Big]
= E[f(W_t)|F_s] - \frac{1}{2}\int_0^s \Delta f(W_u)\, du - \int_s^t E\Big[\frac{1}{2}\Delta f(W_u)\,\Big|\, F_s\Big]\, du. \qquad (9.1.3)
Let p(t, y, x) be the probability density function of W_t. Integrating by parts and using that p satisfies Kolmogorov's backward equation, we have
E\Big[\frac{1}{2}\Delta f(W_u)\,\Big|\, F_s\Big] = \frac{1}{2}\int p(u-s, W_s, x)\,\Delta f(x)\, dx = \frac{1}{2}\int \Delta_x p(u-s, W_s, x)\, f(x)\, dx = \int \frac{\partial}{\partial u} p(u-s, W_s, x)\, f(x)\, dx.
Then, using the Fundamental Theorem of Calculus, we obtain
\int_s^t E\Big[\frac{1}{2}\Delta f(W_u)\,\Big|\, F_s\Big]\, du = \int_s^t\int \frac{\partial}{\partial u} p(u-s, W_s, x)\, f(x)\, dx\, du
= \int p(t-s, W_s, x)\, f(x)\, dx - \lim_{\epsilon\searrow 0}\int p(\epsilon, W_s, x)\, f(x)\, dx
= E[f(W_t)|F_s] - \int \delta(x - W_s)\, f(x)\, dx = E[f(W_t)|F_s] - f(W_s).
Substituting back in (9.1.3) yields E[X_t|F_s] = f(W_s) - \frac{1}{2}\int_0^s \Delta f(W_u)\, du = X_s.
Hence Xt is an Ft -martingale.
Exercise 9.1.14 Use Example 9.1.13 to show that the following processes are martingales:
(a) X_t = W_t^2 - t;
(b) X_t = W_t^3 - 3\int_0^t W_s\, ds;
(c) X_t = \frac{1}{n(n-1)}W_t^n - \frac{1}{2}\int_0^t W_s^{n-2}\, ds;
(d) X_t = e^{cW_t} - \frac{1}{2}c^2\int_0^t e^{cW_s}\, ds, with c constant;
(e) X_t = \sin(cW_t) + \frac{1}{2}c^2\int_0^t \sin(cW_s)\, ds, with c constant.
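A numerical sanity check of the martingale property is possible even before proving it. The sketch below (illustrative times and sample sizes) tests part (b), X_t = W_t^3 - 3\int_0^t W_s\,ds, by using the fact that martingale increments are orthogonal to F_s-measurable variables, e.g. E[(X_t - X_s)W_s] = 0, whereas E[X_t W_s] itself is nonzero (about 3s^2/2).

```python
import numpy as np

# Sketch: orthogonality check E[(X_t - X_s) W_s] ~ 0 for X_t = W_t^3 - 3 int_0^t W_u du.
s, t, n_steps, n_paths = 0.5, 1.0, 200, 50_000
dt = t / n_steps
rng = np.random.default_rng(13)

W = np.zeros(n_paths)
integral = np.zeros(n_paths)
Xs = Ws = None
for k in range(n_steps):
    integral += W * dt                                   # accumulate int_0^t W_u du
    W += rng.normal(0.0, np.sqrt(dt), size=n_paths)
    if np.isclose((k + 1) * dt, s):
        Ws = W.copy()
        Xs = W**3 - 3 * integral                         # X_s on each path

Xt = W**3 - 3 * integral
print(np.mean((Xt - Xs) * Ws))    # ~ 0, while np.mean(Xt * Ws) is about 3 s^2 / 2
```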
We have not used the superscript until now since there was no doubt which probability measure was used. In this section we shall also use another probability measure, given by
dQ = M_T\, dP,
which shows that Q is a probability on F , and hence (Ω, F , Q) becomes a probability space.
The following transformation of expectation from the probability measure Q to P will be useful
in part II. If X is a random variable
E^Q[X] = \int_\Omega X(\omega)\, dQ(\omega) = \int_\Omega X(\omega)\, M_T(\omega)\, dP(\omega) = E^P[X M_T].
The following result will play a central role in proving Girsanov’s theorem:
Lemma 9.2.1 Let X_t be the Ito process
dX_t = u(t)\, dt + dW_t, \qquad X_0 = 0, \quad 0 \le t \le T,
with u(s) a bounded function. Consider the exponential process
M_t = e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u^2(s)\, ds}.
Then X_t is an F_t-martingale with respect to the measure
dQ(\omega) = M_T(\omega)\, dP(\omega).
Proof: We need to prove that X_t is an F_t-martingale with respect to Q, so it suffices to show the following three properties:
1. Integrability of X_t. This part usually follows from standard norm estimates. We shall do it here in detail, but we shall omit it in other proofs. Integrating the equation of X_t between 0 and t yields
X_t = \int_0^t u(s)\, ds + W_t. \qquad (9.2.7)
We shall prove this identity by showing that both terms are equal to Xs Ms . Since Xs is
Fs -predictable and Mt is a martingale, the right side term becomes
Let s < t. Using the tower property (see Proposition 1.11.4, part 3), the left side term becomes
E^P[X_t M_T|F_s] = E^P\big[E^P[X_t M_T|F_t]\,\big|\,F_s\big] = E^P\big[X_t E^P[M_T|F_t]\,\big|\,F_s\big] = E^P\big[X_t M_t\,\big|\,F_s\big] = X_s M_s,
where we used that Mt and Xt Mt are martingales and Xt is Ft -predictable. Hence (9.2.8) holds
and Xt is an Ft -martingale with respect to the probability measure Q.
E Q [Xt2 ] = t.
Proof: Denote U(t) = \int_0^t u(s)\, ds. Then
Taking the expectation and using the property of Ito integrals we have
E[W_t M_t] = -\int_0^t u(s)\, E[M_s]\, ds = -\int_0^t u(s)\, ds = -U(t). \qquad (9.2.11)
Chapter 1
1.6.1 Let X \sim N(\mu, \sigma^2). Then the distribution function of Y is
F_Y(y) = P(Y < y) = P(\alpha X + \beta < y) = P\Big(X < \frac{y - \beta}{\alpha}\Big)
= \frac{1}{\sqrt{2\pi}\sigma}\int_{-\infty}^{\frac{y-\beta}{\alpha}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \frac{1}{\sqrt{2\pi}\alpha\sigma}\int_{-\infty}^{y} e^{-\frac{(z-(\alpha\mu+\beta))^2}{2\alpha^2\sigma^2}}\, dz = \frac{1}{\sqrt{2\pi}\sigma'}\int_{-\infty}^{y} e^{-\frac{(z-\mu')^2}{2(\sigma')^2}}\, dz,
with \mu' = \alpha\mu + \beta, \sigma' = \alpha\sigma.
1.6.2 (a) Making t = n yields E[Y^n] = E[e^{nX}] = e^{\mu n + n^2\sigma^2/2}.
(b) Let n = 1 and n = 2 in (a) to get the first two moments and then use the formula of
variance.
1.11.5 The tower property
E\big[E[X|\mathcal{G}]\,\big|\,\mathcal{H}\big] = E[X|\mathcal{H}], \qquad \mathcal{H} \subset \mathcal{G}
is equivalent with
\int_A E[X|\mathcal{G}]\, dP = \int_A X\, dP, \qquad \forall A \in \mathcal{H}.
E[(X_n - X)^2] \to 0
|E[X(X - X_n)]| \le \int_\Omega |X(X - X_n)|\, dP \le \Big(\int_\Omega X^2\, dP\Big)^{1/2}\Big(\int_\Omega (X - X_n)^2\, dP\Big)^{1/2} = \sqrt{E[X^2]\, E[(X - X_n)^2]} \to 0
as n \to \infty.
1.15.7 The integrability of X_t follows from
E[|X_t|] = E\big[|E[X|F_t]|\big] \le E\big[E[|X|\,|F_t]\big] = E[|X|] < \infty.
X_t is F_t-predictable by the definition of the conditional expectation. Using the tower property yields
E[X_t|F_s] = E\big[E[X|F_t]\,\big|\,F_s\big] = E[X|F_s] = X_s, \qquad s < t.
1.15.8 Since
1.15.9 In general the answer is no. For instance, if X_t = Y_t the process X_t^2 is not a martingale, since Jensen's inequality
E[X_t^2|F_s] \ge \big(E[X_t|F_s]\big)^2 = X_s^2
is not necessarily an identity. For instance, B_t^2 is not a martingale, with B_t the Brownian motion process.
1.15.10 It follows from the identity
since E[U ] = 0.
1.15.12 Let Fn = σ(Xk ; k ≤ n). Using the independence
1.15.13 (a) Since the random variable Y = θX is normally distributed with mean θµ and
variance θ2 σ 2 , then
E[e^{\theta X}] = e^{\theta\mu + \frac{1}{2}\theta^2\sigma^2}.
Hence E[e^{\theta X}] = 1 iff \theta\mu + \frac{1}{2}\theta^2\sigma^2 = 0, which has the nonzero solution \theta = -2\mu/\sigma^2.
(b) Since eθXi are independent, integrable and satisfy E[eθXi ] = 1, by Exercise 1.15.12 we
get that the product Zn = eθSn = eθX1 · · · eθXn is a martingale.
Chapter 2
2.1.4 Bt starts at 0 and is continuous in t. By Proposition 2.1.2 Bt is a martingale with
E[Bt2 ] = t < ∞. Since Bt − Bs ∼ N (0, |t − s|), then E[(Bt − Bs )2 ] = |t − s|.
2.1.10 It is obvious that Xt = Wt+t0 − Wt0 satisfies X0 = 0 and that Xt is continuous in t. The
increments are normally distributed: X_t - X_s = W_{t+t_0} - W_{s+t_0} \sim N(0, |t-s|). If 0 < t_1 < \cdots < t_n,
then 0 < t0 < t1 + t0 < · · · < tn + t0 . The increments Xtk+1 − Xtk = Wtk+1 +t0 − Wtk +t0 are
obviously independent and stationary.
2.1.11 For any \lambda > 0, show that the process X_t = \frac{1}{\sqrt{\lambda}}W_{\lambda t} is a Brownian motion. This says that the Brownian motion is invariant by scaling. Let s < t. Then
X_t - X_s = \frac{1}{\sqrt{\lambda}}(W_{\lambda t} - W_{\lambda s}) \sim \frac{1}{\sqrt{\lambda}}N\big(0, \lambda(t - s)\big) = N(0, t - s).
The other properties are obvious.
(d) Corr(W_t^2, W_s^2) = \frac{2s^2}{2ts} = \frac{s}{t}, where we used
(d) Since
Y_t - Y_s = (t - s)(W_{1/t} - W_0) - s(W_{1/s} - W_{1/t})
E[Y_t - Y_s] = (t - s)E[W_{1/t}] - sE[W_{1/s} - W_{1/t}] = 0,
and
Var(Y_t - Y_s) = E[(Y_t - Y_s)^2] = (t - s)^2\,\frac{1}{t} + s^2\Big(\frac{1}{s} - \frac{1}{t}\Big) = \frac{(t - s)^2 + s(t - s)}{t} = t - s.
E[(W_t - W_s)^3|F_s] = E[W_t^3|F_s] - 3W_s E[W_t^2|F_s] + 3W_s^2 E[W_t|F_s] - W_s^3 = E[W_t^3|F_s] - 3(t - s)W_s - W_s^3,
so
E[W_t^3|F_s] = 3(t - s)W_s + W_s^3,
since
E[(W_t - W_s)^3|F_s] = E[(W_t - W_s)^3] = E[W_{t-s}^3] = 0.
2.2.4 (a)
E[e^{cW_t}|F_s] = E[e^{c(W_t - W_s)}e^{cW_s}|F_s] = e^{cW_s}E[e^{c(W_t - W_s)}|F_s] = e^{cW_s}E[e^{c(W_t - W_s)}] = e^{cW_s}e^{\frac{1}{2}c^2(t-s)} = Y_s\, e^{\frac{1}{2}c^2 t}.
Multiplying by e^{-\frac{1}{2}c^2 t} yields the desired result.
2.2.5 (a) Using Exercise 2.2.3 we have
E[e^{2W_t^2}] = \int e^{2x^2}\phi_t(x)\, dx = \frac{1}{\sqrt{2\pi t}}\int e^{2x^2}e^{-\frac{x^2}{2t}}\, dx = \frac{1}{\sqrt{2\pi t}}\int e^{-\frac{1-4t}{2t}x^2}\, dx = \frac{1}{\sqrt{1 - 4t}},
if 1 - 4t > 0. Otherwise, the integral is infinite. We used the standard integral \int e^{-ax^2}\, dx = \sqrt{\pi/a}, a > 0.
2.3.4 It follows from the fact that Zt is normally distributed.
2.3.5 Using the definition of covariance we have
Cov(Z_s, Z_t) = E[Z_s Z_t] - E[Z_s]E[Z_t] = E[Z_s Z_t] = E\Big[\int_0^s W_u\, du \cdot \int_0^t W_v\, dv\Big] = E\Big[\int_0^s\int_0^t W_u W_v\, du\, dv\Big]
= \int_0^s\int_0^t E[W_u W_v]\, du\, dv = \int_0^s\int_0^t \min\{u, v\}\, du\, dv = s^2\Big(\frac{t}{2} - \frac{s}{6}\Big), \qquad s < t.
Cov(Z_t, W_t) = \frac{1}{h}\, Cov(Z_t, Z_t - Z_{t-h}) = \frac{1}{h}\Big(\frac{1}{2}t^2 h + o(h)\Big) = \frac{1}{2}t^2.
2.3.7 Let s < t. Since W_t has independent increments, taking the expectation in
e^{W_s + W_t} = e^{W_t - W_s}e^{2(W_s - W_0)}
we obtain
E[e^{W_s + W_t}] = E[e^{W_t - W_s}]E[e^{2(W_s - W_0)}] = e^{\frac{t-s}{2}}e^{2s} = e^{\frac{t+s}{2}}e^{s} = e^{\frac{t+s}{2}}e^{\min\{s, t\}}.
2.3.8 (a) E[X_t] = \int_0^t E[e^{W_s}]\, ds = \int_0^t e^{s/2}\, ds = 2(e^{t/2} - 1).
(b) Since Var(X_t) = E[X_t^2] - E[X_t]^2, it suffices to compute E[X_t^2]. Using Exercise 2.3.7 we have
E[X_t^2] = E\Big[\int_0^t e^{W_s}\, ds \cdot \int_0^t e^{W_u}\, du\Big] = E\Big[\int_0^t\int_0^t e^{W_s}e^{W_u}\, ds\, du\Big] = \int_0^t\int_0^t E[e^{W_s + W_u}]\, ds\, du = \int_0^t\int_0^t e^{\frac{u+s}{2}}e^{\min\{s, u\}}\, ds\, du
= \iint_{D_1} e^{\frac{u+s}{2}}e^{s}\, du\, ds + \iint_{D_2} e^{\frac{u+s}{2}}e^{u}\, du\, ds = 2\iint_{D_2} e^{\frac{u+s}{2}}e^{u}\, du\, ds = \frac{4}{3}\Big(\frac{1}{2}e^{2t} - 2e^{t/2} + \frac{3}{2}\Big),
where D_1 = \{0 \le s < u \le t\} and D_2 = \{0 \le u < s \le t\}. In the last identity we applied Fubini's theorem. For the variance use the formula Var(X_t) = E[X_t^2] - E[X_t]^2.
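Both moments of X_t = \int_0^t e^{W_s}\,ds lend themselves to a quick Monte Carlo check. The sketch below (illustrative t, step count and sample size; a left-point rule introduces a small bias) compares the sample first and second moments with the formulas above.

```python
import numpy as np

# Sketch: Monte Carlo check of E[X_t] and E[X_t^2] for X_t = int_0^t e^{W_s} ds.
t, n_steps, n_paths = 1.0, 400, 50_000
dt = t / n_steps
rng = np.random.default_rng(21)

W = np.zeros(n_paths)
X = np.zeros(n_paths)
for _ in range(n_steps):
    X += np.exp(W) * dt                  # left-point rule for int_0^t e^{W_s} ds
    W += rng.normal(0.0, np.sqrt(dt), size=n_paths)

print(X.mean(),      2 * (np.exp(t / 2) - 1))
print((X**2).mean(), (4 / 3) * (np.exp(2 * t) / 2 - 2 * np.exp(t / 2) + 1.5))
```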
2.3.9 (a) Splitting the integral at t and taking out the predictable part, we have
E[Z_T|F_t] = E\Big[\int_0^T W_u\, du\,\Big|\,F_t\Big] = E\Big[\int_0^t W_u\, du\,\Big|\,F_t\Big] + E\Big[\int_t^T W_u\, du\,\Big|\,F_t\Big]
= Z_t + E\Big[\int_t^T (W_u - W_t + W_t)\, du\,\Big|\,F_t\Big]
= Z_t + E\Big[\int_t^T (W_u - W_t)\, du\,\Big|\,F_t\Big] + W_t(T - t)
= Z_t + E\Big[\int_t^T (W_u - W_t)\, du\Big] + W_t(T - t)
= Z_t + \int_t^T E[W_u - W_t]\, du + W_t(T - t)
= Z_t + W_t(T - t),
since E[Wu − Wt ] = 0.
(b) Let 0 < t < T . Using (a) we have
E[ZT − T WT |Ft ] = E[ZT |Ft ] − T E[WT |Ft ]
= Zt + Wt (T − t) − T Wt
= Zt − tWt .
2.4.1
E[V_T|F_t] = E\Big[e^{\int_0^t W_u\, du + \int_t^T W_u\, du}\,\Big|\,F_t\Big] = e^{\int_0^t W_u\, du}\, E\Big[e^{\int_t^T W_u\, du}\,\Big|\,F_t\Big]
= e^{\int_0^t W_u\, du}\, E\Big[e^{\int_t^T (W_u - W_t)\, du + (T-t)W_t}\,\Big|\,F_t\Big]
= V_t\, e^{(T-t)W_t}\, E\Big[e^{\int_t^T (W_u - W_t)\, du}\,\Big|\,F_t\Big]
= V_t\, e^{(T-t)W_t}\, E\Big[e^{\int_t^T (W_u - W_t)\, du}\Big]
= V_t\, e^{(T-t)W_t}\, E\Big[e^{\int_0^{T-t} W_\tau\, d\tau}\Big]
= V_t\, e^{(T-t)W_t}\, e^{\frac{1}{2}\frac{(T-t)^3}{3}}.
2.6.1
F(x) = P(Y_t \le x) = P(\mu t + W_t \le x) = P(W_t \le x - \mu t) = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{x - \mu t} e^{-\frac{u^2}{2t}}\, du;
f(x) = F'(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(x - \mu t)^2}{2t}}.
2.7.2 Since
P(R_t \le \rho) = \int_0^\rho \frac{x}{t}\, e^{-\frac{x^2}{2t}}\, dx,
use the inequality
1 - \frac{x^2}{2t} < e^{-\frac{x^2}{2t}} < 1
to get the desired result.
2.7.3
E[R_t] = \int_0^\infty x\, p_t(x)\, dx = \int_0^\infty \frac{x^2}{t}\, e^{-\frac{x^2}{2t}}\, dx = \frac{1}{2t}\int_0^\infty y^{1/2} e^{-\frac{y}{2t}}\, dy = \sqrt{2t}\int_0^\infty z^{\frac{3}{2}-1} e^{-z}\, dz = \sqrt{2t}\,\Gamma\Big(\frac{3}{2}\Big) = \frac{\sqrt{2\pi t}}{2}.
Since E[R_t^2] = E[W_1(t)^2 + W_2(t)^2] = 2t, then
Var(R_t) = 2t - \frac{2\pi t}{4} = 2t\Big(1 - \frac{\pi}{4}\Big).
2.7.4
E[X_t] = \frac{E[R_t]}{t} = \frac{\sqrt{2\pi t}}{2t} = \sqrt{\frac{\pi}{2t}} \to 0, \qquad t \to \infty;
Var(X_t) = \frac{1}{t^2}Var(R_t) = \frac{2}{t}\Big(1 - \frac{\pi}{4}\Big) \to 0, \qquad t \to \infty.
N_t^2 = N_t(N_t - N_s) + N_t N_s = (N_t - N_s)^2 + N_s(N_t - N_s) + (N_t - \lambda t)N_s + \lambda t N_s,
then
E[N_t^2|F_s] = E[(N_t - N_s)^2|F_s] + N_s E[N_t - N_s|F_s] + E[N_t - \lambda t|F_s]N_s + \lambda t N_s
= E[(N_t - N_s)^2] + N_s E[N_t - N_s] + (N_s - \lambda s)N_s + \lambda t N_s
= \lambda(t - s) + \lambda^2(t - s)^2 + \lambda(t - s)N_s + N_s^2 - \lambda s N_s + \lambda t N_s
= \lambda(t - s) + \lambda^2(t - s)^2 + 2\lambda(t - s)N_s + N_s^2
= \lambda(t - s) + [N_s + \lambda(t - s)]^2.
Hence E[N_t^2|F_s] \ne N_s^2, and hence the process N_t^2 is not an F_t-martingale.
2.8.7 (a)
m_{N_t}(x) = E[e^{xN_t}] = \sum_{k\ge 0} e^{xk}\, P(N_t = k) = \sum_{k\ge 0} e^{xk}\, e^{-\lambda t}\frac{\lambda^k t^k}{k!} = e^{-\lambda t}\, e^{\lambda t e^x} = e^{\lambda t(e^x - 1)}.
(b) E[Nt2 ] = m′′Nt (0) = λ2 t2 + λt. Similarly for the other relations.
2.8.8 E[Xt ] = E[eNt ] = mNt (1) = eλt(e−1) .
2.8.9 (a) Since exMt = ex(Nt −λt) = e−λtx exNt , the moment generating function is
Chapter 3
3.1.2 We have
\{\omega;\ \tau(\omega) \le t\} = \begin{cases}\Omega, & \text{if } c \le t\\ \emptyset, & \text{if } c > t\end{cases}
and use that \emptyset, \Omega \in F_t.
3.1.3 First we note that
\{\omega;\ \tau(\omega) < t\} = \bigcup_{0 < s < t}\{\omega;\ |W_s(\omega)| > K\}. \qquad (9.2.16)
This can be shown by double inclusion. Let A_s = \{\omega;\ |W_s(\omega)| > K\}.
“\subset” Let \omega \in \{\omega;\ \tau(\omega) < t\}, so \inf\{s > 0;\ |W_s(\omega)| > K\} < t. Then there exists u with \tau(\omega) < u < t such that |W_u(\omega)| > K, and hence \omega \in A_u.
“\supset” Let \omega \in \bigcup_{0 < s < t}\{\omega;\ |W_s(\omega)| > K\}. Then there is 0 < s < t such that |W_s(\omega)| > K. This implies \tau(\omega) < s, and since s < t it follows that \tau(\omega) < t.
Since W_t is continuous, (9.2.16) can also be written as
\{\omega;\ \tau(\omega) < t\} = \bigcup_{0 < r < t,\ r\in\mathbb{Q}}\{\omega;\ |W_r(\omega)| > K\},
which implies \{\omega;\ \tau(\omega) < t\} \in F_t since \{\omega;\ |W_r(\omega)| > K\} \in F_t, for 0 < r < t. Next we shall show that P(\tau < \infty) = 1.
P(\{\omega;\ \tau(\omega) < \infty\}) = P\Big(\bigcup_{0 < s}\{\omega;\ |W_s(\omega)| > K\}\Big) > P(\{\omega;\ |W_s(\omega)| > K\}) = 1 - \int_{|y| < K}\frac{1}{\sqrt{2\pi s}}\, e^{-\frac{y^2}{2s}}\, dy > 1 - \frac{2K}{\sqrt{2\pi s}} \to 1, \qquad s \to \infty.
since \{\omega;\ X_r \notin K_m\} = \{\omega;\ X_r \in \overline{K_m}\} \in F_r \subset F_t.
3.1.8 (a) We have \{\omega;\ c\tau \le t\} = \{\omega;\ \tau \le t/c\} \in F_{t/c} \subset F_t. And P(c\tau < \infty) = P(\tau < \infty) = 1.
(b) \{\omega;\ f(\tau) \le t\} = \{\omega;\ \tau \le f^{-1}(t)\} \in F_{f^{-1}(t)} \subset F_t, since f^{-1}(t) \le t. If f is bounded, then it is obvious that P(f(\tau) < \infty) = 1. If \lim_{t\to\infty} f(t) = \infty, then P(f(\tau) < \infty) = P(\tau < f^{-1}(\infty)) = P(\tau < \infty) = 1.
(c) Apply (b) with f(x) = e^x.
3.1.10 If we let G(n) = \{x;\ |x - a| < \frac{1}{n}\}, then \{a\} = \bigcap_{n\ge 1} G(n). Then \tau_n = \inf\{t \ge 0;\ W_t \in G(n)\} are stopping times. Since \sup_n \tau_n = \tau, then \tau is a stopping time.
3.2.3 The relation is proved by verifying two cases:
(i) If ω ∈ {ω; τ > t} then (τ ∧ t)(ω) = t and the relation becomes
(ii) If ω ∈ {ω; τ ≤ t} then (τ ∧ t)(ω) = τ (ω) and the relation is equivalent with the obvious
relation
Mτ = Mτ .
3.2.5 Taking the expectation in E[Mτ |Fσ ] = Mσ yields E[Mτ ] = E[Mσ ], and then make σ = 0.
3.3.9 Since M_t = W_t^2 - t is a martingale with E[M_t] = 0, by the Optional Stopping Theorem we get E[M_{\tau_a}] = E[M_0] = 0, so E[W_{\tau_a}^2 - \tau_a] = 0, from where E[\tau_a] = E[W_{\tau_a}^2] = a^2, since W_{\tau_a} = a.
3.3.10 (a)
(b) The density function is p(a) = F'(a) = \frac{2}{\sqrt{2\pi t}}\, e^{-a^2/(2t)}, a > 0. Then
E[X_t] = \int_0^\infty x\, p(x)\, dx = \int_0^\infty \frac{1}{\sqrt{2\pi t}}\, e^{-y^2/(2t)}\, dy = \frac{1}{2}.
P(X_t \text{ goes up to } \alpha) = P(X_t \text{ goes up to } \alpha \text{ before down to } -\infty) = \lim_{\beta\to\infty}\frac{e^{2\mu\beta} - 1}{e^{2\mu\beta} - e^{-2\mu\alpha}} = 1.
3.5.3
P(X_t \text{ never hits } -\beta) = P(X_t \text{ goes up to } \infty \text{ before down to } -\beta) = \lim_{\alpha\to\infty}\frac{e^{2\mu\beta} - 1}{e^{2\mu\beta} - e^{-2\mu\alpha}}.
3.5.4 (a) Use that E[X_T] = \alpha p_\alpha - \beta(1 - p_\alpha); (b) E[X_T^2] = \alpha^2 p_\alpha + \beta^2(1 - p_\alpha), with p_\alpha = \frac{e^{2\mu\beta} - 1}{e^{2\mu\beta} - e^{-2\mu\alpha}}; (c) Use Var(T) = E[T^2] - E[T]^2.
3.5.7 Since Mt = Wt2 − t is a martingale, with E[Mt ] = 0, by the Optional Stopping Theorem
we get E[WT2 − T ] = 0. Using WT = XT − µT yields
Then
E[T^2] = \frac{E[T](1 + 2\mu E[X_T]) - E[X_T^2]}{\mu^2}.
Substitute E[XT ] and E[XT2 ] from Exercise 3.5.4 and E[T ] from Proposition 3.5.5.
3.6.10 See the proof of Proposition 3.6.4.
3.6.12 (b) Applying the Optional Stopping Theorem
E\big[e^{cM_T - \lambda T(e^c - c - 1)}\big] = E[X_0] = 1
E\big[e^{ca - \lambda T f(c)}\big] = 1
E\big[e^{-\lambda T f(c)}\big] = e^{-ac}.
so E[T] = \infty.
(d) The inverse Laplace transform \mathcal{L}^{-1}\big(e^{-a\varphi(s)}\big) cannot be represented by elementary functions.
3.8.8 Use that if Wt is a Brownian motion then also tW1/t is a Brownian motion.
3.8.11 Use E[(|Xt | − 0)2 ] = E[|Xt |2 ] = E[(Xt − 0)2 ].
3.8.15 Let a_n = \ln b_n. Then G_n = e^{\ln G_n} = e^{\frac{a_1 + \cdots + a_n}{n}} \to e^{\ln L} = L.
3.8.20 (a) L = 1. (b) A computation shows
Chapter 4
4.4.2 X \sim N\big(0, \int_1^T \frac{1}{t}\, dt\big) = N(0, \ln T).
4.4.3 Y \sim N\big(0, \int_1^T t\, dt\big) = N\big(0, \frac{1}{2}(T^2 - 1)\big).
4.4.4 Normally distributed with zero mean and variance \int_0^t e^{2(t-s)}\, ds = \frac{1}{2}(e^{2t} - 1).
4.4.5 Using the property of Wiener integrals, both integrals have zero mean and variance \frac{7t^3}{3}.
4.4.6 The mean is zero and the variance is t/3 → 0 as t → 0.
4.4.7 Since it is a Wiener integral, X_t is normally distributed with zero mean and variance
\int_0^t \Big(a + \frac{bu}{t}\Big)^2\, du = \Big(a^2 + \frac{b^2}{3} + ab\Big)t.
Hence a^2 + \frac{b^2}{3} + ab = 1.
4.4.8 Since both W_t and \int_0^t f(s)\, dW_s have the mean equal to zero,
Cov\Big(W_t, \int_0^t f(s)\, dW_s\Big) = E\Big[W_t\int_0^t f(s)\, dW_s\Big] = E\Big[\int_0^t dW_s\int_0^t f(s)\, dW_s\Big] = E\Big[\int_0^t f(s)\, ds\Big] = \int_0^t f(s)\, ds.
4.6.1
E\Big[\int_0^T e^{ks}\, dN_s\Big] = \frac{\lambda}{k}(e^{kT} - 1)
Var\Big(\int_0^T e^{ks}\, dN_s\Big) = \frac{\lambda}{2k}(e^{2kT} - 1).
Chapter 5
5.2.1 Let X_t = \int_0^t e^{W_u}\, du. Then
dG_t = d\Big(\frac{X_t}{t}\Big) = \frac{t\, dX_t - X_t\, dt}{t^2} = \frac{te^{W_t}\, dt - X_t\, dt}{t^2} = \frac{1}{t}\big(e^{W_t} - G_t\big)\, dt.
5.3.3
(a) e^{W_t}\big(1 + \frac{1}{2}W_t\big)\, dt + e^{W_t}(1 + W_t)\, dW_t;
(b) (6W_t + 10e^{5W_t})\, dW_t + (3 + 25e^{5W_t})\, dt;
(c) 2e^{t + W_t^2}(1 + W_t^2)\, dt + 2e^{t + W_t^2}W_t\, dW_t;
(d) n(t + W_t)^{n-2}\big(t + W_t + \frac{n-1}{2}\big)\, dt + n(t + W_t)^{n-1}\, dW_t;
(e) \frac{1}{t}\Big(W_t - \frac{1}{t}\int_0^t W_u\, du\Big)\, dt;
(f) \frac{1}{t}\Big(e^{\alpha W_t} - \frac{1}{t}\int_0^t e^{\alpha W_u}\, du\Big)\, dt.
5.3.4
d(tWt2 ) = td(Wt2 ) + Wt2 dt = t(2Wt dWt + dt) + Wt2 dt = (t + Wt2 )dt + 2tWt dWt .
and obtain
E[M_t^2|F_s] = M_s^2 + 2E\Big[\int_s^t M_{u-}\, dM_u\,\Big|\,F_s\Big] + E[N_t|F_s] - N_s = M_s^2 + E[M_t + \lambda t|F_s] - N_s = M_s^2 + M_s + \lambda t - N_s = M_s^2 + \lambda(t - s).
dR_t = \frac{\partial f}{\partial x}\, dW_1(t) + \frac{\partial f}{\partial y}\, dW_2(t) + \frac{1}{2}\Delta f\, dt = \frac{W_1(t)}{R_t}\, dW_1(t) + \frac{W_2(t)}{R_t}\, dW_2(t) + \frac{1}{2R_t}\, dt.
Chapter 6
Differentiating yields the ODE \varphi'(T) = \frac{1}{2}\varphi(T), with \varphi(0) = 1. Solving yields \varphi(T) = e^{T/2}.
6.2.3 (a) Apply integration by parts with g(x) = (x - 1)e^x to get
\int_0^T W_t e^{W_t}\, dW_t = \int_0^T g'(W_t)\, dW_t = g(W_T) - g(0) - \frac{1}{2}\int_0^T g''(W_t)\, dt.
E[W_T e^{W_T}] = E[e^{W_T}] - 1 + \frac{1}{2}\int_0^T\big(E[e^{W_t}] + E[W_t e^{W_t}]\big)\, dt = e^{T/2} - 1 + \frac{1}{2}\int_0^T\big(e^{t/2} + E[W_t e^{W_t}]\big)\, dt.
Then \varphi(T) = E[W_T e^{W_T}] satisfies the ODE \varphi'(T) - \frac{1}{2}\varphi(T) = e^{T/2} with \varphi(0) = 0.
6.2.4 (a) Use integration by parts with g(x) = ln(1 + x2 ).
(e) Since ln(1 + T ) ≤ T , the upper bound obtained in (e) is better than the one in (d), without
contradicting it.
6.3.1 By straightforward computation.
6.3.2 By computation.
6.3.13 (a) \frac{e^2}{\sqrt{2}}\sin(\sqrt{2}\,W_1); (b) \frac{1}{2}e^{2T}\sin(2W_3); (c) \frac{1}{\sqrt{2}}\big(e^{\sqrt{2}\,W_4 - 4} - 1\big).
Chapter 7
7.2.1 Integrating yields X_t = X_0 + \int_0^t (2X_s + e^{2s})\, ds + \int_0^t b\, dW_s. Taking the expectation we get
E[X_t] = X_0 + \int_0^t (2E[X_s] + e^{2s})\, ds.
Differentiating we obtain f'(t) = 2f(t) + e^{2t}, where f(t) = E[X_t], with f(0) = X_0. Multiplying by the integrating factor e^{-2t} yields (e^{-2t}f(t))' = 1. Integrating yields f(t) = e^{2t}(t + X_0).
7.2.4 (a) Using the product rule and Ito's formula, we get
d(W_t^2 e^{W_t}) = e^{W_t}\big(1 + 2W_t + \frac{1}{2}W_t^2\big)\, dt + e^{W_t}(2W_t + W_t^2)\, dW_t.
Integrating and taking expectations yields
E[W_t^2 e^{W_t}] = \int_0^t \big(E[e^{W_s}] + 2E[W_s e^{W_s}] + \frac{1}{2}E[W_s^2 e^{W_s}]\big)\, ds.
Since E[e^{W_s}] = e^{s/2} and E[W_s e^{W_s}] = se^{s/2}, if we let f(t) = E[W_t^2 e^{W_t}], we get by differentiation
f'(t) = e^{t/2} + 2te^{t/2} + \frac{1}{2}f(t), \qquad f(0) = 0.
Multiplying by the integrating factor e^{-t/2} yields (f(t)e^{-t/2})' = 1 + 2t. Integrating yields the solution f(t) = t(1 + t)e^{t/2}. (b) Similar method.
7.2.5 (a) Using Exercise 2.1.18, let f(t) = E[\cos(\sigma W_t)]. Then f'(t) = -\frac{\sigma^2}{2}f(t), f(0) = 1. The solution is f(t) = e^{-\frac{\sigma^2}{2}t}.
(c) Since \sin(t + \sigma W_t) = \sin t\cos(\sigma W_t) + \cos t\sin(\sigma W_t), taking the expectation and using (a) and (b) yields
E[\sin(t + \sigma W_t)] = \sin t\, E[\cos(\sigma W_t)] = e^{-\frac{\sigma^2}{2}t}\sin t.
(d) Similarly, starting from \cos(t + \sigma W_t) = \cos t\cos(\sigma W_t) - \sin t\sin(\sigma W_t).
7.2.7 From Exercise 7.2.6 (b) we have E[\cos(W_t)] = e^{-t/2}. From the definition of expectation
E[\cos(W_t)] = \int_{-\infty}^{\infty} \cos x\,\frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}}\, dx.
E[\tau_k] \le \frac{1}{n}\big(R^2 - |a|^2\big),
and take k \to \infty.
8.3.6 x0 and x2−n .
8.4.1 (a) We have a(t, x) = x, c(t, x) = x, \varphi(s) = xe^{s-t} and u(t, x) = \int_t^T xe^{s-t}\, ds = x(e^{T-t} - 1).
(b) a(t, x) = tx, c(t, x) = -\ln x, \varphi(s) = xe^{(s^2 - t^2)/2} and u(t, x) = -\int_t^T \ln\big(xe^{(s^2 - t^2)/2}\big)\, ds = -(T - t)\big[\ln x + \frac{T}{6}(T + t) - \frac{t^2}{3}\big].
3
8.4.2 (a) u(t, x) = x(T −t)+ 12 (T −t)2 . (b) u(t, x) = 32 ex e 2 (T −t) −1 . (c) We have a(t, x) = µx,
b(t, x) = σx, c(t, x) = x. The associated diffusion is dXs = µXs ds + σXs dWs , Xt = x, which
is the geometric Brownian motion
1 2
Xs = xe(µ− 2 σ )(s−t)+σ(Ws −Wt )
, s ≥ t.
The solution is
hZ T
1 2
i
u(t, x) = E xe(µ− 2 σ )(s−t)+σ(Ws −Wt )
ds
t
Z T
1
= x e(µ− 2 (s−t))(s−t) E[eσ(s−t) ] ds
t
Z T
1 2
= x e(µ− 2 (s−t))(s−t) eσ (s−t)/2
ds
t
Z T
x µ(T −t)
= x eµ(s−t) ds = e −1 .
t µ
Chapter 9
9.1.7 Apply Example 9.1.6 with u = 1.
9.1.8 X_t = \int_0^t h(s)\, dW_s \sim N\big(0, \int_0^t h^2(s)\, ds\big). Then e^{X_t} is log-normal with E[e^{X_t}] = e^{\frac{1}{2}Var(X_t)} = e^{\frac{1}{2}\int_0^t h(s)^2\, ds}.
9.1.9 (a) Using Exercise 9.1.8 we have
E[M_t] = E\big[e^{-\int_0^t u(s)\, dW_s - \frac{1}{2}\int_0^t u(s)^2\, ds}\big] = e^{-\frac{1}{2}\int_0^t u(s)^2\, ds}\, E\big[e^{-\int_0^t u(s)\, dW_s}\big] = e^{-\frac{1}{2}\int_0^t u(s)^2\, ds}\, e^{\frac{1}{2}\int_0^t u(s)^2\, ds} = 1.
Integrating yields
e^{t/2}\cos W_t = 1 - \int_0^t e^{s/2}\sin W_s\, dW_s,