This report discusses the convergence of Markov chains to ordinary differential equations (ODEs) and introduces diffusion processes in various applications. It presents key theorems and examples demonstrating how stochastic models can approximate deterministic systems, particularly in biological and chemical contexts. The document also outlines the conditions under which these processes converge and the properties of diffusion processes, including their infinitesimal parameters.

MA 699 Report: Diffusion Process and Application

Yu Sun 9074473373

December 22, 2016

1 Markov Chains Converging to ODEs


In this chapter we introduce some properties of jump Markov processes used to approximate the solution of an ODE system. We start with the simplest case,

    dM/dt = λM.

This simple model actually describes many processes in biology, epidemic theory, physics, chemistry, etc. At the same time, most of these processes are also described by Markov chain models; that is, for a probability space (Ω, F, P) with filtration (F_t, t ∈ T) and a measurable space (S, S), for any s < t, s, t ∈ T, and all A ∈ S,

    P(X_t ∈ A | F_s) = P(X_t ∈ A | X_s).

Informally, this means the future depends only on the current state. For example, if we choose the Markov process X(t) to be a branching process, we have the following lemma.
Lemma. There exists a sequence {X_n(t)} of branching processes with

• E(X_n(t)) = X_n(0) e^{λt};

• lim_{n→∞} X_n(0)/n = M(0);

• lim_{n→∞} P{sup_{s≤t} |X_n(s)/n − M(s)| > ε} = 0 for all ε > 0,

where M(s) = M(0) e^{λs} is the solution of the ODE with initial value M(0).

Based on this lemma, in the 1970s Kurtz gave an extension to pure jump Markov processes [3].
We use the same notation as the paper. Let {X_n(t)} be a sequence of pure jump Markov processes with state spaces (E_n, B_n), where E_n ∈ B^K, B^K denotes the Borel σ-algebra of R^K, and B_n = {E_n ∩ B : B ∈ B^K}. Suppose the X_n(t) are all right continuous with

• exponential waiting time distribution λ_n(x);

• exit distribution µ_n(x, Γ), Γ ∈ B_n;

• P{X_n(τ_x^n) ∈ Γ | X_n(0) = x} = µ_n(x, Γ), B_n-measurable, where τ_x^n is the first exit time of X_n from x;

• λ_n(x) = 1/E[τ_x^n | X_n(0) = x], B_n-measurable.

Assume λ_n(x) is bounded on bounded subsets of E_n and define

    F_n(x) = λ_n(x) ∫_{E_n} (z − x) µ_n(x, dz),    F_n : E_n → R^K.

The most important result of that paper is the following.

Theorem. Suppose there exist E ⊂ R^K and a function F : E → R^K satisfying

• Lipschitz condition: |F(x) − F(y)| ≤ L|x − y| for all x, y ∈ E;

• uniform approximation condition: lim_{n→∞} sup_{x∈E∩E_n} |F_n(x) − F(x)| = 0.

Then lim_{n→∞} X_n(0) = x_0 implies

    lim_{n→∞} P{sup_{s≤t} |X_n(s) − X(s, x_0)| > ε} = 0,    ∀ε > 0, ∀t > 0,

where X(s, x_0) solves the ODE dX/ds = F(X) with X(0) = x_0.

The proof of this result is based on a similar result for ODEs, and we skip the details here. We now have a theorem stating that, under certain conditions, a sequence of pure jump Markov processes X_n(t) converges to the solution X(t) of a first-order differential equation; the precise sense of this convergence is discussed in detail in [4]. This theorem gives a solid theoretical justification for deterministic models: they are as good as the corresponding stochastic models, provided the population is sufficiently large. We will discuss sufficient conditions for a stochastic model to converge to an ODE in detail later; here we just give some simple examples for better understanding.
We start with Markov processes making many small jumps at a fast rate [5]. We define the drift as the product of the average jump and the total rate. In more general cases the process may have several types of jump; the drift is then the sum over jump types of the jump size times its rate. If we take N large enough, then N quantifies both the size of the jumps and the jump rate. (Remark: this scaling arises from the fluid limit, or law of large numbers: for sufficiently large N, the jumps have size of order 1/N and rate of order N. This is quite different from the diffusive, or central limit, scaling, in which jumps have size of order 1/√N and rate of order N. Further, in the classical central limit the Wiener process, i.e. the Gaussian diffusive limit, describes the first-order fluctuations of the process around its fluid limit.)

Example.1. Consider a Poisson process X_t of rate λN and let Y_t = X_t/N. Then Y_t makes jumps of size 1/N at rate λN, so the drift is λ and the ODE is

    dy_t = λ dt.

Hence Y_t will be close to the solution y_t = λt + y_0 with initial condition Y_0 = y_0.
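This fluid limit is easy to check numerically. Below is a minimal sketch (the parameter values and function name are our own choices, not from the text): we sample the endpoint Y_{t_max} = X_{t_max}/N for a Poisson process of rate λN and watch the spread around λ·t_max shrink as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t_max, n_paths = 2.0, 1.0, 500

def endpoint_spread(N):
    """Sample Y_{t_max} = X_{t_max}/N over n_paths runs of a Poisson
    process X of rate lam*N; return mean and std of the endpoint."""
    jumps = rng.poisson(lam * N * t_max, size=n_paths)  # total jump counts up to t_max
    y = jumps / N                                       # Y_0 = 0, each jump has size 1/N
    return y.mean(), y.std()

# Fluid limit: Y_t -> lam * t = 2.0; fluctuations shrink like 1/sqrt(N).
for N in (10, 100, 1000):
    m, s = endpoint_spread(N)
    print(f"N={N:5d}  mean={m:.3f}  std={s:.3f}")
```

The mean stays near λ·t_max = 2 for every N, while the standard deviation decays like 1/√N, exactly the fluid-limit picture from the remark above.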

Example.2. Consider the reversible chemical reaction

    k1 A + k2 B ⇌ k3 C    (forward rate µ1, backward rate µ2),

in which k1 molecules of A and k2 molecules of B become k3 molecules of C at rate µ1, and back at rate µ2. If A_t, B_t, C_t denote the numbers of molecules at time t, and

    X_t = (X_t^1, X_t^2, X_t^3) = (A_t, B_t, C_t)/N,

then X_t is a vector Markov chain with two kinds of jumps:

• (1/N)(−k1, −k2, k3) at rate µ1 (N X_t^1)^{k1} (N X_t^2)^{k2};

• (1/N)(k1, k2, −k3) at rate µ2 (N X_t^3)^{k3}.

Then the solution x_t of the ODE system

    dx_t^1 = [λ1 ν2 (x_t^3)^{k3} − λ1 ν1 (x_t^1)^{k1} (x_t^2)^{k2}] dt
    dx_t^2 = [λ2 ν2 (x_t^3)^{k3} − λ2 ν1 (x_t^1)^{k1} (x_t^2)^{k2}] dt
    dx_t^3 = [−λ3 ν2 (x_t^3)^{k3} + λ3 ν1 (x_t^1)^{k1} (x_t^2)^{k2}] dt

with λ_i = k_i/N, i = 1, 2, 3, ν1 = µ1 N^{k1+k2}, ν2 = µ2 N^{k3}, will be close to X_t if x_0 = X_0.
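To see the fluid limit concretely, here is a minimal Gillespie (stochastic simulation) sketch of the special case k1 = k2 = k3 = 1, i.e. A + B ⇌ C. The rate constants are our own choices, scaled so the forward rate is N x¹x² and the backward rate is N x³ (the order-N rate, 1/N jump picture of the remark above). At equilibrium of the limiting ODE, x³ solves (1 − x³)² = x³, i.e. x³ = (3 − √5)/2 ≈ 0.382.

```python
import numpy as np

rng = np.random.default_rng(1)

def gillespie_abc(N, t_max):
    """Gillespie simulation of A + B <-> C, starting from A = B = N, C = 0.
    Forward rate (1/N)*A*B = N*x1*x2, backward rate 1*C = N*x3."""
    A, B, C = N, N, 0
    t = 0.0
    while True:
        r_fwd = A * B / N          # A + B -> C
        r_bwd = float(C)           # C -> A + B
        total = r_fwd + r_bwd
        if total == 0.0:
            break
        t += rng.exponential(1.0 / total)   # exponential waiting time
        if t > t_max:
            break
        if rng.random() < r_fwd / total:
            A, B, C = A - 1, B - 1, C + 1
        else:
            A, B, C = A + 1, B + 1, C - 1
    return A / N, B / N, C / N

x1, x2, x3 = gillespie_abc(1000, 10.0)
print(x3, (3 - np.sqrt(5)) / 2)   # simulated vs ODE equilibrium, both near 0.382
```

For N = 1000 the simulated concentration sits within O(1/√N) of the deterministic equilibrium.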

Example.3. Consider a continuous case, a branching process: each individual in a population lives for an exponentially distributed time of mean 1 and is replaced by a random number ξ of offspring with mean µ. Let X_t denote the population size at time t; then Y_t = X_t/N is a Markov chain. The jumps are of size (k − 1)/N at rate N Y_t P(ξ = k), so the drift is

    Σ_k (k − 1) P(ξ = k) y = (µ − 1) y,

and the ODE is

    dy_t = (µ − 1) y_t dt.
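A quick sanity check of this example, with a toy offspring law of our own choosing (ξ = 0 or 2 with P(ξ = 2) = 0.6, so µ = 1.2): averaging Y_t over paths should track y_t = y_0 e^{(µ−1)t}.

```python
import numpy as np

rng = np.random.default_rng(2)

def branching_endpoint(N, t_max, p2=0.6):
    """One path: each individual dies at rate 1 and leaves xi offspring,
    P(xi=0) = 1 - p2, P(xi=2) = p2, so mu = 2*p2. Returns Y_{t_max} = X/N."""
    X, t = N, 0.0                           # Y_0 = 1
    while X > 0:
        t += rng.exponential(1.0 / X)       # total death rate is X
        if t > t_max:
            break
        X += 1 if rng.random() < p2 else -1 # xi=2: net +1; xi=0: net -1
    return X / N

mu, t_max = 1.2, 2.0
avg = np.mean([branching_endpoint(200, t_max) for _ in range(100)])
print(avg, np.exp((mu - 1) * t_max))   # path average vs ODE value e^{0.4} ≈ 1.49
```

Individual paths fluctuate at order 1/√N, but their average follows the exponential ODE solution.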

So far we have given first-order approximations to Markov chains; in [4] Kurtz discusses how large "sufficiently large" is, using martingale arguments to bound the probability in the theorem above. More importantly, we want an approximation more precise than first order; in the next two sections we introduce a theorem on Markov chains converging to an SDE.

2 Introduction to Diffusion Processes

In many physical, biological, economic and social phenomena, diffusion processes are used as approximations or as reasonable models. In fact, diffusion processes have many good properties, and the most famous and important one is weak convergence, connecting the SDE to a deterministic PDE. We start with the introduction of diffusion processes [1].

Definition.1. A continuous-time-parameter stochastic process which possesses the (strong) Markov property and whose sample paths X(t) are a.e. continuous functions of t is called a diffusion process.

We introduce the idea of regularity of a diffusion process. Consider a diffusion process whose state space is I = (r, l) (or [r, l), [r, l], (r, l]), where we may have r = −∞ or l = ∞. The process is said to be regular if:

• starting from any interior point of I, the process reaches any other interior point of I with positive probability.

WLOG, the diffusion processes discussed below are all regular.

Properties.1. Denote the increment of X(t) over length h by ∆_h X(t) = X(t + h) − X(t). A diffusion process satisfies the following conditions:

• for every ε > 0 and all x ∈ I,

    lim_{h→0} (1/h) P(|X(t + h) − X(t)| > ε | X(t) = x) = 0;

• lim_{h→0} (1/h) E(∆_h X(t) | X(t) = x) = µ(x, t);

• lim_{h→0} (1/h) E([∆_h X(t)]² | X(t) = x) = σ²(x, t).

We call µ(x, t) and σ²(x, t) the infinitesimal parameters of the process: µ(x, t) is the infinitesimal mean or drift parameter; σ²(x, t) is the infinitesimal variance or diffusion parameter. Generally, µ(x, t) and σ²(x, t) are continuous functions of x and t.
An interesting question is how to determine whether a given stochastic process is a diffusion process; that is, we want sufficient conditions for a process to be a diffusion. We first start with the notion of a standard process.

Definition.2. A strong Markov process {X(t), t ≥ 0} is called a standard process if the sample paths satisfy the three regularity properties:

• X(t) is right continuous: for all s ≥ 0, lim_{t↓s} X(t) = X(s);

• left limits of X(t) exist: for all s > 0, lim_{t↑s} X(t) exists;

• X(t) is continuous from the left through Markov times: if T_1 ≤ T_2 ≤ · · · are Markov times converging to T ≤ ∞, then on {T < ∞},

    lim_{n→∞} X(T_n) = X(T)  a.e.

A sufficient condition for a standard process to be a diffusion is fulfillment of the Dynkin condition: for every ε > 0,

    (1/h) P(|X(t + h) − X(t)| > ε | X(t) = x) → 0  as h → 0,

uniformly on every compact subinterval of I.

Theorem.1. If a standard process satisfies Dynkin's condition, then it is a diffusion process.

Proof. The proof is quite long, so we skip it; all the details can be found in [1].

3 Convergence to Diffusion

In this section we denote by X^{(N)} = {X_k^{(N)}} and X^{(N)} = {X_t^{(N)}} the discrete-time and continuous-time cases of the stochastic process. We first study the discrete case [2].

Let X^{(N)} = {X_k^{(N)}} have state space confined to a closed bounded interval of the real line, adapted to an increasing sequence of σ-fields {F_k^{(N)}}; that is, X_k^{(N)} is measurable with respect to F_k^{(N)}. We do not require the Markov property in the discrete case.

We write ∆X_n^{(N)} = X_{n+1}^{(N)} − X_n^{(N)} and describe ∆X_n^{(N)} through its conditional moments, which we assume satisfy

• E[∆X_n^{(N)} | F_n^{(N)}] = h_N µ(X_n^{(N)}) + ε_{1,n}^{(N)};

• E[(∆X_n^{(N)})² | F_n^{(N)}] = h_N σ²(X_n^{(N)}) + ε_{2,n}^{(N)};

• E[(∆X_n^{(N)})⁴ | F_n^{(N)}] = ε_{4,n}^{(N)};

• for i = 1, 2, 4, with [z] denoting the integer part of z,

    Σ_{n<[t/h_N]} E|ε_{i,n}^{(N)}| → 0.

Remark. Usually, when F_n^{(N)} = σ(X_n^{(N)}), we can replace the conditional expectation by

    E[· | F_n^{(N)}] = E[· | X_n^{(N)}].
Now consider the continuous case: let

    X^{(N)}(t) = X^{(N)}_{[t/h_N]},

a continuous-time step process. We then have the most important theorem of this chapter.

Theorem.2. If {X^{(N)}(t)} satisfies the tightness condition or the truncated moment condition, then

    X^{(N)}(t) ⇒ X(t)  for all t,

where the weak convergence ⇒ means convergence in distribution to X(t), a process with infinitesimal drift µ and variance σ².

Proof. The details are all in [2], Sections (7.1) and (8.2); the tightness condition is (A) and the truncated moment condition is (B) on p. 297. Briefly, we separate into discrete and continuous cases. For both cases we can show tightness by showing that

    X_t^i − ∫_0^t b_i(X_s) ds,    X_t^i X_t^j − ∫_0^t a_{ij}(X_s) ds

are local martingales. For the continuous case we can alternatively show that

    sup_{x∈K} (d/dt) P(X_t^h ∈ R^d | X_0^h = x) < ∞

for any compact K in R^d.

Example.1. Generations are in discrete time t = 0, 1, 2, 3, . . . There are two types of players: C (cooperators) and D (defectors). Fix an integer N and suppose at the beginning (generation 0) there are N/2 players of each type. To go to the next generation:

• randomly pick two players and let them play according to the game matrix; new players are generated according to this step: in this game, 2 C and 5 D are generated;

• pick 7 players uniformly at random among the N players of generation 0 and replace them by the 2 C and 5 D, so there are again N players in total in generation 1.

Show that C_t^{(N)}, D_t^{(N)} converge to a system of ODEs.

Proof. Since C and D are conserved, dC_t = −dD_t, so for simplicity we consider only D_t. If D_n = i/N, then the transition probabilities are

    p_{i, i+5−j} = C(i, j) C(N−i, 7−j) / C(N, 7),    j = 0, . . . , 7,

where C(n, m) denotes the binomial coefficient "n choose m". For the drift term,

    E[∆D_n | D_n = i/N]
      = Σ_{j=0}^{7} ((i + 5 − j)/N) C(i, j) C(N−i, 7−j) / C(N, 7) − i/N
      = (1/C(N, 7)) [ ((i + 5)/N) Σ_{j=0}^{7} C(i, j) C(N−i, 7−j) − (1/N) Σ_{j=0}^{7} j C(i, j) C(N−i, 7−j) ] − i/N
      = (1/C(N, 7)) [ ((i + 5)/N) C(N, 7) − (i/N) C(N−1, 6) ] − i/N
      = (1/N) (5 − 7i/N),

using Σ_j C(i, j) C(N−i, 7−j) = C(N, 7), Σ_j j C(i, j) C(N−i, 7−j) = i Σ_j C(i−1, j−1) C((N−1)−(i−1), 6−(j−1)) = i C(N−1, 6), and C(N−1, 6)/C(N, 7) = 7/N. Hence µ_D(x) = 5 − 7x. For the diffusion term, note that ∆D_n = (5 − j)/N with j hypergeometric, so

    E[(∆D_n)² | D_n = i/N] = E[(5 − j)² | D_n = i/N] / N² = O(1/N²) = o(1/N),

since (5 − j)² is bounded; hence σ_D²(x) = 0. Therefore

    dD_t = (5 − 7D_t) dt,

and with D_0 = C_0 = 1/2,

    D_t = 5/7 − (3/14) e^{−7t},    C_t = 2/7 + (3/14) e^{−7t}.

Example.2. (A more general case of evolutionary game theory.) We still consider a population of fixed size N with types C and D, with fractions p = #C/N and q = #D/N. They have the symmetric game matrix

              C         D
    C         α       (β, γ)
    D       (β, γ)      δ

Per generation:

• pick (C, C), with probability #C(#C − 1)/(N(N − 1)) ≈ p², and generate 2α new C;

• pick (C, D), with probability 2#C#D/(N(N − 1)) ≈ 2pq, and generate β new C and γ new D;

• pick (D, D), with probability #D(#D − 1)/(N(N − 1)) ≈ q², and generate 2δ new D.

Then, considering the increase of #C,

    X_CC ∼ B(2α, q),    −X_DD ∼ B(2δ, q),    X_CD ∼ B(β + γ, q) − γ,

with moments

              E[·]               E[·²]
    X_CC      2αq               2αqp + (2αq)²
    X_DD     −2δq               2δqp + (2δq)²
    X_CD     (β + γ)q − γ       (β + γ)(qp − 2γq) + (β + γ)²q² + γ²

Then, by a similar calculation as before,

    E[∆C/N] = (1/N) f(C/N);
    E[(∆C/N)²] = (1/N²) g(C/N);
    E[(∆C/N)⁴] = O(1/N⁴),

with

    f(x) = 2αx²(1 − x) − 2δ(1 − x)³ + 2(β + γ)x(1 − x)² − 2γx(1 − x);

    g(x) = 2αx³(1 − x) + 4α²x²(1 − x)² + 2δx(1 − x)³ + 4δ²(1 − x)⁴
           + 2x(1 − x){(β + γ)[x(1 − x) − 2γ(1 − x)] + (β + γ)²(1 − x)² + γ²}.

Therefore

    dC_t = f(C_t) dt + √((1/N) g(C_t)) dB_t;
    dD_t = −f(C_t) dt − √((1/N) g(C_t)) dB_t.

Example.3. (Coevolutionary dynamics [6].)

The goal of this model is to use a finite population model to derive quantitative properties of the infinite population model.

First consider a finite population of two types A and B; the fitness (payoff matrix) is given by

           A    B
    A      a    b
    B      c    d

We assume the population size N is finite and constant; the balance between selection and drift can be described by a Moran process:
Definition.3. The transition matrix of the Moran process is tri-diagonal, with transition probabilities

    P_{i,i−1} = ((N − i)/N)(i/N)
    P_{i,i}   = 1 − P_{i,i−1} − P_{i,i+1}
    P_{i,i+1} = (i/N)((N − i)/N)
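This neutral Moran process is a two-line simulation; a classical consequence is that the fixation probability of A starting from i0 copies is exactly i0/N, which the sketch below (our own test harness, not from the text) estimates by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(4)

def moran_step(i, N):
    """One neutral Moran update for i = #A: a uniformly chosen individual
    reproduces and its offspring replaces a uniformly chosen individual."""
    return i + int(rng.random() < i / N) - int(rng.random() < i / N)

def fixation_prob_A(i0, N, trials=1000):
    """Monte Carlo estimate of P(A fixates); exactly i0/N in the neutral case."""
    fixed = 0
    for _ in range(trials):
        i = i0
        while 0 < i < N:           # run until absorption at 0 or N
            i = moran_step(i, N)
        fixed += (i == N)
    return fixed / trials

p_hat = fixation_prob_A(5, 20)
print(p_hat)   # close to 5/20 = 0.25
```

The i0/N law follows because i is a martingale under the neutral dynamics; fitness-biased selection, introduced next, breaks this symmetry.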

The microscopic dynamics follows three steps:

• selection: an individual is randomly selected for reproduction according to its fitness;

• reproduction: the selected individual produces one offspring;

• replacement: the offspring replaces a randomly selected individual in the population.

The fitness of an individual is composed of two components:

• a frequency-independent baseline fitness, associated with genetic predisposition;

• a frequency-dependent contribution, associated with interactions with other members of the population.

If i is the number of A individuals, the average payoff of each type is given by

    π_i^A = [a(i − 1) + b(N − i)]/(N − 1)
    π_i^B = [c i + d(N − i − 1)]/(N − 1)

and the effective reproductive fitness of type κ is given by

    p = 1 − ω + ω π_i^κ,

where ω ∈ [0, 1] determines the relative contribution of the baseline fitness: the bigger ω is, the stronger the frequency-dependent fitness. Combining this with the Moran process, we get the global information transition probabilities

    T_g^−(i) = [(1 − ω + ω π_i^B)/(1 − ω + ω⟨π_i⟩)] ((N − i)/N)(i/N)
    T_g^0(i) = 1 − T_g^−(i) − T_g^+(i)
    T_g^+(i) = [(1 − ω + ω π_i^A)/(1 − ω + ω⟨π_i⟩)] (i/N)((N − i)/N)

with ⟨π_i⟩ = [i π_i^A + (N − i) π_i^B]/N the population average payoff.


But looking at the selection step, we realize it requires global information (the population average payoff), which can be undesirable in many situations. We therefore derive another version of the fitness that only requires local information:

    p = 1/2 + (ω/2)(π_i^a − π_i^b)/∆π_max,

where a and b denote two randomly chosen individuals and ∆π_max denotes the maximum possible payoff difference between them. Combining this with the Moran process, we get the local information transition probabilities

    T_l^−(i) = (1/2 + (ω/2)(π_i^B − π_i^A)/∆π_max) ((N − i)/N)(i/N)
    T_l^0(i) = 1 − T_l^−(i) − T_l^+(i)
    T_l^+(i) = (1/2 + (ω/2)(π_i^A − π_i^B)/∆π_max) (i/N)((N − i)/N)

Let P^τ(i) denote the probability that the system is in state i at time τ, and use the notation

    x = i/N;    t = τ/N;    ρ(x, t) = N P^τ(i),

so ρ(x, t) is a probability density. From

    P^{τ+1}(i) − P^τ(i) = P^τ(i − 1) T_ξ^+(i − 1) + P^τ(i) T_ξ^0(i) + P^τ(i + 1) T_ξ^−(i + 1) − P^τ(i)
                        = P^τ(i − 1) T_ξ^+(i − 1) − P^τ(i) T_ξ^−(i) + P^τ(i + 1) T_ξ^−(i + 1) − P^τ(i) T_ξ^+(i),

we get

    ρ(x, t + 1/N) − ρ(x, t) = ρ(x − 1/N, t) T_ξ^+(x − 1/N) + ρ(x + 1/N, t) T_ξ^−(x + 1/N)
                              − ρ(x, t) T_ξ^+(x) − ρ(x, t) T_ξ^−(x).
For N ≫ 1 we Taylor expand: the left side is ρ(x, t + 1/N) − ρ(x, t) = (1/N) ρ_t + o(1/N), while each term on the right expands as

    ρ(x ∓ 1/N, t) T^±(x ∓ 1/N) = ρT^± ∓ (1/N) ∂_x(ρT^±) + (1/(2N²)) ∂_x²(ρT^±) + o(1/N²),

so that

    ρ(x, t + 1/N) − ρ(x, t) = −(1/N) ∂_x[(T^+ − T^−)ρ] + (1/(2N²)) ∂_x²[(T^+ + T^−)ρ] + o(1/N²).

That implies

    ∂ρ(x, t)/∂t ≈ −(∂/∂x)[a(x)ρ(x, t)] + (1/2)(∂²/∂x²)[b²(x)ρ(x, t)]

with a(x) = T_ξ^+(x) − T_ξ^−(x) and b(x) = √((1/N)[T_ξ^+(x) + T_ξ^−(x)]). Since this has exactly the form of a Fokker–Planck equation, ρ(x, t) is the probability density of a random variable X_t, an Itô process driven by a standard Wiener process W_t and satisfying the SDE

    dX_t = a(X_t) dt + b(X_t) dW_t.
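An SDE of this form is easy to integrate by Euler–Maruyama. In the sketch below the T^± are hypothetical toy rates (not the Moran ones), chosen so the drift a = T^+ − T^− is logistic and the noise b = √((T^+ + T^−)/N) is small for large N.

```python
import numpy as np

rng = np.random.default_rng(5)

def euler_maruyama(a, b, x0, t_max, dt):
    """Integrate dX = a(X)dt + b(X)dW on [0, t_max] with step dt."""
    n = int(t_max / dt)
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment
        x[k + 1] = x[k] + a(x[k]) * dt + b(x[k]) * dW
    return x

N = 1000
Tp = lambda x: 0.6 * x * (1 - x)                  # hypothetical T+
Tm = lambda x: 0.4 * x * (1 - x)                  # hypothetical T-
a = lambda x: Tp(x) - Tm(x)                       # drift 0.2*x*(1-x)
b = lambda x: np.sqrt(max(Tp(x) + Tm(x), 0.0) / N)
path = euler_maruyama(a, b, 0.2, 20.0, 0.01)
print(path[-1])   # near the logistic ODE value 1/(1 + 4*exp(-4)) ≈ 0.93
```

As N grows the diffusion term vanishes and the path collapses onto the deterministic drift flow, which is exactly the large-N reduction used next.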

Since N is sufficiently large and |T_ξ^+(x)|, |T_ξ^−(x)| ≤ 1, we have b(X_t) ≈ 0. For N → ∞,

    Global:  dX_t = X_t [π^A(X_t) − ⟨π(X_t)⟩] / (Γ + ⟨π(X_t)⟩) dt
    Local:   dX_t = Υ X_t [π^A(X_t) − ⟨π(X_t)⟩] dt

with

    π^A(X_t) = aX_t + b(1 − X_t),
    π^B(X_t) = cX_t + d(1 − X_t),
    ⟨π(X_t)⟩ = π^A(X_t) X_t + π^B(X_t)(1 − X_t),
    Γ = (1 − ω)/ω,    Υ = ω/∆π_max.

Personal comment on this local information model: the model is reasonable, but the local information assumption does not seem to make full sense. If we compute T_l^0(i) = 1 − T_l^+(i) − T_l^−(i) = 1 − (i/N)((N − i)/N), it is actually independent of the baseline fitness, which does not make sense at all. For example, if i is the number of A individuals in the population, then when the baseline fitness of B increases, T_g^+(i) and T_g^0(i) should decrease and T_g^−(i) should increase.

Example.4. (Coevolutionary dynamics in finite populations [7], [8].)

Continuing Example 3, it is natural to extend this model to a multivariate Moran process. We assume there are d types of individuals in the population, and the payoff matrix is P = (p_jk)_{d×d}.

Another creative idea is to add mutation to the process, so now the microscopic dynamics consists of four steps:

• Selection: an individual is randomly selected for reproduction according to its fitness;

• Reproduction: the selected individual produces one offspring;

• Replacement: the offspring replaces a randomly selected individual in the population;

• Mutation: the offspring randomly mutates to another type (including itself, with positive probability).

We assume a mutation matrix M = (m_jk)_{d×d}; the special case of vanishing mutation is M = I_{d×d}. The average payoff of type j with population (i_1, . . . , i_d) is

    π_j(i_1, . . . , i_d) = (Σ_{k=1}^d p_jk i_k)/(N − 1).

As previously, the fitness factor is

    (1 − ω + ω π_j(i_1, . . . , i_d)) / (1 − ω + ω ⟨π(i_1, . . . , i_d)⟩),    ⟨π(i_1, . . . , i_d)⟩ = Σ_{k=1}^d π_k(i_1, . . . , i_d) i_k/N.

For simplicity we can always choose ω = 1 and rescale the payoff matrix by a constant. Consider the transition probability that a type-k individual is replaced by a type-j one (k ≠ j), denoted T_kj(i_1, . . . , i_d). This happens in two cases:

• generate a type j and replace a type k, without mutation;

• generate a type l (l ≠ j), which mutates to type j and replaces a type k.

Summing over the reproducing type and combining with the fitness (ω = 1), we get

    T_kj(i_1, . . . , i_d) = Σ_{l=1}^d [π_l(i_1, . . . , i_d)/⟨π(i_1, . . . , i_d)⟩] (i_l/N)(i_k/N) m_lj
                          = (i_k/(N² ⟨π(i_1, . . . , i_d)⟩)) Σ_{l=1}^d i_l π_l(i_1, . . . , i_d) m_lj.
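This formula is easy to instantiate numerically; the payoff and mutation matrices below are made up, with ω = 1 as in the text. Note that summing over j gives Σ_j T_kj = i_k/N, so the total over all (k, j) pairs is exactly 1, consistent with the second personal comment at the end of this section.

```python
import numpy as np

rng = np.random.default_rng(6)

d, N = 3, 60
P = rng.uniform(0.5, 1.5, size=(d, d))      # made-up payoff matrix (p_jk)
M = rng.dirichlet(np.ones(d), size=d)       # made-up mutation matrix, rows sum to 1
i = np.array([20, 25, 15])                  # population counts, summing to N

pi = P @ i / (N - 1)                        # pi_j = sum_k p_jk * i_k / (N - 1)
avg_pi = i @ pi / N                         # <pi>

# T[k, j] = i_k / (N^2 * <pi>) * sum_l i_l * pi_l * m_lj   (omega = 1)
T = np.outer(i, (i * pi) @ M) / (N ** 2 * avg_pi)

print(T.sum())          # total over all (k, j) pairs is 1
print(T.sum(axis=1))    # row k sums to i_k / N = x_k
```

The identity Σ_j T_kj = i_k/N follows because each row of M sums to 1 and Σ_l i_l π_l = N⟨π⟩.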

Using the notation P^τ(i_1, . . . , i_d) for the probability of state (i_1, . . . , i_d) at time τ, we have

    P^{τ+1}(i_1, . . . , i_d) = Σ_{j≠k} P^τ(i_1, . . . , i_j − 1, . . . , i_k + 1, . . . , i_d) T_kj(i_1, . . . , i_j − 1, . . . , i_k + 1, . . . , i_d)
                               + P^τ(i_1, . . . , i_d) [1 − Σ_{k≠j} T_kj(i_1, . . . , i_d)],

which implies

    ρ(x; t + 1/N) − ρ(x; t) = Σ_{j≠k} ρ(j^−, k^+; t) T_kj(j^−, k^+) − ρ(x; t) Σ_{j≠k} T_kj(x),

with the similar notation t = τ/N, x_l = i_l/N, ρ(x; t) = N P^τ(i_1, . . . , i_d), and (j^−, k^+) = (i_1, . . . , i_j − 1, . . . , i_k + 1, . . . , i_d). Then, again using Taylor expansion with N ≫ 1, we get a multivariable Fokker–Planck equation system

    ∂ρ(x)/∂t = −Σ_{k=1}^{d−1} (∂/∂x_k)[ρ(x) a_k(x)] + (1/2) Σ_{j,k=1}^{d−1} (∂²/(∂x_k ∂x_j))[ρ(x) b_jk(x)]

with

    a_k(x) = Σ_{j≠k} [T_jk(x) − T_kj(x)],
    b_jk(x) = (1/N) [ −T_jk(x) − T_kj(x) + δ_jk Σ_{l=1}^d (T_jl(x) + T_lj(x)) ].

Similarly to Example 3, we could also compute the Fokker–Planck equation from the local information transition probabilities, but since we already argued that this idea is technically questionable, we skip this part.

A very interesting next step is to look at the stationary distribution and equilibrium of the Fokker–Planck equation, since the analytic solution is very hard to obtain. Writing

    ρ_t(x) = −∇ · J(x)

with the d − 1 components of the probability current given by

    J_k(x) = ρ(x) a_k(x) − (1/2) Σ_{j=1}^{d−1} (∂/∂x_j)[ρ(x) b_jk(x)],

denote the equilibrium density by ρ*(x); at equilibrium J_k(x) = 0 for all k, so

    Σ_{j=1}^{d−1} b_jk(x) (∂/∂x_j) ρ*(x) = [ 2a_k(x) − Σ_{j=1}^{d−1} (∂/∂x_j) b_jk(x) ] ρ*(x),    ∀k.

If det(b_jk(x)) ≠ 0, this gives the PDE system

    ∇ ln ρ*(x) = [b^T(x)]^{−1} (2a(x) − ∇ · b(x)) =: Γ(x).

If Γ(x) is a gradient field, the solution exists and is independent of the path of integration; with initial data at x_0,

    ρ*(x) = ρ*(x_0) exp( ∫_{x_0}^x Γ(y) · dy ).

But because of the complexity of Γ(x), we cannot easily get an analytic solution.
Another good idea, from Traulsen's 2012 paper [8], is to use the eigenvalue decomposition of the matrix. Denote the drift vector by (A_k(x))_{1×(d−1)} and the diffusion matrix by (B_jk(x))_{(d−1)×(d−1)}. He uses the fact that Σ_{j=1}^d T_kj(x) = 1, so that

    A_k(x) = Σ_{j=1}^d [T_jk(x) − T_kj(x)] = −1 + Σ_{j=1}^d T_jk(x).

Consider the matrix B(x):

    B_jk(x) = −(1/N)[T_jk(x) + T_kj(x)],    j ≠ k;
    B_jj(x) = (1/N) Σ_{l≠j} [T_jl(x) + T_lj(x)].

It is easy to observe that B(x) is symmetric, and by weak diagonal dominance it is non-negative definite. By Itô calculus, the solution of the Fokker–Planck equation system corresponds to a Langevin equation, which can be written as the diffusion process

    dx_k = A_k(x) dt + C_k(x) · dB_t,

where C^T(x) · C(x) = B(x), C_k(x) denotes the k-th row of C(x), and B_t is a (d − 1)-dimensional vector of uncorrelated Brownian motions with

    ⟨B_t^i B_{t'}^j⟩ = δ_ij · δ(t − t').

Since B(x) is non-negative definite, there exists an orthogonal matrix U(x) such that

    U(x)^T · U(x) = 1,    B(x) = U(x) · Λ(x) · U(x)^T,

which gives C(x) = U(x) · √Λ(x) · U(x)^T. This yields a quite good diffusion representation of ρ(x); but Λ(x) = diag{λ_1(x), . . . , λ_{d−1}(x)} depends on x, so the eigenvalues must be recomputed at every step, and this computational cost is too high for us.
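The decomposition step can be sketched with numpy (the T matrix here is random, just to produce a B of the stated form): build B, eigendecompose it, and form C = U√ΛU^T; since this C is symmetric, C^T C = C² = B.

```python
import numpy as np

rng = np.random.default_rng(7)

# B of the stated form: off-diagonal -(T_jk + T_kj)/N, diagonal the
# matching positive sums; symmetric and weakly diagonally dominant.
d, N = 4, 100
T = rng.uniform(0.0, 0.2, size=(d, d))      # made-up transition probabilities
S = (T + T.T) / N
np.fill_diagonal(S, 0.0)
B = np.diag(S.sum(axis=1)) - S

# Eigendecompose B = U Lambda U^T, then take C = U sqrt(Lambda) U^T.
w, U = np.linalg.eigh(B)
w = np.clip(w, 0.0, None)                   # guard tiny negative round-off
C = U @ np.diag(np.sqrt(w)) @ U.T
print(np.allclose(C.T @ C, B))              # True: C^T C recovers B
```

The expensive part, as noted above, is that in the Langevin simulation this eigendecomposition must be redone at every new x.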

Personal comments:

• For this general case, and for Example 3 as the simple two-type case, we must face the fact that Σ_{j=1}^d T_kj(x) ≠ 1 in general; if we only define transition probabilities for k ≠ j, it could still happen that Σ_{j≠k} T_kj(x) > 1 when certain types have large fitness and dominate the population. Since T_jj(x) is always positive, we have to check that the transition probabilities are well defined.

• Traulsen made a small mistake: Σ_{j=1}^d T_kj(x) does not always equal 1; rather, Σ_{j,k=1}^d T_kj(x) = 1 at all times.

• For the Fokker–Planck equation system, we could try to use PDE theory to get the analytic solution; in Evans's PDE book and in Yoshida's heat kernel expansion one can find iterative geometric approaches to the analytic solution, but the cost is expensive and we need to find a way to minimize it; for instance, finding an easy way to compute the x-dependent eigenvalues is future work.

4 References

[1] Samuel Karlin and Howard M. Taylor. A Second Course in Stochastic Processes. Academic Press, New York, 1981.

[2] Richard Durrett. Stochastic Calculus: A Practical Introduction. Probability and Stochastics Series. CRC Press, Boca Raton, FL, 1996.

[3] T. G. Kurtz. Solutions of ordinary differential equations as limits of pure jump Markov processes. Journal of Applied Probability, 1970, 7(1): 49-58.

[4] T. G. Kurtz. Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. Journal of Applied Probability, 1971, 8(2): 344-356.

[5] S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. Wiley, New York, 1986.

[6] A. Traulsen, J. C. Claussen and C. Hauert. Coevolutionary dynamics: from finite to infinite populations. Physical Review Letters, 2005, 95(23): 238701.

[7] A. Traulsen, J. C. Claussen and C. Hauert. Coevolutionary dynamics in large, but finite populations. Physical Review E, 2006, 74(1): 011901.

[8] A. Traulsen, J. C. Claussen and C. Hauert. Stochastic differential equations for evolutionary dynamics with demographic noise and mutations. Physical Review E, 2012, 85(4): 041901.
