Stochastic Processes and Time Series Markov Chains - II: 1 Conditional Probability Results
Module 2
Markov Chains - II
Dr. Alok Goswami, Professor, Indian Statistical Institute, Kolkata
3 Markov chains: Some Notations
The quantity \sum_{z∈S} p_{xz} p_{zy} obtained above is denoted by p^{(2)}_{xy} and is called the "2-step transition probability" (from state x to state y).
In fact, what we have just proved implies:
P(X_{n+2} = y | X_0 = x_0, ..., X_{n-1} = x_{n-1}, X_n = x) = P(X_{n+2} = y | X_n = x) = P(X_2 = y | X_0 = x) = p^{(2)}_{xy}
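As a quick sanity check, the 2-step formula can be computed directly by summing over the intermediate state. The 3-state chain below is hypothetical, with transition probabilities chosen only for illustration:

```python
# Sketch: verifying the 2-step transition probability formula
# p^(2)_{xy} = sum_{z in S} p_{xz} p_{zy} on a made-up 3-state chain.
S = [0, 1, 2]
p = {
    (0, 0): 0.5, (0, 1): 0.3, (0, 2): 0.2,
    (1, 0): 0.1, (1, 1): 0.6, (1, 2): 0.3,
    (2, 0): 0.4, (2, 1): 0.4, (2, 2): 0.2,
}

def p2(x, y):
    # sum over all intermediate states z of p_{xz} * p_{zy}
    return sum(p[(x, z)] * p[(z, y)] for z in S)

# p^(2)_{02} = 0.5*0.2 + 0.3*0.3 + 0.2*0.2 = 0.23
# Each row of the 2-step probabilities still sums to 1, as it must:
for x in S:
    assert abs(sum(p2(x, y) for y in S) - 1.0) < 1e-9
```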
4 Matrix Notations:
4.1 Transition Matrix:
Consider the S × S matrix: P = ((p_{xy}))_{x,y∈S}
P : transition matrix (or, the transition probability matrix).
Properties: (i) all entries non-negative, (ii) each row-sum = 1
p^{(2)}_{xy} = \sum_{z∈S} p_{xz} p_{zy} = (x, y)-th entry of P^2  =⇒  P^2 = ((p^{(2)}_{xy}))
In general, P^k = ((p^{(k)}_{xy})), for every k ≥ 1. (Write p^{(1)}_{xy} = p_{xy}.)
Holds for k = 0 also: p^{(0)}_{xy} = P(X_0 = y | X_0 = x) = δ_{xy}.
Clearly, in this notation,
p_{xy} = P_x(X_1 = y) = E_x I_{\{X_1 = y\}} and p^{(k)}_{xy} = P_x(X_k = y) = E_x I_{\{X_k = y\}}
Upshot: Given the past up to a time n and the present state x, the conditional future evolution of the chain from time n+1 behaves in exactly the same way as the evolution of the chain from time 1, conditioned on its starting at the state x; the corresponding conditional probabilities and expectations are all completely determined by the transition probabilities of the MC.
6 Initial Distribution
Conditional probabilities and expectations related to future evolution, given the history up to any time point, reduce to P_x and E_x, which are, in turn, determined by the transition probabilities.
To get unconditional probabilities and expectations, we need to specify the distribution of the initial state X_0 and use:
P(A) = \sum_{x∈S} P(A | X_0 = x) P(X_0 = x),   E(Z) = \sum_{x∈S} E(Z | X_0 = x) P(X_0 = x)
Denote the distribution of X_0 by µ: µ_x = P(X_0 = x), x ∈ S. Writing P_µ and E_µ respectively for the (unconditional) probabilities and expectations arising out of µ, the above relations reduce to:
P_µ(A) = \sum_{x∈S} µ_x P_x(A),   E_µ(Z) = \sum_{x∈S} µ_x E_x(Z)
Distribution of a MC is completely specified by Initial Distribution (µx , x ∈ S) and the
Transition Probabilities (pxy , x, y ∈ S)
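The mixture formula can be seen concretely for the event A = {X_1 = y}, where P_x(A) is just p_{xy}, so the sum over x is the row vector µP. The chain and initial distribution below are hypothetical, chosen only for illustration:

```python
# Sketch: the mixture formula P_mu(A) = sum_x mu_x P_x(A) for A = {X_1 = y},
# on a hypothetical 2-state chain (all numbers illustrative).
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mu = np.array([0.3, 0.7])     # initial distribution, mu_x = P(X_0 = x)

# P_x(X_1 = y) = p_{xy}, so mixing over x gives the row vector mu @ P:
P_mu_X1 = mu @ P              # unconditional distribution of X_1
assert np.isclose(P_mu_X1.sum(), 1.0)
```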
7 Distribution of the MC
Joint distribution of (X_1, ..., X_n):
P_µ(X_1 = x_1, ..., X_n = x_n) = \sum_{x∈S} µ_x p_{x x_1} ··· p_{x_{n-1} x_n}
Marginal distribution of X_n: P_µ(X_n = x) = \sum_{z∈S} µ_z p^{(n)}_{zx}
Denote the distribution of X_n as µ^{(n)} = {µ^{(n)}_x, x ∈ S}. The above formula says (in matrix notation): µ^{(n)} = µ · P^n, where µ and µ^{(n)} are both viewed as 1 × S (row) vectors. Thus, probabilities of all finite-dimensional events are explicitly given in terms of the initial distribution µ and the transition matrix P. But the really important questions for a MC centre around understanding its long-term characteristics, and these involve Infinite-Dimensional Events!!
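Both formulas above are easy to exercise numerically; the chain and initial distribution below are hypothetical, chosen only for illustration:

```python
# Sketch: distribution of X_n as mu @ P^n, and the joint-distribution formula,
# on a hypothetical 2-state chain (all numbers illustrative).
from itertools import product
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mu = np.array([0.3, 0.7])

def mu_n(n):
    # mu^(n) = mu . P^n, both viewed as row vectors
    return mu @ np.linalg.matrix_power(P, n)

def joint(path):
    # P_mu(X_1 = x_1, ..., X_n = x_n) = sum_x mu_x p_{x x_1} ... p_{x_{n-1} x_n}
    total = 0.0
    for x in range(len(mu)):
        prob = mu[x]
        prev = x
        for xi in path:
            prob *= P[prev, xi]
            prev = xi
        total += prob
    return total

# Summing the joint probability over all length-2 paths gives total mass 1,
# and summing out X_1 recovers the marginal of X_2:
assert np.isclose(sum(joint([a, b]) for a, b in product(range(2), repeat=2)), 1.0)
assert np.isclose(joint([0, 0]) + joint([1, 0]), mu_n(2)[0])
```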
What are these questions? How does one answer them?
Deduce:
µ^{(n)}_0 = β \sum_{k=0}^{n-1} (1 − α − β)^k + (1 − α − β)^n · µ_0.
Assume: α + β > 0. [Just rules out α = β = 0, so OK (why?)]
Get a simple formula for the distribution of X_n:
µ^{(n)}_0 = β/(α+β) − (1 − α − β)^n (β/(α+β) − µ_0),   µ^{(n)}_1 = 1 − µ^{(n)}_0
8.1 What are the main takeaways from the above analysis?
The chain has a unique limiting distribution. The limit depends only on α and β, but not on µ.
(The initial distribution often has little or no effect in the long run!) The limit distribution π = {π_0 = β/(α+β), π_1 = α/(α+β)} has the property:
if µ = π, then µ(n) = π for all n. (X0 ∼ π ⇒ Xn ∼ π ∀ n)
(First prove: πP = π, then conclude πP n = π for all n)
This property is expressed as: π is a stationary distribution.
π = (β/(α+β), α/(α+β)) is the only stationary distribution.
(It is the only probability on S = {0, 1} satisfying πP = π)
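Both stationarity and the insensitivity to the initial distribution are easy to confirm numerically; the values of α and β below are illustrative:

```python
# Sketch: pi = (beta/(alpha+beta), alpha/(alpha+beta)) satisfies pi P = pi,
# and mu P^n approaches pi for any initial mu (illustrative alpha, beta).
import numpy as np

alpha, beta = 0.3, 0.2
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])
pi = np.array([beta, alpha]) / (alpha + beta)

assert np.allclose(pi @ P, pi)    # stationarity: pi P = pi

# Every initial distribution is pulled to pi in the long run:
for mu0 in (0.0, 0.5, 1.0):
    mu = np.array([mu0, 1 - mu0])
    assert np.allclose(mu @ np.linalg.matrix_power(P, 200), pi)
```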
To sum up: The On-Off Chain has, irrespective of the initial distribution, a unique limiting
distribution, which is also a stationary distribution for the chain and is the only one.
• Distinct Advantage for On-Off chain: Simple explicit formula for µ(n) .
• For general chains, we will have to employ different ideas that do NOT depend on such explicit computations.