
6.262: Discrete Stochastic Processes        2/28/11

Lecture 8: Markov eigenvalues and eigenvectors


Outline:
Review of ergodic unichains
Review of basic linear algebra facts
Markov chains with 2 states
Distinct eigenvalues for M > 2 states
M states and M independent eigenvectors
The Jordan form

Recall that for an ergodic finite-state Markov chain, the transition probabilities reach a limit in the sense that lim_{n→∞} P_ij^n = π_j, where π = (π_1, ..., π_M) is a strictly positive probability vector.

Multiplying both sides by P_jk and summing over j,

π_k = lim_{n→∞} Σ_j P_ij^n P_jk = Σ_j π_j P_jk

Thus π is a steady-state vector for the Markov chain, i.e., π = π[P] and π ≥ 0.

In matrix terms, lim_{n→∞} [P^n] = e π, where e = (1, 1, ..., 1)^T is a column vector and π is a row vector.
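As a quick numerical illustration (a sketch added for this writeup, using an assumed 3-state example rather than anything from the lecture), numpy can confirm that [P^n] approaches e π:

import numpy as np

# An assumed 3-state ergodic transition matrix (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# For large n, every row of P^n approaches the same strictly
# positive probability vector pi.
Pn = np.linalg.matrix_power(P, 100)
print(Pn)

# pi satisfies the steady-state equation pi = pi P.
pi = Pn[0]
print(np.allclose(pi @ P, pi))  # True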

The same result almost holds for ergodic unichains, i.e., one ergodic class plus an arbitrary set of transient states.

The sole difference is that the steady-state vector is positive for all ergodic states and 0 for all transient states.

        [ [P_T]   [P_TR] ]
[P]  =  [                ]
        [ [0]     [P_R]  ]

where

        [ P_11 ... P_1t ]
[P_T] = [  :         :  ]
        [ P_t1 ... P_tt ]

is the matrix of transition probabilities among the t transient states.

The idea is that each transient state eventually has a transition (via [P_TR]) to a recurrent state, and the class of recurrent states leads to steady state as before.
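A sketch of this behavior (with an assumed unichain, not from the lecture): the transient column of [P^n] vanishes while the rows settle on the ergodic class.

import numpy as np

# State 0 is transient; states 1 and 2 form the ergodic class.
# In block form: [P_T] = [0.5], [P_TR] = [0.5, 0], [P_R] = lower-right 2x2.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.4, 0.6],
              [0.0, 0.7, 0.3]])

Pn = np.linalg.matrix_power(P, 50)
print(Pn)  # column 0 is ~0; all rows agree on states 1 and 2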

Review of basic linear algebra facts


Def: A complex number λ is an eigenvalue of a real square matrix [A], and a complex vector ν ≠ 0 is a right eigenvector of [A], if λν = [A]ν.

For every stochastic matrix (the transition matrix of a finite-state Markov chain [P]), we have Σ_j P_ij = 1 and thus [P]e = e.

Thus λ = 1 is an eigenvalue of an arbitrary stochastic matrix [P], with right eigenvector e.

An equivalent way to express the eigenvalue/eigenvector equation is that [P − λI]ν = 0, where I is the identity matrix.

Def: A square matrix [A] is singular if there is a vector ν ≠ 0 such that [A]ν = 0.

Thus λ is an eigenvalue of [P] if and only if [P − λI] is singular.

Let a_1, ..., a_M be the columns of [A]. Then [A] is singular iff a_1, ..., a_M are linearly dependent.

The square matrix [A] is singular iff the rows of [A] are linearly dependent, and iff the determinant det[A] of [A] is 0.

Summary: λ is an eigenvalue of [P] iff [P − λI] is singular, iff det[P − λI] = 0, iff [P]ν = λν for some ν ≠ 0, and iff π[P] = λπ for some π ≠ 0.
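These equivalences are easy to test numerically; the following sketch (an assumed 2-state example, added here) checks that det[P − λI] vanishes at each eigenvalue numpy reports.

import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])  # assumed example

lam = np.linalg.eigvals(P)
print(lam)  # contains 1.0, as guaranteed by [P]e = e

# [P - lambda I] is singular at each eigenvalue, so its determinant is ~0.
for lam_i in lam:
    print(np.linalg.det(P - lam_i * np.eye(2)))  # ~0 for each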

For every stochastic matrix [P], [P]e = e, and thus [P − I] is singular and there is a row vector π ≠ 0 such that π[P] = π.

This does not show that there is a probability vector π such that π[P] = π, but we already know there is such a probability vector (i.e., a steady-state vector) if [P] is the matrix of an ergodic unichain.

We show later that there is a steady-state vector π for all Markov chains.

The determinant of an M by M matrix [A] can be determined as

det[A] = Σ_σ ± Π_{i=1}^M A_{i,σ(i)}

where the sum is over all permutations σ of the integers 1, ..., M. Plus is used for each even permutation and minus for each odd permutation.

The important facet of this formula for us is that det[P − λI] must be a polynomial in λ of degree M. Thus there are M roots of the equation det[P − λI] = 0, and consequently M eigenvalues of [P].

Some of these M eigenvalues might be the same, and if k of these roots are equal to λ, the eigenvalue λ is said to have algebraic multiplicity k.
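The permutation expansion translates directly into code. The sketch below (illustrative only: it sums M! terms, so it is hopeless for large M) implements it and compares against numpy's determinant.

import numpy as np
from itertools import permutations

def det_by_permutations(A):
    # Sum over all permutations sigma of +/- prod_i A[i, sigma(i)],
    # with sign +1 for even permutations and -1 for odd ones.
    M = A.shape[0]
    total = 0.0
    for sigma in permutations(range(M)):
        inversions = sum(1 for i in range(M) for j in range(i + 1, M)
                         if sigma[i] > sigma[j])
        sign = -1.0 if inversions % 2 else 1.0
        total += sign * np.prod([A[i, sigma[i]] for i in range(M)])
    return total

A = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.1, 0.1, 0.8]])
print(det_by_permutations(A), np.linalg.det(A))  # the two values agree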

Markov chains with 2 states


left eigenvector:
π_1 P_11 + π_2 P_21 = λπ_1
π_1 P_12 + π_2 P_22 = λπ_2

right eigenvector:
P_11 ν_1 + P_12 ν_2 = λν_1
P_21 ν_1 + P_22 ν_2 = λν_2

det[P − λI] = (P_11 − λ)(P_22 − λ) − P_12 P_21

λ_1 = 1;    λ_2 = 1 − P_12 − P_21

If P_12 = P_21 = 0 (the chain has 2 recurrent classes), then λ = 1 has multiplicity 2. Otherwise λ = 1 has multiplicity 1.

If P_12 = P_21 = 1 (the chain is periodic), then λ_2 = −1. Otherwise |λ_2| < 1.
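A sketch checking the closed form λ_2 = 1 − P_12 − P_21 against numpy (the probabilities below are assumed values):

import numpy as np

P12, P21 = 0.3, 0.2  # assumed transition probabilities
P = np.array([[1 - P12, P12],
              [P21, 1 - P21]])

print(np.sort(np.linalg.eigvals(P).real))  # [0.5, 1.0]
print(1 - P12 - P21)                       # 0.5, matching lambda_2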

π_1 P_11 + π_2 P_21 = λπ_1        P_11 ν_1 + P_12 ν_2 = λν_1
π_1 P_12 + π_2 P_22 = λπ_2        P_21 ν_1 + P_22 ν_2 = λν_2

λ_1 = 1;    λ_2 = 1 − P_12 − P_21

Assume throughout that either P_12 > 0 or P_21 > 0. Then

π^(1) = (P_21/(P_12+P_21), P_12/(P_12+P_21)),    ν^(1) = (1, 1)^T

π^(2) = (1, −1),    ν^(2) = (P_12/(P_12+P_21), −P_21/(P_12+P_21))^T

Note that π^(i) ν^(j) = δ_ij. In general, if π^(i)[P] = λ_i π^(i) and [P]ν^(i) = λ_i ν^(i) for i = 1, ..., M, then π^(i) ν^(j) = 0 if λ_i ≠ λ_j. To see this,

λ_i π^(i) ν^(j) = π^(i)[P] ν^(j) = π^(i) (λ_j ν^(j)) = λ_j π^(i) ν^(j)

so if λ_i ≠ λ_j, then π^(i) ν^(j) = 0. Normalization (of either π^(i) or ν^(i)) can make π^(i) ν^(i) = 1 for each i.
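Numerically, the biorthogonality π^(i) ν^(j) = δ_ij shows up as [V][U] = [I] when the left eigenvectors are taken as the rows of [U^{-1}] (a sketch with an assumed example):

import numpy as np

P = np.array([[0.7, 0.3],
              [0.2, 0.8]])  # assumed example

lam, U = np.linalg.eig(P)  # columns of U: right eigenvectors nu^(i)
V = np.linalg.inv(U)       # rows of V: left eigenvectors pi^(i), with
                           # the normalization pi^(i) nu^(i) = 1 built in

print(np.allclose(V @ U, np.eye(2)))  # True: pi^(i) nu^(j) = delta_ij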

Note that the equations

P_11 ν_1^(i) + P_12 ν_2^(i) = λ_i ν_1^(i);    P_21 ν_1^(i) + P_22 ν_2^(i) = λ_i ν_2^(i)

can be rewritten in matrix form as

[P][U] = [U][Λ]

where

[U] = [ ν_1^(1)  ν_1^(2) ]    and    [Λ] = [ λ_1   0  ]
      [ ν_2^(1)  ν_2^(2) ]                 [  0   λ_2 ]

Since π^(i) ν^(j) = δ_ij, we see that

[ π_1^(1)  π_2^(1) ] [ ν_1^(1)  ν_1^(2) ]
[ π_1^(2)  π_2^(2) ] [ ν_2^(1)  ν_2^(2) ]  =  [I],

so [U] is invertible and [U^{-1}] has π^(1) and π^(2) as rows. Thus [P] = [U][Λ][U^{-1}] and

[P^2] = [U][Λ][U^{-1}][U][Λ][U^{-1}] = [U][Λ^2][U^{-1}]


Similarly, for any n ≥ 2,

[P^n] = [U][Λ^n][U^{-1}]        (1)

Eq. 3.29 in the text has a typo and should be (1) above.

We can solve (1) in general (if all M eigenvalues are distinct) as easily as for M = 2.

Break [Λ]^n into M terms,

[Λ]^n = [Λ_1^n] + ··· + [Λ_M^n]

where [Λ_i^n] has λ_i^n in position (i, i) and zeros elsewhere. Then

[P^n] = Σ_{i=1}^M λ_i^n ν^(i) π^(i)
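This expansion can be verified directly in numpy (a sketch, assuming distinct eigenvalues so that [U] is invertible; the matrix is an assumed example):

import numpy as np

P = np.array([[0.7, 0.3],
              [0.2, 0.8]])  # assumed example
n = 7

lam, U = np.linalg.eig(P)  # columns of U: right eigenvectors nu^(i)
V = np.linalg.inv(U)       # rows of V: left eigenvectors pi^(i)

# P^n as the sum of lambda_i^n times the outer product nu^(i) pi^(i).
Pn = sum(lam[i]**n * np.outer(U[:, i], V[i, :]) for i in range(len(lam)))
print(np.allclose(Pn, np.linalg.matrix_power(P, n)))  # True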


[P^n] = Σ_{i=1}^M λ_i^n ν^(i) π^(i)

π^(1) = (P_21/(P_12+P_21), P_12/(P_12+P_21)),    ν^(1) = (1, 1)^T

π^(2) = (1, −1),    ν^(2) = (P_12/(P_12+P_21), −P_21/(P_12+P_21))^T

The steady-state vector is π = π^(1), and

ν^(1) π^(1) = [ π_1  π_2 ]
              [ π_1  π_2 ]

[P^n] = ν^(1)π^(1) + λ_2^n ν^(2)π^(2) = [ π_1 + π_2 λ_2^n    π_2 − π_2 λ_2^n ]
                                        [ π_1 − π_1 λ_2^n    π_2 + π_1 λ_2^n ]

We see that [P^n] converges to e π, and the rate of convergence is λ_2. This solution is exact. It essentially extends to arbitrary finite M.
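A sketch comparing this closed form with brute-force matrix powers (assumed values of P_12 and P_21):

import numpy as np

P12, P21 = 0.3, 0.2  # assumed values
P = np.array([[1 - P12, P12],
              [P21, 1 - P21]])
pi1, pi2 = P21 / (P12 + P21), P12 / (P12 + P21)
lam2 = 1 - P12 - P21

n = 5
closed_form = np.array([[pi1 + pi2 * lam2**n, pi2 - pi2 * lam2**n],
                        [pi1 - pi1 * lam2**n, pi2 + pi1 * lam2**n]])
print(np.allclose(closed_form, np.linalg.matrix_power(P, n)))  # True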


Distinct eigenvalues for M > 2 states


Recall that, for an M state Markov chain, det[P − λI] is a polynomial of degree M in λ. It thus has M roots (eigenvalues), which we assume here to be distinct.

Each eigenvalue λ_i has a right eigenvector ν^(i) and a left eigenvector π^(i). Also π^(i) ν^(j) = 0 for each j ≠ i.

By scaling π^(i) or ν^(i), we can satisfy π^(i) ν^(i) = 1.

Let [U] be the matrix with columns ν^(1) to ν^(M) and let [V] have rows π^(1) to π^(M).

Then [V][U] = I, so [V] = [U^{-1}]. Thus the eigenvectors ν^(1) to ν^(M) are linearly independent and span M-space. Same with π^(1) to π^(M).


Putting the right eigenvector equations together, [P][U] = [U][Λ]. Postmultiplying by [U^{-1}], this becomes

[P] = [U][Λ][U^{-1}]
[P^n] = [U][Λ^n][U^{-1}]

Breaking [Λ^n] into a sum of M terms as before,

[P^n] = Σ_{i=1}^M λ_i^n ν^(i) π^(i)

Since each row of [P] sums to 1, e is a right eigenvector of eigenvalue 1.

Thm: The left eigenvector π of eigenvalue 1 is a steady-state vector if it is normalized to π e = 1.
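In code, this gives a standard recipe for the steady-state vector: take a left eigenvector of eigenvalue 1 and normalize it so that π e = 1 (a sketch; the matrix is an assumed ergodic example).

import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])  # assumed ergodic example

# Left eigenvectors of P are right eigenvectors of P transpose.
lam, W = np.linalg.eig(P.T)
k = np.argmin(np.abs(lam - 1))  # index of the eigenvalue 1
pi = np.real(W[:, k])
pi = pi / pi.sum()              # normalize so that pi e = 1

print(pi)                       # the steady-state vector
print(np.allclose(pi @ P, pi))  # True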


Thm: The left eigenvector π of eigenvalue 1 is a steady-state vector if it is normalized to π e = 1.

Pf: There must be a left eigenvector π for eigenvalue 1. For every j, 1 ≤ j ≤ M, π_j = Σ_k π_k P_kj. Taking magnitudes,

|π_j| ≤ Σ_k |π_k| P_kj        (2)

with equality iff π_j = |π_j| e^{iθ} for all j and some θ.

Summing over j, Σ_j |π_j| ≤ Σ_k |π_k|. This is satisfied with equality, so (2) is satisfied with equality for each j.

Thus (|π_1|, |π_2|, ..., |π_M|) is a nonnegative vector satisfying the steady-state vector equation. Normalizing to Σ_j |π_j| = 1, we have a steady-state vector.



Thm: Every eigenvalue λ satisfies |λ| ≤ 1.

Pf: We have seen that if π^(λ) is a left eigenvector of [P] with eigenvalue λ, then it is also a left eigenvector of [P^n] with eigenvalue λ^n. Thus

λ^n π_j^(λ) = Σ_i π_i^(λ) P_ij^n    for all j.

|λ^n| |π_j^(λ)| ≤ Σ_i |π_i^(λ)| P_ij^n    for all j.

Let β be the largest of the |π_j^(λ)| over j. For that maximizing j,

|λ^n| β ≤ Σ_i β P_ij^n ≤ β M

Thus |λ^n| ≤ M for all n, so |λ| ≤ 1.
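A sanity check of the theorem (a sketch): eigenvalues of randomly generated stochastic matrices never leave the unit circle.

import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    A = rng.random((5, 5)) + 1e-3         # strictly positive entries
    P = A / A.sum(axis=1, keepdims=True)  # row-normalize: P is stochastic
    assert np.max(np.abs(np.linalg.eigvals(P))) <= 1 + 1e-12
print("all spectral radii <= 1")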


These two theorems are valid for all finite-state Markov chains. For the case with M distinct eigenvalues, we have

[P^n] = Σ_{i=1}^M λ_i^n ν^(i) π^(i)

If the chain is an ergodic unichain, then one eigenvalue is 1 and the rest are strictly less than 1 in magnitude.

Thus the rate at which [P^n] approaches e π is determined by the second largest eigenvalue.

If [P] is a periodic unichain with period d, then there are d eigenvalues equally spaced around the unit circle and [P^n] does not converge.
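For instance (a sketch), a chain that cycles deterministically through 3 states has its eigenvalues at the three cube roots of unity, and [P^n] cycles with period 3 instead of converging.

import numpy as np

# Deterministic cycle 0 -> 1 -> 2 -> 0: a periodic unichain with d = 3.
P = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])

print(np.linalg.eigvals(P))          # 1 and the two complex cube roots of unity
print(np.linalg.matrix_power(P, 3))  # the identity: [P^n] repeats with period 3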


M states and M independent eigenvectors


Next assume that one or more eigenvalues have multiplicity greater than 1, but that if an eigenvalue has multiplicity k, then it has k linearly independent eigenvectors.

We can choose the left eigenvectors of a given eigenvalue to be orthonormal to the right eigenvectors of that eigenvalue.

After doing this and defining [U] as the matrix with columns ν^(1), ..., ν^(M), we see that [U] is invertible and that [U^{-1}] is the matrix with rows π^(1), ..., π^(M). We then again have

[P^n] = Σ_{i=1}^M λ_i^n ν^(i) π^(i)


Example: Consider a Markov chain consisting of multiple ergodic sets of states.

Then each ergodic set will have an eigenvalue equal to 1, with a right eigenvector equal to 1 on the states of that set and 0 elsewhere.

There will also be a steady-state vector, nonzero only on that set of states.

Then [P^n] will converge to a block diagonal matrix where, for each ergodic set, the rows within that set are the same.
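A sketch with two assumed ergodic sets: eigenvalue 1 appears with multiplicity 2, and [P^n] converges to a block diagonal matrix.

import numpy as np

# States {0,1} and {2,3} are separate ergodic sets.
P = np.array([[0.6, 0.4, 0.0, 0.0],
              [0.3, 0.7, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.2, 0.8]])

lam = np.linalg.eigvals(P)
print(np.sum(np.isclose(lam, 1)))      # 2: eigenvalue 1 has multiplicity 2
print(np.linalg.matrix_power(P, 100))  # block diagonal, rows equal within each set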


The Jordan form


Unfortunately, it is possible that an eigenvalue of algebraic multiplicity k ≥ 2 has fewer than k linearly independent eigenvectors.

The decomposition [P] = [U][Λ][U^{-1}] can be replaced in this case by a Jordan form, [P] = [U][J][U^{-1}], where [J] has the form

      [ λ_1   1    0    0    0  ]
      [  0   λ_1   0    0    0  ]
[J] = [  0    0   λ_2   1    0  ]
      [  0    0    0   λ_2   0  ]
      [  0    0    0    0   λ_2 ]

The eigenvalues are on the main diagonal, and ones are on the next diagonal up where needed for deficient eigenvectors.


Example:

      [ 1/2  1/2   0  ]
[P] = [  0   1/2  1/2 ]
      [  0    0    1  ]

The eigenvalues are 1 and 1/2, with algebraic multiplicity 2 for λ = 1/2.

There is only one eigenvector (subject to a scaling constant) for the eigenvalue 1/2. [P^n] approaches steady state as n(1/2)^n.

Fortunately, if [P] is stochastic, the eigenvalue 1 always has as many linearly independent eigenvectors as its algebraic multiplicity.
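Both claims are easy to check numerically (a sketch using the slide's matrix): [P − (1/2)I] has rank 2, so λ = 1/2 has a one-dimensional eigenspace, and the off-diagonal entry decays as n(1/2)^n.

import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])

# Geometric multiplicity of lambda = 1/2 is 3 - rank(P - 0.5 I) = 1.
print(np.linalg.matrix_rank(P - 0.5 * np.eye(3)))  # 2

# [P^n]_{01} equals n (1/2)^n exactly for this chain.
for n in [5, 10, 20]:
    Pn = np.linalg.matrix_power(P, n)
    print(n, Pn[0, 1], n * 0.5**n)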


MIT OpenCourseWare
http://ocw.mit.edu

6.262 Discrete Stochastic Processes
Spring 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
