Random Walk Article
$\mathcal{P}'_d$ (aperiodic, mean zero, finite second moment), and $\mathcal{P}^*_d$ (aperiodic with no other assumptions). Symmetric random walks on other integer lattices such as the triangular lattice can also be considered by taking a linear transformation of the lattice onto $\mathbb{Z}^d$.
The local central limit theorem (LCLT) is the topic for Chapter 2. Its proof, like the proof of the usual central limit theorem, is done by using Fourier analysis to express the probability of interest in terms of an integral, and then estimating the integral. The error estimates depend strongly on the number of finite moments of the corresponding increment distribution. Some important corollaries are proved in Section 2.4; in particular, the fact that aperiodic random walks starting at different points can be coupled so that with probability $1 - O(n^{-1/2})$ they agree for all times greater than $n$ is true for any aperiodic walk, without any finite moment assumptions. The chapter ends with a more classical, combinatorial derivation of the LCLT for simple random walk using Stirling's formula, while again keeping track of error terms.
Brownian motion is introduced in Chapter 3. Although we would expect a typical reader to
already be familiar with Brownian motion, we give the construction via the dyadic splitting method.
The estimates for the modulus of continuity are given as well. We then describe the Skorokhod
method of coupling a random walk and a Brownian motion on the same probability space, and
give error estimates. The dyadic construction of Brownian motion is also important for the dyadic
coupling algorithm of Chapter 7.
The Green's function and its analog in the recurrent setting, the potential kernel, are studied in Chapter 4. One of the main tools in the potential theory of random walk is the analysis of martingales derived from these functions. Sharp asymptotics at infinity for the Green's function are needed to take full advantage of the martingale technique. We use the sharp LCLT estimates of Chapter 2 to obtain the Green's function estimates. We also discuss the number of finite moments needed for various error asymptotics.
Chapter 5 may seem somewhat out of place. It concerns a well-known estimate for one-dimensional walks called the gambler's ruin estimate. Our motivation for providing a complete self-contained argument is twofold. Firstly, in order to apply this result to all one-dimensional projections of a higher dimensional walk simultaneously, it is important to show that this estimate holds for non-lattice walks uniformly in a few parameters of the distribution (variance, probability of making an order 1 positive step). In addition, the argument introduces the reader to a fairly general technique for obtaining overshoot estimates. The final two sections of this chapter concern variations of one-dimensional walks that arise naturally in the arguments for estimating probabilities of hitting (or avoiding) some special sets, for example, the half-line.
In Chapter 6, the classical potential theory of the random walk is covered in the spirit of [16] and [10] (and a number of other sources). The difference equations of our discrete space setting (that in turn become matrix equations on finite sets) are analogous to the standard linear partial differential equations of (continuous) potential theory. The closed form of the solutions is important, but we emphasize here the estimates on hitting probabilities that one can obtain using them. The martingales derived from the Green's function are very important in this analysis, and again special care is given to error terms. For notational ease, the discussion is restricted here to symmetric walks. In fact, most of the results of this chapter hold for nonsymmetric walks, but in this case one must distinguish between the original walk and the reversed walk, i.e., between an operator and its adjoint. An implicit exercise for a dedicated student would be to redo this entire chapter for nonsymmetric walks, changing the statements of the propositions as necessary. It would be more work to relax the finite range assumption, and the moment conditions would become a crucial component of the analysis in this general setting. Perhaps this will be a topic of some future book.
Chapter 7 discusses a tight coupling of a random walk (that has a finite exponential moment) and a Brownian motion, called the dyadic coupling or KMT or Hungarian coupling, which originated in Komlós, Major, and Tusnády [7, 8]. The idea of the coupling is very natural (once explained), but hard work is needed to prove the strong error estimate. The sharp LCLT estimates from Chapter 2 are one of the key points for this analysis.
In bounded rectangles with sides parallel to the coordinate directions, the rate of convergence of simple random walk to Brownian motion is very fast. Moreover, in this case, exact expressions are available in terms of finite Fourier sums. Several of these calculations are done in Chapter 8.
Chapter 9 is different from the rest of this book. It covers an area that includes both classical combinatorial ideas and topics of current research. As has been gradually discovered by a number of researchers in various disciplines (combinatorics, probability, statistical physics), several objects inherent to a graph or network are closely related: the number of spanning trees, the determinant of the Laplacian, various measures on loops on the trees, the Gaussian free field, and loop-erased walks. We give an introduction to this theory, using an approach that is focused on the (unrooted) random walk loop measure, and that uses Wilson's algorithm [18] for generating spanning trees.
The original outline of this book put much more emphasis on the path-intersection probabilities and the loop-erased walks. The final version offers only a general introduction to some of the main ideas, in the last two chapters. On the one hand, these topics were already discussed in more detail in [10], and on the other, discussing the more recent developments in the area would require familiarity with the Schramm-Loewner evolution, and explaining this would take us too far from the main topic.
Most of the content of this text (the first eight chapters in particular) consists of well-known classical results. It would be very difficult, if not impossible, to give a detailed and complete list of references. In many cases, the results were obtained in several places at different occasions, as auxiliary (technical) lemmas needed for understanding some other model of interest, and were therefore not particularly noticed by the community. Attempting to give even a reasonably fair account of the development of this subject would have inhibited the conclusion of this project. The bibliography is therefore restricted to a few references that were used in the writing of this book. We refer the reader to [16] for an extensive bibliography on random walk, and to [10] for some additional references.
This book is intended for researchers and graduate students alike, and a considerable number of exercises is included for their benefit. The appendix consists of various results from probability theory that are used in the first eleven chapters but are not really linked to random walk behavior. It is assumed that the reader is familiar with the basics of measure-theoretic probability theory.
The book contains quite a few remarks that are separated from the rest of the text by this typeface. They
are intended to be helpful heuristics for the reader, but are not used in the actual arguments.
A number of people have made useful comments on various drafts of this book, including students at Cornell University and the University of Chicago. We thank Christian Beneš, Juliana Freire, Michael Kozdron, José Trujillo Ferreras, Robert Masson, Robin Pemantle, Mohammad Abbas Rezaei, Nicolas de Saxcé, Joel Spencer, Rongfeng Sun, John Thacker, Brigitta Vermesi, and Xinghua Zheng. The research of Greg Lawler is supported by the National Science Foundation.
1 Introduction

1.1 Basic definitions
We will define the random walks that we consider in this book. We focus our attention on random walks in $\mathbb{Z}^d$ that have bounded, symmetric increment distributions, although we occasionally discuss results for wider classes of walks. We also impose an irreducibility criterion to guarantee that all points in the lattice $\mathbb{Z}^d$ can be reached.

Fig 1.1. The square lattice $\mathbb{Z}^2$
We start by setting some basic notation. We use $x, y, z$ to denote points in the integer lattice $\mathbb{Z}^d = \{(x^1, \ldots, x^d) : x^j \in \mathbb{Z}\}$. We use superscripts to denote components, and we use subscripts to enumerate elements. For example, $x_1, x_2, \ldots$ represents a sequence of points in $\mathbb{Z}^d$, and the point $x_j$ can be written in component form $x_j = (x_j^1, \ldots, x_j^d)$. We write $e_1 = (1, 0, \ldots, 0), \ldots, e_d = (0, \ldots, 0, 1)$ for the standard basis of unit vectors in $\mathbb{Z}^d$. The prototypical example is (discrete time) simple random walk starting at $x \in \mathbb{Z}^d$. This process can be considered either as a sum of a sequence of independent, identically distributed random variables,
\[ S_n = x + X_1 + \cdots + X_n, \]
where $P\{X_j = e_k\} = P\{X_j = -e_k\} = 1/(2d)$, $k = 1, \ldots, d$, or it can be considered as a Markov chain with state space $\mathbb{Z}^d$ and transition probabilities
\[ P\{S_{n+1} = z \mid S_n = y\} = \frac{1}{2d}, \qquad z - y \in \{\pm e_1, \ldots, \pm e_d\}. \]
We call $V = \{x_1, \ldots, x_l\} \subset \mathbb{Z}^d \setminus \{0\}$ a (finite) generating set if each $y \in \mathbb{Z}^d$ can be written as $k_1 x_1 + \cdots + k_l x_l$ for some $k_1, \ldots, k_l \in \mathbb{Z}$. We let $\mathcal{G}$ denote the collection of generating sets $V$ with the property that if $x = (x^1, \ldots, x^d) \in V$, then the first nonzero component of $x$ is positive. An example of such a set is $\{e_1, \ldots, e_d\}$. A (finite range, symmetric, irreducible) random walk is given by specifying a $V = \{x_1, \ldots, x_l\} \in \mathcal{G}$ and a function $\kappa : V \to (0,1]$ with $\kappa(x_1) + \cdots + \kappa(x_l) \le 1$. Associated to this is the symmetric probability distribution on $\mathbb{Z}^d$
\[ p(x_k) = p(-x_k) = \frac{1}{2}\,\kappa(x_k), \qquad p(0) = 1 - \sum_{x \in V} \kappa(x). \]
We let $\mathcal{P}_d$ denote the set of such distributions $p$ on $\mathbb{Z}^d$ and $\mathcal{P} = \bigcup_{d \ge 1} \mathcal{P}_d$. Given $p$, the corresponding random walk $S_n$ can be considered as the time-homogeneous Markov chain with state space $\mathbb{Z}^d$ and transition probabilities
\[ p(y,z) := P\{S_{n+1} = z \mid S_n = y\} = p(z - y). \]
We can also write
\[ S_n = S_0 + X_1 + \cdots + X_n, \]
where $X_1, X_2, \ldots$ are independent random variables, independent of $S_0$, with distribution $p$. (Most of the time we will choose $S_0$ to have a trivial distribution.) We will use the phrase $\mathcal{P}$-walk or $\mathcal{P}_d$-walk for such a random walk. We will use the term simple random walk for the particular $p$ with
\[ p(e_j) = p(-e_j) = \frac{1}{2d}, \qquad j = 1, \ldots, d. \]
We call $p$ the increment distribution for the walk. Given $p \in \mathcal{P}$, we write $p_n$ for the $n$-step distribution
\[ p_n(x,y) = P\{S_n = y \mid S_0 = x\} \]
and $p_n(x) = p_n(0,x)$. Note that $p_n(\cdot)$ is the distribution of $X_1 + \cdots + X_n$ where $X_1, \ldots, X_n$ are independent with increment distribution $p$.
In many ways the main focus of this book is simple random walk, and a first-time reader might find it useful to consider this example throughout. We have chosen to generalize this slightly, because it does not complicate the arguments much and allows the results to be extended to other examples. One particular example is simple random walk on other regular lattices such as the planar triangular lattice. In Section 1.3, we show that walks on other $d$-dimensional lattices are isomorphic to $p$-walks on $\mathbb{Z}^d$.
If $S_n = (S_n^1, \ldots, S_n^d)$ is a $\mathcal{P}$-walk with $S_0 = 0$, then $P\{S_{2n} = 0\} > 0$ for every positive integer $n$; this follows from the easy estimate $P\{S_{2n} = 0\} \ge [P\{S_2 = 0\}]^n \ge p(x)^{2n}$ for every $x \in \mathbb{Z}^d$. We will call the walk bipartite if $p_n(0,0) = 0$ for every odd $n$, and we will call it aperiodic otherwise. In the latter case, $p_n(0,0) > 0$ for all $n$ sufficiently large (in fact, for all $n \ge k$ where $k$ is the first odd integer with $p_k(0,0) > 0$). Simple random walk is an example of a bipartite walk since $S_n^1 + \cdots + S_n^d$ is odd for odd $n$ and even for even $n$. If $p$ is bipartite, then we can partition $\mathbb{Z}^d = (\mathbb{Z}^d)_e \cup (\mathbb{Z}^d)_o$, where $(\mathbb{Z}^d)_e$ denotes the points that can be reached from the origin in an even number of steps and $(\mathbb{Z}^d)_o$ denotes the set of points that can be reached in an odd number of steps. In algebraic language, $(\mathbb{Z}^d)_e$ is an additive subgroup of $\mathbb{Z}^d$ of index 2 and $(\mathbb{Z}^d)_o$ is the nontrivial coset. Note that if $x \in (\mathbb{Z}^d)_o$, then $(\mathbb{Z}^d)_o = x + (\mathbb{Z}^d)_e$.
It would suffice, and would perhaps be more convenient, to restrict our attention to aperiodic walks. Results about bipartite walks can easily be deduced from them. However, since our main example, simple random walk, is bipartite, we have chosen to allow such $p$.
If $p \in \mathcal{P}_d$ and $j_1, \ldots, j_d$ are nonnegative integers, the $(j_1, \ldots, j_d)$ moment is given by
\[ E\big[(X_1^1)^{j_1} \cdots (X_1^d)^{j_d}\big] = \sum_{x \in \mathbb{Z}^d} (x^1)^{j_1} \cdots (x^d)^{j_d}\, p(x). \]
We let $\Gamma$ denote the covariance matrix
\[ \Gamma = \big[ E[X_1^j X_1^k] \big]_{1 \le j,k \le d}. \]
The covariance matrix is symmetric and positive definite. Since the random walk is truly $d$-dimensional, it is easy to verify (see Proposition 1.1.1(a)) that the matrix is invertible. There exists a symmetric positive definite matrix $\Lambda$ such that $\Gamma = \Lambda\Lambda^T$ (see Section 12.3). There is a (not unique) orthonormal basis $u_1, \ldots, u_d$ of $\mathbb{R}^d$ such that we can write
\[ \Gamma x = \sum_{j=1}^d \sigma_j^2\, (x \cdot u_j)\, u_j, \qquad \Lambda x = \sum_{j=1}^d \sigma_j\, (x \cdot u_j)\, u_j. \]
If $X_1$ has covariance matrix $\Gamma = \Lambda\Lambda^T$, then the random vector $\Lambda^{-1} X_1$ has covariance matrix $I$.
For future use, we define norms $\mathcal{J}, \mathcal{J}^*$ by
\[ \mathcal{J}^*(x)^2 = |x \cdot \Gamma^{-1} x| = |\Lambda^{-1}x|^2 = \sum_{j=1}^d \sigma_j^{-2}\, (x \cdot u_j)^2, \qquad \mathcal{J}(x) = d^{-1/2}\, \mathcal{J}^*(x). \tag{1.1} \]
If $p \in \mathcal{P}_d$,
\[ E[\mathcal{J}(X_1)^2] = \frac{1}{d}\, E[\mathcal{J}^*(X_1)^2] = \frac{1}{d}\, E\big[ |\Lambda^{-1} X_1|^2 \big] = 1. \]
For simple random walk in $\mathbb{Z}^d$,
\[ \Gamma = d^{-1}\, I, \qquad \mathcal{J}^*(x) = d^{1/2}\, |x|, \qquad \mathcal{J}(x) = |x|. \]
We will use $B_n$ to denote the discrete ball of radius $n$,
\[ B_n = \{x \in \mathbb{Z}^d : |x| < n\}, \]
and $C_n$ to denote the discrete ball under the norm $\mathcal{J}$,
\[ C_n = \{x \in \mathbb{Z}^d : \mathcal{J}(x) < n\} = \{x \in \mathbb{Z}^d : \mathcal{J}^*(x) < d^{1/2}\, n\}. \]
We choose to use $\mathcal{J}$ in the definition of $C_n$ so that for simple random walk, $C_n = B_n$. We will write
\[ R = R_p = \max\{|x| : p(x) > 0\} \]
and we will call $R$ the range of $p$. The following is very easy, but it is important enough to state as a proposition.
Proposition 1.1.1 Suppose $p \in \mathcal{P}_d$.
(a) There exists an $\epsilon > 0$ such that for every unit vector $u \in \mathbb{R}^d$,
\[ E[(X_1 \cdot u)^2] \ge \epsilon. \]
(b) If $j_1, \ldots, j_d$ are nonnegative integers with $j_1 + \cdots + j_d$ odd, then
\[ E\big[(X_1^1)^{j_1} \cdots (X_1^d)^{j_d}\big] = 0. \]
(c) There exists a $\delta > 0$ such that for all $x$,
\[ \delta\, \mathcal{J}(x) \le |x| \le \delta^{-1}\, \mathcal{J}(x). \]
In particular,
\[ C_{\delta n} \subset B_n \subset C_{n/\delta}. \]
We note for later use that we can construct a random walk with increment distribution $p \in \mathcal{P}$ from a collection of independent one-dimensional simple random walks and an independent multinomial process. To be more precise, let $V = \{x_1, \ldots, x_l\} \in \mathcal{G}$ and let $\kappa : V \to (0,1]$ be as in the definition of $\mathcal{P}$. Suppose that on the same probability space we have defined $l$ independent one-dimensional simple random walks $S_{n,1}, S_{n,2}, \ldots, S_{n,l}$ and an independent multinomial process $L_n = (L_n^1, \ldots, L_n^l)$ with probabilities $\kappa(x_1), \ldots, \kappa(x_l)$. In other words,
\[ L_n = \sum_{j=1}^n Y_j, \]
where $Y_1, Y_2, \ldots$ are independent $\mathbb{Z}^l$-valued random variables with
\[ P\{Y_k = (1, 0, \ldots, 0)\} = \kappa(x_1), \quad \ldots, \quad P\{Y_k = (0, 0, \ldots, 1)\} = \kappa(x_l), \]
and $P\{Y_k = (0, 0, \ldots, 0)\} = 1 - [\kappa(x_1) + \cdots + \kappa(x_l)]$. It is easy to verify that the process
\[ S_n := x_1\, S_{L_n^1,1} + x_2\, S_{L_n^2,2} + \cdots + x_l\, S_{L_n^l,l} \tag{1.2} \]
has the distribution of the random walk with increment distribution $p$. Essentially what we have done is to split the decision as to how to jump at time $n$ into two decisions: first, to choose an element $x_j \in \{x_1, \ldots, x_l\}$, and then to decide whether to move by $+x_j$ or $-x_j$.
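The decomposition (1.2) translates directly into a simulation recipe. The following sketch is ours (it does not appear in the text; the function name and NumPy conventions are our own choices): it samples the multinomial process $L_n$ and the $l$ one-dimensional simple random walks and combines them as in (1.2).

    import numpy as np

    def p_walk_via_decomposition(V, kappa, n, rng=np.random.default_rng()):
        """Sample S_0, ..., S_n of a p-walk on Z^d using the decomposition (1.2).

        V     : array of shape (l, d), the generating set {x_1, ..., x_l}
        kappa : array of shape (l,), with kappa.sum() <= 1
        """
        V = np.asarray(V)
        kappa = np.asarray(kappa, dtype=float)
        l, d = V.shape
        # Multinomial process: at each step choose index j with probability kappa_j,
        # or "no move" with the leftover probability 1 - sum(kappa).
        probs = np.append(kappa, 1.0 - kappa.sum())
        choices = rng.choice(l + 1, size=n, p=probs)        # value l means "stay"
        # Independent one-dimensional simple random walks, one per element of V.
        steps_1d = rng.choice([-1, 1], size=(n, l))
        S = np.zeros((n + 1, d), dtype=int)
        L = np.zeros(l, dtype=int)    # current values L_n^j
        W = np.zeros(l, dtype=int)    # current values S_{L_n^j, j}
        for m in range(n):
            j = choices[m]
            if j < l:                 # walk j takes its next +/-1 step
                W[j] += steps_1d[L[j], j]
                L[j] += 1
            S[m + 1] = V.T @ W        # S_n = sum_j x_j * S_{L_n^j, j}
        return S

    # Example: simple random walk in Z^2 corresponds to V = {e_1, e_2}, kappa = (1/2, 1/2).
    path = p_walk_via_decomposition(V=[[1, 0], [0, 1]], kappa=[0.5, 0.5], n=10)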
1.2 Continuous-time random walk

It is often more convenient to consider random walks in $\mathbb{Z}^d$ indexed by positive real times. Given $V, \kappa, p$ as in the previous section, the continuous-time random walk with increment distribution $p$ is the continuous-time Markov chain $\tilde S_t$ with rates $p$. In other words, for each $x, y \in \mathbb{Z}^d$,
\[ P\{\tilde S_{t+\Delta t} = y \mid \tilde S_t = x\} = p(y - x)\, \Delta t + o(\Delta t), \qquad y \ne x, \]
\[ P\{\tilde S_{t+\Delta t} = x \mid \tilde S_t = x\} = 1 - \Big[ \sum_{y \ne x} p(y - x) \Big] \Delta t + o(\Delta t). \]
Let $\tilde p_t(x,y) = P\{\tilde S_t = y \mid \tilde S_0 = x\}$, and $\tilde p_t(y) = \tilde p_t(0,y) = \tilde p_t(x, x+y)$. Then the expressions above imply
\[ \frac{d}{dt}\, \tilde p_t(x) = \sum_{y \in \mathbb{Z}^d} p(y)\, [\tilde p_t(x - y) - \tilde p_t(x)]. \]
There is a very close relationship between the discrete time and continuous time random walks with the same increment distribution. We state this as a proposition which we leave to the reader to verify.

Proposition 1.2.1 Suppose $S_n$ is a (discrete-time) random walk with increment distribution $p$ and $N_t$ is an independent Poisson process with parameter 1. Then $\tilde S_t := S_{N_t}$ has the distribution of a continuous-time random walk with increment distribution $p$.
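Proposition 1.2.1 also gives a convenient way to simulate the continuous-time walk: run a rate-one Poisson process and evaluate the discrete walk at $N_t$. The sketch below is our own illustration of that reading; the helper `p_sampler` is a hypothetical callable returning i.i.d. increments with distribution $p$.

    import numpy as np

    def continuous_time_walk(p_sampler, t_max, times, rng=np.random.default_rng()):
        """Sample a continuous-time p-walk at the given times via Proposition 1.2.1:
        run a rate-1 Poisson process N_t and set S~_t = S_{N_t}."""
        # Generate the jump times of the rate-1 Poisson process up to t_max.
        jump_times = []
        t = rng.exponential(1.0)
        while t <= t_max:
            jump_times.append(t)
            t += rng.exponential(1.0)
        increments = p_sampler(len(jump_times), rng)
        S = np.concatenate([[np.zeros(increments.shape[1], dtype=int)],
                            np.cumsum(increments, axis=0)])
        jump_times = np.array(jump_times)
        # S~_t = S_{N_t}, where N_t = #{jump times <= t}
        return [S[np.searchsorted(jump_times, s, side="right")] for s in times]

    # Example with simple random walk in Z^2:
    def srw_increments(k, rng):
        dirs = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
        return dirs[rng.integers(0, 4, size=k)]

    samples = continuous_time_walk(srw_increments, t_max=5.0, times=[1.0, 2.5, 5.0])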
There are various technical reasons why continuous-time random walks are sometimes easier to handle than discrete-time walks. One reason is that in the continuous setting there is no periodicity. If $p \in \mathcal{P}_d$, then $\tilde p_t(x) > 0$ for every $t > 0$ and $x \in \mathbb{Z}^d$. Another advantage can be found in the following proposition which gives an analogous, but nicer, version of (1.2). We leave the proof to the reader.
Proposition 1.2.2 Suppose $p \in \mathcal{P}_d$ with generating set $V = \{x_1, \ldots, x_l\}$ and suppose $\tilde S_{t,1}, \ldots, \tilde S_{t,l}$ are independent one-dimensional continuous-time random walks with increment distributions $q_1, \ldots, q_l$ where $q_j(\pm 1) = p(x_j)$. Then
\[ \tilde S_t := x_1\, \tilde S_{t,1} + x_2\, \tilde S_{t,2} + \cdots + x_l\, \tilde S_{t,l} \tag{1.3} \]
has the distribution of a continuous-time random walk with increment distribution $p$.
If $p$ is the increment distribution for simple random walk, we call the corresponding walk $\tilde S_t$ the continuous-time simple random walk in $\mathbb{Z}^d$. From the previous proposition, we see that the coordinates of the continuous-time simple random walk are independent; this is clearly not true for the discrete-time simple random walk. In fact, we get the following. Suppose $\tilde S_{t,1}, \ldots, \tilde S_{t,d}$ are independent one-dimensional continuous-time simple random walks. Then,
\[ \tilde S_t := \big( \tilde S_{t/d,1}, \ldots, \tilde S_{t/d,d} \big) \]
is a continuous-time simple random walk in $\mathbb{Z}^d$. In particular, if $\tilde S_0 = 0$, then
\[ P\{\tilde S_t = (y^1, \ldots, y^d)\} = P\{\tilde S_{t/d,1} = y^1\} \cdots P\{\tilde S_{t/d,d} = y^d\}. \]
Remark. To verify that a discrete-time process $S_n$ is a random walk with distribution $p \in \mathcal{P}_d$ starting at the origin, it suffices to show that for all positive integers $j_1 < j_2 < \cdots < j_k$ and $x_1, \ldots, x_k \in \mathbb{Z}^d$,
\[ P\{S_{j_1} = x_1, \ldots, S_{j_k} = x_k\} = p_{j_1}(x_1)\, p_{j_2 - j_1}(x_2 - x_1) \cdots p_{j_k - j_{k-1}}(x_k - x_{k-1}). \]
To verify that a continuous-time process $\tilde S_t$ is a continuous-time random walk with distribution $p$ starting at the origin, it suffices to show that the paths are right-continuous with probability one, and that for all real $t_1 < t_2 < \cdots < t_k$ and $x_1, \ldots, x_k \in \mathbb{Z}^d$,
\[ P\{\tilde S_{t_1} = x_1, \ldots, \tilde S_{t_k} = x_k\} = \tilde p_{t_1}(x_1)\, \tilde p_{t_2 - t_1}(x_2 - x_1) \cdots \tilde p_{t_k - t_{k-1}}(x_k - x_{k-1}). \]
1.3 Other lattices

A lattice $L$ is a discrete additive subgroup of $\mathbb{R}^d$. The term discrete means that there is a real neighborhood of the origin whose intersection with $L$ is just the origin. While this book will focus on the lattice $\mathbb{Z}^d$, we will show in this section that this also implies results for symmetric, bounded random walks on other lattices. We start by giving a proposition that classifies all lattices.

Proposition 1.3.1 If $L$ is a lattice in $\mathbb{R}^d$, then there exists an integer $k \le d$ and elements $x_1, \ldots, x_k \in L$ that are linearly independent as vectors in $\mathbb{R}^d$ such that
\[ L = \{j_1 x_1 + \cdots + j_k x_k : j_1, \ldots, j_k \in \mathbb{Z}\}. \]
In this case we call $L$ a $k$-dimensional lattice.
Proof Suppose first that $L$ is contained in a one-dimensional subspace of $\mathbb{R}^d$. Choose $x_1 \in L \setminus \{0\}$ with minimal distance from the origin. Clearly $\{j x_1 : j \in \mathbb{Z}\} \subset L$. Also, if $x \in L$, then $j x_1 \le x < (j+1)\, x_1$ for some $j \in \mathbb{Z}$, but if $x > j x_1$, then $x - j x_1$ would be closer to the origin than $x_1$. Hence $L = \{j x_1 : j \in \mathbb{Z}\}$.
More generally, suppose we have chosen linearly independent $x_1, \ldots, x_j$ such that the following holds: if $L_j$ is the subgroup generated by $x_1, \ldots, x_j$, and $V_j$ is the real subspace of $\mathbb{R}^d$ generated by the vectors $x_1, \ldots, x_j$, then $L \cap V_j = L_j$. If $L = L_j$, we stop. Otherwise, let $w_0 \in L \setminus L_j$ and let
\[ U = \{t w_0 : t \in \mathbb{R},\ t w_0 + y_0 \in L \text{ for some } y_0 \in V_j\} \]
\[ \phantom{U} = \{t w_0 : t \in \mathbb{R},\ t w_0 + t_1 x_1 + \cdots + t_j x_j \in L \text{ for some } t_1, \ldots, t_j \in [0,1]\}. \]
The second equality uses the fact that $L$ is a subgroup. Using the first description, we can see that $U$ is a subgroup of $\mathbb{R}^d$ (although not necessarily contained in $L$). We claim that the second description shows that there is a neighborhood of the origin whose intersection with $U$ is exactly the origin. Indeed, the intersection of $L$ with every bounded subset of $\mathbb{R}^d$ is finite (why?), and hence there are only a finite number of lattice points of the form
\[ t w_0 + t_1 x_1 + \cdots + t_j x_j \]
with $0 < t \le 1$ and $0 \le t_1, \ldots, t_j \le 1$. Hence there is an $\epsilon > 0$ such that there are no such lattice points with $0 < |t| \le \epsilon$. Therefore $U$ is a one-dimensional lattice, and hence there is a $w \in U$ such that $U = \{k w : k \in \mathbb{Z}\}$. By definition, there exists a $y_1 \in V_j$ (not unique, but we just choose one) such that $x_{j+1} := w + y_1 \in L$. Let $L_{j+1}, V_{j+1}$ be as above using $x_1, \ldots, x_j, x_{j+1}$. Note that $V_{j+1}$ is also the real subspace generated by $x_1, \ldots, x_j, w_0$. We claim that $L \cap V_{j+1} = L_{j+1}$. Indeed, suppose that $z \in L \cap V_{j+1}$, and write $z = s_0 w_0 + y_2$ where $y_2 \in V_j$. Then $s_0 w_0 \in U$, and hence $s_0 w_0 = l w$ for some integer $l$. Hence, we can write $z = l x_{j+1} + y_3$ with $y_3 = y_2 - l y_1 \in V_j$. But, $z - l x_{j+1} \in V_j \cap L = L_j$. Hence $z \in L_{j+1}$.
The proof above seems a little complicated. At first glance it seems that one might be able to simplify the argument as follows. Using the notation in the proof, we start by choosing $x_1$ to be a nonzero point in $L$ at minimal distance from the origin, and then inductively choose $x_{j+1}$ to be a nonzero point in $L \setminus L_j$ at minimal distance from the origin. This selection method produces linearly independent $x_1, \ldots, x_k$; however, it is not always the case that
\[ L = \{j_1 x_1 + \cdots + j_k x_k : j_1, \ldots, j_k \in \mathbb{Z}\}. \]
As an example, suppose $L$ is the 5-dimensional lattice generated by
\[ 2e_1,\ 2e_2,\ 2e_3,\ 2e_4,\ e_1 + e_2 + \cdots + e_5. \]
Note that $2e_5 \in L$ and the only nonzero points in $L$ that are within distance two of the origin are $\pm 2e_j$, $j = 1, \ldots, 5$. Therefore this selection method would choose (in some order) $\pm 2e_1, \ldots, \pm 2e_5$. But, $e_1 + \cdots + e_5$ is not in the subgroup generated by these points.
It follows from the proposition that if $k \le d$ and $L$ is a $k$-dimensional lattice in $\mathbb{R}^d$, then we can find a linear transformation $A : \mathbb{R}^d \to \mathbb{R}^k$ that is an isomorphism of $L$ onto $\mathbb{Z}^k$. Indeed, we define $A$ by $A(x_j) = e_j$ where $x_1, \ldots, x_k$ is a basis for $L$ as in the proposition. If $S_n$ is a bounded, symmetric, irreducible random walk taking values in $L$, then $A S_n$ is a random walk with increment distribution $p \in \mathcal{P}_k$. Hence, results about walks on $\mathbb{Z}^k$ immediately translate to results about walks on $L$. If $L$ is a $k$-dimensional lattice in $\mathbb{R}^d$ and $A$ is the corresponding transformation, we will call $|\det A|$ the density of the lattice. The term comes from the fact that as $r \to \infty$, the cardinality of the intersection of the lattice and the ball of radius $r$ in $\mathbb{R}^d$ is asymptotically equal to $|\det A|\, r^k$ times the volume of the unit ball in $\mathbb{R}^k$. In particular, if $j_1, \ldots, j_k$ are positive integers, then $(j_1\mathbb{Z}) \times \cdots \times (j_k\mathbb{Z})$ has density $(j_1 \cdots j_k)^{-1}$.
Examples.

The triangular lattice, considered as a subset of $\mathbb{C} = \mathbb{R}^2$, is the lattice generated by $1$ and $e^{i\pi/3}$,
\[ L_T = \{k_1 + k_2\, e^{i\pi/3} : k_1, k_2 \in \mathbb{Z}\}. \]
Note that $e^{2i\pi/3} = e^{i\pi/3} - 1 \in L_T$. The triangular lattice is also considered as a graph with the above vertices and with edges connecting points that are Euclidean distance one apart. In this case, the origin has six nearest neighbors, the six sixth roots of unity. Simple random walk on the triangular lattice is the process that chooses among these six nearest neighbors equally likely. Note that this is a symmetric walk with bounded increments. The matrix
\[ A = \begin{bmatrix} 1 & -1/\sqrt{3} \\ 0 & 2/\sqrt{3} \end{bmatrix} \]
maps $L_T$ to $\mathbb{Z}^2$, sending $1, e^{i\pi/3}, e^{2i\pi/3}$ to $e_1, e_2, e_2 - e_1$. The transformed random walk gives probability $1/6$ to each of the vectors $\pm e_1, \pm e_2, \pm(e_2 - e_1)$. Note that our transformed walk has lost some of the symmetry of the original walk.
Fig 1.2. The triangular lattice $L_T$ and its transformation $A L_T$
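As a quick sanity check of the linear transformation above (our own sketch, not part of the text), one can apply $A$ to the six unit increments of the triangular-lattice walk and confirm that they map onto $\pm e_1, \pm e_2, \pm(e_2 - e_1)$.

    import numpy as np

    A = np.array([[1.0, -1.0 / np.sqrt(3)],
                  [0.0,  2.0 / np.sqrt(3)]])

    # The six nearest neighbors of the origin in the triangular lattice: e^{i k pi/3}, k = 0,...,5.
    angles = np.pi / 3 * np.arange(6)
    increments = np.stack([np.cos(angles), np.sin(angles)], axis=1)

    images = increments @ A.T               # apply A to each increment
    print(np.round(images).astype(int))     # rows: e_1, e_2, e_2 - e_1 and their negatives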
The hexagonal or honeycomb lattice is not a lattice in our sense but rather a dual graph to the triangular lattice. It can be constructed in a number of ways. One way is to start with the triangular lattice $L_T$. The lattice partitions the plane into triangular regions, of which some point up and some point down. We add a vertex in the center of each triangle pointing down. The edges of this graph are the line segments from the center points to the vertices of these triangles (see figure).

Fig 1.3. The hexagons within $L_T$
Simple random walk on this graph is the process that at each time step moves to one of the three nearest neighbors. This is not a random walk in our strict sense because the increment distribution depends on whether the current position is a center point or a vertex point. However, if we start at a vertex in $L_T$, the two-step distribution of this walk is the same as the walk on the triangular lattice with step distribution $p(\pm 1) = p(\pm e^{i\pi/3}) = p(\pm e^{2i\pi/3}) = 1/9$; $p(0) = 1/3$.
When studying random walks on other lattices $L$, we can map the walk to another walk on $\mathbb{Z}^d$. However, since this might lose useful symmetries of the walk, it is sometimes better to work on the original lattice.
1.4 Other walks

Although we will focus primarily on $p \in \mathcal{P}$, there are times where we will want to look at more general walks. There are two classes of distributions we will be considering.

Definition
$\mathcal{P}^*_d$ denotes the set of $p$ that generate aperiodic, irreducible walks supported on $\mathbb{Z}^d$, i.e., the set of $p$ such that for all $x, y \in \mathbb{Z}^d$ there exists an $N$ such that $p_n(x,y) > 0$ for $n \ge N$.
$\mathcal{P}'_d$ denotes the set of $p \in \mathcal{P}^*_d$ with mean zero and finite second moment.
We write $\mathcal{P}^* = \bigcup_d \mathcal{P}^*_d$, $\mathcal{P}' = \bigcup_d \mathcal{P}'_d$.

Note that under our definition $\mathcal{P}$ is not a subset of $\mathcal{P}^*$, since bipartite walks (such as simple random walk) are not aperiodic.
1.5 Generator

If $f : \mathbb{Z}^d \to \mathbb{R}$ is a function and $x \in \mathbb{Z}^d$, we define the first and second difference operators in $x$ by
\[ \nabla_x f(y) = f(y+x) - f(y), \]
\[ \nabla_x^2 f(y) = \frac{1}{2}\, f(y+x) + \frac{1}{2}\, f(y-x) - f(y). \]
Note that $\nabla_x^2 = \nabla_{-x}^2$. We will sometimes write just $\nabla_j, \nabla_j^2$ for $\nabla_{e_j}, \nabla_{e_j}^2$. If $p \in \mathcal{P}_d$ with generating set $V$, then the generator $\mathcal{L} = \mathcal{L}_p$ is defined by
\[ \mathcal{L}f(y) = \sum_{x \in \mathbb{Z}^d} p(x)\, \nabla_x f(y) = \sum_{x \in V} \kappa(x)\, \nabla_x^2 f(y) = -f(y) + \sum_{x \in \mathbb{Z}^d} p(x)\, f(x+y). \]
In the case of simple random walk, the generator is often called the discrete Laplacian and we will represent it by $\Delta_D$,
\[ \Delta_D f(y) = \frac{1}{d} \sum_{j=1}^d \nabla_j^2 f(y). \]
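A direct way to experiment with the generator is to evaluate the defining sum for a finitely supported $p$. The sketch below is ours (the dictionary encoding of $p$ is an arbitrary choice, not from the text); for simple random walk it reproduces the discrete Laplacian $\Delta_D$, and applying it to the discrete harmonic function $f(x^1, x^2) = (x^1)^2 - (x^2)^2$ returns zero.

    import numpy as np

    def generator_apply(f, y, p):
        """Compute L f(y) = sum_x p(x) [f(y + x) - f(y)] for a finitely supported p."""
        y = np.asarray(y, dtype=int)
        fy = f(tuple(y))
        return sum(prob * (f(tuple(y + np.asarray(x, dtype=int))) - fy)
                   for x, prob in p.items())

    # Simple random walk in Z^2: the generator is the discrete Laplacian Delta_D.
    p_srw = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}

    f = lambda z: z[0] ** 2 - z[1] ** 2       # discrete harmonic: Delta_D f = 0
    print(generator_apply(f, (3, 5), p_srw))  # prints 0.0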
Remark. We have defined the discrete Laplacian in the standard way for probability. In graph theory, the discrete Laplacian of $f$ is often defined to be
\[ 2d\, \Delta_D f(y) = \sum_{|x - y| = 1} [f(x) - f(y)]. \]
We can define
\[ \mathcal{L}f(y) = \sum_{x \in \mathbb{Z}^d} p(x)\, [f(x+y) - f(y)] \]
for any $p \in \mathcal{P}^*_d$. If $p$ is not symmetric, one often needs to consider
\[ \mathcal{L}^R f(y) = \sum_{x \in \mathbb{Z}^d} p(-x)\, [f(x+y) - f(y)]. \]
The $R$ stands for reversed; this is the generator for the random walk obtained by looking at the walk with time reversed.
The generator of a random walk is very closely related to the walk. We will write $E^x, P^x$ to denote expectations and probabilities for random walk (both discrete and continuous time) assuming that $S_0 = x$ or $\tilde S_0 = x$. Then, it is easy to check that
\[ \mathcal{L}f(y) = E^y[f(S_1)] - f(y) = \frac{d}{dt}\, E^y[f(\tilde S_t)]\,\Big|_{t=0}. \]
(In the continuous-time case, some restrictions on the growth of $f$ at infinity are needed.) Also, the transition probabilities $p_n(x), \tilde p_t(x)$ satisfy the following heat equations:
\[ p_{n+1}(x) - p_n(x) = \mathcal{L}p_n(x), \qquad \frac{d}{dt}\, \tilde p_t(x) = \mathcal{L}\tilde p_t(x). \]
The derivation of these equations uses the symmetry of $p$. For example, to derive the first, we write
\[ p_{n+1}(x) = \sum_{y \in \mathbb{Z}^d} P\{S_1 = y;\ S_{n+1} - S_1 = x - y\} = \sum_{y \in \mathbb{Z}^d} p(y)\, p_n(x-y) = \sum_{y \in \mathbb{Z}^d} p(y)\, p_n(x+y) = p_n(x) + \mathcal{L}p_n(x), \]
where the third equality uses the symmetry $p(y) = p(-y)$.
The generator $\mathcal{L}$ is also closely related to a second order differential operator. If $u \in \mathbb{R}^d$ is a unit vector, we write $\partial_u^2$ for the second partial derivative in the direction $u$. Let $\hat{\mathcal{L}}$ be the operator
\[ \hat{\mathcal{L}}f(y) = \frac{1}{2} \sum_{x \in V} \kappa(x)\, |x|^2\, \partial^2_{x/|x|} f(y). \]
In the case of simple random walk, $\hat{\mathcal{L}} = (2d)^{-1}\Delta$, where $\Delta$ denotes the usual Laplacian,
\[ \Delta f(x) = \sum_{j=1}^d \partial_{x^j x^j} f(x). \]
Taylor's theorem shows that there is a $c$ such that if $f : \mathbb{R}^d \to \mathbb{R}$ is $C^4$ and $y \in \mathbb{Z}^d$,
\[ |\mathcal{L}f(y) - \hat{\mathcal{L}}f(y)| \le c\, R^4\, M_4, \tag{1.4} \]
where $R$ is the range of the walk and $M_4 = M_4(f,y)$ is the maximal absolute value of a fourth derivative of $f$ for $|x - y| \le R$. If the covariance matrix $\Gamma$ is diagonalized,
\[ \Gamma x = \sum_{j=1}^d \sigma_j^2\, (x \cdot u_j)\, u_j, \]
where $u_1, \ldots, u_d$ is an orthonormal basis, then
\[ \hat{\mathcal{L}}f(y) = \frac{1}{2} \sum_{j=1}^d \sigma_j^2\, \partial^2_{u_j} f(y). \]
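The comparison (1.4) between $\mathcal{L}$ and $\hat{\mathcal{L}}$ is easy to observe numerically. The following is a minimal sketch of ours for simple random walk in $\mathbb{Z}^2$, where $\hat{\mathcal{L}} = \frac{1}{4}\Delta$; for a smooth test function the two values agree up to the fourth-derivative error.

    import numpy as np

    f = lambda x, y: np.sin(0.1 * x) * np.exp(0.05 * y)   # smooth test function

    def L_srw(fn, x, y):        # generator of simple random walk in Z^2 = discrete Laplacian
        return 0.25 * (fn(x + 1, y) + fn(x - 1, y) + fn(x, y + 1) + fn(x, y - 1)) - fn(x, y)

    def L_hat(fn, x, y, h=1e-3):    # (2d)^{-1} Delta f via centered finite differences
        d2x = (fn(x + h, y) - 2 * fn(x, y) + fn(x - h, y)) / h**2
        d2y = (fn(x, y + h) - 2 * fn(x, y) + fn(x, y - h)) / h**2
        return (d2x + d2y) / 4.0

    print(L_srw(f, 3, 2), L_hat(f, 3, 2))   # close, as predicted by (1.4)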
For future reference, we note that if $y \ne 0$,
\[ \hat{\mathcal{L}}\big[\log \mathcal{J}^*(y)^2\big] = \hat{\mathcal{L}}\big[\log \mathcal{J}(y)^2\big] = \hat{\mathcal{L}}\Big[ \log \sum_{j=1}^d \sigma_j^{-2}\, (y \cdot u_j)^2 \Big] = \frac{d-2}{\mathcal{J}^*(y)^2} = \frac{d-2}{d\, \mathcal{J}(y)^2}. \tag{1.5} \]
The estimate (1.4) uses the symmetry of $p$. If $p$ is mean zero and finite range, but not necessarily symmetric, we can relate its generator to a (purely) second order differential operator, but the error involves the third derivatives of $f$. This only requires $f$ to be $C^3$ and hence can be useful in the symmetric case as well.
1.6 Filtrations and strong Markov property

The basic property of a random walk is that the increments are independent and identically distributed. It is useful to set up a framework that allows more information at a particular time than just the value of the random walk. This will not affect the distribution of the random walk provided that this extra information is independent of the future increments of the walk.

A (discrete-time) filtration $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots$ is an increasing sequence of $\sigma$-algebras. If $p \in \mathcal{P}_d$, then we say that $S_n$ is a random walk with increment distribution $p$ with respect to $\{\mathcal{F}_n\}$ if:
for each $n$, $S_n$ is $\mathcal{F}_n$-measurable;
for each $n > 0$, $S_n - S_{n-1}$ is independent of $\mathcal{F}_{n-1}$ and $P\{S_n - S_{n-1} = x\} = p(x)$.
Similarly, we define a (right continuous, continuous-time) filtration to be an increasing collection of $\sigma$-algebras $\{\mathcal{F}_t\}$ satisfying $\mathcal{F}_t = \bigcap_{\epsilon > 0} \mathcal{F}_{t+\epsilon}$. If $p \in \mathcal{P}_d$, then we say that $\tilde S_t$ is a continuous-time random walk with increment distribution $p$ with respect to $\{\mathcal{F}_t\}$ if:
for each $t$, $\tilde S_t$ is $\mathcal{F}_t$-measurable;
for each $s < t$, $\tilde S_t - \tilde S_s$ is independent of $\mathcal{F}_s$ and $P\{\tilde S_t - \tilde S_s = x\} = \tilde p_{t-s}(x)$.
Suppose $T$ is a random variable that is independent of the walk; then we can add information about $T$ to the filtration and still retain the properties of the random walk. We will describe one example of this in detail here; later on, we will do similar adding of information without being explicit. Suppose $T$ has an exponential distribution with parameter $\lambda$, i.e., $P\{T > t\} = e^{-\lambda t}$. Let $\mathcal{F}'_n$ denote the $\sigma$-algebra generated by $\mathcal{F}_n$ and the events $\{T \le t\}$ for $t \le n$. Then $\{\mathcal{F}'_n\}$ is a filtration, and $S_n$ is a random walk with respect to $\{\mathcal{F}'_n\}$. Also, given $\mathcal{F}'_n$, then on the event $\{T > n\}$, the random variable $T - n$ has an exponential distribution with parameter $\lambda$. We can do similarly for the continuous-time walk $\tilde S_t$.
We will discuss stopping times and the strong Markov property. We will only do the slightly more difficult continuous-time case, leaving the discrete-time analogue to the reader. If $\{\mathcal{F}_t\}$ is a filtration, then a stopping time with respect to $\{\mathcal{F}_t\}$ is a $[0,\infty]$-valued random variable $\tau$ such that for each $t$, $\{\tau \le t\} \in \mathcal{F}_t$. Associated to the stopping time $\tau$ is a $\sigma$-algebra $\mathcal{F}_\tau$ consisting of all events $A$ such that for each $t$, $A \cap \{\tau \le t\} \in \mathcal{F}_t$. (It is straightforward to check that the set of such $A$ is a $\sigma$-algebra.)
Theorem 1.6.1 (Strong Markov Property) Suppose $\tilde S_t$ is a continuous-time random walk with increment distribution $p$ with respect to the filtration $\{\mathcal{F}_t\}$. Suppose $\tau$ is a stopping time with respect to the process. Then on the event $\{\tau < \infty\}$ the process
\[ Y_t = \tilde S_{t+\tau} - \tilde S_\tau \]
is a continuous-time random walk with increment distribution $p$, independent of $\mathcal{F}_\tau$.
Proof (sketch) We will assume for ease that $P\{\tau < \infty\} = 1$. Note that with probability one $Y_t$ has right-continuous paths. We first suppose that there exists $0 = t_0 < t_1 < t_2 < \ldots$ such that with probability one $\tau \in \{t_0, t_1, \ldots\}$. Then, the result can be derived immediately, by considering the countable collection of events $\{\tau = t_j\}$. For more general $\tau$, let $\tau_n$ be the smallest dyadic rational $l/2^n$ that is greater than $\tau$. Then, $\tau_n$ is a stopping time and the result holds for $\tau_n$. But,
\[ Y_t = \lim_{n \to \infty} \big[ \tilde S_{t+\tau_n} - \tilde S_{\tau_n} \big]. \]
We will use the strong Markov property throughout this book, often without being explicit about its use.
Proposition 1.6.2 (Reflection Principle) Suppose $S_n$ (resp., $\tilde S_t$) is a random walk (resp., continuous-time random walk) with increment distribution $p \in \mathcal{P}_d$ starting at the origin.
(a) If $u \in \mathbb{R}^d$ is a unit vector and $b > 0$,
\[ P\Big\{ \max_{0 \le j \le n} S_j \cdot u \ge b \Big\} \le 2\, P\{S_n \cdot u \ge b\}, \qquad P\Big\{ \sup_{s \le t} \tilde S_s \cdot u \ge b \Big\} \le 2\, P\{\tilde S_t \cdot u \ge b\}. \]
(b) If $b > 0$,
\[ P\Big\{ \max_{0 \le j \le n} |S_j| \ge b \Big\} \le 2\, P\{|S_n| \ge b\}, \qquad P\Big\{ \sup_{0 \le s \le t} |\tilde S_s| \ge b \Big\} \le 2\, P\{|\tilde S_t| \ge b\}. \]
Proof We will do the continuous-time case. To prove (a), fix $t > 0$ and a unit vector $u$ and let $A_n = A_{n,t,b}$ be the event
\[ A_n = \Big\{ \max_{j = 1, \ldots, 2^n} \tilde S_{j t 2^{-n}} \cdot u \ge b \Big\}. \]
The events $A_n$ are increasing in $n$ and right continuity implies that w.p.1,
\[ \lim_{n \to \infty} A_n = \Big\{ \sup_{s \le t} \tilde S_s \cdot u \ge b \Big\}. \]
Hence, it suffices to show that for each $n$, $P(A_n) \le 2\, P\{\tilde S_t \cdot u \ge b\}$. Let $\tau = \tau_{n,t,b}$ be the smallest $j$ such that $\tilde S_{j t 2^{-n}} \cdot u \ge b$. Note that
\[ \bigcup_{j=1}^{2^n} \Big\{ \tau = j;\ \big(\tilde S_t - \tilde S_{j t 2^{-n}}\big) \cdot u \ge 0 \Big\} \subset \{\tilde S_t \cdot u \ge b\}. \]
Since $p \in \mathcal{P}$, symmetry implies that for all $t$, $P\{\tilde S_t \cdot u \ge 0\} \ge 1/2$. Therefore, using independence, $P\{\tau = j;\ (\tilde S_t - \tilde S_{j t 2^{-n}}) \cdot u \ge 0\} \ge (1/2)\, P\{\tau = j\}$, and hence
\[ P\{\tilde S_t \cdot u \ge b\} \ge \sum_{j=1}^{2^n} P\Big\{ \tau = j;\ \big(\tilde S_t - \tilde S_{j t 2^{-n}}\big) \cdot u \ge 0 \Big\} \ge \frac{1}{2} \sum_{j=1}^{2^n} P\{\tau = j\} = \frac{1}{2}\, P(A_n). \]
Part (b) is done similarly, by letting $\tau$ be the smallest $j$ with $|\tilde S_{j t 2^{-n}}| \ge b$ and writing
\[ \bigcup_{j=1}^{2^n} \Big\{ \tau = j;\ \big(\tilde S_t - \tilde S_{j t 2^{-n}}\big) \cdot \tilde S_{j t 2^{-n}} \ge 0 \Big\} \subset \{|\tilde S_t| \ge b\}. \]
Remark. The only fact about the distribution p that is used in the proof is that it is symmetric
about the origin.
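For readers who like to see such bounds in action, here is a small Monte Carlo check of part (b) for one-dimensional simple random walk (our own sketch; the parameters are arbitrary).

    import numpy as np

    # Proposition 1.6.2(b) asserts P{max_j |S_j| >= b} <= 2 P{|S_n| >= b}.
    rng = np.random.default_rng(0)
    n, b, trials = 100, 15, 200_000
    steps = rng.choice([-1, 1], size=(trials, n))
    paths = np.cumsum(steps, axis=1)
    lhs = np.mean(np.max(np.abs(paths), axis=1) >= b)
    rhs = 2 * np.mean(np.abs(paths[:, -1]) >= b)
    print(lhs, rhs, lhs <= rhs)      # the inequality holds up to Monte Carlo noise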
1.7 A word about constants

Throughout this book $c$ will denote a positive constant that depends on the dimension $d$ and the increment distribution $p$ but does not depend on any other constants. We write
\[ f(n,x) = g(n,x) + O(h(n)) \]
to mean that there exists a constant $c$ such that for all $n$,
\[ |f(n,x) - g(n,x)| \le c\, |h(n)|. \]
Similarly, we write
\[ f(n,x) = g(n,x) + o(h(n)) \]
if for every $\epsilon > 0$ there is an $N$ such that
\[ |f(n,x) - g(n,x)| \le \epsilon\, |h(n)|, \qquad n \ge N. \]
Note that implicit in the definition is the fact that $c, N$ can be chosen uniformly for all $x$. If $f, g$ are positive functions, we will write
\[ f(n,x) \asymp g(n,x), \qquad n \to \infty, \]
if there exists a $c$ (again, independent of $x$) such that for all $n, x$,
\[ c^{-1}\, g(n,x) \le f(n,x) \le c\, g(n,x). \]
We will write similarly for asymptotics of $f(t,x)$ as $t \to 0$.
As an example, let $f(z) = \log(1-z)$, $|z| < 1$, where $\log$ denotes the branch of the complex logarithm function with $\log 1 = 0$. Then $f$ is analytic in the unit disk with Taylor series expansion
\[ \log(1-z) = -\sum_{j=1}^\infty \frac{z^j}{j}. \]
By the remainder estimate, for every $\epsilon > 0$,
\[ \Big| \log(1-z) + \sum_{j=1}^k \frac{z^j}{j} \Big| \le \frac{|z|^{k+1}}{\epsilon^k\,(k+1)}, \qquad |z| \le 1 - \epsilon. \]
For a fixed value of $k$ we can write this as
\[ \log(1-z) = -\Big( \sum_{j=1}^k \frac{z^j}{j} \Big) + O(|z|^{k+1}), \qquad |z| \le 1/2, \tag{1.6} \]
or
\[ \log(1-z) = -\Big( \sum_{j=1}^k \frac{z^j}{j} \Big) + O_\epsilon(|z|^{k+1}), \qquad |z| \le 1 - \epsilon, \tag{1.7} \]
where we write $O_\epsilon$ to indicate that the constant in the error term depends on $\epsilon$.
d
, then there exists a nite set x
1
, . . . , x
k
such that:
p(x
j
) > 0, j = 1, . . . , k,
For every y Z
d
, there exist (strictly) positive integers n
1
, . . . , n
k
with
n
1
x
1
+ +n
k
x
k
= y. (1.8)
(Hint: rst write each unit vector e
j
in the above form with perhaps dierent sets
x
1
, . . . , x
k
. Then add the equations together.)
Use this to show that there exist > 0, q T
d
, q
d
such that q has nite support and
p = q + (1 ) q
.
Note that (1.8) is used with y = 0 to guarantee that q has zero mean.
Exercise 1.4 Suppose that $S_n = X_1 + \cdots + X_n$ where $X_1, X_2, \ldots$ are independent $\mathbb{R}^d$-valued random variables with mean zero and covariance matrix $\Gamma$. Show that
\[ M_n := |S_n|^2 - (\operatorname{tr}\Gamma)\, n \]
is a martingale.

Exercise 1.5 Suppose that $p \in \mathcal{P}_d \cup \mathcal{P}'_d$ with covariance matrix $\Gamma = \Lambda\Lambda^T$ and $S_n$ is the corresponding random walk. Show that
\[ M_n := \mathcal{J}(S_n)^2 - n \]
is a martingale.
Exercise 1.6 Let $L$ be a 2-dimensional lattice contained in $\mathbb{R}^d$ and suppose $x_1, x_2 \in L$ are points such that
\[ |x_1| = \min\{|x| : x \in L \setminus \{0\}\}, \qquad |x_2| = \min\{|x| : x \in L \setminus \{j x_1 : j \in \mathbb{Z}\}\}. \]
Show that
\[ L = \{j_1 x_1 + j_2 x_2 : j_1, j_2 \in \mathbb{Z}\}. \]
You may wish to compare this to the remark after Proposition 1.3.1.
Exercise 1.7 Let $S^1_n, S^2_n$ be independent simple random walks in $\mathbb{Z}$ and let
\[ Y_n = \Big( \frac{S^1_n + S^2_n}{2},\ \frac{S^1_n - S^2_n}{2} \Big). \]
Show that $Y_n$ is a simple random walk in $\mathbb{Z}^2$.
Exercise 1.8 Suppose $S_n$ is a random walk with increment distribution $p \in \mathcal{P} \cup \mathcal{P}'$. Show that there exists an $\epsilon > 0$ such that for every unit vector $u \in \mathbb{R}^d$, $P\{S_1 \cdot u \ge \epsilon\} \ge \epsilon$.
2 Local Central Limit Theorem

2.1 Introduction
If $X_1, X_2, \ldots$ are independent, identically distributed random variables in $\mathbb{R}$ with mean zero and variance $\sigma^2$, then the central limit theorem (CLT) states that the distribution of
\[ \frac{X_1 + \cdots + X_n}{\sqrt{n}} \tag{2.1} \]
approaches that of a normal distribution with mean zero and variance $\sigma^2$. In other words, for $-\infty < r < s < \infty$,
\[ \lim_{n\to\infty} P\Big\{ r \le \frac{X_1 + \cdots + X_n}{\sqrt n} \le s \Big\} = \int_r^s \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{y^2}{2\sigma^2}}\, dy. \]
If $p \in \mathcal{P}_1$ is aperiodic with variance $\sigma^2$, we can use this to motivate the following approximation:
\[ p_n(k) = P\{S_n = k\} = P\Big\{ \frac{k}{\sqrt n} \le \frac{S_n}{\sqrt n} < \frac{k+1}{\sqrt n} \Big\} \approx \int_{k/\sqrt n}^{(k+1)/\sqrt n} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{y^2}{2\sigma^2}}\, dy \approx \frac{1}{\sqrt{2\pi\sigma^2 n}}\, \exp\Big\{ -\frac{k^2}{2\sigma^2 n} \Big\}. \]
Similarly, if $p \in \mathcal{P}_1$ is bipartite, we can conjecture that
\[ p_n(k) + p_n(k+1) \approx \int_{k/\sqrt n}^{(k+2)/\sqrt n} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{y^2}{2\sigma^2}}\, dy \approx \frac{2}{\sqrt{2\pi\sigma^2 n}}\, \exp\Big\{ -\frac{k^2}{2\sigma^2 n} \Big\}. \]
The local central limit theorem (LCLT) justifies this approximation.
One gets a better approximation by writing
\[ P\{S_n = k\} = P\Big\{ \frac{k - \frac12}{\sqrt n} \le \frac{S_n}{\sqrt n} < \frac{k + \frac12}{\sqrt n} \Big\} \approx \int_{(k-\frac12)/\sqrt n}^{(k+\frac12)/\sqrt n} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{y^2}{2\sigma^2}}\, dy. \]
If $p \in \mathcal{P}_d$ with covariance matrix $\Gamma = \Lambda\Lambda^T$, then the normalized sums (2.1) approach a joint normal random variable with covariance matrix $\Gamma$, i.e., a random variable with density
\[ f(x) = \frac{1}{(2\pi)^{d/2}\, (\det\Lambda)}\; e^{-|\Lambda^{-1}x|^2/2} = \frac{1}{(2\pi)^{d/2}\, \sqrt{\det\Gamma}}\; e^{-(x\cdot\Gamma^{-1}x)/2}. \]
(See Section 12.3 for a review of the joint normal distribution.) A similar heuristic argument can be given for $p_n(x)$. Recall from (1.1) that $\mathcal{J}^*(x)^2 = x \cdot \Gamma^{-1} x$. Let $\bar p_n(x)$ denote the estimate of $p_n(x)$ that one obtains by the central limit theorem argument,
\[ \bar p_n(x) = \frac{1}{(2\pi n)^{d/2}\, \sqrt{\det\Gamma}}\; e^{-\frac{\mathcal{J}^*(x)^2}{2n}} = \frac{1}{(2\pi)^d\, n^{d/2}} \int_{\mathbb{R}^d} e^{-i\frac{s\cdot x}{\sqrt n}}\, e^{-\frac{s\cdot\Gamma s}{2}}\, ds. \tag{2.2} \]
The second equality is a straightforward computation, see (12.14). We define $\bar p_t(x)$ for real $t > 0$ in the same way. The LCLT states that for large $n$, $p_n(x)$ is approximately $\bar p_n(x)$. To be more precise, we will say that an aperiodic $p$ satisfies the LCLT if
\[ \lim_{n\to\infty} n^{d/2} \sup_{x\in\mathbb{Z}^d} |p_n(x) - \bar p_n(x)| = 0. \]
A bipartite $p$ satisfies the LCLT if
\[ \lim_{n\to\infty} n^{d/2} \sup_{x\in\mathbb{Z}^d} |p_n(x) + p_{n+1}(x) - 2\bar p_n(x)| = 0. \]
In this weak form of the LCLT we have not made any estimate of the error term $|p_n(x) - \bar p_n(x)|$ other than that it goes to zero faster than $n^{-d/2}$ uniformly in $x$. Note that $\bar p_n(x)$ is bounded by $c\, n^{-d/2}$ uniformly in $x$. This is the correct order of magnitude for $|x|$ of order $\sqrt n$ but $\bar p_n(x)$ is much smaller for larger $|x|$. We will prove a LCLT for any mean zero distribution with finite second moment. However, the LCLT we state now for $p \in \mathcal{P}_d$ includes error estimates that do not hold for all $p \in \mathcal{P}'_d$.
Theorem 2.1.1 (Local Central Limit Theorem) If $p \in \mathcal{P}_d$ is aperiodic, and $\bar p_n(x)$ is as defined in (2.2), then there is a $c$ and for every integer $k \ge 4$ there is a $c(k) < \infty$ such that for all integers $n > 0$ and $x \in \mathbb{Z}^d$ the following hold, where $z = x/\sqrt n$:
\[ |p_n(x) - \bar p_n(x)| \le \frac{c(k)}{n^{(d+2)/2}} \Big[ (|z|^k + 1)\, e^{-\frac{\mathcal{J}(z)^2}{2}} + \frac{1}{n^{(k-3)/2}} \Big], \tag{2.3} \]
\[ |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d+2)/2}\, |z|^2}. \tag{2.4} \]
We will prove this result in a number of steps in Section 2.3. Before doing so, let us consider what the theorem states. Plugging $k = 4$ into (2.3) implies that
\[ |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d+2)/2}}. \tag{2.5} \]
For typical $x$ with $|x| \le \sqrt n$, $\bar p_n(x) \asymp n^{-d/2}$. Hence (2.5) implies
\[ p_n(x) = \bar p_n(x)\, \Big[ 1 + O\Big( \frac{1}{n} \Big) \Big], \qquad |x| \le \sqrt n. \]
The error term in (2.5) is uniform over $x$, but as $|x|$ grows, the ratio between the error term and $\bar p_n(x)$ grows. The inequalities (2.3) and (2.4) are improvements on the error term for $|x| \ge \sqrt n$.
Since $\bar p_n(x) \asymp n^{-d/2}\, e^{-\mathcal{J}(x)^2/2n}$, (2.3) implies
\[ p_n(x) = \bar p_n(x) \Big[ 1 + \frac{O_k(|x/\sqrt n|^k)}{n} \Big] + O_k\Big( \frac{1}{n^{(d+k-1)/2}} \Big), \qquad |x| \ge \sqrt n, \]
where we write $O_k$ to emphasize that the constant in the error term depends on $k$.

An even better improvement is established in Section 2.3.1 where it is shown that
\[ p_n(x) = \bar p_n(x)\, \exp\Big\{ O\Big( \frac{1}{n} + \frac{|x|^4}{n^3} \Big) \Big\}, \qquad |x| < \epsilon n. \]
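The quality of the approximation $p_n(x) \approx \bar p_n(x)$ is easy to examine numerically. The sketch below (ours, not from the text) computes the exact $n$-step distribution of the aperiodic walk $p(\pm 1) = 1/4$, $p(0) = 1/2$ on $\mathbb{Z}$ by repeated convolution and compares it with the Gaussian estimate; scaling the maximal error by $n^{3/2}$ stays bounded, consistent with (2.5) for $d = 1$.

    import numpy as np

    n, sigma2 = 200, 0.5                       # this walk has variance sigma^2 = 1/2
    support = np.arange(-n, n + 1)

    # Exact n-step distribution by repeated convolution of p with itself.
    p3 = np.array([0.25, 0.5, 0.25])
    pn = np.zeros(2 * n + 1); pn[n] = 1.0
    for _ in range(n):
        pn = np.convolve(pn, p3)[1:2 * n + 2]  # re-center on the window [-n, n]

    # CLT estimate  \bar p_n(k) = (2 pi sigma^2 n)^{-1/2} exp(-k^2 / (2 sigma^2 n)).
    pbar = np.exp(-support**2 / (2 * sigma2 * n)) / np.sqrt(2 * np.pi * sigma2 * n)
    print(np.max(np.abs(pn - pbar)) * n**1.5)  # bounded in n, consistent with (2.5)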
Although Theorem 2.1.1 is not as useful for atypical $x$, simple large deviation results as given in the next propositions often suffice to estimate probabilities.
Proposition 2.1.2
Suppose $p \in \mathcal{P}'_d$ and $S_n$ is a $p$-walk starting at the origin. Suppose $k$ is a positive integer such that $E[|X_1|^{2k}] < \infty$. There exists $c < \infty$ such that for all $s > 0$,
\[ P\Big\{ \max_{0 \le j \le n} |S_j| \ge s\sqrt n \Big\} \le c\, s^{-2k}. \tag{2.6} \]
Suppose $p \in \mathcal{P}_d$ and $S_n$ is a $p$-walk starting at the origin. There exist $\beta > 0$ and $c < \infty$ such that for all $n$ and all $s > 0$,
\[ P\Big\{ \max_{0 \le j \le n} |S_j| \ge s\sqrt n \Big\} \le c\, e^{-\beta s^2}. \tag{2.7} \]

Proof It suffices to prove the results for one-dimensional walks. See Corollaries 12.2.6 and 12.2.7.
The statement of the LCLT given here is stronger than is needed for many applications. For example, to determine whether the random walk is recurrent or transient, we only need the following corollary. If $p \in \mathcal{P}_d$ is aperiodic, then there exist $0 < c_1 < c_2 < \infty$ such that for all $x$, $p_n(x) \le c_2\, n^{-d/2}$, and for $|x| \le \sqrt n$, $p_n(x) \ge c_1\, n^{-d/2}$. The exponent $d/2$ is important to remember and can be understood easily. In $n$ steps, the random walk tends to go distance $\sqrt n$. In $\mathbb{Z}^d$, there are of order $n^{d/2}$ points within distance $\sqrt n$ of the origin. Therefore, the probability of being at a particular point should be of order $n^{-d/2}$.
The proof of Theorem 2.1.1 in Section 2.2 will use the characteristic function. We discuss LCLTs for $p \in \mathcal{P}'_d$, where, as before, $\mathcal{P}'_d$ denotes the set of aperiodic, irreducible increment distributions $p$ in $\mathbb{Z}^d$ with mean zero and finite second moment. In the proof of Theorem 2.1.1, we will see that we do not need to assume that the increments are bounded. For fixed $k \ge 4$, (2.3) holds for $p \in \mathcal{P}'_d$ provided that $E[|X|^{k+1}] < \infty$ and the third moments of $p$ vanish. The inequalities (2.5) and (2.4) need only finite fourth moments and vanishing third moments. If $p \in \mathcal{P}'_d$ has finite third moments that are nonzero, we can prove a weaker version of (2.3). Suppose $k \ge 3$, and $E[|X_1|^{k+1}] < \infty$. There exists $c(k) < \infty$ such that
\[ |p_n(x) - \bar p_n(x)| \le \frac{c(k)}{n^{(d+1)/2}} \Big[ (|z|^k + 1)\, e^{-\frac{\mathcal{J}(z)^2}{2}} + \frac{1}{n^{(k-2)/2}} \Big]. \]
Also, for any $p \in \mathcal{P}'_d$ with $E[|X_1|^3] < \infty$,
\[ |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d+1)/2}}, \qquad |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d-1)/2}\, |x|^2}. \]
We focus our discussion in Section 2.2 on aperiodic, discrete-time walks, but the next theorem shows that we can deduce the results for bipartite and continuous-time walks from the LCLT for aperiodic, discrete-time walks. We state the analogue of (2.3); the analogue of (2.4) can be proved similarly.

Theorem 2.1.3 If $p \in \mathcal{P}_d$ and $\bar p_n(x)$ is as defined in (2.2), then for every $k \ge 4$ there is a $c = c(k) < \infty$ such that the following holds for all $x \in \mathbb{Z}^d$.
If $n$ is a positive integer and $z = x/\sqrt n$, then
\[ |p_n(x) + p_{n+1}(x) - 2\,\bar p_n(x)| \le \frac{c}{n^{(d+2)/2}} \Big[ (|z|^k + 1)\, e^{-\mathcal{J}(z)^2/2} + \frac{1}{n^{(k-3)/2}} \Big]. \tag{2.8} \]
If $t > 0$ and $z = x/\sqrt t$,
\[ |\tilde p_t(x) - \bar p_t(x)| \le \frac{c}{t^{(d+2)/2}} \Big[ (|z|^k + 1)\, e^{-\mathcal{J}(z)^2/2} + \frac{1}{t^{(k-3)/2}} \Big]. \tag{2.9} \]
Proof (assuming Theorem 2.1.1) We only sketch the proof. If $p \in \mathcal{P}_d$ is bipartite, then $\hat S_n := S_{2n}$ is an aperiodic walk on the lattice $(\mathbb{Z}^d)_e$. We can establish the result for $\hat S_n$ by mapping $(\mathbb{Z}^d)_e$ to $\mathbb{Z}^d$ as described in Section 1.3. This gives the asymptotics for $p_{2n}(x)$, $x \in (\mathbb{Z}^d)_e$, and for $x \in (\mathbb{Z}^d)_o$ we know that
\[ p_{2n+1}(x) = \sum_{y\in\mathbb{Z}^d} p_{2n}(x-y)\, p(y). \]
The continuous-time walk viewed at integer times is the discrete-time walk with increment distribution $\hat p = \tilde p_1$. Since $\hat p$ satisfies all the moment conditions, (2.3) holds for $\tilde p_n(x)$, $n = 0, 1, 2, \ldots$. If $0 < t < 1$, we can write
\[ \tilde p_{n+t}(x) = \sum_{y\in\mathbb{Z}^d} \tilde p_n(x-y)\, \tilde p_t(y), \]
and deduce the result for all $t$.
2.2 Characteristic Functions and LCLT

2.2.1 Characteristic functions of random variables in $\mathbb{R}^d$

One of the most useful tools for studying the distribution of the sums of independent random variables is the characteristic function. If $X = (X^1, \ldots, X^d)$ is a random variable in $\mathbb{R}^d$, then its characteristic function $\phi = \phi_X$ is the function from $\mathbb{R}^d$ into $\mathbb{C}$ given by
\[ \phi(\theta) = E[\exp\{i\,\theta\cdot X\}]. \]
Proposition 2.2.1 Suppose $X = (X^1, \ldots, X^d)$ is a random variable in $\mathbb{R}^d$ with characteristic function $\phi$.
(a) $\phi$ is a uniformly continuous function with $\phi(0) = 1$ and $|\phi(\theta)| \le 1$ for all $\theta \in \mathbb{R}^d$.
(b) If $\theta \in \mathbb{R}^d$, then $\phi_{X,\theta}(s) := \phi(s\theta)$ is the characteristic function of the one-dimensional random variable $X \cdot \theta$.
(c) Suppose $d = 1$ and $m$ is a positive integer with $E[|X|^m] < \infty$. Then $\phi(s)$ is a $C^m$ function of $s$; in fact,
\[ \phi^{(m)}(s) = i^m\, E[X^m\, e^{isX}]. \]
(d) If $m$ is a positive integer, $E[|X|^m] < \infty$, and $|u| = 1$, then
\[ \Big| \phi(su) - \sum_{j=0}^{m-1} \frac{i^j\, E[(X\cdot u)^j]}{j!}\, s^j \Big| \le \frac{E[|X\cdot u|^m]}{m!}\, |s|^m. \]
(e) If $X_1, X_2, \ldots, X_n$ are independent random variables in $\mathbb{R}^d$, with characteristic functions $\phi_{X_1}, \ldots, \phi_{X_n}$, then
\[ \phi_{X_1 + \cdots + X_n}(\theta) = \phi_{X_1}(\theta) \cdots \phi_{X_n}(\theta). \]
In particular, if $X_1, X_2, \ldots$ are independent, identically distributed with the same distribution as $X$, then the characteristic function of $S_n = X_1 + \cdots + X_n$ is given by
\[ \phi_{S_n}(\theta) = [\phi(\theta)]^n. \]
Proof To prove uniform continuity, note that
\[ |\phi(\theta_1 + \theta) - \phi(\theta)| = \big| E[e^{iX\cdot(\theta_1+\theta)} - e^{iX\cdot\theta}] \big| \le E[|e^{iX\cdot\theta_1} - 1|], \]
and the dominated convergence theorem implies that
\[ \lim_{\theta_1 \to 0} E[|e^{iX\cdot\theta_1} - 1|] = 0. \]
The other statements in (a) and (b) are immediate. Part (c) is derived by differentiating; the condition $E[|X|^m] < \infty$ is needed to justify the differentiation using the dominated convergence theorem (details omitted). Part (d) follows from (b), (c), and Taylor's theorem with remainder. Part (e) is immediate from the product rule for expectations of independent random variables.
We will write $P_m(\theta)$ for the $m$-th order Taylor series approximation of $\phi$ about the origin. Then the last proposition implies that if $E[|X|^m] < \infty$, then
\[ \phi(\theta) = P_m(\theta) + o(|\theta|^m), \qquad \theta \to 0. \tag{2.10} \]
Note that if $E[X] = 0$ and $E[|X|^2] < \infty$, then
\[ P_2(\theta) = 1 - \frac{1}{2} \sum_{j=1}^d \sum_{k=1}^d E[X^j X^k]\, \theta^j \theta^k = 1 - \frac{\theta\cdot\Gamma\theta}{2} = 1 - \frac{E[(X\cdot\theta)^2]}{2}. \]
Here $\Gamma$ denotes the covariance matrix for $X$. If $E[|X|^m] < \infty$, we write
\[ P_m(\theta) = 1 - \frac{\theta\cdot\Gamma\theta}{2} + \sum_{j=3}^m q_j(\theta), \tag{2.11} \]
where $q_j$ are homogeneous polynomials of degree $j$ determined by the moments of $X$. If all the third moments of $X$ exist and equal zero, $q_3 \equiv 0$. If $X$ has a symmetric distribution, then $q_j \equiv 0$ for all odd $j$ for which $E[|X|^j] < \infty$.
2.2.2 Characteristic functions of random variables in $\mathbb{Z}^d$

If $X = (X^1, \ldots, X^d)$ is a $\mathbb{Z}^d$-valued random variable, then its characteristic function has period $2\pi$ in each variable, i.e., if $k_1, \ldots, k_d$ are integers,
\[ \phi(\theta^1, \ldots, \theta^d) = \phi(\theta^1 + 2k_1\pi, \ldots, \theta^d + 2k_d\pi). \]
The characteristic function determines the distribution of $X$; in fact, the next proposition gives a simple inversion formula. Here, and for the remainder of this section, we will write $d\theta$ for $d\theta^1 \cdots d\theta^d$.

Proposition 2.2.2 If $X = (X^1, \ldots, X^d)$ is a $\mathbb{Z}^d$-valued random variable with characteristic function $\phi$, then for every $x \in \mathbb{Z}^d$,
\[ P\{X = x\} = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \phi(\theta)\, e^{-i\theta\cdot x}\, d\theta. \]
Proof Since
\[ \phi(\theta) = E[e^{i\theta\cdot X}] = \sum_{y\in\mathbb{Z}^d} e^{i\theta\cdot y}\, P\{X = y\}, \]
we get
\[ \int_{[-\pi,\pi]^d} \phi(\theta)\, e^{-i\theta\cdot x}\, d\theta = \sum_{y\in\mathbb{Z}^d} P\{X = y\} \int_{[-\pi,\pi]^d} e^{i\theta\cdot(y-x)}\, d\theta. \]
(The dominated convergence theorem justifies the interchange of the sum and the integral.) But, if $x, y \in \mathbb{Z}^d$,
\[ \int_{[-\pi,\pi]^d} e^{i\theta\cdot(y-x)}\, d\theta = \begin{cases} (2\pi)^d, & y = x, \\ 0, & y \ne x. \end{cases} \]
Corollary 2.2.3 Suppose $X_1, X_2, \ldots$ are independent, identically distributed random variables in $\mathbb{Z}^d$ with characteristic function $\phi$. Let $S_n = X_1 + \cdots + X_n$. Then, for all $x \in \mathbb{Z}^d$,
\[ P\{S_n = x\} = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \phi^n(\theta)\, e^{-i\theta\cdot x}\, d\theta. \]
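Corollary 2.2.3 can be checked numerically by approximating the inversion integral on a grid. The following sketch is ours (simple random walk in $\mathbb{Z}^2$, whose characteristic function is $\phi(\theta) = (\cos\theta^1 + \cos\theta^2)/2$); the Riemann-sum average over $[-\pi,\pi]^2$ approximates $P\{S_n = x\}$.

    import numpy as np

    def p_n_by_inversion(n, x, grid=400):
        t = np.linspace(-np.pi, np.pi, grid, endpoint=False)
        t1, t2 = np.meshgrid(t, t)
        phi = 0.5 * (np.cos(t1) + np.cos(t2))
        integrand = phi**n * np.exp(-1j * (t1 * x[0] + t2 * x[1]))
        # mean over the grid cells = integral / (2 pi)^2, which is exactly P{S_n = x}
        return np.real(integrand.mean())

    print(p_n_by_inversion(10, (2, 0)))   # compare with direct enumeration of paths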
2.3 LCLT characteristic function approach

In some sense, Corollary 2.2.3 completely solves the problem of determining the distribution of a random walk at a particular time $n$. However, the integral is generally hard to evaluate and estimation of oscillatory integrals is tricky. Fortunately, we can use this corollary as a starting point for deriving the local central limit theorem. We will consider $p \in \mathcal{P}'_d$ and let $\phi$ denote the characteristic function of the increment distribution,
\[ \phi(\theta) = \sum_{x\in\mathbb{Z}^d} e^{i\theta\cdot x}\, p(x). \]
We have noted that the characteristic function of $S_n$ is $\phi^n$.
Lemma 2.3.1 The characteristic function of $\tilde S_t$ is
\[ \phi_{\tilde S_t}(\theta) = \exp\{t\,[\phi(\theta) - 1]\}. \]

Proof Since $\tilde S_t$ has the same distribution as $S_{N_t}$ where $N_t$ is an independent Poisson process with parameter 1, we get
\[ \phi_{\tilde S_t}(\theta) = E[e^{i\theta\cdot\tilde S_t}] = \sum_{j=0}^\infty e^{-t}\, \frac{t^j}{j!}\, E[e^{i\theta\cdot S_j}] = \sum_{j=0}^\infty e^{-t}\, \frac{t^j}{j!}\, \phi(\theta)^j = \exp\{t\,[\phi(\theta) - 1]\}. \]
Corollary 2.2.3 gives the formulas
\[ p_n(x) = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \phi^n(\theta)\, e^{-i\theta\cdot x}\, d\theta, \tag{2.12} \]
\[ \tilde p_t(x) = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} e^{t[\phi(\theta)-1]}\, e^{-i\theta\cdot x}\, d\theta. \]
Lemma 2.3.2 Suppose $p \in \mathcal{P}'_d$.
(a) For every $\epsilon > 0$,
\[ \sup\big\{ |\phi(\theta)| : \theta \in [-\pi,\pi]^d,\ |\theta| \ge \epsilon \big\} < 1. \]
(b) There is a $b > 0$ such that for all $\theta \in [-\pi,\pi]^d$,
\[ |\phi(\theta)| \le 1 - b\,|\theta|^2. \tag{2.13} \]
In particular, for all $\theta \in [-\pi,\pi]^d$ and $r > 0$,
\[ |\phi(\theta)|^r \le \big[ 1 - b\,|\theta|^2 \big]^r \le \exp\big\{ -b\,r\,|\theta|^2 \big\}. \tag{2.14} \]
Proof By continuity and compactness, to prove (a) it suffices to prove that $|\phi(\theta)| < 1$ for all $\theta \in [-\pi,\pi]^d \setminus \{0\}$. To see this, suppose that $|\phi(\theta)| = 1$. Then $|\phi(\theta)^n| = 1$ for all positive integers $n$. Since
\[ \phi(\theta)^n = \sum_{z\in\mathbb{Z}^d} p_n(z)\, e^{i\theta\cdot z}, \]
and for each fixed $z$, $p_n(z) > 0$ for all sufficiently large $n$, we see that $e^{i\theta\cdot z} = 1$ for all $z \in \mathbb{Z}^d$. (Here we use the fact that if $w_1, w_2, \ldots \in \mathbb{C}$ with $|w_1 + w_2 + \cdots| = 1$ and $|w_1| + |w_2| + \cdots = 1$, then there is a $\psi$ such that $w_j = r_j\, e^{i\psi}$ with $r_j \ge 0$.) The only $\theta \in [-\pi,\pi]^d$ that satisfies this is $\theta = 0$. Using (a), it suffices to prove (2.13) in a neighborhood of the origin, and this follows from the second-order Taylor series expansion (2.10).
The last lemma does not hold for bipartite $p$. For example, for simple random walk, $\phi(\pi, \pi, \ldots, \pi) = -1$.
In order to illustrate the proof of the local central limit theorem using the characteristic function, we will consider the one-dimensional case with $p(1) = p(-1) = 1/4$ and $p(0) = 1/2$. Note that this increment distribution is the same as the two-step distribution of (1/2 times) the usual simple random walk. The characteristic function for $p$ is
\[ \phi(\theta) = \frac{1}{2} + \frac{1}{4}\, e^{i\theta} + \frac{1}{4}\, e^{-i\theta} = \frac{1}{2} + \frac{1}{2}\cos\theta = 1 - \frac{\theta^2}{4} + O(\theta^4). \]
The inversion formula (2.12) tells us that
\[ p_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ix\theta}\, \phi(\theta)^n\, d\theta = \frac{1}{2\pi\sqrt n} \int_{-\pi\sqrt n}^{\pi\sqrt n} e^{-i(x/\sqrt n)s}\, \phi(s/\sqrt n)^n\, ds. \]
The second equality follows from the substitution $s = \theta\sqrt n$. For $|s| \le \sqrt n$, we can write
\[ \phi\Big(\frac{s}{\sqrt n}\Big) = 1 - \frac{s^2}{4n} + O\Big(\frac{s^4}{n^2}\Big) = 1 - \frac{(s^2/4) + O(s^4/n)}{n}. \]
We can find $\delta > 0$ such that if $|s| \le \delta\sqrt n$,
\[ \Big| \frac{s^2}{4} + O\Big(\frac{s^4}{n}\Big) \Big| \le \frac{n}{2}. \]
Therefore, using (12.3), if $|s| \le \delta\sqrt n$,
\[ \phi\Big(\frac{s}{\sqrt n}\Big)^n = \Big[ 1 - \frac{(s^2/4) + O(s^4/n)}{n} \Big]^n = e^{-s^2/4}\, e^{g(s,n)}, \]
where
\[ |g(s,n)| \le c\, \frac{s^4}{n}. \]
If $\epsilon = \min\{\delta, 1/\sqrt{8c}\}$ we also have
\[ |g(s,n)| \le \frac{s^2}{8}, \qquad |s| \le \epsilon\sqrt n. \]
For $\epsilon\sqrt n < |s| \le \pi\sqrt n$, we know that $|e^{-i(x/\sqrt n)s}\, \phi(s/\sqrt n)^n| \le e^{-\beta n}$ for some $\beta > 0$. Hence, up to an error that is exponentially small in $n$, $p_n(x)$ equals
\[ \frac{1}{2\pi\sqrt n} \int_{-\epsilon\sqrt n}^{\epsilon\sqrt n} e^{-i(x/\sqrt n)s}\, e^{-s^2/4}\, e^{g(s,n)}\, ds. \]
We now use
\[ |e^{g(s,n)} - 1| \le \begin{cases} c\, s^4/n, & |s| \le n^{1/4}, \\ e^{s^2/8}, & n^{1/4} < |s| \le \epsilon\sqrt n, \end{cases} \]
to bound the error term as follows:
\[ \Big| \frac{1}{2\pi\sqrt n} \int_{-\epsilon\sqrt n}^{\epsilon\sqrt n} e^{-i(x/\sqrt n)s}\, e^{-s^2/4}\, [e^{g(s,n)} - 1]\, ds \Big| \le \frac{c}{\sqrt n} \int_{-\epsilon\sqrt n}^{\epsilon\sqrt n} e^{-s^2/4}\, |e^{g(s,n)} - 1|\, ds, \]
\[ \int_{-n^{1/4}}^{n^{1/4}} e^{-s^2/4}\, |e^{g(s,n)} - 1|\, ds \le \frac{c}{n} \int_{-\infty}^{\infty} s^4\, e^{-s^2/4}\, ds \le \frac{c}{n}, \]
\[ \int_{n^{1/4} \le |s| \le \epsilon\sqrt n} e^{-s^2/4}\, |e^{g(s,n)} - 1|\, ds \le \int_{|s| \ge n^{1/4}} e^{-s^2/8}\, ds = o(n^{-1}). \]
Hence we have
\[ p_n(x) = O\Big(\frac{1}{n^{3/2}}\Big) + \frac{1}{2\pi\sqrt n} \int_{-\epsilon\sqrt n}^{\epsilon\sqrt n} e^{-i(x/\sqrt n)s}\, e^{-s^2/4}\, ds = O\Big(\frac{1}{n^{3/2}}\Big) + \frac{1}{2\pi\sqrt n} \int_{-\infty}^{\infty} e^{-i(x/\sqrt n)s}\, e^{-s^2/4}\, ds. \]
The last term equals $\bar p_n(x)$, see (2.2), and so we have shown that
\[ p_n(x) = \bar p_n(x) + O\Big(\frac{1}{n^{3/2}}\Big). \]
We will follow this basic line of proof for theorems in this subsection. Before proceeding, it will be useful to outline the main steps.
Expand $\log\phi(\theta)$ in a neighborhood $|\theta| < \epsilon$ about the origin.
Use this expansion to approximate $[\phi(\theta/\sqrt n)]^n$, which is the characteristic function of $S_n/\sqrt n$.
Use the inversion formula to get an exact expression for the probability and do a change of variables $s = \theta\sqrt n$, so that the integral is over $[-\pi\sqrt n, \pi\sqrt n]^d$. Use Lemma 2.3.2 to show that the contribution from $|s| \ge \epsilon\sqrt n$ is exponentially small.
Use the approximation of $[\phi(\theta/\sqrt n)]^n$ to compute the dominant term and to give an expression for the error term that needs to be estimated.
Estimate the error term.
Our first lemma discusses the approximation of the characteristic function of $S_n/\sqrt n$ by an exponential. We state the lemma for all $p \in \mathcal{P}'_d$, and then give examples to show how to get sharper results if one makes stronger assumptions on the moments.

Lemma 2.3.3 Suppose $p \in \mathcal{P}'_d$ with covariance matrix $\Gamma$ and characteristic function $\phi$ that we write as
\[ \phi(\theta) = 1 - \frac{\theta\cdot\Gamma\theta}{2} + h(\theta), \]
where $h(\theta) = o(|\theta|^2)$ as $\theta \to 0$. There exist $\epsilon > 0$, $c < \infty$ such that for all positive integers $n$ and all $|\theta| \le \epsilon\sqrt n$, we can write
\[ \Big[ \phi\Big(\frac{\theta}{\sqrt n}\Big) \Big]^n = \exp\Big\{ -\frac{\theta\cdot\Gamma\theta}{2} + g(\theta,n) \Big\} = e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, [1 + F_n(\theta)], \tag{2.15} \]
where $F_n(\theta) = e^{g(\theta,n)} - 1$ and
\[ |g(\theta,n)| \le \min\Big\{ \frac{\theta\cdot\Gamma\theta}{4},\ n\,\Big| h\Big(\frac{\theta}{\sqrt n}\Big) \Big| + \frac{c\,|\theta|^4}{n} \Big\}. \tag{2.16} \]
In particular,
\[ |F_n(\theta)| \le e^{\frac{\theta\cdot\Gamma\theta}{4}} + 1. \]
Proof Choose $\delta > 0$ such that
\[ |\phi(\theta) - 1| \le \frac{1}{2}, \qquad |\theta| \le \delta. \]
For $|\theta| \le \delta$, we can write
\[ \log\phi(\theta) = -\frac{\theta\cdot\Gamma\theta}{2} + h(\theta) - \frac{(\theta\cdot\Gamma\theta)^2}{8} + O(|h(\theta)|\,|\theta|^2) + O(|\theta|^6). \tag{2.17} \]
Define $g(\theta,n)$ by
\[ n\, \log\phi\Big(\frac{\theta}{\sqrt n}\Big) = -\frac{\theta\cdot\Gamma\theta}{2} + g(\theta,n), \]
so that (2.15) holds. Note that
\[ |g(\theta,n)| \le n\,\Big| h\Big(\frac{\theta}{\sqrt n}\Big) \Big| + O\Big(\frac{|\theta|^4}{n}\Big). \]
Since $n\,h(\theta/\sqrt n) = o(|\theta|^2)$, we can find $0 < \epsilon \le \delta$ such that for $|\theta| \le \epsilon\sqrt n$,
\[ |g(\theta,n)| \le \frac{\theta\cdot\Gamma\theta}{4}. \]
The proofs will require estimates for $F_n(\theta)$. The inequality $|e^z - 1| \le O(|z|)$ is valid if $z$ is restricted to a bounded set. Hence, the basic strategy is to find $c_1$ and $r(n) \le O(n^{1/4})$ such that
\[ n\,\Big| h\Big(\frac{\theta}{\sqrt n}\Big) \Big| \le c_1, \qquad |\theta| \le r(n). \]
Since $O(|\theta|^4/n) \le O(1)$ for $|\theta| \le n^{1/4}$, (2.16) implies
\[ |F_n(\theta)| = \big| e^{g(\theta,n)} - 1 \big| \le c\, |g(\theta,n)| \le c\, \Big[ n\,\Big| h\Big(\frac{\theta}{\sqrt n}\Big) \Big| + \frac{|\theta|^4}{n} \Big], \qquad |\theta| \le r(n), \]
\[ |F_n(\theta)| \le e^{\frac{\theta\cdot\Gamma\theta}{4}} + 1, \qquad r(n) \le |\theta| \le \epsilon\sqrt n. \]
Examples
We give some examples with different moment assumptions. In the discussion below, $\epsilon$ is as in Lemma 2.3.3 and $\theta$ is restricted to $|\theta| \le \epsilon\sqrt n$.

If $E[|X_1|^4] < \infty$, then by (2.11),
\[ h(\theta) = q_3(\theta) + O(|\theta|^4), \]
and
\[ \log\phi(\theta) = -\frac{\theta\cdot\Gamma\theta}{2} + f_3(\theta) + O(|\theta|^4), \]
where $f_3 = q_3$ is a homogeneous polynomial of degree three. In this case,
\[ g(\theta,n) = n\, f_3\Big(\frac{\theta}{\sqrt n}\Big) + \frac{O(|\theta|^4)}{n}, \tag{2.18} \]
and there exists $c < \infty$ such that
\[ |g(\theta,n)| \le \min\Big\{ \frac{\theta\cdot\Gamma\theta}{4},\ \frac{c\,|\theta|^3}{\sqrt n} \Big\}. \]
We use here and below the fact that $|\theta|^3/\sqrt n \ge |\theta|^4/n$ for $|\theta| \le \sqrt n$.

If $E[|X_1|^6] < \infty$ and all the third and fifth moments of $X_1$ vanish, then
\[ h(\theta) = q_4(\theta) + O(|\theta|^6), \]
\[ \log\phi(\theta) = -\frac{\theta\cdot\Gamma\theta}{2} + f_4(\theta) + O(|\theta|^6), \]
where $f_4(\theta) = q_4(\theta) - (\theta\cdot\Gamma\theta)^2/8$ is a homogeneous polynomial of degree four. In this case,
\[ g(\theta,n) = n\, f_4\Big(\frac{\theta}{\sqrt n}\Big) + \frac{O(|\theta|^6)}{n^2}, \tag{2.19} \]
and there exists $c < \infty$ such that
\[ |g(\theta,n)| \le \min\Big\{ \frac{\theta\cdot\Gamma\theta}{4},\ \frac{c\,|\theta|^4}{n} \Big\}. \]
More generally, suppose that $k \ge 3$ is a positive integer such that $E[|X_1|^{k+1}] < \infty$. Then
\[ h(\theta) = \sum_{j=3}^k q_j(\theta) + O(|\theta|^{k+1}), \]
\[ \log\phi(\theta) = -\frac{\theta\cdot\Gamma\theta}{2} + \sum_{j=3}^k f_j(\theta) + O(|\theta|^{k+1}), \]
where $f_j$ are homogeneous polynomials of degree $j$ that are determined by $\Gamma, q_3, \ldots, q_k$. In this case,
\[ g(\theta,n) = \sum_{j=3}^k n\, f_j\Big(\frac{\theta}{\sqrt n}\Big) + \frac{O(|\theta|^{k+1})}{n^{(k-1)/2}}. \tag{2.20} \]
Moreover, if $j$ is odd and all the odd moments of $X_1$ of degree less than or equal to $j$ vanish, then $f_j \equiv 0$. Also,
\[ |g(\theta,n)| \le \min\Big\{ \frac{\theta\cdot\Gamma\theta}{4},\ \frac{c\,|\theta|^{2+\alpha}}{n^{\alpha/2}} \Big\}, \]
where $\alpha = 2$ if the third moments vanish and otherwise $\alpha = 1$.

Suppose $E[e^{b\cdot X_1}] < \infty$ for all $b$ in a real neighborhood of the origin. Then $z \mapsto \phi(z) = E[e^{iz\cdot X_1}]$ is a holomorphic function from a neighborhood of the origin in $\mathbb{C}^d$ to $\mathbb{C}$. Hence, we can choose $\epsilon$ so that $\log\phi(z)$ is holomorphic for $|z| < \epsilon$, and hence $z \mapsto g(z,n)$ and $z \mapsto F_n(z)$ are holomorphic for $|z| < \epsilon\sqrt n$.
The next lemma computes the dominant term and isolates the integral that needs to be estimated in order to obtain error bounds.

Lemma 2.3.4 Suppose $p \in \mathcal{P}'_d$ with covariance matrix $\Gamma$. Let $\epsilon, g, F_n$ be as in Lemma 2.3.3. There exist $c < \infty$, $\zeta > 0$ such that for all $0 \le r \le \epsilon\sqrt n$, if we define $v_n(x,r)$ by
\[ p_n(x) = \bar p_n(x) + v_n(x,r) + \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|\theta| \le r} e^{-i\frac{\theta\cdot x}{\sqrt n}}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, F_n(\theta)\, d\theta, \]
then
\[ |v_n(x,r)| \le c\, n^{-d/2}\, e^{-\zeta r^2}. \]
Proof The inversion formula (2.12) gives
\[ p_n(x) = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \phi(\theta)^n\, e^{-i\theta\cdot x}\, d\theta = \frac{1}{(2\pi)^d\, n^{d/2}} \int_{[-\pi\sqrt n,\pi\sqrt n]^d} \phi\Big(\frac{s}{\sqrt n}\Big)^n\, e^{-iz\cdot s}\, ds, \]
where $z = x/\sqrt n$. By Lemma 2.3.2, there is a $\beta > 0$ such that $|\phi(\theta)|^n \le e^{-\beta n}$ for $\theta \in [-\pi,\pi]^d$ with $|\theta| \ge \epsilon$. Therefore,
\[ \frac{1}{(2\pi)^d\, n^{d/2}} \int_{[-\pi\sqrt n,\pi\sqrt n]^d} \phi\Big(\frac{s}{\sqrt n}\Big)^n\, e^{-iz\cdot s}\, ds = O(e^{-\beta n}) + \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|s| \le \epsilon\sqrt n} \phi\Big(\frac{s}{\sqrt n}\Big)^n\, e^{-iz\cdot s}\, ds. \]
For $|s| \le \epsilon\sqrt n$, we write
\[ \phi\Big(\frac{s}{\sqrt n}\Big)^n = e^{-\frac{s\cdot\Gamma s}{2}} + e^{-\frac{s\cdot\Gamma s}{2}}\, F_n(s). \]
By (2.2) we have
\[ \frac{1}{(2\pi)^d\, n^{d/2}} \int_{\mathbb{R}^d} e^{-iz\cdot s}\, e^{-\frac{s\cdot\Gamma s}{2}}\, ds = \bar p_n(x). \tag{2.21} \]
Also,
\[ \Big| \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|s| \ge \epsilon\sqrt n} e^{-iz\cdot s}\, e^{-\frac{s\cdot\Gamma s}{2}}\, ds \Big| \le \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|s| \ge \epsilon\sqrt n} e^{-\frac{s\cdot\Gamma s}{2}}\, ds \le O(e^{-\beta n}), \]
for perhaps a different $\beta$. Therefore,
\[ p_n(x) = \bar p_n(x) + O(e^{-\beta n}) + \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|\theta| \le \epsilon\sqrt n} e^{-i\frac{\theta\cdot x}{\sqrt n}}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, F_n(\theta)\, d\theta. \]
This gives the result for $r = \epsilon\sqrt n$. For smaller $r$, we use the bound $|F_n(\theta)| \le e^{\frac{\theta\cdot\Gamma\theta}{4}} + 1$ to see that
\[ \Big| \int_{r \le |\theta| \le \epsilon\sqrt n} e^{-i\frac{\theta\cdot x}{\sqrt n}}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, F_n(\theta)\, d\theta \Big| \le 2 \int_{|\theta| \ge r} e^{-\frac{\theta\cdot\Gamma\theta}{4}}\, d\theta = O(e^{-\zeta r^2}). \]
The next theorem establishes LCLTs for $p \in \mathcal{P}'_d$ with finite third moment, and for $p \in \mathcal{P}'_d$ with finite fourth moment and vanishing third moments. It gives an error term that is uniform over all $x \in \mathbb{Z}^d$. The estimate is good for typical $x$, but is not very sharp for atypically large $x$.
Theorem 2.3.5 Suppose $p \in \mathcal{P}'_d$ with $E[|X_1|^3] < \infty$. Then there exists a $c < \infty$ such that for all $n, x$,
\[ |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d+1)/2}}. \tag{2.22} \]
If $E[|X_1|^4] < \infty$ and all the third moments of $X_1$ are zero, then there is a $c$ such that for all $n, x$,
\[ |p_n(x) - \bar p_n(x)| \le \frac{c}{n^{(d+2)/2}}. \tag{2.23} \]
Proof We use the notations of Lemmas 2.3.3 and 2.3.4. Letting $r = n^{1/8}$ in Lemma 2.3.4, we see that
\[ p_n(x) = \bar p_n(x) + O(e^{-\zeta n^{1/4}}) + \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|\theta| \le n^{1/8}} e^{-i\frac{\theta\cdot x}{\sqrt n}}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, F_n(\theta)\, d\theta. \]
Note that $|h(\theta)| = O(|\theta|^{2+\alpha})$, where $\alpha = 1$ under the weaker assumption and $\alpha = 2$ under the stronger assumption. For $|\theta| \le n^{1/8}$, $|g(\theta,n)| \le c\,|\theta|^{2+\alpha}/n^{\alpha/2}$, and hence
\[ |F_n(\theta)| \le c\, \frac{|\theta|^{2+\alpha}}{n^{\alpha/2}}. \]
This implies
\[ \Big| \int_{|\theta| \le n^{1/8}} e^{-i\frac{\theta\cdot x}{\sqrt n}}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, F_n(\theta)\, d\theta \Big| \le \frac{c}{n^{\alpha/2}} \int_{\mathbb{R}^d} |\theta|^{2+\alpha}\, e^{-\frac{\theta\cdot\Gamma\theta}{2}}\, d\theta \le \frac{c}{n^{\alpha/2}}. \]
The choice $r = n^{1/8}$ in the proof above was somewhat arbitrary. The value $r$ was chosen sufficiently large so that the error term $v_n(x,r)$ from Lemma 2.3.4 decays faster than any power of $n$, but sufficiently small so that $|g(\theta,n)|$ is uniformly bounded for $|\theta| \le r$. We could just as well have chosen $r(n) = n^{a}$ for other sufficiently small exponents $a > 0$.

The next theorem estimates the differences $p_n(x+y) - p_n(x)$. If $E[|X_1|^4] < \infty$, one can obtain a difference estimate for $p_n(x)$ from (2.5). However, we will give another proof below that requires only third moments of the increment distribution. This theorem also gives a uniform bound on the error term.
If $\alpha \ne 0$ and
\[ f(n) = n^{\alpha} + O(n^{\alpha-1}), \tag{2.24} \]
then
\[ f(n+1) - f(n) = [(n+1)^{\alpha} - n^{\alpha}] + [O((n+1)^{\alpha-1}) - O(n^{\alpha-1})]. \]
This shows that $f(n+1) - f(n) = O(n^{\alpha-1})$, but the best that we can write about the error terms is $O((n+1)^{\alpha-1}) - O(n^{\alpha-1}) = O(n^{\alpha-1})$, which is as large as the dominant term. Hence an expression such as (2.24) is not sufficient to give good asymptotics on differences of $f$. One strategy for proving difference estimates is to go back to the derivation of (2.24) to see if the difference of the errors can be estimated. This is the approach used in the next theorem.
Theorem 2.3.6 Suppose $p \in \mathcal{P}_d$ with $E[|X_1|^3] < \infty$. Let $\nabla_y$ denote the differences in the $x$ variable,
\[ \nabla_y\, p_n(x) = p_n(x+y) - p_n(x), \qquad \nabla_y\, \overline{p}_n(x) = \overline{p}_n(x+y) - \overline{p}_n(x), \]
and $\nabla_j = \nabla_{e_j}$. There exists $c < \infty$ such that for all $x, n, y$,
\[ |\nabla_y\, p_n(x) - \nabla_y\, \overline{p}_n(x)| \le \frac{c\, |y|}{n^{(d+2)/2}}. \]
If $E[|X_1|^4] < \infty$ and all the third moments of $X_1$ vanish, there exists $c < \infty$ such that for all $x, n, y$,
\[ |\nabla_y\, p_n(x) - \nabla_y\, \overline{p}_n(x)| \le \frac{c\, |y|}{n^{(d+3)/2}}. \]
Proof  By the triangle inequality, it suffices to prove the result for $y = e_j$, $j = 1, \ldots, d$. Let $\alpha = 1$ under the weaker assumptions and $\alpha = 2$ under the stronger assumptions. As in the proof of Theorem 2.3.5, we see that
\[ \nabla_j\, p_n(x) = \nabla_j\, \overline{p}_n(x) + O(e^{-\beta n^{1/4}})
   + \frac{1}{(2\pi)^d\, n^{d/2}} \int_{|\theta| \le n^{1/8}} \Big[ e^{-i(x+e_j)\cdot\theta/\sqrt n} - e^{-ix\cdot\theta/\sqrt n} \Big]\, e^{-\theta\cdot\Gamma\theta/2}\, F_n(\theta)\, d\theta. \]
Note that
\[ \Big| e^{-i(x+e_j)\cdot\theta/\sqrt n} - e^{-ix\cdot\theta/\sqrt n} \Big| = \Big| e^{-ie_j\cdot\theta/\sqrt n} - 1 \Big| \le \frac{|\theta|}{\sqrt n}, \]
and hence
\[ \Big| \int_{|\theta| \le n^{1/8}} \Big[ e^{-i(x+e_j)\cdot\theta/\sqrt n} - e^{-ix\cdot\theta/\sqrt n} \Big]\, e^{-\theta\cdot\Gamma\theta/2}\, F_n(\theta)\, d\theta \Big|
   \le \frac{1}{\sqrt n} \int_{|\theta| \le n^{1/8}} |\theta|\, e^{-\theta\cdot\Gamma\theta/2}\, |F_n(\theta)|\, d\theta. \]
The estimate
\[ \int_{|\theta| \le n^{1/8}} |\theta|\, e^{-\theta\cdot\Gamma\theta/2}\, |F_n(\theta)|\, d\theta \le \frac{c}{n^{\alpha/2}}, \]
where $\alpha = 1$ under the weaker assumption and $\alpha = 2$ under the stronger assumption, is done as in the previous theorem.
The next theorem improves the LCLT by giving a better error bound for larger $x$. The basic strategy is to write $F_n(\theta)$ as the sum of a dominant term and an error term. This requires a stronger moment condition. If $E[|X_1|^{j}] < \infty$, let $f_j$ be the homogeneous polynomial of degree $j$ defined in (2.20). Let
\[ u_j(z) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{-is\cdot z}\, f_j(s)\, e^{-s\cdot\Gamma s/2}\, ds. \tag{2.25} \]
Using standard properties of Fourier transforms, we can see that
\[ u_j(z) = \overline{f}_j(z)\, e^{-(z\cdot\Gamma^{-1}z)/2} \tag{2.26} \]
for some $j$th degree polynomial $\overline{f}_j$ that depends only on the distribution of $X_1$.
Theorem 2.3.7 Suppose $p \in \mathcal{P}_d$.
If $E[|X_1|^4] < \infty$, there exists $c < \infty$ such that
\[ \Big| p_n(x) - \overline{p}_n(x) - \frac{u_3(x/\sqrt n)}{n^{(d+1)/2}} \Big| \le \frac{c}{n^{(d+2)/2}}, \tag{2.27} \]
where $u_3$ is as defined in (2.25).
If $E[|X_1|^5] < \infty$ and the third moments of $X_1$ vanish, there exists $c < \infty$ such that
\[ \Big| p_n(x) - \overline{p}_n(x) - \frac{u_4(x/\sqrt n)}{n^{(d+2)/2}} \Big| \le \frac{c}{n^{(d+3)/2}}, \tag{2.28} \]
where $u_4$ is as defined in (2.25).
If $k \ge 3$ is a positive integer such that $E[|X_1|^k] < \infty$ and $u_k$ is as defined in (2.25), then there is a $c(k)$ such that
\[ |u_k(z)| \le c(k)\, (|z|^k + 1)\, e^{-(z\cdot\Gamma^{-1}z)/2}. \]
Moreover, if $j$ is a positive integer, there is a $c(k, j)$ such that if $D_j$ is a $j$th order derivative,
\[ |D_j\, u_k(z)| \le c(k, j)\, (|z|^{k+j} + 1)\, e^{-(z\cdot\Gamma^{-1}z)/2}. \tag{2.29} \]
Proof Let = 1 under the weaker assumptions and = 2 under the stronger assumptions. As in
Theorem 2.3.5,
n
d/2
[p
n
(x) p
n
(x)] = O(e
n
1/4
) +
1
(2)
d
_
||n
1/8
e
ix
n
e
2
F
n
() d. (2.30)
Recalling (2.20), we can see that for [[ n
1/8
,
F
n
() =
f
2+
()
n
/2
+
O([[
3+
)
n
(+1)/2
.
Up to an error of O(e
n
1/4
), the right-hand side of (2.30) equals
1
(2)
d
_
R
d
e
ix
n
e
2
f
2+
()
n
(/2)
d +
1
(2)
d
_
||n
1/8
e
ix
n
e
2
_
F
n
()
f
2+
()
n
/2
_
d.
The second integral can be bounded as before
_
||n
1/8
e
ix
n
e
2
_
F
n
()
f
2+
()
n
/2
_
d
c
n
(+1)/2
_
R
d
[[
3+
e
2
d
c
n
(+1)/2
.
The estimates on u
k
and D
j
u
k
follows immediately from (2.26).
The next theorem is proved in the same way as Theorem 2.3.7 starting with (2.20), and we omit it. A special case of this theorem is (2.3). The theorem shows that (2.3) holds for all symmetric $p \in \mathcal{P}_d$ with $E[|X_1|^6] < \infty$. The results stated for $n \ge |x|^2$ are just restatements of Theorem 2.3.5.
Theorem 2.3.8 Suppose $p \in \mathcal{P}_d$ and $k \ge 3$ is a positive integer such that $E[|X_1|^{k+1}] < \infty$. There exists $c = c(k)$ such that
\[ \Big| p_n(x) - \overline{p}_n(x) - \sum_{j=3}^{k} \frac{u_j(x/\sqrt n)}{n^{(d+j-2)/2}} \Big| \le \frac{c}{n^{(d+k-1)/2}}, \tag{2.31} \]
where the $u_j$ are as defined in (2.25).
In particular, if $z = x/\sqrt n$,
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{c}{n^{(d+1)/2}} \Big[ |z|^k\, e^{-(z\cdot\Gamma^{-1}z)/2} + \frac{1}{n^{(k-2)/2}} \Big], \qquad n \le |x|^2, \]
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{c}{n^{(d+1)/2}}, \qquad n \ge |x|^2. \]
If the third moments of $X_1$ vanish (e.g., if $p$ is symmetric) then $u_3 \equiv 0$ and
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{c}{n^{(d+2)/2}} \Big[ |z|^k\, e^{-(z\cdot\Gamma^{-1}z)/2} + \frac{1}{n^{(k-3)/2}} \Big], \qquad n \le |x|^2, \]
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{c}{n^{(d+2)/2}}, \qquad n \ge |x|^2. \]
Remark. Theorem 2.3.8 gives improvements to Theorem 2.3.6. Assuming a sufficient number of moments on the increment distribution, one can estimate $\nabla_y\, p_n(x)$ up to an error of $O(n^{-(d+k-1)/2})$ by taking $\nabla_y$ of all the terms on the left-hand side of (2.31). These terms can be estimated using (2.29). This works for higher order differences as well.
The next theorem is the LCLT assuming only a finite second moment.
Theorem 2.3.9 Suppose $p \in \mathcal{P}_d$. Then there exists a sequence $\delta_n \to 0$ such that for all $n, x$,
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{\delta_n}{n^{d/2}}. \tag{2.32} \]
Proof  By Lemma 2.3.4, there exist $c, \beta$ such that for all $n, x$ and $r > 0$,
\[ n^{d/2}\, |p_n(x) - \overline{p}_n(x)| \le c \Big[ e^{-\beta r^2} + \int_{|\theta| \le r} |F_n(\theta)|\, d\theta \Big]
   \le c \Big[ e^{-\beta r^2} + r^d \sup_{|\theta| \le r} |F_n(\theta)| \Big]. \]
We now refer to Lemma 2.3.3. Since $h(\theta) = o(|\theta|^2)$,
\[ \lim_{n\to\infty}\; \sup_{|\theta| \le r} |g(\theta, n)| = 0, \]
and hence
\[ \lim_{n\to\infty}\; \sup_{|\theta| \le r} |F_n(\theta)| = 0. \]
In particular, for all $n$ sufficiently large,
\[ n^{d/2}\, |p_n(x) - \overline{p}_n(x)| \le 2\, c\, e^{-\beta r^2}. \]
The next theorem improves on this for $|x|$ larger than $\sqrt n$.
Theorem 2.3.10 Suppose $p \in \mathcal{P}_d$. Then there exists a sequence $\delta_n \to 0$ such that for all $n, x$,
\[ |p_n(x) - \overline{p}_n(x)| \le \frac{\delta_n}{|x|^2\, n^{(d-2)/2}}. \tag{2.33} \]
Moreover,
If $E[|X_1|^3] < \infty$, then $\delta_n$ can be chosen $O(n^{-1/2})$.
If $E[|X_1|^4] < \infty$ and the third moments of $X_1$ vanish, then $\delta_n$ can be chosen $O(n^{-1})$.
Proof If
1
,
2
are C
2
functions on R
d
with period 2 in each component, then it follows from
Greens theorem (integration by parts) that
_
[,]
d
[
1
()]
2
() d =
_
[,]
d
1
() [
2
()] d
(the boundary terms disappear by periodicity). Since [e
ix
] = [x[
2
e
x
, the inversion formula
gives
p
n
(x) =
1
(2)
d
_
[,]
d
e
ix
() d =
1
[x[
2
(2)
d
_
[,]
d
e
ix
() d,
where () = ()
n
. Therefore,
[x[
2
n
p
n
(x) =
1
(2)
d
_
[,]
d
e
ix
()
n2
[(n 1) () +() ()] d,
where
() =
d
j=1
[
j
()]
2
.
The rst and second derivatives of are uniformly bounded. Hence by (2.14), we can see that
()
n2
[(n 1) () +() ()]
c [1 +n[[
2
] e
n||
2
b
c e
n||
2
,
where 0 < < b. Hence, we can write
[x[
2
n
p
n
(x) O(e
r
2
) =
1
(2)
d
_
||r/
n
e
ix
()
n2
[(n 1) () +() ()] d.
The usual change of variables shows that the right-hand side equals
1
(2)
d
n
d/2
_
||r
e
iz
n
_
n2
_
(n 1)
_
n
_
+
_
n
_
_
n
__
d,
where z = x/
n.
Note that
_
e
2
_
= e
2
[[[
2
tr()].
We dene
F
n
() by
n
_
n2
_
(n 1)
_
n
_
+
_
n
_
_
n
__
= e
2
_
[[
2
tr()
F
n
()
_
.
A straightforward calculation using Greens theorem shows that
p
n
(x) =
1
(2)
d
n
d/2
_
R
d
e
i(x/
n)
e
2
d =
n
[x[
2
(2)
2
_
R
d
e
i(x/
n)
[e
2
] d.
Therefore (with perhaps a dierent ),
[x[
2
n
p
n
(x) =
[x[
2
n
p
n
(x) +O(e
r
2
)
1
(2)
d
n
d/2
_
||r
e
iz
e
2
F
n
() d. (2.34)
The remaining task is to estimate
F
n
(). Recalling the denition of F
n
() from Lemma 2.3.3, we
can see that
n
_
n2
_
(n 1)
_
n
_
+
_
n
_
_
n
__
=
e
2
[1 +F
n
()]
_
(n 1)
(/
n)
(/
n)
2
+
(/
n)
(/
n)
_
.
We make three possible assumptions:
p T
d
.
p T
d
with E[[X
1
[
3
] < .
p T
d
with E[[X
1
[
4
] < and vanishing third moments.
We set = 0, 1, 2, respectively, in these three cases. Then we can write
() = 1
2
+q
2+
() +o([[
2+
),
where q
2
0 and q
3
, q
4
are homogeneous polynomials of degree 3 and 4 respectively. Because is
C
2+
and we know the values of the derivatives at the origin, we can write
j
() =
j
2
+
j
q
2+
() +o([[
1+
),
jj
() =
jj
2
+
jj
q
2+
() +o([[
).
Using this, we see that
d
j=1
[
j
()]
2
()
2
= [[
2
+ q
2+
() +o([[
2+
),
()
()
= tr() + q
() +o([[
),
where q
2+
is a homogeneous polyonomial of degree 2 + with q
2
0, and q
is a homogeneous
polynomial of degree with q
0
= 0. Therefore, for [[ n
1/8
,
(n 1)
(/
n)
(/
n)
2
+
(/
n)
(/
n)
= [[
2
tr() +
q
2+
() + q
()
n
/2
+o
_
[[
+[[
+2
n
/2
_
,
which establishes that for = 1, 2
[
F
n
()[ = O
_
1 +[[
2+
n
/2
_
, [[ n
1/16
,
and for = 0, for each r < ,
lim
n
sup
||r
[
F
n
()[ = 0.
The remainder of the argument follows the proofs of Theorem 2.3.5 and 2.3.9. For = 1, 2 we can
choose r = n
7/16
in (2.34) while for = 0 we choose r independent of n and then let r .
2.3.1 Exponential moments
The estimation of probabilities for atypical values can be done more accurately for random walks whose increment distribution has an exponential moment. In this section we prove the following.
Theorem 2.3.11 Suppose $p \in \mathcal{P}_d$ is such that for some $b > 0$,
\[ E\big[ e^{b|X_1|} \big] < \infty. \tag{2.35} \]
Then there exists $\rho > 0$ such that for all $n \ge 0$ and all $x \in \mathbb{Z}^d$ with $|x| < \rho n$,
\[ p_n(x) = \overline{p}_n(x)\, \exp\Big\{ O\Big( \frac{1}{\sqrt n} + \frac{|x|^3}{n^2} \Big) \Big\}. \]
Moreover, if all the third moments of $X_1$ vanish,
\[ p_n(x) = \overline{p}_n(x)\, \exp\Big\{ O\Big( \frac{1}{n} + \frac{|x|^4}{n^3} \Big) \Big\}. \]
Note that the conclusion of the theorem can be written
\[ |p_n(x) - \overline{p}_n(x)| \le c\, \overline{p}_n(x) \Big[ \frac{1}{n^{\alpha/2}} + \frac{|x|^{2+\alpha}}{n^{1+\alpha}} \Big], \qquad |x| \le n^{\frac{1+\alpha}{2+\alpha}}, \]
\[ |p_n(x) - \overline{p}_n(x)| \le \overline{p}_n(x)\, \exp\Big\{ O\Big( \frac{|x|^{2+\alpha}}{n^{1+\alpha}} \Big) \Big\}, \qquad |x| \ge n^{\frac{1+\alpha}{2+\alpha}}, \]
where $\alpha = 2$ if the third moments vanish and $\alpha = 1$ otherwise. In particular, if $x_n$ is a sequence of points in $\mathbb{Z}^d$, then as $n \to \infty$,
\[ p_n(x_n) \sim \overline{p}_n(x_n) \ \text{ if } |x_n| = o(n^{\gamma}), \qquad
   p_n(x_n) \asymp \overline{p}_n(x_n) \ \text{ if } |x_n| = O(n^{\gamma}), \]
where $\gamma = 2/3$ if $\alpha = 1$ and $\gamma = 3/4$ if $\alpha = 2$.
Theorem 2.3.11 will follow from a stronger result (Theorem 2.3.12). Before stating it we introduce some additional notation and make several preliminary observations. Let $p \in \mathcal{P}_d$ have characteristic function $\phi$ and covariance matrix $\Gamma$, and assume that $p$ satisfies (2.35). If the third moments of $p$ vanish, we let $\alpha = 2$; otherwise, $\alpha = 1$. Let $M$ denote the moment generating function,
\[ M(b) = E[e^{b\cdot X}] = \phi(-ib), \]
which by (2.35) is well defined in a neighborhood of the origin in $\mathbb{C}^d$. Moreover, we can find $C < \infty$, $\varepsilon > 0$ such that
\[ E\big[ |X|^4\, e^{|b\cdot X|} \big] \le C, \qquad |b| < \varepsilon. \tag{2.36} \]
In particular, there is a uniform bound in this neighborhood on all the derivatives of $M$ of order at most four. (A (finite) number of times in this section we will say that something holds for all $b$ in a neighborhood of the origin. At the end, one should take the intersection of all such neighborhoods.)
Let $L(b) = \log M(b)$, so that $L(i\theta) = \log\phi(\theta)$. Then in a neighborhood of the origin we have
\[ M(b) = 1 + \frac{b\cdot\Gamma b}{2} + O(|b|^{\alpha+2}), \qquad \nabla M(b) = \Gamma b + O(|b|^{\alpha+1}), \]
\[ L(b) = \frac{b\cdot\Gamma b}{2} + O(|b|^{\alpha+2}), \qquad \nabla L(b) = \frac{\nabla M(b)}{M(b)} = \Gamma b + O(|b|^{\alpha+1}). \tag{2.37} \]
For $|b| < \varepsilon$, let $p_b$ be the probability measure
\[ p_b(x) = \frac{e^{b\cdot x}\, p(x)}{M(b)}, \tag{2.38} \]
and let $\mathbf{P}_b$, $\mathbf{E}_b$ denote probabilities and expectations associated to a random walk with increment distribution $p_b$. Note that
\[ \mathbf{P}_b\{S_n = x\} = e^{b\cdot x}\, M(b)^{-n}\, \mathbf{P}\{S_n = x\}. \tag{2.39} \]
The mean of $p_b$ is equal to
\[ m_b = \frac{E[X\, e^{b\cdot X}]}{E[e^{b\cdot X}]} = \nabla L(b). \]
A standard large deviations technique for understanding $\mathbf{P}\{S_n = x\}$ is to study $\mathbf{P}_b\{S_n = x\}$ where $b$ is chosen so that $m_b = x/n$. We will apply this technique in the current context. Since $\Gamma$ is an invertible matrix, (2.37) implies that $b \mapsto \nabla L(b)$ maps $\{|b| < \varepsilon\}$ one-to-one and onto a neighborhood of the origin, where $\varepsilon > 0$ is sufficiently small. In particular, there is a $\delta > 0$ such that for all $w \in \mathbb{R}^d$ with $|w| < \delta$, there is a unique $|b_w| < \varepsilon$ with $\nabla L(b_w) = w$.
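The following minimal sketch (not from the book) illustrates the tilting (2.38) and the choice of $b_w$ numerically for a one-dimensional walk: it solves $L'(b_w) = w$ by bisection and checks that the tilted law $p_{b_w}$ has mean $w$. The uniform increment distribution on $\{-1, 0, 1\}$ is an assumption made only for illustration.

```python
# Minimal sketch of exponential tilting (2.38) for a 1-d walk (assumed pmf).
import math

p = {-1: 1 / 3, 0: 1 / 3, 1: 1 / 3}

def M(b):                     # moment generating function M(b) = E[exp(bX)]
    return sum(math.exp(b * x) * px for x, px in p.items())

def Lprime(b):                # L'(b) = M'(b)/M(b) = mean of the tilted law p_b
    return sum(x * math.exp(b * x) * px for x, px in p.items()) / M(b)

def tilt_parameter(w, lo=-20.0, hi=20.0):
    """Solve L'(b_w) = w by bisection (L' is strictly increasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Lprime(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

w = 0.3                       # target mean, playing the role of x/n
b = tilt_parameter(w)
p_b = {x: math.exp(b * x) * px / M(b) for x, px in p.items()}
print("b_w =", round(b, 6),
      " tilted mean =", round(sum(x * q for x, q in p_b.items()), 6))
```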
One could think of the tilting procedure of (2.38) as weighting by a martingale. Indeed, it is easy to see that for $|b| < \varepsilon$, the process
\[ N_n = M(b)^{-n}\, \exp\{ b\cdot S_n \} \]
is a martingale. […] Let $\phi_b$ denote the characteristic function of $p_b$,
\[ \phi_b(\theta) = \mathbf{E}_b[e^{i\theta\cdot X}] = \frac{M(i\theta + b)}{M(b)}. \tag{2.40} \]
Then there is a neighborhood of the origin such that for all $b, \theta$ in the neighborhood, we can expand $\log\phi_b$ as
\[ \log\phi_b(\theta) = i\, m_b\cdot\theta - \frac{\theta\cdot\Gamma_b\theta}{2} + f_{3,b}(\theta) + h_{4,b}(\theta). \tag{2.41} \]
Here $\Gamma_b$ is the covariance matrix for the increment distribution $p_b$, and $f_{3,b}(\theta)$ is the homogeneous polynomial of degree three
\[ f_{3,b}(\theta) = -\frac{i}{6} \Big\{ \mathbf{E}_b[(\theta\cdot X)^3] - 3\, \mathbf{E}_b[(\theta\cdot X)^2]\, \mathbf{E}_b[\theta\cdot X] + 2\, \big(\mathbf{E}_b[\theta\cdot X]\big)^3 \Big\}. \]
Due to (2.36), the coefficients of the third order Taylor polynomial of $\log\phi_b$ are all differentiable in $b$ with bounded derivatives in the same neighborhood of the origin. In particular we conclude that
\[ |f_{3,b}(\theta)| \le c\, |b|^{\alpha-1}\, |\theta|^3, \qquad |b|, |\theta| < \varepsilon. \]
To see this, if $\alpha = 1$ use the boundedness of the first and third moments. If $\alpha = 2$, note that $f_{3,0}(\theta) = 0$, $\theta \in \mathbb{R}^d$, and use the fact that the first and third moments have bounded derivatives as functions of $b$. Similarly,
\[ \Gamma_b = \frac{E[X X^{T}\, e^{b\cdot X}]}{M(b)} = \Gamma + O(|b|^{\alpha}). \]
The error term $h_{4,b}$ is bounded by
\[ |h_{4,b}(\theta)| \le c\, |\theta|^4, \qquad |b|, |\theta| < \varepsilon. \]
Note that due to (2.37) (and invertibility of $\Gamma$) we have both $|b_w| = O(|w|)$ and $|w| = O(|b_w|)$.
Combining this with the above observations, we can conclude
\[ m_{b_w} = \Gamma b_w + O(|w|^{1+\alpha}), \]
\[ b_w = \Gamma^{-1} w + O(|w|^{1+\alpha}), \tag{2.42} \]
\[ \det \Gamma_{b_w} = \det\Gamma + O(|w|^{\alpha}) = \det\Gamma + O(|b_w|^{\alpha}), \tag{2.43} \]
\[ L(b_w) = \frac{b_w\cdot\Gamma b_w}{2} + O(|b_w|^{2+\alpha}) = \frac{w\cdot\Gamma^{-1}w}{2} + O(|w|^{2+\alpha}). \tag{2.44} \]
By examining the proof of (2.13) one can nd a (perhaps dierent) > 0, such that for all [b[
and all [, ]
d
,
[e
im
b
b
()[ 1 [[
2
.
(For small use the expansion of near 0, otherwise consider max
,b
[e
im
b
b
()[ where the
maximum is taken over all such z [, ]
d
: [z[ and all [b[ .)
Theorem 2.3.12 Suppose $p$ satisfies the assumptions of Theorem 2.3.11, and let $L$, $b_w$ be defined as above. Then there exist $c < \infty$ and $\rho > 0$ such that the following holds. Suppose $x \in \mathbb{Z}^d$ with $|x| \le \rho n$ and $b = b_{x/n}$. Then
\[ \Big| (2\pi n)^{d/2}\, (\det\Gamma_b)^{1/2}\, \mathbf{P}_b\{S_n = x\} - 1 \Big| \le \frac{c\, (|x|^{\alpha-1} + \sqrt n)}{n^{(\alpha+1)/2}}. \tag{2.45} \]
In particular,
\[ p_n(x) = \mathbf{P}\{S_n = x\} = \overline{p}_n(x)\, \exp\Big\{ O\Big( \frac{1}{n^{\alpha/2}} + \frac{|x|^{2+\alpha}}{n^{1+\alpha}} \Big) \Big\}. \tag{2.46} \]
Proof [of (2.46) given (2.45)] We can write (2.45) as
P
b
S
n
= x =
1
(2 det
b
)
d/2
n
d/2
_
1 +O
_
[x[
1
n
(+1)/2
+
1
n
/2
__
.
By (2.39),
p
n
(x) = PS
n
= x = M(b)
n
e
bx
P
b
S
n
= x = exp nL(b) b x P
b
S
n
= x.
From (2.43), we see that
(det
b
)
d/2
= (det )
d/2
_
1 +O
_
[x[
__
,
and due to (2.44), we have
nL(b)
x
1
x
2n
c
[x[
2+
n
1+
.
Applying (2.42), we see that
b x =
_
1
_
x
n
_
+O
_
[x[
+1
n
+1
__
x =
x
1
x
n
+O
_
[x[
+2
n
+1
_
.
Therefore,
exp nL(b) b x = exp
_
x
1
x
2n
_
exp
_
O
_
[x[
2+
n
1+
__
.
Combining these and recalling the denition of p
n
(x) we get,
p
n
(x) = p
n
(x) exp
_
O
_
[x[
2+
n
1+
__
.
Therefore, it suffices to prove (2.45). The argument uses an LCLT for probability distributions on $\mathbb{Z}^d$ with non-zero mean. Suppose $K < \infty$ and $X$ is a random variable in $\mathbb{Z}^d$ with mean $m \in \mathbb{R}^d$, covariance matrix $\Gamma$, and $E[|X|^4] \le K$. Let $\phi$ be the characteristic function of $X$. Then there exist $\varepsilon, C$, depending only on $K$, such that for $|\theta| < \varepsilon$,
\[ \Big| \log\phi(\theta) - \Big[ i\, m\cdot\theta - \frac{\theta\cdot\Gamma\theta}{2} + f_3(\theta) \Big] \Big| \le C\, |\theta|^4, \tag{2.47} \]
where the term $f_3(\theta)$ is a homogeneous polynomial of degree 3. Let us write $K_3$ for the smallest number such that $|f_3(\theta)| \le K_3\, |\theta|^3$ for all $\theta$. Note that there exist uniform bounds for $m$, $\Gamma$ and $K_3$ in terms of $K$. Moreover, if $\alpha = 2$ and $f_3$ corresponds to $p_b$, then $K_3 \le c\, |b|$. The next proposition is proved in the same way as Theorem 2.3.5, taking some extra care in obtaining uniform bounds. The relation (2.45) then follows from this proposition and the bound $K_3 \le c\, (|x|/n)^{\alpha-1}$.
Proposition 2.3.13 For every $\varepsilon > 0$, $K < \infty$, there is a $c$ such that the following holds. Let $p$ be a probability distribution on $\mathbb{Z}^d$ with $E[|X|^4] \le K$. Let $m$, $\Gamma$, $C$, $f_3$, $K_3$ be as in the previous paragraph. Moreover, assume that
\[ |e^{-i\theta\cdot m}\, \phi(\theta)| \le 1 - \varepsilon\, |\theta|^2, \qquad \theta \in [-\pi, \pi]^d. \]
Suppose $X_1, X_2, \ldots$ are independent random variables with distribution $p$ and $S_n = X_1 + \cdots + X_n$. Then if $nm \in \mathbb{Z}^d$,
\[ \Big| (2\pi n)^{d/2}\, (\det\Gamma)^{1/2}\, \mathbf{P}\{S_n = nm\} - 1 \Big| \le \frac{c\, [K_3\, \sqrt n + 1]}{n}. \]
Remark. The error term indicates the existence of two different regimes: $K_3 \ge n^{-1/2}$ and $K_3 \le n^{-1/2}$.
Proof We x , K and allow all constants in this proof to depend only on and K. Proposition
2.2.2 implies that
PS
n
= nm =
1
(2)
d
_
[,]
d
[e
im
()]
n
d.
The uniform upper bound on E[[X[
4
] implies uniform upper bounds on the lower moments. In
particular, det is uniformly bounded and hence it suces to nd n
0
such that the result holds for
n n
0
. Also observe that (2.47) holds with a uniform C from which we conclude
nlog
_
n
_
i
n(m )
2
nf
3
_
n
_
C
[[
4
n
.
In addition we have [nf
3
(/
n)[ K
3
[[
3
/
zZ
d
[p
n
(z) p
n
(z +y)[ c [y[ n
1/2
.
Proof By the triangle inequality, it suces to prove the result for y = e = e
j
. Let = 1/2d. By
Theorem 2.3.6,
p
n
(z +e) p
n
(z) =
j
p
n
(z) +O
_
1
n
(d+2)/2
_
.
Also Corollary 12.2.7 shows that
|z|n
(1/2)+
[p
n
(z) p
n
(z +e)[
|z|n
(1/2)+
[p
n
(z) +p
n
(z +e)] = o(n
1/2
).
But,
|z|n
(1/2)+
[p
n
(z) p
n
(z +e)[
zZ
d
[p
n
(z) p
n
(z +e)[ +
|z|n
(1/2)+
O
_
1
n
(d+2)/2
_
O(n
1/2
) +
zZ
d
[
j
p
n
(z)[.
A straightforward estimate which we omit gives
zZ
d
[
j
p
n
(z)[ = O(n
1/2
).
The last proposition holds with much weaker assumptions on the random walk. Recall that T
is the set of increment distributions p with the property that for each x Z
d
, there is an N
x
such
that p
n
(x) > 0 for all n N
x
.
Proposition 2.4.2 If p T
zZ
d
[p
n
(z) p
n
(z +y)[ c [y[ n
1/2
.
Proof In Exercise 1.3 it was shown that we can write
p = q + (1 )q
,
where q T
j=0
_
n
j
_
j
(1 )
nj
zZ
d
q
j
(x z) q
nj
(z). (2.48)
Therefore,
xZ
d
[p
n
(x) p
n
(x +y)[
n
j=0
_
n
j
_
j
(1 )
nj
xZ
d
q
nj
(x)
zZ
d
[q
j
(x z) q
j
(x +y z)[.
We split the rst sum into the sum over j < (/2)n and j (/2)n. Standard exponential estimates
for the binomial (see Lemma 12.2.8) give
j<(/2)n
_
n
j
_
j
(1 )
nj
xZ
d
q
nj
(x)
zZ
d
[q
j
(x z) q
j
(x +y z)[
2
j<(/2)n
_
n
j
_
j
(1 )
nj
= O(e
n
),
for some = () > 0. By Proposition 2.4.1,
j(/2)n
_
n
j
_
j
(1 )
nj
xZ
d
q
nj
(x)
zZ
d
[q
j
(x z) q
j
(x +y z)[
c n
1/2
[y[
j(/2)n
_
n
j
_
j
(1 )
nj
xZ
d
q
nj
(x) c n
1/2
[y[.
The last proposition has the following useful lemma as a corollary. Since this is essentially a
result about Markov chains in general, we leave the proof to the appendix, see Theorem 12.4.5.
Lemma 2.4.3 Suppose p T
d
. There is a c < such that if x, y Z
d
, we can dene S
n
, S
n
on
the same probability space such that: S
n
has the distribution of a random walk with increment p
with S
0
= x; S
n
has the distribution of a random walk with increment p with S
0
= y; and such that
for all n,
PS
m
= S
m
for all m n 1
c [x y[
n
.
While the proof of this last lemma is somewhat messy to write out in detail, there really is not a lot of content to it once we have Proposition 2.4.2. Suppose that $p, q$ are two probability distributions on $\mathbb{Z}^d$ with
\[ \frac{1}{2} \sum_{z \in \mathbb{Z}^d} |p(z) - q(z)| = \rho. \]
Then there is an easy way to define random variables $X, Y$ on the same probability space such that $X$ has distribution $p$, $Y$ has distribution $q$ and $\mathbf{P}\{X \neq Y\} = \rho$. Indeed, if we let $f(z) = \min\{p(z), q(z)\}$, we can let the probability space be $\mathbb{Z}^d \times \mathbb{Z}^d$ and define $\mu$ by
\[ \mu(z, z) = f(z), \]
and for $x \neq y$,
\[ \mu(x, y) = \frac{1}{\rho}\, [p(x) - f(x)]\, [q(y) - f(y)]. \]
If we let $X(x, y) = x$, $Y(x, y) = y$, it is easy to check that the marginal of $X$ is $p$, the marginal of $Y$ is $q$ and $\mathbf{P}\{X = Y\} = 1 - \rho$. The more general fact is not much more complicated than this.
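A minimal sketch of this coupling (illustrative only, not the construction used in the proof of Lemma 2.4.3): given two pmfs $p$ and $q$, sample a pair $(X, Y)$ with the correct marginals and $\mathbf{P}\{X \neq Y\} = \rho$. The two example distributions at the bottom are assumptions chosen for the demonstration.

```python
# Sketch of the coupling described above for pmfs on Z (assumed examples).
import random

def sample(pmf):
    u, acc = random.random(), 0.0
    for z, w in pmf.items():
        acc += w
        if u <= acc:
            return z
    return z   # guard against floating-point round-off

def coupled_pair(p, q):
    overlap = {z: min(p.get(z, 0.0), q.get(z, 0.0)) for z in set(p) | set(q)}
    rho = 1.0 - sum(overlap.values())          # total variation distance
    if random.random() < 1.0 - rho:
        # with probability 1 - rho, X = Y, drawn from the normalized overlap
        z = sample({z: w / (1.0 - rho) for z, w in overlap.items()})
        return z, z
    # otherwise draw X and Y independently from the normalized residuals
    x = sample({z: (p[z] - overlap.get(z, 0.0)) / rho for z in p})
    y = sample({z: (q[z] - overlap.get(z, 0.0)) / rho for z in q})
    return x, y

p = {-1: 0.25, 0: 0.5, 1: 0.25}
q = {-1: 0.3, 0: 0.4, 1: 0.3}                   # here rho = 0.1
pairs = [coupled_pair(p, q) for _ in range(100000)]
print("empirical P{X != Y} =", sum(x != y for x, y in pairs) / len(pairs))
```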
Proposition 2.4.4 Suppose $p \in \mathcal{P}_d$. There is a $c < \infty$ such that for all $n, x$,
\[ p_n(x) \le \frac{c}{n^{d/2}}. \tag{2.49} \]
Proof  If $p \in \mathcal{P}_d$ has bounded support this follows immediately from (2.22). For general $p \in \mathcal{P}_d$, write $p = \varepsilon q + (1-\varepsilon)\, q'$ with $q$ of bounded support and $q'$ as in the proof of Proposition 2.4.2. Then $p_n(x)$ is as in (2.48). The sum over $j < (\varepsilon/2)\, n$ is $O(e^{-\beta n})$ and for $j \ge (\varepsilon/2)\, n$, we have the bound
\[ q_j(x - z) \le c\, n^{-d/2}. \]
The central limit theorem implies that it takes $O(n^2)$ steps to go distance $n$. This proposition gives some bounds on large deviations for the number of steps.
Proposition 2.4.5 Suppose $S$ is a random walk with increment distribution $p \in \mathcal{P}_d$ and let
\[ \tau_n = \min\{k : |S_k| \ge n\}, \qquad \tilde\tau_n = \min\{k : \mathcal{J}(S_k) \ge n\}. \]
There exist $t > 0$ and $c < \infty$ such that for all $n$ and all $r > 0$,
\[ \mathbf{P}\{\tau_n \le r n^2\} + \mathbf{P}\{\tilde\tau_n \le r n^2\} \le c\, e^{-t/r}, \tag{2.50} \]
\[ \mathbf{P}\{\tau_n \ge r n^2\} + \mathbf{P}\{\tilde\tau_n \ge r n^2\} \le c\, e^{-rt}. \tag{2.51} \]
Proof There exists a c such that
cn
n
n/ c
so it suces to prove the estimates for
n
. It also
suces to prove the result for n suciently large. The central limit theorem implies that there is
an integer k such that for all n suciently large,
P[S
kn
2[ 2n
1
2
.
By the strong Markov property, this implies for all l
P
n
> kn
2
+l [
n
> l
1
2
,
and hence
P
n
> jkn
2
(1/2)
j
= e
j log 2
= e
jk(log 2/k)
.
This gives (2.51). The estimate (2.50) on
n
can be written as
P
_
max
1jrn
2
[S
j
[ n
_
= P
_
max
1jrn
2
[S
j
[ (1/
r)
rn
2
_
c e
t/r
,
which follows from (2.7).
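A quick simulation (not part of the text) illustrating Proposition 2.4.5 for one-dimensional simple random walk: the exit time $\tau_n$ of $(-n, n)$ is of order $n^2$, and its upper tail decays rapidly in $r$. The values of $n$ and the number of trials are arbitrary choices.

```python
# Simulation sketch for Proposition 2.4.5 (assumed example: 1-d SRW).
import random

def exit_time(n):
    """First time k with |S_k| >= n for one-dimensional simple random walk."""
    s, k = 0, 0
    while abs(s) < n:
        s += random.choice((-1, 1))
        k += 1
    return k

n, trials = 20, 2000
times = [exit_time(n) for _ in range(trials)]
print("mean tau_n / n^2 =", round(sum(times) / trials / n ** 2, 3))  # ~1 for SRW
for r in (1, 2, 4, 8):
    tail = sum(t > r * n * n for t in times) / trials
    print(f"P(tau_n > {r} n^2) ~ {tail:.3f}")
```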
The upper bound (2.51) for $\tau_n$ does not need any assumptions on the distribution of the increments other than that they be nontrivial; see Exercise 2.7.
Theorem 2.3.10 implies that for all $p \in \mathcal{P}_d$, $p_n(x) \le c\, n^{-d/2}\, (\sqrt n/|x|)^2$. The next proposition extends this to real $r > 2$ under the assumption that $E[|X_1|^r] < \infty$.
Proposition 2.4.6 Suppose $p \in \mathcal{P}_d$. There is a $c$ such that for all $n, x$,
\[ p_n(x) \le \frac{c}{n^{d/2}}\; \max_{0 \le j \le n}\, \mathbf{P}\{ |S_j| \ge |x|/2 \}. \]
In particular, if $r > 2$, $p \in \mathcal{P}_d$ and $E[|X_1|^r] < \infty$, then there exists $c < \infty$ such that for all $n, x$,
\[ p_n(x) \le \frac{c}{n^{d/2}} \Big( \frac{\sqrt n}{|x|} \Big)^{r}. \tag{2.52} \]
Proof Let m = n/2 if n is even and m = (n + 1)/2 if n is odd. Then,
S
n
= x = S
n
= x, [S
m
[ [x[/2 S
n
= x, [S
n
S
m
[ [x[/2 .
Hence it suces to estimate the probabilities of the events on the right-hand side. Using (2.49) we
get
PS
n
= x, [S
m
[ [x[/2 = P[S
m
[ [x[/2 PS
n
= x [ [S
m
[ [x[/2
P[S
m
[ [x[/2
_
sup
y
p
nm
(y, x)
_
c n
d/2
P[S
m
[ [x[/2 .
The other probability can be estimated similarly since
PS
n
= x, [S
n
S
m
[ [x[/2 = PS
n
= x, [S
nm
[ [x[/2 .
We claim that if p T
d
, r 2, and E[[X
1
[
r
] < , then there is a c such that E[[S
n
[
r
] c n
r/2
.
Once we have this, the Chebyshev inequality gives for m n,
P[S
m
[ [x[
c n
r/2
[x[
r
.
The claim is easier when r is an even integer (for then we can estimate the expectation by expanding
(X
1
+ +X
n
)
r
), but we give a proof for all r 2. Without loss of generality, assume d = 1. For
a xed n dene
T
1
=
T
1
= minj : [S
j
[ c
1
n,
and for l > 1,
T
l
= min
_
j >
T
l1
:
S
j
S
T
l1
c
1
n
_
, T
l
=
T
l
T
l1
,
where c
1
is chosen suciently large so that
PT
1
> n
1
2
.
The existence of such a c
1
follows from (2.6) applied with k = 1.
Let Y
1
= [S
T
1
[ and for l > 1, Y
l
=
T
l
S
T
l1
. Note that (T
1
, Y
1
), (T
2
, Y
2
), (T
3
, Y
3
), . . . are
independent, identically distributed random variables taking values in 1, 2, . . . R. Let be the
smallest l 1 such that T
l
> n. Then one can readily check from the triangle inequality that
[S
n
[ Y
1
+Y
2
+ +Y
1
+c
1
n
= c
1
n +
l=1
Y
l
,
where
Y
l
= Y
l
1T
l
n 1 > l 1. Note that
PY
1
c
1
n +t; T
1
n P[X
j
[ t for some 1 j n
nP[X
1
[ t.
Letting Z = [X
1
[, we get
E[
Y
r
1
] = E[Y
r
1
; T
l
n] = c
_
0
s
r1
PY
1
s; T
1
n ds
c
_
n
r/2
+
_
(c
1
+1)
n
s
r1
nPZ s c
1
n ds
_
= c
_
n
r/2
+
_
n
(s +
n)
r1
nPZ s ds
_
c
_
n
r/2
+ 2
r1
_
n
s
r1
nPZ s ds
_
c
_
n
r/2
+ 2
r1
nE[Z
r
]
_
c n
r/2
.
For l > 1,
E[
Y
r
l
] P > l 1 E[Y
r
l
1T
l
n [ > l 1] =
_
1
2
_
l1
E[
Y
r
1
].
Therefore,
E
_
(
Y
1
+
Y
2
+ )
r
_
= lim
l
E
_
(
Y
1
+ +
Y
l
)
r
_
lim
l
_
E[
Y
r
1
]
1/r
+ +E[
Y
r
l
]
1/r
_
r
= E[
Y
r
1
]
_
l=1
_
1
2
_
(l1)/r
_
r
= c E[
Y
r
1
].
2.5 LCLT combinatorial approach
In this section, we give another proof of an LCLT with estimates for one-dimensional simple random walk, both discrete and continuous time, using an elementary combinatorial approach. Our results are no stronger than those derived earlier, and this section is not needed for the remainder of the book, but it is interesting to see how much can be derived by simple counting methods. While we focus on simple random walk, extensions to $p \in \mathcal{P}_d$ are straightforward using (1.2). Although the arguments are relatively elementary, they do require a lot of calculation and estimation. Here is a basic outline:
Establish the result for discrete time random walk by exact counting of paths. Along the way we will prove Stirling's formula.
Prove an LCLT for Poisson random variables and use it to derive the result for one-dimensional continuous-time walks. (A result for $d$-dimensional continuous-time simple random walks follows immediately.)
We could continue this approach and prove an LCLT for multinomial random variables and use it to derive the result for discrete-time $d$-dimensional simple random walk, but we have chosen to omit this.
2.5.1 Stirling's formula and 1-d walks
Suppose $S_n$ is a simple one-dimensional random walk starting at the origin. Determining the distribution of $S_n$ reduces to an easy counting problem. In order for $X_1 + \cdots + X_{2n}$ to equal $2k$, exactly $n + k$ of the $X_j$ must equal $+1$. Since all $2^{2n}$ sequences of $\pm 1$ are equally likely,
\[ p_{2n}(2k) = \mathbf{P}\{S_{2n} = 2k\} = 2^{-2n} \binom{2n}{n+k} = 2^{-2n}\, \frac{(2n)!}{(n+k)!\,(n-k)!}. \tag{2.53} \]
We will use Stirling's formula, which we now derive, to estimate the factorial. In the proof, we will use some standard estimates about the logarithm; see Section 12.1.2.
Theorem 2.5.1 (Stirling's formula) As $n \to \infty$,
\[ n! \sim \sqrt{2\pi}\; n^{n+(1/2)}\, e^{-n}. \]
In fact,
\[ \frac{n!}{\sqrt{2\pi}\; n^{n+(1/2)}\, e^{-n}} = 1 + \frac{1}{12\, n} + O\Big(\frac{1}{n^2}\Big). \]
Proof Let b
n
= n
n+(1/2)
e
n
/n!. Then, (12.5) and Taylors theorem imply
b
n+1
b
n
=
1
e
_
1 +
1
n
_
n
_
1 +
1
n
_
1/2
=
_
1
1
2n
+
11
24n
2
+O
_
1
n
3
__ _
1 +
1
2n
1
8n
2
+O
_
1
n
3
__
= 1 +
1
12n
2
+O
_
1
n
3
_
.
Therefore,
lim
m
b
m
b
n
=
l=n
_
1 +
1
12l
2
+O
_
1
l
3
__
= 1 +
1
12n
+O
_
1
n
2
_
.
The second equality is obtained by
log
l=n
_
1 +
1
12l
2
+O
_
1
l
3
__
=
l=n
log
_
1 +
1
12l
2
+O
_
1
l
3
__
=
l=n
1
12l
2
+
l=n
O
_
1
l
3
_
=
1
12n
+O
_
1
n
2
_
.
This establishes that the limit
C :=
_
lim
m
b
m
_
1
exists and
b
n
=
1
C
_
1
1
12n
+O
_
1
n
2
__
,
n! = C n
n+(1/2)
e
n
_
1 +
1
12n
+O
_
1
n
2
__
.
There are a number of ways to determine the constant C. For example, if S
n
denotes a one-
dimensional simple random walk, then
P[S
2n
[
2n log n =
|2k|
2n log n
4
n
_
2n
n +k
_
=
|2k|
2n log n
4
n
(2n)!
(n +k)!(n k)!
.
Using (12.3), we see that as n , if [2k[
2n log n,
4
n
(2n)!
(n +k)!(n k)!
2
C
n
_
1 +
k
n
_
(n+k)
_
1
k
n
_
(nk)
=
2
C
n
_
1
k
2
/n
n
_
n
_
1 +
k
2
/n
k
_
k
_
1
k
2
/n
k
_
k
2
C
n
e
k
2
/n
.
Therefore,
lim
n
P[S
2n
[
2n log n = lim
n
|k|
n/2 log n
2
C
n
e
k
2
/n
=
2
C
_
e
x
2
dx =
2
C
.
However, Chebyshevs inequality shows that
P[S
2n
[
2n log n
Var[S
2n
]
2n log
2
n
=
1
log
2
n
0.
Therefore, C =
2.
By adapting this proof, it is easy to see that one can find $r_1 = 1/12,\, r_2,\, r_3, \ldots$ such that for each positive integer $k$,
\[ n! = \sqrt{2\pi}\; n^{n+(1/2)}\, e^{-n} \Big[ 1 + \frac{r_1}{n} + \frac{r_2}{n^2} + \cdots + \frac{r_k}{n^k} + O\Big(\frac{1}{n^{k+1}}\Big) \Big]. \tag{2.54} \]
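As a numerical check of Theorem 2.5.1 (a sketch, not part of the text), one can compute the ratio $n!/(\sqrt{2\pi}\, n^{n+1/2} e^{-n})$ on the log scale via the log-gamma function and compare it with $1 + 1/(12n)$. The values of $n$ below are arbitrary.

```python
# Numerical check of Stirling's formula with the 1/(12n) correction.
import math

for n in (5, 10, 50, 200, 1000):
    # log n! via lgamma, minus the log of the Stirling approximation
    log_ratio = math.lgamma(n + 1) - (0.5 * math.log(2 * math.pi)
                                      + (n + 0.5) * math.log(n) - n)
    ratio = math.exp(log_ratio)
    print(f"n={n:5d}  ratio={ratio:.10f}  1 + 1/(12n)={1 + 1 / (12 * n):.10f}")
```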
We will now prove Theorem 2.1.1 and some difference estimates in the special case of simple random walk in one dimension by using (2.53) and Stirling's formula. As a warmup, we start with the probability of being at the origin.
Proposition 2.5.2 For simple random walk in $\mathbb{Z}$, if $n$ is a positive integer, then
\[ \mathbf{P}\{S_{2n} = 0\} = \frac{1}{\sqrt{\pi n}} \Big[ 1 - \frac{1}{8n} + O\Big(\frac{1}{n^2}\Big) \Big]. \]
Proof  The probability is exactly
\[ 2^{-2n} \binom{2n}{n} = \frac{(2n)!}{4^n\, (n!)^2}. \]
By plugging into Stirling's formula, we see that the right hand side equals
\[ \frac{1}{\sqrt{\pi n}}\; \frac{1 + (24n)^{-1} + O(n^{-2})}{\big[1 + (12n)^{-1} + O(n^{-2})\big]^2}
   = \frac{1}{\sqrt{\pi n}} \Big[ 1 - \frac{1}{8n} + O\Big(\frac{1}{n^2}\Big) \Big]. \]
In the last proof, we just plugged into Stirling's formula and evaluated. We will now do the same thing to prove a version of the LCLT for one-dimensional simple random walk.
Proposition 2.5.3 For simple random walk in $\mathbb{Z}$, if $n$ is a positive integer and $k$ is an integer with $|k| \le n$,
\[ p_{2n}(2k) = \mathbf{P}\{S_{2n} = 2k\} = \frac{1}{\sqrt{\pi n}}\; e^{-k^2/n}\, \exp\Big\{ O\Big( \frac{1}{n} + \frac{k^4}{n^3} \Big) \Big\}. \]
In particular, if $|k| \le n^{3/4}$, then
\[ \mathbf{P}\{S_{2n} = 2k\} = \frac{1}{\sqrt{\pi n}}\; e^{-k^2/n} \Big[ 1 + O\Big( \frac{1}{n} + \frac{k^4}{n^3} \Big) \Big]. \]
Note that for one-dimensional simple random walk,
\[ 2\, \overline{p}_{2n}(2k) = 2\, \frac{1}{\sqrt{(2\pi)(2n)}}\, \exp\Big\{ -\frac{(2k)^2}{2\,(2n)} \Big\} = \frac{1}{\sqrt{\pi n}}\; e^{-k^2/n}. \]
While the theorem is stated for all $|k| \le n$, it is not a very strong statement when $k$ is of order $n$. For example, for $n/2 \le |k| \le n$, we can rewrite the conclusion as
\[ p_{2n}(2k) = \frac{1}{\sqrt{\pi n}}\; e^{-k^2/n}\, e^{O(n)} = e^{O(n)}, \]
which only tells us that there exists $\beta$ such that
\[ e^{-\beta n} \le p_{2n}(2k) \le e^{\beta n}. \]
In fact, $2\, \overline{p}_{2n}(2k)$ is not a very good approximation of $p_{2n}(2k)$ when $k$ is of order $n$. As an extreme example, note that
\[ p_{2n}(2n) = 4^{-n}, \qquad 2\, \overline{p}_{2n}(2n) = \frac{1}{\sqrt{\pi n}}\; e^{-n}. \]
Proof If n/2 [k[ n, the result is immediate using only the estimate 2
2n
PS
2n
= 2k 1.
Hence, we may assume that [k[ n/2. As noted before,
PS
2n
= 2k = 2
2n
_
2n
n +k
_
=
(2n)!
2
2n
(n +k)!(n k)!
.
If we restrict to [k[ n/2, we can use Stirlings formula (Lemma 2.5.1) to see that
PS
2n
= 2k =
_
1 +O
_
1
n
__
1
n
_
1
k
2
n
2
_
1/2
_
1
k
2
n
2
_
n
_
1
2k
n +k
_
k
.
The last two terms approach exponential functions. We need to be careful with the error terms.
Using (12.3) we get,
_
1
k
2
n
2
_
n
= e
k
2
/n
exp
_
O
_
k
4
n
3
__
.
_
1
2k
n +k
_
k
= e
2k
2
/(n+k)
exp
_
2k
3
(n +k)
2
+O
_
k
4
n
3
__
= e
2k
2
/(n+k)
exp
_
2k
3
n
2
+O
_
k
4
n
3
__
,
e
2k
2
/(n+k)
= e
2k
2
/n
exp
_
2k
3
n
2
+O
_
k
4
n
3
__
.
Also, using k
2
/n
2
max(1/n), (k
4
/n
3
), we can see that
_
1
k
2
n
2
_
1/2
= exp
_
O
_
1
n
+
k
4
n
3
__
.
Combining all of this gives the theorem.
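A short numerical illustration of Proposition 2.5.3 (not part of the book's argument): compare the exact probability $2^{-2n}\binom{2n}{n+k}$ with $(\pi n)^{-1/2} e^{-k^2/n}$ for a few values of $k$. The choices of $n$ and $k$ are arbitrary.

```python
# Numerical sketch for Proposition 2.5.3 (arbitrary n and k).
import math

n = 200
for k in (0, 5, 10, 20, 40):
    exact = math.comb(2 * n, n + k) / 4 ** n
    approx = math.exp(-k * k / n) / math.sqrt(math.pi * n)
    print(f"k={k:3d}  exact={exact:.6e}  approx={approx:.6e}  ratio={exact / approx:.5f}")
```

The ratio stays close to 1 for $|k| \ll n^{3/4}$ and drifts away as $k^4/n^3$ becomes appreciable, consistent with the error term in the proposition.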
We could also prove difference estimates by using the equalities
\[ p_{2n}(2k+2) = \frac{n - k}{n + k + 1}\; p_{2n}(2k), \qquad
   p_{2(n+1)}(2k) = p_{2n}(2k)\; \frac{1}{4}\, \frac{(2n+1)(2n+2)}{(n+k+1)(n-k+1)}. \]
Corollary 2.5.4 If $S_n$ is simple random walk, then for all positive integers $n$ and all $|k| < n$,
\[ \mathbf{P}\{S_{2n+1} = 2k+1\} = \frac{1}{\sqrt{\pi n}}\, \exp\Big\{ -\frac{(k + \tfrac12)^2}{n} \Big\}\, \exp\Big\{ O\Big( \frac{1}{n} + \frac{k^4}{n^3} \Big) \Big\}. \tag{2.55} \]
Proof Note that
PS
2n+1
= 2k + 1 =
1
2
PS
2n
= 2k +
1
2
PS
2n
= 2(k + 1).
Hence,
PS
2n+1
= 2k + 1 =
1
2
n
[e
k
2
/n
+e
(k+1)
2
/n
] exp
_
O
_
1
n
+
k
4
n
3
__
.
But,
exp
_
(k +
1
2
)
2
n
_
= e
k
2
/n
_
1
k
n
+O
_
k
2
n
2
__
,
exp
_
(k + 1)
2
n
_
= e
k
2
/n
_
1
2k
n
+O
_
k
2
n
2
__
,
which implies
1
2
_
e
k
2
/n
+e
(k+1)
2
/n
_
= exp
_
(k +
1
2
)
2
n
_
_
1 +O
_
k
2
n
2
__
.
Using k
2
/n
2
max(1/n), (k
4
/n
3
), we get (2.55).
One might think that we should replace $n$ in (2.55) with $n + (1/2)$. However,
\[ \frac{1}{\sqrt{n + (1/2)}} = \frac{1}{\sqrt n} \Big[ 1 + O\Big(\frac{1}{n}\Big) \Big]. \]
Hence, the same statement with $n + (1/2)$ replacing $n$ is also true.
2.5.2 LCLT for Poisson and continuous-time walks
The next proposition establishes the strong LCLT for Poisson random variables. This will be used for comparing discrete-time and continuous-time random walks with the same $p$. If $N_t$ is a Poisson random variable with parameter $t$, then $E[N_t] = t$, $\mathrm{Var}[N_t] = t$. The central limit theorem implies that as $t \to \infty$, the distribution of $(N_t - t)/\sqrt t$ approaches a standard normal, which suggests the approximation
\[ \mathbf{P}\{N_t = m\} = \mathbf{P}\Big\{ \frac{m - t}{\sqrt t} \le \frac{N_t - t}{\sqrt t} < \frac{m + 1 - t}{\sqrt t} \Big\}
   \approx \int_{(m-t)/\sqrt t}^{(m+1-t)/\sqrt t} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx
   \approx \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(m-t)^2}{2t}}. \]
In the next proposition, we use a straightforward combinatorial argument to justify this approximation.
Proposition 2.5.5 Suppose $N_t$ is a Poisson random variable with parameter $t$, and $m$ is an integer with $|m - t| \le t/2$. Then
\[ \mathbf{P}\{N_t = m\} = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(m-t)^2}{2t}}\, \exp\Big\{ O\Big( \frac{1}{\sqrt t} + \frac{|m - t|^3}{t^2} \Big) \Big\}. \]
Proof For notational ease, we will rst consider the case where t = n is an integer, and we let
m = n +k. Let
q(n, k) = PN
n
= n +k = e
n
n
n+k
(n +k)!
,
and note the recursion formula
q(n, k) =
n
n +k
q(n, k 1).
Stirlings formula (Theorem 2.5.1) gives
q(n, 0) =
e
n
n
n
n!
=
1
2n
_
1 +O
_
1
n
__
. (2.56)
By the recursion formula, if k n/2,
q(n, k) = q(n, 0)
__
1 +
1
n
_ _
1 +
2
n
_
_
1 +
k
n
__
1
,
and,
log
k
j=1
_
1 +
j
n
_
=
k
j=1
log
_
1 +
j
n
_
=
k
j=1
_
j
n
+O
_
j
2
n
2
__
=
k
2
2n
+
k
2n
+O
_
k
3
n
2
_
=
k
2
2n
+O
_
1
n
+
k
3
n
2
_
.
The last equality uses the inequality
k
n
max
_
1
n
,
k
3
n
2
_
,
which will also be used in other estimates in this proof. Using (2.56), we get
log q(n, k) = log
2n
k
2
2n
+O
_
1
n
+
k
3
n
2
_
,
and the result for k 0 follows by exponentiating.
Similarly,
q(n, k) = q(n, 0)
_
1
1
n
__
1
2
n
_
_
1
k 1
n
_
and
log q(n, k) = log
2n + log
k1
j=1
_
1
j
n
_
= log
2n
k
2
2n
+O
_
1
n
+
k
3
n
2
_
.
The proposition for integer n follows by exponentiating. For general t, let n = t and note that
PN
t
= n +k = PN
n
= n +k e
(tn)
_
1 +
t n
n
_
n+k
= PN
n
= n +k
_
1 +
t n
n
_
k
_
1 +O
_
1
n
__
= PN
n
= n +k
_
1 +O
_
[k[ + 1
n
__
= (2n)
1/2
e
k
2
/(2n)
exp
_
O
_
1
n
+
k
3
n
2
__
= (2t)
1/2
e
(k+nt)
2
/(2t)
exp
_
O
_
1
t
+
[n +k t[
3
t
2
__
.
The last step uses the estimates
1
t
=
1
n
_
1 +O
_
1
t
__
, e
k
2
2t
= e
k
2
2n
exp
_
O
_
k
2
t
2
__
.
We will use this to prove a version of the local central limit theorem for one-dimensional, continuous-time simple random walk.
Theorem 2.5.6 If $\tilde S_t$ is continuous-time one-dimensional simple random walk, then if $|x| \le t/2$,
\[ \tilde p_t(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{x^2}{2t}}\, \exp\Big\{ O\Big( \frac{1}{\sqrt t} + \frac{|x|^3}{t^2} \Big) \Big\}. \]
Proof We will assume that x = 2k is even; the odd case is done similarly. We know that
p
t
(2k) =
m=0
PN
t
= 2m p
2m
(2k).
Standard exponential estimates, see (12.12), show that for every > 0, there exist c, such that
P[N
t
t[ t c e
t
. Hence,
p
t
(2k) =
m=0
PN
t
= 2m p
2m
(2k)
= O(e
t
) +
PN
t
= 2m p
2m
(2k), (2.57)
where here and for the remainder of this proof, we write just
PN
t
= 2m p
2m
(2k) =
1
2t
e
x
2
2t
exp
_
O
_
1
t
+
[x[
3
t
2
__
.
A little thought shows that this and (2.57) imply the theorem.
By Proposition 2.5.3 we know that
p
2m
(2k) = PS
2m
= 2k =
1
m
e
k
2
m
exp
_
O
_
1
m
+
k
4
m
3
__
,
and by Proposition 2.5.5 we know that
PN
t
= 2m =
1
2t
e
(2mt)
2
2t
exp
_
O
_
1
t
+
[2mt[
3
t
2
__
.
Also, we have
1
2m
=
1
t
_
1 +O
_
[2mt[
t
__
,
1
2m
=
1
t
_
1 +O
_
[2mt[
t
__
,
which implies
e
k
2
m
= e
2k
2
t
exp
_
O
_
k
2
[2mt[
t
2
__
.
Combining all of this, we can see that the sum in (2.57) can be written as
1
2t
e
x
2
2t
exp
_
O
_
1
t
+
[x[
3
t
2
__
2t
e
(2mt)
2
2t
exp
_
O
_
[2mt[
3
t
2
__
.
We now choose so that [O([2mt[
3
/t
2
)[ (2mt)
2
/(4t) for all [2mt[ t. We will now show
that
2t
e
(2mt)
2
2t
exp
_
O
_
[2mt[
3
t
2
__
= 1 +O
_
1
t
_
,
which will complete the argument. Since
e
(2mt)
2
2t
exp
_
O
_
[2mt[
3
t
2
__
e
(2mt)
2
4t
,
is easy to see that the sum over [2mt[ > t
2/3
decays faster than any power of t. For [2mt[ t
2/3
we write
exp
_
O
_
[2mt[
3
t
2
__
= 1 +O
_
[2mt[
3
t
2
_
.
The estimate
|2mt|t
2/3
2
2t
e
(2mt)
2
2t
=
|m|t
2/3
/2
2
2t
e
2(m/
t)
2
= O
_
1
t
_
+ 2
_
2
e
2y
2
dy = 1 +O
_
1
t
_
is a standard approximation of an integral by a sum. Similarly,
|2mt|t
2/3
O
_
[2mt[
3
t
2
_
2
2t
e
(2mt)
2
2t
c
t
_
[y[
3
2
e
2y
2
dy = O
_
1
t
_
.
Exercises
Exercise 2.1 Suppose p T
d
, (0, 1) and E[[X
1
[
2+
] < . Show that the characteristic function
has the expansion
() = 1
2
+o([[
2+
), 0.
Show that the
n
in (2.32) can be chosen so that n
/2
n
0.
Exercise 2.2 Show that if p T
d
, there exists a c such that for all x Z
d
and all positive integers
n,
[p
n
(x) p
n
(0)[ c
[x[
n
(d+1)/2
.
(Hint: rst show the estimate for p T
d
with bounded support and then use (2.48). Alternatively,
one can use Lemma 2.4.3 at time n/2, the Markov property, and (2.49). )
Exercise 2.3 Show that Lemma 2.3.2 holds for p T
.
Exercise 2.4 Suppose p T
d
with E[[X[
3
] < . Show that there is a c < such that for all
[y[ = 1,
[p
n
(0) p
n
(y)[
c
n
(d+2)/2
.
Exercise 2.5 Suppose p T
d
. Let A Z
d
and
h(x) = P
x
S
n
A i.o..
Show that if h(x) > 0 for some x Z
d
, then h(x) = 1 for all x Z
d
.
Exercise 2.6 Suppose S
n
is a random walk with increment distribution p T
d
. Show that there
exists a b > 0 such that
sup
n>1
E
_
exp
_
b[S
n
[
2
n
__
< .
Exercise 2.7 Suppose X
1
, X
2
, . . . are independent, identically distributed random variables in Z
d
with PX
1
= 0 < 1 and let S
n
= X
1
+ +X
n
.
Show that there exists an r such that for all n
P[S
rn
2[ n
1
2
.
Show that there exist c, t such that for all b > 0,
P
_
max
1jn
2
[S
j
[ bn
_
c e
t/b
.
Exercise 2.8 Find r
2
, r
3
in (2.54).
Exercise 2.9 Let S
n
denote one-dimensional simple random walk. In this exercise we will prove
without using Stirlings formula that there exists a constant C such that
p
2n
(0) =
C
n
_
1
1
8n
+O
_
1
n
2
__
.
a. Show that if n 1,
p
2(n+1)
=
_
1 +
1
2n
_ _
1 +
1
n
_
1
p
2n
.
b. Let b
n
=
np
2n
(0). Show that b
1
= 1/2 and for n 1,
b
n+1
b
n
= 1 +
1
8n
2
+O
_
1
n
3
_
.
c. Use this to show that b
= limb
n
exists and is positive. Moreover,
b
n
= b
_
1
1
8n
+O
_
1
n
2
__
.
Exercise 2.10 Show that if p T
d
with E[[X
1
[
3
] < , then
2
j
p
n
(x) =
2
j
p
n
(x) +O(n
(d+3)/2
).
Exercise 2.11 Suppose q : Z
d
R has nite support, and k is a positive integer such that for all
l 1, . . . , k 1 and all j
1
, . . . , j
l
1, . . . , d,
x=(x
1
,...,x
d
)Z
d
x
j
1
x
j
2
. . . x
j
l
q(x) = 0.
Then we call the operator
f(x) :=
y
f(x +y) q(y)
a dierence operator of order (at least) k. The order of the operator is the largest k for which this
is true. Suppose is a dierence operator of order k 1.
Suppose g is a C
function on R
d
. Dene g
on Z
d
by g
(0)[ = O([[
k
), 0.
Show that if p T
d
with E[[X
1
[
3
] < , then
p
n
(x) = p
n
(x) +O(n
(d+1+k)/2
).
Show that if p T
d
is symmetric with E[[X
1
[
4
] < , then
p
n
(x) = p
n
(x) +O(n
(d+2+k)/2
).
Exercise 2.12 Suppose p T
2
. Show that there is a c such that the following is true. Let S
n
be a p-walk and let
n
= infj : [S
j
[ n.
If y Z
2
, let
V
n
(y) =
n1
j=0
1S
j
= y
denote the number of visits to y before time
n
. Then, if 0 < [y[ < n,
E[V
k
(y)] c
1 + log n log [y[
n
.
Hint: Show that there exist c
1
, such that for each positive integer j,
jn
2
j<(j+1)n
2
1S
j
= y; j <
n
c
1
e
j
n
1
.
3
Approximation by Brownian motion
3.1 Introduction
Suppose S
n
= X
1
+ + X
n
is a one-dimensional simple random walk. We make this into a
(random) continuous function by linear interpolation,
S
t
= S
n
+ (t n) [S
n+1
S
n
], n t n + 1.
For xed integer n, the LCLT describes the distribution of S
n
. A corollary of LCLT is the usual
central limit theorem that states that the distribution of n
1/2
S
n
converges to that of a standard
normal random variable. A simple extension of this is the following: suppose $0 < t_1 < t_2 < \cdots < t_k = 1$. Then as $n \to \infty$ the distribution of
\[ n^{-1/2}\, \big( S_{t_1 n},\, S_{t_2 n},\, \ldots,\, S_{t_k n} \big) \]
converges to that of
\[ (Y_1,\; Y_1 + Y_2,\; \ldots,\; Y_1 + Y_2 + \cdots + Y_k), \]
where $Y_1, \ldots, Y_k$ are independent mean zero normal random variables with variances $t_1,\, t_2 - t_1,\, \ldots,\, t_k - t_{k-1}$, respectively.
The functional central limit theorem (also called the invariance principle or Donsker's theorem) for random walk extends this result to the random function
\[ W^{(n)}_t := n^{-1/2}\, S_{tn}. \tag{3.1} \]
The functional central limit theorem states roughly that as n , the distribution of this random
function converges to the distribution of a random function t B
t
. From what we know about the
simple random walk, here are some properties that would be expected of the random function B
t
:
If $s < t$, the distribution of $B_t - B_s$ is $N(0, t - s)$.
If $0 \le t_0 < t_1 < \cdots < t_k$, then $B_{t_1} - B_{t_0}, \ldots, B_{t_k} - B_{t_{k-1}}$ are independent random variables.
These two properties follow almost immediately from the central limit theorem. The third property is not as obvious.
The function $t \mapsto B_t$ is continuous.
Although this is not obvious, we can guess this from the heuristic argument:
\[ E\big[ (B_{t+\Delta t} - B_t)^2 \big] \approx \Delta t, \]
which indicates that $|B_{t+\Delta t} - B_t|$ should be of order $\sqrt{\Delta t}$. […] Let $D = \bigcup_{n=0}^{\infty} D_n$ denote the nonnegative dyadic rationals. Our strategy will be as follows:
define $B_t$ for $t$ in $D$ satisfying conditions (a) and (b);
derive an estimate on the oscillation of $B_t$, $t \in D$, that implies that with probability one the paths are uniformly continuous on compact intervals;
define $B_t$ for other values of $t$ by continuity.
The first step is straightforward using a basic property of normal random variables. Suppose $X, Y$ are independent normal random variables, each mean $0$ and variance $1/2$. Then $Z = X + Y$ is $N(0,1)$. Moreover, the conditional distribution of $X$ given the value of $Z$ is normal with mean $Z/2$ and variance $1/4$. This can be checked directly using the density of the normals. Alternatively, one can check that if $Z, N$ are independent $N(0,1)$ random variables then
\[ X := \frac{Z}{2} + \frac{N}{2}, \qquad Y := \frac{Z}{2} - \frac{N}{2}, \tag{3.3} \]
are independent $N(0, 1/2)$ random variables. To verify this, one only notes that $(X, Y)$ has a joint normal distribution with $E[X] = E[Y] = 0$, $E[X^2] = E[Y^2] = 1/2$, $E[XY] = 0$. (See Corollary 12.3.1.) This tells us that in order to define $X, Y$ we can start with independent random variables $N, Z$ and then use (3.3).
Figure 3.1: The dyadic construction
We start by defining $B_t$ for $t \in D_0 = \mathbb{N}$ by $B_0 = 0$ and
\[ B_j = N_{0,1} + \cdots + N_{0,j}. \]
We then continue recursively using (3.3). Suppose $B_t$ has been defined for all $t \in D_n$. Then we define $B_t$ for $t \in D_{n+1} \setminus D_n$ by
\[ B_{\frac{2k+1}{2^{n+1}}} = B_{\frac{k}{2^n}} + \frac12 \Big[ B_{\frac{k+1}{2^n}} - B_{\frac{k}{2^n}} \Big] + 2^{-(n+2)/2}\, N_{2k+1,\, n+1}. \]
By induction, one can check that for each $n$ the collection of random variables $Z_{k,n} := B_{k/2^n} - B_{(k-1)/2^n}$ are independent, each with a $N(0, 2^{-n})$ distribution. Since this is true for each $n$, we can see that (a) and (b) hold (with the natural filtration) provided that we restrict to $t \in D$. The scaling property for normal random variables shows that for each integer $n$, the random variables
\[ 2^{n/2}\, B_{t/2^n}, \qquad t \in D, \]
have the same joint distribution as the random variables
\[ B_t, \qquad t \in D. \]
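The following minimal sketch (not the book's code) implements the dyadic refinement just described: starting from the values of $B$ at integer times, each level halves the spacing by setting the midpoint equal to the average of the endpoints plus an independent $2^{-(n+2)/2} N(0,1)$ correction. The number of levels and the variance check at the end are choices made only for illustration.

```python
# Sketch of the dyadic construction of Brownian motion on [0, 1].
import random

def dyadic_brownian(levels, T=1):
    """Values of B at the points k / 2**levels, 0 <= k <= T * 2**levels."""
    b = [0.0]
    for _ in range(T):                            # level 0: B_j = sum of N(0,1)'s
        b.append(b[-1] + random.gauss(0.0, 1.0))
    for n in range(levels):
        refined = []
        std = 2 ** (-(n + 2) / 2)                 # conditional std of the midpoint
        for k in range(len(b) - 1):
            refined.append(b[k])
            refined.append(0.5 * (b[k] + b[k + 1]) + random.gauss(0.0, std))
        refined.append(b[-1])
        b = refined
    return b

path = dyadic_brownian(levels=10)
# crude check: increments over spacing 2^-10 should have variance ~ 2^-10
incs = [path[k + 1] - path[k] for k in range(len(path) - 1)]
var = sum(x * x for x in incs) / len(incs)
print("empirical increment variance * 2^10 =", round(var * 2 ** 10, 3))
```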
We define the oscillation of $B_t$ (restricted to $t \in D$) by
\[ \mathrm{osc}(B; \delta, T) = \sup\big\{ |B_t - B_s| : s, t \in D;\; 0 \le s, t \le T;\; |s - t| \le \delta \big\}. \]
For fixed $\delta, T$, this is an $\mathcal{F}_T$-measurable random variable. We write $\mathrm{osc}(B; \delta)$ for $\mathrm{osc}(B; \delta, 1)$. Let
\[ M_n = \max_{0 \le k < 2^n}\; \sup\big\{ |B_{t + k2^{-n}} - B_{k2^{-n}}| : t \in D;\; 0 \le t \le 2^{-n} \big\}. \]
The random variable $M_n$ is similar to $\mathrm{osc}(B; 2^{-n})$ but is easier to analyze. Note that if $r \le 2^{-n}$,
\[ \mathrm{osc}(B; r) \le \mathrm{osc}(B; 2^{-n}) \le 3\, M_n. \tag{3.4} \]
To see this, suppose 2
n
, 0 < s < t s + 1, and [B
s
B
t
[ . Then there exists a k
such that either k2
n
s < t (k + 1)2
n
or (k 1)2
n
s k2
n
< t (k + 1)2
n
. In either
case, the triangle inequality tells us that M
n
/3. We will prove a proposition that bounds the
probability of large values of osc(B; , T). We start with a lemma which gives a similar bound for
M
n
.
Lemma 3.2.1 For every integer n and every > 0,
PM
n
> 2
n/2
4
_
2
2
n
2
/2
.
Proof Note that
PM
n
> 2
n/2
2
n
P
_
sup
0t2
n
[B
t
[ > 2
n/2
_
= 2
n
P
_
sup
0t1
[B
t
[ >
_
.
Here the supremums are taken over t D. Also note that
P sup[B
t
[ : 0 t 1, t D > = lim
n
P max[B
k2
n[ : k = 1, . . . , 2
n
>
2 lim
n
P maxB
k2
n : k = 1, . . . , 2
n
> .
The reection principle (see Proposition 1.6.2 and the remark following) shows that
P maxB
k2
n : k = 1, . . . , 2
n
> 2 PB
1
>
= 2
_
2
e
x
2
/2
dx
2
_
2
e
x/2
dx = 2
_
2
1
e
2
/2
.
Proposition 3.2.2 There exists a c > 0 such that for every 0 < 1, r 1, and positive integer
T,
Posc(B; , T) > c r
_
log(1/) c T
r
2
.
Proof It suces to prove the result for T = 1 since for general T we can estimate separately the
oscillations over the 2T 1 intervals [0, 1], [1/2, 3/2], [1, 2], . . . , [T 1, T]. Also, it suces to prove
the result for 1/4. Suppose that 2
n1
2
n
. Using (3.4), we see that
Posc(B; ) > c r
_
log(1/) P
_
M
n
>
cr
3
2
_
2
n
log(1/)
_
.
By Lemma 3.2.1, if c is chosen suciently large, the probability on the right-hand side is bounded
by a constant times
exp
_
1
4
_
c
2
r
2
18
_
log(1/)
_
,
which for c large enough is bounded by a constant times
r
2
.
Corollary 3.2.3 With probability one, for every integer T < , the function t B
t
, t D is
uniformly continuous on [0, T].
Proof Uniform continuity on [0, T] is equivalent to saying that osc(B; 2
n
, T) 0 as n . The
previous proposition implies that there is a c
1
such that
Posc(B; 2
n
, T) > c
1
2
n/2
n c
1
T 2
n
.
In particular,
n=1
Posc(B; 2
n
, T) > c
1
2
n/2
n < ,
which implies by Borel-Cantelli that with probability one osc(B; 2
n
, T) c
1
2
n/2
n for all n
suciently large.
Given the corollary, we can dene B
t
for t , D by continuity, i.e.,
B
t
= lim
tnt
B
tn
,
where t
n
D with t
n
t. It is not dicult to show that this satises the denition of Brownian
motion (we omit the details). Moreover, since B
t
has continuous paths, we can write
osc(B; , T) = sup[B
t
B
s
[ : 0 s, t T; [s t[ .
We restate the estimate and include a fact about scaling of Brownian motion. Note that if B
t
is a
standard Brownian motion and a > 0, then Y
t
:= a
1/2
B
at
is also a standard Brownian motion.
Theorem 3.2.4 (Modulus of continuity of Brownian motion) There is a c < such that
if B
t
is a standard Brownian motion, 0 < 1, r c, T 1,
Posc(B; , T) > r
_
log(1/) c T
(r/c)
2
.
Moreover, if T > 0, then osc(B; , T) has the same distribution as
= 1 = PB
= 1 = 1/2.
Lemma 3.3.1 E[] = 1 and there exists a b < such that E[e
b
] < .
Proof Note that for integer n
P > n P > n 1, [B
n
B
n1
[ 2 = P > n 1 P[B
n
B
n1
[ 2,
which implies for integer n,
P > n P[B
n
B
n1
[ 2
n
= e
n
,
with > 0. This implies that E[e
b
] < for b < . If s < t, then E[B
2
t
t [ T
s
] = B
2
s
s (Exercise
3.1). This shows that B
2
t
t is a continuous martingale. Also,
E[[B
2
t
t[; > t] (t + 1) P > t 0.
Therefore, we can use the optional sampling theorem (Theorem 12.2.9) to conclude that E[B
2
] =
0. Since E[B
2
n
= inft
n1
: [B
t
B
n1
[ = 1.
Then S
n
:= B
n
is a simple one-dimensional random walk. Let T
n
=
n
n1
. The random
variables T
1
, T
2
, . . . are independent, identically distributed, with mean one satisfying E[e
bT
j
] <
for some b > 0. As before, we dene S
t
for noninteger t by linear interpolation. Let
(B, S; n) = max[B
t
S
t
[ : 0 t n.
In other words, (B, S; n) is the distance between the continuous functions B and S in C[0, n]
using the usual supremum norm. If j t < j + 1 n, then
[B
t
S
t
[ [S
j
S
t
[ +[B
j
B
t
[ +[B
j
S
j
[ 1 + osc(B; 1, n) +[B
j
B
j
[.
Hence for integer n,
(B, S; n) 1 + osc(B; 1, n) + max[B
j
B
j
[ : j = 1, . . . , n. (3.6)
We can estimate the probabilities for the second term with (3.5). We will concentrate on the last
term. Before doing the harder estimates, let us consider how large an error we should expect. Since
T
1
, T
2
, . . . are i.i.d. random variables with mean 1 and nite variance, the central limit theorem says
roughly that
[
n
n[ =
j=1
[T
j
1]
n.
Hence we would expect that
[B
n
B
n
[
_
[
n
n[ n
1/4
.
From this reasoning, we can see that we expect (B, S; n) to be at least of order n
1/4
. The next
theorem shows that it is unlikely that the actual value is much greater than n
1/4
.
We actually need the strong Markov property for Brownian motion to justify this and the next assertion. This is not dicult
to prove, but we will not do it in this book.
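A minimal sketch (not from the book) of the Skorokhod embedding described above: a discretized Brownian path is run forward, and $S_n = B_{\tau_n}$ is read off at the successive times the path has moved by one from its previous embedding point. The time step and the snapping of the small overshoot back to the lattice are simplifications made for illustration.

```python
# Sketch of Skorokhod embedding of simple random walk in a Brownian path.
import random

def skorokhod_srw(n_steps, dt=1e-3):
    """Embedded walk S_n = B_{tau_n} and the (approximate) times tau_n."""
    b = anchor = 0.0
    t = 0.0
    S, tau = [0], [0.0]
    while len(S) <= n_steps:
        b += random.gauss(0.0, dt ** 0.5)       # Brownian increment over dt
        t += dt
        if abs(b - anchor) >= 1.0:              # |B_t - B_{tau_{n-1}}| reaches 1
            step = 1 if b > anchor else -1
            anchor += step                       # B_{tau_n}, snapped to the lattice
            b = anchor                           # ignore the tiny overshoot
            S.append(S[-1] + step)
            tau.append(t)
    return S, tau

S, tau = skorokhod_srw(200)
n = len(S) - 1
print("tau_n / n (should be close to 1):", round(tau[-1] / n, 3))
print("first few embedded steps:", S[:10])
```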
Theorem 3.3.2 There exist 0 < c
1
, a < such that for all r n
1/4
and all integers n 3
P(B, S; n) > r n
1/4
_
log n c
1
e
ar
.
Proof It suces to prove the theorem for r 9c
2
where c
). Suppose 9c
2
r n
1/4
. If [B
n
B
n
[
is large, then either [n
n
[ is large or the oscillation of B is large. Using (3.6), we see that the
event (B, S; n) r n
1/4
n, 2n) (r/3) n
1/4
_
log n ,
_
max
1jn
[
j
j[ r
n
_
.
Indeed, if osc(B; r
n, 2n) (r/3) n
1/4
log n and [
j
j[ r
log n.
Note that Theorem 3.2.4 gives for 1 r n
1/4
,
P osc(B; r
n, n) > (r/3) n
1/4
_
log n
= 3 P osc(B; r n
1/2
) > (r/3) n
1/4
_
log n
3 P
_
osc(B; r n
1/2
) > (
r/3)
_
r n
1/2
log(n
1/2
/r)
_
.
If
r/3 c
and r n
1/4
, we can use Theorem 3.2.4 to conclude that there exist c, a such that
P
_
osc(B; r n
1/2
) > (
r/3)
_
r n
1/2
log(n
1/2
/r)
_
c e
ar log n
.
For the second event, consider the martingale
M
j
=
j
j.
Using (12.12) on M
j
and M
j
, we see that there exist c, a such that
P
_
max
1jn
[
j
j[ r
n
_
c e
ar
2
. (3.7)
The proof actually gives the stronger upper bound of c [e
ar
2
+e
ar log n
] but we will not need this improve-
ment.
Extending the Skorokhod approximation to continuous time simple random walk
S
t
is not dicult
although in this case the path t
S
t
is not continuous. Let N
t
be a Poisson process with parameter
1 dened on the same probability space and independent of the Brownian motion B. Then
S
t
:= S
Nt
has the distribution of the continuous-time simple random walk. Since N
t
t is a martingale, and
the Poisson distribution has exponential moments, another application of (12.12) shows that for
r t
1/4
,
P
_
max
0st
[N
s
s[ r
t
_
c e
ar
2
.
Let
(B,
S; n) = sup[B
t
S
t
[ : 0 t n.
Then the following is proved similarly.
Theorem 3.3.3 There exist 0 < c, a < such that for all 1 r n
1/4
and all positive integers n
P(B,
S; n) r n
1/4
_
log n c e
ar
.
3.4 Higher dimensions
It is not dicult to extend Theorems 3.3.2 and 3.3.3 to p T
d
for d > 1. A d-dimensional Brownian
motion with covariance matrix with respect to a ltration T
t
is a collection of random variables
B
t
, t 0 satisfying the following:
(a) B
0
= 0;
(b) if s < t, then B
t
B
s
is an T
t
-measurable random R
d
-valued variable, independent of T
s
,
whose distribution is joint normal with mean zero and covariance matrix (t s) .
(c) with probability one, t B
t
is a continuous function.
Lemma 3.4.1 Suppose B
(1)
, . . . , B
(l)
are independent one-dimensional standard Brownian motions
and v
1
, . . . , v
l
R
d
. Then
B
t
:= B
(1)
t
v
1
+ +B
(l)
t
v
l
is a Brownian motion in R
d
with covariance matrix = AA
T
where A = [v
1
v
2
v
l
].
Proof Straightforward and left to the reader.
In particular, a standard d-dimensional Brownian motion is of the form
B
t
= (B
(1)
t
, . . . , B
(d)
t
)
where B
(1)
, . . . , B
(d)
are independent one-dimensional Brownian motions. Its covariance matrix is
the identity.
The next theorem shows that one can dene d-dimensional Brownian motions and d-dimensional
random walks on the same probability space so that their paths are close to each other. Although
the proof will use Skorokhod embedding, it is not true that the d-dimensional random walk is
embedded into the d-dimensional Brownian motion. In fact, it is impossible to have an embedded
walk since for d > 1 the probability that a d-dimensional Brownian motion B
t
visits the countable
set Z
d
after time 0 is zero.
Theorem 3.4.2 Let p T
d
with covariance matrix . There exist c, a and a probability space
(, T, P) on which are dened a Brownian motion B with covariance matrix ; a discrete-time
random walk S with increment distribution p; and a continuous-time random walk
S with increment
distribution p such that for all positive integers n and all 1 r n
1/4
,
P(B, S; n) r n
1/4
_
log n c e
ar
,
P(B,
S; n) r n
1/4
_
log n c e
ar
.
Proof Suppose v
1
, . . . , v
l
are the points such p(v
j
) = p(v
j
) = q
j
/2 and p(z) = 0 for all other
z Z
d
0. Let L
n
= (L
1
n
, . . . , L
l
n
) be a multinomial process with parameters q
1
, . . . , q
l
, and let
B
1
, . . . , B
l
be independent one-dimensional Brownian motions. Let S
1
, . . . , S
l
be the random walks
derived from B
1
, . . . , B
l
by Skorokhod embedding. As was noted in (1.2),
S
n
:= S
1
L
1
n
v
1
+. . . +S
l
L
l
n
v
l
,
has the distribution of a random walk with increment distribution p. Also,
B
t
:= B
1
t
v
1
+ +B
l
t
v
l
,
is a Brownian motion with covariance matrix . The proof now proceeds as in the previous cases.
One fact that is used is that the L
j
n
have a binomial distribution and hence we can get an exponential
estimate
P
_
max
1jn
[L
i
j
q
i
j[ a
n
_
c e
a
.
3.5 An alternative formulation
Here we give a slightly dierent, but equivalent, form of the strong approximation from which we
get (3.2). We will illustrate this in the case of one-dimensional simple random walk. Suppose B
t
is a standard Brownian motion dened on a probability space (, T, P). For positive integer n, let
B
(n)
t
denote the Brownian motion
B
(n)
t
= n
1/2
B
nt
.
Let S
(n)
denote the simple random walk derived from B
(n)
using the Skorokhod embedding. Then
we know that for all positive integers T,
P
_
max
0tTn
[S
(n)
t
B
(n)
t
[ c r (Tn)
1/4
_
log(Tn)
_
c e
ar
.
If we let
W
(n)
t
= n
1/2
S
(n)
tn
,
then this becomes
P
_
max
0tT
[W
(n)
t
B
t
[ c r T
1/4
n
1/4
_
log(Tn)
_
c e
ar
.
In particular, if r = c
1
log n where c
1
= c
1
(T) is chosen suciently large,
P
_
max
0tT
[W
(n)
t
B
t
[ c
1
n
1/4
log
3/2
n
_
c
1
n
2
.
By the Borel-Cantelli lemma, with probability one
max
0tT
[W
(n)
t
B
t
[ c
1
n
1/4
log
3/2
n
for all n suciently large. In particular, with probability one W
(n)
converges to B in the metric
space C[0, T].
By using a multinomial process (in the discrete-time case) or a Poisson process (in the continuous-
time) case, we can prove the following.
Theorem 3.5.1 Suppose p T
d
with covariance matrix . There exist c < , a > 0 and a prob-
ability space (, T, P) on which are dened a d-dimensional Brownian motion B
t
with covariance
matrix ; an innite sequence of discrete-time p-walks, S
(1)
, S
(2)
, . . .; and an innite sequence of
continuous time p-walks
S
(1)
,
S
(2)
, . . . such that the following holds for every r > 0, T 1. Let
W
(n)
t
= n
1/2
S
(n)
nt
,
W
(n)
t
= n
1/2
S
(n)
nt
.
Then,
P
_
max
0tT
[W
(n)
t
B
t
[ c r T
1/4
n
1/4
_
log(Tn)
_
c e
ar
.
P
_
max
0tT
[
W
(n)
t
B
t
[ c r T
1/4
n
1/4
_
log(Tn)
_
c e
ar
.
In particular, with probability one W
(n)
B and
W
(n)
B in the metric space C
d
[0, T].
Exercises
Exercise 3.1 Show that if B
t
is a standard Brownian motion with respect to the ltration T
t
and
s < t, then E[B
2
t
t [ T
s
] = B
2
s
s.
Exercise 3.2 Let X be an integer-valued random variable with PX = 0 = 0 and E[X] = 0.
(a) Show that there exist numbers r
j
(0, ],
r
1
r
2
, r
1
r
2
,
such that if B
t
is a standard Brownian motion and
T = inft : B
t
Z 0, t r
Bt
,
then B
T
has the same distribution as X.
(b) Show that if X has bounded support, then there exists a b > 0 with E[e
bT
] < .
(c) Show that E[T] = E[X
2
].
(Hint: you may wish to consider rst the cases where X is supported on 1, 1, 1, 2, 1, and
1, 2, 1, 2, respectively.)
Exercise 3.3 Show that there exist c < , > 0 such that the following is true. Suppose
B
t
= (B
1
t
, B
2
t
) is a standard two-dimensional Brownian motion and let T
R
= inft : [B
t
[ R. Let
U
R
denote the unbounded component of the open set R
2
B[0, T
R
]. Then,
P
x
0 U
R
c([x[/R)
.
(Hint: Show there is a < 1 such that for all R and all [x[ < R,
P
x
0 U
2R
[ 0 U
R
. )
Exercise 3.4 Show that there exist c < , > 0 such that the following is true. Suppose S
n
is
simple random walk in Z
2
starting at x ,= 0, and let
R
= minn : [S
n
[ R. Then the probability
that there is a nearest neighbor path starting at the origin and ending at [z[ R that does
intersect S
j
: 0 j
R
is no more than c([x[/R)
d
is called recurrent if PS
n
= 0 i.o. = 1.
If the walk is not recurrent it is called transient. We will also say that p is recurrent or transient.
It is easy to see using the Markov property that p is recurrent if and only if for each x Z
d
,
P
x
S
n
= 0 for some n 1 = 1,
and p is transient if and only if the escape probability, q, is positive, where q is dened by
q = PS
n
,= 0 for all n 1.
Theorem 4.1.1 If p T
d
with d = 1, 2, then p is recurrent. If p T
d
with d 3, then p is
transient. For all p,
q =
_
n=0
p
n
(0)
_
1
, (4.1)
where the left-hand side equals zero if the sum is divergent.
Proof Let Y =
n=0
1S
n
= 0 denote the number of visits to the origin and note that
E[Y ] =
n=0
PS
n
= 0 =
n=0
p
n
(0).
If p T
d
with d = 1, 2, the LCLT (see Theorem 2.1.1 and Theorem 2.3.9) implies that p
n
(0)
c n
d/2
and the sum is innite. If p T
d
with d 3, then (2.49) shows that p
n
(0) c n
d/2
and
hence E[Y ] < . We can compute E(Y ) in terms of q. Indeed, the Markov property shows that,
PY = j = (1 q)
j1
q. Therefore, if q > 0,
E[Y ] =
j=0
j PY = j =
j=0
j (1 q)
j1
q =
1
q
.
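A Monte Carlo illustration of Theorem 4.1.1 (a sketch, not part of the text): estimate the expected number of returns to the origin for simple random walk in $d = 1, 2, 3$ over a finite horizon. For $d = 1, 2$ the estimate keeps growing with the horizon; for $d = 3$ it stabilizes near $G(0) - 1 = (1-q)/q \approx 0.52$. The horizon and trial counts are arbitrary choices.

```python
# Monte Carlo sketch for recurrence versus transience (arbitrary parameters).
import random

def returns_to_origin(d, n_steps):
    """Number of visits to 0 in {1, ..., n_steps} for simple random walk in Z^d."""
    pos = [0] * d
    count = 0
    for _ in range(n_steps):
        i = random.randrange(d)
        pos[i] += random.choice((-1, 1))
        if all(x == 0 for x in pos):
            count += 1
    return count

trials, horizon = 200, 10000
for d in (1, 2, 3):
    est = sum(returns_to_origin(d, horizon) for _ in range(trials)) / trials
    print(f"d={d}: average number of returns in {horizon} steps ~ {est:.2f}")
```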
4.2 Green's generating function
If $p \in \mathcal{P}_d$ and $x, y \in \mathbb{Z}^d$, we define the Green's generating function to be the power series in $\xi$:
\[ G(x, y; \xi) = \sum_{n=0}^{\infty} \xi^n\, \mathbf{P}^x\{S_n = y\} = \sum_{n=0}^{\infty} \xi^n\, p_n(y - x). \]
Note that the sum is absolutely convergent for $|\xi| < 1$. We write just $G(y; \xi)$ for $G(0, y; \xi)$. If $p$ is symmetric, then $G(x; \xi) = G(-x; \xi)$.
The generating function is defined for complex $\xi$, but there is a particular interpretation of the sum for positive $\xi \le 1$. Suppose $T$ is a random variable independent of the random walk $S$ with a geometric distribution,
\[ \mathbf{P}\{T = j\} = \xi^{j-1}\, (1 - \xi), \qquad j = 1, 2, \ldots, \]
i.e., $\mathbf{P}\{T > j\} = \xi^j$ (if $\xi = 1$, then $T \equiv \infty$). We think of $T$ as a killing time for the walk and we will refer to such $T$ as a geometric random variable with killing rate $1 - \xi$. At each time $j$, if the walker has not already been killed, the process is killed with probability $1 - \xi$, where the killing is independent of the walk. If the random walk starts at the origin, then the expected number of visits to $x$ before being killed is given by
\[ E\Big[ \sum_{j < T} \mathbf{1}\{S_j = x\} \Big] = E\Big[ \sum_{j=0}^{\infty} \mathbf{1}\{S_j = x;\, T > j\} \Big]
   = \sum_{j=0}^{\infty} \mathbf{P}\{S_j = x;\, T > j\} = \sum_{j=0}^{\infty} p_j(x)\, \xi^j = G(x; \xi). \]
Theorem 4.1.1 states that a random walk is transient if and only if $G(0; 1) < \infty$, in which case the escape probability is $G(0; 1)^{-1}$. For a transient random walk, we define the Green's function to be
\[ G(x, y) = G(x, y; 1) = \sum_{n=0}^{\infty} p_n(y - x). \]
We write $G(x) = G(0, x)$; if $p$ is symmetric, then $G(x) = G(-x)$. The strong Markov property implies that
\[ G(0, x) = \mathbf{P}\{ S_n = x \text{ for some } n \ge 0 \}\; G(0, 0). \tag{4.2} \]
Similarly, we define
\[ \tilde G(x, y; \xi) = \int_0^{\infty} \xi^t\, \tilde p_t(x, y)\, dt. \]
For $\xi \in (0, 1)$ this is the expected amount of time spent at site $y$ by a continuous-time random walk with increment distribution $p$ started at $x$, before an independent killing time that has an exponential distribution with rate $\log(1/\xi)$. We will now show that if we set $\xi = 1$, we get the same Green's function as that induced by the discrete walk.
Proposition 4.2.1 If $p \in \mathcal{P}_d$ is transient, then
\[ \int_0^{\infty} \tilde p_t(x)\, dt = G(x). \]
Proof Let S
n
denote a discrete-time walk with distribution p, N
t
an independent Poisson process
with parameter 1, and let
S
t
denote the continuous-time walk
S
t
= S
Nt
. Let
Y
x
=
n=0
1S
n
= x,
Y
x
=
_
0
1
S
t
= x dt,
denote the amount of time spent at x by S and
S, respectively. Then G(x) = E[Y
x
]. If we let
T
n
= inft : N
t
= n, then we can write
Y
x
=
n=0
1S
n
= x (T
n+1
T
n
).
Independence of S and N implies
E[1S
n
= x (T
n+1
T
n
)] = PS
n
= x E[T
n+1
T
n
] = PS
n
= x.
Hence E[
Y
x
] = E[Y
x
].
Remark. Suppose $p$ is the increment distribution of a random walk in $\mathbb{Z}^d$. For $\epsilon > 0$, let $p_\epsilon$ denote the increment of the lazy walker given by
\[ p_\epsilon(x) = \begin{cases} (1 - \epsilon)\, p(x), & x \neq 0, \\ \epsilon + (1 - \epsilon)\, p(0), & x = 0. \end{cases} \]
If $p$ is irreducible and periodic on $\mathbb{Z}^d$, then for each $0 < \epsilon < 1$, $p_\epsilon$ is irreducible and aperiodic. The generator and characteristic function of $p_\epsilon$ are
\[ \mathcal{L}_\epsilon = (1 - \epsilon)\, \mathcal{L}, \qquad \phi_\epsilon(\theta) = \epsilon + (1 - \epsilon)\, \phi(\theta). \qquad (4.3) \]
If $p$ has mean zero and covariance matrix $\Gamma$, then $p_\epsilon$ has mean zero and covariance matrix satisfying
\[ \Gamma_\epsilon = (1 - \epsilon)\, \Gamma, \qquad \det \Gamma_\epsilon = (1 - \epsilon)^d\, \det \Gamma. \qquad (4.4) \]
If $p$ is transient, and $G, G_\epsilon$ denote the corresponding Green's functions, then
\[ G_\epsilon(x) = \frac{1}{1 - \epsilon}\, G(x). \qquad (4.5) \]
For some proofs it is convenient to assume that the walk is aperiodic; results for periodic walks can then be derived using these relations.
If $n \geq 1$, let $f_n(x, y)$ denote the probability that a random walk starting at $x$ first visits $y$ at time $n$ (not counting time $n = 0$), i.e.,
\[ f_n(x, y) = \mathbb{P}^x\{S_n = y;\, S_1 \neq y, \ldots, S_{n-1} \neq y\} = \mathbb{P}^x\{\tau_y = n\}, \]
where
\[ \tau_y = \min\{j \geq 1 : S_j = y\}, \qquad \bar{\tau}_y = \min\{j \geq 0 : S_j = y\}. \]
Let $f_n(x) = f_n(0, x)$ and note that
\[ \mathbb{P}^x\{\tau_y < \infty\} = \sum_{n=1}^\infty f_n(x, y) = \sum_{n=1}^\infty f_n(y - x) \leq 1. \]
Define the first visit generating function by
\[ F(x, y; \xi) = F(y - x; \xi) = \sum_{n=1}^\infty \xi^n\, f_n(y - x). \]
If $\xi \in (0, 1)$, then
\[ F(x, y; \xi) = \mathbb{P}^x\{\tau_y < T_\xi\}, \]
where $\mathbb{P}\{T_\xi > n\} = \xi^n$.
Proposition 4.2.2 If $n \geq 1$,
\[ p_n(y) = \sum_{j=1}^n f_j(y)\, p_{n-j}(0). \]
If $\xi \in \mathbb{C}$,
\[ G(y; \xi) = \delta(y) + F(y; \xi)\, G(0; \xi), \qquad (4.6) \]
where $\delta$ denotes the delta function. In particular, if $|F(0; \xi)| < 1$,
\[ G(0; \xi) = \frac{1}{1 - F(0; \xi)}. \qquad (4.7) \]

Proof The first equality follows from
\[ \mathbb{P}\{S_n = y\} = \sum_{j=1}^n \mathbb{P}\{\tau_y = j;\, S_n - S_j = 0\} = \sum_{j=1}^n \mathbb{P}\{\tau_y = j\}\, p_{n-j}(0). \]
The second equality uses
\[ \sum_{n=1}^\infty p_n(x)\, \xi^n = \Big( \sum_{n=1}^\infty f_n(x)\, \xi^n \Big) \Big( \sum_{m=0}^\infty p_m(0)\, \xi^m \Big), \]
which follows from the first equality. For $\xi \in (0, 1]$, there is a probabilistic interpretation of (4.6). If $y \neq 0$, the expected number of visits to $y$ (before time $T_\xi$) is the probability of reaching $y$ before the killing time, times the expected number of visits to $y$ given this event; this product is $F(y;\xi)\,G(0;\xi)$. If $y = 0$, we have to add an extra 1 to account for $p_0(y)$.
If $\xi \in (0, 1)$, the identity (4.7) can be considered as a generalization of (4.1). Note that
\[ F(0; \xi) = \sum_{j=1}^\infty \mathbb{P}\{\tau_0 = j;\, T_\xi > j\} = \mathbb{P}\{\tau_0 < T_\xi\}, \]
and hence
\[ \mathbb{P}\{S_j \neq 0 : j = 1, \ldots, T_\xi - 1\} = 1 - F(0; \xi) = G(0; \xi)^{-1}. \qquad (4.8) \]

Proposition 4.2.3 Suppose $p \in \mathcal{P}_d$ with characteristic function $\phi$. Then if $x \in \mathbb{Z}^d$, $|\xi| < 1$,
\[ G(x; \xi) = \frac{1}{(2\pi)^d} \int_{[-\pi, \pi]^d} \frac{1}{1 - \xi\, \phi(\theta)}\, e^{-i\theta \cdot x}\, d\theta. \]
If $d \geq 3$, this holds for $\xi = 1$, i.e.,
\[ G(x) = \frac{1}{(2\pi)^d} \int_{[-\pi, \pi]^d} \frac{1}{1 - \phi(\theta)}\, e^{-i\theta \cdot x}\, d\theta. \]
Proof All of the integrals in this proof will be over $[-\pi, \pi]^d$. The formal calculation, using Corollary 2.2.3, is
\[ G(x; \xi) = \sum_{n=0}^\infty \xi^n\, p_n(x) = \sum_{n=0}^\infty \xi^n\, \frac{1}{(2\pi)^d} \int \phi(\theta)^n\, e^{-i\theta \cdot x}\, d\theta = \frac{1}{(2\pi)^d} \int \Big( \sum_{n=0}^\infty (\xi\, \phi(\theta))^n \Big)\, e^{-i\theta \cdot x}\, d\theta = \frac{1}{(2\pi)^d} \int \frac{1}{1 - \xi\, \phi(\theta)}\, e^{-i\theta \cdot x}\, d\theta. \]
The interchange of the sum and the integral in the second equality is justified by the dominated convergence theorem as we now describe. For each $N$,
\[ \Big| \sum_{n=0}^N \xi^n\, \phi(\theta)^n\, e^{-i\theta \cdot x} \Big| \leq \frac{1}{1 - |\xi|\, |\phi(\theta)|}. \]
If $|\xi| < 1$, then the right-hand side is bounded by $1/[1 - |\xi|]$. If $p \in \mathcal{P}_d$ and $\xi = 1$, then (2.13) shows that the right-hand side is bounded by $c\, |\theta|^{-2}$ for some $c$. If $d \geq 3$, $|\theta|^{-2}$ is integrable on $[-\pi, \pi]^d$. If $p$ is bipartite, we can use (4.3) and (4.5).
Some results are easier to prove for geometrically killed random walks than for walks restricted to a fixed number of steps. This is because stopping time arguments work more nicely for such walks. Suppose that $S_n$ is a random walk, $\sigma$ is a stopping time for the random walk, and $T_\xi$ is an independent geometric random variable. Then on the event $\{T_\xi > \sigma\}$ the distribution of $T_\xi - \sigma$ given $\{S_n,\, n = 0, \ldots, \sigma\}$ is the same as that of $T_\xi$. This loss of memory property for geometric and exponential random variables can be very useful. The next proposition gives an example of a result proved first for geometrically killed walks. The result for fixed length random walks can be deduced from the geometrically killed walk result by using Tauberian theorems. Tauberian theorems are one of the major tools for deriving facts about a sequence from its generating functions. We will only use some simple Tauberian theorems; see Section 12.5.
Proposition 4.2.4 Suppose $p \in \mathcal{P}_d \cup \widetilde{\mathcal{P}}_d$ (aperiodic or bipartite), $d = 1, 2$. Let
\[ q(n) = \mathbb{P}\{S_j \neq 0 : j = 1, \ldots, n\}. \]
Then as $n \to \infty$,
\[ q(n) \sim \begin{cases} r\, \pi^{-1}\, n^{-1/2}, & d = 1, \\ r\, (\log n)^{-1}, & d = 2, \end{cases} \]
where $r = (2\pi)^{d/2}\, \sqrt{\det \Gamma}$.

Proof We will assume $p \in \mathcal{P}_d$; it is not difficult to extend this to bipartite $p$. We will establish the corresponding facts about the generating functions for $q(n)$: as $\xi \uparrow 1$,
\[ \sum_{n=0}^\infty \xi^n\, q(n) \sim \frac{r}{\Gamma(1/2)}\, \frac{1}{\sqrt{1 - \xi}}, \qquad d = 1, \qquad (4.9) \]
\[ \sum_{n=0}^\infty \xi^n\, q(n) \sim \frac{r}{1 - \xi}\, \Big[ \log\Big( \frac{1}{1 - \xi} \Big) \Big]^{-1}, \qquad d = 2. \qquad (4.10) \]
Here $\Gamma(\cdot)$ denotes the Gamma function. Since the sequence $q(n)$ is monotone in $n$, Propositions 12.5.2 and 12.5.3 imply the proposition (recall that $\Gamma(1/2) = \sqrt{\pi}$).
Let $T_\xi$ be a geometric random variable with killing rate $1 - \xi$. Then (4.8) tells us that
\[ \mathbb{P}\{S_j \neq 0 : j = 1, \ldots, T_\xi - 1\} = G(0; \xi)^{-1}. \]
Also,
\[ \mathbb{P}\{S_j \neq 0 : j = 1, \ldots, T_\xi - 1\} = \sum_{n=0}^\infty \mathbb{P}\{T_\xi = n+1\}\, q(n) = (1 - \xi) \sum_{n=0}^\infty \xi^n\, q(n). \]
Using (2.32) and Lemma 12.5.1, we can see that as $\xi \uparrow 1$,
\[ G(0; \xi) = \sum_{n=0}^\infty \xi^n\, p_n(0) = \sum_{n=0}^\infty \xi^n \Big[ \frac{1}{r\, n^{d/2}} + o\Big( \frac{1}{n^{d/2}} \Big) \Big] \sim \frac{1}{r}\, F\Big( \frac{1}{1 - \xi} \Big), \]
where
\[ F(s) = \begin{cases} \Gamma(1/2)\, \sqrt{s}, & d = 1, \\ \log s, & d = 2. \end{cases} \]
This gives (4.9) and (4.10).
Corollary 4.2.5 Suppose $S_n$ is a random walk with increment distribution $p \in \mathcal{P}_d$ and
\[ \tau = \tau_0 = \min\{j \geq 1 : S_j = 0\}. \]
Then $\mathbb{E}[\tau] = \infty$.

(We use the boldface $\boldsymbol{\Gamma}$, or the notation $\Gamma(\cdot)$, to denote the Gamma function to distinguish it from the covariance matrix $\Gamma$.)

Proof If $d \geq 3$, then transience implies that $\mathbb{P}\{\tau = \infty\} > 0$. For $d = 1, 2$, the result follows from the previous proposition which tells us
\[ \mathbb{P}\{\tau > n\} \geq \begin{cases} c\, n^{-1/2}, & d = 1, \\ c\, (\log n)^{-1}, & d = 2. \end{cases} \]

One of the basic ingredients of Proposition 4.2.4 is the fact that the random walk always starts afresh when it returns to the origin. This idea can be extended to returns of a random walk to a set if the set is sufficiently symmetric that it looks the same at all points. For an example, see Exercise 4.2.
4.3 Green's function, transient case

In this section, we will study the Green's function for $p \in \mathcal{P}_d$, $d \geq 3$. The Green's function $G(x, y) = G(y, x) = G(y - x)$ is given by
\[ G(x) = \sum_{n=0}^\infty p_n(x) = \mathbb{E}\Big[ \sum_{n=0}^\infty 1\{S_n = x\} \Big] = \mathbb{E}^x\Big[ \sum_{n=0}^\infty 1\{S_n = 0\} \Big]. \]
Note that
\[ G(x) = 1\{x = 0\} + \sum_y p(x, y)\, \mathbb{E}^y\Big[ \sum_{n=0}^\infty 1\{S_n = 0\} \Big] = \delta(x) + \sum_y p(x, y)\, G(y). \]
In other words,
\[ \mathcal{L} G(x) = -\delta(x) = \begin{cases} -1, & x = 0, \\ 0, & x \neq 0. \end{cases} \]
Recall from (4.2) that
\[ G(x) = \mathbb{P}^x\{\bar{\tau}_0 < \infty\}\, G(0). \]

In the calculations above, as well as throughout this section, we use the symmetry of the Green's function, $G(x, y) = G(y, x)$. For nonsymmetric random walks, one must be careful to distinguish between $G(x, y)$ and $G(y, x)$.
The next theorem gives the asymptotics of the Green's function as $|x| \to \infty$. Recall that $\mathcal{J}(x)^2 = d\,\mathcal{J}^*(x)^2 = x \cdot \Gamma^{-1} x$. Since $\Gamma$ is nonsingular, $\mathcal{J}(x)$ is comparable to $|x|$.

Theorem 4.3.1 Suppose $p \in \mathcal{P}_d$, $d \geq 3$. Then as $x \to \infty$,
\[ G(x) = \frac{C_d}{\mathcal{J}(x)^{d-2}} + O\Big( \frac{1}{|x|^d} \Big) = \frac{\bar{C}_d}{\mathcal{J}^*(x)^{d-2}} + O\Big( \frac{1}{|x|^d} \Big), \]
where $C_d = d^{(d/2)-1}\, \bar{C}_d$ and
\[ C_d = \frac{\Gamma\big( \frac{d-2}{2} \big)}{2\, \pi^{d/2}\, \sqrt{\det \Gamma}} = \frac{\Gamma\big( \frac{d}{2} \big)}{(d-2)\, \pi^{d/2}\, \sqrt{\det \Gamma}}. \]
Here $\Gamma$ denotes the covariance matrix and $\Gamma(\cdot)$ denotes the Gamma function. In particular, for simple random walk,
\[ G(x) = \frac{d\, \Gamma(\frac{d}{2})}{(d-2)\, \pi^{d/2}}\, \frac{1}{|x|^{d-2}} + O\Big( \frac{1}{|x|^d} \Big). \]

For simple random walk we can write
\[ \bar{C}_d = \frac{2d}{(d-2)\, \Omega_d} = \frac{2}{(d-2)\, V_d}, \]
where $\Omega_d$ denotes the surface area of the unit $(d-1)$-dimensional sphere and $V_d$ is the volume of the unit ball in $\mathbb{R}^d$. See Exercise 6.18 for a derivation of this relation. More generally,
\[ \bar{C}_d = \frac{2}{(d-2)\, V_\Gamma}, \]
where $V_\Gamma$ denotes the volume of the ellipsoid $\{x \in \mathbb{R}^d : \mathcal{J}^*(x) \leq 1\}$.

The last statement of Theorem 4.3.1 follows from the first statement using $\Gamma = d^{-1} I$, $\mathcal{J}^*(x) = |x|$ for simple random walk. It suffices to prove the first statement for aperiodic $p$; the proof for bipartite $p$ follows using (4.4) and (4.5). The proof of the theorem will consist of two estimates:
\[ G(x) = \sum_{n=0}^\infty p_n(x) = O\Big( \frac{1}{|x|^d} \Big) + \sum_{n=1}^\infty \bar{p}_n(x), \qquad (4.11) \]
and
\[ \sum_{n=1}^\infty \bar{p}_n(x) = \frac{C_d}{\mathcal{J}(x)^{d-2}} + o\Big( \frac{1}{|x|^d} \Big). \]
The second estimate uses the next lemma.

Lemma 4.3.2 Let $b > 1$. Then as $r \to \infty$,
\[ \sum_{n=1}^\infty n^{-b}\, e^{-r/n} = \frac{\Gamma(b-1)}{r^{b-1}} + O\Big( \frac{1}{r^{b+1}} \Big). \]

Proof The sum is a Riemann sum approximation of the integral
\[ I_r := \int_0^\infty t^{-b}\, e^{-r/t}\, dt = \frac{1}{r^{b-1}} \int_0^\infty y^{b-2}\, e^{-y}\, dy = \frac{\Gamma(b-1)}{r^{b-1}}. \qquad (4.12) \]
If $f : (0, \infty) \to \mathbb{R}$ is a $C^2$ function and $n$ is a positive integer, then Lemma 12.1.1 gives
\[ \Big| f(n) - \int_{n-(1/2)}^{n+(1/2)} f(s)\, ds \Big| \leq \frac{1}{24}\, \sup\{ |f''(t)| : |t - n| \leq 1/2 \}. \]
Choosing $f(t) = t^{-b}\, e^{-r/t}$, we get
\[ \Big| n^{-b}\, e^{-r/n} - \int_{n-(1/2)}^{n+(1/2)} t^{-b}\, e^{-r/t}\, dt \Big| \leq \frac{c}{n^{b+2}} \Big( 1 + \frac{r^2}{n^2} \Big)\, e^{-r/n}, \qquad n \geq \sqrt{r}. \]
(The restriction $n \geq \sqrt{r}$ is used to guarantee that $e^{-r/(n+(1/2))} \leq c\, e^{-r/n}$.) Therefore,
\[ \sum_{n \geq \sqrt{r}} \Big| n^{-b}\, e^{-r/n} - \int_{n-(1/2)}^{n+(1/2)} t^{-b}\, e^{-r/t}\, dt \Big| \leq c \sum_{n \geq \sqrt{r}} \frac{1}{n^{b+2}} \Big( 1 + \frac{r^2}{n^2} \Big)\, e^{-r/n} \leq c \int_0^\infty t^{-(b+2)} \Big( 1 + \frac{r^2}{t^2} \Big)\, e^{-r/t}\, dt \leq c\, r^{-(b+1)}. \]
The last step uses (4.12). It is easy to check that the sum over $n < \sqrt{r}$ and the integral over $t < \sqrt{r}$ decay faster than any power of $r$.
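A direct numerical check of Lemma 4.3.2 is immediate (a minimal Python sketch; the values of $b$ and $r$ are arbitrary choices for the illustration):

import math

def lhs(b, r, n_max=200000):
    # sum_{n=1}^{n_max} n^{-b} exp(-r/n); the tail beyond n_max is negligible
    # here because b > 1 and n_max is taken much larger than r.
    return sum(n ** (-b) * math.exp(-r / n) for n in range(1, n_max + 1))

def rhs(b, r):
    # The leading term Gamma(b-1) / r^{b-1} from Lemma 4.3.2.
    return math.gamma(b - 1) / r ** (b - 1)

if __name__ == "__main__":
    b = 2.5                      # the case b = d/2 with d = 5
    for r in (10.0, 100.0, 1000.0):
        print(r, lhs(b, r), rhs(b, r))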
Proof of Theorem 4.3.1. Using Lemma 4.3.2 with $b = d/2$, $r = \mathcal{J}(x)^2/2$, we have
\[ \sum_{n=1}^\infty \bar{p}_n(x) = \sum_{n=1}^\infty \frac{1}{(2\pi n)^{d/2}\, \sqrt{\det \Gamma}}\, e^{-\mathcal{J}(x)^2/(2n)} = \frac{\Gamma\big( \frac{d-2}{2} \big)}{2\, \pi^{d/2}\, \sqrt{\det \Gamma}}\, \frac{1}{\mathcal{J}(x)^{d-2}} + O\Big( \frac{1}{|x|^{d+2}} \Big). \]
Hence we only need to prove (4.11). A simple estimate shows that
\[ \sum_{n < |x|} \bar{p}_n(x) \]
as a function of $x$ decays faster than any power of $x$. Similarly, using Proposition 2.1.2,
\[ \sum_{n < |x|} p_n(x) = o(|x|^{-d}). \qquad (4.13) \]
Using (2.5), we see that
\[ \sum_{n > |x|^2} |p_n(x) - \bar{p}_n(x)| \leq c \sum_{n > |x|^2} n^{-(d+2)/2} = O(|x|^{-d}). \]
Let $k = d + 3$. For $|x| \leq n \leq |x|^2$, (2.3) implies that there is an $r$ such that
\[ |p_n(x) - \bar{p}_n(x)| \leq c \Big[ \Big( \frac{|x|}{\sqrt{n}} \Big)^k\, e^{-r|x|^2/n}\, \frac{1}{n^{(d+2)/2}} + \frac{1}{n^{(d+k-1)/2}} \Big]. \qquad (4.14) \]
Note that
\[ \sum_{n \geq |x|} n^{-(d+k-1)/2} = O(|x|^{-(d+k-3)/2}) = O(|x|^{-d}), \]
and
\[ \sum_{n \geq |x|} \Big( \frac{|x|}{\sqrt{n}} \Big)^k\, e^{-r|x|^2/n}\, \frac{1}{n^{(d+2)/2}} \leq c \int_0^\infty \Big( \frac{|x|}{\sqrt{t}} \Big)^k\, e^{-r|x|^2/t}\, \frac{dt}{(\sqrt{t})^{d+2}} \leq c\, |x|^{-d}. \]
Remark. The error term in this theorem is very small. In order to prove that it is this small we need the sharp estimate (4.14) which uses the fact that the third moments of the increment distribution are zero. If $p \in \mathcal{P}'_d$ with bounded increments but with nonzero third moments, there exists a similar asymptotic expansion for the Green's function except that the error term is $O(|x|^{-(d-1)})$, see Theorem 4.3.5. We have used bounded increments (or at least the existence of sufficiently large moments) in an important way in (4.13). Theorem 4.3.5 proves asymptotics under weaker moment assumptions; however, mean zero, finite variance is not sufficient to conclude that the Green's function is asymptotic to $c\, \mathcal{J}(x)^{2-d}$ for $d \geq 4$. See Exercise 4.5.

Often one does not use the full force of these asymptotics. An important thing to remember is that $G(x) \asymp |x|^{2-d}$. There are a number of ways to remember the exponent $2 - d$. For example, the central limit theorem implies that the random walk should visit on the order of $R^2$ points in the ball of radius $R$. Since there are of order $R^d$ points in this ball, the probability that a particular point is visited is of order $R^{2-d}$. In the case of standard $d$-dimensional Brownian motion, the Green's function is proportional to $|x|^{2-d}$. This is the unique (up to multiplicative constant) harmonic, radially symmetric function on $\mathbb{R}^d \setminus \{0\}$ that goes to zero as $|x| \to \infty$ (see Exercise 4.4).
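To make the order of magnitude concrete, here is a small simulation sketch (the walk length, sample size and the point $x$ are arbitrary choices for the illustration) estimating $G(x)$ for simple random walk in $\mathbb{Z}^3$ and comparing it with the leading term $\frac{d\,\Gamma(d/2)}{(d-2)\pi^{d/2}}|x|^{2-d} = \frac{3}{2\pi|x|}$ from Theorem 4.3.1. The truncation at a finite number of steps biases the estimate slightly downward.

import math
import random

def estimate_G(x, n_walks=5000, n_steps=4000):
    # Estimate G(x) = expected number of visits to x by a simple random walk
    # in Z^3 started at the origin, truncated at n_steps steps.
    target = tuple(x)
    total_visits = 0
    for _ in range(n_walks):
        pos = [0, 0, 0]
        for _ in range(n_steps):
            i = random.randrange(3)
            pos[i] += random.choice((-1, 1))
            if tuple(pos) == target:
                total_visits += 1
    return total_visits / n_walks

if __name__ == "__main__":
    random.seed(2)
    x = (4, 0, 0)
    leading_term = 3.0 / (2.0 * math.pi * math.hypot(*x))
    print(estimate_G(x), leading_term)   # the two values should be close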
Corollary 4.3.3 If $p \in \mathcal{P}_d$, then
\[ \nabla_j G(x) = \nabla_j \frac{C_d}{\mathcal{J}(x)^{d-2}} + O(|x|^{-d}). \]
In particular, $\nabla_j G(x) = O(|x|^{-d+1})$. Also, $\nabla_j^2 G(x) = O(|x|^{-d})$.

Remark. We could also prove this corollary with improved error terms by using the difference estimates for the LCLT such as Theorem 2.3.6, but we will not need the sharper results in this book. If $p \in \mathcal{P}'_d$ with bounded increments but nonzero third moments, we could also prove difference estimates for the Green's function using Theorem 2.3.6. The starting point is to write
\[ \nabla_y G(x) = \sum_{n=0}^\infty \nabla_y \bar{p}_n(x) + \sum_{n=0}^\infty [\nabla_y p_n(x) - \nabla_y \bar{p}_n(x)]. \]
4.3.1 Asymptotics under weaker assumptions

In this section we establish the asymptotics of $G$ for certain $p \in \mathcal{P}'_d$, $d \geq 3$. We will follow the basic outline of the proof of Theorem 4.3.1. Let $\hat{G}(x) = C_d / \mathcal{J}(x)^{d-2}$ denote the dominant term in the asymptotics. From that proof we see that
\[ G(x) = \hat{G}(x) + o(|x|^{-d}) + \sum_{n=0}^\infty [p_n(x) - \bar{p}_n(x)]. \]
In the discussion below, we let $\alpha \in \{0, 1, 2\}$. If $\mathbb{E}[|X_1|^4] < \infty$ and the third moments vanish, we set $\alpha = 2$. If this is not the case, but $\mathbb{E}[|X_1|^3] < \infty$, we set $\alpha = 1$. Otherwise, we set $\alpha = 0$. By Theorems 2.3.5 and 2.3.9 we can see that there exists a sequence $\epsilon_n \downarrow 0$ such that
\[ \sum_{n \geq |x|^2} |p_n(x) - \bar{p}_n(x)| \leq c \sum_{n \geq |x|^2} \frac{\epsilon_n + \mathbf{1}\{\alpha \geq 1\}}{n^{(d+\alpha)/2}} = \begin{cases} o(|x|^{2-d}), & \alpha = 0, \\ O(|x|^{2-d-\alpha}), & \alpha = 1, 2. \end{cases} \]
This is the order of magnitude that we will try to show for the error term, so this estimate suffices for this sum. The sum that is more difficult to handle, and which in some cases requires additional moment conditions, is
\[ \sum_{n < |x|^2} |p_n(x) - \bar{p}_n(x)|. \]
Theorem 4.3.4 Suppose $p \in \mathcal{P}'_3$. Then
\[ G(x) = \hat{G}(x) + o\Big( \frac{1}{|x|} \Big). \]
If $\mathbb{E}[|X_1|^3] < \infty$ we can write
\[ G(x) = \hat{G}(x) + O\Big( \frac{\log |x|}{|x|^2} \Big). \]
If $\mathbb{E}[|X_1|^4] < \infty$ and the third moments vanish, then
\[ G(x) = \hat{G}(x) + O\Big( \frac{1}{|x|^2} \Big). \]

Proof By Theorem 2.3.10, there exists $\epsilon_n \downarrow 0$ such that
\[ \sum_{n < |x|^2} |p_n(x) - \bar{p}_n(x)| \leq c \sum_{n < |x|^2} \frac{\epsilon_n + \mathbf{1}\{\alpha \geq 1\}}{|x|^2\, n^{(1+\alpha)/2}}. \]
The next theorem shows that if we assume enough moments of the distribution, then we get the asymptotics as in Theorem 4.3.1. Note that as $d \to \infty$, the number of moments assumed grows.

Theorem 4.3.5 Suppose $p \in \mathcal{P}'_d$, $d \geq 3$.
If $\mathbb{E}[|X_1|^{d+1}] < \infty$, then
\[ G(x) = \hat{G}(x) + O(|x|^{1-d}). \]
If $\mathbb{E}[|X_1|^{d+3}] < \infty$ and the third moments vanish, then
\[ G(x) = \hat{G}(x) + O(|x|^{-d}). \]

Proof Let $\alpha = 1$ under the weaker assumption and $\alpha = 2$ under the stronger assumption, and set $k = d + 2\alpha - 1$ so that $\mathbb{E}[|X_1|^k] < \infty$. As mentioned above, it suffices to show that
\[ \sum_{n < |x|^2} [p_n(x) - \bar{p}_n(x)] = O(|x|^{2-d-\alpha}). \]
Let $\beta = 2(1 + \alpha)/(1 + 2\alpha)$. As before,
\[ \sum_{n < |x|^\beta} \bar{p}_n(x) \]
decays faster than any power of $|x|$. Using (2.52), we have
\[ \sum_{n < |x|^\beta} p_n(x) \leq \frac{c}{|x|^k} \sum_{n < |x|^\beta} n^{\frac{k-d}{2}} = O(|x|^{2-d-\alpha}). \]
(The value of $\beta$ was chosen as the largest value for which this holds.) For the range $|x|^\beta \leq n < |x|^2$, we use the estimate from Theorem 2.3.8:
\[ |p_n(x) - \bar{p}_n(x)| \leq \frac{c}{n^{(d+\alpha)/2}} \Big[ |x/\sqrt{n}|^{k-1}\, e^{-r|x|^2/n} + n^{-(k-2-\alpha)/2} \Big]. \]
As before,
\[ \sum_{n \geq |x|^\beta} \frac{|x/\sqrt{n}|^{k-1}}{n^{(d+\alpha)/2}}\, e^{-r|x|^2/n} \leq c \int_0^\infty \frac{|x|^{k-1}}{(\sqrt{t})^{k+d+\alpha-1}}\, e^{-r|x|^2/t}\, dt = O(|x|^{2-d-\alpha}). \]
Also,
\[ \sum_{n \geq |x|^\beta} \frac{1}{n^{\frac{d}{2} + \frac{k}{2} - 1}} = O\big( |x|^{-\beta(d + \alpha - \frac{5}{2})} \big) \leq O(|x|^{2-d-\alpha}), \]
provided that
\[ \Big( d + \alpha - \frac{5}{2} \Big)\, \frac{2(1+\alpha)}{1+2\alpha} \geq d - 2 + \alpha, \]
which can be readily checked for $\alpha = 1, 2$ if $d \geq 3$.
4.4 Potential kernel

4.4.1 Two dimensions

If $p \in \mathcal{P}_2$, the potential kernel is the function
\[ a(x) = \sum_{n=0}^\infty [p_n(0) - p_n(x)] = \lim_{N \to \infty} \Big[ \sum_{n=0}^N p_n(0) - \sum_{n=0}^N p_n(x) \Big]. \qquad (4.15) \]
Exercise 2.2 shows that $|p_n(0) - p_n(x)| \leq c\, |x|\, n^{-3/2}$, so the first sum converges absolutely. However, since $p_n(0) \asymp n^{-1}$, it is not true that
\[ a(x) = \Big[ \sum_{n=0}^\infty p_n(0) \Big] - \Big[ \sum_{n=0}^\infty p_n(x) \Big]. \qquad (4.16) \]
If $p \in \widetilde{\mathcal{P}}_2$ is bipartite, the potential kernel for $x \in (\mathbb{Z}^2)_e$ is defined in the same way. If $x \in (\mathbb{Z}^2)_o$ we can define $a(x)$ by the second expression in (4.15). Many authors use the term Green's function for $a$ or $-a$. Note that $a(0) = 0$.

If $p \in \mathcal{P}_d$ is transient, then (4.16) is valid, and $a(x) = G(0) - G(x)$, where $G$ is the Green's function for $p$. Since $|p_n(0) - p_n(x)| \leq c\, |x|\, n^{-3/2}$ for all $p \in \mathcal{P}'_2$, the same argument shows that $a$ exists for such $p$.

Proposition 4.4.1 If $p \in \mathcal{P}_2$, then $2\, a(x)$ is the expected number of visits to $x$ by a random walk starting at $x$ before its first visit to the origin.
Proof We delay this until the next section; see (4.31).

Remark. Using Proposition 4.4.1, we can see that if $p_\epsilon$ denotes the lazy walker defined above and $a_\epsilon$ denotes the potential kernel for $p_\epsilon$, then
\[ a_\epsilon(x) = \frac{1}{1 - \epsilon}\, a(x). \qquad (4.17) \]

Proposition 4.4.2 If $p \in \mathcal{P}_2$,
\[ \mathcal{L} a(x) = \delta_0(x) = \begin{cases} 1, & x = 0, \\ 0, & x \neq 0. \end{cases} \]

Proof Recall that
\[ \mathcal{L}[p_n(0) - p_n(x)] = -\mathcal{L} p_n(x) = p_n(x) - p_{n+1}(x). \]
For fixed $x$, the series $\sum_n [p_n(x) - p_{n+1}(x)]$ is absolutely convergent. Hence we can write
\[ \mathcal{L} a(x) = \sum_{n=0}^\infty \mathcal{L}[p_n(0) - p_n(x)] = \lim_{N \to \infty} \sum_{n=0}^N \mathcal{L}[p_n(0) - p_n(x)] = \lim_{N \to \infty} \sum_{n=0}^N [p_n(x) - p_{n+1}(x)] = \lim_{N \to \infty} [p_0(x) - p_{N+1}(x)] = p_0(x) = \delta_0(x). \]
Proposition 4.4.3 If $p \in \mathcal{P}_2 \cup \widetilde{\mathcal{P}}_2$, then
\[ a(x) = \frac{1}{(2\pi)^2} \int_{[-\pi,\pi]^2} \frac{1 - e^{ix\cdot\theta}}{1 - \phi(\theta)}\, d\theta. \]

Proof By the remark above, it suffices to consider $p \in \mathcal{P}_2$. The formal calculation is
\[ a(x) = \sum_{n=0}^\infty [p_n(0) - p_n(x)] = \sum_{n=0}^\infty \frac{1}{(2\pi)^2} \int \phi(\theta)^n\, [1 - e^{ix\cdot\theta}]\, d\theta = \frac{1}{(2\pi)^2} \int \Big( \sum_{n=0}^\infty \phi(\theta)^n \Big)\, [1 - e^{ix\cdot\theta}]\, d\theta = \frac{1}{(2\pi)^2} \int \frac{1 - e^{ix\cdot\theta}}{1 - \phi(\theta)}\, d\theta. \]
All of the integrals are over $[-\pi, \pi]^2$. To justify the interchange of the sum and the integral we use (2.13) to obtain the estimate
\[ \Big| \sum_{n=0}^N \phi(\theta)^n\, [1 - e^{ix\cdot\theta}] \Big| \leq \frac{|1 - e^{ix\cdot\theta}|}{1 - |\phi(\theta)|} \leq \frac{c\, |x|\, |\theta|}{|\theta|^2} = \frac{c\, |x|}{|\theta|}. \]
Since $|\theta|^{-1}$ is an integrable function on $[-\pi, \pi]^2$, the dominated convergence theorem may be applied.
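The integral formula is easy to evaluate numerically. The sketch below (an illustration only; the grid size is an arbitrary choice) approximates $a(x)$ for two-dimensional simple random walk, for which $\phi(\theta) = (\cos\theta_1 + \cos\theta_2)/2$, and checks it against the classical values $a(1,0) = 1$ and $a(1,1) = 4/\pi$.

import math

def potential_kernel(x1, x2, m=400):
    # Midpoint-rule approximation of
    #   a(x) = (2 pi)^{-2} * integral over [-pi,pi]^2 of (1 - cos(x.theta)) / (1 - phi(theta)),
    # for 2-d simple random walk (the imaginary part integrates to zero, so we
    # keep only the real part 1 - cos).  Midpoints avoid the removable
    # singularity of the integrand at theta = 0.
    h = 2 * math.pi / m
    total = 0.0
    for i in range(m):
        t1 = -math.pi + (i + 0.5) * h
        for j in range(m):
            t2 = -math.pi + (j + 0.5) * h
            phi = 0.5 * (math.cos(t1) + math.cos(t2))
            total += (1.0 - math.cos(x1 * t1 + x2 * t2)) / (1.0 - phi)
    return total * h * h / (2 * math.pi) ** 2

if __name__ == "__main__":
    print(potential_kernel(1, 0), 1.0)            # a(e_1) = 1
    print(potential_kernel(1, 1), 4 / math.pi)    # a((1,1)) = 4/pi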
Theorem 4.4.4 If $p \in \mathcal{P}_2$, there exists a constant $C = C(p)$ such that as $|x| \to \infty$,
\[ a(x) = \frac{1}{\pi \sqrt{\det \Gamma}}\, \log[\mathcal{J}(x)] + C + O(|x|^{-2}). \]
For simple random walk,
\[ a(x) = \frac{2}{\pi}\, \log |x| + \frac{2\gamma + \log 8}{\pi} + O(|x|^{-2}), \]
where $\gamma$ is Euler's constant.
Proof We will assume that $p$ is aperiodic; the bipartite case is done similarly. We write
\[ a(x) = \sum_{n \leq \mathcal{J}(x)^2} p_n(0) - \sum_{n \leq \mathcal{J}(x)^2} p_n(x) + \sum_{n > \mathcal{J}(x)^2} [p_n(0) - p_n(x)]. \]
We know from (2.23) that
\[ p_n(0) = \frac{1}{2\pi n\, \sqrt{\det \Gamma}} + O\Big( \frac{1}{n^2} \Big). \]
We therefore get
\[ \sum_{n \leq \mathcal{J}(x)^2} p_n(0) = 1 + O(|x|^{-2}) + \sum_{1 \leq n \leq \mathcal{J}(x)^2} \frac{1}{2\pi n\, \sqrt{\det \Gamma}} + \sum_{n=1}^\infty \Big[ p_n(0) - \frac{1}{2\pi n\, \sqrt{\det \Gamma}} \Big], \]
where the last sum is absolutely convergent. Also,
\[ \sum_{1 \leq n \leq \mathcal{J}(x)^2} \frac{1}{n} = 2 \log[\mathcal{J}(x)] + \gamma + O(|x|^{-2}), \]
where $\gamma$ is Euler's constant (see Lemma 12.1.3). Hence,
\[ \sum_{n \leq \mathcal{J}(x)^2} p_n(0) = \frac{1}{\pi \sqrt{\det \Gamma}}\, \log[\mathcal{J}(x)] + c' + O(|x|^{-2}) \]
for some constant $c'$.

Proposition 2.1.2 shows that
\[ \sum_{n \leq |x|} p_n(x) \]
decays faster than any power of $|x|$. Theorem 2.3.8 implies that there exist $c, r$ such that for $n \leq \mathcal{J}(x)^2$,
\[ \Big| p_n(x) - \frac{1}{2\pi n \sqrt{\det \Gamma}}\, e^{-\mathcal{J}(x)^2/(2n)} \Big| \leq c \Big[ \frac{|x/\sqrt{n}|^5\, e^{-r|x|^2/n}}{n^2} + \frac{1}{n^3} \Big]. \qquad (4.18) \]
Therefore,
\[ \sum_{|x| \leq n \leq \mathcal{J}(x)^2} \Big| p_n(x) - \frac{1}{2\pi n \sqrt{\det \Gamma}}\, e^{-\mathcal{J}(x)^2/(2n)} \Big| \leq O(|x|^{-2}) + c \sum_{|x| < n \leq \mathcal{J}(x)^2} \Big[ \frac{|x/\sqrt{n}|^5\, e^{-r|x|^2/n}}{n^2} + \frac{1}{n^3} \Big] \leq c\, |x|^{-2}. \]
The last estimate is done as in the final step of the proof of Theorem 4.3.1. Similarly to the proof of Lemma 4.3.2 we can see that
\[ \sum_{|x| \leq n \leq \mathcal{J}(x)^2} \frac{1}{n}\, e^{-\mathcal{J}(x)^2/(2n)} = \int_0^{\mathcal{J}(x)^2} \frac{1}{t}\, e^{-\mathcal{J}(x)^2/(2t)}\, dt + O(|x|^{-2}) = \int_1^\infty \frac{1}{y}\, e^{-y/2}\, dy + O(|x|^{-2}). \]
The integral contributes a constant. At this point we have shown that
\[ \sum_{n \leq \mathcal{J}(x)^2} [p_n(0) - p_n(x)] = \frac{1}{\pi \sqrt{\det \Gamma}}\, \log[\mathcal{J}(x)] + C' + O(|x|^{-2}) \]
for some constant $C'$. For $n > \mathcal{J}(x)^2$, we use Theorem 2.3.8 and Lemma 4.3.2 again to conclude that
\[ \sum_{n > \mathcal{J}(x)^2} [p_n(0) - p_n(x)] = c \int_0^1 \frac{1}{y}\, \big( 1 - e^{-y/2} \big)\, dy + O(|x|^{-2}). \]
For simple random walk in two dimensions, it follows that, with $x_n = (n, n)$,
\[ a(x_n) = \frac{4}{\pi} \Big[ 1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n-1} \Big]. \]
However, writing the expansion for simple random walk as $a(x) = \frac{2}{\pi} \log|x| + \kappa + O(|x|^{-2})$, we also know that
\[ a(x_n) = \frac{2}{\pi} \log n + \frac{2}{\pi} \log \sqrt{2} + \kappa + O(n^{-2}). \]
Therefore,
\[ \kappa = \lim_{n \to \infty} \Big[ -\frac{\log 2}{\pi} - \frac{2}{\pi} \log n + \frac{4}{\pi} \sum_{j=1}^n \frac{1}{2j-1} \Big]. \]
Using Lemma 12.1.3 we can see that as $n \to \infty$,
\[ \sum_{j=1}^n \frac{1}{2j-1} = \sum_{j=1}^{2n} \frac{1}{j} - \sum_{j=1}^n \frac{1}{2j} = \frac{1}{2} \log n + \log 2 + \frac{\gamma}{2} + o(1). \]
Therefore,
\[ \kappa = \frac{3 \log 2 + 2\gamma}{\pi} = \frac{2\gamma + \log 8}{\pi}. \]
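As a quick numerical sanity check on this computation (a minimal Python sketch; the values of $n$ are arbitrary), the partial sums converge rapidly to $(2\gamma + \log 8)/\pi \approx 1.0294$:

import math

def kappa_approx(n):
    # (4/pi) * sum_{j=1}^n 1/(2j-1) - (2/pi) * log(n*sqrt(2)),
    # which should converge to (2*gamma + log 8)/pi.
    s = sum(1.0 / (2 * j - 1) for j in range(1, n + 1))
    return 4 * s / math.pi - 2 * math.log(n * math.sqrt(2)) / math.pi

if __name__ == "__main__":
    euler_gamma = 0.5772156649015329
    print([round(kappa_approx(n), 6) for n in (10, 100, 1000)])
    print((2 * euler_gamma + math.log(8)) / math.pi)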
Roughly speaking, $a(x)$ is the difference between the expected number of visits to 0 and the expected number of visits to $x$ by some large time $N$. Let us consider $N \gg |x|^2$. By time $|x|^2$, the random walker has visited the origin about
\[ \sum_{n < |x|^2} p_n(0) \approx \sum_{n < |x|^2} \frac{c}{n} \approx 2\, c \log |x| \]
times, where $c = (2\pi \sqrt{\det \Gamma})^{-1}$. It has visited $x$ about $O(1)$ times. From time $|x|^2$ onward, $p_n(x)$ and $p_n(0)$ are roughly the same, and the sum of the differences from then on is $O(1)$. This shows why we expect
\[ a(x) = 2\, c \log |x| + O(1). \]
Note that $\log |x| = \log \mathcal{J}(x) + O(1)$.

Although we have included the exact value of the constant for simple random walk, we will never need to use this value.
Corollary 4.4.5 If $p \in \mathcal{P}_2$,
\[ \nabla_j a(x) = \nabla_j \Big[ \frac{1}{\pi \sqrt{\det \Gamma}}\, \log \mathcal{J}(x) \Big] + O(|x|^{-2}). \]
In particular, $\nabla_j a(x) = O(|x|^{-1})$. Also, $\nabla_j^2 a(x) = O(|x|^{-2})$.

Remark. One can give better estimates for the differences of the potential kernel by starting with Theorem 2.3.6 and then following the proof of Theorem 4.3.1. We give an example of this technique in Theorem 8.1.2.
4.4.2 Asymptotics under weaker assumptions

We can prove asymptotics for the potential kernel under weaker assumptions. Let
\[ \hat{a}(x) = \frac{1}{\pi \sqrt{\det \Gamma}}\, \log[\mathcal{J}(x)] \]
denote the leading term in the asymptotics.

Theorem 4.4.6 Suppose $p \in \mathcal{P}'_2$. Then
\[ a(x) = \hat{a}(x) + o(\log |x|). \]
If $\mathbb{E}[|X_1|^3] < \infty$, then there exists $C < \infty$ such that
\[ a(x) = \hat{a}(x) + C + O(|x|^{-1}). \]
If $\mathbb{E}[|X_1|^6] < \infty$ and the third moments vanish, then
\[ a(x) = \hat{a}(x) + C + O(|x|^{-2}). \]
Proof Let = 0, 1, 2 under the three possible assumptions, respectively. We start with = 1, 2
for which we can write
n=0
[p
n
(0) p
n
(x)] =
n=0
[p
n
(0) p
n
(x)] +
n=0
[p
n
(0) p
n
(0)] +
n=0
[p
n
(x) p
n
(x)] (4.19)
The estimate
n=0
[p
n
(0) p
n
(x)] = a(x) +
C +O([x[
2
)
is done as in Theorem 4.4.4. Since [p
n
(0) p
n
(0)[ c n
3/2
, the second sum on the right-hand side
of (4.19) converges, and we set
C =
C +
n=0
[p
n
(0) p
n
(0)].
We write
n=0
[p
n
(x) p
n
(x)]
n<|x|
2
[p
n
(x) p
n
(x)[ +
n|x|
2
[p
n
(x) p
n
(x)[.
By Theorem 2.3.5 and and Theorem 2.3.9,
n|x|
2
[p
n
(x) p
n
(x)[ c
n|x|
2
n
(2+)/2
= O([x[
).
For = 1, Theorem 2.3.10 gives
n<|x|
2
[p
n
(x) p
n
(x)[
n<|x|
2
c
[x[
2
n
1/2
= O([x[
1
).
If E[[X
1
[
4
] < and the third moments vanish, a similar arugments shows that the sum on the
left-hand side is bounded by O([x[
2
log [x[) which is a little bigger than we want. However, if
we also assume that E[[X
1
[
6
] < , then we get an estimate as in (4.18), and we can show as in
Theorem 4.4.4 that this sum is O([x[
2
).
If we only assume that p T
2
, then we cannot write (4.19) because the second sum on the
right-hand side might diverge. Instead, we write
n=0
[p
n
(0) p
n
(x)] =
n|x|
2
[p
n
(0) p
n
(x)] +
n<|x|
2
[p
n
(0) p
n
(x)] +
n<|x|
2
[p
n
(0) p
n
(0)] +
n<|x|
2
[p
n
(x) p
n
(x)].
As before,
n<|x|
2
[p
n
(0) p
n
(x)] = a(x) +O(1).
Also, Exercise 2.2, Theorem 2.3.10, and (2.32), respectively, imply
n|x|
2
[p
n
(0) p
n
(x)[
n|x|
2
c [x[
n
3/2
= O(1),
n<|x|
2
[p
n
(x) p
n
(x)[
n<|x|
2
c
[x[
2
= O(1),
n<|x|
2
[p
n
(0) p
n
(0)[
n<|x|
2
o
_
1
n
_
= o(log [x[).
4.4.3 One dimension
If p T
1
, the potential kernel is dened in the same way
a(x) = lim
N
_
N
n=0
p
n
(0)
N
n=0
p
n
(x)
_
.
In this case, the convergence is a little more subtle. We will restrict ourselves to walks satisfying
E[[X[
3
] < for which the proof of the next proposition shows that the sum converges absolutely.
Proposition 4.4.7 Suppose p T
1
with E[[X[
3
] < . Then there is a c such that for all x,
a(x)
2
[x[
c log [x[.
If E[[X[
4
] < and E[X
3
] = 0, then there is a C such that
a(x) =
[x[
2
+C +O([x[
1
).
Proof Assume x > 0. Let = 1 under the weaker assumption and = 2 under the stronger
assumption. Theorem 2.3.6 gives
p
n
(0) p
n
(x) = p
n
(0) p
n
(x) +xO(n
(2+)/2
),
which shows that
nx
2
[p
n
(0) p
n
(x)] [p
n
(0) p
n
(x)]
c x
nx
2
n
(2+)/2
cx
1
.
If = 1, Theorem 2.3.5 gives
n<x
2
[p
n
(0) p
n
(0)] c
n<x
2
n
1
= O(log x).
If = 2, Theorem 2.3.5 gives [p
n
(0) p
n
(0)[ = O(n
3/2
) and hence
n<x
2
[p
n
(0) p
n
(0)] = C
+O([x[
1
), C
:=
n=0
[p
n
(0) p
n
(0)],
In both cases, Theorem 2.3.10 gives
n<x
2
[p
n
(x) p
n
(x)]
c
x
2
n<x
2
n
(1)/2
c x
1
.
Therefore,
a(x) = e(x) +
n=0
[p
n
(0) p
n
(x)] = e(x) +
n=1
1
2
2
n
_
1 e
x
2
2
2
n
_
,
where e(x) = O(log x) if = 1 and e(x) = C
+ O(x
1
) if = 2. Standard estimates (see Section
12.1.1), which we omit, show that there is a C
such that as x ,
n=1
1
2
2
n
= C
+
_
0
1
2
2
t
_
1 e
x
2
2
2
t
_
dt +o(x
1
),
and
_
0
1
2
2
t
_
1 e
x
2
2
2
t
_
dt =
2x
2
_
0
1
u
2
_
1 e
u
2
/2
_
du =
x
2
.
Since C = C
+C
2
+E
x
_
a(S
T
)
S
T
2
_
, (4.20)
where T = minn : S
n
0. There exists > 0 such that for x > 0,
a(x) =
x
2
+C +O(e
x
), x , (4.21)
where
C = lim
y
E
y
_
a(S
T
)
S
T
2
_
.
In particular, for simple random walk, a(x) = [x[.
Proof Assume y > x, let T
y
= minn : S
n
0 or S
n
y, and consider the bounded martingale
S
nTy
. Then the optional sampling theorem implies that
x = E
x
[S
0
] = E
x
[S
Ty
] = E
x
[S
T
; T T
y
] +E
x
[S
Ty
; T > T
y
].
If we let y , we see that
lim
y
E
x
[S
Ty
; T > T
y
] = x E
x
[S
T
].
Also, since E[S
Ty
[ T
y
< T] = y +O(1), we can see that
lim
y
y P
x
T
y
< T = x E
x
[S
T
].
We now consider the bounded martingale M
n
= a(S
nTy
). Then the optional sampling theorem
implies that
a(x) = E
x
[M
0
] = E
x
[M
Ty
] = E
x
[a(S
T
); T T
y
] +E
x
[a(S
Ty
); T > T
y
].
As y , E
x
[a(S
T
); T < T
y
] E
x
[a(S
T
)]. Also, as y ,
E
x
[a(S
Ty
); T > T
y
] P
x
T
y
< T
_
y
2
+O(1)
_
x E
x
[S
T
]
2
.
This gives (4.20).
We will sketch the proof of (4.21); we leave it as an exercise (Exercise 4.12) to ll in the details.
We will show that there exists a such that if 0 < x < y < , then
j=0
[P
x
S
T
= j P
y
S
T
= j[ = O(e
x
). (4.22)
Even though we have written this as an innite sum, the terms are nonzero only for j less than the
range of the walk. Let
z
= minn 0 : S
n
z. Irreducibility and aperiodicity of the random walk
can be used to see that there is an > 0 such that for all z > 0, P
z+1
S
z
= z = P
1
0
= 0 > .
Let
f(r) = f
j
(r) = sup
x,yr
[P
x
S
T
= j P
y
S
T
= j[.
Then if R denotes the range of the walk, we can see that
f(r + 1) (1 ) f(r R).
Iteration of this inequality gives f(kR) (1 )
k1
f(R) and this gives (4.22).
Remark. There is another (perhaps more ecient) proof of this result, see Exercise 4.13. One
may note that the proof does not use the symmetry of the walk to establish
a(x) =
x
2
+C +O(e
x
), x .
Hence, this result holds for all mean zero walks with bounded increments. Applying the proof to
negative x yields
a(x) =
[x[
2
+C
+O(e
|x|
).
If the third moment of the increment distribution is nonzero, it is possible that C ,= C
, see Exercise
4.14.
The potential kernel in one dimension is not as useful as the potential kernel or Green's function in higher dimensions. For $d \geq 2$, we use the fact that the potential kernel or Green's function is harmonic on $\mathbb{Z}^d \setminus \{0\}$ and that we have very good estimates for the asymptotics. For $d = 1$, similar arguments can be done with the function $f(x) = x$, which is obviously harmonic.

4.5 Fundamental solutions

If $p \in \mathcal{P}$, the Green's function $G$ for $d \geq 3$ or the potential kernel $a$ for $d = 2$ is often called the fundamental solution of the generator $\mathcal{L}$ since
\[ \mathcal{L} G(x) = -\delta(x), \qquad \mathcal{L} a(x) = \delta(x). \qquad (4.23) \]
More generally, we write
\[ \mathcal{L}_x G(x, y) = \mathcal{L}_x G(y, x) = -\delta(y - x), \qquad \mathcal{L}_x a(x, y) = \mathcal{L}_x a(y, x) = \delta(y - x), \]
where $\mathcal{L}_x$ denotes $\mathcal{L}$ applied to the $x$ variable.

Remark. Symmetry of walks in $\mathcal{P}$ is necessary to derive (4.23). If $p \in \mathcal{P}'$ is transient, the Green's function $G$ does not satisfy (4.23). Instead it satisfies $\mathcal{L}^R G(x) = -\delta_0(x)$ where $\mathcal{L}^R$ denotes the generator of the backwards (reversed) random walk with increment distribution $p^R(x) = p(-x)$. The function $f(x) = G(-x)$ satisfies $\mathcal{L} f(x) = -\delta_0(x)$ and is therefore the fundamental solution of the generator. Similarly, if $p \in \mathcal{P}'_2$, the fundamental solution of the generator is $f(x) = a(-x)$.

Proposition 4.5.1 Suppose $p \in \mathcal{P}_d$ with $d \geq 2$, and $f : \mathbb{Z}^d \to \mathbb{R}$ is a function satisfying $f(0) = 0$, $f(x) = o(|x|)$ as $x \to \infty$, and $\mathcal{L} f(x) = 0$ for $x \neq 0$. Then there exists $b \in \mathbb{R}$ such that
\[ f(x) = b\, [G(x) - G(0)], \quad d \geq 3, \qquad f(x) = b\, a(x), \quad d = 2. \]

Proof See Propositions 6.4.6 and 6.4.8.

Remark. The assumption $f(x) = o(|x|)$ is clearly needed since the function $f(x_1, \ldots, x_d) = x_1$ is harmonic.
Suppose $d \geq 3$. If $f : \mathbb{Z}^d \to \mathbb{R}$ is a function with finite support we define
\[ Gf(x) = \sum_{y \in \mathbb{Z}^d} G(x, y)\, f(y) = \sum_{y \in \mathbb{Z}^d} G(y - x)\, f(y). \qquad (4.24) \]
Note that if $f$ is supported on $A$, then $\mathcal{L} Gf(x) = 0$ for $x \notin A$. Also if $x \in A$,
\[ \mathcal{L} Gf(x) = \mathcal{L}_x \sum_{y \in \mathbb{Z}^d} G(x, y)\, f(y) = \sum_{y \in \mathbb{Z}^d} \mathcal{L}_x G(x, y)\, f(y) = -f(x). \qquad (4.25) \]
In other words, $G = (-\mathcal{L})^{-1}$. For this reason the Green's function is often called the inverse of the (negative of the) Laplacian. Similarly, if $d = 2$ and $f$ has finite support, we define
\[ af(x) = \sum_{y \in \mathbb{Z}^d} a(x, y)\, f(y) = \sum_{y \in \mathbb{Z}^d} a(y - x)\, f(y). \qquad (4.26) \]
In this case we get
\[ \mathcal{L} af(x) = f(x), \]
i.e., $a = \mathcal{L}^{-1}$.
4.6 Green's function for a set

If $A \subset \mathbb{Z}^d$ and $S$ is a random walk with increment distribution $p$, let
\[ \tau_A = \min\{j \geq 1 : S_j \notin A\}, \qquad \bar{\tau}_A = \min\{j \geq 0 : S_j \notin A\}. \qquad (4.27) \]
If $A = \mathbb{Z}^d \setminus \{x\}$, we write just $\tau_x, \bar{\tau}_x$, which is consistent with the definition of $\tau_x$ given earlier in this chapter. Note that $\tau_A, \bar{\tau}_A$ agree if $S_0 \in A$, but are different if $S_0 \notin A$. If $p$ is transient or $A$ is a proper subset of $\mathbb{Z}^d$, we define
\[ G_A(x, y) = \mathbb{E}^x\Big[ \sum_{n=0}^{\tau_A - 1} 1\{S_n = y\} \Big] = \sum_{n=0}^\infty \mathbb{P}^x\{S_n = y;\, n < \tau_A\}. \]
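For a finite set $A$ the Green's function $G_A$ can be computed exactly by linear algebra: if $Q$ denotes the substochastic matrix of transition probabilities restricted to $A$, then $G_A = (I - Q)^{-1} = I + Q + Q^2 + \cdots$ on $A$. The sketch below (an illustration only; the box and the use of numpy are our choices) does this for simple random walk on a square box in $\mathbb{Z}^2$ and checks the symmetry $G_A(x,y) = G_A(y,x)$ stated in Lemma 4.6.1 below.

import numpy as np

def green_finite_set(points, step_prob):
    # points: list of lattice points forming the finite set A.
    # step_prob: dict mapping increment -> probability (the walk's p).
    # Returns the matrix G_A = (I - Q)^{-1}, where Q(x, y) = p(y - x) for
    # x, y in A (transitions leaving A are simply dropped).
    index = {z: i for i, z in enumerate(points)}
    n = len(points)
    Q = np.zeros((n, n))
    for z, i in index.items():
        for step, pr in step_prob.items():
            w = (z[0] + step[0], z[1] + step[1])
            if w in index:
                Q[i, index[w]] += pr
    return np.linalg.inv(np.eye(n) - Q)

if __name__ == "__main__":
    srw = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}
    A = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]   # a 7 x 7 box
    G = green_finite_set(A, srw)
    print(np.allclose(G, G.T))          # symmetry: G_A(x,y) = G_A(y,x)
    # G[i, i] is the reciprocal of the probability of leaving A before
    # returning to the i-th point (compare Lemma 4.6.1).
    print(G[A.index((0, 0)), A.index((0, 0))])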
Lemma 4.6.1 Suppose $p \in \mathcal{P}_d$ and $A$ is a proper subset of $\mathbb{Z}^d$.
$G_A(x, y) = 0$ unless $x, y \in A$.
$G_A(x, y) = G_A(y, x)$ for all $x, y$.
For $x \in A$, $\mathcal{L}_x G_A(x, y) = -\delta(y - x)$. In particular, if $f(y) = G_A(x, y)$, then $f$ vanishes on $\mathbb{Z}^d \setminus A$ and satisfies $\mathcal{L} f(y) = -\delta(y - x)$ on $A$.
For each $y \in A$,
\[ G_A(y, y) = \frac{1}{\mathbb{P}^y\{\tau_A < \tau_y\}} < \infty. \]
If $x, y \in A$, then
\[ G_A(x, y) = \mathbb{P}^x\{\tau_y < \tau_A\}\, G_A(y, y). \]
$G_A(x, y) = G_{A-x}(0, y - x)$ where $A - x = \{z - x : z \in A\}$.

Proof Easy and left to the reader. The second assertion may be surprising at first, but symmetry of the random walk implies that for $x, y \in A$,
\[ \mathbb{P}^x\{S_n = y;\, n < \tau_A\} = \mathbb{P}^y\{S_n = x;\, n < \tau_A\}. \]
Indeed if $z_0 = x, z_1, z_2, \ldots, z_{n-1}, z_n = y \in A$, then
\[ \mathbb{P}^x\{S_1 = z_1, S_2 = z_2, \ldots, S_n = y\} = \mathbb{P}^y\{S_1 = z_{n-1}, S_2 = z_{n-2}, \ldots, S_n = x\}. \]
The next proposition gives an important relation between the Greens function for a set and the
Greens function or the potential kernel.
Proposition 4.6.2 Suppose p T
d
, A Z
d
, x, y Z
d
.
(a) If d 3,
G
A
(x, y) = G(x, y) E
x
[G(S
A
, y);
A
< ] = G(x, y)
z
P
x
S
A
= z G(z, y).
(b) If d = 1, 2 and A is nite,
G
A
(x, y) = E
x
[a(S
A
, y)] a(x, y) =
_
z
P
x
S
A
= z a(z, y)
_
a(x, y). (4.28)
Proof The result is trivial if x , A. We will assume x A in which case
A
=
A
.
If d 3, let Y
y
=
n=0
1S
n
= y denote the total number of visits to the point y. Then
Y
y
=
A
1
n=0
1S
n
= y +
n=
A
1S
n
= y.
If we assume S
0
= x and take expectations of both sides, we get
G(x, y) = G
A
(x, y) +E
x
[G(S
A
, y)].
The d = 1, 2 case could be done using a similar approach, but it is easier to use a dierent
argument. If S
0
= x and g is any function, then it is easy to check that
M
n
= g(S
n
)
n1
j=0
Lg(S
j
)
is a martingale. We apply this to g(z) = a(z, y) for which Lg(z) = (z y). Then,
a(x, y) = E
x
[M
0
] = E
x
[M
n
A
] = E
x
[a(S
n
A
, y)] E
x
_
_
(n
A
)1
j=0
1S
j
= y
_
_
.
Since A is nite, the dominated convergence theorem implies that
lim
n
E
x
[a(S
n
A
, y)] = E
x
[a(S
A
, y)]. (4.29)
The monotone convergence theorem implies
lim
n
E
x
_
_
(n
A
)1
j=0
1S
j
= y
_
_
= E
x
_
_
A
1
j=0
1S
j
= y
_
_
= G
A
(x, y).
The niteness assumption on A was used in (4.29). The next proposition generalizes this to all
proper subsets A of Z
d
, d = 1, 2. Recall that B
n
= x Z
d
: [x[ < n. Dene a function F
A
by
F
A
(x) = lim
n
log n
det
P
x
Bn
<
A
, d = 2,
F
A
(x) = lim
n
n
2
P
x
Bn
<
A
, d = 1.
The existence of these limits is established in the next proposition. Note that F
A
0 on Z
d
A
since P
x
A
= 0 = 1 for x Z
d
A.
Proposition 4.6.3 Suppose p T
d
, d = 1, 2 and A is a proper subset of Z
d
. Then if x, y Z
2
,
G
A
(x, y) = E
x
[a(S
A
, y)] a(x, y) +F
A
(x).
Proof The result is trivial if x , A so we will suppose that x A. Choose n > [x[, [y[ and let
A
n
= A [z[ < n. Using (4.28), we have
G
An
(x, y) = E
x
[a(S
An
, y)] a(x, y).
Note also that
E
x
[a(S
An
, y)] = E
x
[a(S
A
, y);
A
Bn
] +E
x
[a(S
Bn
, y);
A
>
Bn
].
The monotone convergence theorem implies that as n ,
G
An
(x, y) G
A
(x, y), E
x
[a(S
A
, y);
A
Bn
] E
x
[a(S
A
, y)].
Sincce G
A
(x, y) < , this implies
lim
n
E
x
[a(S
Bn
, y);
A
>
Bn
] = G
A
(x, y) +a(x, y) E
x
[a(S
A
, y)].
However, n [S
Bn
[ n + R where R denotes the range of the increment distribution. Hence
Theorems 4.4.4 and 4.4.8 show that as n ,
E
x
[a(S
Bn
, y);
A
>
Bn
] P
x
A
>
Bn
log n
det
, d = 2,
E
x
[a(S
Bn
, y);
A
>
Bn
] P
x
A
>
Bn
2
, d = 1.
Remark. We proved that for d = 1, 2,
F
A
(x) = G
A
(x, y) +a(x, y) E
x
[a(S
A
, y)]. (4.30)
This holds for all y. If we choose y Z
d
A, then G
A
(x, y) = 0, and hence we can write
F
A
(x) = a(x, y) E
x
[a(S
A
, y)].
Using this expression it is easy to see that
LF
A
(x) = 0, x A.
Also, if Z
d
A is nite,
F
A
(x) = a(x) +O
A
(1), x .
In the particular case A = Z
d
0, y = 0, this gives
F
Z
d
\{0}
(x) = a(x).
Applying (4.30) with y = x, we get
G
Z
d
\{0}
(x, x) = F
Z
d
\{0}
(x) +a(0, x) = 2 a(x). (4.31)
The next simple proposition relates Greens functions to escape probabilities from sets. The
proof uses a last-exit decomposition. Note that the last time a random walk visits a set is a random
time that is not a stopping time. If A A
, the event
Z
d
\A
<
A
is the event that the random
walk visits A before leaving A
.
Proposition 4.6.4 (Last-Exit Decomposition) Suppose p T
d
and A Z
d
. Then,
If A
is a proper subset of Z
d
with A A
,
P
x
Z
d
\A
<
A
=
zA
G
A
(x, z) P
z
Z
d
\A
>
A
.
If (0, 1) and T
Z
d
\A
< T
zA
G(x, z; ) P
z
Z
d
\A
T
.
If d 3 and A is nite,
P
x
S
j
A for some j 0 = P
x
Z
d
\A
<
=
zA
G(x, z) P
z
Z
d
\A
= .
Proof We will prove the rst assertion; the other two are left as Exercise 4.11. We assume x A
such that S
k
A. Then,
P
x
Z
d
\A
<
A
=
k=0
zA
P
x
= k; S
= z
=
zA
k=0
P
x
S
k
= z; k <
A
; S
j
, A, j = k + 1, . . . ,
A
.
The Markov property implies that
P
x
S
j
, A, j = k + 1, . . . ,
A
[ S
k
= z; k <
A
= P
z
A
<
Z
d
\A
.
Therefore,
P
x
Z
d
\A
<
A
=
zA
k=0
P
x
S
k
= z; k <
A
P
z
A
<
Z
d
\A
zA
G
A
(x, z) P
z
A
<
Z
d
\A
.
The next proposition uses a last-exit decomposition to describe the distribution of a random
walk conditioned to not return to its starting point before a killing time. The killing time is either
geometric or the rst exit time from a set.
Proposition 4.6.5 Suppose S
n
is a p-walk with p T
d
; 0 A Z
d
; and (0, 1). Let T
be a
geometric random variable independent of the random walk with killing rate 1 . Let
= maxj 0 : j
A
, S
j
= 0,
= maxj 0 : j < T
, S
j
= 0.
The distribution of S
j
: j
A
is the same as the conditional distribution of S
j
: 0
j
A
given = 0.
The distribution of S
j
:
j < T
given
= 0.
Proof The usual Markov property implies that for any positive integer j, any x
1
, x
2
, . . . , x
k1
A 0 and any x
k
Z
d
A,
P = j,
A
= j +k, S
j+1
= x
1
, . . . , S
j+k
= x
k
= PS
j
= 0,
A
> j, S
j+1
= x
1
, . . . , S
j+k
= x
k
= PS
j
= 0,
A
> j PS
1
= x
1
, . . . , S
k
= x
k
.
The rst assertion is obtained by summation over j, and the other equality is done similarly.
Exercises
Exercise 4.1 Suppose p T
d
and S
n
is a p-walk. Suppose A Z
d
and that P
x
A
= > 0 for
some x A. Show that for every > 0, there is a y with P
y
A
= > 1 .
Exercise 4.2 Suppose p T
d
T
d
, d 2 and let x Z
d
0. Let
T = minn > 0 : S
n
= jx for some j Z.
Show there exists c = c(x) such that as n ,
PT > n
_
_
_
c n
1/2
, d = 2
c (log n)
1
, d = 3
c, d 4.
Exercise 4.3 Suppose d = 1. Show that the only function satisfying the conditions of Proposition
4.5.1 is the zero function.
Exercise 4.4 Find all radially symmetric functions f in R
d
0 satisfying f(x) = 0 for all
x R
d
0.
Exercise 4.5 For each positive integer k nd positive integer d and p T
d
such that E[[X
1
[
k
] <
and
limsup
|x|
[x[
d2
G(x) = .
(Hint: Consider a sequence of points z
1
, z
2
, . . . going to innity and dene PX
1
= z
j
= q
j
. Note
that G(z
j
) q
j
. Make a good choice of z
1
, z
2
, . . . and q
1
, q
2
, . . .)
Exercise 4.6 Suppose X
1
, X
2
, . . . are independent, identically distributed random variables in Z
with mean zero. Let S
n
= X
1
+ +X
n
denote the corresponding random walk and let
G
n
(x) =
n
j=0
PS
j
= x
be the expected number of visits to x in the rst n steps of the walk.
(i) Show that G
n
(x) G
n
(0) for all n.
(ii) Use the law of large numbers to conclude that for all > 0 there is an N
|x|n
G
n
(x)
n
2
.
(iii) Show that
G(0) = lim
n
G
n
(0) =
and conclude that the random walk is recurrent.
Exercise 4.7 Suppose A Z
d
and x, y A. Show that
G
A
(x, y) = lim
n
G
An
(x, y),
where A
n
= z A : [z[ < n.
Exercise 4.8 Let S
n
denote simple random walk in Z
2
starting at the origin and let = minj
1 : S
j
= 0 or e
1
. Show that PS
= 0 = 1/2.
Exercise 4.9 Consider the random walk in Z
2
that moves at each step to one of (1, 1), (1, 1),
(1, 1), (1, 1) each with probability 1/4. Although this walk is not irreducible, many of the
ideas of this chapter apply to this walk.
(i) Show that (
1
,
2
) = 1 (cos
1
)(cos
2
).
(ii) Let a be the potential kernel for this random walk and a the potential kernel for simple
random walk. Show that for every integer n, a((n, 0)) = a((n, n)). (see Exercise 1.7).
(iii) Use Proposition 4.4.3 (which is valid for this walk) to show that for all integers n > 0,
a((n, 0)) a((n 1, 0)) =
4
(2n 1)
,
a((n, 0)) =
4
_
1 +
1
3
+
1
5
+ +
1
2n 1
_
.
Exercise 4.10 Suppose p T
1
and let A = 1, 2, . . .. Show that
F
A
(x) =
x E
x
[S
T
]
2
,
where T = minj 0 : S
j
0 and F
A
is as in (4.30).
Exercise 4.11 Finish the details in Proposition 4.6.4.
Exercise 4.12 Finish the details in Theorem 4.4.8.
Exercise 4.13 Let S
j
be a random walk in Z with increment distribution p satisfying
r
1
= minj : p(j) > 0 < , r
2
= maxj : p(j) > 0 < ,
and let r = r
2
r
1
.
(i) Show that if R and k is a nonnegative integer, then f(x) =
x
x
k
satises Lf(x) = 0
for all x R in and only if (s )
k1
divides the polynomial
q(s) = E
_
s
X
1
.
(ii) Show that the set of functions on r + 1, r + 2, . . . satisfying Lf(x) = 0 for x 1 is a
vector space of dimension r.
(iii) Suppose that f is a function on r + 1, r + 2, . . . satisfying Lf(x) = 0 and f(x) x as
x . Show that there exists c R, c
1
, > 0 such that
[f(x) x c[ c
1
e
x
.
Exercise 4.14 Find the potential kernel a(x) for the one-dimensional walk with
p(1) = p(2) =
1
5
, p(1) =
3
5
.
5
One-dimensional walks

5.1 Gambler's ruin estimate

We will prove one of the basic estimates for one-dimensional random walks with zero mean and finite variance, often called the gambler's ruin estimate. We will not restrict to integer-valued random walks. For this section we assume that $X_1, X_2, \ldots$ are independent, identically distributed (one-dimensional) random variables with $\mathbb{E}[X_1] = 0$, $\mathbb{E}[X_1^2] = \sigma^2 > 0$. We let $S_n = S_0 + X_1 + \cdots + X_n$ be the corresponding random walk. If $r > 0$, we let
\[ \tau_r = \min\{n \geq 0 : S_n \leq 0 \text{ or } S_n \geq r\}, \qquad \tau = \tau_\infty = \min\{n \geq 0 : S_n \leq 0\}. \]
We first consider simple random walk, for which the gambler's ruin estimates are identities.
We rst consider simple random walk for which the gamblers ruin estimates are identities.
Proposition 5.1.1 If S
n
is one-dimensional simple random walk and j < k are positive integers,
then
P
j
S
k
= k =
j
k
.
Proof Since M
n
:= S
n
k
is a bounded martingale, the optional sampling theorem implies that
j = E
j
[M
0
] = E
j
[M
k
] = k P
j
S
k
= k.
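This identity is easy to confirm by simulation (a minimal Python sketch; the values of $j$, $k$ and the number of trials are arbitrary choices):

import random

def ruin_probability(j, k, n_trials=100000):
    # Fraction of simple random walks started at j that reach k before 0.
    wins = 0
    for _ in range(n_trials):
        s = j
        while 0 < s < k:
            s += random.choice((-1, 1))
        if s == k:
            wins += 1
    return wins / n_trials

if __name__ == "__main__":
    random.seed(3)
    print(ruin_probability(3, 10), 3 / 10)   # should be close to j/k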
Proposition 5.1.2 If $S_n$ is one-dimensional simple random walk, then for every positive integer $n$,
\[ \mathbb{P}^1\{\tau > 2n\} = \mathbb{P}^1\{S_{2n} > 0\} - \mathbb{P}^1\{S_{2n} < 0\} = \mathbb{P}\{S_{2n} = 0\} = \frac{1}{\sqrt{\pi n}} + O\Big( \frac{1}{n^{3/2}} \Big). \]

Proof Symmetry and the Markov property tell us that for each $k < 2n$ and each positive integer $x$,
\[ \mathbb{P}^1\{\tau = k,\, S_{2n} = x\} = \mathbb{P}^1\{\tau = k\}\, p_{2n-k}(x) = \mathbb{P}^1\{\tau = k,\, S_{2n} = -x\}. \]
Therefore,
\[ \mathbb{P}^1\{\tau \leq 2n,\, S_{2n} = x\} = \mathbb{P}^1\{\tau \leq 2n,\, S_{2n} = -x\}. \]
Symmetry also implies that for all $x$, $\mathbb{P}^1\{S_{2n} = x + 2\} = \mathbb{P}^1\{S_{2n} = -x\}$. Since $\mathbb{P}^1\{\tau > 2n,\, S_{2n} = x\} = 0$ for $x \leq 0$, we have
\[ \mathbb{P}^1\{\tau > 2n\} = \sum_{x > 0} \mathbb{P}^1\{\tau > 2n;\, S_{2n} = x\} = \sum_{x > 0} [p_{2n}(1, x) - p_{2n}(1, -x)] = p_{2n}(1, 1) + \sum_{x > 0} [p_{2n}(1, x + 2) - p_{2n}(1, -x)] = p_{2n}(0, 0) = 4^{-n} \binom{2n}{n} = \frac{1}{\sqrt{\pi n}} + O\Big( \frac{1}{n^{3/2}} \Big). \]
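A quick numerical check of this identity (a minimal Python sketch; $n = 20$ is an arbitrary choice) computes both sides exactly:

import math

def survival_probability(n):
    # P^1{tau > 2n}: probability that a simple random walk started at 1 stays
    # strictly positive for 2n steps, computed by exact dynamic programming.
    probs = {1: 1.0}
    for _ in range(2 * n):
        new = {}
        for s, pr in probs.items():
            if s <= 0:
                continue                  # absorbed at 0
            new[s - 1] = new.get(s - 1, 0.0) + pr / 2
            new[s + 1] = new.get(s + 1, 0.0) + pr / 2
        probs = new
    return sum(pr for s, pr in probs.items() if s > 0)

if __name__ == "__main__":
    n = 20
    lhs = survival_probability(n)
    rhs = math.comb(2 * n, n) / 4 ** n    # P{S_{2n} = 0}
    print(lhs, rhs, 1 / math.sqrt(math.pi * n))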
The proof of the gamblers ruin estimate for more general walks follows the same idea as that
in the proof of Proposition 5.1.1. However, there is a complication arising from the fact that we
do not know the exact value of S
k
. Our rst lemma shows that the application of the optional
sampling theorem is valid. For this we do not need to assume that the variance is nite.
Lemma 5.1.3 If X
1
, X
2
, . . . are i.i.d. random variables in R with E(X
j
) = 0 and PX
j
> 0 > 0,
then for every 0 < r < and every x R,
E
x
[S
r
] = x. (5.1)
Proof We assume 0 < x < r for otherwise the result is trivial. We start by showing that E
x
[[S
r
[] <
. Since PX
j
> 0 > 0, there exists an integer m and a > 0 such that
PX
1
+ +X
m
> r .
Therefore for all x and all positive integers j,
P
x
r
> jm (1 )
m
.
In particular, E
x
[
r
] < . By the Markov property,
P
x
[S
r
[ r +y;
r
= k P
x
r
> k 1; [X
k
[ y = P
x
r
> k 1 P[X
k
[ y.
Summing over k gives
P
x
[S
r
[ r +y E
x
[
r
] P[X
k
[ y.
Hence
E
x
[[S
r
[] =
_
0
P
x
[S
r
[ y dy E
x
[
r
]
_
r +
_
0
P[X
k
[ ydy
_
= E
x
[
r
] ( r +E[[X
j
[] ) < .
Since E
x
[[S
r
[] < , the martingale M
n
:= S
nr
is dominated by the integrable random variable
r+[S
r
[. Hence it is a uniformly integrable martingale, and (5.1) follows from the optional sampling
theorem (Theorem 12.2.3).
We now prove the estimates under the assumption of bounded range. We will take some care in
showing how the constants in the estimate depend on the range.
Proposition 5.1.4 For every > 0 and K < , there exist 0 < c
1
< c
2
< such that if
P[X
1
[ > K = 0 and PX
1
, then for all 0 < x < r,
c
1
x + 1
r
P
x
S
r
r c
2
x + 1
r
.
Proof We x , K and allow constants in this proof to depend on , K. Let m be the smallest integer
greater than K/. The assumption PX
1
implies that for all x > 0,
P
x
S
K
K PX
1
, . . . , X
m
m
.
Also note that if 0 x y K then translation invariance and monotonicity give P
x
(S
r
r)
P
y
(S
r
r). Therefore, for 0 < x K,
m
P
K
S
r
r P
x
S
r
r P
K
S
r
r, (5.2)
and hence it suces to show for K x r that
x
r +K
P
x
S
r
r
x +K
r
.
By the previous lemma, E
x
[S
r
] = x. If S
r
r, then r S
r
r + K. If S
r
0, then
K S
r
0. Therefore,
x = E
x
[S
r
] E
x
[S
r
; S
r
r] P
x
S
r
r (r +K),
and
x = E
x
[S
r
] E
x
[S
r
; S
r
r] K r P
x
S
r
r K.
Proposition 5.1.5 For every > 0 and K < , there exist 0 < c
1
< c
2
< such that if
P[X
1
[ > K = 0 and PX
1
, then for all x > 0, r > 1,
c
1
x + 1
r
P
x
r
2
c
2
x + 1
r
.
Proof For the lower bound, we note that the maximal inequality for martingales (Theorem 12.2.5)
implies
P
_
sup
1jn
2
[X
1
+ +X
j
[ 2Kn
_
E[S
2
n
2
]
4K
2
n
2
1
4
.
This tells us that if the random walk starts at z 3Kr, then the probability that it does not
reach the origin in r
2
steps is at least 3/4. Using this, the strong Markov property, and the last
proposition, we get
P
x
r
2
3
4
P
x
S
3Kr
3Kr
c
1
(x + 1)
r
.
For the upper bound, we refer to Lemma 5.1.8 below. In this case, it is just as easy to give the
argument for general mean zero, nite variance walks.
If p T
d
, d 2, then p induces an innite family of one-dimensional non-lattice random walks
S
n
where [[ = 1. In Chapter 6, we will need gamblers ruin estimates for these walks that are
uniform over all . In particular, it will be important that the constant is uniform over all .
Proposition 5.1.6 Suppose S
n
is a random walk with increment distribution p T
d
, d 2. There
exist c
1
, c
2
such that if R
d
with [[ = 1 and S
n
= S
n
, then the conclusions of Propositions
5.1.4 and 5.1.5 hold with c
1
, c
2
.
Proof Clearly there is a uniform bound on the range. The other condition is satised by noting
the simple geometric fact that there is an > 0, independent of such that PS
1
, see
Exercise 1.8.
5.1.1 General case

We prove the gambler's ruin estimate assuming only mean zero and finite variance. While we will not attempt to get the best values for the constants, we do show that the constants can be chosen uniformly over a wide class of distributions. In this section we fix $K < \infty$, $\epsilon, b > 0$ and $0 < \rho < 1$, and we let $\mathcal{A}(K, \epsilon, b, \rho)$ be the collection of distributions of $X_1$ with $\mathbb{E}[X_1] = 0$,
\[ \mathbb{E}[X_1^2] = \sigma^2 \leq K^2, \qquad \mathbb{P}\{X_1 \geq 1\} \geq \epsilon, \]
\[ \inf_n\, \mathbb{P}\{S_1, \ldots, S_{n^2} > -n\} \geq b, \qquad \inf_{n > 0}\, \mathbb{P}\{S_{n^2} \geq n\} \geq \rho. \]
It is easy to check that for any mean zero, finite nonzero variance random walk $S_n$ we can find a $t > 0$ and some $K, \epsilon, b, \rho$ such that the estimates above hold for $t S_n$.

Theorem 5.1.7 (Gambler's ruin) For every $K, \epsilon, b, \rho$, there exist $0 < c_1 < c_2 < \infty$ such that if $X_1, X_2, \ldots$ are i.i.d. random variables whose distributions are in $\mathcal{A}(K, \epsilon, b, \rho)$, then for all $0 < x < r$,
\[ c_1\, \frac{x + 1}{r} \leq \mathbb{P}^x\{\tau > r^2\} \leq c_2\, \frac{x + 1}{r}, \]
\[ c_1\, \frac{x + 1}{r} \leq \mathbb{P}^x\{S_{\tau_r} \geq r\} \leq c_2\, \frac{x + 1}{r}. \]
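The content of Theorem 5.1.7 is that $\mathbb{P}^x\{S_{\tau_r} \geq r\}$ is of order $(x+1)/r$ uniformly over a large class of mean-zero, finite-variance increment distributions. The sketch below (an illustration; the uniform increments and all parameters are arbitrary choices) estimates this probability for a non-lattice walk and shows that the ratio to $(x+1)/r$ stays bounded as $r$ grows.

import random

def ruin_upper(x, r, n_trials=20000):
    # Estimate P^x{ S_{tau_r} >= r } for a walk with Uniform[-1, 1] increments,
    # where tau_r is the first time the walk leaves (0, r).
    hits = 0
    for _ in range(n_trials):
        s = float(x)
        while 0.0 < s < r:
            s += random.uniform(-1.0, 1.0)
        if s >= r:
            hits += 1
    return hits / n_trials

if __name__ == "__main__":
    random.seed(4)
    x = 2.0
    for r in (10.0, 20.0, 40.0):
        p = ruin_upper(x, r)
        print(r, p, p / ((x + 1) / r))   # the last column should stay bounded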
Our argument consists of several steps. We start with the upper bound. Let
\[ \bar{\tau}_r = \min\{n > 0 : S_n \leq 0 \text{ or } S_n \geq r\}, \qquad \bar{\tau} = \min\{n > 0 : S_n \leq 0\}. \]
Note that $\bar{\tau}_r$ differs from $\tau_r$ in that the minimum is taken over $n > 0$ rather than $n \geq 0$. As before, we write $\mathbb{P}$ for $\mathbb{P}^0$.
Lemma 5.1.8
P
> n
4K
n
, P
n
<
4K
bn
.
Proof Let q
n
= P
> n = PS
1
, . . . , S
n
> 0. Then
PS
1
, . . . , S
n
1 q
n1
q
n
.
Let J
k,n
be the event
J
k,n
= S
k+1
, . . . , S
n
S
k
+ 1.
We will also use J
k,n
to denote the indicator function of this event. Let m
n
= minS
j
: 0 j n,
M
n
= maxS
j
: 0 j n. For each real x [m
n
, M
n
], there is at most one integer k such that
S
k
x and S
j
> x, k < j n. On the event J
k,n
, the random set corresponding to the jump from
S
k
to S
k+1
,
x : S
k
x and S
j
> x, k < j n,
contains an interval of length at least one. In other words, there are
k
J
k,n
nonoverlapping
intervals contained in [m
n
, M
n
] each of length at least one. Therefore,
n1
k=0
J
k,n
M
n
m
n
.
But, P(J
k,n
) q
nk
q
n
. Therefore,
nq
n
E[M
n
m
n
] 2 E[ max[S
j
[ : j n ].
Martingale maximal inequalities (Theorem 12.2.5) give
P max[S
j
[ : j n t
E[S
2
n
]
t
2
K
2
n
t
2
.
Therefore,
nq
n
2
E[max[S
j
[ : j n] =
_
0
Pmax[S
j
[ : j n t dt
K
n +
_
K
n
K
2
nt
2
dt = 2K
n.
This gives the rst inequality. The strong Markov property implies
P
> n
2
[
n
<
PS
j
S
n
> n, 1 j n
2
[
n
<
b.
Hence,
b P
n
<
> n
2
, (5.3)
which gives the second inequality.
Lemma 5.1.9 (Overshoot lemma I) For all x > 0,
P
x
[S
[ m
1
E[X
2
1
; [X
1
[ m]. (5.4)
Moreover if > 0 and E[[X
1
[
2+
] < , then
E
x
[[S
E[[X
1
[
2+
].
Since E
x
[] = , we cannot use the proof from Lemma 5.1.3.
Proof Fix > 0. For nonnegative integers k, let
Y
k
=
n=0
1k < S
n
(k + 1)
be the number of times the random walk visits (k, (k + 1)] before hitting (, 0], and let
g(x, k) = E
x
[Y
k
] =
n=0
P
x
k < S
n
(k + 1); > n.
Note that if m, x > 0,
P
x
[S
[ m =
n=0
P
x
[S
[ m; = n + 1
=
n=0
k=0
P
x
[S
[ m; = n + 1; k < S
n
(k + 1)
k=0
n=0
P
x
> n; k < S
n
(k + 1); [S
n+1
S
n
[ m+k
=
k=0
g(x, k) P[X
1
[ m+k
=
k=0
g(x, k)
l=k
Pm+l [X
1
[ < m+ (l + 1)
=
l=0
Pm+l [X
1
[ < m+ (l + 1)
l
k=0
g(x, k).
Recall that PS
n
2 n for each n. We claim that for all x, y,
0k<y/
g(x, k)
y
2
. (5.5)
To see this, let
H
y
= max
x>0
0k<y/
g(x, k).
Note that the maximum is the same if we restrict to 0 < x y. Then for any x y,
0k<y/
g(x, k) y
2
+P
x
y
2
E
_
_
ny
2
1S
n
y; n <
y
2
_
_
y
2
+ (1 ) H
y
. (5.6)
By taking the supremum over x we get H
y
y
2
+ (1 )H
y
which gives (5.5). We therefore have
P
x
[S
[ m
1
l=0
Pm+l [X
1
[ < m+ (l + 1)(l +)
2
(E[(X
1
)
2
; X
1
m] +E[(X
1
+)
2
; X
1
m]).
Letting 0, we obtain (5.4).
To get the second estimate, let F denote the distribution function of [X
1
[. Then
E
x
[[S
] =
_
0
t
1
P
x
[S
[ t dt
_
0
t
1
E[X
2
1
; [X
1
[ t] dt
_
0
E
_
[X
1
[
1+
; [X
1
[ t
dt
=
_
0
_
t
x
1+
dF(x) dt
=
_
0
__
x
0
dt
_
x
1+
dF(x) =
E[[X
1
[
2+
].
The estimate (5.6) illustrates a useful way to prove upper bounds for Greens functions of a set. If starting
at any point y in a set V U, there is a probability q of leaving U within N steps, then the expected amount
of time spent in V before leaving U starting at any x U is bounded above by
N + (1 q) N + (1 q)
2
N + =
N
q
.
The lemma states that the overshoot random variable has two fewer moments than the increment distribution.
When the starting point is close to the origin, one might expect that the overshoot would be smaller since there
are fewer chances for the last step before entering (, 0] to be much larger than a typical step. The next lemma
conrms this intuition and shows that one gains one moment if one starts near the origin.
Lemma 5.1.10 (Overshoot lemma II) Let
c
=
32 K
b
.
Then for all 0 < x 1,
P
x
[S
[ m
c
E[[X
1
[; [X
1
[ m].
Moreover if > 0 and E[[X
1
[
1+
] < , then
E
x
[[S
]
c
E[[X
1
[
1+
].
Proof The proof proceeds exactly as in Lemma 5.1.9 up to (5.5) which we replace with a stronger
estimate that is valid for 0 < x 1:
0k<y
g(x, k)
c
. (5.7)
To derive this estimate we note that
2
j1
k<2
j
g(x, k)
equals the product of the probability of reaching a value above 2
j1
before hitting (, 0] and the
expected number of visits in this range given that event. Due to Lemma 5.1.8, the rst probability
is no more than 4K/(b2
j1
) and the conditional expectation, as estimated in (5.5), is less than
2
2j
/. Therefore,
0k<2
j
g(x, k)
1
l=1
_
4K
b2
l1
_
2
2l
_
16K
b
_
2
j
.
For general y we write 2
j1
< y 2
j
and obtain (5.7).
Given this, the same argument gives
P
x
[S
[ m
c
E[[X
1
[; [X
1
[ m],
and
E[[S
] =
_
0
t
1
P[S
[ t dt
_
0
t
1
E[[X
1
[; [X
1
[ t] dt
_
0
E[X
1
; [X
1
[ t] dt =
c
E[[X
1
[
1+
].
The inequalities (5.5) and (5.7) imply that there exists a c < such that for all y, E
y
[
n
] < cn
2
, and
E
x
[
n
] cn, 0 < x 1. (5.8)
Lemma 5.1.11
P
n
<
n
,
where
c
=
2 ( + 2c
K
2
)
,
and c
n b P
n
<
b c
n
.
Proof The last assertion follows immediately from the rst one and the strong Markov property as
in (5.3). Since P
n
<
P
1
n
<
n
<
n
.
Using (5.8), we have
P
1
[S
n
[ s +n
l=0
P
1
n
= l + 1; [S
n
[ s +n
l=0
P
1
n
> l; [X
l+1
[ s
P[X
1
[ s E
1
[
n
]
P[X
1
[ s.
In particular, if t > 0,
E
1
_
[S
n
[; [S
n
[ (1 +t) n
=
_
tn
P
1
[S
n
[ s +n ds
_
tn
P[X
1
[ s ds
=
c
E[[X
1
[; [X
1
[ tn]
t
E
_
[X
1
[
2
K
2
t
. (5.9)
Consider the martingale M
k
= S
k
n
. Due to the optional stopping theorem we have
1 = E
1
[M
0
] = E
1
[M
] E
1
[S
n
; S
n
n].
If we let t
0
= 2c
K
2
/ in (5.9), we obtain
E
1
[[S
n
[; [S
n
[ (1 +t
0
) n]
1
2
,
so it must be
E
1
[S
n
; n S
n
(1 +t
0
)n]
1
2
,
which implies
P
1
n
<
P
1
n S
n
(1 +t
0
)n
1
2(1 +t
0
)n
.
Proof [of Theorem 5.1.7] Lemmas 5.1.8 and 5.1.11 prove the result for 0 < x 1. The result is
easy if x r/2 so we will assume 1 x r/2. As already noted, the function x P
x
S
r
r is
nondecreasing in x. Therefore,
PS
r
r = PS
x
x PS
r
r [ S
x
x PS
x
x P
x
S
r
r.
Hence by Lemmas 5.1.8 and 5.1.11,
P
x
S
r
r
PS
r
r
PS
x
x
4K
c
b
x
r
.
For an inequality in the opposite direction, we rst show that there is a c
2
such that E
x
[
r
] c
2
xr.
Recall from (5.8) that E
y
[
r
] cr for 0 < y 1. The strong Markov property and monotonicity
can be used (Exercise 5.1) to see that
E
x
[
r
] E
1
[
r
] +E
x1
[
r
]. (5.10)
Hence we obtain the claimed bound for general x by induction. As in the previous lemma one can
now see that
E
x
_
[S
r
[; [S
r
[ (1 +t) r
c
2
K
2
x
t
,
and hence if t
0
= 2c
2
K
2
,
E
x
_
S
r
; S
r
(1 +t
0
) r
x
2
,
E
x
_
[S
r
[; r S
r
(1 +t
0
) r
x
2
,
so that
P
x
_
r S
r
(1 +t
0
) r
_
x
2(1 +t
0
)r
.
As we have already shown (see the beginning of the proof of Lemma 5.1.11), this implies
P
x
r
2
b
x
2(1 +t
0
)r
.
5.2 One-dimensional killed walks
A symmetric defective increment distribution (on Z) is a set of nonnegative numbers p
k
: k Z
with
p
k
< 1 and p
k
= p
k
for all k. Given a symmetric defective increment distribution, we have
the corresponding symmetric random walk with killing, that we again denote by S. More precisely,
S is a Markov chain with state space Z , where is an absorbing state, and
PS
j+1
= k +l [ S
j
= k = p
l
, PS
j+1
= [ S
j
= k = p
,
where p
= 1
p
k
. We let
T = minj : S
j
=
denote the killing time for the random walk. Note that PT = j = p
(1 p
)
j1
, j 1, 2, . . ..
Examples.
Suppose p(j) is the increment distribution of a symmetric one-dimensional random walk and
s [0, 1). Then p
j
= s p(j) is a defective increment distribution corresponding to the random
walk with killing rate 1 s. Conversely, if p
j
is a symmetric defective increment distribution,
and p(j) = p
j
/(1 p
, we
get back p
j
.
Suppose S
j
is a symmetric random walk in Z
d
, d 2 which we write S
j
= (Y
j
, Z
j
) where Y
j
is a
random walk in Z and Z
j
is a random walk in Z
d1
. Suppose the random walk is killed at rate
1 s and let
T denote the killing time. Let
= minj 1 : Z
j
= 0, (5.11)
p
k
= PY
= k; <
T.
Note that
p
k
=
j=1
P = j; Y
j
= k; j <
T
=
j=1
s
j
P = j; Y
j
= k = E[s
T
; Y
T
= k;
T < ].
If Z is a transient random walk, then P < < 1 and we can let s = 1.
Suppose S
j
= (Y
j
, Z
j
) and are as in the previous example and suppose A Z
d1
0. Let
A
= minj : Z
j
A,
p
k
= PY
= k; <
A
.
If PZ
j
A for some j > 0, then p
k
is a defective increment distribution.
Given a symmetric defective increment distribution p
k
with corresponding walk S
j
and killing
time T, dene the events
V
+
= S
j
> 0 : j = 1, . . . , T 1, V
+
= S
j
0 : j = 1, . . . , T 1,
V
= S
j
< 0 : j = 1, . . . , T 1, V
= S
j
0 : j = 1, . . . , T 1.
Symmetry implies that P(V
+
) = P(V
), P(V
+
) = P(V
). Note that V
+
V
+
, V
V
and
P(V
+
V
) = P(V
+
V
) = PT = 1 = p
. (5.12)
Dene a new defective increment distribution p
k,
, which is supported on k = 0, 1, 2, . . ., by
setting p
k,
equal to the probability that the rst visit to , 2, 1, 0 after time 0 occurs at
position k and this occurs before the killing time T, i.e.,
p
k,
=
j=1
PS
j
= k ; j < T ; S
l
> 0, l = 1, . . . , j 1.
Dene p
k,+
similarly so that p
k,+
= p
k,
. The strong Markov property implies
P(V
+
) = P(V
+
) +p
0,
P(V
+
),
and hence
P(V
+
) = (1 p
0,
) P(V
+
) = (1 p
0,+
) P(V
+
). (5.13)
In the next proposition we prove a nonintuitive fact.
Proposition 5.2.1 The events V
+
and V
) = P(V
+
) = (1 p
0,+
) P(V
+
) =
_
p
(1 p
0,+
). (5.14)
Proof Independence is equivalent to the statement P(V
V
+
) = P(V
) P(V
+
). We will prove the
equivalent statement P(V
V
c
+
) = P(V
) P(V
c
+
). Note that V
V
c
+
is the event that T > 1 but
no point in 0, 1, 2 . . . is visited during the times 1, . . . , T 1. In particular, at least one point
in . . . , 2, 1 is visited before time T.
Let
= maxk Z : S
j
= k for some j = 1, . . . , T 1,
k
= maxj 0 : S
j
= k; j < T.
In words, is the rightmost point visited after time zero, and
k
is the last time that k is visited
before the walk is killed. Then,
P(V
V
c
+
) =
k=1
P = k =
k=1
j=1
P = k;
k
= j.
Note that the event = k;
k
= j is the same as the event
S
j
= k ; j < T ; S
l
k, l = 1, . . . , j 1 ; S
l
< k, l = j + 1, . . . , T 1.
Since,
PS
l
< k, l = j + 1, . . . , T 1 [ S
j
= k ; j < T ; S
l
k, l = 1, . . . , j 1 = P(V
),
we have
P = k;
k
= j = PS
j
= k ; j < T ; S
l
k, l = 1, . . . , j 1 P(V
).
Due to the symmetry of the random walk, the probability of the path [x
0
= 0, x
1
, . . . , x
j
] is the
same as the probability of the reversed path [x
j
x
j
, x
j1
x
j
, . . . , x
0
x
j
]. Note that if x
j
= k
and x
l
k, l = 1, . . . , j 1, then x
0
x
j
= k and
l
i=1
(x
ji
x
ji+1
) = x
jl
x
j
0, for
l = 1, . . . , j 1. Therefore we have
PS
j
= k ; j < T ; S
l
k, l = 1, . . . , j 1 = P = j; j < T; S
j
= k,
where
= minj 1 : S
j
> 0.
Since
k=1
j=1
P = j; j < T; S
j
= k = P < T = P(V
c
) = P(V
c
+
),
we obtain the stated independence. The equality (5.14) now follows from
p
= P(V
V
+
) = P(V
) P(V
+
) =
P(V
) P(V
+
)
1 p
0,
=
P(V
+
)
2
1 p
0,+
.
5.3 Hitting a half-line
We will give an application of Proposition 5.2.1 to walks in Z
d
. Suppose d 2 and S
n
is a random
walk with increment distribution p T
d
. We write S
n
= (Y
n
, Z
n
) where Y
n
is a one-dimensional
walk and Z
n
is a (d 1)-dimensional walk. Let denote the covariance matrix for S
n
and let
n=1
P = n; T
+
> n =
n=1
(1 )
n1
q
n
.
By Propositions 12.5.2 and 12.5.3, it suces to show that q() c (1 )
1/4
if d = 2 and q()
c [log(1 )]
1/2
if d = 3.
This is the same situation as the second example of the last subsection (although (, T) there
corresponds to (T, ) here). Hence, Proposition 5.2.1 tells us that
q() =
_
p
() (1 p
0,+
()),
where p
() = PT > and p
0,+
() = PT
+
; Y
T
+
= 0. Clearly, as 1, 1 p
0,+
()
1 p
0,+
> 0. By applying (4.9) and (4.10) to the random walk Z
n
, we can see that
PT > c (1 )
1/2
, d = 2,
PT > c
_
log
_
1
1
__
1
, d = 3.
From the proof one can see that the constant C can be determined in terms of
and p
0,+
. We do not
need the exact value and the proof is a little easier to follow if we do not try to keep track of this constant. It is
generally hard to compute p
0,+
; for simple random walk, see Proposition 9.9.8.
The above proof uses the surprising fact that the events avoid the positive x
1
-axis and avoid the negative
x
1
- axis are independent up to a multiplicative constant. This idea does not extend to other sets, for example
the event avoid the positive x
1
-axis and avoid the positive x
2
-axis are not independent up to a multiplicative
constant in two dimensions. However, they are in three dimensions (which is a nontrivial fact).
In Section 6.8 we will need some estimates for two-dimensional random walks avoiding a half-
line. The argument given below uses the Harnack inequality (Theorem 6.3.9), which will be proved
independently of this estimate. In the remainder of this section, let d = 2 and let S
n
= (Y
n
, Z
n
) be
the random walk. Let
r
= minn > 0 : Y
n
r ,
r
= min n > 0 : S
n
, (r, r) (r, r) ,
r
= minn > 0 : S
n
, Z (r, r) .
If [S
0
[ < r, the event
r
=
r
occurs if and only if the rst visit of the random walk to the
complement of (r, r) (r, r) is at a point (j, k) with j r.
Proposition 5.3.2 If p T
2
, then
PT
+
>
r
r
1/2
. (5.15)
Moreover, for all z ,= 0,
P
z
r
< T
+
c [z[
1/2
r
1/2
(5.16)
In addition, there is a c < such that if 1 k r and A
k
= je
1
: j = k, k + 1, . . . , then
PT
A
k
>
r
c k
1/2
r
1/2
. (5.17)
Proof It suces to show that there exist c
1
, c
2
with
PT
+
>
r
c
2
r
1/2
, PT
+
>
r
c
1
r
1/2
.
The gamblers ruin estimate applied to the second component implies that PT >
r
r
1
and
an application of Proposition 5.2.1 gives PT
+
>
r
r
1/2
.
Using the invariance principle, it is not dicult to show that there is a c such that for r suciently
large, P
r
=
r
c. By translation invariance and monotonicity, one can see that for j 1,
P
je
1
r
=
r
P
r
=
r
.
Hence the strong Markov property implies that P
r
=
r
[ T
+
<
r
P
r
=
r
, therefore it
has to be that P
r
=
r
[ T
+
>
r
c and
P
r
< T
+
c P
r
=
r
< T
+
. (5.18)
Another application of the invariance principle shows that
PT
+
> r
2
[
r
=
r
< T
+
c,
since this conditional probability is bounded below by the probability that a random walk goes no
farther than distance r/2 in r
2
steps. Hence,
P
r
< T
+
c P
r
< T
+
, T
+
> r
2
c PT
+
> r
2
c r
1/2
.
This gives (5.15).
For the remaining results we will assume [z[ is an integer greater than the range R of the
walk, but one can easily adapt the argument to arbitrary z. Let h
r
(x) = P
x
r
< T
+
and let
M = M(r, [z[) be the maximum value of h
r
(x) over x ([z[ R, [z[ + R) ([z[ R, [z[ + R).
By translation invariance, this is maximized at a point with maximal rst component and by the
Harnack inequality (Theorem 6.3.9),
c
1
M h
r
(x) c
2
M, x ([z[ R, [z[ +R) ([z[ R, [z[ +R).
Together with strong Markov property this implies
P
r
< T
+
c MP
|z|
< T
+
,
and due to (5.18)
P
r
< T
+
c MP
|z|
=
|z|
< T
+
c MP
|z|
< T
+
.
Since P
r
< T
+
r
1/2
, we conclude that M [z[
1/2
r
1/2
, implying (5.16). To prove (5.17),
we write
PT
A
k
>
r
= PT
A
k
>
k
PT
A
k
>
r
[ T
A
k
>
k
c PT
A
k
>
k
(k/r)
1/2
.
So if suces to show that PT
A
k
>
k
ck
1
This is very close to the gamblers ruin estimate,
but it is not exactly the form we have proved so far, so we will sketch a proof.
Let
q(k) = PT
A
k
>
k
.
Note that for all integers [j[ < k,
P
je
1
T
A
k
>
k
q(2k).
A last-exit decomposition focusing on the last visit to A
k
before time
k
shows that
1 =
|j|<k
G
k
(0, je
1
) P
je
1
T
A
k
>
k
q(2k)
|j|<k
G
k
(0, je
1
) .
where
G
k
denotes the Greens function for the set Z
2
[(k, k)(k, k)]. Hence it suces to prove
that
|j|<k
G
j
(0, je
1
) c k.
We leave this to the reader (alternatively, see next chapter for such estimates).
Exercises

Exercise 5.1 Prove inequality (5.10).

Exercise 5.2 Suppose p ∈ P_2 and x ∈ Z^2 \ {0}. Let

  T = min{n ≥ 1 : S_n = jx for some j ∈ Z},
  T^+ = min{n ≥ 1 : S_n = jx for some j = 0, 1, 2, . . .}.

(i) Show that there exists c such that

  P{T > ξ_n} ≤ c n^{−1}.

(ii) Show that there exists c_1 such that

  P{T^+ > ξ_n} ≥ c_1 n^{−1/2}.

Establish the analog of Proposition 5.3.2 in this setting.
6
Potential Theory
6.1 Introduction
There is a close relationship between random walks with increment distribution p and functions that are harmonic with respect to the generator L = L_p.

We start by setting some notation. We fix p ∈ P_d. If A ⊂ Z^d, we let

  ∂A = {x ∈ Z^d \ A : p(y, x) > 0 for some y ∈ A}

denote the (outer) boundary of A, and we let Ā = A ∪ ∂A be the discrete closure of A. Note that the above definition of ∂A, Ā depends on the choice of p. We omit this dependence from the notation, and hope that this will not confuse the reader. In the case of simple random walk,

  ∂A = {x ∈ Z^d \ A : |y − x| = 1 for some y ∈ A}.

Since p has finite range, if A is finite, then ∂A, Ā are finite. The inner boundary of A ⊂ Z^d is defined by

  ∂_i A = ∂(Z^d \ A) = {x ∈ A : p(x, y) > 0 for some y ∉ A}.

Figure 6.1: Suppose A is the set of lattice points inside the dashed curve. Then the points in A \ ∂_i A, ∂_i A, and ∂A are each marked by a different symbol.
A function f : Ā → R is harmonic (with respect to p), or p-harmonic, in A if

  Lf(y) := Σ_x p(x) [f(y + x) − f(y)] = 0

for every y ∈ A. Note that we cannot define Lf(y) for all y ∈ A unless f is defined on Ā.

We say that A is connected (with respect to p) if for every x, y ∈ A, there is a finite sequence x = z_0, z_1, . . . , z_k = y of points in A with p(z_{j+1} − z_j) > 0, j = 0, . . . , k − 1.
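For readers who like to experiment, the definitions above are easy to compute directly. The following Python sketch is not from the text; it assumes simple random walk on Z^2 and an arbitrary example set A, computes the outer boundary ∂A and inner boundary ∂_iA, and checks p-harmonicity of a test function.

```python
# Sketch: discrete boundaries and harmonicity for simple random walk on Z^2.
# The set A below is an arbitrary example, not one taken from the text.

NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # steps of simple random walk

def outer_boundary(A):
    """Points outside A reachable from A in one step (the set denoted dA)."""
    A = set(A)
    return {(x + dx, y + dy) for (x, y) in A for (dx, dy) in NEIGHBORS} - A

def inner_boundary(A):
    """Points of A with at least one neighbor outside A (the set d_i A)."""
    A = set(A)
    return {(x, y) for (x, y) in A
            if any((x + dx, y + dy) not in A for (dx, dy) in NEIGHBORS)}

def laplacian(f, z):
    """Lf(z) = sum_x p(x) [f(z + x) - f(z)] for simple random walk."""
    x, y = z
    return sum(f((x + dx, y + dy)) - f((x, y)) for dx, dy in NEIGHBORS) / 4.0

if __name__ == "__main__":
    A = {(x, y) for x in range(-3, 4) for y in range(-3, 4) if x * x + y * y < 10}
    print(len(A), "points in A,", len(outer_boundary(A)), "in dA,",
          len(inner_boundary(A)), "in d_iA")
    f = lambda z: z[0] ** 2 - z[1] ** 2   # x^2 - y^2 is harmonic for this walk
    print("max |Lf| on A:", max(abs(laplacian(f, z)) for z in A))  # should be 0
```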
This chapter contains a number of results about functions on subsets of Z^d. These results have analogues in the continuous setting. The set A corresponds to an open set D ⊂ R^d, the outer boundary ∂A corresponds to the usual topological boundary ∂D, and Ā corresponds to the closure D̄ = D ∪ ∂D. The term domain is often used for open, connected subsets of R^d. Finiteness assumptions on A correspond to boundedness assumptions on D.
Proposition 6.1.1 Suppose S_n is a random walk with increment distribution p ∈ P_d starting at x ∈ Z^d. Suppose f : Z^d → R. Then

  M_n := f(S_n) − Σ_{j=0}^{n−1} Lf(S_j)

is a martingale. In particular, if f is harmonic on A ⊂ Z^d, then Y_n := f(S_{n∧τ_A}) is a martingale, where τ_A is as defined in (4.27).

Proof Immediate from the definition.
Proposition 6.1.2 Suppose p ∈ P_d and f : Z^d → R is bounded and harmonic on Z^d. Then f is constant.

Proof We may assume p is aperiodic; if not, consider p̃ = (1/2) p + (1/2) δ_0 and note that f is p-harmonic if and only if it is p̃-harmonic. Let x, y ∈ Z^d. By Lemma 2.4.3 we can define random walks S, S̃ on the same probability space so that S is a random walk starting at x; S̃ is a random walk starting at y; and

  P{S_n ≠ S̃_n} ≤ c |x − y| n^{−1/2}.

In particular,

  |E[f(S_n)] − E[f(S̃_n)]| ≤ 2 c |x − y| n^{−1/2} ‖f‖_∞.

Proposition 6.1.1 implies that f(x) = E[f(S_n)] and f(y) = E[f(S̃_n)], so letting n → ∞ gives f(x) = f(y).

The fact that all bounded harmonic functions are constant is closely related to the fact that a random walk eventually "forgets" its starting point. Lemma 2.4.3 gives a precise formulation of this loss of memory property. The last proposition is not true for simple random walk on a regular tree.
6.2 Dirichlet problem
The standard Dirichlet problem for harmonic functions is to find a harmonic function on a region with specified values on the boundary.

Theorem 6.2.1 (Dirichlet problem I) Suppose p ∈ P_d, and A ⊂ Z^d satisfies P^x{τ_A < ∞} = 1 for all x ∈ A. Suppose F : ∂A → R is a bounded function. Then there is a unique bounded function f : Ā → R satisfying

  Lf(x) = 0,   x ∈ A,   (6.1)
  f(x) = F(x),   x ∈ ∂A.   (6.2)

It is given by

  f(x) = E^x[F(S_{τ_A})].   (6.3)
Proof A simple application of the Markov property shows that f defined by (6.3) satisfies (6.1) and (6.2). Now suppose f is a bounded function satisfying (6.1) and (6.2). Then M_n := f(S_{n∧τ_A}) is a bounded martingale. Hence, the optional sampling theorem (Theorem 12.2.3) implies that

  f(x) = E^x[M_0] = E^x[M_{τ_A}] = E^x[F(S_{τ_A})].
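As a concrete illustration of formula (6.3), here is a small Monte Carlo sketch (not from the text) that estimates the solution of the Dirichlet problem for simple random walk on a square box; the box, the boundary data F, and the sample size are all arbitrary choices made for the example.

```python
# Sketch: estimate f(x) = E^x[F(S_{tau_A})] for simple random walk on a box in Z^2.
import random

N = 10                                   # A = {1,...,N-1}^2 (interior of a box)
A = {(x, y) for x in range(1, N) for y in range(1, N)}
F = lambda z: 1.0 if z[1] >= N else 0.0  # arbitrary boundary data on dA

def run_to_boundary(z):
    """Run simple random walk from z until it leaves A; return the exit point."""
    while z in A:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        z = (z[0] + dx, z[1] + dy)
    return z

def dirichlet_mc(x, trials=20000):
    """Monte Carlo estimate of E^x[F(S_{tau_A})]."""
    return sum(F(run_to_boundary(x)) for _ in range(trials)) / trials

if __name__ == "__main__":
    random.seed(0)
    x = (N // 2, N // 2)
    print("estimated f(center) =", dirichlet_mc(x))
    # For this F the estimate is the probability of exiting through the top side.
```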
Remark.
• If A is finite, then ∂A is also finite and all functions on ∂A are bounded. Hence for each F on ∂A, there is a unique function satisfying (6.1) and (6.2). In this case we could prove existence and uniqueness using linear algebra, since (6.1) and (6.2) give #(A) linear equations in #(A) unknowns. However, algebraic methods do not yield the nice probabilistic form (6.3).
• If A is infinite, there may well be more than one solution to the Dirichlet problem if we allow unbounded solutions. For example, if d = 1, p is simple random walk, A = {1, 2, 3, . . .}, and F(0) = 0, then there is an infinite number of solutions of the form f_b(x) = bx. If b ≠ 0, f_b is unbounded.
• Under the conditions of the theorem, it follows that any function f on Ā that is harmonic on A satisfies the maximum principle:

  sup_{x∈Ā} |f(x)| = sup_{x∈∂A} |f(x)|.

• If d = 1, 2 and A is a proper subset of Z^d, then we know by recurrence that P^x{τ_A < ∞} = 1 for all x ∈ A. If d ≥ 3 and Z^d \ A is finite, then there are points x ∈ A with P^x{τ_A = ∞} > 0. The function

  f(x) = P^x{τ_A = ∞}

is a bounded function satisfying (6.1) and (6.2) with F ≡ 0 on ∂A. Hence, the condition P^x{τ_A < ∞} = 1 is needed to guarantee uniqueness. However, as Proposition 6.2.2 below shows, all solutions with F ≡ 0 on ∂A are multiples of f.
Remark. This theorem has a well-known continuous analogue. Suppose f : {z ∈ R^d : |z| ≤ 1} → R is a continuous function with ∆f(x) = 0 for |x| < 1. Then

  f(x) = E^x[f(B_T)],

where B is a standard d-dimensional Brownian motion and T is the first time t that |B_t| = 1. If |x| < 1, the distribution of B_T given B_0 = x has a density with respect to surface measure on {|z| = 1}. This density h(x, z) = c (1 − |x|^2)/|x − z|^d is called the Poisson kernel, and we can write

  f(x) = c ∫_{|z|=1} f(z) (1 − |x|^2)/|x − z|^d ds(z),   (6.4)

where s denotes surface measure. To verify that this is correct, one can check directly that f as defined above is harmonic in the ball and satisfies the boundary condition on the sphere. Two facts follow almost immediately from this integral formula:
• Derivative estimates. For every k, there is a c = c(k) < ∞ such that if f is harmonic in the unit ball and D denotes a kth order derivative, then |Df(0)| ≤ c_k ‖f‖_∞.
• Harnack inequality. For every r < 1, there is a c = c_r < ∞ such that if f is a positive harmonic function on the unit ball, then f(x) ≤ c f(y) for |x|, |y| ≤ r.
An important aspect of these estimates is the fact that the constants do not depend on f. We will prove the analogous results for random walk in Section 6.3.
Proposition 6.2.2 (Dirichlet problem II) Suppose p ∈ P_d and A ⊂ Z^d. Suppose F : ∂A → R is a bounded function. Then the only bounded functions f : Ā → R satisfying (6.1) and (6.2) are of the form

  f(x) = E^x[F(S_{τ_A}); τ_A < ∞] + b P^x{τ_A = ∞},   (6.5)

for some b ∈ R.
Proof We may assume that p is aperiodic. We also assume that P^x{τ_A = ∞} > 0 for some x ∈ A; if not, Theorem 6.2.1 applies. Assume that f is a bounded function satisfying (6.1) and (6.2). Since M_n := f(S_{n∧τ_A}) is a martingale, we know that

  f(x) = E^x[M_0] = E^x[M_n] = E^x[f(S_{n∧τ_A})]
       = E^x[f(S_n)] − E^x[f(S_n); τ_A < n] + E^x[F(S_{τ_A}); τ_A < n].

Using Lemma 2.4.3, we can see that for all x, y,

  lim_{n→∞} |E^x[f(S_n)] − E^y[f(S_n)]| = 0.

Therefore,

  |f(x) − f(y)| ≤ 2 ‖f‖_∞ [P^x{τ_A < ∞} + P^y{τ_A < ∞}].

Let U_ε = {z ∈ Z^d : P^z{τ_A = ∞} ≥ 1 − ε}. Since P^x{τ_A = ∞} > 0 for some x, one can see (Exercise 4.1) that U_ε ≠ ∅ for every ε > 0, and the last estimate applies to all x, y ∈ U_ε.
Hence, there is a b such that

  |f(x) − b| ≤ 4 ε ‖f‖_∞,   x ∈ U_ε.
Let σ = σ_ε be the minimum of τ_A and the smallest n such that S_n ∈ U_ε. Since M_{n∧σ} is a bounded martingale, the optional sampling theorem gives

  f(x) = E^x[f(S_σ)] = E^x[F(S_{τ_A}); τ_A ≤ σ] + E^x[f(S_σ); τ_A > σ].

(Here we use the fact that P^x{σ < ∞} = 1.) As ε → 0,

  E^x[F(S_{τ_A}); τ_A ≤ σ] → E^x[F(S_{τ_A}); τ_A < ∞].

Also,

  |E^x[f(S_σ); τ_A > σ] − b P^x{τ_A > σ}| ≤ 4 ε ‖f‖_∞ P^x{τ_A > σ},

and since P^x{τ_A > σ} → P^x{τ_A = ∞} as ε → 0,

  lim_{ε→0} E^x[f(S_σ); τ_A > σ] = b P^x{τ_A = ∞}.

This gives (6.5).
Remark. We can think of (6.5) as a generalization of (6.3) where we have added a "boundary point at infinity." The constant b in the last proposition is the boundary value at infinity and can be written as F(∞). The fact that there is a single boundary value at infinity is closely related to Proposition 6.1.2.
Definition. If p ∈ P_d and A ⊂ Z^d, then the Poisson kernel is the function H : Ā × ∂A → [0, 1] defined by

  H_A(x, y) = P^x{τ_A < ∞; S_{τ_A} = y}.

As a slight abuse of notation we will also write

  H_A(x, ∞) = P^x{τ_A = ∞}.

Note that

  Σ_{y∈∂A} H_A(x, y) = P^x{τ_A < ∞}.

For fixed y ∈ ∂A, f(x) = H_A(x, y) is a function on Ā that is harmonic on A and equals δ_y on ∂A. If p is recurrent, there is a unique such function. If p is transient, f is the unique such function that tends to 0 as x tends to infinity. We can write (6.3) as

  f(x) = E^x[F(S_{τ_A})] = Σ_{y∈∂A} H_A(x, y) F(y),   (6.6)

and (6.5) as

  f(x) = E^x[F(S_{τ_A}); τ_A < ∞] + b P^x{τ_A = ∞} = Σ_{y∈∂A∪{∞}} H_A(x, y) F(y),

where F(∞) = b. The expression (6.6) is a random walk analogue of (6.4).
Proposition 6.2.3 Suppose p ∈ P_d and A ⊂ Z^d. Let g : A → R be a function with finite support. Then, the function

  f(x) = Σ_{y∈A} G_A(x, y) g(y) = E^x[ Σ_{j=0}^{τ_A−1} g(S_j) ],

is the unique bounded function on Ā that vanishes on ∂A and satisfies

  Lf(x) = −g(x),   x ∈ A.   (6.7)
Proof Since g has finite support,

  |f(x)| ≤ Σ_{y∈A} G_A(x, y) |g(y)| < ∞,

and hence f is bounded. We have already noted in Lemma 4.6.1 that f satisfies (6.7). Now suppose f is a bounded function vanishing on ∂A satisfying (6.7). Then, Proposition 6.1.1 implies that

  M_n := f(S_{n∧τ_A}) + Σ_{j=0}^{(n∧τ_A)−1} g(S_j)

is a martingale. Note that |M_n| ≤ ‖f‖_∞ + Y where

  Y = Σ_{j=0}^{τ_A−1} |g(S_j)|,

and that

  E^x[Y] = Σ_y G_A(x, y) |g(y)| < ∞.

Hence M_n is dominated by an integrable random variable and we can use the optional sampling theorem (Theorem 12.2.3) to conclude that

  f(x) = E^x[M_0] = E^x[M_{τ_A}] = E^x[ Σ_{j=0}^{τ_A−1} g(S_j) ].
Remark. Suppose A ⊂ Z^d is finite with #(A) = m. Then G_A = [G_A(x, y)]_{x,y∈A} is an m × m symmetric matrix with nonnegative entries. Let L_A = [L_A(x, y)]_{x,y∈A} be the m × m symmetric matrix defined by

  L_A(x, y) = p(x, y), x ≠ y;   L_A(x, x) = p(x, x) − 1.

If g : A → R and x ∈ A, then L_A g(x) is the same as Lg(x) where g is extended to Ā by setting g ≡ 0 on ∂A. The last proposition can be rephrased as L_A[G_A g] = −g, or in other words, G_A = (−L_A)^{−1}.
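The identity G_A = (−L_A)^{−1} is easy to test numerically. The sketch below (not from the text) builds L_A for simple random walk on a small interval in Z, inverts it, and checks that the row sums of G_A give expected exit times (see Corollary 6.2.5 below) against the exact one-dimensional answer; the choice of the set A is an arbitrary example.

```python
# Sketch: for simple random walk on A = {1,...,m} in Z, build the matrix L_A
# and verify G_A = (-L_A)^{-1}; row sums of G_A are expected exit times.
import numpy as np

m = 9
A = list(range(1, m + 1))

# L_A(x, y) = p(x, y) for x != y; L_A(x, x) = p(x, x) - 1 (here p(x, x) = 0).
L = np.zeros((m, m))
for i, x in enumerate(A):
    L[i, i] = -1.0
    for j, y in enumerate(A):
        if abs(x - y) == 1:
            L[i, j] = 0.5

G = np.linalg.inv(-L)          # Green's function G_A of the interval

# Expected exit time of {1,...,m} started at x is x*(m+1-x) for this walk.
row_sums = G.sum(axis=1)
exact = np.array([x * (m + 1 - x) for x in A], dtype=float)
print("max error in E^x[tau_A]:", np.abs(row_sums - exact).max())
```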
Corollary 6.2.4 Suppose p ∈ P_d and A ⊂ Z^d is finite. Let g : A → R, F : ∂A → R be given. Then, the function

  f(x) = E^x[F(S_{τ_A})] + E^x[ Σ_{j=0}^{τ_A−1} g(S_j) ] = Σ_{z∈∂A} H_A(x, z) F(z) + Σ_{y∈A} G_A(x, y) g(y),   (6.8)

is the unique function on Ā that satisfies

  Lf(x) = −g(x),   x ∈ A,
  f(x) = F(x),   x ∈ ∂A.

In particular, for any f : Ā → R, x ∈ A,

  f(x) = E^x[f(S_{τ_A})] − E^x[ Σ_{j=0}^{τ_A−1} Lf(S_j) ].   (6.9)

Proof Use the fact that h(x) := f(x) − E^x[F(S_{τ_A})] satisfies the assumptions in the previous proposition.
Corollary 6.2.5 Suppose p ∈ P_d and A ⊂ Z^d is finite. Then

  f(x) = E^x[τ_A] = Σ_{y∈A} G_A(x, y)

is the unique bounded function f : Ā → R that vanishes on ∂A and satisfies

  Lf(x) = −1,   x ∈ A.

Proof This is Proposition 6.2.3 with g ≡ 1_A.
Proposition 6.2.6 Let τ_n = τ_{B_n} = inf{j ≥ 0 : |S_j| ≥ n}. Then if p ∈ P_d with range R and |x| < n,

  n^2 − |x|^2 ≤ (tr Γ) E^x[τ_n] ≤ (n + R)^2 − |x|^2.

Proof In Exercise 1.4 it was shown that M_j := |S_{j∧τ_n}|^2 − (tr Γ)(j ∧ τ_n) is a martingale. Also, E^x[τ_n] < ∞ for each x, so M_j is dominated by the integrable random variable (n + R)^2 + (tr Γ) τ_n. Hence,

  |x|^2 = E^x[M_0] = E^x[M_{τ_n}] = E^x[|S_{τ_n}|^2] − (tr Γ) E^x[τ_n].

Moreover, n ≤ |S_{τ_n}| < n + R.
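For simple random walk the covariance matrix is Γ = d^{−1} I, so tr Γ = 1 and the proposition predicts E^0[τ_n] between n^2 and (n+1)^2. A quick Monte Carlo sketch (not from the text; the radius and sample size are arbitrary) checks this for d = 2.

```python
# Sketch: Monte Carlo check of Proposition 6.2.6 for simple random walk in Z^2,
# where tr(Gamma) = 1, so n^2 <= E^0[tau_n] <= (n+1)^2.
import random

def exit_time(n, x=(0, 0)):
    z, t = x, 0
    while z[0] * z[0] + z[1] * z[1] < n * n:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        z, t = (z[0] + dx, z[1] + dy), t + 1
    return t

if __name__ == "__main__":
    random.seed(1)
    n, trials = 20, 2000
    avg = sum(exit_time(n) for _ in range(trials)) / trials
    print(f"n = {n}: sample mean of tau_n = {avg:.1f}, bounds [{n*n}, {(n+1)**2}]")
```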
6.3 Difference estimates and Harnack inequality
In the next two sections we will prove useful results about random walk and harmonic functions. The main tools in the proofs are the optional sampling theorem and the estimates for the Green's function and the potential kernel. The basic idea in many of the proofs is to define a martingale in terms of the Green's function or potential kernel and then to stop it at a region at which that function is approximately constant. We recall that

  B_n = {z ∈ Z^d : |z| < n},   C_n = {z ∈ Z^d : 𝒥(z) < n}.

Also, there is a δ > 0 such that

  C_{δn} ⊂ B_n ⊂ C_{n/δ}.

We set

  ξ_n = ξ_{C_n} = min{j ≥ 1 : S_j ∉ C_n},   τ_n = τ_{B_n} = min{j ≥ 1 : S_j ∉ B_n}.

As the next proposition points out, the Green's function and potential kernel are almost constant on ∂C_n. We recall that Theorems 4.3.1 and 4.4.4 imply that as x → ∞,

  G(x) = C_d / 𝒥(x)^{d−2} + O(1/|x|^d),   d ≥ 3,   (6.10)

  a(x) = C_2 log 𝒥(x) + γ̄_2 + O(1/|x|^2).   (6.11)
Here C
2
= [
det ]
1
and
2
= C +C
2
log
Cn
, 0)], d 3,
G
Cn
(0, 0) = E[a(S
Cn
, 0)], d = 2.
We now apply Proposition 6.3.1.
It follows from Proposition 6.3.2 that
G
Bn
(0, 0) = G(0, 0) +O(n
2d
), d 3,
a
Bn
(0, 0) = C
2
log n +O(1), d = 2.
It can be shown that G
Bn
(0, 0) = G(0, 0)
C
d
n
2d
+o(n
1d
), a
Bn
(0, 0) = C
2
log n+
2
+O(n
1
) where
C
d
,
2
are dierent from C
d
,
2
but we will not need this in the sequel, hence omit the argument.
We will now prove dierence estimates and a Harnack inequality for harmonic functions. There
are dierent possible approaches to proving these results. One would be to use the result for
Brownian motion and approximate. We will use a dierent approach where we start with the
known dierence estimates for the Greens function G and the potential kernel a and work from
there. We begin by proving a dierence estimate for G
A
. We then use this to prove a result on
probabilities that is closely related to the gamblers ruin estimate for one-dimensional walks.
Lemma 6.3.3 If p ∈ P_d, d ≥ 2, then for every ε > 0, r < ∞, there is a c such that if B_n ⊂ A ⊂ Z^d, then for every |x| > εn and every |y| ≤ r,

  |G_A(0, x) − G_A(y, x)| ≤ c n^{1−d},

  |2 G_A(0, x) − G_A(y, x) − G_A(−y, x)| ≤ c n^{−d}.
Proof It suces to prove the result for nite A for we can approximate any A by nite sets (see
Exercise 4.7). Assume that x A, for otherwise the result is trivial. By symmetry G
A
(0, x) =
G
A
(x, 0), G
A
(y, x) = G
A
(x, y). By Proposition 4.6.2,
G
A
(x, 0) G
A
(x, y) = G(x, 0) G(x, y)
zA
H
A
(x, z) [G(z, 0) G(z, y)], d 3,
G
A
(x, y) G
A
(x, 0) = a(x, 0) a(x, y)
zA
H
A
(x, z) [a(z, 0) a(z, y)], d = 2.
There are similar expressions for the second dierences. The dierence estimates for the Greens
function and the potential kernel (Corollaries 4.3.3 and 4.4.5) give, provided that [y[ r and
[z[ (/2) n,
[G(z) G(z +y)[ c
n
1d
, [2G(z) G(z +y) G(z y)[ c
n
d
for d 3 and
[a(z) a(z +y)[ c
n
1
, [2a(z) a(z +y) a(z y)[ c
n
2
for d = 2.
The next lemma is very closely related to the one-dimensional gambler's ruin estimate. This lemma is particularly useful for x on or near the boundary of C_n. For x in C_n \ C_{n/2} that are away from the boundary, there are sharper estimates. See Propositions 6.4.1 and 6.4.2.

Lemma 6.3.4 Suppose p ∈ P_d, d ≥ 2. There exist c_1, c_2 such that for all n sufficiently large and all x ∈ C_n \ C_{n/2},

  P^x{ S_{ξ_{C_n \ C_{n/2}}} ∈ C_{n/2} } ≥ c_1 n^{−1},   (6.13)

and if x ∈ ∂C_n,

  P^x{ S_{ξ_{C_n \ C_{n/2}}} ∈ C_{n/2} } ≤ c_2 n^{−1}.   (6.14)
Proof We will do the proof for d 3; the proof for d = 2 is almost identical replacing the Greens
function with the potential kernel. It follows from (6.10) that there exist r, c such that for all n
suciently large and all y (
nr
, z (
n
,
G(y) G(z) cn
1d
. (6.15)
By choosing n suciently large, we can assure that
i
(
n/2
(
n/4
= .
Suppose that x (
nr
and let T =
Cn\C
n/2
. Applying the optional sampling theorem to the
bounded martingale G(S
jT
), we see that
G(x) = E
x
[G(S
T
)] E
x
[G(S
T
); S
T
(
n/2
] + max
zCn
G(z).
Therefore, (6.15) implies that
E
x
[G(S
T
); S
T
(
n/2
] c n
1d
.
For n suciently large, S
T
, (
n/4
and hence (6.10) gives
E
x
[G(S
T
); S
T
(
n/2
] c n
2d
P
x
Cn\C
n/2
<
Cn
.
This establishes (6.13) for x (
nr
.
To prove (6.13) for other x we note the following fact that holds for any p T
d
: there is an > 0
such that for all [x[ r, there is a y with p(y) and (x +y) (x) . It follows that there
is a > 0 such that for all n suciently large and all x (
n
, there is probability at least that a
random walk starting at x reaches (
nr
before leaving (
n
.
Since our random walk has nite range, it suces to prove (6.14) for x (
n
(
nr
, and any
nite r. For such x,
G(x) = C
d
n
2d
+O(n
1d
).
Also,
E
x
[G(S
T
) [ S
T
(
n/2
] = C
d
2
d2
n
2d
+O(n
1d
),
E
x
[G(S
T
) [ S
T
(
n
] = C
d
n
2d
+O(n
1d
).
The optional sampling theorem gives
G(x) = E
x
[G(S
T
)] =
P
x
S
T
(
n/2
E
x
[G(S
T
) [ S
T
(
n/2
] +P
x
S
T
(
n
E
x
[G(S
T
) [ S
T
(
n
].
The left-hand side equals C
d
n
2d
+O(n
1d
) and the right-hand side equals
C
d
n
2d
+O(n
1d
) +C
d
[2
d2
1] n
2d
P
x
S
T
(
n/2
.
Therefore P
x
S
T
(
n/2
= O(n
1
).
Proposition 6.3.5 If p ∈ P_d and x ∈ C_n,

  G_{C_n}(0, x) = C_d [𝒥(x)^{2−d} − n^{2−d}] + O(|x|^{1−d}),   d ≥ 3,

  G_{C_n}(0, x) = C_2 [log n − log 𝒥(x)] + O(|x|^{−1}),   d = 2.

In particular, for every 0 < ε < 1/2, there exist c_1, c_2 such that for all n sufficiently large,

  c_1 n^{2−d} ≤ G_{C_n}(y, x) ≤ c_2 n^{2−d},   y ∈ C_{εn}, x ∈ ∂_i C_{2εn} ∪ ∂C_{2εn}.
Proof Symmetry and Lemma 4.6.2 tell us that
G
Cn
(0, x) = G
Cn
(x, 0) = G(x, 0) E
x
[G(S
Cn
)], d 3,
G
Cn
(0, x) = G
Cn
(x, 0) = E
x
[a(S
Cn
)] a(x), d = 2. (6.16)
Also, (6.10) and (6.11) give
G(x) = C
d
(x)
2d
+O([x[
d
), d 3,
a(x) = C
2
log[(x)] +
2
+O([x[
2
), d = 2,
and Proposition 6.3.1 implies that
E
x
[G(S
Cn
)] =
C
d
n
d2
+O(n
1d
), d 3,
E
x
[a(S
Cn
)] = C
2
log n +
2
+O(n
1
), d = 2.
Since [x[ c n, we can write O([x[
d
) +O(n
1d
) O([x[
1d
). To get the nal assertion we use the
estimate
G
C
(1)n
(0, x y) G
Cn
(y, x) G
C
(1+)n
(0, x y).
We now focus on H_{C_n}, the distribution of the first visit of a random walker to the complement of C_n. Our first lemma uses the last-exit decomposition.

Lemma 6.3.6 If p ∈ P_d, x ∈ B ⊂ A ⊂ Z^d, y ∈ ∂A,

  H_A(x, y) = Σ_{z∈B} G_A(x, z) P^z{S_{ξ_{A\B}} = y} = Σ_{z∈B} G_A(z, x) P^y{S_{ξ_{A\B}} = z}.

In particular,

  H_A(x, y) = Σ_{z∈A} G_A(x, z) p(z, y) = Σ_{z∈∂_i A} G_A(x, z) p(z, y).

Proof In the first display the first equality follows immediately from Proposition 4.6.4, and the second equality uses the symmetry of p. The second display is the particular case B = A.
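The second display of Lemma 6.3.6 gives a practical way to compute hitting distributions: once G_A is known (for instance from G_A = (−L_A)^{−1}), the Poisson kernel is a single matrix-vector product. The sketch below is not from the text; it uses simple random walk on an arbitrary interval in Z and compares with the gambler's ruin formula.

```python
# Sketch: H_A(x, y) = sum_z G_A(x, z) p(z, y) for simple random walk on A = {1,...,m}.
import numpy as np

m = 6
A = list(range(1, m + 1))
L = np.zeros((m, m))
for i, x in enumerate(A):
    L[i, i] = -1.0
    for j, y in enumerate(A):
        if abs(x - y) == 1:
            L[i, j] = 0.5
G = np.linalg.inv(-L)                       # Green's function of the interval

# Boundary of A is {0, m+1}; p(z, y) = 1/2 when |z - y| = 1.
H_left = G[:, A.index(1)] * 0.5             # H_A(x, 0)
H_right = G[:, A.index(m)] * 0.5            # H_A(x, m+1)
# Gambler's ruin: H_A(x, m+1) should equal x / (m+1).
print(np.allclose(H_right, np.array(A) / (m + 1.0)))   # expect True
print(np.allclose(H_left + H_right, 1.0))               # probabilities sum to one
```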
Lemma 6.3.7 If p ∈ P_d, there exist c_1, c_2 such that for all n sufficiently large and all x ∈ C_{n/4}, y ∈ ∂C_n,

  c_1 n^{1−d} ≤ H_{C_n}(x, y) ≤ c_2 n^{1−d}.

We think of ∂C_n as a (d − 1)-dimensional subset of Z^d that contains on the order of n^{d−1} points. This lemma states that the hitting measure is mutually absolutely continuous with respect to the uniform measure on ∂C_n (with a constant independent of n).
Proof By the previous lemma,
H
Cn
(x, y) =
zC
n/2
G
Cn
(z, x) P
y
S
Cn\C
n/2
= z.
Using Proposition 6.3.5 we see that for z
i
(
n/2
, x (
n/4
, G
Cn
(z, x) n
2d
. Also, Lemma 6.3.4
implies that
zC
n/2
P
y
S
Cn\C
n/2
= z n
1
.
Theorem 6.3.8 (Difference estimates) If p ∈ P_d and r < ∞, there exists c such that the following holds for every n sufficiently large.
(a) If g : B̄_n → R is harmonic in B_n and |y| ≤ r,

  |∇_y g(0)| ≤ c ‖g‖_∞ n^{−1},   (6.17)
  |∇^2_y g(0)| ≤ c ‖g‖_∞ n^{−2}.   (6.18)

(b) If f : B̄_n → [0, ∞) is harmonic in B_n and |y| ≤ r, then

  |∇_y f(0)| ≤ c f(0) n^{−1},   (6.19)
  |∇^2_y f(0)| ≤ c f(0) n^{−2}.   (6.20)
Proof Choose > 0 such that (
2n
B
n
. Choose n suciently large so that B
r
(
(/2)n
and
i
(
2n
(
n
= . Let H(x, z) = H
C
2n
(x, z). Then for [x[ r,
g(x) =
zC
2n
H(x, z) g(z),
and similarly for f. Hence to prove the theorem, it suces to establish (6.19) and (6.20) for
f(x) = H(x, z) (with c independent of n, z). Let =
n,
=
C
2n
\Cn
. By Lemma 6.3.6, if
x (
(/2)n
,
f(x) =
w
i
Cn
G
C
2n
(w, x) P
z
S
= w.
Lemma 6.3.7 shows that f(x) n
1d
and in particular that
f(z) c f(w), z, w (
(/2)n
. (6.21)
is a > 0 such that for n suciently large, [w[ n for w
i
(
n
. The estimates (6.19) and (6.20)
now follow from Lemma 6.3.3 and Lemma 6.3.4.
Theorem 6.3.9 (Harnack inequality) Suppose p ∈ P_d, U ⊂ R^d is open and connected, and K is a compact subset of U. Then there exist c = c(K, U, p) < ∞ and a positive integer N = N(K, U, p) such that if n ≥ N,

  U_n = {x ∈ Z^d : n^{−1} x ∈ U},   K_n = {x ∈ Z^d : n^{−1} x ∈ K},

and f : Ū_n → [0, ∞) is harmonic in U_n, then

  f(x) ≤ c f(y),   x, y ∈ K_n.

This is the discrete analogue of the Harnack principle for positive harmonic functions in R^d. Suppose K ⊂ U ⊂ R^d where K is compact and U is open. Then there exists c(K, U) < ∞ such that if f : U → (0, ∞) is harmonic, then

  f(x) ≤ c(K, U) f(y),   x, y ∈ K.
Proof Without loss of generality we will assume that U is bounded. In (6.21) we showed that there
exists > 0, c
0
< such that
f(x) c
0
f(y) if [x y[ dist(x, U
n
). (6.22)
Let us call two points z, w in U adjacent if [z w[ < (/4) maxdist(z, U), dist(w, U). Let
denote the graph distance associated to this adjacency, i.e., (z, w) is the minimum k such
that there exists a sequence z = z
0
, z
1
, . . . , z
k
= w of points in U such that z
j
is adjacent to
z
j1
for j = 1, . . . , k. Fix z U, and let V
k
= w U : (z, w) k, V
n,k
= x Z
d
:
n
1
x V
k
. For k 1, V
k
is open, and connectedness of U implies that V
k
= U. For n
suciently large, if x, y V
n,k
, there is a sequence of points x = x
0
, x
1
, . . . , x
k
= y in V
n,k
such
that [x
j
x
j1
[ < (/2) maxdist(x
j
, U), dist(x
j1
, U). Repeated application of (6.22) gives
f(x) c
k
0
f(y). Compactness of K implies that K V
k
for some nite k, and hence K
n
V
n,k
.
6.4 Further estimates
In this section we will collect some more facts about random walks in T
d
restricted to the set (
n
.
The rst three propositions are similar to Lemma 6.3.4.
Proposition 6.4.1 If p T
2
, m < n, T =
Cn\Cm
, then for x (
n
(
m
,
P
x
S
T
(
n
=
log (x) log m+O(m
1
)
log n log m
.
Proof Let q = P
x
S
T
(
n
. The optional sampling theorem applied to the bounded martingale
M
j
= a(S
jT
) gives
a(x) = E
x
[a(S
T
)] = (1 q) E
x
[a(S
T
) [ S
T
i
(
m
] +q E
x
[a(S
T
) [ S
T
(
n
].
From (6.11) and Proposition 6.3.1 we know that
a(x) = C
2
log (x) +
2
+O([x[
2
),
E
x
[a(S
T
) [ S
T
i
(
m
] = C
2
log m+
2
+O(m
1
),
E
x
[a(S
T
) [ S
T
(
n
] = C
2
log n +
2
+O(n
1
).
Solving for q gives the result.
Proposition 6.4.2 If p T
d
, d 3, T =
Z
d
\Cm
, then for x Z
d
(
m
,
P
x
T < =
_
m
(x)
_
d2
_
1 +O(m
1
)
.
Proof Since G(y) is a bounded harmonic function on
Z
d
\Cm
with G() = 0, (6.5) gives
G(x) = E
x
[G(S
T
); T < ] = P
x
T < E
x
[G(S
T
) [ T < ].
But (6.10) gives
G(x) = C
d
(x)
2d
[1 +O([x[
2
)],
E
x
[G(S
T
) [ T < ] = C
d
m
2d
[1 +O(m
1
)].
Proposition 6.4.3 If p T
2
, n > 0, and T =
Cn\{0}
, then for x (
n
,
P
x
S
T
= 0 =
_
1
log (x) +O([x[
1
)
log n
_ _
1 +O
_
1
log n
__
.
Proof Recall that P
x
S
T
= 0 = G
Cn
(x, 0)/G
Cn
(0, 0). The estimate then follows immediately from
Propositions 6.3.2 and 6.3.5. The O([x[
1
) term is superuous except for x very close to (
n
.
Suppose m n/2, x (
m
, z (
n
. By applying Theorem 6.3.8 O(m) times we can see that (for
n suciently large)
H
Cn
(x, z) = P
x
S
n
= z = H
Cn
(0, z)
_
1 +O
_
m
n
__
. (6.23)
We will use this in the next two propositions to estimate some conditional probabilities.
Proposition 6.4.4 Suppose p T
d
, d 3, m < n/4, and (
n
(
m
A (
n
. Suppose x (
2m
with P
x
S
A
(
n
> 0 and z (
n
. Then for n suciently large,
P
x
S
A
= z [ S
A
(
n
= H
Cn
(0, z)
_
1 +O
_
m
n
__
. (6.24)
Proof It is easy to check (using optional stopping) that it suces to verify (6.24) for x (
2m
.
Note that (6.23) gives
P
x
S
n
= z = H
Cn
(0, z)
_
1 +O
_
m
n
__
,
and since A (
n
(
m
,
P
x
S
n
= z [ S
A
, (
n
= H
Cn
(0, z)
_
1 +O
_
m
n
__
.
This implies
P
x
S
n
= z; S
A
, (
n
= PS
A
, (
n
H
Cn
(0, z)
_
1 +O
_
m
n
__
.
The last estimate, combined with (6.24), yields
P
x
S
n
= z; S
A
(
n
= PS
A
(
n
H
Cn
(0, z) +H
Cn
(0, z) O
_
m
n
_
.
Using Proposition 6.4.2, we can see there is a c such that
P
x
S
A
(
n
P
x
S
j
, (
m
for all j c, x (
2m
,
which allows us to write the preceding expression as
P
x
S
n
= z; S
A
(
n
= PS
A
(
n
H
Cn
(0, z)
_
1 +O
_
m
n
__
.
For d = 2 we get a similar result but with a slightly larger error term.
Proposition 6.4.5 Suppose p T
2
, m < n/4, and (
n
(
m
A (
n
. Suppose x (
2m
with
P
x
S
A
(
n
> 0 and z (
n
. Then, for n suciently large,
P
x
S
A
= z [ S
A
(
n
= H
Cn
(0, z)
_
1 +O
_
m log(n/m)
n
__
. (6.25)
Proof The proof is essentially the same, except for the last step, where Proposition 6.4.1 gives us
P
x
S
A
(
n
c
log(n/m)
, x (
2m
,
so that
H
Cn
(0, z) O
_
m
n
_
can be written as
P
x
S
A
(
n
H
Cn
(0, z) O
_
m log(n/m)
n
_
.
The next proposition is a stronger version of Proposition 6.2.2. Here we show that the bounded-
ness assumption of that proposition can be replaced with an assumption of sublinearity.
Proposition 6.4.6 Suppose p T
d
, d 3 and A Z
d
with Z
d
A nite. Suppose f : Z
d
R is
harmonic on A and satises f(x) = o([x[) as x . Then there exists b R such that for all x,
f(x) = E
x
[f(S
A
);
A
< ] +b P
x
A
= .
Proof Without loss of generality, we may assume that 0 , A. Also, we may assume that f 0 on
Z
d
A; otherwise, we can consider
f(x) = f(x) E
x
[f(S
A
);
A
< ].
The assumptions imply that there is a sequence of real numbers
n
decreasing to 0 such that
[f(x)[
n
n for all x (
n
and hence
[f(x) f(y)[ 2
n
n, x, y (
n
.
Since Lf 0 on A, (6.8) gives
0 = f(0) = E[f(S
n
)]
yZ
d
\A
G
Cn
(0, y) Lf(y), (6.26)
and since Z
d
A is nite, this implies that
lim
n
E[f(S
n
)] = b :=
yZ
d
\A
G(0, y) Lf(y).
If x A (
n
, the optional sampling theorem implies that
f(x) = E
x
[f(S
A
n
)] = E
x
[f(S
n
);
A
>
n
] = P
x
A
>
n
E
x
[f(S
n
) [
A
>
n
].
For every w (
n
, we can write
E
x
[f(S
n
) [
A
>
n
] E[f(S
n
)] =
zCn
f(z) [P
x
S
n
= z [
A
>
n
H
Cn
(0, z)]
=
zCn
[f(z) f(w)] [P
x
S
n
= z [
A
>
n
H
Cn
(0, z)].
For n large, apply
wCn
and divide by [(
n
[ the above identity, and note that (6.24) now implies
[E
x
[f(S
n
) [
A
>
n
] E[f(S
n
)][ c
[x[
n
sup
y,zCn
[f(z) f(y)[ c [x[
n
. (6.27)
Therefore,
f(x) = lim
n
P
x
A
>
n
E
x
[f(S
n
) [
A
>
n
]
= P
x
A
= lim
n
E[f(S
n
)] = b P
x
A
= .
Proposition 6.4.7 Suppose p T
2
and A is a nite subset of Z
2
containing the origin. Let
T = T
A
=
Z
2
\A
= minj 0 : S
j
A. Then for each x Z
2
the limit
g
A
(x) := lim
n
C
2
(log n) P
x
n
< T (6.28)
exists. Moreover, if y A,
g
A
(x) = a(x y) E
x
[a(S
T
y)]. (6.29)
Proof If y A and x (
n
A, the optional sampling theorem applied to the bounded martingale
M
j
= a(S
jTn
y) implies
a(x y) = E
x
[a(S
Tn
y)] = P
x
n
< T E
x
[a(S
n
y) [
n
< T] +E
x
[a(S
T
y)]
P
x
n
< T E
x
[a(S
T
y) [
n
< T].
As n ,
E
x
[a(S
n
y) [
n
< T] C
2
log n.
Letting n , we obtain the result.
Remark. As mentioned before, it follows that the right-hand side of (6.29) is the same for all
y A. Also, since there exists such that (
n
B
n
(
n/
we can replace (6.28) with
g
A
(x) := lim
n
C
2
(log n) P
x
n
< T.
The astute reader will note that we already proved this proposition in Proposition 4.6.3.
Proposition 6.4.8 Suppose p T
2
and A is a nite subset of Z
2
. Suppose f : Z
2
R is harmonic
on Z
2
A; vanishes on A; and satises f(x) = o([x[) as [x[ . Then f = b g
A
for some b R.
Proof Without loss of generality, assume 0 A and let T = T
A
be as in the previous proposition.
Using (6.8) and (6.12), we get
E[f(S
n
)] =
yA
G
Cn
(0, y) Lf(y) = C
2
log n
yA
Lf(y) +O(1).
(Here and below the error terms may depend on A.) As in the argument deducing (6.27), we use
(6.25) to see that
[E
x
[f(S
Tn
) [
n
< T] E[f(S
n
)][ c
[x[ log n
n
sup
y,zCn
[f(y) f(z)[ c [x[
n
log n,
and combining the last two estimates we get
f(x) = E
x
[f(S
Tn
)] = P
x
n
< T E
x
[f(S
T
A
n
) [
n
< T]
= P
x
n
< T E[f(S
n
)] +[x[ o(1)
= b g
A
(x) +o(1),
where b =
yA
Lf(y).
6.5 Capacity, transient case
If A is a finite subset of Z^d, we let

  T_A = min{j ≥ 1 : S_j ∈ A},   T̄_A = min{j ≥ 0 : S_j ∈ A},   rad(A) = sup{|x| : x ∈ A}.

If p ∈ P_d, d ≥ 3, define

  Es_A(x) = P^x{T_A = ∞},   g_A(x) = P^x{T̄_A = ∞}.

Note that Es_A(x) = 0 if x ∈ A \ ∂_i A. Furthermore, due to Proposition 6.4.6, g_A is the unique function on Z^d that is zero on A; harmonic on Z^d \ A; and satisfies g_A(x) → 1 as |x| → ∞. In particular, if x ∈ A,

  Lg_A(x) = Σ_y p(y) g_A(x + y) = Es_A(x).

Definition. If d ≥ 3, the capacity of a finite set A is given by

  cap(A) = Σ_{x∈A} Es_A(x) = Σ_{z∈∂_i A} Es_A(z) = Σ_{x∈A} Lg_A(x) = Σ_{z∈∂_i A} Lg_A(z).

The motivation for the above definition is given by the following property (stated as the next proposition): as z → ∞, the probability that a random walk starting at z ever hits A is comparable to |z|^{2−d} cap(A).
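Since cap(A) is a finite sum of escape probabilities, it can be estimated directly by simulation. The sketch below is not from the text: it uses simple random walk in Z^3, an arbitrary small set A, and a finite time cutoff, so the escape probabilities (and hence the capacity) are slightly overestimated.

```python
# Sketch: Monte Carlo estimate of cap(A) = sum_{x in A} Es_A(x) for SRW in Z^3.
# A finite cutoff is used, so escape probabilities are slightly overestimated.
import random

A = {(i, 0, 0) for i in range(5)}          # an arbitrary 5-point segment
STEPS = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]

def escapes(x, cutoff=1000):
    """True if the walk from x does not return to A within `cutoff` steps."""
    z = x
    for _ in range(cutoff):
        d = random.choice(STEPS)
        z = (z[0] + d[0], z[1] + d[1], z[2] + d[2])
        if z in A:
            return False
    return True

def es_A(x, trials=400):
    return sum(escapes(x) for _ in range(trials)) / trials

if __name__ == "__main__":
    random.seed(2)
    cap = sum(es_A(x) for x in A)
    print("estimated cap(A) ~", round(cap, 3))
```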
Proposition 6.5.1 If p ∈ P_d, d ≥ 3 and A ⊂ Z^d is finite, then

  P^x{T_A < ∞} = (C_d cap(A) / 𝒥(x)^{d−2}) [1 + O(rad(A)/|x|)],   |x| ≥ 2 rad(A).
Proof There is a such that B
n
(
n/
for all n. We will rst prove the result for x , (
2rad(A)/
.
By the last-exit decomposition, Proposition 4.6.4,
P
x
T
A
< =
yA
G(x, y) Es
A
(y).
For y A, (x y) = (x) +O([y[). Therefore,
G(x, y) = C
d
(x)
2d
+O
_
[y[
[x[
d1
_
=
C
d
(x)
d2
_
1 +O
_
rad(A)
[x[
__
.
This gives the result for x , (
2rad(A)/
. We can extend this to [x[ 2rad(A) by using the Harnack
inequality (Theorem 6.3.9) on the set
z : 2 rad(A) [z[; (z) (3/) rad(A).
Note that for x in this set, rad(A)/[x[ is of order 1, so it suces to show that there is a c such that,
for any two points x, z in this set,
P
x
T
A
< c P
z
T
A
< .
Proposition 6.5.2 If p T
d
, d 3,
cap((
n
) = C
1
d
n
d2
+O(n
d1
).
Proof By Proposition 4.6.4,
1 = PT
Cn
< =
y
i
Cn
G(0, y) Es
Cn
(y),
But for y
i
(
n
, Proposition 6.3.1 gives
G(0, y) = C
d
n
2d
[1 +O(n
1
)].
Hence,
1 = C
d
n
2d
cap((
n
)
_
1 +O(n
1
)
.
Let
T
A,n
= T
A
n
= infj 1 : S
j
A or S
j
, (
n
.
If x A (
n
,
P
x
T
A
>
n
=
yCn
P
x
S
T
A,n
= y =
yCn
P
y
S
T
A,n
= x.
The last equation uses symmetry of the walk. As a consequence,
xA
P
x
T
A
>
n
=
xA
yCn
P
y
S
T
A,n
= x =
yCn
P
y
T
A
<
n
. (6.30)
Therefore,
cap(A) =
xA
Es
A
(x) = lim
n
xA
P
x
T
A
>
n
= lim
n
yCn
P
y
T
A
<
n
. (6.31)
The identities (6.30)(6.31) relate, for a given nite set A, the probability that a random walker started
uniformly in A escapes A and the probability that a random walker started uniformly on the boundary of a
large ellipse, (far away from A) ever hits A. Formally, every path from A to innity can also be considered as a
path from innity to A by reversal. This correspondence is manifested again in Proposition 6.5.4.
Proposition 6.5.3 If p T
d
, d 3, and A, B are nite subsets of Z
d
, then
cap(A B) cap(A) + cap(B) cap(A B).
Proof Choose n such that A B (
n
. Then for y (
n
,
P
y
T
AB
<
n
= P
x
T
A
<
n
or T
B
<
n
= P
y
T
A
<
n
+P
y
T
B
<
n
P
y
T
A
<
n
, T
B
<
n
P
y
T
A
<
n
+P
y
T
B
<
n
P
y
T
AB
<
n
.
The proposition then follows from (6.31).
Definition. If p ∈ P_d, d ≥ 3, and A ⊂ Z^d is finite, the harmonic measure of A (from infinity) is defined by

  hm_A(x) = Es_A(x) / cap(A),   x ∈ A.

Note that hm_A is a probability measure supported on ∂_i A. As the next proposition shows, it can be considered as the hitting measure of A by a random walk started at infinity conditioned to hit A.
Proposition 6.5.4 If p T
d
, d 3, and A Z
d
is nite, then for x A,
hm
A
(x) = lim
|y|
P
y
S
T
A
= x [ T
A
< .
In fact, if A (
n/2
and y , (
n
, then
P
y
S
T
A
= x [ T
A
< = hm
A
(x)
_
1 +O
_
rad(A)
[y[
__
. (6.32)
Proof If A (
n
and y , (
n
, the last-exit decomposition (Proposition 4.6.4) gives
P
y
S
T
A
= x =
zCn
G
Z
d
\A
(y, z) P
z
S
T
A,n
= x,
where, as before, T
A,n
= T
A
n
. By symmetry and (6.24),
P
z
S
T
A,n
= x = P
x
S
T
A,n
= z = P
x
n
< T
A
H
Cn
(0, z)
_
1 +O
_
rad(A)
n
__
= Es
A
(x) H
Cn
(0, z)
_
1 +O
_
rad(A)
n
__
.
The last equality uses
Es
A
(x) = P
x
T
A
= = P
x
T
A
>
n
_
1 +O
_
rad(A)
d2
n
d2
__
,
which follows from Proposition 6.4.2. Therefore,
P
y
S
T
A
= x = Es
A
(x)
_
1 +O
_
rad(A)
n
__
zCn
G
Z
d
\A
(y, z) H
Cn
(0, z),
and by summing over x,
P
y
T
A
< = cap(A)
_
1 +O
_
rad(A)
n
__
zCn
G
Z
d
\A
(y, z) H
Cn
(0, z).
We obtain (6.32) by dividing the last two expressions.
Proposition 6.5.5 If p T
d
, d 3, and A Z
d
is nite, then
cap(A) = sup
xA
f(x), (6.33)
where the supremum is over all functions f 0 supported on A such that
Gf(y) :=
xZ
d
G(y, x) f(x) =
xA
G(y, x) f(x) 1
for all y Z
d
.
Proof Let
f(x) = Es
A
(x). Note that Proposition 4.6.4 implies that for y Z
d
,
1 P
y
T
A
< =
xA
G(y, x) Es
A
(x).
Hence G
f 1 and the supremum in (6.33) is at least as large as cap(A). Note also that G
f is the
unique bounded function on Z
d
that is harmonic on Z
d
A; equals 1 on A; and approaches 0 at
innity. Suppose f 0, f = 0 on Z
d
A, with Gf(y) 1 for all y Z
d
. Then Gf is the unique
bounded function on Z
d
that is harmonic on Z
d
A; equals Gf 1 on A; and approaches zero at
innity. By the maximum principle, Gf(y) G
xA
f(x)
xA
f(x).
If x, y A, let
K
A
(x, y) = P
x
S
T
A
= y.
Note that K
A
(x, y) = K
A
(y, x) and
yA
K
A
(x, y) = 1 Es
A
(x).
If h is a bounded function on Z
d
that is harmonic on Z
d
A and has h() = 0, then h(z) =
E[h(S
T
A
); T
A
< ], z Z
d
. Using this one can easily check that for x A,
Lh(x) =
_
_
yA
K
A
(x, y) h(y)
_
_
h(x).
Also, if h 0,
xA
yA
K
A
(x, y) h(y) =
yA
h(y)
xA
K
A
(y, x) =
yA
h(y) [1 Es
A
(y)]
yA
h(y),
which implies
xZ
d
Lh(x) =
xA
Lh(x) 0.
Then, using (4.25),
xA
f(x) =
xA
L[Gf](x)
xA
L[Gf](x)
xA
L[G(
f f)](x) =
xA
Es
A
(x).
Our denition of capacity depends on the random walk p. The next proposition shows that
capacities for dierent ps in the same dimension are comparable.
Proposition 6.5.6 Suppose p, q T
d
, d 3 and let cap
p
, cap
q
denote the corresponding capacities.
Then there is a = (p, q) > 0 such that for all nite A Z
d
,
cap
p
(A) cap
q
(A)
1
cap
p
(A).
Proof It follows from Theorem 4.3.1 that there exists such that
G
p
(x, y) G
q
(x, y)
1
G
p
(x, y),
for all x, y. The proposition then follows from Proposition 6.5.5.
Denition. If p T
d
, d 3, and A Z
d
, then A is transient if
PS
n
A i.o. = 0.
Otherwise, the set is called recurrent.
Lemma 6.5.7 If p T
d
, d 3, then a subset A of Z
d
is recurrent if and only if for every x Z
d
,
P
x
S
n
A i.o. = 1.
Proof The if direction of the statement is trivial. To show the only if direction, let F(y) = P
y
S
n
A i.o., and note that F is a bounded harmonic function on Z
d
, so it must be constant by Propo-
sition 6.1.2. Now if F(y) > 0, y Z
d
, then for each x there is an N
x
such that
P
x
S
n
A for some n N
x
/2.
By iterating this we can see for all x,
P
x
S
n
A for some n < = 1,
and the lemma follows easily.
Alternatively, S
n
A i.o. is an exchangeable event with respect to the i.i.d. steps of the random walk,
and therefore P
x
(S
n
A i.o.) 0, 1.
Clearly, all nite sets are transient; in fact, nite unions of transient sets are transient. If A is a
subset such that
xA
G(x) < , (6.34)
then A is transient. To see this, let S
n
be a random walk starting at the origin and let V denote
the number of visits to A,
V
A
=
j=0
1S
n
A.
Then (6.34) implies that E[V
A
] < which implies that PV
A
< = 1. In Exercise 6.3, it is
shown that the converse is not true, i.e., there exist transient sets A with E[V
A
] = .
Lemma 6.5.8 Suppose p T
d
, d 3, and A Z
d
. Then A is transient if and only if
k=1
PT
k
< < , (6.35)
where T
k
= T
A
k
and A
k
= A ((
2
k (
2
k1).
Proof Let E
k
be the event T
k
< . Since the random walk is transient, A is transient if and
only if PE
k
i.o. = 0. Hence the Borel-Cantelli Lemma implies that any A satisfying (6.35) is
transient.
Suppose
k=1
PT
k
< = .
Then either the sum over even k or the sum over odd k is innite. We will assume the former;
the argument if the latter holds is almost identical. Let B
k,+
= A
k
(z
1
, . . . , z
d
) : z
1
0 and
B
k,
= A
k
(z
1
, . . . , z
d
) : z
1
0. Since PT
2k
< PT
B
2k,+
< + PT
B
2k,
< , we
know that either
k=1
PT
B
2k,+
< = , (6.36)
or the same equality with B
2k,
replacing B
2k,+
. We will assume (6.36) holds and write
k
= T
B
2k,+
.
An application of the Harnack inequality (we leave the details as Exercise 6.11) shows that there
is a c such that for all j ,= k,
P
j
< [
j
k
=
k
< c P
j
< .
This implies
P
j
< ,
k
< 2c P
j
< P
k
< .
Using this and a special form of the Borel-Cantelli Lemma (Corollary 12.6.2) we can see that
P
j
< i.o. > 0,
which implies that A is not transient.
Corollary 6.5.9 (Wiener's test) Suppose p ∈ P_d, d ≥ 3, and A ⊂ Z^d. Then A is transient if and only if

  Σ_{k=1}^∞ 2^{(2−d)k} cap(A_k) < ∞,   (6.37)

where A_k = A ∩ (C_{2^k} \ C_{2^{k−1}}). In particular, if A is transient for some p ∈ P_d, then it is transient for all p ∈ P_d.

Proof Due to Proposition 6.5.1, we have that P{T_{A_k} < ∞} ≍ 2^{(2−d)k} cap(A_k).
Theorem 6.5.10 Suppose d ≥ 3, p ∈ P_d, and S_n is a p-walk. Let A be the set of points visited by the random walk,

  A = S[0, ∞) = {S_n : n = 0, 1, . . .}.

If d = 3, 4, then with probability one A is a recurrent set. If d ≥ 5, then with probability one A is a transient set.
Proof Since a set is transient if and only if all its translates are transient, we see that for each n,
A is recurrent if and only if the set
  {S_m − S_n : m = n, n + 1, . . .}

is recurrent. Hence the event that A is recurrent is a tail event, and the Kolmogorov 0-1 law now
implies that it has probability 0 or 1.
Let Y denote the random variable that equals the expected number of visits to A by an inde-
pendent random walker
S
n
starting at the origin. In other words,
Y =
xA
G(x) =
xZ
d
1x A G(x).
Then,
E(Y ) =
xZ
d
Px A G(x) = G(0)
1
xZ
d
G(x)
2
.
Since G(x) [x[
2d
, we have G(x)
2
[x[
42d
. By examining the sum, we see that E(Y ) = for
d = 3, 4 and E(Y ) < for d 5. If d 5, this gives Y < with probability one which implies
that A is transient with probability one.
We now focus on d = 4 (it is easy to see that if the result holds for d = 4 then it also holds for
d = 3). It suces to show that PA is recurrent > 0. Let S
1
, S
2
be independent random walks
with increment distribution p starting at the origin, and let
j
k
= minn : S
j
n
, (
2
k .
Let
V
j
k
= [(
2
k (
2
k1] S
j
[0,
j
k+1
) = x (
2
k (
2
k1 : S
j
n
= x for some n
j
k+1
.
Let E
k
be the event V
1
k
V
2
k
,= . We will show that PE
k
i.o. > 0 which will imply that with
positive probability, S
1
n
: n = 0, 1, . . . is recurrent. Using Corollary 12.6.2, one can see that it
suces to show that
k=1
P(E
3k
) = , (6.38)
and that there exists a constant c < such that for m < k,
P(E
3m
E
3k
) c P(E
3m
) P(E
3k
). (6.39)
The event E
3m
depends only on the values of S
j
n
with
j
3m1
n
j
3m+1
. Hence, the Harnack
inequality implies P(E
3k
[ E
3m
) c P(E
3k
) so (6.39) holds. To prove (6.38), let J
j
(k, x) denote the
indicator function of the event that S
j
n
= x for some n
j
k
. Then,
Z
k
:= #(V
1
k
V
2
k
) =
xC
2
k
\C
2
k1
J
1
(k, x) J
2
(k, x).
There exist c
1
, c
2
such that if x, y (
2
k (
2
k1, (recall d 2 = 2)
E[J
j
(k, x)] c
1
(2
k
)
2
, E[J
j
(k, x) J
j
(k, y)] c
2
(2
k
)
2
[1 +[x y[]
2
.
(The latter inequality is obtained by noting that the probability that a random walker hits both x
and y given that it hits at least one of them is bounded above by the probability that a random
walker starting at the origin visits y x.) Therefore,
E[Z
k
] =
xC
2
k
\C
2
k1
E[J
1
(k, x)] E[J
2
(k, x)] c
xC
2
k
\C
2
k1
(2
k
)
4
c,
E[Z
2
k
] =
x,yC
2
3k
\C
2
k1
E[J
1
(k, x) J
1
(k, y)] E[J
2
(k, x) J
2
(k, y)]
c
x,yC
2
k
\C
2
k1
(2
k
)
4
1
[1 +[x y[
2
]
2
ck,
where for the last inequality note that there are O(2
4k
) points in (
2
k (
2
k1, and that for x
(
2
k (
2
k1 there are O(
3
) points y (
2
k (
2
k1 at distance from from x. The second moment
estimate, Lemma 12.6.1 now implies that PZ
k
> 0 c/k, hence (6.38) holds.
The central limit theorem implies that the number of points in B_n visited by a random walk is of order n^2. Roughly speaking, we can say that a random walk path is a two-dimensional set. Asking whether or not this is recurrent is asking whether or not two random two-dimensional sets intersect. Using the example of planes in R^d, one can guess that the critical dimension is four.
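A crude finite-time illustration of this dichotomy can be obtained by simulation. The sketch below is not from the text; it counts the common points of the ranges of two independent simple random walks started at the origin (so the origin itself is always counted), for an arbitrary number of steps, in d = 4 and d = 5.

```python
# Sketch: intersections of the ranges of two independent simple random walks,
# illustrating that d = 4 is the critical dimension (Theorem 6.5.10).
import random

def walk_range(n, d):
    z = (0,) * d
    pts = {z}
    for _ in range(n):
        i = random.randrange(d)
        s = random.choice((-1, 1))
        z = z[:i] + (z[i] + s,) + z[i + 1:]
        pts.add(z)
    return pts

if __name__ == "__main__":
    random.seed(3)
    n = 20000
    for d in (4, 5):
        hits = [len(walk_range(n, d) & walk_range(n, d)) for _ in range(3)]
        print(f"d = {d}: common points of two ranges (3 trials): {hits}")
```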
6.6 Capacity in two dimensions
The theory of capacity in two dimensions is somewhat similar to that for d ≥ 3, but there are significant differences due to the fact that the random walk is recurrent. We start by recalling a few facts from Propositions 6.4.7 and 6.4.8. If p ∈ P_2 and 0 ∈ A ⊂ Z^2 is finite, let

  g_A(x) = a(x) − E^x[a(S_{T̄_A})] = lim_{n→∞} C_2 (log n) P^x{ξ_n < T̄_A}.   (6.40)

The function g_A is the unique function on Z^2 that vanishes on A; is harmonic on Z^2 \ A; and satisfies g_A(x) ∼ C_2 log 𝒥(x) ∼ C_2 log |x| as x → ∞. If y ∈ A, we can also write

  g_A(x) = a(x − y) − E^x[a(S_{T̄_A} − y)].

To simplify notation we will mostly assume that 0 ∈ A, and then a(x) − g_A(x) is the unique bounded function on Z^2 that is harmonic on Z^2 \ A and has boundary value a on A. We define the harmonic measure of A (from infinity) by

  hm_A(x) = lim_{|y|→∞} P^y{S_{T̄_A} = x}.   (6.41)

Since P^y{T̄_A < ∞} = 1, this is the same as P^y{S_{T̄_A} = x | T̄_A < ∞} and hence agrees with the definition of harmonic measure for d ≥ 3. It is not clear a priori that the limit exists; this fact is established in the next proposition.

Proposition 6.6.1 Suppose p ∈ P_2 and 0 ∈ A ⊂ Z^2 is finite. Then the limit in (6.41) exists and equals Lg_A(x).
Proof Fix A and let r
A
= rad(A). Let n be suciently large so that A (
n/4
. Using (6.25) on the
set Z
2
A, we see that if x
i
A, y (
n
,
P
y
S
T
A
n
= x = P
x
S
T
A
n
= y = P
x
n
< T
A
H
Cn
(0, y)
_
1 +O
_
r
A
log n
n
__
.
If z Z
2
(
n
, the last-exit decomposition (Proposition 4.6.4) gives
P
z
S
T
A
= x =
yCn
G
Z
2
\A
(z, y) P
y
S
T
A
n
= x.
Therefore,
P
z
S
T
A
= x = P
x
n
< T
A
J(n, z)
_
1 +O
_
r
A
log n
n
__
, (6.42)
where
J(n, z) =
yCn
H
Cn
(0, y) G
Z
2
\A
(z, y).
If x A, the denition of L, the optional sampling theorem, and the asymptotic expansion of
g
A
respectively imply
Lg
A
(x) = E
x
[g
A
(S
1
)] = E
x
[g
A
(S
T
A
n
)]
= E
x
[g
A
(S
n
);
n
< T
A
]
= P
x
n
< T
A
[C
2
log n +O
A
(1)] . (6.43)
In particular,
Lg
A
(x) = lim
n
C
2
(log n) P
x
n
< T
A
, x A. (6.44)
(This is the d = 2 analogue of the relation Lg
A
(x) = Es
A
(x) for d 3.)
Note that (as in (6.30))
x
i
A
P
x
n
< T
A
=
x
i
A
yCn
P
x
S
nT
A
= y
=
yCn
x
i
A
P
y
S
nT
A
= x =
yCn
P
y
T
A
<
n
.
Proposition 6.4.3 shows that if x A, then the probability that a random walk starting at x reaches
(
n
before visiting the origin is bounded above by c log r
A
/ log n. Therefore,
P
y
T
A
<
n
= P
y
T
{0}
<
n
_
1 +O
_
log r
A
log n
__
.
As a consequence,
x
i
A
P
x
n
< T
A
=
yCn
P
y
S
nT
{0}
= 0
_
1 +O
_
log r
A
log n
__
= P
n
< T
{0}
_
1 +O
_
log r
A
log n
__
= [C
2
log n]
1
_
1 +O
_
log r
A
log n
__
.
Combining this with (6.44) gives
xA
Lg
A
(x) =
x
i
A
Lg
A
(x) = 1. (6.45)
Here we see a major dierence between the recurrent and transient case. If d 3, the sum above
equals cap(A) and increases in A, while it is constant in A if d = 2. (In particular, it would not be
a very useful denition for a capacity!)
Using (6.42) together with
xA
P
z
S
T
A
= x = 1, we see that
J(n, z)
xA
P
x
n
< T
A
= 1 +O
_
r
A
log n
n
_
,
which by (6.43) (6.45) implies that
J(n, z) = C
2
log n
_
1 +O
_
r
A
log n
n
__
,
uniformly in z Z
2
(
n
, and the claim follows by (6.42).
We define the capacity of A by

  cap(A) := lim_{y→∞} [a(y) − g_A(y)] = Σ_{x∈A} hm_A(x) a(x − z),

where z ∈ A. The last proposition establishes the limit if z = 0 ∈ A, and for other z use (6.29) and lim_{y→∞} [a(y) − a(y − z)] = 0. We have the expansion

  g_A(x) = C_2 log 𝒥(x) + γ̄_2 − cap(A) + o_A(1),   |x| → ∞.

It is easy to check from the definition that the capacity is translation invariant, that is, cap(A + y) = cap(A), y ∈ Z^d. Note that singleton sets have capacity zero.
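One numerical consequence of the formulas in this section (see Proposition 6.6.3 and Lemma 6.6.4 below) is that, for a set with at least two points, the vector f = a_A^{-1} 1 has positive sum and cap(A) = 1/Σ_x f(x). The sketch below is not from the text; it approximates the two-dimensional capacity of an arbitrary small set for simple random walk, and it replaces the exact potential kernel by the standard asymptotic a(x) ≈ (2/π) ln|x| + κ with κ = (2γ + ln 8)/π, which is rough at short distances, so the result is only an approximation.

```python
# Sketch: approximate 2-d capacity cap(A) = 1 / (1^T a_A^{-1} 1) for simple
# random walk, using the potential-kernel asymptotic (inexact at short range).
import numpy as np

GAMMA = 0.5772156649015329
KAPPA = (2 * GAMMA + np.log(8)) / np.pi

def a(z):
    """Approximate potential kernel of simple random walk on Z^2 (a(0) = 0)."""
    r = np.hypot(*z)
    return 0.0 if r == 0 else (2 / np.pi) * np.log(r) + KAPPA

A = [(i, 0) for i in range(6)]            # an arbitrary 6-point segment
aA = np.array([[a((x[0] - y[0], x[1] - y[1])) for y in A] for x in A])
f = np.linalg.solve(aA, np.ones(len(A)))  # f = a_A^{-1} 1
print("approximate cap(A) =", 1.0 / f.sum())
```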
Proposition 6.6.2 Suppose p T
2
.
(a) If 0 A B Z
d
are nite, then g
A
(x) g
B
(x) for all x. In particular, cap(A) cap(B).
(b) If A, B Z
d
are nite subsets containing the origin, then for all x
g
AB
(x) g
A
(x) +g
B
(x) g
AB
(x). (6.46)
In particular,
cap(A B) cap(A) + cap(B) cap(A B).
Proof The inequality g
A
(x) g
B
(x) follows immediately from (6.40). The inequality (6.46) follows
from (6.40) and the observation (recall also the argument for Proposition 6.5.3)
P
x
T
AB
<
n
= P
x
T
A
<
n
or T
B
<
n
= P
x
T
A
<
n
+P
x
T
B
<
n
P
x
T
A
<
n
, T
B
<
n
P
x
T
A
<
n
+P
x
T
B
<
n
P
x
T
AB
<
n
,
which implies
P
x
T
AB
>
n
P
x
T
A
>
n
+P
x
T
B
>
n
P
x
T
AB
>
n
.
We next derive an analogue of Proposition 6.5.5. If A is a nite set, let a
A
denote the #(A)#(A)
symmetric matrix with entries a(x, y). Let a
A
also denote the operator
a
A
f(x) =
yA
a(x, y) f(y)
which is dened for all functions f : A R and all x Z
2
. Note that x a
A
f(x) is harmonic on
Z
2
A.
Proposition 6.6.3 Suppose p T
2
and 0 A Z
2
is nite. Then
cap(A) =
_
_
sup
yA
f(y)
_
_
1
,
where the supremum is over all nonnegative functions f on A satisfying a
A
f(x) 1 for all x A.
If A = 0 is a singleton set, the proposition is trivial since a
A
f(0) = 0 for all f and hence the
supremum is innity. A natural rst guess for other A (which turns out to be correct) is that the
supremum is obtained by a function f satisfying a
A
f(x) = 1 for all x A. If a
A
(x, y)
x,yA
is
invertible, there is a unique such function that can be written as f = a
1
A
1 (where 1 denotes the
vector of all 1s). The main ingredient in the proof of Proposition 6.6.3 is the next lemma that
shows this inverse is well dened assuming A has at least two points.
Lemma 6.6.4 Suppose p T
2
and 0 A Z
2
is nite with at least two points. Then a
1
A
exists
and
a
1
A
(x, y) = P
x
S
T
A
= y (y x) +
Lg
A
(x) Lg
A
(y)
cap(A)
, x, y A.
Proof We will rst show that for all x Z
2
.
zA
a(x, z) Lg
A
(z) = cap(A) +g
A
(x). (6.47)
To prove this, we will need the following fact (see Exercise 6.7):
lim
n
[G
Cn
(0, 0) G
Cn
(x, y)] = a(x, y). (6.48)
Consider the function
h(x) =
zA
a(x, z) Lg
A
(z).
We rst claim that h is constant on A. By a last-exit decomposition (Proposition 4.6.4), if x, y A,
1 = P
x
T
A
<
n
=
zA
G
Cn
(x, z) P
z
n
< T
A
=
zA
G
Cn
(y, z) P
z
n
< T
A
.
Hence,
(C
2
log n)
zA
[G
Cn
(0, 0) G
Cn
(x, z)]P
z
n
< T
A
=
(C
2
log n)
zA
[G
Cn
(0, 0) G
Cn
(y, z)]P
z
n
< T
A
.
Letting n , and recalling that C
2
(log n) P
z
n
< T
A
Lg
A
(z), we conclude that h(x) = h(y).
Theorem 4.4.4 and (6.45) imply that
lim
x
[a(x) h(x)] = 0.
Hence, a(x) h(x) is a bounded function that is harmonic in Z
2
A and takes the value a h
A
on
A, where h
A
denotes the constant value of h on A. Now Theorem 6.2.1 implies that a(x) h(x) =
a(x) g
A
(x) h
A
. Therefore,
h
A
= lim
x
[a(x) g
A
(x)] = cap(A).
This establishes (6.47).
An application of the optional sampling theorem gives for z A
G
Cn
(x, z) = (z x) +E
x
[G
Cn
(S
1
, z)] = (z x) +
yA
P
x
S
T
A
n
= y G
Cn
(y, z).
Hence,
G
Cn
(0, 0) G
Cn
(x, z) = (z x)
+G
Cn
(0, 0) P
x
n
< T
A
+
yA
P
x
S
A
n
= y [G
Cn
(0, 0) G
Cn
(y, z)].
Letting n and using (6.12) and (6.48), as well as Proposition 6.6.1, this gives
(z x) = a(x, z) +Lg
A
(x) +
yA
P
x
S
T
A
= y a(y, z).
If x, z A, we can use (6.47) to write the previous identity as
(z x) =
yA
_
P
x
S
T
A
= y (y x) +
Lg
A
(x) Lg
A
(y)
cap(A)
_
a(y, z),
provided that cap(A) > 0.
Proof [of Proposition 6.6.3] Let
f(x) = Lg
A
(x)/cap(A). Applying (6.47) to x A gives
yA
a(x, y)
f(y) = 1, x A.
Suppose f satises the conditions in the statement of the proposition, and let h = a
A
f a
A
f which
is nonnegative in A. Then, using Lemma 6.6.4,
xA
[
f(x) f(x)] =
xA
_
_
yA
a
1
A
(x, y) h(y)
_
_
xA
yA
P
x
S
A
= y h(y)
xA
h(x)
=
yA
h(y)
xA
P
y
S
A
= x
_
_
yA
h(y)
_
_
= 0.
Proposition 6.6.5 If p T
2
,
cap((
n
) = C
2
log n +
2
+O(n
1
),
Proof Recall the asymptotic expansion for g
Cn
. By denition of capacity we have,
g
Cn
(x) = C
2
log (x) +
2
cap((
n
) +o(1), x .
But for x , (
n
,
g
Cn
(x) = a(x) E
x
[a(S
T
Cn
)] = C
2
log (x) +
2
+O([x[
2
) [C
2
log n +
2
+O(n
1
)].
Lemma 6.6.6 If p T
2
, and A B Z
2
are nite, then
cap(A) = cap(B)
yB
hm
B
(y) g
A
(y).
Proof g
A
g
B
is a bounded function that is harmonic on Z
2
B with boundary value g
A
on B.
Therefore,
cap(B) cap(A) = lim
x
[g
A
(x) g
B
(x)]
= lim
x
E
x
[g
A
(S
T
B
) g
B
(S
T
B
)] =
yB
hm
B
(y) g
A
(y).
Proposition 6.6.5 tells us that the capacity of an ellipse of diameter n is C
2
log n +O(1). The next lemma
shows that this is also true for any connected set of diameter n. In particular, the capacities of the ball of radius
n and a line of radius n are asymptotic as n . This is not true for capacities in d 3.
Lemma 6.6.7 If p T
2
, there exist c
1
, c
2
such that the following holds. If A is a nite subset of
Z
2
with rad(A) < n satisfying
#x A : k 1 [x[ < k 1, k = 1, . . . , n,
then
(a) if x (
2n
,
P
x
T
A
<
4n
c
1
,
(b) [cap(A) C
2
log n[ c
2
,
(c) if x (
2n
, m 4n, and A
n
= A (
n
, then
c
1
P
x
T
An
>
m
log(m/n) c
2
. (6.49)
Proof (a) Let be such that B
n
(
n
, and let B denote a subset of A contained in B
n
such that
#x B : k 1 [x[ < k = 1
for each positive integer k < n. We will prove the estimate for B which will clearly imply the
estimate for A. Let V = V
n,B
denote the number of visits to B before leaving (
4n
,
V =
4n
1
j=0
1S
j
B =
j=0
zB
1S
j
= z; j <
4n
.
The strong Markov property implies that if x (
2n
,
E
x
[V ] = P
x
T
B
<
4n
E
x
[V [ T
B
<
4n
] P
x
T
B
<
4n
max
zB
E
z
[V ].
Hence, we need only nd a c
1
such that E
x
[V ] c
1
E
z
[V ] for all x (
2n
, z B. Note that
#(B) = n + O(1). By Exercise 6.13, we can see that G
C
4n
(x, z) c for x, z (
2n
. Therefore
E
x
[V ] c n. If z B, there are at most 2k points w in B z satisfying [z w[ k + 1,
k = 1, . . . , n. Using Proposition 6.3.5, we see that
G
C
4n
(z, w) C
2
[log n log [z w[ +O(1)].
Therefore,
E
z
[V ] =
wB
G
C
4n
(z, w)
n
k=1
2 C
2
[log n log k +O(1)] c n.
The last inequality uses the estimate
n
k=1
log k = O(log n) +
_
n
1
log xdx = n log n n +O(log n).
(b) There exists a such that B
n
(
n/
for all n and hence
cap(A) cap(B
n
) cap((
n/
) C
2
log n +O(1).
Hence, we only need to give a lower bound on cap(A). By the previous lemma it suces to nd a
uniform upper bound for g
A
on (
4n
. For m > 4n, let
r
m
= r
m,n,A
= max
yC
2n
P
y
m
< T
A
,
r
m
= r
m,n,A
= max
yC
4n
P
y
m
< T
A
.
Using part (a) and the strong Markov property, we see that there is a < 1 such that r
m
r
m
.
Also, if y (
4n
P
y
m
< T
A
= P
y
m
< T
C
2n
+P
y
m
> T
C
2n
P
y
m
< T
A
[
m
> T
C
2n
P
y
m
< T
C
2n
+ r
m
.
Proposition 6.4.1 tells us that there is a c
3
such that for y (
4n
,
P
y
m
< T
C
2n
c
3
log mlog n +O(1)
.
Therefore,
g
A
(y) = lim
m
C
2
(log m) P
y
m
< T
A
C
2
c
3
1
.
(c) The lower bound for (6.49) follows from Proposition 6.4.1 and the observation
P
x
T
An
>
m
P
x
T
Cn
>
m
.
For the upper bound let
u = u
n
= max
xC
2n
P
x
T
An
>
m
.
Consider a random walk starting at y (
2n
and consider T
Cn
m
. Clearly,
P
y
T
An
>
m
= P
y
m
< T
Cn
+P
y
m
> T
Cn
;
m
< T
An
.
By Proposition 6.4.1, for all y (
2n
P
y
m
< T
Cn
c
log(m/n)
.
Let =
n
= minj T
Cn
: S
j
(
2n
. Then, by the Markov property,
P
y
m
> T
Cn
,
m
T
An
uP
y
S[0, ] A
n
= .
Part (a) shows that there is a < 1 such that P
y
S[0, ] A
n
= and hence, we get
P
y
T
An
>
m
c
log(m/n)
+ u.
Since this holds, for all y (
2n
, this implies
u
c
log(m/n)
+ u,
which gives us the upper bound.
A major example of a set satisfying the condition of the theorem is a connected (with respect to simple
random walk) subset of Z
2
with radius between n1 and n. In the case of simple random walk, there is another
proof of part (a) based on the observation that the simple random walk starting anywhere on (
2n
makes a closed
loop about the origin contained in (
n
with a probability uniformly bounded away from 0. One can justify this
rigorously by using an approximation by Brownian motion. If the random walk makes a closed loop, then it must
intersect any connected set. Unfortunately, it is not easy to modify this argument for random walks that take
non-nearest neighbor steps.
6.7 Neumann problem
We will consider the following Neumann problem. Suppose p ∈ P_d and A ⊂ Z^d with nonempty boundary ∂A. If f : Ā → R is a function, we define its normal derivative at y ∈ ∂A by

  Df(y) = Σ_{x∈A} p(y, x) [f(x) − f(y)].

Given D∗ : ∂A → R, the Neumann problem is to find a function f : Ā → R satisfying

  Lf(x) = 0,   x ∈ A,   (6.50)
  Df(y) = D∗(y),   y ∈ ∂A.   (6.51)

The term normal derivative is motivated by the case of simple random walk and a point y ∈ ∂A such that there is a unique x ∈ A with |y − x| = 1. Then Df(y) = [f(x) − f(y)]/2d, which is a discrete analogue of the normal derivative.

A solution to (6.50)-(6.51) will not always exist. The next lemma, which is a form of Green's theorem, shows that if A is finite, a necessary condition for existence is

  Σ_{y∈∂A} D∗(y) = 0.   (6.52)

Lemma 6.7.1 Suppose p ∈ P_d, A is a finite subset of Z^d and f : Ā → R is a function. Then

  Σ_{x∈A} Lf(x) = −Σ_{y∈∂A} Df(y).

Proof

  Σ_{x∈A} Lf(x) = Σ_{x∈A} Σ_{y∈Ā} p(x, y) [f(y) − f(x)]
    = Σ_{x∈A} Σ_{y∈A} p(x, y) [f(y) − f(x)] + Σ_{x∈A} Σ_{y∈∂A} p(x, y) [f(y) − f(x)].

However,

  Σ_{x∈A} Σ_{y∈A} p(x, y) [f(y) − f(x)] = 0,

since p(x, y) [f(y) − f(x)] + p(y, x) [f(x) − f(y)] = 0 for all x, y ∈ A. Therefore,

  Σ_{x∈A} Lf(x) = Σ_{y∈∂A} Σ_{x∈A} p(x, y) [f(y) − f(x)] = −Σ_{y∈∂A} Df(y).
Given A, the excursion Poisson kernel is the function
H
A
: AA [0, 1],
dened by
H
A
(y, z) = P
y
S
1
A, S
A
= z =
xA
p(y, x) H
A
(x, z),
where H
A
: A A [0, 1] is the Poisson kernel. If z A and H(x) = H
A
(x, z), then
DH(y) = H
A
(y, z), y A z,
DH(z) = H
A
(z, z) P
z
S
1
A.
More generally, if f : A R is harmonic in A, then f(y) =
zA
f(z)H
A
(y, z) so that
Df(y) =
zA
H
A
(y, z) [f(z) f(y)]. (6.53)
Note that if y A then
zA
H
A
(y, z) = P
y
S
1
A 1.
It is sometimes useful to consider the Markov transition probabilities
H
A
where
H
A
(y, z) =
H
A
(y, z) for y ,= z, and
H
A
(y, y) is chosen so that
zA
H
A
(y, z) = 1.
Note that again (compare with (6.53))
Df(y) =
zA
H
A
(y, z) [f(z) f(y)],
which we can write in matrix form
Df = [
H
A
I] f.
If A is nite, then the #(A)#(A) matrix
H
A
I is sometimes called the Dirichlet-to-Neumann
map because it takes the boundary values f (Dirichlet conditions) of a harmonic function to the
derivatives Df (Neumann conditions). The matrix is not invertible since constant functions f are
mapped to zero derivatives. We also know that the image of the map is contained in the subspace of
functions D
satisfying (6.52). The next proposition shows that the rank of the matrix is #(A)1.
It will be useful to define "random walk reflected off ∂A." There are several natural ways to do this. We define this to be the Markov chain with state space Ā and transition probabilities q, where q(x, y) = p(x, y) if x ∈ A or y ∈ A; q(x, y) = 0 if x, y ∈ ∂A are distinct; and q(y, y) is defined for y ∈ ∂A so that Σ_{z∈Ā} q(y, z) = 1. In words, this chain moves like the random walk with transition probability p while in A, and whenever its current position y is in ∂A, the only moves allowed are those into A ∪ {y}. While the original walk could step out of A ∪ {y} with some probability p̃(y), the modified walk stays at y with probability p(y, y) + p̃(y).

Proposition 6.7.2 Suppose p ∈ P_d, A is a finite, connected subset of Z^d, and D∗ : ∂A → R is a function satisfying (6.52). Then there is a function f : Ā → R satisfying (6.50) and (6.51). The function f is unique up to an additive constant. One such function is given by

  f(x) = −lim_{n→∞} E^x[ Σ_{j=0}^{n} D∗(Y_j) 1{Y_j ∈ ∂A} ],   (6.54)

where Y_j is a Markov chain with transition probabilities q as defined in the previous paragraph.
Proof It suces to show that f as dened in (6.54) is well dened and satises (6.50) and (6.51).
Indeed, if this is true then f + c also satises it. Since the image of the matrix
H
A
I contains
the set of functions satisfying (6.52) and this is a subspace of dimension #(A) 1, we get the
uniqueness.
Note that q is an irreducible, symmetric Markov chain and hence has the uniform measure as
the invariant measure (y) = 1/m where m = #(A). Because the chain also has points with
q(y, y) > 0, it is aperiodic. Also,
E
x
_
_
n
j=0
D
(Y
j
) 1Y
j
A
_
_
=
n
j=0
zA
q
j
(x, z) D
(z) =
n
j=0
zA
_
q
j
(x, z)
1
m
_
D
(z).
By standard results about Markov chains (see Section 12.4), we know that
q
j
(x, z)
1
m
c e
j
,
for some positive constants c, . Hence the sum is convergent. It is then straightforward to check
that it satises (6.50) and (6.51).
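Formula (6.54) can be turned into a small numerical routine: build the reflected chain q and sum the expectations until the terms are negligible. The sketch below is not from the text; it assumes simple random walk on an interval in Z with an arbitrary choice of D∗ satisfying (6.52), and the sign follows the normal-derivative convention used at the start of this section. Only differences f(x) − f(y) are meaningful, since the solution is unique only up to an additive constant.

```python
# Sketch: solve the discrete Neumann problem via (6.54) for simple random walk on
# A = {1,...,m} in Z.  The reflected chain q lives on {0,...,m+1}.
import numpy as np

m = 8
states = list(range(m + 2))                 # closure of A; boundary = {0, m+1}
boundary = {0, m + 1}
k = len(states)

Q = np.zeros((k, k))
for i in states:
    for j in states:
        if abs(i - j) == 1 and not (i in boundary and j in boundary):
            Q[i, j] = 0.5
    Q[i, i] = 1.0 - Q[i].sum()              # reflected chain stays put otherwise

Dstar = np.zeros(k)
Dstar[0], Dstar[m + 1] = 1.0, -1.0          # arbitrary data with sum 0, as in (6.52)

s = np.zeros(k)
dist = np.eye(k)                            # row x = distribution of Y_j started at x
for _ in range(5000):                       # terms decay geometrically
    s += dist @ Dstar                       # partial sums of the expectation in (6.54)
    dist = dist @ Q
f = -s                                      # the minus sign matches D f = D* above
f -= f[0]                                   # fix the additive constant

print("f:", np.round(f, 3))                 # should be (close to) linear with slope 2
print("Df(0)   =", (f[1] - f[0]) / 2, " vs D*(0)   =", Dstar[0])
print("Df(m+1) =", (f[m] - f[m + 1]) / 2, " vs D*(m+1) =", Dstar[m + 1])
```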
6.8 Beurling estimate
The Beurling estimate is an important tool for estimating hitting (avoiding) probabilities of sets in two dimensions. The Beurling estimate is a discrete analogue of what is known as the Beurling projection theorem for Brownian motion in R².
Recall that a set A ⊂ Z^d is connected (for simple random walk) if any two points in A can be connected by a nearest neighbor path of points in A.

Theorem 6.8.1 (Beurling estimate) If p ∈ 𝒫_2, there exists a constant c such that if A is an infinite connected subset of Z² containing the origin, then

P{ξ_n < T_A} ≤ c n^{−1/2}.   (6.55)

We prove the result for simple random walk, and then we describe the extension to more general walks.
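Before turning to the proof, here is a crude Monte Carlo sketch (an illustration only, not part of the argument) for the half-infinite line A = {je_1 : j ≥ 0}: the estimated values of P{ξ_n < T_A} are multiplied by n^{1/2}, and the rescaled column should stay roughly constant, as (6.55) predicts.

import numpy as np
rng = np.random.default_rng(0)
steps = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
def escape_prob(n, trials=5000):
    """Crude estimate of P{xi_n < T_A} for SRW started at 0, A = {(j, 0): j >= 0}."""
    hits = 0
    for _ in range(trials):
        z = np.zeros(2, dtype=int)
        while True:
            z = z + steps[rng.integers(4)]
            if z[1] == 0 and z[0] >= 0:                 # returned to the half-line A
                break
            if z[0] * z[0] + z[1] * z[1] >= n * n:      # reached radius n first
                hits += 1
                break
    return hits / trials
for n in (8, 16, 32, 64):
    p = escape_prob(n)
    print(n, round(p, 4), round(p * np.sqrt(n), 3))     # last column roughly constant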
Definition. Let 𝒜_d denote the collection of infinite subsets of Z^d with the property that for each positive integer j,

#{z ∈ A : (j − 1) ≤ |z| < j} = 1.

One important example of a set in 𝒜_d is the half-infinite line

L = {je_1 : j = 0, 1, . . .}.

We state two immediate facts about 𝒜_d.
If A
.
If z ∈ A ∈ 𝒜_d, then for every real r > 0,

#{w ∈ A : |z − w| ≤ r} ≤ #{w ∈ A : |z| − r ≤ |w| ≤ |z| + r} ≤ 2r + 1.   (6.56)

Theorem 6.8.1 for simple random walk is implied by the following stronger result.

Theorem 6.8.2 For simple random walk in Z² there is a c such that if A ∈ 𝒜_2, then

P{ξ_n < T_A} ≤ c n^{−1/2}.   (6.57)
Proof We fix n and let V = V_n = {y_1, . . . , y_n} where y_j denotes the unique point in A with j ≤ |y_j| < j + 1. We let K = K_n = {x_1, . . . , x_n} where x_j = je_1. Let G_n = G_{B_n}, B = B_{n³}, G = G_B, ξ = ξ_{n³}. Let

v(z) = P^z{ξ < T_{V_n}},   q(z) = P^z{ξ < T_{K_n}}.

By (6.49), there exist c_1, c_2 such that for z ∈ ∂B_{2n},

c_1/log n ≤ v(z) ≤ c_2/log n,   c_1/log n ≤ q(z) ≤ c_2/log n.

We will establish

v(0) ≤ c/(n^{1/2} log n)
and then the Markov property will imply that (6.57) holds. Indeed, note that

v(0) ≥ P{ξ_{2n} < T_{V_{2n}}} · P{ξ < T_{V_n} | ξ_{2n} < T_{V_{2n}}}.
By (5.17) and (6.49), we know that there is a c such that for j = 1, . . . , n,

q(x_j) ≤ c n^{−1/2} [ j^{−1/2} + (n − j + 1)^{−1/2} ] [log n]^{−1};   (6.58)

In particular, q(0) ≤ c/(n^{1/2} log n) and hence it suffices to prove that

v(0) − q(0) ≤ c/(n^{1/2} log n).   (6.59)
If |x|, |y| ≤ n, then

G_{n³−2n}(0, y − x) ≤ G_{n³}(x, y) ≤ G_{n³+2n}(0, y − x),

and hence (4.28) and Theorem 4.4.4 imply

G(x, y) = (2/π) log n³ + k̄ − a(x, y) + O(1/n²),   |x|, |y| ≤ n.   (6.60)
Using Proposition 4.6.4, we write

v(0) − q(0) = P{ξ < T_V} − P{ξ < T_K} = P{ξ > T_K} − P{ξ > T_V}
            = Σ_{j=1}^n G(0, x_j) q(x_j) − Σ_{j=1}^n G(0, y_j) v(y_j)
            = Σ_{j=1}^n [G(0, x_j) − G(0, y_j)] q(x_j) + Σ_{j=1}^n G(0, y_j) [q(x_j) − v(y_j)].
Using (6.58) and (6.60), we get

(log n) Σ_{j=1}^n [G(0, x_j) − G(0, y_j)] q(x_j) ≤ O(n^{−1}) + c Σ_{j=1}^n |a(x_j) − a(y_j)| (j^{−1/2} + (n − j + 1)^{−1/2}) n^{−1/2}.

Since |x_j| = j and |y_j| = j + O(1), (4.4.4) implies that

|a(x_j) − a(y_j)| ≤ c/j,

and hence

(log n) Σ_{j=1}^n [G(0, x_j) − G(0, y_j)] q(x_j) ≤ O(n^{−1}) + c Σ_{j=1}^n j^{−3/2} n^{−1/2} ≤ c n^{−1/2}.
For the last estimate we note that

Σ_{j=1}^n (1/j) (n − j + 1)^{−1/2} ≤ Σ_{j=1}^n j^{−3/2}.

In fact, if a, b ∈ R^n are two vectors such that a has non-increasing components (that is, a_1 ≥ a_2 ≥ ⋯ ≥ a_n), then a · b ≤ a · b*, where b* = (b_{σ(1)}, . . . , b_{σ(n)}) and σ is any permutation that makes b_{σ(1)} ≥ b_{σ(2)} ≥ ⋯ ≥ b_{σ(n)}.
Therefore, to establish (6.59), it suffices to show that

Σ_{j=1}^n G(0, y_j) [q(x_j) − v(y_j)] ≤ c/(n^{1/2} log n).   (6.61)

Note that we are not taking absolute values on the left-hand side. Consider the function

F(z) = Σ_{j=1}^n G(z, y_j) [q(x_j) − v(y_j)],

and note that F is harmonic on B \ V. Since F ≡ 0 on ∂B, either F ≤ 0 everywhere (in which case (6.61) is trivial) or it takes its maximum on V. Therefore, it suffices to find a c such that for all k = 1, . . . , n,

Σ_{j=1}^n G(y_k, y_j) [q(x_j) − v(y_j)] ≤ c/(n^{1/2} log n).
By using Proposition 4.6.4 once again, we get

Σ_{j=1}^n G(y_k, y_j) v(y_j) = P^{y_k}{T_V < ξ} = 1 = P^{x_k}{T_K < ξ} = Σ_{j=1}^n G(x_k, x_j) q(x_j).

Plugging in, we get

Σ_{j=1}^n G(y_k, y_j) [q(x_j) − v(y_j)] = Σ_{j=1}^n [G(y_k, y_j) − G(x_k, x_j)] q(x_j).
We will now bound the right-hand side. Note that |x_k − x_j| = |k − j| and |y_k − y_j| ≥ |k − j| − 1. Hence, using (6.60),

G(y_k, y_j) − G(x_k, x_j) ≤ c/(|k − j| + 1),

and therefore for each k = 1, . . . , n,

Σ_{j=1}^n [G(y_k, y_j) − G(x_k, x_j)] q(x_j) ≤ c Σ_{j=1}^n n^{−1/2}/((|k − j| + 1) j^{1/2} log n) ≤ c/(n^{1/2} log n).
One can now generalize this result.
Denition. If p T
2
and k is a positive integer, let /
= /
2,k,p
denote the collection of innite
subsets of Z
2
with the property that for each positive integer j,
#z A : (j 1)k (z) < jk 1,
and let / denote the collection of subsets with
#z A : (j 1)k (z) < jk = 1.
If A /
, then
P
n
< T
A
c
n
1/2
.
The proof is done similarly to that of the last theorem. We let K = x
1
, . . . , x
n
where x
j
= jle
1
and l is chosen suciently large so that (le
1
) > k, and set V = y
1
, . . . , y
n
where y
j
A with
j
(le
1
) [y
j
[ < (j + 1)
(le
1
). See Exercise 5.2.
6.9 Eigenvalue of a set
Suppose p ∈ 𝒫_d and A ⊂ Z^d is finite and connected (with respect to p) with #(A) = m. The (first) eigenvalue of A is defined to be the number λ_A = e^{−β_A} such that for each x ∈ A, as n → ∞,

P^x{τ_A > n} ≈ λ_A^n = e^{−β_A n}.

Let P_A denote the m × m matrix [p(x, y)]_{x,y∈A} and, as before, let L_A = P_A − I. Note that (P_A)^n is the matrix [p^A_n(x, y)] where p^A_n(x, y) = P^x{S_n = y; n < τ_A}. We will say that p ∈ 𝒫_d is aperiodic restricted to A if there exists an n such that (P_A)^n has all entries strictly positive; otherwise, we say that p is bipartite restricted to A. In order for p to be aperiodic restricted to A, p must be aperiodic. However, it is possible for p to be aperiodic but for p to be bipartite restricted to A (Exercise 6.16). The next two propositions show that λ_A is the largest eigenvalue of the matrix P_A, or, equivalently, 1 − λ_A is the smallest eigenvalue of the matrix −L_A.
Proposition 6.9.1 If p ∈ 𝒫_d, A ⊂ Z^d is finite and connected, and p restricted to A is aperiodic, then there exist numbers 0 < α = α_A < λ = λ_A < 1 such that if x, y ∈ A,

p^A_n(x, y) = λ^n g_A(x) g_A(y) + O_A(α^n).   (6.62)

Here g_A : A → R is the unique positive function satisfying

P_A g_A(x) = λ_A g_A(x), x ∈ A,   Σ_{x∈A} g_A(x)² = 1.

In particular,

P^x{τ_A > n} = ḡ_A(x) λ^n + O_A(α^n),

where

ḡ_A(x) = g_A(x) Σ_{y∈A} g_A(y).

We write O_A to indicate that the implicit constant in the error term depends on A.

Proof This is a general fact about irreducible Markov chains, see Proposition 12.4.3. In the notation of that proposition v = w = g. Note that

P^x{τ_A > n} = Σ_{y∈A} p^A_n(x, y).
Proposition 6.9.2 If p ∈ 𝒫_d, A ⊂ Z^d is finite and connected, and p is bipartite restricted to A, then there exist numbers 0 < α = α_A < λ = λ_A < 1 such that if x, y ∈ A, then for all n sufficiently large,

p^A_n(x, y) + p^A_{n+1}(x, y) = 2 λ^n g_A(x) g_A(y) + O_A(α^n).

Here g_A : A → R is the unique positive function satisfying

Σ_{x∈A} g_A(x)² = 1,   P_A g_A(x) = λ g_A(x), x ∈ A.

Proof This can be proved similarly using Markov chains. We omit the proof.
Proposition 6.9.3 Suppose p ∈ 𝒫_d, ε ∈ (0, 1), and p_ε = ε δ_0 + (1 − ε) p is the corresponding lazy walker. Suppose A is a finite, connected subset of Z^d and let λ, λ_ε, g, g_ε denote the corresponding eigenvalues and eigenfunctions for p and p_ε, respectively. Then 1 − λ_ε = (1 − ε)(1 − λ) and g_ε = g.

Proof Let P_A, P^ε_A denote the corresponding matrices. Then P^ε_A = (1 − ε) P_A + ε I and hence

P^ε_A g_A = [(1 − ε) λ + ε] g_A.
A standard problem is to estimate λ_A or β_A as A gets large and λ_A → 1, β_A → 0. In these cases it usually suffices to consider the eigenvalue of the lazy walker with ε = 1/2. Indeed, let β̃_A be the corresponding exponent for the lazy walker. Since

β_A = 1 − λ_A + O((1 − λ_A)²),   λ_A → 1,

we get

β̃_A = (1/2) β_A + O(β_A²),   β_A → 0.
Proposition 6.9.1 gives no bounds for the . The optimal is the maximum of the absolute
values of the eigenvalues other than . In general, it is hard to estimate , and it is possible for
to be very close to . We will show that in the case of the nice set (
n
there is an upper bound
for independent of n. We x p T
d
with p(x, x) > 0 and let e
m
=
Cm
, g
m
= g
Cm
, and
p
m
n
(x, y) = p
Cm
n
(x, y). For x (
m
we let
m
(x) =
dist(x, (
m
) + 1
m
,
and we set
m
0 on Z
d
(
m
.
Proposition 6.9.4 There exist c
1
, c
2
such that for all m suciently large and all x, y (
m
,
c
1
m
(x)
m
(y) m
d
p
m
m
2
(x, y) c
2
m
(x)
m
(y). (6.63)
Also, there exist c
3
, c
4
such that for every n m
2
, and all x, y (
m
,
c
3
m
(y) m
d
P
x
S
n
= y [
Cm
> n c
4
m
(y) m
d
.
This proposition is an example of a parabolic boundary Harnack principle. At any time larger than rad
2
((
m
),
the position of the random walker, given that it has stayed in (
m
up to the current time, is independent of the
initial state up to a multiplicative constant.
Proof For notational ease, we will restrict to the case where m is even. (If m is odd, essentially the
same proof works except m
2
/4 must be replaced with m
2
/4, etc.) We write =
m
. Note that
p
m
m
2
(x, y) =
z,w
p
m
m
2
/4
(x, z) p
m
m
2
/2
(z, w) p
m
m
2
/4
(w, y). (6.64)
The local central limit theorem implies that there is a c such that for all z, w, p
m
m
2
/2
(z, w)
p
m
2
/2
(z, w) c m
d
. Therefore,
p
m
m
2
(x, y) m
d
P
x
Cm
> m
2
/4 P
y
Cm
> m
2
/4.
160 Potential Theory
Gamblers ruin (see Proposition 5.1.6) implies that P
x
Cm
> m
2
/4 c
m
(x). This gives the
upper bound for (6.63).
For the lower bound, we rst note that there is an > 0 such that
p
m
m
2
(z, w) m
d
, [z[, [w[ m.
Indeed, the Markov property implies that
p
m
m
2
(z, w) p
m
2
(z, w) maxp
k
( z, w) : k m
2
, z Z
d
(
m
, (6.65)
and the local central limit theorem establishes the estimate. Using this estimate and the invariance
principle, one can see that for every > 0, there is a c such that for z, w (
(1)m
,
p
m
2
/2
(z, w) c m
d
.
Indeed, in order to estimate p
m
2
/2
(z, w), we split the path into three pieces: the rst m
2
/8 steps,
the middle m
2
/4 steps; and the nal m
2
/8 steps (here we are assuming m
2
/8 is an integer for
notational ease). We estimate both the probability that the walk starting at z has not left (
m
and is in the ball of radius m at time m
2
/8 and corresponding probability for the walk in reverse
time starting at w using the invariance principle. There is a positive probability for this, where the
probability depends on . For the middle piece we use (6.65), and then we connect the paths to
obtain the lower bound on the probability.
Using (6.64), we can then see that it suces to nd > 0 and c > 0 such that
zC
(1)m
p
m
m
2
/4
(x, z) c (x). (6.66)
Let T =
Cm
C
m/2
as in Lemma 6.3.4 and let T
m
= T (m
2
/4). Using that lemma and Theorem
5.1.7, we can see that
P
x
S
Tm
(
m
c
1
(x).
Propositions 6.4.1 and 6.4.2 can be used to see that
P
x
S
T
(
m/2
c
2
(x).
We can write
P
x
S
T
(
m/2
=
z
P
x
S
Tm
= z P
x
S
T
(
m/2
[ S
Tm
= z.
The conditional expectation can be estimated again by Lemma 6.3.4; in particular, we can nd an
such that
P
z
S
T
(
m/2
c
2
2c
1
, z , (
(1)m
.
This implies,
zC
(1)m
P
x
S
Tm
= z
zC
(1)m
P
x
S
Tm
= z P
x
S
T
(
m/2
[ S
Tm
= z
c
2
2
(x).
A nal appeal to the central limit theorem shows that if 1/4,
zC
(1)m
p
m
m
2
/4
(x, z) c
zC
(1)m
P
x
S
Tm
= z.
6.9 Eigenvalue of a set 161
The last assertion follows for n = m
2
by noting that
P
x
S
m
2 = y [
Cm
> m
2
=
p
m
m
2
(x, y)
z
p
m
m
2
(x, z)
and
z
p
m
m
2
(x, z)
m
(x) m
d
m
(z)
m
(x).
For n > m
2
, we can argue similarly by conditioning on the walk at time n m
2
.
Corollary 6.9.5 There exists c
1
, c
2
such that
c
1
m
2
m
c
2
.
Proof See exercise 6.10.
Corollary 6.9.6 There exists c
1
, c
2
such that for all m and all x (
m/2
,
c
1
e
mn
P
x
m
> n c
2
e
mn
.
Proof Using the previous corollary, it suces to prove the estimates for n = km
2
, k 1, 2, . . .. Let
k
(x) =
k
(x, m) = P
x
m
> km
2
and let
k
= max
xCm
k
(x). Using the previous proposition,
we see there is a c
1
such that
k
k
(x) c
1
k
, x (
m/2
.
Due to the same estimates,
P
x
S
m
(
m/2
[
m
> km
2
c
2
.
Therefore, there is a c
3
such that
c
3
j
k
j+k
j
k
,
which implies (see Corollary 12.7.2)
e
mm
2
k
k
c
1
3
e
mm
2
k
,
and hence for x (
m/2
,
c
1
e
mm
2
k
k
(x) c
1
3
e
mm
2
k
.
Exercises
Exercise 6.1 Show that Proposition 6.1.2 holds for p T
.
Exercise 6.2
162 Potential Theory
(i) Show that if p T
d
and x (
n
,
E
x
[
n
] =
yCn
G
Cn
(x, y) = n
2
(x) +O(n).
(Hint: see Exercise 1.5.)
(ii) Show that if p T
d
and x (
n
,
E
x
[
n
] =
yCn
G
Cn
(x, y) = n
2
(x) +o(n
2
).
Exercise 6.3 In this exercise we construct a transient subset A of Z
3
with
yA
G(0, y) = . (6.67)
Here G denotes the Greens function for simple random walk. Our set will be of the form
A =
_
k=1
A
k
, A
k
= z Z
3
: [z 2
k
e
1
[
k
2
k
.
for some
k
0.
(i) Show that (6.67) holds if and only if
k=1
3
k
2
2k
= .
(ii) Show that A is transient if and only if
k=1
k
< .
(iii) Find a transient A satisfying (6.67).
Exercise 6.4 Show that there is a c < such that the following holds. Suppose S
n
is simple
random walk in Z
2
and let V = V
n,N
be the event that the path S[0,
N
] does not disconnect the
origin from B
n
. Then if x B
2n
,
P
x
(V )
c
log(N/n)
.
(Hint: There is a > 0 such that the probability that a walk starting at B
n/2
disconnects the
origin before reaching B
n
is at least , see Exercise 3.4.)
Exercise 6.5 Suppose p T
d
, d 3. Show that there exists a sequence K
n
such that if
A Z
d
is a nite set with at least n points, then cap(A) K
n
.
Exercise 6.6 Suppose p T
d
and r < 1. Show there exists c = c
r
< such that the following
holds.
(i) If [e[ = 1, and x (
rn
,
yCn
[G
Cn
(x +e, y) G
Cn
(x, y)[ c n,
(ii) Suppose f, g, F are as in Corollary 6.2.4 with A = (
n
. Then if x (
rn
,
[
j
f(x)[
c
n
_
|F|
+n
2
|g|
.
6.9 Eigenvalue of a set 163
Exercise 6.7 Show that if p T
2
and r > 0,
lim
n
[G
C
n+r
(0, 0) G
Cn
(0, 0)] = 0.
Use this and (6.16) to conclude that for all x, y,
lim
n
[G
Cn
(0, 0) G
Cn
(x, y)] = a(x, y).
Exercise 6.8 Suppose p T
d
and A Z
d
is nite. Dene
Q
A
(f, g) =
x,yA
p(x, y) [f(y) f(x)] [g(y) g(x)].
and Q
A
(f) = Q
A
(f, f). Let F : A R be given. Show that the inmum of Q
A
(f) restricted to
functions f : A R with f F on A is obtained by the unique harmonic function with boundary
value F.
Exercise 6.9 Write the two-dimensional integer lattice in complex form, Z
2
= Z + iZ and let A
be the upper half plane A = j +ik Z
2
: k > 0. Show that for simple random walk
G
A
(x, y) = a(x, y) a(x, y), x, y A,
H
A
(x, j) =
1
4
[a(x, j i) a(x, j +i)] +(x j), x A, j Z.
where j +ik = j ik denotes complex conjugate. Find
lim
k
k H
A
(ik, j).
Exercise 6.10 Prove Corollary 6.9.5.
Exercise 6.11 Provide the details of the Harnack inequality argument in Lemma 6.5.8 and Theorem
6.5.10.
Exercise 6.12 Suppose p T
d
.
(i) Show that there is a c < such that if x A (
n
and z (
2n
,
P
x
S
2n
= z [
2n
< T
A
c n
1d
P
x
n
< T
A
P
x
2n
< T
A
.
(ii) Let A be the line je
1
: j Z. Show that there is an > 0 such that for all n suciently
large,
Pdist(S
n
, A) n [
n
< T
A
.
(Hint: you can use the gamblers ruin estimate to estimate P
x
n/2
< T
A
/P
x
n
< T
A
.)
Exercise 6.13 Show that for each p T
2
and each r (0, 1), there is a c such that for all n
suciently large,
G
Cn
(x, y) c, x, y (
rn
,
164 Potential Theory
G
Bn
(x, y) c, x, y B
rn
.
Exercise 6.14 Suppose p T
2
and let A = x
1
, x
2
be a two-point set.
(i) Prove that hm
A
(x
1
) = 1/2.
(ii) Show that there is a c < such that if A (
n
, then for y Z
2
(
2n
,
P
y
S
T
A
= x
1
1
2
c
log n
.
(Hint: Suppose P
y
S
T
A
= x
j
1/2 and let V be the set of z such that P
z
S
T
A
= x
j
1/2.
Let = minj : S
j
V . Then it suces to prove that P
y
T
A
< c/ log n.)
(iii) Show that there is a c < such that if A = Z
2
x with x ,= 0, then
G
A
(0, 0)
4
log [x[
c.
Exercise 6.15 Suppose p T
2
. Show that there exist c
1
, c
2
> 0 such that the following holds.
(i) If n is suciently large, A is a set as in Lemma 6.6.7, and A
n
= A [z[ n/2, then for
x B
n/2
,
P
x
T
A
<
n
c.
(ii) If x B
n/2
,
G
Z
2
\A
(x, 0) c.
(iii) If A
is a set with B
n/2
A
Z
2
A
n
,
G
A
(0, 0)
2
log n
c.
Exercise 6.16 Give an example of an aperiodic p T
d
and a nite connected (with respect to p)
set A for which p is bipartite restricted to A.
Exercise 6.17 Suppose S
n
is simple random walk in Z
d
so that
n
=
n
. If [x[ < n, let
u(x, n) = E
x
[[S
n
[ n]
and note that 0 u(x, n) 1.
(i) Show that
n
2
[x[
2
+ 2nu(x, n) E
x
[
n
] n
2
[x[
2
+ (2n + 1) u(x, n).
(ii) Show that if d = 2,
2
G
Bn
(0, x) = log n log [x[ +
u(x, n)
n
+O([x[
2
).
(iii) Show that if d 3,
C
1
d
G
Bn
(0, x) =
1
[x[
d2
1
n
d2
+
(d 2) u(x, n)
n
d1
+O([x[
d
).
6.9 Eigenvalue of a set 165
Exercise 6.18 Suppose S
n
is simple random walk in Z
d
with d 3. For this exercise assume that
we know that
G(x)
C
d
[x[
d2
, [x[
for some constant C
d
but no further information on the asymptotics. The purpose of this exercise
is to nd C
d
. Let V
d
be the volume of the unit ball in R
d
and
d
= d V
d
the surface area of the
boundary of the unit ball.
(i) Show that as n ,
xBn
G(0, x)
C
d
d
n
2
2
=
C
d
d V
d
n
2
2
.
(ii) Show that as n ,
xBn
[G(0, x) G
Bn
(0, x)] C
d
V
d
n
2
.
(iii) Show that as n ,
xBn
G
Bn
(0, x) n
2
.
(iv) Conclude that
C
d
V
d
_
d
2
1
_
= 1.
7
Dyadic coupling
7.1 Introduction
In this chapter we will study the dyadic or KMT coupling which is a coupling of Brownian motion
and random walk for which the paths are signicantly closer to each other than in the Skorokhod
embedding. Recall that if (S
n
, B
n
) are coupled by the Skorokhod embedding, then typically one
expects [S
n
B
n
[ to be of order n
1/4
. In the dyadic coupling, [S
n
B
n
[ will be of order log n. We
mainly restrict our consideration to one dimension, although we discuss some higher dimensional
versions in Section 7.6.
Suppose p T
1
and
S
n
= X
1
+ +X
n
is a p-walk. Suppose that there exists b > 0 such that
E[X
2
1
] =
2
, E[e
b|X
1
|
] < . (7.1)
Then by Theorem 2.3.11, there exist N, c, such that if we dene (n, x) by
p
n
(x) := PS
n
= x =
1
2
2
n
e
x
2
2
2
n
exp(n, x),
then for all n N and [x[ n,
[(n, x)[ c
_
1
n
+
[x[
3
n
2
_
. (7.2)
Theorem 7.1.1 Suppose p T
d
satises (7.1) and (7.2). Then one can dene on the same
probability space (, T, P), a Brownian motion B
t
with variance parameter
2
and a random walk
with increment distribution p such that the following holds. For each < , there is a c
such
that
P
_
max
1jn
[S
j
B
j
[ c
log n
_
c
. (7.3)
Remark. From the theorem it is easy to conclude the corresponding result for bipartite or
continuous-time walks with p T
1
. In particular, the result holds for discrete-time and continuous-
time simple random walk.
We will describe the dyadic coupling formally in Section 7.4, but we will give a basic idea here. Suppose
that n = 2
m
. One starts by dening S
2
m as closely to B
2
m as possible. Using the local central limit theorem, we
can do this in a way so that with very high probability [S
2
m B
2
m[ is of order 1. We then dene S
2
m1 using
the values of B
2
m, B
2
m1 , and again get an error of order 1. We keep subdividing intervals using binary splitting,
and every time we construct the value of S at the middle point of a new interval. If at each subdivision we get
an error of order 1, the total error should be at most of order m, the number of subdivisions needed. (Typically
it might be less because of cancellation.)
The assumption E[e
b|X1|
] < for some b > 0 is necessary for (7.3) to hold at j = 1. Suppose p T
1
such
that for each n there is a coupling with
P[S
1
B
1
[ c log n c n
1
.
It is not dicult to show that as n , P[B
1
[ c log n = o(n
1
), and hence
P[S
1
[ 2 c log n P[S
1
B
1
[ c log n +P[B
1
[ c log n 2 c n
1
for n suciently large. If we let x = 2 c log n, this becomes
P[X
1
[ x 2 c e
x/(2 c)
,
for all x suciently large which implies E[e
b|X1|
] < for b < (2 c)
1
.
Some preliminary estimates and denitions are given in Sections 7.2 and 7.3, the coupling is
dened in Section 7.4, and we show that it satises (7.3) in Section 7.5. The proof is essentially
the same for all values of
2
. For ease of notation we will assume that
2
= 1. It also suces to
prove the result for n = 2
m
and we will assume this in Sections 7.4 and 7.5.
For the remainder of this chapter, we x b, , c
0
, N and assume that p is an increment distribution
satisfying
E[e
b|X
1
|
] < , (7.4)
and
p
n
(x) =
1
2n
e
x
2
2n
exp(n, x),
where
[(n, x)[ c
0
_
1
n
+
[x[
3
n
2
_
, n N, [x[ n. (7.5)
7.2 Some estimates
In this section we collect a few lemmas about random walk that will be used in establishing (7.3).
The reader may wish to skip this section at rst reading and come back to the estimates as they
are needed.
168 Dyadic coupling
Lemma 7.2.1 Suppose S
n
is a random walk with increment distribution p satisfying (7.4) and
(7.5). Dene
n
(n, x, y) by
PS
n
= x [ S
2n
= y =
1
n
exp
_
(x (y/2))
2
n
_
exp
(n, x, y).
Then if n N, [x[, [y[ n/2,
[
(n, x, y)[ 9 c
0
_
1
n
+
[x[
3
n
2
+
[y[
3
n
2
_
.
Without the conditioning, S
n
is approximately normal with mean zero and variance n. Conditioned on the
event S
2n
= y, S
n
is approximately normal with mean y/2 and variance n/2. Note that specifying the value at
time 2n reduces the variance of S
n
.
Proof Note that
PS
n
= x [ S
2n
= y =
PS
n
= x, S
2n
S
n
= y x
PS
2n
= y
=
p
n
(x) p
n
(y x)
p
2n
(y)
.
Since [x[, [y[, [x y[ n, we can apply (7.5). Note that
Σ_{j=1}^n (x_1 + ⋯ + x_j)²/2^j ≤ 2 · (√2 + 1)/(√2 − 1) · Σ_{j=1}^n x_j²/2^j.   (7.6)
Proof Due to homogeneity of (7.6) we may assume that
2
j
x
2
j
= 1. Let y
j
= 2
j/2
x
j
, y =
(y
1
, . . . , y
n
). Then
n
i=1
(x
1
+ +x
i
)
2
2
i
=
n
i=1
1j,ki
x
j
x
k
2
i
=
n
j=1
n
k=1
x
j
x
k
n
i=jk
2
i
2
n
j=1
n
k=1
2
(jk)
x
j
x
k
= 2
n
j=1
n
k=1
2
|kj|/2
y
j
y
k
= 2Ay, y) 2|y|
2
= 2,
7.2 Some estimates 169
where A = A
n
is the n n symmetric matrix with entries a(j, k) = 2
|kj|/2
and =
n
denotes
the largest eigenvalue of A. Since is bounded by the maximum of the row sums,
1 + 2
j=1
2
j/2
=
2 + 1
2 1
.
We will use the fact that the left-hand side of (7.6) is bounded by a constant times the term in brackets on
the right-hand side. The exact constant is not important.
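The inequality (7.6) is deterministic and easy to test numerically; the following sketch (an illustration only) checks it on random vectors.

import numpy as np
rng = np.random.default_rng(1)
C = 2.0 * (np.sqrt(2) + 1) / (np.sqrt(2) - 1)          # constant in (7.6)
for _ in range(1000):
    n = rng.integers(1, 30)
    x = rng.normal(size=n) * rng.exponential(size=n)    # arbitrary real vectors
    w = 2.0 ** -np.arange(1, n + 1)                      # weights 2^{-j}
    lhs = np.sum(np.cumsum(x) ** 2 * w)
    rhs = C * np.sum(x ** 2 * w)
    assert lhs <= rhs + 1e-9
print("inequality (7.6) verified on random examples")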
Lemma 7.2.3 Suppose S
n
is a random walk with increment distribution satisfying (7.4) and (7.5).
Then for every there exists a c = c() such that
P
_
_
_
log
2
n<jn
S
2
2
j
2
j
c n
_
_
_
c e
n
. (7.7)
Consider the random variables U_j = S²_{2^j}/2^j and note that E[U_j] = 1. Suppose that U_1, U_2, . . . were independent. If it were also true that there exist t, c such that E[e^{tU_j}] ≤ c for all j, then (7.7) would be a standard large deviation estimate similar to Theorem 12.2.5. To handle the lack of independence, we consider the independent random variables [S_{2^j} − S_{2^{j−1}}]²/2^j and use (7.6). If the increment distribution is bounded then we also get E[e^{tU_j}] ≤ c for some t, see Exercise 2.6. However, if the range is infinite this expectation may be infinite for all t > 0, see Exercise 7.3. To overcome this difficulty, we use a straightforward truncation argument.
Proof We x > 0 and allow constants in this proof to depend on . Using (7.4), we see that there
is a such that
P[S
n
[ n e
n
.
Hence, we can nd c
1
such that
log
2
n<jn
[ P[S
2
j [ c
1
2
j
+P[S
2
j S
2
j1 [ c
1
2
j
] = O(e
n
).
Fix this c
1
, and let j
0
= log
2
n + 1 be the smallest integer greater than log
2
n. Let Y
j
= 0 for
j < j
0
; Y
j
0
= S
2
j
0
; and for j > j
0
, let Y
j
= S
2
j S
2
j1. Then, except for an event of probability
O(e
n
), [Y
j
[ c
1
2
j
for j j
0
and hence
P
_
_
_
n
j=j
0
Y
2
j
2
j
,=
n
j=j
0
Y
2
j
2
j
1[Y
j
[ c
1
2
j
_
_
_
O(e
n
).
Note that
log
2
n<jn
S
2
2
j
2
j
=
n
j=j
0
S
2
2
j
2
j
=
n
j=1
(Y
1
+ +Y
j
)
2
2
j
c
n
j=j
0
Y
2
j
2
j
.
170 Dyadic coupling
The last step uses (7.6). Therefore it suces to prove that
P
_
_
_
n
j=1
Y
2
j
2
j
1[Y
j
[ c
1
2
j
cn
_
_
_
e
n
.
The estimates (7.4) and (7.5) imply that there is a t > 0 such that for each n,
E
_
exp
_
t S
2
n
n
_
; [S
n
[ c
1
n
_
e
(see Exercise 7.2). Therefore,
E
_
_
exp
_
_
_
t
n
j=1
Y
2
j
2
j
1[Y
j
[ c
1
2
j
_
_
_
_
_
e
n
,
which implies
P
_
_
_
n
j=1
Y
2
j
2
j
1[Y
j
[ c
1
2
j
t
1
( + 1) n
_
_
_
e
n
.
7.3 Quantile coupling
In this section we consider the simpler problem of coupling S_n and B_n for a fixed n. The following is a general definition of quantile coupling. We will only use quantile coupling in a particular case where F is supported on Z or on (1/2)Z.

Definition. Suppose F is the distribution function of a discrete random variable supported on the locally finite set

⋯ < a_{−1} < a_0 < a_1 < ⋯ ,

and Z is a random variable with a continuous, strictly increasing distribution function G. Let r_k be defined by G(r_k) = F(a_k), i.e., if F(a_k) > F(a_k−),

G(r_k) − G(r_{k−1}) = F(a_k) − F(a_k−).

Let f be the step function

f(z) = a_k if r_{k−1} < z ≤ r_k,

and let X be the random variable f(Z). We call X the quantile coupling of F with Z, and f the quantile coupling function of F and G.

Note that the event {X = a_k} is the same as the event {r_{k−1} < Z ≤ r_k}. Hence,

P{X = a_k} = P{r_{k−1} < Z ≤ r_k} = G(r_k) − G(r_{k−1}) = F(a_k) − F(a_k−),

and X has distribution function F. Also, if

G(a_k − t) ≤ F(a_{k−1}) < F(a_k) ≤ G(a_k + t),   (7.8)

then it is immediate from the above definitions that {X = a_k} implies |X − Z| = |a_k − Z| ≤ t. Hence, if we wish to prove that |X − Z| ≤ t on the event {X = a_k}, it suffices to establish (7.8).
As an intermediate step in the construction of the dyadic coupling, we study the quantile coupling
of the random walk distribution with normal random variable that has the same mean and variance.
Let denote the standard normal distribution function, and let
n
_
x c
_
1 +
x
2
n
__
F
n
(x 1) F
n
(x)
n
_
x +c
_
1 +
x
2
n
__
, (7.9)
n/2
_
x c
_
1 +
x
2
n
+
y
2
n
__
F
n,y
(x 1) F
n,y
(x)
n/2
_
x +c
_
1 +
x
2
n
+
y
2
n
__
,
Proof It suces to establish the inequalities in the case where x is a non-negative integer. Implicit
constants in this proof are allowed to depend on , b, c
0
and we assume n N. If F is a distribution
function, we write F = 1 F. Since for t > 0,
(t + 1)
2
2n
=
t
2
n
+O
_
1
n
+
t
3
n
2
_
,
(consider t
n and t
n), we can see that (7.4) and (7.5) imply that we can write
p
n
(x) =
_
x+1
x
1
2n
e
t
2
/(2n)
exp
_
O
_
1
n
+
t
3
n
2
__
dt, [x[ n.
172 Dyadic coupling
Hence, using (12.12), for some a and all [x[ n,
F
n
(x) = PS
n
n +Px S
n
< n
= O(e
an
) +
_
n
x
1
2n
e
t
2
/(2n)
exp
_
O
_
1
n
+
t
3
n
2
__
dt.
From this we can conclude that for [x[ n,
F
n
(x) =
n
(x) exp
_
O
_
1
n
+
x
3
n
2
__
, (7.10)
and from this we can conclude (7.9). The second inequality is done similarly by using Lemma 7.2.1
to derive
F
n,y
(x) =
n/2
(x) exp
_
O
_
1
n
+
x
3
n
2
+
y
3
n
2
__
,
for [x[, [y[ n. Details are left as Exercise 7.4.
To derive Propositions 7.3.1 and Proposition 7.3.2 we use only estimates on the distribution functions
F
n
, F
n,y
and not pointwise estimates (local central limit theorem). However, the pointwise estimate (7.5) is used
in the proof of Lemma 7.2.1 which is used in turn to estimate F
n,y
.
7.4 The dyadic coupling
In this section we define the dyadic coupling. Fix n = 2^m and assume that we are given a standard Brownian motion defined on some probability space. We will define the random variables S_1, S_2, . . . , S_{2^m} as functions of the random variables B_1, B_2, . . . , B_{2^m} so that (S_1, . . . , S_{2^m}) has the distribution of a random walk with increment distribution p.

In Chapter 3, we constructed a Brownian motion from a collection of independent normal random variables by a dyadic construction. Here we reverse the process, starting with the Brownian motion, B_t, and obtaining the independent normals. We will only use the random variables B_1, B_2, . . . , B_{2^m}. Define Δ_{k,j} by

Δ_{k,j} = B_{k 2^{m−j}} − B_{(k−1) 2^{m−j}},   j = 0, 1, . . . , m;  k = 1, 2, . . . , 2^j.

For each j, {Δ_{k,j} : k = 1, 2, . . . , 2^j} are independent normal random variables with mean zero and variance 2^{m−j}. Let Z_{1,0} = B_{2^m} and define

Z_{2k+1,j},   j = 1, . . . , m,  k = 0, 1, . . . , 2^{j−1} − 1,

recursively by

Δ_{2k+1,j} = (1/2) Δ_{k+1,j−1} + Z_{2k+1,j},   (7.11)

so that also

Δ_{2k+2,j} = (1/2) Δ_{k+1,j−1} − Z_{2k+1,j}.

One can check (see Corollary 12.3.1) that the random variables {Z_{1,0}} ∪ {Z_{2k+1,j} : j = 1, . . . , m, k = 0, 1, . . . , 2^{j−1} − 1} are independent, mean zero, normal random variables with E[Z²_{1,0}] = 2^m and E[Z²_{2k+1,j}] = 2^{m−j−1} for j ≥ 1. We can rewrite (7.11) as

B_{(2k+1) 2^{m−j}} = (1/2) [ B_{k 2^{m−j+1}} + B_{(k+1) 2^{m−j+1}} ] + Z_{2k+1,j}.   (7.12)
Let f
m
() denote the quantile coupling function for the distribution functions of S
2
m and B
2
m.
If y Z, let f
j
(, y) denote the quantile coupling function for the conditional distribution of
S
2
j
1
2
S
2
j+1
given S
2
j+1 = y and a normal random variable with mean zero and variance 2
j1
. This is well
dened as long as PS
2
j+1 = y > 0. Note that the range of f
j
(, y) is contained in (1/2)Z. This
conditional distribution is symmetric about the origin (see Exercise 7.1), so f
j
(z, y) = f
j
(z, y).
We can now dene the dyadic coupling.
Let S
2
m = f
m
(B
2
m).
Suppose the values of S
l2
mj+1, l = 1, . . . , 2
j1
are known. Let
k,i
= S
k2
mi S
(k1)2
mi .
Then we let
S
(2k1)2
mj =
1
2
[S
(k1)2
mj+1 +S
k2
mj+1] +f
mj
(Z
2k1,j
,
k,j1
),
so that
2k1,j
=
1
2
k,j1
+f
mj
(Z
2k1,j
,
k,j1
),
2k,j
=
1
2
k,j1
f
mj
(Z
2k1,j
,
k,j1
).
It follows immediately from the denition that (S
1
, S
2
, . . . , S
2
m) has the distribution of the ran-
dom walk with increment p. Also Exercise 7.1 shows that
2k1,j
and
2k,j
have the same
conditional distribution given
k,j1
.
It is convenient to rephrase this denition in terms of random variables indexed by dyadic inter-
vals. Let I
k,j
denote the interval
I
k,j
= [(k 1)2
mj
, k2
mj
], j = 0, . . . , m; k = 1, . . . , 2
j
.
We write Z(I) for the normal random variable associated to the midpoint of I,
Z(I
k,j
) = Z
2k1,j+1
Then the Z(I) are independent mean zero normal random variables indexed by the dyadic intervals
with variance [I[/4 where [ [ denotes length. We also write
(I
k,j
) =
k,j
, (I
k,j
) =
k,j
.
Then the denition can be given as follows.
Let (I
1,0
) = B
2
m, (I
1,0
) = f
m
(B
2
m).
174 Dyadic coupling
Suppose I is a dyadic interval of length 2
mj+1
that is the union of consecutive dyadic intervals
I
1
, I
2
of length 2
mj
. Then
(I
1
) =
1
2
(I) +Z(I), (I
2
) =
1
2
(I) Z(I) (7.13)
(I
1
) =
1
2
(I) +f
j
(Z(I), (I)), (I
2
) =
1
2
(I) f
j
(Z(I), (I)). (7.14)
Note that if j 1 and k 1, . . . , 2
j
, then
B
k2
mj =
ik
([(i 1)2
mj
, i2
mj
]), S
k2
mj =
ik
([(i 1)2
mj
, i2
mj
]). (7.15)
We next note a few important properties of the coupling.
If I = I
1
I
2
as above, then (I
1
), (I
2
), (I
1
), (I
2
) are deterministic functions of (I), (I),
Z(I). The conditional distributions of ((I
1
), (I
1
)) and ((I
2
), (I
2
)) given ((I), (I)) are
the same.
By iterating this we get the following. For each interval I
k,j
consider the joint distribution random
variables
((I
l,i
), (I
l,i
)), i = 0, . . . , j,
where l = l(i, k, j) is chosen so that I
k,j
I
l,i
. Then this distribution is the same for all
k = 1, 2, . . . , 2
j
. In particular, if
R
k,j
=
j
i=0
[(I
l,i
) (I
l,i
)[,
then the random variables R
1,j
, . . . , R
2
j
,j
are identically distributed. (They are not independent.)
For k = 1,
(I
1,j
) (I
1,j
) =
1
2
[(I
1,j1
) (I
1,j1
)] + [Z
1,j
f
j
(Z
1,j
, S
2
mj+1 )]
By iterating this, we get
R
1,j
[S
2
m B
2
m[ + 2
j
l=1
[f
ml
(Z
1,l
, S
2
ml+1) Z
1,l
[. (7.16)
Dene (I
1,0
) = [B
2
m S
2
m[ = [(I
1,0
) (I
1,0
)|. Suppose j ≥ 1 and I
k,j
is an interval with
parent interval I
. Dene (I
k,j
) to be the maximum of [B
t
S
t
[ where the maximum is over
three values of t: the left endpoint, midpoint, and right endpoint of I. We claim that
(I
k,j
) (I
) +[(I
k,j
) (I
k,j
)[.
Since the endpoints of I
k,j
are either endpoints or midpoints of I
S
s
[, [B
s
+
S
s
+
[[
_
+[(I
k,j
) (I
k,j
)[,
7.5 Proof of Theorem 7.1.1 175
where t, s
, s
+
denote the midpoint, left endpoint, and right endpoint of I
k,j
, respectively. But
using (7.13), (7.14), and (7.15), we see that
B
t
S
t
=
1
2
_
(B
s
S
s
) + (B
s
+
S
s
+
)
+[(I
k,j
) (I
k,j
)[,
and hence the claim follows from the simple inequality [x + y[ 2 max[x[, [y[. Hence, by
induction, we see that
(I
k,j
) R
k,j
. (7.17)
7.5 Proof of Theorem 7.1.1
Recall that n = 2
m
. It suces to show that for each there is a c
log n c
. (7.18)
Indeed if the above holds, then
P
_
max
1in
[S
i
B
i
[ c
log n
_
i=1
P[S
i
B
i
[ c
log n c
n
+1
.
We claim in fact, that it suces to nd a sequence 0 = i
0
< i
1
< < i
l
= n such that
[i
k
i
k1
[ c
log n and such that (7.18) holds for these indices. Indeed, if we prove this and
[j i
k
[ c
such that
P[S
j
S
i
k
[ c
log n +P[B
j
B
i
k
[ c
log n c
,
and hence the triangle inequality gives (7.18) (with a dierent constant).
For the remainder of this section we x and allow constants to depend on . By the reasoning
of the previous paragraph and (7.17), it suces to nd a c such that for log
2
m+ c j m, and
k = 1, . . . , 2
mj
,
PR
k,j
c
m c
e
m
,
and as pointed out in the previous section, it suces to consider the case k = 1, and show
PR
1,j
c
m c
e
m
, for j = log
2
m+c, . . . , m. (7.19)
Let be the minimum of the two values given in Propositions 7.3.1 and 7.3.2, and recall that
there is a = () such that
P[S
2
j [ 2
j
exp2
j
In particular, we can nd a c
3
such that
log
2
m+c
3
jm
P[S
2
j [ 2
j
O(e
m
).
Proposition 7.3.1 tells us that on the event [S
2
m[ 2
m
,
[S
2
m B
2
m[ c
_
1 +
S
2
2
m
2
m
_
.
176 Dyadic coupling
Similarly, Proposition 7.3.2 tells us that on the event max[S
2
ml [, [S
2
ml+1 [ 2
ml
, we have
[Z
1,l
f
ml
(Z
1,l
, S
2
ml+1)[ c
_
1 +
S
2
2
ml+1
+S
2
2
ml
2
ml
_
.
Hence, by (7.16), we see that on the same event, simultaneously for all j [log
2
m+c
3
, m],
[S
2
mj B
2
mj [ R
1,j
+[S
2
m B
2
m[ c
_
_
m+
log
2
mc
3
im
S
2
2
i
2
i
_
_
.
We now use (7.7) (due to the extra term c
3
in the lower limit of the sum, one may have to apply
(7.7) twice) to conclude (7.19) for j log
2
m+c
3
.
7.6 Higher dimensions
Without trying to extend the result of the previous section to to the general (bounded exponential
moment) walks in higher dimensions, we indicate two immediate consequences.
Theorem 7.6.1 One can dene on the same probability space (, T, P), a Brownian motion B
t
in
R
2
with covariance matrix (1/2) I and a simple random walk in Z
2
. such that the following holds.
For each < , there is a c
such that
P
_
max
1jn
[S
j
B
j
[ c
log n
_
c
.
Proof We use the trick from Exercise 1.7. Let (S
n,1
, B
n,1
), (S
n,2
, B
n,2
) be independent dyadic
couplings of one-dimensional simple random walk and Brownian motion. Let
S
n
=
_
S
n,1
+S
n,2
2
,
S
n,1
S
n,2
2
_
,
B
n
=
_
B
n,1
+B
n,2
2
,
B
n,1
B
n,2
2
_
.
Theorem 7.6.2 If p T
d
, one can dene on the same probability space (, T, P), a Brownian
motion B
t
in R
d
with covariance matrix and a continuous-time random walk
S
t
with increment
distribution p such that the following holds. For each < , there is a c
such that
P
_
max
1jn
[
S
j
B
j
[ c
log n
_
c
. (7.20)
Proof Recall from (1.3) that we can write any such
S
t
as
S
t
=
S
1
q
1
t
x
1
+ +
S
l
q
l
t
x
l
,
7.7 Coupling the exit distributions 177
where q
1
, . . . , q
l
> 0; x
1
, . . . , x
l
Z
d
; and
S
1
, . . . ,
S
l
are independent one-dimensional simple
continuous-time random walks. Choose l independent couplings as in Theorem 7.1.1,
(S
1
t
, B
1
t
), (S
2
t
, B
2
t
), . . . , (S
l
t
, B
l
t
),
where B
1
, . . . , B
l
are standard Brownian motions. Let
B
t
= B
1
q
1
t
x
1
+ +B
l
q
l
t
x
l
.
This satises (7.20).
7.7 Coupling the exit distributions
Proposition 7.7.1 Suppose p T
d
. Then one can dene on the same probability space a (discrete-
time) random walk S
n
with increment distribution p; a continuous-time random walk
S
t
with in-
crement distribution p; and a Brownian motion B
t
with covariance matrix such that for each
n, r > 0,
P
_
[S
n
B
n
[ r log n
_
= P
_
[
n
B
n
[ r log n
_
c
r
,
where
n
= minj : (S
j
) n,
n
= mint : (
S
t
) n,
n
= mint : (B
t
) = n.
We advise caution when using the dyadic coupling to prove results about random walk. If (S
n
, B
t
) are
coupled as in the dyadic coupling, then S
n
and B
t
are Markov processes, but the joint process (S
n
, B
n
) is not
Markov.
Proof It suces to prove the result for
S
t
, B
t
, for then we can dene S
j
to be the discrete-time
skeleton walk obtained by sampling
S
t
at times of its jumps. We may also assume r n; indeed,
since [
n
[ +[B
n
[ O(n), for all n suciently large
P
_
[
n
B
n
[ n log n
_
= 0.
By Theorem 7.6.2 we can dene
S, B on the same probability space such that except for an event
of probability O(n
4
),
[
S
t
B
t
[ c
1
log n, 0 t n
3
.
We claim that P
n
> n
3
decay exponentially in n. Indeed, the central limit theorem shows that
there is a c > 0 such that for n suciently large and [x[ < n , P
x
n
n
2
c. Iterating this gives
P
x
n
> n
3
(1c)
n
. Similarly, P
n
> n
3
decays exponentially. Therefore, except on an event
of probability O(n
4
),
[
S
t
B
t
[ c
1
log n, 0 t max
n
,
n
. (7.21)
Note that the estimate (7.21) is not sucient to directly yield the claim, since it is possible that
one of the two paths (say
S) rst exits (
n
at some point y, then moves far away from y (while
178 Dyadic coupling
staying close to (
n
) and that only then the other path exits (
n
, while all along the two paths stay
close. The rest of the argument shows that such event has small probability. Let
n
(c
1
) = mint : dist(
S
t
, Z
d
(
n
) c
1
log n and
n
(c
1
) = mint : dist(B
t
, Z
d
(
n
) c
1
log n,
and dene
n
:=
n
(c
1
)
n
(c
1
).
Since
n
max
n
,
n
, we conclude as in (7.21) that with an overwhelming (larger than 1O(n
4
))
probability,
[
S
t
B
t
[ c
1
log n, 0 t
n
,
and in particular that
[
S
n
B
n
[ c
1
log n. (7.22)
On the event in (7.22) we have
maxdist(
S
n
, Z
d
(
n
), dist(B
n
, Z
d
(
n
) 2c
1
log n,
by triangle inequality, so in particular
max
n
(2c
1
),
n
(2c
1
)
n
. (7.23)
Using the gamblers ruin estimate (see Exercise 7.5) and strong Markov property for each process
separately (recall, they are not jointly Markov)
P[
S
n(2c
1
)
S
j
[ r log n for all j [
n
(2c
1
),
n
] 1
c
2
r
, (7.24)
and also
P[B
n
(2c
1
)
B
t
[ r log n for all t [
n
(2c
1
),
n
] 1
c
2
r
. (7.25)
Applying the triangle inequality to
n
B
n
= (
S
n
) + (
S
n
B
n
) + (B
n
B
n
),
on the intersection of the four events from (7.22)(7.25), yields
S
n
B
n
(2r +c
1
) log n, and the
complement has probability bounded by O(1/r).
Denition. A nite subset A of Z
d
is simply connected if both A and Z
d
A are connected. If
x Z
d
, let o
x
denote the closed cube in R
d
of side length one, centered at x, with sides parallel
to the coordinate axes. If A Z
d
, let D
A
be the domain dened as the interior of
xA
o
x
. The
inradius of A is dened by
inrad(A) = min[y[ : y Z
d
A.
Proposition 7.7.2 Suppose p T
2
. Then one can dene on the same probability space a (discrete-
time) random walk S
n
with increment distribution p and a Brownian motion B
t
with covariance
7.7 Coupling the exit distributions 179
matrix such that the following holds. If A is a nite, simply connected set containing the origin
and
A
= inft : B
t
, D
A
,
then each if r > 0,
P[S
A
B
A
[ r log[inrad(A)]
c
r
.
Proof Similar to the last proposition except that the gamblers ruin estimate is replaced with the
Beurling estimate.
Exercises
Exercise 7.1 Suppose S
n
= X
1
+ +X
n
where X
1
, X
2
, . . . are independent, identically distributed
random variables. Suppose PS
2n
= 2y > 0 for some y R. Show that the conditional distribution
of
S
n
y
conditioned on S
2n
= 2y is symmetric about the origin.
Exercise 7.2 Suppose S
n
is a random walk in Z whose increment distribution satises (7.4) and
(7.5) and let C < . Show that there exists a t = t(b, , c
0
, C) > 0 such that for all n,
E
_
exp
_
t S
2
n
n
_
; [S
n
[ Cn
_
e.
Exercise 7.3 Suppose
S
t
is continuous-time simple random walk in Z.
(i) Show that there is a c < such that for all positive integers n,
P
S
n
= n
2
c
1
expcn
2
log n.
(Hint: consider the event that the walk makes exactly n
2
moves by time n, each of them in
the positive direction.)
(ii) Show that if t > 0,
E
_
exp
_
t
S
2
n
n
__
= ,
Exercise 7.4 Let be the standard normal distribution function, and let = 1 .
(i) Show that as x ,
(x)
e
x
2
/2
2
_
0
e
xt
dt =
1
x
2
e
x
2
/2
.
(ii) Prove (7.10).
(iii) Show that for all 0 t x,
(x +t) e
tx
e
t
2
/2
(x) e
tx
e
t
2
/2
(x) (x t).
180 Dyadic coupling
(iv) For positive integer n, let
n
(x) = (x/
n
_
x +c
_
1 +
x
2
n
__
exp
_
2b
_
1
n
+
x
3
n
2
__
n
(x) exp
_
b
_
1
n
+
x
3
n
2
__
n
_
x c
_
1 +
x
2
n
__
.
(v) Prove (7.9).
Exercise 7.5 In this exercise we prove the following version of the gamblers ruin estimate. Suppose
p T
d
, d 2. Then there exists c such that the following is true. If R
d
with [[ = 1 and r 0,
PS
j
r, 0 j
n
c(r + 1)
n
. (7.26)
Here
n
is as dened in Section 6.3.
(i) Let
q(x, n, ) = P
x
S
j
> 0, 1 j
n
.
Show that there is a c
1
> 0 such that for all n suciently large and all R
d
with [[ = 1,
the cardinality of the set of x Z
d
with [x[ n/2 and
q(x, n, ) c
1
q(0, 2n, )
is at least c
1
n
d1
.
(ii) Use a last-exit decomposition to conclude
xCn
G
Bn
(0, x) q(x, n, ) 1,
and use this to conclude the result for r = 0.
(iii) Use Lemma 5.1.6 and the invariance principle to show that there is a c
2
> 0 such that for
all [[ = 1,
q(0, n, )
c
2
n
.
(iv) Prove (7.26) for all r 0.
8
Additional topics on Simple Random Walk
In this chapter we only consider simple random walk on Z^d. In particular, S will always denote a simple random walk in Z^d. If d ≥ 3, G denotes the corresponding Green's function, and we simplify the notation by setting

G(z) = −a(z),   d = 1, 2,

where a is the potential kernel. Note that then the equation LG(z) = −δ(z) holds for all d ≥ 1.
8.1 Poisson kernel
Recall that if A ⊊ Z^d, τ_A = min{j ≥ 0 : S_j ∉ A}, and τ̄_A = min{j ≥ 1 : S_j ∉ A}, then the Poisson kernel is defined for x ∈ A, y ∈ ∂A by

H_A(x, y) = P^x{S_{τ_A} = y}.

For simple random walk, we would expect the Poisson kernel to be very close to that of Brownian motion. If D ⊂ R^d is a domain with sufficiently smooth boundary, we let h_D(x, y) denote the Poisson kernel for Brownian motion. This means that, for each x ∈ D, h_D(x, ·) is the density with respect to surface measure on ∂D of the distribution of the point at which the Brownian motion visits ∂D for the first time. For sets A that are rectangles with sides perpendicular to the coordinate axes (with finite or infinite length), explicit expressions can be obtained for the Poisson kernel and one can show convergence to the Brownian quantities with relatively small error terms. We give some of these formulas in this section.
8.1.1 Half space
If d ≥ 2, we define the discrete upper half space H = H_d by

H = {(x, y) ∈ Z^{d−1} × Z : y > 0},

with boundary ∂H = Z^{d−1} × {0} and closure H̄ = H ∪ ∂H. Let T = τ_H, and let H_H denote the Poisson kernel, which for convenience we will write as a function H_H : H × Z^{d−1} → [0, 1],

H_H(z, x) = H_H(z, (x, 0)) = P^z{S_T = (x, 0)}.

If z = (x, y) ∈ Z^{d−1} × Z, we write z̄ for its conjugate, z̄ = (x, −y). If z ∈ H, then z̄ ∉ H̄. If z ∈ ∂H, then z̄ = z. Recall the Green's function for a set defined in Section 4.6.
Proposition 8.1.1 For simple random walk in Z^d, d ≥ 2, if z, w ∈ H,

G_H(z, w) = G(z − w) − G(z̄ − w),

H_H(z, 0) = (1/2d) [ G(z − e_d) − G(z + e_d) ].   (8.1)

Proof To establish the first relation, note that for w ∈ H, the function f(z) = G(z − w) − G(z̄ − w) = G(z − w) − G(z − w̄) is bounded on H̄, Lf(z) = −δ_w(z), and f ≡ 0 on ∂H. Hence f(z) = G_H(z, w) by the characterization of Proposition 6.2.3. For the second relation, we use a last-exit decomposition (focusing on the last visit to e_d before leaving H) to see that

H_H(z, 0) = (1/2d) G_H(z, e_d).
The Poisson kernel for Brownian motion in the upper half space

H = H_d = R^{d−1} × (0, ∞)

is given by

h_H((x, y), 0) = h_H((x + z, y), z) = 2y/(ω_d |(x, y)|^d),

where ω_d = 2π^{d/2}/Γ(d/2) is the surface area of the (d − 1)-dimensional sphere of radius 1 in R^d. The next theorem shows that this is also the asymptotic value for the Poisson kernel for the random walk in H = H_d, and that the error term is small.

Theorem 8.1.2 If d ≥ 2 and z = (x, y) ∈ Z^{d−1} × {1, 2, . . .}, then

H_H(z, 0) = (2y/(ω_d |z|^d)) [ 1 + O(|y|/|z|²) ] + O(1/|z|^{d+1}).   (8.2)
. (8.2)
Proof We use (8.1). If we did not need to worry about the error terms, we would naively estimate
1
2d
[G(z e
d
) G(z +e
d
)]
by
C
d
2d
_
[z e
d
[
2d
[z +e
d
[
2d
_
, d 3, (8.3)
C
2
4
log
[z e
d
[
[z +e
d
[
, d = 2. (8.4)
Using Taylor series expansion, one can check that the quantities in (8.3) and (8.4) equal
2y
d
[z[
d
+O
_
[y[
2
[z[
d+2
_
.
However, the error term in the expansion of the Greens function or potential kernel is O([z[
d
), so
we need to do more work to show that the error term in (8.2) is of order O([y[
2
/[z[
d+2
)+O([z[
(d+1)
).
8.1 Poisson kernel 183
Assume without loss of generality that [z[ > 1. We need to estimate
G(z e
d
) G(z +e
d
) =
n=1
[p
n
(z e
d
) p
n
(z +e
d
)] =
n=1
[p
n
(z e
d
) p
n
(z +e
d
)]
n=1
[p
n
(z e
d
) p
n
(z e
d
) p
n
(z +e
d
) +p
n
(z +e
d
)] .
Note that z e
d
and z +e
d
have the same parity so the above series converge absolutely even if
d = 2. We will now show that
1
2d
n=1
[p
n
(z e
d
) p
n
(z +e
d
)] =
2y
d
[z[
d
+O
_
y
2
[z[
d+2
_
. (8.5)
Indeed,
p
n
(z e
d
) p
n
(z +e
d
) =
d
d/2
(2)
d/2
1
n
d/2
e
|x|
2
+(y1)
2
2n/d
_
1 e
4y
2n/d
_
.
For n y, we can use a Taylor approximation for 1 e
4y
2n/d
. The terms with n < y, do not
contribute much. More specically, the left-hand side of (8.5) equals
2d
d/2+1
(2)
d/2
n=1
y
n
1+d/2
e
|x|
2
+(y1)
2
2n/d
+O
_
n=1
y
2
n
2+d/2
e
|x|
2
+(y1)
2
2n/d
_
+O(y
2
e
|z|
).
Lemma 4.3.2 then gives
n=1
[p
n
(z e
d
) p
n
(z +e
d
)] =
2 d (d/2)
d/2
y
([x[
2
+ (y 1)
2
)
d/2
+O
_
y
2
[z[
d+2
_
=
2 d (d/2)
d/2
y
[z[
d
+O
_
y
2
[z[
d+2
_
=
4d
d
y
[z[
d
+O
_
y
2
[z[
d+2
_
.
The remaining work is to show that
n=1
[p
n
(z e
d
) p
n
(z e
d
) p
n
(z +e
d
) +p
n
(z +e
d
)] = O([z[
(d+1)
).
We mimic the argument used for (4.11), some details are left to the reader.
Again the sum over n < [z[ is negligible. Due to the second (stronger) estimate in Theorem 2.3.6,
the sum over n > [z[
2
is bounded by
n>|z|
2
c
n
(d+3)/2
= O
_
1
[z[
d+1
_
.
For n [[z[, [z[
2
], apply Theorem 2.3.8 with k = d + 5 (for the case of symmetric increment
184 Addtional topics on Simple Random Walk
distribution) to give
p
n
(w) = p
n
(w) +
d+5
j=3
u
j
(w/
n)
n
(d+j2)/2
+O
_
1
n
(d+k1)/2
_
,
where w = z e
d
. As remarked after Theorem 2.3.8, we then can estimate
[p
n
(z e
d
) p
n
(z e
d
) p
n
(z e
d
) +p
n
(z e
d
)[
up to an error of O(n
(d+k1)/2
) by
I
3,d+5
(n, z) :=
d+5
j=3
1
n
(d+j2)/2
u
j
_
z +e
d
n
_
u
j
_
z e
d
n
_
.
Finally, due to Taylor expansion and the uniform estimate (2.29), one can obtain a bound on the
sum
n[|z|,|z|
2
]
I
3,d+5
(n, z) by imitating the nal estimate in the proof of Theorem 4.3.1. We leave
this to the reader.
In Section 8.1.3 we give an exact expression for the Poisson kernel in H
2
in terms of an integral. To
motivate it, consider a random walk in Z
2
starting at e
2
stopped when it rst reaches xe
1
: x Z.
Then the distribution of the rst coordinate of the stopping position gives a probability distribution
on Z. In Corollary 8.1.7, we show that the characteristic function of this distribution is
() = 2 cos
_
(2 cos )
2
1.
Using this and Proposition 2.2.2, we see that the probability that the rst visit is to xe
1
is
1
2
_
e
ix
() d =
1
2
_
cos(x) () d.
If instead the walk starts from ye
2
, then the position of its rst visit to the origin can be considered
as the sum of y independent random variables each with characteristic function . The sum has
characteristic function
y
, and hence
H
H
(ye
2
, xe
1
) =
1
2
_
cos(x) ()
y
d.
8.1.2 Cube
In this subsection we give an explicit form for the Poisson kernel on a nite cube in Z
d
. Let
/
n
= /
n,d
be the cube
/
n
= (x
1
, . . . , x
d
) Z
d
: 1 x
j
n 1.
Note that #(/
n
) = (n1)
d
and /
n
consists of 2d copies of /
n,d1
. Let S
j
denote simple random
walk and =
n
= minj 0 : S
j
, /
n
. Let H
n
= H
Kn
denote the Poisson kernel
H
n
(x, y) = P
x
S
n
= y.
If d = 1, the gamblers ruin estimate gives
H
n
(x, n) =
x
n
, x = 0, 1, . . . , n,
so we will restrict our consideration to d 2. By symmetry, it suces to determine H
n
(x, y) for y
in one of the (d 1)-dimensional sub-cubes of /
n
. We will consider
y
1
n
:= (n, y) Z
d
: y /
n,d1
.
The set of functions on /
n
that are harmonic in /
n
and equal zero on /
n
1
n
is a vector space
of dimension #(
1
n
), and one of its bases is H
n
(, y) : y
1
n
. In the next proposition we will use
another basis which is more explicit.
The proposition below uses a discrete analogue of a technique from partial dierential equations called
separation of variables. We will then compare this to the Poisson kernel for Brownian motion that can be
computed using the usual separation of variables.
Proposition 8.1.3 If x = (x
1
, . . . , x
d
) /
n,d
and y = (y
2
, . . . , y
d
) /
n1,d
, then H
n
(x, (n, y))
equals
_
2
n
_
d1
zK
n,d1
sinh(
z
x
1
/n)
sinh(
z
)
sin
_
z
2
x
2
n
_
sin
_
z
d
x
d
n
_
sin
_
z
2
y
2
n
_
sin
_
z
d
y
d
n
_
,
where z = (z
2
, . . . , z
d
) and
z
=
z,n
is the unique nonnegative number satisfying
cosh
_
n
_
+
d
j=2
cos
_
z
j
n
_
= d. (8.6)
Proof If z = (z
2
, . . . , z
d
) R
d1
, let f
z
denote the function on Z
d
,
f
z
(x
1
, . . . , x
d
) = sinh
_
z
x
1
n
_
sin
_
z
2
x
2
n
_
sin
_
z
d
x
d
n
_
,
where
z
satises (8.6). It is straightforward to check that for any z, f
z
is a discrete harmonic
function on Z
d
with
f
z
(x) = 0, x /
n
1
n
.
We now restrict our consideration to z /
n,d1
. Let
f
z
=
2
(d1)/2
n
(d1)/2
sinh(
z
)
f
z
, z /
n,d1
,
and let
f
z
denote the restriction of
f
z
to
1
n
, considered as a function on /
n,d1
,
z
(x) =
f
z
((n, x)) =
_
2
n
_
(d1)/2
sin
_
z
2
x
2
n
_
sin
_
z
d
x
d
n
_
, x = (x
2
, . . . , x
d
) /
n,d1
.
For integers 1 j, k n 1, one can see (via the representation of sin in terms of exponentials)
that
n1
l=1
sin
_
jl
n
_
sin
_
kl
n
_
=
_
0 j ,= k
n/2 j = k.
(8.7)
186 Addtional topics on Simple Random Walk
Therefore,
z
: z /
n,d1
forms an orthonormal basis for the set of functions on /
n,d1
, in
symbols,
xK
n,d1
z
(x)
f
z
(x) =
_
0, z ,= z
1, z = z
.
Hence any function g on /
n,d1
can be written as
g(x) =
zK
n,d1
C(g, z)
f
z
(x),
where
C(g, z) =
yK
n,d1
z
(y) g(y).
In particular, if y /
n,d1
,
y
(x) =
zK
n,d1
z
(y)
f
z
(x).
Therefore, for each y = (y
2
, . . . , y
n
) the function
x
zK
n,d1
sinh(
z
x
1
/n)
sinh(
z
)
f
z
(y)
f
z
((n, x
2
, . . . , x
n
)),
is a harmonic function in /
n,d
whose value on /
n,d
is
(n,y)
and hence it must equal to x
H
n
(x, (n, y)).
To simplify the notation, we will consider only the case d = 2 (but most of what we write extends to d ≥ 3). If d = 2,

H_{K_n}((x_1, x_2), (n, y)) = (2/n) Σ_{k=1}^{n−1} [ sinh(a_k x_1/n)/sinh(a_k) ] sin(π k x_2/n) sin(π k y/n),   (8.8)

where a_k = a_{k,n} is the unique positive solution to

cosh(a_k/n) + cos(π k/n) = 2.
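Formula (8.8) is a finite identity, so it can be checked directly against a linear-algebra solution of the discrete Dirichlet problem; the following sketch (an illustration only) does this for a small n and one boundary point.

import numpy as np
n, y0 = 6, 2                      # target boundary point (n, y0)
K = [(i, j) for i in range(1, n) for j in range(1, n)]
idx = {z: k for k, z in enumerate(K)}
# a_k from cosh(a_k/n) + cos(pi k/n) = 2
k = np.arange(1, n)
a = n * np.arccosh(2.0 - np.cos(np.pi * k / n))
def H_formula(x1, x2):
    return (2.0 / n) * np.sum(np.sinh(a * x1 / n) / np.sinh(a)
                              * np.sin(np.pi * k * x2 / n)
                              * np.sin(np.pi * k * y0 / n))
# direct solve: h(x) = P^x{exit K_n at (n, y0)}
P = np.zeros((len(K), len(K)))
r = np.zeros(len(K))
for (x1, x2) in K:
    for (dx, dy) in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        w = (x1 + dx, x2 + dy)
        if w in idx:
            P[idx[(x1, x2)], idx[w]] = 0.25
        elif w == (n, y0):
            r[idx[(x1, x2)]] = 0.25
h = np.linalg.solve(np.eye(len(K)) - P, r)
err = max(abs(h[idx[z]] - H_formula(*z)) for z in K)
print(err)                         # of order 1e-15: (8.8) matches the exact solve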
Alternatively, we can write
a
k
=
n
r
_
k
n
_
, (8.9)
where r is the even function
r(t) = cosh
1
(2 cos t). (8.10)
Using cosh
1
(1 +x) =
2x +O(x
3/2
) as x 0+, we get
r(t) = [t[ +O([t[
3
), t [1, 1].
Now (8.9)(8.10) imply
a
k
= k +O
_
k
3
n
2
_
. (8.11)
Since a
k
increases with k, (8.11) implies that there is an > 0 such that
a
k
k, 1 k n 1. (8.12)
We will consider the scaling limit. Let B
t
denote a two-dimensional Brownian motion. Let
/ = (0, 1)
2
and let
T = inft : B
t
, /.
The corresponding Poisson kernel h
K
can be computed exactly in terms of an innite series using
the continuous analogue of the procedure above giving
h((x
1
, x
2
), (1, y)) = 2
k=0
sinh(kx
1
)
sinh(k)
sin(kx
2
) sin(ky) (8.13)
(see Exercise 8.2).
Roughly speaking, we expect
H
Kn
((nx
1
, nx
2
), (n, ny))
1
n
h((x
1
, x
2
), (1, y
2
)),
and the next proposition gives a precise formulation of this.
Proposition 8.1.4 There exists c < such that if 1 j
1
, j
2
, l n1 are integers, x
i
= j
i
/n, y =
l/n,
nH
Kn
((j
1
, j
2
), (n, l)) h((x
1
, x
2
), (1, y))
c
(1 x
1
)
6
n
2
sin(x
2
) sin(y).
A surprising fact about this proposition is how small the error term is. For xed x
1
< 1, the error is O(n
2
)
where one might only expect O(n
1
).
Proof Let = x
1
. Given k N, note that [ sin(kt)[ k sint for 0 < t < . (To see this,
t k sin t sin(kt) is increasing on [0, t
k
] where t
k
(0, /2) solves sin(t
k
) = 1/k, and sin()
continues to increase up to /2, while sin(k ) stays bounded by 1. For /2 < t < consider t
instead, details are left to the reader.) Therefore,
1
sin(x
2
) sin(y)
kn
2/3
sinh(kx
1
)
sinh(k)
sin(kx
2
) sin(ky)
kn
2/3
k
2
sinh(k)
sinh(k)
1
n
2
kn
2/3
k
5
sinh(k)
sinh(k)
c
n
2
kn
2/3
k
5
e
k(1)
. (8.14)
188 Addtional topics on Simple Random Walk
Similarly, using (8.12),
1
sin(x
2
) sin(y)
kn
2/3
sinh(a
k
x
1
)
sinh(a
k
)
sin(kx
2
) sin(ky)
1
n
2
kn
2/3
k
5
e
k(1)
.
For 0 x 1 and k < n
2/3
,
sinh(xa
k
) = sinh(xk)
_
1 +O
_
k
3
n
2
__
.
Therefore,
1
sin(x
2
) sin(y)
k<n
2/3
_
sinh(kx
1
)
sinh(k)
sinh(a
k
x
1
)
sinh(a
k
)
_
sin(kx
2
) sin(ky)
c
n
2
kn
2/3
k
3
e
k(1)
c
n
2
kn
2/3
k
5
e
k(1)
,
where the second to last inequality is obtained as in (8.14). Combining this with (8.8) and (8.13),
we see that
[nH
Kn
((j
1
, j
2
), (n, nl)) h((x
1
, x
2
), (1, y))[
sin(x
2
) sin(y)
c
n
2
k=1
k
5
e
(1)k
c
(1 )
6
n
2
.
The error term in the last proposition is very good except for x
1
near 1. For x
1
close to 1, one can give
good estimates for the Poisson kernel by using the Poisson kernel for a half plane (if x
2
is not near 0 or 1) or by
a quadrant (if x
2
is near 0 or 1). These Poisson kernels are discussed in the next subsection.
8.1.3 Strips and quadrants in Z
2
In the continuing discussion we think of Z
2
as Z + iZ, and we will use complex numbers notation
in this section. Recall r dened in (8.10) and note that
e
r(t)
= 2 cos t +
_
(2 cos t)
2
1, e
r(t)
= 2 cos t
_
(2 cos t)
2
1.
For each t 0, the function
f
t
(x +iy) = e
xr(t)
sin(yt),
f
t
(x +iy) = e
xr(t)
sin(yt),
is harmonic for simple random walk, and so is the function sinh(xr(t))sin(yt). The next proposition
is an immediate generalization of Proposition 8.1.3 to rectangles that are not squares. The proof
is the same and we omit it. We then take limits as the side lengths go to innity to get expressions
for other rectangular subsets of Z
2
.
8.1 Poisson kernel 189
Proposition 8.1.5 If m, n are positive integers, let
A
m,n
= x +iy Z iZ : 1 x m1, 1 y n 1.
Then
H
Am,n
(x +iy, iy
1
) = H
Am,n
((mx) +iy, m +iy
1
)
=
2
n
n1
j=1
sinh(r(
j
n
)(mx))
sinh(r(
j
n
)m)
sin
_
jy
n
_
sin
_
jy
1
n
_
. (8.15)
Corollary 8.1.6 If n is a positive integer, let
A
,n
= x +iy Z iZ : 1 x < , 1 y n 1.
Then
H
A,n
(x +iy, iy
1
) =
2
n
n1
j=1
exp
_
r
_
j
n
_
x
_
sin
_
jy
n
_
sin
_
jy
1
n
_
, (8.16)
and
H
A,n
(x +iy, x
1
) =
2
_
0
sinh(r(t)(n y))
sinh(r(t)n)
sin(tx) sin(tx
1
) dt. (8.17)
Proof Note that
H
A,n
(x +iy, iy
1
) = lim
m
H
Am,n
(x +iy, iy
1
),
and
lim
m
sinh(r(
j
n
)(mx))
sinh(r(
j
n
)m)
= exp
_
r
_
j
n
_
x
_
.
This combined with (8.15) gives the rst identity. For the second we write
H
A,n
(x +iy, x
1
) = lim
m
H
Am,n
(x +iy, x
1
)
= lim
m
H
An,m
(y +ix, ix
1
)
= lim
m
2
m
m1
j=1
sinh(r(
j
m
)(n y))
sinh(r(
j
m
)n)
sin
_
jx
m
_
sin
_
jx
1
m
_
=
2
_
0
sinh(r(t)(n y))
sinh(r(t)n)
sin(tx) sin(tx
1
) dt.
We derived (8.16) as a limit of (8.15). We could also have derived it directly by considering the collection
of harmonic functions
exp
_
r
_
j
n
_
x
_
sin
_
jy
n
_
, j = 1, . . . , n 1.
190 Addtional topics on Simple Random Walk
Corollary 8.1.7 Let

A_+ = {x + iy ∈ Z + iZ : x > 0}.

Then

H_{A_+}(x + iy, 0) = (1/2π) ∫_{−π}^{π} e^{−x r(t)} cos(yt) dt.
Remark. If H denotes the discrete upper half plane, then this corollary implies

H_H(iy, x) = H_H(x + iy, 0) = (1/2π) ∫_{−π}^{π} e^{−y r(t)} cos(xt) dt.
Proof Note that
H
A
+
(x +iy, 0) = lim
n
H
A
,2n
(x +i(n +y), in)
= lim
n
1
n
2n1
j=1
exp
_
r
_
j
2n
_
x
_
sin
_
j
2
_
sin
_
j(n +y)
2n
_
.
Note that sin(j/2) = 0 if j is even. For odd j, we have sin
2
(j/2) = 1 and cos(j/2) = 0, hence
sin
_
j
2
_
sin
_
j(n +y)
2n
_
= cos
_
jy
2n
_
.
Therefore,
H
A
+
(x +iy, 0) = lim
n
1
n
n
j=1
exp
_
r
_
(2j 1)
2n
_
x
_
cos
_
(2j 1)y
2n
_
=
1
_
0
e
xr(t)
cos(yt) dt.
Remark. As already mentioned, using the above expression for (H
A
+
(i, x), x Z), one can read
o the characteristic function of the stopping position of simple random walk started from e
2
and
stopped at its rst visit to the Z 0 (see also Exercise 8.1).
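As a numerical sketch (an illustration only), the integral in Corollary 8.1.7 can be evaluated by quadrature and compared with the Brownian half-plane kernel x/(π(x² + y²)), which it should approach when |x + iy| is large.

import numpy as np
def H_Aplus(x, y, m=200001):
    """(1/(2 pi)) * int_{-pi}^{pi} exp(-x r(t)) cos(y t) dt with r(t) = arccosh(2 - cos t)."""
    t = np.linspace(-np.pi, np.pi, m)
    vals = np.exp(-x * np.arccosh(2.0 - np.cos(t))) * np.cos(y * t)
    dt = t[1] - t[0]
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dt / (2.0 * np.pi)
for (x, y) in [(5, 0), (10, 7), (30, -20)]:
    print((x, y), round(H_Aplus(x, y), 6), round(x / (np.pi * (x * x + y * y)), 6))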
Corollary 8.1.8 Let
A
,
= x +iy Z +iZ : x, y > 0.
Then,
H
A,
(x +iy, x
1
) =
2
_
0
e
r(t)y
sin(tx) sin(tx
1
) dt.
Proof Using (8.17),
H
A,
(x +iy, x
1
) = lim
n
H
A,n
(x +iy, x
1
)
= lim
n
2
_
0
sinh(r(t)(n y))
sinh(r(t)n)
sin(tx) sin(tx
1
) dt
=
2
_
0
e
r(t)y
sin(tx) sin(tx
1
) dt.
8.2 Eigenvalues for rectangles
In general it is hard to compute the eigenfunctions and eigenvectors for a finite subset A of Z^d with respect to simple random walk. One feasible case is that of a rectangle

A = (N_1, . . . , N_d) := {(x_1, . . . , x_d) ∈ Z^d : 0 < x_j < N_j}.

If k = (k_1, . . . , k_d) ∈ Z^d with 1 ≤ k_j < N_j, let

f_k(x_1, . . . , x_d) = f_{k,N_1,...,N_d}(x_1, . . . , x_d) = Π_{j=1}^d sin(π x_j k_j / N_j).

Note that f_k ≡ 0 on ∂(N_1, . . . , N_d). A straightforward computation shows that

L f_k(x_1, . . . , x_d) = λ(k) f_k,

where

λ(k) = λ(k; N_1, . . . , N_d) = (1/d) Σ_{j=1}^d [ cos(π k_j / N_j) − 1 ].

Using (8.7) we can see that the functions

{ f_k : 1 ≤ k_j ≤ N_j − 1 }

form an orthogonal basis for the set of functions on (N_1, . . . , N_d) that vanish on ∂(N_1, . . . , N_d). Hence this gives a complete set of eigenvalues and eigenvectors. We conclude that the (first) eigenvalue of (N_1, . . . , N_d), defined in Section 6.9, is given by

λ = (1/d) Σ_{j=1}^d cos(π / N_j).

In particular, as n → ∞, the eigenvalue for K_n = K_{n,d} = (n, . . . , n) is given by

λ_{K_n} = cos(π/n) = 1 − π²/(2n²) + O(1/n⁴).
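A short numerical sketch (an illustration only): for a small rectangle, the largest eigenvalue of the matrix of transition probabilities killed outside the rectangle equals (1/d) Σ_j cos(π/N_j), and the corresponding eigenvector is the product of sines.

import numpy as np
N1, N2 = 7, 5
A = [(i, j) for i in range(1, N1) for j in range(1, N2)]
idx = {z: k for k, z in enumerate(A)}
P = np.zeros((len(A), len(A)))
for (x, y) in A:
    for (dx, dy) in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        if (x + dx, y + dy) in idx:
            P[idx[(x, y)], idx[(x + dx, y + dy)]] = 0.25
vals, vecs = np.linalg.eigh(P)
closed_form = 0.5 * (np.cos(np.pi / N1) + np.cos(np.pi / N2))
print(vals[-1], closed_form)                     # equal up to rounding
f = np.array([np.sin(np.pi * x / N1) * np.sin(np.pi * y / N2) for (x, y) in A])
f = f / np.linalg.norm(f)
print(abs(f @ vecs[:, -1]))                      # ~1: same eigenvector up to sign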
8.3 Approximating continuous harmonic functions
It is natural to expect that discrete harmonic functions in Z
d
, when appropriately scaled, converge
to (continuous) harmonic functions in R
d
. In this section we discuss some versions of this principle.
We let | = |
d
= x R
d
: [x[ < 1 denote the unit ball in R
d
.
Proposition 8.3.1 There exists c < such that the following is true for all positive integers
n, m. Suppose f : (n + m)| R is a harmonic function. Then there is a function
f on B
n
with
L
f(x) = 0, x B
n
and such that
[f(x)
f(x)[
c |f|
m
2
, x B
n
. (8.18)
In fact, one can choose (recall
n
from Section 6.3)
f(x) = E
x
[f(S
n
)].
Proof Without loss of generality, assume |f|
f(x) = 0, x B
n
. We need to prove (8.18). By (6.9), if
x B
n
,
f(x) = E
x
_
_
f(S
n
)
n1
j=0
Lf(S
j
)
_
_
=
f(x) (x),
where
(x) =
zBn
G
Bn
(x, z) Lf(z).
In Section 6.2, we observed that there is a c such that all 4th order derivatives of f at x are bounded
above by c (n +m[x[)
4
. By expanding in a Taylor series, using the fact that f is harmonic, and
also using the symmetry of the random walk, this implies
[Lf(x)[
c
(n +m[x[)
4
.
Therefore, we have
[(x)[ c
n1
k=0
k|z|<k+1
G
Bn
(x, z)
(n +mk)
4
. (8.19)
We claim that there is a c
1
such that for all x,
nl|z|n1
G
Bn
(x, z) c l
2
.
Indeed, the proof of this is essentially the same as the proof of (5.5). Once we have the last estimate,
summing by parts the right-hand side of (8.19) gives that [(x)[ c/m
2
.
The next proposition can be considered as a converse of the last proposition. If f : Z
d
R is
a function, we will also write f for the piecewise constant function on R
d
dened as follows. For
8.4 Estimates for the ball 193
each x = (x
1
, . . . , x
j
) Z
d
, let
x
denote the cube of side length 1 centered at x,
x
=
_
(y
1
, . . . , y
d
) R
d
:
1
2
y
j
x
j
<
1
2
_
.
The sets
x
: x Z
d
partition R
d
.
Proposition 8.3.2 Suppose f
n
is a sequence of functions on Z
d
satisfying Lf
n
(x) = 0, x B
n
and
sup
x
[f
n
(x)[ 1. Let g
n
: R
d
R be dened by g
n
(y) = f
n
(ny). Then there exists a subsequence
n
j
and a function g that is harmonic on | such that g
n
j
g uniformly on every compact K |.
Proof Let J be a countable dense subset of |. For each y J, the sequence g
n
(y) is bounded and
hence has a subsequential limit. By a standard diagonalization procedure, we can nd a function
g on J such that
g
n
j
(y) g(y), y J.
For notational convenience, for the rest of this proof we will assume that in fact g
n
(y) g(y),
but the proof works equally well if there is only a subsequence.
Given r < 1, let r| = y | : [y[ < r. Using Theorem 6.3.8, we can see that there is a
c
r
< such that for all n, [g
n
(y
1
) g
n
(y
2
)[ c
r
[[y
1
y
2
[ + n
1
] for y
1
, y
2
r|. In particular,
[g(y
1
) g(y
2
)[ c
r
[y
1
y
2
[ for y
1
, y
2
J r|. Hence, we can extend g continuously to r| such
that
[g(y
1
) g(y
2
)[ c
r
[y
1
y
2
[, y
1
, y
2
r|, (8.20)
and a standard 3-argument shows that g
n
converges to g uniformly on r|.
Since g is continuous, in order to show that g is harmonic, it suces to show that it has the
spherical mean value property, i.e., if y | and [y[ + < 1,
_
|zy|=
g(z) ds
(z) = g(y).
Here s
denotes surface measure normalized to have measure one. This can be established from the
discrete mean value property for the functions f
n
, using Proposition 7.7.1 and (8.20). We omit the
details.
8.4 Estimates for the ball
One is often interested in comparing quantities for the simple random walk on the discrete ball
B
n
with corresponding quantities for Brownian motion. Since the Brownian motion is rotationally
invariant, balls are very natural domains to consider. However, the lattice eects at the boundary
mean that it is harder to control the rate of convergence of the simple random walk. This section
presents some basic comparison estimates.
We rst consider the Greens function G
Bn
(x, y). If x = 0, Proposition 6.3.5 gives sharp estimates.
It is trickier to estimate this for other x, y. We will let g denote the Greens function for Brownian
motion in R
d
with covariance matrix d
1
I,
g(x, y) = C
d
[x y[
2d
, d 3,
194 Addtional topics on Simple Random Walk
g(x, y) = C
d
log [x y[, d = 2,
and dene g
n
(x, y) by
g
n
(x, y) = g(x, y) E
x
[g(B
Tn
, y)] .
Here B is a d-dimensional Brownian motion with covariance d
1
I and T
n
= inft : [B
t
[ = n.
The reader should compare the formula for g
n
(x, y) to corresponding formulas for G
Bn
(x, y) in
Proposition 4.6.2. The Greens function for standard Brownian motion is g
n
(x, y)/d.
Proposition 8.4.1 If d 2 and x, y B
n
,
G
Bn
(x, y) = g
n
(x, y) +O
_
1
[x y[
d
_
+O
_
log
2
n
(n [y[)
d1
_
.
This estimate is not optimal, but improvements will not be studied in this book. Note that if follows that
for every > 0, if [x[, [y[ (1 )n,
G
Bn
(x, y) = g
n
(x, y)
_
1 +O
_
1
[x y[
2
_
+O
_
log
2
n
n
__
,
where we write O
to indicate that the implicit constants depend on but are uniform in x, y, n. In particular,
we have uniform convergence on compact subsets of the open unit ball.
Proof We will do the d 3 case; the d = 2 case is done similarly. By Proposition 4.6.2,
G
Bn
(x, y) = G(x, y) E
x
[G(S
n
, y)] .
Therefore,
[g
n
(x, y) G
Bn
(x, y)[ [g(x, y) G(x, y)[ +[E
x
[g(B
Tn
, y)] E
x
[G(S
n
, y)][ .
By Theorem 4.3.1,
[g(x, y) G(x, y)[
c
[x y[
d
G(S
n
, y) =
C
d
[S
n
y[
d2
+O
_
1
(1 +[S
n
y[)
d
_
,
Note that
E
x
[(1 +[S
n
y[)
d
] [n + 1 [y[[
2
E
x
[G(S
n
, y)]
c [n + 1 [y[[
2
G(x, y)
c [n + 1 [y[[
2
[x y[
2d
c
_
[x y[
d
+ (n + 1 [y[)
1d
_
.
We can dene a Brownian motion B and a simple random walk S on the same probability space
such that for each r,
P[B
Tn
S
n
[ r log n
c
r
,
8.4 Estimates for the ball 195
see Proposition 7.7.1. Since [B
Tn
S
n
[ c n, we see that
E[[B
Tn
S
n
[]
cn
k=1
P([B
Tn
S
n
[ k) c log
2
n.
Also,
C
d
[x y[
d2
C
d
[z y[
d2
c [x z[
[n [y[]
d1
.
Let
Bn
denote the eigenvalue for the ball as in Section 6.9 and dene
n
by
Bn
= e
n
. Let
= (d) be the eigenvalue of the unit ball for a standard d-dimensional Brownian motion B
t
, i.e.,
P[B
s
[ < 1, 0 s t c e
t
, t .
Since the random walk suitably normalized converges to Brownian motion, one would conjecture
that dn
2
n
is approximately for large n. The next proposition establishes this but again not with
the optimal error bound.
Proposition 8.4.2
n
=
dn
2
_
1 +O
_
1
log n
__
.
Proof By Theorem 7.1.1, we can nd a b > 0 such that a simple random walk S
n
and a standard
Brownian motion B
t
can be dened on the same probability space so that
P
_
max
0jn
3
[S
j
B
j/d
[ b log n
_
b n
1
.
By Corollary 6.9.6, there is a c
1
such that for all n and all j,
P[S
j
[ < n, j kn
2
c
1
e
nk n
2
. (8.21)
For Brownian motion, we know there is a c
2
such that
P[B
t
[ < 1, 0 t k c
2
e
k
.
By the coupling, we know that for all n suciently large
P[S
j
[ < n, j d
1
n
2
log n P[B
t
[ < n +b log n, t
1
n
2
log n +b n
1
,
and due to Brownian scaling, we obtain
c
1
exp
_
dn
2
log n
_
c
2
exp
_
n
2
log n
(n +b log n)
2
_
+b n
1
c
3
exp
_
log n +O
_
log
2
n
n
__
.
Taking logarithms, we get
dn
2
1 O
_
1
log n
_
.
196 Addtional topics on Simple Random Walk
A similar argument, reversing the roles of the Brownian motion and the random walk, gives
dn
2
1 +O
_
1
log n
_
.
Exercises
Exercise 8.1 Suppose S
n
is simple random walk in Z
2
started at the origin and
T = min j 1 : S
j
xe
1
: x Z .
Let X denote the rst component of S
T
. Show that the characteristic function of X is
(t) = 1
_
(2 cos t)
2
1.
Exercise 8.2 Let V = (x, y) R
2
: 0 < x, y < 1 and let
1
V = (1, y) : 0 y 1. Suppose
g : V R is a continuous function that vanishes on V
1
V . Show that the unique continuous
function on V that is harmonic in V and agrees with g on V is
f(x, y) = 2
k=1
c
k
sinh(kx)
sinh(k)
sin(ky),
where
c
k
=
_
1
0
sin(tk) g(1, t) dt.
Use this to derive (8.13).
Exercise 8.3 Let A
,
be as in Corollary 8.1.8. Suppose x
n
, y
n
, k
n
are sequences of positive
integers with
lim
n
x
n
n
= x, lim
n
y
n
n
= y, lim
n
k
n
n
= k,
where x, y, k are positive real numbers. Find
lim
n
nH
A,
(x
n
+iy
n
, k
n
).
Exercise 8.4 Let f
n
be the eigenfunction associated to the d-dimensional simple random walk in
Z
d
on B
n
, i.e.,
Lf
n
(x) = (1 e
n
) f
n
(x), x B
n
,
with f
n
0 on Z
d
B
n
or equivalently,
f
n
(x) = (1 e
n
)
yBn
G
Bn
(x, y) f(y).
This denes the function up to a multiplicative constant; x the constant by asserting f
n
(0) = 1.
Extend f
n
to be a function of R
d
as in Section 8.3 and let
F
n
(x) = f
n
(nx).
8.4 Estimates for the ball 197
The goal of this problem is to show that the limit
F(x) = lim
n
F
n
(x),
exists and satises
F(x) =
_
|y|1
g(x, y) F(y) d
d
y, (8.22)
where g is the Greens function for Brownian motion with constant chosen as in Section 8.4. In
other words, F is the eigenfunction for Brownian motion. (The eigenfunction is the same whether
we choose covariance matrix I or d
1
I.) Useful tools for this exercise are Proposition 6.9.4, Exercise
6.6, Proposition 8.4.1, and Proposition 8.4.2. In particular,
(i) Show that there exist c
1
, c
2
such that
c
1
[1 [x[] F
n
(x) c
2
[1 [x[ +n
1
].
(ii) Use a diagonalization argument to nd a subsequence n
j
such that for all x with rational
coordinates the limit
F(x) = lim
j
F
n
j
(x)
exists.
(iii) Show that for every r < 1, there is a c
r
such that for [x[, [y[ r,
[F
n
(x) F
n
(y)[ c
r
[[x y] +n
1
].
(iv) Show that F is uniformly continuous on the set of points in the unit ball with rational
coordinates and hence can be dened uniquely on [z[ 1 by continuity.
(v) Show that if [x[, [y[ < c
r
, then
[F(x) F(y)[ c
r
[x y[.
(vi) Show that F satises (8.22).
(vii) You may take it as a given that there is a unique solution to (8.22) with F(0) = 1 and
F(x) = 0 for [x[ = 1. Use this to show that F
n
converges to F uniformly.
9
Loop Measures
9.1 Introduction
Problems in random walks are closely related to problems on loop measures, spanning trees, and
determinants of Laplacians. In this chapter we will gives some of the relations. Our basic viewpoint
will be dierent from that normally taken in probability. Instead of concentrating on probability
measures, we consider arbitrary (positive) measures on paths and loops.
Considering measures on paths or loops that are not probability measures is standard in statistical
physics. Typically one consider weights on paths of the form e
E
where is a parameter and
E is the energy of a conguration. If the total mass is nite (say, if the there are only a nite
number of congurations) such weights can be made into probability measures by normalizing.
There are times where it is more useful to think of the probability measures and other times where
the unnormalized measure is important. In this chapter we take the congurational view.
9.2 Denitions and notations
Throughout this chapter we will assume that
A = x
0
, x
1
, . . . , x
n1
, or A = x
0
, x
1
, . . .
is a nite or countably innite set of points or vertices with a distinguished vertex x
0
called the
root.
A nite sequence of points = [
0
,
1
, . . . ,
k
] in A is called a path of length k. We write [[ for
the length of .
A path is called a cycle if
0
=
k
. If
0
= x, we call the cycle an x-cycle and call x the root of
the cycle.
We allow the trivial cycles of length zero consisting of a single point.
If x A, we write x if x =
j
for some j = 0, . . . , [[.
If A A, we write A if all the vertices of are in A.
A weight q is a nonnegative function q : A A [0, ) that induces a weight on paths
q() =
||
j=1
q(
j1
,
j
).
198
9.2 Denitions and notations 199
By convention q() = 1 if [[ = 0.
q is symmetric if q(x, y) = q(y, x) for all x, y
Although we are doing this in generality, one good example to have in mind is A = Z
d
or A equal to a
nite subset of Z
d
containing the origin with x
0
= 0. The weight q is that obtained from simple random walk,
i.e., q(x, y) = 1/2d if [x y[ = 1 and q 0 otherwise.
We say that A is q-connected if for every x, y A there exists a path
= [
0
,
1
, . . . ,
k
]
with
0
= x,
k
= y and q() > 0.
q is called a (Markov) transition probability if for each x
y
q(x, y) = 1.
In this case q() denotes the probability that the chain starting at
0
enters states
1
, . . . ,
k
in
that order. If A is q-connected, q is called irreducible.
q is called a subMarkov transition probability if for each x
y
q(x, y) 1,
and it is called strictly subMarkov if the sum is strictly less than one for at least one x. Again, q
is called irreducible if A is q-connected. A subMarkov transition probability q on A can be made
into a transition probability on A by setting q(, ) = 1 and
q(x, ) = 1
y
q(x, y).
The rst time that this Markov chain reaches is called the killing time for the subMarkov
chain.
If q is the weight corresponding to simple random walk in Z
d
, then q is a transition probability if A = Z
d
and q is a strictly subMarkov transition probability if A is a proper subset of Z
d
.
If q is a transition probability on A, two important ways to get subMarkov transition probabilities
are:
Take A A and consider q(x, y) restricted to A. This corresponds to the Markov chain killed
when it leaves A.
Let 0 < < 1 and consider q. This corresponds to the Markov chain killed at geometric rate
(1 ).
The rooted loop measure m = m
q
is the measure on cycles dened by m() = 0 if [[ = 0 and
m() = m
q
() =
q()
[[
, [[ 1.
200 Loop Measures
An unrooted loop or cycle is an equivalence class of cycles under the equivalence
[
0
,
1
, . . . ,
k
] [
j
,
j+1
, . . . ,
k
,
1
. . . ,
j
]. (9.1)
We denote unrooted loops by and write if is a cycle that produces the unrooted loop
.
The lengths and weights of all representatives of are the same, so it makes sense to write [[
and q().
If is an unrooted loop, let
K() = # :
be the number of representatives of the equivalence class. The reader can easily check that K()
divides [[ but can be smaller. For example, if is the unrooted loop corresponding to a rooted
loop = [x, y, x, y, x] with distinct vertices x, y, then [[ = 4 but K() = 2.
The unrooted loop measure is the measure m = m
q
obtained from m by forgetting the root,
i.e.,
m() =
q()
[[
=
K() q()
[[
.
A weight q generates a directed graph with vertices A and directed edges = (x, y) A A :
q(x, y) > 0. Note that this allows self-loops of the form (x, x). If q is symmetric, then this is
an undirected graph. In this chapter graph will mean undirected graph.
If #(A) = n < , a spanning tree T (of the complete graph) on vertices A is a collection of
n 1 edges in A such that A with these edges is a connected graph.
Given q, the weight of a tree T (with respect to root x
0
) is
q(T ; x
0
) =
(x,x
)T
q(x, x
),
where the product is over all directed edges (x, x
.
If q is symmetric, then q(T ; x
0
) is independent of the choice of the root x
0
and we will write q(T )
for (T ; x
0
). Any tree with positive weight is a subgraph of the graph generated by q.
If q is a weight and > 0, we write q
() =
||
q(), q
(T ) =
n1
q(T ). If q is a subMarkov transition probability and 1, then q
is also a subMarkov
transition probability for a chain moving as q with an additional geometric killing.
Let L
j
denote the set of (rooted) cycles of length j and
L
j
(A) = L
j
: A,
L
x
j
(A) = L
j
(A) : x , L
x
j
= L
x
j
(A).
L =
_
j=0
L
j
, L(A) =
_
j=0
L
j
(A), L
x
(A) =
_
j=0
L
x
j
(A), L
x
=
_
j=0
L
x
j
.
9.3 Generating functions and loop measures 201
We also write L
j
, L
j
(A), etc., for the analogous sets of unrooted cycles.
9.2.1 Simple random walk on a graph
An important example is simple random walk on a graph. There are two dierent denitions that
we will use. Suppose A is the set of the vertices of an (undirected) graph. We write x y if x is
adjacent to y, i.e., if x, y is an edge. Let
deg(x) = #y : x y
be the degree of x. We assume that the graph is connected.
Simple random walk on the graph is the Markov chain with transition probability
q(x, y) =
1
deg(x)
, x y.
If A is nite, the invariant probability measure for this Markov chain is proportional to d(x).
Suppose
d = sup
xX
deg(x) < .
The lazy (simple random) walk on the graph, is the Markov chain with symmetric transition
probability
q(x, y) =
1
d
, x y,
q(x, x) =
d deg(x)
d
.
We can also consider this as simple random walk on the augmented graph that has added d
deg(x) self-loops at each vertex x. If A is nite, the invariant probability measure for this Markov
chain is uniform.
A graph is regular (or d-regular) if deg(x) = d for all x. For regular graphs, the lazy walk is the
same as the simple random walk.
A graph is transitive if all the vertices look the same, i.e., if for each x, y A there is a graph
isomorphism that takes x to y. Any transitive graph is regular.
9.3 Generating functions and loop measures
In this section, we x a set of vertices A and a weight q on A.
If x A, the x-cycle generating function is given by
g(; x) =
L,
0
=x
||
q() =
L,
0
=x
q
().
If q is a subMarkov transition probability and 1, then g(; x) denotes the expected number
of visits of the chain to x before being killed for a subMarkov chain with weight q
started at x.
202 Loop Measures
For any cycle we dene
d() = #j : 1 j [[,
j
=
0
,
and we call an irreducible cycle if d() = 1.
The rst return to x generating function is dened by
f(; x) =
L,||1,
0
=x,d()=1
q()
||
.
If q is a subMarkov transition probability, then f(; x) is the probability that the chain starting
at x returns to x before being killed.
One can check as in (4.6), that
g(; x) = 1 +f(; x) g(, x),
which yields
g(; x) =
1
1 f(; x)
. (9.2)
If A is nite, the cycle generating function is
g() =
xX
g(; x) =
||
q() =
L
q
().
Since each x A has a unique cycle of length 0 rooted at x,
g(0; x) = 1, g(0) = #(A).
If A is nite, the loop measure generating function is
() =
||
m
q
() =
||
m
q
() =
L, ||1
||
[[
q().
Note that if #(A) = n < ,
(0) = 0, g() =
() +n, () =
_
0
g(s) n
s
ds.
If A A is nite, we write
F(A; ) = exp
_
_
_
L(A),||1
q()
||
[[
_
_
_
= exp
_
_
_
L(A),||1
q() K()
||
[[
_
_
_
.
In other words, log F(A; ) is the loop measure (with weight q
) of
the set of loops in A that include x, i.e.,
F
x
(A; ) = exp
_
_
_
L
x
(A),||1
q()
||
[[
_
_
_
= exp
_
_
_
L
x
(A),||1
q() K()
||
[[
_
_
_
.
9.3 Generating functions and loop measures 203
More generally, if V A, log F
V
(A; ) denotes the loop measure of loops in A that intersect V ,
F
V
(A; ) = exp
_
_
_
L(A),||1,V =
q()
||
[[
_
_
_
= exp
_
_
_
L(A),||1,V =
q() K()
||
[[
_
_
_
.
If is a path, we write F
for F
V
where V denotes the vertices in . Note that F(A; ) = F
A
(A; ).
We write F(A) = F(A; 1), F
x
(A) = F
x
(A; 1).
Proposition 9.3.1 If A = y
1
, . . . , y
k
, then
F(A; ) = F
A
(A; ) = F
y
1
(A; ) F
y
2
(A
1
; ) F
y
k
(A
k1
; ), (9.3)
where A
i
= A y
1
, . . . , y
i
. More generally, if V = y
1
, . . . , y
j
A then
F
V
(A; ) = F
y
1
(A; ) F
y
2
(A
1
; ) F
y
j
(A
j1
; ). (9.4)
In particular, the products on the right-hand side of (9.3) and (9.4) are independent of the ordering
of the vertices.
Proof This follows from the denition and the observation that the collection of loops that intersect
V can be partitioned into those that intersect y
1
, those that do not intersect y
1
but intersect y
2
,
etc.
The next lemma is an important relationship between one generating function and the exponential
of another generating function.
Lemma 9.3.2 Suppose x A A. Let
g
A
(; x) =
L(A),
0
=x
q()
||
.
Then,
F
x
(A; ) = g
A
(; x).
Remark. If = 1 and q is a transition probability, then g
A
(1; x) (and hence by the lemma
F
x
(A) = F
x
(A; 1)) is the expected number of visits to x by a random walk starting at x before
its rst visit to A A. In other words, F
x
(A)
1
is the probability that a random walk starting
at x reaches A A before its rst return to x. Using this interpretation for F
x
(A), the fact that
the product on the right-hand side of (9.3) is independent of the ordering is not so obvious. See
Exercise 9.2 for a more direct proof in this case.
Proof Suppose L
x
(A). Let d
x
() be the number of times that a representative of visits x
(this is the same for all representatives ). For representatives with
0
= x, d() = d
x
(). It is
easy to verify that the number of representatives of with
0
= x is K()d
x
()/[[. From this
we see that
m() =
q()
[[
=
K() q()
[[
=
,
0
=x
q()
d()
.
204 Loop Measures
Therefore,
L
x
(A)
m()
||
=
L(A),
0
=x
q()
||
d()
=
j=1
1
j
L(A),
0
=x,d()=j
q()
||
.
An x-cycle with d
x
() = j can be considered as a concatenation of j x-cycles
with d(
) = 1.
Using this we can see that
L(A),
0
=x,d()=j
q()
||
=
_
_
L(A),
0
=x,d()=1
q()
||
_
_
j
= f
A
(; x)
j
.
Therefore,
log F
x
(A; ) =
j=1
f
A
(x; )
j
j
= log[1 f
A
(; x)] = log g
A
(; x).
The last equality uses (9.2).
Proposition 9.3.3 Suppose #(A) = n < and > 0 satises F(A; ) < . Then
F(A; ) =
1
det[I Q]
,
where Q denotes the n n matrix [q(x, y)]
x,yX
.
Proof Without loss of generality we may assume = 1. We prove by induction on n. If n = 1 and
Q = (r), then F(A; 1) = 1/(1 r). To do the inductive step, suppose n > 1 and x A, then
g(1; x) =
_
_
j=0
Q
j
_
_
x,x
=
_
(I Q)
1
x,x
=
det[I Q
x
]
det[I Q]
,
where Q
x
denotes the matrix Q with the row and column corresponding to x removed. The last
equality follows from the adjoint form of the inverse. Using (9.3) and the inductive hypothesis on
A x, we get the result.
Remark. The matrix I Q is often called the (negative of the) Laplacian. The last proposition
and others below relate the determinant of the Laplacian to loop measures and trees.
Let
0,x
denote the radius of convergence of g(; x).
If A is q-connected, then
0,x
is independent of x and we write just
0
.
If A is q-connected and nite, then
0
is also the radius of convergence for g() and F(A; )
and 1/
0
is the largest eigenvalue for the matrix Q = (q(x, y)). If q is a transition probability,
0
= 1. If q is an irreducible, strictly subMarkov transition probability, then
0
> 1.
If A is q-connected and nite, then g(
0
) = g(
0
; x) = F(A;
0
) = . However, one can show
easily that F(A x;
0
) < . The next proposition shows how to compute the last quantity
from the generating functions.
9.3 Generating functions and loop measures 205
Proposition 9.3.4 Let
0
be the radius of convergence of g. Then if x A,
log F(A x;
0
) = lim
log F(A x; )
= lim
j=2
(1
j
)
Proof Since the eigenvalues of I Q are 1
1
, . . . , 1
n
, we see that
lim
1
det[I Q]
1
=
n
j=2
(1
j
).
If < 1, then Proposition 9.3.3 states
F(A; ) =
1
det[I Q]
.
Also, as 1,
g(; x) (x) (1 )
1
,
where denotes the invariant probability. (This can be seen by recalling that g(; x) is the number
of visits to x by a chain starting at x before a geometric killing time with rate (1). The expected
number of steps before killing is 1/(1 ), and since the killing is independent of the chain, the
206 Loop Measures
expected number of visits to x before begin killed is asymptotic to (x)/(1 ).) Therefore, using
Proposition 9.3.4,
log F(A x) = lim
1
[log F(A; ) log g(; x)]
= log (x)
n
j=2
log(1
j
).
9.4 Loop soup
If V is a countable set and : V [0, ) is a measure, then a (Poisson) soup from is a
collection of independent Poisson processes
N
x
t
, x V,
where N
x
t
has parameter (x). A soup realization is the corresponding collection of multi-sets
/
t
where the number of times that x appears in /
t
is N
x
t
. This can be considered as a stochastic
process taking values in multi-sets of elements of V .
The rooted loop soup (
t
is a soup realization from m.
The unrooted loop soup (
t
is a soup realization from m.
From the denitions of m and m we can see that we can obtain an unrooted loop soup (
t
from
a rooted loop soup (
t
by forgetting the roots of the loops in (
t
. To obtain (
t
from (
t
, we need
to add some randomness. More specically, if is a loop in an unrooted loop soup (
t
, we choose a
rooted loop by choosing uniformly among the K() representatives of . It is not hard to show
that with probability one, for each t, there is at most one loop in
(
t
_
_
s<t
(
s
_
.
Hence we can order the loops in (
t
(or (
t
) according to the time at which they were created; we
call this the chronological order.
Proposition 9.4.1 Suppose x A A with F
x
(A) < . Let (
t
(A; x) denote an unrooted loop
soup (
t
restricted to
L
x
(A). Then with probability one, (
1
(A; x) contains a nite number of loops
which we can write in chronological order
1
, . . . ,
k
.
Suppose that independently for each unrooted loop
j
, a rooted loop
j
with
0
= x is chosen uni-
formly among the K() d
x
()/[[ representatives of rooted at x, and these loops are concatenated
to form a single loop
=
1
2
k
.
A multi-set is a generalization of a set where elements can appear multiple times in the collection.
9.5 Loop erasure 207
Then for any loop
A rooted at x,
P =
=
q()
F
x
(A)
.
Proof We rst note that
1
, . . . ,
k
as given above is the realization of the loops in a Poissonian
realization corresponding to the measure
m
x
() =
q()
d
x
()
,
up to time 1 listed in chronological order, restricted to loops L(A) with
0
= x. Using the
argument of Lemma 9.3.2, the probability that no loop appears is
exp
_
_
_
L(A);
0
=x
m
x
()
_
_
_
= exp
_
_
_
L(A);
0
=x
q()
d
x
()
_
_
_
=
1
F
x
(A)
.
More generally, suppose
0
= x and d(
=
1
r
where
i
is a loop rooted at x with d
x
(
i
) = j
i
. The probability that
1
, . . . ,
r
(and no other
loops) appear in the realization up to time 1 in this order is
exp
_
L(A);
0
=x
m
x
()
_
r!
m
x
(
1
) m
x
(
r
) =
1
r! F
x
(A)
q(
1
) q(
r
)
j
1
j
r
=
q(
)
F
x
(A)
1
r! (j
1
j
r
)
.
The proposition then follows from the following combinatorial fact that we leave as Exercise 9.1:
j
1
++jr=k
1
r! (j
1
j
r
)
= 1.
9.5 Loop erasure
A path = [
0
, . . . ,
n
] is self-avoiding if
j
,=
k
for 0 j < k n.
Given a path = [
0
, . . . ,
n
] there are a number of ways to obtain a self-avoiding subpath of
that goes from
0
to
n
. The next denition gives one way.
If = [
0
, . . . ,
n
] is a path, LE() denotes its (chronological) loop-erasure dened as follows.
Let
0
= maxj n :
j
= 0. Set
0
=
0
=
0
.
Suppose
i
< n. Let
i+1
= maxj n :
j
=
i
+1
. Set
i+1
=
i+1
=
i
+1
.
If i
= mini :
i
= n = mini :
i
=
n
, then LE() = [
0
, . . . ,
i
].
208 Loop Measures
A weight q on paths induces a new weight q
A
on self-avoiding paths by specifying that the weight
of a self-avoiding path is the sum of the weights of all the paths in A for which = L(). The
next proposition describes this weight.
Proposition 9.5.1 Suppose A A, i 1 and = [
0
, . . . ,
i
] is a self-avoiding path whose vertices
are in A. Then,
q
A
() :=
L(A);LE()=
q() = q() F
(A), (9.5)
where, as before,
F
(A) = exp
_
_
_
L(A),||1,=
q()
[[
_
_
_
.
Proof Let A
1
= A, A
j
= A
0
, . . . ,
j
. Given any with LE() = , we can decompose as
=
0
[
0
,
1
]
1
[
1
,
2
] [
i1
,
i
]
i
,
where
j
denotes the loop
[
j1
+1
, . . . ,
j
]
(here
1
= 1). The loop
j
can be any loop rooted at
j
contained in A
j1
. The total measure
of such loops is F
j
(A
j1
), see Lemma 9.3.2. The result then follows from (9.4).
In particular, q
A
() depends on A. The next proposition discusses the Radon-Nikodym deriva-
tive of q
A
1
with respect to q
A
for A
1
A.
If V
1
, V
2
A, Let
F
V
1
,V
2
(A) = exp
_
_
_
L(A),V
1
=,V
2
=
q()
[[
_
_
_
.
Proposition 9.5.2 Suppose A
1
A and = [
0
, . . . ,
i
] is a self-avoiding path whose vertices are
in A
1
. Then
q
A
() = q
A
1
() F
,A\A
1
(A).
Proof This follows immediately from the relation F
(A) = F
(A
1
) F
,A\A
1
(A).
The inverse of loop erasing is loop addition. Suppose = [
0
, . . . ,
k
] is a self-avoiding path.
We dene a random variable Z
1,j
,
2,j
, . . . ,
s
j
,j
,
denote the loops in (
1
that intersect
j
but do not intersect
0
, . . . ,
j1
. These loops are listed
in the order that they appear in the soup. For each such loop
i,j
, choose a representative
i,j
9.6 Boundary excursions 209
roooted at
j
; if there is more than one choice for the representative, choose it uniformly. We then
concatenate these loops to give
j
=
1,j
s
j
,j
.
If s
j
= 0, dene
j
to be the trivial loop [
j
]. We then concatenate again to dene
= Z() =
0
[
0
,
1
]
1
[
1
,
2
] [
k1
,
k
]
k
.
Proposition 9.4.1 tells us that there is another way to construct a random variable with the distribu-
tion of Z
. Suppose
0
, . . . ,
k
are chosen independently (given ) with
j
having the distribution
of a cycle in A
0
, . . . ,
j1
rooted at
j
. In other words if
L(A
0
, . . . ,
j1
) with
0
=
j
, then the probability that
j
=
is q(
)/F
j
(A
0
, . . . ,
j1
).
9.6 Boundary excursions
Boundary excursions in a set A are paths that begin and end on the boundary and otherwise stay
in A. Suppose A, q are given. If A A we dene
A = (A)
q
= y A A : q(x, y) +q(y, x) > 0 for some x A.
A (boundary) excursion in A is a path = [
0
, . . . ,
n
] with n 2 such that
0
,
n
A and
1
, . . . ,
n1
A.
The set of boundary excursions with
0
= x and
||
= y is denoted c
A
(x, y), and
c
A
=
_
x.yA
c
A
(x, y).
Let
c
A
(x, y),
c
A
denote the subsets of c
A
(x, y), c
A
, respectively, consisting of the self-avoiding
paths. If x = y, the set
c
A
(x, y) is empty.
The measure q restricted to c
A
is called excursion measure on A.
The measure q restricted to
c
A
is called the self-avoiding excursion measure on A.
The loop-erased excursion measure on A, is the measure on
c
A
given by
q() = q c
A
: LE() = .
As in (9.5), we can see that
q() = q() F
(A). (9.6)
If x, y A, q, q can also be considered as measures on c
A
(x, y) or
c
A
(x, y) by restricting to
those paths that begin at x and end at y. If x = y, these measures are trivial for the self-avoiding
and loop-erased excursion measures.
If is a measure on a set K and K
1
K, then the restriction of to K
1
is the measure dened by
(V ) = (V K
1
). If is a probability measure, this is related to but not the same as the conditional measure
given K
1
; the conditional measure normalizes to make the measure a probability measure. A family of measures
A
, indexed by subsets A, supported on c
A
(or c
A
(x, y)) is said to have the restriction property if whenever
A
1
A, then
A1
is
A
restricted to c
A1
(c
A1
(x, y)). The excursion measure and the self-avoiding excursion
210 Loop Measures
measure have the restriction property. However, the loop-erased excursion measure does not have the restriction
property. This can be seen from (9.6) since it is possible that F
(A
1
) ,= F
(A).
The loop-erased excursion measure q is obtained from the excursion measure q by a deterministic
function on paths (loop erasure). Since this function is not one-to-one, we cannot obtain q from
q without adding some extra randomness. However, one can obtain q from q by adding random
loops as described at the end of Section 9.5.
The next denition is a generalization of the boundary Poisson kernel dened in Section 6.7.
The boundary Poisson kernel is the function H
A
: AA [0, ) given by
H
A
(x, y) =
E
A
(x,y)
q().
Note that if c
A
(x, y), then LE()
c
A
(x, y). In particular, if x ,= y,
H
A
(x, y) =
E
A
(x,y)
q().
Suppose k is a positive integer and x
1
, . . . , x
k
, y
1
, . . . , y
k
are distinct points in A. We write x =
(x
1
, . . . , x
k
), y = (y
1
, . . . , y
k
). We let
c
A
(x, y) = c
A
(x
1
, y
1
) c
A
(x
k
, y
k
),
and we write [] = (
1
, . . . ,
k
) for an element of c
A
(x, y) and
q([]) = (q q)([]) = q(
1
) q(
2
) q(
k
).
We can consider q q as a measure on c
A
(x, y). We dene
c
A
(x, y) similarly.
The nonintersecting excursion measure q
A
(x, y) at (x, y) is the restriction of the measure q q
to the set of [] c
A
(x, y) that do not intersect, i.e.,
i
j
= , 1 i < j k.
The nonintersecting self-avoiding excursion measure at (x, y) is the restriction of the measure
q q to the set of []
c
A
(x, y) that do not intersect. Equivalently, it is the restriction of
the nonintersecting excursion measure to
c
A
(x, y).
There are several ways to dene the nonintersecting loop-erased excursion measure. It turns out
that the most obvious way (restricting the loop-erased excursion measure to k-tuples of walks that
do not intersect) is neither the most important nor the most natural. To motivate our denition,
let us consider the nonintersecting excursion measure with k = 2. This is the measure on pairs of
excursions (
1
,
2
). that gives measure q(
1
) q(
2
) to each (
1
,
2
) satisfying
1
2
= . Another
way of saying this is the following.
Given
1
, the measure on
2
is q restricted to those excursions c
A
(x
2
, y
2
) such that
1
= .
In other words, the measure is q restricted to c
A\
1
(x
2
, y
2
).
More generally, if k 2 and 1 j k 1, the following holds.
Given
1
, . . . ,
j
, the measure on
j+1
is q restricted to excursions in c
A
(x
j+1
, y
j+1
) that do not
intersect
1
j
. In other words, the measure is q restricted to c
A\(
1
j
)
(x
j+1
, y
j+1
).
9.6 Boundary excursions 211
The nonintersecting self-avoiding excursion measure satises the analogous property. We will use
this as the basis for our denition of the nonintersecting loop-erased measure q
A
(x, y) at (x, y). We
want our denition to satisfy the following.
Given
1
, . . . ,
j
, the measure on
j+1
is the same as q
A\(
1
j
)
(x
j+1
, y
j+1
).
This leads to the following denition.
The measure q
A
(x, y) is the measure on
c
A
(x, y) obtained by restricting q
A
(x, y) to the set V of
k-tuples [] c
A
(x, y) that satisfy
j+1
[
1
j
] = , j = 1, . . . , k 1, (9.7)
where
j
= LE(
j
), and then considering it as a measure on the loop erasures. In other words,
q
A
(
1
, . . . ,
k
) = q(
1
, . . . ,
k
) V : LE(
j
) =
j
, j = 1, . . . , k, satisfying (9.7).
This denition may look unnatural because it seems that it might depend on the order of the pairs
of vertices. However, the next proposition shows that this is not the case.
Proposition 9.6.1 The q
A
(x, y)-measure of a k-tuple (
1
, . . . ,
k
) is
_
_
k
j=1
q
A
(
j
)
_
_
1
i
j
,= , 1 i < j nF
1
,...,
k(A)
1
,
where
F
1
,...,
k (A) = exp
_
_
_
L(A)
q()
[[
J(;
1
, . . . ,
k
)
_
_
_
,
and J(;
1
, . . . ,
k
) = max0, s 1, where s is the number of paths
1
, . . . ,
k
intersected by .
Proof Proposition 9.5.1 implies
k
j=1
q
A
(
j
) =
k
j=1
q(
j
)
k
j=1
exp
_
_
_
L(A),||1,
j
=
q()
[[
_
_
_
. (9.8)
However, assuming that
i
j
= for i ,= j,
q
A
(
1
, . . . ,
j
) =
k
j=1
q(
j
)
k
j=1
exp
_
_
_
L(A\(
1
j1
),||1,
j
=
q()
[[
_
_
_
. (9.9)
If a loop intersects s of the
j
, where s 2, then it appears s times in (9.8) but only one time
in (9.9).
Let
H
A
(x, y) denote the total mass of the measure q
A
(x, y).
212 Loop Measures
If k = 1, we know that
H
A
(x, y) = H
A
(x, y). The next proposition shows that for k > 1, we
can describe
H
A
(x, y) in terms of the quantities H
A
(x
i
, y
j
). The identity is a generalization of
a result of Karlin and McGregor on Markov chains (see Exercise 9.3). If is a permutation of
1, . . . , k, we also write (y) for (y
(1)
, . . . , y
(k)
).
Proposition 9.6.2 (Fomins identity)
(1)
sgn
H
A
(x, (y)) = det [H
A
(x
i
, y
j
)]
1i,jk
. (9.10)
Remark. If A is a simply connected subset of Z
2
and q comes from simple random walk, then
topological considerations tell us that
H
A
(x, (y)) is nonzero for at most one permutation . If
we order the vertices so that this permutation is the identity, Fomins identity becomes
H
A
(x, y) = det [H
A
(x
i
, y
j
)]
1i,jk
.
Proof We will say that [] is nonintersecting if (9.7) holds and otherwise we call it intersecting.
Let
c
=
_
c
A
(x, (y)), (9.11)
let c
NI
be the set of nonintersecting [] c
, and let c
I
= c
NI
be the set of intersecting [].
We will dene a function : c
NI
.
q([]) = q(([])).
If [] c
I
c
A
(x, (y)), then ([]) c
A
(x,
1
(y)) where sgn
1
= sgn. In fact,
1
is the
composition of and a transposition.
is the identity. In particular, is a bijection.
To show that existence of such a proves the proposition, rst note that
det [H
A
(x
i
, y
j
)]
1i,jk
=
(1)
sgn
k
i=1
H
A
(x
i
, y
(i)
).
Also,
H
A
(x
i
, y
(i)
) =
E
A
(x
i
,y
(i)
)
q().
Therefore, by expanding the product, we have
det [H
A
(x
i
, y
j
)]
1i,jk
=
[]E
(1)
sgn
q([]) =
[]E
(1)
sgn
1
q([()]).
In the rst summation the permutation is as in (9.11). Hence the sum of all the terms that come
from c
I
is zero, and
det [H
A
(x
i
, y
j
)]
1i,jk
=
[]E
NI
(1)
sgn
q([]).
9.6 Boundary excursions 213
But the right-hand side is the same as the left-hand side of (9.10). We dene to be the identity
on c
NI
, and we now proceed to dene the bijection on c
I
.
Let us rst consider the k = 2 case. Let [] c
I
,
=
1
= [
0
, . . . ,
m
] c
A
(x
1
, y), =
2
= [
0
, . . . ,
n
] c
A
(x
2
, y
),
= [
0
, . . . ,
l
] = LE(),
where y = y
1
, y
= y
2
or y = y
2
, y
= y
1
. Since [] c
I
, we know that
= L() ,= .
Dene
s = minl :
l
, t = maxl :
l
=
s
, u = maxl :
l
=
s
.
Then we can write =
+
, =
+
where
= [
0
, . . . ,
t
],
+
= [
t
, . . . ,
m
],
= [
0
, . . . ,
u
],
+
= [
u
, . . . ,
n
].
We dene
([]) = ((
+
,
+
)) = (
+
,
+
).
Note that
+
c
A
(x
1
, y
),
+
c
A
(x
2
, y), and q(([])) = q([]). A straightforward
check shows that is the identity.
x
1
x
2
y
y
t
=
s
= w
u
Figure 0-a
214 Loop Measures
Suppose k > 2 and [] c
I
. We will change two paths as in the k = 2 case and leave the others
xed being careful in our choice of the paths to make sure that (([])) = []. Let
i
= LE(
i
).
We dene
r = mini :
i
j
,= for some j > i,
s = minl :
r
l
i+1
k
,
b = minj > r :
r
s
j
,
t = maxl :
r
l
=
r
s
, u = maxl :
b
l
=
r
s
.
We make the interchange
(
r,
r,+
,
b,
b,+
) (
r,
b,+
,
b,
r,+
)
as in the previous paragraph (with (
r
,
b
) = (, )) leaving the other paths xed. This denes ,
and it is then straightforward to check that is the identity.
9.7 Wilsons algorithm and spanning trees
Kirchho was the rst to relate the number of spanning trees of a graph to a determinant. Here we
derive a number of these results. We use a more recent technique, Wilsons algorithm, to establish
the results. This algorithm is an ecient method to produce spanning trees from the uniform
distribution using loop-erased random walk. We describe it in the proof of the next proposition.
The basic reason why this algorithm works is that the product on the right-hand side of (9.3) is
independent of the ordering of the vertices.
Proposition 9.7.1 Suppose #(A) = n < and q are transition probabilities for an irreducible
Markov chain on A. Then
T
q(T ; x
0
) =
1
F(A x
0
)
. (9.12)
Proof We will describe an algorithm due to David Wilson that chooses a spanning tree at random.
Let A = x
0
, . . . , x
n1
.
Start the Markov chain at x
1
and let it run until it reaches x
0
. Take the loop-erasure of the
set of points visited, [
0
= x
1
,
1
, . . . ,
i
= x
0
]. Add the edges [
0
,
1
], [
1
,
2
], . . . , [
i1
,
i
]
to the tree.
If the edges form a spanning tree we stop. Otherwise, we let j be the smallest index such
that x
j
is not a vertex in the tree. Start a random walk at x
j
and let it run until it reaches
one of the vertices that has already been added. Perform loop-erasure on this path and add
the edges in the loop-erasure to the tree.
Continue until all vertices have been added to the tree.
We claim that for any tree T , the probability that T is output in this algorithm is
q(T ; x
0
) F(A x
0
). (9.13)
9.7 Wilsons algorithm and spanning trees 215
The result (9.12) follows immediately. To prove (9.13), suppose that a spanning tree T is given.
Then this gives a collection of self-avoiding paths:
1
= [y
1,1
= x
1
, y
1,2
, . . . , y
1,k
1
, z
1
= x
0
]
2
= [y
2,1
, y
2,2
, . . . , y
2,k
2
, z
2
]
.
.
.
m
= [y
m,1
, y
m,2
, . . . , y
m,km
, z
m
] .
Here
1
is the unique self-avoiding path in the tree from x
1
to x
0
; for j > 1, y
j,1
is the vertex of
smallest index (using the ordering x
0
, x
1
, . . . , x
n1
) that has not been listed so far; and
j
is the
unique self-avoiding path from y
j,1
to a vertex z
j
in
1
j1
. Then the probability that T is
chosen is exactly the product of the probabilities that
if a random walk starting at x
1
is stopped at x
0
, the loop-erasure is
1
;
if a random walk starting at y
2,1
is stopped at
1
, then the loop-erasure is
2
.
.
.
if a random walk starting at y
m,1
is stopped at
1
m1
, then the loop-erasure is
m
.
With this decomposition, we can now use (9.5) and (9.3), we obtain (9.13).
Corollary 9.7.2 If C
n
denotes the number of spanning trees of a connected graph with vertices
x
0
, x
1
, . . . , x
n1
, then
log C
n
=
n1
j=1
log d(x
j
) log F(A x
0
)
=
n1
j=1
log d(x
j
) + lim
1
[log g(; x
0
) ()].
Here the implicit q is the transition probability for simple random walk on the graph and d(x
j
)
denotes the degree of x
j
. If C
n
is a transitive graph of degree d,
log C
n
= (n 1) log d log n + lim
1
[log g() ()]. (9.14)
Proof For simple random walk on the graph, for all T ,
q(T ; x
0
) =
_
_
n1
j=1
d(x
j
)
_
_
1
.
In particular, it is the same for all trees, and (9.12) implies that the number of spanning trees is
[q(T ; x
0
) F(A x
0
)]
1
=
_
_
n1
j=1
d(x
j
)
_
_
F(A x
0
)
1
.
216 Loop Measures
The second equality follows from Proposition 9.3.4 and the relation () = F(x; ). If the graph
is transitive, then g() = ng(; x
0
), from which (9.14) follows.
If we take a connected graph and add any number of self-loops at vertices, this does not change the number
of spanning trees. The last corollary holds regardless of how many self-loops are added. Note that adding self-loops
aects both the value of the degree and the value of F(A x
0
).
Proposition 9.7.3 Suppose A is a nite, connected graph with n vertices and maximal degree d,
and P is the transition matrix for the lazy random walk on A as in Section 9.2.1. Suppose the
eigenvalues of P are
1
= 1,
2
, . . . ,
n
.
Then the number of spanning trees of A is
d
n1
n
1
n
j=2
(1
j
).
Proof Since the invariant probability is 1/n, Proposition 9.3.5 tells us that for each x A,
1
F(A x)
= n
1
n
j=2
(1
j
).
The values 1
j
are the eigenvalues for the (negative of the) Laplacian I Q for simple random walk
on the graph. In graph theory, it is more common to dene the Laplacian to be d(I Q). When looking at
formulas, it is important to know which denition of the Laplacian is being used.
9.8 Examples
9.8.1 Complete graph
The complete graph on a collection of vertices is the graph with all (distinct) vertices adjacent.
Proposition 9.8.1 The number of spanning trees of the complete graph on A = x
0
, . . . , x
n1
is
n
n2
.
Proof Consider the Markov chain with transition probabilities q(x, y) = 1/n for all x, y. Let
A
j
= x
j
, . . . , x
n1
. The probability that the chain starting at x
j
has its rst visit (after time
zero) to x
0
, . . . , x
j
at x
j
is 1/(j +1) since each vertex is equally likely to be the rst one visited.
Using the interpretation of F
x
j
(A
j
) as the reciprocal of the probability that the chain starting at
x
j
visits x
0
, . . . , x
j1
before returning to x
j
we see that
F
x
j
(A
j
) =
j + 1
j
, j = 1, . . . , n 1
9.8 Examples 217
and hence (9.3) gives
F(A x
0
) = n.
With the self-loops, each vertex has degree n and hence for each spanning tree
q(T ; x
0
) = n
(n1)
.
Therefore the number of spanning trees is
[q(T ; x
0
) F(A x
0
)]
1
= n
n2
.
9.8.2 Hypercube
The hypercube A
n
is the graph whose vertices are 0, 1
n
with vertices adjacent if they agree in all
but one component.
Proposition 9.8.2 If C
n
denotes the number of spanning trees of the hypercube A
n
:= 0, 1
n
,
then
log C
n
:= (2
n
n 1) log 2 +
n
k=1
_
n
k
_
log k.
By (9.14), Proposition 9.8.2 is equivalent to
lim
1
[log g() ()] = (2
n
1) log n + (2
n
1) log 2 +
n
k=1
_
n
k
_
log k.
where g is the cycle generating function for simple random walk on A
n
. The next proposition
computes g.
Proposition 9.8.3 Let g be the cycle generating function for simple random walk on the hypercube
A
n
. Then
g() =
n
j=0
_
n
j
_
n
n (n 2j)
= 2
n
+
n
j=0
_
n
j
_
(n 2j)
n (n 2j)
.
Proof [of Proposition 9.8.2 given Proposition 9.8.3] Note that
() =
_
0
g(s) 2
n
s
ds
=
_
0
_
_
n
j=0
_
n
j
_
n 2j
n s(n 2j)
_
_
ds
= (2
n
1) log n log(1 )
n
j=1
_
n
j
_
log[n (n 2j)].
218 Loop Measures
Let us write = 1 so that
log g() = log
_
_
n
j=0
_
n
j
_
n
n + 2j(1 )
_
_
,
() = (2
n
1) log n log
n
j=1
_
n
j
_
log[n + (1 ) 2j].
As 0+,
log g() = log(1/) + log
_
_
n
j=0
_
n
j
_
n
n + 2j
1
_
_
= log() +o(1),
() = (2
n
1) log n log
n
j=1
_
n
j
_
log(2j) +o(1)
and hence
lim
1
[log g() ()] = (1 2
n
) log n + (2
n
1) log 2 +
n
j=1
_
n
j
_
log j,
which is what we needed to show.
The remainder of this subsection will be devoted to proving Proposition 9.8.3. Let L(n, 2k)
denote the number of cycles of length 2k in A
n
. By denition L(n, 0) = 2
n
. Let g
n
denote the
generating function on A
n
using weights 1 (instead of 1/n) on the edges on the graph and zero
otherwise,
g
n
() =
||
=
k=0
L(n, 2k)
2k
.
Then if g is as in Proposition 9.8.3, g() = g
n
(/n). Then Proposition 9.8.3 is equivalent to
g
n
() =
n
j=0
_
n
j
_
1
1 (n 2j)
, (9.15)
which is what we will prove. By convention we set L(0, 0) = 1; L(0, k) = 0 for k > 0, and hence
g
0
() =
k=0
L(0, 2k)
2k
= 1,
which is consistent with (9.15).
Lemma 9.8.4 If n, k 0,
L(n + 1, 2k) = 2
k
j=0
_
2k
2j
_
L(n, 2j).
9.8 Examples 219
Proof This is immediate for n = 0 since L(1, 2k) = 2 for every k 0. For n 1, consider any cycle
in A
n+1
of length 2k. Assume that there are 2j steps that change one of the rst n components
and 2(k j) that change the last component. There are
_
2k
2j
_
ways to choose which 2j steps make
changes in the rst n components. Given this choice, there are L(n, 2j) ways of moving in the rst
n components. The movement in the last component is determined once the initial value of the
(n + 1)-component is chosen; the 2 represents the fact that this initial value can equal 0 or 1.
Lemma 9.8.5 For all n 0,
g
n+1
() =
1
1
g
n
_
1
_
+
1
1 +
g
n
_
1 +
_
.
Proof
g
n+1
() =
k=0
L(n + 1, 2k)
2k
= 2
k=0
k
j=0
_
2k
2j
_
L(n, 2j)
2k
= 2
j=0
L(n, 2j)
k=0
_
2j + 2k
2j
_
2j+2k
=
2
1
j=0
L(n, 2j)
_
1
_
2j
k=0
_
2j + 2k
2j
_
(1 )
2j+1
2k
Using the identity (see Exercise 9.4).
k=0
_
2j + 2k
2j
_
p
2j+1
(1 p)
2k
=
1
2
+
1
2
_
p
2 p
_
2j+1
, (9.16)
we see that g
n+1
() equals
1
1
j=0
L(n, 2j)
_
1
_
2j
+
1
1 +
j=0
L(n, 2j)
_
1 +
_
2j
,
which gives the result.
Proof [Proof of Proposition 9.8.3] Setting = (n +)
1
, we see that it suces to show that
g
n
_
1
n +
_
=
n
j=0
_
n
j
_
+n
+ 2j
. (9.17)
This clearly holds for n = 0. Let H
n
() = g
n
(). Then the previous lemma gives the recursion
relation
H
n+1
_
1
n + 1 +
_
= H
n
_
1
n +
_
+H
n
_
1
n + 2 +
_
.
220 Loop Measures
Hence by induction we see that
H
n
_
1
n +
_
=
n
j=0
_
n
j
_
1
+ 2j
.
9.8.3 Sierpinski graphs
In this subsection we consider the Sierpinski graphs which is a sequence of graphs V
0
, V
1
, . . . dened
as follows. V
0
is a triangle, i.e., a complete graph on three vertices. For n > 0, V
n
will be a graph
with 3 vertices of degree 2 (which we call the corner vertices) and [3
n+1
3]/2 vertices of degree
4. We dene the graph inductively. Suppose we are given three copies of V
n1
, V
(1)
n1
, V
(2)
n1
, V
(3)
n1
,
with corner vertices x
(1)
1
, x
(1)
2
, x
(1)
3
, . . . , x
(3)
1
, x
(3)
2
, x
(3)
3
. Then V
n
is obtained from these three copies
by identifying the vertex x
(k)
j
with the vertex x
(j)
k
. We call the graphs V
n
the Sierpinski graphs.
Proposition 9.8.6 Let C
n
denote the number of spanning trees of the Sierpinski graph V
n
. Then
C
n
satises the recursive equation
C
n+1
= 2 (5/3)
n
C
3
n
. (9.18)
Hence,
C
n
= (3/20)
1/4
(3/5)
n/2
(540)
3
n
/4
. (9.19)
Proof It is clear that C
0
= 3, and a simple induction argument shows that the solution to (9.18)
with C
0
= 3 is given by (9.19). Hence we need to show the recursive equation C
n+1
= 2 (5/3)
n
C
3
n
.
x
0
x
1
x
2
x
3
x
4
x
5
For n 1, we will write V
n
= x
0
, x
1
, x
2
, x
3
, x
4
, x
5
, . . . , x
Mn
where M
n
= [3
n
+ 3]/2, x
0
, x
1
, x
2
are corner vertices of V
n
and x
3
, x
4
, x
5
are the other vertices that are corner vertices for the three
copies of V
n1
. They are chosen so that x
3
lies between x
0
, x
1
; x
4
between x
1
, x
2
; x
5
between x
2
, x
0
.
9.9 Spanning trees of subsets of Z
2
221
Using Corollary 9.7.2 and Lemma 9.3.2, we can write C
n
=
n
J
n
where
n
=
Mn
j=1
d(x
j
), J
n
=
Mn
j=1
p
j,n
.
Here p
j,n
denotes the probability that simple random walk in V
n
started at x
j
returns to x
j
before
visiting x
0
, . . . , x
j1
. Note that
n
= 2
2
4
(3
n+1
3)/2
,
and hence
n+1
= 4
3
n
. Hence we need to show that
J
n+1
= (1/2) (5/3)
n
J
3
n
. (9.20)
We can write
J
n+1
= p
1,n+1
p
2,n+1
J
n+1
where J
n+1
denotes the product over all the other vertices (the non-corner vertices). From this, we
see that
J
n+1
= p
1,n+1
p
2,n+1
p
5,n+1
(J
n
)
3
=
p
1,n+1
p
2,n+1
p
5,n+1
p
3
1,n
p
3
2,n
J
3
n
.
The computations of p
j,n
are straightforward computations familiar to those who study random
walks on the Sierpinski gasket and are easy exercises in Markov chains. We give the answers here,
leaving the details to the reader. By induction on n one can show that p
2,n+1
= (3/5) p
2,n
and from
this one can see that
p
2,n+1
=
_
3
5
_
n+1
, p
1,n
=
3
4
_
3
5
_
n+1
.
Also,
p
5,n+1
= p
2,n
=
_
3
5
_
n
, p
4,n+1
=
15
16
_
3
5
_
n
, p
3,n+1
=
5
6
_
3
5
_
n
.
This gives (9.20).
9.9 Spanning trees of subsets of Z
2
Suppose A Z
2
is nite, and let e(A) denote the set of edges with at least one vertex in A. We
write e(A) =
e
A e
o
(A) where
e
A denotes the boundary edges with one vertex in A and
e
o
(A) = e(A)
e
A, the interior edges. There will be two types of spanning trees of A, we will
consider.
Free. A collection of #(A) 1 edges from e
0
(A) such that the corresponding graph is connected.
Wired. The set of vertices is A where denotes the boundary. The edges of the graph
are the same as e(A) except that each edge in
e
A is replaced with an edge connecting the point
in A to . (There can be more than one edge connecting a vertex in A to .) A wired spanning
tree is a collection of edges from e(A) such that the corresponding subgraph of A is a
spanning tree. Such a tree has #(A) edges.
222 Loop Measures
In both cases, we will nd the number of trees by considering the Markov chain given by simple
random walk in Z
2
. The dierent spanning trees correspond to dierent boundary conditions for
the random walks.
Free. The lazy walker on A as described in Section 9.2.1, i.e.,
q(x, y) =
1
4
, x, y A, [x y[ = 1,
and q(x, x) = 1
y
q(x, y).
Wired. Simple random walk on A killed when it leaves A, i.e.,
q(x, y) =
1
4
, x, y A, [x y[ = 1,
and q(x, x) = 0. Equivalently, we can consider this as the Markov chain on A where is
an absorbing point and
q(x, ) = 1
yA
q(x, y).
In other words, free spanning trees correspond to reecting or Neumann boundary conditions and wired
spanning trees correspond to Dirichlet boundary conditions.
We let F(A) denote the quantity for the wired case. This is the same as F(A) for simple random
walk in Z
2
. If x A, we write F
(Ax) for the corresponding quantity for the lazy walker. (The
lazy walker is a Markov chain on A and hence F
j=1
(1
j
),
where
1
, . . . ,
n
denote the eigenvalues of Q
A
= [q(x, y)]
x,yA
.
Proof This is a particular case of Corollary 9.7.2 using the graph A and x
0
= . See also
Proposition 9.3.3.
Proposition 9.9.2 Suppose
1
= 1, . . . ,
n
are the eigenvalues of the transition matrix for the lazy
walker on a nite, connected A Z
2
of cardinality n. Then the number of spanning trees of A is
4
n1
n
1
n
j=2
(1
j
).
Proof This is a particular case of Proposition 9.7.3.
9.9 Spanning trees of subsets of Z
2
223
Recall that
log F(A) =
L(A),||1
1
4
||
[[
=
xA
n=1
1
2n
P
x
S
2n
= 0; S
j
A, j = 1, . . . , 2n. (9.21)
The rst order term in an expansion of log F(A) is
xA
n=1
1
2n
P
x
S
2n
= 0,
which ignores the restriction that S
j
A, j = 1, . . . , 2n. The actual value involves a well known
constant C
cat
called Catalans constant. There are many equivalent denitions of this constant. For
our purposes we can use the following
C
cat
=
2
log 2
4
n=1
1
2n
4
2n
_
2n
n
_
2
= .91596 .
Proposition 9.9.3 If S = (S
1
, S
2
) is simple random walk in Z
2
, then
n=1
1
2n
PS
2n
= 0 = log 4
4
C
cat
,
where C
cat
denotes Catalans constant. In particular, if A Z
2
is nite,
log F(A) = [log 4 (4/) C
cat
] #(A)
xA
(x; A), (9.22)
where
(x; A) =
n=1
1
2n
P
x
S
2n
= 0; S
j
, A for some 0 j 2n.
Proof Using Exercise 1.7, we get
n=1
1
2n
PS
2n
= 0 =
n=1
1
2n
[PS
1
2n
= 0]
2
=
n=1
1
2n
4
2n
_
2n
n
_
2
.
Since PS(2n) = 0 c n
1
, we can see that the sum is nite. The exact value follows from our
(conveniently chosen) denition of C
cat
. The last assertion then follows from (9.21).
Lemma 9.9.4 There exists c < such that if A Z
2
, x A, and (x; A) is dened as in
Proposition 9.9.3, then
(x; A)
c
dist(x, A)
2
.
Proof We only sketch the argument leaving the details as Exercise 9.5. Let r = dist(x, A). Since
it takes about r
2
steps to reach A, the loops with fewer than that many steps rooted at x tend
not to leave A. Hence (x; A) is at most of the order of
nr
2
1
2n
PS
2n
= 0
nr
2
n
2
r
2
.
224 Loop Measures
Using this and (9.22) we immediately get the following.
Proposition 9.9.5 Suppose A
n
is a sequence of nite, connected subsets of Z
2
satisfying the
following condition (that roughly means measure of the boundary goes to zero). For every r > 0,
lim
n
#x A
n
: dist(x, A
n
) r
#(A
n
)
= 0.
Then,
lim
n
log F(A
n
)
#(A
n
)
= log 4
4C
cat
.
Suppose A
m,n
is the (m1) (n 1) discrete rectangle,
A
m,n
= x +iy : 1 x m1, 1 y n 1.
Note that
#(A
m,n
) = (m1) (n 1), #(A
m,n
) = 2 (m1) + 2 (n 1).
Theorem 9.9.6
4
(m1)(n1)
F(A
m,n
)
e
4Ccatmn/
(
2 1)
m+n
n
1/2
. (9.23)
More precisely, for every b (0, ) there is a c
b
< such that if b
1
m/n b then both sides
of (9.23) are bounded above by c
b
times the other side. In particular, if C
m,n
denotes the number
of wired spanning trees of A
m,n
,
log C
mn
=
4C
cat
mn + log(
2 1) (m +n)
1
2
log n +O(1)
=
4C
cat
#(A
m,n
) +
_
2C
cat
+
1
2
log(
2 1)
_
#(A
m,n
)
1
2
log n +O(1).
Although our proof will use the exact values of the eigenvalues, it is useful to consider the result in terms
of (9.22). The dominant term is already given by (9.22). The correction comes from loops rooted in A
m,n
that
leave A. The biggest contribution to these comes from points near the boundary. It is not surprising then that
the second term is proportional to the number of points on the boundary. The next correction to this comes from
the corners of the rectangle. This turns out to contribute a logarithmic term and after that all other correction
terms are O(1). We arbitrarily write log n rather than log m; note that log m = log n +O(1).
Proof The expansion for log C
m,n
follows immediately from Proposition 9.9.1 and (9.23), so we only
need to establish (9.23). The eigenvalues of I Q
A
can be given explicitly (see Section 8.2),
1
1
2
_
cos
_
j
m
_
+ cos
_
k
n
__
, j = 1, . . . , m1; k = 1, . . . , n 1,
9.9 Spanning trees of subsets of Z
2
225
with corresponding eigenfunctions
f(x, y) = sin
_
jx
m
_
sin
_
ky
n
_
,
where the eigenfunctions have been chosen so that f 0 on A
m,n
. Therefore,
log F(A
m,n
) = log det[I Q
A
] =
m1
j=1
n1
k=1
log
_
1
1
2
_
cos
_
j
m
_
+ cos
_
k
n
___
.
Let
g(x, y) = log
_
1
cos(x) + cos(y)
2
_
.
Then (mn)
1
log det[I Q
A
] is a Riemann sum approximation of
1
2
_
0
_
0
g(x, y) dxdy.
To be more precise, Let V (j, k) = V
m,n
(j, k) denote the rectangle of side lengths /m and /n
centered at (j/m) +i(k/n). Then we will consider
J(j, k) :=
1
mn
g
_
j
m
,
k
n
_
=
1
mn
_
1
1
2
_
cos
_
j
m
_
+ cos
_
k
n
___
as an approximation to
1
2
_
V (j,k)
g(x, y) dxdy.
Note that
V =
m1
_
j=1
n1
_
k=1
V (j, k) =
_
x +iy :
2m
x
_
1
1
2m
_
,
2n
y
_
1
1
2n
__
.
One can show (using ideas as in Section 12.1.1, details omitted),
log det[I Q
A
] = mn
_
V
g(x, y) dxdy +O(1).
Therefore,
log det[I Q
A
] = mn
_
_
[0,]
2
g(x, y) dxdy
_
[0,]
2
\V
g(x, y) dxdy
_
+O(1).
The result will follow if we show that
mn
_
[0,]
2
\V
g(x, y) dxdy = (m+n) log 4 (m+n) log(1
2) +
1
2
log n +O(1).
We now estimate the integral over [0, ]
2
V which we write as the sum of integrals over four
thin strips minus the integrals over the corners that are doubly counted. One can check (using
an integral table, e.g.) that
1
_
0
log
_
1
cos x + cos y
2
_
dy = 2 log 2 + log[2 cos x +
_
2(1 cos x) + (1 cos x)
2
].
226 Loop Measures
Then,
1
2
_
0
_
0
log
_
1
cos x + cos y
2
_
dy dx =
2
log 2 +
2
2
+O(
3
).
If we choose = /(2m) or = /2n, this gives
mn
2
_
/(2m)
0
_
0
log
_
1
cos x + cos y
2
_
dy dx = m log 2 +O(1).
mn
2
_
0
_
/(2n)
0
log
_
1
cos x + cos y
2
_
dy dx = n log 2 +O(1).
Similarly,
1
2
_
_
0
log
_
1
cos x + cos y
2
_
dy dx =
2
log 2 +
log[3 + 2
2] +O(
3
)
=
2
log 2
2
log[
2 1] +O(
3
),
which gives
mn
2
_
2m
_
0
log
_
1
cos x + cos y
2
_
dy dx = n log 2 nlog[
2 1] +O(n
1
),
mn
2
_
0
_
2n
log
_
1
cos x + cos y
2
_
dy dx = m log 2 mlog[
2 1] +O(n
1
).
The only nontrivial corner term comes from
_
0
_
0
log
_
1
cos x + cos y
2
_
dxdy = 2 log() +O( ).
Therefore,
mn
2
_
2m
0
_
2n
0
log
_
1
cos x + cos y
2
_
dxdy =
1
2
log n +O(1).
All of the other corners give O(1) terms.
Combining it all, we get
m1
j=1
n1
k=1
log
_
1
1
2
_
cos
_
j
m
_
+ cos
_
k
n
___
equals
Imn + (m +n) log 4 + (m+n) log[1
2]
1
2
log n +O(1).,
where
I =
1
2
_
0
_
0
log
_
1
cos x + cos y
2
_
dydx.
9.9 Spanning trees of subsets of Z
2
227
Proposition 9.9.5 tells us that
I =
4 C
cat
log 4.
Theorem 9.9.6 allows us to derive some constants for simple random walk that are hard to show
directly. Write (9.23) as
log F(A
m,n
) = B
1
mn +B
2
(m +n) +
1
2
log n +O(1), (9.24)
where
B
1
= log 4
4C
cat
, B
2
= log(
2 + 1) log 4.
The constant $B_1$ was obtained by considering the rooted loop measure and $B_2$ was obtained from the exact value of the eigenvalues. Recall from (9.3) that if we enumerate
\[ A_{m,n} = \{x_1, x_2, \dots, x_K\}, \qquad K = (m-1)\,(n-1), \]
then
\[ \log F(A_{m,n}) = \sum_{j=1}^{K} \log F_{x_j}\big(A_{m,n}\setminus\{x_1,\dots,x_{j-1}\}\big), \]
and $F_x(V)$ is the expected number of visits to $x$ for a simple random walk starting at $x$ before leaving $V$. We will define the lexicographic order of $\mathbb{Z}+i\mathbb{Z}$ by $x+iy \prec x_1 + iy_1$ if $x < x_1$, or $x = x_1$ and $y < y_1$.

Proposition 9.9.7 If
\[ V = \{x + iy : y > 0\} \cup \{0, 1, 2, \dots\}, \]
then
\[ F_0(V) = 4\, e^{-4 C_{\rm cat}/\pi}. \]

Proof Choose the lexicographic order for $A_{n,n}$. Then one can show that
\[ F_{x_j}\big(A_{n,n}\setminus\{x_1,\dots,x_{j-1}\}\big) = F_0(V)\,[1 + \text{error}], \]
where the error term is small for points away from the boundary. Hence
\[ \log F(A_{n,n}) = \#(A_{n,n})\,\log F_0(V)\,[1 + o(1)], \]
which implies $\log F_0(V) = B_1$ as in (9.24).

Proposition 9.9.8 Let $V \subset \mathbb{Z}+i\mathbb{Z}$ be the subset
\[ V = (\mathbb{Z}+i\mathbb{Z})\setminus\{\dots,-2,-1\}. \]
Then $F_0(V) = 4\,(\sqrt2 - 1)$.

Proof Consider
\[ A = A_n = \{x + iy : x = 1,\dots,n-1;\ -(n-1) \le y \le n-1\}. \]
Then $A$ is a translation of $A_{2n,n}$ and hence (9.24) gives
\[ \log F(A) = 2\,B_1\, n^2 + 3\,B_2\, n + \frac12\log n + O(1). \]
Order $A$ so that the first $n-1$ vertices of $A$ are $1, 2, \dots, n-1$ in order. Then, we can see that
\[ \log F(A) = \Big[\sum_{j=1}^{n-1}\log F_j\big(A\setminus\{1,\dots,j-1\}\big)\Big] + 2\log F(A_{n,n}). \]
Using (9.24) again, we see that
\[ 2\log F(A_{n,n}) = 2\,B_1\, n^2 + 4\,B_2\, n + \log n + O(1), \]
and hence
\[ \sum_{j=1}^{n-1}\log F_j\big(A\setminus\{1,\dots,j-1\}\big) = -B_2\, n - \frac12\log n + O(1). \]
Now we use the fact that
\[ \log F_j\big(A\setminus\{1,\dots,j-1\}\big) = \log F_0(V)\,[1 + \text{error}], \]
where the error term is small for points away from the boundary, to conclude that $F_0(V) = e^{-B_2}$.
Let $\bar A_{m,n}$ be the $m\times n$ rectangle
\[ \bar A_{m,n} = \{x + iy : 0 \le x \le m-1,\ 0 \le y \le n-1\}. \]
Note that
\[ \#(\bar A_{m,n}) = mn, \qquad \#(\partial \bar A_{m,n}) = 2(m+n). \]
Let $\bar C_{m,n}$ denote the number of (free) spanning trees of $\bar A_{m,n}$.

Theorem 9.9.9
\[ \bar C_{m,n} \asymp e^{4 C_{\rm cat}\, mn/\pi}\,(\sqrt2 - 1)^{m+n}\, n^{-1/2}. \]
More precisely, for every $b \in (0,\infty)$ there is a $c_b < \infty$ such that if $b^{-1} \le m/n \le b$, then each side of the relation above is bounded above by $c_b$ times the other side.
Proof We claim that the eigenvalues for the lazy walker Markov chain on $\bar A_{m,n}$ are
\[ 1 - \frac12\Big[\cos\frac{j\pi}{m} + \cos\frac{k\pi}{n}\Big], \qquad j = 0,\dots,m-1;\ k = 0,\dots,n-1, \]
with corresponding eigenfunctions
\[ f(x,y) = \cos\Big(\frac{j\pi(x+\frac12)}{m}\Big)\,\cos\Big(\frac{k\pi(y+\frac12)}{n}\Big). \]
Indeed, these are eigenvalues and eigenfunctions for the usual discrete Laplacian, but the eigenfunctions have been chosen to have boundary conditions
\[ f(0,y) = f(-1,y),\quad f(m-1,y) = f(m,y),\quad f(x,0) = f(x,-1),\quad f(x,n-1) = f(x,n). \]
For this reason we can see that they are also eigenvalues and eigenfunctions for the lazy walker. Using Proposition 9.9.2, we have
\[ \bar C_{m,n} = \frac{4^{mn-1}}{mn} \prod_{(j,k)\ne(0,0)} \Big[ 1 - \frac12\Big(\cos\frac{j\pi}{m} + \cos\frac{k\pi}{n}\Big)\Big]. \]
Recall that if $F(A_{m,n})$ is as in Theorem 9.9.6, then
\[ \frac{1}{F(A_{m,n})} = \prod_{1\le j\le m-1,\ 1\le k\le n-1} \Big[1 - \frac12\Big(\cos\frac{j\pi}{m}+\cos\frac{k\pi}{n}\Big)\Big]. \]
Therefore,
\[ \bar C_{m,n} = \frac{4^{(m-1)(n-1)}}{F(A_{m,n})}\; \frac{4^{m+n-2}}{mn}\, \Big[\prod_{j=1}^{n-1}\Big(\frac12 - \frac12\cos\frac{j\pi}{n}\Big)\Big]\, \Big[\prod_{j=1}^{m-1}\Big(\frac12 - \frac12\cos\frac{j\pi}{m}\Big)\Big]. \]
Using (9.23), we see that it suffices to prove that
\[ \frac{4^n}{n}\prod_{j=1}^{n-1}\Big(\frac12 - \frac12 \cos\frac{j\pi}{n}\Big) \asymp 1, \]
or equivalently,
\[ \sum_{j=1}^{n-1}\log\Big[1 - \cos\frac{j\pi}{n}\Big] = -n \log 2 + \log n + O(1). \tag{9.25} \]
To establish (9.25), note that
\[ \frac1n \sum_{j=1}^{n-1}\log\Big[1 - \cos\frac{j\pi}{n}\Big] \]
is a Riemann sum approximation of
\[ \frac1\pi \int_0^{\pi} f(x)\, dx, \]
where $f(x) = \log[1 - \cos x]$. Note that
\[ f'(x) = \frac{\sin x}{1 - \cos x}, \qquad f''(x) = -\frac{1}{1-\cos x}. \]
In particular, $|f''(x)| \le c\, x^{-2}$. Using this we can see that
\[ \frac1n \log\Big[1 - \cos\frac{j\pi}{n}\Big] = \frac1n\, O(j^{-2}) + \frac1\pi \int_{\frac{j\pi}{n} - \frac{\pi}{2n}}^{\frac{j\pi}{n}+\frac{\pi}{2n}} f(x)\, dx. \]
Therefore,
\[ \frac1n\sum_{j=1}^{n-1}\log\Big[1-\cos\frac{j\pi}{n}\Big] = O(n^{-1}) + \frac1\pi\int_{\pi/(2n)}^{\pi - \pi/(2n)} f(x)\,dx = O(n^{-1}) + \frac1\pi\int_0^{\pi} f(x)\,dx - \frac1\pi\int_0^{\pi/(2n)} f(x)\,dx = O(n^{-1}) - \log 2 + \frac1n\log n. \]
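As a side check (ours, not from the text), the eigenvalue product in the proof can be compared with a direct matrix-tree computation for a small rectangle. A minimal sketch assuming numpy; the function names are illustrative only.

```python
import numpy as np

def trees_by_eigenvalues(m, n):
    # the product formula from the proof: (4^{mn-1}/(mn)) * prod over (j,k) != (0,0)
    j, k = np.arange(m), np.arange(n)
    lam = 1 - 0.5 * (np.cos(np.pi * j / m)[:, None] + np.cos(np.pi * k / n)[None, :])
    lam = lam.ravel()[1:]                  # drop the (0,0) eigenvalue, which is 0
    return 4.0 ** (m * n - 1) / (m * n) * np.prod(lam)

def trees_by_matrix_tree(m, n):
    # Kirchhoff's theorem: number of spanning trees = any cofactor of the graph Laplacian
    V = [(x, y) for x in range(m) for y in range(n)]
    idx = {v: i for i, v in enumerate(V)}
    L = np.zeros((m * n, m * n))
    for (x, y) in V:
        for (dx, dy) in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            if (x + dx, y + dy) in idx:
                L[idx[(x, y)], idx[(x, y)]] += 1
                L[idx[(x, y)], idx[(x + dx, y + dy)]] -= 1
    return np.linalg.det(L[1:, 1:])        # delete one row and one column

print(trees_by_eigenvalues(3, 4), trees_by_matrix_tree(3, 4))   # the two computations agree
```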
9.10 Gaussian free field

We introduce the Gaussian free field. In this section we assume that $q$ is a symmetric transition probability on the space $\mathcal{A}$. Some of the definitions below are straightforward extensions of definitions for random walk on $\mathbb{Z}^d$.

We say $e = \{x,y\}$ is an edge if $q(e) := q(x,y) > 0$.

If $A \subset \mathcal{A}$, let $e(A)$ denote the set of edges with at least one vertex in $A$. We write $e(A) = e_\partial(A) \cup e_o(A)$, where $e_\partial(A)$ are the edges with one vertex in $A$ and $e_o(A)$ are the edges with both vertices in $A$.

We let
\[ \partial A = \{y \in \mathcal{A}\setminus A : q(x,y) > 0 \text{ for some } x \in A\}, \qquad \bar A = A \cup \partial A. \]
If $f : \mathcal{A} \to \mathbb{R}$ and $x \in \mathcal{A}$, then
\[ \Delta f(x) = \sum_{y} q(x,y)\,[f(y) - f(x)]. \]
We say that $f$ is harmonic at $x$ if $\Delta f(x) = 0$, and $f$ is harmonic on $A$ if $\Delta f(x) = 0$ for all $x \in A$.

If $e \in e(A)$, we set $\nabla_e f = f(y) - f(x)$ where $e = \{x,y\}$. This defines $\nabla_e f$ up to a sign. Note that $\nabla_e f\,\nabla_e g$ is well defined.

Throughout this section we assume that $A \subset \mathcal{A}$ with $\#(A) < \infty$.

If $f, g : \bar A \to \mathbb{R}$ are functions, then we define the energy or Dirichlet form $\mathcal{E}$ to be the quadratic form
\[ \mathcal{E}_A(f,g) = \sum_{e\in e(A)} q(e)\, \nabla_e f\, \nabla_e g. \]
We let $\mathcal{E}_A(f) = \mathcal{E}_A(f,f)$.

Lemma 9.10.1 (Green's formula) Suppose $f, h : \bar A \to \mathbb{R}$. Then,
\[ \mathcal{E}_A(f,h) = -\sum_{x\in A} f(x)\,\Delta h(x) + \sum_{x\in\partial A}\sum_{y\in A} f(x)\,[h(x)-h(y)]\, q(x,y). \tag{9.26} \]
If $h$ is harmonic in $A$,
\[ \mathcal{E}_A(f,h) = \sum_{x\in\partial A}\sum_{y\in A} f(x)\,[h(x)-h(y)]\,q(x,y). \tag{9.27} \]
If $f \equiv 0$ on $\partial A$,
\[ \mathcal{E}_A(f,h) = -\sum_{x\in A} f(x)\,\Delta h(x). \tag{9.28} \]
If $h$ is harmonic in $A$ and $f \equiv 0$ on $\partial A$, then $\mathcal{E}_A(f,h) = 0$ and hence
\[ \mathcal{E}_A(f+h) = \mathcal{E}_A(f) + \mathcal{E}_A(h). \tag{9.29} \]

Proof
\[ \mathcal{E}_A(f,h) = \sum_{e\in e(A)} q(e)\,\nabla_e f\,\nabla_e h = \frac12\sum_{x,y\in A} q(x,y)\,[f(y)-f(x)]\,[h(y)-h(x)] + \sum_{x\in\partial A}\sum_{y\in A} q(x,y)\,[f(y)-f(x)]\,[h(y)-h(x)] \]
\[ = -\sum_{x\in A}\sum_{y\in A} q(x,y)\,f(x)\,[h(y)-h(x)] - \sum_{x\in\partial A}\sum_{y\in A} q(x,y)\,f(x)\,[h(y)-h(x)] + \sum_{x\in\partial A}\sum_{y\in A} q(x,y)\,f(y)\,[h(y)-h(x)] \]
\[ = -\sum_{x\in A} f(x)\,\Delta h(x) + \sum_{y\in\partial A}\sum_{x\in A} q(x,y)\,f(y)\,[h(y)-h(x)]. \]
This gives (9.26) and the final three assertions follow immediately.
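Green's formula is easy to verify numerically on a small graph. The following sketch (ours, not from the text; it assumes numpy and uses nearest-neighbor walk on a segment of $\mathbb{Z}$ as the weight $q$) checks identity (9.28) for a function $f$ vanishing on the boundary.

```python
import numpy as np
rng = np.random.default_rng(0)

V = [-1, 0, 1, 2, 3]                     # A-bar = A union its boundary
A = [0, 1, 2]
q = {(x, y): 0.5 for x in V for y in V if abs(x - y) == 1}   # symmetric SRW weights on Z

def lap(h, x):                            # discrete Laplacian at x (all neighbors lie in V)
    return sum(q[(x, y)] * (h[y] - h[x]) for y in V if (x, y) in q)

def energy(f, h):                         # Dirichlet form: edges with at least one vertex in A
    return sum(q[(x, y)] * (f[y] - f[x]) * (h[y] - h[x])
               for x in V for y in V
               if x < y and (x, y) in q and (x in A or y in A))

f = {x: (rng.normal() if x in A else 0.0) for x in V}   # f vanishes on the boundary
h = {x: rng.normal() for x in V}
print(energy(f, h), -sum(f[x] * lap(h, x) for x in A))  # the two numbers agree, as in (9.28)
```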
Suppose $x \in \partial A$ and let $h_x$ denote the function that is harmonic on $A$ with boundary value $\delta_x$ on $\partial A$. Then it follows from (9.27) that
\[ \mathcal{E}_A(h_x) = \sum_{y\in A} [1 - h_x(y)]\, q(x,y). \]
We extend $h_x$ to $\mathcal{A}$ by setting $h_x \equiv 0$ on $\mathcal{A}\setminus \bar A$.

Lemma 9.10.2 Let $Y_j$ be a Markov chain on $\mathcal{A}$ with transition probability $q$. Let
\[ T_x = \min\{j \ge 1 : Y_j = x\}, \qquad \tau_A = \min\{j\ge 1 : Y_j \notin A\}. \]
If $A \subsetneq \mathcal{A}$, $x \notin A$, $A' = A \cup \{x\}$,
\[ \mathcal{E}_{A'}(h_x) = P^x\{T_x \ge \tau_{A'}\} = \frac{1}{F_x(A')}. \tag{9.30} \]

Proof If $y \in A$, then $h_x(y) = P^y\{T_x < \tau_A\}$. Note that
\[ P^x\{T_x < \tau_{A'}\} = q(x,x) + \sum_{y\in A} q(x,y)\, P^y\{T_x < \tau_A\} = q(x,x) + \sum_{y\in A} q(x,y)\, h_x(y). \]
Therefore,
\[ P^x\{T_x \ge \tau_{A'}\} = 1 - P^x\{T_x < \tau_{A'}\} = \sum_{z\notin A'} q(x,z) + \sum_{y\in A} q(x,y)\,[1 - h_x(y)] = -\Delta h_x(x) = -\sum_{y\in A'} h_x(y)\,\Delta h_x(y) = \mathcal{E}_{A'}(h_x). \]
The last equality uses (9.28). The second equality in (9.30) follows from Lemma 9.3.2.
If $v : \partial A \to \mathbb{R}$ and $f : A \to \mathbb{R}$, we write $\mathcal{E}_A(f;v)$ for $\mathcal{E}_A(f_v)$, where $f_v \equiv f$ on $A$ and $f_v \equiv v$ on $\partial A$. If $v$ is omitted, then $v \equiv 0$ is assumed.

The Gaussian free field on $A$ with boundary condition $v$ is the measure on functions $f : A \to \mathbb{R}$ whose density with respect to Lebesgue measure on $\mathbb{R}^A$ is
\[ (2\pi)^{-\#(A)/2}\, e^{-\mathcal{E}_A(f;v)/2}. \]
If $v \equiv 0$, we call this the field with Dirichlet boundary conditions.

If $A \subset \mathcal{A}$ is finite and $v : \partial A \to \mathbb{R}$, define the partition function
\[ \mathcal{Z}(A;v) = \int (2\pi)^{-\#(A)/2}\, e^{-\mathcal{E}_A(f;v)/2}\, df, \]
where $df$ indicates that this is an integral with respect to Lebesgue measure on $\mathbb{R}^A$. If $v \equiv 0$, we write just $\mathcal{Z}(A)$. By convention, we set $\mathcal{Z}(\emptyset; v) = 1$.

We will give two proofs of the next fact.

Proposition 9.10.3 For any $A \subset \mathcal{A}$ with $\#(A) < \infty$,
\[ \mathcal{Z}(A) = \sqrt{F(A)} = \exp\Big\{\frac12\sum_{\omega\in\mathcal{L}(A)} m(\omega)\Big\}. \tag{9.31} \]
Proof We prove this inductively on the cardinality of $A$. If $A = \emptyset$, the result is immediate. From (9.3), we can see that it suffices to show that if $A \subset \mathcal{A}$ is finite, $x \notin A$, and $A' = A \cup \{x\}$, then
\[ \mathcal{Z}(A') = \mathcal{Z}(A)\,\sqrt{F_x(A')}. \]
Suppose $f : \bar A' \to \mathbb{R}$ with $f \equiv 0$ on $\partial A'$. We can write
\[ f = g + t\, h, \]
where $g$ vanishes on $\partial A' \cup \{x\}$; $t = f(x)$; and $h$ is the function that is harmonic on $A$ with $h(x) = 1$ and $h \equiv 0$ on $\partial A' \setminus \{x\}$. Since $g$ vanishes at $x$ and on $\partial A'$, the edges in $e(A')\setminus e(A)$, all of which join $x$ to $\partial A'$, contribute nothing to $\mathcal{E}_{A'}(g)$, so that
\[ \mathcal{E}_{A'}(g) = \mathcal{E}_A(g). \tag{9.32} \]
Also, as in (9.29) (the cross term vanishes by (9.28), since $g(x) = 0$ and $h$ is harmonic on $A$),
\[ \mathcal{E}_{A'}(f) = \mathcal{E}_{A'}(g) + \mathcal{E}_{A'}(t h) = \mathcal{E}_{A'}(g) + t^2\,\mathcal{E}_{A'}(h), \]
which combined with (9.32) gives
\[ \exp\Big\{-\frac12\,\mathcal{E}_{A'}(f)\Big\} = \exp\Big\{-\frac12\,\mathcal{E}_A(g)\Big\}\,\exp\Big\{-\frac{t^2}{2}\,\mathcal{E}_{A'}(h)\Big\}. \]
Integrating over $\mathbb{R}^A$ first, we get
\[ \mathcal{Z}(A') = \mathcal{Z}(A)\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\, e^{-t^2\,\mathcal{E}_{A'}(h)/2}\, dt = \mathcal{Z}(A)\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\, e^{-t^2/[2 F_x(A')]}\, dt = \mathcal{Z}(A)\,\sqrt{F_x(A')}. \]
The second equality uses (9.30).
Let $Q = Q_A$ as above and denote the entries of $Q^n$ by $q_n(x,y)$. The Green's function on $A$ is the matrix $G = (I-Q)^{-1}$; in other words, the expected number of visits to $y$ by the chain starting at $x$ equals
\[ \sum_{n=0}^{\infty} q_n(x,y), \]
which is the $(x,y)$ entry of $(I-Q)^{-1}$. Since $Q$ is strictly subMarkov, $(I-Q)$ is symmetric, strictly positive definite, and $(I-Q)^{-1}$ is well defined. The next proposition uses the joint normal distribution as discussed in Section 12.3.

Proposition 9.10.4 Suppose the random variables $\{Z_x : x \in A\}$ have a (mean zero) joint normal distribution with covariance matrix $G = (I-Q)^{-1}$. Then the distribution of the random function $f(x) = Z_x$ is the same as the Gaussian free field on $A$ with Dirichlet boundary conditions.

Proof Plugging $\Gamma = G = (I-Q)^{-1}$ into (12.14), we see that the joint density of $\{Z_x\}$ is given by
\[ (2\pi)^{-\#(A)/2}\,[\det(I-Q)]^{1/2}\,\exp\Big\{-\frac{f\cdot(I-Q)f}{2}\Big\}. \]
But (9.28) implies that $f\cdot(I-Q)f = \mathcal{E}_A(f)$. Since this is a probability density, this shows that
\[ \mathcal{Z}(A) = \frac{1}{\sqrt{\det(I-Q)}}, \]
and hence (9.31) follows from Proposition 9.3.3.
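Proposition 9.10.4 gives an immediate recipe for simulating the discrete field. Here is a minimal sketch (ours, not from the text), assuming numpy and taking $A$ to be a square in $\mathbb{Z}^2$ with simple random walk weights; the variable names are illustrative.

```python
import numpy as np
rng = np.random.default_rng(1)

# Gaussian free field with Dirichlet boundary conditions on A = {1,...,N}^2 in Z^2,
# sampled from the covariance matrix G = (I - Q)^{-1} as in Proposition 9.10.4.
N = 10
pts = [(x, y) for x in range(1, N + 1) for y in range(1, N + 1)]
idx = {p: i for i, p in enumerate(pts)}
Q = np.zeros((len(pts), len(pts)))
for (x, y) in pts:
    for (dx, dy) in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        if (x + dx, y + dy) in idx:
            Q[idx[(x, y)], idx[(x + dx, y + dy)]] = 0.25   # steps leaving A are killed

G = np.linalg.inv(np.eye(len(pts)) - Q)          # Green's function of the killed walk
L = np.linalg.cholesky(G)                        # L L^T = G
samples = L @ rng.standard_normal((len(pts), 2000))

center = idx[(N // 2, N // 2)]
print(G[center, center], samples[center].var())  # empirical variance is close to G(x,x)
```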
The scaling limit of the Gaussian free field for random walk in $\mathbb{Z}^d$ is the Gaussian free field in $\mathbb{R}^d$. There are technical subtleties required in the definition. For example, if $d = 2$ and $U$ is a bounded open set, we would like to define the Gaussian free field $\{Z_z : z \in U\}$ with Dirichlet boundary conditions to be the collection of random variables such that each finite collection $(Z_{z_1},\dots,Z_{z_k})$ has a joint normal distribution with covariance matrix $[G_U(z_i,z_j)]$. Here $G_U$ denotes the Green's function for Brownian motion in the domain. However, the Green's function $G_U(z,w)$ blows up as $w$ approaches $z$, so this gives an infinite variance for the random variable $Z_z$. These problems can be overcome, but the collection $\{Z_z\}$ is not a collection of random variables in the usual sense.

The proof of Proposition 9.10.3 is not really needed given the quick proof in Proposition 9.10.4. However, we choose to include it since it uses more directly the loop measure interpretation of $F(A)$ rather than the interpretation as a determinant. Many computations with the loop measure have interpretations in the scaling limit.
Exercises

Exercise 9.1 Show that for all positive integers $k$,
\[ \sum_{j_1+\cdots+j_r = k} \frac{1}{r!\ (j_1\cdots j_r)} = 1, \]
where the sum is over all $r \ge 1$ and all $r$-tuples of positive integers with $j_1 + \cdots + j_r = k$. Here are two possible approaches.

Show that the number of permutations of $k$ elements with exactly $r$ cycles is
\[ \sum_{j_1+\cdots+j_r = k} \frac{k!}{r!\ j_1 j_2\cdots j_r}. \]
Consider the equation
\[ \frac{1}{1-t} = \exp\{-\log(1-t)\}, \]
expand both sides in power series in $t$, and compare coefficients.

Exercise 9.2 Suppose $X_n$ is an irreducible Markov chain on a countable state space $\mathcal{A}$ and $A = \{x_1,\dots,x_k\}$ is a proper subset of $\mathcal{A}$. Let $A_0 = A$, $A_j = A\setminus\{x_1,\dots,x_j\}$. If $z \in V \subset \mathcal{A}$, let $g_V(z)$ denote the expected number of visits to $z$ by the chain starting at $z$ before leaving $V$.

(i) Show that
\[ g_A(x_1)\, g_{A\setminus\{x_1\}}(x_2) = g_{A\setminus\{x_2\}}(x_1)\, g_A(x_2). \tag{9.33} \]
(ii) By iterating (9.33), show that the quantity
\[ \prod_{j=1}^{k} g_{A_{j-1}}(x_j) \]
is independent of the ordering of $x_1,\dots,x_k$.
Exercise 9.3 [Karlin-McGregor] Suppose $X^1_n,\dots,X^k_n$ are independent realizations from a Markov chain with transition probability $q$ on a finite state space $\mathcal{A}$. Assume $x_1,\dots,x_k, y_1,\dots,y_k \in \mathcal{A}$. Consider the event
\[ V = V_n(y_1,\dots,y_k) = \big\{X^i_m \ne X^j_m,\ m = 0,\dots,n,\ i\ne j;\ X^j_n = y_j,\ 1\le j\le k\big\}. \]
Show that
\[ P\big(V \mid X^1_0 = x_1,\dots,X^k_0 = x_k\big) = \det\big[q_n(x_i,y_j)\big]_{1\le i,j\le k}, \]
where
\[ q_n(x_i,y_j) = P\big(X^1_n = y_j \mid X^1_0 = x_i\big). \]

Exercise 9.4 Suppose Bernoulli trials are performed with probability $p$ of success. Let $Y_n$ denote the number of failures before the $n$th success, and let $r(n)$ be the probability that $Y_n$ is even. By definition, $r(0) = 1$. Give a recursive equation for $r(n)$ and use it to find $r(n)$. Use this to verify (9.16).
Exercise 9.5 Give the details of Lemma 9.9.4.

Exercise 9.6 Suppose $q$ is the weight arising from simple random walk in $\mathbb{Z}^d$. Suppose $A_1, A_2$ are disjoint subsets of $\mathbb{Z}^d$ and $x \in \mathbb{Z}^d$. Let $p(x, A_1, A_2)$ denote the probability that a random walk starting at $x$ enters $A_2$ and subsequently returns to $x$, all without entering $A_1$. Let $g(x, A_1)$ denote the expected number of visits to $x$ before entering $A_1$ for a random walk starting at $x$. Show that the unrooted loop measure of the set of loops in $\mathbb{Z}^d\setminus A_1$ that intersect both $x$ and $A_2$ is bounded above by $p(x,A_1,A_2)\,g(x,A_1)$. Hint: for each unrooted loop that intersects both $x$ and $A_2$, choose a (not necessarily unique) representative that is rooted at $x$ and enters $A_2$ before its first return to $x$.

Exercise 9.7 We continue the notation of Exercise 9.6 with $d \ge 3$. Choose an enumeration of $\mathbb{Z}^d = \{x_0, x_1, \dots\}$ such that $j < k$ implies $|x_j| \le |x_k|$, and let $A_{j-1} = \{x_0,\dots,x_{j-1}\}$.

(i) Show there exists $c < \infty$ such that if $r > 0$, $u \ge 2$, and $|x_j| \le r$,
\[ p\big(x_j, A_{j-1}, \mathbb{Z}^d\setminus B_{ur}\big) \le c\,\frac{1}{|x_j|^2}\,(ur)^{2-d}. \]
(Hint: Consider a path that starts at $x_j$, leaves $B_{ur}$ and then returns to $x_j$ without visiting $A_{j-1}$. Split such a curve into three pieces: the beginning, up to the first visit to $\mathbb{Z}^d\setminus B_{ur}$; the end, which (with time reversed) is a walk from $x_j$ to the first (last) visit to $\mathbb{Z}^d\setminus B_{3|x_j|/2}$; and the middle, which ties these walks together.)

(ii) Show that there exists $c_1 < \infty$ such that if $r > 0$ and $u \ge 2$, then the (unrooted) loop measure of the set of loops that intersect both $B_r$ and $\mathbb{Z}^d\setminus B_{ur}$ is bounded above by $c_1\, u^{2-d}$.
10
Intersection Probabilities for Random Walks
\[ J_n = \sum_{j=0}^{n}\sum_{k=2n}^{3n} 1\{S_j = S_k\}, \qquad K_n = \sum_{j=0}^{n}\sum_{k=2n}^{\infty} 1\{S_j = S_k\}. \]
Note that
\[ P\{S[0,n]\cap S[2n,3n] \ne \emptyset\} = P\{J_n \ge 1\}, \qquad P\{S[0,n]\cap S[2n,\infty) \ne \emptyset\} = P\{K_n \ge 1\}. \]
We will derive the following inequalities for $d \ge 3$:
\[ c_1\, n^{(4-d)/2} \le E(J_n) \le E(K_n) \le c_2\, n^{(4-d)/2}, \tag{10.1} \]
\[ E(J_n^2) \le \begin{cases} c\,n, & d = 3,\\ c\,\log n, & d = 4,\\ c\,n^{(4-d)/2}, & d \ge 5. \end{cases} \tag{10.2} \]
Once these are established, the lower bound follows by the second moment lemma (Lemma 12.6.1),
\[ P\{J_n > 0\} \ge \frac{E(J_n)^2}{4\,E(J_n^2)}. \]
Let us write $p(n)$ for $P\{S_n = 0\}$. Then,
\[ E(J_n) = \sum_{j=0}^{n}\sum_{k=2n}^{3n} p(k-j), \]
and similarly for $E(K_n)$. Since $p(k-j) \asymp (k-j)^{-d/2}$, we get
\[ E(J_n) \asymp \sum_{j=0}^{n}\sum_{k=2n}^{3n}\frac{1}{(k-j)^{d/2}} \asymp \sum_{j=0}^{n}\sum_{k=2n}^{3n}\frac{1}{(k-n)^{d/2}} \asymp \sum_{j=0}^{n} n^{1-(d/2)} \asymp n^{2-(d/2)}, \]
and similarly for $E(K_n)$. This gives (10.1). To bound the second moments, note that
\[ E(J_n^2) = \sum_{0\le j,i\le n}\ \sum_{2n\le k,m\le 3n} P\{S_j = S_k,\ S_i = S_m\} \le \sum_{0\le j\le i\le n}\ \sum_{2n\le k\le m\le 3n}\big[P\{S_j = S_k, S_i = S_m\} + P\{S_j = S_m, S_i = S_k\}\big]. \]
If $0 \le j \le i \le n$ and $2n \le k \le m \le 3n$, then both probabilities on the right-hand side satisfy
\[ P\{S_j = S_k, S_i = S_m\} \le \Big[\max_{l\ge n,\ x\in\mathbb{Z}^d} P\{S_l = x\}\Big]\,\Big[\max_{x\in\mathbb{Z}^d} P\{S_{(i-j)+(m-k)} = x\}\Big] \le c\, n^{-d/2}\,\big((i-j)+(m-k)+1\big)^{-d/2}. \]
The last inequality uses the local central limit theorem. Therefore,
\[ E(J_n^2) \le c\, n^{-d/2}\sum_{0\le j\le i\le n}\ \sum_{2n\le k\le m\le 3n}\big((i-j)+(m-k)+1\big)^{-d/2} \le c\, n^{2-(d/2)}\sum_{l=0}^{2n}(l+1)^{1-(d/2)}. \]
This yields (10.2).
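For concreteness, the order of magnitude in (10.1) and the order-one intersection probability for $d=3$ can be seen in a small Monte Carlo experiment. This is an illustration only (not from the text), assuming numpy; sample sizes and the helper names are arbitrary.

```python
import numpy as np
rng = np.random.default_rng(2)

def srw_path(steps, d=3):
    """Simple random walk path S_0 = 0, ..., S_steps in Z^d."""
    moves = rng.integers(0, 2 * d, size=steps)
    incr = np.zeros((steps, d), dtype=int)
    incr[np.arange(steps), moves % d] = np.where(moves < d, 1, -1)
    return np.vstack([np.zeros(d, dtype=int), np.cumsum(incr, axis=0)])

def J_n(n):
    S = srw_path(3 * n)
    early = {}
    for x in map(tuple, S[: n + 1]):
        early[x] = early.get(x, 0) + 1
    # J_n counts pairs (j,k) with S_j = S_k, j <= n, 2n <= k <= 3n
    return sum(early.get(tuple(x), 0) for x in S[2 * n : 3 * n + 1])

for n in [50, 100, 200]:
    vals = np.array([J_n(n) for _ in range(400)])
    print(n, vals.mean() / np.sqrt(n), (vals > 0).mean())  # E[J_n]/n^{1/2} and P{J_n >= 1}
```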
The upper bound is trivial for $d = 3$ and for $d \ge 5$ it follows from (10.1) and the inequality $P\{K_n \ge 1\} \le E[K_n]$. Assume $d = 4$. We will consider $E[K_n \mid K_n \ge 1]$. On the event $\{K_n \ge 1\}$, let $k$ be the smallest integer $\ge 2n$ such that $S_k \in S[0,n]$, and let $j$ be the smallest index such that $S_k = S_j$. Then by the Markov property, given $[S_0,\dots,S_k]$ and $S_k = S_j$, the expected number of intersections occurring at times $l \ge k$ is
\[ \sum_{i=0}^{n}\sum_{l=k}^{\infty} P\{S_l = S_i \mid S_k = S_j\} = \sum_{i=0}^{n} G(S_i - S_j). \]
Define a random variable, depending on $S_0,\dots,S_n$,
\[ Y_n = \min_{j=0,\dots,n}\ \sum_{i=0}^{n} G(S_i - S_j). \]
For any $r > 0$, we have that
\[ E[K_n \mid K_n \ge 1,\ Y_n \ge r\log n] \ge r\log n. \]
Note that for each $r$,
\[ P\{Y_n < r\log n\} \le (n+1)\, P\Big\{\sum_{i\le n/2} G(S_i) < r\log n\Big\}. \]
Using Lemma 10.1.2 below, we can find an $r$ such that $P\{Y_n < r\log n\} = o(1/\log n)$. But,
\[ c \ge E[K_n] \ge P\{K_n \ge 1;\ Y_n \ge r\log n\}\, E[K_n \mid K_n \ge 1, Y_n \ge r\log n] \ge P\{K_n \ge 1;\ Y_n \ge r\log n\}\,[r\log n]. \]
Therefore,
\[ P\{K_n \ge 1\} \le P\{Y_n < r\log n\} + P\{K_n \ge 1;\ Y_n \ge r\log n\} \le \frac{c}{\log n}. \]
This finishes the proof except for the one lemma that we will now prove.
Lemma 10.1.2 Let $p \in \mathcal{P}_4'$ and let $\xi_n$ denote the first exit time from $B_n$.

(a) For every $\beta > 0$, there exist $c, r$ such that for all $n$ sufficiently large,
\[ P\Big\{\sum_{j=0}^{\xi_n - 1} G(S_j) \le r\log n\Big\} \le c\, n^{-\beta}. \]
(b) For every $\beta > 0$, there exist $c, r$ such that for all $n$ sufficiently large,
\[ P\Big\{\sum_{j=0}^{n} G(S_j) \le r\log n\Big\} \le c\, n^{-\beta}. \]

Proof It suffices to prove (a) when $n = 2^l$ for some integer $l$, and we write $\rho_k = \xi_{2^k}$. Since $G(x) \ge c/(|x|+1)^2$, we have
\[ \sum_{j=0}^{\rho_l - 1} G(S_j) \ge \sum_{k=1}^{l}\ \sum_{j=\rho_{k-1}}^{\rho_k - 1} G(S_j) \ge c\sum_{k=1}^{l} 2^{-2k}\,[\rho_k - \rho_{k-1}]. \]
The reflection principle (Proposition 1.6.2) and the central limit theorem show that for every $\varepsilon > 0$ there is a $\delta > 0$ such that if $n$ is sufficiently large and $x \in B_{n/2}$, then $P^x\{\xi_n \le \delta n^2\} \le \varepsilon$. Let $I_k$ denote the indicator function of the event $\{\rho_k - \rho_{k-1} \le \delta\, 2^{2k}\}$. Then we know that
\[ P(I_k = 1 \mid S_0,\dots,S_{\rho_{k-1}}) \le \varepsilon. \]
Therefore, $J_l := \sum_{k=1}^{l} I_k$ is stochastically bounded by a binomial random variable with parameters $l$ and $\varepsilon$. By exponential estimates for binomial random variables (see Lemma 12.2.8), we can find an $\varepsilon$ such that
\[ P\{J_l \ge l/2\} \le c\, 2^{-\beta l}. \]
But on the event $\{J_l < l/2\}$ we know that
\[ \sum_{j=0}^{\rho_l - 1} G(S_j) \ge c\,\delta\,(l/2) \ge r\log n, \]
where the $r$ depends on $\delta$.

For part (b) we need only note that $P\{\xi_{n^{1/4}} > n\}$ decays faster than any power of $n$ and
\[ P\Big\{\sum_{j=0}^{n} G(S_j) \le \frac{r}{4}\log n\Big\} \le P\Big\{\sum_{j=0}^{\xi_{n^{1/4}}} G(S_j) \le r\log n^{1/4}\Big\} + P\{\xi_{n^{1/4}} > n\}. \]
The proof of the upper bound for $d = 4$ in Proposition 10.1.1 can be compared to the proof of an easier estimate,
\[ P\{0 \in S[n,\infty)\} \le c\, n^{1-\frac d2}, \qquad d \ge 3. \]
To prove this, one uses the local central limit theorem to show that the expected number of visits to the origin after time $n$ is $O(n^{1-\frac d2})$. On the event that $0 \in S[n,\infty)$, we consider the smallest $j \ge n$ such that $S_j = 0$. Then, using the strong Markov property, one shows that the expected number of visits given at least one visit is $G(0,0) < \infty$. In Proposition 10.1.1 we consider the event that $S[0,n]\cap S[2n,\infty) \ne \emptyset$ and try to take the first $(j,k) \in [0,n]\times[2n,\infty)$ such that $S_j = S_k$. This is not well defined since if $(i,l)$ is another such pair it might be the case that $i < j$ and $l > k$. To be specific, we choose the smallest $k$ and then the smallest $j$ with $S_j = S_k$. We then say that the expected number of intersections after this time is the expected number of intersections of $S[k,\infty)$ with $S[0,n]$. Since $S_k = S_j$ this is like the number of intersections of two random walks starting at the origin. In $d = 4$, this is of order $\log n$. However, because $S_k, S_j$ have been chosen specifically, we cannot use a simple strong Markov property argument to assert this. This is why the extra lemma is needed.
10.2 Short range estimate
We are interested in the probability that the paths of two random walks starting at the origin do not
intersect up to some nite time. We discuss only the interesting dimensions d 4. Let S, S
1
, S
2
, . . .
be independent random walks starting at the origin with distribution p T
d
. If 0 < < 1, let
T
, T
1
, T
2
, . . . denote independent geometric random variables with killing rate 1 and we write
n
= 1
1
n
. We would like to estimate
PS(0, n] S
1
[0, n] = ,
10.2 Short range estimate 241
or
P
_
S(0, T
n
] S
1
[0, T
2
n
] =
_
.
The next proposition uses the long range estimate to bound a dierent probability,
P
_
S(0, T
n
] (S
1
[0, T
1
n
] S
2
[0, T
2
n
]) =
_
.
Let
Q() =
yZ
d
P
0,y
S[0, T
] S
1
[0, T
1
] ,=
= (1 )
2
yZ
d
j=0
k=0
j+k
P
0,y
S[0, j] S
1
[0, k] ,= .
Here we write P
x,y
to denote probabilities assuming S
0
= x, S
1
0
= y. Using Proposition 10.1.1, one
can show that as n (we omit the details),
Q(
n
)
_
n
d/2
, d < 4
n
2
[log n]
1
, d = 4.
Proposition 10.2.1 Suppose S, S
1
, S
2
are independent random walks starting at the origin with
increment p T
d
. Let V
]. Then,
P
_
V
S(0, T
] (S
1
(0, T
1
] S
2
(0, T
2
]) =
= (1 )
2
Q(). (10.3)
Proof Suppose = [
0
= 0, . . . ,
n
], = [
0
, . . . ,
m
] are paths in Z
d
with
p() :=
n
j=1
p(
j1
,
j
) > 0 p() :==
m
j=1
p(
j1
,
j
) > 0.
Then we can write
Q() = (1 )
2
n=0
m=0
n+m
p() p(),
where the last sum is over all paths , with [[ = n, [[ = m,
0
= 0 and ,= . For each such
pair (, ) we dene a 4-tuple of paths starting at the origin (
,
+
,
,
+
) as follows. Let
s = minj :
j
, t = mink :
k
=
s
.
= [
s
s
,
s1
s
, . . . ,
0
s
],
+
= [
s
s
,
s+1
s
, . . . ,
n
s
],
= [
t
t
,
t1
t
, . . . ,
0
t
],
+
= [
t
t
,
t+1
t
, . . . ,
m
t
].
Note that p() = p(
) p(
+
), p() = p(
) p(
+
). Also,
0 , [
1
, . . . ,
t
], [
1
, . . . ,
s
] [
+
] = . (10.4)
242 Intersection Probabilities for Random Walks
Conversely, for each 4-tuple (
,
+
,
,
+
) of paths starting at the origin satisfying (10.4), we
can nd a corresponding (, ) with
0
= 0 by inverting this procedure. Therefore,
Q() = (1 )
2
0n
,n
+
,m
,m
+
,
+
,
,
+
+n
+
+m
+m
+
p(
)p(
+
)p(
)p(
+
),
where the last sum is over all (
,
+
,
,
+
) with [
[ = n
, [
+
[ = n
+
, [
[ = m
, [
+
[ = m
+
satisfying (10.4). Note that there is no restriction on the path
+
. Hence we can sum over n
+
and
+
to get
Q() = (1 )
0n,m
,m
+
,
+
n+m
+m
+
p()p(
)p(
+
),
But it is easy to check that the left-hand side of (10.3) equals
(1 )
3
0n,m
,m
+
,
+
n+m
+m
+
p()p(
)p(
+
).
Corollary 10.2.2 For d = 2, 3, 4,
PS(0, n] (S
1
(0, n] S
2
[0, n]) = PS(0, T
n
] (S
1
(0, T
1
n
] S
2
[0, T
2
n
]) =
(1
n
)
2
Q(
n
)
_
n
d4
2
, d = 2, 3
(log n)
1
, d = 4
Proof [Sketch] We have already noted the last relation. The previous proposition almost proves
the second relation. It gives a lower bound. Since PT
n
= 0 = 1/n, the upper bound will follow
if we show that
P[V
n
[ S(0, T
n
] (S
1
(0, T
1
n
] S
2
[0, T
2
n
]) = , T
n
> 0] c > 0. (10.5)
We leave this as an exercise (Exercise 10.1).
One direction of the rst relation can be proved by considering the event T
n
, T
1
n
, T
2
n
n
which is independent of the random walks and whose probability is bounded below by a c > 0
uniformly in n. This shows
PS(0, T
n
] (S
1
(0, T
1
n
] S
2
[0, T
2
n
]) = c PS(0, n] (S
1
(0, n] S
2
[0, n]) = .
For the other direction, it suces to show that
PS(0, T
n
] (S
1
(0, T
1
n
] S
2
[0, T
2
n
]) = ; T
n
, T
1
n
, T
2
n
n c (1
n
)
2
Q(
n
).
This can be established by going through the construction in proof of Proposition 10.2.1. We leave
this to the interested reader.
10.3 One-sided exponent 243
10.3 One-sided exponent
Let
q(n) = PS(0, n] S
1
(0, n] = .
This is not an easy quantity to estimate. If we let
Y
n
= P
_
S(0, n] S
1
(0, n] = [ S(0, n]
_
,
then we can write
q(n) = E[Y
n
].
Note that if S, S
1
, S
2
are independent, then
E[Y
2
n
] = P
_
S(0, n] (S
1
(0, n] S
2
(0, n]) =
_
.
Hence, we see that
E[Y
2
n
]
_
(log n)
1
, d = 4
n
d4
2
, d < 4
(10.6)
Since 0 Y
n
1, we know that
E[Y
2
n
] E[Y
n
]
_
E[Y
2
n
]. (10.7)
If it were true that (E[Y
n
])
2
E[Y
2
n
] we would know how E[Y
n
] behaves. Unfortunately, this is not
true for small d.
As an example, consider simple random walk on Z. In order for S(0, n] to avoid S
1
[0, n], ei-
ther S(0, n] 1, 2, . . . and S
1
[0, n] 0, 1, 2, . . . or S(0, n) 1, 2, . . . and S
1
[0, n]
0, 1, 2, . . .. The gamblers ruin estimate shows that the probability of each of these events is
comparable to n
1/2
and hence
E[Y
n
] n
1
, E[Y
2
n
] n
3/2
.
Another way of saying this is
PS(0, n] S
2
(0, n] = n
1
, PS(0, n] S
2
(0, n] = [ S(0, n] S
1
(0, n] = n
1/2
.
For d = 4, it is true that (E[Y
n
])
2
E[Y
2
n
]. For d < 4, the relation (E[Y
n
])
2
E[Y
2
n
] does
not hold. The intersection exponent =
d
is dened by saying E[Y
n
] n
|zx|=1
h(z)
=
h(y)
2d h(x)
. (11.1)
If x A and [x y[ = 1,
q(x, y) =
h(y)
|zx|=1
h(z)
.
The second equality in (11.1) follows by the fact that Lh(x) = 0. The denition of q(x, y) for
x A is the same as that for x A, but we write it separately to emphasize that the second
equality in (11.1) does not necessarily hold for x A. The h-process stopped at (A)
+
is the chain
with transition probability q = q
A,h
which equals q except for
q(x, x) = 1, x (A)
+
.
Note that if x A (A)
+
, then q(y, x) = q(y, x) = 0 for all y A. In other words, the chain
can start in x A (A)
+
, but it cannot visit there at positive times. Let q
n
= q
A,h
n
, q
n
= q
A,h
n
denote the usual n-step transition probabilities for the Markov chains.
Proposition 11.1.1 If x, y A,
q
A,h
n
(x, y) = p
A
n
(x, y)
h(y)
h(x)
.
In particular, q
A,h
n
(x, x) = p
A
n
(x, x).
Proof Let
= [
0
= x,
1
, . . . ,
n
= y]
be a nearest neighbor path with
j
A for all j. Then the probability that rst n points of the
h-process starting at x are
1
, . . . ,
n
in order is
n
j=1
h(
j
)
2d h(
j1
)
= (2d)
n
h(y)
h(x)
.
By summing over all paths , we get the proposition.
If we consider q
A,h
and p
A
as measures on nite paths = [
0
, . . . ,
n
] in A, then we can rephrase the
proposition as
dq
A,h
dp
A
() =
h(
n
)
h(
0
)
.
Formulations like this in terms of Radon-Nikodym derivatives of measures can be extended to measures on
continuous paths such as Brownian motion.
The h-process can be considered as the random walk weighted by the function h. One can dene this
11.1 h-processes 247
for any positive function on A, even if h is not harmonic, using the rst equality in (11.1). However, Proposition
11.1.1 will not hold if h is not harmonic.
Examples
If A Z
d
and V A, let
h
V,A
(x) = P
x
S
A
V =
yV
H
A
(x, y).
Assume h
V,A
(x) > 0 for all x A. By denition, h
V,A
1 on V and h
V,A
0 on Z
d
(A V ).
The h
V,A
-process corresponds to simple random walk conditioned to leave A at V . We usually
consider the version stopped at V = (A)
+
.
Suppose x A V and
H
A
(x, V ) := P
x
S
1
A; S
A
V =
yA
H
A
(x, y) > 0.
If [xy[ > 1 for all y V , then the excursion measure as dened in Section 9.6 corresponding
to paths from x to V in A normalized to be a probability measure is the h
V,A
-process. If there
is a y V with [x y[ = 1, the h
V,A
-process allows an immediate transition to y while the
normalized excursion measure does not.
Let A = H = x + iy Z iZ : y > 0 and h(z) = Im(z). Then h is a harmonic function on
H that vanishes on A. This h-process corresponds to simple random walk conditioned never to
leave H and is sometimes called an H-excursion. With probability one this process never leaves
H. Also, if q = q
H,h
and x +iy H,
q(x +iy, (x 1) +iy) =
1
4
,
q(x +iy, x +i(y + 1)) =
y + 1
4y
, q(x +iy, +i(y 1)) =
y 1
4y
.
Suppose A is a proper subset of Z
d
and V = A. Then the h
V,A
-process is simple random walk
conditioned to leave A. If d = 1, 2 or Z
d
A is a recurrent subset of Z
d
, then h
V,A
1 and the
h
V,A
-process is the same as simple random walk.
Suppose A is a connected subset of Z
d
, d 3 such that
h
,A
(x) := P
x
A
= > 0.
Then the h
,A
-process is simple random walk conditioned to stay in A.
Let A be a connected subset of Z
2
such that Z
2
A is nite and nonempty, and let
h(x) = a(x) E
x
[a(S
A
)] ,
be the unique function that is harmonic on A; vanishes on A; and satises h(x) (2/) log [x[
as x , see (6.40). Then the h-process is simple random walk conditioned to stay in A. Note
that this conditioning is on an event of probability zero. Using (6.40), we can see that this is
the limit as n of the h
Vn,An
processes where
A
n
= A [z[ < n, V
n
= A
n
[z[ n.
248 Loop-erased random walk
Note that for large n, V
n
= B
n
.
11.2 Loop-erased random walk
Suppose A Z
d
, V A, and x A with that h
V,A
(x) > 0. The loop-erased random walk (LERW)
from x to V in A is the probability measure on paths obtained by taking the h
V,A
-process stopped
at V and erasing loops. We can dene the walk equivalently as follows.
Take a simple random walk S
n
started at x and stopped when it reaches A. Condition on
the event (of positive probability) S
A
V . The conditional probability gives a probability
measure on (nite) paths
= [S
0
= x, S
1
, . . . , S
n
= S
A
] .
Erase loops from each which produces a self-avoiding path
= L() = [
S
0
= x,
S
1
, . . . ,
S
m
= S
A
],
with
S
1
, . . . ,
S
m1
A. We now have a probability measure on self-avoiding paths from x to V ,
and this is the LERW.
Similarly, if x A V with P
x
S
A
V > 0, we dene LERW from x to V in A by erasing
loops from the h
V,A
-process started at x stopped at V . If x V , we dene LERW from x to V to
be the trivial path of length zero.
We write the LERW as
S
0
,
S
1
, . . . ,
S
.
Here is the length of the loop-erasure of the h-process.
The LERW gives a probability measure on paths which we give explicitly in the next proposition.
We will use the results and notations from Chapter 9 where the weight q from that chapter is the
weight associated to simple random walk, q(x, y) = 1/2d if [x y[ = 1.
Proposition 11.2.1 Suppose V A, x A V and
S
0
,
S
1
, . . . ,
S
is LERW from x to V in A.
Suppose = [
0
, . . . ,
n
] is a self-avoiding path with
0
= x A,
n
V , and
j
A for 0 < j < n.
Then
P = n; [
S
0
, . . . ,
S
n
] = =
1
(2d)
n
P
x
S
A
V
F
(A).
Proof This is proved in the same way as Proposition 9.5.1. The extra term P
x
S
A
V comes
from the normalization to be a probability measure.
If = [
0
,
1
, . . . ,
m
] and
R
denotes the reversed path [
m
,
m1
, . . . ,
0
], it is not necessarily
true that L(
R
) = [L()]
R
(the reader might want to nd an example). However, the last proposi-
tion shows that for any self-avoiding path with appropriate endpoints, the probability that LERW
produces depends only on the set
1
, . . . ,
n1
. For this reason we have the following corollary
which shows that the distribution of LERW is reversible.
Corollary 11.2.2 (Reversibility of LERW) Suppose x, y A and
S
0
,
S
1
, . . . ,
S
is LERW
from x to y in A. Then the distribution of
S
,
S
1
, . . . ,
S
0
is that of LERW from y to x.
11.2 Loop-erased random walk 249
Proposition 11.2.3 If x A with h
V,A
(x) > 0, then the distribution of LERW from x to V in A
stopped at V is the same as that of LERW from x to V in A x stopped at V .
Proof Let X
0
, X
1
, . . . denote an h
V,A
-process started at x, and let =
A
be the rst time that the
walk leaves A, which with probability one is the rst time that the walk visits V . Let
= maxm <
A
: X
m
= x.
Then using last-exit decomposition ideas (see Proposition 4.6.5) and Proposition 11.1.1, the distri-
bution of
[X
, X
+1
, . . . , X
A
]
is the same as that of an h
V,A
-process stopped at V conditioned not to return to x. This is the
same as an h
V,A\{x}
-process.
If x A V , then the rst step
S
1
of the LERW from x to V in A has the same distribution as
the rst step of the h
V,A
-process from x to V . Hence,
P
x
S
1
= y =
h
V,A
(y)
|zx|=1
h
V,A
(z)
.
Proposition 11.2.4 Suppose x A V and
S
0
, . . . ,
S
S
0
, . . . ,
S
m
] = =
P
m
S
A\
V
(2d)
m
P
x
S
A
V
F
(A).
Proof Let = [
0
, . . . ,
n
] be a nearest neighbor path with
0
= x,
n
V and
0
, . . . ,
n1
A
such that the length of LE() is greater than m and the rst m steps of LE() agrees with . Let
s = maxj :
j
=
m
and write =
+
where
= [
0
,
1
, . . . ,
s
],
+
= [
s
,
s+1
, . . . ,
n
].
Then L(
) = and
+
is a nearest neighbor path from
m
to V with
s
=
m
,
s+1
, . . . ,
n1
A ,
n
V. (11.2)
Every such can be obtained by concatenating an
in A with L(
) = with an
+
satisfying
(11.2). The total measure of the set of
is given by (2d)
m
F
A\
V . Again, the term P
x
S
A
V comes from the normalization
to make the LERW a probability measure.
The LERW is not a Markov process. However, we can consider the LERW from x to V in A
as a Markov chain on a dierent state space. Fix V , and consider the state space A of ordered
pairs (x, A) with x Z
d
, A Z
d
(V x) and either x V or P
x
S
A
V > 0. The states
(x, A), x V are absorbing states. For other states, the probability of the transition
(x, A) (y, A y)
250 Loop-erased random walk
is the same as the probability that an h
V,A
-process starting at x takes its rst step to y. The fact
that this is a Markov chain is sometimes called the domain Markov property for LERW.
11.3 LERW in Z^d

The loop-erased random walk in $\mathbb{Z}^d$ is the process obtained by erasing the loops from the path of a $d$-dimensional simple random walk. The $d = 1$ case is trivial, so we will focus on $d \ge 2$. We will use the term self-avoiding path for a nearest neighbor path that is self-avoiding.

11.3.1 d >= 3

The definition of LERW is easier in the transient case $d \ge 3$, for then we can take the infinite path
\[ [S_0, S_1, S_2, \dots] \]
and erase loops chronologically to obtain the path
\[ [\hat S_0, \hat S_1, \hat S_2, \dots]. \]
To be precise, we let
\[ \sigma_0 = \max\{j \ge 0 : S_j = 0\}, \]
and for $k > 0$,
\[ \sigma_k = \max\{j > \sigma_{k-1} : S_j = S_{\sigma_{k-1}+1}\}, \]
and then
\[ [\hat S_0, \hat S_1, \hat S_2, \dots] = [S_{\sigma_0}, S_{\sigma_1}, S_{\sigma_2}, \dots]. \]
It is convenient to define chronological erasing as above by considering the last visit to a point. It is not difficult to see that this gives the same path as obtained by nonanticipating loop erasure, i.e., every time one visits a point that is on the path one erases all the points in between.
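For finite paths the nonanticipating description translates directly into a short routine. The following sketch (ours, not from the text) performs loop erasure of a finite path; the helper name is illustrative.

```python
def loop_erase(path):
    """Loop erasure of a finite nearest-neighbor path (a list of hashable points).

    Implements the nonanticipating description: every time a point already on the
    current self-avoiding path is revisited, the intervening points are erased.
    For a finite path this agrees with the last-visit (chronological) definition.
    """
    erased = []
    position = {}             # point -> its index in `erased`
    for x in path:
        if x in position:     # a loop has just been closed: remove it
            cut = position[x] + 1
            for y in erased[cut:]:
                del position[y]
            erased = erased[:cut]
        else:
            position[x] = len(erased)
            erased.append(x)
    return erased

# example in Z: the path 0, 1, 2, 1, 0, -1 loop-erases to 0, -1
print(loop_erase([0, 1, 2, 1, 0, -1]))   # [0, -1]
```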
The following properties follow from the previous sections in this chapter and we omit the proofs.
Given
S
0
, . . . ,
S
m
, the distribution of
S
m+1
is that of the h
,Am
-process starting at
S
m
where
A
m
= Z
d
S
0
, . . . ,
S
m
. Indeed,
P
S
m+1
= x [ [
S
0
, . . . ,
S
m
] =
h
,Am
(x)
|y
Sm|=1
h
,Am
(y)
, [x
S
m
[ = 1.
If = [
0
, . . . ,
m
] is a self-avoiding path with
0
= 0,
P
_
[
S
0
, . . . ,
S
m
] =
_
=
Es
Am
(
m
)
(2d)
m
F
(Z
d
) =
Es
Am
(
m
)
(2d)
m
m
j=0
G
A
j1
(
j
,
j
).
Here A
1
= Z
d
.
11.3 LERW in Z
d
251
Suppose Z
d
A is nite,
A
r
= A [z[ < r, V
r
= A
r
[z[ r,
and
S
(r)
0
, . . . ,
S
(r)
m
denotes (the rst m steps of) a LERW from 0 to V
r
in A
r
. Then for every
self-avoiding path ,
P
_
[
S
0
, . . . ,
S
m
] =
_
= lim
r
P
_
[
S
(r)
0
, . . . ,
S
(r)
m
] =
_
.
11.3.2 d = 2
There are a number of ways to dene LERW in Z
2
; all the reasonable ones give the same answer.
One possibility (see Exercise 11.2) is to take simple random walk conditioned not to return to
the origin and erase loops. We take a dierent approach in this section and dene it as the limit
as N of the measure obtained by erasing loops from simple random walk stopped when it
reaches B
N
. This approach has the advantage that we obtain an error estimate on the rate of
convergence.
Let S
n
denote simple random walk starting at the origin in Z
2
. Let
S
0,N
, . . . ,
S
N
,N
denote
LERW from 0 to B
N
in B
N
. A This can be obtained by erasing loops from
[S
0
, S
1
, . . . , S
N
].
As noted in Section 11.2, if we condition on the event that
0
>
N
, we get the same distribution on
the LERW. Let
N
denote the set of self-avoiding paths = [0,
1
, . . . ,
k
] with
1
, . . . ,
k1
B
N
,
and
N
B
N
and let
N
denote the corresponding probability measure on
N
,
N
() = P[
S
0,N
, . . . ,
S
n,N
] = .
If n < N, we can also consider
N
as a probability measure on
n
, by considering the path up
to the rst time it visits B
n
and removing the rest of the path. The goal of this subsection is to
prove the following result.
Proposition 11.3.1 Suppose d = 2 and n < . For each N n, consider
N
as a probability
measure on
n
. Then the limit
= lim
N
N
,
exists. Moreover, for every
n
.
N
() = ()
_
1 +O
_
1
log(N/n)
__
, N 2n. (11.3)
To be more specic, (11.3) means that there is a c such that for all N 2n and all
n
,
N
()
()
1
c
log(N/n)
.
The proof of this proposition will require an estimate on the loop measure as dened in Chapter
252 Loop-erased random walk
9. We start by stating the following proposition which is an immediate application of Proposition
11.2.4 to our situation.
Proposition 11.3.2 If n N and = [
0
, . . . ,
k
]
n
,
N
() =
P
k
_
N
<
Z
d
\
_
(2d)
||
F
(B
N
) =
P
k
_
N
<
Z
d
\
_
(2d)
||
P
N
<
0
(B
N
0).
Since 0 ,
F
(B
N
) = G
BN
(0, 0) F
(B
N
0) = P
N
<
0
1
F
(B
N
0),
which shows the second equality in the proposition.
We will say that a loop disconnects the origin from a set A if there is no self-avoiding path starting
at the origin ending at A that does not intersect the loop; in particular, loops that intersect the
origin disconnect the origin from all sets. Let m denote the unrooted loop measure for simple
random walk as dened in Chapter 9.
Lemma 11.3.3 There exists c < such that the following holds for simple random walk in Z
2
.
For every n < N/2 < consider the set U = U(n, N) of unrooted loops satisfying
B
n
,= , (Z
d
B
N
) ,=
and such that does not disconnect the origin from B
n
. Then
m(U)
c
log(N/n).
Proof Order Z
2
= x
0
= 0, x
1
, x
2
, . . . so that j < k implies [x
j
[ [x
k
[. Let A
k
= Z
2
x
0
, . . . , x
k1
. For each unrooted loop , let k be the smallest index with x
k
and, as be-
fore, let d
x
k
() denote the number of times that visits x
k
. By choosing the root uniformly among
the d
x
k
() visits to x
k
, we can see that
m(U) =
k=1
U
k
1
(2d)
||
d
x
k
()
k=1
U
k
1
(2d)
||
,
where
U
k
=
U
k
(n, N) denotes the set of (rooted) loops rooted at x
k
satisfying the following three
properties:
x
0
, . . . , x
k1
= ,
(Z
d
B
N
) ,= ,
does not disconnect the origin from B
n
.
We now give an upper bound for the measure of
U
k
for x
k
B
n
. Suppose
= [
0
, . . . ,
2l
]
U
k
.
Let s
0
=
0
, s
5
= 2l and dene s
1
, . . . , s
4
as follows.
Let s
1
be the smallest index s such that [
s
[ 2[x
k
[.
11.3 LERW in Z
d
253
Let s
2
be the smallest index s s
1
such that [
s
[ n
Let s
3
be the smallest index s s
2
such that [
s
[ N.
Let s
4
be the largest index s 2l such that [
s
[ 2[x
k
[.
Then we can decompose
=
1
5
,
where
j
= [
s
j1
, . . . ,
s
j
]. We can use this decomposition to estimate the probability of
U
k
.
1
is a path from x
k
to B
2|x
k
|
that does not hit x
0
, . . . , x
k1
. Using gamblers ruin (or a
similar estimate), the probability of such a path is bounded above by c/[x
k
[.
2
is a path from B
2|x
k
|
to B
n
that does not disconnect the origin from B
n
. There
exists c, such that the probability of reaching distance n without disconnecting the origin
is bounded above by c ([x
k
[/n)
1
log(N/n)
.
By summing over x
k
B
n
, we get the proposition.
Being able to verify all the estimates in the last proof is a good test that one has absorbed a lot of material
from this book!
Proof [of Proposition 11.3.1] Let
r
= 1/ log r. We will show that for 2n N M and =
[
0
, . . . ,
k
]
n
,
M
() =
N
() [1 +O(
N/n
)]. (11.4)
Standard arguments using Cauchy sequences then show the existence of satisfying (11.3). Propo-
sition 11.3.2 implies
M
() =
N
()
F
(B
M
)
F
(B
N
)
P
k
_
M
<
Z
2
\
[
N
<
Z
2
\
_
.
The set of loops contributing to the term F
(B
M
)/F
(B
N
) are of two types: those that disconnect
the origin from B
n
and those that do not. Loops that disconnect the origin from B
n
intersect
every
n
and hence contribute a factor C(n, N, M) that is independent of . Hence, using
Lemma 11.3.3, we see that
F
(B
M
)
F
(B
N
)
= C(n, N, M) [1 +O(
N/n
)], (11.5)
254 Loop-erased random walk
Using Proposition 6.4.1, we can see that for every x B
N
,
P
x
M
<
Z
2
\Bn
=
log(N/n)
log(M/n)
_
1 +O(
N/n
)
N
<
Z
2
\
c
log(N/n)
.
We therefore get
P
k
_
M
<
Z
2
\
[
N
<
Z
2
\
_
=
log(N/n)
log(M/n)
_
1 +O(
N/n
)
.
Combining this with (11.5) we get
M
() =
N
() C(n, N, M)
log(N/n)
log(M/n)
_
1 +O(
N/n
)
,
where we emphasize that the error term is bounded uniformly in
n
. However, both
N
and
M
are probability measures. By summing over
n
on both sides, we get
C(n, N, M)
log(N/n)
log(M/n)
= 1 +O(
N/n
),
which gives (11.4).
The following is proved similarly (see Exercise 9.7).
Proposition 11.3.4 Suppose d 3 and n < . For each N n, consider
N
as a probability
measure on
n
. Then the limit
= lim
N
N
,
exists and is the same as that given by the innite LERW. Moreover, for every
n
.
N
() = ()
_
1 +O
_
(n/N)
d2
__
, N 2n. (11.6)
11.4 Rate of growth
If
S
0
,
S
1
, . . . , denotes LERW in Z
d
, d 2, we let
n
= minj : [
S
j
[ n.
Let
F(n) =
F
d
(n) = E[
n
].
In other words it takes about
F(n) steps for the LERW to go distance n. Recall that for simple
random walk, E[
n
] n
2
. Note that
F(n) =
xBn
P
S
j
= x for some j <
n
.
11.4 Rate of growth 255
By Propositions 11.3.1 and 11.3.4, we know that if x B
n
,
P
S
j
= x for some j <
n
Px LE(S[0,
2n
])
=
j=0
Pj <
2n
; S
j
= x; LE(S[0, j]) S[j + 1,
2n
] = .
If S, S
1
are independent random walks, let
Q() = (1 )
2
xZ
d
n=0
m=0
n+m
P
0,x
LE(S[0, n]) S
1
[0, m] = .
In Proposition 10.2.1, a probability of nonintersection of random walks starting at the origin was
computed in terms of a long-range intersection quantity Q(). We do something similar for
LERW using the quantity
Q(). The proof of Proposition 10.2.1 used a path decomposition: given
two intersecting paths, the proof focused on the rst intersection (using the time scale of one of the
paths) and then translating to make that the origin. The proof of the next proposition is similar
given a simple random walk that intersects a loop-erased walk. However, we get two dierent
results depending on whether we focus on the rst intersection on the time scale of the simple walk
or on the time scale of the loop-erased walk.
Proposition 11.4.1 Let S, S
1
, S
2
, S
3
be independent simple random walks starting at the origin
in Z
d
with independent geometric killing times T
, T
1
, . . . , T
3
.
(i) Let V
1
= V
1
] LE(S[0, T
]) = , j = 1, 2,
and
S
3
[1, T
3
] [LE(S[0, T
]) 0] = .
Then
P(V
1
) = (1 )
2
Q(). (11.7)
(ii) Let V
2
= V
2
] LE(S[0, T
]) = ,
and
S
2
[1, T
2
]
_
LE(S[0, T
]) LE(S
1
[0, T
1
]) =
_
.
Then
P(V
2
) = (1 )
2
Q(). (11.8)
Proof We use some of the notation from the proof of Proposition 10.2.1. Note that
Q() = (1 )
2
n=0
m=0
n+m
p() p(),
256 Loop-erased random walk
where the last sum is over all , with [[ = n, [[ = m,
0
= 0 and L() ,= . We write
= L() = [
0
, . . . ,
l
].
To prove (11.7), on the event ,= , we let
u = minj :
j
, s = maxj :
j
=
u
, t = mink :
k
=
u
.
We dene the paths
,
+
,
,
+
as in the proof of Proposition 10.2.1 using these values of s, t.
Our denition of s, t implies for j > 0,
+
j
, LE
R
(
),
j
, LE
R
(
),
+
j
, LE
R
(
) 0. (11.9)
Here we write LE
R
to indicate that one traverses the path in the reverse direction, erases loops,
and then reverses the path again this is not necessarily the same as LE(
,
+
,
,
+
) satisfying (11.9), we get a corresponding (, ) satisfying L() ,= .
Therefore,
Q() = (1 )
2
0n
,n
+
,m
,m
+
,
+
,
,
+
+n
+
+m
+m
+
p(
)p(
+
)p(
)p(
+
),
where the last sum is over all (
,
+
,
,
+
) with [
[ = n
, [
+
[ = n
+
, [
[ = m
, [
+
[ = m
+
satisfying (11.9). Using Corollary 11.2.2, we see that the sum is the same if replace (11.9) with:
for j > 0,
+
j
, LE(
),
j
, LE(
),
+
j
, LE(
) 0.
To prove (11.8), on the event ,= , we let
t = mink :
k
, s = maxj :
j
=
t
,
and dene (
,
+
,
,
+
) as before. The conditions now become for j > 0,
+
j
, LE
R
(
),
j
, [LE
R
(
) LE(
+
)],
It is harder to estimate
Q() then Q(). We do not give a proof here but we state that if
n
= 1
1
n
, then as n ,
Q(
n
)
_
n
d/2
, d < 4
n
2
[log n]
1
, d = 4.
This is the same behavior as for Q(
n
). Roughly speaking, if two random walks of length n start
distance
n away, then the probability that one walk intersects the loop erasure of the other is of
order 1 for d 3 and of order 1/ log n for d = 4. For d = 1, 2, this is almost obvious for topological
reasons. The hard cases are d = 3, 4. For d = 3 the set of cut points (i.e., points S
j
such that
S[0, j] S[j + 1, n] = ) has a fractal dimension strictly greater than one and hence tends to be
hit by a (roughly two-dimensional) simple random walk path. For d = 4, one can also show that
the probability of hitting the cut points is of order 1/ log n. Since all cut points are retained in loop
11.5 Short-range intersections 257
erasure this gives a bound on the probability of hitting the loop-erasure. This estimate of
Q()
yields
P(V
1
n
) P(V
2
n
)
_
n
d4
2
, d < 4
1
log n
, d = 4.
To compute the growth rate we would like to know the asymptotic behavior of
PLE(S[0,
n
]) S
1
[1, T
n
] = = E[Y
n
],
where
Y
n
= PLE(S[0,
n
]) S
1
[1, T
n
] = [ S[0,
n
].
Note that
P(V
1
n
) E[Y
3
n
].
11.5 Short-range intersections
Studying the growth rate for LERW leads one to try to estimate probabilities such as
PLE(S[0, n]) S[n + 1, 2n] = ,
which by Corollary 11.2.2 is the same as
q
n
=: PLE(S[0, n]) S
1
[1, n] = ,
where S, S
1
are independent walks starting at the origin. If d 5,
q
n
PS[0, n] S
1
[1, n] = c > 0,
so we will restrict our discussion to d 4. Let
Y
n
= PLE(S[0, n]) S[n + 1, 2n] = [ S[0, n].
Using ideas similar to those leading up to (11.7), one can show that
E[
Y
3
n
]
_
(log n)
1
, d = 4
n
d4
2
, d = 1, 2, 3.
(11.10)
This can be compared to (10.6) where the second moment for an analogous quantity is given. We
also know that
E[
Y
3
n
] E[
Y
n
]
_
E[
Y
n
]
_
1/3
. (11.11)
In the mean-eld case d = 4, it can be shown that E[
Y
3
n
]
_
E[
Y
n
]
_
3
. and hence that
q
n
(log n)
1/3
.
Moreover, if we appropriately scale the process, the LERW converges to a Brownian motion.
258 Loop-erased random walk
For d = 2, 3, we do not expect E[
Y
3
n
]
_
E[
Y
n
]
_
3
. Let us dene an exponent =
d
roughly as
q
n
n
(r)[ : [n r[
1
2
_
. (12.1)
If
[b
n
[ < , let
C =
n=1
b
n
, B
n
=
j=n+1
[b
n
[.
Then
n
j=1
f(j) =
_
n+(1/2)
1/2
f(s) ds +C +O(B
n
).
Also, for all m < n
j=m
f(j)
_
n+(1/2)
m(1/2)
f(s) ds
B
m
.
Proof Taylors theorem shows that for [s n[ 1/2,
f(s) = f(n) + (s n) f
(n) +
1
2
f
(r
s
) (r
s
n)
2
,
for some [n r
s
[ < 1/2. Hence, for such s,
[f(s) +f(s) 2f(n)[
s
2
2
sup[f
(r)[ : [n r[ s.
259
260 Appendix
Integrating gives (12.1). The rest is straightforward.
Example. Suppose < 1, R and
f(n) = n
log
n.
Note that for t 2,
[f
(t)[ c t
2
log
t.
Therefore, there is a C(, ) such that
n
j=2
n
log
n =
_
n+(1/2)
2
t
log
t dt +C(, ) +O(n
1
log
n)
=
_
n
2
t
log
t dt +
1
2
n
log
n +C(, ) +O(n
1
log
n) (12.2)
12.1.2 Logarithm
Let log denote the branch of the complex logarithm on z C; Re(z) > 0 with log 1 = 0. Using
the power series
log(1 +z) =
_
_
k
j=1
(1)
j+1
z
j
j
_
_
+O
([z[
k+1
), [z[ 1 .
we see that if r (0, 1) and [[ rt,
log
_
1 +
t
_
t
=
2
2t
+
3
3t
2
+ + (1)
k+1
k
kt
k1
+O
r
_
[[
k+1
t
k
_
,
_
1 +
t
_
t
= e
exp
_
2
2t
+
3
3t
2
+ + (1)
k+1
k
kt
k1
+O
r
_
[[
k+1
t
k
__
. (12.3)
If [[
2
/t is not too big, we can expand the exponential in a Taylor series. Recall that for xed
R < , we can write
e
z
= 1 +z +
z
2
2!
+ +
z
k
k!
+O
R
([z[
k+1
), [z[ R.
Therefore, if r (0, 1), R < , [[ rt, [[
2
Rt, we can write
_
1 +
t
_
t
= e
_
1
2
2t
+
8
3
+ 3
4
24t
2
+ +
f
k
()
t
k1
+O
_
[[
2k
t
k
__
, (12.4)
where f
k
is a polynomial of degree 2(k 1) and the implicit constant in the O() term depends only
on r, R and k. In particular,
_
1 +
1
n
_
n
= e
_
1
1
2n
+
11
24 n
2
+ +
b
k
n
k
+O
_
1
n
k+1
__
. (12.5)
12.1 Some expansions 261
Lemma 12.1.2 For every positive integer k, there exist constants c(k, l), l = k +1, k +2, . . ., such
that for each m > k,
j=n
k
j
k+1
=
1
n
k
+
m
l=k+1
c(k, l)
n
l
+O
_
1
n
m+1
_
. (12.6)
Proof If n > 1,
n
k
=
j=n
[j
k
(j + 1)
k
] =
j=n
j
k
[1 (1 +j
1
)
k
] =
l=k
b(k, l)
j=n
l
j
l+1
,
with b(k, k) = 1 (the other constants can be given explicitly but we do not need to). In particular,
n
k
=
m
l=k
b(k, j)
_
_
j=n
l
j
l+1
_
_
+O
_
1
n
m+1
_
.
The expression (12.6) can be obtained by inverting this expression; we omit the details.
Lemma 12.1.3 There exists a constant (called Eulers constant) and b
2
, b
3
, . . . such that for
every integer k 2,
n
j=1
1
j
= log n + +
1
2n
+
k
l=2
b
l
n
l
+O
_
1
n
k+1
_
.
In fact,
= lim
n
_
_
n
j=1
1
j
_
_
log n =
_
1
0
(1 e
t
)
1
t
dt
_
1
e
t
1
t
dt. (12.7)
Proof Note that
n
j=1
1
j
= log
_
n +
1
2
_
+ log 2 +
n
j=1
j
,
where
j
=
1
j
log
_
j +
1
2
_
+ log
_
j
1
2
_
=
k=1
2
(2k + 1) (2j)
2k+1
.
In particular,
j
= O(j
3
), and hence
j
< . We can write
n
j=1
1
j
= log
_
n +
1
2
_
+
j=n+1
j
= log n + +
l=1
(1)
l+1
2
l
n
l
j=n+1
j
,
where is the constant
= log 2 +
j=1
j
.
262 Appendix
Using (12.6), we can write
j=n+1
j
=
k
l=3
a
l
n
l
+O
_
1
n
k+1
_
,
for some constants a
l
.
We will sketch the proof of (12.7) leaving the details to the reader. By Taylors series, we know
that
log n = log
_
1
_
1
1
n
__
=
j=1
_
1
1
n
_
j
1
j
.
Therefore,
= lim
n
_
_
n
j=1
1
j
_
_
log n
= lim
n
n
j=1
_
1
_
1
1
n
_
j
_
1
j
lim
n
j=n+1
_
1
1
n
_
j
1
j
.
We now use the approximation (1 n
1
)
n
e
1
to get
lim
n
n
j=1
_
1
_
1
1
n
_
j
_
1
j
= lim
n
n
j=1
1
n
(1 e
j/n
)
1
j/n
=
_
1
0
(1 e
t
)
1
t
dt,
lim
n
j=n+1
_
1
1
n
_
j
1
j
= lim
n
j=n+1
1
n
e
j/n
1
j/n
=
_
1
e
t
1
t
dt.
Lemma 12.1.4 Suppose R and m is a positive integer. There exist constants r
0
, r
1
, . . ., such
that if k is a positive integer and n m,
n
j=m
_
1
j
_
= r
0
n
_
1 +
r
1
n
+ +
r
k
n
k
+O
_
1
n
k+1
__
.
Proof Without loss of generality we assume that [[ 2m; if this does not hold we can factor out
the rst few terms of the product and then analyze the remaining terms. Note that
log
n
j=m
_
1
j
_
=
n
j=m
log
_
1
j
_
=
n
j=m
l=1
l
l j
l
=
l=1
n
j=m
l
l j
l
.
For the l = 1 term we have
n
j=m
j
=
m1
j=1
j
+
_
log n + +
1
2n
+
k
l=2
b
l
n
l
+O
_
1
n
k+1
_
_
.
12.2 Martingales 263
All of the other terms can be written in powers of (1/n). Therefore, we can write
log
n
j=m
_
1
j
_
= log n +
k
l=0
C
l
n
l
+O
_
1
n
k+1
_
.
The lemma is then obtained by exponentiating both sides.
12.2 Martingales
A ltration T
0
T
1
is an increasing sequence of -algebras.
Denition. A sequence of integrable random variables M
0
, M
1
, . . . is called a martingale with
respect to the ltration T
n
if each M
n
is T
n
-measurable and for each m n,
E[M
n
[ T
m
] = M
m
. (12.8)
If (12.8) is replaced with E[M
n
[ T
m
] M
m
the sequence is called a submartingale. If (12.8) is
replaced with E[M
n
[ T
m
] M
m
the sequence is called a supermartingale.
Using properties of conditional expectation, it is easy to see that to verify (12.8) it suces to
show for each n that E[M
n+1
[ T
n
] = M
n
. This equality only needs to hold up to an event of
probability zero; in fact, the conditional expectation is only dened up to events of probability
zero. If the ltration is not specied, then the assumption is that T
n
is the -algebra generated by
M
0
, . . . , M
n
. If M
0
, X
1
, X
2
, . . . are independent random variables with E[[M
0
[] < and E[X
j
] = 0
for j 1, and
M
n
= M
0
+X
1
+ +X
n
,
then M
0
, M
1
, . . . is a martingale. We omit the proof of the next lemma which is the conditional
expectation version of Jensens inequality.
Lemma 12.2.1 (Jensens inequality) If X is an integrable random variable; f : R R is
convex with E[[f(X)[] < ; and T is a -algebra, then E[f(X) [ T] f(E[X [ T]). In particular, if
M
0
, M
1
, . . . is a martingale; f : R R is convex with E[[f(M
n
)[] < for all n; and Y
n
= f(M
n
);
then Y
0
, Y
1
, . . . is a submartingale.
In particular, if M
0
, M
1
, . . . is a martingale then
if 1, Y
n
:= [M
n
[
is a submartingale;
if b R, then Y
n
:= e
bMn
is a submartingale.
In both cases, this is assuming that E[Y
n
] < .
12.2.1 Optional Sampling Theorem
A stopping time with respect to a ltration T
n
is a 0, 1, . . . -valued random variable T
such that for each n, T n is T
n
-measurable. If T is a stopping time, and n is a positive integer,
then T
n
:= T n is a stopping time satisfying T
n
n.
264 Appendix
Proposition 12.2.2 Suppose M
0
, M
1
, . . . is a martingale and T is a stopping time each with respect
to the ltration T
n
. Then Y
n
:= M
Tn
is a martingale with respect to T
n
. In particular,
E[M
0
] = E[M
Tn
].
Proof Note that
Y
n+1
= M
Tn
1T n +M
n+1
1T n + 1.
The event T n + 1 is the complement of the event T n and hence is T
n
-measurable.
Therefore, by properties of conditional expectation,
E[M
n+1
1T n + 1 [ T
n
] = 1T n + 1 E[M
n+1
[ T
n
] = 1T n + 1 M
n
.
Therefore,
E[Y
n+1
[ T
n
] = M
Tn
1T n +M
n
1T n + 1 = Y
n
.
The optional sampling theorem states that under certain conditions, if PT < = 1, then
E[M
0
] = E[M
T
]. However, this does not hold without some further assumptions. For example, if
M
n
is one-dimensional simple random walk starting at the origin and T is the rst n such that
M
n
= 1, then PT < = 1, M
T
= 1, and hence E[M
0
] ,= E[M
T
]. In the next theorem we list a
number of sucient conditions under which we can conclude that E[M
0
] = E[M
T
].
Theorem 12.2.3 (Optional Sampling Theorem) Suppose M
0
, M
1
, . . . is a martingale and T
is a stopping time with respect to the ltration T
n
. Suppose that PT < = 1. Suppose also
that at least one of the following conditions holds:
There is a K < such that PT K = 1.
There exists an integrable random variable Y such that for all n, [M
Tn
[ Y .
E[[M
T
[] < and lim
n
E[[M
n
[; T > n] = 0.
The random variables M
0
, M
1
, . . . are uniformly integrable, i.e., for every > 0 there is a
K
] < .
There exists an > 1 and a K < such that for all n, E[[M
n
[
] K.
Then E[M
0
] = E[M
T
].
Proof We will consider the conditions in order. The suciency of the rst follows immediately
from Proposition 12.2.2. We know that M
Tn
M
T
with probability one. Proposition 12.2.2 gives
E[M
Tn
] = E[M
0
]. Hence we need to show that
lim
n
E[M
Tn
] = E[M
T
]. (12.9)
If the second condition holds, then this limit is justied by the dominated convergence theorem.
Now assume the third condition. Note that
M
T
= M
Tn
+M
T
1T > n M
n
1T > n.
12.2 Martingales 265
Since PT > n 0, and E[[M
T
[] < , it follows from the dominated convergence theorem that
lim
n
E[M
T
1T > n] = 0.
Hence if E[M
n
1T > n] 0, we have (12.9). Standard exercises show that the fourth implies the
third and the fth condition implies the fourth, so either the fourth or fth condition is sucient.
12.2.2 Maximal inequality
Theorem 12.2.4 (Maximal inequality) Suppose M
0
, M
1
, . . . is a nonnegative submartingale
with respect to T
n
and > 0. Then
P
_
max
0jn
M
j
_
E[M
n
]
.
Proof Let T = minj 0 : M
j
. Then,
P
_
max
0jn
M
j
_
=
n
j=0
PT = j,
E[M
n
] E[M
n
; T n] =
n
j=0
E[M
n
; T = j].
Since M
n
is a submartingale and T = j is T
j
-measurable,
E[M
n
; T = j] = E[E[M
n
[ T
j
]; T = j] E[M
j
; T = j] PT = j.
Combining these estimates gives the theorem.
Combining Theorem 12.2.4 with Lemma 12.2.1 gives the following theorem.
Theorem 12.2.5 (Martingale maximal inequalities) Suppose M
0
, M
1
, . . . is a martingale with
respect to T
n
and > 0. Then if 1, b 0,
P
_
max
0jn
[M
j
[
_
E[[M
n
[
, (12.10)
P
_
max
0jn
M
j
_
E[e
bMn
]
e
b
.
Corollary 12.2.6 Let X
1
, X
2
, . . . be independent, identically distributed random variables in R
with mean zero, and let k be a positive integer for which E[[X
1
[
2k
] < . There exists c < such
that for all > 0,
P
_
max
0jn
[S
j
[
n
_
c
2k
. (12.11)
266 Appendix
Proof Fix k and allow constants to depend on k. Note that
E[S
2k
n
] =
E[X
j
1
X
j
2k
],
where the sum is over all (j
1
, . . . , j
2k
) 1, . . . , n
2k
. If there exists an l such that j
i
,= j
l
for i ,= l,
then we can use independence and E[X
j
l
] = 0 to see that E[X
j
1
X
j
2k
] = 0. Hence
E[S
2k
n
] =
E[X
j
1
X
j
2k
],
where the sum is over all (2k)-tuples such that if l j
1
, . . . , j
2k
, then l appears at least twice.
The number of such (2k)-tuples is O(n
k
) and hence we can see that
E
_
_
S
n
n
_
2k
_
c.
Hence we can apply (12.10) to the martingale M
j
= S
j
/
n.
Corollary 12.2.7 Let X
1
, X
2
, . . . be independent, identically distributed random variables in R
with mean zero, variance
2
, and such that for some > 0, the moment generating function
(t) = E[e
tX
j
] exists for [t[ < . Let S
n
= X
1
+ +X
n
. Then for all 0 r
n/2,
P
_
max
0jn
S
j
r
n
_
e
r
2
/2
exp
_
O
_
r
3
n
__
. (12.12)
If PX
1
R = 0 for some R, this holds for all r > 0.
Proof Without loss of generality, we may assume
2
= 1. The moment generating function of
S
n
= X
1
+ +X
n
is (t)
n
. Letting t = r/
n, we get
P
_
max
0jn
S
j
r
n
_
e
r
2
(r/
n)
n
.
Using the expansion for (t) at zero,
(t) = 1 +
t
2
2
+O(t
3
), [t[
2
,
we see that for 0 r
n/2,
(r/
n)
n
=
_
1 +
r
2
2n
+O
_
r
3
n
3/2
__
n
e
r
2
/2
exp
_
O
_
r
3
n
__
.
This gives (12.12). If PX
1
> R = 0, then (12.12) holds for r > R
m/2
P max
0jn
max
1km
[S
k+j
S
j
[ r
m c ne
br
2
. (12.13)
This next lemma is not about martingales, but it does concern exponential estimates for proba-
bilities so we will include it here.
12.3 Joint normal distributions 267
Lemma 12.2.8 If 0 < < , 0 < r < 1 and X
n
is a binomial random variable with parameters
n and e
2/r
, then
PX
n
rn e
n
.
Proof
PX
n
rn e
2n
E[e
(2/r)Xn
] e
2n
[1 +]
n
e
n
.
12.2.3 Continuous martingales
A process M
t
adapted to a ltration T
t
is called a continuous martingale if for each s < t, E[M
t
[
M
s
] = M
s
and with probability one the function t M
t
is continuous. If M
t
is a continuous
martingale, and > 0, then
M
()
n
:= M
n
is a discrete time martingale. Using this, we can extend results about discrete time martingales to
continuous martingales. We state one such result here.
Theorem 12.2.9 (Optional Sampling Theorem) Suppose M
t
is a uniformly integrable con-
tinuous martingale and is a stopping time with P < = 1 and E[[M
[] < . Suppose
that
lim
t
E[[M
t
[; > t] = 0.
Then
E[M
T
] = E[M
0
].
12.3 Joint normal distributions
A random vector Z = (Z
1
, . . . , Z
d
) R
d
is said to have a (mean zero) joint normal distribution
if there exist independent (one-dimensional) mean zero, variance one normal random variables
N
1
, . . . , N
n
and scalars a
jk
such that
Z
j
= a
j1
N
1
+ +a
jn
N
n
, j = 1, . . . , d,
or in matrix form
Z = AN.
Here A = (a
jk
) is a d n matrix and Z, N are column vectors. Note that
E(Z
j
Z
k
) =
n
m=1
a
jm
a
km
.
In other words, the covariance matrix = [ E(Z
j
Z
k
) ] is the d d symmetric matrix
= AA
T
.
268 Appendix
We say Z has a nondegenerate distribution if is invertible.
The characteristic function of Z can be computed using the known formula for the characteristic
function of N
k
,
E[e
itN
k
] = e
t
2
/2
,
E[expi Z] = E
_
_
exp
_
_
_
i
d
j=1
j
n
k=1
a
jk
N
k
_
_
_
_
_
= E
_
_
exp
_
_
_
i
n
k=1
N
k
d
j=1
j
a
jk
_
_
_
_
_
=
n
k=1
E
_
_
exp
_
_
_
iN
k
d
j=1
j
a
jk
_
_
_
_
_
=
n
k=1
exp
_
_
_
1
2
_
_
d
j=1
j
a
jk
_
_
2
_
_
_
= exp
_
_
_
1
2
n
k=1
d
j=1
d
l=1
j
l
a
jk
a
lk
_
_
_
= exp
_
1
2
AA
T
T
_
= exp
_
1
2
T
_
.
Since the characteristic function determines the distribution, we see that the distribution of Z
depends only on .
The matrix is symmetric and nonnegative denite. Hence we can nd an orthogonal basis
u
1
, . . . , u
d
of unit vectors in R
d
that are eigenvectors of with nonnegative eigenvalues
1
, . . . ,
d
.
The random variable
Z =
1
N
1
u
1
+ +
d
N
d
u
d
has a joint normal distribution with covariance matrix . In matrix language, we have written =
T
=
2
for a d d nonnegative denite symmetric matrix . The distribution is nondegenerate
if and only if all of the
j
are strictly positive.
Although we allow the matrix A to have n columns, what we have shown is that there is a symmetric,
positive denite d d matrix which gives the same distribution. Hence joint normal distribution in R
d
can be
described as linear combinations of d independent one-dimensional normals. Moreover, if we choose the correct
orthogonal basis for R
d
, the components of Z with respect to that basis are independent normals.
If $\Gamma$ is invertible, then $Z$ has a density $f(z_1, \ldots, z_d)$ with respect to Lebesgue measure that can be computed using the inversion formula
\[
f(z_1, \ldots, z_d) = \frac{1}{(2\pi)^d} \int e^{-i\theta\cdot z}\, E[\exp\{i\,\theta\cdot Z\}]\, d\theta
= \frac{1}{(2\pi)^d} \int \exp\Big\{ -i\,\theta\cdot z - \frac12\,\theta\,\Gamma\,\theta^T \Big\}\, d\theta.
\]
(Here and for the remainder of this paragraph the integrals are over $\mathbb{R}^d$ and $d\theta$ represents $d\theta_1 \cdots d\theta_d$.) To evaluate the integral, we start with the substitution $\theta^1 = \theta\Lambda$ which gives
\[
\int \exp\Big\{ -i\,\theta\cdot z - \frac12\,\theta\,\Gamma\,\theta^T \Big\}\, d\theta
= \frac{1}{\det\Lambda} \int e^{-|\theta^1|^2/2}\, e^{-i\,(\theta^1\cdot\Lambda^{-1}z)}\, d\theta^1.
\]
By completing the square we see that the right-hand side equals
\[
\frac{e^{-|\Lambda^{-1}z|^2/2}}{\det\Lambda} \int \exp\Big\{ -\frac12\, (\theta^1 + i\Lambda^{-1}z)\cdot(\theta^1 + i\Lambda^{-1}z) \Big\}\, d\theta^1.
\]
The substitution $\theta^2 = \theta^1 + i\Lambda^{-1}z$ gives
\[
\int \exp\Big\{ -\frac12\, (\theta^1 + i\Lambda^{-1}z)\cdot(\theta^1 + i\Lambda^{-1}z) \Big\}\, d\theta^1
= \int e^{-|\theta^2|^2/2}\, d\theta^2 = (2\pi)^{d/2}.
\]
Hence, the density of $Z$ is
\[
f(z) = \frac{1}{(2\pi)^{d/2}\,\det\Lambda}\, e^{-|\Lambda^{-1}z|^2/2}
= \frac{1}{(2\pi)^{d/2}\,\sqrt{\det\Gamma}}\, e^{-(z\cdot\Gamma^{-1}z)/2}. \tag{12.14}
\]
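As a sanity check on (12.14) (a minimal numerical sketch, not part of the text; the covariance matrix Gamma below is an arbitrary example), one can evaluate the density on a grid and confirm that it integrates to approximately one:

    import numpy as np

    # Arbitrary nondegenerate covariance matrix Gamma in dimension d = 2.
    Gamma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])
    Gamma_inv = np.linalg.inv(Gamma)
    d = Gamma.shape[0]

    def f(z):
        """Joint normal density (12.14) with covariance Gamma, evaluated at z in R^d."""
        norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Gamma))
        return np.exp(-0.5 * z @ Gamma_inv @ z) / norm

    # Riemann-sum check that the density integrates to (approximately) 1.
    h = 0.05
    grid = np.arange(-8, 8, h)
    total = sum(f(np.array([x, y])) for x in grid for y in grid) * h * h
    print(total)   # should be close to 1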
Corollary 12.3.1 Suppose $Z = (Z_1, \ldots, Z_d)$ has a mean zero, joint normal distribution such that $E[Z_j Z_k] = 0$ for all $j \neq k$. Then $Z_1, \ldots, Z_d$ are independent.

Proof Suppose $E[Z_j Z_k] = 0$ for all $j \neq k$. Then $Z$ has the same distribution as
\[
(b_1 N_1, \ldots, b_d N_d),
\]
where $b_j = \sqrt{E[Z_j^2]}$. In this representation, the components are obviously independent.

If $Z_1, \ldots, Z_d$ are mean zero random variables satisfying $E[Z_j Z_k] = 0$ for all $j \neq k$, they are called orthogonal. Independence implies orthogonality but the converse is not always true. However, the corollary tells us that the converse is true in the case of joint normal random variables. Orthogonality is often easier to verify than independence.
12.4 Markov chains

A (time-homogeneous) Markov chain on a countable state space $D$ is a process $X_n$ taking values in $D$ whose transitions satisfy
\[
P\{X_{n+1} = x_{n+1} \mid X_0 = x_0, \ldots, X_n = x_n\} = p(x_n, x_{n+1}),
\]
where $p : D \times D \to [0,1]$ is the transition function satisfying $\sum_{y \in D} p(x,y) = 1$ for each $x$. If $D$ is finite, we call the transition function the transition matrix $P = [p(x,y)]_{x,y \in D}$. The $n$-step transitions are given by the matrix $P^n$. In other words, if $p_n(x,y)$ is defined to be $P\{X_n = y \mid X_0 = x\}$, then
\[
p_n(x,y) = \sum_{z \in D} p(x,z)\, p_{n-1}(z,y) = \sum_{z \in D} p_{n-1}(x,z)\, p(z,y).
\]
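For a finite state space the $n$-step transition probabilities are simply the entries of the matrix power $P^n$; the following minimal sketch (not from the text; the matrix below is an arbitrary example) illustrates this:

    import numpy as np

    # Transition matrix of an arbitrary 3-state chain (rows sum to 1).
    P = np.array([[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5],
                  [0.4, 0.0, 0.6]])

    n = 7
    Pn = np.linalg.matrix_power(P, n)   # n-step transition matrix; Pn[x, y] = p_n(x, y)
    print(Pn)
    print(Pn.sum(axis=1))               # each row still sums to 1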
A Markov chain is called irreducible if for each $x, y \in D$, there exists an $n = n(x,y) \ge 0$ with $p_n(x,y) > 0$. The chain is aperiodic if for each $x$ there is an $N_x$ such that for $n \ge N_x$, $p_n(x,x) > 0$. If $D$ is finite, then the chain is irreducible and aperiodic if and only if there exists an $n$ such that $P^n$ has strictly positive entries.
Theorem 12.4.1 [Perron-Frobenius Theorem] If $P$ is an $m \times m$ matrix with nonnegative entries such that for some positive integer $n$, $P^n$ has all entries strictly positive, then there exists $\lambda > 0$ and vectors $v, w$ with strictly positive entries such that
\[
v\,P = \lambda\, v, \qquad P\,w = \lambda\, w.
\]
This eigenvalue is simple and all other eigenvalues of $P$ have absolute value strictly less than $\lambda$. In particular, if $P$ is the transition matrix for an irreducible aperiodic Markov chain there is a unique invariant probability $\pi$ satisfying
\[
\sum_{x \in D} \pi(x) = 1, \qquad \pi(x) = \sum_{y \in D} \pi(y)\, p(y,x).
\]
Proof We first assume that $P$ has all strictly positive entries. It suffices to find a right eigenvector, since the left eigenvector can be handled by considering the transpose of $P$. We write $w_1 \ge w_2$ if every component of $w_1$ is greater than or equal to the corresponding component of $w_2$. Similarly, we write $w_1 > w_2$ if all the components of $w_1$ are strictly greater than the corresponding components of $w_2$. We let $0$ denote the zero vector and $e_j$ the vector whose $j$th component is $1$ and whose other components are $0$. If $w \ge 0$, let
\[
\lambda_w = \sup\{\lambda : P w \ge \lambda w\}.
\]
Clearly $\lambda_w < \infty$, and since $P$ has strictly positive entries, $\lambda_w > 0$ for all $w > 0$. Let
\[
\lambda = \sup\Big\{\lambda_w : w \ge 0,\ \sum_{j=1}^m [w]_j = 1\Big\}.
\]
By compactness and continuity arguments we can see that there exists a $w$ with $w \ge 0$, $\sum_j [w]_j = 1$ and $\lambda_w = \lambda$. We claim that $Pw = \lambda w$. Indeed, if $[Pw]_j > \lambda [w]_j$ for some $j$, one can check that there exist positive $\epsilon, \delta$ such that $P[w + \epsilon e_j] \ge (\lambda + \delta)\,[w + \epsilon e_j]$, which contradicts the maximality of $\lambda$. If $v$ is a vector with both positive and negative components, then for each $j$,
\[
|[Pv]_j| < [P|v|]_j \le \lambda\, [|v|]_j.
\]
Here we write $|v|$ for the vector whose components are the absolute values of the components of $v$. Hence any eigenvector with both positive and negative values has an eigenvalue with absolute value strictly less than $\lambda$. Also, if $w_1, w_2$ are positive eigenvectors with eigenvalue $\lambda$, then $w_1 - t w_2$ is an eigenvector for each $t$. If $w_1$ is not a multiple of $w_2$ then there is some value of $t$ such that $w_1 - t w_2$ has both positive and negative values. Since this is impossible, we conclude that the eigenvector $w$ is unique. If $w \ge 0$ is an eigenvector, then the eigenvalue must be positive. Therefore, $\lambda$ has a unique eigenvector (up to constant), and all other eigenvalues have absolute value strictly less than $\lambda$. Note that if $v \ge 0$ and $v \neq 0$, then $Pv$ has all entries strictly positive; hence the eigenvector $w$ must have all entries strictly positive.

We claim, in fact, that $\lambda$ is a simple eigenvalue. To see this, one can apply the argument of the previous paragraph to all proper submatrices of the matrix to conclude that all of their eigenvalues are strictly less than $\lambda$ in absolute value. Using this (details omitted), one can see that the derivative of the function $f(s) = \det(sI - P)$ is nonzero at $s = \lambda$, which shows that the eigenvalue is simple.

If $P$ is a matrix such that $P^n$ has all entries strictly positive, and $w$ is an eigenvector of $P$ with eigenvalue $\lambda$, then $w$ is an eigenvector for $P^n$ with eigenvalue $\lambda^n$. Using this, we can conclude the result for $P$. The final assertion follows by noting that the vector of all $1$s is a right eigenvector for a stochastic matrix.

A different derivation of the Perron-Frobenius Theorem which generalizes to some chains on infinite state spaces is given in Exercise 12.4.
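The eigenvalue $\lambda$ and the eigenvectors $v, w$ can also be approximated by power iteration, since every other eigenvalue is strictly smaller in absolute value. A minimal sketch (not from the text; the matrix below is an arbitrary example):

    import numpy as np

    # Arbitrary nonnegative matrix whose powers are eventually strictly positive.
    P = np.array([[0.0, 2.0, 1.0],
                  [1.0, 0.0, 3.0],
                  [0.5, 1.0, 0.0]])

    def perron(P, iters=200):
        """Approximate the Perron-Frobenius eigenvalue and right eigenvector by power iteration."""
        w = np.ones(P.shape[0])
        for _ in range(iters):
            w = P @ w
            w /= w.sum()          # normalize so the iteration does not blow up
        lam = (P @ w)[0] / w[0]   # ratio of any component converges to the eigenvalue
        return lam, w

    lam, w = perron(P)
    v = perron(P.T)[1]            # left eigenvector of P = right eigenvector of P^T
    print(lam, w, v)
    print(np.allclose(P @ w, lam * w, atol=1e-8))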
If $P$ is the transition matrix for an irreducible, aperiodic Markov chain, then $p_n(x,y) \to \pi(y)$ as $n \to \infty$. In fact, this holds for a countable state space provided the chain is positive recurrent, i.e., if there exists an invariant probability measure. The next proposition gives a simple, quantitative version of this fact provided the chain satisfies a certain condition which always holds in the finite irreducible, aperiodic case.

Proposition 12.4.2 Suppose $p : D \times D \to [0,1]$ is the transition probability for a positive recurrent, irreducible, aperiodic Markov chain on a countable state space $D$. Let $\pi$ denote the invariant probability measure. Suppose there exist $\epsilon > 0$ and a positive integer $k$ such that for all $x, y \in D$,
\[
\frac12 \sum_{z \in D} |p_k(x,z) - p_k(y,z)| \le 1 - \epsilon. \tag{12.15}
\]
Then for all positive integers $j$ and all $x \in D$,
\[
\frac12 \sum_{z \in D} |p_j(x,z) - \pi(z)| \le c\, e^{-\beta j},
\]
where $c = (1-\epsilon)^{-1}$ and $e^{-\beta} = (1-\epsilon)^{1/k}$.
Proof If $\mu$ is any probability distribution on $D$, let
\[
\mu_j(x) = \sum_{y \in D} \mu(y)\, p_j(y,x).
\]
Then (12.15) implies that for every $\mu$,
\[
\frac12 \sum_{z \in D} |\mu_k(z) - \pi(z)| \le 1 - \epsilon.
\]
In other words we can write $\mu_k = \epsilon\,\pi + (1-\epsilon)\,\nu^{(1)}$ for some probability measure $\nu^{(1)}$. By iterating (12.15), we can see that for every integer $i \ge 1$ we can write $\mu_{ik} = (1-\epsilon)^i\,\nu^{(i)} + [1 - (1-\epsilon)^i]\,\pi$ for some probability measure $\nu^{(i)}$. This establishes the result for $j = ki$ (with $c = 1$ for these values of $j$), and for other $j$ we find $i$ with $ik \le j < (i+1)k$.
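A minimal numerical sketch of the proposition (not from the text; the chain below is an arbitrary example): the worst-case total variation distance to $\pi$ decays at least geometrically.

    import numpy as np

    P = np.array([[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5],
                  [0.4, 0.0, 0.6]])

    # Invariant probability: left eigenvector of P for eigenvalue 1, normalized.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    pi /= pi.sum()

    Pj = np.eye(P.shape[0])
    for j in range(1, 21):
        Pj = Pj @ P
        tv = 0.5 * np.max(np.abs(Pj - pi).sum(axis=1))   # sup over x of the TV distance
        print(j, tv)                                      # decays like e^{-beta j}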
12.4.1 Chains restricted to subsets

We will now consider Markov chains restricted to a subset of the original state space. If $X_n$ is an irreducible, aperiodic Markov chain with state space $D$ and $A$ is a finite proper subset of $D$, we write $P_A = [p(x,y)]_{x,y \in A}$. Note that $(P_A)^n = [p^A_n(x,y)]_{x,y \in A}$ where
\[
p^A_n(x,y) = P\{X_n = y;\ X_0, \ldots, X_n \in A \mid X_0 = x\} = P^x\{X_n = y,\ \tau_A > n\}, \tag{12.16}
\]
where $\tau_A = \inf\{n : X_n \notin A\}$. Note that
\[
P^x\{\tau_A > n\} = \sum_{y \in A} p^A_n(x,y).
\]
We call $A$ connected and aperiodic (with respect to $P$) if for each $x, y \in A$, there is an $N$ such that for $n \ge N$, $p^A_n(x,y) > 0$. If $A$ is finite, then $A$ is connected and aperiodic if and only if there exists an $n$ such that $(P_A)^n$ has all entries strictly positive. In this case all of the row sums of $P_A$ are less than or equal to one and (since $A$ is a proper subset) there is at least one row whose sum is strictly less than one.
Suppose $X_n$ is an irreducible, aperiodic Markov chain with state space $D$ and $A$ is a finite, connected, aperiodic proper subset of $D$. Let $\lambda$ be as in the Perron-Frobenius Theorem for the matrix $P_A$. Then $0 < \lambda < 1$. Let $v, w$ be the corresponding positive eigenvectors which we write as functions,
\[
\sum_{x \in A} v(x)\, p(x,y) = \lambda\, v(y), \qquad \sum_{y \in A} p(x,y)\, w(y) = \lambda\, w(x).
\]
We normalize the functions so that
\[
\sum_{x \in A} v(x) = 1, \qquad \sum_{x \in A} v(x)\, w(x) = 1,
\]
and we let $\pi(x) = v(x)\, w(x)$. Let
\[
q_A(x,y) = \frac{1}{\lambda}\, p(x,y)\, \frac{w(y)}{w(x)}.
\]
Note that
\[
\sum_{y \in A} q_A(x,y) = \frac{\sum_{y \in A} p(x,y)\, w(y)}{\lambda\, w(x)} = 1.
\]
In other words, $Q_A := [q_A(x,y)]_{x,y \in A}$ is the transition matrix for a Markov chain which we will denote by $Y_n$. Note that $(Q_A)^n = [q^A_n(x,y)]_{x,y \in A}$ where
\[
q^A_n(x,y) = \lambda^{-n}\, p^A_n(x,y)\, \frac{w(y)}{w(x)}
\]
and $p^A_n(x,y)$ is as in (12.16). From this we see that the chain is irreducible and aperiodic. Since
\[
\sum_{x \in A} \pi(x)\, q_A(x,y) = \sum_{x \in A} v(x)\, w(x)\, \frac{1}{\lambda}\, p(x,y)\, \frac{w(y)}{w(x)} = \pi(y),
\]
we see that $\pi$ is the invariant probability for this chain.
Proposition 12.4.3 Under the assumptions above, there exist $c, \beta$ such that for all $n$,
\[
|\lambda^{-n}\, p^A_n(x,y) - w(x)\, v(y)| \le c\, e^{-\beta n}.
\]
In particular,
\[
P\{X_0, \ldots, X_n \in A \mid X_0 = x\} = w(x)\, \lambda^n\, [1 + O(e^{-\beta n})].
\]

Proof Consider the Markov chain with transition matrix $Q_A$. Choose a positive integer $k$ and $\epsilon > 0$ such that $q^A_k(x,y) \ge \epsilon\, \pi(y)$ for all $x, y \in A$. Proposition 12.4.2 gives
\[
|q^A_n(x,y) - \pi(y)| \le c\, e^{-\beta n}
\]
for some $c, \beta$. Since $\pi(y) = v(y)\, w(y)$ and $q^A_n(x,y) = \lambda^{-n}\, p^A_n(x,y)\, w(y)/w(x)$, we get the first assertion, using the fact that $A$ is finite so that $w$ is bounded away from $0$ and $\infty$. The second assertion follows from the first using $\sum_y v(y) = 1$ and
\[
P\{X_0, \ldots, X_n \in A \mid X_0 = x\} = \sum_{y \in A} p^A_n(x,y).
\]
If the Markov chain is symmetric ($p(y,x) = p(x,y)$), then $w(x) = c\, v(x)$ and $\pi(x) = c\, v(x)^2$. The function $g(x) = \sqrt{c}\, v(x)$ can be characterized by the fact that $g$ is strictly positive and satisfies
\[
P_A\, g(x) = \lambda\, g(x), \qquad \sum_{x \in A} g(x)^2 = 1.
\]

The chain $Y_n$ can be considered the chain derived from $X_n$ by conditioning the chain to stay in $A$ forever. The probability measures $v, \pi$ are both invariant (sometimes the word quasi-invariant is used) probability measures but with different interpretations. Roughly speaking, the three quantities $v, w, \pi$ can be described as follows.

• Suppose the chain $X_n$ is observed at a large time $n$ and it is known that the chain has stayed in $A$ for all times up to $n$. Then the conditional distribution of $X_n$ given this information approaches $v$.

• For $x \in A$, the probability that the chain stays in $A$ up to time $n$ is asymptotic to $w(x)\, \lambda^n$.

• Suppose the chain $X_n$ is observed at a large time $n$ and it is known that the chain has stayed in $A$ and will stay in $A$ for all times up to $N$ where $N \gg n$. Then the conditional distribution of $X_n$ given this information approaches $\pi$. We can think of the first term of the product $v(x)\, w(x)$ as the conditional probability of being at $x$ given that the walk has stayed in $A$ up to time $n$, and the second part of the product as the conditional probability, given this, that the walk stays in $A$ for times between $n$ and $N$.
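A minimal numerical sketch of these objects (not from the text; the chain and the set A below are arbitrary examples): compute $\lambda, w, v$ from the substochastic matrix $P_A$ and compare $P^x\{\tau_A > n\}$ with $w(x)\lambda^n$.

    import numpy as np

    # Simple random walk on the integers, killed when it leaves A = {1, 2, 3, 4}.
    states = [1, 2, 3, 4]
    PA = np.array([[0.5 if abs(x - y) == 1 else 0.0 for y in states] for x in states])

    vals, right = np.linalg.eig(PA)
    k = np.argmax(vals.real)
    lam = vals.real[k]
    w = np.abs(right.real[:, k])                      # right eigenvector: PA w = lam w

    valsL, left = np.linalg.eig(PA.T)
    v = np.abs(left.real[:, np.argmax(valsL.real)])   # left eigenvector: v PA = lam v

    # Normalize as in the text: sum v = 1 and sum v*w = 1; then pi = v*w.
    v /= v.sum()
    w /= (v * w).sum()

    n = 40
    survival = np.linalg.matrix_power(PA, n).sum(axis=1)   # P^x{tau_A > n} for x in A
    print(survival / (w * lam ** n))                        # each ratio is close to 1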
The next proposition gives a criterion for determining the rate of convergence to the invariant distribution $v$. Let us write
\[
\bar p^A_n(x,y) = \frac{p^A_n(x,y)}{\sum_{z \in A} p^A_n(x,z)} = P^x\{X_n = y \mid \tau_A > n\}.
\]
Proposition 12.4.4 Suppose $X_n$ is an irreducible, aperiodic Markov chain on the countable state space $D$. Suppose $A$ is a finite, proper subset of $D$ and $A' \subseteq A$. Suppose $\epsilon > 0$ and $k$ is a positive integer such that the following hold.

• If $x \in A$,
\[
\sum_{y \in A'} p^A_k(x,y) \ge \epsilon. \tag{12.17}
\]

• If $x, x' \in A$,
\[
\sum_{y \in A'} \big[\,\bar p^A_k(x,y) \wedge \bar p^A_k(x',y)\,\big] \ge \epsilon. \tag{12.18}
\]

• If $x \in A$, $y \in A'$, and $n$ is a positive integer,
\[
P^y\{\tau_A > n\} \ge \epsilon\, P^x\{\tau_A > n\}. \tag{12.19}
\]

Then there exists $\delta > 0$, depending only on $\epsilon$, such that for all $x, z \in A$ and all integers $m \ge 0$,
\[
\frac12 \sum_{y \in A} \big|\bar p^A_{km}(x,y) - \bar p^A_{km}(z,y)\big| \le (1-\delta)^m.
\]
Proof We fix $\epsilon$ and allow all constants in this proof to depend on $\epsilon$. Let $q_n = \max_{y \in A} P^y\{\tau_A > n\}$. Then (12.19) implies that for all $y \in A'$ and all $n$, $P^y\{\tau_A > n\} \ge \epsilon\, q_n$. Combining this with (12.17) gives, for all positive integers $k, n$,
\[
c\, q_n\, P^x\{\tau_A > k\} \le P^x\{\tau_A > k+n\} \le q_n\, P^x\{\tau_A > k\}. \tag{12.20}
\]
Let $m$ be a positive integer and let
\[
Y_0, Y_1, Y_2, \ldots, Y_m
\]
be the process corresponding to $X_0, X_k, X_{2k}, \ldots, X_{mk}$ conditioned so that $\tau_A > mk$. This is a time-inhomogeneous Markov chain with transition probabilities
\[
P\{Y_j = y \mid Y_{j-1} = x\} = \frac{p^A_k(x,y)\, P^y\{\tau_A > (m-j)k\}}{P^x\{\tau_A > (m-j+1)k\}}, \qquad j = 1, 2, \ldots, m.
\]
Note that (12.20) implies that for all $y \in A$,
\[
P\{Y_j = y \mid Y_{j-1} = x\} \le c_2\, \bar p^A_k(x,y),
\]
and if $y \in A'$,
\[
P\{Y_j = y \mid Y_{j-1} = x\} \ge c_1\, \bar p^A_k(x,y).
\]
Using this and (12.18), we can see that there is a $\delta > 0$ such that if $x, z \in A$ and $j \le m$,
\[
\frac12 \sum_{y \in A} \big|P\{Y_j = y \mid Y_{j-1} = x\} - P\{Y_j = y \mid Y_{j-1} = z\}\big| \le 1 - \delta,
\]
and using an argument as in the proof of Proposition 12.4.2 we can see that
\[
\frac12 \sum_{y \in A} \big|P\{Y_m = y \mid Y_0 = x\} - P\{Y_m = y \mid Y_0 = z\}\big| \le (1-\delta)^m.
\]
12.4 Markov chains 275
12.4.2 Maximal coupling of Markov chains
Here we will describe the maximal coupling of a Markov chain. Suppose that p : D D [0, 1] is
the transition probability function for an irreducible, aperiodic Markov chain with countable state
space D. Assume that g
1
0
, g
2
0
are two initial probability distributions on D. Let g
j
n
denote the
corresponding distribution at time n, given recursively by
g
j
n
(x) =
zD
g
j
n1
(z) p(z, x).
Let | | denote the total variation distance,
|g
1
n
g
2
n
| =
1
2
xD
[g
1
n
(x) g
2
n
(x)[ = 1
xD
[g
1
n
(x) g
2
n
(x)].
Suppose
\[
X^1_0, X^1_1, X^1_2, \ldots, \qquad X^2_0, X^2_1, X^2_2, \ldots
\]
are defined on the same probability space such that for each $j$, $\{X^j_n : n = 0, 1, \ldots\}$ has the distribution of the Markov chain with initial distribution $g^j_0$. Then it is clear that
\[
P\{X^1_n = X^2_n\} \le 1 - \|g^1_n - g^2_n\| = \sum_{x \in D} g^1_n(x) \wedge g^2_n(x). \tag{12.21}
\]
The following theorem shows that there is a way to define the chains on the same probability space so that equality is obtained in (12.21). This theorem gives one example of the powerful probabilistic technique called coupling. Coupling refers to the defining of two or more processes on the same probability space in a way so that each individual process has a certain distribution but the joint distribution has some particularly nice properties. Often, as in this case, the two processes are equal except for an event of small probability.
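A minimal sketch of the quantity on the right of (12.21) (not from the text; the chain and initial distributions below are arbitrary examples): evolve the two distributions and track their total variation distance, which bounds how well any coupling can match the chains at time $n$.

    import numpy as np

    P = np.array([[0.5, 0.5, 0.0],
                  [0.2, 0.3, 0.5],
                  [0.4, 0.0, 0.6]])

    g1 = np.array([1.0, 0.0, 0.0])   # chain started at state 0
    g2 = np.array([0.0, 0.0, 1.0])   # chain started at state 2

    for n in range(11):
        tv = 0.5 * np.abs(g1 - g2).sum()
        print(n, tv, 1 - tv)         # 1 - tv is the best possible P{X^1_n = X^2_n}
        g1, g2 = g1 @ P, g2 @ P      # g^j_{n+1}(x) = sum_z g^j_n(z) p(z, x)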
Theorem 12.4.5 Suppose $p, g^1_n, g^2_n$ are as defined in the previous paragraph. We can define $(X^1_n, X^2_n)$, $n = 0, 1, 2, \ldots$, on the same probability space such that:

• for each $j$, $X^j_0, X^j_1, \ldots$ has the distribution of the Markov chain with initial distribution $g^j_0$;

• for each integer $n \ge 0$,
\[
P\{X^1_m = X^2_m \text{ for all } m \ge n\} = 1 - \|g^1_n - g^2_n\|.
\]
Before doing this proof, let us consider the easier problem of defining $(X^1, X^2)$ on the same probability space so that $X^j$ has distribution $g^j_0$ and
\[
P\{X^1 = X^2\} = 1 - \|g^1_0 - g^2_0\|.
\]
Assume $0 < \|g^1_0 - g^2_0\| < 1$. Let $f_j(x) = g^j_0(x) - [\,g^1_0(x) \wedge g^2_0(x)\,]$. Suppose that $J, X, W^1, W^2$ are independent random variables with the following distributions:
\[
P\{J = 0\} = 1 - P\{J = 1\} = \|g^1_0 - g^2_0\|,
\]
\[
P\{X = x\} = \frac{g^1_0(x) \wedge g^2_0(x)}{1 - \|g^1_0 - g^2_0\|}, \qquad x \in D,
\]
\[
P\{W^j = x\} = \frac{f_j(x)}{\|g^1_0 - g^2_0\|}, \qquad x \in D.
\]
Let $X^j = 1\{J = 1\}\, X + 1\{J = 0\}\, W^j$.
It is easy to check that this construction works.
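The one-step construction just described translates directly into code; a minimal sketch (not from the text; the two distributions below are arbitrary examples):

    import numpy as np

    rng = np.random.default_rng(1)

    g1 = np.array([0.5, 0.3, 0.2])
    g2 = np.array([0.2, 0.2, 0.6])

    overlap = np.minimum(g1, g2)           # g1 ^ g2
    tv = 1 - overlap.sum()                 # ||g1 - g2||
    f1, f2 = g1 - overlap, g2 - overlap    # residual (disjointly supported) parts

    def sample_pair():
        """One draw of (X^1, X^2) achieving P{X^1 = X^2} = 1 - ||g1 - g2||."""
        if rng.random() < 1 - tv:                       # J = 1: use the common part
            x = rng.choice(len(g1), p=overlap / overlap.sum())
            return x, x
        w1 = rng.choice(len(g1), p=f1 / tv)             # J = 0: independent residual draws
        w2 = rng.choice(len(g2), p=f2 / tv)
        return w1, w2

    draws = [sample_pair() for _ in range(100_000)]
    print(np.mean([a == b for a, b in draws]), 1 - tv)  # the two numbers nearly agree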
Proof For ease, we will assume that $\|g^1_0 - g^2_0\| = 1$ and $\|g^1_n - g^2_n\| \to 0$ as $n \to \infty$; the adjustment needed if this does not hold is left to the reader. Let $(Z^1_n, Z^2_n)$ be independent Markov chains with the appropriate distributions. Let $f^j_n(x) = g^j_n(x) - [\,g^1_n(x) \wedge g^2_n(x)\,]$ and define $h^j_n$ by $h^j_0(x) = g^j_0(x) = f^j_0(x)$ and, for $n \ge 1$,
\[
h^j_n(x) = \sum_{z \in D} f^j_{n-1}(z)\, p(z,x).
\]
Note that $f^j_{n+1}(x) = h^j_{n+1}(x) - [\,h^1_{n+1}(x) \wedge h^2_{n+1}(x)\,]$. Let
\[
\lambda^j_n(x) = \frac{h^1_n(x) \wedge h^2_n(x)}{h^j_n(x)} \quad \text{if } h^j_n(x) \neq 0.
\]
We set $\lambda^j_n(x) = 0$ if $h^j_n(x) = 0$. We let $\{Y^j(n,x) : j = 1,2;\ n = 1,2,\ldots;\ x \in D\}$ be independent $0$-$1$ random variables, independent of $(Z^1_n, Z^2_n)$, with $P\{Y^j(n,x) = 1\} = \lambda^j_n(x)$.
We now define $0$-$1$ random variables $J^j_n$ as follows:

• $J^j_0 \equiv 0$.

• If $J^j_n = 1$, then $J^j_m = 1$ for all $m \ge n$.

• If $J^j_n = 0$, then $J^j_{n+1} = Y^j(n+1, Z^j_{n+1})$.

We claim that
\[
P\{J^j_n = 0;\ Z^j_n = x\} = f^j_n(x).
\]
For $n = 0$, this follows immediately from the definition. Also,
\[
P\{J^j_{n+1} = 0;\ Z^j_{n+1} = x\} = \sum_{z \in D} P\{J^j_n = 0;\ Z^j_n = z\}\, P\{Z^j_{n+1} = x,\ Y^j(n+1,x) = 0 \mid J^j_n = 0;\ Z^j_n = z\}.
\]
The random variable $Y^j(n+1,x)$ is independent of the Markov chain, and the event $\{J^j_n = 0;\ Z^j_n = z\}$ depends only on the chain up to time $n$ and the values of $\{Y^j(k,y) : k \le n\}$. Therefore,
\[
P\{Z^j_{n+1} = x,\ Y^j(n+1,x) = 0 \mid J^j_n = 0;\ Z^j_n = z\} = p(z,x)\, [1 - \lambda^j_{n+1}(x)].
\]
Therefore, we have the inductive argument
\[
P\{J^j_{n+1} = 0;\ Z^j_{n+1} = x\} = \sum_{z \in D} f^j_n(z)\, p(z,x)\, [1 - \lambda^j_{n+1}(x)]
= h^j_{n+1}(x)\, [1 - \lambda^j_{n+1}(x)]
\]
\[
= h^j_{n+1}(x) - [\,h^1_{n+1}(x) \wedge h^2_{n+1}(x)\,] = f^j_{n+1}(x),
\]
which establishes the claim.
Let $K^j$ denote the smallest $n$ such that $J^j_n = 1$. The condition $\|g^1_n - g^2_n\| \to 0$ implies that $K^j < \infty$ with probability one. A key fact is that for each $n$ and each $x$,
\[
P\{K^1 = n+1;\ Z^1_{n+1} = x\} = P\{K^2 = n+1;\ Z^2_{n+1} = x\} = h^1_{n+1}(x) \wedge h^2_{n+1}(x).
\]
This is immediate for $n = 0$, and for $n > 0$,
\[
P\{K^j = n+1;\ Z^j_{n+1} = x\}
= \sum_{z \in D} P\{J^j_n = 0;\ Z^j_n = z\}\, P\{Y^j(n+1,x) = 1;\ Z^j_{n+1} = x \mid J^j_n = 0;\ Z^j_n = z\}
\]
\[
= \sum_{z \in D} f^j_n(z)\, p(z,x)\, \lambda^j_{n+1}(x)
= h^j_{n+1}(x)\, \lambda^j_{n+1}(x) = h^1_{n+1}(x) \wedge h^2_{n+1}(x).
\]
The last important observation is that the distribution of $W_m := Z^j_{n+m}$, $m = 0, 1, \ldots$, given the event $\{K^j = n;\ Z^j_n = x\}$, is that of a Markov chain with transition probability $p$ starting at $x$.
The reader may note that for each $j$, the process $(Z^j_n, J^j_n)$ is a time-inhomogeneous Markov chain with transition probabilities
\[
P\{(Z^j_{n+1}, J^j_{n+1}) = (y,1) \mid (Z^j_n, J^j_n) = (x,1)\} = p(x,y),
\]
\[
P\{(Z^j_{n+1}, J^j_{n+1}) = (y,0) \mid (Z^j_n, J^j_n) = (x,0)\} = p(x,y)\, [1 - \lambda^j_{n+1}(y)],
\]
\[
P\{(Z^j_{n+1}, J^j_{n+1}) = (y,1) \mid (Z^j_n, J^j_n) = (x,0)\} = p(x,y)\, \lambda^j_{n+1}(y).
\]
The chains $(Z^1_n, J^1_n)$ and $(Z^2_n, J^2_n)$ are independent. However, the transition probabilities for these chains depend on both initial distributions and $p$.
We are now ready to make our construction of $(X^1_n, X^2_n)$.

• Define for each $(n,x)$ a process $\{W^{n,x}_m : m = 0, 1, 2, \ldots\}$ that has the distribution of the Markov chain with initial point $x$. Assume that all these processes are independent.

• Choose $(n,x)$ according to the probability distribution
\[
h^1_n(x) \wedge h^2_n(x) = P\{K^j = n;\ Z^j_n = x\}.
\]

• Set $J^j_m = 1$ for $m \ge n$, $J^j_m = 0$ for $m < n$, and $K^1 = K^2 = n$. Note that $K^j$ is the smallest $n$ such that $J^j_n = 1$.

• Given $(n,x)$, choose $X^1_0, \ldots, X^1_n$ from the conditional distribution of the Markov chain with initial distribution $g^1_0$ conditioned on the event $\{K^1 = n;\ Z^1_n = x\}$.

• Given $(n,x)$, choose $X^2_0, \ldots, X^2_n$ (conditionally) independent of $X^1_0, \ldots, X^1_n$ from the conditional distribution of the Markov chain with initial distribution $g^2_0$ conditioned on the event $\{K^2 = n;\ Z^2_n = x\}$.

• Let
\[
X^j_m = W^{n,x}_{m-n}, \qquad m = n, n+1, \ldots .
\]

The two conditional distributions above are not easy to express explicitly; fortunately, we do not need to do so.
To finish the proof, we need only check that the above construction satisfies the conditions. For fixed $j$, the fact that $X^j_0, X^j_1, \ldots$ has the distribution of the chain with initial distribution $g^j_0$ is immediate from the construction and the earlier observation that the distribution of $Z^j_n, Z^j_{n+1}, \ldots$ given $\{K^j = n;\ Z^j_n = x\}$ is that of the Markov chain starting at $x$. Also, the construction immediately gives $X^1_m = X^2_m$ if $m \ge K^1 = K^2$. Also,
\[
P\{J^j_n = 0\} = \sum_{x \in D} f^j_n(x) = \|g^1_n - g^2_n\|.
\]

Remark. A review of the proof of Theorem 12.4.5 shows that we do not need to assume that the Markov chain is time-homogeneous. However, time-homogeneity makes the notation a little simpler and we use the result only for time-homogeneous chains.
12.5 Some Tauberian theory

Lemma 12.5.1 Suppose $\beta > 0$. Then as $\xi \to 1-$,
\[
\sum_{n=2}^{\infty} \xi^n\, n^{\beta-1} \sim \frac{\Gamma(\beta)}{(1-\xi)^{\beta}}.
\]

Proof Let $\epsilon = 1 - \xi$. First note that
\[
\sum_{n \ge \epsilon^{-2}} \xi^n\, n^{\beta-1} = \sum_{n \ge \epsilon^{-2}} \big[(1-\epsilon)^{1/\epsilon}\big]^{\epsilon n}\, n^{\beta-1} \le \sum_{n \ge \epsilon^{-2}} e^{-\epsilon n}\, n^{\beta-1},
\]
and the right-hand side decays faster than every power of $\epsilon$. For $n \le \epsilon^{-2}$ we can do the asymptotics
\[
\xi^n = \exp\{n \log(1-\epsilon)\} = \exp\{-n(\epsilon + O(\epsilon^2))\} = e^{-n\epsilon}\, [1 + O(n\epsilon^2)].
\]
Hence,
\[
\sum_{n \le \epsilon^{-2}} \xi^n\, n^{\beta-1} = \sum_{n \le \epsilon^{-2}} e^{-\epsilon n}\, n^{\beta-1}\, [1 + (n\epsilon)\, O(\epsilon)].
\]
Using Riemann sum approximations we see that
\[
\lim_{\epsilon \to 0+}\ \epsilon \sum_{n=1}^{\infty} e^{-\epsilon n}\, (\epsilon n)^{\beta-1} = \int_0^{\infty} e^{-t}\, t^{\beta-1}\, dt = \Gamma(\beta).
\]
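A quick numerical sanity check of Lemma 12.5.1 (a minimal sketch, not part of the text; the value of beta below is an arbitrary example), comparing the series with $\Gamma(\beta)(1-\xi)^{-\beta}$ as $\xi$ approaches $1$:

    import math

    beta = 1.5
    for xi in [0.9, 0.99, 0.999, 0.9999]:
        # Truncate the series far beyond the scale 1/(1 - xi) so the tail is negligible.
        N = int(50 / (1 - xi))
        s = sum(xi ** n * n ** (beta - 1) for n in range(2, N))
        print(xi, s / (math.gamma(beta) * (1 - xi) ** (-beta)))   # ratio tends to 1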
Proposition 12.5.2 Suppose $u_n$ is a sequence of nonnegative real numbers. If $\beta > 0$, the following two statements are equivalent:
\[
\sum_{n=0}^{\infty} \xi^n\, u_n \sim \frac{\Gamma(\beta)}{(1-\xi)^{\beta}}, \qquad \xi \to 1-, \tag{12.22}
\]
\[
\sum_{n=1}^{N} u_n \sim \frac{1}{\beta}\, N^{\beta}, \qquad N \to \infty. \tag{12.23}
\]
Moreover, if the sequence is monotone, either of these statements implies
\[
u_n \sim n^{\beta-1}, \qquad n \to \infty.
\]
Proof Let $U_n = \sum_{j \le n} u_j$ with $U_{-1} = 0$. Note that
\[
\sum_{n=0}^{\infty} \xi^n\, u_n = \sum_{n=0}^{\infty} \xi^n\, [U_n - U_{n-1}] = (1-\xi) \sum_{n=0}^{\infty} \xi^n\, U_n. \tag{12.24}
\]
If (12.23) holds, then by the previous lemma
\[
\sum_{n=0}^{\infty} \xi^n\, u_n \sim (1-\xi) \sum_{n=0}^{\infty} \frac{\xi^n\, n^{\beta}}{\beta} \sim (1-\xi)\, \frac{\Gamma(\beta+1)}{\beta\, (1-\xi)^{\beta+1}} = \frac{\Gamma(\beta)}{(1-\xi)^{\beta}}.
\]
Now suppose (12.22) holds. We first give an upper bound on $U_n$. Using $1 - \xi = 1/n$, we can see that as $n \to \infty$,
\[
U_n \le n^{-1} \Big(1 - \frac1n\Big)^{-2n} \sum_{j=n}^{2n-1} \Big(1 - \frac1n\Big)^{j} U_j
\le n^{-1} \Big(1 - \frac1n\Big)^{-2n} \sum_{j=0}^{\infty} \Big(1 - \frac1n\Big)^{j} U_j
\le [1 + o(1)]\, e^2\, \Gamma(\beta)\, n^{\beta}.
\]
The last relation uses (12.24). Let $\nu^{(j)}$ denote the measure on $[0,\infty)$ that gives measure $j^{-\beta}\, u_n$ to the point $n/j$. Then the last estimate shows that the total mass of $\nu^{(j)}$ is uniformly bounded on each compact interval, and hence there is a subsequence that converges weakly to a measure $\nu$ that is finite on each compact interval. Using (12.22) we can see that for each $\lambda > 0$,
\[
\int_0^{\infty} e^{-\lambda x}\, \nu(dx) = \int_0^{\infty} e^{-\lambda x}\, x^{\beta-1}\, dx.
\]
This implies that $\nu$ is $x^{\beta-1}\, dx$. Since the limit is independent of the subsequence, we can conclude that $\nu^{(j)} \to \nu$ and this implies (12.23).

The fact that (12.23) implies the last assertion if $u_n$ is monotone is straightforward using
\[
U_{n(1+\epsilon)} - U_n \sim \beta^{-1}\, \big[(n(1+\epsilon))^{\beta} - n^{\beta}\big], \qquad n \to \infty.
\]
The following is proved similarly.

Proposition 12.5.3 Suppose $u_n$ is a sequence of nonnegative real numbers. If $\alpha \in \mathbb{R}$, the following two statements are equivalent:
\[
\sum_{n=0}^{\infty} \xi^n\, u_n \sim \frac{1}{1-\xi}\, \Big[\log\frac{1}{1-\xi}\Big]^{\alpha}, \qquad \xi \to 1-, \tag{12.25}
\]
\[
\sum_{n=1}^{N} u_n \sim N\, \log^{\alpha} N, \qquad N \to \infty. \tag{12.26}
\]
Moreover, if the sequence is monotone, either of these statements implies
\[
u_n \sim \log^{\alpha} n, \qquad n \to \infty.
\]
12.6 Second moment method

Lemma 12.6.1 Suppose $X$ is a nonnegative random variable with $E[X^2] < \infty$ and $0 < r < 1$. Then
\[
P\{X \ge r\, E(X)\} \ge (1-r)^2\, \frac{E(X)^2}{E(X^2)}.
\]

Proof Without loss of generality, we may assume that $E(X) = 1$. Since $E[X; X < r] \le r$, we know that $E[X; X \ge r] \ge 1 - r$. Then,
\[
E(X^2) \ge E[X^2; X \ge r] = P\{X \ge r\}\, E[X^2 \mid X \ge r]
\ge P\{X \ge r\}\, \big(E[X \mid X \ge r]\big)^2 = \frac{E[X; X \ge r]^2}{P\{X \ge r\}} \ge \frac{(1-r)^2}{P\{X \ge r\}}.
\]
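A minimal numerical illustration of the lemma (not from the text; the random variable below is an arbitrary example):

    import numpy as np

    rng = np.random.default_rng(2)

    X = rng.exponential(scale=2.0, size=1_000_000)   # a nonnegative random variable
    r = 0.5

    lhs = np.mean(X >= r * X.mean())
    rhs = (1 - r) ** 2 * X.mean() ** 2 / np.mean(X ** 2)
    print(lhs, rhs, lhs >= rhs)   # the empirical probability dominates the bound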
Corollary 12.6.2 Suppose $E_1, E_2, \ldots$ is a collection of events with $\sum P(E_n) = \infty$. Suppose there is a $K < \infty$ such that for all $j \neq k$, $P(E_j \cap E_k) \le K\, P(E_j)\, P(E_k)$. Then
\[
P\{E_k \text{ i.o.}\} \ge \frac{1}{K}.
\]

Proof Let $V_n = \sum_{k=1}^{n} 1_{E_k}$. Then the assumptions imply that
\[
\lim_{n\to\infty} E(V_n) = \infty,
\]
and
\[
E(V_n^2) \le \sum_{j=1}^{n} P(E_j) + \sum_{j \neq k} K\, P(E_j)\, P(E_k) \le E(V_n) + K\, E(V_n)^2 = \Big[\frac{1}{E(V_n)} + K\Big]\, E(V_n)^2.
\]
By Lemma 12.6.1, for every $0 < r < 1$,
\[
P\{V_n \ge r\, E(V_n)\} \ge \frac{(1-r)^2}{K + E(V_n)^{-1}}.
\]
Since $E(V_n) \to \infty$, this implies
\[
P\{V_{\infty} = \infty\} \ge \frac{(1-r)^2}{K}.
\]
Since this holds for every $0 < r < 1$, we get the result.
12.7 Subadditivity

Lemma 12.7.1 (Subadditivity lemma) Suppose $f : \{1, 2, \ldots\} \to \mathbb{R}$ is subadditive, i.e., for all $n, m$, $f(n+m) \le f(n) + f(m)$. Then,
\[
\lim_{n\to\infty} \frac{f(n)}{n} = \inf_{n > 0} \frac{f(n)}{n}.
\]

Proof Fix an integer $N > 0$. We can write any integer $n$ as $jN + k$ where $j$ is a nonnegative integer and $k \in \{1, \ldots, N\}$. Let $b_N = \max\{f(1), \ldots, f(N)\}$. Then subadditivity implies
\[
\frac{f(n)}{n} \le \frac{j f(N) + f(k)}{jN} \le \frac{f(N)}{N} + \frac{b_N}{jN}.
\]
Therefore,
\[
\limsup_{n\to\infty} \frac{f(n)}{n} \le \frac{f(N)}{N}.
\]
Since this is true for every $N$, we get the lemma.
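A minimal numerical illustration of the subadditivity lemma (not from the text; the subadditive function below is an arbitrary example): the ratio $f(n)/n$ decreases toward its infimum.

    import math

    def f(n):
        # An arbitrary subadditive function: f(n + m) <= f(n) + f(m).
        return 2.0 * n + math.sqrt(n)

    for n in [1, 10, 100, 1000, 10000, 100000]:
        print(n, f(n) / n)   # converges (downward) to inf_n f(n)/n = 2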
Corollary 12.7.2 Suppose $r_n$ is a sequence of positive numbers and $b_1, b_2 > 0$ are such that for every $n, m$,
\[
b_1\, r_n\, r_m \le r_{n+m} \le b_2\, r_n\, r_m. \tag{12.27}
\]
Then there exists $\beta > 0$ such that for all $n$,
\[
b_2^{-1}\, \beta^n \le r_n \le b_1^{-1}\, \beta^n.
\]

Proof Let $f(n) = \log r_n + \log b_2$. Then $f$ is subadditive and hence
\[
\lim_{n\to\infty} \frac{f(n)}{n} = \inf_{n} \frac{f(n)}{n} =: \log\beta.
\]
This shows that $r_n \ge \beta^n / b_2$. Similarly, by considering the subadditive function $g(n) = -\log r_n - \log b_1$, we get $r_n \le b_1^{-1}\, \beta^n$.

Remark. Note that if $r_n$ satisfies (12.27), then so does $\lambda^n\, r_n$ for each $\lambda > 0$. Therefore, we cannot determine the value of $\beta$ from (12.27).
Exercises

Exercise 12.1 Find the functions $f_3$ and $f_4$ in (12.4).

Exercise 12.2 Go through the proof of Lemma 12.5.1 carefully and estimate the size of the error term in the asymptotics.

Exercise 12.3 Suppose $E_1 \supset E_2 \supset \cdots$ is a decreasing sequence of events with $P(E_n) > 0$ for each $n$. Suppose there exists $\alpha > 0$ such that
\[
\sum_{n=1}^{\infty} \big|P(E_n \mid E_{n-1}) - (1 - \alpha\, n^{-1})\big| < \infty.
\]
Show there exists $c$ such that
\[
P(E_n) \sim c\, n^{-\alpha}. \tag{12.28}
\]
(Hint: use Lemma 12.1.4.)
Exercise 12.4 In this exercise we will consider an alternative approach to the Perron-Frobenius Theorem. Suppose
\[
q : \{1, 2, \ldots\} \times \{1, 2, \ldots\} \to [0,\infty)
\]
is a function such that for each $x$,
\[
q(x) := \sum_{y} q(x,y) \le 1.
\]
Define $q_n(x,y)$ by matrix multiplication as usual, that is, $q_1(x,y) = q(x,y)$ and
\[
q_n(x,y) = \sum_{z} q_{n-1}(x,z)\, q(z,y).
\]
Assume for each $x$, $q_n(x,1) > 0$ for all $n$ sufficiently large. Define
\[
q_n(x) = \sum_{y} q_n(x,y), \qquad p_n(x,y) = \frac{q_n(x,y)}{q_n(x)},
\]
\[
\bar q_n = \sup_{x} q_n(x), \qquad \tilde q(x) = \inf_{n} \frac{q_n(x)}{\bar q_n}.
\]
Assume there is a function $F : \{1, 2, \ldots\} \to [0,1]$ and a positive integer $m$ such that
\[
p_m(x,y) \ge F(y), \qquad 1 \le x, y < \infty,
\]
and such that
\[
\delta := \sum_{y} F(y)\, \tilde q(y) > 0.
\]

(i) Show there exists $0 < \lambda \le 1$ such that
\[
\lim_{n\to\infty} \bar q_n^{\,1/n} = \lambda.
\]
Moreover, $\bar q_n \ge \lambda^n$. (Hint: $\bar q_{n+m} \le \bar q_n\, \bar q_m$.)

(ii) Show that
\[
p_{n+k}(x,y) = \sum_{z} \rho_{n,k}(x,z)\, p_k(z,y),
\]
where
\[
\rho_{n,k}(x,z) = \frac{p_n(x,z)\, q_k(z)}{\sum_{w} p_n(x,w)\, q_k(w)} = \frac{q_n(x,z)\, q_k(z)}{\sum_{w} q_n(x,w)\, q_k(w)}.
\]

(iii) Show that if $k, x$ are positive integers and $n \ge km$,
\[
\frac12 \sum_{y} |p_{km}(1,y) - p_n(x,y)| \le (1-\delta)^k.
\]

(iv) Show that the limit
\[
v(y) = \lim_{n\to\infty} p_n(1,y)
\]
exists, and that if $k, x$ are positive integers and $n \ge km$,
\[
\frac12 \sum_{y} |v(y) - p_n(x,y)| \le (1-\delta)^k.
\]

(v) Show that
\[
\lambda\, v(y) = \sum_{x} v(x)\, q(x,y).
\]

(vi) Show that for each $x$, the limit
\[
w(x) = \lim_{n\to\infty} \lambda^{-n}\, q_n(x)
\]
exists, is positive, and $w$ satisfies
\[
\lambda\, w(x) = \sum_{y} q(x,y)\, w(y).
\]
(Hint: consider $q_{n+1}(x)/q_n(x)$.)

(vii) Show that there is a $C = C(\delta, \lambda) < \infty$ such that if $\epsilon_n(x)$ is defined by
\[
q_n(x) = w(x)\, \lambda^n\, [1 + \epsilon_n(x)],
\]
then
\[
|\epsilon_n(x)| \le C\, e^{-\alpha n},
\]
where $\alpha = -\log(1-\delta)/m$.

(viii) Show that there is a $C = C(\delta, \lambda) < \infty$ such that if $\epsilon_n(x,y)$ is defined by
\[
q_n(x,y) = w(x)\, \lambda^n\, [v(y) + \epsilon_n(x,y)],
\]
then
\[
|\epsilon_n(x,y)| \le C\, e^{-\alpha n}.
\]

(ix) Suppose that $Q$ is an $N \times N$ matrix with nonnegative entries such that $Q^m$ has all positive entries. Suppose that the row sums of $Q$ are bounded by $K$. For $1 \le j, k \le N$, let $q(j,k) = K^{-1}\, Q(j,k)$; set $q(j,k) = 0$ if $k > N$; and $q(k,j) = \delta_{j,1}$ if $k > N$. Show that the conditions above are satisfied (and hence we get the Perron-Frobenius Theorem).
Exercise 12.5 In the previous exercise, let $q(x,1) = 1/2$ for all $x$, $q(2,2) = 1/2$, and $q(x,y) = 0$ for all other $x, y$. Show that there is no $F$ such that $\delta > 0$.
Exercise 12.6 Suppose $X_1, X_2, \ldots$ are i.i.d. random variables in $\mathbb{R}$ with mean zero, variance one, and such that for some $t > 0$,
\[
\beta := 1 + E\big[X_1^2\, e^{tX_1};\ X_1 \ge 0\big] < \infty.
\]
Let
\[
S_n = X_1 + \cdots + X_n.
\]

(i) Show that for all $n$,
\[
E\big[e^{tS_n}\big] \le e^{\beta n t^2/2}.
\]
(Hint: expand the moment generating function for $X_1$ about $s = 0$.)

(ii) Show that if $r \le \beta t n$,
\[
P\{S_n \ge r\} \le \exp\Big\{-\frac{r^2}{2\beta n}\Big\}.
\]
Exercise 12.7 Suppose $X_1, X_2, \ldots$ are i.i.d. random variables in $\mathbb{R}$ with mean zero, variance one, and such that for some $t > 0$ and $0 < \theta < 1$,
\[
\beta := 1 + E\big[X_1^2\, e^{tX_1^{\theta}};\ X_1 \ge 0\big] < \infty.
\]
Let $S_n = X_1 + \cdots + X_n$. Suppose $r > 0$ and $n$ is a positive integer. Let
\[
K = \Big(\frac{nt}{r}\Big)^{\frac{1}{1-\theta}}, \qquad
\tilde X_j = X_j\, 1\{X_j \le K\}, \qquad
\tilde S_n = \tilde X_1 + \cdots + \tilde X_n.
\]

(i) Show that
\[
P\{X_j \neq \tilde X_j\} \le (\beta - 1)\, K^{-2}\, e^{-tK^{\theta}}.
\]

(ii) Show that
\[
E\big[e^{tK^{\theta-1}\, \tilde S_n}\big] \le e^{\beta n t^2 K^{2(\theta-1)}/2}.
\]

(iii) Show that
\[
P\{S_n \ge r\} \le \exp\Big\{-\frac{r^2}{2\beta n}\Big\} + n\, (\beta - 1)\, K^{-2}\, e^{-tK^{\theta}}.
\]
References

Bhattacharya, R. and Rao, R. (1976). Normal Approximation and Asymptotic Expansions, John Wiley & Sons.
Bousquet-Mélou, M. and Schaeffer, G. (2002). Walks on the slit plane, Prob. Theor. Rel. Fields 124, 305-344.
Duplantier, B. and David, F. (1988). Exact partition functions and correlation functions of multiple Hamiltonian walks on the Manhattan lattice, J. Stat. Phys. 51, 327-434.
Fomin, S. (2001). Loop-erased walks and total positivity, Trans. AMS 353, 3563-3583.
Fukai, Y. and Uchiyama, K. (1996). Potential kernel for two-dimensional random walk, Annals of Prob. 24, 1979-1992.
Kenyon, R. (2000). The asymptotic determinant of the discrete Laplacian, Acta Math. 185, 239-286.
Komlós, J., Major, P., and Tusnády, G. (1975). An approximation of partial sums of independent rvs and the sample df I, Z. Wahrsch. verw. Geb. 32, 111-131.
Komlós, J., Major, P., and Tusnády, G. (1975). An approximation of partial sums of independent rvs and the sample df II, Z. Wahrsch. verw. Geb. 34, 33-58.
Kozma, G. (2007). The scaling limit of loop-erased random walk in three dimensions, Acta Math. 191, 29-152.
Lawler, G. (1996). Intersections of Random Walks, Birkhäuser.
Lawler, G. and Puckette, E. (2000). The intersection exponent for simple random walk, Combin., Probab., and Comp. 9, 441-464.
Lawler, G., Schramm, O., and Werner, W. (2001). Values of Brownian intersection exponents II: plane exponents, Acta Math. 187, 275-308.
Lawler, G., Schramm, O., and Werner, W. (2004). Conformal invariance of planar loop-erased random walk and uniform spanning trees, Annals of Probab. 32, 939-995.
Lawler, G. and Trujillo Ferreras, J. A. Random walk loop soup, Trans. Amer. Math. Soc. 359, 767-787.
Masson, R. (2009). The growth exponent for loop-erased walk, Elect. J. Prob. 14, 1012-1073.
Spitzer, F. (1976). Principles of Random Walk, Springer-Verlag.
Teufl, E. and Wagner, S. (2006). The number of spanning trees of finite Sierpinski graphs, Fourth Colloquium on Mathematics and Computer Science, DMTCS proc. AG, 411-414.
Wilson, D. (1996). Generating random spanning trees more quickly than the cover time, Proc. STOC96, 296-303.
Index of Symbols
If an entry is followed by a chapter number,
then that notation is used only in that chap-
ter. Otherwise, the notation may appear through-
out the book.
a(x), 86
a
A
, 147
a(x), 90
A
m,n
[Chap. 8], 225
/
d
, 154
b [Chap. 5], 106
B
t
, 64
B
n
, 11
cap, 136, 146
C
2
, 126
C
d
(d 3), 82
C
d
, 82
(
n
, 11
((A; v), ((A) [Chap. 8], 233
(
t
, (
t
[Chap. 8], 207
Df(y), 152
d() [Chap. 8], 203
deg [Chap. 8], 202
k,i
[Chap. 7], 173
e
j
, 9
Es
A
(x), 136
c
A
(f, g), c
A
(f) [Chap. 8], 231
c
A
(x, y) [Chap. 8], 211
c
A
(x, y), c
A
[Chap. 8], 210
c
A
(x, y) [Chap. 8], 210
f(; x) [Chap. 8], 203
F(A; ), F
x
(A; ), F
V
(A; ), F(A) [Chap. 8],
203
F(x, y; ) [Chap. 4], 78
F
n
() [Chap. 2], 32
F
V
1
,V
2
(A) [Chap. 8], 209
g(; x), g() [Chap. 8], 203
g(, n) [Chap. 2], 32
g
A
(x), 135
G(x), 76
G(x, y), 76
G(x, y; ), 76
G(x; ), 76
G
A
(x, y), 96
G(x), 84
2
, 126
, 11
k,j
[Chap. 7], 172
, 80
h
D
(x, y), 181
hm
A
, 138
hm
A
(x), 144
H
A
(x, y), 123
H
A
(x, y), 153, 211
H
A
(x, y) [Chap. 8], 212
inrad(A), 178
, 11
, 11
K [Chap. 5], 106
K() [Chap. 8], 200
LE() [Chap. 8], 208
L, 17
L
A
, 157
L, L
j
, L
x
j
, L
x
[Chap. 8], 201
L, L
j
, L
x
j
, L
x
[Chap. 8], 201
m = m
q
[Chap. 8], 200
m() [Chap. 8], 201
o(), 21
osc, 66
O(), 21
[] [Chap. 8], 211
[Chap. 8], 200
p
n
(x, y), 10
p
n
(x), 25
p
t
(x, y), 13
P
A
, 157
T, T
d
, 10
T
, T
d
, 17
T
, T
d
, 17
q() [Chap. 8], 199
q(T ; x
0
) [Chap. 8], 201
q
S, 13
T
A
[Chap. 6], 136
T
A
[Chap. 6], 136
T [Chap. 8], 201
A
, 96
A
, 96
A, 199
Z
d
, 9
(B, S; n), 69
(), 27
() [Chap. 8], 203
,
2, 171
,
r
[Chap. 5], 103
r
[Chap. 5], 106
n
, 126
n
, 126
x
, 17
2
x
, 17
A, 119
i
A, 119
x y [Chap. 8], 202
Index
h-process, 245
adjacent, 201
aperiodic, 10, 158
Berry-Esseen bounds, 36
Beurling estimate, 154
bipartite, 10, 158
boundary, inner, 119
boundary, outer, 119
Brownian motion, standard, 64, 71, 172
capacity
d ≥ 3, 136
capacity, two dimensions, 146
central limit theorem (CLT), 24
characteristic function, 27
closure, discrete, 119
connected (with respect to p), 120
coupling, 48
covariance matrix, 11
cycle, 198
unrooted, 199
defective increment distribution, 112
degree, 201
determinant of the Laplacian, 204
difference operators, 17
Dirichlet form, 230
Dirichlet problem, 121, 122
Dirichlet-to-Neumann map, 153
domain Markov property, 250
dyadic coupling, 166, 172
eigenvalue, 157, 191
excursion measure, 209
loop-erased, 209
nonintersecting, 210
self-avoiding, 209
excursions (boundary), 209
filtration, 19, 263
Fomin's identity, 212
functional central limit theorem, 63
fundamental solution, 95
gambler's ruin, 103, 106
Gaussian free field, 230, 232
generating function, 202
cycle, 202
loop measure, 202
generating function, first visit, 78
generating set, 10
generator, 17
Girsanov transformation, 44
Green's function, 76, 96
half-line, 115, 155
harmonic (with respect to p), 120
harmonic function
difference estimates, 125, 131
harmonic functions, 192
harmonic measure, 138, 144
Harnack inequality, 122, 131
hexagonal lattice, 16
honeycomb lattice, 16
increment distribution, 10
inradius, 178
invariance principle, 63
irreducible cycle, 202
joint normal distribution, 24, 267
killing rate, 76
killing time, 76, 113, 199
KMT coupling, 166
Laplacian, discrete, 17
last-exit decomposition, 99, 130
lattice, 14
lazy walker, 77, 201
length (of a path or cycle), 198
local central limit theorem (LCLT), 24, 25
loop
unrooted, 199
loop measure
rooted, 199
unrooted, 200
loop soup, 206
loop-erased random walk (LERW), 248
loop-erasure (chronological), 207
martingale, 120, 263
maximal coupling, 275
maximal inequality, 265
maximum principle, 121
modulus of continuity of Brownian motion, 68
Neumann problem, 152
normal derivative, 152
optional sampling theorem, 125, 263, 267
oscillation, 66
overshoot, 108, 110, 126
Perron-Frobenius theorem, 270
Poisson kernel, 122, 123, 181
boundary, 210
excursion, 152
potential kernel, 86
quantile coupling, 170
function, 170
range, 12
recurrent, 75
recurrent set, 141
reflected random walk, 153
reflection principle, 20
regular graph, 201
restriction property, 209
root, 198
Schramm-Loewner evolution, 243, 258
second moment method, 280
simple random walk, 9
on a graph, 201
simply connected, 178
Skorokhod embedding, 68, 166
spanning tree, 200
complete graph, 216
counting number of trees, 215
free, 221
hypercube, 217
rectangle, 224
Sierpinski graphs, 220
uniform, 214
Stirling's formula, 52
stopping time, 19, 263
strong approximation, 64
strong Markov property, 19
subMarkov, 199
strictly, 199
Tauberian theorems, 79, 278
transient, 75
transient set, 141
transitive graph, 201
triangular lattice, 15
weight, 198
symmetric, 199
Wiener's test, 142
Wilson's algorithm, 214