Probability Theory I: CAM 384K Concepts

Theorems, definitions, and concepts from Mihai Sirbu's Probability Theory I course (CAM 384K) taught at the University of Texas at Austin in Fall of 2008. Compiled by Rhys Ulerich (updated February 12, 2009).


28 Aug
F ⊂ 2^Ω is a σ-algebra iff
(i) F ≠ ∅
(ii) A ∈ F ⇒ Aᶜ = Ω − A ∈ F
(iii) A_i ∈ F ∀i ∈ ℕ ⇒ ∪_{i=1}^∞ A_i ∈ F
Consequently ∅, Ω ∈ F, and F is closed under countable intersection.
Ω ≠ ∅ is the sample space. F ⊂ 2^Ω is a σ-algebra of events. A ∈ F is an event.
(Ω, F) is a measurable space.
µ : F → [0, ∞] is a measure iff
(i) µ(A) ≥ µ(∅) = 0
(ii) A_i ∈ F countable, A_i disjoint ⇒ µ(∪_i A_i) = Σ_i µ(A_i)
If  is a measure where (Ω) = 1 then  : F → [0, 1] is a probability measure. It obeys


(monotonicity) A ⊂ B =⇒ (B) − (A) = (B − A) ≥ 0
[ X
(subadditivity) Ai ∈ F, A ⊂ Ai =⇒ (A) ≤ (Ai )
(continuity from below) Ai % A =⇒ (Ai ) % (A)
(continuity from above) Ai & A =⇒ (Ai ) & (A)
(Ω, F, ) is a probability space.
2 Sept
If A ⊂ 2^Ω then A generates σ(A) := ∩_{F ⊃ A} F, the intersection taken over all σ-algebras F containing A,
using that F_ι ⊂ 2^Ω, ι ∈ I σ-algebras ⇒ ∩_{ι∈I} F_ι is a σ-algebra.

Given (Ω1 , F1 , 1 ) , . . . , (Ωn , Fn , n ) define the product probability space (Ω, F, ) where
Ω B Ω1 × · · · × Ωn = {(ω1 , . . . , ωn ) : ωi ∈ Ωi }
F B F1 × · · · × Fn = σ({A1 × · · · × An : Ai ∈ Fi })
 B 1 × · · · × n where (A1 × · · · × An ) = 1 (A1 ) . . . n (An )
F exists by Carathéodory’s extension theorem. F is unique by the π − λ theorem.
P ⊂ 2^Ω is a π-system iff
(i) P ≠ ∅
(ii) A, B ∈ P ⇒ A ∩ B ∈ P

L ⊂ 2^Ω is a λ-system iff
(i) Ω ∈ L
(ii) A, B ∈ L, A ⊂ B ⇒ B − A ∈ L
(iii) A_i ∈ L, A_i ↑ A ⇒ A ∈ L
Generally, λ-systems are not σ-algebras.
A λ-system which is additionally closed under intersection is a σ-algebra.
If A ⊂ 2^Ω then A generates ℓ(A) := ∩_{L ⊃ A} L, the intersection taken over all λ-systems L containing A,
using that L_ι ⊂ 2^Ω, ι ∈ I λ-systems ⇒ ∩_{ι∈I} L_ι is a λ-system.
T

(π-λ theorem) P a π-system, L a λ-system, P ⊂ L ⇒ σ(P) ⊂ ℓ(P) ⊂ L.


The Borel sets on ℝ are R := σ(open sets in ℝ) = σ({(a, b] : a < b}) = σ({(a, b) : a < b}).
The Borel sets on ℝⁿ are Rⁿ := σ(open sets in ℝⁿ) = R × ··· × R.


For a measure µ : R → [0, ∞], assume ∃F : ℝ → ℝ such that µ((a, b]) = F(b) − F(a).
Such an F must be nondecreasing and right continuous.
(Lebesgue-Stieltjes) F : ℝ → ℝ nondecreasing, right-continuous ⇒ ∃! µ : R → [0, ∞] where µ((a, b]) = F(b) − F(a).
(Lebesgue measure) The unique measure λ : R → [0, ∞] where λ((a, b]) = b − a.
4 Sept
{X ∈ B} := X⁻¹(B) = {ω ∈ Ω : X(ω) ∈ B}.
A random variable is a function X : (Ω, F) → ℝ such that {X ∈ B} ∈ F ∀B ∈ R.
Define the probability measure µ_X : R → [0, 1], called the distribution, such that µ_X(B) = P(X ∈ B).
(ℝ, R, µ_X) is a probability space.
The distribution function of X is F_X : ℝ → [0, 1] such that F_X(x) := P(X ≤ x) = µ_X((−∞, x]).
Every F_X satisfies
(i) x ≤ y ⇒ F(x) ≤ F(y)
(ii) x_n ↓ x ⇒ F(x) = lim_n F(x_n) = F(x⁺)
(iii) P(X = x) = F(x) − F(x⁻)
(iv) P(X < x) = F(x⁻)
(v) F(−∞) := lim_{x→−∞} F(x) = 0
(vi) F(+∞) := lim_{x→+∞} F(x) = 1
Properties (i), (ii), (v), and (vi) characterize a distribution.
Given F : ℝ → ℝ obeying (i), (ii), (v), and (vi), there exists a random variable X such that F_X = F.
F⁻¹_left(y) := sup{x ∈ ℝ : F(x) < y} is the left continuous inverse of a distribution F.
F⁻¹_right(y) := sup{x ∈ ℝ : F(x) ≤ y} is the right continuous inverse of a distribution F.
Generally, F⁻¹(y) := F⁻¹_left(y) is used; for U uniform on (0, 1), F⁻¹(U) has distribution F.
F_X continuous ⇒ F_X(X) is uniformly distributed on (0, 1).
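The two facts above are easy to check numerically. Below is a minimal sketch (not from the course; the Exponential(1) distribution, seed, and sample sizes are arbitrary illustrative choices) verifying that F_X(X) looks uniform and that F⁻¹ applied to uniforms reproduces F:

```python
# Sketch of the probability integral transform and inverse transform sampling.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Exponential(1): F(x) = 1 - exp(-x), so F^{-1}(y) = -log(1 - y).
X = rng.exponential(size=n)
U = 1.0 - np.exp(-X)            # F_X(X) should be ~ Uniform(0,1)
print("mean, var of F_X(X):", U.mean(), U.var())   # expect ~0.5 and ~1/12

# Inverse transform sampling: plug uniforms into F^{-1}.
V = rng.uniform(size=n)
Y = -np.log(1.0 - V)            # should be ~ Exponential(1)
print("mean of F^{-1}(U):", Y.mean())              # expect ~1.0
```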
9 Sept
Random variables X and Y are equal in distribution, denoted X =ᵈ Y, when F_X = F_Y.

Define the indicator function 1_A(x) := 1 if x ∈ A, 0 if x ∉ A.

X : (Ω, F) → (S, S) is measurable if ∀B ∈ S, {X ∈ B} ∈ F.
X measurable wrt F is also written X ∈ F.
For X : (Ω, F) → (S, S = σ(A)), if {X ∈ A} ∈ F ∀A ∈ A then X is measurable.
For X : (Ω, F) → (S, S) and f : (S, S) → (T, T), if X, f are measurable then f(X) is measurable.
X_1, X_2, ..., X_n random variables, f : (ℝⁿ, Rⁿ) → (ℝ, R) measurable ⇒ f(X_1, ..., X_n) is a random variable.
For example, X_1 + ··· + X_n is a random variable.

X_1, ..., X_n, ... random variables ⇒ inf X_n, sup X_n, lim inf X_n, and lim sup X_n are all (extended) random variables.
The extended real numbers are ℝ̄ := ℝ ∪ {−∞, +∞}. The extended Borel sets are R̄ := σ(R ∪ {{−∞}, {+∞}}).
An extended random variable is a function X : (Ω, F) → ℝ̄ such that {X ∈ B} ∈ F ∀B ∈ R̄.
All random variables are automatically extended random variables.

Property p(ω) is satisfied almost everywhere (a. e.) when µ({ω ∈ Ω : ¬p(ω)}) = 0 for a measure µ.
Property p(ω) is satisfied almost surely (a. s.) when P({ω ∈ Ω : ¬p(ω)}) = 0 for a probability measure P.

X = Y almost surely when P(X ≠ Y) = P({X − Y ≠ 0}) = 0.


For X : (Ω, F) → (S, S) measurable, σ(X) := {X⁻¹(B) : B ∈ S} is the σ-algebra generated by X.
A generates S ⇒ X⁻¹(A) generates σ(X).

Y : Ω → ℝ, Y measurable wrt σ(X) ⇒ ∃ measurable f : (S, S) → (ℝ, R) such that Y = f(X).


11 Sept
(Ω, F, µ) is σ-finite if ∃A_n ∈ F, A_n ↑ Ω : µ(A_n) < ∞.

16 Sept

f : Ω → ℝ measurable is a simple function if f = Σ_i a_i 1_{A_i} where a_i ∈ ℝ, A_i ∈ F disjoint, µ(A_i) < ∞.
There are four incremental stages in the development of the Lebesgue integral:
(1. f simple) ∫ f dµ := Σ_i a_i µ(A_i)
(2. f bounded, finite support) ∫ f dµ := sup_{φ ≤ f, φ simple} ∫ φ dµ = inf_{φ ≥ f, φ simple} ∫ φ dµ
(3. f ≥ 0) ∫ f dµ := sup_{0 ≤ h ≤ f, h bounded, h finite support} ∫ h dµ
(4. f = f⁺ − f⁻) ∫ f dµ := ∫ f⁺ dµ − ∫ f⁻ dµ

For f ≥ 0 we get the continuity result ∫ (f ∧ n) 1_{A_n} dµ ↑ ∫ f dµ using the A_n underlying the σ-finite space.
For f = f⁺ − f⁻, we say f is integrable only when ∫ f⁺ dµ, ∫ f⁻ dµ < ∞.
Note |f| = f⁺ + f⁻ and ∫ |f| dµ < ∞ ⇐⇒ ∫ f⁺ dµ, ∫ f⁻ dµ < ∞.

The Lebesgue integral has the following properties:

(i) f ≥ 0 a. e. ⇒ ∫ f dµ ≥ 0
(ii) ∫ a f dµ = a ∫ f dµ
(iii) ∫ (f + g) dµ = ∫ f dµ + ∫ g dµ
(iv) f ≤ g a. e. ⇒ ∫ f dµ ≤ ∫ g dµ
(v) f = g a. e. ⇒ ∫ f dµ = ∫ g dµ
(vi) |∫ f dµ| ≤ ∫ |f| dµ

Properties (i)-(iii) must be verified at each stage in the integral's development.
Properties (iv)-(vi) follow directly from (i)-(iii).
In integration, "∞ · 0 := 0", i.e. µ({f ≠ 0}) = 0 ⇒ ∫ f dµ = 0.

𝓛⁰(Ω, F, µ) := {f : Ω → ℝ measurable}. L⁰(Ω, F, µ) := 𝓛⁰(Ω, F, µ)/∼ where f ∼ g ⇐⇒ f = g a. e.
𝓛¹(Ω, F, µ) := {f : Ω → ℝ integrable}. L¹(Ω, F, µ) := 𝓛¹(Ω, F, µ)/∼.
𝓛ᵖ(Ω, F, µ) := {f : ∫ |f|ᵖ dµ < ∞}. Lᵖ(Ω, F, µ) := 𝓛ᵖ(Ω, F, µ)/∼.
𝓛^∞(Ω, F, µ) := {f : ||f||_∞ < ∞} where ||f||_∞ := ess sup f := inf{M ∈ ℝ : µ({f > M}) = 0}.
L^∞(Ω, F, µ) := 𝓛^∞(Ω, F, µ)/∼.
Always L^∞, L¹ ⊂ L⁰.
µ(Ω) < ∞ ⇒ L^∞ ⊂ ··· ⊂ L² ⊂ L¹. This inclusion relation always holds for a probability measure.

(Hölder's inequality) p, q ∈ [1, ∞] such that 1/p + 1/q = 1 ⇒ ∫ |fg| dµ ≤ ||f||_p ||g||_q.

(Minkowski's inequality) p ∈ [1, ∞] ⇒ ||f + g||_p ≤ ||f||_p + ||g||_p.


X_n converges almost surely (everywhere) to X, denoted X_n →a.s. X, when µ({ω ∈ Ω : X_n(ω) ↛ X(ω)}) = 0.
X_n converges in measure (probability) to X, denoted X_n →µ X, when µ(|X_n − X| ≥ ε) → 0 ∀ε > 0.
X_n →a.s. X ⇒ X_n →µ X whenever µ(Ω) < ∞.
18 Sept
There are four major theorems regarding interchanging limits and integration:
(bounded convergence) |f_n|, |f| ≤ M ∈ ℝ, f_n →µ f, µ(Ω) < ∞ ⇒ ∫ f_n dµ → ∫ f dµ
(monotone convergence) (Ω, F, µ) σ-finite, f_n ≥ 0, f_n ↑ f a. e. ⇒ ∫ f_n dµ ↑ ∫ f dµ
(Fatou's lemma) f_n ≥ 0 measurable ⇒ ∫ lim inf f_n dµ ≤ lim inf ∫ f_n dµ ≤ lim sup ∫ f_n dµ; if moreover f_n ≤ g ∈ L¹ then also lim sup ∫ f_n dµ ≤ ∫ lim sup f_n dµ
(dominated convergence) f_n measurable, f_n →a.s. f, |f_n| ≤ g for g ∈ L¹ ⇒ ∫ f_n dµ → ∫ f dµ, f ∈ L¹

If ∃f : ℝ → [0, ∞) such that P(X ∈ B) = ∫_B f(x) dx ∀B ∈ R then f is the density function of X, denoted f_X.
By definition F_X(x) = ∫_{−∞}^x f_X(y) dy and ∫_ℝ f_X(y) dy = 1.
The expected value of X on (Ω, F, P) is E[X] := ∫ X dP.
E[X] is also called the mean and denoted µ_X or µ.
On a discrete space, E[X] = Σ_i x_i P(X = x_i) summed over the values x_i taken by X.
P(X ∈ A) = E[1_A].
(transport formula) For (Ω, F, P) → (S, S) under X and (S, S) → (ℝ, R) under g, with g measurable and g ≥ 0 or g bounded,
E[g(X)] = ∫_Ω g(X) dP = ∫_S g(x) dµ_X(x) = ∫_ℝ g(x) dF_X(x) = ∫_ℝ g(x) f_X(x) dx (the last form when the density exists).
The transport formula implies g(X) ∈ L¹(Ω, F, P) ⇐⇒ g ∈ L¹(ℝ, R, µ_X).
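The transport formula is what justifies computing E[g(X)] either by averaging over samples or by integrating against the density. A hedged sketch (X ~ N(0,1) and g(x) = x² are arbitrary choices, so the exact answer is 1; the grid and sample size are also arbitrary):

```python
# Compute E[g(X)] two equivalent ways via the transport formula.
import numpy as np

rng = np.random.default_rng(1)
g = lambda x: x**2

# (1) Average g(X) over samples: the integral of g(X) dP on Omega.
X = rng.standard_normal(1_000_000)
print("Monte Carlo E[g(X)]:", g(X).mean())

# (2) Quadrature of g(x) f_X(x) dx against the standard normal density.
x = np.linspace(-8.0, 8.0, 4001)
fx = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
y = g(x) * fx
quad = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))   # trapezoid rule
print("quadrature E[g(X)]:", quad)                 # both should be ~1
```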
E[X^k] for k ∈ ℕ is called the kth moment of X.
E[(X − E[X])^k] = E[(X − µ)^k] for k ∈ ℕ is called the kth centered moment of X.

The variance of X, denoted var X, is the second centered moment.
Expanding the definition, var X = E[X²] − µ². Always var X ≤ E[X²] and var(aX + b) = a² var X.
σ_X := √(var X) is the standard deviation of X.
 
(Markov's inequality) φ : ℝ → [0, ∞), A ∈ R ⇒ (inf_{y∈A} φ(y)) P(X ∈ A) ≤ E[φ(X) 1_{X∈A}] ≤ E[φ(X)].
(Chebyshev's inequality) Markov when A = {x ∈ ℝ : |x| ≥ a} ⇒ P(|X| ≥ a) ≤ E[φ(|X|) 1_{|X|≥a}]/φ(a) ≤ E[φ(|X|)]/φ(a).
Moments are often used for φ, e.g. P(|X| ≥ a) ≤ E[X²]/a² and P(|X − µ| ≥ a) ≤ (var X)/a².
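A quick numerical sanity check of the moment form of Chebyshev (the choice X ~ N(0,1) and the thresholds are illustrative assumptions, not from the notes):

```python
# Compare exact tails of N(0,1) against the Chebyshev bound E[X^2]/a^2.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal(1_000_000)
for a in (1.0, 2.0, 3.0):
    tail = np.mean(np.abs(X) >= a)      # Monte Carlo estimate of P(|X| >= a)
    bound = np.mean(X**2) / a**2        # Chebyshev with phi(x) = x^2
    print(f"a={a}: P(|X|>=a) ~ {tail:.4f} <= bound {bound:.4f}")
```

The bound is loose for Gaussian tails, which is expected: Chebyshev only uses the second moment.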
22 Sept
φ : ℝ → ℝ is convex if φ(λx + (1 − λ)y) ≤ λφ(x) + (1 − λ)φ(y) whenever λ ∈ [0, 1].
In words, "φ of a weighted average ≤ the weighted average of φ."

φ ∈ C¹([a, b]) convex ⇐⇒ φ(x) + φ′(x)(y − x) ≤ φ(y) ∀x, y ∈ [a, b].
φ ∈ C²([a, b]) convex ⇐⇒ φ″(x) ≥ 0 ∀x ∈ [a, b].

(Jensen's inequality) φ convex and E[|X|], E[|φ(X)|] < ∞ ⇒ φ(E[X]) ≤ E[φ(X)].
Moments are often used for φ, e.g. |E[X]| ≤ E[|X|] and E[X]² ≤ E[X²].

The conditional probability that event A occurs given event B is P(A|B) = P(A ∩ B)/P(B) provided P(B) ≠ 0.

Independence of two objects, denoted using the infix symbol ⊥, is defined as follows:

(Events A, B) A ⊥ B ⇐⇒ P(A ∩ B) = P(A) P(B)
(Random variables X, Y) X ⊥ Y ⇐⇒ P(X ∈ C, Y ∈ D) = P(X ∈ C) P(Y ∈ D) ∀C, D ∈ R
(σ-algebras G_1, G_2) G_1 ⊥ G_2 ⇐⇒ A ⊥ B ∀A ∈ G_1 ∀B ∈ G_2

A ⊥ B ⇒ E[1_A 1_B] = E[1_A] E[1_B].
A ⊥ B ⇒ P(A|B) = P(A).
A ⊥ B ⇒ Aᶜ ⊥ B, A ⊥ Bᶜ, Aᶜ ⊥ Bᶜ.
A ⊥ B ⇐⇒ 1_A ⊥ 1_B.
X ⊥ Y ⇐⇒ σ(X) ⊥ σ(Y).

Independence of a finite collection requires these logical extensions of the above to hold:
(σ-algebras G_1, ..., G_n) P(∩_i A_i) = Π_i P(A_i) for A_i ∈ G_i
(Random variables X_1, ..., X_n) P(∩_i {X_i ∈ C_i}) = Π_i P(X_i ∈ C_i) for C_i ∈ R
(Events A_1, ..., A_n) P(∩_{i∈I} A_i) = Π_{i∈I} P(A_i) whenever I ⊂ {1, ..., n}
(Classes of events A_1, ..., A_n) P(∩_{i∈I} A_i) = Π_{i∈I} P(A_i) for A_i ∈ A_i whenever I ⊂ {1, ..., n}
Independence of an infinite collection requires that every finite subcollection be independent.
Pairwise independence of a collection’s elements does not imply the collection is independent.

A_1, ..., A_n independent π-systems ⇒ σ(A_1), ..., σ(A_n) independent.
P(X_1 ≤ x_1, ..., X_n ≤ x_n) = Π_{i=1}^n P(X_i ≤ x_i) ∀x_i ⇒ X_1, ..., X_n independent.
25 Sept

Independence of objects derived from independent triangular arrays:
(i) If the σ-algebras F_{i,j}, 1 ≤ i ≤ n, 1 ≤ j ≤ m(i), are independent, then the row σ-algebras G_i := σ(∪_{j=1}^{m(i)} F_{i,j}), i = 1, ..., n, are independent.
(ii) If the random variables X_{i,j}, 1 ≤ i ≤ n, 1 ≤ j ≤ m(i), are independent, then the row functions f_i(X_{i,1}, ..., X_{i,m(i)}), i = 1, ..., n, are independent.

Given two random variables

X : (Ω, F, P) → (S_1, S_1) with µ_X : S_1 → [0, 1]
Y : (Ω, F, P) → (S_2, S_2) with µ_Y : S_2 → [0, 1]

we have the joint random variable
(X, Y) : (Ω, F, P) → (S_1 × S_2, S_1 × S_2) with µ_{X,Y} : S_1 × S_2 → [0, 1]
where, when X ⊥ Y, µ_{X,Y} is the unique joint distribution of X and Y satisfying µ_{X,Y}(A × B) = µ_X(A) µ_Y(B) ∀A ∈ S_1 ∀B ∈ S_2.

If X_1, ..., X_n are random variables on (Ω, F, P) then X_1, ..., X_n independent ⇐⇒ µ_{X_1,...,X_n} = Π_{i=1}^n µ_{X_i}.

(Fubini's theorem) σ-finite (S_1, S_1, µ_1), (S_2, S_2, µ_2) and either f ≥ 0 or ∫ |f| d(µ_1 × µ_2) < ∞ ⇒
∫_{S_1} [∫_{S_2} f(u_1, u_2) dµ_2] dµ_1 = ∫_{S_1×S_2} f(u_1, u_2) d(µ_1 × µ_2) = ∫_{S_2} [∫_{S_1} f(u_1, u_2) dµ_1] dµ_2.

X_1, ..., X_n with densities f_1, ..., f_n independent ⇐⇒ (X_1, ..., X_n) has density f(x_1, ..., x_n) = f_1(x_1) ··· f_n(x_n).
X ⊥ Y, h : ℝ² → ℝ measurable ⇒ E[h(X, Y)] = ∫_ℝ ∫_ℝ h(x, y) dµ_X dµ_Y using Fubini.
X ⊥ Y ⇒ E[f(X) g(Y)] = E[f(X)] E[g(Y)].
X_1, ..., X_n independent ⇒ E[X_1 ··· X_n] = E[X_1] ··· E[X_n].
27 Sept
For X ⊥ Y, the distribution of the sum Z = X + Y is
F_Z(z) = P(X + Y ≤ z) = E[1_{X+Y≤z}] = E[1_{X≤z−Y}] = ∫_ℝ F_X(z − y) dF_Y(y) = (F_X ∗ F_Y)(z), using Fubini.
If X has density f_X then f_Z(z) = ∫_ℝ f_X(z − y) dF_Y(y) since
F_Z(z) = E[1_{X≤z−Y}] = ∫_ℝ (∫_{−∞}^{z−y} f_X(x) dx) dF_Y(y) = ∫_{−∞}^z (∫_ℝ f_X(x − y) dF_Y(y)) dx.
Additionally, if Y has density f_Y then f_Z(z) = (f_X ∗ f_Y)(z).
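A small sketch of f_Z = f_X ∗ f_Y (assumption: X, Y independent Uniform(0,1), whose sum has the triangular density min(z, 2 − z) on (0, 2); the grid step is an arbitrary discretization choice):

```python
# Discrete approximation of the convolution of two Uniform(0,1) densities.
import numpy as np

rng = np.random.default_rng(3)
dx = 0.001
x = np.arange(0.0, 1.0, dx)
fX = np.ones_like(x)                 # density of Uniform(0,1)
fZ = np.convolve(fX, fX) * dx        # approximates (f_X * f_Y)(z) on a grid
z_grid = np.arange(len(fZ)) * dx

# Compare to the exact triangular density at a couple of points.
for z in (0.5, 1.0):
    print(z, fZ[int(z / dx)], "exact:", min(z, 2 - z))

# Sanity check against sampled sums.
S = rng.uniform(size=1_000_000) + rng.uniform(size=1_000_000)
print("P(S <= 1) ~", np.mean(S <= 1.0), "(exact 0.5)")
```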

cov(X, Y) := E[(X − µ_X)(Y − µ_Y)] = E[XY] − E[X] E[Y] is the covariance of X, Y ∈ L².
ρ(X, Y) := cov(X, Y)/√(var X · var Y) = cov(X, Y)/(σ_X σ_Y) is the correlation coefficient of X, Y.

X, Y are uncorrelated if cov(X, Y) = ρ(X, Y) = 0.
X ⊥ Y ⇒ X, Y uncorrelated, but X, Y uncorrelated ⇏ X ⊥ Y.
Uncorrelated X_1, ..., X_n ∈ L² ⇒ var(X_1 + ··· + X_n) = var X_1 + ··· + var X_n.


For 1 ≤ p < ∞, X_n converges in L^p to X, denoted X_n →L^p X, when ||X_n − X||^p_{L^p} = E[|X_n − X|^p] → 0.
For p = ∞, X_n converges in L^∞, denoted X_n →L^∞ X, when ||X_n − X||_{L^∞} = ess sup |X_n − X| → 0.
X_n →L^q X ⇒ X_n →L^p X for 1 ≤ p ≤ q ≤ ∞, using that ||·||_{L^p} ≤ ||·||_{L^q} on a probability space (and, up to a constant, whenever µ(Ω) < ∞).
X_n →L^p X ⇒ X_n →P X using Chebyshev's inequality.
(L²-WLLN) uncorrelated X_i ∈ L², E[X_i] = µ, var X_i ≤ C < ∞, and S_n := X_1 + ··· + X_n ⇒ S_n/n →L² µ.
2 Oct
(popular WLLN) X_i ∈ L² independent and identically distributed (i. i. d.) ⇒ S_n/n →L² µ.

(WLLN for triangular arrays) S_n = X_{n,1} + ··· + X_{n,n} and (var S_n)/b_n² → 0 for some b_n ⇒ (S_n − E[S_n])/b_n →L² 0.
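The WLLN is easy to watch happen. A minimal sketch (the Exponential(1) choice with µ = 1, the tolerance ε = 0.05, and the replication count are all illustrative assumptions):

```python
# Empirical P(|S_n/n - mu| >= eps) shrinking as n grows, per the WLLN.
import numpy as np

rng = np.random.default_rng(4)
mu, eps, reps = 1.0, 0.05, 500
for n in (10, 100, 10_000):
    Sn = rng.exponential(size=(reps, n)).sum(axis=1)
    miss = np.mean(np.abs(Sn / n - mu) >= eps)
    print(f"n={n}: P(|S_n/n - mu| >= {eps}) ~ {miss:.3f}")
```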
7 Oct
A random variable X with large tails can be truncated outside a threshold M, i.e. X̄ := X 1_{|X|≤M}.

(WLLN for triangular arrays with independent rows) Construct b_n > 0, b_n ↑ ∞ such that both
Σ_{k=1}^n P(|X_{n,k}| ≥ b_n) → 0 and (1/b_n²) Σ_{k=1}^n E[X_{n,k}² 1_{|X_{n,k}|≤b_n}] → 0.
Define a_n = Σ_{k=1}^n E[X_{n,k} 1_{|X_{n,k}|≤b_n}] and S_n = X_{n,1} + ··· + X_{n,n}. Under these conditions (S_n − a_n)/b_n →P 0.
X ≥ 0, f : [0, ∞) → [0, ∞) increasing, f ∈ C¹, and f(0) = 0 ⇒ E[f(X)] = ∫_0^∞ f′(x) P(X ≥ x) dx.
In particular, E[X^p] = ∫_0^∞ p x^{p−1} P(X > x) dx allows estimating moments using tails.
For discrete N : Ω → ℕ ∪ {∞}, E[N] = Σ_{n=1}^∞ P(N ≥ n).
Use p = 1 − ε to show x P(|X| > x) → 0 ⇒ E[|X|^{1−ε}] < ∞ for ε > 0.

(General WLLN) X_1, ..., X_n, ... i. i. d., x P(|X_1| > x) → 0 ⇒ S_n/n − µ_n →P 0 where µ_n = E[X_1 1_{|X_1|≤n}].

X ∈ L¹ ⇒ x P(|X| > x) → 0 since x P(|X| > x) = E[x 1_{|X|>x}], x 1_{|X|>x} ≤ |X|, and x 1_{|X|>x} → 0 as x → ∞.

(L¹-WLLN) X_i ∈ L¹ i. i. d. and E[X_i] = µ ⇒ S_n/n →P µ.
9 Oct
Define lim inf A_n ⊂ lim sup A_n for sequences of subsets of Ω:
lim inf A_n := ∪_n ∩_{l≥n} A_l = lim_{n→∞} ∩_{l≥n} A_l = {ω that are in all but finitely many A_n's}
lim sup A_n := ∩_n ∪_{l≥n} A_l = lim_{n→∞} ∪_{l≥n} A_l = {ω that are in infinitely many A_n's}

lim sup A_n is read "A_n infinitely often (i. o.)", i.e. P(A_n i. o.) := P(lim sup A_n).
X_n →a.s. X ⇐⇒ ∀ε > 0, P({|X_n − X| ≥ ε} i. o.) = 0, using that {X_n ↛ X} = ∪_{ε>0} ∩_n ∪_{l≥n} {|X_l − X| ≥ ε}.

(Borel-Cantelli 1) Σ_{n=1}^∞ P(A_n) < ∞ ⇒ P(A_n i. o.) = 0.

X_n converges fast to X, denoted X_n →fast X, if ∀ε > 0, Σ_{n=1}^∞ P(|X_n − X| ≥ ε) < ∞.
X_n →fast X ⇒ X_n →a.s. X by Borel-Cantelli.

(convergence of random variables) Using that for a topological space, y_n → y iff every subsequence y_{n_m} has a further subsequence y_{n_{m_k}} → y:
(i) X_n →fast X ⇒ (BC1) X_n →a.s. X ⇒ X_n →P X
(ii) X_n →P X ⇒ ∃ subsequence X_{n_k} : X_{n_k} →fast X
(iii) X_n →P X ⇐⇒ every subsequence X_{n_m} has a further subsequence X_{n_{m_k}} →a.s. X

There exist sequences that converge in probability but not almost surely.
Convergence in probability comes from a metric, but convergence almost surely does not come from any topology.
(L⁴-SLLN) X_i ∈ L⁴ i. i. d. ⇒ S_n/n →a.s. µ.
14 Oct
(SLLN) X_i ∈ L¹ i. i. d. ⇒ S_n/n →a.s. µ.

(Borel-Cantelli 2) A_i independent and Σ_{n=1}^∞ P(A_n) = ∞ ⇒ P(A_n i. o.) = 1.

For independent A_n, the Borel-Cantelli lemmas impose a zero-one law forcing P(A_n i. o.) to be either 0 or 1.
(Borel-Cantelli 2 extension) A_i independent and Σ_{n=1}^∞ P(A_n) = ∞ ⇒ (1_{A_1} + ··· + 1_{A_n})/(P(A_1) + ··· + P(A_n)) →a.s. 1 as n → ∞.
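A sketch of the extension along one sample path (the choice P(A_n) = 1/n, which has a divergent sum, is an arbitrary illustration; note the normalizer grows only like log n, so the ratio converges slowly and remains noisy at this path length):

```python
# Simulate independent A_n with P(A_n) = 1/n and track the BC2-extension ratio.
import numpy as np

rng = np.random.default_rng(5)
N = 1_000_000
n = np.arange(1, N + 1)
hits = rng.uniform(size=N) < 1.0 / n           # indicator 1_{A_n}, independent across n
ratio = np.cumsum(hits) / np.cumsum(1.0 / n)   # (1_{A_1}+...+1_{A_n}) / (P(A_1)+...+P(A_n))
print("occurrences so far:", hits.sum(), " ratio at n=10^6:", ratio[-1])  # ratio ~ 1
```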

Weak convergence or convergence in distribution, written with an infix ⇒, is defined as follows:

(Distribution functions) F_n ⇒ F ⇐⇒ F_n(x) → F(x) at each x where F is continuous.
(Probability measures) P_n ⇒ P ⇐⇒ the corresponding distribution functions F_n ⇒ F
(Random variables) X_n ⇒ X ⇐⇒ the distribution functions F_{X_n} ⇒ F_X
Practically, X_n ⇒ X means P(X_n ≤ x) → P(X ≤ x) whenever P(X = x) = 0.

Weak convergence is metrizable, that is F_n ⇒ F ⇐⇒ ρ(F_n, F) → 0
where ρ(F, G) := inf{ε : F(x − ε) − ε ≤ G(x) ≤ F(x + ε) + ε ∀x} is the Lévy metric.
16 Oct
F_n ⇒ F as n → ∞ ⇒ ∃X_n, X on a common probability space such that F_{X_n} = F_n, F_X = F, and X_n →a.s. X as n → ∞.

When taking expectations, everything using almost sure convergence can instead use weak convergence.

(characterization of weak convergence) The following are equivalent:

(i) X_n ⇒ X
(ii) E[g(X_n)] → E[g(X)] ∀g : ℝ → ℝ continuous, bounded
(iii) P(X ∈ G) ≤ lim inf P(X_n ∈ G) ∀G open
(iv) P(X ∈ F) ≥ lim sup P(X_n ∈ F) ∀F closed
(v) P(X_n ∈ A) → P(X ∈ A) if P(X ∈ ∂A) = 0
Results (iii) and (iv) are lower and upper semicontinuity, respectively.

(continuous mapping theorem) X_n ⇒ X, P(X ∈ {x ∈ ℝ : g is discontinuous at x}) = 0 ⇒ g(X_n) ⇒ g(X).

P_n, P ∈ C*(ℝ), the dual of the space of continuous, bounded functions on ℝ.
P_n ⇒ P is weak-∗ sequential convergence under the assumption P(ℝ) = 1.

(Helly's selection/compactness theorem)
F_n a sequence of distribution functions ⇒ ∃ subsequence F_{n_k} and ∃F right continuous, nondecreasing such that F_{n_k} ⇒ F.
F is not necessarily a distribution function because mass may escape at ±∞.
21 Oct
P, a set of probability measures on ℝ, is tight if ∀ε > 0 ∃M_ε ∈ ℝ₊ such that µ([−M_ε, M_ε]) ≥ 1 − ε ∀µ ∈ P.
Equivalently, P, a set of distribution functions, is tight if 1 − F(M_ε) + F(−M_ε) ≤ ε ∀F ∈ P.
Equivalently, {F_n}, a countable set of distribution functions, is tight if lim sup_{n→∞} [1 − F_n(M_ε) + F_n(−M_ε)] ≤ ε.

F_n a tight sequence of distribution functions ⇒ ∃F_{n_k}, ∃F a distribution function such that F_{n_k} ⇒ F.

(Prokhorov's theorem) For P, a set of probability measures on ℝ: P is tight ⇐⇒ ∀{P_n} ⊂ P ∃ subsequence P_{n_k} and probability measure P_∞ : P_{n_k} ⇒ P_∞.

For φ : ℝ → [0, ∞) such that lim_{|x|→∞} φ(x) = +∞, if ∫_ℝ φ(x) dF(x) ≤ C < ∞ ∀F ∈ P then P is tight.

For an integer-valued X ≥ 0, let a_n := P(X = n) where Σ_{n=0}^∞ a_n = 1.
Define the generating function g(x) = Σ_{n=0}^∞ a_n xⁿ = Σ_{n=0}^∞ xⁿ P(X = n) = E[x^X].
Knowing g(x) is equivalent to knowing P(X = n).


Every random variable X has a characteristic function φ_X(t) := E[e^{itX}] = ∫_ℝ e^{itx} µ_X(dx):
(i) φ(0) = E[e^{i0X}] = 1
(ii) φ(−t) = E[cos(−tX)] + i E[sin(−tX)] = E[cos(tX)] − i E[sin(tX)], the complex conjugate of φ(t)
(iii) |φ(t)| ≤ E[|e^{itX}|] = E[1] = 1
(iv) φ_{aX+b}(t) = E[e^{it(aX+b)}] = e^{itb} E[e^{itaX}] = e^{itb} φ_X(at)
(v) X ⊥ Y ⇒ φ_{X+Y}(t) = φ_X(t) φ_Y(t)

F_X = Σ_{i=1}^n λ_i F_{X_i} where Σ_{i=1}^n λ_i = 1 ⇒ φ_X(t) = Σ_{i=1}^n λ_i φ_{X_i}(t).

Characteristic functions are uniformly continuous since dominated convergence implies
|φ(t + h) − φ(t)| ≤ E[|e^{i(t+h)X} − e^{itX}|] = E[|e^{itX}| |e^{ihX} − 1|] = E[|e^{ihX} − 1|] → 0 as h → 0.

28 Oct
|e^{ix} − e^{iy}| ≤ |x − y| for x, y ∈ ℝ since |e^{ix} − e^{iy}| = |∫_x^y (d/dt) e^{it} dt| = |∫_x^y i e^{it} dt| ≤ |y − x|.
E[|X|ⁿ] < ∞ ⇒ φ′, ..., φ⁽ⁿ⁾ exist everywhere and are continuous, with (d^k/dt^k) φ(t) = E[(iX)^k e^{itX}] = ∫_{−∞}^∞ (ix)^k e^{itx} µ_X(dx).
E[|X|ⁿ] < ∞ ⇒ (d^k/dt^k) φ(0) = i^k E[X^k] for k = 0, ..., n.

Uniform U(a, b) with density f(x) = 1/(b − a) has characteristic function φ(t) = (e^{itb} − e^{ita})/(it(b − a)).
Normal N(0, 1) with density f(x) = (1/√(2π)) exp(−x²/2) has characteristic function φ(t) = exp(−t²/2).
Normal N(µ, σ²) with density f(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)) has characteristic function φ(t) = exp(itµ − σ²t²/2).
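The definition φ_X(t) = E[e^{itX}] suggests an immediate empirical check: average e^{itX_k} over samples and compare against the closed form. A minimal sketch for N(0,1) (sample size and t values are arbitrary choices):

```python
# Empirical characteristic function of N(0,1) samples vs exp(-t^2/2).
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal(200_000)
for t in (0.5, 1.0, 2.0):
    phi_hat = np.exp(1j * t * X).mean()        # Monte Carlo E[exp(itX)]
    print(f"t={t}: empirical {phi_hat.real:+.4f}{phi_hat.imag:+.4f}i,"
          f" exact {np.exp(-t**2 / 2):.4f}")   # imaginary part should be ~0
```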

The inversion formula recovers a distribution from a characteristic function:
µ((a, b)) + (1/2) µ({a, b}) = lim_{T→∞} (1/2π) ∫_{−T}^T ((e^{−ita} − e^{−itb})/(it)) φ(t) dt
µ({a}) = lim_{T→∞} (1/2T) ∫_{−T}^T e^{−ita} φ(t) dt
In particular, the limits above always exist.

φ_µ = φ_ν ⇒ µ = ν, i.e. characteristic functions determine distributions.
∫_{−∞}^∞ |φ_X(t)| dt < ∞ ⇒ X has a continuous, bounded density function f_X(x) = (1/2π) ∫_{−∞}^∞ e^{−itx} φ(t) dt.
30 Oct
(continuity theorem) For a sequence of probability distributions µ_n and their characteristic functions φ_n(t),
(i) µ_n ⇒ µ_∞ ⇒ φ_n(t) → φ_∞(t) ∀t
(ii) φ_n(t) → φ_∞(t) ∀t, φ_∞ continuous at 0 ⇒ µ_n ⇒ µ_∞, µ_n tight
where µ_∞ is another probability distribution and φ_∞ is its characteristic function.
(CLT) X_i ∈ L² i. i. d., E[X_i] = µ, 0 < var X_i = σ² < ∞, S_n = X_1 + ··· + X_n ⇒ (S_n − nµ)/(σ√n) ⇒ N(0, 1).
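A quick simulation of the CLT statement (the Bernoulli(p) summands, with µ = p and σ² = p(1 − p), and all sizes below are illustrative assumptions):

```python
# Standardized binomial sums look standard normal, per the CLT.
import numpy as np

rng = np.random.default_rng(7)
p, n, reps = 0.3, 2_000, 100_000
S = rng.binomial(n, p, size=reps)              # S_n = sum of n Bernoulli(p) draws
Z = (S - n * p) / np.sqrt(n * p * (1 - p))     # (S_n - n mu) / (sigma sqrt(n))
print("P(Z <= 1.0) ~", np.mean(Z <= 1.0), "(Phi(1) ~ 0.8413)")
print("mean, var of Z:", Z.mean(), Z.var())    # expect ~0 and ~1
```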
    
Using |e^{ix} − Σ_{k=0}^n (ix)^k/k!| ≤ min(|x|^{n+1}/(n + 1)!, 2|x|ⁿ/n!) with n = 2 gives, for E[X] = 0 and var X = σ², the estimate |φ(t) − (1 − σ²t²/2)| ≤ E[min(|tX|³/6, t²X²)].

c_n → c ∈ ℂ ⇒ (1 + c_n/n)ⁿ → e^c.

(self-normalized sums) X_i i. i. d., E[X_i] = 0, var X_i = σ² ∈ (0, ∞) ⇒ (Σ_{i=1}^n X_i)/√(Σ_{i=1}^n X_i²) ⇒ N(0, 1).

A triangular array satisfies the Lindeberg conditions if
(i) σ²_{n,1} + ··· + σ²_{n,n} → σ² > 0
(ii) ∀ε > 0, Σ_{k=1}^n E[X²_{n,k} 1_{|X_{n,k}|≥ε}] → 0 as n → ∞
The second condition requires that all array elements contribute "equally" to the sum.
(Lindeberg-Feller CLT) If a triangular array with independent rows satisfies E[X_{n,i}] = 0, E[X²_{n,i}] = σ²_{n,i} < ∞, and the Lindeberg conditions, then X_{n,1} + ··· + X_{n,n} ⇒ N(0, σ²).
4 Nov

n n n
|w1 | , . . . , |wn | , |z1 | , . . . , |zn | ≤ θ =⇒ wi − zi ≤ θn−1 |wi − zi |.
Q Q P
i=1 i=1 i=1

(B(n, p) − np)/√(npq) ⇒ N(0, 1), so P(B(n, p)/n ∈ (p + a√(pq)/√n, p + b√(pq)/√n)) → ∫_a^b (1/√(2π)) e^{−x²/2} dx by the CLT.

For record values with P(A_n) = 1/n, where (1_{A_1} + ··· + 1_{A_n})/log n →a.s. 1, we get (1_{A_1} + ··· + 1_{A_n} − log n)/√(log n) ⇒ N(0, 1) by the CLT.
6 Nov
     
For Ω finite with partition {Ω_i}_{i=1,...,k}, define P(ω_j | Ω_i) := P(ω_j)/P(Ω_i) if ω_j ∈ Ω_i and P(ω_j | Ω_i) := 0 if ω_j ∉ Ω_i.
For X : Ω → ℝ, let y_i = E[X | Ω_i] := Σ_{j=1}^N X(ω_j) P(ω_j | Ω_i) = Σ_{ω∈Ω_i} X(ω) P(ω) / P(Ω_i).
Define Y : Ω → ℝ by Y|_{Ω_i} = y_i for i = 1, ..., k. Then E[X 1_B] = E[Y 1_B] whenever B = ∪ Ω_i.
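This finite-partition picture is concrete enough to compute directly. A minimal sketch (a fair die with the partition {odd, even} is an arbitrary illustrative choice):

```python
# Conditional expectation on a finite partition, checking E[X 1_B] = E[Y 1_B].
import numpy as np

omega = np.arange(1, 7)                    # sample points 1..6
P = np.full(6, 1 / 6)                      # uniform probabilities
X = omega.astype(float)                    # X(omega) = face value
cells = [omega % 2 == 1, omega % 2 == 0]   # partition: odd faces, even faces

Y = np.empty(6)
for cell in cells:
    Y[cell] = np.sum(X[cell] * P[cell]) / np.sum(P[cell])  # y_i = E[X | Omega_i]
print("Y:", Y)                             # 3.0 on odd faces, 4.0 on even faces

B = cells[0]                               # B = the odd cell, a union of cells
print(np.sum(X[B] * P[B]), "==", np.sum(Y[B] * P[B]))      # E[X 1_B] == E[Y 1_B]
```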

(Radon-Nikodym) If (Ω, F) is a measurable space with σ-finite measures µ, ν where µ(A) = 0 ⇒ ν(A) = 0 ∀A ∈ F
(i.e. ν ≪ µ, ν absolutely continuous wrt µ) then ∃f : Ω → [0, ∞) measurable such that ν(A) = ∫_A f(x) µ(dx) ∀A ∈ F.
f is usually denoted dν/dµ and called the Radon-Nikodym derivative.
For X ∈ L¹ and G ⊂ F, a random variable Y = E[X | G] is a conditional expectation of X wrt G iff
(i) Y is measurable wrt G,
(ii) E[X 1_B] = E[Y 1_B] ∀B ∈ G.
Y exists by Radon-Nikodym and is unique up to versions, i.e. Y = Y′ a. s. ⇐⇒ Y′ is a version of Y.
E[X | Y] := E[X | σ(Y)] = h(Y) where measurable h : (ℝ, R) → (ℝ, R) gives the conditional expectation of X wrt Y.
Since every B ∈ σ(Y) has the form B = {Y ∈ C} with C ∈ R,
E[X 1_B] = E[X 1_{Y∈C}] = E[h(Y) 1_B] = E[h(Y) 1_{Y∈C}] = ∫_ℝ h(y) 1_{y∈C} P_Y(dy) = ∫_C h(y) P_Y(dy).

Conditional expectation has the following properties:

(i) X ∈ L¹ ⇒ E[X | G] ∈ L¹ and ||E[X | G]||_{L¹} ≤ ||X||_{L¹}
(ii) E[aX + bY | G] = a E[X | G] + b E[Y | G]
(iii) X ≤ Y a. s. ⇒ E[X | G] ≤ E[Y | G]
(iv) E[E[X | G]] = E[X]
(v) E[X 1_A] = E[E[X | G] 1_A] for A ∈ G
(vi) X ⊥ G ⇒ E[X | G] = E[X]
(vii) X_n →L¹ X ⇒ E[X_n | F] →L¹ E[X | F]
(viii) E[1_A | F_n] = P(A | F_n)

For X ∈ L²(Ω, F, P), G ⊂ F, E[X | G] is the unique random variable attaining min_{Y ∈ L²(Ω,G,P)} E[(X − Y)²].
That is, the conditional expectation is the orthogonal projection wrt the L²-inner product.
var(X | G) := E[X² | G] − E[X | G]² is the conditional variance of X wrt G.
var X = E[var(X | G)] + var(E[X | G]).
 
(Chebyshev's inequality) a > 0 ⇒ P(|X| ≥ a | F) ≤ E[X² | F]/a².
(Jensen's inequality) φ convex and E[|X|], E[|φ(X)|] < ∞ ⇒ φ(E[X | F]) ≤ E[φ(X) | F].
 

11 Nov
(tower property) F_1 ⊂ F_2 ⇒ E[E[X | F_2] | F_1] = E[E[X | F_1] | F_2] = E[X | F_1], i.e. "the smaller σ-algebra wins."
(taking out what is known) X ∈ G and XY, Y ∈ L¹ ⇒ E[XY | G] = X E[Y | G].
(monotone convergence) If X ∈ L¹ and X_n ↑ X then E[X_n | G] ↑ E[X | G].

A filtration {F_n} is a sequence of σ-algebras where F_n ⊂ F_{n+1}.

A discrete time stochastic process is a sequence of random variables {Xn } on (Ω, F, ) indexed by time.

{Xn } (ω) = {X0 (ω), X1 (ω), . . . } for fixed ω ∈ Ω is called a path of a stochastic process.

A stochastic process can be viewed as one of
(1) a sequence of random variables,
(2) an infinite dimensional random variable X : Ω → paths, or
(3) a two dimensional function X_(·)(·) : ℕ × Ω → ℝ.

An adapted stochastic process {X_n} satisfies X_n ∈ F_n ∀n, i.e. X_n is measurable wrt F_n.

The filtration {F_n}_n = {σ(X_0, ..., X_n)}_n is called the natural filtration of a stochastic process.
{X_n} is adapted to {F_n} ⇐⇒ σ(X_0, ..., X_n) ⊂ F_n ∀n.
All processes are adapted to their natural filtration.

(Ω, F, {F_n}_{n=0,1,...}, P) is called a filtered probability space.
 
On (Ω, F, {F_n}_{n=0,1,...}, P), a process {M_n} is a {submartingale, martingale, supermartingale}
iff ∀n all of M_n ∈ L¹, M_n ∈ F_n (adapted), and E[M_{n+1} | F_n] {≥, =, ≤} M_n hold (respectively).

Over time, a {sub-, true, super-}martingale {increases, stays the same, decreases} in conditional expectation:
M_n submartingale ⇐⇒ E[M_{n+1} − M_n | F_n] ≥ 0
M_n martingale ⇐⇒ E[M_{n+1} − M_n | F_n] = 0
M_n supermartingale ⇐⇒ E[M_{n+1} − M_n | F_n] ≤ 0

All true martingales are both submartingales and supermartingales.

{M_n, F_n}_n a martingale ⇒ {M_n, F_n^M}_n a martingale by the tower property, where F_n^M = σ(M_0, ..., M_n) is the natural filtration.




(Doob's decomposition) Any submartingale X_n can be decomposed uniquely as X_n = M_n + A_n where A_0 := 0,
A_{n+1} − A_n := E[X_{n+1} | F_n] − X_n is an increasing, predictable sequence and M_n := X_n − A_n is a martingale.
13 Nov
(multistep) {M_n, F_n}_n a {sub-, true, super-}martingale ⇒ E[M_k | F_n] {≥, =, ≤} M_n ∀n ≤ k (respectively).

Assume M_n ∈ F_n, M_n ∈ L¹ and consider φ : ℝ → ℝ such that φ(M_n) ∈ L¹:

(i) M_n martingale, φ convex ⇒ φ(M_n) submartingale
(ii) M_n martingale, φ concave ⇒ φ(M_n) supermartingale
(iii) M_n submartingale, φ convex, φ increasing ⇒ φ(M_n) submartingale
These follow from conditional Jensen, e.g. for (i) we have φ(M_n) = φ(E[M_{n+1} | F_n]) ≤ E[φ(M_{n+1}) | F_n].
Useful example functions to combine with this fact include |M_n|, (M_n)⁺, (M_n − a)⁺, (M_n)⁻, and (M_n − a)⁻.

H is predictable if H_n ∈ F_{n−1}.
Predictability means H_n is already determined by the information available at time n − 1.
Predictable H may be thought of as a betting strategy, i.e. H_n(M_n − M_{n−1}) is the payoff at time n.

(H · M)_n := Σ_{k=1}^n H_k (M_k − M_{k−1}) is the discrete time stochastic integral of H onto M.
(H · M)_n is also called a martingale transform.
(H · M)_n is an adapted process.

{M_n, F_n}_n supermartingale, H predictable with H_n bounded and nonnegative ⇒ {(H · M)_n, F_n}_n is a supermartingale.
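The betting-strategy reading can be simulated: for a martingale M, any bounded predictable H leaves E[(H · M)_n] = 0. A minimal sketch (M a simple ±1 random walk; the "double after each loss" strategy is an arbitrary illustrative choice, truncated to n steps):

```python
# Martingale transform: a predictable betting strategy cannot beat a fair game.
import numpy as np

rng = np.random.default_rng(8)
reps, n = 200_000, 20
steps = rng.choice([-1.0, 1.0], size=(reps, n))      # increments M_k - M_{k-1}

H = np.ones((reps, n))                               # H_1 = 1 bet to start
for k in range(1, n):
    lost = steps[:, k - 1] < 0
    H[:, k] = np.where(lost, 2 * H[:, k - 1], 1.0)   # H_k depends only on the past

HM = (H * steps).sum(axis=1)                         # (H . M)_n
print("E[(H.M)_n] ~", HM.mean())                     # expect ~0 despite the strategy
```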


18 Nov
A random time T : Ω → ℕ ∪ {∞} is a stopping time wrt filtration F_n if {T = n} ∈ F_n ∀n.

N = inf{n : X_n ∈ A} for A ∈ R is called the hitting time of A.
N is a stopping time because {N = n} = ({X_0 ∉ A} ∩ ··· ∩ {X_{n−1} ∉ A} ∩ {X_n ∈ A}) ∈ F_n.

{T = n} ∈ Fn =⇒ {T ≤ n} = ({T = 0} ∪ · · · ∪ {T = n − 1} ∪ {T = n}) ∈ Fn .
{T = n} ∈ Fn =⇒ {T < n} = ({T = 0} ∪ · · · ∪ {T = n − 1}) ∈ Fn .

X_n^T := X_{T∧n} = X_{T(ω)∧n}(ω) is a stopped process: one which "runs until stopping time T occurs."

T stopping time ⇒ 1_{n≤T} predictable because {T ≥ n} = {T < n}ᶜ = {T ≤ n − 1}ᶜ ∈ F_{n−1}.
T stopping time ⇒ 1_{n>T} predictable because 1_{n>T} = 1 − 1_{n≤T}.
(1_{n≤T} · X)_n = X_{T∧n} − X_0


T stopping time, {M_n, F_n}_n martingale ⇒ {M_{n∧T}, F_n}_n martingale because M_{T∧n} = M_0 + (1_{n≤T} · M)_n.
Using X_n submartingale and a < b, let T_{2k+1} = min{n ≥ T_{2k} : X_n ≤ a} and T_{2k+2} = min{n ≥ T_{2k+1} : X_n ≥ b}.
Call (T_{2k+1}, T_{2k+2}) an interval of upcrossing and denote by U_n(a, b) the number of upcrossings by time n.
The trading strategy H_n = Σ_{k=0}^∞ 1_{T_{2k+1} < n ≤ T_{2k+2}} is predictable. It represents "buy below a, sell above b."
Define Y_n = X_n ∨ a = a + (X_n − a)⁺ so that (H · Y)_n represents gains over upcrossings plus a possible last gain.
Then (b − a) U_n(a, b) ≤ (H · Y)_n. Also 1 − H ≥ 0 ⇒ ((1 − H) · Y) submartingale ⇒ E[((1 − H) · Y)_n] ≥ 0.
(submartingale upcrossing inequality) ∴ (b − a) E[U_n(a, b)] ≤ E[(X_n − a)⁺] − E[(X_0 − a)⁺]
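Counting upcrossings of a concrete path makes the inequality tangible. A hedged sketch (the simple random walk started at X_0 = 0 and the band (a, b) = (0, 1) are arbitrary choices; a martingale is in particular a submartingale):

```python
# Count upcrossings of (0, 1) by a random walk and check the upcrossing inequality.
import numpy as np

rng = np.random.default_rng(9)
a, b, n, reps = 0.0, 1.0, 200, 5_000
X = np.cumsum(rng.choice([-1.0, 1.0], size=(reps, n)), axis=1)

def upcrossings(path, a, b):
    count, below = 0, False
    for x in path:
        if x <= a:
            below = True                  # path has dipped to or below a
        elif x >= b and below:
            count, below = count + 1, False   # completed one upcrossing
    return count

U = np.array([upcrossings(p, a, b) for p in X])
lhs = (b - a) * U.mean()
rhs = np.maximum(X[:, -1] - a, 0).mean()  # E[(X_n - a)+]; (X_0 - a)+ = 0 here
print(f"(b-a) E[U_n] = {lhs:.3f} <= {rhs:.3f}")
```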
   

(martingale convergence)
{M_n, F_n}_n submartingale, sup_n E[M_n⁺] < ∞ ⇒ M_n →a.s. M, M ∈ L¹.
{M_n, F_n}_n supermartingale, M_n ≥ 0 ⇒ M_n →a.s. M, E[M] ≤ E[M_0].
20 Nov
 
(X_i)_{i∈I} is uniformly integrable (u. i.) if sup_{i∈I} E[|X_i| 1_{|X_i|≥M}] → 0 as M → ∞.

(X_i)_{i∈I} u. i. ⇐⇒ the X_i are L¹-bounded and ∀ε > 0 ∃δ > 0 : ∀A ∈ F, P(A) ≤ δ ⇒ E[|X_i| 1_A] ≤ ε ∀i ∈ I.

(de la Vallée Poussin criterion)
(X_i)_{i∈I} u. i. ⇐⇒ ∃ψ : [0, ∞) → [0, ∞) increasing such that ψ(x)/x → ∞ as x → ∞ and sup_{i∈I} E[ψ(|X_i|)] < ∞.

|X_i| ≤ |X| for X ∈ L¹ ⇒ (X_i)_i u. i.
(X_i)_i u. i. ⇏ |X_i| ≤ |X| for some X ∈ L¹.
X ∈ L¹(Ω, F, P) ⇒ (E[X | G])_{G⊂F} is uniformly integrable.
On (Ω, F, {F_n}_{n=0,1,...}, P) with M ∈ L¹(Ω, F, P), M_n := E[M | F_n] is a u. i. Lévy martingale.

 
X_n →P X ⇒ [ (X_n)_n u. i. ⇐⇒ X_n →L¹ X ⇐⇒ E[|X_n|] → E[|X|] < ∞ ]
M_n martingale ⇒ [ (M_n)_n u. i. ⇐⇒ M_n → M a.s. and in L¹ ⇐⇒ ∃M ∈ L¹ : M_n = E[M | F_n] ]
M_n submartingale ⇒ [ (M_n)_n u. i. ⇐⇒ M_n → M a.s. and in L¹ ]

(multistep at n = ∞) {M_n}_n u. i. submartingale ⇒ E[M_∞ | F_k] ≥ M_k since E[M_n | F_k] →L¹ E[M_∞ | F_k].

M ∈ L¹ is called a last element of an adapted {sub-, true, super-}martingale M_n if E[M | F_k] {≥, =, ≤} M_k ∀k.
A {sub-, true, super-}martingale has a last element ⇐⇒ (M_n⁺)_n, (M_n)_n, (M_n⁻)_n u. i. (respectively).

25 Nov
(Lévy's theorem) Given (F_n)_n and X ∈ L¹(Ω, F, P), E[X | F_n] → E[X | F_∞] a.s. and in L¹, where F_∞ = σ(∪_{n=0}^∞ F_n).
(Lévy's corollary) A ∈ F_∞ ⇒ P(A | F_n) → 1_A a.s. and in L¹.
(Lévy's 0-1 law) F_n ↑ F_∞, A ∈ F_∞ ⇒ E[1_A | F_n] → 1_A a.s. and in L¹.

A last element is not generally unique. However, if we require it to be F_∞-measurable then it is unique.

There is a bijection between L¹(Ω, F_∞, P) and the space of u. i. martingales.
We can identify a u. i. martingale by its last element.

For a stopping time N, F_N := {A ∈ F : A ∩ {N = n} ∈ F_n ∀n}.

X_N is the random variable giving the value of the process X_n at the stopping time N.
Always X_N ∈ F_N.

(Optional sampling for bounded stopping times)
{X_n, F_n}_n submartingale, M, N stopping times with 0 ≤ M ≤ N ≤ k for k ∈ ℕ
⇒ E[X_0] ≤ E[X_M] ≤ E[X_N] ≤ E[X_k] and X_M ≤ E[X_N | F_M].

(Optional sampling theorem) For M ≤ N stopping times, possibly unbounded:

(i) X_n submartingale, (X⁺_{n∧N}) u. i. ⇒ E[X_N | F_M] ≥ X_M, E[X_N] ≥ E[X_M]
(ii) X_n martingale, (X_{n∧N}) u. i. ⇒ E[X_N | F_M] = X_M, E[X_N] = E[X_M]
(iii) X_n supermartingale, (X⁻_{n∧N}) u. i. ⇒ E[X_N | F_M] ≤ X_M, E[X_N] ≤ E[X_M]
2 Dec
"Optional sampling holds iff the submartingale stopped at N has a last element"
follows from X_n submartingale ⇒ X_{n∧N} submartingale and (X⁺_{n∧N}) u. i. ⇐⇒ X_{n∧N} has a last element.
(independence lemma) X ∈ G, Y ⊥ G, and suitable integrability conditions ⇒ E[g(X, Y) | G] = E[g(x, Y)]|_{x=X}.


(martingale inequality) X_n submartingale, X̄_n := max_{k=0,...,n} X_k⁺ ⇒ λ P(X̄_n ≥ λ) ≤ E[1_{X̄_n≥λ} X_n⁺] ≤ E[X_n⁺].
(Doob's L^p maximal inequality) X_n submartingale, X̄_n := max_{k=0,...,n} X_k⁺, 1 < p < ∞ ⇒ E[X̄_n^p] ≤ (p/(p − 1))^p E[(X_n⁺)^p].

(Wald I) X_i ∈ L¹ i. i. d., S_n = X_1 + ··· + X_n, N a stopping time with E[N] < ∞ ⇒ E[S_N] = E[N] E[X_1].
(Wald II) Above assumptions plus E[X_1] = 0, E[X_1²] = σ² < ∞ ⇒ E[S_N²] = σ² E[N].
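Both Wald identities can be checked by simulation. A minimal sketch (assumptions: fair ±1 coin flips, so E[X] = 0 and σ² = 1, with N the first exit time of (−3, 3); standard gambler's-ruin arithmetic gives E[N] = 9 for this choice):

```python
# Wald I: E[S_N] = E[N] E[X] = 0.  Wald II: E[S_N^2] = sigma^2 E[N].
import numpy as np

rng = np.random.default_rng(10)
reps = 20_000
SN = np.empty(reps)
NN = np.empty(reps)
for r in range(reps):
    s = n = 0
    while abs(s) < 3:                       # N = first time the walk hits +-3
        s += 2 * int(rng.integers(0, 2)) - 1
        n += 1
    SN[r], NN[r] = s, n
print("E[S_N] ~", SN.mean(), "(Wald I predicts 0)")
print("E[S_N^2] ~", (SN**2).mean(), " vs sigma^2 E[N] ~", NN.mean())  # both ~9
```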
X_n martingale, 1 < p < ∞, sup_n E[|X_n|^p] < ∞ ⇒ X_n → X_∞ a.s. and in L^p, X_∞ ∈ F_∞, X_n = E[X_∞ | F_n].
For 1 < p < ∞ we identify L^p(Ω, F_∞, P) with H^p := {X_n martingale : sup_n E[|X_n|^p] < ∞}.
4 Dec
Within (Ω, F), a sequence of σ-algebras {G_n}_n is called a backward filtration if F ⊃ G_n ⊃ G_{n+1} ∀n.
{M_n ∈ L¹, G_n}_n is a backward {submartingale, martingale, supermartingale} if E[M_n | G_{n+1}] {≥, =, ≤} M_{n+1}.

(backward martingale convergence)
{M_n, G_n} backward martingale ⇒ M_n → M_∞ = E[M_1 | G_∞] a.s. and in L¹, where G_∞ := ∩_{n=1}^∞ G_n.
h i h i
H ⊥ σ(X, G) ⇒ E[X | σ(G, H)] = E[X | G].

(Kolmogorov 0-1 law) X_n independent, A ∈ ∩_{n=1}^∞ σ(X_n, X_{n+1}, ...) (the tail σ-algebra, ⊂ G_∞) ⇒ P(A) ∈ {0, 1}.
(SLLN) X_i ∈ L¹ i. i. d. ⇒ S_n/n → E[X_i] a.s. and in L¹, proven by showing
(1) G_n = σ(S_n, S_{n+1}, ...) is a backward filtration,
(2) {S_n/n, G_n}_n is a backward martingale, and
(3) S_n/n → E[S_1 | G_∞] = E[X_1 | G_∞] = E[X_i] a.s. and in L¹.
