
MINISTRY OF EDUCATION AND TRAINING

VINH UNIVERSITY

FINAL PROJECT: FOUNDATIONS OF PROBABILITY THEORY

CONVERGENCE IN MEAN, CONVERGENCE IN PROBABILITY
AND CONVERGENCE IN DISTRIBUTION

CLASS: FOUNDATIONS OF PROBABILITY THEORY LT01

Group 13

Lecturer: Assoc. Prof. LE VAN THANH

Nghe An - 2024
MEMBERS OF GROUP 13

No.  Name                   Student code     Position
1    Nguyen Thi Kieu Oanh   215714020910252  Leader
2    Tran Thi Huong Thuy    215714020910021  Member
3    Le Hoang Thuy          215714020910182  Member


Contents

Preface

1 Preliminaries
  1.1 Definition
  1.2 Theorem

2 Convergence concepts
  2.1 Convergence in mean
      2.1.1 Definition and examples
      2.1.2 Theorem
  2.2 Convergence in probability
      2.2.1 Definition and examples
      2.2.2 Theorem
      2.2.3 The Cauchy criterion and the continuous mapping theorem
  2.3 Convergence in distribution

3 Relations between convergence concepts
  3.1 Relations between convergence in mean and convergence in probability
  3.2 Relations between convergence in probability and convergence in distribution

Conclusion

References
Preface

Limit theorems form a vital area of probability theory and remain an active
focus of current research. We recognize the importance of convergence
concepts in the exploration of limit theorems: convergence in mean,
convergence in probability, and convergence in distribution all carry
significant meaning in the study of the Law of Large Numbers and the Central
Limit Theorem. The purpose of this project is therefore to provide readers
with the basic knowledge needed to investigate limit theorems by discussing
these three important kinds of convergence. The content is divided into
three chapters.
The first chapter introduces the definitions and theorems needed as
preparation for Chapter 2.
Chapter 2 presents the definitions, examples, and theorems for convergence
in mean, convergence in probability, and convergence in distribution.
In Chapter 3, we present the pairwise relationships among the three types of
convergence: convergence in mean implies convergence in probability,
convergence in probability implies convergence in distribution, and examples
show that the converses do not hold in general.
Chapter 1

Preliminaries
1.1 Definition
Definition 1.1.1. The triple (Ω, F, P) is called a probability space. Here Ω
is a non-empty set, called the sample space, F is a σ-field of events, and P
is a function defined on F satisfying the Kolmogorov axioms:

(i) P(A) ≥ 0 for all A ∈ F (non-negativity).

(ii) P(Ω) = 1 (normalization).

(iii) Countable additivity: if {An, n ≥ 1} ⊂ F with Ai ∩ Aj = ∅ for all
i ≠ j, then

P(⋃_{n=1}^∞ An) = Σ_{n=1}^∞ P(An).

Definition 1.1.2. Let Ω = R and A = {(a, b) : −∞ < a < b < +∞}, the
collection of open intervals. Then σ(A) is called the Borel σ-field, denoted
by B(R). Elements A ∈ B(R) are called Borel sets. The pair (R, B(R)) is
called the Borel space.

Definition 1.1.3. A random variable is a measurable function from a
probability space to the Borel space:

X : (Ω, F) → (R, B(R)).

Definition 1.1.4. Assume X is a random variable on a probability space
(Ω, F, P). The expectation of X, denoted by E(X), is the Lebesgue integral
of X over the sample space Ω with respect to the probability measure P:

E(X) := ∫_Ω X dP.

Definition 1.1.5. The σ-field generated by X, denoted by σ(X), is

σ(X) := {X⁻¹(A) : A Borel in R} = {{X ∈ A} : A Borel in R}.

Definition 1.1.6. The distribution function F : R → [0, 1] of a random
variable X is defined by the rule

F(x) := P(X ≤ x), x ∈ R.

It uniquely determines the distribution of X.

If F is the distribution function of X, we will use the notation

C(F) = {x ∈ R : F is continuous at x}.

A finite interval I with endpoints a < b is called an interval of continuity
for F if both a, b ∈ C(F). We know that

(C(F))^c = {x ∈ R : F is discontinuous at x}

is at most countable.

Example 1.1.7. Let X be a random variable such that P(X = 1) = p and
P(X = 0) = 1 − p, where 0 < p < 1. Then (C(F))^c = {0, 1}.

Definition 1.1.8. (Almost sure convergence). A sequence of random variables
{Xn, n ≥ 1} is said to converge almost surely (a.s.) to a random variable X
if

P{ω : lim_{n→∞} Xn(ω) = X(ω)} = 1.

We then write Xn →a.s. X as n → ∞.

1.2 Theorem

Theorem 1.2.1. (Jensen's inequality) Let X be an integrable random variable
and let g : R → R be a convex function. Then g(E[X]) ≤ E[g(X)].

Theorem 1.2.2. (Lyapunov's inequality) For all 0 < q < p,

(E|X|^q)^{1/q} ≤ (E|X|^p)^{1/p}.

Theorem 1.2.3. (Markov's inequality) If X ≥ 0 and a > 0, then

P(X ≥ a) ≤ E[X]/a.
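
Markov's inequality is easy to check numerically. The following is a minimal
sketch in Python (NumPy is assumed available; the exponential distribution,
seed, and sample size are arbitrary illustrative choices):

```python
import numpy as np

# Monte Carlo check of Markov's inequality P(X >= a) <= E[X]/a for a
# non-negative random variable. Here X ~ Exponential(1), so E[X] = 1
# and P(X >= a) = exp(-a) exactly.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)

for a in (0.5, 1.0, 2.0, 4.0):
    empirical = np.mean(x >= a)   # estimate of P(X >= a)
    bound = np.mean(x) / a        # Markov bound E[X]/a
    print(f"a = {a}: P(X >= a) ~ {empirical:.4f} <= E[X]/a ~ {bound:.4f}")
```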

Theorem 1.2.4. (The Borel-Cantelli 0-1 law) Let An, n ≥ 1 be a sequence of
independent events. Then

P(lim sup An) = 0 if Σ_{n=1}^∞ P(An) < ∞, and
P(lim sup An) = 1 if Σ_{n=1}^∞ P(An) = ∞.

Proposition 1.2.5. (Almost sure convergence criterion). Let {X, Xn, n ≥ 1}
be a family of random variables. Then the following statements are
equivalent:

(i) Xn →a.s. X.

(ii) P(lim sup (|Xn − X| > ε)) = 0 for all ε > 0.

(iii) lim_{n→∞} P(sup_{k≥n} |Xk − X| > ε) = 0 for all ε > 0.

Theorem 1.2.6. (Continuous mapping theorem). Let Xn →a.s. X and let
f : R → R be a continuous function. Then

f(Xn) →a.s. f(X).

Proposition 1.2.7. (Complete convergence implies almost sure convergence).
If Xn →c X, that is, if Σ_{n=1}^∞ P(|Xn − X| > ε) < ∞ for all ε > 0, then
Xn →a.s. X.

Chapter 2

Convergence concepts
2.1 Convergence in mean
2.1.1 Definition and examples
Definition 2.1.1. Let p > 0. A sequence of random variables {Xn, n ≥ 1} is
said to converge in mean of order p to a random variable X as n → ∞ if

lim_{n→∞} E|Xn − X|^p = 0.

We then write Xn →Lp X as n → ∞. If there is no confusion about the index n,
we simply write Xn →Lp X.

Example 2.1.2. Let {Xn, n ≥ 1} be a sequence of random variables such that

P(Xn = 1) = 1 − 1/n² and P(Xn = n) = 1/n² for all n ≥ 1.

Then Xn →L1 1 but Xn ↛L2 1.

Proof. We have

E|Xn − 1| = |1 − 1|(1 − 1/n²) + |n − 1|(1/n²) = (n − 1)/n² → 0 as n → ∞.

Therefore, Xn →L1 1. On the other hand,

E|Xn − 1|² = |1 − 1|²(1 − 1/n²) + |n − 1|²(1/n²) = ((n − 1)/n)² → 1 ≠ 0 as n → ∞.

Hence, Xn ↛L2 1.
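
Both moments in this example can be computed exactly for increasing n. A
minimal Python sketch (the grid of n values is an arbitrary choice):

```python
# Exact moments for Example 2.1.2: P(Xn = 1) = 1 - 1/n^2 and
# P(Xn = n) = 1/n^2. The value 1 contributes 0 to both moments, so
# E|Xn - 1|   = (n - 1)/n^2     -> 0   (L^1 convergence)
# E|Xn - 1|^2 = ((n - 1)/n)^2   -> 1   (no L^2 convergence)
for n in (10, 100, 1000, 10000):
    p = 1 / n**2
    m1 = abs(n - 1) * p
    m2 = abs(n - 1)**2 * p
    print(f"n = {n:>5}: E|Xn-1| = {m1:.6f}, E|Xn-1|^2 = {m2:.6f}")
```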
Example 2.1.3. Suppose that for every n ≥ 1, Xn is a discrete random
variable with the following table of values:

Xn :  −n²         0           n²
P  :  1/(2n⁵)     1 − 1/n⁵    1/(2n⁵)

Then Xn ↛L3 0 as n → ∞.

Proof. We have

E|Xn|³ = (n²)³ · (1/n⁵) = n → ∞ as n → ∞,

so Xn does not converge to 0 in mean of order 3. (By contrast,
E|Xn|² = 1/n → 0, so Xn →L2 0.)

Example 2.1.4. Given α > 0, let X1, X2, ... be independent random variables
such that

P(Xn = 0) = 1 − 1/n^α and P(Xn = n) = 1/n^α, n ≥ 1.

Then the following statement holds:

Xn →Lr 0 as n → ∞ if and only if α > r.

Proof. We have

E|Xn|^r = 0^r · (1 − 1/n^α) + n^r · (1/n^α) = n^{r−α},

which, as n → ∞, tends to 0 for r < α, equals 1 for r = α, and tends to +∞
for r > α.

Note that when r = α, E|Xn|^r neither converges to 0 nor diverges to
infinity; it equals "the wrong number" 1.
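
The three regimes can be read off numerically from the closed form
E|Xn|^r = n^{r−α}. A minimal Python sketch (α = 2 and the grids of r and n
are arbitrary illustrative choices):

```python
# Exact computation of E|Xn|^r = n^(r - alpha) for Example 2.1.4,
# where P(Xn = 0) = 1 - n^(-alpha) and P(Xn = n) = n^(-alpha).
# With alpha = 2, the rows show the regimes r < alpha (tends to 0),
# r = alpha (constant 1) and r > alpha (tends to infinity).
alpha = 2.0
for r in (1.0, 2.0, 3.0):
    moments = [n ** (r - alpha) for n in (10, 100, 1000)]
    print(f"r = {r}: E|Xn|^r for n = 10, 100, 1000 ->", moments)
```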

Remark 2.1.5. Let 0 < q < p. By Lyapunov's inequality, we have

(E|Xn − X|^q)^{1/q} ≤ (E|Xn − X|^p)^{1/p}.

Thus, if Xn →Lp X then Xn →Lq X.

Remark 2.1.6. If p = 2, the convergence is called mean-square convergence,
denoted by Xn →m.s. X or Xn →L2 X.
2.1.2 Theorem

Theorem 2.1.7. Assume that Xn →Lp X and Yn →Lp Y. Then the following
statements hold:

(i) Xn ± Yn →Lp X ± Y.

(ii) aXn →Lp aX, where a is a constant.

Proof. (i) Since Xn →Lp X, we have lim_{n→∞} E(|Xn − X|^p) = 0, and since
Yn →Lp Y, we have lim_{n→∞} E(|Yn − Y|^p) = 0.
Consider the sum Xn + Yn. By the triangle inequality,

|Xn + Yn − X − Y| ≤ |Xn − X| + |Yn − Y|.

Together with the elementary inequality (u + v)^p ≤ 2^p(u^p + v^p) for
u, v ≥ 0, this implies

|Xn + Yn − X − Y|^p ≤ 2^p(|Xn − X|^p + |Yn − Y|^p).

Then

0 ≤ lim_{n→∞} E(|Xn + Yn − X − Y|^p) ≤ 2^p lim_{n→∞} E(|Xn − X|^p + |Yn − Y|^p) = 0,

so lim_{n→∞} E(|Xn + Yn − X − Y|^p) = 0. In other words, Xn + Yn →Lp X + Y.
The proof that Xn − Yn →Lp X − Y is similar.

(ii) We want to show that aXn →Lp aX. Indeed, we have

|aXn − aX|^p = |a|^p |Xn − X|^p,

which implies

lim_{n→∞} E(|aXn − aX|^p) = |a|^p lim_{n→∞} E(|Xn − X|^p) = 0.

Therefore, aXn →Lp aX.

2.2 Convergence in probability

In this section we introduce convergence in probability. This notion plays a
preliminary role for the weak law of large numbers, which we will study
later.
In brief, a sequence of random variables {Xn, n ≥ 1} is said to converge to
a random variable X in probability if, for every ε > 0, the probability of
the event (|Xn − X| > ε) is small when n is large enough.

2.2.1 Definition and examples

Definition 2.2.1. A sequence of random variables {Xn, n ≥ 1} is said to
converge in probability to a random variable X as n → ∞ if for all ε > 0,

lim_{n→∞} P(|Xn − X| > ε) = 0.

We then write Xn →P X as n → ∞, or lim_{n→∞} Xn = X in probability.

Example 2.2.2. Suppose that for every n ≥ 1, Xn is a discrete random
variable with the following table of values:

Xn :  −n²         0           n²
P  :  1/(2n⁵)     1 − 1/n⁵    1/(2n⁵)

Then Xn →P 0 as n → ∞.

Proof. For all ε > 0, we have

(|Xn| > ε) ⊂ (Xn ≠ 0).

Then

P(|Xn − 0| > ε) ≤ P(Xn ≠ 0) = 1/n⁵ → 0.

Therefore, Xn →P 0 as n → ∞.
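
A quick Monte Carlo check of this example (a sketch assuming NumPy; the
seed, sample size, and choice of ε are arbitrary illustrative choices):

```python
import numpy as np

# Monte Carlo estimate of P(|Xn| > eps) for Example 2.2.2, where Xn
# takes the values -n^2, 0, n^2 with probabilities 1/(2n^5),
# 1 - 1/n^5, 1/(2n^5); the exact tail probability is 1/n^5.
rng = np.random.default_rng(1)
eps = 0.5
for n in (2, 3, 5):
    p = 1 / (2 * n**5)
    x = rng.choice([-n**2, 0, n**2], size=1_000_000, p=[p, 1 - 2 * p, p])
    print(f"n = {n}: P(|Xn| > {eps}) ~ {np.mean(np.abs(x) > eps):.6f}"
          f" (exact {1 / n**5:.6f})")
```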

2.2.2 Theorem

Theorem 2.2.3. (The uniqueness of the limit). If a sequence of random
variables {Xn, n ≥ 1} converges to both X and Y in probability, then X ≡ Y
a.s., i.e.,

P(X = Y) = 1.

Proof. Let ε > 0. Applying the triangle inequality
|X − Y| ≤ |Xn − X| + |Xn − Y|, we have

(|X − Y| > ε) ⊂ (|Xn − X| > ε/2) ∪ (|Xn − Y| > ε/2), n ≥ 1. (2.1)

This ensures that

0 ≤ P(|X − Y| > ε) ≤ P(|Xn − X| > ε/2) + P(|Xn − Y| > ε/2), n ≥ 1. (2.2)

Since Xn →P X and Xn →P Y,

lim_{n→∞} P(|Xn − X| > ε/2) = 0 and lim_{n→∞} P(|Xn − Y| > ε/2) = 0. (2.3)

Taking the limit in (2.2) and combining with (2.3), we have

P(|X − Y| > ε) = 0. (2.4)

For k ≥ 1, letting ε = 1/k, we have

P(|X − Y| > 1/k) = 0. (2.5)

This implies

P(⋃_{k=1}^∞ (|X − Y| > 1/k)) ≤ Σ_{k=1}^∞ P(|X − Y| > 1/k) = 0. (2.6)

On the other hand, it is easy to verify that

(X ≠ Y) = ⋃_{k=1}^∞ (|X − Y| > 1/k). (2.7)

Therefore P(X ≠ Y) = 0, i.e., P(X = Y) = 1.

The following theorem states that almost sure convergence implies
convergence in probability.
Theorem 2.2.4. (Almost sure convergence implies convergence in probability).
If Xn →a.s. X, then Xn →P X.

Proof. Assume that Xn →a.s. X. Then for any ε > 0, by Proposition 1.2.5 we
have

0 = lim_{n→∞} P(sup_{k≥n} |Xk − X| > ε) ≥ lim sup_{n→∞} P(|Xn − X| > ε) ≥ 0.

Theorem 2.2.5. If Xn →P X, then there exists a subsequence {Xnk} of {Xn}
such that Xnk →c X and therefore Xnk →a.s. X.

Proof. Assume that Xn →P X. We choose a subsequence {Xnk} such that

P(|Xnk − X| > 2⁻ᵏ) ≤ 2⁻ᵏ, k = 1, 2, ...

Let ε > 0. For k large enough, we have 2⁻ᵏ < ε and so
P(|Xnk − X| > ε) ≤ P(|Xnk − X| > 2⁻ᵏ) ≤ 2⁻ᵏ. This implies

Σ_k P(|Xnk − X| > ε) < ∞,

i.e., Xnk →c X as k → ∞. By Proposition 1.2.7 (complete convergence implies
almost sure convergence), we conclude that Xnk →a.s. X.

The following corollary gives a necessary and sufficient condition for
convergence in probability. We will use this criterion many times when
proving that a sequence of random variables converges in probability.

Corollary 2.2.6. Xn →P X if and only if every subsequence of {Xn} contains
a further subsequence which converges to X a.s.

Proof. Assume that Xn → X in probability. Then by Theorem 2.2.5, every
subsequence of {Xn} contains a further subsequence which converges to X
a.s.
Conversely, assume that Xn ↛P X. Then there exist ε > 0 and a subsequence
{Xnk} such that P(|Xnk − X| > ε) > ε for all k. Therefore, {Xnk} does not
contain any subsequence which converges to X a.s. (by Theorem 2.2.4, a.s.
convergence along a further subsequence would force P(|Xnk − X| > ε) → 0
along it). This contradiction completes the proof.
2.2.3 The Cauchy criterion and the continuous mapping theorem

We will now present the Cauchy criterion for convergence in probability. A
sequence of random variables {Xn, n ≥ 1} is said to be a Cauchy sequence in
probability if

lim_{m,n→∞} P(|Xm − Xn| > ε) = 0 for all ε > 0. (2.8)

Proposition 2.2.7. (The Cauchy criterion) A sequence of random variables
{Xn, n ≥ 1} converges (to some random variable X) in probability if and
only if it is a Cauchy sequence in probability.

Proof. Assume that Xn →P X and ε > 0. Then

lim_{m,n→∞} P(|Xm − Xn| > ε) ≤ lim_{m→∞} P(|Xm − X| > ε/2) + lim_{n→∞} P(|Xn − X| > ε/2) = 0,

thereby proving lim_{m,n→∞} P(|Xm − Xn| > ε) = 0 for all ε > 0.
Conversely, assume that {Xn, n ≥ 1} is a Cauchy sequence in probability.
For k ≥ 1, let nk be such that

n1 < n2 < ... < nk < nk+1 < ...

and

P(|Xr − Xs| > 2⁻ᵏ) ≤ 2⁻ᵏ for all r, s ≥ nk.

This implies

Σ_{k=1}^∞ P(|X_{nk+1} − X_{nk}| > 2⁻ᵏ) < ∞.

Applying the first Borel-Cantelli lemma, we have

P(lim sup (|X_{nk+1} − X_{nk}| > 2⁻ᵏ)) = 0,

or

P(lim inf (|X_{nk+1} − X_{nk}| ≤ 2⁻ᵏ)) = 1. (2.9)

For ω ∈ lim inf (|X_{nk+1} − X_{nk}| ≤ 2⁻ᵏ), it is easy to verify that
{X_{nk}(ω), k ≥ 1} is a Cauchy sequence in R, so it converges in R. Thus
(2.9) implies that {X_{nk}, k ≥ 1} converges a.s. to a random variable X.
Let ε > 0. We need to prove that

lim_{n→∞} P(|Xn − X| > 2ε) = 0. (2.10)

Since X_{nk} →a.s. X,

lim_{k→∞} P(|X_{nk} − X| > ε) = 0. (2.11)

On the other hand,

P(|Xn − X| > 2ε) ≤ P(|Xn − X_{nk}| > ε) + P(|X_{nk} − X| > ε). (2.12)

Combining (2.8), (2.11) and (2.12), we obtain (2.10).

"The continuous mapping theorem" still holds for convergence in
probability, stated as follows.

Theorem 2.2.8. (Continuous mapping theorem) Assume that Xn →P X and
f : R → R is a continuous function. Then

f(Xn) →P f(X).

Proof. Let {f(X_{nk})} be an arbitrary subsequence of {f(Xn)}. Since
X_{nk} →P X, Theorem 2.2.5 gives a further subsequence {X_{nkl}} of {X_{nk}}
such that X_{nkl} →a.s. X. By the continuous mapping theorem for a.s.
convergence (Theorem 1.2.6), f(X_{nkl}) →a.s. f(X). Corollary 2.2.6 then
ensures that f(Xn) →P f(X).

Theorem 2.2.9. Assume that Xn →P X and Yn →P Y as n → ∞. Then the following
statements hold.

(i) Xn ± Yn →P X ± Y.

(ii) Xn Yn →P XY.

(iii) Generally, for a continuous function g : R² → R, we have
g(Xn, Yn) →P g(X, Y).

Proof. (i) Let ε > 0 and set An = (|Xn + Yn − (X + Y)| > ε). By the
triangle inequality,

|Xn + Yn − (X + Y)| = |(Xn − X) + (Yn − Y)| ≤ |Xn − X| + |Yn − Y|,

so if |Xn + Yn − (X + Y)| > ε, then |Xn − X| > ε/2 or |Yn − Y| > ε/2. Hence

An ⊂ (|Xn − X| > ε/2) ∪ (|Yn − Y| > ε/2).

This ensures that

0 ≤ P(An) ≤ P(|Xn − X| > ε/2) + P(|Yn − Y| > ε/2), n ≥ 1.

Since Xn →P X and Yn →P Y,

lim_{n→∞} P(|Xn − X| > ε/2) = 0 and lim_{n→∞} P(|Yn − Y| > ε/2) = 0.

Thus lim_{n→∞} P(An) = lim_{n→∞} P(|Xn + Yn − (X + Y)| > ε) = 0, which
means Xn + Yn →P X + Y. The difference Xn − Yn is handled in the same way.

(ii) and (iii). Consider an arbitrary subsequence of {Xn Yn} (respectively
of {g(Xn, Yn)}), indexed by {nk}. By Theorem 2.2.5, {X_{nk}} contains a
further subsequence converging to X a.s., and passing to yet a further
subsequence we may also assume that Y converges to Y a.s. along it. Along
this subsequence, Xn Yn → XY a.s. and, since g is continuous,
g(Xn, Yn) → g(X, Y) a.s. By Corollary 2.2.6, Xn Yn →P XY and
g(Xn, Yn) →P g(X, Y).
Theorem 2.2.10. If {Xn, n ≥ 1} is a monotone sequence (increasing or
decreasing), then Xn →a.s. X if and only if Xn →P X.

Proof. The "only if" part is Theorem 2.2.4. Conversely, if Xn → X in
probability, then by Theorem 2.2.5 there exists a subsequence Xnk → X a.s.
Since a monotone real sequence converges as soon as one of its subsequences
does, the monotonicity of {Xn} yields Xn → X a.s.

2.3 Convergence in distribution

Definition 2.3.1. Let X, X1, X2, ... be random variables with distribution
functions F, F1, F2, ..., respectively. The sequence {Xn, n ≥ 1} is said to
converge in distribution to the random variable X as n → ∞ if

lim_{n→∞} Fn(x) = F(x) for all x ∈ C(F).

We then write Xn →d X as n → ∞.

The condition "lim_{n→∞} Fn(x) = F(x) for all x ∈ C(F)", instead of for
all x ∈ R, may seem strange at first encounter. The following example shows
why it is needed.

Example 2.3.2. Consider

X ≡ 0, Xn ≡ 1/n, n ≥ 1.

Then Xn →d X. It is clear that

F(x) = 0 if x < 0, and F(x) = 1 if x ≥ 0,

and

Fn(x) = 0 if x < 1/n, and Fn(x) = 1 if x ≥ 1/n.

We see that x = 0 is the unique discontinuity point of F, and Fn(x)
converges to F(x) exactly for x ≠ 0 (indeed, Fn(0) = 0 for every n while
F(0) = 1).
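
The pointwise behaviour is easy to tabulate. The following Python sketch
(the test points and the n grid are arbitrary illustrative choices) makes
visible that Fn(x) → F(x) away from 0 while Fn(0) = 0 ≠ 1 = F(0):

```python
# Pointwise comparison of Fn(x) = 1{x >= 1/n} with F(x) = 1{x >= 0}
# from Example 2.3.2. At the discontinuity point x = 0 of F we have
# Fn(0) = 0 for every n while F(0) = 1, which is exactly why the
# definition requires convergence only at continuity points of F.
F = lambda x: 1.0 if x >= 0 else 0.0
Fn = lambda x, n: 1.0 if x >= 1 / n else 0.0

for x in (-0.1, 0.0, 0.1):
    values = [Fn(x, n) for n in (1, 10, 100, 1000)]
    print(f"x = {x:+.1f}: Fn(x) = {values}, F(x) = {F(x)}")
```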
Example 2.3.3. Let Z ∼ N(0, 1), and let {Xn, n ≥ 1} and {Yn, n ≥ 1} be two
sequences of random variables such that Xn = Z and Yn = −Z for all n ≥ 1.
Since −Z ∼ N(0, 1),

Xn →d Z and Yn →d Z.

However, since Xn + Yn = 0, Xn − Yn = 2Z and Xn Yn = −Z², we have

Xn + Yn ↛d Z + Z, Xn − Yn ↛d Z − Z, Xn Yn ↛d Z · Z.

Thus convergence in distribution is not, in general, preserved under sums,
differences, or products.
Example 2.3.4. This example concerns the Poisson approximation of the
binomial distribution. For the sake of illustration we assume, for
simplicity, that p = pn = λ/n. Suppose that Xn ∈ Bin(n, λ/n). Then

Xn →d Po(λ) as n → ∞.

That is, for fixed k, we have

C(n, k) (λ/n)^k (1 − λ/n)^{n−k} → e^{−λ} λ^k / k! as n → ∞,

where C(n, k) denotes the binomial coefficient.
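
The approximation can be checked numerically by comparing the two
probability mass functions. A minimal Python sketch (λ = 3 and the grids of
n and k are arbitrary illustrative choices):

```python
from math import comb, exp, factorial

# Example 2.3.4 numerically: the Bin(n, lam/n) probability mass
# function approaches the Poisson(lam) one as n grows.
lam = 3.0

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

for n in (10, 100, 10000):
    gap = max(abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam))
              for k in range(6))
    print(f"n = {n:>5}: max over k <= 5 of |Bin - Poisson| = {gap:.2e}")
```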

Example 2.3.5. Toss a symmetric coin, set X = 1 for heads and X = 0 for
tails, and let X2n = X and X2n−1 = 1 − X, n ≥ 1. Since X, X1, X2, ... all
have the same distribution, it follows, in particular, that

Xn →d X as n → ∞,

even though the sequence {Xn} alternates between X and 1 − X, and therefore
converges neither almost surely nor in probability.
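
A simulation of this example (a sketch assuming NumPy; the seed and sample
size are arbitrary choices): the odd-indexed variables have the same
distribution as X, yet they always differ from X by exactly 1.

```python
import numpy as np

# Example 2.3.5 in simulation: X_{2n} = X and X_{2n-1} = 1 - X share
# the Bernoulli(1/2) distribution, so Xn -> X in distribution, yet
# |X_{2n-1} - X| = 1 always, so Xn does not converge in probability.
rng = np.random.default_rng(2)
X = rng.integers(0, 2, size=100_000)   # fair coin: 0 or 1

X_odd = 1 - X                          # the odd-indexed variables
print("P(X_odd = 1) ~", X_odd.mean())  # ~ 0.5, same law as X
print("P(|X_odd - X| = 1) =", np.mean(np.abs(X_odd - X) == 1))  # exactly 1.0
```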

Chapter 3

Relations between convergence concepts

Let X and X1, X2, ... be random variables. The following implications hold
as n → ∞:

Xn →Lp X  =⇒  Xn →P X  =⇒  Xn →d X.

3.1 Relations between convergence in mean and convergence in probability

Theorem 3.1.1. (Convergence in mean of order p implies convergence in
probability). Let p > 0 and let {X, Xn, n ≥ 1} be random variables. If
Xn →Lp X, then Xn →P X.

Proof. By the definition of convergence in probability, we want to show
that for any ε > 0, lim_{n→∞} P(|Xn − X| ≥ ε) = 0.
Applying Markov's inequality to the random variable |Xn − X|^p, we have

P(|Xn − X| ≥ ε) = P(|Xn − X|^p ≥ ε^p) ≤ E(|Xn − X|^p) / ε^p.

Since Xn →Lp X, that is, lim_{n→∞} E(|Xn − X|^p) = 0, we get

lim_{n→∞} E(|Xn − X|^p) / ε^p = 0.

Hence

lim_{n→∞} P(|Xn − X| ≥ ε) = 0.

Therefore, Xn →P X.

Example 3.1.2. Let a ∈ R and let Xn be a discrete random variable taking
the values 0 and a with probabilities 1 − 1/n and 1/n respectively, n ≥ 1.
Then Xn →P 0 and Xn →L2 0 as n → ∞.

Proof. Indeed, for all ε > 0,

0 ≤ P(|Xn − 0| > ε) = P(|Xn| > ε) ≤ P(Xn ≠ 0) ≤ 1/n → 0 as n → ∞.

Then lim_{n→∞} P(|Xn − 0| > ε) = 0. Hence, Xn →P 0.
On the other hand,

E|Xn − 0|² = E|Xn|² = |a|² · (1/n) → 0 as n → ∞.

Therefore, Xn →L2 0.

The following examples show that the converse does not hold.

Example 3.1.3. (Convergence in probability does not imply convergence in
mean of order p). Let {Xn, n ≥ 1} be a sequence of random variables
satisfying

P(Xn = 0) = 1 − 1/n and P(Xn = n²) = 1/n, n ≥ 1.

Since P(|Xn| > ε) ≤ P(Xn = n²) = 1/n → 0 for every ε > 0, we have Xn →P 0.
However, Xn does not converge in mean of order r for any r ≥ 1.

Proof. For any r ≥ 1, we can write

E|Xn|^r = 0^r · (1 − 1/n) + (n²)^r · (1/n) = n^{2r−1}.

Then lim_{n→∞} E(|Xn|^r) = lim_{n→∞} n^{2r−1} = ∞.
Therefore, Xn does not converge in mean of order r for any r ≥ 1. In
particular, it is interesting to note that, although Xn →P 0, the expected
value of Xn does not converge to 0.
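
A simulation contrasting the two modes of convergence in this example (a
sketch assuming NumPy; the seed, sample size, and n grid are arbitrary
illustrative choices):

```python
import numpy as np

# Example 3.1.3 in simulation: P(Xn = 0) = 1 - 1/n, P(Xn = n^2) = 1/n.
# The tail P(|Xn| > eps) = 1/n vanishes (convergence in probability),
# while E[Xn] = n grows without bound (no convergence in mean).
rng = np.random.default_rng(3)
for n in (10, 100, 1000):
    x = rng.choice([0, n**2], size=1_000_000, p=[1 - 1 / n, 1 / n])
    print(f"n = {n:>4}: P(|Xn| > 0.5) ~ {np.mean(x > 0.5):.4f},"
          f" E[Xn] ~ {x.mean():.1f} (exact {n})")
```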

Example 3.1.4. Let the probability space be ([0, 1], B([0, 1]), λ), where λ
is Lebesgue measure, and set

Xn = 2ⁿ 1_{(0, 1/n)}.

Then

P(|Xn| > ε) = λ((0, 1/n)) = 1/n → 0,

but, for every p > 0,

E(|Xn|^p) = 2^{np} · (1/n) → ∞.
Example 3.1.5. (The continuous mapping theorem does not hold for
convergence in mean). Let {Xn, n ≥ 1} be a sequence of random variables
satisfying

P(Xn = 2) = 1 − 1/n³ and P(Xn = n²) = 1/n³, n ≥ 1.

Then

E|Xn − 2| = |n² − 2| · (1/n³) → 0,

so Xn →L1 2. However, for the continuous map x ↦ x²,

E|Xn² − 4| = |n⁴ − 4| · (1/n³) → ∞.
Example 3.1.6. (Convergence in mean does not imply almost sure
convergence). Consider the sequence {Xn} of indicator functions defined on
([0, 1], B([0, 1]), λ), where λ is Lebesgue measure:

X1 = 1_{[0, 1/2]},  X2 = 1_{[1/2, 1]},

X3 = 1_{[0, 1/3]},  X4 = 1_{[1/3, 2/3]},  X5 = 1_{[2/3, 1]},

X6 = 1_{[0, 1/4]},

and so on. Note that for any p > 0,

E(|X1|^p) = 1/2,  E(|X2|^p) = 1/2,

E(|X3|^p) = 1/3,  ...,  E(|X6|^p) = 1/4.

So E(|Xn|^p) → 0 and

Xn →Lp 0.

Observe, however, that {Xn} does not converge almost surely to 0: every
ω ∈ [0, 1] belongs to one interval in each block of the construction, so
Xn(ω) = 1 for infinitely many n.
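
This "typewriter" construction is easy to reproduce in code. The following
Python sketch (the helper `interval` and the test point ω = 0.37 are
illustrative choices, not part of the example) lists the interval lengths,
which equal E|Xn|^p, and the indices n for which Xn(ω) = 1:

```python
# The "typewriter" sequence of Example 3.1.6. The n-th indicator runs
# through the k-th block of k intervals of length 1/k (k = 2, 3, ...),
# so E|Xn|^p equals the interval length and tends to 0, yet every
# omega in [0, 1] lies in one interval of each block, so Xn(omega) = 1
# for infinitely many n.
def interval(n):
    """Return the endpoints (a, b) of the n-th typewriter interval."""
    k = 2
    while n > k:        # walk through blocks of sizes 2, 3, 4, ...
        n -= k
        k += 1
    return ((n - 1) / k, n / k)

omega = 0.37
lengths = [round(interval(n)[1] - interval(n)[0], 4) for n in (1, 5, 15, 40)]
hits = [n for n in range(1, 55) if interval(n)[0] <= omega <= interval(n)[1]]
print("E|Xn|^p for n = 1, 5, 15, 40:", lengths)
print("indices n <= 54 with Xn(omega) = 1:", hits)
```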

Example 3.1.7. (Convergence in probability does not imply a.s.
convergence). Let {An, n ≥ 1} be a sequence of independent events such that
lim_{n→∞} P(An) = 0 and Σ_{n=1}^∞ P(An) = ∞. Set Xn = 1(An), n ≥ 1. Since
lim_{n→∞} P(An) = 0, Xn → 0 in probability. However, the Borel-Cantelli 0-1
law ensures that

P(lim sup (|Xn − 0| > 1/2)) = P(lim sup An) = 1.

By the almost sure convergence criterion (Proposition 1.2.5), this implies
Xn ↛ 0 a.s.

3.2 Relations between convergence in probability and convergence in
distribution

Theorem 3.2.1. (Convergence in probability implies convergence in
distribution). Let {X, Xn, n ≥ 1} be random variables. If Xn →P X, then
Xn →d X.

Proof. Let ε > 0. Then for all n ≥ 1 and for all x ∈ R, by the definition
of the distribution function we have

Fn(x) = P(Xn ≤ x)
      = P((Xn ≤ x) ∩ (|Xn − X| ≤ ε)) + P((Xn ≤ x) ∩ (|Xn − X| > ε))
      ≤ P((Xn ≤ x) ∩ (|Xn − X| ≤ ε)) + P(|Xn − X| > ε)
      ≤ P(X ≤ x + ε) + P(|Xn − X| > ε).

(Here we used P(A) = P(AB) + P(AB^c) ≤ P(AB) + P(B^c), together with the
inclusion (Xn ≤ x) ∩ (|Xn − X| ≤ ε) ⊂ (X ≤ x + ε), which holds because
Xn ≤ x and |Xn − X| ≤ ε imply X ≤ x + ε.)
Since Xn →P X, this implies that

lim sup_{n→∞} Fn(x) ≤ F(x + ε).

Similarly,

F(x − ε) = P(X ≤ x − ε) ≤ Fn(x) + P(|Xn − X| > ε),

so

lim inf_{n→∞} Fn(x) ≥ F(x − ε).

Taking x ∈ C(F) and letting ε → 0, we obtain

F(x) ≤ lim inf_{n→∞} Fn(x) ≤ lim sup_{n→∞} Fn(x) ≤ F(x).

This ensures that

lim_{n→∞} Fn(x) = F(x) for all x ∈ C(F).
Example 3.2.2. (Convergence in distribution does not imply convergence in
probability). Let Z be a discrete random variable with

P(Z = −1) = P(Z = 1) = 1/2,

so that −Z has the same table of values: P(−Z = −1) = P(−Z = 1) = 1/2.
Put Xn = −Z, n ≥ 1. Then Xn →d Z as n → ∞, but Xn ↛P Z.

Indeed, we see that FZ(x) ≡ F−Z(x), so FXn(x) ≡ FZ(x) for all x ∈ R.
However, since Xn − Z = −2Z, taking ε = 1/2 we have

lim_{n→∞} P(|Xn − Z| > 1/2) = P(|2Z| > 1/2) = P(|Z| > 1/4) = 1.

Hence Xn ↛P Z as n → ∞.

Example 3.2.3. (Convergence in distribution does not imply convergence in
probability). Let Z ∼ N(0, 1) and let Xn, n ≥ 1 be a sequence of random
variables with Xn = −Z for all n ≥ 1. Since −Z ∼ N(0, 1), we obviously have

Xn →d Z.

However, since Xn − Z = −2Z and P(|2Z| > ε) does not tend to 0,

Xn ↛P Z.
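
A simulation of this example (a sketch assuming NumPy; the seed, sample
size, and test points are arbitrary illustrative choices): the empirical
distribution functions of Xn and Z agree, while |Xn − Z| = 2|Z| stays large.

```python
import numpy as np

# Example 3.2.3 in simulation: with Z ~ N(0, 1) and Xn = -Z, the
# empirical distribution functions of Xn and Z agree, yet
# |Xn - Z| = 2|Z| is not small: convergence in distribution without
# convergence in probability.
rng = np.random.default_rng(4)
z = rng.standard_normal(1_000_000)
xn = -z

for t in (-1.0, 0.0, 1.0):
    print(f"F_Xn({t:+.1f}) ~ {np.mean(xn <= t):.4f},"
          f" F_Z({t:+.1f}) ~ {np.mean(z <= t):.4f}")
print("P(|Xn - Z| > 1/2) ~", np.mean(np.abs(xn - z) > 0.5))  # ~ P(|Z| > 1/4) ~ 0.80
```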

Theorem 3.2.4. (Convergence in distribution to a constant). Let Xn, n ≥ 1
be a sequence of random variables. If Xn →d c, where c is a constant, then
Xn →P c.

Proof. Let Fn be the distribution function of Xn for all n ≥ 1. The
constant c has the distribution function

F(x) = 0 if x < c, and F(x) = 1 if x ≥ c.

The set of continuity points of F is C(F) = R \ {c}. Let ε > 0. Since
c ± ε ∈ C(F), we have

0 ≤ P(|Xn − c| > ε) = 1 − P(c − ε ≤ Xn ≤ c + ε)
  ≤ 1 − P(c − ε < Xn ≤ c + ε)
  = 1 − Fn(c + ε) + Fn(c − ε)
  → 1 − F(c + ε) + F(c − ε) = 1 − 1 + 0 = 0.

The proof of the theorem is complete.
Conclusion

In conclusion, the concepts of convergence in mean, convergence in
probability, and convergence in distribution provide valuable tools for
understanding the limiting behavior of sequences of random variables.
Convergence in mean (Xn →Lp X) focuses on the convergence of expected
values and moments, offering insight into the behavior of random variables
in terms of their averages. Convergence in probability (Xn →P X) addresses
the likelihood that random variables in a sequence get arbitrarily close to
their limiting value, emphasizing a probabilistic perspective. Convergence
in distribution (Xn →d X) deals with the convergence of cumulative
distribution functions, capturing the limiting distributional
characteristics. While these concepts are related, they do not always
coincide, highlighting the nuanced nature of stochastic convergence in
various scenarios. Understanding their distinctions is pivotal for sound
applications in probability theory and statistics.
References

[1] L. V. Thành, Cơ sở lý thuyết xác suất (Foundations of Probability
Theory, in Vietnamese).
[2] L. V. Thanh, Foundations of Probability Theory, Vinh University
Publisher.
[3] A. Gut, Probability: A Graduate Course, 2nd ed., Springer, 2013.
[4] S. I. Resnick, A Probability Path, Birkhäuser, 2014.
