
Note #2 Limiting Distributions

Limiting Distributions
We introduce modes of convergence for a sequence of random variables, and discuss convergence in probability and convergence in distribution. The concept of convergence leads us to the two fundamental results of probability theory: the law of large numbers and the central limit theorem.

Indicator function. We can define the indicator function I_A of a subset A of the real line by

    I_A(x) = 1 if x ∈ A;   I_A(x) = 0 otherwise.

Let X be a random variable having the pdf f(x). Then we can express the probability

    P(X ∈ A) = ∫_A f(x) dx = ∫_{−∞}^{∞} I_A(x) f(x) dx.

Chebyshev’s inequality. Suppose that X has the mean µ = E[X] and the variance σ² = Var(X), and that we choose the particular subset

    A = {x : |x − µ| > t}.

Then we can observe that

    I_A(x) ≤ (x − µ)²/t²  for all x,

which can be used to derive

    P(|X − µ| > t) = P(X ∈ A) = ∫_{−∞}^{∞} I_A(x) f(x) dx
                   ≤ (1/t²) ∫_{−∞}^{∞} (x − µ)² f(x) dx = (1/t²) Var(X) = σ²/t².

The inequality P(|X − µ| > t) ≤ σ²/t² is called Chebyshev’s inequality.
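As a quick numerical check, the bound can be compared with a Monte Carlo estimate; in the Python sketch below, the exponential(1) distribution and the values of t are arbitrary illustrative choices.

import numpy as np

# Monte Carlo check of Chebyshev's inequality for an exponential(1) variable
# (an illustrative choice with mean 1 and variance 1).
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)
mu, var = 1.0, 1.0

for t in (1.5, 2.0, 3.0):
    p_hat = np.mean(np.abs(x - mu) > t)          # estimate of P(|X - mu| > t)
    print(f"t={t}: P(|X-mu|>t) ~ {p_hat:.4f} <= sigma^2/t^2 = {var / t**2:.4f}")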

Convergence in probability. Let Y1, Y2, . . . be a sequence of random variables. Then we can define the event An = {|Yn − α| > ε} and the probability P(An) = P(|Yn − α| > ε). We say that the sequence Yn converges to α in probability if we have

    lim_{n→∞} P(|Yn − α| > ε) = 0,

for any choice of ε > 0. In general, the limit can itself be a random variable Y, and we say that Yn converges to Y in probability if it satisfies

    lim_{n→∞} P(|Yn − Y| > ε) = 0,

for any choice of ε > 0.

Laws of large numbers. Now let X1, X2, . . . be a sequence of iid random variables with mean µ and variance σ². Then for each n = 1, 2, . . ., the sample mean

    X̄n = (1/n) Σ_{i=1}^n Xi

has the mean µ and the variance σ²/n. Given a fixed ε > 0, we can apply Chebyshev’s inequality to obtain

    P(|X̄n − µ| > ε) ≤ (σ²/n)/ε² = σ²/(nε²) → 0  as n → ∞.

Laws of large numbers, continued. We have established the assertion

“the sample mean X̄n converges to µ in probability”

which is called the weak law of large numbers. A stronger version of this statement, “the sample mean X̄n converges to µ almost surely,” was established by Kolmogorov, and is called the strong law of large numbers.
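The weak law can also be observed numerically; the following sketch (with an exponential(1) sample and the arbitrary tolerance ε = 0.05) estimates P(|X̄n − µ| > ε) for increasing n.

import numpy as np

rng = np.random.default_rng(1)
eps, mu = 0.05, 1.0                     # tolerance and the mean of exponential(1)

for n in (10, 100, 1000, 10_000):
    # 2000 independent sample means, each based on n observations
    xbar = rng.exponential(scale=1.0, size=(2000, n)).mean(axis=1)
    print(f"n={n:>6}: estimated P(|Xbar - mu| > eps) = {np.mean(np.abs(xbar - mu) > eps):.3f}")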

Convergence in distribution. Let X1, X2, . . . be a sequence of random variables having the cdf’s F1, F2, . . ., and let X be a random variable having the cdf F. Then we say that the sequence Xn converges to X in distribution (in short, Xn converges to F) if

    lim_{n→∞} Fn(x) = F(x) for every x at which F(x) is continuous.   (2.1)

Convergence in distribution, continued. Convergence in distribution does not necessarily imply convergence in probability.

Example 1. Let Y be a uniform random variable on [−1/2, 1/2], and let X1 = Y, X2 = −Y, X3 = Y, X4 = −Y, · · · . Then every Xn has the same cdf

    Fn(t) = FY(t) = 0 if t < −1/2;   t + 1/2 if −1/2 ≤ t ≤ 1/2;   1 if t > 1/2.

Thus clearly Xn converges to Y in distribution. Does Xn converge to Y in probability?

Solution to Example 1. Let 0 < ε < 1 be arbitrary. Since we have

    P(|Xn − Y| > ε) = 0 if n is odd;   P(|Xn − Y| > ε) = P(|2Y| > ε) = 1 − ε if n is even,

lim_{n→∞} P(|Xn − Y| > ε) does not exist. Thus, Xn does not converge to Y in probability; in fact, the limit in probability does not exist.
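A brief numerical check of this example (an illustrative sketch): for even n we have Xn = −Y, so |Xn − Y| = |2Y|, while for odd n the difference is exactly 0.

import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(-0.5, 0.5, size=1_000_000)
eps = 0.3

print(np.mean(np.abs(-y - y) > eps))    # even n: estimates P(|2Y| > eps) = 1 - eps = 0.7
print(np.mean(np.abs(y - y) > eps))     # odd n: exactly 0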

Convergence in distribution, continued. Convergence in distribution does not necessarily establish lim_{n→∞} Fn(x) = F(x) at every point.

Example 2. Let Xn be a uniform random variable on [−1/n, 1/n], and let X ≡ 0 be the constant.

    Fn(t) = 0 if t < −1/n;   (nt + 1)/2 if −1/n ≤ t ≤ 1/n;   1 if t > 1/n.

Does Xn converge to X in distribution? In probability?


Solution to Example 2. Observe that the constant X ≡ 0 has the cdf

    F(t) = 0 if t < 0;   1 if t ≥ 0.

If t < 0 then lim_{n→∞} Fn(t) = 0 = F(t); if t > 0 then lim_{n→∞} Fn(t) = 1 = F(t). However, we can observe

    lim_{n→∞} Fn(0) = 1/2 ≠ F(0).

Since F(t) is not continuous at t = 0, this point is excluded from the requirement in (2.1), and Xn converges to X in distribution. Since for any ε > 0 we have

    lim_{n→∞} P(|Xn − X| > ε) = lim_{n→∞} P(|Xn| > ε) = 0,

it also converges in probability.

Alternative criterion for convergence in distribution. Let Y1, Y2, . . . be a sequence of random variables, and let Y be a random variable. Then the sequence Yn converges to Y in distribution if and only if

    lim_{n→∞} E[g(Yn)] = E[g(Y)]

for every continuous function g satisfying E[|g(Yn)|] < ∞ for all n. By setting g(y) = e^{ty}, we obtain the moment generating function M_Y(t) = E[g(Y)]. Thus, if Yn converges to Y in distribution then

    lim_{n→∞} M_{Yn}(t) = M_Y(t)

when M_{Yn}(t) exists for all n.

Alternative criterion, continued. Convergence in (2.1) is completely characterized in terms of the distributions F1, F2, . . . and F. Recall that the distributions F1(x), F2(x), . . . and F(x) are uniquely determined by the respective moment generating functions, say M1(t), M2(t), . . . and M(t). Furthermore, we have an “equivalent” version of the convergence in terms of the m.g.f.’s.

Theorem 3. If

    lim_{n→∞} Mn(t) = M(t) for every t (around 0),   (2.2)

then the corresponding distributions Fn converge to F in the sense of (2.1).

Slutsky’s theorem. Although the following theorem is technical, it plays a critical role later
in asymptotic theory of statistics.

Theorem 4. Let Z1, Z2, . . . and W1, W2, . . . be two sequences of random variables, and let c be a constant value. Suppose that the sequence Zn converges to Z in distribution, and that the sequence Wn converges to c in probability. Then

(a) The sequence Zn + Wn converges to Z + c in distribution.

(b) The sequence Zn Wn converges to cZ in distribution.

(c) Assuming c ≠ 0, the sequence Zn /Wn converges to Z/c in distribution.
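A common use of Theorem 4 is the studentized sample mean; the sketch below (the uniform distribution on [0, 2] and n = 300 are arbitrary choices) illustrates part (c) numerically.

import numpy as np

# Z_n = sqrt(n)(Xbar - mu)/sigma -> N(0,1) in distribution by the CLT, and
# W_n = S_n/sigma -> 1 in probability, so Z_n/W_n is also approximately N(0,1).
rng = np.random.default_rng(3)
n, reps = 300, 5000
mu, sigma = 1.0, (1.0 / 3.0) ** 0.5            # uniform on [0, 2]: mean 1, variance 1/3

x = rng.uniform(0.0, 2.0, size=(reps, n))
zn = np.sqrt(n) * (x.mean(axis=1) - mu) / sigma
wn = x.std(axis=1, ddof=1) / sigma
ratio = zn / wn

print(ratio.mean(), ratio.std())               # close to 0 and 1
print(np.quantile(ratio, [0.025, 0.975]))      # close to -1.96 and 1.96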


Central limit theorem. The central limit theorem (CLT) is the most important theorem in probability and statistics.

Theorem 5. Let X1 , X2 , . . . be a sequence of iid random variables having the common distribu-
tion F with mean 0 and variance 1. Then
    Zn = (1/√n) Σ_{i=1}^n Xi,   n = 1, 2, . . .

converges to the standard normal distribution.

A sketch of proof. Let M be the m.g.f. for the distribution F, and let Mn be the m.g.f. for the random variable Zn for n = 1, 2, . . .. Since Xi’s are iid random variables, we have

    Mn(t) = E[e^{tZn}] = E[e^{(t/√n)(X1+···+Xn)}]
          = E[e^{(t/√n)X1}] × · · · × E[e^{(t/√n)Xn}] = [M(t/√n)]^n.

A sketch of proof, continued. Observing that M(0) = 1, M′(0) = E[X1] = 0, and M″(0) = E[X1²] = 1, we can apply a Taylor expansion to M(t/√n) around 0:

    M(t/√n) = M(0) + (t/√n) M′(0) + (t²/2n) M″(0) + εn(t)
            = 1 + t²/(2n) + εn(t),

where εn(t) denotes the remainder term.
Recall that lim_{n→∞} (1 + a/n)^n = e^a; in a similar manner, we can obtain

    lim_{n→∞} Mn(t) = lim_{n→∞} [1 + (t²/2)/n + εn(t)]^n = e^{t²/2},

which is the m.g.f. for the standard normal distribution.
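The m.g.f. convergence in this sketch can also be checked numerically; below, X = E − 1 with E exponential(1) is an arbitrary choice with mean 0 and variance 1, whose m.g.f. is M(t) = e^{−t}/(1 − t) for t < 1.

import numpy as np

def M(t):
    # m.g.f. of X = E - 1 with E ~ exponential(1): M(t) = exp(-t) / (1 - t), t < 1
    return np.exp(-t) / (1.0 - t)

t = 1.0
for n in (10, 1000, 100_000):
    # M(t/sqrt(n))^n should approach exp(t^2/2), the m.g.f. of N(0,1)
    print(n, M(t / np.sqrt(n)) ** n, np.exp(t**2 / 2))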

Central limit theorem, modified. The mean and variance of the sequence can be arbitrary in general, and the CLT can be modified accordingly.

Theorem 6. Let X1 , X2 , . . . be a sequence of iid random variables having the common distribu-
tion F with mean µ and variance σ 2 . Then
    Zn = (Σ_{i=1}^n Xi − nµ)/(σ√n) = (1/√n) Σ_{i=1}^n (Xi − µ)/σ,   n = 1, 2, . . .

converges to the standard normal distribution.
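A small simulation of Theorem 6 (the exponential distribution with scale 0.5 is an arbitrary choice, giving µ = 0.5 and σ = 0.5): the empirical quantiles of Zn should approach the standard normal quantiles as n grows.

import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 0.5, 0.5

for n in (5, 30, 200):
    x = rng.exponential(scale=0.5, size=(10_000, n))
    zn = (x.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))
    # compare with the N(0,1) quantiles -1.645, 0, 1.645
    print(n, np.round(np.quantile(zn, [0.05, 0.5, 0.95]), 2))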



Normal approximation. Let X1, X2, . . . , Xn be finitely many iid random variables with mean µ and variance σ². If the size n is adequately large, then the distribution of the sum

    Y = Σ_{i=1}^n Xi

can be approximated by the normal distribution with parameter (nµ, nσ²). A general rule for “adequately large” n is about n ≥ 30, but it is often good for much smaller n.

Normal approximation, continued. Similarly the sample mean

    X̄n = (1/n) Σ_{i=1}^n Xi

has approximately the normal distribution with parameter (µ, σ²/n). The following normal approximations can then be applied to compute probabilities of the form P(a ≤ · ≤ b).

Random variable              Approximation    Probability
Y = X1 + · · · + Xn          N(nµ, nσ²)       Φ((b − nµ)/(σ√n)) − Φ((a − nµ)/(σ√n))
X̄ = (X1 + · · · + Xn)/n     N(µ, σ²/n)       Φ((b − µ)/(σ/√n)) − Φ((a − µ)/(σ/√n))
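The two rows of the table translate directly into small helper functions; the function names and the usage numbers below are hypothetical.

from scipy.stats import norm

def approx_sum_prob(a, b, n, mu, sigma):
    # Normal approximation of P(a <= X_1 + ... + X_n <= b)
    sd = sigma * n ** 0.5
    return norm.cdf((b - n * mu) / sd) - norm.cdf((a - n * mu) / sd)

def approx_mean_prob(a, b, n, mu, sigma):
    # Normal approximation of P(a <= Xbar_n <= b)
    sd = sigma / n ** 0.5
    return norm.cdf((b - mu) / sd) - norm.cdf((a - mu) / sd)

# Hypothetical example: 50 iid measurements with mean 10 and standard deviation 2.
print(approx_sum_prob(480, 520, 50, 10, 2))      # P(480 <= sum <= 520)
print(approx_mean_prob(9.6, 10.4, 50, 10, 2))    # P(9.6 <= mean <= 10.4)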

Example 7. An insurance company has 10,000 automobile policyholders. The expected yearly
claim per policyholder is $240 with a standard deviation of $800. Find approximately the
probability that the total yearly claim exceeds $2.7 million. Can you say that such an event is statistically highly unlikely?

Solution. Let X1, . . . , X10000 be the yearly claims of the individual policyholders, with mean µ = 240 and standard deviation σ = 800. Then Y = X1 + · · · + X10000 is approximately normally distributed with mean nµ = 2,400,000 and standard deviation √n σ = 80,000. Thus, we obtain

    P(Y > 2,700,000) ≈ 1 − Φ(3.75) ≈ 0,

which is highly unlikely.
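The arithmetic can be verified in a couple of lines (a sketch using scipy.stats):

from scipy.stats import norm

z = (2_700_000 - 2_400_000) / 80_000     # = 3.75
print(norm.sf(z))                        # P(Y > 2.7 million) is about 9e-5, effectively 0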

Normal approximation to binomial distribution. Suppose that X1, . . . , Xn are iid Bernoulli random variables with the mean p = E(X) and the variance p(1 − p) = Var(X). If the size n is adequately large, then the distribution of the sum

    Y = Σ_{i=1}^n Xi

can be approximated by the normal distribution with parameter (np, np(1 − p)). Thus, the
normal distribution N (np, np(1 − p)) approximates the binomial distribution B(n, p). A general
rule for “adequately large” n is to satisfy np ≥ 5 and n(1 − p) ≥ 5.
Normal approximation to binomial distribution, continued. Let Y be a binomial random
variable with parameter (n, p), and let Z be a normal random variable with parameter (np, np(1−
p)). Then the distribution of Y can be approximated by that of Z. Since Z is a continuous
random variable, the approximation of probability should improve when the following formula
of continuity correction is considered.

    P(i ≤ Y ≤ j) ≈ P(i − 0.5 ≤ Z ≤ j + 0.5)
                 = Φ((j + 0.5 − np)/√(np(1 − p))) − Φ((i − 0.5 − np)/√(np(1 − p))).
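A hypothetical illustration of the effect of the correction: for a binomial with n = 50 and p = 0.3, the corrected approximation of P(12 ≤ Y ≤ 18) should be noticeably closer to the exact value than the uncorrected one.

from scipy.stats import binom, norm

n, p, i, j = 50, 0.3, 12, 18
mu, sd = n * p, (n * p * (1 - p)) ** 0.5

corrected = norm.cdf((j + 0.5 - mu) / sd) - norm.cdf((i - 0.5 - mu) / sd)
uncorrected = norm.cdf((j - mu) / sd) - norm.cdf((i - mu) / sd)
exact = binom.cdf(j, n, p) - binom.cdf(i - 1, n, p)
print(corrected, uncorrected, exact)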




Exercises

Problem 1. Let Sn² denote the sample variance of a random sample of size n from a normal distribution with mean µ and variance σ².

(a) Find E[Sn²] and Var(Sn²).

(b) Show that Sn² converges to σ² in probability.

Problem 2. Let X1, . . . , Xn be a random sample of size n from the common pdf f(x) = e^{−(x−θ)} if θ ≤ x < ∞; otherwise, f(x) = 0, and let X(1) be the first order statistic, and Zn = n(X(1) − θ).

(a) Find the pdf for X(1) , and the mgf M (t).

(b) Find the mgf Mn (t) for Zn .

(c) Does Zn converge in distribution? If so, find the limiting distribution.

Problem 3. Suppose that Zn has a Poisson distribution with parameter λ = n. Then show that (Zn − n)/√n converges to the standard normal distribution.

Problem 4. The actual voltage of a new 1.5-volt battery has the probability density function

f (x) = 5, 1.4 ≤ x ≤ 1.6.

Estimate the probability that the sum of the voltages from 120 new batteries lies between 170
and 190 volts.

Problem 5. The germination time in days of a newly planted seed has the probability density
function
    f(x) = 0.3e^{−0.3x},  x ≥ 0.
If the germination times of different seeds are independent of one another, estimate the proba-
bility that the average germination time of 2000 seeds is between 3.1 and 3.4 days.

Problem 6. Calculate the following probabilities by using normal approximation with conti-
nuity correction.

(a) Let X be a binomial random variable with n = 10 and p = 0.7. Find P (X ≥ 8).

(b) Let X be a binomial random variable with n = 15 and p = 0.3. Find P (2 ≤ X ≤ 7).

(c) Let X be a binomial random variable with n = 9 and p = 0.4. Find P (X ≤ 4).

(d) Let X be a binomial random variable with n = 14 and p = 0.6. Find P (8 ≤ X ≤ 11).




Optional problems

Convergence in distribution.

Definition 8. A function f (x) is said to be bounded if there exists some M > 0 such that
|f (x)| ≤ M for all x. A function f (x) is called uniformly continuous if for any ε > 0 there exists
some δ > 0 such that |f (x) − f (y)| < ε whenever |x − y| < δ.

Let X1, X2, . . . be a sequence of random variables, and let X be a random variable. In order to see that Xn converges to X in distribution it suffices to show that

    lim_{n→∞} E[g(Xn)] = E[g(X)]

for every bounded and uniformly continuous function g.

Problem 7. Choose a bounded and uniformly continuous function f so that |f (x)| ≤ M for
all x and |f (x) − f (y)| < ε whenever |x − y| < δ. Let X be a random variable, and let
Aδ = {y : |y − c| < δ}. Suppose that a sequence Yn of random variables converges to a constant
c in probability.

(a) Show that |f (X + Yn ) − f (X + c)| ≤ 2M .

(b) Show that |f(X + Yn) − f(X + c)| I_{Aδ}(Yn) ≤ ε.

(c) Show that |E[f (X + Yn )] − E[f (X + c)]| ≤ 2M P (|Yn − c| ≥ δ) + ε

(d) The above result implies that limn→∞ |E[f (X + Yn )] − E[f (X + c)]| ≤ ε for every ε > 0,
and therefore, that limn→∞ |E[f (X + Yn )] − E[f (X + c)]| = 0. Then argue that X + Yn
converges to X + c in distribution.

Problem 8. Let Z1, Z2, . . . and W1, W2, . . . be two sequences of random variables, and let c be a constant value. Suppose that Zn converges to Z in distribution, and that Wn converges to c in probability.

(a) Choose a bounded and uniformly continuous function f so that |f (x)| ≤ M for all x and
|f (x) − f (y)| < ε whenever |x − y| < δ. Then show that

    |E[f(Zn + Wn)] − E[f(Z + c)]|
        ≤ |E[f(Zn + c)] − E[f(Z + c)]| + |E[f(Zn + Wn)] − E[f(Zn + c)]|
        ≤ |E[f(Zn + c)] − E[f(Z + c)]| + 2M P(|Wn − c| ≥ δ) + ε.

(b) The above result implies that limn→∞ |E[f (Zn + Wn )] − E[f (Z + c)]| ≤ ε, and therefore,
that limn→∞ |E[f (Zn + Wn )] − E[f (Z + c)]| = 0. Then argue that Zn + Wn converges to
Z + c in distribution (Slutsky’s theorem).




Answers to exercises
Problem 1. Recall that Wn−1 = (n − 1)Sn²/σ² has a chi-square distribution with (n − 1) degrees of freedom, and that E[Wn−1] = n − 1 and Var(Wn−1) = 2(n − 1).

(a) E[Sn²] = E[σ²Wn−1/(n − 1)] = (σ²/(n − 1)) E[Wn−1] = σ², and

    Var(Sn²) = Var(σ²Wn−1/(n − 1)) = (σ²/(n − 1))² Var(Wn−1) = 2σ⁴/(n − 1).

(b) By Chebyshev’s inequality we obtain

    P(|Sn² − σ²| > ε) ≤ 2σ⁴/((n − 1)ε²) → 0  as n → ∞.

Problem 2. (a) X(1) has the pdf f(1)(x) = n e^{−n(x−θ)} if x ≥ θ; otherwise, f(1)(x) = 0. Then we obtain

    M(t) = ∫_θ^∞ e^{tx} n e^{−n(x−θ)} dx = n e^{tθ} ∫_0^∞ e^{−(n−t)u} du = n e^{tθ}/(n − t)  for t < n.

(b) We obtain

    Mn(t) = E[e^{tn(X(1)−θ)}] = e^{−ntθ} M(nt) = 1/(1 − t)  for t < 1.

(c) Since lim_{n→∞} Mn(t) = 1/(1 − t), the limiting distribution is an exponential with λ = 1.

Problem 3. Let X1, X2, . . . be iid random variables having a Poisson distribution with λ = 1. Then we can write Zn = X1 + · · · + Xn, since a sum of n iid Poisson(1) variables has a Poisson distribution with parameter n. By the CLT,

    (Zn − n)/√n = (Σ_{i=1}^n Xi − n)/√n

converges to the standard normal distribution.

Problem 4. Let X1, . . . , X120 be the voltages of the individual batteries, each having the pdf f(x). Since µ = E[Xi] = 1.5 and σ² = E[Xi²] − µ² = (0.2)²/12 ≈ 0.0033, the sum Y = Σ_{i=1}^{120} Xi is approximately distributed as N(120µ, 120σ²) = N(180, 0.4). Thus, we obtain

    P(170 ≤ Y ≤ 190) = Φ((190 − 180)/√0.4) − Φ((170 − 180)/√0.4) ≈ 1.

Problem 5. Let X1, . . . , X2000 be the germination times of the individual seeds, having the exponential distribution with λ = 0.3. Since µ = E[Xi] = 1/0.3 ≈ 3.33 and σ² = Var(Xi) = 1/0.3² ≈ 11.11, the sample mean X̄ is approximately distributed as N(µ, σ²/n) = N(3.33, 0.0056). Thus,

    P(3.1 ≤ X̄ ≤ 3.4) = Φ((3.4 − 3.33)/√0.0056) − Φ((3.1 − 3.33)/√0.0056) ≈ 0.825.

Problem 6. (a) P(X ≥ 8) ≈ 1 − Φ((7.5 − (10)(0.7))/√((10)(0.7)(0.3))) ≈ 0.365

(b) P(2 ≤ X ≤ 7) ≈ Φ((7.5 − (15)(0.3))/√((15)(0.3)(0.7))) − Φ((1.5 − (15)(0.3))/√((15)(0.3)(0.7))) ≈ 0.909

(c) P(X ≤ 4) ≈ Φ((4.5 − (9)(0.4))/√((9)(0.4)(0.6))) ≈ 0.7299

(d) P(8 ≤ X ≤ 11) ≈ Φ((11.5 − (14)(0.6))/√((14)(0.6)(0.4))) − Φ((7.5 − (14)(0.6))/√((14)(0.6)(0.4))) ≈ 0.6429
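These four values can be reproduced and compared with the exact binomial probabilities (a sketch using scipy.stats):

from scipy.stats import binom, norm

def z(x, n, p):
    # standardize x under the N(np, np(1-p)) approximation
    return (x - n * p) / (n * p * (1 - p)) ** 0.5

# (a) P(X >= 8), Bin(10, 0.7): approximation vs exact
print(1 - norm.cdf(z(7.5, 10, 0.7)), binom.sf(7, 10, 0.7))
# (b) P(2 <= X <= 7), Bin(15, 0.3)
print(norm.cdf(z(7.5, 15, 0.3)) - norm.cdf(z(1.5, 15, 0.3)),
      binom.cdf(7, 15, 0.3) - binom.cdf(1, 15, 0.3))
# (c) P(X <= 4), Bin(9, 0.4)
print(norm.cdf(z(4.5, 9, 0.4)), binom.cdf(4, 9, 0.4))
# (d) P(8 <= X <= 11), Bin(14, 0.6)
print(norm.cdf(z(11.5, 14, 0.6)) - norm.cdf(z(7.5, 14, 0.6)),
      binom.cdf(11, 14, 0.6) - binom.cdf(7, 14, 0.6))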

