
MTL390: Statistical Methods

Instructor: Dr. Biplab Paul


January 10, 2025

Lecture 4

Revision of Probability Distribution (cont.)

Gamma distribution
A random variable X is said to possess a gamma probability distribution with parameters
a > 0 and b > 0 if it has the probability density function (PDF) given by:
\[
f(x) =
\begin{cases}
\dfrac{1}{b^{a}\,\Gamma(a)}\, x^{a-1} e^{-x/b}, & \text{if } x > 0,\\[4pt]
0, & \text{otherwise.}
\end{cases}
\]
The gamma density has two parameters, a and b. We denote this by Gamma(a, b).
The parameter a is called the shape parameter, and b is called the scale parameter.
If X is a gamma random variable with parameters a > 0 and b > 0, then:

\[
E(X) = ab \quad \text{and} \quad \mathrm{Var}(X) = ab^{2}.
\]


The moment-generating function is given by:
\[
M_X(t) = (1 - bt)^{-a}, \qquad t < \frac{1}{b}.
\]
The gamma probability distribution has found applications in various fields. For
example, in engineering, the gamma probability distribution has been employed in the
study of system reliability.
Example The daily consumption of aviation fuel in millions of gallons at a certain airport
can be treated as a gamma random variable with a = 3 and b = 1.

(a) What is the probability that on a given day the fuel consumption will be less than
1 million gallons?

(b) Suppose the airport can store only 2 million gallons of fuel. What is the probability
that the fuel supply will be inadequate on a given day?

Solution Let X be the fuel consumption in millions of gallons on a given day at a certain
airport. Then, X ∼ Gamma(3, 1).

(a)
\[
P(X < 1) = \int_{0}^{1} \frac{1}{2} x^{2} e^{-x}\, dx \approx 0.08025.
\]
Thus, there is about an 8% chance that on a given day the fuel consumption will
be less than 1 million gallons.

(b) Because the airport can store only 2 million gallons, the fuel supply will be inade-
quate if the fuel consumption X is greater than 2. Thus,
\[
P(X > 2) = \int_{2}^{\infty} \frac{1}{2} x^{2} e^{-x}\, dx \approx 0.677.
\]

We can conclude that there is about a 67.7% chance that the fuel supply of 2 million
gallons will be inadequate on a given day. So, if the model is correct, the airport
needs to store more than 2 million gallons of fuel.
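As a quick numerical cross-check of both parts (a sketch, assuming SciPy is available; SciPy's gamma distribution uses the same shape–scale parameterization as Gamma(a, b)):

```python
# Numerical check of the fuel-consumption example with SciPy's gamma distribution.
from scipy.stats import gamma

a, b = 3, 1  # shape and scale of X ~ Gamma(3, 1)

print(gamma.mean(a, scale=b), gamma.var(a, scale=b))  # E(X) = ab = 3, Var(X) = ab^2 = 3
print(gamma.cdf(1, a, scale=b))  # P(X < 1) ~ 0.0803  (part a)
print(gamma.sf(2, a, scale=b))   # P(X > 2) ~ 0.6767  (part b)
```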

Limit Theorems
Limit theorems play a very important role in the study of probability theory and statistics.
We already observed that some binomial probabilities can be computed using the Poisson
distribution. They can also be computed using the normal distribution by employing
limiting arguments.
In this section, we discuss modes of convergence, the law of large numbers, and the
Central Limit Theorem. First, we present Chebyshev’s theorem, which is a useful result
for proving limit theorems.

Theorem 1 (Chebyshev’s Theorem). Let the random variable X have a mean µ and
standard deviation σ. Then for K > 0, a constant:
\[
P(|X - \mu| \ge K\sigma) \le \frac{1}{K^{2}}.
\]
Proof. We will work with the continuous case. By the definition of the variance of X,
\[
\sigma^{2} = E\left[(X - \mu)^{2}\right] = \int_{-\infty}^{\infty} (x - \mu)^{2} f(x)\, dx.
\]

This can be expressed as:
\[
\begin{aligned}
\sigma^{2} &= \int_{-\infty}^{\mu - K\sigma} (x-\mu)^{2} f(x)\, dx + \int_{\mu - K\sigma}^{\mu + K\sigma} (x-\mu)^{2} f(x)\, dx + \int_{\mu + K\sigma}^{\infty} (x-\mu)^{2} f(x)\, dx \\
&\ge \int_{-\infty}^{\mu - K\sigma} (x-\mu)^{2} f(x)\, dx + \int_{\mu + K\sigma}^{\infty} (x-\mu)^{2} f(x)\, dx.
\end{aligned}
\]

Note that (x − µ)² ≥ K²σ² for x ≤ µ − Kσ or x ≥ µ + Kσ. Then
\[
\sigma^{2} \ge K^{2}\sigma^{2}\left[P(X \le \mu - K\sigma) + P(X \ge \mu + K\sigma)\right].
\]

Equivalently,
\[
P(|X - \mu| \ge K\sigma) \le \frac{1}{K^{2}},
\]
or
\[
P(|X - \mu| < K\sigma) \ge 1 - \frac{1}{K^{2}}.
\]

We can also write Chebyshev's theorem as:
\[
P(|X - \mu| \ge \epsilon) \le \frac{E\left[(X - \mu)^{2}\right]}{\epsilon^{2}} = \frac{\mathrm{Var}(X)}{\epsilon^{2}}, \qquad \text{for any } \epsilon > 0.
\]
Equivalently, the theorem gives a lower bound on the probability that X lies between two
points that are on opposite sides of the mean and equidistant from it. The strength of this result lies in
the fact that we need not know the distribution of the underlying population, other than
its mean and variance. This result was developed by the Russian mathematician Pafnuty
Chebyshev (1821-1894).
Example A random variable X has mean 24 and variance 9. Obtain a bound on the
probability that the random variable X assumes values between 16.5 and 31.5.
Solution: From Chebyshev's theorem, we have:
\[
P(\mu - K\sigma < X < \mu + K\sigma) \ge 1 - \frac{1}{K^{2}}.
\]
Equating µ + Kσ = 31.5 and µ − Kσ = 16.5, with µ = 24 and σ = √9 = 3, we obtain
K = 2.5. Hence, the probability is
\[
P(16.5 < X < 31.5) \ge 1 - \frac{1}{2.5^{2}} = 0.84.
\]
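A small numerical illustration of this example (a sketch; the distribution of X is not specified, so a normal distribution with the same mean and variance is only an assumed stand-in to compare against the distribution-free bound):

```python
# Chebyshev bound vs. the exact probability under an assumed N(24, 3^2) model.
from scipy.stats import norm

mu, sigma, K = 24, 3, 2.5
bound = 1 - 1 / K**2                                   # 0.84, holds for any distribution
exact_if_normal = norm.cdf(31.5, mu, sigma) - norm.cdf(16.5, mu, sigma)
print(bound, exact_if_normal)                          # 0.84 vs ~0.9876: bound holds, but is conservative
```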

Modes of Convergence
In this section, we discuss modes of convergence, namely, convergence in distribution (or
law) and convergence in probability, and their relationship.

Convergence in distribution (or law)

Let {F_n} be a sequence of distribution functions. If there exists a distribution function
F such that, as n → ∞,
\[
F_n(x) \to F(x)
\]
at every point x at which F is continuous, we say that F_n converges in law (or weakly)
to F, and we write $F_n \xrightarrow{w} F$.
If {X_n} is a sequence of random variables and {F_n} is the corresponding sequence of
distribution functions, we say that X_n converges in distribution (or law) to X if there
exists a random variable X with distribution function F such that $F_n \xrightarrow{w} F$. We write
$X_n \xrightarrow{d} X$ or $X_n \xrightarrow{L} X$.
Example Let X_1, X_2, ..., X_n be i.i.d. random variables with a common density function:
\[
f_X(x) =
\begin{cases}
\dfrac{1}{\theta}, & 0 < x < \theta \ (0 < \theta < \infty),\\[4pt]
0, & \text{otherwise.}
\end{cases}
\]
Let X_{(n)} = max(X_1, X_2, ..., X_n). Then the density function of X_{(n)} is:
\[
f_{X_{(n)}}(x) =
\begin{cases}
\dfrac{n x^{n-1}}{\theta^{n}}, & 0 < x < \theta,\\[4pt]
0, & \text{otherwise,}
\end{cases}
\]
and the distribution function (DF) of X_{(n)} is:
\[
F_{X_{(n)}}(x) =
\begin{cases}
0, & x < 0,\\[2pt]
\left(\dfrac{x}{\theta}\right)^{n}, & 0 \le x < \theta,\\[4pt]
1, & x \ge \theta.
\end{cases}
\]
We see that as n → ∞,
\[
F_{X_{(n)}}(x) \to F_X(x) =
\begin{cases}
0, & \text{if } x < \theta,\\
1, & \text{if } x \ge \theta,
\end{cases}
\]
which is a distribution function. Thus, $F_{X_{(n)}} \xrightarrow{w} F_X$.
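A simulation sketch of this example (θ = 2 and the sample sizes are assumed illustration values): the empirical CDF of X_{(n)} evaluated just below θ matches (x/θ)^n and tends to 0, so in the limit all mass piles up at θ.

```python
# Simulating the maximum of n i.i.d. Uniform(0, theta) variables.
import numpy as np

rng = np.random.default_rng(0)
theta, x = 2.0, 1.9          # evaluate the CDF just below theta
for n in (5, 50, 500):
    maxima = rng.uniform(0, theta, size=(10_000, n)).max(axis=1)
    print(n, (maxima <= x).mean(), (x / theta) ** n)   # empirical vs exact (x/theta)^n
```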

Convergence in probability
Let {X_n} be a sequence of random variables defined on some probability space (Ω, S, P).
We say that the sequence {X_n} converges in probability to the random variable X if, for
every ϵ > 0,
\[
P(|X_n - X| > \epsilon) \to 0 \quad \text{as } n \to \infty.
\]
We write $X_n \xrightarrow{P} X$.
Example Let {X_n} be a sequence of random variables with the probability mass function
(PMF)
\[
P\{X_n = 1\} = \frac{1}{n}, \qquad P\{X_n = 0\} = 1 - \frac{1}{n}.
\]
Then
\[
P(|X_n| > \epsilon) = P\{X_n = 1\} = \frac{1}{n}, \qquad 0 < \epsilon < 1,
\]
and
\[
P(|X_n| > \epsilon) = 0, \qquad \text{if } \epsilon \ge 1.
\]
It follows that
\[
P(|X_n| > \epsilon) \to 0 \quad \text{as } n \to \infty,
\]
and we conclude that $X_n \xrightarrow{P} 0$. This means X_n converges in probability to a random
variable X that is degenerate at 0, i.e., P(X = 0) = 1.
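A simulation sketch of this example (the sample sizes and ε = 0.5 are assumed illustration values): the empirical frequency of {|X_n| > ε} tracks 1/n and tends to 0.

```python
# Convergence in probability: Xn = 1 with probability 1/n, 0 otherwise.
import numpy as np

rng = np.random.default_rng(0)
eps, reps = 0.5, 100_000
for n in (10, 100, 1000, 10_000):
    xn = (rng.random(reps) < 1 / n).astype(float)      # draws of Xn ~ Bernoulli(1/n)
    print(n, (np.abs(xn) > eps).mean(), 1 / n)         # empirical P(|Xn| > eps) vs 1/n
```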
Remark
(a) $X_n \xrightarrow{P} X$ implies $X_n \xrightarrow{d} X$, but the converse is not always true.
(b) Let k be a constant. Then $X_n \xrightarrow{L} k$ iff $X_n \xrightarrow{P} k$.
Theorem 2 (Slutsky's Theorem). Let {X_n, Y_n}, n = 1, 2, ..., be a sequence of pairs of
random variables, and let c be a constant. Then:
(a) If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{P} c$, then $X_n + Y_n \xrightarrow{d} X + c$.
(b) If $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{P} c$, and c ≠ 0, then $X_n Y_n \xrightarrow{d} cX$. If c = 0, then $X_n Y_n \xrightarrow{P} 0$.
(c) If $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{P} c$, and c ≠ 0, then $X_n / Y_n \xrightarrow{d} X/c$.
Theorem 3 (Law of Large Numbers). Let X_1, ..., X_n be a set of pairwise independent
and identically distributed random variables with E(X_i) = µ and Var(X_i) = σ² < ∞. Then for any
c > 0,
\[
P\left(\mu - c \le \bar{X} \le \mu + c\right) \ge 1 - \frac{\sigma^{2}}{n c^{2}},
\]
and as n → ∞, this probability approaches 1. Equivalently,
\[
P\left(\left|\frac{S_n}{n} - \mu\right| \ge \epsilon\right) \to 0 \quad \text{as } n \to \infty,
\]
where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S_n = \sum_{i=1}^{n} X_i$; that is,
\[
\frac{S_n}{n} \xrightarrow{P} \mu \quad \text{as } n \to \infty.
\]
Proof. Since X_1, ..., X_n are independent and identically distributed (iid) random variables
(a random sample), we know that:
\[
\mathrm{Var}(S_n) = n\sigma^{2}, \quad \text{and} \quad \mathrm{Var}\left(\frac{S_n}{n}\right) = \frac{\sigma^{2}}{n}.
\]
Also,
\[
E\left(\frac{S_n}{n}\right) = \mu.
\]
By Chebyshev's theorem, for any ϵ > 0,
\[
P\left(\left|\frac{S_n}{n} - \mu\right| \ge \epsilon\right) \le \frac{\sigma^{2}}{n\epsilon^{2}}.
\]
Thus, for any fixed ϵ,
\[
P\left(\left|\frac{S_n}{n} - \mu\right| \ge \epsilon\right) \to 0 \quad \text{as } n \to \infty.
\]
Equivalently,
\[
P\left(\left|\frac{S_n}{n} - \mu\right| < \epsilon\right) \to 1 \quad \text{as } n \to \infty.
\]

The law of large numbers states that if the sample size n is large, the sample mean is
unlikely to deviate much from the mean of the distribution of X, which in statistics is called
the population mean. This result basically states that we can start with a random experiment
whose outcome cannot be predicted with certainty, and by taking averages, we can ob-
tain an experiment in which the outcome can be predicted with a high degree of accuracy.

Example Let X_1, ..., X_n be iid Bernoulli random variables with parameter p. Verify
the law of large numbers.
Solution: For Bernoulli random variables, we know that:
\[
E(X_i) = p, \quad \text{and} \quad \mathrm{Var}(X_i) = p(1 - p).
\]
Thus, by Chebyshev's Theorem, we have:
\[
P\left(\left|\bar{X} - p\right| \ge c\right) \le \frac{\mathrm{Var}(S_n)}{n^{2} c^{2}}.
\]
Since Var(S_n) = n Var(X_i) = np(1 − p), we get:
\[
P\left(\left|\bar{X} - p\right| \ge c\right) \le \frac{p(1 - p)}{n c^{2}}.
\]
As n → ∞, the right-hand side tends to 0. Therefore,
\[
P\left(\left|\bar{X} - p\right| \ge c\right) \to 0 \quad \text{as } n \to \infty.
\]
This verifies the weak law of large numbers.
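A simulation sketch of this verification (p = 0.3, c = 0.05, and the sample sizes are assumed illustration values): the empirical probability that |X̄ − p| ≥ c shrinks with n and stays below the Chebyshev bound p(1 − p)/(nc²).

```python
# Weak law of large numbers for Bernoulli(p) samples, compared with the Chebyshev bound.
import numpy as np

rng = np.random.default_rng(0)
p, c, reps = 0.3, 0.05, 20_000
for n in (50, 500, 5000):
    xbar = rng.binomial(1, p, size=(reps, n)).mean(axis=1)   # sample means of n Bernoulli draws
    empirical = (np.abs(xbar - p) >= c).mean()
    bound = min(p * (1 - p) / (n * c**2), 1.0)               # Chebyshev bound, capped at 1
    print(n, empirical, bound)
```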

Exercise Consider n rolls of a balanced die. Let X_i be the outcome of the i-th roll,
and let $S_n = \sum_{i=1}^{n} X_i$. Show that, for any ϵ > 0,
\[
P\left(\left|\frac{S_n}{n} - \frac{7}{2}\right| \ge \epsilon\right) \to 0 \quad \text{as } n \to \infty.
\]
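This is not the requested proof (which follows from Chebyshev's theorem exactly as in the Bernoulli example above), but a quick simulation sketch showing the running average of fair die rolls settling near 7/2 (the sample sizes are assumed illustration values):

```python
# Running average of fair die rolls approaching 7/2.
import numpy as np

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    rolls = rng.integers(1, 7, size=n)   # faces 1..6, each with probability 1/6
    print(n, rolls.mean())               # approaches 3.5 as n grows
```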

Now we will discuss one of the most important results in probability theory: the Central
Limit Theorem.

Theorem 4 (Central Limit Theorem (CLT)). Let X_1, X_2, ..., X_n be a random sample
from an infinite population with mean µ and variance σ². Then the limiting distribution
of
\[
Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},
\]
as n → ∞, is the standard normal distribution. That is,
\[
\lim_{n \to \infty} P(Z_n \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^{2}/2}\, dt.
\]
This states that the standardized sample mean is asymptotically standard normal. The
remarkable feature of the Central Limit Theorem is that, no matter what the shape of the
original distribution is, the (sampling) distribution of the mean approaches a normal
probability distribution. In other words, when we repeat an experiment a large number of
times, the average of the outcomes approximately follows a Gaussian distribution.
Example Let X_1, X_2, ... be i.i.d. random variables such that:
\[
X_i =
\begin{cases}
1, & \text{with probability } p,\\
0, & \text{with probability } 1 - p.
\end{cases}
\]
Show that
\[
Z_n = \frac{S_n - np}{\sqrt{npq}}
\]
is approximately normal for large n, where $S_n = \sum_{i=1}^{n} X_i$ and q = 1 − p.
Solution: For the given random variable X_i:
\[
E[X_i] = p, \quad \text{and} \quad \mathrm{Var}(X_i) = p(1 - p) = pq.
\]
The sum $S_n = \sum_{i=1}^{n} X_i$ has:
\[
E[S_n] = \sum_{i=1}^{n} E[X_i] = np, \quad \text{and} \quad \mathrm{Var}(S_n) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = npq.
\]
The standardized random variable is:
\[
Z_n = \frac{S_n - E[S_n]}{\sqrt{\mathrm{Var}(S_n)}} = \frac{S_n - np}{\sqrt{npq}}.
\]

Hence, by the Central Limit Theorem (CLT), the limiting distribution of Zn as n → ∞
is the standard normal probability distribution.
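A simulation sketch of this special case (p = 0.25 and n = 400 are assumed illustration values): the empirical distribution of the standardized count Z_n is close to the standard normal CDF Φ.

```python
# De Moivre–Laplace: standardized binomial counts vs. the standard normal CDF.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, p = 400, 0.25
q = 1 - p
sn = rng.binomial(n, p, size=100_000)            # replications of Sn
zn = (sn - n * p) / np.sqrt(n * p * q)           # standardized counts Zn
for z in (-1.0, 0.0, 1.5):
    print(z, (zn <= z).mean(), norm.cdf(z))      # empirical P(Zn <= z) vs Phi(z)
```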

Example A soft-drink vending machine is set so that the amount of drink dispensed
is a random variable with a mean of 8 ounces and a standard deviation of 0.4 ounces.
What is the approximate probability that the average of 36 randomly chosen fills exceeds
8.1 ounces?
Solution: From the CLT,
\[
\frac{\bar{X} - 8}{0.4/\sqrt{36}} \sim N(0, 1).
\]
Hence, from the normal table,
\[
P\left(\bar{X} > 8.1\right) = P\left(Z > \frac{8.1 - 8.0}{0.4/\sqrt{36}}\right) = P(Z > 1.5) = 0.0668.
\]
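The same calculation done numerically (a sketch using SciPy's normal survival function):

```python
# Vending-machine example: P(X-bar > 8.1) under the CLT normal approximation.
from scipy.stats import norm

mu, sigma, n = 8.0, 0.4, 36
z = (8.1 - mu) / (sigma / n**0.5)   # = 1.5
print(z, norm.sf(z))                # ~ 0.0668
```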
