
The University of Hong Kong

Department of Statistics and Actuarial Science


STAT2602B Probability and Statistics II
Semester 2 2023/2024
Topic 1: Sampling Distributions

1 Moment-generating Functions
Definition 1.1. The moment-generating function (MGF) of a random variable X is defined as

    M_X(t) = E(e^{tX}),

provided the expectation on the right-hand side exists for t in an open interval containing 0. We may also call M_X(t) the moment-generating function of the distribution followed by X.
Remarks:
• M_X(0) = E(e^0) = 1.

• The moment-generating function of a discrete random variable X is

    M_X(t) = \sum_x e^{tx} f(x),

  and that of a continuous random variable X is

    M_X(t) = \int_{−∞}^{∞} e^{tx} f(x) dx.

• The subscript X indicates that it is the moment-generating function of the random variable X; it may be dropped when no ambiguity arises.
Example 1.1. The moment-generating function of a random variable X having a Poisson distribution with mean λ is

    M_X(t) = E(e^{tX}) = \sum_{x=0}^{∞} e^{tx} P(X = x) = \sum_{x=0}^{∞} e^{tx} \frac{e^{−λ} λ^x}{x!} = e^{−λ} \sum_{x=0}^{∞} \frac{(λe^t)^x}{x!} = e^{−λ} e^{λe^t}
           = e^{λ(e^t − 1)}.
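As a quick numerical sanity check (a sketch, not part of the original notes), the closed form e^{λ(e^t − 1)} can be compared against a truncated version of the defining series; the values λ = 2.5 and t = 0.3 below, and the function names, are arbitrary illustrative choices.

```python
import math

def poisson_mgf_series(lam, t, terms=100):
    """Truncated defining series: sum over x of e^{tx} e^{-lam} lam^x / x!."""
    total = 0.0
    term = math.exp(-lam)                    # x = 0 term: e^{-lam}(lam e^t)^0 / 0!
    for x in range(terms):
        total += term
        term *= lam * math.exp(t) / (x + 1)  # move from term x to term x + 1
    return total

def poisson_mgf_closed(lam, t):
    """Closed form e^{lam(e^t - 1)} derived in Example 1.1."""
    return math.exp(lam * (math.exp(t) - 1.0))

lam, t = 2.5, 0.3
print(poisson_mgf_series(lam, t), poisson_mgf_closed(lam, t))  # the two agree
```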

Theorem 1.1. Suppose M_X(t) exists. Then,

1. M_X(t) = \sum_{r=0}^{∞} \frac{t^r}{r!} E(X^r);

2. \frac{d^r}{dt^r} M_X(t) |_{t=0} = E(X^r) for r = 1, 2, ...;

3. For constants a and b,

    M_{aX+b}(t) = e^{bt} M_X(at).

STAT2602B TST23/24 Topic 1

Proof. 1. For discrete random variables,

    M_X(t) = \sum_x e^{tx} P(X = x) = \sum_x \sum_{r=0}^{∞} \frac{(tx)^r}{r!} P(X = x) = \sum_{r=0}^{∞} \frac{t^r}{r!} \sum_x x^r P(X = x)
           = \sum_{r=0}^{∞} \frac{t^r}{r!} E(X^r).

For continuous random variables, use integrals instead of sums.

2. Consider the Taylor series expansion of the exponential function e^{tx},

    e^{tx} = \sum_{k=0}^{∞} \frac{(tx)^k}{k!} = 1 + tx + \frac{1}{2!} t^2 x^2 + \frac{1}{3!} t^3 x^3 + ... + \frac{1}{r!} t^r x^r + ... .

By definition,

    M_X(t) = 1 + t E(X) + \frac{1}{2!} t^2 E(X^2) + \frac{1}{3!} t^3 E(X^3) + ... + \frac{1}{r!} t^r E(X^r) + ... .

Differentiating with respect to t,

    \frac{d}{dt} M_X(t) = E(X) + t E(X^2) + \frac{1}{2!} t^2 E(X^3) + ... + \frac{1}{(r−1)!} t^{r−1} E(X^r) + ...,

    \frac{d^2}{dt^2} M_X(t) = E(X^2) + t E(X^3) + ... + \frac{1}{(r−2)!} t^{r−2} E(X^r) + ... .

In general,

    \frac{d^r}{dt^r} M_X(t) = E(X^r) + sum of terms in t.

Substituting t = 0, we obtain, for example,

    \frac{d}{dt} M_X(t) |_{t=0} = E(X),    \frac{d^2}{dt^2} M_X(t) |_{t=0} = E(X^2).

In general, we have

    \frac{d^r}{dt^r} M_X(t) |_{t=0} = E(X^r).

3. By definition,

    M_{aX+b}(t) = E(e^{(aX+b)t}) = e^{bt} E(e^{atX}) = e^{bt} M_X(at).

Example 1.2. Suppose that X is a continuous random variable with a probability density function given by

    f(x) = e^{−x} for x > 0, and f(x) = 0 otherwise.


Find the moment-generating function of X and hence find the mean and the variance of X.

The moment-generating function of X is

    M_X(t) = E(e^{tX}) = \int_{−∞}^{∞} e^{tx} f(x) dx = \int_0^{∞} e^{tx} e^{−x} dx = \int_0^{∞} e^{tx−x} dx
           = \int_0^{∞} e^{−(1−t)x} dx = [ −\frac{1}{1−t} e^{−(1−t)x} ]_0^{∞}
           = \frac{1}{1−t},

which is valid if 1 − t > 0, that is, t < 1. Differentiating M_X(t) with respect to t,

    \frac{d}{dt} M_X(t) = \frac{1}{(1−t)^2},    \frac{d^2}{dt^2} M_X(t) = \frac{2}{(1−t)^3}.

Recall that

    \frac{d}{dt} \frac{1}{(1−t)^n} = \frac{d}{d(1−t)} (1−t)^{−n} × \frac{d(1−t)}{dt} = −n(1−t)^{−n−1} × (−1) = \frac{n}{(1−t)^{n+1}}.

Substituting t = 0 gives

    \frac{d}{dt} M_X(t) |_{t=0} = \frac{1}{(1−0)^2} = 1,    \frac{d^2}{dt^2} M_X(t) |_{t=0} = \frac{2}{(1−0)^3} = 2.

Hence, the first two moments of X and its mean and variance are, respectively,

    E(X) = \frac{d}{dt} M_X(t) |_{t=0} = 1,    E(X^2) = \frac{d^2}{dt^2} M_X(t) |_{t=0} = 2,
    μ = E(X) = 1,    σ^2 = E(X^2) − μ^2 = 2 − 1^2 = 1.
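The mean and variance just obtained can be double-checked numerically (a sketch, not from the notes): finite differences of M(t) = 1/(1 − t) at t = 0 approximate the first two derivatives, hence the first two moments; the step size h is an arbitrary small choice.

```python
# Finite-difference check of Example 1.2: differentiating M(t) = 1/(1 - t)
# at t = 0 should recover E(X) = 1 and E(X^2) = 2, hence variance 1.

def M(t):
    return 1.0 / (1.0 - t)

h = 1e-5
first = (M(h) - M(-h)) / (2 * h)              # central difference ~ M'(0)
second = (M(h) - 2 * M(0.0) + M(-h)) / h**2   # second difference ~ M''(0)
variance = second - first**2
print(first, second, variance)   # close to 1, 2, 1
```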


Theorem 1.2. If X_1, X_2, ..., X_n are independent random variables, M_{X_i}(t) exists for i = 1, 2, ..., n, and Y = X_1 + X_2 + ... + X_n, then M_Y(t) exists and

    M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t).

Proof. Suppose all X_i's are continuous random variables with marginal probability density functions f_1(x_1), f_2(x_2), ..., f_n(x_n), respectively. By independence, the joint density is the product of the marginals. Then,

    M_Y(t) = E(e^{(X_1+X_2+...+X_n)t})
           = \int_{−∞}^{∞} ... \int_{−∞}^{∞} e^{(x_1+x_2+...+x_n)t} f_1(x_1) f_2(x_2) ... f_n(x_n) dx_1 dx_2 ... dx_n
           = \int_{−∞}^{∞} e^{x_1 t} f_1(x_1) dx_1 \int_{−∞}^{∞} e^{x_2 t} f_2(x_2) dx_2 ... \int_{−∞}^{∞} e^{x_n t} f_n(x_n) dx_n
           = M_{X_1}(t) M_{X_2}(t) ... M_{X_n}(t).

The proof for the discrete case is similar and is omitted.


Theorem 1.3. For those probability distributions whose moment-generating functions exist, there is a one-to-one correspondence between moment-generating functions and probability distributions. That is,

    M_X(t) = M_Y(t) for all t  ⇒  f_X(x) = f_Y(y).

Hence, M_X(t) completely determines the distribution of X.

Example 1.3. Find the probability distribution of the sum of n independent random variables X_1, X_2, ..., X_n having Poisson distributions with means λ_1, λ_2, ..., λ_n respectively.

Let Y = X_1 + X_2 + ... + X_n. Then,

    M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} e^{λ_i(e^t − 1)} = e^{(e^t − 1) \sum_{i=1}^{n} λ_i},

which is the moment-generating function of a Poisson random variable with mean \sum_{i=1}^{n} λ_i. Therefore, by Theorem 1.3, Y has the Poisson distribution with mean \sum_{i=1}^{n} λ_i.
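This conclusion can also be verified directly for n = 2 (a sketch outside the notes; the rates 1.2 and 0.8 are arbitrary): convolving the two Poisson pmfs reproduces the Poisson pmf with the summed mean.

```python
import math

def pois_pmf(lam, x):
    """Poisson pmf e^{-lam} lam^x / x!."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam1, lam2 = 1.2, 0.8
for y in range(12):
    # P(X1 + X2 = y) as a convolution of the two individual pmfs
    conv = sum(pois_pmf(lam1, x) * pois_pmf(lam2, y - x) for x in range(y + 1))
    direct = pois_pmf(lam1 + lam2, y)
    assert math.isclose(conv, direct, rel_tol=1e-12)
print("P(X1 + X2 = y) matches the Poisson(2.0) pmf for y = 0, ..., 11")
```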

Example 1.4. For positive numbers α and λ, find the moment-generating function of a gamma distribution Gamma(α, λ), whose probability density function is

    f(x) = \frac{λ^α x^{α−1} e^{−λx}}{Γ(α)} for x > 0, and f(x) = 0 otherwise.

The moment-generating function of the gamma distribution is

    M_X(t) = E(e^{tX}) = \int_0^{∞} e^{tx} \frac{λ^α x^{α−1} e^{−λx}}{Γ(α)} dx = \int_0^{∞} \frac{λ^α}{Γ(α)} x^{α−1} e^{−(λ−t)x} dx
           = \frac{λ^α}{(λ−t)^α} \int_0^{∞} \frac{(λ−t)^α}{Γ(α)} x^{α−1} e^{−(λ−t)x} dx
           = \frac{λ^α}{(λ−t)^α}

for t < λ, where

    \int_0^{∞} \frac{(λ−t)^α}{Γ(α)} x^{α−1} e^{−(λ−t)x} dx = 1

is due to the fact that

    \frac{(λ−t)^α}{Γ(α)} x^{α−1} e^{−(λ−t)x} for x > 0

is the probability density function of a Gamma(α, λ − t) distribution.


Example 1.5. Show that the sum of n independent random variables X_1, X_2, ..., X_n, each having a Bernoulli distribution with parameter p, has Bin(n, p), the binomial distribution with parameters n and p.

For each i = 1, 2, ..., n, the moment-generating function of X_i is

    M_{X_i}(t) = e^{0t} P(X_i = 0) + e^{1t} P(X_i = 1) = (1 − p) + e^t p = 1 − p + pe^t,

and hence the moment-generating function of X_1 + X_2 + ... + X_n is

    \prod_{i=1}^{n} M_{X_i}(t) = (1 − p + pe^t)^n.

The moment-generating function of Bin(n, p) is

    \sum_{x=0}^{n} e^{tx} C_x^n p^x (1 − p)^{n−x} = \sum_{x=0}^{n} C_x^n (pe^t)^x (1 − p)^{n−x} = (pe^t + 1 − p)^n.

Therefore, X_1 + X_2 + ... + X_n has a Bin(n, p) distribution by Theorem 1.3.
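The same identity can be checked by direct computation (a sketch, not from the notes; p = 0.3 and n = 6 are illustrative choices): convolving the Bernoulli(p) pmf with itself n times gives the Bin(n, p) pmf, mirroring the MGF identity (1 − p + pe^t)^n.

```python
import math

p, n = 0.3, 6
dist = {0: 1.0}                                   # pmf of an empty sum
for _ in range(n):
    new = {}
    for s, pr in dist.items():
        new[s] = new.get(s, 0.0) + pr * (1 - p)   # add X_i = 0
        new[s + 1] = new.get(s + 1, 0.0) + pr * p # add X_i = 1
    dist = new

for x in range(n + 1):
    binom_pmf = math.comb(n, x) * p**x * (1 - p)**(n - x)
    assert math.isclose(dist[x], binom_pmf, rel_tol=1e-12)
print("sum of 6 Bernoulli(0.3) variables follows Bin(6, 0.3)")
```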

2 Simple Random Sampling


Descriptive statistics are tabular, graphical and numerical methods used to summarise data, whereas inferential statistics are statistical methods that use data from a small group of elements (the sample) to make decisions or predictions about the characteristics of the whole group of elements of interest (the population). A numerical characteristic of a population (for example, the population mean or the population variance) is called a parameter.
Definition 1.2. For a population consisting of data x_1, x_2, ..., x_N, where N is a positive integer, the population mean, denoted by μ, is defined as

    μ = \frac{x_1 + x_2 + ... + x_N}{N} ≡ \frac{1}{N} \sum_{i=1}^{N} x_i,

and N is called the population size.


Definition 1.3. For a sample consisting of data x_1, x_2, ..., x_n, where n is a positive integer, the sample mean, denoted by x̄, is defined as

    x̄ = \frac{x_1 + x_2 + ... + x_n}{n} ≡ \frac{1}{n} \sum_{i=1}^{n} x_i,

and n is called the sample size.


Definition 1.4. For a population, the population variance, denoted by σ^2, is defined as

    σ^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i − μ)^2 = \frac{1}{N} ( \sum_{i=1}^{N} x_i^2 − Nμ^2 ) = \frac{1}{N} \sum_{i=1}^{N} x_i^2 − μ^2.


Definition 1.5. The population standard deviation σ is the non-negative square


root of the population variance.
Definition 1.6. For a sample of size n ≥ 2, the sample variance, denoted by s^2, is defined as

    s^2 = \frac{1}{n − 1} \sum_{i=1}^{n} (x_i − x̄)^2,

which equals

    \frac{1}{n − 1} ( \sum_{i=1}^{n} x_i^2 − n x̄^2 ).

Definition 1.7. The sample standard deviation s is the non-negative square root
of the sample variance.
Remarks: The standard deviation is measured in the same units as the data, which makes it, unlike the variance, directly comparable to the mean.
Example 1.6. Suppose we want to know the mean income of all households in a
city. The population is therefore the incomes of all households in the city. We then
select randomly a household from the city and record its income. In this random
experiment, the sample space is the city and the sampling units are the households in
the city. The income corresponds to the outcome (of a household randomly selected)
of the experiment. Therefore we say that the income of a household randomly
selected is a random variable. For convenience, we define the distribution of the
population as the distribution of this random variable.
Definition 1.8. Simple random sampling is a process in which all samples of the same size are equally likely to be chosen. A simple random sample is a sample selected using a simple random sampling plan.
Example 1.7. To conduct random sampling, assign a number to each element of the
chosen population (or use already given numbers). Then, randomly select numbers
using a random number table or a software package.
Remarks: In this course, sampling means simple random sampling. Two common sampling schemes are
• sampling with replacement: replacing (putting back) each sampled element before selecting subsequent elements;
• sampling without replacement: not replacing any sampled element before selecting subsequent elements.
Example 1.8. Suppose we have a population: −1, 1, 5 and 11. Then,

    μ = \frac{−1 + 1 + 5 + 11}{4} = 4,

    σ^2 = \frac{(−1 − 4)^2 + (1 − 4)^2 + (5 − 4)^2 + (11 − 4)^2}{4} = 21.

The following table shows all possible samples and their means and variances for sampling without replacement when the sample size is 2.


Sample Sample mean Sample variance


1 {−1, 1} 0 2
2 {−1, 5} 2 18
3 {−1, 11} 5 72
4 {1, −1} 0 2
5 {1, 5} 3 8
6 {1, 11} 6 50
7 {5, −1} 2 18
8 {5, 1} 3 8
9 {5, 11} 8 18
10 {11, −1} 5 72
11 {11, 1} 6 50
12 {11, 5} 8 18
Average 4 28

Below are some details of the calculations. For sample 1,

    sample mean = \frac{−1 + 1}{2} = 0,

    sample variance = \frac{(−1 − 0)^2 + (1 − 0)^2}{2 − 1} = 2.

For sample 2,

    sample mean = \frac{−1 + 5}{2} = 2,

    sample variance = \frac{(−1 − 2)^2 + (5 − 2)^2}{2 − 1} = 18.
The next table shows all possible samples and their means and variances for
sampling with replacement when the sample size is 2.


Sample Sample mean Sample variance


1 {−1, −1} −1 0
2 {−1, 1} 0 2
3 {−1, 5} 2 18
4 {−1, 11} 5 72
5 {1, −1} 0 2
6 {1, 1} 1 0
7 {1, 5} 3 8
8 {1, 11} 6 50
9 {5, −1} 2 18
10 {5, 1} 3 8
11 {5, 5} 5 0
12 {5, 11} 8 18
13 {11, −1} 5 72
14 {11, 1} 6 50
15 {11, 5} 8 18
16 {11, 11} 11 0
Average 4 21
Remarks:
• For a particular population and a fixed sample size n, the sample mean depends on the sample, which is drawn randomly. Therefore the sample mean may be treated as a random variable. When the sample mean is treated as a random variable, it is denoted by X̄. The value of the mean of a particular sample is denoted by x̄, which is only one of the realizations of (possible values taken on by) X̄. For the same reason, the sample variance and the sample standard deviation are denoted respectively by S^2 and S when treated as random variables, and the values of the sample variance and the sample standard deviation of a particular sample are denoted respectively by s^2 and s.
• Sampling without replacement is the procedure used most often. If the population size is very large, the results of sampling with and without replacement will differ very little.
• From now on, unless otherwise specified, all theoretical results in this course are based on simple random sampling with replacement, and we will assume in numerical examples that population sizes are very large, so that these theoretical results can be applied even without mentioning whether the sampling is carried out with or without replacement.

3 Sampling Distributions of Statistics


Definition 1.9. A random sample of size n taken from a distribution (or population, respectively) is a set of n independent random variables X_1, X_2, ..., X_n, each having that distribution (or the same distribution as the population, respectively).


Remarks: This definition corresponds to sampling with replacement.

Definition 1.10. A statistic is a function of X_1, X_2, ..., X_n that does not depend on any unknown parameter.
Example 1.9. The sample mean

    X̄ = \frac{1}{n} \sum_{i=1}^{n} X_i

is an example of a statistic. The sample variance

    S^2 = \frac{1}{n − 1} \sum_{i=1}^{n} (X_i − X̄)^2 = \frac{1}{n − 1} ( \sum_{i=1}^{n} X_i^2 − n X̄^2 )

is another example of a statistic.


Theorem 1.4. If X_1, X_2, ..., X_n constitute a random sample from a population with mean μ and variance σ^2, then

    E(X̄) = μ,    Var(X̄) = \frac{σ^2}{n},    and    E(S^2) = σ^2.

Proof.

    E(X̄) = E( \frac{1}{n} \sum_{i=1}^{n} X_i ) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} nμ = μ.

    Var(X̄) = Var( \frac{1}{n} \sum_{i=1}^{n} X_i ) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) = \frac{nσ^2}{n^2} = \frac{σ^2}{n}.

By symmetry, for i = 1, 2, ..., n,

    E[(X_i − X̄)^2] = E[(X_1 − X̄)^2] = Var(X_1 − X̄)    (since E(X_1 − X̄) = 0)
    = Var( X_1 − \frac{X_1 + X_2 + ... + X_n}{n} )
    = Var( \frac{(n−1)X_1}{n} − \sum_{i=2}^{n} \frac{X_i}{n} )
    = \frac{(n−1)^2}{n^2} σ^2 + (n−1) \frac{σ^2}{n^2}
    = \frac{n−1}{n} σ^2.

Therefore,

    E(S^2) = E[ \frac{1}{n−1} \sum_{i=1}^{n} (X_i − X̄)^2 ] = \frac{1}{n−1} \sum_{i=1}^{n} E[(X_i − X̄)^2]
           = \frac{1}{n−1} · n · \frac{n−1}{n} σ^2
           = σ^2.


Remarks: For convenience, E(X̄) and Var(X̄) may be written as μ_X̄ and σ_X̄^2, respectively. σ_X̄, the square root of σ_X̄^2, is called the standard error of the sample mean.
Example 1.10. Recall Example 1.8:

    E(X̄) = 4 = μ for sampling with or without replacement,
    E(S^2) = 28 ≠ σ^2 for sampling without replacement,
    E(S^2) = 21 = σ^2 for sampling with replacement.

The variance of the sample mean is

    Var(X̄) = \frac{(−1 − 4)^2 + (0 − 4)^2 + (2 − 4)^2 + ... + (8 − 4)^2 + (11 − 4)^2}{16} = 10.5 = \frac{21}{2} = \frac{σ^2}{n}

for sampling with replacement.
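The with-replacement numbers in Examples 1.8 and 1.10 can be reproduced by brute-force enumeration (a sketch, not part of the notes):

```python
from itertools import product
from statistics import mean

# Enumeration check: for sampling with replacement from {-1, 1, 5, 11}
# with n = 2, the 16 equally likely ordered samples give
# E(mean) = 4 = population mean, E(S^2) = 21 = population variance,
# and Var(mean) = 10.5 = 21/2.

population = [-1, 1, 5, 11]
samples = list(product(population, repeat=2))        # all 16 ordered samples
sample_means = [(a + b) / 2 for a, b in samples]
sample_vars = [(a - m) ** 2 + (b - m) ** 2           # divisor n - 1 = 1
               for (a, b), m in zip(samples, sample_means)]

mu = mean(sample_means)
e_s2 = mean(sample_vars)
var_mean = mean((m - mu) ** 2 for m in sample_means)
print(mu, e_s2, var_mean)   # → 4.0 21.0 10.5
```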
Theorem 1.5. If a population has N(μ, σ^2), then the mean X̄ of a sample of size n drawn from the population will have N(μ, σ^2/n), that is,

    \frac{X̄ − μ}{σ/\sqrt{n}} ∼ N(0, 1^2).

Proof. The moment-generating function of N(μ, σ^2) is

    E(e^{tX}) = \int_{−∞}^{∞} e^{tx} \frac{1}{\sqrt{2π}σ} e^{−(x−μ)^2/(2σ^2)} dx
              = \int_{−∞}^{∞} \frac{1}{\sqrt{2π}σ} exp( \frac{−2σ^2 tx + x^2 − 2μx + μ^2}{−2σ^2} ) dx
              = exp( \frac{−2μσ^2 t − σ^4 t^2}{−2σ^2} ) × \int_{−∞}^{∞} \frac{1}{\sqrt{2π}σ} exp( \frac{(x − μ − σ^2 t)^2}{−2σ^2} ) dx
              = exp( μt + \frac{1}{2} σ^2 t^2 )

because

    \frac{1}{\sqrt{2π}σ} exp( \frac{(x − μ − σ^2 t)^2}{−2σ^2} )

is the probability density function of N(μ + σ^2 t, σ^2).

Let {X_1, X_2, ..., X_n} be a random sample drawn from the normal population. Then the moment-generating function of X̄ = \frac{1}{n} \sum_{i=1}^{n} X_i is (by Theorem 1.1 and Theorem 1.2)

    M_X̄(t) = M_{\sum_{i=1}^{n} X_i}( \frac{t}{n} ) = [ M_{X_1}( \frac{t}{n} ) ]^n = [ exp( μ \frac{t}{n} + \frac{1}{2} σ^2 ( \frac{t}{n} )^2 ) ]^n
            = exp( μt + \frac{1}{2} \frac{σ^2}{n} t^2 ),

which is the moment-generating function of N(μ, σ^2/n). Therefore (by Theorem 1.3) X̄ has N(μ, σ^2/n).
Example 1.11. Suppose a population has N(μ, 20^2). Let n be the size of a sample.

Figure 1: Probability density functions of the sample mean for different sample sizes.

Example 1.12. A soft-drink vending machine is set so that the amount of drink dispensed is a normal random variable with a mean of 200 millilitres and a standard deviation of 15 millilitres. What is the probability that the average amount dispensed in a random sample of size 36 is at least 204 millilitres?

By Theorem 1.5, the sample mean X̄ has N(200, 15^2/36), that is, N(200, 2.5^2).

    P(X̄ ≥ 204) = P( \frac{X̄ − 200}{2.5} ≥ \frac{204 − 200}{2.5} ) = P(Z ≥ 1.6)
               = P(Z > 0) − P(0 < Z < 1.6) = 0.5 − 0.4452
               = 0.0548.

Remarks: When n = 1, X has N(200, 15^2) and

    P(X ≥ 204) = P( \frac{X − 200}{15} ≥ \frac{204 − 200}{15} ) = P(Z ≥ 0.27)
               = P(Z > 0) − P(0 < Z < 0.27) = 0.5 − 0.1064
               = 0.3936.
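The table value 0.4452 can be bypassed by computing the standard normal cdf directly from the error function (a sketch using only the standard library). The small discrepancy in the second probability comes from the table computation rounding z = 4/15 up to 0.27.

```python
import math

# Standard normal cdf via the error function: Φ(z) = (1 + erf(z/√2))/2.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_36 = 1.0 - phi((204 - 200) / 2.5)   # sample of size 36: 1 - Φ(1.6)
p_1 = 1.0 - phi((204 - 200) / 15)     # single serving (n = 1): 1 - Φ(4/15)
print(round(p_36, 4), round(p_1, 4))  # → 0.0548 0.3949
```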
Theorem 1.6. Suppose X_1, X_2, ..., X_n are independent random variables, each having N(0, 1). Then, \sum_{i=1}^{n} X_i^2 has Gamma(n/2, 1/2), or χ^2(n).

Proof. The moment-generating function of X_1^2 is

    E(e^{tX_1^2}) = \int_{−∞}^{∞} e^{tx^2} \frac{1}{\sqrt{2π}} e^{−x^2/2} dx = \int_{−∞}^{∞} \frac{1}{\sqrt{2π}} exp( −\frac{(1 − 2t)x^2}{2} ) dx
                  = \frac{1}{\sqrt{1 − 2t}} \int_{−∞}^{∞} \frac{1}{\sqrt{2π(1 − 2t)^{−1}}} exp( −\frac{x^2}{2(1 − 2t)^{−1}} ) dx
                  = \frac{1}{\sqrt{1 − 2t}}

for t < 1/2 because

    \frac{1}{\sqrt{2π(1 − 2t)^{−1}}} exp( −\frac{x^2}{2(1 − 2t)^{−1}} )

is the probability density function of N(0, (1 − 2t)^{−1}).

By Theorem 1.2, the moment-generating function of \sum_{i=1}^{n} X_i^2 is

    ( \frac{1}{\sqrt{1 − 2t}} )^n = \frac{1}{(1 − 2t)^{n/2}} = \frac{(1/2)^{n/2}}{(1/2 − t)^{n/2}} for t < 1/2,

which is the moment-generating function of Gamma(n/2, 1/2) (refer to Example 1.4). Therefore, \sum_{i=1}^{n} X_i^2 has Gamma(n/2, 1/2) by Theorem 1.3.

Gamma(n/2, 1/2) is also called the chi-squared distribution with n degrees of freedom (denoted by χ^2(n) or χ^2_n); its mean is n and its variance is 2n.
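A small simulation can illustrate Theorem 1.6 and the stated moments (a sketch; the seed, n = 4, and the replication count are arbitrary choices):

```python
import random

# Sums of n squared N(0, 1) draws should have mean near n = 4 and
# variance near 2n = 8, as stated for the chi-squared distribution.

random.seed(1)
n, reps = 4, 50_000
sums = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(reps)]
emp_mean = sum(sums) / reps
emp_var = sum((s - emp_mean) ** 2 for s in sums) / reps
print(emp_mean, emp_var)   # close to 4 and 8
```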
Example 1.13. Figure 2 shows the probability density functions of some χ^2(df).

Figure 2: Probability density functions of χ^2(df).

Theorem 1.7. Suppose there are two independent random variables X and Y, with X having N(0, 1) and Y having χ^2(n). Then the following gives a probability density function of T = \frac{X}{\sqrt{Y/n}}:

    f_T(t) = \frac{Γ(\frac{n+1}{2})}{\sqrt{πn} Γ(\frac{n}{2})} ( 1 + \frac{t^2}{n} )^{−\frac{n+1}{2}},

for −∞ < t < ∞, where Γ(y) = \int_0^{∞} u^{y−1} e^{−u} du for y > 0 is the gamma function.

Remarks: We say that \frac{X}{\sqrt{Y/n}} has the Student's t-distribution with n degrees of freedom (denoted by t(n) or t_n).

Proof. Exercise question 16.

Example 1.14. Figure 3 shows the probability density functions of N(0, 1) and some t(n).

Figure 3: Probability density functions of N(0, 1) and t(n).

Theorem 1.8. Suppose there are two independent random variables U and V, where U has χ^2(m) and V has χ^2(n). Then the following gives a probability density function of W = \frac{U/m}{V/n}:

    f_W(w) = \frac{Γ(\frac{m+n}{2})}{Γ(\frac{m}{2}) Γ(\frac{n}{2})} ( \frac{m}{n} )^{\frac{m}{2}} w^{\frac{m}{2}−1} ( 1 + \frac{m}{n} w )^{−\frac{m+n}{2}} for w > 0, and f_W(w) = 0 otherwise.

Remarks:
• We say that \frac{U/m}{V/n} has the F-distribution with m numerator degrees of freedom and n denominator degrees of freedom (denoted by F(m, n) or F_{m,n}).
• If T has t(n), then T^2 has F(1, n).

Proof. Exercise question 18.

Example 1.15. Figure 4 shows the probability density functions of some F(m, n).

Lemma 1. Suppose X and Y are independent, X has χ2 (m) and (X + Y ) has


χ2 (n) with n > m, where m and n are positive integers. Then Y has χ2 (n − m).


Figure 4: Probability density functions of some F (m, n).

Proof. Assume that M_Y(t) exists. We make use of the following moment-generating functions:

    M_X(t) = \frac{(1/2)^{m/2}}{(1/2 − t)^{m/2}} for t < 1/2,

    M_{X+Y}(t) = \frac{(1/2)^{n/2}}{(1/2 − t)^{n/2}} for t < 1/2.

By Theorem 1.2, M_{X+Y}(t) = M_X(t) M_Y(t), so

    M_Y(t) = \frac{M_{X+Y}(t)}{M_X(t)} = \frac{(1/2)^{n/2}}{(1/2 − t)^{n/2}} / \frac{(1/2)^{m/2}}{(1/2 − t)^{m/2}} = \frac{(1/2)^{(n−m)/2}}{(1/2 − t)^{(n−m)/2}}

for t < 1/2. Therefore, Y has χ^2(n − m).


Lemma 2. Suppose x_1, x_2, ..., x_n and c are real numbers and x̄ = \sum_{i=1}^{n} x_i / n. Then,

    \sum_{i=1}^{n} (x_i − c)^2 = \sum_{i=1}^{n} (x_i − x̄)^2 + n(x̄ − c)^2.

Proof.

    \sum_{i=1}^{n} (x_i − c)^2 = \sum_{i=1}^{n} [(x_i − x̄) + (x̄ − c)]^2
    = \sum_{i=1}^{n} (x_i − x̄)^2 + 2(x̄ − c) \sum_{i=1}^{n} (x_i − x̄) + \sum_{i=1}^{n} (x̄ − c)^2
    = \sum_{i=1}^{n} (x_i − x̄)^2 + 0 + n(x̄ − c)^2.


Theorem 1.9. Suppose X_1, X_2, ..., X_n constitute a random sample from a population having N(μ, σ^2). Let X̄ and S^2 be the sample mean and the sample variance respectively. Then,

1. X̄ and S^2 are independent;

2. \frac{(n − 1)S^2}{σ^2} has χ^2(n − 1);

3. \frac{X̄ − μ}{S/\sqrt{n}} has t(n − 1).

Proof. 1. Exercise question 15 (a special case only).

2. By Lemma 2,

    \sum_{i=1}^{n} (X_i − μ)^2 = \sum_{i=1}^{n} (X_i − X̄)^2 + n(X̄ − μ)^2 = (n − 1)S^2 + n(X̄ − μ)^2.

Then,

    \sum_{i=1}^{n} ( \frac{X_i − μ}{σ} )^2 = \frac{(n − 1)S^2}{σ^2} + ( \frac{X̄ − μ}{σ/\sqrt{n}} )^2.

Since \frac{X_i − μ}{σ} has N(0, 1) for i = 1, 2, ..., n and \frac{X̄ − μ}{σ/\sqrt{n}} has N(0, 1), we know that \sum_{i=1}^{n} ( \frac{X_i − μ}{σ} )^2 has χ^2(n) and ( \frac{X̄ − μ}{σ/\sqrt{n}} )^2 has χ^2(1) by Theorem 1.6. Also, ( \frac{X̄ − μ}{σ/\sqrt{n}} )^2 and \frac{(n − 1)S^2}{σ^2} are independent by the first result. Therefore, \frac{(n − 1)S^2}{σ^2} has χ^2(n − 1) by Lemma 1.

3. Consider

    \frac{X̄ − μ}{S/\sqrt{n}} = \frac{(X̄ − μ)/(σ/\sqrt{n})}{S/σ} = \frac{(X̄ − μ)/(σ/\sqrt{n})}{\sqrt{ \frac{(n−1)S^2/σ^2}{n−1} }},

where \frac{X̄ − μ}{σ/\sqrt{n}} and \frac{(n − 1)S^2}{σ^2} are independent by the first result, \frac{X̄ − μ}{σ/\sqrt{n}} has N(0, 1) (by Theorem 1.5) and \frac{(n − 1)S^2}{σ^2} has χ^2(n − 1) (by (2)). Therefore, \frac{X̄ − μ}{S/\sqrt{n}} has t(n − 1) by Theorem 1.7.
S/ n

Theorem 1.10. Suppose X_1, X_2, ..., X_{n_1} constitute a random sample from a population having N(μ_1, σ_1^2), Y_1, Y_2, ..., Y_{n_2} constitute a random sample from a population having N(μ_2, σ_2^2), and the (n_1 + n_2) random variables X_1, ..., X_{n_1}, Y_1, ..., Y_{n_2} are independent. Let S_1^2 be the sample variance of the first sample and S_2^2 be that of the second sample. Then, \frac{S_1^2/σ_1^2}{S_2^2/σ_2^2} has F(n_1 − 1, n_2 − 1).

Proof. By Theorem 1.9, \frac{(n_1 − 1)S_1^2}{σ_1^2} has χ^2(n_1 − 1) and \frac{(n_2 − 1)S_2^2}{σ_2^2} has χ^2(n_2 − 1). Then, by Theorem 1.8,

    \frac{S_1^2/σ_1^2}{S_2^2/σ_2^2} = \frac{ [(n_1 − 1)S_1^2/σ_1^2] / (n_1 − 1) }{ [(n_2 − 1)S_2^2/σ_2^2] / (n_2 − 1) }

has F(n_1 − 1, n_2 − 1).


Lemma 3 (Convergence in distribution / weak convergence). If

1. M_{X_n}(t), the moment-generating function of X_n, exists for n = 1, 2, ...,

2. lim_{n→∞} M_{X_n}(t) exists and equals the moment-generating function of a random variable Y,

then

    lim_{n→∞} F_{X_n}(x) = F_Y(x) for all x at which F_Y(x) is continuous,

where F_{X_n}(x) is the distribution function of X_n, n = 1, 2, ..., and F_Y(x) is the distribution function of Y.

Theorem 1.11 (Central limit theorem). Suppose there is a population with mean μ and variance σ^2 > 0. Let X̄ be the mean of a random sample of size n drawn from the population. Then, for every real number x, P( \frac{X̄ − μ}{σ/\sqrt{n}} ≤ x ) (that is, the distribution function of \frac{X̄ − μ}{σ/\sqrt{n}}) tends to the distribution function of N(0, 1) as n → ∞.

Remarks: How large should n be before one can say that \frac{X̄ − μ}{σ/\sqrt{n}} is approximately N(0, 1)? The criterion varies from case to case. Some books say that a sample of size 30 is large enough, but this is actually not appropriate in many cases.
Proof. Let Y_i = \frac{X_i − μ}{σ}; then E(Y_i) = 0 and Var(Y_i) = 1, and suppose the moment-generating function M_{Y_i}(t) exists. A Taylor series expansion of M_{Y_i}(t) around t = 0 gives

    M_{Y_i}(t) = M_{Y_i}(0) + t M'_{Y_i}(0) + \frac{t^2}{2} M''_{Y_i}(ϵ), for some 0 ≤ ϵ ≤ t.

Let

    Z_n = \frac{\sqrt{n}(X̄ − μ)}{σ} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Y_i;

then the moment-generating function of Z_n is given by

    M_{Z_n}(t) = \prod_{i=1}^{n} M_{Y_i}( \frac{t}{\sqrt{n}} ) = [ M_{Y_i}( \frac{t}{\sqrt{n}} ) ]^n = [ M_{Y_i}(0) + \frac{t}{\sqrt{n}} M'_{Y_i}(0) + \frac{(t/\sqrt{n})^2}{2} M''_{Y_i}(ϵ) ]^n
               = [ 1 + \frac{t}{\sqrt{n}} E(Y_i) + \frac{t^2}{2n} M''_{Y_i}(ϵ) ]^n
               = [ 1 + \frac{t^2}{2n} M''_{Y_i}(ϵ) ]^n.

As n → ∞, ϵ → 0 and M''_{Y_i}(ϵ) → M''_{Y_i}(0) = E(Y_i^2) = 1. Hence,

    lim_{n→∞} M_{Z_n}(t) = lim_{n→∞} [ 1 + \frac{t^2}{2n} M''_{Y_i}(ϵ) ]^n = lim_{n→∞} ( 1 + \frac{t^2}{2n} )^n
                         = exp( \frac{t^2}{2} ) = exp( 0 × t + \frac{1}{2} × 1 × t^2 ),

which is the moment-generating function of a standard normal random variable. In other words, Z_n converges in distribution to Z, where Z ∼ N(0, 1).
Example 1.16. Consider a Bernoulli random variable X such that

    P(X = 1) = p and P(X = 0) = 1 − p.

Then μ = p and σ^2 = p(1 − p). Let {X_1, X_2, ..., X_n} be the random sample of size n drawn from the population of X. It is suggested that np and n(1 − p) should both be greater than 5 in order to have the following result:

    \frac{X̄ − μ}{σ/\sqrt{n}} = \frac{X̄ − p}{\sqrt{p(1 − p)/n}} = \frac{nX̄ − np}{\sqrt{np(1 − p)}}

has approximately N(0, 1). Note that now nX̄ = X_1 + X_2 + ... + X_n has the binomial distribution with parameters n and p. Therefore we may also say that the normal distribution can be used as an approximation to the binomial distribution when np and n(1 − p) are both greater than 5.
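A quick check of this rule of thumb (a sketch, not from the notes; n = 100, p = 0.5, and the interval 45..55 are illustrative, with np = n(1 − p) = 50 > 5): the normal approximation, with a continuity correction of 0.5, lands close to the exact binomial probability.

```python
import math

n, p = 100, 0.5
# Exact Bin(100, 0.5) probability of 45 <= nX̄ <= 55
exact = sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(45, 56))

# Normal approximation N(np, np(1-p)) with continuity correction
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
approx = phi((55.5 - mu) / sigma) - phi((44.5 - mu) / sigma)

print(exact, approx)   # both are about 0.729
```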
Lemma 4 (Chebyshev's theorem, Chebyshev's inequality). For a random variable X with mean μ and variance σ^2,

    P(|X − μ| ≥ c) ≤ \frac{σ^2}{c^2} for any c > 0.

Proof. Suppose X is a continuous random variable and its probability density function is f(x). Then,

    σ^2 = E[(X − μ)^2] = \int_{−∞}^{∞} (x − μ)^2 f(x) dx
        = \int_{−∞}^{μ−c} (x − μ)^2 f(x) dx + \int_{μ−c}^{μ+c} (x − μ)^2 f(x) dx + \int_{μ+c}^{∞} (x − μ)^2 f(x) dx
        ≥ c^2 \int_{−∞}^{μ−c} f(x) dx + 0 + c^2 \int_{μ+c}^{∞} f(x) dx
        = c^2 P(|X − μ| ≥ c).

The proof for the discrete case is similar and is omitted.

Theorem 1.12 (Weak law of large numbers). Let X̄ be the mean of a random sample of size n from a population with mean μ and variance σ^2. Then,

    lim_{n→∞} P(|X̄ − μ| ≥ c) = 0 for any c > 0.

Proof. For any c > 0, we have, from Lemma 4, that

    P(|X̄ − μ| ≥ c) ≤ \frac{1}{c^2} Var(X̄) = \frac{σ^2}{nc^2}.

Hence,

    lim_{n→∞} P(|X̄ − μ| ≥ c) = 0 for any c > 0.

Remarks:

• We may state this result as "X̄ converges in probability to μ".

• If X_n converges in distribution to X, it means that the behaviour of X_n gets closer and closer to that of X. It does not guarantee that the observed value of X_n will often be close to the observed value of X.

• If X_n converges in probability to X, it means that the observed value of X_n is very likely to be arbitrarily close to the observed value of X.

• If X_n converges in probability to X, then X_n converges in distribution to X, but not the other way around.

Example 1.17. Consider a Bernoulli random variable X such that

    P(X = 1) = p and P(X = 0) = 1 − p.

Then μ = p, and therefore X̄ converges in probability to p, since

    lim_{n→∞} P(|X̄ − p| ≥ c) ≤ lim_{n→∞} \frac{1}{c^2} \frac{p(1 − p)}{n} = 0 for any c > 0.
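This convergence can be tabulated exactly (a sketch; p = 0.5 and c = 0.1 are illustrative choices): since nX̄ has Bin(n, p), the probability P(|X̄ − p| ≥ c) can be computed exactly and compared with the Chebyshev bound p(1 − p)/(nc^2) used in the argument.

```python
import math

p, c = 0.5, 0.1
for n in (25, 100, 400):
    # Exact tail probability via the Bin(n, p) pmf of nX̄
    exact = sum(math.comb(n, x) * p**x * (1 - p)**(n - x)
                for x in range(n + 1) if abs(x / n - p) >= c)
    bound = p * (1 - p) / (n * c**2)
    assert exact <= bound          # Chebyshev's inequality holds
    print(n, exact, bound)         # both shrink as n grows
```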

4 Order Statistics

Consider a random sample {X_1, X_2, ..., X_n} from a population having a probability density function. Note that

    P(X_i = X_j) = 0 for i ≠ j, i, j = 1, 2, ..., n.

Order statistics are important in non-parametric inference, which was developed to deal with violations of the normality assumption. Suppose that we arrange in ascending order the values of X_1, X_2, ..., X_n in a random sample of size n from a population with distribution function F(x); we let

    X_(1) < X_(2) < ... < X_(r) < ... < X_(n)

denote the order statistics of this sample. Here X_(r) is called the r-th order statistic from smallest to largest for r = 1, 2, ..., n, with X_(1) = min{X_1, X_2, ..., X_n} and X_(n) = max{X_1, X_2, ..., X_n}.

Example 1.18. Suppose {X1 , X2 , . . . , Xn } is a random sample from a population


with distribution function F (x). Find the distribution functions of

X(1) = min{X1 , X2 , . . . , Xn } and X(n) = max{X1 , X2 , . . . , Xn }.

Let U = min{X1 , X2 , . . . , Xn } and V = max{X1 , X2 , . . . , Xn }.

P (U ≤ u) = 1 − P (U > u) = 1 − P (X1 > u, X2 > u, . . . , Xn > u)


= 1 − P (X1 > u) P (X2 > u) · · · P (Xn > u)
= 1 − [1 − F (u)]n .
P (V ≤ v) = P (X1 ≤ v, X2 ≤ v, . . . , Xn ≤ v)
= P (X1 ≤ v) P (X2 ≤ v) · · · P (Xn ≤ v)
= [F (v)]n .
Example 1.19. Suppose that X_1, X_2 are i.i.d. Uni(0, 1); then the order statistic X_(2) is given by

    X_(2) = X_2 if X_1 ≤ X_2, and X_(2) = X_1 if X_1 > X_2.

Find the probability density function of X_(2).

The distribution function of Y = X_(2) is

    G(y) = P(X_(2) ≤ y)
         = P(X_2 ≤ y and X_1 ≤ X_2) + P(X_1 ≤ y and X_1 > X_2)
         = P(X_1 ≤ X_2 ≤ y) + P(X_2 < X_1 ≤ y)
         = \int_0^y \int_0^{x_2} 1 dx_1 dx_2 + \int_0^y \int_0^{x_1} 1 dx_2 dx_1
         = \int_0^y x_2 dx_2 + \int_0^y x_1 dx_1
         = y^2 for 0 < y < 1.

Hence, the probability density function of X_(2) is obtained as

    g(y) = G'(y) = 2y, for 0 < y < 1.
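A short simulation agrees with G(y) = y^2 (a sketch; the seed and replication count are arbitrary):

```python
import random

# Empirical cdf of X_(2) = max(X_1, X_2) for two independent Uni(0, 1)
# draws, compared with the theoretical G(y) = y^2 from Example 1.19.

random.seed(0)
reps = 200_000
maxima = [max(random.random(), random.random()) for _ in range(reps)]
for y in (0.25, 0.5, 0.75):
    emp = sum(m <= y for m in maxima) / reps
    print(y, emp, y**2)   # empirical cdf vs theoretical y^2
```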


Theorem 1.13. Suppose that {X_1, X_2, ..., X_n} is a random sample of size n from a continuous distribution with distribution function F(x) and probability density function f(x) for a < x < b. Let X_(1) < X_(2) < ... < X_(n) be the order statistics of the sample. Then for k = 1, 2, ..., n, the probability density function of the k-th order statistic X_(k) is given by

    g_k(y) = \frac{n!}{(k − 1)!(n − k)!} [F(y)]^{k−1} [1 − F(y)]^{n−k} f(y), for a < y < b.

In particular, the probability density function of X_(n) is

    g_n(y) = n[F(y)]^{n−1} f(y), for a < y < b;

and the probability density function of X_(1) is

    g_1(y) = n[1 − F(y)]^{n−1} f(y), for a < y < b.

Proof. According to the multinomial distribution, we have

    g_k(y) = lim_{h→0} \frac{P(y < X_(k) < y + h)}{h}
           = lim_{h→0} \frac{1}{h} \frac{n!}{(k − 1)!1!(n − k)!} [F(y)]^{k−1} [F(y + h) − F(y)] [1 − F(y)]^{n−k}
           = \frac{n!}{(k − 1)!1!(n − k)!} [F(y)]^{k−1} lim_{h→0} \frac{F(y + h) − F(y)}{h} [1 − F(y)]^{n−k}
           = \frac{n!}{(k − 1)!(n − k)!} [F(y)]^{k−1} [1 − F(y)]^{n−k} f(y).

Remarks: Using similar arguments as in Theorem 1.13, the joint probability density function of X_(i) and X_(j) (i < j) can be obtained as

    g_{i,j}(x, y) = \frac{n!}{(i − 1)!(j − i − 1)!(n − j)!} [F(x)]^{i−1} [F(y) − F(x)]^{j−i−1} [1 − F(y)]^{n−j} f(x) f(y),

where x ≤ y, and the joint probability density function of (X_(1), X_(2), ..., X_(n)) is

    g_{1,2,...,n}(x_1, x_2, ..., x_n) = n! f(x_1) f(x_2) ... f(x_n) if x_1 ≤ x_2 ≤ ... ≤ x_n, and 0 otherwise.
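As a numerical sanity check of Theorem 1.13 (a sketch; the Uni(0, 1) population with n = 5, k = 3 is an illustrative choice, giving F(y) = y and f(y) = 1), the order-statistic density g_3(y) = 30 y^2 (1 − y)^2 should integrate to 1:

```python
import math

n, k = 5, 3
coef = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))

def g(y):
    """g_k(y) for a Uni(0, 1) population: F(y) = y, f(y) = 1 on (0, 1)."""
    return coef * y**(k - 1) * (1 - y)**(n - k)

steps = 100_000                  # midpoint rule on (0, 1)
total = sum(g((i + 0.5) / steps) for i in range(steps)) / steps
print(coef, total)               # → 30.0 and approximately 1.0
```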

The University of Hong Kong
Department of Statistics and Actuarial Science
STAT2602B Probability and Statistics II
Semester 2 2023/2024
Topic 1 Summary

1 Moment-generating Functions

(a) M_X(t) = E(e^{tX});
    M_X(t) = \sum_x e^{tx} f(x) if X is discrete, and M_X(t) = \int_{−∞}^{∞} e^{tx} f(x) dx if X is continuous.

(b) Properties

    (i) M_{aX+b}(t) = e^{bt} M_X(at)

    (ii) E(X^r) = \frac{d^r}{dt^r} M_X(t) |_{t=0}

    (iii) M_X(t) = M_Y(t) ⇔ f_X(x) = f_Y(y)

    (iv) If the X_i's are independent for i = 1, ..., n, then M_{\sum_{i=1}^{n} X_i}(t) = \prod_{i=1}^{n} M_{X_i}(t)

2 Sampling Distribution of the Sample Mean

(a) Any population distribution with finite mean μ and variance σ^2:

    E(X̄) = μ,    Var(X̄) = \frac{σ^2}{n}.

    By the central limit theorem,

    \frac{X̄ − μ}{σ/\sqrt{n}} ∼ N(0, 1^2) approximately.

(b) The normal population with known variance σ^2:

    X̄ ∼ N( μ, \frac{σ^2}{n} ) ⇔ \frac{X̄ − μ}{σ/\sqrt{n}} ∼ N(0, 1^2).

(c) The normal population with unknown variance σ^2:

    \frac{X̄ − μ}{S/\sqrt{n}} ∼ t_{n−1}.


3 Sampling Distribution of the Sample Variance

(a) The normal population with unknown variance σ^2:

    \frac{n − 1}{σ^2} S^2 ∼ χ^2_{n−1}.

(b) Two normal populations with unknown variances σ_1^2 and σ_2^2:

    \frac{S_1^2/σ_1^2}{S_2^2/σ_2^2} = \frac{σ_2^2 S_1^2}{σ_1^2 S_2^2} ∼ F_{n_1−1, n_2−1}.

The University of Hong Kong
Department of Statistics and Actuarial Science
STAT2602B Probability and Statistics II
Semester 2 2023/2024
Topic 1 Exercise

1. Two random variables X and Y are independent and have probabilities given
by

P (X = −1) = 1/4, P (X = 0) = 1/2, P (X = 2) = 1/4,


P (Y = 3) = 1/3, P (Y = 4) = 2/3.

Let Z = X + Y .

(a) Find the probability distribution of Z.


(b) Find the moment-generating functions of X, Y and Z.
(c) Compare the products of the moment-generating functions of X and Y ,
and the moment-generating function of Z.

2. The moments of X are given by

    E(X^r) = 0.8, r = 1, 2, ... .

(a) Find the moment-generating function of X.


(b) What is the probability distribution of X?

3. Let M_X(t) be the moment-generating function of the random variable X and let R_X(t) = ln M_X(t). Show that R'_X(0) = μ and R''_X(0) = σ^2. Use these results to find the mean and the variance of a random variable X having the moment-generating function

    M_X(t) = e^{4(e^t − 1)}.

4. Let X_1, X_2, ..., X_n be independent random variables. Show that

    M_{\sum_{i=1}^{n} X_i}(t) = \prod_{i=1}^{n} M_{X_i}(t).

5. Given the moment-generating functions of two discrete random variables X and Y:

    M_X(t) = 0.05e^t + 0.15e^{2t} + 0.30e^{3t} + 0.30e^{4t} + 0.15e^{5t} + 0.05e^{6t},

    M_Y(t) = 0.05e^t + 0.20e^{2t} + 0.15e^{3t} + 0.45e^{4t} + 0.10e^{5t} + 0.05e^{6t}.

(a) Find the probability distributions of X and Y .


(b) Obtain the mean, the variance, and the skewness of X and Y .

6. Given the probability distributions of two discrete random variables, X and
   Y.

       x     −3    −2    −1     0     1     2     3
       f(x)  0.06  0.09  0.10  0.50  0.10  0.09  0.06

       y     −3    −2    −1     0     1     2     3
       f(y)  0.04  0.11  0.20  0.30  0.20  0.11  0.04

Obtain the mean, the variance, and the kurtosis of each of X and Y .

7. A random sample of size n = 100 is taken from a population with the mean
µ = 75 and the variance σ² = 256. Based on the central limit theorem, with
what probability can we assert that the value we obtain for X̄ will fall between
67 and 83?
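A sketch of the computation using only the standard library (the numbers come from the exercise): σ/√n = √(256/100) = 1.6, so 67 and 83 lie five standard errors either side of µ = 75.

```python
import math
from statistics import NormalDist

mu, sigma2, n = 75, 256, 100
se = math.sqrt(sigma2 / n)        # σ/√n = 1.6
z = NormalDist()                  # standard normal
prob = z.cdf((83 - mu) / se) - z.cdf((67 - mu) / se)
print(prob)   # essentially 1, since both bounds are 5 standard errors from µ
```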

8. A random sample of size n = 225 is to be taken from the exponential popula-


tion with λ = 0.25. Based on the central limit theorem, what is the probability
that the mean of the sample will exceed 4.5?

9. Independent random samples of size 400 are taken from each of two populations
   having equal means and the standard deviations σ₁ = 20 and σ₂ = 30. Using
   the central limit theorem, what can we assert with a probability of at least
   0.99 about the value we will get for X̄₁ − X̄₂?

10. The actual proportion of families in a certain city who own, rather than rent,
their home is 0.70. If 84 families in this city are interviewed at random and
their responses to the question of whether they own their home are looked
upon as values of independent random variables having identical Bernoulli
distributions with the parameter p = 0.70, with what probability can we assert
that the value we obtain for the sample proportion will fall between 0.64 and
0.76, using the central limit theorem?
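A sketch of the computation using only the standard library: the standard error of the sample proportion is √(0.70 · 0.30/84) = 0.05, so 0.64 and 0.76 lie 1.2 standard errors either side of p = 0.70.

```python
import math
from statistics import NormalDist

p, n = 0.70, 84
se = math.sqrt(p * (1 - p) / n)   # exactly 0.05 here
z = NormalDist()
prob = z.cdf((0.76 - p) / se) - z.cdf((0.64 - p) / se)
print(round(prob, 4))   # about 0.77
```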

11. Suppose that Z follows the standard normal distribution. Find the probability
density function of Y = Z 2 by the transformation technique.

12. If X has the standard normal distribution, use the distribution function tech-
nique to find the probability density function of Z = X 2 .
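Both exercises lead to the same conclusion, Z² ∼ χ²₁. One consequence that is easy to check by simulation (an illustration, not a proof; the seed and sample count are arbitrary) is P(Z² ≤ 1) = P(−1 ≤ Z ≤ 1) ≈ 0.6827.

```python
import random
from statistics import NormalDist

# Monte Carlo sketch: if Z² ∼ χ²₁ then P(Z² ≤ 1) = P(−1 ≤ Z ≤ 1).
random.seed(3)
reps = 50000
hits = sum(1 for _ in range(reps) if random.gauss(0.0, 1.0) ** 2 <= 1.0)
frac = hits / reps
target = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)   # about 0.6827
print(round(frac, 3), round(target, 3))
```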

13. Let X₁, X₂, . . . , Xₙ be independent random variables, where each Xᵢ follows
    the normal distribution Xᵢ ∼ N(µᵢ, σᵢ²) for i = 1, 2, . . . , n. Consider
    Y = X₁ + X₂ + . . . + Xₙ.

    (a) Show that Y has the normal distribution, where

            Y ∼ N( ∑_{i=1}^{n} µᵢ , ∑_{i=1}^{n} σᵢ² ).


    (b) Show that the mean Y/n has the normal distribution, where

            Y/n ∼ N( (1/n)∑_{i=1}^{n} µᵢ , (1/n²)∑_{i=1}^{n} σᵢ² ).

    (c) Further assume that X₁, X₂, . . . , Xₙ are identically distributed, that
        is, Xᵢ ∼ N(µ, σ²) for i = 1, 2, . . . , n. Using (a) and (b), show that
        Y and Y/n have the normal distributions where Y ∼ N(nµ, nσ²) and
        Y/n ∼ N(µ, σ²/n).
14. Suppose that each of the independent random variables Z₁, Z₂, . . . , Zₙ follows
    the standard normal distribution, where Zᵢ ∼ N(0, 1²) for i = 1, 2, . . . , n.
    Show that Y = Z₁² + Z₂² + . . . + Zₙ² has the chi-squared distribution with
    degrees of freedom equal to n. That is, Y ∼ χ²ₙ.
15. If X₁ and X₂ are independent random variables both having the standard
    normal distribution, show that

    (a) the joint probability density function of X₁ and X̄ is

            f(x₁, x̄) = (1/π) e^{−x̄²} e^{−(x₁−x̄)²}

        for −∞ < x₁ < ∞ and −∞ < x̄ < ∞;

    (b) the joint probability density function of U = X₁ − X̄ and X̄ is

            f(u, x̄) = (2/π) e^{−(x̄²+u²)}

        for u > 0 and −∞ < x̄ < ∞;

    (c) the sample variance is S² = 2(X₁ − X̄)² = 2U²;

    (d) the joint probability density function of X̄ and S² is

            f(x̄, s²) = (1/√(2π)) (s²)^{−1/2} e^{−s²/2} × (1/√π) e^{−x̄²}

        for s² > 0 and −∞ < x̄ < ∞.
16. Suppose that Y and Z are independent random variables, that Y follows the
    chi-squared distribution with n degrees of freedom, and that Z follows the
    standard normal distribution. Show that the distribution of

        X = Z/√(Y/n)

    is the Student's t-distribution, where the probability density function of X
    is given by

        f_X(x) = Γ((n+1)/2) / (√(πn) Γ(n/2)) · (1 + x²/n)^{−(n+1)/2},
            for −∞ < x < ∞.


    Hints: The probability density functions of Y and Z are, respectively,

        f_Y(y) = 1/(Γ(n/2) 2^{n/2}) · y^{n/2−1} e^{−y/2},  for y > 0;

        f_Z(z) = (1/√(2π)) e^{−z²/2},  for −∞ < z < ∞.

    The gamma function given below is also involved:

        Γ(t) = ∫₀^∞ x^{t−1} e^{−x} dx.

17. The claim that the variance of a normal population is σ² = 4 is to be rejected
    if the variance of a random sample of size 9 exceeds 7.7535. What is the
    probability that this claim will be rejected even though σ² = 4?
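A sketch of the computation (it relies on result 3(a) of the summary): under σ² = 4 the statistic (n − 1)S²/σ² = 8S²/4 follows χ²₈, and for an even number of degrees of freedom the upper-tail probability has the closed form P(χ²_{2k} > x) = e^{−x/2} ∑_{j=0}^{k−1} (x/2)^j/j!.

```python
import math

def chi2_sf_even(x, df):
    # Upper-tail probability of the chi-squared distribution, valid only
    # for even df: e^{-x/2} * sum over j < df/2 of (x/2)^j / j!
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))

n, sigma2, cutoff = 9, 4.0, 7.7535
x = (n - 1) * cutoff / sigma2          # observed chi-squared value, 15.507
prob = chi2_sf_even(x, n - 1)
print(round(prob, 4))   # about 0.05: the claim is wrongly rejected 5% of the time
```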

18. If Y and Z are independent random variables having the chi-squared
    distributions with m and n degrees of freedom, respectively, show that the
    random variable

        X = (Y/m)/(Z/n)

    has the F distribution with degrees of freedom m and n.

    Hints: The probability density function of Y is

        f_Y(y) = 1/(Γ(m/2) 2^{m/2}) · y^{m/2−1} e^{−y/2},  for y > 0.

19. If S₁² and S₂² are the variances of independent random samples of sizes n₁
    and n₂ from the normal populations with the variances σ₁² and σ₂²,
    respectively, show that

        X = (S₁²/σ₁²)/(S₂²/σ₂²)

    has the F distribution with (n₁ − 1) and (n₂ − 1) degrees of freedom.

20. If S₁ and S₂ are the standard deviations of independent random samples of
    sizes n₁ = 61 and n₂ = 31 from normal populations with σ₁² = 12 and
    σ₂² = 18, find P(S₁²/S₂² > 1.16).
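By Exercise 19, (S₁²/σ₁²)/(S₂²/σ₂²) ∼ F₆₀,₃₀ here, so P(S₁²/S₂² > 1.16) = P(F₆₀,₃₀ > 1.16 · 18/12) = P(F₆₀,₃₀ > 1.74). A Monte Carlo sketch (illustrative; an exact answer would use F tables, and the seed and replication count are arbitrary) estimates this tail by simulating the F ratio from sums of squared normals.

```python
import random

# Simulate F(60, 30) as a ratio of scaled chi-squared variables, each
# built as a mean of squared standard normals (see Exercises 14 and 18).
random.seed(4)
m, n, reps = 60, 30, 20000
threshold = 1.16 * 18.0 / 12.0          # 1.74
count = 0
for _ in range(reps):
    y = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(m)) / m
    z = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n
    if y / z > threshold:
        count += 1
frac = count / reps
print(frac)   # close to 0.05, since 1.74 is near the upper 5% point of F(60, 30)
```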

