Chapter 6 discusses the concepts of sampling statistics, including definitions of population, sample, and statistic, as well as the sampling distributions of means and variances. It emphasizes the Central Limit Theorem and provides methods for determining probabilities related to sample means and variances. Additionally, the chapter covers the joint distribution of sample mean and variance, including the t-distribution and chi-square distribution.

Chapter 6: Descriptions of Sampling Statistics

Li-Pang Chen, PhD

Department of Statistics, National Chengchi University

©Fall 2024

1 / 33
Outline

1 Motivation

2 Sampling Distribution of Means

3 Sampling Distribution of Variances

4 Joint Distribution of X̄ and S²

2 / 33
6.1 Motivation

Definition (Population)
A population consists of the totality of the observations with which we are
concerned.

Definition (Sample)
A sample is a subset of a population.

⇒ The main purpose in selecting (random) samples is to elicit information
about the unknown population parameters.

3 / 33
6.1 Motivation

Definition (Statistic)
Suppose that {X1 , X2 , · · · , Xn } is a random sample. Any function of the
random variables constituting a random sample is called a statistic.

                                Statistic                              Parameter
Mean                            X̄ = (1/n) Σ_{i=1}^n X_i                μ
Variance                        S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)²    σ²
Proportion (binary X_i)         p̂ = (1/n) Σ_{i=1}^n X_i                p

4 / 33
6.1 Motivation

Based on statistics, say X̄ or S², we would like to find probabilities,
such as P(X̄ < a) or P(S² > b).
To do so, we need to know the distributions of X̄ and S².

5 / 33
6.2 Sampling Distribution of X̄

Recall that X̄ = (1/n) Σ_{i=1}^n X_i, where X_1, …, X_n are independent.
Assume that E(X_i) = μ and var(X_i) = σ² for all i = 1, …, n.
It is easy to show that E(X̄) = μ and var(X̄) = σ²/n.
If the X_i follow a normal distribution, then we know that

X̄ ∼ N(μ, σ²/n).

What is the distribution of X̄ if the X_i do NOT follow a normal
distribution, or the distribution of the X_i is UNKNOWN?
6 / 33
6.2 Sampling Distribution of X̄

Theorem (The Central Limit Theorem (CLT))
If X̄ is the mean of a random sample of size n taken from a population
with mean μ and finite variance σ², then, as n → ∞, the limiting form of
the distribution of X̄ is

X̄ → N(μ, σ²/n).

Moreover, if X̄ is standardized as

Z = (X̄ − μ)/(σ/√n),

then Z → N(0, 1).

7 / 33
6.2 Sampling Distribution of X̄

Suppose X ∼ Bin(n, p), or equivalently, X = X_1 + X_2 + ⋯ + X_n with
X_i ∼ Ber(p).
Let p̂ ≜ X/n denote the proportion.
It is known that E(p̂) = p and var(p̂) = p(1 − p)/n.
By the CLT, we conclude that

p̂ → N(p, p(1 − p)/n),

or equivalently,

Z = (p̂ − p)/√(p(1 − p)/n) → N(0, 1).

8 / 33
6.2 Sampling Distribution of X̄

[Figure: histograms of p̂ for n = 10, n = 100, and n = 10000. As n
increases, the histogram of p̂ looks increasingly like a normal density.]

9 / 33
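The histograms above can be reproduced by simulation. The sketch below is an assumption-laden reconstruction (the slides do not state the true p or the number of replications; p = 0.6 and 10,000 replications are guesses), but it shows the same qualitative CLT behavior: the mean of p̂ stays near p while its spread shrinks like √(p(1−p)/n).

```python
# Simulation sketch of the p-hat histograms (assumed setup: X_i ~ Ber(p)
# with p = 0.6; the figure's exact parameters are not given in the slides).
import numpy as np

rng = np.random.default_rng(42)
p, reps = 0.6, 10_000

def phat_samples(n):
    """Draw `reps` values of p-hat, each from a Bernoulli(p) sample of size n."""
    return rng.binomial(n, p, size=reps) / n

for n in (10, 100, 10_000):
    ph = phat_samples(n)
    # CLT: mean of p-hat ~ p, standard deviation ~ sqrt(p(1-p)/n)
    print(n, round(ph.mean(), 3), round(ph.std(), 4))
```

Plotting each `ph` as a histogram recovers the three panels of the figure.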
6.2 Sampling Distribution of X̄

Remarks:
In practice, the sampling distribution of pb can be approximated by a
normal distribution whenever np ≥ 5 and n(1 − p) ≥ 5.
A general rule of thumb is that one can be confident of the normal
approximation whenever the sample size n is at least 30.

10 / 33
6.2 Sampling Distribution of X̄
Example: An electrical firm manufactures light bulbs that have a length
of life that is approximately normally distributed, with mean equal to 800
hours and a standard deviation of 40 hours. Find the probability that a
random sample of 16 bulbs will have an average life of less than 775 hours.

11 / 33
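A quick numerical check of this example (a scipy sketch; the slides solve it with the standard normal table): since X̄ ∼ N(800, 40²/16), standardize and evaluate Φ(−2.5).

```python
# Light-bulb example: P(X-bar < 775) with mu = 800, sigma = 40, n = 16.
from scipy.stats import norm

se = 40 / 16 ** 0.5            # sigma / sqrt(n) = 10
z = (775 - 800) / se           # standardized value = -2.5
prob = norm.cdf(z)             # Phi(-2.5) ~= 0.0062
print(round(prob, 4))
```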
6.2 Sampling Distribution of X̄
Example: The ideal size of a first-year class at a particular college is 150
students. The college, knowing from past experience that, on the average,
only 30 percent of those accepted for admission will actually attend, uses a
policy of approving the applications of 450 students. Compute the
probability that more than 150 first-year students attend this college.

12 / 33
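This example can be checked numerically with the normal approximation to the binomial (a sketch; the slides do not show the solution, and the continuity correction used here is a standard refinement they may or may not apply): X ∼ Bin(450, 0.3), so μ = 135 and σ = √(450 · 0.3 · 0.7).

```python
# Admissions example: P(X > 150) for X ~ Bin(450, 0.3), via the CLT.
from scipy.stats import binom, norm

n, p = 450, 0.3
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
# Continuity correction: P(X > 150) = P(X >= 151) ~= P(Z > (150.5 - mu)/sigma)
approx = norm.sf((150.5 - mu) / sigma)
exact = binom.sf(150, n, p)    # exact binomial tail, for comparison
print(round(approx, 4), round(exact, 4))
```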
6.3 Sampling Distribution of S²

Recall: S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)².
Why do we have "n − 1" in S²?
⇒ It ensures E(S²) = σ².

13 / 33
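The unbiasedness claim above is easy to verify by simulation (a sketch, using standard normal data and an arbitrary sample size n = 5): averaging many sample variances with the n − 1 divisor recovers σ², while dividing by n undershoots by the factor (n − 1)/n.

```python
# Why divide by n-1: ddof=1 gives an unbiased estimator of sigma^2 = 1,
# while ddof=0 (dividing by n) is biased low by (n-1)/n = 0.8 for n = 5.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 5, 20_000
x = rng.normal(0.0, 1.0, size=(reps, n))   # sigma^2 = 1

s2_unbiased = x.var(axis=1, ddof=1).mean()   # ~= 1.0
s2_biased = x.var(axis=1, ddof=0).mean()     # ~= 0.8
print(round(s2_unbiased, 3), round(s2_biased, 3))
```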
6.3 Sampling Distribution of S²

Suppose that X_1, …, X_n ∼ N(μ, σ²).
Q: What is the distribution of S²?

14 / 33
6.3 Sampling Distribution of S²

Definition
(a) Suppose that X ∼ N(0, 1); then X² follows the chi-square distribution
with parameter (known as the degrees of freedom) equal to one. Denote it
as X² ∼ χ²_1.
(b) Moreover, suppose that X_1, …, X_n ∼ N(0, 1) are independent; then
χ² ≜ Σ_{i=1}^n X_i² follows the chi-square distribution with n degrees of
freedom. Denote it as Σ_{i=1}^n X_i² ∼ χ²_n.
15 / 33
6.3 Sampling Distribution of S²

Based on χ², we are interested in finding the probability P(χ² > x),
where x is a constant.
Method: chi-square table.

From the figure above, we know that P(χ² > χ²_{α,n}) = α.
Thus, finding the probability is equivalent to finding α such that
x ≈ χ²_{α,n}.
16 / 33
6.3 Sampling Distribution of S²
Table for χ²-distribution

17 / 33
6.3 Sampling Distribution of S²
Table for χ²-distribution

18 / 33
6.3 Sampling Distribution of S²
Example: Determine P(χ²_26 > 54.051) and P(χ²_26 < 30), where χ²_26 is a
chi-square random variable with 26 degrees of freedom.

19 / 33
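These two probabilities can be computed directly with scipy instead of the printed table (a sketch; the slides use table lookup, which gives P(χ²_26 > 54.051) ≈ 0.001 since χ²_{0.001,26} ≈ 54.052).

```python
# Chi-square example: both tail probabilities for df = 26.
from scipy.stats import chi2

p_upper = chi2.sf(54.051, df=26)   # P(chi2_26 > 54.051), table gives ~0.001
p_lower = chi2.cdf(30, df=26)      # P(chi2_26 < 30)
print(round(p_upper, 4), round(p_lower, 4))
```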
6.3 Sampling Distribution of S²

We now check the distribution of S²:
Step 0: Assume that X_i ∼ N(μ, σ²).
Step 1: Check that
Σ_{i=1}^n (X_i − μ)²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² + n(X̄ − μ)²/σ².
Step 2: Σ_{i=1}^n (X_i − μ)²/σ² is a sum of n squared standard normals,
so it follows χ²_n.
Step 3: n(X̄ − μ)²/σ² = ((X̄ − μ)/(σ/√n))², the square of a standard
normal, so it follows χ²_1.
Conclusion: (n − 1)S²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² ∼ χ²_{n−1}.

20 / 33
6.3 Sampling Distribution of S²

Based on the distribution of S², we can easily compute the probability
P(S² > a) for some constant a > 0.
Step 1: Transfer S² to the form of a chi-square distribution. That is,
write

P(S² > a) = P((n − 1)S²/σ² > (n − 1)a/σ²)
          = P(χ²_{n−1} > (n − 1)a/σ²).

Step 2: Use the chi-square table to find α such that χ²_{α,n−1} ≈ (n − 1)a/σ².

21 / 33
6.3 Sampling Distribution of S²
Example: The time it takes a central processing unit to process a certain
type of job is normally distributed with mean 20 seconds and standard
deviation 3 seconds. If a sample of 15 such jobs is observed, what is the
probability that the sample variance will exceed 12?

22 / 33
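Applying the two steps above to this example (a scipy check rather than the table lookup the slides use): with n = 15 and σ² = 9, the event S² > 12 becomes χ²_14 > 14 · 12/9 ≈ 18.67.

```python
# CPU example: P(S^2 > 12) via (n-1)S^2/sigma^2 ~ chi2_{n-1}.
from scipy.stats import chi2

n, sigma2, a = 15, 9.0, 12.0
stat = (n - 1) * a / sigma2        # 14 * 12 / 9 ~= 18.67
prob = chi2.sf(stat, df=n - 1)     # P(chi2_14 > 18.67) ~= 0.178
print(round(prob, 4))
```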
6.3 Sampling Distribution of S²
Example: A manufacturer of car batteries guarantees that the batteries
will last, on average, 3 years with a standard deviation of 1 year. Suppose
that the sample size is 21. Find a value d such that the probability that
the sample variance is greater than d is 0.95.

23 / 33
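This example inverts the same identity (a sketch using scipy's inverse CDF in place of the table): P(S² > d) = 0.95 means P(χ²_20 > 20d/σ²) = 0.95, so 20d/σ² is the 5th percentile of χ²_20.

```python
# Battery example: solve P(S^2 > d) = 0.95 with n = 21, sigma^2 = 1.
from scipy.stats import chi2

n, sigma2 = 21, 1.0
d = sigma2 * chi2.ppf(0.05, df=n - 1) / (n - 1)   # table: 10.851/20 ~= 0.543
print(round(d, 4))
```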
6.3 Sampling Distribution of S²
Example: Suppose that a random sample of 10 observations is to be
drawn, where X_1, …, X_10 are independent normally distributed random
variables, each with mean μ and variance σ². Find P(σ² > 0.5319 S²).

24 / 33
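Rewriting the event in chi-square form gives a one-line scipy check (a sketch; the slides solve it by table): σ² > 0.5319 S² holds exactly when (n − 1)S²/σ² < 9/0.5319 ≈ 16.92, and χ²_{0.05,9} ≈ 16.919.

```python
# P(sigma^2 > 0.5319 S^2) with n = 10, via (n-1)S^2/sigma^2 ~ chi2_9.
from scipy.stats import chi2

n, c = 10, 0.5319
prob = chi2.cdf((n - 1) / c, df=n - 1)   # P(chi2_9 < 16.92) ~= 0.95
print(round(prob, 4))
```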
6.4 Joint Distribution of X̄ and S²

Recall: In Section 6.2, we have Z = (X̄ − μ)/(σ/√n) and Z → N(0, 1).
However, σ² is the "population" variance and is usually unknown.
Instead, as discussed in Section 6.3, we know that S² is the sample-based
variance.
Intuitively, we may replace σ² in Z by S², yielding

(X̄ − μ)/(S/√n).    (1)

Q: What is the distribution of (1)?

25 / 33
6.4 Joint Distribution of X̄ and S²

Definition
Let Z ∼ N(0, 1) and Y ∼ χ²_v. If Z and Y are independent, then the new
random variable T ≜ Z/√(Y/v) follows the t-distribution with v degrees of
freedom. Denote T ∼ t_v.

26 / 33
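The definition can be verified by simulation (a sketch; v = 10 is an arbitrary choice, not from the slides): building T = Z/√(Y/v) from independent normal and chi-square draws reproduces the t-distribution's quantiles, e.g. the table value t_{0.05,10} ≈ 1.812.

```python
# Construct T = Z / sqrt(Y/v) and compare with scipy's t distribution.
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(1)
v, reps = 10, 200_000
z = rng.standard_normal(reps)            # Z ~ N(0, 1)
y = rng.chisquare(v, size=reps)          # Y ~ chi2_v, independent of Z
T = z / np.sqrt(y / v)

q_emp = np.quantile(T, 0.95)             # empirical 95th percentile
q_thy = t.ppf(0.95, df=v)                # table value t_{0.05,10} ~= 1.812
print(round(q_emp, 3), round(q_thy, 3))
```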
6.4 Joint Distribution of X̄ and S²

Remarks:
When v → ∞, t_v will be close to N(0, 1).
P(T > t_α) = α.
By symmetry, P(T < −t_α) = P(T > t_α) = α.

27 / 33
6.4 Joint Distribution of X̄ and S²
Table for t-distribution

28 / 33
6.4 Joint Distribution of X̄ and S²
Table for t-distribution

29 / 33
6.4 Joint Distribution of X̄ and S²

Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1).
Y = (n − 1)S²/σ² ∼ χ²_{n−1}.
(1) becomes

(X̄ − μ)/(S/√n) = Z/√(Y/(n − 1)) ∼ t_{n−1}.

30 / 33
6.4 Joint Distribution of X̄ and S²
Example: A chemical engineer claims that the population mean yield of a
certain batch process is 500 grams per milliliter of raw material. To check
this claim he samples 25 batches each month. What is the probability that
the sample mean is greater than 518, provided that the sample standard
deviation is 40? Assume the distribution of yields to be approximately
normal.

31 / 33
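Since σ is unknown and only the sample standard deviation S = 40 is given, this example uses the t-statistic from the previous slide (a scipy check; the slides use the t table): T = (518 − 500)/(40/√25) = 2.25 with 24 degrees of freedom.

```python
# Yield example: P(X-bar > 518) given S = 40, n = 25, mu = 500.
from scipy.stats import t

T = (518 - 500) / (40 / 25 ** 0.5)   # = 2.25
prob = t.sf(T, df=24)                # upper-tail t probability
print(round(prob, 4))
```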
6.4 Joint Distribution of X̄ and S²

Squaring N(0, 1) gives χ²_1; summing v independent χ²_1 variables gives
χ²_v; and as v → ∞, t_v approaches N(0, 1).

Figure 1: Relationship among distributions. Circles represent distributions;
squares are operations.

32 / 33
6.4 Joint Distribution of X̄ and S²

Statistic          Distribution                            Remark
Mean (X̄)           Z = (X̄ − μ)/(σ/√n) ∼ N(0, 1)            σ² is known
Mean (X̄)           T = (X̄ − μ)/(S/√n) ∼ t_{n−1}            σ² is unknown
Variance (S²)      (n − 1)S²/σ² ∼ χ²_{n−1}                 −
Proportion (p̂)     Z = (p̂ − p)/√(p(1 − p)/n) ∼ N(0, 1)     −

33 / 33
