STAT2602B Topic 4 With Exercise Suggested Solution
1 Interval Estimation
A point estimate of a parameter does not provide much information about the
accuracy of the estimate and can sometimes be misleading. It is desirable to
generate a narrow interval that will cover the unknown parameter with a large
probability (confidence).
Remarks:
The interval $[\hat{\theta}_1, \hat{\theta}_2]$ is called a $100(1-\alpha)\%$ confidence interval for $\theta$.
For example, when α = 0.05, the confidence level is 0.95 and we get a 95% confidence
interval. It should be understood that, like point estimates, interval estimates of a
given parameter are not unique. Methods of interval estimation are judged by their
various statistical properties. For instance, one desirable property is to have the
width of a 100(1 − α)% confidence interval as narrow as possible.
STAT2602B TST23/24 Topic 4
Theorem 4.1. If x̄ is the value of the mean of a random sample of size n from a
normal population with known variance σ², a 100(1 − α)% confidence interval for
the population mean µ is
\[
\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right].
\]
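Proof. For a random sample of size n, $\bar{X} \sim N(\mu, \sigma^2/n)$. Then
$Z = \dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1^2)$, so that
\[
\begin{aligned}
1-\alpha &= P\left(-z_{\alpha/2} < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \\
&= P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right),
\end{aligned}
\]
which gives the stated interval once $\bar{X}$ is replaced by its observed value $\bar{x}$.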
Example 4.1. A publishing company has just published a new college textbook.
Before the company decides the price of the book, it wants to know the average price
of all such textbooks in the market. The research department at the company took
a random sample of 36 such textbooks and collected information on their prices.
This information produced a mean price of $48.40 for this sample. It is known that
the standard deviation of the prices of all such textbooks is $4.50. Construct a 90%
confidence interval for the mean price of all such college textbooks assuming that
the underlying population has a normal distribution.
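From the given information, n = 36, x̄ = 48.40 and σ = 4.50. Now 1 − α = 0.9,
which means α = 0.1 and $z_{\alpha/2} = z_{0.05} = 1.645$. Hence, the 90% confidence
interval for the mean price of all such college textbooks is
\[
\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]
= \left[48.4 - 1.645 \times \frac{4.5}{\sqrt{36}},\ 48.4 + 1.645 \times \frac{4.5}{\sqrt{36}}\right]
= [47.17, 49.63].
\]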
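The arithmetic in Example 4.1 can be checked with a short Python sketch. The function name `z_interval` is a hypothetical helper introduced only for illustration, and the critical value comes from the standard library's `statistics.NormalDist` rather than from a printed table, so it is slightly more precise than 1.645.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf):
    """Two-sided confidence interval for a normal mean with known sigma.

    Hypothetical helper for illustration; not part of the course notes.
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Figures from Example 4.1: n = 36, sample mean 48.40, sigma 4.50, 90% level.
lower, upper = z_interval(xbar=48.40, sigma=4.50, n=36, conf=0.90)
print(f"[{lower:.2f}, {upper:.2f}]")
```

Rounded to two decimal places, the result agrees with the interval obtained by hand.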
Remarks: How do we interpret this result? If we repeatedly observe the value of
x̄ and each time construct a 90% confidence interval for µ accordingly, we can
expect that about 90% of these intervals will include µ and 10% will not. Having
observed that x̄ is, say, 48.4, we do not know whether the particular interval
[47.17, 49.63] includes the true value of µ, because µ is unknown; we can only
say that it either includes µ or does not. We cannot say that this particular
interval includes µ with probability 0.9: no probability is involved, since
47.17, 49.63 and µ are all non-random.
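The repeated-sampling interpretation can be illustrated by simulation. The sketch below is a rough check rather than a proof: the true mean `mu = 48.0`, the number of repetitions and the seed are arbitrary assumptions chosen only for illustration, and `coverage` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, conf, reps, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw a fresh sample of size n and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# Assumed true mean 48.0 (hypothetical); sigma and n as in Example 4.1.
rate = coverage(mu=48.0, sigma=4.5, n=36, conf=0.90, reps=5000)
print(rate)  # should be close to 0.90
```

Each simulated interval either contains µ or does not; it is the long-run proportion of intervals containing µ that approaches 0.90.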
Example 4.2. Suppose the bureau of the census and statistics of a city wants to
estimate the mean family annual income µ for all families in the city, where the
family annual income is known to have a normal distribution. It is known that
the standard deviation σ for the family annual income is 60 thousand dollars. How
large a random sample should the bureau select so that it can assert with probabil-
ity 0.99 that the sample mean will differ from µ by no more than 5 thousand dollars?
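Here 1 − α = 0.99, so α = 0.01 and $z_{\alpha/2} = z_{0.005} = 2.576$. We require
\[
z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le 5
\;\Rightarrow\;
n \ge \left(\frac{60\, z_{\alpha/2}}{5}\right)^2 = \left(\frac{60 \times 2.576}{5}\right)^2 \approx 955.6.
\]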
Hence, the sample size should be at least 956. Note that we have to round 955.6
up to the next higher integer. This is always the case when determining the sample
size.
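The sample size calculation above can be sketched in Python as follows. The helper name `sample_size_for_mean` is an assumption made for illustration; `math.ceil` performs the rounding up to the next integer described in the text.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, max_error, prob):
    """Smallest n such that z_{alpha/2} * sigma / sqrt(n) <= max_error.

    Hypothetical helper; assumes a normal population with known sigma.
    """
    z = NormalDist().inv_cdf(1 - (1 - prob) / 2)
    return ceil((z * sigma / max_error) ** 2)


# Figures from Example 4.2: sigma = 60, error at most 5, probability 0.99.
n_required = sample_size_for_mean(sigma=60, max_error=5, prob=0.99)
print(n_required)
```

With these inputs the function returns 956, matching the hand calculation.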
Theorem 4.2. If x̄ and s are the values of the sample mean and the sample standard
deviation of a random sample of size n from a normal population, a 100(1 − α)%
confidence interval for the population mean µ is
\[
\left[\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}\right].
\]
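Proof. Now σ is unknown. Instead of σ, we make use of S and the fact that
$\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ has a t(n − 1) distribution if the population has a
normal distribution. Therefore,
\[
\begin{aligned}
1-\alpha &= P\left(-t_{\alpha/2,n-1} < \frac{\bar{X}-\mu}{S/\sqrt{n}} < t_{\alpha/2,n-1}\right) \\
&= P\left(\bar{X} - t_{\alpha/2,n-1}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2,n-1}\frac{S}{\sqrt{n}}\right),
\end{aligned}
\]
which gives the stated interval once $\bar{X}$ and $S$ are replaced by their observed values $\bar{x}$ and $s$.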
Example 4.3. A paint manufacturer wants to determine the average drying time
of a new brand of interior wall paint. If for 12 test areas of equal size he obtained a
mean drying time of 66.3 minutes and a standard deviation of 8.4 minutes, construct
a 95% confidence interval for the true population mean assuming normality.
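Given n = 12, x̄ = 66.3, s = 8.4, α = 1 − 0.95 = 0.05 and $t_{\alpha/2,n-1} = t_{0.025,11} \approx 2.201$.
The 95% confidence interval for µ is
\[
\left[\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}\right]
= \left[66.3 - 2.201 \times \frac{8.4}{\sqrt{12}},\ 66.3 + 2.201 \times \frac{8.4}{\sqrt{12}}\right]
= [60.96, 71.64].
\]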
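A minimal Python sketch of the same computation is given below. The Python standard library has no t-distribution quantile function, so the critical value 2.201 is passed in by hand from the example; `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Two-sided interval xbar -/+ t_crit * s / sqrt(n).

    t_crit must be supplied by the caller (from a t table or other software).
    """
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Figures from Example 4.3; t_{0.025,11} = 2.201 is taken from the example.
lower, upper = t_interval(xbar=66.3, s=8.4, n=12, t_crit=2.201)
print(f"[{lower:.2f}, {upper:.2f}]")
```

The same helper reproduces Example 4.4 when called with xbar=14.75, s=3.0, n=50 and t_crit=2.010.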
Example 4.4. For a random sample of 50 apprentice geologists, the sample mean
and the sample standard deviation of hourly wage of apprentice geologists employed
by the top 5 oil companies are 14.75 and 3.0, respectively. Construct a 95% confi-
dence interval for the mean hourly wage of apprentice geologists employed by the
top 5 oil companies if the population of hourly wage has a normal distribution.
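Given x̄ = 14.75, s = 3.0, n = 50 and α = 1 − 0.95 = 0.05. Using the Excel
function =T.INV(0.975,49), we obtain $t_{0.025,49} \approx 2.010$. The 95% confidence
interval for µ is
\[
\left[\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}\right]
= \left[14.75 - 2.010 \times \frac{3.0}{\sqrt{50}},\ 14.75 + 2.010 \times \frac{3.0}{\sqrt{50}}\right]
= [13.90, 15.60].
\]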
\[
P\left(W \ge \chi^2_{\alpha,n}\right) = \alpha.
\]
Hence,
\[
\left[\frac{n-1}{\chi^2_{\alpha/2,n-1}}\, s^2,\ \frac{n-1}{\chi^2_{1-\alpha/2,n-1}}\, s^2\right]
\]
is a 100(1 − α)% confidence interval for σ².
Remarks: For a chi-squared random variable W, some values of $\chi^2_{\alpha,n}$ can be found
in Table 4 of the book Modern Fundamental Statistical Tables.
Example 4.5. A machine is set up to fill packages of cookies. A recently taken ran-
dom sample of the weights of 25 packages from the production line gave a variance
of 2.9, where the weights are known to have a normal distribution. Construct a 95%
confidence interval for the standard deviation of the weight of a randomly selected
package from the production line.
Given n = 25, s² = 2.9 and α = 0.05. The 95% confidence interval for the population
variance is
\[
\left[\frac{n-1}{\chi^2_{\alpha/2,n-1}}\, s^2,\ \frac{n-1}{\chi^2_{1-\alpha/2,n-1}}\, s^2\right]
= \left[\frac{25-1}{39.36} \times 2.9,\ \frac{25-1}{12.40} \times 2.9\right]
= [1.768, 5.613].
\]
Remarks: Taking positive square roots, we obtain the 95% confidence interval for
the population standard deviation to be [1.330, 2.369].
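The variance and standard deviation intervals of Example 4.5 can be reproduced with the sketch below. The chi-squared critical values 39.36 and 12.40 are taken from the example, since the Python standard library does not provide chi-squared quantiles; `variance_interval` is a hypothetical helper name.

```python
from math import sqrt


def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Interval [(n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower] for sigma^2.

    chi2_upper is the alpha/2 upper-tail point and chi2_lower the
    1 - alpha/2 upper-tail point, both with n - 1 degrees of freedom.
    """
    lower = (n - 1) * s2 / chi2_upper
    upper = (n - 1) * s2 / chi2_lower
    return lower, upper


# Figures from Example 4.5 (critical values copied from the example).
var_lo, var_hi = variance_interval(s2=2.9, n=25, chi2_upper=39.36, chi2_lower=12.40)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)  # interval for sigma
print(f"variance: [{var_lo:.3f}, {var_hi:.3f}]")
print(f"std dev:  [{sd_lo:.3f}, {sd_hi:.3f}]")
```

Note that the larger critical value produces the lower endpoint, because it appears in the denominator.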
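Given n = 400, p̂ = 136/400 = 0.34 and $z_{0.025} = 1.960$. The approximate 95%
confidence interval for p is
\[
\begin{aligned}
&\left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] \\
&\quad= \left[0.34 - 1.960 \times \sqrt{\frac{0.34 \times (1-0.34)}{400}},\ 0.34 + 1.960 \times \sqrt{\frac{0.34 \times (1-0.34)}{400}}\right] \\
&\quad= [0.2936, 0.3864].
\end{aligned}
\]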
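The approximate interval for a proportion can be sketched in Python as follows. The helper name `proportion_interval` is an assumption for illustration, and the normal critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, conf):
    """Approximate (large-sample) confidence interval for a proportion p."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Figures from the calculation above: 136 successes out of n = 400, 95% level.
lower, upper = proportion_interval(successes=136, n=400, conf=0.95)
print(f"[{lower:.4f}, {upper:.4f}]")
```

Rounded to four decimal places, the endpoints agree with the hand calculation.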
Example 4.7. The reaction of an individual to a stimulus in a psychology exper-
iment may take one of two forms, A or B. If an experimenter wishes to estimate
the probability p that a person will react in manner A, how many people must be
included in the experiment? Assume that the experimenter will be satisfied if the
estimation error is less than 0.04 with probability equal to 0.9. Assume also that he
expects p to lie somewhere in the neighbourhood of 0.6.
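We know that
\[
\begin{aligned}
1-\alpha &\approx P\left(-z_{\alpha/2} < \frac{\hat{p}-p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \\
&= P\left(|\hat{p}-p| < z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right).
\end{aligned}
\]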
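Hence,
\[
z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < 0.04,
\]
or
\[
n > p(1-p)\left(\frac{z_{\alpha/2}}{0.04}\right)^2,
\]
where α = 1 − 0.9 = 0.1, so that $z_{\alpha/2} = z_{0.05} = 1.645$. Since p is unknown, we
would use the guessed value of p = 0.6 provided by the experimenter. Then,
\[
n > 0.6 \times (1-0.6) \times \left(\frac{1.645}{0.04}\right)^2 \approx 405.9.
\]
To be conservative, note that p(1 − p) attains its maximum value at p = 0.5.
Therefore,
\[
n > 0.5 \times (1-0.5) \times \left(\frac{1.645}{0.04}\right)^2 \approx 422.8,
\]
and the final result will be n ≥ 423.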
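The two sample size bounds in Example 4.7 can be compared with a short Python sketch. The function name `sample_size_for_proportion` is hypothetical, and the critical value is computed by `statistics.NormalDist`, so the unrounded figures differ slightly from those obtained with the table value 1.645.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_guess, max_error, prob):
    """Smallest integer n exceeding p(1-p) * (z_{alpha/2} / max_error)^2."""
    z = NormalDist().inv_cdf(1 - (1 - prob) / 2)
    return ceil(p_guess * (1 - p_guess) * (z / max_error) ** 2)


# Using the experimenter's guess p = 0.6.
n_guess = sample_size_for_proportion(p_guess=0.6, max_error=0.04, prob=0.9)
# Using the worst case p = 0.5, where p(1 - p) is largest.
n_conservative = sample_size_for_proportion(p_guess=0.5, max_error=0.04, prob=0.9)
print(n_guess, n_conservative)
```

The worst-case choice p = 0.5 gives the larger sample size, which is sufficient whatever the true value of p.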
The University of Hong Kong
Department of Statistics and Actuarial Science
STAT2602B Probability and Statistics II
Semester 2 2023/2024
Topic 4 Summary
(c) Variance σ².
A 100(1 − α)% confidence interval for σ² is
\[
\left[\frac{n-1}{\chi^2_{\alpha/2,n-1}}\, s^2,\ \frac{n-1}{\chi^2_{1-\alpha/2,n-1}}\, s^2\right].
\]
Topic 4 Exercise
1. A team of efficiency experts intends to use the mean of a random sample of size
n = 150 to estimate the average mechanical aptitude of assembly-line workers
in a large industry. Based on experience, the efficiency experts can assume
that σ = 6.2 for such data, and the sample yields a mean of 12. Construct a 95%
confidence interval for µ, assuming the population has a normal distribution.
25, 28, 26, 29, 32, 22, 24, 26, 33, 30.
The population has a normal distribution but its standard deviation is
unknown. Construct a 90% confidence interval for the population mean.
It is known that the population of packet weights has the normal distribution
with mean µ.
Topic 4 Exercise Suggested Solution
2. We obtain from the sample that x̄ = 27.5, s = 3.535534 and n = 10. A 90%
confidence interval for µ is
\[
\bar{x} \pm t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} = 27.5 \pm 1.833 \times \frac{3.535534}{\sqrt{10}} = [25.45, 29.55].
\]
Look up the t-test Table with the level of significance for two-tailed tests
= 0.10 (or level of significance for one-tailed tests = 0.05) and df = 9. This
gives $t_{\alpha/2,n-1} = t_{0.05,9} = 1.833$.
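As a check on the solution to Question 2, the sample statistics and the interval can be recomputed from the raw data. The sketch below uses `statistics.mean` and `statistics.stdev` (the latter divides by n − 1) and takes the critical value 1.833 from the t table as quoted above.

```python
from math import sqrt
from statistics import mean, stdev

# Data from Question 2 of the exercise.
data = [25, 28, 26, 29, 32, 22, 24, 26, 33, 30]

n = len(data)
xbar = mean(data)   # sample mean
s = stdev(data)     # sample standard deviation (divisor n - 1)
t_crit = 1.833      # t_{0.05,9}, copied from the t table

half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"xbar = {xbar}, s = {s:.6f}")
print(f"[{lower:.2f}, {upper:.2f}]")
```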
Look up the t-test Table with the level of significance for two-tailed tests
= 0.05 (or level of significance for one-tailed tests = 0.025) and df = 13. This
gives $t_{\alpha/2,n-1} = t_{0.025,13} = 2.160$.
4. Given n = 16 and s = 2.2. Look up the χ²-test Table with the level of
significance for two-tailed tests = 0.01 (or level of significance for one-tailed
tests = 0.005) and df = 15. This gives the left quantile
STAT2602B TST23/24 Topic 4 Exercise Suggested Solution