Final 2023 Summer IntroStat Sol


Introduction of Statistics (STA1002-12)

Final Examination
Date : Monday, July 17th, 2023 at 09:00~ 11:00 am

- solution -

1. We want to know the average weight (kg) of products produced by Company A. Answer the
following questions.

(a) A random sample of 50 products was weighed, and the average weight was found to be 10 kg
with a standard deviation of 3 kg. Estimate the 90% confidence interval for the population mean
based on these results.

Let the weight of the product be a random variable 𝑋. Since the sample size (n=50) is large,
according to the Central Limit Theorem, the sample mean 𝑋̅ approximately follows a normal
distribution. That is, (𝑋̅ − 𝜇)/(3/√50) ~ 𝑁(0,1). Therefore, the 90% confidence interval for the
population mean is as follows.

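\[
(\hat{\theta}_L, \hat{\theta}_U) = \left(\bar{X} - z_{0.05}\frac{3}{\sqrt{50}},\ \bar{X} + z_{0.05}\frac{3}{\sqrt{50}}\right) = \left(10 - 1.645 \times \frac{3}{\sqrt{50}},\ 10 + 1.645 \times \frac{3}{\sqrt{50}}\right) = (9.302,\ 10.698)
\]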

(b) The number of samples was increased to 200 products chosen randomly, and their weights
were measured. The results showed an average of 10 kg with a standard deviation of 3 kg.
Estimate the 95% confidence interval for the population mean based on these results.

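\[
(\hat{\theta}_L, \hat{\theta}_U) = \left(\bar{X} - z_{0.025}\frac{3}{\sqrt{200}},\ \bar{X} + z_{0.025}\frac{3}{\sqrt{200}}\right) = \left(10 - 1.96 \times \frac{3}{\sqrt{200}},\ 10 + 1.96 \times \frac{3}{\sqrt{200}}\right) = (9.584,\ 10.416)
\]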
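The two intervals in (a) and (b) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `z_interval` is an illustrative assumption, not part of the exam solution.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sd, n, confidence):
    """Large-sample (normal-approximation) confidence interval for a mean."""
    alpha = 1 - confidence
    # Upper alpha/2 quantile of N(0,1), e.g. about 1.645 for 90%
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


ci_a = z_interval(10, 3, 50, 0.90)    # part (a)
ci_b = z_interval(10, 3, 200, 0.95)   # part (b)
print([round(v, 3) for v in ci_a])
print([round(v, 3) for v in ci_b])
```

The printed bounds should agree with the intervals above to three decimals, since the quantiles 1.645 and 1.96 are rounded versions of the values `inv_cdf` returns.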

(c) Which is the better confidence interval, (a) or (b)? Explain your reasoning.

Confidence interval (b) is better because it has a higher confidence level (95% versus 90%) and a
narrower interval width (0.832 versus 1.396). This is due to the larger sample size (n=200).
(d) Company A claims that the weight of its products is 13 kg. To prove this, 10 products were
randomly selected and weighed.

11 12 8 11 15 9 5 7 12 10

Conduct a hypothesis test about the company's claim at a 5% significance level. (Assuming that
the weight of the product follows a normal distribution.)

Step1) State the hypothesis

𝐻0 : 𝜇 = 13 𝑉𝑆 𝐻𝑎 : 𝜇 ≠ 13 (two-sided test)

Step2) State the significance level 𝛼

𝛼 = 0.05

Step3) Calculate the test statistic

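Under 𝐻0,
\[
t = \frac{\bar{X} - 13}{\sqrt{8.222/10}} \sim t(9),
\]
and the test statistic is
\[
t_0 = \frac{10 - 13}{\sqrt{8.222/10}} = -3.309,
\]
where 10 is the sample mean and 8.222 = 74/9 is the sample variance of the 10 weights.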

Step4) Calculate the P-value and State your conclusion

𝑃(𝑡 < −3.309) ∗ 2 = 0.009,

The calculated p-value is 0.009, which is less than the significance level of 5%. Therefore, we
reject the null hypothesis. At a significance level of 5%, we cannot support the company's claim
that the average weight of its products is 13 kg.
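As a rough check of the one-sample t-test in (d), the sketch below recomputes the sample variance, the test statistic, and the two-sided p-value. The helper `t_cdf` is an assumption introduced here: it approximates the Student t distribution function by Simpson's rule, because the Python standard library has no t distribution.

```python
import math
from statistics import mean, variance


def t_cdf(x, df, steps=2000):
    """Approximate Student t CDF by Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = abs(x) / steps                       # steps must be even for Simpson
    area = pdf(0) + pdf(abs(x))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                            # integral of the density on [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area


weights = [11, 12, 8, 11, 15, 9, 5, 7, 12, 10]
n = len(weights)
s2 = variance(weights)                       # sample variance, n - 1 denominator
t0 = (mean(weights) - 13) / math.sqrt(s2 / n)
p_value = 2 * t_cdf(-abs(t0), n - 1)         # two-sided p-value
print(round(s2, 3), round(t0, 3), round(p_value, 3))
```

A statistics package would normally supply the t distribution directly; the numerical integration here is only meant to keep the check self-contained.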

(e) Company A has developed a new device that can reduce the weight of its products. The 10
samples from (d) were re-weighed using the new device.

Before 11 12 8 11 15 9 5 7 12 10
After 12 8 9 9 10 5 6 5 10 5
Diff 1 -4 1 -2 -5 -4 1 -2 -2 -5

Conduct a hypothesis test to demonstrate the effectiveness of the new device at a 5% significance
level. (Assuming that the weight of the product follows a normal distribution.)

Step1) State the hypothesis

𝐻0 : 𝜇𝐷 = 0 𝑉𝑆 𝐻𝑎 : 𝜇𝐷 < 0 where 𝐷 = After − Before

Step2) State the significance level 𝛼

𝛼 = 0.05

Step3) Calculate the test statistic


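Under 𝐻0,
\[
t = \frac{\bar{D} - 0}{\sqrt{5.878/10}} \sim t(9),
\]
and the test statistic is
\[
t_0 = \frac{-2.1 - 0}{\sqrt{5.878/10}} = -2.739,
\]
where −2.1 is the sample mean and 5.878 is the sample variance of the 10 differences.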

Step4) Calculate the P-value and State your conclusion

𝑃(𝑡𝑑𝑓=9 ≤ −2.739) = 0.0114

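Since the p-value 0.0114 is less than 𝛼 = 0.05, we can reject 𝐻0. At a significance level of 5%, the
data support the claim that the new device reduces the weight of the products.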
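
The summary values used in Step 3 can be recovered by hand from the Diff row. A short worked calculation, where \(d_i\) denotes the i-th difference (notation introduced here for illustration):

\[
\sum d_i = 1 - 4 + 1 - 2 - 5 - 4 + 1 - 2 - 2 - 5 = -21, \qquad \bar{D} = \frac{-21}{10} = -2.1
\]
\[
\sum d_i^2 = 1 + 16 + 1 + 4 + 25 + 16 + 1 + 4 + 4 + 25 = 97, \qquad s_D^2 = \frac{97 - 10(-2.1)^2}{9} = \frac{52.9}{9} = 5.878
\]
\[
t_0 = \frac{-2.1}{\sqrt{5.878/10}} = \frac{-2.1}{0.7667} = -2.739
\]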

2. Company B is a competitor of Company A and produces the same products. Answer the
following questions.

(a) Random samples of 8 products each were taken from Company A and Company B.

No A B
1 9 9
2 10 8
3 15 12
4 13 9
5 15 9
6 13 11
7 6 15
8 12 10
(Assuming that the weight of the products follows a normal distribution and that the variances of
the two populations are different.)

Estimate the 99% confidence interval for the difference in population means between the two
companies.

Let the weights of the products from Companies A and B be random variables 𝑋1 and 𝑋2,
respectively. Assuming unequal population variances, the 99% confidence interval for
the difference in the two population means is as follows.

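\[
(\hat{\theta}_L, \hat{\theta}_U) = \left((\bar{X}_1 - \bar{X}_2) - t_{\nu,\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}},\ (\bar{X}_1 - \bar{X}_2) + t_{\nu,\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}\right)
\]
\[
= \left((11.625 - 10.375) - t_{13,0.005}\sqrt{\frac{9.696}{8} + \frac{5.125}{8}},\ (11.625 - 10.375) + t_{13,0.005}\sqrt{\frac{9.696}{8} + \frac{5.125}{8}}\right) = (-2.850,\ 5.350)
\]
where
\[
\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = 12.78 \approx 13
\]
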
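The sample variances, the degrees of freedom, and the interval in (a) can be recomputed from the raw data. This is a sketch only: the critical value `T_CRIT = 3.012` is an assumed t-table entry for t(13) at upper-tail probability 0.005, hard-coded because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, variance

a = [9, 10, 15, 13, 15, 13, 6, 12]   # Company A
b = [9, 8, 12, 9, 9, 11, 15, 10]     # Company B

# s^2 / n for each sample (variance() uses the n - 1 denominator)
va = variance(a) / len(a)
vb = variance(b) / len(b)
se = sqrt(va + vb)

# Degrees of freedom for unequal variances, as in the formula above
df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))

T_CRIT = 3.012                       # assumed table value of t_{13, 0.005}
diff = mean(a) - mean(b)
ci = (diff - T_CRIT * se, diff + T_CRIT * se)
print(round(df, 2), [round(v, 3) for v in ci])
```

Because the interval contains 0, it does not rule out equal population means at the 99% confidence level.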
(b) Use the data from (a) to test at a 5% significance level whether there is a difference in the
average weights of products from Company A and Company B.

Step1) State the hypothesis

Let 𝜇𝐴 and 𝜇𝐵 be the population mean weights of the products from Company A and Company B, respectively.

𝐻0 : 𝜇𝐴 = 𝜇𝐵 𝑉𝑆 𝐻𝑎 : 𝜇𝐴 ≠ 𝜇𝐵

Step2) State the significance level 𝛼

𝛼 = 0.05

Step3) Calculate the test statistic

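Under 𝐻0,
\[
t = \frac{(\bar{X}_A - \bar{X}_B) - 0}{\sqrt{\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}}} = \frac{(\bar{X}_A - \bar{X}_B) - 0}{\sqrt{\frac{9.696}{8} + \frac{5.125}{8}}} \sim t(\nu = 13),
\]
and the test statistic is
\[
t_0 = \frac{11.625 - 10.375}{\sqrt{\frac{9.696}{8} + \frac{5.125}{8}}} = 0.9184
\]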

Step4) Calculate the P-value and State your conclusion

𝑃(𝑡𝑑𝑓=13 ≥ 0.9184) + 𝑃(𝑡𝑑𝑓=13 ≤ −0.9184) = 𝑃(𝑡𝑑𝑓=13 ≤ −0.9184) ∗ 2 = 0.3751

The calculated p-value is 0.3751, which is greater than the significance level of 5%. Therefore, we
cannot reject the null hypothesis. At a significance level of 5%, there is no statistically
significant difference in the average weights of products from Company A and Company B.

(c) Compare the results of (a) and (b).

(d) Company B claims that their products are lighter on average than those of Company A. Test
this hypothesis at a 1% significance level using the data from (a).

Step1) State the hypothesis

𝐻0 : 𝜇𝐴 = 𝜇𝐵 𝑉𝑆 𝐻𝑎 : 𝜇𝐴 > 𝜇𝐵

Step2) State the significance level 𝛼

𝛼 = 0.01

Step3) Calculate the test statistic

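Under 𝐻0,
\[
t = \frac{(\bar{X}_A - \bar{X}_B) - 0}{\sqrt{\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}}} = \frac{(\bar{X}_A - \bar{X}_B) - 0}{\sqrt{\frac{9.696}{8} + \frac{5.125}{8}}} \sim t(\nu = 13),
\]
and the test statistic is
\[
t_0 = \frac{11.625 - 10.375}{\sqrt{\frac{9.696}{8} + \frac{5.125}{8}}} = 0.9184
\]
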
Step4) Calculate the P-value and State your conclusion

𝑃(𝑡𝑑𝑓=13 ≥ 0.9184) = 0.1876

The calculated p-value is 0.1876, which is greater than the significance level of 1%. Therefore, we
cannot reject the null hypothesis. At a significance level of 1%, the data do not support
Company B's claim that its products are lighter on average than those of Company A.
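The p-values in (b) and (d) come from the same test statistic and differ only in which tail is counted. The sketch below recomputes both from the data in (a). The helper `t_cdf` is an assumption introduced for illustration: it approximates the Student t distribution function by Simpson's rule, and `DF = 13` is the rounded degrees of freedom from (a).

```python
import math
from statistics import mean, variance


def t_cdf(x, df, steps=2000):
    """Approximate Student t CDF by Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = abs(x) / steps                       # steps must be even for Simpson
    area = pdf(0) + pdf(abs(x))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


a = [9, 10, 15, 13, 15, 13, 6, 12]           # Company A
b = [9, 8, 12, 9, 9, 11, 15, 10]             # Company B
DF = 13                                      # rounded degrees of freedom from (a)

se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
t0 = (mean(a) - mean(b)) / se

p_upper = 1 - t_cdf(t0, DF)                  # (d): H_a is mu_A > mu_B
p_two_sided = 2 * t_cdf(-abs(t0), DF)        # (b): H_a is mu_A != mu_B
print(round(t0, 4), round(p_two_sided, 4), round(p_upper, 4))
```

Both p-values exceed their significance levels (5% in (b), 1% in (d)), so neither test rejects the null hypothesis.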

(e) Company A claims that its products have a lower defect rate than those of Company B. To test
this, 100 products from Company A and 80 from Company B were examined, finding 15 defective
products from Company A and 8 from Company B. Test Company A's claim at a 1% significance
level.

Step1) State the hypothesis

Let 𝑝𝐴 and 𝑝𝐵 be the population defect rates of the products from Company A and Company B, respectively.

𝐻0 : 𝑝𝐴 − 𝑝𝐵 = 0 𝑉𝑆 𝐻𝑎 : 𝑝𝐴 − 𝑝𝐵 < 0

Step2) State the significance level 𝛼

𝛼 = 0.01

Step3) Calculate the test statistic

Since 𝑝̅𝐴 = 15/100 = 0.15, 𝑝̅𝐵 = 8/80 = 0.10, 𝑛𝐴 = 100 , 𝑛𝐵 = 80 , the large-sample conditions 𝑛𝐴 𝑝̅𝐴 > 5
and 𝑛𝐴 (1 − 𝑝̅𝐴 ) > 5, 𝑛𝐵 𝑝̅𝐵 > 5 and 𝑛𝐵 (1 − 𝑝̅𝐵 ) > 5 are all satisfied. Thus, the CLT and
LLN can be applied to the distribution of the sample proportions.

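Thus, under 𝐻0,
\[
Z = \frac{(\bar{p}_A - \bar{p}_B) - 0}{\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}} = \frac{(\bar{p}_A - \bar{p}_B) - 0}{\sqrt{0.1278(1-0.1278)\left(\frac{1}{100} + \frac{1}{80}\right)}} \sim N(0,1), \quad \text{where } \bar{p} = \frac{15+8}{100+80} = 0.1278,
\]
and the test statistic is
\[
z_0 = \frac{(0.15 - 0.1) - 0}{\sqrt{0.1278(1-0.1278)\left(\frac{1}{100} + \frac{1}{80}\right)}} = 0.9984
\]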

Step4) Calculate the P-value and State your conclusion

𝑃(𝑍 ≤ 0.9984) = 0.841

The calculated p-value is 0.841, which is greater than the significance level of 1%. Therefore, we can
not reject the null hypothesis. At a significance level of 1%, we do not have sufficient evidence to
support Company A's claim that its products have a lower defect rate than those of Company B.
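
The pooled two-proportion z-test in (e) can be reproduced with the standard library alone. This is a minimal sketch; the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

x_a, n_a = 15, 100                   # defectives and sample size, Company A
x_b, n_b = 8, 80                     # defectives and sample size, Company B

p_a, p_b = x_a / n_a, x_b / n_b      # sample defect rates
p_pool = (x_a + x_b) / (n_a + n_b)   # pooled proportion under H0

z0 = (p_a - p_b) / sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
p_value = NormalDist().cdf(z0)       # lower tail, since H_a is p_A - p_B < 0
print(round(p_b, 4), round(z0, 4), round(p_value, 3))
```

The sample defect rate of Company A is higher than that of Company B, so the statistic is positive and the lower-tail p-value is large, which is why the claim cannot be supported.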
