
STAT 2500 Review For Final Exam - (Solutions) - Updated

This document provides solutions to review problems for a statistics final exam. It includes: 1) Calculations of probabilities and z-scores for various normal distributions. 2) Descriptions of properties of the sampling distribution of the mean including how it approaches normality as sample size increases. 3) A calculation of conditional probability using a 2x2 probability table. 4) Construction of 90% and 95% confidence intervals for a population mean where the standard deviation is known, including calculating required sample sizes. 5) Construction of 99% and 90% confidence intervals for a population mean where the standard deviation is unknown, using a t-distribution. 6) Descriptions of how confidence interval

Uploaded by

amaka.iwu10

STAT 2500 Review for Final Exam (Solutions)

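1.
a) $z_1 = \dfrac{1325 - 1400}{95} = -0.7894 \approx -0.79$ → Area 1 = 0.2148
$z_2 = \dfrac{1495 - 1400}{95} = 1.00$ → Area 2 = 0.8413

P(1325 < X < 1495) = 0.8413 – 0.2148 = 0.6265

[Figure: normal curve with the area 0.6265 between X = 1325 (z = –0.79) and X = 1495 (z = 1.00) shaded; center at X = 1400 (z = 0).]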

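b) $z = \dfrac{1485 - 1400}{95} = 0.8947 \approx 0.89$ gives the left cumulative area of 0.8133.

P(X > 1485) = 1 – 0.8133 = 0.1867

[Figure: normal curve with the right-tail area 0.1867 beyond X = 1485 (z = 0.89) shaded; center at X = 1400 (z = 0).]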

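c) $z = \dfrac{1350 - 1400}{95} = -0.52631 \approx -0.53$ gives the left cumulative area of 0.2981.

P(X < 1350) = 0.2981

[Figure: normal curve with the left-tail area 0.2981 below X = 1350 (z = –0.53) shaded; center at X = 1400 (z = 0).]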

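d) $z_1 = \dfrac{1440 - 1400}{95} = 0.4210 \approx 0.42$ → Area 1 = 0.6628
$z_2 = \dfrac{1530 - 1400}{95} = 1.3684 \approx 1.37$ → Area 2 = 0.9147

P(1440 < X < 1530) = 0.9147 – 0.6628 = 0.2519

[Figure: normal curve with the area 0.2519 between X = 1440 (z = 0.42) and X = 1530 (z = 1.37) shaded; center at X = 1400 (z = 0).]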

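e) Sampling distribution. (σ is known and the original population is normal → z-distribution)

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{1460 - 1400}{95/\sqrt{12}} = 2.1878 \approx 2.19$ gives the left cumulative area of 0.9857.

$P(\bar{X} > 1460)$ = 1 – 0.9857 = 0.0143

[Figure: normal curve of the sample mean with the right-tail area 0.0143 beyond $\bar{X}$ = 1460 (z = 2.19) shaded; center at 1400 (z = 0).]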
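
The table lookups in Problem 1 can be cross-checked numerically. The sketch below is one possible check, assuming Python's standard-library `statistics.NormalDist` stands in for the printed z-table; the helper names `z_score` and `left_area` are illustrative and not part of the original solutions. It rounds each z-score to two decimals first, as a table lookup does, so the results should agree with the answers above up to rounding in the fourth decimal place.

```python
from statistics import NormalDist

MU, SIGMA = 1400, 95          # population mean and SD used in Problem 1
Z = NormalDist()              # standard normal distribution (stand-in for the z-table)

def z_score(x, mu=MU, se=SIGMA):
    """Standardize x and round to two decimals, as a printed z-table requires."""
    return round((x - mu) / se, 2)

def left_area(x, mu=MU, se=SIGMA):
    """Left cumulative area looked up at the rounded z-score."""
    return Z.cdf(z_score(x, mu, se))

p_a = left_area(1495) - left_area(1325)     # a) P(1325 < X < 1495)
p_b = 1 - left_area(1485)                   # b) P(X > 1485)
p_c = left_area(1350)                       # c) P(X < 1350)
p_d = left_area(1530) - left_area(1440)     # d) P(1440 < X < 1530)
se_mean = SIGMA / 12 ** 0.5                 # e) standard error for n = 12
p_e = 1 - left_area(1460, se=se_mean)       # e) P(sample mean > 1460)

for label, p in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(f"{label}) {p:.4f}")
```

Part e) reuses the same helper with the standard error σ/√n in place of σ, which is the only change needed when moving from a single observation to a sample mean.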


2.
1) The distribution of sample means ($\bar{X}$) will, as the sample size n increases (usually larger than 30), approach a normal distribution.

2) If the original population is itself normally distributed, then the sample means ($\bar{X}$) will be normally distributed for any sample size, n.

3) The mean ($\mu_{\bar{x}}$) of the distribution of sample means will be the population mean μ.

4) The standard deviation of the distribution of sample means (the standard error) will be $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
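
Properties 3) and 4) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential population (mean 1, standard deviation 1), the sample size of 40, the 5000 repetitions, and the seed are all hypothetical choices made for illustration, not values from the exam.

```python
import random
from statistics import mean, pstdev

random.seed(2500)                  # fixed seed so the run is repeatable
N_SAMPLES, n = 5000, 40            # hypothetical: 5000 samples, each of size 40

# Right-skewed population: exponential with rate 1 (mean = 1, SD = 1).
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(N_SAMPLES)
]

mean_of_means = mean(sample_means)     # should be close to the population mean, 1
sd_of_means = pstdev(sample_means)     # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / n ** 0.5

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
```

Even though the population is skewed, the simulated sample means center on μ and spread by roughly σ/√n, which is what the two properties predict.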

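3. P(pregnant) = $\dfrac{100}{1000}$ = 0.1, where $P$ = pregnant and $\bar{P}$ = not pregnant.

Pregnant or Not      Test Result     Joint Probability
$P$ (0.1)            + (0.98)        $P(P \text{ and } +)$ = 0.098
$P$ (0.1)            − (0.02)        $P(P \text{ and } -)$ = 0.002
$\bar{P}$ (0.9)      + (0.01)        $P(\bar{P} \text{ and } +)$ = 0.009
$\bar{P}$ (0.9)      − (0.99)        $P(\bar{P} \text{ and } -)$ = 0.891
Total                                1.000

$P(P \mid -) = \dfrac{P(P \text{ and } -)}{P(-)} = \dfrac{0.002}{0.002 + 0.891} = 0.0022396 \approx 0.0022$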
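
The tree diagram above amounts to four multiplications followed by one division. A minimal sketch of that arithmetic follows; the variable names `sensitivity` and `specificity` are labels chosen here for the 0.98 and 0.99 branch probabilities and do not appear in the original.

```python
prior_pregnant = 100 / 1000     # P(P)
sensitivity = 0.98              # P(+ | P), assumed name for the 0.98 branch
specificity = 0.99              # P(- | not P), assumed name for the 0.99 branch

# Joint probabilities: multiply along each branch of the tree.
joint = {
    ("P", "+"): prior_pregnant * sensitivity,
    ("P", "-"): prior_pregnant * (1 - sensitivity),
    ("notP", "+"): (1 - prior_pregnant) * (1 - specificity),
    ("notP", "-"): (1 - prior_pregnant) * specificity,
}

# Conditional probability: P(P | -) = P(P and -) / P(-).
p_negative = joint[("P", "-")] + joint[("notP", "-")]
p_pregnant_given_negative = joint[("P", "-")] / p_negative

print(f"P(P | -) = {p_pregnant_given_negative:.4f}")
```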

4. σ = 18.15 is known and n = 36 ≥ 30 → z-table, $\bar{x}$ = 118.25

a) Sample mean, $\bar{x}$ = 118.25 hours

b) 90% confidence level → Use (1 – 0.90)/2 = 0.05 in the body of the z-table (left cumulative area between 0.0495 & 0.0505) to find z = –1.645. Now, calculate E (Margin of Error) by substituting z = 1.645, σ = 18.15, and n = 36 into the formula:

$E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = (1.645)\,\dfrac{18.15}{\sqrt{36}} = 4.9761 \approx 4.98$

$\bar{x} - E = 118.25 - 4.98 = 113.27$ and $\bar{x} + E = 118.25 + 4.98 = 123.23$

113.27 hours < μ < 123.23 hours

[Figure: standard normal curve with central area 0.90 between z = –1.645 and z = 1.645, and area 0.05 in each tail.]


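c) 95% confidence level → Use (1 – 0.95)/2 = 0.025 in the body of the z-table to find z = –1.96. Now, calculate E (Margin of Error) by substituting z = 1.96, σ = 18.15, and n = 36 into the formula:

$E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} = (1.96)\,\dfrac{18.15}{\sqrt{36}} = 5.929 \approx 5.93$

$\bar{x} - E = 118.25 - 5.93 = 112.32$ and $\bar{x} + E = 118.25 + 5.93 = 124.18$

112.32 hours < μ < 124.18 hours

[Figure: standard normal curve with central area 0.95 between z = –1.96 and z = 1.96, and area 0.025 in each tail.]
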
We are 95% confident that the population average recovery time from a common cold
is between 112.32 hours and 124.18 hours.
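
The 90% and 95% intervals differ only in the critical value, so both can come from one helper. The sketch below assumes `statistics.NormalDist.inv_cdf` as a replacement for reading the z-table backwards; the function name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (margin of error, lower, upper) for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645 for 90%, 1.96 for 95%
    e = z * sigma / sqrt(n)
    return e, xbar - e, xbar + e

e90, lo90, hi90 = z_interval(118.25, 18.15, 36, 0.90)
e95, lo95, hi95 = z_interval(118.25, 18.15, 36, 0.95)

print(f"90%: E = {e90:.2f}, {lo90:.2f} < mu < {hi90:.2f}")
print(f"95%: E = {e95:.2f}, {lo95:.2f} < mu < {hi95:.2f}")
```

Raising the confidence level from 90% to 95% increases z and therefore E, which is why the second interval is wider.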

d) Given E = 3.23 and z = 1.96 (95% confidence level), find the sample size:

$n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\dfrac{(1.96)(18.15)}{3.23}\right)^2 = 121.299$ → round up to 122

Sample size of n = 122 common cold cases will be needed.
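
The sample-size formula always rounds up, because rounding down would give a margin of error slightly larger than requested. A minimal sketch, assuming the critical value is taken from `NormalDist.inv_cdf` rather than the table (the function name is illustrative):

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n whose margin of error does not exceed `margin` (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)   # always round up

n_needed = required_sample_size(18.15, 3.23, 0.95)
print(n_needed)
```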

5. σ is unknown & n = 32 ≥ 30 → t-table, $\bar{x}$ = 7.6 μg/ml, s = 2.1 μg/ml

a) 99% confidence level & d.f. = n – 1 = 32 – 1 = 31 → t = ±2.744

Now, calculate E (Margin of Error) by substituting t = 2.744, s = 2.1, and n = 32 into the formula to get:

$E = t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}} = (2.744)\,\dfrac{2.1}{\sqrt{32}} = 1.018658 \approx 1.0$

$\bar{x} - E = 7.6 - 1.0 = 6.6$ and $\bar{x} + E = 7.6 + 1.0 = 8.6$

6.6 μg/ml < μ < 8.6 μg/ml

[Figure: t-curve with central area 0.99 between t = –2.744 and t = 2.744.]

b) We are 99% confident that the population mean concentration of the specific dose of ampicillin
is between 6.6 μg/ml and 8.6 μg/ml.

c) 90% confidence level & d.f. = n – 1 = 32 – 1 = 31 → t = ±1.696

Now, calculate E (Margin of Error) by substituting t = 1.696, s = 2.1, and n = 32 into the formula to get:

$E = t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}} = (1.696)\,\dfrac{2.1}{\sqrt{32}} = 0.6296 \approx 0.6$

$\bar{x} - E = 7.6 - 0.6 = 7.0$ and $\bar{x} + E = 7.6 + 0.6 = 8.2$

7.0 μg/ml < μ < 8.2 μg/ml

[Figure: t-curve with central area 0.90 between t = –1.696 and t = 1.696.]

d) The 99% confidence interval is wider since the higher confidence level produces the wider
interval.
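
The comparison in part d) can be checked directly. Python's standard library has no t-distribution quantile function, so this sketch takes the critical values 2.744 and 1.696 from the t-table, exactly as the solution does; `t_interval` is an illustrative name.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Return (margin of error, lower, upper); t_crit is read from a t-table."""
    e = t_crit * s / sqrt(n)
    return e, xbar - e, xbar + e

e99, lo99, hi99 = t_interval(7.6, 2.1, 32, 2.744)   # 99%, d.f. = 31
e90, lo90, hi90 = t_interval(7.6, 2.1, 32, 1.696)   # 90%, d.f. = 31

print(f"99%: E = {e99:.4f}, width = {hi99 - lo99:.2f}")
print(f"90%: E = {e90:.4f}, width = {hi90 - lo90:.2f}")
```

The endpoints here are computed from the unrounded E, so they can differ in the second decimal from the solution above, which rounds E to one decimal place before adding and subtracting.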


6.
a)
Pre-Step: σ = 120 is known & n = 64 ≥ 30 → Z-test

Step (1)
$H_0: \mu = 2300$
$H_1: \mu \neq 2300$ (Two-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 1%, α/2 = 0.01/2 = 0.005. Use the area 0.005 (left cumulative area between 0.0049 & 0.0051) to find the z-score: $z_c = \pm 2.575$.

Reject H0 if z-test stat < –2.575 or z-test stat > 2.575.

[Figure: standard normal curve with rejection regions of area 0.005 in each tail, beyond z = –2.575 and z = 2.575.]

Step (3) Test Statistic:
n = 64, $\bar{x}$ = 2240, σ = 120, µ = 2300 (hypothesized)

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{2240 - 2300}{120/\sqrt{64}} = -4.00$ (two decimal places for z-scores)

Step (4) Decision & Conclusion:


Since z (test statistic) = –4.00 is less than –2.575 (z-critical), we reject H0 at the 1% significance
level.
There is enough evidence to conclude that the average daily nutrient intake in healthy
young women is different from 2300 kcal.

b) Type I error might have occurred since we rejected the null hypothesis.
It was a mistake to reject the null hypothesis if the average daily nutrient intake in healthy
young women was actually 2300 kcal.
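
The critical-value procedure in part a) can be sketched as a short function. This assumes `NormalDist.inv_cdf` in place of the z-table, so the computed critical value is about 2.576 rather than the table-interpolated 2.575; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Return (z statistic, positive critical value, reject H0?)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # splits alpha across both tails
    return z, z_crit, abs(z) > z_crit

z, z_crit, reject = two_tailed_z_test(2240, 2300, 120, 64, 0.01)
print(f"z = {z:.2f}, critical = ±{z_crit:.3f}, reject H0: {reject}")
```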

7.
a)
Pre-Step: σ = 18.2 is known and the population is normally distributed → Z-test

Step (1)
$H_0: \mu \geq 80$
$H_1: \mu < 80$ (Left-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

The significance level is 0.05, so use the area 0.05 (left cumulative area between 0.0495 & 0.0505) to find the z-score: $z_c = -1.645$.

Reject H0 if z-test stat < –1.645.

[Figure: standard normal curve with a rejection region of area 0.05 in the left tail, below z = –1.645.]

Step (3) Test Statistic:

n = 25, $\bar{x}$ = 78.5, σ = 18.2, µ = 80 (hypothesized)

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{78.5 - 80}{18.2/\sqrt{25}} = -0.4120879 \approx -0.41$ (two decimal places for z-scores)

Step (4) Decision & Conclusion:


Since z (test statistic) = –0.41 is greater than –1.645(z-critical), we do not reject H0 at the 5%
significance level. There is not enough evidence to conclude that the new medicine reduces
the average recovery time to less than 80 hours.

b) Type II error might have occurred since we failed to reject the null hypothesis.
It was a mistake that we failed to reject the null hypothesis if the new medicine actually
reduced the average recovery time to less than 80 hours.
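
The same decision can be reached with a p-value instead of a critical value. The solution above uses the critical-value method only; the p-value route shown here is an alternative added for illustration, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_p_value(xbar, mu0, sigma, n):
    """Return (z statistic, p-value) for H1: mu < mu0 with sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, NormalDist().cdf(z)      # area to the left of z

z, p_value = left_tailed_p_value(78.5, 80, 18.2, 25)
reject = p_value < 0.05

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

A p-value well above 0.05 leads to the same conclusion as z = –0.41 lying to the right of –1.645: H0 is not rejected.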

8.
Pre-Step: σ is unknown & n = 46 ≥ 30 → t-test

Step (1)
$H_0: \mu \leq 60$
$H_1: \mu > 60$ (Right-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

The significance level is 0.05, so use 1 tail α = 0.05 & d.f. = n – 1 = 46 – 1 = 45 to find the t-score: $t_c = 1.679$.

Reject H0 if t-test stat > 1.679.

[Figure: t-curve with a rejection region of area 0.05 in the right tail, beyond t = 1.679.]

Step (3) Test Statistic:

n = 46, $\bar{x}$ = 68, s = 12.5, µ = 60 (hypothesized)

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{68 - 60}{12.5/\sqrt{46}} = \dfrac{8}{1.8430244} = 4.34069 \approx 4.341$

Step (4) Decision & Conclusion:

Since t (test statistic) = 4.341 is greater than 1.679 (t-critical), we reject H0 at α = 0.05.
There is enough evidence to support the medical board’s claim that the mean number of hours
worked per week by nurses at the hospital is more than 60 hours.
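
The test statistic for this problem is a single expression. In the sketch below the critical value 1.679 is taken from the t-table as in the solution (the standard library offers no t quantile function); `one_sample_t` is an illustrative name.

```python
from math import sqrt

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a single mean when sigma is unknown."""
    return (xbar - mu0) / (s / sqrt(n))

T_CRIT = 1.679                       # t-table: one tail alpha = 0.05, d.f. = 45
t_stat = one_sample_t(68, 60, 12.5, 46)
reject = t_stat > T_CRIT             # right-tailed rejection rule

print(f"t = {t_stat:.3f}, reject H0: {reject}")
```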


9.
Pre-Step: σ is unknown and the population is normally distributed (n = 81 ≥ 30) → t-test

Step (1)
$H_0: \mu = 190$
$H_1: \mu \neq 190$ (Two-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 5%, select 0.05 as 2 tails α & d.f. = n – 1 = 81 – 1 = 80 to find the t-score: $t_c = \pm 1.990$.

Reject H0 if t-test stat < –1.990 or t-test stat > 1.990.

[Figure: t-curve with rejection regions of area 0.025 in each tail, beyond t = –1.990 and t = 1.990.]

Step (3) Test Statistic: n = 81, $\bar{x}$ = 181.45, s = 40, µ = 190 (hypothesized)

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{181.45 - 190}{40/\sqrt{81}} = -1.92375 \approx -1.924$ (round to three decimal places for t-scores)

Step (4) Decision & Conclusion:


Since t (test statistic) = −1.924 is between −1.990 and 1.990 (t-critical), we do not reject H0 at the 5%
significance level. There is not enough evidence to conclude that the mean cholesterol level of
recent Asian immigrants is different from that of the general U.S. population (190 mg/dL).
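
A two-tailed test at the 5% level and a 95% confidence interval always lead to the same decision: H0 is rejected exactly when the hypothesized mean falls outside the interval. The original solution does not compute this interval; the sketch below adds it purely as a cross-check, using the table value 1.990.

```python
from math import sqrt

xbar, s, n, mu0 = 181.45, 40, 81, 190
T_CRIT = 1.990                        # t-table: two tails alpha = 0.05, d.f. = 80

se = s / sqrt(n)                      # estimated standard error
t_stat = (xbar - mu0) / se
lower, upper = xbar - T_CRIT * se, xbar + T_CRIT * se

reject_by_test = abs(t_stat) > T_CRIT
reject_by_interval = not (lower <= mu0 <= upper)

print(f"t = {t_stat:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains 190: {not reject_by_interval}")
```

The interval only just includes 190, which matches a test statistic that falls just inside the critical values.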

10.
Pre-Step: σ1 = 0.72 & σ2 = 0.87 are known & n1 = 64, n2 = 81 ≥ 30 → Z-test (independent)

a)
Step (1)
$\mu_1$: the population mean of total cholesterol level of women
$\mu_2$: the population mean of total cholesterol level of men

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$ (Two-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 2%, α/2 = 0.02/2 = 0.01. Use the area 0.01 (left cumulative area 0.0099 is the closest) to find the z-score: $z_c = \pm 2.33$.

Reject H0 if z-test stat < –2.33 or z-test stat > 2.33.

[Figure: standard normal curve with rejection regions of area 0.01 in each tail, beyond z = –2.33 and z = 2.33.]

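Step (3) Test Statistic:

$n_1 = 64,\ \bar{x}_1 = 5.1,\ \sigma_1 = 0.72$; $n_2 = 81,\ \bar{x}_2 = 4.8,\ \sigma_2 = 0.87$

$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{(5.1 - 4.8) - 0}{\sqrt{\dfrac{(0.72)^2}{64} + \dfrac{(0.87)^2}{81}}} = 2.271395 \approx 2.27$ (two decimal places for z-scores)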

Step (4) Decision & Conclusion:


Since z (test statistic) = 2.27 is between −2.33 and 2.33 (z-critical), we do not reject H0 at the 2%
significance level. There is insufficient evidence to conclude that there is a difference
in the mean total cholesterol levels between men and women.

b)
Step (1)
$\mu_1$: the population mean of total cholesterol level of women
$\mu_2$: the population mean of total cholesterol level of men

$H_0: \mu_1 \leq \mu_2$
$H_1: \mu_1 > \mu_2$ (Right-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 2% (in the right tail), use the area (1 – 0.02 =) 0.98 (left cumulative area 0.9798, which is the closest) to find the z-score: $z_c = 2.05$.

Reject H0 if z-test stat > 2.05.

[Figure: standard normal curve with left area 0.98 and a rejection region of area 0.02 in the right tail, beyond z = 2.05.]

Step (3) Test Statistic: (same as (a))

$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{(5.1 - 4.8) - 0}{\sqrt{\dfrac{(0.72)^2}{64} + \dfrac{(0.87)^2}{81}}} = 2.271395 \approx 2.27$ (two decimal places for z-scores)


Step (4) Decision & Conclusion:


Since z (test statistic) = 2.27 is greater than 2.05 (z-critical), we reject H0 at the 2%
significance level. There is enough evidence to conclude that the mean total cholesterol
level of women is higher than that of men.
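
Parts a) and b) reach different decisions from the same test statistic because the 2% is split across two tails in a) and placed in one tail in b). The sketch below shows this with p-values, assuming `statistics.NormalDist` in place of the z-table; the p-value comparison is an illustration added here, not the method used in the solution.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, sigma1, n1, x2, sigma2, n2, diff0=0.0):
    """z statistic for the difference of two independent means, sigmas known."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((x1 - x2) - diff0) / se

z = two_sample_z(5.1, 0.72, 64, 4.8, 0.87, 81)
norm = NormalDist()
p_two_tailed = 2 * (1 - norm.cdf(abs(z)))    # part a)
p_right_tailed = 1 - norm.cdf(z)             # part b)

print(f"z = {z:.2f}")
print(f"two-tailed p = {p_two_tailed:.4f}, right-tailed p = {p_right_tailed:.4f}")
```

With α = 0.02, the two-tailed p-value is slightly above α while the right-tailed p-value is below it, which reproduces the two conclusions.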

11. Pre-Step: σ's are unknown and n1 = 120, n2 = 100 ≥ 30 → t-test (independent)

Step (1)
$\mu_1$: the population mean heart rate of Caucasian newborns
$\mu_2$: the population mean heart rate of African-American newborns

$H_0: \mu_1 \geq \mu_2$
$H_1: \mu_1 < \mu_2$ (Left-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 1%, use 1 tail α = 0.01. As n1 = 120 & n2 = 100, n2 = 100 is the smaller sample size, so d.f. = 100 – 1 = 99 (90 is the closest table entry when rounded down). Place the negative sign on the t-score because this is a left-tailed test: $t_c = -2.368$.

Reject H0 if t-test stat < –2.368.

[Figure: t-curve with a rejection region of area 0.01 in the left tail, below t = –2.368.]

Step (3) Test Statistic: $n_1 = 120,\ \bar{x}_1 = 126,\ s_1 = 11$; $n_2 = 100,\ \bar{x}_2 = 134,\ s_2 = 12$

$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(126 - 134) - 0}{\sqrt{\dfrac{11^2}{120} + \dfrac{12^2}{100}}} = \dfrac{-8}{1.5647151} = -5.112751 \approx -5.113$

Step (4) Decision & Conclusion:


Since t (test statistic) = −5.113 is less than −2.368 (t-critical), we reject H0 at the 1%
significance level. There is significant evidence to conclude that the mean heart rate of
Caucasian newborns is lower than that of African-American newborns.
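
The test statistic and the conservative degrees of freedom used above (smaller sample size minus 1) can be sketched as follows. The Welch-Satterthwaite degrees of freedom are also returned for comparison; that formula is not part of the original solution and is included only to show how conservative the simpler rule is.

```python
from math import sqrt

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t statistic with two choices of degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (x1 - x2) / sqrt(v1 + v2)
    df_conservative = min(n1, n2) - 1                  # rule used in the solution
    # Welch-Satterthwaite approximation (an addition for comparison only)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

t_stat, df_cons, df_welch = two_sample_t(126, 11, 120, 134, 12, 100)
reject = t_stat < -2.368      # t-table critical value quoted in Step (2)

print(f"t = {t_stat:.3f}, d.f. = {df_cons} (conservative), {df_welch:.1f} (Welch)")
```

Because the conservative rule uses fewer degrees of freedom, its critical value is further from zero, so a rejection under it would also be a rejection under the Welch value.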


12. σ is unknown, the differences are from a normally distributed population, and the samples are dependent → Paired t-test
a)
Pre-Step

Patient   Before   After   Difference (d)    $\bar{d}$   $(d - \bar{d})^2$
1         25       19      19 – 25 = –6      –4          $[-6 - (-4)]^2 = 4$
2         17       14      14 – 17 = –3      –4          $[-3 - (-4)]^2 = 1$
3         16       12      –4                –4          0
4         10       3       –7                –4          9
5         8        6       –2                –4          4
6         8        2       –6                –4          4
7         6        4       –2                –4          4
8         5        3       –2                –4          4
Total                      –32                           30

$\bar{d} = \dfrac{-32}{8} = -4$ and $s_d = \sqrt{\dfrac{30}{8 - 1}} = 2.0701966 \approx 2.07$
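
The table above can be reproduced in a few lines. This sketch uses the standard-library `statistics.mean` and `statistics.stdev` (the sample standard deviation, with n – 1 in the denominator) on the Before and After columns; the variable names are illustrative.

```python
from statistics import mean, stdev

before = [25, 17, 16, 10, 8, 8, 6, 5]
after = [19, 14, 12, 3, 6, 2, 4, 3]

d = [a - b for a, b in zip(after, before)]     # difference = After - Before
d_bar = mean(d)                                # mean difference
sum_sq = sum((x - d_bar) ** 2 for x in d)      # sum of squared deviations
s_d = stdev(d)                                 # sqrt(sum_sq / (n - 1))

print(f"d = {d}")
print(f"d_bar = {d_bar}, sum of squares = {sum_sq}, s_d = {s_d:.4f}")
```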
b)
Step (1)
$H_0: \mu_d = 0$ (No change)
$H_1: \mu_d \neq 0$ (Changed) (Two-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

Since the significance level is 0.01, use 2 tail α = 0.01 & d.f. = 8 – 1 = 7 to find the t-score: $t_c = \pm 3.499$.

Reject H0 if t-test stat < –3.499 or t-test stat > 3.499.

[Figure: t-curve with rejection regions of area 0.005 in each tail, beyond t = –3.499 and t = 3.499.]

Step (3) Test Statistic: n = 8, $\bar{d} = -4$, $s_d = 2.07$

$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-4 - 0}{2.07/\sqrt{8}} = -5.46555 \approx -5.466$ (three decimal places for t-scores)

Step (4) Decision & Conclusion:


Since the test statistic t = –5.466 is less than –3.499 (t-critical), we reject H0 at the 1%
significance level. There is enough evidence to conclude that the mean urinary protein has
changed over the 8-week period.


c)
Step (1)
$H_0: \mu_d \geq 0$ (not decreased)
$H_1: \mu_d < 0$ (decreased) (Left-tailed Test)

Step (2) Critical value(s) and Rejection region(s):

The significance level is 0.01, so use 1 tail α = 0.01 & d.f. = n – 1 = 8 – 1 = 7 to find the t-score. Place the negative sign on the t-score because this is a left-tailed test: $t_c = -2.998$.

Reject H0 if t-test stat < –2.998.

[Figure: t-curve with a rejection region of area 0.01 in the left tail, below t = –2.998.]

Step (3) Test Statistic: same as the previous case

$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-4 - 0}{2.07/\sqrt{8}} = -5.46555 \approx -5.466$ (three decimal places for t-scores)

Step (4) Decision & Conclusion:


Since the t-test statistic = –5.466 is less than –2.998 (t-critical), we reject H0 at the 1% significance
level. There is enough evidence to conclude that the mean urinary protein has decreased
over the 8-week period.
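
Parts b) and c) share one test statistic and differ only in the rejection rule. The sketch below uses the table critical values 3.499 and 2.998 quoted above; `paired_t` is an illustrative name. It also shows the statistic computed with the unrounded standard deviation √(30/7), which differs slightly in the third decimal from the value based on 2.07.

```python
from math import sqrt

def paired_t(d_bar, s_d, n, mu_d0=0.0):
    """t statistic for the mean of paired differences."""
    return (d_bar - mu_d0) / (s_d / sqrt(n))

t_stat = paired_t(-4, 2.07, 8)                # s_d rounded, as in the solution
t_unrounded = paired_t(-4, sqrt(30 / 7), 8)   # s_d kept at full precision

reject_two_tailed = abs(t_stat) > 3.499       # b) two tails, alpha = 0.01, d.f. = 7
reject_left_tailed = t_stat < -2.998          # c) one tail,  alpha = 0.01, d.f. = 7

print(f"t = {t_stat:.3f} (rounded s_d), {t_unrounded:.3f} (unrounded s_d)")
print(f"reject in b): {reject_two_tailed}, reject in c): {reject_left_tailed}")
```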

13.
a) $\hat{y} = -1.115 + 1.846x$
Slope 1.846: if the average daily sunshine increases by 1 hour, the predicted number of skin cancers
increases by 1.846 per 100,000.

b)
$r^2 = 0.923$
92.3% of the variation in the number of skin cancers (/100,000) can be explained by the
variation of the average daily sunshine (in hours). The remaining 7.7% is unexplained.

c) $r = \sqrt{0.923} = 0.96072\ldots \approx 0.961$ (Since the slope $b_1$ is positive, r is positive.)


(There is a very strong positive linear relationship between the amount of sunshine and the
numbers of skin cancers.)


d)
H 0 : There is no significant linear relationship
H1 : There is a significant linear relationship
Since the p-value = 0.000147 (sig. value from the table) < 0.01 (significance level), we reject H0.
There is a significant linear relationship between the amount of sunshine and the numbers of
skin cancers at the 1% level.

e) Since there is a significant linear relationship and the correlation is very strong, we can use
the equation to predict.

$\hat{y} = -1.115 + 1.846(6.5) = 10.884 \approx 10.9$ skin cancers / 100,000
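
The slope interpretation in a), the sign rule for r in c), and the prediction in e) can be collected in one short sketch. The coefficients and r² are the ones reported above; the names `B0`, `B1`, and `predict` are illustrative.

```python
from math import copysign, sqrt

B0, B1 = -1.115, 1.846        # intercept and slope from part a)
R_SQUARED = 0.923             # coefficient of determination from part b)

def predict(hours_of_sunshine):
    """Predicted skin cancers per 100,000 for a given average daily sunshine."""
    return B0 + B1 * hours_of_sunshine

r = copysign(sqrt(R_SQUARED), B1)     # r takes the sign of the slope
y_hat = predict(6.5)

print(f"r = {r:.3f}")
print(f"prediction at 6.5 hours: {y_hat:.3f} per 100,000")
```

Predicting at two values one hour apart, for example `predict(7.5) - predict(6.5)`, returns the slope, which is the numerical meaning of the interpretation in part a).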

14.
Pre-step: Calculate expected frequencies:

$\text{Expected frequency} = \dfrac{(\text{Column Total}) \times (\text{Row Total})}{\text{Overall Total}}$

Expected frequencies are shown in brackets in the table below.

Age at first birth   < 20        20-24       25-29       30-34       ≥ 35       Total
Case                 320         1206        1011        463         220        3220
                     (416.58)    (1348.26)   (933.60)    (371.86)    (149.70)
Control              1422        4432        2893        1092        406        10245
                     (1325.42)   (4289.74)   (2970.40)   (1183.14)   (476.30)
Total                1742        5638        3904        1555        626        13465

Step 1. Statements:
𝐻0 : There is no relationship between age at first birth and prevalence of breast cancer.
𝐻1 : There is a relationship between age at first birth and prevalence of breast cancer.

Step 2. Critical values and rejection region:

Degrees of freedom: df = (2 − 1)(5 − 1) = 4

$\chi^2_{4,\,0.005} = 14.860$

Reject H0 if $\chi^2$-stat > 14.860.

[Figure: chi-square curve with the rejection region in the right tail, beyond 14.860.]


Step 3. Test statistic:

Observed (O)   Expected (E)   $(O - E)^2$    $(O - E)^2/E$
320            416.58         9327.6964      22.39113
1206           1348.26        20237.9076     15.01039
1011           933.60         5990.7600      6.4168
463            371.86         8306.4996      22.33771
220            149.70         4942.0900      33.01329
1422           1325.42        9327.6964      7.03754
4432           4289.74        20237.9076     4.71775
2893           2970.40        5990.7600      2.01682
1092           1183.14        8306.4996      7.02072
406            476.30         4942.0900      10.37600
Total                                        130.33819

$\chi^2$-stat ≈ 130.34

Step 4: Conclusion:
Since $\chi^2$ (test statistic) = 130.34 is greater than 14.860, we reject H0.
There is enough evidence to conclude that there is a significant relationship between
age at first birth and prevalence of breast cancer.
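
The expected frequencies and the test statistic can be recomputed from the observed counts alone. The sketch below keeps the expected values at full precision instead of rounding them to two decimals, so the statistic may differ from the hand calculation in the later decimal places; the critical value 14.860 is taken from the chi-square table as in Step 2.

```python
observed = [
    [320, 1206, 1011, 463, 220],      # cases, by age at first birth
    [1422, 4432, 2893, 1092, 406],    # controls, by age at first birth
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total x column total) / overall total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over every cell
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
reject = chi_sq > 14.860              # table value for df = 4, right-tail area 0.005

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject}")
```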
