Set - 2 - 2023 - Review - Outline Solutions

This document provides solutions to review questions for topics 1 and 2 in a business statistics course. For topic 1, the questions cover descriptive statistics concepts like measures of central tendency, standard deviation, the empirical rule and applying these concepts to sample data. For topic 2, the questions cover probability concepts like random variables, discrete vs. continuous variables, mutually exclusive and collectively exhaustive events, and calculating probabilities based on these classifications.

QM161 Business Statistics

Review Questions - Solutions

TOPIC 1

T1_Q1: Choose the one alternative that best completes the statement or answers the question.
1) The methods involving the collection, presentation of data in tables, figures and graphs, and characterization of a set
of data in order to properly describe the various features of that set of data are called
A) the scientific method. B) statistical inference. C) sampling. D) descriptive statistics.

2) The estimation of the population average family expenditure on food based on the sample average expenditure of
1,000 families is an example of
A) a statistic. B) a parameter. C) inferential statistics. D) descriptive statistics.

3) The study of the collection, analysis, summarization, organization and interpretation of data is known as
A) Economics B) Mathematics C) Statistics. D) None of the above

4) A summary measure that is computed to describe a characteristic from only a sample of the population is called
A) a statistic. B) the scientific method. C) a census. D) a parameter.

5) Which of the following is most likely a population as opposed to a sample?


A) every third person to arrive at the bank. B) registered voters in a county.
C) respondents to a newspaper survey. D) the first 5 students completing an assignment.

6) Which of the following is most likely a parameter as opposed to a statistic?


A) The proportion of trucks stopped yesterday that were cited for bad brakes.
B) The average height of people randomly selected from a database.
C) The average score of the first five students completing an assignment.
D) The proportion of females registered to vote in a county.

7) Which of the following statistics is not a measure of central tendency?


A) median B) Q3 C) mode D) arithmetic mean

8) In a perfectly symmetrical bell-shaped "normal" distribution


A) the median equals the mode. B) the arithmetic mean equals the median.
C) the arithmetic mean equals the mode. D) All of the above.

9) The smaller the spread of scores around the arithmetic mean,


A) the smaller the interquartile range. B) the smaller the coefficient of variation.
C) the smaller the standard deviation. D) All of the above.

10) According to the empirical rule, if the data form a "bell-shaped" normal distribution, _____ percent of the
observations will be contained within 2 standard deviations around the arithmetic mean.
A) 68 B) 95 C) 93.75 D) 9.99

Rene@UNE Business School 1


T1_Q2: The following table represents the assets in billions of dollars of a sample of five largest bond funds.

Let X = Assets (B$)

Bond Fund                          Assets (Billions $)
Vanguard GNMA                      19.5
Vanguard Total Bond Mkt. Index     16.8
Bond Fund of America A             13.7
Franklin Calif. Tax-Free Inc. A    12.8
Vanguard Short-Term Corp.          10.9

$$\sum X = 19.5 + 16.8 + 13.7 + 12.8 + 10.9 = 73.7$$
$$\sum X^2 = 19.5^2 + 16.8^2 + 13.7^2 + 12.8^2 + 10.9^2 = 1132.83$$

1) Calculate the mean assets.

$$\bar{X} = \frac{\sum X}{n} = \frac{73.7}{5} = 14.74 \text{ billion \$}$$

2) Compute the value of the standard deviation.

$$s = \sqrt{\frac{1}{n-1}\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]} = \sqrt{\frac{1}{4}\left[1132.83 - \frac{(73.7)^2}{5}\right]} = \sqrt{11.623} = 3.41 \text{ billion \$}$$
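The mean and standard deviation above can be checked with a short script. This is only a sketch: the variable names are illustrative, and the cross-check uses Python's standard `statistics.stdev`, which computes the sample (n − 1) standard deviation.

```python
import statistics

# Assets in billions of dollars, copied from the table in T1_Q2
assets = [19.5, 16.8, 13.7, 12.8, 10.9]

n = len(assets)
sum_x = sum(assets)
sum_x2 = sum(x * x for x in assets)

mean = sum_x / n

# Computational formula for the sample variance: (sum X^2 - (sum X)^2 / n) / (n - 1)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = variance ** 0.5

print(f"mean = {mean:.2f}, s = {std_dev:.2f}")

# Cross-check against the standard library's sample standard deviation
print(f"statistics.stdev = {statistics.stdev(assets):.2f}")
```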

T1_Q3: A university class contained 60 female students. They were asked to report their heights (in cm), with the following
results: n = 60, mean = 163.39, standard deviation = 8.68.
Assume that heights follow a bell- or mound-shaped distribution.

Note: The main purpose of this question is to apply the empirical rule. However, it could be argued that it would
be more accurate to use a 95% confidence interval in part 1, and obtain exact probability in parts 2 and 3.
Both are acceptable.

(1) Within what range would you expect 95% of the heights to lie?

By the empirical rule, we would expect 95% of the heights to lie within 2 standard deviations from the mean, hence:

$$\bar{X} \pm 2(\text{standard deviation})$$

$$163.39 \pm 2(8.68) = 163.39 \pm 17.36$$

[146.03 cm, 180.75 cm]

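(2) Find the z-score of Jenny, a female student whose height is 155.22 cm. Approximately how many students are shorter
than Jenny?

$$Z = \frac{X - \bar{X}}{S} = \frac{155.22 - 163.39}{8.68} = -0.94$$

Jenny is about 1 standard deviation below the mean. By the empirical rule, about 68% of heights lie within 1 standard
deviation of the mean, so about 16% lie below that range. Approximately 10 students are shorter than Jenny (0.16*60).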

(3) Tina has a height of 195 cm. What is her z-score? Approximately how many students in the class do you think would
be taller than Tina?

$$Z = \frac{X - \bar{X}}{S} = \frac{195 - 163.39}{8.68} = 3.64$$
None. Tina is the tallest.
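
The three parts of T1_Q3 can be reproduced with a few lines of arithmetic. This is a minimal sketch; the helper `z_score` and the variable names are illustrative, and the 16% figure is the empirical-rule approximation for the area below z = −1.

```python
# Summary statistics reported for the class of 60 students
mean, sd, n = 163.39, 8.68, 60

# (1) Empirical rule: about 95% of heights lie within 2 standard deviations
low, high = mean - 2 * sd, mean + 2 * sd

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# (2) and (3): z-scores for Jenny and Tina
z_jenny = z_score(155.22, mean, sd)
z_tina = z_score(195, mean, sd)

# Empirical rule: about 16% of observations fall below z = -1
shorter_than_jenny = 0.16 * n

print(f"95% range: [{low:.2f}, {high:.2f}] cm")
print(f"z(Jenny) = {z_jenny:.2f}, z(Tina) = {z_tina:.2f}")
print(f"students shorter than Jenny: about {round(shorter_than_jenny)}")
```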


TOPIC 2

T2_Q1: Choose the one alternative that best completes the statement or answers the question.

1) The manager of the customer service division of a major consumer electronics company is interested in determining
whether the customers who have purchased a DVD player made by the company over the past 12 months are
satisfied with their products. One survey question asks "How many people are there in your
household?" The data collected from this question are an example of a
A) categorical random variable. B) discrete numerical random variable.
C) parameter. D) continuous numerical random variable.
2) Which of the following is a discrete numerical variable?
A) The Dow Jones Industrial average B) The distance you drove yesterday
C) The volume of water released from a dam D) The number of employees of an insurance company
3) Which of the following is a continuous numerical variable?
A) The number of gallons of milk sold at the local grocery store yesterday
B) The color of a student's eyes
C) The amount of milk produced by a cow in one 24-hour period
D) The number of employees of an insurance company
4) If two events are collectively exhaustive, what is the probability that both occur at the same time?
A) 0.50 B) 1.00 C) 0 D) Cannot be determined from the information given.
Note: We do not know if the events are mutually exclusive.
5) If two events are mutually exclusive and collectively exhaustive, what is the probability that one or the other
occurs?
A) 1.00 B) 0.50 C) 0 D) Cannot be determined from the information given.

T2_Q2: A survey conducted by the Segal Company of New York found that in a sample of 170 large companies, 40 offered
stock options to their board members as part of their non-cash compensation packages. For small- to mid-sized companies,
43 of the 180 surveyed indicated that they offer stock options as part of their noncash compensation packages to their board
members.
                 Company Size
Stock Options    Large (L)    Small to midsized (SM)    Total
Yes (Y)          40           43                        83
No (N)           130          137                       267
Total            170          180                       350
If a company is selected at random,
1) What is the probability that the company offered stock options to their board members?
P(Y) = 83/350 = 0.2371
2) What is the probability that the company is small to mid-sized and did not offer stock options to their board
members?
P(SM and N) = 137/350 = 0.3914
3) What is the probability that the company is small to mid-sized or offered stock options to their board members?
P(SM or Y) = (180+83-43)/350 = 0.6286
4) If a randomly selected company offered stock options to their board members, what is the probability that it is a large
company?
P(L | Y ) = 40/83 = 0.4819
5) Let A = the company offered stock options to their board members and B = the company is a large company. Are the
events A and B independent? Show your working.
A and B are independent if P(A and B) = P(A) P(B)
We have: P(A) = P(Y) = 83/350 = 0.2371; P(B) = P(L) = 170/350 = 0.4857; P(A and B) = P(Y and L) = 40/350 = 0.1143
And: P(A) P(B) = 0.2371*0.4857 = 0.1152
Thus: 0.1152 ≠ 0.1143, therefore events A and B are not independent.
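
The probabilities in T2_Q2 all come from the same contingency table, so they can be recomputed from the cell counts. The sketch below is one way to do that; the dictionary layout and the helper `prob` are assumptions made for illustration, not part of the original solution.

```python
# Cell counts from the T2_Q2 table: (company size, stock options) -> count
counts = {
    ("L", "Y"): 40, ("SM", "Y"): 43,
    ("L", "N"): 130, ("SM", "N"): 137,
}
total = sum(counts.values())

def prob(size=None, option=None):
    """Probability that a randomly selected company matches the given size and/or option."""
    matching = sum(
        c for (s, o), c in counts.items()
        if (size is None or s == size) and (option is None or o == option)
    )
    return matching / total

p_y = prob(option="Y")                                   # P(Y)
p_sm_and_n = prob("SM", "N")                             # P(SM and N)
p_sm_or_y = prob(size="SM") + p_y - prob("SM", "Y")      # addition rule
p_l_given_y = prob("L", "Y") / p_y                       # conditional probability

# Independence check: P(A and B) must equal P(A) * P(B)
independent = abs(prob("L", "Y") - p_y * prob(size="L")) < 1e-9

print(f"P(Y) = {p_y:.4f}")
print(f"P(SM and N) = {p_sm_and_n:.4f}")
print(f"P(SM or Y) = {p_sm_or_y:.4f}")
print(f"P(L | Y) = {p_l_given_y:.4f}")
print(f"independent: {independent}")
```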


TOPIC 3

T3_Q1: A lab orders 100 rats a week for each of the 52 weeks in the year for experiments that the lab conducts. Prices for 100
rats follow the following distribution:

Price:          $10.00    $12.50    $15.00
Probability:    0.35      0.40      0.25

Note: Work out the expected price first, then multiply by 52.

1) How much should the lab budget for next year's rat orders be, assuming this distribution does not change?
A) $520 B) $780 C) $650 D) $637
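
The expected-value calculation for T3_Q1 can be sketched as follows: weight each price by its probability, sum to get the expected weekly price, then scale by 52 weeks. Variable names are illustrative.

```python
# Price distribution for an order of 100 rats (from T3_Q1)
prices = [10.00, 12.50, 15.00]
probabilities = [0.35, 0.40, 0.25]

# Expected weekly price: sum of price * probability
expected_price = sum(p * q for p, q in zip(prices, probabilities))

# One order per week for 52 weeks
annual_budget = 52 * expected_price

print(f"expected weekly price = ${expected_price:.2f}")
print(f"annual budget = ${annual_budget:.2f}")
```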

T3_Q2: Suppose that past history shows that 65% of college students prefer Brand C cola. A
sample of 5 students is to be selected and the following PHStat output is generated.

X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.005252    0.005252    0           0.994748    1
1    0.04877     0.054023    0.005252    0.945978    0.994748
2    0.181147    0.235169    0.054022    0.764831    0.945978
3    0.336416    0.571585    0.235169    0.428415    0.764831
4    0.312386    0.883971    0.571585    0.116029    0.428415
5    0.116029    1           0.883971    0           0.116029

Note: Answers can be given to 4 decimal places.

1) What is the probability that exactly 3 prefer brand C?


P(X=3) = 0.3364

2) What is the probability that more than 3 prefer brand C?


P(X > 3 ) = 0.4284

3) What is the probability that less than 3 prefer brand C?

P(X < 3) = 0.2352


4) What is the probability that exactly 3 did not prefer brand C?
If exactly 3 of the 5 students did not prefer brand C, then exactly 2 did:
P(X’=3) = P(X=2) = 0.1811

5) Calculate the mean and standard deviation of students who would prefer brand C cola.

$$\mu = n\pi = 5 \times 0.65 = 3.25$$
$$\sigma = \sqrt{n\pi(1-\pi)} = \sqrt{5(0.65)(0.35)} = 1.07 \text{ (2 dcp)}$$

T3_Q3:
1) Given the standard normal distribution (with mean 0 and standard deviation of 1), what is the probability that:

A) Z is less than 1.57?

P(Z < 1.57) = 0.9418

B) Z is greater than 1.84?

P(Z > 1.84) = 1 - P(Z < 1.84) = 1 – 0.9671 = 0.0329

Note: See if you could shade the appropriate areas in the standard normal curve.

C) Z is between 1.57 and 1.84?


P(1.57 < Z < 1.84) = P(Z<1.84) – P(Z<1.57) = 0.9671 – 0.9418 = 0.0253


2) Given a normal distribution with µ = 100 and σ = 10, that is X ~ N(100, 10²), what is the probability that:

A) X > 75?

$$P(X > 75) = P\left(Z > \frac{75 - 100}{10}\right) = P(Z > -2.50) = 1 - P(Z < -2.50) = 1 - 0.0062 = 0.9938$$

B) X < 70?

$$P(X < 70) = P\left(Z < \frac{70 - 100}{10}\right) = P(Z < -3.00) = 0.00135$$

C) X < 70 or X > 75?

The two events cannot both occur, so their probabilities add:

$$P(X < 70 \text{ or } X > 75) = P(Z < -3.00) + P(Z > -2.50) = 0.00135 + 0.9938 = 0.99515$$

D) 5% of the values are more than what X value?

$P(X > X_0) = 0.05$, or equivalently $P(X < X_0) = 0.95$, that is $P\left(Z < \frac{X_0 - 100}{10}\right) = 0.95$. But we know $P(Z < 1.645) = 0.95$, therefore:

$$1.645 = \frac{X_0 - 100}{10}$$

Solving for $X_0$: $X_0 = (1.645 \times 10) + 100 = 116.45$
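
The table lookups in part 2 can be cross-checked with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch rather than the method used in the course; results may differ from table values in the last decimal place because the tables are rounded.

```python
from statistics import NormalDist

# X ~ N(100, 10^2)
X = NormalDist(mu=100, sigma=10)

p_gt_75 = 1 - X.cdf(75)        # A) P(X > 75)
p_lt_70 = X.cdf(70)            # B) P(X < 70)
p_either = p_lt_70 + p_gt_75   # C) disjoint events, so probabilities add
x0 = X.inv_cdf(0.95)           # D) value with 5% of the distribution above it

print(f"P(X > 75) = {p_gt_75:.4f}")
print(f"P(X < 70) = {p_lt_70:.5f}")
print(f"P(X < 70 or X > 75) = {p_either:.5f}")
print(f"X0 = {x0:.2f}")
```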

T3_Q4:
A company that sells annuities must base the annual payout on the probability distribution of the length of life of the
participants in the plan. Suppose the probability distribution of the lifetimes of the participants in the plan is approximately
a normal distribution with mean 68 years and standard deviation 3.5 years:

1) What proportion of participants would receive payments:


a) below 65 years old?

$$P(X < 65) = P\left(Z < \frac{65 - 68}{3.5}\right) = P(Z < -0.86) = 0.1949$$

19.49% of participants

b) beyond age 70?

$$P(X > 70) = P\left(Z > \frac{70 - 68}{3.5}\right) = P(Z > 0.57) = 1 - P(Z < 0.57) = 1 - 0.7157 = 0.2843$$

28.43% of participants

c) beyond age 75?

$$P(X > 75) = P\left(Z > \frac{75 - 68}{3.5}\right) = P(Z > 2.00) = 1 - P(Z < 2.00) = 1 - 0.9772 = 0.0228$$

2.28% of participants

d) between 65 and 75 years old?

$$P(65 < X < 75) = P(-0.86 < Z < 2.00) = P(Z < 2.00) - P(Z < -0.86) = 0.9772 - 0.1949 = 0.7823$$

78.23% of participants

2) Complete the following statement: Only 10% of plan participants will receive payment beyond age _____?

$P(X > X_0) = 0.10$, or equivalently $P(X < X_0) = 0.90$, that is $P\left(Z < \frac{X_0 - 68}{3.5}\right) = 0.90$. But we know $P(Z < 1.28) = 0.90$, therefore:

$$1.28 = \frac{X_0 - 68}{3.5}$$

Solving for $X_0$: $X_0 = (1.28 \times 3.5) + 68 = 72.48$
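
The annuity calculations can be sketched in code as well. To mirror the table-based working, the hypothetical helper `table_z` rounds each z-score to two decimal places before looking up the standard normal probability; this rounding step is an assumption made to match the solution, and an unrounded calculation gives slightly different values. The percentile uses the exact inverse CDF, so it differs a little from the answer based on z = 1.28.

```python
from statistics import NormalDist

mu, sigma = 68, 3.5   # lifetime distribution of plan participants
Z = NormalDist()      # standard normal distribution

def table_z(x):
    """z-score rounded to 2 decimal places, as read from a printed z-table."""
    return round((x - mu) / sigma, 2)

p_below_65 = Z.cdf(table_z(65))
p_beyond_70 = 1 - Z.cdf(table_z(70))
p_beyond_75 = 1 - Z.cdf(table_z(75))
p_between = Z.cdf(table_z(75)) - Z.cdf(table_z(65))

# Age beyond which only 10% of participants receive payment (exact inverse CDF)
age_top_10 = NormalDist(mu, sigma).inv_cdf(0.90)

print(f"P(X < 65) = {p_below_65:.4f}")
print(f"P(X > 70) = {p_beyond_70:.4f}")
print(f"P(X > 75) = {p_beyond_75:.4f}")
print(f"P(65 < X < 75) = {p_between:.4f}")
print(f"age = {age_top_10:.2f}")
```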


TOPIC 4

T4_Q1: A random sample of 16 is drawn from a normal population with mean equal to 15 and standard deviation 2.
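1) What is the mean and the standard deviation of the sampling distribution of $\bar{X}$?

$$\mu_{\bar{X}} = \mu = 15$$
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2}{\sqrt{16}} = 0.50$$

Note: Please take note of the denominator in the formula for Z.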
2) Find the value of:
a) $P(\bar{X} > 15.5)$

$$P(\bar{X} > 15.5) = P\left(Z > \frac{15.5 - 15}{2/\sqrt{16}}\right) = P(Z > 1.00) = 1 - P(Z < 1.00) = 1 - 0.8413 = 0.1587$$

b) $P(\bar{X} < 14)$

$$P(\bar{X} < 14) = P\left(Z < \frac{14 - 15}{2/\sqrt{16}}\right) = P(Z < -2.00) = 0.0228$$

c) $P(\bar{X} > 18)$

$$P(\bar{X} > 18) = P\left(Z > \frac{18 - 15}{2/\sqrt{16}}\right) = P(Z > 6.00) = 1 - P(Z < 6.00) \approx 0$$
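3) Find $\bar{X}_0$ such that $P(\bar{X} > \bar{X}_0) = 0.60$.

$P(\bar{X} > \bar{X}_0) = 0.60$, or equivalently $P(\bar{X} < \bar{X}_0) = 0.40$, that is $P\left(Z < \frac{\bar{X}_0 - 15}{0.50}\right) = 0.40$. But we know $P(Z < -0.25) = 0.40$, therefore:

$$-0.25 = \frac{\bar{X}_0 - 15}{0.50}$$

Solving for $\bar{X}_0$: $\bar{X}_0 = (-0.25 \times 0.50) + 15 = 14.875$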

T4_Q2: The time spent using email per session is normally distributed with mean 8 and standard deviation of 2 minutes. If
random samples of 25 sessions were selected:

(a) What is the standard error of the sample mean for 25 sessions?

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2}{\sqrt{25}} = 0.40$$
(b) What proportion of the sample means would be more than 9 minutes?

$$P(\bar{X} > 9) = P\left(Z > \frac{9 - 8}{2/\sqrt{25}}\right) = P(Z > 2.50) = 1 - P(Z < 2.50) = 1 - 0.9938 = 0.0062$$
(c) What proportion of the sample means would be less than 7.5 minutes?

$$P(\bar{X} < 7.5) = P\left(Z < \frac{7.5 - 8}{2/\sqrt{25}}\right) = P(Z < -1.25) = 0.1056$$

(d) What proportion of the sample means would fall between 7.5 and 9 minutes?

$$P(7.5 < \bar{X} < 9) = P(-1.25 < Z < 2.50) = P(Z < 2.50) - P(Z < -1.25) = 0.9938 - 0.1056 = 0.8882$$
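
The sampling-distribution answers for T4_Q2 can be reproduced by treating the sample mean as a normal variable whose standard deviation is the standard error. This is a sketch using `statistics.NormalDist`; because it does not round to table values, the last digit of part (d) may differ slightly from the worked answer.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 2, 25  # population mean, population sd, sample size

# (a) Standard error of the sample mean
std_error = sigma / sqrt(n)

# Sampling distribution of the sample mean: N(mu, std_error^2)
xbar = NormalDist(mu, std_error)

p_above_9 = 1 - xbar.cdf(9)              # (b)
p_below_7_5 = xbar.cdf(7.5)              # (c)
p_between = xbar.cdf(9) - xbar.cdf(7.5)  # (d)

print(f"standard error = {std_error:.2f}")
print(f"P(mean > 9) = {p_above_9:.4f}")
print(f"P(mean < 7.5) = {p_below_7_5:.4f}")
print(f"P(7.5 < mean < 9) = {p_between:.4f}")
```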


T4_Q3: A random sample of 50 households was selected for a telephone survey. The key question asked was
"Do you or any member of the household own a smart phone?" Of the 50 respondents, 35 said "yes" and 15 said
"no". If the population proportion (π) is 0.60:

(a) Determine the sample proportion, p, of households with smart phones.

p = 35 /50 = 0.70

(b) Determine the standard error of the proportion.


$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.60(1 - 0.60)}{50}} = 0.0693$$

(c) What is the probability that more than 65 % of households will have smart phones?

$$P(p > 0.65) = P\left(Z > \frac{0.65 - 0.60}{0.0693}\right) = P(Z > 0.72) = 1 - P(Z < 0.72) = 1 - 0.7642 = 0.2358$$
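
The proportion calculations in T4_Q3 follow the same pattern as for the sample mean. The sketch below uses the unrounded z-score, so its tail probability is close to, but not identical to, the table-based 0.2358; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.60, 50  # population proportion and sample size

# (a) Sample proportion
p_hat = 35 / n

# (b) Standard error of the proportion
std_error = sqrt(pi * (1 - pi) / n)

# (c) P(p > 0.65) using the normal approximation
z = (0.65 - pi) / std_error
p_above_65 = 1 - NormalDist().cdf(z)

print(f"p = {p_hat:.2f}")
print(f"standard error = {std_error:.4f}")
print(f"z = {z:.2f}, P(p > 0.65) = {p_above_65:.4f}")
```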


TOPIC 5

T5_Q1: The marketing manager of a local department store wants to know the mean amount spent per customer
during the Thursday (6-8pm) late night shopping. A sample of 64 customers is taken. The sample mean is $140.50
and the standard deviation is $17.25.

1) Construct a 95% confidence interval for the mean amount spent per customer. Interpret the results.
σ is unknown, hence $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$, therefore the confidence limits for µ are:

$$\bar{X} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$$

$$140.50 \pm 1.9983 \times \frac{17.25}{\sqrt{64}}$$

$$140.50 \pm 4.31$$

($136.19, $144.81)

Alternatively, you could also use $\bar{X} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$ by invoking the CLT. The final answer is:

$$140.50 \pm 1.96 \times \frac{17.25}{\sqrt{64}}$$

($136.27, $144.73)

We are 95% confident that the unknown mean amount spent per customer during Thursday late night shopping is
between $136.19 and $144.81.
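
Both versions of the interval in T5_Q1 share the same structure: sample mean plus or minus a critical value times the standard error. The sketch below makes that explicit with a hypothetical helper `mean_ci`. The z critical value comes from the standard library; the t critical value of 1.9983 (63 degrees of freedom) is copied from the solution, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, s, n, critical):
    """Confidence limits: xbar +/- critical * s / sqrt(n)."""
    margin = critical * s / sqrt(n)
    return xbar - margin, xbar + margin

xbar, s, n = 140.50, 17.25, 64

# t-based interval; 1.9983 is the t critical value quoted in the solution
t_low, t_high = mean_ci(xbar, s, n, 1.9983)

# z-based interval (CLT); critical value from the standard normal
z_crit = NormalDist().inv_cdf(0.975)
z_low, z_high = mean_ci(xbar, s, n, z_crit)

print(f"t interval: (${t_low:.2f}, ${t_high:.2f})")
print(f"z interval: (${z_low:.2f}, ${z_high:.2f})")
```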

T5_Q2: If the manager of a paint supply store wants to estimate the mean amount of paint in a 4-litre can to within ± 0.015
litres with 95% confidence and also assumes that the standard deviation is 0.075 litres, what sample size is needed?

$$n \geq \left[\frac{Z_{\alpha/2} \times \sigma}{e}\right]^2$$

$$n \geq \left[\frac{1.96 \times 0.075}{0.015}\right]^2 = 96.04, \text{ so } n = 97$$
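
The sample size formula always rounds up, because a fractional observation is not possible and rounding down would miss the required precision. A minimal sketch, with illustrative variable names:

```python
from math import ceil

z = 1.96        # critical value for 95% confidence
sigma = 0.075   # assumed standard deviation (litres)
e = 0.015       # acceptable sampling error (litres)

# Minimum n satisfying n >= (z * sigma / e)^2, rounded up to a whole can
raw_n = (z * sigma / e) ** 2
n_required = ceil(raw_n)

print(f"raw value = {raw_n:.2f}, sample size needed = {n_required}")
```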


TOPIC 6
T6_Q1

1) A Type I error is committed when


A) we reject a null hypothesis that is false.
B) we don't reject a null hypothesis that is true.
C) we don't reject a null hypothesis that is false.
D) we reject a null hypothesis that is true.

2) Which of the following would be an appropriate alternative hypothesis?


A) The mean of a population is equal to 100.
B) The mean of a sample is greater than 100.
C) The mean of a population is greater than 100.
D) The mean of a sample is equal to 100.

3) If an economist wishes to determine whether there is evidence that mean family income in a community exceeds
$50,000
A) a one-tail test should be utilized.
B) either a one-tail or two-tail test could be used with equivalent results.
C) a two-tail test should be utilized.
D) None of the above.

4) If the p-value is less than α in a two-tail test,


A) a one-tail test should be used.
B) the null hypothesis should be rejected.
C) the null hypothesis should not be rejected.
D) no conclusion should be reached.

5) A ______________ is a numerical quantity computed from the data of a sample and is used in reaching a decision on
whether or not to reject the null hypothesis.
A) significance level
B) critical value
C) parameter
D) test statistic

6) The probability of making an incorrect decision in hypothesis testing is defined by the:


A) significance level
B) critical value
C) parameter
D) test statistic

7) The value(s) obtained from either Table E2 or Table E3 are called the:
A) significance level
B) critical value
C) parameter
D) test statistic


T6_Q2: The owner of a local nightclub has recently surveyed a random sample of n = 250 customers of the club. She would
now like to determine whether or not the mean age of her customers is over 30. If so, she plans to alter the
entertainment to appeal to an older crowd. If not, no entertainment changes will be made. Suppose she found that
the sample mean was 30.45 years and the sample standard deviation was 5 years. Use 5% level of significance to test
whether the mean age is over 30.

X = Age of customers 𝑋̅ = 30.45, S = 5, n = 250, α = 0.05

H0: µ ≤ 30

H1: µ > 30

Justify your choice of test statistic, state any required assumptions, and calculate:

The sample size is large (n > 30), so we can invoke the CLT; hence the test statistic to use is:

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

$$Z = \frac{30.45 - 30}{5/\sqrt{250}} = \frac{0.45}{0.316} = 1.42 \text{ (2 dcp)}$$

Decision Rule and Decision:

With α = 0.05, we reject H0 if Z > 1.645, otherwise do not reject H0.

Since 1.42 < 1.645, we do not reject H0.

Conclusion:

At 5% level of significance, we do not have sufficient evidence to support that the mean age is over 30 years old.
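
The upper-tail test in T6_Q2 can be sketched in code as follows. The critical-value decision mirrors the solution; the p-value is an additional, equivalent way to reach the same decision and is not part of the original working.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, s, n, alpha = 30.45, 30, 5, 250, 0.05

# Test statistic for H0: mu <= 30 against H1: mu > 30
z = (xbar - mu0) / (s / sqrt(n))

# Upper-tail critical value and decision rule: reject H0 if z > z_crit
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_crit

# Equivalent p-value approach: reject H0 if p_value < alpha
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```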


TOPIC 7

T7_Q1: A computer software developer would like to use the number of downloads (in thousands) for the trial version of
his new shareware to predict the amount of revenue (in thousands of dollars) he can make on the full version of the
new shareware. Following is the output from a simple linear regression obtained from a data set of 30 different
sharewares that he has developed:

[Regression output (ANOVA table) not reproduced here.]

1) The dependent variable for this problem is:


a) Number of Downloads (thousands)
b) Amount of revenue (in thousand dollars)
c) Intercept
d) None of the above.

2) The independent variable for this problem is:


a) Number of Downloads (thousands)
b) Amount of revenue (in thousand dollars)
c) Intercept
d) None of the above.
3) The Y-intercept (b0) represents the
a) Predicted value of Y.
b) change in estimated average Y per unit change in X.
c) variation around the sample regression line.
d) predicted value of Y when X = 0.
4) The slope (b1) represents
a) the estimated average change in Y per unit change in X.
b) variation around the line of regression.
c) the predicted value of Y.
d) predicted value of Y when X = 0.
5) The least squares method minimizes which of the following?
a) SST (total sum of squares)
b) SSR (regression sum of squares)
c) SSE (error sum of squares)
d) All of the above.


6) Which of the following is the correct interpretation for the slope coefficient?
a) For each increase of 1 dollar in expected revenue, the expected number of downloads is estimated to increase
by 3.7297.
b) For each increase of 1 download, the expected revenue is estimated to increase by $ 3.7297.
c) For each increase of 1 thousand downloads, the expected revenue is estimated to increase by $ 3.7297
thousands.
d) For each increase of 1 thousand dollars in expected revenue, the expected number of downloads is estimated
to increase by 3.7297 thousands.

7) What is the predicted revenue (in thousand dollars) when the number of downloads is 30 thousands?
a) 16.8296
b) 111.891
c) -95.0614
d) None of the above

8) Which of the following is the correct interpretation for the coefficient of determination?
a) 74.67% of the variation in the number of downloads can be explained by the variation in revenue.
b) 75.54% of the variation in the number of downloads can be explained by the variation in revenue.
c) 74.67% of the variation in revenue can be explained by the variation in the number of downloads.
d) 75.54% of the variation in revenue can be explained by the variation in the number of downloads.

9) Which of the following is the correct alternative hypothesis for testing whether there is a linear relationship between
revenue and the number of downloads?
a) H1 : β1 ≠ 0
b) H1 : b1 ≠ 0
c) H1 : β1 = 0
d) H1 : b1 = 0

10) There is sufficient evidence that revenue and the number of downloads are linearly related at a 5% level of
significance.
a) True
b) False
