Chapter#5
Statistical Inference: Estimation and Hypothesis Testing for Single Populations
University of Economics
Ho Chi Minh City
Dang Van Thac 1/28
Outline
[Figure: a random sample is selected from the population; μ is a population parameter, and the corresponding value computed from the sample is a statistic.]
Estimating the Population Mean Using the Z Statistic
(Population Variance is known)
• A point estimate is a statistic taken from a sample that is used to estimate a population parameter.
• An interval estimate (confidence interval) is a range of values within which the analyst can declare, with some confidence, the population parameter lies.
[Figure: two samples (Sample 1, Sample 2) drawn from the same population give two different point estimates, x̄1 and x̄2.]
Example: A survey was taken of U.S. companies that do business with firms in India. One of the
questions on the survey was: Approximately how many years has your company been trading with
firms in India? A random sample of 44 responses to this question yielded a mean of 10.455 years.
Suppose the population standard deviation for this question is 7.7 years. Using this information,
construct a 90% confidence interval for the mean number of years that a company has been trading
in India for the population of U.S. companies trading with firms in India.
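The interval for this example can be checked numerically. The sketch below assumes the usual formula x̄ ± z(α/2)·σ/√n for a known population standard deviation; the function name `z_confidence_interval` is illustrative only and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - confidence
    # Two-sided critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the example: n = 44, sample mean = 10.455, sigma = 7.7, 90% confidence.
lower, upper = z_confidence_interval(10.455, 7.7, 44, 0.90)
print(f"{lower:.2f} <= mu <= {upper:.2f}")  # 8.55 <= mu <= 12.36
```

Under these assumptions the analyst would be 90% confident that the mean trading time lies between roughly 8.55 and 12.36 years.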
• The t Distribution: used instead of the z distribution for inferential statistics on the
population mean when the population standard deviation is unknown and the population is
normally distributed.
• Characteristics of the t Distribution: Like the standard normal curve, t distributions are
symmetric, unimodal, and a family of curves. The t distributions are flatter in the middle and have
more area in their tails than the standard normal distribution.
• Reading the t Distribution: To find a value in the t distribution table requires knowing the
degrees of freedom; each different value of degrees of freedom is associated with a different t
distribution.
• Degrees of freedom: the number of independent observations for a source of variation
minus the number of independent parameters estimated in computing the variation. For the
one-sample t statistic, degrees of freedom = n − 1.
Example: A survey was taken of U.S. companies that do business with firms in India. One of the
questions on the survey was: Approximately how many years has your company been trading with
firms in India? A random sample of 18 responses to this question yielded a mean of 13.56 years.
Suppose the population standard deviation for this question is unknown. The sample standard deviation is
7.8 years. Using this information, construct a 90% confidence interval for the mean number of years
that a company has been trading in India for the population of U.S. companies trading with firms in
India.
n = 18, x̄ = 13.56, s = 7.8, 90% confidence interval ⇒ t(α/2, n−1) = t(0.05, 17) = 1.740
$$13.56 - 1.740\,\frac{7.8}{\sqrt{18}} \le \mu \le 13.56 + 1.740\,\frac{7.8}{\sqrt{18}} \;\Rightarrow\; 10.36 \le \mu \le 16.76$$
Estimating the Population Mean Using the t Statistic
(Population Variance is Unknown)
Example: The owner of a large equipment rental company wants to make a rather quick estimate of the
average number of days a piece of ditchdigging equipment is rented out per person per time. The
company has records of all rentals, but the amount of time required to conduct an audit of all accounts
would be prohibitive. The owner decides to take a random sample of rental invoices. Fourteen different
rentals of ditchdiggers are selected randomly from the files, yielding the following data (3 1 3 2 5 1
2 1 4 2 1 3 1 1). She uses these data to construct a 99% confidence interval to estimate the
average number of days that a ditchdigger is rented and assumes that the number of days per rental is
normally distributed in the population.
$$2.14 - 3.012\left(\frac{1.29}{\sqrt{14}}\right) \le \mu \le 2.14 + 3.012\left(\frac{1.29}{\sqrt{14}}\right) \;\Rightarrow\; 1.10 \le \mu \le 3.18$$
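The ditchdigger interval can be reproduced from the raw data. This is a minimal sketch using only the Python standard library; because the standard library has no t quantile function, the critical value 3.012 (t with 13 degrees of freedom and 0.005 in the upper tail) is taken from the t table as a constant, which is an assumption of this example.

```python
from math import sqrt
from statistics import mean, stdev

# Rental durations (days) for the 14 sampled ditchdigger invoices.
days = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]

# Table value t_{0.005, 13} for a 99% interval with df = 14 - 1 (assumed constant).
T_CRITICAL = 3.012

n = len(days)
x_bar = mean(days)   # sample mean, about 2.14
s = stdev(days)      # sample standard deviation (n - 1 divisor), about 1.29
margin = T_CRITICAL * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(f"{lower:.2f} <= mu <= {upper:.2f}")  # 1.10 <= mu <= 3.18
```

Computing the mean and standard deviation from the data rather than typing in the rounded values 2.14 and 1.29 avoids compounding rounding error; the interval still rounds to the same endpoints.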
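$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
where p̂ = sample proportion, q̂ = 1 − p̂, p = population proportion, n = sample size.
With p̂ = 0.39, n = 87, and z(α/2) = 1.96:
$$0.39 - 1.96\sqrt{\frac{0.39 \times 0.61}{87}} \le p \le 0.39 + 1.96\sqrt{\frac{0.39 \times 0.61}{87}} \;\Rightarrow\; 0.29 \le p \le 0.49$$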
Estimating The Population Proportion
Example: A clothing company produces men’s jeans. The jeans are made and sold with either a
regular cut or a boot cut. In an effort to estimate the proportion of their men’s jeans market in
Oklahoma City that prefers boot-cut jeans, the analyst takes a random sample of 212 jeans sales from
the company’s two Oklahoma City retail outlets. Only 34 of the sales were for boot-cut jeans.
Construct a 90% confidence interval to estimate the proportion of the population in Oklahoma City
who prefer boot-cut jeans.
$$0.16 - 1.645\sqrt{\frac{0.16 \times 0.84}{212}} \le p \le 0.16 + 1.645\sqrt{\frac{0.16 \times 0.84}{212}} \;\Rightarrow\; 0.12 \le p \le 0.20$$
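The boot-cut jeans interval can be verified with a few lines of code. The sketch below assumes the large-sample (normal approximation) interval p̂ ± z(α/2)·√(p̂q̂/n); the function name `proportion_confidence_interval` is hypothetical. It uses the unrounded p̂ = 34/212 rather than 0.16, so intermediate digits differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Two-sided critical value z_{alpha/2}.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# 34 boot-cut sales out of 212 sampled sales, 90% confidence.
lower, upper = proportion_confidence_interval(34, 212, 0.90)
print(f"{lower:.2f} <= p <= {upper:.2f}")  # 0.12 <= p <= 0.20
```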
• One-tailed tests: are always directional, and the alternative hypothesis uses either the greater than
(>) or the less than (<) sign.
Example: Suppose a company has held an 18% share of the market. However, because of an
increased marketing effort, company officials believe the company’s market share is now greater
than 18%, and the officials would like to prove it.
⇒ H0: p = 0.18, Ha: p > 0.18
Introduction To Hypotheses Testing
• Substantive Hypotheses: In testing a statistical hypothesis, a business researcher reaches a conclusion
based on the data obtained in the study. If the null hypothesis is rejected and therefore the alternative
hypothesis is accepted, it is common to say that a statistically significant result has been obtained.
However, this statistically significant result may not be a significant business outcome.
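Example: In the market share study, suppose a large sample of potential customers is taken, and a
sample market share of 18.2% is obtained. With a large enough sample, this result may be statistically
significantly greater than 18%, yet an increase of 0.2 percentage points may have little business importance.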
HTAB Approach
Task 1: Establishing the hypotheses
Task 2: Conducting the test
Task 3: Taking statistical action
Task 4: Determining the business implications
Eight-step Approach
Step 1: Establish a null and alternative hypothesis.
Step 2: Determine the appropriate statistical test.
Step 3: Set the value of alpha, the Type I error rate.
Step 4: Establish the decision rule.
Step 5: Gather sample data.
Step 6: Analyze the data.
Step 7: Reach a statistical conclusion.
Step 8: Make a business decision.
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Example: A survey of CPAs across the United States found that the average net
income for sole proprietor CPAs is $74,914. Because this survey is now more than
ten years old, an accounting researcher wants to test this figure by taking a random
sample of 112 sole proprietor accountants in the United States to determine whether
the net income figure changed. Assume the population standard deviation of net
incomes for sole proprietor CPAs is $14,530 and the sample mean is $78,695.
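• Step 1: H0: μ = 74,914, Ha: μ ≠ 74,914 (two-tailed, because the researcher asks only whether the figure has changed).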
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{78{,}695 - 74{,}914}{14{,}530/\sqrt{112}} = 2.75$$
Example: In the CPA net income example, suppose only 600 sole
proprietor CPAs practice in the United States.
$$z = \frac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}} = \frac{78{,}695 - 74{,}914}{\dfrac{14{,}530}{\sqrt{112}}\sqrt{\dfrac{600-112}{600-1}}} = \frac{3{,}781}{1{,}239.2} = 3.05$$
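Both versions of the CPA net income test, with and without the finite population correction, can be sketched as follows. The helper names `z_statistic` and `two_tailed_p_value` are illustrative, and the significance level α = 0.05 is an assumption, since the slides do not state it for this example.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n, population_size=None):
    """z statistic for a mean with known sigma; optional finite correction."""
    standard_error = sigma / sqrt(n)
    if population_size is not None:
        # Finite population correction factor sqrt((N - n) / (N - 1)).
        standard_error *= sqrt((population_size - n) / (population_size - 1))
    return (sample_mean - mu0) / standard_error


def two_tailed_p_value(z):
    """Probability of a |Z| at least this large under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


ALPHA = 0.05  # assumed significance level, not given on the slide

z_infinite = z_statistic(78_695, 74_914, 14_530, 112)
z_finite = z_statistic(78_695, 74_914, 14_530, 112, population_size=600)

for label, z in [("no correction", z_infinite), ("N = 600", z_finite)]:
    p = two_tailed_p_value(z)
    print(f"{label}: z = {z:.2f}, p = {p:.4f}, reject H0: {p < ALPHA}")
```

The correction shrinks the standard error because a sample of 112 is a sizeable fraction of a population of 600, which makes the observed z larger (about 3.05 instead of 2.75).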
Example: Figures released by the U.S. Department of Agriculture show that the average size of farms has
increased since 1940. In 1940, the mean size of a farm was 174 acres; by 1997, the average size was 471
acres. Between those years, the number of farms decreased but the amount of tillable land remained
relatively constant, so now farms are bigger. This trend might be explained, in part, by the inability of
small farms to compete with the prices and costs of large-scale operations and to produce a level of income
necessary to support the farmers’ desired standard of living. Suppose an agribusiness researcher believes
the average size of farms has now increased from the 1997 mean figure of 471 acres. To test this notion,
she randomly sampled 23 farms across the United States and ascertained the size of each farm from county
records. The data she gathered follow. Use a 5% level of significance to test her hypothesis. Assume that
number of acres per farm is normally distributed in the population.
445 489 474 505 553 477 454 463 466 545 590 560
557 502 449 438 500 466 477 557 433 511 561
Testing Hypotheses About A Population Mean
(t Statistic, Population Variance is Unknown)
n = 23, μ = 471, df = 23 − 1 = 22, α = 0.05
Step 2: The t statistical test is appropriate.
Step 3: The value of α is 0.05.
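The remaining steps of the farm-size test can be sketched in code. This example assumes the one-tailed hypotheses H0: μ = 471 and Ha: μ > 471 implied by the researcher's belief that farm size has increased, and it takes the critical value t(0.05, 22) = 1.717 from the t table as a constant because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

# Farm sizes (acres) for the 23 sampled farms.
acres = [445, 489, 474, 505, 553, 477, 454, 463, 466, 545, 590, 560,
         557, 502, 449, 438, 500, 466, 477, 557, 433, 511, 561]

MU_0 = 471          # hypothesized mean under H0 (the 1997 figure)
T_CRITICAL = 1.717  # table value t_{0.05, 22}, upper tail (assumed constant)

n = len(acres)
x_bar = mean(acres)
s = stdev(acres)  # sample standard deviation (n - 1 divisor)
t_observed = (x_bar - MU_0) / (s / sqrt(n))

# Decision rule for the upper-tailed test: reject H0 if t exceeds the critical value.
reject_h0 = t_observed > T_CRITICAL

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, t = {t_observed:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these data the observed t is about 2.84, which exceeds 1.717, so under the stated assumptions the null hypothesis is rejected at the 5% level.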