4 - Estimation
Parameters
Point Estimate for Population μ
Point Estimate
• A single value estimate for a population parameter
• The most unbiased point estimate of the population mean μ is the sample
mean x̄.
9 20 18 16 9 9 11 13 22 16 5 18 6 6 5 12 25
17 23 7 10 9 10 10 5 11 18 18 9 9 17 13 11 7
14 6 11 12 11 6 12 14 11 9 18 12 12 17 11 20
Solution: Point Estimate for Population μ
The sample mean of the data is
$\bar{x} = \frac{\sum x}{n} = \frac{620}{50} = 12.4$
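The arithmetic above can be checked with a short Python sketch. The list name `sentences` is an illustrative choice, and the reading of the values as sentence counts per magazine advertisement is taken from the later margin-of-error example rather than stated with the data.

```python
# The 50 sample values from the slide (assumed to be sentences per advertisement).
sentences = [
    9, 20, 18, 16, 9, 9, 11, 13, 22, 16, 5, 18, 6, 6, 5, 12, 25,
    17, 23, 7, 10, 9, 10, 10, 5, 11, 18, 18, 9, 9, 17, 13, 11, 7,
    14, 6, 11, 12, 11, 6, 12, 14, 11, 9, 18, 12, 12, 17, 11, 20,
]

n = len(sentences)      # sample size
total = sum(sentences)  # sum of all observations
x_bar = total / n       # sample mean, the point estimate of mu

print(n, total, x_bar)  # 50 620 12.4
```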
Interval Estimate
Interval estimate
• An interval, or range of values, used to estimate a population parameter.
[Figure: number line showing the point estimate 12.4 as a dot inside a bracketed interval estimate.]
[Figure: standard normal curve with central area c = 0.90 between −zc = −1.645 and zc = 1.645, centered at z = 0.]
The corresponding z-scores are ±1.645.
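As a rough check on the critical values, the following Python sketch recovers zc from a confidence level c using the standard library's `statistics.NormalDist`. The helper name `z_critical` is hypothetical and not part of the slides.

```python
from statistics import NormalDist


def z_critical(c: float) -> float:
    """Return z_c such that the area between -z_c and z_c is c."""
    # The area to the left of z_c is c plus one tail: c + (1 - c) / 2.
    return NormalDist().inv_cdf((1 + c) / 2)


print(round(z_critical(0.90), 3))  # about 1.645
print(round(z_critical(0.95), 3))  # about 1.96
```

The same helper works for any confidence level between 0 and 1.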
Sampling Error
Sampling error
• The difference between the point estimate and the actual population
parameter value.
• For μ:
the sampling error is the difference x̄ − μ
μ is generally unknown
x̄ varies from sample to sample
Margin of error
Margin of Error
• The greatest possible distance between the point estimate and the value
of the parameter it is estimating for a given level of confidence, c.
• Denoted by E.
Example: Finding the Margin of Error
Use the magazine advertisement data and a 95%
confidence level to find the margin of error for the
mean number of sentences in all magazine
advertisements. Assume the sample standard deviation
is about 5.0.
Solution: Finding the Margin of Error
• First find the critical values
[Figure: standard normal curve with central area 0.95 and an area of 0.025 in each tail; −zc = −1.96, z = 0, zc = 1.96.]
95% of the area under the standard normal curve falls within
1.96 standard deviations of the mean. (You can approximate
the distribution of the sample means with a normal curve by
the Central Limit Theorem, because n ≥ 30.)
$E = z_c \frac{\sigma}{\sqrt{n}} \approx z_c \frac{s}{\sqrt{n}} = 1.96 \cdot \frac{5.0}{\sqrt{50}} \approx 1.4$
You don’t know σ, but since n ≥ 30, you can use s in place of σ.
You are 95% confident that the margin of error for the
population mean is about 1.4 sentences.
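The margin-of-error calculation for this example can be sketched in Python as follows. The function name `margin_of_error` is illustrative, and s = 5.0 stands in for σ as the example assumes.

```python
import math


def margin_of_error(z_c: float, sigma: float, n: int) -> float:
    """E = z_c * sigma / sqrt(n)."""
    return z_c * sigma / math.sqrt(n)


# Magazine advertisement example: 95% confidence, s = 5.0 used for sigma, n = 50.
E = margin_of_error(z_c=1.96, sigma=5.0, n=50)
print(round(E, 1))  # 1.4
```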
Confidence Intervals for the Population Mean
Constructing Confidence Intervals for μ
Finding a Confidence Interval for a Population Mean
(n ≥ 30 or σ known with a normally distributed population)
In Words In Symbols
Example: Constructing a Confidence
Interval σ Known
A college admissions director wishes to estimate the mean
age of all students currently enrolled. In a random sample of
20 students, the mean age is found to be 22.9 years. From
past studies, the standard deviation is known to be 1.5
years, and the population is normally distributed. Construct
a 90% confidence interval of the population mean age.
Solution: Constructing a Confidence
Interval σ Known
• First find the critical values
[Figure: standard normal curve with central area c = 0.90 between −zc = −1.645 and zc = 1.645, centered at z = 0.]
zc = 1.645
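• Margin of error:
$E = z_c \frac{\sigma}{\sqrt{n}} = 1.645 \cdot \frac{1.5}{\sqrt{20}} \approx 0.6$
• Confidence interval:
[Figure: number line with left endpoint x̄ − E = 22.3, point estimate x̄ = 22.9, and right endpoint x̄ + E = 23.5.]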
With 90% confidence, you can say that the mean age
of all the students is between 22.3 and 23.5 years.
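The interval for the admissions example can be reproduced with a minimal Python sketch. The function name `z_interval` is hypothetical. Note that this version keeps E unrounded until the end, whereas the slide rounds E to 0.6 first; both routes give the same endpoints to one decimal place.

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z_c: float) -> tuple[float, float]:
    """Confidence interval for mu when sigma is known: x_bar -/+ z_c * sigma / sqrt(n)."""
    E = z_c * sigma / math.sqrt(n)  # margin of error
    return x_bar - E, x_bar + E


# Admissions example: x_bar = 22.9, sigma = 1.5, n = 20, 90% confidence.
left, right = z_interval(x_bar=22.9, sigma=1.5, n=20, z_c=1.645)
print(round(left, 1), round(right, 1))  # 22.3 23.5
```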
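• Given a c-confidence level and a margin of error E, the minimum
sample size n needed to estimate the population mean μ is
$n = \left( \frac{z_c \sigma}{E} \right)^2$
• This follows from solving the margin-of-error formula for n:
$E = z_c \frac{\sigma}{\sqrt{n}} \;\Rightarrow\; E^2 = \frac{z_c^2 \sigma^2}{n} \;\Rightarrow\; n = \frac{z_c^2 \sigma^2}{E^2}$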
Example: Sample Size
You want to estimate the mean number of sentences in a
magazine advertisement. How many magazine advertisements
must be included in the sample if you want to be 95% confident
that the sample mean is within one sentence of the population
mean? Assume the sample standard deviation is about 5.0.
Solution: Sample Size
• First find the critical values
[Figure: standard normal curve with central area 0.95 and an area of 0.025 in each tail; −zc = −1.96, z = 0, zc = 1.96.]
zc = 1.96
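zc = 1.96, σ ≈ s = 5.0, E = 1
$n = \left( \frac{z_c \sigma}{E} \right)^2 \approx \left( \frac{1.96 \cdot 5.0}{1} \right)^2 = 96.04$
When necessary, round up to obtain a whole number. Here, include at least 97 magazine advertisements in the sample.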
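The sample-size rule, including the round-up step, might be sketched in Python like this. The function name `min_sample_size_mean` is an illustrative choice, and s = 5.0 is used in place of σ as in the example.

```python
import math


def min_sample_size_mean(z_c: float, sigma: float, E: float) -> tuple[float, int]:
    """Return the raw value (z_c * sigma / E)^2 and that value rounded up."""
    raw = (z_c * sigma / E) ** 2
    return raw, math.ceil(raw)  # always round up so the margin of error is met


raw, n_min = min_sample_size_mean(z_c=1.96, sigma=5.0, E=1.0)
print(round(raw, 2), n_min)  # 96.04 97
```

Rounding down to 96 would leave the margin of error slightly larger than one sentence, which is why the ceiling is used rather than ordinary rounding.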
The t-Distribution
• When the population standard deviation is unknown, the sample size is
less than 30, and the random variable x is approximately normally
distributed, it follows a t-distribution.
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
• Critical values of t are denoted by tc.
In statistics, the t-distribution was first derived in 1876 by Helmert
and Lüroth. In the English-language literature it takes its name
from William Sealy Gosset's 1908 paper in Biometrika under the
pseudonym "Student", published while he worked at the Guinness
Brewery in Dublin, Ireland. One version of the origin of the
pseudonym is that Gosset's employer forbade members of its staff
from publishing scientific papers, so he had to hide his identity.
Another version is that Guinness did not want their competition to
know that they were using the t-test to test the quality of raw
material. The t-test and the associated theory became well-known
through the work of Ronald A. Fisher, who called the distribution
"Student's distribution".
T-distribution’s mathematical formulas:
Properties of the t-Distribution
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are equal to zero.
5. As the degrees of freedom increase, the t-distribution approaches the normal
distribution. After 30 d.f., the t-distribution is very close to the standard
normal z-distribution.
Solution: Critical Values of t
95% of the area under the t-distribution curve with 14 degrees of freedom
lies between t = ±2.145.
[Figure: t-distribution curve with central area c = 0.95 between −tc = −2.145 and tc = 2.145.]
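Critical values of t are normally read from a table. As a rough cross-check, the sketch below approximates tc numerically with only the Python standard library: it integrates the standard Student's t density with Simpson's rule and then bisects on the central area. The function names and the step count are illustrative assumptions, and this numerical route is not the method used in the slides.

```python
import math


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t: float, df: int, steps: int = 2000) -> float:
    """Area under the t-curve between -t and t (composite Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # by symmetry, twice the area from 0 to t


def t_critical(c: float, df: int) -> float:
    """Find t_c whose central area is c, by bisection."""
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < c:  # widen the bracket until it contains t_c
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 14), 3))  # should agree with the table value 2.145
```

Because `math.gamma` overflows for large arguments, this sketch is only suitable for modest degrees of freedom; for large d.f. the normal critical value is already a close approximation.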
Confidence Intervals for the Population Mean
In Words In Symbols
$E = t_c \frac{s}{\sqrt{n}}$
Example: Constructing a Confidence
Interval
You randomly select 16 coffee shops and measure the temperature of the
coffee sold at each. The sample mean temperature is 162.0ºF with a sample
standard deviation of 10.0ºF. Find the 95% confidence interval for the
mean temperature. Assume the temperatures are approximately normally
distributed.
Solution:
Use the t-distribution (n < 30, σ is unknown,
temperatures are approximately normally distributed.)
Solution: Constructing a Confidence
Interval
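• Margin of error (d.f. = n − 1 = 15, so tc = 2.131):
$E = t_c \frac{s}{\sqrt{n}} = 2.131 \cdot \frac{10}{\sqrt{16}} \approx 5.3$
• Confidence interval:
x̄ − E ≈ 162.0 − 5.3 = 156.7 and x̄ + E ≈ 162.0 + 5.3 = 167.3, so 156.7 < μ < 167.3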
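The coffee-temperature interval can be worked through with a short Python sketch. The function name `t_interval` is hypothetical, and the critical value tc = 2.131 (15 degrees of freedom, 95% confidence) is passed in as given rather than computed.

```python
import math


def t_interval(x_bar: float, s: float, n: int, t_c: float) -> tuple[float, float, float]:
    """Return (E, left, right) for the interval x_bar -/+ t_c * s / sqrt(n)."""
    E = t_c * s / math.sqrt(n)  # margin of error
    return E, x_bar - E, x_bar + E


# Coffee example: x_bar = 162.0 F, s = 10.0 F, n = 16, t_c = 2.131.
E, left, right = t_interval(x_bar=162.0, s=10.0, n=16, t_c=2.131)
print(round(E, 1), round(left, 1), round(right, 1))  # 5.3 156.7 167.3
```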
Population Proportion
• The probability of success in a single trial of a binomial experiment.
• Denoted by p
Point Estimate for p
• The proportion of successes in a sample.
• Denoted by p̂, read as “p hat”:
$\hat{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{number in sample}}$
Point Estimate for Population p
Estimate Population Parameter…   with Sample Statistic
Proportion: p                    p̂
Example: Point Estimate for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch
is football. Find a point estimate for the population proportion of U.S.
adults who say their favorite sport to watch is football. (Adapted from
The Harris Poll)
Constructing Confidence Intervals for p
In Words In Symbols
Example: Confidence Interval for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to watch
is football. Construct a 95% confidence interval for the proportion of
adults in the United States who say that their favorite sport to watch is
football.
Solution: Confidence Interval for p
• Verify the sampling distribution of p̂ can be approximated by the normal
distribution
• 0.265 < p < 0.315
[Figure: number line with left endpoint p̂ − E = 0.265, point estimate p̂ = 0.29, and right endpoint p̂ + E = 0.315.]
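The football-survey interval can be retraced with a minimal Python sketch. Two assumptions are made here: p̂ is rounded to two decimals before computing E, which matches the 0.29 shown on the slide, and the normal approximation is checked with the common rule of thumb np̂ ≥ 5 and nq̂ ≥ 5. Variable names are illustrative.

```python
import math

x, n = 354, 1219  # successes and sample size from the survey
z_c = 1.96        # critical value for 95% confidence

p_hat = round(x / n, 2)  # 0.29, rounded as on the slide (an assumption)
q_hat = 1 - p_hat

# Assumed rule of thumb for using the normal approximation.
normal_ok = n * p_hat >= 5 and n * q_hat >= 5

E = z_c * math.sqrt(p_hat * q_hat / n)  # margin of error for a proportion
left, right = p_hat - E, p_hat + E
print(normal_ok, round(left, 3), round(right, 3))  # True 0.265 0.315
```

Using the unrounded p̂ = 354/1219 shifts the upper endpoint slightly (to about 0.316), so small differences from the slide's figures come from rounding order, not from the method.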
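Sample Size:
• Solving the margin-of-error formula for n gives the minimum sample size needed to estimate p:
$E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} \;\Rightarrow\; E^2 = \frac{z_c^2 \hat{p}\hat{q}}{n} \;\Rightarrow\; n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2$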
• If no preliminary estimate is available, use p̂ = 0.5 and q̂ = 0.5.
Example: Sample Size
You are running a political campaign and wish to
estimate, with 95% confidence, the proportion of
registered voters who will vote for your candidate. Your
estimate must be accurate within 3% of the true
population proportion. Find the minimum sample size needed if
1. no preliminary estimate is available.
Solution:
Because you do not have a preliminary estimate
for p̂, use p̂ = 0.5 and q̂ = 0.5.
Solution: Sample Size
• c = 0.95 zc = 1.96 E = 0.03
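$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.5)(0.5) \left( \frac{1.96}{0.03} \right)^2 \approx 1067.11$
Rounding up, the minimum sample size is 1068 voters.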
Solution:
Use the preliminary estimate p̂ = 0.31
q̂ = 1 − p̂ = 1 − 0.31 = 0.69
• c = 0.95 zc = 1.96 E = 0.03
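$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.31)(0.69) \left( \frac{1.96}{0.03} \right)^2 \approx 913.02$
Rounding up, the minimum sample size is 914 voters.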
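Both versions of the campaign example can be compared side by side with a short Python sketch. The function name `min_sample_size_proportion` is hypothetical; its default of p̂ = 0.5 reflects the no-preliminary-estimate case described above.

```python
import math


def min_sample_size_proportion(z_c: float, E: float, p_hat: float = 0.5) -> tuple[float, int]:
    """Return the raw value p_hat * q_hat * (z_c / E)^2 and that value rounded up."""
    q_hat = 1 - p_hat
    raw = p_hat * q_hat * (z_c / E) ** 2
    return raw, math.ceil(raw)


# No preliminary estimate: fall back to p_hat = 0.5.
raw_a, n_a = min_sample_size_proportion(z_c=1.96, E=0.03)
# Preliminary estimate p_hat = 0.31.
raw_b, n_b = min_sample_size_proportion(z_c=1.96, E=0.03, p_hat=0.31)

print(round(raw_a, 2), n_a)  # 1067.11 1068
print(round(raw_b, 2), n_b)  # 913.02 914
```

The product p̂q̂ is largest at p̂ = 0.5, so the no-estimate case gives the most conservative (largest) sample size.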
Sample Problems
4. The time required to finish an assembly job is believed to
be normally distributed with a standard deviation of 16
minutes. How large a sample is required if we want to have a
probability of .90 that the sample mean will differ from the true
mean by at most 2.2 minutes?