Confidence Interval
1. Introduction
• The two general areas of statistical inference are:
1) estimation of parameter(s), ch. 9
2) hypothesis testing of parameter(s), ch. 10
• Question: Are there other reasonable point estimates (other than x̄) for µ?
• The point estimate does not give us a sense of how close x̄ might be to µ, i.e. it
doesn’t tell us about the inherent uncertainty in the process. Therefore, interval
estimates are used, which provide a range of values that contains the parameter,
say µ, with some specified degree of confidence. This range is called a
confidence interval.
• Substituting (1) into (2), and using some algebra, one can show that (2) implies
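\[ P\left( \bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \qquad (3) \]
e.g. letting α = .05,
\[ P\left( \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right) = .95 \]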
Then the best point estimate of µ is 217. Also, a 95% confidence interval for
µ is (using x̄ = 217)
\[ 217 \pm \frac{1.96(46)}{\sqrt{12}} = 217 \pm 26 = (191, 243). \]
A 99% confidence interval for µ is
\[ 217 \pm \frac{2.58(46)}{\sqrt{12}} = 217 \pm 34 = (183, 251). \]
Why is it wider?
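• The two intervals above can be checked numerically. The sketch below assumes the values read off the worked example (x̄ = 217, σ = 46, n = 12); the function name z_interval is made up for illustration, and the critical value comes from Python's standard library rather than from a Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf):
    """Confidence interval for the mean when sigma is known (hypothetical helper)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / sqrt(n)          # margin of error
    return xbar - half_width, xbar + half_width

# Values taken from the worked example: xbar = 217, sigma = 46, n = 12
ci95 = z_interval(217, 46, 12, 0.95)
ci99 = z_interval(217, 46, 12, 0.99)

print([round(v) for v in ci95])   # [191, 243]
print([round(v) for v in ci99])   # [183, 251]
```

The only quantity that changes between the two calls is the critical value (about 1.96 versus about 2.58), which is the place to look when answering the question above.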
• Suppose we are designing the study, and we want to estimate µ to within 10 units.
(Note that E = 10 and the interval’s width is 20.) Substituting σ = 46, E = 10, and α = .01
into (6), one has
\[ n = \left( \frac{2.58(46)}{10} \right)^2 = 140.8 . \]
Rounding up, our recommendation is to obtain a sample of n = 141 males.
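• The sample-size calculation can be sketched the same way. This assumes the formula being applied is n = (z σ / E)², as the arithmetic above indicates; sample_size is a hypothetical helper name, and the result is rounded up because n must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, conf):
    """Smallest n for which z * sigma / sqrt(n) <= margin (hypothetical helper)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576 when conf = 0.99
    return ceil((z * sigma / margin) ** 2)    # always round up, never down

n_needed = sample_size(sigma=46, margin=10, conf=0.99)
print(n_needed)   # 141
```

The unrounded value here is slightly below the 140.8 shown above because the code uses an unrounded critical value instead of the table value 2.58; the recommended sample size is the same.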
4. Student’s t distribution
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} . \]
This statistic has the t distribution with n-1 “degrees of freedom” (df).
• Recall we had an extensive Z table, but it’s impractical to give a table for each
$t_{n-1}$ distribution. Therefore, tables typically give only upper percent points. For
example, from Table A-4, one has
\[ t_{10,\,.025} = 2.228 . \]
By symmetry,
\[ t_{10,\,.975} = -2.228 , \]
and thus one can find upper and lower percentiles.
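• When a table is not at hand, the percent points can be approximated numerically. The sketch below is one possible approach using only the Python standard library (Simpson's rule for the area under the t density, then bisection); it is not how Table A-4 was produced, and the names t_pdf, t_cdf and t_upper_point are made up for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps                              # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_upper_point(df, upper_tail):
    """Value t with P(T > t) = upper_tail, found by bisection on [-100, 100]."""
    target = 1.0 - upper_tail
    lo, hi = -100.0, 100.0                     # wide enough for moderate df and tail areas
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

upper = t_upper_point(10, 0.025)
lower = t_upper_point(10, 0.975)
print(round(upper, 3), round(lower, 3))
```

The two printed values should agree with the table entry 2.228 and its negative to about three decimals, which illustrates the symmetry argument.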
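• One can construct confidence intervals based on the t-distribution in the same
way as before. The expression comparable to (3) is
\[ P\left( \bar{X} - t_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2} \frac{s}{\sqrt{n}} \right) = 1 - \alpha . \]
In general, a 100(1 − α)% confidence interval for µ, with approximately normal
x, is
\[ \left( \bar{X} - t_{\alpha/2} \frac{s}{\sqrt{n}} ,\; \bar{X} + t_{\alpha/2} \frac{s}{\sqrt{n}} \right). \qquad (9) \]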
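• A minimal sketch of interval (9) in code. The data below are hypothetical (these notes give no sample for this case) and were chosen so that s/√n = 1; the critical value 2.228 is the table entry for 10 df quoted earlier, and t_interval is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """Confidence interval for the mean when sigma is unknown (hypothetical helper).

    t_crit is the upper alpha/2 point of t with len(sample) - 1 df,
    looked up in a t table by the caller.
    """
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)                      # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical sample with n = 11, so df = 10 and the 95% critical value is 2.228
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
low, high = t_interval(data, t_crit=2.228)
print(round(low, 3), round(high, 3))       # 3.772 8.228
```

Here x̄ = 6 and s = √11, so the margin of error is exactly the critical value. With σ known, the same data would use 1.96 in place of 2.228.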
• Notice that in this problem, the interval width is a random variable. Hence, if
two experimenters took different samples, their interval widths would differ,
unlike the case of known σ. In general, the interval based on t would be wider
than that based on Z. Why? Though the widths differ, we expect 100(1 − α)%
of the intervals to contain µ. See Figure 9.3.
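• These claims can be explored with a small simulation. The sketch below assumes a normal population with µ = 217 and σ = 46 (illustrative values only), samples of size n = 11, and the table value 2.228 for 10 df; the function name, seed and number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

T_CRIT = 2.228         # upper .025 point of t with 10 df (table value)
N = 11                 # sample size, so df = N - 1 = 10
MU, SIGMA = 217, 46    # assumed population values, for illustration only

def simulate(reps=2000, seed=1):
    """Draw repeated samples and build a 95% t interval from each one."""
    rng = random.Random(seed)
    widths, hits = [], 0
    for _ in range(reps):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        xbar = mean(sample)
        half = T_CRIT * stdev(sample) / sqrt(N)
        widths.append(2 * half)
        if xbar - half < MU < xbar + half:
            hits += 1                      # this interval contains mu
    return hits / reps, min(widths), max(widths), mean(widths)

coverage, narrowest, widest, average = simulate()
z_width = 2 * 1.96 * SIGMA / sqrt(N)       # fixed width when sigma is known

print(f"coverage of t intervals: {coverage:.3f}")
print(f"t widths range from {narrowest:.1f} to {widest:.1f}, average {average:.1f}")
print(f"z width (sigma known): {z_width:.1f}")
```

The coverage should come out near .95 even though every interval has a different width, and the average t width should exceed the fixed Z width. An individual t interval can still be narrower than the Z interval when s happens to fall well below σ.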
5. Case Study
• The text has a nice case study on the efficacy of using a particular drug for treating
children with ADD.