
Confident Interval

Uploaded by

Jeff Tan
Copyright
© © All Rights Reserved

Topic 5: Confidence Intervals (Chapter 9)

1. Introduction
• The two general areas of statistical inference are:
1) estimation of parameter(s), ch. 9
2) hypothesis testing of parameter(s), ch. 10

• Let X be some random variable with unknown mean µ. Suppose we take a
sample of size n. One could find the sample mean, $\bar{x}$, and use it to estimate µ.
Such a single number used to estimate a parameter is called a point estimate.

• Question: Are there other reasonable point estimates (other than $\bar{x}$) for µ?

• Claim: The “best” point estimate of µ is $\bar{x}$.

• The point estimate does not give us a sense of how close $\bar{x}$ might be to µ, i.e. it
doesn’t tell us about the inherent uncertainty in the process. Therefore, interval
estimates are used, which provide a range of values that contains the parameter,
say µ, with some specified degree of confidence. This range is called a
confidence interval.

2. Two-Sided Confidence Interval


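• Consider now some random variable X with mean µ and standard deviation σ.
Suppose we want to estimate µ (the estimation of σ follows later). By the
central limit theorem, for large n the sample mean is approximately normal
with mean µ and standard deviation σ/√n:
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
Therefore,
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \qquad (1)$$
is (approximately) a standard normal.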

• Recall that by definition
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha \qquad (2)$$
e.g. because $z_{.025} = 1.96$,
$$P(-1.96 < Z < 1.96) = 1 - .05 = .95$$
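As a quick numerical check of (2), the critical value can be computed from the inverse of the standard normal CDF. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_crit` is an illustrative choice and does not come from the notes.

```python
from statistics import NormalDist

def z_crit(alpha):
    """Return z_{alpha/2}: the point with upper-tail area alpha/2 under N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

Z = NormalDist()                   # standard normal: mean 0, standard deviation 1
z = z_crit(0.05)                   # critical value for alpha = .05
coverage = Z.cdf(z) - Z.cdf(-z)    # P(-z < Z < z), which should equal 1 - alpha
print(round(z, 2), round(coverage, 2))   # 1.96 0.95
```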

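• Substituting (1) into (2), and using some algebra, one can show that (2) implies
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \qquad (3)$$
e.g. letting α = .05
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95$$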

• The interpretation of (3) is tricky. Notice that we calculate a random interval
$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right).$$
If we do this repeatedly, we expect that 95% of the intervals contain the
unknown parameter µ . It does not mean that µ assumes a value within the
interval with probability 0.95. An illustration of this is given in Figure 9.1.
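The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the "true" mean `mu=200` is a hypothetical value chosen for the simulation, `sigma=46` and `n=12` are illustrative, and the data are drawn from a normal distribution. The function name `coverage_rate` is likewise illustrative.

```python
import random
from math import sqrt

def coverage_rate(mu, sigma, n, z=1.96, reps=10_000, seed=1):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)       # fixed, because sigma is treated as known
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

rate = coverage_rate(mu=200, sigma=46, n=12)   # mu = 200 is hypothetical
print(rate)   # close to 0.95; the exact value depends on the seed
```

Here µ never changes; it is the interval that moves from sample to sample, and roughly 95% of those intervals cover µ.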

• In general, a 100(1 − α)% confidence interval for µ is
$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right). \qquad (4)$$

• The confidence interval could be written as
$$\bar{X} \pm E$$
where
$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}. \qquad (5)$$
E is also called the bound on error (BOE).

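• Suppose one wants to reduce E. This could be done by increasing α (that is,
accepting a lower confidence level, which makes $z_{\alpha/2}$ smaller) or by
increasing n. In fact, n may be determined from E. Notice from (5),
$$\sqrt{n} = z_{\alpha/2}\frac{\sigma}{E}.$$
Therefore,
$$n = \left[z_{\alpha/2}\frac{\sigma}{E}\right]^2. \qquad (6)$$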
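The effect of α and n on E can be seen numerically. The sketch below evaluates the bound on error from (5) for a few settings; `sigma = 46` is an illustrative value, and `bound_on_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def bound_on_error(sigma, n, alpha):
    """E = z_{alpha/2} * sigma / sqrt(n), as in (5)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)

sigma = 46   # illustrative; any known sigma works
for alpha, n in [(0.01, 12), (0.05, 12), (0.05, 48)]:
    print(alpha, n, round(bound_on_error(sigma, n, alpha), 1))
```

Raising α from .01 to .05 shrinks E, and quadrupling n halves E, since E is proportional to 1/√n.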

• Consider an example. Let X be the cholesterol level of a U.S. male who
smokes. Suppose µ is unknown, but that we know σ = 46 (we actually
assume this value because we presumably know σ = 46 for the population of all
U.S. adult males). Suppose we take a sample of n = 12, and find
$\bar{x} = 217$.

Then the best point estimate of µ is 217. Also, a 95% confidence interval for
µ is (using $\bar{X} = 217$)
$$217 \pm \frac{1.96(46)}{\sqrt{12}} = 217 \pm 26 = (191, 243).$$
A 99% confidence interval for µ is
$$217 \pm \frac{2.58(46)}{\sqrt{12}} = (183, 251).$$
Why is it wider?
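The arithmetic in this example can be reproduced with a short sketch. The function name `z_interval` is illustrative; the critical values 1.96 and 2.58 are the table values used in the example.

```python
from math import sqrt

def z_interval(xbar, sigma, n, z):
    """Two-sided interval xbar +/- z*sigma/sqrt(n) for known sigma."""
    e = z * sigma / sqrt(n)     # bound on error E
    return xbar - e, xbar + e

# Values from the cholesterol example: xbar = 217, sigma = 46, n = 12.
lo95, hi95 = z_interval(217, 46, 12, z=1.96)
lo99, hi99 = z_interval(217, 46, 12, z=2.58)
print(round(lo95), round(hi95))   # 191 243
print(round(lo99), round(hi99))   # 183 251
```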

• Suppose we are designing the study, and we want to know µ within 10 units.
(Note that E = 10 and the interval’s width is 20.) Substituting σ, E, and α = .01
into (6), one has
$$n = \left(\frac{2.58(46)}{10}\right)^2 = 140.8.$$
Rounding up, our recommendation is to obtain a sample of n = 141 males.
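The sample-size calculation can be sketched in a few lines. The helper name `sample_size` is illustrative; the result is rounded up because n must be an integer and rounding down would give a bound on error slightly larger than the target.

```python
from math import ceil

def sample_size(sigma, e, z):
    """Smallest integer n with z*sigma/sqrt(n) <= e, i.e. (z*sigma/e)^2 rounded up."""
    return ceil((z * sigma / e) ** 2)

# Values from the example: sigma = 46, E = 10, z = 2.58 for alpha = .01.
n = sample_size(sigma=46, e=10, z=2.58)
print(n)   # 141
```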

3. One-Sided Confidence Intervals


• Sometimes (though not often), one desires just an upper or a lower bound on µ.
Using algebraic manipulations as before, one has:

A 100(1 − α)% upper (one-sided) confidence bound on µ is
$$\bar{X} + z_{\alpha}\frac{\sigma}{\sqrt{n}}, \qquad (7)$$
and a corresponding lower confidence bound is
$$\bar{X} - z_{\alpha}\frac{\sigma}{\sqrt{n}}. \qquad (8)$$
e.g. for the U.S. male smokers above, to find a 95% upper bound, recall that
$z_{.05} = 1.65$. Therefore the upper bound from the sample is
$$217 + \frac{1.65(46)}{\sqrt{12}} = 217 + 21.9 = 238.9$$
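A minimal sketch of the one-sided bounds follows, using the cholesterol numbers. The function names are illustrative. Note that the one-sided bounds use $z_{\alpha}$ rather than $z_{\alpha/2}$, because all of α sits in a single tail.

```python
from math import sqrt

def upper_bound(xbar, sigma, n, z_alpha):
    """One-sided upper confidence bound: xbar + z_alpha*sigma/sqrt(n)."""
    return xbar + z_alpha * sigma / sqrt(n)

def lower_bound(xbar, sigma, n, z_alpha):
    """One-sided lower confidence bound: xbar - z_alpha*sigma/sqrt(n)."""
    return xbar - z_alpha * sigma / sqrt(n)

# Values from the example: xbar = 217, sigma = 46, n = 12, z_.05 = 1.65.
ub = upper_bound(217, 46, 12, z_alpha=1.65)
print(round(ub, 1))   # 238.9
```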

4. Student’s t distribution

• Consider a random variable X which is known to be normally distributed, but
where neither µ nor σ is known. To make inferences about µ, we could use
the Z transformation, provided σ is known. However, in the case of unknown σ,
which is the common case, we replace σ by the sample standard deviation s to find
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}.$$
This statistic has the t distribution with n − 1 “degrees of freedom” (df).

• Several facts of interest are


1) The t-distribution is also symmetric about 0, but it has “thicker tails”, i.e.
it’s a bit flatter. This is because there is added variability introduced by
using the random variable s rather than a constant σ in the denominator
(see Figure 9.2).
2) The t-distribution approaches Z as n becomes large. This is because as
n increases, we have more information and s approaches σ . Why?
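Both facts can be illustrated by simulation. The sketch below draws normal samples, computes the t statistic with µ = 0, and estimates how often |t| exceeds 1.96 (the value that cuts off 5% in total for Z). The function name, the sample sizes 5 and 50, and the replication count are all illustrative choices.

```python
import random
from math import sqrt

def tail_fraction(n, cutoff=1.96, reps=5_000, seed=2):
    """Estimate P(|t| > cutoff) for t = (xbar - mu)/(s/sqrt(n)) with normal data."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]    # mu = 0, sigma = 1
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        t = xbar / (s / sqrt(n))                            # t statistic with mu = 0
        if abs(t) > cutoff:
            count += 1
    return count / reps

small = tail_fraction(n=5)     # well above 0.05: thicker tails with 4 df
large = tail_fraction(n=50)    # much closer to 0.05: t is nearly Z
print(small, large)
```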

• Recall we had an extensive Z table, but it’s impractical to give a table for each
$t_{n-1}$ statistic. Therefore, tables typically give only upper percent points. For
example, from Table A-4, one has
$$t_{10,.025} = 2.228.$$
By symmetry,
$$t_{10,.975} = -2.228,$$
and thus one can find upper and lower percentiles.
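Upper percent points like these can also be computed numerically rather than read from a table. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and inverts the CDF by bisection. The function names are illustrative, and the method is a substitute for the table lookup, not the procedure used in the notes. It assumes α < 0.5 and moderate df (the gamma function overflows for very large df).

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_upper_point(df, alpha):
    """Point with upper-tail area alpha (alpha < 0.5), found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:        # tail still too heavy: move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_upper_point(10, 0.025), 3))   # 2.228
```

By symmetry, the point with upper-tail area .975 is the negative of this value.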

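• One can construct confidence intervals based on the t-distribution in the same
way as before. The expression comparable to (3) is
$$P\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right) = 1 - \alpha.$$
In general, a 100(1 − α)% confidence interval for µ, with approximately normal
X, is
$$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right), \qquad (9)$$
where $t_{\alpha/2}$ is the upper α/2 percent point of the t distribution with n − 1
degrees of freedom.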

• For example, let X denote an infant’s plasma aluminum level, which is
assumed to be approximately normal. Suppose we take a random sample of
n = 10 infants, and find $\bar{x} = 37.2$ and $s = 7.13$.

To find a 95% confidence interval, we find
$$t_{9,.025} = 2.262.$$
Hence, the interval estimate is
$$37.2 \pm \frac{2.262(7.13)}{\sqrt{10}} = 37.2 \pm 5.1 = (32.1, 42.3)$$
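The interval in this example can be checked with a short sketch. The function name `t_interval` is illustrative, and the critical value 2.262 is taken from the t table as in the example rather than computed.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Two-sided interval xbar +/- t_crit*s/sqrt(n) for unknown sigma."""
    e = t_crit * s / sqrt(n)
    return xbar - e, xbar + e

# Values from the plasma aluminum example; t_crit is the table value for 9 df.
low, high = t_interval(37.2, 7.13, 10, t_crit=2.262)
print(round(low, 1), round(high, 1))   # 32.1 42.3
```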

• Notice that in this problem, the interval width is a random variable. Hence, if
two experimenters took different samples, their interval widths would differ,
unlike the case of known σ . In general, the interval based on t would be wider
than that based on Z. Why? Though the widths differ, we expect 100(1- α )%
of the intervals to contain µ . See Figure 9.3.
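The random width can be seen in a simulation. This is a sketch under assumptions: `mu=37` and `sigma=7` are hypothetical "true" values chosen only for the simulation, the data are normal, and 2.262 is the table value $t_{9,.025}$ for n = 10.

```python
import random
from math import sqrt

def simulate_t_intervals(mu, sigma, n, t_crit, reps=5_000, seed=3):
    """Return (coverage, min width, max width) over repeated t intervals."""
    rng = random.Random(seed)
    hits, widths = 0, []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        e = t_crit * s / sqrt(n)       # half-width depends on the sample through s
        widths.append(2 * e)
        if xbar - e < mu < xbar + e:
            hits += 1
    return hits / reps, min(widths), max(widths)

# mu = 37 and sigma = 7 are hypothetical; t_crit = 2.262 corresponds to 9 df.
coverage, w_min, w_max = simulate_t_intervals(mu=37, sigma=7, n=10, t_crit=2.262)
print(coverage, round(w_min, 1), round(w_max, 1))
```

The widths vary considerably from sample to sample, yet the fraction of intervals covering µ stays near 0.95.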

5. Case Study

• The text has a nice case study on the efficacy of using a particular drug for
treating children with ADD.
