Wilson Interval

This document discusses several methods for constructing confidence intervals for a binomial proportion π: 1) The Wilson interval, which is recommended for sample sizes n < 40. It provides better coverage than the Wald interval for small n. 2) The Agresti-Coull interval, which is recommended for n > 40. It is essentially a Wald interval where successes and failures are adjusted. 3) An exact confidence interval method that calculates the true confidence level without simulation by considering all possible intervals for each value of the random variable. This allows an exact confidence level to be found for a finite sample size n. 4) The likelihood ratio test (LRT) for comparing nested models, which approxim

Uploaded by

Amsalu Walelign
Wilson interval (n < 40)

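(1-α)100% CI for π:

$$\tilde{\pi} \pm \frac{Z_{1-\alpha/2}\, n^{1/2}}{n + Z_{1-\alpha/2}^{2}} \sqrt{\hat{\pi}(1-\hat{\pi}) + \frac{Z_{1-\alpha/2}^{2}}{4n}}$$

Where

$$\tilde{\pi} = \frac{\sum_{i=1}^{n} y_i + Z_{1-\alpha/2}^{2}/2}{n + Z_{1-\alpha/2}^{2}}$$

and $\hat{\pi} = \sum_{i=1}^{n} y_i / n$ is the sample proportion.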
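
The Wilson formula can be sketched directly in base R. This is a minimal illustration, not code from the notes: the function name `wilson_interval` and its arguments are hypothetical, and the inputs w = 4 and n = 10 are borrowed from the example used later in these notes.

```r
# Minimal sketch of the Wilson interval (hypothetical helper, base R only).
# w = number of successes (sum of the y_i), n = number of trials.
wilson_interval <- function(w, n, alpha = 0.05) {
  z <- qnorm(p = 1 - alpha / 2)             # Z_{1 - alpha/2}
  pi_hat <- w / n                           # sample proportion
  pi_tilde <- (w + z^2 / 2) / (n + z^2)     # adjusted centre of the interval
  half_width <- (z * sqrt(n) / (n + z^2)) *
    sqrt(pi_hat * (1 - pi_hat) + z^2 / (4 * n))
  c(lower = pi_tilde - half_width, upper = pi_tilde + half_width)
}

# Illustrative call with 4 successes in 10 trials
wilson_ci <- wilson_interval(w = 4, n = 10)
print(round(wilson_ci, 4))
```

With these inputs the limits should be close to 0.168 and 0.687. Note that the interval is centred at π̃ rather than at the sample proportion 0.4.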









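For n > 40, use the Agresti-Coull interval:

$$\tilde{\pi} \pm Z_{1-\alpha/2} \sqrt{\frac{\tilde{\pi}(1-\tilde{\pi})}{n + Z_{1-\alpha/2}^{2}}}$$

where $\tilde{\pi}$ is as defined for the Wilson interval.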
*** This is essentially a Wald interval where we add $Z_{1-\alpha/2}^{2}/2$ successes and $Z_{1-\alpha/2}^{2}/2$ failures to the observed data. In fact, when α = 0.05, $Z_{1-\alpha/2}$ = 1.96 ≈ 2. Then

$$\tilde{\pi} = \frac{\sum_{i=1}^{n} y_i + 2^{2}/2}{n + 2^{2}} = \frac{\sum_{i=1}^{n} y_i + 2}{n + 4}$$




++> When n < 40, the Agresti-Coull interval is
generally still better than the Wald interval.
++> The Wilson interval can be used when n > 40 as
well, and it is generally better than the Agresti-Coull interval.
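
The "add successes and failures" reading of the Agresti-Coull interval can be checked numerically. The sketch below is an illustration under assumptions: the function name `agresti_coull_interval` is hypothetical, and w = 4, n = 10 are reused from the example later in these notes.

```r
# Minimal sketch of the Agresti-Coull interval (hypothetical helper, base R only).
agresti_coull_interval <- function(w, n, alpha = 0.05) {
  z <- qnorm(p = 1 - alpha / 2)
  pi_tilde <- (w + z^2 / 2) / (n + z^2)   # add z^2/2 successes and z^2/2 failures
  half_width <- z * sqrt(pi_tilde * (1 - pi_tilde) / (n + z^2))
  c(lower = pi_tilde - half_width, upper = pi_tilde + half_width)
}

w <- 4
n <- 10
ac_ci <- agresti_coull_interval(w = w, n = n)

# Exact centre versus the "add 2 successes and 2 failures" approximation
z <- qnorm(p = 0.975)
pi_tilde_exact <- (w + z^2 / 2) / (n + z^2)
pi_tilde_approx <- (w + 2) / (n + 4)

print(round(ac_ci, 4))
print(round(c(exact = pi_tilde_exact, approx = pi_tilde_approx), 4))
```

The two centres should agree to about two decimal places, which is why the rule of thumb with 2 and 4 works well at α = 0.05.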

*********TRUE CONFIDENCE LEVEL
One could find the EXACT true confidence level without Monte Carlo
simulation! Below are the steps:

1) Find all possible intervals that one could have with w = 0, 1, …, n.
2) Form I(w) = 1 if the interval for a w contains π and 0 otherwise.
3) Calculate the true confidence level as

$$\sum_{w=0}^{n} I(w) \binom{n}{w} \pi^{w} (1-\pi)^{n-w}$$


This is what Brown et al. (2001) did for their paper. The key to using a
non-simulation based approach is that there are a finite number of possible
values for the random variable of interest. In other settings beyond confidence
intervals for π, this will usually not occur and simulation will be the only
approach for a finite sample size n.
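
The three steps can be sketched in a few lines of R. This is an illustration only: the helper names are hypothetical, and the setting n = 40 with π = 0.157 is an assumed example, not a value taken from these notes.

```r
# Hypothetical helpers: each returns c(lower, upper) for w successes in n trials.
wald_interval <- function(w, n, alpha = 0.05) {
  z <- qnorm(p = 1 - alpha / 2)
  pi_hat <- w / n
  pi_hat + c(-1, 1) * z * sqrt(pi_hat * (1 - pi_hat) / n)
}

wilson_interval <- function(w, n, alpha = 0.05) {
  z <- qnorm(p = 1 - alpha / 2)
  pi_hat <- w / n
  pi_tilde <- (w + z^2 / 2) / (n + z^2)
  pi_tilde + c(-1, 1) * (z * sqrt(n) / (n + z^2)) *
    sqrt(pi_hat * (1 - pi_hat) + z^2 / (4 * n))
}

true_conf_level <- function(interval_fun, n, prob, alpha = 0.05) {
  w <- 0:n
  # Step 1: intervals for every possible w (2 x (n + 1) matrix)
  bounds <- sapply(w, function(wi) interval_fun(wi, n, alpha))
  # Step 2: I(w) = 1 if the interval contains prob, 0 otherwise
  covers <- as.numeric(bounds[1, ] <= prob & prob <= bounds[2, ])
  # Step 3: sum I(w) times the binomial probability of each w
  sum(covers * dbinom(x = w, size = n, prob = prob))
}

# Assumed illustrative setting: n = 40, true proportion 0.157
wald_level <- true_conf_level(wald_interval, n = 40, prob = 0.157)
wilson_level <- true_conf_level(wilson_interval, n = 40, prob = 0.157)
print(round(c(wald = wald_level, wilson = wilson_level), 4))
```

For this assumed setting the Wald level comes out near 0.876 and the Wilson level near 0.950, which matches the claim that the Wilson interval has better coverage than the Wald interval.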






****LIKELIHOOD RATIO TEST
The LRT statistic, Λ, is the ratio of two likelihood
functions. The numerator is the likelihood
function maximized over the parameter space
restricted under the null hypothesis. The
denominator is the likelihood function maximized
over the unrestricted parameter space. The test
statistic is written as:

$$\Lambda = \frac{\text{Max. lik. when parameters satisfy } H_0}{\text{Max. lik. when parameters satisfy } H_0 \text{ or } H_a}$$


Wilks (1935, 1938) shows that -2log(Λ) can be
approximated by a $\chi^{2}_{u}$ distribution for a large
sample and under $H_0$, where u is the difference in
dimension between the alternative and null hypothesis
parameter spaces.

Suppose the hypothesis test
$H_0: \pi = 0.5$ vs. $H_a: \pi \neq 0.5$ is of interest.
Remember that $\sum_{i=1}^{n} y_i = 4$ and n = 10.
The numerator of Λ is the maximum possible
value of the likelihood function under the null
hypothesis. Because π = 0.5 is the null
hypothesis, the maximum can be found by just
substituting π = 0.5 in the likelihood function:

$$L(\pi = 0.5 \mid y_1, \ldots, y_n) = 0.5^{\sum y_i} (1 - 0.5)^{n - \sum y_i}$$

Then

$$L(\pi = 0.5 \mid y_1, \ldots, y_n) = 0.5^{4} (0.5)^{10-4} = 0.0009766$$

The denominator of Λ is the maximum possible
value of the likelihood function under the null OR
alternative hypotheses. Because this includes
all possible values of π here, the maximum is
achieved when the maximum likelihood estimate
is substituted for π in the likelihood function! As
shown previously, the maximum value is
0.001194.
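
Both likelihood values can be reproduced with a short R sketch. The function name `lik` is hypothetical, and the call to `optimize()` is only an assumed numerical check that the maximum sits at the sample proportion; it is not part of the original notes.

```r
# Bernoulli likelihood L(prob | y_1, ..., y_n) = prob^w * (1 - prob)^(n - w),
# where w is the sum of the y_i (hypothetical helper).
lik <- function(prob, w, n) prob^w * (1 - prob)^(n - w)

w <- 4
n <- 10

lik_null <- lik(prob = 0.5, w = w, n = n)     # numerator: likelihood at pi = 0.5
pi_hat <- w / n                               # maximum likelihood estimate
lik_max <- lik(prob = pi_hat, w = w, n = n)   # denominator: likelihood at the MLE

# Numerical check that the likelihood peaks at pi_hat
opt <- optimize(f = lik, interval = c(0, 1), w = w, n = n, maximum = TRUE)

print(signif(c(null = lik_null, max = lik_max), 4))
print(round(opt$maximum, 3))
```

The numerical maximizer should land at roughly 0.4, the sample proportion.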

Therefore,

$$\Lambda = \frac{\text{Max. lik. when parameters satisfy } H_0}{\text{Max. lik. when parameters satisfy } H_0 \text{ or } H_a} = \frac{0.0009766}{0.001194} = 0.8179$$

Then -2log(Λ) = -2log(0.8179) = 0.4020 is the
test statistic value. The critical value is
$\chi^{2}_{1,0.95}$ = 3.84 using α = 0.05:

> qchisq(p = 0.95, df = 1)
[1] 3.841459

Because 0.4020 < 3.84, there is not sufficient
evidence to reject the hypothesis that π = 0.5.
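
The whole test can be sketched end to end in R. The variable names here are hypothetical, and the p-value from `pchisq()` is an added illustration that the notes do not report. Because the sketch works with unrounded likelihoods, the statistic may differ from 0.4020 in the third decimal place.

```r
# Minimal sketch of the LRT for H0: pi = 0.5 with 4 successes in 10 trials.
w <- 4
n <- 10
pi0 <- 0.5

lik <- function(prob) prob^w * (1 - prob)^(n - w)   # Bernoulli likelihood

lambda <- lik(pi0) / lik(w / n)       # ratio of the maximized likelihoods
lrt_stat <- -2 * log(lambda)          # test statistic, -2 log(Lambda)

crit <- qchisq(p = 0.95, df = 1)                 # critical value at alpha = 0.05
p_value <- 1 - pchisq(q = lrt_stat, df = 1)      # large-sample p-value
reject <- lrt_stat > crit

print(round(c(lambda = lambda, statistic = lrt_stat, critical = crit), 4))
print(reject)
```

The decision matches the one above: the statistic is well below the critical value, so the null hypothesis is not rejected.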
