P8120_Lecture_5_2025 - annotated
Lecture 4 Review
The likelihood function gives the probability of the observed (response) data, viewed as a function of
the parameter.
o $L(\pi) = P(X = 7 \mid \pi) = \binom{50}{7} \pi^{7} (1 - \pi)^{50 - 7}$
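To see the likelihood as a function of the parameter rather than of the data, it can help to evaluate it at several candidate values. The following is a minimal SAS sketch, not part of the original notes; the data set name likelihood and the grid of candidate values are illustrative assumptions.

```sas
/* Evaluate L(pi) = C(50,7) * pi**7 * (1 - pi)**43 on a grid of candidate values. */
data likelihood;                      /* hypothetical data set name */
  n = 50;                             /* sample size from the review example */
  x = 7;                              /* observed number of successes */
  do i = 1 to 20;
    pi = i / 50;                      /* candidate values 0.02, 0.04, ..., 0.40 */
    L = pdf('BINOMIAL', x, pi, n);    /* P(X = x | pi) */
    output;
  end;
  drop i;
run;

proc print data=likelihood noobs;
  var pi L;
run;
```

The printed values of L should rise up to pi = 0.14 (that is, 7/50) and fall afterwards, which is why the sample proportion is the maximum likelihood estimate.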
Hypothesis Testing
A hypothesis test is a method for using sample data to decide between two competing claims (hypotheses) about a
population parameter. If it were possible to carry out a census of the entire population, we would know
which of the two hypotheses is correct. Usually, though, we need to decide between the
two hypotheses using information from a sample.
The null hypothesis, H0, is usually chosen to represent “no change” or “no association” whereas the alternative
hypothesis, H1, usually specifies a change, difference, or association.
One-sided
$H_0: \pi \ge \pi_0$
$H_1: \pi < \pi_0$
Two-sided
$H_0: \pi = \pi_0$
$H_1: \pi \ne \pi_0$
The two possible outcomes of a hypothesis test are therefore: reject H0, or fail to reject H0.
A p-value is the probability, computed assuming the null hypothesis is true, of observing a test statistic (data)
at least as extreme as the one we actually observed.
Also recall that two kinds of error can be made when conducting a hypothesis test. Consider the table
below:
                     H0 is true          H0 is false
Reject H0            Type I error        Correct decision
Fail to reject H0    Correct decision    Type II error
There are TWO ways to perform a hypothesis test for a single proportion. We will review both in this lecture.
We want to test whether the true proportion of interest differs from (is larger or smaller than) some particular value
$\pi_0$, using our data ($n$ = sample size, $\hat{p}$ = MLE of $\pi$).
Two-sided
$H_0: \pi = \pi_0$
$H_1: \pi \ne \pi_0$
$\hat{p} \sim N\left(\pi, \dfrac{\pi (1 - \pi)}{n}\right)$
Under $H_0$: $\hat{p} \sim N\left(\pi_0, \dfrac{\pi_0 (1 - \pi_0)}{n}\right)$
(2) H0:
H1:
(4) Assumptions
$z = \dfrac{\hat{p} - \pi_0}{\sqrt{\dfrac{\pi_0 (1 - \pi_0)}{n}}} =$
(6) Decision
(7) Conclusion
At the 5% level of significance, we have _________________ evidence to conclude that the true
proportion of those who refuse acupuncture treatment is different from 20%.
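As a check on the hand calculation, the z statistic above and its two-sided p-value can be computed in a short data step. This is a sketch under the assumption that x = 7 of n = 50 patients refused (the counts used in the Lecture 4 review) and that the null value is 0.20; the data set and variable names are made up for illustration.

```sas
/* One-sample z test for a proportion, computed by hand. */
data ztest;                                   /* hypothetical data set name */
  n   = 50;                                   /* sample size */
  x   = 7;                                    /* number who refused */
  pi0 = 0.20;                                 /* null value */
  p_hat = x / n;                              /* sample proportion */
  se0   = sqrt(pi0 * (1 - pi0) / n);          /* standard error under H0 */
  z     = (p_hat - pi0) / se0;                /* test statistic */
  p_two_sided = 2 * (1 - probnorm(abs(z)));   /* two-sided normal p-value */
run;

proc print data=ztest noobs;
run;
```

With these inputs z is about -1.06 and the two-sided p-value is about 0.29, which is then compared with the 0.05 significance level.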
data acupuncture;
input refused;
cards;
7
43
;
run;
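One way the test might be run in SAS is with PROC FREQ. The sketch below is an assumption about how the analysis could be set up, not the course's own program: it reads the two values in the data step above as the counts of patients who refused (7) and did not refuse (43), stores them in a hypothetical data set acupuncture_counts with a hypothetical variable count, and requests a test of the null value 0.20.

```sas
/* One row per outcome, with the number of patients in each (assumed layout). */
data acupuncture_counts;            /* hypothetical data set name */
  input refused $ count;
  cards;
yes 7
no 43
;
run;

/* WEIGHT treats each row as 'count' patients; LEVEL= picks the outcome tested. */
proc freq data=acupuncture_counts order=data;
  weight count;
  tables refused / binomial(p=0.20 level='yes');
run;
```

The BINOMIAL option reports the sample proportion, confidence limits, and a Z test of the null proportion. According to the PROC FREQ documentation, the standard error in that test is computed under H0 by default, so the reported Z should agree with the z formula above.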
Notation Recap:
$\pi$ The population proportion (parameter); often referred to as the “truth”.
$\hat{p}$ The sample proportion (statistic); computed from your sample data.
$\pi_0$ The null value for $\pi$. This is the value you want to compare against $\pi$.
We want to test whether the true proportion of interest differs from (is larger or smaller than) some particular value $\pi_0$,
using our data ($n$ = sample size, $x$ = observed number of successes).
But now it may be the case that $n\pi_0 < 5$ or $n(1 - \pi_0) < 5$, so we cannot use Z and trust that it follows a
normal distribution. Instead, we work directly with the binomial distribution to calculate a p-value.
$X \sim \text{Bin}(n, \pi)$
There are several suggestions for how to compute a p-value for an exact binomial test. Here is how SAS does it:
$P(X = x \mid \pi = \pi_0) = \binom{n}{x} \pi_0^{x} (1 - \pi_0)^{n - x}$
(2) H0:
H1:
(4) Assumptions
$A = P(X \ge x_0 \mid \pi = \pi_0)$
$B = P(X \le x_0 \mid \pi = \pi_0)$
(6) Decision
(7) Conclusion
At the 5% level of significance, we have _________________ evidence to conclude that the true
proportion of those who refused acupuncture therapy is different from 20%.
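The tail probabilities A and B can be computed directly from the binomial distribution. The sketch below makes two assumptions that are not stated in the notes: first, that the counts 2 and 4 in the SAS data step for this example are the numbers who refused and did not refuse, so that x0 = 2 and n = 6; second, that the two-sided p-value is twice the smaller tail (capped at 1), which is how the PROC FREQ documentation describes its exact binomial test. Data set and variable names are illustrative.

```sas
/* Exact binomial tail probabilities under H0: pi = 0.20. */
data exact_test;                               /* hypothetical data set name */
  n   = 6;                                     /* assumed sample size: 2 + 4 */
  x0  = 2;                                     /* observed number who refused */
  pi0 = 0.20;                                  /* null value */
  A = 1 - cdf('BINOMIAL', x0 - 1, pi0, n);     /* P(X >= x0 | pi = pi0) */
  B = cdf('BINOMIAL', x0, pi0, n);             /* P(X <= x0 | pi = pi0) */
  p_two_sided = min(1, 2 * min(A, B));         /* assumed doubling rule, capped at 1 */
run;

proc print data=exact_test noobs;
run;
```

Under these assumptions A is about 0.345, B is about 0.901, and the two-sided p-value is about 0.689.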
data acupuncture_exact;
input refused;
cards;
2
4
;
run;
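The exact test can also be requested from PROC FREQ. As before, this is a sketch under assumptions rather than the course's own program: the values 2 and 4 in the data step above are read as the counts who refused and did not refuse, and the data set name acupuncture_exact_counts and the variable count are hypothetical.

```sas
/* One row per outcome, with the number of patients in each (assumed layout). */
data acupuncture_exact_counts;      /* hypothetical data set name */
  input refused $ count;
  cards;
yes 2
no 4
;
run;

/* EXACT BINOMIAL adds exact p-values to the usual BINOMIAL output. */
proc freq data=acupuncture_exact_counts order=data;
  weight count;
  tables refused / binomial(p=0.20 level='yes');
  exact binomial;
run;
```

The exact p-values appear alongside the asymptotic Z test, which makes it easy to see how far the normal approximation is off when n is this small.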