
AP Unit 6 Notes (TPS ch. 8 & 9)

This document covers statistical methods for inference related to categorical data, focusing on one-sample z-intervals and z-tests for population proportions. It outlines the conditions for calculating confidence intervals, making interpretations, and understanding the implications of p-values and errors in hypothesis testing. Additionally, it discusses the effects of sample size and confidence level on the width of confidence intervals and introduces two-sample z-intervals for comparing population proportions.

Uploaded by

rishi.nrkr

AP Unit 6 (TPS ch. 8 & 9)

I. Inference for Categorical Data: Proportions

A. One Sample z-interval for a population proportion:


1. Verify the conditions for calculating confidence intervals for a population
proportion:
a. To check that the sampling distribution of $\hat{p}$ is approximately normal (shape):
$n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$ (number of successes and failures are at least 10)
b. To check that the sampling method is unbiased (so the sampling distribution of $\hat{p}$ will
center at the parameter):
sample should be SRS from population of interest (discuss IN CONTEXT!!)
c. independence: If sampling without replacement, n < 10% of population.

2. Make Confidence Interval:


a. On formula sheet: statistic ± (critical value)(standard error of statistic)
$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
b. Where $\hat{p}$ is the midpoint of the interval, also called the ‘point estimate’
c. $z^*$ is the critical value, determined by the confidence level: the boundaries that
have _____% between them, according to the normal distribution.
d. $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ is the standard error (estimate of the CLT standard deviation)
e. $z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ is the margin of error, or half-width of the interval. A margin of error
gives how much a value of a sample statistic is likely to vary from the value of the
corresponding population parameter.
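
The interval formula above can be sketched in Python as a quick check on hand or calculator work. This is a minimal sketch, not part of the original notes: the function name `one_sample_z_interval` and the data (120 successes out of 200) are made up for illustration, and the random-sample and 10% conditions still have to be checked in context by hand.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a one-sample z-interval for a proportion."""
    p_hat = successes / n
    # Large-counts condition: at least 10 observed successes and failures.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("large-counts condition not met")
    # z* leaves (1 - confidence)/2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


# Hypothetical data: 120 successes in a sample of 200.
low, high = one_sample_z_interval(120, 200, confidence=0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # 95% CI: (0.532, 0.668)
```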

3. The formula for margin of error can be rearranged to solve for n, the minimum
sample size needed to achieve a given margin of error. For this purpose, use a given
estimate for $\hat{p}$, or use $\hat{p} = 0.5$ if none is given. Because $\hat{p}(1-\hat{p})$ is largest
at 0.5, this gives an upper bound for the sample size that will result in a given margin
of error. (Round sample size up to a whole number and include units on your answer.)
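
Rearranging the margin of error gives $n \ge \left(\dfrac{z^*}{ME}\right)^2 \hat{p}(1-\hat{p})$. A minimal sketch of that calculation follows; the function name `min_sample_size` and the 3% margin of error are assumptions for illustration only.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(margin_of_error, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most the target."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Always round UP: rounding down would give a margin of error that is too big.
    return ceil(n)


# Hypothetical target: margin of error of 0.03 at 95% confidence.
print(min_sample_size(0.03))               # no estimate given, so use 0.5 -> 1068
print(min_sample_size(0.03, p_guess=0.2))  # with a prior estimate of 0.2 -> 683
```

The default of 0.5 gives the larger (safer) sample size, which is why it is used when no estimate is available.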

4. Interpret a Confidence Interval:


I am ____% confident that the actual proportion of ____(population)______ that
____(variable)____ is between _____ and _____.

5. Interpret the confidence level (why are you ____% confident?):


If I repeatedly take samples of size __(n)__ from the population and create a ____%
confidence interval from each, then in the long run approx. ____% of the intervals
created would capture the actual proportion (p).
Note: A confidence interval for a population proportion either contains the
population proportion or it does not, because each interval is based on random
sample data, which varies from sample to sample.
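
The "long run" idea can be seen with a small simulation. This is only a sketch under assumed values (true p = 0.5, n = 100, 2000 repetitions, fixed seed); the function name `coverage_rate` is hypothetical. It counts how often the interval captures the true proportion, which should land near the confidence level.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(p=0.5, n=100, confidence=0.95, repetitions=2000, seed=1):
    """Fraction of simulated z-intervals that capture the true proportion p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    captured = 0
    for _ in range(repetitions):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        # Each individual interval either captures p or it does not.
        if p_hat - margin <= p <= p_hat + margin:
            captured += 1
    return captured / repetitions


print(coverage_rate())  # expect a value close to 0.95
```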

6. Know what affects the width of an interval:


a. When all other things remain the same, the width of the confidence interval for
a population proportion tends to decrease as the sample size increases. For a
population proportion, the width of the interval is proportional to $\dfrac{1}{\sqrt{n}}$.
b. For a given sample, the width of the confidence interval for a population
proportion increases as the confidence level increases.
c. The width of a confidence interval for a population proportion is exactly twice
the margin of error (half-width).
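
The three facts above can be checked numerically. A minimal sketch, assuming a made-up $\hat{p}$ of 0.6 and a hypothetical helper `interval_width`:

```python
from math import sqrt
from statistics import NormalDist


def interval_width(p_hat, n, confidence=0.95):
    """Width (twice the margin of error) of a one-sample z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical p-hat of 0.6; only n or the confidence level changes.
print(interval_width(0.6, 100))        # baseline
print(interval_width(0.6, 400))        # 4x the sample size: half the width
print(interval_width(0.6, 100, 0.99))  # higher confidence: wider interval
```

Because the width is proportional to $\dfrac{1}{\sqrt{n}}$, cutting the width in half takes four times the sample size, not twice.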

B. One sample z-test for population proportions:


1. Hypotheses:
a. The null hypothesis is the situation that is assumed to be correct unless evidence
suggests otherwise, and the alternative hypothesis is the situation for which
evidence is being collected. The null hypothesis specifies a value for the
population proportion, usually one indicating no difference or effect.
b. For hypotheses about parameters, the null hypothesis contains an equality
reference (=, ≥, or ≤). Although the null hypothesis for a one-sided test may
include an inequality symbol, it is still tested at the boundary of equality (=).
The alternative hypothesis contains a strict inequality (<, >, or ≠). The type of
inequality in the alternative hypothesis is based on the question of interest.
Alternative hypotheses with < or > are called one-sided, and alternative
hypotheses with ≠ are called two-sided.
c. The null hypothesis for a population proportion is $H_0: p = p_0$, where $p_0$ is the
null hypothesized value for the population proportion.
A one-sided alternative hypothesis for a proportion is either $H_a: p < p_0$ or
$H_a: p > p_0$. A two-sided alternative hypothesis is $H_a: p \ne p_0$.

2. Verify the conditions for performing a significance test for a population proportion:
a. To check that the sampling distribution of $\hat{p}$ is approximately normal (shape):
Assuming $H_0$ is true, $np_0 \ge 10$ and $n(1-p_0) \ge 10$ (expected number of
successes and failures are at least 10)
b. To check that the sampling method is unbiased (so the sampling distribution of $\hat{p}$ will
center at the parameter):
sample should be SRS from population of interest (discuss IN CONTEXT!!)
c. independence: If sampling without replacement, n < 10% of population.

3. Calculate a test statistic and the p-value: (Draw a picture and think CLT!!)
a. On the formula sheet: test statistic = (statistic − parameter) / (standard deviation of the statistic)
b. $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$

c. find p-value using normal table or calculator.


d. If asked to interpret a p-value:
The p-value is the proportion of values for the null distribution that are as
extreme or more extreme than the observed value of the test statistic. This is:
i. The proportion at or above the observed value of the test statistic, if the
alternative is >.
ii. The proportion at or below the observed value of the test statistic, if the
alternative is <.
iii.The proportion less than or equal to the negative of the absolute value of the
test statistic plus the proportion greater than or equal to the absolute value of
the test statistic, if the alternative is ≠.
iv. A p-value is the probability of obtaining a test statistic as or more extreme
than the observed test statistic when the null hypothesis is assumed to be true.
Informally, it is the probability of getting a sample result at least this extreme
by random chance alone if the null hypothesis were true.
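
The test statistic and the three p-value cases above can be sketched as follows. This is an illustration only: the function name `one_sample_z_test`, the `alternative` labels, and the data (120 successes out of 200 with $p_0 = 0.5$) are all assumptions, and the random-sample and 10% conditions are not checked by the code.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-sample z-test for a proportion."""
    # Large-counts condition uses p0, because H0 is assumed true.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("large-counts condition not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0, area at or above z
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0, area at or below z
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0, both tails beyond |z|
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 120 successes in 200 trials, testing H0: p = 0.5 vs Ha: p > 0.5.
z, p_value = one_sample_z_test(120, 200, 0.5, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Note that the standard deviation in the denominator uses $p_0$, not $\hat{p}$, which is the main difference from the confidence interval calculation.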

4. Conclusion:
a. A formal decision explicitly compares the p-value to the significance level, α. If
the p-value ≤ α, reject the null hypothesis. If the p-value > α, fail to reject the
null hypothesis.
b. Rejecting the null hypothesis means there is sufficient statistical evidence to
support the alternative hypothesis. Failing to reject the null means there is
insufficient statistical evidence to support the alternative hypothesis. The
conclusion about the alternative hypothesis must be stated in context.
c. A significance test can lead to rejecting or not rejecting the null hypothesis, but
can never lead to concluding or proving that the null hypothesis is true. Lack of
statistical evidence for the alternative hypothesis is not the same as evidence for
the null hypothesis.
d. Small p-values indicate that the observed value of the test statistic would be
unusual if the null hypothesis and probability model were true, and so provide
evidence for the alternative. The lower the p-value, the more convincing the
statistical evidence for the alternative hypothesis.
e. p-values that are not small indicate that the observed value of the test statistic
would not be unusual if the null hypothesis and probability model were true, so
do not provide convincing statistical evidence for the alternative hypothesis nor
do they provide evidence that the null hypothesis is true.
f. Writing Conclusion:
                              p-value ≤ α    p-value > α
Significant?                  Yes            No
Reject $H_0$?                 Yes            No
Evidence to support $H_a$?    Yes            No

With a p-value of _______, this is/isn’t significant at the ____level.


I reject/fail to reject $H_0$.
There is/isn’t enough evidence to say $H_a$ in context (answer the question in
the problem).

5. Types of Errors:
                         Actual Population Value
                         $H_0$ is true       $H_0$ is false ($H_a$ is true)
Reject $H_0$             Type I Error        Correct Decision
Fail to Reject $H_0$     Correct Decision    Type II Error

a. A Type I error occurs when the null hypothesis is true and is rejected by the
evidence (false positive).
b. A Type II error occurs when the null hypothesis is false and is not rejected by
the evidence (false negative).
c. The significance level, α, is the probability of making a Type I error, if the null
hypothesis is true.
d. The power of a test is the probability that a test will correctly reject a false
null hypothesis.
e. The probability of making a Type II error = 1 − power.
f. Whether a Type I or a Type II error is more consequential depends upon the
situation.
g. Since the significance level, α, is the probability of a Type I error, the
consequences of a Type I error influence decisions about a significance level.
h. The power of a test increases when any of the following occurs, provided the
others do not change: (* because the probability of a Type II error is 1 − power,
this probability automatically decreases when the power increases.)
i. Sample size(s) increases.
ii. Significance level (α) of a test increases.
iii. Standard error decreases. (most relevant to means)
iv. True parameter value is farther from the null.
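
The factors listed in (h) can be illustrated with a rough power calculation. This goes a little beyond what the notes require, so treat it as a sketch under assumptions: a one-sided test of $H_0: p = p_0$ vs $H_a: p > p_0$, the normal approximation for $\hat{p}$, and made-up values for `p0`, `p_true`, `n`, and `alpha`. The function name `power_greater` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_greater(p0, p_true, n, alpha=0.05):
    """Approximate power of the one-sided z-test of H0: p = p0 vs Ha: p > p0."""
    std_normal = NormalDist()
    # Reject H0 when p-hat is above this cutoff (computed assuming H0 is true).
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Power: chance p-hat lands above the cutoff when the true proportion is p_true.
    sd_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - std_normal.cdf((cutoff - p_true) / sd_true)


base = power_greater(0.5, 0.6, 100)                  # hypothetical baseline
more_n = power_greater(0.5, 0.6, 200)                # larger sample size
more_alpha = power_greater(0.5, 0.6, 100, alpha=0.10)  # larger significance level
farther = power_greater(0.5, 0.7, 100)               # true p farther from the null
print(base, more_n, more_alpha, farther)
# P(Type II error) in the baseline case is 1 - power:
print(1 - base)
```

Each of the three changes gives a power larger than the baseline, matching the list above.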

C. Two Sample z-interval for a difference between population proportions:


1. Verify the conditions:
a. To check that the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal:
$n_1\hat{p}_1 \ge 10$ and $n_1(1-\hat{p}_1) \ge 10$
$n_2\hat{p}_2 \ge 10$ and $n_2(1-\hat{p}_2) \ge 10$
b. To check that the sampling method is unbiased:
samples should be 2 independent SRS from population of interest or random
assignment if an experiment (discuss IN CONTEXT!!)
c. independence: If sampling without replacement, n < 10% of population for both
samples.

2. Make Confidence Interval:


a. On formula sheet: statistic ± (critical value)(standard error of statistic)
$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

3. Interpret a Confidence Interval:


I am ____% confident that the actual difference in the proportions of
____(population 1)___ and ____(population 2)___ that ____(variable)____ is
between _____ and _____.
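
The two-sample interval can be sketched the same way as the one-sample version. The function name `two_sample_z_interval` and the counts (60 of 100 vs 45 of 100) are made up for illustration; the random-sample/random-assignment and 10% conditions still need to be discussed in context.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for a z-interval for p1 - p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Large-counts condition: each sample needs 10+ successes and 10+ failures.
    counts = [n1 * p1_hat, n1 * (1 - p1_hat), n2 * p2_hat, n2 * (1 - p2_hat)]
    if min(counts) < 10:
        raise ValueError("large-counts condition not met")
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample uses its own p-hat.
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    difference = p1_hat - p2_hat
    margin_of_error = z_star * standard_error
    return difference - margin_of_error, difference + margin_of_error


# Hypothetical data: 60 of 100 in group 1, 45 of 100 in group 2.
low, high = two_sample_z_interval(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

If the interval contains 0, a difference of zero between the two population proportions is a plausible value.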

D. Two sample z-test for a difference in population proportions:


1. Hypotheses:
a. The null hypothesis specifies a value of 0 for the difference in population
proportions, indicating no difference or effect.
b. The null hypothesis for a difference in population proportions is $H_0: p_1 = p_2$,
or $H_0: p_1 - p_2 = 0$.
A one-sided alternative hypothesis is either $H_a: p_1 < p_2$ or
$H_a: p_1 > p_2$. A two-sided alternative hypothesis is $H_a: p_1 \ne p_2$.

2. Verify the conditions for performing a significance test for a difference in population proportions:
a. To check that the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal:
Use the combined (pooled) proportion $\hat{p}_c = \dfrac{X_1 + X_2}{n_1 + n_2}$
$n_1\hat{p}_c \ge 10$ and $n_1(1-\hat{p}_c) \ge 10$
$n_2\hat{p}_c \ge 10$ and $n_2(1-\hat{p}_c) \ge 10$

b. To check that the sampling method is unbiased:


samples should be 2 independent SRS from population of interest or random
assignment if an experiment (discuss IN CONTEXT!!)
c. independence: If sampling without replacement, n < 10% of population for both
samples.

3. Calculate a test statistic and the p-value: (Draw a picture and think CLT!!)
a. On the formula sheet: test statistic = (statistic − parameter) / (standard deviation of the statistic)
b. $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
c. find p-value using normal table or calculator.
d. If asked to interpret a p-value:
The p-value is the probability of obtaining a difference in the sample
proportions as large as or larger than the one obtained (in the direction given
by $H_a$), IF the population proportions are equal (there’s no difference).
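
The pooled test statistic can be sketched as below. As with the earlier examples, this is an illustration under assumptions: the function name `two_sample_z_test`, the `alternative` labels, and the counts (60 of 100 vs 45 of 100) are hypothetical, and only the large-counts condition is checked in code.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Return (z, p_value) for a two-sample z-test of H0: p1 = p2."""
    # Pooled proportion: H0 says both groups share one common proportion.
    p_c = (x1 + x2) / (n1 + n2)
    counts = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    if min(counts) < 10:
        raise ValueError("large-counts condition not met")
    standard_error = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = ((x1 / n1 - x2 / n2) - 0) / standard_error
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p1 > p2
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p1 < p2
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p1 != p2
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 60 of 100 in group 1, 45 of 100 in group 2.
z, p_value = two_sample_z_test(60, 100, 45, 100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The test pools the two samples because $H_0$ assumes a single common proportion, while the confidence interval keeps the two $\hat{p}$ values separate.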

4. Conclusion:
With a p-value of _______, this is/isn’t significant at the ____level.
I reject/fail to reject 𝑯𝑯𝒐𝒐 .
There is/isn’t enough evidence to say 𝑯𝑯𝒂𝒂 in context (answer the question in
the problem).
