8 Tests for One Sample
Examples:
• H: µ = 75 cents, where µ is the true population average of daily
per-student candy+soda expenses in US high schools
• H: p < .10, where p is the population proportion of defective
helmets for a given manufacturer
• If µ1 and µ2 denote the true average breaking strengths of two
different types of twine, one hypothesis might be the assertion
that µ1 – µ2 = 0, and another is the statement µ1 – µ2 > 5
Null vs Alternative Hypotheses
In any hypothesis-testing problem, there are always two competing
hypotheses under consideration: the null hypothesis H0 (the claim
initially assumed to be true) and the alternative hypothesis Ha.
For example,
µ = .75 versus µ ≠ .75
p ≥ .10 versus p < .10
The objective of hypothesis testing is to decide, based on sample
information, whether the alternative hypothesis is actually supported
by the data.
We usually do new research to challenge the existing (accepted) beliefs.
Burden of Proof
The burden of proof is placed on those who believe in the
alternative claim.
This initially favored claim (H0) will not be rejected in favor of the
alternative claim (Ha or H1) unless the sample evidence
contradicts it and provides strong support for the alternative
assertion.
If the sample does not strongly contradict H0, we will continue to
believe in the plausibility of the null hypothesis.
The two possible conclusions: 1) reject H0
2) fail to reject H0.
No proof… only evidence
We can never prove that a hypothesis is true or not true.
We can only conclude that it is or is not supported by the data.
Thus we might test the null hypothesis H0: µ = .75 against the
alternative Ha: µ ≠ .75. Only if sample data strongly suggests that µ is
something other than 0.75 should the null hypothesis be rejected.
In the absence of such evidence, H0 should not be rejected, since it is
still considered plausible.
Why favor the null so much?
Why be so committed to the null hypothesis?
• sometimes we do not want to accept a particular
assertion unless (or until) data can show strong support
• reluctance (cost, time) to change
The true average wear life with the current coating is
known to be 1000 hours. With µ denoting the true average
life for the new coating, the company would not want to
make any (costly) changes unless evidence strongly
suggested that µ exceeds 1000.
Hypotheses and Test Procedures
An appropriate problem formulation would involve testing
H0: µ = 1000 against Ha: µ > 1000.
The conclusion that “a change is justified” is identified with
Ha, and it would take conclusive evidence to justify
rejecting H0 and switching to the new coating.
The word null means “of no value, effect, or
consequence,” which suggests that H0 should be identified
with the hypothesis of no change (from current opinion), no
difference, no improvement, etc.
The alternative to the null hypothesis H0: θ = θ0 will look like
one of the following three assertions:
1. Ha: θ ≠ θ0
2. Ha: θ > θ0 (in which case the null hypothesis is θ ≤ θ0)
3. Ha: θ < θ0 (in which case the null hypothesis is θ ≥ θ0)
• It is typically easier to determine the alternative hypothesis first;
the complementary statement is then designated as the null hypothesis
• The alternative hypothesis is the claim for which we are seeking
statistical evidence
Test Procedures
A test procedure is a rule, based on sample data, for
deciding whether to reject H0.
A testing procedure has two constituents:
(1) a test statistic, a function of the sample data that will
be used to make the decision, and
(2) a rejection region, the set of all test statistic values for
which H0 will be rejected.
For example, suppose x is the number of defective helmets in a sample
of n = 200, and we have decided to reject H0 if x ≤ 15. Then the
rejection region consists of {0, 1, 2,…, 15}, and H0 will not
be rejected if x = 16, 17,. . . ,199, or 200.
Errors in Hypothesis Testing
The basis for choosing a particular rejection region lies in
consideration of the errors that one might be faced with in
drawing a conclusion.
Even when H0: p ≥ .10 is true, an unusual sample might yield x ≤ 15,
in which case H0 would be rejected, an incorrect conclusion.
On the other hand, even when Ha: p < .10 is true, an
unusual sample might yield x = 20, in which case H0 would
not be rejected, again an incorrect conclusion.
Errors in Hypothesis Testing
Definition
• A type I error consists of rejecting the null hypothesis H0
when it is true.
• A type II error consists of not rejecting H0 when H0 is false.
This is very similar in spirit to our diagnostic test examples
(with H0 as "no disease"):
• False positive test = type I error
• False negative test = type II error
Type I error in hypothesis testing
Usually: Specify the largest value of α that can be
tolerated, and then find a rejection region with that α.
Example: let µ be the true average nicotine content for a certain
brand of cigarettes, and test H0: µ = 1.5 against Ha: µ > 1.5.
Suppose the distribution of nicotine content is known to be
normal with σ = .20.
Example (Type I Error) cont’d
Test statistic: Z = (X̄ − 1.5) / (σ/√n) = (X̄ − 1.5) / (.20/√n)
Example (Type I Error) cont’d
As Ha: µ > 1.5, the form of the rejection region is z ≥ c.
What is c so that α = 0.05? Since α = P(Z ≥ c when µ = 1.5) and Z then
has a standard normal distribution, c = z.05 = 1.645.
Case I: Testing means of a normal population with known σ
Null hypothesis: H0: µ = µ0
Test statistic value: z = (x̄ − µ0) / (σ/√n)
Rejection regions for z tests: (a) upper-tailed test, z ≥ zα;
(b) lower-tailed test, z ≤ −zα; (c) two-tailed test, either z ≥ zα/2 or z ≤ −zα/2
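The Case I procedure can be sketched in code. This is a minimal illustration, not from the slides: the helper name z_test and the sample values (x̄ = 1.55, n = 32) are hypothetical, while µ0 = 1.5, σ = .20, and α = .05 come from the nicotine example.

```python
# Minimal sketch of the Case I z test (normal population, known sigma).
# The sample values xbar = 1.55 and n = 32 are made-up illustration values.
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """One-sample z test with known sigma; returns (z, reject H0?)."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    if alternative == "greater":      # Ha: mu > mu0, reject if z >= z_alpha
        reject = z >= NormalDist().inv_cdf(1 - alpha)
    elif alternative == "less":       # Ha: mu < mu0, reject if z <= -z_alpha
        reject = z <= -NormalDist().inv_cdf(1 - alpha)
    else:                             # Ha: mu != mu0, reject if |z| >= z_{alpha/2}
        reject = abs(z) >= NormalDist().inv_cdf(1 - alpha / 2)
    return z, reject

z, reject = z_test(xbar=1.55, mu0=1.5, sigma=0.20, n=32, alpha=0.05, alternative="greater")
print(round(z, 2), reject)   # z = 1.41 < 1.645, so H0 is not rejected
```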
Type II Error Example
A certain type of automobile is known to sustain no visible
damage 25% of the time in 10-mph crash tests. A modified
bumper design has been proposed in an effort to increase
this percentage.
Let p denote the proportion of all 10-mph crashes with this
new bumper that result in no visible damage.
Type II Error Example cont'd
Clearly, β decreases as the value of p moves farther to the
right of the null value .25.
Intuitively, the greater the departure from H0, the more likely
it is that such a departure will be detected.
Thus, 1 − β is often called the "power" of the test.
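The sample details of the bumper example are not shown above, so here is a sketch of a power calculation for the mean-based z test instead, using the standard Case I formula β(µ′) = Φ(zα + (µ0 − µ′)/(σ/√n)) for an upper-tailed test; the numbers reuse the hypothetical nicotine setup (µ0 = 1.5, σ = .20, n = 32).

```python
# Sketch: type II error probability and power for an upper-tailed z test.
# beta(mu') = Phi(z_alpha + (mu0 - mu') / (sigma / sqrt(n)));
# the numeric inputs below are illustrative, not from the slides.
from statistics import NormalDist

def beta_upper(mu_prime, mu0, sigma, n, alpha):
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(z_alpha + (mu0 - mu_prime) / (sigma / n ** 0.5))

beta = beta_upper(mu_prime=1.6, mu0=1.5, sigma=0.20, n=32, alpha=0.05)
print(round(1 - beta, 2))   # power = 1 - beta
```

As the slide says, the farther µ′ is from the null value, the smaller β and the larger the power.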
Errors in Hypothesis Testing
We can also obtain a smaller value of α (the probability that the
null hypothesis will be incorrectly rejected) by decreasing the size of
the rejection region.
However, this results in a larger value of β for all parameter
values consistent with Ha.
Case II: Large sample tests for means
When the sample size is large, the z tests for Case I are
easily modified to yield valid test procedures without
requiring either a normal population distribution or
known σ. Because s is then close to σ, the statistic
z = (x̄ − µ0) / (s/√n)
has approximately a standard normal distribution when H0 is true.
CI and Hypotheses cont’d
Rejection regions have a lot in common with confidence intervals.
Source: shex.org
Proportions: Large-Sample Tests
The estimator p̂ = X/n is unbiased (E(p̂) = p), has
approximately a normal distribution, and its standard
deviation is σp̂ = √(p(1 − p)/n).
When H0: p = p0 is true, the test statistic
Z = (p̂ − p0) / √(p0(1 − p0)/n)
has approximately a standard normal distribution.
Alternative Hypothesis        Rejection Region
Ha: p > p0                    z ≥ zα (upper-tailed)
Ha: p < p0                    z ≤ −zα (lower-tailed)
Ha: p ≠ p0                    either z ≥ zα/2 or z ≤ −zα/2 (two-tailed)
These tests are valid provided np0 ≥ 10 and n(1 − p0) ≥ 10.
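The rejection regions above translate directly into code. A sketch assuming the earlier defective-helmet scenario (H0: p ≥ .10 vs Ha: p < .10, n = 200); observing x = 15 defectives is a made-up illustration value.

```python
# Sketch of the large-sample z test for a proportion, lower-tailed case.
from statistics import NormalDist

def prop_z(x, n, p0):
    """z statistic for H0: p = p0; valid when n*p0 >= 10 and n*(1 - p0) >= 10."""
    phat = x / n
    return (phat - p0) / (p0 * (1 - p0) / n) ** 0.5

z = prop_z(x=15, n=200, p0=0.10)
z_crit = -NormalDist().inv_cdf(1 - 0.05)   # lower-tailed cutoff, -z_.05
print(round(z, 2), z <= z_crit)   # -1.18 is not <= -1.645: do not reject H0
```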
Example
Natural cork in wine bottles is subject to deterioration, and
as a result wine in such bottles may experience
contamination.
P-Values
The P-value is the probability, computed assuming that H0 is true,
of obtaining a value of the test statistic at least as contradictory
to H0 as the value actually observed.
Example
Urban storm water can be contaminated by many sources,
including discarded batteries. When ruptured, these batteries
release metals of environmental significance.
The article “Urban Battery Litter” (J. of Environ. Engr., 2009:
46–57) presented summary data for characteristics of a
variety of batteries found in urban areas around Cleveland.
That is, H0 should be rejected in favor of Ha when the P-
value is sufficiently small (such an extreme sample statistic
would be unlikely if the null hypothesis were true).
Decision rule based on the P-value
Select a significance level α (as before, the desired type I
error probability); α then defines the rejection region. The decision
rule becomes: reject H0 if the P-value ≤ α; do not reject H0 otherwise.
Note: the P-value can be thought of as the smallest
significance level at which H0 can be rejected.
P-Values
In the previous example, we calculated the P-value =
.0012. Then using a significance level of .01, we would
reject the null hypothesis in favor of the alternative
hypothesis because .0012 ≤ .01.
This is why we cannot change the significance level after we
see the data: it is not allowed, though tempting!
P-Values for z Tests
The calculation of the P-value depends on whether the test
is upper-, lower-, or two-tailed:
Upper-tailed (Ha: µ > µ0): P = 1 − Φ(z)
Lower-tailed (Ha: µ < µ0): P = Φ(z)
Two-tailed (Ha: µ ≠ µ0): P = 2[1 − Φ(|z|)]
Each of these is the probability of getting a value at least as
extreme as the one obtained (assuming H0 is true).
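The three P-value cases for a z test can be sketched with the standard normal cdf Φ; the example value z = 2.10 below is hypothetical, not from the slides.

```python
# P-value of a z test from the observed z; Phi is the standard normal cdf.
from statistics import NormalDist

def z_p_value(z, alternative):
    Phi = NormalDist().cdf
    if alternative == "greater":        # upper-tailed: P = 1 - Phi(z)
        return 1 - Phi(z)
    if alternative == "less":           # lower-tailed: P = Phi(z)
        return Phi(z)
    return 2 * (1 - Phi(abs(z)))        # two-tailed: P = 2[1 - Phi(|z|)]

print(round(z_p_value(2.10, "greater"), 4))   # hypothetical z = 2.10
```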
P-Values for z Tests
The three cases are illustrated in Figure 8.9.
Example
The target thickness for silicon wafers used in a certain
type of integrated circuit is 245 µm.
A sample of 50 wafers is obtained and the thickness of
each one is determined, resulting in a sample mean
thickness of 246.18 µm and a sample standard deviation of
3.60 µm.
Does this data suggest that true average wafer thickness is
something other than the target value? Use a significance
level of .01.
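The slides leave this question open; a sketch of the two-tailed large-sample computation with the stated numbers (n = 50 is large enough to use s in place of σ and treat the statistic as standard normal):

```python
# Wafer example: two-tailed large-sample test of H0: mu = 245 vs Ha: mu != 245.
from statistics import NormalDist

xbar, mu0, s, n = 246.18, 245.0, 3.60, 50
z = (xbar - mu0) / (s / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed P-value
print(round(z, 2), round(p_value, 3))          # P-value is about .02 > .01
```

Because the P-value exceeds the chosen significance level .01, H0 is not rejected at that level.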
P-Values for t Tests
Just as the P-value for a z test is an area under the z
curve, the P-value for a t test is the corresponding area
under the t curve.
P-Values for t Tests
The table of t critical values used previously for confidence
and prediction intervals doesn’t contain enough information
about any particular t distribution to allow for accurate
determination of desired areas.
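One way around the limited table is to compute the t-curve area numerically. This sketch integrates the t density with Simpson's rule rather than relying on any particular software's built-in function; the check value t = 2.0 with 10 df is my own illustration.

```python
# Area under the t curve to the right of t0, by numerical integration of the
# t density; this replaces a table lookup when an exact P-value is wanted.
import math

def t_pdf(t, df):
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_area(t0, df, upper=60.0, steps=20000):
    """P(T >= t0) for a t distribution with df degrees of freedom (Simpson's rule)."""
    h = (upper - t0) / steps
    s = t_pdf(t0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(t0 + i * h, df)
    return s * h / 3

print(round(t_upper_area(2.0, df=10), 4))   # upper-tailed area beyond t = 2.0
```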
More on Interpreting P-values
How are P-values distributed?
The figure below shows a histogram of the 10,000 P-values from a simulation
experiment under the null hypothesis μ = 20 (with n = 4 and σ = 2).
When H0 is true, the probability distribution of the P-value is a uniform
distribution on the interval from 0 to 1.
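That uniformity can be checked by simulation. This sketch uses an upper-tailed z test with σ treated as known; the slides may have used a t test instead, but under H0 the P-value distribution is uniform either way.

```python
# Simulate 10,000 P-values under H0: mu = 20 (n = 4, sigma = 2, upper-tailed z test).
# When H0 is true the P-values are uniform on (0, 1), so about 5% fall below .05.
import random
from statistics import NormalDist

random.seed(1)
mu0, sigma, n = 20.0, 2.0, 4
pvals = []
for _ in range(10000):
    xbar = sum(random.gauss(mu0, sigma) for _ in range(n)) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)
    pvals.append(1 - NormalDist().cdf(z))   # upper-tailed P-value

frac = sum(p < 0.05 for p in pvals) / len(pvals)
print(frac)   # close to .05, matching the histogram in the slides
```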
Example cont’d
About 4.5% of these P-values are in the first class interval
from 0 to .05.
If we continued to generate samples and carry out the test
for each sample at significance level .05, in the long run 5%
of the P-values would be in the first class interval.
Example cont’d
A histogram of the P-values when we simulate under an alternative
hypothesis shows a much greater tendency for the P-value to be
small (closer to 0) when µ = 21 than when µ = 20.
(b) μ = 21
Example cont’d
Unfortunately, this is the case for only about 19% of the
P-values. So only about 19% of the 10,000 tests correctly
reject the null hypothesis; for the other 81%, a type II error
is committed.
The difficulty is that the sample size is quite small and 21 is
not very different from the value asserted by the null
hypothesis.
Example cont’d
(c) μ = 22
Example cont’d
In general, as µ moves further to the right of the null value
20, the distribution of the P-value will become more and
more concentrated on values close to 0.
Even here a bit fewer than 50% of the P-values are smaller
than .05. So it is still slightly more likely than not that the
null hypothesis is incorrectly not rejected. Only for values of
µ much larger than 20 (e.g., at least 24 or 25) is it highly
likely that the P-value will be smaller than .05 and thus give
the correct conclusion.
Statistical Versus Practical Significance
When using a test with a very large sample, even a tiny departure
from the null value can yield a small P-value, so a statistically
significant result is not necessarily practically important.
R code