
7.hypothesis Test

Uploaded by

yanghm669
Copyright
© © All Rights Reserved

Hypothesis Testing

Fei He, PhD


Department of Epidemiology & Health Statistics, SPH, FJMU

1
Is it statistically significant?

2
Data must be interpreted in order to add meaning.

3
Outline

• What is hypothesis testing?

• When is hypothesis testing used?

• How is hypothesis testing done?

4
Hypothesis

A hypothesis is an educated guess about

something in the world around you. It should

be testable, either by experiment or

observation.

5
Hypothesis
For example:
A new medicine you think might work.
A way of teaching I think might be better.
A possible location of new species.
A fairer way to administer standardized tests.

It can really be anything at all as long as you can put it to the


test.

6
What is a Hypothesis Statement?

If you are going to propose a hypothesis, it’s customary to

write a statement.

Your statement will look like this:

“If I…(do this to an independent variable)….then (this will

happen to the dependent variable).”

7
What is a Hypothesis Statement?
For example:

If I (decrease the amount of water given to herbs) then (the


herbs will increase in size).
If I (give patients counseling in addition to medication) then
(their overall depression scale will decrease).
If I (give exams at noon instead of 7) then (student test
scores will improve).
If I (look in this certain location) then (I am more likely to
find new species).

8
What is a Hypothesis Statement?
A hypothesis statement should:

Include both the independent and dependent variables.


Be testable by experiment, survey or other scientifically
sound technique.
Be based on information in prior research (either yours or
someone else’s).
Have design criteria (for engineering or programming
projects).

9
Example

One place where you can consistently see the general idea
of hypothesis testing in action is in criminal trials.

The criminal justice system assumes "the defendant is


innocent until proven guilty."

That is, the initial assumption is that the defendant is


innocent.

10
Example

In the practice of statistics, we make our initial


assumption when we state our two competing hypotheses.
Here, our hypotheses are:

H0: Defendant is not guilty (innocent)

H1: Defendant is guilty

11
Example
The two hypotheses are called the null hypothesis and the
alternative (or research) hypothesis. The usual notation is:

H0 (pronounced “H nought”): the ‘null’ hypothesis

H1: the ‘alternative’ or ‘research’ hypothesis

12
Example

In statistics, we always assume the null hypothesis is
true. That is, the null hypothesis is always our initial
assumption.
The null hypothesis (H0) states that the parameter equals a
specified value (the null value); the alternative hypothesis
(H1) states that the parameter differs from that value.

The prosecution team then collects evidence — such as


finger prints, blood spots, hair samples, carpet fibers, shoe
prints, and handwriting samples — with the hopes of
finding "sufficient evidence" to make the assumption of
innocence refutable.
13
Example

The jury then makes a decision based on the available


evidence:
If the jury finds sufficient evidence — beyond a
reasonable doubt — to make the assumption of innocence
refutable, the jury rejects the null hypothesis and deems
the defendant guilty. We behave as if the defendant is
guilty.
If there is insufficient evidence, then the jury does not
reject the null hypothesis. We behave as if the defendant
is innocent.

14
Example
In statistics, the data are evidence.

In statistics, we always make one of two decisions. We


either "reject the null hypothesis" or we "fail to reject the
null hypothesis.”

Did you notice the use of the phrase "behave as if" in the
previous discussion? We "behave as if" the defendant is
guilty; we do not "prove" that the defendant is guilty.
And, we "behave as if" the defendant is innocent; we do
not "prove" that the defendant is innocent.

15
Example
This is a very important distinction! We make our
decision based on evidence not on 100% guaranteed
proof. Again:
If we reject the null hypothesis, we do not prove that the
alternative hypothesis is true.
If we do not reject the null hypothesis, we do not prove
that the null hypothesis is true.
We merely state that there is enough evidence to behave
one way or the other. This is always true in statistics!
Because of this, whatever the decision, there is always a
chance that we made an error.

16
Errors in Hypothesis Testing
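There are two possible errors.
Let's review the two types of errors that can be made in
criminal trials:

Type I Error: an innocent defendant is found guilty (a true H0 is rejected)
Type II Error: a guilty defendant is found not guilty (a false H0 is not rejected)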

17
Errors in Hypothesis Testing
In statistics:

18
Errors in Hypothesis Testing
A Type I error occurs when we reject a true null hypothesis
(i.e. reject H0 when it is TRUE).

                    H0 true             H0 false
Reject H0           Type I error        Correct decision
Do not reject H0    Correct decision    Type II error

A Type II error occurs when we don’t reject a false null
hypothesis (i.e. do NOT reject H0 when it is FALSE).
19
Errors in Hypothesis Testing
The probability of a Type I error is denoted as α (Greek letter
alpha). The probability of a type II error is β (Greek letter beta).

The two probabilities are inversely related. Decreasing one


increases the other, for a fixed sample size.

In other words, you can’t make α and β both very small with a
small sample. You may have to take a much larger sample
or, in the court example, gather much more evidence.
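
A rough way to see what α means is to simulate many samples for which the null hypothesis really is true and count how often a test rejects it anyway. The sketch below is illustrative only and is not part of the original slides: the function name type_i_error_rate, the standard normal population, n = 25, and the number of trials are all assumptions chosen for the demonstration.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=25, trials=20000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    se = 1 / n ** 0.5                              # sigma = 1 is assumed known
    rejections = 0
    for _ in range(trials):
        # H0 (mu = 0) is true for every simulated sample.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(rate_05, rate_01)
```

The observed rejection rates should land close to the chosen α values, which is what "the probability of a Type I error is α" means in practice.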

20
Errors in Hypothesis Testing
The critical concepts are the following:
1. There are two hypotheses, the null and the alternative hypotheses.
2. The procedure begins with the assumption that the null hypothesis is
true.
3. The goal is to determine whether there is enough evidence to infer
that the alternative hypothesis is true, or the null is not likely to be
true.
4. There are two possible decisions:
Conclude that there is enough evidence to support the alternative
hypothesis. Reject the null.
Conclude that there is not enough evidence to support the
alternative hypothesis. Fail to reject the null.

21
Making a decision

Recall that it is either likely or unlikely that we would
observe the evidence we did, given our initial assumption. If
it is likely, we do not reject the null hypothesis. If it
is unlikely, then we reject the null hypothesis in favor of
the alternative hypothesis. Effectively, then, making the
decision reduces to determining "likely" or "unlikely."

22
Making a decision
In statistics, there are three ways to determine whether the
evidence is likely or unlikely given the initial assumption.

Let’s see.

23
Statistical Example
Consider mean demand for computers during assembly lead
time. Rather than estimate the mean demand, our operations
manager wants to know whether the mean is different from
350 units. In other words, someone is claiming that the mean
demand is 350 units and we want to check whether this claim
appears reasonable. We can rephrase this request as a
test of the hypothesis:
H0: μ = 350
Thus, our research hypothesis becomes:
H1: μ ≠ 350
Suppose that the standard deviation [σ] is assumed to be
75, the sample size [n] is 25, and the sample mean [x̄]
is calculated to be 370.16.
24
Statistical Example
For example, if we’re trying to decide whether the mean is
not equal to 350, a large value of x̄ (say, 600) would provide
enough evidence.

If x̄ is close to 350 (say, 355), we could not say that this
provides a great deal of evidence to infer that the population
mean is different from 350.

25
Statistical Example
The testing procedure begins with the assumption that the
null hypothesis is true.

Thus, until we have further statistical evidence, we will
assume:

H0: μ = 350 (assumed to be TRUE)

The next step will be to determine the sampling distribution
of the sample mean assuming the true mean is 350.
x̄ is normal with mean 350 and
standard error σ/√n = 75/SQRT(25) = 15
26
Is the Sample Mean in the Guts of the Sampling
Distribution??

27
Three ways to determine this: First way
1. Unstandardized test statistic: Is x̄ in the guts of the
sampling distribution? That depends on what you define as
the “guts” of the sampling distribution.

If we define the guts as the center 95% of the distribution
[this means α = 0.05], then the critical values that define
the guts will be 1.96 standard deviations of x̄ on
either side of the mean of the sampling distribution
[350], or
UCV = 350 + 1.96*15 = 350 + 29.4 = 379.4
LCV = 350 – 1.96*15 = 350 – 29.4 = 320.6
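
The critical-value arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names (mu0, se, lcv, ucv, in_guts) are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 350, 75, 25, 0.05
x_bar = 370.16

se = sigma / sqrt(n)                          # standard error: 75 / 5 = 15
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for the center 95%
lcv = mu0 - z_crit * se                       # lower critical value
ucv = mu0 + z_crit * se                       # upper critical value
in_guts = lcv < x_bar < ucv                   # True means "fail to reject H0"

print(round(lcv, 1), round(ucv, 1), in_guts)
```

Because the sample mean 370.16 falls between the two critical values, it lies in the "guts" of the sampling distribution.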

28
1. Unstandardized Test Statistic Approach

29
Three ways to determine this: Second
way
2. Standardized test statistic: Since we defined the “guts” of
the sampling distribution to be the center 95% [α = 0.05],
If the Z-Score for the sample mean is greater than
1.96, we know that x̄ will be in the reject region on the right
side, or
If the Z-Score for the sample mean is less than -1.96,
we know that x̄ will be in the reject region on the left side.

Z = (x̄ – μ)/(σ/√n) = (370.16 – 350)/15 = 1.344

Is this Z-Score in the guts of the sampling distribution???
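
The same question can be answered in code by standardizing the sample mean and comparing it with ±1.96. The helper name z_statistic below is a hypothetical name introduced for this sketch, not something defined in the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """Standardize a sample mean under H0: mu = mu0, with sigma assumed known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

z = z_statistic(370.16, 350, 75, 25)     # (370.16 - 350) / 15
z_crit = NormalDist().inv_cdf(0.975)     # about 1.96 for a two-tailed 5% test
reject = abs(z) > z_crit                 # False: z is inside the center 95%

print(round(z, 3), reject)
```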


30
2. Standardized Test Statistic Approach

31
Three ways to determine this: Third way
3. The p-value approach (which is generally used with a computer and
statistical software): Increase the “Rejection Region” until it
“captures” the sample mean.

In null hypothesis testing, the p-value is the probability of obtaining


test results at least as extreme as the results actually observed, under
the assumption that the null hypothesis is correct.

The smaller the p-value, the stronger the evidence that you should
reject the null hypothesis.

32
Three ways to determine this: Third way
Graphically, the p value is the area in the tail of a probability
distribution.

33
Three ways to determine this: Third way
When you perform a statistical test a p-value helps you
determine the significance of your results in relation to the
null hypothesis.

How do you know if a p-value is statistically significant?

The level of statistical significance is often expressed as a
p-value between 0 and 1. The smaller the p-value, the stronger
the evidence that you should reject the null hypothesis.

34
Three ways to determine this: Third way
How do you know if a p-value is statistically significant?

A p-value of 0.05 or less (p ≤ 0.05) is conventionally called
statistically significant. It indicates strong evidence against
the null hypothesis: if the null were true, there would be less
than a 5% probability of obtaining results at least as extreme
as those observed. Therefore, we reject the null hypothesis in
favor of the alternative hypothesis.
However, this does not mean that there is a 95% probability
that the research hypothesis is true. The p-value is
computed under the assumption that the null hypothesis is true;
it is not the probability that either hypothesis is true.

35
Three ways to determine this: Third way
A p-value higher than 0.05 (> 0.05) is not statistically significant:
the evidence against the null hypothesis is weak. This means
we fail to reject the null hypothesis.
Note that we cannot accept the null hypothesis; we
can only reject the null or fail to reject it.
A statistically significant result cannot prove that a research
hypothesis is correct (as this implies 100% certainty).
Instead, we may state that our results “provide support for” or “give
evidence for” our research hypothesis (results this extreme can still
occur by chance when the null hypothesis is true, with probability
of at most 5% at the 0.05 level).

36
Three ways to determine this: Third way

For this example, since x̄ is to the right of the mean, calculate
P(x̄ > 370.16) = P(Z > 1.344) ≈ 0.0901 (table value for z = 1.34)
Since this is a two-tailed test, you must double this area for the
p-value.
p-value = 2*(0.0901) = 0.1802
Since we defined the guts as the center 95% [α = 0.05], the reject
region is the other 5%. Since the p-value for our sample mean, x̄, is
18.02%, which is larger than 5%, x̄ cannot be in our 5% rejection
region [α = 0.05].
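
Statistical software computes this tail area directly instead of reading it from a table. The sketch below uses Python's standard-library normal distribution; the variable names are illustrative. Because it uses the unrounded z = 1.344, the result is roughly 0.18 and differs slightly in the third decimal place from the table-based 0.1802.

```python
from statistics import NormalDist

z = (370.16 - 350) / 15                      # standardized sample mean, 1.344
upper_tail = 1 - NormalDist().cdf(abs(z))    # area to the right of |z|
p_value = 2 * upper_tail                     # two-tailed test: double the tail area
alpha = 0.05
reject = p_value <= alpha                    # False: fail to reject H0

print(round(p_value, 4), reject)
```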

37
3. p-value approach

38
Statistical Conclusions:
Unstandardized Test Statistic:
Since LCV (320.6) < x̄ (370.16) < UCV (379.4), we fail to
reject the null hypothesis at a 5% level of significance.

Standardized Test Statistic:
Since -Zα/2 (-1.96) < Z (1.344) < Zα/2 (1.96), we fail to
reject the null hypothesis at a 5% level of significance.

P-value:
Since p-value (0.1802) > 0.05 [α], we fail to reject the
null hypothesis at a 5% level of significance.
39
Statistical Inference
Population (parameters, e.g., μ and σ)
→ select sample at random
Sample
→ collect data from individuals in sample
Data
→ analyse data (e.g. estimate x̄, s) to make inferences
40
Formal Hypothesis Testing
Step 1: State null and alternate hypotheses

Step 2: Select a level of significance

Step 3: Identify the test statistic

Step 4: Formulate a decision rule

Step 5: Take a sample, arrive at a decision

Do not reject null, or reject null and accept alternate
41


State Null and Alternative Hypotheses

Example

We have a medicine that is being manufactured and each pill is supposed to have 14

milligrams of the active ingredient. What are our null and alternative hypotheses?

42
Null hypotheses

Some null hypotheses may be:

• there is no difference between the height of the students in
class one and class two

• there is no difference in the location (distance to downtown)
of superstores and small grocers’ shops

• there is no connection between the size of farm and the type
of farm

43
Components of a Formal Hypothesis
Test

The alternative hypothesis (denoted H1 or Ha) is a


statement that the value of a population parameter
somehow differs from the null hypothesis. The
symbolic form must be a >, < or ≠ statement.

44
One- and Two-Tail Tests…

[Figure: rejection regions for a one-tail test (left tail), a two-tail test, and a one-tail test (right tail).]

45
One- and Two-Tail Tests…

• A hypothesis test can be one-tailed or two-tailed. The examples


above are all two-tailed hypothesis tests.
• We do not specify whether we believe the true mean to be
higher or lower than the hypothesized mean. We just believe it
must be different.
• We would use a single-tail hypothesis test when the direction of
the results is anticipated or we are only interested in one
direction of the results.

46
One- and Two-Tail Tests…

• When performing a single-tail hypothesis test, our alternative


hypothesis looks a bit different. We use the symbols of greater
than or less than. For example, let’s say we were claiming that
the average final score of students was GREATER than 85.
Remember, our own personal hypothesis is the alternative
hypothesis.
• Then our null and alternative hypothesis could look something
like:
H0 : µ ≤ 85
Ha : µ > 85
47
Select a Level of Significance α
Alpha levels are controlled by the researcher and are related
to confidence levels. You get an alpha level by subtracting your
confidence level from 100%. For example, if you want to be 98
percent confident in your research, the alpha level would be 2%
(100% – 98%). When you run the hypothesis test, the test will
give you a value for p. Compare that value to your chosen alpha
level.
For example, let’s say you chose an alpha level of 5% (0.05). If the
results from the test give you:
• A small p (≤ 0.05), reject the null hypothesis. This is strong evidence
against the null hypothesis.
• A large p (> 0.05), do not reject the null hypothesis. The evidence
against the null is weak.

48
Select a Level of Significance α

Designated α (alpha)
Typical values are 0.01, 0.05, 0.10

Type I Error:
Rejecting the null hypothesis when it is actually true
(α).

49
Types of errors

When a true null hypothesis is rejected, it causes a
Type I error whose probability is α.
When a false null hypothesis is not rejected, it causes
a Type II error whose probability is designated by β.
A Type I error is conventionally treated as more serious
than a Type II error.

50
Table of error conditions

51
Risk management

Since rejecting a null hypothesis has a chance of
committing a Type I error, we make α small by
selecting an appropriate level of significance
(equivalently, confidence level).
Generally, we do not control β directly, even though it is
generally greater than α. As a result, when failing to
reject a null hypothesis, the risk of error is usually unknown.

52
Probability of a Type II Error: β
A Type II error occurs when a false null hypothesis is not
rejected or “you accept the null when it is not true” but don’t
say it this way if a statistician is around.

In practice, this is by far the most serious error you can make
in most cases, especially in the “quality field”.

53
Judging the Test…
A statistical test of hypothesis is effectively defined by the
significance level (α) and the sample size (n), both of
which are selected by the statistics practitioner.

Therefore, if the probability of a Type II error (β) is too
large [we have insufficient power], we can reduce it by
increasing α, and/or
increasing the sample size, n.
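
Both remedies can be illustrated numerically with the computer-demand test (H0: μ = 350, σ = 75). The sketch below is not from the slides: the function name type_ii_error_prob is hypothetical, and the "true" mean of 380 is an assumed value chosen only to have a concrete alternative to evaluate.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_prob(mu_true, mu0=350, sigma=75, n=25, alpha=0.05):
    """P(fail to reject H0: mu = mu0) for a two-sided z-test when the mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lcv, ucv = mu0 - z_crit * se, mu0 + z_crit * se
    # Sampling distribution of the sample mean under the assumed true mean.
    sampling = NormalDist(mu_true, se)
    return sampling.cdf(ucv) - sampling.cdf(lcv)

base = type_ii_error_prob(380)                     # alpha = 0.05, n = 25
bigger_alpha = type_ii_error_prob(380, alpha=0.10) # larger alpha, same n
bigger_n = type_ii_error_prob(380, n=100)          # same alpha, larger n

print(round(base, 3), round(bigger_alpha, 3), round(bigger_n, 3))
```

Under these assumptions β drops when α is raised and drops much further when the sample size is quadrupled; the power of each test is 1 minus the printed value.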

54
Judging the Test…
The power of a test is defined as 1 – β.
It represents the probability of rejecting the null hypothesis when it is
false and the true mean is something other than the null value for the
mean.

If we are testing the hypothesis that the average amount of medication
in blood pressure pills is equal to 6 mg (which is good), and we “fail to
reject” the null hypothesis, ship the pills to patients worldwide, only to
find out later that the “true” average amount of medication is really 7
mg and people die, we get in trouble. This occurred because
P(reject the null | true mean = 7 mg) = 0.32, which means that we
have a 68% chance of not rejecting the null for these BAD pills and
shipping them to patients worldwide.

55
Probability you ship pills whose mean amount of medication is 7 mg approximately
67%

56
Select the test statistic
Test statistic: A value, determined from sample information,
used to determine whether or not to reject the null
hypothesis.

Examples: z, t, F, χ²

\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \qquad t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \]

57
Formulate the decision rule
Critical value: The dividing point between the region where
the null hypothesis is rejected and the region where it is not
rejected.

[Figure: sampling distribution of the statistic z for a right-tailed test (alternative hypothesis H1: μ > μ0) at the 0.05 level of significance. The do-not-reject region (probability 0.95) lies below the critical value 1.65; the region of rejection (probability 0.05) lies above it.]
58
Formulate the decision rule
The alternate hypothesis, H1, states a direction (μ > μ0 or μ <
μ0 )

[Figure: sampling distribution of the statistic z for a right-tailed test (alternative hypothesis H1: μ > μ0) at the 0.05 level of significance, with the do-not-reject region (probability 0.95) below the critical value 1.65 and the region of rejection (probability 0.05) above it.]
59
Formulate the decision rule
No direction is specified in the alternate hypothesis H1
(μ ≠ μ0).
[Figure: regions of rejection for a two-tailed test at the 0.05 level of significance. The do-not-reject region (probability 0.95) lies between the critical values -1.96 and 1.96; each tail beyond them is a region of rejection (probability 0.025).]
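
The critical values that bound these regions come from the inverse of the standard normal distribution. A minimal sketch, assuming α = 0.05 and using only Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05

right_tail = std_normal.inv_cdf(1 - alpha)      # H1: mu > mu0, about 1.645
left_tail = std_normal.inv_cdf(alpha)           # H1: mu < mu0, about -1.645
two_tail = std_normal.inv_cdf(1 - alpha / 2)    # H1: mu != mu0, about 1.960 (use +/-)

print(round(right_tail, 3), round(left_tail, 3), round(two_tail, 3))
```

The one-tailed value of about 1.645 is what the slides round to 1.65.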
60
Example: Normal Body Temperature
What is normal body temperature? Is it
actually 37.6 °C (on average)?
State the null and alternative hypotheses
H0: μ = 37.6 °C
Ha: μ ≠ 37.6 °C

61
Example: Normal Body Temperature
Data: random sample of n = 18 normal body temps

37.2 36.8 38.0 37.6 37.2 36.8 37.4 38.7 37.2


36.4 36.6 37.4 37.0 38.2 37.6 36.1 36.2 37.5

62
Example: Normal Body Temperature
Data: random sample of n = 18 normal body temps

37.2 36.8 38.0 37.6 37.2 36.8 37.4 38.7 37.2


36.4 36.6 37.4 37.0 38.2 37.6 36.1 36.2 37.5

Summarize data with a test statistic

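Variable       n     Mean     SD      SE       t       P
Temperature    18    37.22    0.68    0.161    2.38    0.029

\[ t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

(The table reports the magnitude of t. The sample mean is below the null
value, so the signed statistic is negative.)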
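
The summary row can be reproduced from the 18 temperatures listed on the slide. This sketch uses only Python's standard library; the variable names are illustrative. It stops at the t statistic, because the standard library has no t distribution for computing the p-value.

```python
from math import sqrt
from statistics import mean, stdev

temps = [37.2, 36.8, 38.0, 37.6, 37.2, 36.8, 37.4, 38.7, 37.2,
         36.4, 36.6, 37.4, 37.0, 38.2, 37.6, 36.1, 36.2, 37.5]
mu0 = 37.6                     # null value for the mean temperature

n = len(temps)
x_bar = mean(temps)            # sample mean
s = stdev(temps)               # sample standard deviation (n - 1 in the denominator)
se = s / sqrt(n)               # standard error of the mean
t = (x_bar - mu0) / se         # negative, because x_bar is below mu0

print(n, round(x_bar, 2), round(s, 2), round(se, 3), round(t, 2))
```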
63
STUDENT’S t DISTRIBUTION TABLE

Degrees of    Probability (p value)
freedom       0.10      0.05      0.01

1             6.314     12.706    63.657
5             2.015     2.571     4.032
10            1.813     2.228     3.169
17            1.740     2.110     2.898
20            1.725     2.086     2.845
24            1.711     2.064     2.797
25            1.708     2.060     2.787
∞             1.645     1.960     2.576
64
Example: Normal Body Temperature
Find the p-value
Df = n – 1 = 18 – 1 = 17
From SPSS: p-value = 0.029
From t Table: p-value is between 0.05 and 0.01.
The value t = 2.38 lies between the table entries 2.110 and 2.898
in the df = 17 row, whose column headings (p-values) are 0.05 and 0.01.
[Figure: t distribution with t = -2.11 and t = +2.11 marked. The area to the left of t = -2.11 equals the area to the right of t = +2.11.]
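
The table lookup can be written as a small function that brackets the p-value between adjacent column headings. The row below is copied from the df = 17 line of the t table; the names T_ROW_DF17 and bracket_p_value are hypothetical, and the sketch assumes the column headings are two-tailed probabilities.

```python
# Critical values for df = 17, keyed by two-tailed p-value.
T_ROW_DF17 = {0.10: 1.740, 0.05: 2.110, 0.01: 2.898}

def bracket_p_value(t_abs, row):
    """Return (lower, upper) bounds on the two-tailed p-value for |t| = t_abs."""
    lower, upper = 0.0, 1.0
    for p in sorted(row, reverse=True):    # visit 0.10, then 0.05, then 0.01
        if t_abs >= row[p]:
            upper = p                      # |t| exceeds this critical value, so p <= this heading
        else:
            lower = p                      # |t| falls short, so p > this heading
            break
    return lower, upper

bounds = bracket_p_value(2.38, T_ROW_DF17)
print(bounds)
```

For |t| = 2.38 the function returns the bounds 0.01 and 0.05, matching the table reading and consistent with the software value of 0.029.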
65
Example: Normal Body Temperature
Decide whether or not the result is
statistically significant based on the p-value
Using α = 0.05 as the level of significance criterion,
the results are statistically significant because 0.029
is less than 0.05. In other words, we can reject the
null hypothesis.

Report the Conclusion

We can conclude, based on these data, that the mean
temperature in the human population does not equal
37.6 °C.

66