7. Hypothesis testing and sample size determination
Hypothesis Testing
Hypothesis testing is a formal process of statistical analysis using
inferential statistics.
The goal of hypothesis testing is to compare populations or assess
relationships between variables using samples.
Statistical tests come in three forms: tests of comparison, correlation or regression.
1. Comparison tests: assess whether there are differences in means, medians or
rankings of scores of two or more groups (e.g. t-test, ANOVA).
To decide which test suits your aim, consider whether your data meets the
conditions necessary for parametric tests, the number of samples, and the
levels of measurement of your variables.
2. Correlation tests: determine the extent to which two variables are associated.
Although Pearson’s r is the most statistically powerful test, Spearman’s r is
appropriate for interval and ratio variables when the data doesn’t follow a
normal distribution. The chi square test of independence is the only test that can
be used with nominal variables.
3. Regression tests: assess whether changes in predictor variables are associated
with changes in an outcome variable. You can decide which regression test to use
based on the number and types of variables you have as predictors and outcomes
(e.g. linear regression, logistic regression, and others).
Most of the commonly used regression tests are parametric. If your data is not
normally distributed, you can perform data transformations.
Tests of Significance
Data are often collected to answer specified questions, such as:
Do under-five (U5) children in urban areas have a lower prevalence of
malnutrition than those in rural areas?
Is a new treatment (Rx) beneficial to those suffering from a certain
disease compared with the standard treatment?
Such questions may be answered by setting up a hypothesis
and then using the data to test this hypothesis.
Hypothesis Testing
Purpose is to aid the researcher in reaching a decision concerning
a population by examining a sample from that population.
Hypothesis:
A statement about one or more populations.
A claim about a population parameter.
A statement concerning one or more populations which may or may not
be true.
Types of hypothesis
Researchers are concerned with two types of hypotheses:
1. Research hypothesis is the conjecture or supposition that
motivates the research
E.g. The mean birth weight of babies delivered by mothers
with low SES is lower than that of babies delivered by mothers with higher SES.
2. Statistical hypotheses (H0 and HA) are hypotheses that
are stated in such a way that they may be evaluated
by appropriate statistical techniques
Statistical hypothesis
There are two statistical hypotheses that are involved in hypothesis testing,
1. Null hypothesis (H0) is the hypothesis to be tested.
Always contains “=” , “ ≤” or “≥ ” sign
May or may not be rejected
Sometimes referred to as a hypothesis of no difference or no
effect, since it is a statement of agreement with (or no difference
from) conditions presumed to be true in the population of interest.
2. Alternative hypothesis (HA): The notation HA (or H1) is used for the
hypothesis that will be accepted if H0 is rejected.
The opposite of H0.
Is a statement of what we will believe is true if our sample data
causes us to reject Ho.
May or may not be accepted.
Steps in Hypothesis Testing
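1. Formulate the appropriate statistical hypotheses clearly.
2. Describe the sample data and the assumptions they satisfy.
3. Choose the appropriate test statistic.
4. Select the level of significance for the statistical test (α = 0.05, 0.01, 0.1, etc.).
5. Determine the critical value: the value the test statistic must attain to be
declared significant.
6. Compute the value of the test statistic from the sample data.
7. Make the statistical decision (reject or do not reject H0) and state the
conclusion, or calculate the P-value.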
Rules for Stating Statistical Hypothesis
1. One population
An indication of equality (=, ≤ or ≥) must appear in H0.
H0: μ = μ0, HA: μ ≠ μ0
H0: P = P0, HA: P ≠ P0
Can we conclude that a certain population mean is
not 50? H0: μ = 50 and HA: μ ≠ 50
greater than 50? H0: μ ≤ 50 and HA: μ > 50
Can we conclude that the proportion of patients with leukemia who
survive more than six years is not 60%?
H0: P = 0.6 and HA: P ≠ 0.6
2. Two populations
H0: μ1 = μ2, HA: μ1 ≠ μ2
H0: P1 = P2, HA: P1 ≠ P2
Rejection and Non-Rejection Regions
The values of the test statistic assume the points on the
horizontal axis of the normal distribution and are divided into
two groups:
Rejection region, and
Non-rejection region.
The values of the test statistic forming the rejection region are
less likely to occur if the Ho is true.
The values making the acceptance (non-rejection) region are
more likely to occur if the Ho is true.
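Example: Two-sided test at α = 5%
[Figure: standard normal curve with a rejection region of area α/2 = 0.025 in each
tail, beyond −1.96 and 1.96, and a central non-rejection region of area 0.95.]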
Reject Ho if the computed value of the test statistic is one of the values in
the rejection region.
Don’t reject Ho if the computed value of the test statistic is one of the values
in the non-rejection region.
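The two-sided decision rule can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `two_sided_critical_value` and `in_rejection_region` are made up for this example.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Return z such that the tails beyond -z and +z each have area alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_rejection_region(z: float, alpha: float = 0.05) -> bool:
    """True when the test statistic falls in either tail of the rejection region."""
    return abs(z) > two_sided_critical_value(alpha)


z_crit = two_sided_critical_value(0.05)
print(round(z_crit, 2))            # 1.96
print(in_rejection_region(-1.41))  # False: do not reject Ho
print(in_rejection_region(2.50))   # True: reject Ho
```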
Level of significance
If H0 is rejected, then HA is accepted.
H0 is either true or false, and it is either not rejected or rejected.
No error is made:
When it is true and we fail to reject it, or
When it is false and rejected.
An error is made,
When it is true but rejected (type I (α) errors) , or
When it is false and we fail to reject (type II (β) errors).
α is the probability of a type I error. It is called the level of significance.
β is the probability of a type II error.
Power: The probability of rejecting the null hypothesis when it is false.
Power = 1‐β.
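To make α, β and power concrete, the sketch below computes the rejection probability of a two-sided z-test for a population mean. The numbers (μ0 = 25, σ² = 45, n = 10, true mean 22) are hypothetical values chosen for illustration, and `z_test_power` is an illustrative helper, not a library function.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability that a two-sided z-test rejects H0: mu = mu0
    when the population mean is actually mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under mu_true the test statistic is Normal(shift, 1).
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    lower_tail = std_normal.cdf(-z_crit - shift)     # P(Z < -z_crit)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # P(Z > z_crit)
    return lower_tail + upper_tail


sigma = sqrt(45)
type_one_rate = z_test_power(25, 25, sigma, 10)  # H0 true: equals alpha
power = z_test_power(25, 22, sigma, 10)          # H0 false: power = 1 - beta
beta = 1 - power

print(f"P(type I error) = {type_one_rate:.3f}")
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When H0 is true the rejection probability is exactly α; when it is false the same calculation gives the power, and β is its complement.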
Level of Significance, α
Is the probability of rejecting a true Ho
Defines unlikely values of sample statistic if Ho is true
Defines rejection region of the sampling distribution
The decision is made on the basis of the level of significance,
designated by α.
More frequently used values of α are 0.01, 0.05 and 0.10.
α is selected by the researcher
The level of significance, α, is the probability of rejecting a true null
hypothesis.
For example, a 95% confidence interval corresponds to α = 0.05: about 5% of
intervals constructed this way will fail to contain the true parameter.
Rejecting a true H0 is an error and leads to a false conclusion.
Action (conclusion)     Reality: H0 true      Reality: H0 false
Do not reject H0        Correct action        Type II error (β)
Reject H0               Type I error (α)      Correct action
α = P(Reject H0 | H0 is true)
β = P(Fail to reject H0 | H0 is false)
One tail and two tail tests
In a one tail test, the rejection region is at one end of the distribution
or the other.
In a two tail test, the rejection region is split between the two tails.
Which one is used depends on the way the hypotheses are stated.
E.g., the claim that average survival after a cancer diagnosis is less than 3 years
leads to a one-tailed test (HA: μ < 3).
Difference between the P-value and the level of significance
The significance level α is the probability of making a type I error. This
is set before the test is carried out.
The P-value is the result observed after the study is completed and is
based on the observed data.
It would be better (informative) to give the exact values of P; such as,
P = 0.02 or P = 0.15 rather than P < 0.05 or P > 0.05 .
Another way to state the conclusion:
Reject H0 if P-value < α
Do not reject H0 if P-value ≥ α
The larger the absolute value of the test statistic, the smaller the P-value,
and the smaller the P-value, the stronger the evidence against H0.
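The relationship between a test statistic, its P-value and α can be sketched as follows for a two-sided z-test. The helper names are illustrative, and the z values passed in are arbitrary examples.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """P-value of a two-sided z-test: area in both tails beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 only when the P-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


for z in (0.5, 1.14, 1.96, 2.5, 3.0):
    p = two_sided_p_value(z)
    print(f"z = {z:4.2f}  P = {p:.4f}  ->  {decide(p)}")
```

Running the loop shows the P-value shrinking as |z| grows, which is the sense in which a larger test statistic gives stronger evidence against H0.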
1. Hypothesis test about a population mean (normally distributed)
a) Known variance: the test statistic is
$$ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} $$
Example:
Researchers are interested in the mean level of some enzyme in a certain
population. They are asking: can we conclude that the mean enzyme level in
this population is different from 25?
Step 1: H0: μ = 25
H1: μ ≠ 25
Step 2: They collect a sample of size 10 from a normally distributed population
with a known variance, σ² = 45. The calculated sample mean is x̄ = 22.
Step 3: The Z-statistic is the appropriate one, because
the population is normally distributed and
the population variance is known.
Step 6: $$ z = \frac{22 - 25}{\sqrt{45}/\sqrt{10}} = -1.41 $$
Step 7: Since −1.41 falls in the non-rejection region, we do not reject H0. The
data do not provide sufficient evidence that the mean enzyme level in the
population differs from 25.
OR Calculate P value
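The enzyme example can be checked numerically with the sketch below. The significance level α = 0.05 is an assumption made for this illustration, and `one_sample_z_test` is an illustrative helper written with the standard library only.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test for a mean with known population standard deviation.

    Returns the test statistic, the critical value and the P-value.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value


# Enzyme example: x-bar = 22, mu0 = 25, sigma^2 = 45, n = 10.
z, z_crit, p_value = one_sample_z_test(22, 25, sqrt(45), 10)
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, P = {p_value:.3f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

Both routes agree: |z| is below the critical value and the P-value is above α, so H0 is not rejected.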
b) Unknown variance: The test statistic is
$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} $$
with n − 1 degrees of freedom.
Example
Serum amylase determinations were made on a sample of 15 apparently
healthy subjects. The sample yielded a mean of 96 units/100ml and a standard
deviation of 35 units/100ml. The population was normally distributed but the
variance was unknown. We want to know whether we can conclude that the
mean of the population is different from 120.
Step 1, H0: μ = 120
H1: μ ≠ 120
Step 2, mean = 96, SD = 35, n = 15, μ0 = 120
Step 3, The t-test is the appropriate test, since we are testing a hypothesis about
the population mean, the population is normally distributed, and the population
variance is unknown.
Step 4, α = 0.05
Step 5, t0.025,14 = 2.1448
Step 6, $$ t = \frac{96 - 120}{35/\sqrt{15}} = \frac{-24}{9.04} = -2.65 $$
Step 7, Since −2.65 < −2.1448, the test statistic falls in the rejection region,
so we reject the null hypothesis. The mean of the population from which the
sample came is not 120.
OR Calculate P value
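A short sketch of the serum amylase calculation follows. The critical value 2.1448 is taken from Step 5 rather than computed, because the Python standard library has no t distribution; `one_sample_t_statistic` is an illustrative helper.

```python
from math import sqrt


def one_sample_t_statistic(sample_mean, mu0, sample_sd, n):
    """t statistic for a one-sample test of a mean with unknown variance."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


T_CRITICAL = 2.1448  # t(0.025, 14 df), as quoted in Step 5

t = one_sample_t_statistic(sample_mean=96, mu0=120, sample_sd=35, n=15)
reject = abs(t) > T_CRITICAL

print(f"t = {t:.3f}")
print("reject H0" if reject else "do not reject H0")
```

Carrying full precision gives a statistic of about −2.656; the −2.65 in the worked steps comes from rounding the standard error to 9.04 first. An exact P-value would need a t distribution from a statistics package.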
2. Hypothesis testing about differences between two
population means (normally distributed)
i) Known variance (2 independent samples)
Example:
In a large hospital for the treatment of people with intellectual disabilities, a
sample of 12 individuals with Down syndrome yielded a mean serum
uric acid value of x̄1 = 4.5 mg/100ml. In a general hospital, a sample
of 15 normal individuals of the same age and sex was found to
have a mean value of x̄2 = 3.4 mg/100ml. If it is reasonable to assume that the
two populations of values are normally distributed with
variances equal to 1, do these data provide sufficient evidence
to indicate a difference in mean serum uric acid levels between
normal individuals and individuals with Down syndrome?
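The uric acid example can be worked through with the usual two-sample z statistic for known variances, z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2). The sketch below assumes a two-sided test at α = 0.05, which the example does not state explicitly; `two_sample_z_test` is an illustrative helper.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, var1, var2, n1, n2, alpha=0.05):
    """Two-sided z-test for the difference of two means, known variances,
    independent samples. Returns (z, critical value, P-value)."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value


# 12 individuals with mean 4.5 vs 15 individuals with mean 3.4; variances = 1.
z, z_crit, p_value = two_sample_z_test(4.5, 3.4, 1.0, 1.0, 12, 15)
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, P = {p_value:.4f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

Under these assumptions the statistic exceeds the critical value, so the data would indicate a difference in mean serum uric acid levels.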
HT differences between two population means…Known variance
HT differences between two population means…
HT differences between two population means…Unknown and equal
variance….
3. Hypothesis testing about a single population proportion
Involves categorical values
Two possible outcomes
“Success” (possesses a certain characteristic)
“Failure” (does not possess that characteristic)
Fraction or proportion of population in the “success” category is
denoted by p
Hypothesis testing about a single population proportion…
Example-1: We are interested in the probability of developing
asthma over a given one-year period for children 0 to 4 years of age
whose mothers smoke in the home. In the general population of 0 to
4-year-olds, the annual incidence of asthma is 1.4%. If 10 cases of
asthma are observed over a single year in a sample of 500 children
whose mothers smoke, can we conclude that this is different from
the underlying probability of p0 = 0.014? α = 5%
H0 : p = 0.014
HA: p ≠ 0.014
The sample proportion is 10/500 = 0.02, so
$$ Z = \frac{0.02 - 0.014}{\sqrt{0.014(1 - 0.014)/500}} = 1.14 $$
The critical value of Zα/2 at α = 5% is ±1.96.
Do not reject H0, since Z (= 1.14) lies in the non-rejection
region between ±1.96.
P-value = 0.2543
We do not have sufficient evidence to conclude
that the probability of developing asthma for
children whose mothers smoke in the home is
different from the probability in the general
population
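The asthma example can be reproduced with a one-sample z-test for a proportion, where the standard error is computed under H0 (using p0, not the sample proportion). `one_proportion_z_test` is an illustrative helper built on the standard library.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # evaluated under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 10 asthma cases among 500 children; general-population incidence 1.4%.
z, p_value = one_proportion_z_test(successes=10, n=500, p0=0.014)
print(f"z = {z:.2f}, P = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

The P-value printed here is computed from the unrounded statistic, so it differs slightly in the third decimal from the 0.2543 obtained by rounding Z to 1.14 first.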
4. Hypothesis testing about the difference between two
population proportions
H0: 𝜋1= π2
H1: 𝜋1≠π2
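We use a pooled sample estimate for the common
hypothesized proportion, which is a weighted average of
the sample proportions, with the sample sizes as weights:
$$ p = \frac{n_1 P_1 + n_2 P_2}{n_1 + n_2} \quad\text{or}\quad p = \frac{x_1 + x_2}{n_1 + n_2} $$
where x1 and x2 are the numbers of successes in the two samples.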
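The standard error of P1 − P2 under the null hypothesis is
thus calculated on the assumption that the proportion in
each group is p, so that we have
$$ SE(P_1 - P_2) = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad z = \frac{P_1 - P_2}{SE(P_1 - P_2)} $$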
HT for two population… example-1
Two hundred patients suffering from a certain disease were randomly divided
into two equal groups. Of the first group, 78 recovered within three days. Out
of the other 100, who were treated by a new method, 90 recovered within three
days. The physician wished to know whether the data provide sufficient
evidence to indicate that the new treatment is more effective than the standard.
Since −2.32 < −1.645, the test statistic falls in the rejection region. We reject
the null hypothesis.
The data suggest that the new treatment is more effective than the standard.
HT for two population… example-2
Among the 225 students who ate the sandwiches, 109 became ill, while
among the 38 students who did not eat the sandwiches, 4 became ill. Is there
a significant difference between the two groups at α = 5%?
Ho: p1=p2 against the alternative
HA: p1 ≠ p2
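Both two-proportion examples can be run through the same pooled z-test. The sketch below pools the successes, computes the standard error under H0, and returns the statistic with a two-sided P-value; `two_proportion_z_test` is an illustrative helper, and the one-sided cut-off of −1.645 for the treatment example is taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test of H0: p1 = p2. Returns (z, two-sided P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # weighted average of p1 and p2
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Treatment example: 78/100 recovered on standard, 90/100 on new (one-sided).
z_treatment, _ = two_proportion_z_test(78, 100, 90, 100)
print(f"treatment: z = {z_treatment:.2f}, reject H0: {z_treatment < -1.645}")

# Sandwich example: 109/225 ill among eaters, 4/38 among non-eaters (two-sided).
z_sandwich, p_sandwich = two_proportion_z_test(109, 225, 4, 38)
print(f"sandwich: z = {z_sandwich:.2f}, P = {p_sandwich:.6f}")
```

For the treatment data the unrounded statistic is about −2.31, close to the −2.32 quoted above and well inside the rejection region. For the sandwich data the statistic is far beyond ±1.96, so H0 would be rejected at α = 5%.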
Review Questions…
3. A general physician recorded the oral and rectal temperatures of nine
consecutive patients who made first visits to his office. The temperatures
are given in degrees Celsius (°C). The following measurements were
recorded:
• From this data, what is your
best point estimate of mean
difference between oral and
rectal temperatures?
Sample Size Determination
Why is it important to consider sample size?
In studies concerned with estimating some characteristic of a
population (e.g. the prevalence of asthma in children), sample size
calculations are important to ensure that estimates are
obtained with the required precision or confidence.
For example,
A prevalence of 10% from a sample of size 20 would have a 95%
confidence interval of 1% to 31%, which is not very precise or informative.
On the other hand, a prevalence of 10% from a sample of size 400 would
have a 95% confidence interval of 7% to 13%, which may be considered
sufficiently accurate. (For a sample size of 800 the 95% CI is 8% to 12%.)
Sample size calculations help to avoid the imprecision seen with the
smallest sample.
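The effect of sample size on precision can be explored with an exact (Clopper-Pearson) binomial confidence interval. The choice of the exact method is an assumption here, made because it behaves sensibly at n = 20; the helper names are illustrative and the interval limits are found by simple bisection using only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve(func, target, increasing, tol=1e-10):
    """Find p in (0, 1) with func(p) = target, for a monotone func."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n."""
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


# A 10% prevalence observed at three different sample sizes.
for cases, n in ((2, 20), (40, 400), (80, 800)):
    low, high = exact_ci(cases, n)
    print(f"n = {n:3d}: 95% CI {low:.1%} to {high:.1%}")
```

The interval narrows markedly as n grows, which is the point of the comparison above; the exact limits can differ by a fraction of a percentage point from intervals computed with a normal approximation.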
Sample Size Determination…
Sample size calculations are important to ensure that if an effect
deemed to be clinically or biologically important exists, then there is a
high chance of it being detected, i.e. that the analysis will be
statistically significant.
A critically important aspect of study design is determining the
appropriate sample size to answer the research question
There are formulas that are used to estimate the sample size needed
to produce a confidence interval estimate with a specified margin of
error, or to ensure that a test of hypothesis has a high probability of
detecting a meaningful difference in the parameter if one exists (power
of the study)
Sample size determination…
A study with too small a sample size:
Scientifically: cannot detect clinically important effects.
Economically: wastes resources without the capability of producing useful results.
Ethically: exposes subjects to potentially harmful treatments without
advancing knowledge.
A study with too large a sample size:
Scientifically: demonstrates scientifically (clinically) irrelevant, but
statistically significant effects.
Economically: wastes resources by using more than necessary.
Ethically: exposes an unnecessary number of subjects to potentially harmful
treatments, or denies subjects potentially beneficial ones.
Need for sample size
The eventual sample size is usually a negotiation
between what is desirable and what is feasible.
The feasible sample size is determined by the
availability of resources. It is also important to
remember that resources are not only needed to collect
the information, but also to analyze it.
Points to be considered
1. A reasonable estimate of the key proportion to be studied. If you cannot
guess the proportion, take it as 50%.
2. The degree of accuracy required. That is, the allowed deviation from the true
proportion in the population as a whole. It can be within 1% or 5%, etc.
3. The confidence level required, usually specified as 95%.
4. The size of the population that the sample is to represent. If it is more than
10,000 the precise magnitude is not likely to be very important; but if the
population is less than 10,000 then a smaller sample size may be required.
5. The difference between the two sub-groups and the power, that is, the
likelihood of finding a statistically significant difference.
Sample size for single population Proportion
Estimate how big the proportion might be (P)
Choose the margin of error you will allow in the estimate of the
proportion (say ±d)
Choose the level of confidence that the proportion in the whole
population is indeed between (p-d) and (p+d). We can never be
100% sure. Do you want to be 95% sure?
The minimum sample size required, for a very large population
(N ≥ 10,000) is:
$$ n = \frac{Z_{\alpha/2}^2 \, P(1 - P)}{d^2} $$
Sample size for single population Proportion…
1. If sampling is from a finite population of size N, then
Example
$$ n = \frac{(1.96)^2 \times 0.26 \times (1 - 0.26)}{(0.03)^2} = 821.25 $$
If the above sample is to be taken from a relatively small population (say N = 3000),
the required minimum sample size is obtained by applying the finite population
correction.
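The example calculation can be sketched as below, using P = 0.26, d = 0.03 and Z = 1.96 as in the example. The finite population correction shown, n = n0 / (1 + n0/N), is one commonly used form and is an assumption here, since the exact formula intended for the finite-population case is not given; both function names are illustrative.

```python
from math import ceil


def sample_size_proportion(p, d, z=1.96):
    """Minimum n to estimate a proportion p within ±d (large population)."""
    return z**2 * p * (1 - p) / d**2


def finite_population_correction(n0, population_size):
    """Assumed correction: n = n0 / (1 + n0 / N)."""
    return n0 / (1 + n0 / population_size)


n0 = sample_size_proportion(p=0.26, d=0.03)
n_corrected = finite_population_correction(n0, population_size=3000)

print(f"large population: n = {n0:.2f}, round up to {ceil(n0)}")
print(f"N = 3000:         n = {n_corrected:.2f}, round up to {ceil(n_corrected)}")
```

Sample sizes are rounded up, since rounding down would give slightly less precision than requested.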
To test a hypothesis about the difference between two
population proportions
• This equation is quite general: it applies to comparative cross-sectional, cohort, and
case-control studies
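As an illustration, one widely used form of the two-proportion sample size equation is n per group = (Zα/2 + Zβ)² [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)². This form is an assumption for the sketch below and is not necessarily the exact equation referred to above; other variants (for example, ones using the pooled proportion) give slightly different answers. The proportions 0.78 and 0.90 are borrowed from the earlier treatment example purely as inputs, and the 80% power is a hypothetical choice.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n to detect p1 vs p2 with a two-sided test (assumed formula)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return ceil(n)


n_per_group = sample_size_two_proportions(0.78, 0.90)
print(f"subjects needed per group: {n_per_group}")
```

Raising the power or shrinking the difference to be detected increases the required sample size, which is the trade-off between what is desirable and what is feasible.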
Sample size for interventional studies
Using a computer package
Packages available:
EpiInfo
(download free from http://www.cdc.gov/epiinfo)
OpenEpi
SamplePower
Egret Siz
nQuery