7. Hypothesis testing and Sample size determination

Hypothesis testing is a formal statistical process used to compare populations or assess relationships between variables through sample data. It involves formulating null and alternative hypotheses, selecting appropriate statistical tests, and determining significance levels to draw conclusions about population parameters. The tests can be parametric or non-parametric, with specific procedures for different types of data and hypotheses.

Hypothesis testing

[One way of statistical inference]

1
Hypothesis Testing
 Hypothesis testing is a formal process of statistical analysis using
inferential statistics.
 The goal of hypothesis testing is to compare populations or assess
relationships between variables using samples.

 Hypotheses, or predictions, are tested using statistical tests.
 Statistical tests also estimate sampling errors so that valid inferences can be made.
 Statistical tests can be parametric or non-parametric. Parametric tests are considered more statistically powerful because they are more likely to detect an effect if one exists.
2
Hypothesis Testing…
 Parametric tests make assumptions that include the following:

 The population that the sample comes from follows a normal distribution of scores
 The sample size is large enough to represent the population
 The variances, a measure of variability, of each group being compared
are similar
 When your data violates any of these assumptions, non-parametric
tests are more suitable.
 Non-parametric tests are called “distribution-free tests” because they
don’t assume anything about the distribution of the population data.

3
Hypothesis Testing…
 Statistical tests come in three forms: tests of comparison, correlation or regression.
1. Comparison tests: assess whether there are differences in means, medians or
rankings of scores of two or more groups (e.g. t-test, ANOVA).
 To decide which test suits your aim, consider whether your data meets the
conditions necessary for parametric tests, the number of samples, and the
levels of measurement of your variables.
2. Correlation tests: determine the extent to which two variables are associated.
 Although Pearson’s r is the most statistically powerful test, Spearman’s rank correlation (rho) is appropriate for ordinal variables and for interval and ratio variables when the data don’t follow a normal distribution. The chi-square test of independence is the usual choice for nominal variables.
3. Regression tests: assess whether changes in predictor variables are associated with changes in an outcome variable; regression on its own does not establish causation. You can decide which regression test to use based on the number and types of variables you have as predictors and outcomes (e.g. linear regression, logistic regression and others).
 Most of the commonly used regression tests are parametric. If your data are not normally distributed, you can perform data transformations.
4
Tests of Significance
 Data are often collected to answer specified questions, such as:
 Do under-five (U5) children in urban areas have a lower prevalence of malnutrition than those in rural areas?
 Is a new treatment (Rx) beneficial to those suffering from a certain disease compared with the standard treatment?
 Such questions may be answered by setting up a hypothesis
and then using the data to test this hypothesis.

5
Hypothesis Testing
 Purpose is to aid the researcher in reaching a decision concerning
a population by examining a sample from that population.
 Hypothesis:
 A statement about one or more populations
 Is a claim about a population parameter
 is a statement which may or may not be true concerning one
or more populations.

6
Types of hypothesis
 Researchers are concerned with two types of hypotheses:
1. Research hypothesis is the conjecture or supposition that
motivates the research
 E.g. The mean birth weight of babies delivered by mothers
with low SES is lower than those from higher SES.
2. Statistical hypotheses (H0 and HA) are hypotheses that
are stated in such a way that they may be evaluated
by appropriate statistical techniques

7
Statistical hypothesis
 There are two statistical hypotheses that are involved in hypothesis testing,
1. Null hypothesis (H0) is the hypothesis to be tested.
 Always contains “=” , “ ≤” or “≥ ” sign
 May or may not be rejected
 Sometimes referred to as a hypothesis of no difference or no
effect, since it is a statement of agreement with (or no difference
from) conditions presumed to be true in the population of interest.
2. Alternative Hypothesis (HA): The notation HA (or H1 ) is used for the
hypothesis that will be accepted if HO is rejected.
 The opposite of Ho.
 Is a statement of what we will believe is true if our sample data
causes us to reject Ho.
 May or may not be accepted.

8
Stating statistical hypotheses


9
Steps in Hypothesis Testing
1. Formulate the appropriate statistical hypothesis clearly.

2. State the assumptions necessary for computing probabilities based on the collected data.
 A distribution is approximately normal (Gaussian)
 Variance is known or unknown

10
Steps in Hypothesis Testing…

3. Decide on the appropriate test statistic (Z, t, χ², F, etc.)
 A test statistic is a value computed from the sample data that is used in making the decision about whether to reject H0.
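As a rough sketch, and assuming the usual textbook form rather than any particular notation, a test statistic generally compares the sample estimate with the hypothesized value in units of its standard error. The Z statistic for a mean with known variance is one instance:

```latex
\text{test statistic} =
\frac{\text{sample statistic} - \text{hypothesized parameter}}
     {\text{standard error of the sample statistic}},
\qquad \text{e.g.}\quad
Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```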

4. Select the level of significance for the statistical test (α=0.05, 0.01, 0.1, etc.)
11
Steps in Hypothesis Testing…
5. Determine the critical value.
 A value the test statistic must attain to be declared significant

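 For the standard normal (Z) distribution at α = 0.05, the critical values are ±1.96 for a two-sided test, 1.645 for a right-tailed test and −1.645 for a left-tailed test.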
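The Z critical values for the commonly used significance levels can be reproduced with Python's standard library. This is a minimal sketch; the function name `z_critical` and its `tail` argument are illustrative choices, not part of any package.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str = "two-sided") -> float:
    """Return the positive critical value of the standard normal distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "one-sided":
        # all of alpha sits in one tail
        return std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two-sided' or 'one-sided'")


for alpha in (0.10, 0.05, 0.01):
    two = z_critical(alpha)
    one = z_critical(alpha, "one-sided")
    print(f"alpha = {alpha}: two-sided ±{two:.3f}, one-sided {one:.3f}")
```

For a left-tailed test the critical value is the negative of the one-sided value.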

12
Steps in Hypothesis Testing…

6. Perform the calculation


7. Draw and state the conclusion.
 If Ho is rejected, we conclude that HA is true (or
accepted).
 If Ho is not rejected, we conclude that Ho may be true.

13
Rules for Stating Statistical Hypothesis
1. One population
 Indication of equality (either =, ≤ or ≥) must appear in Ho.
Ho: μ = μo, Ho: P = Po,
HA: μ ≠ μo HA: P ≠ Po
 Can we conclude that a certain population mean is
 not 50?
Ho: μ = 50 and HA: μ ≠ 50
 greater than 50?
Ho: μ ≤ 50 HA: μ > 50
 Can we conclude that the proportion of patients with leukemia who
survive more than six years is not 60%?
Ho: P = 0.6 HA: P ≠ 0.6
2. Two populations
Ho: μ1 = μ2 Ho: P1 = P2
HA: μ1 ≠ μ2 HA: P1 ≠ P2
14
Rejection and Non-Rejection Regions
 The values of the test statistic assume the points on the
horizontal axis of the normal distribution and are divided into
two groups:
 Rejection region, and
 Non-rejection region.
 The values of the test statistic forming the rejection region are
less likely to occur if the Ho is true.
 The values making the acceptance (non-rejection) region are
more likely to occur if the Ho is true.

15
Example: Two-sided test at α 5%

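[Figure: standard normal curve with a rejection region of area α/2 = 0.025 in each tail (below −1.96 and above 1.96) and a central non-rejection region of area 0.95]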

 Reject Ho if the computed value of the test statistic is one of the values in the rejection region.

 Don’t reject Ho if the computed value of the test statistic is one of the values
in the non-rejection region.
16
Level of significance
 If HO is rejected, then HA is accepted.
 A HO is either true or false, and it is either not rejected or rejected.
 No error is made:
 When it is true and we fail to reject it, or
 When it is false and rejected.
 An error is made,
 When it is true but rejected (type I (α) errors) , or
 When it is false and we fail to reject (type II (β) errors).
 α is the probability of a type I error. It is called the level of significance.
 β is the probability of a type II error.
 Power: The probability of rejecting the null hypothesis when it is false.
Power = 1‐β.
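The relationship between α, β and power can be illustrated numerically. The sketch below computes the power of a two-sided one-sample Z test; the input values (hypothesized mean 25, true mean 22, variance 45, n = 10) are illustrative assumptions, and `z_test_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample Z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, Z is normal with this mean and unit variance
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    lower_tail = std_normal.cdf(-z_crit - shift)      # P(Z < -z_crit)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)   # P(Z > z_crit)
    return lower_tail + upper_tail


power = z_test_power(mu0=25, mu_true=22, sigma=sqrt(45), n=10)
beta = 1 - power  # probability of a type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns α, the probability of a type I error.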
17
Level of Significance, α
 Is the probability of rejecting a true Ho
 Defines unlikely values of sample statistic if Ho is true
 Defines rejection region of the sampling distribution
 The decision is made on the basis of the level of significance,
designated by α.
 More frequently used values of α are 0.01, 0.05 and 0.10.
 α is selected by the researcher
 The level of significance, α, is a probability: the probability of rejecting a true null hypothesis.
 For example, with 95% confidence intervals, α = 0.05, meaning that about 5% of intervals constructed this way will not contain the true parameter.
 Rejecting a true Ho is an error and leads to a false conclusion.
18
                                  Reality
Action (Conclusion)     Ho True              Ho False
Do not reject Ho        Correct action       Type II error (β)
Reject Ho               Type I error (α)     Correct action

α = P(Reject H0 | H0 is true)
β = P(Do not reject H0 | H0 is false)

19
One tail and two tail tests
 In a one tail test, the rejection region is at one end of the distribution
or the other.
 In a two tail test, the rejection region is split between the two tails.
 Which one is used depends on the way the Ho is stated.
 E.g. to test whether the average survival time after a cancer diagnosis is less than 3 years, a one tail test is used (HA: μ < 3).

20
Difference between P-value and level of significance
 The significance level α is the probability of making a type I error. This
is set before the test is carried out.
 The P-value is the result observed after the study is completed and is
based on the observed data.
 It would be better (informative) to give the exact values of P; such as,
P = 0.02 or P = 0.15 rather than P < 0.05 or P > 0.05 .
 Another way to state the conclusion:
 Reject Ho if P-value < α
 Do not reject Ho if P-value ≥ α
 The larger the absolute value of the test statistic, the smaller the P-value; and the smaller the P-value, the stronger the evidence against Ho.

21
1. Hypothesis test about a population mean (normally distributed)
a) Known variance: the test statistic is $Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Example:
 Researchers are interested in the mean level of some enzyme in a certain
population. They are asking: can we conclude that the mean enzyme level in
this population is different from 25?
 Step 1: H0: μ= 25
H1: μ≠25
 Step 2: They collect a sample of size 10 from a normally distributed population with a known variance, σ² = 45. The calculated sample mean is x̄ = 22.
 The population is normally distributed
 Population variance is known
 Step 3: The Z statistic is the appropriate one.
22
Hypothesis test about A population mean…Known variance..

23
Hypothesis test about A population mean…Known variance..

 Step 7: Since −1.41 falls in the non-rejection region, we do not reject H0. We cannot conclude that the mean enzyme level in the population is different from 25.
 OR calculate the P-value
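The calculation for the enzyme example can be sketched as follows. The data come from the example; α = 0.05 is an assumption here, since the significance level for this example is not stated in the surviving text.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, variance, n = 25, 22, 45, 10   # values from the enzyme example
alpha = 0.05                               # assumed significance level

z = (x_bar - mu0) / sqrt(variance / n)     # Z = (x̄ − μ0) / (σ/√n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))    # two-sided P-value

print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, P = {p_value:.3f}")
print("Reject H0" if abs(z) > z_crit else "Do not reject H0")
```

The two routes agree: |Z| is below the critical value, and equivalently the P-value is above α.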

24
Hypothesis test about A population mean…
b) Unknown variance: The test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with n − 1 degrees of freedom
 Example
 Serum amylase determinations were made on a sample of 15 apparently healthy subjects. The sample yielded a mean of 96 units/100 ml and a standard deviation of 35 units/100 ml. The population was normally distributed but the variance was unknown. We want to know whether we can conclude that the mean of the population is different from 120.
 Step 1: H0: μ = 120
H1: μ ≠ 120
 Step 2: x̄ = 96, s = 35, n = 15, μ0 = 120
 Step 3: The t-test is the appropriate test, since we are testing a population mean, the population is normally distributed and the population variance is unknown.
 Step 4: α = 0.05
 Step 5: The critical value is t(0.025, 14) = 2.1448, so H0 is rejected if t < −2.1448 or t > 2.1448.
25
Hypothesis test about A population mean…Unknown variance..
 Step 6: $t = \frac{96 - 120}{35/\sqrt{15}} = \frac{-24}{9.04} = -2.65$
 Step 7: Since −2.65 < −2.1448, the statistic falls in the rejection region and we reject the null hypothesis. We conclude that the mean of the population from which the sample came is not 120.
 OR calculate the P-value
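The t statistic for the serum amylase example can be checked with a few lines of Python. This sketch takes the critical value 2.1448 from the text rather than computing it, because the standard library has no t distribution. Without intermediate rounding the statistic is about −2.66, in line with the −2.65 reported above.

```python
from math import sqrt

x_bar, s, n, mu0 = 96, 35, 15, 120   # values from the serum amylase example
t_crit = 2.1448                      # t(0.025, 14), taken from the text

t = (x_bar - mu0) / (s / sqrt(n))    # t = (x̄ − μ0) / (s/√n)
df = n - 1                           # degrees of freedom

print(f"t = {t:.3f} with {df} degrees of freedom")
print("Reject H0" if abs(t) > t_crit else "Do not reject H0")
```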

26
2. Hypothesis testing about differences between two population means (normally distributed)
i) Known variance (2 independent samples)
 Example:
 In a large hospital for people with intellectual disabilities, a sample of 12 individuals with Down syndrome yielded a mean serum uric acid value of x̄1 = 4.5 mg/100 ml. In a general hospital, a sample of 15 individuals without Down syndrome, of the same age and sex, was found to have a mean value of x̄2 = 3.4 mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1, do these data provide sufficient evidence to indicate a difference in mean serum uric acid levels between individuals with and without Down syndrome?
27
HT differences between two population means…Known variance

 Step 1: H0: μ1 = μ2
H1: μ1 ≠ μ2
 Step 2: x̄1 = 4.5, x̄2 = 3.4
n1 = 12, n2 = 15
σ1 = 1, σ2 = 1
 Step 3: The Z-test is the appropriate one
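The remaining steps of this example can be sketched in Python using the standard two-sample Z statistic for known variances, Z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2). The significance level α = 0.05 (two-sided) is an assumption, since it is not stated in the surviving text for this example.

```python
from math import sqrt
from statistics import NormalDist

x1_bar, x2_bar = 4.5, 3.4     # sample means from the example
n1, n2 = 12, 15               # sample sizes from the example
var1, var2 = 1.0, 1.0         # known population variances
alpha = 0.05                  # assumed significance level

se = sqrt(var1 / n1 + var2 / n2)            # standard error of the difference
z = (x1_bar - x2_bar) / se
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))     # two-sided P-value

print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, P = {p_value:.4f}")
print("Reject H0" if abs(z) > z_crit else "Do not reject H0")
```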

28
HT differences between two population means…Known variance

29
HT differences between two population means…

30
HT differences between two population means…Unknown and equal
variance….

31
HT differences between two population means…Unknown and equal
variance….

• Since −1.88 falls in the non-rejection region, we do not reject the null hypothesis. On the basis of these data we cannot conclude that the two population means are different.
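For reference, the standard pooled-variance t statistic for two independent samples with unknown but equal variances is sketched below. It is given as the usual textbook form, on the assumption that this is the statistic behind the result above; s1² and s2² denote the two sample variances.

```latex
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
\text{df} = n_1 + n_2 - 2
```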

32
3. Hypothesis testing about a single population proportion
 Involves categorical values
 Two possible outcomes
 “Success” (possesses a certain characteristic)
 “Failure” (does not possess that characteristic)
 Fraction or proportion of population in the “success” category is
denoted by p

33
Hypothesis testing about a single population proportion…

34
Hypothesis testing about a single population proportion…
 Example-1: We are interested in the probability of developing
asthma over a given one-year period for children 0 to 4 years of age
whose mothers smoke in the home. In the general population of 0 to
4-year-olds, the annual incidence of asthma is 1.4%. If 10 cases of
asthma are observed over a single year in a sample of 500 children
whose mothers smoke, can we conclude that this is different from
the underlying probability of p0 = 0.014? α = 5%

H0 : p = 0.014
HA: p ≠ 0.014

35
HT about a single population proportion… Example-1…
 The critical value of Zα/2 at α=5% is ±1.96.
 Don’t reject Ho since Z (= 1.14) lies in the non-rejection region between ±1.96.
 P-value = 0.2543
 We do not have sufficient evidence to conclude
that the probability of developing asthma for
children whose mothers smoke in the home is
different from the probability in the general
population
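The asthma example can be reproduced with the normal-approximation test for a single proportion, Z = (p̂ − p0) / √(p0(1 − p0)/n). This is a minimal sketch that assumes this is the statistic used; it reproduces Z = 1.14, and the P-value differs from 0.2543 only in the third decimal place because Z is not rounded before the tail area is computed.

```python
from math import sqrt
from statistics import NormalDist

cases, n, p0 = 10, 500, 0.014        # values from the asthma example
alpha = 0.05

p_hat = cases / n                    # observed proportion
se0 = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se0
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided P-value

print(f"p_hat = {p_hat:.3f}, Z = {z:.2f}, P = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```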

36
4. Hypothesis testing about the difference between two
population proportions

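 H0: π1 = π2
 H1: π1 ≠ π2
 We use a pooled sample estimate for the common hypothesized proportion, which is a weighted average of the sample proportions, with the sample sizes as weights:
$\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ or, equivalently, $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$, where x1 and x2 are the numbers of “successes” in the two samples.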

37
HT for two population…
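 The standard error of P1 − P2 under the null hypothesis is thus calculated on the assumption that the proportion in each group is the pooled proportion p̄, so that we have
$SE(p_1 - p_2) = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ and $Z = \frac{p_1 - p_2}{SE(p_1 - p_2)}$.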

38
HT for two population… example-1
 Two hundred patients suffering from a certain disease were randomly divided into two equal groups. Of the first group, 78 recovered within three days. Out of the other 100, who were treated by a new method, 90 recovered within three days. The physician wished to know whether the data provide sufficient evidence to indicate that the new treatment is more effective than the standard.

 Since −2.32 < −1.645, it falls in the rejection region. We reject the null hypothesis.
 The data suggest that the new treatment is more effective than the standard.
39
HT for two population… example-2
 Among the 225 students who ate the sandwiches, 109 became ill, while among the 38 students who did not eat the sandwiches, 4 became ill. Is there a significant difference between the two groups at α = 5%?
 Ho: p1 = p2 against the alternative
HA: p1 ≠ p2

40
HT for two population… example-2

 The area under the standard normal curve to the right of 4.36 is less than 0.0001. Therefore, p < 0.0002. We reject H0 at the 0.05 level.
 The proportion of students who became ill differs in the two groups; those who ate the prepared sandwiches were more likely to develop illness.
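Both two-proportion examples can be checked with one small function based on the pooled proportion. In this sketch `two_proportion_z` is a hypothetical helper name. Without intermediate rounding it gives about −2.31 for the treatment example and about 4.37 for the sandwich example, within rounding of the −2.32 and 4.36 reported above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Example 1: standard (78/100 recovered) vs new treatment (90/100), one-sided
z_treatment = two_proportion_z(78, 100, 90, 100)
p_one_sided = NormalDist().cdf(z_treatment)          # left-tail area

# Example 2: ate sandwiches (109/225 ill) vs did not (4/38 ill), two-sided
z_sandwich = two_proportion_z(109, 225, 4, 38)
p_two_sided = 2 * NormalDist().cdf(-abs(z_sandwich))

print(f"Example 1: Z = {z_treatment:.2f}, one-sided P = {p_one_sided:.4f}")
print(f"Example 2: Z = {z_sandwich:.2f}, two-sided P = {p_two_sided:.6f}")
```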
41
Review Questions
1. The time taken for cessation of bleeding was recorded for
a large number of persons (population) whose fingers had
been pricked. The population mean time was found to be
1.407 min and the standard deviation was 0.588 min. In an
effort to determine whether pressure applied to the upper
arm increases bleeding time, six persons had pressure
equal to 20 mmHg applied to their upper arms and had their
fingers pricked. For these six persons, the times taken for
bleeding to stop were 1.15, 1.75, 1.32, 1.28, 1.39, and
2.50 min. Give a 95% confidence interval for the mean
bleeding time under pressure for the six persons and draw
some conclusion as to whether pressure increases bleeding
time.
42
Review Questions…
2. A special diet was given to 16 children and their gain
in weight was recorded over a 3‐month period. Their
mean gain in weight was found to be 2.49 kg. A control
group consisting of 16 children of similar background
and physique had normal meals during the same
period and gained 2.05 kg on average. Assume that
the population standard deviation for weight gains is 0.8
kg. Is the evidence strong enough for us to assert that
the special diet really promotes weight gain?

43
Review Questions…
3. A general physician recorded the oral and rectal temperatures of nine
consecutive patients who made first visits to his office. The temperatures
are given in degrees Celsius (°C). The following measurements were
recorded:
• From these data, what is your best point estimate of the mean difference between oral and rectal temperatures?
• Give a 95% confidence interval for this difference, and state what the interval means.
44
Sample size determination

45
Sample Size Determination
 Why is it important to consider sample size?
 In studies concerned with estimating some characteristic of a
population (e.g. the prevalence of asthmatic children),
 Sample size calculations are important to ensure that estimates are
obtained with required precision or confidence.
 For example,
 A prevalence of 10% from a sample of size 20 would have a 95% confidence interval of 1% to 31%, which is not very precise or informative. Sample size calculations help to avoid this situation.
 On the other hand, a prevalence of 10% from a sample of size 400 would have a 95% confidence interval of 7% to 13%, which may be considered sufficiently accurate. (For a sample of size 800, the 95% CI is 8% to 12%.)
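The effect of sample size on precision can be sketched with the normal-approximation (Wald) interval p̂ ± Z·√(p̂(1 − p̂)/n), which reproduces the intervals quoted for n = 400 and n = 800. The interval quoted for n = 20 appears to come from an exact method, which is an assumption on our part; the normal approximation is unreliable for so small a sample, so it is not used for that case here. `wald_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for n in (400, 800):
    low, high = wald_ci(0.10, n)
    print(f"n = {n}: 95% CI {low:.1%} to {high:.1%}")
```

Doubling the sample size from 400 to 800 narrows the interval by a factor of about √2, not 2.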

46
Sample Size Determination…
 Sample size calculations are important to ensure that if an effect
deemed to be clinically or biologically important exists, then there is a
high chance of it being detected, i.e. that the analysis will be
statistically significant.
 A critically important aspect of study design is determining the
appropriate sample size to answer the research question
 There are formulas that are used to estimate the sample size needed
to produce a confidence interval estimate with a specified margin of
error, or to ensure that a test of hypothesis has a high probability of
detecting a meaningful difference in the parameter if one exists (power
of the study)

47
Sample size determination…
 Too small a sample size:
 Scientifically: cannot detect clinically important effects.
 Economically: wastes resources without the capability of producing useful results.
 Ethically: exposes subjects to potentially harmful treatments without advancing knowledge.
 Too large a sample size:
 Scientifically: demonstrates scientifically (clinically) irrelevant, but statistically significant, effects.
 Economically: wastes resources by using more than necessary.
 Ethically: exposes an unnecessary number of subjects to potentially harmful treatments, or denies subjects potentially beneficial ones.

48
Need for sample size
 The eventual sample size is usually a negotiation
between what is desirable and what is feasible.
 The feasible sample size is determined by the
availability of resources. It is also important to
remember that resources are not only needed to collect
the information, but also to analyze it.

49
Points to be considered
1. A reasonable estimate of the key proportion to be studied. If you cannot guess the proportion, take it as 50%.
2. The degree of accuracy required, that is, the allowed deviation from the true proportion in the population as a whole. It can be within 1% or 5%, etc.
3. The confidence level required, usually specified as 95%.
4. The size of the population that the sample is to represent. If it is more than 10,000 the precise magnitude is not likely to be very important; but if the population is less than 10,000 then a smaller sample size may be required.
5. The difference between the two sub-groups and the power, that is, the likelihood of finding a statistically significant difference if one exists.

 Note that number 5 is required when there are two population groups and the interest is to compare two means or proportions.
50
Sample size for single mean
 The required sample size is $n = \frac{(Z_{\alpha/2})^2 \sigma^2}{d^2}$, where d is the tolerated sampling error (margin of error).
 The most frequently used sources of estimates for σ² are the following:
1. A pilot study or preliminary sample drawn from the population.
2. A previous similar study.

 E.g. A public health expert wishes to estimate the mean haemoglobin level in a defined community. From preliminary contact he thinks this mean is about 150 mg/l with a standard deviation of 32 mg/l. If he is willing to tolerate a sampling error of up to 5 mg/l in his estimate, how many subjects should be included in his study? (α = 5%, two-sided)
 If the population size is assumed to be very large, the required sample size would be:
 $n = \frac{(1.96)^2 (32)^2}{5^2} = 157.4 \approx 158$ persons
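The haemoglobin calculation can be wrapped in a small function. This is a sketch using n = (Z·σ/d)², rounded up to the next whole subject; `sample_size_mean` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, d, confidence=0.95):
    """Subjects needed to estimate a mean to within ±d (very large population)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / d) ** 2)   # always round up


n_haemoglobin = sample_size_mean(sigma=32, d=5)
print(n_haemoglobin)
```

Halving the tolerated error d roughly quadruples the required sample size, because d enters the formula squared.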
51
To estimate the difference between two population means
(Assuming that the two populations have common standard
deviation σ)

$n_1 = \frac{(Z_{\alpha/2})^2 \sigma^2 (1 + 1/r)}{d^2}$

where n1 is the size of sample 1 and r is the ratio of the size of sample 2 to sample 1. When the two samples are equal (r = 1), the size of each sample is twice that given by the one-sample case.

52
Sample size for single population Proportion
 Estimate how big the proportion might be (P)
 Choose the margin of error you will allow in the estimate of the
proportion (say ±d)
 Choose the level of confidence that the proportion in the whole
population is indeed between (p-d) and (p+d). We can never be
100% sure. Do you want to be 95% sure?
 The minimum sample size required, for a very large population (N ≥ 10,000), is: $n = \frac{(Z_{\alpha/2})^2 p(1 - p)}{d^2}$

53
Sample size for single population Proportion…
1. If sampling is from a finite population of size N, then $n = \frac{n_0}{1 + n_0/N}$,
 where n0 is the sample size from an infinite population
2. The initial sample size approached in the study may need to be increased in accordance with the expected response rate, loss to follow-up, lack of compliance, and any other predicted reasons for loss of subjects.
3. Design effect for complex cluster sampling: multiply n by the design effect; common values are 1.5, 2, 3, … 5.

54
Sample size for single population Proportion…
Example

 a) p = 0.26, d = 0.03, Z = 1.96 (i.e., for a 95% C.I.)
$n = \frac{(Z_{\alpha/2})^2 p(1 - p)}{d^2} = \frac{(1.96)^2 \times 0.26 \times (1 - 0.26)}{(0.03)^2} = 821.25$
 If the above sample is to be taken from a relatively small population (say N = 3000), the required minimum sample is obtained from the finite-population formula $n = \frac{n_0}{1 + n_0/N}$, where n0 is the sample size from an infinite population:
 n = 821.25 / (1 + (821.25/3000)) = 644.7 ≈ 645 subjects
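The example can be reproduced in a few lines. In this sketch the function names are illustrative, and Z is fixed at 1.96 to match the worked numbers.

```python
from math import ceil


def sample_size_proportion(p, d, z=1.96):
    """Sample size for estimating a proportion to within ±d (very large population)."""
    return z**2 * p * (1 - p) / d**2


def finite_population_correction(n0, population_size):
    """Adjust an infinite-population sample size n0 for a finite population."""
    return n0 / (1 + n0 / population_size)


n0 = sample_size_proportion(p=0.26, d=0.03)
n_adjusted = finite_population_correction(n0, population_size=3000)

print(f"n0 = {n0:.2f}, adjusted n = {n_adjusted:.1f}, rounded up = {ceil(n_adjusted)}")
```

If non-response or a design effect is anticipated, the adjusted figure would then be inflated accordingly.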
55
Sample size for single population Proportion…
Example
 Suppose you wish to estimate the prevalence of acute respiratory tract
infection, with a precision of 5%, in a target population comprising children aged
2‐4 years in a particular region of a developing country. Since an estimate of P is
not available until the survey has been carried out, the sample size calculation
formula does not tell you what sample size is needed. However the formula may
still be used to get a range of sample sizes corresponding to various assumptions
for the values of P
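The range of sample sizes mentioned here can be tabulated directly. The sketch below assumes 95% confidence (Z = 1.96) and a precision of d = 0.05; the list of candidate values for P is illustrative.

```python
from math import ceil

d, z = 0.05, 1.96                      # 5% precision; 95% confidence assumed
candidate_p = (0.1, 0.2, 0.3, 0.4, 0.5)

# Required sample size for each assumed prevalence, rounded up
sizes = {p: ceil(z**2 * p * (1 - p) / d**2) for p in candidate_p}

for p, n in sizes.items():
    print(f"P = {p:.1f}: n = {n}")
```

The required size is largest at P = 0.5, which is why 50% is the conservative choice when no estimate of P is available.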

56
To test a hypothesis about the difference between two
population proportions

• This equation is quite general: it applies to comparative cross-sectional, cohort, and case-control studies.
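One widely used form for this calculation, with equal group sizes and a two-sided test, is n per group = (Zα/2 + Zβ)² [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)². The sketch below implements that form as an assumption; other versions of the equation (for example with pooled variance or unequal allocation) give slightly different results. The inputs are illustrative: recovery proportions of 0.78 and 0.90, α = 0.05 and 80% power.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to detect p1 vs p2 (two-sided test, equal groups)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # controls the type I error
    z_beta = std_normal.inv_cdf(power)            # controls the type II error
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


n_per_group = sample_size_two_proportions(0.78, 0.90)
print(n_per_group)
```

Raising the desired power, or shrinking the difference to be detected, increases the required size.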

57
Sample size for interventional studies

58
Sample size for interventional studies…

59
Using a computer package

Packages available:
 EpiInfo
(download free from http://www.cdc.gov/epiinfo)
 OpenEpi
 Sample Power
 Egret siz
 nQuery
60
