
[Downloaded free from http://www.npmj.org on Sunday, July 19, 2020, IP: 197.210.28.101]

Review Article

Sample Size Estimation for Health and Social Science Researchers: The Principles and Considerations for Different Study Designs
Oladimeji Akeem Bolarinwa
Department of Epidemiology and Community Health, Faculty of Clinical Sciences, University of Ilorin, Ilorin, Nigeria

Abstract
Sample size is one of the important considerations at the planning phase of a research proposal, but researchers often face challenges in estimating a valid sample size. Many researchers use an inadequate sample size, and this invariably introduces errors into the final findings. Many reviews on sample size estimation have focused on specific study designs and often present technical equations and formulae that are boring to statistically naïve health researchers. Therefore, this compendium reviews all the common sample size estimation formulae in social science and health research, with the aim of providing basic guidelines and principles to achieve valid sample size estimation. The simplification of the sample size formulae and the detailed explanation in this review will demystify the difficulties many students, as well as some researchers, have with statistical formulae for sample size estimation.

Keywords: Health, sample size, social science, study design

Address for correspondence: Dr. Oladimeji Akeem Bolarinwa, Department of Epidemiology and Community Health, Faculty of Clinical Sciences, University of Ilorin, Ilorin, Nigeria.
E‑mail: [email protected]
Received: 01-02-2020, Revised: 29-02-2020, Accepted: 16-03-2020, Published: 11-04-2020

This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution‑NonCommercial‑ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non‑commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.

Website: www.npmj.org
DOI: 10.4103/npmj.npmj_19_20
For reprints contact: [email protected]

How to cite this article: Bolarinwa OA. Sample size estimation for health and social science researchers: The principles and considerations for different study designs. Niger Postgrad Med J 2020;27:67-75.

© 2020 Nigerian Postgraduate Medical Journal | Published by Wolters Kluwer ‑ Medknow

Background

Every scientific study requires carefully designed methods to produce valid and relevant results. To achieve such results, a scientifically sound sample size estimation must be adopted. Almost all quantitative research requires an adequate sample size to provide credible findings. Therefore, sample size estimation is a vital consideration at the concept development and proposal phase of research. One of the key questions health researchers are likely to ask is: how much of a population is needed for a valid and reliable study? In some instances, researchers may choose to study all those within a target population. This is possible when the entire population of interest is small and there are resources to study them. This scenario is called an exhaustive survey,[1] and in this instance, a sample size calculation may not be required or may not be applicable even when estimated. In most instances, it is not feasible to study all the subjects or respondents in a population of interest. Therefore, a sample or sub‑set of the population will be required.[1,2] It will be impractical to study the entire population of interest when there is a large geographical spread of the population, when the subjects within the population are too many and when there are limited resources to study the whole population. In all these situations, a scientific method of selecting representatives of the population will be vital.

In health and social science research, scientists are often faced with challenges of estimating valid sample sizes. Many researchers frequently use inadequate sample sizes, and this invariably introduces errors into the final findings. Taking ‘too much’ or ‘too small’ of a population sample is not only a waste of scarce resources; the researcher is also working with wrong research assumptions,[3] which could raise ethical concerns as well. This will undermine the integrity of the outcome of the study, with spurious effects on future research that may use such outcomes. In essence, sample



Bolarinwa: Sample size estimation for health and social researchers

size should be ‘large enough’ that an effect or precision of such magnitude as to be of scientific or clinical significance will also be statistically significant. Sample size is so important that it is linked by evidence to previous studies, characteristics of the population of interest, scientific assumptions, allowable study errors, sampling methods, analysis methods and study designs. The available literature on sample size has focused more on specific study designs and often presents technical equations and formulae that are boring to statistically naïve health researchers. This compendium reviews all the common sample size estimation formulae in social science and health research. In addition, it provides basic guidelines and principles to achieve valid estimation. The simplification of the sample size formulae and the detailed explanation in this review will demystify statistical formulae in sample size estimation for researchers.

Importance of Sample Size Determination in Health Research

Both internal and external validity of the research are ensured with an accurately estimated sample size that leverages previous studies or evidence. When representativeness in a study is accurately determined, it ensures that the study measures the population attributes it purports to study. In human and animal experiments, sample size is a pivotal issue for ethical reasons. An inadequate sample size will produce scientific inference with small power. This will expose subjects to potentially harmful treatments without advancing knowledge. On the other hand, oversized experiments will recruit an unnecessarily large number of subjects into the study. This will in turn expose them to unnecessary, potentially harmful treatment. The volunteers in the study will be needlessly troubled without the study adding a significant contribution to scientific knowledge.

Dynamics of Sample Size Determination

Some researchers have classified sample size determination into four types depending on the aim and procedure involved.[2] These are: sample size estimation/determination, sample size justification, sample size adjustment and sample size re‑estimation. Sample size estimation/determination requires the actual calculation using scientific assumptions and evidence to achieve the desired statistical significance of a valid and reliable outcome. This is the most common method, and it requires attributes such as prevalence, proportion and means from previous studies. Predetermined assumptions for validity and reliability such as power of study, level of significance and design effect (Deff) may be needed in sample size estimation.[2] Sample size justification is necessary when a sample size is already chosen. It becomes expedient for the researcher to provide a ‘statistical justification’ for the selected sample size.[2] Usually, a small part of the population will be recruited initially due to budgetary constraints or for medical considerations. A good example of this is the sample size in the first phase of clinical trials. Various methods for sample size adjustment have been described in the literature[1‑4] for reasons such as a small study population (for instance, a population <10,000), expected attrition or dropouts, non‑response, covariates (e.g., controlling for confounders)[2,3] and Deff in cluster sampling.[1,4] These adjustments are for the purpose of yielding a sufficient number of analysable subjects for valid statistical findings of the health research.[2] Sample size re‑estimation applies when there is little or no evidence in the literature about some attributes to be studied, especially past prevalence, incidence and means. In some other instances, certain aspects of the study need to be monitored for safety and relevance before exposing more participants to the intervention. Therefore, there may be a need for a pilot study or an interim study (in clinical trials).[2] In these situations, sample size re‑estimation is required to adjust the initial sample size calculated for the pilot study and to confirm the preliminary study assumptions such as power. In this manuscript, sample size estimation, calculation and determination will be used interchangeably. Of the four methods, sample size estimation will be discussed extensively in this review. A short note on sample size adjustment will be added towards the end of the review.

General Considerations in Sample Size Determination

It is very important to understand the dimensions of the research to be conducted in terms of the characteristics of the proposed study population, the appropriate study designs and the intended methods of analysis.[5] Characteristics of the population are a relevant consideration in sample size determination. These characteristics could be human sociodemography, animal species, human body parts or systems to be studied and the type of health records available. Study sites’ characteristics should also be considered. Some of the study site characteristics are community setup, household, hospital or institutional‑based study sites, geographic spread, confinement and security considerations. The study designs have great influence on analysis methods. As will be shown later, a good idea of the proposed study design that is appropriate for the study concept and analysis method will help define the appropriate sample size estimation for the study. Explicitly, the following study characteristics are essential to the validity of sample size determination.

Objectives or hypothesis
The objectives, research question and hypothesis are interrelated considerations in choosing the best sample size determination.[2] For some studies, these considerations may involve more than one attribute (prevalence, incidence and means), which need to be well thought‑out before estimating the sample size. For instance, a study that aims at assessing the treatment outcomes and health‑related quality of life of hypertensive patients attending a local hospital has more than one dependent variable, e.g., clinical outcomes and quality of life, to consider when estimating sample size. The literature agrees that researchers should calculate for all the attributes and choose the higher or highest sample size.[2,5] Another consideration is the direction of the null hypothesis

68 Nigerian Postgraduate Medical Journal ¦ Volume 27 ¦ Issue 2 ¦ April-June 2020



stated. Is the hypothesis a one‑tail or a two‑tail test? This is more relevant in analytical study types, especially experimental studies, and in some descriptive studies. As will be discussed later, the hypothesis connects sample size and the methods of analysis of the study.

Study designs
A properly applied study design will need an appropriate sample size based on whether the study is descriptive (cross‑sectional, survey or case study types) or analytical (observational or experimental types).[2,5] A good study requires that each study design has its specific sample size estimation consideration. For example, a cross‑sectional study that aims at assessing the health‑care utilisation pattern in a community need not set power (1 − type II error) for the sample size estimation, whereas a clinical trial that aims at assessing the effectiveness of drug X as against drug Y will be interested in setting a stringent power.

Elements Required for Sample Size Determination

Outcome variable/parameter/endpoints
In health research, units of measuring variables are of two classes: numeric or categorical. These two categories have other sub‑types of units of measurement. The unit of measurement in categorical variables is a proportion (percentages and rates) and at times could be a ratio. The numeric variables are mostly presented as means and medians (measures of central tendency). In some health research, odds ratio (OR) and relative risk are also measured as outcome variables. The chosen unit of measurement in sample size estimation should be taken into consideration at all times.[4,6] Previous literature that uses the same or a similar unit of measurement for the variable should be adopted for the sample size estimation. However, in some instances, a variable could be interpreted in more than one unit of measurement in health research. For example, blood pressure (BP) can be expressed as a mean value in mmHg. It can also be reported as controlled BP or uncontrolled BP. Another classification of BP could be optimal, Grade I, Grade II or Grade III.

Variability of the parameter
This is the measure of how spread out or dispersed the individual units in a variable are from the middle. The wider the variability, the larger the sample size that will be required to achieve a significant effect size, if any. The reason is that any two highly dispersed variables being compared will overlap.[5] For numeric parameters, the measure of dispersion for a sample mean is the variance (standard deviation), whereas for a median it is the range (interquartile range). These are usually reported in previous literature and are available for the researcher to leverage in estimating the study sample size. However, for a categorical parameter, the variability for sample proportions is based on the spread towards 0.5 (or 50%). If a previous study reported a prevalence of 0.5 (50%), the dispersion will also equal 0.5 (that is, 1 − 0.5). A prevalence tending towards 50% indicates maximum variability.[7] A prevalence moving towards the extremes of the spectrum, 100% (or 1) and 0, will not have as much variability. This simply means that the majority of the sample population either possess or do not possess the attribute of interest.[7]

Detectable difference (effect size) of the parameter
This is the smallest clinical effect that is detectable in the finding.[5,8] It is a parameter that elicits the difference in the outcome of one arm of the study (intervention, experimental or study group) compared with the other arm (control or comparator). It is the attribute of analytical studies which determines the probability that an independent factor will be strongly associated with an outcome or dependent variable.[5] Depending on the unit of measuring the outcome variables, effect size could be a mean difference or a change in proportion. It is expedient to mention that effect size is interrelated with the hypothesis set at the beginning of the research, the outcome measurement and the clinically detectable difference in the outcome measurement. As a general rule of thumb, a small effect size will require a large sample size to be able to detect a clinically meaningful difference, whereas a large effect size will require a small sample size.[4,5] The effect sizes to input in sample size estimation are often obtained from previous research.

Three variants of detectable difference have been described in the literature.[2] Absolute difference means that a clinically acceptable effect size can be presumably set for the study. For instance, a difference of 5 mmHg can be presumed to be clinically acceptable between a new and the existing drug for hypertension treatment. Relative difference requires that the researcher set the study to detect a certain change in the proportion of a clinical outcome. For example, a 10% decrease in systolic BP can be set to be of practical importance (20%–30% is usually taken as clinically acceptable). Cohen, decades ago, established that for an experimental (interventional) study with 2 arms of comparison, the ratio of effect size to standard deviation, termed standardised effect size or standard difference, can be applied.[8,9] The standardised effect size was classified as small, medium or large if this ratio is 0.2, 0.5 and 0.8, respectively.[8]

Error rates
The concept of error assumption in research stems from hypothesis testing.[2,5,8] The type of error committed when a researcher wrongly rejects a null hypothesis that is true is called type I or alpha (α) error. This is also described as ‘failure to accept a true null hypothesis’.[2,5,8] On the other hand, type II or beta (β) error means to wrongly accept a false null hypothesis. It is also described as ‘failure to reject a false null hypothesis’.[2,5,8] The implication of type I error (α) is that the researcher has to set an assumption for the level of type I error he/she wishes to allow in the study. This assumption of type I error is also called setting the ‘level of significance (P value)’. It is frequently set at 5%, which means the researcher is willing to allow a 5% probability of ‘failure to accept a true null hypothesis’. However, some research such as clinical trials can set a very small α‑error. The smaller the α‑error, the larger


the sample size required.[8] The level of significance therefore means that at less than 5% (P = 0.05) or 1% (P = 0.01 in stringent trials) of error, the variations observed in the outcome are due to chance and not due to ‘too much error’.[10] An important caution here is that the majority of analysis software, like SPSS, sets the P value at 0.05 as a default. Consequently, if there is a need to use a P value lower than 5%, the researcher needs to change this in the software settings to the desired value. Otherwise, a result the researcher assumes to be at a P value of 1% could be erroneously presented at a P value of 5%. Another note of relevance is that when a researcher fails to reject the null hypothesis, it does not mean that it is true; it is just that there is not enough evidence to reject the null hypothesis.[10]

Type II error (β, beta error), on the other hand, gives rise to the ‘power’ of the study, which is 1 − β.[2,5,8,10] The power of the study therefore means the proportion left after removing the errors committed by wrongly accepting a false null hypothesis [Figure 1]. This connotes the proportion of rightly rejected false null hypotheses.[2,5] Power of the study is often assumed or set at the proposal stage, similar to the level of significance. For example, suppose a researcher assumes a 20% β‑error; the power of the study will be set at 80%.[2,5,8] Values of 0.05 for α and 0.2 for β (power, 0.8) are often used by researchers, but conventionally, α values could range from 0.01 to 0.10, whereas β can be set between 0.05 (power, 0.95) and 0.20 (power, 0.80).[5] As with the α error, the lower the β (higher power), the larger the sample size required to achieve clinically detectable changes in the outcome.[2,5,8] In the actual sample size estimation formulae, the values of α and β cannot be used directly. They require conversion to standard normal deviates on the Gaussian curve.[8] These are called the Z‑scores, denoted as Zα and Zβ for α and β errors, respectively [Table 1]. Finally, a few clarifications need to be stated about the relationship between confidence level and α‑error. Similar to the power of the study, confidence level simply means the proportion left after removing the α‑error (1 − α), usually set at 0.95 as shown in Figure 1.[11] It is the precision of the study, which means the confidence of not rejecting a true null hypothesis.[2] For analytical studies, setting a confidence interval (CI) means that the interval of the width of the confidence level will be estimated during analysis.[2] The CI, like the P value, indicates the statistical significance of the study outcomes.

Sample Size Estimation for Different Study Designs and Statistical Analysis

Cross‑sectional studies and surveys
Prevalence studies and surveys are descriptive in nature. They are employed to show the associations between factors and to generate hypotheses for future research.[4] Estimating sample size for this type of research requires outcomes/variables/parameters such as prevalence, incidence, means, rates and ratios. Out of all these, prevalence (p) and means (µ) are commonly used for outcomes that are categorical (qualitative) or numeric (quantitative) in nature. The variability for each, p(1 − p) for a prevalence and the variance (σ²) for a mean, the standard normal deviate for α‑error (Zα) and a precision level (δ), usually assumed at 5% (0.05), are all required. The following are the formulae for categorical and numeric outcome variables in cross‑sectional studies:[4,6,8,12]

a. Categorical outcome (proportion), where q = 1 − p and d is the precision level:
$$N = \frac{Z_\alpha^2 \, p \, q}{d^2}$$

b. Numeric outcome (mean):
$$N = \frac{Z_\alpha^2 \, \sigma^2}{\delta^2}$$

Analytical studies: Independent case–control and cohort studies
In these types of studies, there are comparator groups called ‘controls’ that are weighed against the group with the attributes being studied, called ‘cases’. While the case–control study captures the cases with the outcome (disease or other health‑related issue) and searches retrospectively to determine the exposure factors, the cohort study starts from exposure factors and follows the cohort prospectively to determine the associated outcomes. Only few studies have extensively documented sample size
Table 1: Commonly used standard normal deviate for α and β errors

α‑error
Direction of Ho testing | α | Z
Two‑tail | 0.05 | 1.960
Two‑tail | 0.025 | 2.241
Two‑tail | 0.01 | 2.576
One‑tail | 0.05 | 1.645
One‑tail | 0.025 | 1.960
One‑tail | 0.01 | 2.326

β‑error
β | 1‑β (power) | Z
0.40 | 0.60 | 0.25
0.20 | 0.80 | 0.84
0.10 | 0.90 | 1.28
0.05 | 0.95 | 1.64
0.01 | 0.99 | 2.33

Figure 1: The relationship between type 1 and type 2 errors as they relate to the hypothesis[11]
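
As a rough illustration of the Z‑score conversion summarised in Table 1 and of formulae (a) and (b) for cross‑sectional studies, the sketch below derives Zα and Zβ from α and β using Python's standard library and then applies the two formulae. The function names (`z_alpha`, `z_beta`, `n_proportion`, `n_mean`), the default values and the rounding up to whole subjects are assumptions made for illustration; they are not taken from the article.

```python
import math
from statistics import NormalDist


def z_alpha(alpha: float, two_tailed: bool = True) -> float:
    """Standard normal deviate for a type I error rate (alpha)."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_beta(beta: float) -> float:
    """Standard normal deviate for a type II error rate (power = 1 - beta)."""
    return NormalDist().inv_cdf(1 - beta)


def n_proportion(p: float, d: float = 0.05, alpha: float = 0.05) -> int:
    """Formula (a): N = Z^2 * p * (1 - p) / d^2, rounded up (assumed two-tail Z)."""
    z = z_alpha(alpha)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


def n_mean(sigma: float, delta: float, alpha: float = 0.05) -> int:
    """Formula (b): N = Z^2 * sigma^2 / delta^2, rounded up (assumed two-tail Z)."""
    z = z_alpha(alpha)
    return math.ceil(z ** 2 * sigma ** 2 / delta ** 2)


if __name__ == "__main__":
    print(round(z_alpha(0.05), 3))               # two-tail alpha = 0.05
    print(round(z_alpha(0.05, two_tailed=False), 3))
    print(round(z_beta(0.20), 2))                # power = 0.80
    print(n_proportion(p=0.5, d=0.05))           # hypothetical prevalence of 50%
    print(n_mean(sigma=15, delta=3))             # hypothetical SD and precision
```

Under these assumptions, a prevalence of 50% with 5% precision at a two‑tail α of 0.05 gives the familiar minimum of 385 subjects, which also shows why a prevalence near 50% is the most demanding case.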


formulae for case–control and cohort studies.[6,7,13] Formulae for other study variants (such as matched and paired studies) can be found in other literature[7] and internet sources. Formulae for independent studies are shown in this review.

c. Independent case–control (retrospective study)[7,13]

$$N = \frac{\left[ Z_\alpha \sqrt{(1 + m)\,P^*(1 - P^*)} + Z_\beta \sqrt{P_1(1 - P_1) + m P_0(1 - P_0)} \right]^2}{(P_1 - P_0)^2} \qquad (1)$$

$$P^* = \frac{P_1 + P_0/m}{1 + 1/m} \qquad (2)$$

$$P_1 = \frac{P_0\,\omega}{1 + P_0(\omega - 1)} \qquad (3)$$

$$N_c = \frac{N}{4}\left[ 1 + \sqrt{1 + \frac{2(m + 1)}{N m\,|P_0 - P_1|}} \right]^2 \qquad (4)$$

In equation c (1), N is the estimated sample size for the independent case–control study, Zα is the standard normal deviate for α error and Zβ is the standard normal deviate for power (1 − β error). P* is the average probability of the exposure (similar to a pooled variance or proportion), calculated as shown in formula c (2). m is the ratio of control subjects to case subjects desired, while P1 is the probability of exposure in the case group, calculated in equation c (3) from the known prevalence of the exposure in the population (P0) and the OR (ω) of the exposure between cases and controls.[7] As shown in formula c (4), Nc is the continuity‑adjusted sample size for further analysis such as Chi‑square and Fisher’s exact tests, taking into consideration the ratio of controls to cases, the prevalence in the population and the probability of the exposure.[7] When the OR (ω) is not available but only prevalence is available, a simpler alternative formula is prescribed:[13]

$$N = \frac{m + 1}{m} \cdot \frac{(Z_\alpha + Z_\beta)^2 \, P^*(1 - P^*)}{(P_0 - P_1)^2} \qquad (5)$$

d. Independent cohort (prospective study)[7,13]

$$N = \frac{\left[ Z_\alpha \sqrt{(1 + 1/m)\,P^*(1 - P^*)} + Z_\beta \sqrt{P_0(1 - P_0)/m + P_1(1 - P_1)} \right]^2}{(P_0 - P_1)^2} \qquad (1)$$

$$P^* = \frac{P_1 + m P_0}{1 + m} \qquad (2)$$

$$N_c = \frac{N}{4}\left[ 1 + \sqrt{1 + \frac{2(m + 1)}{N m\,|P_0 - P_1|}} \right]^2 \qquad (3)$$

In equation d (1), N is the estimated sample size for the independent cohort study, Zα is the standard normal deviate for α error and Zβ is the standard normal deviate for power (1 − β error). P* is the average probability of the exposure, calculated as shown in formula d (2). m is the ratio of control subjects to cohort or experimental subjects desired, while P0 is the probability of the event in the control group and P1 is the probability of the event in the study or experimental group.[7] As shown in formula d (3), Nc is the continuity‑adjusted sample size for further analysis such as Chi‑square and Fisher’s exact tests.[7]

Analytical studies: Cross‑sectional analytical (comparative) studies
These are various types of observational study that compare population proportions (P1 and P2) and means (µ1 and µ2). It is formerly known as ‘comparative study’. In this type of study, there is no form of intervention or experimentation. An example is a study that aims at comparing the cardiovascular risk score between residents of rural and urban communities. The formulae for cross‑sectional analytical studies can be applied to categorical and numerical variables as shown below:[4,8,12‑14]

e. Comparing two proportions
$$N = \frac{(Z_\alpha + Z_\beta)^2 \left[ P_1(1 - P_1) + P_2(1 - P_2) \right]}{(P_1 - P_2)^2}$$

f. Comparing two means
$$N = \frac{(Z_\alpha + Z_\beta)^2 \, 2\sigma^2}{(\mu_1 - \mu_2)^2}$$

Analytical studies: Randomised controlled trials
There are four variants of randomised control trials (RCT) described in the literature[10,15] as shown in Table 2:
1. Equality trial: (Ho: µT − µS = 0). This trial is designed to hypothesise that there is no clinical difference or effect between the mean of the new treatment/intervention (µT) and the mean of the comparator (µS)
2. Equivalence trial: (Ho: |µT − µS| = δ). This trial hypothesises that both the treatment/intervention and the comparator (µT and µS) are equally effective
3. Non‑inferior trial: (Ho: µT − µS ≥ δ). It is a design to prove that the treatment/intervention is as effective as the comparator and not necessarily better than the comparator (standard or usual or placebo)
4. Superiority trial: (Ho: µT − µS ≤ δ). The purpose of this design is to prove that the treatment/intervention is more effective (statistically or clinically) than the comparator (standard or usual or placebo).

The trials can also use a one‑sided (one‑tail) hypothesis. This means that the direction of the difference or the effect is stated (more/greater or less/lower than). More commonly, many researchers prefer to adopt a two‑sided (two‑tail) hypothesis, which usually does not state the direction of the differences or effects expected. This states that there is no difference between the effect of the treatment/intervention and the comparator (standard/usual/placebo), and the common analysis method is the independent t‑test. In addition to the direction of the hypothesis, the design variants of the trials such


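Table 2: Sample size considerations for common types of randomised control trials

Design | Hypothesis | Sample size estimation rule: Numeric | Sample size estimation rule: Categorical
One‑tail | Equality | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_T - \mu_S)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{(P_T - P_S)^2}$
One‑tail | Superior | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_T - \mu_S - \delta)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{(P_T - P_S - \delta)^2}$
One‑tail | Equivalence | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(|\mu_T - \mu_S| - \delta)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{(|P_T - P_S| - \delta)^2}$
Two‑tail parallel | Equality | $\frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_T - \mu_S)^2}$ | $\frac{2(Z_\alpha + Z_\beta)^2 p(1 - p)}{(P_T - P_S)^2}$
Two‑tail parallel | Non‑inferior | $\frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_T - \mu_S - \delta)^2}$ | $\frac{2(Z_\alpha + Z_\beta)^2 p(1 - p)}{(P_T - P_S - \delta)^2}$
Two‑tail parallel | Superior | $\frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_T - \mu_S - \delta)^2}$ | $\frac{2(Z_\alpha + Z_\beta)^2 p(1 - p)}{(P_T - P_S - \delta)^2}$
Two‑tail parallel | Equivalence | $\frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{(|\mu_T - \mu_S| - \delta)^2}$ | $\frac{2(Z_\alpha + Z_\beta)^2 p(1 - p)}{(|P_T - P_S| - \delta)^2}$
Two‑tail crossover | Equality | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{2(\mu_T - \mu_S)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{2(P_T - P_S)^2}$
Two‑tail crossover | Non‑inferior | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{2(\mu_T - \mu_S - \delta)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{2(P_T - P_S - \delta)^2}$
Two‑tail crossover | Superior | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{2(\mu_T - \mu_S - \delta)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{2(P_T - P_S - \delta)^2}$
Two‑tail crossover | Equivalence | $\frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{2(|\mu_T - \mu_S| - \delta)^2}$ | $\frac{(Z_\alpha + Z_\beta)^2 p(1 - p)}{2(|P_T - P_S| - \delta)^2}$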
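
The two‑tail parallel and crossover rows of Table 2 can be sketched in a few lines of Python. This is a minimal illustration, not the article's own procedure: the function names, the `design` and `equivalence` parameters, the use of a two‑tail Zα, the simple average of the two proportions as the pooled p and the rounding up to whole subjects are all assumptions made here.

```python
import math
from statistics import NormalDist

# Multipliers read off Table 2: parallel rows carry a factor of 2 in the
# numerator, crossover rows a factor of 2 in the denominator.
DESIGN_FACTOR = {"parallel": 2.0, "crossover": 0.5}


def _z_sum_squared(alpha: float, beta: float) -> float:
    """(Z_alpha + Z_beta)^2 for a two-tail alpha (assumption) and power 1 - beta."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(1 - beta)
    return (z_a + z_b) ** 2


def _effect(diff: float, margin: float, equivalence: bool) -> float:
    """Denominator term: (|diff| - margin) for equivalence, else (diff - margin)."""
    effect = (abs(diff) - margin) if equivalence else (diff - margin)
    if effect == 0:
        raise ValueError("difference minus margin must be non-zero")
    return effect


def n_rct_numeric(mu_t, mu_s, sigma, alpha=0.05, beta=0.20,
                  margin=0.0, design="parallel", equivalence=False) -> int:
    """Numeric column of Table 2; sigma is the pooled standard deviation."""
    effect = _effect(mu_t - mu_s, margin, equivalence)
    n = DESIGN_FACTOR[design] * _z_sum_squared(alpha, beta) * sigma ** 2 / effect ** 2
    return math.ceil(n)


def n_rct_categorical(p_t, p_s, alpha=0.05, beta=0.20,
                      margin=0.0, design="parallel", equivalence=False) -> int:
    """Categorical column of Table 2; p is the average of the two proportions."""
    p = (p_t + p_s) / 2
    effect = _effect(p_t - p_s, margin, equivalence)
    n = DESIGN_FACTOR[design] * _z_sum_squared(alpha, beta) * p * (1 - p) / effect ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical equality trial: 5 mmHg difference, pooled SD of 10 mmHg.
    print(n_rct_numeric(135, 140, sigma=10))
    print(n_rct_numeric(135, 140, sigma=10, design="crossover"))
    # Hypothetical equality trial on proportions: 60% versus 40%.
    print(n_rct_categorical(0.6, 0.4))
```

With margin set to 0 the functions reproduce the equality rows; a non‑zero `margin` gives the non‑inferior and superior rows, and `equivalence=True` switches to the absolute difference used in the equivalence rows.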

as the parallel, cross‑over and cluster RCTs also have effects on the sample size calculation as shown in Table 2.[2,6,10,15]

σ² is the pooled variance:
$$\sigma^2 = \frac{\sigma_T^2 + \sigma_S^2}{2}$$
where σT² is the variance of the treatment group and σS² is the variance of the comparator group, or
$$\sigma^2 = \frac{S_T^2 + S_S^2}{2}$$
if the standard deviation is given for the treatment (ST) and comparator (SS) groups. Alternatively, a more comprehensive pooled variance calculation has been suggested:[11]
$$S_{pooled}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
keeping in view the standard deviations (s1, s2, …) and sample sizes (n1, n2, …) of the groups. P is also a pooled prevalence and is simply (PT + PS)/2. PT and PS are the prevalence of the outcomes in the treatment and the comparator groups, while µT and µS are the mean outcomes in the treatment and the comparator groups. The clinically acceptable margin is denoted as δ in the above equations.

Other Sample Size Consideration in Randomised Control Trials and Interventional Studies

Cluster randomised control trials designs
For a detailed explanation of sample size considerations in cluster RCTs, standard reviews should be consulted.[15,16] However, a brief and helpful explanation is provided here from the existing literature.[15,16]

The initial step is to follow the appropriate sample size estimation N for an RCT over individuals as shown in Table 2, and then corrections will be considered for the κ number of clusters in each arm, each of size ɱ. This will produce a total number of Nc = ɱκ individuals in each arm. As a rule of thumb, to


compensate for the selection error inherent in cluster sampling, there is a need to inflate the variance of the difference (δc) to be detected by a variance inflation factor (VIF). How strongly individuals within a cluster are correlated with one another, known as the intra‑cluster correlation coefficient (ρ), is important when multiplying by the VIF. This multiplier is also called the design effect (Deff).

Therefore, $\mathrm{VIF} = 1 + (ɱ - 1)\rho$.

There are times when the cluster sizes are not equal; then $\mathrm{VIF} = 1 + ((\delta_v^2 + 1)ɱ^* - 1)\rho$.

Here δv is the coefficient of variation of the cluster sizes and ɱ* is the average cluster size. Substituting the VIF as a multiplier in any of the individual RCT formulae gives:

$$N_c = N\,[1 + (ɱ - 1)\rho] = N \times \mathrm{VIF} \quad \text{(equal cluster sizes)}$$

$$N_c = N\,[1 + ((\delta_v^2 + 1)ɱ^* - 1)\rho] \quad \text{(unequal cluster sizes)}$$

Quasi‑Experimental Studies
One good example of a quasi‑experimental study is the pre‑ and post‑test, or before‑and‑after test. This is also described as a repeated measure. Another description of this situation is that each subject serves as his/her own control. Repeated measures analyses such as the paired t‑test (for numeric outcomes) and the McNemar test (for categorical outcomes) are employed for the analysis of these forms of study, as shown below:[11,12]

$$\text{Numeric: } N = \frac{(Z_\alpha + Z_\beta)^2\,\sigma^2}{\delta^2}$$

$$\text{Categorical: } N = \frac{(Z_\alpha + Z_\beta)^2\,p(1 - p)}{(p_1 - p_2)^2}$$

These look very similar to the two‑sample situation, but with two important changes. First, there is no multiplier of ‘2’. Second, σ is the standard deviation of the differences within pairs, while δ = µ1 − µ2, where µ1 and µ2 are the means before and after the intervention, respectively.[11,12] Similarly, p1 and p2 are the proportions/prevalences before and after the intervention, and p is the pooled prevalence of the two. The variance of the difference in the repeated measure is $\sigma^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$,[11,12] where ρ is the correlation between baseline and post‑intervention values in the same group. If only one standard deviation σ1 is reported, then $\sigma^2 = 2\sigma_1^2(1 - \rho)$.

Survival Analysis (Outcome) Study
This type of study is carried out when research subjects are followed up over time to generate an outcome variable of the time‑to‑event type.[12] A good example is a clinical trial that sets out to compare the survival rates of an experimental drug or intervention group with those of the control (non‑experimental) group. One striking feature of a survival study is that, by design, not every research subject survives to the end of the study.[12] Hence, research subjects exit at different points along the follow‑up period. The log‑rank test is mostly applied to this type of analysis, which makes it necessary to take the total number of events across the groups into consideration.[12] Therefore, both the sample size estimation and the duration of stay in the study are important considerations for this type of study design.[12] The first consideration is the number of events (d), estimated using the α‑error, the power (1−β) and the effect size or treatment effect (δ). However, the treatment effect is embodied by the probability of the occurrence of the events in the two study groups.[12] This quantity is termed the ‘hazard ratio’ (HR).

The total number of events can be estimated as:

$$d = (Z_\alpha + Z_\beta)^2 \left(\frac{1 + \delta_0}{1 - \delta_0}\right)^2 \tag{1}$$

The δ0 is equal to the HR:

$$\delta_0 = \mathrm{HR} = \frac{\log(p_e)}{\log(p_c)} \tag{2}$$

The pe and pc are the estimated survival probabilities in the experimental and control groups, respectively.

The final sample size required is:

$$N = \frac{2d}{2 - p_c - p_e} \tag{3}$$

Sample Size Consideration in Correlation and Diagnostic Tests
Correlational studies
Despite correlational studies being a common descriptive design, only a few publications[5] have described sample size estimation for them. In this study type, the main focus is the correlation coefficient (r) and the Fisher’s transformation of the correlation coefficient (Cr).

One sample correlation formula:

$$N = \left\{\frac{Z_\alpha + Z_\beta}{C_r}\right\}^2 + 3, \quad \text{where } C_r = \frac{1}{2}\ln\left\{\frac{1 + r}{1 - r}\right\}$$

Two sample correlation formula:

$$N = \left\{\frac{Z_\alpha + Z_\beta}{C_{r1} - C_{r2}}\right\}^2 + 3, \quad \text{where } C_{r1} = \frac{1}{2}\ln\left\{\frac{1 + r_1}{1 - r_1}\right\} \text{ and } C_{r2} = \frac{1}{2}\ln\left\{\frac{1 + r_2}{1 - r_2}\right\}$$

Accuracy tests (sensitivity/specificity)
Further detailed reading can be found in the literature.[17] For the purpose of this review, a simple, all‑purpose formula is given here:[17] the sensitivity (Se), specificity (Sp), disease prevalence (P) and precision (δ) are all required.

Sample size when the aim of the accuracy test is a single sensitivity or specificity:

$$\text{Sensitivity: } N = \frac{Z_\alpha^2\,S_e(1 - S_e)}{\delta^2\,P}$$

$$\text{Specificity: } N = \frac{Z_\alpha^2\,S_p(1 - S_p)}{\delta^2\,(1 - P)}$$
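The one‑sample correlation formula and the single sensitivity/specificity formulae above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original review: the function names are hypothetical, Zα is assumed to be the two‑sided standard normal quantile (α/2 in each tail), and results are rounded up to the next whole subject.

```python
import math
from statistics import NormalDist


def z_value(upper_tail_prob: float) -> float:
    """Standard normal quantile that leaves `upper_tail_prob` in the upper tail."""
    return NormalDist().inv_cdf(1.0 - upper_tail_prob)


def n_one_sample_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """N = ((Z_alpha + Z_beta) / C_r)^2 + 3, with C_r the Fisher transformation of r."""
    z_a = z_value(alpha / 2)      # assumption: two-sided alpha
    z_b = z_value(1.0 - power)    # beta = 1 - power
    c_r = 0.5 * math.log((1 + r) / (1 - r))
    return math.ceil(((z_a + z_b) / c_r) ** 2 + 3)


def n_sensitivity(se: float, prevalence: float, precision: float, alpha: float = 0.05) -> int:
    """N = Z_alpha^2 * Se * (1 - Se) / (precision^2 * P)."""
    z_a = z_value(alpha / 2)      # assumption: two-sided alpha
    return math.ceil(z_a ** 2 * se * (1 - se) / (precision ** 2 * prevalence))


def n_specificity(sp: float, prevalence: float, precision: float, alpha: float = 0.05) -> int:
    """N = Z_alpha^2 * Sp * (1 - Sp) / (precision^2 * (1 - P))."""
    z_a = z_value(alpha / 2)      # assumption: two-sided alpha
    return math.ceil(z_a ** 2 * sp * (1 - sp) / (precision ** 2 * (1 - prevalence)))


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the review.
    print(n_one_sample_correlation(r=0.3))
    print(n_sensitivity(se=0.9, prevalence=0.2, precision=0.05))
    print(n_specificity(sp=0.9, prevalence=0.2, precision=0.05))
```

Because prevalence appears in the denominator, a rare disease drives the sample size for sensitivity up sharply, while the sample size for specificity is driven by the proportion of non‑diseased subjects (1 − P).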

Sample size for sensitivity (or specificity) of a single diagnostic test in comparison with a standard: The comparison is between the sensitivity/specificity (P1) of a diagnostic test and a predetermined or gold standard sensitivity/specificity (P0).

$$N = \frac{\left[Z_\alpha\sqrt{P_0(1 - P_0)} + Z_\beta\sqrt{P_1(1 - P_1)}\right]^2}{(P_0 - P_1)^2}$$

Sample size for the sensitivity (or specificity) of more than one diagnostic test: The comparison in this design involves two alternative diagnostic tests (P1 and P2), where P* denotes the mean of P1 and P2.

$$N = \frac{\left[Z_\alpha\sqrt{2P^*(1 - P^*)} + Z_\beta\sqrt{P_1(1 - P_1) + P_2(1 - P_2)}\right]^2}{(P_1 - P_2)^2}$$

Sample size adjustments
There are various reasons that can warrant adjustment of an initially estimated sample size.

Multiple outcome variables
When there is more than one outcome variable of interest in a study, the sample size for each of these variables should be estimated and the highest of them should be applied for the study.[8,15]

Unequal comparison group
Some studies have comparison groups that may have equal or unequal numbers of subjects. When the arms of the study have unequal numbers of subjects, it becomes necessary to adjust the initially calculated sample size (N), which assumed that the arms of the study are equal,[8] using the actual ratio between the unequal arms of the research (n).

The adjusted sample size is:

$$N^* = \frac{N(1 + n)^2}{4n}$$

Non‑consent, missing response, withdrawal from study and dropout
Sample size is calculated as the minimum number required to achieve the research aim. In practice, reasons ranging from incomplete response to loss to follow‑up can adversely affect the final sample size that is useful for the research.[8,15] The researcher should have adequate knowledge of these losses and a good idea of the proportion (P) that may be lost to any of them in a study.

Therefore, the adjusted sample size is:

$$N^* = \frac{N}{1 - P}$$

Finite population correction
Logically, searching for a few coloured grains of corn in a large bowl will take longer than finding the same coloured grains in a handful of corn. After estimating the sample size for a population of less than 10,000 (N0), the researcher needs to correct the sample size (N) for the small study population:[7]

$$N^* = \frac{N}{1 + \dfrac{N}{N_0}}$$

Design effects
The cluster trial design and the VIF have been discussed in detail in the preceding section. It should be noted that stratified sampling similarly has a Deff, like cluster randomisation, and should be corrected for as well.[8]

Multivariate analysis and covariates
More advanced analysis and modelling are frequently used in health research nowadays; some of these analyses, such as analysis of covariance, log‑linear analysis and Cox’s proportional hazards analysis, will require sample size adjustments.[8] Proper methods of doing these are still evolving.[8]

Conclusion
This review discussed common sample size estimation formulae in health research and offers basic guidelines and principles to achieve valid estimation. Simplification of the sample size formulae and detailed explanations were also provided. Sample size estimation is an important step in conducting valid and generalisable research. The outcome variables, research designs, analysis methods, error assumptions and effect size, among other important elements, are cardinal to estimating a scientifically correct sample size. Certain situations require adjustment of the sample size, and these should be considered at all times in health research. This compendium will ease the struggles students and young researchers go through to deploy scientifically strong sample size estimation in their studies.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

References
1. Umulisa C. Sampling methods and sample size calculation for the SMART methodology. University; 2012;2:20‑30.
2. Chow SC, Shao J, Wang H, Lokhnygina Y. Sample Size Calculations in Clinical Research: Chapman and Hall/CRC Biostatistics Series. 3rd ed. New York: Taylor and Francis; 2017.
3. Lenth RV. Some Practical Guidelines for Effective Sample‑Size Determination; 2001. p. 1‑11.
4. Habib A, Johargy A, Mahmood K, Humma H. Design and determination of the sample size in medical research. IOSR J Dent Med Sci (IOSR‑JDMS) 2014;13:21‑31.
5. Warren SB, Thomas BN, Hulley SB. Estimating Sample Size and Power: Applications and Examples. In: Hulley SB, Cummings SR, Browner WS, Grady DG GN, editors. Designing Clinical Research. 4th ed. Philadelphia: Lippincott Williams & Wilkins; 2013. p. 44‑55. Available from: https://www.academia.edu/36931058/Designing_Clinical_Research. [last Accessed on 2020 Jan 24].
6. Charan J, Biswas T. How to calculate sample size for different study designs in medical research? Indian J Psychol Med 2013;35:121‑6.

7. Kasiulevicius V, Šapoka V, Filipaviciute R. Sample size calculation in epidemiological studies. Gerontologija 2006;7:225‑31.
8. Hazra A, Gogtay N. Biostatistics series module 5: Determining sample size. Indian J Dermatol 2016;61:496‑504.
9. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1988. p. 4‑17.
10. Zhong B. How to calculate sample size in randomized controlled trial? J Thorac Dis 2009;1:51‑4.
11. Shintani A. Sample Size Estimation and Power Computation on Paired or Skewed Continuous Data; 2006. p. 1‑15.
12. van der Tweel I. Sample size determination. Intern report. 2006;2‑15.
13. Sharma A. Sample Size Calculation for Research Studies in Ophthalmology; 2015. p. 78‑81. Available from: https://pdfs.semanticscholar.org/efd7/00bb717ce68bd9474d43e05d0a7f48920422.pdf. [last Accessed on 2020 Jan 24].
14. Noordzij M, Tripepi G, Dekker FW, Zoccali C, Tanck MW, Jager KJ. Sample size calculations: Basic principles and common pitfalls. Nephrol Dial Transplant 2010;25:1388‑93.
15. Thabane L. Sample Size Determination in Clinical Trials HRM‑733 Class Notes; 2004. p. 31. Available from: http://www.lehanathabane.com. [last Accessed on 2020 Jan 24].
16. Hemming K, Girling AJ, Sitch AJ, Marsh J, Lilford RJ. Sample size calculations for cluster randomised controlled trials with a fixed number of clusters. BMC Med Res Methodol 2011;11:102.
17. Hajian‑Tilaki K. Sample size estimation in diagnostic test studies of biomedical informatics. J Biomed Inform 2014;48:193‑204.
