Martinez Mesa 2016 Sampling How To Select Participants
Sampling: how to select participants in my research study?*
DOI: http://dx.doi.org/10.1590/abd1806-4841.20165254
Abstract: Background: In this paper, the basic elements related to the selection of participants for a health
research study are discussed. Sample representativeness, sample frame, types of sampling, as well as the impact that
non-respondents may have on results of a study are described. The whole discussion is supported by practical
examples to facilitate the reader’s understanding. Objective: To introduce readers to issues related to sampling.
Keywords: Dermatology; Epidemiology and biostatistics; Epidemiologic studies; Sample size; Sampling studies
INTRODUCTION
The essential topics related to the selection of participants for a health research study are:
1) whether to work with samples or include the whole reference population in the study (census);
2) the sample basis; 3) the sampling process and 4) the potential effects nonrespondents might have
on study results. We will refer to each of these aspects with theoretical and practical examples for
better understanding in the sections that follow.

TO SAMPLE OR NOT TO SAMPLE
In a previous paper, we discussed the necessary parameters on which to estimate the sample size.1 We
define sample as a finite part or subset of participants drawn from the target population. In turn,
the target population corresponds to the entire set of subjects whose characteristics are of interest
to the research team. Based on results obtained from a sample, researchers may draw their conclusions
about the target population with a certain level of confidence, following a process called statistical
inference. When the sample contains fewer individuals than the minimum necessary, but the
representativeness is preserved, statistical inference may be compromised in terms of precision
(prevalence studies) and/or statistical power to detect the associations of interest.1 On the other
hand, samples without representativeness may not be a reliable source to draw conclusions about the
reference population (i.e., statistical inference is not deemed possible), even if the sample size
reaches the required number of participants. Lack of representativeness can occur as a result of
flawed selection procedures (sampling bias) or when the probability of refusal/non-participation in
the study is related to the object of research (nonresponse bias).1,2
Although most studies are performed using samples, whether or not they represent any target
population, census-based estimates should be preferred whenever possible.3,4 For instance, if all
cases of melanoma are available on a national or regional database, and information on the potential
risk factors is also available, it would be preferable to conduct a census instead of investigating
a sample.
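The remark above that an undersized sample compromises precision can be illustrated with a short sketch. It uses the usual normal approximation for the 95% margin of error of a prevalence estimate; the 10% prevalence, the three sample sizes and the function name `margin_of_error` are assumptions chosen for illustration and do not come from the paper. The finite population correction is ignored.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a prevalence p estimated from n
    participants (normal approximation, no finite population correction)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical prevalence of 10%; the sample sizes are illustrative only.
for n in (100, 400, 1600):
    moe = margin_of_error(0.10, n)
    print(f"n={n}: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions, quadrupling the sample size halves the margin of error, which is why a smaller sample yields a less precise estimate even when it remains representative.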
Received on 15.10.2015
Approved by the Advisory Board and accepted for publication on 02.11.2015
* Study performed at Faculdade Meridional - Escola de Medicina (IMED) – Passo Fundo (RS), Brazil.
Financial Support: None.
Conflict of Interest: None.
1 Faculdade Meridional (IMED) – Passo Fundo (RS), Brazil.
2 University of Adelaide – Adelaide, Australia.
3 Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA) – Porto Alegre (RS), Brazil.
4 Universidade Federal de Santa Catarina (UFSC) – Florianópolis (SC), Brazil.
However, there are several theoretical and practical reasons that prevent us from carrying out
census-based surveys, including:
1. Ethical issues: it is unethical to include a greater number of individuals than that effectively
required;
2. Budgetary limitations: the high costs of a census survey often limit its use as a strategy to
select participants for a study;
3. Logistics: censuses often impose great challenges in terms of required staff, equipment, etc. to
conduct the study;
4. Time restrictions: the amount of time needed to plan and conduct a census-based survey may be
excessive; and,
5. Unknown target population size: if the study objective is to investigate the presence of
premalignant skin lesions in illicit drug users, lack of information on all existing users makes it
impossible to conduct a census-based study.

All these reasons explain why samples are more frequently used. However, researchers must be aware
that sample results can be affected by the random error (or sampling error).3 To exemplify this
concept, we will consider a research study aiming to estimate the prevalence of premalignant skin
lesions (outcome) among individuals >18 years residing in a specific city (target population). The
city has a total population of 4,000 adults, but the investigator decided to collect data on a
representative sample of 400 participants, detecting an 8% prevalence of premalignant skin lesions.
A week later, the researcher selects another sample of 400 participants from the same target
population to confirm the results, but this time observes a 12% prevalence of premalignant skin
lesions. Based on these findings, is it possible to assume that the prevalence of lesions increased
from the first to the second week? The answer is probably not. Each time we select a new sample, we
are very likely to obtain a different result. These fluctuations are attributed to the “random
error.” They occur because individuals composing different samples are not the same, even though
they were selected from the same target population. Therefore, the parameters of interest may vary
randomly from one sample to another. Despite this fluctuation, if it were possible to obtain 100
different samples of the same population, approximately 95 of them would provide prevalence
estimates very close to the real value in the target population – the value that we would observe
if we investigated all the 4,000 adults residing in the city. Thus, during the sample size
estimation the investigator must specify in advance the highest or maximum acceptable random error
value in the study. Most population-based studies use a random error ranging from 2 to 5 percentage
points. Nevertheless, the researcher should be aware that the smaller the random error considered
in the study, the larger the required sample size.1

SAMPLE FRAME
The sample frame is the group of individuals that can be selected from the target population given
the sampling process used in the study. For example, to identify cases of cutaneous melanoma the
researcher may consider using the national cancer registry system or the anatomopathological
records of skin biopsies as the sample frame. Given that the sample may represent only a portion of
the target population, the researcher needs to examine carefully whether the selected sample frame
fits the study objectives or hypotheses, and especially whether there are strategies to overcome the
sample frame limitations (see Chart 1 for examples and possible limitations).

SAMPLING
Sampling can be defined as the process through which individuals or sampling units are selected
from the sample frame. The sampling strategy needs to be specified in advance, given that the
sampling method may affect the sample size estimation.1,5 Without a rigorous sampling plan the
estimates derived from the study may be biased (selection bias).3

TYPES OF SAMPLING
In figure 1, we depict a summary of the main sampling types. There are two major sampling types:
probabilistic and nonprobabilistic.

NONPROBABILISTIC SAMPLING
In the context of nonprobabilistic sampling, the likelihood of selecting some individuals from the
target population is null. This type of sampling does not render a representative sample; therefore,
the observed results are usually not generalizable to the target population. Still,
unrepresentative samples may be useful for some specific research objectives, and may help answer
particular research questions, as well as contribute to the generation of new hypotheses.4 The
different types of nonprobabilistic sampling are detailed below.
Convenience sampling: the participants are consecutively selected in order of appearance according
to their convenient accessibility (also known as consecutive sampling). The sampling process comes
to an end when the total number of participants (sample saturation) and/or the time limit (time
saturation) is reached. Randomized clinical trials are usually based on convenience sampling. After
sampling, participants are usually randomly allocated to the intervention or
Hospital or Health Services records
• Usually include only data of affected people (this is a limitation, depending on the study
objectives)
• Depending on the service, data may be incomplete and/or outdated
• If the lists are from public units, results may differ from those of people who seek private
services

School lists
• School lists are currently available only in the public sector
• Children/teenagers not attending school will not be represented
• Lists are quickly outdated
• There will be problems in areas with a high percentage of school absenteeism

List of phone numbers
• Several population groups are not represented: individuals with no phone line at home
(low-income families, young people who use only cell phones), those who spend less time at home, etc.

Mailing lists
• Individuals with multiple email addresses have a greater chance of selection than individuals
with only one address
• Individuals without an email address may be different from those who have one, according to age,
education, etc.
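The mailing-list limitation listed above (multiple addresses raise a person's chance of selection) can be made concrete with a small sketch. Everything in it is hypothetical: the names, the addresses and the two helper functions are illustrative assumptions, and keeping one entry per person is only one possible way to handle duplicates in a sample frame.

```python
from collections import Counter

# Hypothetical mailing list of (person, email) pairs; all values are made up.
mailing_list = [
    ("ana", "ana@example.org"),
    ("ana", "ana.work@example.org"),
    ("ana", "ana.old@example.org"),
    ("bruno", "bruno@example.org"),
    ("carla", "carla@example.org"),
]

def selection_probability_by_address(entries):
    """Chance that a single address drawn at random belongs to each person."""
    counts = Counter(person for person, _ in entries)
    total = len(entries)
    return {person: k / total for person, k in counts.items()}

def deduplicate(entries):
    """Keep the first address of each person so everyone appears once."""
    first_address = {}
    for person, email in entries:
        first_address.setdefault(person, email)
    return list(first_address.items())

before = selection_probability_by_address(mailing_list)
after = selection_probability_by_address(deduplicate(mailing_list))
print(before)  # the person with three addresses is three times as likely to be drawn
print(after)   # after deduplication every person has the same chance
```

In this toy frame the person with three addresses accounts for three of the five entries, so drawing addresses rather than people over-represents them until the list is reduced to one entry per person.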
cases. Improper selection of control individuals may introduce selection bias in the results. Thus,
the concern with representativeness in this type of study is established based on the relationship
between cases and controls (comparability).
In cohort studies, individuals are recruited based on the exposure (exposed and unexposed subjects),
and they are followed over time to evaluate the occurrence of the outcome of interest. At baseline,
the sample can be selected from a representative sample (population-based cohort studies) or a
non-representative sample. However, in the successive follow-ups of the cohort members, study
participants must be a representative sample of those included in the baseline.14,15 In this type
of study, losses over time may cause follow-up bias.

CONCLUSION
Researchers need to decide during the planning stage of the study if they will work with the entire
target population or a sample. Working with a sample involves different steps, including sample
size estimation, identification of the sample frame, and selection of the sampling method to be
adopted.
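
The random-error example given earlier in the text (a city of 4,000 adults from which samples of 400 are drawn) can be explored with a small simulation. This is a sketch under stated assumptions, not the authors' method: the text does not give the real prevalence, so 10% is assumed here, and the samples are drawn by simple random sampling without replacement, whereas the text only describes them as representative. The 3 percentage point band is likewise an illustrative choice within the 2 to 5 point range mentioned in the text.

```python
import random

random.seed(2016)  # fixed seed so the sketch is reproducible

POPULATION_SIZE = 4000   # adults in the city (from the example in the text)
SAMPLE_SIZE = 400        # participants per sample (from the text)
TRUE_PREVALENCE = 0.10   # assumption: the text does not state the real value

# 1 = has a premalignant skin lesion, 0 = does not
n_cases = round(POPULATION_SIZE * TRUE_PREVALENCE)
population = [1] * n_cases + [0] * (POPULATION_SIZE - n_cases)

def sample_prevalence(pop, n):
    """Prevalence observed in one simple random sample drawn without replacement."""
    return sum(random.sample(pop, n)) / n

# Repeat the survey many times to see how much the estimate fluctuates.
estimates = [sample_prevalence(population, SAMPLE_SIZE) for _ in range(1000)]
within_3_points = sum(abs(e - TRUE_PREVALENCE) <= 0.03 for e in estimates) / len(estimates)

print(f"lowest estimate:  {min(estimates):.1%}")
print(f"highest estimate: {max(estimates):.1%}")
print(f"share of samples within 3 percentage points: {within_3_points:.1%}")
```

Although the population never changes, the estimates differ from one sample to the next, while the large majority stay within a few percentage points of the assumed real prevalence. This is the behaviour the text attributes to random error.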
REFERENCES
1. Martínez-Mesa J, González-Chica DA, Bastos JL, Bonamigo RR, Duquia RP. Sample size: how many participants do I need in my research? An Bras Dermatol. 2014;89:609-15.
2. Röhrig B, du Prel JB, Wachtlin D, Kwiecien R, Blettner M. Sample size calculation in clinical trials: part 13 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2010;107:552-6.
3. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques for clinical research. Ann Indian Acad Neurol. 2011;14:287-90.
4. Rothman KJ, Gallacher JE, Hatch EE. Why representativeness should be avoided. Int J Epidemiol. 2013;42:1012-4.
5. Krause M, Lutz W, Boehnke JR. The role of sampling in clinical trial design. Psychother Res. 2011;21:243-51.
6. Roubille C, Richer V, Starnino T, McCourt C, McFarlane A, Fleming P, et al. Evidence-based Recommendations for the Management of Comorbidities in Rheumatoid Arthritis, Psoriasis, and Psoriatic Arthritis: Expert Opinion of the Canadian Dermatology-Rheumatology Comorbidity Initiative. J Rheumatol. 2015;42:1767-80.
7. Larkin J, Ascierto PA, Dréno B, Atkinson V, Liszkay G, Maio M, et al. Combined vemurafenib and cobimetinib in BRAF-mutated melanoma. N Engl J Med. 2014;371:1867-76.
8. Goncalves JR, Nappo SA. Factors that lead to the use of crack cocaine in combination with marijuana in Brazil: a qualitative study. BMC Public Health. 2015;15:706.
9. Pimenta FB, Pinho L, Silveira MF, Botelho AC. Factors associated with chronic diseases among the elderly receiving treatment under the Family Health Strategy. Cien Saude Colet. 2015;20:2489-98.
10. Kelbore AG, Alemu W, Shumye A, Getachew S. Magnitude and associated factors of Atopic dermatitis among children in Ayder referral hospital, Mekelle, Ethiopia. BMC Dermatol. 2015;15:15.
11. Zhou SJ, Skeaff M, Makrides M, Gibson R. Vitamin D status and its predictors among pre-school children in Adelaide. J Paediatr Child Health. 2015;51:614-9.
12. Duquia RP, Menezes AM, Almeida HL Jr, Reichert FF, Santos Ida S, Haack RL, et al. Prevalence of sun exposure and its associated factors in southern Brazil: a population-based study. An Bras Dermatol. 2013;88:554-61.
13. Barrios CH, Werutsky G, Martinez-Mesa J. The global conduct of cancer clinical trials: challenges and opportunities. Am Soc Clin Oncol Educ Book. 2015:e132-9.
14. Victora CG, Barros FC. Cohort profile: the 1982 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2006;35:237-42.
15. Boing AC, Peres KG, Boing AF, Hallal PC, Silva NN, Peres MA. EpiFloripa Health Survey: the methodological and operational aspects behind the scenes. Rev Bras Epidemiol. 2014;17:147-62.

Mailing address:
Jeovany Martínez-Mesa
Faculdade Meridional - IMED
Escola de Medicina
R. Senador Pinheiro, 304
99070-220 - Passo Fundo - RS
Brazil
Email: [email protected]
How to cite this article: Martinez-Mesa J, González-Chica DA, Duquia RP, Bonamigo RR, Bastos JL. Sampling: how
to select participants in my research study? An Bras Dermatol. 2016;91(3):326-30.