Special Article

Sampling: how to select participants in my research study?*

Jeovany Martínez-Mesa1, David Alejandro González-Chica2, Rodrigo Pereira Duquia3, Renan Rangel Bonamigo3, João Luiz Bastos4

DOI: http://dx.doi.org/10.1590/abd1806-4841.20165254

Abstract: Background: In this paper, the basic elements related to the selection of participants for health research are discussed. Sample representativeness, sample frame, types of sampling, as well as the impact that non-respondents may have on the results of a study are described. The whole discussion is supported by practical examples to facilitate the reader’s understanding. Objective: To introduce readers to issues related to sampling.
Keywords: Dermatology; Epidemiology and biostatistics; Epidemiologic studies; Sample size; Sampling studies

INTRODUCTION
The essential topics related to the selection of participants for health research are: 1) whether to work with samples or include the whole reference population in the study (census); 2) the sample frame; 3) the sampling process; and 4) the potential effects nonrespondents might have on study results. We will refer to each of these aspects with theoretical and practical examples for better understanding in the sections that follow.

TO SAMPLE OR NOT TO SAMPLE
In a previous paper, we discussed the necessary parameters on which to estimate the sample size.1 We define sample as a finite part or subset of participants drawn from the target population. In turn, the target population corresponds to the entire set of subjects whose characteristics are of interest to the research team. Based on results obtained from a sample, researchers may draw their conclusions about the target population with a certain level of confidence, following a process called statistical inference. When the sample contains fewer individuals than the minimum necessary, but the representativeness is preserved, statistical inference may be compromised in terms of precision (prevalence studies) and/or statistical power to detect the associations of interest.1 On the other hand, samples without representativeness may not be a reliable source to draw conclusions about the reference population (i.e., statistical inference is not deemed possible), even if the sample size reaches the required number of participants. Lack of representativeness can occur as a result of flawed selection procedures (sampling bias) or when the probability of refusal/non-participation in the study is related to the object of research (nonresponse bias).1,2

Although most studies are performed using samples, whether or not they represent any target population, census-based estimates should be preferred whenever possible.3,4 For instance, if all cases of melanoma are available on a national or regional database, and information on the potential risk factors is also available, it would be preferable to conduct a census instead of investigating a sample.

Received on 15.10.2015
Approved by the Advisory Board and accepted for publication on 02.11.2015
* Study performed at Faculdade Meridional - Escola de Medicina (IMED) – Passo Fundo (RS), Brazil.
Financial Support: None.
Conflict of Interest: None.
1 Faculdade Meridional (IMED) – Passo Fundo (RS), Brazil.
2 University of Adelaide – Adelaide, Australia.
3 Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA) – Porto Alegre (RS), Brazil.
4 Universidade Federal de Santa Catarina (UFSC) - Florianópolis (RS), Brazil.

©2016 by Anais Brasileiros de Dermatologia

An Bras Dermatol. 2016;91(3):326-30.




However, there are several theoretical and practical reasons that prevent us from carrying out census-based surveys, including:
1. Ethical issues: it is unethical to include a greater number of individuals than that effectively required;
2. Budgetary limitations: the high costs of a census survey often limit its use as a strategy to select participants for a study;
3. Logistics: censuses often impose great challenges in terms of required staff, equipment, etc. to conduct the study;
4. Time restrictions: the amount of time needed to plan and conduct a census-based survey may be excessive; and,
5. Unknown target population size: if the study objective is to investigate the presence of premalignant skin lesions in illicit drug users, lack of information on all existing users makes it impossible to conduct a census-based study.

All these reasons explain why samples are more frequently used. However, researchers must be aware that sample results can be affected by the random error (or sampling error).3 To exemplify this concept, we will consider a research study aiming to estimate the prevalence of premalignant skin lesions (outcome) among individuals >18 years residing in a specific city (target population). The city has a total population of 4,000 adults, but the investigator decided to collect data on a representative sample of 400 participants, detecting an 8% prevalence of premalignant skin lesions. A week later, the researcher selects another sample of 400 participants from the same target population to confirm the results, but this time observes a 12% prevalence of premalignant skin lesions. Based on these findings, is it possible to assume that the prevalence of lesions increased from the first to the second week? The answer is probably not. Each time we select a new sample, we are very likely to obtain a different result. These fluctuations are attributed to the “random error.” They occur because individuals composing different samples are not the same, even though they were selected from the same target population. Therefore, the parameters of interest may vary randomly from one sample to another. Despite this fluctuation, if it were possible to obtain 100 different samples of the same population, approximately 95 of them would provide prevalence estimates very close to the real value in the target population, that is, the value that we would observe if we investigated all the 4,000 adults residing in the city. Thus, during the sample size estimation the investigator must specify in advance the maximum acceptable random error value in the study. Most population-based studies use a random error ranging from 2 to 5 percentage points. Nevertheless, the researcher should be aware that the smaller the random error considered in the study, the larger the required sample size.1

SAMPLE FRAME
The sample frame is the group of individuals that can be selected from the target population given the sampling process used in the study. For example, to identify cases of cutaneous melanoma the researcher may consider using as the sample frame the national cancer registry system or the anatomopathological records of skin biopsies. Given that the sample frame may represent only a portion of the target population, the researcher needs to examine carefully whether the selected sample frame fits the study objectives or hypotheses, and especially if there are strategies to overcome the sample frame limitations (see Chart 1 for examples and possible limitations).

Chart 1: Examples of sample frames and potential limitations as regards representativeness

Sample frame: Population census
Limitations:
• If the census was not conducted in recent years, areas with high migration might be outdated
• Homeless or itinerant people cannot be represented

Sample frame: Hospital or Health Services records
Limitations:
• Usually include only data of affected people (this is a limitation, depending on the study objectives)
• Depending on the service, data may be incomplete and/or outdated
• If the lists are from public units, results may differ from those of people who seek private services

Sample frame: School lists
Limitations:
• School lists are currently available only in the public sector
• Children/teenagers not attending school will not be represented
• Lists are quickly outdated
• There will be problems in areas with a high percentage of school absenteeism

Sample frame: List of phone numbers
Limitations:
• Several population groups are not represented: individuals with no phone line at home (low-income families, young people who use only cell phones), those who spend less time at home, etc.

Sample frame: Mailing lists
Limitations:
• Individuals with multiple email addresses have a higher chance of selection than individuals with only one address
• Individuals without an email address may be different from those who have it, according to age, education, etc.

SAMPLING
Sampling can be defined as the process through which individuals or sampling units are selected from the sample frame. The sampling strategy needs to be specified in advance, given that the sampling method may affect the sample size estimation.1,5 Without a rigorous sampling plan the estimates derived from the study may be biased (selection bias).3

TYPES OF SAMPLING
In figure 1, we depict a summary of the main sampling types. There are two major sampling types: probabilistic and nonprobabilistic.

Figure 1: Sampling types used in scientific studies. [Diagram: the population under study may be investigated as a census (all the population) or as a sample (a fraction of the population). Sampling is either probabilistic (simple random sampling, systematic random sampling, stratified sampling, complex sampling) or non-probabilistic (accidental or convenience sampling, purposive sampling, quota sampling, snowball sampling).]

NONPROBABILISTIC SAMPLING
In the context of nonprobabilistic sampling, the likelihood of selecting some individuals from the target population is null. This type of sampling does not render a representative sample; therefore, the observed results are usually not generalizable to the target population. Still, unrepresentative samples may be useful for some specific research objectives, and may help answer particular research questions, as well as contribute to the generation of new hypotheses.4 The different types of nonprobabilistic sampling are detailed below.

Convenience sampling: the participants are consecutively selected in order of appearance according to their convenient accessibility (also known as consecutive sampling). The sampling process comes to an end when the total number of participants (sample saturation) and/or the time limit (time saturation) are reached. Randomized clinical trials are usually based on convenience sampling. After sampling, participants are usually randomly allocated to the intervention or control group (randomization).3 Although randomization is a probabilistic process to obtain two comparable groups (treatment and control), the samples used in these studies are generally not representative of the target population.

Purposive sampling: this is used when a diverse sample is necessary or the opinion of experts in a particular field is the topic of interest. This technique was used in the study by Roubille et al, in which recommendations for the treatment of comorbidities in patients with rheumatoid arthritis, psoriasis, and psoriatic arthritis were made based on the opinion of a group of experts.6

Quota sampling: according to this sampling technique, the population is first classified by characteristics such as gender, age, etc. Subsequently, sampling units are selected to complete each quota. For example, in the study by Larkin et al., the combination of vemurafenib and cobimetinib versus placebo was tested in patients with locally-advanced melanoma, stage IIIC or IV, with BRAF mutation.7 The study recruited 495 patients from 135 health centers located in several countries. In this type of study, each center has a “quota” of patients.

“Snowball” sampling: in this case, the researcher selects an initial group of individuals. Then, these participants indicate other potential members with similar characteristics to take part in the study. This is frequently used in studies investigating special populations, for example, those including illicit drug users, as was the case of the study by Gonçalves et al, which assessed 27 users of cocaine and crack in combination with marijuana.8

PROBABILISTIC SAMPLING
In the context of probabilistic sampling, all units of the target population have a nonzero probability of taking part in the study. If all participants are equally likely to be selected in the study, equiprobabilistic sampling is being used, and the probability of a given unit being selected in a single random draw may be expressed by the formula P = 1/N, where P equals the probability of selection and N corresponds to the size of the target population. The main types of probabilistic sampling are described below.
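Equiprobabilistic selection also makes the random error discussed earlier easy to see in a simulation. The following is a minimal sketch, not part of the original study: it rebuilds the hypothetical city of 4,000 adults, assumes a true prevalence of 10% (an illustrative value chosen between the 8% and 12% of the example), and repeatedly draws simple random samples of 400. The function names, the 3-percentage-point margin, and the number of repetitions are all assumptions made for illustration.

```python
import random

def sample_prevalence(population, n, rng):
    """Prevalence observed in one simple random sample of n units (without replacement)."""
    return sum(rng.sample(population, n)) / n

def simulate(N=4000, n=400, true_prev=0.10, repeats=1000, margin=0.03, seed=1):
    """Draw `repeats` samples and report the share of estimates within `margin`."""
    rng = random.Random(seed)
    cases = round(N * true_prev)                      # assumed number of affected adults
    population = [1] * cases + [0] * (N - cases)      # 1 = has lesion, 0 = no lesion
    estimates = [sample_prevalence(population, n, rng) for _ in range(repeats)]
    # Small tolerance so estimates exactly on the boundary are not lost to float rounding
    within = sum(abs(p - true_prev) <= margin + 1e-9 for p in estimates) / repeats
    return estimates, within

estimates, within = simulate()
print(f"lowest estimate:  {min(estimates):.3f}")
print(f"highest estimate: {max(estimates):.3f}")
print(f"share within 3 percentage points of the true value: {within:.1%}")
```

Under these assumptions the individual estimates differ from sample to sample even though the population never changes, while the large majority stay within the chosen margin. Narrowing the margin without increasing n lowers that share, which is the trade-off between acceptable random error and sample size described in the text.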


Simple random sampling: in this case, we have a full list of sample units or participants (sample frame), and we randomly select individuals using a table of random numbers. An example is the study by Pimenta et al, in which the authors obtained a listing from the Health Department of all elderly enrolled in the Family Health Strategy and, by simple random sampling, selected a sample of 449 participants.9

Systematic random sampling: in this case, participants are selected at fixed, previously defined intervals from a ranked list of participants. For example, in the study of Kelbore et al, children who were assisted at the Pediatric Dermatology Service were selected to evaluate factors associated with atopic dermatitis, always selecting every second child in order of consultation.10

Stratified sampling: in this type of sampling, the target population is first divided into separate strata. Then, samples are selected within each stratum, either through simple or systematic sampling. The total number of individuals to be selected in each stratum can be fixed or proportional to the size of each stratum. Each individual may be equally likely to be selected to participate in the study. However, the fixed method usually involves the use of sampling weights in the statistical analysis (inverse of the probability of selection or 1/P). An example is the study conducted in South Australia to investigate factors associated with vitamin D deficiency in preschool children. Using the national census as the sample frame, households were randomly selected in each stratum and all children in the age group of interest identified in the selected houses were investigated.11

Cluster sampling: in this type of probabilistic sampling, groups such as health facilities, schools, etc., are sampled. In the above-mentioned study, the selection of households is an example of cluster sampling.11

Complex or multi-stage sampling: this probabilistic sampling method combines different strategies in the selection of the sample units. An example is the study of Duquia et al. to assess the prevalence and factors associated with the use of sunscreen in adults. The sampling process included two stages.12 Using the 2000 Brazilian demographic census as sampling frame, all 404 census tracts from Pelotas (Southern Brazil) were listed in ascending order of family income. A sample of 120 tracts was systematically selected (first sampling stage units). In the second stage, 12 households in each of these census tracts (second sampling stage units) were systematically drawn. All adult residents in these households were included in the study (third sampling stage units). All these stages have to be considered in the statistical analysis to provide correct estimates.

NONRESPONDENTS
Frequently, sample sizes are increased by 10% to compensate for potential nonresponses (refusals/losses).1 Let us imagine that in a study to assess the prevalence of premalignant skin lesions there is a higher percentage of nonrespondents among men (10%) than among women (1%). If the highest percentage of nonresponse occurs because these men are not at home during the scheduled visits, and these participants are more likely to be exposed to the sun, the number of skin lesions will be underestimated. For this reason, it is strongly recommended to collect and describe some basic characteristics of nonrespondents (sex, age, etc.) so they can be compared to the respondents to evaluate whether the results may have been affected by this systematic error.

Often, in study protocols, refusal to participate or sign the informed consent is considered an “exclusion criterion”. However, this is not correct, as these individuals are eligible for the study and need to be reported as “nonrespondents”.

SAMPLING METHOD ACCORDING TO THE TYPE OF STUDY
In general, clinical trials aim to obtain a homogeneous sample which is not necessarily representative of any target population. Clinical trials often recruit those participants who are most likely to benefit from the intervention.3 Thus, the stricter criteria for inclusion and exclusion of subjects in clinical trials often make it difficult to locate participants: after verification of the eligibility criteria, just one out of ten possible candidates will enter the study. Therefore, the results of clinical trials usually cannot be generalized to the entire population of patients with the disease, but only to those with characteristics similar to the sample included in the study. These peculiarities of clinical trials justify the necessity of conducting multicenter and/or global studies to accelerate the recruitment rate and to reach, in a shorter time, the number of patients required for the study.13

In turn, in observational studies, building a solid sampling plan is important because of the great heterogeneity usually observed in the target population. Therefore, this heterogeneity also has to be reflected in the sample. A cross-sectional population-based study aiming to assess disease estimates or identify risk factors often uses complex probabilistic sampling, because the sample representativeness is crucial. However, in a case-control study, we face the challenge of selecting two different samples for the same study. One sample is formed by the cases, which are identified based on the diagnosis of the disease of interest. The other consists of controls, which need to be representative of the population that originated the cases. Improper selection of control individuals may introduce selection bias in the results. Thus, the concern with representativeness in this type of study is established based on the relationship between cases and controls (comparability).

In cohort studies, individuals are recruited based on the exposure (exposed and unexposed subjects), and they are followed over time to evaluate the occurrence of the outcome of interest. At baseline, the sample can be selected from a representative sample (population-based cohort studies) or a non-representative sample. However, in the successive follow-ups of the cohort members, study participants must be a representative sample of those included in the baseline.14,15 In this type of study, losses over time may cause follow-up bias.

CONCLUSION
Researchers need to decide during the planning stage of the study if they will work with the entire target population or a sample. Working with a sample involves different steps, including sample size estimation, identification of the sample frame, and selection of the sampling method to be adopted.
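As a supplementary illustration of the selection step, the simple random, systematic, and stratified procedures described under probabilistic sampling might be sketched as follows. This is a hedged example under stated assumptions, not the procedure used in any of the cited studies: the sample frame is a hypothetical numbered list of 4,000 units, the two strata ("urban" and "rural") and their sizes are invented, and all function names are illustrative. The stratified function uses fixed allocation, so each selected unit carries a sampling weight equal to the inverse of its selection probability (1/P).

```python
import random

def simple_random_sample(frame, n, rng):
    """Select n units from the frame, every unit being equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Select every k-th unit from an ordered frame after a random start."""
    k = len(frame) // n            # sampling interval
    start = rng.randrange(k)       # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Fixed allocation: the same number of units from each stratum, with weights."""
    selected = []
    for name, units in strata.items():
        p = n_per_stratum / len(units)               # selection probability in this stratum
        for unit in rng.sample(units, n_per_stratum):
            selected.append((name, unit, 1 / p))     # weight = inverse of selection probability
    return selected

rng = random.Random(42)
frame = list(range(1, 4001))                         # hypothetical frame of 4,000 units
srs = simple_random_sample(frame, 400, rng)
systematic = systematic_sample(frame, 400, rng)
strata = {"urban": frame[:3000], "rural": frame[3000:]}   # invented strata
stratified = stratified_sample(strata, 100, rng)
print(len(srs), len(systematic), len(stratified))
```

In this sketch the weights of the stratified sample add up to the size of the frame, which is why weighted analyses can recover population-level estimates even when strata are sampled at different rates.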

Mailing address:
Jeovany Martínez-Mesa
Faculdade Meridional - IMED
Escola de Medicina
R. Senador Pinheiro, 304
99070-220 - Passo Fundo - RS
Brazil
Email: [email protected]

REFERENCES
1. Martínez-Mesa J, González-Chica DA, Bastos JL, Bonamigo RR, Duquia RP.
Sample size: how many participants do I need in my research? An Bras Dermatol.
2014;89:609-15.
2. Röhrig B, du Prel JB, Wachtlin D, Kwiecien R, Blettner M. Sample size calculation
in clinical trials: part 13 of a series on evaluation of scientific publications. Dtsch
Arztebl Int. 2010;107:552-6.
3. Suresh K, Thomas SV, Suresh G. Design, data analysis and sampling techniques
for clinical research. Ann Indian Acad Neurol. 2011;14:287-90.
4. Rothman KJ, Gallacher JE, Hatch EE. Why representativeness should be avoided.
Int J Epidemiol. 2013;42:1012-4.
5. Krause M, Lutz W, Boehnke JR. The role of sampling in clinical trial design.
Psychother Res. 2011;21:243-51.
6. Roubille C, Richer V, Starnino T, McCourt C, McFarlane A, Fleming P, et al.
Evidence-based Recommendations for the Management of Comorbidities in
Rheumatoid Arthritis, Psoriasis, and Psoriatic Arthritis: Expert Opinion of the
Canadian Dermatology-Rheumatology Comorbidity Initiative. J Rheumatol.
2015;42:1767-80.
7. Larkin J, Ascierto PA, Dréno B, Atkinson V, Liszkay G, Maio M, et al. Combined
vemurafenib and cobimetinib in BRAF-mutated melanoma. N Engl J Med.
2014;371:1867-76.
8. Goncalves JR, Nappo SA. Factors that lead to the use of crack cocaine in
combination with marijuana in Brazil: a qualitative study. BMC Public Health.
2015;15:706.
9. Pimenta FB, Pinho L, Silveira MF, Botelho AC. Factors associated with chronic
diseases among the elderly receiving treatment under the Family Health Strategy.
Cien Saude Colet. 2015;20:2489-98.
10. Kelbore AG, Alemu W, Shumye A, Getachew S. Magnitude and associated factors
of Atopic dermatitis among children in Ayder referral hospital, Mekelle, Ethiopia.
BMC Dermatol. 2015;15:15.
11. Zhou SJ, Skeaff M, Makrides M, Gibson R. Vitamin D status and its predictors
among pre-school children in Adelaide. J Paediatr Child Health. 2015;51:614-9.
12. Duquia RP, Menezes AM, Almeida HL Jr, Reichert FF, Santos Ida S, Haack RL, et
al. Prevalence of sun exposure and its associated factors in southern Brazil: a
population-based study. An Bras Dermatol. 2013;88:554-61.
13. Barrios CH, Werutsky G, Martinez-Mesa J. The global conduct of cancer clinical
trials: challenges and opportunities. Am Soc Clin Oncol Educ Book. 2015:e132-9.
14. Victora CG, Barros FC. Cohort profile: the 1982 Pelotas (Brazil) birth cohort study.
Int J Epidemiol. 2006;35:237-42.
15. Boing AC, Peres KG, Boing AF, Hallal PC, Silva NN, Peres MA. EpiFloripa Health
Survey: the methodological and operational aspects behind the scenes. Rev Bras
Epidemiol. 2014;17:147-62.

How to cite this article: Martinez-Mesa J, González-Chica DA, Duquia RP, Bonamigo RR, Bastos JL. Sampling: how
to select participants in my research study? An Bras Dermatol. 2016;91(3):326-30.
