Sample Surveys - Nonprobability Sampling

This document discusses nonprobability sampling methods, including convenience sampling and quota sampling. 1) Convenience sampling involves collecting data from the population units that are easiest to access, such as students at a university. It is the cheapest method, but extending its results to the wider population is hard to justify, particularly when participants select themselves. 2) Quota sampling constrains the sample so that its proportions of certain groups, such as age and sex, match known population proportions. It is commonly used in market and opinion research. 3) Both convenience and quota sampling are easier and cheaper than probability sampling, but design-based inference cannot be used to assess their bias or variability, so generalizing to the target population rests on modeling assumptions that cannot be verified from the sample alone.

Uploaded by

Max Sarmento

Sample Surveys: Nonprobability Sampling

A sample collected from a finite population is said to be a probability sample if each unit of the population has nonzero probability of being selected into the sample, and that probability is known. Traditional methods of probability sampling include simple and stratified random sampling, and cluster sampling. Conclusions concerning the population may be obtained by design-based, or randomization, inference. See Sample Surveys: The Field and Sample Surveys: Methods. The values of variables of interest in the population are considered as fixed quantities, unknown except for those units selected into the sample. Inference proceeds by considering the behavior of estimators of quantities of interest under the randomization distribution, based on the known selection probabilities. For example, if the $N$ population values of variable $Y$ are denoted $Y_1, \ldots, Y_N$ and the $n$ sample values by $y_1, \ldots, y_n$, then $\bar{y}$, the sample mean, is a possible estimator for $\bar{Y}$, the population mean. If the sample is obtained by simple random sampling, then, with respect to this randomization distribution, $\bar{y}$ is unbiased for $\bar{Y}$ and has sampling variance

$$\frac{N-n}{Nn(N-1)} \sum_{i=1}^{N} (Y_i - \bar{Y})^2.$$

Nonprobability sampling refers to any method of obtaining a sample from a population which does not satisfy the criteria for probability sampling. Nonprobability samples are usually easier and cheaper to collect than probability samples, as the data collector is allowed to exercise some choice as to which units to include in the sample. For a probability sample, this choice is made entirely by the random sampling mechanism. However, methods of design-based inference cannot be applied to assess the bias or variability of estimators based on nonprobability samples, as such methods do not allow for unknown or zero selection probabilities.

Surveys carried out by national statistical agencies invariably use probability sampling. Marsh and Scarborough (1990) also noted ‘the preponderance of probability sampling in university social science.’ Nonprobability sampling is much more common in market and opinion research. However, Taylor (1995) observed large national differences in the extent to which nonprobability sampling, particularly quota sampling, is viewed as an acceptable tool for market research. In Canada and the USA, probability sampling using telephone polling and random-digit dialing is the norm for public opinion surveys. In Australia and South Africa probability sampling is also prevalent, but with face-to-face interviews. On the other hand, in many European countries such as France and the UK, quota sampling is much more common.

1. Convenience Sampling

The easiest and cheapest way to collect sample data is to collect information on those population units which are most readily accessible. A university researcher may collect data on students. Surveys carried out through newspapers, television broadcasts or Internet sites (as described, for example, by Bradley, 1999) are necessarily restricted to those individuals who have access to the medium in question. Sometimes only a small fraction of the population is accessible, in which case the sample may consist of exactly those units which are available for observation.

Some surveys involve an element of self-selection where individuals decide whether to include themselves in the sample or not. If participation is time-consuming, or financial cost is involved, then the sample is more likely to include individuals with an interest in the subject of the survey. This may not be important. For example, an interest in participating in an experimental study of behavior might be considered to be unlikely to be associated with the outcome of the experiment. However, where the variable of interest relates to opinion on a question of interest, as is often the case in newspaper, television or Internet polls, it is likely that interest in participation is related to opinion, and it is much harder to justify using the sample data to make conclusions about a wider population.

A famous example of the failure of such a nonprobability sample to provide accurate inferences about a wider population is the Literary Digest poll of 1936. Ten million US citizens were sent postcard ballots concerning the forthcoming presidential election. Around 2 million of these were returned, a sample size which, if associated with a simple random sample, would be expected to predict the population with negligible error. However, when calibrated against the
election results, the Literary Digest poll was in error by 19 percentage points in predicting Roosevelt’s share of the vote.

On the other hand, useful inferences can be made using convenience samples. Smith and Sugden (1988) considered statistical experiments, where the allocation of a particular treatment to the units under investigation is controlled, usually by randomization. In such experiments, the selection of units is not usually controlled and is often a convenience sample. For example, individuals might be volunteers. Nevertheless, inferences are often successfully extended to a wider population. Similarly, observational studies where neither treatment allocation nor sample selection is controlled, usually because it is impossible to do so, can be thought of as arising from convenience samples. Smith (1983) noted that Doll and Hill (1964) in their landmark study of smoking and health, used a sample entirely made up of medical practitioners. However, the validity of extending conclusions based on their data, to the general population, is now widely recognized.

Studies based on convenience samples can be an extremely effective way of conducting preliminary investigations, but it is desirable that any important conclusions drawn about a wider population are further investigated, preferably using probability samples. Where some kind of explanatory, rather than simply descriptive, inference is desired, Smith and Sugden (1988) argued that ‘the ideal studies are experiments within surveys in which the scientist has control over both the selection of units and the allocation of treatments.’ This approach was considered in detail by Fienberg and Tanur (1989).

2. Quota Sampling

When using survey data to draw an inference about a population of interest, the hope of the analyst is that sample estimators of quantities of interest are close to the corresponding population values. If a nonprobability sample has been collected, then it is instructive to observe the precision of sample estimators of known population quantities. For example, how do the sample proportions of males and females compare to known population values? If they differ substantially, then the sample is ‘unrepresentative’ of the population and one might have legitimate cause for concern about the reliability of estimates of unknown quantities of interest. Purposive sampling is a term used for methods of choosing a nonprobability sample in a way that makes it ‘representative’ of the population, although there is no generally agreed definition of a representative sample, and purposive sampling is often based on subjective considerations.

In quota sampling, the sample selection is constrained to ensure that the sample proportions of certain control variables approximately match the known population proportions. For example, if the population proportions of males and females are equal, then equal numbers of male and female units are selected into the sample. Age groups are also commonly used in designing quota samples. Sample totals for each cell of a cross-classification of two or more control variables (for example, age by sex) may also be fixed by the design. Examples are given by Moser and Kalton (1971). Quota sampling is most commonly used in market and opinion research, where control variables usually include age, sex, and socio-economic class. Other variables such as employment status and housing tenure are also used. The known population proportions for the control variables are calculated from census data, or from surveys based on large probability samples. Variables with known population totals which are not used in setting quotas may be used for weighting in any subsequent analyses.

Where data collection involves visiting households, further constraints beyond the quotas may be applied to sample selection. For example, data collectors may be assigned a prespecified travel plan. However, where the mode of data collection involves intercepting individuals on the street for interview, then the only constraint on the data collector may be to satisfy the quotas. It is this freedom given to the data collector that provides both the biggest advantage and biggest disadvantage of quota sampling. The advantage is that with only the quota constraints to satisfy, data collection is relatively easy. Such surveys can be carried out rapidly by an individual data collector performing interviews on a busy street corner. As with any nonprobability sampling scheme, however, there is no way of assessing the bias associated with quota sampling. The sample units are necessarily selected from those which are available to the data collector, given their mode of interviewing. If availability is associated with any of the survey variables, then significant bias may occur. Advocates of quota sampling argue that the quotas control for this, but there is no way of guaranteeing that they do. Neither can design-based inference be used to assess the variability of estimates based on quota samples. Sometimes, a simple model is used to assess this variability. If one assumes that the data collectors used are drawn from a population of possible data collectors, then the ‘between collector’ variance combines both sampling variability and interviewer variability. Deville (1991) modeled the quota sampling process and provided some alternative measures of variability.

Studies comparing quota and probability sampling have been carried out. Moser and Stuart (1953) discovered apparent availability biases in the quota samples they investigated, with respect to the variables occupation and education. In particular, they noticed that the quota samples underestimated the proportion of population with lower levels of education. Marsh and Scarborough (1990) investigated nine possible sources of availability bias in quota samples. They
found that, amongst women, their quota sample overestimated the proportion from households with children. Both studies found that the quota samples tended to underestimate the proportion of individuals in the extreme (high and low) income groups.

Quota samples are often used for political opinion polls preceding elections. In such examples they can be externally validated against the election results and historically quota samples have often been shown to be quite accurate. Indeed Worcester (1996) argued that election forecasts using quota samples for UK elections in the 1970s were more accurate than those using probability samples. Smith (1996) presented similar evidence. However, it is also election forecasting which has led to quota sampling coming under closest scrutiny. In the US presidential election of 1948, the Crossley, Gallup, and Roper polls all underestimated Truman’s share of the vote by at least five percentage points, and as a consequence, predicted the wrong election winner. Mosteller et al. (1949) in their report on the failure of the polls found one of the two main causes of error to be errors of sampling and interviewing, and concluded (p. 304) that ‘it is likely that the principal weakness of the quota control method occurred at the local level at which respondents are selected by interviewers.’

The UK general election of 1992 saw a similar catastrophic failure of the pre-election opinion polls, with pre-election polls giving Labour an average lead of around 1.5 percentage points. In the election, the Conservative lead over Labour was 7 percentage points. A report by the Market Research Society Working Party (1994) into the failure of the polls identified inaccuracies in setting the quota controls as one of a number of possible sources of error. As a result the sample proportions of the key variables did not accurately reflect the proportions in the population. Lynn and Jowell (1996) attributed much of the error to the selection bias inherent in quota sampling, and argued for increased use of probability sampling methods for future election forecasts.

3. A Formal Framework

As methods of design-based inference cannot be applied to data obtained by nonprobability sampling, any kind of formal assessment of bias and variability associated with nonprobability samples requires a model-based approach (see Sample Surveys: Model-based Approaches). Smith (1983) considered the following framework, which can be used to assess the validity of inferences from various kinds of nonprobability samples. Let $i = 1, \ldots, N$ denote the population units, vector $Y_i$ the values of the unknown survey variables, and vector $Z_i$ the values of variables which are known prior to the survey. Let $A$ be a binary variable indicating whether a unit is selected into the sample ($A_i = 1$) or not ($A_i = 0$), and let $A_s$ be the values of $A$ for the observed sample. Smith (1983) modeled the population values of $Y$ and the selection process jointly through

$$f(Y, A_s \mid Z; \theta, \phi) = f(Y \mid Z; \theta)\, f(A_s \mid Y, Z; \phi) \qquad (1)$$

where $\theta$ and $\phi$ are distinct model parameters for the population model and selection model respectively. Given $A_s$, $Y$ can be partitioned as $(Y_s, Y_{\bar{s}})$ into observed and unobserved values.

Inferences based on the observed data model $f(Y_s \mid Z; \theta)$ and extended to the population are said to ignore the selection mechanism, and in situations where this is valid, the selection is said to be ignorable (Rubin, 1976); see Statistical Data, Missing. Selection is ignorable when

$$f(A_s \mid Y, Z; \phi) = f(A_s \mid Z; \phi) \qquad (2)$$

so that the probability of making the observed selection, for given $Z$, is the same for all $Y$. A sufficient condition for this is that $A$ and $Y$ are conditionally independent given $Z$. A probability sampling scheme, perhaps using some stratification or clustering based on $Z$, is clearly ignorable.

Nonprobability sampling schemes based on $Z$ (for example selecting exactly those units corresponding to a particular set of values of $Z$) are also ignorable. However, whether or not inferences are immediately available for values of $Z$ not contained in the sample depends on the form of the population model $f(Y \mid Z; \theta)$ and, in particular, whether the entire $\theta$ is estimable using $Y_s$. If $Y$ is independent of $Z$ then there is no problem, but this is an assumption which cannot be verified by sample data based on a restricted sample of values of $Z$. If this assumption seems implausible, then post-stratification may help. Smith (1993) considered partitioning the variables comprising $Y$ into measurement variables $Y^m$ and stratification variables $Y^q$, and post-stratifying. If

$$f(Y^m_s \mid Y^q_s, Z; \xi) = f(Y^m_s \mid Y^q_s; \xi) \qquad (3)$$

where $\xi$ are parameters for the post-stratification model, then inference for any $Z$ is available. This condition implies that, given the observed values $Y^q_s$ of the stratification variables, $Z$ gives no further information concerning the measurement variables. This approach provides a way of validating certain inferences based on a convenience sample, where $Z$ is an indicator variable defining the sample.

Smith (1983) also considered ignorability for quota sampling schemes. He proposed modeling selection into a quota sample in two stages, selection into a larger sample for whom quota variables $Y^q$ are recorded, followed by selection into the final sample, based on a unit’s quota variables and the requirements to fill the quota. For the final sample, the variables of interest $Y^m$ are recorded. Two ignorability conditions result, requiring that at neither stage does probability of selection, given $Y^q$ and $Z$, depend on $Y^m$.
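
As a rough numerical illustration of condition (2), the Python sketch below builds a synthetic population, selects units under two hypothetical selection rules, and compares the naive sample mean with an estimate that weights each stratum of Z by its known population share. Everything in it (the function names, the population size, the distribution of Y, and the selection probabilities) is an assumption made for illustration and is not taken from Smith (1983). For simplicity it post-stratifies on Z itself rather than on separate stratification variables.

```python
import random
from statistics import mean


def simulate(selection, n_pop=20000, seed=1):
    """Return (population mean, naive sample mean, post-stratified mean).

    selection(y, z) gives the probability that a unit is selected (A_i = 1).
    All numerical settings here are hypothetical.
    """
    rng = random.Random(seed)
    # Z is known before the survey (a binary stratum label); Y is the survey variable.
    z = [1 if rng.random() < 0.3 else 0 for _ in range(n_pop)]
    y = [rng.gauss(10.0 + 5.0 * zi, 2.0) for zi in z]
    # Select each unit independently with its own selection probability.
    sample = [(yi, zi) for yi, zi in zip(y, z) if rng.random() < selection(yi, zi)]
    naive = mean(yi for yi, _ in sample)
    # Post-stratify on Z: weight each stratum's sample mean by its known population share.
    post = 0.0
    for stratum in (0, 1):
        share = sum(1 for zi in z if zi == stratum) / n_pop
        post += share * mean(yi for yi, zi in sample if zi == stratum)
    return mean(y), naive, post


def depends_on_z_only(y, z):
    # Satisfies condition (2): given Z, selection does not depend on Y.
    return 0.02 + 0.18 * z


def depends_on_y(y, z):
    # Violates condition (2): units with large Y are far more likely to be selected.
    return 0.2 if y > 12.0 else 0.02


for name, rule in [("ignorable given Z", depends_on_z_only), ("non-ignorable", depends_on_y)]:
    true_mean, naive, post = simulate(rule)
    print(f"{name}: population {true_mean:.2f}, naive {naive:.2f}, post-stratified {post:.2f}")
```

Under these assumed settings the naive mean is pulled toward the over-sampled stratum in both cases. Weighting by the known shares of Z should bring the estimate close to the population mean only when selection depends on Z alone; when selection depends on Y, the weighted estimate remains biased.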

This formal framework makes clear, through expressions such as (2) and (3) when model-based inferences from nonprobability samples can and cannot be used to provide justifiable population inferences. However, it is important to realize that the assumptions required to ensure ignorability cannot be verified using the sample data alone. They remain assumptions which need to be subjectively justified before extending any inferences to a wider population.

These formal concepts of ignorability confirm more heuristic notions of what is likely to comprise a good nonprobability sampling scheme. For example, opinion polls with a large element of self-selection are highly unlikely to result in an ignorable selection. On the other hand one might have much more faith in a carefully constructed quota sampling scheme, where data collectors are assigned to narrowly defined geographical areas, chosen using a probability sampling scheme, and given restrictive guidelines on choosing the units to satisfy their quota.

4. Discussion

The distinction between probability sampling and nonprobability sampling is necessarily coarse. At one extreme is a carefully constructed probability survey with no nonresponse; at the other extreme is a sample chosen entirely for the investigator’s convenience. However, most surveys fall between these two extremes, and therefore strictly should be considered as nonprobability samples. Examples include quota surveys of households where the geographical areas for investigation are chosen using a probability sample, or statistical experiments where a convenience sample of units is assigned treatments using a randomization scheme. The validity of any inferences extended to a wider population depends on the extent to which the selection of units is ignorable for the inference required. This applies equally to any survey with nonresponse. The presence of nonresponders in a probability survey introduces a nonprobability element into the selection mechanism. Considerations of ignorability (of nonresponse) now need to be considered. However, surveys with probability sampling usually make a greater effort to minimize nonresponse than nonprobability surveys, where there is little incentive to do so. Furthermore, even with nonresponse, it is easier to justify ignorability of a probability sampling mechanism.

Further details concerning specific issues may be obtained from the sources referenced above. Alternative perspectives on nonprobability sampling are provided by general texts on sampling such as Hansen et al. (1953), Stephan and McCarthy (1958) and Moser and Kalton (1971).

See also: Sample Surveys, History of; Sample Surveys: Survey Design Issues and Strategies

Bibliography

Bradley N 1999 Sampling for internet surveys. An examination of respondent selection for internet research. Journal of the Market Research Society 41: 387–95
Deville J-C 1991 A theory of quota surveys. Survey Methodology 17: 163–81
Doll R, Hill A B 1964 Mortality in relation to smoking: ten years’ observations of British doctors. British Medical Journal 1: 1399–410
Fienberg S E, Tanur J M 1989 Combining cognitive and statistical approaches to survey design. Science 243: 1017–22
Hansen M H, Hurwitz W N, Madow W G 1953 Sample Survey Methods and Theory. Volume 1: Methods and Applications. Wiley, New York
Lynn P, Jowell R 1996 How might opinion polls be improved? The case for probability sampling. Journal of the Royal Statistical Society A 159: 21–8
Market Research Society Working Party 1994 The Opinion Polls and the 1992 General Election. Market Research Society, London
Marsh C, Scarborough E 1990 Testing nine hypotheses about quota sampling. Journal of the Market Research Society 32: 485–506
Moser C A, Kalton G 1971 Survey Methods in Social Investigation. Heinemann, London
Moser C A, Stuart A 1953 An experimental study of quota sampling (with discussion). Journal of the Royal Statistical Society A 116: 349–405
Mosteller F, Hyman H, McCarthy P J, Marks E S, Truman D B 1949 The Pre-election Polls of 1948: Report to the Committee on Analysis of Pre-election Polls and Forecasts. Social Science Research Council, New York
Rubin D B 1976 Inference and missing data. Biometrika 63: 581–92
Smith T M F 1983 On the validity of inferences from non-random samples. Journal of the Royal Statistical Society A 146: 394–403
Smith T M F 1996 Public opinion polls: the UK general election, 1992. Journal of the Royal Statistical Society A 159: 535–45
Smith T M F, Sugden R A 1988 Sampling and assignment mechanisms in experiments, surveys and observational studies. International Statistical Review 56: 165–80
Stephan F F, McCarthy P J 1958 Sampling Opinions. An Analysis of Survey Procedure. Wiley, New York
Taylor H 1995 Horses for courses: how survey firms in different countries measure public opinion with very different methods. Journal of the Market Research Society 37: 211–19
Worcester R 1996 Political polling: 95% expertise and 5% luck. Journal of the Royal Statistical Society A 159: 5–20

J. J. Forster

Copyright © 2001 Elsevier Science Ltd. All rights reserved.

International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7
