survey-sampling-2
WORKSHOP
Manage Successful
Impact Evaluations
18 - 22 JUNE 2018
WASHINGTON, DC
Sampling: Track 2
\[ n = \left[ \frac{4\sigma^2 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2}{D^2} \right] \left[ 1 + \rho (m - 1) \right] \]
Minimum detectable effect
Number of clusters
types of power calcs
• Three options
– Compute sample size given power and MDES
– Compute power given sample size and effect size
– Compute MDES given power and sample size
\[ n = \left[ \frac{4\sigma^2 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2}{D^2} \right] \left[ 1 + \rho (m - 1) \right] \]
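The three options can be sketched by solving the same formula for n, for D, or for power. This is a minimal illustration, not the workshop's own code. It assumes a normal approximation, reads n as the total sample size across two equal-sized arms, and uses illustrative function names and default values (alpha = 0.05, power = 0.80) that do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the z-values


def design_effect(rho, m):
    """Clustering inflation factor 1 + rho * (m - 1)."""
    return 1 + rho * (m - 1)


def sample_size(sigma, effect, alpha=0.05, power=0.80, rho=0.0, m=1):
    """Option 1: total n needed to detect `effect` (D in the formula)."""
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    return 4 * sigma**2 * z_sum**2 / effect**2 * design_effect(rho, m)


def power_given(sigma, n, effect, alpha=0.05, rho=0.0, m=1):
    """Option 2: power for a given total n and effect size."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = effect * sqrt(n / (4 * sigma**2 * design_effect(rho, m))) - z_alpha
    return _STD_NORMAL.cdf(z_beta)


def mdes(sigma, n, alpha=0.05, power=0.80, rho=0.0, m=1):
    """Option 3: smallest detectable effect for a given total n and power."""
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    return z_sum * sqrt(4 * sigma**2 * design_effect(rho, m) / n)


# Hypothetical inputs: sd of 1, effect of 0.2 sd, no clustering.
n_needed = sample_size(sigma=1.0, effect=0.2)
print(round(n_needed))
```

Feeding the result of one function into the others returns the inputs: with `n_needed`, `mdes` gives back 0.2 and `power_given` gives back 0.80.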
significance and confidence
• type I error: false positive
– detect an effect when no effect is present
• e.g. result indicates a treatment has an effect when in truth it has no
effect
• confidence level: (1 − α)
– probability that we do not find a statistically significant effect if
the treatment effect is zero
• power: (1 − β)
– probability of correctly rejecting H0 when the treatment has a true effect
\[ n = \left[ \frac{4\sigma^2 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2}{D^2} \right] \left[ 1 + \rho (m - 1) \right] \]
Effect size
• What is minimum detectable effect?
Baseline values: paddy: 2.6 tons/ha; wheat: 1.9 tons/ha; maize: 2.2 tons/ha; potato: 12.9 tons/ha
\[ n = \left[ \frac{4\sigma^2 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2}{D^2} \right] \left[ 1 + \rho (m - 1) \right] \]
QUIZ
An intervention increases employment by 10% for
treatment group on average in two different populations.
Would you expect a difference in sample size needed to
detect the effect in the two populations?
variance of outcomes
• σ² = variance of the outcome of interest for the study
population
\[ n = \left[ \frac{4\sigma^2 \left( z_{1-\alpha/2} + z_{1-\beta} \right)^2}{D^2} \right] \left[ 1 + \rho (m - 1) \right] \]
QUIZ
Take up vs. sample size
[Figure: chart of required sample size against take-up. Y-axis: Sample size (0 to 6000). X-axis: Proportion of HHs taking up voucher (from 1 down to 0.3).]
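The relationship between take-up and sample size can be illustrated with a short sketch. It rests on an assumption the slides do not state: if only a proportion p of the treatment group takes up the voucher and nobody in the control group does, the average effect on the assigned group shrinks by the factor p, so the required sample size grows by 1/p². The starting sample size of 500 is hypothetical and is not read off the chart.

```python
def inflate_for_take_up(n_full_take_up, take_up):
    """Scale a sample size computed for 100% take-up to partial take-up.

    Assumes the detectable effect is diluted in proportion to take-up,
    so n grows with 1 / take_up**2 (an assumption, not from the slides).
    """
    if not 0 < take_up <= 1:
        raise ValueError("take_up must be in (0, 1]")
    return n_full_take_up / take_up**2


# Hypothetical baseline: 500 households needed at full take-up.
for p in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3):
    print(p, round(inflate_for_take_up(500, p)))
```

Under this assumption, halving take-up quadruples the required sample.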
Other factors
• Attrition
– Effectively the same problem as take-up
– Especially serious if cluster-level
• Compliance
– If there is contamination (control group also adopts), it will be
more difficult to discern a meaningful difference
• Data quality
– Missing observations: approximately = attrition
– Measurement error → increased variance, less precision
conclusions
The larger the sample size has to be:
• The smaller the effects that we want to detect
• The more underlying heterogeneity (variance)
• The higher the level of clustering
• The lower the take-up
Now you know how many people to
sample ….
How do you identify them?
Sampling in practice
• Best case scenario: complete sampling frame
already exists
• Pros:
– Easiest for the enumerator
• Cons:
– Enumerators are likely to choose the traders they
can find easily
– No guarantee of representation of all types of
traders
Selecting the sample – method 2
• Provide a walking skip pattern based on
market size for enumerators to follow
Market size   Sample size   Skip pattern to follow
<=30          100%          every trader will be interviewed
31-100        50%           every 2nd trader will be interviewed
101-400       33.3%         every 3rd trader will be interviewed
401-500       25%           every 4th trader will be interviewed
501-600       20%           every 5th trader will be interviewed
601-700       16.67%        every 6th trader will be interviewed
701-800       14.28%        every 7th trader will be interviewed
801-900       12.5%         every 8th trader will be interviewed
901-1000      11.11%        every 9th trader will be interviewed
...so on
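The skip pattern in the table can be written as a small lookup, sketched below. Two points are assumptions rather than part of the slides: the rule for markets above 1000 traders extends the visible pattern (one more skip per additional 100 traders), and the walk is assumed to start at the k-th trader. The function names are illustrative.

```python
from math import ceil


def skip_interval(market_size):
    """Return k such that every k-th trader is interviewed."""
    if market_size < 1:
        raise ValueError("market_size must be at least 1")
    if market_size <= 30:
        return 1
    if market_size <= 100:
        return 2
    if market_size <= 400:
        return 3
    # 401-500 -> 4, 501-600 -> 5, ... (extrapolated beyond 1000: assumption)
    return ceil(market_size / 100) - 1


def select_traders(market_size):
    """Positions (1-based) of traders to interview along the walking route."""
    k = skip_interval(market_size)
    # Assumption: counting starts so that the k-th trader is the first interview.
    return list(range(k, market_size + 1, k))


print(skip_interval(450), select_traders(40)[:5])
```

Encoding the table this way removes the mental arithmetic that the enumerators would otherwise do in the market.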
Selecting the sample – method 2
• Pros
– There is some form of randomness
– All trader types are likely to be represented
• Cons:
– Enumerators have to do a lot of mental math!
– Very hard to verify whether sampling pattern was
followed
Selecting the sample – method 3
• Rely on technology - Program the survey form
to dynamically pick the traders to survey
• Pros
– Enumerators just have to locate the stall listed on the
tablet screen
– All trader types are likely to be represented
• Cons:
– Programming of the randomization might take time
– Randomization is not replicable if done on SurveyCTO
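One way to get a replicable draw is to select the stalls outside the survey form with a recorded seed and then load the selected list onto the tablets. The sketch below illustrates that idea only; it is not how SurveyCTO works and not necessarily what the team did. The stall IDs, the seed, and the function name are hypothetical.

```python
import random


def pick_stalls(stall_ids, n_required, seed):
    """Draw a simple random sample of stalls, reproducible for a given seed."""
    rng = random.Random(seed)      # private generator, so the seed fully fixes the draw
    ordered = sorted(stall_ids)    # fixed input order, so the same seed gives the same sample
    return sorted(rng.sample(ordered, n_required))


# Hypothetical market with 120 listed stalls, 40 to be surveyed.
stalls = range(1, 121)
first_draw = pick_stalls(stalls, 40, seed=20180618)
second_draw = pick_stalls(stalls, 40, seed=20180618)
print(first_draw == second_draw)
```

Because the seed is stored, anyone can rerun the draw later and verify which stalls should have been surveyed.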
What would you do?
What was actually done?
• Weighing the pros and cons of each available
method, we left the selection to technology
(method 3)
• Overestimated the required sample in each
market to ensure the required number of trader
surveys was reached
Case Study 2 -
Irrigation Scheme Farmer Survey
Irrigation impact evaluation
sampling
• Spatial regression discontinuity design
– Compare plots just below irrigation canal to those
just above
• Therefore need to sample plots close to irrigation canal
• How to do that?
– Listing HHs in the neighboring villages?
• Plots aren’t necessarily close to villages
• People won’t accurately be able to say whether the plot
is within 50m of the canal
What did we do?
• Dropped uniform grid of points across full site
at 2m resolution
• Randomly sampled points, excluding any point
within 10m of a point selected
• Enumerators equipped with GPS units visited
each sampled point to identify whether the
point is agricultural land, and if so find out
who cultivates it
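The first two steps can be sketched as follows: lay a 2m grid over the site, then accept randomly ordered grid points only if they keep the minimum spacing from points already selected. This is an illustration under simplifying assumptions, not the GIS workflow the team used: the site is a hypothetical rectangle in metre coordinates, "within 10m" is read as a distance strictly below 10m, and the function names are invented.

```python
import random
from math import hypot


def make_grid(width_m, height_m, step_m=2):
    """Uniform grid of (x, y) points over a rectangular site (assumed shape)."""
    return [(x, y)
            for x in range(0, width_m + 1, step_m)
            for y in range(0, height_m + 1, step_m)]


def sample_with_min_spacing(points, n_target, min_dist_m=10, seed=0):
    """Randomly select points, skipping any closer than min_dist_m to a chosen one."""
    rng = random.Random(seed)
    candidates = list(points)
    rng.shuffle(candidates)        # random order = random selection
    chosen = []
    for x, y in candidates:
        if all(hypot(x - cx, y - cy) >= min_dist_m for cx, cy in chosen):
            chosen.append((x, y))
            if len(chosen) == n_target:
                break
    return chosen


# Hypothetical 100m x 100m site, 20 points requested.
grid = make_grid(100, 100)
sampled = sample_with_min_spacing(grid, n_target=20)
print(len(grid), len(sampled))
```

If the site is too small for the requested number of points at that spacing, the function returns fewer points than `n_target`, so the length of the result should be checked.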
Draw 50m buffer
Drop 3000 points
Assign 24 enumerators to points
outcome
• From 3000 points, 2932 successfully visited and description
recorded
• So, use the best available data to estimate values for each
parameter. Sources to consider:
• High-quality nationally representative survey (e.g. LSMS)
• Data from DIME IE in same country (or region, if pressed)
• Review the literature – especially published papers on the sector and
country. Will certainly include effect sizes, possibly also useful
descriptives (e.g. that you could use for assumptions about means)
• Pros
– Better output: more info, graph option. Automatically saved.
– More flexible in terms of input/output choices
– Can compute sample size of control group given treatment group size
(or vice versa)
– Directly calculate MDES
– Allows for treatment and control groups of different sizes
• Cons
– No straightforward way to control for repeated measures
power
• Options
– cluster
• Allows for clustered sampling designs
– power onemean
• assume means same in tmt & control (e.g. randomization)
– n sample size
• n1() control group size, n2() treatment group size
– nratio ratio of n2/n1
• default is 1
• not necessary to specify if you list n1 and n2
– power, table outputs results in table format
– power, saving(filename [, replace]) saves results in .dta
format
sampsi
• No longer an official Stata command (replaced by
power), though it continues to work
• Pros
– Works with Stata 13 or earlier
– Allows repeated measures (multiple follow-ups)
• Cons
– Does not allow clustering
– Have to impute MDES
– Defaults to 90% power (not really a con, but be aware)
sampsi options
• onesample: use if randomized (assume means the same
between treatment and control)
• Sample size
– n1(#) size of treatment group
– n2(#) size of control group
– ratio(#) n2/n1, default is 1
• Repeated measures
– pre(#) number of baseline measurements
– post(#) number of follow-up measurements
– r0(#) correlation between baseline measures (default r0 = r1)
– r1(#) correlation between follow-up measures
– r01(#) correlation between baseline and follow-up
• method(post | change | ancova | all), default is all
sampsi
• Default is to compute sample size
• To compute power: specify n1 or n2
• To compare means (not proportions), must
specify sd1(#) or sd2(#)
• For repeated measures, sd1(#) or sd2(#) must
be specified
sampsi example syntax
• Simple case: one-sample comparison of mean to
hypothesized value.
– Take sample size as given to compute power:
• sampsi #1 #2, sd(#) n(#) onesample
where #1 = baseline mean, #2 = hypothesized mean, sd(#) = postulated sd, n(#) = sample size
• sampsi 0 2.5, sd(4) n(25) onesample
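The calculation behind this example can be approximated by hand. The sketch below is not Stata's implementation. It assumes a two-sided test at α = 0.05, uses a normal approximation, and ignores the negligible probability of rejecting in the wrong direction. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_power(mean_null, mean_alt, sd, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test of a mean."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # Standardized distance between the hypothesized and baseline means.
    shift = abs(mean_alt - mean_null) * sqrt(n) / sd
    return z.cdf(shift - z_alpha)


# Same inputs as the sampsi example: means 0 and 2.5, sd 4, n 25.
power = one_sample_power(0, 2.5, sd=4, n=25)
print(round(power, 3))
```

With these inputs the approximate power is a little under 0.88.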
• Cons
– Have to impute MDES
– Does not allow for repeated measures
– Does not allow for baseline correlation
clsampsi options
• m(#) cluster size in treatment and control assuming equal
cluster size in tmt & control
– alternative m1(#) and m2(#)
• k(#) number of clusters in tmt and control assuming equal
number in tmt & control
– Alternative k1(#) and k2(#)
• sd(#) standard deviation assuming same sd in tmt & control
– Alternative sd1(#) and sd2(#)
• rho(#) ICC assuming same in tmt & control
– Alternatively rho1 and rho2
• sampsi determines power of means (or proportion)
comparison using the standard sampsi command
clsampsi less common options
• varm(#) cluster size variation assuming same
in tmt & ctl
– only affects power if larger than m(#) and rho(#)>0
– Calculate the effect of cluster-size variation
(varm1()) on the required sample size
• clsampsi 3 2.3, sd1(2) sd2(1.55) m1(6) m2(8)
varm1(100) rho1(0.2)
clustersampsi
• Pros
– Allows for clustering
– Allows for baseline correlations
– Directly calculates MDES
• Cons
– Doesn’t allow for different sized treatment /
control groups
– Doesn’t allow for repeated measures
clustersampsi options
• detectabledifference calculate MDES
– Alternative options: power, samplesize
– to use detectabledifference must specify m, k, mu1
• rho(#) ICC
• k(#) number of clusters in each arm
• m(#) average cluster size
• size_cv(#) coefficient of variation of cluster sizes
(default is 0). Can be any non-negative number, including values greater than 1.
• mu1 mean for tmt (mu2 = mean for control)
• sd1 standard deviation for tmt (sd2 = standard deviation for control)
• base_correl correlation btw baseline measurements (or
other predictive covariates) and outcome
clustersampsi example
• Detectable difference for fixed sample size:
compute the difference detectable with 20
clusters per arm each of size 10 between two
means where the baseline mean is 300 and
ICC is 0.05.
– clustersampsi, detectabledifference mu1(300)
m(10) k(20) rho(0.05)
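For intuition, the detectable difference for this design can be approximated in standard-deviation units from the sample size formula used earlier in the deck. The sketch assumes 80% power, a two-sided α of 0.05, equal arms, and a normal approximation. It is not the clustersampsi algorithm and need not match its output. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def clustered_mdes_in_sd(k_per_arm, m, rho, alpha=0.05, power=0.80):
    """Approximate MDES, in units of the outcome's standard deviation."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    n_total = 2 * k_per_arm * m          # two arms, k clusters each, m per cluster
    deff = 1 + rho * (m - 1)             # design effect from clustering
    return z_sum * sqrt(4 * deff / n_total)


# Design from the example: 20 clusters per arm, cluster size 10, ICC 0.05.
mdes_sd = clustered_mdes_in_sd(k_per_arm=20, m=10, rho=0.05)
print(round(mdes_sd, 3))
```

Multiplying the result by the outcome's standard deviation gives the detectable difference in the outcome's own units. The design effect here is 1.45, so clustering inflates the detectable difference by about 20% relative to a simple random sample of the same size.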
sampclus
• Add-on to sampsi that allows for clustering
• Must be preceded by sampsi