
FIELD COORDINATOR WORKSHOP
Manage Successful Impact Evaluations
18 - 22 JUNE 2018
WASHINGTON, DC

Sampling: Track 2
Maria Jones & Roshni Khincha
21 June 2018
outline
1. Sample size calculations: key parameters
   – Choose your own adventure!
   – Quick quizzes; we'll spend more time where there are knowledge gaps
2. Sampling in challenging contexts: 2 case studies
   – Market and trader survey
   – Farm survey in an irrigation scheme
3. Programming power calculations: Stata options
   – For your reference as an appendix
Sample Size Calculations -
Key Parameters to Understand
sample size equation

n = \left[ \frac{4\sigma^2 \,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]

where σ² = variance of the outcome, α = significance level, 1 − β = power, D = minimum detectable effect, ρ = intracluster correlation coefficient (ICC), and m = cluster size (units per cluster).
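To make the formula concrete, here is a minimal Stata sketch that plugs in assumed values (standardized outcome, 5% significance, 80% power, MDES of 0.2 SD, ICC of 0.05, 10 units per cluster); these numbers are illustrative, not from the workshop:

* sketch with assumed values: sigma = 1, alpha = 0.05, power = 0.80,
* MDES D = 0.2 SD, ICC rho = 0.05, cluster size m = 10
scalar z_a    = invnormal(1 - 0.05/2)       // 1.96
scalar z_b    = invnormal(0.80)             // 0.84
scalar ntotal = (4*1^2*(z_a + z_b)^2 / 0.2^2) * (1 + 0.05*(10 - 1))
display "required total sample size: " ceil(ntotal)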
types of power calcs
• Three options
– Compute sample size given power and MDES
– Compute power given sample size and effect size
– Compute MDES given power and sample size

• For IE, typically assume power of 80% and solve for either sample size or MDES
  – Most often, take sample size as given (based on IE design / population / budget) and solve for MDES
  – If MDES too large, useful to reverse – put in largest acceptable MDES and solve for sample size
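For reference, the three options map onto Stata's built-in power command (Stata 13 and later); the means, SD, and sample size below are placeholder assumptions:

power twomeans 0 0.2, sd(1) power(0.8)       // solve for sample size given power and effect
power twomeans 0 0.2, sd(1) n(800)           // solve for power given sample size and effect
power twomeans 0, sd(1) n(800) power(0.8)    // solve for MDES given power and sample size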
power and confidence

n = \left[ \frac{4\sigma^2 \,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]
significance and confidence
• type I error: false positive
– detect an effect when no effect is present
• e.g. result indicates a treatment has an effect when in truth it has no
effect

• significance level: α (alpha)
  – probability of a type I error

• confidence level: (1 − α)
  – probability that we do not find a statistically significant effect if the treatment effect is zero

• for power calcs assume 95% confidence


power
• type II error: false negative
– fail to detect an effect when an effect is present
– e.g. a result that indicates that a treatment has no
effect when in truth it has an effect

• β (beta) is the likelihood of making a type II error

• power: (1 − β)
– probability of correctly rejecting H0

• for power calcs assume 80% power


effect size

n = \left[ \frac{4\sigma^2 \,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]
Effect size
• What is minimum detectable effect?
• How should you determine a reasonable MDES?
MDES
§ D: the smallest effect size that, if it were any smaller, the intervention would not be worth the effort
§ a.k.a. Minimum Detectable Effect Size (MDES)
§ The smaller the effect you want to be able to detect, the larger the sample you will need
  § larger sample → more precise measuring device
§ Very common to solve for MDES when doing IE power calculations
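Rearranging the sample size equation gives the MDES directly for a fixed n (the same formula, solved for D):

D = (z_{1-\alpha/2} + z_{1-\beta}) \sqrt{\frac{4\sigma^2 \,[1 + \rho(m-1)]}{n}}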
QUIZ

Baseline values: paddy: 2.6 tons/ha; wheat: 1.9 tons/ha; maize: 2.2 tons/ha; potato: 12.9 tons/ha

The impact evaluation focuses on improving crop productivity through farmer field schools. Based on the above results framework, how would you think about deciding a reasonable MDES?
variance of outcomes

n = \left[ \frac{4\sigma^2 \,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]
QUIZ
An intervention increases employment by 10% for
treatment group on average in two different populations.
Would you expect a difference in sample size needed to
detect the effect in the two populations?

variance of outcomes
§ σ² = variance of the outcome of interest for the study population
§ More underlying variance (heterogeneity)
  § → more difficult to detect difference
  § → need larger sample size
§ Tricky: How do we know about heterogeneity before we decide our sample size and collect our data?
  § Ideal: pre-existing data … but often non-existent
  § Can use pre-existing data from a similar population
    § Example: LSMS, data routinely collected by govt, satellite imagery
  § Common sense
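As a concrete illustration in Stata (all values are assumed, not from the slides): holding the effect fixed and doubling the SD roughly quadruples the required sample.

power twomeans 0 0.5, sd(1) power(0.8)    // less heterogeneous population
power twomeans 0 0.5, sd(2) power(0.8)    // SD doubled: required n roughly quadruples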
clustering (aka "design effect")

n = \left[ \frac{4\sigma^2 \,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]
QUIZ

Which sampling strategy is likely to give you more statistical power?
A. 400 classrooms, 5 students per classroom =
2,000 students
B. 50 classrooms, 40 students per classroom =
2,000 students
C. Both should give you similar statistical power
D. Don’t know
clustering
• Unit for sample size calculation depends on
both:
– Level of intervention AND
– Level of measured impacts

• Example: intervention at village level, interested in impacts at HH level
– Randomly assign villages to treatment / control
– Sample household within villages
clustering
• Level of intervention (“cluster”) most
important for sample size calculation

• If few clusters, precision will be limited, regardless of number of HHs sampled
QUIZ

Which sampling strategy is likely to give you more statistical power?
A. 100 villages, 5 HHs per village = 2,000 HHs
B. 100 villages, 50 HHs per village = 2,000 HHs
C. Both should give you similar statistical power
D. Don’t know
clustering
• Intracluster correlation (ICC): similarity of units within
clusters

• Is the variation in outcome of interest coming mostly from differences within villages (low ICC), or between villages (high ICC)?
– If HHs in village A are similar to each other, but different
from HHs in village B, high ICC
– If HHs in village A are similar to HHs in village B, low ICC

• If ICC = 0, no design effect


[Figures: simulated power with 20 clusters vs. 100 clusters, each shown for high ICC (.50) and low ICC (.05)]
clustering
Takeaway:
High intra-cluster correlation (HHs in same cluster similar)
→ lower marginal value per extra sampled unit in the cluster
→ more clusters needed
Rule of thumb: at least 40 clusters per treatment arm
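The design effect term in the sample size equation, 1 + ρ(m − 1), shows why; with an assumed cluster size of 20 HHs:

display 1 + 0.05*(20 - 1)    // low ICC (.05): design effect of ~1.95
display 1 + 0.50*(20 - 1)    // high ICC (.50): design effect of 10.5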


QUIZ
• You do power calculations and decide you
need a sample of 1,000 HHs for an impact
evaluation. The project starts. 6 months in, a
monitoring survey shows that take-up of the
intervention is 50%.
• What effect will this have on statistical power,
given the sample size of 1,000 HHs?
• What could you do to improve power, if
increasing the number of HHs is not feasible?
take-up
§ Low take-up (rate) for intervention lowers precision
  § Effectively decreases sample size / increases minimum detectable effect
  § Can only detect an effect if it is really large
§ Unfortunately, to account for a take-up rate of 50%, have to increase sample size by a factor of 4
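The factor of 4 follows from the sample size equation: with take-up rate p, the detectable intent-to-treat effect shrinks to pD, and n scales with 1/D², so

\frac{n_{p=0.5}}{n_{p=1}} = \frac{1}{p^2} = \frac{1}{0.5^2} = 4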

[Figure: Take-up vs. sample size. Required sample size (0 to 6,000) rises steeply as the proportion of HHs taking up the voucher falls from 1.0 to 0.3]
Other factors
• Attrition
– Effectively the same problem as take-up
– Especially serious if cluster-level

• Compliance
– If contamination (control also adopts) it will be more
difficult to discern a meaningful difference

• Data quality
– Missing observations: approximately = attrition
– Measurement error → increased variance, less precision
conclusions
The sample size has to be larger:
• the smaller the effects that we want to detect
• the more underlying heterogeneity (variance)
• the higher the level of clustering
• the lower the take-up
• the lower the data quality
Now you know how many people to sample… how do you identify them?

Sampling in practice
• Best case scenario: complete sampling frame already exists
• Most often this is not the case, as impact evaluation samples are quite specific
• First step is usually to conduct a listing, then sample
• However, that may not be entirely straightforward, as the two case studies show
Case Study 1 -
Market Listing & Trader Survey
Context

Rural Feeder Roads: Does improved connectivity change lives?
Market survey - setup
• Understand how market structure and
composition has changed over time through
– a visual listing of all traders present in the market
– conducting a short trader survey for a sub-sample
of traders listed in each market
Market sample

Sampling for trader survey
• Based on power calculations and field practicalities, the research team designed a sampling strategy in which the number of traders to survey per market depends on total market size:

Market size   Sample size
<=30          100%
31-100        50%
101-400       33.3%
401-500       25%
501-600       20%
601-700       16.67%
701-800       14.28%
801-900       12.5%
901-1000      11.11%
...and so on
Sampling for trader survey
• The trader survey has to be conducted on the same day as the listing
  – for quicker completion, as markets do not meet every day
  – to avoid attrition / confusion, as traders are identified by location, clothing, and type of goods, which will change between market days

How can the sample be dynamically selected as soon as the listing is complete?
Selecting the sample – method 1
• Allow the enumerators to select the traders to
interview

• Pros:
– Easiest for the enumerator
• Cons:
– Enumerators are likely to choose the traders they can find easily
– No guarantee of representation of all types of
traders
Selecting the sample – method 2
• Provide a walking skip pattern based on
market size for enumerators to follow
Market size Sample size Skip pattern to follow
<=30 100% every trader will be interviewed
31-100 50% every 2nd trader will be interviewed
101-400 33.3% every 3rd trader will be interviewed
401-500 25% every 4th trader will be interviewed
501-600 20% every 5th trader will be interviewed
601-700 16.67% every 6th trader will be interviewed
701-800 14.28% every 7th trader will be interviewed
801-900 12.5% every 8th trader will be interviewed
901-1000 11.11% every 9th trader will be interviewed
...so on
Selecting the sample – method 2
• Pros
– There is some form of randomness
– All trader types are likely to be represented
• Cons:
– Enumerators have to do a lot of mental math!
– Very hard to verify whether sampling pattern was
followed
Selecting the sample – method 3
• Rely on technology - Program the survey form
to dynamically pick the traders to survey

• Pros
– Enumerators just have to locate the stall listed on the
tablet screen
– All trader types are likely to be represented
• Cons:
– Programming of the randomization might take time
– Randomization is not replicable if done on SurveyCTO
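For comparison, a replicable office-side draw could look like the minimal Stata sketch below; the dataset and variable names (market, market_size) are assumptions, and the actual survey used SurveyCTO's in-form randomization instead:

* sketch only; assumes a listing dataset in memory with one row per trader
set seed 20180621                                   // fixed seed makes the draw replicable
gen double samprate = cond(market_size <= 30,  1, ///
                      cond(market_size <= 100, 0.5, ///
                      cond(market_size <= 400, 1/3, 0.25)))   // remaining size brackets omitted
gen double u = runiform()
bysort market (u): gen byte insample = (_n <= ceil(samprate * _N))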
What would you do?
What was actually done?
• Weighing the pros and cons of each available
method, we left the selection to technology
(method 3)
• Overestimated the required sample in each
market to ensure required number of trader
surveys were reached
Case Study 2 -
Irrigation Scheme Farmer Survey
Irrigation impact evaluation
sampling
• Spatial regression discontinuity design
– Compare plots just below irrigation canal to those
just above
• Therefore need to sample plots close to irrigation canal

• How to do that?
– Listing HHs in the neighboring villages?
• Plots aren’t necessarily close to villages
• People won’t accurately be able to say whether the plot
is within 50m of the canal
What did we do?
• Dropped uniform grid of points across full site
at 2m resolution
• Randomly sampled points, excluding any point
within 10m of a point selected
• Enumerators equipped with GPS units visited
each sampled point to identify whether the
point is agricultural land, and if so find out
who cultivates it
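A minimal Stata sketch of the point draw is below; the file name grid.dta and its layout are assumptions, and the 10m minimum-distance exclusion used in the field is omitted for brevity:

use grid.dta, clear        // assumed: one row per 2m grid point with id, x, y
set seed 20180621          // fixed seed so the sample of points is reproducible
sample 3000, count         // randomly keep 3,000 candidate points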
Draw 50m buffer
Drop 3000 points
Assign 24 enumerators to points
outcome
• From 3,000 points, 2,932 successfully visited and description recorded
• 1,058 distinct village name + cultivator combinations
• Contact village leaders to verify names and remove duplicate households (e.g. husband and wife) → 810 households
• 670 households successfully interviewed (more duplicates discovered during interviews, some names not recognized)
• Once plots were mapped, 76% have sample point within boundaries
APPENDIX

Programming Power Calculations - Summary of Stata Options
data you need to have on hand
• Mean and variance for outcome variable for population of
interest
– Assume mean and SD same for tmt and control if randomized

• Sample size (assuming you are calculating MDES (δ))


– If individual randomization, number of people/units (n)
– If clustered, number of clusters (k), number of units per cluster
(m), intracluster correlation (ICC, ρ) and ideally, variation in
cluster size (e.g. min-max cluster size)

• The following standard conventions
  – Significance level (α) = 0.05
  – Power = 0.80 (i.e. probability of type II error (β) = 0.20)
ideally, you also know
• Baseline correlation of outcome with covariates
– Covariates (individual and/or cluster level) reduce residual variance of
the outcome variable, reducing required sample size
• Reducing individual level residual variance is akin to increasing # obs per
cluster (bigger effect if ICC low)
• Reducing cluster level residual variance is akin to increasing # of clusters
(bigger effect if ICC and m high)
– If you have baseline data, this is easy to obtain
• Including baseline autocorrelation will improve power (keep only time
invariant portion of variance)

• Number of follow-up surveys

• Autocorrelation of outcome between FUP rounds

• Take-up for the intervention
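One common way to fold the baseline correlation into the sample size equation is to multiply the variance term by (1 − r²), where r is the correlation between the covariate and the outcome; this is a standard approximation rather than the syntax of any particular Stata command:

n = \left[ \frac{4\sigma^2 (1-r^2)\,(z_{1-\alpha/2} + z_{1-\beta})^2}{D^2} \right]\left[ 1 + \rho(m-1) \right]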


But… what if I don’t have this data?
• You will basically never have all of this for your exact
population of interest when you first do power calculations

• So, use the best available data to estimate values for each
parameter. Sources to consider
• high-quality nationally representative survey (e.g. LSMS)
• Data from DIME IE in same country (or region, if pressed)
• Review the literature – especially published papers on the sector and
country. Will certainly include effect sizes, possibly also useful
descriptives (e.g. that you could use for assumptions about means)

• If you can't come up with a specific value you feel very confident in, run a few different power calculations with alternate assumptions to generate bounds
Commonly-used Stata options
• power
• sampsi
• sampclus
• clsampsi
• clustersampsi
power calcs in stata: quick reference

Which Stata package should I use?

Package       | Clustering                | Multiple survey rounds | Different size tmt and control groups | Old version of Stata (12 or lower) | Directly calculate MDES
power         | YES (only as of Stata 15) | NO                     | YES                                   | NO                                 | YES
sampsi        | NO                        | YES                    | YES                                   | YES                                | NO
clsampsi      | YES                       | NO                     | YES                                   | YES                                | NO
clustersampsi | YES                       | NO                     | NO                                    | YES                                | YES

power
• Stata's newest update for power calculations
  – Introduced with Stata 13, replaces sampsi
  – As of Stata 15, allows for clustered sample designs

• Pros
– Better output: more info, graph option. Automatically saved.
– More flexible in terms of input/output choices
– Can compute sample size of control group given treatment group size
(or vice versa)
– Directly calculate MDES
– Allows for treatment and control groups of different sizes

• Cons
– No straightforward way to control for repeated measures
power
• Options
– cluster
• Allows for clustered sampling designs
– power onemean
• assume means same in tmt & control (e.g. randomization)
– n sample size
• n1() control group size, n2() treatment group size
– nratio ratio of n1/n2
• default is 1
• not necessary to specify if you list n1 and n2
– power, table outputs results in table format
– power, saving(filename, [replace]) saves results in .dta
format
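Putting these options together, a minimal example that solves directly for the MDES with unequal group sizes and saves the results (all values are illustrative assumptions):

power twomeans 0, sd(1) power(0.8) n1(400) n2(300) table saving(powercalc, replace)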
sampsi
• No longer an official Stata command (replaced by power), though it continues to work

• Pros
– Works with Stata13 or less
– Allows repeated measures (multiple follow-ups)

• Cons
– Does not allow clustering
– Have to impute MDES
– Defaults to 90% power (not really a con, but be aware)
sampsi options
• onesample: use if randomized (assume means the same
between treatment and control)
• Sample size
– n1(#) size of treatment group
– n2(#) size of control group
– ratio() n1/n2, default is 1
• Repeated measures
– pre number of baseline measurements
– post number of follow-up measurements
– r0(#) correlation between baseline measures (default r0 = r1)
– r1(#) correlation between follow-up measures
– r01(#) correlation between baseline and follow-up
• method(post change anova or all), default is all
sampsi
• Default is to compute sample size
• To compute power: specify n1 or n2
• To compare means (not proportions), must
specify sd1(#) or sd2(#)
• For repeated measures, sd1(#) or sd2(#) must
be specified
sampsi example syntax
• Simple case: one-sample comparison of mean to
hypothesized value.
– Take sample size as given to compute power:
• sampsi # (baseline mean) # (hypothesized mean), sd (postulated sd) n
(sample size) onesample
• sampsi 0 2.5, sd(4) n(25) onesample

• More complex: repeated measures


– Need to know sample sizes, mean, sd, expected correlation
– sampsi # (BL mean) # (hypothesized mean), n1 (control sample
size) n2 (tmt sample size) sd1 (control sd) sd2 (tmt sd)
method(change) pre (number of BL measures) post (number of
FUP measures) r1 (correlation btw FUP measures)
– sampsi 485 500, n1(15) n2(15) sd1(20.2) sd2(19.5)
method(change) pre(1) post(3) r1(.7)
clsampsi
• Pros
– Allows for clustering

• Cons
– Have to impute MDES
– Does not allow for repeated measures
– Does not allow for baseline correlation
clsampsi options
• m(#) cluster size in treatment and control assuming equal
cluster size in tmt & control
– alternative m1(#) and m2(#)
• k(#) number of clusters in tmt and control assuming equal
number in tmt & control
– Alternative k1(#) and k2(#)
• sd(#) standard deviation assuming same sd in tmt & control
– Alternative sd1(#) and sd2(#)
• rho(#) ICC assuming same in tmt & control
– Alternatively rho1 and rho2
• clsampsi determines the power of a means (or proportions) comparison using the standard sampsi command
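A minimal clsampsi call using only the equal-arm options above; the means, SD, and cluster numbers are illustrative assumptions:

clsampsi 0 0.2, sd(1) k(40) m(10) rho(0.05)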
clsampsi less common options
• varm(#) cluster size variation assuming same
in tmt & ctl
– only affects power if larger than m(#) and rho(#)>0
– Calculate the effect of cluster-size variation
(varm1()) on the required sample size
• clsampsi 3 2.3, sd1(2) sd2(1.55) m1(6) m2(8)
varm1(100) rho1(0.2)
clustersampsi
• Pros
– Allows for clustering
– Allows for baseline correlations
– Directly calculates MDES

• Cons
– Doesn’t allow for different sized treatment /
control groups
– Doesn’t allow for repeated measures
clustersampsi options
• detectabledifference calculate MDES
– Alternative options: power, samplesize
– to use detectabledifference must specify m, k, mu1
• rho(#) ICC
• k(#) number of clusters in each arm
• m(#) average cluster size
• size_cv(#) coefficient of variation of cluster sizes
(default is 0). Can be any number greater than 1.
• mu1 mean for tmt (mu2 = mean for control)
• sd1 SD for tmt (sd2 = SD for control)
• base_correl correlation btw baseline measurements (or
other predictive covariates) and outcome
clustersampsi example
• Detectable difference for fixed sample size:
compute the difference detectable with 20
clusters per arm each of size 10 between two
means where the baseline mean is 300 and
ICC is 0.05.
– clustersampsi, detectabledifference mu1(300)
m(10) k(20) rho(0.05)
sampclus
• Add-on to sampsi that allows for clustering
• Must be preceded by sampsi

sampsi 200 185, alpha(.01) power(.8) sd(30)


sampclus, obsclus(10) rho(.2)
sampclus, obsclus(10) rho(.1)

• Corrects sample size and computes number of clusters


from a t-test
• Adjusts this sample size calculation for 10 observations per
cluster and an ICC of 0.2
• Repeats for an intraclass correlation of 0.1
example of reporting power calcs – simple individual-level randomization

Parameter  | Values | Definition | Source of parameter
α          | 0.05   | Significance level | Assumption
1 − β      | 0.8    | Desired power of the test | Assumption
Tail       | 2      | One-tailed or two-tailed test | Detect either an increase or a decrease in yields
µ          | 12761  | Pooled mean of outcome variable (yield in RWF/ha) | Baseline survey in the three ongoing sites
σ          | 24233  | Pooled standard deviation of outcome variable (yield in RWF/ha) | Baseline survey in the three ongoing sites
           | 0.52   | The proportion of the study sample randomly assigned to treatment | Actual treatment/control ratio in two of the ongoing sites; expectation is ~.5
N          | 690    | The size of the study sample | Actual sample size in two of the ongoing sites (expected to double with new sites)
stata code | sampsi 12761 15930, sd1(24233) method(change) n1(359) n2(331) | | stata package: sampsi
D          | 0.13   | Minimum detectable effect (in standard deviations) | Calculation
Example of reporting power calcs – clustered randomization

Parameter  | Test 1: Low ICC | Definition | Source of parameter - comments
α          | 0.05   | Significance level | Assumption
1 − β      | 0.8    | Desired power of the test | Assumption
Tail       | 2      | One-tailed or two-tailed test | Detect either increase or decrease in yields
µ          | 12761  | Pooled mean of outcome variable (yield in RWF/ha) | Baseline survey in the three ongoing sites
σ          | 24233  | Pooled SD of outcome variable (yield in RWF/ha) | Baseline survey in the three ongoing sites
K          | 152    | Number of clusters | Actual sample size in the three ongoing sites (expected to ~double with 3 new sites)
N          | 7      | Number of observations per cluster |
ρ          | 0.75   | Correlation between baseline and follow-up measurements | Assumption
icc        | 0.05   | Intracluster correlation | Based on icc for agricultural yield data for two of the three sites, from a 2013 survey run by the same research team
stata code | clustersampsi, detectabledifference mu1(12761) sd1(24233) m(7) k(152) rho(0.05) base_correl(0.75) | | stata package: clustersampsi. Note that estimates are conservative: clustersampsi doesn't allow for imbalanced treatment/control, so assumes comparing 76 tmt vs. 76 ctl clusters; it also only allows for 1 follow-up (unlike sampsi, which has a post option).
D          | 0.12   | Minimum detectable effect (in standard deviations) | Calculation
