Stratified Sampling
STRATIFIED SAMPLING
1. Stratification: The elements in the population are divided into layers/groups/strata based on their values on one or several auxiliary variables. The strata must be non-overlapping and together constitute the whole population.
2. Sampling within strata: Samples are selected independently from each stratum. Different selection methods can be used in different strata.
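The two steps can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the population, the two-group age split, the sample sizes and the helper names `stratified_sample` and `age_stratum` are all hypothetical, and simple random sampling without replacement is assumed in every stratum.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_sizes, seed=0):
    """Draw an independent SRS without replacement from each stratum.

    units        : list of population elements
    stratum_of   : function mapping an element to its stratum label
    sample_sizes : dict {stratum label: n_h}
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        # Step 1: every element is placed in exactly one stratum
        strata[stratum_of(u)].append(u)
    # Step 2: sample independently within each stratum
    return {h: rng.sample(members, sample_sizes[h]) for h, members in strata.items()}

# Hypothetical population of (id, age) pairs
population = [(i, 10 + (i * 7) % 70) for i in range(1000)]

def age_stratum(unit):
    # Hypothetical auxiliary variable: a two-group age split
    return "under 35" if unit[1] < 35 else "35 or older"

sample = stratified_sample(population, age_stratum, {"under 35": 20, "35 or older": 30})
print({h: len(s) for h, s in sample.items()})
```

Because the strata are built from a function of each element, they cannot overlap and together they cover the whole population.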
Ex. Stratification of individuals by age group

Stratum   Age group
1         17 or younger
2         18-24
3         25-34
4         35-44
5         45-54
6         55-64
7         65 or older
Ex. Regional stratification
Stratum 1: Northern Sweden
Stratum 2: Mid-Sweden
Stratum 3: Southern Sweden
Ex. Stratification of individuals by age group and region

Stratum   Age group       Region
1         17 or younger   Northern
2         17 or younger   Mid
3         17 or younger   Southern
4         18-24           Northern
5         18-24           Mid
6         18-24           Southern
etc.      etc.            etc.
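Crossing two stratification variables can be sketched as a Cartesian product. The snippet assumes that the numbering continues in the pattern of the first six rows (region varying fastest) and that all seven age groups from the earlier example are crossed with all three regions; the name `stratum_number` is hypothetical.

```python
from itertools import product

age_groups = ["17 or younger", "18-24", "25-34", "35-44", "45-54", "55-64", "65 or older"]
regions = ["Northern", "Mid", "Southern"]

# Number the crossed strata with region varying fastest (assumed pattern)
stratum_number = {
    (age, region): k
    for k, (age, region) in enumerate(product(age_groups, regions), start=1)
}

print(len(stratum_number))               # number of crossed strata under this assumption
print(stratum_number[("18-24", "Mid")])  # stratum number of one age/region cell
```

The number of strata grows multiplicatively with each added variable, which is one reason the number of strata is itself a design choice.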
WHY STRATIFY?
Gain in precision. If the strata are more homogeneous with respect to the study variable(s) than the population as a whole, the precision of the estimates will improve.
Strata = domains of study. Precision requirements of estimates for certain subpopulations/domains can be assured by using domains as strata.
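The gain in precision can be illustrated with a small simulation. This is only a sketch under strong assumptions: the population is invented, with two equally sized strata that are internally homogeneous but lie at very different levels, and both designs use a total sample of 10. The stratified estimate weights each stratum mean by its population share (here 1/2 each).

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: two homogeneous strata at very different levels
stratum_a = [10 + rng.uniform(-2, 2) for _ in range(50)]
stratum_b = [100 + rng.uniform(-2, 2) for _ in range(50)]
population = stratum_a + stratum_b

def srs_mean():
    # Simple random sample of 10 from the whole population
    return statistics.mean(rng.sample(population, 10))

def stratified_mean():
    # 5 elements from each stratum; equal stratum sizes give weights 1/2
    return (0.5 * statistics.mean(rng.sample(stratum_a, 5))
            + 0.5 * statistics.mean(rng.sample(stratum_b, 5)))

# Empirical variance of each estimator over repeated sampling
var_srs = statistics.variance([srs_mean() for _ in range(2000)])
var_str = statistics.variance([stratified_mean() for _ in range(2000)])
print(var_srs, var_str)
```

In this constructed case almost all variation lies between the strata, so the stratified estimator varies far less than the unstratified one. With strata that are no more homogeneous than the population, the gain would be small.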
WHY STRATIFY?, contd
Practical reasons. For instance, nonresponse rates, method of measurement and the quality of auxiliary information may differ between subpopulations, and these differences can be efficiently handled by stratification.
Administrative reasons. The survey organization may be divided into geographical districts, which makes it natural to let each district be a stratum.
ESTIMATION
Assume a population divided into H strata of sizes $N_1, \ldots, N_h, \ldots, N_H$. Independently, a sample of size $n_h$ is selected from each stratum.
$y_{hj}$ = y-value for element j in stratum h
$t_{yh} = \sum_{j=1}^{N_h} y_{hj}$ = population total for stratum h
$\bar{y}_h = \frac{1}{n_h} \sum_{j \in S_h} y_{hj}$ = sample mean for stratum h
ESTIMATION OF A TOTAL
Assume: SRS within all strata.
$\hat{t}_{y1} = N_1 \bar{y}_1$
$\hat{t}_{y2} = N_2 \bar{y}_2$
$\hat{t}_{y3} = N_3 \bar{y}_3$
$\hat{t}_{y4} = N_4 \bar{y}_4$
$\hat{t}_{y5} = N_5 \bar{y}_5$
ESTIMATION OF A TOTAL
Assume: SRS within all strata.
$$\hat{t}_{\mathrm{str}} = N_1 \bar{y}_1 + N_2 \bar{y}_2 + N_3 \bar{y}_3 + N_4 \bar{y}_4 + N_5 \bar{y}_5$$
In general:
$$\hat{t}_{\mathrm{str}} = \sum_{h=1}^{H} N_h \bar{y}_h$$
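As a small numerical sketch of the general formula, the function below sums the stratum size times the stratum sample mean over all strata. The stratum sizes, the sampled values and the name `estimate_total` are hypothetical.

```python
def estimate_total(N, samples):
    """Stratified estimate of a total: sum over strata of N_h * ybar_h.

    N       : dict {stratum: N_h}, the stratum population sizes
    samples : dict {stratum: list of sampled y-values}
    """
    return sum(N[h] * sum(y) / len(y) for h, y in samples.items())

# Hypothetical stratum sizes and sampled y-values
N = {1: 100, 2: 200, 3: 50}
samples = {1: [2.0, 4.0, 6.0], 2: [10.0, 12.0], 3: [1.0, 1.0, 4.0, 2.0]}

t_str = estimate_total(N, samples)
print(t_str)  # 100*4 + 200*11 + 50*2 = 2700.0
```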
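$$V(\hat{t}_{\mathrm{str}}) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{S_h^2}{n_h}$$
One term per stratum
Finite population correction $\left(1 - n_h/N_h\right)$ (one per stratum!)
where
$$S_h^2 = \frac{1}{N_h - 1} \sum_{j=1}^{N_h} \left(y_{hj} - \bar{y}_{hU}\right)^2$$
and $\bar{y}_{hU}$ is the population mean of stratum h.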
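ESTIMATION OF THE VARIANCE
OF THE ESTIMATOR OF A TOTAL
Principle: Estimate what is unknown in the variance formula.
$$V(\hat{t}_{\mathrm{str}}) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{S_h^2}{n_h}$$
Here the only unknown is $S_h^2$, which is replaced by the sample variance $s_h^2$:
$$\hat{V}(\hat{t}_{\mathrm{str}}) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^2}{n_h}$$
where
$$s_h^2 = \frac{1}{n_h - 1} \sum_{j=1}^{n_h} \left(y_{hj} - \bar{y}_h\right)^2$$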
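The variance estimator can be sketched in the same way, with one term per stratum, each carrying its own finite population correction. The data and the name `estimate_total_variance` are hypothetical; `statistics.variance` is used because it divides by n - 1, matching the sample variance within each stratum.

```python
from statistics import variance

def estimate_total_variance(N, samples):
    """Estimated variance of the stratified total under SRS within strata."""
    v = 0.0
    for h, y in samples.items():
        n_h = len(y)
        fpc = 1 - n_h / N[h]   # finite population correction for stratum h
        s2_h = variance(y)     # sample variance with divisor n_h - 1
        v += N[h] ** 2 * fpc * s2_h / n_h
    return v

# Hypothetical stratum sizes and sampled y-values
N = {1: 100, 2: 200, 3: 50}
samples = {1: [2.0, 4.0, 6.0], 2: [10.0, 12.0], 3: [1.0, 1.0, 4.0, 2.0]}

v_hat = estimate_total_variance(N, samples)
print(v_hat, v_hat ** 0.5)  # estimated variance and standard error
```

Every stratum needs at least two sampled elements, otherwise the within-stratum sample variance is undefined.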
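ESTIMATORS FOR A MEAN
Note: Start from the estimators for a total!
$$\hat{t}_{\mathrm{str}} = \sum_{h=1}^{H} N_h \bar{y}_h$$
$$\bar{y}_{\mathrm{str}} = \frac{1}{N} \sum_{h=1}^{H} N_h \bar{y}_h = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right) \bar{y}_h$$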
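ESTIMATORS FOR A MEAN, contd
Note: Start from the estimators for a total!
$$\hat{V}(\hat{t}_{\mathrm{str}}) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^2}{n_h}$$
$$\hat{V}(\bar{y}_{\mathrm{str}}) = \frac{1}{N^2} \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^2}{n_h} = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right)^2 \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^2}{n_h}$$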
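A short sketch of the mean estimators, using hypothetical data and function names. It assumes that N is the sum of the stratum sizes, so the estimated mean is the estimated total divided by N and its variance estimate is the total's variance estimate divided by N squared.

```python
from statistics import mean, variance

def estimate_mean(N, samples):
    """Stratified mean: stratum sample means weighted by N_h / N."""
    N_total = sum(N.values())
    return sum(N[h] / N_total * mean(y) for h, y in samples.items())

def estimate_mean_variance(N, samples):
    """Estimated variance of the stratified mean under SRS within strata."""
    N_total = sum(N.values())
    return sum(
        (N[h] / N_total) ** 2 * (1 - len(y) / N[h]) * variance(y) / len(y)
        for h, y in samples.items()
    )

# Hypothetical stratum sizes and sampled y-values
N = {1: 100, 2: 200, 3: 50}
samples = {1: [2.0, 4.0, 6.0], 2: [10.0, 12.0], 3: [1.0, 1.0, 4.0, 2.0]}

ybar_str = estimate_mean(N, samples)
v_ybar = estimate_mean_variance(N, samples)
print(ybar_str, v_ybar)
```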
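ESTIMATORS FOR A PROPORTION
Note: Like the estimators for a mean, only with y a 0/1-variable!
Mean:
$$\bar{y}_{\mathrm{str}} = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right) \bar{y}_h$$
$$\hat{V}(\bar{y}_{\mathrm{str}}) = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right)^2 \left(1 - \frac{n_h}{N_h}\right) \frac{s_h^2}{n_h}$$
Proportion:
$$\hat{p}_{\mathrm{str}} = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right) \hat{p}_h$$
$$\hat{V}(\hat{p}_{\mathrm{str}}) = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right)^2 \left(1 - \frac{n_h}{N_h}\right) \frac{\hat{p}_h \left(1 - \hat{p}_h\right)}{n_h - 1}$$
where $\hat{p}_h$ is the sample proportion in stratum h.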
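The proportion estimators can be sketched on 0/1 data as follows. The data and function names are hypothetical. For a 0/1 variable the sample variance equals n_h/(n_h - 1) times p_h(1 - p_h), which is why the proportion formula has n_h - 1 in the denominator.

```python
def estimate_proportion(N, samples):
    """Stratified proportion: stratum sample proportions weighted by N_h / N."""
    N_total = sum(N.values())
    return sum(N[h] / N_total * sum(y) / len(y) for h, y in samples.items())

def estimate_proportion_variance(N, samples):
    """Estimated variance of the stratified proportion under SRS within strata."""
    N_total = sum(N.values())
    v = 0.0
    for h, y in samples.items():
        n_h = len(y)
        p_h = sum(y) / n_h  # sample proportion in stratum h
        v += (N[h] / N_total) ** 2 * (1 - n_h / N[h]) * p_h * (1 - p_h) / (n_h - 1)
    return v

# Hypothetical 0/1 observations (1 = has the attribute)
N = {1: 100, 2: 300}
samples = {1: [1, 0, 1, 1, 0], 2: [0, 0, 1, 0]}

p_str = estimate_proportion(N, samples)
v_p = estimate_proportion_variance(N, samples)
print(p_str, v_p)
```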
IMPORTANT DESIGN CHOICES IN
STRATIFIED SAMPLING
Stratification variable(s)
Number of strata
Sample size in each stratum (allocation)
Sampling design in each stratum
Estimator for each stratum