STAT 366 - Sample Survey Theory and Methods II - Lecture 2
• 3.1 Introduction
• 3.5 Examples
3.1 Introduction
• A heterogeneous population of size N is divided into more
homogeneous sub-populations (strata) of sizes $N_1, N_2, \ldots, N_k$.
Characteristics of Stratified Samples
• Population is divided into an exclusive and exhaustive
set of strata, using some external sources.
• Within each stratum a separate random sample is
selected.
• For each stratum, parameters/statistics are computed
and properly weighted to form an overall estimate for
the whole population.
• The statistic may be the mean or the variance.
Reasons for Stratification
• If data of known precision is wanted for certain
subdivisions, it is advisable to treat each subdivision as a
population in its own right.
parts of the population.
• Stratification may produce more precise estimates of
population characteristics.
Notations
Suppose
• a population of N units is divided into k strata.
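For each stratum h we define the following:
• $y_{hi}$ = value obtained for the ith unit in stratum h
• $\bar{Y}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} y_{hi}$ = population mean for stratum h
• $\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi}$ = sample mean for stratum h
• $W_h = \frac{N_h}{N}$ = weight of stratum h
• $f_h = \frac{n_h}{N_h}$ = sampling fraction in stratum h
• $s_h^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}(y_{hi} - \bar{y}_h)^2$ = sample variance for stratum h
• $\bar{Y} = \frac{1}{N}\sum_{h=1}^{k}\sum_{i=1}^{N_h} y_{hi} = \frac{1}{N}\sum_{h=1}^{k} N_h\bar{Y}_h = \sum_{h=1}^{k} W_h\bar{Y}_h$ = overall mean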
3.3 Properties of Estimators
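• The estimator for the population mean in stratified
sampling is $\bar{y}_{st}$, which is defined as:
$\bar{y}_{st} = \frac{1}{N}\sum_{h=1}^{k} N_h\bar{y}_h = \sum_{h=1}^{k} W_h\bar{y}_h$
where $N = N_1 + N_2 + \cdots + N_k$.
Note: $\bar{y}_{st} = \bar{y} = \frac{1}{n}\sum_{h=1}^{k} n_h\bar{y}_h$ if and only if $f_h = f$ in every stratum.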
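The weighted-mean definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `stratified_mean` and the three strata are hypothetical and do not come from the lecture.

```python
def stratified_mean(stratum_sizes, stratum_means):
    """Return y_st = sum_h W_h * ybar_h, with W_h = N_h / N."""
    if len(stratum_sizes) != len(stratum_means):
        raise ValueError("need one sample mean per stratum")
    N = sum(stratum_sizes)  # N = N_1 + ... + N_k
    return sum(N_h / N * ybar_h
               for N_h, ybar_h in zip(stratum_sizes, stratum_means))


# Hypothetical strata (illustrative numbers, not from the lecture).
N_h = [50, 30, 20]
ybar_h = [10.0, 20.0, 30.0]
print(stratified_mean(N_h, ybar_h))  # approximately 17.0
```

Here the weights are 0.5, 0.3 and 0.2, so the estimate is 0.5(10) + 0.3(20) + 0.2(30) = 17.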
• Theorem:
$\mathrm{Var}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{k} N_h^2\,\mathrm{Var}(\bar{y}_h) = \sum_{h=1}^{k} W_h^2\,\mathrm{Var}(\bar{y}_h)$
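• If $\hat{Y} = N\bar{y}_{st}$ is the estimate of the population total Y, then
$\mathrm{Var}(\hat{Y}) = N^2\,\mathrm{Var}(\bar{y}_{st})$.
• The variance of $\bar{y}_{st}$ is estimated from the sample by
$\widehat{\mathrm{Var}}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{k} N_h^2\left(1 - \frac{n_h}{N_h}\right)\frac{s_h^2}{n_h} = \sum_{h=1}^{k} W_h^2\,(1 - f_h)\,\frac{s_h^2}{n_h}$
where $s_h^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}(y_{hi} - \bar{y}_h)^2$ is an unbiased estimator for $S_h^2$.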
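As a rough sketch, the sample-based variance estimate $\sum_h W_h^2 (1 - f_h) s_h^2 / n_h$ can be computed directly from the raw stratum samples. The function name and the data below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import variance


def estimated_var_stratified_mean(stratum_sizes, samples):
    """Estimate Var(y_st) as sum_h W_h^2 * (1 - f_h) * s_h^2 / n_h."""
    N = sum(stratum_sizes)
    total = 0.0
    for N_h, y_h in zip(stratum_sizes, samples):
        n_h = len(y_h)
        W_h = N_h / N         # stratum weight
        f_h = n_h / N_h       # sampling fraction
        s2_h = variance(y_h)  # sample variance, divisor n_h - 1
        total += W_h ** 2 * (1 - f_h) * s2_h / n_h
    return total


# Hypothetical data: two strata of 10 units each, 2 units sampled per stratum.
print(estimated_var_stratified_mean([10, 10], [[1, 3], [2, 6]]))  # approximately 1.0
```

Each stratum needs at least two sampled units, because `statistics.variance` divides by $n_h - 1$.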
3.4 Allocation of Sample Size to Strata
Factors considered in sample size allocation are:
• The total number of units in each stratum.
• The variability of observations within each stratum.
• The cost of obtaining an observation from each
stratum.
• Methods of Allocation:
• Equal Allocation
• Proportional Allocation
• Optimum Allocation
Equal Allocation
• This is the easiest but not the best way of sample
allocation.
• For k strata, if we want a sample of size n, then we select
in each stratum a sample of size $n_h = n/k$, so that the
allocation is equal across strata.
• This is usually done for administrative convenience and if
nothing is known about the stratum variances.
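• With $n_h = n/k$ in every stratum:
$\mathrm{Var}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{k} N_h^2\left(1 - \frac{n_h}{N_h}\right)\frac{S_h^2}{n_h}$
$\mathrm{Var}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{k} N_h^2\left(1 - \frac{n}{kN_h}\right)\frac{kS_h^2}{n}$
$\mathrm{Var}(\bar{y}_{st}) = \frac{k}{nN^2}\sum_{h=1}^{k} N_h^2 S_h^2 - \frac{1}{N^2}\sum_{h=1}^{k} N_h S_h^2$
$\mathrm{Var}(\bar{y}_{st}) = \frac{k}{n}\sum_{h=1}^{k} W_h^2 S_h^2 - \frac{1}{N}\sum_{h=1}^{k} W_h S_h^2$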
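A quick numerical check can confirm that the closed form for equal allocation, $\frac{k}{n}\sum W_h^2 S_h^2 - \frac{1}{N}\sum W_h S_h^2$, agrees with the general variance formula when $n_h = n/k$. The sketch below uses hypothetical stratum sizes and variances; the function names are illustrative only.

```python
def var_general(stratum_sizes, stratum_vars, sample_sizes):
    """(1/N^2) * sum_h N_h^2 * (1 - n_h/N_h) * S_h^2 / n_h."""
    N = sum(stratum_sizes)
    return sum(N_h ** 2 * (1 - n_h / N_h) * S2_h / n_h
               for N_h, S2_h, n_h in zip(stratum_sizes, stratum_vars, sample_sizes)) / N ** 2


def var_equal_allocation(stratum_sizes, stratum_vars, n):
    """(k/n) * sum_h W_h^2 S_h^2 - (1/N) * sum_h W_h S_h^2, assuming n_h = n/k."""
    k = len(stratum_sizes)
    N = sum(stratum_sizes)
    weights = [N_h / N for N_h in stratum_sizes]
    first = (k / n) * sum(W ** 2 * S2 for W, S2 in zip(weights, stratum_vars))
    second = sum(W * S2 for W, S2 in zip(weights, stratum_vars)) / N
    return first - second


# Hypothetical population: three strata, total sample n = 60, so n_h = 20 each.
N_h = [400, 300, 300]
S2_h = [25.0, 16.0, 9.0]
n = 60
print(var_general(N_h, S2_h, [n / 3] * 3))
print(var_equal_allocation(N_h, S2_h, n))  # both approximately 0.295
```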
Proportional Allocation
• For this allocation, the sample size from a stratum is
proportional to the size of the stratum.
Hence $n_h = \frac{n}{N}\,N_h = \frac{N_h}{N}\,n = W_h\,n$,
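The rule $n_h = W_h n$ is simple to compute. The sketch below returns unrounded sizes; in practice they would be rounded to integers, and how to round is a design choice the slides do not address. The function name and figures are hypothetical.

```python
def proportional_allocation(stratum_sizes, n):
    """Return n_h = n * N_h / N for each stratum (unrounded)."""
    N = sum(stratum_sizes)
    return [n * N_h / N for N_h in stratum_sizes]


# Hypothetical strata: N = 1000 units, overall sample of n = 100.
print(proportional_allocation([500, 300, 200], 100))  # [50.0, 30.0, 20.0]
```

Every stratum then has the same sampling fraction $f_h = n/N$, which is the condition under which $\bar{y}_{st}$ coincides with the ordinary sample mean.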
• If within any stratum the cost is proportional to the
size of sample but the cost of taking measurement
on each unit varies from stratum to stratum, then
we take .
• the stratum is larger.
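• Suppose we select the $n_h$ to minimize $\mathrm{Var}(\bar{y}_{st})$ for a specified cost, or to
minimize the total cost for a specified $\mathrm{Var}(\bar{y}_{st})$. The simplest cost
function, which is linear, is of the form:
$C = c_0 + \sum_{h=1}^{k} c_h n_h$
• $C$ = total cost,
• $c_0$ = overhead cost,
• $c_h$ = cost per unit in stratum h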
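The linear cost function can be evaluated directly. The sketch below also includes the standard textbook optimum allocation for a fixed budget, $n_h = (C - c_0)\,\frac{N_h S_h/\sqrt{c_h}}{\sum_h N_h S_h \sqrt{c_h}}$; that formula does not survive in the slide text, so treat it as an assumption taken from the usual theory rather than from this lecture. All names and numbers are hypothetical.

```python
from math import sqrt


def total_cost(c0, unit_costs, sample_sizes):
    """Linear cost function C = c0 + sum_h c_h * n_h."""
    return c0 + sum(c_h * n_h for c_h, n_h in zip(unit_costs, sample_sizes))


def optimum_allocation_fixed_cost(C, c0, stratum_sizes, stratum_sds, unit_costs):
    """Textbook allocation n_h proportional to N_h * S_h / sqrt(c_h), scaled to spend C."""
    shares = [N_h * S_h / sqrt(c_h)
              for N_h, S_h, c_h in zip(stratum_sizes, stratum_sds, unit_costs)]
    denom = sum(N_h * S_h * sqrt(c_h)
                for N_h, S_h, c_h in zip(stratum_sizes, stratum_sds, unit_costs))
    return [(C - c0) * share / denom for share in shares]


# Hypothetical budget and strata.
n_h = optimum_allocation_fixed_cost(C=1000, c0=100,
                                    stratum_sizes=[400, 300, 300],
                                    stratum_sds=[5, 4, 3],
                                    unit_costs=[4, 1, 9])
print([round(x, 2) for x in n_h])
print(total_cost(100, [4, 1, 9], n_h))  # approximately 1000
```

The allocation gives more sample to strata that are larger, more variable, or cheaper to survey, and it spends exactly the available budget before rounding.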
• If travel costs between units are substantial,
• Neyman Allocation
• An important special case arises if the cost per unit is
the same in all strata ($c_h = c$). The total cost then
becomes $C = c_0 + cn$, and optimum allocation for fixed
cost reduces to optimum allocation for fixed
sample size. The result in this special case is as follows:
Theorem:
• This is called Neyman allocation. The minimum variance
with fixed n is
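For reference, the usual textbook statements are $n_h = n\,\frac{N_h S_h}{\sum_h N_h S_h}$ for the Neyman allocation and $V_{min}(\bar{y}_{st}) = \frac{(\sum_h W_h S_h)^2}{n} - \frac{\sum_h W_h S_h^2}{N}$ for the resulting variance. The formulas themselves are missing from the slide text, so the sketch below assumes these standard forms; function names and data are hypothetical.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Return n_h = n * N_h * S_h / sum_h(N_h * S_h), unrounded."""
    products = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n * p / total for p in products]


def neyman_min_variance(stratum_sizes, stratum_sds, n):
    """Return (sum_h W_h S_h)^2 / n - (sum_h W_h S_h^2) / N."""
    N = sum(stratum_sizes)
    sum_WS = sum(N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)) / N
    sum_WS2 = sum(N_h * S_h ** 2 for N_h, S_h in zip(stratum_sizes, stratum_sds)) / N
    return sum_WS ** 2 / n - sum_WS2 / N


# Hypothetical strata with standard deviations 5, 4 and 3; overall n = 60.
N_h = [400, 300, 300]
S_h = [5, 4, 3]
print([round(x, 2) for x in neyman_allocation(N_h, S_h, 60)])
print(neyman_min_variance(N_h, S_h, 60))
```

The unrounded allocation always sums to n; rounding to integers may require a small manual adjustment.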
3.5 Relative Precision of Stratified Random and
Simple Random Sampling
respectively.
Theorem:
• ,
• where the optimum allocation is for fixed
Example 2.3
Stratum h   $N_h$   $n_h$   $y_{hi}$ (yield in bags)
1           26      3       92, 105, 82
2           35      4       38, 47, 52, 59
3           53      5       27, 20, 21, 22, 30
i. Estimate the total annual yield and its standard
error.
Solution:
Stratum h   $N_h$   $n_h$   $\bar{y}_h$   $s_h^2$   $N_h\bar{y}_h$
1           26      3       93            133.0     2418
2           35      4       49            78.0      1715
3           53      5       24            18.5      1272
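The stratum summaries in this example can be reproduced, and the total and its standard error computed, with a short script. It assumes the first two numeric columns of the data table are $N_h$ and $n_h$ (the column headers are lost in the slide text, but the reading matches the tabulated means, variances and products), and it uses $\hat{Y} = \sum_h N_h\bar{y}_h$ with $\widehat{\mathrm{Var}}(\hat{Y}) = \sum_h N_h^2 (1 - f_h) s_h^2 / n_h$. Function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Data as read from the example table (column meanings are inferred).
N_h = [26, 35, 53]
samples = [[92, 105, 82], [38, 47, 52, 59], [27, 20, 21, 22, 30]]


def estimate_total(stratum_sizes, samples):
    """Y_hat = sum_h N_h * ybar_h."""
    return sum(N * mean(y) for N, y in zip(stratum_sizes, samples))


def estimated_variance_of_total(stratum_sizes, samples):
    """sum_h N_h^2 * (1 - n_h/N_h) * s_h^2 / n_h."""
    total = 0.0
    for N, y in zip(stratum_sizes, samples):
        n = len(y)
        total += N ** 2 * (1 - n / N) * variance(y) / n
    return total


Y_hat = estimate_total(N_h, samples)
se = sqrt(estimated_variance_of_total(N_h, samples))
print(Y_hat, round(se, 2))  # about 5405 bags, standard error about 238.92
```

The estimated total, 2418 + 1715 + 1272 = 5405, is the sum of the last column of the solution table.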
The estimate of the variance,
• Hence the Variance of ,
• and
(ii) The 95% confidence interval for ,
• Thus
Example 2
Stratum, h
1 1800 10 15 2.0
2 1200 24 29 1.0
3 1000 15 30 40
i. If a random sample of size 100 is selected, estimate the
population total, Y.
(ii) Allocating proportionally we have
The required variance,
(iii) For the Neyman allocation:
The required Variance
Hence
iv. Given that
Minimizing G, we have
• Hence the required total cost;
3.6 Estimation of Population Proportion
• Let $A_h$ denote the total number of units in C in stratum h.
Then the proportion of units in C in this stratum is $P_h = A_h / N_h$,
• which is estimated by
• Theorem:
• The variance of for stratified random sampling is
• If proportional allocation is carried out we have,
substituting for
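The estimation of a proportion under stratified sampling can be sketched as follows. The formulas on these slides are lost, so the code assumes the standard textbook forms: $p_h = a_h / n_h$, $p_{st} = \sum_h W_h p_h$, and the sample estimate $\widehat{\mathrm{Var}}(p_{st}) = \sum_h W_h^2 (1 - f_h)\,\frac{p_h q_h}{n_h - 1}$ with $q_h = 1 - p_h$. The symbol $a_h$ (number of sampled units in class C in stratum h), the function names and the data are all hypothetical.

```python
def stratified_proportion(stratum_sizes, sample_sizes, counts_in_class):
    """p_st = sum_h W_h * p_h with p_h = a_h / n_h."""
    N = sum(stratum_sizes)
    return sum(N_h / N * a_h / n_h
               for N_h, n_h, a_h in zip(stratum_sizes, sample_sizes, counts_in_class))


def estimated_var_proportion(stratum_sizes, sample_sizes, counts_in_class):
    """sum_h W_h^2 * (1 - f_h) * p_h * q_h / (n_h - 1)."""
    N = sum(stratum_sizes)
    total = 0.0
    for N_h, n_h, a_h in zip(stratum_sizes, sample_sizes, counts_in_class):
        p_h = a_h / n_h
        W_h = N_h / N
        f_h = n_h / N_h
        total += W_h ** 2 * (1 - f_h) * p_h * (1 - p_h) / (n_h - 1)
    return total


# Hypothetical survey: two strata, 10 and 30 units sampled, 5 and 6 of them in class C.
print(stratified_proportion([100, 300], [10, 30], [5, 6]))     # approximately 0.275
print(estimated_var_proportion([100, 300], [10, 30], [5, 6]))  # approximately 0.004356
```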
3.7 Allocation of Sample for Estimating P
The best choice of the stratum sample sizes $n_h$ to minimize
$\mathrm{Var}(p_{st})$ follows from the general theorem on optimum
allocation discussed earlier.
• Minimum variance for fixed cost, where cost
•n
3.8 Estimation of Sample Size
• If fpc is ignored, we have, as first approximation,
Optimum allocation for fixed
Proportional allocation.
Estimation of Population Total, Y:
• for
o General :
o Proportional Allocation:
o Optimum Allocation :
• Estimation of Population Proportion P
o For optimum allocation:
Example 2.3
Stratum h   1    2    3    4    5    6
$N_h$       20   21   25   18   32   9
i. Compute the variance of the mean enrolment
under the optimum Neyman allocation. Assume
a sample size of 100.
Stratum h   $N_h$   $S_h$   $N_h S_h$   $N_h S_h^2$   $W_h S_h$   $W_h S_h^2$   $n_h$
1           20      190     3800        722000        30.4        5776          8
2           21      200     4200        840000        33.6        6720          9
3           25      120     3000        360000        24.0        2880          6
4           18      300     5400        1620000       43.2        12960         12
5           32      110     3520        387200        28.16       3097.6        8
6           9       300     2700        810000        21.6        6480          6
Total       125     -       22620       4739200       180.96      37913.6       49
i. The variance of $\bar{y}_{st}$ under Neyman optimum
allocation,
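The table totals can be plugged into the standard Neyman minimum-variance formula, $\frac{(\sum_h W_h S_h)^2}{n} - \frac{\sum_h W_h S_h^2}{N}$. The sketch below assumes the second and third table columns are $N_h$ and $S_h$ (an inference from the products in the later columns, since the headers are lost) and uses n = 100 as stated in the question. Function names are illustrative.

```python
# Stratum sizes and standard deviations as read from the worked table
# (column meanings are inferred from the arithmetic).
N_h = [20, 21, 25, 18, 32, 9]
S_h = [190, 200, 120, 300, 110, 300]


def neyman_allocation(stratum_sizes, stratum_sds, n):
    """n_h = n * N_h * S_h / sum_h(N_h * S_h), unrounded."""
    products = [N * S for N, S in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n * p / total for p in products]


def neyman_variance(stratum_sizes, stratum_sds, n):
    """(sum_h W_h S_h)^2 / n - (sum_h W_h S_h^2) / N."""
    N = sum(stratum_sizes)
    sum_WS = sum(Nh * Sh for Nh, Sh in zip(stratum_sizes, stratum_sds)) / N
    sum_WS2 = sum(Nh * Sh ** 2 for Nh, Sh in zip(stratum_sizes, stratum_sds)) / N
    return sum_WS ** 2 / n - sum_WS2 / N


print(round(neyman_variance(N_h, S_h, 100), 4))             # about 24.1564
print([round(x) for x in neyman_allocation(N_h, S_h, 49)])  # [8, 9, 6, 12, 8, 6]
```

With n = 49 the rounded allocation reproduces the last column of the table. With n = 100 the unrounded Neyman shares for strata 4 and 6 exceed the stratum sizes (about 23.9 against 18, and 11.9 against 9), so the formula value is best read as a textbook exercise rather than an attainable design.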
• (ii) Given the coefficient of variation,
• For the required sample size n,
The allocation of sample sizes
Example 2
Stratum
1 0.6 0.2 90
(a) Compute the smallest sample sizes, if
Hence from ;
• which gives the following allocations:
if fpc is ignored.
Thank You