3. Stratified Random Sampling 07.06.231
There may often be factors which divide the population into several sub-populations or groups and
we may expect the measurements of interest to vary among the different sub-populations. This has
to be accounted for when we select a sample from the population in order that we obtain a sample
that is representative of the population. This is achieved by stratified sampling. The sub-
populations making up the whole population are called strata. The strata are non-overlapping and
homogeneous within themselves. Sampling frames are constructed for each of these strata and the
sampling is performed independently within each stratum. If a random sampling strategy is
followed to select the sample within each stratum, the whole procedure is called stratified random
sampling.
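The selection procedure described above (independent simple random sampling without replacement within each stratum) can be sketched in Python as follows. This is only a minimal illustration: the function name `stratified_sample`, the area-type labels and the household ID ranges are assumptions made for the example and do not come from these notes.

```python
import random

def stratified_sample(frames, sizes, seed=None):
    """Draw an SRS without replacement independently within each stratum.

    frames: dict mapping stratum label -> list of units (that stratum's sampling frame)
    sizes:  dict mapping stratum label -> sample size n_h for that stratum
    """
    rng = random.Random(seed)
    # rng.sample draws without replacement, so no unit is selected twice.
    return {h: rng.sample(units, sizes[h]) for h, units in frames.items()}

# Hypothetical sampling frames: household IDs grouped by area type.
frames = {
    "urban": list(range(0, 60)),
    "semi-urban": list(range(60, 90)),
    "rural": list(range(90, 200)),
}
sizes = {"urban": 6, "semi-urban": 3, "rural": 11}
sample = stratified_sample(frames, sizes, seed=1)
```

Because each stratum has its own frame, a unit can only ever be drawn from the stratum it belongs to, which is what makes the strata non-overlapping in the sample as well.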
DEFINITION: Stratified random sampling is a sampling plan in which we divide the population
into several non-overlapping strata and select a random sample from each stratum, the strata being
formed in such a way that units within a stratum are homogeneous while the strata themselves are
heterogeneous.
Strata are generally formed on the basis of some known characteristic of the population that is
believed to be related to the variable of interest. Such stratification is usually done in such a way as
to reduce the variability of the stratum estimates. This is generally achieved by forming the strata
so that they are homogeneous within themselves with respect to a suitably chosen auxiliary
variable, called the stratification variable or stratification factor.
The justification for adopting stratified sampling is that if we know nothing about the structure of
the population apart from its size, we cannot do better than take a simple random sample. However,
it is an extreme situation that we know nothing about the population other than its size. Most often,
we know that a population consists of individuals who can be classified by characteristics such
as religion, socio-economic status, income, expenditure, occupation, race and level of education. If
a researcher studying, for instance, the living and working conditions of the people wishes to
ensure that different types of area (e.g. city corporation, municipal area, urban, semi-urban, rural)
are adequately represented in the sample, the population can be stratified by area
type. Here area type is the stratification variable, which is clearly related to the study variables. In
studying the television viewing habits of university students, academic performance score
(high, medium, low) or place of residence (urban, rural) may serve as stratification variables,
since each of these variables is believed to be related to the television viewing habits of the students.
Principles of stratification
The process of stratification involves dividing the population into several sub-populations, which
we call strata. In forming such strata, a few principles should be followed to take full advantage of
stratified sampling.
These are:
▪ The strata should be non-overlapping and exhaustive so that they together comprise the
whole population. Thus the administrative divisions of the country, ecological zones,
rural-urban residence and the like may be thought of as different strata. The strata should
be made as homogeneous as possible, ensuring greater similarity within the strata than
between the strata.
▪ Strata are to be formed on the basis of some known characteristics of the population
which are believed to have some relationship with the subject of inquiry and the variables of
interest.
▪ When stratification with respect to the characteristics under study becomes difficult for
practical reasons, administrative convenience may be considered as the basis for forming
the strata.
▪ With a view to improving the sampling design, strata should be formed on the basis of
natural characteristics as far as possible.
▪ Past data, intuition, expert judgment or preliminary findings from pilot surveys may also
be used to set up the strata. This, however, requires that we have prior knowledge of the
nature of the population from which we are sampling.
Formation of strata: Stratification with a uniform sampling fraction almost always increases
precision. In general, the gain in precision from stratified sampling with a uniform
sampling fraction depends on the variation between stratum averages: the greater the variation
between the stratum averages, the greater the gain in precision. In the extreme case when the stratum
averages are exactly equal, there may actually be a loss in precision due to the stratification, but
this can only happen when the sampling fraction is appreciable (Stuart, 1984:44). It is thus
important to note that our aim when forming the strata for sampling with a uniform sampling
fraction must be to make the stratum averages differ as much as possible. What about the case of
variable sampling fractions? Use of variable sampling fractions may increase or decrease
precision compared with a uniform sampling fraction. Except in extreme cases, precision will be
increased if larger sampling fractions are used in the more variable strata, and decreased if they are
used in the less variable strata. A mathematical rule works well in this regard: to maximize
precision, the sampling fraction in each stratum should be made proportional to the square root of
the variance (that is, the standard deviation) in that stratum (Stuart, 1984:41). Even if it is difficult
to follow this rule, it is easy to see that if we divide a population into strata with the aim of
maximizing the variation between stratum averages, we automatically minimize the total
variation within the strata, since the total variation in the population is fixed. This also
helps to maximize precision in the case of variable sampling fractions.
Number of strata: The decision on the number of strata to be formed from any single stratification
variable is of crucial importance. The choice should take into account (i) the effect of increasing
the number of strata on the reduction of variance and (ii) the cost of the survey.
While deciding on this, it should be borne in mind that very small strata contribute little to the gains
from stratification. The formation of only a few large strata will typically yield most of the possible
gains from a variable. Further subdivisions of these would result in only small additional gains.
Sampling within the strata: The different sub-groups of a population may vary markedly in
their physical distribution and peculiarities. This calls for adopting different sampling designs in
different strata to account for the degree of variability in the population. The sample selection
within a stratum may follow simple random sampling, sampling with probability proportional to
size (PPS), or systematic sampling, especially at the primary and other stages down to the
penultimate stage.
Allocating sample to strata: The essence of stratification is the classification of the population
into sub-populations or strata and then the selection of separate samples from each of the strata.
This should not, however, be done entirely arbitrarily if the precision of the estimates is of concern. We
must therefore frame some rule or strategy to allocate a sample of fixed size to the different
strata. These rules are what we call principles of allocation. It is, however, important to note that if
one has to balance the requirements of overall precision on the one side and stratum
level precision on the other, a variety of compromise solutions are possible.
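Notation for the hth stratum, where $y_{hi}$ denotes the value of the ith unit in the stratum:

Quantity            Population                                                            Sample
Total               $Y_h = \sum_{i=1}^{N_h} y_{hi}$                                       $y_h = \sum_{i=1}^{n_h} y_{hi}$
Mean                $\bar{Y}_h = \frac{Y_h}{N_h}$                                         $\bar{y}_h = \frac{y_h}{n_h}$
Variance            $S_h^2 = \frac{\sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h)^2}{N_h - 1}$     $s_h^2 = \frac{\sum_{i=1}^{n_h} (y_{hi} - \bar{y}_h)^2}{n_h - 1}$
Stratum weight      $W_h = \frac{N_h}{N}$
Sampling fraction   $f_h = \frac{n_h}{N_h}$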
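The population mean is a weighted mean of the stratum means:
$$\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h,$$
where
$$W_h = \frac{N_h}{N}, \qquad \bar{Y}_h = \frac{Y_h}{N_h} = \frac{\sum_{i=1}^{N_h} y_{hi}}{N_h}.$$
Proof:
Let the population of N units be divided into L strata with sizes $N_h$ ($h = 1, 2, \dots, L$),
where $N = \sum_{h=1}^{L} N_h$.
Let a random sample of size $n_h$ be drawn from the hth stratum using simple random sampling without
replacement.
Let
$\bar{Y}$ = population mean,
$\bar{Y}_h$ = population mean for the hth stratum,
$\bar{y}_h$ = sample mean for the hth stratum and
$Y$ = population total.
Then the population total is $Y = \sum_{h=1}^{L} Y_h$, where $Y_h = \sum_{i=1}^{N_h} y_{hi}$.
So, the population mean is
$$\begin{aligned}
\bar{Y} &= \frac{Y}{N} \\
&= \frac{\sum_{h=1}^{L} Y_h}{N} \\
&= \frac{1}{N} \sum_{h=1}^{L} \frac{Y_h}{N_h} N_h \\
&= \sum_{h=1}^{L} \frac{N_h}{N} \bar{Y}_h \\
&= \sum_{h=1}^{L} W_h \bar{Y}_h
\end{aligned}$$
where $W_h = \frac{N_h}{N}$ is called the stratum weight or population proportion for the hth stratum.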
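As a quick numerical check of the identity $\bar{Y} = \sum_h W_h \bar{Y}_h$, the following minimal Python sketch uses a small made-up population (the stratum labels and values are illustrative assumptions, not data from these notes). Exact fractions are used so that both sides can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical population split into L = 3 non-overlapping strata.
strata = {
    "A": [2, 4, 6],
    "B": [10, 20],
    "C": [1, 1, 1, 5],
}

N = sum(len(units) for units in strata.values())   # population size
Y = sum(sum(units) for units in strata.values())   # population total
pop_mean = Fraction(Y, N)                           # Y-bar = Y / N

# Weighted mean of stratum means: sum over h of W_h * Ybar_h, with W_h = N_h / N.
weighted_mean = sum(
    Fraction(len(units), N) * Fraction(sum(units), len(units))
    for units in strata.values()
)
```

Both `pop_mean` and `weighted_mean` evaluate to the same fraction, as the proof requires.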
Stratified sample mean ($\bar{y}_{st}$)
For the population mean per unit, the estimator used in stratified sampling is $\bar{y}_{st}$, known as the
stratified sample mean, and is given by
$$\bar{y}_{st} = \frac{\sum_{h=1}^{L} N_h \bar{y}_h}{N} = \sum_{h=1}^{L} W_h \bar{y}_h$$
where
$$W_h = \frac{N_h}{N}$$
and
$$N = N_1 + N_2 + \dots + N_L.$$
That is, the stratified sample mean $\bar{y}_{st}$ is the weighted average of the stratum sample means.
Note:
The stratified sample mean $\bar{y}_{st}$ is not in general the same as the ordinary sample mean $\bar{y}$,
where
$$\bar{y} = \frac{\sum_{h=1}^{L} n_h \bar{y}_h}{n} = \sum_{h=1}^{L} w_h \bar{y}_h$$
where
$$w_h = \frac{n_h}{n}.$$
That is, $\bar{y}$ coincides with $\bar{y}_{st}$ if the sampling fraction is the same in all strata. This stratification is
described as stratification with proportional allocation of the $n_h$.
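The difference between $\bar{y}_{st}$ and the ordinary sample mean $\bar{y}$ can be seen in a small Python sketch. The stratum sizes and sampled values below are hypothetical, chosen only so that one sample has unequal sampling fractions and the other has the same sampling fraction in both strata.

```python
def stratified_mean(N_h, samples):
    """ybar_st = sum_h W_h * ybar_h, with W_h = N_h / N."""
    N = sum(N_h)
    return sum(Nh / N * (sum(s) / len(s)) for Nh, s in zip(N_h, samples))

def ordinary_mean(samples):
    """ybar = sum_h n_h * ybar_h / n, i.e. the mean of all sampled values pooled."""
    pooled = [y for s in samples for y in s]
    return sum(pooled) / len(pooled)

# Hypothetical stratum sizes and sampled values.
N_h = [100, 300]
unequal = [[10, 12, 14, 16], [20, 22]]                 # n_h = 4, 2
proportional = [[10, 14], [20, 22, 21, 23, 19, 21]]    # n_h = 2, 6, so n_h / N_h = 0.02 in both

st_unequal, ord_unequal = stratified_mean(N_h, unequal), ordinary_mean(unequal)
st_prop, ord_prop = stratified_mean(N_h, proportional), ordinary_mean(proportional)
```

With the unequal fractions the two means differ, while with the common sampling fraction they coincide, which is the point of the note.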
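Theorem:
In stratified random sampling, the stratified sample mean $\bar{y}_{st}$ is an unbiased estimator of the population mean
$\bar{Y}$.
Proof:
Let the population of N units be divided into L strata with sizes $N_h$ ($h = 1, 2, \dots, L$),
where $N = \sum_{h=1}^{L} N_h$.
Let a random sample of size $n_h$ be drawn from the hth stratum using simple random sampling without
replacement.
Let
$\bar{Y}$ = population mean,
$\bar{Y}_h$ = population mean for the hth stratum,
$\bar{y}_h$ = sample mean for the hth stratum and
$Y$ = population total.
We have
$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h$$
where
$$W_h = \frac{N_h}{N}.$$
Now,
$$\begin{aligned}
E(\bar{y}_{st}) &= E\left(\sum_{h=1}^{L} W_h \bar{y}_h\right) \\
&= \sum_{h=1}^{L} W_h E(\bar{y}_h) \\
&= \sum_{h=1}^{L} W_h \bar{Y}_h \quad [\because E(\bar{y}_h) = \bar{Y}_h \text{ under simple random sampling}] \\
&= \bar{Y}.
\end{aligned}$$
(proved)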
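Theorem:
In stratified sampling, the ordinary sample mean $\bar{y}$ is not an unbiased estimator of the population mean
$\bar{Y}$.
Proof:
$$\begin{aligned}
E(\bar{y}) &= E\left(\sum_{h=1}^{L} \frac{n_h \bar{y}_h}{n}\right) \\
&= \sum_{h=1}^{L} \frac{n_h}{n} E(\bar{y}_h) \\
&= \sum_{h=1}^{L} \frac{n_h}{n} \bar{Y}_h \\
&\neq \bar{Y}. \quad \text{(proved)}
\end{aligned}$$
So, $\bar{y}$ is not an unbiased estimator of the population mean $\bar{Y}$ unless $\frac{n_h}{n} = \frac{N_h}{N}$
for all $h = 1, 2, \dots, L$.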
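The size of the bias follows directly from $E(\bar{y}) = \sum_h \frac{n_h}{n} \bar{Y}_h$. The sketch below evaluates it for a hypothetical two-stratum population (the stratum sizes, stratum means and sample sizes are assumptions for illustration), once with equal stratum sample sizes and once with $n_h/n = N_h/N$.

```python
def expected_ordinary_mean(Ybar_h, n_h):
    """E(ybar) = sum_h (n_h / n) * Ybar_h."""
    n = sum(n_h)
    return sum(nh / n * m for nh, m in zip(n_h, Ybar_h))

def population_mean(Ybar_h, N_h):
    """Ybar = sum_h (N_h / N) * Ybar_h."""
    N = sum(N_h)
    return sum(Nh / N * m for Nh, m in zip(N_h, Ybar_h))

# Hypothetical population: two strata with known means.
N_h = [100, 300]
Ybar_h = [10.0, 20.0]

pop = population_mean(Ybar_h, N_h)
bias_equal_sizes = expected_ordinary_mean(Ybar_h, [20, 20]) - pop   # n_h/n != N_h/N
bias_proportional = expected_ordinary_mean(Ybar_h, [10, 30]) - pop  # n_h/n == N_h/N
```

Here the equal-sized samples over-represent the smaller stratum, so the ordinary mean is pulled towards that stratum's mean; the bias disappears only in the proportional case.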
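Theorem:
The variance of the stratified sample mean is given in one of the following three forms:
$$\text{a.} \quad V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h}, \quad \text{where } f_h = \frac{n_h}{N_h}$$
$$\text{b.} \quad V(\bar{y}_{st}) = \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h}$$
$$\text{c.} \quad V(\bar{y}_{st}) = \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h}$$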
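Proof:
Let the population of N units be divided into L strata with sizes $N_h$ ($h = 1, 2, \dots, L$),
where $N = \sum_{h=1}^{L} N_h$.
Let a random sample of size $n_h$ be drawn from the hth stratum using simple random sampling without
replacement.
Let
$\bar{Y}$ = population mean,
$\bar{Y}_h$ = population mean for the hth stratum,
$\bar{y}_h$ = sample mean for the hth stratum and
$Y$ = population total.
a.
$$\begin{aligned}
V(\bar{y}_{st}) &= V\left(\sum_{h=1}^{L} W_h \bar{y}_h\right) \\
&= \sum_{h=1}^{L} W_h^2 V(\bar{y}_h) + \sum_{h=1}^{L} \sum_{j \neq h} W_h W_j \operatorname{Cov}(\bar{y}_h, \bar{y}_j) \\
&= \sum_{h=1}^{L} W_h^2 V(\bar{y}_h) \quad [\because \text{samples are drawn independently in different strata, so the covariances vanish}] \\
&= \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h} \quad \left[\because V(\bar{y}_h) = (1 - f_h) \frac{S_h^2}{n_h} \text{ under SRS without replacement}\right]
\end{aligned}$$
b.
We have
$$\begin{aligned}
V(\bar{y}_{st}) &= \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{L} \frac{N_h^2}{N^2} \cdot \frac{N_h - n_h}{N_h} \cdot \frac{S_h^2}{n_h} \\
&= \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h}
\end{aligned}$$
c.
We have
$$\begin{aligned}
V(\bar{y}_{st}) &= \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2 f_h}{n_h} \\
&= \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h} \quad \left[\because f_h = \frac{n_h}{N_h}\right]
\end{aligned}$$
(proved)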
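That the three forms (a), (b) and (c) agree can also be checked numerically. The Python sketch below implements each form separately; the stratum sizes, sample sizes and stratum variances are hypothetical values chosen for illustration.

```python
def var_form_a(N_h, n_h, S2_h):
    """Form (a): sum_h W_h^2 (1 - f_h) S_h^2 / n_h."""
    N = sum(N_h)
    return sum((Nh / N) ** 2 * (1 - nh / Nh) * S2 / nh
               for Nh, nh, S2 in zip(N_h, n_h, S2_h))

def var_form_b(N_h, n_h, S2_h):
    """Form (b): (1 / N^2) sum_h N_h (N_h - n_h) S_h^2 / n_h."""
    N = sum(N_h)
    return sum(Nh * (Nh - nh) * S2 / nh
               for Nh, nh, S2 in zip(N_h, n_h, S2_h)) / N ** 2

def var_form_c(N_h, n_h, S2_h):
    """Form (c): sum_h W_h^2 S_h^2 / n_h - sum_h W_h^2 S_h^2 / N_h."""
    N = sum(N_h)
    first = sum((Nh / N) ** 2 * S2 / nh for Nh, nh, S2 in zip(N_h, n_h, S2_h))
    second = sum((Nh / N) ** 2 * S2 / Nh for Nh, S2 in zip(N_h, S2_h))
    return first - second

# Hypothetical inputs: stratum sizes, stratum sample sizes, stratum variances S_h^2.
N_h = [50, 30, 20]
n_h = [10, 6, 4]
S2_h = [4.0, 9.0, 16.0]

va = var_form_a(N_h, n_h, S2_h)
vb = var_form_b(N_h, n_h, S2_h)
vc = var_form_c(N_h, n_h, S2_h)
```

Form (b) avoids squaring the weights and is often the most convenient for hand calculation; form (c) separates the part that depends on the $n_h$ from the part that does not, which is the form used when deriving allocations.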
Theorem:
$\hat{Y}_{st} = N \bar{y}_{st}$ is an unbiased estimator of the population total $Y$.
Proof:
Let the population of N units be divided into L strata with sizes $N_h$ ($h = 1, 2, \dots, L$),
where $N = \sum_{h=1}^{L} N_h$.
Let a random sample of size $n_h$ be drawn from the hth stratum using simple random sampling without
replacement.
Let
$\bar{Y}$ = population mean,
$\bar{Y}_h$ = population mean for the hth stratum,
$\bar{y}_h$ = sample mean for the hth stratum and
$Y$ = population total.
$$\begin{aligned}
E(\hat{Y}_{st}) &= E(N \bar{y}_{st}) \\
&= N E(\bar{y}_{st}) \\
&= N \bar{Y} \quad [\because E(\bar{y}_{st}) = \bar{Y}] \\
&= N \frac{Y}{N} \\
&= Y.
\end{aligned}$$
(proved)
Theorem:
The variance of the stratified estimator of the population total is given by
$$V(\hat{Y}_{st}) = \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h}$$
Proof:
Let the population of N units be divided into L strata with sizes $N_h$ ($h = 1, 2, \dots, L$),
where $N = \sum_{h=1}^{L} N_h$.
Let a random sample of size $n_h$ be drawn from the hth stratum using simple random sampling without
replacement.
Let
$\bar{Y}$ = population mean,
$\bar{Y}_h$ = population mean for the hth stratum,
$\bar{y}_h$ = sample mean for the hth stratum and
$Y$ = population total.
$$\begin{aligned}
V(\hat{Y}_{st}) &= V(N \bar{y}_{st}) \\
&= N^2 V(\bar{y}_{st}) \\
&= N^2 \cdot \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h}
\end{aligned}$$
(proved)
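Estimator of $V(\bar{y}_{st})$
Theorem:
If SRS is done without replacement within each stratum, an unbiased estimator of $S_h^2$ is
$$s_h^2 = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{hi} - \bar{y}_h)^2.$$
Now replacing $S_h^2$ by its estimator $s_h^2$, an unbiased estimator of $V(\bar{y}_{st})$ is given in one of the
following three forms:
$$\text{a.} \quad v(\bar{y}_{st}) = s^2(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{s_h^2}{n_h}, \quad \text{where } f_h = \frac{n_h}{N_h}$$
$$\text{b.} \quad v(\bar{y}_{st}) = s^2(\bar{y}_{st}) = \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{s_h^2}{n_h}$$
$$\text{c.} \quad v(\bar{y}_{st}) = s^2(\bar{y}_{st}) = \sum_{h=1}^{L} \frac{W_h^2 s_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 s_h^2}{N_h}$$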
Estimator of $V(\hat{Y}_{st})$
Similarly, replacing $S_h^2$ by its estimator $s_h^2$, an unbiased estimator of $V(\hat{Y}_{st})$ is given by
$$v(\hat{Y}_{st}) = s^2(\hat{Y}_{st}) = \sum_{h=1}^{L} N_h (N_h - n_h) \frac{s_h^2}{n_h}$$
Hence the estimator of the standard error of $\hat{Y}_{st}$ is given by
$$s(\hat{Y}_{st}) = \sqrt{v(\hat{Y}_{st})} = \sqrt{\sum_{h=1}^{L} N_h (N_h - n_h) \frac{s_h^2}{n_h}}.$$
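The estimators of this section can be computed from a stratified sample in a few lines. The following Python sketch is a minimal illustration under the assumptions of the text (SRS without replacement within each stratum); the function name `stratified_estimates`, the stratum sizes and the sample values are hypothetical.

```python
import math
from statistics import mean, variance  # variance() divides by n - 1, matching s_h^2

def stratified_estimates(N_h, samples):
    """Point estimates and estimated variances from a stratified sample.

    N_h:     list of stratum sizes
    samples: list of lists, the observed values y_hi in each stratum
    """
    N = sum(N_h)
    ybar_st = sum(Nh / N * mean(s) for Nh, s in zip(N_h, samples))
    # v(Y_hat_st) = sum_h N_h (N_h - n_h) s_h^2 / n_h
    v_total = sum(
        Nh * (Nh - len(s)) * variance(s) / len(s) for Nh, s in zip(N_h, samples)
    )
    return {
        "mean": ybar_st,
        "total": N * ybar_st,
        "v_mean": v_total / N ** 2,          # form (b) of v(ybar_st)
        "v_total": v_total,
        "se_mean": math.sqrt(v_total) / N,
        "se_total": math.sqrt(v_total),
    }

# Hypothetical data: two strata of sizes 40 and 60.
est = stratified_estimates([40, 60], [[2, 4, 6, 8], [10, 12, 14, 16, 18]])
```

Each stratum needs at least two sampled units, since $s_h^2$ is undefined for $n_h = 1$.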
Allocation of total sample to different strata
Suppose that we want to draw a total sample of size n from a population which is divided into a
number of strata. Then the total sample of size n is to be distributed among different strata. This
distribution of sample over different strata is called allocation of total sample or simply allocation.
Types of allocation
There are four types of allocation.
a) Equal allocation
b) Proportional allocation
c) Neyman allocation
d) Optimum allocation
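Equal allocation
Definition:
If the total sample is distributed equally among all the strata, the allocation is called equal
allocation.
Since the stratum sample sizes add up to the total sample size,
$$\sum_{h=1}^{L} n_h = n$$
$$\Rightarrow L n_h = n \quad [\text{since under equal allocation } n_h \text{ is constant for all } h]$$
$$\Rightarrow n_h = \frac{n}{L}$$
That is, under equal allocation, we obtain $n_h$ by dividing the sample size by the number of strata.
$\bar{y}_{st}$ under equal allocation
$$\begin{aligned}
\bar{y}_{st} &= \sum_{h=1}^{L} W_h \bar{y}_h \\
&= \sum_{h=1}^{L} \frac{N_h}{N} \cdot \frac{\sum_{i=1}^{n_h} y_{hi}}{n_h} \\
&= \sum_{h=1}^{L} \frac{N_h L}{N n} \sum_{i=1}^{n_h} y_{hi} \quad \left[\because n_h = \frac{n}{L}\right] \\
&= \frac{L}{nN} \sum_{h=1}^{L} N_h y_h \quad \left[\because y_h = \sum_{i=1}^{n_h} y_{hi}\right]
\end{aligned}$$
$V(\bar{y}_{st})$ under equal allocation
We have
$$\begin{aligned}
V(\bar{y}_{st}) &= \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{L} \frac{N_h^2}{N^2} \cdot \frac{N_h - n_h}{N_h} \cdot \frac{S_h^2}{n_h} \\
&= \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h} \\
&= \frac{1}{N^2} \sum_{h=1}^{L} N_h \left(N_h - \frac{n}{L}\right) \frac{S_h^2}{n/L} \\
&= \frac{1}{N^2} \sum_{h=1}^{L} N_h \cdot \frac{L N_h - n}{L} \cdot \frac{S_h^2}{n/L} \\
&= \frac{1}{N^2} \sum_{h=1}^{L} N_h \cdot \frac{L N_h - n}{n} \, S_h^2 \\
&= \frac{1}{nN^2} \sum_{h=1}^{L} N_h (L N_h - n) S_h^2
\end{aligned}$$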
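The closed form for $V(\bar{y}_{st})$ under equal allocation can be compared with the general form (b) evaluated at $n_h = n/L$. The Python sketch below does this for hypothetical stratum sizes and variances.

```python
def var_general(N_h, n_h, S2_h):
    """General form (b): (1 / N^2) sum_h N_h (N_h - n_h) S_h^2 / n_h."""
    N = sum(N_h)
    return sum(Nh * (Nh - nh) * S2 / nh
               for Nh, nh, S2 in zip(N_h, n_h, S2_h)) / N ** 2

def var_equal(N_h, S2_h, n):
    """Closed form under equal allocation: (1 / (n N^2)) sum_h N_h (L N_h - n) S_h^2."""
    N, L = sum(N_h), len(N_h)
    return sum(Nh * (L * Nh - n) * S2 for Nh, S2 in zip(N_h, S2_h)) / (n * N ** 2)

# Hypothetical inputs.
N_h = [50, 30, 20]
S2_h = [4.0, 9.0, 16.0]
n = 12
n_equal = [n / len(N_h)] * len(N_h)   # n_h = n / L = 4 in every stratum

v_closed = var_equal(N_h, S2_h, n)
v_check = var_general(N_h, n_equal, S2_h)
```

In practice $n/L$ need not be an integer, so some rounding rule has to be chosen; the sketch leaves that choice open.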
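Proportional allocation
Definition:
If the total sample is distributed over the different strata in such a way that the size of the sample in any
stratum is proportional to the stratum size, then this allocation is called proportional allocation.
That is, $n_h \propto N_h$, so that $n_h = \frac{n}{N} N_h$.
$\bar{y}_{st}$ under proportional allocation
$$\begin{aligned}
\bar{y}_{st} &= \sum_{h=1}^{L} W_h \bar{y}_h \\
&= \sum_{h=1}^{L} \frac{N_h}{N} \bar{y}_h \\
&= \sum_{h=1}^{L} \frac{n_h}{n} \bar{y}_h \quad \left[\text{since under proportional allocation } n_h = \frac{n}{N} N_h \Rightarrow \frac{n_h}{n} = \frac{N_h}{N}\right] \\
&= \sum_{h=1}^{L} \frac{n_h}{n} \cdot \frac{y_h}{n_h} \quad \left[\text{where } y_h = \sum_{i=1}^{n_h} y_{hi}\right] \\
&= \sum_{h=1}^{L} \frac{y_h}{n} \\
&= \frac{\sum_{h=1}^{L} y_h}{n} \\
&= \bar{y}
\end{aligned}$$
So, under proportional allocation, $\bar{y}_{st} = \bar{y}$, i.e. the stratified sample mean is equal to the ordinary sample
mean.
Therefore, under proportional allocation, the ordinary sample mean $\bar{y}$ is an unbiased estimator of the
population mean $\bar{Y}$.
$V(\bar{y}_{st})$ under proportional allocation
Under proportional allocation $f_h = \frac{n_h}{N_h} = \frac{n}{N} = f$ for all h. We have
$$\begin{aligned}
V(\bar{y}_{st}) &= \sum_{h=1}^{L} W_h^2 (1 - f_h) \frac{S_h^2}{n_h} \\
&= \sum_{h=1}^{L} W_h^2 (1 - f) \frac{S_h^2}{n_h} \\
&= (1 - f) \sum_{h=1}^{L} \frac{N_h^2}{N^2} \cdot \frac{S_h^2}{n_h} \\
&= (1 - f) \sum_{h=1}^{L} \frac{N_h^2}{N^2} \cdot \frac{S_h^2}{n N_h / N} \\
&= \frac{1 - f}{n} \sum_{h=1}^{L} \frac{N_h}{N} S_h^2 \\
&= \left(\frac{1}{n} - \frac{1}{N}\right) \sum_{h=1}^{L} W_h S_h^2 \\
&= \frac{\sum_{h=1}^{L} W_h S_h^2}{n} - \frac{\sum_{h=1}^{L} W_h S_h^2}{N}
\end{aligned}$$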
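Proportional allocation and the resulting variance are straightforward to compute. The Python sketch below is a minimal illustration; the function names and the input values are assumptions made for the example.

```python
def proportional_allocation(N_h, n):
    """n_h = n * N_h / N (may be non-integer; rounding is left to the user)."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]

def var_proportional(N_h, S2_h, n):
    """V_prop = ((1 - f) / n) * sum_h W_h S_h^2, with f = n / N."""
    N = sum(N_h)
    f = n / N
    return (1 - f) / n * sum(Nh / N * S2 for Nh, S2 in zip(N_h, S2_h))

# Hypothetical stratum sizes and stratum variances S_h^2.
N_h = [50, 30, 20]
S2_h = [4.0, 9.0, 16.0]

n_h = proportional_allocation(N_h, n=20)
V_prop = var_proportional(N_h, S2_h, n=20)
```

For these inputs every stratum is sampled at the same rate $f = 0.2$, which is what makes the sample self-weighting.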
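Neyman allocation
Neyman allocation chooses the $n_h$ so as to minimize $V(\bar{y}_{st})$ for a fixed total sample size n. From
form (c) of the variance,
$$V(\bar{y}_{st}) = \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h}.$$
The second part of the right hand side is constant (with respect to $n_h$). So we need to minimize only the first
part, i.e.
$$V' = \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} \quad \text{(say)}.$$
Now minimizing $V'$ for fixed sample size n, or minimizing n for fixed $V'$, is equivalent to
minimizing the product
$$V' n = \left(\sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}\right) \left(\sum_{h=1}^{L} n_h\right)$$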
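Let $a_h$ and $b_h$ be two sets of L positive numbers. Then by the Cauchy–Schwarz inequality we have
$$\left(\sum_{h=1}^{L} a_h^2\right) \left(\sum_{h=1}^{L} b_h^2\right) \geq \left(\sum_{h=1}^{L} a_h b_h\right)^2,$$
where equality holds if $\frac{b_h}{a_h}$ is constant for all h.
Let $a_h = \frac{W_h S_h}{\sqrt{n_h}}$ and $b_h = \sqrt{n_h}$, so that $a_h b_h = W_h S_h$.
Hence by the Cauchy–Schwarz inequality
$$V' n = \left(\sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}\right) \left(\sum_{h=1}^{L} n_h\right) \geq \left(\sum_{h=1}^{L} W_h S_h\right)^2$$
Therefore, the smallest value of $V' n$ is $\left(\sum_{h=1}^{L} W_h S_h\right)^2$.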
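This minimum value occurs when
$$\frac{b_h}{a_h} = k = \text{constant}$$
$$\Rightarrow \frac{\sqrt{n_h}}{W_h S_h / \sqrt{n_h}} = k$$
$$\Rightarrow \frac{n_h}{W_h S_h} = k$$
$$\Rightarrow n_h = k W_h S_h$$
$$\Rightarrow n_h = k \frac{N_h}{N} S_h$$
$$\Rightarrow n_h = k' N_h S_h \qquad (1) \quad \left[\text{putting } \frac{k}{N} = k'\right]$$
$$\Rightarrow \sum_{h=1}^{L} n_h = k' \sum_{h=1}^{L} N_h S_h \quad [\text{summing over all strata}]$$
$$\Rightarrow n = k' \sum_{h=1}^{L} N_h S_h$$
$$\Rightarrow k' = \frac{n}{\sum_{h=1}^{L} N_h S_h}$$
Putting the value of $k'$ in (1),
$$n_h = \frac{n N_h S_h}{\sum_{h=1}^{L} N_h S_h} \qquad (2)$$
Equation (2) gives the formula for $n_h$ in Neyman allocation.
Note: Equation (2) gives $n_h \propto N_h S_h$, which provides the definition of this allocation, where the
proportionality constant is $k' = \frac{n}{\sum_{h=1}^{L} N_h S_h}$.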
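Equation (2) translates directly into code. The following Python sketch computes the Neyman allocation for hypothetical stratum sizes and standard deviations; the function name and the numbers are assumptions chosen so that the resulting $n_h$ happen to be whole numbers.

```python
def neyman_allocation(N_h, S_h, n):
    """n_h = n * N_h S_h / sum_h N_h S_h (may be non-integer in general)."""
    products = [Nh * Sh for Nh, Sh in zip(N_h, S_h)]
    total = sum(products)
    return [n * p / total for p in products]

# Hypothetical stratum sizes and stratum standard deviations S_h.
N_h = [50, 30, 20]
S_h = [2.0, 5.0, 2.5]

n_h = neyman_allocation(N_h, S_h, n=30)
```

Note that the second stratum receives the largest share of the sample even though it is not the largest stratum, because its standard deviation is the highest. In applications the formula can return $n_h > N_h$ for a very variable stratum, in which case that stratum is usually enumerated completely and the remainder reallocated; the sketch does not handle this case.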
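$V(\bar{y}_{st})$ under Neyman allocation
$$\begin{aligned}
V_{Ney} &= \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n N_h S_h / \sum_{j=1}^{L} N_j S_j} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h} \quad [\text{putting the value of } n_h \text{ under Neyman allocation}] \\
&= \frac{1}{nN^2} \sum_{h=1}^{L} \frac{N_h^2 S_h^2}{N_h S_h / \sum_{j=1}^{L} N_j S_j} - \frac{1}{N^2} \sum_{h=1}^{L} \frac{N_h^2 S_h^2}{N_h} \\
&= \frac{1}{nN^2} \left(\sum_{h=1}^{L} N_h S_h\right)^2 - \frac{1}{N^2} \sum_{h=1}^{L} N_h S_h^2 \qquad (3) \\
&= \frac{\left(\sum_{h=1}^{L} W_h S_h\right)^2}{n} - \frac{\sum_{h=1}^{L} W_h S_h^2}{N}
\end{aligned}$$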
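Minimum value of n for given variance V
From (3) we have
$$V = \frac{1}{nN^2} \left(\sum_{h=1}^{L} N_h S_h\right)^2 - \frac{1}{N^2} \sum_{h=1}^{L} N_h S_h^2$$
$$\Rightarrow \frac{1}{nN^2} \left(\sum_{h=1}^{L} N_h S_h\right)^2 = V + \frac{1}{N^2} \sum_{h=1}^{L} N_h S_h^2$$
$$\therefore n = \frac{\frac{1}{N^2} \left(\sum_{h=1}^{L} N_h S_h\right)^2}{V + \frac{1}{N^2} \sum_{h=1}^{L} N_h S_h^2}$$
$$\Rightarrow n = \frac{\left(\sum_{h=1}^{L} N_h S_h\right)^2}{N^2 V + \sum_{h=1}^{L} N_h S_h^2}.$$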
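The two Neyman formulas, the variance for a given n and the sample size for a given variance V, are inverses of one another, which gives a convenient round-trip check. The Python sketch below uses hypothetical stratum sizes and standard deviations.

```python
def var_neyman(N_h, S_h, n):
    """V_Ney = (sum N_h S_h)^2 / (n N^2) - (sum N_h S_h^2) / N^2."""
    N = sum(N_h)
    sum_NS = sum(Nh * Sh for Nh, Sh in zip(N_h, S_h))
    sum_NS2 = sum(Nh * Sh ** 2 for Nh, Sh in zip(N_h, S_h))
    return sum_NS ** 2 / (n * N ** 2) - sum_NS2 / N ** 2

def n_for_variance(N_h, S_h, V):
    """n = (sum N_h S_h)^2 / (N^2 V + sum N_h S_h^2)."""
    N = sum(N_h)
    sum_NS = sum(Nh * Sh for Nh, Sh in zip(N_h, S_h))
    sum_NS2 = sum(Nh * Sh ** 2 for Nh, Sh in zip(N_h, S_h))
    return sum_NS ** 2 / (N ** 2 * V + sum_NS2)

# Hypothetical inputs.
N_h = [50, 30, 20]
S_h = [2.0, 5.0, 2.5]

V = var_neyman(N_h, S_h, n=30)          # variance achieved with n = 30
n_back = n_for_variance(N_h, S_h, V)    # should recover n = 30
```

When a target variance does not correspond to a whole number of units, the computed n would normally be rounded up.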
Optimum allocation
Definition:
If the total sample is distributed over the different strata in such a way that the sample size in the hth
stratum is proportional to the quantity $\frac{N_h S_h}{\sqrt{c_h}}$,
where
$N_h$ = hth stratum size,
$S_h^2$ = variance of the hth stratum,
$c_h$ = cost per unit of sampling in the hth stratum,
then this allocation is called optimum allocation.
That is, in this allocation,
$$n_h \propto \frac{N_h S_h}{\sqrt{c_h}}.$$
The cost function is
$$c = c_0 + \sum_{h=1}^{L} c_h n_h \qquad (1)$$
$$\Rightarrow c = c_0 + c' \qquad (2) \quad \text{where } c' = \sum_{h=1}^{L} c_h n_h$$
In equation (1),
$c$ = total cost,
$c_0$ = overhead cost (e.g. cost of setting up and maintaining an office, recruiting survey personnel,
capital expenses etc.),
$c_h$ = cost of sampling per unit from the hth stratum.
Note: For example, in a household survey, suppose the survey cost per household in stratum 1 is $c_1$
and $n_1$ households are taken from stratum 1. Then the total cost for these $n_1$ households is $c_1 n_1$. Similarly,
for stratum 2 the cost is $c_2 n_2$, and so on. Therefore, the total cost of the field survey alone is
$\sum_{h=1}^{L} c_h n_h$, and hence the total cost of all activities of the survey is $c_0 + \sum_{h=1}^{L} c_h n_h = c$ (say).
Optimum allocation is obtained by minimizing $V(\bar{y}_{st})$ for a given cost c, or minimizing c for a given
value of $V(\bar{y}_{st})$.
Now,
$$\begin{aligned}
V(\bar{y}_{st}) &= \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h} \\
&= V' - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h} \qquad (3) \quad \text{where } V' = \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}
\end{aligned}$$
In $V(\bar{y}_{st})$ we consider only the first term, i.e. $V'$ of equation (3), and
in the expression for c we consider only the second term, i.e. $c'$ of equation (2), because the remaining
parts of V and c are constants that do not depend on $n_h$.
Now minimizing $V'$ for given $c'$, or minimizing $c'$ for given $V'$, is equivalent to minimizing the product
$$V' c' = \left(\sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}\right) \left(\sum_{h=1}^{L} c_h n_h\right)$$
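Let $a_h$ and $b_h$ be two sets of L positive numbers. Then by the Cauchy–Schwarz inequality we have
$$\left(\sum_{h=1}^{L} a_h^2\right) \left(\sum_{h=1}^{L} b_h^2\right) \geq \left(\sum_{h=1}^{L} a_h b_h\right)^2,$$
where equality holds if $\frac{b_h}{a_h}$ is constant for all h.
Let $a_h = \frac{W_h S_h}{\sqrt{n_h}}$ and $b_h = \sqrt{c_h n_h}$, so that
$$a_h b_h = \frac{W_h S_h}{\sqrt{n_h}} \sqrt{c_h n_h} = W_h S_h \sqrt{c_h}.$$
Hence by the Cauchy–Schwarz inequality
$$V' c' = \left(\sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}\right) \left(\sum_{h=1}^{L} c_h n_h\right) \geq \left(\sum_{h=1}^{L} W_h S_h \sqrt{c_h}\right)^2$$
Therefore, the smallest value of $V' c'$ is $\left(\sum_{h=1}^{L} W_h S_h \sqrt{c_h}\right)^2$.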
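This minimum value occurs when
$$\frac{b_h}{a_h} = k = \text{constant}$$
$$\Rightarrow \frac{\sqrt{c_h n_h}}{W_h S_h / \sqrt{n_h}} = k$$
$$\Rightarrow \frac{n_h \sqrt{c_h}}{W_h S_h} = k$$
$$\Rightarrow n_h = k \frac{W_h S_h}{\sqrt{c_h}}$$
$$\Rightarrow n_h = \frac{k}{N} \cdot \frac{N_h S_h}{\sqrt{c_h}}$$
$$\Rightarrow n_h = k' \frac{N_h S_h}{\sqrt{c_h}} \qquad (4) \quad \left[\text{putting } \frac{k}{N} = k'\right]$$
$$\Rightarrow \sum_{h=1}^{L} n_h = k' \sum_{h=1}^{L} \frac{N_h S_h}{\sqrt{c_h}} \quad [\text{summing over all strata}]$$
$$\Rightarrow n = k' \sum_{h=1}^{L} \frac{N_h S_h}{\sqrt{c_h}}$$
$$\Rightarrow k' = \frac{n}{\sum_{h=1}^{L} N_h S_h / \sqrt{c_h}}$$
Putting the value of $k'$ in (4),
$$n_h = \frac{n N_h S_h / \sqrt{c_h}}{\sum_{h=1}^{L} N_h S_h / \sqrt{c_h}} \qquad (5)$$
Equation (5) gives the formula for $n_h$ in optimum allocation.
Note: Equation (5) gives $n_h \propto \frac{N_h S_h}{\sqrt{c_h}}$, which provides the definition of this allocation, where the
proportionality constant is $k' = \frac{n}{\sum_{h=1}^{L} N_h S_h / \sqrt{c_h}}$.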
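Equation (5) can be sketched in Python as follows. The function name and the inputs (stratum sizes, standard deviations and per-unit costs) are hypothetical values chosen for illustration, with the second stratum four times as expensive to sample as the others.

```python
import math

def optimum_allocation(N_h, S_h, c_h, n):
    """n_h = n * (N_h S_h / sqrt(c_h)) / sum_h (N_h S_h / sqrt(c_h))."""
    terms = [Nh * Sh / math.sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h)]
    total = sum(terms)
    return [n * t / total for t in terms]

# Hypothetical inputs.
N_h = [50, 30, 20]
S_h = [2.0, 5.0, 2.5]
c_h = [1.0, 4.0, 1.0]

n_h = optimum_allocation(N_h, S_h, c_h, n=45)
```

When all the $c_h$ are equal the cost terms cancel and the result reduces to the Neyman allocation $n_h \propto N_h S_h$.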
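$V(\bar{y}_{st})$ under optimum allocation
Putting the value of $n_h$ from (5) in the expression for $V(\bar{y}_{st})$,
$$V_{opt} = \frac{1}{N^2} \sum_{h=1}^{L} \frac{N_h^2 S_h^2}{n \left(N_h S_h / \sqrt{c_h}\right) / \sum_{j=1}^{L} \left(N_j S_j / \sqrt{c_j}\right)} - \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{N_h}$$
$$\Rightarrow V_{opt} = \frac{1}{nN^2} \left(\sum_{h=1}^{L} N_h S_h \sqrt{c_h}\right) \left(\sum_{h=1}^{L} \frac{N_h S_h}{\sqrt{c_h}}\right) - \frac{1}{N^2} \sum_{h=1}^{L} N_h S_h^2 \qquad (6)$$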
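Total sample size n for a fixed cost c
$$c = c_0 + \sum_{h=1}^{L} c_h n_h$$
$$\Rightarrow c = c_0 + \sum_{h=1}^{L} c_h \frac{n N_h S_h / \sqrt{c_h}}{\sum_{j=1}^{L} N_j S_j / \sqrt{c_j}}$$
$$\Rightarrow c - c_0 = \frac{n \sum_{h=1}^{L} N_h S_h \sqrt{c_h}}{\sum_{h=1}^{L} N_h S_h / \sqrt{c_h}}$$
$$\Rightarrow n = \frac{(c - c_0) \sum_{h=1}^{L} N_h S_h / \sqrt{c_h}}{\sum_{h=1}^{L} N_h S_h \sqrt{c_h}} \qquad (7)$$
Total sample size n for a fixed variance V
From (6), putting $V_{opt} = V$,
$$n \left(N^2 V + \sum_{h=1}^{L} N_h S_h^2\right) = \left(\sum_{h=1}^{L} N_h S_h \sqrt{c_h}\right) \left(\sum_{h=1}^{L} \frac{N_h S_h}{\sqrt{c_h}}\right)$$
$$\Rightarrow n = \frac{\left(\sum_{h=1}^{L} N_h S_h \sqrt{c_h}\right) \left(\sum_{h=1}^{L} N_h S_h / \sqrt{c_h}\right)}{N^2 V + \sum_{h=1}^{L} N_h S_h^2}$$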
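Both sample-size formulas for optimum allocation can be checked against each other. In the Python sketch below the inputs are hypothetical; the budget is chosen so that the affordable sample is n = 45, and the variance target is then set to the variance that n = 45 achieves, so the second formula should return the same n.

```python
import math

def _sums(N_h, S_h, c_h):
    """Return (sum N_h S_h sqrt(c_h), sum N_h S_h / sqrt(c_h), sum N_h S_h^2)."""
    a = sum(Nh * Sh * math.sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h))
    b = sum(Nh * Sh / math.sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h))
    s2 = sum(Nh * Sh ** 2 for Nh, Sh in zip(N_h, S_h))
    return a, b, s2

def n_for_cost(N_h, S_h, c_h, c, c0):
    """n = (c - c0) * sum(N_h S_h / sqrt(c_h)) / sum(N_h S_h sqrt(c_h))."""
    a, b, _ = _sums(N_h, S_h, c_h)
    return (c - c0) * b / a

def var_optimum(N_h, S_h, c_h, n):
    """V_opt from equation (6)."""
    N = sum(N_h)
    a, b, s2 = _sums(N_h, S_h, c_h)
    return a * b / (n * N ** 2) - s2 / N ** 2

def n_for_variance(N_h, S_h, c_h, V):
    """n = sum(N_h S_h sqrt(c_h)) * sum(N_h S_h / sqrt(c_h)) / (N^2 V + sum N_h S_h^2)."""
    N = sum(N_h)
    a, b, s2 = _sums(N_h, S_h, c_h)
    return a * b / (N ** 2 * V + s2)

# Hypothetical inputs: total budget c = 100 with overhead c0 = 10.
N_h, S_h, c_h = [50, 30, 20], [2.0, 5.0, 2.5], [1.0, 4.0, 1.0]
n_cost = n_for_cost(N_h, S_h, c_h, c=100.0, c0=10.0)
V_target = var_optimum(N_h, S_h, c_h, n=45)
n_var = n_for_variance(N_h, S_h, c_h, V_target)
```

With these inputs the optimum allocation of 45 units costs exactly the 90 units of budget left after overhead, which is a useful sanity check on equation (7).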
Comparison of precision under different allocations with SRS
Theorem: If the $N_h$ are so large that $\frac{1}{N_h} \simeq 0$ ($h = 1, 2, \dots, L$), then
$$V_{Ney} \leq V_{prop} \leq V_{ran}.$$
Proof:
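We know
$$V_{ran} = (1 - f) \frac{S^2}{n}$$
$$\Rightarrow V_{ran} = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}$$
$$\Rightarrow V_{ran} = \left(\frac{1}{n} - \frac{1}{N}\right) S^2$$
$$\Rightarrow V_{ran} = \frac{S^2}{n} \qquad (1) \quad \left[\text{since the } N_h \text{ are large, } N \text{ is also large, so that } \frac{1}{N} \simeq 0\right]$$
Also,
$$V_{prop} = \frac{1 - f}{n} \sum_{h=1}^{L} W_h S_h^2$$
$$\Rightarrow V_{prop} = \left(\frac{1}{n} - \frac{1}{N}\right) \sum_{h=1}^{L} W_h S_h^2$$
$$\Rightarrow V_{prop} = \frac{1}{n} \sum_{h=1}^{L} W_h S_h^2 \qquad (2) \quad \left[\text{since } \frac{1}{N} \simeq 0\right]$$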
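Now
$$S^2 = \frac{\sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y})^2}{N - 1}$$
$$\begin{aligned}
\Rightarrow (N - 1) S^2 &= \sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y})^2 \\
&= \sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h)^2 + \sum_{h=1}^{L} N_h (\bar{Y}_h - \bar{Y})^2 \\
&= \sum_{h=1}^{L} (N_h - 1) S_h^2 + \sum_{h=1}^{L} N_h (\bar{Y}_h - \bar{Y})^2
\end{aligned}$$
$$\therefore N S^2 = \sum_{h=1}^{L} N_h S_h^2 + \sum_{h=1}^{L} N_h (\bar{Y}_h - \bar{Y})^2 \quad [\because N_h - 1 \simeq N_h \text{ and } N - 1 \simeq N]$$
$$\Rightarrow S^2 = \sum_{h=1}^{L} W_h S_h^2 + \sum_{h=1}^{L} W_h (\bar{Y}_h - \bar{Y})^2$$
$$\Rightarrow \frac{S^2}{n} = \frac{1}{n} \sum_{h=1}^{L} W_h S_h^2 + \frac{1}{n} \sum_{h=1}^{L} W_h (\bar{Y}_h - \bar{Y})^2$$
$$\Rightarrow V_{ran} = V_{prop} + \frac{1}{n} \sum_{h=1}^{L} W_h (\bar{Y}_h - \bar{Y})^2 \quad [\text{using (1) and (2)}]$$
$$\therefore V_{ran} \geq V_{prop} \qquad (A) \quad \left[\because \frac{1}{n} \sum_{h=1}^{L} W_h (\bar{Y}_h - \bar{Y})^2 \geq 0\right]$$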
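Again we have
$$V_{Ney} = \frac{\left(\sum_{h=1}^{L} W_h S_h\right)^2}{n} - \frac{\sum_{h=1}^{L} W_h S_h^2}{N}$$
$$\Rightarrow V_{Ney} = \frac{\left(\sum_{h=1}^{L} W_h S_h\right)^2}{n} \qquad (3) \quad \left[\because \frac{\sum_{h=1}^{L} W_h S_h^2}{N} \simeq 0 \text{ for large } N\right]$$
Now
$$\begin{aligned}
V_{prop} - V_{Ney} &= \frac{1}{n} \sum_{h=1}^{L} W_h S_h^2 - \frac{1}{n} \left(\sum_{h=1}^{L} W_h S_h\right)^2 \quad [\text{using (2) and (3)}] \\
&= \frac{1}{n} \left[\sum_{h=1}^{L} W_h S_h^2 - \left(\sum_{h=1}^{L} W_h S_h\right)^2\right] \\
&= \frac{1}{n} \sum_{h=1}^{L} W_h (S_h - \bar{S})^2 \quad \text{where } \bar{S} = \sum_{h=1}^{L} W_h S_h
\end{aligned}$$
$$\therefore V_{prop} - V_{Ney} = \frac{1}{n} \sum_{h=1}^{L} W_h (S_h - \bar{S})^2 \geq 0$$
$$\therefore V_{prop} \geq V_{Ney} \qquad (B)$$
Combining (A) and (B), $V_{Ney} \leq V_{prop} \leq V_{ran}$. (proved)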
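Note: In expanding $\sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y})^2$ about the stratum means, the cross-product term vanishes:
$$\begin{aligned}
2 \sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h)(\bar{Y}_h - \bar{Y}) &= 2 \sum_{h=1}^{L} (\bar{Y}_h - \bar{Y}) \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h) \\
&= 0 \quad \left[\because \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h) = 0\right]
\end{aligned}$$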
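Note 3:
Let
$$W_h = f_i, \quad S_h = x_i, \quad \bar{S} = \bar{x} \quad \text{and} \quad L = n.$$
Then
$$\begin{aligned}
\sum_{h=1}^{L} W_h S_h^2 - \left(\sum_{h=1}^{L} W_h S_h\right)^2 &= \sum_{i=1}^{n} f_i x_i^2 - \left(\sum_{i=1}^{n} f_i x_i\right)^2 \\
&= \sum_{i=1}^{n} f_i x_i^2 - \frac{\left(\sum_{i=1}^{n} f_i x_i\right)^2}{\sum_{i=1}^{n} f_i} \quad \left[\because \sum_{i=1}^{n} f_i = \sum_{h=1}^{L} W_h = 1\right] \\
&= \sum_{i=1}^{n} f_i (x_i - \bar{x})^2 \\
&= \sum_{h=1}^{L} W_h (S_h - \bar{S})^2
\end{aligned}$$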
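The ordering $V_{Ney} \leq V_{prop} \leq V_{ran}$ and the two difference formulas can be illustrated numerically using the large-population approximations (1), (2) and (3). The stratum weights, standard deviations and means in the Python sketch below are hypothetical.

```python
# Hypothetical stratum weights W_h, standard deviations S_h and means Ybar_h.
W = [0.5, 0.3, 0.2]
S = [2.0, 5.0, 2.5]
Ybar_h = [10.0, 20.0, 30.0]
n = 30

Ybar = sum(w * m for w, m in zip(W, Ybar_h))                     # population mean
within = sum(w * s ** 2 for w, s in zip(W, S))                   # sum W_h S_h^2
between = sum(w * (m - Ybar) ** 2 for w, m in zip(W, Ybar_h))    # sum W_h (Ybar_h - Ybar)^2
S_bar = sum(w * s for w, s in zip(W, S))                         # weighted mean of the S_h

# Large-N approximations: S^2 = within + between.
V_ran = (within + between) / n
V_prop = within / n
V_ney = S_bar ** 2 / n

# Differences predicted by the proof.
gain_prop_over_ran = between / n
gain_ney_over_prop = sum(w * (s - S_bar) ** 2 for w, s in zip(W, S)) / n
```

The first gain depends only on how much the stratum means differ, and the second only on how much the stratum standard deviations differ, so Neyman allocation offers little beyond proportional allocation when the $S_h$ are nearly equal.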
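Theorem: Given the results of a stratified random sample, an unbiased estimator of the variance of the mean under SRS
is given by
$$v_{ran}(\bar{y}) = \frac{N - n}{n(N - 1)} \left[\frac{1}{N} \sum_{h=1}^{L} \frac{N_h}{n_h} \sum_{i=1}^{n_h} y_{hi}^2 - \bar{y}_{st}^2 + v(\bar{y}_{st})\right]$$
where $v(\bar{y}_{st})$ is an unbiased estimator of $V(\bar{y}_{st})$.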
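Proof:
We have
$$V_{ran}(\bar{y}) = \frac{N - n}{nN} S^2$$
$$\Rightarrow V_{ran}(\bar{y}) = \frac{N - n}{nN} \cdot \frac{\sum_{h=1}^{L} \sum_{i=1}^{N_h} (y_{hi} - \bar{Y})^2}{N - 1}$$
$$\Rightarrow V_{ran}(\bar{y}) = \frac{N - n}{nN} \cdot \frac{\sum_{h=1}^{L} \sum_{i=1}^{N_h} y_{hi}^2 - N \bar{Y}^2}{N - 1}$$
$$\Rightarrow V_{ran}(\bar{y}) = \frac{N - n}{n(N - 1)} \left[\frac{1}{N} \sum_{h=1}^{L} \sum_{i=1}^{N_h} y_{hi}^2 - \bar{Y}^2\right] \qquad (1)$$
Now,
$$\begin{aligned}
E\left[\frac{1}{N} \sum_{h=1}^{L} \frac{N_h}{n_h} \sum_{i=1}^{n_h} y_{hi}^2\right] &= \frac{1}{N} \sum_{h=1}^{L} \frac{N_h}{n_h} \sum_{i=1}^{n_h} E(y_{hi}^2) \\
&= \frac{1}{N} \sum_{h=1}^{L} \frac{N_h}{n_h} \, n_h E(y_{hi}^2) \\
&= \frac{1}{N} \sum_{h=1}^{L} N_h \cdot \frac{1}{N_h} \sum_{i=1}^{N_h} y_{hi}^2
\end{aligned}$$
$$\therefore E\left[\frac{1}{N} \sum_{h=1}^{L} \frac{N_h}{n_h} \sum_{i=1}^{n_h} y_{hi}^2\right] = \frac{1}{N} \sum_{h=1}^{L} \sum_{i=1}^{N_h} y_{hi}^2 \qquad (2)$$
Again
$$V(\bar{y}_{st}) = E(\bar{y}_{st}^2) - \left[E(\bar{y}_{st})\right]^2$$
$$\Rightarrow E[v(\bar{y}_{st})] = E(\bar{y}_{st}^2) - \bar{Y}^2 \quad [\because E[v(\bar{y}_{st})] = V(\bar{y}_{st}) \text{ and } E(\bar{y}_{st}) = \bar{Y}]$$
$$\Rightarrow E(\bar{y}_{st}^2) - E[v(\bar{y}_{st})] = \bar{Y}^2 \qquad (3)$$
Taking the expectation of $v_{ran}(\bar{y})$ and using (2) and (3),
$$E[v_{ran}(\bar{y})] = \frac{N - n}{n(N - 1)} \left[\frac{1}{N} \sum_{h=1}^{L} \sum_{i=1}^{N_h} y_{hi}^2 - \bar{Y}^2\right] = V_{ran}(\bar{y}) \quad [\text{by (1)}]$$
(proved)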
Use of this theorem
Using the above theorem, we get the estimate of the gain in precision due to stratification as
$$\frac{1}{v(\bar{y}_{st})} - \frac{1}{v_{ran}(\bar{y})}.$$
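The estimated gain in precision can be computed from a single stratified sample by evaluating $v(\bar{y}_{st})$ and $v_{ran}(\bar{y})$ as defined in the theorem. The Python sketch below is a minimal illustration; the function name `gain_in_precision`, the stratum sizes and the sample values are hypothetical.

```python
from statistics import mean, variance  # variance() divides by n - 1, matching s_h^2

def gain_in_precision(N_h, samples):
    """Return (v(ybar_st), v_ran(ybar), estimated gain) from a stratified sample."""
    N = sum(N_h)
    n = sum(len(s) for s in samples)
    ybar_st = sum(Nh / N * mean(s) for Nh, s in zip(N_h, samples))
    # v(ybar_st), form (b): (1 / N^2) sum_h N_h (N_h - n_h) s_h^2 / n_h
    v_st = sum(
        Nh * (Nh - len(s)) * variance(s) / len(s) for Nh, s in zip(N_h, samples)
    ) / N ** 2
    # (1 / N) sum_h (N_h / n_h) sum_i y_hi^2
    sq_term = sum(
        Nh / len(s) * sum(y * y for y in s) for Nh, s in zip(N_h, samples)
    ) / N
    v_ran = (N - n) / (n * (N - 1)) * (sq_term - ybar_st ** 2 + v_st)
    return v_st, v_ran, 1 / v_st - 1 / v_ran

# Hypothetical data: two strata of sizes 40 and 60 with clearly different levels.
v_st, v_ran, gain = gain_in_precision([40, 60], [[2, 4, 6, 8], [10, 12, 14, 16, 18]])
```

A positive value of `gain` indicates that, for this sample, stratification is estimated to be more precise than a simple random sample of the same total size.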