Chapter 7: Varying Probability Sampling
The simple random sampling scheme provides a random sample in which every unit in the population has
an equal probability of selection. Under certain circumstances, more efficient estimators are obtained by
assigning unequal probabilities of selection to the units in the population. This type of sampling is
known as a varying probability sampling scheme.
If Y is the variable under study and X is an auxiliary variable related to Y, then in the most commonly
used varying probability scheme, the units are selected with probability proportional to the value of X,
called the size. This is termed probability proportional to a given measure of size (pps) sampling. If
the sampling units vary considerably in size, then SRS does not take into account the possible
importance of the larger units in the population. A large unit, i.e., a unit with a large value of Y, contributes
more to the population total than the units with smaller values, so it is natural to expect that a selection
scheme which assigns a higher probability of inclusion in the sample to the larger units than to the smaller
units would provide more efficient estimators than a scheme which assigns equal probability to all
the units. This is accomplished through pps sampling.
Note that the "size" considered is the value of the auxiliary variable X and not the value of the study variable Y.
For example, in an agricultural survey, the yield depends on the area under cultivation. Bigger areas are
likely to have larger yields and will contribute more towards the population total, so the
area can be taken as the size (auxiliary variable). The cultivated area for a previous
period can also be taken as the size while estimating the yield of a crop. Similarly, in an industrial survey,
the number of workers in a factory can be considered as the measure of size when studying the industrial
output from the respective factory.
PPS without replacement (WOR) is more complex than PPS with replacement (WR). We consider both
cases separately.
In the selection of a sample with varying probabilities, the procedure is to associate with each unit a set of
consecutive natural numbers, the size of the set being proportional to the desired probability.
If $X_1, X_2, \ldots, X_N$ are positive integers proportional to the probabilities assigned to the N units in the
population, then a possible way is to associate the cumulative totals with the units. The units are then selected
based on the values of the cumulative totals. This is illustrated in the following table:
Sampling Theory| Chapter 7 | Varying Probability Sampling | Shalabh, IIT Kanpur Page 2
Units    Size         Cumulative total
1        $X_1$        $T_1 = X_1$
2        $X_2$        $T_2 = X_1 + X_2$
...      ...          ...
i-1      $X_{i-1}$    $T_{i-1} = \sum_{j=1}^{i-1} X_j$
i        $X_i$        $T_i = \sum_{j=1}^{i} X_j$
...      ...          ...
N        $X_N$        $T_N = \sum_{j=1}^{N} X_j$

Select a random number R between 1 and $T_N$ by using a random number table.
If $T_{i-1} < R \le T_i$, then the ith unit is selected, with probability $X_i / T_N$, $i = 1, 2, \ldots, N$.
Repeat the procedure n times to get a sample of size n.
Drawback : This procedure involves writing down the successive cumulative totals. This is time
consuming and tedious if the number of units in the population is large.
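Lahiri's method:
Let $M = \max_{i = 1, 2, \ldots, N} X_i$, i.e., the maximum of the sizes of the N units in the population, or some convenient
number greater than this maximum. A pair of random numbers $(i, j)$ with $1 \le i \le N$ and $1 \le j \le M$ is drawn. If
$j \le X_i$, the ith unit is selected; otherwise the pair is rejected and another pair is drawn. The procedure is
repeated until n units are selected.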
Probability of selection of the ith unit at a trial depends on two possible outcomes:
- either it is selected at the first draw,
- or it is selected in a subsequent draw preceded by ineffective draws.
The probability that a single draw selects unit i is
\[
P(1 \le i \le N)\, P(1 \le j \le M \mid i) = \frac{1}{N} \cdot \frac{X_i}{M} = P_i^*, \text{ say.}
\]
Probability that no unit is selected at a trial
\[
= \frac{1}{N} \sum_{i=1}^{N} \left(1 - \frac{X_i}{M}\right)
= \frac{1}{N} \left(N - \frac{N \bar{X}}{M}\right)
= 1 - \frac{\bar{X}}{M} = Q, \text{ say.}
\]
Probability that unit i is selected (all previous draws being ineffective)
\[
= P_i^* + Q P_i^* + Q^2 P_i^* + \cdots
= \frac{P_i^*}{1 - Q}
= \frac{X_i / (NM)}{\bar{X} / M}
= \frac{X_i}{N \bar{X}}
= \frac{X_i}{X_{total}} \propto X_i .
\]
Thus the probability of selection of unit i is proportional to the size $X_i$. So this method generates a pps
sample.
Advantages:
1. It does not require writing down all cumulative totals for each unit.
2. Sizes of all the units need not be known beforehand. We need only some number greater than the
maximum size and the sizes of those units which are selected by the choice of the first set of
random numbers 1 to N for drawing the sample under this scheme.
Disadvantage: It results in the wastage of time and effort if units get rejected.
A draw is ineffective if the pair of random numbers drawn is rejected.
The probability of rejection of a drawn number, i.e., the probability that no unit is selected at a trial, is
\[
\frac{1}{N} \sum_{i=1}^{N} \left(1 - \frac{X_i}{M}\right) = \frac{1}{N}\left(N - \frac{N\bar{X}}{M}\right) = 1 - \frac{\bar{X}}{M}.
\]
The expected number of draws required to draw one unit is $M / \bar{X}$.
This number is large if M is much larger than $\bar{X}$.
Example: Consider the following data on the number of workers and the output of 10 factories. We
illustrate the selection of units using the cumulative total method.

Factory   Workers (X)   Output (Y)   Cumulative total
1         2             …            $T_1 = 2$
2         5             60           $T_2 = 2 + 5 = 7$
3         10            12           $T_3 = 2 + 5 + 10 = 17$
4         4             6            $T_4 = 17 + 4 = 21$
5         7             8            $T_5 = 21 + 7 = 28$
6         2             13           $T_6 = 28 + 2 = 30$
7         3             4            $T_7 = 30 + 3 = 33$
8         14            17           $T_8 = 33 + 14 = 47$
9         11            13           $T_9 = 47 + 11 = 58$
10        6             8            $T_{10} = 58 + 6 = 64$

2. Second draw:
- Draw a random number between 1 and 64.
- Suppose it is 38.
- $T_7 < 38 \le T_8$, so unit 8 is selected.
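The cumulative total method can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, Python's `random` module stands in for the random number table, and only the worker counts of the example are used.

```python
import bisect
import itertools
import random

def cumulative_totals(sizes):
    """Return the running totals T_1, ..., T_N of the size measures."""
    return list(itertools.accumulate(sizes))

def unit_for_number(totals, r):
    """Return the 1-based unit i satisfying T_{i-1} < r <= T_i."""
    # bisect_left gives the first position whose total is >= r.
    return bisect.bisect_left(totals, r) + 1

def pps_wr_cumulative(sizes, n, rng=random):
    """Draw n units with replacement, each with probability X_i / T_N."""
    totals = cumulative_totals(sizes)
    return [unit_for_number(totals, rng.randint(1, totals[-1])) for _ in range(n)]

# Number of workers in the 10 factories of the example.
sizes = [2, 5, 10, 4, 7, 2, 3, 14, 11, 6]
totals = cumulative_totals(sizes)
print(totals)                       # cumulative totals T_1, ..., T_10
print(unit_for_number(totals, 38))  # unit selected when the random number is 38
```

Each integer from 1 to $T_N$ maps to exactly one unit, and unit i owns $X_i$ of these integers, which is why the selection probability is $X_i / T_N$.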
Selection of sample using Lahiri's Method
In this case
\[
M = \max_{i = 1, 2, \ldots, 10} X_i = 14 .
\]

Random no. i    Random no. j    Observation            Selection of unit
3               7               $j = 7 \le X_3 = 10$   trial accepted ($y_3$)
8               13              $j = 13 \le X_8 = 14$  trial accepted ($y_8$)
4               7               $j = 7 > X_4 = 4$      trial rejected
2               9               $j = 9 > X_2 = 5$      trial rejected
9               2               $j = 2 \le X_9 = 11$   trial accepted ($y_9$)
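Lahiri's acceptance rule can be sketched as follows. The function names are hypothetical, `random` replaces the random number table, and the sampler below allows a unit to be selected more than once (a with-replacement reading of the procedure, which is an assumption of this sketch).

```python
import random
from fractions import Fraction

def accepts(sizes, i, j):
    """Lahiri's rule: the pair (i, j) selects unit i (1-based) if j <= X_i."""
    return j <= sizes[i - 1]

def lahiri_sample(sizes, n, M=None, rng=random):
    """Repeat trials until n units are accepted; M defaults to max X_i."""
    N = len(sizes)
    M = max(sizes) if M is None else M
    sample = []
    while len(sample) < n:
        i = rng.randint(1, N)   # 1 <= i <= N
        j = rng.randint(1, M)   # 1 <= j <= M
        if accepts(sizes, i, j):
            sample.append(i)
    return sample

def expected_trials(sizes, M):
    """Expected number of trials per selected unit, M / X-bar."""
    return Fraction(M * len(sizes), sum(sizes))

sizes = [2, 5, 10, 4, 7, 2, 3, 14, 11, 6]
pairs = [(3, 7), (8, 13), (4, 7), (2, 9), (9, 2)]
print([accepts(sizes, i, j) for i, j in pairs])
print(expected_trials(sizes, 14))
```

Among the $N \times M$ equally likely pairs, exactly $X_i$ are accepted for unit i, so the accepted unit is distributed proportionally to size without any cumulative totals.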
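Let $Y_i$ be the value of the study variable for the ith unit of the population and $P_i$ the probability of selection of the ith unit at any given draw, which is proportional to the size $X_i$.
Consider the varying probability scheme with replacement for a sample of size n. Let $y_r$ be the
value of the rth observation on the study variable in the sample and $p_r$ be its initial probability of selection.
Define
\[
z_r = \frac{y_r}{N p_r}, \quad r = 1, 2, \ldots, n,
\]
then $\bar{z} = \frac{1}{n} \sum_{r=1}^{n} z_r$ is an unbiased estimator of the population mean $\bar{Y}$, the variance of $\bar{z}$ is $\sigma_z^2 / n$,
where
\[
\sigma_z^2 = \sum_{i=1}^{N} P_i \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2,
\]
and an unbiased estimate of the variance of $\bar{z}$ is
\[
\frac{s_z^2}{n} = \frac{1}{n(n-1)} \sum_{r=1}^{n} (z_r - \bar{z})^2 .
\]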
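Proof:
Note that $z_r$ can take any one of the N values $Z_1, Z_2, \ldots, Z_N$, where $Z_i = Y_i / (N P_i)$, with corresponding initial probabilities
$P_1, P_2, \ldots, P_N$, respectively. So
\[
E(z_r) = \sum_{i=1}^{N} Z_i P_i = \sum_{i=1}^{N} \frac{Y_i}{N P_i} P_i = \bar{Y}.
\]
Thus
\[
E(\bar{z}) = \frac{1}{n} \sum_{r=1}^{n} E(z_r) = \frac{1}{n} \sum_{r=1}^{n} \bar{Y} = \bar{Y}.
\]
The variance of $\bar{z}$ is
\[
\operatorname{Var}(\bar{z}) = \frac{1}{n^2} \operatorname{Var}\left( \sum_{r=1}^{n} z_r \right)
= \frac{1}{n^2} \sum_{r=1}^{n} \operatorname{Var}(z_r) \quad (z_r\text{'s are independent in the WR case}).
\]
Now
\[
\begin{aligned}
\operatorname{Var}(z_r) &= E\left[ z_r - E(z_r) \right]^2 = E\left[ z_r - \bar{Y} \right]^2 \\
&= \sum_{i=1}^{N} (Z_i - \bar{Y})^2 P_i \\
&= \sum_{i=1}^{N} \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2 P_i \\
&= \sigma_z^2 \text{ (say)}.
\end{aligned}
\]
Thus
\[
\operatorname{Var}(\bar{z}) = \frac{1}{n^2} \sum_{r=1}^{n} \sigma_z^2 = \frac{\sigma_z^2}{n}.
\]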
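To show that $s_z^2 / n$ is an unbiased estimator of the variance of $\bar{z}$, consider
\[
\begin{aligned}
(n-1) E(s_z^2) &= E\left[ \sum_{r=1}^{n} (z_r - \bar{z})^2 \right]
= E\left[ \sum_{r=1}^{n} z_r^2 - n \bar{z}^2 \right] \\
&= \sum_{r=1}^{n} E(z_r^2) - n E(\bar{z}^2) \\
&= \sum_{r=1}^{n} \left[ \operatorname{Var}(z_r) + \{E(z_r)\}^2 \right] - n \left[ \operatorname{Var}(\bar{z}) + \{E(\bar{z})\}^2 \right] \\
&= \sum_{r=1}^{n} \left( \sigma_z^2 + \bar{Y}^2 \right) - n \left( \frac{\sigma_z^2}{n} + \bar{Y}^2 \right)
\quad \left( \text{using } \operatorname{Var}(z_r) = \sum_{i=1}^{N} \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2 P_i = \sigma_z^2 \right) \\
&= (n-1) \sigma_z^2 ,
\end{aligned}
\]
so that
\[
E(s_z^2) = \sigma_z^2 \quad \text{or} \quad E\left( \frac{s_z^2}{n} \right) = \frac{\sigma_z^2}{n} = \operatorname{Var}(\bar{z}),
\]
and
\[
\widehat{\operatorname{Var}}(\bar{z}) = \frac{s_z^2}{n} = \frac{1}{n(n-1)} \left[ \sum_{r=1}^{n} \left( \frac{y_r}{N p_r} \right)^2 - n \bar{z}^2 \right].
\]
Note: If $P_i = \frac{1}{N}$, then $\bar{z} = \bar{y}$ and
\[
\operatorname{Var}(\bar{z}) = \frac{1}{n} \cdot \frac{1}{N} \sum_{i=1}^{N} \left( \frac{Y_i}{N \cdot \frac{1}{N}} - \bar{Y} \right)^2 = \frac{\sigma_y^2}{n},
\]
which is the same as in the case of SRSWR.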
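An estimate of the population total is
\[
\hat{Y}_{tot} = \frac{1}{n} \sum_{r=1}^{n} \frac{y_r}{p_r} = N \bar{z}.
\]
Taking expectation, we get
\[
\begin{aligned}
E(\hat{Y}_{tot}) &= \frac{1}{n} \sum_{r=1}^{n} \left[ \frac{Y_1}{P_1} P_1 + \frac{Y_2}{P_2} P_2 + \cdots + \frac{Y_N}{P_N} P_N \right] \\
&= \frac{1}{n} \sum_{r=1}^{n} \sum_{i=1}^{N} Y_i
= \frac{1}{n} \sum_{r=1}^{n} Y_{tot}
= Y_{tot}.
\end{aligned}
\]
Thus $\hat{Y}_{tot}$ is an unbiased estimator of the population total. Its variance is
\[
\begin{aligned}
\operatorname{Var}(\hat{Y}_{tot}) &= N^2 \operatorname{Var}(\bar{z})
= N^2 \cdot \frac{1}{n} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2 \\
&= \frac{1}{n} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2 \\
&= \frac{1}{n} \left[ \sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y_{tot}^2 \right].
\end{aligned}
\]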
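The with-replacement estimators above are simple to compute from a drawn sample. The sketch below uses a hypothetical helper name and exact `Fraction` arithmetic; the example sample (factories 3, 8 and 9 with probabilities $X_i / 64$) is taken from the worked example and is only illustrative.

```python
from fractions import Fraction

def pps_wr_estimates(y, p, N):
    """Return (z_bar, estimated Var(z_bar), estimated total) for a PPSWR sample.

    y[r], p[r]: observed value and initial selection probability at draw r.
    """
    n = len(y)
    z = [Fraction(yr) / (N * Fraction(pr)) for yr, pr in zip(y, p)]  # z_r = y_r / (N p_r)
    z_bar = sum(z) / n
    var_hat = sum((zr - z_bar) ** 2 for zr in z) / (n * (n - 1))     # s_z^2 / n
    return z_bar, var_hat, N * z_bar

# Sample of factories 3, 8 and 9 (outputs 12, 17, 13; workers 10, 14, 11; T_N = 64).
y = [12, 17, 13]
p = [Fraction(10, 64), Fraction(14, 64), Fraction(11, 64)]
z_bar, var_hat, total_hat = pps_wr_estimates(y, p, N=10)
print(float(z_bar), float(var_hat), float(total_hat))
```

When $Y_i$ is exactly proportional to $P_i$, every $z_r$ equals $\bar{Y}$ and the estimated variance is zero, which is the situation in which pps sampling is most efficient.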
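Now consider the varying probability scheme without replacement. Let
$U_i$: ith unit,
$P_i$: probability of selection of $U_i$ at the first draw, $i = 1, 2, \ldots, N$, with $\sum_{i=1}^{N} P_i = 1$,
$P_i(r)$: probability of selection of $U_i$ at the rth draw, so that $P_i(1) = P_i$.
Consider
$P_i(2)$ = Probability of selection of $U_i$ at the 2nd draw.
\[
\begin{aligned}
P_i(2) &= P_1 \frac{P_i}{1 - P_1} + P_2 \frac{P_i}{1 - P_2} + \cdots + P_{i-1} \frac{P_i}{1 - P_{i-1}} + P_{i+1} \frac{P_i}{1 - P_{i+1}} + \cdots + P_N \frac{P_i}{1 - P_N} \\
&= \sum_{j (\ne i) = 1}^{N} P_j \frac{P_i}{1 - P_j} \\
&= \sum_{j (\ne i) = 1}^{N} P_j \frac{P_i}{1 - P_j} + P_i \frac{P_i}{1 - P_i} - P_i \frac{P_i}{1 - P_i} \\
&= \sum_{j = 1}^{N} P_j \frac{P_i}{1 - P_j} - P_i \frac{P_i}{1 - P_i} \\
&= P_i \left[ \sum_{j=1}^{N} \frac{P_j}{1 - P_j} - \frac{P_i}{1 - P_i} \right].
\end{aligned}
\]
So $P_i(2) \ne P_i(1)$ for all i unless $P_i = \frac{1}{N}$.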
$P_i(2)$ will, in general, be different for each $i = 1, 2, \ldots, N$. So $E\left( \frac{y_i}{p_i} \right)$ will change with successive draws.
This makes the varying probability scheme WOR more complex. Only $\frac{y_1}{N p_1}$ will provide an unbiased
estimator of $\bar{Y}$. In general, $\frac{y_i}{N p_i}$ $(i \ne 1)$ will not provide an unbiased estimator of $\bar{Y}$.
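The change in selection probability between the first and second draws can be checked numerically. The following is a minimal sketch with a hypothetical function name and an arbitrary illustrative set of initial probabilities.

```python
from fractions import Fraction

def second_draw_probabilities(P):
    """P_i(2) = P_i * [sum_j P_j / (1 - P_j) - P_i / (1 - P_i)] for each unit i."""
    total = sum(p / (1 - p) for p in P)
    return [p * (total - p / (1 - p)) for p in P]

# Illustrative initial probabilities (an assumption, not data from the text).
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
P2 = second_draw_probabilities(P)
print([float(x) for x in P2])

# With equal initial probabilities the two draws have the same distribution.
equal = [Fraction(1, 4)] * 4
print(second_draw_probabilities(equal) == equal)
```

The second-draw probabilities still add up to one, but they are no longer proportional to the sizes, which is why $y_2 / (N p_2)$ on its own is biased.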
Ordered estimates
To overcome the difficulty of changing expectation with each draw, associate a new variate with each
draw such that its expectation is equal to the population value of the variate under study. Such
estimators take into account the order of the draw. They are called ordered estimates. The order in
which the values are obtained at previous draws affects the unbiasedness of the estimator of the population mean.
We consider the ordered estimators proposed by Des Raj, first for the case of two draws and then
generalize the result.
For two draws, let $y_1$ and $y_2$ denote the values of the units $U_{i(1)}$ and $U_{i(2)}$ drawn at the first and second draws,
respectively. Note that any one out of the N units can be the first unit or second unit, so we use the
notations $U_{i(1)}$ and $U_{i(2)}$ instead of $U_1$ and $U_2$. Also note that $y_1$ and $y_2$ are not the values of the first two
units in the population. Further, let $p_1$ and $p_2$ denote the initial probabilities of selection of $U_{i(1)}$ and
$U_{i(2)}$, respectively. Define
\[
\begin{aligned}
z_1 &= \frac{y_1}{N p_1}, \\
z_2 &= \frac{1}{N} \left[ y_1 + \frac{y_2}{p_2 / (1 - p_1)} \right]
= \frac{1}{N} \left[ y_1 + y_2 \frac{(1 - p_1)}{p_2} \right], \\
\bar{z} &= \frac{z_1 + z_2}{2}.
\end{aligned}
\]
Note that $\frac{p_2}{1 - p_1}$ is the probability $P(U_{i(2)} \mid U_{i(1)})$.
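Consider
\[
\begin{aligned}
E(z_1) &= \frac{1}{N} E\left( \frac{y_1}{p_1} \right)
\quad \left( \text{note that } \frac{y_1}{p_1} \text{ can take any one of the N values } \frac{Y_1}{P_1}, \frac{Y_2}{P_2}, \ldots, \frac{Y_N}{P_N} \right) \\
&= \frac{1}{N} \left[ \frac{Y_1}{P_1} P_1 + \frac{Y_2}{P_2} P_2 + \cdots + \frac{Y_N}{P_N} P_N \right] \\
&= \bar{Y},
\end{aligned}
\]
\[
\begin{aligned}
E(z_2) &= \frac{1}{N} E\left[ y_1 + y_2 \frac{(1 - p_1)}{p_2} \right] \\
&= \frac{1}{N} \left\{ E(y_1) + E_1 \left[ E_2 \left( y_2 \frac{(1 - p_1)}{p_2} \,\Big|\, U_{i(1)} \right) \right] \right\}
\quad (\text{using } E(Y) = E_X[E_Y(Y \mid X)]),
\end{aligned}
\]
where $E_2$ is the conditional expectation after fixing the unit $U_{i(1)}$ selected in the first draw.
Since $\frac{y_2}{p_2}$ can take any one of the (N – 1) values $\frac{Y_j}{P_j}$ (except the value selected in the first draw) with
probability $\frac{P_j}{1 - p_1}$, so
\[
E_2 \left[ y_2 \frac{(1 - p_1)}{p_2} \,\Big|\, U_{i(1)} \right]
= (1 - p_1) E_2 \left[ \frac{y_2}{p_2} \,\Big|\, U_{i(1)} \right]
= (1 - p_1) \sum_{j}{}^{*} \frac{Y_j}{P_j} \cdot \frac{P_j}{1 - p_1},
\]
where the summation $\sum^{*}$ is taken over all the values of Y except the value $y_1$ which is selected at the first
draw. So
\[
E_2 \left[ y_2 \frac{(1 - p_1)}{p_2} \,\Big|\, U_{i(1)} \right] = \sum_{j}{}^{*} Y_j = Y_{tot} - y_1 .
\]
Substituting it in $E(z_2)$, we have
\[
\begin{aligned}
E(z_2) &= \frac{1}{N} \left[ E(y_1) + E_1 (Y_{tot} - y_1) \right] \\
&= \frac{1}{N} \left[ E(y_1) + E(Y_{tot} - y_1) \right] \\
&= \frac{1}{N} E(Y_{tot}) = \frac{Y_{tot}}{N} = \bar{Y}.
\end{aligned}
\]
Thus
\[
E(\bar{z}) = \frac{E(z_1) + E(z_2)}{2} = \frac{\bar{Y} + \bar{Y}}{2} = \bar{Y}.
\]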
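Variance:
The variance of $\bar{z}$ for the case of two draws is given as
\[
\operatorname{Var}(\bar{z}) = \left( 1 - \frac{1}{2} \sum_{i=1}^{N} P_i^2 \right) \frac{1}{2N^2} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2
- \frac{1}{4N^2} \sum_{i=1}^{N} P_i^2 \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2 .
\]
Proof: The variance of $\bar{z}$ is
\[
\begin{aligned}
\operatorname{Var}(\bar{z}) &= E(\bar{z}^2) - \left[ E(\bar{z}) \right]^2 \\
&= E\left[ \frac{1}{2N} \left\{ \frac{y_1}{p_1} + y_1 + \frac{y_2 (1 - p_1)}{p_2} \right\} \right]^2 - \bar{Y}^2 \\
&= \frac{1}{4N^2} E\left[ \frac{y_1 (1 + p_1)}{p_1} + \frac{y_2 (1 - p_1)}{p_2} \right]^2 - \bar{Y}^2 .
\end{aligned}
\]
Here the first term inside the brackets depends only on the 1st draw, whereas the second term depends upon the 1st and 2nd draws. Since the ordered pair $(U_i, U_j)$, $i \ne j$, is drawn with probability $\frac{P_i P_j}{1 - P_i}$,
\[
\begin{aligned}
\operatorname{Var}(\bar{z}) &= \frac{1}{4N^2} \sum_{i \ne j = 1}^{N} \left[ \frac{Y_i (1 + P_i)}{P_i} + \frac{Y_j (1 - P_i)}{P_j} \right]^2 \frac{P_i P_j}{1 - P_i} - \bar{Y}^2 \\
&= \frac{1}{4N^2} \sum_{i \ne j = 1}^{N} \left[ \frac{Y_i^2 (1 + P_i)^2}{P_i^2} \frac{P_i P_j}{1 - P_i} + \frac{Y_j^2 (1 - P_i)^2}{P_j^2} \frac{P_i P_j}{1 - P_i} + 2 Y_i Y_j \frac{(1 - P_i^2)}{P_i P_j} \frac{P_i P_j}{1 - P_i} \right] - \bar{Y}^2 \\
&= \frac{1}{4N^2} \sum_{i \ne j = 1}^{N} \left[ \frac{Y_i^2 (1 + P_i)^2}{P_i (1 - P_i)} P_j + \frac{Y_j^2 (1 - P_i)^2}{P_j (1 - P_i)} P_i + 2 Y_i Y_j (1 + P_i) \right] - \bar{Y}^2 .
\end{aligned}
\]
Using the property
\[
\sum_{i \ne j = 1}^{N} a_i b_j = \sum_{i=1}^{N} a_i \left( \sum_{j=1}^{N} b_j - b_i \right),
\]
we can write
\[
\begin{aligned}
\operatorname{Var}(\bar{z}) &= \frac{1}{4N^2} \left[ \sum_{i=1}^{N} \frac{Y_i^2 (1 + P_i)^2}{P_i (1 - P_i)} \left( \sum_{j=1}^{N} P_j - P_i \right)
+ \sum_{i=1}^{N} P_i (1 - P_i) \left( \sum_{j=1}^{N} \frac{Y_j^2}{P_j} - \frac{Y_i^2}{P_i} \right)
+ 2 \sum_{i=1}^{N} Y_i (1 + P_i) \left( \sum_{j=1}^{N} Y_j - Y_i \right) \right] - \bar{Y}^2 \\
&= \frac{1}{4N^2} \left[ \sum_{i=1}^{N} \frac{Y_i^2}{P_i} (1 + P_i^2 + 2 P_i)
+ \sum_{i=1}^{N} P_i (1 - P_i) \left( \sum_{j=1}^{N} \frac{Y_j^2}{P_j} - \frac{Y_i^2}{P_i} \right)
+ 2 \sum_{i=1}^{N} Y_i (1 + P_i) (Y_{tot} - Y_i) \right] - \bar{Y}^2 \\
&= \frac{1}{4N^2} \Bigg[ \sum_{i=1}^{N} \frac{Y_i^2}{P_i} + \sum_{i=1}^{N} Y_i^2 P_i + 2 \sum_{i=1}^{N} Y_i^2
+ \sum_{j=1}^{N} \frac{Y_j^2}{P_j} - \sum_{i=1}^{N} P_i^2 \sum_{j=1}^{N} \frac{Y_j^2}{P_j} - \sum_{i=1}^{N} Y_i^2 + \sum_{i=1}^{N} P_i Y_i^2 \\
&\qquad\qquad + 2 Y_{tot}^2 - 2 \sum_{i=1}^{N} Y_i^2 + 2 Y_{tot} \sum_{i=1}^{N} Y_i P_i - 2 \sum_{i=1}^{N} Y_i^2 P_i \Bigg] - \bar{Y}^2 \\
&= \frac{1}{4N^2} \left[ 2 \sum_{i=1}^{N} \frac{Y_i^2}{P_i} - \sum_{i=1}^{N} P_i^2 \sum_{j=1}^{N} \frac{Y_j^2}{P_j} - \sum_{i=1}^{N} Y_i^2 + 2 Y_{tot}^2 + 2 Y_{tot} \sum_{i=1}^{N} Y_i P_i \right] - \bar{Y}^2 .
\end{aligned}
\]
Writing $\bar{Y}^2 = \frac{4 Y_{tot}^2}{4N^2}$ and rearranging,
\[
\begin{aligned}
\operatorname{Var}(\bar{z}) &= \frac{1}{4N^2} \left[ \left( 2 - \sum_{i=1}^{N} P_i^2 \right) \left( \sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y_{tot}^2 \right)
- \left( \sum_{i=1}^{N} Y_i^2 - 2 Y_{tot} \sum_{i=1}^{N} Y_i P_i + Y_{tot}^2 \sum_{i=1}^{N} P_i^2 \right) \right] \\
&= \left( 1 - \frac{1}{2} \sum_{i=1}^{N} P_i^2 \right) \frac{1}{2N^2} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2
- \frac{1}{4N^2} \sum_{i=1}^{N} P_i^2 \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2 ,
\end{aligned}
\]
since $\sum_i \frac{Y_i^2}{P_i} - Y_{tot}^2 = \sum_i P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2$ and
$\sum_i Y_i^2 - 2 Y_{tot} \sum_i Y_i P_i + Y_{tot}^2 \sum_i P_i^2 = \sum_i P_i^2 \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2$.
Equivalently,
\[
\operatorname{Var}(\bar{z}) = \underbrace{\frac{1}{2} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2}_{\text{variance of WR case for } n = 2}
- \underbrace{\left[ \frac{1}{4N^2} \sum_{i=1}^{N} P_i^2 \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2
+ \frac{1}{4N^2} \sum_{i=1}^{N} P_i^2 \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2 \right]}_{\text{reduction of variance relative to WR with varying probability}} .
\]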
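Estimation of $\operatorname{Var}(\bar{z})$
\[
\operatorname{Var}(\bar{z}) = E(\bar{z}^2) - (E(\bar{z}))^2 = E(\bar{z}^2) - \bar{Y}^2 .
\]
Since
\[
E(z_1 z_2) = E\left[ z_1 E(z_2 \mid u_1) \right] = E\left[ z_1 \bar{Y} \right] = \bar{Y} E(z_1) = \bar{Y}^2 ,
\]
consider
\[
E\left[ \bar{z}^2 - z_1 z_2 \right] = E(\bar{z}^2) - E(z_1 z_2) = E(\bar{z}^2) - \bar{Y}^2 = \operatorname{Var}(\bar{z}),
\]
so $\widehat{\operatorname{Var}}(\bar{z}) = \bar{z}^2 - z_1 z_2$ is an unbiased estimator of $\operatorname{Var}(\bar{z})$.
Alternative form
\[
\begin{aligned}
\widehat{\operatorname{Var}}(\bar{z}) &= \bar{z}^2 - z_1 z_2 \\
&= \left( \frac{z_1 + z_2}{2} \right)^2 - z_1 z_2 \\
&= \frac{(z_1 - z_2)^2}{4} \\
&= \frac{1}{4} \left[ \frac{y_1}{N p_1} - \frac{y_1}{N} - \frac{y_2}{N} \frac{(1 - p_1)}{p_2} \right]^2 \\
&= \frac{1}{4N^2} \left[ (1 - p_1) \frac{y_1}{p_1} - \frac{y_2 (1 - p_1)}{p_2} \right]^2 \\
&= \frac{(1 - p_1)^2}{4N^2} \left( \frac{y_1}{p_1} - \frac{y_2}{p_2} \right)^2 .
\end{aligned}
\]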
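For a small population, the two-draw results can be verified exactly by enumerating every ordered sample. The sketch below uses hypothetical function names and an arbitrary illustrative population; it compares the enumerated mean and variance of $\bar{z}$ with $\bar{Y}$, with the expectation of $(z_1 - z_2)^2 / 4$, and with the closed-form variance.

```python
from fractions import Fraction
from itertools import permutations

def des_raj_two(y1, p1, y2, p2, N):
    """Return (z_bar, estimated variance) for the ordered sample (1st draw, 2nd draw)."""
    z1 = y1 / (N * p1)
    z2 = (y1 + y2 * (1 - p1) / p2) / N
    return (z1 + z2) / 2, (z1 - z2) ** 2 / 4

def exact_moments(Y, P):
    """Enumerate ordered pairs (i, j), i != j, each with probability P_i P_j / (1 - P_i)."""
    N = len(Y)
    mean = second = mean_var_hat = 0
    for i, j in permutations(range(N), 2):
        prob = P[i] * P[j] / (1 - P[i])
        z_bar, v_hat = des_raj_two(Y[i], P[i], Y[j], P[j], N)
        mean += prob * z_bar
        second += prob * z_bar ** 2
        mean_var_hat += prob * v_hat
    return mean, second - mean ** 2, mean_var_hat

def closed_form_variance(Y, P):
    """Closed-form Var(z_bar) for two draws, as stated in the text."""
    N, Y_tot = len(Y), sum(Y)
    s1 = sum(p * (y / p - Y_tot) ** 2 for y, p in zip(Y, P))
    s2 = sum(p ** 2 * (y / p - Y_tot) ** 2 for y, p in zip(Y, P))
    return (1 - sum(p ** 2 for p in P) / 2) * s1 / (2 * N ** 2) - s2 / (4 * N ** 2)

# Illustrative population (an assumption, not data from the text).
Y = [3, 1, 4, 1]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
mean, var, mean_var_hat = exact_moments(Y, P)
print(mean, var, mean_var_hat, closed_form_variance(Y, P))
```

Exact rational arithmetic is used so that the comparisons are equalities rather than floating-point approximations.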
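For the general case of n draws, let $(U_{i(1)}, U_{i(2)}, \ldots, U_{i(r)}, \ldots, U_{i(n)})$ be the units selected in the order in which they are drawn,
where $U_{i(r)}$ denotes that the ith unit is drawn at the rth draw. Let $(y_1, y_2, \ldots, y_r, \ldots, y_n)$ and
$(p_1, p_2, \ldots, p_r, \ldots, p_n)$ be the values of the study variable and the corresponding initial probabilities of selection,
respectively. Further, let $P_{i(1)}, P_{i(2)}, \ldots, P_{i(r)}, \ldots, P_{i(n)}$ be the initial probabilities of
$U_{i(1)}, U_{i(2)}, \ldots, U_{i(r)}, \ldots, U_{i(n)}$, respectively.
Further, let
\[
\begin{aligned}
z_1 &= \frac{y_1}{N p_1}, \\
z_r &= \frac{1}{N} \left[ y_1 + y_2 + \cdots + y_{r-1} + \frac{y_r}{p_r} (1 - p_1 - \cdots - p_{r-1}) \right] \quad \text{for } r = 2, 3, \ldots, n.
\end{aligned}
\]
Consider $\bar{z} = \frac{1}{n} \sum_{r=1}^{n} z_r$ as an estimator of the population mean $\bar{Y}$.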
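We already have $E(z_1) = \bar{Y}$. For $r \ge 2$, write
\[
E(z_r) = E_1 \left[ E_2 \left( z_r \mid U_{i(1)}, U_{i(2)}, \ldots, U_{i(r-1)} \right) \right],
\]
where $E_2$ is the conditional expectation after fixing the units $U_{i(1)}, U_{i(2)}, \ldots, U_{i(r-1)}$ drawn in the first (r -
1) draws.
Consider
\[
\begin{aligned}
E\left[ \frac{y_r}{p_r} (1 - p_1 - \cdots - p_{r-1}) \right]
&= E_1 \left[ E_2 \left\{ \frac{y_r}{p_r} (1 - p_1 - \cdots - p_{r-1}) \,\Big|\, U_{i(1)}, U_{i(2)}, \ldots, U_{i(r-1)} \right\} \right] \\
&= E_1 \left[ (1 - P_{i(1)} - P_{i(2)} - \cdots - P_{i(r-1)}) \, E_2 \left( \frac{y_r}{p_r} \,\Big|\, U_{i(1)}, U_{i(2)}, \ldots, U_{i(r-1)} \right) \right].
\end{aligned}
\]
Since conditionally $\frac{y_r}{p_r}$ can take any one of the N - (r - 1) values $\frac{Y_j}{P_j}$, $j = 1, 2, \ldots, N$ (excluding the units already drawn), with probabilities
$\frac{P_j}{1 - P_{i(1)} - P_{i(2)} - \cdots - P_{i(r-1)}}$, so
\[
\begin{aligned}
E\left[ \frac{y_r}{p_r} (1 - p_1 - \cdots - p_{r-1}) \right]
&= E_1 \left[ (1 - P_{i(1)} - P_{i(2)} - \cdots - P_{i(r-1)}) \sum_{j=1}^{N}{}^{*} \frac{Y_j}{P_j} \cdot \frac{P_j}{(1 - P_{i(1)} - P_{i(2)} - \cdots - P_{i(r-1)})} \right] \\
&= E_1 \left[ \sum_{j=1}^{N}{}^{*} Y_j \right],
\end{aligned}
\]
where $\sum_{j=1}^{N}{}^{*}$ denotes that the summation is taken over all the values of y except the y values selected in the first (r - 1) draws,
i.e., $\sum_{j = 1 (\ne i(1), i(2), \ldots, i(r-1))}^{N}$, which excludes the values $y_1, y_2, \ldots, y_{r-1}$
selected in the first (r - 1) draws.
Thus now we can express
\[
\begin{aligned}
E(z_r) &= \frac{1}{N} E_1 E_2 \left[ y_1 + y_2 + \cdots + y_{r-1} + \frac{y_r}{p_r} (1 - p_1 - \cdots - p_{r-1}) \right] \\
&= \frac{1}{N} E_1 \left[ Y_{i(1)} + Y_{i(2)} + \cdots + Y_{i(r-1)} + \sum_{j=1}^{N}{}^{*} Y_j \right] \\
&= \frac{1}{N} E_1 \left[ Y_{i(1)} + Y_{i(2)} + \cdots + Y_{i(r-1)} + \sum_{j = 1 (\ne i(1), i(2), \ldots, i(r-1))}^{N} Y_j \right] \\
&= \frac{1}{N} E_1 \left[ Y_{i(1)} + Y_{i(2)} + \cdots + Y_{i(r-1)} + \left\{ Y_{tot} - Y_{i(1)} - Y_{i(2)} - \cdots - Y_{i(r-1)} \right\} \right] \\
&= \frac{1}{N} E_1 \left[ Y_{tot} \right] = \frac{Y_{tot}}{N} \\
&= \bar{Y} \quad \text{for all } r = 1, 2, \ldots, n.
\end{aligned}
\]
Then
\[
E(\bar{z}) = \frac{1}{n} \sum_{r=1}^{n} E(z_r) = \frac{1}{n} \sum_{r=1}^{n} \bar{Y} = \bar{Y}.
\]
Thus $\bar{z}$ is an unbiased estimator of the population mean $\bar{Y}$.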
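The expression for the variance of $\bar{z}$ in the general case is complex but its estimate is simple.
Estimate of variance:
\[
\operatorname{Var}(\bar{z}) = E(\bar{z}^2) - \bar{Y}^2 .
\]
Consider for $r < s$,
\[
E(z_r z_s) = E\left[ z_r E(z_s \mid U_1, U_2, \ldots, U_{s-1}) \right] = E\left[ z_r \bar{Y} \right] = \bar{Y} E(z_r) = \bar{Y}^2 .
\]
Similarly, for $s < r$,
\[
E(z_r z_s) = E\left[ z_s E(z_r \mid U_1, U_2, \ldots, U_{r-1}) \right] = E\left[ z_s \bar{Y} \right] = \bar{Y} E(z_s) = \bar{Y}^2 .
\]
Consider
\[
E\left[ \frac{1}{n(n-1)} \sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} z_r z_s \right]
= \frac{1}{n(n-1)} \sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} E(z_r z_s)
= \frac{1}{n(n-1)} \, n(n-1) \bar{Y}^2
= \bar{Y}^2 .
\]
Substituting $\bar{Y}^2$ in $\operatorname{Var}(\bar{z})$, we get
\[
\begin{aligned}
\operatorname{Var}(\bar{z}) &= E(\bar{z}^2) - \bar{Y}^2 \\
&= E(\bar{z}^2) - E\left[ \frac{1}{n(n-1)} \sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} z_r z_s \right], \\
\widehat{\operatorname{Var}}(\bar{z}) &= \bar{z}^2 - \frac{1}{n(n-1)} \sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} z_r z_s .
\end{aligned}
\]
Using
\[
\left( \sum_{r=1}^{n} z_r \right)^2 = \sum_{r=1}^{n} z_r^2 + \sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} z_r z_s
\quad \Rightarrow \quad
\sum_{r (\ne s) = 1}^{n} \sum_{s=1}^{n} z_r z_s = n^2 \bar{z}^2 - \sum_{r=1}^{n} z_r^2 ,
\]
we get
\[
\begin{aligned}
\widehat{\operatorname{Var}}(\bar{z}) &= \bar{z}^2 - \frac{1}{n(n-1)} \left[ n^2 \bar{z}^2 - \sum_{r=1}^{n} z_r^2 \right] \\
&= \frac{1}{n(n-1)} \left[ \sum_{r=1}^{n} z_r^2 - n \bar{z}^2 \right] \\
&= \frac{1}{n(n-1)} \sum_{r=1}^{n} (z_r - \bar{z})^2 .
\end{aligned}
\]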
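The general ordered estimator and its variance estimate can be sketched as below. The function names and the small population are hypothetical; the enumeration at the end checks, for n = 3, that $\bar{z}$ is unbiased and that the variance estimate is unbiased for $\operatorname{Var}(\bar{z})$.

```python
from fractions import Fraction
from itertools import permutations

def des_raj(y, p, N):
    """y, p: sample values and initial probabilities in the order of the draws."""
    n = len(y)
    z = []
    for r in range(n):
        drawn_total = sum(y[:r])          # y_1 + ... + y_{r-1}
        remaining_prob = 1 - sum(p[:r])   # 1 - p_1 - ... - p_{r-1}
        z.append((drawn_total + y[r] * remaining_prob / p[r]) / N)
    z_bar = sum(z) / n
    var_hat = sum((zr - z_bar) ** 2 for zr in z) / (n * (n - 1))
    return z_bar, var_hat

def ordered_sample_probability(p):
    """Probability of drawing units with initial probabilities p in this order, WOR."""
    prob, used = 1, 0
    for pr in p:
        prob *= pr / (1 - used)
        used += pr
    return prob

# Illustrative population (an assumption, not data from the text).
Y = [3, 1, 4, 1, 5]
P = [Fraction(k, 15) for k in (1, 2, 3, 4, 5)]
n = 3
mean = second = mean_var_hat = 0
for idx in permutations(range(len(Y)), n):
    y, p = [Y[i] for i in idx], [P[i] for i in idx]
    prob = ordered_sample_probability(p)
    z_bar, var_hat = des_raj(y, p, len(Y))
    mean += prob * z_bar
    second += prob * z_bar ** 2
    mean_var_hat += prob * var_hat
print(mean, second - mean ** 2, mean_var_hat)
```

For n = 2 the function reduces to the two-draw formulas, since $z_1 = y_1 / (N p_1)$ arises from an empty sum and a remaining probability of one.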
Unordered estimator:
In an ordered estimator, the order in which the units are drawn is considered. Corresponding to any ordered
estimator, there exists an unordered estimator which does not depend on the order in which the units are
drawn and has smaller variance than the ordered estimator.
In case of sampling WOR from a population of size N, there are $\binom{N}{n}$ unordered sample(s) of size n.
Corresponding to any unordered sample(s) of size n units, there are n! ordered samples.
For example, for n = 2, if the units are $u_1$ and $u_2$, then
- there are 2! = 2 ordered samples: $(u_1, u_2)$, $(u_2, u_1)$.
Moreover,
Probability of unordered sample $(u_1, u_2)$ = Probability of ordered sample $(u_1, u_2)$ + Probability of ordered
sample $(u_2, u_1)$.
For n = 3, there are three units $u_1, u_2, u_3$ and
- there are the following 3! = 6 ordered samples:
$(u_1, u_2, u_3), (u_1, u_3, u_2), (u_2, u_1, u_3), (u_2, u_3, u_1), (u_3, u_1, u_2), (u_3, u_2, u_1)$.
Moreover,
Probability of unordered sample
= Sum of probabilities of the ordered samples, i.e.,
$P(u_1, u_2, u_3) + P(u_1, u_3, u_2) + P(u_2, u_1, u_3) + P(u_2, u_3, u_1) + P(u_3, u_1, u_2) + P(u_3, u_2, u_1)$.
Let $z_{si}$, $s = 1, 2, \ldots, \binom{N}{n}$, $i = 1, 2, \ldots, n! (= M)$, be an estimator of the population parameter $\theta$ based on the ordered
sample $s_i$. Consider a scheme of selection in which the probability of selecting the ordered sample
$(s_i)$ is $p_{si}$. The probability of getting the unordered sample(s) is the sum of the probabilities, i.e.,
\[
p_s = \sum_{i=1}^{M} p_{si} .
\]
For a population of size N with units denoted as 1, 2, …, N, the samples of size n are n-tuples. In the
nth draw, the sample space will consist of $N(N-1) \cdots (N-n+1)$ ordered sample points.
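If all ordered samples are equally likely, then
\[
\begin{aligned}
p_{sio} &= P[\text{selection of any ordered sample}] = \frac{1}{N(N-1) \cdots (N-n+1)}, \\
p_{siu} &= P[\text{selection of any unordered sample}] = \frac{n!}{N(N-1) \cdots (N-n+1)} = n! \, P[\text{selection of any ordered sample}],
\end{aligned}
\]
then
\[
p_s = \sum_{i=1}^{M (= n!)} p_{sio} = \frac{n! (N-n)!}{N!} = \frac{1}{\binom{N}{n}} .
\]
Theorem: If $\hat{\theta}_0 = z_{si}$, $s = 1, 2, \ldots, \binom{N}{n}$; $i = 1, 2, \ldots, M (= n!)$, and $\hat{\theta}_u = \sum_{i=1}^{M} z_{si} p'_{si}$ are the ordered and unordered
estimators of $\theta$, respectively, then
(i) $E(\hat{\theta}_u) = E(\hat{\theta}_0)$,
(ii) $\operatorname{Var}(\hat{\theta}_u) \le \operatorname{Var}(\hat{\theta}_0)$,
where $z_{si}$ is a function of the $s_i$th ordered sample (hence a random variable), $p_{si}$ is the probability of
selection of the $s_i$th ordered sample and $p'_{si} = \frac{p_{si}}{p_s}$.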
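Proof: Total number of ordered samples = $n! \binom{N}{n}$.
(i)
\[
\begin{aligned}
E(\hat{\theta}_0) &= \sum_{s=1}^{\binom{N}{n}} \sum_{i=1}^{M} z_{si} p_{si}, \\
E(\hat{\theta}_u) &= \sum_{s=1}^{\binom{N}{n}} \left( \sum_{i=1}^{M} z_{si} p'_{si} \right) p_s
= \sum_{s} \sum_{i} z_{si} \frac{p_{si}}{p_s} p_s
= \sum_{s} \sum_{i} z_{si} p_{si}
= E(\hat{\theta}_0).
\end{aligned}
\]
(ii) Since $\hat{\theta}_0 = z_{si}$, so $\hat{\theta}_0^2 = z_{si}^2$ with probability $p_{si}$, $i = 1, 2, \ldots, M$, $s = 1, 2, \ldots, \binom{N}{n}$.
Similarly, $\hat{\theta}_u = \sum_{i=1}^{M} z_{si} p'_{si}$, so $\hat{\theta}_u^2 = \left( \sum_{i=1}^{M} z_{si} p'_{si} \right)^2$ with probability $p_s$.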
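Consider
\[
\begin{aligned}
\operatorname{Var}(\hat{\theta}_0) &= E(\hat{\theta}_0^2) - \left[ E(\hat{\theta}_0) \right]^2
= \sum_{s} \sum_{i} z_{si}^2 p_{si} - \left[ E(\hat{\theta}_0) \right]^2 , \\
\operatorname{Var}(\hat{\theta}_u) &= E(\hat{\theta}_u^2) - \left[ E(\hat{\theta}_u) \right]^2
= \sum_{s} \left( \sum_{i} z_{si} p'_{si} \right)^2 p_s - \left[ E(\hat{\theta}_0) \right]^2 .
\end{aligned}
\]
Then
\[
\begin{aligned}
\operatorname{Var}(\hat{\theta}_0) - \operatorname{Var}(\hat{\theta}_u)
&= \sum_{s} \sum_{i} z_{si}^2 p_{si} - \sum_{s} \left( \sum_{i} z_{si} p'_{si} \right)^2 p_s \\
&= \sum_{s} \sum_{i} z_{si}^2 p_{si} + \sum_{s} \left( \sum_{i} z_{si} p'_{si} \right)^2 p_s
- 2 \sum_{s} \left( \sum_{i} z_{si} p'_{si} \right) \left( \sum_{i} z_{si} p'_{si} \right) p_s \\
&= \sum_{s} \left[ \sum_{i} z_{si}^2 p_{si} + \left( \sum_{i} z_{si} p'_{si} \right)^2 \sum_{i} p_{si}
- 2 \left( \sum_{i} z_{si} p'_{si} \right) \left( \sum_{i} z_{si} p_{si} \right) \right] \\
&= \sum_{s} \sum_{i} \left[ z_{si}^2 + \left( \sum_{i} z_{si} p'_{si} \right)^2 - 2 z_{si} \left( \sum_{i} z_{si} p'_{si} \right) \right] p_{si} \\
&= \sum_{s} \sum_{i} \left( z_{si} - \sum_{i} z_{si} p'_{si} \right)^2 p_{si} \ge 0 ,
\end{aligned}
\]
where we have used $p_s = \sum_i p_{si}$ and $p'_{si} p_s = p_{si}$. Hence
\[
\operatorname{Var}(\hat{\theta}_0) - \operatorname{Var}(\hat{\theta}_u) \ge 0
\quad \text{or} \quad
\operatorname{Var}(\hat{\theta}_u) \le \operatorname{Var}(\hat{\theta}_0).
\]
An estimate of $\operatorname{Var}(\hat{\theta}_u)$ based on this result is
\[
\widehat{\operatorname{Var}}(\hat{\theta}_u) = \sum_{i} p'_{si} \, \widehat{\operatorname{Var}}(\hat{\theta}_0) - \sum_{i} p'_{si} \left( z_{si} - \sum_{i} z_{si} p'_{si} \right)^2 .
\]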
Based on this result, now we use the ordered estimators to construct an unordered estimator. It follows
from this theorem that the unordered estimator will be more efficient than the corresponding ordered
estimators.
Murthy’s unordered estimator corresponding to Des Raj’s ordered estimator for the
sample size 2
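Suppose $y_i$ and $y_j$ are the values of units $U_i$ and $U_j$ selected in the first and second draws, respectively,
with varying probability and WOR in a sample of size 2, and let $p_i$ and $p_j$ be the corresponding initial
probabilities of selection. So now we have two ordered estimates corresponding to the ordered samples
$s_1^*$ and $s_2^*$ as follows:
$s_1^* = (y_i, y_j)$ with $(U_i, U_j)$,
$s_2^* = (y_j, y_i)$ with $(U_j, U_i)$,
\[
\begin{aligned}
z(s_1^*) &= \frac{1}{2N} \left[ (1 + p_i) \frac{y_i}{p_i} + (1 - p_i) \frac{y_j}{p_j} \right]
= \frac{1}{2N} \left[ \frac{y_i}{p_i} + y_i + \frac{y_j (1 - p_i)}{p_j} \right]
\end{aligned}
\]
and
\[
\begin{aligned}
z(s_2^*) &= \frac{1}{2N} \left[ (1 + p_j) \frac{y_j}{p_j} + (1 - p_j) \frac{y_i}{p_i} \right]
= \frac{1}{2N} \left[ \frac{y_j}{p_j} + y_j + \frac{y_i (1 - p_j)}{p_i} \right].
\end{aligned}
\]
The probabilities corresponding to $z(s_1^*)$ and $z(s_2^*)$ are
\[
\begin{aligned}
p(s_1^*) &= \frac{p_i p_j}{1 - p_i}, \qquad
p(s_2^*) = \frac{p_j p_i}{1 - p_j}, \\
p(s) &= p(s_1^*) + p(s_2^*) = \frac{p_i p_j (2 - p_i - p_j)}{(1 - p_i)(1 - p_j)}, \\
p'(s_1^*) &= \frac{p(s_1^*)}{p(s)} = \frac{1 - p_j}{2 - p_i - p_j}, \qquad
p'(s_2^*) = \frac{p(s_2^*)}{p(s)} = \frac{1 - p_i}{2 - p_i - p_j} .
\end{aligned}
\]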
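Murthy's unordered estimate $z(u)$ corresponding to Des Raj's ordered estimate is given as
\[
\begin{aligned}
z(u) &= z(s_1^*) p'(s_1^*) + z(s_2^*) p'(s_2^*) \\
&= \frac{\frac{1}{2N} \left[ (1 + p_i) \frac{y_i}{p_i} + (1 - p_i) \frac{y_j}{p_j} \right] (1 - p_j)
+ \frac{1}{2N} \left[ (1 + p_j) \frac{y_j}{p_j} + (1 - p_j) \frac{y_i}{p_i} \right] (1 - p_i)}{2 - p_i - p_j} \\
&= \frac{\frac{1}{2N} \left[ \left\{ (1 - p_j)(1 + p_i) + (1 - p_j)(1 - p_i) \right\} \frac{y_i}{p_i}
+ \left\{ (1 - p_i)(1 - p_j) + (1 - p_i)(1 + p_j) \right\} \frac{y_j}{p_j} \right]}{2 - p_i - p_j} \\
&= \frac{(1 - p_j) \frac{y_i}{p_i} + (1 - p_i) \frac{y_j}{p_j}}{N (2 - p_i - p_j)} .
\end{aligned}
\]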
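Unbiasedness:
Note that $y_i$ and $p_i$ can take any one of the values out of $Y_1, Y_2, \ldots, Y_N$ and $P_1, P_2, \ldots, P_N$,
respectively. Then $y_j$ and $p_j$ can take any one of the remaining values out of $Y_1, Y_2, \ldots, Y_N$ and
$P_1, P_2, \ldots, P_N$, respectively, i.e., all the values except the values taken at the first draw. Now
\[
\begin{aligned}
E[z(u)] &= \frac{1}{N} \sum_{i < j} \frac{(1 - P_j) \frac{Y_i}{P_i} + (1 - P_i) \frac{Y_j}{P_j}}{2 - P_i - P_j}
\left[ \frac{P_i P_j}{1 - P_i} + \frac{P_i P_j}{1 - P_j} \right] \\
&= \frac{1}{2N} \sum_{i \ne j} \frac{(1 - P_j) \frac{Y_i}{P_i} + (1 - P_i) \frac{Y_j}{P_j}}{2 - P_i - P_j}
\left[ \frac{P_i P_j}{1 - P_i} + \frac{P_j P_i}{1 - P_j} \right] \\
&= \frac{1}{2N} \sum_{i \ne j} \left[ (1 - P_j) \frac{Y_i}{P_i} + (1 - P_i) \frac{Y_j}{P_j} \right] \frac{P_i P_j}{(1 - P_i)(1 - P_j)} \\
&= \frac{1}{2N} \sum_{i \ne j} \left[ \frac{Y_i P_j}{1 - P_i} + \frac{Y_j P_i}{1 - P_j} \right].
\end{aligned}
\]
Using the result $\sum_{i \ne j = 1}^{N} a_i b_j = \sum_{i=1}^{N} a_i \left( \sum_{j=1}^{N} b_j - b_i \right)$, we have
\[
\begin{aligned}
E[z(u)] &= \frac{1}{2N} \left[ \sum_{i=1}^{N} \frac{Y_i}{1 - P_i} \left( \sum_{j=1}^{N} P_j - P_i \right)
+ \sum_{j=1}^{N} \frac{Y_j}{1 - P_j} \left( \sum_{i=1}^{N} P_i - P_j \right) \right] \\
&= \frac{1}{2N} \left[ \sum_{i=1}^{N} \frac{Y_i}{1 - P_i} (1 - P_i) + \sum_{j=1}^{N} \frac{Y_j}{1 - P_j} (1 - P_j) \right] \\
&= \frac{1}{2N} \left[ \sum_{i=1}^{N} Y_i + \sum_{j=1}^{N} Y_j \right] \\
&= \frac{\bar{Y} + \bar{Y}}{2} = \bar{Y}.
\end{aligned}
\]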
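Murthy's estimator for n = 2 depends only on which two units are in the sample, so its exact mean and variance follow from a sum over unordered pairs. The sketch below uses hypothetical names and an illustrative population, and compares the variance with that of the ordered Des Raj estimator computed by enumeration.

```python
from fractions import Fraction
from itertools import combinations, permutations

def murthy_two(yi, pi, yj, pj, N):
    """Murthy's unordered estimate z(u) for the pair {i, j}."""
    return ((1 - pj) * yi / pi + (1 - pi) * yj / pj) / (N * (2 - pi - pj))

def unordered_probability(pi, pj):
    """p(s): probability of obtaining the unordered pair {i, j} in two draws WOR."""
    return pi * pj * (2 - pi - pj) / ((1 - pi) * (1 - pj))

def des_raj_z_bar(y1, p1, y2, p2, N):
    """Ordered Des Raj estimate for (1st draw, 2nd draw)."""
    return (y1 / (N * p1) + (y1 + y2 * (1 - p1) / p2) / N) / 2

# Illustrative population (an assumption, not data from the text).
Y = [3, 1, 4, 1]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
N = len(Y)
Y_bar = Fraction(sum(Y), N)

pairs = list(combinations(range(N), 2))
probs = [unordered_probability(P[i], P[j]) for i, j in pairs]
ests = [murthy_two(Y[i], P[i], Y[j], P[j], N) for i, j in pairs]
murthy_mean = sum(pr * z for pr, z in zip(probs, ests))
murthy_var = sum(pr * (z - Y_bar) ** 2 for pr, z in zip(probs, ests))

des_raj_var = sum(
    P[i] * P[j] / (1 - P[i]) * (des_raj_z_bar(Y[i], P[i], Y[j], P[j], N) - Y_bar) ** 2
    for i, j in permutations(range(N), 2)
)
print(murthy_mean, murthy_var, des_raj_var)
```

The comparison of the two variances illustrates the theorem on ordered and unordered estimators for this particular population.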
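Variance: The variance of $z(u)$ can be found as
\[
\begin{aligned}
\operatorname{Var}[z(u)] &= \frac{1}{2} \sum_{i \ne j = 1}^{N} \frac{(1 - P_i - P_j)(1 - P_i)(1 - P_j)}{N^2 (2 - P_i - P_j)^2}
\left( \frac{Y_i}{P_i} - \frac{Y_j}{P_j} \right)^2 \frac{P_i P_j (2 - P_i - P_j)}{(1 - P_i)(1 - P_j)} \\
&= \frac{1}{2} \sum_{i \ne j = 1}^{N} \frac{P_i P_j (1 - P_i - P_j)}{N^2 (2 - P_i - P_j)} \left( \frac{Y_i}{P_i} - \frac{Y_j}{P_j} \right)^2 .
\end{aligned}
\]
Unbiased estimator of $V[z(u)]$
Reading off the summand of the first expression above, whose second factor is the probability $p(s)$ of the unordered sample, an unbiased estimator of the variance is
\[
\widehat{\operatorname{Var}}[z(u)] = \frac{(1 - p_i - p_j)(1 - p_i)(1 - p_j)}{N^2 (2 - p_i - p_j)^2} \left( \frac{y_i}{p_i} - \frac{y_j}{p_j} \right)^2 .
\]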
Suppose $y_i$ is the value of the characteristic under study for the ith unit and a sample of size n is drawn by WOR using arbitrary probabilities
of selection at each draw.
Thus prior to each succeeding draw, there is defined a new probability distribution for the units available
at that draw. The probability distribution at each draw may or may not depend upon the initial
probability at the first draw.
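Let $\alpha_i$ be an indicator variable which takes the value 1 if the ith unit $Y_i$ is included in the sample s and 0 otherwise. Define
\[
z_i = \frac{n y_i}{N E(\alpha_i)}, \quad i = 1, \ldots, N, \text{ assuming } E(\alpha_i) > 0 \text{ for all } i,
\]
where
\[
E(\alpha_i) = 1 \cdot P(Y_i \in s) + 0 \cdot P(Y_i \notin s) = \pi_i
\]
is the probability of including the unit i in the sample and is called the inclusion probability.
The HT estimator of the population mean is
\[
\bar{z}_n = \hat{\bar{Y}}_{HT} = \frac{1}{n} \sum_{i=1}^{n} z_i = \frac{1}{n} \sum_{i=1}^{N} \alpha_i z_i .
\]
Unbiasedness
\[
\begin{aligned}
E(\hat{\bar{Y}}_{HT}) &= \frac{1}{n} \sum_{i=1}^{N} E(z_i \alpha_i)
= \frac{1}{n} \sum_{i=1}^{N} z_i E(\alpha_i) \\
&= \frac{1}{n} \sum_{i=1}^{N} \frac{n y_i}{N E(\alpha_i)} E(\alpha_i)
= \frac{1}{n} \sum_{i=1}^{N} \frac{n y_i}{N} = \bar{Y},
\end{aligned}
\]
which shows that the HT estimator is an unbiased estimator of the population mean.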
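Variance
\[
V(\hat{\bar{Y}}_{HT}) = V(\bar{z}_n) = E(\bar{z}_n^2) - \left[ E(\bar{z}_n) \right]^2 = E(\bar{z}_n^2) - \bar{Y}^2 .
\]
Consider
\[
\begin{aligned}
E(\bar{z}_n^2) &= \frac{1}{n^2} E\left[ \sum_{i=1}^{N} \alpha_i z_i \right]^2 \\
&= \frac{1}{n^2} E\left[ \sum_{i=1}^{N} \alpha_i^2 z_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j z_i z_j \right] \\
&= \frac{1}{n^2} \left[ \sum_{i=1}^{N} z_i^2 E(\alpha_i^2) + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} z_i z_j E(\alpha_i \alpha_j) \right].
\end{aligned}
\]
If $S = \{s\}$ is the set of all possible samples and $\pi_i$ is the probability of selection of the ith unit in the sample s,
then
\[
\begin{aligned}
E(\alpha_i) &= 1 \cdot P(y_i \in s) + 0 \cdot P(y_i \notin s) = 1 \cdot \pi_i + 0 \cdot (1 - \pi_i) = \pi_i , \\
E(\alpha_i^2) &= 1^2 \cdot P(y_i \in s) + 0^2 \cdot P(y_i \notin s) = \pi_i .
\end{aligned}
\]
So $E(\alpha_i) = E(\alpha_i^2)$ and
\[
E(\bar{z}_n^2) = \frac{1}{n^2} \left[ \sum_{i=1}^{N} z_i^2 \pi_i + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \pi_{ij} z_i z_j \right],
\]
where $\pi_{ij}$ is the probability of inclusion of the ith and jth units in the sample. This is called the second order
inclusion probability.
Now
\[
\begin{aligned}
\bar{Y}^2 &= \left[ E(\bar{z}_n) \right]^2
= \frac{1}{n^2} \left[ E\left( \sum_{i=1}^{N} \alpha_i z_i \right) \right]^2 \\
&= \frac{1}{n^2} \left[ \sum_{i=1}^{N} z_i^2 \left[ E(\alpha_i) \right]^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} z_i z_j E(\alpha_i) E(\alpha_j) \right] \\
&= \frac{1}{n^2} \left[ \sum_{i=1}^{N} z_i^2 \pi_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \pi_i \pi_j z_i z_j \right].
\end{aligned}
\]
Thus
\[
\begin{aligned}
\operatorname{Var}(\hat{\bar{Y}}_{HT}) &= \frac{1}{n^2} \left[ \sum_{i=1}^{N} \pi_i z_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \pi_{ij} z_i z_j \right]
- \frac{1}{n^2} \left[ \sum_{i=1}^{N} \pi_i^2 z_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \pi_i \pi_j z_i z_j \right] \\
&= \frac{1}{n^2} \left[ \sum_{i=1}^{N} \pi_i (1 - \pi_i) z_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_{ij} - \pi_i \pi_j) z_i z_j \right] \\
&= \frac{1}{n^2} \left[ \sum_{i=1}^{N} \pi_i (1 - \pi_i) \frac{n^2 y_i^2}{N^2 \pi_i^2} + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_{ij} - \pi_i \pi_j) \frac{n^2 y_i y_j}{N^2 \pi_i \pi_j} \right] \\
&= \frac{1}{N^2} \left[ \sum_{i=1}^{N} \left( \frac{1 - \pi_i}{\pi_i} \right) y_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \left( \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} \right) y_i y_j \right].
\end{aligned}
\]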
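The HT estimator only needs the first and second order inclusion probabilities of the design. As an illustrative sketch (hypothetical names; the design chosen here is two pps draws without replacement, which is an assumption made only for the example), the inclusion probabilities are obtained by enumeration and the variance formula is compared with the variance computed directly over all samples.

```python
from fractions import Fraction
from itertools import permutations

def design_two_draws(P):
    """Unordered samples {i, j} and their probabilities under two pps draws WOR."""
    design = {}
    for i, j in permutations(range(len(P)), 2):
        key = frozenset((i, j))
        design[key] = design.get(key, 0) + P[i] * P[j] / (1 - P[i])
    return design

def inclusion_probabilities(design, N):
    """First order pi[i] and second order pij[(i, j)] inclusion probabilities."""
    pi = [sum(pr for s, pr in design.items() if i in s) for i in range(N)]
    pij = {(i, j): sum(pr for s, pr in design.items() if i in s and j in s)
           for i in range(N) for j in range(N) if i != j}
    return pi, pij

def ht_mean(sample, Y, pi):
    """HT estimate of the population mean: (1/N) * sum over the sample of y_i / pi_i."""
    return sum(Y[i] / pi[i] for i in sample) / len(Y)

def ht_variance(Y, pi, pij):
    """Variance of the HT estimator of the mean from the formula in the text."""
    N = len(Y)
    first = sum((1 - pi[i]) / pi[i] * Y[i] ** 2 for i in range(N))
    second = sum((pij[i, j] - pi[i] * pi[j]) / (pi[i] * pi[j]) * Y[i] * Y[j]
                 for (i, j) in pij)
    return (first + second) / N ** 2

# Illustrative population (an assumption, not data from the text).
Y = [3, 1, 4, 1]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
design = design_two_draws(P)
pi, pij = inclusion_probabilities(design, len(Y))
Y_bar = Fraction(sum(Y), len(Y))
mean = sum(pr * ht_mean(s, Y, pi) for s, pr in design.items())
var = sum(pr * (ht_mean(s, Y, pi) - Y_bar) ** 2 for s, pr in design.items())
print(mean, var, ht_variance(Y, pi, pij))
```

Any other fixed-size design could be substituted by supplying its own dictionary of samples and probabilities.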
Estimate of variance
\[
\hat{V}_1 = \widehat{\operatorname{Var}}(\hat{\bar{Y}}_{HT}) = \frac{1}{N^2} \left[ \sum_{i=1}^{n} \frac{y_i^2 (1 - \pi_i)}{\pi_i^2}
+ \sum_{i (\ne j) = 1}^{n} \sum_{j=1}^{n} \left( \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \right) \frac{y_i y_j}{\pi_i \pi_j} \right].
\]
This is an unbiased estimator of the variance.
Drawback: It does not reduce to zero when all $\frac{y_i}{\pi_i}$ are the same, i.e., when $y_i \propto \pi_i$.
Consequently, this may assume negative values for some samples.
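A more elegant expression for the variance of $\hat{\bar{Y}}_{HT}$ has been obtained by Yates and Grundy.
Since there are exactly n values of $\alpha_i$ which are 1 and (N - n) values which are zero, so
\[
\sum_{i=1}^{N} \alpha_i = n .
\]
Taking expectation on both sides,
\[
\sum_{i=1}^{N} E(\alpha_i) = n .
\]
Also
\[
\begin{aligned}
E\left[ \left( \sum_{i=1}^{N} \alpha_i \right)^2 \right] &= \sum_{i=1}^{N} E(\alpha_i^2) + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} E(\alpha_i \alpha_j) \\
E(n^2) &= \sum_{i=1}^{N} E(\alpha_i) + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} E(\alpha_i \alpha_j) \quad (\text{using } E(\alpha_i) = E(\alpha_i^2)) \\
n^2 &= n + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} E(\alpha_i \alpha_j) \\
\sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} E(\alpha_i \alpha_j) &= n(n-1).
\end{aligned}
\]
Thus
\[
\begin{aligned}
E(\alpha_i \alpha_j) &= P(\alpha_i = 1, \alpha_j = 1) \\
&= P(\alpha_i = 1) P(\alpha_j = 1 \mid \alpha_i = 1) \\
&= E(\alpha_i) E(\alpha_j \mid \alpha_i = 1).
\end{aligned}
\]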
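Therefore
\[
\begin{aligned}
\sum_{j (\ne i) = 1}^{N} \left[ E(\alpha_i \alpha_j) - E(\alpha_i) E(\alpha_j) \right]
&= \sum_{j (\ne i) = 1}^{N} \left[ E(\alpha_i) E(\alpha_j \mid \alpha_i = 1) - E(\alpha_i) E(\alpha_j) \right] \\
&= E(\alpha_i) \sum_{j (\ne i) = 1}^{N} \left[ E(\alpha_j \mid \alpha_i = 1) - E(\alpha_j) \right] \\
&= E(\alpha_i) \left[ (n - 1) - (n - E(\alpha_i)) \right] \\
&= -E(\alpha_i) \left[ 1 - E(\alpha_i) \right] \\
&= -\pi_i (1 - \pi_i). \qquad (1)
\end{aligned}
\]
Similarly
\[
\sum_{i (\ne j) = 1}^{N} \left[ E(\alpha_i \alpha_j) - E(\alpha_i) E(\alpha_j) \right] = -\pi_j (1 - \pi_j). \qquad (2)
\]
We had earlier derived the variance of the HT estimator as
\[
\operatorname{Var}(\hat{\bar{Y}}_{HT}) = \frac{1}{n^2} \left[ \sum_{i=1}^{N} \pi_i (1 - \pi_i) z_i^2 + \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_{ij} - \pi_i \pi_j) z_i z_j \right].
\]
Using (1) and (2) in this expression, we get
\[
\begin{aligned}
\operatorname{Var}(\hat{\bar{Y}}_{HT}) &= \frac{1}{2n^2} \left[ \sum_{i=1}^{N} \pi_i (1 - \pi_i) z_i^2 + \sum_{j=1}^{N} \pi_j (1 - \pi_j) z_j^2
- 2 \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) z_i z_j \right] \\
&= \frac{1}{2n^2} \Bigg[ -\sum_{i=1}^{N} \sum_{j (\ne i) = 1}^{N} \left\{ E(\alpha_i \alpha_j) - E(\alpha_i) E(\alpha_j) \right\} z_i^2
- \sum_{j=1}^{N} \sum_{i (\ne j) = 1}^{N} \left\{ E(\alpha_i \alpha_j) - E(\alpha_i) E(\alpha_j) \right\} z_j^2 \\
&\qquad\qquad - 2 \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} \left\{ E(\alpha_i) E(\alpha_j) - E(\alpha_i \alpha_j) \right\} z_i z_j \Bigg] \\
&= \frac{1}{2n^2} \left[ \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) z_i^2
+ \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) z_j^2
- 2 \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) z_i z_j \right] \\
&= \frac{1}{2n^2} \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) (z_i^2 + z_j^2 - 2 z_i z_j) \\
&= \frac{1}{2n^2} \sum_{i (\ne j) = 1}^{N} \sum_{j=1}^{N} (\pi_i \pi_j - \pi_{ij}) (z_i - z_j)^2 .
\end{aligned}
\]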
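The expressions for $\pi_i$ and $\pi_{ij}$ can be written for any given sample size.
For example, for n = 2, assume that at the second draw, the probability of selecting a unit from the units
available is proportional to the probability of selecting it at the first draw. Then
\[
E(\alpha_i) = \text{Probability of selecting } Y_i \text{ in a sample of two} = P_{i1} + P_{i2},
\]
where $P_{ir}$ is the probability of selecting $Y_i$ at the rth draw (r = 1, 2). If $P_i$ is the probability of selecting the
ith unit at the first draw (i = 1, 2, …, N), then we had earlier derived that
\[
\begin{aligned}
P_{i1} &= P_i , \\
P_{i2} &= P\left[ y_i \text{ is not selected at 1st draw} \right] P\left[ y_i \text{ is selected at 2nd draw} \mid y_i \text{ is not selected at 1st draw} \right] \\
&= \sum_{j (\ne i) = 1}^{N} \frac{P_j P_i}{1 - P_j} \\
&= \left[ \sum_{j=1}^{N} \frac{P_j}{1 - P_j} - \frac{P_i}{1 - P_i} \right] P_i .
\end{aligned}
\]
So
\[
E(\alpha_i) = \pi_i = P_i \left[ \sum_{j=1}^{N} \frac{P_j}{1 - P_j} - \frac{P_i}{1 - P_i} \right] + P_i .
\]
Again
\[
\begin{aligned}
E(\alpha_i \alpha_j) &= \text{Probability of including both } y_i \text{ and } y_j \text{ in a sample of size two} \\
&= P_{i1} P_{j2|i} + P_{j1} P_{i2|j} \\
&= P_i \frac{P_j}{1 - P_i} + P_j \frac{P_i}{1 - P_j} \\
&= P_i P_j \left[ \frac{1}{1 - P_i} + \frac{1}{1 - P_j} \right].
\end{aligned}
\]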
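Estimate of Variance
The estimate of variance is given by
\[
\widehat{\operatorname{Var}}(\hat{\bar{Y}}_{HT}) = \frac{1}{2n^2} \sum_{i (\ne j) = 1}^{n} \sum_{j=1}^{n} \left( \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \right) (z_i - z_j)^2 .
\]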
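Both variance estimates can be written directly in terms of $y_i / \pi_i$, since $z_i = n y_i / (N \pi_i)$. The sketch below uses hypothetical function names and, purely as an illustrative assumption, takes SRSWOR as the design, for which $\pi_i = n/N$ and $\pi_{ij} = n(n-1) / (N(N-1))$ and the variance of the estimator is known to be $(1 - n/N) S^2 / n$.

```python
from fractions import Fraction
from itertools import combinations

def v1_estimate(sample, Y, pi, pij):
    """First variance estimate of the HT estimator of the mean."""
    N = len(Y)
    first = sum(Y[i] ** 2 * (1 - pi[i]) / pi[i] ** 2 for i in sample)
    second = sum((pij[i, j] - pi[i] * pi[j]) / pij[i, j] * Y[i] * Y[j] / (pi[i] * pi[j])
                 for i in sample for j in sample if i != j)
    return (first + second) / N ** 2

def yates_grundy_estimate(sample, Y, pi, pij):
    """Yates-Grundy form, using (z_i - z_j)^2 = (n/N)^2 (y_i/pi_i - y_j/pi_j)^2."""
    N = len(Y)
    total = sum((pi[i] * pi[j] - pij[i, j]) / pij[i, j] * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
                for i in sample for j in sample if i != j)
    return total / (2 * N ** 2)

# Illustrative population and SRSWOR design (assumptions, not data from the text).
Y = [1, 2, 3, 6]
N, n = len(Y), 2
pi = [Fraction(n, N)] * N
pij = {(i, j): Fraction(n * (n - 1), N * (N - 1))
       for i in range(N) for j in range(N) if i != j}

samples = list(combinations(range(N), n))   # equally likely under SRSWOR
mean_v1 = sum(v1_estimate(s, Y, pi, pij) for s in samples) / len(samples)
mean_yg = sum(yates_grundy_estimate(s, Y, pi, pij) for s in samples) / len(samples)
print(mean_v1, mean_yg)
```

For a design in which $\pi_i \pi_j \ge \pi_{ij}$ for every pair, each term of the Yates and Grundy form is non-negative, which is its practical advantage over the first estimate.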
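Midzuno system of sampling:
Under this system of selection of probabilities, the unit in the first draw is selected with unequal
probabilities of selection (i.e., pps) and all the remaining units are selected with SRSWOR at all
subsequent draws.
Under this system, unit i is included either at the first draw, with probability $P_i$, or, when it is not drawn first, among the
remaining (n - 1) units drawn by SRSWOR from (N - 1) units, so that
\[
E(\alpha_i) = \pi_i = P_i + (1 - P_i) \frac{n - 1}{N - 1} = \frac{N - n}{N - 1} P_i + \frac{n - 1}{N - 1} .
\]
The probability of including both $U_i$ and $U_j$ in the sample is
\[
E(\alpha_i \alpha_j) = \pi_{ij} = \frac{(n - 1)}{(N - 1)} \left[ \frac{N - n}{N - 2} (P_i + P_j) + \frac{n - 2}{N - 2} \right].
\]
Similarly,
\[
\begin{aligned}
E(\alpha_i \alpha_j \alpha_k) = \pi_{ijk} &= \text{Probability of including } U_i, U_j \text{ and } U_k \text{ in the sample} \\
&= \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)} \left[ \frac{N - n}{N - 3} (P_i + P_j + P_k) + \frac{n - 3}{N - 3} \right].
\end{aligned}
\]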
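By an extension of this argument, if $U_i, U_j, \ldots, U_r$ are r units in the sample of size n (r < n), the
probability of including these r units in the sample is
\[
E(\alpha_i \alpha_j \cdots \alpha_r) = \pi_{ij \ldots r} = \frac{(n - 1)(n - 2) \cdots (n - r + 1)}{(N - 1)(N - 2) \cdots (N - r + 1)}
\left[ \frac{N - n}{N - r} (P_i + P_j + \cdots + P_r) + \frac{n - r}{N - r} \right].
\]
Similarly, if $U_i, U_j, \ldots, U_q$ are the n units of the sample, the probability of including these units in the sample is
\[
\begin{aligned}
E(\alpha_i \alpha_j \cdots \alpha_q) = \pi_{ij \ldots q} &= \frac{(n - 1)(n - 2) \cdots 1}{(N - 1)(N - 2) \cdots (N - n + 1)} (P_i + P_j + \cdots + P_q) \\
&= \frac{1}{\binom{N - 1}{n - 1}} (P_i + P_j + \cdots + P_q),
\end{aligned}
\]
which is obtained by substituting r = n.
Thus if the $P_i$'s are proportional to some measure of size of the units in the population, then the probability of
selecting a specified sample is proportional to the total measure of the size of the units included in the
sample.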
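Substituting these $\pi_i$, $\pi_{ij}$, $\pi_{ijk}$ etc. in the HT estimator, we can obtain the estimator of the population mean
and its variance. In particular, an unbiased estimate of the variance of the HT estimator is
\[
\widehat{\operatorname{Var}}(\hat{\bar{Y}}_{HT}) = \frac{1}{2n^2} \sum_{i (\ne j) = 1}^{n} \sum_{j=1}^{n} \left( \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \right) (z_i - z_j)^2 ,
\]
where
\[
\pi_i \pi_j - \pi_{ij} = \frac{N - n}{(N - 1)^2} \left[ (N - n) P_i P_j + \frac{n - 1}{N - 2} (1 - P_i - P_j) \right].
\]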
The main advantage of this method of sampling is that it is possible to compute a set of revised
probabilities of selection such that the inclusion probabilities resulting from the revised probabilities are
proportional to the initial probabilities of selection. It is desirable to do so since the initial probabilities
can be chosen proportional to some measure of size.
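The Midzuno inclusion probabilities can be checked against a direct enumeration of all samples. In the sketch below the function names and the population are hypothetical; the sample probability is taken as the sum of the initial probabilities of the sampled units divided by $\binom{N-1}{n-1}$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def midzuno_sample_probability(sample, P):
    """Probability of obtaining the (unordered) sample under the Midzuno scheme."""
    N, n = len(P), len(sample)
    return sum(P[i] for i in sample) / comb(N - 1, n - 1)

def midzuno_pi(P, n):
    """First order inclusion probabilities pi_i."""
    N = len(P)
    return [Fraction(N - n, N - 1) * p + Fraction(n - 1, N - 1) for p in P]

def midzuno_pij(P, n, i, j):
    """Second order inclusion probability pi_ij."""
    N = len(P)
    return Fraction(n - 1, N - 1) * (
        Fraction(N - n, N - 2) * (P[i] + P[j]) + Fraction(n - 2, N - 2))

# Illustrative initial probabilities (an assumption, not data from the text).
P = [Fraction(k, 15) for k in (1, 2, 3, 4, 5)]
N, n = len(P), 3
samples = list(combinations(range(N), n))
probs = {s: midzuno_sample_probability(s, P) for s in samples}

pi_enum = [sum(pr for s, pr in probs.items() if i in s) for i in range(N)]
pij_enum = sum(pr for s, pr in probs.items() if 0 in s and 1 in s)
print(pi_enum == midzuno_pi(P, n), pij_enum == midzuno_pij(P, n, 0, 1))
```

The same enumeration can be used to confirm the expression for $\pi_i \pi_j - \pi_{ij}$ before plugging it into the variance estimate.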