Sur15 3 Sol

Uploaded by

Suhail Wani

STAT3014/3914 Applied Stat.-Sampling C3-Ratio & reg est.

3 Ratio and regression estimators

3.1 Motivating examples

Frequently, we are interested in measuring the ratio of a matched pair
of variables. This occurs when the sampling unit comprises a group or
cluster of individuals, and our interest is in the population mean per
individual.
For example, to estimate the average income per adult in the population from a
household survey, we record for the $i$th household ($i = 1, \dots, n$) the
number of adults who live there, $x_i$, and the household income, $y_i$.
Then the parameter, average income per adult in the population,
$$R = \frac{\text{total household income}}{\text{total no. of adults}} = \frac{\sum_{i=1}^{N} Y_i}{\sum_{i=1}^{N} X_i}$$

can be estimated by the ratio estimator

$$\hat{R} = r = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} = \frac{\bar{y}}{\bar{x}}.$$

Relationship between estimates

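Ratio to mean to total:
$$R \xrightarrow{\;\times \bar{X}\;} \bar{Y} \xrightarrow{\;\times N\;} Y$$
Ratio to total directly:
$$R \xrightarrow{\;\times X\;} Y$$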

SydU STAT3014 (2015) Second semester Dr. J. Chan 34


3.2 Two characteristics per unit in SRS

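Theorem: If $X_i$ and $Y_i$ are a pair of numerical characteristics defined
on every unit of the population, and $\bar{y}$ and $\bar{x}$ are the corresponding
means from an SRS without replacement of size $n$, then
$$\mathrm{Cov}(\bar{x}, \bar{y}) = \left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{\sum_{i=1}^{N}(Y_i - \bar{Y})(X_i - \bar{X})}{N-1}\right] = \left(1 - \frac{n}{N}\right)\frac{S_{xy}}{n} \tag{1}$$
and
$$E\left[\frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{n-1}\right] = \frac{\sum_{i=1}^{N}(Y_i - \bar{Y})(X_i - \bar{X})}{N-1}. \tag{2}$$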
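Proof. Consider $U_i = X_i + Y_i$, with corresponding sample values
$u_i = x_i + y_i$. Clearly
$$\mathrm{Var}(\bar{u}) = \left(1 - \frac{n}{N}\right)\frac{S_U^2}{n} = \left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{\sum_{i=1}^{N}(X_i - \bar{X} + Y_i - \bar{Y})^2}{N-1}\right]$$
$$= \left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{\sum_{i=1}^{N}(X_i - \bar{X})^2 + \sum_{i=1}^{N}(Y_i - \bar{Y})^2 + 2\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{N-1}\right]$$
$$= \mathrm{Var}(\bar{x}) + \mathrm{Var}(\bar{y}) + \frac{2}{n}\left[\frac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{N-1}\right]\left(1 - \frac{n}{N}\right).$$

Since $\mathrm{Var}(\bar{u}) = \mathrm{Var}(\bar{x} + \bar{y}) = \mathrm{Var}(\bar{x}) + \mathrm{Var}(\bar{y}) + 2\,\mathrm{Cov}(\bar{x}, \bar{y})$, (1) is
proved. (2) can be proved in a similar way.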
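
Result (1) can be checked numerically on a very small population by listing every possible SRS. The sketch below does this for a hypothetical population of $N = 5$ units with $n = 3$; the values of `X` and `Y` are illustrative assumptions, not data from these notes.

```python
from itertools import combinations

# Hypothetical toy population (illustrative values, not from the notes).
X = [2.0, 3.0, 5.0, 7.0, 8.0]
Y = [5.0, 6.0, 11.0, 13.0, 18.0]
N, n = len(X), 3


def mean(values):
    return sum(values) / len(values)


X_bar, Y_bar = mean(X), mean(Y)

# Population covariance S_xy with divisor N - 1.
S_xy = sum((x - X_bar) * (y - Y_bar) for x, y in zip(X, Y)) / (N - 1)

# All C(N, n) samples are equally likely under SRS without replacement.
samples = list(combinations(range(N), n))
x_bars = [mean([X[i] for i in s]) for s in samples]
y_bars = [mean([Y[i] for i in s]) for s in samples]

# Exact covariance of the sample means over the sampling distribution,
# using E(x_bar) = X_bar and E(y_bar) = Y_bar.
cov_exact = mean([(xb - X_bar) * (yb - Y_bar) for xb, yb in zip(x_bars, y_bars)])

# Right-hand side of (1).
cov_formula = (1 - n / N) * S_xy / n

print(cov_exact, cov_formula)
```

The two printed numbers should agree up to floating-point rounding, since (1) is an exact identity rather than a large-sample approximation.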

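Theorem: For large samples,

(a) $E(r) - R \approx 0$, i.e. $r$ is approximately unbiased,

(b) $$\mathrm{Var}(r) \approx \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{\sum_{i=1}^{N}(Y_i - RX_i)^2}{N-1}\right] = \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{S_r^2}{n}.$$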


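Proof:
(a) Recall $E(\bar{y}) = \bar{Y}$, $E(\bar{x}) = \bar{X}$ and $\mathrm{Var}(\bar{x}) = O(n^{-1})$ (order of $n^{-1}$).
Thus for large samples,
$$E(r) = E\left(\frac{\bar{y}}{\bar{x}}\right) \approx \frac{E(\bar{y})}{\bar{X}} = R.$$
(b) Note that
$$r - R = \frac{\bar{y}}{\bar{x}} - R \approx \frac{\bar{y} - R\bar{x}}{\bar{X}}.$$
Thus, for large samples,
$$\mathrm{Var}(r) = E[(r - R)^2] \approx \frac{1}{\bar{X}^2}E[(\bar{y} - R\bar{x})^2] = \frac{E(\bar{d}^2)}{\bar{X}^2} = \frac{\mathrm{Var}(\bar{d})}{\bar{X}^2}$$
where $\bar{d} = \bar{y} - R\bar{x}$ is the sample mean of $d_i = y_i - Rx_i$, $i = 1, \dots, n$,
drawn from the population of $D_i = Y_i - RX_i$, $i = 1, \dots, N$, with
$$E(\bar{d}) = E(\bar{y} - R\bar{x}) = E(\bar{y}) - RE(\bar{x}) = \bar{Y} - R\bar{X} = \bar{Y} - \frac{\bar{Y}}{\bar{X}}\bar{X} = 0.$$
For an SRS of $d_i$,
$$\mathrm{Var}(\bar{d}) = \left(1 - \frac{n}{N}\right)\frac{S_r^2}{n}$$
where
$$S_r^2 = \frac{1}{N-1}\sum_{i=1}^{N}(D_i - \bar{D})^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - RX_i)^2.$$
Hence
$$\mathrm{Var}(r) \approx \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{1}{N-1}\sum_{i=1}^{N}(Y_i - RX_i)^2\right] = \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{S_r^2}{n},$$
which is estimated by
$$\mathrm{var}(r) \approx \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\frac{1}{n-1}\sum_{i=1}^{n}(y_i - rx_i)^2\right] = \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}.$$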

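[Figure: two scatter plots of $y$ against $x$, showing the residuals used by the ordinary and the ratio estimators.]

1. Ordinary: $x$ not related to $y$. $\hat{\bar{Y}} = \bar{y}$ and $\mathrm{var}(\hat{\bar{Y}}) = \left(1 - \frac{n}{N}\right)\frac{s_y^2}{n}$.
   Solid line: $y_i - \bar{y}$, the deviation of each point from the horizontal line $y = \bar{y}$, with
   $$s_y^2 = \frac{\sum_i (y_i - \bar{y})^2}{n-1}.$$

2. Ratio: $x$ positively related to $y$. $\hat{\bar{Y}}_r = \bar{y}\,\frac{\bar{X}}{\bar{x}}$ and $\mathrm{var}(\hat{\bar{Y}}_r) = \left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}$.
   Solid line: $z_i = y_i - rx_i$, the deviation of each point from the line $y = rx$ through the origin ($a = 0$, $b = r$), which passes through $(\bar{X}, r\bar{X})$, with
   $$s_r^2 = \frac{\sum_i z_i^2}{n-1} = s_y^2 - 2r\hat{\rho}s_xs_y + r^2s_x^2 < s_y^2.$$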

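Calculation of $s_r^2$:
$$s_r^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - rx_i)^2$$
$$= \frac{1}{n-1}\sum_{i=1}^{n}[(y_i - \bar{y}) - r(x_i - \bar{x})]^2 \quad \text{since } \bar{y} - r\bar{x} = \bar{y} - \frac{\bar{y}}{\bar{x}}\bar{x} = 0$$
$$= \frac{1}{n-1}\left[\sum_{i=1}^{n}(y_i - \bar{y})^2 - 2r\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) + r^2\sum_{i=1}^{n}(x_i - \bar{x})^2\right]$$
$$= s_y^2 - 2rs_{xy} + r^2s_x^2 = s_y^2 - 2r\hat{\rho}s_xs_y + r^2s_x^2$$
or
$$s_r^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n}y_i^2 - 2r\sum_{i=1}^{n}x_iy_i + r^2\sum_{i=1}^{n}x_i^2\right).$$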
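
The three expressions for $s_r^2$ above are algebraically identical, which is easy to confirm numerically. The following sketch uses a small hypothetical sample (the values of `x` and `y` are assumptions chosen only for illustration) and evaluates the residual form, the moment form and the raw-sums form.

```python
# Hypothetical sample (illustrative values, not from the notes).
x = [4.0, 6.0, 9.0, 11.0]
y = [10.0, 13.0, 22.0, 25.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
r = y_bar / x_bar  # ratio of means

# Sample variances and covariance with divisor n - 1.
s2_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Form 1: residuals about the line y = r x through the origin.
s2_r_resid = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)

# Form 2: in terms of s_y^2, s_xy and s_x^2.
s2_r_moments = s2_y - 2 * r * s_xy + r ** 2 * s2_x

# Form 3: in terms of raw sums of squares and cross-products.
sum_y2 = sum(yi ** 2 for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
s2_r_raw = (sum_y2 - 2 * r * sum_xy + r ** 2 * sum_x2) / (n - 1)

print(s2_r_resid, s2_r_moments, s2_r_raw)
```

The raw-sums form is convenient for hand calculation from summary totals, while the residual form tends to be less sensitive to rounding.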


Remarks:
1. If $X_i$ and $Y_i$ are positively related, we have $s_r^2 < s_y^2$. Hence $X_i$ can be
used as an auxiliary variable which provides additional information
and hence improves the precision of the estimate of $\bar{Y}$.
2. If $\bar{X}$ is unknown and is replaced by $\bar{x}$, the ratio estimator reduces to the ordinary estimator $\bar{y}$.
3. When ratio estimation is used, estimates of variance and sample size
are quite sensitive to data points that do not fit the ideal pattern,
called influential observations. It is important to plot the data and
look for these unusual data points before proceeding with an analysis.
4. The 'ratio of means' $\hat{R} = \bar{y}/\bar{x}$ is biased, but is almost unbiased
if $n$ is large. Another ratio estimator is the 'mean of ratios'
$$\hat{R}^* = \bar{r}^* = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{x_i}, \quad \text{where } r_i^* = \frac{y_i}{x_i},$$
which is unbiased for $R^* = \frac{1}{N}\sum_{i=1}^{N}\frac{Y_i}{X_i}$.
However, $\hat{R}^*$ gives equal weight to each cluster, and clusters may vary greatly
in size. Unlike $\hat{R}^*$, $\hat{R}$ is weighted by the cluster size, which is an
advantage over $\hat{R}^*$.


3.3 Ratio estimate for population mean and total

The ratio estimator of the population total $Y$ is
$$\hat{Y}_r = \frac{\bar{y}}{\bar{x}}X = rX.$$
Similarly, the ratio estimator of the population mean is
$$\hat{\bar{Y}}_r = \frac{\bar{y}}{\bar{x}}\bar{X} = r\bar{X}.$$

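These ratio estimates use the extra information in $x_i$, $i = 1, \dots, n$, and
the true total or mean, $X$ or $\bar{X}$, thus improving the precision of the ratio
estimates over the ordinary estimates $\hat{Y} = N\bar{y}$ and $\hat{\bar{Y}} = \bar{y}$ respectively.
From the previous result,

(a) $E(\hat{\bar{Y}}_r) = \bar{X}E(r) \approx \bar{X}R = \bar{Y}$.

Similarly $E(\hat{Y}_r) = XE(r) \approx XR = Y$.

(b) Since
$$\mathrm{Var}(\hat{\bar{Y}}_r) \approx \left(1 - \frac{n}{N}\right)\frac{S_r^2}{n} \quad \text{and} \quad \mathrm{Var}(\hat{Y}_r) \approx N^2\left(1 - \frac{n}{N}\right)\frac{S_r^2}{n},$$
the variance estimates are
$$\mathrm{var}(\hat{\bar{Y}}_r) = \left(1 - \frac{n}{N}\right)\frac{s_r^2}{n} \quad \text{and} \quad \mathrm{var}(\hat{Y}_r) = N^2\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}.$$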

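The estimator $r$ for $R$ is generally biased, so $\hat{Y}_r$ and $\hat{\bar{Y}}_r$ are also
biased for $Y$ and $\bar{Y}$ respectively.
Bias:
$$\mathrm{Cov}(r, \bar{x}) = E(r\bar{x}) - E(r)E(\bar{x}) = E\left(\frac{\bar{y}}{\bar{x}}\bar{x}\right) - E(r)E(\bar{x}) = E(\bar{y}) - E(r)E(\bar{x})$$
so
$$E(r) = \frac{E(\bar{y})}{E(\bar{x})} - \frac{\mathrm{Cov}(r, \bar{x})}{E(\bar{x})} = R - \frac{\rho_{r,\bar{x}}\,\sigma_r\,\sigma_{\bar{x}}}{\bar{X}}.$$

Therefore, since $|\rho_{r,\bar{x}}| \le 1$, for any ratio estimate
$$\frac{|\text{bias}(r)|}{\sigma_r} \le \frac{\sigma_{\bar{x}}}{\bar{X}} = \mathrm{cv}(\bar{x}). \tag{3}$$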

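Efficiency:
The ratio estimator is more efficient than the ordinary estimator, that is
$\mathrm{var}(\hat{\bar{Y}}) > \mathrm{var}(\hat{\bar{Y}}_r)$, if
$$\hat{\rho} > \frac{\mathrm{cv}(x)}{2\,\mathrm{cv}(y)} \tag{4}$$
where $\mathrm{cv}(y)$ is the sample cv for $Y$, defined as
$$\mathrm{cv}(y) = \frac{s_y}{\bar{y}},$$
and $\mathrm{cv}(x) = s_x/\bar{x}$ similarly. To see this,
$$\mathrm{var}(\hat{\bar{Y}}) - \mathrm{var}(\hat{\bar{Y}}_r) > 0 \;\Rightarrow\; \left(1 - \frac{n}{N}\right)\frac{1}{n}[s_y^2 - s_r^2] > 0$$
$$\Rightarrow\; s_y^2 - (s_y^2 - 2r\hat{\rho}s_xs_y + r^2s_x^2) > 0$$
$$\Rightarrow\; rs_x(2\hat{\rho}s_y - rs_x) > 0$$
$$\Rightarrow\; 2\hat{\rho}s_y - rs_x > 0 \quad \text{since } r > 0 \text{ and } s_x > 0$$
$$\Rightarrow\; \hat{\rho} > \frac{\bar{y}}{\bar{x}}\frac{s_x}{2s_y} = \frac{\mathrm{cv}(x)}{2\,\mathrm{cv}(y)} \quad \text{since } r = \frac{\bar{y}}{\bar{x}}.$$
The two variance estimates are equal when $\hat{\rho} = \dfrac{\mathrm{cv}(x)}{2\,\mathrm{cv}(y)}$.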


Example: (7-11) The manager of 7-11 is interested in estimating the
total sales (in thousands) for all of its 300 branches. From last year's records,
the total sales in thousands for all 300 branches were 21300. This year's
records were carefully checked for an SRS of 15 branches, with the
following results:

Branch   Last year's sales x   This year's sales y
   1             50                    56
   2             35                    48
   3             12                    22
   4             10                    14
   5             15                    18
   6             30                    26
   7              9                    11
   8             25                    30
   9            100                   165
  10            250                   409
  11             50                    73
  12             50                    70
  13            150                    95
  14            100                    55
  15             40                    83

$$\sum_{i=1}^{n}x_i = 926, \quad \sum_{i=1}^{n}x_i^2 = 117400, \quad \sum_{i=1}^{n}y_i = 1175, \quad \sum_{i=1}^{n}y_i^2 = 231815, \quad \sum_{i=1}^{n}x_iy_i = 155753, \quad s_y^2 = 9983.81$$

The ordinary estimate of the total sales this year in thousands is
$$\hat{Y} = N\bar{y} = 300\left(\frac{1175}{15}\right) = 23500$$
with
$$\mathrm{se}(\hat{Y}) = N\sqrt{\left(1 - \frac{n}{N}\right)\frac{s_y^2}{n}} = 300\sqrt{\left(1 - \frac{15}{300}\right)\frac{9983.81}{15}} = 7543.72.$$
The ratio estimate and its se for the total sales this year in thousands are
$$\hat{Y}_r = Xr = 21300\left(\frac{1175}{926}\right) = 27027.54$$
$$\mathrm{se}(\hat{Y}_r) = N\sqrt{\left(1 - \frac{n}{N}\right)\frac{1}{n}\,\frac{1}{n-1}\left(\sum_{i=1}^{n}y_i^2 - 2r\sum_{i=1}^{n}x_iy_i + r^2\sum_{i=1}^{n}x_i^2\right)}$$
$$= 300\sqrt{\left(1 - \frac{15}{300}\right)\frac{1}{15 \times 14}\left[231815 - 2\cdot\frac{1175}{926}\cdot 155753 + \left(\frac{1175}{926}\right)^2\cdot 117400\right]}$$
$$= 3226.66$$

which is much smaller than $\mathrm{se}(\hat{Y}) = 7543.72$ thousand.
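
The calculation above can be reproduced directly from the 15 sampled branches. The sketch below is one way to do it with the Python standard library; the variable names are my own and the code is an illustration, not part of the original solution.

```python
import math

# Sample from the 7-11 example: last year's (x) and this year's (y) sales.
x = [50, 35, 12, 10, 15, 30, 9, 25, 100, 250, 50, 50, 150, 100, 40]
y = [56, 48, 22, 14, 18, 26, 11, 30, 165, 409, 73, 70, 95, 55, 83]
N = 300            # number of branches
X_total = 21300    # known total of last year's sales
n = len(x)
fpc = 1 - n / N    # finite population correction

x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary estimator of the total and its standard error.
s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
Y_ord = N * y_bar
se_ord = N * math.sqrt(fpc * s2_y / n)

# Ratio estimator of the total and its standard error.
r = y_bar / x_bar
s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
Y_ratio = X_total * r
se_ratio = N * math.sqrt(fpc * s2_r / n)

print(f"ordinary: {Y_ord:.2f} (se {se_ord:.2f})")
print(f"ratio:    {Y_ratio:.2f} (se {se_ratio:.2f})")
```

The residual form of $s_r^2$ is used here instead of the raw-sums form; both give the same value, as shown in the calculation of $s_r^2$ in Section 3.2.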

Read Tutorial 11 Q2a,b, Q3a,b.

3.4 Regression estimator

Since $\hat{\bar{Y}}_r = \bar{X}\hat{R} = \bar{X}\,\bar{y}/\bar{x}$, the line $y = mx$ with slope $m = \bar{y}/\bar{x}$ passes
through the origin $(0, 0)$ and $(\bar{X}, \hat{\bar{Y}}_r)$. However, the linear relationship
between $X$ and $Y$ may not pass through the origin. A more general
estimator, the regression estimator, fits a regression line
$$y = A + Bx = \bar{y} - B\bar{x} + Bx = \bar{y} + B(x - \bar{x}) \tag{5}$$
to the sample data, where the least squares slope $B$ is
$$B = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum_{i=1}^{N}(Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{N}(X_i - \bar{X})^2} = \frac{\sum_{i=1}^{N}X_iY_i - N\bar{X}\bar{Y}}{\sum_{i=1}^{N}X_i^2 - N\bar{X}^2} = \frac{S_{xy}}{S_x^2} = \frac{\rho S_y}{S_x}$$
and $A = \bar{y} - B\bar{x}$.
Note: $\mathrm{Cov}(X, Y) = S_{xy} = SS_{xy}/(N - 1)$, $\mathrm{Var}(X) = S_x^2 = SS_{xx}/(N - 1)$,
$\mathrm{cov}(X, Y) = s_{xy} = ss_{xy}/(n - 1)$ and $\mathrm{var}(X) = s_x^2 = ss_{xx}/(n - 1)$.
The regression estimator of the population mean $\bar{Y}$ is then obtained by substituting
$x = \bar{X}$ into (5) and replacing $B$ by its sample estimate $b$:
$$\hat{\bar{Y}}_{reg} = \bar{y} + b(\bar{X} - \bar{x})$$
where
$$b = \frac{ss_{xy}}{ss_{xx}} = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2} = \frac{s_{xy}}{s_x^2}. \tag{6}$$

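Since
$$\hat{\bar{Y}}_{reg} = \bar{y} + b(\bar{X} - \bar{x}) \simeq \bar{y} + B(\bar{X} - \bar{x}) = \bar{z}',$$
the sample mean of the variable $z_i' = y_i + B(\bar{X} - x_i)$, we have
$$E(\hat{\bar{Y}}_{reg}) \simeq E[\bar{y} + B(\bar{X} - \bar{x})] = E(\bar{y}) + B[\bar{X} - E(\bar{x})] = \bar{Y} \quad \text{(approximately unbiased)}$$
and
$$\mathrm{Var}(\hat{\bar{Y}}_{reg}) \simeq \mathrm{Var}(\bar{z}') = \mathrm{Var}[\bar{y} + B(\bar{X} - \bar{x})] = \mathrm{Var}(\bar{y} - B\bar{x})$$
$$= \mathrm{Var}(\bar{y}) + B^2\,\mathrm{Var}(\bar{x}) - 2B\,\mathrm{Cov}(\bar{y}, \bar{x})$$
$$= \left(1 - \frac{n}{N}\right)\frac{S_y^2}{n} + \rho^2\frac{S_y^2}{S_x^2}\left(1 - \frac{n}{N}\right)\frac{S_x^2}{n} - 2\rho\frac{S_y}{S_x}\left(1 - \frac{n}{N}\right)\frac{\rho S_xS_y}{n}$$
$$= \left(1 - \frac{n}{N}\right)\frac{S_y^2}{n}\left(1 - \rho^2\right).$$
Hence
$$\mathrm{var}(\hat{\bar{Y}}_{reg}) = \left(1 - \frac{n}{N}\right)\frac{s_{reg}^2}{n} = \left(1 - \frac{n}{N}\right)\frac{s_y^2(1 - \hat{\rho}^2)}{n}$$
where $s_{reg}^2$ is the sample variance of $z_i' = y_i + b(\bar{X} - x_i)$.
The regression estimator for the population total $Y$ is
$$\hat{Y}_{reg} = N[\bar{y} + b(\bar{X} - \bar{x})]$$
and its variance estimate is
$$\mathrm{var}(\hat{Y}_{reg}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_{reg}^2}{n} = N^2\left(1 - \frac{n}{N}\right)\frac{s_y^2(1 - \hat{\rho}^2)}{n}.$$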


Bias:
$$\text{Bias in } \hat{\bar{Y}}_{reg} = E(\hat{\bar{Y}}_{reg}) - \bar{Y} = E(\bar{y}) + E[b(\bar{X} - \bar{x})] - \bar{Y} = E[b(\bar{X} - \bar{x})] = -\mathrm{Cov}(b, \bar{x}).$$

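Efficiency:
1. The regression estimator is at least as efficient as the ordinary
estimator, that is $\mathrm{var}(\hat{\bar{Y}}) \ge \mathrm{var}(\hat{\bar{Y}}_{reg})$, since
$$\mathrm{var}(\hat{\bar{Y}}) - \mathrm{var}(\hat{\bar{Y}}_{reg}) = \left(1 - \frac{n}{N}\right)\frac{1}{n}[s_y^2 - s_{reg}^2] = \left(1 - \frac{n}{N}\right)\frac{1}{n}s_y^2\hat{\rho}^2 \ge 0$$
where the equality holds when $\hat{\rho} = 0$, i.e. there is no linear association
between $Y$ and $X$.
2. The regression estimator is more efficient than the ratio estimator,
that is $\mathrm{var}(\hat{\bar{Y}}_r) \ge \mathrm{var}(\hat{\bar{Y}}_{reg})$, unless
$$b = r = \frac{\bar{y}}{\bar{x}},$$
in which case they are equivalent and the regression of $y$ on $x$ is
linear through the origin and the variance of $y$ is proportional to $x$.
$$\mathrm{var}(\hat{\bar{Y}}_r) - \mathrm{var}(\hat{\bar{Y}}_{reg}) = \left(1 - \frac{n}{N}\right)\frac{1}{n}[s_r^2 - s_{reg}^2]$$
$$= \left(1 - \frac{n}{N}\right)\frac{1}{n}[s_y^2 - 2r\hat{\rho}s_xs_y + r^2s_x^2 - s_y^2(1 - \hat{\rho}^2)]$$
$$= \left(1 - \frac{n}{N}\right)\frac{1}{n}(r^2s_x^2 - 2r\hat{\rho}s_xs_y + s_y^2\hat{\rho}^2)$$
$$= \left(1 - \frac{n}{N}\right)\frac{1}{n}(rs_x - \hat{\rho}s_y)^2$$
$$= \left(1 - \frac{n}{N}\right)\frac{1}{n}(rs_x - bs_x)^2$$
$$= \left(1 - \frac{n}{N}\right)\frac{s_x^2}{n}(r - b)^2 > 0 \;\Rightarrow\; (r - b)^2 > 0$$
where
$$\hat{\rho}s_y = \frac{s_{xy}}{s_xs_y}s_y = \frac{s_{xy}}{s_x} = \frac{s_{xy}}{s_x^2}s_x = bs_x.$$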

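3. Since $\hat{\bar{Y}}_{reg} = \bar{y} + b(\bar{X} - \bar{x})$, the regression estimator adjusts $\bar{y}$
up or down by an amount $b(\bar{X} - \bar{x})$.

(a) When the slope $b = 0$, the regression estimator $\hat{\bar{Y}}_{reg} = \bar{y}$ becomes
the ordinary estimator $\hat{\bar{Y}}$.
(b) When the y-intercept $a = \bar{y} - b\bar{x} = 0 \Leftrightarrow b = \bar{y}/\bar{x} = r$, the slope $b$
becomes the ratio estimate $r$ and the regression estimator
$$\hat{\bar{Y}}_{reg} = \bar{y} + \frac{\bar{y}}{\bar{x}}(\bar{X} - \bar{x}) = \bar{y} + \frac{\bar{y}}{\bar{x}}\bar{X} - \bar{y} = \frac{\bar{y}}{\bar{x}}\bar{X} = \bar{X}r = \hat{\bar{Y}}_r$$
becomes the ratio estimator $\hat{\bar{Y}}_r$.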

Example: (7-11) Estimate the total sales using the regression estimator.
Solution: For the regression estimate of the total sales this year in
thousands, we first compute
$$ss_{xy} = \sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y} = 155753 - 15 \times \frac{926}{15} \times \frac{1175}{15} = 83216.33,$$
$$ss_{xx} = \sum_{i=1}^{n}x_i^2 - n\bar{x}^2 = 117400 - 15 \times \left(\frac{926}{15}\right)^2 = 60234.93,$$
$$ss_{yy} = \sum_{i=1}^{n}y_i^2 - n\bar{y}^2 = 231815 - 15 \times \left(\frac{1175}{15}\right)^2 = 139773.33.$$

We have
$$b = \frac{ss_{xy}}{ss_{xx}} = \frac{83216.33}{60234.93} = 1.3815$$
and
$$\hat{\rho} = \frac{ss_{xy}}{\sqrt{ss_{xx}\,ss_{yy}}} = \frac{83216.33}{\sqrt{60234.93 \times 139773.33}} = 0.9069.$$
It follows that
$$\hat{Y}_{reg} = N[\bar{y} + b(\bar{X} - \bar{x})] = 300\left[\frac{1175}{15} + 1.3815\left(\frac{21300}{300} - \frac{926}{15}\right)\right] = 27340.65$$
as compared with $\hat{Y} = 23500$ and $\hat{Y}_r = 27027.54$. The s.e. estimate is
$$\mathrm{se}(\hat{Y}_{reg}) = N\sqrt{\left(1 - \frac{n}{N}\right)\frac{s_y^2(1 - \hat{\rho}^2)}{n}} = 300\sqrt{\left(1 - \frac{15}{300}\right)\frac{9983.81(1 - 0.9069^2)}{15}} = 3178.52$$

which is $< \mathrm{se}(\hat{Y}_r) = 3226.66 \ll \mathrm{se}(\hat{Y}) = 7543.72$. This shows that
dropping the zero y-intercept assumption improves the estimate only slightly.
Note that the y-intercept estimate is
$$a = \bar{y} - b\bar{x} = \frac{1175}{15} - 1.3815 \times \frac{926}{15} \approx -6.9531$$
which is quite close to zero.
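
The regression estimate can also be recomputed from the raw sample. The sketch below is an illustration with variable names of my own choosing. Because it keeps $\hat{\rho}$ unrounded, the standard error it prints can differ from the hand calculation above (which uses $\hat{\rho} = 0.9069$) by a fraction of a unit.

```python
import math

# Sample from the 7-11 example: last year's (x) and this year's (y) sales.
x = [50, 35, 12, 10, 15, 30, 9, 25, 100, 250, 50, 50, 150, 100, 40]
y = [56, 48, 22, 14, 18, 26, 11, 30, 165, 409, 73, 70, 95, 55, 83]
N = 300
X_bar = 21300 / N  # known population mean of last year's sales
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross-products.
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)

b = ss_xy / ss_xx                       # least squares slope, equation (6)
a = y_bar - b * x_bar                   # y-intercept
rho = ss_xy / math.sqrt(ss_xx * ss_yy)  # sample correlation

# Regression estimator of the total and its standard error.
Y_reg = N * (y_bar + b * (X_bar - x_bar))
s2_y = ss_yy / (n - 1)
se_reg = N * math.sqrt((1 - n / N) * s2_y * (1 - rho ** 2) / n)

print(f"b = {b:.4f}, a = {a:.4f}, rho = {rho:.4f}")
print(f"regression estimate: {Y_reg:.2f} (se {se_reg:.2f})")
```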
Read Tutorial 11 Q2c,d, & 3c,d.


3.5 The Hartley - Ross Estimator

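Since the ratio estimator $r$ for $R$ is biased, the following leads to an
unbiased estimator of $R$.
Theorem: Let $Z = f(X, Y)$ be a fixed function of two variables.
Define $Z_i = f(X_i, Y_i)$ and $z_i = f(x_i, y_i)$. Then
$$E\left[\bar{z} + \frac{N-1}{N\bar{X}}\,\frac{\sum_{i=1}^{n}z_i(x_i - \bar{x})}{n-1}\right] = \frac{\sum_{i=1}^{N}Z_iX_i}{N\bar{X}}. \tag{7}$$

Proof: The LHS is
$$E(\bar{z}) + \frac{N-1}{N\bar{X}}E\left[\frac{\sum_{i=1}^{n}z_i(x_i - \bar{x})}{n-1}\right] = \bar{Z} + \frac{N-1}{N\bar{X}}\,\frac{\sum_{i=1}^{N}Z_i(X_i - \bar{X})}{N-1}$$
$$= \bar{Z} + \frac{\sum_{i=1}^{N}Z_iX_i - \bar{X}\sum_{i=1}^{N}Z_i}{N\bar{X}} = \frac{\sum_{i=1}^{N}Z_iX_i}{N\bar{X}}.$$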
For the problem of estimating $R$ from the sample $(x_i, y_i)$, $i = 1, \dots, n$,
we assume $X_i > 0$, $i = 1, \dots, N$, and define the function
$$z_i = f(x_i, y_i) = y_i/x_i = r_i^*, \quad i = 1, \dots, n,$$
and $Z_i = Y_i/X_i$, $i = 1, 2, \dots, N$, so from (7)
$$E\left[\bar{r}^* + \frac{N-1}{N\bar{X}}\,\frac{n(\bar{y} - \bar{x}\bar{r}^*)}{n-1}\right] = \frac{\sum_{i=1}^{N}\frac{Y_i}{X_i}X_i}{N\bar{X}} = \frac{\bar{Y}}{\bar{X}} = R$$
since
$$\sum_{i=1}^{n}z_i(x_i - \bar{x}) = \sum_{i=1}^{n}\frac{y_i}{x_i}(x_i - \bar{x}) = \sum_{i=1}^{n}y_i - \bar{x}\sum_{i=1}^{n}\frac{y_i}{x_i} = n(\bar{y} - \bar{x}\bar{r}^*).$$
Thus the Hartley-Ross estimator, an unbiased estimator of $R$, is
$$\hat{R}_{hr} = \bar{r}^* + \frac{N-1}{N\bar{X}}\,\frac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$$
for which we need to know $\bar{X}$ (or $X = N\bar{X}$). This estimator contains
a mean-of-ratios estimate and an adjustment for unbiasedness.
The Hartley-Ross estimators for the mean and total are

for the population mean:
$$\hat{\bar{Y}}_{hr} = \bar{X}\bar{r}^* + \frac{N-1}{N}\,\frac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$$
and for the population total:
$$\hat{Y}_{hr} = X\bar{r}^* + (N-1)\frac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}.$$

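Remarks:
1. So far, we have $\hat{R} = \dfrac{\bar{y}}{\bar{x}}$, which is biased for $R$; $\hat{R}^* = \dfrac{1}{n}\sum_{i=1}^{n}\dfrac{y_i}{x_i}$, which is biased for $R$ and
unbiased for $R^* = \dfrac{1}{N}\sum_{i=1}^{N}\dfrac{Y_i}{X_i}$; and $\hat{R}_{hr}$, which is unbiased for $R$. Finally, could
we just use
$$\hat{R}_o = \bar{y}/\bar{X}\,?$$
This is the ordinary estimator, with
$$E\left(\frac{\bar{y}}{\bar{X}}\right) = \frac{E(\bar{y})}{\bar{X}} = \frac{\bar{Y}}{\bar{X}} = R,$$
which does not use the information from the sample $\{x_i\}$ but is unbiased for $R$.

2. For small samples we might expect the Hartley-Ross estimator to be
better. There is no general result on the comparison of the variances
of
$$r = \frac{\bar{y}}{\bar{x}}, \quad r_o = \frac{\bar{y}}{\bar{X}}, \quad \text{and} \quad r_{hr} = \bar{r}^* + \frac{N-1}{N\bar{X}}\,\frac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$$
for all sample sizes.
See Cochran (2nd Ed) Theorem 6.3 §6.15.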
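
The exact unbiasedness of the Hartley-Ross estimator, in contrast to the ratio of means, can be illustrated by enumerating every SRS from a tiny population. The population below is hypothetical (my own illustrative values with all $X_i > 0$), and the function names are assumptions made for this sketch.

```python
from itertools import combinations

# Hypothetical toy population with all X_i > 0 (not data from the notes).
X = [2.0, 3.0, 5.0, 7.0, 8.0]
Y = [5.0, 6.0, 11.0, 13.0, 18.0]
N, n = len(X), 3
X_bar = sum(X) / N
R = sum(Y) / sum(X)  # population ratio


def ratio_of_means(idx):
    """r = y_bar / x_bar for the sample with unit indices idx."""
    return sum(Y[i] for i in idx) / sum(X[i] for i in idx)


def hartley_ross(idx):
    """Hartley-Ross estimator of R for the sample with unit indices idx."""
    x_bar = sum(X[i] for i in idx) / n
    y_bar = sum(Y[i] for i in idx) / n
    r_star = sum(Y[i] / X[i] for i in idx) / n  # mean of ratios
    return r_star + (N - 1) / (N * X_bar) * n * (y_bar - r_star * x_bar) / (n - 1)


# Every subset of size n is equally likely under SRS without replacement.
samples = list(combinations(range(N), n))
E_r = sum(ratio_of_means(s) for s in samples) / len(samples)
E_hr = sum(hartley_ross(s) for s in samples) / len(samples)

print(f"R = {R:.6f}")
print(f"E(r)    = {E_r:.6f}  (bias {E_r - R:+.6f})")
print(f"E(r_hr) = {E_hr:.6f}  (bias {E_hr - R:+.6f})")
```

The bias of `hartley_ross` should be zero up to floating-point rounding for any such population, whereas the bias of `ratio_of_means` is generally nonzero in small samples.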


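Summary of estimators and variance estimates based on one SRS

Ratio $R$
- Ordinary: $\dfrac{\bar{y}}{\bar{X}}$, with variance estimate $\dfrac{1}{\bar{X}^2}\left(1 - \dfrac{n}{N}\right)\dfrac{s_y^2}{n}$
- Ratio: $\dfrac{\bar{y}}{\bar{x}}$, with variance estimate $\dfrac{1}{\bar{X}^2}\left(1 - \dfrac{n}{N}\right)\dfrac{s_r^2}{n}$
- Regression: -
- Hartley-Ross: $\bar{r}^* + \dfrac{N-1}{N\bar{X}}\,\dfrac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$ (variance estimate: -)

Mean $\bar{Y}$
- Ordinary: $\bar{y}$, with variance estimate $\left(1 - \dfrac{n}{N}\right)\dfrac{s_y^2}{n}$
- Ratio: $\dfrac{\bar{y}}{\bar{x}}\bar{X}$, with variance estimate $\left(1 - \dfrac{n}{N}\right)\dfrac{s_r^2}{n}$
- Regression: $\bar{y} + \dfrac{s_{xy}}{s_x^2}(\bar{X} - \bar{x})$, with variance estimate $\left(1 - \dfrac{n}{N}\right)\dfrac{s_y^2(1 - \hat{\rho}^2)}{n}$
- Hartley-Ross: $\bar{X}\bar{r}^* + \dfrac{N-1}{N}\,\dfrac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$ (variance estimate: -)
- Comparison: $\mathrm{var}(\hat{\bar{Y}}_r) < \mathrm{var}(\hat{\bar{Y}})$ if $\hat{\rho} > \dfrac{\bar{y}s_x}{2\bar{x}s_y}$; $\mathrm{var}(\hat{\bar{Y}}_{reg}) < \mathrm{var}(\hat{\bar{Y}}_r)$, with equality if $b = r = \dfrac{\bar{y}}{\bar{x}}$

Total $Y$
- Ordinary: $N\bar{y}$, with variance estimate $N^2\left(1 - \dfrac{n}{N}\right)\dfrac{s_y^2}{n}$
- Ratio: $\dfrac{\bar{y}}{\bar{x}}X$, with variance estimate $N^2\left(1 - \dfrac{n}{N}\right)\dfrac{s_r^2}{n}$
- Regression: $N\bar{y} + \dfrac{s_{xy}}{s_x^2}(X - N\bar{x})$, with variance estimate $N^2\left(1 - \dfrac{n}{N}\right)\dfrac{s_y^2(1 - \hat{\rho}^2)}{n}$
- Hartley-Ross: $X\bar{r}^* + (N-1)\dfrac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$ (variance estimate: -)

where
$$s_y^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n}y_i^2 - n\bar{y}^2\right),$$
$$s_r^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n}y_i^2 - 2r\sum_{i=1}^{n}x_iy_i + r^2\sum_{i=1}^{n}x_i^2\right) = s_y^2 - 2r\hat{\rho}s_xs_y + r^2s_x^2, \quad r = \frac{\bar{y}}{\bar{x}}.$$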


Example: (7-11)
Solution: The ratios and their summary are given below:

  i    x_i    y_i    r_i* = y_i/x_i
  1     50     56        1.120
  2     35     48        1.371
  3     12     22        1.833
  4     10     14        1.400
  5     15     18        1.200
  6     30     26        0.867
  7      9     11        1.222
  8     25     30        1.200
  9    100    165        1.650
 10    250    409        1.636
 11     50     73        1.460
 12     50     70        1.400
 13    150     95        0.633
 14    100     55        0.550
 15     40     83        2.075
 Total                  19.618

We have $\bar{r}^* = \dfrac{1}{n}\sum_{i=1}^{n}r_i^* = \dfrac{19.618}{15} = 1.3079$, $\bar{x} = 61.7333$ and $\bar{y} = 78.3333$.

The Hartley-Ross estimate of the total sales this year in thousands is
$$\hat{Y}_{hr} = X\bar{r}^* + (N-1)\frac{n(\bar{y} - \bar{r}^*\bar{x})}{n-1}$$
$$= 21300(1.3079) + (300 - 1)\frac{15[78.333 - 1.3079(61.7333)]}{15 - 1}$$
$$= 27086.9$$
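
As a check on the arithmetic, the Hartley-Ross total can be recomputed from the 15 sampled branches. This is a minimal sketch with my own variable names, not part of the original solution.

```python
# Sample from the 7-11 example: last year's (x) and this year's (y) sales.
x = [50, 35, 12, 10, 15, 30, 9, 25, 100, 250, 50, 50, 150, 100, 40]
y = [56, 48, 22, 14, 18, 26, 11, 30, 165, 409, 73, 70, 95, 55, 83]
N = 300
X_total = 21300
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Mean of the unit-level ratios r_i* = y_i / x_i.
r_bar_star = sum(yi / xi for xi, yi in zip(x, y)) / n

# Hartley-Ross estimator of the population total.
Y_hr = X_total * r_bar_star + (N - 1) * n * (y_bar - r_bar_star * x_bar) / (n - 1)

print(f"mean of ratios = {r_bar_star:.4f}")
print(f"Hartley-Ross total = {Y_hr:.1f}")
```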

Read Tutorial 12 Q1(a).


Example: In a survey of family size ($x_1$), weekly income ($x_2$) and weekly
expenditure on food ($y$), we want to estimate the average weekly expenditure
on food per family in the most efficient way. A simple random
sample of 27 families yields the following data:
$$\sum_i x_{1i} = 109, \quad \sum_i x_{2i} = 16277, \quad \sum_i y_i = 2831, \quad \hat{\rho}_{x_1,y} = 0.925, \quad \hat{\rho}_{x_2,y} = 0.573.$$

The sample covariance matrix for $y$, $x_1$ and $x_2$ is
$$\begin{pmatrix} 547.8234 & 26.5057 & 1796.5541 \\ 26.5057 & 1.4986 & 80.1595 \\ 1796.5541 & 80.1595 & 17967.0541 \end{pmatrix}.$$

From the census data, $\bar{X}_1 = 3.91$ and $\bar{X}_2 = 542$.

(a) Estimate the standard errors of the ratio estimators for $\bar{Y}$ using $x_1$
and using $x_2$. Compare the standard errors with the s.e. for the
simple estimate ignoring the covariates. Which estimator has the
smallest estimated s.e.?
(b) Calculate the best available estimate of the average weekly expenditure
on food per family and give an approximate 95% confidence
interval for this average.

Solution:
(a) The standard errors of the ratio estimators for $\bar{Y}$ using $x_1$ and using
$x_2$ are obtained as follows (the variances below are evaluated with the unrounded ratios).
$$r_1 = \frac{\sum_i y_i}{\sum_i x_{1i}} = \frac{2831}{109} = 25.97$$
$$s_{r_1}^2 = s_y^2 - 2r_1s_{x_1y} + r_1^2s_{x_1}^2 = 547.8234 - 2(25.97)(26.5057) + 25.97^2(1.4986) = 181.896$$
$$r_2 = \frac{\sum_i y_i}{\sum_i x_{2i}} = \frac{2831}{16277} = 0.1739$$
$$s_{r_2}^2 = s_y^2 - 2r_2s_{x_2y} + r_2^2s_{x_2}^2 = 547.8234 - 2(0.1739)(1796.5541) + 0.1739^2(17967.0541) \approx 466.40$$
$$\mathrm{se}(\hat{\bar{Y}}_{r_1}) = \sqrt{\frac{s_{r_1}^2}{n}} = \sqrt{\frac{181.896}{27}} = 2.5956$$
$$\mathrm{se}(\hat{\bar{Y}}_{r_2}) = \sqrt{\frac{s_{r_2}^2}{n}} = \sqrt{\frac{466.40}{27}} \approx 4.1562$$
$$\mathrm{se}(\hat{\bar{Y}}) = \sqrt{\frac{s_y^2}{n}} = \sqrt{\frac{547.8234}{27}} = 4.5044$$
The first ratio estimator $\hat{\bar{Y}}_{r_1}$ has the lowest s.e. due to the higher
correlation $\hat{\rho}_{y,x_1} = 0.925$. The second ratio estimator gives only a marginal
improvement as the correlation $\hat{\rho}_{y,x_2} = 0.573$ is weak, but it still satisfies condition (4):
$$\hat{\rho}_{x_2,y} = 0.573 > \frac{\bar{y}s_{x_2}}{2\bar{x}_2s_y} = \frac{2831 \cdot \sqrt{17967.0541}}{2 \cdot 16277 \cdot \sqrt{547.8234}} = 0.4980.$$
Note that the fpc is ignored because the population size $N$ is unknown.
(b) The estimate of the average weekly expenditure on food per family is
$$\hat{\bar{Y}}_{r_1} = r_1\bar{X}_1 = 25.97(3.91) = 101.5524$$
with
$$\text{95\% CI for } \bar{Y} = 101.5524 \mp 1.96(2.5956) = (96.4651, 106.6397).$$
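
The standard errors in this example depend only on the summary statistics, so they are straightforward to recompute. The sketch below (helper name `ratio_se` is my own) evaluates $s_r^2 = s_y^2 - 2rs_{xy} + r^2s_x^2$ for each auxiliary variable with unrounded ratios, ignoring the fpc as in the solution; last digits may differ slightly from hand-rounded figures.

```python
import math

n = 27
sum_x1, sum_x2, sum_y = 109, 16277, 2831

# Entries of the sample covariance matrix for (y, x1, x2).
s2_y = 547.8234
s_x1y, s2_x1 = 26.5057, 1.4986
s_x2y, s2_x2 = 1796.5541, 17967.0541
X1_bar = 3.91  # census mean of family size


def ratio_se(r, s2_y, s_xy, s2_x, n):
    """s.e. of the ratio estimator of the mean, ignoring the fpc."""
    s2_r = s2_y - 2 * r * s_xy + r ** 2 * s2_x
    return math.sqrt(s2_r / n)


r1 = sum_y / sum_x1
r2 = sum_y / sum_x2

se_r1 = ratio_se(r1, s2_y, s_x1y, s2_x1, n)
se_r2 = ratio_se(r2, s2_y, s_x2y, s2_x2, n)
se_ord = math.sqrt(s2_y / n)

# Best available estimate (ratio estimator using x1) and approximate 95% CI.
est = r1 * X1_bar
ci = (est - 1.96 * se_r1, est + 1.96 * se_r1)

print(f"se ratio (x1) = {se_r1:.4f}, se ratio (x2) = {se_r2:.4f}, se ordinary = {se_ord:.4f}")
print(f"estimate = {est:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```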


3.6 Ratio estimate for subpopulation in poststratification

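For some subpopulation $C_l$, we want to estimate
$$R_l = \sum_{i \in C_l}Y_i \Big/ \sum_{i \in C_l}X_i = \sum_{i=1}^{N}Y_i' \Big/ \sum_{i=1}^{N}X_i' = R'$$
if we define
$$(Y_i', X_i') = \begin{cases}(Y_i, X_i) & \text{if } i \in C_l \\ (0, 0) & \text{if } i \notin C_l.\end{cases}$$
Note: $X' = X_l$, i.e. the sum of $X_i'$ over the whole population equals the
sum of $X_i$ over $C_l$. Hence the natural estimator of the ratio and its variance
estimate are
$$r' = \frac{\sum_{i=1}^{n}y_i'}{\sum_{i=1}^{n}x_i'} = \frac{\sum_{i \in C_l}y_i}{\sum_{i \in C_l}x_i} = r_l \quad \text{and} \quad \mathrm{var}(r_l) \approx \frac{1}{(\bar{X}')^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n}$$
where
$$(x_i', y_i') = \begin{cases}(x_i, y_i) & \text{if } i \in C_l \\ (0, 0) & \text{if } i \notin C_l,\end{cases}$$
$\bar{X}' = \dfrac{X'}{N}$ can be estimated by $\bar{x}' = \dfrac{1}{n}\sum_{i=1}^{n}x_i' = \dfrac{1}{n}\sum_{i \in C_l}x_i$, and
$$S_{rl}'^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i' - R'X_i')^2$$
can be estimated by
$$s_{rl}'^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i' - r'x_i')^2 = \frac{1}{n-1}\sum_{i \in C_l}(y_i - r_lx_i)^2.$$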


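The ratio estimator of the mean in $C_l$ and its variance estimate are
$$\hat{\bar{Y}}_{rl} = \bar{X}_lr_l \quad \text{and} \quad \mathrm{var}(\hat{\bar{Y}}_{rl}) \approx \frac{1}{W_l^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n}$$
since
$$\mathrm{var}(\hat{\bar{Y}}_{rl}) = \bar{X}_l^2\,\mathrm{var}(r_l) = \frac{\bar{X}_l^2}{\bar{X}'^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n} = \frac{X'^2N^2}{N_l^2X'^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n} = \frac{1}{W_l^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n},$$
using $\bar{X}_l = X_l/N_l = X'/N_l$, $\bar{X}' = X'/N$ and $W_l = N_l/N$.

Similarly, the ratio estimator of the total in $C_l$ and its variance estimate are
$$\hat{Y}_{rl} = X_lr_l \quad \text{and} \quad \mathrm{var}(\hat{Y}_{rl}) \approx N^2\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n}$$
since
$$\mathrm{var}(\hat{Y}_{rl}) = X_l^2\,\mathrm{var}(r_l) = \frac{X_l^2}{\bar{X}'^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n} = X'^2\frac{N^2}{X'^2}\left(1 - \frac{n}{N}\right)\frac{s_{rl}'^2}{n}.$$
Note that these estimators correspond to method 1 in Section 1.5 for
poststratification, and $n_l$ does not come into any of these calculations.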
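
The zero-padding device above translates directly into code: units outside $C_l$ contribute $(0, 0)$, and all sums still run over the full sample of size $n$. The following sketch uses a hypothetical sample of five units from a population of $N = 50$; the data, the membership flags and the function name `domain_ratio` are all assumptions made for illustration.

```python
def domain_ratio(x, y, in_domain, N):
    """Ratio estimate r_l for a subpopulation and its approximate variance.

    Units outside the domain are replaced by (0, 0); the divisors n and
    n - 1 refer to the full sample size, not the number of domain units.
    """
    n = len(x)
    x_p = [xi if d else 0.0 for xi, d in zip(x, in_domain)]
    y_p = [yi if d else 0.0 for yi, d in zip(y, in_domain)]
    r_l = sum(y_p) / sum(x_p)
    x_bar_p = sum(x_p) / n  # estimates X-bar' = X'/N
    s2 = sum((yi - r_l * xi) ** 2 for xi, yi in zip(x_p, y_p)) / (n - 1)
    var_r = (1 - n / N) * s2 / n / x_bar_p ** 2
    return r_l, var_r


# Hypothetical sample (illustrative values, not from the notes).
x = [2.0, 4.0, 3.0, 5.0, 6.0]
y = [4.0, 9.0, 5.0, 11.0, 12.0]
in_domain = [True, False, True, True, False]

r_l, var_r_l = domain_ratio(x, y, in_domain, N=50)
print(r_l, var_r_l)
```

With these illustrative numbers the domain totals are 10 for $x$ and 20 for $y$, so $r_l = 2$, and the variance estimate works out to $0.9 \times 0.5 / 5 / 2^2 = 0.0225$.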
Read Tutorial 12 Q1b,c.


3.7 Ratio Estimation for Stratified SRS

In a stratified SRS, an SRS of a specified sample size $n_l$ is taken in each of
the $L$ strata with known size $N_l$, e.g. the 6 states of Australia. There
are two types of ratio estimates, depending on the order of taking ratios
and summing over strata.

1. Take ratios $r_l = \bar{y}_l/\bar{x}_l$ first and sum over $\hat{Y}_l = X_lr_l$ to obtain $\hat{R}_s = \sum_{l=1}^{L}\hat{Y}_l/X$.

2. Sum over $\hat{Y}_l$ and $\hat{X}_l$ first to obtain $\hat{Y}$ and $\hat{X}$, and then take the ratio
$\hat{R}_c = \hat{Y}/\hat{X}$.

3.7.1 The ‘Separate’ Ratio Estimate

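Suppose the stratum totals $X_l$, $l = 1, \dots, L$, are known, so $X = \sum_{l=1}^{L}X_l$
is known also. Then
$$\hat{R}_s = \frac{\hat{Y}}{X} = \frac{1}{X}\sum_{l=1}^{L}\hat{Y}_l = \frac{1}{X}\sum_{l=1}^{L}X_lr_l = \frac{1}{\bar{X}}\sum_{l=1}^{L}W_l\bar{X}_lr_l$$
since $\dfrac{X_l}{X} = \dfrac{N}{X}\,\dfrac{N_l}{N}\,\dfrac{X_l}{N_l} = \dfrac{1}{\bar{X}}W_l\bar{X}_l$ and $r_l = \dfrac{\bar{y}_l}{\bar{x}_l}$. Then
$$E(\hat{R}_s) = \sum_{l=1}^{L}\frac{X_l}{X}E(r_l) \approx \sum_{l=1}^{L}\frac{X_l}{X}\,\frac{Y_l}{X_l} = \frac{1}{X}\sum_{l=1}^{L}Y_l = \frac{Y}{X} = R,$$
since $E(r_l) \approx R_l = \dfrac{Y_l}{X_l}$. The variance estimate is
$$\mathrm{var}(\hat{R}_s) = \frac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{s_{srl}^2}{n_l}$$
where
$$s_{srl}^2 = s_{y_l}^2 - 2r_ls_{x_ly_l} + r_l^2s_{x_l}^2 = \frac{1}{n_l - 1}\left[\sum_{i=1}^{n_l}y_{li}^2 - 2r_l\sum_{i=1}^{n_l}x_{li}y_{li} + r_l^2\sum_{i=1}^{n_l}x_{li}^2\right].$$
Similarly, the separate ratio estimate for the mean is
$$\hat{\bar{Y}}_{st,s} = \sum_{l=1}^{L}W_l\bar{X}_lr_l \quad \text{and} \quad \mathrm{var}(\hat{\bar{Y}}_{st,s}) = \sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{s_{srl}^2}{n_l}.$$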

Bias:
For large stratum sample sizes, $r_l$ will be approximately unbiased for $R_l$
and $\mathrm{var}(r_l)$ will approximate $\mathrm{Var}(r_l)$ reasonably well.
For moderate and small samples, bias is important, and we should consider
it here. We know that in a single stratum
$$\frac{|\text{bias}\,r_l|}{\sigma_{r_l}} \le \frac{\sigma_{\bar{x}_l}}{\bar{X}_l} = \mathrm{cv}(\bar{x}_l).$$
Consider the bias of $\hat{R}_s$:
$$|\text{bias}(\hat{R}_s)| = |E(\hat{R}_s - R)| = \left|E\left(\sum_{l=1}^{L}\frac{X_l}{X}(r_l - R_l)\right)\right| = \left|\sum_{l=1}^{L}\frac{X_l}{X}E(r_l - R_l)\right|$$
$$\le \sum_{l=1}^{L}\frac{X_l}{X}|\text{bias}\,r_l| \le \max_l|\text{bias}\,r_l|\sum_{l=1}^{L}\frac{X_l}{X} = \max_l|\text{bias}\,r_l| \le \max_l\frac{\sigma_{\bar{x}_l}\sigma_{r_l}}{\bar{X}_l}.$$
Hence
$$\frac{|\text{bias}(\hat{R}_s)|}{\text{s.e.}(\hat{R}_s)} \le \frac{\max_l\sigma_{r_l}}{\text{s.e.}(\hat{R}_s)}\,\max_l\frac{\sigma_{\bar{x}_l}}{\bar{X}_l} \le \sqrt{L}\,\frac{\max_l\sigma_{r_l}}{\min_l\sigma_{r_l}}\,\max_l\frac{\sigma_{\bar{x}_l}}{\bar{X}_l}$$
since
$$\text{s.e.}(\hat{R}_s) = \sqrt{\sum_{l=1}^{L}\left(\frac{X_l}{X}\right)^2\mathrm{var}(r_l)} \ge \min_l\sigma_{r_l}\sqrt{\sum_{l=1}^{L}p_l^2} \ge \min_l\sigma_{r_l}\sqrt{\sum_{l=1}^{L}\frac{1}{L^2}} = \min_l\sigma_{r_l}\sqrt{\frac{L}{L^2}} = \frac{1}{\sqrt{L}}\min_l\sigma_{r_l}$$
where $X_l = p_lX$. The sum of squares of unequal proportions is in general higher
than that of equal proportions. This is due to the convexity
of the function $f(p) = p^2$. For example, when $L = 2$ with cases
$(1 - p, p)$ and $(\frac{1}{2}, \frac{1}{2})$,
$$(1 - p)^2 + p^2 - 2\left(\frac{1}{2}\right)^2 = 2p^2 - 2p + \frac{1}{2} = \frac{1}{2}(2p - 1)^2 \ge 0.$$
Therefore the ratio on the LHS can be $\sqrt{L}$ times as large as the $\sigma_{\bar{x}_l}/\bar{X}_l$
bound on individual relative biases. Even if the biases are individually
small, the overall bias can be large.

3.7.2 The ‘Combined’ Ratio Estimate

It is defined as
$$\hat{R}_c = \frac{\sum_{l=1}^{L}W_l\bar{y}_l}{\sum_{l=1}^{L}W_l\bar{x}_l} = \frac{\bar{y}_{st}}{\bar{x}_{st}} = \frac{\hat{\bar{Y}}}{\hat{\bar{X}}} = \frac{\hat{Y}}{\hat{X}}$$
and, in contrast to $\hat{R}_s$, it does not require knowledge of the individual
$X_l$'s. Note that
$$E(\hat{R}_c) = E\left(\frac{\bar{y}_{st}}{\bar{x}_{st}}\right) \approx E\left(\frac{\bar{y}_{st}}{\bar{X}}\right) = \frac{1}{\bar{X}}\sum_{l=1}^{L}W_lE(\bar{y}_l) = \frac{\bar{Y}}{\bar{X}} = \frac{Y}{X} = R \quad \text{(approximately unbiased)}.$$

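Theorem:
$$\mathrm{Var}(\hat{R}_c) \approx \frac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{1}{n_l}\left(\frac{\sum_{i=1}^{N_l}[Y_{li} - \bar{Y}_l - R(X_{li} - \bar{X}_l)]^2}{N_l - 1}\right).$$
Proof: First
$$\hat{R}_c - R = \frac{\bar{y}_{st}}{\bar{x}_{st}} - R = \frac{1}{\bar{x}_{st}}(\bar{y}_{st} - R\bar{x}_{st}) = \frac{1}{\bar{x}_{st}}\sum_{l=1}^{L}W_l(\bar{y}_l - R\bar{x}_l)$$
$$= \frac{1}{\bar{x}_{st}}\sum_{l=1}^{L}W_l\bar{d}_l = \frac{1}{\bar{x}_{st}}\bar{d}_{st} \approx \frac{1}{\bar{X}}\bar{d}_{st}$$
where $d_{li} = y_{li} - Rx_{li}$, $i = 1, \dots, n_l$, estimates $D_{li} = Y_{li} - RX_{li}$ and
$\bar{d}_l = \dfrac{1}{n_l}\sum_{i=1}^{n_l}d_{li}$. Note that typically $\bar{D}_l \ne 0$. Hence
$$\mathrm{Var}(\hat{R}_c) \approx \frac{1}{\bar{X}^2}\mathrm{Var}(\bar{d}_{st}) = \frac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{S_{crl}^2}{n_l}$$
where
$$S_{crl}^2 = \frac{1}{N_l - 1}\sum_{i=1}^{N_l}(D_{li} - \bar{D}_l)^2 = \frac{1}{N_l - 1}\sum_{i=1}^{N_l}[Y_{li} - \bar{Y}_l - R(X_{li} - \bar{X}_l)]^2 = S_{y_l}^2 - 2RS_{x_ly_l} + R^2S_{x_l}^2,$$
and this can be estimated by
$$s_{crl}^2 = \frac{1}{n_l - 1}\sum_{i=1}^{n_l}[y_{li} - \bar{y}_l - r_c(x_{li} - \bar{x}_l)]^2 = s_{y_l}^2 - 2r_cs_{x_ly_l} + r_c^2s_{x_l}^2$$
as compared to
$$s_{srl}^2 = \frac{1}{n_l - 1}\left[\sum_{i=1}^{n_l}y_{li}^2 - 2r_l\sum_{i=1}^{n_l}x_{li}y_{li} + r_l^2\sum_{i=1}^{n_l}x_{li}^2\right] = s_{y_l}^2 - 2r_ls_{x_ly_l} + r_l^2s_{x_l}^2$$
for the separate ratio estimator.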

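There is less risk of bias in $\hat{R}_c$ than in $\hat{R}_s$. We can show that
$$\frac{|E(\hat{R}_c - R)|}{\text{s.e.}(\hat{R}_c)} \le \max_l\frac{\sigma_{\bar{x}_l}}{\bar{X}_l}$$
in contrast to
$$\frac{|E(\hat{R}_s - R)|}{\text{s.e.}(\hat{R}_s)} \le \sqrt{L}\,\frac{\max_l\sigma_{r_l}}{\min_l\sigma_{r_l}}\,\max_l\frac{\sigma_{\bar{x}_l}}{\bar{X}_l}$$
for the separate ratio estimator $\hat{R}_s$.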

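Similarly, the combined ratio estimate for the mean and its variance estimate are
$$\hat{\bar{Y}}_{st,c} = \bar{X}\hat{R}_c \quad \text{and} \quad \mathrm{var}(\hat{\bar{Y}}_{st,c}) = \sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{s_{crl}^2}{n_l},$$
and for the total are
$$\hat{Y}_{st,c} = X\hat{R}_c \quad \text{and} \quad \mathrm{var}(\hat{Y}_{st,c}) = N^2\sum_{l=1}^{L}W_l^2\left(1 - \frac{n_l}{N_l}\right)\frac{s_{crl}^2}{n_l}.$$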
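
To make the difference between the two approaches concrete, the sketch below computes the separate and combined ratio estimates of $R$, with their variance estimates, for a hypothetical two-stratum sample. The stratum sizes, known stratum means of $x$, sample values and helper names are all illustrative assumptions.

```python
# Hypothetical two-stratum sample (illustrative values, not from the notes).
strata = [
    {"N": 20, "X_bar": 5.0, "x": [3.0, 5.0], "y": [7.0, 9.0]},
    {"N": 30, "X_bar": 10.0, "x": [8.0, 12.0, 10.0], "y": [30.0, 38.0, 28.0]},
]
N = sum(s["N"] for s in strata)
X_bar = sum(s["N"] * s["X_bar"] for s in strata) / N  # overall mean of x


def mean(values):
    return sum(values) / len(values)


def resid_var(x, y, slope):
    """Sample variance of (y_i - y_bar) - slope * (x_i - x_bar)."""
    xb, yb = mean(x), mean(y)
    devs = [(yi - yb) - slope * (xi - xb) for xi, yi in zip(x, y)]
    return sum(d ** 2 for d in devs) / (len(x) - 1)


def var_ratio(slopes):
    """(1 / X_bar^2) * sum over strata of W_l^2 (1 - n_l/N_l) s_l^2 / n_l."""
    total = 0.0
    for s, slope in zip(strata, slopes):
        W, n_l = s["N"] / N, len(s["x"])
        total += W ** 2 * (1 - n_l / s["N"]) * resid_var(s["x"], s["y"], slope) / n_l
    return total / X_bar ** 2


# Separate estimate: one ratio per stratum, needs each stratum mean X_bar_l.
r_l = [mean(s["y"]) / mean(s["x"]) for s in strata]
R_s = sum(s["N"] / N * s["X_bar"] * r for s, r in zip(strata, r_l)) / X_bar
var_R_s = var_ratio(r_l)

# Combined estimate: a single ratio of the stratified means.
y_st = sum(s["N"] / N * mean(s["y"]) for s in strata)
x_st = sum(s["N"] / N * mean(s["x"]) for s in strata)
R_c = y_st / x_st
var_R_c = var_ratio([R_c] * len(strata))

print(f"separate: {R_s:.4f} (var {var_R_s:.5f})")
print(f"combined: {R_c:.4f} (var {var_R_c:.5f})")
```

Note that `resid_var` with slope $r_l$ reproduces $s_{srl}^2$ because $\bar{y}_l - r_l\bar{x}_l = 0$, while with the common slope $r_c$ it gives $s_{crl}^2$, in which the stratum means are subtracted explicitly.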

Read Tutorial 12 Q2.


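Estimators and variance estimates for stratified SRS (Ch.2)

Ordinary/naive estimator, with $s_{y_l}^2 = \dfrac{1}{n_l - 1}\left(\sum_{i=1}^{n_l}y_{li}^2 - n_l\bar{y}_l^2\right)$ and $W_l = \dfrac{N_l}{N}$:
- Ratio $R$: $\hat{R}_{st} = \dfrac{1}{\bar{X}}\sum_{l=1}^{L}W_l\bar{y}_l$, with $\mathrm{var}(\hat{R}_{st}) = \dfrac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{y_l}^2}{n_l}$
- Mean $\bar{Y}$: $\hat{\bar{Y}}_{st} = \sum_{l=1}^{L}W_l\bar{y}_l$, with $\mathrm{var}(\hat{\bar{Y}}_{st}) = \sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{y_l}^2}{n_l}$
- Total $Y$: $\hat{Y}_{st} = N\sum_{l=1}^{L}W_l\bar{y}_l$, with $\mathrm{var}(\hat{Y}_{st}) = N^2\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{y_l}^2}{n_l}$

Separate ratio estimator, with $s_{sr,l}^2 = s_{y_l}^2 - 2r_l\hat{\rho}_ls_{x_l}s_{y_l} + r_l^2s_{x_l}^2$ and $r_l = \dfrac{\bar{y}_l}{\bar{x}_l}$:
- Ratio $R$: $\hat{R}_{st,sr} = \dfrac{1}{\bar{X}}\sum_{l=1}^{L}W_l\bar{X}_lr_l$, with $\mathrm{var}(\hat{R}_{st,sr}) = \dfrac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{sr,l}^2}{n_l}$
- Mean $\bar{Y}$: $\hat{\bar{Y}}_{st,sr} = \sum_{l=1}^{L}W_l\bar{X}_lr_l$, with $\mathrm{var}(\hat{\bar{Y}}_{st,sr}) = \sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{sr,l}^2}{n_l}$
- Total $Y$: $\hat{Y}_{st,sr} = N\sum_{l=1}^{L}W_l\bar{X}_lr_l$, with $\mathrm{var}(\hat{Y}_{st,sr}) = N^2\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{sr,l}^2}{n_l}$

Combined ratio estimator, with $s_{cr,l}^2 = s_{y_l}^2 - 2r_{st,cr}\hat{\rho}_ls_{x_l}s_{y_l} + r_{st,cr}^2s_{x_l}^2$ and $r_{st,cr} = \dfrac{\sum_{l=1}^{L}W_l\bar{y}_l}{\sum_{l=1}^{L}W_l\bar{x}_l}$:
- Ratio $R$: $\hat{R}_{st,cr} = \dfrac{\sum_{l=1}^{L}W_l\bar{y}_l}{\sum_{l=1}^{L}W_l\bar{x}_l} = r_{st,cr}$, with $\mathrm{var}(\hat{R}_{st,cr}) = \dfrac{1}{\bar{X}^2}\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{cr,l}^2}{n_l}$
- Mean $\bar{Y}$: $\hat{\bar{Y}}_{st,cr} = \bar{X}r_{st,cr}$, with $\mathrm{var}(\hat{\bar{Y}}_{st,cr}) = \sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{cr,l}^2}{n_l}$
- Total $Y$: $\hat{Y}_{st,cr} = N\bar{X}r_{st,cr}$, with $\mathrm{var}(\hat{Y}_{st,cr}) = N^2\sum_{l=1}^{L}W_l^2\left(1 - \dfrac{n_l}{N_l}\right)\dfrac{s_{cr,l}^2}{n_l}$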

10 pages
Statistics
No ratings yet
Statistics
31 pages
Measure of Central Tendency Lecture 123
No ratings yet
Measure of Central Tendency Lecture 123
33 pages
Quiz 2: Continue
No ratings yet
Quiz 2: Continue
15 pages
Data Analytics TB
No ratings yet
Data Analytics TB
1,944 pages
24T2 COMM5007 - Group - Assignment - P2
No ratings yet
24T2 COMM5007 - Group - Assignment - P2
5 pages
Module 4
No ratings yet
Module 4
18 pages
Bab Iv
No ratings yet
Bab Iv
20 pages
Data Analysis - An Introduction - Lewis-Beck, Michael S - 1995 - Thousand Oaks - Sage Publications - 9780803957725 - Anna's Archive
No ratings yet
Data Analysis - An Introduction - Lewis-Beck, Michael S - 1995 - Thousand Oaks - Sage Publications - 9780803957725 - Anna's Archive
92 pages
4.1-4.3 Quiz
No ratings yet
4.1-4.3 Quiz
3 pages
Statistic 6.4 Lesson and Assignment
No ratings yet
Statistic 6.4 Lesson and Assignment
7 pages
LESSON 4.2.1 Mean, Median and Mode
No ratings yet
LESSON 4.2.1 Mean, Median and Mode
17 pages
Chapter 1 Summary Univaraiate Data (002) (2)
No ratings yet
Chapter 1 Summary Univaraiate Data (002) (2)
44 pages
GCSE Statistics - Chapter 4
No ratings yet
GCSE Statistics - Chapter 4
2 pages
Sibd Questions Soved Theory
No ratings yet
Sibd Questions Soved Theory
14 pages
Chapter Testmeasureofvariation
No ratings yet
Chapter Testmeasureofvariation
7 pages
Statistical Analysis Measure of Variation
No ratings yet
Statistical Analysis Measure of Variation
14 pages
Measures of Dispersion Kurtosis and Skewness
No ratings yet
Measures of Dispersion Kurtosis and Skewness
19 pages
Unit 5
No ratings yet
Unit 5
17 pages
Measures of Central Tendency - Use This PDF
No ratings yet
Measures of Central Tendency - Use This PDF
30 pages
Measures of Shape_ STA 131 Note
No ratings yet
Measures of Shape_ STA 131 Note
13 pages

3 Ratio and regression estimators

3.1 Motivating examples

can be estimated by the ratio estimator

Relationship between estimates

3.2 Two characteristics per unit in SRS

Theorem: If Xi and Yi are a pair of numerical characteristics defined

Theorem: For large sample,

where $\bar{d} = \bar{y} - R\bar{x}$ is the sample mean of $d_i = y_i - R x_i$, $i = 1, \dots, n$,

$E(\bar{d}) = E(\bar{y} - R\bar{x}) = E(\bar{y}) - R\,E(\bar{x}) = \bar{Y} - R\bar{X} = \bar{Y} - \bar{Y}$

3.3 Ratio estimate for population mean and total

These ratio estimates use extra information of xi, i = 1, · · · , n and

(a) $E(\hat{\bar{Y}}_r) = \bar{X}\,E(r) \approx \bar{X}R = \bar{Y}$.
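
As an illustration of the ratio estimate of the mean, the following Python sketch computes r = ȳ/x̄, the estimate X̄r, and a large-sample standard error built from the residuals di = yi − r xi. The function name `ratio_estimate_mean` and its arguments are hypothetical, and the variance formula (1 − n/N) s_r²/n is the usual large-sample approximation, assumed here rather than quoted from these notes.

```python
from math import sqrt


def ratio_estimate_mean(y, x, X_bar, N):
    """Ratio estimate of the population mean of y under SRS without replacement.

    y, x  : paired sample values (same length n >= 2)
    X_bar : known population mean of x (assumed available)
    N     : population size
    Returns (r, mean_estimate, standard_error).
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("y and x must be paired samples with n >= 2")

    # Sample ratio r = sum(y) / sum(x) = y_bar / x_bar
    r = sum(y) / sum(x)

    # Ratio estimate of the mean: X_bar * r
    mean_estimate = X_bar * r

    # Residual variance s_r^2 = sum (y_i - r x_i)^2 / (n - 1)
    s2_r = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x)) / (n - 1)

    # Large-sample standard error with finite population correction
    standard_error = sqrt((1 - n / N) * s2_r / n)
    return r, mean_estimate, standard_error
```

Multiplying both the estimate and the standard error by N gives the corresponding quantities for the population total. When y is exactly proportional to x in the sample, all residuals vanish and the standard error is zero.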

Therefore for any ratio estimates,

Example: (7-11) The manager of 7-11 is interested in estimating the

The ordinary estimate of the total sale this year in thousands is

which is much smaller than $\mathrm{se}(\hat{Y}) = 7543.72$ thousands.

3.4 Regression estimator

Since $\hat{\bar{Y}}_r = \bar{X}\hat{R} = \bar{X}\,\bar{y}/\bar{x}$, the line $y = mx$ with slope $m = \bar{y}/\bar{x}$ passes

3. Since $\hat{\bar{Y}}_{reg} = \bar{y} + b(\bar{X} - \bar{x})$, the regression estimator adjusts the $\bar{y}$

(a) When the slope $b = 0$, the regression estimator $\hat{\bar{Y}}_{reg} = \bar{y}$ becomes

3.5 The Hartley-Ross Estimator

Proof: The LHS is

for which we need to know X̄ (or X = N X̄). This estimator contains

2. For small samples we might expect the Hartley-Ross estimator to be

Summary of estimators and variance estimates based on 1 SRS

Ord. Ratio Regression Hartley-Ross

ȳ ȳ ∗ N − 1 n(ȳ − r̄∗ x̄)

ȳ sxy N − 1 n(ȳ − r̄∗ x̄)

n s2y n s2 n s2y (1 − ρ̂2 )

ȳ sxy ∗ n(ȳ − r̄∗ x̄)

2 n s2y 2 n s2r 2 n s2y (1 − ρ̂2 )

The Hartley-Ross estimate of the total sale this year in thousands is

Read Tutorial 12 Q1(a).

The sample covariance matrix for y, x1 and x2 is

From the census data X̄1 = 3.91 and X̄2 = 542.

$= 547.8234 - 2(0.1739)(1796.5541) + 0.1739^2(17967.0541)$
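
The arithmetic in this line can be checked with a short Python sketch. It assumes the expression is the expansion s_y² − 2 r s_xy + r² s_x² of the residual variance, with the last coefficient taken as 0.1739 squared; the function name `residual_variance` and the labelling of the three numbers as s_y², s_xy and s_x² are assumptions, not taken from the notes.

```python
def residual_variance(s2_y, s_xy, s2_x, r):
    """Evaluate s_y^2 - 2 r s_xy + r^2 s_x^2.

    This equals the sample variance of the residuals d_i = y_i - r x_i
    when s2_y, s_xy and s2_x are the sample variance of y, the sample
    covariance of x and y, and the sample variance of x.
    """
    return s2_y - 2 * r * s_xy + r ** 2 * s2_x


# Numbers copied from the line above; their interpretation is assumed.
value = residual_variance(s2_y=547.8234, s_xy=1796.5541,
                          s2_x=17967.0541, r=0.1739)
print(round(value, 4))
```

Writing the expansion as a function keeps the three moments and the ratio separate, so the same check applies to the other auxiliary variable in the example.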

3.6 Ratio estimate for subpopulation in poststratification

The ratio estimator of mean in Cl and its variance estimate are

3.7 Ratio Estimation for Stratified SRS

2. Sum over Ybl and X

3.7.1 The ‘Separate’ Ratio Estimate

where Xl = pl X. The sum of squares of unequal proportions is higher

3.7.2 The ‘Combined’ Ratio Estimate

and in contrast to $\hat{R}_s$, it does not require the knowledge of individual

for separate ratio estimator.

for the separate ratio estimator R

and for the total are

Read Tutorial 12 Q2.

Estimators and variance estimates for stratified SRS (Ch.2)
