Lecture 14 Simple Random Sampling 3

The document discusses the estimation of population characteristics using simple random sampling (SRS), focusing on population mean, total, and variance. It explains the concepts of estimators, estimates, and the process of estimating unknown parameters, emphasizing that the sample mean is an unbiased estimator of the population mean. Additionally, it provides formulas for calculating population mean, total, and variance, as well as sample mean and variance.

Estimation of Population Characteristics: Simple Random Sampling

Suppose that we wish to collect information on a quantitative random variable $Y$ from a population of $N$ sampling units in order to study some unknown population characteristics, such as
• Population Mean
• Population Total
• Population Variance.
Let $u$ be a sampling unit selected at random, and let $y$ be the value (information) obtained from the sampling unit $u$. Hence $y$ is a random variable.
Also, suppose that a sample survey is conducted by selecting a simple random sample of size $n$. Since, under the SRS technique, the probability of selecting a specified sampling unit $u$ at any draw is $1/N$, the probability of obtaining the value $y$ from the sampling unit $u$ is also $1/N$. That is,
$$P(u) = \frac{1}{N} = P(y).$$
Notations:

Given Population Observations:
$$y_1, y_2, \cdots, y_i, \cdots, y_N$$

Population Mean:
$$E(Y) = \mu = \frac{1}{N}\sum_{i=1}^{N} y_i.$$

Population Total:
$$T = \sum_{i=1}^{N} y_i = N\mu.$$

Population Variance:
$$\begin{aligned}
\mathrm{Var}(Y) = \sigma^2 &= E[Y - E(Y)]^2 \\
&= E(Y - \mu)^2 = E(Y^2) - \mu^2 = E(Y^2) - [E(Y)]^2 \\
&= \frac{1}{N}\sum_{i=1}^{N} y_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} y_i\right)^2 \\
&= \frac{1}{N}\left(\sum_{i=1}^{N} y_i^2 - N\mu^2\right) = \frac{1}{N}\sum_{i=1}^{N}(y_i - \mu)^2.
\end{aligned}$$

That is,
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i - \mu)^2.$$

In sample surveys, the population variance is also often defined with the divisor $N - 1$ as
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \mu)^2.$$

That is,
$$S^2 = \frac{N}{N-1}\sigma^2.$$
• For a large population, the difference between $\sigma^2$ and $S^2$ is negligible.
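
As a small sketch of the population formulas above, the following Python snippet computes $\mu$, $T$, $\sigma^2$ and $S^2$ with exact fractions and checks the relation $S^2 = \frac{N}{N-1}\sigma^2$. The function name `population_summary` and the values 2, 4, 6, 8 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def population_summary(values):
    """Return (mu, T, sigma2, S2) for a finite population of numeric values."""
    N = len(values)
    total = sum(values)                      # population total T
    mu = Fraction(total) / N                 # population mean
    ss = sum((y - mu) ** 2 for y in values)  # sum of squared deviations from mu
    sigma2 = ss / N                          # variance with divisor N
    S2 = ss / (N - 1)                        # variance with divisor N - 1
    return mu, total, sigma2, S2

population = [2, 4, 6, 8]  # assumed example values
mu, T, sigma2, S2 = population_summary(population)
print(mu, T, sigma2, S2)
```

Exact fractions are used here so that the identity between $S^2$ and $\sigma^2$ can be checked without rounding error.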

Sample Observations:
$$Y_1, Y_2, \cdots, Y_j, \cdots, Y_n$$

Sample Mean:
$$\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j.$$

Sample Variance:
$$s^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(Y_j - \bar{Y}\right)^2.$$

Remark
A given (observed) set of sample observations is usually denoted by lowercase letters:
$$y_1, y_2, \cdots, y_j, \cdots, y_n.$$
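
For an observed sample, the sample mean and sample variance can be computed with Python's standard `statistics` module, as in the short sketch below. The sample values are hypothetical. Note that `statistics.variance` uses the divisor $n - 1$ (matching $s^2$), while `statistics.pvariance` uses the divisor $n$ and is shown only for contrast.

```python
import statistics

sample = [4, 8, 6]  # hypothetical observed sample, n = 3

y_bar = statistics.mean(sample)               # sample mean: sum / n
s2 = statistics.variance(sample)              # sample variance: divisor n - 1
divisor_n = statistics.pvariance(sample)      # divisor n, shown only for contrast

print(y_bar, s2, divisor_n)
```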

Note
1. Estimator: An estimator is a function of the sample observations that contains no unknown quantity (i.e. a statistic) and is used to find an approximate value for an unknown parameter. Note that an estimator is a random variable.

2. Estimate: The value of an estimator obtained from a given set of sample observations is called an estimate. The estimate is used as an approximate value for the unknown parameter.

3. Estimation: Estimation is a statistical technique for obtaining estimators of unknown parameters.

4. Unbiased Estimator: An estimator is said to be an unbiased estimator of an unknown parameter if the expected value of the estimator is equal to the unknown parameter. Suppose that $\theta$ is an unknown parameter and $\hat{\theta}$ is an estimator of $\theta$. The estimator $\hat{\theta}$ is an unbiased estimator if
$$E(\hat{\theta}) = \theta.$$
The amount of bias of $\hat{\theta}$ is measured as
$$\mathrm{Bias} = E(\hat{\theta}) - \theta.$$

Estimation of Population Mean

Suppose that the population mean of the random variable of interest 𝑌𝑌 is unknown.
That is, the population mean 𝜇𝜇 is an unknown parameter.

• The sample mean is used as an estimator of the unknown population mean.

• The sample mean value obtained from a given set of sample observations is the estimate of the unknown population mean. That is, the sample mean value approximates the unknown population mean.

• $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$ is an estimator of the unknown population mean $\mu$.

• $\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$ is an estimate of the unknown population mean $\mu$.

What is $E(\bar{Y})$?

The expected value of the sample mean, $E(\bar{Y})$, is the arithmetic mean of the values of $\bar{Y}$ over all possible samples. It can be computed as follows.

1. Obtain all possible samples of the same size $n$ from a population of size $N$. In SRSWOR, there are ${}^{N}C_{n}$ samples, and in SRSWR, there are $N^n$ samples.
2. Assign values to the selected sampling units for each sample.
3. Compute the sample mean value for each sample.
4. Compute the arithmetic mean of the sample mean values obtained in step 3:
$$E(\bar{Y}) = \frac{1}{K}\sum_{u=1}^{K}\bar{y}_u,$$
where
$$K = \begin{cases} {}^{N}C_{n} & \text{in SRSWOR} \\ N^n & \text{in SRSWR.} \end{cases}$$
This expectation can also be computed from the probability distribution of $\bar{Y}$ (known as the sampling distribution of $\bar{Y}$). Let there be $k$ distinct values of $\bar{Y}$, and let $p_u$ be the probability that the sample mean value is $\bar{y}_u$, $u = 1, \cdots, k$. Then
$$E(\bar{Y}) = \sum_{u=1}^{k}\bar{y}_u\, p_u.$$
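
The four steps above can be sketched in Python by enumerating every possible sample. This is only an illustrative sketch: the function names and the population values are assumptions, `itertools.combinations` is used to list the ${}^{N}C_{n}$ unordered SRSWOR samples, and `itertools.product` is used to list the $N^n$ ordered SRSWR samples.

```python
from fractions import Fraction
from itertools import combinations, product

def all_sample_means(population, n, with_replacement=False):
    """Steps 1-3: list the sample mean of every possible sample of size n."""
    if with_replacement:
        samples = product(population, repeat=n)  # N**n ordered samples (SRSWR)
    else:
        samples = combinations(population, n)    # C(N, n) unordered samples (SRSWOR)
    return [Fraction(sum(s), n) for s in samples]

def expected_sample_mean(population, n, with_replacement=False):
    """Step 4: arithmetic mean of all K sample means."""
    means = all_sample_means(population, n, with_replacement)
    return sum(means) / len(means)

population = [3, 7, 8, 10]  # hypothetical values with population mean 7
print(expected_sample_mean(population, 2))                         # SRSWOR
print(expected_sample_mean(population, 2, with_replacement=True))  # SRSWR
```

Full enumeration is only practical for very small $N$ and $n$, since the number of samples grows quickly.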

Property

The sample mean obtained in simple random sampling (with or without replacement) is an unbiased estimator of the population mean.

Proof:
Suppose that a simple random sample of size $n$ is drawn from a population having $N$ sampling units. Suppose that the given population observations are
$$y_1, \cdots, y_i, \cdots, y_N$$
and the sample observations are
$$Y_1, \cdots, Y_j, \cdots, Y_n.$$
Let $\mu$ be the unknown population mean, defined as
$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i,$$
and let the sample mean $\bar{Y}$ be defined as
$$\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j.$$
The sample mean $\bar{Y}$ is an estimator of $\mu$. Now, by the linearity of expectation,
$$E(\bar{Y}) = \frac{1}{n}\sum_{j=1}^{n} E(Y_j),$$
where
$$E(Y_j) = \sum_{i=1}^{N} y_i P(y_i) = \sum_{i=1}^{N} y_i \frac{1}{N} = \frac{1}{N}\sum_{i=1}^{N} y_i = \mu.$$
Therefore,
$$E(\bar{Y}) = \frac{1}{n}\sum_{j=1}^{n}\mu = \frac{n\mu}{n} = \mu.$$
It implies that the sample mean is an unbiased estimator of the unknown population mean.
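
The property can also be checked empirically. The sketch below is a Monte Carlo simulation, not a proof: it repeatedly draws simple random samples and averages the resulting sample means, which should land close to $\mu$. The population values, the number of repetitions, the seed and the function name are all assumptions for illustration; `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random

def simulate_mean_of_sample_means(population, n, reps, with_replacement, seed=0):
    """Average the sample mean over many simulated simple random samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(reps):
        if with_replacement:
            sample = rng.choices(population, k=n)  # SRSWR
        else:
            sample = rng.sample(population, n)     # SRSWOR
        total += sum(sample) / n
    return total / reps

population = [2, 4, 6, 8]  # assumed example values
mu = sum(population) / len(population)
wor = simulate_mean_of_sample_means(population, 2, 20000, False)
wr = simulate_mean_of_sample_means(population, 2, 20000, True)
print(mu, wor, wr)  # both simulated averages should be close to mu
```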

Illustration

Suppose that an SRS of size 2 is drawn without replacement from a population of size 4. Also, suppose that the values obtained from sampling units 1, 2, 3, and 4 are 2, 4, 6, and 8, respectively.
   i. Find the population mean.
   ii. Find all possible samples with their sample values. Also, find all possible sample mean values.
   iii. Show that the sample mean is an unbiased estimator of the population mean.
   iv. Find the sampling distribution of the sample mean. Hence show that it is an unbiased estimator of the population mean.

Solution

(i) The population mean is
$$\mu = \frac{1}{4}(2 + 4 + 6 + 8) = 5.$$

(ii) Number of samples $= {}^{4}C_{2} = \frac{4!}{2!\,2!} = 6.$

Serial Number    Sampling Units    Sample Observations    Mean ($\bar{Y}$)
1                (1, 2)            2, 4                   3
2                (1, 3)            2, 6                   4
3                (1, 4)            2, 8                   5
4                (2, 3)            4, 6                   5
5                (2, 4)            4, 8                   6
6                (3, 4)            6, 8                   7

(iii)
$$E(\bar{Y}) = \frac{1}{6}(3 + 4 + 5 + 5 + 6 + 7) = 5 \; (= \mu).$$

Therefore, the sample mean ($\bar{Y}$) is an unbiased estimator of the population mean.

(iv) The possible values of the sample mean are $\bar{y} = 3, 4, 5, 6, 7$. The sampling distribution of the sample mean is

$\bar{y}$       3      4      5      6      7
$P(\bar{y})$    1/6    1/6    2/6    1/6    1/6

Then the expected value of $\bar{Y}$ is
$$E(\bar{Y}) = \sum_{j=1}^{5}\bar{y}_j P(\bar{y}_j) = 3\times\frac{1}{6} + 4\times\frac{1}{6} + 5\times\frac{2}{6} + 6\times\frac{1}{6} + 7\times\frac{1}{6} = \frac{30}{6} = 5 \; (= \mu).$$
Therefore, the sample mean ($\bar{Y}$) is an unbiased estimator of the population mean.
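
The sampling distribution in part (iv) can be reproduced with a short Python sketch that counts how often each sample mean occurs among the six SRSWOR samples. The dictionary `values` (sampling unit to observed value) and the other variable names are assumptions introduced for this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

values = {1: 2, 2: 4, 3: 6, 4: 8}  # sampling unit -> observed value
n = 2

samples = list(combinations(values, n))  # all pairs of sampling units
means = [Fraction(sum(values[u] for u in s), n) for s in samples]

# Sampling distribution: P(mean) = (number of samples with that mean) / K
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

expected = sum(m * p for m, p in distribution.items())
for m, p in distribution.items():
    print(m, p)
print("E(Y_bar) =", expected)
```

The probability 2/6 appears in reduced form as 1/3 because `Fraction` always simplifies.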

Exercise

Solve the above problem if SRS is done with replacement.

Property

The population variance of the sample mean obtained in simple random sampling without replacement is given by
$$\mathrm{Var}(\bar{Y}) = (1 - f)\frac{S^2}{n},$$
where $f = n/N$ is the sampling fraction and
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \mu)^2 = \frac{N}{N-1}\sigma^2.$$

Proof:
Suppose that a simple random sample of size $n$ is drawn without replacement from a population having $N$ sampling units. Suppose that the given population observations are
$$y_1, \cdots, y_i, \cdots, y_N$$
and the sample observations are
$$Y_1, \cdots, Y_j, \cdots, Y_n.$$
Let $\mu$ be the unknown population mean, defined as
$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i,$$
and let the sample mean $\bar{Y}$ be defined as
$$\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j.$$
The population variance of $\bar{Y}$ is then
$$\begin{aligned}
\mathrm{Var}(\bar{Y}) &= \mathrm{Var}\left(\frac{1}{n}\sum_{j=1}^{n} Y_j\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{j=1}^{n} Y_j\right) \\
&= \frac{1}{n^2}\left[\sum_{j=1}^{n}\mathrm{Var}(Y_j) + \sum_{i\neq j}^{n}\mathrm{Cov}(Y_i, Y_j)\right] \\
&= \frac{1}{n^2}\left[\sum_{j=1}^{n}\sigma^2 + \sum_{i\neq j}^{n}\mathrm{Cov}(Y_i, Y_j)\right].
\end{aligned}$$

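Now, for $i \neq j$,
$$\begin{aligned}
\mathrm{Cov}(Y_i, Y_j) &= E\left[\{Y_i - E(Y_i)\}\{Y_j - E(Y_j)\}\right] \\
&= E\left[(Y_i - \mu)(Y_j - \mu)\right] \\
&= \frac{1}{N(N-1)}\sum_{i\neq j}^{N}(y_i - \mu)(y_j - \mu),
\end{aligned}$$
because, in sampling without replacement, each ordered pair of distinct population units is obtained at two given draws with probability $\frac{1}{N(N-1)}$.

Since $\left(\sum_{i=1}^{k} x_i\right)^2 = \sum_{i=1}^{k} x_i^2 + \sum_{i\neq j}^{k} x_i x_j$,
$$\left(\sum_{i=1}^{N}(y_i - \mu)\right)^2 = \sum_{i=1}^{N}(y_i - \mu)^2 + \sum_{i\neq j}^{N}(y_i - \mu)(y_j - \mu).$$
It is known that the sum of deviations of observations from their mean is always zero, so the left-hand side is zero. Therefore,
$$\sum_{i\neq j}^{N}(y_i - \mu)(y_j - \mu) = -\sum_{i=1}^{N}(y_i - \mu)^2.$$

Finally,
$$\mathrm{Cov}(Y_i, Y_j) = -\frac{1}{N(N-1)}\sum_{i=1}^{N}(y_i - \mu)^2 = -\frac{\sigma^2}{N-1}.$$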

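Therefore,
$$\begin{aligned}
\mathrm{Var}(\bar{Y}) &= \frac{1}{n^2}\left[\sum_{j=1}^{n}\sigma^2 + \sum_{i\neq j}^{n}\left(-\frac{\sigma^2}{N-1}\right)\right] \\
&= \frac{n\sigma^2}{n^2} - \frac{n(n-1)}{n^2(N-1)}\sigma^2 = \frac{\sigma^2}{n}\left(1 - \frac{n-1}{N-1}\right) \\
&= \frac{\sigma^2}{n}\times\frac{N-n}{N-1} \\
&= \frac{(N-1)S^2}{N}\times\frac{1}{n}\times\frac{N-n}{N-1} \\
&= \frac{N-n}{N}\times\frac{S^2}{n} = \left(1 - \frac{n}{N}\right)\frac{S^2}{n} = (1 - f)\frac{S^2}{n},
\end{aligned}$$
i.e.
$$\mathrm{Var}(\bar{Y}) = (1 - f)\frac{S^2}{n}.$$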
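
The key step of the proof, $\mathrm{Cov}(Y_i, Y_j) = -\sigma^2/(N-1)$, can be checked numerically. The sketch below assumes a small hypothetical population and uses `itertools.permutations` to list every ordered pair of distinct units, each of which has probability $1/(N(N-1))$ at two given draws under SRSWOR. It then plugs the covariance into the variance expansion and compares the result with $(1-f)S^2/n$.

```python
from fractions import Fraction
from itertools import permutations

population = [1, 3, 5, 11]  # hypothetical values
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((y - mu) ** 2 for y in population) / N
S2 = sigma2 * N / (N - 1)

# Joint distribution of two different draws: uniform over ordered pairs
# of distinct units.
pairs = list(permutations(population, 2))
cov = sum((a - mu) * (b - mu) for a, b in pairs) / len(pairs)

# Variance expansion: n variance terms plus n(n - 1) covariance terms.
var_from_cov = (n * sigma2 + n * (n - 1) * cov) / n**2
var_formula = (1 - Fraction(n, N)) * S2 / n

print(cov, -sigma2 / (N - 1))
print(var_from_cov, var_formula)
```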
Illustration

Suppose that an SRS of size 2 is drawn without replacement from a population of size 4. Also, suppose that the values obtained from sampling units 1, 2, 3, and 4 are 2, 4, 6, and 8, respectively. Verify that the population variance of the sample mean, $\mathrm{Var}(\bar{Y})$, satisfies
$$\mathrm{Var}(\bar{Y}) = (1 - f)\frac{S^2}{n}.$$
Solution

The population mean is
$$\mu = \frac{1}{4}(2 + 4 + 6 + 8) = 5.$$
Then,
$$S^2 = \frac{1}{4-1}\left[(2-5)^2 + (4-5)^2 + (6-5)^2 + (8-5)^2\right] = \frac{20}{3} \approx 6.67.$$
The sampling fraction is
$$f = \frac{n}{N} = \frac{2}{4} = 0.5.$$
Therefore,
$$(1 - f)\frac{S^2}{n} = (1 - 0.5)\times\frac{6.67}{2} \approx 1.67.$$
The possible number of samples $= {}^{4}C_{2} = \frac{4!}{2!\,2!} = 6.$

Serial Number    Sampling Units    Sample Observations    Mean ($\bar{Y}$)
1                (1, 2)            2, 4                   3
2                (1, 3)            2, 6                   4
3                (1, 4)            2, 8                   5
4                (2, 3)            4, 6                   5
5                (2, 4)            4, 8                   6
6                (3, 4)            6, 8                   7

That is, the values of the sample mean are 3, 4, 5, 5, 6, 7. The population mean of the sample mean is
$$\frac{1}{6}(3 + 4 + 5 + 5 + 6 + 7) = 5.$$
Then, the population variance of the sample mean is
$$\mathrm{Var}(\bar{Y}) = \frac{1}{6}\left[(3-5)^2 + (4-5)^2 + (5-5)^2 + (5-5)^2 + (6-5)^2 + (7-5)^2\right] = \frac{10}{6} \approx 1.67.$$
Hence
$$\mathrm{Var}(\bar{Y}) = (1 - f)\frac{S^2}{n}.$$
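
The arithmetic of this illustration can be double-checked with exact fractions. The sketch below computes $(1-f)S^2/n$ from the formula and, separately, the variance of the six enumerated sample means; both come out as $5/3 \approx 1.67$. Variable names are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8]
N, n = len(population), 2

mu = Fraction(sum(population), N)
S2 = sum((y - mu) ** 2 for y in population) / (N - 1)
f = Fraction(n, N)                      # sampling fraction
var_formula = (1 - f) * S2 / n          # (1 - f) S^2 / n

# Direct computation over all C(N, n) equally likely samples.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
e_mean = sum(means) / len(means)
var_enum = sum((m - e_mean) ** 2 for m in means) / len(means)

print(S2, var_formula, var_enum)
```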

Property

The population variance of the sample mean obtained in simple random sampling with replacement is given by
$$\mathrm{Var}(\bar{Y}) = \frac{\sigma^2}{n},$$
where
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i - \mu)^2 = \frac{N-1}{N}S^2.$$

Proof:
Suppose that a simple random sample of size $n$ is drawn with replacement from a population having $N$ sampling units. Suppose that the given population observations are
$$y_1, \cdots, y_i, \cdots, y_N$$
and the sample observations are
$$Y_1, \cdots, Y_j, \cdots, Y_n.$$
Let $\mu$ be the unknown population mean, defined as
$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i,$$
and let the sample mean $\bar{Y}$ be defined as
$$\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j.$$
The population variance of $\bar{Y}$ is then
$$\begin{aligned}
\mathrm{Var}(\bar{Y}) &= \mathrm{Var}\left(\frac{1}{n}\sum_{j=1}^{n} Y_j\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{j=1}^{n} Y_j\right) \\
&= \frac{1}{n^2}\left[\sum_{j=1}^{n}\mathrm{Var}(Y_j) + \sum_{i\neq j}^{n}\mathrm{Cov}(Y_i, Y_j)\right] \\
&= \frac{1}{n^2}\left[\sum_{j=1}^{n}\sigma^2 + \sum_{i\neq j}^{n}\mathrm{Cov}(Y_i, Y_j)\right].
\end{aligned}$$
In simple random sampling with replacement, the selections of sampling units are independent of each other. Hence, the observations obtained from the sampling units are independent among themselves. Therefore,
$$\mathrm{Cov}(Y_i, Y_j) = 0, \quad \forall\, i \neq j.$$
Finally,
$$\mathrm{Var}(\bar{Y}) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$
In other words,
$$\mathrm{Var}(\bar{Y}) = \frac{N-1}{N}\times\frac{S^2}{n}.$$
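
As a numerical check of this result, the sketch below enumerates all $N^n$ equally likely ordered samples of an SRSWR design with `itertools.product`, confirms that the covariance between the two draws is zero, and compares the variance of the sample means with $\sigma^2/n$. The population values are hypothetical and the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

population = [1, 3, 5, 11]  # hypothetical values
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((y - mu) ** 2 for y in population) / N

# All N**n ordered samples are equally likely under SRSWR.
samples = list(product(population, repeat=n))

# Covariance between the first and second draw (zero by independence).
cov = sum((a - mu) * (b - mu) for a, b in samples) / len(samples)

means = [Fraction(sum(s), n) for s in samples]
e_mean = sum(means) / len(means)
var_enum = sum((m - e_mean) ** 2 for m in means) / len(means)

print(cov, var_enum, sigma2 / n)
```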

Exercise

Suppose that an SRS of size 2 is drawn with replacement from a population of size 4. Also, suppose that the values obtained from sampling units 1, 2, 3, and 4 are 2, 4, 6, and 8, respectively. Verify that the population variance of the sample mean, $\mathrm{Var}(\bar{Y})$, satisfies
$$\mathrm{Var}(\bar{Y}) = \frac{\sigma^2}{n}.$$

Efficient Estimator:

Suppose that there are two estimators of the same unknown parameter $\theta$. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two competing unbiased estimators of $\theta$. The estimator $\hat{\theta}_1$ is said to be a more efficient estimator of $\theta$ than $\hat{\theta}_2$ if the population variance of $\hat{\theta}_1$ is smaller than the population variance of $\hat{\theta}_2$, i.e.
$$\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2).$$

Theorem
SRSWOR provides a more efficient estimator of the population mean than SRSWR.

Proof:
Suppose that a simple random sample of size $n$ is drawn from a population having $N$ sampling units. Suppose that the given population observations are
$$y_1, \cdots, y_i, \cdots, y_N$$
and the sample observations are
$$Y_1, \cdots, Y_j, \cdots, Y_n.$$
Let $\bar{Y}_{WOR}$ be the sample mean when SRS is done without replacement and $\bar{Y}_{WR}$ be the sample mean when SRS is done with replacement. Note that both $\bar{Y}_{WOR}$ and $\bar{Y}_{WR}$ are unbiased estimators of the population mean $\mu$. The population variance of $\bar{Y}_{WOR}$ is given by
$$\mathrm{Var}(\bar{Y}_{WOR}) = (1 - f)\frac{S^2}{n}$$
and the population variance of $\bar{Y}_{WR}$ is given by
$$\mathrm{Var}(\bar{Y}_{WR}) = \frac{N-1}{N}\,\frac{S^2}{n}.$$
Now, with $f = n/N$,
$$\begin{aligned}
\mathrm{Var}(\bar{Y}_{WR}) - \mathrm{Var}(\bar{Y}_{WOR}) &= \frac{N-1}{N}\,\frac{S^2}{n} - (1 - f)\frac{S^2}{n} \\
&= \frac{S^2}{n}\left(\frac{N-1}{N} - 1 + f\right) \\
&= \frac{S^2}{n}\left(\frac{N-1}{N} - 1 + \frac{n}{N}\right) = \frac{S^2}{n}\left(\frac{N - 1 - N + n}{N}\right) \\
&= \frac{S^2}{n}\,\frac{n-1}{N} = \frac{(n-1)S^2}{Nn} > 0 \quad \text{for } n > 1.
\end{aligned}$$
It implies that
$$\mathrm{Var}(\bar{Y}_{WR}) > \mathrm{Var}(\bar{Y}_{WOR}).$$

Since the variance of the sample mean obtained from SRSWOR is less than that of the sample mean obtained from SRSWR, SRSWOR provides a more efficient estimator of the population mean than SRSWR. (For $n = 1$ the two variances coincide.)
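
The comparison can be tabulated for a range of sample sizes. The sketch below assumes hypothetical values $N = 10$ and $S^2 = 25/2$, and the helper names `var_wor` and `var_wr` are introduced only for this illustration. It evaluates both variance formulas and checks that their difference equals $(n-1)S^2/(Nn)$.

```python
from fractions import Fraction

def var_wor(S2, N, n):
    """Variance of the sample mean under SRSWOR: (1 - n/N) * S^2 / n."""
    return (1 - Fraction(n, N)) * S2 / n

def var_wr(S2, N, n):
    """Variance of the sample mean under SRSWR: ((N - 1)/N) * S^2 / n."""
    return Fraction(N - 1, N) * S2 / n

N, S2 = 10, Fraction(25, 2)  # hypothetical population size and S^2

for n in range(1, N + 1):
    diff = var_wr(S2, N, n) - var_wor(S2, N, n)
    assert diff == (n - 1) * S2 / (N * n)  # matches the derived difference
    print(n, var_wor(S2, N, n), var_wr(S2, N, n), diff)
```

In this sketch the difference is zero at $n = 1$ and positive for every larger $n$, and the SRSWOR variance falls to zero at $n = N$, where the sample is the whole population.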

Exercise

Suppose that an SRS of size 2 is drawn from a population of size 4. Also, suppose that the values obtained from sampling units 1, 2, 3, and 4 are 2, 4, 6, and 8, respectively. Using this information, show that SRSWOR provides a more efficient estimator of the population mean than SRSWR.
