EIE2001 Lecture 6b Week 7

This document discusses simple random sampling (SRS), including how to conduct SRS, how to estimate the mean, total, and proportion of a population from an SRS, and how to determine the required sample size for estimating the mean, total, and proportion with a given bound on the error of estimation. SRS involves randomly selecting units from a population where each unit has an equal chance of being selected.

Uploaded by

u2102965
EIE2001 Lecture 6b: Simple Random Sampling (SRS)

Introduction
SRS is a type of probability sampling in which the units composing a population are
assigned numbers. A set of random numbers is then generated, and the units having those
numbers are included in the sample.

SRS is the basis of sampling theory. It is also used as an approximation to more complex
sampling designs.

SRS is suitable for selecting a small sample from a small population in a small geographical
area where a sampling frame is available. Each member of the population has an equal
chance of being selected. It is not suitable for a very large population whose
elements are spread over a wide geographical area.
- A sampling frame is required to apply a probability sampling design.

SRS is widely used with other complex sampling designs such as stratified random
sampling and cluster sampling. In stratified random sampling, SRS is used to select
elements from each stratum. SRS is also used to select clusters and elements within
selected clusters in multistage sampling.

How to select a simple random sample


1. List all elements in the frame.
2. Determine the sample size, say 20 students out of 300.
3. Refer to a random number table or generate random numbers in EXCEL (RAND and
RANDBETWEEN functions). Read three digits from any column or row and select 20
numbers from 001 to 300. If a number has already been selected, skip it and draw another,
because sampling is normally done without replacement (an element can be selected only once).
4. Use EXCEL or SPSS to do the selection.
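
The selection steps above can also be carried out in code. The following is a minimal Python sketch, not part of the original notes. It assumes the frame is simply the labels 1 to 300; the function name `draw_srs` and the seed value are illustrative choices.

```python
import random

def draw_srs(frame_size, sample_size, seed=None):
    """Return sorted 1-based unit labels for an SRS drawn without replacement."""
    rng = random.Random(seed)
    labels = range(1, frame_size + 1)
    # random.sample draws without replacement, so no label can repeat
    return sorted(rng.sample(labels, sample_size))

# 20 students out of a frame of 300; the seed only makes the draw repeatable
selected = draw_srs(frame_size=300, sample_size=20, seed=2001)
print(selected)
```

Fixing the seed is optional; it is used here only so that the same sample can be reproduced later.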

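Estimating Mean, Total and Proportion

Population (parameter) and sample (estimator):

Mean
  Population: $\mu = \bar{Y} = \dfrac{\sum_{i=1}^{N} y_i}{N}$
  Sample: $\hat{\mu} = \bar{y} = \dfrac{\sum_{i=1}^{n} y_i}{n}$

Total
  Population: $\tau = Y = \sum_{i=1}^{N} y_i = y_1 + y_2 + \dots + y_N$
  Sample: $\hat{\tau} = \hat{Y} = N\bar{y} = N\,\dfrac{\sum_{i=1}^{n} y_i}{n}$

Proportion
  Population: $p = \mu$
  Sample: $\hat{p} = \bar{y}$

Estimator of the population mean

$$\hat{\mu} = \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{4.1}$$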
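Estimated variance of $\bar{y}$

$$\hat{V}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n} \tag{4.2}$$

where

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$

Bound on the error of estimation, B

$$B = 2\sqrt{\hat{V}(\bar{y})} = 2\sqrt{\left(1 - \frac{n}{N}\right)\frac{s^2}{n}} \tag{4.3}$$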

$1 - \frac{n}{N}$ is known as the finite population correction factor (fpc). It is a correction factor
which, when multiplied by the with-replacement variance, gives the without-replacement
variance.

Example:

An SRS of 200 accounts is drawn from a total of 1,000 accounts. The sample mean is $\bar{y}$ = RM 94.22 and
the sample variance is $s^2$ = 445.21. Estimate the population mean $\mu$, the bound on the error of
estimation and the 95% confidence interval.

$\hat{\mu} = \bar{y}$ = RM 94.22

$$\hat{V}(\bar{y}) = \left(1 - \frac{200}{1000}\right)\frac{445.21}{200} = 1.781, \qquad \sqrt{\hat{V}(\bar{y})} = \text{RM } 1.335$$

Bound on error = $2\sqrt{\hat{V}(\bar{y})}$ = RM 2.67

The 95% confidence interval for the population mean is RM 94.22 ± 2.67 = (91.55, 96.89).
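
The calculation in this example can be checked with a few lines of Python. This is a sketch under the assumption that only the summary statistics (sample mean, sample variance, n and N) are available; the function name `estimate_mean` is illustrative and not from the notes.

```python
import math

def estimate_mean(ybar, s2, n, N):
    """Estimated variance (4.2), bound (4.3) and interval for the population mean."""
    fpc = 1 - n / N                  # finite population correction
    var_hat = fpc * s2 / n           # estimated variance of the sample mean
    bound = 2 * math.sqrt(var_hat)   # bound on the error of estimation
    return var_hat, bound, (ybar - bound, ybar + bound)

var_hat, bound, ci = estimate_mean(ybar=94.22, s2=445.21, n=200, N=1000)
print(round(var_hat, 3), round(bound, 2), [round(x, 2) for x in ci])
```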
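Estimator of the population total

$$\hat{\tau} = N\bar{y} = \frac{N\sum_{i=1}^{n} y_i}{n} \tag{4.4}$$

Estimated variance of $\hat{\tau}$

$$\hat{V}(\hat{\tau}) = \hat{V}(N\bar{y}) = N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n} \tag{4.5}$$

Bound on the error of estimation, B

$$B = 2\sqrt{\hat{V}(N\bar{y})} = 2\sqrt{N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n}} \tag{4.6}$$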

Example:
An industry wishes to estimate the total time spent by its scientists on trivial jobs. A
sample survey of 50 scientists selected by SRS shows that they spent an average of 10.31
hours on such jobs, with a sample variance of $s^2$ = 2.25. If the industry has 750 scientists, estimate
the total amount of time spent by scientists on trivial jobs and the 95% confidence interval.

$\hat{\tau} = N\bar{y}$ = 750(10.31) = 7732.5 hours

$$\hat{V}(\hat{\tau}) = 750^2\left(1 - \frac{50}{750}\right)\frac{2.25}{50} = 23625$$

Bound on error = $2\sqrt{\hat{V}(\hat{\tau})}$ = 307.41

The 95% confidence interval is 7732.5 ± 307.41 hours = (7425.09, 8039.91).
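
The same steps for a population total can be sketched in Python. The function name `estimate_total` is an illustrative assumption; the inputs are the summary figures quoted in the example.

```python
import math

def estimate_total(ybar, s2, n, N):
    """Estimated total (4.4), its estimated variance (4.5) and bound (4.6)."""
    tau_hat = N * ybar                       # scale the sample mean up to the population
    var_hat = N**2 * (1 - n / N) * s2 / n    # N^2 times the variance of the sample mean
    bound = 2 * math.sqrt(var_hat)
    return tau_hat, var_hat, bound

tau_hat, var_hat, bound = estimate_total(ybar=10.31, s2=2.25, n=50, N=750)
print(round(tau_hat, 1), round(var_hat, 1), round(bound, 2))
print(round(tau_hat - bound, 2), round(tau_hat + bound, 2))
```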

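Estimator of the population proportion p

$y_i$ = 1 if the element possesses a certain characteristic, 0 if not.

$$\hat{p} = \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{4.14}$$

Estimated variance of $\hat{p}$

$$\hat{V}(\hat{p}) = \left(1 - \frac{n}{N}\right)\frac{\hat{p}\hat{q}}{n - 1} \tag{4.15}$$

where

$$\hat{q} = 1 - \hat{p}$$

Bound on the error of estimation, B

$$B = 2\sqrt{\hat{V}(\hat{p})} = 2\sqrt{\left(1 - \frac{n}{N}\right)\frac{\hat{p}\hat{q}}{n - 1}} \tag{4.16}$$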
Note: A proportion is a special case of the mean in which the outcome is dichotomous (0 or 1). The
element variance $s^2$ used for the mean corresponds to $\hat{p}\hat{q}$ for a proportion. Ignoring the fpc, the
estimated variance of the sampling distribution of the mean is $s^2/n$, while the estimated variance
of the sampling distribution of the proportion is $\hat{p}\hat{q}/(n-1)$.
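
The divisor n − 1 for the proportion follows from the formula for the sample variance. As a short check (a derivation sketch, not taken from the notes): for 0/1 data $y_i^2 = y_i$, so $\sum y_i^2 = n\hat{p}$, and with $\hat{q} = 1 - \hat{p}$

$$
s^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n-1}
    = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n-1}
    = \frac{n\hat{p} - n\hat{p}^2}{n-1}
    = \frac{n\hat{p}\hat{q}}{n-1},
\qquad
\frac{s^2}{n} = \frac{\hat{p}\hat{q}}{n-1}.
$$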

Example:
N = 300 students, n = 100
Estimate the proportion of students who are required to take a supplementary examination, coded
“1” if “yes” and “0” if “no”.

Student    y_i
1          1
2          0
3          0
4          0
…          …
98         1
99         0
100        1
Total      Σ y_i = 15

$$\hat{p} = \frac{\sum y_i}{n} = \frac{15}{100} = 0.15 = 15\%$$

$$\hat{V}(\hat{p}) = \left(1 - \frac{100}{300}\right)\frac{(0.15)(0.85)}{99} = 0.000859$$

Margin of error or bound on error = $2\sqrt{\hat{V}(\hat{p})}$ = 0.059

95% confidence interval: 0.15 ± 0.059 = (0.091, 0.209), or 9.1% to 20.9%.
This is a rather wide interval, so the estimate is not precise,
because the sample size is relatively small. With a bigger sample size, the margin of error can be
reduced and the estimate will be more precise.
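
The proportion example can be reproduced from 0/1 data. In this Python sketch the list `y` is a hypothetical stand-in for the 100 coded responses: only the total of 15 ones is given in the notes, so the ordering is assumed. The function name `estimate_proportion` is illustrative.

```python
import math

def estimate_proportion(y, N):
    """Estimated proportion (4.14), its variance (4.15) and bound (4.16) from 0/1 data."""
    n = len(y)
    p_hat = sum(y) / n
    q_hat = 1 - p_hat
    var_hat = (1 - n / N) * p_hat * q_hat / (n - 1)
    bound = 2 * math.sqrt(var_hat)
    return p_hat, var_hat, bound

# Hypothetical coding: 15 students coded 1 and 85 coded 0 (order does not matter)
y = [1] * 15 + [0] * 85
p_hat, var_hat, bound = estimate_proportion(y, N=300)
print(p_hat, round(var_hat, 6), round(bound, 3))
print(round(p_hat - bound, 3), round(p_hat + bound, 3))
```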

Determining Sample Size with a Bound on the Error of Estimation, B
A sample that is too large costs more and takes longer to collect. On the
other hand, if the sample is too small, the estimates are imprecise. It is therefore
necessary to determine the sample size for a fixed cost and margin of error.

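Sample size required to estimate population mean µ

The margin of error is calculated as:

$$B = 2\sqrt{V(\bar{y})}$$

The estimated variance is

$$\hat{V}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}$$

and the true variance is

$$V(\bar{y}) = \frac{\sigma^2}{n}\left(\frac{N - n}{N - 1}\right)$$

Rearranging the terms, we get:

$$n = \frac{N\sigma^2}{(N - 1)D + \sigma^2} \tag{4.11}$$

where

$$D = \frac{B^2}{4}$$

Note: $\sigma^2$ is usually estimated with the sample variance $s^2$ from past studies. $\sigma$ can also be
estimated as follows:
$\sigma$ ≈ range/4, where range is the maximum value minus the
minimum value.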
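
The rearrangement can be spelled out step by step. The following is a derivation sketch that fills in the algebra, writing $\sigma^2$ for the population variance and setting $D = B^2/4$:

$$
B = 2\sqrt{V(\bar{y})}
\;\Rightarrow\;
D = \frac{B^2}{4} = V(\bar{y}) = \frac{\sigma^2}{n}\cdot\frac{N - n}{N - 1}
$$

$$
nD(N - 1) = \sigma^2(N - n)
\;\Rightarrow\;
n\left[(N - 1)D + \sigma^2\right] = N\sigma^2
\;\Rightarrow\;
n = \frac{N\sigma^2}{(N - 1)D + \sigma^2}
$$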

Example:
The officer of a hospital wants to estimate the average outstanding bill. His records show
that most of the outstanding accounts lie within a range of RM 100. If there are 1,000
accounts, how large should the sample be to estimate the mean ($\mu$) with a margin of
error B = RM 3?

$\sigma$ ≈ 100/4 = 25

$\sigma^2$ ≈ $25^2$ = 625

D = $3^2$/4 = 2.25

$$n = \frac{1000(625)}{999(2.25) + 625} = 217.56$$

We need approximately 218 observations to estimate µ, the mean accounts receivable.
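
A small Python sketch of the sample size formula (4.11) for the mean, applied to this example. The function name `sample_size_for_mean` is illustrative, and σ is approximated by range/4 as in the note above.

```python
import math

def sample_size_for_mean(N, sigma2, B):
    """Sample size from (4.11): n = N*sigma^2 / ((N - 1)*D + sigma^2), D = B^2 / 4."""
    D = B**2 / 4
    return N * sigma2 / ((N - 1) * D + sigma2)

sigma = 100 / 4                      # rough guess: sigma is about range / 4
n_exact = sample_size_for_mean(N=1000, sigma2=sigma**2, B=3)
n_required = math.ceil(n_exact)      # always round up to meet the bound
print(round(n_exact, 2), n_required)
```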

Sample size required to estimate population total τ

$$B = 2\sqrt{V(N\bar{y})} = 2N\sqrt{V(\bar{y})}$$

$$n = \frac{N\sigma^2}{(N - 1)D + \sigma^2} \tag{4.13}$$

where

$$D = \frac{B^2}{4N^2}$$

Example:
A researcher wants to estimate the total weight gain of 1,000 chickens that
were given special feed from 0 to 4 weeks. Past studies show that $\sigma$ is 6 grams. Determine
the sample size required to achieve a bound on error B = 1,000 grams.

$$D = \frac{1000^2}{4(1000^2)} = 0.25, \qquad n = \frac{1000(6^2)}{(1000 - 1)(0.25) + 6^2} = 125.98$$

The researcher needs to weigh n = 126 chicks to estimate $\tau$, the total weight gain for N =
1,000 chickens in 0 to 4 weeks.
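
For a total, the only change from the sample size formula for the mean is that D is divided by N². A minimal Python sketch, with the illustrative function name `sample_size_for_total` and σ = 6 grams taken from the example:

```python
import math

def sample_size_for_total(N, sigma, B):
    """Sample size from (4.13), where D = B^2 / (4 * N^2) for a population total."""
    D = B**2 / (4 * N**2)
    return N * sigma**2 / ((N - 1) * D + sigma**2)

n_exact = sample_size_for_total(N=1000, sigma=6, B=1000)
n_required = math.ceil(n_exact)
print(round(n_exact, 2), n_required)
```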

Sample size required to estimate population proportion p

$$n = \frac{Npq}{(N - 1)D + pq} \tag{4.18}$$

where
q = 1 – p

$$D = \frac{B^2}{4}$$

Example:
The management of a company with 2,000 workers wants to conduct a survey to study the
proportion of workers who are in favor of the new salary scheme. Determine the sample
size needed to estimate the population proportion p that supports the new scheme, with a bound
on error B = 0.05. Suppose the proportion supporting the new scheme is unknown (we then
use p = 0.5, which produces the maximum variance).

$$D = \frac{0.05^2}{4} = 0.000625$$

$$n = \frac{2000(0.5)(0.5)}{1999(0.000625) + (0.5)(0.5)} = 333.47$$

334 workers must be interviewed to estimate the proportion of workers who favor the
proposed new salary scheme.
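
The following Python sketch applies formula (4.18) to this example and also compares a few other planning values of p, to illustrate why p = 0.5 is the conservative choice. The function name `sample_size_for_proportion` and the list of alternative p values are illustrative assumptions.

```python
import math

def sample_size_for_proportion(N, B, p=0.5):
    """Sample size from (4.18): n = N*p*q / ((N - 1)*D + p*q), with D = B^2 / 4."""
    q = 1 - p
    D = B**2 / 4
    return N * p * q / ((N - 1) * D + p * q)

n_exact = sample_size_for_proportion(N=2000, B=0.05)   # p unknown, so p = 0.5
n_required = math.ceil(n_exact)
print(round(n_exact, 2), n_required)

# Hypothetical planning values: the required n is largest at p = 0.5
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, math.ceil(sample_size_for_proportion(2000, 0.05, p)))
```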

Theory
A larger sample size produces a more precise estimate of the parameter. The precision of
the estimate also depends on the variability, s. The smaller the s, the smaller the standard
error.

The sampling distribution is the probability distribution of a statistic, such as the sample mean,
over all possible random samples of size n from the population. The standard error is the
standard deviation of the sampling distribution.
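
For a very small population the sampling distribution can be listed in full. The Python sketch below uses a hypothetical population of five values (not from the notes) and enumerates every possible sample of size 2 drawn without replacement. It then compares the variance of the sample means with the formula $V(\bar{y}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [2, 4, 6, 8, 10]        # hypothetical population values
N, n = len(population), 2

# Every possible SRS of size n without replacement, and its sample mean
sample_means = [mean(s) for s in combinations(population, n)]

mean_of_means = mean(sample_means)           # centre of the sampling distribution
var_of_means = pvariance(sample_means)       # its variance (standard error squared)

sigma2 = pvariance(population)               # population variance, divisor N
formula = (sigma2 / n) * (N - n) / (N - 1)   # variance of the mean with the fpc

print(mean(population), mean_of_means)
print(var_of_means, formula)
```

In this small case the mean of the sample means equals the population mean, and the enumerated variance agrees with the formula, so the standard error is the square root of either value.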

Characteristics of sampling distributions


1. The mean of the sampling distribution of the sample mean $\bar{y}$ (or the proportion $\hat{p}$), for samples
of a given size, equals the parameter $\mu$ (or p); that is, $\bar{y}$ and $\hat{p}$ are unbiased estimators.
2. The sampling distribution of the mean from a random sample approaches a normal
distribution as the sample size gets larger, even if the population under study is not
normally distributed.
3. For a large sample size (n > 100 and p = 0.5), the binomial distribution for the proportion
from the sample is approximately normal.
4. The fpc can be ignored if $1 - \frac{n}{N}$ ≥ 0.95, or equivalently, n ≤ N/20.

Exercise:
Select 5% of the respondents from “Employee Satisfaction Survey” data and calculate:
1. Mean respondent income and 95% confidence interval.
2. Total income of all respondents (based on the sample) and 95% confidence interval.
3. Percent who are males and 95% confidence interval.

In SPSS: Data > Select Cases
