2. Estimation
Statistical Inference
Point & Interval Estimation
Sample Size Determination
Shahed B Sadique
2/4/2025
Sampling Fundamentals Estimation 5 Sampling Fundamentals Estimation 6
(i) An estimator should, on average, be equal to the value of the parameter being estimated. This is popularly known as the property of unbiasedness. An estimator is said to be unbiased if the expected value of the estimator is equal to the parameter being estimated. The sample mean $\bar{x}$ is the most widely used estimator because it provides an unbiased estimate of the population mean µ.
(ii) An estimator should have a relatively small variance. This means that the most efficient estimator, among a group of unbiased estimators, is the one which has the smallest variance. This property is known as the property of efficiency.
(iii) An estimator should use as much as possible of the information available from the sample. This property is known as the property of sufficiency.
(iv) An estimator should approach the value of the population parameter as the sample size becomes larger and larger. This property is referred to as the property of consistency.
Keeping in view the above stated properties, the researcher must select appropriate estimator(s) for his study.
A. Unbiased Estimator
An estimator $T = \hat{\theta}$ is an unbiased estimator of the parameter $\theta$ if its expected value is equal to the parameter itself, that is, $E(T) = \theta$.
For example, the sample mean $\bar{x} = \frac{\sum x_i}{n}$ is an unbiased estimator of the population mean $\mu$, since $E(\bar{x}) = \mu$.
Suppose $x_1, x_2, x_3, \dots, x_n$ is a random sample of size n from some population with mean $\mu$. Thus the sample mean is
$$\bar{x} = \frac{\sum x_i}{n}$$
Now,
$$E(\bar{x}) = E\left(\frac{\sum x_i}{n}\right) = \frac{\sum E(x_i)}{n} = \frac{\sum \mu}{n} = \frac{n\mu}{n} = \mu$$
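The result $E(\bar{x}) = \mu$ can be checked empirically. The sketch below is only an illustration under assumed values: the normal population, the parameter values, the number of trials, and the function name are all hypothetical choices, not part of the notes.

```python
import random
import statistics


def average_of_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from N(mu, sigma^2) and
    return the average of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means)


# Hypothetical population: mean 5, standard deviation 2, samples of size 10.
approx = average_of_sample_means(mu=5.0, sigma=2.0, n=10, trials=20000)
print(approx)  # should lie close to the population mean 5.0
```

Individual sample means scatter around µ, but their long-run average settles near µ, which is what unbiasedness asserts.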
D. Sufficient Estimator
Let $f(x|\theta)$ be the density function of a random variable x, where $\theta$ is an unknown fixed parameter. Let $x_1, x_2, x_3, \dots, x_n$ be a random sample from this density. Let $t$ and $t_1$ be two statistics such that $t_1$ is not a function of $t$. If the conditional distribution of $t_1$ given $t$ is independent of $\theta$, then $t$ is called a sufficient statistic for $\theta$. If, further, $E(t) = \theta$, then $t$ is called a sufficient estimator of $\theta$.
The necessary and sufficient condition for $t$ to be sufficient for $\theta$ is that the likelihood function L may be factorized as follows:
$$L(\theta|x) = g(t|\theta) \cdot h(x_1, x_2, x_3, \dots, x_n)$$
where $h$ does not depend on $\theta$.
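As an illustration of the factorization condition (a worked sketch, not taken from the notes), consider a random sample from $N(\mu, \sigma^2)$ with $\sigma^2$ assumed known and take $t = \bar{x}$. Using the identity $\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$, the likelihood splits into a factor involving $\mu$ only through $\bar{x}$ and a factor free of $\mu$:

```latex
\begin{align*}
L(\mu \mid x)
  &= \frac{1}{(\sigma\sqrt{2\pi})^{n}}
     \exp\!\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_i-\mu)^{2}\right) \\
  &= \underbrace{\exp\!\left(-\frac{n(\bar{x}-\mu)^{2}}{2\sigma^{2}}\right)}_{g(\bar{x}\,\mid\,\mu)}
     \cdot
     \underbrace{\frac{1}{(\sigma\sqrt{2\pi})^{n}}
     \exp\!\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_i-\bar{x})^{2}\right)}_{h(x_1,\dots,x_n)}
\end{align*}
```

Since $E(\bar{x}) = \mu$ as well, $\bar{x}$ would be a sufficient estimator of $\mu$ in this setting.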
Suppose $x_1, x_2, x_3, \dots, x_n$ is a random sample of size n from a normal population with mean µ and known variance $\sigma^2$. The most efficient estimator of the population mean µ is given by the sample mean
$$\hat{\mu} = \bar{x} = \frac{\sum x_i}{n}$$
The best point estimate of $\mu$ is $\bar{x}$. The expected value and the variance of $\bar{x}$ are $E(\bar{x}) = \mu$ and $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, respectively.
The statistic $t = \bar{x}$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$. Thus the random variable
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
has a standard normal distribution, that is, $E(z) = 0$ and variance $\sigma_z^2 = 1$.
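Thus
$$P(-z_{\alpha/2} < z < z_{\alpha/2}) = 1 - \alpha$$
$$\Rightarrow P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha,$$
where $z_{\alpha/2}$ is the value in the standard normal distribution such that $-z_{\alpha/2}$ has area $\alpha/2$ to its left, that is,
$$P(z < -z_{\alpha/2}) = \frac{\alpha}{2}, \quad \text{and} \quad z_{\alpha/2} = -z_{1-\alpha/2}.$$
The interval
$$(t_L, t_U) = \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \qquad (1)$$
is a $100(1-\alpha)\%$ confidence interval for $\mu$.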
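The interval for µ with known σ can be computed directly. The following is a minimal sketch, assuming a normal population with known standard deviation; the function name and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, alpha):
    """Return (t_L, t_U), the 100(1 - alpha)% interval for the mean
    when the population standard deviation sigma is known."""
    # z_{alpha/2}: the standard normal value with area alpha/2 to its right.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical data: sample mean 50, sigma = 4, n = 64, 95% confidence.
lower, upper = z_confidence_interval(50.0, 4.0, 64, 0.05)
print(lower, upper)
```

A smaller α or a larger σ widens the interval, while a larger n narrows it in proportion to $1/\sqrt{n}$.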
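The maximum likelihood estimator is the solution of
$$\frac{\partial L}{\partial \theta} = 0 \quad \text{or} \quad \frac{\partial \log L}{\partial \theta} = 0$$
EXAMPLE:
Let $x_1, x_2, x_3, \dots, x_n$ be a random sample from the distribution $N(\mu, \sigma^2)$. Find an ML estimator for $\mu$.
Solution: The likelihood function is
$$L(\theta|x) = \frac{1}{\sigma^n (\sqrt{2\pi})^n} \, e^{-\frac{1}{2\sigma^2}\sum (x_i - \mu)^2}$$
$$\log L = -n \log(\sigma\sqrt{2\pi}) - \frac{1}{2\sigma^2}\sum (x_i - \mu)^2$$
$$\Rightarrow \frac{\partial \log L}{\partial \mu} = -\frac{1}{2\sigma^2}\sum 2(x_i - \mu)(-1) = 0$$
$$\Rightarrow \sum (x_i - \mu) = 0 \Rightarrow \sum x_i = n\mu$$
$$\Rightarrow \hat{\mu} = \bar{x}$$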
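The conclusion that the log-likelihood peaks at the sample mean can be checked numerically. This is only a sketch: the data values, the assumed σ, and the grid search are hypothetical illustration choices, not a method from the notes.

```python
import math


def log_likelihood(mu, data, sigma):
    """log L for a N(mu, sigma^2) sample, as in the derivation above."""
    n = len(data)
    squared_deviations = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(sigma * math.sqrt(2 * math.pi))
            - squared_deviations / (2 * sigma ** 2))


data = [4.1, 5.3, 6.0, 4.8, 5.8]  # hypothetical observations
sigma = 1.0                        # assumed known for the illustration
x_bar = sum(data) / len(data)

# Evaluate log L on a grid of candidate values 3.000, 3.001, ..., 7.000
# and pick the candidate with the largest log-likelihood.
grid = [i / 1000 for i in range(3000, 7001)]
mu_hat = max(grid, key=lambda m: log_likelihood(m, data, sigma))
print(x_bar, mu_hat)
```

The grid maximizer agrees with $\bar{x}$ up to the grid spacing, in line with the analytical solution.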
3. The standard error enables us to specify the limits within which the parameters of the population are expected to lie with a specified degree of confidence. Such an interval is usually known as a confidence interval. The following table gives the percentage of samples having their mean values within a range of population mean µ ± S.E.

Range               Percent Values
µ ± 1 S.E.          68.27
µ ± 2 S.E.          95.45
µ ± 3 S.E.          99.73
µ ± 1.96 S.E.       95.00
µ ± 2.5758 S.E.     99.00
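The percentages in the table follow from the standard normal distribution, assuming the sample mean is normally distributed. A short sketch using only the standard library (the function name is a hypothetical choice):

```python
import math


def coverage_percent(k):
    """Percentage of a normal distribution lying within k standard
    errors of its mean: 100 * P(|Z| < k) = 100 * erf(k / sqrt(2))."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 1.96, 2.5758):
    print(f"mu ± {k} S.E.: {coverage_percent(k):.2f}%")
```

Rounded to two decimals, the printed values match the table entries.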
Important formulae for computing the standard errors concerning various measures based on samples are as under:
(a) In case of sampling of attributes:
(i) Standard error of number of successes $= \sqrt{n \cdot p \cdot q}$
where n = number of events in each sample,
p = probability of success in each event,
q = probability of failure in each event.
(ii) Standard error of proportion of successes $= \sqrt{\dfrac{p \cdot q}{n}}$
(iii) Standard error of the difference between proportions of two samples:
$$\sigma_{p_1 - p_2} = \sqrt{p \cdot q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where p = best estimate of proportion in the population, q = 1 − p, and p is worked out as under:
$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$
Note: Instead of the above formula, we use the following formula:
$$\sigma_{p_1 - p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
when samples are drawn from two heterogeneous populations where we cannot have the best estimate of proportion in the universe on the basis of given sample data. Such a situation often arises in the study of association of attributes.
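The attribute formulas can be sketched in code as follows. This assumes the usual binomial forms with q = 1 − p; the function names and sample figures are hypothetical.

```python
import math


def se_number_of_successes(n, p):
    """Standard error of the number of successes: sqrt(n * p * q)."""
    return math.sqrt(n * p * (1 - p))


def se_proportion(n, p):
    """Standard error of the proportion of successes: sqrt(p * q / n)."""
    return math.sqrt(p * (1 - p) / n)


def se_difference_pooled(n1, p1, n2, p2):
    """SE of p1 - p2 using the pooled estimate p = (n1*p1 + n2*p2) / (n1 + n2)."""
    p = (n1 * p1 + n2 * p2) / (n1 + n2)
    return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))


def se_difference_unpooled(n1, p1, n2, p2):
    """SE of p1 - p2 for two heterogeneous populations (no pooled estimate)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical samples: 400 items with 30% successes, 600 items with 25%.
print(se_difference_pooled(400, 0.30, 600, 0.25))
print(se_difference_unpooled(400, 0.30, 600, 0.25))
```

When the two sample proportions are close, the pooled and unpooled forms give similar values; they diverge as the populations become more heterogeneous.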
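(b) In case of sampling of variables (large samples):
(i) Standard error of mean when population standard deviation is known: $\sigma_{\bar{x}} = \dfrac{\sigma_p}{\sqrt{n}}$, where $\sigma_p$ is the population standard deviation.
Note: This formula is used even when n is 30 or less.
(ii) Standard error of mean when population standard deviation is unknown: $\sigma_{\bar{x}} = \dfrac{\sigma_s}{\sqrt{n}}$, where $\sigma_s$ is the sample standard deviation.
(iii) Standard error of standard deviation when population standard deviation is known: $\sigma_{\sigma_s} = \dfrac{\sigma_p}{\sqrt{2n}}$
(iv) Standard error of standard deviation when population standard deviation is unknown: $\sigma_{\sigma_s} = \dfrac{\sigma_s}{\sqrt{2n}}$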
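As a hedged sketch of the large-sample formulas, assuming the standard forms (standard deviation divided by $\sqrt{n}$ for the mean and by $\sqrt{2n}$ for the standard deviation), with hypothetical function names:

```python
import math


def se_mean(sd, n):
    """Standard error of the mean: sd / sqrt(n).
    Pass the population sd if known, otherwise the sample sd."""
    return sd / math.sqrt(n)


def se_standard_deviation(sd, n):
    """Large-sample standard error of the standard deviation: sd / sqrt(2n)."""
    return sd / math.sqrt(2 * n)


# Hypothetical figures: standard deviation 10, sample of 100 items.
print(se_mean(10, 100))
print(se_standard_deviation(10, 100))
```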
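(vi) Standard error of difference between means of two samples:
(a) When two samples are drawn from the same population: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\sigma_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, where $\sigma_p$ is the population standard deviation.
(c) In case of sampling of variables (small samples):
(i) Standard error of mean when $\sigma_p$ is unknown: $\sigma_{\bar{x}} = \dfrac{\sigma_s}{\sqrt{n}}$, where $\sigma_s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$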
Statistical Inference
Sample Size Determination
Size of the sample should be determined by a researcher keeping in view the following points:
(i) Nature of universe: The universe may be either homogeneous or heterogeneous in nature. If the items of the universe are homogeneous, a small sample can serve the purpose. But if the items are heterogeneous, a large sample would be required. Technically, this can be termed the dispersion factor.
(ii) Number of classes proposed: If many class-groups (groups and sub-groups) are to be formed, a large sample would be required because a small sample might not be able to give a reasonable number of items in each class-group.
(iii) Nature of study: If items are to be intensively and continuously studied, the sample should be small. For a general survey the size of the sample should be large, but a small sample is considered appropriate in technical surveys.
(iv) Type of sampling: Sampling technique plays an important part in determining the size of the sample. A small random sample is apt to be much superior to a larger but badly selected sample.
(v) Standard of accuracy and acceptable confidence level: If the standard of accuracy or the level of precision is to be kept high, we shall require a relatively larger sample. For doubling the accuracy at a fixed confidence level, the sample size has to be increased fourfold.
(vi) Availability of finance: In practice, the size of the sample depends upon the amount of money available for the study. This factor should be kept in view while determining the size of the sample, for large samples increase the cost of sampling estimates.
(vii) Other considerations: Nature of units, size of the population, size of questionnaire, availability of trained investigators, the conditions under which the sample is being conducted, and the time available for completion of the study are a few other considerations to which a researcher must pay attention while selecting the size of the sample.
There are two alternative approaches for determining the size of the sample. The first approach is “to specify the precision of estimation desired and then to determine the sample size necessary to insure it” and the second approach “uses Bayesian statistics to weigh the cost of additional information against the expected value of the additional information.”
The first approach is capable of giving a mathematical solution, and as such is a frequently used technique of determining ‘n’. The limitation of this technique is that it does not analyse the cost of gathering information vis-a-vis the expected value of information.
The second approach is theoretically optimal, but it is seldom used because of the difficulty involved in measuring the value of information. Hence, we shall mainly concentrate here on the first approach.
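Sample size when estimating mean
i) In case of infinite population:
$$n = \frac{z^2 \times \sigma^2}{e^2}$$
where z = standard normal value for the given confidence level, σ = standard deviation of the population, and e = acceptable error (the precision desired).
Sample size when estimating population proportion
i) In case of infinite population:
$$n = \frac{z^2 \times p \times q}{e^2}$$
where p = proportion in the population and q = 1 − p.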
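The infinite-population formulas translate directly into code. The sketch below assumes the standard forms $n = z^2\sigma^2/e^2$ for a mean and $n = z^2 p q / e^2$ for a proportion; the function names and example figures are hypothetical. It returns the unrounded value so that the rounding rule stays an explicit choice.

```python
def sample_size_for_mean(z, sigma, e):
    """Unrounded n for estimating a mean within +/- e (infinite population)."""
    return (z ** 2) * (sigma ** 2) / (e ** 2)


def sample_size_for_proportion(z, p, e):
    """Unrounded n for estimating a proportion within +/- e (infinite population)."""
    return (z ** 2) * p * (1 - p) / (e ** 2)


# Hypothetical inputs at 95% confidence (z = 1.96).
print(sample_size_for_mean(1.96, 10, 2))           # sigma = 10, e = 2
print(sample_size_for_proportion(1.96, 0.5, 0.05)) # p = 0.5, e = 0.05
```

Because e enters as a square, halving the acceptable error quadruples the required sample size.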
Example:
Determine the size of the sample for estimating the true weight of the cereal containers for the universe with N = 5000 on the basis of the following information:
(1) the variance of weight = 4 ounces on the basis of past records.
(2) estimate should be within 0.8 ounces of the true average weight with 99% probability.
Will there be a change in the size of the sample if we assume infinite population in the given case? If so, explain by how much?

Solution: In the given problem we have the following:
N = 5000;
σ = 2 ounces (since the variance of weight = 4 ounces);
e = 0.8 ounces (since the estimate should be within 0.8 ounces of the true average weight);
z = 2.57 (as per the table of area under normal curve for the given confidence level of 99%).
Hence, the confidence interval for µ is given by
$$\bar{x} \pm z \times \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
and accordingly the sample size can be worked out as under. Setting $e = z \times \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$ and solving for n gives
$$n = \frac{z^2 \times N \times \sigma^2}{(N - 1)e^2 + z^2 \sigma^2} = \frac{(2.57)^2 \times 5000 \times (2)^2}{(5000 - 1)(0.8)^2 + (2.57)^2 (2)^2} = \frac{132098}{3225.78} = 40.95 \cong 41$$
Hence, the sample size (or n) = 41 for the given precision and confidence level.
But if we take the population to be infinite, the sample size will be worked out as under:
$$n = \frac{z^2 \times \sigma^2}{e^2} = \frac{(2.57)^2 \times (2)^2}{(0.8)^2} = 41.28 \cong 41$$
Thus, in the given case the sample size remains the same even if we assume infinite population.
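The arithmetic of this example can be reproduced with a short script. It is a sketch assuming the finite-population form $n = z^2 N \sigma^2 / ((N-1)e^2 + z^2\sigma^2)$, which follows from the confidence interval with the finite population correction; the function names are hypothetical.

```python
def n_infinite_population(z, sigma, e):
    """n = z^2 * sigma^2 / e^2 (no finite population correction)."""
    return (z * sigma / e) ** 2


def n_finite_population(z, sigma, e, N):
    """n = z^2 * N * sigma^2 / ((N - 1) * e^2 + z^2 * sigma^2)."""
    return (z ** 2 * N * sigma ** 2) / ((N - 1) * e ** 2 + z ** 2 * sigma ** 2)


# Values from the cereal-container example.
finite = n_finite_population(z=2.57, sigma=2, e=0.8, N=5000)
infinite = n_infinite_population(z=2.57, sigma=2, e=0.8)
print(round(finite, 2), round(infinite, 2))
```

Both values round to 41, so the finite population correction makes no practical difference when N = 5000 is so much larger than n.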