Statistics 2 Chapter Two
UNIT 2
STATISTICAL ESTIMATIONS
2.1. Basic Concepts of Statistical Estimation
One of the principal objectives of statistical investigation is to make reasonable estimates. The
overall objective of this chapter is to describe how statisticians and other professionals who use
statistical tools go about making statistical estimates.
Introduction
Everyone makes estimates. When you are ready to cross a street and see a car approaching, you
estimate the speed of the car, the distance between you and the car, and your own speed. Having
made these quick estimates, you decide whether to wait, walk, or run.
So far we have covered the concepts of probability theory and sampling distributions, which form
the foundation for statistical inference. In this chapter we begin to explore how to make
inferential statements about a population based on the information contained in a random sample.
Estimation is a method that enables us to estimate a population parameter with reasonable
accuracy. In practice, calculating the population parameter exactly is extremely difficult or
impossible, so we make estimates and, where possible, state the error that is likely to
accompany the estimate.
Statistics for management II
The estimation procedure is to draw a random sample of size n from a known probability
distribution, compute a sample statistic, and use it as an estimate of the population parameter.
Estimation can be divided into two categories: point estimation and interval estimation.
Point estimation deals with the task of selecting a specific sample value as an estimate of a
population parameter. A point estimate is a specific value of a sample statistic that is used to
estimate a population parameter. For example, a research head of the National Bank of Ethiopia
(NBE) would make a point estimate by saying, "Our current data indicate that the country is
growing at 7% per year on average."
Example: The ages of a random sample of ten second-year Management Department students were
20, 26, 21, 23, 22, 20, 24, 25, 22, 21. Find the point estimate of the population mean age.
Solution
i      1    2    3    4    5    6    7    8    9    10   Sum
x_i    20   26   21   23   22   20   24   25   22   21   224

\[ \bar{x} = \frac{\sum x_i}{n} = \frac{224}{10} = 22.4 \]
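The same point estimate can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names `ages` and `x_bar` are illustrative and not part of the original notes.

```python
from statistics import fmean

# Ages of the ten sampled students, taken from the example above.
ages = [20, 26, 21, 23, 22, 20, 24, 25, 22, 21]

# The sample mean (sum of the observations divided by n) is the
# point estimate of the population mean age.
x_bar = fmean(ages)
print(f"n = {len(ages)}, sum = {sum(ages)}, point estimate = {x_bar:.1f}")
```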
2.2.2. Point estimators of the proportion
The sample proportion is an unbiased estimator of the population proportion. That is, p is an
unbiased estimator of P, so E(p) = P. A particular value of p based on a sample survey is a point
estimate of P. It is calculated by dividing the number of elements in the sample that have the
characteristic of interest by the sample size:

\[ p = \frac{\text{number of successes}}{\text{sample size}} = \frac{x}{n} \]
Example: From the example above, find the point estimate of the population proportion of
students aged 23 years and above.
Solution:
From the sample, four students are 23 years old or above. So the point estimate of the
population proportion is

\[ p = \frac{x}{n} = \frac{4}{10} = 0.4 \]

Therefore, the estimated proportion of second-year Management students aged 23 years and above is 0.4.
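As a quick check of the proportion example, the count of "successes" and the sample proportion can be computed directly. This is a sketch under the assumption that a success means an age of 23 or more; the names `successes` and `p_hat` are illustrative.

```python
# Ages of the ten sampled students, taken from the example above.
ages = [20, 26, 21, 23, 22, 20, 24, 25, 22, 21]

# A "success" is assumed here to be a student aged 23 or above.
successes = sum(1 for age in ages if age >= 23)

# Sample proportion p = x / n, the point estimate of P.
p_hat = successes / len(ages)
print(f"x = {successes}, n = {len(ages)}, p = {p_hat}")
```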
Note: Estimating a population parameter with a single sample statistic has some problems, since
we are estimating a population parameter with a single number that is either right or wrong. So
far, the choice of point estimator has been based on intuitive plausibility; a sounder choice
requires considering the various desirable properties of point estimators. These properties
provide a framework within which a particular choice can be evaluated and alternatives examined.
At the outset it must be stated that no single mechanism exists for determining a uniquely best
point estimator in all circumstances. What is available instead is a set of criteria under which
particular estimators can be evaluated. We will see some of these desirable properties at the end
of this chapter.
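Assume that we want to find out how "close" an estimator \(\bar{X}\) is to \(\mu\). For this
purpose, we try to find two positive values U and \(\alpha\) such that the probability that the
random interval \((\bar{X} - U,\ \bar{X} + U)\) contains \(\mu\) is \(1 - \alpha\):

\[ P(\bar{X} - U \le \mu \le \bar{X} + U) = 1 - \alpha, \qquad 0 < \alpha < 1 \ \text{and} \ 0 < 1 - \alpha < 1 \]

Where: \(\alpha\) is the level of significance
\(1 - \alpha\) is the confidence coefficient
Such an interval is known as a confidence interval. The random variables \((\bar{X} - U)\) and
\((\bar{X} + U)\) are the lower and the upper confidence limits (or critical values) respectively.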
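The meaning of the confidence coefficient can be illustrated with a small simulation: if many samples are drawn and an interval is built from each, roughly a fraction 1 - α of the intervals should contain μ. The sketch below is only an illustration under stated assumptions: a normal population with known σ, the half-width U = Z σ/√n, and hypothetical values for `mu`, `sigma`, and `n`.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated intervals (x_bar - U, x_bar + U) that contain mu."""
    rng = random.Random(seed)                 # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    u = z * sigma / n ** 0.5                  # half-width U, assuming sigma is known
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - u <= mu <= x_bar + u:
            hits += 1
    return hits / trials


share = coverage()
print(f"share of intervals containing mu: {share:.3f}")
```

With α = 0.05 the printed share should land close to 0.95, although it varies slightly with the seed and the number of trials.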
The choice of method used in constructing a confidence interval for \(\mu\) depends on whether or
not the population is normal, whether the population standard deviation \(\sigma\) is known or
unknown, and whether the sample size is large (n ≥ 30) or small (n < 30). If the sample size is
large, no assumption is required about the distribution of the population, and \(Z_{\alpha/2}\)
is used in the computation of the interval estimate. If the sample size is small, the population
must have a normal or approximately normal probability distribution in order to develop an
interval estimate of \(\mu\). If this is the case, \(Z_{\alpha/2}\) is used in the computation of
the interval estimate when \(\sigma\) is known, whereas \(t_{\alpha/2}\) is used when \(\sigma\)
is unknown and estimated by the sample standard deviation.
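When \(\sigma\) is known, the sampling distribution of the mean is normal with mean \(\mu\) and
standard error

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

For the sampling distribution of the mean, the standard normal variable is
\(Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}\); then the interval estimate of a population mean
(the confidence interval for a population mean) is:

\[ \mu = \bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

where \(Z_{\alpha/2}\) is the value of Z that is exceeded with a probability of \(\alpha/2\); that
is, \(Z_{\alpha/2}\) is the Z value providing an area of \(\alpha/2\) in the upper tail of the
standard normal probability distribution.
[Figure: standard normal curve centered at 0, with an upper-tail area of α/2 to the right of Zα/2]
So, if \(\sigma\) is given, we compute the confidence interval estimate of the population mean by
using \(\bar{X} \pm Z_{\alpha/2}\, \sigma / \sqrt{n}\).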
Example 1: An experiment involves selecting a random sample of 256 managers for study. One
item of interest is their annual income. The sample mean is computed to be $35,420 and the sample
standard deviation is $2,050. What is the 95% confidence interval?

\[ \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{2050}{\sqrt{256}} = 128.125 \]

\[ Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = 1.96 \quad \text{(table reading for an area of } 0.95/2 = 0.4750\text{)} \]

\[ \mu = \bar{X} \pm Z_{\alpha/2}\, \sigma_{\bar{x}} = 35{,}420 \pm 1.96(128.125) = 35{,}420 \pm 251.125 \]

Thus, with 95% confidence, the mean annual income of the managers is between $35,168.875 and
$35,671.125.
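Example 2:
A normal population has a standard deviation of 10. A random sample of size 25 has a mean of
50. Construct a 95% confidence interval estimate of the population mean.
Given: \(\sigma = 10\), n = 25, \(\bar{X} = 50\), \(\alpha = 0.05\) (5% = 100% - 95%)
Solution:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = 2 \]

\[ Z_{\alpha/2} = Z_{0.025} = 1.96 \quad \text{(table reading for an area of } 0.95/2 = 0.4750\text{)} \]

\[ \mu = \bar{X} \pm Z_{\alpha/2}\, \sigma_{\bar{x}} \Rightarrow 50 \pm 1.96(2) \Rightarrow 46.1 \le \mu \le 53.9 \]

Hence, we state with 95% confidence that the population mean lies between 46.1 and 53.9.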
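The known-σ interval can be sketched as a small function. This is an illustrative sketch rather than a prescribed method: the function name `z_interval` is hypothetical, and the critical value is computed with the standard library's `NormalDist` instead of being read from a table, so it is 1.95996... rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known (or n is large)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z value with upper-tail area alpha/2
    margin = z * sigma / sqrt(n)              # Z * standard error of the mean
    return x_bar - margin, x_bar + margin


# Example 2: sigma = 10, n = 25, sample mean = 50, 95% confidence.
low, high = z_interval(x_bar=50, sigma=10, n=25)
print(f"{low:.1f} <= mu <= {high:.1f}")
```

Passing the figures of Example 1 (mean 35,420, standard deviation 2,050, n = 256) to the same function gives that interval as well, up to rounding of the critical value.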
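When \(\sigma\) is unknown, it is estimated by the sample standard deviation s. Thus, the
estimated standard deviation of the sampling distribution of the sample means, \(S_{\bar{x}}\),
is given by

\[ S_{\bar{x}} = \frac{s}{\sqrt{n}} \]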
In this case, the construction of the confidence interval estimate depends on whether the sample
size is large or small:
Case 1: when the sample size is large and \(\sigma\) is unknown (a sample size is large when n ≥ 30)
The confidence interval estimate for the population mean (\(\mu\)) is given by:

\[ \mu = \bar{X} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}} \]

Case 2: when the sample size is small and \(\sigma\) is unknown (a sample size is small when n < 30)
The confidence interval estimate of \(\mu\) is:

\[ \mu = \bar{X} \pm t_{\alpha/2}\, S_{\bar{x}} \quad \text{or} \quad \bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]

When the standard deviation of the population is unknown, the sample size is small, and the
population is normally distributed, the statistic

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

follows a t distribution. Note that to use the t-table, you should determine the degrees of
freedom (n - 1) and \(\alpha\).
Example 1: Suppose that Abdissa rental firm wants to estimate the average number of miles
traveled per day by each of its cars rented in Dembi Dollo. A random sample of 110 cars reveals
that the sample mean travel distance per day is 85.5 miles with a sample standard deviation of
19.3 miles. Compute a 99% confidence interval to estimate \(\mu\).
Given: n = 110, \(\bar{X} = 85.5\), S = 19.3 and \(\alpha = 0.01\)
Solution

\[ \mu = \bar{X} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}} \Rightarrow 85.5 \pm 2.575 \left( \frac{19.3}{\sqrt{110}} \right) \Rightarrow 80.8 \le \mu \le 90.2 \]
Example 2:
The environmental protection officer of a large industrial plant sought to determine the mean
daily amount of sulfur oxides (a pollutant) emitted by the plant. Because measurement costs
were high, only a random sample of 10 days' measurements was obtained. These were, in tons
per day: 8, 7, 10, 15, 11, 6, 8, 5, 13, 12. Suppose emissions per day are normally distributed.
Estimate \(\mu\), the mean amount of sulfur oxides emitted per day, using a confidence
coefficient of 0.95.
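Solution

\[ \bar{X} = \frac{\sum X}{n} = \frac{95}{10} = 9.5 \ \text{tons/day} \]

C = 0.95, \(\alpha = 1 - 0.95 = 0.05\), \(\alpha/2 = 0.025\)
v = degrees of freedom = n - 1 = 10 - 1 = 9

\[ S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{94.5}{9}} = 3.24 \ \text{tons/day} \]

\[ \mu = \bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} \Rightarrow 9.5 \pm 2.26 \left( \frac{3.24}{\sqrt{10}} \right) \Rightarrow 9.5 \pm 2.3 \Rightarrow 7.2 \le \mu \le 11.8 \]

Thus, we state with 95% confidence that the mean output of sulfur oxides is between 7.2 and
11.8 tons/day.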
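The small-sample calculation can be reproduced step by step. The sketch below uses only the standard library, which has no t quantile function, so the critical value is entered by hand: `T_CRIT = 2.262` is an assumed table reading for 9 degrees of freedom and α/2 = 0.025 (the worked solution rounds it to 2.26).

```python
from math import sqrt
from statistics import mean, stdev

# Daily sulfur oxide emissions (tons/day) from the example.
emissions = [8, 7, 10, 15, 11, 6, 8, 5, 13, 12]

# Assumed t-table value for 9 degrees of freedom and alpha/2 = 0.025.
T_CRIT = 2.262

n = len(emissions)
x_bar = mean(emissions)            # sample mean
s = stdev(emissions)               # sample standard deviation (n - 1 divisor)
margin = T_CRIT * s / sqrt(n)      # t * estimated standard error
low, high = x_bar - margin, x_bar + margin

print(f"mean = {x_bar}, s = {s:.2f}, df = {n - 1}")
print(f"{low:.1f} <= mu <= {high:.1f}")
```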
2.3.3. Confidence Interval Estimate of the Proportion (P)
When np and nq are both greater than 5, and when n is small relative to the size of the
population, the approximate confidence interval for the population proportion P is given by

\[ P = p \pm Z \sqrt{\frac{pq}{n}} \quad \text{or} \quad p \pm Z\, S_p \]

where p is the sample proportion, q = 1 - p, and \(S_p = \sqrt{pq/n}\).
Example 1:
Suppose 1600 of 2000 registered voters sampled said that they planned to vote for the ABC party
candidate. Using the 0.95 degree of confidence, what is the interval estimate of the population
proportion?
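Solution:

\[ p = \frac{1600}{2000} = 0.8, \qquad Z_{\alpha/2} = 1.96 \quad \text{(table reading for an area of } 0.95/2 = 0.4750\text{)} \]

\[ P = p \pm Z \sqrt{\frac{pq}{n}} = 0.8 \pm 1.96 \sqrt{\frac{(0.8)(0.2)}{2000}} = 0.8 \pm 0.0175 \]

= 0.7825 to 0.8175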
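The proportion interval, together with the np and nq check stated at the start of this section, can be sketched as follows. The function name `proportion_interval` is hypothetical, and the Z value is computed with `NormalDist` rather than read from a table, so results may differ from hand calculations in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Approximate confidence interval for a population proportion."""
    p = successes / n
    q = 1 - p
    # The normal approximation is only used when np and nq both exceed 5.
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq must both be greater than 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * q / n)   # Z * standard error of the proportion
    return p - margin, p + margin


# Example 1: 1600 of 2000 sampled voters, 95% confidence.
low, high = proportion_interval(1600, 2000)
print(f"{low:.4f} to {high:.4f}")
```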
Example 2:
Kebire clothing company produces men's jeans. The jeans are made and sold with either a
regular cut or a boot cut. In an effort to estimate the proportion of their men's jeans market in
Addis Ababa city that is for boot-cut jeans, the analyst takes a random sample of 212 jeans sales
from the company's two retail outlets in the city. Only 34 of the sales were for boot-cut jeans.
Construct a 90% confidence interval to estimate the proportion of the population who prefer
boot-cut jeans.
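Solution:

\[ p = \frac{34}{212} = 0.16, \qquad Z_{\alpha/2} = 1.64 \quad \text{(table reading for an area of } 0.90/2 = 0.45\text{)} \]

\[ P = p \pm Z \sqrt{\frac{pq}{n}} = 0.16 \pm 1.64 \sqrt{\frac{(0.16)(0.84)}{212}} = 0.16 \pm 0.04 \]

\[ 0.12 \le P \le 0.20 \]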
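When the population variances are known, the confidence interval estimate of the difference
between two population means is:

\[ \mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm Z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]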
Example 1:
Two methods of performing a certain task in a manufacturing plant, Method A and Method B,
are under study. The variable of interest is the length of time needed to perform the task. It is
known that \(\sigma_A^2\) is 9 minutes squared and \(\sigma_B^2\) is 12 minutes squared. A simple
random sample of 35 employees performed the task by Method A. An independent simple random
sample of 35 employees, similar in all important aspects to the first group, performed the task
by Method B. The average time the first group needed to complete the task was 25 minutes. The
average time for the second group was 23 minutes. Construct a 95% confidence interval to estimate
the difference in the true average lengths of time needed for the task.
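Solution:
Given: \(\bar{X}_1 = 25\), \(\sigma_1^2 = 9\), \(n_1 = 35\); \(\bar{X}_2 = 23\),
\(\sigma_2^2 = 12\), \(n_2 = 35\); CI = 0.95, so Z = 1.96

\[ \mu_1 - \mu_2 = (25 - 23) \pm 1.96 \sqrt{\frac{9}{35} + \frac{12}{35}} = 2 \pm 1.96(0.7746) = 2 \pm 1.52 \]

= 0.48 to 3.52
Therefore, with 95% confidence, the difference between the average lengths of time needed for
the task under the two methods lies between 0.48 and 3.52 minutes.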
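The two-sample interval can be checked numerically. This is a minimal sketch assuming known population variances and independent samples, as in the example; the function name `diff_means_interval` and its parameter names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_interval(x1, x2, var1, var2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 with known population variances."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(var1 / n1 + var2 / n2)   # standard error of the difference
    diff = x1 - x2
    return diff - z * se, diff + z * se


# Method A: mean 25, variance 9, n = 35. Method B: mean 23, variance 12, n = 35.
low, high = diff_means_interval(25, 23, 9, 12, 35, 35)
print(f"{low:.2f} to {high:.2f} minutes")
```

Because the whole interval lies above zero, the data suggest that Method A takes longer on average than Method B at this confidence level.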
Some sampling error will arise because we have not studied the whole population. Whenever we
sample, we always miss some helpful information about the population. If we want a high level
of precision (that is, if we want to be quite sure of our estimate), we have to sample enough of
the population to provide the required information. Sampling error is controlled by selecting a
sample size that is adequate. How is this adequate sample size determined for any specified level
of precision? Start from the confidence interval for the mean:

\[ \mu = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}} \]

Let \(e = Z \dfrac{\sigma}{\sqrt{n}}\) be the maximum allowable error. Then

\[ e^2 = \left( Z \frac{\sigma}{\sqrt{n}} \right)^2 = Z^2 \frac{\sigma^2}{n} \quad \Rightarrow \quad n = \frac{Z^2 \sigma^2}{e^2} \]
Example 1:
An advertising firm wants to estimate the average amount of money a certain type of store spent
on advertising during the past year. Experience has shown the population variance to be about
1,800,000. How large a sample should the advertising firm take in order for the estimate to be
within $500 of the true mean with 95% confidence?
Solution

\[ n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.96)^2 (1{,}800{,}000)}{(500)^2} = \frac{3.8416\,(1{,}800{,}000)}{250{,}000} = \frac{6{,}914{,}880}{250{,}000} = 27.66 \approx 28 \]
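The sample-size formula translates directly into code. The sketch below assumes the table value Z = 1.96 for 95% confidence and always rounds up, since a fractional observation cannot be sampled; the function name `sample_size_mean` is hypothetical.

```python
from math import ceil


def sample_size_mean(variance, e, z=1.96):
    """Smallest n with Z * sigma / sqrt(n) <= e, i.e. n = Z^2 * sigma^2 / e^2."""
    return ceil(z ** 2 * variance / e ** 2)   # round up to a whole observation


# Example 1: variance of about 1,800,000 and an allowable error of $500.
n = sample_size_mean(variance=1_800_000, e=500)
print(n)
```

Halving the allowable error roughly quadruples the required sample size, because e enters the formula squared.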
Example 2:
A certain company makes light fixtures on an assembly line. An efficiency expert wants to
determine the mean time it takes an employee to assemble the switch on one of these fixtures. A
preliminary study used a random sample of 45 observations and found that the sample standard
deviation was S = 75 seconds. How many more observations are necessary for the efficiency
expert to be 95% sure that the point estimate will be "off" from the true mean by at most 15
seconds?
Solution:
The efficiency expert should use a sample of minimum size 104. Since the preliminary study has
45 observations, an additional 104 - 45 = 59 observations are necessary.
2.4.2. For Estimating a Population Proportion
The rationale for determining the sample size in estimating the population proportion is the same
as the one discussed for estimating the population mean. In this case the adequate sample size
is derived from the confidence interval for the proportion:

\[ P = p \pm Z \sqrt{\frac{pq}{n}} \]

Let \(e = Z \sqrt{\dfrac{pq}{n}}\), so that \(P = p \pm e\). Then

\[ e^2 = Z^2 \frac{pq}{n} \quad \Rightarrow \quad n = \frac{Z^2 pq}{e^2} \]
Example 1:
A market research firm wants to estimate the proportion of households in a certain area that have
color television sets. The firm would like to estimate p to within 0.05 with 95% confidence. No
estimate of P is available. Determine the sample size.
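Solution:
\(Z_{\alpha/2} = 1.96\) (table reading for an area of 0.95/2 = 0.4750). Since no estimate of P is
available, p = q = 0.5 is used, which gives the largest value of pq.

\[ n = \frac{Z^2 pq}{e^2} = \frac{(1.96)^2 (0.5)(0.5)}{(0.05)^2} = 384.16 \approx 385 \]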
Example 2:
Mr. Abebe, a Member of Parliament, wants to determine his popularity in a certain part of the
state. He indicates that the proportion of voters who will vote for him must be estimated within
± 2% of the population proportion. Further, the 0.95 degree of confidence is to be used. In past
elections he received 40% of the popular vote in that area of the state. He doubts whether it has
changed much. How many registered voters should be sampled?
Solution

\[ n = \frac{Z^2 pq}{e^2} = \frac{(1.96)^2 (0.4)(0.6)}{(0.02)^2} = 2304.96 \approx 2305 \]
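Both proportion examples follow the same formula and differ only in the planning value of p. The sketch below is illustrative: the function name `sample_size_proportion` is hypothetical, Z = 1.96 is the table value for 95% confidence, and p = 0.5 is used as the default when no prior estimate of P exists.

```python
from math import ceil


def sample_size_proportion(e, p=0.5, z=1.96):
    """Required n = Z^2 * p * q / e^2, rounded up to a whole observation."""
    q = 1 - p
    return ceil(z ** 2 * p * q / e ** 2)


# Example 1: no prior estimate of P, error within 0.05.
n_tv = sample_size_proportion(e=0.05)

# Example 2: prior estimate p = 0.40, error within 0.02.
n_voters = sample_size_proportion(e=0.02, p=0.4)

print(n_tv, n_voters)
```

Using p = 0.5 is the conservative choice, since pq is largest there and any other value of p would call for a smaller sample.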