Chapter 2 Statistics Estimation Final
In statistical inference, one draws conclusions about a population from the results obtained from a
sample selected from that population. Estimation is thus the process of estimating unknown
population parameters from sample statistics.
Statistics For Management II, Chapter Two – Statistical Estimation 2023
Any sample statistic that is used to estimate a population parameter is called an estimator and an
estimate is a numerical value of an estimator.
The sample mean is often used as an estimator of the population mean. Suppose that we calculate
the mean daily revenue of a store for a random sample of 6 days and find it to be 1110 birr. If we
use this value to estimate the daily revenue for the whole year, then the value 1110 birr would be
an estimate.
Proportion: the population proportion $\pi$ is estimated by the sample proportion $P = \dfrac{X}{n}$.
Example 1: One strategy for setting the price of a product is competition-oriented: you fix the
price of your product at the average level charged by other producers. Suppose you want to
market a 200-gram bar of soap that you produce. The current wholesale prices (in Birr) charged by a
random sample of 10 soap producers are:
1.00 1.35 1.50 0.95 0.90 1.25 1.00 1.20 0.90 1.50
What is an estimate of the mean wholesale price charged by all soap producers? What is an estimate
of the standard deviation of the wholesale prices of all the producers?
Solution
The mean wholesale price of all producers, the population mean ($\mu$), is estimated by the sample mean $\bar{X}$:
$$\bar{X} = \frac{\sum_i x_i}{n} = \frac{1.00 + 1.35 + \cdots + 1.50}{10} = 1.155$$
Thus, an estimate of the mean wholesale price charged by all soap producers is 1.155 Birr. Based
on this information, you might set the wholesale price per unit of your product at 1.155 Birr.
The standard deviation of the wholesale prices of all producers, the population standard
deviation ($\sigma$), is estimated by the sample standard deviation:
$$S = \sqrt{\frac{\sum_i (X_i - \bar{X})^2}{n-1}} = \sqrt{\frac{(1.00 - 1.155)^2 + (1.35 - 1.155)^2 + \cdots + (1.50 - 1.155)^2}{9}} = 0.237$$
Thus, the wholesale prices fluctuate below and above their mean by about 0.237 Birr, which is an
estimate of the standard deviation in the wholesale prices of all producers.
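The two point estimates in Example 1 can be reproduced with a short Python sketch. It assumes only the ten prices listed in the example and uses the standard library's `statistics` module, whose `stdev` function divides by n − 1 as in the formula for S.

```python
import statistics

# Wholesale prices (in Birr) of the 10 sampled soap producers from Example 1.
prices = [1.00, 1.35, 1.50, 0.95, 0.90, 1.25, 1.00, 1.20, 0.90, 1.50]

# Point estimate of the population mean: the sample mean.
sample_mean = statistics.mean(prices)

# Point estimate of the population standard deviation.
# statistics.stdev uses the n - 1 divisor, matching the formula for S.
sample_sd = statistics.stdev(prices)

print(f"sample mean = {sample_mean:.3f} Birr")
print(f"sample standard deviation = {sample_sd:.3f} Birr")
```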
Example 2: Suppose you want to know the proportion of fish that are poisoned as a
result of chemical pollution of a certain lake. In a random sample of 400 fish caught from this
lake, 55 were found to be inedible. What is an estimate of the proportion of inedible fish
among all the fish in this lake?
Solution
The proportion of inedible fish in the entire lake is the population proportion ($\pi$).
It is estimated by the sample proportion:
$$P = \frac{X}{n} = \frac{55}{400} = 0.1375 = 13.75 \text{ percent}$$
Although point estimates are often useful, they do have one serious drawback: we do not know
how close or far these values are from the population value they are supposed to estimate, and
hence, we cannot be certain of their reliability. In other words, a point estimate will be more
useful if it is accompanied by an estimate of the error that might be involved. To this end, we use
interval estimation.
An interval estimate is a range of values, built around the point estimate, that is expected to
contain the population parameter with a stated level of confidence. For example, if the sample
mean is 0.28, one may report that the population mean lies between 0.25 and 0.31 with 95 percent
confidence; i.e. the 95 percent confidence interval of the population mean is (0.25, 0.31).
Clearly, this interval contains the point estimate 0.28.
Case I. Sampling From a Normally Distributed Population with Known standard deviation
Recall that $Z_{\alpha}$ denotes the value of Z for which the area under the standard normal curve to its right is
equal to $\alpha$. Analogously, $Z_{\alpha/2}$ denotes the value of Z for which the area to its right is $\alpha/2$, and $-Z_{\alpha/2}$
denotes the value for which the area to its left is $\alpha/2$. Hence
$$P(-Z_{\alpha/2} < Z < Z_{\alpha/2}) = 1 - \alpha$$
But we know that $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ follows the standard normal distribution. Thus
$$P\left(-Z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-Z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{X} - \mu < Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$
$$P\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$$
Thus, a $(1 - \alpha)100\%$ confidence interval for the population mean is given by:
$$\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$$
where $\bar{X}$ is the sample mean and $Z_{\alpha/2}$ is the value of Z for which the area to its right is $\alpha/2$.
The most common confidence intervals are the 99, 95 and 90 percent confidence intervals.
If we use the 95 percent level of confidence, then we expect about 95 percent of the intervals
constructed in this way to contain the parameter being estimated. Another interpretation of the
95 percent confidence interval is that 95 percent of the sample means for a specified sample size
will lie within 1.96 standard errors of the population mean (1.96 is the Z value for which the
area between 0 and Z is 0.95/2 = 0.475). Similarly, for a 99 percent confidence interval,
99 percent of the sample means will lie within 2.58 standard errors of the population mean.
Values of $Z_{\alpha/2}$ for the most commonly used confidence levels
Confidence level    $\alpha$    $\alpha/2$    $Z_{\alpha/2}$
90%                 0.10        0.05          1.645
95%                 0.05        0.025         1.96
99%                 0.01        0.005         2.58
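The tabulated values can be checked numerically. The sketch below is one possible way to do so, using `NormalDist.inv_cdf` from Python's standard library; the helper name `z_critical` is made up for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return Z_{alpha/2}: the standard normal value with area alpha/2 to its right."""
    alpha = 1 - confidence
    # Area alpha/2 to the right means area 1 - alpha/2 to the left.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

To three decimals the 99 percent value is 2.576, which the table rounds to 2.58.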
Here, one minus the confidence level is known as the significance level ($\alpha$). For a 95 percent
confidence level, $\alpha$ = 1 − 0.95 = 0.05, and the $(1 - \alpha)100$ percent confidence interval is the
(1 − 0.05)100 = 95 percent confidence interval. If $\alpha$ = 0.01, the $(1 - \alpha)100$ percent confidence
interval is the (1 − 0.01)100 = 99 percent confidence interval. Here $\alpha$ is called the significance
level and $1 - \alpha$ is called the confidence coefficient.
The total area under the normal curve is 1. Equivalently, 95 percent of the area under the
standard normal curve lies between the Z values −1.96 and 1.96, and 99 percent of the area lies
between the Z values −2.58 and 2.58.
Thus, the 95 percent confidence interval for the mean when the standard deviation is known is given by
$$\bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
and the 99 percent confidence interval is given by
$$\bar{X} \pm 2.58\,\frac{\sigma}{\sqrt{n}}$$
Example: A normal infinite population has a standard deviation of 10. A random sample of size
25 has a mean of 50. Construct a 95% confidence interval for the population mean.
Given: $\sigma = 10$, $n = 25$, $\bar{X} = 50$, 95% confidence level.
Answer:
$$\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}$$
$$50 - (1.96)(2) \le \mu \le 50 + (1.96)(2)$$
$$46.08 \le \mu \le 53.92$$
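The calculation in this example can be sketched in Python as follows. The function name `z_interval` and its parameters are hypothetical, chosen for this illustration; the critical value comes from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    margin = z * sigma / sqrt(n)              # Z_{alpha/2} * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Values from the example: sigma = 10, n = 25, sample mean = 50.
lower, upper = z_interval(sample_mean=50, sigma=10, n=25, confidence=0.95)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```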
In this case we may say that “we are 95% confident that the population mean lies within 46.08
and 53.92.” This statement does not mean that the chance is 0.95 that the population mean
falls within the interval established from this one sample. Instead, it means
that if we select many random samples of the same size and compute a confidence interval
for each of these samples, then in about 95 percent of these cases the population mean will lie
within the interval.
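This repeated-sampling interpretation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it supposes a normal population whose true mean is 50 and standard deviation is 10 (values borrowed from the example, with the true mean assumed for the simulation), draws many samples of size 25, and counts how often the 95 percent interval contains the true mean.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population for the simulation (hypothetical true mean).
MU, SIGMA, N, Z = 50.0, 10.0, 25, 1.96
TRIALS = 2000

def interval_covers_mu():
    """Draw one sample, build its 95% interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = sum(sample) / N
    margin = Z * SIGMA / sqrt(N)
    return mean - margin <= MU <= mean + margin

coverage = sum(interval_covers_mu() for _ in range(TRIALS)) / TRIALS
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should be close to 0.95, although it varies slightly from run to run when the seed is changed.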
Exercise: A normal infinite population has a standard deviation of 10. A random sample of size
25 has a mean of 50. Construct a 95% confidence interval of the population mean?
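When the population standard deviation $\sigma$ is unknown, it is estimated by the sample standard
deviation S, and the standard error of the mean is estimated by $S_{\bar{X}}$:
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}}, \qquad S_{\bar{X}} = \frac{S}{\sqrt{n}}$$
In this case, the construction of the confidence interval estimate depends on whether the sample
size is large or small: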
Case 1. When the sample size is large and $\sigma$ is unknown (a sample size is large when n ≥ 30)
$$\mu = \bar{X} \pm Z_{\alpha/2}\, S/\sqrt{n}$$
For instance, the 95% confidence interval is given by
$$\bar{X} \pm 1.96\,\frac{S}{\sqrt{n}}$$
and the 99% confidence interval is given by
$$\bar{X} \pm 2.58\,\frac{S}{\sqrt{n}}$$
where $\bar{X}$ is the sample mean and S is the sample standard deviation.
Example 1: In a certain small city, to estimate the mean monthly expenditure for food, a random
sample of 25 households was selected, yielding a mean of 200 Birr. From experience, it
is known that such expenditures are normally distributed with a standard deviation of 50 Birr.
a) What is the point estimate of the mean monthly expenditures for food of all households in
the city?
b) Find a 95 percent confidence interval for the mean monthly expenditures for food of all
households in the city.
Solution:
a) Given: $\bar{X}$ = 200 Birr, $\sigma$ = 50 Birr, n = 25
A point estimate of the population mean is the sample mean $\bar{X}$.
Thus, the point estimate of $\mu$ is $\bar{X}$ = 200 Birr.
b) For a 95% confidence interval, first find the significance level $\alpha$:
$$(1 - \alpha)100\% = 95\% \;\Rightarrow\; 1 - \alpha = \frac{95}{100} \;\Rightarrow\; \alpha = 1 - \frac{95}{100} = \frac{100 - 95}{100} = \frac{5}{100} = 0.05$$
Then $Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = 1.96$ (from the standard normal table).
Thus, a 95% confidence interval for the mean is
$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 200 \pm 1.96\left(\frac{50}{\sqrt{25}}\right) = 200 \pm 19.6 = (180.40 \text{ Birr},\ 219.60 \text{ Birr})$$
i.e. we are 95 percent confident that the true mean monthly expenditure for food ($\mu$) is between
180.40 Birr and 219.60 Birr.
Example 2: A manufacturer claims that his tyres last 20,000 miles on average. A consumer
organization tests a random sample of 64 tyres and reports an average of 19,200 miles with a
standard deviation of 2,000 miles. Does a 99% confidence interval for the mean life of all tyres
produced by the manufacturer support the claim?
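Solution:
Given: n = 64, $\bar{X}$ = 19,200 miles, S = 2,000 miles. Although we have no information about the
normality of the population, by the central limit theorem the sampling distribution of the sample
mean is approximately normal for large n (n ≥ 30). In our case n = 64 ≥ 30, so the normal
approximation applies.
For a 99% confidence interval, $\alpha$ = 0.01 and $\alpha/2$ = 0.005, and from the standard normal table,
$Z_{\alpha/2} = Z_{0.005} = 2.58$.
Thus, a 99% confidence interval for the mean ($\mu$) is:
$$\bar{X} \pm Z_{\alpha/2}\, S/\sqrt{n} = 19{,}200 \pm 2.58\left(\frac{2{,}000}{\sqrt{64}}\right) = 19{,}200 \pm 645 = (18{,}555 \text{ miles},\ 19{,}845 \text{ miles})$$
Since the claimed mean of 20,000 miles lies outside this interval, the 99% confidence interval
does not support the manufacturer's claim.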
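The large-sample interval for the tyre example can be evaluated with a few lines of Python. This is a sketch under the example's own figures; the function name `large_sample_interval` is hypothetical, and the critical value 2.58 is passed in directly as read from the table.

```python
from math import sqrt

def large_sample_interval(sample_mean, s, n, z):
    """Interval for the mean when n >= 30 and sigma is estimated by s."""
    margin = z * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

claimed_mean = 20_000  # the manufacturer's claim, in miles

# Example values: n = 64, sample mean = 19,200, S = 2,000, Z_{0.005} = 2.58.
lower, upper = large_sample_interval(sample_mean=19_200, s=2_000, n=64, z=2.58)
claim_supported = lower <= claimed_mean <= upper

print(f"99% confidence interval: ({lower:.0f}, {upper:.0f}) miles")
print("claim supported by the interval:", claim_supported)
```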
Case II. When the Sample Size is Small and $\sigma$ is Unknown (a sample size is small when n < 30).
If the population standard deviation $\sigma$ is not known, then it must be estimated by the sample
standard deviation S, where
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n-1}}, \qquad S_{\bar{X}} = \frac{S}{\sqrt{n}}$$
In this situation, since $\sigma$ is estimated by S, the sampling distribution of the standardized mean
deviates from the normal distribution for small samples. Instead, when the population is normally
distributed, the statistic $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ follows Student's t-distribution with n − 1 degrees of freedom.
For n ≥ 30, the t-distribution can be approximated by the normal distribution. Like the
standard normal distribution, the t-distribution is symmetrical about its mean of 0, but it is
flatter, with more area in the tails. As the sample size increases, the t-distribution
loses its flatness and becomes approximately normal.
The shape of the t-distribution is determined by the degrees of freedom. Degrees of freedom can
be defined as the number of values we can choose freely. Suppose we are dealing with a sample
of size n = 6 and we know the mean of these 6 numbers is 5. Symbolically, we have:
$$\frac{a + b + c + d + e + f}{6} = 5$$
Now, we are free to assign any value to a, b, c, d and e,
say a = 3, b = 2, c = 4, d = 5 and e = 3. But we are no longer free to assign a value to f, since:
$$\frac{a + b + c + d + e + f}{6} = 5 \;\Rightarrow\; \frac{17 + f}{6} = 5 \;\Rightarrow\; 17 + f = 30 \;\Rightarrow\; f = 13$$
That is, in order for the mean of these 6 numbers to be 5, f must be 13. If we assign another
number to f, then the mean will not be equal to 5. Thus, we are free to choose only 5 values and
the 6th one is determined automatically.
Hence, the degrees of freedom are:
n − 1 = 6 − 1 = 5
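The same bookkeeping can be written as a tiny Python sketch, using the values chosen above for a to e; the variable names are made up for this illustration.

```python
n = 6
required_mean = 5
free_values = [3, 2, 4, 5, 3]   # a, b, c, d, e: chosen freely

# The sixth value is forced by the constraint sum = n * mean.
f = n * required_mean - sum(free_values)
degrees_of_freedom = n - 1

print("forced value of f:", f)
print("degrees of freedom:", degrees_of_freedom)
```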
Generally, for a sample of size n, the degrees of freedom are n − 1. The values of $t_{\alpha/2}$ for different
degrees of freedom and different values of $\alpha$ are tabulated. $t_{\alpha/2}(n-1)$ denotes the value of t for which
the area under the curve to its right is equal to $\alpha/2$ with (n − 1) degrees of freedom.
Example 1: a) For n = 20 and $\alpha/2$ = 0.025, find $t_{\alpha/2}(n-1)$.
Solution:
From the t-distribution table, $t_{0.025}(19) = 2.093$ (shaded area = 0.025).
b) If n = 26 and $\alpha/2$ = 0.005,
then $t_{\alpha/2}(n-1) = t_{0.005}(25) = 2.787$
(from the t-distribution table).
In such situations, a $(1 - \alpha)100\%$ confidence interval for the population mean is given by:
$$\bar{X} \pm t_{\alpha/2}(n-1)\, S/\sqrt{n}$$
Example 2:
One measure of a company’s financial health is its debt-to-equity ratio. This quantity is defined as
the ratio of the company’s corporate debt to the company’s equity. If this ratio is too high, it
is one indication of financial instability. For obvious reasons, banks often monitor the financial
health of companies to which they have extended commercial loans. Suppose that, in order to
reduce risk, a large bank has decided to initiate a policy limiting the mean debt-to-equity ratio for
its portfolio of commercial loans to 1.5. In order to estimate the mean debt-to-equity ratio of its
loan portfolio, the bank randomly selects a sample of 15 of its commercial loan accounts. Audits
of these companies result in the following debt-to-equity ratios:
1.31 1.05 1.45 1.21 1.19
1.78 1.37 1.41 1.22 1.11
1.48 1.33 1.29 1.32 1.65
A stem-and-leaf display of these ratios is reasonably mound shaped. Furthermore, the sample
mean and standard deviation of these ratios can be calculated to be $\bar{X}$ = 1.343 and S = 0.192.
Suppose that the bank wishes to calculate a 95% confidence interval for the loan portfolio’s mean
debt-to-equity ratio. Since the bank has taken a small sample of size 15, it is appropriate to
calculate an interval based on the t-distribution. We have n − 1 = 15 − 1 = 14 degrees of
freedom, and the level of confidence $100(1 - \alpha)$ percent = 95 percent implies that $\alpha$ = 0.05
significance level. Therefore, we use the t point $t_{\alpha/2} = t_{0.05/2} = t_{0.025} = 2.145$ at 14 degrees of freedom
(from the table). It follows that the 95 percent confidence interval for $\mu$ is
$$\bar{X} \pm t_{0.025}\,\frac{S}{\sqrt{n}} = 1.343 \pm 2.145\left(\frac{0.192}{\sqrt{15}}\right) = 1.343 \pm 0.106 = (1.237,\ 1.449)$$
This interval says that the bank is 95 percent confident that the mean debt-to-equity ratio for its
portfolio of commercial loan accounts is between 1.237 and 1.449. Based on this interval, the bank
has strong evidence that the portfolio’s mean ratio is less than 1.5 (or that the bank is in
compliance with its new policy).
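The small-sample interval can be sketched in Python from the summary figures quoted in the example (mean 1.343, S = 0.192, n = 15). The standard library has no t-distribution quantile function, so this sketch takes the table value 2.145 as an input; the function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_critical):
    """Small-sample interval: mean +/- t_{alpha/2}(n-1) * s / sqrt(n)."""
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

policy_limit = 1.5

# Summary statistics and table value t_{0.025}(14) = 2.145 from the example.
lower, upper = t_interval(sample_mean=1.343, s=0.192, n=15, t_critical=2.145)

print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
print("entire interval below the policy limit:", upper < policy_limit)
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.975, 14)` should return the same critical value without a table lookup.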
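For a population proportion, $\pi$ is unknown and we estimate it by the sample proportion $p$. The
standard error is computed from $p$ and $q = 1 - p$, and the interval becomes
$$p - Z_{\alpha/2}\sqrt{\frac{pq}{n}} \le \pi \le p + Z_{\alpha/2}\sqrt{\frac{pq}{n}}$$
The formula to construct an interval estimate for $\pi$ is therefore
$$p \pm Z_{\alpha/2}\, S_p$$
where
$p$ = sample proportion
$\alpha$ = 1 − c, with c the confidence level
$S_p = \sqrt{pq/n}$ = sample standard error of the proportion
$\pi$ = unknown population proportion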
Example: In a certain region, a sample of 500 members of the labor force showed that 40 were
unemployed. Find the 95% confidence interval for the proportion unemployed in the region.
Given: $p = \dfrac{40}{500} = 0.08$, $q = 1 - 0.08 = 0.92$,
n = 500, c = 0.95, $\alpha$ = 1 − c = 0.05, $Z_{\alpha/2} = Z_{0.025} = 1.96$ (from the Z table)
Answer:
$$0.08 - 1.96\sqrt{\frac{0.08 \times 0.92}{500}} \le \pi \le 0.08 + 1.96\sqrt{\frac{0.08 \times 0.92}{500}}$$
$$0.05622 \le \pi \le 0.10378$$
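The unemployment example can be reproduced with a short Python sketch. The function name `proportion_interval` and its parameters are hypothetical; the critical value 1.96 is passed in as read from the table.

```python
from math import sqrt

def proportion_interval(successes, n, z):
    """Interval for a population proportion: p +/- z * sqrt(p * q / n)."""
    p = successes / n
    q = 1 - p
    standard_error = sqrt(p * q / n)   # S_p in the notation of the text
    margin = z * standard_error
    return p - margin, p + margin

# Example values: 40 unemployed out of 500, Z_{0.025} = 1.96.
lower, upper = proportion_interval(successes=40, n=500, z=1.96)
print(f"95% confidence interval: ({lower:.5f}, {upper:.5f})")
```

The same function can be applied to the exercise that follows by passing the exercise's counts and the critical value for the required confidence level.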
EXERCISE: Of 900 consumers surveyed, 414 said they were very enthusiastic about a new
home decor scheme. Construct a 99% confidence interval for the population proportion.
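Suppose we have random samples of $n_{x_1}$ and $n_{x_2}$ observations from normal distributions with
means $\mu_1$ and $\mu_2$ and known variances $\sigma_{x_1}^2$ and $\sigma_{x_2}^2$. A $(1 - \alpha)100\%$ confidence interval for the
difference between the two population means is:
$$(\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_{x_1}^2}{n_{x_1}} + \frac{\sigma_{x_2}^2}{n_{x_2}}} \le (\mu_1 - \mu_2) \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_{x_1}^2}{n_{x_1}} + \frac{\sigma_{x_2}^2}{n_{x_2}}}$$
With $\bar{X}_1 = 2500$, $\bar{X}_2 = 2000$, $\sigma_{x_1} = 100$, $\sigma_{x_2} = 150$, $n_{x_1} = 60$, $n_{x_2} = 120$ and $Z_{\alpha/2} = 1.645$:
$$(2500 - 2000) - 1.645\sqrt{\frac{100 \times 100}{60} + \frac{150 \times 150}{120}} \le (\mu_1 - \mu_2) \le (2500 - 2000) + 1.645\sqrt{\frac{100 \times 100}{60} + \frac{150 \times 150}{120}}$$
$$500 - 1.645(18.82) \le (\mu_1 - \mu_2) \le 500 + 1.645(18.82)$$
$$469.04 \le (\mu_1 - \mu_2) \le 530.96$$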
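The interval for the difference between two means can be evaluated directly from the formula. The sketch below assumes the figures substituted above (sample means 2500 and 2000, standard deviations 100 and 150, sample sizes 60 and 120, critical value 1.645); the function name `mean_difference_interval` is hypothetical.

```python
from math import sqrt

def mean_difference_interval(mean1, mean2, sigma1, sigma2, n1, n2, z):
    """Interval for mu1 - mu2 when both population variances are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    margin = z * standard_error
    difference = mean1 - mean2
    return difference - margin, difference + margin

lower, upper = mean_difference_interval(
    mean1=2500, mean2=2000, sigma1=100, sigma2=150, n1=60, n2=120, z=1.645
)
print(f"interval for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```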
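For the difference between two population proportions $\pi_1$ and $\pi_2$, with sample proportions $p_1$ and $p_2$ and $q = 1 - p$:
$$(p_1 - p_2) - Z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \le (\pi_1 - \pi_2) \le (p_1 - p_2) + Z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$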
Example: A sample of 100 T-shirts is taken from market ABC, and 20 are yellow. Another
sample of 150 is taken from market XYZ, and 40 are yellow. Find the confidence interval for
the difference between the two population proportions at the 92% confidence level.
Given:
Market ABC                        Market XYZ
n1 = 100                          n2 = 150
p1 = 20/100 = 0.2                 p2 = 40/150 = 0.267
q1 = 1 − p1 = 1 − 0.2 = 0.8       q2 = 1 − p2 = 1 − 0.267 = 0.733
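The population proportions ($\pi_1$ and $\pi_2$) are not known, so the sample proportions $p_1$ and $p_2$ are
used as their estimators. For a 92% confidence level, $\alpha$ = 0.08 and $Z_{\alpha/2} = Z_{0.04} = 1.75$. Then
$$(p_1 - p_2) - Z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \le (\pi_1 - \pi_2) \le (p_1 - p_2) + Z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
$$\sqrt{\frac{0.2 \times 0.8}{100} + \frac{0.267 \times 0.733}{150}} = \sqrt{0.0016 + 0.0013} = 0.0539$$
(0.2 − 0.267) − 1.75 × 0.0539 ≤ ($\pi_1 - \pi_2$) ≤ (0.2 − 0.267) + 1.75 × 0.0539
−0.067 − 0.0943 ≤ ($\pi_1 - \pi_2$) ≤ −0.067 + 0.0943
−0.1613 ≤ ($\pi_1 - \pi_2$) ≤ 0.0273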
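The two-proportion formula can be evaluated with a short Python sketch using the counts from the T-shirt example. The function name `proportion_difference_interval` is hypothetical. The sketch works with the unrounded proportion 40/150, so its third decimal place can differ slightly from a hand calculation that rounds to 0.267.

```python
from math import sqrt

def proportion_difference_interval(x1, n1, x2, n2, z):
    """Interval for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z * standard_error
    difference = p1 - p2
    return difference - margin, difference + margin

# Market ABC: 20 yellow of 100; market XYZ: 40 yellow of 150; Z_{0.04} = 1.75.
lower, upper = proportion_difference_interval(x1=20, n1=100, x2=40, n2=150, z=1.75)
print(f"92% interval for pi1 - pi2: ({lower:.4f}, {upper:.4f})")
```

Because the interval contains 0, the sample does not rule out equal proportions of yellow T-shirts in the two markets at this confidence level.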