stat2 chapter 2-1
Chapter TWO
Statistical Estimation
2.1. Basic Concepts
Estimator and estimates
Any sample statistic used to estimate or measure a population parameter is called an estimator. The sample
mean \(\bar{x}\) can be an estimator of the population mean μ, and the sample proportion can be used as an
estimator of the population proportion. We can also use the sample range as an estimator of the population range.
When we have observed a specific numerical value of our estimator, we call that value an estimate. In other
words, an estimate is a specific observed value of a statistic. We form an estimate by taking a sample and
computing the value taken by our estimator in that sample.
Estimator: A sample statistic that is used to estimate a population parameter. A good estimator should be
unbiased, consistent, and relatively efficient.
E.g. \(\bar{x}\), \(\bar{p}\), \(\bar{x}_1 - \bar{x}_2\), \(\bar{p}_1 - \bar{p}_2\), etc.
Estimate: Any one of the possible values that an estimator can assume; the value observed in a particular sample.
Statistical estimation refers to the procedure of using a sample statistic (measure) to estimate a population
parameter.
Estimation: The process of predicting or estimating an unknown population parameter through sampling,
that is, the process of using a sample statistic to estimate the unknown population parameter.
Types of statistical estimates
We can make two types of estimates about a population: a point estimate and an interval estimate.
A. Point estimate: A single number that is used to estimate an unknown population parameter. A point
estimate is a value computed from the sample that is used to estimate the population parameter. The
sample mean \(\bar{x}\) is a point estimator of the population mean μ, and the sample variance \(S^2\) is a
point estimator of the population variance \(\sigma^2\). The sample standard deviation S is used to estimate
the population standard deviation σ, and the sample proportion \(\bar{p}\) estimates the population
proportion p.
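As a small illustration of point estimation, the following Python sketch computes the point estimates named above from a sample. The data values and the counts used for the proportion are hypothetical, chosen only for illustration, and do not come from this chapter.

```python
import statistics

# Hypothetical sample, for illustration only (not taken from the text)
sample = [3, 5, 7, 9]

x_bar = statistics.mean(sample)      # point estimate of the population mean
s2 = statistics.variance(sample)     # sample variance (divides by n - 1)
s = statistics.stdev(sample)         # sample standard deviation

# Hypothetical counts: 12 "successes" observed in a sample of 40
successes, n = 12, 40
p_bar = successes / n                # point estimate of the population proportion

print(x_bar, s2, s, p_bar)
```

Each printed number is an estimate: the value that the corresponding estimator takes in this one sample.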
Example 1: Suppose we have the following random sample of n = 6 elements from a population whose
parameter values are not known: 2 4 5 7 11
\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{6} = 5 \]
Statistics for management II Ch-2: Statistical Estimation
University of Gondar Department of Management
Example: The mean age of men attending a show is between 28 and 36 years. Inequality notation is
sometimes used to express an interval estimate, so we can write that the mean age of men attending the
show satisfies 28 ≤ mean age ≤ 36.
The interval estimate incorporates:
a. A measure of variability, i.e. the standard error of the point estimate
b. A confidence coefficient (level), which measures how confident we are that the interval is correct.
For example, a confidence interval estimate of μ is an interval estimate together with a statement of how
confident we are that the interval is correct.
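\[ \bar{x} \pm Z\frac{\sigma}{\sqrt{n}} \]
Therefore, the formula for calculating an interval estimate of the mean μ is as follows:
\[ \bar{x} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z\frac{\sigma}{\sqrt{n}} \]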
The choice of method used in constructing a confidence interval estimate for μ depends upon whether the
sample is large or small and whether the population standard deviation σ is known or unknown.
2.2.1 Interval Estimation of a Population Mean: Case 1: Large-Sample Case (n ≥ 30), with σ Known
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
Where: \(\bar{x}\) = the sample mean
C = 1 − α is the confidence coefficient
\(z_{\alpha/2}\) = the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
σ = the population standard deviation
n = the sample size
α = 1 − C
Example 1: The vice president of operations for Ethiopian Telecommunication Corporation is in the process
of developing a strategic management plan. He believes that the ability to estimate the length of the average
phone call on the system is important. Because thousands of calls have been placed on the system, averaging
all the calls is virtually impossible. He takes a random sample of 60 calls from the company records and finds
that the sample mean length of a call is 4.26 minutes. Past history for these types of calls has shown that the
population standard deviation for call length is about 1.1 minutes. Assuming that the population is normally
distributed and that he wants 95% confidence, help him estimate the population mean.
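Given:
Sample mean \(\bar{x}\) = 4.26
Population standard deviation σ = 1.1
n = 60
C = 0.95
\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]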
If he wants to produce a single-number estimate of the value of μ, he can use the sample mean, \(\bar{x}\) = 4.26. In this case,
the sample mean is a point estimate of the population mean. However, he realizes that if he were to randomly sample
another 60 telephone calls, the second sample mean would likely differ from the first, and hence the point estimate
would change. In fact, for every sample taken, the likelihood is strong that the point estimate would change. Thus,
estimating a population parameter with an interval estimate is often preferable. In using an interval estimate, the researcher
must select a desired level of confidence; some of the more common ones are 90%, 95%, 98% and 99%. The
researcher cannot simply select the highest confidence level, because there are trade-offs among sample size, interval
width and the level of confidence.
In our example, the vice president selected a 95% confidence level. When using a 95% level of confidence, he is selecting
an interval centered on μ within which 95% of all sample mean values will fall, as shown in the figure below.
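[Figure: normal curve of sample means \(\bar{x}\) centered on μ; the central 95% lies between z = −1.96 and z = +1.96, with an area of 0.4750 on each side of the mean and 0.025 in each tail]
Figure 4-1 Distribution of sample means about population mean for 95% confidence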
Because the distribution is symmetric and the interval extends equally on each side of the population mean, ½(95%), or
0.4750, of the area falls on each side of the mean. The Z table gives a Z value of 1.96 for this portion of the normal
curve, so the Z value for a 95% confidence interval is always 1.96. In other words, of all the possible \(\bar{x}\) values along
the horizontal axis of the diagram, 95% of them should fall within a Z score of 1.96 from the population mean. Now he
can estimate the average call length by substituting 1.96 in place of Z.
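\[ \bar{x} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z\frac{\sigma}{\sqrt{n}} \]
\[ 4.26 - 1.96\frac{1.1}{\sqrt{60}} \le \mu \le 4.26 + 1.96\frac{1.1}{\sqrt{60}} \]
\[ 3.98 \le \mu \le 4.54 \]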
This can be interpreted as follows: the vice president can be 95% confident that the average length of a call for the
population is between 3.98 and 4.54 minutes. This means that if he were to randomly select 100 samples of 60 calls each
and use the results of each sample to construct a 95% confidence interval, approximately 95 of the 100 intervals would
contain the population mean. It also indicates that about 5% of the intervals would not contain the population mean.
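The calculation in this example can be sketched in Python. This is a minimal sketch, assuming the known-σ formula used above; the function name `z_interval` is illustrative, and the critical value is taken from the standard library's `statistics.NormalDist` rather than read from a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - confidence
    # z value with an area of alpha/2 in the upper tail (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the telephone-call example
low, high = z_interval(x_bar=4.26, sigma=1.1, n=60, confidence=0.95)
print(f"{low:.2f} <= mu <= {high:.2f}")   # 3.98 <= mu <= 4.54
```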
We get the Z value for a specific confidence level from the Z table.
You will find different tables at the back of this material. One of these tables is the Z table, which contains Z values for
different confidence levels. Let us now see how to look up the Z value for a specific confidence level.
For example, we have said that the Z value for a 95% confidence level is 1.96. To get this number from the table, we divide
the confidence level by two:
\[ \frac{0.95}{2} = 0.475 \]
Thus, we find this figure (0.4750) in the body of the table, in the row of 1.9 and under the column of 0.06. Combining the
row and column labels, 1.9 + 0.06 = 1.96, gives the z value for the 95% confidence level.
Here are the Z values corresponding to the most commonly used confidence levels.
C = (1 − α)    α       α/2      z(α/2)
90%            0.10    0.05     1.645
95%            0.05    0.025    1.96
98%            0.02    0.01     2.33
99%            0.01    0.005    2.575
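The values in this table can be checked without a printed Z table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is illustrative. The computed 99% value is about 2.576, which printed tables often list as 2.575 by interpolating between 2.57 and 2.58.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value leaving an area of alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

# Reproduce the table of commonly used confidence levels
for c in (0.90, 0.95, 0.98, 0.99):
    print(f"{c:.0%}  z = {z_critical(c):.3f}")
```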
2.2.2. Interval Estimation of a Population Mean: Case 2: Large sample case (n ≥ 30), with Standard
From the standard normal reference table, appears at the endpoints of the normal distribution.
So CI:
Example 4: Find the 99% confidence interval estimate of the true population mean income if a sample of 100
families gives a sample mean of $28,500. From previous experience we know that the population standard
deviation is $5,000.
Using α = 1 − 0.99 = 0.01, we find the z value for the endpoints of the CI, where each tail probability is
α/2 = 0.005: z = 2.57.
So our CI estimate is 28,500 ± 2.57 × 5,000/√100 = 28,500 ± 1,285, that is, from $27,215 to $29,785.
2.2.3. Interval Estimation of a Population Mean: Case 3: Small-Sample Case (n < 30), σ Is Unknown
If the population is normally distributed and σ is known, the large-sample interval-estimation procedure can
be used. If the population is normally distributed and σ is unknown, the appropriate interval estimate is
based on a probability distribution known as the t distribution.
t - Distribution
The t distribution is a family of similar probability distributions. A specific t distribution depends on a
parameter known as the degrees of freedom. As the number of degrees of freedom increases, the difference
between the t distribution and the standard normal probability distribution becomes smaller and smaller. A t
distribution with more degrees of freedom has less dispersion. The mean of the t distribution is zero.
Solution: a) Given: n = 25, \(\bar{x}\) = 32, S = 4.2, 1 − α = 0.95 ⟹ α = 0.05, α/2 = 0.025
⟹ \(t_{0.025,\,24}\) = 2.064 (from the t table).
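⟹ The required interval will be \(\left(\bar{x} - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right)\)
\[ = 32 \pm t_{0.025,\,24} \times \frac{4.2}{\sqrt{25}} \]
\[ = 32 \pm 2.064 \times \frac{4.2}{\sqrt{25}} \]
\[ = 32 \pm 1.73 \]
30.27 < μ < 33.73, i.e. (30.27, 33.73)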
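b) Given: n = 25, \(\bar{x}\) = 32, S = 4.2, 1 − α = 0.99 ⟹ α = 0.01, α/2 = 0.005
⟹ \(t_{0.005,\,24}\) = 2.797 (from the t table).
⟹ The required interval will be \(\left(\bar{x} - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right)\)
\[ = 32 \pm t_{0.005,\,24} \times \frac{4.2}{\sqrt{25}} \]
\[ = 32 \pm 2.797 \times \frac{4.2}{\sqrt{25}} \]
\[ = 32 \pm 2.35 \]
29.65 < μ < 34.35, i.e. (29.65, 34.35)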
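The two small-sample intervals can be reproduced with a short Python sketch. It assumes the critical t values are read from a t table, as in the solution, and passed in as an argument; the function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Small-sample interval for a mean: x_bar +/- t * s / sqrt(n).

    t_crit is the table value t(alpha/2, n - 1) supplied by the caller.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Part a: 95% confidence, t(0.025, 24) = 2.064
ci_95 = t_interval(32, 4.2, 25, t_crit=2.064)
# Part b: 99% confidence, t(0.005, 24) = 2.797
ci_99 = t_interval(32, 4.2, 25, t_crit=2.797)

print([round(v, 2) for v in ci_95])   # [30.27, 33.73]
print([round(v, 2) for v in ci_99])   # [29.65, 34.35]
```

Raising the confidence level from 95% to 99% with the same sample widens the interval, because the critical t value is larger.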
2.3. Sample Size for an Interval Estimate of a Population Mean
Let E = the maximum sampling error mentioned in the precision statement. E is the amount added to and subtracted
from the point estimate to obtain an interval estimate. E is often referred to as the margin of error.
\[ E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
Solving for n we have
\[ n = \frac{(z_{\alpha/2})^2\sigma^2}{E^2} \]
Example: Suppose that National's management team wants an estimate of the population mean such that there
is a .95 probability that the sampling error is $500 or less, and σ = 4,500. How large a sample size is
needed to meet the required precision?
\[ z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 500 \]
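The sample-size formula can be applied to this example with a few lines of Python. This is a sketch under the stated inputs (z = 1.96 for .95 probability, σ = 4,500, E = 500); the function name is illustrative, and rounding up to the next whole number is the usual convention, assumed here so that the margin of error does not exceed E.

```python
from math import ceil

def sample_size_for_mean(z, sigma, E):
    """Smallest n with z * sigma / sqrt(n) <= E, i.e. n >= (z * sigma / E)^2."""
    return ceil((z * sigma / E) ** 2)

n = sample_size_for_mean(z=1.96, sigma=4500, E=500)
print(n)   # 312
```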
\(z_{\alpha/2}\) is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution.
\[ \bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
.44 ± .0435
PSI is 95% confident that the proportion of all voters who favor the candidate is between .3965 and .4835.
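The interval for a proportion can be sketched in Python as follows. The sample size is not stated in the surviving text, so n = 500 below is an assumption, chosen because it reproduces the .0435 margin of error reported above; the function name is also illustrative.

```python
from math import sqrt

def proportion_interval(p_bar, n, z):
    """Interval p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin, margin

# p_bar = .44 and z = 1.96 come from the example; n = 500 is an ASSUMED value
low, high, margin = proportion_interval(p_bar=0.44, n=500, z=1.96)
print(round(margin, 4), round(low, 4), round(high, 4))   # 0.0435 0.3965 0.4835
```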
2.5. Sample Size for an Interval Estimate of a Population Proportion
Let E = the maximum sampling error mentioned in the precision statement.
\[ E = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]
Solving for n we have
\[ n = \frac{(z_{\alpha/2})^2\,\bar{p}(1-\bar{p})}{E^2} \]
Example: Suppose that PSI would like a .99 probability that the sample proportion is within E = .03 of the
population proportion. How large a sample size is needed to meet the required precision?
At 99% confidence, \(z_{.005}\) = 2.576.
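A short Python sketch of this sample-size calculation follows. The formula needs a planning value for the proportion, which the surviving text does not state; the sketch assumes the earlier sample proportion of .44 as that planning value and also shows the conservative choice of .5. The function name is illustrative.

```python
from math import ceil

def sample_size_for_proportion(z, p, E):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= E."""
    return ceil(z ** 2 * p * (1 - p) / E ** 2)

# z = 2.576 and E = .03 come from the example; p = .44 is an ASSUMED planning value
n_planning = sample_size_for_proportion(z=2.576, p=0.44, E=0.03)
# p = .5 maximizes p * (1 - p), giving the largest (most conservative) sample size
n_conservative = sample_size_for_proportion(z=2.576, p=0.5, E=0.03)
print(n_planning, n_conservative)   # 1817 1844
```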
Letting μ1 denote the mean of population 1 and μ 2 denote the mean of population 2, we will focus on inferences
about the difference between the means: μ 1 - μ2. To make an inference about this difference, we select a simple
random sample of n1 units from population 1 and a second simple random sample of n 2 units from population 2.
The two samples, taken separately and independently, are referred to as independent simple random samples. In
this section, we assume that information is available such that the two population standard deviations, σ 1 and σ2,
can be assumed known prior to collecting the samples. We refer to this situation as the σ 1 and σ2 known case. In
the following example we show how to compute a margin of error and develop an interval estimate of the
difference between the two population means when σ1 and σ2 are known.
The point estimator of the difference between the two population means is the difference between the two
sample means, \(\bar{x}_1 - \bar{x}_2\).
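Standard error of \(\bar{x}_1 - \bar{x}_2\):
\[ \sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]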
We find that the point estimate of the difference between the mean ages of the two populations is \(\bar{x}_1 - \bar{x}_2\) = 40 − 35 = 5
years. Using 95% confidence and \(z_{\alpha/2} = z_{.025}\) = 1.96, we have
5 ± 4.06
Thus, the margin of error is 4.06 years and the 95% confidence interval estimate of the difference between the
two population means is 5 − 4.06 = .94 years to 5 + 4.06 = 9.06 years.
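The interval for the difference between two means, with σ1 and σ2 known, can be sketched in Python. The sample means 40 and 35 and z = 1.96 come from the example, but the sample sizes and population standard deviations are not given in the surviving text: the values n1 = 36, σ1 = 9, n2 = 49, σ2 = 10 below are assumptions, chosen only because they reproduce the 4.06 margin of error.

```python
from math import sqrt

def diff_means_interval(x1, x2, sigma1, sigma2, n1, n2, z):
    """Interval (x1 - x2) +/- z * sqrt(sigma1^2 / n1 + sigma2^2 / n2)."""
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    margin = z * std_error
    diff = x1 - x2
    return diff - margin, diff + margin, margin

# ASSUMED inputs: sigma1, sigma2, n1, n2 are hypothetical (see lead-in)
low, high, margin = diff_means_interval(
    x1=40, x2=35, sigma1=9, sigma2=10, n1=36, n2=49, z=1.96
)
print(round(margin, 2), round(low, 2), round(high, 2))   # 4.06 0.94 9.06
```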