
Chapter 5- Estimation

Chapter Five discusses statistical estimation and inference, emphasizing the importance of using samples to infer characteristics of a population due to the impracticality of conducting a census. It outlines the concepts of point and interval estimators, detailing criteria for point estimators such as unbiasedness, efficiency, and consistency, as well as the construction of confidence intervals for population means. The chapter also covers the use of t-distribution for interval estimation when the population standard deviation is unknown.

Uploaded by

Abdi Jote

CHAPTER FIVE

_____________________________________________________________________
STATISTICAL ESTIMATION AND STATISTICAL INFERENCE
____________________________________________________________________
5.0 INTRODUCTION
Do you remember why we need to take samples? A census is costly and sometimes
impossible. Therefore, we take a part of the entire population (a sample) and infer the
characteristics of the population from the sample we have drawn.
Statistical inference is the process of using limited information, a sample, for the purpose of
reaching conclusions about a larger set of data, the population.
Estimation refers to any procedure where sample information is used to estimate or predict the
numerical value of some population measure (called a parameter) such as the population mean μ.

Definition of some concepts:


An estimator is a procedure or function used in estimating a population parameter.
An estimate is the numerical value determined from the estimator.
A parameter is a characteristic of an entire population.
A statistic is a summary measure that is computed to describe a characteristic for
only a sample of the population.

There are two types of estimators.

1. A point estimator of a population parameter is a procedure that produces a single value
as an estimate. The sample mean is a statistic that may be used as a point estimator of the
population mean.
2. An interval estimator of a population parameter is a procedure that produces a range of
values. The range of values is useful as a measure of the degree of error that may exist in
estimation.
5.1. POINT ESTIMATION
5.1.1. CRITERIA FOR POINT ESTIMATOR
In point estimation we seek the sample statistic that is the best estimator of the population
parameter. Many criteria have been developed to describe what "best" means for a point estimator.
The most general of these are the criteria of unbiasedness, efficiency, and consistency.

Unbiasedness
A statistic is an unbiased estimator of a parameter if the average value of the statistic is the same
as the parameter value. Thus, on average, the estimator will be correct. Formally, a statistic is an
unbiased estimator of a parameter if the expected value of the statistic equals the parameter, i.e. if
E(Statistic) = Parameter
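The condition E(Statistic) = Parameter can be checked by brute force on a very small population. The sketch below is only an illustration: the population values 2, 4, 6, 8 and the sample size of 2 are assumptions made for the example, not data from this chapter. It lists every equally likely sample, computes the sample mean of each, and averages those means.

```python
from itertools import product
from statistics import fmean

# Hypothetical population (an assumption for illustration only).
population = [2, 4, 6, 8]
mu = fmean(population)  # the parameter: population mean

# Every possible ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# The statistic (sample mean) evaluated on each possible sample.
sample_means = [fmean(s) for s in samples]

# All samples are equally likely, so the expected value of the
# statistic is the plain average of the sample means.
expected_value = fmean(sample_means)

print("population mean:", mu)
print("expected value of the sample mean:", expected_value)
```

Individual sample means range from 2 to 8, yet their average coincides with the population mean, which is what unbiasedness requires.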
Efficiency
Even though the average value of an unbiased estimator equals the parameter, an estimator may
yield estimates that are not particularly close to the parameter value. The efficiency of an
estimator is measured by the variance of the estimator. The efficient estimator is the unbiased
estimator with the smallest variance.
Consistency
Another desirable property is that an estimator should produce estimates that have a high
probability of being close to the true value as the sample size increases. An estimator that has
this property is called a consistent estimator. The variance of a consistent estimator becomes
smaller as larger sample sizes are taken. Thus, consistency indicates that the amount of estimation
error becomes smaller as the sample size increases.
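Efficiency and consistency can both be illustrated by listing all possible samples from a small population. The sketch below assumes a hypothetical symmetric population (2, 4, 6, 8), for which the sample mean and the sample median are both unbiased, and compares the variance of the two estimators for several sample sizes. The function name sampling_variance is introduced here for illustration only.

```python
from itertools import product
from statistics import fmean, median, pvariance

# Hypothetical symmetric population (an assumption for illustration only).
population = [2, 4, 6, 8]

def sampling_variance(statistic, n):
    """Variance of `statistic` over all equally likely samples of size n
    drawn with replacement from `population`."""
    values = [statistic(sample) for sample in product(population, repeat=n)]
    return pvariance(values)

for n in (1, 3, 5):
    var_mean = sampling_variance(fmean, n)
    var_median = sampling_variance(median, n)
    print(f"n={n}: variance of mean={var_mean:.3f}, variance of median={var_median:.3f}")
```

For n = 3 and n = 5 the sample mean has the smaller variance, so it is the more efficient of the two estimators here. Its variance also shrinks as n grows, which is the behaviour expected of a consistent estimator.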

5.1.2 POINT ESTIMATOR OF THE MEAN


Assume that we have the following random sample of n = 6 elements from a population whose
parameter is not known.
1, 2, 4, 5, 7, 11

The sample mean is

$$\bar{X} = \frac{\sum X}{n} = \frac{30}{6} = 5$$

The estimator is $\bar{X}$, and 5 is the point estimate of the unknown population mean.
5.1.3 POINT ESTIMATE OF THE UNKNOWN POPULATION STANDARD
DEVIATION
We will use the symbol Sx to mean an estimate of the unknown population standard deviation σx.
The estimator, called sample standard deviation, is defined by the formula

$$S_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Where $\bar{X}$ = sample mean
n = sample size
Recall that earlier we used the divisor N, instead of n − 1, when computing a population standard
deviation σx.
Example: For the random sample 1, 2, 4, 5, 7, 11 the sample standard deviation will be
computed as follows:

$$S_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

$$S_x = \sqrt{\frac{(1-5)^2 + (2-5)^2 + (4-5)^2 + (5-5)^2 + (7-5)^2 + (11-5)^2}{6 - 1}} = \sqrt{\frac{66}{5}} = 3.633$$
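The point estimates of the mean and the standard deviation for the sample 1, 2, 4, 5, 7, 11 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

sample = [1, 2, 4, 5, 7, 11]
n = len(sample)

# Point estimate of the population mean.
x_bar = mean(sample)

# Sum of squared deviations from the sample mean (16 + 9 + 1 + 0 + 4 + 36).
squared_deviations = sum((x - x_bar) ** 2 for x in sample)

# Sample standard deviation with the divisor n - 1.
s_x = sqrt(squared_deviations / (n - 1))

print("sample mean:", x_bar)
print("sample standard deviation:", round(s_x, 3))

# statistics.stdev uses the divisor n - 1, statistics.pstdev uses N.
print("stdev (n - 1):", round(stdev(sample), 3))
print("pstdev (N):", round(pstdev(sample), 3))
```

The last two lines show the effect of the divisor: dividing by N gives a smaller value than dividing by n − 1 for the same data.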
5.1.4 POINT ESTIMATOR OF STANDARD ERROR OF THE MEAN
The standard error of the mean is computed by the formula $\sigma_{\bar{X}} = \frac{\sigma_x}{\sqrt{n}}$ when the sample size is less
than 5% of the population size. In our case, the total size of the population is unknown; therefore
it is safer to assume that the sample is less than 5% of the entire population. Hence, we will use
the estimator $\frac{S_x}{\sqrt{n}}$ to estimate the standard error $\sigma_{\bar{X}}$. The symbol $S_{\bar{X}}$ is called the sample
standard error of the mean. The formula for $S_{\bar{X}}$ is

$$S_{\bar{X}} = \frac{S_x}{\sqrt{n}}$$

Where $S_x$ = sample standard deviation
n = sample size

Thus, $S_x$ is the estimator for $\sigma_x$, and $S_{\bar{X}}$ is the estimator for $\sigma_{\bar{X}}$.

Note that we have calculated $S_x$ = 3.633 for the random sample 1, 2, 4, 5, 7, 11. The sample
standard error can be obtained using the formula

$$S_{\bar{X}} = \frac{S_x}{\sqrt{n}} = \frac{3.633}{\sqrt{6}} = 1.483$$
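The same calculation can be sketched in Python. The helper name standard_error is an assumption for this illustration; it simply divides the sample standard deviation (divisor n − 1) by the square root of the sample size.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Sample standard error of the mean: S_x / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

sample = [1, 2, 4, 5, 7, 11]
se = standard_error(sample)
print("sample standard error of the mean:", round(se, 3))
```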

Note: The following table shows some population parameters and their estimators.

Population parameter                               Sample statistic (estimator)
Mean $\mu$                                         $\bar{X}$
Standard deviation $\sigma_x$                      $S_x$
Variance $\sigma_x^2$                              $S_x^2$
Standard error of the mean $\sigma_{\bar{X}}$      $S_{\bar{X}}$

5.2 INTERVAL ESTIMATE


Point estimators of population parameters, while useful, do not convey as much information as
the interval estimators. Point estimation produces a single value as an estimate of unknown
population parameter. An interval estimate, on the other hand, is a range of values that conveys
the fact that estimation is an uncertain process.
The standard error of the point estimator is used in creating a range of values; thus, a measure of
variability is incorporated into interval estimation. Further, a measure of confidence in the
interval estimator is provided; consequently, interval estimates are also called confidence
intervals. For these reasons, interval estimators are considered more desirable than point
estimators.
5.2.1 INTERVAL ESTIMATE OF POPULATION MEAN
An interval estimate of $\mu$ is an interval between two values a and b within which an unknown
population mean is expected to lie. The interval is an inference based upon:

1. The value of the mean $\bar{X}$ of the simple random sample selected from the population, and
2. Known facts about the sampling distribution of the mean.
The confidence level shows how certain we are that the interval is correct. The choice of
method used in constructing a confidence interval for $\mu$ depends upon whether or not the
population is normal and whether the population standard deviation $\sigma_x$ is known or unknown.

5.2.1.1 CONFIDENCE INTERVAL ESTIMATE OF µ, NORMAL POPULATION, AND STANDARD
DEVIATION KNOWN

Suppose we have a normal population whose mean and standard deviation are $\mu$ and $\sigma_x$. The
sampling distribution of the mean is normal with mean $\mu$ and standard error

$$\sigma_{\bar{X}} = \frac{\sigma_x}{\sqrt{n}}$$

For the sampling distribution of the mean, the standard normal variable is

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

If we want to be 95% confident that the population mean $\mu$ falls within the estimate, we can
calculate the range as follows.
1. Find the Z value for the 95% confidence level.
2. Use the obtained Z value to calculate the interval for the unknown population parameter.
For example, the Z value for a 95% confidence interval is 1.96. Therefore, if we want to be 95% sure
that the true population mean falls within the estimate, we can rearrange the above formula and
get:

$$\bar{X} - 1.96\,\sigma_{\bar{X}} \le \mu \le \bar{X} + 1.96\,\sigma_{\bar{X}}$$

The proportion of correct estimates (0.95 in our illustration) is called the confidence coefficient
C. The number 100C (95% in our illustration) is called the confidence level. The proportion of
incorrect estimates is symbolized by the Greek letter α (alpha). The sum of the proportions of
correct and incorrect estimates is 1; so
C + α = 1 or α = 1 − C
We can describe C as the chance that the confidence interval is correct, and α as the chance that
the interval is incorrect.
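The idea of a "proportion of correct estimates" can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions: the population values (mean 50, standard deviation 10), the sample size, the number of trials, and the function name coverage are all hypothetical choices. It draws many samples from a normal population, builds an interval around each sample mean, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for C = 0.95
    half_width = z * sigma / sqrt(n)          # Z times the standard error
    hits = 0
    for _ in range(trials):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population and settings, chosen only for illustration.
proportion_correct = coverage(mu=50, sigma=10, n=25, confidence=0.95, trials=4000)
print("proportion of intervals containing mu:", proportion_correct)
```

The printed proportion should be close to C = 0.95, and the remaining share of intervals, close to α = 0.05, misses the true mean.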
Example: A normal population has standard deviation of 10; a random sample of size 25 has a
mean of 50. Construct a 95% confidence interval estimate of the population mean.

Solution: To construct the confidence interval, we first find the Z value for the 95% confidence
level and then use the formula $\bar{X} - Z\sigma_{\bar{X}} \le \mu \le \bar{X} + Z\sigma_{\bar{X}}$ to estimate the interval. The Z value
for the 95% confidence level is 1.96. Therefore, the estimate is given by
$\bar{X} - 1.96\,\sigma_{\bar{X}} \le \mu \le \bar{X} + 1.96\,\sigma_{\bar{X}}$. That is:

$$50 - 1.96\left(\frac{10}{\sqrt{25}}\right) \le \mu \le 50 + 1.96\left(\frac{10}{\sqrt{25}}\right)$$

$$50 - 3.9 \le \mu \le 50 + 3.9$$

$$46.1 \le \mu \le 53.9$$
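The worked example can be checked with a short Python sketch. The function name z_interval is an assumption for this illustration; the Z value is obtained from the standard library's NormalDist rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean of a normal population
    when the population standard deviation sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half_width = z * sigma / sqrt(n)         # Z times the standard error
    return x_bar - half_width, x_bar + half_width

# Values from the example: sigma = 10, n = 25, sample mean = 50.
lower, upper = z_interval(x_bar=50, sigma=10, n=25)
print(round(lower, 1), round(upper, 1))
```

Rounded to one decimal place, the limits agree with the interval 46.1 to 53.9 obtained by hand.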

5.2.1.2 PRECISION, CONFIDENCE AND SAMPLE SIZE
The narrower the confidence interval is, the more precise it is. And the wider the interval, the
less precise is the interval. The end points of a confidence interval for µ are:

$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma_x}{\sqrt{n}}$$

The smaller the value of $Z_{\alpha/2}\,\frac{\sigma_x}{\sqrt{n}}$, the more precise (narrower) is the confidence interval.
Consequently, the smaller $Z_{\alpha/2}$ and $\sigma_x$ are, and the larger n is, the more precise will be the
interval. We conclude that the larger the sample size, the more precise is an interval estimate. It
can also be concluded that the smaller the variability, the more precise the estimate. The final
conclusion that can be drawn from the above relationship is that the lower the confidence level, the
more precise is the interval estimate.
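These three conclusions can be seen numerically by computing the half-width of the interval for different inputs. In the sketch below the function name half_width and the particular values of the standard deviation, sample size, and confidence level are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def half_width(sigma, n, confidence):
    """Half-width Z_{alpha/2} * sigma / sqrt(n) of the confidence interval."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)

# Larger sample size -> narrower interval.
for n in (25, 100, 400):
    print(f"sigma=10, n={n}, C=0.95: half-width = {half_width(10, n, 0.95):.2f}")

# Smaller variability -> narrower interval.
for sigma in (10, 5):
    print(f"sigma={sigma}, n=25, C=0.95: half-width = {half_width(sigma, 25, 0.95):.2f}")

# Lower confidence level -> narrower interval.
for c in (0.99, 0.95, 0.90):
    print(f"sigma=10, n=25, C={c}: half-width = {half_width(10, 25, c):.2f}")
```

Because n enters through its square root, quadrupling the sample size only halves the width of the interval.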
5.2.1.3 CONFIDENCE ESTIMATE OF µ, NORMAL POPULATION, STANDARD
DEVIATION UNKNOWN
Under the previous case we have seen the situation where the population is normally distributed and
the population standard deviation is known. In that case we search for the Z value of α/2 and use the
formula $\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma_x}{\sqrt{n}}$ to estimate the interval within which the population mean lies with
confidence coefficient C. However, most of the time the population mean µ is unknown, and so is the
population standard deviation σx. Therefore, σx must be estimated by the sample standard
deviation.

$$S_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

After calculating the sample standard deviation, the standard error must be computed using the formula

$$S_{\bar{X}} = \frac{S_x}{\sqrt{n}}$$
When the population standard deviation is known, the interval estimate is built from the standard
normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
However, if the population standard deviation is unknown, we need to estimate it with the sample
standard deviation, and the resulting statistic does not follow the normal distribution. It follows
instead a Student's t-distribution. There is a different t-distribution for each sample size. The
t-distribution is discussed in greater detail under hypothesis testing. In this chapter we will only
illustrate how to make an interval estimate using the t-distribution, without giving much emphasis
to the distribution's characteristics.
Tail areas for the t-distribution are presented according to a parameter called degrees of freedom.
We shall use the symbol ν for degrees of freedom. The degrees of freedom for the t-distribution are
calculated as ν = n − 1.
As ν increases, the tail area beyond a given t-value decreases, and so does the t-value for a given
tail area. As the degrees of freedom increase, the t-distribution approaches the standard normal
distribution. When the degrees of freedom reach about 30, the t-distribution is approximately the
same as the normal distribution.
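The approach of the t-distribution to the normal distribution can be seen by comparing critical values. The sketch below is an illustration: the t-values are typed in from a standard t-table for a two-sided 95% interval (tail area 0.025), since the Python standard library has no t-distribution, and the choice of which degrees of freedom to show is arbitrary.

```python
from statistics import NormalDist

# Critical values t_{0.025, v} copied from a standard t-table
# (keys are degrees of freedom v). Hard-coded, not computed.
T_CRITICAL_0025 = {1: 12.706, 5: 2.571, 9: 2.262, 20: 2.086, 30: 2.042, 120: 1.980}

# Corresponding standard normal value Z_{0.025}.
z = NormalDist().inv_cdf(0.975)

for df, t_value in T_CRITICAL_0025.items():
    gap = t_value - z
    print(f"v={df:>3}: t={t_value:.3f}, t - Z = {gap:.3f}")
```

The gap between t and Z shrinks steadily as the degrees of freedom increase, and at ν = 30 it is already less than 0.1.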

To construct an interval estimate for µ in this situation, we need the value of $t_{\alpha/2,\nu}$, which
is read from a statistical table, together with the formula:

$$\bar{X} - t_{\alpha/2,\nu}\,\frac{S_x}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\nu}\,\frac{S_x}{\sqrt{n}}$$

Where $\bar{X}$ = sample mean, n = sample size, ν = n − 1 (degrees of freedom), $S_x$ = sample standard
deviation, μ = unknown population mean
Example: The environmental protection officer of a large industrial plant sought to determine
the mean daily amount of sulphur oxide (a pollutant) emitted by the plant. Because measurement
costs were high, only a random sample of 10 days' measurements was obtained; these were, in
tons per day: 8, 7, 10, 15, 11, 6, 8, 5, 13, 12
Suppose emissions per day are normally distributed. Estimate μ, the mean amount of sulphur
oxides emitted per day using the confidence interval with a confidence coefficient of 0.95.
Solution

$$\bar{X} = \frac{\sum X}{n} = \frac{95}{10} = 9.5$$

$$S_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{94.5}{9}} = 3.24$$
The confidence level is 95%. Therefore, the significance level is α = 1 − C = 1 − 0.95 = 0.05 and
α/2 = 0.025.
Next, we calculate the degrees of freedom for the observations, which are given as

ν = n − 1 = 10 − 1 = 9

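We can now calculate the interval as $\bar{X} - t_{\alpha/2,\nu}\,\frac{S_x}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\nu}\,\frac{S_x}{\sqrt{n}}$. In this specific
situation, $t_{\alpha/2,\nu}$ means $t_{0.025,\,9}$ = 2.26.
Therefore, the interval can be calculated as:

$$9.5 - 2.26\left(\frac{3.24}{\sqrt{10}}\right) \le \mu \le 9.5 + 2.26\left(\frac{3.24}{\sqrt{10}}\right)$$

$$7.2 \le \mu \le 11.8$$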
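The final step of the solution can be sketched in Python from the summary figures used above (n = 10, sample mean 9.5, sample standard deviation 3.24, and the table value t = 2.26). The function name t_interval is an assumption for this illustration, and the t-value is passed in as read from the table rather than computed.

```python
from math import sqrt

def t_interval(x_bar, s_x, n, t_critical):
    """Confidence interval for the mean when the population standard
    deviation is unknown: x_bar +/- t * s_x / sqrt(n)."""
    half_width = t_critical * s_x / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Summary values from the solution; t_{0.025, 9} = 2.26 is read from a t-table.
lower, upper = t_interval(x_bar=9.5, s_x=3.24, n=10, t_critical=2.26)
print(round(lower, 1), round(upper, 1))
```

Rounded to one decimal place, the limits match the interval of 7.2 to 11.8 tons per day found by hand.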
