
Chapter 9

This document discusses methods of statistical estimation. It begins by defining point estimation as using a single statistic to estimate an unknown population parameter. It then defines an unbiased estimator and provides examples of the sample mean and variance as unbiased estimators of the population mean and variance. It discusses finding the most efficient estimator by minimizing variance. The document then defines interval estimation and explains how to construct confidence intervals. It provides an example of how to calculate a confidence interval for the population mean when the population variance is known using the normal distribution. Finally, it explains how to calculate a confidence interval for the population mean when the variance is unknown using the t-distribution.

Uploaded by

sezarozoldek

Chapter 9

Estimation

Statistical inference consists of those methods by which one makes inferences or generalizations about a population.
Statistical inference may be divided into two areas:
estimation (give an example, such as smokers) and tests of hypotheses (give an example, such as two medicines).

Section 9.3
Estimation
We’ll do estimation either by a point or by an interval (discuss).
Of course, we expect errors in our estimation.

A point estimate of some population parameter θ is a single value θ̂ of a statistic Θ̂.

E.g. (1) The value x̄ of the statistic X̄, computed from a sample of size n, is a point estimate of the population parameter μ.

Definition of Unbiased Estimator

A statistic Θ̂ is said to be an unbiased estimator of the parameter θ if E(Θ̂) = θ.
Examples
(1) Show that X̄ is an unbiased estimator of μ.
Proof: Exercise.
(2) Show that S² is an unbiased estimator of σ².
Proof: we want to prove that

$$E(S^2) = E\left( \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} \right) = \sigma^2.$$
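The claim in example (2) can be checked numerically. The following is a minimal simulation sketch, not a proof: the population values (mean 10, standard deviation 2), the sample size and the number of repetitions are all hypothetical choices made for illustration. It averages S² over many samples and compares it with the version that divides by n instead of n − 1.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and simulation settings (assumptions for this sketch).
MU, SIGMA = 10.0, 2.0   # population mean and standard deviation, so sigma^2 = 4
N, REPS = 5, 20000      # sample size and number of simulated samples

sum_unbiased = 0.0  # running total of S^2 (divides by n - 1)
sum_biased = 0.0    # running total of the divide-by-n version

for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
    sum_unbiased += ss / (N - 1)
    sum_biased += ss / N

mean_s2 = sum_unbiased / REPS      # estimates E(S^2)
mean_biased = sum_biased / REPS    # estimates E of the divide-by-n version

print(f"average of S^2 (divide by n-1): {mean_s2:.3f}")
print(f"average when dividing by n:     {mean_biased:.3f}")
```

With these settings the first average should land near σ² = 4, while the second should land near (n − 1)σ²/n = 3.2, which is why the divisor n − 1 is used.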
Variance of Point Estimator

If Θ̂₁ and Θ̂₂ are two unbiased estimators of the same parameter θ, we choose the estimator whose sampling distribution has the smaller variance.
Hence, if $\sigma^2_{\hat{\Theta}_1} < \sigma^2_{\hat{\Theta}_2}$, we say that Θ̂₁ is a more efficient estimator of θ than Θ̂₂.

Definition
If we consider all possible unbiased estimators of some parameter θ, the one with the smallest variance is called the most efficient estimator of θ.


E.g.
For normal populations, we can show that both X̄ (the sample mean) and X̃ (the sample median) are unbiased estimators of the population mean μ, but the variance of X̄ is smaller than the variance of X̃.
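The comparison between the sample mean and the sample median can be illustrated by simulation. This is only a sketch under assumed settings: a standard normal population, samples of size 25 and 4000 repetitions are hypothetical choices, not values from the notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical settings (assumptions for this sketch).
N, REPS = 25, 4000  # sample size and number of simulated samples

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]  # population mean is 0
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Both estimators should average out near the true mean 0 (unbiasedness) ...
avg_of_means = statistics.mean(means)
avg_of_medians = statistics.mean(medians)

# ... but their sampling variances differ (efficiency).
var_mean = statistics.variance(means)
var_median = statistics.variance(medians)

print(f"average of sample means:   {avg_of_means:.4f}")
print(f"average of sample medians: {avg_of_medians:.4f}")
print(f"variance of sample means:   {var_mean:.4f}")
print(f"variance of sample medians: {var_median:.4f}")
```

Both averages should be close to 0, while the variance of the sample means should come out smaller than that of the sample medians, so X̄ is the more efficient of the two.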


Interval Estimation
An interval estimate of a population parameter θ is an interval of the form $\hat{\theta}_L < \theta < \hat{\theta}_U$, where $\hat{\theta}_L$ and $\hat{\theta}_U$ depend on the value of the statistic Θ̂ for a particular sample and also on the sampling distribution of Θ̂.

If we find $\hat{\theta}_L$ and $\hat{\theta}_U$ such that

$$P(\hat{\theta}_L < \theta < \hat{\theta}_U) = 1 - \alpha \quad \text{for } 0 < \alpha < 1,$$

then we have a probability of 1 − α of selecting a random sample that will produce an interval containing θ.

The interval $\hat{\theta}_L < \theta < \hat{\theta}_U$ is called a 100(1 − α)% confidence interval, the fraction 1 − α is called the confidence coefficient or the degree of confidence, and the endpoints $\hat{\theta}_L$ and $\hat{\theta}_U$ are called the lower and upper confidence limits.
E.g. if α = 0.05, we have a 95% confidence interval.
Remark: the wider the confidence interval is, the more confident we can be that the interval contains the unknown parameter. However, a narrower interval is more informative:
it is better to be 95% confident that the average life of a certain TV transistor is between 6 and 7 years than to be 99% confident that it is between 3 and 10 years.
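The meaning of "a probability of 1 − α of selecting a random sample that will produce an interval containing θ" can be illustrated by repeated sampling. The sketch below is an assumption-laden illustration: it takes θ to be the mean of a normal population with known standard deviation, uses the standard interval x̄ ± z·σ/√n for that case, and all numeric settings are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population and settings (assumptions for this sketch).
MU, SIGMA, N = 50.0, 4.0, 30
ALPHA = 0.05
REPS = 4000

z = NormalDist().inv_cdf(1 - ALPHA / 2)  # z-value with area alpha/2 to its right
half_width = z * SIGMA / N ** 0.5        # same for every sample, since sigma is known

hits = 0
for _ in range(REPS):
    xbar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    # Count the samples whose interval contains the true mean.
    if xbar - half_width < MU < xbar + half_width:
        hits += 1

coverage = hits / REPS
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should be close to 1 − α = 0.95. Note that the parameter stays fixed; it is the interval that changes from sample to sample.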
Section 9.4
Single Sample: Estimating the Mean

Recall that the sample mean X̄ is a point estimate of the mean μ of the population from which our sample is selected.
If our random sample is selected from a normal population or if the sample size n is sufficiently large, then we know (from Chapter 8) that X̄ is approximately normally distributed with mean μ and variance σ²/n, and thus we have

$$P\left( -z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2} \right) = 1 - \alpha,$$

which becomes, after simplification:

$$P\left( \bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha.$$

So we have the following theorem:


Theorem (Confidence Interval on μ, σ² known)
If x̄ is the mean of a random sample of large size n from a population with unknown mean μ and known variance σ², then a 100(1 − α)% confidence interval for μ is given by

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}},$$

where $z_{\alpha/2}$ is the z-value leaving an area of α/2 to the right.
Remark:
A sample of size n ≥ 30 is considered large, and thus the interval above is a good estimate.
Example
The average zinc concentration recorded from a sample of measurements
taken in 36 different locations in a river is found to be 2.6 grams per milliliter.
Find 95% and 99% confidence intervals for the mean zinc concentration in
the river. Assume that the population standard deviation is 0.3 grams per
milliliter.
Solution
The information we have: x̄ = 2.6 g/mL, σ = 0.3 g/mL, n = 36.
Since we want a 95% confidence interval, 1 − α = 0.95 and thus α = 0.05.
Now $z_{\alpha/2} = z_{0.025} = 1.96$ from the table.
So the confidence interval

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

becomes:

$$2.6 - 1.96 \cdot \frac{0.3}{\sqrt{36}} < \mu < 2.6 + 1.96 \cdot \frac{0.3}{\sqrt{36}}.$$

After simplification we get: 2.50 < μ < 2.70.

The 99% case is done similarly, with $z_{0.005} = 2.575$, and we get 2.47 < μ < 2.73.
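The zinc example can be reproduced in a few lines. This is a minimal sketch: the function name `z_interval` is a hypothetical helper introduced here, and the z-value is obtained from Python's `statistics.NormalDist` rather than from the printed table, so it carries more decimal places than 1.96 or 2.575.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf):
    """Two-sided confidence interval for the mean when sigma is known.

    xbar  : observed sample mean
    sigma : known population standard deviation
    n     : sample size
    conf  : confidence coefficient 1 - alpha, e.g. 0.95
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-value with area alpha/2 to its right
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Data from the zinc example: x-bar = 2.6, sigma = 0.3, n = 36.
ci95 = z_interval(2.6, 0.3, 36, 0.95)
ci99 = z_interval(2.6, 0.3, 36, 0.99)

print(f"95% interval: {ci95[0]:.2f} < mu < {ci95[1]:.2f}")
print(f"99% interval: {ci99[0]:.2f} < mu < {ci99[1]:.2f}")
```

Raising the confidence from 95% to 99% widens the interval, because a larger z-value multiplies the same standard error σ/√n.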
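One-Sided Bounds
The probability statement

$$P\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha} \right) = 1 - \alpha$$

gives the lower bound:

$$\bar{x} - z_{\alpha} \frac{\sigma}{\sqrt{n}} < \mu,$$

and the probability statement

$$P\left( -z_{\alpha} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \right) = 1 - \alpha$$

gives the upper bound:

$$\mu < \bar{x} + z_{\alpha} \frac{\sigma}{\sqrt{n}}.$$

Note that one-sided bounds use $z_{\alpha}$ rather than $z_{\alpha/2}$, since the whole area α lies in one tail.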
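As an illustration of one-sided bounds, the sketch below reuses the zinc figures (x̄ = 2.6, σ = 0.3, n = 36) purely as convenient numbers; the notes do not work a one-sided example with these data, and `one_sided_bounds` is a hypothetical helper name. It assumes the one-tailed z-value, that is, the z-value with the whole area α to its right.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_bounds(xbar, sigma, n, conf):
    """Return (lower_bound, upper_bound) for mu when sigma is known.

    Each bound is meant to be used on its own as a one-sided
    100*conf % bound; together they do NOT form a two-sided interval
    at that confidence level.
    """
    z = NormalDist().inv_cdf(conf)  # one-tailed z-value: area (1 - conf) to its right
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Zinc figures reused as illustrative input (an assumption of this sketch).
lower95, upper95 = one_sided_bounds(2.6, 0.3, 36, 0.95)

print(f"95% lower bound: mu > {lower95:.3f}")
print(f"95% upper bound: mu < {upper95:.3f}")
```

Because the one-tailed z-value at 95% (about 1.645) is smaller than the two-tailed one (1.96), each one-sided bound sits closer to x̄ than the corresponding endpoint of the two-sided interval.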

The case where σ² is unknown

First we need to talk about the t-distribution.
It is one of the important continuous probability distributions. It is particularly important in inferential statistics.
The graph of the t-distribution looks like the graph of the standard normal distribution, except that it has a lower peak and heavier tails.
The t-distribution has one parameter, namely the degrees of freedom, denoted by ν, with value ν = n − 1.
As n gets larger, the t-distribution becomes closer and closer to the standard normal distribution.
As it is not easy to compute probabilities concerning the t-distribution directly from its probability density function, this distribution has a table (see Table A4 on page 773).
This table gives the value of t if we are given the area to the right of that point.
Remark: Using symmetry we get: $-t_{\alpha} = t_{1-\alpha}$.
Examples on using the t-table:
- The area to the left of t = 1.812 with 10 degrees of freedom is 0.95.
- The area to the right of t = 2.080 with 21 degrees of freedom is 0.025.
- Find the value of t = t(α, ν), where α is the area to the right of t and ν is the degrees of freedom:
(i) t(0.025, 20) (ii) t(0.95, 14)
Answers: (i) 2.086 (ii) t(0.95, 14) = −t(0.05, 14) = −1.761
- Find the value of t with 14 degrees of freedom that has an area of 0.025 to its left.
Solution
First, look at the row of the table for 14 degrees of freedom.
Second, the table gives the value of t for a given area to its right, which is 0.975 here. Since there is no column for 0.975 in the top line, we make use of symmetry and find the t-value with an area of 0.975 to its left (or 0.025 to its right), which is 2.145.
So the value of t with an area of 0.025 to its left is −2.145, by symmetry.
Example
Find $P(-t_{0.025} < T < t_{0.05})$.
Solution
$P(-t_{0.025} < T < t_{0.05}) = P(T < t_{0.05}) - P(T < -t_{0.025}) = 0.95 - 0.025 = 0.925$
Example
Find k such that P(k < T < −1.761) = 0.045 for a random sample of size 15.
Solution
Here ν = 15 − 1 = 14. By symmetry, P(k < T < −1.761) = P(1.761 < T < −k) = 0.045.
Now the area to the right of 1.761 is 0.05, so the area to the right of −k is 0.05 − 0.045 = 0.005.
Therefore, from the table, −k = 2.977, and thus k = −2.977.
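Table look-ups like these can be sanity-checked by integrating the t density numerically. The sketch below is one possible approach, not the method used to build the table: it uses the standard formula for the t probability density function and Simpson's rule, and the helper names `t_pdf` and `t_right_tail` are hypothetical. Since table entries are rounded to three decimals, the computed areas only agree approximately.

```python
import math


def t_pdf(x, nu):
    """Probability density of the t-distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)


def t_right_tail(t, nu, steps=2000):
    """Area to the right of t, using Simpson's rule on [0, |t|] plus symmetry.

    steps must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central = total * h / 3      # P(0 < T < |t|)
    tail = 0.5 - central         # P(T > |t|)
    return tail if t >= 0 else 1 - tail


# Areas to the right of some tabulated t-values.
area_a = t_right_tail(2.145, 14)   # table: 0.025
area_b = t_right_tail(2.086, 20)   # table: 0.025
area_c = t_right_tail(1.761, 14)   # table: 0.05

# P(k < T < -1.761) with k = -2.977 and 14 degrees of freedom.
prob_k = t_right_tail(-1.761, 14) - t_right_tail(-2.977, 14)
prob_k = -prob_k  # area between the two points (right tails shrink as t grows)

print(f"{area_a:.4f} {area_b:.4f} {area_c:.4f} {prob_k:.4f}")
```

The first three areas should come out close to the tabulated 0.025, 0.025 and 0.05, and the last probability close to 0.045.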
Now we can introduce the following theorem:
Theorem
Let X₁, X₂, ⋯, Xₙ be a random sample from a normal distribution with mean μ and an unknown variance σ². The random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t-distribution with ν = n − 1 degrees of freedom.
Example
The contents of 7 similar containers of sulfuric acid are
9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6 liters.
Find a 95% confidence interval for the mean of all such containers, assuming an
approximately normal distribution.
Solution

Here n = 7.

Computing the mean and the standard deviation of the given random sample, we get x̄ = 10.0 and s = 0.283.
Because the variance of the population is unknown and the population distribution is approximately normal, we use the fact that the random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t-distribution with ν = n − 1 = 6 degrees of freedom.

We know that a 100(1 − α)% confidence interval for the mean is given by

$$\bar{x} - t_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \frac{s}{\sqrt{n}}.$$

Substituting the values of x̄, s, n and α = 0.05, with $t_{0.025} = 2.447$ for 6 degrees of freedom, we get

$$10 - t_{0.025} \frac{0.283}{\sqrt{7}} < \mu < 10 + t_{0.025} \frac{0.283}{\sqrt{7}}$$

$$10 - 2.447 \cdot \frac{0.283}{\sqrt{7}} < \mu < 10 + 2.447 \cdot \frac{0.283}{\sqrt{7}}$$

So a 95% confidence interval for the mean μ is: 9.74 < μ < 10.26.
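The arithmetic of the sulfuric acid example can be checked with the standard library. This is a minimal sketch: the critical value 2.447 for 6 degrees of freedom is typed in from the t-table rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Contents of the 7 containers, in liters (data from the example).
contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]

# Critical value t_{0.025} with 6 degrees of freedom, copied from the t-table.
T_CRIT = 2.447

n = len(contents)
xbar = mean(contents)   # sample mean
s = stdev(contents)     # sample standard deviation (divides by n - 1)

half_width = T_CRIT * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"n = {n}, x-bar = {xbar:.2f}, s = {s:.3f}")
print(f"95% interval: {lower:.2f} < mu < {upper:.2f}")
```

Note that `statistics.stdev` uses the divisor n − 1, matching the definition of S used throughout the chapter, whereas `statistics.pstdev` would divide by n.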
