III Sem BSC Mathematics Complementary Course Statistical Inference On10dec2015
STATISTICS
STATISTICAL INFERENCE
Complementary course
for B.Sc.
MATHEMATICS
III Semester
(2014 Admission)
CU-CBCSS
UNIVERSITY OF CALICUT
SCHOOL OF DISTANCE EDUCATION
SYLLABUS

... of attributes. 30 hours
MODULE I
SAMPLING DISTRIBUTIONS
SE of t = √V(t), where V(t) = E(t²) – {E(t)}².

The sampling distribution of a statistic forms the basis of testing of hypothesis.

Since each Xi is N(µ, σ), its moment generating function is

    m_{Xi}(t) = e^{µt + ½σ²t²}, i = 1, 2, 3, ..., n,

and hence the mgf of the sample mean x̄ is

    m_{x̄}(t) = e^{µt + ½(σ²/n)t²},

which is the mgf of a normal distribution with mean µ and variance σ²/n.

The above results show that E(x̄) = µ and V(x̄) = σ²/n.

∴ SE of x̄ = σ/√n.
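The result SE(x̄) = σ/√n can be checked by simulation. The sketch below is illustrative only; the values µ = 50, σ = 10, n = 25 and the number of trials are arbitrary assumptions, not part of the text.

```python
import math
import random

random.seed(1)

mu, sigma, n = 50.0, 10.0, 25
trials = 20000

# Draw many samples of size n and record the sample mean of each.
means = []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

grand_mean = sum(means) / trials
# Empirical standard deviation of the sample means (the standard error).
se_empirical = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / trials)
se_theoretical = sigma / math.sqrt(n)  # sigma / sqrt(n) = 2.0

print(grand_mean, se_empirical, se_theoretical)
```

The empirical standard error of the simulated sample means comes out close to σ/√n = 2, in agreement with the result above.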
Chi square Distribution
Karl Pearson in about 1900 described the well known probability distribution "Chi square distribution", or distribution of χ² (square of the Greek letter chi). χ² is a random variable used as a test statistic. Let a random sample X1, X2, ..., Xn be taken from a normal population with mean µ and variance σ², i.e., Xi → N(µ, σ). We define the χ² statistic as the sum of the squares of standard normal variates,

    χ² = Σ_{i=1}^{n} ((Xi – µ)/σ)².

The shape of the χ² curve depends on the value of n. For small n, the curve is positively skewed. As n, the degrees of freedom, increases, the curve approaches symmetry rapidly. For large n the χ² is approximately normally distributed. The distribution is unimodal and continuous.

Degrees of freedom

The phrase "degrees of freedom" can be explained in the following intuitive way. Let us take a sample of size n = 3 whose average is 5. Therefore the sum of the observations must be equal to 15. That means X1 + X2 + X3 = 15. We have complete freedom in assigning values to any two of the three observations. After assigning values to any two, we have no freedom in assigning the third. By degrees of freedom we mean the number of independent observations in a distribution or a set.

Definition

A continuous r.v. χ², assuming values from 0 to ∞, is said to follow a chi square distribution with n degrees of freedom if its pdf is

    f(χ²) = [1/(2^{n/2} Γ(n/2))] e^{–χ²/2} (χ²)^{(n/2) – 1}, 0 < χ² < ∞.

Note also that (x̄ – µ)/(σ/√n) → N(0, 1), so that ((x̄ – µ)/(σ/√n))² → χ²(1).

Moments of χ²

Mean: E(χ²) = n.

Variance: V(χ²) = E(χ⁴) – [E(χ²)]² = 2n.

Moment generating function: M_{χ²}(t) = E(e^{tχ²}) = (1 – 2t)^{–n/2}, t < 1/2.

Sampling Distribution of Sample Variance s²

Let x1, x2, ..., xn be a random sample drawn from a normal population N(µ, σ). Let x̄ be the sample mean and s² be its variance. We can consider the random observations as independent and normally distributed r.v.s with mean µ and variance σ².

    i.e., Xi → N(µ, σ), i = 1, 2, ..., n
    ∴ (Xi – µ)/σ → N(0, 1).

Now consider the expression

    Σ_{i=1}^{n} (xi – µ)² = Σ_{i=1}^{n} (xi – x̄ + x̄ – µ)²
        = Σ(xi – x̄)² + 2(x̄ – µ) Σ(xi – x̄) + Σ(x̄ – µ)²
        = ns² + 2(x̄ – µ)·0 + n(x̄ – µ)².

Dividing each term by σ²,

    Σ_{i=1}^{n} ((Xi – µ)/σ)² = ns²/σ² + ((x̄ – µ)/(σ/√n))².    (1)

Here we have seen that the LHS follows χ²(n) and the second term on the RHS follows χ²(1), i.e., χ²(n) = ns²/σ² + χ²(1). So by the additive property of the χ² distribution we have

    ns²/σ² → χ²(n – 1).
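The conclusion ns²/σ² → χ²(n – 1) can be illustrated by simulation: the statistic should have mean n – 1 and variance 2(n – 1). The following sketch assumes arbitrary illustrative values (standard normal population, n = 8).

```python
import random

random.seed(2)

mu, sigma, n = 0.0, 1.0, 8
trials = 40000

stats = []
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n  # s^2 with divisor n, as in the text
    stats.append(n * s2 / sigma ** 2)          # should behave like chi-square(n - 1)

mean_stat = sum(stats) / trials
var_stat = sum((v - mean_stat) ** 2 for v in stats) / trials
print(mean_stat, var_stat)  # expect about n - 1 = 7 and 2(n - 1) = 14
```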
To obtain the sampling distribution of s², put u = ns²/σ², so that du/ds² = n/σ². Then

    f(s²) = f(u) |du/ds²|
        = [(n/2σ²)^{(n–1)/2} / Γ((n–1)/2)] e^{–ns²/2σ²} (s²)^{((n–1)/2) – 1}, 0 < s² < ∞.

This is the sampling distribution of s². It was discovered by a German mathematician, Helmert, in 1876. We can determine the mean and variance of s² similar to the case of the χ² distribution.

    Thus E(s²) = [(n – 1)/n] σ² and V(s²) = [2(n – 1)/n²] σ⁴.

Student's t Distribution

A continuous random variable t, assuming values from –∞ to ∞ and having the pdf

    f(t) = [1/(√n β(1/2, n/2))] (1 + t²/n)^{–(n+1)/2}, –∞ < t < ∞,

is said to follow a Student's t distribution with n degrees of freedom. The t distribution depends only on n, which is the parameter of the distribution.

For n = 1, the above pdf reduces to f(t) = 1/[π(1 + t²)], which is known as the Cauchy pdf.

Note: β(m, n) = Γ(m)Γ(n)/Γ(m + n).

Definition of 't' statistic

If the random variables Z → N(0, 1) and Y → χ²(n), and if Z and Y are independent, then the statistic defined by

    t = Z/√(Y/n)

follows the Student's t distribution with n df.

i. Distribution of t for a single sample mean:

    t = (x̄ – µ)/(s/√(n – 1)) → t(n – 1) df,

where s/√(n – 1) is the estimated SE of x̄. Later we can see that the above r.v. will play an important role in building schemes for inference on µ.

ii. Distribution of t for comparison of two sample means:

    t = [(x̄1 – x̄2) – (µ1 – µ2)] / √[((n1s1² + n2s2²)/(n1 + n2 – 2)) (1/n1 + 1/n2)] → t(n1 + n2 – 2) df.
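The defining construction t = Z/√(Y/n) can be checked by simulation: the resulting variable should have mean 0 and variance n/(n – 2). The sketch below assumes an arbitrary choice of n = 6 degrees of freedom.

```python
import math
import random

random.seed(3)

n = 6          # degrees of freedom
trials = 60000

ts = []
for _ in range(trials):
    z = random.gauss(0, 1)
    # Build a chi-square(n) variate as a sum of squared standard normals.
    y = sum(random.gauss(0, 1) ** 2 for _ in range(n))
    ts.append(z / math.sqrt(y / n))

mean_t = sum(ts) / trials
var_t = sum((t - mean_t) ** 2 for t in ts) / trials
print(mean_t, var_t)  # expect about 0 and n/(n - 2) = 1.5
```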
Characteristics of t distribution

1. The curve representing the t distribution is symmetrical about t = 0.
2. The t curve is unimodal.
3. The mode of the t curve is 0, i.e., at t = 0 it has the maximum probability.
4. The mean of the t distribution is 0, i.e., E(t) = 0.
5. All odd central moments are 0.
6. The even central moments exist for sufficiently large v. In particular the variance is v/(v – 2), which is slightly greater than 1. Therefore it has a greater dispersion than the normal distribution. The exact shape of the t distribution depends on the sample size, or the degrees of freedom v. If v is small, the curve is more spread out in the tails and flatter around the centre. As v increases, the t distribution approaches the normal distribution.
7. For each different number of degrees of freedom, there is a distribution of t. Accordingly the t distribution is not a single distribution, but a family of distributions.
8. The mgf does not exist for the t distribution.
9. The probabilities of the t distribution have been tabulated (see tables). The table provides values t0 such that P(|t| ≥ t0) equals a given probability. Values of t0 have been tabulated by Fisher for different probabilities, say 0.10, 0.05, 0.02 and 0.01. This table is prepared for t curves for v = 1, 2, 3, ..., 60.

Snedecor's F Distribution
Another important distribution is the F distribution, named in honour of Sir Ronald A. Fisher. The F distribution, which we shall later find to be of considerable practical interest, is the distribution of the ratio of two independent chi square random variables divided by their respective degrees of freedom.

Definition

A continuous random variable F, assuming values from 0 to ∞ and having the pdf

    f(F) = [(n1/n2)^{n1/2} / β(n1/2, n2/2)] F^{(n1/2) – 1} (1 + n1F/n2)^{–(n1+n2)/2}, 0 < F < ∞,

is said to follow an F distribution with (n1, n2) degrees of freedom. The credit for its invention goes to G.W. Snedecor. He chose the letter F to designate the random variable in honour of R.A. Fisher.

Applications of the F distribution arise in problems in which we are interested in comparing the variances σ1² and σ2² of two normal populations. Let us have two independent r.v.s X1 and X2 such that X1 → N(µ1, σ1) and X2 → N(µ2, σ2). Random samples of sizes n1 and n2 are taken from the above populations, and the sample variances s1² and s2² are computed.
Since n1s1²/σ1² → χ²(n1 – 1) and n2s2²/σ2² → χ²(n2 – 1) independently, the statistic

    F = [n1s1² / ((n1 – 1)σ1²)] / [n2s2² / ((n2 – 1)σ2²)]

follows the F distribution with (n1 – 1, n2 – 1) degrees of freedom.

1. The mean of the F distribution is E(F) = n2/(n2 – 2). No mean exists for n2 ≤ 2.

2. The variance of the F distribution is

    V(F) = [2n2²(n1 + n2 – 2)] / [n1(n2 – 2)²(n2 – 4)].

No variance exists if n2 ≤ 4.

5. If F = (U/n1)/(V/n2) → F(n1, n2) df, then 1/F = (V/n2)/(U/n1) again is the ratio of two independent χ² r.v.s, each divided by its degrees of freedom, so it again has the F distribution, now with n2 and n1 degrees of freedom. Hence

    F_{1–α}(n1, n2) = 1/F_{α}(n2, n1).

This is called the reciprocal property of the F distribution.

6. The two important uses of the F distribution are (a) to test the equality of two normal population variances and (b) to test the equality of three or more population means.

7. Since the applications of the F distribution are based on the ratio of sample variances, the F distribution is also known as the variance-ratio distribution.

In view of its importance, the F distribution has been tabulated extensively. See the table at the end of this book. It contains values of F_{α}(n1, n2) for α = 0.05 and 0.01, and for various values of n1 and n2, where F_{α}(n1, n2) is such that the area to its right under the curve of F(n1, n2) is α.
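Both the mean of F and the reciprocal property can be checked numerically. The sketch below builds F variates directly from chi-square sums; the degrees of freedom n1 = 10, n2 = 8 are arbitrary illustrative choices.

```python
import random

random.seed(4)

n1, n2 = 10, 8
trials = 40000

def chi2(k):
    # Chi-square(k) as a sum of k squared standard normals.
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))

fs = [(chi2(n1) / n1) / (chi2(n2) / n2) for _ in range(trials)]

mean_f = sum(fs) / trials                      # E(F) = n2/(n2 - 2) = 8/6
mean_recip = sum(1 / f for f in fs) / trials   # 1/F ~ F(n2, n1): mean n1/(n1 - 2) = 1.25
print(mean_f, mean_recip)
```

The reciprocal of every simulated F(10, 8) variate behaves like an F(8, 10) variate, whose mean is 10/8 = 1.25.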
Inter relationship between t, χ² and F distributions

1. The square of an N(0, 1) variate is χ²(1).

2. t² = [χ²(1)/1] / [χ²(n – 1)/(n – 1)] → F(1, n – 1), i.e., the square of a t variate with n – 1 df is F(1, n – 1).

3. If X1, X2, X3, X4 are independent N(0, 1) variates, then X1² + X2² + X3² → χ²(3), and hence

    X4 / √((X1² + X2² + X3²)/3) = √3 X4 / √(X1² + X2² + X3²) → t(3).

Example

Let x̄ be the mean of a sample of size n from a normal population with mean 100 and SD 10, and suppose P(x̄ > 101.645) = 0.05. Then

    P((x̄ – 100)/(10/√n) > (101.645 – 100)/(10/√n)) = 0.05
    i.e., P(Z > 1.645√n/10) = 0.05
    i.e., P(Z > 0.1645√n) = 0.05.
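The relation t²(n – 1) → F(1, n – 1) can also be verified by simulation: the mean of t² should match the F(1, n – 1) mean (n – 1)/(n – 3). The value ndf = 9 below is an arbitrary illustrative choice.

```python
import math
import random

random.seed(5)

ndf = 9        # t with 9 df, so t^2 should be F(1, 9)
trials = 50000

def t_draw(k):
    # t(k) built from its definition Z / sqrt(Y/k).
    z = random.gauss(0, 1)
    y = sum(random.gauss(0, 1) ** 2 for _ in range(k))
    return z / math.sqrt(y / k)

t2 = [t_draw(ndf) ** 2 for _ in range(trials)]
mean_t2 = sum(t2) / trials
print(mean_t2)  # F(1, 9) has mean 9/7, about 1.286
```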
EXERCISES

Fill in the blanks

• The mean of the χ² distribution is .................. of its variance.
• If the df for the chi square distribution is large, the chi-square distribution tends to ..................
• The t distribution with 1 df reduces to ..................
• The ratio of two sample variances is distributed as ..................
• The relation between Fisher's Z and Snedecor's F is ..................
• The square of any standard normal variate follows .................. distribution.

Very Short Answer Questions

• What is a random sample?
• Define the term 'statistic'.
• Define the term 'parameter'.
• What is a sampling distribution?
• Define standard error.
• What is the relationship between SE and sample size?
• Define the χ² distribution with n df.

Short Essay Questions

• Explain the terms (i) statistic, (ii) standard error and (iii) sampling distribution, giving suitable examples.
• Define sampling distribution and give an example.
• Derive the sampling distribution of the mean of samples from a normal population.

Long Essay Questions

• State the distribution of the sample variance from a normal population.
• Define χ² and obtain its mean and mode.
• Define the χ² statistic. Write its density and establish the additive property.
• Give the important properties of the χ² distribution and examine its relationship with the normal distribution.
• Define a χ² variate and give its sampling distribution. Show that its variance is twice its mean.
• Define the F statistic. Relate F to the t statistic and F_{n,m} to F_{m,n}.
MODULE II
THEORY OF ESTIMATION

The theory of estimation was expounded by Prof. R.A. Fisher in his research papers round about 1930. Suppose we are given a random sample from a population, the distribution of which has a known mathematical form but involves a certain number of unknown parameters. The technique of coming to conclusions regarding the values of the unknown parameters, based on the information provided by a sample, is known as the problem of 'Estimation'. This estimation can be made in two ways: i. Point Estimation, ii. Interval Estimation.

Point Estimation

If, from the observations in a sample, a single value is calculated as an estimate of the unknown parameter, the procedure is referred to as point estimation and we refer to the value of the statistic as a point estimate. For example, if we use a value of x̄ to estimate the mean µ of a population, we are using a point estimate of µ. Correspondingly, we refer to the statistic x̄ as a point estimator. That is, the term 'estimator' represents a rule or method of estimating the population parameter, and the estimate represents the value produced by the estimator.

An estimator is a random variable, being a function of random observations which are themselves random variables. An estimate can be counted only as one of the possible values of the random variable. So estimators are statistics, and to study properties of estimators it is desirable to look at their distributions.

Properties of Estimators

There are four criteria commonly used for finding a good estimator. They are:

1. Unbiasedness
2. Consistency
3. Efficiency
4. Sufficiency

1. Unbiasedness

An unbiased estimator is a statistic that has an expected value equal to the unknown true value of the population parameter being estimated. An estimator not having this property is said to be biased.

Let X be a random variable having the pdf f(x, θ), where θ may be unknown. Let X1, X2, ..., Xn be a random sample taken from the population represented by X. Let tn = t(X1, X2, ..., Xn) be an estimator of the parameter θ. If E(tn) = θ for every n, then the estimator tn is called an unbiased estimator.

2. Consistency

One of the basic properties of a good estimator is that it provides increasingly more precise information about the parameter θ with the increase of the sample size n. Accordingly we introduce the following definition.

Definition

The estimator tn = t(X1, X2, ..., Xn) of a parameter θ is called consistent if tn converges to θ in probability. That is, for ε > 0,

    lim_{n→∞} P(|tn – θ| ≤ ε) = 1  or  lim_{n→∞} P(|tn – θ| > ε) = 0.

The estimators satisfying the above condition are called weakly consistent estimators.

The following theorem gives a sufficient set of conditions for the consistency of an estimator.

Theorem

If an estimator tn is such that E(tn) → θ and V(tn) → 0 as n → ∞, the estimator tn is consistent for θ.

3. Efficiency

Let t1 and t2 be two unbiased estimators of a parameter θ. To choose between different unbiased estimators, one would reasonably consider their variances; i.e., if V(t1) is less than V(t2), then t1 is said to be more efficient than t2. That is, as the variance of an estimator decreases, its efficiency increases.
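Unbiasedness is easy to probe by simulation. The sketch below (illustrative values σ = 2, n = 5 assumed) compares the sample variance with divisor n, whose expectation is (n – 1)σ²/n, against the modified variance with divisor n – 1, whose expectation is σ².

```python
import random

random.seed(6)

mu, sigma, n = 0.0, 2.0, 5
trials = 40000

s2_n = []   # divisor n (biased for sigma^2)
s2_n1 = []  # divisor n - 1 (unbiased "modified" sample variance)
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2_n.append(ss / n)
    s2_n1.append(ss / (n - 1))

mean_biased = sum(s2_n) / trials     # expect (n-1)/n * sigma^2 = 3.2
mean_unbiased = sum(s2_n1) / trials  # expect sigma^2 = 4.0
print(mean_biased, mean_unbiased)
```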
Method of Moments

This is the oldest method of estimation, introduced by Karl Pearson. According to it, to estimate k parameters of a population, we equate, in general, the first k moments of the sample to the first k moments of the population. Solving these k equations we get the k estimators.

Let X be a random variable with probability density function f(x, θ). Let µr be the r-th moment about the origin: µr = E(X^r). In general, µr will be a known function of θ, and we write µr = µr(θ). Let x1, x2, ..., xn be a random sample of size n drawn from the population with density function f(x, θ). Then the r-th sample moment will be m_r = (1/n) Σ_{i=1}^{n} x_i^r. Form the equation m_r = µ_r(θ) and solve for θ. The solution θ̂ is the method of moments estimator of θ.

Method of Maximum Likelihood

For a random sample x1, x2, ..., xn from f(x, θ), the likelihood function is defined as

    L(x1, x2, ..., xn; θ) = f(x1, x2, ..., xn; θ)
        = f(x1; θ) f(x2; θ) ..... f(xn; θ)
        = Π_{i=1}^{n} f(xi, θ).

The likelihood function can also be denoted as L(X; θ) or L(θ). The likelihood function L(x1, x2, ..., xn; θ) gives the likelihood that the random variables assume the particular values x1, x2, ..., xn.
If t is the value of θ which maximises L, then t is called the Maximum Likelihood Estimator (MLE) of θ. Thus t is a solution, if any, of

    ∂L/∂θ = 0 and ∂²L/∂θ² < 0.

Since L and log L have their maximum at the same value of θ, we can take log L instead of L, which is usually more convenient in computations. Thus the MLE is the solution of the equation ∂log L/∂θ = 0, provided ∂²log L/∂θ² < 0.

The maximum likelihood method can also be used for the simultaneous estimation of several parameters.

Example 1

Let x1, x2, ..., xn be a random sample from a population with mean µ and variance σ². Show that the sample mean is an unbiased estimator of the population mean µ.

Solution

We know that x̄ = (1/n) Σ_{i=1}^{n} xi. Taking expected values, we get

    E(x̄) = E[(1/n) Σ xi] = (1/n) Σ E(xi)
        = (1/n) E(x1 + x2 + ... + xn)
        = (1/n)(µ + µ + ... + µ) = (1/n)(nµ) = µ.

Example 2

Let x1, x2, x3, ..., xn be a random sample from a normal distribution N(µ, 1). Show that

    t = (1/n) Σ_{i=1}^{n} xi²

is an unbiased estimator of µ² + 1.

Solution

V(xi) = 1 for every i = 1, 2, 3, ..., n. Now V(xi) = E(xi²) – [E(xi)]², so

    E(xi²) = V(xi) + [E(xi)]² = 1 + µ²
    ∴ E(t) = E[(1/n) Σ xi²] = (1/n) Σ E(xi²) = (1/n) Σ (µ² + 1) = (1/n)·n(µ² + 1) = µ² + 1.

Example 3

Since xi is a random observation from a Poisson population with parameter λ,

    E(xi) = λ, i = 1, 2, ..., n.

The statistics

    t1 = x1, t2 = (x1 + x2)/2, tn = (x1 + x2 + ... + xn)/n

are unbiased estimators of λ. It may be noted that

    E(t1) = E(x1) = λ
    E(t2) = ½[E(x1) + E(x2)] = ½[λ + λ] = λ
    E(tn) = (1/n) Σ E(xi) = λ.
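The three Poisson estimators above are all unbiased, but they differ in efficiency: tn has the smallest variance. The sketch below checks this; since the standard library has no Poisson sampler, a simple Knuth-style generator is hand-rolled, and λ = 3, n = 10 are illustrative assumptions.

```python
import math
import random

random.seed(8)

def poisson(lam):
    # Knuth's multiplication method, adequate for small lambda.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

lam, n, trials = 3.0, 10, 30000
t1s, t2s, tns = [], [], []
for _ in range(trials):
    xs = [poisson(lam) for _ in range(n)]
    t1s.append(xs[0])
    t2s.append((xs[0] + xs[1]) / 2)
    tns.append(sum(xs) / n)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

print(mean(t1s), mean(t2s), mean(tns))  # all near lambda = 3 (unbiased)
print(var(t1s), var(t2s), var(tns))     # near 3, 1.5 and 0.3 (tn most efficient)
```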
It can similarly be shown that s² is consistent for σ².

Example 6

Give an example of estimators which are
(a) Unbiased and efficient,
(b) Unbiased and inefficient,
(c) Biased and inefficient.

Solution

(a) The sample mean x̄ and the modified sample variance S² = [n/(n – 1)] s² are two such examples.

(b) The sample median, and the sample statistic ½[Q1 + Q3], where Q1 and Q3 are the lower and upper sample quartiles, are two such examples. Both statistics are unbiased estimators of the population mean, since the mean of their sampling distribution is the population mean.

(c) The sample standard deviation s, the modified standard deviation, the mean deviation and the semi-interquartile range are four such examples.

Example 7

For the rectangular distribution over an interval (α, β), α < β, find the maximum likelihood estimates of α and β.

Solution

For the rectangular distribution over (α, β), the p.d.f. of X is given by

    f(x) = 1/(β – α), α ≤ x ≤ β.

The likelihood of the sample is L = 1/(β – α)^n, which is maximised by making β – α as small as possible. Arranging the observations in increasing order, we have

    α ≤ x1 ≤ x2 ≤ x3 ≤ ... ≤ xn ≤ β.

Here the minimum value of β consistent with the sample is xn, and the maximum value of α is x1. Thus the M.L.E.s of α and β are

    α̂ = x1, β̂ = xn.

EXERCISES

Multiple Choice Questions

• An estimator is a function of
a. population observations  b. sample observations
c. mean and variance of population  d. none of the above

• Estimate and estimator are
a. synonyms  b. different
c. related to population  d. none of the above

• The types of estimates are
a. point estimate  b. interval estimate
c. estimates of confidence region  d. all the above

• The estimator of the population mean is ..................
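The MLE result for the rectangular (uniform) distribution — smallest and largest observations — is easy to see numerically. The endpoints α = 2, β = 7 and the sample size below are illustrative assumptions.

```python
import random

random.seed(9)

alpha, beta = 2.0, 7.0
xs = [random.uniform(alpha, beta) for _ in range(200)]

alpha_hat = min(xs)  # MLE of the lower end point
beta_hat = max(xs)   # MLE of the upper end point
print(alpha_hat, beta_hat)
```

With 200 observations the estimates hug the true endpoints from inside the interval, as the theory predicts.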
• The credit of inventing the method of moments for estimating parameters goes to
a. R.A. Fisher  b. J. Neymann
c. Laplace  d. Karl Pearson

• Generally the estimators obtained by the method of moments, as compared to MLEs, are
a. less efficient  b. more efficient
c. equally efficient  d. none of these

Fill in the blanks

• An estimator is itself a ..................
• A sample constant representing a population parameter is known as ..................
• A value of an estimator is called an ..................
• A single value of an estimator for a population parameter θ is called its .................. estimate.
• The difference between the expected value of an estimator and the value of the corresponding parameter is known as ..................
• The joint probability density function of sample variates is called ..................
• A value of a parameter θ which maximises the likelihood function is known as .................. estimate of θ.

Very Short Answer Questions

• Define consistency of an estimator.
• Define efficiency of an estimator.
• Define sufficiency of an estimator.
• State the desirable properties of a good estimator.
• Give one example of an unbiased estimator which is not consistent.
• Give an example of a consistent estimator which is not unbiased.
• Give the names of various methods of estimation of a parameter.
• What is a maximum likelihood estimator?
• Discuss method of moments estimation.
• What are the properties of MLE?
• Show that the sample mean is more efficient than the sample median as an estimator of the population mean.
• State the necessary and sufficient condition for consistency of an estimator.

Short Essay Questions

• Distinguish between Point estimation and Interval estimation.
• Define the following terms and give an example for each: (a) Unbiased statistic; (b) Consistent statistic; and (c) Sufficient statistic.
• Describe the desirable properties of a good estimator.
• Explain the properties of a good estimator. Give an example to show that a consistent estimate need not be unbiased.
• Define consistency of an estimator. State a set of sufficient conditions for consistency.
• Find the maximum likelihood estimator of θ for the distribution with pdf
    f(x, θ) = (1 + θ)x^θ; θ > 0, 0 < x < 1,
    = 0 elsewhere.
• Given a random sample of size n from
    f(x; θ) = θe^{–θx}, x > 0; θ > 0,
find the maximum likelihood estimator of θ. Obtain the variance of the estimator.

INTERVAL ESTIMATION

This leads to our saying that we are 100(1 – α)% confident that our single interval contains the true parameter value. The interval (t1, t2) is called a confidence interval or fiducial interval, and 1 – α is called the 'confidence coefficient' of the interval (t1, t2). The limits t1 and t2 are called 'confidence limits'.

For instance, if we take α = 0.05, the 95% confidence possesses the meaning that if 100 intervals are constructed based on 100 different samples (of the same size) from the population, 95 of them will include the true value of the parameter. By accepting a 95% confidence interval for the parameter, the frequency of wrong estimates is approximately equal to 5%. The notion of confidence interval was introduced and developed by Prof. J. Neyman in a series of papers.

Now we discuss the construction of confidence intervals for various parameters of a population or distribution under different conditions.
Confidence interval for the mean of a normal population

Case (i): when σ is known

Let x̄ be the mean of a random sample of size n from N(µ, σ). Then Z = (x̄ – µ)/(σ/√n) → N(0, 1), and from the normal tables we can find z_{α/2} such that

    P(–z_{α/2} ≤ (x̄ – µ)/(σ/√n) ≤ z_{α/2}) = 1 – α
    i.e., P(–z_{α/2} σ/√n ≤ x̄ – µ ≤ z_{α/2} σ/√n) = 1 – α
    i.e., P(x̄ – z_{α/2} σ/√n ≤ µ ≤ x̄ + z_{α/2} σ/√n) = 1 – α.

So the 100(1 – α)% confidence interval for µ is

    (x̄ – z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n).

Notes:
1. If α = 0.05, z_{α/2} = 1.96, so the 95% confidence interval for µ is (x̄ – 1.96 σ/√n, x̄ + 1.96 σ/√n).
2. If α = 0.02, z_{α/2} = 2.326, so the 98% confidence interval for µ is (x̄ – 2.326 σ/√n, x̄ + 2.326 σ/√n).
3. If α = 0.01, z_{α/2} = 2.58, so the 99% confidence interval for µ is (x̄ – 2.58 σ/√n, x̄ + 2.58 σ/√n).
4. If α = 0.10, z_{α/2} = 1.645, so the 90% confidence interval for µ is (x̄ – 1.645 σ/√n, x̄ + 1.645 σ/√n).

Case (ii): when σ is unknown and n is large (n ≥ 30)

Here σ is replaced by its estimate s, the sample standard deviation, so the 100(1 – α)% confidence interval for µ is

    (x̄ – z_{α/2} s/√n, x̄ + z_{α/2} s/√n).
Case (iii): when σ is unknown and n is small (n < 30)

Let X1, X2, ..., Xn be a random sample drawn from N(µ, σ) where σ is unknown. Let x̄ be the sample mean and s² be the sample variance. Here we know that the statistic

    t = (x̄ – µ)/(s/√(n – 1)) → t(n – 1) df.

Let P(|t| ≤ t_{α/2}) = 1 – α, where t_{α/2} is obtained by referring the Student's t table for (n – 1) d.f. and probability α. Then

    P(–t_{α/2} ≤ t ≤ t_{α/2}) = 1 – α
    => P(–t_{α/2} ≤ (x̄ – µ)/(s/√(n – 1)) ≤ t_{α/2}) = 1 – α
    => P(–t_{α/2} s/√(n – 1) ≤ x̄ – µ ≤ t_{α/2} s/√(n – 1)) = 1 – α
    => P(x̄ – t_{α/2} s/√(n – 1) ≤ µ ≤ x̄ + t_{α/2} s/√(n – 1)) = 1 – α.

So the 100(1 – α)% confidence interval for µ is

    (x̄ – t_{α/2} s/√(n – 1), x̄ + t_{α/2} s/√(n – 1)).

Interval Estimate of the Difference of two population means

Let x̄1 be the mean of a sample of size n1 taken from a population with mean µ1 and SD σ1; then x̄1 → N(µ1, σ1²/n1). Let x̄2 be the mean of a sample of size n2 taken from a population with mean µ2 and SD σ2; then x̄2 → N(µ2, σ2²/n2). Then, by the additive property,

    x̄1 – x̄2 → N(µ1 – µ2, σ1²/n1 + σ2²/n2)

    ∴ Z = [(x̄1 – x̄2) – (µ1 – µ2)] / √(σ1²/n1 + σ2²/n2) → N(0, 1).
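The one-sample interval above reduces to a one-line computation. The sketch below wraps it in a small helper and evaluates it on the figures from one of the exercises later in this module (a sample of size 60 with mean 145 and s.d. 40, 95% confidence); the helper name is my own.

```python
import math

def mean_ci(xbar, sd, n, z):
    # Large-sample CI: xbar +/- z * sd / sqrt(n).
    half = z * sd / math.sqrt(n)
    return xbar - half, xbar + half

lo, hi = mean_ci(xbar=145, sd=40, n=60, z=1.96)
print(lo, hi)  # about (134.88, 155.12)
```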
By the area property of the normal distribution, we know that

    P(|Z| ≤ z_{α/2}) = 1 – α
    i.e., P(–z_{α/2} ≤ [(x̄1 – x̄2) – (µ1 – µ2)] / √(σ1²/n1 + σ2²/n2) ≤ z_{α/2}) = 1 – α.

On simplification, as in the case of one sample, the 100(1 – α)% confidence interval for µ1 – µ2 is

    ((x̄1 – x̄2) – z_{α/2} √(σ1²/n1 + σ2²/n2), (x̄1 – x̄2) + z_{α/2} √(σ1²/n1 + σ2²/n2)),

where the value of z_{α/2} can be determined from the normal table. When α = 0.05, z_{α/2} = 1.96, so the 95% confidence interval for µ1 – µ2 is

    ((x̄1 – x̄2) – 1.96 √(σ1²/n1 + σ2²/n2), (x̄1 – x̄2) + 1.96 √(σ1²/n1 + σ2²/n2)).

Similarly we can find 98% and 99% confidence intervals by replacing 1.96 respectively by 2.326 and 2.58.

Case (ii): when σ1, σ2 are unknown and the samples are large

Here σ1 and σ2 are replaced by their estimates s1 and s2. So the 95% CI for µ1 – µ2 is

    ((x̄1 – x̄2) – 1.96 √(s1²/n1 + s2²/n2), (x̄1 – x̄2) + 1.96 √(s1²/n1 + s2²/n2)).

Case (iii): when σ1, σ2 are unknown and the samples are small

Assuming σ1 = σ2 = σ, the statistic

    t = [(x̄1 – x̄2) – (µ1 – µ2)] / [σ̂ √(1/n1 + 1/n2)] → Student's t distribution with ν = (n1 + n2 – 2) d.f.,

where

    σ̂² = (n1s1² + n2s2²)/(n1 + n2 – 2).

Refer the t curve for ν = (n1 + n2 – 2) d.f. and probability level α; the table value of t is t_{α/2}. Then we have

    P(|t| ≥ t_{α/2}) = α
    i.e., P(|t| ≤ t_{α/2}) = 1 – α
    i.e., P(–t_{α/2} ≤ t ≤ t_{α/2}) = 1 – α.

Substituting t and simplifying, we get the 100(1 – α)% confidence interval for µ1 – µ2 as

    ((x̄1 – x̄2) – t_{α/2} σ̂ √(1/n1 + 1/n2), (x̄1 – x̄2) + t_{α/2} σ̂ √(1/n1 + 1/n2)),

where t_{α/2} is obtained by referring the t table for n1 + n2 – 2 df and probability α.
Confidence interval for the variance of a Normal population

Let s² be the variance of a sample of size n (n < 30) drawn from N(µ, σ). We know that the statistic

    χ² = ns²/σ² → χ²(n – 1) d.f.

Now, by referring the table, we can find χ²_{1–α/2} and χ²_{α/2} such that

    P(χ²_{1–α/2} ≤ χ² ≤ χ²_{α/2}) = 1 – α,

where χ²_{1–α/2} and χ²_{α/2} are obtained by referring the table for n – 1 d.f. and probabilities 1 – α/2 and α/2 respectively.

    i.e., P(χ²_{1–α/2} ≤ ns²/σ² ≤ χ²_{α/2}) = 1 – α
    i.e., P(1/χ²_{α/2} ≤ σ²/ns² ≤ 1/χ²_{1–α/2}) = 1 – α
    i.e., P(ns²/χ²_{α/2} ≤ σ² ≤ ns²/χ²_{1–α/2}) = 1 – α.

So the 100(1 – α)% confidence interval for σ² is

    (ns²/χ²_{α/2}, ns²/χ²_{1–α/2}).

Confidence interval for the proportion of success of a binomial population

Let p̂ = x/n be the proportion of successes of a sample of size n drawn from a binomial population with parameters n and p, where p is unknown and n is assumed to be known. Then we know that

    Z = (p̂ – p)/√(pq/n) → N(0, 1) for large n.

From the normal tables we get

    P(|Z| ≤ z_{α/2}) = 1 – α
    i.e., P(–z_{α/2} ≤ (p̂ – p)/√(pq/n) ≤ z_{α/2}) = 1 – α
    i.e., P(p̂ – z_{α/2} √(pq/n) ≤ p ≤ p̂ + z_{α/2} √(pq/n)) = 1 – α.

So the 100(1 – α)% confidence interval for p is

    (p̂ – z_{α/2} √(p̂q̂/n), p̂ + z_{α/2} √(p̂q̂/n)),

where the unknown p and q in the standard error are estimated by p̂ and q̂ = 1 – p̂.

Note:
When α = 0.05, z_{α/2} = 1.96, so the 95% C.I. for p is (p̂ – 1.96 √(p̂q̂/n), p̂ + 1.96 √(p̂q̂/n)).
When α = 0.02, z_{α/2} = 2.326, so the 98% C.I. for p is (p̂ – 2.326 √(p̂q̂/n), p̂ + 2.326 √(p̂q̂/n)).
When α = 0.01, z_{α/2} = 2.58, so the 99% C.I. for p is (p̂ – 2.58 √(p̂q̂/n), p̂ + 2.58 √(p̂q̂/n)).

Confidence interval for the difference of two proportions

The 95% confidence interval for (p1 – p2) is

    ((p̂1 – p̂2) – 1.96 √(p̂1q̂1/n1 + p̂2q̂2/n2), (p̂1 – p̂2) + 1.96 √(p̂1q̂1/n1 + p̂2q̂2/n2)).

Since p1, q1 and p2, q2 are unknown, they are estimated as p1 ≈ p̂1, q1 ≈ q̂1, p2 ≈ p̂2 and q2 ≈ q̂2.

Note: To construct 98% and 99% confidence intervals for p1 – p2, replace 1.96 by 2.326 and 2.58 respectively.
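The proportion interval is a direct computation. The sketch below wraps it in a small helper (the name is my own) and applies it to the figures used in Example 7 later in this module: 57 cases out of 300 at 95% confidence.

```python
import math

def proportion_ci(x, n, z):
    # Large-sample CI for p: p_hat +/- z * sqrt(p_hat * q_hat / n).
    p = x / n
    q = 1 - p
    half = z * math.sqrt(p * q / n)
    return p - half, p + half

lo, hi = proportion_ci(57, 300, 1.96)
print(lo, hi)  # about (0.146, 0.234)
```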
Example 1

Obtain the 95% confidence intervals for the mean µ (σ known) and for the variance σ² of a normal population.

Solution

We know Z = (x̄ – µ)/(σ/√n) → N(0, 1). From the normal tables, we get

    P(–1.96 ≤ Z ≤ 1.96) = 0.95
    i.e., P(–1.96 ≤ (x̄ – µ)/(σ/√n) ≤ 1.96) = 0.95
    i.e., P(–1.96 σ/√n ≤ x̄ – µ ≤ 1.96 σ/√n) = 0.95
    i.e., P(x̄ – 1.96 σ/√n ≤ µ ≤ x̄ + 1.96 σ/√n) = 0.95.

Thus the 95% CI for µ is (x̄ – 1.96 σ/√n, x̄ + 1.96 σ/√n).

To obtain the CI for σ², let us use the result

    χ² = ns²/σ² → χ²(n – 1).

From the χ² table, we can find χ²_{0.975} and χ²_{0.025} such that

    P(χ²_{0.975} ≤ ns²/σ² ≤ χ²_{0.025}) = 0.95
    i.e., P(ns²/χ²_{0.025} ≤ σ² ≤ ns²/χ²_{0.975}) = 0.95.

Thus the 95% CI for σ² is (ns²/χ²_{0.025}, ns²/χ²_{0.975}), where χ²_{0.025} and χ²_{0.975} are obtained by referring the χ² table for n – 1 df and probabilities 0.025 and 0.975 respectively.
Example 2

Obtain the 95% confidence interval for the variance of a normal population N(µ, σ).

Solution

Since ns²/σ² → χ²(n – 1), we find χ²_{0.975} and χ²_{0.025} from the χ² table for n – 1 df such that P(χ²_{0.975} ≤ ns²/σ² ≤ χ²_{0.025}) = 0.95. Inverting the inequalities, the 95% CI for σ² is (ns²/χ²_{0.025}, ns²/χ²_{0.975}).

For instance, if n = 15 and s² = 4.24, the table gives χ²_{14, 0.05} = 23.68 and χ²_{14, 0.95} = 6.571, so the corresponding 90% CI for σ² is

    (15 × 4.24/23.68, 15 × 4.24/6.571) = (2.685, 9.678).

Example 3

Obtain the 99% CI for the difference of means of two normal populations N(µ1, σ1) and N(µ2, σ2) when (i) σ1, σ2 are known, (ii) σ1, σ2 are unknown.

Solution

(i) Since X1 → N(µ1, σ1) and X2 → N(µ2, σ2),

    x̄1 → N(µ1, σ1²/n1), x̄2 → N(µ2, σ2²/n2)
    therefore x̄1 – x̄2 → N(µ1 – µ2, σ1²/n1 + σ2²/n2)
    ∴ Z = [(x̄1 – x̄2) – (µ1 – µ2)] / √(σ1²/n1 + σ2²/n2) → N(0, 1).

From the normal tables we can write

    P(–2.58 ≤ Z ≤ 2.58) = 0.99.

Substituting Z and simplifying, we get

    P((x̄1 – x̄2) – 2.58 √(σ1²/n1 + σ2²/n2) ≤ µ1 – µ2 ≤ (x̄1 – x̄2) + 2.58 √(σ1²/n1 + σ2²/n2)) = 0.99.

Thus the 99% CI for µ1 – µ2 is

    ((x̄1 – x̄2) – 2.58 √(σ1²/n1 + σ2²/n2), (x̄1 – x̄2) + 2.58 √(σ1²/n1 + σ2²/n2)).

(ii) When σ1 and σ2 are unknown, assume σ1 = σ2 = σ and estimate σ² by

    σ̂*² = (n1s1² + n2s2²)/(n1 + n2 – 2).

Then the statistic

    t = [(x̄1 – x̄2) – (µ1 – µ2)] / [σ̂* √(1/n1 + 1/n2)] → t(n1 + n2 – 2) df.

Referring the t table for n1 + n2 – 2 df and probability 0.01 gives t_{α/2}, and the 99% CI for µ1 – µ2 is

    ((x̄1 – x̄2) – t_{α/2} σ̂* √(1/n1 + 1/n2), (x̄1 – x̄2) + t_{α/2} σ̂* √(1/n1 + 1/n2)).
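The numerical interval above can be reproduced directly, using the χ² table values quoted in the text for 14 degrees of freedom.

```python
# Chi-square based CI for sigma^2, with the table values quoted in the text
# for 14 degrees of freedom: chi2_{0.05} = 23.68, chi2_{0.95} = 6.571.
n, s2 = 15, 4.24
chi2_upper, chi2_lower = 23.68, 6.571

lo = n * s2 / chi2_upper
hi = n * s2 / chi2_lower
print(lo, hi)  # about (2.685, 9.678)
```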
Example 5

If the mean age at death of 64 men engaged in an occupation is 52.4 years with a standard deviation of 10.2 years, what are the 98% confidence limits for the mean age of all men in that occupation?

Solution

Here n = 64, x̄ = 52.4 and s = 10.2. The 98% confidence limits are

    x̄ ± 2.326 s/√n = 52.4 ± 2.326 × 10.2/√64 = 52.4 ± 2.97,

i.e., (49.43, 55.37).

Example 6

Obtain the 95% confidence interval for (µ1 – µ2) when the population standard deviations are unknown and the samples are large.

Solution

The 95% confidence interval for (µ1 – µ2) is

    ((x̄1 – x̄2) – 1.96 √(σ1²/n1 + σ2²/n2), (x̄1 – x̄2) + 1.96 √(σ1²/n1 + σ2²/n2)).

Since σ1 and σ2 are unknown, we shall replace them respectively by s1 and s2.

Example 7

A medical study showed 57 of 300 persons failed to recover from a particular disease. Find a 95% confidence interval for the mortality rate of the disease.

Solution

Here n = 300 and x = 57, so p̂ = 57/300 = 0.19 and q̂ = 0.81. The 95% CI for p is

    0.19 ± 1.96 √(0.19 × 0.81/300) = 0.19 ± 0.044,

i.e., (0.146, 0.234).

EXERCISES

Multiple Choice Questions

• The 95% confidence interval for the mean of N(µ, σ), when σ is known, is
a. x̄ ± 1.96 s/√n  b. x̄ ± 1.96 σ/√n
c. x̄ ± 2.58 s/√n  d. x̄ ± 2.58 σ/√n

• The small-sample confidence interval for the mean, when σ is unknown, is
a. x̄ ± t_{α/2} s/√(n – 1)  b. x̄ ± t_{α/2} s/√n
c. x̄ ± t_{n–1} s/√(n – 1)  d. x̄ ± t_n s/√n
• ...
c. (55.28, 48.72)  d. none of the above

• The formula for the confidence interval for the ratio of variances of two normal populations involves
a. χ² distribution  b. F distribution
c. t distribution  d. none of the above

Fill in the blanks

• The notion of confidence interval was introduced and developed by ..................
• The confidence interval is also called .................. interval.
• An interval estimate is determined in terms of ..................
• An interval estimate with .................. interval is best.
• Confidence interval is specified by the .................. limits.
• Confidence interval is always specified with a certain ..................
• To determine the confidence interval for the variance of a normal distribution, .................. distribution is used.

Very Short Answer Questions

• What is an interval estimate?
• Explain interval estimation.
• State the 95% confidence interval for the mean of a normal distribution N(µ, σ) when σ is known.

Short Essay Questions

• Give the formula for obtaining confidence limits for the difference between the means of two normal populations.
• Why is an interval estimate preferred to a point estimate?
• Explain how you will construct a 100(1 – α)% confidence interval for a normal population mean when the population S.D. is (i) known and (ii) unknown.
• Explain how you would find interval estimates for the mean and variance of a normal population.
• What do you mean by interval estimation? Obtain 99% confidence limits for θ of the normal distribution N(θ, σ²), with the help of a random sample of size n.
• Explain the idea of interval estimation. Obtain a 100(1 – α)% confidence interval for the mean of a normal distribution whose variance is also unknown.
• Obtain a 95% confidence interval for the mean of a normal population with unknown variance on the basis of a small sample of size n taken from the population. What happens when n is large?

Long Essay Questions

• A random sample of 20 bullets produced by a machine shows an average diameter of 3.5 mm and a s.d. of 0.2 mm. Assuming that the diameter measurement follows N(µ, σ), obtain a 95% interval estimate for the mean and a 99% interval estimate for the true variance.
• The mean and s.d. of a sample of size 60 are found to be 145 and 40. Construct a 95% confidence interval for the population mean.
• Two independent random samples each of size 10 from two normal populations gave x̄1 = 4.8, s1 = ..., x̄2 = 5.6 and s2 = 7.9. Find a 95% confidence interval for µ1 – µ2.
• Two random samples of sizes 10 and 12 from normal populations having the same variance gave x̄1 = 20, s1² = 25 and x̄2 = 24, s2² = 36. Find 90% confidence limits for (µ1 – µ2).
• In a sample of 532 individuals selected at random from a population, ...

TESTING OF HYPOTHESIS

Most tests of statistical hypothesis concern the parameters of distributions, but sometimes they also concern the type, or nature, of the distributions themselves. If a statistical hypothesis completely specifies the distribution, it is referred to as a simple hypothesis; if not, it is referred to as a composite hypothesis.

The statistical hypothesis that X follows a normal distribution with mean 15 is a composite hypothesis, since it does not specify the standard deviation of the normal population. The statement that X follows a Poisson distribution with parameter λ = 2 is a simple hypothesis, since it specifies the distribution completely.
Example 2

    ... = Σ_{x=0}^{2} 10Cx (1/2)^{10} = (1 + 10 + 45)/1024 = 56/1024

    ... = ∫ (1/2) e^{–x/2} dx = –(0 – e^{–2}) = e^{–2}

Example 3

    P(accepting H0) = Σ_{x=0}^{2} 4Cx (1/2)^x (1/2)^{4–x}
        = [4C0 + 4C1 + 4C2] (1/2)^4
        = (1 + 4 + 6)/16 = 11/16

    P(Type II error) = ∫_0^5 (1/5) e^{–x/5} dx = [–e^{–x/5}]_0^5 = 1 – e^{–1}
Multiple Choice Questions

• ...
c. value of the statistic  d. number of observations

• Power of a test is related to
a. Type I error  b. Type II error
c. both (a) and (b)  d. neither (a) nor (b)

• Level of significance is also called
a. size of the test  b. size of the critical region
c. producer's risk  d. all the above

Fill in the blanks

• A hypothesis is a testable ..................
• The parametric testing of hypothesis was originated by .................. and ..................
• The hypothesis which is under test for possible rejection is called ..................
• .................. error is not severe than .................. error.
• Probability of type I error is called ..................
• Rejecting H0 when H0 is true is called ..................
• Accepting H0 when H0 is false is called ..................

Very Short Answer Questions

• Define the term 'test of hypothesis'.
• Define simple and composite hypothesis.
• Define null and alternative hypothesis.
• Define type I and type II errors.
• Define level of significance.

Short Essay Questions

• Explain the following with reference to testing of hypothesis: (i) Type I and II errors; (ii) Critical region; and (iii) Null and alternate hypothesis.
• Distinguish between Simple and Composite hypotheses. Give one example each.
• Explain the terms (i) Errors of the first and second kind; (ii) Critical region; (iii) Power of a test; and (iv) Significance level in test of hypothesis.
• Explain with illustrative examples the terms: two types of error, critical region and significance level.
• Explain the terms (1) Null hypothesis; (2) Level of significance; and (3) Critical region.
• Explain the terms (i) statistical hypothesis; (ii) critical region; (iii) power of a test.

Long Essay Questions

• To test the hypothesis H0: p = 1/2 against H1: p > 1/2, where p is the probability of head turning up when a coin is tossed, the coin was tossed 8 times. It was decided to reject H0 in case more than 6 heads turned up. Find the significance level of the test and its power if H1: p = 0.7.
• X1 and X2 are independent Bernoulli r.v.s. such that ...
LARGE SAMPLE TESTS

The statistical tests may be grouped into two: (a) large sample tests
and (b) small sample tests. For small sample tests the exact sampling
distribution of the test statistic will be known. In large sample tests the
normal distribution plays the key role, and the justification for it is
found in the famous central limit theorem. That is, when the sample size
is large, most of the statistics are normally, or at least approximately
normally, distributed. Let Y be a statistic satisfying the conditions of
the CLT; then the statistic given by

Z = (Y – E(Y)) / √V(Y) → N(0, 1), for large n.

Here √V(Y) is called the Standard Error of Y.

∴ Z = (Y – E(Y)) / (SE of Y) → N(0, 1)

If Z is chosen as the test statistic, the critical region for a given
significance level can be determined from normal tables. The test based
on the normal distribution is called a 'normal test' or 'Z test'.

To explain the terminology, let us consider a situation in which we
want to test the null hypothesis H0 : θ = θ0 against the two sided
alternative hypothesis H1 : θ ≠ θ0. Since it appears reasonable to accept
the null hypothesis when the estimate θ̂ is close to θ0, and to reject it
when θ̂ is much larger or much smaller than θ0, the critical region
consists of both tails of the sampling distribution of our test statistic.
In testing H0 : θ = θ0 against the one sided alternative H1 : θ < θ0, it
would be logical to let the critical region consist only of the left hand
tail of the sampling distribution of the test statistic. Likewise, in
testing H0 : θ = θ0 against the one sided alternative H1 : θ > θ0, we
reject H0 only for larger values of θ̂, and the critical region consists
only of the right tail of the sampling distribution of the test statistic.
Any test where the critical region consists of only one tail of the
sampling distribution of the test statistic is called a one tailed test;
in particular, such tests are called left tailed and right tailed tests
respectively.

Best critical regions of Z-test

To test H0 : θ = θ0 against

H1 : θ < θ0
H1 : θ > θ0
H1 : θ ≠ θ0

for the significance level α, the best critical regions are respectively

C : Z ≤ –Zα,  C : Z ≥ Zα  and  C : |Z| ≥ Zα/2

For example, when α = 0.05, the best critical regions are respectively

C : Z ≤ –1.645,  C : Z ≥ 1.645  and  C : |Z| ≥ 1.96
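The critical values 1.645 and 1.96 quoted above come straight from the standard normal quantile function, so the normal area table can be reproduced in code. A minimal sketch using only the Python standard library (`statistics.NormalDist`, available from Python 3.8); the helper name `z_alpha` is ours:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution, mean 0, sd 1

def z_alpha(alpha):
    """Upper-tail critical value Z_alpha, i.e. P(Z >= Z_alpha) = alpha."""
    return std_normal.inv_cdf(1 - alpha)

# One tailed and two tailed critical values for alpha = 0.05
print(round(z_alpha(0.05), 3))      # 1.645 (one tailed)
print(round(z_alpha(0.05 / 2), 2))  # 1.96  (two tailed, alpha/2 in each tail)
```

The same call with α = 0.01 or α = 0.02 yields the critical values used in the note that follows.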
When α = 0.01, the best critical regions are respectively

C : Z ≤ –2.326,  C : Z ≥ 2.326  and  C : |Z| ≥ 2.58

When α = 0.02, the best critical regions are respectively

C : Z ≤ –2.055,  C : Z ≥ 2.055  and  C : |Z| ≥ 2.326

Note:
(i) For α = 0.05, α = 0.02 or α = 0.01 we can fix the critical regions
by determining the critical values using the normal area table. (Refer
best critical regions.)
(ii) If the population is given to be normal, the test procedure is valid
even for small samples, provided σ is known.
(iii) When σ is unknown and n is large, in the statistic we have to
replace σ by its estimate s.

Testing mean of a Population

By testing the mean of a population we are actually testing the
significant difference between the population mean and the sample mean.
In other words, we are deciding whether the given sample is drawn from
the population having the mean given by H0. The test statistic is

Z = (x̄ – µ0) / (σ/√n)

Example 1

A sample of 25 items was taken from a population with standard
deviation 10 and the sample mean is found to be 65. Can it be regarded
as a sample from a normal population with µ = 60?

Solution

Given n = 25, σ = 10, x̄ = 65, µ0 = 60.
We have to test H0 : µ = 60 against H1 : µ ≠ 60.
Let α = 0.05. The best critical region is C : |Z| ≥ 1.96. The test
statistic is

Z = (x̄ – µ0) / (σ/√n) = (65 – 60) / (10/√25) = 2.5

Since Z = 2.5 > 1.96, Z lies in the critical region and H0 is rejected.
That is, the sample cannot be regarded as drawn from a normal population
with µ = 60.

Similarly, for a sample with x̄ = 191, µ0 = 200, σ = 21 and n = 49,

Z = (191 – 200) / (21/√49) = –9/3 = –3

Since Z lies in the critical region, H0 is rejected. That is, the claim
of the manufacturer is not justified.
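The one-sample Z statistic can be reproduced in a few lines of code. A minimal sketch using the figures of Example 1 (n = 25, σ = 10, x̄ = 65, µ0 = 60); the helper name `z_mean` is ours:

```python
from math import sqrt

def z_mean(xbar, mu0, sigma, n):
    """One-sample Z statistic for testing H0: mu = mu0 when sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Example 1: sample of 25 items, sigma = 10, sample mean 65, hypothesised mean 60
z = z_mean(65, 60, 10, 25)
print(z)              # 2.5
print(abs(z) > 1.96)  # True: Z falls in the critical region, so H0 is rejected
```

Comparing |Z| with 1.96, 2.326 or 2.58 then carries out the test at α = 0.05, 0.02 or 0.01 respectively.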
Testing the Equality of two population Means

By testing the equality of two population means we are actually
testing the significant difference between two sample means. In other
words, we are deciding whether the two samples have come from populations
having the same mean.

(i) To test H0 : µ1 – µ2 = 0 against H1 : µ1 – µ2 ≠ 0, the test
statistic is

Z = (x̄1 – x̄2) / √(σ1²/n1 + σ2²/n2)

When σ1² and σ2² are unknown, s1² and s2² may be substituted for σ1²
and σ2² so long as both samples are large enough to invoke the central
limit theorem.

(ii) To test H0 : µ1 – µ2 = δ against

H1 : µ1 – µ2 < δ,  H1 : µ1 – µ2 > δ  or  H1 : µ1 – µ2 ≠ δ,

the procedure is exactly the same as in the case of equality of two
population means. In this case the test statistic is given by

Z = ((x̄1 – x̄2) – δ) / √(σ1²/n1 + σ2²/n2)

For instance, to test H0 : µ1 – µ2 = 0 against H1 : µ1 – µ2 ≠ 0 at
α = 0.05, the BCR is C : |Z| ≥ 1.96; a computed value Z = 2.21 > 1.96
lies in the critical region, so we reject H0. That is, the two groups
are significantly different with reference to their mean statures.
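The two-sample Z statistic can be sketched in code as well. As an illustration this uses the factory-wage figures of Example 2 below (both samples are large, so s1 and s2 stand in for σ1 and σ2); the helper name `z_two_means` is ours:

```python
from math import sqrt

def z_two_means(x1bar, x2bar, s1, s2, n1, n2):
    """Two-sample Z statistic for H0: mu1 - mu2 = 0 with large samples."""
    return (x1bar - x2bar) / sqrt(s1**2 / n1 + s2**2 / n2)

# Factory A: n = 1000, mean 47, sd 23.  Factory B: n = 1500, mean 49, sd 30.
z = z_two_means(47, 49, 23, 30, 1000, 1500)
print(round(z, 2))     # -1.88
print(abs(z) < 2.326)  # True: Z lies in the acceptance region at alpha = 0.02
```

Subtracting δ from the numerator gives the statistic for H0 : µ1 – µ2 = δ.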
Example 2

A random sample of 1000 workers from factory A shows that the mean
wages were Rs. 47 per week with a standard deviation of Rs. 23. A random
sample of 1500 workers from factory B gives a mean wage of Rs. 49 per
week with a standard deviation of Rs. 30. Is there any significant
difference between their mean levels of wages?

Solution

Given n1 = 1000, n2 = 1500, x̄1 = 47, x̄2 = 49, s1 = 23, s2 = 30.
We have to test H0 : µ1 – µ2 = 0 against H1 : µ1 – µ2 ≠ 0.
Let α = 0.02. The BCR is C : |Z| ≥ 2.326. The test statistic is

Z = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2)
  = (47 – 49) / √(529/1000 + 900/1500) = –1.88

Since |Z| = 1.88 < 2.326, Z lies in the acceptance region and H0 is
accepted. That is, there is no significant difference between the samples.

Testing the proportion of success of a population

By testing the population proportion of success we mean the testing of
the significant difference between the population proportion of success
and the sample proportion of success.

Now let us familiarise the following notations:

p : population proportion of success (unknown)
p0 : the assumed value of p (given)
x : the number of successes
n : sample size
p̂ = x/n : the proportion of success of a sample

Suppose we want to test the null hypothesis H0 : p = p0 against one of
the alternatives H1 : p < p0, H1 : p > p0 or H1 : p ≠ p0 based on a large
sample. For the significance level α, the best critical regions are
respectively

C : Z ≤ –Zα,  C : Z ≥ Zα  and  C : |Z| ≥ Zα/2

If the computed Z lies in the acceptance region, H0 is accepted; that is,
a claimed proportion of success of 70% would be justified.
Testing the equality of two population proportions

To test H0 : p1 = p2 against one of the alternatives H1 : p1 < p2,
H1 : p1 > p2 or H1 : p1 ≠ p2, the test statistic is

Z = (p̂1 – p̂2) / √(p*q* (1/n1 + 1/n2))

where p* = (n1 p̂1 + n2 p̂2) / (n1 + n2) and q* = 1 – p*.

Example 1

Before an increase in excise duty on tea, 800 persons out of a sample
of 1000 persons were found to be tea drinkers. After the increase in
duty, 800 people were tea drinkers in a sample of 1200 people. Test
whether there is a significant decrease in the consumption of tea after
the increase in duty.

Solution

We have p̂1 = 800/1000 = 0.8 and p̂2 = 800/1200 = 0.67.
We have to test H0 : p1 = p2 against H1 : p1 > p2.
Let α = 0.05. The BCR is C : Z ≥ 1.645. The test statistic is

Z = (p̂1 – p̂2) / √(p*q* (1/n1 + 1/n2))

Now p* = (800 + 800) / (1000 + 1200) = 1600/2200 = 0.727
q* = 1 – p* = 1 – 0.727 = 0.273

Z = (0.80 – 0.67) / √(0.727 × 0.273 × (1/1000 + 1/1200)) = 6.816

Since Z = 6.816 > 1.645, Z lies in the critical region and H0 is
rejected. That is, there is a significant decrease in the consumption
of tea after the increase in duty.

Similarly, for two samples with p̂1 = 450/600 = 0.75 and
p̂2 = 450/900 = 0.50,

p* = (450 + 450) / (600 + 900) = 900/1500 = 0.6,  q* = 0.4

Z = (0.75 – 0.50) / √(0.6 × 0.4 × (1/600 + 1/900)) = 9.68

∴ Z = 9.68 > 2.58, so Z lies in the critical region and H0 is rejected.
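The two-proportion computation for the tea example can be verified numerically. A minimal sketch using the figures as rounded in the text (p̂1 = 0.80, p̂2 = 0.67); the helper name `z_two_props` is ours, and the pooled proportion p* is built from the raw counts:

```python
from math import sqrt

def z_two_props(p1, p2, n1, n2, pooled):
    """Two-sample Z statistic for H0: p1 = p2 using the pooled proportion p*."""
    q = 1 - pooled
    return (p1 - p2) / sqrt(pooled * q * (1 / n1 + 1 / n2))

# Tea example: 800/1000 = 0.80 before the duty increase, 800/1200 = 0.67 after
p_star = (800 + 800) / (1000 + 1200)   # pooled proportion 1600/2200 = 0.727...
z = z_two_props(0.80, 0.67, 1000, 1200, p_star)
print(round(z, 2))   # 6.82
print(z > 1.645)     # True: reject H0, a significant decrease in tea drinking
```

Note that keeping p̂2 unrounded (800/1200 = 0.6667) would give a slightly larger Z; the conclusion is unchanged.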
Fill in the blanks

• A test based on the outcome of tossing of a coin is a ..................... test.
• When σ is known, the hypothesis about population mean is tested
by ................
• If the sample drawn from a population is large, then the hypothesis
about µ can be tested by ................
• A large population of heights of persons is distributed with mean
66 inches and SD = 10 inches. A sample of 400 persons had the mean
height = 62 inches. The data .......... the hypothesis H0 : µ = 66 inches.

Exercises

• An electrical firm manufactures light bulbs that have a length of
life that is approximately normally distributed with a mean of 800
hours and a standard deviation of 40 hours. Test H0 : µ = 800 hours
against the alternative H1 : µ ≠ 800 hours if a random sample of 30
bulbs has an average life of 788 hours.
• A random sample of 36 drinks from a soft-drink machine has an
average content of 7.4 ounces with a standard deviation of 0.48
ounces. Test the hypothesis H0 : µ = 7.5 ounces against H1 : µ < 7.5
at α = 0.05.