Lecture 07
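\[
P\left\{ \frac{X_1 + X_2 + \dots + X_n - n\mu}{\sigma\sqrt{n}} \le x \right\} \approx P\{Y \le x\}
\]
where Y is a standard normal random variable, i.e.,
\[
P\left\{ \frac{X_1 + X_2 + \dots + X_n - n\mu}{\sigma\sqrt{n}} \le x \right\} \approx \Phi(x)
\]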
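The approximation above can be checked exactly for a simple case. The sketch below is an illustration only: the choice of fair six-sided dice (mean 3.5, variance 35/12) and the helper names `dice_sum_pmf` and `standardized_cdf` are assumptions, not part of the lecture. It builds the exact distribution of the sum of n dice by repeated convolution and compares the standardized CDF with $\Phi(x)$.

```python
from math import sqrt
from statistics import NormalDist


def dice_sum_pmf(n):
    """Exact pmf of the sum of n fair six-sided dice, as {total: probability}."""
    pmf = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / 6
        pmf = nxt
    return pmf


def standardized_cdf(n, x):
    """P{(X1 + ... + Xn - n*mu) / (sigma*sqrt(n)) <= x} for fair dice."""
    mu, sigma = 3.5, sqrt(35 / 12)
    cutoff = n * mu + x * sigma * sqrt(n)
    return sum(p for total, p in dice_sum_pmf(n).items() if total <= cutoff)


phi = NormalDist().cdf
for x in (-1.0, 0.0, 1.0, 2.0):
    # exact standardized CDF for n = 30 next to the normal CDF
    print(x, round(standardized_cdf(30, x), 4), round(phi(x), 4))
```

Because the sum of dice only takes integer values, a small gap remains at any finite n; it shrinks as n grows.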
Central Limit Theorem (contd.)
Problem 1:
An insurance company has 25000 car insurance policy holders. If
the yearly claim of a policy holder is a random variable with
mean 320 and standard deviation 540, approximate the
probability that the total yearly claim exceeds 8.3 million.
Let X represent the total yearly claim.
Let $X_i$ denote the claim of policy holder i.
Then each $X_i$ can be treated as an independent random variable
having $\mu = 320$ and $\sigma = 540$, and
\[
X = \sum_{i=1}^{25000} X_i
\]
For large n = 25000, using the central limit theorem, X may be
considered to be normal with mean $n\mu$ and standard
deviation $\sigma\sqrt{n}$.
Central Limit Theorem (contd.)
Problem 1 (contd.):
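Hence, with the following transformation,
\[
Y = \frac{X - 320 \times 25000}{540\sqrt{25000}} = \frac{X - 8 \times 10^6}{85381.5}
\]
is standard normal.
\[
\begin{aligned}
P\{X > 8.3 \times 10^6\} &= 1 - P\{X \le 8.3 \times 10^6\} \\
&= 1 - P\left\{ \frac{X - 8 \times 10^6}{85381.5} \le \frac{8.3 \times 10^6 - 8 \times 10^6}{85381.5} \right\} \\
&= 1 - P\{Y \le 3.51\} = 1 - \Phi(3.51) \\
&= 1 - 0.9998 = 0.0002
\end{aligned}
\]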
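The same calculation can be reproduced numerically. This is a minimal sketch using Python's standard `statistics.NormalDist`; the function name `total_claim_tail` is a hypothetical helper introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def total_claim_tail(n, mu, sigma, threshold):
    """Approximate P{X1 + ... + Xn > threshold} via the central limit theorem."""
    # The total is treated as normal with mean n*mu and std dev sigma*sqrt(n).
    total = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
    return 1.0 - total.cdf(threshold)


p = total_claim_tail(n=25000, mu=320, sigma=540, threshold=8.3e6)
print(f"{p:.4f}")
```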
Central Limit Theorem (contd.)
Distribution of Sample Mean
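Let $X_1, X_2, \dots, X_n$ be a sample taken from a population, each
having mean $\mu$ and variance $\sigma^2$.
If we define the sample mean $\bar{X} = (X_1 + X_2 + \dots + X_n)/n$, then $\bar{X}$ has
mean $\mu$ and variance $\sigma^2/n$.
For large n, by the Central Limit Theorem, $X_1 + X_2 + \dots + X_n$ is
approximately normal, hence $\bar{X}$ is also approximately normal,
\[
\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)
\]
Thus, the transformation to the standard normal distribution is given by
\[
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)
\]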
Central Limit Theorem (contd.)
Problem 2a:
The weights of population of workers have mean 167 and
standard deviation 27. If a sample of 36 workers is chosen,
approximate the probability that the sample mean of their
weights lies between 163 and 170.
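Let $\bar{X}$ represent the sample mean of the weights of the workers.
By the central limit theorem it is approximately normal with mean 167 and standard
deviation $27/\sqrt{36} = 4.5$.
The required probability is given by:
\[
\begin{aligned}
P\{163 < \bar{X} < 170\} &= P\left\{ \frac{163 - 167}{4.5} < \frac{\bar{X} - 167}{4.5} < \frac{170 - 167}{4.5} \right\} \\
&= P\{-0.8889 < Y < 0.6667\} \approx \Phi(0.6667) - \Phi(-0.8889) \\
&= \Phi(0.6667) - 1 + \Phi(0.8889) \approx 0.7475 - 1 + 0.8130 \approx 0.5605
\end{aligned}
\]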
Central Limit Theorem (contd.)
Problem 2b:
The weights of population of workers have mean 167 and
standard deviation 27. If a sample of 144 workers is chosen,
approximate the probability that the sample mean of their
weights lies between 163 and 170.
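Let $\bar{X}$ represent the sample mean of the weights of the workers.
By the central limit theorem it is approximately normal with mean 167 and standard
deviation $27/\sqrt{144} = 2.25$. The required probability is given by:
\[
\begin{aligned}
P\{163 < \bar{X} < 170\} &= P\left\{ \frac{163 - 167}{2.25} < \frac{\bar{X} - 167}{2.25} < \frac{170 - 167}{2.25} \right\} \\
&= P\{-1.7778 < Y < 1.3333\} \approx \Phi(1.3333) - \Phi(-1.7778) \\
&\approx 0.9088 - 0.0377 \approx 0.8711
\end{aligned}
\]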
Hence, the probability that the sample mean lies in a fixed interval containing the population mean increases as the sample size increases.
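The effect of the sample size can be explored with a short sketch. It assumes the normal approximation $\bar{X} \sim N(\mu, \sigma^2/n)$ stated earlier; `sample_mean_between` is a hypothetical helper name, not something defined in the lecture.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_between(mu, sigma, n, low, high):
    """Approximate P{low < sample mean < high} using N(mu, sigma^2 / n)."""
    xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return xbar.cdf(high) - xbar.cdf(low)


# Workers' weights: mean 167, standard deviation 27, interval (163, 170).
for n in (36, 144):
    print(n, round(sample_mean_between(167, 27, n, 163, 170), 4))
```

Since the standard deviation of the sample mean shrinks like $1/\sqrt{n}$, quadrupling the sample size halves it, which concentrates more probability inside the same interval.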
Central Limit Theorem (contd.)
Problem 3:
An astronomer wants to measure the distance of a star. His
measurements are subject to atmospheric disturbances. His
measurements are independent random variables with a mean
of d light years and a standard deviation of 2 light years. How
many measurements are needed to be 95% certain that his
estimate is accurate to within 0.5 light years?
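One way to work this problem, sketched here under the assumption that the estimate is the sample mean $\bar{X}$ of n measurements and that the CLT approximation applies: we need $P\{|\bar{X} - d| \le 0.5\} \approx 2\Phi(0.5\sqrt{n}/2) - 1 \ge 0.95$, i.e. $\sqrt{n}/4 \ge 1.96$. The helper name `measurements_needed` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def measurements_needed(sigma, tolerance, confidence):
    """Smallest n with P{|sample mean - d| <= tolerance} >= confidence (CLT approx)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Require tolerance * sqrt(n) / sigma >= z and round up to an integer.
    return ceil((z * sigma / tolerance) ** 2)


print(measurements_needed(sigma=2, tolerance=0.5, confidence=0.95))  # 62
```

Under these assumptions, $n \ge (1.96 \times 4)^2 \approx 61.47$, so 62 measurements are needed.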
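Sample Variance
For a sample $X_1, X_2, \dots, X_n$ from a population with mean $\mu$ and variance $\sigma^2$, the sample mean is
\[
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad E[\bar{X}] = \mu, \qquad \operatorname{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^2}{n}
\]
The sample variance is defined as
\[
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
\]
and it satisfies
\[
E[S^2] = \sigma^2
\]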
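The role of the divisor n − 1 can be verified exactly on a small case. The sketch below is illustrative only: it assumes samples of size 3 drawn with replacement from a fair die (so every sample is equally likely), and `sample_variance` is a hypothetical helper. Averaging $S^2$ over all samples recovers $\sigma^2$, while dividing by n instead gives a smaller value.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs, ddof=1):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / (n - ddof)


faces = range(1, 7)                        # fair die: mu = 7/2, sigma^2 = 35/12
sigma2 = sample_variance(list(faces), ddof=0)
samples = list(product(faces, repeat=3))   # all 216 equally likely samples

mean_s2 = sum(sample_variance(s) for s in samples) / len(samples)
mean_biased = sum(sample_variance(s, ddof=0) for s in samples) / len(samples)
print(sigma2, mean_s2, mean_biased)        # 35/12 35/12 35/18
```

Exact fractions are used so the equality $E[S^2] = \sigma^2$ holds without rounding error.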
Sample Variance (contd.)
Problem 1:
The time taken by a processing unit to process a certain type of
job is normally distributed with mean 20 sec and standard
deviation 3 sec. If a sample of 15 such jobs is observed what is
the probability that the sample variance will exceed 12?
The population has mean $\mu = 20$ and variance $\sigma^2 = 3^2 = 9$.
The sample is of size n = 15. If $S^2$ is the sample variance of a sample of size n
from a normal population $N(\mu, \sigma^2)$, then $(n-1)S^2/\sigma^2$ has a chi-square
distribution with dof = n − 1.
Hence, $14S^2/9$ has a chi-square distribution with dof = 14.
Now, the required probability is
\[
P\{S^2 > 12\} = P\left\{ \frac{14}{9} S^2 > \frac{14}{9} \times 12 \right\} = P\{\chi^2_{14} > 18.67\} = 0.1779
\]
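The tail probability can also be computed without a table. For an even number of degrees of freedom 2m, the chi-square tail has the closed form $P\{\chi^2_{2m} > x\} = e^{-x/2} \sum_{j=0}^{m-1} (x/2)^j / j!$. The sketch below relies on that identity, so it only covers even dof; `chi2_sf_even` is a hypothetical helper name.

```python
from math import exp


def chi2_sf_even(x, dof):
    """P{chi-square with `dof` degrees of freedom > x}, valid for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("the closed form used here needs a positive even dof")
    half = x / 2
    term, total = 1.0, 0.0
    for j in range(dof // 2):
        total += term              # term equals half**j / j!
        term *= half / (j + 1)
    return exp(-half) * total


n, sigma2, s2_limit = 15, 9, 12
p = chi2_sf_even((n - 1) * s2_limit / sigma2, n - 1)
print(round(p, 4))
```

Evaluated at the unrounded argument 14 × 12 / 9, the result is close to 0.178, in line with the value quoted from the chi-square distribution.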
Table: Chi-Square Probabilities
The table lists values of $\chi^2_{\alpha,n}$ for different values of n (DOF) and $\alpha$, where $\alpha = P\{X \ge \chi^2_{\alpha,n}\}$ for a chi-square random variable X with n degrees of freedom. Column headings are values of $\alpha$.

DOF   0.995    0.99   0.975    0.95    0.90    0.10    0.05   0.025    0.01   0.005
  1     ---     ---   0.001   0.004   0.016   2.706   3.841   5.024   6.635   7.879
  2   0.010   0.020   0.051   0.103   0.211   4.605   5.991   7.378   9.210  10.597
  3   0.072   0.115   0.216   0.352   0.584   6.251   7.815   9.348  11.345  12.838
  4   0.207   0.297   0.484   0.711   1.064   7.779   9.488  11.143  13.277  14.860
  5   0.412   0.554   0.831   1.145   1.610   9.236  11.070  12.833  15.086  16.750
  6   0.676   0.872   1.237   1.635   2.204  10.645  12.592  14.449  16.812  18.548
  7   0.989   1.239   1.690   2.167   2.833  12.017  14.067  16.013  18.475  20.278
  8   1.344   1.646   2.180   2.733   3.490  13.362  15.507  17.535  20.090  21.955
  9   1.735   2.088   2.700   3.325   4.168  14.684  16.919  19.023  21.666  23.589
 10   2.156   2.558   3.247   3.940   4.865  15.987  18.307  20.483  23.209  25.188
 11   2.603   3.053   3.816   4.575   5.578  17.275  19.675  21.920  24.725  26.757
 12   3.074   3.571   4.404   5.226   6.304  18.549  21.026  23.337  26.217  28.300
 13   3.565   4.107   5.009   5.892   7.042  19.812  22.362  24.736  27.688  29.819
 14   4.075   4.660   5.629   6.571   7.790  21.064  23.685  26.119  29.141  31.319
 15   4.601   5.229   6.262   7.261   8.547  22.307  24.996  27.488  30.578  32.801
 16   5.142   5.812   6.908   7.962   9.312  23.542  26.296  28.845  32.000  34.267
 17   5.697   6.408   7.564   8.672  10.085  24.769  27.587  30.191  33.409  35.718
 18   6.265   7.015   8.231   9.390  10.865  25.989  28.869  31.526  34.805  37.156
 19   6.844   7.633   8.907  10.117  11.651  27.204  30.144  32.852  36.191  38.582
 20   7.434   8.260   9.591  10.851  12.443  28.412  31.410  34.170  37.566  39.997
 21   8.034   8.897  10.283  11.591  13.240  29.615  32.671  35.479  38.932  41.401
 22   8.643   9.542  10.982  12.338  14.041  30.813  33.924  36.781  40.289  42.796
 23   9.260  10.196  11.689  13.091  14.848  32.007  35.172  38.076  41.638  44.181
 24   9.886  10.856  12.401  13.848  15.659  33.196  36.415  39.364  42.980  45.559
 25  10.520  11.524  13.120  14.611  16.473  34.382  37.652  40.646  44.314  46.928
Sampling from Finite Population
Consider a population of N elements. Let p be the proportion of
elements with a certain characteristic. Let a sample of size n be
selected such that each of the $\binom{N}{n}$ population subsets is equally
likely to be the sample. Such a sample is termed a
random sample.
Let $X_i = 1$ if the ith element of the sample has the characteristic,
and $X_i = 0$ otherwise.
Then $X = X_1 + X_2 + \dots + X_n$ represents the number of elements in the sample
with the characteristic.
If N is large compared to n, then $X_1, X_2, \dots, X_n$ are
approximately independent. The
number of elements in the sample that possess the characteristic is then
approximately a binomial random variable with parameters n
and p.
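The quality of the binomial approximation can be checked numerically. When sampling without replacement, the exact distribution of X is hypergeometric; the sketch below compares it with the binomial pmf for growing N. The helper names (`hypergeom_pmf`, `binom_pmf`, `max_gap`) and the example values p = 0.3, n = 10 are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P{X = k}: k marked elements in a sample of n drawn without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_gap(N, p, n):
    """Largest pointwise difference between the exact and binomial pmfs."""
    K = round(N * p)  # number of population elements with the characteristic
    return max(
        abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
        for k in range(n + 1)
    )


for N in (20, 200, 20000):
    print(N, max_gap(N, p=0.3, n=10))
```

The gap shrinks as N grows relative to n, which is the sense in which the indicators $X_i$ become approximately independent.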
Quality Criteria for Estimates
These criteria define generally desirable properties for an
estimate.
The objective in parameter estimation is to determine a
statistical estimator
\[
\hat{\theta} = h(X_1, X_2, \dots, X_n)
\]
which gives a good estimate of the parameter $\theta$, and for which
properties such as mean, variance, and distribution provide a
measure of the quality of this estimator. Once sample values $x_1,
x_2, \dots, x_n$ are observed, the observed estimator
\[
\hat{\theta} = h(x_1, x_2, \dots, x_n)
\]
has a numerical value and is called an estimate of the
parameter $\theta$.
Quality Criteria for Estimates (contd.)
Unbiasedness
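An estimator $\hat{\theta}$ is said to be an unbiased estimator for $\theta$ if
\[
E(\hat{\theta}) = \theta
\]
for all $\theta$.
Minimum Variance
An unbiased estimator $\hat{\theta}$ is a minimum variance estimator for $\theta$ if, for every other unbiased estimator $\theta^*$,
\[
\operatorname{var}(\hat{\theta}) \le \operatorname{var}(\theta^*)
\]
for all $\theta$.
Example: The sample mean $\bar{X}$ is an unbiased estimator for the population mean $\mu$,
regardless of the sample size n. However, its variance ($\sigma^2/n$) decreases as n
increases. Thus, based on the minimum variance criterion, the quality of $\bar{X}$ as an
estimator for $\mu$ improves as the sample size n increases.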
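Both properties of the sample mean in this example can be confirmed by exact enumeration. This sketch is illustrative only: it assumes samples drawn with replacement from a fair die, and `xbar_moments` is a hypothetical helper. The mean of $\bar{X}$ stays at $\mu$ for every n, while its variance falls as $\sigma^2/n$.

```python
from fractions import Fraction
from itertools import product


def xbar_moments(values, n):
    """Exact mean and variance of the sample mean over all equally likely samples."""
    means = [Fraction(sum(s), n) for s in product(values, repeat=n)]
    expected = sum(means) / len(means)
    variance = sum((m - expected) ** 2 for m in means) / len(means)
    return expected, variance


faces = range(1, 7)  # fair die: mu = 7/2, sigma^2 = 35/12
for n in (1, 2, 4):
    # prints n, E[sample mean], Var(sample mean)
    print(n, *xbar_moments(faces, n))
```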
Quality Criteria for Estimates (contd.)
Consistency
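An estimator $\hat{\theta}$ is a consistent estimator for $\theta$ if, as the sample
size n increases,
\[
\lim_{n \to \infty} P\{|\hat{\theta} - \theta| \ge \varepsilon\} = 0
\]
for every $\varepsilon > 0$.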
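As a worked illustration (an addition, not taken from the slides), the sample mean $\bar{X}$ can be shown to be a consistent estimator for $\mu$ using Chebyshev's inequality, assuming the population variance $\sigma^2$ is finite. For any $\varepsilon > 0$:

```latex
\[
P\{|\bar{X} - \mu| \ge \varepsilon\}
  \le \frac{\operatorname{Var}(\bar{X})}{\varepsilon^2}
  = \frac{\sigma^2}{n\,\varepsilon^2}
  \longrightarrow 0 \quad \text{as } n \to \infty .
\]
```

The bound uses only $E[\bar{X}] = \mu$ and $\operatorname{Var}(\bar{X}) = \sigma^2/n$, so no normality assumption is needed.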