
P & S (25)

Uploaded by

farooqshahzad873
Copyright
© © All Rights Reserved
Lecture No.

Probability & Statistics


IN THE LAST LECTURE,
YOU LEARNT

Sampling Distribution of $\bar{X}$
Mean and Standard Deviation of the Sampling Distribution of $\bar{X}$
Central Limit Theorem
TOPICS FOR TODAY

Sampling Distribution of $\hat{p}$
Sampling Distribution of $\bar{X}_1 - \bar{X}_2$
EXAMPLE: A construction company has 310
employees who have an average annual salary
of Rs.24,000 and standard deviation Rs.5,000.
Suppose that the employees of this
company launch a demand that the government
should institute a law by which their average
salary should be at least Rs. 24500, and,
suppose that the government decides to check
the validity of this demand by drawing a
random sample of 100 employees of this
company, and acquiring information regarding
their present salaries.
What is the probability that, in a random
sample of 100 employees, the average salary
will exceed Rs.24,500, so that the government
decides that the employees' demand is
unfounded and pays no attention to it
(although, in reality, it was justified)?
SOLUTION
The sample size (n = 100) is large enough
to assume that the sampling distribution of $\bar{X}$ is
approximately normal with the
following mean and standard deviation:
$$\mu_{\bar{x}} = \mu = \text{Rs. } 24{,}000,$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}} = \frac{5000}{\sqrt{100}} \cdot \sqrt{\frac{310-100}{310-1}} = \text{Rs. } 412.20.$$
NOTE: Here we have used the finite population
correction factor (fpc), because the sample size
n = 100 is greater than 5 percent of the
population size N = 310.
Since $\bar{X}$ is approximately N(24000,
412.20), therefore
$$Z = \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{X} - 24000}{412.20}$$
is approximately N(0, 1).
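We are required to evaluate $P(\bar{X} > 24{,}500)$. At $\bar{x} = 24{,}500$,
$$z = \frac{24500 - 24000}{412.20} = 1.21.$$
[Figure: normal curve with the $\bar{X}$ axis marked at 24000 and 24500 and the Z axis marked at 0 and 1.21; area 0.3869 between the two marks, area 0.1131 in the right tail.]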
Using the table of areas under the standard
normal curve, we find that the area between z =
0 and z = 1.21 is 0.3869. Hence,
$$\begin{aligned}
P(\bar{X} > 24{,}500) &= P(Z > 1.21) \\
&= 0.5 - P(0 < Z < 1.21) = 0.5 - 0.3869 \\
&= 0.1131.
\end{aligned}$$
Hence, the chances are only 11% that in a
random sample of 100 employees from this
particular construction company, the average
salary will exceed Rs.24,500.
In other words, the chances are 89% that,
in such a sample, the average salary will not
exceed Rs.24,500.
Hence, the chances are considerably high
that the government might pay attention to the
employees' demand.
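The arithmetic of this example can be checked with a short Python sketch. The helper names `std_normal_cdf` and `se_mean_fpc` are illustrative and not part of the lecture, and the exact normal CDF is used in place of the printed area table, so the result differs slightly from the table value of 0.1131 (the lecture rounds z to 1.21 before the lookup).

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def se_mean_fpc(sigma, n, N):
    """Standard error of the sample mean with the finite population correction."""
    return sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

# Values taken from the example: population of 310 employees, sample of 100.
mu, sigma, N, n = 24000, 5000, 310, 100

se = se_mean_fpc(sigma, n, N)          # about 412.2
z = (24500 - mu) / se                  # about 1.21
p_exceed = 1.0 - std_normal_cdf(z)     # about 0.11

print(f"standard error = {se:.2f}")
print(f"z = {z:.2f}")
print(f"P(sample mean > 24500) = {p_exceed:.4f}")
```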
Next, we consider the SAMPLING
DISTRIBUTION OF THE SAMPLE
PROPORTION:
In this regard, the first point to be noted is
that, whenever the elements of a population can
be classified into two categories, technically
called “success” and “failure”, we may be
interested in the proportion of “successes”
in the population.
If X denotes the number of successes in
the population, then the proportion of successes
in the population is given by
$$p = \frac{X}{N}.$$
Similarly, if we draw a sample of size n
from the population, the proportion of successes
in the sample is given by
$$\hat{p} = \frac{X}{n},$$
where X represents the number of successes in
the sample.
It is interesting to note that X is a binomial
random variable and the binomial parameter p
is being called a proportion of successes here.
The sample proportion p̂ has different
values in different samples. It is obviously a
random variable and has a probability
distribution.
This probability distribution of the
proportions of successes in all possible random
samples of size n, is called the sampling
distribution of p̂.
EXAMPLE-1: A population consists of six
values 1, 3, 6, 8, 9 and 12. Draw all possible
samples of size n = 3 without replacement from
the population and find the proportion of even
numbers in each sample.
Construct the sampling distribution of
sample proportions and verify that
i) $\mu_{\hat{p}} = p$
ii) $\mathrm{Var}(\hat{p}) = \dfrac{pq}{n} \cdot \dfrac{N-n}{N-1}$.
SOLUTION
The number of possible samples of size n = 3
that could be selected without replacement
from a population of size N = 6 is $\binom{6}{3} = 20$.
Let $\hat{p}$ represent the proportion of even numbers
in the sample.
Then the 20 possible samples and the
proportion of even numbers are given as
follows:
| Sample No. | Sample Data | Sample Proportion $\hat{p}$ |
|---|---|---|
| 1 | 1, 3, 6 | 1/3 |
| 2 | 1, 3, 8 | 1/3 |
| 3 | 1, 3, 9 | 0 |
| 4 | 1, 3, 12 | 1/3 |
| 5 | 1, 6, 8 | 2/3 |
| 6 | 1, 6, 9 | 1/3 |
| 7 | 1, 6, 12 | 2/3 |
| 8 | 1, 8, 9 | 1/3 |
| 9 | 1, 8, 12 | 2/3 |
| 10 | 1, 9, 12 | 1/3 |
| 11 | 3, 6, 8 | 2/3 |
| 12 | 3, 6, 9 | 1/3 |
| 13 | 3, 6, 12 | 2/3 |
| 14 | 3, 8, 9 | 1/3 |
| 15 | 3, 8, 12 | 2/3 |
| 16 | 3, 9, 12 | 1/3 |
| 17 | 6, 8, 9 | 2/3 |
| 18 | 6, 8, 12 | 1 |
| 19 | 6, 9, 12 | 2/3 |
| 20 | 8, 9, 12 | 2/3 |

The sampling distribution of the sample proportion is given below:
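Sampling Distribution of $\hat{p}$:

| $\hat{p}$ | No. of Samples | Probability $f(\hat{p})$ | $\hat{p}\, f(\hat{p})$ | $\hat{p}^2 f(\hat{p})$ |
|---|---|---|---|---|
| 0 | 1 | 1/20 | 0 | 0 |
| 1/3 | 9 | 9/20 | 3/20 | 1/20 |
| 2/3 | 9 | 9/20 | 6/20 | 4/20 |
| 1 | 1 | 1/20 | 1/20 | 1/20 |
| $\sum$ | 20 | 1 | 10/20 | 6/20 |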
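Now
$$\mu_{\hat{p}} = \sum \hat{p}\, f(\hat{p}) = \frac{10}{20} = 0.5,$$
and
$$\sigma_{\hat{p}}^2 = \sum \hat{p}^2 f(\hat{p}) - \left[\sum \hat{p}\, f(\hat{p})\right]^2 = \frac{6}{20} - \left(\frac{10}{20}\right)^2 = \frac{1}{20} = 0.05.$$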
To verify the given relations, we first
calculate the population proportion p.
Thus:
$$p = \frac{X}{N},$$
where X represents the number of even
numbers in the population.
In other words,
$$p = \frac{3}{6} = 0.5.$$
Hence, we find that
$$\mu_{\hat{p}} = 0.5 = p,$$
and
$$\frac{pq}{n} \cdot \frac{N-n}{N-1} = \frac{(0.5)(0.5)}{3} \cdot \frac{6-3}{6-1} = \frac{0.25}{5} = 0.05 = \mathrm{Var}(\hat{p}).$$
Hence, two properties of the sampling
distribution of $\hat{p}$ are verified.
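The enumeration in Example-1 can be reproduced with a few lines of Python. This is a minimal sketch using exact fractions; the variable names are illustrative, not taken from the lecture.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 3, 6, 8, 9, 12]
n = 3
N = len(population)

# Proportion of even numbers in every sample of size n drawn without replacement.
p_hats = [Fraction(sum(v % 2 == 0 for v in sample), n)
          for sample in combinations(population, n)]

# Mean and variance of the sampling distribution of p-hat.
mean_p_hat = sum(p_hats) / len(p_hats)
var_p_hat = sum(ph ** 2 for ph in p_hats) / len(p_hats) - mean_p_hat ** 2

# Population proportion and the variance formula with the finite population correction.
p = Fraction(sum(v % 2 == 0 for v in population), N)
q = 1 - p
var_formula = p * q / n * Fraction(N - n, N - 1)

print(len(p_hats), mean_p_hat, var_p_hat, var_formula)
```

Both the enumerated variance and the formula evaluate to 1/20, and the enumerated mean equals p = 1/2, in agreement with the verification above.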
PROPERTIES OF THE SAMPLING
DISTRIBUTION OF p̂
Property No. 1:
The mean of the sampling distribution of
proportions, denoted by $\mu_{\hat{p}}$, is equal to the
population proportion p, that is, $\mu_{\hat{p}} = p$.
Property No. 2:
The standard deviation (standard error) of
$\hat{p}$, denoted by $\sigma_{\hat{p}}$, is given as:
a) $\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$,
when the sampling is performed with
replacement;
b) $\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}} \sqrt{\dfrac{N-n}{N-1}}$,
when sampling is done without replacement
from a finite population.
(As in the case of the sampling distribution
of $\bar{X}$, $\sqrt{\dfrac{N-n}{N-1}}$ is known as the finite population
correction factor (fpc).)
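Property 3: SHAPE OF THE DISTRIBUTION
The sampling distribution of $\hat{p}$ is the binomial
distribution. However, for sufficiently large
sample sizes, this sampling distribution is
approximately normal.
[Figure: approximately normal curve of $\hat{p}$, centred at $\mu_{\hat{p}} = p$, with standard error $\sigma_{\hat{p}} = \sqrt{pq/n}$.]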
As a rule of thumb, the sampling distribution of
p̂ will be approximately normal whenever both
np and nq are equal to or greater than 5.
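The standard error formulas and the rule of thumb can be written as two small helper functions. This is only a sketch: the function names and the `threshold` parameter are assumptions made for illustration, and the values n = 40 and n = 60 are hypothetical.

```python
import math

def se_proportion(p, n, N=None):
    """Standard error of p-hat; pass N to apply the finite population correction."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

def normal_approx_ok(p, n, threshold=5):
    """Rule of thumb: both np and nq should be at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Example-1 values (p = 0.5, n = 3, N = 6): the variance is 0.05, so the
# standard error is sqrt(0.05), and the sample is far too small for the rule.
print(se_proportion(0.5, 3, N=6) ** 2)
print(normal_approx_ok(0.5, 3))

# Hypothetical values: with p = 0.1, n = 40 gives np = 4, n = 60 gives np = 6.
print(normal_approx_ok(0.1, 40), normal_approx_ok(0.1, 60))
```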
Let us apply this concept to
a real-world situation:
EXAMPLE-2: Ten percent of the 1-kilogram
boxes of sugar in a large warehouse are
underweight. Suppose a retailer buys a random
sample of 144 of these boxes. What is the
probability that at least 5 percent of the sample
boxes will be underweight?
SOLUTION: Here the statistic is the sample
proportion.
The sample size (n = 144) is large enough
to assume that the sample proportion is
approximately normally distributed with the
following mean and standard error.
Mean of the sampling distribution of $\hat{p}$:
$$\mu_{\hat{p}} = p = 0.10,$$
and
Standard error of $\hat{p}$:
$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.10)(0.90)}{144}} = \frac{0.3}{12} = 0.025.$$
Therefore, the sampling distribution of $\hat{p}$ is
approximately N(0.10, 0.025). Hence,
$$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{\hat{p} - 0.10}{0.025}$$
is approximately N(0, 1).
We are required to find the probability that
the proportion of underweight boxes in the
sample is equal to or greater than 5%, i.e., we
require $P(\hat{p} \ge 0.05)$.
In this regard, a very important point to be
noted is that, just as we use a continuity
correction of $\pm \frac{1}{2}$ whenever we consider the
normal approximation to the binomially
distributed random variable X, in this situation,
since $\hat{p} = \dfrac{X}{n}$, we need to use a continuity
correction of $\pm \dfrac{1}{2n}$ in the case of the sampling
distribution of $\hat{p}$.
Applying the continuity correction
in this problem, we have:
$$\begin{aligned}
P(\hat{p} \ge 0.05) &= P\left(\hat{p} \ge 0.05 - \frac{1}{2 \times 144}\right) \\
&= P\left(\hat{p} \ge 0.05 - \frac{1}{288}\right) \\
&= P\left(\frac{\hat{p} - 0.10}{0.025} \ge \frac{0.05 - 1/288 - 0.10}{0.025}\right) \\
&= P(Z \ge -2.14) \\
&= P(-2.14 \le Z \le 0) + P(0 \le Z < \infty) \\
&= 0.4838 + 0.5 = 0.9838
\end{aligned}$$
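(using the area table of the standard normal
distribution).
[Figure: standard normal curve with the $\hat{p}$ axis marked at 0.10 and the Z axis marked at -2.14 and 0; area 0.4838 between z = -2.14 and z = 0, area 0.5 to the right of z = 0.]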
Hence, the probability that at least 5% of
the sample boxes are underweight is as high
as 98%!
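The calculation for the sugar boxes, including the continuity correction of 1/(2n), can be sketched in Python as follows. The helper `std_normal_cdf` is an illustrative name, and the exact normal CDF replaces the printed area table, so the result agrees with 0.9838 only up to rounding.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p, n = 0.10, 144                    # population proportion and sample size
se = math.sqrt(p * (1 - p) / n)     # standard error of p-hat, 0.025

# "At least 5%" with the continuity correction: lower the cutoff by 1/(2n).
cutoff = 0.05 - 1 / (2 * n)
z = (cutoff - p) / se               # about -2.14
prob = 1.0 - std_normal_cdf(z)      # P(Z >= z), about 0.98

print(f"standard error = {se:.3f}")
print(f"z = {z:.2f}")
print(f"P(p-hat >= 0.05) = {prob:.4f}")
```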
The sampling distributions of X and p̂
pertain to the situation when we are drawing all
possible samples of a particular size from one
particular population.
Next, we will discuss the case when we
are dealing with all possible samples drawn
from two populations, such that the samples
from the two populations are independent.
In this regard, we will consider the
sampling distributions of X1  X 2 and p̂1  p̂ 2 :
SAMPLING DISTRIBUTION OF
DIFFERENCES BETWEEN MEANS:

Suppose we have two distinct
populations with means $\mu_1$ and $\mu_2$ and
variances $\sigma_1^2$ and $\sigma_2^2$
respectively.
Let independent random samples of sizes
$n_1$ and $n_2$ be selected from the respective
populations, and the differences $\bar{x}_1 - \bar{x}_2$
between the means of all possible pairs of
samples be computed.
Then, a probability distribution of the
differences $\bar{X}_1 - \bar{X}_2$ can be obtained.
Such a distribution is called the
sampling distribution of the
differences of sample means $\bar{X}_1 - \bar{X}_2$.
We illustrate the sampling distribution of $\bar{X}_1 - \bar{X}_2$
with the help of the following example:
EXAMPLE: Draw all possible random
samples of size $n_1 = 2$ with replacement from a
finite population consisting of 4, 6, 8.
Similarly, draw all possible random
samples of size $n_2 = 2$ with replacement from
another finite population consisting of 1, 2, 3.
a) Find the possible differences between the
sample means of the two populations.
b) Construct the sampling distribution of
$\bar{X}_1 - \bar{X}_2$ and compute its mean and variance.
c) Verify that
$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \quad \text{and} \quad \sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
Solution: Whenever we are sampling with
replacement from a finite population, the total
number of possible samples is $N^n$.
Hence, there are $3^2 = 9$ possible samples
which can be drawn with replacement from
each population. These are given below:
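| Sample No. | Sample Values (Population 1) | $\bar{x}_1$ | Sample No. | Sample Values (Population 2) | $\bar{x}_2$ |
|---|---|---|---|---|---|
| 1 | 4, 4 | 4 | 1 | 1, 1 | 1.0 |
| 2 | 4, 6 | 5 | 2 | 1, 2 | 1.5 |
| 3 | 4, 8 | 6 | 3 | 1, 3 | 2.0 |
| 4 | 6, 4 | 5 | 4 | 2, 1 | 1.5 |
| 5 | 6, 6 | 6 | 5 | 2, 2 | 2.0 |
| 6 | 6, 8 | 7 | 6 | 2, 3 | 2.5 |
| 7 | 8, 4 | 6 | 7 | 3, 1 | 2.0 |
| 8 | 8, 6 | 7 | 8 | 3, 2 | 2.5 |
| 9 | 8, 8 | 8 | 9 | 3, 3 | 3.0 |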
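a) Since there are 9 samples from the
first population as well as 9 from the
second, there are 81 possible
combinations of $\bar{x}_1$ and $\bar{x}_2$.
The 81 differences $\bar{x}_1 - \bar{x}_2$ are shown below, with the values of $\bar{x}_1$ across the top and the values of $\bar{x}_2$ down the side:

| $\bar{x}_2$ \ $\bar{x}_1$ | 4 | 5 | 6 | 5 | 6 | 7 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| 1.0 | 3.0 | 4.0 | 5.0 | 4.0 | 5.0 | 6.0 | 5.0 | 6.0 | 7.0 |
| 1.5 | 2.5 | 3.5 | 4.5 | 3.5 | 4.5 | 5.5 | 4.5 | 5.5 | 6.5 |
| 2.0 | 2.0 | 3.0 | 4.0 | 3.0 | 4.0 | 5.0 | 4.0 | 5.0 | 6.0 |
| 1.5 | 2.5 | 3.5 | 4.5 | 3.5 | 4.5 | 5.5 | 4.5 | 5.5 | 6.5 |
| 2.0 | 2.0 | 3.0 | 4.0 | 3.0 | 4.0 | 5.0 | 4.0 | 5.0 | 6.0 |
| 2.5 | 1.5 | 2.5 | 3.5 | 2.5 | 3.5 | 4.5 | 3.5 | 4.5 | 5.5 |
| 2.0 | 2.0 | 3.0 | 4.0 | 3.0 | 4.0 | 5.0 | 4.0 | 5.0 | 6.0 |
| 2.5 | 1.5 | 2.5 | 3.5 | 2.5 | 3.5 | 4.5 | 3.5 | 4.5 | 5.5 |
| 3.0 | 1.0 | 2.0 | 3.0 | 2.0 | 3.0 | 4.0 | 3.0 | 4.0 | 5.0 |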
b) The sampling distribution of $\bar{X}_1 - \bar{X}_2$ is as follows:

| $d = \bar{x}_1 - \bar{x}_2$ | Frequency $f$ | Probability $f(d)$ | $d\, f(d)$ | $d^2 f(d)$ |
|---|---|---|---|---|
| 1.0 | 1 | 1/81 | 1/81 | 1.0/81 |
| 1.5 | 2 | 2/81 | 3/81 | 4.5/81 |
| 2.0 | 5 | 5/81 | 10/81 | 20.0/81 |
| 2.5 | 6 | 6/81 | 15/81 | 37.5/81 |
| 3.0 | 10 | 10/81 | 30/81 | 90.0/81 |
| 3.5 | 10 | 10/81 | 35/81 | 122.5/81 |
| 4.0 | 13 | 13/81 | 52/81 | 208.0/81 |
| 4.5 | 10 | 10/81 | 45/81 | 202.5/81 |
| 5.0 | 10 | 10/81 | 50/81 | 250.0/81 |
| 5.5 | 6 | 6/81 | 33/81 | 181.5/81 |
| 6.0 | 5 | 5/81 | 30/81 | 180.0/81 |
| 6.5 | 2 | 2/81 | 13/81 | 84.5/81 |
| 7.0 | 1 | 1/81 | 7/81 | 49.0/81 |
| Total | 81 | 1 | 324/81 | 1431/81 |
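Thus the mean and the variance are
$$\mu_{\bar{x}_1 - \bar{x}_2} = \sum (\bar{x}_1 - \bar{x}_2)\, f(\bar{x}_1 - \bar{x}_2) = \sum d\, f(d) = \frac{324}{81} = 4,$$
and
$$\sigma^2_{\bar{x}_1 - \bar{x}_2} = \sum d^2 f(d) - \left[\sum d\, f(d)\right]^2 = \frac{1431}{81} - \left(\frac{324}{81}\right)^2 = \frac{53}{3} - 16 = \frac{5}{3} = 1.67.$$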
c) In order to verify the properties of the
sampling distribution of $\bar{X}_1 - \bar{X}_2$, we first
need to compute the mean and variance of each
population.
The mean and variance of the
first population are:
$$\mu_1 = \frac{4 + 6 + 8}{3} = 6,$$
and
$$\sigma_1^2 = \frac{(4-6)^2 + (6-6)^2 + (8-6)^2}{3} = \frac{8}{3}.$$
The mean and variance of the second
population are:
$$\mu_2 = \frac{1 + 2 + 3}{3} = 2,$$
and
$$\sigma_2^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3}.$$
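Now
$$\mu_{\bar{x}_1 - \bar{x}_2} = 4 = 6 - 2 = \mu_1 - \mu_2,$$
and
$$\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \frac{8}{3} \cdot \frac{1}{2} + \frac{2}{3} \cdot \frac{1}{2} = \frac{4}{3} + \frac{1}{3} = \frac{5}{3} = 1.67 = \sigma^2_{\bar{x}_1 - \bar{x}_2}.$$
Hence, two properties of the
sampling distribution of $\bar{X}_1 - \bar{X}_2$
are satisfied.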
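The whole example can be checked by brute-force enumeration. The following Python sketch lists all 9 x 9 = 81 pairs of sample means and compares the enumerated mean and variance with the formulas; the helper names `sample_means` and `pop_variance` are illustrative, not from the lecture.

```python
from fractions import Fraction
from itertools import product

pop1, pop2 = [4, 6, 8], [1, 2, 3]
n1 = n2 = 2

def sample_means(pop, n):
    """Means of all ordered samples of size n drawn with replacement."""
    return [Fraction(sum(s), n) for s in product(pop, repeat=n)]

def pop_variance(pop):
    """Population variance (divisor N, not N - 1)."""
    mu = Fraction(sum(pop), len(pop))
    return sum((x - mu) ** 2 for x in pop) / len(pop)

# All 81 differences between a sample mean from population 1 and one from population 2.
diffs = [m1 - m2
         for m1 in sample_means(pop1, n1)
         for m2 in sample_means(pop2, n2)]

mean_d = sum(diffs) / len(diffs)
var_d = sum(d ** 2 for d in diffs) / len(diffs) - mean_d ** 2
var_formula = pop_variance(pop1) / n1 + pop_variance(pop2) / n2

print(len(diffs), mean_d, var_d, var_formula)
```

The enumerated mean is 4 and both variances come out as 5/3, matching parts (b) and (c).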
PROPERTIES OF THE SAMPLING
DISTRIBUTION OF $\bar{X}_1 - \bar{X}_2$
Property No. 1:
The mean of the sampling distribution of
$\bar{X}_1 - \bar{X}_2$, denoted by $\mu_{\bar{X}_1 - \bar{X}_2}$, is equal to the
difference between population means, that is,
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2.$$
Property No. 2:
In case of sampling with or without
replacement from two infinite
populations, the standard deviation of
the sampling distribution of $\bar{X}_1 - \bar{X}_2$
(i.e. the standard error of $\bar{X}_1 - \bar{X}_2$),
denoted by $\sigma_{\bar{X}_1 - \bar{X}_2}$, is given by
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$
The above expression for the standard
error of $\bar{X}_1 - \bar{X}_2$ also holds for finite
populations when sampling is performed
with replacement.
In case of sampling without replacement from a
finite population, the formula for the standard
error of $\bar{X}_1 - \bar{X}_2$ will be suitably modified.
Property No. 3:
Shape of the distribution:
a) If the POPULATIONS are
normally distributed, the sampling
distribution of $\bar{X}_1 - \bar{X}_2$, regardless of
sample sizes, will be normal with
mean $\mu_1 - \mu_2$ and variance $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$.
In other words, the variable
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
is normally distributed with
zero mean and unit variance.
b) If the POPULATIONS are non-normal
and if both sample sizes are large (i.e., greater
than or equal to 30),
then the sampling distribution of the differences
between means is approximately a normal
distribution by the Central Limit Theorem.
In this case too, the variable
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
will be approximately normally distributed.
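As a small illustration of the standardized variable Z for a difference of means, the sketch below evaluates it for one pair of samples. The function name `z_diff_means` and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the lecture.

```python
import math

def z_diff_means(xbar1, xbar2, mu1, mu2, var1, var2, n1, n2):
    """Standardize an observed difference of sample means."""
    se = math.sqrt(var1 / n1 + var2 / n2)   # standard error of the difference
    return ((xbar1 - xbar2) - (mu1 - mu2)) / se

# Hypothetical populations: means 55 and 50, variances 90 and 40,
# with samples of sizes 30 and 40 (both at least 30).
z = z_diff_means(xbar1=59, xbar2=50, mu1=55, mu2=50,
                 var1=90, var2=40, n1=30, n2=40)
print(z)  # standard error is sqrt(3 + 1) = 2, so z = (9 - 5) / 2
```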
IN THE NEXT LECTURE,
YOU WILL LEARN
Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
Point Estimation
Desirable Qualities of a Good Point
Estimator
–Unbiasedness
–Consistency
–Efficiency
Methods of Point Estimation:
The Method of Moments,
