Business Statistics CH (2)
Sampling Distribution
Random Sampling
Sampling Distribution of the Sample Mean
Central Limit Theorem
Sampling Distribution of the Difference Between
Two Sample Means
Sampling Distribution of the Sample Proportion
Sampling Distribution of the Difference Between
Two Sample Proportions
02/05/24 Lecture 5 1
Statistic and Parameter
In lectures 3 and 4, we discussed probability distributions of
discrete and continuous random variables. This lecture
extends the concept of probability distribution to that of a
sample statistic.
A sample statistic is a numerical summary measure calculated
from sample data. The corresponding measure for the whole
population is called a parameter.
The mean, median, quartiles, mode, variance, standard
deviation, and so on, when computed for a sample, are called
sample statistics.
Sampling Distribution
Because sample measurements are observed values of
random variables, the value of a sample statistic will
vary from sample to sample in a random manner.
Therefore sample statistics are random variables. Hence
a sample statistic also has a probability distribution.
Example 1
Population = {2, 4, 6, 8, 10}. We draw a random sample of size 2
without replacement. There are $\binom{5}{2} = 10$ different samples,
each with sample mean $\bar{X}$:
Sample    $\bar{X}$
2, 4      3
2, 6      4
2, 8      5
2, 10     6
4, 6      5
4, 8      6
4, 10     7
6, 8      7
6, 10     8
8, 10     9
The population mean is $\mu = (2 + 4 + 6 + 8 + 10)/5 = 6$.
Example 1 (Cont.)
Since the sample is random, each of the 10 possible samples has
probability 1/10. The sampling distribution of $\bar{X}$ is therefore:
$\bar{X}$       3     4     5     6     7     8     9
$P(\bar{X})$    0.1   0.1   0.2   0.2   0.2   0.1   0.1
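The table can be reproduced by listing every sample and tallying the means. The following is a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]
n = 2  # sample size, drawn without replacement

# Every unordered sample of size n is equally likely.
samples = list(combinations(population, n))

# Tally how often each sample mean occurs, then convert counts to probabilities.
counts = Counter(Fraction(sum(s), n) for s in samples)
dist = {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

for mean, prob in dist.items():
    print(f"mean = {float(mean):g}, P = {float(prob):.1f}")

# The mean of the sampling distribution equals the population mean.
expected_mean = sum(m * p for m, p in dist.items())
print("E(sample mean) =", expected_mean)
```

Exact fractions are used so that the probabilities add up to exactly 1.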
Random Sampling
A basic reason for using random sampling is to ensure
that the inferences made from the sample data are not
distorted by selection bias.
A Simple Random Sample
Definition: If n elements are selected from a population in such
a way that every possible combination of n elements in the
population has an equal probability of being selected, the n
elements are said to be a simple random sample (or a random
sample).
As a consequence, each member of the population has
the same chance of being included in the sample.
With a complete list of the population, a simple random sample
can be drawn using a table of random numbers. Other commonly
used sampling methods are systematic sampling, stratified
sampling, and cluster sampling.
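In software, a pseudo-random number generator plays the role of the random number table. The sketch below assumes a hypothetical population of 20 labelled units; the seed is fixed only to make the run repeatable.

```python
import random

population = list(range(1, 21))  # hypothetical list of 20 labelled units
n = 5                            # desired sample size

rng = random.Random(2024)        # fixed seed, for reproducibility only

# random.sample draws without replacement, so every combination of
# n units has the same probability of being selected.
srs = rng.sample(population, n)
print(sorted(srs))
```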
Sampling Distribution Of Sample Mean
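THEOREM 1: If a random sample of n observations is
taken from a population with mean $\mu$ and variance $\sigma^2$,
then $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$.
Proof:
$\bar{X} = (\sum X_i)/n = (X_1 + X_2 + \cdots + X_n)/n$,
$E(\bar{X}) = [E(X_1) + E(X_2) + \cdots + E(X_n)]/n$
$= [\mu + \mu + \cdots + \mu]/n = (n\mu)/n = \mu$.
Since the observations are independent,
$\mathrm{Var}(\bar{X}) = [\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)]/n^2$
$= [\sigma^2 + \sigma^2 + \cdots + \sigma^2]/n^2$
$= (n\sigma^2)/n^2 = \sigma^2/n$.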
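The theorem can be checked exactly on a small population. The sketch below assumes the population {2, 4, 6, 8, 10} and samples of size 2 drawn with replacement, so that the observations are independent as the variance step of the proof requires. These choices are illustrative only.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10]  # illustrative population
N = len(population)
n = 2                          # sample size, drawn WITH replacement

# Population mean and variance.
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All N**n ordered samples are equally likely when drawing with replacement.
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
e_mean = sum(means) / len(means)
var_mean = sum((m - e_mean) ** 2 for m in means) / len(means)

print("mu =", mu, " E(mean) =", e_mean)
print("sigma^2/n =", sigma2 / n, " Var(mean) =", var_mean)
```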
Sampling Distribution Of Sample Mean
From the variance of $\bar{X}$, we can see that the larger the
sample, the more accurately $\bar{X}$ estimates the
population mean $\mu$.
Example 3:
(1) If $X \sim N(\mu, \sigma^2)$ and the sample size is n, then
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
(2) If $X \sim N(40, 36)$ and the sample size is 9, then $\bar{X} \sim N(40, 4)$
(3) If $X \sim N(40, 36)$ and the sample size is 9, then
$P(\bar{X} < 42) = P(Z < 1) = 0.8413$
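Part (3) can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma2, n = 40, 36, 9

# Standard error of the sample mean: sigma / sqrt(n) = 6 / 3 = 2.
se = sqrt(sigma2 / n)

# Sampling distribution of the sample mean: N(40, 2^2).
xbar_dist = NormalDist(mu, se)

z = (42 - mu) / se
p = xbar_dist.cdf(42)
print(f"z = {z:.2f}, P(mean < 42) = {p:.4f}")
```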
Central Limit Theorem
From the theorems above, we know that if we take a random
sample of size n from a normal population with mean $\mu$ and
standard deviation $\sigma$, then the sampling distribution of $\bar{X}$
has the following properties:
(1) the expectation is $\mu$;
(2) the standard error (s.e.) is $\sigma/\sqrt{n}$; and
(3) the shape is normal.
CENTRAL LIMIT THEOREM
Central Limit Theorem (C.L.T.): If $\bar{X}$ is the mean of a
random sample of size n taken from a population with
mean $\mu$ and finite variance $\sigma^2$, then the limiting form
of the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
as $n \to \infty$ is the standard normal distribution N(0, 1),
or $\bar{X}$ has approximately a normal distribution with
mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
Central Limit Theorem - Second Form
If n is large, then
$Z = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}$
is approximately standard normal, N(0, 1).
More About Central Limit Theorem
The normal approximation for $\bar{X}$ will generally be
good if $n \ge 30$, regardless of the shape of the
population.
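One way to see how good the approximation is at a moderate n is to compare the exact distribution of a sum with its normal approximation. The sketch below assumes a discrete uniform population on {0, 1, 2, 3} and n = 36, both chosen for illustration. It also applies a continuity correction (evaluating the normal CDF at s + 0.5), which is an added assumption and not part of the rule of thumb above.

```python
from math import sqrt
from statistics import NormalDist

values = [0, 1, 2, 3]  # illustrative discrete uniform population
n = 36                 # sample size, drawn with replacement

# Exact pmf of the sum of n draws, built by repeated convolution.
pmf = {0: 1.0}
for _ in range(n):
    new = {}
    for s, p in pmf.items():
        for v in values:
            new[s + v] = new.get(s + v, 0.0) + p / len(values)
    pmf = new

mu = sum(values) / len(values)
sigma = sqrt(sum((v - mu) ** 2 for v in values) / len(values))

# Normal approximation to the sum: mean n*mu, standard deviation sigma*sqrt(n).
approx = NormalDist(n * mu, sigma * sqrt(n))

# Largest gap between the exact CDF and the continuity-corrected normal CDF.
cum, max_gap = 0.0, 0.0
for s in sorted(pmf):
    cum += pmf[s]
    max_gap = max(max_gap, abs(cum - approx.cdf(s + 0.5)))

print(f"largest CDF gap: {max_gap:.5f}")
```

A small gap here supports the rule of thumb for this particular population; a strongly skewed population may need a larger n.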
Example 4
Consider the discrete uniform population
f(x) = 1/4 for x = 0, 1, 2, 3.
Find the probability that a random sample of size 36, selected with
replacement, will yield a sample mean greater than 1.4 but less than
1.8 if the mean is measured to the nearest tenth.
Solution:
$\mu = E(X) = 1.5$ and
$\sigma^2 = [(0 - 1.5)^2 + (1 - 1.5)^2 + (2 - 1.5)^2 + (3 - 1.5)^2] \cdot 1/4 = 5/4$.
Then $E(\bar{X}) = 1.5$, $\mathrm{Var}(\bar{X}) = \sigma^2/n = 5/144$, and
the standard error $\sigma_{\bar{X}} = 0.186$.
Therefore, by the C.L.T.,
$P(1.4 < \bar{X} < 1.8) \approx P(-0.54 < Z < 1.61)$
$= P(Z < 1.61) - P(Z < -0.54) = 0.6517$.
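The calculation in this solution can be reproduced as follows. This is a minimal sketch of the same normal approximation, without a continuity correction; because it uses unrounded z-values, the result differs slightly from the table-based 0.6517.

```python
from math import sqrt
from statistics import NormalDist

values = [0, 1, 2, 3]  # discrete uniform population, f(x) = 1/4
n = 36

mu = sum(values) / len(values)                              # 1.5
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)   # 1.25
se = sqrt(sigma2 / n)                                       # about 0.186

# Approximate sampling distribution of the sample mean (C.L.T.).
xbar = NormalDist(mu, se)
p = xbar.cdf(1.8) - xbar.cdf(1.4)

print(f"se = {se:.3f}, P(1.4 < mean < 1.8) is about {p:.4f}")
```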
Sampling Distribution of The Difference
Between Two Means
Suppose that we now have two populations, the first with
mean $\mu_1$ and variance $\sigma_1^2$, and the second with mean
$\mu_2$ and variance $\sigma_2^2$.
Let the statistic $\bar{X}_1$ represent the mean of a random
sample of size $n_1$ selected from the first population, and
the statistic $\bar{X}_2$ represent the mean of a random sample
of size $n_2$ selected from the second population,
independent of the sample from the first population.
What can we say about the sampling distribution of the
difference $\bar{X}_1 - \bar{X}_2$ for repeated samples of size $n_1$ and
$n_2$?
Sampling Distribution of The Difference
Between Two Means (Cont.)
According to the central limit theorem, the random
variables $\bar{X}_1$ and $\bar{X}_2$ are both approximately normally
distributed with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2/n_1$ and
$\sigma_2^2/n_2$ respectively, if $n_1$ and $n_2$ are large.
By choosing independent samples from the two populations,
the variables $\bar{X}_1$ and $\bar{X}_2$ will be independent. Hence we can
conclude that $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed
with the mean
$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$
and the variance
$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \mathrm{Var}(\bar{X}_1) + \mathrm{Var}(\bar{X}_2) = \sigma_1^2/n_1 + \sigma_2^2/n_2$.
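Sampling Distribution of $\bar{X}_1 - \bar{X}_2$
From the above discussion, the standard deviation of the
sampling distribution of $\bar{X}_1 - \bar{X}_2$ is given by:
$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$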
Example 5
The television picture tubes of Manufacturer A have a
mean lifetime of 6.5 years & a standard deviation of 0.9
year; while those of Manufacturer B have a mean
lifetime of 6 years and a standard deviation of 0.8 year.
What is the probability that a random sample of 36
tubes from Manufacturer A will have a mean lifetime
that is at least 1 year more than the mean lifetime of a
sample of 49 tubes from Manufacturer B ?
Example 5 - Solution
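For Population I: $\mu_1 = 6.5$, $\sigma_1 = 0.9$ and $n_1 = 36$
For Population II: $\mu_2 = 6.0$, $\sigma_2 = 0.8$ and $n_2 = 49$
Since $n_1$ and $n_2$ are greater than 30, the sampling
distribution of $\bar{X}_1 - \bar{X}_2$ will be approximately normal with
$E(\bar{X}_1 - \bar{X}_2) = 6.5 - 6 = 0.5$
$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{0.81}{36} + \frac{0.64}{49}} = 0.189$
Hence
$P(\bar{X}_1 - \bar{X}_2 \ge 1) \approx P\left(Z \ge \frac{1 - 0.5}{0.189}\right) = P(Z \ge 2.65) = 0.0040$.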
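The quantities in this solution, together with the probability asked for in Example 5, can be computed as follows. This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 6.5, 0.9, 36  # Manufacturer A
mu2, sigma2, n2 = 6.0, 0.8, 49  # Manufacturer B

# Mean and standard deviation of the difference of sample means.
mean_diff = mu1 - mu2
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# P(difference of sample means >= 1), normal approximation.
z = (1 - mean_diff) / se_diff
p = 1 - NormalDist().cdf(z)

print(f"se = {se_diff:.3f}, z = {z:.2f}, probability is about {p:.4f}")
```

The probability is small because a 1-year gap lies more than 2.6 standard errors above the expected gap of 0.5 year.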
Sampling Distribution of Sample Proportion
We take one sample of n items from a binomial population with
a proportion of success $\pi$. In the previous lecture, we
demonstrated how to use the normal distribution to approximate
the binomial distribution if
$n\pi \ge 5$ and $n(1-\pi) \ge 5$ (or $n\pi(1-\pi) \ge 5$).
This is in fact a consequence of the Central Limit Theorem
applied to the binomial distribution.
Consider a random sample of size n: $X_1, X_2, \ldots, X_n$, which are
independent random variables taking values 0 and 1 such that
$P(X_i = 1) = \pi$.
Let $X = X_1 + X_2 + \cdots + X_n$. Then X represents the total number
of successes in the sample.
Sampling Distribution of p
We know that X follows a binomial distribution $\mathrm{Bin}(x; n, \pi)$
with mean $n\pi$ and standard deviation
$\sqrt{n\pi(1-\pi)}$.
Sampling Distribution of p (Cont.)
Let $p = X/n$, which is the sample proportion of successes,
or the fraction of successes. We know that $E(p) = \pi$
and
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$.
The standardized fraction of successes is
$Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
which, again according to the C.L.T. (the first form), is
approximately standard normal.
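The mean and standard deviation of p can be checked against the exact binomial distribution. The sketch below assumes n = 100 and a success proportion of 0.4; these are illustrative values, and the variable names are not from the lecture.

```python
from math import comb, sqrt

n = 100            # illustrative sample size
pi_success = 0.4   # illustrative population proportion

# Exact binomial pmf of the number of successes X.
pmf = [comb(n, k) * pi_success**k * (1 - pi_success) ** (n - k)
       for k in range(n + 1)]

# Mean and standard deviation of p = X / n, computed from the pmf.
e_p = sum((k / n) * q for k, q in enumerate(pmf))
var_p = sum((k / n - e_p) ** 2 * q for k, q in enumerate(pmf))
sd_p = sqrt(var_p)

# Closed-form standard deviation from the formula above.
formula_sd = sqrt(pi_success * (1 - pi_success) / n)

print(f"E(p) = {e_p:.4f}, sd(p) = {sd_p:.4f}, formula = {formula_sd:.4f}")
```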
Sampling Distribution of p1 - p2
We often need to know the difference in the proportion of
successes between two independent binomial populations. For example,
we may have two production processes, two auditing procedures,
or two medical treatment procedures (the classical 'placebo'
effect). Assume that for population 1 the probability of success
is $\pi_1$, and for population 2 the probability of success is $\pi_2$.
For i = 1, 2, let $X_i$ be the number of successes in a sample of size
$n_i$ drawn from population i. Then $p_i = X_i/n_i$ is the proportion of
successes in the sample drawn from population i.
1. The expected value of the sampling distribution of $(p_1 - p_2)$ is
given by $E(p_1 - p_2) = \pi_1 - \pi_2$.
2. The standard deviation of the sampling distribution of $(p_1 - p_2)$ is
given by
$\sigma_{p_1 - p_2} = \sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}$.
Sampling Distribution of p1 - p2 (Cont.)
By extending the normal approximation to the binomial
distribution as discussed above, we know that when $n_1$
and $n_2$ are large, or in practice when
$n_1\pi_1(1-\pi_1) \ge 5$ and $n_2\pi_2(1-\pi_2) \ge 5$,
the shape of the distribution of $(p_1 - p_2)$ is approximately
normal. More specifically, the standardized
difference between two sample proportions,
$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}}$,
has approximately a standard normal distribution.
Example 6
A store has two locations. At both locations, about 40%
of customers use a credit card to pay for their purchases.
That is, $\pi_1 = \pi_2 = 0.4$. In doing an audit, the company
accountant took random samples of $n_1 = 100$ and $n_2 = 100$
sales slips from the two locations. In the samples, 41 and 36
charge customers were found at the first and
second location, respectively.
What is the probability that a result would be achieved
whereby the first location’s proportion of charge
customers exceeded the second location’s proportion of
charge customers by this much or more?
Example 6 - Solution
In this problem, $n_1 = n_2 = 100$,
$p_1 = 41/100 = 0.41$ and $p_2 = 36/100 = 0.36$.
Since $\pi_1 = \pi_2 = 0.4$, we would expect $p_1 - p_2$ to be
close to zero on average. The probability we seek can be
written as $P(p_1 - p_2 \ge 0.41 - 0.36)$. Since $n_1\pi_1(1-\pi_1) = 24$
and $n_2\pi_2(1-\pi_2) = 24$, we can use the normal
approximation, that is,
$P(p_1 - p_2 \ge 0.41 - 0.36) = P(p_1 - p_2 \ge 0.05)$
$\approx P\left(Z \ge \frac{0.05 - 0}{\sqrt{(0.4)(0.6)/100 + (0.4)(0.6)/100}}\right)$
$= P\left(Z \ge \frac{0.05}{0.069}\right) = P(Z \ge 0.72) = 0.2358$.
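The same calculation can be reproduced numerically. This is a minimal sketch with illustrative variable names; because it uses the unrounded z-value, the result differs slightly from the table-based 0.2358.

```python
from math import sqrt
from statistics import NormalDist

pi1 = pi2 = 0.4               # population proportions at the two locations
n1 = n2 = 100                 # sample sizes
p1, p2 = 41 / 100, 36 / 100   # observed sample proportions

# Check the rule of thumb for the normal approximation.
assert n1 * pi1 * (1 - pi1) >= 5 and n2 * pi2 * (1 - pi2) >= 5

# Standard deviation of p1 - p2 and the standardized difference.
se = sqrt(pi1 * (1 - pi1) / n1 + pi2 * (1 - pi2) / n2)
z = ((p1 - p2) - (pi1 - pi2)) / se

# Upper-tail probability P(Z >= z).
prob = 1 - NormalDist().cdf(z)

print(f"se = {se:.3f}, z = {z:.2f}, P is about {prob:.4f}")
```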