CH I - Sampling and Sampling Distributions (6)

This document discusses sampling and sampling distributions, focusing on the sampling distribution of the mean and sample proportions. It explains key concepts such as population distribution, sample distribution, and the Central Limit Theorem, which states that sample means will approximate a normal distribution as sample size increases. Additionally, it provides examples and formulas for calculating probabilities related to sample means and proportions.

Uploaded by

petrosmulu5
Copyright
© © All Rights Reserved

CHAPTER ONE

SAMPLING AND SAMPLING DISTRIBUTIONS

SAMPLING DISTRIBUTION

NOTE: The normal probability distribution is used to determine probabilities for
normally distributed individual measurements, given the mean and the standard
deviation. Symbolically, the variable is the measurement $X$, with population
mean $\mu$ and population standard deviation $\sigma$. In contrast to such distributions of
individual measurements, a sampling distribution is a probability distribution for
the possible values of a sample statistic.

Population distribution: the distribution of the measured values of the members of a
population. It has mean $\mu$, variance $\sigma^2$ and standard deviation $\sigma$. The population
standard deviation describes the variation among the values of members of the population,
whereas the standard deviation of a sampling distribution measures the variability, due to
sampling error, among the values of a sample statistic such as the sample mean or the
sample proportion.

Sample distribution: the distribution of the measured values in a random sample drawn
from a given population. The sample mean varies from sample to sample, and this
variability is the basis of the sampling distribution. A sampling distribution is a
probability distribution for the possible values of a sample statistic, such as a sample
mean.

1.1. SAMPLING DISTRIBUTION OF THE MEAN


Sampling distribution of the mean: the probability distribution of the sample means
$\bar{X}$ of all distinct simple random samples of a given size $n$ that can be drawn from a
population or a process. It has its own arithmetic mean, denoted $\mu_{\bar{X}}$ (read as
"mu sub x bar"), and its own standard deviation, denoted $\sigma_{\bar{X}}$ (read as
"sigma sub x bar").

NB: The sampling distribution of the mean is not the sample distribution, which is the
distribution of the measured values of $X$ in one random sample. Rather, the sampling
distribution of the mean is the probability distribution for $\bar{X}$, the sample mean.

For any given sample size $n$ taken from a population with mean $\mu$ and standard deviation
$\sigma$, the value of the sample mean would vary from sample to sample if several random
samples were obtained from the population. This variability serves as the basis for the
sampling distribution.

The sampling distribution of the mean is described by two parameters: the expected value
$E(\bar{X}) = \mu_{\bar{X}}$, which is the mean of the sampling distribution of the mean, and
the standard deviation of the mean $\sigma_{\bar{X}}$, called the standard error of the mean.

PROPERTIES OF THE SAMPLING DISTRIBUTION OF MEANS

1. The arithmetic mean $\mu_{\bar{X}}$ of the sampling distribution of mean values is equal to
the population mean $\mu$ regardless of the form of the population distribution, i.e.
$\mu_{\bar{X}} = \mu$.
2. The sampling distribution has a standard deviation (also called the standard error) equal
to the population standard deviation divided by the square root of the sample size, i.e.
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
This holds when the sample is a small fraction of the population ($n < 0.05N$), in
particular when $N$ is very large. If $N$ is finite and $n \ge 0.05N$, then
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$.
3. The expression $\sqrt{\frac{N-n}{N-1}}$ is called the finite population correction
factor (or finite population multiplier). In the calculation of the standard error of the
mean, if the population standard deviation $\sigma$ is unknown, the standard error of the mean
$\sigma_{\bar{X}}$ can be estimated by the sample standard error of the mean $s_{\bar{X}}$,
which is calculated as follows:
$s_{\bar{X}} = \frac{s}{\sqrt{n}}$ or $s_{\bar{X}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
4. A sample size $n \ge 30$ is generally considered to be a large sample for statistical
analysis, whereas a sample of size $n < 30$ is considered to be a small sample. The
sampling distribution of means is approximately normal for sufficiently large
sample sizes ($n \ge 30$).
5. When the standard deviation of the population $\sigma$ is not known, the standard deviation
of the sample $s$, which closely approximates $\sigma$, is used to compute the standard error,
i.e. $\sigma_{\bar{X}} \approx \frac{s}{\sqrt{n}}$.
A population consists of the following ages: 10, 20, 30, 40, and 50. A random sample
of three is to be selected from this population and mean computed. Develop the
sampling distribution of the mean.

Solution: The number of simple random samples of size $n$ that can be drawn without
replacement from a population of size $N$ is ${}^{N}C_{n} = \frac{N!}{n!\,(N-n)!}$.
With $N = 5$ and $n = 3$, ${}^{5}C_{3} = 10$ samples can be drawn from the population as:

Sampled items      Sample mean ($\bar{X}$)

10, 20, 30         20.00
10, 20, 40         23.33
10, 20, 50         26.67
10, 30, 40         26.67
10, 30, 50         30.00
10, 40, 50         33.33
20, 30, 40         30.00
20, 30, 50         33.33
20, 40, 50         36.67
30, 40, 50         40.00
Total              300.00
A systematic organization of the above figures gives the following:

Sample mean ($\bar{X}$)    Frequency    Prob. (relative freq.) of $\bar{X}$

20.00                      1            0.1
23.33                      1            0.1
26.67                      2            0.2
30.00                      2            0.2
33.33                      2            0.2
36.67                      1            0.1
40.00                      1            0.1
TOTAL                      10           1.0
Columns 1 and 2 show frequency distribution of sample means.
Columns 1 and 3 show sampling distribution of the mean.
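
The enumeration above can be reproduced with a short script. This is only a sketch using the Python standard library; the variable names are illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [10, 20, 30, 40, 50]
n = 3  # sample size

# Every distinct sample of size n drawn without replacement.
samples = list(combinations(population, n))

# Exact sample means, kept as fractions to avoid rounding noise.
means = [Fraction(sum(s), n) for s in samples]

# Frequency of each distinct sample mean and its relative frequency.
frequency = Counter(means)
sampling_distribution = {
    m: Fraction(f, len(samples)) for m, f in frequency.items()
}

# Print: sample mean, frequency, probability.
for m in sorted(sampling_distribution):
    print(f"{float(m):6.2f}  {frequency[m]}  {float(sampling_distribution[m]):.1f}")
```

The printed rows correspond to the three columns of the table: columns 1 and 2 give the frequency distribution, columns 1 and 3 the sampling distribution of the mean.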

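$\mu = \frac{\sum X}{N} = \frac{150}{5} = 30$ and $\mu_{\bar{X}} = \frac{\sum \bar{X}}{10} = \frac{300}{10} = 30$.
Regardless of the sample size, $\mu_{\bar{X}} = \mu$.

x (Observation)    $x - \mu$    $(x - \mu)^2$

10                 -20          400
20                 -10          100
30                 0            0
40                 10           100
50                 20           400
$\sum (x - \mu)^2$              1,000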
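$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{1000}{5}} = 14.142$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \frac{14.142}{\sqrt{3}} \sqrt{\frac{5-3}{5-1}} = 5.774$

The same value is obtained directly from the 10 sample means:

$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{10}} = \sqrt{\frac{333.4}{10}} = 5.774$

Since averaging reduces variability, $\sigma_{\bar{X}} < \sigma$ except in the cases where
$\sigma = 0$ or $n = 1$.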
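
As a check on the arithmetic, the standard error can be computed both ways in a few lines. This is a minimal sketch with the standard library, assuming the five-value population of the example; `pstdev` divides by the number of values, which matches the population formula used here.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

population = [10, 20, 30, 40, 50]
N, n = len(population), 3

sigma = pstdev(population)  # population standard deviation (divides by N)

# Standard error from the formula with the finite population correction.
fpc = math.sqrt((N - n) / (N - 1))
se_formula = sigma / math.sqrt(n) * fpc

# Standard error computed directly from all possible sample means.
sample_means = [mean(s) for s in combinations(population, n)]
se_direct = pstdev(sample_means)

print(round(sigma, 3), round(se_formula, 3), round(se_direct, 3))
```

Both routes agree, which illustrates why the correction factor is needed when the sample is a large fraction of a finite population.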
Central Limit Theorem and the Sampling Distribution of the Mean
The Central Limit Theorem (CLT) states that:

1. If the population is normally distributed, the distribution of sample means is
normal regardless of the sample size.
2. If the population from which samples are taken is not normal, the distribution of
sample means will be approximately normal if the sample size ($n$) is sufficiently
large ($n \ge 30$). The larger the sample size used, the closer the sampling
distribution is to the normal curve.

The relationship between the shape of the population distribution and the shape of the
sampling distribution of the mean is called the Central Limit Theorem.

The significance of the Central Limit Theorem is that it permits us to use sample statistics
to make inference about population parameters without knowing anything about the
shape of the frequency distribution of that population other than what we can get from the
sample. It also permits us to use the normal distribution curve for analyzing distributions
whose shape is unknown. It creates the potential for applying the normal distribution to
many problems when the sample is sufficiently large.
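
The theorem can be illustrated by simulation. The sketch below is not from the original notes: it assumes a right-skewed exponential population with mean 10 (for which the standard deviation also equals 10) and checks whether means of samples of size 30 behave roughly like a normal variable.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN = 10.0  # assumed mean of an exponential (right-skewed) population
n = 30           # sample size
trials = 2000    # number of simulated samples

def sample_mean(size):
    """Mean of one random sample from the assumed exponential population."""
    return fmean(random.expovariate(1 / POP_MEAN) for _ in range(size))

sample_means = [sample_mean(n) for _ in range(trials)]

center = fmean(sample_means)
spread = pstdev(sample_means)
expected_spread = POP_MEAN / n ** 0.5  # sigma / sqrt(n); sigma equals the mean here

# Share of sample means within one standard error of the population mean.
within_one_se = sum(
    abs(m - POP_MEAN) <= expected_spread for m in sample_means
) / trials

print(f"mean of sample means: {center:.2f}")
print(f"std of sample means:  {spread:.2f} (theory {expected_spread:.2f})")
print(f"within 1 SE:          {within_one_se:.2f} (normal curve gives about 0.68)")
```

Even though the individual values are strongly skewed, the simulated sample means should center near the population mean with a spread close to $\sigma/\sqrt{n}$.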

As mentioned earlier, when the above properties hold, a given value of the sample mean
$\bar{X}$ is first converted into a value $Z$ on the standard normal distribution, which
measures how far it deviates from the mean of the sample means ($\mu_{\bar{X}}$), by
using the formula:

$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ because $\mu_{\bar{X}} = \mu$
If the population is finite and samples of fixed size $n$ are drawn without replacement, then
the standard error of the sampling distribution of the mean is modified to adjust for the
continued change in the size of the remaining population due to the successive draws, as
follows:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$

Example 1. The mean life of a certain tool is 41.5 hours with a standard
deviation of 2.5 hours. What is the probability that a simple random sample
of size 50 drawn from this population will have a mean between 40.5 hours
and 42 hours?

$\mu = 41.5$, $\sigma = 2.5$, $n = 50$

$P(40.5 \le \bar{X} \le 42.0) = ?$

$\mu_{\bar{X}} = \mu = 41.5$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{50}} = \frac{2.5}{7.0711} = 0.3536$

The population distribution is unknown, but the sample size $n = 50$ is large enough to apply
the central limit theorem. Hence the normal distribution can be used to find the required
probability.

$P(40.5 \le \bar{X} \le 42.0) = P\left(\frac{\bar{X}_1 - \mu}{\sigma_{\bar{X}}} \le Z \le \frac{\bar{X}_2 - \mu}{\sigma_{\bar{X}}}\right)$
$= P\left(\frac{40.5 - 41.5}{0.3536} \le Z \le \frac{42 - 41.5}{0.3536}\right)$
$= P(-2.8281 \le Z \le 1.4140)$
$= P(-2.8281 \le Z \le 0) + P(0 \le Z \le 1.4140)$
$= 0.4977 + 0.4207 = 0.9184$
Thus 0.9184 is the probability that the sample mean life of the tool lies between 40.5 and
42 hours.
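[Figure: normal curve of $\bar{X}$ centered at $\mu = 41.5$ ($\sigma = 2.5$), with shaded areas 0.4977 between 40.5 and 41.5 and 0.4207 between 41.5 and 42.0]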
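
The result of Example 1 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library instead of a printed Z table, so the answer differs slightly in the third decimal place because the table values are rounded.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 41.5, 2.5, 50
standard_error = sigma / sqrt(n)

# Sampling distribution of the mean, treated as normal by the CLT.
x_bar = NormalDist(mu, standard_error)

# Area under the curve between the two sample-mean limits.
probability = x_bar.cdf(42.0) - x_bar.cdf(40.5)
print(round(standard_error, 4), round(probability, 4))
```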

Example 2. A continuous manufacturing process produces items whose weights
are normally distributed with a mean weight of 800 gms and a standard
deviation of 300 gms. A random sample of 16 items is to be selected from
the process.
A. What is the probability that the arithmetic mean of the sample exceeds 900 gms? Interpret
the result.
B. Find the values of the sample arithmetic mean within which the middle 95% of all sample
means will fall.

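Solution:

A. $\mu_{\bar{X}} = \mu = 800$ gms, $\sigma = 300$ gms, $n = 16$

$P(\bar{X} \ge 900) = ?$

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{16}} = \frac{300}{4} = 75$

$P(\bar{X} \ge 900) = P\left(Z \ge \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z \ge \frac{900 - 800}{75}\right)$
$= P(Z \ge 1.33)$
$= 0.5000 - 0.4082$
$= 0.0918$

[Figure: normal curve of $\bar{X}$ centered at $\mu_{\bar{X}} = 800$, with shaded upper-tail area 0.0918 beyond $\bar{X} = 900$]

Interpretation: about 9.18% of all possible samples of 16 items would have a mean weight
exceeding 900 gms.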

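B. Since $Z = \pm 1.96$ bounds the middle 95% of the area under the normal curve, solving the
formula for $Z$ for the values of $\bar{X}$ in terms of the known values gives:
$\bar{X}_1 = \mu_{\bar{X}} - Z\sigma_{\bar{X}} = 800 - 1.96(75) = 653$ gms
$\bar{X}_2 = \mu_{\bar{X}} + Z\sigma_{\bar{X}} = 800 + 1.96(75) = 947$ gms

[Figure: normal curve of $\bar{X}$ centered at $\mu_{\bar{X}} = 800$ ($\sigma = 300$), with the middle area 0.95 shaded between $\bar{X} = 653$ and $\bar{X} = 947$]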
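
Both parts of Example 2 can be verified with the standard library. This is a sketch only: `inv_cdf` returns the value below which a given share of the distribution lies, so it replaces the table lookup of $Z = 1.96$.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 300, 16
x_bar = NormalDist(mu, sigma / sqrt(n))  # standard error is 300 / 4 = 75

# Part A: upper-tail probability that the sample mean exceeds 900.
p_exceeds_900 = 1 - x_bar.cdf(900)

# Part B: limits that enclose the middle 95% of sample means.
lower, upper = x_bar.inv_cdf(0.025), x_bar.inv_cdf(0.975)

print(round(p_exceeds_900, 4), round(lower), round(upper))
```

The tail probability differs slightly from 0.0918 because the table rounds $Z$ to 1.33.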

1.2. SAMPLING DISTRIBUTION OF SAMPLE PROPORTIONS


The sample proportion $\bar{p}$ of items having the characteristic of interest (success or
failure, accept or reject, head or tail) is the best statistic to use for statistical
inferences about the population proportion $p$. The sample proportion can be defined as:

$\bar{p} = \frac{\text{number of successes, } X}{\text{sample size, } n}$

By the same logic as for the sampling distribution of the mean, the sampling distribution of
sample proportions has mean $\mu_{\bar{p}}$ and standard deviation (also called standard
error) $\sigma_{\bar{p}}$ given by:

$\mu_{\bar{p}} = p$ and $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{p(1-p)}{n}}$

If the sample size is large ($n \ge 30$) and satisfies the following two conditions:

A. $np \ge 5$
B. $nq \ge 5$
then the sampling distribution of proportions is very closely approximated by a normal
distribution. It may be noted that the number of successes in a sample actually follows a
binomial distribution, so the normal curve is an approximation to the exact sampling
distribution of the proportion.
For a finite population in which sampling is done without replacement we have:

$\mu_{\bar{p}} = p$ and $\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} \cdot \sqrt{\frac{N-n}{N-1}}$
Under the same guidelines as mentioned in the previous sections, for a large sample size
$n \ge 30$, the sampling distribution of the proportion is closely approximated by a normal
distribution with the mean and standard deviation stated above. Hence, to standardize the
sample proportion $\bar{p}$, the standard normal variable is:

$Z = \frac{\bar{p} - \mu_{\bar{p}}}{\sigma_{\bar{p}}} = \frac{\bar{p} - p}{\sqrt{pq/n}}$

Example 3. A few years back, a policy was introduced to give loans to
unemployed engineers to start their own businesses. Out of 1,000,000
engineers, 600,000 accepted the policy and got the loan. A sample of 100
unemployed engineers is taken at the time of allotment of loans. What
is the probability that the sample proportion would have exceeded 50%
acceptance?
Solution:

$\mu_{\bar{p}} = p = 0.60$, $N = 1{,}000{,}000$, $n = 100$, $P(\bar{p} \ge 0.5) = ?$

$\sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{(0.60)(0.40)}{100}} \sqrt{\frac{1{,}000{,}000 - 100}{1{,}000{,}000 - 1}} = 0.0489$

$P(\bar{p} \ge 0.5) = P\left(Z \ge \frac{\bar{p} - \mu_{\bar{p}}}{\sigma_{\bar{p}}}\right) = P\left(Z \ge \frac{0.50 - 0.60}{0.0489}\right) = P(Z \ge -2.04) = 0.4793 + 0.5000 = 0.9793$

[Figure: normal curve of $\bar{p}$ centered at $p = 0.60$, with shaded areas 0.4793 between 0.50 and 0.60 and 0.5000 above 0.60]
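
Example 3 can be reproduced with a small helper. The function name `proportion_standard_error` is illustrative, not part of the original notes; it applies the finite population correction only when a population size is supplied.

```python
from math import sqrt
from statistics import NormalDist

def proportion_standard_error(p, n, N=None):
    """Standard error of a sample proportion; applies the finite
    population correction when a population size N is supplied."""
    se = sqrt(p * (1 - p) / n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

p, n, N = 0.60, 100, 1_000_000
se = proportion_standard_error(p, n, N)

# Probability that the sample proportion is at least 0.50.
prob_above_half = 1 - NormalDist(p, se).cdf(0.50)
print(round(se, 4), round(prob_above_half, 4))
```

With $N$ this large the correction factor is almost exactly 1, so it has very little effect on the standard error.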

Example 4. A population proportion is 0.40. A simple random sample of size
200 will be taken and the sample proportion will be used to estimate the
population proportion. What is the probability that the sample proportion
will be within ±0.03 of the population proportion?
Given:

$\mu_{\bar{p}} = p = 0.40$, $n = 200$

$\sigma_{\bar{p}} = \sqrt{\frac{(0.4)(0.6)}{200}} = 0.0346$

$P(-0.03 \le \bar{p} - p \le 0.03) = P\left(\frac{-0.03}{0.0346} \le Z \le \frac{0.03}{0.0346}\right)$
$= P(-0.87 \le Z \le 0.87)$
$= 2P(0 \le Z \le 0.87)$
$= 2 \times 0.3078$
$= 0.6156$

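[Figure: normal curve of $\bar{p}$ centered at $p = 0.40$, with shaded areas of 0.3078 on each side of the mean, between 0.37 and 0.43]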
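
A quick numerical check of Example 4, sketched with the standard library. The symmetric interval is evaluated on the standard normal curve, so the result differs slightly from the table answer, which rounds $Z$ to 0.87.

```python
from math import sqrt
from statistics import NormalDist

p, n, margin = 0.40, 200, 0.03
se = sqrt(p * (1 - p) / n)

# Half-width of the interval in standard-error units.
z = margin / se

# Area between -z and +z under the standard normal curve.
standard_normal = NormalDist()
prob_within_margin = standard_normal.cdf(z) - standard_normal.cdf(-z)

print(round(se, 4), round(z, 2), round(prob_within_margin, 4))
```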

Example 5. A manufacturer of watches has determined from past experience that
3% of the watches he produces are defective. If a random sample of 300
watches is examined, what is the probability that the proportion of defective
watches is between 0.02 and 0.035?

$\mu_{\bar{p}} = p = 0.03$, $\bar{p}_1 = 0.02$, $\bar{p}_2 = 0.035$, $n = 300$

$\sigma_{\bar{p}} = \sqrt{\frac{(0.03)(0.97)}{300}} = 0.0098$

$P(0.02 \le \bar{p} \le 0.035) = P\left(\frac{\bar{p}_1 - p}{\sigma_{\bar{p}}} \le Z \le \frac{\bar{p}_2 - p}{\sigma_{\bar{p}}}\right)$
$= P\left(\frac{0.02 - 0.03}{0.0098} \le Z \le \frac{0.035 - 0.03}{0.0098}\right)$
$= P(-1.02 \le Z \le 0.51)$
$= P(-1.02 \le Z \le 0) + P(0 \le Z \le 0.51)$
$= 0.3461 + 0.1950$
$= 0.5411$
Hence the probability that the proportion of defective watches will lie between 0.02 and
0.035 is 0.5411.

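[Figure: normal curve of $\bar{p}$ centered at $p = 0.03$, with shaded areas 0.3461 between $\bar{p}_1 = 0.02$ and 0.03 and 0.1950 between 0.03 and $\bar{p}_2 = 0.035$]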
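
The sketch below checks Example 5 and also tests the conditions $np \ge 5$ and $nq \ge 5$ before relying on the normal curve. The exact figure differs from 0.5411 in the third decimal place because the worked solution rounds the standard error and the $Z$ values.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.03, 300
low, high = 0.02, 0.035

# Conditions for the normal approximation to the sample proportion.
normal_ok = n >= 30 and n * p >= 5 and n * (1 - p) >= 5

se = sqrt(p * (1 - p) / n)
p_bar = NormalDist(p, se)

# Probability that the sample proportion falls between the two limits.
prob_between = p_bar.cdf(high) - p_bar.cdf(low)

print(normal_ok, round(se, 4), round(prob_between, 4))
```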

1.3. SAMPLING DISTRIBUTION OF THE DIFFERENCE BETWEEN TWO MEANS

The concept of the sampling distribution of the sample mean introduced earlier can also be
used to compare a population of size $N_1$ having mean $\mu_1$ and standard deviation $\sigma_1$
with another population of a similar type of size $N_2$ having mean $\mu_2$ and standard
deviation $\sigma_2$.

Let $\bar{X}_1$ and $\bar{X}_2$ be the means of samples drawn from the two populations,
respectively. Then the difference between the population means $\mu_1$ and $\mu_2$ can be
estimated by generalizing the formula of the standard normal variable as follows:

$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_{\bar{X}_1} - \mu_{\bar{X}_2})}{\sigma_{(\bar{X}_1 - \bar{X}_2)}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{X}_1 - \bar{X}_2)}}$

Where: $\mu_{\bar{X}_1} - \mu_{\bar{X}_2} = \mu_1 - \mu_2$ (mean of the sampling distribution of the difference of two sample means)

$\sigma_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\sigma_{\bar{X}_1}^2 + \sigma_{\bar{X}_2}^2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
(standard error of the sampling distribution of the difference of two means)

$n_1$ and $n_2$ are the sizes of the independent random samples drawn from the first and
second populations, respectively.

Example 6. Car stereos of manufacturer A have a mean lifetime of 1,400 hours
with a standard deviation of 200 hours, while those of manufacturer B have
a mean lifetime of 1,200 hours with a standard deviation of 100 hours. If a
random sample of 125 stereos from each manufacturer is tested, what is the
probability that manufacturer A's stereos will have a mean lifetime which is
at least:
A. 160 hours more than manufacturer B's stereos?
B. 250 hours more than manufacturer B's stereos?
Solution:

Manufacturer A: $\mu_1 = 1{,}400$ hours, $\sigma_1 = 200$ hours, $n_1 = 125$
Manufacturer B: $\mu_2 = 1{,}200$ hours, $\sigma_2 = 100$ hours, $n_2 = 125$
a)

$\sigma_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(200)^2}{125} + \frac{(100)^2}{125}} = \sqrt{320 + 80} = \sqrt{400} = 20$

$P(\bar{X}_1 - \bar{X}_2 \ge 160) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{X}_1 - \bar{X}_2)}}\right)$
$= P\left(Z \ge \frac{160 - 200}{20}\right)$
$= P(Z \ge -2)$
$= 0.5000 + 0.4772$
$= 0.9772$ (area under normal curve)

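[Figure: normal curve of $\bar{X}_1 - \bar{X}_2$ centered at $\mu_{(\bar{X}_1 - \bar{X}_2)} = 200$, with shaded area 0.9772 to the right of $\bar{X}_1 - \bar{X}_2 = 160$]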

Hence, the probability is very high that the mean lifetime of A's stereos is at least 160
hours more than that of B's.

b) Proceeding in the same manner as in part a):

$P(\bar{X}_1 - \bar{X}_2 \ge 250) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{X}_1 - \bar{X}_2)}}\right)$
$= P\left(Z \ge \frac{250 - 200}{20}\right)$
$= P(Z \ge 2.5)$
$= 0.5000 - 0.4938$
$= 0.0062$ (area under normal curve)

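[Figure: normal curve of $\bar{X}_1 - \bar{X}_2$ centered at $\mu_{(\bar{X}_1 - \bar{X}_2)} = 200$, with shaded upper-tail area 0.0062 beyond $\bar{X}_1 - \bar{X}_2 = 250$]
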
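
Both parts of Example 6 follow from one normal distribution for the difference of the sample means. The sketch below assumes independent samples; the helper name `se_difference_of_means` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def se_difference_of_means(sigma1, n1, sigma2, n2):
    """Standard error of the difference of two independent sample means."""
    return sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

mu1, sigma1, n1 = 1400, 200, 125  # manufacturer A
mu2, sigma2, n2 = 1200, 100, 125  # manufacturer B

se = se_difference_of_means(sigma1, n1, sigma2, n2)
difference = NormalDist(mu1 - mu2, se)

# Upper-tail probabilities for parts a) and b).
p_at_least_160 = 1 - difference.cdf(160)
p_at_least_250 = 1 - difference.cdf(250)

print(se, round(p_at_least_160, 4), round(p_at_least_250, 4))
```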

Example 7. The strength of a wire produced by company A has a mean of 4,500 kg
and a $\sigma_1$ of 200 kg. Company B has a mean of 4,000 kg and a $\sigma_2$ of 300 kg. If
50 wires of company A and 100 wires of company B are selected at random
and tested for strength, what is the probability that the sample mean strength
of A will be at least 600 kg more than that of B?

Given:
$\mu_1 = 4{,}500$, $\mu_2 = 4{,}000$
$\sigma_1 = 200$, $\sigma_2 = 300$
$n_1 = 50$, $n_2 = 100$

$\sigma_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(200)^2}{50} + \frac{(300)^2}{100}} = \sqrt{800 + 900} = 41.23$

$P(\bar{X}_1 - \bar{X}_2 \ge 600) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{X}_1 - \bar{X}_2)}}\right)$
$= P\left(Z \ge \frac{600 - 500}{41.23}\right)$
$= P(Z \ge 2.43)$
$= 0.5000 - 0.4925 = 0.0075$ (area under normal curve)

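[Figure: normal curve of $\bar{X}_1 - \bar{X}_2$ centered at $\mu_{(\bar{X}_1 - \bar{X}_2)} = 500$, with shaded upper-tail area 0.0075 beyond $\bar{X}_1 - \bar{X}_2 = 600$]
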
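
Example 7 can also be checked by simulation rather than by formula alone. The sketch below is an illustration under a stated assumption: each sample mean is drawn directly from a normal sampling distribution (as the Central Limit Theorem suggests for these sample sizes), and the simulated share is compared with the analytic tail area.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is repeatable

mu1, sigma1, n1 = 4500, 200, 50    # company A
mu2, sigma2, n2 = 4000, 300, 100   # company B
se1, se2 = sigma1 / sqrt(n1), sigma2 / sqrt(n2)

# Analytic answer from the standard error of the difference.
se_diff = sqrt(se1 ** 2 + se2 ** 2)
analytic = 1 - NormalDist(mu1 - mu2, se_diff).cdf(600)

# Monte Carlo: draw each sample mean from its own (assumed normal)
# sampling distribution and count how often A beats B by 600 kg or more.
trials = 200_000
hits = sum(
    random.gauss(mu1, se1) - random.gauss(mu2, se2) >= 600
    for _ in range(trials)
)
simulated = hits / trials

print(round(se_diff, 2), round(analytic, 4), round(simulated, 4))
```

The simulated share should land close to the analytic value, which in turn is close to the table answer of 0.0075.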

1.4. SAMPLING DISTRIBUTION OF THE DIFFERENCE OF TWO PROPORTIONS

Suppose two populations of size $N_1$ and $N_2$ are given. For each sample of size $n_1$ from
the first population, compute the sample proportion $\bar{p}_1$ and standard deviation
$\sigma_{\bar{p}_1}$. Similarly, for each sample of size $n_2$ from the second population,
compute the sample proportion $\bar{p}_2$ and standard deviation $\sigma_{\bar{p}_2}$.

For all combinations of these samples from these populations, we can obtain a sampling
distribution of the difference $\bar{p}_1 - \bar{p}_2$ of sample proportions. Such a
distribution is called the sampling distribution of the difference of two proportions. The
mean and standard deviation of this distribution are given by:

$\mu_{\bar{p}_1} - \mu_{\bar{p}_2} = p_1 - p_2$

$\sigma_{(\bar{p}_1 - \bar{p}_2)} = \sqrt{\sigma_{\bar{p}_1}^2 + \sigma_{\bar{p}_2}^2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

If the sample sizes $n_1$ and $n_2$ are large, i.e. $n_1 \ge 30$ and $n_2 \ge 30$, then the
sampling distribution of the difference of proportions is closely approximated by a normal
distribution.

Example 8. 10% of the machines produced by company A are defective and 5%
of those produced by company B are defective. A random sample of 250
machines is taken from company A and a random sample of 300 machines is
taken from company B. What is the probability that the difference in sample
proportions is less than or equal to 0.02?

$\mu_{\bar{p}_1} - \mu_{\bar{p}_2} = p_1 - p_2 = 0.10 - 0.05 = 0.05$

$n_1 = 250$, $n_2 = 300$

The standard error of the difference in sample proportions is given by

$\sigma_{(\bar{p}_1 - \bar{p}_2)} = \sqrt{\sigma_{\bar{p}_1}^2 + \sigma_{\bar{p}_2}^2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} = \sqrt{\frac{(0.10)(0.90)}{250} + \frac{(0.05)(0.95)}{300}}$
$\sigma_{(\bar{p}_1 - \bar{p}_2)} = \sqrt{0.00052} = 0.0228$

The desired probability of the difference in sample proportions is given by

$P(\bar{p}_1 - \bar{p}_2 \le 0.02) = P\left(Z \le \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sigma_{(\bar{p}_1 - \bar{p}_2)}}\right)$
$= P\left(Z \le \frac{0.02 - 0.05}{0.0228}\right)$
$= P(Z \le -1.32)$
$= 0.5000 - 0.4066 = 0.0934$ (area under normal curve)
Hence the desired probability for the difference in sample proportions is 0.0934.

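[Figure: normal curve of $\bar{p}_1 - \bar{p}_2$ centered at $\mu_{(\bar{p}_1 - \bar{p}_2)} = 0.05$, with shaded lower-tail area 0.0934 below $\bar{p}_1 - \bar{p}_2 = 0.02$]
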
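
Example 8 can be checked in the same way. The sketch below treats the difference of the two sample proportions as normal and evaluates the lower tail directly, so the result differs from 0.0934 only through rounding of $Z$ in the table lookup.

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 0.10, 250  # company A: defective share and sample size
p2, n2 = 0.05, 300  # company B: defective share and sample size

# Standard error of the difference of two independent sample proportions.
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

difference = NormalDist(p1 - p2, se)

# Lower-tail probability that the difference is at most 0.02.
prob_at_most_002 = difference.cdf(0.02)

print(round(se, 4), round(prob_at_most_002, 4))
```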
