Sampling and CLT
In a statistical investigation, interest usually lies in assessing the variation of one or more
characteristics of the objects belonging to a group. This group of objects under study is called a population.
Thus, in statistics, a population is an aggregation of objects, animate or inanimate, under study.
The population may be finite or infinite. Examining the entire population in order to assess this
variation may be difficult or even impossible. In such situations we consider a sample.
A finite subset of statistical individuals in a population is called a sample and the number of individuals in
a sample is called the sample size.
The process of obtaining a sample is called sampling. Sampling in which each member may be chosen
more than once is called sampling with replacement; sampling in which no member can be chosen more than
once is called sampling without replacement.
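The two schemes can be illustrated with a minimal Python sketch. The population of ten labelled units and the seed are hypothetical choices made only for this illustration.

```python
import random

# Hypothetical population of 10 labelled units (not taken from the text).
population = list(range(1, 11))
rng = random.Random(0)  # fixed seed so the run is repeatable

# With replacement: the same unit may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: every drawn unit is distinct.
without_replacement = rng.sample(population, k=5)

print(with_replacement, without_replacement)
```

A sample drawn without replacement can never be larger than the population, whereas a sample drawn with replacement can have any size.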
Random sampling: If the sample units are selected at random, the procedure is called random sampling. In this
case each unit of the population has an equal chance of being included in the sample.
Let X be a random variable whose probability distribution has mean $\mu$ and variance $\sigma^2$. Let $X_1, X_2, \dots, X_n$
be n independent random variables, each having the same distribution as X. Then $(X_1, X_2, \dots, X_n)$ is called
a random sample of size n from X; $\mu$ is called the population mean and $\sigma^2$ the population variance.
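Let X be a random variable with expectation $E(X) = \mu$ and variance $V(X) = \sigma^2$, and let
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the mean of a random sample of size n from X. Then
$E(\bar{X}) = \mu$ and $V(\bar{X}) = \frac{\sigma^2}{n}$, where n is the sample size.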
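The two identities for the sample mean can be checked exactly on a small example by listing every possible sample. The sketch below assumes a hypothetical population of three equally likely values and sampling with replacement, so that the draws are independent; exact fractions avoid rounding error.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population values, each equally likely (illustration only).
values = [1, 2, 6]
n = 3  # sample size

def mean(xs):
    return sum(xs, Fraction(0)) / len(xs)

def variance(xs):
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])

mu = mean([Fraction(v) for v in values])
sigma2 = variance([Fraction(v) for v in values])

# All len(values)**n equally likely samples drawn with replacement.
sample_means = [mean([Fraction(v) for v in s]) for s in product(values, repeat=n)]

mean_of_xbar = mean(sample_means)
var_of_xbar = variance(sample_means)

print(mean_of_xbar == mu, var_of_xbar == sigma2 / n)
```

Both comparisons hold exactly here because the enumeration covers the full sampling distribution of the mean rather than a simulation of it.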
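Let $(X_1, X_2, \dots, X_n)$ be a random sample of size $n \ge 2$ from a distribution $N(\mu, \sigma^2)$. Then the
sample mean satisfies $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ and
$\frac{nS^2}{\sigma^2} \sim \chi^2(n-1)$, where $S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is the sample variance (with divisor n).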
Problem
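1. Let $\bar{X}$ be the mean of a sample of size 5 from a normal distribution with mean $\mu = 0$ and variance 125.
Find c so that $\Pr(\bar{X} < c) = 0.9$.
Sol: $n = 5$, $X \sim N(0, 125)$, so $\bar{X} \sim N\left(0, \frac{125}{5}\right) = N(0, 25)$.
$\Pr(\bar{X} < c) = 0.9$
$\Pr\left(Z < \frac{c}{5}\right) = 0.9 \Rightarrow \frac{c}{5} = 1.28 \Rightarrow c = 6.40$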
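The value of c can be cross-checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; it works with the unrounded 0.9 quantile of the standard normal (about 1.2816) rather than the table value 1.28, so it gives roughly 6.41 instead of 6.40.

```python
from statistics import NormalDist

n = 5
pop_variance = 125

# Sampling distribution of the mean: N(0, 125/5) = N(0, 25), so sd = 5.
xbar = NormalDist(mu=0, sigma=(pop_variance / n) ** 0.5)

# c is the 0.9 quantile of that distribution.
c = xbar.inv_cdf(0.9)
print(round(c, 2))
```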
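2. If $\bar{X}$ is the mean of a random sample of size n from a normal distribution with mean $\mu$ and variance
100, find n so that $\Pr\{\mu - 5 < \bar{X} < \mu + 5\} = 0.954$.
Sol: $X \sim N(\mu, 100)$, so $\bar{X} \sim N\left(\mu, \frac{100}{n}\right)$. Write $\sigma_{\bar{X}} = \sqrt{100/n}$ and $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$.
$\Pr\left\{\frac{\mu - 5 - \mu}{\sigma_{\bar{X}}} < Z < \frac{\mu + 5 - \mu}{\sigma_{\bar{X}}}\right\} = 0.954$
$\Pr\left\{\frac{-5}{\sigma_{\bar{X}}} < Z < \frac{5}{\sigma_{\bar{X}}}\right\} = 0.954 \Rightarrow 2\Phi\left(\frac{5}{\sigma_{\bar{X}}}\right) = 1.954 \Rightarrow \frac{5}{\sigma_{\bar{X}}} = 2 \Rightarrow \sigma_{\bar{X}} = 2.5 = \sqrt{\frac{100}{n}}$
$\Rightarrow n = 16$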
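As a check on this answer, the smallest n that reaches the required probability can be found by direct search. The helper name `coverage` is hypothetical; the sketch only assumes the normal sampling distribution used in the solution.

```python
from statistics import NormalDist

def coverage(n, sigma=10.0, half_width=5.0):
    """Pr{mu - half_width < Xbar < mu + half_width} for a normal population."""
    z = half_width / (sigma / n ** 0.5)
    return 2 * NormalDist().cdf(z) - 1

# Increase n until the interval probability reaches 0.954.
n = 1
while coverage(n) < 0.954:
    n += 1

print(n, round(coverage(n), 4))
```

The search stops at n = 16, where the probability is about 0.9545; with n = 15 it is still below 0.954.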
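3. Let $S^2$ be the variance of a random sample of size 6 from the distribution $N(\mu, 12)$. Find
$\Pr\{2.3 < S^2 < 22.2\}$.
Sol: $\bar{X} \sim N\left(\mu, \frac{12}{6}\right)$ and $\frac{nS^2}{\sigma^2} \sim \chi^2(n - 1)$
$\Rightarrow \frac{6S^2}{12} = \frac{S^2}{2} \sim \chi^2(5)$
$\Rightarrow \Pr\{2.3 < S^2 < 22.2\} = \Pr\left\{1.15 < \frac{S^2}{2} < 11.1\right\} \approx 0.95 - 0.05 = 0.90$, using the $\chi^2(5)$ table.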
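The chi-square probability in this problem can also be approximated without tables by integrating the $\chi^2(5)$ density numerically. The sketch below uses only the standard library; the composite Simpson rule and the step count are illustrative choices, not part of the original solution.

```python
import math

def chi2_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n, sigma2 = 6, 12
# Rescale the bounds on S^2 to bounds on n*S^2/sigma^2: 1.15 and 11.1.
lo, hi = 2.3 * n / sigma2, 22.2 * n / sigma2
p = simpson(lambda x: chi2_pdf(x, n - 1), lo, hi)
print(round(p, 3))
```

The result is close to 0.90, in line with the table values 0.05 and 0.95 for the two bounds.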
Central limit theorem: Let $X_1, X_2, \dots, X_n$ denote a random sample of size n from a distribution that has mean $\mu$ and variance
$\sigma^2$. Then the random variable
$$Y_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}$$
has a limiting distribution N(0,1).
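Proof (sketch, assuming the moment generating function of X exists in a neighbourhood of 0): Let $S_n = X_1 + X_2 + \dots + X_n$. Then
$E(S_n) = E(X_1 + X_2 + \dots + X_n) = n\mu$
$V(S_n) = V(X_1 + X_2 + \dots + X_n) = n\sigma^2$
$$Y_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}}$$
$M_{Y_n}(t) = E(e^{tY_n})$. Since the random variables are independent and identically distributed we get
$$M_{Y_n}(t) = \prod_{i=1}^{n} E\left(e^{t\left(\frac{X_i - \mu}{\sigma\sqrt{n}}\right)}\right) = \left[E\left(e^{t\left(\frac{X - \mu}{\sigma\sqrt{n}}\right)}\right)\right]^n$$
Expanding the exponential to second order, with $E(X - \mu) = 0$ and $E(X - \mu)^2 = \sigma^2$, gives $E\left(e^{t\left(\frac{X - \mu}{\sigma\sqrt{n}}\right)}\right) = 1 + \frac{t^2}{2n} + o\left(\frac{1}{n}\right)$, so
$$\lim_{n \to \infty} M_{Y_n}(t) = e^{t^2/2}$$
which is the mgf of a random variable with distribution N(0,1) $\Rightarrow$ $Y_n$ has limiting distribution N(0,1).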
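The convergence of the mgf can be observed numerically in a case where it has a closed form. The sketch below assumes, purely for illustration, that each $X_i$ is Exponential(1), so $\mu = \sigma = 1$ and $E(e^{s(X - 1)}) = e^{-s}/(1 - s)$ for $s < 1$; the function name is hypothetical.

```python
import math

def mgf_standardized_exp(t, n):
    """mgf of Y_n at t when the X_i are Exponential(1) (mu = sigma = 1).

    Each factor is exp(-s) / (1 - s) with s = t / sqrt(n), valid for s < 1;
    the n-th power is taken on the log scale for numerical stability.
    """
    s = t / math.sqrt(n)
    return math.exp(n * (-s - math.log1p(-s)))

t = 1.0
limit = math.exp(t ** 2 / 2)  # mgf of N(0,1) at t
for n in (10, 100, 10_000, 1_000_000):
    print(n, mgf_standardized_exp(t, n), limit)
```

The printed values approach $e^{1/2} \approx 1.6487$ as n grows, which is the behaviour the limit in the proof describes.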
Problem
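1. Compute an approximate probability that the mean of a random sample of size 15 from a distribution having
pdf $f(x) = \begin{cases} 3x^2, & 0 < x < 1 \\ 0, & \text{elsewhere} \end{cases}$ is between 3/5 and 4/5.
Sol: $n = 15$, $\mu = \int_0^1 3x^3\,dx = \frac{3}{4}$, $\sigma^2 = \int_0^1 3x^4\,dx - \mu^2 = \frac{3}{5} - \frac{9}{16} = \frac{3}{80}$
$\bar{X}$ is approximately $N\left(\frac{3}{4}, \frac{3}{80 \cdot 15}\right) = N\left(\frac{3}{4}, \frac{1}{400}\right)$, so its standard deviation is $\frac{1}{20}$.
$\Pr\left\{\frac{3}{5} < \bar{X} < \frac{4}{5}\right\} = \Pr\left\{\frac{3/5 - 3/4}{1/20} < Z < \frac{4/5 - 3/4}{1/20}\right\} = \Pr\{-3 < Z < 1\} = \Phi(1) - (1 - \Phi(3)) = 0.840$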
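The moments and the normal approximation in this problem can be reproduced with a short Python sketch. It uses the fact that $E(X^k) = 3/(k + 3)$ for the density $3x^2$ on (0, 1) and divides the variance by the sample size before standardizing.

```python
from fractions import Fraction
from statistics import NormalDist

n = 15
# Moments of f(x) = 3x^2 on (0, 1): E(X) = 3/4 and E(X^2) = 3/5.
mu = Fraction(3, 4)
sigma2 = Fraction(3, 5) - mu ** 2  # 3/80

# Approximate sampling distribution of the mean: N(3/4, sigma2 / n).
xbar = NormalDist(float(mu), float(sigma2 / n) ** 0.5)
p = xbar.cdf(4 / 5) - xbar.cdf(3 / 5)
print(sigma2, round(p, 3))
```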
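2. Let $\bar{X}$ be the mean of a random sample of size 100 from a distribution which is $\chi^2(50)$. Compute an
approximate value of $\Pr\{49 < \bar{X} < 51\}$.
Sol: $X \sim \chi^2(r)$ with $r = 50$ degrees of freedom, so $\mu = r = 50$ and $\sigma^2 = 2r = 100$; the sample size is $n = 100$.
$\bar{X}$ is approximately $N\left(\mu, \frac{\sigma^2}{n}\right) = N(50, 1)$, so
$\Pr\{49 < \bar{X} < 51\} \approx \Pr\{-1 < Z < 1\} = 2\Phi(1) - 1 = 0.683$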
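3. Let $\bar{X}$ be the mean of a random sample of size 128 from a Gamma distribution with $r = 2$, $\alpha = 1/4$.
Approximate $\Pr\{7 < \bar{X} < 9\}$.
Sol: $\mu = \frac{r}{\alpha} = 8$, $\sigma^2 = \frac{r}{\alpha^2} = 32$, so $\frac{\sigma}{\sqrt{n}} = \sqrt{\frac{32}{128}} = 0.5$
$\Pr\{7 < \bar{X} < 9\} = \Pr\left\{\frac{7 - 8}{0.5} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{9 - 8}{0.5}\right\} = \Pr\{-2 < Z < 2\} = 2\Phi(2) - 1 = 0.954$.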
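A small Monte Carlo experiment gives an independent check on this approximation. The sketch assumes that $\alpha$ in the problem is a rate parameter (so the mean is $r/\alpha = 8$); `random.gammavariate` takes a shape and a scale, hence the scale $1/\alpha = 4$. The seed and the number of repetitions are arbitrary.

```python
import random
from statistics import fmean

rng = random.Random(12345)   # fixed seed for repeatability
shape, rate = 2, 1 / 4       # r and alpha in the problem's notation
n, reps = 128, 4000

hits = 0
for _ in range(reps):
    # random.gammavariate takes (shape, scale), with scale = 1 / rate
    xbar = fmean(rng.gammavariate(shape, 1 / rate) for _ in range(n))
    hits += 7 < xbar < 9

print(hits / reps)  # should land near 0.954
```

The simulated proportion fluctuates from seed to seed, so it should be read as rough agreement with 0.954 rather than as an exact value.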
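4. Suppose that $X_j$, $j = 1, 2, \dots, 50$, are independent random variables, each having a Poisson distribution
with $\alpha = 0.03$. Let $S = X_1 + X_2 + \dots + X_{50}$. Using the central limit theorem, evaluate $\Pr(S \ge 3)$.
Sol: $S = X_1 + X_2 + \dots + X_{50}$
$E(S) = 50 \times 0.03 = 1.5$, $V(S) = 50 \times 0.03 = 1.5$
$Y = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \frac{S - E(S)}{\sqrt{V(S)}} = \frac{S - 1.5}{\sqrt{1.5}}$ is approximately $N(0,1)$
$\Pr(S \ge 3) = 1 - \Pr(S < 3) \approx 1 - \Pr\left(\frac{S - 1.5}{\sqrt{1.5}} < \frac{3 - 1.5}{\sqrt{1.5}}\right) = 1 - \Phi(\sqrt{1.5}) = 1 - \Phi(1.2247) \approx 0.110$
(a table lookup at $z = 1.23$ gives 0.1093).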
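It can be instructive to compare the central limit theorem value with the exact probability. The sketch below relies on a fact not stated in the problem: a sum of independent Poisson variables is itself Poisson, here with parameter 1.5, so $\Pr(S \ge 3)$ can be computed directly.

```python
import math
from statistics import NormalDist

lam = 50 * 0.03  # E(S) = V(S) = 1.5

# Normal approximation from the central limit theorem (no continuity correction).
clt = 1 - NormalDist(lam, math.sqrt(lam)).cdf(3)

# Exact value, using S ~ Poisson(1.5): 1 - Pr(S = 0) - Pr(S = 1) - Pr(S = 2).
exact = 1 - sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(3))

print(round(clt, 4), round(exact, 4))
```

The two numbers differ noticeably (about 0.11 against about 0.19), which suggests that the normal approximation is rough when the mean of S is this small.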
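5. A distribution with unknown mean $\mu$ has variance 1.5. Find how large a sample should be taken
from the distribution in order that the probability will be at least 0.95 that the sample mean will
be within 0.5 of the population mean.
Sol: We need $\Pr\{\mu - 0.5 < \bar{X} < \mu + 0.5\} \ge 0.95$. Solve the boundary case with $\sigma = \sqrt{1.5}$ and $Y = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, which is approximately $N(0,1)$:
$\Pr\left\{\frac{\mu - 0.5 - \mu}{\sigma/\sqrt{n}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{\mu + 0.5 - \mu}{\sigma/\sqrt{n}}\right\} = 0.95$
$\Pr\left\{\frac{-0.5\sqrt{n}}{\sigma} < Y < \frac{0.5\sqrt{n}}{\sigma}\right\} = 0.95$
$2\Phi\left(\frac{0.5\sqrt{n}}{\sigma}\right) = 1.95$
$\frac{0.5\sqrt{n}}{\sigma} = 1.96$
$n = \left(\frac{1.96}{0.5}\right)^2 \times 1.5 = 23.05$
Since n must be a whole number, a sample of at least 24 is required.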
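The sample-size formula from this solution can be evaluated directly. This is a minimal sketch under the same normal approximation; it uses the unrounded 0.975 quantile of the standard normal instead of the table value 1.96 and rounds up, since a sample size must be a whole number.

```python
import math
from statistics import NormalDist

variance = 1.5
half_width = 0.5
confidence = 0.95

# Two-sided requirement: 2 * Phi(z) - 1 = confidence, so z is the 0.975 quantile.
z = NormalDist().inv_cdf((1 + confidence) / 2)  # about 1.96

n_exact = variance * (z / half_width) ** 2
n = math.ceil(n_exact)  # round up to the next whole number
print(round(n_exact, 2), n)
```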