Lecture 3 PDF
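[Figure: probability distribution P(x) and normal density f(x)]
Normal probability density function:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$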
Example 4.1: Let X1, X2, and X3 be independent random variables that are
normally distributed with means and variances as shown.
Mean Variance
X1 10 1
X2 20 2
X3 30 3
Example 4.3: Let X1, X2, X3, and X4 be independent random variables that are normally
distributed with means and variances as shown. Find the mean and variance of
Q = X1 - 2X2 + 3X3 - 4X4 + 5
Mean Variance
X1 12 4
X2 -5 2
X3 8 5
X4 10 1
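A minimal sketch of the calculation for Example 4.3, assuming the third term of Q involves X3. For independent variables, the mean of a linear combination is the same combination of the means plus the constant, and the variance is the sum of the squared coefficients times the variances. The variable names are illustrative only.

```python
# Means and variances of X1..X4, copied from the table in Example 4.3.
means = [12, -5, 8, 10]
variances = [4, 2, 5, 1]

# Coefficients and constant of Q = X1 - 2*X2 + 3*X3 - 4*X4 + 5.
coeffs = [1, -2, 3, -4]
constant = 5

# E(Q) = sum(a_i * mu_i) + constant
mean_q = sum(a * m for a, m in zip(coeffs, means)) + constant

# V(Q) = sum(a_i^2 * var_i); valid because the X_i are independent.
# The additive constant does not change the variance.
var_q = sum(a * a * v for a, v in zip(coeffs, variances))

print(mean_q, var_q)
```

Because the X's are independent and normal, Q is itself normal with this mean and variance.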
The standard normal random variable, Z, is the normal random variable with mean $\mu = 0$ and standard deviation $\sigma = 1$: $Z \sim N(0, 1^2)$.
[Figure: standard normal density f(z) for z from -5 to 5, with $\mu = 0$ and $\sigma = 1$]
Finding Probabilities of the Standard
Normal Distribution: $P(0 \le Z \le 1.56)$
Standard Normal Probabilities (area under the curve between 0 and z)
z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
Reading row 1.5 and column .06 gives $P(0 \le Z \le 1.56) = 0.4406$.
[Figure: Standard Normal Distribution, density f(z) with the area between 0 and 1.56 marked]
Finding Probabilities of the Standard
Normal Distribution: $P(1 \le Z \le 2)$
To find $P(1 \le Z \le 2)$:
1. Find the table area for 2.00 and add the left half of the curve: $P(Z \le 2.00) = .5 + .4772 = .9772$
2. Find the table area for 1.00 and add the left half of the curve: $P(Z \le 1.00) = .5 + .3413 = .8413$
3. Subtract: $P(1 \le Z \le 2) = .9772 - .8413 = 0.1359$
[Figure: Standard Normal Distribution, density f(z) with the area between 1 and 2 shaded]
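The two table lookups above can be cross-checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative and not part of the lecture.

```python
from statistics import NormalDist

# Standard normal random variable: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

# P(0 <= Z <= 1.56): cumulative area up to 1.56 minus the left half (0.5).
p_0_to_156 = Z.cdf(1.56) - Z.cdf(0)

# P(1 <= Z <= 2): difference of two cumulative areas.
p_1_to_2 = Z.cdf(2) - Z.cdf(1)

print(round(p_0_to_156, 4), round(p_1_to_2, 4))
```

The printed table stores areas measured from 0, whereas `cdf` returns the area from minus infinity, which is why 0.5 is subtracted in the first case.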
Finding Values of the Standard Normal
Random Variable: $P(0 \le Z \le z) = 0.40$
To find z such that $P(0 \le Z \le z) = 0.40$, look in the body of the table of standard normal probabilities for the area closest to 0.40. The closest entry, 0.3997, lies in row 1.2 and column .08, so $z \approx 1.28$.
Look to the table of standard normal probabilities to find that:
$P(-2.575 \le Z \le 2.575) = .99$
Total area in center = .99
Area in center left = .495
Area in center right = .495
Area in left tail = .005
Area in right tail = .005
$z_{.005} = 2.575$ and $-z_{.005} = -2.575$
[Figure: standard normal density f(z) with the central area of .99 between $-z_{.005}$ and $z_{.005}$]
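Going from an area back to a z value is the inverse of the table lookup. The sketch below uses `NormalDist.inv_cdf` from the standard library to recover the z value that leaves .005 in the right tail, and the z value for an area of 0.40 between 0 and z. The names are illustrative. Note that software returns about 2.576, while reading the table by eye gives 2.575.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# z with area .005 to its right, i.e. cumulative area .995 to its left.
z_005 = Z.inv_cdf(1 - 0.005)

# z with area 0.40 between 0 and z, i.e. cumulative area 0.5 + 0.40.
z_for_040 = Z.inv_cdf(0.5 + 0.40)

print(round(z_005, 3), round(z_for_040, 2))
```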
The Transformation of Normal
Random Variables
The area within $k\sigma$ of the mean is the same for all normal random variables. So an area
under any normal distribution is equivalent to an area under the standard normal. In this
example: $P(40 \le X \le 60) = P(-1 \le Z \le 1)$, since $\mu = 50$ and $\sigma = 10$.
The transformation of X to Z:
$$Z = \frac{X - \mu_x}{\sigma_x}$$
Transformation
(1) Subtraction: $(X - \mu_x)$
(2) Division by $\sigma_x$: $(X - \mu_x)/\sigma_x$
[Figure: Normal Distribution with $\mu = 50$, $\sigma = 10$ (x from 0 to 100) mapped onto the Standard Normal Distribution]
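$X \sim N(160, 30^2)$
$$P(100 \le X \le 180) = P\left(\frac{100 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{180 - \mu}{\sigma}\right)$$
$$= P\left(\frac{100 - 160}{30} \le Z \le \frac{180 - 160}{30}\right)$$
$$= P(-2 \le Z \le 0.6666)$$
$$= 0.4772 + 0.2475 = 0.7247$$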
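Using the Normal Transformation
Example
$X \sim N(127, 22^2)$
$$P(X < 150) = P\left(\frac{X - \mu}{\sigma} < \frac{150 - \mu}{\sigma}\right)$$
$$= P\left(Z < \frac{150 - 127}{22}\right)$$
$$= P(Z < 1.045)$$
$$= 0.5 + 0.3520 = 0.8520$$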
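The two transformation examples can be reproduced in a few lines. This is only a sketch with the standard library: `NormalDist(mu, sigma)` standardizes internally, so the results should agree with the table-based answers up to rounding and interpolation. Variable names are illustrative.

```python
from statistics import NormalDist

# X ~ N(160, 30^2): P(100 <= X <= 180)
X1 = NormalDist(mu=160, sigma=30)
p_between = X1.cdf(180) - X1.cdf(100)

# The same probability through the explicit transformation Z = (X - mu) / sigma.
Z = NormalDist()
p_between_via_z = Z.cdf((180 - 160) / 30) - Z.cdf((100 - 160) / 30)

# X ~ N(127, 22^2): P(X < 150)
X2 = NormalDist(mu=127, sigma=22)
p_below = X2.cdf(150)

print(round(p_between, 4), round(p_below, 4))
```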
That is, $P(X > 70)$ can be found easily because 70 is 2 standard deviations above the mean
of X: $70 = \mu + 2\sigma$. $P(X > 70)$ is equivalent to $P(Z > 2)$, an area under the standard normal
distribution.
Example: $X \sim N(5.7, 0.5^2)$
$P(X > x) = 0.01$ and $P(Z > 2.33) \approx 0.01$
$x = \mu + z\sigma = 5.7 + (2.33)(0.5) = 6.865$

z     .02     .03     .04
2.2   0.4868  0.4871  0.4875
2.3   0.4898  0.4901  0.4904
2.4   0.4922  0.4925  0.4927

[Figure: normal density f(x) for x from 3.2 to 8.2 with right-tail area 0.01 beyond $x_{.01} = 6.865$, above the standard normal density with $z_{.01} = 2.33$]

Example: $X \sim N(2450, 400^2)$
$P(a < X < b) = 0.95$ and $P(-1.96 < Z < 1.96) \approx 0.95$
$x = \mu \pm z\sigma = 2450 \pm (1.96)(400) = 2450 \pm 784 = (1666, 3234)$
$P(1666 < X < 3234) = 0.95$

z     .05     .06     .07
1.8   0.4678  0.4686  0.4693
1.9   0.4744  0.4750  0.4756
2.0   0.4798  0.4803  0.4808

[Figure: normal density f(x) for x from 1000 to 4000 with area .0250 in each tail, above the standard normal density with the area between -1.96 and 1.96]
Finding Values of a Normal Random
Variable, Given a Probability
Normal Distribution: $\mu = 2450$, $\sigma = 400$
1. Draw pictures of the normal distribution in question and of the standard normal distribution.
2. Shade the area corresponding to the desired probability.
3. From the table of the standard normal distribution, find the z value or values.
4. Use the transformation from z to x to get value(s) of the original random variable.
[Figure: normal density f(x) for x from 1000 to 4000 and Standard Normal Distribution f(z), each with a central area of .9500 split into .4750 on either side of the mean]
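The four-step procedure (draw, shade, look up z, transform back to x) can be sketched numerically for the distribution with mean 2450 and standard deviation 400 and a central probability of 0.95. The code replaces the table lookup in step 3 with `NormalDist.inv_cdf`; names such as `central_prob` are illustrative assumptions.

```python
from statistics import NormalDist

mu, sigma = 2450, 400
central_prob = 0.95

# Step 3: the z value that leaves (1 - 0.95) / 2 = 0.025 in each tail.
tail = (1 - central_prob) / 2
z = NormalDist().inv_cdf(1 - tail)

# Step 4: transform from z back to x with x = mu +/- z * sigma.
low = mu - z * sigma
high = mu + z * sigma

print(round(z, 2), round(low), round(high))
```

With the table value z = 1.96 the endpoints are exactly 1666 and 3234; the software value of z differs only in the fourth decimal place.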
The Normal Approximation of Binomial
Distribution
Binomial Distribution: n = 7, p = 0.50, $P(x \le 4) = 0.7734$
Normal Distribution: $\mu = 3.5$, $\sigma = 1.323$, $P(x < 4.5) = 0.7749$
[Figure: binomial probabilities P(x) for x = 0 to 7 beside the approximating normal density f(x)]
Cumulative probabilities:
Distribution   x        P(X <= x)
Normal         4.5000   0.7751
Binomial       4.00     0.7734
[Figure: a second binomial distribution, x = 0 to 11, with its approximating normal density]
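Approximating a Binomial Probability
Using the Normal Distribution
$$P(a \le X \le b) \approx P\left(\frac{a - np}{\sqrt{np(1-p)}} \le Z \le \frac{b - np}{\sqrt{np(1-p)}}\right)$$
for n large ($n \ge 50$) and p not too close to 0 or 1.00
or, with a continuity correction:
$$P(a \le X \le b) \approx P\left(\frac{a - 0.5 - np}{\sqrt{np(1-p)}} \le Z \le \frac{b + 0.5 - np}{\sqrt{np(1-p)}}\right)$$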
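To see the continuity correction at work, the sketch below compares the exact binomial probability P(X ≤ 4) for n = 7, p = 0.50 with the normal approximation evaluated at 4.5. These are the values from the lecture's illustration; n = 7 is far below the n ≥ 50 guideline, so this is a demonstration of the mechanics rather than a recommended use. Variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 7, 0.50
b = 4  # upper limit: P(X <= 4)

# Exact binomial probability: sum of C(n, k) p^k (1-p)^(n-k) for k = 0..b.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(b + 1))

# Normal approximation with mean np and standard deviation sqrt(np(1-p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Continuity correction: evaluate the normal cdf at b + 0.5.
approx = NormalDist(mu, sigma).cdf(b + 0.5)

print(round(exact, 4), round(approx, 4))
```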
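$$P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
or
$$P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
[Figure: Standard Normal Distribution: 95% Interval, density f(z) for z from -4 to 4]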
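Confidence Interval for $\mu$ when $\sigma$ is Known
(Continued)
Before sampling, there is a 0.95 probability that the interval
$$\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$$
will include the sample mean (and 5% that it will not).
Whenever the sample mean falls inside that interval, the interval $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ in turn includes $\mu$.
That is, $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ is a 95% confidence interval for $\mu$.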
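A 95% Interval around the Population
Mean
[Figure: Sampling Distribution of the Mean, density $f(\bar{x})$ with 95% of the area between $\mu - 1.96\frac{\sigma}{\sqrt{n}}$ and $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ and 2.5% in each tail; below it, sample means $\bar{x}$ marked as falling inside the interval, 2.5% below it, and 2.5% above it]
Approximately 95% of sample means can be expected to fall within the
interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}},\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.
Conversely, about 2.5% can be expected to be above $\mu + 1.96\frac{\sigma}{\sqrt{n}}$ and
2.5% can be expected to be below $\mu - 1.96\frac{\sigma}{\sqrt{n}}$.
So 5% can be expected to fall outside the interval $\left[\mu - 1.96\frac{\sigma}{\sqrt{n}},\ \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$.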
A 95% confidence interval for $\mu$ when $\sigma$ is known and sampling is
done from a normal population, or a large sample is used, is:
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$$
The quantity $1.96\frac{\sigma}{\sqrt{n}}$ is often called the margin of error or the
sampling error.
For example, if n = 25, $\sigma = 20$, and $\bar{x} = 122$, a 95% confidence interval is:
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 122 \pm 1.96\frac{20}{\sqrt{25}} = 122 \pm (1.96)(4) = 122 \pm 7.84 = [114.16, 129.84]$$
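A $(1-\alpha)100\%$ Confidence Interval for $\mu$
We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard
normal curve. $(1-\alpha)$ is called the confidence coefficient. $\alpha$ is called the error
probability.
$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = (1 - \alpha)$$
[Figure: standard normal density with area $\alpha/2$ in each tail, beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$]
$(1-\alpha)100\%$ Confidence Interval:
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$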
Critical Values of z and Levels of
Confidence
$(1-\alpha)$   $\alpha/2$   $z_{\alpha/2}$
0.99           0.005        2.576
[Figure: Standard Normal Distribution, density f(z) with the central area $(1-\alpha)$ and tail areas $\alpha/2$; further panels compare standard normal densities f(z) and sampling densities f(x)]
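Critical values for other confidence levels follow the same definition: $z_{\alpha/2}$ is the point with area $\alpha/2$ to its right. The sketch below computes them for a few commonly used levels; the chosen list of levels is an assumption for illustration, and only the 0.99 row is taken from the lecture.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Confidence coefficients (1 - alpha) to evaluate; an illustrative selection.
levels = [0.80, 0.90, 0.95, 0.99]

critical = {}
for level in levels:
    alpha = 1 - level
    # z_{alpha/2} has cumulative area 1 - alpha/2 to its left.
    critical[level] = Z.inv_cdf(1 - alpha / 2)

for level, z in critical.items():
    print(f"{level:.2f}  alpha/2 = {(1 - level) / 2:.3f}  z = {z:.3f}")
```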
• Shrimmy, the shrimp hatchery, is planning to invest heavily in the black tiger
breed. As part of the decision, the company wants to estimate the average
amount of black tiger shrimp a family of four would need per month. A
random sample of n = 100 families is obtained, and in this sample the average
amount of shrimp in pounds per month is 6.5 and the population standard
deviation is known to be 3.2. Construct a 95% confidence interval for the
average amount of shrimp consumed by the entire population of families of 4.
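One way to work the shrimp example is to apply the known-sigma interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ directly. The helper `z_interval` below is a hypothetical name introduced for this sketch, not something defined in the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for level = 0.95
    margin = z * sigma / sqrt(n)             # margin of error
    return xbar - margin, xbar + margin


# Shrimp example: n = 100 families, sample mean 6.5 lb/month, sigma = 3.2.
low, high = z_interval(xbar=6.5, sigma=3.2, n=100)
print(round(low, 3), round(high, 3))
```

The margin of error is about 1.96 × 3.2 / 10 ≈ 0.627 pounds, so the interval runs from roughly 5.87 to 7.13 pounds per month.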
Confidence Interval or Interval Estimate for $\mu$
When $\sigma$ Is Unknown - The t Distribution
If the population standard deviation, $\sigma$, is not known, replace
$\sigma$ with the sample standard deviation, s. If the population is
normal, the resulting statistic:
$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$
has a t distribution with (n - 1) degrees of freedom.
• The t is a family of bell-shaped and symmetric
distributions, one for each number of degrees of
freedom.
• The expected value of t is 0.
• For df > 2, the variance of t is df/(df-2). This is
greater than 1, but approaches 1 as the number
of degrees of freedom increases. The t is flatter
and has fatter tails than does the standard
normal.
• The t distribution approaches a standard normal
as the number of degrees of freedom increases.
[Figure: densities of the standard normal and of t with df = 20 and df = 10]
df    t0.100  t0.050  t0.025  t0.010  t0.005
9     1.383   1.833   2.262   2.821   3.250
...
17    1.333   1.740   2.110   2.567   2.898
18    1.330   1.734   2.101   2.552   2.878
19    1.328   1.729   2.093   2.539   2.861
20    1.325   1.725   2.086   2.528   2.845
21    1.323   1.721   2.080   2.518   2.831
22    1.321   1.717   2.074   2.508   2.819
23    1.319   1.714   2.069   2.500   2.807
24    1.318   1.711   2.064   2.492   2.797
25    1.316   1.708   2.060   2.485   2.787
26    1.315   1.706   2.056   2.479   2.779
27    1.314   1.703   2.052   2.473   2.771
28    1.313   1.701   2.048   2.467   2.763
29    1.311   1.699   2.045   2.462   2.756
30    1.310   1.697   2.042   2.457   2.750
40    1.303   1.684   2.021   2.423   2.704
60    1.296   1.671   2.000   2.390   2.660
120   1.289   1.658   1.980   2.358   2.617
∞     1.282   1.645   1.960   2.326   2.576
[Figure: t density f(t) with Area = 0.025 in each tail]
Whenever $\sigma$ is not known (and the population is
assumed normal), the correct distribution to use is
the t distribution with n-1 degrees of freedom.
Note, however, that for large degrees of freedom,
the t distribution is approximated well by the Z
distribution.
Example
A blood analyst wants to estimate the average AFP index of the Vietnamese
people. A random blood sample of size 15 yields an average of $\bar{x} = 10.37$ ng/ml
and a standard deviation of s = 3.5 ng/ml. Assuming a normal population of
the AFP values, give a 95% confidence interval for the average AFP value
of the Vietnamese population. (AFP = alpha-fetoprotein)

df    t0.100  t0.050  t0.025  t0.010  t0.005
1     3.078   6.314   12.706  31.821  63.657
...
13    1.350   1.771   2.160   2.650   3.012
14    1.345   1.761   2.145   2.624   2.977
15    1.341   1.753   2.131   2.602   2.947
...

The critical value of t for df = (n - 1) = (15 - 1) = 14 and a right-tail area of 0.025 is:
$$t_{0.025} = 2.145$$
The corresponding confidence interval or interval estimate is:
$$\bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}} = 10.37 \pm 2.145\frac{3.5}{\sqrt{15}} = 10.37 \pm 1.94 = [8.43, 12.31]$$
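The arithmetic of the AFP example can be checked with a short sketch. The Python standard library has no t quantile function, so the critical value 2.145 is taken from the t table for df = 14 rather than computed; the variable names are illustrative.

```python
from math import sqrt

n = 15          # sample size
xbar = 10.37    # sample mean, ng/ml
s = 3.5         # sample standard deviation, ng/ml

# Critical value t_{0.025} for df = n - 1 = 14, read from the t table.
t_crit = 2.145

# Margin of error and interval: xbar +/- t * s / sqrt(n).
margin = t_crit * s / sqrt(n)
low, high = xbar - margin, xbar + margin

print(round(margin, 2), round(low, 2), round(high, 2))
```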
Large Sample Confidence Intervals for
the Population Mean
Example: An environmental scientist wants to estimate the average amount of NOx in a given region. A random sample
of 100 data points gives $\bar{x} = 357.60$ ppm and s = 140.00 ppm. Give a 95% confidence interval for $\mu$, the average
amount of NOx in any sample taken.
$$\bar{x} \pm z_{0.025}\frac{s}{\sqrt{n}} = 357.60 \pm 1.96\frac{140.00}{\sqrt{100}} = 357.60 \pm 27.44 = [330.16, 385.04]$$
Exercise 1
Exercise 2
Large-Sample Confidence Intervals
for the Population Proportion, p
For estimating p, a sample is considered large enough when both $np$ and $nq$ are greater
than 5.
A large-sample $(1-\alpha)100\%$ confidence interval for the population proportion, p:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where the sample proportion, $\hat{p}$, is equal to the number of successes in the sample, x,
divided by the number of trials (the sample size), n, and $\hat{q} = 1 - \hat{p}$.
Example
A marketing research firm wants to estimate the share that foreign companies
have in the American market for certain products. A random sample of 100
consumers is obtained, and it is found that 34 people in the sample are users
of foreign-made products; the rest are users of domestic products. Give a
95% confidence interval for the share of foreign products in this market.
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}}$$
$$= 0.34 \pm (1.96)(0.04737)$$
$$= 0.34 \pm 0.0928$$
$$= [0.2472, 0.4328]$$
Thus, the firm may be 95% confident that foreign manufacturers control
anywhere from 24.72% to 43.28% of the market.
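The proportion interval for the marketing example can be reproduced as follows. This is a sketch that uses the table value z = 1.96 and also checks the large-sample rule that both n·p̂ and n·q̂ exceed 5; the variable names are illustrative.

```python
from math import sqrt

n = 100   # sample size
x = 34    # users of foreign-made products in the sample

p_hat = x / n        # sample proportion
q_hat = 1 - p_hat

# Large-sample rule of thumb from the lecture: both counts greater than 5.
large_enough = n * p_hat > 5 and n * q_hat > 5

z = 1.96                                  # z_{alpha/2} for 95% confidence
std_error = sqrt(p_hat * q_hat / n)       # about 0.04737
margin = z * std_error
low, high = p_hat - margin, p_hat + margin

print(large_enough, round(low, 4), round(high, 4))
```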
Exercise 3
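Confidence Intervals for the Population Variance:
The Chi-Square ($\chi^2$) Distribution
[Figure: chi-square densities $f(\chi^2)$ for df = 30 and df = 50, for $\chi^2$ from 0 to 100; surviving caption fragment: "... as the degrees of freedom increase."]
A $(1-\alpha)100\%$ confidence interval for the population variance, $\sigma^2$:
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right]$$
where $\chi^2_{\alpha/2}$ is the value of the chi-square distribution with n - 1 degrees of freedom
that cuts off an area $\alpha/2$ to its right and $\chi^2_{1-\alpha/2}$ is the value of the distribution that
cuts off an area of $\alpha/2$ to its left (equivalently, an area of $1 - \alpha/2$ to its right).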
In an automated process, a machine fills cans of coffee. If the average amount
filled is different from what it should be, the machine may be adjusted to
correct the mean. If the variance of the filling process is too high, however, the
machine is out of control and needs to be repaired. Therefore, from time to
time regular checks of the variance of the filling process are made. This is done
by randomly sampling filled cans, measuring their amounts, and computing the
sample variance. A random sample of 30 cans gives an estimate $s^2 = 18{,}540$.
Give a 95% confidence interval for the population variance, $\sigma^2$.
$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right] = \left[\frac{(30-1)18540}{45.7},\ \frac{(30-1)18540}{16.0}\right] = [11765, 33604]$$
Example (continued)
Area in Right Tail
df    .995   .990   .975   .950   .900   .100   .050   .025   .010   .005
...
28    12.46  13.56  15.31  16.93  18.94  37.92  41.34  44.46  48.28  50.99
29    13.12  14.26  16.05  17.71  19.77  39.09  42.56  45.72  49.59  52.34
30    13.79  14.95  16.79  18.49  20.60  40.26  43.77  46.98  50.89  53.67
[Figure: Chi-Square Distribution: df = 29, density $f(\chi^2)$ for $\chi^2$ from 0 to 70 with central area 0.95 and area 0.025 in each tail]
$\chi^2_{0.975} = 16.05$ and $\chi^2_{0.025} = 45.72$
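A short sketch of the variance interval for the coffee-can example. The standard library has no chi-square quantile function, so the critical values come from the df = 29 row of the table. The worked solution rounds them to 45.7 and 16.0, and the code uses those rounded values to reproduce the stated interval; using 45.72 and 16.05 instead shifts the endpoints slightly. Variable names are illustrative.

```python
n = 30          # number of cans sampled
s2 = 18540      # sample variance

# Chi-square critical values for df = n - 1 = 29 (table: 45.72 and 16.05),
# rounded here as in the worked solution.
chi2_upper = 45.7   # cuts off area 0.025 to its right
chi2_lower = 16.0   # cuts off area 0.025 to its left

# Interval: [(n-1)s^2 / chi2_{alpha/2}, (n-1)s^2 / chi2_{1-alpha/2}]
low = (n - 1) * s2 / chi2_upper
high = (n - 1) * s2 / chi2_lower

print(round(low), round(high))
```

Note that the larger critical value produces the lower endpoint, because it appears in the denominator.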
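Sample-Size Determination
For example: A $(1-\alpha)100\%$ Confidence Interval for $\mu$: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
The half-width $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the Bound, B.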
Exercise 4
Sample Size and Standard Error
The sample size determines the bound of a statistic, since the standard
error of a statistic shrinks as the sample size increases:
[Figure: two sampling distributions of a statistic, one for sample size n and a narrower one for sample size 2n, each marked with its standard error]
Minimum Sample Size: Mean and
Proportion
Minimum required sample size in estimating the population
mean, $\mu$:
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2}$$
Bound of estimate:
$$B = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
A microbiologist wants to conduct an experiment to estimate the average amount
of micro-organisms in the water of a popular river. He plans to determine the
average amount of micro-organisms to within 120 µg/ml, with 95% confidence.
From past records, an estimate of the population standard deviation is
s = 400 µg/ml. What is the minimum required sample size?
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} = \frac{(1.96)^2(400)^2}{120^2} = 42.684 \rightarrow 43$$
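The sample-size calculation can be written as a small function. `min_sample_size` is a hypothetical helper name for this sketch; it rounds up with `math.ceil` because a fractional observation is not possible and rounding down would leave the bound unmet.

```python
from math import ceil


def min_sample_size(z, sigma, bound):
    """Smallest n with z * sigma / sqrt(n) <= bound, i.e. n >= z^2 sigma^2 / B^2."""
    return ceil((z ** 2) * (sigma ** 2) / (bound ** 2))


# Microbiologist example: 95% confidence (z = 1.96), sigma estimate 400, B = 120.
raw_n = (1.96 ** 2) * (400 ** 2) / (120 ** 2)   # about 42.684 before rounding
n_required = min_sample_size(z=1.96, sigma=400, bound=120)

print(round(raw_n, 3), n_required)
```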