
Probability Distributions n Special

This document provides an introduction to statistical analysis focusing on probability distributions, random variables, and their properties. It covers discrete and continuous random variables, probability density functions, cumulative distribution functions, expected values, and variance. Additionally, it discusses special distributions like binomial, Poisson, and normal distributions commonly used in real-life applications.

Uploaded by

glynnkuwale99
Copyright
© All Rights Reserved

Introduction to Statistical Analysis

Probability Distributions

The preceding unit reviewed the concepts of probability as well as
techniques for calculating the probability of an event.
A probability distribution shows the relationship between the values
of a random variable and their probabilities of occurrence.

Topics in the unit


In this unit you will learn about concepts related to probability
distributions. These include:
discrete and continuous random variables;
probability density functions;
cumulative distribution functions;
the expected value and variance of a random variable.

Random Variables
Let X denote the number of heads obtained in an experiment
where a coin is tossed five times.
The possible values X can assume are 0, 1, 2, . . . , 5.
This is an example of a numerically valued variable.
It is called a random variable.

Definition
A random variable is a real-valued function defined on a sample
space S.
Here are some examples of random variables:
A die is rolled twice. Let Y represent the sum of the points
obtained in this experiment. Then the possible values of Y are
2, 3, 4, . . . , 12; the variable takes only particular real values.
This variable is called a discrete random variable.
Let W be the weight of a normal newborn baby, where the
minimum and maximum weights are 1.8 kg and 3.5 kg respectively.
Then the possible values of W lie in the interval [1.8, 3.5].

Notes
In the first example, the random variable takes values that are
distinct from each other. This variable is called a discrete
random variable.
W can assume any value in the interval [1.8, 3.5]. It is called
a continuous random variable.
If each value or range of values is accompanied by its
probability, then we have a probability distribution.
A formula may also be used to determine probabilities. The
formula is called a probability density function (pdf). These notes
use the term pdf for discrete random variables as well, where the
formula gives P(X = x) directly.

Discrete Random Variable


Let X be a random variable on S = {x_1, x_2, . . . , x_n} such
that P(X = x_1) = p_1, P(X = x_2) = p_2, . . ., P(X = x_n) = p_n.
Then
$$\sum_{\text{all } x} P(X = x) = 1$$
or equivalently,
$$\sum_{i=1}^{n} p_i = 1.$$
This result is not unexpected, since the probability of the sample
space is unity.

Example
The probability density function of a discrete random variable
Y is given by
$$P(Y = y) = \begin{cases} ky, & \text{for } y = 12, 13, 14 \\ 0, & \text{otherwise} \end{cases}$$
(a) Find the value of k.
(b) Form a probability distribution.

Solution
Since the function P(Y = y) = ky is a probability distribution
function, it follows that
$$\sum_{y=12}^{14} P(Y = y) = 1.$$
Thus 12k + 13k + 14k = 1. Therefore, k = 1/39. So
$$P(Y = y) = \frac{y}{39}, \quad y = 12, 13, 14.$$
The probability distribution is given in the table below.
Y = y        12      13      14
P(Y = y)     12/39   13/39   14/39
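
The normalisation step in this solution can be checked with exact fractions. The following is a minimal Python sketch, not part of the original notes; the names support, k and pmf are chosen here for illustration only.

```python
from fractions import Fraction

# Values that Y can take; the unnormalised weight of each value is k * y.
support = [12, 13, 14]

# The probabilities must sum to 1, so k = 1 / (12 + 13 + 14).
k = Fraction(1, sum(support))

# Probability distribution P(Y = y) = k * y, kept as exact fractions.
pmf = {y: k * y for y in support}

print(k)                  # 1/39
print(sum(pmf.values()))  # 1
```

Note that Fraction reduces automatically, so 13/39 is stored as 1/3.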

Continuous Random Variables


A continuous random variable is a theoretical representation
of a continuous variable such as time.
Let a continuous random variable X have a pdf f(x) defined over the
range a to c only. Then, for a < u ≤ x ≤ v < c,
P(u ≤ x ≤ v) is the area under the curve shown in Figure 1.
This area is given by
$$P(u \le x \le v) = \int_{u}^{v} f(x)\,dx.$$

Figure 1: Curve illustrating P(u ≤ x ≤ v )



Example
A continuous random variable X has a pdf f(x) = x/6 + C for
0 ≤ x ≤ 3.
(a) Evaluate C.
(b) Find P(1 ≤ x ≤ 2).

Solution

(a) Since f(x) is a pdf, the total probability over 0 ≤ x ≤ 3 is 1:
$$P(0 \le x \le 3) = \int_0^3 f(x)\,dx = \int_0^3 \left(\frac{x}{6} + C\right) dx = \frac{3}{4} + 3C = 1.$$
Thus C = 1/12.
(b)
$$P(1 \le x \le 2) = \int_1^2 f(x)\,dx = \int_1^2 \left(\frac{x}{6} + \frac{1}{12}\right) dx = \frac{1}{3}.$$

Have you ever imagined that we can draw a histogram of a
pdf? Consider the following activity.
For the continuous distribution defined by the pdf
$$f(x) = \frac{1}{80}(3x^2 + 4), \quad \text{for } 0 \le x \le 4,$$
(a) verify that f(x) is a valid pdf, so that the variable X involved
is a random variable;
(b) construct a probability distribution table for the ranges 0 to 1,
1 to 2, 2 to 3 and 3 to 4;
(c) illustrate the results in the form of a histogram.
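
One way to approach this activity is to integrate the pdf exactly over each range. The sketch below is only an illustration under that assumption: it uses the antiderivative (x^3 + 4x)/80 of the given pdf, and the names cdf, total and table are hypothetical helpers, not notation from the notes.

```python
from fractions import Fraction

def cdf(x):
    """Integral of f(t) = (3t^2 + 4)/80 from 0 to x, i.e. (x^3 + 4x)/80."""
    return Fraction(x**3 + 4 * x, 80)

# Total probability over the whole range 0 <= x <= 4; a valid pdf gives 1.
total = cdf(4) - cdf(0)

# Probability of each unit-width range. Because every range has width 1,
# these probabilities are also the heights of the histogram bars.
table = {(a, a + 1): cdf(a + 1) - cdf(a) for a in range(4)}

print("total probability:", total)
for (a, b), p in table.items():
    print(f"{a} to {b}: {float(p):.4f}")
```

The four probabilities come out as 5/80, 11/80, 23/80 and 41/80, so the histogram rises from left to right, following the shape of the pdf.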

The Cumulative Distribution Function


If the pdf of a random variable X is known, we can sometimes
find an algebraic expression for P(X ≤ a), where a ∈ S.
Alternatively, we may set up a cumulative table numerically.
Definition
The cumulative distribution function of a random variable X is
denoted by F(b) and is defined as P(X ≤ b).
For a discrete random variable, this expression means that
$$F(b) = \sum_{x=-\infty}^{b} P(X = x).$$

The cumulative distribution function (or distribution function
for short) gives the probability that a random variable will
assume a value less than or equal to some specified value, b.
For X continuous, with pdf f(x), the distribution function,
F(b), is given by
$$F(b) = \int_{-\infty}^{b} f(x)\,dx$$
where b is a value within the sample space of X.
Take a look at some examples involving cumulative distribution
functions.

Example
A discrete random variable X has a pdf, P(X = x), given by
$$P(X = x) = \begin{cases} \frac{1}{25}(x^2 - 4x + 5), & x = 0, 1, 2, 3, 4 \text{ and } 5 \\ 0, & \text{otherwise} \end{cases}$$
Determine (a) F(1), (b) F(4), (c) P(2 ≤ X ≤ 4).

Solution
The probability density function is given by
P(X = x) = (1/25)(x^2 − 4x + 5). Thus
$$F(1) = P(X = 0) + P(X = 1).$$
Substituting in the formula for the pdf we find P(X = 0) = 0.2,
P(X = 1) = 0.08, P(X = 2) = 0.04, and so on. Thus
F(1) = 0.28.
Similar computations give F(4) = 0.6.
Now,
$$P(2 \le X \le 4) = P(X = 2) + P(X = 3) + P(X = 4) = F(4) - F(1) = 0.32.$$
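
The cumulative table mentioned earlier can also be built numerically by accumulating the probabilities. This is a small Python sketch of that idea for the example above; the variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

# P(X = x) = (x^2 - 4x + 5) / 25 for x = 0, 1, ..., 5, as exact fractions.
pmf = [Fraction(x * x - 4 * x + 5, 25) for x in range(6)]

# Running sums give the distribution function: cdf[b] = F(b) = P(X <= b).
cdf = list(accumulate(pmf))

f1 = cdf[1]                  # F(1)
f4 = cdf[4]                  # F(4)
p_2_to_4 = cdf[4] - cdf[1]   # P(2 <= X <= 4) = F(4) - F(1)

print(float(f1), float(f4), float(p_2_to_4))
```

The last entry of cdf equals 1, which confirms that the given formula is a valid probability function.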

F(b) = 1 if b is the maximum value that can be taken by the
random variable.
E.g. for a discrete random variable X, the distribution
function is given by F(x) = kx, x = 1, 2, 3. Find (a) the
value of the constant k, (b) P(X < 3), (c) the probability
distribution of X.

Solution
We are given the cumulative distribution function. We need
to find the pdf before we can obtain a probability distribution.
Recall that F(b) = 1 if b is the maximum value a random
variable can attain. So F(3) = 3k = 1. Thus k = 1/3.
$$P(X < 3) = P(X \le 2) = F(2) = 2k = \frac{2}{3}.$$
To find the probability distribution of X, we require P(X = x).
P(X = 1) = F(1) = 1/3 and P(X = 2) = F(2) − F(1) = 1/3.
Similarly, P(X = 3) = F(3) − F(2) = 1/3.
The probability distribution is summarised in the following table.
x       1     2     3
P(x)    1/3   1/3   1/3

Example
A continuous random variable Y has a pdf of the form
f(y) = A(y^2 + 4) for 0 ≤ y ≤ 1.
(a) Calculate the value of the constant A.
(b) Derive an expression for the distribution function of this
random variable.

Solution
If f(y) is a pdf, it follows that
$$\int_0^1 A(y^2 + 4)\,dy = 1.$$
Thus (13/3)A = 1.
Therefore, A = 3/13. The pdf of Y is
$$f(y) = \frac{3}{13}(y^2 + 4).$$
Let the distribution function be F(b). Then, for 0 ≤ b ≤ 1,
$$F(b) = \int_0^b \frac{3}{13}(y^2 + 4)\,dy = \frac{b}{13}(b^2 + 12).$$

Expected Value of a Random Variable


E.g. Suppose it is claimed that only 3% of all adults in a
certain location are infected with HIV/AIDS. How many adults
would you expect to be infected in a group of 12000 people?
The value found in this activity, 0.03 × 12000 = 360, is known as
the expected value (or the expectation) of the number of those infected.

Definition
Let X be a random variable with pdf P(X = x) (or f(x) for
continuous X). Then the expected value of X, denoted by E[X], is
given by
$$E[X] = \begin{cases} \sum_{x} x\,P(X = x), & \text{for discrete } X \\ \int_{x} x f(x)\,dx, & \text{for continuous } X \end{cases}$$

This quantity can take any real value; it need not be one of the
possible values of X.
The quantity E[X] is usually denoted by the symbol µ.
E.g. Find E[X] = µ for the following distributions:
(a)
x           0     1     2     3     4     5
P(X = x)    0.01  0.08  0.23  0.36  0.21  0.11
(b) f(x) = 1 − x/2 for 0 ≤ x ≤ 2.

Solution
(a) The random variable in this case is discrete. So
$$E[X] = \sum_{\text{all } x} x\,P(X = x) = 0 \times 0.01 + 1 \times 0.08 + \dots + 5 \times 0.11 = 3.01.$$
(b) For this continuous random variable, the expectation is given by
$$E[X] = \int_{\text{all } x} x f(x)\,dx = \int_0^2 x\left(1 - \frac{x}{2}\right) dx = \frac{2}{3}. \tag{1}$$

Example
A committee of three is to be chosen from four girls and seven
boys. Find the expected number of girls on the committee, if
the members of the committee are chosen at random.
Solution
Let Y be the number of girls on the committee. Then
$$P(Y = y) = \frac{\binom{4}{y}\binom{7}{3-y}}{\binom{11}{3}} \quad \text{for } y = 0, 1, 2, 3.$$

We need to find E[Y].
Start with a probability distribution for Y.
y           0      1      2      3
P(Y = y)    0.212  0.509  0.255  0.024
So the expectation of Y is given by
$$E[Y] = 0 \times 0.212 + 1 \times 0.509 + 2 \times 0.255 + 3 \times 0.024 = 1.091.$$
Computations are simplified by first setting up the probability
distribution of the given random variable.
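
The committee example can be reproduced by counting combinations directly. The following Python sketch is an illustration only; girls, boys, size, pmf and expected are names assumed here, and math.comb is the standard-library binomial coefficient.

```python
from fractions import Fraction
from math import comb

girls, boys, size = 4, 7, 3
total_committees = comb(girls + boys, size)  # number of ways to pick any 3 of 11

# P(Y = y): choose y girls and (size - y) boys, divided by all committees.
pmf = {
    y: Fraction(comb(girls, y) * comb(boys, size - y), total_committees)
    for y in range(size + 1)
}

# Expected number of girls: sum of y * P(Y = y).
expected = sum(y * p for y, p in pmf.items())

for y, p in pmf.items():
    print(y, round(float(p), 3))
print(expected, round(float(expected), 3))
```

Working with exact fractions gives E[Y] = 12/11, which rounds to the value 1.091 found from the table.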

Variance of a random variable X


Definition: The variance of a probability distribution
associated with a random variable X, denoted by Var[X], is
given by
$$\mathrm{Var}[X] = E[(X - \mu)^2]$$
where µ = E[X].

Example
(a) If X is a random variable (discrete or continuous), show that
Var[X] = E[X^2] − µ^2.
(b) For the discrete random variable X defined by the pdf
x           0     1     2     3     4
P(X = x)    1/16  3/16  7/16  3/16  2/16
find: (1) E[X], (2) E[X^2] and (3) Var[X].

Solution
The expectation of X is found to be E[X] = 2.125.
Now, E[X^2] = Σ x^2 P(X = x). This gives a value of 5.625.
Using the formula proved in the activity above, we find
$$\mathrm{Var}[X] = E[X^2] - \mu^2 = 5.625 - 4.516 = 1.109.$$
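
The two expressions for the variance can be compared numerically on this example. The sketch below is a hedged illustration, with names (pmf, mean, second_moment and so on) chosen here rather than taken from the notes; it computes Var[X] both from the definition and from the shortcut formula.

```python
from fractions import Fraction

# Probability distribution from the example, in sixteenths.
pmf = {0: Fraction(1, 16), 1: Fraction(3, 16), 2: Fraction(7, 16),
       3: Fraction(3, 16), 4: Fraction(2, 16)}

mean = sum(x * p for x, p in pmf.items())               # E[X]
second_moment = sum(x * x * p for x, p in pmf.items())  # E[X^2]

# Variance from the definition E[(X - mu)^2] ...
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
# ... and from the shortcut E[X^2] - mu^2.
var_shortcut = second_moment - mean ** 2

print(float(mean), float(second_moment))
print(float(var_definition), float(var_shortcut))
```

Both routes give 71/64, about 1.109, in agreement with the solution.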

SOME SPECIAL DISTRIBUTIONS


Some distributions are used more often than others in real-life situations.
Such distributions include the binomial, Poisson and normal
distributions, among others.
These distributions are used to model many phenomena.

Binomial Distribution
It is concerned with experiments for which there are two possible
outcomes: success or failure.
Definition
A random variable R is said to follow a binomial distribution if
$$P(R = r) = \begin{cases} \binom{n}{r}\pi^r(1 - \pi)^{n-r}, & r = 0, 1, \dots, n \\ 0, & \text{otherwise} \end{cases}$$
where π is the probability of success on each of the n trials.

Conditions that give rise to a Binomial Distribution


The conditions that give rise to a binomial distribution are:
a fixed number of independent trials, n;
two possible outcomes on each trial: a “success” or a “failure”;
the probability of “success”, π, is the same on each trial, and the
probability of “failure” is q = 1 − π;
the variable of interest is the number of successes in the n trials.

Example
The probability that a student in a particular school is
left-handed is 0.3. If a sample of 10 students is drawn from this
school, find the probability that, of the 10 students,
(a) none is left-handed;
(b) all are left-handed;
(c) exactly four are left-handed.

Solution
Here π = 0.3 and 1 − π = 0.7.
The probability that none is left-handed is
$$\binom{10}{0}(0.3)^0(0.7)^{10} \approx 0.02825.$$
The probability that all are left-handed is
$$\binom{10}{10}(0.3)^{10}(0.7)^0 = 0.3^{10} \approx 5.9 \times 10^{-6}.$$
The probability that exactly four are left-handed is
$$\binom{10}{4}(0.3)^4(0.7)^6 \approx 0.2.$$
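
These three probabilities follow directly from the binomial formula and are easy to reproduce. The following is a minimal Python sketch; binomial_pmf is a helper name assumed here for illustration, not a function from the notes.

```python
from math import comb

def binomial_pmf(r, n, pi):
    """P(R = r) for a binomial distribution with n trials and success probability pi."""
    if r < 0 or r > n:
        return 0.0
    return comb(n, r) * pi**r * (1 - pi) ** (n - r)

n, pi = 10, 0.3
p_none = binomial_pmf(0, n, pi)    # none left-handed
p_all = binomial_pmf(10, n, pi)    # all left-handed
p_four = binomial_pmf(4, n, pi)    # exactly four left-handed

print(round(p_none, 5), p_all, round(p_four, 4))
```

Summing binomial_pmf(r, n, pi) over r = 0, 1, ..., n returns 1 up to rounding error, which is a quick sanity check on the helper.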

Mean and Variance of a Binomial distribution


Let X be a random variable which follows a binomial
distribution with parameters n and π. Then
the expected value of X is given by E[X] = nπ;
the variance of X, Var[X], is given by
$$\mathrm{Var}[X] = n\pi(1 - \pi).$$

Example
A large consignment of shelled peas is known to have 2%
discoloured. If a lot of 10000 is dispatched for tinning, find
the expected number of discoloured peas in the lot, and also
the variance.
Solution
Let X be the number of discoloured peas in a lot. Then X
follows a binomial distribution with parameters n = 10000 and
π = 0.02. Thus

E [X ] = nπ
= 10000 × 0.02 = 200

and the variance of X is

Var[X ] = nπ(1 − π)
= 10000 × 0.02 × 0.98 = 196

The Hypergeometric distribution

In the binomial distribution we assume that the sample is taken
from a large population with a proportion π of successes.
It is like drawing a sample with replacement, with a constant
probability of success on each draw.
In practice, however, sampling is often done without replacement
from a finite population.
Let's motivate this with an example.

Example
Suppose we have 17 students on a study programme with 8 female
students. If a committee of 6 is to be formed from this programme,
what is the chance the committee will include 3 female students?

Solution
From counting techniques, we note that there are
$$\binom{8}{3} \times \binom{9}{3}$$
different ways of forming a committee with exactly 3 female students.
But the total number of committees, regardless of who is
included, is $\binom{17}{6}$.
The required chance is the ratio of these two counts.
This gives rise to the hypergeometric distribution.

Definition
Let N be the population size and k be the number of items in the
population with a particular attribute. Suppose further that n is
the sample size and the random variable X denotes the number of
items in the sample with the attribute. Then
$$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$$
is the probability density function of a hypergeometric variable X.
In this case, we say that X ∼ Hyper(N, k, n).

Example
Refer to the example in the introduction and find the probability that, in
a committee of 6, three will be ladies.

Solution
Let X be the number of ladies in the committee. Then
X ∼ Hyper(17, 8, 6), so that
$$P(X = 3) = \frac{\binom{8}{3}\binom{9}{3}}{\binom{17}{6}} = 0.3801.$$

Example
A product is shipped in lots of 20. A sampling plan calls for
getting a sample of 5 items from each lot and rejecting the lot if
more than one defective is observed. If a lot contains four
defectives, what is the probability a lot will be rejected?

Solution
Let X be the number of defective items in a sample of 5 items.
Then X ∼ Hyper(20, 4, 5). The lot is rejected if X ≥ 2.
Thus, we require
$$P(X \ge 2) = 1 - (P(X = 0) + P(X = 1)) = 1 - \frac{\binom{16}{5}\binom{4}{0} + \binom{16}{4}\binom{4}{1}}{\binom{20}{5}} = 0.2487.$$
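
Both hypergeometric examples so far can be checked with a short helper built on binomial coefficients. This is a sketch under the notation Hyper(N, k, n) used above; the function name hypergeom_pmf and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, k, n):
    """P(X = x) when n items are drawn without replacement from N items,
    of which k have the attribute of interest."""
    if x < 0 or x > n or x > k or n - x > N - k:
        return Fraction(0)
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# Committee of 6 from 17 students (8 female): exactly 3 female members.
p_committee = hypergeom_pmf(3, 17, 8, 6)

# Lot of 20 with 4 defectives, sample of 5: reject if 2 or more defectives.
p_reject = 1 - hypergeom_pmf(0, 20, 4, 5) - hypergeom_pmf(1, 20, 4, 5)

print(round(float(p_committee), 4), round(float(p_reject), 4))
```

The guard at the top of the helper returns zero for impossible outcomes, such as asking for more items with the attribute than the population contains.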

Theorem
Let X ∼ Hyper(N, k, n). Then
$$E[X] = n\frac{k}{N}$$
$$\mathrm{Var}[X] = n\left(\frac{k}{N}\right)\left(\frac{N-k}{N}\right)\left(\frac{N-n}{N-1}\right)$$

Example
A doctor examines 15 out-patients at a small clinic. Four of these
are malaria cases. What are the expectation and variance of the number
of malaria patients among the 5 that are earmarked for a transfer to a
central hospital?

Solution
Let Y be the number of malaria cases in the sample. Then
Y ∼ Hyper(N = 15, k = 4, n = 5). Thus
$$E[Y] = 5 \times \frac{4}{15} = \frac{4}{3} = 1.3333.$$
Similarly, we find
$$\sigma^2 = 5 \times \frac{4}{15} \times \frac{15-4}{15} \times \frac{15-5}{15-1} = 0.6984.$$

Poisson Distribution
A discrete random variable X is said to have a Poisson
distribution if it has a pdf of the form
$$P(X = x) = e^{-\mu}\frac{\mu^x}{x!} \quad \text{for } x = 0, 1, 2, 3, \dots \tag{2}$$
where µ is the parameter.

The following statements describe what is called a Poisson
process.
Occurrences of events are independent, i.e. the occurrence of an
event in an interval of space or time has no effect on the
probability of a second occurrence of the event in the same, or
any other, interval.
An infinite number of occurrences of the event must be
possible in an interval.
The probability of a single occurrence of the event in a given
interval is proportional to the length of the interval.
In a small portion of an interval, the probability of more than one
occurrence of the event is negligible.

Example
In a study of a certain aquatic organism, a large number of
samples were taken from a pond, and the number of organisms in
each sample was counted. The average number of organisms per
sample was found to be two. Assuming that the number of
organisms follows a Poisson distribution, find the probability that
the next sample taken will contain one or fewer organisms.

Solution
Let Y be the number of organisms in a sample. Note that in
this case µ = 2. So the probability of interest is
$$F(1) = P(Y \le 1) = P(Y = 0) + P(Y = 1).$$
Now, P(Y = 0) = 0.1353 and P(Y = 1) = 0.2707. Thus
$$P(Y \le 1) = 0.406.$$

Example
If accidents occur on a certain stretch of road at a rate of 3 per
month, find the probability that
(a) exactly two accidents occur in a month;
(b) fourteen or fifteen accidents occur in four months.

Solution
(a) For two accidents in a month we consider a random variable X with
parameter µ = 3. So
$$P(X = 2) = \frac{e^{-3} \times 3^2}{2!} = 0.2240.$$
(b) In four months, we expect 12 accidents, so µ = 12. Thus
$$P(X = 14 \text{ or } 15) = \frac{e^{-12} \times 12^{14}}{14!} + \frac{e^{-12} \times 12^{15}}{15!} = 0.16288.$$
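
The Poisson probabilities in this solution come straight from formula (2), with the parameter rescaled to the length of the interval. A minimal Python sketch follows; poisson_pmf and the variable names are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson distribution with parameter mu."""
    return exp(-mu) * mu**x / factorial(x)

rate_per_month = 3

# (a) exactly two accidents in one month: mu = 3.
p_two = poisson_pmf(2, rate_per_month)

# (b) fourteen or fifteen accidents in four months: mu = 4 * 3 = 12.
mu_four_months = 4 * rate_per_month
p_14_or_15 = poisson_pmf(14, mu_four_months) + poisson_pmf(15, mu_four_months)

print(round(p_two, 4), round(p_14_or_15, 5))
```

The only change between the two parts is the value of mu, which reflects the assumption that the expected count is proportional to the length of the interval.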

Remark
If X ∼ Poiss(µ), then (1) E[X] = µ and (2) Var[X] = µ.
A distinctive property of the Poisson distribution is that its mean
is equal to its variance.
We can fit theoretical Poisson frequencies to a given
frequency distribution.
A binomial distribution with parameters n and π can be
approximated by a Poisson distribution with parameter
µ = nπ if n is large (more than 50) and π is small (usually
less than 1/10).

Example
In the mass production of an article, 500 samples, each of 30
articles, are examined. The numbers of defective items in the
samples are shown in the following table:
No. of defectives    0     1     2     3     4     Total
Frequency            309   142   40    8     1     500
(a) Find the mean number of defective items in each sample.
(b) Show that the distribution is approximately that of a Poisson
distribution with this mean.
(c) Calculate the variances of both distributions.

Solution
Let X be the number of defectives in a sample. Then
$$\bar{X} = \frac{\sum f_i x_i}{\sum f_i} = \frac{250}{500} = 0.5.$$
Let µ = X̄ = 0.5. Then we generate the following Poisson frequencies,
f_p = 500 × P(X = x):
X     P(X = x)    f_p
0     0.6065      303
1     0.3033      152
2     0.0758      38
3     0.0126      6
4     0.0016      1

Note that the observed frequencies are close to the “Poisson
frequencies”; the largest difference is in the second frequency.
Now, the variance of the frequency distribution is given by
$$S^2 = \frac{\sum f_i (x_i - \bar{X})^2}{500}.$$
Computations show that Σ f_i x_i^2 = 390. Thus
$$S^2 = \frac{390 - 500 \times 0.5^2}{500} = 0.53.$$
This value is close to the mean found above, which is also the variance
of the fitted Poisson distribution. This supports the view
that the data can be modelled by a Poisson distribution.
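
The whole fitting procedure (sample mean, Poisson frequencies, sample variance) can be sketched in a few lines of Python. This is an illustration under the assumption that the fitted parameter is the sample mean, as in the solution; the names observed, fitted and so on are chosen here.

```python
from math import exp, factorial

# Observed frequency of each number of defectives, from the table.
observed = {0: 309, 1: 142, 2: 40, 3: 8, 4: 1}
n_samples = sum(observed.values())

# Sample mean and variance of the frequency distribution.
mean = sum(x * f for x, f in observed.items()) / n_samples
variance = sum(f * x * x for x, f in observed.items()) / n_samples - mean**2

# Poisson frequencies with mu set equal to the sample mean.
fitted = {x: n_samples * exp(-mean) * mean**x / factorial(x) for x in observed}

print("mean:", mean, "variance:", round(variance, 2))
for x in observed:
    print(x, observed[x], round(fitted[x]))
```

A mean and variance that are close together, as here, are consistent with a Poisson model, although on their own they do not prove it.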

Normal Distribution
The Normal distribution is the most important continuous
distribution in statistics.
It approximates very well data that occur frequently in real-life
situations, e.g. weight, height and age.
A continuous random variable X is said to have a Normal
distribution if its pdf is of the form
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \quad \text{for } -\infty < x < \infty$$
where −∞ < µ < ∞ and σ > 0 are parameters.
For X ∼ N(µ, σ^2), E[X] = µ and Var[X] = σ^2.

Characteristics of the Normal Distribution


Symmetrical about µ. As shown in Figure 2, the curve on
either side of µ is a mirror image of the other side.

Figure 2: Graph of a Normal distribution



The mean, the median and the mode are all equal.
Different values of µ shift the graph of the distribution along
the x-axis.
Different values of σ determine the degree of peakedness or
flatness of the graph of the distribution. See the figures below.


Figure 3: Normal distributions: different means



The Standard Normal Distribution


The Normal distribution having µ = 0 and σ = 1 is called the Standard
Normal distribution.
A standard Normal random variable is usually denoted by Z, so that
Z ∼ N(0, 1).
The pdf of Z is given by
$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right).$$
The distribution function of a standard Normal random variable Z
is denoted by Φ(b). That is,
$$\Phi(b) = F(b) = P(Z < b) = \int_{-\infty}^{b} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz.$$

Tables have been drawn for a wide range of values of b. For
most tables, the area given is shown in the figure.
Usually such tables only give values of F (b) for b ≥ 0.
Probabilities like P(Z < −1.25) and P(Z ≥ 2.33) have to be
changed into probabilities of the form P(Z < b) by using the
symmetric properties of the distribution.

Figure 5: Area given by F (b)



In general, if Z ∼ N(0, 1) and Φ(b) = P(Z < b), then for
a ∈ [0, ∞):
P(Z > a) = 1 − P(Z < a) = 1 − Φ(a);
Φ(−a) = 1 − Φ(a);
and for any b and c (positive or negative) with b < c,
P(b < Z < c) = Φ(c) − Φ(b).

Example
If Z ∼ N(0, 1), find
(a) P(Z < 0),
(b) P(Z < 2.37),
(c) P(Z > 1.56),
(d) P(Z < −1.65),
(e) P(Z > −2.808).

Solution

Figure 6:

(a) From tables, P(Z < 0) = 0.5000. This is not unexpected:
since −∞ < Z < ∞ and the distribution is symmetric about 0, half
the area must be above 0 and the other half below 0.

(b) The shaded area represents P(Z < 2.37), which is 0.9911.

(c) The area in question is to the right of 1.56. So
P(Z > 1.56) = 1 − P(Z < 1.56). Now, from tables,
P(Z < 1.56) = 0.9406. Thus P(Z > 1.56) = 0.0594.

(d)
$$P(Z < -1.65) = P(Z > 1.65) = 1 - P(Z < 1.65) = 0.0495.$$

(e) The area of interest is to the right of −2.808.
By symmetry, this is the same as P(Z < 2.808), which is
0.9975.
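
When tables are not to hand, Φ(b) can be evaluated with the error function, using the standard identity Φ(b) = (1 + erf(b/√2))/2. The following Python sketch applies it to the five parts above; the function name phi is an assumption made for illustration, and the results agree with the table values to about four decimal places.

```python
from math import erf, sqrt

def phi(b):
    """Standard Normal distribution function, Phi(b) = P(Z < b)."""
    return 0.5 * (1.0 + erf(b / sqrt(2.0)))

p_a = phi(0)             # P(Z < 0)
p_b = phi(2.37)          # P(Z < 2.37)
p_c = 1 - phi(1.56)      # P(Z > 1.56)
p_d = phi(-1.65)         # P(Z < -1.65), no symmetry step needed in code
p_e = 1 - phi(-2.808)    # P(Z > -2.808)

for label, p in zip("abcde", (p_a, p_b, p_c, p_d, p_e)):
    print(label, round(p, 4))
```

Unlike most printed tables, this helper accepts negative arguments directly, so the symmetry rules are only needed for hand calculation.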

Example
If Z ∼ N(0, 1), find the values of
(a) P(0.829 < Z < 1.843),
(b) P(−1.764 < Z < 2.083),
(c) P(Z < −1.97 or Z > 2.5),
(d) P(|Z| > 2.326) and
(e) P(|Z| < 1.78).

Solution
(a) P(0.829 < Z < 1.843) = P(Z < 1.843) − P(Z < 0.829). From
tables, P(Z < 1.843) = 0.9673 and P(Z < 0.829) = 0.7964.
Thus P(0.829 < Z < 1.843) = 0.9673 − 0.7964 = 0.1709.

Figure 7:

(b) The shaded area is P(−1.764 < Z < 2.083). Now,
$$P(-1.764 < Z < 2.083) = P(Z < 2.083) - P(Z < -1.764) = P(Z < 2.083) + P(Z < 1.764) - 1.$$
From tables, P(Z < 2.083) ≈ 0.9814 and P(Z < 1.764) ≈ 0.9611, so
P(−1.764 < Z < 2.083) ≈ 0.9425.

(c) The area required is to the left of −1.97 and to the right of
2.5. Now
$$P(Z < -1.97 \text{ or } Z > 2.5) = P(Z < -1.97) + P(Z > 2.5) = (1 - P(Z < 1.97)) + (1 - P(Z < 2.5)) = 2 - (P(Z < 1.97) + P(Z < 2.5)).$$
From tables, P(Z < 1.97) = 0.9756 and P(Z < 2.5) = 0.9938.
Thus P(Z < −1.97 or Z > 2.5) = 2 − 1.9694 = 0.0306.

(d)
$$P(|Z| > 2.326) = P(Z < -2.326 \text{ or } Z > 2.326) = (1 - P(Z < 2.326)) + (1 - P(Z < 2.326)) = 2(1 - P(Z < 2.326)).$$
From tables, P(Z < 2.326) = 0.9900. Thus P(|Z| > 2.326) = 0.0200.

(e) P(|Z| < 1.78) is the area between −1.78 and 1.78. Thus
$$P(|Z| < 1.78) = P(Z < 1.78) - P(Z < -1.78) = P(Z < 1.78) - [1 - P(Z < 1.78)] = 2P(Z < 1.78) - 1.$$
But P(Z < 1.78) = 0.96246. Therefore,
P(|Z| < 1.78) = 0.9249.

Transforming any Normal Random Variable to Standard Normal


We have seen how to find probabilities for a standard
normal distribution.
A special technique has to be used to compute probabilities for
non-standard normal random variables.
A normal random variable X ∼ N(µ, σ^2) is transformed to a standard
normal variable, with mean 0 and standard deviation 1, using the formula
$$Z = \frac{X - \mu}{\sigma}.$$

The new random variable has mean 0 and a standard
deviation of 1.
Let Z = (X − µ)/σ. Confirm that E[Z] = 0 and Var[Z] = 1.
Let's look at some examples on non-standard normal variables.
If X ∼ N(50, 20), i.e. µ = 50 and σ^2 = 20, find
(a) P(X > 60.3);
(b) P(X > 48.9);
(c) P(X < 53.5) and
(d) P(X < 47.3).

Solution
(a) To find P(X > 60.3) we proceed as follows.
$$P(X > 60.3) = P\left(Z > \frac{60.3 - 50}{\sqrt{20}}\right) = P(Z > 2.303) = 1 - P(Z \le 2.303).$$
From tables, P(Z < 2.303) = 0.9894. Thus
P(X > 60.3) = 0.0106.
(b) Now
$$P(X > 48.9) = P\left(Z > \frac{48.9 - 50}{\sqrt{20}}\right).$$
The right-hand member becomes P(Z > −0.246). But this is the same as
P(Z < 0.246) = 0.5972. Therefore, P(X > 48.9) = 0.5972.
Similar computations give
(c) P(X < 53.5) = 0.7831 and
(d) P(X < 47.3) = 0.2730.
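
The standardising step can be wrapped in a small function so that all four parts are computed the same way. The sketch below is illustrative only; normal_cdf is a name assumed here, and it takes the variance (not the standard deviation) as its third argument to match the N(µ, σ^2) notation of these notes.

```python
from math import erf, sqrt

def normal_cdf(x, mu, var):
    """P(X < x) for X ~ N(mu, var), computed via Z = (X - mu) / sigma."""
    z = (x - mu) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, var = 50, 20

p_a = 1 - normal_cdf(60.3, mu, var)   # P(X > 60.3)
p_b = 1 - normal_cdf(48.9, mu, var)   # P(X > 48.9)
p_c = normal_cdf(53.5, mu, var)       # P(X < 53.5)
p_d = normal_cdf(47.3, mu, var)       # P(X < 47.3)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d, 4))
```

Small differences in the last decimal place compared with table look-ups are expected, because the tables round z before the look-up.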

Example
If X ∼ N(84, 12), find
(a) P(X < 79 or X > 92);
(b) P(76 < X < 82);
(c) P(|X − 84| > 2.9).

Solution

(a)
$$P(X < 79 \text{ or } X > 92) = P(X < 79) + P(X > 92) = P(Z < -1.443) + P(Z > 2.309) = 2 - (P(Z < 1.443) + P(Z < 2.309)).$$
From tables, P(Z < 1.443) = 0.9255 and
P(Z < 2.309) = 0.9895. This gives
2 − (P(Z < 1.443) + P(Z < 2.309)) = 0.085. Thus
P(X < 79 or X > 92) = 0.085.
(b) P(76 < X < 82) = P(X < 82) − P(X < 76). Now
$$P(X < 82) = P\left(Z < \frac{82 - 84}{\sqrt{12}}\right) = P(Z < -0.577) = 1 - P(Z < 0.577) = 0.2820.$$

Similarly, P(X < 76) = 0.0105. Thus
P(76 < X < 82) = 0.2820 − 0.0105 = 0.2715.
(c) By definition of the modulus function,
P(|X − 84| > 2.9) = P(X − 84 < −2.9 or X − 84 > 2.9).
Therefore, we proceed as follows:
$$P(|X - 84| > 2.9) = P(X < 81.1) + P(X > 86.9) = P(Z < -0.837) + P(Z > 0.837) = 2 - 2P(Z < 0.837).$$
From tables, we find that P(Z < 0.837) = 0.7987. Thus
P(|X − 84| > 2.9) = 0.4026.

Normal Approximations to the binomial and Poisson Distribution


The Normal distribution can be used as an approximation to the
binomial and Poisson distributions.
If X follows a binomial distribution with parameters n and π,
then for large n (> 50) and π close to 1/2 one can consider X
as approximately normally distributed with mean nπ and
variance nπ(1 − π).
If X ∼ Poiss(µ) with large µ (usually bigger than 20), X is
approximately normally distributed with mean µ and variance
µ.

A continuity correction factor is introduced because we are
approximating the probabilities of a discrete random variable
using a continuous one.
Let k and l be integral values. Then
P(X = k) = P(k − 1/2 < X < k + 1/2);
P(X ≤ k) = P(X < k + 1/2);
P(X < k) = P(X < k − 1/2);
P(X ≥ k) = P(X > k − 1/2);
P(X > k) = P(X > k + 1/2);
P(k ≤ X ≤ l) = P(k − 1/2 < X < l + 1/2).

Example
Note how the factor is used for ≤, < and ≥, >.
Consider the following illustrations on the use of the
continuity correction factor.
If X ∼ Bin(200, 0.7), use the normal approximation to find (a)
P(X ≥ 130); (b) P(X < 142); (c) P(136 ≤ X < 148).

Solution
n = 200 is large and π = 0.7 is reasonably close to 1/2, so we
use the normal approximation in all three cases with
µ = 140 and σ^2 = 42.
(a) P(X ≥ 130) = P(X > 130 − 0.5). Now,
$$P(X > 129.5) = P\left(Z > \frac{129.5 - 140}{\sqrt{42}}\right) = P(Z > -1.62) = P(Z < 1.62).$$
From standard normal tables we find that
P(Z < 1.62) = 0.9474. Thus P(X ≥ 130) = 0.9474.
(b) P(X < 142) = P(X < 142 − 0.5). This gives us
$$P(X < 141.5) = P\left(Z < \frac{141.5 - 140}{\sqrt{42}}\right) = P(Z < 0.231) \approx 0.591.$$

(c) P(136 ≤ X < 148) = P(135.5 < X < 147.5). This is
expressible as P(X < 147.5) − P(X < 135.5). Now
$$P(X < 147.5) = P\left(Z < \frac{147.5 - 140}{\sqrt{42}}\right) = P(Z < 1.1573) = 0.8764$$
and
$$P(X < 135.5) = P\left(Z < \frac{135.5 - 140}{\sqrt{42}}\right) = P(Z < -0.694) = 1 - P(Z < 0.694) = 0.2437.$$
Therefore,
$$P(136 \le X < 148) = P(Z < 1.1573) - P(Z < -0.694) = 0.8764 - 0.2437 = 0.6327.$$
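
It can be instructive to compare the continuity-corrected normal approximation with the exact binomial probabilities. The sketch below does this for parts (a) and (c); it is an illustration only, and the helper names normal_cdf and binom_pmf are assumptions introduced here.

```python
from math import comb, erf, sqrt

n, pi = 200, 0.7
mu = n * pi               # mean of the approximating normal, 140
var = n * pi * (1 - pi)   # variance of the approximating normal, 42

def normal_cdf(x):
    """P(X < x) under the approximating N(mu, var) distribution."""
    z = (x - mu) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_pmf(r):
    """Exact binomial probability P(X = r)."""
    return comb(n, r) * pi**r * (1 - pi) ** (n - r)

# (a) P(X >= 130): corrected boundary 129.5; exact sum over r = 130..200.
approx_a = 1 - normal_cdf(129.5)
exact_a = sum(binom_pmf(r) for r in range(130, n + 1))

# (c) P(136 <= X < 148): corrected boundaries 135.5 and 147.5;
# exact sum over r = 136..147.
approx_c = normal_cdf(147.5) - normal_cdf(135.5)
exact_c = sum(binom_pmf(r) for r in range(136, 148))

print(round(approx_a, 4), round(exact_a, 4))
print(round(approx_c, 4), round(exact_c, 4))
```

The corrected boundaries in the code mirror the rules listed earlier: a boundary that is included in the event is pushed outwards by 1/2, and one that is excluded is pulled inwards by 1/2.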

Example
The number of calls received by an office switch-board per
hour follows a Poisson distribution with parameter 30. Use
the normal approximation to find probability that in one hour
there are (a) more than 33 calls; (b) between 25 and 28 calls
(inclusive); and (c) 34 calls.

Solution
We only do the last part; the others can be worked out in a similar
manner.
Let Y be the number of calls received in one hour. Then Y is
approximately N(30, 30), and we require P(Y = 34). Since 34 lies
between 33.5 and 34.5,
P(Y = 34) = P(33.5 < Y < 34.5). Now,
$$P(33.5 < Y < 34.5) = P(Y < 34.5) - P(Y < 33.5) = P(Z < 0.822) - P(Z < 0.639) = 0.0559.$$
Thus P(Y = 34) = 0.0559.
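
As a rough check on the quality of the approximation, the same probability can be computed exactly from the Poisson formula. The following Python sketch is illustrative; the names normal_cdf, approx and exact are assumptions made here.

```python
from math import erf, exp, factorial, sqrt

mu = 30  # Poisson parameter; the approximating normal is N(30, 30)

def normal_cdf(x):
    """P(Y < x) under the approximating N(mu, mu) distribution."""
    z = (x - mu) / sqrt(mu)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Normal approximation with continuity correction: P(33.5 < Y < 34.5).
approx = normal_cdf(34.5) - normal_cdf(33.5)

# Exact Poisson probability P(Y = 34).
exact = exp(-mu) * mu**34 / factorial(34)

print(round(approx, 4), round(exact, 4))
```

The two values agree to roughly two decimal places, which gives a feel for how much accuracy is traded for convenience when µ is only moderately large.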
