Week04 Proba Distribution
Probability Distributions
Review and Preview
Probability Distributions
Binomial Probability Distributions
Parameters for Binomial Distributions
Poisson Probability Distributions
The Standard Normal Distribution
Applications of Normal Distributions
Sampling Distributions and Estimators
Assessing Normality
Normal as Approximation to Binomial and
Poisson
Standard Deviation
σ = √(∑[x²P(x)] − μ²)
The expected value of a discrete random
variable is denoted by E, and it represents
the mean value of the outcomes. It is
obtained by finding the value of ∑[xP(x)]
E(x) = ∑[xP(x)]
x P( x)
0 0.25
1 0.50
2 0.25
Total 1
Mean = μ = ∑[xP(x)] = 1
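The table above can be checked with a short sketch. It assumes the distribution is stored as a dictionary mapping x to P(x); the names `dist`, `mean` and `std_dev` are illustrative only, and the standard deviation uses the usual shortcut formula σ = √(∑[x²P(x)] − μ²).

```python
import math

# Probability distribution from the table above: x -> P(x)
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def mean(dist):
    """Expected value E(x) = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def std_dev(dist):
    """Standard deviation via sqrt(sum of x^2 * P(x) minus mu^2)."""
    mu = mean(dist)
    return math.sqrt(sum(x ** 2 * p for x, p in dist.items()) - mu ** 2)

mu = mean(dist)        # matches the mean of 1 found above
sigma = std_dev(dist)
print(mu, sigma)
```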
[Figure: normal curve; the shaded area represents voltage levels greater than 124.5 volts. Figure label: 5% or 0.05.]
SIS 1037Y 2020/2021
Finding Normal Probabilities
Let X represent the time it takes, in
seconds, to download an image file from
the internet.
Suppose X is normal with a mean of 18.0
seconds and a standard deviation of 5.0
seconds. Find P(X < 18.6)
Z = (x-μ)/σ = (18.6 – 18.0)/5.0 = 0.12
[Figure: the X scale (μ = 18, σ = 5) mapped to the Z scale (μ = 0, σ = 1); x = 18.6 corresponds to z = 0.12, so P(X < 18.6) = P(Z < 0.12).]
Standardized Normal Probability Table (Portion)
P(X < 18.6) = P(Z < 0.12) = 0.5478
Now Find P(X > 18.6)…
P(X > 18.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - 0.5478 = 0.4522
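As a cross-check on the table lookup, the same two probabilities can be sketched with Python's standard-library `statistics.NormalDist` (Python 3.8 or later). The variable names are illustrative; the mean and standard deviation are those of the download-time example.

```python
from statistics import NormalDist

# Download time X ~ Normal(mean 18.0 s, standard deviation 5.0 s)
X = NormalDist(mu=18.0, sigma=5.0)

# Standardize x = 18.6 to a z score
z = (18.6 - X.mean) / X.stdev

# Cumulative area to the left of 18.6, and its complement
p_less = X.cdf(18.6)       # P(X < 18.6) = P(Z < 0.12)
p_greater = 1.0 - p_less   # P(X > 18.6)

print(round(z, 2), round(p_less, 4), round(p_greater, 4))
```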
Suppose X is normal with mean 18.0 and
standard deviation 5.0. Find P(18 < X < 18.6)
Calculate Z-values:
Z = (X − μ)/σ = (18 − 18)/5 = 0
Z = (X − μ)/σ = (18.6 − 18)/5 = 0.12
P(18 < X < 18.6) = P(0 < Z < 0.12)
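A probability between two values is the difference of two cumulative areas. The sketch below, again assuming `statistics.NormalDist` is available, evaluates the interval for the same X; the name `p_between` is illustrative.

```python
from statistics import NormalDist

# Same download-time distribution as above
X = NormalDist(mu=18.0, sigma=5.0)

# P(18 < X < 18.6) = P(X < 18.6) - P(X < 18), i.e. P(Z < 0.12) - P(Z < 0)
p_between = X.cdf(18.6) - X.cdf(18.0)

print(round(p_between, 4))
```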
The probability of a
randomly selected adult having a
bone density less than
1.27 is 0.8980.
A normal quantile plot (or normal probability
plot) is a graph of points (x, y), where each x
value is from the original set of sample data,
and each y value is the corresponding z score
that is a quantile value expected from the
standard normal distribution
Determining whether it is reasonable to
assume that sample data are from a normally
distributed population:
◦ 1. Histogram: Construct a histogram. Reject
normality if the histogram departs dramatically
from a bell shape.
◦ 2. Outliers: Identify outliers. Reject normality if
there is more than one outlier present.
◦ 3. Normal Quantile Plot: If the histogram is basically
symmetric and there is at most one outlier, use
technology to generate a normal quantile plot. Use
the following criteria to determine whether or not
the distribution is normal
Normal Distribution: The population
distribution is normal if the pattern of the
points is reasonably close to a straight line
and the points do not show some systematic
pattern that is not a straight-line pattern.
Not a Normal Distribution: The population
distribution is not normal if either or both of
these two conditions applies:
◦ The points do not lie reasonably close to a straight
line.
◦ The points show some systematic pattern that is
not a straight-line pattern.
Normal: The histogram of IQ scores is close to being bell-shaped, which suggests that
the IQ scores are from a normal distribution. The normal quantile plot
shows points that are reasonably close to a straight-line pattern. It is safe
to assume that these IQ scores are from a normally distributed population.
Uniform: Histogram of data having a uniform distribution. The
corresponding normal quantile plot suggests that the points are not
normally distributed because the points show a systematic pattern that is
not a straight-line pattern. These sample values are not from a population
having a normal distribution.
Skewed: Histogram of the amounts of rainfall for every Monday during one
year. The shape of the histogram is skewed, not bell-shaped. The
corresponding normal quantile plot shows points that are not at all close to
a straight-line pattern. These rainfall amounts are not from a population
having a normal distribution.
Step 1. First sort the data by arranging the values
in order from lowest to highest.
Step 2. With a sample of size n, each value
represents a proportion of 1/n of the sample.
Using the known sample size n, identify the areas
of 1/(2n), 3/(2n), 5/(2n), and so on, up to
(2n − 1)/(2n). These are the cumulative areas to
the left of the corresponding sample values.
Step 3. Use the standard normal distribution to
find the z scores corresponding to the
cumulative left areas found in Step 2.
◦ These are the z scores that are expected from a normally
distributed sample.
Step 4. Match the original sorted data values
with their corresponding z scores found in
Step 3, then plot the points (x, y), where each
x is an original sample value and y is the
corresponding z score.
Step 5. Examine the normal quantile plot and
determine whether or not the distribution is
normal.
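Steps 1 to 4 can be sketched in a few lines. The function name `normal_quantile_points` and the sample values are hypothetical and chosen only for illustration; the z scores come from `statistics.NormalDist().inv_cdf`, the inverse of the standard normal cumulative area.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Return the (x, y) points of a normal quantile plot."""
    xs = sorted(sample)                      # Step 1: sort the data
    n = len(xs)
    # Step 2: cumulative left areas 1/(2n), 3/(2n), ..., (2n - 1)/(2n)
    areas = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
    # Step 3: z scores corresponding to those areas
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf(a) for a in areas]
    # Step 4: pair each sorted value with its expected z score
    return list(zip(xs, zs))

# Hypothetical sample, not taken from the slides
sample = [7.1, 5.2, 6.4, 4.8, 5.9]
points = normal_quantile_points(sample)
for x, z in points:
    print(x, round(z, 2))
```

Step 5 remains a visual judgement: plot the points and check whether they lie reasonably close to a straight line.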
Many data sets have a distribution that is
not normal, but we can transform the data
so that the modified values have a normal
distribution.
One common transformation is to replace
each value of x with log(x + 1).
If the distribution of the log(x + 1) values is
a normal distribution, the distribution of
the x values is referred to as a lognormal
distribution.
In addition to replacing each x value with
log(x + 1), there are other
transformations, such as replacing each x
value with √x, or 1/x, or x².
In addition to getting a required normal
distribution when the original data values
are not normally distributed, such
transformations can be used to correct
other deficiencies, such as a requirement
(see later) that different data sets have the
same variance.
Normal as Approximation to Binomial and
Poisson
This section presents a method for using a
normal distribution as an approximation to the
binomial probability distribution.
If the conditions np ≥ 5 and nq ≥ 5 (where
q = 1 − p) are both satisfied, then probabilities
from a binomial probability distribution can be
approximated well by using a normal
distribution with mean μ = np and standard
deviation σ = √(npq).
1. The procedure must have a fixed number
of trials.
2. The trials must be independent.
3. Each trial must have all outcomes
classified into two categories (commonly,
success and failure).
4. The probability of success remains the
same in all trials.
Solve by binomial probability formula, Table,
or technology
If np ≥ 5 and nq ≥ 5, then μ = np and σ = √(npq),
and the random variable has approximately
a (normal) distribution.
Using a Normal Distribution to Approximate
a Binomial Distribution
1. Verify that both np ≥ 5 and nq ≥ 5. If not,
you must use software, a calculator, a table
or calculations using the binomial probability
formula.
2. Find the values of the parameters μ and σ
by calculating: μ = np, σ = √(npq)
3. Identify the discrete whole number x that
is relevant to the binomial probability
problem. Focus on this value temporarily.
4. Draw a normal distribution centred about μ, then
draw a vertical strip area centred over x. Mark the left
side of the strip with the number equal to x – 0.5,
and mark the right side with the number equal to x +
0.5. Consider the entire area of the entire strip to
represent the probability of the discrete whole
number itself.
5. Determine whether the value of x itself is included
in the probability. Determine whether you want the
probability of at least x, at most x, more than x,
fewer than x, or exactly x. Shade the area to the right
or left of the strip; also shade the interior of the strip
if and only if x itself is to be included. This total
shaded region corresponds to the probability being
sought.
6. Using x – 0.5 or x + 0.5 in place of x, find
the area of the shaded region: find the z
score, use that z score to find the area to the
left of the adjusted value of x. Use that
cumulative area to identify the shaded area
corresponding to the desired probability.
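The six-step procedure can be condensed into one helper. This is a minimal sketch, not a definitive implementation: the function name `binom_normal_approx` and its `kind` labels are assumptions made for illustration, and the numbers in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def binom_normal_approx(n, p, x, kind):
    """Normal approximation to a binomial probability with continuity correction.

    kind is one of: "at_least", "more_than", "at_most", "fewer_than", "exactly".
    """
    q = 1 - p
    # Step 1: check the requirements np >= 5 and nq >= 5
    if n * p < 5 or n * q < 5:
        raise ValueError("np and nq must both be at least 5")
    # Step 2: parameters of the approximating normal distribution
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
    # Steps 3-4: the whole number x is represented by the strip x-0.5 to x+0.5
    left, right = x - 0.5, x + 0.5
    # Steps 5-6: shade the correct side and include the strip only if x counts
    if kind == "at_least":      # x included, area to the right
        return 1 - approx.cdf(left)
    if kind == "more_than":     # x excluded, area to the right
        return 1 - approx.cdf(right)
    if kind == "at_most":       # x included, area to the left
        return approx.cdf(right)
    if kind == "fewer_than":    # x excluded, area to the left
        return approx.cdf(left)
    if kind == "exactly":       # the strip only
        return approx.cdf(right) - approx.cdf(left)
    raise ValueError("unknown kind: " + kind)

# Hypothetical usage: 100 trials with p = 0.5
print(binom_normal_approx(100, 0.5, 55, "at_least"))
print(binom_normal_approx(100, 0.5, 55, "exactly"))
```

Passing the wording of the question as `kind` keeps the decision in step 5 (is x itself included?) explicit rather than hidden in a hand-adjusted boundary.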
In 431 football games that went to overtime,
the teams that won the coin toss went on to
win 235 of those games.
If the coin-toss method is fair, teams winning
the toss would win about 50% of the games
(we’d expect 215.5 wins in 431 overtime
games).
Assuming there is a 0.5 probability of
winning a game after winning the coin toss,
find the probability of getting at least 235
winning games.
The given problem involves a binomial
distribution with n = 431 trials and an
assumed probability of success of p = 0.5.
Use the normal approximation to the
binomial distribution.
Step 1: The conditions check:
np = 431 × 0.5 = 215.5 ≥ 5 and nq = 431 × 0.5 = 215.5 ≥ 5,
so both requirements are satisfied.
Step 2: Find the mean and standard deviation
of the normal distribution:
μ = np = 431 × 0.5 = 215.5
σ = √(npq) = √(431 × 0.5 × 0.5) = 10.38027
Example – Football Coin Toss
Step 6: Find the z score and use technology
or Table to determine the probability.
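The football example can be worked through numerically as a check. The sketch below assumes that "at least 235" is represented by the area to the right of 234.5 (the continuity correction), and it also computes the exact binomial tail for comparison; that exact sum relies on p = 0.5 and is not part of the slides.

```python
import math
from statistics import NormalDist

n, p = 431, 0.5
q = 1 - p

# Step 2: parameters of the approximating normal distribution
mu = n * p                      # 215.5
sigma = math.sqrt(n * p * q)    # about 10.38027

# Step 6: z score for the left edge of the strip over x = 235
z = (234.5 - mu) / sigma        # about 1.83

# "At least 235" is the area to the right of 234.5
p_approx = 1 - NormalDist().cdf(z)

# Exact binomial tail for comparison (valid here because p = 0.5)
p_exact = sum(math.comb(n, k) for k in range(235, n + 1)) / 2 ** n

print(round(z, 2), p_approx, p_exact)
```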
When we use the normal distribution (which
is a continuous probability distribution) as an
approximation to the binomial distribution
(which is discrete), a continuity correction is
made to a discrete whole number x in the
binomial distribution by representing the
discrete whole number x by the interval from
x – 0.5 to x + 0.5 (that is, adding and
subtracting 0.5).
Example – Continuity Corrections
P(X = 18) ≈ P(17.5 < X < 18.5)
P(X ≤ 18) ≈ P(X < 18.5)
The normal distribution can also be
used to approximate the Poisson
distribution whenever the parameter
λ, the expected number of successes,
equals or exceeds 5.
Since the value of the mean and the
variance of a Poisson distribution are
the same:
◦ μ = σ² = λ
Substituting into the transformation equation:
Z = (X − μ)/σ = (X − λ)/√λ
so that, for large enough λ, the random variable
Z is approximately normally distributed.
Hence, to find approximate probabilities
corresponding to the values of the Poisson
random variable X, a correction has to be made
to cater for discrete values, similar to the
Binomial.
Suppose that at a certain automobile plant
the average number of work stoppages per
day due to equipment problems during the
production process is 12.0.
Determine the approximate probability of
having 15 or fewer work stoppages due to
equipment problems on any given day.
Substituting into the transformation equation:
Z = (Xa − λ)/√λ = (15.5 − 12.0)/√12.0 = 1.01
Here Xa, the adjusted number of successes, is 15.5.
Hence the approximate probability that X does not
exceed this value corresponds to a Z value of not
more than +1.01.
Note that the area under the normal curve less than
Z = +1.01 is 0.8438. Therefore, the approximate
probability of having 15 or fewer work stoppages due
to equipment problems on any given day is 0.8438.
This approximation compares quite favourably to the
exact Poisson probability, 0.8445.
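The comparison between the approximate and exact probabilities can be reproduced with a short sketch. It assumes the continuity-corrected value Xa = 15.5 from the example and sums the Poisson probabilities P(X = k) = e^(−λ)λ^k/k! directly; the variable names are illustrative.

```python
import math
from statistics import NormalDist

lam = 12.0   # average number of work stoppages per day
x = 15       # "15 or fewer" stoppages

# Normal approximation: mean = lam, standard deviation = sqrt(lam),
# with the continuity-corrected value Xa = x + 0.5
xa = x + 0.5
z = (xa - lam) / math.sqrt(lam)
p_approx = NormalDist().cdf(z)

# Exact Poisson probability P(X <= 15)
p_exact = sum(math.exp(-lam) * lam ** k / math.factorial(k)
              for k in range(x + 1))

print(round(z, 2), p_approx, p_exact)
```

The approximation here uses the unrounded z score, so it can differ from the table-based value in the last decimal place.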
When we use the normal distribution (which
is a continuous probability distribution) as an
approximation to the binomial or Poisson
distribution (which are discrete), a continuity
correction is made to a discrete whole
number x in the discrete distributions by
representing the single value x by the interval
from
◦ x – 0.5 to x + 0.5
◦ (that is, adding and subtracting 0.5).
If x is a random variable with mean μ_x
and variance σ_x², and a and b are
constants, then for y = a + bx:
The mean of y is μ_y = μ_(a + bx) = a + bμ_x
The variance of y is σ_y² = σ²_(a + bx) = b²σ_x²
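We need to find the mean, variance and standard deviation of y = 255 + 110x
x       p(x)    xp(x)
1       0.3     0.3
2       0.4     0.8
3       0.2     0.6
4       0.1     0.4
Total           2.1
μ_x = ∑[xp(x)] = 2.1
μ_y = μ_(255 + 110x) = 255 + 110μ_x = 255 + 110(2.1) = 486
σ_y² = σ²_(255 + 110x) = (110)²σ_x² = (110)²(0.89) = 10769
σ_y = σ_(255 + 110x) = 110σ_x = 110(0.9434) = 103.77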
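The figures for y = 255 + 110x can be checked with a short sketch. It assumes the distribution of x from the table above is stored as a dictionary; the variable names are illustrative, and the rules applied are that the mean of a + bx is a + b times the mean of x and the variance of a + bx is b² times the variance of x.

```python
import math

# Distribution of x from the table above: x -> p(x)
dist = {1: 0.3, 2: 0.4, 3: 0.2, 4: 0.1}
a, b = 255, 110   # y = a + b*x

# Mean and variance of x
mu_x = sum(x * p for x, p in dist.items())
var_x = sum(x ** 2 * p for x, p in dist.items()) - mu_x ** 2

# Mean, variance and standard deviation of y = a + b*x
mu_y = a + b * mu_x
var_y = b ** 2 * var_x            # the constant a does not affect the spread
sd_y = abs(b) * math.sqrt(var_x)

print(mu_x, var_x, mu_y, var_y, sd_y)
```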
If x1, x2, …, xn are random variables with
means μ1, μ2, …, μn and variances σ1², σ2², …, σn²
respectively,
and y = a1x1 + a2x2 + … + anxn, then
1. μ_y = a1μ1 + a2μ2 + … + anμn
(This is true for any random variables with no conditions.)
2. σ_y² = a1²σ1² + a2²σ2² + … + an²σn²
(This is true if x1, x2, …, xn are independent.)
A distributor of fruit baskets is going to put
4 apples, 6 oranges and 2 bunches of
grapes in his small gift basket. The weights,
in ounces, of these items are the random
variables x1, x2 and x3 respectively with
means and standard deviations as given in
the following table.
                     Apples   Oranges   Grapes
Mean                 8        10        7
Standard deviation   0.9      1.1       2
Find the mean, variance and standard deviation of
the random variable y = weight of fruit in a small
gift basket.
It is reasonable in this case to assume that
the weights of the different types of fruit are
independent.
a1 = 4, a2 = 6, a3 = 2, μ1 = 8, μ2 = 10, μ3 = 7
σ1 = 0.9, σ2 = 1.1, σ3 = 2
μ_y = μ_(a1x1 + a2x2 + a3x3)
= a1μ1 + a2μ2 + a3μ3
= 4(8) + 6(10) + 2(7) = 106
σ_y² = σ²_(a1x1 + a2x2 + a3x3)
= a1²σ1² + a2²σ2² + a3²σ3²
= 4²(0.9)² + 6²(1.1)² + 2²(2)² = 72.52
σ_y = √72.52 = 8.5159
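The fruit basket figures can be reproduced with a short sketch. It follows the example in treating the basket weight as y = 4x1 + 6x2 + 2x3 with independent x1, x2 and x3; the list names are illustrative.

```python
import math

# Coefficients (number of items), means and standard deviations
# for apples, oranges and grapes, in that order
coeffs = [4, 6, 2]
means = [8, 10, 7]
sds = [0.9, 1.1, 2]

# Mean of a linear combination: holds with no conditions
mu_y = sum(a * m for a, m in zip(coeffs, means))

# Variance of a linear combination: assumes the variables are independent
var_y = sum(a ** 2 * s ** 2 for a, s in zip(coeffs, sds))
sd_y = math.sqrt(var_y)

print(mu_y, round(var_y, 2), round(sd_y, 4))
```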
We have seen the basics of Probability
Distributions
Two discrete Probability Distributions have
been studied:
◦ Binomial Probability Distributions
◦ Poisson Probability Distributions
The approximation of one distribution by
another is possible under certain conditions.
Normal distribution
Found approximations to binomial
probabilities by using the normal distribution.
Found approximations to Poisson
probabilities by using the normal distribution.
Continuity Corrections
Linear combination of random variables
Comments?