MATH 322: Probability and Statistical Methods
LECTURE SLIDES
CHAPTER 6
6.1 Continuous Uniform Distribution
THE CONTINUOUS UNIFORM DISTRIBUTION
One of the simplest continuous distributions in all statistics is the continuous uniform
distribution. This distribution is characterized as follows.
Definition. The density function of the continuous uniform random variable X on the interval [A, B] is
f(x; A, B) = \begin{cases} \dfrac{1}{B-A}, & A \le x \le B, \\ 0, & \text{elsewhere.} \end{cases}
Theorem 6.1. The mean and variance of the continuous uniform distribution are
μ = \frac{A + B}{2}  and  σ² = \frac{(B − A)^{2}}{12}.
EXAMPLE 6.1:
Suppose that a large conference room at a certain company can be reserved for no more than 4 hours. Both
long and short conferences occur quite often. In fact, it can be assumed that the length X of a conference
has a uniform distribution on the interval [0, 4].
a. What is the probability density function?
b. What is the probability that any given conference lasts at least 3 hours?
Solution: (a) The appropriate density function for the uniformly distributed random variable X in this situation is
f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4, \\ 0, & \text{elsewhere.} \end{cases}
(b) P(X ≥ 3) = \int_{3}^{4} \frac{1}{4}\, dx = \frac{1}{4}.
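As a quick numerical check (a sketch, not part of the original example; it assumes Python with scipy is available), the same quantities can be computed from a uniform distribution on [0, 4]:

from scipy.stats import uniform

# Uniform on [A, B] = [0, 4]: loc = A, scale = B - A
conf_length = uniform(loc=0, scale=4)
print(conf_length.pdf(2.0))   # density equals 1/4 everywhere on [0, 4]
print(conf_length.sf(3.0))    # P(X >= 3) = 0.25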
6.2 NORMAL DISTRIBUTION
The normal distribution, also called the Gaussian distribution, is an important family of continuous probability distributions, applicable in many fields.
Normal distributions are a family of distributions that share the same general shape. They are symmetric, with values more concentrated in the middle than in the tails, and are often described as bell shaped. Individual normal curves differ in how spread out they are, but the area under each curve is the same. The shape of a normal distribution is specified mathematically by two parameters, location and scale: the mean (μ) and the variance (the standard deviation squared, σ²), respectively.
For both theoretical and practical reasons, the normal distribution is the most important distribution in statistics. It is important because:
Many classical tests are based on the assumption that the data follow a normal distribution.
In modelling applications, such as linear and non-linear regression, the error term is often assumed to follow a normal distribution with fixed location and scale.
The normal distribution is used to find significance levels in many hypothesis tests and confidence intervals.
6.2 NORMAL DISTRIBUTION
The density of the normal random variable X, with mean μ and standard deviation σ, is
N(x; μ, σ) = \frac{1}{\sqrt{2π}\,σ} e^{-\frac{1}{2}\left(\frac{x-μ}{σ}\right)^{2}}, \quad −∞ < x < ∞,
and its variance is
Var(X) = E\left[(X − μ)^{2}\right] = \int_{−∞}^{∞} (x − μ)^{2} \frac{1}{\sqrt{2π}\,σ} e^{-\frac{1}{2}\left(\frac{x-μ}{σ}\right)^{2}} dx = σ^{2}.
Theorem 6.2. The mean and variance of N(x; μ, σ) are μ and σ², respectively; hence, the standard deviation is σ.
6.3 AREAS UNDER THE NORMAL CURVE
The curve of any continuous probability distribution or density function is constructed so that the area under the curve bounded by the two ordinates x = x₁ and x = x₂ equals the probability that the random variable X assumes a value between x = x₁ and x = x₂. Thus, for the normal curve,
P(x₁ < X < x₂) = \int_{x₁}^{x₂} N(x; μ, σ)\, dx
is the area under the curve between x₁ and x₂.
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation
of 1. Normal distributions can be transformed to standard normal distributions by the formula:
Z = \frac{X − μ}{σ},
where X is a score from the original normal distribution, μ is the mean of the original normal
distribution, and σ is the standard deviation of the original normal distribution. The standard normal
distribution is sometimes called the z distribution.
A z score always reflects the number of standard deviations above or below the mean a particular score
is. For instance, if a person scored a 70 on a test with a mean of 50 and a standard deviation of 10, then
they scored 2 standard deviations above the mean. Converting the test scores to z scores, an X of 70
would be z = (70 − 50)/10 = 2.
So, a z score of 2 means the original score was 2 standard deviations above the mean. Note that the z
distribution will only be a normal distribution if the original distribution (X) is normal.
Note: Areas under the standard normal curve may be found to the right of a z-value, to the left of a z-value, or between two z-values. About 0.68 (0.34 + 0.34) of the distribution lies between −1 and 1, and about 0.95 lies between −2 and 2.
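These benchmark areas are easy to verify numerically; the following sketch assumes Python with scipy is available:

from scipy.stats import norm

# area between -1 and 1, and between -2 and 2, under the standard normal curve
print(norm.cdf(1) - norm.cdf(-1))   # about 0.6827
print(norm.cdf(2) - norm.cdf(-2))   # about 0.9545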
Example 6.2. Given a standard normal distribution, find the area under the curve that lies
a. to the right of 𝑧 = 1.84
Solution.
(a) P(Z > 1.84) = 1 − P(Z ≤ 1.84) = 1 − 0.9671 = 0.0329  (by Table A.3).
For a normal distribution with μ = 50 and σ = 10, the probability that X assumes a value between 45 and 62 is
P(45 < X < 62) = P\!\left(\frac{45 − 50}{10} < Z < \frac{62 − 50}{10}\right) = P(−0.5 < Z < 1.2)
= Table(1.2) − Table(−0.5) = 0.8849 − 0.3085 = 0.5764.
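A quick check of this value (a sketch assuming Python with scipy):

from scipy.stats import norm

X = norm(loc=50, scale=10)
print(X.cdf(62) - X.cdf(45))   # P(45 < X < 62), about 0.5764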
Example 6.5. Given a standard normal distribution, find the value of 𝑘 such that
(a) 𝑃 𝑍 < 𝑘 = 0.0427
(b) 𝑃 𝑍 > 𝑘 = 0.2946
(c) 𝑃 −0.93 < 𝑍 < 𝑘 = 0.7235
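The three values of k can be found by reading Table A.3 in reverse or, equivalently, from the inverse CDF; a sketch assuming Python with scipy:

from scipy.stats import norm

k_a = norm.ppf(0.0427)                      # (a) P(Z < k) = 0.0427  ->  k about -1.72
k_b = norm.ppf(1 - 0.2946)                  # (b) P(Z > k) = 0.2946  ->  k about 0.54
k_c = norm.ppf(0.7235 + norm.cdf(-0.93))    # (c) P(-0.93 < Z < k) = 0.7235  ->  k about 1.28
print(k_a, k_b, k_c)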
Sometimes we reverse the process: we begin with a known area or probability, find the corresponding z value, and then determine x by rearranging the formula
z = \frac{x − μ}{σ}  to give  x = σz + μ.
Example 6.6: Given a normal distribution with μ = 40 and σ = 6, find the value of x that has
(a) 45% of the area to the left and
(b) 14% of the area to the right.
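No worked solution is shown on the slide; as a hedged sketch (assuming Python with scipy), both x values follow directly from the inverse CDF:

from scipy.stats import norm

x_a = norm.ppf(0.45, loc=40, scale=6)       # (a) 45% of the area to the left,  x about 39.2
x_b = norm.ppf(1 - 0.14, loc=40, scale=6)   # (b) 14% to the right = 86% to the left,  x about 46.5
print(x_a, x_b)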
Example: The life of a certain type of light bulb is normally distributed with mean μ = 800 hours and standard deviation σ = 40 hours. Find the probability that a bulb burns between 778 and 834 hours.
Solution: The z values corresponding to x₁ = 778 and x₂ = 834 are
z₁ = \frac{778 − 800}{40} = −0.55  and  z₂ = \frac{834 − 800}{40} = 0.85.
Hence,
P(778 < X < 834) = P(−0.55 < Z < 0.85) = P(Z < 0.85) − P(Z < −0.55) = 0.8023 − 0.2912 = 0.5111.
In reverse, suppose a normal distribution has μ = 74 and σ = 7, and we want the value x that leaves an area of 0.88 to its left. From Table A.3, P(Z < 1.18) has the closest value to 0.88, so the desired z value is 1.18. Hence,
x = (7)(1.18) + 74 = 82.26.
Theorem. If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of
Z = \frac{X_{bin} − np}{\sqrt{npq}},  as n → ∞,
is the standard normal distribution N(0, 1).
Note. We use the normal approximation to the binomial distribution whenever p is not close to 0 or 1, with a continuity correction of ±0.5. If both np and nq are greater than or equal to 5, the approximation will be good.
Normal Approximation to Binomial Distribution
Figures: normal approximation of the binomial distribution b(x; 15, 0.4).
Example 6.10. The probability that a patient recovers from a blood disease is 0.4. If 100 people are known to have contracted this disease, what is the probability that fewer than 30 survive?
Solution. Let X be the number of people who survive the blood disease.
Given n = 100 and p = 0.40, μ = np = (100)(0.40) = 40 and σ = \sqrt{(100)(0.40)(0.60)} = \sqrt{24} ≈ 4.9. Then
P(X_{bin} < 30) ≈ P(X_{nor} < 29.5) = P\!\left(\frac{X − np}{\sqrt{npq}} < \frac{29.5 − 40}{\sqrt{24}}\right) = P(Z < −2.14) = Table(−2.14) = 0.0162.
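As a numerical cross-check (a sketch assuming Python with scipy), the exact binomial probability can be compared with the continuity-corrected normal approximation:

from scipy.stats import binom, norm

n, p = 100, 0.40
exact  = binom.cdf(29, n, p)                                  # P(X_bin < 30) = P(X_bin <= 29)
approx = norm.cdf(29.5, loc=n*p, scale=(n*p*(1 - p))**0.5)    # continuity-corrected approximation, about 0.016
print(exact, approx)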
EXAMPLE 6.11:
A multiple-choice quiz has 200 questions, each with 4 possible answers of which only 1 is correct. What is
the probability that sheer guesswork yields from 25 to 30 correct answers for 80 of the 200 problems
about which the student has no knowledge?
Solution: The probability of guessing a correct answer for each of the 80 questions is p = 1/4. If X represents the number of correct answers resulting from guesswork, then X is binomial with n = 80, μ = np = 20, and σ = \sqrt{npq} = \sqrt{15} ≈ 3.873. Using the normal approximation with a continuity correction,
P(25 ≤ X ≤ 30) ≈ P(24.5 < X_{nor} < 30.5) = P(1.16 < Z < 2.71) = 0.9966 − 0.8770 = 0.1196.
Example 6.12. A coin is tossed 400 times. Use the normal approximation to the binomial to find the probability of obtaining
(a) between 185 and 210 heads inclusive;
(b) exactly 205 heads;
(c) fewer than 176 or more than 227 heads.
Solution. Here n = 400 and p = 1/2, so μ = np = 200 and σ = \sqrt{npq} = 10.
(a) P(185 ≤ X ≤ 210) ≈ P(184.5 < X_{nor} < 210.5) = P(−1.55 < Z < 1.05) = 0.8531 − 0.0606 = 0.7925.
(b) P(X = 205) = P(204.5 < X_{nor} < 205.5)
≈ P\!\left(\frac{204.5 − 200}{10} ≤ Z ≤ \frac{205.5 − 200}{10}\right) = P(0.45 ≤ Z ≤ 0.55)
= P(Z ≤ 0.55) − P(Z ≤ 0.45) = Table(0.55) − Table(0.45) = 0.7088 − 0.6736 = 0.0352.
(c) P(X < 176) + P(X > 227) = 1 − P(176 ≤ X ≤ 227), and
P(176 ≤ X ≤ 227) ≈ P(175.5 < X_{nor} < 227.5) ≈ P\!\left(\frac{175.5 − 200}{10} ≤ Z ≤ \frac{227.5 − 200}{10}\right)
= P(−2.45 ≤ Z ≤ 2.75) = P(Z ≤ 2.75) − P(Z ≤ −2.45) = Table(2.75) − [1 − Table(2.45)]
= 0.9970 − (1 − 0.9929) = 0.9899.
Hence, P(X < 176) + P(X > 227) ≈ 1 − 0.9899 = 0.0101.
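All three answers can be checked numerically (a sketch assuming Python with scipy); the exact binomial values are printed alongside the continuity-corrected normal approximations:

from scipy.stats import binom, norm

n, p = 400, 0.5
Xnor = norm(loc=n*p, scale=(n*p*(1 - p))**0.5)   # mu = 200, sigma = 10
print(binom.cdf(210, n, p) - binom.cdf(184, n, p), Xnor.cdf(210.5) - Xnor.cdf(184.5))        # (a)
print(binom.pmf(205, n, p),                        Xnor.cdf(205.5) - Xnor.cdf(204.5))        # (b)
print(binom.cdf(175, n, p) + binom.sf(227, n, p),  1 - (Xnor.cdf(227.5) - Xnor.cdf(175.5)))  # (c)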
THE EXPONENTIAL AND GAMMA DISTRIBUTIONS
Although the normal distribution can be used to solve many problems in engineering and science,
there are still numerous situations that require different types of density functions. Two such density
functions, the gamma and exponential distributions, are discussed in this section.
It turns out that the exponential distribution is a special case of the gamma distribution. Both find a
large number of applications. The exponential and gamma distributions play an important role in both
queuing theory and reliability problems.
Time between arrivals at service facilities and time to failure of component parts and electrical
systems often are nicely modeled by the exponential distribution. The relationship between the
gamma and the exponential allows the gamma to be used in similar types of problems. More details
and illustrations will be supplied later in the section.
The gamma distribution derives its name from the well-known gamma function, studied in many areas
of mathematics. Before we proceed to the gamma distribution, let us review this function and some of
its important properties.
THE EXPONENTIAL AND GAMMA DISTRIBUTIONS
Although normal distributions can be used to solve many problems in engineering and science, there are still numerous situations that require different types of density functions. Two such density functions are the exponential and gamma distributions.
The exponential and gamma distributions play an important role in both queuing theory and reliability theory.
A continuous distribution with many useful applications is the exponential distribution, which has density
f(x) = \begin{cases} \dfrac{1}{β} e^{−x/β}, & x > 0, \\ 0, & \text{elsewhere,} \end{cases}
where β > 0.
THE EXPONENTIAL AND GAMMA DISTRIBUTIONS
Because the sample space for this distribution consists of the positive real numbers, this
distribution is sometimes used to model time to failure or survival time of a system.
The distribution function is
F(x) = \int_{0}^{x} \frac{1}{β} e^{−t/β}\, dt = 1 − e^{−x/β}, \quad x > 0.
The exponential distribution has a special property that is unique to this distribution: it is memoryless. Suppose that T is the time to failure of a randomly selected new component and that this r.v. has an exponential distribution with parameter β. The probability that this new component survives to time t is
P(T > t) = 1 − F(t) = e^{−t/β}.
Consequently, for any s, t > 0,
P(T > s + t \mid T > s) = \frac{e^{−(s+t)/β}}{e^{−s/β}} = e^{−t/β} = P(T > t),
so a component that has already survived for s time units is, probabilistically, as good as new.
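A quick numerical illustration of the survival function and the memoryless property (a sketch assuming Python with scipy; β = 5 is only an illustrative value):

from scipy.stats import expon

beta = 5.0                    # illustrative mean time to failure
T = expon(scale=beta)
print(T.sf(8))                # P(T > 8) = e**(-8/5), about 0.20
s, t = 3.0, 8.0
print(T.sf(s + t) / T.sf(s))  # P(T > s + t | T > s) ...
print(T.sf(t))                # ... equals P(T > t): the memoryless property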
GAMMA DISTRIBUTION
A generalization of the exponential distribution that can provide a much wider range of models is based on the gamma integral.
Define a function f(x; α, β) by
f(x; α, β) = \begin{cases} \dfrac{1}{Γ(α)\, β^{α}}\, x^{α−1} e^{−x/β}, & x > 0, \\ 0, & \text{otherwise,} \end{cases}
where α and β are positive constants. Note that in this parameterization, the parameter β is in the denominator of the exponent. The reason for this modification will be shown below. Recall that
\int_{0}^{∞} t^{α−1} e^{−t/β}\, dt = β^{α} Γ(α),
and hence that 𝑓 𝑥; 𝛼, 𝛽 is a density function. This density function defines a distribution on the positive real numbers and is
referred to as the gamma distribution. Note that the exponential distribution is a special case of the gamma distribution in
which 𝛼 = 1. The parameter 𝛼 is referred to as the shape parameter and 𝛽 is referred to as the scale parameter of the gamma
distribution.
Theorem. The mean and variance of the gamma distribution are
μ = αβ  and  σ² = αβ².
Example. The daily water supply of a certain city, in millions of liters, is a random variable X having a gamma distribution with α = 2 and β = 3. Find the probability that on a given day the supply exceeds 9 million liters.
Solution. Let X be the water supply in millions of liters. Given α = 2 and β = 3,
P(X > 9) = \int_{9}^{∞} f(x)\, dx, \quad \text{where } f(x) = \begin{cases} \dfrac{1}{Γ(2)\, 3^{2}}\, x e^{−x/3}, & x > 0, \\ 0, & \text{otherwise.} \end{cases}
Thus,
P(X > 9) = \frac{1}{9} \int_{9}^{∞} x e^{−x/3}\, dx = \frac{4}{e^{3}} ≈ 0.199.
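The same probability follows from the gamma survival function (a sketch assuming Python with scipy):

from scipy.stats import gamma

print(gamma.sf(9, a=2, scale=3))   # P(X > 9) = 4*e**-3, about 0.199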
RELATIONSHIP TO THE POISSON PROCESS
We shall pursue applications of the exponential distribution and then return to the gamma distribution.
The most important applications of the exponential distribution are situations where the Poisson process
applies (see Section 5.5). The reader should recall that the Poisson process allows for the use of the
discrete distribution called the Poisson distribution.
Recall that the Poisson distribution is used to compute the probability of specific numbers of “events”
during a particular period of time or span of space. In many applications, the time period or span of
space is the random variable.
For example, an industrial engineer may be interested in modeling the time T between arrivals at a
congested intersection during rush hour in a large city. An arrival represents the Poisson event.
The relationship between the exponential distribution and the Poisson process is quite simple. In Chapter
5, the Poisson distribution was developed as a single-parameter distribution with parameter λ, where λ
may be interpreted as the mean number of events per unit “time.”
Consider now the random variable described by the time required for the first event to occur. Using the Poisson distribution, we find that the probability of no events occurring in the span up to time t is given by
p(0; λt) = \frac{e^{−λt} (λt)^{0}}{0!} = e^{−λt}.
We can now make use of the above and let X be the time to the first Poisson event. The probability that the length of time until the first event will exceed x is the same as the probability that no Poisson events will occur in x. The latter, of course, is given by e^{−λx}. As a result,
P(X > x) = e^{−λx}, \quad \text{so} \quad F(x) = P(0 ≤ X ≤ x) = 1 − e^{−λx}.
Now, in order that we may recognize the presence of the exponential distribution, we differentiate the cumulative
distribution function above to obtain the density function
𝑓(𝑥) = 𝜆𝑒 −𝜆𝑥 ,
which is the density function of the exponential distribution with 𝜆 = 1/𝛽.
Note. There is a relationship between the exponential and Poisson distributions. Suppose events are occurring in time according to a Poisson process with a rate of λ events per hour. Then in t hours, the number of events, say Y, will have a Poisson distribution with mean value λt.
Suppose we start at time zero and ask the question ‘How long do I have to wait to see the first event occur?’. Let
𝑋 denote the length of time until the first event. Then
P(X > t) = P(Y = 0 \text{ on the interval } (0, t)) = \frac{(λt)^{0} e^{−λt}}{0!} = e^{−λt}
and
𝑃 𝑋 ≤ 𝑡 = 1 − 𝑃 𝑋 > 𝑡 = 1 − 𝑒 −𝜆𝑡 .
Thus, P(X ≤ t) = F(t), the distribution function of X, has the form of an exponential distribution with λ = 1/β (the failure rate). Upon differentiating, we see that
f(t) = \frac{d}{dt}\left(1 − e^{−λt}\right) = λ e^{−λt} = \frac{1}{β} e^{−t/β}, \quad t > 0.
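The Poisson-exponential link is easy to illustrate numerically (a sketch assuming Python with scipy; the rate λ = 2 events per hour and waiting time t = 1.5 hours are illustrative values only):

from scipy.stats import poisson, expon

lam, t = 2.0, 1.5
print(poisson.pmf(0, lam * t))      # P(no events in (0, t]) = e**(-lam*t)
print(expon.sf(t, scale=1/lam))     # P(X > t) for the exponential waiting time; same value, about 0.0498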
APPLICATIONS OF THE EXPONENTIAL AND GAMMA DISTRIBUTIONS
In the foregoing, we provided the foundation for the application of the exponential distribution in “time to
arrival” or time to Poisson event problems.
We will illustrate some applications here and then proceed to discuss the role of the gamma distribution
in these modeling applications. Notice that the mean of the exponential distribution is the parameter β,
the reciprocal of the parameter in the Poisson distribution.
The reader should recall that it is often said that the Poisson distribution has no memory, implying that
occurrences in successive time periods are independent. The important parameter β is the mean time
between events.
In reliability theory, where equipment failure often conforms to this Poisson process, β is called mean
time between failures. Many equipment breakdowns do follow the Poisson process, and thus the
exponential distribution does apply.
Other applications include survival times in biomedical experiments and computer response time. In
the following example, we show a simple application of the exponential distribution to a problem in
reliability. The binomial distribution also plays a role in the solution.
EXAMPLE 6.14.
The life of a certain type of device has an advertised failure rate of 0.01 per hour. The failure rate is constant and the exponential distribution applies.
(a) What is the probability that 200 hours will pass before a failure is observed?
Solution. The time to failure, X, has density
f(x) = \begin{cases} 0.01\, e^{−0.01x}, & x > 0, \\ 0, & \text{otherwise,} \end{cases}
so
P(X > 200) = \int_{200}^{∞} 0.01\, e^{−0.01x}\, dx = \left[−e^{−0.01x}\right]_{200}^{∞} = e^{−2} ≈ 0.135.
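As a quick check (a sketch assuming Python with scipy):

from scipy.stats import expon

rate = 0.01                            # failures per hour
print(expon.sf(200, scale=1/rate))     # P(X > 200) = e**-2, about 0.135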
EXAMPLE 6.15:
Suppose that a system contains a certain type of component whose time, in years, to failure is
given by T. The random variable T is modeled nicely by the exponential distribution with mean
time to failure β = 5. If 5 of these components are installed in different systems, what is the
probability that at least 2 are still functioning at the end of 8 years?
Solution: The probability that a given component is still functioning after 8 years is given by
P(T > 8) = \frac{1}{5} \int_{8}^{∞} e^{−t/5}\, dt = e^{−8/5} ≈ 0.2.
Let X represent the number of components functioning after 8 years. Then, using the binomial distribution with n = 5 and p = 0.2, we have
P(X ≥ 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = 1 − \sum_{x=0}^{1} b(x; 5, 0.2) = 1 − 0.7373 = 0.2627.
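A sketch of the same calculation in Python (assuming scipy), keeping the full survival probability rather than rounding it to 0.2:

from scipy.stats import expon, binom

p = expon.sf(8, scale=5)        # P(a component survives 8 years) = e**(-8/5), about 0.202
print(binom.sf(1, 5, p))        # P(at least 2 of the 5 still functioning), roughly 0.26-0.27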
EXAMPLE 6.16.
In a certain city, the daily consumption of electric power in millions of kw-hours is a random variable 𝑋
having a Gamma distribution with 𝜇 = 6 and 𝜎 2 = 12.
μ = αβ = 6  and  σ² = αβ² = 12  ⇒  β = 2, α = 3.
(a) Find the probability that on any given day the daily power consumption will exceed 12 million kw-hours.
P(X > 12) = \int_{12}^{∞} \frac{1}{2^{3}\, Γ(3)}\, x^{2} e^{−x/2}\, dx = \frac{1}{16} \int_{12}^{∞} x^{2} e^{−x/2}\, dx = 25 e^{−6} ≈ 0.062.
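A numerical check (a sketch assuming Python with scipy):

from scipy.stats import gamma

print(gamma.sf(12, a=3, scale=2))   # P(X > 12) = 25*e**-6, about 0.062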
EXAMPLE 6.17:
Suppose that telephone calls arriving at a particular switchboard follow a Poisson process
with an average of 5 calls coming per minute. What is the probability that up to a minute
will elapse by the time 2 calls have come in to the switchboard?
Solution : The Poisson process applies, with time until 2 Poisson events following a
gamma distribution with β = 1/5 and α = 2.
Denote by X the time in minutes that transpires before 2 calls come. The required probability is given by
P(X ≤ 1) = \int_{0}^{1} \frac{1}{β^{2}}\, x e^{−x/β}\, dx = 25 \int_{0}^{1} x e^{−5x}\, dx = 1 − e^{−5}(1 + 5) ≈ 0.96.
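The gamma CDF gives the same value directly (a sketch assuming Python with scipy):

from scipy.stats import gamma

print(gamma.cdf(1, a=2, scale=1/5))   # P(X <= 1), about 0.96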
EXAMPLE 6.18:
In a biomedical study with rats, a dose-response investigation is used to determine the effect of the dose of a
toxicant on their survival time. The toxicant is one that is frequently discharged into the atmosphere from jet
fuel. For a certain dose of the toxicant, the study determines that the survival time, in weeks, has a gamma
distribution with α = 5 and β = 10. What is the probability that a rat survives no longer than 60 weeks?
Solution: Let the random variable X be the survival time (time to death). The required probability is
P(X ≤ 60) = \frac{1}{β^{5}} \int_{0}^{60} \frac{x^{4} e^{−x/β}}{Γ(5)}\, dx.
The integral above can be evaluated using the incomplete gamma function, which becomes the cumulative distribution function for the gamma distribution. This function is written as
F(x; α) = \int_{0}^{x} \frac{y^{α−1} e^{−y}}{Γ(α)}\, dy.
Letting y = x/β = x/10, the required probability is F(6; 5), which can be read from the table of the incomplete gamma function in Appendix A.23. Note that this allows a
quick computation of probabilities for the gamma distribution. Indeed, for this problem, the probability that the rat
survives no longer than 60 weeks is given by P(X ≤ 60) = F(6; 5) = 0.715.
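The gamma CDF gives the same value directly (a sketch assuming Python with scipy):

from scipy.stats import gamma

print(gamma.cdf(60, a=5, scale=10))   # P(X <= 60) = F(6; 5), about 0.715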
EXAMPLE 6.19:
It is known, from previous data, that the length of time in months between customer complaints about a
certain product has a gamma distribution with α = 2 and β = 4. Changes were made to tighten quality control
requirements. Following these changes, 20 months passed before the first complaint. Does it appear as if the
quality control tightening was effective?
Solution: Let X be the time to the first complaint, which, under conditions prior to the changes, followed a gamma distribution with α = 2 and β = 4. The question centers around how rare X ≥ 20 is, given that α and β remain at the values 2 and 4, respectively. In other words, under the prior conditions, is a "time to complaint" as large as 20 months reasonable? Using the incomplete gamma function as in the preceding example, with y = x/β = x/4,
P(X ≥ 20) = 1 − P(X < 20) = 1 − F(20/4; 2) = 1 − F(5; 2) ≈ 1 − 0.96 = 0.04.
Under the original conditions, a gap of 20 months before the first complaint would occur only about 4% of the time, so it does appear that the tightening of quality control was effective.
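A numerical check of this tail probability (a sketch assuming Python with scipy):

from scipy.stats import gamma

print(gamma.sf(20, a=2, scale=4))   # P(X >= 20) under the old conditions, about 0.04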