
STAT2372 Probability

2020

Topic 3
Continuous Random Variables

STAT2372 2020 Topic 3 1


Continuous random variables
• We’ve seen examples of discrete random variables which take on
an infinite but countable number of values.
• There are rvs which take on an uncountable number of values.
• Two examples:
– the time that a train arrives at a specified stop;
– the lifetime of a transistor.
• Since random variables of this type can have a continuum of
possible values, they are called continuous random variables.
• Continuous random variables can take any value over some range.
• It is possible for a rv to be a mixture of a continuous and discrete
rv. Examples:
– Amount of rainfall in a day. On a dry day, the amount of rainfall is exactly zero; on a wet day the amount of rainfall is continuous on (0, ∞).
– The length of time it takes for a customer to commence
service in a single-server queue with continuously distributed
service times. There is a non-zero probability that the queue
is empty and therefore a strictly positive probability that the
time until commencement of service is zero. However, if the
queue is non-empty, the time until commencement of service
is continuous on (0, ∞) .
A random variable X is said to be continuous if it takes on an
uncountably infinite number of values, and if there is a function
fX (x) , called the probability density function, or pdf, such that:
1. fX(x) ≥ 0, ∀x ∈ R;

2. ∫_{−∞}^{∞} fX(x) dx = 1;

3. P(a < X ≤ b) = ∫_a^b fX(x) dx.

[Figure: two panels showing a pdf fX(x) against x; the shaded region in the right panel represents P(−1 < X < 2) as an area under the curve.]

Important:
• Probabilities are represented by areas under the pdf.

• Note that this is not true for discrete distributions.

• The height of fX(x) is never interpreted as P(X = x) for a continuous rv.

The Uniform Distribution U (0, 1)


• Every value between 0 and 1 is “equally likely”.

fX(x) = { 1 ; 0 ≤ x ≤ 1
        { 0 ; otherwise

[Figure: pdf of U(0, 1); fX(x) = 1 on [0, 1] and 0 elsewhere.]


• If X is uniformly distributed on [0, 1], then, if a ≥ 0 and b ≤ 1,
P(a ≤ X ≤ b) = P(a < X < b) = ∫_a^b 1 dx = [x]_a^b = b − a.

[Figure: pdf of U(0, 1) with the area between a and b shaded.]


• From above, for any a,

P(X = a) = lim_{b→a⁺} P(a ≤ X ≤ b) = lim_{b→a⁺} (b − a) = 0.

• This will be true for any continuous rv, i.e. if X is a continuous rv, then P(X = x) = 0, ∀x.

Cumulative Distribution Function (cdf)


• For any rv X, continuous, discrete, or neither, let FX be defined
by
FX (x) = P (X ≤ x) .
• FX (x) is called the distribution function of X or cumulative
distribution function (cdf) of X.



e.g. For the uniform random variable above:



• When b < 0, P (X ≤ b) = 0.

• When b > 1, P (X ≤ b) = 1.

• When 0 < b < 1,


P(X ≤ b) = ∫_0^b 1 dx = [x]_0^b = b.

Therefore we can write the cdf as

FX(b) = P(X ≤ b) = { 0 ; b ≤ 0
                   { b ; 0 < b < 1
                   { 1 ; b ≥ 1



• The domain of the cdf is R. Note that FX (x) needs to be
specified for all values of x, and not just the values of x for which
x is a ‘possible value’ of X.
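As a quick numerical check of the piecewise cdf above, here is a small sketch in Python (these notes use R for their own examples; the function names here are illustrative only):

```python
def uniform01_cdf(x: float) -> float:
    # Cdf of U(0, 1): 0 for x <= 0, x for 0 < x < 1, 1 for x >= 1.
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return x

def uniform01_prob(a: float, b: float) -> float:
    # P(a < X <= b) = F(b) - F(a) = b - a when 0 <= a <= b <= 1.
    return uniform01_cdf(b) - uniform01_cdf(a)
```

Note that, as the slide says, the cdf is defined on all of R: `uniform01_cdf(-3.0)` is 0 and `uniform01_cdf(7.0)` is 1, not an error.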

Properties of a cdf

• FX(−∞) = lim_{x→−∞} P(X ≤ x) = 0.
• FX(∞) = lim_{x→∞} P(X ≤ x) = 1.
• Let x2 ≥ x1. Then FX(x2) ≥ FX(x1).
Proof:

FX(x2) = P(X ≤ x2)
       = P({X ≤ x1} ∪ {x1 < X ≤ x2}).

Since (−∞, x1] and (x1, x2] are mutually exclusive,

FX(x2) = P(X ≤ x1) + P(x1 < X ≤ x2)
       ≥ P(X ≤ x1),

since all probabilities are non-negative. Thus FX(x2) ≥ FX(x1).

• If a continuous rv X has lowest possible value ℓ and greatest possible value u, then FX(ℓ) = 0 and FX(u) = 1.


Example of shapes of cdfs

The picture below is of the cdf of a discrete rv. Since X is discrete, FX(x) is a step-function.

[Figure: step-function cdf of a discrete rv.]



The picture below is of the cdf of a continuous rv. Since X is continuous, FX(x) is continuous.

[Figure: continuous cdf of a continuous rv.]



The Quantile Function

The quantile function of a probability distribution is the inverse F⁻¹ of its cdf F.
If the probability distribution is discrete, then the quantile function is

Q(p) = F⁻¹(p) = inf{x ∈ R : p ≤ F(x)}

for a probability 0 < p < 1; the quantile function returns the minimum value of x for which p ≤ F(x) holds.
If the probability distribution is continuous, then the quantile function is Q(p) = F⁻¹(p).



Specialized Quantiles

• If p = 0.5, then the quantile is called the median.


• If p = 0.25, then the quantile is called the lower quartile.
• If p = 0.75, then the quantile is called the upper quartile.
• If p = 0.1, 0.2, . . . , 0.9, then the quantiles are called deciles.
• If p = 0.01, 0.02, . . . , 0.99, then the quantiles are called
percentiles.
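The inf-based definition above can be implemented directly. A minimal Python sketch (the die example and function names are mine, for illustration):

```python
def quantile(p, support, cdf):
    # Q(p) = inf{x in support : p <= F(x)}, for 0 < p < 1,
    # scanning the sorted support of a discrete rv.
    for x in support:
        if p <= cdf(x):
            return x
    raise ValueError("p must lie in (0, 1)")

# Example: a fair six-sided die, F(x) = x/6 on {1, ..., 6}.
die = [1, 2, 3, 4, 5, 6]
die_cdf = lambda x: x / 6

median = quantile(0.5, die, die_cdf)           # 3, since F(3) = 0.5 exactly
lower_quartile = quantile(0.25, die, die_cdf)  # 2, the smallest x with F(x) >= 0.25
upper_quartile = quantile(0.75, die, die_cdf)  # 5
```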



Median

[Figure: two cdfs F(x), each with the median marked where F(x) = 0.5; the first over x ∈ [0, 5], the second over x ∈ [−0.5, 2].]


Probability Density Functions (pdf)
• The probability of observing a specific value x of a continuous
random variable X is zero. It is thus the cdf, FX (x) , which is
used to define probabilities.
• For a continuous rv X,
FX(x) = ∫_{−∞}^{x} fX(u) du.

Note that the dummy variable in the integral is u, not x.


• The fundamental theorem of calculus gives us the relationship

d/dx FX(x) = fX(x),

i.e. the probability density function fX is the derivative of the cdf FX.



• When X is discrete, there is also a relationship between fX and FX: ∀x,

fX(x) = FX(x) − lim_{y→x⁻} FX(y).

The Uniform Distribution U (a, b)


• Suppose that a random variable X is equally likely to lie in any small sub-interval of (a, b), and that it cannot lie outside this interval. We then say that X ∼ U(a, b), and have

fX(x) = { 1/(b − a) ; a ≤ x ≤ b
        { 0         ; otherwise.



• Thus, if x ∈ (a, b), we have

FX(x) = ∫_{−∞}^{x} fX(u) du
      = ∫_{−∞}^{a} 0 du + ∫_a^x 1/(b − a) du
      = 0 + (1/(b − a)) [u]_{u=a}^{x}
      = (x − a)/(b − a).
• Obviously, we have FX (x) = 0 if x ≤ a and FX (x) = 1 if x ≥ b.
Hence the proper specification of FX is




FX(x) = { 0               ; x < a
        { (x − a)/(b − a) ; a ≤ x ≤ b
        { 1               ; x > b



Checks:

FX(a⁺) = (a − a)/(b − a) = 0 (lower limit, from above)
FX(b⁻) = (b − a)/(b − a) = 1 (upper limit, from below)

Note: The cdf is continuous but not everywhere differentiable, since it is not differentiable at a or b.
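A direct Python translation of this piecewise cdf, as a sketch (the notes' own code is in R; the name here is illustrative):

```python
def unif_cdf(x: float, a: float, b: float) -> float:
    # Cdf of U(a, b): 0 below a, (x - a)/(b - a) on [a, b], 1 above b.
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# Checks from the slides: F(a) = 0 and F(b) = 1.
```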

The Triangular Distribution (Symmetric)


• If two independent (this will be defined later) rvs are both
U (a, b) , then their sum is triangular on (2a, 2b) .
• The pdf is given in the picture below, where c = 2a and d = 2b.
• We shall prove this later in more generality. The pdf of the sum
of two independent rvs is known as the convolution of the pdfs of
the two rvs.

• For the case where a = 0 and b = 1, we have

fX(x) = { x     ; 0 < x ≤ 1
        { 2 − x ; 1 < x ≤ 2
        { 0     ; otherwise.

• The cdf FX(x) is a little complicated. When x ∈ (0, 1),

FX(x) = ∫_0^x u du = [u²/2]_0^x = x²/2,



while, when x ∈ (1, 2),

FX(x) = P(X ≤ 1) + ∫_1^x (2 − u) du
      = FX(1) + [2u − u²/2]_1^x
      = 1/2 + (2x − x²/2) − (2 − 1/2)
      = 2x − x²/2 − 1
      = 1 − (1/2)(2 − x)².



Thus

FX(x) = { 0                 ; x ≤ 0
        { x²/2              ; 0 < x ≤ 1
        { 1 − (1/2)(2 − x)² ; 1 < x ≤ 2
        { 1                 ; x > 2.
• Note the symmetry about x = 1.
• Checks:

FX(0⁺) = 0²/2 = 0
FX(1⁻) = 1²/2 = 1/2
FX(1⁺) = 2(1) − 1²/2 − 1 = 1/2
FX(2⁻) = 2(2) − 2²/2 − 1 = 1
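The closed-form cdf for a = 0, b = 1 (so the sum lives on (0, 2)) can be checked numerically; a Python sketch:

```python
def tri_cdf(x: float) -> float:
    # Cdf of the symmetric triangular distribution on (0, 2).
    if x <= 0:
        return 0.0
    if x <= 1:
        return x * x / 2
    if x <= 2:
        return 1 - (2 - x) ** 2 / 2
    return 1.0

# Checks from the slides: F(0) = 0, F(1) = 1/2, F(2) = 1,
# plus the symmetry F(1 - t) = 1 - F(1 + t).
```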
The Cauchy Distribution
• What value of k makes k/(1 + x²) a valid pdf on R?

• First, does the function satisfy the non-negativity condition? Obviously, provided k > 0.
• What about its integral?

∫_{−∞}^{∞} k/(1 + x²) dx = k [tan⁻¹x]_{−∞}^{∞}
                         = k[(π/2) − (−π/2)]
                         = kπ.

• The function is thus a pdf if kπ = 1, i.e. if k = 1/π.


• The pdf of a Cauchy random variable is thus

fX(x) = 1/(π(1 + x²)).

• What is the cdf?



• ∀x,

FX(x) = ∫_{−∞}^{x} f(u) du
      = ∫_{−∞}^{x} 1/(π(1 + u²)) du
      = (1/π) [tan⁻¹u]_{−∞}^{x}
      = (1/π) [tan⁻¹x − (−π/2)]
      = 1/2 + (1/π) tan⁻¹x.



Checks:

FX(−∞) = 1/2 + (1/π) tan⁻¹(−∞) = 1/2 + (1/π)(−π/2) = 0,
FX(∞) = 1/2 + (1/π) tan⁻¹(∞) = 1/2 + (1/π)(π/2) = 1,
FX(0) = 1/2 + (1/π) tan⁻¹(0) = 1/2 + 0 = 1/2.
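The closed form FX(x) = 1/2 + (1/π) tan⁻¹x is easy to verify numerically; a Python sketch:

```python
import math

def cauchy_cdf(x: float) -> float:
    # Standard Cauchy cdf: 1/2 + arctan(x)/pi.
    return 0.5 + math.atan(x) / math.pi

# tan(pi/4) = 1, so F(1) = 1/2 + 1/4 = 3/4; by symmetry F(-1) = 1/4.
```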
The Exponential Distribution
• Let V be the time between two successive events in a Poisson
process {X (t)} with parameter λ.
• As the numbers of events in non-overlapping intervals are
independent, this will have the same distribution as the time it
takes a single event to occur, given that X (0) = 0.
• Consider the event {V > v} . This is the same as the event
{X (v) = 0} . To see this, note that if X (v) = 0, then
X (t) = 0, ∀t ≤ v, and so V > v. Conversely, if V > v, then the
time at which the next event occurs is greater than v, and so
X (v) = 0.



• Thus

P(V > v) = P(X(v) = 0 | X(0) = 0)
         = e^{−λv}(λv)⁰/0!
         = e^{−λv}.

Now

FV(v) = P(V ≤ v) = 1 − e^{−λv}.

Hence, for v > 0,

fV(v) = d/dv (1 − e^{−λv}) = λe^{−λv}.



• The pdf

fV(v) = { λe^{−λv} ; v > 0
        { 0        ; v ≤ 0

is known as the exponential distribution with parameter λ.



Some examples of quantities that have been modelled as
exponentially distributed random variables:
• The time it takes for someone to pick up a ringing phone;
• The amount of time until two vehicles meet at a one-lane bridge;
• The amount of time (from now) until an earthquake occurs;

“Lack of Memory” or Memoryless Property

• Let X be distributed exponentially with parameter λ. Consider


the conditional probability

P(X > a + b | X > a).

This is the probability that X exceeds a + b, given that we know X exceeds a. [Note: a and b must both be ≥ 0.]



• By definition,

P(X > a + b | X > a) = P({X > a + b} ∩ {X > a}) / P(X > a)
                     = P(X > a + b) / P(X > a),

by the rule of conditional probability, noting that the event {X > a + b} is a subset of the event {X > a}. Thus



P(X > a + b | X > a) = (1 − P(X ≤ a + b)) / (1 − P(X ≤ a))
                     = (1 − FX(a + b)) / (1 − FX(a))
                     = (1 − {1 − e^{−λ(a+b)}}) / (1 − {1 − e^{−λa}})
                     = e^{−λ(a+b)} / e^{−λa}
                     = e^{−λb}
                     = 1 − (1 − e^{−λb})
                     = 1 − FX(b)
                     = P(X > b).


Thus
P (X > a + b | X > a) = P (X > b)
provided a and b are both non-negative.
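The cancellation above can be seen numerically from the survival function P(X > x) = e^{−λx}; a Python sketch (the values of λ, a, b are arbitrary choices of mine):

```python
import math

def survival(x: float, lam: float) -> float:
    # P(X > x) = exp(-lam * x) for an exponential rv, x >= 0.
    return math.exp(-lam * x)

lam, a, b = 1.0, 1.0, 0.5
cond = survival(a + b, lam) / survival(a, lam)  # P(X > a + b | X > a)
unc = survival(b, lam)                          # P(X > b)
# cond equals unc: the exponential has no memory.
```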

• How do we interpret this? Suppose we have modelled the life of a


lightbulb as being exponentially distributed with parameter 1
(year). Suppose the lightbulb has been working for 1 year. The
above shows that the remaining life is also exponentially
distributed with parameter 1 (year). We say that the exponential
distribution has no memory. This should indicate that the
exponential model is probably not a very good model for the life
of a lightbulb.



The Erlangian Distribution
• Now consider the total time until the second event occurs in a
Poisson process. Let W be the time until the second event occurs
in a Poisson process {X (t)} with parameter λ, when X (0) = 0.
Now

P(W > w) = P(2nd event occurs at a time > w)
         = P(X(w) = 0 or 1).



Thus

1 − FW(w) = P(0 events in (0, w)) + P(1 event in (0, w))
          = e^{−λw}(λw)⁰/0! + e^{−λw}(λw)¹/1!
          = e^{−λw}(1 + λw),

and so

FW(w) = { 1 − e^{−λw}(1 + λw) ; w > 0
        { 0                   ; w ≤ 0

• The pdf of W is found by differentiating:

fW(w) = d/dw FW(w)
      = { λe^{−λw}(1 + λw) − e^{−λw}λ ; w > 0
        { 0                           ; w ≤ 0

Thus

fW(w) = { λ²we^{−λw} ; w > 0
        { 0          ; w ≤ 0.

• Generalisation: What is the pdf of the time until the kth event in
a Poisson process with parameter λ?



The Gamma Function

• The gamma function is defined by

Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx,

where the condition α > 0 is needed for the integral to be finite.

• Properties:

Γ(1) = ∫_0^∞ e^{−x} dx = [−e^{−x}]_0^∞ = [(−0) − (−1)] = 1.



Now

Γ(2) = ∫_0^∞ x e^{−x} dx = ∫_0^∞ (−x) d(e^{−x}).

Thus, integrating by parts,

Γ(2) = [−xe^{−x}]_0^∞ − ∫_0^∞ (−e^{−x}) dx
     = [(−0) − (−0)] + Γ(1)
     = 1.



More generally, when α > 0,

Γ(α + 1) = αΓ(α).

Proof:

Γ(α + 1) = ∫_0^∞ x^α e^{−x} dx
         = ∫_0^∞ (−x^α) d(e^{−x})
         = −[x^α e^{−x}]_0^∞ + α ∫_0^∞ x^{α−1} e^{−x} dx
         = 0 + αΓ(α).



Note that for integer α,

Γ(α) = (α − 1)Γ(α − 1)
     = (α − 1)(α − 2)Γ(α − 2)
     = ···
     = (α − 1)(α − 2) · · · 1 · Γ(1)
     = (α − 1)!
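Python's `math.gamma` implements Γ, so the recurrence and the factorial identity can be checked directly (a sketch; the value α = 2.7 is an arbitrary choice of mine):

```python
import math

# Recurrence: Gamma(alpha + 1) = alpha * Gamma(alpha), for alpha > 0.
alpha = 2.7
lhs = math.gamma(alpha + 1)
rhs = alpha * math.gamma(alpha)

# Integer case: Gamma(5) = 4! = 24.
g5 = math.gamma(5)
```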

• If α > 0, but is not integer, then the Γ function is still defined. In


fact, as long as we know the values of Γ(x) for x ∈ (0, 1), we can
calculate all values of Γ(x). For example,

Γ(3 1/2) = Γ(7/2) = (5/2) Γ(5/2)
                  = (5/2) × (3/2) Γ(3/2)
                  = (5/2) × (3/2) × (1/2) Γ(1/2)
                  = (15/8) Γ(1/2).

It can be shown that Γ(1/2) = √π.

Let W = Wk be the time until the occurrence of the kth event in a Poisson process with parameter λ. Then, for k = 1, 2, 3, . . .

1 − FW(w) = P(W > w)
          = P(0 or 1 or . . . or (k − 1) events occur in (0, w))
          = Σ_{r=0}^{k−1} e^{−λw}(λw)^r / r!

(Note that the kth event has to occur at a time > w.) Hence

FW(w) = { 1 − e^{−λw} Σ_{r=0}^{k−1} (λw)^r / r! ; w > 0
        { 0                                     ; w ≤ 0



and

fW(w) = −d/dw { e^{−λw} Σ_{r=0}^{k−1} (λw)^r / r! }
      = λe^{−λw} Σ_{r=0}^{k−1} (λw)^r / r! − e^{−λw} Σ_{r=1}^{k−1} rλ(λw)^{r−1} / r!
      = λe^{−λw} Σ_{r=0}^{k−1} (λw)^r / r! − λe^{−λw} Σ_{s=0}^{k−2} (λw)^s / s!
      = λe^{−λw} (λw)^{k−1} / (k − 1)!
      = λ^k w^{k−1} e^{−λw} / (k − 1)!
Special cases:



1. k = 1. W is exponential with parameter λ. As expected,

   fW(w) = { λe^{−λw} ; w > 0
           { 0        ; w ≤ 0

2. k integer. W is said to have the Erlangian distribution. The rv W can be thought of as the sum of k independent and identically distributed exponential rvs, each with parameter λ. This has resulted in the Erlangian distribution being used in instances where the lack of memory of the exponential prevents it from being used.

   fW(w) = { λ^k w^{k−1} e^{−λw} / (k − 1)! ; w > 0
           { 0                              ; w ≤ 0.

3. Although we have assumed that k is an integer in the above, we

know that when α > 0, the function

   f(w) = { λ^α w^{α−1} e^{−λw} / Γ(α) ; w > 0
          { 0                          ; w ≤ 0

is a pdf, since, using the substitution x = λw,

   ∫_0^∞ λ^α w^{α−1} e^{−λw} / Γ(α) dw = ∫_0^∞ x^{α−1} e^{−x} / Γ(α) dx = 1.

4. The parameter α is also known as the shape parameter. It is usual to put β = 1/λ; β is called the scale parameter. The pdf is then known as the Gamma distribution with parameters α and β:

   fX(x) = x^{α−1} e^{−x/β} / (β^α Γ(α))



This is denoted as X ∼ G(α, β).
5. The chi-squared distribution with n degrees of freedom, χ²_n, is a particular case of G(α, β) with α = n/2, β = 2. The pdf of χ²_n, n > 0, is

   fX(x) = { x^{n/2−1} e^{−x/2} / (2^{n/2} Γ(n/2)) ; x > 0
           { 0                                     ; x ≤ 0.
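The densities above are related; a Python sketch checking that the Erlangian is the integer-α Gamma, and that χ² with 2 degrees of freedom is exponential with λ = 1/2 (the function names are mine, for illustration):

```python
import math

def gamma_pdf(w: float, alpha: float, lam: float) -> float:
    # lam^alpha * w^(alpha-1) * e^(-lam*w) / Gamma(alpha), for w > 0.
    if w <= 0:
        return 0.0
    return lam ** alpha * w ** (alpha - 1) * math.exp(-lam * w) / math.gamma(alpha)

def erlang_pdf(w: float, k: int, lam: float) -> float:
    # lam^k * w^(k-1) * e^(-lam*w) / (k-1)!, for w > 0, k a positive integer.
    if w <= 0:
        return 0.0
    return lam ** k * w ** (k - 1) * math.exp(-lam * w) / math.factorial(k - 1)

def chi2_pdf(x: float, n: float) -> float:
    # Chi-squared with n df: the Gamma pdf with alpha = n/2, lam = 1/2.
    return gamma_pdf(x, n / 2, 0.5)
```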



Some shapes of the Gamma pdf are shown below.

[Figure: Gamma pdfs fX(x) for (α, β) = (1, 5), (2, 5), (3, 5), plotted for x from 0 to 30.]



The Beta Distribution

This is a two-parameter density function defined over the closed interval [0, 1]. As such, it is often used as a model for proportions, such as the proportion of impurities in a chemical product or the proportion of time that a machine is in a state of being repaired, etc.

The beta function

• The beta function is defined by

B(α, β) = ∫_0^1 x^{α−1} (1 − x)^{β−1} dx.

• For the existence of the integral, we need α > 0 and β > 0.



• We’ll now show that

B(α, β) = Γ(α)Γ(β) / Γ(α + β).

• By definition,

Γ(α)Γ(β) = ∫_0^∞ e^{−x} x^{α−1} dx ∫_0^∞ e^{−y} y^{β−1} dy
         = ∫_0^∞ e^{−x} x^{α−1} ( ∫_0^∞ e^{−y} y^{β−1} dy ) dx.

• This is the volume between the surface z = e^{−x} x^{α−1} e^{−y} y^{β−1} and the x–y plane.

• Let’s make the change of variable u = x + y in the inner integral. Note that dy = du. The above integral is then

∫_0^∞ ( ∫_x^∞ e^{−u} x^{α−1} (u − x)^{β−1} du ) dx.

• Changing the order of integration, the integral becomes

∫_0^∞ ( ∫_0^u e^{−u} x^{α−1} (u − x)^{β−1} dx ) du.



• Now put x = uz. We get dx = u dz and the integral becomes

∫_0^∞ ( ∫_0^1 e^{−u} (uz)^{α−1} (u − uz)^{β−1} u dz ) du
= ∫_0^∞ ( ∫_0^1 u^{α+β−1} e^{−u} z^{α−1} (1 − z)^{β−1} dz ) du
= ( ∫_0^∞ u^{α+β−1} e^{−u} du ) ( ∫_0^1 z^{α−1} (1 − z)^{β−1} dz )
= Γ(α + β) B(α, β).

• Hence

Γ(α)Γ(β) = Γ(α + β) B(α, β)

and

B(α, β) = Γ(α)Γ(β) / Γ(α + β).
• For the special case where α = β = 1/2, we obtain, since Γ(1) = ∫_0^∞ e^{−x} dx = 1,

{Γ(1/2)}² = Γ(1) B(1/2, 1/2) = B(1/2, 1/2).

• Now

B(1/2, 1/2) = ∫_0^1 x^{−1/2} (1 − x)^{−1/2} dx.

• Put x = sin²θ. Then dx = 2 sinθ cosθ dθ and so

B(1/2, 1/2) = ∫_0^{π/2} (1/(sinθ cosθ)) · 2 sinθ cosθ dθ
            = 2 ∫_0^{π/2} dθ = π.

• We thus have

{Γ(1/2)}² = π



and so

∫_0^∞ (e^{−x}/√x) dx = √π.
• We’ll need this soon.
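The identity B(α, β) = Γ(α)Γ(β)/Γ(α + β) and the value B(1/2, 1/2) = π can be verified numerically in Python (midpoint-rule integration; the helper names are mine):

```python
import math

def beta_via_gamma(a: float, b: float) -> float:
    # B(a, b) computed from the gamma-function identity.
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_via_integral(a: float, b: float, n: int = 10000) -> float:
    # B(a, b) from its defining integral, by the midpoint rule.
    h = 1.0 / n
    return h * sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
                   for i in range(n))
```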



The Beta Distribution

• The random variable X is said to follow the beta distribution with parameters α and β (α > 0 and β > 0) if

fX(x) = { (1/B(α, β)) x^{α−1} (1 − x)^{β−1} ; x ∈ (0, 1)
        { 0                                 ; elsewhere

       = { (Γ(α + β)/(Γ(α)Γ(β))) x^{α−1} (1 − x)^{β−1} ; x ∈ (0, 1)
         { 0                                           ; elsewhere.



• The graphs of a few beta pdfs are shown below.

[Figure: beta pdfs for several (α, β) values.]
• The standard uniform distribution (on the range 0 to 1) is just the beta with parameters α = 1 and β = 1.



The Normal Distribution
• The normal or Gaussian distribution is perhaps the most widely used of all the continuous probability distributions. The shape of the pdf is the familiar bell-shaped curve. The two parameters, µ and σ², completely determine the location (µ) and shape (σ²) of the normal pdf:

f(x) = (1/√(2πσ²)) exp{ −(1/2) ((x − µ)/σ)² } ; −∞ < x < ∞

• Why is this a pdf? From above, putting y = (x − µ)/σ, we have

∫_{−∞}^{∞} exp{ −(1/2) ((x − µ)/σ)² } dx = ∫_{−∞}^{∞} exp(−y²/2) σ dy
                                         = 2σ ∫_0^∞ exp(−y²/2) dy.




Now put u = y²/2. Then y = √(2u), dy = (1/√(2u)) du, and so the above is

2σ ∫_0^∞ (1/√(2u)) exp(−u) du = √2 σ Γ(1/2) = √(2πσ²).

• Thus f (x) is a pdf.


• When X has this pdf, we write X ∼ N(µ, σ²).

The standard normal has µ = 0 and σ² = 1, and the notation Z is usually reserved to denote a rv having this special distribution. We shall interpret the parameters µ and σ² later.



1. [Figure: pdf of Z ∼ N(0, 1)]

2. [Figure: pdf of X ∼ N(µ, σ²)]


Checks:
1. f is obviously non-negative;
2. We’ve shown its integral is 1.
Hence

f(x) = (1/(√(2π) σ)) e^{−(1/2)((x−µ)/σ)²} ; −∞ < x < ∞

is a valid probability density function.
We can obtain a random variable X which has the N(µ, σ²) distribution from Z which has the N(0, 1) distribution via the equations

Z = (X − µ)/σ and X = µ + σZ.

They have the same shapes but different scales and locations.
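The standardisation can be checked numerically. Python's standard library has no normal cdf, but Φ can be written with the error function, Φ(z) = (1 + erf(z/√2))/2 (a sketch; function names are mine):

```python
import math

def phi(z: float) -> float:
    # Standard normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_cdf(x: float, mu: float, sigma: float) -> float:
    # P(X <= x) for X ~ N(mu, sigma^2), by standardising: Phi((x - mu)/sigma).
    return phi((x - mu) / sigma)
```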



Cumulative Distribution Function

• The normal cdf cannot be written down in a nice closed form.


But neither can the log, exp, sin and cos functions.
• The normal cdf is important enough to be tabulated. It is also
very easy to calculate it using the well-known function erf, which
is called the “error function”.
• If X ∼ N(µ, σ²), then the transformation

Z = (X − µ)/σ ∼ N(0, 1)

can be used to determine “areas under the curve” for any normal distribution.
can be used to determine “areas under the curve” for any normal
distribution.
• Some tables give the area between 0 and x (x > 0), some between −∞ and x, and some between x and ∞. All areas can be evaluated from any of these sets of tables.
• The cdf of a standard normal random variable is commonly written as Φ(z), while the probability density function is written as φ(z). Thus

Φ(z) = ∫_{−∞}^{z} φ(u) du,

where

φ(z) = (2π)^{−1/2} e^{−z²/2}.

• Note that Φ(−z) = 1 − Φ(z), since φ(z) is symmetric about z = 0.



The Normal Approximation to the Binomial distribution

• The Poisson distribution can be used to approximate the


Binomial when n is large and p is very small. However, when p is
“medium” and n “large”, the binomial is better approximated by
the normal distribution. We shall prove this later on.
• If
Y ∼ Bin (n, p)
then we may approximate Y ’s distribution with that of a normal
rv with

µ = np
σ 2 = np (1 − p) .

The interpretation of µ and σ 2 will come later.



 
• Thus we may approximate P(Y ≤ y) by Φ((y − np)/√(np(1 − p))).

Continuity Correction

• Since the binomial is discrete and the normal is continuous, an


additional factor must be included to account for this.



[Figure: binomial probabilities P(X = x) for x = 11 to 18, drawn as bars.]

• The B(n, p) probability P(X = x) is approximated by the area under the N(np, np(1 − p)) curve between x − 1/2 and x + 1/2.
• As an example, consider the probability of getting between 70
and 72 heads (inclusive) when tossing a biased coin 100 times,



where the probability of a heads appearing on any one toss is 0.6.
• Let Y be the number of heads in 100 tosses of the coin. Then

Y ∼ Bin (100, 0.6) .

• Now, writing C(n, r) for the binomial coefficient,

P(70 ≤ Y ≤ 72) = C(100, 70)(0.6)^70 (0.4)^30 + C(100, 71)(0.6)^71 (0.4)^29
                 + C(100, 72)(0.6)^72 (0.4)^28
              ≃ 0.0201824



• Using the normal approximation, we have

µ = np = 100(0.6) = 60
σ² = np(1 − p) = 100(0.6)(0.4) = 24

• Without Continuity Correction: the area under the curve between 70 and 72 is

P(70 ≤ Y ≤ 72) = P((70 − 60)/√24 ≤ (Y − µ)/σ ≤ (72 − 60)/√24)
              ≃ P(2.04 ≤ Z ≤ 2.45)
              = 0.0136

The relative error is 33%.
• With Continuity Correction: the area under the curve between 70 − 1/2 and 72 + 1/2 is

P(70 ≤ Y ≤ 72) = P(70 − 1/2 ≤ Y ≤ 72 + 1/2)
              = P((69.5 − 60)/√24 ≤ (Y − µ)/σ ≤ (72.5 − 60)/√24)
              ≃ P(1.939 ≤ Z ≤ 2.552)
              ≃ 0.9946 − 0.9737 = 0.0209

which is a much better approximation. The relative error is now


only 3.1%.
• What is the probability of obtaining fewer than 50 heads?



• The exact answer is

P(Y < 50) = P(Y ≤ 49)
          = Σ_{r=0}^{49} C(100, r) (0.6)^r (0.4)^{100−r}

The R command pbinom(49,100,0.6) gives 0.01676169.

• Using the normal approximation with continuity correction,

P(Y < 50) = P(Y ≤ 49)
          = P((Y − µ)/σ ≤ (49.5 − 60)/√24)
          ≃ P(Z < −2.14)
          ≃ 0.0162,

for which the relative error is −3.6%.



• Without continuity correction, we obtain

P(Y < 50) = P((Y − µ)/σ < (50 − 60)/√24)
          ≃ P(Z < −2.04)
          ≃ 0.0207,

for which the relative error is 23%.
• Some books call the normal approximation to the binomial the
De Moivre-Laplace Limit Theorem.
• This approximation is quite good for values of n and p satisfying np(1 − p) ≥ 10.
There are no rules, in general, how big a sample size should be
before the approximation is arbitrarily accurate. The n = 30 rule
you may have seen before is just a rule of thumb.
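The whole worked example can be reproduced in a few lines of Python (math.comb for the binomial coefficients, erf for Φ); a sketch:

```python
import math

def phi(z: float) -> float:
    # Standard normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_pmf(n: int, p: float, y: int) -> float:
    # P(Y = y) for Y ~ Bin(n, p).
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

n, p = 100, 0.6
mu = n * p                          # 60
sigma = math.sqrt(n * p * (1 - p))  # sqrt(24)

exact = sum(binom_pmf(n, p, y) for y in range(70, 73))         # ~0.0202
no_cc = phi((72 - mu) / sigma) - phi((70 - mu) / sigma)        # ~0.0135
with_cc = phi((72.5 - mu) / sigma) - phi((69.5 - mu) / sigma)  # ~0.0208
```

The continuity-corrected value lands much closer to the exact binomial probability, as the slides report.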



R functions

x = seq(-0.5, 2, 0.1)
Fx = x^2
Fx[x <= 0] = 0
Fx[x > 1] = 1
plot(x, Fx, type = "l", ylab = "F(x)",
     main = "Cumulative distribution function", lwd = 2)



[Figure: the resulting plot of F(x) against x for x from −0.5 to 2, titled “Cumulative distribution function”.]



For each standard probability distribution, there are 4 R functions. For example, for the uniform distribution:
• dunif(x,min=a,max=b) returns the value of the uniform
probability density function U(a, b) at point x.
> dunif(0.1,0,2)
[1] 0.5
• punif(x,min=a,max=b) returns the value of the cumulative
distribution function (cdf) of the U(a, b) at point x.
> punif(0.1,0,2)
[1] 0.05
• qunif(q,min=a,max=b) returns the q-quantile of the uniform
distribution U(a, b).
> qunif(0.7,0,2)
[1] 1.4



• runif(M,min=a,max=b) generates M random draws from the uniform distribution U(a, b).
> runif(5,0,2)
[1] 0.8512345 0.6658686 1.4998520 0.9878262 1.2498115
Similarly, there are 4 R functions for other continuous distributions
considered in this topic:
• Cauchy dcauchy, pcauchy, qcauchy, rcauchy
• exponential dexp, pexp, qexp, rexp
• Gamma dgamma, pgamma, qgamma, rgamma
• chi-squared dchisq, pchisq, qchisq, rchisq
• Beta dbeta, pbeta, qbeta, rbeta
• normal dnorm, pnorm, qnorm, rnorm
