Math 5846 Chapter 2

School of Mathematics and Statistics

UNSW Sydney

Introduction to Probability and Stochastic Processes

OPEN LEARNING
CHAPTER 2

2 / 100
Outline:
2.1 Introduction

2.2 Discrete Random Variables and Probability Mass Functions

2.3 Cumulative Distribution Functions

2.4 Continuous Random Variables and Probability Density Functions

2.5 Expectation and Moments

2.6 Expectation of Transformed Random Variables

2.7 Expectation of Random Variables under Changes of Scale

2.8 Standard Deviation and Variance

2.9 Moment Generating Functions

2.10 Properties of Moment Generating Functions

2.11 Chebychev’s Inequality

2.12 Supplementary Material

3 / 100
2.1 Introduction

4 / 100
Consider a situation where we want to measure something
of interest across multiple subjects. For example, we may
be interested in measuring the weight loss achieved by
participants in a weight loss program.

Inevitably, our measurements always vary from subject to
subject due to factors beyond our control or knowledge.
For this reason, we treat the measurements as random
variables - variables that have some random component.

In this chapter, we will explore random variables and some
properties of random variables that are important in a
study or experiment.

5 / 100
2.2 Discrete Random Variables and
Probability Mass Functions

7 / 100
Definition
The probability mass function, P(X = x), is the
probability that the random variable X takes the value of x.

For a discrete sample space S, a random variable X is a
function defined on S with

P(X = x) = Σ_{s : X(s) = x} P({s})

being the probability that X takes the value of x.

8 / 100
Properties
The probability mass function of a discrete random variable
X has the following properties:

1. P(X = x) ≥ 0 for all x ∈ R
2. Σ_{all x} P(X = x) = 1.

9 / 100
Example
Suppose a fair coin is tossed three times. The sample space
S is

S = {HHH, HHT, HT H, T HH, T T H, T HT, HT T, T T T }.

Let the random variable X denote the number of heads
that turned up. The probability mass function of X is given by

P(X = 0) = P({TTT}) = 1/8
P(X = 1) = P({TTH, THT, HTT}) = 3/8
P(X = 2) = P({HHT, HTH, THH}) = 3/8
P(X = 3) = P({HHH}) = 1/8

Observe that Σ_{x=0}^{3} P(X = x) = 1.

10 / 100
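As an aside (not part of the original slides), the pmf above can be checked by brute-force enumeration. The following Python sketch, assuming a standard Python 3 environment, tabulates X over the eight equally likely outcomes and confirms that the probabilities sum to one.

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of three fair coin tosses
# and tabulate X = number of heads.
outcomes = list(product("HT", repeat=3))
counts = Counter(seq.count("H") for seq in outcomes)

pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}
print(pmf)                     # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
print(sum(pmf.values()) == 1)  # True: the probabilities sum to one
```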
2.3 Cumulative Distribution Functions

11 / 100
Definition
The cumulative distribution function (cdf ) of a
random variable X is defined as

FX (x) = P (X ≤ x).

12 / 100
Result
Let X be a random variable with cdf FX (x). Then,

FX (b) − FX (a) = P (a < X ≤ b).

13 / 100
Proof.

FX (b) − FX (a) = P (X ≤ b) − P (X ≤ a) (by the definition of cdf)


= P ({a < X ≤ b} ∪ {X ≤ a}) − P (X ≤ a)
= P (a < X ≤ b) + P (X ≤ a) − P (X ≤ a)
(since {a < X ≤ b} ∪ {X ≤ a} are mutually exclusive events)

= P (a < X ≤ b).

14 / 100
The cdf FX (x) of a random variable X has the following
properties:

The cdf FX (x) is a non-decreasing function of x for all
x ∈ R.
The cdf FX (x) ranges from zero to one, which is
reasonable since it is a probability.
If X is a discrete random variable whose minimum value is
a, then FX (a) = P (X ≤ a) = P (X = a). If c < a, then
FX (c) = 0.
If the maximum value of X is b, then
FX (b) = P (X ≤ b) = 1. If d ≥ b, then FX (d) = 1.
All the probabilities of X can be stated in terms of the cdf
FX (x).

15 / 100
Example
For the three coin toss example, the cumulative
distribution function is given by

         | 0,    x < 0
         | 1/8,  0 ≤ x < 1
FX (x) = | 4/8,  1 ≤ x < 2
         | 7/8,  2 ≤ x < 3
         | 1,    x ≥ 3.

16 / 100
Example
Let p be the probability of a head on a single coin toss.
Suppose this coin is tossed until a head turns up for the
first time. Let X denote the number of tosses required to
get the first head.

Find the probability mass function and the
cumulative distribution function of X.

17 / 100
Example

Solution:
Assuming that each toss is independent, for n = 1, 2, 3, . . . ,
the probability mass function of X is

P(X = n) = P({T, T, . . . , T, H}) = (1 − p)^(n−1) p,

where the outcome consists of (n − 1) tails followed by a head,
and for a single toss, P(T) = 1 − p and P(H) = p.

18 / 100
Example

Solution - continued:
The cumulative distribution function of X is

FX (n) = Σ_{i=1}^{n} P(X = i)
       = Σ_{i=1}^{n} (1 − p)^(i−1) p
       = p Σ_{i=1}^{n} (1 − p)^(i−1)
       = p [ 1 + (1 − p) + (1 − p)² + · · · + (1 − p)^(n−1) ]
         (This is a geometric sum.)
       = p [ (1 − (1 − p)^n) / (1 − (1 − p)) ]
       = p [ (1 − (1 − p)^n) / p ]
       = 1 − (1 − p)^n,    n = 1, 2, 3, . . . .

19 / 100
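As an illustrative check of this derivation (the value p = 0.3 below is an assumption, not from the slides), the summed pmf can be compared with the closed form FX (n) = 1 − (1 − p)^n:

```python
# Check F_X(n) = 1 - (1 - p)**n against the partial sum of the pmf
# for the number of tosses X until the first head.
p = 0.3  # illustrative value of P(head); any 0 < p < 1 works

def pmf(i, p):
    return (1 - p) ** (i - 1) * p

for n in (1, 2, 5, 10):
    cdf_sum = sum(pmf(i, p) for i in range(1, n + 1))
    cdf_closed = 1 - (1 - p) ** n
    print(n, round(cdf_sum, 10), round(cdf_closed, 10))  # the two columns agree
```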
2.4 Continuous Random Variables and
Probability Density Functions

20 / 100
When a random variable has a continuum of possible
values, it is continuous.

An example of a continuous random variable is a
lightbulb’s lifetime, with possible values in [0, ∞).

The analogue of the probability mass function in the
discrete case is the probability density function (pdf).

21 / 100
Definition
The probability density function (pdf) of a
continuous random variable X is a real-valued function fX on R
with the property

∫_A fX (x) dx = P(X ∈ A)

for any (measurable) set A ⊂ R.

Results
1. fX (x) ≥ 0 for all x ∈ R
2. ∫_{−∞}^{+∞} fX (x) dx = 1.

22 / 100
Regardless of whether a random variable X is continuous
or discrete, its cumulative distribution function (cdf)
is defined by
FX (x) = P (X ≤ x).

The next set of results shows how FX may be found from
fX and vice versa.

23 / 100
Continuous random variables X have the property

P (X = a) = 0 for all a ∈ R.     (1)

It only makes sense to talk about the probability of X
lying in some subset of R.
A consequence of Equation (1) is that, with continuous
random variables, we do not worry about distinguishing
between < and ≤ signs. The probabilities are not affected.
For example, if X is a continuous random variable, then

P (a < X < b) = P (a ≤ X < b) = P (a < X ≤ b) = P (a ≤ X ≤ b).

This is not the case for discrete random variables. The
inclusion and/or exclusion of the endpoints will change your
answer.
24 / 100
Results
1. The cumulative distribution function (cdf) FX of a
continuous random variable X can be found from the
probability density function fX via

FX (x) = ∫_{−∞}^{x} fX (y) dy.

2. The probability density function fX of a continuous
random variable X can be found from the cumulative
distribution function (cdf) FX via

fX (x) = F′X (x).

25 / 100
Results
3. For any continuous random variable X and pair of
numbers a ≤ b,

P (a ≤ X ≤ b) = P (a < X < b)
             = P (a ≤ X < b)
             = P (a < X ≤ b)
             = ∫_a^b fX (x) dx
             = the area under fX between a and b.

26 / 100
These results demonstrate the importance of fX (x) and
FX (x).

If you know (or can derive) either fX (x) or FX (x), you
can derive any probability you want to know about X (and
indeed any property of X).

Some other important properties of the continuous random
variable X will be discussed later.

27 / 100
Example
The lifetime (in thousands of hours) of a light bulb, X, has
density fX (x) = e−x , x > 0.

Find the probability that the light bulb lasts


between two thousand and three thousand hours.

28 / 100
Example

Solution:
There are several ways to obtain this probability.

1. P (2 ≤ X ≤ 3) = ∫_2^3 fX (x) dx = ∫_2^3 e−x dx = [−e−x]_2^3
   = e−2 − e−3 ≈ 0.085.

2. Using the cdf of X, FX (x) = P (X ≤ x) = ∫_{−∞}^{x} fX (y) dy
   = ∫_0^x e−y dy = 1 − e−x, x > 0. Therefore,

   P (2 ≤ X ≤ 3) = P (X ≤ 3) − P (X ≤ 2)
                 = FX (3) − FX (2) = (1 − e−3) − (1 − e−2)
                 = e−2 − e−3 ≈ 0.085.

29 / 100
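A hedged numerical check of the worked answer (not part of the slides): the closed form e−2 − e−3 can be compared with a crude midpoint-rule integration of the density over [2, 3].

```python
import math

# P(2 <= X <= 3) for f_X(x) = exp(-x), x > 0, two ways.
closed = math.exp(-2) - math.exp(-3)   # closed form from the solution

n = 100_000                            # midpoint-rule integration of the pdf over [2, 3]
h = (3 - 2) / n
riemann = sum(math.exp(-(2 + (k + 0.5) * h)) for k in range(n)) * h

print(round(closed, 6), round(riemann, 6))   # both approximately 0.085
```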
Definition
If FX is strictly increasing in some interval, then FX⁻¹ is
well defined.

For a specified p ∈ (0, 1), the pth quantile of FX is xp,
where
FX (xp) = p, or xp = FX⁻¹(p).

For example,
x0.5 is the median of FX , and
x0.25 and x0.75 are the lower and upper quartiles of
FX , respectively.

30 / 100
Example
Let X be a random variable with cumulative distribution
function FX (x) = 1 − e−x , x > 0.
Find the median and the lower and upper quartiles of
X.

31 / 100
Example

Solution:
Firstly, we observe that the cdf FX (x) is strictly increasing,
so we can easily find FX⁻¹. From the definition of the pth quantile,

p = 1 − e−xp  =⇒  xp = − ln(1 − p)  =⇒  FX⁻¹(p) = − ln(1 − p).

To compute the median, put p = 0.5 in the above
equation and compute FX⁻¹(p). That is, the median
= x0.5 = − ln(1 − 0.5) = − ln(0.5) = ln 2.

31 / 100
Example

Solution - continued:
For the lower and upper quartiles of X,

x0.25 = − ln(1 − 0.25) = − ln(0.75) = ln(4/3)

x0.75 = − ln(1 − 0.75) = − ln(0.25) = ln(4) = 2 ln(2).

32 / 100
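Since FX is invertible here, the quantiles can also be read off numerically. A small Python sketch (illustrative only) evaluates the inverse cdf xp = −ln(1 − p) at p = 0.25, 0.5, 0.75 and compares with the closed forms above.

```python
import math

def quantile(p):
    """Inverse cdf of F_X(x) = 1 - exp(-x): x_p = -ln(1 - p)."""
    return -math.log(1 - p)

print(quantile(0.50), math.log(2))        # median = ln 2
print(quantile(0.25), math.log(4 / 3))    # lower quartile = ln(4/3)
print(quantile(0.75), 2 * math.log(2))    # upper quartile = 2 ln 2
```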
2.5 Expectation and Moments

33 / 100
The mean or average of the numbers a1, a2, . . . , an is

(a1 + a2 + · · · + an)/n = a1 · 1/n + a2 · 1/n + · · · + an · 1/n.

Consider a random variable X with probability mass
function P (X = 5) = 1/5 and P (X = 10) = 4/5.
If we observe the values of, say, 100 random variables with the
same distribution as X, we would expect to observe about
20 5’s and about 80 10’s, so the mean or the average of the
100 numbers should be about

(5 × 20 + 10 × 80)/100 = 5 · 1/5 + 10 · 4/5 = 9.

That is, the mean or average is the sum of the possible
values of X weighted by their probabilities.

34 / 100
Definition
The expected value or mean of a discrete random
variable X is

E(X) = Σ_{all x} x · P (X = x).     (2)

By analogy, for the continuous random variable case, we
have

Definition
The expected value or mean of a continuous random
variable X is

E(X) = ∫_{−∞}^{+∞} x · fX (x) dx.     (3)

35 / 100
In both cases, E(X) has the interpretation of being
the long-run average of X.

That is, in the long run, as you observe an increasing
number of values of X, the average of these values
approaches E(X).

In both cases, E(X) has the physical interpretation of the
centre of gravity of the function fX .

So if a piece of thick wood or stone were carved in the shape
of fX , it would balance on a fulcrum placed at E(X).

36 / 100
Example
Let X be the number of females in a committee with three
members. Assume that there is a 50 : 50 chance of each
committee member being female, and the committee
members are chosen independently.

Find E(X).

37 / 100
Example

Solution:
First, we need to write out the sample space S and
determine the probability mass function of X, the number
of females in a committee with three members. We see that

S = {M M M, M M F, M F M, F M M, M F F, F M F, F F M, F F F },

where M and F represent male and female, respectively.

38 / 100
Example

Solution:
The corresponding probability mass function is

x            0     1     2     3
P (X = x)   1/8   3/8   3/8   1/8

By Equation (2), the expectation of X is

E(X) = Σ_{x=0}^{3} x P (X = x) = 0 · 1/8 + 1 · 3/8 + 2 · 3/8 + 3 · 1/8 = 3/2.

The interpretation of 3/2 is not what you expect X to be; however, if you
repeated the experiment, say, 100 times, then the average of the 100 numbers
observed should be 150/100 = 3/2. We expect to observe about 150 females in
total in 100 committees. We don’t expect to see exactly 1.5 females on each
committee!

39 / 100
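The long-run-average interpretation can be illustrated by simulation. The sketch below is an illustrative addition; the committee size 3 and the 50:50 chance come from the example, while the number of replications is an arbitrary choice.

```python
import random

random.seed(1)
n_committees = 100_000
total_females = sum(
    sum(random.random() < 0.5 for _ in range(3))   # X for one committee of 3
    for _ in range(n_committees)
)
print(total_females / n_committees)   # close to E(X) = 1.5
```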
Example
Suppose X is a number produced by a standard uniform random
number generator (such a generator can be found on most
hand-held calculators).

The probability density function of X is given by

         | 1   if 0 < x < 1
fX (x) = |
         | 0   otherwise

Find E(X).

40 / 100
Example

Solution:
By Equation (3), the expectation of X is

E(X) = ∫_{−∞}^{+∞} x fX (x) dx = ∫_0^1 x · 1 dx = [x²/2]_0^1 = 1/2.

Note that in the last two examples, the probability mass function and
the probability density function of X are symmetric, and they are symmetric
about E(X). As we will see in the next few examples, that is not
always true.

41 / 100
Example
Suppose X has probability mass function

P (X = x) = (1 − p)^(x−1) p,    x = 1, 2, . . . ;  0 < p < 1.

Find E(X).

42 / 100
Example

Solution:
Let q = 1 − p. By Equation (2),

E(X) = Σ_{x=1}^{∞} x P (X = x) = Σ_{x=1}^{∞} x q^(x−1) p = p Σ_{x=1}^{∞} x q^(x−1)

     = p Σ_{x=1}^{∞} d/dq (q^x)    (since d/dq (q^x) = x q^(x−1))

     = p d/dq ( Σ_{x=1}^{∞} q^x )  (since we can interchange the derivative and the summation)

     = p d/dq ( q / (1 − q) ) = p [1 · (1 − q) − q · (−1)] / (1 − q)²

     = p · 1/(1 − q)² = p · 1/p² = 1/p.

43 / 100
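As an informal check (not from the slides), the answer E(X) = 1/p can be confirmed by truncating the infinite sum for a few illustrative values of p:

```python
# Truncated sum of x * (1-p)^(x-1) * p, which should approach 1/p.
for p in (0.2, 0.5, 0.9):
    approx = sum(x * (1 - p) ** (x - 1) * p for x in range(1, 10_000))
    print(p, round(approx, 6), 1 / p)
```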
Example
Suppose X has probability density function

fX (x) = e−x , x > 0.

Find E(X).

44 / 100
Example

Solution:
By Equation (3),

E(X) = ∫_{−∞}^{+∞} x fX (x) dx = ∫_0^{+∞} x e−x dx

     = [−x e−x]_0^{+∞} − ∫_0^{+∞} (−e−x) dx
       (by integration by parts)

     = ∫_0^{+∞} e−x dx = [−e−x]_0^{+∞} = 0 − (−1) = 1.

45 / 100
Example
If X is degenerate, that is, X = c with probability 1 for
some constant c, then X is in fact a constant, and
E(X) = Σ_{all x} x P (X = x) = c · P (X = c) = c · 1 = c.

Thus, the expected value of a constant is the
constant, i.e.,
E(c) = c.

This is a special case.

46 / 100
2.6 Expectation of Transformed Random
Variables

47 / 100
Sometimes, we are interested in a transformation of random
variables.

Examples are
The circumference X of a tree trunk is measured, but we
want to know the cross-sectional area of the trunk.
The random variable of interest is π (X/(2π))².

The number of smartphones X sold in an electronics shop
is recorded, and revenue is of interest. The shop has
bought twenty smartphones at $500 each, sells them
for $650, and can redeem unsold smartphones for $350.
The variable of interest is 650 X + 350 (20 − X) − 20 × 500.

48 / 100
Transformations are also of interest when studying the
properties of a random variable.

For example, in order to understand X, it is often useful to
look at the rth moment of X about some constant a,
defined as E[(X − a)^r].

This is another example of a transformation of X, for
which we wish to find E[ g(X) ] for some function g(x).

49 / 100
Result
The expected value of a function g(X) of a random
variable X is

             | Σ_{all x} g(x) P (X = x)       if X is discrete
E[ g(X) ] =  |
             | ∫_{−∞}^{+∞} g(x) fX (x) dx     if X is continuous

50 / 100
Example
Let I denote the electric current through a particular
circuit, where I has pdf given by

          | 1/2   if 1 ≤ x ≤ 3
fI (x) =  |
          | 0     otherwise.

The power P is a function of I and the resistance. For
example, for a circuit with resistance three Ohms,

P = 3 I².

What is the expected value of P through this
circuit?

51 / 100
Example

Solution:

E[P] = E[3I²] = ∫_1^3 3x² fI (x) dx = ∫_1^3 3x² · 1/2 dx

     = (3/2) [x³/3]_1^3

     = (1/2) (27 − 1) = 13.

52 / 100
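A quick numerical illustration (assumed values only, not part of the slides): integrating 3x² · fI (x) over [1, 3] reproduces E[P] = 13, while g(E[I]) = 3 · 2² = 12 is a different number, anticipating the result below that E[g(X)] need not equal g(E[X]).

```python
# Midpoint-rule approximation of E[3 I^2] for I with density 1/2 on [1, 3].
n = 200_000
h = (3 - 1) / n
E_P = sum(3 * (1 + (k + 0.5) * h) ** 2 * 0.5 for k in range(n)) * h

print(round(E_P, 4))   # approximately 13, as in the worked solution
print(3 * 2 ** 2)      # g(E[I]) = 12: transforming the mean gives a different answer
```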
Result
In most situations,

E[ g(X) ] ≠ g[ E(X) ].

53 / 100
2.7 Expectation of Random Variables
Under Changes of Scale

54 / 100
Often, a change of scale is required when studying random
variables. For example, when a change of measurement units is
required, say grams → kilograms or °F → °C.

Results
If a is a constant and X is a discrete random variable,
then

E(X + a) = Σ_{all x} (x + a) P (X = x) = E(X) + a

E(aX) = Σ_{all x} (a x) P (X = x) = a E(X).

55 / 100
Results
If a is a constant and X is a continuous random
variable, then

E(X + a) = ∫ (x + a) fX (x) dx = E(X) + a

E(aX) = ∫ (a x) fX (x) dx = a E(X).
56 / 100
Result
If X is a continuous or discrete random variable, then

E[ g1 (X) + · · · + gn (X) ] = E[ g1 (X) ] + · · · + E[ gn (X) ].

57 / 100
2.8 Standard Deviation and Variance

58 / 100
The standard deviation of a random variable is a measure
of its spread. It is closely tied to the variance of a random
variable.

Definition
If we let µ = E(X), then the second moment of X about µ,
E[ (X − µ)² ], is the variance of X, denoted by V ar(X).

Definition
The standard deviation of a random variable X is the
square root of its variance:

standard deviation of X = √V ar(X).

59 / 100
Standard deviations are more readily interpreted because
they are measured in the same units as the original variable
X.

So standard deviations are more commonly used as
measures of spread in applied statistics and in reporting
quantitative research results.

Variances are a bit easier to work with theoretically and are
commonly used in mathematical statistics (and this course).

60 / 100
Result

V ar(X) = E[ (X − µ)² ]
        = E(X²) − (E(X))²
        = E(X²) − µ².

61 / 100
Proof.

V ar(X) = E[ (X − µ)² ]             (by definition)
        = E( X² − 2 µ X + µ² )
        = E(X²) − 2 µ E(X) + E(µ²)
          (since expectation is a linear functional)
        = E(X²) − 2 µ² + µ²          (since E(X) = µ and E(µ²) = µ²)
        = E(X²) − µ².

62 / 100
Example
Assume the lifetime of a light bulb (in thousands of hours)
has probability density function fX (x) = e−x , x > 0.

Calculate Var(X).

63 / 100
Example

Solution:
We will use the formula V ar(X) = E(X²) − (E(X))².
Recall that we found that E(X) = 1 earlier. So,

E(X²) = ∫_0^{+∞} x² e−x dx

      = −[x² e−x]_0^{∞} − ∫_0^{+∞} (−2x e−x) dx
        (using the integration by parts formula)

      = 0 + 2 E(X) = 2

V ar(X) = E(X²) − (E(X))² = 2 − 1² = 1.

64 / 100
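The moments used above can be checked numerically; the sketch below is illustrative only, and the truncation of the integral at x = 50 is an assumption that makes the neglected tail negligible.

```python
import math

# Numerical E(X) and E(X^2) for f_X(x) = exp(-x) on (0, infinity),
# truncated at 50 where the density is essentially zero.
n = 200_000
h = 50 / n
xs = [(k + 0.5) * h for k in range(n)]
EX  = sum(x * math.exp(-x) for x in xs) * h
EX2 = sum(x * x * math.exp(-x) for x in xs) * h

print(round(EX, 4), round(EX2, 4), round(EX2 - EX ** 2, 4))   # ~1, ~2, ~1
```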
Example
Consider two random variables A and B whose probability
mass functions are given by

     x           1     2     3     4     5
A:   P (A = x)  0.15  0.25  0.20  0.25  0.15

     x           1     2     3     4     5
B:   P (B = x)  0.10  0.10  0.60  0.10  0.10

Which of A and B is more variable?

65 / 100
Example

Solution:
We are asked to find which random variable has the larger
variance.

E(A) = 1 × 0.15 + 2 × 0.25 + 3 × 0.20 + 4 × 0.25 + 5 × 0.15 = 3

E(A²) = 1² × 0.15 + 2² × 0.25 + 3² × 0.20
        + 4² × 0.25 + 5² × 0.15 = 10.7

V ar(A) = E(A²) − (E(A))² = 10.7 − 3² = 1.7

66 / 100
Example

Solution - continued:

E(B) = 1 × 0.10 + 2 × 0.10 + 3 × 0.60 + 4 × 0.10 + 5 × 0.10 = 3

E(B²) = 1² × 0.10 + 2² × 0.10 + 3² × 0.60
        + 4² × 0.10 + 5² × 0.10 = 10.0

V ar(B) = E(B²) − (E(B))² = 10.0 − 3² = 1

We see that V ar(A) > V ar(B), so A is more variable.

67 / 100
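The same arithmetic can be packaged as a small helper working directly from the pmf tables (an illustrative sketch only):

```python
pmf_A = {1: 0.15, 2: 0.25, 3: 0.20, 4: 0.25, 5: 0.15}
pmf_B = {1: 0.10, 2: 0.10, 3: 0.60, 4: 0.10, 5: 0.10}

def mean_var(pmf):
    """Return (E(X), Var(X)) for a finite pmf given as {value: probability}."""
    m = sum(x * p for x, p in pmf.items())
    m2 = sum(x * x * p for x, p in pmf.items())
    return m, m2 - m ** 2

print(mean_var(pmf_A))   # (3.0, 1.7) up to floating-point rounding
print(mean_var(pmf_B))   # (3.0, 1.0) up to floating-point rounding
```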
Results
Let a be a constant. We have

V ar(X + a) = V ar(X)

V ar(aX) = a² V ar(X).

68 / 100
Proof.

V ar(X + a) = E[ ( (X + a) − E(X + a) )² ]   (definition)
            = E[ ( (X + a) − E(X) − a )² ]
            = E[ ( X − E(X) )² ]
            = V ar(X)

V ar(aX) = E[ ( (aX) − E(aX) )² ]   (definition)
         = E[ ( (aX) − a E(X) )² ]
         = a² E[ ( X − E(X) )² ]
         = a² V ar(X)
69 / 100
2.9 Moment Generating Functions

70 / 100
Definition
The moment generating function (mgf) of a random
variable X is
mX (u) = E(e^(uX)).

The name moment generating function comes from the following
result concerning the rth moment of X about zero, E(X^r):

Result
In general,

E(X^r) = mX^(r)(0)   for r = 0, 1, 2, . . . ,

where mX^(r)(0) denotes the moment generating function
differentiated r times with respect to u and evaluated
at u = 0.

71 / 100
Proof.
First, we observe that

mX (u) = E( e^(uX) )    (by definition)
       = E( 1 + uX/1! + (uX)²/2! + · · · )
         (using the exponential series definition)
       = 1 + E(X) · u/1! + E(X²) · u²/2! + E(X³) · u³/3! + · · · .

Thus, mX (0) = 1 = E(X⁰). Next,

mX′ (u) = E(X) + E(X²) · 2u/2! + E(X³) · 3u²/3! + · · ·
  =⇒  mX′ (0) = E(X)

mX^(2)(u) = E(X²) · 2/2! + E(X³) · 3 · 2u/3! + · · ·
  =⇒  mX^(2)(0) = E(X²),

and so on.
72 / 100
Example
Suppose the random variable X has the following
moment-generating function
          | (1 − u)⁻¹   if u < 1
mX (u) =  |
          | +∞          if u ≥ 1.

Find an expression for the rth moment of X.

73 / 100
Example

Solution:
We see that (1 − u)⁻¹ = 1/(1 − u) = 1 + u + u² + · · · when |u| < 1.
(Recall that this is a geometric series.) So

mX (u) = E(e^(uX)) = 1 + mX′ (0) u/1! + mX′′ (0) u²/2! + · · · + mX^(r)(0) u^r/r! + · · · .

Comparing the two series, we see that mX^(r)(0) u^r/r! = u^r. Therefore,
mX^(r)(0)/r! = 1 and mX^(r)(0) = r!.

Thus the rth moment of X is E(X^r) = mX^(r)(0) = r!

74 / 100
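A symbolic check of this example (assuming the sympy package is available; this is not part of the slides): differentiating (1 − u)⁻¹ r times and evaluating at u = 0 returns r!.

```python
import sympy as sp

u = sp.symbols("u")
m = 1 / (1 - u)          # the mgf from the example, valid for u < 1

for r in range(5):
    # r-th derivative of the mgf at u = 0 should equal r!
    print(r, sp.diff(m, u, r).subs(u, 0), sp.factorial(r))
```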
Example
Suppose the random variable X has the following
probability mass function.
P (X = x) = e−λ λ^x / x!,    x = 0, 1, 2, . . . ;  λ > 0.
Find the moment-generating function of X.
Furthermore, find E(X) and V ar(X).

75 / 100
Example

Solution:
By the definition of moment-generating functions, we have

mX (u) = E[e^(uX)] = Σ_{x=0}^{∞} e^(ux) e−λ λ^x / x! = e−λ Σ_{x=0}^{∞} (e^u λ)^x / x!

       = e−λ e^(λ e^u)    (by the Taylor series expansion of the exponential)

       = e^(λ (e^u − 1)),    λ > 0.

We have

E(X) = mX′ (0) = [ d/du e^(λ (e^u − 1)) ]_{u=0} = [ (λ e^u) e^(λ (e^u − 1)) ]_{u=0} = λ.

Similarly, we see that E(X²) = λ + λ². Hence,

V ar(X) = E(X²) − (E(X))² = λ + λ² − λ² = λ.


76 / 100
Example
Suppose the random variable X has a probability mass
function

P (X = x) = (n choose x) p^x (1 − p)^(n−x),    x = 0, 1, 2, . . . , n;  0 < p < 1.

The Binomial Theorem states that

(a + b)^n = Σ_{x=0}^{n} (n choose x) a^x b^(n−x).     (4)

Use the Binomial Theorem to

1. show that P (X = x) is a probability mass function;
2. find the moment-generating function of X;
3. find E(X) and V ar(X) using the moment-generating
   function.

77 / 100
Example

Solution
1. To show that P (X = x) is a probability mass function, we need
to show that Σ_{x=0}^{n} P (X = x) = 1. That is,

Σ_{x=0}^{n} P (X = x) = Σ_{x=0}^{n} (n choose x) p^x (1 − p)^(n−x)
                      = (p + (1 − p))^n    (by the binomial theorem
                                            with a = p and b = 1 − p)
                      = 1.

Indeed, P (X = x) is a probability mass function.

78 / 100
Example

Solution:
➋ The moment generating function is given by

mX (u) = Σ_{x=0}^{n} e^(ux) (n choose x) p^x (1 − p)^(n−x)

       = Σ_{x=0}^{n} (n choose x) (p e^u)^x (1 − p)^(n−x)

       = (p e^u + 1 − p)^n    (by the binomial theorem
                               with a = p e^u and b = 1 − p).

79 / 100
Example

Solution
➌ If we differentiate the moment generating function with respect
to u, we get

d mX (u) / du = n (p e^u + 1 − p)^(n−1) p e^u
              = n p e^u (p e^u + 1 − p)^(n−1).

Evaluating this at u = 0 gives

E(X) = n p (p + 1 − p)^(n−1) = n p.

80 / 100
Example

Solution - continued:
➌ To find the second moment, we use the product rule

d(yz)/du = y dz/du + z dy/du

to get

d² mX (u) / du² = n p e^u { (n − 1) (p e^u + 1 − p)^(n−2) p e^u }
                  + (p e^u + 1 − p)^(n−1) { n p e^u }
                  (using y = n p e^u and z = (p e^u + 1 − p)^(n−1))

                = n p e^u (p e^u + 1 − p)^(n−2) { n p e^u + 1 − p }.

Evaluating this at u = 0 gives

E(X²) = n p (p + 1 − p)^(n−2) { n p + 1 − p }
      = n p { n p + 1 − p }.
81 / 100
Example

Solution - continued:
➌ From this, we see that

V ar(X) = E(X²) − (E(X))²
        = n p { n p + 1 − p } − (n p)²
        = n p (1 − p).

82 / 100
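As a numerical sanity check on E(X) = np and Var(X) = np(1 − p) (the values n = 10 and p = 0.3 below are assumptions for illustration), the moments can be computed directly from the binomial pmf:

```python
from math import comb

n, p = 10, 0.3
pmf = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

EX  = sum(x * q for x, q in pmf.items())
EX2 = sum(x * x * q for x, q in pmf.items())

print(round(EX, 6), n * p)                        # both 3.0
print(round(EX2 - EX ** 2, 6), n * p * (1 - p))   # both 2.1
```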
Example
Suppose the random variable X has a probability density
function
fX (x) = e−x , x > 0.
Find the moment-generating function of X.

83 / 100
Example

Solution:
Using the definition of the moment-generating function, we
have

mX (u) = E[ e^(uX) ]
       = ∫_0^∞ e^(ux) e−x dx = ∫_0^∞ e^(−x(1−u)) dx
       = [ −e^(−x(1−u)) / (1 − u) ]_0^∞
       = 1 / (1 − u),    u < 1.
83 / 100
2.10 Properties of Moment Generating
Functions

84 / 100
The following results on uniqueness and convergence for moment
generating functions will be particularly important later.

Result
Let X and Y be two random variables whose moments
exist. If
mX (u) = mY (u)
for all u in a neighbourhood of zero (i.e. for all |u| < ϵ for
some ϵ > 0), then

FX (x) = FY (x) for all x ∈ R.

This result tells us that the moment generating function of
a random variable is unique.
The proof of this result can be found in Casella, G. and Berger, R.L.
(1990) Statistical Inference, Duxbury.
85 / 100
Result
Let { Xn : n = 1, 2, 3, . . . } be a sequence of random
variables, each with moment generating function mXn (u).
Furthermore, suppose that

lim_{n→∞} mXn (u) = mX (u)   for all u in a neighbourhood of zero

and mX (u) is the moment generating function of a random
variable X. Then

lim_{n→∞} FXn (x) = FX (x)   for all x ∈ R.

The convergence of moment-generating functions implies
the convergence of cumulative distribution functions.

See the proof of this result in Casella, G. and Berger, R.L. (1990).
Statistical Inference, Duxbury.

86 / 100
The proofs of the last two results rely on the theory of
Laplace transforms.

However, this is not covered in Casella and Berger (1990).

The reader should consult

Widder, D.V. (1946) The Laplace Transform. Princeton,


New Jersey: Princeton University Press.

87 / 100
2.11 Chebychev’s Inequality

88 / 100
Chebychev’s inequality is a fundamental result concerning tail
probabilities of general random variables.

It is useful for the derivation of convergence results discussed
later.

Chebychev’s Inequality
If X is any random variable with E(X) = µ and
V ar(X) = σ², then

P ( |X − µ| > k σ ) ≤ 1/k².

The probability statement in Chebychev’s Inequality is often
stated verbally as
the probability that X is more than k standard
deviations from its mean is bounded by 1/k².

89 / 100
Note that Chebychev’s Inequality makes no assumptions
about the distribution of X.

This is a particularly handy result.

In practice, we usually do not know the distribution of X.

By using Chebychev’s Inequality, we can make specific
probabilistic statements about a random variable based
only on its mean and standard deviation.

90 / 100
Proof.
We will provide a proof for the continuous random variable
case only.

σ² = V ar(X) = ∫_{−∞}^{∞} (x − µ)² fX (x) dx

   ≥ ∫_{|x−µ| > kσ} (x − µ)² fX (x) dx

   ≥ ∫_{|x−µ| > kσ} (k σ)² fX (x) dx

since |x − µ| > k σ  =⇒  (x − µ)² fX (x) > (k σ)² fX (x).

91 / 100
Proof.
Therefore, we have

σ² ≥ k² σ² ∫_{|x−µ| > kσ} fX (x) dx

   = k² σ² P ( |X − µ| > k σ ).

By re-arranging, we get

P ( |X − µ| > k σ ) ≤ 1/k²,

the desired result.

92 / 100
Example
The number of items produced by a factory in one day has
mean 500 and variance 100.

What is the lower bound for the probability that
between 400 and 600 items will be produced
tomorrow?

93 / 100
Example

Solution:
Let X be the number of items produced tomorrow. We are given
E(X) = µ = 500 and σ² = V ar(X) = 100, so σ = 10.

Note that 400 = 500 − 100 = µ − 10σ, and 600 = 500 + 100 = µ + 10σ.
Thus,

P (400 ≤ X ≤ 600) = P (400 − 500 ≤ X − 500 ≤ 600 − 500)
                  = P (−100 ≤ X − 500 ≤ 100)
                  = P (|X − 500| ≤ 100)
                  = P (|X − µ| ≤ 10 σ)
                  = 1 − P (|X − µ| > 10 σ).

94 / 100
Example

Solution - continued:
Chebychev’s Inequality states that P (|X − µ| > 10 σ) ≤ 1/10².
Therefore,

P (400 ≤ X ≤ 600) = 1 − P (|X − µ| > 10 σ)
                  ≥ 1 − 1/10² = 0.99.

95 / 100
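To see how conservative Chebychev's bound can be, the following simulation sketch (illustrative only; the exponential distribution and sample size are assumptions, not from the slides) compares the empirical tail probability P(|X − µ| > kσ) with 1/k² for an exponential(1) random variable, whose mean and standard deviation are both 1.

```python
import random

random.seed(0)
xs = [random.expovariate(1.0) for _ in range(200_000)]   # exponential(1) sample
mu, sigma = 1.0, 1.0                                     # exact mean and sd of exponential(1)

for k in (1.5, 2, 3):
    tail = sum(abs(x - mu) > k * sigma for x in xs) / len(xs)
    print(k, round(tail, 4), round(1 / k ** 2, 4))       # empirical tail vs Chebychev bound
```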
Supplementary Material

96 / 100
Supplementary Material - Integration by Parts

Integration by parts formula

∫ u dv = uv − ∫ v du.

For our example, we choose u = x and dv = e−x dx.
So du = dx and v = −e−x.

97 / 100
Supplementary Material - Integration by Parts

Integration by parts formula

∫ u dv = uv − ∫ v du.

For our example, we choose u = x² and dv = e−x dx.
So du = 2x dx and v = −e−x.

98 / 100
Supplementary Material - Exponential Series

Recall the exponential series is given by

e^x = 1 + x/1! + x²/2! + x³/3! + · · ·

99 / 100
Supplementary Material - Binomial Theorem

The Binomial Theorem states that

(a + b)^n = Σ_{x=0}^{n} (n choose x) a^x b^(n−x).     (5)

100 / 100
