NSOWAH-NUAMAH

ADVANCED TOPICS IN INTRODUCTORY PROBABILITY
A FIRST COURSE IN PROBABILITY THEORY – VOLUME III

Advanced Topics in Introductory Probability: A First Course in Probability Theory – Volume III
2nd edition
© 2018 Nicholas N.N. Nsowah-Nuamah & bookboon.com
ISBN 978-87-403-2238-5
CONTENTS

Part 1 Bivariate Probability Distributions
Bibliography
PART 1
BIVARIATE PROBABILITY DISTRIBUTIONS
I salute the discovery of a single even insignificant truth more highly than
all the argumentation on the highest questions which fails to reach a truth
GALILEO (1564–1642)
Chapter 1

DENSITY AND DISTRIBUTION FUNCTIONS OF BIVARIATE DISTRIBUTIONS
1.1 INTRODUCTION
There are cases where one variable is discrete and the other continuous
but this will not be considered here.
Suppose that X and Y are discrete random variables, where X takes the values xi, i = 1, 2, ..., n, and Y takes the values yj, j = 1, 2, ..., m. Most often, such a joint distribution is given in table form. Table 1.1 is an n-by-m array which displays the number of occurrences of the various combinations of values of X and Y. We may observe that each row represents values of X and each column represents values of Y. The row and column totals are called marginal totals. Such a table is called the joint frequency distribution.
Table 1.1

X \ Y              y1             y2          ···          ym           Row Totals
x1             #(x1, y1)      #(x1, y2)       ···      #(x1, ym)      ∑_j #(x1, yj)
x2             #(x2, y1)      #(x2, y2)       ···      #(x2, ym)      ∑_j #(x2, yj)
 .                 .              .                         .               .
xn             #(xn, y1)      #(xn, y2)       ···      #(xn, ym)      ∑_j #(xn, yj)
Column       ∑_i #(xi, y1)  ∑_i #(xi, y2)     ···    ∑_i #(xi, ym)   ∑_i ∑_j #(xi, yj) = N
Totals
For example, suppose X and Y are discrete random variables, and X takes values 0, 1, 2, 3, and Y takes values 1, 2, 3. Each of the nm row-column intersections in Table 1.2 represents the frequency that belongs to the ordered pair (X, Y).

Table 1.2

X \ Y            1     2     3    Row Totals
0                1     0     0        1
1                0     2     1        3
2                0     2     1        3
3                1     0     0        1
Column Totals    2     4     2        8
p(xi, yj) = P({X = xi} ∩ {Y = yj})

The function p(xi, yj) is sometimes referred to as the joint probability mass function (p.m.f.) or the joint probability function (p.f.) of X and Y. This function gives the probability that X will assume a particular value x while at the same time Y assumes a particular value y.

Note
(a) The notation p(x, y) for all (x, y) is the same as writing p(xi, yj) for i = 1, 2, ..., n and j = 1, 2, ..., m. Sometimes, when there is no ambiguity, we shall simply use p(x, y).
Definition 1.5
If X and Y are discrete random variables with joint probability mass function p(xi, yj), then
(i) p(xi, yj) ≥ 0 for all i, j;
(ii) ∑_{i=1}^{n} ∑_{j=1}^{m} p(xi, yj) = 1
Once the joint probability mass function is determined for discrete random
variables X and Y , calculation of joint probabilities involving X and Y is
straightforward.
Let the value that the random variables X and Y jointly take be denoted by the ordered pair (xi, yj). The joint probability p(xi, yj) is obtained by counting the number of occurrences of that combination of values of X and Y and dividing the count by the total number of all the sample points. Thus,

P({X = xi} ∩ {Y = yj}) = #({X = xi} ∩ {Y = yj}) / ∑_{i=1}^{n} ∑_{j=1}^{m} #({X = xi} ∩ {Y = yj})
                       = #(xi, yj) / ∑_{i=1}^{n} ∑_{j=1}^{m} #(xi, yj)

where
#(xi, yj) is the number of occurrences in the cell of the ordered pair (xi, yj);
∑_{i=1}^{n} ∑_{j=1}^{m} #(xi, yj) is the total number of all sample points (cells) of the ordered pairs (xi, yj), denoted by N.
Table 1.3

X \ Y              y1            y2         ···         ym          Row Totals
x1             p(x1, y1)     p(x1, y2)      ···      p(x1, ym)        p(x1)
x2             p(x2, y1)     p(x2, y2)      ···      p(x2, ym)        p(x2)
 .                 .             .                        .              .
xn             p(xn, y1)     p(xn, y2)      ···      p(xn, ym)        p(xn)
Column           p(y1)          p(y2)       ···        p(ym)      ∑_i ∑_j p(xi, yj) = 1
Totals

Note
The marginal probabilities for X are simply the probabilities that X = xi summed over the values of yj, where j assumes a value from 1 to m. Similarly, the marginal probabilities for Y are the probabilities that Y = yj summed over the values of xi, where i assumes a value from 1 to n.
Example 1.1
(a) For the data in Table 1.2, calculate the joint probabilities of X and Y.
(b) Does this distribution satisfy the properties of a joint probability function?

Solution
(a) From Table 1.2, the cell ({X = 0} ∩ {Y = 1}) = (0, 1) contains one element, and the total number of elements in all cells is 8. Hence

P({X = 0} ∩ {Y = 1}) = p(0, 1) = #({X = 0} ∩ {Y = 1}) / ∑_{i} ∑_{j} #({X = xi} ∩ {Y = yj}) = 1/8

Similarly,

P({X = 0} ∩ {Y = 2}) = p(0, 2) = 0/8 = 0
P({X = 0} ∩ {Y = 3}) = p(0, 3) = 0/8 = 0
P({X = 1} ∩ {Y = 1}) = p(1, 1) = 0/8 = 0
P({X = 1} ∩ {Y = 2}) = p(1, 2) = 2/8 = 1/4
P({X = 1} ∩ {Y = 3}) = p(1, 3) = 1/8
P({X = 2} ∩ {Y = 1}) = p(2, 1) = 0/8 = 0
P({X = 2} ∩ {Y = 2}) = p(2, 2) = 2/8 = 1/4
P({X = 2} ∩ {Y = 3}) = p(2, 3) = 1/8
P({X = 3} ∩ {Y = 1}) = p(3, 1) = 1/8
P({X = 3} ∩ {Y = 2}) = p(3, 2) = 0/8 = 0
P({X = 3} ∩ {Y = 3}) = p(3, 3) = 0/8 = 0

(b) (i) p(xi, yj) ≥ 0 for all i = 0, 1, 2, 3; j = 1, 2, 3.
    (ii) ∑_{i=0}^{3} ∑_{j=1}^{3} p(xi, yj) = 1/8 + 1/8 + 2/8 + 2/8 + 1/8 + 1/8 = 1

Hence this distribution is a joint probability function.
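To make the counting procedure concrete, here is a small illustrative sketch (ours, not part of the original text; Python and the variable names are our own) that rebuilds the joint probabilities of Example 1.1 from the cell counts of Table 1.2 and checks the two properties of Definition 1.5.

from fractions import Fraction

# Cell counts of Table 1.2: counts[x][y] for x = 0..3, y = 1..3
counts = {0: {1: 1, 2: 0, 3: 0},
          1: {1: 0, 2: 2, 3: 1},
          2: {1: 0, 2: 2, 3: 1},
          3: {1: 1, 2: 0, 3: 0}}

N = sum(c for row in counts.values() for c in row.values())   # total number of sample points (8)
p = {(x, y): Fraction(c, N) for x, row in counts.items() for y, c in row.items()}

print(p[(0, 1)])                        # 1/8, as in the example
print(all(v >= 0 for v in p.values()))  # property (i)
print(sum(p.values()) == 1)             # property (ii)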
Example 1.2
The joint probability function of the discrete random variables X and Y is given by

p(x, y) = k(3x + 2y),   x = 0, 1;  y = 0, 1, 2

where k is a constant.
(a) Find the value of k.
(b) Find the joint probabilities and present them in a table.

Solution
(a) For p(x, y) to be a joint probability function we must have

∑_{x=0}^{1} ∑_{y=0}^{2} k(3x + 2y) = 21k = 1

from which

k = 1/21

(b) For the sample point {X = 0, Y = 0} = (0, 0),

p(0, 0) = (1/21)[3(0) + 2(0)] = 0

Similarly,

p(0, 1) = (1/21)[3(0) + 2(1)] = 2/21
p(0, 2) = (1/21)[3(0) + 2(2)] = 4/21
p(1, 0) = (1/21)[3(1) + 2(0)] = 3/21
p(1, 1) = (1/21)[3(1) + 2(1)] = 5/21
p(1, 2) = (1/21)[3(1) + 2(2)] = 7/21

These results are presented in the following table. Recollect that the row and column totals are the marginal probabilities.

X \ Y            0       1       2      Row Totals
0                0      2/21    4/21      6/21
1               3/21    5/21    7/21     15/21
Column Totals   3/21    7/21   11/21       1
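As a quick check of Example 1.2 (an illustrative sketch of our own, not from the book), the normalising constant and the marginal totals can be recomputed exactly with rational arithmetic:

from fractions import Fraction

total = sum(3*x + 2*y for x in (0, 1) for y in (0, 1, 2))   # 21, so k = 1/21
k = Fraction(1, total)
p = {(x, y): k*(3*x + 2*y) for x in (0, 1) for y in (0, 1, 2)}

row_totals = {x: sum(p[(x, y)] for y in (0, 1, 2)) for x in (0, 1)}   # 6/21, 15/21
col_totals = {y: sum(p[(x, y)] for x in (0, 1)) for y in (0, 1, 2)}   # 3/21, 7/21, 11/21
print(k, row_totals, col_totals, sum(p.values()))                     # the grand total is 1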
Fig. 1.2 depicts the case of the continuous bivariate random variables (X, Y) which assume all the values in the rectangle x1 ≤ X ≤ x2 and y1 ≤ Y ≤ y2.
{x1 ≤ X ≤ x2 } ∩ {y1 ≤ Y ≤ y2 }
P ({x1 ≤ X ≤ x2 } ∩ {y1 ≤ Y ≤ y2 })
This probability can be found by subtracting from the probability that the event will fall in the (semi-infinite) rectangle having the upper-right corner (x2, y2) the probabilities that it will fall in the semi-infinite rectangles having the upper-right corners (x1, y2) and (x2, y1) respectively, and then adding back the probability that it will fall in the semi-infinite rectangle with the upper-right corner at (x1, y1). That is,
P (x1 ≤ X ≤ x2 , y1 ≤ Y ≤ y2 ) = P (X ≤ x2 , Y ≤ y2 ) − P (X ≤ x2 , Y ≤ y1 )
−P (X ≤ x1 , Y ≤ y2 ) + P (X ≤ x1 , Y ≤ y1 )
Definition 1.7
Let (X, Y) be a continuous bivariate random variable assuming all values in the region R. The joint probability density function f is a function satisfying the following properties:
(1) f(x, y) ≥ 0 for all (x, y) in R;
(2) ∫∫_R f(x, y) dx dy = 1
Property 2 states that the total volume bounded by the surface given by
equation z = f (x, y) and the region R on the xy-plane is equal to 1.
Example 1.3
Given the following function of a two-dimensional continuous random variable (X, Y):

f(x, y) = { x² + xy/k,   0 ≤ x ≤ 1,  0 ≤ y ≤ 2
          { 0,           elsewhere

where k is a constant.
(a) Find the value of k > 0 such that f(x, y) is a probability density function.
(b) Find P(0 < X < 1, 1 < Y < 2).

Solution
(a) For f(x, y) to be a p.d.f., it should satisfy the two conditions of Theorem 1.2. Obviously,

f(x, y) ≥ 0

since x ≥ 0, y ≥ 0, and k > 0. Also,

∫_{−∞}^{+∞} ∫_{−∞}^{+∞} f(x, y) dx dy = 1

Now,

∫_{y=0}^{2} ∫_{x=0}^{1} (x² + xy/k) dx dy = ∫_{y=0}^{2} [x³/3 + x²y/(2k)]_{x=0}^{1} dy
                                          = ∫_{y=0}^{2} (1/3 + y/(2k)) dy
                                          = [y/3 + y²/(4k)]_{y=0}^{2}
                                          = 2/3 + 1/k = 1

giving k = 3.

(b) Using the value of k found in (a), we have

P(0 < X < 1, 1 < Y < 2) = ∫_{y=1}^{2} ∫_{x=0}^{1} (x² + xy/3) dx dy
                        = ∫_{y=1}^{2} [x³/3 + x²y/6]_{x=0}^{1} dy
                        = ∫_{y=1}^{2} (1/3 + y/6) dy
                        = [y/3 + y²/12]_{y=1}^{2}
                        = 7/12
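The value of k and the probability in (b) can also be confirmed numerically. The sketch below is ours and uses a plain midpoint Riemann sum, so no integration library is assumed:

def double_riemann(f, x0, x1, y0, y1, n=400):
    # Midpoint rule on an n-by-n grid over the rectangle [x0, x1] x [y0, y1]
    hx, hy = (x1 - x0)/n, (y1 - y0)/n
    s = 0.0
    for i in range(n):
        x = x0 + (i + 0.5)*hx
        for j in range(n):
            y = y0 + (j + 0.5)*hy
            s += f(x, y)
    return s*hx*hy

k = 3
f = lambda x, y: x**2 + x*y/k
print(double_riemann(f, 0, 1, 0, 2))   # close to 1, confirming k = 3
print(double_riemann(f, 0, 1, 1, 2))   # close to 7/12 = 0.5833...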
The joint cumulative distribution function of random variables X and Y gives the probability that X takes on a value less than or equal to xi, i = 1, 2, ..., n, and that Y takes on a value less than or equal to yj, j = 1, 2, ..., m.
Example 1.4
Refer to the table of Example 1.1. Calculate
(a) the joint probability P(2 ≤ X ≤ 3, 1 ≤ Y ≤ 2);
(b) the joint cumulative probability P(X ≤ 1, Y ≤ 2).

Solution
(a) The joint probability P(2 ≤ X ≤ 3, 1 ≤ Y ≤ 2) is obtained as follows:

P(2 ≤ X ≤ 3, 1 ≤ Y ≤ 2) = p(2, 1) + p(2, 2) + p(3, 1) + p(3, 2)
                        = 0 + 2/8 + 1/8 + 0 = 3/8

(b) The joint cumulative probability P(X ≤ 1, Y ≤ 2) is as follows:

P(X ≤ 1, Y ≤ 2) = F(1, 2)
                = p(0, 1) + p(0, 2) + p(1, 1) + p(1, 2)
                = 1/8 + 0 + 0 + 2/8 = 3/8
Example 1.5
Refer to Example 1.2. Calculate
(a) the joint probability P(0 ≤ X ≤ 1, 1 ≤ Y ≤ 2);
(b) the joint cumulative distribution function P(X ≤ 1, Y ≤ 1);
(c) the joint cumulative distribution function P(X ≤ 1, Y ≤ 2).

Solution
(a) P(0 ≤ X ≤ 1, 1 ≤ Y ≤ 2) = ∑_{x=0}^{1} ∑_{y=1}^{2} (1/21)(3x + 2y)
                            = (1/21) ∑_{x=0}^{1} [(3x + 2) + (3x + 4)]
                            = (1/21) ∑_{x=0}^{1} (6x + 6)
                            = (1/21)[(0 + 6) + (6 + 6)] = 18/21 = 6/7

(b) P(X ≤ 1, Y ≤ 1) = F(1, 1)
                    = ∑_{x=0}^{1} ∑_{y=0}^{1} (1/21)(3x + 2y)
                    = (1/21) ∑_{x=0}^{1} [(3x + 0) + (3x + 2)]
                    = (1/21) ∑_{x=0}^{1} (6x + 2)
                    = (1/21)[(0 + 2) + (6 + 2)] = 10/21
The reader is asked in Exercise 1.5 to solve part (c) of this example.
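Joint and cumulative probabilities of discrete bivariate distributions are simple sums over the relevant cells. The following sketch (ours; the helper name prob is our own) reproduces the values 3/8, 3/8 from Example 1.4 and 6/7, 10/21 from Example 1.5:

from fractions import Fraction

p_ex11 = {(0,1): Fraction(1,8), (1,2): Fraction(2,8), (1,3): Fraction(1,8),
          (2,2): Fraction(2,8), (2,3): Fraction(1,8), (3,1): Fraction(1,8)}
p_ex12 = {(x, y): Fraction(3*x + 2*y, 21) for x in (0, 1) for y in (0, 1, 2)}

def prob(p, xcond, ycond):
    # Sum p(x, y) over all pairs (x, y) satisfying both conditions
    return sum(v for (x, y), v in p.items() if xcond(x) and ycond(y))

print(prob(p_ex11, lambda x: 2 <= x <= 3, lambda y: 1 <= y <= 2))   # 3/8
print(prob(p_ex11, lambda x: x <= 1, lambda y: y <= 2))             # 3/8
print(prob(p_ex12, lambda x: 0 <= x <= 1, lambda y: 1 <= y <= 2))   # 6/7
print(prob(p_ex12, lambda x: x <= 1, lambda y: y <= 1))             # 10/21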
The joint distribution function gives the probability that the point (X, Y) belongs to a semi-infinite rectangle in the plane, as shown in Fig. 1.3:

F(x, y) = P({X ≤ x} ∩ {Y ≤ y}) = ∫_{−∞}^{y} ∫_{−∞}^{x} f(s, t) ds dt

where f(s, t) is the value of the joint probability density of X and Y at (s, t).
Example 1.6
Refer to Example 1.3. Calculate P(X ≤ 1, Y < 1).

Solution

P(X ≤ 1, Y < 1) = ∫_{y=0}^{1} ∫_{x=0}^{1} (x² + xy/3) dx dy
                = ∫_{y=0}^{1} [x³/3 + x²y/6]_{x=0}^{1} dy
                = ∫_{y=0}^{1} (1/3 + y/6) dy
                = [y/3 + y²/12]_{y=0}^{1}
                = 5/12
Theorem 1.1
If F is the cumulative distribution function of a two-dimensional random variable with joint probability density function f(x, y), then

∂²F(x, y)/∂x∂y = f(x, y)

wherever F is differentiable.
Example 1.7
Let

F(x, y) = (1 − e^{−x})(1 − e^{−y}),   x ≥ 0, y ≥ 0

Find the joint probability density function f(x, y).

Solution

∂F(x, y)/∂x = e^{−x}(1 − e^{−y})

∂²F(x, y)/∂x∂y = e^{−x} e^{−y} = e^{−(x+y)},   x ≥ 0, y ≥ 0

Hence

f(x, y) = e^{−(x+y)},   x ≥ 0, y ≥ 0

Note
∂²F(x, y)/∂x∂y = ∂²F(x, y)/∂y∂x
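Theorem 1.1 and Example 1.7 can be verified with a computer algebra system. The sketch below is ours and assumes SymPy is available; it recovers f(x, y) = e^{−(x+y)} and confirms that the two orders of differentiation agree:

import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
F = (1 - sp.exp(-x))*(1 - sp.exp(-y))

f_xy = sp.simplify(sp.diff(F, x, y))   # differentiate with respect to x, then y
f_yx = sp.simplify(sp.diff(F, y, x))   # reverse order
print(f_xy)                            # prints exp(-x - y), i.e. e^{-(x+y)}
print(sp.simplify(f_xy - f_yx) == 0)   # the mixed partials agree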
The joint c.d.f. of a bivariate random variable has properties which are
analogous to those of the univariate random variable.
Property 1
The function F (x, y) is a probability, hence
0 ≤ F (x, y) ≤ 1
Property 2
The bivariate distribution function F(x, y) is monotonic increasing, in a wider sense, for both variables; that is,

if x1 ≤ x2, then F(x1, y) ≤ F(x2, y), for y fixed;
if y1 ≤ y2, then F(x, y1) ≤ F(x, y2), for x fixed.

Property 3
The following relations are also true:
(a) F(−∞, y) = 0
(b) F(x, −∞) = 0
(c) F(+∞, +∞) = 1
Property 4
At points of continuity of f(x, y),

∂²F(x, y)/∂x∂y = f(x, y)
The row totals of Table 1.3 provide us with the probability distribution of X. Similarly, the column totals provide the probability distribution of Y. These are typically called marginal probability mass functions because they are found on the margins of tables. Thus,

g(xi) = P(X = xi) = ∑_{j=1}^{m} p(xi, yj),   i = 1, 2, ..., n

and

h(yj) = P(Y = yj) = ∑_{i=1}^{n} p(xi, yj),   j = 1, 2, ..., m
Example 1.8
For the data in Table 1.2, find the marginal probability distribution for (a) X and (b) Y.

Solution
To calculate the marginal probabilities the joint probabilities are required. The joint probabilities for this problem have been calculated in Example 1.1.

(a) From the table obtained in Example 1.1 we shall calculate the marginal probabilities for each xi by fixing i and summing all the joint probabilities across j. Thus:

P(X = 0) = P(X = 0, Y = 1) + P(X = 0, Y = 2) + P(X = 0, Y = 3)
         = p(0, 1) + p(0, 2) + p(0, 3) = 1/8 + 0 + 0 = 1/8

P(X = 1) = P(X = 1, Y = 1) + P(X = 1, Y = 2) + P(X = 1, Y = 3)
         = p(1, 1) + p(1, 2) + p(1, 3) = 0 + 2/8 + 1/8 = 3/8

P(X = 2) = P(X = 2, Y = 1) + P(X = 2, Y = 2) + P(X = 2, Y = 3)
         = p(2, 1) + p(2, 2) + p(2, 3) = 0 + 2/8 + 1/8 = 3/8

P(X = 3) = P(X = 3, Y = 1) + P(X = 3, Y = 2) + P(X = 3, Y = 3)
         = p(3, 1) + p(3, 2) + p(3, 3) = 1/8 + 0 + 0 = 1/8

The results are summarised in the table below:

xi       0     1     2     3
g(xi)   1/8   3/8   3/8   1/8
(b) Similarly to (a), we fix j and sum all the joint probabilities across i. Hence,

P(Y = 1) = P(Y = 1, X = 0) + P(Y = 1, X = 1) + P(Y = 1, X = 2) + P(Y = 1, X = 3)
         = p(0, 1) + p(1, 1) + p(2, 1) + p(3, 1) = 1/8 + 0 + 0 + 1/8 = 2/8

P(Y = 2) = P(Y = 2, X = 0) + P(Y = 2, X = 1) + P(Y = 2, X = 2) + P(Y = 2, X = 3)
         = p(0, 2) + p(1, 2) + p(2, 2) + p(3, 2) = 0 + 2/8 + 2/8 + 0 = 4/8

P(Y = 3) = P(Y = 3, X = 0) + P(Y = 3, X = 1) + P(Y = 3, X = 2) + P(Y = 3, X = 3)
         = p(0, 3) + p(1, 3) + p(2, 3) + p(3, 3) = 0 + 1/8 + 1/8 + 0 = 2/8

The results are summarised in the table below:

yj       1     2     3
h(yj)   2/8   4/8   2/8
The results of Examples 1.1 and 1.8 (that is, the joint and marginal proba-
bilities) are usually presented in a single table such as in Table 1.4.
Note
(a) The marginal distributions of X and Y are the ordinary probability distribution functions of X and Y, but when derived from the joint distribution function the adjective "marginal" is added.
(b) In marginal distribution, the probability of different values of a random variable in a subset of random variables is determined without reference to any possible values of the other variables.
(c) From the table,

∑_{i=1}^{n} ∑_{j=1}^{m} p(xi, yj) = ∑_{i=1}^{n} g(xi) = ∑_{j=1}^{m} h(yj) = 1

Table 1.4

X \ Y            1      2      3     Row Totals
0               1/8     0      0        1/8
1                0     2/8    1/8       3/8
2                0     2/8    1/8       3/8
3               1/8     0      0        1/8
Column Totals   2/8    4/8    2/8        1
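The marginal totals of Table 1.4 can be recomputed mechanically by summing the joint probabilities across each index, mirroring Example 1.8. A minimal sketch (ours):

from fractions import Fraction

p = {(0,1): Fraction(1,8), (1,2): Fraction(2,8), (1,3): Fraction(1,8),
     (2,2): Fraction(2,8), (2,3): Fraction(1,8), (3,1): Fraction(1,8)}

xs, ys = (0, 1, 2, 3), (1, 2, 3)
g = {x: sum(p.get((x, y), 0) for y in ys) for x in xs}   # marginal p.m.f. of X
h = {y: sum(p.get((x, y), 0) for x in xs) for y in ys}   # marginal p.m.f. of Y
print(g)   # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
print(h)   # {1: 1/4, 2: 1/2, 3: 1/4}, i.e. 2/8, 4/8, 2/8 in lowest terms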
Example 1.9
Refer to Example 1.2. Find the marginal distributions of (a) X, (b) Y.

Solution
(a) g(x) = ∑_{y=0}^{2} p(x, y) = (1/21) ∑_{y=0}^{2} (3x + 2y) = (1/21)(9x + 6) = (1/7)(3x + 2),   x = 0, 1

(b) h(y) = ∑_{x=0}^{1} p(x, y) = (1/21) ∑_{x=0}^{1} (3x + 2y) = (1/21)(3 + 4y),   y = 0, 1, 2

Definition 1.12
Suppose f(x, y) is the joint probability density function of the continuous two-dimensional random variable (X, Y). We define g(x) and h(y), the marginal probability density functions of X and Y, respectively, by

g(x) = ∫_{−∞}^{∞} f(x, y) dy

and

h(y) = ∫_{−∞}^{∞} f(x, y) dx
Example 1.10
Refer to Example 1.3. Find the marginal probability distribution of X and Y.

Solution
Marginal probability distribution of X:

g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_{0}^{2} (x² + xy/3) dy,   0 < x < 1
     = [x²y + xy²/6]_{y=0}^{2}
     = 2x² + (2/3)x

That is,

g(x) = { 2x² + (2/3)x,   0 < x < 1
       { 0,              elsewhere
Marginal probability distribution of Y:

h(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_{0}^{1} (x² + xy/3) dx,   0 < y < 2
     = [x³/3 + x²y/6]_{x=0}^{1}
     = 1/3 + y/6

That is,

h(y) = { 1/3 + y/6,   0 < y < 2
       { 0,           elsewhere
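Each marginal density just obtained should integrate to 1 over its own range. The following check is an illustrative sketch of our own, using a crude midpoint rule:

def riemann(f, a, b, n=2000):
    # Midpoint rule for a one-dimensional integral
    h = (b - a)/n
    return sum(f(a + (i + 0.5)*h) for i in range(n))*h

g = lambda x: 2*x**2 + 2*x/3   # marginal density of X on (0, 1)
h = lambda y: 1/3 + y/6        # marginal density of Y on (0, 2)

print(riemann(g, 0, 1))        # close to 1
print(riemann(h, 0, 2))        # close to 1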
The marginal probability distribution function of X, denoted by FX(x), is

FX(x) = P(X ≤ x)

and the marginal probability distribution function of Y, denoted by FY(y), is

FY(y) = P(Y ≤ y)

Thus,

FX(x) = P(X ≤ x) = lim_{y→∞} F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{∞} f(u, y) dy du

Similarly,

FY(y) = P(Y ≤ y) = lim_{x→∞} F(x, y) = ∫_{−∞}^{y} ∫_{−∞}^{∞} f(x, v) dx dv

From this, it follows that the probability density function of X alone, known as the marginal density of X, is

g(x) = fX(x) = F′X(x) = ∫_{−∞}^{∞} f(x, y) dy

Similarly,

h(y) = fY(y) = F′Y(y) = ∫_{−∞}^{∞} f(x, y) dx
Note
The marginal probability density functions g(x) and h(y) can easily be de-
termined from the knowledge of the joint density function f (x, y). However,
the knowledge of the marginal probability density functions does not, in gen-
eral, uniquely determine the joint density function. The exception occurs
when the two random variables are independent.
Recall that for two events A and B,

P(A ∩ B) = P(A)P(B|A)

or

P(B|A) = P(A ∩ B)/P(A),   P(A) > 0
The probabilities g(xi ) and h(yj ) are those associated with the marginal
probability distributions for X and Y , respectively. The probability p(x|y)
is the probability that the random variable X takes a specific value x given
that Y takes on the value y written in full as P (X = x|Y = y). Thus,
P (X = 2|Y = 1) is the conditional probability that X = 2 given that Y =
1. A similar interpretation can be attached to the conditional probability
p(y|x).
Note
Conditional distribution is the opposite of marginal distribution, in which the probability of a value of a random variable is determined without reference to the possible values of the other variables.

P(X = x|Y = y) = P(X = x, Y = y)/P(Y = y),   P(Y = y) > 0
That is, given the joint probability distribution p(x, y) and marginal prob-
ability functions g(x) and h(y), respectively, the conditional discrete proba-
bility function of X given Y is
p(x|y) = p(x, y)/h(y),   h(y) > 0

and similarly, the conditional discrete probability function of Y given X is

p(y|x) = p(x, y)/g(x),   g(x) > 0

This definition shows that if we have the joint probability function of two random variables and desire the conditional distribution for one of them when the other is held fixed, it is merely necessary to divide the joint probability function by the probability function of the fixed variable.

Example 1.11
Refer to the table in Example 1.1. Find
(a) P(Y = 2|X = 1)
(b) P(X = 1|Y = 2)
(c) P(X = 1|Y = 1)
Solution
(a) P(Y = 2|X = 1) = P(X = 1, Y = 2)/P(X = 1) = p(1, 2)/g(1) = (2/8)/(3/8) = 2/3

(b) P(X = 1|Y = 2) = P(X = 1, Y = 2)/P(Y = 2) = p(1, 2)/h(2) = (2/8)/(4/8) = 1/2

(c) P(X = 1|Y = 1) = P(X = 1, Y = 1)/P(Y = 1) = p(1, 1)/h(1) = 0/(2/8) = 0
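A conditional probability mass function is simply a renormalised slice of the joint table. The sketch below (ours; the helper names are our own) reproduces the three values of Example 1.11:

from fractions import Fraction

p = {(0,1): Fraction(1,8), (1,2): Fraction(2,8), (1,3): Fraction(1,8),
     (2,2): Fraction(2,8), (2,3): Fraction(1,8), (3,1): Fraction(1,8)}

def cond_x_given_y(x, y):
    # p(x | y) = p(x, y) / h(y)
    h_y = sum(v for (xi, yi), v in p.items() if yi == y)
    return p.get((x, y), Fraction(0)) / h_y

def cond_y_given_x(y, x):
    # p(y | x) = p(x, y) / g(x)
    g_x = sum(v for (xi, yi), v in p.items() if xi == x)
    return p.get((x, y), Fraction(0)) / g_x

print(cond_y_given_x(2, 1))   # P(Y = 2 | X = 1) = 2/3
print(cond_x_given_y(1, 2))   # P(X = 1 | Y = 2) = 1/2
print(cond_x_given_y(1, 1))   # P(X = 1 | Y = 1) = 0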
Example 1.12
Refer to Example 1.2. Find the conditional probability function
(a) p(x|y), (b) p(y|x)
Solution
(a) p(x|y) = p(x, y)/h(y)

From Example 1.2,

p(x, y) = (1/21)(3x + 2y)

From Example 1.9,

h(y) = (1/21)(3 + 4y)

Hence

p(x|y) = [(1/21)(3x + 2y)] / [(1/21)(3 + 4y)] = (3x + 2y)/(3 + 4y)

(b) p(y|x) = p(x, y)/g(x)

From Example 1.9,

g(x) = (1/7)(3x + 2)

Hence

p(y|x) = [(1/21)(3x + 2y)] / [(1/7)(3x + 2)] = 7(3x + 2y)/[21(3x + 2)] = (3x + 2y)/[3(3x + 2)]
Definition 1.15
Let (X, Y) be a continuous two-dimensional random variable with joint probability density function f(x, y). Let g and h be the marginal probability density functions of X and Y respectively. The conditional probability of X for given Y = y is defined by

f(x|y) = f(x, y)/h(y),   h(y) > 0

and that of Y for given X = x by

f(y|x) = f(x, y)/g(x),   g(x) > 0
Example 1.13
Refer to Example 1.3. Find (a) f (x|y), (b) f (y|x).
Solution
From Example 1.10,

h(y) = 1/3 + y/6

(a) f(x|y) = f(x, y)/h(y)

Therefore

f(x|y) = (x² + xy/3) / (1/3 + y/6)
       = (6x² + 2xy)/(2 + y),   0 ≤ x ≤ 1;  0 ≤ y ≤ 2

(b) The conditional probability of Y for given X = x is

f(y|x) = f(x, y)/g(x)

From Example 1.10,

g(x) = 2x² + (2/3)x

Therefore

f(y|x) = (x² + xy/3) / (2x² + (2/3)x)
       = (3x² + xy)/(6x² + 2x)
       = (3x + y)/(6x + 2),   0 ≤ y ≤ 2;  0 ≤ x ≤ 1
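For any fixed value of the conditioning variable, a conditional density must integrate to 1 over the other variable. The following sketch (ours) checks this numerically for the two conditional densities of Example 1.13 at arbitrarily chosen values y = 1.5 and x = 0.4:

def riemann(f, a, b, n=2000):
    # Midpoint rule for a one-dimensional integral
    h = (b - a)/n
    return sum(f(a + (i + 0.5)*h) for i in range(n))*h

f_x_given_y = lambda x, y: (6*x**2 + 2*x*y)/(2 + y)   # from part (a)
f_y_given_x = lambda y, x: (3*x + y)/(6*x + 2)        # from part (b)

print(riemann(lambda x: f_x_given_y(x, 1.5), 0, 1))   # close to 1 for y = 1.5
print(riemann(lambda y: f_y_given_x(y, 0.4), 0, 2))   # close to 1 for x = 0.4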
Example 1.14
Consider the joint probability density function

f(x, y) = { 6(1 − y),   0 ≤ x ≤ y ≤ 1
          { 0,          elsewhere

Find P(Y < 1/2 | X < 3/4).

Solution

P(Y < 1/2 | X < 3/4) = P(Y < 1/2, X < 3/4) / P(X < 3/4)

We first evaluate the numerator; the region of integration is the part of the support with 0 ≤ x ≤ y < 1/2. Now,

P(Y < 1/2, X < 3/4) = ∫_{y=0}^{1/2} ∫_{x=0}^{y} 6(1 − y) dx dy
                    = ∫_{y=0}^{1/2} 6y(1 − y) dy
                    = [3y² − 2y³]_{y=0}^{1/2}
                    = 1/2

The denominator is evaluated as follows. First we find the marginal p.d.f. of X:

g(x) = ∫_{y=x}^{1} 6(1 − y) dy = 3(1 − x)²,   0 ≤ x ≤ 1

so that

P(X < 3/4) = ∫_{x=0}^{3/4} 3(1 − x)² dx = [−(1 − x)³]_{0}^{3/4} = 1 − 1/64 = 63/64

Hence,

P(Y < 1/2 | X < 3/4) = (1/2)/(63/64) = 32/63
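The conditional probability 32/63 ≈ 0.5079 can be confirmed numerically. The sketch below is ours; it approximates the numerator and denominator by double Riemann sums over the support 0 ≤ x ≤ y ≤ 1:

def double_riemann(f, n=800):
    # Midpoint rule over the unit square; f returns 0 outside its support
    h = 1.0/n
    s = 0.0
    for i in range(n):
        x = (i + 0.5)*h
        for j in range(n):
            y = (j + 0.5)*h
            s += f(x, y)
    return s*h*h

f = lambda x, y: 6*(1 - y) if 0 <= x <= y <= 1 else 0.0

num = double_riemann(lambda x, y: f(x, y) if (y < 0.5 and x < 0.75) else 0.0)
den = double_riemann(lambda x, y: f(x, y) if x < 0.75 else 0.0)
print(num/den)   # close to 32/63 = 0.5079...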
Thus, the independence of the random variables X and Y implies that their joint distribution function factors into the product of their marginal distribution functions. This definition applies whether the random variables are discrete or continuous.

If X and Y are not independent, they are said to be dependent. It is usually more convenient to verify independence or otherwise with the help of the p.m.f. (in the discrete case) or p.d.f. (in the continuous case).
Definition 1.17
If X and Y are discrete random variables with joint probability func-
tion p(x, y) and marginal probability function g(x) and h(y) respec-
tively, then X and Y are independent if and only if
p(x, y) = g(x)h(y)
Example 1.15
Refer to the table in Example 1.1, verify whether or not X and Y are
independent.
Solution
Consider the ordered pair (0, 1). From Example 1.1,

P(X = 0, Y = 1) = 1/8

But from the marginal distributions,

P(X = 0) = 1/8
P(Y = 1) = 2/8

so that

P(X = 0)P(Y = 1) = (1/8)(2/8) = 1/32

which does not equal 1/8. Therefore X and Y are not independent.

Example 1.16
Refer to Example 1.2. Are X and Y independent?

Solution
From Example 1.2,

p(x, y) = (1/21)(3x + 2y)

From Example 1.9,

g(x) = (1/7)(3x + 2)
h(y) = (1/21)(3 + 4y)

Now

g(x)h(y) = (1/7)(3x + 2) · (1/21)(3 + 4y) = (1/147)(9x + 12xy + 8y + 6)

which is not equal to p(x, y); hence X and Y are not independent.
Definition 1.18
If X and Y are continuous random variables with joint density function f(x, y) and marginal density functions g(x) and h(y), respectively, then X and Y are independent if and only if

f(x, y) = g(x)h(y)
Example 1.17
Refer to Example 1.3. Verify whether X and Y are independent.
Solution
From Example 1.10,

g(x) = 2x² + (2/3)x
h(y) = 1/3 + y/6

Now

g(x)h(y) = (2x² + (2/3)x)(1/3 + y/6)
         = (2/3)x² + (1/3)x²y + (2/9)x + (1/9)xy

which is not equal to

f(x, y) = x² + xy/3

Hence X and Y are not independent.
Example 1.18
Suppose the joint p.d.f. of X and Y is given by

f(x, y) = { 4xy,   0 < x < 1, 0 < y < 1
          { 0,     elsewhere

Verify whether or not X and Y are independent.

Solution
We have

g(x) = ∫_{0}^{1} f(x, y) dy = ∫_{0}^{1} 4xy dy = 4x [y²/2]_{0}^{1} = 2x,   0 < x < 1

Similarly,

h(y) = ∫_{0}^{1} f(x, y) dx = 2y,   0 < y < 1

Hence,

f(x, y) = g(x)h(y)

for all real numbers (x, y), and therefore X and Y are independent.
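The factorisation tests of Definitions 1.17 and 1.18 are easy to automate. The sketch below (ours) applies them to the discrete table of Example 1.1 (dependent) and to the density 4xy of Example 1.18 (independent):

from fractions import Fraction

# Discrete case (Example 1.15): compare p(x, y) with g(x)h(y)
p = {(0,1): Fraction(1,8), (1,2): Fraction(2,8), (1,3): Fraction(1,8),
     (2,2): Fraction(2,8), (2,3): Fraction(1,8), (3,1): Fraction(1,8)}
xs, ys = (0, 1, 2, 3), (1, 2, 3)
g = {x: sum(p.get((x, y), 0) for y in ys) for x in xs}
h = {y: sum(p.get((x, y), 0) for x in xs) for y in ys}
print(all(p.get((x, y), Fraction(0)) == g[x]*h[y] for x in xs for y in ys))   # False: dependent

# Continuous case (Example 1.18): compare f(x, y) with g(x)h(y) on a grid
f  = lambda x, y: 4*x*y
gx = lambda x: 2*x
hy = lambda y: 2*y
grid = [k/10 for k in range(1, 10)]
print(all(abs(f(x, y) - gx(x)*hy(y)) < 1e-12 for x in grid for y in grid))    # True: independent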
Obtaining the distributions of X ± Y, XY, and X/Y (Y ≠ 0) is not straightforward, and this is the subject of the next chapter.

EXERCISES

1.1 Given the following joint frequency distribution of X and Y

1.3 Refer to Exercise 1.1. Calculate
    (a) P(X < 2, Y ≤ 3),        (b) P(2 ≤ X < 3, 1 ≤ Y ≤ 2),
    (c) P(1 < X < 2, Y < 3),    (d) P(X < 2, 0 ≤ Y < 2)

1.4 Refer to Exercise 1.1. Find
    (a) P(Y = 3|X = 1),    (b) P(Y = 2|X = 3),
    (c) P(X = 2|Y = 2),    (d) P(X = 1|Y = 2),
    (e) P(X = 3|Y = 1),    (f) P(X = 3|Y = 2)

1.5 Refer to Example 1.5. Solve part (c).
1.6 The joint probability mass function of X and Y is given by

    p(1, 1) = 1/8,   p(1, 2) = 1/4,   p(2, 1) = 1/8,   p(2, 2) = 1/2

    (a) Compute the conditional probability mass function of X, given Y = i, i = 1, 2.
    (b) Are X and Y independent?
1.7 Consider the bivariate discrete random variables X and Y with probability function

    p(x, y) = (1/27)(2x + y),   x = 0, 1, 2;  y = 0, 1, 2.

    (a) Verify that p(x, y) is a legitimate probability mass function.
    (b) Find the joint probability P(1 ≤ X ≤ 2, 0 ≤ Y ≤ 1).
P(X = x, Y = y) = θ^x (1 − θ)^y,
1.9 A bivariate discrete random variable (X, Y) has the following probability mass function:

    p(x, y) = C(x + y, x) p^x (1 − p)^y e^{−λ} λ^{x+y} / (x + y)!,   x, y = 0, 1, 2, ...

    where C(x + y, x) denotes the binomial coefficient.

    (a) Show that the marginal probability function of X is

        g(x) = e^{−λp} (λp)^x / x!,   x = 0, 1, 2, ...

    (b) Find the marginal probability mass function of Y.
    (c) Examine whether X and Y are independent.
    where k is a constant.
    (a) Find the value of k.
    (b) Find the marginal probability distributions of X and Y.
    (c) Find the conditional probability of
        (i) X given Y = y,
        (ii) Y given X = x.
    (d) Verify whether X and Y are independent or not.
    (e) Find P(X > Y).

1.16 The number of people that enter a church in a given hour is a Poisson random variable with λ = 10. Compute the conditional probability that at most 2 men entered the church in a given hour, given that 10 women entered in that hour. What assumptions have you made?
1.20 Show that the conditional densities f (x|y) and f (y|x) are legitimate
density functions.
Chapter 2

SUMS, DIFFERENCES, PRODUCTS AND QUOTIENTS OF BIVARIATE DISTRIBUTIONS
2.1 INTRODUCTION
In this chapter, we shall take up the distribution of sums, differences, products and quotients of X and Y. The numerical characterisation of the joint distribution of X and Y will be discussed in the next chapter. As has been the practice in this book, discussions will be done separately for discrete and continuous cases.

We shall introduce the concept of the sum of discrete random variables with an example.

Example 2.1
Consider a sequence of independent experiments in each of which a fixed event is of interest. Suppose X1, X2, · · · , Xn are discrete random variables defined by

Xi = { 1, if the fixed event happens in the ith experiment;
     { 0, if the fixed event does not happen in the ith experiment.
Let
Zn = X1 + X2 + · · · + Xn
The random variable Zn is a sum of discrete random variables Xi , i =
1, 2, ..., n, and denotes the number of times the fixed event happens in the
n trials.
For instance, the President of Regent University receives visitors com-
ing every day from the two surrounding communities. The number of visitors
on a particular day from the communities are X1 and X2 , respectively. Let
Z2 = X1 + X2 . The random variable Z2 is a sum of two random variables
and gives the total number of visitors coming to the President of Regent
University on a particular day.
Case 1
Suppose that the joint distribution of X and Y is in the form of the table as
presented in Table 1.3. Such a table can be referred to as the parent joint
probability distribution table. Then the various sums of the discrete
random variables X and Y, denoted as X + Y, and their corresponding
probabilities may be presented as in Table 2.1. This can be referred to as
the derived joint probability distribution table.
Table 2.1

xi + yj      x1 + y1      x1 + y2     ···     x1 + ym      x2 + y1      x2 + y2     ···     xn + ym
p(xi, yj)   p(x1, y1)    p(x1, y2)    ···    p(x1, ym)    p(x2, y1)    p(x2, y2)    ···    p(xn, ym)

To find the distribution of the sum, we apply the principle of the probabilities of equivalent events.¹ By this principle, the probabilities of equivalent events are equal. That is, if A ⊂ S and B ⊂ RX are equivalent events, then we define the probability of the event B, P(B), to be equal to P(A). This principle is illustrated in some examples below, after the following theorem.

¹ The events A ⊂ S and B ⊂ RX are called equivalent events if A = {s ∈ S | X(s) ∈ B}, where RX is the range space of the random variable X.

Theorem 2.1
The probability of the sum of X = xi and Y = yj being equal to k is the sum of the probabilities that correspond to all indices i and j for which xi + yj = k.

Example 2.2
Given the table below, find the distribution of the sum X + Y.

X \ Y             −1       0       1     Row Totals
−1               0.10    0.20    0.11       0.41
0                0.08    0.02    0.26       0.36
2                0.03    0.17    0.03       0.23
Column Totals    0.21    0.39    0.40       1.00

Solution
We shall list the values of x + y in each cell within parentheses as in the table below:

X \ Y              −1           0           1
−1             0.10 (−2)    0.20 (−1)   0.11 (0)
0              0.08 (−1)    0.02 (0)    0.26 (1)
2              0.03 (1)     0.17 (2)    0.03 (3)
Step 1
Indicate the various summands with their corresponding joint probabilities:

x + y       −2     −1      0     −1      0      1      1      2      3
p(x, y)    0.10   0.20   0.11   0.08   0.02   0.26   0.03   0.17   0.03

Step 2
Sum the various values of X and Y. By the principle of equality of the probabilities of equivalent events, we shall perform the operation in Step 3.

Step 3
Sum the various probabilities corresponding to a particular value of the sum x + y, denoted by p(x + y) (by the addition rule of probability).³ For example, the probability of the sum x + y = 1 is given by

P[(X + Y) = 1] = p(1) = 0.26 + 0.03 = 0.29

Thus, we summarise the table above as follows:

x + y       −2         −1            1             0          2      3
p(x + y)   0.10   0.20 + 0.08   0.26 + 0.03   0.11 + 0.02   0.17   0.03

Step 4
Present the final result as in the table below:

x + y       −2     −1      0      1      2      3
p(x + y)   0.10   0.28   0.13   0.29   0.17   0.03

We can verify that this distribution is a probability distribution. That is,

0 ≤ p(x + y) ≤ 1

and

∑ p(x + y) = 1

³ That is, if a particular value of the sum appears more than once, their probabilities are added together for the purpose of constructing the probability distribution.

Case 2
Suppose that the joint distribution of X and Y is in the form of Table 1.3. Suppose also that X and Y are independent, so that p(xi, yj) = g(xi)h(yj). Then the various sums X + Y and their corresponding probabilities may be presented in the form of the following table.

xi + yj       x1 + y1       x2 + y2      ···      xn + ym
p(xi, yj)   p(x1)p(y1)    p(x2)p(y2)     ···    p(xn)p(ym)
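Steps 1–4 amount to grouping the joint probabilities by the value of the sum. The following sketch (ours) does this for the table of Example 2.2 and reproduces the Step 4 result exactly:

from collections import defaultdict
from fractions import Fraction

p = {(-1,-1): "0.10", (-1,0): "0.20", (-1,1): "0.11",
     ( 0,-1): "0.08", ( 0,0): "0.02", ( 0,1): "0.26",
     ( 2,-1): "0.03", ( 2,0): "0.17", ( 2,1): "0.03"}

dist = defaultdict(Fraction)
for (x, y), prob in p.items():
    dist[x + y] += Fraction(prob)   # group probabilities of equal sums (equivalent events)

print({s: float(v) for s, v in sorted(dist.items())})
# {-2: 0.1, -1: 0.28, 0: 0.13, 1: 0.29, 2: 0.17, 3: 0.03}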
Theorem 2.2
Suppose that X and Y are independent discrete random variables with marginal probability distributions g(x) and h(y) respectively. Then P(X + Y = k) is the sum of the products g(x)h(y) over all pairs of values (x, y) for which x + y = k.
Example 2.3
Suppose the joint probabilities of two random variables X and Y are given in the table below.

X \ Y            3       6     Row Total
−2              0.28    0.12      0.4
4               0.42    0.18      0.6
Column Total    0.7     0.3       1.0

(a) Are X and Y independent?
(b) Find the distribution of the sum X + Y.

Solution
(a) The marginal probability distributions of X and Y are

x        −2     4
g(x)    0.4    0.6

y        3      6
h(y)    0.7    0.3

so that we have

p(−2, 3) = 0.28 = g(−2)h(3) = 0.4(0.7) = 0.28
p(−2, 6) = 0.12 = g(−2)h(6) = 0.4(0.3) = 0.12
p(4, 3) = 0.42 = g(4)h(3) = 0.6(0.7) = 0.42
p(4, 6) = 0.18 = g(4)h(6) = 0.6(0.3) = 0.18

Hence, X and Y are independent.

(b) It has been shown that X and Y are independent. Hence, we obtain the following table:

xi + yj       −2 + 3     −2 + 6     4 + 3      4 + 6
p(xi + yj)   0.4(0.7)   0.4(0.3)   0.6(0.7)   0.6(0.3)

Note
Even though the values of the random variables X and Y are added together, their corresponding probabilities g(xi) and h(yj) are multiplied, if X and Y are independent.
In most cases the distribution of X and Y may not be presented in a table. Suppose that all that we know are the values of X and Y and their corresponding probabilities. In such a case, to obtain the distribution of X + Y we may use the following theorem.

Theorem 2.3
Suppose X and Y are nonnegative discrete random variables with joint probability function p(x, y). Let Z = X + Y. Then the probability distribution of Z is

P(Z = z) = ∑_{x=0}^{z} p(x, z − x)

or equivalently,

P(Z = z) = ∑_{y=0}^{z} p(z − y, y)

Proof
P(Z = z) = P(X + Y = z)
         = ∑_{x=0}^{z} P(X = x, X + Y = z)
         = ∑_{x=0}^{z} P(X = x, Y = z − x)      (i)

from which the result follows. Similarly,

P(Z = z) = ∑_{y=0}^{z} p(z − y, y)
Note
It is important to note the limits of summation. Beyond these limits, one
of the two component mass functions is zero. When dealing with densities
that are nonzero only on some subset of values, we must always be careful.
In case X and Y are allowed to take negative values as well, the lower index
of summation is changed from 0 to −∞.
Example 2.4
Refer to Example 1.2. Find
(a) the probability distribution of Z = X + Y,
    (i) from the first principle (that is, using the joint distribution of X and Y directly);
    (ii) using Theorem 2.3;
(b) P(Z = 2),
    (i) using the distribution obtained in (a)(i);
    (ii) using the distribution obtained in (a)(ii).

Solution
Recall from the solution of Example 1.2 that

p(x, y) = (1/21)(3x + 2y),   for x = 0, 1;  y = 0, 1, 2

The various pairs of X and Y that we require are

(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)

(a) (i) Summing the joint probabilities that correspond to each value of the sum, we have

P(Z = 0) = p(0, 0) = (1/21)[3(0) + 2(0)] = 0

P(Z = 1) = p(0, 1) + p(1, 0) = (1/21)[3(0) + 2(1)] + (1/21)[3(1) + 2(0)] = 5/21

P(Z = 2) = p(0, 2) + p(1, 1) = (1/21)[3(0) + 2(2)] + (1/21)[3(1) + 2(1)] = 9/21

P(Z = 3) = p(1, 2) = (1/21)[3(1) + 2(2)] = 7/21

x + y        0      1       2       3
p(x + y)     0    5/21    9/21    7/21

(ii) By Theorem 2.3, P(Z = z) is obtained from

P(Z = z) = p(0, z − 0),   z = 0, 1, 2, 3
         + p(1, z − 1),   z = 1, 2, 3
         + p(2, z − 2),   z = 2, 3

where each term is included only for the indicated values of z; this gives the same distribution as in (i).
(b) (i) From the table obtained in (a), that is, calculating the probability
from first principles, we have

    P(Z = 2) = 3/7

(ii) By Theorem 2.3 (that is, using the results in (a)(ii)) we have

    P(Z = 2) = Σ_{x=0}^{1} p(x, 2 − x)
             = (1/21) Σ_{x=0}^{1} [3x + 2(2 − x)]
             = (1/21) Σ_{x=0}^{1} (x + 4)
             = 3/7

Aliter
Using the alternative formula in Theorem 2.3,

    P(Z = 2) = Σ_{y=0}^{2} p(2 − y, y)
             = Σ_{y=1}^{2} p(2 − y, y),   since p(2, 0) = 0
             = (1/21) Σ_{y=1}^{2} [3(2 − y) + 2y]
             = (1/21) Σ_{y=1}^{2} (6 − y)
             = 3/7
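The computation in Example 2.4 can be checked mechanically. The following is a minimal Python sketch (the helper names are illustrative only) that applies Theorem 2.3 to the joint mass function of Example 1.2 and reproduces the distribution of Z = X + Y.

```python
from fractions import Fraction

# Joint pmf of Example 1.2: p(x, y) = (3x + 2y)/21 for x = 0,1 and y = 0,1,2
def p(x, y):
    if x in (0, 1) and y in (0, 1, 2):
        return Fraction(3 * x + 2 * y, 21)
    return Fraction(0)

# Theorem 2.3: P(Z = z) = sum_{x=0}^{z} p(x, z - x)
def pmf_of_sum(z):
    return sum(p(x, z - x) for x in range(z + 1))

for z in range(4):
    print(z, pmf_of_sum(z))
# Prints 0, 5/21, 3/7 and 1/3, i.e. 0, 5/21, 9/21, 7/21 as in Example 2.4
```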
Theorem 2.4
Suppose X and Y are independent nonnegative discrete random variables
with marginal probability mass functions g(x) and h(y) respectively, and
let Z = X + Y . Then

    P(Z = z) = Σ_{x=0}^{z} g(x)h(z − x),   z ≥ 0

or equivalently,

    P(Z = z) = Σ_{y=0}^{z} g(z − y)h(y),   z ≥ 0

Proof
Since X and Y are independent, the proof follows by writing

    P(X = x, Y = z − x)

as

    P(X = x)P(Y = z − x)

in Theorem 2.3.

Theorem 2.4 suggests a method of obtaining the probability mass function
of Z = X + Y when each of g and h is a one-dimensional probability mass
function. Thus the probability function of the sum of two independent
random variables is the convolution⁴ of the individual probability
functions, called the convolution of g and h.

Note
(a) Theorem 2.3 also expresses a convolution;
(b) The convolution of g and h is denoted as g ∗ h.

Since

    X + Y = Y + X

it is easy to verify that

    P(Z = z) = Σ_x g(x)h(z − x) = Σ_y g(z − y)h(y)

that is, g ∗ h = h ∗ g.

⁴ Some writers prefer to use the French word Composition or even the German
equivalent Faltung.

Example 2.5
Suppose that X and Y are independent discrete random variables having
probability distributions

    P(X = x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, . . .

and

    P(Y = y) = e^{−θ} θ^y / y!,   y = 0, 1, 2, . . .

(a) Find the probability distribution of Z = X + Y .
(b) If λ = 1 and θ = 2, find P(Z = 4).
Solution
The joint probability distribution of X and Y is given by

    p(x, y) = (e^{−λ} λ^x / x!) · (e^{−θ} θ^y / y!)

(a) By Theorem 2.4,

    P(Z = z) = Σ_{x=0}^{z} (e^{−λ} λ^x / x!) · (e^{−θ} θ^{z−x} / (z − x)!)
             = (e^{−(λ+θ)} / z!) Σ_{x=0}^{z} [z! / (x!(z − x)!)] λ^x θ^{z−x}

Identifying the sum as the binomial expansion of (λ + θ)^z we obtain

    P(Z = z) = e^{−(λ+θ)} (λ + θ)^z / z!,   z = 0, 1, 2, . . .

(b) When λ = 1 and θ = 2,

    P(Z = 4) = e^{−(1+2)} (1 + 2)^4 / 4!
             = e^{−3} (3)^4 / 4!
             ≈ 0.1680
Example 2.5 shows that, if X and Y are independent random variables having
Poisson distributions with parameters λ and θ respectively, then their sum
has a Poisson distribution with parameter λ + θ. That is, the sum of two
independent Poisson random variables is again a Poisson random variable.
The ideas here could be used to prove more generally, by induction, that
the sum of a finite number of independent Poisson random variables also
has a Poisson distribution, a result we proved earlier using the moment
generating function in my book “Introductory Probability Theory”
(Nsowah-Nuamah, 2017).
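As a numerical illustration of the convolution formula (a minimal sketch; the parameter values λ = 1 and θ = 2 are simply those of Example 2.5(b)), the snippet below convolves two Poisson mass functions and compares the result with the Poisson(λ + θ) mass function.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    return exp(-mu) * mu ** k / factorial(k)

lam, theta = 1.0, 2.0

# Convolution (Theorem 2.4): P(Z = z) = sum_{x=0}^{z} g(x) h(z - x)
def pmf_sum(z):
    return sum(poisson_pmf(x, lam) * poisson_pmf(z - x, theta) for x in range(z + 1))

for z in range(6):
    print(z, round(pmf_sum(z), 6), round(poisson_pmf(z, lam + theta), 6))
# The two columns agree; e.g. P(Z = 4) = e^{-3} 3^4 / 4! ≈ 0.168031
```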
Note
In general, if two independent random variables follow the same type of
distribution, it does not necessarily follow that their sum has a
distribution of the same type.
The formulas for finding the distribution of a sum of continuous random
variables are similar to those in the discrete case, with the sum replaced
by an integral. However, the actual computation is more involved in the
continuous case.
Theorem 2.5
Suppose that X and Y are continuous random variables having joint
density function f (x, y). Let Z = X + Y and denote the density
function of Z by s(z). Then

    s(z) = ∫_{−∞}^{∞} f (x, z − x) dx

or equivalently,

    s(z) = ∫_{−∞}^{∞} f (z − y, y) dy
Proof
We shall use the cumulative distribution function technique. Let S denote
the c.d.f. of the random variable Z. Then

    S(z) = P(Z ≤ z) = P(X + Y ≤ z) = ∫∫_R f (x, y) dx dy

where R is the part of the region over which x + y ≤ z, shown shaded in
Figure 2.1.

To represent this integral as an iterated integral, we fix x and integrate
with respect to y from −∞ to z − x. Next we integrate with respect to x
from −∞ to ∞. Thus

    S(z) = ∫_{x=−∞}^{∞} ∫_{y=−∞}^{z−x} f (x, y) dy dx

Differentiating S(z) with respect to z then gives the density

    s(z) = ∫_{−∞}^{∞} f (x, z − x) dx
Example 2.6
Refer to Example 1.3.

(a) Find the distribution of Z = X + Y ;

(b) Calculate P(Z ≤ 2) using
    (i) the distribution of Z;
    (ii) the joint distribution of (X, Y ), that is, from first principles.

Solution
(a) Z = X + Y =⇒ Y = Z − X. Therefore

    0 < Y < 2 =⇒ 0 ≤ Z − X ≤ 2 =⇒ X ≤ Z ≤ X + 2
It can be seen from the sketch that the region of integration is
partitioned into three, namely,

    0 ≤ X ≤ Z        when 0 < Z ≤ 1
    0 ≤ X ≤ 1        when 1 < Z ≤ 2
    Z − 2 ≤ X ≤ 1    when 2 < Z ≤ 3

For 0 < Z ≤ 1,

    s(z) = ∫_{x=0}^{z} [x² + x(z − x)/3] dx = (7/18) z³

For 1 < Z ≤ 2,

    s(z) = ∫_{x=0}^{1} [x² + x(z − x)/3] dx = 2/9 + z/6

Finally, for 2 < Z ≤ 3,

    s(z) = ∫_{x=z−2}^{1} [x² + x(z − x)/3] dx = (1/18)(−7z³ + 36z² − 57z + 36)

Therefore, the distribution of Z is

    s(z) =  (7/18) z³,                           0 < z ≤ 1
            2/9 + z/6,                           1 < z ≤ 2
            (1/18)(−7z³ + 36z² − 57z + 36),      2 < z ≤ 3
            0,                                   otherwise
(b) (i) Using the distribution of Z we calculate the required probability
as follows:

    P(Z ≤ 2) = P(0 < Z ≤ 1) + P(1 < Z ≤ 2)
             = ∫_{z=0}^{1} (7/18) z³ dz + ∫_{z=1}^{2} (2/9 + z/6) dz
             = [(7/72) z⁴]_{0}^{1} + [(2/9) z + z²/12]_{1}^{2}
             = 41/72
(ii) Now, using the joint distribution of (X, Y ),

    P(Z ≤ 2) = ∫∫_{x+y≤2} f (x, y) dy dx
             = ∫_{x=0}^{1} ∫_{y=0}^{2−x} (x² + xy/3) dy dx
             = ∫_{x=0}^{1} [x² y + xy²/6]_{y=0}^{2−x} dx
             = ∫_{x=0}^{1} { x²(2 − x) + (x/6)(2 − x)² } dx
             = ∫_{x=0}^{1} { (2x² − x³) + (1/6)(4x − 4x² + x³) } dx
             = [ 2x³/3 − x⁴/4 + (1/6)(2x² − 4x³/3 + x⁴/4) ]_{0}^{1}
             = 41/72
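A quick numerical cross-check of Example 2.6(b) (a rough sketch using a midpoint Riemann sum; the grid size is arbitrary): integrating the joint density over the region x + y ≤ 2 should give approximately 41/72 ≈ 0.5694.

```python
# Midpoint Riemann sum of f(x, y) = x^2 + x*y/3 over {0<x<1, 0<y<2, x+y<=2}
def f(x, y):
    return x * x + x * y / 3.0

n = 1000
hx, hy = 1.0 / n, 2.0 / n
total = 0.0
for i in range(n):
    x = (i + 0.5) * hx
    for j in range(n):
        y = (j + 0.5) * hy
        if x + y <= 2.0:
            total += f(x, y) * hx * hy

print(total, 41 / 72)   # both approximately 0.5694
```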
Theorem 2.6
Suppose that X and Y are independent continuous random variables with
marginal density functions g(x) and h(y) respectively, and let Z = X + Y .
Then the density function of Z is

    s(z) = ∫_{−∞}^{∞} g(x)h(z − x) dx

or equivalently,

    s(z) = ∫_{−∞}^{∞} g(z − y)h(y) dy

Proof
The theorem follows from Theorem 2.5 by noting that, since X and Y are
independent,

    f (x, z − x) = g(x)h(z − x)
    f (z − y, y) = g(z − y)h(y)

The function s is called the convolution of the functions g and h, and the
expression in the Theorem is called the convolution formula.

If X and Y are nonnegative random variables, then the convolution formula
reduces to

    s(z) = ∫_{0}^{z} g(x)h(z − x) dx

This is because each of g and h is equal to 0 for a negative argument.
Example 2.7
Suppose that X and Y are independent and identically distributed random
variables having p.d.f.’s

    fX (x) = λ e^{−λ x},   x ≥ 0

and

    fY (y) = λ e^{−λ y},   y ≥ 0

respectively. Find the distribution of the sum Z = X + Y .

Solution

    fZ (z) = ∫_{0}^{z} λ e^{−λ t} · λ e^{−λ(z−t)} dt
           = λ² e^{−λ z} ∫_{0}^{z} dt
           = λ² z e^{−λ z},   z ≥ 0
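The convolution in Example 2.7 can be verified numerically (a minimal sketch; the value λ = 2 is an arbitrary choice made only for this check): the numerical convolution of two Exp(λ) densities should agree with λ² z e^{−λz}.

```python
from math import exp

lam = 2.0

def f_exp(t):
    return lam * exp(-lam * t) if t >= 0 else 0.0

# Convolution integral f_Z(z) = integral_0^z f_X(t) f_Y(z - t) dt, midpoint rule
def f_sum(z, n=10000):
    h = z / n
    return sum(f_exp((i + 0.5) * h) * f_exp(z - (i + 0.5) * h) for i in range(n)) * h

for z in (0.5, 1.0, 2.0):
    print(z, round(f_sum(z), 6), round(lam ** 2 * z * exp(-lam * z), 6))
# The numerical convolution agrees with the closed form λ² z e^{−λz}
```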
The probability distribution of the difference X − Y of two discrete random
variables whose joint probability function is in tabular form can be
calculated using the principle of the equality of the probabilities of
equivalent events, as illustrated below:

    xi − yj       x1 − y1       x1 − y2       · · ·   x1 − ym       x2 − y1       · · ·   xn − ym
    p(xi , yj )   p(x1 , y1 )   p(x1 , y2 )   · · ·   p(x1 , ym )   p(x2 , y1 )   · · ·   p(xn , ym )
Example 2.8
For the data in Example 2.2, find the distribution of the difference of the
random variables X and Y .

Solution
Step 1
Indicate the various differences with their corresponding joint
probabilities:

    xi − yj       −1 − (−1)   −1 − 0   −1 − 1   0 − (−1)   0 − 0   0 − 1   2 − (−1)   2 − 0   2 − 1
    p(xi , yj )     0.1         0.2      0.11     0.08       0.02    0.26    0.03       0.17    0.03

Step 2
Subtract the various values for Y from the various values for X:

    x − y        0      −1     −2     1      0      −1     3      2      1
    p(x, y)      0.1    0.2    0.11   0.08   0.02   0.26   0.03   0.17   0.03

Step 3
By the principle of equality of the probabilities of equivalent events we
obtain the table below:

    x − y        −2     −1            0            1             2      3
    p(x − y)     0.11   0.2 + 0.26    0.1 + 0.02   0.08 + 0.03   0.17   0.03

Hence the distribution of X − Y is

    x − y        −2     −1     0      1      2      3
    p(x − y)     0.11   0.46   0.12   0.11   0.17   0.03

We can verify that this distribution of the difference is a probability
distribution. That is,

    0 ≤ p(x − y) ≤ 1

and

    Σ p(x − y) = 1
Example 2.9
For the data in Example 2.3, find the distribution of the difference of the
random variables X and Y .

Solution
It has been shown in Example 2.3 that X and Y are independent. Therefore,
the various values of the sum {x + (−y)} and their corresponding
probabilities are given in the following tables:

    xi − yj        −2 − 3      −2 − 6      4 − 3       4 − 6
    p(xi , yj )    0.4(0.7)    0.4(0.3)    0.6(0.7)    0.6(0.3)

    xi − yj        −5      −8      1       −2
    p(xi − yj )    0.28    0.12    0.42    0.18
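The tabular "equality of equivalent events" procedure in Examples 2.8 and 2.9 amounts to grouping equal values of x − y and adding their probabilities. The following minimal Python sketch (the dictionary layout is illustrative only) applies it to the joint table of Example 2.2.

```python
from collections import defaultdict

# Joint pmf of Example 2.2, written as {(x, y): probability}
joint = {(-1, -1): 0.10, (-1, 0): 0.20, (-1, 1): 0.11,
         ( 0, -1): 0.08, ( 0, 0): 0.02, ( 0, 1): 0.26,
         ( 2, -1): 0.03, ( 2, 0): 0.17, ( 2, 1): 0.03}

# Group equal differences x - y and add their probabilities
diff_pmf = defaultdict(float)
for (x, y), prob in joint.items():
    diff_pmf[x - y] += prob

for d in sorted(diff_pmf):
    print(d, round(diff_pmf[d], 2))
# Reproduces Example 2.8: -2:0.11, -1:0.46, 0:0.12, 1:0.11, 2:0.17, 3:0.03
```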
Theorem 2.7
Suppose that X and Y are continuous random variables having joint
density function f (x, y) and let U = X − Y . Then

    fU (u) = ∫_{−∞}^{∞} f (x, x − u) dx

or equivalently,

    fU (u) = ∫_{−∞}^{∞} f (u + y, y) dy
Corollary 2.1
Suppose that X and Y are independent nonnegative continuous random
variables having joint density function f (x, y) with marginal probability
distributions g(x) and h(y) respectively, and let U = X − Y . Then

    fU (u) = ∫_{0}^{∞} g(x) h(x − u) dx

or equivalently,

    fU (u) = ∫_{0}^{∞} g(u + y) h(y) dy
Example 2.10
Refer to Example 1.3.

(a) Find the distribution of U = X − Y ;

(b) Using the distribution in (a), calculate (i) P(U ≤ 1); (ii) P(U < 1/2).

Solution
(a) X − Y = U =⇒ X = U + Y . Therefore

    0 < X < 1 =⇒ 0 ≤ U + Y ≤ 1 =⇒ −Y ≤ U ≤ 1 − Y

It can be seen from the sketch that the region of integration is
partitioned into three, namely,

    −U ≤ Y ≤ 2         when −2 ≤ U < −1
    −U ≤ Y ≤ 1 − U     when −1 ≤ U < 0
    0 ≤ Y ≤ 1 − U      when 0 ≤ U ≤ 1
For −2 ≤ U < −1,

    fU (u) = ∫_{−u}^{2} [u² + (7/3)u y + (4/3)y²] dy
           = (5/18)u³ + 2u² + (14/3)u + 32/9

For −1 ≤ U < 0,

    fU (u) = ∫_{−u}^{1−u} [u² + (7/3)u y + (4/3)y²] dy
           = 4/9 − u/6

For 0 ≤ U ≤ 1,

    fU (u) = ∫_{0}^{1−u} [u² + (7/3)u y + (4/3)y²] dy
           = 4/9 − u/6 − (5/18)u³
Therefore

    fU (u) =  (5/18)u³ + 2u² + (14/3)u + 32/9,   −2 ≤ u < −1
              4/9 − u/6,                          −1 ≤ u < 0
              4/9 − u/6 − (5/18)u³,                0 ≤ u ≤ 1
              0,                                   elsewhere
(b) (i)

    P(U ≤ 1) = P(−2 ≤ U < −1) + P(−1 ≤ U < 0) + P(0 ≤ U ≤ 1)
             = ∫_{−2}^{−1} [(5/18)u³ + 2u² + (14/3)u + 32/9] du
               + ∫_{−1}^{0} (4/9 − u/6) du + ∫_{0}^{1} [4/9 − u/6 − (5/18)u³] du
             = 13/72 + 19/36 + 21/72
             = 1

(ii)

    P(U ≤ 1/2) = P(−2 ≤ U < −1) + P(−1 ≤ U < 0) + P(0 ≤ U ≤ 1/2)
               = ∫_{−2}^{−1} [(5/18)u³ + 2u² + (14/3)u + 32/9] du
                 + ∫_{−1}^{0} (4/9 − u/6) du + ∫_{0}^{1/2} [4/9 − u/6 − (5/18)u³] du
               = 13/72 + 19/36 + 227/1152
               = 1043/1152
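As a cross-check of Example 2.10(b)(ii) (a rough sketch using a midpoint Riemann sum over the joint density; the grid size is arbitrary), P(X − Y ≤ 1/2) should be approximately 1043/1152 ≈ 0.9054.

```python
# Numerical check of P(X - Y <= 1/2) for f(x, y) = x^2 + x*y/3 on 0<x<1, 0<y<2
def f(x, y):
    return x * x + x * y / 3.0

n = 1000
hx, hy = 1.0 / n, 2.0 / n
prob = 0.0
for i in range(n):
    x = (i + 0.5) * hx
    for j in range(n):
        y = (j + 0.5) * hy
        if x - y <= 0.5:
            prob += f(x, y) * hx * hy

print(prob, 1043 / 1152)   # both approximately 0.9054
```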
The probability distribution of the product of two discrete random
variables can be obtained in the same way, by listing the various products
xi yj with their corresponding joint probabilities:

    xi yj         x1 y1         x1 y2         · · ·   x1 ym         x2 y1         · · ·   xn ym
    p(xi , yj )   p(x1 , y1 )   p(x1 , y2 )   · · ·   p(x1 , ym )   p(x2 , y1 )   · · ·   p(xn , ym )
Example 2.11
For the data in Example 2.2, find the distribution of the product of the
random variables X and Y .

Solution
Step 1
Indicate the various products with their corresponding joint probabilities:

    xi yj         −1(−1)   −1(0)   −1(1)   0(−1)   0(0)   0(1)   2(−1)   2(0)   2(1)
    p(xi , yj )   0.1      0.2     0.11    0.08    0.02   0.26   0.03    0.17   0.03

Step 2
Multiply the various values for X by the various values for Y :

    xy           1      0      −1     0      0      0      −2     0      2
    p(x, y)      0.1    0.2    0.11   0.08   0.02   0.26   0.03   0.17   0.03

Step 3
By the principle of equality of the probabilities of equivalent events we
obtain the table below:

    xy        −2     −1     0                                    1      2
    p(xy)     0.03   0.11   0.2 + 0.08 + 0.02 + 0.26 + 0.17      0.1    0.03

Step 4
Present the final result as in the table below:

    xy        −2     −1     0      1      2
    p(xy)     0.03   0.11   0.73   0.1    0.03

We can verify that this distribution of the product is a probability
distribution. That is,

    0 ≤ p(xy) ≤ 1

and

    Σ p(xy) = 1
Example 2.12
For the data in Example 2.3, find the distribution of the product of X and Y .

Solution
It has been shown in Example 2.3 that X and Y are independent. Therefore,
the various values of the product xy and their corresponding probabilities
are given in the following table:

    xi yj          −2(3)       −2(6)       4(3)        4(6)
    p(xi , yj )    0.4(0.7)    0.4(0.3)    0.6(0.7)    0.6(0.3)
By the principle of the equality of probabilities of equivalent events we
obtain the result summarised in the following table:

    xi yj        −6      −12     12      24
    p(xi yj )    0.28    0.12    0.42    0.18
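For independent discrete random variables, as in Example 2.12, the joint probabilities are simply products of the marginal probabilities, so the same grouping idea applies directly to the marginals. A minimal Python sketch (the data layout is illustrative only):

```python
from collections import defaultdict

# Marginals of Example 2.3: X takes -2, 4 and Y takes 3, 6, independently
g = {-2: 0.4, 4: 0.6}
h = {3: 0.7, 6: 0.3}

prod_pmf = defaultdict(float)
for x, px in g.items():
    for y, py in h.items():
        prod_pmf[x * y] += px * py   # independence: p(x, y) = g(x) h(y)

for v in sorted(prod_pmf):
    print(v, round(prod_pmf[v], 2))
# Reproduces Example 2.12: -12:0.12, -6:0.28, 12:0.42, 24:0.18
```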
Example 2.13
Suppose there are two independent distributions:

    x       −2     4
    g(x)    0.4    0.6

    y       −2     4
    h(y)    0.4    0.6

Find the distributions of (a) X² and (b) XY .

Solution
(a)

    x²        4      16
    p(x²)     0.4    0.6

(b)

    xy         4      −8     −8     16
    p(x, y)    0.16   0.24   0.24   0.36

so that

    xy       4      −8     16
    p(xy)    0.16   0.48   0.36
2.4.2 Product of Continuous Bivariate Random Variables

Theorem 2.8
Suppose that X and Y are continuous random variables having joint
density function f (x, y) and let V = XY . Then

    fV (v) = ∫_{−∞}^{∞} (1/|x|) f (x, v/x) dx

or equivalently,

    fV (v) = ∫_{−∞}^{∞} (1/|y|) f (v/y, y) dy

Corollary 2.2
Suppose that X and Y are independent, nonnegative continuous random
variables having joint density function f (x, y) with marginal probability
distributions g(x) and h(y) respectively and let V = XY . Then

    fV (v) = ∫_{0}^{∞} (1/x) h(v/x) g(x) dx,   v > 0

or equivalently,

    fV (v) = ∫_{0}^{∞} (1/y) g(v/y) h(y) dy,   v > 0
We can calculate the probability P(XY ≤ t) for independent, nonnegative
random variables X and Y as

    P(XY ≤ t) = ∫∫_{0<x, 0<y, xy≤t} g(x) h(y) dx dy = ∫_{0}^{t} fV (v) dv

The integration is restricted to the first quadrant, since g(x)h(y), the
joint density function of X and Y , is zero elsewhere.
Example 2.14
Refer to Example 1.3.

(a) Find the distribution of V = XY ;

(b) Using the distribution in (a), calculate P(1/2 ≤ V ≤ 1).

Solution
(a) XY = V =⇒ X = V /Y . Therefore

    0 < X < 1 =⇒ 0 ≤ V /Y ≤ 1 =⇒ 0 ≤ V ≤ Y

The region defined by the preceding inequality and 0 < Y < 2 is sketched
in the diagram below. Therefore

    fV (v) = ∫_{y=v}^{2} [v²/y³ + (1/3)(v/y)] dy
    fV (v) = [−v²/(2y²) + (v/3) ln y]_{y=v}^{2}
           = (v/3) ln(2/v) + 1/2 − v²/8,   0 < v ≤ 2

(b) Now,

    P(1/2 ≤ V ≤ 1) = ∫_{1/2}^{1} [(v/3) ln(2/v) + 1/2 − v²/8] dv
                   = [ v/2 − v³/24 + v²/12 + (v²/6) ln(2/v) ]_{1/2}^{1}
                   = 0.3338
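The value in Example 2.14(b) can be checked numerically from the density obtained in part (a) (a minimal sketch using a midpoint rule; the subdivision count is arbitrary):

```python
from math import log

# Density of V = XY from Example 2.14(a)
def f_V(v):
    return (v / 3.0) * log(2.0 / v) + 0.5 - v * v / 8.0 if 0 < v <= 2 else 0.0

# Midpoint rule for P(1/2 <= V <= 1)
n = 100000
h = 0.5 / n
prob = sum(f_V(0.5 + (i + 0.5) * h) for i in range(n)) * h
print(round(prob, 4))   # approximately 0.3338, agreeing with part (b)
```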
The probability distribution of the quotient of two discrete random
variables X/Y whose joint probability functions are in tabular form can be
calculated using the principle of the equality of the probabilities of two
equivalent events. This is demonstrated in Table 2.4.

Table 2.4  Various Quotients X/Y and their Corresponding Probabilities

    xi /yj        x1 /y1        x1 /y2        · · ·   x1 /ym        x2 /y1        · · ·   xn /ym
    p(xi , yj )   p(x1 , y1 )   p(x1 , y2 )   · · ·   p(x1 , ym )   p(x2 , y1 )   · · ·   p(xn , ym )
Example 2.15
Consider the distributions of two independent random variables X and Y .

    x       2      5
    g(x)    0.2    0.8

    y       4      8
    h(y)    0.7    0.3

Find the distributions of X/Y and Y /X.

Solution
Proceeding as before, the distribution of X/Y is:

    x/y       2/8 = 0.25   2/4 = 0.5   5/8 = 0.625   5/4 = 1.25
    p(x/y)    0.06         0.14        0.24          0.56

The distribution of Y /X follows similarly and is given in the table below:

    y/x       4/5 = 0.8   8/5 = 1.6   4/2 = 2   8/2 = 4
    p(y/x)    0.56        0.24        0.14      0.06
Theorem 2.9
Suppose that X and Y are continuous random variables having joint
density function f (x, y) and let W = X/Y . Then

    fW (w) = ∫_{−∞}^{∞} |y| f (wy, y) dy

Corollary 2.3
Suppose that X and Y are independent, nonnegative continuous random
variables having joint density function f (x, y) with marginal probability
distributions g(x) and h(y) respectively, and let W = X/Y . Then

    fW (w) = ∫_{0}^{∞} y g(wy) h(y) dy

We can calculate the cumulative distribution function of X/Y for
independent, nonnegative random variables X and Y by

    P(X/Y ≤ t) = ∫∫_{0 < x/y ≤ t, 0 < y < ∞} g(x) h(y) dx dy
               = ∫_{0}^{t} ∫_{0}^{∞} y g(wy) h(y) dy dw
               = ∫_{0}^{t} fW (w) dw
Example 2.16
Refer to Example 1.3.

(a) Find the distribution of W = X/Y ;

(b) Using the distribution in (a), calculate P(W ≤ 1/4).

Solution
(a) Distribution of W = X/Y :

    X/Y = W =⇒ X = Y W

Therefore

    0 < X < 1 =⇒ 0 ≤ Y W ≤ 1 =⇒ 0 ≤ W ≤ 1/Y

The region defined by the preceding inequality and 0 < Y < 2 is sketched
in the diagram below.
It can be seen from the sketch that the region of integration is
partitioned into two, namely,

    0 < Y ≤ 2        when 0 < W ≤ 1/2
    0 < Y ≤ 1/W      when 1/2 ≤ W < ∞

Hence

    fW (w) = ∫_{0}^{2} (w² + w/3) y³ dy,       0 ≤ w < 1/2

    fW (w) = ∫_{0}^{1/w} (w² + w/3) y³ dy,     1/2 ≤ w < ∞

Therefore

    fW (w) =  4(w² + w/3),             0 ≤ w < 1/2
              (1/(4w³))(w + 1/3),      1/2 ≤ w < ∞
1
(b) Calculation of P W ≤ .
4
X 1 1
P ≤ = P W ≤
Y 4 4
1
4 w
= 4 w2 + dw
0 3
1
=
16
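A numerical cross-check of Example 2.16(b) directly from the joint density (a rough sketch with a midpoint Riemann sum; the grid size is arbitrary): P(X/Y ≤ 1/4) should be approximately 1/16 = 0.0625.

```python
# Numerical check of P(X/Y <= 1/4) for f(x, y) = x^2 + x*y/3 on 0<x<1, 0<y<2
def f(x, y):
    return x * x + x * y / 3.0

n = 1000
hx, hy = 1.0 / n, 2.0 / n
prob = 0.0
for i in range(n):
    x = (i + 0.5) * hx
    for j in range(n):
        y = (j + 0.5) * hy
        if x / y <= 0.25:
            prob += f(x, y) * hx * hy

print(prob, 1 / 16)   # both approximately 0.0625
```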
EXERCISES

2.15 Suppose X and Y are independent Normal random variables, where X is
     N(µ1 , σ1²) and Y is N(µ2 , σ2²). Use the convolution theorem to show
     that X + Y is N(µ1 + µ2 , σ1² + σ2²).

2.16 Refer to Example 1.15. Compute P(X + Y > 2).

     … and that of Y by

         MY (t) = [(3e^t + 1)/4]^{10}
2.23 Refer to Exercise 1.10. Find
     (a) the distribution of the quotient W = Y /X;
     (b) (i) P(X/Y = 1); (ii) P(W = 2); (iii) P(X/Y = 3).

2.24 Refer to Exercise 1.11. Find
     (a) the distribution of the quotient W = X/Y ;
     (b) P(W = 5).

2.25 Refer to Exercise 1.12, where k = 2/3. Find
     (a) the distribution of X − Y ;
     (b) P(−1/2 ≤ X − Y ≤ 1).

2.26 Refer to Exercise 1.12, where k = 2/3. Find
     (a) the distribution of XY ;
     (b) P(XY ≤ 4/5).

2.27 Refer to Exercise 1.12, where k = 2/3. Find
     (a) the distribution of X/Y ;
     (b) P(X/Y < 10).
Chapter 3
3.1 INTRODUCTION

In the earlier text⁵, we discussed the numerical characterisation of a
single random variable. Specifically, we discussed in Chapter 6 the
concepts of expectation and variance. In this chapter, we shall extend the
concepts of expectation and variance to the case of bivariate distributions
and discuss a few more properties. We shall also realise that in this case
there arises the concept of covariance.

⁵ Nsowah-Nuamah, 2017

3.2 EXPECTATION OF BIVARIATE RANDOM VARIABLES

Example 3.1
For the data in Example 2.3, find E(X) and E(Y ).

Solution
For convenience we present the data here:

    xi        −2     4
    p(xi )    0.4    0.6

    E(X) = −2(0.4) + 4(0.6) = 1.6

    yj        3      6
    p(yj )    0.7    0.3

    E(Y ) = 3(0.7) + 6(0.3) = 3.9
Definition 3.1
Let (X, Y ) be a two-dimensional discrete random variable and let
Z = H(X, Y ). Then

    E(Z) = Σ_{i=1}^{∞} Σ_{j=1}^{∞} H(xi , yj ) p(xi , yj )

For the finite case,

    E(Z) = Σ_{i=1}^{n} Σ_{j=1}^{m} H(xi , yj ) p(xi , yj )

That is,

    E[H(X, Y )] = H(x1 , y1 ) p(x1 , y1 ) + H(x1 , y2 ) p(x1 , y2 ) + · · ·
                  + H(x1 , ym ) p(x1 , ym ) + H(x2 , y1 ) p(x2 , y1 )
                  + H(x2 , y2 ) p(x2 , y2 ) + · · · + H(x2 , ym ) p(x2 , ym )
                  + · · · + H(xn , y1 ) p(xn , y1 ) + H(xn , y2 ) p(xn , y2 )
                  + · · · + H(xn , ym ) p(xn , ym )
Example 3.2
For the data in Example 2.3, find the expectation of the function

    H(X, Y ) = X² Y

Solution

    E(X² Y ) = {(x1 )²(y1 )} p(x1 , y1 ) + {(x1 )²(y2 )} p(x1 , y2 )
               + {(x2 )²(y1 )} p(x2 , y1 ) + {(x2 )²(y2 )} p(x2 , y2 )
             = (−2)²(3)(0.28) + (−2)²(6)(0.12) + (4)²(3)(0.42) + (4)²(6)(0.18)
             = 43.68
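Definition 3.1 is just a double sum over the joint table, which is easy to mirror in code. The following minimal Python sketch (the data layout and helper name are illustrative only) reproduces the result of Example 3.2.

```python
# Joint pmf of Example 2.3 (independent marginals): X in {-2, 4}, Y in {3, 6}
joint = {(-2, 3): 0.28, (-2, 6): 0.12, (4, 3): 0.42, (4, 6): 0.18}

def expectation(H):
    """E[H(X, Y)] = sum_i sum_j H(x_i, y_j) p(x_i, y_j)  (Definition 3.1)."""
    return sum(H(x, y) * p for (x, y), p in joint.items())

print(round(expectation(lambda x, y: x * x * y), 2))   # 43.68, as in Example 3.2
print(round(expectation(lambda x, y: x * y), 2))       # E(XY) = 6.24
```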
Theorem 3.1
Let (X, Y ) be a two-dimensional discrete random variable and let
H(X, Y ) = XY . Then

    E(XY ) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi yj p(xi , yj )
Example 3.3
Refer to Example 2.2. Find E(XY ) from the parent joint probability
distribution.

Solution
The parent distribution of XY is given in the table below.

                          Y
    X              −1        0         1        Row Totals
    −1             0.10      0.20      0.11     0.41
                   (1)       (0)       (−1)
    0              0.08      0.02      0.26     0.36
                   (0)       (0)       (0)
    2              0.03      0.17      0.03     0.23
                   (−2)      (0)       (2)
    Column Totals  0.21      0.39      0.40     1.00

The values in parentheses are for xy. Multiplying these values by their
corresponding probabilities, we obtain the following table:

                Y
    X       −1       0        1
    −1       0.10    0.00    −0.11
    0        0.00    0.00     0.00
    2       −0.06    0.00     0.06

Hence,

    E(XY ) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi yj p(xi , yj )
           = 0.10 + 0.00 + (−0.11) + 0.00 + · · · + 0.00 + 0.06
           = −0.01

Corollary 3.1
Let (X, Y ) be a two-dimensional discrete random variable and let
H(X, Y ) = X and H(X, Y ) = Y in turn. Then

    E(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi p(xi , yj )

and

    E(Y ) = Σ_{i=1}^{n} Σ_{j=1}^{m} yj p(xi , yj )
Example 3.4
For the data in Example 2.2, find the expectation of the functions
(a) H(X, Y ) = X;  (b) H(X, Y ) = Y .

Solution
(a) From the table in Example 3.3, we calculate the values of the cells
with the xi values. Thus, for x1 = −1,

    Σ_{j=1}^{3} x1 p(x1 , yj ) = (−0.1) + (−0.2) + (−0.11) = −0.41

Similarly, for x2 = 0,

    Σ_{j=1}^{3} x2 p(x2 , yj ) = 0 + 0 + 0 = 0

and for x3 = 2,

    Σ_{j=1}^{3} x3 p(x3 , yj ) = 0.06 + 0.34 + 0.06 = 0.46

The results are presented in the table below:

                Y
    X       −1       0        1        Row Totals
    −1      −0.1     −0.2     −0.11    −0.41
    0        0        0        0        0
    2        0.06     0.34     0.06     0.46

Hence

    E(X) = Σ_{i=1}^{3} Σ_{j=1}^{3} xi p(xi , yj ) = −0.41 + 0 + 0.46 = 0.05

(b) Similarly, for y1 = −1,

    Σ_{i=1}^{3} y1 p(xi , y1 ) = (−0.10) + (−0.08) + (−0.03) = −0.21

for y2 = 0,

    Σ_{i=1}^{3} y2 p(xi , y2 ) = 0 + 0 + 0 = 0

and for y3 = 1,

    Σ_{i=1}^{3} y3 p(xi , y3 ) = 0.11 + 0.26 + 0.03 = 0.40

                Y
    X        −1       0       1
    −1       −0.1     0       0.11
    0        −0.08    0       0.26
    2        −0.03    0       0.03
    Total    −0.21    0       0.40

Hence

    E(Y ) = Σ_{j=1}^{3} Σ_{i=1}^{3} yj p(xi , yj ) = −0.21 + 0 + 0.40 = 0.19
Theorem 3.2
Let (X, Y ) be a two-dimensional discrete random variable and let
H(X, Y ) = XY . Then

    E(XY ) = Σ_{xy} xy p(xy)
Example 3.5
Refer to Example 2.2. Find E(XY ) from the derived joint probability
distribution.

Solution
From Example 2.11, we have the following:

    xy            −2      −1      0      1      2
    p(xy)          0.03    0.11    0.73   0.1    0.03
    xy p(xy)      −0.06   −0.11    0      0.1    0.06

Hence

    E(XY ) = Σ_{xy} xy p(xy)
           = −0.06 + (−0.11) + 0 + 0.1 + 0.06
           = −0.01
Theorem 3.3
Let (X, Y ) be a two-dimensional random variable with probability
mass function p(x, y) and let H(X, Y ) = X. Then the function

    E(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi p(xi , yj )

simplifies to

    E(X) = Σ_{i=1}^{n} xi p(xi )

Proof

    E(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi p(xi , yj )
         = Σ_{i=1}^{n} xi Σ_{j=1}^{m} p(xi , yj )

(since the x values do not depend on y and can therefore be taken outside
the summation over y)

         = Σ_{i=1}^{n} xi p(xi )

(from the definition of the marginal probability mass function of X in
Definition 1.11)

Similarly,

    E(Y ) = Σ_{i=1}^{n} Σ_{j=1}^{m} yj p(xi , yj )

which also reduces to

    E(Y ) = Σ_{j=1}^{m} yj p(yj )

Example 3.6
Refer to Example 2.2. Using the marginal distributions, find
(a) E(X);  (b) E(Y ).
Solution
(a) From Example 2.11, the marginal distribution of X is:

    x          −1      0       2
    p(x)        0.41    0.36    0.23
    x p(x)     −0.41    0       0.46

Hence

    E(X) = Σ_{i=1}^{n} xi p(xi ) = −0.41 + 0 + 0.46 = 0.05

(b) Similarly, the marginal distribution of Y is:

    y          −1      0       1
    p(y)        0.21    0.39    0.4
    y p(y)     −0.21    0       0.4

Hence

    E(Y ) = Σ_{j=1}^{m} yj p(yj ) = −0.21 + 0 + 0.4 = 0.19
Definition 3.2
Let (X, Y ) be a two-dimensional random variable with probability
density function f (x, y) and let Z = H(X, Y ). Then

    E(Z) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} H(x, y) f (x, y) dx dy

Theorem 3.4
Let (X, Y ) be a two-dimensional random variable with probability
density function f (x, y) and let H(X, Y ) = XY . Then the expectation
of the product of X and Y is

    E(XY ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f (x, y) dx dy
Example 3.7
Refer to Example 1.3. Find E(XY ).

Solution

    E(XY ) = ∫_{0}^{2} ∫_{0}^{1} xy (x² + xy/3) dx dy
           = ∫_{0}^{2} ∫_{0}^{1} (x³ y + x² y²/3) dx dy
           = ∫_{0}^{2} (y/4 + y²/9) dy
           = 4/8 + 8/27
           = 43/54
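A quick numerical cross-check of Example 3.7 (a rough sketch using a midpoint Riemann sum; the grid size is arbitrary): E(XY) should be approximately 43/54 ≈ 0.7963.

```python
# Numerical check of E(XY) for f(x, y) = x^2 + x*y/3 on 0<x<1, 0<y<2
def f(x, y):
    return x * x + x * y / 3.0

n = 1000
hx, hy = 1.0 / n, 2.0 / n
exy = 0.0
for i in range(n):
    x = (i + 0.5) * hx
    for j in range(n):
        y = (j + 0.5) * hy
        exy += x * y * f(x, y) * hx * hy

print(exy, 43 / 54)   # both approximately 0.7963
```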
Definition 3.3
Let (X, Y ) be a two-dimensional continuous random variable with
probability density function f (x, y). Then the expectations of the
marginal distributions of X and Y are given by

    E(X) = ∫_{−∞}^{∞} x g(x) dx

    E(Y ) = ∫_{−∞}^{∞} y h(y) dy

where g(x) and h(y) are the marginal probability density functions of
X and Y respectively.

Example 3.8
Refer to Example 1.3. Find (a) E(X);  (b) E(Y ).

Solution
To find the expectation of a random variable from a bivariate distribution
of X and Y we require the marginal probability distributions of X and Y .
The marginal probability distribution of X for Example 1.3 was found in
Example 1.10 to be

    g(x) =  2x² + (2/3)x,   0 < x < 1
            0,              elsewhere

and that of Y to be

    h(y) =  1/3 + y/6,   0 < y < 2
            0,           elsewhere

Hence

    E(X) = ∫_{0}^{1} x (2x² + (2/3)x) dx = 1/2 + 2/9 = 13/18

and

    E(Y ) = ∫_{0}^{2} y (1/3 + y/6) dy = 2/3 + 4/9 = 10/9
Similar to the univariate case, we shall assume in our discussions of the
properties of the expectation of bivariate random variables that both
variables have finite expectations.
Theorem 3.5
If X1 , X2 , ..., Xn have the same distribution, then they possess a com-
mon expectation E(Xi ) = µ
Note
The expression “have the same distribution” is equivalent to the expression
“are identically distributed”. Identical distribution means that the probabil-
ity density function or the probability mass function of the random variable
remains the same from trial to trial.
For example, X and Y are identically distributed if their probability
distribution functions are
Theorem 3.6
Let X and Y be any two random variables with expectations E(X)
and E(Y ) respectively. The expectation of their sum is the sum
of their expectations. That is,

    E(X + Y ) = E(X) + E(Y )

Proof
We shall first prove the discrete case.

    E(X + Y ) = Σ_{i=1}^{n} Σ_{j=1}^{m} (xi + yj ) p(xi , yj )
              = Σ_{i=1}^{n} Σ_{j=1}^{m} xi p(xi , yj ) + Σ_{i=1}^{n} Σ_{j=1}^{m} yj p(xi , yj )
              = Σ_{i=1}^{n} xi Σ_{j=1}^{m} p(xi , yj ) + Σ_{j=1}^{m} yj Σ_{i=1}^{n} p(xi , yj )
              = Σ_{i=1}^{n} xi p(xi ) + Σ_{j=1}^{m} yj p(yj )

(following from the logic of the proof of Theorem 3.3)

              = E(X) + E(Y )
Example 3.9
For the data in Example 2.2, find E(X + Y ),

(a) using the parent joint probability distribution table;

(b) using the derived joint probability distribution table.

Solution
(a) The distribution of X + Y from the parent joint probability
distribution table is reproduced from the solution for Example 2.2, taking
note that the values for x + y are those in parentheses:

                          Y
    X              −1       0        1        Row Totals
    −1             0.10     0.20     0.11     0.41
                   (−2)     (−1)     (0)
    0              0.08     0.02     0.26     0.36
                   (−1)     (0)      (1)
    2              0.03     0.17     0.03     0.23
                   (1)      (2)      (3)
    Column Totals  0.21     0.39     0.40     1.00

To obtain E(X + Y ), we multiply the values for x + y by their
corresponding probabilities and obtain the following table:

                Y
    X        −1       0        1
    −1       −0.20    −0.20    0.00
    0        −0.08     0.00    0.26
    2         0.03     0.34    0.09

Hence,

    E(X + Y ) = Σ_{i=1}^{n} Σ_{j=1}^{m} (xi + yj ) p(xi , yj )
              = −0.20 + (−0.20) + 0.00 + (−0.08) + · · · + 0.34 + 0.09
              = 0.24

(b) Using the derived distribution of X + Y , we have the following table:

    (x + y)               −2      −1      0      1      2      3
    p(x + y)               0.1     0.28    0.13   0.29   0.17   0.03
    (x + y) p(x + y)      −0.2    −0.28    0      0.29   0.34   0.09

Hence,

    E(X + Y ) = Σ (xi + yj ) p(xi + yj )
              = −0.2 + (−0.28) + 0 + 0.29 + 0.34 + 0.09
              = 0.24
Corollary 3.2
Let X1 , X2 , ..., Xn be n random variables, where n is finite. Then the
expectation of the sum Sn = X1 + X2 + ... + Xn is the sum of the
expectations of the individual random variables. Thus,

    E(Sn ) = E(X1 ) + E(X2 ) + ... + E(Xn )

Corollary 3.3
If X1 , X2 , ..., Xn have the same distribution with E(Xi ) = µ for all
1 ≤ i ≤ n, then the expectation of their sum Sn = X1 + X2 + ... + Xn is

    E(Sn ) = nµ

Proof
Since X1 , X2 , ..., Xn have the same distribution, from Theorem 3.5 they
have a common expectation µ, so that

    E(Sn ) = E(X1 ) + E(X2 ) + ... + E(Xn ) = nµ

Example 3.10
Refer to Example 2.13. Find E(X + Y ).

Solution
Since X and Y have the same distribution with common expectation
E(X) = E(Y ) = −2(0.4) + 4(0.6) = 1.6, Theorem 3.6 gives

    E(X + Y ) = E(X) + E(Y ) = 1.6 + 1.6 = 3.2

Theorem 3.7
Let X and Y be any two random variables with expectations E(X)
and E(Y ) respectively. The expectation of their difference is the
difference of their expectations. That is,

    E(X − Y ) = E(X) − E(Y )
Example 3.11
For the table in Example 2.2, verify that

    E(X − Y ) = E(X) − E(Y )

Solution
The distribution of X − Y is given in Example 2.8. Hence, to obtain
E(X − Y ), we have the following table:

    x − y                 −2       −1      0      1       2       3
    p(x − y)               0.11     0.46    0.12   0.11    0.17    0.03
    (x − y) p(x − y)      −0.22    −0.46    0      0.11    0.34    0.09

Hence,

    E(X − Y ) = Σ (x − y) p(x − y)
              = −0.22 + (−0.46) + 0 + 0.11 + 0.34 + 0.09
              = −0.14

which agrees with

    E(X) − E(Y ) = 0.05 − 0.19 = −0.14

from Example 3.4.

Property 4
Theorem 3.8
If a and b are constants, then for any random variables X and Y

    E(aX + bY ) = a E(X) + b E(Y )

Corollary 3.4
Let X1 , X2 , ..., Xn be n random variables. Then

    E(c1 X1 + c2 X2 + ... + cn Xn ) = Σ_{i=1}^{n} ci E(Xi )

The proofs of Theorem 3.8 and its corollary are similar to the proofs of
Theorem 3.6 and its corollaries.

Note
Theorems 3.6, 3.7 and 3.8 and their corollaries hold whether or not the
random variables involved are independent.

Example 3.12
For the data of Example 3.1, if a = 3 and b = 4, verify whether Property 4
is valid.

Solution
From Example 3.1, E(X) = 1.6 and E(Y ) = 3.9, so that

    3E(X) + 4E(Y ) = 3(1.6) + 4(3.9) = 20.4

Now, the distribution of 3X + 4Y is

    3x + 4y        6       18      24      36
    p(3x + 4y)     0.28    0.12    0.42    0.18

so that

    E(3X + 4Y ) = 6(0.28) + 18(0.12) + 24(0.42) + 36(0.18) = 20.4

Hence E(3X + 4Y ) = 3E(X) + 4E(Y ), and Property 4 is valid.
The last term appears sufficiently often to deserve a separate name and
treatment. It is called the covariance of X and Y and denoted as
Cov(X, Y ). The concept of covariance is discussed in detail in the next
chapter.

Example 3.13
For the table in Example 2.2, obtain E{[X − E(X)][Y − E(Y )]}.

Solution
It has been proved in Theorem 4.1 in the sequel that

    E{[X − E(X)][Y − E(Y )]} = E(XY ) − E(X)E(Y )

From Example 3.3,

    E(XY ) = −0.01

From Example 3.4,

    E(X) = 0.05   and   E(Y ) = 0.19

Hence,

    E{[X − E(X)][Y − E(Y )]} = E(XY ) − E(X)E(Y )
                             = −0.01 − 0.05(0.19)
                             = −0.0195
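The value obtained in Example 3.13 can be confirmed directly from the joint table. A minimal Python sketch (the dictionary layout is illustrative only): computing E(XY) − E(X)E(Y) and the defining expectation E{[X − E(X)][Y − E(Y)]} give the same number.

```python
# Joint pmf of Example 2.2
joint = {(-1, -1): 0.10, (-1, 0): 0.20, (-1, 1): 0.11,
         ( 0, -1): 0.08, ( 0, 0): 0.02, ( 0, 1): 0.26,
         ( 2, -1): 0.03, ( 2, 0): 0.17, ( 2, 1): 0.03}

ex  = sum(x * p for (x, y), p in joint.items())          # E(X)  = 0.05
ey  = sum(y * p for (x, y), p in joint.items())          # E(Y)  = 0.19
exy = sum(x * y * p for (x, y), p in joint.items())      # E(XY) = -0.01

cov_formula = exy - ex * ey
cov_direct  = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())

print(round(cov_formula, 4), round(cov_direct, 4))   # both -0.0195
```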
Theorem 3.10
Let X and Y be two independent random variables with expectations
E(X) and E(Y ) respectively. Then the expectation of their product
is equal to the product of their expectations:

    E(XY ) = E(X)E(Y )

Proof
For the discrete case, since X and Y are independent, p(xi , yj ) = g(xi )h(yj ),
so that

    E(XY ) = Σ_{i=1}^{n} Σ_{j=1}^{m} xi yj g(xi ) h(yj )
           = Σ_{i=1}^{n} xi g(xi ) Σ_{j=1}^{m} yj h(yj )
           = E(X)E(Y )

The proof of the bivariate continuous case is similar.

Note
The converse of Theorem 3.10 does not hold, that is, the random variables
X and Y may satisfy the relation E(XY ) = E(X)E(Y ) without being
independent.

Example 3.14
For the data in Example 2.3, verify whether

    E(XY ) = E(X)E(Y )

Solution
The distribution of XY is given in Example 2.12. Hence, to obtain E(XY ),
we have the following table:

    xy            −6       −12      12      24
    p(xy)          0.28     0.12     0.42    0.18
    xy p(xy)      −1.68    −1.44     5.04    4.32

Hence,

    E(XY ) = −1.68 + (−1.44) + 5.04 + 4.32 = 6.24

It has been shown in Example 2.3 that X and Y are independent. Hence,

    E(X)E(Y ) = 1.6 × 3.9 = 6.24

so that E(XY ) = E(X)E(Y ), as required.
Corollary 3.5
Let X1 , X2 , · · · , Xn be independent random variables. Then

    E(X1 X2 · · · Xn ) = E(X1 )E(X2 ) · · · E(Xn )

Proof
We shall demonstrate the case of three random variables X, Y and Z. The
product XY Z may be written as (XY )Z. Using Theorem 3.10, we obtain

    E(XY Z) = E[(XY )Z] = E(XY )E(Z) = E(X)E(Y )E(Z)
Theorem 3.11
Let X and Y be two random variables with expectations E(X) and
E(Y ) respectively. Then the expectation of their quotient is
approximately

    E(X/Y ) ≈ E(X)/E(Y ) − Cov(X, Y )/[E(Y )]² + E(X) Var(Y )/[E(Y )]³

Note
The expectation of the quotient X/Y may not exist even though the moments
of X and Y exist.
Theorem 3.12
Suppose X and Y are two random variables with expectations E(X)
and E(Y ) respectively. If X ≤ Y , then

    E(X) ≤ E(Y )

Proof
We write

    Y = X + (Y − X)

where Y − X ≥ 0. By Theorem 3.6, we have

    E(Y ) = E(X) + E(Y − X) ≥ E(X)
Theorem 3.13
Suppose X and Y are bivariate random variables with finite pth and
qth order moments respectively, where p and q are positive numbers
satisfying 1/p + 1/q = 1. Then XY has a finite first moment and

    E(|XY |) ≤ [E(|X|^p)]^{1/p} [E(|Y |^q)]^{1/q}

The equality holds if and only if E(|Y |^q) = c E(|X|^p) for some constant
c, or X = 0.
Property 10   Cauchy-Schwartz’ Inequality of Expectation
Theorem 3.14
Suppose X and Y are two random variables with second moments
E(X²) and E(Y ²) respectively. Then E(XY ) exists and

    [E(XY )]² ≤ E(X²)E(Y ²)
Proof
Case 1
If E(X 2 ) or E(Y 2 ) is infinite, the theorem is valid.
Case 2
If E(X 2 ) = 0 then we must have P (X = 0) = 1 so that P (XY = 0) = 1
and E(XY ) = 0, hence again the theorem is valid. The same argument
applies if E(Y 2 ) = 0, so that we may assume that 0 < E(X 2 ) < ∞ and
0 < E(Y 2 ) < ∞.
Case 3
Suppose we have a random variable

    Zt = tX + Y
Let us consider the real-valued function

    E(Zt ²) = E[(tX + Y )²]

For every t we have

    E[(tX + Y )²] ≥ 0

since (tX + Y )² ≥ 0, so that

    E(t² X² + 2tXY + Y ²) ≥ 0          (i)
The term on the left hand side of (i) is a quadratic function in t, which is
always nonnegative for all t. Hence, it does not have more than one real
root. Its discriminant must, therefore, be less than or equal to zero. The
discriminant “b2 − 4ac” for the quadratic equation (i) in t is
    [2E(XY )]² − 4E(X²)E(Y ²) ≤ 0

or

    4[E(XY )]² ≤ 4E(X²)E(Y ²)
from which the result follows.
Note
(a) Equality in Theorem 3.14 holds if and only if one of the random
    variables equals a constant multiple of the other, say, Y = tX for
    some constant t, or at least one of them is zero, say, X = 0;
(b) This inequality is sometimes simply called Schwartz’ inequality;
(c) Cauchy-Schwartz’ inequality is a special case of Hölder’s inequality
    when p = q = 2.
Theorem 3.15
Suppose X and Y are two random variables with finite pth order
moments. Then for p ≥ 1
    [E(|X + Y|^p)]^(1/p) ≤ [E(|X|^p)]^(1/p) + [E(|Y|^p)]^(1/p)
Corollary 3.6
Suppose X and Y are two random variables with expectations E(X) and E(Y) respectively. Then
    E(|X + Y|) ≤ E(|X|) + E(|Y|)
The proof follows immediately from Theorem 3.15 for the case when p = 1.
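A quick numerical check of these inequalities (Hölder with p = q = 2, i.e. Cauchy-Schwartz, and Minkowski with p = 2) can be written in a few lines of Python; the sample values below are arbitrary and serve only as an illustration.

    import random

    random.seed(2)
    n = 10000
    xs = [random.uniform(-1, 2) for _ in range(n)]
    ys = [random.gauss(0, 1) for _ in range(n)]

    def mean(v):
        return sum(v) / len(v)

    # Cauchy-Schwartz (Hoelder with p = q = 2): E|XY| <= sqrt(E X^2) * sqrt(E Y^2)
    lhs = mean([abs(x * y) for x, y in zip(xs, ys)])
    rhs = mean([x * x for x in xs]) ** 0.5 * mean([y * y for y in ys]) ** 0.5
    print(lhs <= rhs + 1e-12)

    # Minkowski with p = 2: (E|X+Y|^2)^(1/2) <= (E|X|^2)^(1/2) + (E|Y|^2)^(1/2)
    lhs2 = mean([(x + y) ** 2 for x, y in zip(xs, ys)]) ** 0.5
    rhs2 = mean([x * x for x in xs]) ** 0.5 + mean([y * y for y in ys]) ** 0.5
    print(lhs2 <= rhs2 + 1e-12)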
Theorem 3.16
Suppose the random variable M is the frequency of occurrence of the
event A (number of successes) in n independent and identical trials.
Then
E(M ) = np
Proof
Let X1 , X2 , · · · , Xn be n independent random variables which are identically
distributed. Suppose
    Xi = 1, if event A occurs in the ith trial
    Xi = 0, otherwise
so that
    E(Xi) = pi = p
Letting
M = X1 + X2 + · · · + Xn
we have
    E(M) = E(X1) + E(X2) + · · · + E(Xn) = np

Example 3.15
A box contains 20 black, 20 red and 10 green balls. If 25 balls are selected at random with replacement, find the expectation of the frequency of occurrence of the red balls.

Solution
Let A = {the occurrence of red balls}. Then
    p = P(A) = 20/50 = 2/5
Hence the expectation of the frequency of occurrence of the red balls is
    E(M) = np = 25 (2/5) = 10
That is, if we select 25 balls from a box containing 20 black, 20 red
and 10 green balls at random with replacement, we are likely to have the
red ball appearing 10 times.
Theorem 3.17
Let the random variable M/n be the relative frequency of the event A (or proportion of success) among the n independent and identical trials of experiment E. Then the expectation of the relative frequency equals the probability of the event:
    E(M/n) = p
Proof
Using Property 4 of expectation (Theorem 3.7),
    E(M/n) = (1/n) E(M)
           = (1/n)(np)        (from Theorem 3.16)
           = p
The above theorem says that the expected relative frequency of the event A
is p, where p = P (A). This is intuitively clear and it establishes a connection
between the relative frequency of an event and the probability of that event.
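This connection can also be seen by simulation. The short Python sketch below (an illustration, not part of the text) repeats the experiment of drawing n = 25 balls with replacement from a box in which 20 of the 50 balls are red, and averages the observed frequency and relative frequency of red.

    import random

    random.seed(3)
    n, p, trials = 25, 20 / 50, 10000

    freqs = []
    for _ in range(trials):
        m = sum(1 for _ in range(n) if random.random() < p)  # number of red balls drawn
        freqs.append(m)

    print(sum(freqs) / trials)                 # close to E(M) = np = 10
    print(sum(f / n for f in freqs) / trials)  # close to E(M/n) = p = 0.4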
Example 3.16
Refer to Example 3.15. Find the expectation of the relative frequency of the
occurrence of the red balls.
Solution
From Example 3.15,
    p = 2/5
so that
    q = 1 − p = 3/5
Hence, the expectation of the relative frequency of occurrence of the red balls is
    E(M/n) = p = 2/5

3.3 VARIANCE OF BIVARIATE RANDOM VARIABLES
Definition 3.4
Let (X, Y ) be a two-dimensional discrete random variable with prob-
ability mass function p(xi , yj ).
If H(X, Y) = (X − µX)², then
    Var(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} (xi − µX)² p(xi, yj)
If H(X, Y) = (Y − µY)², then
    Var(Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} (yj − µY)² p(xi, yj)
where µX = E(X) and µY = E(Y).

Example 3.17
For the data in Example 2.3, find the variance of (a) X (b) Y.

Solution
(a) From Example 3.1, E(X) = 1.6. Hence,
    Var(X) = (x1 − µX)² p(x1, y1) + (x1 − µX)² p(x1, y2) + (x2 − µX)² p(x2, y1) + (x2 − µX)² p(x2, y2)
           = (−2 − 1.6)²(0.28) + (−2 − 1.6)²(0.12) + (4 − 1.6)²(0.42) + (4 − 1.6)²(0.18)
           = 8.64
(b) From Example 3.1, E(Y) = 3.9. Hence,
    Var(Y) = (y1 − µY)² p(x1, y1) + (y2 − µY)² p(x1, y2) + (y1 − µY)² p(x2, y1) + (y2 − µY)² p(x2, y2)
           = (3 − 3.9)²(0.28) + (6 − 3.9)²(0.12) + (3 − 3.9)²(0.42) + (6 − 3.9)²(0.18)
           = 1.89
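For readers who want to reproduce such calculations, the following Python sketch (an illustration only) computes Var(X) and Var(Y) directly from the joint probability table of Example 2.3.

    # Joint pmf of Example 2.3: x in {-2, 4}, y in {3, 6}
    pmf = {(-2, 3): 0.28, (-2, 6): 0.12, (4, 3): 0.42, (4, 6): 0.18}

    EX = sum(x * p for (x, y), p in pmf.items())
    EY = sum(y * p for (x, y), p in pmf.items())
    VarX = sum((x - EX) ** 2 * p for (x, y), p in pmf.items())
    VarY = sum((y - EY) ** 2 * p for (x, y), p in pmf.items())

    print(EX, EY)                        # 1.6, 3.9
    print(round(VarX, 2), round(VarY, 2))  # 8.64, 1.89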
Definition 3.5
Let (X, Y) be a two-dimensional continuous random variable with probability density function f(x, y).
If H(X, Y) = (X − µX)², then
    Var(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µX)² f(x, y) dy dx
If H(X, Y) = (Y − µY)², then
    Var(Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (y − µY)² f(x, y) dy dx

Definition 3.6
Let (X, Y) be a two-dimensional discrete random variable with probability mass function p(x, y). Then the variances of the marginal distributions of X and Y are given by
    Var(X) = Σ_{all x} (x − µX)² p(x)
    Var(Y) = Σ_{all y} (y − µY)² p(y)
where p(x) and p(y) are the marginal probability mass functions of X and Y, respectively.

Example 3.18
For the data in Example 2.3, find (a) Var(X) (b) Var(Y).

Solution
(a) Using the marginal distribution of X,
    Var(X) = Σ_{all x} (x − µX)² p(x) = (−2 − 1.6)²(0.4) + (4 − 1.6)²(0.6) = 8.64
(b) Using the marginal distribution of Y,
    Var(Y) = Σ_{all y} (y − µY)² p(y) = (3 − 3.9)²(0.7) + (6 − 3.9)²(0.3) = 1.89
These agree with the values obtained in Example 3.17.
Definition 3.7
Let (X, Y ) be a two-dimensional continuous random variable with
probability density function f (x, y). Then the variance of the
marginal distributions of X and Y are given by
    Var(X) = ∫_{−∞}^{∞} (x − µX)² g(x) dx
    Var(Y) = ∫_{−∞}^{∞} (y − µY)² h(y) dy
where g(x) and h(y) are the marginal probability density functions of X and Y respectively
Example 3.19
Refer to Example 1.3. Find (a) Var(X) (b) Var(Y ).
Solution
(a) From Example 3.6, E(X) = 13/18.
Now
    E(X²) = ∫₀¹ x² (2x² + 2x/3) dx
          = ∫₀¹ (2x⁴ + 2x³/3) dx
          = [2x⁵/5 + 2x⁴/12]₀¹
          = 2/5 + 2/12
          = 17/30
Therefore
    Var(X) = 17/30 − (13/18)² = 0.04506

(b) From Example 3.5, E(Y) = 10/9.
Now
    E(Y²) = ∫₀² y² (1/3 + y/6) dy
          = ∫₀² (y²/3 + y³/6) dy
          = [y³/9 + y⁴/24]₀²
          = 8/9 + 16/24
          = 14/9
Hence
    Var(Y) = 14/9 − (10/9)² = 0.32099
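A simple numerical check of these two variances can be carried out in Python by integrating the marginal densities used above, g(x) = 2x² + 2x/3 on (0, 1) and h(y) = 1/3 + y/6 on (0, 2), with a midpoint rule; this sketch is an illustration only.

    def midpoint(f, a, b, n=100000):
        dx = (b - a) / n
        return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

    g = lambda x: 2 * x**2 + 2 * x / 3   # marginal density of X on (0, 1)
    h = lambda y: 1 / 3 + y / 6          # marginal density of Y on (0, 2)

    EX  = midpoint(lambda x: x * g(x), 0, 1)
    EX2 = midpoint(lambda x: x**2 * g(x), 0, 1)
    EY  = midpoint(lambda y: y * h(y), 0, 2)
    EY2 = midpoint(lambda y: y**2 * h(y), 0, 2)

    print(round(EX2 - EX**2, 5))   # ~0.04506
    print(round(EY2 - EY**2, 5))   # ~0.32099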
Theorem 3.18
If the random variables X1 , X2 , ..., Xn have the same distribution,
then they all have the same variance σ², that is,
    Var(Xi) = σ²,   1 ≤ i ≤ n
Theorem 3.19
If (X, Y) is a two-dimensional random variable, then
    Var(X ± Y) = Var(X) + Var(Y) ± 2 Cov(X, Y)
Proof
We shall give the proof for the case X + Y . The proof for the case X − Y
should be tried by the reader (see Exercise 3.13).
Var(X + Y ) = E{[X + Y − E(X + Y )]2 }
= E{[X + Y − E(X) − E(Y )]2 }
= E{([X − E(X)] + [Y − E(Y )])2 }
= E{(X − E(X))2 + (Y − E(Y ))2
+2(X − E(X))(Y − E(Y ))}
= E{X − E(X)}2 + E{Y − E(Y )}2
+2{E(X − E(X))(Y − E(Y ))}
= Var(X) + Var(Y ) + 2 Cov(X, Y )
Example 3.20
If Var(X) = 8, Var(Y) = 5 and Cov(X, Y) = 3, find
(a) Var(X + Y),    (b) Var(X − Y)

Solution
(a) Var(X + Y) = 8 + 5 + 2(3) = 19
(b) Var(X − Y) = 8 + 5 − 2(3) = 7

Unlike the expected value, the variance is not additive in general. However, with the additional assumption of independence, we obtain Theorem 3.20.

Corollary 3.7
If X1, X2, ..., Xn are n-dimensional random variables, then
    Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi) + 2 Σ_{i<j} Cov(Xi, Xj),
or equivalently,
    Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi) + Σ_{i≠j} Cov(Xi, Xj),
or equivalently,
    Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Σ_{j=1}^{n} Cov(Xi, Xj)

Property 3 Sum and Difference of Independent Random Variables

Theorem 3.20
If (X, Y) is a two-dimensional random variable and X and Y are independent, then
    Var(X ± Y) = Var(X) + Var(Y)
Proof
The theorem follows from Theorem 3.19 by noting that Cov(X, Y ) = 0 for
the case when X and Y are independent (see Theorem 4.7 in the sequel.)
Note
It is important to observe that
    E(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} E(Xi)
whether or not the Xi's are independent, but it is generally not the case that
    Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi)

Example 3.21
For the data in Example 2.3, verify that
(a) Var(X + Y) = Var(X) + Var(Y);    (b) Var(X − Y) = Var(X) + Var(Y).

Solution
It has been shown in Example 2.3 that X and Y are independent. Now

    (xi + yj)²      (−2 + 3)²   (−2 + 6)²   (4 + 3)²   (4 + 6)²
    p(xi)p(yj)      0.4(0.7)    0.4(0.3)    0.6(0.7)   0.6(0.3)

that is,

    (xi + yj)²      1      16     49     100
    p(xi)p(yj)      0.28   0.12   0.42   0.18

    E[(X + Y)²] = Σ (x + y)² p(x)p(y)
                = 1(0.28) + 16(0.12) + 49(0.42) + 100(0.18)
                = 0.28 + 1.92 + 20.58 + 18
                = 40.78
Also, E(X + Y) = E(X) + E(Y) = 1.6 + 3.9 = 5.5, so that
    Var(X + Y) = E[(X + Y)²] − [E(X + Y)]² = 40.78 − (5.5)² = 10.53
Now
    Var(X) = E(X²) − [E(X)]²
From Example 3.1, E(X) = 1.6. From Example 2.3, we find E(X²):
    E(X²) = Σ x² p(x) = (−2)²(0.4) + (4)²(0.6) = 1.6 + 9.6 = 11.2
Therefore
    Var(X) = 11.2 − (1.6)² = 8.64
Again
    Var(Y) = E(Y²) − [E(Y)]²
    E(Y²) = Σ y² h(y) = (3)²(0.7) + (6)²(0.3) = 6.3 + 10.8 = 17.1
Therefore
    Var(Y) = 17.1 − (3.9)² = 1.89
Hence,
    Var(X + Y) = 10.53 = Var(X) + Var(Y)

(b)
    Var(X − Y) = E[(X − Y)²] − [E(X − Y)]²
Now

    (xi − yj)²      (−2 − 3)²   (−2 − 6)²   (4 − 3)²   (4 − 6)²
    p(xi)p(yj)      0.4(0.7)    0.4(0.3)    0.6(0.7)   0.6(0.3)

that is,

    (xi − yj)²      25     64     1      4
    p(xi)p(yj)      0.28   0.12   0.42   0.18

    E[(X − Y)²] = Σ (x − y)² p(x)p(y)
                = 25(0.28) + 64(0.12) + 1(0.42) + 4(0.18)
                = 7 + 7.68 + 0.42 + 0.72
                = 15.82
Also,

    xi − yj         −2 − 3     −2 − 6     4 − 3      4 − 6
    p(xi)p(yj)      0.4(0.7)   0.4(0.3)   0.6(0.7)   0.6(0.3)

that is,

    xi − yj         −5     −8     1      −2
    p(xi)p(yj)      0.28   0.12   0.42   0.18

    E(X − Y) = Σ (x − y) p(x)p(y)
             = −5(0.28) + (−8)(0.12) + 1(0.42) + (−2)(0.18)
             = −1.4 − 0.96 + 0.42 − 0.36
             = −2.3
Therefore,
    [E(X − Y)]² = (−2.30)² = 5.29
Hence,
    Var(X − Y) = 15.82 − 5.29 = 10.53
But
    Var(X) + Var(Y) = 8.64 + 1.89 = 10.53
Hence,
    Var(X − Y) = Var(X) + Var(Y)
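The verification above can also be automated. The Python sketch below (an illustration only) builds the joint distribution of Example 2.3 from its independent marginals and checks both identities.

    px = {-2: 0.4, 4: 0.6}   # marginal pmf of X
    py = {3: 0.7, 6: 0.3}    # marginal pmf of Y

    def var(pmf):
        m = sum(v * p for v, p in pmf.items())
        return sum((v - m) ** 2 * p for v, p in pmf.items())

    def var_of(func):
        # variance of func(X, Y) under the independent joint pmf p(x)p(y)
        vals = {}
        for x, p in px.items():
            for y, q in py.items():
                vals[func(x, y)] = vals.get(func(x, y), 0) + p * q
        return var(vals)

    print(round(var_of(lambda x, y: x + y), 2))   # 10.53
    print(round(var_of(lambda x, y: x - y), 2))   # 10.53
    print(round(var(px) + var(py), 2))            # 10.53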
Property 4
Theorem 3.21
If the random variables Xi (i = 1, 2, ..., n) are independent and have
the same variance σ², that is, Var(Xi) = σ², then
    Var(Σ_{i=1}^{n} Xi) = n σ²

This theorem follows immediately from Corollary 3.8.
Property 5
Theorem 3.22
If (X, Y) is a two-dimensional random variable, then
    Var(aX ± bY) = a² Var(X) + b² Var(Y) ± 2 a b Cov(X, Y)
where a and b are constants

Proof
We shall prove it for the sum. The proof for the difference is similar.
    Var(aX + bY) = E{[(aX + bY) − E(aX + bY)]²}
                 = E{(a[X − E(X)] + b[Y − E(Y)])²}
                 = a² E[X − E(X)]² + b² E[Y − E(Y)]² + 2 a b E{[X − E(X)][Y − E(Y)]}
                 = a² Var(X) + b² Var(Y) + 2 a b Cov(X, Y)

More generally, we have the following result for the variance of a linear combination of random variables.
Corollary 3.9
If Xi (i = 1, 2, ..., n) are n random variables and ci is a constant associated with the ith random variable, then
    Var(Σ_{i=1}^{n} ci Xi) = Σ_{i=1}^{n} ci² Var(Xi) + 2 Σ_{i<j} ci cj Cov(Xi, Xj),
or equivalently,
    Var(Σ_{i=1}^{n} ci Xi) = Σ_{i=1}^{n} ci² Var(Xi) + Σ_{i≠j} ci cj Cov(Xi, Xj),
or equivalently,
    Var(Σ_{i=1}^{n} ci Xi) = Σ_{i=1}^{n} Σ_{j=1}^{n} ci cj Cov(Xi, Xj)
Example 3.22
If Var(X) = 5, Var(Y) = 3 and Cov(X, Y) = 2, find
(a) Var(4X + 6Y);
(b) Var(4X − 6Y).

Solution
(a) Var(4X + 6Y) = 4² Var(X) + 6² Var(Y) + 2(4)(6) Cov(X, Y)
                 = 16(5) + 36(3) + 48(2)
                 = 284
(b) Var(4X − 6Y) = 4² Var(X) + 6² Var(Y) − 2(4)(6) Cov(X, Y)
                 = 16(5) + 36(3) − 48(2)
                 = 92
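The same result can be obtained by treating Corollary 3.9 as a quadratic form in the coefficients. The Python sketch below (an illustration only) evaluates Var(c1X + c2Y) = c1² Var(X) + c2² Var(Y) + 2 c1 c2 Cov(X, Y) for the numbers of Example 3.22.

    def var_lin_comb(coeffs, variances, cov):
        """Variance of c1*X + c2*Y for two variables with given variances and covariance."""
        c1, c2 = coeffs
        v1, v2 = variances
        return c1**2 * v1 + c2**2 * v2 + 2 * c1 * c2 * cov

    print(var_lin_comb((4, 6), (5, 3), 2))    # 284
    print(var_lin_comb((4, -6), (5, 3), 2))   # 92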
Property 6

Theorem 3.23
Let X and Y be two independent random variables. Then
    Var(aX ± bY) = a² Var(X) + b² Var(Y)
where a and b are constants

Corollary 3.10
Let Xi (i = 1, 2, ..., n) be independent random variables and ci a constant associated with the ith random variable, then
    Var(Σ_{i=1}^{n} ci Xi) = Σ_{i=1}^{n} ci² Var(Xi)
Example 3.23
If X and Y are independent random variables with Var(X) = 10 and Var(Y) = 6, find
(a) Var(3X + 2Y)    (b) Var(2X + 2Y)

Solution
(a) Var(3X + 2Y) = 3² Var(X) + 2² Var(Y)
                 = 9(10) + 4(6)
                 = 112
(b) Var(2X + 2Y) = Var[2(X + Y)]
                 = 2² Var(X + Y)
                 = 2² [Var(X) + Var(Y)]
                 = 4(10 + 6)
                 = 64
The reader should solve (c) and (d).

Theorem 3.24
Let the random variable M be the frequency of success in n independent trials, then
    Var(M) = npq

Proof
We found in the proof of Theorem 3.16 that E(Xi) = p. To find the variance, we also need E(Xi²):

    xi²      0   1
    p(xi)    q   p

    E(Xi²) = 1(p) + 0(q) = p
so that
    Var(Xi) = E(Xi²) − [E(Xi)]² = p − p² = pq
Hence
    Var(M) = Var(X1 + X2 + ... + Xn)
           = Var(X1) + Var(X2) + · · · + Var(Xn)
           = npq
Example 3.24
Refer to Example 3.15, find Var(M).

Solution
    n = 25, p = 2/5, q = 3/5
    Var(M) = npq = 25 (2/5)(3/5) = 6

Theorem 3.25
Let the random variable M/n be the relative frequency of success in n independent trials, then
    Var(M/n) = pq/n

Proof
    Var(M/n) = (1/n²) Var(M)
             = npq/n²        (by Theorem 3.24)
             = pq/n

Example 3.25
Refer to Example 3.15, find Var(M/n).

Solution
    n = 25, p = 2/5, q = 3/5
    Var(M/n) = pq/n = (2/5)(3/5)/25 = 0.0096
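These two values can be checked by simulation; the Python sketch below (an illustration only) repeats the 25-draw experiment of Example 3.15 and estimates Var(M) and Var(M/n).

    import random

    random.seed(4)
    n, p, trials = 25, 2 / 5, 20000

    ms = [sum(1 for _ in range(n) if random.random() < p) for _ in range(trials)]
    mean_m = sum(ms) / trials
    var_m = sum((m - mean_m) ** 2 for m in ms) / trials

    print(round(var_m, 2))           # close to npq = 6
    print(round(var_m / n**2, 4))    # close to pq/n = 0.0096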
EXERCISES
3.2 Refer to Exercise 1.1. Suppose X and Y are not assumed to be independent. Find
(a) Var(X + Y)            (b) Var(X − Y)
(c) Var(XY)               (d) Var(X/Y)
(e) Var(2X + 3Y)          (f) Var(5X − Y)
(g) Var[(3X)(4Y)]         (h) Var(3X/7Y)
(i) Var(X² + 5Y)          (j) Var(3X)
(k) Var((1/7)(3X − 2Y))   (l) Var(Y²)
(m) Var(Y² − 2Y + 3X)     (n) Var[3(X + Y)]
3.3 Refer to Exercise 3.2 and rework, assuming that X and Y are inde-
pendent.
3.4 A box contains 10 green balls and 15 red balls. 5 balls are randomly
picked from the box with replacement. Find the expectation and vari-
ance of the frequency and relative frequency of the occurrence of green
balls.
3.5 Given a pair of continuous random variables having the joint density
    f(x, y) = 24xy,   for x > 0, y > 0 and x + y ≤ 1
    f(x, y) = 0,      elsewhere
Find
(a) E(X) (b) E(Y ) (c) E(XY )
(d) Var(X) (e) Var(Y ) (f ) E(X + Y )
(g) E(X − Y ) (h) Var(X + Y ) (i) Var(X − Y )
assuming independence.
3.6 Given the joint density
    f(x, y) = 2,   for 0 < x < 1, 0 < y < 1, x + y < 1
    f(x, y) = 0,   elsewhere
find
(a) E(X) (b) E(Y) (c) E(XY)
(d) Var(X) (e) Var(Y) (f ) E(X − Y )
(g) E(X + Y ) (h) Var(X + Y ) (i) Var(X − Y )
if independence is assumed.
3.7 Refer to Exercise 1.7. Find
(a) E(X) (b) E(Y) (c) E(XY)
(d) Var(X) (e) Var(Y) (f ) E(X − Y )
(g) E(X + Y ) (h) Var(X + Y ) (i) Var(X − Y )
Chapter 4
MEASURES OF RELATIONSHIP
OF BIVARIATE DISTRIBUTIONS
4.1 INTRODUCTION
In Chapter 3, we calculated the mean and variance of bivariate random variables and also discussed their properties; higher moments are calculated similarly. However, we shall realise in this chapter that in the multivariate case, other types of moments can be calculated.
Let g(X1, ..., Xn) be any function of the random variables X1, ..., Xn whose density function is f(x1, ..., xn). Then the kth moment of g(X1, X2, ..., Xn) is defined by
    E[g^k(X1, X2, ..., Xn)] = ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} g^k(x1, x2, · · ·, xn) f(x1, ..., xn) dx1 dx2 ... dxn
In the case of a bivariate distribution a special type of moment that has been found very useful is the product moment. We shall define the product moment for both discrete and continuous random variables.
The moments described in Definitions 4.1 and 4.2 for the case when
both powers are equal to unity are known as the first product moment about
the origin.
The moments described in Definitions 4.3 and 4.4 for the case when both powers are equal to unity shall conveniently be referred to as the first product moment about the mean or the first central product moment. See detailed discussion of this in Section 4.3.

Example 4.1
Refer to Example 1.3. Calculate the first product moment about the mean of X and Y.

Solution
The first product moment about the mean (that is, p = q = 1), which we are required to calculate, is given by
    µ11 = E[(X − µX)(Y − µY)] = ∫₀¹ ∫₀² (x − µX)(y − µY) f(x, y) dy dx
From Examples 3.5 and 3.6,
    µX = E(X) = 13/18,    µY = E(Y) = 10/9
Hence
    µ11 = ∫₀¹ ∫₀² (x − 13/18)(y − 10/9)(x² + xy/3) dy dx
        = (1/243) ∫₀¹ (18x − 13)(2x − 3x²) dx
        = (1/243) ∫₀¹ (75x² − 54x³ − 26x) dx
        = −0.00617
Note
M11∗ = Cov(X, Y ) and hence the covariance is sometimes referred to as the
first product moment about the mean or simply the product moment.
Theorem 4.1
Let X and Y be random variables with a joint distribution function,
then
Cov(X, Y ) = E(XY ) − E(X)E(Y )
Proof
    Cov(X, Y) = E[(X − µX)(Y − µY)]
              = E(XY) − µX E(Y) − µY E(X) + µX µY
              = E(XY) − E(X)E(Y)
Note
(b) The covariance need not be finite, or even exist. However, it is finite
if both random variables have finite variances.
Example 4.2
For the data in Example 2.3, find the covariance between X and Y .
Solution
It has been shown in Example 3.14,
E(XY ) = 6.24
E(X)E(Y ) = 6.24
Hence
    Cov(X, Y) = E(XY) − E(X)E(Y) = 6.24 − 6.24 = 0
Example 4.3
Refer to Example 1.3. Find Cov(X, Y ).
Solution
From Examples 3.7 and 3.8,
    E(X) = 13/18;    E(Y) = 10/9;    E(XY) = 43/54
Hence
    Cov(X, Y) = E(XY) − E(X)E(Y) = 43/54 − (13/18)(10/9) = −0.00617
which is the same as that obtained in Example 4.1.
Distributions 133
Property 1 Symmetry
Theorem 4.2
Suppose X and Y are two random variables, then
Cov(X, Y ) = Cov(Y, X)
Property 2
Theorem 4.3
Suppose X and Y are two random variables125
and a is a constant, then
Theorem
Suppose
Suppose X X 4.2
and
and Y Y are two
two random
are Cov(X, Y ) = variables,
random Cov(Y, X)then
variables, then
Suppose X and Y are Cov(X, Y ) = variables,
two random Cov(Y, X)then
Cov(X,
Cov(X, Y )) = Cov(Y, X)
ADVANCED TOPICS IN INTRODUCTORY
Cov(X, Y Y)= = Cov(Y,
Cov(Y, X)X)
PROBABILITY: A FIRST COURSE
Property 2 Cov(X,
IN Y ) = Cov(Y, X)
PROBABILITY
Property 2 THEORY – VOLUME III Measures of Relationship of Bivariate Distributions
Property 2
Property
Property 2
2
Theorem
Property 2 4.3
Theorem
Property
Suppose
2 4.3
and Y are two random variables and a is a constant, then
X 4.3
Theorem
Suppose
Theorem and Y are two random variables and a is a constant, then
X 4.3
Suppose
Theorem
Theorem and Y are two random variables and a is a constant, then
X 4.3
4.3
Suppose
Theorem X and Y are Cov(a
two + X, Y )variables
random = Cov(X, Y )a is a constant, then
and
Suppose
Suppose X X 4.3
and
and YY are
are two
two random
Cov(a + X, Y )variables
random = Cov(X,
variables and
Y )a
and a is
is aa constant,
constant, then
then
Suppose X and Y are Cov(a + X, Y )variables
two random = Cov(X, Y )a is a constant, then
and
Cov(a
Cov(a + X, Y )) = Cov(X, Y )
Cov(a ++ X,
X, Y Y)= = Cov(X,
Cov(X, Y Y ))
Proof Cov(a + X, Y ) = Cov(X, Y )
Proof
Proof
Proof Cov(a + X, Y ) = E{[a + X − E(a + X)][Y − E(Y )]}
Proof
Proof Cov(a + X, Y ) = E{[a + X − E(a + X)][Y − E(Y )]}
Proof Cov(a + X, Y ) = E{[a E{[X+−XE(X)][Y − E(a +−X)][Y
E(Y )]} − E(Y )]}
Cov(a +
Cov(a + X,
X, YY )) == E{[a
E{[a +
E{[X +−X − E(a
XE(X)][Y
E(a + +−X)][Y
E(Y )]}
X)][Y E(Y )]}
− E(Y
Cov(a + X, Y ) = E{[X Cov(X,
E{[a +−X − E(Y )]}
)− E(a +−X)][Y
YE(X)][Y − E(Y )]}
− )]}
Cov(a + X, Y ) = = Cov(X,
E{[X+−
E{[a
E{[X −X )− E(a +−
E(X)][Y
YE(X)][Y −X)][Y
E(Y )]}
E(Y )]}
− E(Y )]}
= Cov(X,
E{[X −YE(X)][Y) − E(Y )]}
Corollary 4.1 = E{[X
= Cov(X,
Cov(X, −Y )
YE(X)][Y
) − E(Y )]}
Corollary 4.1 = Cov(X, Y )
Suppose and Y are two random
CorollaryX4.1 = Cov(X, variables
Y ) and a and b are constants. Then
Suppose X4.1
Corollary and Y are two random variables and a and b are constants. Then
Suppose
Corollary
CorollaryX and Y are two random variables and a and b are constants. Then
4.1
4.1
Suppose X and Y are Cov(a
two + X, bvariables
random + Y ) = Cov(X,
and a Y ) b are constants. Then
and
Corollary
Suppose
Suppose XX4.1
and
and YY are two
two random
Cov(a
are Cov(a + X, bvariables
random and
+ Y ) = Cov(X,
variables and a
a and
Y ) b are
and are constants.
constants. Then
Then
Suppose X and Y are Cov(a + X, bvariables
two random + Y ) = Cov(X,
and a Y ) bb are
and constants. Then
Cov(a + X, b + Y ) = Cov(X, Y )
Cov(a + X, bb + Y )) =
= Cov(X, Y ))
Property 3
Property 3 + X, +Y Cov(X, Y
Property 3 Cov(a + X, b + Y ) = Cov(X, Y )
Property
Property 3
3
Theorem
Property 3 4.4
Theorem
Property 3 4.4
Suppose X 4.4
and Y are two random variables, then
Theorem
Suppose X 4.4
Theorem and Y are two random variables, then
Suppose
Theorem
Theorem and Y are two random variables, then
X 4.4
4.4
Suppose
Theorem Cov(X,
and
X 4.4 Y + Z)random
areY two = Cov(X, Y ) + Cov(X,
variables, then Z)
Suppose
Suppose X X and
and Y are
Y
Cov(X, areY two
+ Z)random
two = Cov(X,
random variables, then
Y ) + Cov(X,
variables, then Z)
Suppose X and Cov(X, + Z)random
Y areY two = Cov(X, Y ) + Cov(X,
variables, then Z)
Cov(X,
Cov(X, Y + Z) = Cov(X, Y ) + Cov(X, Z)
Cov(X, Y Y + Z) =
+ Z) = Cov(X,
Cov(X, Y Y )) +
+ Cov(X,
Cov(X, Z) Z)
Cov(X, Y + Z) = Cov(X, Y ) + Cov(X, Z)
Proof
    Cov(X, Y + Z) = E{[X − E(X)][(Y + Z) − E(Y + Z)]}
                  = E{[X − E(X)][Y − E(Y)]} + E{[X − E(X)][Z − E(Z)]}
                  = Cov(X, Y) + Cov(X, Z)

Property 4

Theorem 4.5
Suppose X and Y are two random variables and a and b are constants, then
    Cov(aX, bY) = a b Cov(X, Y)

Proof
    Cov(aX, bY) = E{[aX − E(aX)][bY − E(bY)]}
                = a b E{[X − E(X)][Y − E(Y)]}
                = a b Cov(X, Y)

Corollary 4.2
Suppose X and Y are two random variables and a, b, c, and d are constants, then
    Cov(a + cX, b + dY) = c d Cov(X, Y)

In general, the same kind of argument gives the important linear property of covariance.
Theorem 4.6
Let U = a + Σ_{i=1}^{n} ci Xi and V = b + Σ_{j=1}^{m} dj Yj. Then
    Cov(U, V) = Σ_{i=1}^{n} Σ_{j=1}^{m} ci dj Cov(Xi, Yj)

In particular,
    Cov(X + Y, X + Y) = Var(X + Y)
                      = Var(X) + Var(Y) + 2 Cov(X, Y)
Theorem 4.7
If X and Y are independent random variables then Cov(X, Y ) = 0
Proof
From Theorem 4.1,
    Cov(X, Y) = E(XY) − E(X)E(Y)
Since X and Y are independent, E(XY) = E(X)E(Y), hence Cov(X, Y) = 0.

Note
If Cov(X, Y) = 0, X and Y need not be independent.

Property 7

Theorem 4.8
Suppose X and Y are random variables having second moments; then Cov(X, Y) is a well-defined finite number and
    [Cov(X, Y)]² ≤ Var(X) Var(Y)

Note
The equality holds if and only if Y and X have a perfect linear relation.
Cov(X, Y) is often used as a measure of the linear dependence of X and Y, and the reason for this is that Cov(X, Y) is a single number (rather than a complicated object such as a joint density function) which contains some useful information about the joint behaviour of X and Y:
(a) Positive values indicate that X increases as Y increases;
(b) Negative values indicate that X decreases as Y increases or vice versa;
(c) A zero value of the covariance would indicate no linear dependence
between X and Y .
In this definition, 0/0 = 0. The correlation coefficient is also referred to as Pearson's correlation coefficient or the product-moment correlation coefficient. The product-moment correlation coefficient ρ can also be expressed as
    ρ = [E(XY) − E(X)E(Y)] / (σX σY)        (i)
The reader will be asked in Exercise 4.21 to show that the numerator of expression (i) above is defined by Cov(X, Y). This expression may also be used to establish a very important property of the correlation coefficient (see Theorem 4.10 in the sequel).
Property 1 Scale-invariance

Theorem 4.9
For any a, b, c, d ∈ R such that ac ≠ 0,
    ρ(aX + b, cY + d) = IXY ρ(X, Y)
where
    IXY = +1, if ac > 0;    IXY = −1, if ac < 0

Proof
    ρ(aX + b, cY + d) = Cov(aX + b, cY + d) / √[Var(aX + b) Var(cY + d)]
By Corollary 4.2,
    Cov(aX + b, cY + d) = a c Cov(X, Y)
and by Theorem 3.22,
    Var(aX + b) = a² Var(X)   and   Var(cY + d) = c² Var(Y)
Hence
    ρ(aX + b, cY + d) = a c Cov(X, Y) / √[a² Var(X) c² Var(Y)]
                      = (ac/|ac|) · ρ(X, Y)
from which the result follows.

Theorem 4.10
The correlation coefficient lies between −1 and +1, that is,
    −1 ≤ ρ ≤ 1

Note
(i) If ρ(X, Y) = +1, Y is an increasing perfect linear function of X;
(ii) If ρ(X, Y) = −1, Y is a decreasing perfect linear function of X.

Note
When ρ = 0 or near 0 it does not indicate the absence of relationship between X and Y. It only indicates no linear relationship and it does not preclude the possibility of some nonlinear relationship.
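Both properties are easy to observe numerically. The Python sketch below (an illustration only, with an arbitrarily chosen data-generating model) simulates a pair of dependent variables, estimates ρ(X, Y), and checks that a linear change of scale only affects the sign of the correlation.

    import random

    random.seed(5)
    n = 50000
    xs, ys = [], []
    for _ in range(n):
        x = random.gauss(0, 1)
        y = 0.6 * x + random.gauss(0, 1)   # Y depends linearly on X plus noise
        xs.append(x)
        ys.append(y)

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)
        sa = (sum((u - ma) ** 2 for u in a) / len(a)) ** 0.5
        sb = (sum((v - mb) ** 2 for v in b) / len(b)) ** 0.5
        return cov / (sa * sb)

    r = corr(xs, ys)
    r2 = corr([2 * x + 3 for x in xs], [-4 * y + 1 for y in ys])
    print(round(r, 3), round(r2, 3))   # r2 is approximately -r, and both lie in [-1, 1]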
For the data of Example 1.3,
    ρ = [E(XY) − E(X)E(Y)] / √[Var(X) Var(Y)]
      = [0.79630 − 0.72222(1.11111)] / √[(0.04506)(0.32099)]
      = −0.05127

4.4.4 Variance of Sum of Random Variables with Common Variance and Common Correlation Coefficient

We now establish the variance of a sum of random variables with a common correlation coefficient.

Theorem 4.11
Suppose X1, X2, ..., Xn are random variables with common variance σ² and common correlation Corr(Xi, Xj) = ρ, i ≠ j (i, j = 1, 2, ..., n). Let
    Sn = X1 + X2 + ... + Xn
Then
    (a) Var(Sn) = nσ²[1 + (n − 1)ρ]
    (b) Var(Sn/n) = (σ²/n)[1 + (n − 1)ρ]
where Sn/n is the sample mean
Proof
(a) From Corollary 3.9,
    Var(Sn) = Σ_{i=1}^{n} Var(Xi) + 2 Σ_{i<j} Cov(Xi, Xj)
            = Σ_{i=1}^{n} Var(Xi) + 2 Σ_{i<j} √[Var(Xi)Var(Xj)] · Cov(Xi, Xj)/√[Var(Xi)Var(Xj)]
            = Σ_{i=1}^{n} σ² + 2 Σ_{i<j} √(σ² σ²) ρ
            = nσ² + 2 ρ σ² (n² − n)/2
            = nσ² + ρ σ² n(n − 1)
            = nσ²[1 + (n − 1) ρ]

(b) Var(Sn/n) = (1/n²) Var(Sn)
             = (σ²/n)[1 + (n − 1) ρ]

When the X's are independent, ρ = 0 and
    Var(Sn/n) = Var(X)/n = σ²/n
We now discuss conditional expectation, which is simply the expected value of a variable, given that a set of prior conditions has taken place.
From the foregoing two definitions we notice that the definition of condi-
tional expectation is almost the same as the definition of expectation, except
that instead of a probability (marginal) distribution it uses a conditional
probability (marginal) distribution.
Example 4.6
Refer to Example 1.3. Determine: (a) E(Y |x), (b) E(X|y)
Solution
(a) E(Y|x) = ∫_{−∞}^{∞} y f(y|x) dy, where
    f(y|x) = f(x, y)/g(x) = (x² + xy/3)/(2x² + 2x/3) = (3x + y)/(6x + 2),   0 < x < 1, 0 < y < 2
Hence
    E(Y|x) = ∫₀² y (3x + y)/(6x + 2) dy
           = [1/(6x + 2)] [3xy²/2 + y³/3]₀²
           = (6x + 8/3)/(6x + 2)
           = (9x + 4)/(9x + 3)

(b) E(X|y) = ∫_{−∞}^{∞} x f(x|y) dx, where
    f(x|y) = (6x² + 2xy)/(2 + y),   0 < x < 1, 0 < y < 2
Hence
    E(X|y) = ∫₀¹ x (6x² + 2xy)/(2 + y) dx
           = ∫₀¹ (6x³ + 2x²y)/(2 + y) dx
           = [1/(2 + y)] [6x⁴/4 + 2x³y/3]₀¹
           = [1/(2 + y)] (6/4 + 2y/3)
           = (18 + 8y)/[12(2 + y)]
4.5.2 Properties of Conditional Expectation

Property 1

Theorem 4.12
Suppose X and Y are random variables. If Y has a finite expectation and Y ≥ 0, then
    E(Y|X) ≥ 0

Property 2

Theorem 4.13
Suppose X and Y1, Y2, ..., Yn are random variables having finite expectations and ai are constants, 1 ≤ i ≤ n, then
    E(Σ_{i=1}^{n} ai Yi | X) = Σ_{i=1}^{n} ai E(Yi | X)
In particular, if ai = a then
    E(Σ_{i=1}^{n} a Yi | X) = a Σ_{i=1}^{n} E(Yi | X)
Property 3
Theorem 4.14
Suppose Y1 and Y2 are random variables with finite expectations. If
Y1 ≤ Y2 then
E(Y1 |X) ≤ E(Y2 |X)
Property 4

Theorem 4.15
Suppose X and Y are two independent random variables. If Y has a finite expectation, then
    E(Y|X) = E(Y)

Property 5

Theorem 4.16
Suppose X and Y are two random variables. If Y has a moment of order r ≥ 1, then
    |E(Y|X)|^r ≤ [E(|Y| |X)]^r ≤ E(|Y|^r |X)

As pointed out earlier, E(Y|X) is a random variable and hence we may find its expectation.

Property 6

Theorem 4.17
Suppose that X and Y are independent random variables. Then the expectation of the conditional expectation is given by
    E[E(Y|X)] = E(Y)
Proof
We will prove this for the continuous case. The discrete case is proved similarly on replacing integrals by summations.
By definition,
    E(Y|x) = ∫_{−∞}^{+∞} y h(y|x) dy
           = ∫_{−∞}^{∞} y [f(x, y)/g(x)] dy
where f(x, y) is the joint probability density function of (X, Y) and g(x) is the marginal probability density function of X.
Multiplying both sides of the equation by g(x):
    g(x) E(Y|x) = ∫_{−∞}^{∞} y [f(x, y)/g(x)] dy · g(x)
Taking the integral of both sides with respect to x gives an expectation of E(Y|x), namely E[E(Y|X)]. That is,
    E[E(Y|X)] = ∫_{−∞}^{∞} E(Y|x) g(x) dx
              = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y [f(x, y)/g(x)] dy g(x) dx
If all the expectations exist, it is permissible to write the above iterated integral with the order of integration reversed. Thus
    E[E(Y|X)] = ∫_{−∞}^{∞} y ∫_{−∞}^{∞} f(x, y) dx dy
              = ∫_{−∞}^{∞} y h(y) dy
              = E(Y)
Theorem 4.17 gives what might be called the law of total expectation: the expectation of a random variable Y can be calculated by weighting the conditional expectations appropriately and summing or integrating.
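The law of total expectation is easy to verify on a small joint table; the Python sketch below (an illustration only, using a hypothetical joint pmf in which X and Y are dependent, not a table from the text) computes E(Y|X = x) for each x and then averages these conditional expectations with the marginal probabilities of X.

    # Hypothetical dependent joint pmf, chosen only for this illustration
    pmf = {(0, 1): 0.2, (0, 3): 0.1, (1, 1): 0.1, (1, 3): 0.6}

    def p_x(x):
        return sum(p for (u, y), p in pmf.items() if u == x)

    def e_y_given_x(x):
        return sum(y * p for (u, y), p in pmf.items() if u == x) / p_x(x)

    xs = sorted({x for x, y in pmf})
    tower = sum(e_y_given_x(x) * p_x(x) for x in xs)   # E[E(Y|X)]
    ey = sum(y * p for (x, y), p in pmf.items())       # E(Y)
    print(round(tower, 4), round(ey, 4))               # both 2.4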
Property 7

Theorem 4.18
Let Ŷ = E(Y|x) be the conditional expectation of Y given X. Then
    E[(Y − Ŷ)²] ≤ E[(Y − π)²]
where π = π(x) is any other function of X
Proof
    E[(Y − π)²] = E[{(Y − Ŷ) + (Ŷ − π)}²]
                = E[(Y − Ŷ)²] + 2 E[(Y − Ŷ)(Ŷ − π)] + E[(Ŷ − π)²]
                = E[(Y − Ŷ)²] + E[(Ŷ − π)²]        (i)
because we can show that the second term is zero. Thus,
    E[(Y − Ŷ)(Ŷ − π)] = ∫_x ∫_y (y − ŷ)(ŷ − π) f(x, y) dy dx
                      = ∫_x [∫_y (y − ŷ) f(y|x) dy] (ŷ − π) g(x) dx
                      = 0
since the inner integral is ∫_y (y − ŷ) f(y|x) dy = E(Y|x) − ŷ = 0.
Theorem 4.20
Suppose X and Y are random variables. Then
    E[Var(Y|X)] = Var(Y) − Var[E(Y|X)]

Proof
Just as E(Y|X), Var(Y|X) is also a random variable and has expectation E[Var(Y|X)]. Now,
    E[Var(Y|X)] = E{E(Y²|X) − [E(Y|X)]²}
                = E[E(Y²|X)] − E{[E(Y|X)]²}
                = E[E(Y²|X)] − E{[E(Y|X)]²} + {E[E(Y|X)]}² − {E[E(Y|X)]}²
                = E(Y²) − [E(Y)]² − Var[E(Y|X)]        (from Theorem 4.17)
                = Var(Y) − Var[E(Y|X)]

Recall that
    Var(Y|X) = E(Y²|X) − [E(Y|X)]²
provided that E(Y|X) exists.
Thus for the discrete case, if pX(x) > 0 and if Y has a second moment, then the conditional variance of Y given X = x is given by
    Var(Y|X = x) = Σ_y y² p_{Y|X}(y|x) − [Σ_y y p_{Y|X}(y|x)]²
Property 1
Theorem 4.21
Suppose that X and Y are independent random variables. Then the
expectation of conditional variance
Property 2
Theorem 4.22
Suppose that X and Y are independent random variables. Then the
expectation of conditional variance is given by
Proof
This follows from the proof of Theorem 4.20.
Theorem 4.23
Suppose that X and Y are independent random variables. Then
    Var(Y) = E[Var(Y|X)] + Var[E(Y|X)]
Proof
This follows from the proof of Theorem 4.20.
Aliter
Var(Y ) = E[Y − E(Y )]2
= E{[Y − E(Y |X)] + [E(Y |X) − E(Y )]}2
= E[Y − E(Y |X)]2 + E[E(Y |X) − E(Y )]2
since the cross-product term vanishes. Conditioning on X the two terms of
the expression on the right side, we obtain for the first term:
    E{E{[Y − E(Y|X)]² | X}} = E[Var(Y|X)]        (by Definition 4.9a)
and the second term:
    E{[E(Y|X) − E(Y)]²} = Var[E(Y|X)]
Hence
Var(Y ) = E[Var(Y |X)] + Var[E(Y |X)]
The linear regression function was encountered in Theorem 4.19 as the expected value of Y given X. In this section, we shall discuss in detail its special characteristics.
(X, Y) is said to have linear regression of Y on X if
    E(Y|x) = α + βx
where α and β are constants. That is, the regression curve of Y on X is a straight line, called the regression line of Y on X. The constants α and β are the parameters of the linear regression equation. The constant α is called the intercept, which is the point where the regression line cuts the y-axis. The constant β is called the slope of the regression equation or the regression coefficient of Y on X, which measures the change in Y per unit change in X.
The constants α and β are unknown parameters and have to be estimated. We discuss here two estimation methods, namely, the method of moments and the method of least squares.
The method of moments for estimating the linear regression function attempts to find expressions for α and β that are in terms of the first- and second-order moments of the joint distribution, namely E(X), E(Y), Var(X) and Cov(X, Y).
Theorem 4.24
If (X, Y ) has linear regression of Y on X, then
    α = E(Y) − β E(X)
    β = Cov(X, Y)/Var(X)
Proof
Proof for Estimating α:
From Definition 4.10, the regression function is given by
so that
E(XY ) − E(X)E(Y )
β =
E(X 2 ) − E[(X)]2
Measures of Relationship of Bivariate Distributions 155
Cov(X, Y )
= (from Theorem 4.1)
Var(X)
or
E[X − E(X)][(Y − E(Y )]
β=
E{[(X − E[(X)]2 }
or Theorem 4.25
E[X − E(X)][(Y − E(Y )]
=
If (X, Y ) has linearβ regression of Y on X, then
E{[(X − E[(X)]2 }
α = µY − β µX
Theorem 4.25
If (X, Y) has linear regression of Y on X, then

α = µY − β µX

β = σXY / σX²

where µY = E(Y), µX = E(X), σX² = Var(X) and σXY = Cov(X, Y).

4.7.3 Least Squares Method of Estimating Linear Regression Function

Let the values (x1, y1), (x2, y2), · · ·, (xn, yn) be plotted as points in the x, y plane. Then the problem of estimating a linear regression function can be treated as a problem of fitting a straight line to this set of points. The best known method is the least squares approach.
The least squares method states that the sum of the squares of the differences between the observed values of Y and the corresponding fitted values of Y, which we denote by Ŷ, must be a minimum. The values of the parameters obtained by this minimisation determine what is known as the best fitting curve in the sense of least squares.

Proof
Let us find the best function of the form h(x) = α + β x. This merely requires optimising over the two parameters α and β. Now, we can write (recollect that Var(X) = E(X²) − [E(X)]², by Theorem 3.22, and also that Var(α) = 0, a property of variance):

Var(Y − α − β X) = E[(Y − α − β X)²] − [E(Y − α − β X)]²

so that

E[(Y − α − β X)²] = Var(Y − α − β X) + [E(Y − α − β X)]²
                  = (σY² + β² σX² − 2β σXY) + [E(Y − α − β X)]²

The first term of the expression on the right hand side does not depend on α, so α can be chosen so as to minimize the second term. Recall that to minimize a function we set the derivative of the function to zero. Thus

d/dα [E(Y − α − β X)]² = −2 E(Y − α − β X) = 0

giving

µY − α − β µX = 0
α = µY − β µX

Now to minimize the first term we set the derivative with respect to β equal to zero:

d/dβ (σY² + β² σX² − 2β σXY) = 2β σX² − 2σXY = 0

giving

β = σXY / σX²
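The following short Python sketch (illustrative data only, not taken from the text) fits the least squares line to a few points and confirms that the minimising slope and intercept agree with β = σXY/σX² and α = µY − β µX computed from the sample moments.

```python
from statistics import mean

# Illustrative (hypothetical) data points
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.7, 5.2, 5.9]

mx, my = mean(xs), mean(ys)
sxx = sum((x - mx) ** 2 for x in xs)                     # proportional to Var(X)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # proportional to Cov(X, Y)

beta = sxy / sxx          # slope: Cov(X, Y) / Var(X)
alpha = my - beta * mx    # intercept: mean(Y) - beta * mean(X)

# Least squares criterion: sum of squared residuals for the fitted line
sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
print(alpha, beta, sse)
```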
Corollary 4.3
If β is the coefficient of the linear regression function of Y on X, then

β = ρ (σY / σX)

Proof
It is sufficient to show that

σXY / σX² = ρ (σY / σX)

Thus

β = σXY / σX²
  = [σXY / (σX σX)] · (σY / σY)
  = [σXY / (σX σY)] · (σY / σX)
  = ρ (σY / σX)
Similarly, (X, Y) is said to have linear regression of X on Y if

E(X|y) = τ + θ y          (i)

with τ and θ as constants. That is, the regression curve of X on Y is a straight line, which is called the regression line of X on Y. The number θ is called the regression coefficient of X on Y and is defined as

θ = σXY / σY² = Cov(X, Y) / Var(Y)          (ii)
Theorem 4.26
Let us define Var(Ŷ) = Var(Y − β X) as the mean squared prediction error. Then

Var(Ŷ) = σY² (1 − ρ²)

Proof
Var(Y − β X) = Var(Y) + Var(β X) − 2 Cov(Y, β X)
             = σY² + β² σX² − 2β σXY
             = σY² + (σXY² / σX⁴) σX² − 2 (σXY / σX²) σXY
             = σY² − σXY² / σX²
             = σY² − [σXY / (σX σY)]² σY²
             = σY² − ρ² σY²
             = σY² (1 − ρ²)
Example 4.7
Refer to Example 1.3. (a) Find the linear regression equation of Y on X. (b) Calculate the mean squared prediction error.

Solution
(a) From Example 3.19,

σX² = Var(X) = 0.04506
σY² = Var(Y) = 0.32099
σXY = Cov(X, Y) = −0.00617

Hence

β = σXY / σX² = −0.00617 / 0.04506 = −0.13693
From Example 3.8,

µX = E(X) = 13/18 = 0.72222
µY = E(Y) = 10/9 = 1.11111

Hence

α = µY − β µX = 1.11111 − (−0.13693)(0.72222) = 1.21000

Finally,

E(Y|x) = 1.21000 − 0.13693 x

(b) Calculation of the mean squared prediction error:

Var(Ŷ) = σY² (1 − ρ²) = 0.32099[1 − (−0.05127)²] = 0.32015
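For readers who wish to reproduce the arithmetic, the following short Python sketch recomputes β, α, ρ and the mean squared prediction error from the moment values quoted above from Examples 3.8 and 3.19.

```python
# Moments quoted above (Examples 3.8 and 3.19)
var_x, var_y, cov_xy = 0.04506, 0.32099, -0.00617
mu_x, mu_y = 13 / 18, 10 / 9

beta = cov_xy / var_x                          # regression coefficient of Y on X
alpha = mu_y - beta * mu_x                     # intercept
rho = cov_xy / (var_x ** 0.5 * var_y ** 0.5)   # correlation coefficient
mspe = var_y * (1 - rho ** 2)                  # Var(Y-hat), mean squared prediction error

print(round(beta, 5), round(alpha, 5), round(rho, 5), round(mspe, 5))
```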
Theorem 4.27
Let (X, Y) be a two-dimensional random variable with linear regression of Y on X. Then

E(Y|x) = µY + ρ (σY / σX)(x − µX)

Proof
Substituting the expressions for α and β from Theorem 4.25 into Definition 4.10, we get

E(Y|x) = (µY − β µX) + (σXY / σX²) x
       = µY − (σXY / σX²) µX + (σXY / σX²) x
       = µY + (σXY / σX²)(x − µX)
       = µY + ρ (σY / σX)(x − µX)          (from Corollary 4.3)

Similarly, it can be proved that if the regression of X on Y is linear, then

E(X|y) = µX + ρ (σX / σY)(y − µY)

It follows that if a regression is linear and ρ = 0, then E(Y|x) does not depend on x and E(X|y) does not depend on y.
Example 4.8
Refer to Example 1.3. Write the regression equation of Y on X.

Solution
From Example 3.8,

E(X) = 13/18 = 0.72222
E(Y) = 10/9 = 1.11111     (to 5 decimal places)

From Example 3.19,

σX = √0.04506 = 0.21227
σY = √0.32099 = 0.56656
From Example 4.5, ρ = −0.05127. Hence

E(Y|x) = 1.11111 − 0.05127 (0.56656 / 0.21227)(x − 0.72222)
       = 1.11111 − 0.13684 (x − 0.72222)
Theorem 4.28
The product of the regression coefficients β and θ in the linear regressions E(Y|x) and E(X|y), respectively, is equal to the square of the correlation coefficient of X and Y:

ρ² = β θ

Proof
In Corollary 4.3 and (ii) above,

β = ρ (σY / σX)     and     θ = ρ (σX / σY)

Hence

β θ = ρ (σY / σX) · ρ (σX / σY) = ρ²

from which the result follows.
Note
(a) The sign of the regression coefficient is determined by ρ, since σX > 0 and σY > 0.

(b) The two linear regression equations, E(Y|x) and E(X|y), have the same sign of slope.

(c) The regression line passes through the point [E(X), E(Y)], which is the expected value of the joint distribution.

(d) The point of intersection of the linear regression curves E(Y|x) and E(X|y) is [E(X), E(Y)].
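As a brief numerical check of Theorem 4.28 (a sketch only), the following lines use the moments of Example 4.7 and confirm that the product βθ coincides with ρ².

```python
var_x, var_y, cov_xy = 0.04506, 0.32099, -0.00617   # moments from Example 4.7

beta = cov_xy / var_x        # regression coefficient of Y on X
theta = cov_xy / var_y       # regression coefficient of X on Y
rho = cov_xy / (var_x * var_y) ** 0.5

print(beta * theta, rho ** 2)   # both are approximately 0.00263
```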
EXERCISES
4.1 For the data of Exercise 1.1, find the covariance between X and Y.
4.2 For the data of Exercise 1.1, find the correlation coefficient of X and
Y.
4.5 A die is rolled twice. Let X be the sum of the outcomes, and Y , the
first outcome minus the second. Compute Cov(X, Y )
4.6 Prove Theorem 4.5.
4.7 Prove Theorem 4.7.
4.11 Suppose the random variables X and Y have the joint p.d.f.

     f(x, y) = x + y,   0 < x < 1, 0 < y < 1
     f(x, y) = 0,       elsewhere
(a) Use the least squares method to obtain the linear regression
equation of (i) Y on X; (ii) X on Y.
(b) (i) Find the product of the regression coefficients in a(i) and
a(ii);
(ii) Take the square root of your results in b(i);
(iii) Compare the result in b(ii) with that in Exercise 4.9.
4.20 Show that the point of intersection of Y = E(Y |x) and X = E(X|y)
is (µX , µY )
Cov(X, Y ) = XY − XY
PART 2
STATISTICAL INEQUALITIES, LIMIT LAWS
AND SAMPLING DISTRIBUTIONS
Chapter 5
STATISTICAL INEQUALITIES AND LIMIT LAWS

5.1 INTRODUCTION

5.1.1 Statistical Inequalities
probability distribution of X. This is because to describe a probability distribution completely, we need to know the probability function of the random variable. However, Markov's and Chebyshev's inequalities enable us to derive lower (or upper) bounds on such probabilities when only the mean, or both the mean and the variance, of the distribution are known.

Theorem 5.1(a)
Let X be a non-negative random variable with finite expectation E(X). Then for any ε > 0,

P(X ≥ ε) ≤ E(X) / ε
Proof
This theorem is valid for both the continuous and the discrete case. We shall first prove it for the discrete case. Suppose X has the probability distribution

X:          x1      x2      · · ·   xk      · · ·   xn
P(X = x):   p(x1)   p(x2)   · · ·   p(xk)   · · ·   p(xn)

Suppose also that the values of the random variable are arranged in ascending order

0 ≤ x1 < x2 < · · · < xn
Note that E(X) ≥ 0, since X ≥ 0. We shall consider three cases.

Case 1
If X takes only zero values, then E(X) = 0 and for any constant ε > 0,

P(X ≥ ε) = 0 ≤ E(X) / ε

That is, the theorem is valid.

For the remaining cases suppose E(X) > 0.

Case 2
If ε > xn, then {X < ε} is a 'certain event', so that

P(X ≥ ε) = 1 − P(X < ε) = 0 ≤ E(X) / ε

that is, the theorem is again valid in this case.

Case 3
Let ε ≤ xn and let xk, xk+1, · · ·, xn be all the values of X greater than or equal to ε (if in a special case ε ≤ x1, then k = 1). By definition,

E(X) = x1 p1 + x2 p2 + · · · + xk pk + · · · + xn pn
     ≥ xk pk + · · · + xn pn
     ≥ ε (pk + · · · + pn)          (i)

But

pk + pk+1 + · · · + pn = P(X ≥ ε)

Then from (i),

E(X) ≥ ε P(X ≥ ε)

or

P(X ≥ ε) ≤ E(X) / ε          (ii)

Since

P(X < ε) = 1 − P(X ≥ ε)

then from (ii) it follows that

P(X < ε) ≥ 1 − E(X) / ε          (iii)

Therefore

P(X ≥ ε) ≤ E(X) / ε,
and hence

P(X < ε) ≥ 1 − E(X) / ε
Aliter
Let

Y = 0   if X < ε
Y = ε   if X ≥ ε

Then

P(Y = 0) = P(X < ε)     and     P(Y = ε) = P(X ≥ ε)

Hence

E(Y) = 0 · P(Y = 0) + ε · P(Y = ε) = ε P(X ≥ ε)

Clearly, X ≥ Y. Hence

E(X) ≥ E(Y) = ε P(X ≥ ε)

Therefore

P(X ≥ ε) ≤ E(X) / ε
Note
(1) The Markov inequality may be equivalently stated as in Theorem 5.1(b):

P(X < ε) ≥ 1 − E(X) / ε

Proof
Since {X ≥ ε} and {X < ε} are complementary events, Theorem 5.1(b) follows.

(2) The Markov inequality had appeared earlier in the work of Pafnuty Chebyshev, and for this reason it is sometimes referred to in other books as the first Chebyshev inequality. Such books refer to the Chebyshev inequality, discussed in the sequel, as the second Chebyshev inequality.
Example 5.1
A textile factory produces on the average 150 bales of suiting material a month. Suppose the number of bales of suiting material produced each month is a random variable. Find the bounds for the probability that a particular month's production will be (a) at least 200 bales; (b) less than 200 bales.
Solution
Let X be the number of bales of the suiting material produced in a month. Then E(X) = 150.

(a) We are required to find P(X ≥ 200). By Markov's inequality,

P(X ≥ 200) ≤ 150/200 = 3/4

(b) We are required to find P(X < 200). Using the second formulation we have

P(X < 200) ≥ 1 − 150/200 = 1/4
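The two bounds of this example take only a few lines of Python (a sketch; Markov's inequality uses only the mean, so no distributional assumption is needed).

```python
mean_bales = 150        # E(X), the only information Markov's inequality requires
c = 200                 # threshold from the example

upper = mean_bales / c          # P(X >= 200) <= E(X)/200
lower = 1 - mean_bales / c      # P(X < 200)  >= 1 - E(X)/200
print(upper, lower)             # 0.75 and 0.25
```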
Theorem 5.2(a)
Let X be a random variable with finite expectation µ = E(X) and finite variance Var(X). Then for any ε > 0,

P(|X − µ| ≥ ε) ≤ Var(X) / ε²

Proof
The inequality |X − µ| ≥ ε holds if and only if

(X − µ)² ≥ ε²

Applying Markov's inequality (Theorem 5.1) to the non-negative random variable (X − µ)² therefore gives

P(|X − µ| ≥ ε) = P[(X − µ)² ≥ ε²] ≤ E[(X − µ)²] / ε² = Var(X) / ε²
Note
(1) The Chebyshev inequality may be equivalently stated as in Theorem 5.2(b):

P(|X − µ| < ε) ≥ 1 − Var(X) / ε²

Proof
Since {|X − µ| ≥ ε} and {|X − µ| < ε} are complementary events, Theorem 5.2(b) follows.

(2) As indicated earlier, Theorem 5.2 is referred to in some books as the second Chebyshev inequality.
Example 5.2
An electric station services an area with 12,000 bulbs. The probability of switching on each of these bulbs every evening is 0.9. What are the bounds for the probability that the number of bulbs switched on in the area in one particular evening is different from its expected value in absolute terms by (a) less than 100? (b) at least 120?

Solution
Let X be the number of bulbs switched on in the area in the evening. Then

µ = E(X) = np = 12,000(0.9) = 10,800     and     Var(X) = npq = 12,000(0.9)(0.1) = 1,080

(a) We are required to calculate P(|X − µ| < 100). By the second Chebyshev inequality,

P(|X − 10800| < 100) ≥ 1 − 1080/(100)² = 1 − 0.108 = 0.892

(b) We are required to calculate P(|X − µ| ≥ 120). We shall have to use the first Chebyshev inequality:

P(|X − 10800| ≥ 120) ≤ 1080/(120)² = 0.075
Example 5.3
Suppose that a random variable has mean 5 and standard deviation 1.5. Use Chebyshev's inequality to estimate the probability that an outcome lies between 3 and 7.

Solution
µ = 5, σ = 1.5

Since we wish to estimate the probability of an outcome lying between 3 and 7, we have

P(3 − 5 < X − µ < 7 − 5) = P(−2 < X − µ < 2) = P(|X − µ| < 2)

That is, ε = 2. By Chebyshev's inequality,

P(|X − µ| < 2) ≥ 1 − Var(X)/ε² = 1 − (1.5)²/2² = 1 − 0.5625 = 0.4375

The desired probability is at least 0.4375. That is, if the experiment is repeated a large number of times, we expect at least 43.75% of the outcomes to lie between 3 and 7.

The following two theorems are other forms in which the Chebyshev inequality is expressed.
Theorem 5.3(a)
Let X be a random variable with finite expectation E(X) = µ and finite variance Var(X). Then for any positive number ε,

P(|X − µ| ≥ cσ) ≤ 1/c²

where ε = cσ and σ = √Var(X).

In words, the above theorem states that the probability that X assumes a value outside the interval from µ − cσ to µ + cσ is never more than 1/c².
Theorem 5.3(b)
Let X be a random variable with finite expectation E(X) = µ and finite variance Var(X). Then for any real number ε > 0,

P(|X − µ| < cσ) ≥ 1 − 1/c²

where ε = cσ and σ = √Var(X).

This states that the probability of the event that X takes on a value x which is within c standard deviations of its expectation is at least 1 − 1/c², no matter what c happens to be. That is, the probability that X assumes a value within the interval from µ − cσ to µ + cσ is never less than 1 − 1/c².

Note
Theorems 5.3(a) and 5.3(b) are used only when c > 1. Now, if c < 1, then 1 − 1/c² < 0 or 1/c² > 1, but we know that the probability of any event ranges from zero to one. Thus, the Chebyshev inequality of Theorems 5.3(a) and 5.3(b) is trivially true when c < 1.
Example 5.4
Suppose a random variable X has an expectation µ = 4.6 and a variance σ² = 2.25. Find the bounds for the following probabilities:

(a) P(|X − µ| < 2σ)     and     (b) P(|X − µ| < 3σ)

Solution
µ = 4.6, σ = √2.25 = 1.5

(a) P[|X − 4.6| < 2(1.5)] ≥ 1 − 1/2² = 3/4 = 0.75

Thus,

P(|X − 4.6| < 3) ≥ 0.75

(b) P[|X − 4.6| < 3(1.5)] ≥ 1 − 1/3² = 8/9 = 0.8889

Thus,

P(|X − 4.6| < 4.5) ≥ 0.8889

From Example 5.4, we can say that the probability that the random variable X will take on a value within two standard deviations from the mean is at least 3/4, and the probability that X will take on a value within three standard deviations from the mean is at least 8/9. It is in this sense that the standard deviation σ controls the spread or dispersion of the distribution of a random variable.
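The bounds of Example 5.4 can be reproduced with a short sketch of Theorem 5.3(b); the bound is distribution-free.

```python
mu, sigma = 4.6, 2.25 ** 0.5     # mean and standard deviation from Example 5.4

def chebyshev_lower_bound(c):
    """Lower bound for P(|X - mu| < c*sigma), valid for any distribution with c > 1."""
    return 1 - 1 / c ** 2

for c in (2, 3):
    print(c, mu - c * sigma, mu + c * sigma, chebyshev_lower_bound(c))
# c = 2: P(1.6 < X < 7.6) >= 0.75;  c = 3: P(0.1 < X < 9.1) >= 0.8889
```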
Theorem 5.4
Let X̄n be the sample mean based on a random sample of size n on a random variable X with expectation µ and finite variance σ². Then for any real number ε > 0,

P(|X̄n − µ| ≥ ε) ≤ σ² / (n ε²)

or equivalently,

P(|X̄n − µ| < ε) ≥ 1 − σ² / (n ε²)

where n is the sample size.
Example 5.5
Let X̄ denote the mean of random variables Xi with expectation µ = 100 and variance σ² = 2.5. Find the bound for the probability that in 120 trials the mean differs from the expected value in absolute terms by less than 0.8.

Solution
ε = 0.8, n = 120, σ² = 2.5

P(|X̄n − µ| < 0.8) ≥ 1 − 2.5 / [120(0.8)²] = 0.967
Example 5.6
Referring to the X̄ of Example 5.5, determine the size of n such that

P(|X̄n − 100| < 0.8) ≥ 0.99

Solution
Comparing this with Theorem 5.4 we have:

1 − σ² / (n ε²) = 0.99
⇒ σ² / (n ε²) = 0.01

Rearranging we obtain

n ≥ σ² / (0.01 ε²) = 2.5 / [0.01(0.8)²] = 390.6

That is, we need at least 391 trials in order that the probability will be at least 0.99 that the sample mean X̄n will lie within 0.8 of the expectation.
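A sketch of the sample-size calculation in Example 5.6: n is the smallest integer for which 1 − σ²/(nε²) is at least the required probability.

```python
import math

sigma2 = 2.5      # population variance from Example 5.5
eps = 0.8         # allowed deviation of the sample mean from mu
p = 0.99          # required probability level, so q = 1 - p

n = sigma2 / ((1 - p) * eps ** 2)   # Chebyshev sample-size formula
print(n, math.ceil(n))              # 390.625 trials, so at least 391
```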
Going through the solution of Example 5.6, we may write the formula for determining the sample size n by Chebyshev's inequality as

n = σ² / (q ε²)

where q = 1 − p.

Another form of Chebyshev's inequality which is of great importance is the Bernoulli form, which is an application of the De Moivre-Laplace Integral Theorem.

Theorem 5.5
Let Mn be the number of successes in n independent Bernoulli trials, with probability of success p in each trial, and let q = 1 − p. Then for any ε > 0,

P(|Mn − np| ≥ ε) ≤ npq / ε²

or equivalently,

P(|Mn − np| < ε) ≥ 1 − npq / ε²

Proof
The theorem follows from Chebyshev's inequality (Theorem 5.2) by remembering that E(Mn) = np and Var(Mn) = npq.

The reader should attempt Exercise 5.26 (a). It is a typical example of Theorem 5.5.

Theorem 5.5 is true for any value we give to ε > 0. In particular, if we replace ε by nδ, where δ is thought of as "small", then we have the following theorem.

Theorem 5.6
Let Mn be the number of successes in n independent Bernoulli trials, with probability of success p in each trial, and let q = 1 − p. Then for any δ > 0,

P(|Mn/n − p| ≥ δ) ≤ pq / (n δ²)

or equivalently,

P(|Mn/n − p| < δ) ≥ 1 − pq / (n δ²)
For any value of δ, the right side of the second part of Theorem 5.6
converges to 1 as n → ∞. What it means is that for any value of δ, no
matter how small, the probability that the proportion of successes in n
trials differs from the theoretical probability p by less than δ tends to 1 as
the number of trials increases without bound. That is, it is guaranteed that
the observed relative frequency of successes will converge to the theoretical
relative frequency (as measured by p) as the number of trials tends to infinity
(see Section 5.4).
Example 5.7
The probability of the occurrence of an event A in each trial of an experiment is 2/3. Using Chebyshev's inequality, find a lower bound for the probability that in 10,000 trials the deviation of the relative frequency of the event A from the true probability of A will be less than 0.01.

Solution
The relative frequency in n independent trials is a random variable. Its expectation equals p = 2/3.

We are required to calculate

P(|Mn/10,000 − 2/3| < 0.01)

Then,

P(|Mn/10,000 − 2/3| < 0.01) ≥ 1 − (2/3)(1/3) / [(10,000)(0.01)²]
                            = 1 − 2/9 = 7/9 ≈ 0.778

Thus, with probability not less than 0.778 we may expect that in 10,000 trials the relative frequency of event A will deviate from its probability by less than 0.01.
Example 5.8
How many times should a fair die be tossed in order to be at least 95% sure that the relative frequency of having a four come up is within 0.01 of the theoretical probability 1/6?

Solution
p = 1/6,   1 − p = 5/6   and   ε = 0.01

Let Mn/n be the relative frequency. Then we require

P(|Mn/n − p| < 0.01) ≥ 0.95

By the second Bernoulli Theorem,

P(|Sn/n − p| < ε) ≥ 1 − pq / (n ε²)

Thus

1 − pq / (n ε²) = 0.95
⇒ pq / (n ε²) = 0.05
⇒ n = pq / (0.05 ε²) = (1/6)(5/6) / [0.05(0.01)²] = 27777.778 ≈ 27,778
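The calculations of Examples 5.7 and 5.8 in a short sketch, using the bound 1 − pq/(nδ²) for the relative frequency.

```python
import math

# Example 5.7: lower bound for P(|M_n/n - p| < 0.01) with p = 2/3, n = 10000
p, n, delta = 2 / 3, 10_000, 0.01
bound = 1 - p * (1 - p) / (n * delta ** 2)
print(bound)                               # about 0.778

# Example 5.8: smallest n with 1 - pq/(n*delta^2) >= 0.95 for a fair die (p = 1/6)
p, delta, level = 1 / 6, 0.01, 0.95
n_required = p * (1 - p) / ((1 - level) * delta ** 2)
print(n_required, math.ceil(n_required))   # 27777.8, so about 27,778 tosses
```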
The Chebyshev inequality provides only crude and general results. This limitation arises because of its complete universality; it is valid under any distribution provided both the expectation and the variance are finite. But this condition will always be satisfied when the number of values of the random variable X is finite. In the process of achieving such a general result, however, this inequality is not particularly tight in terms of the bound achieved on particular distributions. For most distributions that arise in practice, there are far sharper bounds for P(|X − µ| < ε) than that given by Chebyshev's inequality. For instance, if the random variable X is normally distributed, then the following formula derived in Chapter 11 of Volume I is more exact:

P(|X − µ| < ε) = 2Φ(ε/σ) − 1
Example 5.9
Referring to Example 5.4, use the Normal distribution to find the probabilities and compare the results.

Solution
µ = 4.6, σ = √2.25 = 1.5

(a) ε = 2σ = 2(1.5) = 3

P(|X − 4.6| < 3) = 2Φ(3/1.5) − 1 = 2Φ(2) − 1 = 0.9545

(b) ε = 3σ = 3(1.5) = 4.5

P(|X − 4.6| < 4.5) = 2Φ(4.5/1.5) − 1 = 2Φ(3) − 1 = 0.9973

In both cases the probabilities are greater than those obtained by Chebyshev's inequality in Example 5.4.
Note that the lower bound in Theorem 5.2(b) is informative only when

1 − Var(X)/ε² > 0,   or   Var(X)/ε² < 1,   or   Var(X) < ε²

If, on the other hand, Var(X) > ε², then the right side of the inequality of Theorem 5.2 will become negative and the Chebyshev inequality will give

P(|X − µ| < ε) ≥ −a

where a > 0.

Thus, Chebyshev's inequality is trivial for the case when Var(X) > ε². This of course reduces the role of Chebyshev's inequality in its application to practical problems; however, its theoretical importance is great. It is the starting point for several theoretical developments. It provides us with a convenient interpretation of the concept of variance (or standard deviation). It can also be used to provide a simple proof for the law of large numbers in the next section.
Note
The inequality that helps us derive bounds for sums of independent random variables is Kolmogorov's inequality. Suppose that X1, X2, · · ·, Xn are independent random variables with mean zero and finite variances, and let Sk = X1 + X2 + · · · + Xk. Then for ε > 0,

P( max(1 ≤ k ≤ n) |Sk| ≥ ε ) ≤ Var(Sn) / ε²

For the proof see page 248 of Billingsley, P. (1979), listed in the bibliography.
A sequence of random variables {Xn} is said to converge in probability to θ if, for every ε > 0,

lim (n → ∞) P(|Xn − θ| > ε) = 0

in which case we write

Xn →p θ

Synonyms for convergence in probability are stochastic convergence, convergence in measure, or weak convergence.

There are two forms of the law of large numbers, namely, the weak and the strong laws of large numbers. For our purposes, which are statistical, the weak law of large numbers is the central concept, and when the "law of large numbers" is referred to without qualification, this one is implied. We shall discuss it in much detail later, but now we state, without proof, the strong law of large numbers.(10)

5.4.2 Strong Law of Large Numbers

Even though the strong law of large numbers may not be realistic, what might not be realised in the real-world situation may sometimes be achieved in a purely theoretical sense. Such a possibility was unravelled by Borel in 1909, who established the strong law of large numbers in the case of independent Bernoulli trials.

Let Mn be a random variable of the number of successes in n Bernoulli trials, so that Mn/n is the proportion of successes. Then

P( lim (n → ∞) Mn/n = p ) = 1

where p is the probability of success.

(10) The other type of convergence that plays an important role in probability theory is convergence in distribution, also called complete convergence or weak convergence. Suppose {Xn} is a sequence of random variables (n = 1, 2, · · ·) and let Fn(t) be their c.d.f.'s. If, for every t at which F0(t) is continuous, lim (n → ∞) Fn(t) = F0(t), then Xn is said to converge in distribution to X0, denoted by Fn(t) →d F0(t).
The proof of this theorem goes beyond this book. To see it refer to page
250 of Billingsley, P. (1979), listed in the bibliography.
The strong law of large numbers makes better sense than the weak law
and it is indispensable for certain theoretical investigations. It is indeed the
foundation of a mathematical theory of probability based on the concept of
relative frequency. It is also known that the sample mean, X n , converges
with probability 1 to the mean µ, provided that the latter exists. The
strong law of large numbers is usually not given much attention in most
statistical textbooks, including this one partly because it is not realistic to
assert this almost sure convergence, for in any experiment we can neither be
100 percent sure nor 100 percent accurate, otherwise the phenomenon will
not be a random one. Secondly, in general, it is easier to prove the weak
law of large numbers than the strong law of large numbers.
The weak law of large numbers is one of the earliest and most famous of
the limit laws of probability. We shall consider two of its forms, namely,
Bernoulli law (which relates to proportions) and Khinchin law (which relates
to the means).
The first formulation of the law of large numbers, known as the Bernoulli law of large numbers, was given and proved by Jakob Bernoulli and published posthumously in his book "Ars Conjectandi" in 1713 as a crowning achievement. It states that if Mn represents the number of successes in n identical Bernoulli trials, with probability of success p in each trial, then the relative frequency Mn/n is very likely to be close to p when n is a sufficiently large and fixed integer. The law in a sense justifies the use of the frequency definition of probability discussed in Chapter 3 of Volume I and it brings the theory of probability into contact with practice.

Mathematically, the Bernoulli law of large numbers may be expressed in the following theorem: if Mn is the number of successes in n Bernoulli trials with probability of success p in each trial, then for any ε > 0,

lim (n → ∞) P(|Mn/n − p| < ε) = 1
That is, for n large, Mn/n is very close to p. In other words, the law of large numbers may be stated as follows: in n trials, the probability that the relative number of successes Mn/n deviates numerically from the true probability p by not more than ε (ε > 0) approaches 1 as n approaches infinity.
The Khinchin form of the weak law of large numbers states that

lim (n → ∞) P(|X̄n − µ| < ε) = 1

or equivalently,

lim (n → ∞) P(|Sn/n − µ| < ε) = 1

where Sn = X1 + X2 + · · · + Xn.

Proof
By Theorem 5.4,

P(|X̄n − µ| < ε) ≥ 1 − σ² / (n ε²)

so that

1 − σ² / (n ε²) ≤ P(|X̄n − µ| < ε) ≤ 1

Taking limits as n → ∞,

lim [1 − σ² / (n ε²)] ≤ lim P(|X̄n − µ| < ε) ≤ 1
Aliter
We first find E(X̄n) and Var(X̄n):

E(X̄n) = (1/n) Σ E(Xi) = µ     (see Chapter 7)

It then follows that

lim (n → ∞) P(|X̄n − µ| ≥ ε) = 0

or equivalently,

lim (n → ∞) P(|Sn/n − µ| ≥ ε) = 0

where Sn = X1 + X2 + · · · + Xn.
We shall observe in Theorem 5.17 in the sequel that Polya's Theorem is the central limit theorem of means.
Then

P(Sn* ≤ s*) → Φ(s*)     as n → ∞

where Φ is the c.d.f. of the Standard Normal distribution.

For the proof see page 316 of Dudewicz and Mishra (1988), listed in the bibliography.

The Central Limit Theorem is concerned with the distribution of the "sum of random variables". If X1, X2, . . . is a sequence of independent random variables, then we know from the law of large numbers that Sn/n converges to E(X) in probability. The Central Limit Theorem is concerned not with the fact that the ratio Sn/n converges to E(X) but with how it fluctuates around E(X). To analyse these fluctuations, we standardise the sum:

Tn = [Sn − E(Sn)] / √Var(Sn)          (i)
As usual we shall discuss two forms of CLT, namely, the Khinchin form
(continuous case) and the Bernoulli form (discrete case.)
Continuous Case

Substituting these expressions in (i) above, the result follows immediately.

Note
These theorems will also hold when X1, · · ·, Xn are independent random variables with the same mean and the same finite variance but not necessarily identically distributed.

Discrete Case
The De Moivre-Laplace Theorem discussed in Chapter 12 of Volume I is actually a special case of the Central Limit Theorem, namely, the Bernoulli form of CLT of frequency.
The Bernoulli and other forms of the central limit theorem arise out of Theorem 5.15. For instance, if the random variables Xi are independent Bernoulli random variables with sum

Mn = Σ (i = 1 to n) Xi

then

Yn = (Mn − np) / √(npq)

tends to the standard Normal distribution as n → ∞.
The computational significance of the CLT is that for large n, we can express the cumulative distribution function of Tn in terms of N(0, 1) as follows. In the case of frequencies,

P(Yn ≤ y) ≈ Φ[(y − np) / √(npq)]

where y is a particular value of Yn. In the case of continuous random variables,

P(Tn ≤ t) ≈ Φ[(t − nµ) / (σ√n)]

where t is a particular value of Tn.

The logical question that arises at this point is: how large should n be to enable us to apply the Central Limit Theorem? There is no single answer, since this depends on the closeness of approximation required and on the actual distribution forms of the Xi's. If the random variables Xi are normally distributed, then no matter how small n is, their sum is also normally distributed and P(Tn ≤ t) provides exact probabilities. If nothing is known about the distribution patterns of the Xi's, or if the distribution of the Xi's differs greatly from normality, then n must be large enough to guarantee approximate normality for Sn. One rule of thumb states that, in most practical situations, n equal to at least 30 is satisfactory. In general, the approximation of the sum of random variables to normality becomes better and better as the sample size increases.
Example 5.10
A die is thrown ninety times. At a given throw the expected number of points is 7/2 and the variance is 32/12. Find

(a) the expectation of the sum of points;
(b) the variance of the sum of points;
(c) the probability that the sum of points will be at most 300.

Solution
n = 90,  µ = 7/2,  σ² = 32/12

(a) E(Sn) = nµ = 90(7/2) = 315

(b) Var(Sn) = nσ² = 90(32/12) = 240

(c) s = 300

P(Sn ≤ s) = P[(Sn − nµ)/(σ√n) ≤ (s − nµ)/(σ√n)] = P[Tn ≤ (s − nµ)/(σ√n)]

Thus,

P(Sn ≤ 300) = P[Tn ≤ (300 − 315)/(√(32/12) √90)]
            = P[Tn ≤ −15/((1.6330)(9.4868))]
            = P(Tn ≤ −0.968)
            = 1 − Φ(0.97)
            = 1 − 0.8340 = 0.166
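A sketch of the normal-approximation step of Example 5.10, with the standard normal c.d.f. Φ computed from the error function; it reproduces the probability found above (the per-throw mean and variance are used exactly as given in the example).

```python
import math

def phi(z):
    """Standard normal c.d.f."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, mu, var = 90, 7 / 2, 32 / 12      # per-throw mean and variance as given in the example
e_sn = n * mu                        # E(S_n) = 315
sd_sn = math.sqrt(n * var)           # sqrt(Var(S_n)) = sqrt(240)
print(e_sn, n * var, phi((300 - e_sn) / sd_sn))   # P(S_n <= 300), about 0.166
```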
Zn = (X̄n − µ) / (σ/√n)

tends to the Standard Normal distribution as n → ∞.

Note
Zn is the standardised mean of the random variables X1, · · ·, Xn and it is equal to Tn in Theorem 5.16.

Proof
The proof is quite sophisticated. An outline of a proof is given in Hoel (1971), and Freund and Walpole (1971), listed among the references.
Example 5.11
Suppose that I.Q. is a random variable with mean µ = 100 and standard deviation σ = 25. What is the probability that in a class of 40 students (a) the average I.Q. exceeds 110?

Solution
n = 40,  µ = 100,  σ = 25,  σX̄ = 25/√40 = 3.9528

By the Central Limit Theorem, the mean I.Q. of the 40 students is approximately normally distributed. Hence

(a) The probability that the average I.Q. exceeds 110 is

P(X̄ ≥ 110) = P[Z ≥ (110 − 100)/3.9528]
            = P(Z ≥ 2.53)
            = 1 − Φ(2.53)
            = 1 − 0.9943
            = 0.0057
Theorem 5.18
Suppose that X1, X2, · · ·, Xn are independent and identically distributed random variables, each having mean µ and Var(X) = σ². Suppose also that the average of the measurements, X̄, is used as an estimate of µ. Then

P(|X̄ − µ| < ε) ≈ Φ(ε√n / σ) − Φ(−ε√n / σ)

Proof
Suppose that we wish to find

P(|X̄ − µ| < ε)

for some constant ε > 0. To use the Central Limit Theorem to approximate this probability, we standardise the mean, using E(X̄) = µ and Var(X̄) = σ²/n:

P(|X̄ − µ| < ε) = P(−ε < X̄ − µ < ε)
              = P[−ε/(σ/√n) < (X̄ − µ)/(σ/√n) < ε/(σ/√n)]
              ≈ Φ(ε√n / σ) − Φ(−ε√n / σ)

If we use the Standard Normal Table I in the Appendix, then we may write this formula as

P(|X̄ − µ| < ε) ≈ 2Φ(ε√n / σ) − 1

On the other hand,

P(|X̄ − µ| < ε) ≈ 2Ψ(ε√n / σ)

if we use Table II in the Appendix.
Example 5.12
Suppose that 25 measurements are taken with σ = 1.2. Find the probability that the sample mean X̄ deviates from µ by less than 0.3.

Solution
n = 25,  σ = 1.2,  ε = 0.3

P(|X̄ − µ| < 0.3) ≈ Φ(0.3√25 / 1.2) − Φ(−0.3√25 / 1.2)
               = 2Φ(0.3√25 / 1.2) − 1
               = 2Φ(1.25) − 1
               = 0.7888
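A sketch of the computation in Example 5.12, using P(|X̄ − µ| < ε) ≈ 2Φ(ε√n/σ) − 1.

```python
import math

def phi(z):
    """Standard normal c.d.f."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, sigma, eps = 25, 1.2, 0.3
z = eps * math.sqrt(n) / sigma          # 1.25
print(2 * phi(z) - 1)                   # about 0.789
```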
This sort of reasoning can be turned around. That is, given ε and θ, n can be found such that

P(|X̄n − µ| < ε) ≥ θ

With the Central Limit Theorem we can derive a formula for calculating the value of n such that the probability that the mean deviates from µ by less than ε is θ:

n = [(σ/ε) Φ⁻¹((1 + θ)/2)]²

if we use the Full Normal Table (Table I in the Appendix), or equivalently,

n = [(σ/ε) Ψ⁻¹(θ/2)]²

using the Half Normal Table (Table II in the Appendix).
Example 5.13
Refer to Example 5.12. Suppose σ = 1.2 and ε = 0.3. Find n such that the probability that the mean deviates from µ by less than 0.3 is 0.75.

Solution
σ = 1.2,  ε = 0.3,  θ = 0.75

Therefore

n = [(σ/ε) Φ⁻¹((1 + θ)/2)]² = [(1.2/0.3) Φ⁻¹(0.875)]² = [4(1.15)]² = 21.16 ≈ 22
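A sketch of the sample-size formula of this section, evaluated with the σ, ε and θ quoted in the example statement; statistics.NormalDist().inv_cdf plays the role of Φ⁻¹.

```python
from math import ceil
from statistics import NormalDist

def clt_sample_size(sigma, eps, theta):
    """n such that P(|X_bar - mu| < eps) is approximately theta (CLT formula)."""
    z = NormalDist().inv_cdf((1 + theta) / 2)
    return (sigma / eps * z) ** 2

n = clt_sample_size(sigma=1.2, eps=0.3, theta=0.75)   # values from the example statement
print(n, ceil(n))    # about 21.2, so take n = 22
```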
EXERCISES
5.1 Refer to Example 5.1. Find the bounds for the probability that the
month’s production will be
5.2 The mean lifetime of a certain electrical device is 4 years. Find the
lower bounds for the probability that a randomly selected device from
a consignment of such devices will not exceed 20 years.
5.3 The amount of savings in a certain Rural Bank is ten million cedis.
Suppose the probability that an amount of at most two hundred thou-
sand cedis drawn by a customer selected at random is 0.8. What can
we say about the number of customers.
5.4 Refer to Example 5.2. Find the bounds for the probability that the
number of bulbs switched on in the area in that evening is different
from its expected value in absolute terms by
xi 1 2 3 4 5 6
p(xi ) 0.05 0.10 0.25 0.30 0.20 0.10
5.6 Suppose that a random variable has mean 35 and standard deviation
5. Use the Chebyshev’s inequality to estimate the probability that an
outcome will lie between
(a) 24 and 46 (b) 18 and 52 (c) 31 and 39
5.8 Suppose that a random variable has mean 80 and standard deviation 6. Use Chebyshev's inequality to find the value of ε for which the probability that the outcome lies between 80 − ε and 80 + ε is at least 5/12.

5.9 Suppose that a random variable has mean 25 and standard deviation 0.67. Use Chebyshev's inequality to find the value of ε for which the probability that the outcome lies between 25 − ε and 25 + ε is at most 10/13.
5.10 Refer to Example 5.4. Find the bounds for the following probabilities:
(a) P (|X − µ| ≤ σ)
(b) P (|X − µ| ≥ 3.3σ)
(c) P (|X − µ| ≤ 2.5σ)
(d) P (|X − µ| ≥ 1.65σ)
(e) P (|X − µ| ≥ 2.2σ)
5.11 Suppose that the number of hours a certain type of light bulb will burn before requiring replacement has a mean of 2,000 hours and a standard deviation of 150 hours. If 1,000 such bulbs are installed in a new house, estimate the number that will require replacement between 1,200 and 2,800 hours from the time of installation.
5.12 Refer to Example 5.7. Find an upper bound for the probability that
in 5,000 trials the deviation of the relative frequency of the event A
from its probability will not exceed 0.08.
5.15 The final scores of the students in Diploma class over a period of four years have a mean 60 and a variance 64. In a particular year there
5.18 Suppose that the number of students that enrol in the Basic Statistics
course in the Faculty of Social Studies at the University of Ghana is
a Poisson random variable with mean 500. The Co-ordinator of the
course has decided that if the number enrolling is 350 or more he will
split the group into two, otherwise they will all be in the same class.
What is the probability that the class will be split?
5.19 The mean and standard deviation of the ages of statistics students of
the University of Ghana are 20 years and 5.8 years respectively. What
is the probability that a random sample of 50 students will have a
mean age of between 18 and 23 years?
5.20 The mean age and the standard deviation of Statistics students are
18 years and 1.8 years respectively. What is the probability that a
random sample of 50 students will have a mean of 16 and 20 years?
Let S̄n = (1/n) Σ (i = 1 to n) Xi. Use Chebyshev's inequality to estimate the minimum possible value of n, given that

For this value of n, calculate the upper bound to the given probability obtained by applying Chebyshev's inequality.
5.25 Twenty numbers are rounded off to the nearest integer and then added. Assume the individual round-off errors are independent and uniformly distributed over (−1/2, 1/2). Find the probability that the given sum will differ from the sum of the original twenty numbers by 3 or more using

(a) Chebyshev's inequality (b) Central Limit Theorem.
5.26 Suppose a coin is tossed 26 times. Estimate the probability that the
number of heads will deviate from 12 by less than 4, using
(a) Chebyshev’s inequality (b) Central Limit Theorem.
Chapter 6
SAMPLING DISTRIBUTIONS I
Basic Concepts

6.1 INTRODUCTION
In Volume II (Nsowah-Nuamah, 2018) we considered some special probability
distributions. These included the Bernoulli distribution, Binomial distribu-
ADVANCED TOPICS IN INTRODUCTORY
tion, Geometric distribution, Negative binomial distribution, Poisson dis-
PROBABILITY: A FIRST COURSE IN
tribution, Hypergeometric
PROBABILITY THEORY – VOLUMEdistribution,
III Multinomial SAMPLING
distribution, Uniform I Basic Concepts
DISTRIBUTIONS
distribution, Exponential distribution, Gamma distribution, Beta distribu-
tion and Normal distribution.
In Chapters 1 to 4 of this book, we extended the concept of probability
distributions to bivariate distributions. In Chapter 5 we discussed the basic
Inequalities in Statistics (Markov’s and Chebyshev’s Inequalities) and the
two main Limit Laws (the Central Limit Theorem and the Law of Large
Numbers) which have wider and practical applications in Statistics.
We can now conclude the book with some analytical statistics con-
cerned with inferential procedures. The present chapter and the subse-
quent ones serve as a bridge between the topics discussed in this book and
statistical inference.
6.2.1 Population
The elements in an infinite population cannot be counted completely no matter how long the counting process is carried on. An example of an infinite population is an experiment of tossing a coin to determine whether or not the coin is biased. Theoretically, this experiment could be carried out an infinite number of times.

Philosophically, no truly infinite population of physical objects exists. After all, given unlimited resources and time, we could enumerate even the grains of sand on a particular seashore. As a practical matter, then, we will use the term infinite population when we are talking about a population that could not be enumerated in a reasonable period of time.
6.2.2 Sample
It is obvious that in an infinite population complete enumeration cannot be carried out.
Population Parameter
Examples of statistic include the sample mean, the sample proportion, the
sample variance, the sample range.
As a rule, small populations, for obvious reasons, are not sampled. Instead,
the entire population is examined. A sample that contains all the members in
the population is called an exhaustive sampling, or a 100 per cent sampling,
which of course, are only other names for census.
There are basically two types of sampling: probability sampling and
non-probability sampling. In this book we shall consider only probability
sampling because it is only for probability sampling that there are sound
statistical procedures for drawing conclusions concerning the population of
interest based on the sample drawn.
Probability sampling is the scientific method by which the units of the sam-
ple are chosen based on some definite pre-assigned probability. The various
types of probability sampling include the cases where:
(a) each member of the population has an equal chance of being selected;
For a probability sample neither the sampler nor the member of the popu-
lation can decide which member will be included in the sample. The selec-
tion is achieved by the operation of chance alone. Synonym for probability
sampling is random sampling. Random sampling methods include simple
random sampling, systematic sampling, and stratified sampling. Samples
obtained by taking every k th name in a list are called systematic samples.
Sometimes a population P is divided into groups, P1 , P2 , · · · , Pr called
strata and random samples are taken from these strata and combined to
get a stratified random sample. This is often done when we want to be sure
that we shall have specified numbers of subjects from each stratum. Other
sampling techniques include cluster sampling and multistage sampling. For
details about sampling techniques see Nuamah (1994).
The simple random sampling is the only one that can be considered a
true random sampling and in this book unless otherwise stated, by random
sampling we mean simple random sampling.
The values in the i-th column may be considered the values of a random variable $X_i$ corresponding to the i-th trial of the sample, having probability density function $f_i(x_i)$.
The probability density function f (xi ) has to fulfill two conditions in order
to describe the process of simple random sampling. These are given in
Definition 6.9.
(a) the successive selections of the elements of the population are independent, and
(b) the density function of the random variable remains the same from selection to selection as that of the population.
In other words, in simple random sampling (a) each member of the population has an equal probability of being included in the sample, and (b) different members of the population are selected independently (that is, the selection of one member has no effect on the selection of another).
Simple random sampling is any technique designed to draw sample members from a population in such a way that each member in the population has an equal chance of being selected.

A table of random numbers has two properties:
(a) in any position in the table, each of the numbers 0 through 9 has an equal probability of 1/10 of occurring;
(b) the occurrence of any number in one part of the table is independent of the occurrence of any number in any other part of the table (i.e. knowing the numbers in one part of the table tells us nothing about the numbers in another part of the table).
Table X of Appendix A presents a table of random numbers.
We shall illustrate in the following example how random numbers are used
in the selection of a sample.
Example 6.1
Select 15 students from a student population of 250 using a table of random
numbers.
Solution
Suppose the first number read from the table is 46201. The first three digits of this number are 462. This number is thrown out because it is larger than the population size (N = 250); there is no student whose number is 462. Continuing in this way, the following 15 student numbers are selected:
045 211 006 194 214 212 029 276 044 165 154 080 076 043 244
Note
If, by chance, the same number occurs more than once, we ignore it after it has appeared for the first time.
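For readers who prefer to carry out the selection by computer rather than with a printed table, the following Python sketch mimics the procedure of Example 6.1 (the population size 250 and sample size 15 come from the example; the random seed is an arbitrary illustrative choice).

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Select distinct serial numbers 1..population_size by reading random
    numbers and discarding out-of-range values and repeats, as with a
    table of random numbers."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(1, 999)               # a three-digit random number
        if number <= population_size and number not in chosen:
            chosen.append(number)                  # keep it; otherwise throw it out
    return chosen

print(simple_random_sample(250, 15, seed=1))       # 15 distinct student numbers
```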
When a simple random sample of size n is drawn without replacement from a finite population of size N, the number of possible samples is
$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$$
The probability for each subset of n of the N objects of the finite population is
$$\frac{1}{\binom{N}{n}}$$
We can also determine the joint probability distribution of the random variables from a random sample of size n from a finite population by
$$f(x_1, x_2, \ldots, x_n) = \frac{1}{N(N-1)\cdots(N-n+1)}$$
To generalise: when a simple random sample of size n is drawn from a finite population with replacement, or from an infinite population with or without replacement, we have n identically distributed random variables $X_1, X_2, \ldots, X_n$, all possessing the same distribution as the parent population. Also, if the population is finite but large, then even when sampling is without replacement the sample observations $x_i$ can in practice be treated as if they were independent and identically distributed.
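These counting results are easy to check numerically. The sketch below is illustrative only (the population {4, 5, 7, 8} anticipates Example 7.1): it enumerates the unordered samples of size n drawn without replacement and the ordered ones, confirming that each subset has probability 1/C(N, n) and each ordered sample has probability 1/[N(N-1)...(N-n+1)].

```python
from itertools import combinations, permutations
from math import comb

population = [4, 5, 7, 8]                       # small illustrative population, N = 4
n = 2

subsets = list(combinations(population, n))     # unordered samples
ordered = list(permutations(population, n))     # ordered samples

print(len(subsets), comb(len(population), n))   # 6 and C(4, 2) = 6
print(1 / comb(len(population), n))             # probability of each subset
print(len(ordered), 1 / len(ordered))           # 12 and 1/(4*3)
```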
EXERCISES
Chapter 7

SAMPLING DISTRIBUTIONS II
Sampling Distribution of Statistics

7.1 INTRODUCTION
7.1.1 Definition of Statistic

Having determined in Chapter 6 what a random sample is, we can now examine the distributions of statistics calculated from random samples. Though we have already defined what a “statistic” is, we shall now present its mathematical definition.
Definition 7.1 STATISTIC
Let $X_1, X_2, \ldots, X_n$ be a random sample from a random variable and let $x_1, x_2, \ldots, x_n$ be the values assumed by the sample. Then the real-valued function
$$Y = G(X_1, X_2, \ldots, X_n)$$
is a statistic, which assumes the value
$$y = G(x_1, x_2, \ldots, x_n)$$
Empirically, the sampling distribution of a statistic may be constructed from a finite population as follows:
(a) Select all possible samples of size n from the finite population of size N.
(b) Compute the value of the statistic for each of these samples.
(c) List the different distinct observed values of the statistic together with the corresponding frequency of occurrence of each distinct observed value of the statistic.
For infinite or large finite populations, one could approximate the sampling distribution of a statistic by drawing a large number of independent simple random samples and proceeding in the manner just described. In fact, the actual construction of a sampling distribution according to the steps given above is a tedious task. Fortunately, there are theorems that simplify things for us.
In the subsequent sections we shall introduce sampling distributions for the most frequently encountered statistics: the mean, proportion and variance. We shall now introduce an example that will be used most often in this chapter.
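For a small population the three steps above can be carried out by brute force. The sketch below is an illustration only (it uses the population of Example 7.1 and samples of size two drawn with replacement): it lists all N^n samples, computes the mean of each, and tabulates the distinct values with their relative frequencies.

```python
from itertools import product
from collections import Counter
from statistics import mean

population = [4, 5, 7, 8]                         # Example 7.1
n = 2

samples = list(product(population, repeat=n))     # step (a): all N**n samples
means = [mean(s) for s in samples]                # step (b): the statistic
table = Counter(means)                            # step (c): distinct values

for value in sorted(table):
    print(value, table[value] / len(samples))     # value and its relative frequency
print("mean of sampling distribution:", mean(means))   # equals the population mean, 6
```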
Example 7.1
Suppose a population consists of four numbers: 4, 5, 7, 8.
Find (a) the mean and (b) the variance of the population.

Solution
(a) $\mu = \dfrac{4 + 5 + 7 + 8}{4} = 6$
(b) $\sigma^2 = \dfrac{(4-6)^2 + (5-6)^2 + (7-6)^2 + (8-6)^2}{4} = \dfrac{10}{4} = 2.5$
Example 7.2
Refer to Example 7.1.
(a) List all possible samples of size two that can be drawn from the population with replacement.
(b) Find
(i) the mean of the distribution of the means;
(ii) the variance and standard deviation of the distribution of the
means.
Solution
(a) N = 4, n = 2
There are $N^n = 4^2 = 16$ possible samples of size two that can be drawn
from a population of size four (where sampling is with replacement).
These are
(4, 4) (4, 5) (4, 7) (4, 8) (5, 4) (5, 5) (5, 7) (5, 8)
(7, 4) (7, 5) (7, 7) (7, 8) (8, 4) (8, 5) (8, 7) (8, 8)
Note
Here the notation (a, b) is an ordered pair. For example, (4, 5) denotes “first a 4 and then a 5” and is different from (5, 4), which denotes “first a 5 and then a 4.”
(b) (i) The corresponding sample means $\bar{X}_i$ are
4.0 4.5 5.5 6.0 4.5 5.0 6.0 6.5
5.5 6.0 7.0 7.5 6.0 6.5 7.5 8.0
The sample means and their associated probabilities (relative frequencies) are as follows:
$\bar{x}$: 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0
$P(\bar{X}=\bar{x})$: 1/16, 2/16, 1/16, 2/16, 4/16, 2/16, 1/16, 2/16, 1/16
The mean of the sampling distribution of means is therefore
$$\mu_{\bar{X}} = \sum \bar{x}\,P(\bar{X}=\bar{x}) = 6$$
Note
Without calculating the probabilities we could have obtained the mean of the sampling distribution of means from the raw data (the sample means themselves) as:
$$\mu_{\bar{X}} = \frac{\sum \bar{X}}{N^n} = \frac{4.0 + 4.5 + 5.5 + \cdots + 6.5 + 7.5 + 8.0}{4^2} = \frac{96}{16} = 6$$
(ii) The variance of the sampling distribution of means is
$$\sigma_{\bar{X}}^2 = \sum (\bar{x} - \mu_{\bar{X}})^2\,P(\bar{X}=\bar{x}) = 1.25$$
and its standard deviation is $\sigma_{\bar{X}} = \sqrt{1.25} = 1.118$.

Note
The variance of the sampling distribution of means is not equal to the population variance.
As has been pointed out earlier, it is burdensome to construct the
sampling distribution of means in the way it has been done. In practice, we
shall employ the following two theorems to obtain the mean and variance of
the sample mean.
Theorem 7.1
Let X be a random variable with expectation E(X) = µ and variance Var(X) = σ². Let $\bar{X}$ be the sample mean of a random sample of size n. Then the mean of the sampling distribution of means is given by
$$\mu_{\bar{X}} = E(\bar{X}) = \mu$$
The theorem above states that the expected value of the sample mean is the population mean.

Proof
Since the sample mean is defined as
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
we have
$$E(\bar{X}) = \frac{1}{n}\left[E(X_1) + E(X_2) + \cdots + E(X_n)\right] = \frac{1}{n}(n\mu) = \mu$$

Theorem 7.2
If a population is infinite or if sampling is with replacement, then the variance of the sampling distribution of means, denoted by $\sigma_{\bar{X}}^2$, is given by
$$\mathrm{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$
That is, the variance of the sampling distribution is equal to the population
variance divided by the size of the sample used to obtain the sampling
distribution.
Proof
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{X_1}{n} + \frac{X_2}{n} + \cdots + \frac{X_n}{n}\right) = \frac{1}{n^2}\left[\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)\right] = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n}$$
Note
Theorems 7.1 and 7.2 do not assume Normality of the “parent” population.
Definition 7.4 STANDARD ERROR OF MEAN
The positive square root of the variance of the mean is referred to as the standard error of the sample mean

The standard error of the mean measures chance variations of the sample mean from sample to sample. Denoting the standard error by $\sigma_{\bar{X}}$, it is given by
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
This formula shows that the standard error of the mean decreases when n,
the sample size, is increased. This means that when n is sufficiently large
and we actually have more information, sample means can be expected to
be closer to µ, the quantity which X is usually supposed to estimate.
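A small simulation illustrates this behaviour of the standard error. The sketch below is illustrative only (the normal population with µ = 17 and σ² = 5 anticipates Example 7.6; the number of replications is arbitrary): it estimates the standard deviation of the sample mean for several sample sizes and compares it with σ/√n.

```python
import random
from statistics import mean, pstdev
from math import sqrt

mu, sigma = 17, sqrt(5)
random.seed(0)

for n in (5, 20, 80):
    # 5000 simulated sample means, each based on a sample of size n
    sample_means = [mean(random.gauss(mu, sigma) for _ in range(n))
                    for _ in range(5000)]
    print(n, round(pstdev(sample_means), 3), round(sigma / sqrt(n), 3))
```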
Example 7.3
Refer to Example 7.1. Employ Theorems 7.1 and 7.2 to obtain the variance
of the sampling distribution of means if a sample of size two is drawn with
replacement from the population.
Solution
From Example 7.2, N = 4, σ 2 = 2.5.
Also, the sample size n = 2. Hence
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{2.5}{2} = 1.25$$
Although Theorems 7.1 and 7.2 give us some characteristics of the sampling distribution, they do not permit us to calculate probabilities, because we do not know the form of the sampling distribution. To be able to do this we need to use the Central Limit Theorem, as will be seen later.
Example 7.4
Refer to Example 7.1
(a) List all possible samples of size two that can be drawn from the pop-
ulation without replacement.
(b) Find
(i) the mean of the distribution of the means;
(ii) the variance and standard deviation of the distribution of the means.

Solution
N = 4, n = 2
(a) There are $\binom{4}{2} = 6$ samples of size two which can be drawn without replacement, namely,
(4, 5) (4, 7) (4, 8) (5, 7) (5, 8) (7, 8)
Note
The sample (4, 5), for example, is considered the same as (5, 4).
(b) (i) The corresponding sample means are 4.5, 5.5, 6.0, 6.0, 6.5 and 7.5, so the mean of the sampling distribution of means is
$$\mu_{\bar{X}} = \frac{4.5 + 5.5 + 6.0 + 6.0 + 6.5 + 7.5}{6} = 6$$
Note
Once again the mean of the sampling distribution of means is equal to the population mean.

(ii) The variance of this sampling distribution is
$$\sigma_{\bar{X}}^2 = \frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{\binom{N}{n}} = \frac{(4.5-6)^2 + (5.5-6)^2 + \cdots + (7.5-6)^2}{\binom{4}{2}} = \frac{2.25 + 0.25 + 0 + 0 + 0.25 + 2.25}{6} = \frac{5}{6} = 0.83333$$
$$\sigma_{\bar{X}} = \sqrt{\frac{5}{6}} = 0.9129$$

Note
As in the case of sampling with replacement, the variance of the sampling distribution is not equal to the population variance. Moreover, it is not equal to the population variance divided by the sample size ($\sigma_{\bar{X}}^2 = \frac{5}{6} \neq \frac{2.5}{2}$). The formula for obtaining the variance of the sampling distribution of means in the case of sampling without replacement is given in Theorem 7.3.
Theorem 7.3
If $\bar{X}$ is the mean of a random sample of size n from a finite population of size N (that is, if sampling is without replacement), whose mean is µ and whose variance is σ², then
$$E(\bar{X}) = \mu$$
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$$
The formula for obtaining $\mathrm{Var}(\bar{X})$ in Theorem 7.2, which applies to values assumed by independent random variables (or sampling with replacement from a finite population), and the one in Theorem 7.3, which applies to sampling without replacement from a finite population, differ by the factor $\frac{N-n}{N-1}$. If N, the size of the population, is large compared to n, the size of the sample, the difference between the two formulas becomes negligible. Indeed, the formula in Theorem 7.2 is frequently used as an approximation for the variance of the distribution of $\bar{X}$ for samples obtained without replacement from sufficiently large finite populations.
Example 7.5
For the data of Example 7.1, if a sample of size two is drawn without replacement, calculate the variance and hence the standard deviation of the sampling distribution of means.
Solution
Since sampling is without replacement, we use the formula in Theorem 7.3, with σ² = 2.5, N = 4 and n = 2:
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1} = \frac{2.5}{2}\cdot\frac{4-2}{4-1} = 0.83333$$
Hence the standard error of the mean is
$$\sigma_{\bar{X}} = \sqrt{0.83333} = 0.9129$$
Note
These results are equal to the ones obtained in Example 7.4
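Theorem 7.3 can also be checked by brute force for this small population, exactly as in Examples 7.4 and 7.5. The sketch below is such a check (illustrative only).

```python
from itertools import combinations
from statistics import mean, pvariance

population = [4, 5, 7, 8]                         # Example 7.1: mu = 6, sigma^2 = 2.5
N, n = len(population), 2
sigma2 = pvariance(population)

means = [mean(s) for s in combinations(population, n)]   # the 6 samples without replacement
print(mean(means))                                # 6, the population mean
print(pvariance(means))                           # 0.8333...
print(sigma2 / n * (N - n) / (N - 1))             # the same value from Theorem 7.3
```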
Theorem 7.4
If the population from which random samples are taken is normally
distributed with mean µ and variance σ 2 then the sample mean X
σ2
is normally distributed with mean µ and variance
n
Proof
From Theorem 6.22 of Volume I it follows that
$$M_{\bar{X}}(t) = M_{\frac{1}{n}(X_1 + X_2 + \cdots + X_n)}(t) = M_{X_1 + \cdots + X_n}\!\left(\frac{t}{n}\right)$$
Since the sampling is random (that is, simple random), the variables $X_1, X_2, \ldots, X_n$ are independent, and therefore Theorem 6.28 of Volume I may be applied to give
$$M_{\bar{X}}(t) = M_{X_1}\!\left(\frac{t}{n}\right) M_{X_2}\!\left(\frac{t}{n}\right) \cdots M_{X_n}\!\left(\frac{t}{n}\right) \qquad (i)$$
From Definition 7.2 all the random variables $X_1, X_2, \ldots, X_n$ have the same probability density function, namely that of X, and hence the same moment generating function. Consequently, all the moment generating functions on the right-hand side of (i) are the same function, namely the moment generating function of the random variable X. Thus
$$M_{\bar{X}}(t) = \left[M_X\!\left(\frac{t}{n}\right)\right]^n$$
Since X is normally distributed, $M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$; replacing t by t/n in this formula yields
$$M_{\bar{X}}(t) = \left[e^{\mu \frac{t}{n} + \frac{1}{2}\sigma^2 \frac{t^2}{n^2}}\right]^n = e^{\mu t + \frac{1}{2}t^2\frac{\sigma^2}{n}}$$
which is the moment generating function of a Normal distribution with mean µ and variance σ²/n.
Example 7.6
A random sample of size 20 is drawn from a normally distributed population with mean 17 and variance 5.
(a) Find (i) the expectation and (ii) the variance and standard error of the sample mean.
(b) What is the probability that the mean of this sample will fall between 16 and 23?

Solution
(a) (i) $E(\bar{X}) = \mu = 17$
(ii) $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n} = \dfrac{5}{20} = 0.25$, so $\sigma_{\bar{X}} = \sqrt{0.25} = 0.5$
(b) Though n = 20 is small, we still apply the Normal distribution because the population from which the sample is drawn is normally distributed.
$$P(16 < \bar{X} < 23) = P\left(\frac{16-17}{0.5} < \frac{\bar{X}-17}{0.5} < \frac{23-17}{0.5}\right) = P(-2 < Z < 12) = \Phi(12) - \Phi(-2) = 1 - \{1 - \Phi(2)\} = \Phi(2) = 0.9772$$
We are often faced with the problem of sampling from non-normally distributed populations or from populations whose distributions are not known. Under these two conditions we shall have to take large samples, since when the sample size is sufficiently large, by virtue of the Central Limit Theorem (Theorem 5.16), the means of random samples from any distribution will tend to be normally distributed with mean µ and variance $\frac{\sigma^2}{n}$. This permits the use of the Normal distribution approximation as though sampling were from normally distributed populations. Thus inference procedures based on the sample mean can often use the Normal distribution. But we must be careful not to impute normality to the original observations.
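The following sketch illustrates this use of the Central Limit Theorem; the exponential population and the sample size are arbitrary illustrative choices. Even though the individual observations are far from normal, the standardised sample mean behaves very nearly like a standard normal variable for moderate n.

```python
import random
from statistics import mean
from math import sqrt, erf

def phi(z):                                  # standard normal cdf
    return 0.5 * (1 + erf(z / sqrt(2)))

random.seed(0)
mu = sigma = 1.0                             # exponential(1) population: mean 1, sd 1
n, reps = 40, 10000

z_values = [(mean(random.expovariate(1.0) for _ in range(n)) - mu) / (sigma / sqrt(n))
            for _ in range(reps)]

# compare the simulated probability P(Z <= 1) with the normal value Phi(1)
print(sum(z <= 1 for z in z_values) / reps, phi(1.0))
```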
Example 7.7
A random sample of size 100 is taken from a population with mean 60 and
variance 300.
(b) What is the probability that the sample mean will be less than 56?
Solution
n = 100, µ = 60, σ² = 300
(b) The sampling distribution of the mean is not known, but the sample size is large, so by the Central Limit Theorem we approximate it by the Normal distribution, with $\mathrm{Var}(\bar{X}) = \frac{300}{100} = 3$.
$$P(\bar{X} < 56) = P\left(\frac{\bar{X} - 60}{\sqrt{3}} < \frac{56 - 60}{\sqrt{3}}\right) = P(Z < -2.31) = \Phi(-2.31) = 1 - \Phi(2.31) = 1 - 0.9896 = 0.0104$$
Theorem 7.5
If sampling is with replacement, then
(a) $E(\hat{p}) = p$
(b) $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}$
where $\hat{p}$ is the sample proportion
Example 7.8
A consignment of bulbs has 20 percent defective. If a random sample of 250 is drawn with replacement from this consignment, what is the variance and hence the standard error of the sample proportion of defectives?
Solution
p = 0.2, n = 250
$$\mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n} = \frac{0.2(1-0.2)}{250} = 0.00064$$
$$\sigma_{\hat{p}} = \sqrt{0.00064} = 0.0253$$
Theorem 7.6
If sampling is without replacement, then
(a) $E(\hat{p}) = p$
(b) $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}\cdot\dfrac{N-n}{N-1}$
where $\hat{p}$ is the sample proportion

That is, $E(\hat{p})$ is still identical with the population proportion p, but the variance is adjusted by a finite population correction factor $\frac{N-n}{N-1}$.
Example 7.9
In a class of 50 students, twenty are females and thirty are males. A random sample of twelve students is drawn from this class without replacement. Determine the variance and hence the standard error of the proportion of female students in the sample of twelve.
Solution
Let k represent the number of female students in the class. Hence
N = 50, k = 20, n = 12
The population proportion of females in the class is therefore
$$p = \frac{k}{N} = \frac{20}{50} = 0.4$$
Since this is sampling without replacement, we have
$$\sigma_{\hat{p}}^2 = \mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n}\cdot\frac{N-n}{N-1} = \frac{0.4(1-0.4)}{12}\cdot\frac{50-12}{50-1} = 0.01551$$
$$\sigma_{\hat{p}} = \sqrt{0.01551} = 0.1245$$
Theorem 7.7
The variable
$$\frac{\hat{p} - p}{\sigma_{\hat{p}}}$$
approaches the Standard Normal distribution when n becomes infinite, where $\hat{p}$ is a sample proportion that occurs in a sample of size n
The question that now arises is how large the sample size has to be for the use of the Normal approximation to be valid. A widely used criterion is that both np and n(1 − p) must be greater than 5.
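As an illustration of this criterion, the sketch below (using the figures of Example 7.8, so that np = 50 and n(1 − p) = 200 are both well above 5) compares the exact binomial probability P(p̂ ≤ 0.22) with the Normal approximation based on Theorem 7.7.

```python
from math import comb, sqrt, erf

def phi(z):                                       # standard normal cdf
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 250, 0.2
x = int(0.22 * n)                                 # 55 defectives corresponds to p-hat = 0.22

exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))
approx = phi((x / n - p) / sqrt(p * (1 - p) / n))

print(round(exact, 4), round(approx, 4))          # exact binomial vs Normal approximation
```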
Example 7.10
Refer to Example 7.8; what is the probability that at most 22 percent of the sample will be defective?
Solution
n = 250, p = 0.2
A specific value of p̂, denoted by p0 is p0 = 0.22. We are required to calculate
P (p̂ ≤ p0 ).
$$P(\hat{p} \le 0.22) = P\left(Z \le \frac{0.22 - 0.2}{\sqrt{\dfrac{0.2(1-0.2)}{250}}}\right) = P\left(Z \le \frac{0.02}{0.0253}\right) = P(Z \le 0.7906) = \Phi(0.7906) = 0.7854$$
The Normal approximation may be improved by the continuity correction factor, a device that makes an adjustment for the fact that a discrete distribution is being approximated by a continuous distribution. The correction factor is more important if n is small. With the continuity correction factor,
$$P(\hat{p} \le p_0) = P\left(Z \le \frac{p_0 + \frac{0.5}{n} - p}{\sqrt{\frac{p(1-p)}{n}}}\right) = \Phi\left(\frac{p_0 + \frac{0.5}{n} - p}{\sqrt{\frac{p(1-p)}{n}}}\right), \qquad x < np$$
or
$$P(\hat{p} \le p_0) = P\left(Z \le \frac{p_0 - \frac{0.5}{n} - p}{\sqrt{\frac{p(1-p)}{n}}}\right) = \Phi\left(\frac{p_0 - \frac{0.5}{n} - p}{\sqrt{\frac{p(1-p)}{n}}}\right), \qquad x > np$$
Example 7.11
Refer to Example 7.9. Calculate the probability that out of the selected
twelve students, five or less of them will be females.
Solution
n = 12, p = 0.4, $p_0 = \frac{5}{12} = 0.42$. Since x = 5 > np = 4.8, the correction $p_0 - \frac{0.5}{n}$ is used:
$$P(\hat{p} \le 0.42) = P\left(Z \le \frac{0.42 - \frac{0.5}{12} - 0.4}{\sqrt{\dfrac{0.4(1-0.4)}{12}}}\right) = P\left(Z \le \frac{-0.022}{0.1414}\right) = P(Z \le -0.15) = 1 - \Phi(0.15) = 0.4404$$
There are many problems in applied statistics where our interest is in two
populations, such as knowing something about the difference between two
population means or the difference between two population proportions. In
one situation we may wish to know if it is reasonable to conclude that the
two population means (or proportions) are different. In another situation we
may wish to know the magnitude of the difference between two population
means (or proportions). A knowledge of the sampling distribution of the
difference between two means (or proportions) is useful in investigations of this type.
in investigations of
this type.
To empirically construct the sampling distribution of the difference
between two sample means (or proportions) we would adopt the following
procedure.
Suppose there are two populations, Population 1 and Population 2. We would draw all possible random samples of size $n_1$ from Population 1 of size $N_1$. There would be $\binom{N_1}{n_1}$ such samples. For each set of sample data, the sample statistic (mean or proportion) would be computed. From Population 2 of size $N_2$ we would draw separately and independently all possible random samples of size $n_2$, $\binom{N_2}{n_2}$ in all. The sample mean (or proportion) of each sample is computed and the difference between all possible pairs of the means (or proportions) is taken. The sampling distribution of the difference between sample means (or proportions) would consist of all such distinct differences, accompanied by their frequencies or relative frequencies of occurrence.
7.4.2 Sampling Distribution of Difference between Means
Theorem 7.8
Let $X_{11}, X_{12}, \ldots, X_{1n_1}, X_{21}, X_{22}, \ldots, X_{2n_2}$ be $n_1 + n_2$ independent random variables, the first $n_1$ having identical distributions with mean $\mu_1$ and variance $\sigma_1^2$, and the remaining $n_2$ having identical distributions with mean $\mu_2$ and variance $\sigma_2^2$. Then
$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$
and
$$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
where $\bar{X}_1$ and $\bar{X}_2$ are the sample means of Populations 1 and 2 respectively
Example 7.12
Two brands of tyres are being compared by a motor firm. Brand A has a mean life of 24,000 km and a standard deviation of 1,200 km, while Brand B has a mean life of 22,000 km and a standard deviation of 1,080 km. If 100 tyres of Brand A and 90 tyres of Brand B are tested, find the mean and the standard error of the difference between the sample mean lives.
Solution
n1 = 100 n2 = 90
µ1 = 24, 000 µ2 = 22, 000
σ1 = 1, 200 σ2 = 1, 080
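The remaining computation follows directly from Theorem 7.8. The sketch below is illustrative only; it simply applies the theorem to the figures listed above to give the mean and standard error of the difference between the two sample mean lifetimes.

```python
from math import sqrt

n1, n2 = 100, 90
mu1, mu2 = 24_000, 22_000
sigma1, sigma2 = 1_200, 1_080

mean_diff = mu1 - mu2                              # E(X1bar - X2bar)
var_diff = sigma1**2 / n1 + sigma2**2 / n2         # Theorem 7.8
print(mean_diff, var_diff, round(sqrt(var_diff), 1))   # 2000, 27360, 165.4
```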
Suppose two random samples of different sizes are drawn from binomial populations. If we need to compare the number of successes in the samples, we have to work with their proportions.
Theorem 7.9
If independent random samples of different sizes $n_1$ and $n_2$ are drawn from two binomial populations with proportions $p_1$ and $p_2$ respectively, then the distribution of the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$, has mean
$$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$$
and variance
$$\mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$$
Example 7.13
In a national election, 60 percent of voters in Community A are in favour of a certain candidate and 50 percent in Community B are in favour of the candidate. If a sample of 210 voters from Community A and 160 voters from Community B are drawn, find
(a) the expectation, and
(b) the variance and hence the standard error,
of the difference between the two sample proportions.
Solution
$p_1 = 0.6$, $p_2 = 0.5$, $n_1 = 210$, $n_2 = 160$
(a) $E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2 = 0.6 - 0.5 = 0.1$
(b)
$$\mathrm{Var}(\hat{p}_1 - \hat{p}_2) = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} = \frac{0.6(0.4)}{210} + \frac{0.5(0.5)}{160} = 0.00114 + 0.00156 = 0.00271$$
$$\sigma_{(\hat{p}_1 - \hat{p}_2)} = \sqrt{0.00271} = 0.0520$$
Example 7.14
In Example 7.13, if in the samples of voters from Community A and Community B there were 30 and 20 voters respectively in favour of the candidate, what is the probability of obtaining this or a smaller difference in the sample proportions if the belief about the population parameters is correct?

Solution
$x_1 = 30$, $x_2 = 20$
$$p_{01} = \frac{x_1}{n_1} = \frac{30}{210} = 0.143 \qquad p_{02} = \frac{x_2}{n_2} = \frac{20}{160} = 0.125$$
$$p_{01} - p_{02} = 0.143 - 0.125 = 0.018 \qquad p_1 - p_2 = 0.6 - 0.5 = 0.1$$
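The probability asked for can now be obtained from the Normal approximation to the sampling distribution of p̂1 − p̂2, with mean 0.1 and the standard error found in Example 7.13. The sketch below carries out this step (an illustration of the approach; the standard error is recomputed from Theorem 7.9).

```python
from math import sqrt, erf

def phi(z):                                          # standard normal cdf
    return 0.5 * (1 + erf(z / sqrt(2)))

p1, p2, n1, n2 = 0.6, 0.5, 210, 160
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # Theorem 7.9
observed = 30 / 210 - 20 / 160                       # 0.018

z = (observed - (p1 - p2)) / se
print(round(z, 2), round(phi(z), 4))                 # probability of this or a smaller difference
```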
Example 7.15
With reference to Example 7.1, if a sample of size two is drawn with replacement from the population, find
(a) the mean of the sampling distribution of variances;
(b) the variance of the sampling distribution of the variances.

Solution
The sample values and means have been obtained in Example 7.2. For the sake of convenience, we reproduce them here. The sample values $X_{ij}$ for each j-th sample of size two are:
(4, 4) (4, 5) (4, 7) (4, 8) (5, 4) (5, 5) (5, 7) (5, 8)
(7, 4) (7, 5) (7, 7) (7, 8) (8, 4) (8, 5) (8, 7) (8, 8)
Theorem 7.10
Let $X_1, \ldots, X_n$ be a random sample of size n from a random variable X with expectation µ and variance σ². Let s² be the sample variance defined as
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
If sampling is from an infinite population or with replacement from a finite population, then
$$E(s^2) = \frac{n-1}{n}\,\sigma^2$$
Proof
Writing $X_i - \bar{X} = (X_i - \mu) - (\bar{X} - \mu)$,
$$\sum_{i=1}^{n}(X_i - \bar{X})^2 = \sum_{i=1}^{n}\left[(X_i - \mu)^2 - 2(\bar{X} - \mu)(X_i - \mu) + (\bar{X} - \mu)^2\right] = \sum_{i=1}^{n}(X_i - \mu)^2 - n(\bar{X} - \mu)^2$$
Taking expectations and using $E(X_i - \mu)^2 = \sigma^2$ and $E(\bar{X} - \mu)^2 = \sigma^2/n$,
$$E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2$$
so that
$$E(s^2) = \frac{1}{n}E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \frac{n-1}{n}\,\sigma^2$$
It follows that the corrected sample variance $\hat{s}^2 = \frac{n}{n-1}s^2$ is unbiased, since
$$E(\hat{s}^2) = \frac{n}{n-1}\cdot\frac{n-1}{n}\,\sigma^2 = \sigma^2$$
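Theorem 7.10 can be verified by brute force for the small population of Example 7.1. The sketch below (illustrative only) averages s² over all samples of size two drawn with replacement, compares the result with ((n−1)/n)σ², and checks that the corrected variance is unbiased.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [4, 5, 7, 8]                    # sigma^2 = 2.5
n = 2
samples = list(product(population, repeat=n))

s2 = [pvariance(s) for s in samples]         # divisor n
s2_hat = [variance(s) for s in samples]      # divisor n - 1 (corrected variance)

print(mean(s2), (n - 1) / n * pvariance(population))   # 1.25 and 1.25
print(mean(s2_hat), pvariance(population))              # 2.5 and 2.5
```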
Example 7.16
Refer to Example 7.1. List all possible samples of size two that can be drawn
from the population with replacement and calculate
Solution
From Example 7.2, N = 4, σ² = 2.5. From Example 7.3, n = 2 and $N^n = 16$. The mean of the sample means is 6.
(a) To find the variance we subtract the mean of the sample means from
each of the sample means, square it and divide it by the number of
the sample means.
Thus,
Example 7.17
With reference to Examples 7.1, if a sample of size two is drawn without
replacement, find
(a) the mean, (b) the variance
of the sampling distribution of variances.
Solution
The sample values and means have been obtained in Example 7.5. For the
sake of convenience, we reproduce them here.
(4, 5) (4, 7) (4, 8) (5, 7) (5, 8) (7, 8)
(a) The sample variances $s_j^2$ of the six samples are 0.25, 2.25, 4.00, 1.00, 2.25 and 0.25. Hence
$$E(s_j^2) = \mu_{s^2} = \frac{\sum_j s_j^2}{\binom{N}{n}} = \frac{0.25 + 2.25 + 4.00 + 1.00 + 2.25 + 0.25}{6} = \frac{10}{6} = 1.6667$$
(b)
$$\mathrm{Var}(s_j^2) = \frac{\sum_{j=1}^{k}(s_j^2 - \mu_{s^2})^2}{\binom{N}{n}} = \frac{(0.25-1.6667)^2 + (2.25-1.6667)^2 + \cdots + (0.25-1.6667)^2}{6} = \frac{2.0070 + 0.3404 + 5.4444 + 0.4444 + 0.3404 + 2.0070}{6} = \frac{10.5836}{6} = 1.7639$$
Theorem 7.11
Let $X_1, X_2, \ldots, X_n$ be a random sample of size n drawn without replacement from a finite population of size N with expectation µ and variance σ². Let s² be the sample variance defined as
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
Then
$$E(s^2) = \frac{N}{N-1}\cdot\frac{n-1}{n}\,\sigma^2$$

Note
As N → ∞ this result reduces to that in Theorem 7.10.
Example 7.18
With reference to Examples 7.17 and 7.5, find the expected value of the sample variance s² if sampling is without replacement.

Solution
N = 4, n = 2, σ² = 2.5 (from Example 7.2). Hence, by Theorem 7.11,
$$E(s^2) = \frac{4}{4-1}\cdot\frac{2-1}{2}\,(2.5) = 1.6667$$
This agrees with the value obtained by direct enumeration in Example 7.17(a).
Theorem 7.12
If random samples of size n are taken from a population having a Normal distribution, and if
$$s^2 = \frac{\sum_i (X_i - \bar{X})^2}{n-1}$$
then
$$\frac{(n-1)s^2}{\sigma^2}$$
has a Chi-square distribution with n − 1 degrees of freedom
Note
Chi-square distribution is discussed in Chapter 8.
Theorem 7.13
If X is normally distributed with mean µ and variance σ², and $X_1, X_2, \ldots, X_n$ is a random sample of size n of X, then the random variable
$$V = \frac{\sum_i (X_i - \mu)^2}{\sigma^2}$$
will possess a Chi-square distribution with n degrees of freedom
EXERCISES
7.2 Rework Exercise 7.1 for the case when a random sample of size 2 is
drawn without replacement.
7.5 The weights of the products of a certain factory have a mean of 140 kilograms and a standard deviation of 20 kilograms. If 180 of the products are selected at random, find
(a) (i) the expectation of the sample mean;
(ii) the variance and standard deviation of the sample mean;
(b) the probability that a product selected at random will weigh
(i) at most 150 kilograms; (ii) at least 120 kilograms;
(iii) between 110 and 155 kilograms.
7.6 The time that a cashier spends in processing each person's order is an independent random variable with a mean of 45 minutes and a standard deviation of 30 minutes. What is the approximate probability that the orders of 100 persons can be processed in less than 4000 minutes?
7.8 Steel wires produced by a certain factory have a mean tensile strength
of 1, 000 kilograms and a variance of 900 kilograms. If a random sample
of 250 wires is drawn from the production line during a certain month
with a total output of 100, 000 units, find
(a) (i) the expectation and (ii) the variance of the sample mean
tensile strength;
(b) the probability that the sample mean will
(i) be more than 1, 010 kilograms;
(ii) be less than 1, 003 kilograms;
(iii) be between 995 and 1, 006 kilograms;
(iv) differ from 1, 000 by 5 kilograms or more;
(v) differ from 1, 000 by at most 7 kilograms.
7.9 Repeat Exercise 7.7 for the case when sampling is with replacement.
7.10 A group of children consists of twenty males and twenty-five females. A random sample of twenty-three children is drawn from this group with replacement.
(a) Find
(i) the expectation of the proportion of females;
(ii) the variance and hence the standard error of the sample proportion of females.
(b) What is the probability that the sample proportion of females is at least 0.6?

7.11 Rework Exercise 7.10 if sampling was made without replacement.

7.12 Thirty-six percent of University staff are against a strike action. If a sample of 120 of them are drawn at random with replacement, find
(a) (i) the expectation of the sample proportion,
(ii) the variance and hence the standard error of the sample pro-
portion,
of the University staff who are against a strike action.
(b) What is the probability that the proportion of the University staff
who are against the strike will be between 0.5 and 0.7?
7.13 Rework Exercise 7.12 if sampling is without replacement and if there
are 400 university staff members.
7.14 A box contains 80 black balls and 60 white balls. Two samples of 40 balls each are randomly selected with replacement from the box and their colours noted. Suppose that the first and second samples contained 30 and 25 black balls respectively. Find
(a) the variance of the difference in proportion of black balls in the two samples;
(b) the probability that the difference between the two samples does not exceed 10 balls.
7.15 Rework Exercise 7.14 if sampling is without replacement.
7.16 A random sample of size 5 is drawn from a population which is nor-
mally distributed with mean 49 and variance 9. A second random
sample of size 4 is drawn from a different population which is also nor-
mally distributed with mean 39 and variance 4. The two samples are
independent with means X 1 and X 2 respectively. Let W = X 1 − X 2 .
Calculate (a) the mean and variance of W ; (b) P (W > 8.2).
7.17 With reference to Exercise 7.2, find
(a) (i) the mean; (ii) the variance;
of the sampling distribution of variances;
(b) the corrected variance for sampling.
7.18 With reference to Exercise 7.1, find
(a) (i) the mean (ii) the variance
of the sampling distribution of variances;
(b) the corrected variance for sampling.
Chapter 8

DISTRIBUTIONS DERIVED FROM THE NORMAL DISTRIBUTION

8.1 INTRODUCTION

In this chapter we consider three probability distributions which are byproducts of the Normal distribution; a statistic such as the sampling distribution of the sample variance can be described by one of them under certain conditions. These are the Chi-square (χ²) distribution, the Student's t (briefly t) distribution, and the Fisher's variance ratio (briefly F) distribution.

8.2 χ² DISTRIBUTION

8.2.1 Definition of Chi-square Distribution

A continuous random variable X has a Chi-square (χ²) distribution with parameter v, called the degrees of freedom, if its probability density function is of the form
$$f(x) = \frac{1}{2^{\frac{v}{2}}\,\Gamma\left(\frac{v}{2}\right)}\,x^{\frac{v}{2}-1}e^{-\frac{x}{2}}, \qquad x > 0$$
Property 1

Theorem 8.1
Suppose X has a χ² distribution with parameter v. Then the moment-generating function of X is
$$M_X(t) = (1-2t)^{-\frac{v}{2}}, \qquad t < \frac{1}{2}$$

The proof of this theorem is left as an exercise for the reader (see Exercise 8.1).
Example 8.1
Suppose X has the moment generating function (m.g.f.)
$$M_X(t) = (1-2t)^{-10}, \qquad t < \frac{1}{2}$$
What is the distribution of X?

Solution
The Chi-square distribution with v degrees of freedom has m.g.f. exponent $-\frac{v}{2}$. The exponent of the given m.g.f. equals −10; hence $-10 = -\frac{v}{2}$, which gives v = 20.
Example 8.2
Suppose X has a Chi-square distribution with 5 degrees of freedom. Find its m.g.f.

Solution
From Theorem 8.1 with v = 5, the moment generating function of X is
$$M_X(t) = (1-2t)^{-\frac{5}{2}}, \qquad t < \frac{1}{2}$$
Property 2

Theorem 8.2
Suppose X has a χ² distribution with parameter v. Then
(a) $E(X) = v$    (b) $\mathrm{Var}(X) = 2v$

Proof
Since the Chi-square distribution is a special case of the Gamma distribution with α = v/2 and β = 2, its expectation and variance are obtained simply by substituting these values in the formulas of Theorem 10.18. Thus
$$E(X) = \alpha\beta = \frac{v}{2}\cdot 2 = v \qquad \mathrm{Var}(X) = \alpha\beta^2 = \frac{v}{2}\cdot 4 = 2v$$

Example 8.3
Refer to Example 8.2. Find the (a) mean (b) variance.

Solution
(a) $E(X) = v = 5$    (b) $\mathrm{Var}(X) = 2v = 10$
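These values are easy to confirm by simulation, since (as Theorem 8.7 below makes precise) a χ² variable with 5 degrees of freedom can be generated as the sum of the squares of 5 independent standard normal variables. The sketch is purely illustrative.

```python
import random
from statistics import mean, pvariance

random.seed(0)
v, reps = 5, 200_000

draws = [sum(random.gauss(0, 1) ** 2 for _ in range(v)) for _ in range(reps)]
print(round(mean(draws), 2), round(pvariance(draws), 2))   # close to v = 5 and 2v = 10
```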
Property 3

Theorem 8.3
The cumulative distribution function of the Chi-square distribution is given by
$$F(x) = \frac{1}{2^{\frac{v}{2}}\,\Gamma\left(\frac{v}{2}\right)} \int_0^x t^{\frac{v}{2}-1} e^{-\frac{t}{2}}\,dt, \qquad x > 0$$
Property 4
Theorem 8.4
If X1 , X2 , ..., Xn are independent random variables having Chi-square
distributions with v1 , v2 , ... , vn degrees of freedom respectively,
then the distribution of Y = X1 + X2 + ... + Xn will possess a
Chi-square distribution with v1 + v2 + ... + vn degrees of freedom
Corollary 8.1
If X1 , X2 , ..., Xn are independent random variables having Chi-square
distributions with 1 degree of freedom each, then the distribution of
Y = X1 + X2 + ... + Xn will possess a Chi-square distribution with n
degrees of freedom.
Corollary 8.2
If X and Y are independent and $X \sim \chi^2_v$ and $Y \sim \chi^2_u$, then $Z = X + Y \sim \chi^2_{v+u}$.
Property 5
Theorem 8.5
If X1 , and X2 are independent random variables, X1 has a Chi-
square distribution with v1 degrees of freedom and X1 + X2 has a
Chi-square distribution with v (> v1 ) degrees of freedom, then X2
has a Chi-square distribution with v − v1 degrees of freedom.
Table I in Appendix A gives percentile values of the Chi-square distribution for α = 0.005, 0.01, 0.025, 0.05, 0.95, 0.975, 0.99, 0.995 and v = 1, 2, ..., 30. It is a convention in statistics to use the same symbol χ² for both the random variable and a value of that random variable. Thus percentile values of the Chi-square distribution with v degrees of freedom are denoted by $\chi^2_{\alpha,v}$, or simply $\chi^2_{\alpha}$ if v is understood, where the suffix α is used from now on, and in applications of probability and statistics, to denote the “lower percentile” (100α% point).
Example 8.4
Find from Table I the following:
(a) $\chi^2_{0.025,\,13}$  (b) $\chi^2_{0.99,\,5}$  (c) $\chi^2_{0.99,\,28}$  (d) $\chi^2_{0.95,\,3}$  (e) $\chi^2_{0.975,\,10}$
Solution
(a) This is a Chi-square value for probability 0.025 and 13 degrees of freedom. Referring to Table I, we proceed downward under the column labelled d.f. until we reach entry 13, then proceed right to the column headed $\chi^2_{0.025}$. The value at that meeting point, 24.736, is the required value of $\chi^2_{0.025,13}$. Therefore the probability that a χ²-distributed random variable with 13 degrees of freedom exceeds 24.736 is 0.025.
Example 8.5
Suppose a certain Chi-square distribution value for a probability of 0.975 is 30.2. What are the degrees of freedom?
Solution
Refer to Table I and proceed downward under column 0.975 until you locate
the value 30.2 in the body of the table. Then proceed left to meet column
headed v. The value of v = 17 is the required degrees of freedom.
Theorem 8.6
Let Z be a Standard Normal variable. Then U = Z² has a Chi-square distribution with 1 degree of freedom
Proof
In Theorem 10.28 we proved that
$$P(U \le u) = \frac{1}{\sqrt{2\pi}} \int_0^u t^{-\frac{1}{2}} e^{-\frac{t}{2}}\,dt \qquad (i)$$
If we put v = 1 in the Chi-square cumulative distribution function (Theorem 8.3), that is, if we consider the Chi-square distribution with 1 degree of freedom, and recall also that $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$, we obtain expression (i).
1
Note
If X ∼ N (µ, σ), then
2
X −µ
∼ χ21
σ
Theorem 8.7
Let $Z_1, Z_2, \ldots, Z_v$ be v independent Standard Normal random variables. Then the sum of the squares of these variables is a Chi-square variable, χ², with v degrees of freedom. That is,
$$\sum_{i=1}^{v} Z_i^2 \sim \chi^2_v$$
Example 8.6
Suppose $Z_1, Z_2, \ldots, Z_{22}$ is a random sample from the Standard Normal distribution. Find the number c such that
$$P\left(\sum_{i=1}^{22} Z_i^2 > c\right) = 0.10$$

Solution
By Theorem 8.7, $\sum_{i=1}^{22} Z_i^2$ has a χ² distribution with 22 degrees of freedom. Reading from Table I in Appendix A the row labelled 22 d.f. and the column headed by an upper-tail area of 0.10, we get the number 30.813. Therefore
$$P\left(\sum_{i=1}^{22} Z_i^2 > 30.813\right) = 0.10$$
Thus c = 30.813.
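If a table is not to hand, the same value can be obtained from a statistical library. The sketch below (it assumes scipy is available in the working environment) asks for the point with upper-tail area 0.10 under a χ² distribution with 22 degrees of freedom.

```python
from scipy.stats import chi2

c = chi2.ppf(0.90, df=22)            # P(chi2_22 <= c) = 0.90, i.e. upper-tail area 0.10
print(round(c, 3))                   # approximately 30.813
print(round(chi2.sf(c, df=22), 3))   # the upper-tail probability, 0.10
```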
Theorem 8.8
The Chi-square distribution approaches the Normal distribution
N (v, 2v) as the degrees of freedom v gets large
Theorem 8.9
Suppose $X_1, X_2, \ldots, X_s$ has a joint multinomial distribution with parameters $n, p_1, p_2, \ldots, p_s$. Then the sum
$$\sum_{i=1}^{s} \frac{(X_i - np_i)^2}{np_i}$$
has approximately a Chi-square distribution with s − 1 degrees of freedom when n is large.
8.3 t DISTRIBUTION
The t-distribution was first introduced by W. Gosset who published his work
under the name of ”Student”.
Property 1
The t distribution is symmetric; for any α, $t_{(1-\alpha),v} = -t_{\alpha,v}$.

Property 2

Theorem 8.10
Suppose a random variable X has the Student's t distribution with parameter v. Then
(a) $E(X) = 0$, for v > 1
(b) $\mathrm{Var}(X) = \dfrac{v}{v-2}$, for v > 2

Thus a t distribution possesses no mean when v = 1 and the variance does not exist for v ≤ 2.

Property 3
Theorem 8.11
If $X_1, X_2, \ldots, X_n$ is a random sample from a Normal population having mean µ and variance σ², and $\bar{X}$ is the sample mean, then
$$T = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}}$$
has the Student's t distribution with v = n − 1 degrees of freedom, where
$$\hat{s}^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
Percentile values $t_{\alpha,v}$ of the t distribution are defined by
$$P(T \le t_{\alpha,v}) = \int_{-\infty}^{t_{\alpha,v}} f(x)\,dx = 1 - \alpha$$
Example 8.7
If a t distribution has 25 degrees of freedom, find
(a) $t_{0.975,25}$  (b) $t_{0.05,25}$

Solution
(a) By the symmetry of the distribution, $t_{0.975,25} = -t_{0.025,25} = -2.060$.
(b) $t_{0.05,25} = 1.708$. Therefore the probability that a t-distributed random variable with 25 degrees of freedom does not exceed 1.708 is 0.95.
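The same quantities can be read from a statistical library instead of a printed table. The sketch below (it assumes scipy is available) prints the 97.5th, 95th and 2.5th percentiles of the t distribution with 25 degrees of freedom, which is one way to keep the tail conventions straight.

```python
from scipy.stats import t

df = 25
print(round(t.ppf(0.975, df), 3))   # 2.060  (97.5th percentile)
print(round(t.ppf(0.95, df), 3))    # 1.708  (95th percentile)
print(round(t.ppf(0.025, df), 3))   # -2.060 (2.5th percentile, by symmetry)
```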
Theorem 8.12
If Z is a Standard Normal variable, χ² is a Chi-square variable with v degrees of freedom, and Z and χ² are independently distributed, then
$$T = \frac{Z}{\sqrt{\chi^2 / v}}$$
has a Student's t distribution with v degrees of freedom
Proof
We simply express the given ratio in a different form as follows:
$$\frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{s^2}{\sigma^2}}} = \frac{Z}{\sqrt{\dfrac{(n-1)s^2}{(n-1)\sigma^2}}} = \frac{Z}{\sqrt{\dfrac{\chi^2_{n-1}}{n-1}}}$$
since $\dfrac{(n-1)s^2}{\sigma^2} = \chi^2_{n-1}$ by Theorem 7.12.
By Definition 8.11, this is a t distribution with n − 1 degrees of freedom.
Note
This quantity is usually called 'Student's t' and the corresponding distribution is called the 'Student's t distribution'.
Theorem 8.13
The t curve approaches a Normal curve as n approaches infinity
8.4 F DISTRIBUTION

A continuous random variable X has an F distribution with parameters (degrees of freedom) $v_1$ and $v_2$ if its probability density function is of the form
$$f(x) = \frac{\Gamma\left(\frac{v_1+v_2}{2}\right)}{\Gamma\left(\frac{v_1}{2}\right)\Gamma\left(\frac{v_2}{2}\right)}\left(\frac{v_1}{v_2}\right)^{\frac{v_1}{2}} \frac{x^{\frac{v_1}{2}-1}}{\left(1+\frac{v_1}{v_2}x\right)^{\frac{v_1+v_2}{2}}}$$
where x > 0.
Property 1
As a ratio of two non-negative values ($\chi^2_v \ge 0$), an F random variable ranges in value from 0 to +∞.

Property 2

Theorem 8.14
Suppose a random variable X has the F distribution with parameters $v_1$ and $v_2$. Then
(a) $E(X) = \dfrac{v_2}{v_2 - 2}$, for $v_2 > 2$
(b) $\mathrm{Var}(X) = \dfrac{2v_2^2(v_1 + v_2 - 2)}{v_1(v_2-2)^2(v_2-4)}$, for $v_2 > 4$

Theorem 8.14 implies that an F variable has no mean when $v_2 \le 2$ and has no variance when $v_2 \le 4$.
Property 3

Theorem 8.15
Suppose X has the distribution $F_{v_1,v_2}$; then $Y = \dfrac{1}{X}$ has the distribution $F_{v_2,v_1}$, that is,
$$F_{(1-\alpha,\,v_1,\,v_2)} = \frac{1}{F_{(\alpha,\,v_2,\,v_1)}}$$
Property 4
F is a positively skewed distribution; however, its skewness decreases with
increasing v1 and v2 .
The F distribution has been tabulated for α = 0.50, 0.90, 0.95, 0.975, 0.99,
0.995, 0.999. Table III in Appendix A gives the percentage points for the
right tail of several F distributions at 1, 5 and 10 percent levels. To say
that $P(F > F_{\alpha;v_1,v_2}) = \alpha$ is the same as saying that
$$P(0 \le F \le F_{\alpha;v_1,v_2}) = 1 - \alpha$$
Example 8.8
If the F distribution has degrees of freedom $v_1 = 30$ and $v_2 = 24$, find the table value of
(a) F0.90;30,24 (b) F0.95;12,8 (c) F0.05;12,8
Solution
(a) Table III in Appendix A gives the percentage points of the distribution
at the 90 percent level. Read the table value at the intersection of the
degrees of freedom of the numerator, v1 = 30, and the degrees of
freedom of the denominator, v2 = 24. The value is 1.67. That is, the
probability that an F-distributed random variable with 30 numerator
degrees of freedom and 24 denominator degrees of freedom does not
exceed 1.67 is 0.90.
(b) From Table III in Appendix A, $F_{0.95;12,8} = 3.28$.

(c) By Theorem 8.15,
$$F_{0.05;12,8} = \frac{1}{F_{0.95;8,12}} = \frac{1}{2.85} \approx 0.35$$
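The readings in Example 8.8 can be reproduced numerically; a minimal sketch, assuming SciPy is available, with $F_{\alpha;v_1,v_2}$ read as the α-quantile exactly as the table is read above:

```python
from scipy import stats

print(stats.f.ppf(0.90, dfn=30, dfd=24))     # F(0.90; 30, 24) ~ 1.67
print(stats.f.ppf(0.95, dfn=12, dfd=8))      # F(0.95; 12, 8)  ~ 3.28
print(stats.f.ppf(0.05, dfn=12, dfd=8))      # F(0.05; 12, 8)  ~ 0.35
print(1 / stats.f.ppf(0.95, dfn=8, dfd=12))  # same value via Theorem 8.15
```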
Theorem 8.16
Let $\chi^2_{(v_1)}$ and $\chi^2_{(v_2)}$ be two independent Chi-square random variables with parameters $v_1$ and $v_2$ respectively. Then the F distribution is given by
$$F_{(v_1,v_2)} = \frac{\chi^2_{(v_1)}/v_1}{\chi^2_{(v_2)}/v_2}$$
where $v_1$ is the degrees of freedom of the numerator and $v_2$ is the degrees of freedom of the denominator.
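Theorem 8.16 can be illustrated by simulation in the same way as Theorem 8.12: the ratio of two independent Chi-square variables, each divided by its own degrees of freedom, behaves like an F variable. A minimal sketch, with the same assumptions as the earlier sketches:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
v1, v2, n = 6, 10, 200_000

num = rng.chisquare(v1, size=n) / v1        # chi^2_{v1} / v1
den = rng.chisquare(v2, size=n) / v2        # chi^2_{v2} / v2, independent of the numerator
f_sim = num / den                           # F with (v1, v2) degrees of freedom by Theorem 8.16

for q in (0.90, 0.95, 0.99):
    print(q, np.quantile(f_sim, q), stats.f.ppf(q, dfn=v1, dfd=v2))
```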
Theorem 8.17
Let $\chi^2_{\alpha,v}$ be the percentage point of a Chi-square random variable with $v$ degrees of freedom. Then
$$F_{\alpha,v,\infty} = \frac{\chi^2_{\alpha,v}}{v}$$
Theorem 8.18
Let $F_{1,v}$ be an F distribution with degrees of freedom 1 and v. Then
$$F_{1-\alpha;\,1,v} = t^2_{1-\alpha/2,\,v}$$
Proof
From Theorem 8.12,
$$T^2 = \frac{Z^2}{\chi^2/v}$$
But from Theorem 12.6, $Z^2 \sim \chi^2$ with 1 degree of freedom. Hence
$$T^2 = \frac{\chi^2_{(1)}/1}{\chi^2_{(v)}/v}$$
which, by Theorem 8.16, has the F distribution with 1 and v degrees of freedom; squaring the corresponding t percentage point therefore gives the F percentage point stated in the theorem.
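Theorems 8.17 and 8.18 relate F percentage points to Chi-square and t percentage points; both relations can be checked numerically. A minimal sketch, assuming SciPy is available; a very large denominator degrees of freedom stands in for infinity:

```python
from scipy import stats

alpha, v = 0.05, 20

# Theorem 8.17: F(alpha; v, infinity) = chi^2(alpha; v) / v
print(stats.f.ppf(alpha, dfn=v, dfd=1e7))    # large dfd approximates infinite degrees of freedom
print(stats.chi2.ppf(alpha, df=v) / v)

# Theorem 8.18: F(1 - alpha; 1, v) = t(1 - alpha/2; v)^2
print(stats.f.ppf(1 - alpha, dfn=1, dfd=v))
print(stats.t.ppf(1 - alpha / 2, df=v) ** 2)
```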
EXERCISES
8.5 Referring to Exercise 8.3, find the mean and variance of the χ² variable.
8.8 If X has the χ² distribution with 12 degrees of freedom, find x such that
8.9 If X has the t distribution, find from Table II in Appendix A the value
for
(a) t0.95,28 (b) t0.975,15 (c) t0.90,22
8.11 If X is $F_{v_1,v_2}$, find from Table III in Appendix A the value for (a) $F_{0.95,20,30}$ (b) $F_{0.90,20,24}$
STATISTICAL TABLES
Table I
Standard Normal Distribution (Full Table)
$$\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt$$
Table II
Standard Normal Distribution (Half Table)
$$\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{0}^{z} e^{-t^2/2}\,dt$$
Table III
Normal Density Function
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$
Table IV
Percentiles of Chi Square Distribution
Table V
Percentiles of t Distribution
Table VI
Percentiles of F Distribution (F0.90, F0.95, F0.99)

Table VII
Random Numbers
ANSWERS TO ODD-NUMBERED EXERCISES
Chapter 1
1.1
          Y
 X        1      2      3
 1      3/34   5/34   1/34
 2      2/34   4/34   6/34
 3      1/34   2/34     0
 4      4/34   1/34   5/34
1.3 (a) 9/34 (c) 7/34   1.5 211/219   1.7 (c) (2x + 1)   1.9 (a) e^{−λp}(λp)^x/x!, x = 0, 1, 2, ...
1.11 (c)(i) 2(x + y − 2xy), 0 ≤ x ≤ 1, 0 ≤ y ≤ 1   1.13 12/7 (c) (12/7) · x(2 + x)/2, 0 ≤ x ≤ 1
1.15 (a) λe^{−λx}, x ≥ 0 (c) λe^{−λ(y−x)}, y ≥ x   1.17 (a)(i) xe^{−x}, y ≥ 0
Chapter 2
2.1 (a)
 X + Y           2      3      4      5      6      7
 P(X + Y = k)  3/34   7/34   6/34  12/34   1/34   5/34

(c)
 XY            1      2      3      4      6      8      9     12
 P(XY = k)   3/34   7/34   2/34   8/34   8/34   1/34    0    5/34

(e)
 2X + 3Y            5      7      8      9     10     11     12     13     14     15     17
 P(2X + 3Y = k)   3/34   2/34   5/34   1/34   4/34   5/34   2/34   6/34   1/34    0    5/34

(j)
 3X            3      6      9     12
 P(3X = k)   9/34  12/34   3/34  10/34

2.3
 X + Y           2     3     4
 P(X + Y = k)   1/8   3/8   1/2
2.5 (a) λ^z e^{−λ}/z!, z = 0, 1, 2, ...   2.7 (a) s(z) = (2/3)z²(3 − z), 0 ≤ z ≤ 1;  (2/3)(4 − 3z² + z³), 1 ≤ z ≤ 2
2.9 (a) s(z) = (5/14)z³, 0 < z ≤ 1;  (3/7)(z/2 + 1/3), 1 < z ≤ 2;  (3/7)(3 − (11/2)z + 4z² − (5/6)z³), 2 < z ≤ 3

2.17 6.027 × 10⁻⁵   2.19 (a) h(u) = (2/3)(u³ + 1), −1 ≤ u < 0;  (2/3)(1 − u³), 0 ≤ u ≤ 1
2.21 4[1 − u(1 − ln u)], 0 ≤ u ≤ 1

2.23
 X/Y          1/2    2/3     …      3
 P(X/Y = k)  2/20   7/20   3/20   8/20

2.25 h(u) = 1 + (2/3)u − u²/3, −1 ≤ u < 0;  1 − (4/3)u + u²/3, 0 ≤ u ≤ 1

2.27 h(u) = (2/9)(u + 2), 0 ≤ u < 1;  2(u + 2)/(9u³), 1 ≤ u < ∞

2.29 (3/7)(1 − u²/4 + u ln(2/u)), 0 < u < 2
Chapter 3
3.1 (a) 4.47   3.3 (a) 2.01 (c) 10.39 (e) 11.24 (g) 1496.32 (i) 52.80 (k) 2.11 (m) 3.21
3.5 (a) 5/2 (c) 2/15 (e) 0.04 (g) 0 (i) 0.08   3.15 6.46
Chapter 4
4.1 0.064   4.3 (a) 2.3 (c) 2.68 (e) 1.2   4.5 0   4.9 −0.1832   4.11 −1/11
4.13 (a)(i) −0.21x + 1.52 (ii) −0.16y + 1.63   4.15 (24x² + 24x + 2)/(9(2x + 1)²)   4.19 0.05y + 2.68
Chapter 5
5.1(a) 12 5.3 n ≤ 250 5.5(a) 0.85 5.7 0.84 5.9 1.3947 5.11 0.9648 5.13(a) 0.3409 (c) 0.8185
5.15 0.0062 5.17 1600 5.19 0.9927 5.21 n ≈ 25 5.23 n ≈ 85, = 0.43 5.25(a) 0.1852
5.29 0.2328
Chapter 7
7.1 (a) 3.6, 6.8 (c) E(X) = 6.8, Var(X) = 8.48   7.3 (a) 6.8   7.5 (a)(i) 140 (ii) 1.49, 0.85   7.7 (a)(i) 1,000 (ii)
7.9 s.d. = 0.657   7.11 (a)(i) 0.56 (ii) 0.4855   7.15 (a) 0.004   7.17 (a)(ii) 458.8
Chapter 8
8.3 12 8.5(a) E(X) = 12, Var = 24 8.7(a) 22.367 (c) 10.285 8.9(a) 1.701 (c) 1.321 8.11 1.93