
Chapter 36

Moments and cross-moments

This lecture introduces the notions of moment of a random variable and cross-moment of a random vector.

36.1 Moments

36.1.1 Definition of moment

The n-th moment of a random variable is the expected value of its n-th power:

Definition 202 Let X be a random variable. Let $n \in \mathbb{N}$. If
$$\mu_X(n) = E[X^n]$$
exists and is finite, then X is said to possess a finite n-th moment and $\mu_X(n)$ is called the n-th moment of X. If $E[X^n]$ is not well-defined, then we say that X does not possess the n-th moment.

36.1.2 Definition of central moment

The n-th central moment of a random variable X is the expected value of the n-th power of the deviation of X from its expected value:

Definition 203 Let X be a random variable. Let $n \in \mathbb{N}$. If
$$\bar{\mu}_X(n) = E\left[(X - E[X])^n\right]$$
exists and is finite, then X is said to possess a finite n-th central moment and $\bar{\mu}_X(n)$ is called the n-th central moment of X.

36.2 Cross-moments

36.2.1 Definition of cross-moment

Let X be a K×1 random vector. A cross-moment of X is the expected value of the product of integer powers of the entries of X:
$$E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K}\right]$$


where $X_i$ is the i-th entry of X and $n_1, n_2, \ldots, n_K \in \mathbb{Z}_+$ are non-negative integers.


The following is a formal definition of cross-moment:

Definition 204 Let X be a K×1 random vector. Let $n_1, n_2, \ldots, n_K$ be K non-negative integers and
$$n = \sum_{k=1}^{K} n_k \tag{36.1}$$
If
$$\mu_X(n_1, n_2, \ldots, n_K) = E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K}\right] \tag{36.2}$$
exists and is finite, then it is called a cross-moment of X of order n. If all cross-moments of order n exist and are finite, i.e. if (36.2) exists and is finite for all K-tuples of non-negative integers $n_1, n_2, \ldots, n_K$ such that condition (36.1) is satisfied, then X is said to possess finite cross-moments of order n.

The following example shows how to compute a cross-moment of a discrete random vector:

Example 205 Let X be a 3×1 discrete random vector and denote its components by $X_1$, $X_2$ and $X_3$. Let the support of X be
$$R_X = \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \begin{bmatrix} 3 \\ 3 \\ 2 \end{bmatrix} \right\}$$
and its joint probability mass function1 be
$$p_X(x) = \begin{cases} 1/3 & \text{if } x = (1, 2, 1) \\ 1/3 & \text{if } x = (2, 1, 3) \\ 1/3 & \text{if } x = (3, 3, 2) \\ 0 & \text{otherwise} \end{cases}$$
The following is a cross-moment of X of order 4:
$$\mu_X(1, 2, 1) = E\left[X_1 X_2^2 X_3\right]$$
which can be computed using the transformation theorem2:
$$\begin{aligned}
\mu_X(1, 2, 1) &= E\left[X_1 X_2^2 X_3\right] \\
&= \sum_{(x_1, x_2, x_3) \in R_X} x_1 x_2^2 x_3 \, p_X(x_1, x_2, x_3) \\
&= 1 \cdot 2^2 \cdot 1 \cdot p_X(1, 2, 1) + 2 \cdot 1^2 \cdot 3 \cdot p_X(2, 1, 3) + 3 \cdot 3^2 \cdot 2 \cdot p_X(3, 3, 2) \\
&= 4 \cdot \frac{1}{3} + 6 \cdot \frac{1}{3} + 54 \cdot \frac{1}{3} = \frac{64}{3}
\end{aligned}$$

1 See p. 117.
2 See p. 134.
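As an aside not present in the original text, the computation in Example 205 is easy to reproduce numerically. The following is a minimal Python sketch (assuming only the standard library) that sums $x_1 x_2^2 x_3$ over the three support points with exact fractions:

# Minimal sketch: reproduce the cross-moment of Example 205 with exact fractions.
from fractions import Fraction

support = [(1, 2, 1), (2, 1, 3), (3, 3, 2)]          # support points of X
pmf = {point: Fraction(1, 3) for point in support}    # each point has probability 1/3

# mu_X(1, 2, 1) = E[X1 * X2^2 * X3] via the transformation theorem
cross_moment = sum(x1 * x2**2 * x3 * pmf[(x1, x2, x3)] for x1, x2, x3 in support)
print(cross_moment)  # 64/3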

36.2.2 Definition of central cross-moment

The central cross-moments of a random vector X are just the cross-moments of the random vector of deviations X − E[X]:

Definition 206 Let X be a K×1 random vector. Let $n_1, n_2, \ldots, n_K$ be K non-negative integers and
$$n = \sum_{k=1}^{K} n_k \tag{36.3}$$
If
$$\bar{\mu}_X(n_1, n_2, \ldots, n_K) = E\left[\prod_{k=1}^{K} (X_k - E[X_k])^{n_k}\right] \tag{36.4}$$
exists and is finite, then it is called a central cross-moment of X of order n. If all central cross-moments of order n exist and are finite, i.e. if (36.4) exists and is finite for all K-tuples of non-negative integers $n_1, n_2, \ldots, n_K$ such that condition (36.3) is satisfied, then X is said to possess finite central cross-moments of order n.

The following example shows how to compute a central cross-moment of a discrete random vector:

Example 207 Let X be a 3×1 discrete random vector and denote its components by $X_1$, $X_2$ and $X_3$. Let the support of X be
$$R_X = \left\{ \begin{bmatrix} 4 \\ 2 \\ 4 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} \right\}$$
and its joint probability mass function be
$$p_X(x) = \begin{cases} 1/5 & \text{if } x = (4, 2, 4) \\ 2/5 & \text{if } x = (2, 1, 1) \\ 2/5 & \text{if } x = (1, 3, 2) \\ 0 & \text{otherwise} \end{cases}$$
The expected values of the three components of X are
$$\begin{aligned}
E[X_1] &= 4 \cdot \frac{1}{5} + 2 \cdot \frac{2}{5} + 1 \cdot \frac{2}{5} = 2 \\
E[X_2] &= 2 \cdot \frac{1}{5} + 1 \cdot \frac{2}{5} + 3 \cdot \frac{2}{5} = 2 \\
E[X_3] &= 4 \cdot \frac{1}{5} + 1 \cdot \frac{2}{5} + 2 \cdot \frac{2}{5} = 2
\end{aligned}$$
The following is a central cross-moment of X of order 3:
$$\bar{\mu}_X(2, 1, 0) = E\left[(X_1 - E[X_1])^2 (X_2 - E[X_2])\right]$$
which can be computed using the transformation theorem:
$$\begin{aligned}
\bar{\mu}_X(2, 1, 0) &= E\left[(X_1 - E[X_1])^2 (X_2 - E[X_2])\right] \\
&= \sum_{(x_1, x_2, x_3) \in R_X} (x_1 - 2)^2 (x_2 - 2) \, p_X(x_1, x_2, x_3) \\
&= (4 - 2)^2 (2 - 2) \, p_X(4, 2, 4) + (2 - 2)^2 (1 - 2) \, p_X(2, 1, 1) + (1 - 2)^2 (3 - 2) \, p_X(1, 3, 2) \\
&= (1 - 2)^2 (3 - 2) \cdot \frac{2}{5} = \frac{2}{5}
\end{aligned}$$
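The central cross-moment of Example 207 can be checked the same way. The short Python sketch below (an illustration added here, not part of the original text) first computes the component means and then the central cross-moment of order 3:

# Minimal sketch: reproduce the central cross-moment of Example 207.
from fractions import Fraction

pmf = {(4, 2, 4): Fraction(1, 5), (2, 1, 1): Fraction(2, 5), (1, 3, 2): Fraction(2, 5)}

# Component means E[X1], E[X2], E[X3]
means = [sum(x[k] * p for x, p in pmf.items()) for k in range(3)]
print(means)  # [Fraction(2, 1), Fraction(2, 1), Fraction(2, 1)]

# Central cross-moment of order 3: E[(X1 - E[X1])^2 (X2 - E[X2])]
central = sum((x[0] - means[0])**2 * (x[1] - means[1]) * p for x, p in pmf.items())
print(central)  # 2/5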
Chapter 37

Moment generating function of a random variable

The distribution of a random variable is often characterized in terms of its moment


generating function (mgf), a real function whose derivatives at zero are equal to
the moments1 of the random variable. Mgfs have great practical relevance not only
because they can be used to easily derive moments, but also because a probability
distribution is uniquely determined by its mgf, a fact that, coupled with the analytical tractability of mgfs, makes them a handy tool to solve several problems,
such as deriving the distribution of a sum of two or more random variables.
It must be mentioned that not all random variables possess an mgf. However, all
random variables possess a characteristic function2 , another transform that enjoys
properties similar to those enjoyed by the mgf.

37.1 Definition

We start this lecture by giving a definition of mgf.

Definition 208 Let X be a random variable. If the expected value
$$E[\exp(tX)]$$
exists and is finite for all real numbers t belonging to a closed interval $[-h, h] \subseteq \mathbb{R}$, with $h > 0$, then we say that X possesses a moment generating function and the function $M_X : [-h, h] \to \mathbb{R}$ defined by
$$M_X(t) = E[\exp(tX)]$$
is called the moment generating function of X.

The following example shows how the mgf of an exponential random variable
is derived.
1 See p. 285.
2 See p. 307.


Example 209 Let X be an exponential random variable3 with parameter $\lambda \in \mathbb{R}_{++}$. Its support is the set of positive real numbers
$$R_X = [0, \infty)$$
and its probability density function is
$$f_X(x) = \begin{cases} \lambda \exp(-\lambda x) & \text{if } x \in R_X \\ 0 & \text{if } x \notin R_X \end{cases}$$
Its mgf is computed as follows:
$$\begin{aligned}
E[\exp(tX)] &= \int_{-\infty}^{\infty} \exp(tx) f_X(x) \, dx \\
&= \int_0^{\infty} \exp(tx) \lambda \exp(-\lambda x) \, dx \\
&\overset{A}{=} \lambda \int_0^{\infty} \exp((t - \lambda) x) \, dx \\
&= \lambda \left[ \frac{1}{t - \lambda} \exp((t - \lambda) x) \right]_0^{\infty} \\
&= \lambda \left( 0 - \frac{1}{t - \lambda} \right) = \frac{\lambda}{\lambda - t}
\end{aligned}$$
where: in step A we have assumed that $t < \lambda$, which is necessary for the integral to be finite. Therefore, the expected value exists and is finite for $t \in [-h, h]$ if h is such that $0 < h < \lambda$, and X possesses an mgf
$$M_X(t) = \frac{\lambda}{\lambda - t}$$
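The integral above can also be verified symbolically. Here is a minimal SymPy sketch (an addition for illustration, assuming SymPy is available) that parametrizes $t = \lambda - s$ with $s > 0$ so that the convergence condition $t < \lambda$ of step A holds:

# Minimal sketch: verify E[exp(tX)] = lambda/(lambda - t) for an exponential variable, t < lambda.
import sympy as sp

x, lam, s = sp.symbols("x lam s", positive=True)
t = lam - s  # any t < lam can be written this way with s > 0

mgf = sp.integrate(sp.exp(t * x) * lam * sp.exp(-lam * x), (x, 0, sp.oo))
print(sp.simplify(mgf))                    # lam/s
print(sp.simplify(mgf - lam / (lam - t)))  # 0, i.e. the mgf equals lambda/(lambda - t)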

37.2 Moments and mgfs

The mgf takes its name from the fact that it can be used to derive the moments of X, as stated in the following proposition.

Proposition 210 If a random variable X possesses an mgf $M_X(t)$, then, for any $n \in \mathbb{N}$, the n-th moment of X, denoted by $\mu_X(n)$, exists and is finite. Furthermore,
$$\mu_X(n) = E[X^n] = \left. \frac{d^n M_X(t)}{dt^n} \right|_{t=0}$$
where $\left. \frac{d^n M_X(t)}{dt^n} \right|_{t=0}$ is the n-th derivative of $M_X(t)$ with respect to t, evaluated at the point t = 0.

Proof. Proving the above proposition is quite complicated, because a lot of analytical details must be taken care of (see, e.g., Pfeiffer4 - 1978).

3 See p. 365.
4 Pfeiffer, P. E. (1978) Concepts of probability theory, Courier Dover Publications.

The intuition, however, is straightforward: since the expected value is a linear operator and differentiation is a linear operation, under appropriate conditions we can differentiate through the expected value, as follows:
$$\frac{d^n M_X(t)}{dt^n} = \frac{d^n}{dt^n} E[\exp(tX)] = E\left[\frac{d^n}{dt^n} \exp(tX)\right] = E[X^n \exp(tX)]$$
which, evaluated at the point t = 0, yields
$$\left. \frac{d^n M_X(t)}{dt^n} \right|_{t=0} = E[X^n \exp(0 \cdot X)] = E[X^n] = \mu_X(n)$$

The following example shows how this proposition can be applied.

Example 211 In Example 209 we have demonstrated that the mgf of an exponential random variable is
$$M_X(t) = \frac{\lambda}{\lambda - t}$$
The expected value of X can be computed by taking the first derivative of the mgf:
$$\frac{d M_X(t)}{dt} = \frac{\lambda}{(\lambda - t)^2}$$
and evaluating it at t = 0:
$$E[X] = \left. \frac{d M_X(t)}{dt} \right|_{t=0} = \frac{\lambda}{(\lambda - 0)^2} = \frac{1}{\lambda}$$
The second moment of X can be computed by taking the second derivative of the mgf:
$$\frac{d^2 M_X(t)}{dt^2} = \frac{2\lambda}{(\lambda - t)^3}$$
and evaluating it at t = 0:
$$E[X^2] = \left. \frac{d^2 M_X(t)}{dt^2} \right|_{t=0} = \frac{2\lambda}{(\lambda - 0)^3} = \frac{2}{\lambda^2}$$
And so on for the higher moments.
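The differentiation in Example 211 is easy to reproduce symbolically. The following SymPy sketch (illustrative, not from the original text) recovers the first two moments of the exponential distribution from its mgf:

# Minimal sketch: moments of the exponential distribution from its mgf.
import sympy as sp

t, lam = sp.symbols("t lam", positive=True)
M = lam / (lam - t)

print(sp.simplify(sp.diff(M, t, 1).subs(t, 0)))  # 1/lam    = E[X]
print(sp.simplify(sp.diff(M, t, 2).subs(t, 0)))  # 2/lam**2 = E[X^2]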

37.3 Distributions and mgfs

The following proposition states the most important property of the mgf.

Proposition 212 (equality of distributions) Let X and Y be two random variables. Denote by $F_X(x)$ and $F_Y(y)$ their distribution functions5, and by $M_X(t)$ and $M_Y(t)$ their mgfs. X and Y have the same distribution, i.e., $F_X(x) = F_Y(x)$ for any x, if and only if they have the same mgfs, i.e., $M_X(t) = M_Y(t)$ for any t.
5 See p. 108.

Proof. For a fully general proof of this proposition see, e.g., Feller6 (2008). We just give an informal proof for the special case in which X and Y are discrete random variables taking only finitely many values. The "only if" part is trivial. If X and Y have the same distribution, then
$$M_X(t) = E[\exp(tX)] = E[\exp(tY)] = M_Y(t)$$
The "if" part is proved as follows. Denote by $R_X$ and $R_Y$ the supports of X and Y, and by $p_X(x)$ and $p_Y(y)$ their probability mass functions7. Denote by A the union of the two supports:
$$A = R_X \cup R_Y$$
and by $a_1, \ldots, a_n$ the elements of A. The mgf of X can be written as
$$\begin{aligned}
M_X(t) &= E[\exp(tX)] \\
&\overset{A}{=} \sum_{x \in R_X} \exp(tx) \, p_X(x) \\
&\overset{B}{=} \sum_{i=1}^{n} \exp(t a_i) \, p_X(a_i)
\end{aligned}$$
where: in step A we have used the definition of expected value; in step B we have used the fact that $p_X(a_i) = 0$ if $a_i \notin R_X$. By the same token, the mgf of Y can be written as
$$M_Y(t) = \sum_{i=1}^{n} \exp(t a_i) \, p_Y(a_i)$$
If X and Y have the same mgf, then, for any t belonging to a closed neighborhood of zero,
$$M_X(t) = M_Y(t)$$
and
$$\sum_{i=1}^{n} \exp(t a_i) \, p_X(a_i) = \sum_{i=1}^{n} \exp(t a_i) \, p_Y(a_i)$$
By rearranging terms, we obtain
$$\sum_{i=1}^{n} \exp(t a_i) \left[p_X(a_i) - p_Y(a_i)\right] = 0$$
This can be true for any t belonging to a closed neighborhood of zero only if
$$p_X(a_i) - p_Y(a_i) = 0$$
for every i. It follows that the probability mass functions of X and Y are equal. As a consequence, their distribution functions are also equal.

It must be stressed that this proposition is extremely important and relevant from a practical viewpoint: in many cases where we need to prove that two distributions are equal, it is much easier to prove equality of the mgfs than to prove equality of the distribution functions.
6 Feller, W. (2008) An introduction to probability theory and its applications, Volume 2, Wiley.
7 See p. 106.

Also note that equality of the distribution functions can be replaced in the
proposition above by equality of the probability mass functions8 if X and Y are
discrete random variables, or by equality of the probability density functions9 if X
and Y are absolutely continuous random variables.

37.4 More details

37.4.1 Mgf of a linear transformation

The next proposition gives a formula for the mgf of a linear transformation.

Proposition 213 Let X be a random variable possessing an mgf $M_X(t)$. Define
$$Y = a + bX$$
where $a, b \in \mathbb{R}$ are two constants and $b \neq 0$. Then, the random variable Y possesses an mgf $M_Y(t)$ and
$$M_Y(t) = \exp(at) M_X(bt)$$

Proof. Using the definition of mgf, we obtain
$$\begin{aligned}
M_Y(t) &= E[\exp(tY)] = E[\exp(at + btX)] \\
&= E[\exp(at) \exp(btX)] = \exp(at) E[\exp(btX)] \\
&= \exp(at) M_X(bt)
\end{aligned}$$

If $M_X(t)$ is defined on a closed interval $[-h, h]$, then $M_Y(t)$ is defined on the interval $\left[-\frac{h}{|b|}, \frac{h}{|b|}\right]$.

37.4.2 Mgf of a sum

The next proposition shows how to derive the mgf of a sum of independent random variables.

Proposition 214 Let $X_1, \ldots, X_n$ be n mutually independent10 random variables. Let Z be their sum:
$$Z = \sum_{i=1}^{n} X_i$$
Then, the mgf of Z is the product of the mgfs of $X_1, \ldots, X_n$:
$$M_Z(t) = \prod_{i=1}^{n} M_{X_i}(t)$$
provided the latter exist.

Proof. This is proved as follows:
$$\begin{aligned}
M_Z(t) &= E[\exp(tZ)] \\
&= E\left[\exp\left(t \sum_{i=1}^{n} X_i\right)\right] \\
&= E\left[\exp\left(\sum_{i=1}^{n} t X_i\right)\right] \\
&= E\left[\prod_{i=1}^{n} \exp(t X_i)\right] \\
&\overset{A}{=} \prod_{i=1}^{n} E[\exp(t X_i)] \\
&\overset{B}{=} \prod_{i=1}^{n} M_{X_i}(t)
\end{aligned}$$
where: in step A we have used the properties of mutually independent variables11; in step B we have used the definition of mgf.

8 See p. 106.
9 See p. 107.
10 See p. 233.

37.5 Solved exercises

Some solved exercises on mgfs can be found below.

Exercise 1
Let X be a discrete random variable having a Bernoulli distribution12. Its support is
$$R_X = \{0, 1\}$$
and its probability mass function13 is
$$p_X(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \\ 0 & \text{if } x \notin R_X \end{cases}$$
where $p \in (0, 1)$ is a constant. Derive the mgf of X, if it exists.

Solution
Using the definition of mgf, we get
$$\begin{aligned}
M_X(t) &= E[\exp(tX)] = \sum_{x \in R_X} \exp(tx) \, p_X(x) \\
&= \exp(t \cdot 1) \, p_X(1) + \exp(t \cdot 0) \, p_X(0) \\
&= \exp(t) \, p + 1 \cdot (1 - p) = 1 - p + p \exp(t)
\end{aligned}$$
The mgf exists and it is well-defined because the above expected value exists for any $t \in \mathbb{R}$.
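A short symbolic check of this solution (an illustrative sketch added here, assuming SymPy is available):

# Minimal sketch: mgf of a Bernoulli variable by direct summation over the support.
import sympy as sp

t, p = sp.symbols("t p")
pmf = {0: 1 - p, 1: p}
M = sum(sp.exp(t * x) * prob for x, prob in pmf.items())
print(sp.expand(M))  # -p + p*exp(t) + 1, i.e. 1 - p + p*exp(t)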
11 See p. 234.
12 See p. 335.
13 See p. 106.

Exercise 2
Let X be a random variable with mgf
$$M_X(t) = \frac{1}{2} (1 + \exp(t))$$
Derive the variance of X.

Solution
We can use the following formula for computing the variance14:
$$\mathrm{Var}[X] = E[X^2] - E[X]^2$$
The expected value of X is computed by taking the first derivative of the mgf:
$$\frac{d M_X(t)}{dt} = \frac{1}{2} \exp(t)$$
and evaluating it at t = 0:
$$E[X] = \left. \frac{d M_X(t)}{dt} \right|_{t=0} = \frac{1}{2} \exp(0) = \frac{1}{2}$$
The second moment of X is computed by taking the second derivative of the mgf:
$$\frac{d^2 M_X(t)}{dt^2} = \frac{1}{2} \exp(t)$$
and evaluating it at t = 0:
$$E[X^2] = \left. \frac{d^2 M_X(t)}{dt^2} \right|_{t=0} = \frac{1}{2} \exp(0) = \frac{1}{2}$$
Therefore,
$$\mathrm{Var}[X] = E[X^2] - E[X]^2 = \frac{1}{2} - \left(\frac{1}{2}\right)^2 = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$$
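The same derivative-based computation can be done symbolically; a minimal SymPy sketch (illustrative addition, not part of the original text):

# Minimal sketch: variance from the mgf M_X(t) = (1 + exp(t))/2.
import sympy as sp

t = sp.symbols("t")
M = (1 + sp.exp(t)) / 2

EX = sp.diff(M, t, 1).subs(t, 0)   # first moment, 1/2
EX2 = sp.diff(M, t, 2).subs(t, 0)  # second moment, 1/2
print(EX2 - EX**2)                 # 1/4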

Exercise 3
A random variable X is said to have a Chi-square distribution15 with n degrees of freedom if its mgf is defined for any $t < \frac{1}{2}$ and it is equal to
$$M_X(t) = (1 - 2t)^{-n/2}$$
Define
$$Y = X_1 + X_2$$
where $X_1$ and $X_2$ are two independent random variables having Chi-square distributions with $n_1$ and $n_2$ degrees of freedom respectively. Prove that Y has a Chi-square distribution with $n_1 + n_2$ degrees of freedom.
14 See p. 156.
15 See p. 387.

Solution
The mgfs of $X_1$ and $X_2$ are
$$M_{X_1}(t) = (1 - 2t)^{-n_1/2}, \qquad M_{X_2}(t) = (1 - 2t)^{-n_2/2}$$
The mgf of a sum of independent random variables is the product of the mgfs of the summands:
$$M_Y(t) = (1 - 2t)^{-n_1/2} (1 - 2t)^{-n_2/2} = (1 - 2t)^{-(n_1 + n_2)/2}$$
Therefore, $M_Y(t)$ is the mgf of a Chi-square random variable with $n_1 + n_2$ degrees of freedom. As a consequence, Y has a Chi-square distribution with $n_1 + n_2$ degrees of freedom.
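The product of the two mgfs can also be combined symbolically; the sketch below (an illustration added here, assuming SymPy) shows that the exponents add up as claimed:

# Minimal sketch: the product of two Chi-square mgfs is a Chi-square mgf with n1 + n2 degrees of freedom.
import sympy as sp

t, n1, n2 = sp.symbols("t n1 n2", positive=True)
M1 = (1 - 2 * t) ** (-n1 / 2)
M2 = (1 - 2 * t) ** (-n2 / 2)

MY = sp.powsimp(M1 * M2)  # combine the powers of the common base (1 - 2*t)
print(MY)                 # (1 - 2*t)**(-n1/2 - n2/2), i.e. exponent -(n1 + n2)/2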
Chapter 38

Moment generating function of a random vector

The concept of joint moment generating function (joint mgf) is a multivariate


generalization of the concept of moment generating function (mgf). Similarly to
the univariate case, the joint mgf uniquely determines the joint distribution of its
associated random vector, and it can be used to derive the cross-moments1 of the
distribution by partial differentiation.
If you are not familiar with the univariate concept, you are advised to first read
the lecture entitled Moment generating functions (p. 289).

38.1 Definition

Let us start with a formal definition.

Definition 215 Let X be a K×1 random vector. If the expected value
$$E\left[\exp\left(t^{\top} X\right)\right] = E[\exp(t_1 X_1 + t_2 X_2 + \ldots + t_K X_K)]$$
exists and is finite for all K×1 real vectors t belonging to a closed rectangle H such that
$$H = [-h_1, h_1] \times [-h_2, h_2] \times \ldots \times [-h_K, h_K] \subseteq \mathbb{R}^K$$
with $h_i > 0$ for all $i = 1, \ldots, K$, then we say that X possesses a joint moment generating function and the function $M_X : H \to \mathbb{R}$ defined by
$$M_X(t) = E\left[\exp\left(t^{\top} X\right)\right]$$
is called the joint moment generating function of X.

As an example, we derive the joint mgf of a standard multivariate normal random vector.

Example 216 Let X be a K×1 standard multivariate normal random vector2. Its support is
$$R_X = \mathbb{R}^K$$
1 See p. 285.
2 See p. 439.


and its joint probability density function3 is
$$f_X(x) = (2\pi)^{-K/2} \exp\left(-\frac{1}{2} x^{\top} x\right)$$
As explained in the lecture entitled Multivariate normal distribution (p. 439), the K components of X are K mutually independent4 standard normal random variables, because the joint probability density function of X can be written as
$$f_X(x) = f(x_1) f(x_2) \ldots f(x_K)$$
where $x_i$ is the i-th entry of x, and $f(x_i)$ is the probability density function of a standard normal random variable:
$$f(x_i) = (2\pi)^{-1/2} \exp\left(-\frac{1}{2} x_i^2\right)$$
The joint mgf of X can be derived as follows:
$$\begin{aligned}
M_X(t) &= E\left[\exp\left(t^{\top} X\right)\right] \\
&= E[\exp(t_1 X_1 + t_2 X_2 + \ldots + t_K X_K)] \\
&= E\left[\prod_{i=1}^{K} \exp(t_i X_i)\right] \\
&\overset{A}{=} \prod_{i=1}^{K} E[\exp(t_i X_i)] \\
&\overset{B}{=} \prod_{i=1}^{K} M_{X_i}(t_i)
\end{aligned}$$
where: in step A we have used the fact that the entries of X are mutually independent5; in step B we have used the definition of mgf of a random variable6. Since the mgf of a standard normal random variable is7
$$M_{X_i}(t_i) = \exp\left(\frac{1}{2} t_i^2\right)$$
the joint mgf of X is
$$M_X(t) = \prod_{i=1}^{K} M_{X_i}(t_i) = \prod_{i=1}^{K} \exp\left(\frac{1}{2} t_i^2\right) = \exp\left(\frac{1}{2} \sum_{i=1}^{K} t_i^2\right) = \exp\left(\frac{1}{2} t^{\top} t\right)$$
Note that the mgf $M_{X_i}(t_i)$ of a standard normal random variable is defined for any $t_i \in \mathbb{R}$. As a consequence, the joint mgf of X is defined for any $t \in \mathbb{R}^K$.
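A quick numerical illustration of this formula (a sketch added here, assuming NumPy is available): the sample average of $\exp(t^{\top} X)$ over many draws of a standard normal vector should be close to $\exp(t^{\top} t / 2)$.

# Minimal sketch: Monte Carlo check of M_X(t) = exp(t't/2) for a standard normal vector.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(size=(200_000, 3))  # 200,000 draws of a 3x1 standard normal vector
t = np.array([0.3, -0.2, 0.5])

estimate = np.mean(np.exp(X @ t))           # sample analogue of E[exp(t'X)]
exact = np.exp(0.5 * t @ t)
print(estimate, exact)                      # the two numbers should be close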
3 See p. 117.
4 See p. 233.
5 See p. 234.
6 See p. 289.
7 See p. 378.

38.2 Cross-moments and joint mgfs

The next proposition shows how the joint mgf can be used to derive the cross-moments of a random vector.

Proposition 217 If a K×1 random vector X possesses a joint mgf $M_X(t)$, then it possesses finite cross-moments of order n for any $n \in \mathbb{N}$. Furthermore, if you define a cross-moment of order n as
$$\mu_X(n_1, n_2, \ldots, n_K) = E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K}\right]$$
where $n_1, n_2, \ldots, n_K \in \mathbb{Z}_+$ and $n = \sum_{k=1}^{K} n_k$, then
$$\mu_X(n_1, n_2, \ldots, n_K) = \left. \frac{\partial^{n_1 + n_2 + \ldots + n_K} M_X(t_1, t_2, \ldots, t_K)}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}} \right|_{t_1 = 0, t_2 = 0, \ldots, t_K = 0}$$
where the derivative on the right-hand side is an n-th order cross-partial derivative of $M_X(t)$ evaluated at the point $t_1 = 0, t_2 = 0, \ldots, t_K = 0$.

Proof. We do not provide a rigorous proof of this proposition, but see, e.g., Pfeiffer8 (1978) and DasGupta9 (2010). The intuition of the proof, however, is straightforward: since the expected value is a linear operator and differentiation is a linear operation, under appropriate conditions one can differentiate through the expected value, as follows:
$$\begin{aligned}
\frac{\partial^{n_1 + n_2 + \ldots + n_K} M_X(t_1, t_2, \ldots, t_K)}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}}
&= \frac{\partial^{n_1 + n_2 + \ldots + n_K}}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}} E[\exp(t_1 X_1 + t_2 X_2 + \ldots + t_K X_K)] \\
&= E\left[\frac{\partial^{n_1 + n_2 + \ldots + n_K}}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}} \exp(t_1 X_1 + t_2 X_2 + \ldots + t_K X_K)\right] \\
&= E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K} \exp(t_1 X_1 + t_2 X_2 + \ldots + t_K X_K)\right]
\end{aligned}$$
which, evaluated at the point $t_1 = 0, t_2 = 0, \ldots, t_K = 0$, yields
$$\begin{aligned}
\left. \frac{\partial^{n_1 + n_2 + \ldots + n_K} M_X(t_1, t_2, \ldots, t_K)}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}} \right|_{t_1 = 0, t_2 = 0, \ldots, t_K = 0}
&= E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K} \exp(0 \cdot X_1 + 0 \cdot X_2 + \ldots + 0 \cdot X_K)\right] \\
&= E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K}\right] \\
&= \mu_X(n_1, n_2, \ldots, n_K)
\end{aligned}$$

The following example shows how the above proposition can be applied.

Example 218 Let us continue with the previous example. The joint mgf of a 2×1 standard normal random vector X is
$$M_X(t) = \exp\left(\frac{1}{2} t^{\top} t\right) = \exp\left(\frac{1}{2} t_1^2 + \frac{1}{2} t_2^2\right)$$
8 Pfeiffer, P. E. (1978) Concepts of probability theory, Courier Dover Publications.
9 DasGupta, A. (2010) Fundamentals of probability: a first course, Springer.

The second cross-moment of X can be computed by taking the second cross-partial


derivative of the joint mgf:

X (1; 1) = E [X1 X2 ]
@2 1 2 1 2
= exp t + t
@t1 @t2 2 1 2 2 t1 =0;t2 =0
@ @ 1 2 1 2
= exp t + t
@t1 @t2 2 1 2 2 t1 =0;t2 =0
@ 1 2 1 2
= t2 exp t + t
@t1 2 1 2 2 t1 =0;t2 =0
1 2 1 2
= t1 t2 exp t + t =0
2 1 2 2 t1 =0;t2 =0
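The cross-partial differentiation of Example 218 can be reproduced with SymPy (illustrative sketch, not part of the original text):

# Minimal sketch: E[X1 X2] = 0 for a 2x1 standard normal vector, via the joint mgf.
import sympy as sp

t1, t2 = sp.symbols("t1 t2")
M = sp.exp(t1**2 / 2 + t2**2 / 2)

cross_moment = sp.diff(M, t1, t2).subs({t1: 0, t2: 0})
print(cross_moment)  # 0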

38.3 Joint distributions and joint mgfs

One of the most important properties of the joint mgf is that it completely characterizes the joint distribution of a random vector.

Proposition 219 (equality of distributions) Let X and Y be two K×1 random vectors, possessing joint mgfs $M_X(t)$ and $M_Y(t)$. Denote by $F_X(x)$ and $F_Y(y)$ their joint distribution functions10. X and Y have the same distribution, i.e., $F_X(x) = F_Y(x)$ for any $x \in \mathbb{R}^K$, if and only if they have the same mgfs, i.e., $M_X(t) = M_Y(t)$ for any $t \in H \subseteq \mathbb{R}^K$.

Proof. The reader may refer to Feller11 (2008) for a rigorous proof. The informal proof given here is almost identical to that given for the univariate case12. We confine our attention to the case in which X and Y are discrete random vectors taking only finitely many values. As far as the left-to-right direction of the implication is concerned, it suffices to note that if X and Y have the same distribution, then
$$M_X(t) = E\left[\exp\left(t^{\top} X\right)\right] = E\left[\exp\left(t^{\top} Y\right)\right] = M_Y(t)$$
The right-to-left direction of the implication is proved as follows. Denote by $R_X$ and $R_Y$ the supports of X and Y, and by $p_X(x)$ and $p_Y(y)$ their joint probability mass functions13. Define the union of the two supports:
$$A = R_X \cup R_Y$$
and denote its members by $a_1, \ldots, a_n$. The joint mgf of X can be written as
$$\begin{aligned}
M_X(t) &= E\left[\exp\left(t^{\top} X\right)\right] \\
&\overset{A}{=} \sum_{x \in R_X} \exp\left(t^{\top} x\right) p_X(x) \\
&\overset{B}{=} \sum_{i=1}^{n} \exp\left(t^{\top} a_i\right) p_X(a_i)
\end{aligned}$$
10 See p. 118.
11 Feller, W. (2008) An introduction to probability theory and its applications, Volume 2, Wiley.
12 See p. 291.
13 See p. 116.

where: in step A we have used the definition of expected value; in step B we have used the fact that $p_X(a_i) = 0$ if $a_i \notin R_X$. By the same line of reasoning, the joint mgf of Y can be written as
$$M_Y(t) = \sum_{i=1}^{n} \exp\left(t^{\top} a_i\right) p_Y(a_i)$$
If X and Y have the same joint mgf, then
$$M_X(t) = M_Y(t)$$
for any t belonging to a closed rectangle where the two mgfs are well-defined, and
$$\sum_{i=1}^{n} \exp\left(t^{\top} a_i\right) p_X(a_i) = \sum_{i=1}^{n} \exp\left(t^{\top} a_i\right) p_Y(a_i)$$
By rearranging terms, we obtain
$$\sum_{i=1}^{n} \exp\left(t^{\top} a_i\right) \left[p_X(a_i) - p_Y(a_i)\right] = 0$$
This equality can be verified for every t only if
$$p_X(a_i) - p_Y(a_i) = 0$$
for every i. As a consequence, the joint probability mass functions of X and Y are equal, which implies that their joint distribution functions are also equal.

This proposition is used very often in applications where one needs to demonstrate that two joint distributions are equal. In such applications, proving equality of the joint mgfs is often much easier than proving equality of the joint distribution functions (see also the comments to Proposition 212).

38.4 More details

38.4.1 Joint mgf of a linear transformation

The next proposition gives a formula for the joint mgf of a linear transformation.

Proposition 220 Let X be a K×1 random vector possessing a joint mgf $M_X(t)$. Define
$$Y = A + BX$$
where A is an L×1 constant vector and B is an L×K constant matrix. Then, the L×1 random vector Y possesses a joint mgf $M_Y(t)$, and
$$M_Y(t) = \exp\left(t^{\top} A\right) M_X\left(B^{\top} t\right)$$

Proof. Using the definition of joint mgf, we obtain
$$\begin{aligned}
M_Y(t) &= E\left[\exp\left(t^{\top} Y\right)\right] \\
&= E\left[\exp\left(t^{\top} A + t^{\top} B X\right)\right] \\
&= E\left[\exp\left(t^{\top} A\right) \exp\left(t^{\top} B X\right)\right] \\
&= \exp\left(t^{\top} A\right) E\left[\exp\left(t^{\top} B X\right)\right] \\
&= \exp\left(t^{\top} A\right) E\left[\exp\left(\left(B^{\top} t\right)^{\top} X\right)\right] \\
&= \exp\left(t^{\top} A\right) M_X\left(B^{\top} t\right)
\end{aligned}$$

If $M_X(t)$ is defined on a closed rectangle H, then $M_Y(t)$ is defined on another closed rectangle whose shape and location depend on A and B.

38.4.2 Joint mgf of a vector with independent entries

The next proposition shows how to derive the joint mgf of a vector whose components are independent random variables.

Proposition 221 Let X be a K×1 random vector. Let its entries $X_1, \ldots, X_K$ be K mutually independent random variables possessing an mgf. Denote the mgf of the i-th entry of X by $M_{X_i}(t_i)$. Then, the joint mgf of X is
$$M_X(t_1, \ldots, t_K) = \prod_{i=1}^{K} M_{X_i}(t_i)$$

Proof. This is proved as follows:
$$\begin{aligned}
M_X(t) &= E\left[\exp\left(t^{\top} X\right)\right] \\
&= E\left[\exp\left(\sum_{i=1}^{K} t_i X_i\right)\right] \\
&= E\left[\prod_{i=1}^{K} \exp(t_i X_i)\right] \\
&\overset{A}{=} \prod_{i=1}^{K} E[\exp(t_i X_i)] \\
&\overset{B}{=} \prod_{i=1}^{K} M_{X_i}(t_i)
\end{aligned}$$
where: in step A we have used the fact that the entries of X are mutually independent; in step B we have used the definition of mgf of a random variable.

38.4.3 Joint mgf of a sum

The next proposition shows how to derive the joint mgf of a sum of independent random vectors.

Proposition 222 Let $X_1, \ldots, X_n$ be n mutually independent random vectors14, all of dimension K×1. Let Z be their sum:
$$Z = \sum_{i=1}^{n} X_i$$
Then, the joint mgf of Z is the product of the joint mgfs of $X_1, \ldots, X_n$:
$$M_Z(t) = \prod_{i=1}^{n} M_{X_i}(t)$$
provided the latter exist.

Proof. This is proved as follows:
$$\begin{aligned}
M_Z(t) &= E\left[\exp\left(t^{\top} Z\right)\right] \\
&= E\left[\exp\left(t^{\top} \sum_{i=1}^{n} X_i\right)\right] \\
&= E\left[\exp\left(\sum_{i=1}^{n} t^{\top} X_i\right)\right] \\
&= E\left[\prod_{i=1}^{n} \exp\left(t^{\top} X_i\right)\right] \\
&\overset{A}{=} \prod_{i=1}^{n} E\left[\exp\left(t^{\top} X_i\right)\right] \\
&\overset{B}{=} \prod_{i=1}^{n} M_{X_i}(t)
\end{aligned}$$
where: in step A we have used the fact that the vectors $X_i$ are mutually independent; in step B we have used the definition of joint mgf.

14 See p. 235.

38.5 Solved exercises

Some solved exercises on joint mgfs can be found below.

Exercise 1
Let X be a 2×1 discrete random vector and denote its components by $X_1$ and $X_2$. Let the support of X be
$$R_X = \left\{ [1 \ 1]^{\top}, [2 \ 0]^{\top}, [0 \ 0]^{\top} \right\}$$
and its joint probability mass function be
$$p_X(x) = \begin{cases} 1/3 & \text{if } x = [1 \ 1]^{\top} \\ 1/3 & \text{if } x = [2 \ 0]^{\top} \\ 1/3 & \text{if } x = [0 \ 0]^{\top} \\ 0 & \text{otherwise} \end{cases}$$
Derive the joint mgf of X, if it exists.

Solution
By using the definition of joint mgf, we get
$$\begin{aligned}
M_X(t) &= E\left[\exp\left(t^{\top} X\right)\right] = E[\exp(t_1 X_1 + t_2 X_2)] \\
&= \sum_{(x_1, x_2) \in R_X} \exp(t_1 x_1 + t_2 x_2) \, p_X(x_1, x_2) \\
&= \exp(t_1 \cdot 1 + t_2 \cdot 1) \, p_X(1, 1) + \exp(t_1 \cdot 2 + t_2 \cdot 0) \, p_X(2, 0) + \exp(t_1 \cdot 0 + t_2 \cdot 0) \, p_X(0, 0) \\
&= \frac{1}{3} \exp(t_1 + t_2) + \frac{1}{3} \exp(2 t_1) + \frac{1}{3} \exp(0) \\
&= \frac{1}{3} [1 + \exp(2 t_1) + \exp(t_1 + t_2)]
\end{aligned}$$
Obviously, the joint mgf exists and it is well-defined because the above expected value exists for any $t \in \mathbb{R}^2$.
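A symbolic check of this joint mgf (an illustrative sketch, assuming SymPy is available):

# Minimal sketch: joint mgf of the discrete vector in Exercise 1 by direct summation.
import sympy as sp

t1, t2 = sp.symbols("t1 t2")
pmf = {(1, 1): sp.Rational(1, 3), (2, 0): sp.Rational(1, 3), (0, 0): sp.Rational(1, 3)}

M = sum(sp.exp(t1 * x1 + t2 * x2) * p for (x1, x2), p in pmf.items())
print(sp.simplify(M - (1 + sp.exp(2 * t1) + sp.exp(t1 + t2)) / 3))  # 0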

Exercise 2
Let
$$X = [X_1 \ X_2]^{\top}$$
be a 2×1 random vector with joint mgf
$$M_{X_1, X_2}(t_1, t_2) = \frac{1}{3} + \frac{2}{3} \exp(t_1 + 2 t_2)$$
Derive the expected value of $X_1$.

Solution
The mgf of $X_1$ is
$$\begin{aligned}
M_{X_1}(t_1) &= E[\exp(t_1 X_1)] = E[\exp(t_1 X_1 + 0 \cdot X_2)] \\
&= M_{X_1, X_2}(t_1, 0) = \frac{1}{3} + \frac{2}{3} \exp(t_1 + 2 \cdot 0) \\
&= \frac{1}{3} + \frac{2}{3} \exp(t_1)
\end{aligned}$$
The expected value of $X_1$ is obtained by taking the first derivative of its mgf:
$$\frac{d M_{X_1}(t_1)}{d t_1} = \frac{2}{3} \exp(t_1)$$
and evaluating it at $t_1 = 0$:
$$E[X_1] = \left. \frac{d M_{X_1}(t_1)}{d t_1} \right|_{t_1 = 0} = \frac{2}{3} \exp(0) = \frac{2}{3}$$

Exercise 3
Let
$$X = [X_1 \ X_2]^{\top}$$
be a 2×1 random vector with joint mgf
$$M_{X_1, X_2}(t_1, t_2) = \frac{1}{3} [1 + \exp(t_1 + 2 t_2) + \exp(2 t_1 + t_2)]$$
Derive the covariance between $X_1$ and $X_2$.

Solution
We can use the following covariance formula:
$$\mathrm{Cov}[X_1, X_2] = E[X_1 X_2] - E[X_1] E[X_2]$$
The mgf of $X_1$ is
$$\begin{aligned}
M_{X_1}(t_1) &= E[\exp(t_1 X_1)] = E[\exp(t_1 X_1 + 0 \cdot X_2)] \\
&= M_{X_1, X_2}(t_1, 0) = \frac{1}{3} [1 + \exp(t_1 + 2 \cdot 0) + \exp(2 t_1 + 0)] \\
&= \frac{1}{3} [1 + \exp(t_1) + \exp(2 t_1)]
\end{aligned}$$
The expected value of $X_1$ is obtained by taking the first derivative of its mgf:
$$\frac{d M_{X_1}(t_1)}{d t_1} = \frac{1}{3} [\exp(t_1) + 2 \exp(2 t_1)]$$
and evaluating it at $t_1 = 0$:
$$E[X_1] = \left. \frac{d M_{X_1}(t_1)}{d t_1} \right|_{t_1 = 0} = \frac{1}{3} [\exp(0) + 2 \exp(0)] = 1$$
The mgf of $X_2$ is
$$\begin{aligned}
M_{X_2}(t_2) &= E[\exp(t_2 X_2)] = E[\exp(0 \cdot X_1 + t_2 X_2)] \\
&= M_{X_1, X_2}(0, t_2) = \frac{1}{3} [1 + \exp(0 + 2 t_2) + \exp(2 \cdot 0 + t_2)] \\
&= \frac{1}{3} [1 + \exp(2 t_2) + \exp(t_2)]
\end{aligned}$$
To compute the expected value of $X_2$ we take the first derivative of its mgf:
$$\frac{d M_{X_2}(t_2)}{d t_2} = \frac{1}{3} [2 \exp(2 t_2) + \exp(t_2)]$$
and we evaluate it at $t_2 = 0$:
$$E[X_2] = \left. \frac{d M_{X_2}(t_2)}{d t_2} \right|_{t_2 = 0} = \frac{1}{3} [2 \exp(0) + \exp(0)] = 1$$
The second cross-moment of X is computed by taking the second cross-partial derivative of the joint mgf:
$$\begin{aligned}
\frac{\partial^2 M_{X_1, X_2}(t_1, t_2)}{\partial t_1 \partial t_2} &= \frac{\partial}{\partial t_1} \left[\frac{\partial}{\partial t_2} \frac{1}{3} [1 + \exp(t_1 + 2 t_2) + \exp(2 t_1 + t_2)]\right] \\
&= \frac{\partial}{\partial t_1} \frac{1}{3} [2 \exp(t_1 + 2 t_2) + \exp(2 t_1 + t_2)] \\
&= \frac{1}{3} [2 \exp(t_1 + 2 t_2) + 2 \exp(2 t_1 + t_2)]
\end{aligned}$$
and evaluating it at $(t_1, t_2) = (0, 0)$:
$$E[X_1 X_2] = \left. \frac{\partial^2 M_{X_1, X_2}(t_1, t_2)}{\partial t_1 \partial t_2} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{3} [2 \exp(0) + 2 \exp(0)] = \frac{4}{3}$$
Therefore,
$$\mathrm{Cov}[X_1, X_2] = E[X_1 X_2] - E[X_1] E[X_2] = \frac{4}{3} - 1 \cdot 1 = \frac{1}{3}$$
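The same covariance can be obtained directly by differentiating the joint mgf with SymPy (illustrative sketch added here):

# Minimal sketch: covariance of X1 and X2 from the joint mgf of Exercise 3.
import sympy as sp

t1, t2 = sp.symbols("t1 t2")
M = (1 + sp.exp(t1 + 2 * t2) + sp.exp(2 * t1 + t2)) / 3

EX1 = sp.diff(M, t1).subs({t1: 0, t2: 0})        # 1
EX2 = sp.diff(M, t2).subs({t1: 0, t2: 0})        # 1
EX1X2 = sp.diff(M, t1, t2).subs({t1: 0, t2: 0})  # 4/3
print(EX1X2 - EX1 * EX2)                         # 1/3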
Chapter 39

Characteristic function of a random variable

In the lecture entitled Moment generating function (p. 289), we have explained
that the distribution of a random variable can be characterized in terms of its
moment generating function, a real function that enjoys two important properties:
it uniquely determines its associated probability distribution, and its derivatives
at zero are equal to the moments of the random variable. We have also explained
that not all random variables possess a moment generating function.
The characteristic function (cf) enjoys properties that are almost identical to
those enjoyed by the moment generating function, but it has an important advantage: all random variables possess a characteristic function.

39.1 Definition

We start this lecture by giving a definition of characteristic function.

Definition 223 Let X be a random variable. Let $i = \sqrt{-1}$ be the imaginary unit. The function $\varphi : \mathbb{R} \to \mathbb{C}$ defined by
$$\varphi_X(t) = E[\exp(itX)]$$
is called the characteristic function of X.

The first thing to be noted is that the characteristic function $\varphi_X(t)$ exists for any t. This can be proved as follows:
$$\begin{aligned}
\varphi_X(t) &= E[\exp(itX)] \\
&= E[\cos(tX) + i \sin(tX)] \\
&= E[\cos(tX)] + i E[\sin(tX)]
\end{aligned}$$
and the last two expected values are well-defined, because the sine and cosine functions are bounded in the interval $[-1, 1]$.


39.2 Moments and cfs

Like the moment generating function of a random variable, the characteristic function can be used to derive the moments of X, as stated in the following proposition.

Proposition 224 Let X be a random variable and $\varphi_X(t)$ its characteristic function. Let $n \in \mathbb{N}$. If the n-th moment of X, denoted by $\mu_X(n)$, exists and is finite, then $\varphi_X(t)$ is n times continuously differentiable and
$$\mu_X(n) = E[X^n] = \frac{1}{i^n} \left. \frac{d^n \varphi_X(t)}{dt^n} \right|_{t=0}$$
where $\left. \frac{d^n \varphi_X(t)}{dt^n} \right|_{t=0}$ is the n-th derivative of $\varphi_X(t)$ with respect to t, evaluated at the point t = 0.

Proof. The proof of the above proposition is quite complex (see, e.g., Resnick1 - 1999). The intuition, however, is straightforward: since the expected value is a linear operator and differentiation is a linear operation, under appropriate conditions one can differentiate through the expected value, as follows:
$$\begin{aligned}
\frac{d^n \varphi_X(t)}{dt^n} &= \frac{d^n}{dt^n} E[\exp(itX)] \\
&= E\left[\frac{d^n}{dt^n} \exp(itX)\right] \\
&= E[(iX)^n \exp(itX)] \\
&= i^n E[X^n \exp(itX)]
\end{aligned}$$
which, evaluated at the point t = 0, yields
$$\left. \frac{d^n \varphi_X(t)}{dt^n} \right|_{t=0} = i^n E[X^n \exp(0 \cdot iX)] = i^n E[X^n] = i^n \mu_X(n)$$

In practice, the proposition above is not very useful when one wants to compute a moment of a random variable, because it requires knowing in advance whether the moment exists or not. A much more useful statement is provided by the next proposition.

Proposition 225 Let X be a random variable and $\varphi_X(t)$ its characteristic function. If $\varphi_X(t)$ is n times differentiable at the point t = 0, then

1. if n is even, the k-th moment of X exists and is finite for any $k \leq n$;

2. if n is odd, the k-th moment of X exists and is finite for any $k < n$.

In both cases, the following holds:
$$\mu_X(k) = E[X^k] = \frac{1}{i^k} \left. \frac{d^k \varphi_X(t)}{dt^k} \right|_{t=0}$$

1 Resnick, S. I. (1999) A Probability Path, Birkhauser.



Proof. See e.g. Ushakov2 (1999).

The following example shows how this proposition can be used to compute the second moment of an exponential random variable.

Example 226 Let X be an exponential random variable with parameter $\lambda \in \mathbb{R}_{++}$. Its support is
$$R_X = [0, \infty)$$
and its probability density function is
$$f_X(x) = \begin{cases} \lambda \exp(-\lambda x) & \text{if } x \in R_X \\ 0 & \text{if } x \notin R_X \end{cases}$$
Its characteristic function is
$$\varphi_X(t) = E[\exp(itX)] = \frac{\lambda}{\lambda - it}$$
which is proved in the lecture entitled Exponential distribution (p. 365). Note that dividing by $(\lambda - it)$ does not pose any division-by-zero problem, because $\lambda > 0$ and the denominator is different from 0 also when t = 0. The first derivative of the characteristic function is
$$\frac{d \varphi_X(t)}{dt} = \frac{i \lambda}{(\lambda - it)^2}$$
The second derivative of the characteristic function is
$$\frac{d^2 \varphi_X(t)}{dt^2} = \frac{2 i^2 \lambda}{(\lambda - it)^3}$$
Evaluating it at t = 0, we obtain
$$\left. \frac{d^2 \varphi_X(t)}{dt^2} \right|_{t=0} = \frac{2 i^2}{\lambda^2}$$
Therefore, the second moment of X exists and is finite. Furthermore, it can be computed as
$$E[X^2] = \frac{1}{i^2} \left. \frac{d^2 \varphi_X(t)}{dt^2} \right|_{t=0} = \frac{1}{i^2} \cdot \frac{2 i^2}{\lambda^2} = \frac{2}{\lambda^2}$$
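The computation in Example 226 can be verified with SymPy, which handles the imaginary unit directly (illustrative sketch added here):

# Minimal sketch: second moment of the exponential distribution from its characteristic function.
import sympy as sp

t = sp.symbols("t", real=True)
lam = sp.symbols("lam", positive=True)
phi = lam / (lam - sp.I * t)

second_moment = sp.diff(phi, t, 2).subs(t, 0) / sp.I**2
print(sp.simplify(second_moment))  # 2/lam**2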

39.3 Distributions and cfs

Characteristic functions, like moment generating functions, can also be used to characterize the distribution of a random variable.

Proposition 227 (equality of distributions) Let X and Y be two random variables. Denote by $F_X(x)$ and $F_Y(y)$ their distribution functions3 and by $\varphi_X(t)$ and $\varphi_Y(t)$ their characteristic functions. X and Y have the same distribution, i.e., $F_X(x) = F_Y(x)$ for any x, if and only if they have the same characteristic function, i.e., $\varphi_X(t) = \varphi_Y(t)$ for any t.
2 Ushakov, N. G. (1999) Selected topics in characteristic functions, VSP.
3 See p. 108.

Proof. For a formal proof, see, e.g., Resnick4 (1999). An informal proof for the special case in which X and Y have a finite support can be provided along the same lines of the proof of Proposition 212, which concerns the moment generating function. This is left as an exercise (just replace exp(tX) and exp(tY) in that proof with exp(itX) and exp(itY)).

This property is analogous to the property of moment generating functions stated in Proposition 212. The same comments we made about that proposition also apply to this one.

39.4 More details

39.4.1 Cf of a linear transformation

The next proposition gives a formula for the characteristic function of a linear transformation.

Proposition 228 Let X be a random variable with characteristic function $\varphi_X(t)$. Define
$$Y = a + bX$$
where $a, b \in \mathbb{R}$ are two constants and $b \neq 0$. Then, the characteristic function of Y is
$$\varphi_Y(t) = \exp(iat) \varphi_X(bt)$$

Proof. Using the definition of characteristic function, we get
$$\begin{aligned}
\varphi_Y(t) &= E[\exp(itY)] \\
&= E[\exp(iat + ibtX)] \\
&= E[\exp(iat) \exp(ibtX)] \\
&= \exp(iat) E[\exp(ibtX)] \\
&= \exp(iat) \varphi_X(bt)
\end{aligned}$$

39.4.2 Cf of a sum

The next proposition shows how to derive the characteristic function of a sum of independent random variables.

Proposition 229 Let $X_1, \ldots, X_n$ be n mutually independent random variables5. Let Z be their sum:
$$Z = \sum_{j=1}^{n} X_j$$
Then, the characteristic function of Z is the product of the characteristic functions of $X_1, \ldots, X_n$:
$$\varphi_Z(t) = \prod_{j=1}^{n} \varphi_{X_j}(t)$$
4 Resnick, S. I. (1999) A Probability Path, Birkhauser.


5 See p. 233.

Proof. This is proved as follows:
$$\begin{aligned}
\varphi_Z(t) &= E[\exp(itZ)] \\
&= E\left[\exp\left(it \sum_{j=1}^{n} X_j\right)\right] \\
&= E\left[\exp\left(\sum_{j=1}^{n} it X_j\right)\right] \\
&= E\left[\prod_{j=1}^{n} \exp(it X_j)\right] \\
&\overset{A}{=} \prod_{j=1}^{n} E[\exp(it X_j)] \\
&\overset{B}{=} \prod_{j=1}^{n} \varphi_{X_j}(t)
\end{aligned}$$
where: in step A we have used the properties of mutually independent variables6; in step B we have used the definition of characteristic function.

39.4.3 Computation of the characteristic function

When X is a discrete random variable with support $R_X$ and probability mass function $p_X(x)$, its characteristic function is
$$\varphi_X(t) = E[\exp(itX)] = \sum_{x \in R_X} \exp(itx) \, p_X(x)$$
Thus, the computation of the characteristic function is pretty straightforward: all we need to do is to sum the complex numbers $\exp(itx) p_X(x)$ over all values of x belonging to the support of X.

When X is an absolutely continuous random variable with probability density function $f_X(x)$, its characteristic function is
$$\varphi_X(t) = E[\exp(itX)] = \int_{-\infty}^{\infty} \exp(itx) f_X(x) \, dx$$
The right-hand side integral is a contour integral of a complex function along the real axis. As people reading these lecture notes are usually not familiar with contour integration (a topic in complex analysis), we avoid it altogether in the rest of this book. We instead exploit the fact that
$$\exp(itx) = \cos(tx) + i \sin(tx)$$
to rewrite the contour integral as the complex sum of two ordinary integrals:
$$\int_{-\infty}^{\infty} \exp(itx) f_X(x) \, dx = \int_{-\infty}^{\infty} \cos(tx) f_X(x) \, dx + i \int_{-\infty}^{\infty} \sin(tx) f_X(x) \, dx$$
and to compute the two integrals separately.

6 See p. 234.

39.5 Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1
Let X be a discrete random variable having support
$$R_X = \{0, 1, 2\}$$
and probability mass function
$$p_X(x) = \begin{cases} 1/3 & \text{if } x = 0 \\ 1/3 & \text{if } x = 1 \\ 1/3 & \text{if } x = 2 \\ 0 & \text{if } x \notin R_X \end{cases}$$
Derive the characteristic function of X.

Solution
By using the definition of characteristic function, we obtain
$$\begin{aligned}
\varphi_X(t) &= E[\exp(itX)] = \sum_{x \in R_X} \exp(itx) \, p_X(x) \\
&= \exp(it \cdot 0) \, p_X(0) + \exp(it \cdot 1) \, p_X(1) + \exp(it \cdot 2) \, p_X(2) \\
&= \frac{1}{3} + \frac{1}{3} \exp(it) + \frac{1}{3} \exp(2it) = \frac{1}{3} [1 + \exp(it) + \exp(2it)]
\end{aligned}$$

Exercise 2
Use the characteristic function found in the previous exercise to derive the variance of X.

Solution
We can use the following formula for computing the variance:
$$\mathrm{Var}[X] = E[X^2] - E[X]^2$$
The expected value of X is computed by taking the first derivative of the characteristic function:
$$\frac{d \varphi_X(t)}{dt} = \frac{1}{3} [i \exp(it) + 2i \exp(2it)]$$
evaluating it at t = 0, and dividing it by i:
$$E[X] = \frac{1}{i} \left. \frac{d \varphi_X(t)}{dt} \right|_{t=0} = \frac{1}{i} \cdot \frac{1}{3} [i \exp(i \cdot 0) + 2i \exp(2i \cdot 0)] = 1$$
The second moment of X is computed by taking the second derivative of the characteristic function:
$$\frac{d^2 \varphi_X(t)}{dt^2} = \frac{1}{3} \left[i^2 \exp(it) + 4i^2 \exp(2it)\right]$$
evaluating it at t = 0, and dividing it by $i^2$:
$$E[X^2] = \frac{1}{i^2} \left. \frac{d^2 \varphi_X(t)}{dt^2} \right|_{t=0} = \frac{1}{i^2} \cdot \frac{1}{3} \left[i^2 \exp(i \cdot 0) + 4i^2 \exp(2i \cdot 0)\right] = \frac{5}{3}$$
Therefore,
$$\mathrm{Var}[X] = E[X^2] - E[X]^2 = \frac{5}{3} - 1^2 = \frac{2}{3}$$
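A symbolic version of this computation (illustrative sketch added here, assuming SymPy is available):

# Minimal sketch: variance from the characteristic function of Exercise 1.
import sympy as sp

t = sp.symbols("t", real=True)
phi = (1 + sp.exp(sp.I * t) + sp.exp(2 * sp.I * t)) / 3

EX = sp.diff(phi, t, 1).subs(t, 0) / sp.I      # 1
EX2 = sp.diff(phi, t, 2).subs(t, 0) / sp.I**2  # 5/3
print(sp.simplify(EX2 - EX**2))                # 2/3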

Exercise 3
Read and try to understand how the characteristic functions of the uniform and
exponential distributions are derived in the lectures entitled Uniform distribution
(p. 359) and Exponential distribution (p. 365).
Chapter 40

Characteristic function of a random vector

This lecture introduces the notion of joint characteristic function (joint cf) of a
random vector, which is a multivariate generalization of the concept of characteristic function of a random variable. Before reading this lecture, you are advised to first read the lecture entitled Characteristic function (p. 307).

40.1 Definition

Let us start this lecture with a definition.

Definition 230 Let X be a K×1 random vector. Let $i = \sqrt{-1}$ be the imaginary unit. The function $\varphi : \mathbb{R}^K \to \mathbb{C}$ defined by
$$\varphi_X(t) = E\left[\exp\left(it^{\top} X\right)\right] = E\left[\exp\left(i \sum_{j=1}^{K} t_j X_j\right)\right]$$
is called the joint characteristic function of X.

The first thing to be noted is that the joint characteristic function $\varphi_X(t)$ exists for any $t \in \mathbb{R}^K$. This can be proved as follows:
$$\begin{aligned}
\varphi_X(t) &= E\left[\exp\left(it^{\top} X\right)\right] \\
&= E\left[\cos\left(t^{\top} X\right) + i \sin\left(t^{\top} X\right)\right] \\
&= E\left[\cos\left(t^{\top} X\right)\right] + i E\left[\sin\left(t^{\top} X\right)\right]
\end{aligned}$$
and the last two expected values are well-defined, because the sine and cosine functions are bounded in the interval $[-1, 1]$.

40.2 Cross-moments and joint cfs

Like the joint moment generating function1 of a random vector, the joint characteristic function can be used to derive the cross-moments2 of X, as stated in the following proposition.

1 See p. 297.
2 See p. 285.

Proposition 231 Let X be a random vector and $\varphi_X(t)$ its joint characteristic function. Let $n \in \mathbb{N}$. Define a cross-moment of order n as follows:
$$\mu_X(n_1, n_2, \ldots, n_K) = E\left[X_1^{n_1} X_2^{n_2} \cdots X_K^{n_K}\right]$$
where $n_1, n_2, \ldots, n_K \in \mathbb{Z}_+$ and
$$n = \sum_{k=1}^{K} n_k$$
If all cross-moments of order n exist and are finite, then all the n-th order partial derivatives of $\varphi_X(t)$ exist and
$$\mu_X(n_1, n_2, \ldots, n_K) = \frac{1}{i^n} \left. \frac{\partial^{n_1 + n_2 + \ldots + n_K} \varphi_X(t_1, t_2, \ldots, t_K)}{\partial t_1^{n_1} \partial t_2^{n_2} \cdots \partial t_K^{n_K}} \right|_{t_1 = 0, t_2 = 0, \ldots, t_K = 0}$$
where the partial derivative on the right-hand side of the equation is evaluated at the point $t_1 = 0, t_2 = 0, \ldots, t_K = 0$.

Proof. See Ushakov3 (1999).

In practice, the proposition above is not very useful when one wants to compute a cross-moment of a random vector, because the proposition requires knowing in advance whether the cross-moment exists or not. A much more useful proposition is the following.

Proposition 232 Let X be a random vector and $\varphi_X(t)$ its joint characteristic function. If all the n-th order partial derivatives of $\varphi_X(t)$ exist, then:

1. if n is even, for any
$$m = \sum_{k=1}^{K} m_k \leq n$$
all m-th cross-moments of X exist and are finite;

2. if n is odd, for any
$$m = \sum_{k=1}^{K} m_k < n$$
all m-th cross-moments of X exist and are finite.

In both cases,
$$\mu_X(m_1, m_2, \ldots, m_K) = \frac{1}{i^m} \left. \frac{\partial^{m_1 + m_2 + \ldots + m_K} \varphi_X(t_1, t_2, \ldots, t_K)}{\partial t_1^{m_1} \partial t_2^{m_2} \cdots \partial t_K^{m_K}} \right|_{t_1 = 0, t_2 = 0, \ldots, t_K = 0}$$

Proof. See Ushakov (1999).


3 Ushakov, N. G. (1999) Selected topics in characteristic functions, VSP.

40.3 Joint distributions and joint cfs

The next proposition states the most important property of the joint characteristic function.

Proposition 233 (equality of distributions) Let X and Y be two K×1 random vectors. Denote by $F_X(x)$ and $F_Y(y)$ their joint distribution functions4 and by $\varphi_X(t)$ and $\varphi_Y(t)$ their joint characteristic functions. X and Y have the same distribution, i.e., $F_X(x) = F_Y(x)$ for any $x \in \mathbb{R}^K$, if and only if they have the same characteristic functions, i.e., $\varphi_X(t) = \varphi_Y(t)$ for any $t \in \mathbb{R}^K$.

Proof. See Ushakov (1999). An informal proof for the special case in which X and Y have a finite support can be provided along the same lines of the proof of Proposition 219, which concerns the joint moment generating function. This is left as an exercise (just replace $\exp(t^{\top} X)$ and $\exp(t^{\top} Y)$ in that proof with $\exp(it^{\top} X)$ and $\exp(it^{\top} Y)$).

This property is analogous to the property of joint moment generating functions stated in Proposition 219. The same comments we made about that proposition also apply to this one.

40.4 More details

40.4.1 Joint cf of a linear transformation

The next proposition gives a formula for the joint characteristic function of a linear transformation.

Proposition 234 Let X be a K×1 random vector with characteristic function $\varphi_X(t)$. Define
$$Y = A + BX$$
where A is an L×1 constant vector and B is an L×K constant matrix. Then, the characteristic function of Y is
$$\varphi_Y(t) = \exp\left(it^{\top} A\right) \varphi_X\left(B^{\top} t\right)$$

Proof. By using the definition of characteristic function, we obtain
$$\begin{aligned}
\varphi_Y(t) &= E\left[\exp\left(it^{\top} Y\right)\right] \\
&= E\left[\exp\left(it^{\top} A + it^{\top} BX\right)\right] \\
&= E\left[\exp\left(it^{\top} A\right) \exp\left(it^{\top} BX\right)\right] \\
&= \exp\left(it^{\top} A\right) E\left[\exp\left(it^{\top} BX\right)\right] \\
&= \exp\left(it^{\top} A\right) E\left[\exp\left(i \left(B^{\top} t\right)^{\top} X\right)\right] \\
&= \exp\left(it^{\top} A\right) \varphi_X\left(B^{\top} t\right)
\end{aligned}$$
4 See p. 118.

40.4.2 Joint cf of a random vector with independent entries

The next proposition shows how to derive the joint characteristic function of a vector whose components are independent random variables.

Proposition 235 Let X be a K×1 random vector. Let its entries $X_1, \ldots, X_K$ be K mutually independent random variables. Denote the characteristic function of the j-th entry of X by $\varphi_{X_j}(t_j)$. Then, the joint characteristic function of X is
$$\varphi_X(t_1, \ldots, t_K) = \prod_{j=1}^{K} \varphi_{X_j}(t_j)$$

Proof. This is proved as follows:
$$\begin{aligned}
\varphi_X(t) &= E\left[\exp\left(it^{\top} X\right)\right] \\
&= E\left[\exp\left(i \sum_{j=1}^{K} t_j X_j\right)\right] \\
&= E\left[\prod_{j=1}^{K} \exp(i t_j X_j)\right] \\
&\overset{A}{=} \prod_{j=1}^{K} E[\exp(i t_j X_j)] \\
&\overset{B}{=} \prod_{j=1}^{K} \varphi_{X_j}(t_j)
\end{aligned}$$
where: in step A we have used the fact that the entries of X are mutually independent5; in step B we have used the definition of characteristic function of a random variable6.

40.4.3 Joint cf of a sum

The next proposition shows how to derive the joint characteristic function of a sum of independent random vectors.

Proposition 236 Let $X_1, \ldots, X_n$ be n mutually independent random vectors. Let Z be their sum:
$$Z = \sum_{j=1}^{n} X_j$$
Then, the joint characteristic function of Z is the product of the joint characteristic functions of $X_1, \ldots, X_n$:
$$\varphi_Z(t) = \prod_{j=1}^{n} \varphi_{X_j}(t)$$
5 In particular, see the mutual independence via expectations property (p. 234).
6 See p. 307.

Proof. This is proved as follows:
$$\begin{aligned}
\varphi_Z(t) &= E\left[\exp\left(it^{\top} Z\right)\right] \\
&= E\left[\exp\left(it^{\top} \sum_{j=1}^{n} X_j\right)\right] \\
&= E\left[\exp\left(\sum_{j=1}^{n} it^{\top} X_j\right)\right] \\
&= E\left[\prod_{j=1}^{n} \exp\left(it^{\top} X_j\right)\right] \\
&\overset{A}{=} \prod_{j=1}^{n} E\left[\exp\left(it^{\top} X_j\right)\right] \\
&\overset{B}{=} \prod_{j=1}^{n} \varphi_{X_j}(t)
\end{aligned}$$
where: in step A we have used the fact that the vectors $X_j$ are mutually independent; in step B we have used the definition of joint characteristic function of a random vector given above.

40.5 Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1
Let $Z_1$ and $Z_2$ be two independent standard normal random variables7. Let X be a 2×1 random vector whose components are defined as follows:
$$X_1 = Z_1^2, \qquad X_2 = Z_1^2 + Z_2^2$$
Derive the joint characteristic function of X.

Hint: use the fact that $Z_1^2$ and $Z_2^2$ are two independent Chi-square random variables8 having characteristic function
$$\varphi_{Z_1^2}(t) = \varphi_{Z_2^2}(t) = (1 - 2it)^{-1/2}$$

7 See p. 376.
8 See p. 387.

Solution
By using the definition of characteristic function, we get

$$\begin{aligned}
\varphi_X(t) &= E\left[\exp\left(it^{\top} X\right)\right] \\
&= E[\exp(i t_1 X_1 + i t_2 X_2)] \\
&= E\left[\exp\left(i t_1 Z_1^2 + i t_2 \left(Z_1^2 + Z_2^2\right)\right)\right] \\
&= E\left[\exp\left(i (t_1 + t_2) Z_1^2 + i t_2 Z_2^2\right)\right] \\
&= E\left[\exp\left(i (t_1 + t_2) Z_1^2\right) \exp\left(i t_2 Z_2^2\right)\right] \\
&\overset{A}{=} E\left[\exp\left(i (t_1 + t_2) Z_1^2\right)\right] E\left[\exp\left(i t_2 Z_2^2\right)\right] \\
&\overset{B}{=} \varphi_{Z_1^2}(t_1 + t_2) \, \varphi_{Z_2^2}(t_2) \\
&= (1 - 2i t_1 - 2i t_2)^{-1/2} (1 - 2i t_2)^{-1/2} \\
&= \left[(1 - 2i t_1 - 2i t_2)(1 - 2i t_2)\right]^{-1/2} \\
&= \left[1 - 2i t_2 - 2i t_1 + (-2i t_1)(-2i t_2) - 2i t_2 + (-2i t_2)(-2i t_2)\right]^{-1/2} \\
&= \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-1/2}
\end{aligned}$$
where: in step A we have used the fact that $Z_1$ and $Z_2$ are independent; in step B we have used the definition of characteristic function.

Exercise 2
Use the joint characteristic function found in the previous exercise to derive the expected value and the covariance matrix of X.

Solution
We need to compute the partial derivatives of the joint characteristic function:
$$\begin{aligned}
\frac{\partial \varphi}{\partial t_1} &= -\frac{1}{2} \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-3/2} (-2i - 4 t_2) \\
\frac{\partial \varphi}{\partial t_2} &= -\frac{1}{2} \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-3/2} (-4i - 4 t_1 - 8 t_2) \\
\frac{\partial^2 \varphi}{\partial t_1^2} &= \frac{3}{4} \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-5/2} (-2i - 4 t_2)^2 \\
\frac{\partial^2 \varphi}{\partial t_2^2} &= \frac{3}{4} \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-5/2} (-4i - 4 t_1 - 8 t_2)^2 \\
&\quad + 4 \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-3/2} \\
\frac{\partial^2 \varphi}{\partial t_1 \partial t_2} &= \frac{3}{4} \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-5/2} (-2i - 4 t_2)(-4i - 4 t_1 - 8 t_2) \\
&\quad + 2 \left(1 - 2i t_1 - 4i t_2 - 4 t_1 t_2 - 4 t_2^2\right)^{-3/2}
\end{aligned}$$
All partial derivatives up to the second order exist and are well defined. As a consequence, all cross-moments up to the second order exist and are finite, and they can be computed from the above partial derivatives:
$$\begin{aligned}
E[X_1] &= \frac{1}{i} \left. \frac{\partial \varphi}{\partial t_1} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{i} \cdot i = 1 \\
E[X_2] &= \frac{1}{i} \left. \frac{\partial \varphi}{\partial t_2} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{i} \cdot 2i = 2 \\
E[X_1^2] &= \frac{1}{i^2} \left. \frac{\partial^2 \varphi}{\partial t_1^2} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{i^2} \cdot 3 i^2 = 3 \\
E[X_2^2] &= \frac{1}{i^2} \left. \frac{\partial^2 \varphi}{\partial t_2^2} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{i^2} \left(12 i^2 + 4\right) = 8 \\
E[X_1 X_2] &= \frac{1}{i^2} \left. \frac{\partial^2 \varphi}{\partial t_1 \partial t_2} \right|_{t_1 = 0, t_2 = 0} = \frac{1}{i^2} \left(6 i^2 + 2\right) = 4
\end{aligned}$$
The variances and the covariance are derived as follows:
$$\begin{aligned}
\mathrm{Var}[X_1] &= E[X_1^2] - E[X_1]^2 = 3 - 1 = 2 \\
\mathrm{Var}[X_2] &= E[X_2^2] - E[X_2]^2 = 8 - 4 = 4 \\
\mathrm{Cov}[X_1, X_2] &= E[X_1 X_2] - E[X_1] E[X_2] = 4 - 2 = 2
\end{aligned}$$
Summing up, we have
$$E[X] = \begin{bmatrix} E[X_1] \\ E[X_2] \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad \mathrm{Var}[X] = \begin{bmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] \\ \mathrm{Cov}[X_1, X_2] & \mathrm{Var}[X_2] \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}$$
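The partial derivatives above are tedious by hand but easy to check symbolically; the following SymPy sketch (an illustration added here, not part of the original text) recovers the same moments and covariance matrix entries:

# Minimal sketch: mean vector and covariance matrix of X from its joint characteristic function.
import sympy as sp

t1, t2 = sp.symbols("t1 t2", real=True)
phi = (1 - 2*sp.I*t1 - 4*sp.I*t2 - 4*t1*t2 - 4*t2**2) ** sp.Rational(-1, 2)

at0 = {t1: 0, t2: 0}
EX1 = sp.simplify(sp.diff(phi, t1).subs(at0) / sp.I)           # 1
EX2 = sp.simplify(sp.diff(phi, t2).subs(at0) / sp.I)           # 2
EX1sq = sp.simplify(sp.diff(phi, t1, 2).subs(at0) / sp.I**2)   # 3
EX2sq = sp.simplify(sp.diff(phi, t2, 2).subs(at0) / sp.I**2)   # 8
EX1X2 = sp.simplify(sp.diff(phi, t1, t2).subs(at0) / sp.I**2)  # 4

print(EX1, EX2)                                         # expected value: 1, 2
print(EX1sq - EX1**2, EX1X2 - EX1*EX2, EX2sq - EX2**2)  # covariance entries: 2, 2, 4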

Exercise 3
Read and try to understand how the joint characteristic function of the multinomial
distribution is derived in the lecture entitled Multinomial distribution (p. 431).
