4 THE MULTIVARIATE NORMAL DISTRIBUTION
4.1 Introduction
A generalization of the familiar bell-shaped normal density to several dimensions plays
a fundamental role in multivariate analysis. In fact, most of the techniques encountered
in this book are based on the assumption that the data were generated from a multi-
variate normal distribution. While real data are never exactly multivariate normal, the
normal density is often a useful approximation to the “true” population distribution.
One advantage of the multivariate normal distribution stems from the fact that
it is mathematically tractable and “nice” results can be obtained. This is frequently
not the case for other data-generating distributions. Of course, mathematical attrac-
tiveness per se is of little use to the practitioner. It turns out, however, that normal
distributions are useful in practice for two reasons: First, the normal distribution
serves as a bona fide population model in some instances; second, the sampling
distributions of many multivariate statistics are approximately normal, regardless of
the form of the parent population, because of a central limit effect.
To summarize, many real-world problems fall naturally within the framework of
normal theory. The importance of the normal distribution rests on its dual role as
both population model for certain natural phenomena and approximate sampling
distribution for many statistics.
From Chapter 4 of Applied Multivariate Statistical Analysis, Sixth Edition. Richard A. Johnson,
Dean W. Wichern. Copyright © 2007 by Pearson Education, Inc. All rights reserved.
4.2 The Multivariate Normal Density and Its Properties

The univariate normal density with mean $\mu$ and variance $\sigma^2$ is

\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-[(x-\mu)/\sigma]^2/2}, \qquad -\infty < x < \infty \tag{4-1}
\]

A plot of this function yields the familiar bell-shaped curve shown in Figure 4.1. Also shown in the figure are approximate areas under the curve within $\pm 1$ and $\pm 2$ standard deviations of the mean. These areas represent probabilities, and thus, for the normal random variable $X$,

\[
P(\mu - \sigma \le X \le \mu + \sigma) \approx .68
\]
\[
P(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx .95
\]
It is convenient to denote the normal density function with mean $\mu$ and variance $\sigma^2$ by $N(\mu, \sigma^2)$. Therefore, $N(10, 4)$ refers to the function in (4-1) with $\mu = 10$ and $\sigma = 2$. This notation will be extended to the multivariate case later.
The term

\[
\left(\frac{x-\mu}{\sigma}\right)^2 = (x-\mu)\,(\sigma^2)^{-1}\,(x-\mu) \tag{4-2}
\]

in the exponent of the univariate normal density function measures the square of the distance from $x$ to $\mu$ in standard deviation units. This can be generalized for a $p \times 1$ vector $\mathbf{x}$ of observations on several variables as

\[
(\mathbf{x}-\boldsymbol{\mu})'\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{x}-\boldsymbol{\mu}) \tag{4-3}
\]

The $p \times 1$ vector $\boldsymbol{\mu}$ represents the expected value of the random vector $\mathbf{X}$, and the $p \times p$ matrix $\boldsymbol{\Sigma}$ is the variance–covariance matrix of $\mathbf{X}$. [See (2-30) and (2-31).] We shall assume that the symmetric matrix $\boldsymbol{\Sigma}$ is positive definite, so the expression in (4-3) is the square of the generalized distance from $\mathbf{x}$ to $\boldsymbol{\mu}$.
The multivariate normal density is obtained by replacing the univariate distance in (4-2) by the multivariate generalized distance of (4-3) in the density function of (4-1). When this replacement is made, the univariate normalizing constant $(2\pi)^{-1/2}(\sigma^2)^{-1/2}$ must be changed to a more general constant that makes the volume under the surface of the multivariate density function unity for any $p$. This is necessary because, in the multivariate case, probabilities are represented by volumes under the surface over regions defined by intervals of the $x_i$ values. It can be shown (see [1]) that this constant is $(2\pi)^{-p/2}|\boldsymbol{\Sigma}|^{-1/2}$, and consequently, a $p$-dimensional normal density for the random vector $\mathbf{X}' = [X_1, X_2, \ldots, X_p]$ has the form

\[
f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2}\,|\boldsymbol{\Sigma}|^{1/2}}\, e^{-(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})/2} \tag{4-4}
\]

where $-\infty < x_i < \infty$, $i = 1, 2, \ldots, p$. We shall denote this $p$-dimensional normal density by $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$.
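To make (4-4) concrete, here is a minimal numerical sketch in Python that evaluates the density directly from the formula and checks it against `scipy.stats.multivariate_normal`. The particular $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ values are arbitrary illustrative choices, not from the text.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_density(x, mu, Sigma):
    """Evaluate the p-dimensional normal density (4-4) at x."""
    p = len(mu)
    diff = x - mu
    # Squared generalized distance (4-3): (x - mu)' Sigma^{-1} (x - mu)
    dist2 = diff @ np.linalg.solve(Sigma, diff)
    norm_const = (2 * np.pi) ** (-p / 2) * np.linalg.det(Sigma) ** (-0.5)
    return norm_const * np.exp(-dist2 / 2)

# Illustrative parameters (not from the text)
mu = np.array([10.0, 5.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
x = np.array([11.0, 4.0])

print(mvn_density(x, mu, Sigma))              # direct use of (4-4)
print(multivariate_normal(mu, Sigma).pdf(x))  # scipy's implementation agrees
```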
Example 4.1 (Bivariate normal density) Let us evaluate the $p = 2$-variate normal density in terms of the individual parameters $\mu_1 = E(X_1)$, $\mu_2 = E(X_2)$, $\sigma_{11} = \operatorname{Var}(X_1)$, $\sigma_{22} = \operatorname{Var}(X_2)$, and $\rho_{12} = \sigma_{12}/(\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}) = \operatorname{Corr}(X_1, X_2)$.

Using Result 2A.8, we find that the inverse of the covariance matrix

\[
\boldsymbol{\Sigma} = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}
\]

is

\[
\boldsymbol{\Sigma}^{-1} = \frac{1}{\sigma_{11}\sigma_{22} - \sigma_{12}^2} \begin{bmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{12} & \sigma_{11} \end{bmatrix}
\]

Introducing the correlation coefficient $\rho_{12}$ by writing $\sigma_{12} = \rho_{12}\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}$, we obtain $\sigma_{11}\sigma_{22} - \sigma_{12}^2 = \sigma_{11}\sigma_{22}(1 - \rho_{12}^2)$, and the squared distance becomes
\[
\begin{aligned}
(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})
&= [x_1 - \mu_1,\; x_2 - \mu_2]\, \frac{1}{\sigma_{11}\sigma_{22}(1-\rho_{12}^2)} \begin{bmatrix} \sigma_{22} & -\rho_{12}\sqrt{\sigma_{11}}\sqrt{\sigma_{22}} \\ -\rho_{12}\sqrt{\sigma_{11}}\sqrt{\sigma_{22}} & \sigma_{11} \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix} \\[4pt]
&= \frac{\sigma_{22}(x_1-\mu_1)^2 + \sigma_{11}(x_2-\mu_2)^2 - 2\rho_{12}\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}\,(x_1-\mu_1)(x_2-\mu_2)}{\sigma_{11}\sigma_{22}(1-\rho_{12}^2)} \\[4pt]
&= \frac{1}{1-\rho_{12}^2}\left[\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2 + \left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2 - 2\rho_{12}\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)\right]
\end{aligned} \tag{4-5}
\]

The last expression is written in terms of the standardized values $(x_1 - \mu_1)/\sqrt{\sigma_{11}}$ and $(x_2 - \mu_2)/\sqrt{\sigma_{22}}$.
Next, since $|\boldsymbol{\Sigma}| = \sigma_{11}\sigma_{22} - \sigma_{12}^2 = \sigma_{11}\sigma_{22}(1-\rho_{12}^2)$, we can substitute for $\boldsymbol{\Sigma}^{-1}$ and $|\boldsymbol{\Sigma}|$ in (4-4) to get the expression for the bivariate ($p = 2$) normal density involving the individual parameters $\mu_1$, $\mu_2$, $\sigma_{11}$, $\sigma_{22}$, and $\rho_{12}$:

\[
\begin{aligned}
f(x_1, x_2) = \frac{1}{2\pi\sqrt{\sigma_{11}\sigma_{22}(1-\rho_{12}^2)}} \exp\Biggl\{ -\frac{1}{2(1-\rho_{12}^2)} \Biggl[ &\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)^2 + \left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right)^2 \\
&- 2\rho_{12}\left(\frac{x_1-\mu_1}{\sqrt{\sigma_{11}}}\right)\left(\frac{x_2-\mu_2}{\sqrt{\sigma_{22}}}\right) \Biggr] \Biggr\} \tag{4-6}
\end{aligned}
\]
The expression in (4-6) is somewhat unwieldy, and the compact general form in (4-4) is more informative in many ways. On the other hand, the expression in (4-6) is useful for discussing certain properties of the normal distribution. For example, if the random variables $X_1$ and $X_2$ are uncorrelated, so that $\rho_{12} = 0$, the joint density can be written as the product of two univariate normal densities, each of the form of (4-1). That is, $f(x_1, x_2) = f(x_1)f(x_2)$, and $X_1$ and $X_2$ are independent. [See (2-28).] This result is true in general. (See Result 4.5.)
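As a quick numerical illustration of this factorization, the following sketch (with arbitrarily chosen parameter values, not from the text) evaluates the joint density with $\rho_{12} = 0$ and compares it with the product of the two univariate normal densities.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Illustrative parameters with rho_12 = 0 (values are arbitrary)
mu1, mu2 = 1.0, -2.0
s11, s22 = 4.0, 9.0          # variances sigma_11 and sigma_22
Sigma = np.diag([s11, s22])  # zero covariance => rho_12 = 0

x1, x2 = 2.5, -1.0
joint = multivariate_normal([mu1, mu2], Sigma).pdf([x1, x2])
product = norm(mu1, np.sqrt(s11)).pdf(x1) * norm(mu2, np.sqrt(s22)).pdf(x2)

print(joint, product)   # the two values agree: f(x1, x2) = f(x1) f(x2)
```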
Two bivariate distributions with $\sigma_{11} = \sigma_{22}$ are shown in Figure 4.2. In Figure 4.2(a), $X_1$ and $X_2$ are independent ($\rho_{12} = 0$). In Figure 4.2(b), $\rho_{12} = .75$. Notice how the presence of correlation causes the probability to concentrate along a line. ■
[Figure 4.2 Two bivariate normal densities $f(x_1, x_2)$: (a) $\rho_{12} = 0$; (b) $\rho_{12} = .75$.]
From the expression in (4-4) for the density of a $p$-dimensional normal variable, it should be clear that the paths of $\mathbf{x}$ values yielding a constant height for the density are ellipsoids. That is, the multivariate normal density is constant on surfaces where the squared distance $(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})$ is constant. These paths are called contours:

Constant probability density contour $= \{\text{all } \mathbf{x} \text{ such that } (\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}) = c^2\}$ $=$ surface of an ellipsoid centered at $\boldsymbol{\mu}$

The axes of each ellipsoid of constant density are in the direction of the eigenvectors of $\boldsymbol{\Sigma}^{-1}$, and their lengths are proportional to the reciprocals of the square roots of the eigenvalues of $\boldsymbol{\Sigma}^{-1}$. Fortunately, we can avoid the calculation of $\boldsymbol{\Sigma}^{-1}$ when determining the axes, since these ellipsoids are also determined by the eigenvalues and eigenvectors of $\boldsymbol{\Sigma}$. We state the correspondence formally for later reference.
If $\boldsymbol{\Sigma}$ is positive definite with eigenvalue–eigenvector pairs $(\lambda_i, \mathbf{e}_i)$, then for any $\mathbf{x}$,

\[
\mathbf{x}'\boldsymbol{\Sigma}^{-1}\mathbf{x} = \mathbf{x}'\left(\sum_{i=1}^{p} \frac{1}{\lambda_i}\, \mathbf{e}_i\mathbf{e}_i'\right)\mathbf{x} = \sum_{i=1}^{p} \frac{1}{\lambda_i}\,(\mathbf{x}'\mathbf{e}_i)^2 \ge 0
\]

since each term $\lambda_i^{-1}(\mathbf{x}'\mathbf{e}_i)^2$ is nonnegative. In addition, $\mathbf{x}'\mathbf{e}_i = 0$ for all $i$ only if $\mathbf{x} = \mathbf{0}$. So $\mathbf{x} \ne \mathbf{0}$ implies that $\sum_{i=1}^{p} (1/\lambda_i)(\mathbf{x}'\mathbf{e}_i)^2 > 0$, and it follows that $\boldsymbol{\Sigma}^{-1}$ is positive definite.

These ellipsoids are centered at $\boldsymbol{\mu}$ and have axes

\[
\pm c\sqrt{\lambda_i}\,\mathbf{e}_i, \qquad \text{where } \boldsymbol{\Sigma}\mathbf{e}_i = \lambda_i\mathbf{e}_i, \quad i = 1, 2, \ldots, p \tag{4-7}
\]
Example 4.2 (Contours of the bivariate normal density) We shall obtain the axes of constant probability density contours for a bivariate normal distribution when $\sigma_{11} = \sigma_{22}$. From (4-7), these axes are given by the eigenvalues and eigenvectors of $\boldsymbol{\Sigma}$. Here $|\boldsymbol{\Sigma} - \lambda\mathbf{I}| = 0$ becomes

\[
0 = \begin{vmatrix} \sigma_{11} - \lambda & \sigma_{12} \\ \sigma_{12} & \sigma_{11} - \lambda \end{vmatrix} = (\sigma_{11} - \lambda)^2 - \sigma_{12}^2 = (\lambda - \sigma_{11} - \sigma_{12})(\lambda - \sigma_{11} + \sigma_{12})
\]

Consequently, the eigenvalues are $\lambda_1 = \sigma_{11} + \sigma_{12}$ and $\lambda_2 = \sigma_{11} - \sigma_{12}$. The eigenvector $\mathbf{e}_1$ is determined from

\[
\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{11} \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = (\sigma_{11} + \sigma_{12}) \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
\]

or

\[
\begin{aligned}
\sigma_{11}e_1 + \sigma_{12}e_2 &= (\sigma_{11} + \sigma_{12})e_1 \\
\sigma_{12}e_1 + \sigma_{11}e_2 &= (\sigma_{11} + \sigma_{12})e_2
\end{aligned}
\]

These equations imply that $e_1 = e_2$, and after normalization, the first eigenvalue–eigenvector pair is

\[
\lambda_1 = \sigma_{11} + \sigma_{12}, \qquad \mathbf{e}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}
\]

Similarly, $\lambda_2 = \sigma_{11} - \sigma_{12}$ yields the eigenvector $\mathbf{e}_2' = [1/\sqrt{2},\; -1/\sqrt{2}]$.
[Figure 4.3 A constant-density contour for a bivariate normal distribution with $\sigma_{11} = \sigma_{22}$ and $\sigma_{12} > 0$ (or $\rho_{12} > 0$), centered at $(\mu_1, \mu_2)$ with axis half-lengths $c\sqrt{\sigma_{11} + \sigma_{12}}$ and $c\sqrt{\sigma_{11} - \sigma_{12}}$.]
The axes of the constant-density ellipses are therefore

\[
\pm c\sqrt{\sigma_{11} + \sigma_{12}} \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \qquad \text{and} \qquad \pm c\sqrt{\sigma_{11} - \sigma_{12}} \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix} \qquad \blacksquare
\]
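A short numerical check of this example, under arbitrary illustrative values of $\sigma_{11}$ and $\sigma_{12}$ (not from the text): `numpy.linalg.eigh` should return the eigenvalues $\sigma_{11} \pm \sigma_{12}$ with eigenvectors proportional to $(1, 1)$ and $(1, -1)$.

```python
import numpy as np

s11, s12 = 3.0, 1.0   # illustrative values with sigma_11 = sigma_22
Sigma = np.array([[s11, s12],
                  [s12, s11]])

# eigh returns eigenvalues in ascending order for symmetric matrices
eigenvalues, eigenvectors = np.linalg.eigh(Sigma)
print(eigenvalues)          # [s11 - s12, s11 + s12] = [2.0, 4.0]
print(eigenvectors)         # columns proportional to (1, -1) and (1, 1)

# Half-lengths of the contour axes for a given c, as in (4-7)
c = 1.5
print(c * np.sqrt(eigenvalues))
```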
We show in Result 4.7 that the choice $c^2 = \chi_p^2(\alpha)$, where $\chi_p^2(\alpha)$ is the upper $(100\alpha)$th percentile of a chi-square distribution with $p$ degrees of freedom, leads to contours that contain $(1 - \alpha) \times 100\%$ of the probability. Specifically, the following is true for a $p$-dimensional normal distribution: the solid ellipsoid of $\mathbf{x}$ values satisfying

\[
(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}) \le \chi_p^2(\alpha)
\]

has probability $1 - \alpha$.
The constant-density contours containing 50% and 90% of the probability under
the bivariate normal surfaces in Figure 4.2 are pictured in Figure 4.4.
[Figure 4.4 The 50% and 90% contours for the bivariate normal distributions in Figure 4.2.]
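The probability claim can be illustrated numerically. The sketch below (a Monte Carlo check under arbitrary illustrative parameters, not from the text) draws from a bivariate normal and computes the fraction of draws whose squared generalized distance falls inside the $\chi_2^2(\alpha)$ contour; it should be close to $1 - \alpha$.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Illustrative bivariate normal (parameters are arbitrary)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 1.5],
                  [1.5, 3.0]])

n, alpha = 100_000, 0.10
X = rng.multivariate_normal(mu, Sigma, size=n)

diff = X - mu
# Squared generalized distances (x - mu)' Sigma^{-1} (x - mu) for all draws
d2 = np.einsum('ij,ij->i', diff @ np.linalg.inv(Sigma), diff)

c2 = chi2.ppf(1 - alpha, df=2)   # upper (100*alpha)th percentile
print(np.mean(d2 <= c2))         # close to 1 - alpha = 0.90
```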
The $p$-variate normal density in (4-4) has a maximum value when the squared distance in (4-3) is zero, that is, when $\mathbf{x} = \boldsymbol{\mu}$. Thus, $\boldsymbol{\mu}$ is the point of maximum density, or mode, as well as the expected value of $\mathbf{X}$, or mean. The fact that $\boldsymbol{\mu}$ is the mean of the multivariate normal distribution follows from the symmetry exhibited by the constant-density contours: these contours are centered, or balanced, at $\boldsymbol{\mu}$.
Result 4.2. If $\mathbf{X}$ is distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then any linear combination of variables $\mathbf{a}'\mathbf{X} = a_1X_1 + a_2X_2 + \cdots + a_pX_p$ is distributed as $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a})$. Also, if $\mathbf{a}'\mathbf{X}$ is distributed as $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a})$ for every $\mathbf{a}$, then $\mathbf{X}$ must be $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$.

Proof. The expected value and variance of $\mathbf{a}'\mathbf{X}$ follow from (2-43). Proving that $\mathbf{a}'\mathbf{X}$ is normally distributed if $\mathbf{X}$ is multivariate normal is more difficult. You can find a proof in [1]. The second part of Result 4.2 is also demonstrated in [1]. ■
To illustrate, consider the marginal distribution of $X_1$ when $\mathbf{X}$ is distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. Setting $\mathbf{a}' = [1, 0, \ldots, 0]$, we have

\[
\mathbf{a}'\mathbf{X} = [1, 0, \ldots, 0] \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = X_1
\]
and

\[
\mathbf{a}'\boldsymbol{\mu} = [1, 0, \ldots, 0] \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} = \mu_1
\]

Likewise, we have

\[
\mathbf{a}'\boldsymbol{\Sigma}\mathbf{a} = [1, 0, \ldots, 0] \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{12} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1p} & \sigma_{2p} & \cdots & \sigma_{pp} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \sigma_{11}
\]

and it follows from Result 4.2 that $X_1$ is distributed as $N(\mu_1, \sigma_{11})$. More generally, the marginal distribution of any component $X_i$ of $\mathbf{X}$ is $N(\mu_i, \sigma_{ii})$. ■
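As an empirical illustration of Result 4.2 (a sketch using, for variety, the $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ that appear later in Example 4.8, with an arbitrary coefficient vector $\mathbf{a}$), the sample mean and variance of $\mathbf{a}'\mathbf{X}$ computed from simulated draws should match $\mathbf{a}'\boldsymbol{\mu}$ and $\mathbf{a}'\boldsymbol{\Sigma}\mathbf{a}$.

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([3.0, -1.0, 1.0])
Sigma = np.array([[3.0, -1.0, 1.0],
                  [-1.0, 1.0, 0.0],
                  [1.0, 0.0, 2.0]])
a = np.array([0.5, 2.0, -1.0])     # arbitrary coefficient vector

X = rng.multivariate_normal(mu, Sigma, size=200_000)
y = X @ a                          # the linear combination a'X

print(y.mean(), a @ mu)            # sample mean vs a'mu
print(y.var(), a @ Sigma @ a)      # sample variance vs a'Sigma a
```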
Result 4.3. If $\mathbf{X}$ is distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, the $q$ linear combinations

\[
\mathbf{A}\mathbf{X} = \begin{bmatrix} a_{11}X_1 + \cdots + a_{1p}X_p \\ a_{21}X_1 + \cdots + a_{2p}X_p \\ \vdots \\ a_{q1}X_1 + \cdots + a_{qp}X_p \end{bmatrix}
\]

where $\mathbf{A}$ is $q \times p$, are distributed as $N_q(\mathbf{A}\boldsymbol{\mu}, \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}')$. Also, $\mathbf{X} + \mathbf{d}$, where $\mathbf{d}$ is a $p \times 1$ vector of constants, is distributed as $N_p(\boldsymbol{\mu} + \mathbf{d}, \boldsymbol{\Sigma})$.
Proof. The expected value $E(\mathbf{A}\mathbf{X})$ and the covariance matrix of $\mathbf{A}\mathbf{X}$ follow from (2-45). Any linear combination $\mathbf{b}'(\mathbf{A}\mathbf{X})$ is a linear combination of $\mathbf{X}$, of the form $\mathbf{a}'\mathbf{X}$ with $\mathbf{a} = \mathbf{A}'\mathbf{b}$. Thus, the conclusion concerning $\mathbf{A}\mathbf{X}$ follows directly from Result 4.2.

The second part of the result can be obtained by considering $\mathbf{a}'(\mathbf{X} + \mathbf{d}) = \mathbf{a}'\mathbf{X} + \mathbf{a}'\mathbf{d}$, where $\mathbf{a}'\mathbf{X}$ is distributed as $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a})$. It is known from the univariate case that adding a constant $\mathbf{a}'\mathbf{d}$ to the random variable $\mathbf{a}'\mathbf{X}$ leaves the variance unchanged and translates the mean to $\mathbf{a}'\boldsymbol{\mu} + \mathbf{a}'\mathbf{d} = \mathbf{a}'(\boldsymbol{\mu} + \mathbf{d})$. Since $\mathbf{a}$ was arbitrary, $\mathbf{X} + \mathbf{d}$ is distributed as $N_p(\boldsymbol{\mu} + \mathbf{d}, \boldsymbol{\Sigma})$. ■
Example 4.4 (The distribution of two linear combinations of the components of a normal random vector) For $\mathbf{X}$ distributed as $N_3(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, find the distribution of

\[
\begin{bmatrix} X_1 - X_2 \\ X_2 - X_3 \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = \mathbf{A}\mathbf{X}
\]

By Result 4.3, the distribution of $\mathbf{A}\mathbf{X}$ is multivariate normal with mean
\[
\mathbf{A}\boldsymbol{\mu} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix} = \begin{bmatrix} \mu_1 - \mu_2 \\ \mu_2 - \mu_3 \end{bmatrix}
\]

and covariance matrix

\[
\begin{aligned}
\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}' &= \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{12} & \sigma_{22} & \sigma_{23} \\ \sigma_{13} & \sigma_{23} & \sigma_{33} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \\ 0 & -1 \end{bmatrix} \\[4pt]
&= \begin{bmatrix} \sigma_{11} - \sigma_{12} & \sigma_{12} - \sigma_{22} & \sigma_{13} - \sigma_{23} \\ \sigma_{12} - \sigma_{13} & \sigma_{22} - \sigma_{23} & \sigma_{23} - \sigma_{33} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \\ 0 & -1 \end{bmatrix} \\[4pt]
&= \begin{bmatrix} \sigma_{11} - 2\sigma_{12} + \sigma_{22} & \sigma_{12} + \sigma_{23} - \sigma_{22} - \sigma_{13} \\ \sigma_{12} + \sigma_{23} - \sigma_{22} - \sigma_{13} & \sigma_{22} - 2\sigma_{23} + \sigma_{33} \end{bmatrix}
\end{aligned}
\]
Alternatively, the mean vector $\mathbf{A}\boldsymbol{\mu}$ and covariance matrix $\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}'$ may be verified by direct calculation of the means and covariances of the two random variables $Y_1 = X_1 - X_2$ and $Y_2 = X_2 - X_3$. ■
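The "direct calculation" can also be delegated to a few lines of code. This sketch (with an arbitrary illustrative $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, not from the text) compares $\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}'$ with the empirical covariance of $Y_1 = X_1 - X_2$ and $Y_2 = X_2 - X_3$.

```python
import numpy as np

rng = np.random.default_rng(2)

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])

# Illustrative mu and Sigma (arbitrary positive definite choice)
mu = np.array([2.0, 0.0, -1.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T                        # rows are (X1 - X2, X2 - X3)

print(A @ mu)                      # theoretical mean of AX
print(Y.mean(axis=0))              # empirical mean
print(A @ Sigma @ A.T)             # theoretical covariance A Sigma A'
print(np.cov(Y, rowvar=False))     # empirical covariance
```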
Result 4.4. All subsets of $\mathbf{X}$ are normally distributed. If we respectively partition $\mathbf{X}$, its mean vector $\boldsymbol{\mu}$, and its covariance matrix $\boldsymbol{\Sigma}$ as

\[
\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}, \qquad \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}
\]

with $\mathbf{X}_1$ and $\boldsymbol{\mu}_1$ of dimension $q \times 1$ and $\mathbf{X}_2$ and $\boldsymbol{\mu}_2$ of dimension $(p - q) \times 1$, and

\[
\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}
\]

where $\boldsymbol{\Sigma}_{11}$ is $q \times q$ and $\boldsymbol{\Sigma}_{22}$ is $(p - q) \times (p - q)$, then $\mathbf{X}_1$ is distributed as $N_q(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$.
Example 4.5 (The distribution of a subset of a normal random vector) If $\mathbf{X}$ is distributed as $N_5(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, find the distribution of $[X_2, X_4]'$. We set

\[
\mathbf{X}_1 = \begin{bmatrix} X_2 \\ X_4 \end{bmatrix}, \qquad \boldsymbol{\mu}_1 = \begin{bmatrix} \mu_2 \\ \mu_4 \end{bmatrix}, \qquad \boldsymbol{\Sigma}_{11} = \begin{bmatrix} \sigma_{22} & \sigma_{24} \\ \sigma_{24} & \sigma_{44} \end{bmatrix}
\]

and note that with this assignment, $\mathbf{X}$, $\boldsymbol{\mu}$, and $\boldsymbol{\Sigma}$ can respectively be rearranged and partitioned as

\[
\mathbf{X} = \begin{bmatrix} X_2 \\ X_4 \\ \hline X_1 \\ X_3 \\ X_5 \end{bmatrix}, \qquad \boldsymbol{\mu} = \begin{bmatrix} \mu_2 \\ \mu_4 \\ \hline \mu_1 \\ \mu_3 \\ \mu_5 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \left[\begin{array}{cc|ccc} \sigma_{22} & \sigma_{24} & \sigma_{12} & \sigma_{23} & \sigma_{25} \\ \sigma_{24} & \sigma_{44} & \sigma_{14} & \sigma_{34} & \sigma_{45} \\ \hline \sigma_{12} & \sigma_{14} & \sigma_{11} & \sigma_{13} & \sigma_{15} \\ \sigma_{23} & \sigma_{34} & \sigma_{13} & \sigma_{33} & \sigma_{35} \\ \sigma_{25} & \sigma_{45} & \sigma_{15} & \sigma_{35} & \sigma_{55} \end{array}\right]
\]

or, in partitioned form,

\[
\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}, \qquad \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}
\]

with $\boldsymbol{\Sigma}_{11}$ of dimension $2 \times 2$ and $\boldsymbol{\Sigma}_{22}$ of dimension $3 \times 3$. Thus, from Result 4.4, the distribution of $[X_2, X_4]'$ is

\[
N_2(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11}) = N_2\left( \begin{bmatrix} \mu_2 \\ \mu_4 \end{bmatrix}, \begin{bmatrix} \sigma_{22} & \sigma_{24} \\ \sigma_{24} & \sigma_{44} \end{bmatrix} \right)
\]
It is clear from this example that the normal distribution for any subset can be expressed by simply selecting the appropriate means and covariances from the original $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. The formal process of relabeling and partitioning is unnecessary. ■
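In code, "selecting the appropriate means and covariances" is just index slicing. A minimal sketch (the five-variable $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ below are arbitrary illustrative values, not from the text):

```python
import numpy as np

# Illustrative mu and Sigma for a 5-variate normal (arbitrary values)
mu = np.array([1.0, 2.0, 0.0, -1.0, 3.0])
Sigma = np.array([[2.0, 0.3, 0.1, 0.4, 0.0],
                  [0.3, 1.5, 0.2, 0.6, 0.1],
                  [0.1, 0.2, 1.0, 0.1, 0.2],
                  [0.4, 0.6, 0.1, 2.5, 0.3],
                  [0.0, 0.1, 0.2, 0.3, 1.2]])

idx = [1, 3]                       # zero-based positions of X2 and X4
mu_1 = mu[idx]                     # [mu_2, mu_4]
Sigma_11 = Sigma[np.ix_(idx, idx)] # [[s22, s24], [s24, s44]]

print(mu_1)
print(Sigma_11)   # the marginal of (X2, X4) is N_2(mu_1, Sigma_11)
```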
We are now in a position to state that zero correlation between normal random variables or sets of normal random variables is equivalent to statistical independence.

Result 4.5.

(a) If $\mathbf{X}_1$ ($q_1 \times 1$) and $\mathbf{X}_2$ ($q_2 \times 1$) are independent, then $\operatorname{Cov}(\mathbf{X}_1, \mathbf{X}_2) = \mathbf{0}$, a $q_1 \times q_2$ matrix of zeros.

(b) If $\begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$ is $N_{q_1 + q_2}\left( \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix} \right)$, then $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent if and only if $\boldsymbol{\Sigma}_{12} = \mathbf{0}$.
(c) If $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent and are distributed as $N_{q_1}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$ and $N_{q_2}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_{22})$, respectively, then $[\mathbf{X}_1', \mathbf{X}_2']'$ has the multivariate normal distribution

\[
N_{q_1 + q_2}\left( \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \mathbf{0} \\ \mathbf{0}' & \boldsymbol{\Sigma}_{22} \end{bmatrix} \right)
\]
Proof. (See Exercise 4.14 for partial proofs based upon factoring the density function when $\boldsymbol{\Sigma}_{12} = \mathbf{0}$.)
Example 4.6 (The equivalence of zero covariance and independence for normal variables) Let $\mathbf{X}$ ($3 \times 1$) be $N_3(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with

\[
\boldsymbol{\Sigma} = \begin{bmatrix} 4 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\]

Since $X_1$ and $X_2$ have covariance $\sigma_{12} = 1$, they are not independent. However, partitioning $\mathbf{X}$ and $\boldsymbol{\Sigma}$ as

\[
\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \hline X_3 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \left[\begin{array}{cc|c} 4 & 1 & 0 \\ 1 & 3 & 0 \\ \hline 0 & 0 & 2 \end{array}\right] = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}
\]

with $\boldsymbol{\Sigma}_{11}$ of dimension $2 \times 2$, we see that $\mathbf{X}_1 = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ and $X_3$ have covariance matrix $\boldsymbol{\Sigma}_{12} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Therefore, $(X_1, X_2)$ and $X_3$ are independent by Result 4.5. This implies $X_3$ is independent of $X_1$ and also of $X_2$. ■
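A quick empirical illustration of the block structure in Example 4.6: sampled data should show nonzero correlation between $X_1$ and $X_2$ but near-zero correlation between $X_3$ and each of $X_1$, $X_2$. (The mean vector below is an arbitrary choice, since the example leaves $\boldsymbol{\mu}$ unspecified.)

```python
import numpy as np

rng = np.random.default_rng(3)

mu = np.zeros(3)                 # arbitrary: the example does not specify mu
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 0.0],
                  [0.0, 0.0, 2.0]])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
R = np.corrcoef(X, rowvar=False)

print(R[0, 1])             # clearly nonzero: X1 and X2 are dependent
print(R[0, 2], R[1, 2])    # both near zero: X3 is independent of (X1, X2)
```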
Result 4.6. Let $\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$ be distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with $\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}$, $\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}$, and $|\boldsymbol{\Sigma}_{22}| > 0$. Then the conditional distribution of $\mathbf{X}_1$, given that $\mathbf{X}_2 = \mathbf{x}_2$, is normal and has

\[
\text{Mean} = \boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2)
\]
and

\[
\text{Covariance} = \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}
\]

Note that the covariance does not depend on the value $\mathbf{x}_2$ of the conditioning variable.
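Result 4.6 translates directly into code. Here is a minimal sketch of a function that returns the conditional mean and covariance from a partitioned $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$; the example values at the bottom are arbitrary illustrations, not from the text.

```python
import numpy as np

def conditional_mvn(mu1, mu2, S11, S12, S22, x2):
    """Conditional distribution of X1 given X2 = x2 (Result 4.6).

    Returns the conditional mean mu1 + S12 S22^{-1} (x2 - mu2)
    and conditional covariance S11 - S12 S22^{-1} S21.
    """
    W = S12 @ np.linalg.inv(S22)     # Sigma_12 Sigma_22^{-1}
    cond_mean = mu1 + W @ (x2 - mu2)
    cond_cov = S11 - W @ S12.T       # S21 = S12'
    return cond_mean, cond_cov

# Illustrative partitioned parameters (arbitrary values)
mu1, mu2 = np.array([0.0]), np.array([1.0])
S11 = np.array([[2.0]])
S12 = np.array([[0.8]])
S22 = np.array([[1.5]])

print(conditional_mvn(mu1, mu2, S11, S12, S22, x2=np.array([2.0])))
```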
Proof. We shall give an indirect proof. (See Exercise 4.13, which uses the densities directly.) Take

\[
\mathbf{A} = \begin{bmatrix} \mathbf{I} & -\boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}
\]

where the identity blocks are $q \times q$ and $(p - q) \times (p - q)$, so

\[
\mathbf{A}(\mathbf{X} - \boldsymbol{\mu}) = \mathbf{A}\begin{bmatrix} \mathbf{X}_1 - \boldsymbol{\mu}_1 \\ \mathbf{X}_2 - \boldsymbol{\mu}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{X}_1 - \boldsymbol{\mu}_1 - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{X}_2 - \boldsymbol{\mu}_2) \\ \mathbf{X}_2 - \boldsymbol{\mu}_2 \end{bmatrix}
\]

is multivariate normal by Result 4.3, with covariance matrix

\[
\begin{bmatrix} \mathbf{I} & -\boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1} \\ \mathbf{0} & \mathbf{I} \end{bmatrix} \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix} \begin{bmatrix} \mathbf{I} & \mathbf{0}' \\ (-\boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1})' & \mathbf{I} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21} & \mathbf{0}' \\ \mathbf{0} & \boldsymbol{\Sigma}_{22} \end{bmatrix}
\]

Since the off-diagonal blocks are zero, $\mathbf{X}_1 - \boldsymbol{\mu}_1 - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{X}_2 - \boldsymbol{\mu}_2)$ and $\mathbf{X}_2 - \boldsymbol{\mu}_2$ are independent by Result 4.5. Given $\mathbf{X}_2 = \mathbf{x}_2$, the first block remains normal with covariance $\boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}$, and the stated conditional mean and covariance of $\mathbf{X}_1$ follow. ■
Example 4.7 (The conditional density of a bivariate normal distribution) The conditional density of $X_1$, given that $X_2 = x_2$, for any bivariate distribution is defined by

\[
f(x_1 \mid x_2) = \{\text{conditional density of } X_1 \text{ given that } X_2 = x_2\} = \frac{f(x_1, x_2)}{f(x_2)}
\]

where $f(x_2)$ is the marginal distribution of $X_2$. If $f(x_1, x_2)$ is the bivariate normal density, show that $f(x_1 \mid x_2)$ is

\[
N\left( \mu_1 + \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2),\; \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}} \right)
\]
Here $\sigma_{11} - \sigma_{12}^2/\sigma_{22} = \sigma_{11}(1 - \rho_{12}^2)$. The two terms involving $x_1 - \mu_1$ in the exponent of the bivariate normal density [see Equation (4-6)] become, apart from the multiplicative constant $-1/(2(1 - \rho_{12}^2))$,

\[
\left(\frac{x_1 - \mu_1}{\sqrt{\sigma_{11}}}\right)^2 - 2\rho_{12}\,\frac{(x_1 - \mu_1)(x_2 - \mu_2)}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}} = \frac{1}{\sigma_{11}}\left[x_1 - \mu_1 - \rho_{12}\frac{\sqrt{\sigma_{11}}}{\sqrt{\sigma_{22}}}(x_2 - \mu_2)\right]^2 - \frac{\rho_{12}^2}{\sigma_{22}}(x_2 - \mu_2)^2
\]

Because $\rho_{12} = \sigma_{12}/(\sqrt{\sigma_{11}}\sqrt{\sigma_{22}})$, or $\rho_{12}\sqrt{\sigma_{11}}/\sqrt{\sigma_{22}} = \sigma_{12}/\sigma_{22}$, the complete exponent is

\[
\begin{aligned}
&-\frac{1}{2(1 - \rho_{12}^2)}\,\frac{1}{\sigma_{11}}\left[x_1 - \mu_1 - \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2)\right]^2 - \frac{1}{2(1 - \rho_{12}^2)}\left(\frac{1}{\sigma_{22}} - \frac{\rho_{12}^2}{\sigma_{22}}\right)(x_2 - \mu_2)^2 \\[4pt]
&= -\frac{1}{2\sigma_{11}(1 - \rho_{12}^2)}\left(x_1 - \mu_1 - \frac{\sigma_{12}}{\sigma_{22}}(x_2 - \mu_2)\right)^2 - \frac{1}{2}\,\frac{(x_2 - \mu_2)^2}{\sigma_{22}}
\end{aligned}
\]

Dividing the joint density by the marginal density

\[
f(x_2) = \frac{1}{\sqrt{2\pi}\sqrt{\sigma_{22}}}\, e^{-(x_2 - \mu_2)^2/2\sigma_{22}}
\]

and canceling terms yields the conditional density

\[
\begin{aligned}
f(x_1 \mid x_2) &= \frac{f(x_1, x_2)}{f(x_2)} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{\sigma_{11}(1 - \rho_{12}^2)}}\, e^{-[x_1 - \mu_1 - (\sigma_{12}/\sigma_{22})(x_2 - \mu_2)]^2 / 2\sigma_{11}(1 - \rho_{12}^2)}, \qquad -\infty < x_1 < \infty
\end{aligned}
\]
Thus, with our customary notation, the conditional distribution of $X_1$ given that $X_2 = x_2$ is $N(\mu_1 + (\sigma_{12}/\sigma_{22})(x_2 - \mu_2),\; \sigma_{11}(1 - \rho_{12}^2))$. Now, $\boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21} = \sigma_{11} - \sigma_{12}^2/\sigma_{22} = \sigma_{11}(1 - \rho_{12}^2)$ and $\boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1} = \sigma_{12}/\sigma_{22}$, agreeing with Result 4.6, which we obtained by an indirect method. ■
For the multivariate normal distribution in general, we note the following:

1. All conditional distributions are (multivariate) normal.

2. The conditional mean is of the form

\[
\begin{aligned}
&\mu_1 + \beta_{1, q+1}(x_{q+1} - \mu_{q+1}) + \cdots + \beta_{1, p}(x_p - \mu_p) \\
&\qquad\vdots \\
&\mu_q + \beta_{q, q+1}(x_{q+1} - \mu_{q+1}) + \cdots + \beta_{q, p}(x_p - \mu_p)
\end{aligned} \tag{4-9}
\]

where the $\beta$'s are defined by

\[
\boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1} = \begin{bmatrix} \beta_{1, q+1} & \beta_{1, q+2} & \cdots & \beta_{1, p} \\ \beta_{2, q+1} & \beta_{2, q+2} & \cdots & \beta_{2, p} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{q, q+1} & \beta_{q, q+2} & \cdots & \beta_{q, p} \end{bmatrix}
\]

3. The conditional covariance, $\boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}$, does not depend upon the value(s) of the conditioning variable(s).
We conclude this section by presenting two final properties of multivariate normal random vectors. One has to do with the probability content of the ellipsoids of constant density. The other discusses the distribution of another form of linear combinations.

The chi-square distribution determines the variability of the sample variance $s^2 = s_{11}$ for samples from a univariate normal population. It also plays a basic role in the multivariate case.

Result 4.7. Let $\mathbf{X}$ be distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with $|\boldsymbol{\Sigma}| > 0$. Then $(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu})$ is distributed as $\chi_p^2$, the chi-square distribution with $p$ degrees of freedom, and the $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ distribution assigns probability $1 - \alpha$ to the solid ellipsoid $\{\mathbf{x} : (\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu}) \le \chi_p^2(\alpha)\}$, where $\chi_p^2(\alpha)$ denotes the upper $(100\alpha)$th percentile of the $\chi_p^2$ distribution.
Proof. We know that $\chi_p^2$ is defined as the distribution of the sum $Z_1^2 + Z_2^2 + \cdots + Z_p^2$, where $Z_1, Z_2, \ldots, Z_p$ are independent $N(0, 1)$ random variables. Next, by the spectral decomposition [see Equations (2-16) and (2-21) with $\mathbf{A} = \boldsymbol{\Sigma}$, and see Result 4.1], $\boldsymbol{\Sigma}^{-1} = \sum_{i=1}^{p} \frac{1}{\lambda_i}\,\mathbf{e}_i\mathbf{e}_i'$, where $\boldsymbol{\Sigma}\mathbf{e}_i = \lambda_i\mathbf{e}_i$, so $\boldsymbol{\Sigma}^{-1}\mathbf{e}_i = (1/\lambda_i)\mathbf{e}_i$. Consequently,

\[
(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu}) = \sum_{i=1}^{p} \frac{1}{\lambda_i}\,(\mathbf{X} - \boldsymbol{\mu})'\mathbf{e}_i\mathbf{e}_i'(\mathbf{X} - \boldsymbol{\mu}) = \sum_{i=1}^{p} \left[\frac{1}{\sqrt{\lambda_i}}\,\mathbf{e}_i'(\mathbf{X} - \boldsymbol{\mu})\right]^2 = \sum_{i=1}^{p} Z_i^2,
\]

for instance. Now, we can write $\mathbf{Z} = \mathbf{A}(\mathbf{X} - \boldsymbol{\mu})$,
where

\[
\mathbf{Z} = \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_p \end{bmatrix}, \qquad \mathbf{A} = \begin{bmatrix} \dfrac{1}{\sqrt{\lambda_1}}\,\mathbf{e}_1' \\[6pt] \dfrac{1}{\sqrt{\lambda_2}}\,\mathbf{e}_2' \\ \vdots \\ \dfrac{1}{\sqrt{\lambda_p}}\,\mathbf{e}_p' \end{bmatrix}
\]

By Result 4.3, $\mathbf{Z} = \mathbf{A}(\mathbf{X} - \boldsymbol{\mu})$ is multivariate normal with mean $\mathbf{0}$ and covariance matrix

\[
\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}' = \begin{bmatrix} \dfrac{1}{\sqrt{\lambda_1}}\,\mathbf{e}_1' \\ \vdots \\ \dfrac{1}{\sqrt{\lambda_p}}\,\mathbf{e}_p' \end{bmatrix} \left[\sum_{i=1}^{p} \lambda_i\,\mathbf{e}_i\mathbf{e}_i'\right] \begin{bmatrix} \dfrac{1}{\sqrt{\lambda_1}}\,\mathbf{e}_1 & \cdots & \dfrac{1}{\sqrt{\lambda_p}}\,\mathbf{e}_p \end{bmatrix} = \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \vdots \\ \sqrt{\lambda_p}\,\mathbf{e}_p' \end{bmatrix} \begin{bmatrix} \dfrac{1}{\sqrt{\lambda_1}}\,\mathbf{e}_1 & \cdots & \dfrac{1}{\sqrt{\lambda_p}}\,\mathbf{e}_p \end{bmatrix} = \mathbf{I}
\]

by the orthonormality of the $\mathbf{e}_i$. Hence, the $Z_i$ are independent $N(0, 1)$ random variables, and $(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu}) = \sum_{i=1}^{p} Z_i^2$ is distributed as $\chi_p^2$. ■
The quadratic form $(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu})$ is the squared statistical distance from $\mathbf{X}$ to the population mean vector $\boldsymbol{\mu}$. If one component has a much larger variance than another, it will contribute less to the squared distance. Moreover, two highly correlated random variables will contribute less than two variables that are nearly uncorrelated. Essentially, the use of the inverse of the covariance matrix (1) standardizes all of the variables and (2) eliminates the effects of correlation. From the proof of Result 4.7,
\[
(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu}) = Z_1^2 + Z_2^2 + \cdots + Z_p^2
\]

In terms of $\boldsymbol{\Sigma}^{-1/2}$ (see (2-22)), $\mathbf{Z} = \boldsymbol{\Sigma}^{-1/2}(\mathbf{X} - \boldsymbol{\mu})$ has an $N_p(\mathbf{0}, \mathbf{I}_p)$ distribution, and

\[
(\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu}) = (\mathbf{X} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1/2}\boldsymbol{\Sigma}^{-1/2}(\mathbf{X} - \boldsymbol{\mu}) = \mathbf{Z}'\mathbf{Z} = Z_1^2 + Z_2^2 + \cdots + Z_p^2
\]

The squared statistical distance is calculated as if, first, the random vector $\mathbf{X}$ were transformed to $p$ independent standard normal random variables and then the usual squared distance, the sum of the squares of the variables, were applied.
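The whitening interpretation can be verified numerically. This sketch (with arbitrary illustrative parameters, not from the text) forms $\boldsymbol{\Sigma}^{-1/2}$ from the spectral decomposition, transforms a draw to $\mathbf{Z}$, and checks that the squared statistical distance equals $\mathbf{Z}'\mathbf{Z}$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative parameters (arbitrary)
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])

# Sigma^{-1/2} via the spectral decomposition, as in (2-22)
lam, E = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = E @ np.diag(lam ** -0.5) @ E.T

x = rng.multivariate_normal(mu, Sigma)
z = Sigma_inv_sqrt @ (x - mu)      # whitened vector, distributed N_3(0, I)

d2_direct = (x - mu) @ np.linalg.solve(Sigma, x - mu)
d2_whitened = z @ z                # sum of squared Z_i
print(d2_direct, d2_whitened)      # identical up to rounding
```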
Next, consider the linear combination of vector random variables

\[
c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + \cdots + c_n\mathbf{X}_n \tag{4-10}
\]

This linear combination differs from the linear combinations considered earlier in that it defines a $p \times 1$ vector random variable that is a linear combination of vectors. Previously, we discussed a single random variable that could be written as a linear combination of other univariate random variables.
Result 4.8. Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be mutually independent with $\mathbf{X}_j$ distributed as $N_p(\boldsymbol{\mu}_j, \boldsymbol{\Sigma})$. (Note that each $\mathbf{X}_j$ has the same covariance matrix $\boldsymbol{\Sigma}$.) Then

\[
\mathbf{V}_1 = c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + \cdots + c_n\mathbf{X}_n
\]

is distributed as $N_p\left(\sum_{j=1}^{n} c_j\boldsymbol{\mu}_j,\; \left(\sum_{j=1}^{n} c_j^2\right)\boldsymbol{\Sigma}\right)$. Moreover, $\mathbf{V}_1$ and $\mathbf{V}_2 = b_1\mathbf{X}_1 + b_2\mathbf{X}_2 + \cdots + b_n\mathbf{X}_n$ are jointly multivariate normal with covariance matrix

\[
\begin{bmatrix} \left(\displaystyle\sum_{j=1}^{n} c_j^2\right)\boldsymbol{\Sigma} & (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} \\[6pt] (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} & \left(\displaystyle\sum_{j=1}^{n} b_j^2\right)\boldsymbol{\Sigma} \end{bmatrix}
\]

Consequently, $\mathbf{V}_1$ and $\mathbf{V}_2$ are independent if $\mathbf{b}'\mathbf{c} = \sum_{j=1}^{n} c_jb_j = 0$.
Proof. By Result 4.5(c), the $np$ component random variables

\[
[X_{11}, \ldots, X_{1p}, X_{21}, \ldots, X_{2p}, \ldots, X_{np}] = [\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'] = \mathbf{X}'
\]

($1 \times np$) are jointly multivariate normal, with

\[
\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \\ \vdots \\ \boldsymbol{\mu}_n \end{bmatrix} \quad (np \times 1) \qquad \text{and} \qquad \boldsymbol{\Sigma}_{\mathbf{x}} = \begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \boldsymbol{\Sigma} \end{bmatrix} \quad (np \times np)
\]
The choice

\[
\mathbf{A} = \begin{bmatrix} c_1\mathbf{I} & c_2\mathbf{I} & \cdots & c_n\mathbf{I} \\ b_1\mathbf{I} & b_2\mathbf{I} & \cdots & b_n\mathbf{I} \end{bmatrix}
\]

where $\mathbf{A}$ is $2p \times np$ and $\mathbf{I}$ is the $p \times p$ identity matrix, gives

\[
\mathbf{A}\mathbf{X} = \begin{bmatrix} \displaystyle\sum_{j=1}^{n} c_j\mathbf{X}_j \\[6pt] \displaystyle\sum_{j=1}^{n} b_j\mathbf{X}_j \end{bmatrix} = \begin{bmatrix} \mathbf{V}_1 \\ \mathbf{V}_2 \end{bmatrix}
\]

and $\mathbf{A}\mathbf{X}$ is multivariate normal by Result 4.3 with covariance matrix

\[
\mathbf{A}\boldsymbol{\Sigma}_{\mathbf{x}}\mathbf{A}' = \begin{bmatrix} \left(\displaystyle\sum_{j=1}^{n} c_j^2\right)\boldsymbol{\Sigma} & (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} \\[6pt] (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} & \left(\displaystyle\sum_{j=1}^{n} b_j^2\right)\boldsymbol{\Sigma} \end{bmatrix}
\]

so $\mathbf{V}_1$ and $\mathbf{V}_2$ have the stated joint distribution. If $\mathbf{b}'\mathbf{c} = \sum_{j=1}^{n} c_jb_j = 0$, so that $\left(\sum_{j=1}^{n} c_jb_j\right)\boldsymbol{\Sigma} = \mathbf{0}$ ($p \times p$), then $\mathbf{V}_1$ and $\mathbf{V}_2$ are independent by Result 4.5(b). ■
For sums of the type in (4-10), the property of zero correlation is equivalent to
requiring the coefficient vectors b and c to be perpendicular.
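A numerical illustration of this independence condition (arbitrary illustrative $\boldsymbol{\mu}_j$ and $\boldsymbol{\Sigma}$; $\mathbf{b}$ and $\mathbf{c}$ chosen perpendicular): the empirical cross-covariance between every component of $\mathbf{V}_1$ and every component of $\mathbf{V}_2$ should be near zero.

```python
import numpy as np

rng = np.random.default_rng(5)

n, p, reps = 4, 2, 200_000
c = np.array([0.5, 0.5, 0.5, 0.5])
b = np.array([1.0, 1.0, 1.0, -3.0])
print(b @ c)   # 0: the coefficient vectors are perpendicular

Sigma = np.array([[2.0, 0.7],
                  [0.7, 1.0]])               # common covariance (arbitrary)
mus = [np.array([j, -j], dtype=float) for j in range(n)]  # arbitrary means

# Draw reps independent copies of X_1, ..., X_n and form V1 and V2
X = np.stack([rng.multivariate_normal(mus[j], Sigma, size=reps)
              for j in range(n)])            # shape (n, reps, p)
V1 = np.tensordot(c, X, axes=1)              # sum_j c_j X_j, shape (reps, p)
V2 = np.tensordot(b, X, axes=1)

cross = np.cov(np.hstack([V1, V2]), rowvar=False)[:p, p:]
print(cross)   # approximately the 2 x 2 zero matrix
```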
Example 4.8 (Linear combinations of random vectors) Let $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$, and $\mathbf{X}_4$ be independent and identically distributed $3 \times 1$ random vectors with

\[
\boldsymbol{\mu} = \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \boldsymbol{\Sigma} = \begin{bmatrix} 3 & -1 & 1 \\ -1 & 1 & 0 \\ 1 & 0 & 2 \end{bmatrix}
\]

A linear combination such as $c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + c_3\mathbf{X}_3 + c_4\mathbf{X}_4$ is itself a random vector; here each term in the sum is a constant times a random vector.

Now consider two linear combinations of random vectors

\[
\frac{1}{2}\mathbf{X}_1 + \frac{1}{2}\mathbf{X}_2 + \frac{1}{2}\mathbf{X}_3 + \frac{1}{2}\mathbf{X}_4
\]

and

\[
\mathbf{X}_1 + \mathbf{X}_2 + \mathbf{X}_3 - 3\mathbf{X}_4
\]

Find the mean vector and covariance matrix for each linear combination of vectors and also the covariance between them.
By Result 4.8 with $c_1 = c_2 = c_3 = c_4 = 1/2$, the first linear combination has mean vector

\[
(c_1 + c_2 + c_3 + c_4)\boldsymbol{\mu} = 2\boldsymbol{\mu} = \begin{bmatrix} 6 \\ -2 \\ 2 \end{bmatrix}
\]

and covariance matrix

\[
(c_1^2 + c_2^2 + c_3^2 + c_4^2)\boldsymbol{\Sigma} = 1 \cdot \boldsymbol{\Sigma} = \begin{bmatrix} 3 & -1 & 1 \\ -1 & 1 & 0 \\ 1 & 0 & 2 \end{bmatrix}
\]

For the second linear combination of random vectors, we apply Result 4.8 with $b_1 = b_2 = b_3 = 1$ and $b_4 = -3$ to get mean vector

\[
(b_1 + b_2 + b_3 + b_4)\boldsymbol{\mu} = 0\boldsymbol{\mu} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

and covariance matrix

\[
(b_1^2 + b_2^2 + b_3^2 + b_4^2)\boldsymbol{\Sigma} = 12\boldsymbol{\Sigma} = \begin{bmatrix} 36 & -12 & 12 \\ -12 & 12 & 0 \\ 12 & 0 & 24 \end{bmatrix}
\]

Finally, the covariance matrix for the two linear combinations of random vectors is

\[
(c_1b_1 + c_2b_2 + c_3b_3 + c_4b_4)\boldsymbol{\Sigma} = 0\boldsymbol{\Sigma} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
Every component of the first linear combination of random vectors has zero covariance with every component of the second linear combination of random vectors. If, in addition, each $\mathbf{X}_j$ has a trivariate normal distribution, then the two linear combinations have a joint six-variate normal distribution, and the two linear combinations of vectors are independent. ■
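The arithmetic in Example 4.8 can be double-checked in a few lines, applying Result 4.8's formulas directly (a sketch; the $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are exactly those given in the example).

```python
import numpy as np

mu = np.array([3.0, -1.0, 1.0])
Sigma = np.array([[3.0, -1.0, 1.0],
                  [-1.0, 1.0, 0.0],
                  [1.0, 0.0, 2.0]])

c = np.array([0.5, 0.5, 0.5, 0.5])
b = np.array([1.0, 1.0, 1.0, -3.0])

print(c.sum() * mu)      # mean of V1: [6, -2, 2]
print((c @ c) * Sigma)   # covariance of V1: 1 * Sigma
print(b.sum() * mu)      # mean of V2: [0, 0, 0]
print((b @ b) * Sigma)   # covariance of V2: 12 * Sigma
print((b @ c) * Sigma)   # cross-covariance: the 3 x 3 zero matrix
```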