

4 The Multivariate Normal Distribution

The following are three possible definitions of the multivariate normal distribution (MVN). Given a vector
µ and a positive semidefinite matrix Σ, Y ∼ Nn (µ, Σ) if:

4.1 Definition: For a positive definite Σ, the density function of Y is

    f_Y(y) = (2π)^{−n/2} |Σ|^{−1/2} exp{ −(1/2)(y − µ)'Σ^{−1}(y − µ) }.
4.2 Definition: The moment generating function (m.g.f.) of Y is

    M_Y(t) ≡ E[e^{t'Y}] = exp{ µ't + (1/2) t'Σt }.
4.3 Definition: Y has the same distribution as AZ + µ, where Z = (Z1, . . . , Zk)' is a random sample from
N(0, 1) and A_{n×k} satisfies AA' = Σ.

4.4 Theorem: Definitions 4.1, 4.2, and 4.3 are equivalent for Σ > 0 (positive definite). Definitions 4.2 and
4.3 are equivalent for Σ ≥ 0 (positive semidefinite). If Σ is not positive definite, then Y has a singular
MVN distribution and no density function exists.

4.5 Theorem: If Z = (Z1, . . . , Zn)' is a random sample from N(0, 1), then Z has the N(0_n, I_{n×n}) distribution.

4.6 Theorem: E[Y] = µ, cov(Y) = Σ.
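As a concrete illustration of Definition 4.3 and Theorem 4.6, the following Python sketch (the function and variable names are illustrative, not from the notes) samples Y = AZ + µ with A taken from a Cholesky factorization of Σ, then checks the mean vector and covariance matrix empirically.

```python
# Minimal sketch of Definition 4.3: draw Y = A Z + mu with A A' = Sigma.
# All names and parameter values here are illustrative choices.
import numpy as np

def sample_mvn(mu, Sigma, size, rng):
    """Sample from N(mu, Sigma) via Y = A Z + mu (Definition 4.3)."""
    A = np.linalg.cholesky(Sigma)   # valid for positive definite Sigma;
                                    # an eigendecomposition covers Sigma >= 0
    Z = rng.standard_normal((size, len(mu)))  # rows are iid N(0, I) vectors
    return Z @ A.T + mu

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
Y = sample_mvn(mu, Sigma, 200_000, rng)
print(Y.mean(axis=0))            # approximately mu   (Theorem 4.6)
print(np.cov(Y, rowvar=False))   # approximately Sigma (Theorem 4.6)
```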

4.7 Example: Let Z = (Z1, Z2)' ∼ N2(0, I), and let A be the linear transformation matrix

    A = [  1/2  −1/2 ]
        [ −1/2   1/2 ].

Let Y = (Y1, Y2)' be the linear transformation

    Y = AZ = [ (Z1 − Z2)/2 ]
             [ (Z2 − Z1)/2 ].

By Definition 4.3, Y ∼ N(0, Σ) where Σ = AA'. Here A is symmetric and idempotent (A² = A), so Σ = A has
rank 1: Y2 = −Y1 with probability 1, and Y has a singular MVN distribution.

4.8 Theorem: If Y ∼ Nn(µ, Σ) and C_{p×n} is a constant matrix of rank p, then CY ∼ Np(Cµ, CΣC').

4.9 Theorem: Y is MVN if and only if a'Y is normally distributed for all non-zero constant vectors a.

4.10 Theorem: Let Y ∼ Nn(µ, σ²I), and let T be an orthogonal constant matrix. Then TY ∼ Nn(Tµ, σ²I).

4.11 Note: Theorem 4.10 says that mutually independent normal random variables with common variance remain
mutually independent with common variance under orthogonal transformations. Orthogonal matrices
correspond to rotations and reflections about the origin, i.e., they preserve vector length:

    ||Ty||² = (Ty)'(Ty) = y'T'Ty = y'y = ||y||².
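A quick numerical check of Theorem 4.10 and the length-preservation identity above (the dimension, σ, and seed are arbitrary choices): generate a random orthogonal T via a QR factorization and confirm that lengths are preserved exactly and the covariance is still σ²I.

```python
# Sketch of Theorem 4.10 / Note 4.11; n, sigma, and the seed are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 4, 2.0
T, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
mu = rng.standard_normal(n)
Y = mu + sigma * rng.standard_normal((200_000, n)) # rows iid N(mu, sigma^2 I)
TY = Y @ T.T

# Lengths are preserved exactly: ||T y|| = ||y|| for every row.
print(np.allclose(np.linalg.norm(TY, axis=1), np.linalg.norm(Y, axis=1)))
# Covariance is still (approximately) sigma^2 * I.
print(np.cov(TY, rowvar=False).round(2))
```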

Let Y ∼ Nn(µ, Σ) be partitioned as

    Y = [ Y1 ]
        [ Y2 ],

where Y1 is p × 1 and Y2 is q × 1 (p + q = n). The mean and covariance matrix are correspondingly
partitioned as

    µ = [ µ1 ]   and   Σ = [ Σ11  Σ12 ] = [ cov(Y1)      cov(Y1, Y2) ]
        [ µ2 ]             [ Σ21  Σ22 ]   [ cov(Y2, Y1)  cov(Y2)     ].

4.12 Theorem: The marginal distributions are Y1 ∼ Np (µ1 , Σ11 ) and Y2 ∼ Nq (µ2 , Σ22 ).

4.13 Theorem: Within the MVN family, uncorrelated implies independent: Y1 and Y2 are independent if and only if Σ12 = Σ21' = 0.

4.14 Theorem: If Σ is positive definite, then the conditional distribution of Y1 given Y2 is

    Y1 | Y2 = y2 ∼ Np( µ1 + Σ12 Σ22^{−1}(y2 − µ2),  Σ11 − Σ12 Σ22^{−1} Σ21 ).
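Theorem 4.14 translates directly into a small helper (the function name and test values below are mine, for illustration only); in the bivariate case it reduces to simple linear regression of Y1 on Y2.

```python
# Sketch of Theorem 4.14; conditional_mvn is an illustrative name.
import numpy as np

def conditional_mvn(mu1, mu2, S11, S12, S22, y2):
    """Mean and covariance of Y1 | Y2 = y2 for a partitioned MVN."""
    W = S12 @ np.linalg.inv(S22)                  # Sigma_12 Sigma_22^{-1}
    return mu1 + W @ (y2 - mu2), S11 - W @ S12.T  # uses Sigma_21 = Sigma_12'

# Bivariate case with unit variances and correlation 0.8:
m, V = conditional_mvn(np.zeros(1), np.zeros(1),
                       np.array([[1.0]]), np.array([[0.8]]), np.array([[1.0]]),
                       np.array([2.0]))
print(m, V)   # mean 0.8 * 2 = 1.6, variance 1 - 0.8^2 = 0.36
```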

4.15 Definition: For any positive integer d, χ²_d is the distribution of Z1² + · · · + Zd², where Z1, . . . , Zd are
independent and identically distributed N(0, 1) random variables.

4.16 Example: Let Y1, . . . , Yn be independent N(µ, σ²) random variables. Then Ȳ and S² are independent and
(n − 1)S²/σ² ∼ χ²_{n−1}.
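A Monte Carlo check of Example 4.16 (the sample size and parameters below are arbitrary): the scaled sample variance should match the χ²_{n−1} distribution in mean, variance, and a Kolmogorov–Smirnov test.

```python
# Monte Carlo sketch of Example 4.16; n, mu, sigma are arbitrary choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, mu, sigma = 10, 5.0, 3.0
Y = rng.normal(mu, sigma, size=(50_000, n))
S2 = Y.var(axis=1, ddof=1)          # sample variance with divisor n - 1
q = (n - 1) * S2 / sigma**2
print(q.mean(), q.var())            # approx n-1 = 9 and 2(n-1) = 18
print(stats.kstest(q, stats.chi2(n - 1).cdf).pvalue)  # should not be small
```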

In linear model theory, test statistics arise from sums of squares (special cases of quadratic forms) with χ²
distributions.

4.17 Theorem: If Y ∼ Nn(µ, Σ) and Σ is positive definite, then (Y − µ)'Σ^{−1}(Y − µ) ∼ χ²_n.
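Theorem 4.17 can be checked the same way (the µ and Σ below are arbitrary positive definite choices): the quadratic form standardizes Y, leaving the squared length of n independent standard normals.

```python
# Monte Carlo sketch of Theorem 4.17 with arbitrary mu, Sigma (n = 3).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
Y = rng.multivariate_normal(mu, Sigma, size=50_000)
D = Y - mu
# q_i = (y_i - mu)' Sigma^{-1} (y_i - mu) for each sampled row y_i
q = np.einsum('ij,jk,ik->i', D, np.linalg.inv(Sigma), D)
print(q.mean(), q.var())                          # approx n = 3 and 2n = 6
print(stats.kstest(q, stats.chi2(3).cdf).pvalue)  # should not be small
```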

4.18 Theorem: Let Y ∼ Nn(µ, σ²I) and P_{n×n} be symmetric of rank r. Then Q = (Y − µ)'P(Y − µ)/σ² ∼
χ²_r if and only if P is idempotent (i.e., P² = P), and hence a projection.

4.19 Note: Theorem 4.18 says that in the spherically symmetric case Σ = σ²I, the only quadratic forms with
χ² distributions are sums of squares, i.e., squared lengths of projections: x'Px = ||Px||².
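To make Theorem 4.18 concrete, the centering matrix (a standard example, not one prescribed by the notes) is symmetric and idempotent of rank n − 1, and its quadratic form recovers (n − 1)S² from Example 4.16.

```python
# Sketch for Theorem 4.18 / Note 4.19 using the centering matrix
# P = I - (1/n) 1 1', a projection of rank n - 1.
import numpy as np

n = 6
P = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(P @ P, P))               # idempotent, hence a projection
print(np.isclose(np.trace(P), n - 1))      # rank = trace = n - 1

y = np.arange(1.0, n + 1)                  # any vector: P y = y - ybar
print(np.allclose(y @ P @ y, (n - 1) * y.var(ddof=1)))  # ||Py||^2 = (n-1) S^2
```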

Theorem 4.22 addresses conditions under which the difference of two χ²-distributed quadratic forms is χ²
(to be applied to the ANOVA decomposition of the sum of squares). To prove the theorem, we will need to
know when the difference of two projection matrices is a projection matrix.

4.20 Theorem: Assume that P1 and P2 are projection matrices and that P1 − P2 is positive semidefinite. Then
(a) P1 P2 = P2 P1 = P2 ,
(b) P1 − P2 is a projection matrix.

4.21 Note: The geometric interpretation of Theorem 4.20 is (see the sketch after this list):


1. P1 is a projection onto a linear space Ω.
2. P2 is a projection onto a subspace ω of Ω.
3. P1 − P2 is a projection onto the orthogonal complement of ω within Ω.
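A numerical illustration of Theorem 4.20 and this interpretation (the design matrix below is an arbitrary choice): take P1 as the projection onto span{1, x} and P2 as the projection onto the subspace span{1}.

```python
# Sketch of Theorem 4.20 / Note 4.21 with nested projections:
# P2 projects onto span{1}, a subspace of col(X) = span{1, x}.
import numpy as np

n = 5
one = np.ones((n, 1))
x = np.arange(n, dtype=float).reshape(-1, 1)
X = np.hstack([one, x])
P1 = X @ np.linalg.inv(X.T @ X) @ X.T    # projection onto col(X)
P2 = one @ one.T / n                     # projection onto span{1}

print(np.allclose(P1 @ P2, P2), np.allclose(P2 @ P1, P2))   # part (a)
D = P1 - P2
print(np.allclose(D @ D, D), np.allclose(D, D.T))  # part (b): D is a projection
```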

4.22 Theorem: Let Y ∼ Nn(µ, σ²I) and Qi = (Y − µ)'Pi(Y − µ)/σ², where Pi is a symmetric n × n
matrix for i = 1, 2. If Qi ∼ χ²_{ri} and Q1 − Q2 ≥ 0, then Q1 − Q2 and Q2 are independent, and
Q1 − Q2 ∼ χ²_{r1−r2}.

4.23 Definition: The non-central chi-squared distribution with n degrees of freedom and non-centrality param-
eter λ, denoted χ²_n(λ), is defined as the distribution of Z1² + · · · + Zn², where Z1, . . . , Zn are independent
with Zi ∼ N(µi, 1), and λ = (µ1² + · · · + µn²)/2.
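The following sketch samples Definition 4.23 directly and compares against SciPy's non-central chi-squared (the µi values are arbitrary). One caution on conventions: scipy.stats.ncx2 parameterizes non-centrality as nc = Σµi², which is 2λ in these notes' convention.

```python
# Sketch of Definition 4.23; mus is an arbitrary mean vector.  Careful:
# scipy's nc equals sum(mu_i^2) = 2*lambda under these notes' convention.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mus = np.array([1.0, -0.5, 2.0])
lam = (mus**2).sum() / 2                      # lambda = 2.625
Z = rng.normal(mus, 1.0, size=(100_000, 3))   # rows: independent N(mu_i, 1)
q = (Z**2).sum(axis=1)
print(q.mean(), q.var())  # approx n + 2*lam = 8.25 and 2(n + 4*lam) = 27 (Thm 4.27)
print(stats.kstest(q, stats.ncx2(3, 2 * lam).cdf).pvalue)  # should not be small
```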

4.24 Note: For any n we have χ²_n(0) ≡ χ²_n, which we refer to as the central chi-squared distribution.

4.25 Theorem: If Y ∼ Nn(µ, I), then Y'Y has moment generating function

    M_{Y'Y}(t) = (1 − 2t)^{−n/2} exp{ (µ'µ/2)(1/(1 − 2t) − 1) },   t < 1/2.

4.26 Theorem: Let Y ∼ Nn(µ, σ²In) and P = P'. Then P = P² of rank r if and only if
Y'PY/σ² ∼ χ²_r(µ'Pµ/(2σ²)).

4.27 Theorem: If Y ∼ χ²_n(λ), then E[Y] = n + 2λ and var[Y] = 2(n + 4λ).

4.28 Theorem: If Y ∼ χ²_n with n > 2, then E[1/Y] = 1/(n − 2).

4.29 Theorem: χ²_n(λ), like χ²_n, has the convolution property: if Y1 ∼ χ²_{n1}(λ1) and Y2 ∼ χ²_{n2}(λ2) are
independent, then Y1 + Y2 ∼ χ²_{n1+n2}(λ1 + λ2).
