Principal_Components

The document discusses principal component analysis (PCA), focusing on its objectives of data reduction and interpretation through linear combinations of variables. It explains the mathematical foundations of PCA, including the derivation of principal components from covariance matrices, and provides examples of applying PCA to real datasets. Additionally, it covers the standardization of variables and the implications of sample variation in PCA.

1 Introduction

• A principal component analysis is concerned with explaining the variance-covariance
structure of a set of variables through a few linear combinations of these variables.
• Although all p components are required to reproduce the total system variability, much of this
variability can often be accounted for by a small number k of the principal components.
• The original data set, consisting of n measurements on p variables, is then reduced to
a data set consisting of n measurements on k principal components.
• Its general objectives are (1) data reduction and (2) interpretation.

2 Population Principal Components


• Let the random vector X′ = [X1, X2, · · · , Xp] have the covariance matrix Σ
with eigenvalues λ1 ≥ λ2 ≥ · · · ≥ λp ≥ 0.

• Consider the linear combinations

    Y1 = a′1 X = a11 X1 + a12 X2 + · · · + a1p Xp
    Y2 = a′2 X = a21 X1 + a22 X2 + · · · + a2p Xp
    ⋮
    Yp = a′p X = ap1 X1 + ap2 X2 + · · · + app Xp

• We have

    Var(Yi) = a′i Σ ai ,          i = 1, 2, · · · , p
    Cov(Yi, Yk) = a′i Σ ak ,      i, k = 1, 2, · · · , p

• The principal components are those uncorrelated linear combinations Y1, Y2, · · · , Yp
whose variances are as large as possible.
• We define
– First principal component: the linear combination a′1 X that maximizes Var(a′1 X)
subject to a′1 a1 = 1.
– Second principal component: the linear combination a′2 X that maximizes
Var(a′2 X) subject to a′2 a2 = 1 and Cov(a′1 X, a′2 X) = 0.
· · ·
– At the ith step, the ith principal component is the linear combination a′i X that
maximizes Var(a′i X) subject to a′i ai = 1 and Cov(a′i X, a′k X) = 0 for k < i.
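This constrained maximization can be checked numerically: among unit vectors a, the leading eigenvector of Σ attains the largest value of a′Σa. A minimal NumPy sketch (the matrix Σ below is an arbitrary illustrative choice, not one from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary symmetric positive semi-definite matrix, used as Sigma for illustration
A = rng.normal(size=(3, 3))
Sigma = A @ A.T

# eigh returns eigenvalues in ascending order; the leading pair is last
eigvals, eigvecs = np.linalg.eigh(Sigma)
e1 = eigvecs[:, -1]
lam1 = eigvals[-1]

# For random unit vectors a, Var(a'X) = a' Sigma a never exceeds lambda_1
for _ in range(1000):
    a = rng.normal(size=3)
    a /= np.linalg.norm(a)          # enforce the constraint a'a = 1
    assert a @ Sigma @ a <= lam1 + 1e-10

# The maximum lambda_1 is attained at a = e1
print(np.isclose(e1 @ Sigma @ e1, lam1))  # → True
```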

Result 1. Let Σ be the covariance matrix associated with the random vector X′ = [X1, · · · , Xp].
Let Σ have the eigenvalue-eigenvector pairs (λ1, e1), · · · , (λp, ep) where λ1 ≥ · · · ≥ λp ≥ 0.
Then the ith principal component is given by

    Yi = e′i X = ei1 X1 + · · · + eip Xp ,   i = 1, · · · , p

With these choices,

    Var(Yi) = e′i Σ ei = λi ,        i = 1, · · · , p
    Cov(Yi, Yk) = e′i Σ ek = 0 ,     i ̸= k

If some λi are equal, the choices of the corresponding coefficient vectors ei, and hence
the Yi, are not unique.
Result 2. Let X′ = [X1, · · · , Xp] have covariance matrix Σ, with eigenvalue-eigenvector
pairs (λ1, e1), · · · , (λp, ep) where λ1 ≥ · · · ≥ λp ≥ 0. Let Y1 = e′1 X, · · · , Yp = e′p X be
the principal components. Then

    σ11 + · · · + σpp = ∑_{i=1}^p Var(Xi) = λ1 + · · · + λp = ∑_{i=1}^p Var(Yi)

Result 3. If Y1 = e′1 X, · · · , Yp = e′p X are the principal components obtained from the
covariance matrix Σ, then

    ρ(Yi, Xk) = eik √λi / √σkk ,   i, k = 1, 2, · · · , p

are the correlation coefficients between the components Yi and the variables Xk. Here
(λ1, e1), · · · , (λp, ep) are the eigenvalue-eigenvector pairs for Σ.
Suppose the random variables X1, X2 and X3 have the covariance matrix

        ⎡  1  −2  0 ⎤
    Σ = ⎢ −2   5  0 ⎥
        ⎣  0   0  2 ⎦

• Find the principal components for the random vector X′ = [X1, X2, X3].
• Determine how many principal components should be used to replace the original
three variables.
• Calculate the correlation coefficients between the principal components and the original
variables.
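The eigen-decomposition for this Σ can be obtained numerically; a minimal NumPy sketch:

```python
import numpy as np

Sigma = np.array([[ 1., -2., 0.],
                  [-2.,  5., 0.],
                  [ 0.,  0., 2.]])

# eigh returns eigenvalues in ascending order; reverse so lambda_1 >= lambda_2 >= lambda_3
eigvals, eigvecs = np.linalg.eigh(Sigma)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

print(eigvals)                  # lambda_1 = 3 + 2*sqrt(2) ≈ 5.83, lambda_2 = 2, lambda_3 ≈ 0.17
print(eigvals / eigvals.sum())  # proportion of the total variance per component
```

Since λ1 + λ2 = 3 + 2√2 + 2 ≈ 7.83 out of a total variance of 8, the first two components account for roughly 98% of the total variance, so two components could reasonably replace the three original variables.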
Comments
• Suppose X is distributed as Np(µ, Σ).
• The principal components y1 = e′1 x, · · · , yp = e′p x lie in the directions of the
axes of a constant-density ellipsoid.
3 Principal Components Obtained from Standardized Variables

• Principal components may also be obtained for the standardized variables: Z = (V1/2)−1 (X − µ),
where V1/2 is the diagonal standard deviation matrix.

• We have E(Z) = 0 and Cov(Z) = (V1/2)−1 Σ (V1/2)−1 = ρ.

Result 4. The ith principal component of the standardized variables Z′ = [Z1, · · · , Zp] with Cov(Z) = ρ is given by

    Yi = e′i Z = e′i (V1/2)−1 (X − µ) ,   i = 1, 2, · · · , p

Moreover,

    ∑_{i=1}^p Var(Yi) = ∑_{i=1}^p Var(Zi) = p

and

    ρ(Yi, Zk) = eik √λi ,   i, k = 1, 2, · · · , p.

In this case, (λ1, e1), · · · , (λp, ep) are the eigenvalue-eigenvector pairs for ρ, with λ1 ≥ · · · ≥ λp ≥ 0.
Consider the covariance matrix

        ⎡ 1    4  ⎤
    Σ = ⎣ 4   100 ⎦

and the derived correlation matrix

        ⎡  1   0.4 ⎤
    ρ = ⎣ 0.4   1  ⎦

• Find the principal components for Σ and ρ.
• Discussion
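The contrast between the two decompositions can be seen numerically; a short NumPy sketch:

```python
import numpy as np

Sigma = np.array([[1., 4.], [4., 100.]])
rho = np.array([[1., 0.4], [0.4, 1.]])

# Eigenvalues of each matrix, in descending order
lam_S = np.linalg.eigvalsh(Sigma)[::-1]
lam_r = np.linalg.eigvalsh(rho)[::-1]

# Proportion of total variance explained by the first component
print(lam_S[0] / lam_S.sum())   # ≈ 0.992: X2's large variance dominates Sigma
print(lam_r[0] / lam_r.sum())   # 0.7: after standardizing, the first PC explains 70%
```

Working with Σ, the first component is essentially X2 alone because of its much larger variance; working with ρ, the two variables contribute evenly. This is why standardization matters when the measurement scales differ greatly.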

4 Principal Components for Covariance Matrices with Special Structures

• Diagonal covariance matrix:

        ⎡ σ11   0   · · ·   0  ⎤
    Σ = ⎢  0   σ22  · · ·   0  ⎥
        ⎢  ⋮    ⋮     ⋱     ⋮  ⎥
        ⎣  0    0   · · ·  σpp ⎦

• Equal variances and equal covariances:

        ⎡ σ²   ρσ²  · · ·  ρσ² ⎤
    Σ = ⎢ ρσ²  σ²   · · ·  ρσ² ⎥
        ⎢  ⋮    ⋮     ⋱     ⋮  ⎥
        ⎣ ρσ²  ρσ²  · · ·  σ²  ⎦
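For the second structure, a standard result (not derived above) gives λ1 = σ²[1 + (p − 1)ρ] with e′1 = [1/√p, · · · , 1/√p], while the remaining p − 1 eigenvalues all equal σ²(1 − ρ). A quick numerical check, with illustrative values of p, σ², and ρ:

```python
import numpy as np

p, sigma2, r = 5, 2.0, 0.3   # illustrative values, not from the notes

# Equicorrelation matrix: sigma^2 on the diagonal, rho*sigma^2 elsewhere
Sigma = sigma2 * ((1 - r) * np.eye(p) + r * np.ones((p, p)))

lam = np.linalg.eigvalsh(Sigma)[::-1]
print(lam[0])    # sigma^2 * (1 + (p-1)*rho) = 2 * 2.2 = 4.4
print(lam[1:])   # the remaining eigenvalues all equal sigma^2 * (1 - rho) = 1.4
```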

5 Summarizing Sample Variation By Principal Components

• Suppose the data x1, x2, · · · , xn represent n independent drawings from some p-dimensional
population with mean vector µ and covariance matrix Σ. These data yield the sample
mean vector x̄, the sample covariance matrix S, and the sample correlation matrix R.
• The sample principal components are defined as those linear combinations which have
maximum sample variance:
– First sample principal component: the linear combination a′1 xj that maximizes the
sample variance of a′1 xj subject to a′1 a1 = 1.
– Second sample principal component: the linear combination a′2 xj that maximizes the
sample variance of a′2 xj subject to a′2 a2 = 1 and zero sample covariance for the
pairs (a′1 xj, a′2 xj).
– ith sample principal component: the linear combination a′i xj that maximizes the
sample variance of a′i xj subject to a′i ai = 1 and zero sample covariance for the
pairs (a′i xj, a′k xj), k < i.
• If S = {sik} is the p × p sample covariance matrix with eigenvalue-eigenvector pairs
(λ̂1, ê1), · · · , (λ̂p, êp), the ith sample principal component is given by

    ŷi = ê′i x = êi1 x1 + · · · + êip xp ,   i = 1, 2, · · · , p

where λ̂1 ≥ · · · ≥ λ̂p ≥ 0 and x is any observation on the variables X1, · · · , Xp.

• Sample variance(ŷk) = λ̂k ,   k = 1, 2, · · · , p
  Sample covariance(ŷi, ŷk) = 0 ,   i ̸= k

• Total sample variance = ∑_{i=1}^p sii = λ̂1 + · · · + λ̂p

• r(ŷi, xk) = êik √λ̂i / √skk ,   i, k = 1, 2, · · · , p
A census provided information, by tract, on five socioeconomic variables for the Madison,
Wisconsin, area. The data from 61 tracts are listed as follows:

Tract  Total population  Professional degree  Employed age over 16  Government employment  Median home value
       (thousands)       (percent)            (percent)             (percent)              ($100,000)
1      2.67              5.71                 69.02                 30.3                   1.48
2      2.25              4.37                 72.98                 43.3                   1.44
⋮
61     6.48              4.93                 74.23                 20.9                   1.98

These data produced the following summary statistics:

    x̄′ = [4.47, 3.96, 71.42, 26.91, 1.64]

        ⎡  3.397  −1.102    4.306   −2.078   0.027 ⎤
        ⎢ −1.102   9.673   −1.513   10.953   1.203 ⎥
    S = ⎢  4.306  −1.513   55.626  −28.937  −0.044 ⎥
        ⎢ −2.078  10.953  −28.937   89.067   0.957 ⎥
        ⎣  0.027   1.203   −0.044    0.957   0.319 ⎦

Can the sample variation be summarized by one or two principal components?
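The question can be answered by decomposing S; a NumPy sketch (the matrix entries are transcribed from the summary statistics above):

```python
import numpy as np

S = np.array([
    [ 3.397,  -1.102,   4.306,  -2.078,  0.027],
    [-1.102,   9.673,  -1.513,  10.953,  1.203],
    [ 4.306,  -1.513,  55.626, -28.937, -0.044],
    [-2.078,  10.953, -28.937,  89.067,  0.957],
    [ 0.027,   1.203,  -0.044,   0.957,  0.319],
])

lam = np.linalg.eigvalsh(S)[::-1]    # sample eigenvalues, descending
prop = np.cumsum(lam) / lam.sum()    # cumulative proportion of total sample variance

print(lam)
print(prop)   # the first one or two entries answer the question
```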

6 Standardizing the Sample Principal Components

• Standardization is accomplished by

    zj = D−1/2 (xj − x̄) ,   j = 1, 2, · · · , n

where D1/2 is the diagonal matrix of sample standard deviations.
• Sample mean of the standardized data: z̄ = 0; sample covariance matrix:
Sz = (1/(n − 1)) Z′Z = R, where Z = [z1, · · · , zn]′ has the standardized observations as rows.
• The ith sample principal component is

    ŷi = ê′i z = êi1 z1 + êi2 z2 + · · · + êip zp ,   i = 1, 2, · · · , p

where (λ̂i, êi) is the ith eigenvalue-eigenvector pair of R with λ̂1 ≥ · · · ≥ λ̂p ≥ 0.

• Sample variance(ŷi) = λ̂i ,   i = 1, 2, · · · , p
  Sample covariance(ŷi, ŷk) = 0 ,   i ̸= k
• Total (standardized) sample variance = tr(R) = p = λ̂1 + · · · + λ̂p

• r(ŷi, zk) = êik √λ̂i ,   i, k = 1, 2, · · · , p
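The identity Sz = R can be verified on any data set; a small sketch with synthetic data (the data themselves are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p)) * np.array([1.0, 10.0, 100.0])  # very different scales

# Standardize: z_j = D^{-1/2} (x_j - xbar)
xbar = X.mean(axis=0)
d = X.std(axis=0, ddof=1)                 # sample standard deviations
Z = (X - xbar) / d

Sz = Z.T @ Z / (n - 1)                    # sample covariance of the standardized data
R = np.corrcoef(X, rowvar=False)          # sample correlation matrix of X

print(np.allclose(Sz, R))                 # → True

lam = np.linalg.eigvalsh(R)[::-1]
print(np.isclose(lam.sum(), p))           # → True: total standardized variance = p
```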

The weekly rates of return for five stocks (JP Morgan, Citibank, Wells Fargo, Royal Dutch
Shell, and ExxonMobil) listed on the New York Stock Exchange were determined for the period
January 2004 through December 2005. The weekly rate of return is defined as (current Friday
closing price − previous Friday closing price)/(previous Friday closing price), adjusted for stock
splits and dividends. The data are listed in the following table:

week   JP Morgan   Citibank   Wells Fargo   Royal Dutch Shell   Exxon Mobil
1       0.01303    -0.00784    -0.00319         -0.04477          0.00522
2       0.00849     0.01669    -0.00621          0.01196          0.01349
⋮
103    -0.01279    -0.01437    -0.01874         -0.00498         -0.01637

Find the sample principal components for these data and try to interpret them.

7 Large Sample Inferences

• Let Λ be the diagonal matrix of eigenvalues λ1, · · · , λp of Σ. Then, approximately,

    √n (λ̂ − λ) ∼ Np(0, 2Λ²)

• Let êi be the eigenvector associated with λ̂i. Then, approximately,

    √n (êi − ei) ∼ Np(0, Ei)

where

    Ei = λi ∑_{k=1, k̸=i}^p [ λk / (λk − λi)² ] ek e′k

• Each λ̂i is distributed independently of the elements of the associated êi.
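One immediate use of the first result, not spelled out above, is an approximate large-sample confidence interval for an individual λi, since λ̂i is approximately N(λi, 2λi²/n). A sketch (the values of n and λ̂1 below are hypothetical):

```python
import math

# Hypothetical sample quantities for illustration
n = 100            # sample size
lam_hat = 5.2      # observed first sample eigenvalue
z = 1.96           # standard normal quantile for a 95% interval

# From sqrt(n)(lam_hat - lam) ~ N(0, 2*lam^2), an approximate 95% CI for lambda_1 is
#   lam_hat / (1 + z*sqrt(2/n)) <= lambda_1 <= lam_hat / (1 - z*sqrt(2/n))
half = z * math.sqrt(2.0 / n)
lower = lam_hat / (1 + half)
upper = lam_hat / (1 - half)
print(lower, upper)
```

Note the interval is not symmetric about λ̂1, because the asymptotic standard deviation of λ̂1 depends on the unknown λ1 itself.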
