Intermediate R - Principal Component Analysis
is large.
[Figure: scatter plots with axes X1 and X2]
Y1 = A11X1 + A21X2 + … + Ap1Xp
Y1 = A11X1 + A21X2
Y2 = A12X1 + A22X2
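The two-variable case above can be sketched in R. This is only an illustration under stated assumptions: the data are simulated (hypothetical values, not from the slides), the coefficients A are taken from the eigenvectors of the sample covariance matrix, and the data are centred before forming Y1 and Y2, which shifts the scores but does not change their variances.

```r
# Hypothetical two-variable data; the values are simulated for illustration.
set.seed(1)
X1 <- rnorm(50, mean = 16, sd = 1)
X2 <- 0.8 * X1 + rnorm(50, sd = 0.5)
X  <- cbind(X1, X2)

# Eigen-decomposition of the sample covariance matrix.
e <- eigen(cov(X))
A <- e$vectors                 # column k holds the coefficients (A1k, A2k)

# Centre the data, then form Y1 and Y2 as linear combinations of X1 and X2.
Xc <- scale(X, center = TRUE, scale = FALSE)
Y  <- Xc %*% A
colnames(Y) <- c("Y1", "Y2")

var(Y[, "Y1"])                 # matches e$values[1]
var(Y[, "Y2"])                 # matches e$values[2]
cor(Y[, "Y1"], Y[, "Y2"])      # essentially zero: the PCs are uncorrelated
```

Y1 carries the larger variance, and Y1 and Y2 are uncorrelated by construction.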
Example
Y2 = A12X1 + A22X2 + A32X3 + A42X4 + A52X5 + A62X6 + A72X7
Big contributors to Y2
reproduced by p principal components
Var(Yi) = λi, where λ1 ≥ λ2 ≥ … ≥ λp ≥ 0
Proportion of total variance due to the kth PC = λk / (λ1 + λ2 + … + λp)
The coefficients of the X’s in the kth PC, [ A1k A2k … Aik … Apk ], form the eigenvector of the kth PC.
• The magnitude of Aij measures the importance of Xi to Yj.
• The sign of Aij denotes the direction of the contribution of Xi to Yj.
• PCA using Σ considers relative weights of the variables (larger variance ⇒ larger weight)
• PCA using ρ considers all variables with equal weights
Correlation matrix is recommended when:
• attributes have large variances
• scales have highly varied ranges
• different units of measurement
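The difference between PCA on Σ and PCA on ρ can be sketched with `prcomp()`, where `scale. = FALSE` works from the covariance matrix and `scale. = TRUE` from the correlation matrix. The `USArrests` dataset (shipped with base R) is an assumption made for illustration; it is not the data used in the slides.

```r
# USArrests ships with base R; its columns have very different variances.
sapply(USArrests, var)

pca_cov <- prcomp(USArrests, scale. = FALSE)  # PCA on the covariance matrix
pca_cor <- prcomp(USArrests, scale. = TRUE)   # PCA on the correlation matrix

# First-PC coefficients under each choice.
round(pca_cov$rotation[, 1], 3)
round(pca_cor$rotation[, 1], 3)
```

In these data Assault has by far the largest variance, so it dominates the first covariance-based component, while the correlation-based component spreads its weight more evenly across the variables.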
Eigenvalue-eigenvector pairs:
λ1 = 5.83 e1’ = [0.383 -0.924 0]
λ2 = 2.00 e2’ = [0 0 1]
λ3 = 0.17 e3’ = [0.924 0.383 0]
Proportion of total variance due to the 2nd PC: λ2 / (λ1 + λ2 + λ3) = 2.00 / 8 = 0.25
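These pairs can be checked with `eigen()`. The slides do not show the covariance matrix, so the `Sigma` below is an assumption: it is one matrix whose eigen-decomposition matches the values listed, used here only as a sketch.

```r
# Assumed covariance matrix (not shown in the slides); its eigen-decomposition
# matches the eigenvalue-eigenvector pairs listed above.
Sigma <- matrix(c( 1, -2, 0,
                  -2,  5, 0,
                   0,  0, 2), nrow = 3, byrow = TRUE)

e <- eigen(Sigma)
round(e$values, 2)     # eigenvalues, in decreasing order
round(e$vectors, 3)    # columns are eigenvectors; signs may be flipped

# Proportion of total variance due to each PC.
prop <- e$values / sum(e$values)
round(prop, 2)
```

The second entry of `prop` corresponds to the 2.00 / 8 = 0.25 calculation above. An eigenvector is only determined up to sign, so a column may appear multiplied by -1.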
PCA in R
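A minimal sketch of a PCA workflow in R using `prcomp()` from the built-in stats package. The `iris` dataset is an assumption chosen because it ships with R; the slides may have used different data or a different function (for example `princomp()`).

```r
# iris ships with base R; column 5 (Species) is a factor, so keep columns 1-4.
X <- iris[, 1:4]

# Centre and scale the variables, i.e. PCA on the correlation matrix.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

pca$sdev^2       # eigenvalues (variances of the PCs)
pca$rotation     # eigenvectors (coefficients of the X's in each PC)
head(pca$x)      # PC scores for the first few observations
summary(pca)     # proportion and cumulative proportion of variance
```

With scaled variables the eigenvalues sum to the number of variables, so each eigenvalue divided by 4 is the proportion of variance for that component.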
PCA Plot
Another Example
biplot()
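A short sketch of how `biplot()` might be called on a `prcomp` result. The `USArrests` dataset and the argument values are assumptions for illustration, not the example from the slides.

```r
# PCA on the correlation matrix of a built-in dataset (assumed example).
pca <- prcomp(USArrests, scale. = TRUE)

# Points are observations (PC1 vs PC2 scores); arrows are the variables.
# scale = 0 leaves the arrows proportional to the loadings.
biplot(pca, scale = 0, cex = 0.6)
```

Arrows pointing in similar directions indicate variables that contribute similarly to the first two components.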
Thank you!