
02 Principal Components

Uploaded by

Ronit Bhatia
Principal Components Analysis (PCA)

Data Science Modelling (SOST30062)

Week 9

Dr András Vörös, Department of Social Statistics


Motivation
We have a number of correlated variables in our data
• they partly measure similar things
Can the information they carry be expressed in fewer variables?
→ identify principal components

Uses of the approach:


• Explore data
• Visualise data
• Define predictors for supervised learning methods
Finding the first principal component
Aim: find the linear combination of the p variables that has the largest
variance (~ captures the most information)

→ first principal component


$$Z_1 = \phi_{11} X_1 + \phi_{21} X_2 + \cdots + \phi_{p1} X_p, \quad \text{where} \quad \sum_{j=1}^{p} \phi_{j1}^2 = 1$$

The $\phi_{j1}$ are called the “loadings” of the variables on the first PC
(their relative weights on the component)
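As a minimal sketch of this definition, assuming NumPy and a small made-up data matrix `X` (not taken from the lecture), the loadings of the first PC can be obtained as the eigenvector of the sample covariance matrix that belongs to the largest eigenvalue. The variable names `phi1` and `z1` are illustrative only.

```python
import numpy as np

# Toy data (an assumption for illustration): n = 6 observations, p = 3 variables.
X = np.array([
    [2.0, 1.9, 0.5],
    [0.5, 0.7, 1.5],
    [1.5, 1.4, 0.2],
    [3.0, 3.2, 1.0],
    [1.0, 0.8, 0.9],
    [2.5, 2.4, 1.2],
])

Xc = X - X.mean(axis=0)        # centre each variable (mean 0)
S = Xc.T @ Xc / Xc.shape[0]    # covariance matrix with divisor n

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
phi1 = eigvecs[:, -1]          # loadings of the first PC (largest eigenvalue)

z1 = Xc @ phi1                 # value of Z1 for every observation

print("loadings:", phi1)
print("sum of squared loadings:", np.sum(phi1 ** 2))
```

The sign of the loading vector is arbitrary: multiplying all loadings by -1 gives a component with the same variance.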
Maximising the variance
We need to maximise the following across n observations:
$$\mathrm{Var}(Z_1) = \frac{1}{n} \sum_{i=1}^{n} (z_{i1} - \bar{z}_1)^2$$
where $z_{i1} = \phi_{11} x_{i1} + \phi_{21} x_{i2} + \cdots + \phi_{p1} x_{ip}$

If the variables are centred (have means of 0), then $\bar{z}_1 = 0$ and this simplifies:

$$\mathrm{Var}(Z_1) = \frac{1}{n} \sum_{i=1}^{n} z_{i1}^2$$

$z_{i1}$ is called the “score” of the ith observation on the first PC
(its value on the component as a variable)
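The simplification can be checked numerically. The sketch below uses plain Python, two made-up variables and an arbitrary unit-length loading vector (0.6, 0.8), all of which are assumptions for illustration. It computes the scores and compares the general variance formula with the centred one.

```python
# Made-up values for two variables (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x1)

# Centre both variables so that each has mean 0.
m1, m2 = sum(x1) / n, sum(x2) / n
c1 = [v - m1 for v in x1]
c2 = [v - m2 for v in x2]

# An arbitrary loading vector with phi11^2 + phi21^2 = 1 (not the optimal one).
phi11, phi21 = 0.6, 0.8

# Scores z_i1 of each observation on this component.
z = [phi11 * a + phi21 * b for a, b in zip(c1, c2)]

z_bar = sum(z) / n
var_general = sum((zi - z_bar) ** 2 for zi in z) / n   # formula with the mean
var_centred = sum(zi ** 2 for zi in z) / n             # simplified formula

print(z_bar, var_general, var_centred)
```

Because the variables are centred, the mean score is zero (up to floating-point error) and the two variance formulas agree.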
The other components
The second principal component has the largest variance among all
linear combinations that are uncorrelated with the first PC

The third is uncorrelated with the first two

And so on…

In a coordinate system: uncorrelated components = orthogonal loading vectors
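The link between uncorrelated components and orthogonal directions can be illustrated with a short sketch. It assumes NumPy and simulated correlated data (the mixing matrix and the seed are arbitrary choices, not from the lecture), and obtains all components at once from the eigenvectors of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated correlated data (an assumption): 200 observations, 3 variables.
A = rng.normal(size=(200, 3))
X = A @ np.array([[1.0, 0.8, 0.3],
                  [0.0, 0.6, 0.5],
                  [0.0, 0.0, 0.4]])

Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / Xc.shape[0]

eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]   # sort components by decreasing variance
Phi = eigvecs[:, order]             # columns: loading vectors of PC1, PC2, PC3
Z = Xc @ Phi                        # columns: scores on PC1, PC2, PC3

# Loading vectors are orthogonal and of unit length: Phi' Phi is close to the identity.
print(np.round(Phi.T @ Phi, 6))
# Scores are uncorrelated: off-diagonal correlations are close to 0.
print(np.round(np.corrcoef(Z, rowvar=False), 6))
```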


Interpretation of principal components
Two PCs based on arrests for three
crime types and the urban population of
US states

Arrows: loading vectors on 2 PCs
(axes top and right)
• PC1: crime, PC2: urbanisation
State labels: scores on 2 PCs
(axes bottom and left)
• different state profiles by crime and urbanisation
Scaling the variables matters
Units of measurement matter

Variables with larger variance will be more important in PCs

Solution: scale them, so all variables have variance = 1
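The effect of scaling can be sketched as follows, assuming NumPy and two simulated correlated variables, one of which is recorded in units 1000 times smaller (so its variance is far larger). The helper `first_pc_loadings` is a hypothetical name introduced for this illustration.

```python
import numpy as np

def first_pc_loadings(X):
    """Loadings of the first PC: top eigenvector of the covariance matrix (divisor n)."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / Xc.shape[0]
    _, eigvecs = np.linalg.eigh(S)   # eigenvalues are returned in ascending order
    return eigvecs[:, -1]

rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = 0.5 * a + rng.normal(size=500)

# Second variable multiplied by 1000, as if measured in a much smaller unit.
X = np.column_stack([a, 1000.0 * b])

raw = first_pc_loadings(X)                      # unscaled variables
scaled = first_pc_loadings(X / X.std(axis=0))   # each variable scaled to variance 1

print("loadings without scaling:", np.round(raw, 4))
print("loadings with scaling:   ", np.round(scaled, 4))
```

Without scaling, the first PC is almost entirely the high-variance variable. After scaling, both variables receive loadings of equal magnitude, which is always the case for two standardised variables.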
Choosing the number of components
How many PCs?
Proportion of variance explained helps to choose

Scree plots → look for the “elbow”: a drop in variance explained

Subjective choice
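The proportion of variance explained is each component's variance divided by the total variance. The sketch below uses hypothetical component variances (made-up numbers, not results from the lecture) and an 80% cumulative threshold, which is only one possible rule of thumb and an assumption here. The values in `pve` are what a scree plot would display.

```python
from itertools import accumulate

# Hypothetical variances of four PCs, in decreasing order (illustration only).
pc_variances = [2.5, 1.0, 0.3, 0.2]

total = sum(pc_variances)
pve = [v / total for v in pc_variances]   # proportion of variance explained
cumulative = list(accumulate(pve))        # cumulative proportion

for k, (p, c) in enumerate(zip(pve, cumulative), start=1):
    print(f"PC{k}: PVE = {p:.3f}, cumulative = {c:.3f}")

# One possible (subjective) rule: keep the fewest PCs reaching the threshold.
threshold = 0.8   # assumed value, not prescribed by the lecture
n_keep = next(k for k, c in enumerate(cumulative, start=1) if c >= threshold)
print("components kept:", n_keep)
```

In this made-up example the variance explained drops sharply after the second component, which is where an “elbow” would appear.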