Cheat Sheet: PCA Dimensionality Reduction
What is PCA?
• Given a dataset, PCA finds a new set of orthogonal feature vectors (dimensions) chosen so that
the spread of the data along each feature vector is as large as possible
• Ranks the feature vectors in decreasing order of data spread (variance)
• The datapoints have maximum variance along the first feature vector and minimum variance
along the last feature vector
• The variance of the datapoints along a feature vector can be treated as a measure of the
information in that direction
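To make the "variance as information" idea concrete, here is a minimal sketch that reports the fraction of total variance along each new direction. The helper name explained_variance_ratio and the synthetic two-feature data are illustrative assumptions, not part of the original cheat sheet; the singular values of the centered data are used as one common way to obtain the variances.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance along each new direction, largest first.

    Hypothetical helper for illustration; X has shape (n_samples, n_features).
    """
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s**2 / (X.shape[0] - 1)              # variance along each direction
    return var / var.sum()

# Synthetic data (assumed): the second feature is mostly a scaled copy of the first.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=500)])

ratio = explained_variance_ratio(X)
print(ratio)  # the first direction should carry most of the variance
```

Because the two features are strongly correlated, nearly all of the spread lies along a single direction, which is the situation where dropping the last feature vector loses little information.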
Steps
1. Standardize the datapoints
2. Find the covariance matrix of the standardized datapoints
3. Carry out eigenvalue decomposition of the covariance matrix
4. Sort the eigenvalues in decreasing order and reorder the eigenvectors to match
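The four steps listed above can be sketched with NumPy as follows. This is one possible reading under stated assumptions: the function name pca_steps and the synthetic data are hypothetical, every feature is assumed to have non-zero standard deviation, and the sample (ddof=1) convention is used for both the standardization and the covariance.

```python
import numpy as np

def pca_steps(X):
    """Hypothetical helper: X has shape (n_samples, n_features)."""
    # 1. Standardize: zero mean and unit variance per feature
    #    (assumes no feature is constant, so std is non-zero).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # 2. Covariance matrix of the standardized data (features in columns).
    C = np.cov(Z, rowvar=False)

    # 3. Eigenvalue decomposition; eigh suits symmetric matrices and
    #    returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)

    # 4. Sort eigenvalues in decreasing order and reorder the
    #    eigenvectors (stored as columns) to match.
    order = np.argsort(eigvals)[::-1]
    return Z, eigvals[order], eigvecs[:, order]

# Synthetic, correlated 3-feature data (assumed for illustration).
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mix

Z, eigvals, eigvecs = pca_steps(X)
print(eigvals)   # variances along the new feature vectors, largest first
```

Each column of eigvecs is one new feature vector, and the matching entry of eigvals is the variance of the standardized data along it.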
Figure 1, Figure 2: [plots not recovered; axis labels: Feature # 1 (F1), F2, New Feature # 1, New Feature # 2, Variance]
Source: https://www.cheatsheets.aqeel-anwar.com