
Principal Component Analysis — A numerical approach
Dr. Dhanya NM · 5 min read · Sep 21, 2023

In this blog, we are going to see how PCA works with a numerical example.
Principal Component Analysis (PCA) is a dimensionality reduction technique
commonly used in machine learning and data analysis to reduce the number
of features while preserving as much information as possible. It achieves
this by finding the principal components of the data, which are orthogonal
vectors that capture the most significant variations in the data.

Note: In this example, all values are rounded to 2 decimal places.

Steps for PCA

1. Data Preparation:

Collect and preprocess your data, centering it by subtracting the mean of each feature.

2. Covariance Matrix Computation:

Compute the covariance matrix of your centered data. The covariance matrix describes the relationships between the different features in your dataset.

3. Eigenvalue and Eigenvector Computation:

Calculate the eigenvalues and eigenvectors of the covariance matrix. These represent the magnitude and direction of the principal components: the eigenvalues indicate the variance explained by each principal component, and the eigenvectors define the direction of these components.

4. Sort Eigenvalues:

Sort the eigenvalues in descending order. The largest eigenvalues correspond to the most significant principal components.

5. Select Principal Components:

Decide how many principal components to keep. You can select a specific
number or choose based on the explained variance (e.g., keep
components that explain 95% of the variance).

6. Projection onto Principal Components:

Project your original data onto the selected principal components to obtain a new set of features (variables) that are linear combinations of the original features.
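Before working through the numbers by hand, here is how the six steps map onto NumPy. This is a minimal sketch using a small hypothetical 2-D dataset; the article's actual data points appear only in its figures and are not reproduced here.

import numpy as np

# Hypothetical 2-D dataset with 4 points (not the article's data,
# which appears only in the original figures).
X = np.array([[4.0, 11.0],
              [8.0, 4.0],
              [13.0, 5.0],
              [7.0, 14.0]])

# Step 1: center the data by subtracting the mean of each feature.
Xc = X - X.mean(axis=0)

# Step 2: covariance matrix (rowvar=False: columns are features).
S = np.cov(Xc, rowvar=False)

# Step 3: eigenvalues and eigenvectors (eigh, since S is symmetric).
eigvals, eigvecs = np.linalg.eigh(S)

# Step 4: sort eigenvalues (and matching eigenvectors) in descending order.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 5: keep the first principal component. Alternatively, choose k
# so the cumulative explained variance reaches, say, 95%:
#   k = np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), 0.95) + 1
k = 1
U = eigvecs[:, :k]

# Step 6: project the centered data onto the selected component(s).
PC = Xc @ U
print(np.round(PC, 2))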

Step — 1 Collect the data

The following are the data points or features given. Let us start with 4 data points.

Step — 2 Covariance Matrix Computation

The final covariance matrix is the following.
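As a reminder of what each entry is (the general formula, not specific to the article's numbers), for two features x and y observed over n samples:

\mathrm{cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
S = \begin{pmatrix} \mathrm{var}(x) & \mathrm{cov}(x, y) \\ \mathrm{cov}(x, y) & \mathrm{var}(y) \end{pmatrix}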

Step — 3 Eigenvalue and Eigenvector Computation

Next we will compute the eigenvalues of this covariance matrix. The characteristic equation of the covariance matrix is the following. To find the roots of the quadratic equation we can apply the following formula.

By applying the formula, we will get two roots, and these are the eigenvalues.
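In general, for a 2×2 covariance matrix S with diagonal entries a and c and off-diagonal entry b (a sketch of the algebra; the article's specific coefficients appear in its figures), the characteristic equation expands to a quadratic whose roots, found via the quadratic formula, are the two eigenvalues:

\det(S - \lambda I) = (a - \lambda)(c - \lambda) - b^2 = \lambda^2 - (a + c)\lambda + (ac - b^2) = 0

\lambda_{1,2} = \frac{(a + c) \pm \sqrt{(a + c)^2 - 4(ac - b^2)}}{2}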

The next step is to calculate the eigenvectors. Recall the characteristic equation for finding the eigenvalues (λ) of the covariance matrix:

det(S − λI) = 0

where:
S is the covariance matrix.

λ (lambda) represents the eigenvalues.

I is the identity matrix of the same dimensions as S.

Substituting each eigenvalue λ back into (S − λI)U and equating it to the zero vector, we will get the following.

Taking the first equation into consideration:

Considering the value of t as 1, the U vector, or eigenvector, will become
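As a general sketch of this step (using the same 2×2 notation as above and assuming the off-diagonal entry b is nonzero; the article's concrete numbers are in its figures), the first row of (S − λI)U = 0 pins down U up to the free parameter t:

(a - \lambda)u_1 + b\,u_2 = 0
\quad\Rightarrow\quad
U = t \begin{pmatrix} 1 \\ \frac{\lambda - a}{b} \end{pmatrix}, \quad t \in \mathbb{R}

Setting t = 1 simply picks one representative eigenvector from this family.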

Step — 4 Sort Eigenvalues

Whenever we want to calculate the principal components, we have to consider the largest eigenvalue; here it is 8.37. To obtain a unit eigenvector (an eigenvector with a magnitude of 1), we first calculate the length of U1, then normalize by dividing each component by that magnitude (Euclidean norm).
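In symbols (the standard normalization; the article's numeric values are in its figures):

\hat{U}_1 = \frac{U_1}{\lVert U_1 \rVert}, \qquad \lVert U_1 \rVert = \sqrt{u_1^2 + u_2^2}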

The eigenvector below corresponds to Principal Component 2; if we are calculating PC2, this is the eigenvector to consider.

Step — 5 Select First Principal Component

The equation to calculate the principal component is as follows.

Applying the above equation to the first eigenvector, we will get the following.
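In general form (the same relation the article's figure shows, written here with generic symbols), each projected value is the dot product of the unit eigenvector with a mean-centered data point:

\mathrm{PC1}_i = \hat{U}_1^{\top}(x_i - \bar{x})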
Step — 6 Projection onto Principal Components

Applying the equation to each data point, the final answer is the following.

The final result can be visualized approximately like this.
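As a quick cross-check of the whole walkthrough, a library implementation should reproduce this kind of projection up to sign. This sketch again uses hypothetical data rather than the article's:

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data, not the article's (its points appear only in figures).
X = np.array([[4.0, 11.0], [8.0, 4.0], [13.0, 5.0], [7.0, 14.0]])

pca = PCA(n_components=1)        # keep only the first principal component
pc1 = pca.fit_transform(X)       # centers X internally, then projects

print(np.round(pc1.ravel(), 2))              # projected values
print(np.round(pca.components_, 2))          # unit eigenvector (up to sign)
print(np.round(pca.explained_variance_, 2))  # largest eigenvalue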
