
Principal Component Analysis
The Universal Linear Transformations
It turns out that scaling and rotation transformations are all we need in order to understand any linear transformation
A superbly powerful result in linear algebra assures us that every linear transformation can be expressed as a composition of two rotation transformations and one scaling transformation
This result is known as the singular value decomposition (SVD) theorem and it underlies a very useful ML technique known as principal component analysis (PCA)
Singular Value Decomposition
Let $A \in \mathbb{R}^{n \times n}$ be any (possibly non-symmetric) square matrix. Then we can always write $A$ as a product of three matrices
$A = U \Sigma V^\top$
where $U, V \in \mathbb{R}^{n \times n}$ are orthonormal matrices and $\Sigma$ is a scaling (diagonal) matrix with all entries non-negative
Caution: the third matrix in the product is $V^\top$, not $V$, i.e. the SVD is $U\Sigma V^\top$ and not $U\Sigma V$
Thus, every linear map is simply a rotation (+ possibly axes flips, swaps), followed by axes scaling, followed by another rotation (+ axes flips/swaps)
(Figure: an example map shown as a rotation, an axes scaling, and another rotation.)
Singular Value Decomposition
For example, consider the following four different but equivalent SVDs for the same matrix $A$ (figure: four equivalent factorizations shown on the slide)
Diagonal elements $\sigma_i$ of $\Sigma$ are called the singular values of $A$ (always $\geq 0$)
Column vectors of $U$ are called the left singular vectors of $A$
Column vectors of $V$ (row vecs of $V^\top$) are called the right singular vectors of $A$
Writing $A = U\Sigma V^\top$ as $A = \sum_i \sigma_i\mathbf{u}^i(\mathbf{v}^i)^\top$, note that each matrix $\mathbf{u}^i(\mathbf{v}^i)^\top$ has unit (row and column) rank
Can you find an expression for the rows of $A$ too?
Singular values of a matrix are always unique, singular vectors are not
In order to minimize this ambiguity, people commonly write the SVD such that $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n \geq 0$
Given one set of left+right singular vectors, we can obtain other sets of left+right singular vectors using "certain" orthonormal maps – details a bit tedious
Tons of things to study about singular value decompositions – too little time
Singular Value Decomposition
SVD is defined even for matrices that are not square
Suppose $A \in \mathbb{R}^{m \times n}$; then we can always write $A = U\Sigma V^\top$ where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthonormal matrices of different sizes
$\Sigma \in \mathbb{R}^{m \times n}$ is a rectangular diagonal matrix with $\Sigma_{ii} \geq 0$ but $\Sigma_{ij} = 0$ for $i \neq j$
Case 1: $m < n$, i.e. output dim < input dim, i.e. $A$ reduces the dimension of vectors
In this case $\Sigma$ throws out the last $n - m$ dimensions (after $\mathbf{x}$ has been rotated by $V^\top$), in addition to scaling the rest
(Figure: $A\mathbf{x} = U\Sigma V^\top\mathbf{x}$ shown as rotation by $V^\top$, then scaling + dimension reduction by $\Sigma$, then rotation by $U$.)
Singular Value Decomposition
SVD is defined even for matrices that are not square
Suppose $A \in \mathbb{R}^{m \times n}$; then we can always write $A = U\Sigma V^\top$ where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthonormal matrices of different sizes
$\Sigma \in \mathbb{R}^{m \times n}$ is a rectangular diagonal matrix with $\Sigma_{ii} \geq 0$ but $\Sigma_{ij} = 0$ for $i \neq j$
Case 2: $m > n$, i.e. output dim > input dim, i.e. $A$ increases the dimension of vectors
In this case $\Sigma$ adds $m - n$ dummy (zero) dimensions, in addition to scaling
(Figure: $A\mathbf{x} = U\Sigma V^\top\mathbf{x}$ shown as rotation by $V^\top$, then scaling + zero-padding by $\Sigma$, then rotation by $U$.)
Rank
Getting the SVD of a matrix immediately tells us a lot
Rank: We always have $\mathrm{rank}(A) =$ the number of non-zero entries in $\Sigma$
To see why, notice that if some diagonal entries of $\Sigma$ are zero, we can remove those rows and columns of $\Sigma$ and the corresponding columns of $U$ and $V$
If $A \in \mathbb{R}^{m \times n}$ and only $r$ entries of $\Sigma$ are non-zero, then we can equivalently write $A = \tilde U\tilde\Sigma\tilde V^\top$ where $\tilde U \in \mathbb{R}^{m \times r}$, $\tilde\Sigma \in \mathbb{R}^{r \times r}$ and $\tilde V \in \mathbb{R}^{n \times r}$. This is often called the thin SVD of $A$
Note that this automatically shows that row rank and column rank are the same: $A^\top = V\Sigma^\top U^\top$ is also an SVD, and its rank is clearly the same as that of $A$ since the number of non-zeros in $\Sigma$ does not change at all
Finally, notice that $A = \tilde U W$ where $W = \tilde\Sigma\tilde V^\top$, and thus the column space of $A$ is the span of the columns of $\tilde U$
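A quick numerical check of these facts (a minimal sketch, not from the slides; the matrix, the seed and the 1e-10 threshold are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a 6 x 4 matrix of rank 2 as a product of a 6x2 and a 2x4 matrix
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

U, s, Vt = np.linalg.svd(A)              # full SVD: U is 6x6, Vt is 4x4
r = int(np.sum(s > 1e-10))               # rank = number of non-zero singular values
print(r, np.linalg.matrix_rank(A))       # both report 2

# Thin SVD: keep only the r columns/rows that matter
U_t, S_t, Vt_t = U[:, :r], np.diag(s[:r]), Vt[:r, :]
print(np.allclose(A, U_t @ S_t @ Vt_t))  # True: thin SVD reconstructs A exactly
```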
Trace and Determinant
Trace and determinant are defined only for square matrices
Trace: For a square matrix $A \in \mathbb{R}^{n \times n}$, $\mathrm{tr}(A) = \sum_{i=1}^n A_{ii}$
Can use the SVD to get another expression for the trace
Trace has a funny property: $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ for any compatible $A, B$
Trace also satisfies linearity properties: if $C = A + B$, then we have $\mathrm{tr}(C) = \mathrm{tr}(A) + \mathrm{tr}(B)$, and $\mathrm{tr}(cA) = c\cdot\mathrm{tr}(A)$
This gives us $\mathrm{tr}(A) = \mathrm{tr}(U\Sigma V^\top) = \mathrm{tr}(\Sigma V^\top U)$
Determinant: For a square matrix $A$ with SVD $A = U\Sigma V^\top$, $|\det(A)| = \det(\Sigma) = \prod_i \sigma_i$, since $\det(U), \det(V) = \pm 1$ for orthonormal matrices
Sign of the determinant depends on how many axes did $U, V$ flip/swap
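A minimal numerical sketch of these two facts (assumed example, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Cyclic property of the trace: tr(AB) = tr(BA)
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))        # True

# |det(A)| equals the product of the singular values of A
s = np.linalg.svd(A, compute_uv=False)
print(np.isclose(abs(np.linalg.det(A)), np.prod(s)))       # True
```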
Inverses
A square matrix is invertible iff all its singular values are non-zero
A singular value being zero means that $A$ squishes vectors along some direction to the origin. This means that multiple input vectors map to the same output vector, which means that we cannot undo the linear map of $A$
Note that this means $A \in \mathbb{R}^{n \times n}$ is invertible iff $\mathrm{rank}(A) = n$
If $A = U\Sigma V^\top$ is invertible, then we always have $A^{-1} = V\Sigma^{-1}U^\top$
Nice because the inverse of a diagonal matrix is simply its element-wise inverse
If $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$, then $\Sigma^{-1} = \mathrm{diag}(1/\sigma_1, \dots, 1/\sigma_n)$
Recall that we do insist $\sigma_i \neq 0$, i.e. $\sigma_i > 0$, and so $1/\sigma_i$ is well defined
Indeed $AA^{-1} = U\Sigma V^\top V\Sigma^{-1}U^\top = U\Sigma\Sigma^{-1}U^\top = UU^\top = I$
Verify that $A^{-1}A = I$ as well
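A minimal sketch of inverting via the SVD (assumed example; a random square matrix is full rank with high probability):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

U, s, Vt = np.linalg.svd(A)
A_inv = Vt.T @ np.diag(1.0 / s) @ U.T         # A^{-1} = V Sigma^{-1} U^T

print(np.allclose(A_inv, np.linalg.inv(A)))   # True
print(np.allclose(A @ A_inv, np.eye(5)))      # True
```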
Pseudo Inverse
Even if a matrix is not invertible (even if $A$ is not square), we can still define the Moore-Penrose pseudo-inverse of $A$
If $A = U\Sigma V^\top$, then $A^\dagger = V\Sigma^\dagger U^\top$, where we define $\Sigma^\dagger_{ii} = 1/\Sigma_{ii}$ if $\Sigma_{ii} \neq 0$, else $\Sigma^\dagger_{ii} = 0$
The pseudo-inverse satisfies a few nice properties
If $A$ is indeed invertible, then we can show that $A^\dagger = A^{-1}$
If $A \in \mathbb{R}^{m \times n}$ with $\mathrm{rank}(A) = n$, then $A^\dagger A = I_n$, i.e. $A^\dagger$ acts as identity for the columns of $A$. This means that $A^\dagger A\mathbf{x} = \mathbf{x}$ for all $\mathbf{x}$. However, this gives us that if $\mathbf{y} = A\mathbf{x}$, then $A^\dagger\mathbf{y}$ is a vector such that $A(A^\dagger\mathbf{y}) = \mathbf{y}$. Thus, $A^\dagger$ does invert the map of $A$ by sending the output of the map to an input which does indeed generate that very output
In fact, even if there exists no $\mathbf{x}$ s.t. $A\mathbf{x} = \mathbf{y}$, even then $A^\dagger\mathbf{y}$ returns a valid solution which has the smallest error $\|A\mathbf{x} - \mathbf{y}\|_2$ and, among all vectors with this smallest error, is the one with the smallest L2 norm
Indeed, the pseudo-inverse always gives us the "least squares" solution: if $\hat{\mathbf{x}} = A^\dagger\mathbf{y}$, then it can be shown that $\hat{\mathbf{x}}$ is the vector with the smallest L2 norm among those for which $\|A\hat{\mathbf{x}} - \mathbf{y}\|_2$ is smallest
Least squares?? Does this have anything to do with the least squares we did in regression? It sure does.
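A minimal sketch connecting the pseudo-inverse to least squares (assumed example; the tall matrix and vector below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((10, 3))      # tall matrix: Ax = y has no exact solution in general
y = rng.standard_normal(10)

# Pseudo-inverse from the (thin) SVD: A^+ = V diag(1/sigma) U^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

x_pinv = A_pinv @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(A_pinv, np.linalg.pinv(A)))   # True
print(np.allclose(x_pinv, x_lstsq))             # True: same least-squares solution
```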
Eigenvectors and Eigenvalues
Can be considered quite enigmatic when taught in an MTH course
ML perspective is a bit different/easier (MTH people may dislike this – sorry)
SVD is clearly an extremely important (central even) concept in LinAlg and ML
Eigenblah can be seen as simply a handy way of obtaining the SVD of a matrix
Consider a feature matrix $X \in \mathbb{R}^{n \times d}$ – want to find its SVD $X = U\Sigma V^\top$
Instead, consider $X^\top X \in \mathbb{R}^{d \times d}$
Nice, as this matrix is square, symmetric and lets us focus on finding $V$ – how?
Let $X = U\Sigma V^\top$ (recall that the columns of $U, V$ are orthonormal). Then we have
$X^\top X = V\Sigma^\top U^\top U\Sigma V^\top = V(\Sigma^\top\Sigma)V^\top$
where $\Sigma^\top\Sigma$ is diagonal. Thus, $X^\top X$ merely scales the right singular vectors of $X$: $(X^\top X)\mathbf{v}^i = \sigma_i^2\mathbf{v}^i$
To handle $U$, simply take $XX^\top$ instead
Eigenvectors and Eigenvalues
For a square symmetric matrix $A$, an eigenpair for this matrix is a pair of a vector $\mathbf{v}$ and a scalar $\lambda$ such that $A\mathbf{v} = \lambda\mathbf{v}$
Caution: $A\mathbf{v}$ is mat-vec multiplication whereas $\lambda\mathbf{v}$ is scalar multiplication
The vector in the eigenpair (eigenvector) is merely scaled by the scalar in the pair (eigenvalue) and not rotated etc
Eigenvalues may be negative or zero too
In general, matrices have an infinite number of such eigenpairs – whenever $(\mathbf{v}, \lambda)$ is an eigenpair, so is $(c\mathbf{v}, \lambda)$ for any $c \neq 0$
To get around this issue, people usually define eigenvectors to be only unit vectors $\mathbf{v}$ such that $A\mathbf{v} = \lambda\mathbf{v}$. Even then, there is still some ambiguity as $\mathbf{v}$ and $-\mathbf{v}$ would both be eigenvectors
Recall that we commented that even the singular vectors are not really unique, since we can always generate new ones by flipping the sign etc., so it is not surprising that eigenvectors are not unique either, since right singular vectors of $X$ are eigenvectors of $X^\top X$
The concept of eigenpairs makes sense for even non-symmetric (but still square) matrices, but we will only need the symmetric case in ML
Eigenpairs and Singular Triplets
We have seen how right singular vectors of $X$ are eigenvectors of $X^\top X$ and left singular vectors of $X$ are eigenvectors of $XX^\top$
It turns out that the square roots of the eigenvalues of $X^\top X$ give us all the non-zero singular values of $X$
In fact, the square roots of the eigenvalues of $XX^\top$ also do the same thing, because $X^\top X$ and $XX^\top$ share non-zero eigenvalues
Proof: We have $X^\top X = V(\Sigma^\top\Sigma)V^\top$ and $XX^\top = U(\Sigma\Sigma^\top)U^\top$. We defined $\Sigma^\top\Sigma$, which is a $d \times d$ diagonal matrix, and $\Sigma\Sigma^\top$, which is an $n \times n$ diagonal matrix. It is easy to verify the following simple facts
$\Sigma^\top\Sigma$ and $\Sigma\Sigma^\top$ share their non-zero (diagonal) entries
Those diagonal entries are exactly the eigenvalues corresponding to the eigenvectors $\mathbf{v}^i$ (resp. $\mathbf{u}^i$)
Those diagonal entries are exactly the squares of the singular values of $X$
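A minimal numerical sketch of these facts (assumed example; the sign alignment step is needed only because eigenvectors are unique up to a sign flip):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((50, 4))                # a small feature matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)
evals, evecs = np.linalg.eigh(X.T @ X)          # eigh: ED of a symmetric matrix
evals, evecs = evals[::-1], evecs[:, ::-1]      # sort largest-first, like the SVD

print(np.allclose(evals, s**2))                 # eigenvalues = squared singular values
signs = np.sign(np.sum(evecs * Vt.T, axis=0))   # align signs column by column
print(np.allclose(evecs * signs, Vt.T))         # eigenvectors = right singular vectors
```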
Definiteness
A square symmetric matrix is PSD if and only if (aka iff) none of its eigenvalues is negative. We use the term Positive Definite (PD) to refer to matrices all of whose eigenvalues are strictly positive
Note that, equivalently, $A$ is PSD iff $\mathbf{x}^\top A\mathbf{x} \geq 0$ for every $\mathbf{x}$, and PD iff $\mathbf{x}^\top A\mathbf{x} > 0$ unless $\mathbf{x} = \mathbf{0}$, in which case obviously we must have $\mathbf{x}^\top A\mathbf{x} = 0$
We are now ready to see when exactly a square symmetric matrix is PSD. We will restrict ourselves to only square symmetric matrices
The non-symmetric case gets a bit complicated due to diagonalizability issues
Can you see that a square symmetric matrix has one or more zero eigenvalues iff it is not full rank? A full rank square symmetric matrix will have only non-zero eigenvalues
Claim without proof: every square symmetric matrix $A$ can be written as $A = V\Lambda V^\top$ where $V$ is an orthonormal matrix and $\Lambda$ is a diagonal matrix
This gives $\mathbf{x}^\top A\mathbf{x} = \sum_i \lambda_i(\mathbf{x}^\top\mathbf{v}^i)^2$: if all $\lambda_i \geq 0$, then it is easy to see that we always have $\mathbf{x}^\top A\mathbf{x} \geq 0$, i.e. $A$ is PSD; if even one $\lambda_i < 0$, then take $\mathbf{x} = \mathbf{v}^i$ to get $\mathbf{x}^\top A\mathbf{x} = \lambda_i < 0$, i.e. $A$ is not PSD
It is illuminating to see this work when the square symmetric matrix is $X^\top X$ or $XX^\top$: all eigenvalues of these matrices are squares of singular values of $X$
As before, we can write $X = U\Sigma V^\top$ where $U, V$ are orthonormal and $\Sigma$ is (rectangular) diagonal. It is easy to see that $(\mathbf{v}^i, \sigma_i^2)$ are eigenpairs for $X^\top X$. Thus, in every case, we have
1. All its eigenvalues must be non-negative, i.e. $X^\top X$ is always PSD
2. If $n \geq d$, then $X^\top X$ can have a zero eigenvalue iff $X$ has a zero singular value. This means that $X^\top X$ is full rank iff $X$ is full rank
3. If $n \leq d$, then $XX^\top$ can have a zero eigenvalue iff $X$ has a zero singular value. This means that $XX^\top$ is full rank iff $X$ is full rank
Singular Blah vs Eigenblah
Singular Value Decomposition (SVD): write a (rect) matrix as $A = U\Sigma V^\top$
Eigendecomposition (ED): write a square symm matrix as $A = V\Lambda V^\top$
Be aware of the differences between SVD and ED – do not get confused
SVD always exists, no matter whether the matrix is square/rect, symm/non-symm
ED always exists for square symm mats, may not exist (may require complex $\Lambda$ or non-orthonormal $V$) for non-symm mats. ED does not make sense for rect mats
Singular values are always non-negative, eigenvalues can be pos/neg/complex
Symmetric square matrices always have real eigenvalues. It is only in the non-symmetric case that funny things start happening. Fortunately, in most ML situations, whenever we encounter square matrices, they are symmetric too
Rotation matrices (and orthonormal matrices in general) are where the difference between SVD and ED is most stark
Rotation matrices in general do not (actually cannot) have any eigenvectors. This is because they rotate every vector (except $\mathbf{0}$, which is a trivial case) so no vector is transformed with just a simple scaling. Thus, they have no ED. However, they do have an SVD (all matrices do) and actually they are their own SVD, i.e. for a rotation matrix $R$ we may take $U = R$, $\Sigma = I$ and $V = I$
However, the identity matrix (which is also a rotation matrix – rotation by $0$ degrees) is a bit weird in that every vector is its eigenvector since $I\mathbf{v} = 1 \cdot \mathbf{v}$. The identity matrix has the same SVD and ED and that is $I = I \cdot I \cdot I^\top$
For a PSD square symmetric matrix, its SVD is its ED and vice versa
For a non-PSD square symm. matrix, its ED can be used to obtain its SVD
Get the ED $A = V\Lambda V^\top$. Let $\Sigma = |\Lambda|$ (element-wise absolute values). Then let $\mathbf{u}^i = \mathbf{v}^i$ if $\lambda_i \geq 0$ and $\mathbf{u}^i = -\mathbf{v}^i$ otherwise (or else flip the sign of $\mathbf{v}^i$ instead) to get the SVD $A = U\Sigma V^\top$
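A minimal sketch of this construction (assumed example):

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                       # symmetric, but generally not PSD

lam, V = np.linalg.eigh(A)              # ED: A = V diag(lam) V^T, lam may be negative
U = V * np.where(lam >= 0, 1.0, -1.0)   # flip columns whose eigenvalue is negative
Sigma = np.diag(np.abs(lam))

print(np.allclose(A, U @ Sigma @ V.T))  # True: a valid SVD of A
print(np.allclose(np.sort(np.abs(lam)),
                  np.sort(np.linalg.svd(A, compute_uv=False))))  # same singular values
```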
Principal Component Analysis
The largest singular value (resp. eigenvalue) of a matrix is called its leading singular value (resp. eigenvalue)
The corresponding left/right singular vector (resp. eigenvector) is called its leading left/right singular vector (resp. eigenvector)
In some cases, there may be more than one singular vector with the same singular value (resp. more than one eigenvector with the same eigenvalue)
Principal Component Analysis: the process of finding the top few singular values and corresponding singular vectors (left + right)
Given $X \in \mathbb{R}^{n \times d}$ (assume $n > d$; the case $n < d$ is similar) with SVD $X = \tilde U\Sigma V^\top$ where $\tilde U \in \mathbb{R}^{n \times d}$, $\Sigma \in \mathbb{R}^{d \times d}$ and $V \in \mathbb{R}^{d \times d}$: $\tilde U$ and $V$ both have orthonormal columns, $V$ is square but $\tilde U$ is not!
Note: we dropped the last $n - d$ columns of $U$, which were useless, to get $\tilde U$
Careful: since we have dropped some columns of $U$, we need to be careful. When $U$ was square, i.e. $n \times n$, its columns were orthonormal as were its rows, i.e. $U^\top U = UU^\top = I_n$. However, now that we have removed some columns, whereas the remaining columns are still orthonormal, i.e. $\tilde U^\top\tilde U = I_d$, the rows are not necessarily orthonormal, i.e. $\tilde U\tilde U^\top$ may not equal $I_n$
Want to find the leading $k$ triplets, i.e. $(\sigma_i, \mathbf{u}^i, \mathbf{v}^i)$ for $i = 1, \dots, k$, for some $k \leq d$
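A minimal sketch of this process (assumed example; the centering step and the choice k = 3 are illustrative, not prescribed by the slides):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((200, 10))          # n = 200 points in d = 10 dimensions
X = X - X.mean(axis=0)                      # centre the data (common before PCA)

k = 3
U, s, Vt = np.linalg.svd(X, full_matrices=False)
sigma_k, U_k, V_k = s[:k], U[:, :k], Vt[:k].T    # leading k singular triplets

Z = X @ V_k                  # n x k representation: project onto top right singular vectors
X_approx = Z @ V_k.T         # best rank-k approximation of X
print(Z.shape, np.linalg.norm(X - X_approx))
```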
Principal Component Analysis
Suppose we wish to find the leading right singular vector of $X$
Same as finding the leading eigenvector of $X^\top X$
To find the leading left singular vector of $X$, find the leading eigenvector of $XX^\top$ instead
Denote $A = X^\top X$
Recall that $A = V\Lambda V^\top$ with $\lambda_i = \sigma_i^2$, and that we reorder things so that $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_d \geq 0$
Assume for sake of simplicity that $\lambda_1 \geq c\cdot\lambda_2$ for some $c > 1$
This is sometimes called an eigengap or even leading eigengap
Easier to see the algorithms at work with an eigengap – will handle the other case later
Caution: textbooks might write the eigengap additively, as $\lambda_1 - \lambda_2 \geq \gamma$ for some $\gamma > 0$ – same thing
Note: $A^s = V\Lambda^s V^\top$ where $\Lambda^s = \mathrm{diag}(\lambda_1^s, \dots, \lambda_d^s)$
Similarly, convince yourself that $A^s\mathbf{v}^i = \lambda_i^s\mathbf{v}^i$ for any $i$
Will use this curious fact very efficiently to find the leading eigenpair
The Power Method
Note that if $A = V\Lambda V^\top$, then $A^s = V\Lambda^s V^\top$
Even if $\lambda_1/\lambda_2$ is only slightly larger than $1$, with a large enough $s$ the gap blows up, e.g. $(\lambda_1/\lambda_2)^s$ becomes huge
Thus, with large $s$, the leading eigenvalue really stands out!
Let us take a vector $\mathbf{u} \in \mathbb{R}^d$ and let $\boldsymbol{\alpha} = V^\top\mathbf{u}$ (i.e. $\mathbf{u} = V\boldsymbol{\alpha}$) to get
$A^s\mathbf{u} = V\Lambda^s V^\top\mathbf{u} = V\Lambda^s\boldsymbol{\alpha} = \alpha_1\lambda_1^s\mathbf{v}^1 + \alpha_2\lambda_2^s\mathbf{v}^2 + \dots + \alpha_d\lambda_d^s\mathbf{v}^d$
The vector $\boldsymbol{\alpha}$ represents $\mathbf{u}$ in terms of the columns of $V$, i.e. $\mathbf{u} = \sum_i \alpha_i\mathbf{v}^i$
Notice that since $V$ is orthonormal, we have $\|\boldsymbol{\alpha}\|_2 = \|\mathbf{u}\|_2$
However, we just saw that $\lambda_1^s$ dwarfs $\lambda_i^s$ for all $i > 1$
This means that $A^s\mathbf{u} \approx \alpha_1\lambda_1^s\mathbf{v}^1$, which means that $\frac{A^s\mathbf{u}}{\|A^s\mathbf{u}\|_2} \to \pm\mathbf{v}^1$ as $s \to \infty$
How do I find the leading eigenvalue? Find $\mathbf{v}^1$ and then use the fact that $(\mathbf{v}^1)^\top A\mathbf{v}^1 = \lambda_1$ is the eigenvalue of $A$ corresponding to $\mathbf{v}^1$
Hmm … this means if our approximation of $\mathbf{v}^1$ is not good, then our approximation of $\lambda_1$ won't be good either
Note that $\alpha_1$ must be non-zero so that the $\mathbf{v}^1$ term survives. If $\alpha_1 = 0$, then the $\mathbf{v}^1$ component of $A^s\mathbf{u}$ is zero as well, which means we will never recover the vector $\mathbf{v}^1$. The longer you run, i.e. the larger the $s$, the better the approximation you will get
We obtained our approximation of $\mathbf{v}^1$. How should we choose $\mathbf{u}$? Will any $\mathbf{u}$ work? How should we choose $s$?
The Power Method
THE POWER METHOD
1. Input: square symmetric matrix $A \in \mathbb{R}^{d \times d}$
2. Initialize $\mathbf{u}^0$ randomly, e.g. $\mathbf{u}^0 \sim \mathcal{N}(\mathbf{0}, I_d)$ – this ensures with high probability that $\alpha_1 \neq 0$
3. For $t = 1, 2, \dots, s$
   1. Let $\mathbf{z}^t = A\mathbf{u}^{t-1}$
   2. Let $\mathbf{u}^t = \mathbf{z}^t / \|\mathbf{z}^t\|_2$
4. Return leading eigenvector estimate as $\hat{\mathbf{v}}^1 = \mathbf{u}^s$
5. Return leading eigenvalue estimate as $\hat\lambda_1 = (\hat{\mathbf{v}}^1)^\top A\hat{\mathbf{v}}^1$
This calculates $A^s\mathbf{u}^0$ (up to normalization) using $s$ mat-vec multiplications instead of first calculating $A^s$ itself, which would need far costlier matrix-matrix multiplications
Good to periodically normalize (as in step 3.2) to prevent overflow errors. Can show that this doesn't affect the working of the algo in any way, since only the direction of $\mathbf{u}^t$ matters
The Power Method is fast – guaranteed to return a good estimate in at most a modest number of iterations (proof beyond CS771)
In settings with no eigengap, it turns out that there is an entire subspace (i.e. infinitely many eigenvectors) corresponding to the largest eigenvalue
To find smaller eigenpairs, we "peel" off the largest eigenpair we have found and repeat the process
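A minimal runnable sketch of the method above (the function name power_method and the fixed iteration count are my choices, not the slides'):

```python
import numpy as np

def power_method(A, n_iters=200, rng=None):
    """Estimate the leading eigenpair of a square symmetric matrix A."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(A.shape[0])   # random init: alpha_1 != 0 with high probability
    for _ in range(n_iters):
        z = A @ u                         # one mat-vec multiplication per iteration
        u = z / np.linalg.norm(z)         # normalize to prevent overflow
    lam = u @ A @ u                       # eigenvalue estimate from the eigenvector estimate
    return lam, u

# Quick check against numpy's eigendecomposition
rng = np.random.default_rng(8)
X = rng.standard_normal((100, 6))
A = X.T @ X                               # square symmetric (PSD) test matrix
lam1, v1 = power_method(A, rng=rng)
evals, evecs = np.linalg.eigh(A)
print(np.isclose(lam1, evals[-1]))                 # leading eigenvalue matches
print(np.isclose(abs(v1 @ evecs[:, -1]), 1.0))     # eigenvector matches up to sign
```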
The Peeling Method
THE PEELING METHOD
1. Input: square symmetric matrix $A \in \mathbb{R}^{d \times d}$, number of eigenpairs $k$
2. Initialize $A^1 = A$
3. For $j = 1, 2, \dots, k$
   1. Let $(\hat\lambda_j, \hat{\mathbf{v}}^j) = \text{PowerMethod}(A^j)$
   2. Let $A^{j+1} = A^j - \hat\lambda_j\hat{\mathbf{v}}^j(\hat{\mathbf{v}}^j)^\top$ (the "peeling" step)
4. Return $(\hat\lambda_1, \hat{\mathbf{v}}^1), \dots, (\hat\lambda_k, \hat{\mathbf{v}}^k)$
Takes overall time proportional to $k$ runs of the Power Method to return the top $k$ leading eigenpairs of $A$
After the leading eigenpair is peeled off, the eigenpair with the second largest eigenvalue becomes the new leading pair and the Power Method can now recover this
Some residue might still be left due to inaccurate estimation of $(\hat\lambda_j, \hat{\mathbf{v}}^j)$, but it is usually small if $s$ is sufficiently large
$A = \lambda_1\mathbf{v}^1(\mathbf{v}^1)^\top + \lambda_2\mathbf{v}^2(\mathbf{v}^2)^\top + \lambda_3\mathbf{v}^3(\mathbf{v}^3)^\top + \lambda_4\mathbf{v}^4(\mathbf{v}^4)^\top + \dots$
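A minimal runnable sketch of the peeling idea (the names power_method and peeling_method are mine; the power-method sketch is repeated here so the block is self-contained):

```python
import numpy as np

def power_method(A, n_iters=500, rng=None):
    # Same power-method sketch as before: estimate the leading eigenpair of symmetric A
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(A.shape[0])
    for _ in range(n_iters):
        z = A @ u
        u = z / np.linalg.norm(z)
    return u @ A @ u, u

def peeling_method(A, k, n_iters=500, rng=None):
    """Top-k eigenpairs of a square symmetric A by repeatedly peeling off the leading pair."""
    rng = np.random.default_rng() if rng is None else rng
    A_j = A.copy()
    pairs = []
    for _ in range(k):
        lam, v = power_method(A_j, n_iters=n_iters, rng=rng)
        pairs.append((lam, v))
        A_j = A_j - lam * np.outer(v, v)   # the "peeling" step
    return pairs

# Quick check against numpy on a small symmetric PSD matrix
rng = np.random.default_rng(9)
X = rng.standard_normal((100, 5))
A = X.T @ X
top2 = peeling_method(A, k=2, rng=rng)
print([lam for lam, _ in top2])
print(np.sort(np.linalg.eigvalsh(A))[::-1][:2])    # should closely match the two values above
```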
