lec14
Principal Component Analysis
The Universal Linear Transformations
It turns out that scaling and rotation transformations are all we need in order to understand any linear transformation.
A superbly powerful result in linear algebra assures us
that every linear transformation can be expressed as a
composition of two rotation transformations and one
scaling transformation
This result is known as the singular value
decomposition (SVD) theorem and it underlies a very
useful ML technique known as principal component
analysis (PCA)
Singular Value Decomposition
Let $A$ be any (possibly non-symmetric) square matrix. Then we can always write $A$ as a product of three matrices:
$$A = U \Sigma V^\top,$$
where $U$ and $V$ are orthonormal (rotation) matrices and $\Sigma$ is a diagonal (scaling) matrix.
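To see this concretely, here is a minimal NumPy sketch (the $3 \times 3$ matrix is just an assumed random example): np.linalg.svd returns the three factors, and we can check that $U$ and $V$ are orthonormal and that they reconstruct $A$.

import numpy as np

# Small, randomly chosen square matrix (illustrative input, not from the lecture)
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# np.linalg.svd returns U, the diagonal of Sigma (as a vector), and V^T
U, s, Vt = np.linalg.svd(A)
Sigma = np.diag(s)

# U and V are orthonormal: U^T U = I and V^T V = I (rotations, possibly with a reflection)
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# The three factors reconstruct A (up to floating-point error)
assert np.allclose(A, U @ Sigma @ Vt)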
Singular Value Decomposition
SVD is defined even for matrices that are not square.
Suppose $A \in \mathbb{R}^{m \times n}$; then we can always write $A = U \Sigma V^\top$, where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthonormal matrices of different sizes, and $\Sigma \in \mathbb{R}^{m \times n}$ is a rectangular diagonal matrix with $\Sigma_{ii} \ge 0$ but $\Sigma_{ij} = 0$ for $i \neq j$.
Case 2: $m > n$, i.e. output dim > input dim, so $A$ increases the dimension of vectors. In this case, in addition to scaling $V^\top \mathbf{x}$, $\Sigma$ adds dummy dimensions, since its last $m - n$ rows are all zero:
$$A = U \Sigma V^\top, \qquad \Sigma = \begin{bmatrix} \tilde{\Sigma} \\ 0 \end{bmatrix},$$
where $\tilde{\Sigma} \in \mathbb{R}^{n \times n}$ is diagonal and the bottom block is the $(m - n) \times n$ zero matrix.
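To make the shapes concrete when the matrix is not square, here is a small NumPy sketch (the $4 \times 2$ matrix is an assumed example): the full SVD returns square $U$ and $V$ of different sizes, and $\Sigma$ has to be assembled as a rectangular diagonal matrix whose extra rows are zero.

import numpy as np

# Illustrative tall matrix: m = 4 output dims, n = 2 input dims, so m > n (Case 2)
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 2))

# Full SVD: U is m x m, Vt is n x n, s holds the n singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Assemble the rectangular diagonal Sigma: diagonal block on top, m - n zero rows below
Sigma = np.zeros((4, 2))
Sigma[:2, :2] = np.diag(s)

# The zero rows of Sigma are the "dummy dimensions" appended to the scaled V^T x
assert np.allclose(A, U @ Sigma @ Vt)
print(U.shape, Sigma.shape, Vt.shape)   # (4, 4) (4, 2) (2, 2)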
Getting the SVD of a matrix immediately tells us a lot.
Rank: We always have $\operatorname{rank}(A) =$ number of non-zero entries in $\Sigma$.
Note that this automatically shows that row rank and column rank are the same: $A^\top = V \Sigma^\top U^\top$ is itself an SVD, and the column rank of this matrix is clearly the same as that of $A$, since the number of non-zeros in $\Sigma$ does not change at all.
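As a quick numerical check of the rank fact, here is a sketch with a deliberately rank-deficient example matrix (an assumed input): counting the non-zero singular values agrees with np.linalg.matrix_rank.

import numpy as np

# Rank-deficient matrix: a 4x3 product of rank-2 factors (illustrative example)
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# Singular values only; the "zero" ones appear as tiny floating-point noise
s = np.linalg.svd(A, compute_uv=False)

# rank(A) = number of non-zero (here: non-negligible) entries of Sigma
rank_from_svd = int(np.sum(s > 1e-10))
assert rank_from_svd == np.linalg.matrix_rank(A) == 2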
To see why the rank claim holds, notice that if some diagonal entries of $\Sigma$ are zero, we can remove those rows and columns of $\Sigma$ and the corresponding columns of $U$ and $V$ without changing the product:
$$A = U \Sigma V^\top = \hat{U} \hat{\Sigma} \hat{V}^\top.$$
If only $r$ entries of $\Sigma$ are non-zero, then we can equivalently write $A$ as $A = \hat{U} \hat{\Sigma} \hat{V}^\top$, where $\hat{U} \in \mathbb{R}^{m \times r}$, $\hat{\Sigma} \in \mathbb{R}^{r \times r}$, and $\hat{V} \in \mathbb{R}^{n \times r}$. This is often called the thin SVD of $A$.
Finally, notice that $A = \hat{U} B$ where $B = \hat{\Sigma} \hat{V}^\top$, and thus the column space of $A$ is the span of the $r$ orthonormal columns of $\hat{U}$, which is why $\operatorname{rank}(A) = r$.
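In NumPy, np.linalg.svd(A, full_matrices=False) computes the economy-size factors; the sketch below (reusing the assumed rank-2 example) keeps only the $r$ components with non-zero singular values to form the thin SVD and verifies the reconstruction.

import numpy as np

# Same illustrative rank-2 matrix as in the rank check above
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# "Economy" SVD: U_hat is m x min(m, n), Vt_hat is min(m, n) x n
U_hat, s, Vt_hat = np.linalg.svd(A, full_matrices=False)

# Keep only the r components with non-zero singular values: the thin SVD
r = int(np.sum(s > 1e-10))
U_r, S_r, Vt_r = U_hat[:, :r], np.diag(s[:r]), Vt_hat[:r, :]

# A is recovered exactly, so the column space of A is spanned by U_r's r columns
assert np.allclose(A, U_r @ S_r @ Vt_r)
print(U_r.shape, S_r.shape, Vt_r.shape)   # (4, 2) (2, 2) (2, 3)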
Trace and Determinant
Trace and determinant are defined only for square
matrices
Trace: For a square matrix $A \in \mathbb{R}^{d \times d}$, $\operatorname{tr}(A) = \sum_{i=1}^{d} A_{ii}$, the sum of the diagonal entries.
We can use the SVD to get another expression for the trace.
Trace has a funny property: $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ for any $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$.
Trace also satisfies linearity properties: if $A, B \in \mathbb{R}^{d \times d}$ and $c \in \mathbb{R}$, then we have $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(cA) = c \cdot \operatorname{tr}(A)$.
This gives us $\operatorname{tr}(A) = \operatorname{tr}(U \Sigma V^\top) = \operatorname{tr}(\Sigma V^\top U)$.
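A small NumPy sketch (with randomly chosen matrices as assumed inputs) verifying the cyclic and linearity properties numerically:

import numpy as np

# Randomly chosen rectangular matrices (illustrative inputs)
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))

# Cyclic property: tr(AB) = tr(BA), even though AB is 3x3 and BA is 4x4
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# Linearity: tr(C + D) = tr(C) + tr(D) and tr(cC) = c * tr(C) for square C, D
C = rng.standard_normal((3, 3))
D = rng.standard_normal((3, 3))
c = 2.5
assert np.isclose(np.trace(C + D), np.trace(C) + np.trace(D))
assert np.isclose(np.trace(c * C), c * np.trace(C))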