
A Gentle Introduction to Singular-Value Decomposition for Machine Learning

by Jason Brownlee on February 26, 2018 in Linear Algebra

Matrix decomposition, also known as matrix factorization, involves describing a given matrix using its constituent elements.

Perhaps the best known and most widely used matrix decomposition method is the Singular-Value Decomposition, or SVD. All matrices have an SVD, which makes it more stable than other methods, such as the eigendecomposition. As such, it is used in a wide range of applications, including compression, denoising, and data reduction.

In this tutorial, you will discover the Singular-Value Decomposition method for
decomposing a matrix into its constituent elements.

After completing this tutorial, you will know:

 What Singular-value decomposition is and what is involved.
 How to calculate an SVD and reconstruct a rectangular and square matrix from the SVD elements.
 How to calculate the pseudoinverse and perform dimensionality reduction using the SVD.

Let's get started.

 Update Mar/2018: Fixed typo in reconstruction. Changed V in code to VT for clarity. Fixed typo
in the pseudoinverse equation.
A Gentle Introduction to Singular-Value Decomposition
Photo by Chris Heald, some rights reserved.

Tutorial Overview
This tutorial is divided into 5 parts; they are:

1. Singular-Value Decomposition
2. Calculate Singular-Value Decomposition
3. Reconstruct Matrix from SVD
4. SVD for Pseudoinverse
5. SVD for Dimensionality Reduction


Singular-Value Decomposition
The Singular-Value Decomposition, or SVD for short, is a matrix decomposition method
for reducing a matrix to its constituent parts in order to make certain subsequent matrix
calculations simpler.
For simplicity, we will focus on the SVD for real-valued matrices and ignore the case of complex numbers.

A = U . Sigma . V^T

Where A is the real m x n matrix that we wish to decompose, U is an m x m matrix, Sigma (often represented by the uppercase Greek letter Σ) is an m x n diagonal matrix, and V^T is the transpose of an n x n matrix V, where T is a superscript.

The Singular Value Decomposition is a highlight of linear algebra.

— Page 371, Introduction to Linear Algebra, Fifth Edition, 2016.


The diagonal values in the Sigma matrix are known as the singular values of the original
matrix A. The columns of the U matrix are called the left-singular vectors of A, and the
columns of V are called the right-singular vectors of A.

The SVD is calculated via iterative numerical methods. We will not go into the details of
these methods. Every rectangular matrix has a singular value decomposition, although
the resulting matrices may contain complex numbers and the limitations of floating point
arithmetic may cause some matrices to fail to decompose neatly.

The singular value decomposition (SVD) provides another way to factorize a matrix, into
singular vectors and singular values. The SVD allows us to discover some of the same
kind of information as the eigendecomposition. However, the SVD is more generally
applicable.

— Pages 44-45, Deep Learning, 2016.
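A quick sketch of this point, using NumPy's linalg functions and an arbitrary 3×2 matrix: the rectangular matrix has an SVD with the factor shapes described above, while an eigendecomposition is not even defined for it.

# Sketch: the SVD exists for a rectangular matrix, the eigendecomposition does not
from numpy import array
from numpy.linalg import svd, eig, LinAlgError

# an arbitrary 3 x 2 (rectangular) matrix
A = array([[1, 2], [3, 4], [5, 6]])

# full SVD: U is m x m, s holds the min(m, n) singular values, VT is n x n
U, s, VT = svd(A)
print(U.shape, s.shape, VT.shape)  # (3, 3) (2,) (2, 2)

# the eigendecomposition is only defined for square matrices
try:
    eig(A)
except LinAlgError as err:
    print('eig() failed:', err)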


The SVD is widely used both in the calculation of other matrix operations, such as the matrix inverse, and as a data reduction method in machine learning. SVD can also be used in least squares linear regression, image compression, and denoising data.

The singular value decomposition (SVD) has numerous applications in statistics, machine learning, and computer science. Applying the SVD to a matrix is like looking inside it with X-ray vision…

— Page 297, No Bullshit Guide To Linear Algebra, 2017


Calculate Singular-Value Decomposition
The SVD can be calculated by calling the svd() function from scipy.linalg.

The function takes a matrix and returns the U, Sigma and V^T elements. The Sigma
diagonal matrix is returned as a vector of singular values. The V matrix is returned in a
transposed form, e.g. V.T.

The example below defines a 3×2 matrix and calculates the Singular-value
decomposition.

# Singular-value decomposition
from numpy import array
from scipy.linalg import svd
# define a matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# SVD
U, s, VT = svd(A)
print(U)
print(s)
print(VT)

Running the example first prints the defined 3×2 matrix, then the 3×3 U matrix, the 2-element Sigma vector, and the 2×2 V^T matrix calculated from the decomposition.

[[1 2]
 [3 4]
 [5 6]]

[[-0.2298477   0.88346102  0.40824829]
 [-0.52474482  0.24078249 -0.81649658]
 [-0.81964194 -0.40189603  0.40824829]]

[ 9.52551809  0.51430058]

[[-0.61962948 -0.78489445]
 [-0.78489445  0.61962948]]

Reconstruct Matrix from SVD


The original matrix can be reconstructed from the U, Sigma, and V^T elements.

The U, s, and V^T elements returned from svd() cannot be multiplied together directly.
The s vector must be converted into a diagonal matrix using the diag() function. By
default, this function will create a square matrix that is m x m, relative to our original
matrix. This causes a problem as the size of the matrices do not fit the rules of matrix
multiplication, where the number of columns in a matrix must match the number of rows
in the subsequent matrix.

After creating the square Sigma diagonal matrix, the sizes of the matrices are relative to
the original m x n matrix that we are decomposing, as follows:

U (m x m) . Sigma (m x m) . V^T (n x n)

Where, in fact, we require:

U (m x m) . Sigma (m x n) . V^T (n x n)

We can achieve this by creating a new Sigma matrix of all zero values that is m x n (e.g.
more rows) and populate the first n x n part of the matrix with the square diagonal matrix
calculated via diag().

# Reconstruct SVD
from numpy import array
from numpy import diag
from numpy import dot
from numpy import zeros
from scipy.linalg import svd
# define a matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# Singular-value decomposition
U, s, VT = svd(A)
# create m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with n x n diagonal matrix
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(VT))
print(B)

Running the example first prints the original matrix, then the matrix reconstructed from
the SVD elements.

[[1 2]
 [3 4]
 [5 6]]

[[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]]

The above complication with the Sigma diagonal only exists when m and n are not equal. The diagonal matrix can be used directly when reconstructing a square matrix, as follows.

# Reconstruct SVD
from numpy import array
from numpy import diag
from numpy import dot
from scipy.linalg import svd
# define a matrix
A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)
# Singular-value decomposition
U, s, VT = svd(A)
# create n x n Sigma matrix
Sigma = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(VT))
print(B)

Running the example prints the original 3×3 matrix and the version reconstructed directly
from the SVD elements.

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]]

SVD for Pseudoinverse


The pseudoinverse is the generalization of the matrix inverse for square matrices to rectangular matrices, where the numbers of rows and columns are not equal.

It is also called the Moore-Penrose Inverse, after two independent discoverers of the method, or the Generalized Inverse.

Matrix inversion is not defined for matrices that are not square. […] When A has more
columns than rows, then solving a linear equation using the pseudoinverse provides one
of the many possible solutions.

— Page 46, Deep Learning, 2016.


The pseudoinverse is denoted as A^+, where A is the matrix that is being inverted and +
is a superscript.

The pseudoinverse is calculated using the singular value decomposition of A:

A^+ = V . D^+ . U^T

Or, without the dot notation:

A^+ = VD^+U^T

Where A^+ is the pseudoinverse, D^+ is the pseudoinverse of the diagonal matrix Sigma, and U^T is the transpose of U.

We can get U and V from the SVD operation.

A = U . Sigma . V^T

The D^+ can be calculated by creating a diagonal matrix from Sigma, calculating the
reciprocal of each non-zero element in Sigma, and taking the transpose if the original
matrix was rectangular.

Sigma = (s11,   0,   0
           0, s22,   0
           0,   0, s33)

D^+ = (1/s11,     0,     0
           0, 1/s22,     0
           0,     0, 1/s33)

The pseudoinverse provides one way of solving the linear regression equation,
specifically when there are more rows than there are columns, which is often the case.

NumPy provides the function pinv() for calculating the pseudoinverse of a rectangular
matrix.

The example below defines a 4×2 matrix and calculates the pseudoinverse.

# Pseudoinverse
from numpy import array
from numpy.linalg import pinv
# define matrix
A = array([
	[0.1, 0.2],
	[0.3, 0.4],
	[0.5, 0.6],
	[0.7, 0.8]])
print(A)
# calculate pseudoinverse
B = pinv(A)
print(B)

Running the example first prints the defined matrix, and then the calculated
pseudoinverse.
[[ 0.1  0.2]
 [ 0.3  0.4]
 [ 0.5  0.6]
 [ 0.7  0.8]]

[[ -1.00000000e+01  -5.00000000e+00   9.04289323e-15   5.00000000e+00]
 [  8.50000000e+00   4.50000000e+00   5.00000000e-01  -3.50000000e+00]]
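As noted above, the pseudoinverse gives one way to solve the linear regression (least squares) problem when there are more rows than columns. A minimal sketch, reusing the same matrix A with an assumed target vector y (illustrative values only): the least squares solution of A . x = y is x = A^+ . y.

# Sketch: least squares via the pseudoinverse (target values are illustrative)
from numpy import array
from numpy.linalg import pinv

# overdetermined system: 4 equations, 2 unknowns
A = array([
	[0.1, 0.2],
	[0.3, 0.4],
	[0.5, 0.6],
	[0.7, 0.8]])
# assumed target vector, constructed so the solution is approximately [1, 2]
y = array([0.5, 1.1, 1.7, 2.3])
# least squares solution x = A^+ . y
x = pinv(A).dot(y)
print(x)  # expect approximately [ 1.  2.]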

We can calculate the pseudoinverse manually via the SVD and compare the results to
the pinv() function.

First we must calculate the SVD. Next we must calculate the reciprocal of each value in the s array. Then the s array can be transformed into a diagonal matrix with added rows of zeros to make it rectangular. Finally, we can calculate the pseudoinverse from the elements.

The specific implementation is:

A^+ = V . D^+ . U^T

The full example is listed below.

# Pseudoinverse via SVD
from numpy import array
from numpy.linalg import svd
from numpy import zeros
from numpy import diag
# define matrix
A = array([
	[0.1, 0.2],
	[0.3, 0.4],
	[0.5, 0.6],
	[0.7, 0.8]])
print(A)
# calculate svd
U, s, VT = svd(A)
# reciprocals of s
d = 1.0 / s
# create m x n D matrix
D = zeros(A.shape)
# populate D with n x n diagonal matrix
D[:A.shape[1], :A.shape[1]] = diag(d)
# calculate pseudoinverse
B = VT.T.dot(D.T).dot(U.T)
print(B)

Running the example first prints the defined rectangular matrix and the pseudoinverse
that matches the above results from the pinv() function.

[[ 0.1  0.2]
 [ 0.3  0.4]
 [ 0.5  0.6]
 [ 0.7  0.8]]

[[ -1.00000000e+01  -5.00000000e+00   9.04831765e-15   5.00000000e+00]
 [  8.50000000e+00   4.50000000e+00   5.00000000e-01  -3.50000000e+00]]

SVD for Dimensionality Reduction


A popular application of SVD is for dimensionality reduction.

Data with a large number of features, such as more features (columns) than observations (rows), may be reduced to a smaller subset of features that are most relevant to the prediction problem.

The result is a matrix with a lower rank that is said to approximate the original matrix.

To do this we can perform an SVD operation on the original data and select the top k largest singular values in Sigma. These columns can be selected from Sigma and the rows selected from V^T.

An approximation B of the original matrix A can then be reconstructed.

B = U . Sigma_k . V^T_k

In natural language processing, this approach can be used on matrices of word occurrences or word frequencies in documents and is called Latent Semantic Analysis or Latent Semantic Indexing.
In practice, we can retain and work with a descriptive subset of the data called T. This is
a dense summary of the matrix or a projection.

T = U . Sigma_k

Further, this transform can be calculated and applied to the original matrix A as well as other similar matrices.

T = A . V_k

Here V_k (the transpose of V^T_k) contains the first k columns of V, which corresponds to the A.dot(VT.T) step in the code below.

The example below demonstrates data reduction with the SVD.

First a 3×10 matrix is defined, with more columns than rows. The SVD is calculated and
only the first two features are selected. The elements are recombined to give an
accurate reproduction of the original matrix. Finally the transform is calculated two
different ways.

from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define a matrix
A = array([
	[1,2,3,4,5,6,7,8,9,10],
	[11,12,13,14,15,16,17,18,19,20],
	[21,22,23,24,25,26,27,28,29,30]])
print(A)
# Singular-value decomposition
U, s, VT = svd(A)
# create m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with m x m diagonal matrix
Sigma[:A.shape[0], :A.shape[0]] = diag(s)
# select
n_elements = 2
Sigma = Sigma[:, :n_elements]
VT = VT[:n_elements, :]
# reconstruct
B = U.dot(Sigma.dot(VT))
print(B)
# transform
T = U.dot(Sigma)
print(T)
T = A.dot(VT.T)
print(T)

Running the example first prints the defined matrix then the reconstructed
approximation, followed by two equivalent transforms of the original matrix.

[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]

[[  1.   2.   3.   4.   5.   6.   7.   8.   9.  10.]
 [ 11.  12.  13.  14.  15.  16.  17.  18.  19.  20.]
 [ 21.  22.  23.  24.  25.  26.  27.  28.  29.  30.]]

[[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]

[[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]

The scikit-learn library provides a TruncatedSVD class that implements this capability directly.

When creating the TruncatedSVD class, you must specify the number of desired features or components to select, e.g. 2. Once created, you can fit the transform (i.e. calculate V^T_k) by calling the fit() function, then apply it to the original matrix by calling the transform() function. The result is the transform of A called T above.

The example below demonstrates the TruncatedSVD class.

from numpy import array
from sklearn.decomposition import TruncatedSVD
# define array
A = array([
	[1,2,3,4,5,6,7,8,9,10],
	[11,12,13,14,15,16,17,18,19,20],
	[21,22,23,24,25,26,27,28,29,30]])
print(A)
# svd
svd = TruncatedSVD(n_components=2)
svd.fit(A)
result = svd.transform(A)
print(result)

Running the example first prints the defined matrix, followed by the transformed version
of the matrix.

We can see that the values match those calculated manually above, except for the sign
on some values. We can expect there to be some instability when it comes to the sign
given the nature of the calculations involved and the differences in the underlying
libraries and methods used. This instability of sign should not be a problem in practice as
long as the transform is trained for reuse.

[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]

[[ 18.52157747   6.47697214]
 [ 49.81310011   1.91182038]
 [ 81.10462276  -2.65333138]]
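As mentioned above, the benefit of fitting the transform is that it can be reused on other matrices with the same columns. A minimal sketch, assuming a new matrix A2 with the same 10 columns (its values are illustrative only): the transform fit on A projects the new rows into the same 2-dimensional space.

# Sketch: reusing a fitted TruncatedSVD transform on new data (illustrative values)
from numpy import array
from sklearn.decomposition import TruncatedSVD
# original data
A = array([
	[1,2,3,4,5,6,7,8,9,10],
	[11,12,13,14,15,16,17,18,19,20],
	[21,22,23,24,25,26,27,28,29,30]])
# fit the transform on the original data
svd = TruncatedSVD(n_components=2)
svd.fit(A)
# new rows with the same 10 columns (assumed values for illustration)
A2 = array([
	[31,32,33,34,35,36,37,38,39,40],
	[41,42,43,44,45,46,47,48,49,50]])
# project the new rows into the same 2-dimensional space
T2 = svd.transform(A2)
print(T2)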

Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.

 Experiment with the SVD method on your own data.
 Research and list 10 applications of SVD in machine learning.
 Apply SVD as a data reduction technique on a tabular dataset.
If you explore any of these extensions, I’d love to know.

Further Reading
This section provides more resources on the topic if you are looking to go deeper.

Books
 Chapter 12, Singular-Value and Jordan Decompositions, Linear Algebra and Matrix Analysis for
Statistics, 2014.
 Chapter 4, The Singular Value Decomposition and Chapter 5, More on the SVD, Numerical
Linear Algebra, 1997.
 Section 2.4 The Singular Value Decomposition, Matrix Computations, 2012.
 Chapter 7 The Singular Value Decomposition (SVD), Introduction to Linear Algebra, Fifth
Edition, 2016.
 Section 2.8 Singular Value Decomposition, Deep Learning, 2016.
 Section 7.D Polar Decomposition and Singular Value Decomposition, Linear Algebra Done
Right, Third Edition, 2015.
 Lecture 3 The Singular Value Decomposition, Numerical Linear Algebra, 1997.
 Section 2.6 Singular Value Decomposition, Numerical Recipes: The Art of Scientific Computing,
Third Edition, 2007.
 Section 2.9 The Moore-Penrose Pseudoinverse, Deep Learning, 2016.
API
 numpy.linalg.svd() API
 numpy.matrix.H API
 numpy.diag() API
 numpy.linalg.pinv() API
 sklearn.decomposition.TruncatedSVD API
Articles
 Matrix decomposition on Wikipedia
 Singular-value decomposition on Wikipedia
 Singular value on Wikipedia
 Moore-Penrose inverse on Wikipedia
 Latent semantic analysis on Wikipedia
Summary
In this tutorial, you discovered the Singular-value decomposition method for
decomposing a matrix into its constituent elements.

Specifically, you learned:

 What Singular-value decomposition is and what is involved.
 How to calculate an SVD and reconstruct a rectangular and square matrix from SVD elements.
 How to calculate the pseudoinverse and perform dimensionality reduction using the SVD.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
