Matrix Computations
DISCLAIMER
While this document is believed to contain correct information, neither the author
nor sponsors, nor any of their employees, makes any warranty, express or
implied, or assumes any legal responsibility for the accuracy, completeness,
safety, or usefulness of any information, apparatus, product, or process
disclosed, or represents that its use would not infringe privately owned rights.
All software is free software. You can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later version.
Software is provided in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details: www.gnu.org/licenses.
Preface
The target audience for this book is the community of students, scientists, engineers and
social scientists who have an interest in applying matrix computations to solve problems
by writing custom applications. The applications themselves are myriad and include (but
are by no means limited to) signal processing, differential equations, image processing,
simultaneous equations, and linear regression. The typical reader will have a primary
interest in a specific application, and a resultant interest in learning at a functional
level the mathematics of matrix computations. The reader should have strong
mathematical and logical skills, but advanced coursework is not a prerequisite. The
reader's background should include a familiarity with one or more programming
languages. Programmers can easily adapt the C# code used in this book to other
languages.
There are a few very good texts that treat the mathematics of matrix
computations. These books provide algorithms, develop mathematical proofs, and discuss
at length the necessary techniques (and thinking) required to develop efficient, reliable
implementations. This book focuses on the implementation of established algorithms
using the C# programming language. It also provides a proof-free development of
concepts needed to understand the implementations, as well as several worked examples.
It is my hope that the reader can use the material provided herein and be up and
running with custom implementations in short order.
I recommend augmenting this book with a text on matrix computation
mathematics. One very complete and plainly presented example is Matrix Computations
3rd Edition by Gene H. Golub and Charles F. Van Loan (Johns Hopkins University Press,
Baltimore, 1996). I have relied heavily on this resource in writing this book.
Contents
1. Introduction
   Conventions and General Background
      Basic Definitions
      Introductory Vector and Matrix Terminology
      Representing Matrices in C#.NET Arrays
      Mathematical Operations
      Floating Point Numbers and Operations Counts
      Linear Systems in Matrix Form
2. Gaussian Elimination, the LU Factorization, and Determined Systems
   Gaussian Elimination
      Upper Triangulation Algorithms
      Lower Triangulation Algorithms
      Choosing a Pivot Strategy
   Determinants
   Inversion by Augmentation
   The LU Factorization
   Permutation Matrices and Encoded Permutation Matrices
   Determined Systems
      A General Practical Solution
      The Matrix Inverse Revisited
      Symmetric Positive Definite Systems, The Cholesky Factorization
3. The QR Factorization and Full Rank, Over Determined Systems
   The QR Factorization
      QR Factorization by Householder Reflection
      QR Factorization by Givens (Jacobi) Rotations
      Fast Givens QR Factorization
      QR Factorization by the Modified Gram-Schmidt Method
   Solving Full Rank, Over Determined Systems
      The Method of Normal Equations
      Householder and Givens QR Solutions
      Fast Givens Solutions
      Modified Gram-Schmidt Solutions
      An Example
   The Matrix Computation Utility Class
   Some Timing Experiments
4. QR Factorizations with Pivoting, Complete Orthogonalization and Rank Deficient, Over Determined Systems
   Rank Deficiency
   Factorization Methods
      Householder QR with Column Pivoting
      Givens QR with Column Pivoting
      The Complete Orthogonal Decomposition
   Obtaining Solutions With Pivoting QR Factorizations
1. Introduction
Scalars:
Traditionally, a scalar is a quantity that has magnitude but no direction.
Practically, a scalar is a single valued number.
Vectors:
Vectors are ordered sets of mathematical elements. In many applications they are
understood to have magnitude and direction. More abstractly, they have magnitude and
dimensional multiplicity.
Matrices:
A matrix is an array of mathematical elements having horizontal rows and vertical
columns. In this book, the number of rows is designated m and the number of columns is
n. If we restrict our consideration to real numbers, then we can represent the set of all
m-by-n matrices as R^(m x n). For complex systems we may designate the corresponding set
as C^(m x n). We designate a matrix with a capital letter. The elements of the matrix are
designated with appropriately subscripted lowercase letters:

    A = [ a11 ... a1n ]
        [ ...     ... ]
        [ am1 ... amn ],   A in R^(m x n)
The element aij resides in the ith row and the jth column of the matrix.
n -tuples:
An n-tuple is an ordered set of n elements. It is more general in sense than a
vector, but in relation to the subject matter at hand, we can use n-tuple and n-dimensional
vector interchangeably.
The terms point, line, plane and space are not rigorously defined. Our intuition
about these concepts works much better than the circular arguments that arise when we
try to constrain their meaning to definitions. As for extension beyond three (or four)
dimensions, any definitions are mostly philosophical.
The vector v has the length and direction shown. It has these properties without
reference to the coordinate system. We can represent the vector in terms of its Cartesian
components.
The dot product (or inner product) of two m-dimensional vectors v and u is

    <v, u> = sum(i = 1 to m) v(i)u(i).

The result is a scalar. The dot product of two vectors is zero if and only if the vectors
are orthogonal. Orthogonal vector pairs can intuitively be thought of as perpendicular.
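As a quick illustration, the dot product formula can be coded directly. The following sketch follows the book's 1-based array convention (element [0] is unused); the class and method names are illustrative, not part of the book's utility class:

```csharp
using System;

public static class VectorOps
{
    // Dot product of two m-dimensional vectors stored 1-based
    // (element [0] unused), following the book's array convention.
    public static double Dot(double[] v, double[] u)
    {
        int m = v.Length - 1;
        double sum = 0;
        for (int i = 1; i <= m; i++)
        {
            sum += v[i] * u[i];
        }
        return sum;
    }
}
```

A returned value of zero indicates the two vectors are orthogonal.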
Other Common Terms
A unit vector is a vector of length 1. A unit vector can be obtained from a vector
v using the formula:

    v_hat = v / ||v||2,   where ||v||2 = ( sum(i = 1 to n) v(i)^2 )^(1/2).

Unit vectors that are orthogonal are said to be orthonormal.
We can think of matrices as being an ordered set of column or row vectors. As a
result, the terminology of matrices is closely related to that of vectors. The column space
of a matrix is the vector space spanned by its columns. The row space of a matrix is the
vector space spanned by its rows. The rank of a matrix is the number of linearly
independent columns or rows. If Rank[A] = n, then the matrix has full column rank. If
Rank[A] < min(m, n), then the matrix is rank deficient. An n-by-n matrix A is singular if the
equation Ax = 0 has a solution when x is a non-zero n-dimensional vector. (We will
discuss in detail the meaning of Ax later.) Finally, the null space of A is the set of all
vectors x satisfying the equation Ax = 0.
Consider the following matrix:

    A = [ a11 ... a1n ]
        [ ...     ... ]
        [ am1 ... amn ],   A in R^(m x n)

and the matrix

    em = [ 1 0 ... 0 ]
         [ 0 1 ... 0 ]
         [ ...    ... ]
         [ 0 0 ... 1 ].

This matrix is designated the m-dimensional identity matrix. It is fundamental that
em A = A = A en (with identity matrices of conforming dimensions). In fact, we can
consider this our most basic matrix factorization.
Just as we can change the coordinate system (also termed a basis) for describing a
vector, we can re-express a matrix in an alternate basis. This process is a fundamental
concept in matrix factoring.
We saw where the Euclidean norm for vectors gave us a measure of magnitude. In
the same way, the Frobenius norm gives us a measure of magnitude for a matrix:

    ||A||F = ( sum(i = 1 to m) sum(j = 1 to n) a(i,j)^2 )^(1/2)
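The Frobenius norm translates to a short method. This is a minimal sketch, again in the book's 1-based convention; the class and method names are illustrative:

```csharp
using System;

public static class MatrixNorms
{
    // Frobenius norm of a 1-based m-by-n matrix (row 0 and column 0 unused):
    // the square root of the sum of the squares of all elements.
    public static double Frobenius(double[,] A)
    {
        int m = A.GetLength(0) - 1;
        int n = A.GetLength(1) - 1;
        double sum = 0;
        for (int i = 1; i <= m; i++)
        {
            for (int j = 1; j <= n; j++)
            {
                sum += A[i, j] * A[i, j];
            }
        }
        return Math.Sqrt(sum);
    }
}
```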
Therefore, in this book, with a couple of exceptions, vectors are treated as special-case
1-by-n (row vectors) or m-by-1 (column vectors) matrices.
A method is a procedure, either a function or a subroutine, that will perform a
specific task. A function explicitly returns a result. A subroutine can (but need not) return
results by reference. When we develop methods for matrix computations we need to
return values (usually arrays). So, we can return multiple arrays from a subprocedure by
passing those arrays by reference as subprocedure arguments.
Another solution involves returning those arrays from a function as a jagged
array. A jagged array is an array of arrays. Consider the following function:

public double[][,] DO_IT( double[,] A )
{
    double[][,] Result_Array = new double[2][,]; // Jagged array for function
    Result_Array[0] = new double[7, 4];
    Result_Array[1] = new double[5, 3];
    // Do some things that create two arrays to be returned,
    // then pack them into Result_Array[0] and Result_Array[1]
    return Result_Array;
}
This function packs two arrays into one jagged array and returns the jagged array.
On the consumer side we access the results with something like this:
private void Button1_Click( System.Object sender, System.EventArgs e )
{
    double[,] A = new double[2, 2];
    A[1, 1] = 3;
    double[][,] Receiver = DO_IT(A);
    double[,] B = Receiver[0];
    double[,] C = Receiver[1];
}
Be aware that there are some common language specification (CLS) issues with
jagged arrays, and that classes using them should not be exposed to CLS-compliant code.
In this book we will pass multiple arrays using the subprocedure approach. Jagged
arrays can be more efficient in certain cases, but code written with them is often
hopelessly difficult to follow, and in this book following the code is the objective. Using
a subprocedure that passes arrays by reference also allows the .NET graphical user
interface to provide inline help with subprocedure arguments. Here is a simple example
of such a procedure together with an interface:
public void DO_IT2( double[,] X, ref double[,] B, ref double[,] C)
{
    // Do some things that create results in B and C
}

private void Button2_Click( System.Object sender, System.EventArgs e ) // illustrative caller
{
    double[,] A = new double[ 2, 2 ];
    double[,] B = null, C = null;
    A[ 1, 1 ] = 3;
    DO_IT2( A, ref B, ref C );
}
Some additional points should be made before we move on. First, if you are using
VB.NET, turn Option Strict on. [This] "restricts implicit data type conversions to only widening
conversions. [It] explicitly disallows any data type conversions in which data loss would
occur and any conversion between numeric types and strings" (Microsoft VB Language
Reference). Also, note in the code examples above that we are passing two-dimensional
arrays as double[,], not as Array or Object, thus giving our user (perhaps ourselves) a
break from the problems of late binding.
The class library we will develop for matrix computations will have certain global
variables. It is best to list these right now before we start adding methods to the class. Put
a paper clip on this page or, better yet, enter these variable declarations in your
programming environment now. Don't worry if you don't know their meaning or use.
public class Matrix_Computation_Utility
{
    private int[] Gauss_Encoded_P = new int[2]; // Row permutation vector
    private int[] Gauss_Encoded_Q = new int[2]; // Column permutation vector
    private int[] QR_Encoded_P = new int[2]; // QR column permutation vector
    private int QRRank; // QR rank
    private int SVDRANK; // SVD rank
    public double Machine_Error = 0.00000000000000011313; // unit roundoff variable
    public double SVD_Rank_Determination_Threshold = 0.000001; // an adjustable rank determination threshold
    public double Givens_Zero_Value = 0; // Threshold for Givens process
    private int c_sign; // sign of the determinant
    public double HOUSE_BETA; // pipeline from house
    public double[] beta = new double[2]; // Vector of Householder Beta values
    public delegate void doneEventHandler();
    public event doneEventHandler done; // Signal that a procedure is complete
    // The following properties allow reading private globals.
public int[] GE_Encoded_Row_Permutation
{
get
{
return Gauss_Encoded_P;
}
}
public int[] GE_Encoded_Column_Permutation
{
get
{
return Gauss_Encoded_Q;
}
}
public int[] QR_Encoded_Column_Permutation
{
get
{
return QR_Encoded_P;
}
}
public int QR_Rank
{
get
{
return QRRank;
}
}
public int svd_Rank
{
get
{
return SVDRANK;
}
}
Mathematical Operations
There are certain fundamental operations we must consider before we move on to
our primary subject matter.
Transposition: C = A^T
Transposing a matrix involves making the first column of the matrix the first row
of the transpose, then making the second column the second row, and so on.
Alternatively, we can understand the process as rotating the array 180 degrees around the axis
formed by the diagonal set of elements beginning with a11. Note that the transposed
matrix has n rows and m columns.
It is not always necessary to explicitly form a matrix's transpose. For example, if
we want to perform an operation over i rows and j columns of A^T, then we can
exchange the indices of our original matrix and perform that operation over j rows and i
columns of A. See Matrix-Matrix Multiplication below for an example.
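The index-exchange idea can be sketched directly. This hypothetical method (names are illustrative, not from the book's utility class) computes C = A^T * B by reading A with its indices swapped, so the transpose is never formed:

```csharp
using System;

public static class ImplicitTranspose
{
    // Computes C = A^T * B without explicitly forming A^T, by swapping
    // the row/column indices of A. A is m-by-p, B is m-by-n, all 1-based.
    public static double[,] TransposeTimes(double[,] A, double[,] B)
    {
        int m = A.GetLength(0) - 1; // rows of A (and of B)
        int p = A.GetLength(1) - 1; // columns of A = rows of A^T
        int n = B.GetLength(1) - 1; // columns of B
        double[,] C = new double[p + 1, n + 1];
        for (int i = 1; i <= p; i++)
        {
            for (int k = 1; k <= m; k++)
            {
                for (int j = 1; j <= n; j++)
                {
                    C[i, j] += A[k, i] * B[k, j]; // A[k, i] reads A^T[i, k]
                }
            }
        }
        return C;
    }
}
```

The saving is the memory traffic of allocating and filling an n-by-m transpose array.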
Implementation:
Given A in R^(m x n), determine C in R^(n x m) such that C = A^T.

public double[,] Transpose(double[,] Source_Matrix)
{
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    int i = 0, j = 0;
    double[,] result = new double[n + 1, m + 1];
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
result[j, i] = Source_Matrix[i, j];
}
}
if (null != done) done();
return result;
}
Addition: C = A + B
When we add two matrices, we sum corresponding elements in each matrix. For
example, suppose we are adding A and B to form C. The ij-th element of C, c(i,j), is
a(i,j) + b(i,j). Note that the matrices to be added must have the same dimensions.
Implementation:
Given A in R^(m x n) and B in R^(m x n), determine C = A + B.
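A minimal sketch of the addition method might look like the following; the class and method names are illustrative and the dimension check reflects the requirement just stated:

```csharp
using System;

public static class MatrixAdd
{
    // Element-wise sum C = A + B of two 1-based m-by-n matrices.
    public static double[,] Add(double[,] A, double[,] B)
    {
        int m = A.GetLength(0) - 1;
        int n = A.GetLength(1) - 1;
        if (m != B.GetLength(0) - 1 || n != B.GetLength(1) - 1)
        {
            throw new ApplicationException("matrices must have the same dimensions");
        }
        double[,] C = new double[m + 1, n + 1];
        for (int i = 1; i <= m; i++)
        {
            for (int j = 1; j <= n; j++)
            {
                C[i, j] = A[i, j] + B[i, j];
            }
        }
        return C;
    }
}
```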
Scalar Multiplication
Here we simply multiply every element by a scalar to form a new matrix with the
same size and shape as the original matrix.
Implementation:
Given A in R^(m x n) and a scalar s, determine C in R^(m x n) such that C = sA.

public double[,] Scalar_Multiply(double s, double[,] A_Matrix)
{
    int m = A_Matrix.GetLength(0) - 1;
    int n = A_Matrix.GetLength(1) - 1;
    int i = 0, j = 0;
    double[,] result = new double[m + 1, n + 1];
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            result[i, j] = s * A_Matrix[i, j];
        }
    }
    if (null != done) done();
    return result;
}
Matrix-Matrix Multiplication
With transposition running a close second, matrix multiplication is the most
common elementary operation we will encounter. It is a costly process and it is
fundamental that we implement multiplication procedures strategically. The product
C = AB, with A m-by-p and B p-by-n, is formed element by element:

    c(i,j) = sum(k = 1 to p) a(i,k)b(k,j).

As you can see, the ij-th element of C is the dot product of the i-th row of A and the
j-th column of B. Clearly, the number of columns in A must be equal to the number of
rows in B.
Implementation:
Given A in R^(m x p) and B in R^(p x n), determine C in R^(m x n) such that C = AB.
In general AB != BA. It is, however, always the case (and this is important) that
(AB)^T = B^T A^T.
If we represent two column vectors v and u as matrices, then the dot product is
<v, u> = v^T u = u^T v. If we represent two row vectors v and u as matrices, then the dot
product is <v, u> = v u^T = u v^T.
The order of the loops in the multiplication algorithm can be changed from the
above strategy to: For i = 1 To m: For j = 1 To n: For k = 1 To p. The form of the
calculation stays the same, namely C(i, j) = A(i, k) * B(k, j) + C(i, j). In fact, there are
six loop sequences that will work: i,j,k; j,i,k; i,k,j; j,k,i; k,i,j; and k,j,i. However, the data
access properties will change and some variants will be more efficient than others.
Consider the i,k,j variant used in the implementation. The inner loop accesses the
elements of Result(i, j) and B_Matrix(k, j) sequentially along the current row (i for Result
and k for B_Matrix). The inner loop data access for the six variants is detailed below.
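As a baseline for the threaded implementation that follows, the i,k,j variant can be written as a plain single-threaded method (the class and method names here are illustrative):

```csharp
using System;

public static class PlainMultiply
{
    // Single-threaded, i,k,j-ordered multiplication C = A * B.
    // A is m-by-p, B is p-by-n, all matrices 1-based.
    public static double[,] Multiply(double[,] A, double[,] B)
    {
        int m = A.GetLength(0) - 1;
        int p = A.GetLength(1) - 1;
        int n = B.GetLength(1) - 1;
        if (p != B.GetLength(0) - 1)
        {
            throw new ApplicationException("for A X B, columns in A and rows in B must be equal");
        }
        double[,] C = new double[m + 1, n + 1];
        for (int i = 1; i <= m; i++)
        {
            for (int k = 1; k <= p; k++)
            {
                for (int j = 1; j <= n; j++)
                {
                    // inner loop walks row i of C and row k of B sequentially
                    C[i, j] = A[i, k] * B[k, j] + C[i, j];
                }
            }
        }
        return C;
    }
}
```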
public double[,] Multiply(double[,] A_Matrix, double[,] B_Matrix)
{
    // Threaded matrix-matrix multiplication: A is split into top and bottom
    // halves, which are multiplied by B on two worker threads.
    int m = A_Matrix.GetLength(0) - 1;
    int p = A_Matrix.GetLength(1) - 1;
    int n = B_Matrix.GetLength(1) - 1;
    int i = 0, j = 0;
    if (p != (B_Matrix.GetLength(0) - 1))
    {
        throw (new ApplicationException("for A X B, columns in A and rows in B must be equal"));
    }
    double[,] c = new double[m + 1, n + 1];
    int q = System.Convert.ToInt32(System.Math.Floor(System.Convert.ToSingle(m) / 2));
    double[,] A1 = new double[q + 1, p + 1];
    double[,] A2 = new double[m - q + 1, p + 1];
for (i = 1; i <= q; i++)
{
for (j = 1; j <= p; j++)
{
A1[i, j] = A_Matrix[i, j];
}
}
for (i = q + 1; i <= m; i++)
{
for (j = 1; j <= p; j++)
{
A2[i - q, j] = A_Matrix[i, j];
}
}
    CoMultiply t1 = new CoMultiply();
    t1.a1 = A1;
    t1.a2 = A2;
    t1.b = B_Matrix;
    System.Threading.Thread thread1 = new System.Threading.Thread(new System.Threading.ThreadStart(t1.multiply1));
    System.Threading.Thread thread2 = new System.Threading.Thread(new System.Threading.ThreadStart(t1.multiply2));
    thread1.Start();
    thread2.Start();
    thread1.Join();
    thread2.Join();
    double[,] c1 = t1.c1;
    double[,] c2 = t1.c2;
    c = new double[m + 1, n + 1];
for (i = 1; i <= q; i++)
{
for (j = 1; j <= n; j++)
{
c[i, j] = c1[i, j];
}
}
for (i = q + 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
c[i, j] = c2[i - q, j];
}
}
    if (null != done) done();
    return c;
}
public class CoMultiply
{
public double[,] a1;
public double[,] a2;
public double[,] b;
public double[,] c1;
public double[,] c2;
public void multiply1()
{
int m = a1.GetLength(0) - 1;
int n = b.GetLength(1) - 1;
int p = a1.GetLength(1) - 1;
c1 = new double[m + 1, n + 1];
        if (p != (b.GetLength(0) - 1))
        {
            throw (new ApplicationException("for A X B, columns in A and rows in B must be equal"));
        }
int i = 0, j = 0, k = 0;
for (i = 1; i <= m; i++)
{
for (k = 1; k <= p; k++)
{
for (j = 1; j <= n; j++)
{
c1[i, j] = a1[i, k] * b[k, j] + c1[i, j];
}
}
}
}
public void multiply2()
{
int m = a2.GetLength(0) - 1;
int n = b.GetLength(1) - 1;
int p = a2.GetLength(1) - 1;
c2 = new double[m + 1, n + 1];
        if (p != (b.GetLength(0) - 1))
        {
            throw (new ApplicationException("for A X B, columns in A and rows in B must be equal"));
        }
int i = 0, j = 0, k = 0;
for (i = 1; i <= m; i++)
{
for (k = 1; k <= p; k++)
{
for (j = 1; j <= n; j++)
{
c2[i, j] = a2[i, k] * b[k, j] + c2[i, j];
}
}
}
    }
}
Be aware that a shared data bus serves both processors, and processor
capacity utilization may be strikingly hindered by inadequate data transfer capacity.
When the true result of a floating point operation cannot be represented exactly,
it is rounded to a nearby representable value. In other words, a rounding error occurs.
One can determine a functional value for machine precision that is tolerant of differing
implementations of floating point standards with the following procedure:
public void Find_precision()
{
    // At what threshold value of delta does 1 + delta no longer equal 1?
    double delta = 0;
    for (delta = 1.0E-18; delta <= 1; delta += 1.0E-18)
    {
        if (1 + delta != 1)
        {
            break;
        }
    }
    Machine_Error = 1.01 * delta; // Global, small multiple of unit roundoff
}
The central calculation includes one floating point multiplication and one floating
point addition, i.e., two floating point operations. This calculation is performed mpn
times. The total number of floating point operations for matrix-matrix multiplication is
2mpn. Some sources, including this book, abbreviate floating point operations with the
acronym FLOPs. Be aware that other sources reserve this acronym for floating point
operations per second (a measure of speed), and still others refer to a floating point
add/multiply pair as a FLOP. We will not discuss floating point operations per second
other than to say the concept is essentially meaningless without a reference standard
algorithm for comparing machine-to-machine performance. In a straight loop with two
numbers being added, such as

For i = 1 to 1000: For j = 1 to 1000: For k = 1 to 1000
    X = 5.02 + 3.01
Next: Next: Next

most PCs will score around 0.5 to 2 billion floating point operations per second. With a
matrix-matrix multiplication where data in arrays must be accessed, the speed is highly
dependent on algorithm implementation and is typically 50 to 200 million floating point
operations per second.
2. Gaussian Elimination, the LU Factorization, and Determined Systems
Gaussian Elimination
Upper Triangulation Algorithms
In upper triangulation we reduce a matrix of the form

    [ a11 ... a1m ... a1n ]
    [ ...             ... ]
    [ am1 ... amm ... amn ]

to one of the form:

    [ a'11 ... a'1m ... a'1n ]
    [   0  ...  ...     ...  ]
    [   0  ... a'mm ... a'mn ]
We are zeroing the elements below the principal diagonal of the matrix's leftmost
square. The matrix may be square, in which case the leftmost square is the matrix itself.
For the purpose of our discussion, the matrix cannot have more rows than columns.
The procedure begins with obtaining a vector by multiplying the first row by the
factor a(2,1)/a(1,1) (row factor 2, 1), then subtracting that vector from the second row. This
places a zero in the a(2,1) position. Next, we obtain a vector by multiplying the first row by
the factor a(3,1)/a(1,1) (row factor 3, 1), then we subtract that vector from the third row. This
procedure is repeated for the remaining rows, then column two is processed in the same
way beginning with row factor 3, 2. Our denominator for the row factor is always the
diagonal element of the column we are working on, and the numerator is the element we
seek to zero. Frequently the resultant matrix is computed in the form of an encoded
matrix, where the zeros below the principal diagonal are replaced with the row factors
used to produce a zero in that position.
Note that the left square of the matrix is being upper triangulated. The following
code encapsulates the process.
public double[,] GaussUT_nopiv(double[,] Source_Matrix)
{
    // no pivot upper triangular Gauss elimination of leftmost square of matrix
    // overwrites Source_Matrix, returning encoded array with rowfactors in lower triangle
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    int i = 0, j = 0, k = 0;
    double rowfactor = 0;
    if (m > n)
    {
        throw new ApplicationException("Rows exceeds columns. Leftmost square undefined");
    }
    // gaussian elimination loop
    for (k = 1; k <= m; k++)
    {
        if (Source_Matrix[k, k] == 0)
        {
            throw new ApplicationException("Divide by zero occurs, consider pivoting.");
        }
        // the loop to zero the lower triangle (actually rowfactors are stored in the lower triangle)
        for (i = k + 1; i <= m; i++)
        {
            rowfactor = Source_Matrix[i, k] / Source_Matrix[k, k];
            for (j = k; j <= n; j++)
            {
                Source_Matrix[i, j] = Source_Matrix[i, j] - rowfactor * Source_Matrix[k, j];
            }
            Source_Matrix[i, k] = rowfactor;
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
Obviously a problem arises when any given row factor's denominator is zero.
Also, arithmetic reliability problems occur when the row factor's denominator is
disproportionately low in absolute value compared to other matrix elements (rounding
errors occur). The solution lies in a technique called pivoting.
There are algorithms for partial pivoting and full pivoting. Let us first consider
row pivoting. Initially we have the matrix A and

    Prow = [ 1 0 ... 0 ]
           [ 0 1 ... 0 ]
           [ ...    ... ]
           [ 0 0 ... 1 ],

where Prow is an identity matrix with the same dimensions as the leftmost square of A
(m by m).
Again, we seek to triangulate the leftmost square of A , and we work from the diagonal
elements down for any given column. The row containing the diagonal element of
interest is called the pivot row. We compare the absolute value of the diagonal element
with that of each of the elements below it. The row with the element of highest absolute
value is swapped with the pivot row (if indeed the pivot row is not already the row with
the highest absolute value element). We keep track of row exchanges throughout the
computation with an encoded row permutation matrix, Gauss_Encoded_P. (Permutation
matrices and encoded permutation matrices are discussed in detail later in this chapter.)
Here is a procedure for doing this:
public double[,] GaussUT_partial_piv(double[,] Source_Matrix)
{
    // Row pivot upper triangular Gauss elimination of leftmost square of matrix
    // returns encoded array with rowfactors in lower triangle.
    // Modifies global Gauss_Encoded_P, overwrites Source_Matrix
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    if (m > n)
    {
        throw new ApplicationException("Rows exceeds columns. Leftmost square undefined");
    }
    Gauss_Encoded_P = new int[m + 1];
    c_sign = 1; // This is here for determinant calculations.
    int i = 0, j = 0, k = 0, exrow = 0, temp2 = 0;
    double temp1 = 0, rowfactor = 0, amax = 0;
    // initialize permutation matrix
    for (i = 1; i <= m; i++)
    {
        Gauss_Encoded_P[i] = i;
    }
    // gaussian elimination loop
    for (k = 1; k <= m; k++)
    {
        exrow = k;
        // find the element with the highest abs value at or below the pivot and record its row
        amax = System.Math.Abs(Source_Matrix[k, k]);
        for (i = k; i <= m; i++)
        {
            if (System.Math.Abs(Source_Matrix[i, k]) > amax)
            {
                amax = System.Math.Abs(Source_Matrix[i, k]);
                exrow = i;
            }
        }
        if (exrow != k)
        { // exchange rows
            for (j = 1; j <= n; j++)
            { // for the source matrix
                temp1 = Source_Matrix[k, j];
                Source_Matrix[k, j] = Source_Matrix[exrow, j];
                Source_Matrix[exrow, j] = temp1;
            }
            temp2 = Gauss_Encoded_P[k]; // for the permutation matrix
            Gauss_Encoded_P[k] = Gauss_Encoded_P[exrow];
            Gauss_Encoded_P[exrow] = temp2;
            c_sign = c_sign * -1;
        }
        if (Source_Matrix[k, k] == 0)
        {
            throw new ApplicationException("Singular matrix");
        }
        // the loop to zero the lower triangle (actually rowfactors are stored in the lower triangle)
        for (i = k + 1; i <= m; i++)
        {
            rowfactor = Source_Matrix[i, k] / Source_Matrix[k, k];
            for (j = k; j <= n; j++)
            {
                Source_Matrix[i, j] = Source_Matrix[i, j] - rowfactor * Source_Matrix[k, j];
            }
            Source_Matrix[i, k] = rowfactor;
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
The full pivoting procedure considers the diagonal element of focus at any step of
the process to be a pivot point. At any step K we begin at the diagonal element and search
and compare each element of the A pivot submatrix to determine which has the highest
absolute value. Then row and column exchanges are performed to bring the element of
highest absolute value to the pivot point. We keep track of the row and column exchanges
throughout the computation with encoded row and column permutation matrices
Gauss_Encoded_P and Gauss_Encoded_Q.
The schematic below shows what is meant by the pivot submatrix. At any point in
the procedure only elements within this submatrix are compared, but the exchanges
involve the entire row or column.

    [ a11 ...     ...     ... a1n ]
    [  0  ... akk ... akn ... ... ]
    [  0  ... amk ... amn ... ... ]

The pivot submatrix at step k consists of rows k through m and columns k through m.
Initially we have the matrix A and

    Prow = Pcol = [ 1 0 ... 0 ]
                  [ 0 1 ... 0 ]
                  [ ...    ... ]
                  [ 0 0 ... 1 ],

where Prow and Pcol are identity matrices with the same dimensions as the leftmost
square of A. Here is the procedure:
public double[,] GaussUT_full_piv(double[,] Source_Matrix)
{
    // Full pivot upper triangular Gauss elimination of leftmost square of matrix
    // returns encoded array with rowfactors in lower triangle.
    // Modifies globals Gauss_Encoded_Q and Gauss_Encoded_P, overwrites Source_Matrix
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    if (m > n)
    {
        throw new ApplicationException("Rows exceeds columns. Leftmost square undefined");
    }
    Gauss_Encoded_P = new int[m + 1];
    Gauss_Encoded_Q = new int[n + 1];
    c_sign = 1;
    int i = 0, j = 0, k = 0, exrow = 0, excol = 0, temp2 = 0;
    double temp1 = 0, rowfactor = 0, amax = 0;
    // initialize permutation matrices
    for (i = 1; i <= m; i++)
    {
        Gauss_Encoded_P[i] = i;
    }
    for (j = 1; j <= n; j++)
    {
        Gauss_Encoded_Q[j] = j;
    }
    // gaussian elimination loop
    for (k = 1; k <= m; k++)
    {
        exrow = k;
        excol = k;
        // find the element with the highest abs value in the pivot submatrix and record its row and column
        amax = System.Math.Abs(Source_Matrix[k, k]);
        for (i = k; i <= m; i++)
        {
            for (j = k; j <= m; j++)
            {
                if (System.Math.Abs(Source_Matrix[i, j]) > amax)
                {
                    amax = System.Math.Abs(Source_Matrix[i, j]);
                    exrow = i;
                    excol = j;
                }
            }
        }
        if (exrow != k)
        { // exchange rows
            for (j = 1; j <= n; j++)
            { // for the source matrix
                temp1 = Source_Matrix[k, j];
                Source_Matrix[k, j] = Source_Matrix[exrow, j];
                Source_Matrix[exrow, j] = temp1;
            }
            temp2 = Gauss_Encoded_P[k]; // for the permutation matrix
            Gauss_Encoded_P[k] = Gauss_Encoded_P[exrow];
            Gauss_Encoded_P[exrow] = temp2;
            c_sign = c_sign * -1;
        }
        if (excol != k)
        { // exchange columns (to bring element with highest abs value to the principal diagonal)
            for (i = 1; i <= m; i++)
            { // for the source matrix
                temp1 = Source_Matrix[i, k];
                Source_Matrix[i, k] = Source_Matrix[i, excol];
                Source_Matrix[i, excol] = temp1;
            }
            temp2 = Gauss_Encoded_Q[k]; // for the permutation matrix
            Gauss_Encoded_Q[k] = Gauss_Encoded_Q[excol];
            Gauss_Encoded_Q[excol] = temp2;
            c_sign = c_sign * -1;
        }
        if (Source_Matrix[k, k] == 0)
        {
            throw new ApplicationException("Singular matrix");
        }
        // the loop to zero the lower triangle (actually rowfactors are stored in the lower triangle)
        for (i = k + 1; i <= m; i++)
        {
            rowfactor = Source_Matrix[i, k] / Source_Matrix[k, k];
            for (j = k; j <= n; j++)
            {
                Source_Matrix[i, j] = Source_Matrix[i, j] - rowfactor * Source_Matrix[k, j];
            }
            Source_Matrix[i, k] = rowfactor;
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
In this procedure the row factors are not being stored (you will see why later), but
they certainly could be. We can add steps to perform row pivoting or full pivoting, but
there is a catch. Note that in the upper triangulation algorithms, the step to zero the lower
triangle has the form:
For i = k + 1 To m
    rowfactor = A(i, k) / A(k, k)
    For j = k To n
        A(i, j) = A(i, j) - rowfactor * A(k, j)
    Next j
    A(i, k) = rowfactor
Next i
The inner loop (For j = k To n; Next j) only needs to execute from k to n
because A(k, p) is zero for all p < k. With lower triangulation for m < n, the zeros occur
bounded by non-zero values to their left and right. We either must perform the inner loop
for the entire row (as above), or employ some skip-over strategy, either of which
results in an unnecessary loss of efficiency. So, except when lower triangulation is
required by some special application, upper triangulation seems to work best.
For row pivoting, the total number of comparisons is

    sum(i = 1 to m-1) (m - i).

The diagram below schematically shows the elements being compared. They occupy the
area under the diagonal of the matrix. The number of comparisons is proportional to,
and not greater than, m^2, the area of the square. Many textbooks state this measure of
computational overhead with big O notation, i.e., the number of comparisons is O(m^2).
As with row pivoting, in the full pivot Gaussian elimination algorithm the
absolute value of the element on the principal diagonal of a given column is stored in a
variable amax. This value is now compared using the same strategy as the row pivot,
except we use a search and compare procedure covering the entire submatrix. For the first
column, there is a total of m^2 - 1 comparisons (it is minus one because we do not
compare the first element with itself). The next column considers a pivot submatrix with
dimensions m-1 by m-1, so the number of comparisons is (m-1)^2 - 1. The overall
total is

    sum(i = 0 to m-1) ((m - i)^2 - 1).

Imagine these comparisons as layers of a stepped pyramid
enveloped by a cube of volume m^3. The base of the pyramid contains the elements being
compared for column one's pivot point. Similarly, the ascending layers correspond to the
pivot submatrices for the remaining columns. The number of comparisons is proportional
to, and not greater than, m^3, the volume of the cube. The number of comparisons is
O(m^3).
30
Determinants
Although determinants are rarely used in linear solution algorithms, we should briefly discuss them because they are used in the Cramer's rule method for solving determined systems. Determinants apply to square matrices. One of several methods for evaluating determinants involves Gaussian elimination with pivoting. As usual we arrive at an upper triangular matrix, and the determinant is then, up to sign, the product of its diagonal elements:

det(A) = ± (a11 · a22 · · · amm).

The sign is negative whenever the number of row exchanges plus column exchanges is odd.
For the encapsulation below, recall that the GaussUT_full_piv function utilizes
c_sign, a variable with class scope:
public double Determinant(double[,] Source_Matrix)
{
    double determinantReturn = 0;
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    if (m != n)
    {
        throw new ApplicationException("matrix must be square");
    }
    int i = 0;
    double[,] arrTemp = new double[m + 1, n + 1];
    Array.Copy(Source_Matrix, arrTemp, arrTemp.Length);
    Array.Copy(GaussUT_full_piv(arrTemp), arrTemp, arrTemp.Length);
    determinantReturn = 1;
    for (i = 1; i <= m; i++)
    {
        determinantReturn = determinantReturn * arrTemp[i, i];
    }
    determinantReturn = determinantReturn * c_sign;
    if (null != done) done();
    return determinantReturn;
}
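The same idea can be sketched self-contained in Python (a hedged, minimal version using 0-based indexing and row pivoting only; the sign flips once per row exchange and the diagonal of the triangularized copy is multiplied out):

```python
def determinant(a):
    """Determinant via Gaussian elimination with partial (row) pivoting.

    `a` is a list of equal-length row lists; it is copied, not modified.
    """
    n = len(a)
    m = [row[:] for row in a]   # work on a copy
    sign = 1.0
    for k in range(n):
        # find the row with the largest magnitude entry in column k
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[p][k] == 0.0:
            return 0.0          # singular matrix
        if p != k:
            m[k], m[p] = m[p], m[k]
            sign = -sign        # one exchange flips the sign
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= f * m[k][j]
    det = sign
    for k in range(n):
        det *= m[k][k]
    return det
```

For example, determinant([[1, 2], [3, 4]]) evaluates to −2 after one row exchange.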
Inversion by Augmentation
Another fundamental computation is matrix inversion. Inversion only applies to square matrices. Here we consider A in R^(n×n) and its inverse A^-1 in R^(n×n), for which AA^-1 = A^-1 A = I. First the matrix is augmented on the right with an n×n identity matrix:

              | a11 ... a1n   1 0 ... 0 |
A_augmented = |  :        :   0 1 ... 0 |
              | an1 ... ann   0 0 ... 1 |
Next a full pivot Gaussian elimination strategy is used to zero the lower triangle
of the left square.
              | ā11 ...  ā1n   b11 ... b1n |
A_augmented = |  0   ...   :    :       :  |
              |  0  ... 0 ānn  bn1 ... bnn |
Here b and a denote transitional values. The row factors are deleted from the
result matrices. We will not need them.
Next we zero the leftmost square's upper triangle using Gaussian elimination, this
time using the lower triangulation procedure (without pivoting).
              | ā11  0 ...  0   c11 ... c1n |
A_augmented = |  0   ...    :    :       :  |
              |  0  ... 0  ānn  cn1 ... cnn |
We divide each row by the corresponding diagonal element in the leftmost square
and obtain:
              | 1  0 ... 0   a'11 ... a'1n |
A_augmented = | 0  ...   :    :         :  |
              | 0  0 ... 1   a'n1 ... a'nn |
The rightmost square is now a permutation of the inverse matrix, which we parse into its own n-by-n matrix. This matrix is multiplied by the column permutation matrix obtained from the full pivot Gaussian elimination step to obtain A^-1.
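The augmentation procedure can be sketched compactly in Python. This is a hedged, minimal version (0-based indexing) that uses row pivoting only, so no trailing column permutation step is needed; it runs Gauss-Jordan elimination on [A | I] and returns the right half:

```python
def invert(a):
    """Invert a square matrix by augmenting with the identity and running
    Gauss-Jordan elimination with row pivoting."""
    n = len(a)
    # build the augmented matrix [A | I]
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        if aug[p][k] == 0.0:
            raise ValueError("matrix is singular")
        aug[k], aug[p] = aug[p], aug[k]
        piv = aug[k][k]
        aug[k] = [v / piv for v in aug[k]]      # scale pivot row to 1
        for i in range(n):
            if i != k:
                f = aug[i][k]
                aug[i] = [u - f * v for u, v in zip(aug[i], aug[k])]
    return [row[n:] for row in aug]             # right half is A inverse
```

For example, invert([[4, 7], [2, 6]]) returns [[0.6, -0.7], [-0.2, 0.4]].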
The LU Factorization
We have actually already studied the LU factorization. Its essence is that of upper
triangulation by Gaussian elimination, and all that remains to be discussed is the simple
formality of nomenclature. In our implementations of the upper triangulation Gaussian
elimination algorithms, the row factors are stored in the lower left triangle below the
principal diagonal of the leftmost square of the matrix. Also, encoded row and column
permutation matrices are being calculated and stored as global arrays Gauss_Encoded_P
and Gauss_Encoded_Q. With the LU factorization we restrict consideration to square
matrices, so the leftmost square is the matrix itself.
First we execute any of the three upper triangulation Gaussian elimination
functions on a matrix A GaussUT_nopiv( A ), GaussUT_partial_piv( A ), or
GaussUT_full_piv( A ). Next, the row factors (elements below the diagonal) are parsed
into a new matrix L which is unit diagonal and square
    |  1     0   ...        0 |
L = | RF21   1   ...        0 |
    |  :          ...       0 |
    | RFm1  ...  RFm,m-1    1 |,  where RF is the row factor.

The upper triangular portion of the Gaussian elimination matrix is parsed into U:

    | ā11 ... ā1m |
U = |  0  ...  :  |
    |  0   0  āmm |
Finally, for the sake of clarity, let us call the row and column permutation matrices P and Q respectively. L, U, P and Q make up the LU decomposition, and PAQ = LU. When row pivoting is used, Q is an m × m identity matrix.
The following two functions take as an argument a square matrix and perform
both the Gaussian elimination and subsequent parsing. P and Q are stored as encoded
matrices in the global variables Gauss_Encoded_P and Gauss_Encoded_Q, and they can
be accessed through the read only properties GE_Encoded_Row_Permutation and
GE_Encoded_Column_Permutation.
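A minimal Python sketch of the same idea (0-based indexing, row pivoting, with a list perm playing the role of the encoded P) shows the row factors landing in L while the working copy is triangularized into U:

```python
def lu_partial_pivot(a):
    """LU factorization with row pivoting: row factors fill the strictly
    lower triangle of unit-diagonal L, the working copy becomes U, and
    perm records the row permutation so that PA = LU."""
    n = len(a)
    u = [row[:] for row in a]
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(u[r][k]))
        if p != k:
            u[k], u[p] = u[p], u[k]
            perm[k], perm[p] = perm[p], perm[k]
            l[k][:k], l[p][:k] = l[p][:k], l[k][:k]  # keep earlier factors aligned
        for i in range(k + 1, n):
            f = u[i][k] / u[k][k]
            l[i][k] = f                  # store the row factor
            for j in range(k, n):
                u[i][j] -= f * u[k][j]
    for k in range(n):
        l[k][k] = 1.0
    return l, u, perm
```

A quick check: for A = [[2, 1], [4, 3]] the rows are exchanged once and PA equals LU.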
Row Pivot LU
public void LU_row_pivot(double[,] Source_Matrix, ref double[,] L, ref double[,] U)
{
    // This is mostly a simple parsing algorithm. The actual LU is performed by GaussUT_partial_piv.
    // Modifies global Gauss_Encoded_Q and Gauss_Encoded_P, overwrites Source_Matrix with encoded LU.
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    if (m != n)
    {
        throw new ApplicationException("matrix must be square");
    }
    L = new double[m + 1, n + 1];
    U = new double[m + 1, n + 1];
    int i = 0, j = 0;
    Source_Matrix = GaussUT_partial_piv(Source_Matrix);
    // parse encoded array to L and U
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            if (i > j)
            {
                L[i, j] = Source_Matrix[i, j];
            }
            if (i < j)
            {
                L[i, j] = 0;
            }
            if (i == j)
            {
                L[i, j] = 1;
            }
            if (i <= j)
            {
                U[i, j] = Source_Matrix[i, j];
            }
        }
    }
    if (null != done) done();
}
Full Pivot LU
public void LU_full_pivot(double[,] Source_Matrix, ref double[,] L, ref double[,] U)
{
    // This is mostly a simple parsing algorithm. The actual LU is performed by GaussUT_full_piv.
    // Modifies global Gauss_Encoded_Q and Gauss_Encoded_P, overwrites Source_Matrix with encoded LU.
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    if (m != n)
    {
        throw new ApplicationException("matrix must be square");
    }
    L = new double[m + 1, n + 1];
    U = new double[m + 1, n + 1];
    int i = 0, j = 0;
    Source_Matrix = GaussUT_full_piv(Source_Matrix);
    // parse encoded array to L and U
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            if (i > j)
            {
                L[i, j] = Source_Matrix[i, j];
            }
            if (i < j)
            {
                L[i, j] = 0;
            }
            if (i == j)
            {
                L[i, j] = 1;
            }
            if (i <= j)
            {
                U[i, j] = Source_Matrix[i, j];
            }
        }
    }
    if (null != done) done();
}
The encoded permutation vectors can be decoded into explicit permutation matrices:
public double[,] Decode_column_permutation(int[] encoded_vector)
{
    int n = encoded_vector.GetLength(0) - 1;
    double[,] Cperm = new double[n + 1, n + 1];
    double[,] temp = new double[n + 1, n + 1];
int i = 0, j = 0;
for (i = 1; i <= n; i++)
{
temp[i, i] = 1;
}
for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{
Cperm[i, j] = temp[i, encoded_vector[j]];
}
}
return Cperm;
}
public double[,] Decode_row_permutation(int[] encoded_vector)
{
int n = encoded_vector.GetLength(0) - 1;
double[,] Rperm = new double[n + 1, n + 1];
double[,] temp = new double[n + 1, n + 1];
int i = 0, j = 0;
for (i = 1; i <= n; i++)
{
temp[i, i] = 1;
}
for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{
Rperm[i, j] = temp[encoded_vector[i], j];
}
}
return Rperm;
}
Determined Systems
Here we consider the system Ax = b where A is m × n, x is an n-dimensional vector, b is an m-dimensional vector, and m = n. This is the standard problem of solving a system of linear equations where the number of equations is equal to the number of unknowns. Remembering that AA^-1 = A^-1 A = I, let us multiply both sides of the equation Ax = b by A^-1: x = A^-1 b. So, the solution is simple; all we have to do is multiply the b vector by A^-1. However, inverting a matrix is computationally expensive.
Another overwhelmingly expensive procedure with historical significance is Cramer's rule:
For the 3 × 3 system

| a11 a12 a13 | | x1 |   | b1 |
| a21 a22 a23 | | x2 | = | b2 |,
| a31 a32 a33 | | x3 |   | b3 |

Cramer's rule gives

     det | b1 a12 a13 |         det | a11 b1 a13 |
         | b2 a22 a23 |             | a21 b2 a23 |
         | b3 a32 a33 |             | a31 b3 a33 |
x1 = ------------------- ,  x2 = ------------------- ,  etc.
     det | a11 a12 a13 |         det | a11 a12 a13 |
         | a21 a22 a23 |             | a21 a22 a23 |
         | a31 a32 a33 |             | a31 a32 a33 |

Each xi is the ratio of two determinants: the denominator is det(A), and the numerator is det(A) with column i replaced by b.
public double[,] Explicit_LU_Ax_b(double[,] L, double[,] U, double[,] b, int[] Gauss_Encoded_P, int[] Gauss_Encoded_q)
{
    // Given explicit L and U (and optional encoded permutation vectors), solve Ax = b.
    int n = L.GetLength(0) - 1;
    int i = 0, j = 0;
    double[,] localb = new double[n + 1, 2];
    Array.Copy(b, localb, b.Length);
    if (Gauss_Encoded_P != null)
    { // multiply by row permutation
        double[,] tempb = new double[n + 1, 2];
for (i = 1; i <= n; i++)
{
tempb[i, 1] = localb[Gauss_Encoded_P[i], 1];
}
localb = tempb;
}
double temp = 0;
// forward calculate for Ly=Pb
for (i = 2; i <= n; i++)
{
for (j = 1; j <= i - 1; j++)
{
temp = temp - L[i, j] * localb[j, 1];
}
localb[i, 1] = temp + localb[i, 1];
temp = 0;
}
        // At this point localb holds y. We will back calculate for Uz=y
localb[n, 1] = localb[n, 1] / U[n, n];
for (i = n - 1; i >= 1; i += -1)
{
for (j = n; j >= i + 1; j += -1)
{
temp = temp - U[i, j] * localb[j, 1];
}
temp = temp + localb[i, 1];
localb[i, 1] = temp / U[i, i];
temp = 0;
}
// At this point localb holds z
if (Gauss_Encoded_q == null)
{
if (null != done) done();
return localb;
}
else
{ // Multiply by permutation matrix
double[,] tempb = new double[n + 1, 2];
for (i = 1; i <= n; i++)
{
tempb[Gauss_Encoded_q[i], 1] = localb[i, 1];
}
localb = tempb;
if (null != done) done();
return localb;
}
}
public double[,] Explicit_LU_Ax_b(double[,] L, double[,] U,
double[,] b)
{
return Explicit_LU_Ax_b(L, U, b, null);
}
public double[,] Explicit_LU_Ax_b(double[,] L, double[,] U,
double[,] b, int[] Gauss_Encoded_P)
{
return Explicit_LU_Ax_b(L, U, b, Gauss_Encoded_P, null);
}
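The forward and back substitution at the heart of these routines can be sketched self-contained in Python (a hedged, minimal version with 0-based indexing; perm is the decoded row permutation vector, so y starts as Pb):

```python
def lu_solve(l, u, perm, b):
    """Solve Ax = b from PA = LU: permute b, forward-substitute Ly = Pb,
    then back-substitute Ux = y (the same two sweeps as the C# routine)."""
    n = len(b)
    y = [b[p] for p in perm]             # y starts as Pb
    for i in range(n):                   # forward: Ly = Pb (unit diagonal L)
        for j in range(i):
            y[i] -= l[i][j] * y[j]
    x = y[:]
    for i in range(n - 1, -1, -1):       # back: Ux = y
        for j in range(i + 1, n):
            x[i] -= u[i][j] * x[j]
        x[i] /= u[i][i]
    return x
```

With L = [[1, 0], [0.5, 1]], U = [[4, 3], [0, -0.5]], perm = [1, 0] (the factors of A = [[2, 1], [4, 3]]) and b = [3, 7], the solution is x = [1, 1].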
As a worked example (values rounded), a full pivot LU factorization of a 3 × 3 matrix yields a unit lower triangular L with row factors 0.513, 0.974, and 0.135 below the diagonal; an upper triangular U whose trailing rows are (0, 3.920, 2.274) and (0, 0, 1.110); and 3 × 3 row and column permutation matrices P and Q.
Linear systems involving symmetric positive definite coefficient matrices can be solved
with a Cholesky factorization, which requires m3/3 FLOPs as opposed to 2m3/3 FLOPs
for an LU factorization.
Consider a square matrix A with dimensions m × m and the set of all m-dimensional vectors. Remember that vectors are matrices with one row (or column). The product xTAx will be a scalar. If this number is greater than zero no matter what m-dimensional vector x we consider (except the case where x is a zero vector), then the matrix A is said to be positive definite; i.e., a matrix A in R^(m×m) is positive definite if xTAx > 0 for all non-zero x in R^m.
If, for A in R^(m×m), AT = A, then A is said to be symmetric.
If A in R^(m×m) is symmetric positive definite, then there exists a unique lower triangular matrix G in R^(m×m) with positive diagonal elements such that GGT = A. GGT is the Cholesky factorization of A.
First, here is a function for obtaining G:
public double[,] Cholesky(double[,] Source_Matrix)
{
    // cholesky gaxpy Golub/Van Loan 4.2.1 Result = G
    // Overwrites Source_Matrix and zeros above diagonal
    int i = 0, j = 0, k = 0;
    int n = Source_Matrix.GetLength(0) - 1;
    double[] v = new double[n + 1];
    for (j = 1; j <= n; j++)
    {
        for (i = j; i <= n; i++)
        {
            v[i] = Source_Matrix[i, j];
        }
        for (k = 1; k <= j - 1; k++)
        {
            for (i = j; i <= n; i++)
            {
                v[i] = v[i] - (Source_Matrix[j, k] * Source_Matrix[i, k]);
            }
        }
        for (i = 1; i <= n; i++)
        {
            if (i < j)
            {
                Source_Matrix[i, j] = 0;
            }
            else
            {
                Source_Matrix[i, j] = v[i] / (Math.Pow(v[j], 0.5));
            }
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
One good way to determine whether your matrix is positive definite is to submit it to the above function. If the code executes and does not produce NaN values (from square roots of negative numbers), the matrix submitted is positive definite. Its symmetry can be evaluated by inspection, or by verifying that its product with the inverse of its transpose yields the identity matrix.
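The factorization, and the positive-definiteness test it provides, can be sketched in Python (a hedged, minimal version with 0-based indexing; here a non-positive pivot raises an error instead of silently producing NaN):

```python
import math

def cholesky(a):
    """Lower triangular G with G Gt = A for symmetric positive definite A.
    Raises ValueError if a pivot is not positive, which doubles as the
    positive-definiteness test described in the text."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(g[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        g[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            g[i][j] = (a[i][j] - sum(g[i][k] * g[j][k] for k in range(j))) / g[j][j]
    return g
```

For example, cholesky([[4, 2], [2, 3]]) returns G with first column (2, 1) and g[1][1] = sqrt(2), while the indefinite matrix [[1, 2], [2, 1]] raises ValueError.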
Once we have a Cholesky factorization of A we can obtain solutions. Given a
determined system Ax b where A is symmetric positive definite, we can use the
following steps to solve for x :
Obtain G and GT
Solve Gy = b (simple forward calculation)
Solve GTx = y (simple back calculation)
This algorithm is encapsulated below:
public double[,] Cholesky_Ax_b(double[,] G, double[,] b)
{
    if (G.GetLength(0) != G.GetLength(1))
    {
        throw new ApplicationException("matrix must be square");
    }
    int n = G.GetLength(0) - 1;
    double[,] result = new double[n + 1, 2];
    Array.Copy(b, result, b.Length); // result now holds b
    int i = 0, j = 0;
    double Temp = 0;
    // forward calculate for Gy=b
    result[1, 1] = result[1, 1] / G[1, 1];
    for (i = 2; i <= n; i++)
    {
        for (j = 1; j <= i - 1; j++)
        {
            Temp = Temp - G[i, j] * result[j, 1];
        }
        Temp = Temp + result[i, 1];
        result[i, 1] = Temp / G[i, i];
        Temp = 0;
    }
    // result now holds y
    double[,] GT = new double[n + 1, n + 1];
    GT = Transpose(G);
    Temp = 0;
    // back calculate for GTx=y
    result[n, 1] = result[n, 1] / GT[n, n];
    for (i = n - 1; i >= 1; i += -1)
    {
        for (j = n; j >= i + 1; j += -1)
        {
            Temp = Temp - GT[i, j] * result[j, 1];
        }
        Temp = Temp + result[i, 1];
        result[i, 1] = Temp / GT[i, i];
        Temp = 0;
    }
    if (null != done) done();
    return result;
}
The QR Factorization
QR Factorization by Householder Reflection
Consider the following two-dimensional system:

[Figure: a vector x and its reflection Px across the line spanned by u; v is the Householder vector normal to u.]

A Householder reflection P = I - beta*v*vT reflects a vector across the hyperplane orthogonal to v, and v can be chosen so that Px is zero in every component but the first. The House function below (following the stable formulation of Golub/Van Loan 5.1.1) takes a vector x and returns the Householder vector, scaled so that its first element is one; beta is stored in the class-scope variable HOUSE_BETA:
public double[] House(double[] x_vector)
{
    // Golub/Van Loan 5.1.1 Returns the Householder vector; beta is stored in HOUSE_BETA
    int i = 0;
    int m = x_vector.GetLength(0) - 1;
    double sigma = 0;
    double[] Result = new double[m + 1];
    for (i = 2; i <= m; i++)
    {
        sigma = sigma + x_vector[i] * x_vector[i];
        Result[i] = x_vector[i];
    }
    if (sigma == 0)
    {
        HOUSE_BETA = 0;
    }
    else
    {
        double u = Math.Pow((x_vector[1] * x_vector[1]) + sigma, 0.5);
        if (x_vector[1] <= 0)
        {
            Result[1] = x_vector[1] - u;
        }
}
else
{
Result[1] = -sigma / (x_vector[1] + u);
}
        HOUSE_BETA = (2 * Result[1] * Result[1]) / (sigma + (Math.Pow(Result[1], 2)));
for (i = 2; i <= m; i++)
{
Result[i] = Result[i] / Result[1];
}
}
Result[1] = 1;
if (null != done) done();
return Result;
}
An example:
x = (1, 3, 5)T
v = (1, -0.610242, -1.0170706)T
Hx = (5.9160788, 0, 0)T
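A Python sketch of the House computation (0-based indexing) reproduces this example; the helper reflect applies H = I - beta*v*vT to a vector without ever forming H:

```python
import math

def house(x):
    """Householder vector v (scaled so v[0] = 1) and beta such that
    (I - beta v vT) x = (||x||, 0, ..., 0), following the stable form
    in Golub/Van Loan 5.1.1."""
    sigma = sum(xi * xi for xi in x[1:])
    if sigma == 0.0:
        return [1.0] + list(x[1:]), 0.0
    mu = math.sqrt(x[0] * x[0] + sigma)
    if x[0] <= 0.0:
        v0 = x[0] - mu
    else:
        v0 = -sigma / (x[0] + mu)   # avoids cancellation when x[0] > 0
    beta = 2.0 * v0 * v0 / (sigma + v0 * v0)
    return [1.0] + [xi / v0 for xi in x[1:]], beta

def reflect(v, beta, x):
    """Apply H = I - beta v vT to x."""
    vtx = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - beta * vtx * vi for vi, xi in zip(v, x)]
```

For x = (1, 3, 5) this reproduces v = (1, -0.610242, -1.017071) and Hx = (sqrt(35), 0, 0).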
As a worked example (values rounded), consider a 4 × 3 matrix A. Submitting column one of A to the House function yields a Householder vector v, and applying the corresponding Householder matrix H1 zeros column one of A below the (1,1) element, producing A1. We next submit the last three rows of column two of A1 to the House function; embedding the resulting reflection in the lower right corner of an identity matrix gives H2, and A2 = H2 A1 has zeros below the diagonal in columns one and two. Finally, the last two rows of column three of A2 yield H3, and A3 = H3 A2 is upper triangular. A3 is the R of the factorization, and Q = H1 H2 H3.
Note that the Householder matrices never need to be formed explicitly. Applying a reflection from the right takes the form AH = A(I - beta*v*vT) = A - beta*(Av)vT, and from the left HA = A - beta*v(vTA).
When we employ this approach we compute matrix-vector multiplications instead of costly matrix-matrix multiplications.
The following code encapsulates the basic QR decomposition abstraction:
public double[,] Householder_QR_Simple(double[,] Source_Matrix)
{
    // Golub/Van Loan 5.2.1 Overwrites Source_Matrix with R and returns Q from function call.
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    QRRank = n; // Default for full rank case.
    int i = 0, j = 0, k = 0;
    double[,] Q = new double[m + 1, m + 1];
    for (i = 1; i <= m; i++)
    {
        Q[i, i] = 1;
    }
    double[] vector = new double[m + 1];
    double[,] v = new double[m + 1, 2];
    beta = new double[n + 1];
    for (j = 1; j <= n; j++)
    {
        v = new double[m + 1, 2];
        vector = new double[m - j + 2];
        // convert the jth column of the submatrix to a 1 dimensional vector for submission to House
        for (i = 1; i <= vector.Length - 1; i++)
        {
            vector[i] = Source_Matrix[j + i - 1, j];
        }
        // submit it
        vector = House(vector);
        // convert the result to a 2 dimensional "vector" with leading zeros up to the jth row
        for (i = 1; i <= vector.Length - 1; i++)
        {
            v[j + i - 1, 1] = vector[i];
        }
        beta[j] = HOUSE_BETA;
        double[,] omegaT = null;
        omegaT = Matrix_Multiply(v, Scalar_multiply(Transpose(Matrix_Multiply(Transpose(Source_Matrix), v)), beta[j]));
        for (i = j; i <= m; i++)
        {
            for (k = j; k <= n; k++)
            {
                Source_Matrix[i, k] = Source_Matrix[i, k] - omegaT[i, k];
            }
        }
        double[,] omega = null;
        omega = Scalar_multiply(Matrix_Multiply(Matrix_Multiply(Q, v), Transpose(v)), beta[j]);
        for (i = 1; i <= m; i++)
        {
            for (k = 1; k <= m; k++)
            {
                Q[i, k] = Q[i, k] - omega[i, k];
            }
        }
    }
    if (null != done) done();
    return Q;
}
Note that when we form omega above, we perform the multiplication (Qv)vT and
not Q(vvT ) . Although they give the same result, the former is considerably less
computationally expensive.
Many applications do not require the explicit generation of the Q matrix, so an
encoded array can be used where the upper triangle is the R matrix and the elements
below the principal diagonal represent the elements of a unit lower triangular matrix of
the Householder vectors. The following procedure implements these modifications:
public double[,] Householder_QR_Simple_encoded(double[,] Source_Matrix)
{
    // QR Golub/Van Loan 5.2.1 Form encoded QR without explicit formation of Householder matrices
    // Overwrites Source_Matrix with encoded QR and also sends encoded QR as return
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    QRRank = n; // Default for full rank case.
    int i = 0, j = 0, k = 0;
    beta = new double[n + 1];
    for (j = 1; j <= n; j++)
    {
        double[,] v = new double[m + 1, 2]; // redim is used simply to reinitialize the array here
        double[] vector = new double[m - j + 2];
        // convert the jth column of the submatrix to a 1 dimensional vector for submission to House
        for (i = 1; i <= vector.Length - 1; i++)
        {
            vector[i] = Source_Matrix[j + i - 1, j];
        }
        // submit it
        vector = House(vector);
        // convert the result to a 2 dimensional "vector" with leading zeros up to the jth row
        for (i = 1; i <= vector.Length - 1; i++)
        {
            v[j + i - 1, 1] = vector[i];
        }
        beta[j] = HOUSE_BETA;
        double[,] omegaT = null;
        omegaT = Matrix_Multiply(v, Scalar_multiply(Transpose(Matrix_Multiply(Transpose(Source_Matrix), v)), beta[j]));
        for (i = j; i <= m; i++)
        {
            for (k = j; k <= n; k++)
            {
                Source_Matrix[i, k] = Source_Matrix[i, k] - omegaT[i, k];
            }
        }
        // populate the lower triangular matrix holding the Householder vectors
        for (i = j + 1; i <= m; i++)
        {
            Source_Matrix[i, j] = vector[i - j + 1];
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
The encoded result can be decoded into a discrete Q and R result with the
following decoder subroutine:
public double[,] Decode_Householder_QR(double[,] Encoded_matrix)
{
    // Returns Q from encoded QR; R is just the upper triangle of Encoded_matrix.
    // Conserves Encoded_matrix
    int m = Encoded_matrix.GetLength(0) - 1;
    int n = Encoded_matrix.GetLength(1) - 1;
    int i = 0, j = 0, k = 0;
    double[,] Q = new double[m + 1, m + 1];
    for (i = 1; i <= m; i++)
    {
        Q[i, i] = 1;
    }
    // Decoder for essential part of Householder vectors stored in lower triangle.
    for (j = n; j >= 1; j += -1)
    {
        double[,] v = new double[m + 1, 2];
        v[j, 1] = 1;
        for (i = j + 1; i <= m; i++)
        {
            v[i, 1] = Encoded_matrix[i, j];
        }
        double[,] omega = null;
        omega = Matrix_Multiply(Transpose(Q), v);
        for (i = 1; i <= m; i++)
        {
            for (k = 1; k <= m; k++)
            {
                Q[i, k] = Q[i, k] - beta[j] * v[i, 1] * omega[k, 1];
            }
        }
    }
    if (null != done) done();
    return Q;
}
A Givens rotation G(i, k, θ) is an identity matrix except for four elements: rows and columns i and k hold cos(θ) on the diagonal and ±sin(θ) off the diagonal:

             | 1    ...              0 |
             |    cos(θ)   sin(θ)      |  row i
G(i, k, θ) = |        ...              |
             |   −sin(θ)   cos(θ)      |  row k
             | 0    ...              1 |

How do we determine θ such that the element a(k, i) is zeroed? Or, to restate the question, what must θ be to effect the following transformation?

| cos(θ)   sin(θ) |T | a(i,i) |   | r(i,i) |
| −sin(θ)  cos(θ) |  | a(k,i) | = |   0    |
The following function returns an array result that holds cos(θ) in Result[1] and sin(θ) in Result[2]:
public double[] Givens(double a, double b)
{
    // Golub/Van Loan 5.1.3
    // (c,s) = givens(a,b): the cosine is returned in Result[1], the sine in Result[2]
    double[] Result = new double[4];
    double tau = 0;
    double c = 0;
    double s = 0;
    if (b == 0)
    {
        c = 1;
        s = 0;
    }
    else
    {
        if (System.Math.Abs(b) > System.Math.Abs(a))
        {
            tau = -a / b;
            s = 1 / (Math.Pow((1 + (Math.Pow(tau, 2))), 0.5));
            c = s * tau;
        }
        else
        {
            tau = -b / a;
            c = 1 / (Math.Pow((1 + (Math.Pow(tau, 2))), 0.5));
            s = c * tau;
        }
    }
    Result[1] = c;
    Result[2] = s;
    return Result;
}
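The same overflow-safe tau trick can be sketched in Python; note that c² + s² = 1 and that the rotated second component comes out exactly zero:

```python
import math

def givens(a, b):
    """Return (c, s) so that the rotation [c s; -s c] applied as in the
    text (r = c*a - s*b, 0 = s*a + c*b) zeroes b, following the
    overflow-safe formulation of Golub/Van Loan 5.1.3."""
    if b == 0.0:
        return 1.0, 0.0
    if abs(b) > abs(a):
        tau = -a / b
        s = 1.0 / math.sqrt(1.0 + tau * tau)
        return s * tau, s
    tau = -b / a
    c = 1.0 / math.sqrt(1.0 + tau * tau)
    return c, c * tau
```

For example, givens(3, 4) returns (c, s) with s*3 + c*4 = 0 and |c*3 - s*4| = 5.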
Rather than storing both c and s, each rotation can be encoded as a single scalar rho and stored in the position of the element it zeroed: if c = 0 then rho = 1; if |s| <= |c| then rho = sign(c) * s / 2; otherwise rho = 2 * sign(s) / c. Decoding reverses the scheme: if rho = 1 then c = 0 and s = 1; if |rho| < 1 then s = 2 * rho and c = (1 - s²)^0.5; otherwise c = 2 / rho and s = (1 - c²)^0.5.
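The rho encode/decode round trip can be sketched in Python (the decoder recovers c and s only up to an overall sign, which does not affect the factorization):

```python
import math

def encode_rho(c, s):
    """Single-scalar encoding of a Givens rotation, mirroring the rho
    encoder section of the encoded QR function."""
    if c == 0.0:
        return 1.0
    if abs(s) <= abs(c):
        return (1.0 if c > 0.0 else -1.0) * s / 2.0
    return 2.0 * (1.0 if s > 0.0 else -1.0) / c

def decode_rho(rho):
    """Recover (c, s) from rho."""
    if rho == 1.0:
        return 0.0, 1.0
    if abs(rho) < 1.0:
        s = 2.0 * rho
        return math.sqrt(1.0 - s * s), s
    c = 2.0 / rho
    return c, math.sqrt(1.0 - c * c)
```

Round-tripping (0.8, -0.6) through encode_rho and decode_rho returns the original pair.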
The following function returns an encoded Givens QR factorization:
public double[,] Givens_QR_Simple_encoded(double[,] Source_Matrix)
{
    // Golub/Van Loan 5.2.2 Returns an encoded QR and overwrites Source_Matrix with encoded R
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    QRRank = n; // by default
    int h = 0, i = 0, j = 0;
    double[] cs = new double[3];
    double rho = 0;
    for (j = 1; j <= n; j++)
    {
        for (i = m; i >= j + 1; i += -1)
        {
            if (System.Math.Abs(Source_Matrix[i, j]) > Givens_Zero_Value) // If it's already zero we won't zero it.
            {
                double[,] tempA = new double[3, n - j + 2];
                cs = Givens(Source_Matrix[i - 1, j], Source_Matrix[i, j]);
                if (cs[1] == 0)
                { // This is the rho encoder section
                    rho = 1;
                }
                else
                {
                    if (System.Math.Abs(cs[2]) <= System.Math.Abs(cs[1]))
                    {
                        rho = System.Math.Sign(cs[1]) * cs[2] / 2;
                    }
                    else
                    {
                        rho = 2 * System.Math.Sign(cs[2]) / cs[1];
                    }
                }
                // accumulate R
                for (h = j; h <= n; h++)
                {
                    tempA[1, -j + 1 + h] = cs[1] * Source_Matrix[i - 1, h] - cs[2] * Source_Matrix[i, h];
                    tempA[2, -j + 1 + h] = cs[2] * Source_Matrix[i - 1, h] + cs[1] * Source_Matrix[i, h];
                }
                for (h = 1; h <= n - j + 1; h++)
                {
                    Source_Matrix[i - 1, j + h - 1] = tempA[1, h];
                    Source_Matrix[i, j + h - 1] = tempA[2, h];
                }
                Source_Matrix[i, j] = rho; // store the encoded rotation in the zeroed position
            }
        }
    }
    if (null != done) done();
    return Source_Matrix;
}
The Q matrix can be reconstructed from the stored rho values:
public double[,] Decode_Givens_QR(double[,] local_encoded_QR)
{
    int m = local_encoded_QR.GetLength(0) - 1;
    int n = local_encoded_QR.GetLength(1) - 1;
    int h = 0, i = 0, j = 0;
    double[] cs = new double[3];
    double[,] Q = new double[m + 1, m + 1];
    for (i = 1; i <= m; i++)
    {
        Q[i, i] = 1;
    }
    for (j = 1; j <= n; j++)
    {
        for (i = m; i >= j + 1; i += -1)
        {
            if (local_encoded_QR[i, j] != 0)
            {
                if (local_encoded_QR[i, j] == 1)
                { // rho decoder section
                    cs[1] = 0;
                    cs[2] = 1;
                }
                else
                {
                    if (System.Math.Abs(local_encoded_QR[i, j]) < 1)
                    {
                        cs[2] = 2 * local_encoded_QR[i, j];
                        cs[1] = Math.Pow((1 - (cs[2] * cs[2])), 0.5);
}
else
{
cs[1] = 2 / local_encoded_QR[i, j];
                        cs[2] = Math.Pow((1 - (cs[1] * cs[1])), 0.5);
}
}
// accumulate Q
double[,] temp_Q = new double[m + 1, 4];
for (h = 1; h <= m; h++)
{
                    temp_Q[h, 1] = cs[1] * Q[h, i - 1] - cs[2] * Q[h, i];
                    temp_Q[h, 2] = cs[2] * Q[h, i - 1] + cs[1] * Q[h, i];
}
for (h = 1; h <= m; h++)
{
Q[h, i - 1] = temp_Q[h, 1];
Q[h, i] = temp_Q[h, 2];
}
}
}
}
return Q;
}
Fast Givens transformations avoid the square roots required by the standard Givens procedure. Instead of reducing A directly, we compute a nonsingular matrix M and a diagonal matrix D such that MTM = D and MTA = T is upper triangular; then Q = M D^(-1/2) and R = D^(-1/2) T. Each 2 × 2 fast Givens transformation applied to a pair of rows takes one of two alternate forms:

Type 1 = | β  1 |    or    Type 2 = | 1  β |
         | 1  α |                   | α  1 |

The fast Givens function performs three tasks:
It returns α and β.
It returns the Type.
It updates the d(i−1) and d(i) elements of D.
An implementation of the algorithm for the fast Givens function is provided
below:
public double[] FastGivens(double[] x, double[] d)
{
// Golub/Van Loan 5.1.4
// returns result(0)= alpha; Result(1)=Beta; result(2)=type
double[] result = new double[4];
double alpha = 0, beta = 0, gamma = 0, type = 0;
if (x[2] != 0)
{
alpha = -x[1] / x[2];
beta = -alpha * (d[2] / d[1]);
gamma = -alpha * beta;
if (gamma <= 1)
{
double tau = 0;
type = 1;
tau = d[1];
d[1] = (1 + gamma) * d[2];
d[2] = (1 + gamma) * tau;
}
else
{
type = 2;
alpha = 1 / alpha;
beta = 1 / beta;
gamma = 1 / gamma;
d[1] = (1 + gamma) * d[1];
d[2] = (1 + gamma) * d[2];
}
}
else
{
type = 2;
alpha = 0;
beta = 0;
}
result[0] = alpha;
result[1] = beta;
result[2] = type;
return result;
}
The fast Givens zeroing function has complexity similar to that of the Givens
function except that it does not involve square roots. The fast Givens function is called by
the fast Givens QR subroutine below to calculate M, D, and T.
public void FastGivens_QR(double[,] Source_Matrix, ref
double[,] D_destination, ref double[,] M_destination, ref double[,]
T_destination)
{
    // Golub/Van Loan 5.2.4 that explicitly forms M
    // Source_Matrix is overwritten by T. D, M, and T are returned by reference
int m = Source_Matrix.GetLength(0) - 1;
int n = Source_Matrix.GetLength(1) - 1;
QRRank = n; // by default
int i = 0, j = 0, k = 0;
D_destination = new double[2, m + 1];
M_destination = new double[m + 1, m + 1];
// T_destination= T
for (i = 1; i <= m; i++)
{
D_destination[1, i] = 1;
M_destination[i, i] = 1;
}
for (j = 1; j <= n; j++)
{
for (i = m; i >= j + 1; i += -1)
{
if (System.Math.Abs(Source_Matrix[i, j]) >
Givens_Zero_Value)// No need to zero if already zero.
{
double alpha = 0;
double beta = 0;
int type = 0;
double[] fg_x = new double[4];
double[] fg_d = new double[4];
fg_x[1] = Source_Matrix[i - 1, j];
fg_x[2] = Source_Matrix[i, j];
fg_d[1] = D_destination[1, i - 1];
fg_d[2] = D_destination[1, i];
double[] Return_FG_result = new double[4];
Return_FG_result = FastGivens(fg_x, fg_d);
alpha = Return_FG_result[0];
beta = Return_FG_result[1];
                    type = System.Convert.ToInt32(Return_FG_result[2]);
D_destination[1, i - 1] = fg_d[1];
D_destination[1, i] = fg_d[2];
double[,] tempA = new double[3, n - j + 1 + 1];
double[,] temp_bigm = new double[m + 1, 3];
                    if (type == 1)
                    {
                        for (k = j; k <= n; k++)
                        {
                            tempA[1, -j + 1 + k] = beta * Source_Matrix[i - 1, k] + Source_Matrix[i, k];
                            tempA[2, -j + 1 + k] = Source_Matrix[i - 1, k] + alpha * Source_Matrix[i, k];
                        }
                        for (k = 1; k <= m; k++)
                        {
                            temp_bigm[k, 1] = M_destination[k, i - 1] * beta + M_destination[k, i];
                            temp_bigm[k, 2] = M_destination[k, i - 1] + M_destination[k, i] * alpha;
                        }
                    }
                    else
                    {
                        for (k = j; k <= n; k++)
                        {
                            tempA[1, -j + 1 + k] = Source_Matrix[i - 1, k] + beta * Source_Matrix[i, k];
                            tempA[2, -j + 1 + k] = alpha * Source_Matrix[i - 1, k] + Source_Matrix[i, k];
                        }
                        for (k = 1; k <= m; k++)
                        {
                            temp_bigm[k, 1] = M_destination[k, i - 1] + M_destination[k, i] * beta;
                            temp_bigm[k, 2] = M_destination[k, i - 1] * alpha + M_destination[k, i];
                        }
                    }
for (k = 1; k <= n - j + 1; k++)
{
Source_Matrix[i - 1, j + k - 1] = tempA[1,
k];
Source_Matrix[i, j + k - 1] = tempA[2, k];
}
for (k = 1; k <= m; k++)
{
M_destination[k, i - 1] = temp_bigm[k, 1];
M_destination[k, i] = temp_bigm[k, 2];
}
}
}
}
T_destination = Source_Matrix;
if (null != done) done();
}
Consider fitting a quadratic to six observations. The coefficient matrix and observation vector are

    | 1  1   1 |        |  -2 |
    | 1  2   4 |        |  -9 |
A = | 1  3   9 |,   b = | -22 |
    | 1  4  16 |        | -41 |
    | 1  5  25 |        | -66 |
    | 1  6  36 |        | -97 |

giving the overdetermined system

| 1  1   1 |          |  -2 |
| 1  2   4 | | x1 |   |  -9 |
| 1  3   9 | | x2 | = | -22 |
| 1  4  16 | | x3 |   | -41 |
| 1  5  25 |          | -66 |
| 1  6  36 |          | -97 |

in the formula Ax ≈ b. Remember that the Euclidean norm of a vector is the square root of the sum of the squares of its elements. If we determine x such that the norm ‖Ax − b‖₂ is minimized, then that x is the best least squares fit for our data.
It need not be the case that the A matrix is a Vandermonde or any other structured
matrix. Column one could be ambient temperature, column two could be relative
humidity, and column three could be pounds of corn starch added. The observation vector
could be viscosity of the fudge batch at a candy factory.
So, how do we find the vector x that minimizes the residual norm? We solve
Ax b , giving each row in A and b (each set of simultaneous equations) equal
consideration.
To characterize the minimizing x, multiply both sides of Ax = b by AT to obtain the normal equations ATAx = ATb; when ATA is nonsingular, x = (ATA)^-1 ATb. Compare the elegance of this matrix computation development with the tedious development given for normal equations techniques in most statistics texts. The matrix approach is one of simple beauty.
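As a hedged, self-contained Python sketch (0-based indexing), the normal equations can be formed and solved directly; it is checked below against the quadratic-fit data from the example above (a plain Gaussian elimination stands in for the Cholesky solve used in the text, since ATA is symmetric positive definite and needs no pivoting):

```python
def normal_equations(a, b):
    """Least squares via the normal equations: form AtA and Atb, then
    solve the small n-by-n system by Gaussian elimination."""
    m, n = len(a), len(a[0])
    ata = [[sum(a[k][i] * a[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    atb = [sum(a[k][i] * b[k] for k in range(m)) for i in range(n)]
    # forward elimination (no pivoting: AtA is symmetric positive definite)
    for k in range(n):
        for i in range(k + 1, n):
            f = ata[i][k] / ata[k][k]
            for j in range(k, n):
                ata[i][j] -= f * ata[k][j]
            atb[i] -= f * atb[k]
    # back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (atb[i] - sum(ata[i][j] * x[j] for j in range(i + 1, n))) / ata[i][i]
    return x
```

The quadratic data above is fit exactly by x = (-1, 2, -3).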
A procedure based on the ideas discussed above, but which avoids an explicit
inverse through a Cholesky factorization ( AT A is symmetric positive definite), is
encapsulated below.
public double[,] Normal_Ax_b_by_Cholesky_onestep(double[,] A_Matrix, double[,] b)
{
    // Golub/Van Loan 238 Solves system using normal equations by Cholesky method
    int i = 0, j = 0;
    int m = A_Matrix.GetLength(0) - 1;
    int n = A_Matrix.GetLength(1) - 1;
    // compute lower triangle of ATA
    double[,] At = null;
    At = Transpose(A_Matrix);
    double[,] c = null;
    c = Matrix_Multiply(At, A_Matrix);
    for (i = 1; i <= n; i++)
    {
        for (j = i + 1; j <= n; j++)
        {
            c[i, j] = 0;
        }
    }
    // compute Result=ATb
    double[,] Result = null;
    Result = Matrix_Multiply(At, b);
    // Cholesky factor C
    double[,] G = null;
    G = Cholesky(c);
    // solve Gy=x(Result) forward sub
    double Temp = 0;
    Result[1, 1] = Result[1, 1] / G[1, 1];
    for (i = 2; i <= n; i++)
    {
        for (j = 1; j <= i - 1; j++)
        {
            Temp = Temp - G[i, j] * Result[j, 1];
        }
        Temp = Temp + Result[i, 1];
        Result[i, 1] = Temp / G[i, i];
        Temp = 0;
    }
    // solve GTx=y backward sub
    double[,] Gtrans = null;
    Gtrans = Transpose(G);
    Temp = 0;
    Result[n, 1] = Result[n, 1] / Gtrans[n, n];
    for (i = n - 1; i >= 1; i += -1)
    {
        for (j = n; j >= i + 1; j += -1)
        {
            Temp = Temp - Gtrans[i, j] * Result[j, 1];
        }
        Temp = Temp + Result[i, 1];
        Result[i, 1] = Temp / Gtrans[i, i];
        Temp = 0;
    }
    if (null != done) done();
    return Result;
}
As written, this method does not produce a reusable factor. The algorithm from which it is derived requires (m + n/3)n² FLOPs.
There is an important shortcoming in the normal equations method. Any time we
explicitly form the product of a matrix and its transpose, we risk a severe loss of
information (see Watkins, exercise 3.5.25). A more consistently accurate technique
involves the QR factorization.
Note in the above function that although we have a complete m × m Q, we only use the first n columns (a "thin" Q) to form QTb.
Is explicit formation of the Q matrix necessary for solving Ax = QRx = b? Can we somehow use our encoded matrix directly to obtain a solution? Yes we can, and our approach will exact a computational effort savings beyond that gained by eliminating the explicit formation of Q and R. The Householder vectors stored in the lower triangle of our encoded matrix represent a factored form of Q that can be used directly to obtain QTb in a matrix-vector rather than a matrix-matrix multiplication. A similar argument can be made for the Givens QR procedure. There are two functions below. One takes an encoded Householder QR and b as arguments, and another takes an encoded Givens QR and b as arguments.
Householder:
public double[,] Householder_Encoded_QR_Ax_b(double[,] Encoded_QR, double[,] b)
{
    // Golub/Van Loan pp 259 and 239 Given encoded QR and b, solve for x. Does not overwrite QR or b.
    int i = 0, j = 0;
    int m = Encoded_QR.GetLength(0) - 1;
    int n = Encoded_QR.GetLength(1) - 1;
    double[,] QTb = new double[m + 1, 2];
    Array.Copy(b, QTb, b.Length);
    for (j = 1; j <= n; j++)
    {
        // rebuild the jth Householder vector from the encoded lower triangle
        double[,] v = new double[m + 1, 2];
        v[j, 1] = 1;
        for (i = j + 1; i <= m; i++)
        {
            v[i, 1] = Encoded_QR[i, j];
        }
        // apply (I - beta*v*vT) to QTb
        double vTb = 0;
        for (i = j; i <= m; i++)
        {
            vTb = vTb + v[i, 1] * QTb[i, 1];
        }
        for (i = j; i <= m; i++)
        {
            QTb[i, 1] = QTb[i, 1] - beta[j] * v[i, 1] * vTb;
        }
    }
    // back calculate Rx = QTb; R is the upper triangle of Encoded_QR
    double[,] x = new double[n + 1, 2];
    x[n, 1] = QTb[n, 1] / Encoded_QR[n, n];
    for (i = n - 1; i >= 1; i += -1)
    {
        double temp = QTb[i, 1];
        for (j = i + 1; j <= n; j++)
        {
            temp = temp - Encoded_QR[i, j] * x[j, 1];
        }
        x[i, 1] = temp / Encoded_QR[i, i];
    }
    if (null != done) done();
    return x;
}
Givens:
public double[,] Givens_Encoded_QR_Ax_b(double[,] Encoded_QR, double[,] b)
{
    int m = Encoded_QR.GetLength(0) - 1;
    int n = Encoded_QR.GetLength(1) - 1;
    int i = 0, j = 0;
    double[,] x = new double[n + 1, 2];
    double[] cs = new double[4];
    double[,] QTb = new double[m + 1, 2];
    Array.Copy(b, QTb, b.Length);
    for (j = 1; j <= n; j++)
    {
        for (i = m; i >= j + 1; i += -1)
        {
            if (Encoded_QR[i, j] != 0) // Don't bother if rho is 0. It's just multiplying by one.
            {
                if (Encoded_QR[i, j] == 1)
                { // rho decoder: recover c and s
                    cs[1] = 0;
                    cs[2] = 1;
                }
                else
                {
                    if (System.Math.Abs(Encoded_QR[i, j]) < 1)
                    {
                        cs[2] = 2 * Encoded_QR[i, j];
                        cs[1] = Math.Pow((1 - (cs[2] * cs[2])), 0.5);
                    }
                    else
                    {
                        cs[1] = 2 / Encoded_QR[i, j];
                        cs[2] = Math.Pow((1 - (cs[1] * cs[1])), 0.5);
                    }
                }
                // apply the rotation to QTb
                double temp1 = cs[1] * QTb[i - 1, 1] - cs[2] * QTb[i, 1];
                double temp2 = cs[2] * QTb[i - 1, 1] + cs[1] * QTb[i, 1];
                QTb[i - 1, 1] = temp1;
                QTb[i, 1] = temp2;
            }
        }
    }
    // back calculate Rx = QTb; R is the upper triangle of Encoded_QR
    x[n, 1] = QTb[n, 1] / Encoded_QR[n, n];
    for (i = n - 1; i >= 1; i += -1)
    {
        double temp = QTb[i, 1];
        for (j = i + 1; j <= n; j++)
        {
            temp = temp - Encoded_QR[i, j] * x[j, 1];
        }
        x[i, 1] = temp / Encoded_QR[i, i];
    }
    if (null != done) done();
    return x;
}
A least squares solution can also be obtained directly with fast Givens transformations, zeroing A and accumulating QTb as we go:
public double[,] FastGivens_Ax_b(double[,] A, double[,] b)
{
    // Least squares by fast Givens; overwrites A
    int m = A.GetLength(0) - 1;
    int n = A.GetLength(1) - 1;
    int i = 0, j = 0, k = 0;
    double[,] Result = new double[n + 1, 2];
    double[,] QTb = new double[m + 1, 2];
    Array.Copy(b, QTb, b.Length);
    double[] d = new double[m + 1];
    for (i = 1; i <= m; i++)
{
d[i] = 1;
}
for (j = 1; j <= n; j++)
{
for (i = m; i >= j + 1; i += -1)
{
if (System.Math.Abs(A[i, j]) > Givens_Zero_Value)
{ // No need to zero if already zero.
double alpha = 0;
double beta = 0;
int type = 0;
double[] fg_x = new double[4];
double[] fg_d = new double[4];
fg_x[1] = A[i - 1, j];
fg_x[2] = A[i, j];
fg_d[1] = d[i - 1];
fg_d[2] = d[i];
double[] Return_FG_result = new double[4];
Return_FG_result = FastGivens(fg_x, fg_d);
alpha = Return_FG_result[0];
beta = Return_FG_result[1];
type =
System.Convert.ToInt32(Return_FG_result[2]);
d[i - 1] = fg_d[1];
d[i] = fg_d[2];
double[,] tempA = new double[4, n - j + 1 + 1];
double temp1 = 0;
double temp2 = 0;
if (type == 1)
{
for (k = j; k <= n; k++)
{
tempA[1, -j + 1 + k] = beta * A[i - 1,
k] + A[i, k];
tempA[2, -j + 1 + k] = A[i - 1, k] +
alpha * A[i, k];
}
// compute Q transpose b
temp1 = QTb[i - 1, 1] * beta + QTb[i, 1];
temp2 = QTb[i - 1, 1] + QTb[i, 1] * alpha;
QTb[i - 1, 1] = temp1;
QTb[i, 1] = temp2;
}
else
{
for (k = j; k <= n; k++)
{
tempA[1, -j + 1 + k] = A[i - 1, k] +
beta * A[i, k];
tempA[2, -j + 1 + k] = alpha * A[i - 1,
k] + A[i, k];
}
// compute Q transpose b
temp1 = QTb[i - 1, 1] + QTb[i, 1] * beta;
temp2 = QTb[i - 1, 1] * alpha + QTb[i, 1];
QTb[i - 1, 1] = temp1;
QTb[i, 1] = temp2;
}
for (k = 1; k <= n - j + 1; k++)
{
A[i - 1, j + k - 1] = tempA[1, k];
A[i, j + k - 1] = tempA[2, k];
}
}
}
}
    Result[n, 1] = QTb[n, 1] / A[n, n]; // solve for result by back calculation
for (i = n - 1; i >= 1; i += -1)
{
for (j = n; j >= i + 1; j += -1)
{
Result[i, 1] = Result[i, 1] - A[i, j] * Result[j,
1];
}
Result[i, 1] = Result[i, 1] + QTb[i, 1];
Result[i, 1] = Result[i, 1] / A[i, i];
}
if (null != done) done();
return Result;
}
To use the MGS (modified Gram-Schmidt) approach we first augment A with b. The following function joins two matrices side by side:
public double[,] MakeAugmentedArray(double[,] MatrixA, double[,] MatrixB)
{
    int mA = MatrixA.GetLength(0) - 1;
    int nA = MatrixA.GetLength(1) - 1;
    int mB = MatrixB.GetLength(0) - 1;
    int nB = MatrixB.GetLength(1) - 1;
    int i = 0, j = 0;
if (mA != mB)
{
        throw (new ApplicationException("A & B must have equal number of rows"));
}
double[,] result = new double[mA + 1, nA + nB + 1];
for (i = 1; i <= mA; i++)
{
for (j = 1; j <= nA; j++)
{
result[i, j] = MatrixA[i, j];
}
}
for (i = 1; i <= mA; i++)
{
for (j = nA + 1; j <= nA + nB; j++)
{
result[i, j] = MatrixB[i, j - nA];
}
}
if (null != done) done();
return result;
}
Then we need to submit that augmented matrix to the MGS function and by back
calculation solve for x. We will need a function for back calculation:
public double[,] Back_calculate_x(double[,] Upper_Triangular_Matrix, double[,] b)
{
    int n = Upper_Triangular_Matrix.GetLength(1) - 1;
    double[,] local_b = new double[n + 1, 2];
    Array.Copy(b, local_b, b.Length);
    double temp = 0;
    int i = 0, j = 0;
    local_b[n, 1] = local_b[n, 1] / Upper_Triangular_Matrix[n, n];
    for (i = n - 1; i >= 1; i += -1)
    {
        for (j = n; j >= i + 1; j += -1)
        {
            temp = temp - Upper_Triangular_Matrix[i, j] * local_b[j, 1];
        }
        temp = temp + local_b[i, 1];
        local_b[i, 1] = temp / Upper_Triangular_Matrix[i, i];
        temp = 0;
    }
    if (null != done) done();
    return local_b;
}
An Example
Consider the well conditioned system Ax=b where:
[A 6 × 3 coefficient matrix A and observation vector b are displayed here, followed by the encoded Householder QR factorization of A and the R matrix parsed from its upper triangle.]
[The encoded Givens QR factorization of the same system is displayed here, followed by its decoded R and by the fast Givens factors M, T, and D.]
Also, Q = M D^(-1/2) and R = D^(-1/2) T are the same matrices as obtained with the standard Givens procedure.
This demonstration will involve implementing our algorithm for solving for x in Ax=b using the MGS approach discussed above. First we need to form the augmented matrix Ab and factor it:
private void factor_MGS_Click(object sender, EventArgs e)
{
C = solver.MakeAugmentedArray(A, B);
solver.MGS(C, ref D, ref E); //Q1=Q1qnplus1 and R1=R1z
}
Lets see the results from our previous example using this MGS approach:
        [Displays of Ab, Q1qn+1, and R1z; the matrix layouts were lost in extraction]
x = {0.95, 1.9, 2.85, 4.75}^T
To evaluate timing we need test matrices; different procedures will perform with
higher relative efficiency depending on the test matrix structure. In specific applications,
the choice of procedure should take these differences into account.
Our first test matrix is a thoroughly populated (no zero elements) 200 x 200
element symmetric positive definite matrix. This will give us a chance to relate timings
        [three-column matrix display; layout lost in extraction; final row: 4 9 13]
The third column is not linearly independent of columns one and two (it is their
sum). The Rank(A) is the number of linearly independent columns or rows of A. To
deal with cases where Rank(A) may be less than n we need a factorization
procedure that discovers and appropriately deals with this rank deficiency.
Factorization Methods
Householder QR with Column Pivoting
One solution is a Householder QR decomposition with column pivoting. The
procedure begins by determining the Euclidean norm of each column, then exchanges
columns to move the column with the highest norm into the first position. Then the
Householder matrix and the intermediary Q and R matrices are determined as usual. Next
we re-evaluate the norms for the submatrix below the first row and to the right of the first
column. Again, the columns (the whole columns) are exchanged. The value of the highest
norm is stored in a variable, tau. Tau is compared with a machine-error multiple that we
can set to some fixed value. If tau is greater than the threshold the process repeats,
otherwise it terminates. (When an orthogonal transformation is applied to two linearly
dependent columns, their rows zero in a similar fashion, giving rise to a zero norm
calculation (in exact arithmetic) for the second of the two when it is encountered.) The
number of iterations of the process is stored in an index called QRRank. All of the
column swapping is recorded in a permutation matrix.
Recall that for a simple QR factorization Q^T A = R. When we perform a QR with
column pivoting as detailed above, we obtain a factorization with the following structure:

    Q^T A P = R = [ R11  R12 ]   rows 1 to Rank
                  [  0    0  ]   rows Rank+1 to m
                   cols: 1 to Rank | Rank+1 to n

where P is the column permutation matrix.
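To make the pivot-selection step concrete, here is a minimal sketch of computing the squared column norms and choosing the next pivot. This is a hypothetical stand-alone helper for illustration only (zero-based indexing, unlike the one-based convention used throughout the book's library); the actual implementation below folds this logic into the factorization loop.

```csharp
using System;

// Squared Euclidean norms of columns startCol..n-1; the column with the
// largest norm (tau) becomes the next pivot. Illustrative helper only.
static int SelectPivotColumn(double[,] A, int startCol, out double tau)
{
    int m = A.GetLength(0), n = A.GetLength(1);
    tau = 0.0;
    int pivot = startCol;
    for (int j = startCol; j < n; j++)
    {
        double c = 0.0;
        for (int i = 0; i < m; i++) c += A[i, j] * A[i, j]; // squared column norm
        if (c > tau) { tau = c; pivot = j; }
    }
    return pivot;
}

var A = new double[,] { { 1, 0, 3 }, { 0, 2, 4 } };
int pivot = SelectPivotColumn(A, 0, out double tau);
Console.WriteLine($"pivot = {pivot}, tau = {tau}"); // pivot = 2, tau = 25
```

In the full procedure, tau would then be compared against the machine-error threshold to decide whether another iteration is warranted.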
}
}
while (tau > Machine_Error * fnorm)
{
QRRank = QRRank + 1;
// column swap
if (k != QRRank)
{
double[] ColTemp = new double[m + 1];
double cTemp = 0;
cTemp = c[k];
c[k] = c[QRRank];
c[QRRank] = cTemp;
for (i = 1; i <= m; i++)
{// for the Source_Matrix matrix
ColTemp[i] = Source_Matrix[i, QRRank];
Source_Matrix[i, QRRank] = Source_Matrix[i, k];
Source_Matrix[i, k] = ColTemp[i];
}
int temp = 0;
temp = QR_Encoded_P[k]; // for the permutation matrix
QR_Encoded_P[k] = QR_Encoded_P[QRRank];
QR_Encoded_P[QRRank] = temp;
}
// Convert the rankth column to a 1-dimensional vector for submission to House.
double[] vector = new double[m - QRRank + 1 + 1];
double[,] v = new double[m + 1, 2];
for (i = 1; i <= vector.Length - 1; i++)
{
vector[i] = Source_Matrix[QRRank + i - 1, QRRank];
}
// Submit it
vector = House(vector);
// Convert the result to a column vector with leading zeros up to the QRRank row.
for (i = 1; i <= vector.Length - 1; i++)
{
v[QRRank + i - 1, 1] = vector[i];
}
beta[QRRank] = HOUSE_BETA;
double[,] omegaT = null;
omegaT = Matrix_Multiply(v, Scalar_multiply(Transpose(Matrix_Multiply(Transpose(Source_Matrix), v)), beta[QRRank]));
for (i = QRRank; i <= m; i++)
{
for (k = QRRank; k <= n; k++)
{
Source_Matrix[i, k] = Source_Matrix[i, k] - omegaT[i, k];
}
}
double[,] omega = null;
omega = Scalar_multiply(Matrix_Multiply(Matrix_Multiply(Q, v), Transpose(v)), beta[QRRank]);
for (i = 1; i <= m; i++)
{
for (k = 1; k <= m; k++)
{
Q[i, k] = Q[i, k] - omega[i, k];
}
}
for (i = QRRank + 1; i <= n; i++)
{ // Golub/Van Loan p. 249
c[i] = c[i] - (Source_Matrix[QRRank, i] *
Source_Matrix[QRRank, i]);
}
if (QRRank < n)
{
tau = Machine_Error * fnorm;
for (i = QRRank + 1; i <= n; i++)
{
if (c[i] > tau)
{
tau = c[i];
k = i;
}
}
}
else
{
tau = 0;
}
}
if (null != done) done();
return Q;
}
Just as we did for the QR simple algorithm, we can modify the column pivot QR
procedure to give an encoded QR factorization where the upper triangle is R and the
elements below the principal diagonal are the essential elements of the Householder
vectors. We will also have to return the permutation matrix.
public double[,] Householder_QR_Pivot_encoded(double[,] Source_Matrix)
{
// Encoded column pivot QR adapted from Golub/Van Loan 5.4.1. Forms the encoded QR without explicit formation of the Householder matrices.
// Overwrites Source_Matrix with the encoded QR and also returns it. Modifies QR_Encoded_P.
int m = Source_Matrix.GetLength(0) - 1;
int n = Source_Matrix.GetLength(1) - 1;
int i = 0, j = 0, k = 0;
double[] c = new double[n + 1];
QR_Encoded_P = new int[n + 1];
{
double[] ColTemp = new double[m + 1];
double cTemp = 0;
cTemp = c[k];
c[k] = c[QRRank];
c[QRRank] = cTemp;
for (i = 1; i <= m; i++)
{
ColTemp[i] = Source_Matrix[i, QRRank];
Source_Matrix[i, QRRank] = Source_Matrix[i, k];
Source_Matrix[i, k] = ColTemp[i];
}
int temp = 0;
temp = QR_Encoded_P[k]; // for the permutation matrix
QR_Encoded_P[k] = QR_Encoded_P[QRRank];
QR_Encoded_P[QRRank] = temp;
}
j = QRRank;
for (i = m; i >= j + 1; i += -1)
{
if (System.Math.Abs(Source_Matrix[i, j]) >
Givens_Zero_Value)
{ // If it's already zero we won't zero it.
double[,] tempA = new double[3, n - j + 1 + 1];
double[,] temp_Q = new double[m + 1, 3];
cs = Givens(Source_Matrix[i - 1, j],
Source_Matrix[i, j]);
// accumulate Source_Matrix
for (h = j; h <= n; h++)
{
tempA[1, -j + 1 + h] = cs[1] *
Source_Matrix[i - 1, h] - cs[2] * Source_Matrix[i, h];
tempA[2, -j + 1 + h] = cs[2] *
Source_Matrix[i - 1, h] + cs[1] * Source_Matrix[i, h];
}
for (h = 1; h <= n - j + 1; h++)
{
Source_Matrix[i - 1, j + h - 1] = tempA[1,
h];
Source_Matrix[i, j + h - 1] = tempA[2, h];
}
// accumulate Q
for (h = 1; h <= m; h++)
{
temp_Q[h, 1] = cs[1] * Q[h, i - 1] - cs[2]
* Q[h, i];
temp_Q[h, 2] = cs[2] * Q[h, i - 1] + cs[1]
* Q[h, i];
}
for (h = 1; h <= m; h++)
{
Q[h, i - 1] = temp_Q[h, 1];
Q[h, i] = temp_Q[h, 2];
}
}
}
{
rho = System.Math.Sign(cs[1]) * cs[2] /
2;
}
else
{
rho = 2 * System.Math.Sign(cs[2]) /
cs[1];
}
}
// accumulate R
for (h = j; h <= n; h++)
{
tempA[1, -j + 1 + h] = cs[1] *
Source_Matrix[i - 1, h] - cs[2] * Source_Matrix[i, h];
tempA[2, -j + 1 + h] = cs[2] *
Source_Matrix[i - 1, h] + cs[1] * Source_Matrix[i, h];
}
for (h = 1; h <= n - j + 1; h++)
{
Source_Matrix[i - 1, j + h - 1] = tempA[1,
h];
Source_Matrix[i, j + h - 1] = tempA[2, h];
}
Source_Matrix[i, j] = rho; // put encoded Q below diagonal
}
}
for (i = QRRank + 1; i <= n; i++)
{ // Golub/Van Loan p. 249
c[i] = c[i] - (Source_Matrix[QRRank, i] *
Source_Matrix[QRRank, i]);
}
if (QRRank < n)
{
tau = Machine_Error * fnorm;
for (i = QRRank + 1; i <= n; i++)
{
if (c[i] > tau)
{
tau = c[i];
k = i;
}
}
}
else
{
tau = 0;
}
}
if (null != done) done();
return Source_Matrix;
}
    Q^T A P Z = [ T11  0 ]   rows 1 to Rank
                [  0   0 ]   rows Rank+1 to m
                 cols: 1 to Rank | Rank+1 to n
{
for (j = i; j <= n; j++)
{
TZ[j, i] = A_matrix[i, j];
}
}
TZ = Householder_QR_Simple_encoded(TZ);
TZ = Transpose(TZ); // TZ now holds the left-to-right QR, the upper rank x rank square of which is T11
TZBeta = new double[rank + 1, 2]; // Array for T beta
for (j = 1; j <= rank; j++)
{
TZBeta[j, 1] = beta[j];
}
QRRank = rank; // reset to original QR rank
if (null != done) done();
}
they give have at most Rank(A) non-zero elements. Again, the QR
factorization with column pivoting gives a factorization of the form:
    Q^T A P = R = [ R11  R12 ]   rows 1 to Rank
                  [  0    0  ]   rows Rank+1 to m
                   cols: 1 to Rank | Rank+1 to n

and Q^T b = [ c ]   rows 1 to Rank
            [ d ]   rows Rank+1 to m
If we restrict our attention to the upper-left Rank x Rank block of R (i.e., R11) and
the first Rank rows of Q^T b (i.e., c), our basic solution reduces to R11 x = c (where x is
a Rank-dimensional vector), which we can solve by back calculation. We then append x
with zeros to form an n-dimensional vector [x; 0]. The only thing left to do is multiply the
result by the permutation matrix. So, in a sense, we ignore all but one member of any set
of dependent columns in A. In a set of dependent columns of our coefficient matrix A, the
only one for which the corresponding x multiplier is determined not to be zero is the one
with the initially highest norm.
For explicit QR with column pivoting solutions we have:
public double[,] QR_pivot_Ax_b(double[,] Q, double[,] R,
double[,] b)
{
// Golub/Van Loan pg 259 and 239 Given Q, R, b solve for x
int i = 0, j = 0, k = 0;
int m = Q.GetLength(0) - 1;
int n = R.GetLength(1) - 1;
int Rank = QRRank;
// make QTb. A thin Q is used here.
double[,] Qtb = new double[Rank + 1, 2]; // QTb=c
for (i = 1; i <= Rank; i++)
{
for (k = 1; k <= m; k++)
{
Qtb[i, 1] = Q[k, i] * b[k, 1] + Qtb[i, 1]; // note here that Q(k,i)=QT(i,k)
}
}
// QTAPz=QTb and QTAP is upper triangular, so by back calculation
double[,] z = new double[R.GetLength(1) - 1 + 1, 2];
z[Rank, 1] = Qtb[Rank, 1] / R[Rank, Rank]; // note that the first rank X rank rows and cols of R = QTAP = R11
// multiply by permutation
for (i = 1; i <= n; i++)
{
tempb[QR_Encoded_P[i], 1] = z[i, 1];
}
if (null != done) done();
return tempb;
}
// multiply by permutation
for (i = 1; i <= n; i++)
{
tempb[QR_Encoded_P[i], 1] = z[i, 1];
}
if (null != done) done();
return tempb;
}
Finally, for solutions using an encoded Givens QR with column pivoting we have:
public double[,] Givens_Encoded_QR_Pivot_Ax_b(double[,]
Encoded_QR, double[,] b)
{
int m = Encoded_QR.GetLength(0) - 1;
int n = Encoded_QR.GetLength(1) - 1;
int rank = QRRank;
int i = 0, j = 0;
double[,] x = new double[n + 1, 2];
double[] cs = new double[4];
double[,] QTb = new double[m + 1, 2];
Array.Copy(b, QTb, b.Length);
for (j = 1; j <= rank; j++)
{
for (i = m; i >= j + 1; i += -1)
{
if (Encoded_QR[i, j] != 0) // Don't bother if rho is 0. It's just multiplying by one.
{
if (Encoded_QR[i, j] == 1)
{ // this is the rho decoder section
cs[1] = 0;
cs[2] = 1;
}
else
{
if (System.Math.Abs(Encoded_QR[i, j]) < 1)
{
cs[2] = 2 * Encoded_QR[i, j];
cs[1] = Math.Pow((1 - (cs[2] * cs[2])),
0.5);
}
else
101
{
cs[1] = 2 / Encoded_QR[i, j];
cs[2] = Math.Pow((1 - (cs[1] * cs[1])),
0.5);
}
}
double temp1 = 0; // compute QTb=c
double temp2 = 0;
temp1 = cs[1] * QTb[i - 1, 1] - cs[2] * QTb[i,
1];
temp2 = cs[2] * QTb[i - 1, 1] + cs[1] * QTb[i,
1];
QTb[i - 1, 1] = temp1;
QTb[i, 1] = temp2;
}
}
}
double[,] z = new double[Encoded_QR.GetLength(1) - 1 + 1,
2];
z[rank, 1] = QTb[rank, 1] / Encoded_QR[rank, rank];
for (i = rank - 1; i >= 1; i += -1)
{
for (j = rank; j >= i + 1; j += -1)
{
z[i, 1] = z[i, 1] - Encoded_QR[i, j] * z[j, 1];
}
z[i, 1] = z[i, 1] + QTb[i, 1];
z[i, 1] = z[i, 1] / Encoded_QR[i, i];
}
// z holds unpermuted x
// Multiply by permutation matrix but do not use the multiply function
double[,] tempb = new double[n + 1, 2]; // permutation
for (i = 1; i <= n; i++)
{
tempb[QR_Encoded_P[i], 1] = z[i, 1];
}
if (null != done) done();
return tempb;
}
An example:
Consider the system Ax=b where:
A =
1    1.5   1    2
2    3     3    4
4    6     5    8
5    7.5   4   10
Note that columns 1, 2 and 4 are dependent (column 2 = 1.5 x column 1 and
column 4 = 2 x column 1). This is a rank-2 system, and column 4 has the highest norm of
the three dependent columns.
The Householder QR with column pivoting yields:
        [Displays of QRenc, P, and related matrices; the layouts were lost in extraction]
Again, of the three dependent columns of A, only the column with the highest
norm (column 4) has a non-zero multiplier in x.
    Q^T A P Z = T = [ T11  0 ]   rows 1 to Rank
                    [  0   0 ]   rows Rank+1 to m
                     cols: 1 to Rank | Rank+1 to n

The form of our problem is not AZx=b, it is Ax=b. We can remedy this because
ZZ^T is the identity matrix for an orthogonal Z. We have:

    Q^T A P Z Z^T x = T Z^T x = Q^T b

Now, Z^T x has the form [ w ]  rows 1 to Rank      and Q^T b has the form [ c ]  rows 1 to Rank
                        [ y ]  rows Rank+1 to n                           [ d ]  rows Rank+1 to m.

Over the first Rank elements of our system the equation
becomes T11 w = c, so

    x_LS = Z [ T11^(-1) c ]
             [     0      ]

where c is a vector of the first Rank elements of Q^T b.
In practice we first form c, the first Rank elements of Q^T b. Next we solve the
system T11 w = c for w, where T11 is lower triangular, using forward substitution. Using this
approach we avoid explicitly forming T11^(-1). Next we apply Z to w to obtain the pre-permuted x_ls. Finally, we apply the permutation matrix.
Here are the solution functions for the Householder and Givens variants of the
complete orthogonalization.
public double[,]
Solve_Householder_complete_orthogonal_implicit(double[,] QR_encoded,
double[,] T_encoded, double[,] QR_Beta, double[,] T_beta, double[,] b)
{
// Golub/Van Loan p 250.
int m = QR_encoded.GetLength(0) - 1;
int n = QR_encoded.GetLength(1) - 1;
int rank = T_encoded.GetLength(0) - 1;
double[,] QTb = new double[m + 1, 2];
Array.Copy(b, QTb, b.Length);
int i = 0, j = 0;
for (j = 1; j <= rank; j++)
{ // form QTb implicitly
double[,] v = new double[m + 1, 2];
v[j, 1] = 1;
for (i = j + 1; i <= m; i++)
{
v[i, 1] = QR_encoded[i, j];
}
double[,] omega = new double[2, 2];
omega = Matrix_Multiply(Transpose(QTb), v);
for (i = 1; i <= m; i++)
{
QTb[i, 1] = QTb[i, 1] - 1 * QR_Beta[j, 1] *
omega[1, 1] * v[i, 1]; // This beta(j) is a global
}
}
// here we use substitution to solve T11 X w = c, avoiding the explicit formation of T11 inverse
double[,] w = new double[n + 1, 2];
w[1, 1] = QTb[1, 1] / T_encoded[1, 1];
for (i = 2; i <= rank; i++)
{
for (j = 1; j <= i - 1; j++)
{
w[i, 1] = w[i, 1] - T_encoded[i, j] * w[j, 1];
}
w[i, 1] = w[i, 1] + QTb[i, 1];
w[i, 1] = w[i, 1] / T_encoded[i, i];
}
// form Z(T11)inv c = Zw
m = T_encoded.GetLength(1) - 1;
n = T_encoded.GetLength(0) - 1;
for (j = n; j >= 1; j += -1)
{ // apply Q to w implicitly
public double[,]
Solve_Givens_complete_orthogonal_implicit(double[,] QR_encoded,
double[,] ZT, double[,] b)
{
// Golub/Van Loan p 250.
int m = QR_encoded.GetLength(0) - 1;
int n = QR_encoded.GetLength(1) - 1;
int rank = ZT.GetLength(1) - 1;
double[,] QTb = new double[m + 1, 2];
double[] cs = new double[4];
Array.Copy(b, QTb, b.Length);
int i = 0, j = 0;
for (j = 1; j <= rank; j++)
{ // determine c
for (i = m; i >= j + 1; i += -1)
{
if (QR_encoded[i, j] != 0) // Don't bother if rho is 0. It's just multiplying by one.
{
if (QR_encoded[i, j] == 1)
{ // this is the rho decoder section
cs[1] = 0;
cs[2] = 1;
}
else
{
if (System.Math.Abs(QR_encoded[i, j]) < 1)
{
cs[2] = 2 * QR_encoded[i, j];
else
{
cs[1] = 2 / ZT[i, j];
cs[2] = Math.Pow((1 - (cs[1] * cs[1])),
0.5);
}
}
double temp1 = 0;
double temp2 = 0;
temp1 = cs[1] * w[i - 1, 1] + cs[2] * w[i, 1];
temp2 = cs[2] * -w[i - 1, 1] + cs[1] * w[i, 1];
w[i - 1, 1] = temp1;
w[i, 1] = temp2;
}
}
}
double[,] tempw = new double[n + 1, 2]; // multiply by permutation
for (i = 1; i <= n; i++)
{
tempw[QR_Encoded_P[i], 1] = w[i, 1];
}
if (null != done) done();
return tempw;
}
        [Display of the example system with its QRenc and P; the matrix layouts were lost in extraction]
Rank = 2
xls = {1.241379, 1.862069, 3, 2.482759}^T
        [Displays of QRenc, T, ZTenc, and P; the matrix layouts were lost in extraction]
Rank = 2
xls = {1.241379, 1.862069, 3, 2.482759}^T
The Euclidean norm of our x vector is 4.49, compared to a norm of 5.41 for the
QR-with-pivoting result. Note also that in this case each column of our coefficient matrix
has a corresponding non-zero multiplier in x.
        [Plot: log of the singular values]
It is evident that this matrix has a rank of 3. However, it isn't always this clear.
For example:
A =
9.65   11.22364
7.93    9.443055
7.9     9.411908
3.02    4.267387
3.7     4.999094
        [Plot: log of the singular values]
Formally, the matrix has full rank. Clearly, the SVD provides information that
assists in the appropriate, application-specific assignment of rank.
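The rank-assignment idea can be sketched as a small helper that counts the singular values exceeding a relative threshold, in the spirit of the SVDRANK logic used later in this chapter (the helper name and threshold value here are illustrative, not from the book's library):

```csharp
using System;

// Count singular values exceeding threshold * Frobenius norm of diag(sigma).
// The threshold is an application-specific choice.
static int NumericalRank(double[] sigma, double threshold)
{
    double norm = 0.0;
    foreach (double s in sigma) norm += s * s;
    norm = Math.Sqrt(norm); // Frobenius norm of the diagonal matrix
    int rank = 0;
    foreach (double s in sigma)
        if (Math.Abs(s) >= threshold * norm) rank++;
    return rank;
}

double[] sigma = { 10.0, 5.0, 1e-12 };
int rank = NumericalRank(sigma, 1e-8);
Console.WriteLine(rank); // 2
```

With a tighter threshold the same matrix would be assigned full rank, which is exactly the application-specific judgment discussed above.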
Before we explore implemented algorithms for calculating the SVD we should
discuss a certain amount of background. As promised in Chapter 1, we will be
deliberately qualitative. Other texts are appropriate for an intimate understanding of the
underlying mathematics. We must not lose track of our primary objective: to explore
implementations.
Eigensystems
Consider the equation Av = lambda v where A is a square matrix, v is a non-zero vector,
and lambda is a scalar. It states quite simply that multiplying the vector by the matrix returns a
vector that is a scalar multiple of the original vector. The pair v and lambda form an eigenpair
for the matrix A. For any given m x m matrix A there are m eigenpairs, although the
eigenvalues may be equal in certain cases. The set of eigenvalues for a matrix is termed
the matrix's spectrum. Although the spectrum is unique, the set of associated
eigenvectors is not, because any scalar multiple of an eigenvector is also an eigenvector
( A(2v) = lambda(2v), A(3v) = lambda(3v), etc. ). The fundamental eigenvector has a norm of unity
( ||v|| = 1 ).
We define the characteristic equation of A as det(lambda I - A) = 0. We can
use this formula for simple matrices to calculate eigenvalues. For example:

    A = [ 1  2 ],   lambda I - A = [ lambda-1    -2     ]
        [ 4  3 ]                   [   -4     lambda-3  ]

    det(lambda I - A) = (lambda - 1)(lambda - 3) - 8 = lambda^2 - 4 lambda - 5 = 0

Clearly, this quadratic equation has two roots, -1 and 5. Not so evident is the set
of unit-norm eigenvectors: (-0.7071, 0.7071) and (-0.4472, -0.8944). If we multiply them
out we indeed see that:

    [ 1  2 ] [ -0.7071 ] = -1 [ -0.7071 ]   and   [ 1  2 ] [ -0.4472 ] = 5 [ -0.4472 ]
    [ 4  3 ] [  0.7071 ]      [  0.7071 ]         [ 4  3 ] [ -0.8944 ]     [ -0.8944 ]
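A quick numerical check of these eigenpairs (an illustrative helper, zero-based indexing, not a routine from the book's library) simply measures how far Av is from lambda v:

```csharp
using System;

// Componentwise residual |Av - lambda*v| for a candidate eigenpair.
static double Residual(double[,] A, double[] v, double lambda)
{
    int n = v.Length;
    double r = 0.0;
    for (int i = 0; i < n; i++)
    {
        double avi = 0.0;
        for (int j = 0; j < n; j++) avi += A[i, j] * v[j]; // (Av)_i
        r += Math.Abs(avi - lambda * v[i]);
    }
    return r;
}

var A = new double[,] { { 1, 2 }, { 4, 3 } };
double r1 = Residual(A, new double[] { -0.7071, 0.7071 }, -1.0);
double r2 = Residual(A, new double[] { -0.4472, -0.8944 }, 5.0);
Console.WriteLine($"{r1} {r2}"); // both residuals are essentially zero
```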
has the form of a symmetric tridiagonal matrix, with diagonal a1, ..., am and
off-diagonals b1, ..., bm-1. Given this form, a procedure called the symmetric QR
algorithm is used to iteratively obtain the final result.
The computation of eigenvalues and eigenvectors has value in many applications,
most notably the solution of differential equations. Our objective, however, is to find the
SVD of an m x n matrix, which is not necessarily a square system. However, the iterative
algorithms for solving eigensystems are the fundamental algorithms for computing the
SVD. Recall from our discussion of the method of normal equations that A^T A is square.
It is also symmetric, and it can be reduced to tridiagonal form and submitted to the
symmetric QR algorithm. Recall too that the SVD of A is given by
U^T A V = diag(sigma_1, sigma_2, ..., sigma_n), where the sigma_i are the positive singular values in descending
order and U and V are orthogonal. It is further the case that
V^T (A^T A) V = diag(sigma_1^2, sigma_2^2, ..., sigma_n^2) and U^T (A A^T) U = diag(sigma_1^2, sigma_2^2, ..., sigma_n^2, 0_(n+1), ..., 0_m).
Notice that these results have the conspicuous look of Schur decompositions for A^T A and
A A^T. So, we could form A^T A, then use reduction to tridiagonal form followed by the QR
algorithm to find V. We can then apply our QR with column pivoting procedure to AV to
arrive at U^T (AV) P = R. All we need to do at that point is organize the singular values
so that they are positive and descending. Corresponding adjustments in U and V will
of course need to be made.
That is about as far into the theory as we're going to go here, except to say that
certain enhancements to the procedure will be made. For example, we know that forming
A^T A leads to a serious loss of information, so we will use a bidiagonalization procedure
that will enable shortcutting the formation of A^T A and reducing it to tridiagonal form.
Another enhancement involves the use of a convergence-enhancing process called a
Wilkinson shift. The processes and procedures underlying the algorithm we will
implement are fascinating, and you are definitely encouraged to read more (see
Golub/Van Loan, Chapter 8).
the SVD of A is A = U Sigma V^T.
For the system Ax=b, Ax = (U Sigma V^T) x = b. If we consider U and V to be sets of
column vectors u_i and v_i, the least squares solution can be written as the summation

    x_ls = sum from i = 1 to rank of (u_i^T b / sigma_i) v_i

Another way of approaching the solution employs a device called the pseudo-inverse. The pseudo-inverse, designated A^+, is given by A^+ = V Sigma^+ U^T, where
Sigma^+ = diag(1/sigma_1, ..., 1/sigma_r, 0_(r+1), ..., 0_n). Here x_ls = A^+ b.
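The summation form can be sketched as follows, assuming the SVD factors are already in hand (illustrative names and zero-based indexing; the book's library routines for this appear later in the chapter):

```csharp
using System;

// x = sum_{i=1}^{rank} (ui^T b / sigma_i) vi, given U, sigma, and V.
static double[] SvdSumSolve(double[,] U, double[] sigma, double[,] V, double[] b, int rank)
{
    int m = U.GetLength(0), n = V.GetLength(0);
    double[] x = new double[n];
    for (int i = 0; i < rank; i++)
    {
        double uiTb = 0.0;
        for (int k = 0; k < m; k++) uiTb += U[k, i] * b[k]; // ui^T b
        double coeff = uiTb / sigma[i];
        for (int k = 0; k < n; k++) x[k] += coeff * V[k, i]; // accumulate coeff * vi
    }
    return x;
}

// With U = V = I, A = diag(2, 4), so Ax = {2, 8} gives x = {1, 2}.
var U = new double[,] { { 1, 0 }, { 0, 1 } };
var V = new double[,] { { 1, 0 }, { 0, 1 } };
double[] x = SvdSumSolve(U, new double[] { 2, 4 }, V, new double[] { 2, 8 }, 2);
Console.WriteLine($"{x[0]} {x[1]}"); // 1 2
```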
Bidiagonalization
Here we will factor an m x n matrix to bidiagonal form using Givens
transformations. Householder transformations work equally well and theoretically are
BD is upper bidiagonal, with diagonal b1, ..., bn and superdiagonal c1, ..., cn-1
(and zeros below row n), and Ubd (m x m) as well as Vbd (n x n) are orthogonal.
        [code listing fragment; the left portion of these lines was lost in extraction]
We can store the Givens rotators and avoid the explicit formation of Ubd with the
following implicit version:
public void Bidiagonalize_implicit_Givens(double[,]
Source_Matrix, ref double[,] bd_encoded, ref double[,] Vbd)
{
// Golub/Van Loan 5.4.2. Does not overwrite Source_matrix.
int i = 0, j = 0, k = 0, m = 0, n = 0;
double rho = 0;
if (Source_Matrix.GetLength(1) - 1 >
Source_Matrix.GetLength(0) - 1)
{
m = Source_Matrix.GetLength(1) - 1; // allow for underdetermined system
}
else
{
m = Source_Matrix.GetLength(0) - 1;
}
n = Source_Matrix.GetLength(1) - 1;
bd_encoded = new double[m + 1, n + 1];
Vbd = new double[n + 1, n + 1];
Array.Copy(Source_Matrix, bd_encoded,
Source_Matrix.Length);
for (i = 1; i <= n; i++)
{
Vbd[i, i] = 1;
}
We can speed up the bidiagonalization process when m >> n with R-bidiagonalization (see Chan). With R-bidiagonalization, we perform a preliminary QR
factorization to do the sub-diagonal zeroing, then submit the upper nxn matrix (R1) to the
regular bidiagonalization method. Urbd is the product of Q and the U from the latter
bidiagonalization, extended to an mxm matrix with the identity matrix, and Vrbd = Vbd. Here
is the implementation:
public
double[,] RBD,
{
//
Householder QR
//
int m = 0, n = 0, i = 0, j = 0, k = 0;
m = Source_Matrix.GetLength(0) - 1;
n = Source_Matrix.GetLength(1) - 1;
RBD = new double[m + 1, n + 1];
URBD = new double[m + 1, m + 1];
// QR Factor
double[,] Q = null;
double[,] R = null;
Q = Householder_QR_Simple(Source_Matrix);
R = Source_Matrix;
// extract nxn square of R
double[,] R1 = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{
R1[i, j] = R[i, j];
}
}
// bidiagonalize that square
double[,] BD = null;
double[,] UBD = null;
Bidiagonalize_explicit_Givens(R1, ref BD, ref UBD, ref
VRBD);
Array.Copy(BD, RBD, BD.Length); // conserves length of destination array
for (i = 1; i <= m; i++)
{
for (k = 1; k <= n; k++)
{
for (j = 1; j <= n; j++)
{
URBD[i, j] = Q[i, k] * UBD[k, j] + URBD[i, j];
}
}
}
for (i = 1; i <= m; i++)
{
for (j = n + 1; j <= m; j++)
{
URBD[i, j] = Q[i, j];
}
}
for (i = 1; i <= n; i++)
{
for (k = i + 1; k <= m; k++)
{
RBD[k, i] = 0;
}
}
for (k = 1; k <= n - 2; k++)
{
for (i = k + 2; i <= n; i++)
{
RBD[k, i] = 0;
}
}
The next method takes b as an argument, then implicitly forms and returns Ubd^T b
(UbdTb). It is non-reusable.
public void R_Bidiagonalize_implicit(double[,] Source_Matrix,
double[,] b, ref double[,] RBD, ref double[,] Vbd, ref double[,] UbdTb)
{
// Chan p.72 - 83. Note that Ubd = QUr(extended), so UbdTb = (QUr(extended))^T b = Ur(extended)^T X QTb. Immediate formation of UbdTb.
// Overwrites Source_Matrix. Do not use for underdetermined systems.
int i = 0, j = 0, k = 0, m = 0, n = 0;
m = Source_Matrix.GetLength(0) - 1;
n = Source_Matrix.GetLength(1) - 1;
RBD = new double[m + 1, n + 1];
Vbd = new double[n + 1, n + 1];
UbdTb = new double[n + 1, 2];
// perform QR encoded
Source_Matrix = Givens_QR_Simple_encoded(Source_Matrix);
// break out R1
double[,] R1 = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
for (j = i; j <= n; j++)
{
R1[i, j] = Source_Matrix[i, j];
}
}
// make QTb
double[,] QTb = new double[m + 1, 2];
Array.Copy(b, QTb, b.Length);
double[] cs = new double[4];
for (j = 1; j <= n; j++)
{
for (i = m; i >= j + 1; i += -1)
{
if (Source_Matrix[i, j] != 0)
{ // Don't bother if rho is 0. It's just multiplying by one.
if (Source_Matrix[i, j] == 1)
{ // this is the rho decoder section
cs[1] = 0;
cs[2] = 1;
}
else
{
if (System.Math.Abs(Source_Matrix[i, j]) <
1)
{
cs[2] = 2 * Source_Matrix[i, j];
For many systems we do not need a full Q, because the first n columns of Urbd
will suffice. The following procedure is a U1 truncated R-bidiagonalization:
public void R_Bidiagonalize_U1(double[,] Source_Matrix, ref
double[,] RBD, ref double[,] URBD, ref double[,] VRBD)
{
// Chan p.72 - 83. Overwrites Source_Matrix. Not for use with underdetermined systems.
int m = 0, n = 0, i = 0, j = 0, k = 0;
m = Source_Matrix.GetLength(0) - 1;
n = Source_Matrix.GetLength(1) - 1;
URBD = new double[m + 1, n + 1];
// QR Factor
double[,] Q = null;
double[,] R = null;
Q = Householder_QR_Simple(Source_Matrix);
R = Source_Matrix;
// extract nxn square of R
double[,] R1 = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{
R1[i, j] = R[i, j];
}
}
// bidiagonalize that square
double[,] UBD = null;
Bidiagonalize_explicit_Givens(R1, ref RBD, ref UBD, ref
VRBD);
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
for (k = 1; k <= n; k++)
{
URBD[i, k] = Q[i, j] * UBD[j, k] + URBD[i, k];
}
}
}
for (i = 1; i <= n; i++)
{
for (k = i + 1; k <= n; k++)
{
RBD[k, i] = 0;
}
}
for (k = 1; k <= n - 2; k++)
{
for (i = k + 2; i <= n; i++)
{
RBD[k, i] = 0;
}
}
if (null != done) done();
}
// perform QR encoded
Array.Copy(Source_Matrix, QR, Source_Matrix.Length);
QR = Givens_QR_Simple_encoded(QR);
// break out R1
double[,] R1 = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
for (j = i; j <= n; j++)
{
R1[i, j] = QR[i, j];
}
}
// bidiagonalize r1 explicitly
Bidiagonalize_explicit_Givens(R1, ref R1BD, ref UR1, ref
Vbd);
if (null != done) done();
}
The second procedure takes it to the next level, returning an encoded QR, an
encoded R1bd, and Vbd:
public void R_Bidiagonalize_implicit_2(double[,] Source_Matrix,
ref double[,] QR, ref double[,] R1BD_enc, ref double[,] Vbd)
{
// Chan p.72 - 83. Note that Ubd = QUr(extended). Implicit QR, implicit R BD. Re-usable factorization.
// Do not use for underdetermined systems.
int i = 0, j = 0, m = 0, n = 0;
m = Source_Matrix.GetLength(0) - 1;
n = Source_Matrix.GetLength(1) - 1;
QR = new double[m + 1, n + 1];
// perform QR encoded
Array.Copy(Source_Matrix, QR, Source_Matrix.Length);
QR = Givens_QR_Simple_encoded(QR);
// break out R1
double[,] R1 = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
for (j = i; j <= n; j++)
{
R1[i, j] = QR[i, j];
}
}
// bidiagonalize r1 implicitly
Bidiagonalize_implicit_Givens(R1, ref R1BD_enc, ref Vbd);
if (null != done) done();
}
else
{
cs[1] = 2 / Q_encoded[i, j];
cs[2] = Math.Pow((1 - (cs[1] * cs[1])),
0.5);
}
}
double temp1 = 0; // compute QTb
double temp2 = 0;
temp1 = cs[1] * QTb[i - 1, 1] - cs[2] * QTb[i,
1];
temp2 = cs[2] * QTb[i - 1, 1] + cs[1] * QTb[i,
1];
QTb[i - 1, 1] = temp1;
QTb[i, 1] = temp2;
}
}
}
double[,] UbdTb = new double[n + 1, 2];
Array.Copy(QTb, UbdTb, UbdTb.Length);
for (j = 1; j <= n; j++)
{
for (i = n; i >= j + 1; i += -1)
{
if (R1bd_enc[i, j] != 0)
{ // Don't bother if rho is 0. It's just multiplying by one.
if (R1bd_enc[i, j] == 1)
{ // this is the rho decoder section
cs[1] = 0;
cs[2] = 1;
}
else
{
if (System.Math.Abs(R1bd_enc[i, j]) < 1)
{
cs[2] = 2 * R1bd_enc[i, j];
cs[1] = Math.Pow((1 - (cs[2] * cs[2])),
0.5);
}
else
{
cs[1] = 2 / R1bd_enc[i, j];
cs[2] = Math.Pow((1 - (cs[1] * cs[1])),
0.5);
}
}
double temp1 = 0; // compute QTb
double temp2 = 0;
temp1 = cs[1] * UbdTb[i - 1, 1] - cs[2] *
UbdTb[i, 1];
temp2 = cs[2] * UbdTb[i - 1, 1] + cs[1] *
UbdTb[i, 1];
UbdTb[i - 1, 1] = temp1;
128
UbdTb[i, 1] = temp2;
}
}
}
// create a matrix that holds U in the upper nxn square and extends to mxm with the identity matrix
// Ubd = Q X (Urextended), so UbdTb = (Urextended)Transpose X QTb
double[,] result = new double[n + 1, 2];
for (i = 1; i <= n; i++)
{
result[i, 1] = UbdTb[i, 1];
}
if (null != done) done();
return result;
}
There are certainly many other variants of the procedures given above. There are
also cases where only the diagonal vector of singular values is of interest, so we can
eliminate accumulation and storage of the U and V matrices with substantial savings in
processing overhead and storage. Also, we are using an entire n x n array to store the
bidiagonal matrix. Doing this makes the implementation substantially easier to follow
than the more efficient practice of storing this matrix as two vectors.
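The two-vector storage scheme just mentioned can be sketched as follows (an illustrative helper with zero-based indexing, not the layout the book's code actually uses):

```csharp
using System;

// Pack an n x n upper-bidiagonal matrix into a diagonal vector d and a
// superdiagonal vector e, discarding the structural zeros.
static void Pack(double[,] B, out double[] d, out double[] e)
{
    int n = B.GetLength(0);
    d = new double[n];
    e = new double[n - 1];
    for (int i = 0; i < n; i++) d[i] = B[i, i];         // diagonal
    for (int i = 0; i < n - 1; i++) e[i] = B[i, i + 1]; // superdiagonal
}

var B = new double[,] { { 1, 5, 0 }, { 0, 2, 6 }, { 0, 0, 3 } };
Pack(B, out double[] d, out double[] e);
Console.WriteLine(string.Join(",", d) + " | " + string.Join(",", e)); // 1,2,3 | 5,6
```

This cuts storage from n^2 elements to 2n - 1, at the cost of slightly less readable index arithmetic in the iteration loops.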
{
break;
}
}
if (q < n)
{
// if any diagonal of B22 is zero then zero the superdiagonal entry above it
bool FoundZero = false;
for (h = p; h <= n - q; h++)
{
for (i = p; i <= n - q; i++)
{
if (Sigma[i, i] == 0)
{
Sigma[i - 1, i] = 0;
FoundZero = true;
}
}
if (FoundZero == true)
{
goto L1; // Here we use the unthinkable goto statement. I think the code is clearer with it.
}
else
{
// apply algo 861 to B22
double[,] b22 = new double[n - p - q + 1 +
1, n - p - q + 1 + 1]; // form it
for (j = 1; j <= n - p - q + 1; j++)
{
for (k = 1; k <= n - p - q + 1; k++)
{
b22[j, k] = Sigma[j + p - 1, k + p
- 1];
}
}
Algo861_explicit(b22, n, p, q, Usvd, Vsvd);
for (j = 1; j <= n - p - q + 1; j++)
{ // sub it back in
for (k = 1; k <= n - p - q + 1; k++)
{
Sigma[j + p - 1, k + p - 1] =
b22[j, k];
}
}
}
}
}
L1: { }
}
// Assign(rank)
double FnormS = Frobenius_norm(Sigma);
SVDRANK = n;
for (i = 1; i <= n; i++)
{
int n = Source_Matrix.GetLength(1) - 1;
double result = 0;
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
result = result + (Math.Pow(Source_Matrix[i, j],
2));
}
}
result = Math.Pow(result, 0.5);
return result;
}
The following two methods are analogs of the above methods, in which the
transformations are applied to Ubd^T b to form Usvd^T (Ubd^T b) = (Ubd Usvd)^T b = U^T b.
}
}
// find the largest Q
q = 0;
for (i = n; i >= 1; i += -1)
{
if (sigma[i - 1, i] == 0)
{
q = q + 1;
}
else
{
break;
}
}
// and smallest P....
p = n - q;
for (i = n - q; i >= 2; i += -1)
{
if (sigma[i - 1, i] != 0)
{
p = p - 1;
}
else
{
break;
}
}
if (q < n)
{
// if any diagonal of B22 is zero then zero the superdiagonal entry above it
bool FoundZero = false;
for (h = p; h <= n - q; h++)
{
for (i = p; i <= n - q; i++)
{
if (sigma[i, i] == 0)
{
sigma[i - 1, i] = 0;
FoundZero = true;
}
}
if (FoundZero == true)
{
goto L1; // Here we use the unthinkable goto statement. I think the code is clearer with it.
}
else
{
// apply algo 861 to B22
double[,] b22 = new double[n - p - q + 1 +
1, n - p - q + 1 + 1]; // form it
for (j = 1; j <= n - p - q + 1; j++)
{
for (k = 1; k <= n - p - q + 1; k++)
{
135
b22[j, k] = sigma[j + p - 1, k + p
- 1];
}
}
Algo861_implicit(b22, n, p, q,
UsvdT_UbdT_b, Vsvd);
for (j = 1; j <= n - p - q + 1; j++)
{ // sub it back in
for (k = 1; k <= n - p - q + 1; k++)
{
sigma[j + p - 1, k + p - 1] =
b22[j, k];
}
}
}
}
}
L1: { }
}
// Assign(rank)
double FnormS = Frobenius_norm(sigma);
SVDRANK = n;
for (i = 1; i <= n; i++)
{
if (System.Math.Abs(sigma[i, i]) <
SVD_Rank_Determination_Threshold * FnormS)
{
SVDRANK = SVDRANK - 1;
}
}
if (null != done) done();
}
}
for (i = 1; i <= B22.GetLength(1) - 1; i++)
{
B22[k, i] = summer[1, i];
B22[k + 1, i] = summer[2, i];
}
if (k < npq - 1)
{
y = B22[k, k + 1];
z = B22[k, k + 2];
}
}
}
double tempU = 0;
tempS = localSigma[i, i];
localSigma[i, i] = localSigma[xcol, xcol];
localSigma[xcol, xcol] = tempS;
for (k = 1; k <= n; k++)
{
tempv = V[k, xcol];
V[k, xcol] = V[k, i];
V[k, i] = tempv;
}
for (k = 1; k <= m; k++)
{
tempU = U[k, xcol];
U[k, xcol] = U[k, i];
U[k, i] = tempU;
}
}
}
Sigma = localSigma;
// make all SVs positive
for (i = 1; i <= n; i++)
{
if (Sigma[i, i] < 0)
{
Sigma[i, i] = -Sigma[i, i];
for (j = 1; j <= m; j++)
{
U[j, i] = -U[j, i];
}
}
}
if (null != done) done();
}
Finally, we can solve the system Ax=b with one of two methods: the pseudo-inverse technique or the summation approach, for either the explicit or implicit case. The
summation methods below do not provide for re-evaluation of rank, but this functionality
could easily be added.
For the explicit case we have:
public double[,] PI_Solve_svd(double[,] BigU, double[,] SIGMA, double[,] BigV, double[,] b)
{ // determine x for Ax=b
// Golub/Van Loan section 5.5.4. Does not overwrite any of the arguments.
int m = b.GetLength(0) - 1;
int n = SIGMA.GetLength(1) - 1;
int i = 0, j = 0;
SVDRANK = n;
double[,] Temp = new double[n + 1, n + 1]; // just the first n rows of sigma
double FnormS = Frobenius_norm(SIGMA);
// form temp = pseudoinverse ie sigma plus
for (i = 1; i <= n; i++)
{
if (System.Math.Abs(SIGMA[i, i]) > SVD_Rank_Determination_Threshold * FnormS)
{
Temp[i, i] = 1 / SIGMA[i, i];
}
else
{
Temp[i, i] = 0;
SVDRANK = SVDRANK - 1;
}
}
// make temp = V X Sigma Plus
Temp = Matrix_Multiply(BigV, Temp);
// make the first n rows of U transpose (a thin U)
double[,] tUT = new double[n + 1, m + 1];
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
tUT[j, i] = BigU[i, j];
}
}
// make V X Sigma Plus X the first n rows of U transpose
double[,] temp2 = new double[n + 1, m + 1];
temp2 = Matrix_Multiply(Temp, tUT);
double[,] Summer = new double[n + 1, 2];
for (i = 1; i <= n; i++)
{
for (j = 1; j <= m; j++)
{
Summer[i, 1] = temp2[i, j] * b[j, 1] + Summer[i, 1];
}
}
if (null != done) done();
return Summer;
}
public double[,] SVD_Axb_by_Sum(double[,] BigU, double[,] Sigma, double[,] BigV, double[,] b)
{
// Golub/Van Loan Theorem 5.5.1. Does not overwrite any of the arguments.
int m = b.GetLength(0) - 1;
int n = BigV.GetLength(1) - 1;
int i = 0, j = 0;
double[] uiTb_over_sigma = new double[n + 1];
double[,] U = null;
double[,] Sigma = null;
double[,] V = null;
Organize_svd(UBD, Usvd, raw_sigma, Vbd, Vsvd, ref U, ref Sigma, ref V);
return PI_Solve_svd(U, Sigma, V, b);
}
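The summation form of Theorem 5.5.1 computes x = sum over i of (u_i^T b / sigma_i) v_i directly. The listing above is abridged; the following is a self-contained sketch of the same idea, not the book's code: it is declared static so it stands alone, and it takes an explicit tolerance argument in place of the class-level threshold.

```csharp
// x = sum_{i=1..n} (u_i^T b / sigma_i) v_i, skipping negligible singular values.
// One-based arrays: U is m x n (thin), Sigma is n x n diagonal, V is n x n, b is m x 1.
public static double[,] Svd_Solve_By_Sum(double[,] U, double[,] Sigma,
    double[,] V, double[,] b, double tol)
{
    int m = b.GetLength(0) - 1;
    int n = V.GetLength(1) - 1;
    double[,] x = new double[n + 1, 2];
    for (int i = 1; i <= n; i++)
    {
        if (System.Math.Abs(Sigma[i, i]) <= tol) continue; // treat as rank deficient
        double uiTb = 0;
        for (int k = 1; k <= m; k++) uiTb += U[k, i] * b[k, 1];
        double coef = uiTb / Sigma[i, i];
        for (int k = 1; k <= n; k++) x[k, 1] += coef * V[k, i];
    }
    return x;
}
```

Because each term is independent, adding a rank test amounts to choosing which terms to skip, which is the re-evaluation of rank mentioned above.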
Examples
We do not have room for a detailed example for all of the implementation
sequences given above. We will first look at the fully explicit Golub-Reinsch SVD
procedure.
[Worked example: A is the 6 x 3 data matrix shown in the original, with observation vector b = (88.5, 121, 152.4, 205.1, 78.9, 58.3)^T. The printed matrices for this run (the bidiagonal factor BD with its orthogonal factors Ubd and Vbd, and the resulting U, Sigma, and V) did not survive extraction.]
Next we will look at the fully explicit Chan R-Bidiagonalization SVD procedure
using the same A and b as above.
[The printed matrices for the Chan run (the RBD factor with URBD and VRBD, and the resulting U, Sigma, and V) did not survive extraction; the leading RBD diagonal entries that remain, 23.27552, 3.73511, and -5.00119, match the Golub-Reinsch bidiagonal up to sign.]
[The encoded QR and R1 bidiagonalization printouts for the implicit procedures did not survive extraction; the recoverable intermediate is U_BD^T * b = (310.539, 30.89647, 6.61E-14, -1.83E-14)^T.]
The vector p norm is defined by ||x||_p = (sum_{i=1..n} |x_i|^p)^(1/p). The most
important cases are ||x||_1 = sum_{i=1..n} |x_i|, ||x||_2 = (sum_{i=1..n} x_i^2)^(1/2), and
||x||_inf = max_i |x_i|. The 2 norm is the Euclidean norm we have previously discussed.
Here are some functions for determining 1, 2 and infinity vector norms:
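The original listings are abridged here; what follows is a minimal sketch consistent with the conventions used elsewhere in this chapter (vectors are held in column 1 of a one-based double[n + 1, 2] array). Vector_Two_Norm and Vector_Infinity_norm are the names referenced by later listings; Vector_One_Norm is an assumed name. The methods are declared static so the sketch stands alone.

```csharp
// One-based vectors: data in column 1 of a double[n + 1, 2] array.
public static double Vector_One_Norm(double[,] v)
{
    int n = v.GetLength(0) - 1;
    double result = 0;
    for (int i = 1; i <= n; i++) result += System.Math.Abs(v[i, 1]);
    return result; // sum of absolute values
}
public static double Vector_Two_Norm(double[,] v)
{
    int n = v.GetLength(0) - 1;
    double result = 0;
    for (int i = 1; i <= n; i++) result += v[i, 1] * v[i, 1];
    return System.Math.Sqrt(result); // Euclidean length
}
public static double Vector_Infinity_norm(double[,] v)
{
    int n = v.GetLength(0) - 1;
    double result = 0;
    for (int i = 1; i <= n; i++)
        if (System.Math.Abs(v[i, 1]) > result) result = System.Math.Abs(v[i, 1]);
    return result; // largest absolute element
}
```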
Matrix norms continue the sequence from absolute values for scalars to vector
norms to arrive at a means of quantifying distance on the space of matrices. We have
already discussed the Frobenius norm: ||A||_F = (sum_{i=1..m} sum_{j=1..n} a_ij^2)^(1/2).
There are also matrix p norms, again the most important of which are the 1, 2, and infinity
matrix norms. The matrix 1 norm is the maximum column sum of absolute values for the
matrix. The infinity norm is the maximum row sum of absolute values for the matrix.
Finally, the matrix 2 norm is the largest singular value for the matrix. The Frobenius norm
should not be confused with the matrix 2 norm. The 1 and infinity norms are easy to
compute. The matrix 2 norm requires an SVD computation; however, a rough estimate
can be obtained from the following inequalities:
(1/sqrt(n)) ||A||_inf <= ||A||_2 <= sqrt(m) ||A||_inf
(1/sqrt(m)) ||A||_1 <= ||A||_2 <= sqrt(n) ||A||_1
||A||_2 <= sqrt(||A||_1 ||A||_inf)
Here are some functions for determining matrix 1, infinity, and F norms:
public double Frobenius_norm(double[,] Source_Matrix)
{
int i = 0, j = 0;
int m = Source_Matrix.GetLength(0) - 1;
int n = Source_Matrix.GetLength(1) - 1;
double result = 0;
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
result = result + (Math.Pow(Source_Matrix[i, j], 2));
}
}
result = Math.Pow(result, 0.5);
return result;
}
public double Matrix_1_norm(double[,] Source_Matrix)
{
int i = 0, j = 0;
int m = Source_Matrix.GetLength(0) - 1;
int n = Source_Matrix.GetLength(1) - 1;
double[,] ColSum = new double[n + 1, 2];
for (j = 1; j <= n; j++)
{
for (i = 1; i <= m; i++)
{
ColSum[j, 1] = ColSum[j, 1] + System.Math.Abs(Source_Matrix[i, j]);
}
}
double result = Vector_Infinity_norm(ColSum);
return result;
}
public double Matrix_infinity_norm(double[,] Source_Matrix)
{
int i = 0, j = 0;
int m = Source_Matrix.GetLength(0) - 1;
int n = Source_Matrix.GetLength(1) - 1;
double[,] RowSum = new double[m + 1, 2];
for (i = 1; i <= m; i++)
{
for (j = 1; j <= n; j++)
{
RowSum[i, 1] = RowSum[i, 1] + System.Math.Abs(Source_Matrix[i, j]);
}
}
double result = Vector_Infinity_norm(RowSum);
return result;
}
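The inequality ||A||_2 <= sqrt(||A||_1 ||A||_inf) turns the two cheap norms into an upper bound on the 2 norm. A sketch of such an estimator (the function name is an assumption, and the 1 and infinity norms are recomputed inline so the sketch stands alone):

```csharp
// Upper bound on the matrix 2 norm: ||A||2 <= sqrt(||A||1 * ||A||inf).
// One-based m x n matrix; avoids the SVD the exact 2 norm would require.
public static double Matrix_2_norm_estimate(double[,] Source_Matrix)
{
    int m = Source_Matrix.GetLength(0) - 1;
    int n = Source_Matrix.GetLength(1) - 1;
    double norm1 = 0, normInf = 0;
    for (int j = 1; j <= n; j++)
    {   // maximum column sum of absolute values
        double colSum = 0;
        for (int i = 1; i <= m; i++) colSum += System.Math.Abs(Source_Matrix[i, j]);
        if (colSum > norm1) norm1 = colSum;
    }
    for (int i = 1; i <= m; i++)
    {   // maximum row sum of absolute values
        double rowSum = 0;
        for (int j = 1; j <= n; j++) rowSum += System.Math.Abs(Source_Matrix[i, j]);
        if (rowSum > normInf) normInf = rowSum;
    }
    return System.Math.Sqrt(norm1 * normInf);
}
```

For A = [1 2; 3 4] this returns sqrt(6 * 7), about 6.48, while the true 2 norm is about 5.46; the bound is loose but never below the true value.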
There are times when an estimate of the matrix 2 norm will not suffice and an
explicit calculation of the largest singular value is required. Our procedures for the SVD
involve not only the determination of the singular values, but computation of U and V as
well. The following three procedures are a Sigma-only truncation of the
bidiagonalization-SVD methods presented in the previous chapter.
public void Bidiagonalize_bd_SigmaOnly(double[,] Source_Matrix, ref double[,] BD)
{
int i = 0, j = 0, k = 0, m = 0, n = 0;
if (Source_Matrix.GetLength(1) - 1 > Source_Matrix.GetLength(0) - 1)
{
m = Source_Matrix.GetLength(1) - 1; // allow for underdetermined system
}
else
{
m = Source_Matrix.GetLength(0) - 1;
}
n = Source_Matrix.GetLength(1) - 1;
BD = new double[m + 1, n + 1];
Array.Copy(Source_Matrix, BD, Source_Matrix.Length);
for (j = 1; j <= n; j++)
{
for (i = m; i >= j + 1; i += -1)
{
if (System.Math.Abs(BD[i, j]) > Givens_Zero_Value)
{ // If it's already zero we won't zero it.
double[,] temp_R = new double[3, n - j + 1 + 1];
double[,] temp_U = new double[m + 1, 3];
double[] cs = new double[3];
cs = Givens(BD[i - 1, j], BD[i, j]);
// accumulate R = BD
for (k = j; k <= n; k++)
{
temp_R[1, -j + 1 + k] = cs[1] * BD[i - 1, k] - cs[2] * BD[i, k];
temp_R[2, -j + 1 + k] = cs[2] * BD[i - 1, k] + cs[1] * BD[i, k];
}
for (k = 1; k <= n - j + 1; k++)
{
BD[i - 1, j + k - 1] = temp_R[1, k];
BD[i, j + k - 1] = temp_R[2, k];
}
}
}
if (j <= n - 2)
{
for (i = n; i >= j + 2; i += -1)
{
double[,] temp_R = new double[m - j + 1 + 1, 3];
double[,] temp_V = new double[n + 1, 3];
double[] cs = new double[3];
cs = Givens(BD[j, i - 1], BD[j, i]);
for (k = j; k <= m; k++)
{
temp_R[-j + 1 + k, 1] = cs[1] * BD[k, i - 1] - cs[2] * BD[k, i];
temp_R[-j + 1 + k, 2] = cs[2] * BD[k, i - 1] + cs[1] * BD[k, i];
}
for (k = 1; k <= m - j + 1; k++)
{
BD[j + k - 1, i - 1] = temp_R[k, 1];
BD[j + k - 1, i] = temp_R[k, 2];
}
}
}
}
if (null != done) done();
for (i = 1; i <= n; i++)
{
for (k = i + 1; k <= m; k++)
{
BD[k, i] = 0;
}
}
for (k = 1; k <= n - 2; k++)
{
for (i = k + 2; i <= n; i++)
{
BD[k, i] = 0;
}
}
}
public void Algo862_sigma_SigmaOnly(double[,] bidiagonalized_sourse_array, ref double[,] Sigma)
{
// Golub/Van Loan 8.6.2. Does not overwrite Source_matrix
int m = bidiagonalized_sourse_array.GetLength(0) - 1;
int n = bidiagonalized_sourse_array.GetLength(1) - 1;
int q = 0, p = 0, h = 0, i = 0, j = 0, k = 0;
Sigma = new double[n + 1, n + 1];
for (i = 1; i <= n; i++)
{
j = i; // copy BD to SVD
while (j < n)
{
Sigma[i, j] = bidiagonalized_sourse_array[i, j];
Sigma[i, j + 1] = bidiagonalized_sourse_array[i, j + 1];
j = j + 1;
}
Sigma[i, j] = bidiagonalized_sourse_array[i, j];
}
while (q < n)
{
// Set bi,i+1 to zero...
for (i = 1; i <= n - 1; i++)
{
{
summer[i, 1] = cs[1] * B22[i, k] - cs[2] * B22[i, k + 1];
summer[i, 2] = cs[2] * B22[i, k] + cs[1] * B22[i, k + 1];
}
for (i = 1; i <= B22.GetLength(0) - 1; i++)
{
B22[i, k] = summer[i, 1];
B22[i, k + 1] = summer[i, 2];
}
y = B22[k, k];
z = B22[k + 1, k];
// determine c&s .. |c/-s s/c|T * |y/z| = |* 0|
cs = new double[3];
Array.Copy(Givens(y, z), cs, cs.Length);
// accumulate U: TempU = TempU * G
// apply Givens transformation to B22: B22 = G^T * B22
summer = new double[3, B22.GetLength(1) - 1 + 1];
for (i = 1; i <= B22.GetLength(1) - 1; i++)
{
summer[1, i] = cs[1] * B22[k, i] - cs[2] * B22[k + 1, i];
summer[2, i] = cs[2] * B22[k, i] + cs[1] * B22[k + 1, i];
}
for (i = 1; i <= B22.GetLength(1) - 1; i++)
{
B22[k, i] = summer[1, i];
B22[k + 1, i] = summer[2, i];
}
if (k < npq - 1)
{
y = B22[k, k + 1];
z = B22[k, k + 2];
}
}
}
A singular matrix has a 2-norm condition number of infinity. The larger the condition
number is, the closer A is to being singular. An orthogonal matrix, one with maximally
independent column (row) vectors, has a 2-norm condition number of unity.
||delta_x|| / ||x|| <= kappa(A) ( ||delta_A|| / ||A|| + ||delta_b|| / ||b|| )
condA = Matrix_infinity_norm(A) * Matrix_infinity_norm(Inverse_by_LU_encoded(A));
return (condA * normdb / normb) / 0.000001;
}
It makes some sense to use such a function to evaluate the effect of perturbations
in b, because the elements of the observation vector are frequently obtained from a
unique measurement device. Measurements within a given device's dynamic range would
have similar relative error. Discovering the effect of perturbations in the coefficient
matrix is a different matter. Here each column would be expected to have its own unique
error profile. So, analysis of the associated perturbation effects is more application-specific.
There are linear systems where poor conditioning can be resolved through scaling.
Recall that we express a linear system
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2
a31 x1 + a32 x2 + a33 x3 = b3
in matrix form as
[a11 a12 a13] [x1]   [b1]
[a21 a22 a23] [x2] = [b2]
[a31 a32 a33] [x3]   [b3].
Any one of the rows in the linear system can be multiplied by a scalar. This equates to
multiplying the corresponding rows in A and b by that scalar. If the coefficients of a
given row are disproportionately low or high with respect to the other rows then the A
matrix may be ill conditioned, but this might be corrected by scaling the row. (It should
be noted here that Golub/Van Loan provides some considerably more sophisticated
scaling ideas for Gaussian elimination.) There is a problem with scaling
disproportionately low rows. These low numerical values are often measurements, and
comparatively low measured values frequently have high relative errors that are
magnified by scaling.
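A small numeric illustration of the scaling point, using the infinity-norm condition number of an invented 2 x 2 system computed from the analytic inverse (the function name and values are purely for demonstration):

```csharp
// Infinity-norm condition number of a 2x2 matrix [a b; c d] via the analytic inverse.
public static double Cond2x2(double a, double b, double c, double d)
{
    double det = a * d - b * c;
    // ||A||inf = maximum row sum of absolute values
    double normA = System.Math.Max(System.Math.Abs(a) + System.Math.Abs(b),
                                   System.Math.Abs(c) + System.Math.Abs(d));
    // inverse = (1/det) [d -b; -c a], so its row sums are (|d|+|b|)/|det|, (|c|+|a|)/|det|
    double normAinv = System.Math.Max(System.Math.Abs(d) + System.Math.Abs(b),
                                      System.Math.Abs(c) + System.Math.Abs(a))
                      / System.Math.Abs(det);
    return normA * normAinv;
}
// Cond2x2(1, 2, 0.001, 0.001) is about 6000: the second row is disproportionately low.
// Scaling that row by 1000 gives Cond2x2(1, 2, 1, 1) = 9, a well conditioned system.
```

The caveat from the text still applies: if 0.001 is a measurement near the bottom of an instrument's range, scaling it up also scales up its relative error.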
Whenever determined linear systems are being solved, it makes sense to evaluate
the condition number to become aware of potential condition errors. Explicit calculation
of relative error due to condition error as explained above provides a tool for determining
whether solutions can be expected to meet application-specific quality objectives.
{
int i = 0;
int m = A.GetLength(0) - 1;
double[,] rHat = new double[m + 1, 2];
double[,] AxHat = new double[m + 1, 2];
AxHat = Matrix_Multiply(A, xHat);
for (i = 1; i <= m; i++)
{
rHat[i, 1] = b[i, 1] - AxHat[i, 1];
}
return Vector_Two_Norm(rHat);
}
||x_hat - x||_2 / ||x||_2 <= 2 kappa_2(A) (||b||_2 / ||Ax||_2) (||delta_b||_2 / ||b||_2)
+ 2 kappa_2(A)^2 (||Ax - b||_2 / ||Ax||_2) (||delta_A||_2 / ||A||_2)
+ 2 kappa_2(A) (||delta_A||_2 / ||A||_2)
the relative error ||x_hat - x|| / ||x||. For determined systems the formula for the maximum
relative error is ||x_hat - x|| / ||x|| <= kappa(A) ||r|| / ||b||, and round off alone contributes
||x_hat - x|| / ||x|| ~ 2 kappa_2(A) u, where u
is machine round off error. To obtain the maximum error for full rank least squares
solutions obtained using orthogonalization methods, we substitute the round off error for
the perturbations in the condition error equation given above to obtain:
||x_hat - x||_2 / ||x||_2 <= 2 kappa_2(A) (||b||_2 / ||Ax||_2) u
+ 2 kappa_2(A)^2 (||Ax - b||_2 / ||Ax||_2) u + 2 kappa_2(A) u.
Again, generalization of this procedure to encompass rank deficient problems is
discussed in Bjorck.
We do not always care about the agreement between our value for x and some
theoretical true x. In fact, we often perform these computations to obtain a so-called
theoretical x in a process called linear regression. In these cases, we are interested in
having a low relative residual, and the goodness of fit statistic is most important.
We have all seen two dimensional graphs of data for seemingly stochastic systems
where a fit line is drawn through a virtual cloud of scattered data. Fit statistics for these
profiles could be in the 500% to 1000% range, yet a clear, predictive function is being
revealed and the computational process is a success. Consideration of the specific
problem at hand is fundamental in error analysis. Augmenting residual analysis and fit
statistics with perturbation experiments is always a good idea.
Furthermore, looking at a problem with a variety of algorithms is frequently quite
revealing. For example, although considerably more expensive computationally,
orthogonalization methods (including the SVD and column pivoting QR procedures) can
be used to solve square systems, resulting in added protection against instability and
providing capture for rank deficiency.
Regression Statistics
Linear regression is a most important application of least squares solutions. Here
we seek a function that describes the relationships between variables, and enables
prediction of one variable from the others.
The values we use for x0, x1, ..., xn, which we shall call estimators, are themselves
estimates. They have a random error associated with them, and that error has a mean of
zero. We can gain insight into the contribution to total error from any given estimator by
determining the standard deviation of the estimator's sampling distribution. One simple
way to accomplish this is to form the diagonal vector of the matrix (A^T A)^-1. The
standard error for the ith estimator is given by s_i = s * sqrt(diag((A^T A)^-1)_i). Here is a
function that returns a vector of the standard errors of the estimators:
public double[,] Standard_error_in_estimators(double[,] A_matrix, double Std_error_estimate)
{
double[,] ATAI = Inverse_by_LU_encoded(Matrix_Multiply(Transpose(A_matrix), A_matrix));
double[,] result = new double[ATAI.GetLength(0) - 1 + 1, 2];
int i = 0;
Correlation
We need a statistic to evaluate how well the regression model fits the data. The
form that is in standard use is the multiple coefficient of correlation. With this statistic a
value of 1 is a perfect fit, and a value of zero indicates no relationship between the data
and the model. We calculate the statistic as follows:
R^2 = 1 - ||residual||_2^2 / sum_i (b_i - bbar)^2,
where bbar is the mean of the elements of the b vector.
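The statistic is a one-pass computation over the residual and observation vectors. A sketch (the function name and the static declaration are assumptions; the one-based m x 1 vector layout matches the other listings):

```csharp
// R^2 = 1 - ||residual||^2 / sum_i (b_i - bbar)^2, with one-based m x 1 vectors.
public static double R_squared(double[,] residual, double[,] b)
{
    int m = b.GetLength(0) - 1;
    double bbar = 0;
    for (int i = 1; i <= m; i++) bbar += b[i, 1];
    bbar /= m; // mean of the observations
    double ssRes = 0, ssTot = 0;
    for (int i = 1; i <= m; i++)
    {
        ssRes += residual[i, 1] * residual[i, 1];       // squared residual norm
        ssTot += (b[i, 1] - bbar) * (b[i, 1] - bbar);   // total sum of squares
    }
    return 1 - ssRes / ssTot;
}
```

A zero residual vector yields R^2 = 1; a residual as large as the spread of b drives R^2 toward zero.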
First we need a function for obtaining t_p. There are many approaches, including
creating a table in code. I like the programmatic approach. Here is a good function
attributed to G.W. Hill. It takes as arguments p and DF. We can calculate p with
p = (100 - confidence level) / 100 (e.g., p for a confidence level of 95% is 0.05).
public double Students_tquantile(double P, double n)
{
// G. W. Hill Algo 396
// This algorithm evaluates the positive quantile at the
// (two-tail) probability level P, for Student's t-distribution with
// n degrees of freedom.
double result = 0;
if (n == 2)
{
result = System.Math.Sqrt(2.0 / (P * (2.0 - P)) - 2.0);
return result;
}
if (n == 1)
{
P = P * (System.Math.PI / 2);
7. Now all of the public methods of the class are available to your application project.
Interfacing Considerations
Global Variables
If you need them, consume global variables the moment they are modified,
because subsequent calls to other procedures may modify them. For example, the
QR_Encoded_P variable is modified by Householder_QR_Pivot,
Householder_QR_Pivot_encoded, Givens_QR_Pivot, Givens_QR_Pivot_encoded,
HouseholderQR_Pivot_Oneshot,
complete_orthogonal_implicit_Householder, complete_orthogonal_implicit_givens,
GivensQR_Pivot_Oneshot, and GivensQR_Pivot_Encoded_Oneshot. We could have
written our methods in such a fashion as to force the consumption of these variables, but
that is often wasteful.
Entering Matrices
Clearly we want our matrices in two-dimensional arrays. It will be most intuitive
and compatible with our class library if these arrays are one-based. Getting the data into
the arrays is application-specific. For example, you might have an analog-to-digital board
for transferring measurements to your system. Because most spreadsheet programs like
Excel and many utilities like MATLAB allow exporting/importing tab- and comma-delimited
files, let us consider an example where we read the data from a tab- or comma-delimited
file into an array A by clicking a menu item named OpenA.
At the very top of your form1 code, enter the following using statements:
using System.Text.RegularExpressions;
using System.IO;
tempstr1 = System.Convert.ToString(System.Convert.ToChar(srreader.Read())); // temp storage for read-in character
if (tempstr1 == "," | char.Parse(tempstr1) == System.Convert.ToChar(13) | char.Parse(tempstr1) == System.Convert.ToChar(9))
{ // chr(13) is a carriage return
if (char.Parse(tempstr1) == System.Convert.ToChar(13)) // end of row encountered
{
cols = j;
j = 0;
i = i + 1;
rows = rows + 1;
}
j = j + 1;
tempstr1 = "";
tempstr2 = "";
}
}
srreader.Close();
fsstream.Close();
srreader = null;
fsstream = null;
temparray = new double[rows + 1, cols + 1];
// data is read one character at a time in row-by-row fashion
FileStream fsstreamA = new FileStream(fileAname, FileMode.Open, FileAccess.Read);
StreamReader srreaderA = new StreamReader(fsstreamA);
j = 1;
i = 1;
while (srreaderA.Peek() > -1)
{ // until the end of the file is found
tempstr1 = System.Convert.ToString(System.Convert.ToChar(srreaderA.Read())); // temp storage for read-in character
if (tempstr1 == "," | char.Parse(tempstr1) == System.Convert.ToChar(13)) // chr(13) is a carriage return
{
temparray[i, j] = double.Parse(tempstr2);
if (char.Parse(tempstr1) == System.Convert.ToChar(13)) // end of row encountered
{
j = 0;
i = i + 1;
}
j = j + 1;
tempstr1 = "";
tempstr2 = "";
}
tempstr2 = tempstr2 + tempstr1; // concatenation of read-in characters to form data elements
}
srreaderA.Close();
fsstreamA.Close();
srreaderA = null;
fsstreamA = null;
System.GC.Collect();
Regex Pathvalue = new Regex(@"(.*)\\");
caption = Pathvalue.Replace(openFileDialog1.FileName, "");
Regex extvalue = new Regex(@"\.csv");
caption = extvalue.Replace(caption, "");
Arr = temparray;
}
}
}
catch (Exception ex)
{
Interaction.MsgBox(ex.Message, (Microsoft.VisualBasic.MsgBoxStyle)(0), null);
}
}
We can save A with a similar logic. The message box implementation requires a
reference to Microsoft Visual Basic.
private void SaveA_Click_1(object sender, EventArgs e)
{
save_to_csv(A, ref captionA);
}
public void save_to_csv(double[,] arr, ref string caption)
{
try
{
if (saveFileDialog1.ShowDialog() == DialogResult.OK)
{
int R = arr.GetLength(0) - 1;
int C = arr.GetLength(1) - 1;
int i = 0, j = 0;
string temp1 = "";
string file_name = saveFileDialog1.FileName;
FileStream fsstream = new FileStream(file_name, FileMode.Create, FileAccess.Write);
StreamWriter swwriter = new StreamWriter(fsstream);
swwriter.AutoFlush = true;
for (i = 1; i <= R; i++)
{
for (j = 1; j <= C; j++)
{
temp1 += arr[i, j];
if (j < C)
{
temp1 += ",";
}
}
swwriter.WriteLine(temp1);
temp1 = "";
}
swwriter.Close();
fsstream.Close();
swwriter = null;
fsstream = null;
System.GC.Collect();
Regex Pathvalue = new Regex(@"(.*)\\");
file_name = Pathvalue.Replace(file_name, "");
Regex extvalue = new Regex(@"\.csv");
file_name = extvalue.Replace(file_name, "");
caption = file_name;
}
}
catch (Exception ex)
{
Interaction.MsgBox(ex.Message, (Microsoft.VisualBasic.MsgBoxStyle)(0), null);
}
}
Viewing Matrices
In many applications you may want to view and perhaps edit your matrices.
The .NET DataGridView is just the control we need. I prefer to create a form2 class
with a fully docked DataGridView for this purpose. This class should have a global string
variable named My_Source_Array to enable us to associate any form2 instance with its
corresponding array later. We create an instance of the form2 for each matrix.
Refinements
All of our methods are to be considered starting points in your ongoing process to
optimize your applications. The procedures we discussed are meant to be intuitive and
easily followed, demonstrating a process and not necessarily establishing best
production code practices. There are many areas for improvement, including:
Implementing methods for iterative refinement in factorization and solution procedures.
Developing procedures to more effectively handle structured problems like sparse
matrices and Vandermonde systems.
Dealing with matrix storage more efficiently.
Extending methods to applications involving complex numbers.
Establishing methods for updating least squares solutions through adding or
deleting rows in the data.
Developing methods for solving least squares subject to constraints.
Developing methods for parallel computation on multi-processor systems.
The list goes on inexhaustibly. A lifetime can be spent exploring this vast subject
area, and many have been. It is my hope that the reader will use this introduction as a
beginning for a more intensive study. I have included a bibliography of works that I found
quite useful.
Bibliography
A.A. Anda and H. Park, Fast Plane Rotations with Dynamic Scaling, SIAM J.
Matrix Anal. Appl. 15, 1994, pp. 162-174
Bjorck, Ake, Numerical Methods for Least Squares Problems, Society for
Industrial and Applied Mathematics, Philadelphia, PA, 1996.
B. D. Bunday, S. M. Husnain Bokhari and K. H. Khan, A New Algorithm For
The Normal Distribution Function, Test, v. 6 n. 2, pp. 369-377, Dec. 1997.
Chan, Tony F., An Improved Algorithm For Computing The Singular Value
Decomposition, ACM Transactions on Mathematical Software, Vol. 8, No. 1, March
1982, pp. 72-88.
Golub, Gene H., Charles F. Van Loan, Matrix Computations 3rd Edition, The
Johns Hopkins University Press, Baltimore, MD, 1996.
Hekstra, Gerben (Philips Research Labs, Eindhoven, The Netherlands),
Evaluation of Fast Rotation Methods, Journal of VLSI Signal Processing, Volume 25,
2000, Kluwer Academic Publishers, The Netherlands, pp. 113-124.
Hill, G. W. (C.S.I.R.O., Division of Mathematical Statistics, Glen Osmond, South
Australia), Algorithm 395 Student's T-Distribution, Communications of the ACM,
Volume 13, Number 10, October, 1970, L. D. Fosdick, Editor, p. 617.
Pollock, D.S.G., A Handbook of Time-Series Analysis, Signal Processing and
Dynamics, Queen Mary and Westfield College, The University of London, Academic
Press, 1999.
Seber, George A. F., Alan J. Lee, Linear Regression Analysis, 2nd Edition, Wiley
and Sons, Inc., Hoboken, NJ, 2003.
Stewart, G.W., The Economical Storage of Plane Rotations, Numer. Math. 25,
1976, Springer-Verlag, pp. 137-138.
Watkins, David S., Fundamentals of Matrix Computations, 2nd Edition, John
Wiley and Sons, Inc., New York, NY, 2002.
Wrede, Robert C., Introduction to Vector and Tensor Analysis, Dover
Publications, NY, 1972.
http://mathworld.wolfram.com/, A Wolfram Web Resource, Wolfram
Research, Inc., 100 Trade Center Drive, Champaign, IL 61820-7237, USA.
Statistical Data Analysis, by Dr. Dang Quang A and Dr. Bui The Hong, Institute
of Information Technology, Hoang Quoc Viet Road, Cau Giay, Hanoi.