
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, VOL. 18, 1145-1151 (1982)

YALE SPARSE MATRIX PACKAGE


I: THE SYMMETRIC CODES†
S. C. EISENSTAT
Department of Computer Science, Yale University
M. C. GURSKY
Department of Electrical Engineering and Computer Science, University of California, Berkeley
M. H. SCHULTZ
Department of Computer Science, Yale University
A. H. SHERMAN
Exxon Production Research Company, Houston, Texas, U.S.A.

INTRODUCTION
Consider the N x N system of linear equations
Mx=b (1)
where the coefficient matrix M is large, sparse, symmetric, and positive definite. Such systems
arise frequently in scientific computation, e.g. in finite difference and finite element approxima-
tions to elliptic boundary value problems. In this paper, we present a package of efficient,
reliable, well-documented and portable FORTRAN subroutines for solving these systems.
See Reference 3 for a corresponding package for nonsymmetric problems.
Direct methods for solving (1) are generally variations of symmetric Gaussian elimination.
One forms the U'DU decomposition of M, where U is unit upper triangular and D positive
diagonal, and then successively solves the triangular systems
U'y=b, Dz=y, Ux=z (2)
When M is large (N >> 1), dense Gaussian elimination (in which zeros are not exploited) is
prohibitively expensive in terms of both the work (~N³/6 multiplies) and storage (N² words)
required. But, since M is sparse, most entries of M and U are zero, and there are significant
advantages to factoring M without storing or operating on the zeros appearing in M and U.
Recently, a number of implementations of sparse Gaussian elimination have appeared based
on this idea (cf. References 2, 6, 8, 11 and 13-16).
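To make the solution process in (2) concrete, here is a minimal dense sketch in Python. The package itself is FORTRAN and stores only nonzeros; this sketch deliberately ignores sparsity and only illustrates the U'DU factorization and the three triangular solves (the function name and layout are ours, not the package's):

```python
def udu_solve(M, b):
    """Solve M x = b via M = U' D U (symmetric Gaussian elimination).

    Dense illustration only: U is unit upper triangular, D is a positive
    diagonal.  A sparse code would store and update only nonzero entries.
    """
    N = len(b)
    A = [row[:] for row in M]                    # working copy of M
    U = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    d = [0.0] * N
    for k in range(N):                           # eliminate variable k
        d[k] = A[k][k]
        for j in range(k + 1, N):
            U[k][j] = A[k][j] / d[k]
        for i in range(k + 1, N):                # rank-one update of the
            for j in range(k + 1, N):            # remaining submatrix
                A[i][j] -= d[k] * U[k][i] * U[k][j]
    # U' y = b  (forward substitution; U' is unit lower triangular)
    y = [0.0] * N
    for i in range(N):
        y[i] = b[i] - sum(U[k][i] * y[k] for k in range(i))
    # D z = y
    z = [y[i] / d[i] for i in range(N)]
    # U x = z  (back substitution)
    x = [0.0] * N
    for i in range(N - 1, -1, -1):
        x[i] = z[i] - sum(U[i][j] * x[j] for j in range(i + 1, N))
    return x
```

For a symmetric positive definite M such as [[4, 2], [2, 3]], the factorization gives D = diag(4, 2) and U = [[1, 0.5], [0, 1]], and the three solves recover x.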
In this paper, we first describe the scheme used for storing sparse matrices, and then we
give an overview of the package from the point of view of the user. The calling sequences for
the routines are detailed in the listing; for further details of the algorithms employed, see
References 5 and 10. Finally, we illustrate the performance of the package on a typical model
problem and on randomly generated test matrices. Listings and machine-readable versions of

† This research was supported in part by ONR Grant N00014-76-0277, NSF Grant MCS 76-11460, AFOSR Grant
F49620-77-C-0037, and the Chevron Oil Field Research Company.

0029-5981/82/081145-07$01.00                              Received 16 March 1981


© 1982 by John Wiley & Sons, Ltd.                         Revised 13 August 1981

the subroutines and a demonstration driver may be obtained by sending a written request to
YSMP Librarian, Department of Computer Science, Box 2158 Yale Station, New Haven,
CT 06520.

A SPARSE MATRIX STORAGE SCHEME


Since the coefficient matrix M and the upper triangular factor U are large and sparse, it is
inefficient to store them as dense matrices. Instead, we store matrices using a row-by-row
storage scheme used in previous implementations of sparse symmetric Gaussian elimination
(cf. References 2, 5 and 11). In this section, we describe the scheme used for the symmetric
input matrix M. Users who need to examine the factor U should refer to the codes themselves
or to References 5 and 10 for a description of the compressed storage scheme used.
This scheme requires three one-dimensional arrays: IA, JA, and A. The nonzero entries
of M are stored row-by-row in the REAL array A. To identify the individual nonzero entries
in a row, we need to know in which column each entry lies. The INTEGER array JA contains
the column indices which correspond to the nonzero entries of M, i.e. if A(K) = M(I, J), then
JA(K) = J. In addition, we need to know where each row starts and how long it is. The
INTEGER array IA contains pointers to the locations in JA and A where the rows of M
begin, i.e. if M(I, J) is the first (stored) entry of the Ith row and A(K) = M(I, J), then IA(I) = K.
Moreover, IA(N+1) is defined as the index in JA and A of the first location following the
last element in the last row. Thus, the number of entries in the Ith row is given by
IA(I+1) - IA(I), and the nonzero entries of the Ith row are stored consecutively in

    A(IA(I)), A(IA(I)+1), ..., A(IA(I+1)-1)

and the corresponding column indices are stored consecutively in

    JA(IA(I)), JA(IA(I)+1), ..., JA(IA(I+1)-1)

For example, the 5 x 5 matrix

        [ 1  0  2  3  0 ]
        [ 0  4  0  0  0 ]
    M = [ 2  0  5  6  0 ]
        [ 3  0  6  7  8 ]
        [ 0  0  0  8  9 ]

is stored as

         1  2  3  4  5  6  7  8  9 10 11 12 13
    IA   1  4  5  8 12 14
    JA   1  3  4  2  1  3  4  1  3  4  5  4  5
    A    1  2  3  4  2  5  6  3  6  7  8  8  9

or, if only the nonzero entries in the upper triangle are stored, as

    IA   1  4  5  7  9 10
    JA   1  3  4  2  3  4  4  5  5
    A    1  2  3  4  5  6  7  8  9

The overhead in this storage scheme is the storage required for the INTEGER arrays IA
and JA. But since IA has N+1 entries and JA has one entry for each element of A, the total
overhead is approximately equal to the number of nonzero entries in M that are stored. (We
note that our code will accept input matrices stored in either of the above forms, although
the latter form is clearly more efficient.)
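The IA/JA/A scheme is what is now commonly called compressed sparse row storage. A small Python sketch may clarify the indexing; indices are kept 1-based to match the paper's conventions, and the function name is ours:

```python
def csr_row(IA, JA, A, I):
    """Return the (column, value) pairs of row I of the matrix stored
    in IA/JA/A, using the paper's 1-based conventions: the entries of
    row I occupy positions IA(I) .. IA(I+1)-1 of JA and A.
    """
    lo, hi = IA[I - 1], IA[I] - 1           # IA(I) and IA(I+1) - 1
    return [(JA[k - 1], A[k - 1]) for k in range(lo, hi + 1)]

# The 5 x 5 example from the text, full-matrix form
IA = [1, 4, 5, 8, 12, 14]
JA = [1, 3, 4, 2, 1, 3, 4, 1, 3, 4, 5, 4, 5]
A  = [1, 2, 3, 4, 2, 5, 6, 3, 6, 7, 8, 8, 9]
```

Here csr_row(IA, JA, A, 4) returns [(1, 3), (3, 6), (4, 7), (5, 8)], the stored entries of row 4, and the number of entries in row I is IA[I] - IA[I-1], mirroring IA(I+1) - IA(I) in the text.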

A SPARSE SYMMETRIC MATRIX PACKAGE


The package consists of three drivers and nine subroutines (see Figure 1). The ordering driver
(subroutine ODRV) may be used to symmetrically reorder the variables and equations so as
to reduce the total work (i.e. the number of multiplies) and storage required. The solution
driver (subroutine SDRV) is used to solve the (possibly permuted) system of linear equations.
The demonstration driver (program SDMO) sets up a model sparse symmetric positive definite
system of linear equations, calls ODRV to reorder the variables and equations, and calls
SDRV to solve the linear system. In the remainder of this section, we describe each of these
routines in somewhat greater detail. The codes themselves are extensively documented; for
further details about the algorithms employed, see References 5, 10 and 17.

Figure 1. A schematic overview of the sparse symmetric matrix package

The ordering driver (ODRV)


The work and storage required to solve a large sparse system of linear equations clearly
depend upon the zero-nonzero structure of the coefficient matrix. But since this matrix is
symmetric and positive definite, we could equally well solve the permuted system
QMQ'y = Qb, Qx = y (3)
given any permutation matrix Q. The permuted system corresponds to symmetrically reordering
the variables and equations of the original system, and the net result can often be a significant
reduction in the work and storage required to form the U'DU decomposition of M.⁴

The ordering driver (subroutine ODRV) uses an important heuristic, the minimum degree
algorithm (implemented in subroutines MD, MDI, MDM, MDU and MDP), to select Q. The
algorithm effectively does a symbolic elimination on the nonzero structure of the system. At
each step, it chooses a pivot element from among those uneliminated diagonal matrix entries
which require the fewest arithmetic operations to eliminate (ties are broken arbitrarily). This
has the effect of locally optimizing the elimination process with respect to the number of
arithmetic operations performed. See References 9, 10 and 17 for more details.
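The flavor of the heuristic can be sketched in a few lines of Python on the graph of the matrix. This is a simplified variant that picks the vertex of minimum degree; the actual MD subroutines use far more elaborate data structures, and variants of the algorithm differ in the exact pivot metric and tie-breaking:

```python
def minimum_degree_order(adj):
    """Greedy minimum degree ordering on an undirected graph.

    adj maps each vertex to the set of its neighbors.  Eliminating a
    vertex joins its remaining neighbors into a clique, modeling the
    fill produced by one step of symbolic Gaussian elimination.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # local working copy
    order = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))       # fewest neighbors
        nbrs = adj.pop(v)
        for u in nbrs:                                # form the clique
            adj[u].discard(v)
            adj[u] |= nbrs - {u}
        order.append(v)
    return order
```

On a star graph (one center joined to several leaves) the center has maximum degree, so the leaves are eliminated first; pivoting on the center first would instead fill in edges among all the leaves.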
MD returns two one-dimensional INTEGER arrays of length N: P contains the permutation
of the row and column indices of M, i.e. the sequence of pivots; and IP contains the inverse
permutation, i.e. IP(P(I)) = I for I = 1, 2, ..., N. If only the upper triangle of M is being
stored, then the representation of M (i.e. the arrays IA, JA and A) must be rearranged using
the subroutine SRO. (SRO may also be called when the entire matrix M is stored.) In
rearranging IA, JA and A, SRO places each matrix entry M(I, J) (I ≤ J) in row K, where
K = P(L) and L = min(IP(I), IP(J)). Thus the rows of M are not reordered, but after SRO
is finished, each row contains only those entries which appear in the upper triangle of the
matrix QMQ'. Among the entries for each row, SRO places the diagonal entry first and leaves
the off-diagonal entries ordered by increasing column number in M.
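The placement rule itself is compact, and a Python sketch of just that rule may help. The real SRO rearranges the IA/JA/A arrays in place; this version only computes the destination row of a single entry, and the function name is ours:

```python
def sro_row_of(I, J, P, IP):
    """Row (in original numbering) to which SRO assigns entry M(I, J),
    with I <= J.  P and IP are the permutation and inverse permutation
    from MD as 1-based lists, so IP(P(L)) = L.  The entry is placed in
    the row whose variable is eliminated first among I and J.
    """
    L = min(IP[I - 1], IP[J - 1])     # earlier pivot step of I and J
    return P[L - 1]                   # K = P(L)
```

With the identity permutation P(I) = IP(I) = I the rule reduces to K = min(I, J) = I, i.e. the usual upper-triangle row assignment; with P = (3, 1, 2), entry M(1, 3) moves to row 3 because variable 3 is pivoted on first.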
To combine the ordering and reordering capabilities just described, ODRV has three possible
paths. If PATH = 1, then only a minimum degree ordering is performed, while if PATH = 2,
then SRO is called as well. Finally, if PATH = 3, only SRO is called.
The user may bypass ODRV entirely by setting P(I) = IP(I) = I for I = 1, 2, 3, ..., N.
Alternatively, the user may substitute another ordering subroutine for the minimum degree
ordering subroutines, as long as it produces the two permutations P and IP. But again, if only
the upper triangle of M is being stored, the representation of M must be rearranged using SRO.

The solution driver (SDRV)


The solution driver (subroutine SDRV) is used to solve the (possibly permuted) linear
system. Following Chang,¹ SDRV breaks the solution process into three steps: symbolic
factorization (subroutine SSF), numerical factorization (subroutine SNF) and back-solution
(subroutine SNS). First, SSF determines the nonzero structure of the rows of U from the
nonzero structure of the rows of M. Second, SNF uses the structure information generated
by SSF to compute the U'DU factorization of M. Third, SNS computes the solution x from
the factorization generated by SNF and the right-hand side b.
By splitting up the computation, we have gained flexibility. To solve a single system of
equations, it suffices to use SSF, SNF and SNS (by setting PATH = 1 in SDRV). To solve
several systems in which the coefficient matrices have the same nonzero structure but different
numerical entries, it suffices to use SSF only once and then use SNF and SNS for each system
(PATH = 2). To solve several systems with the same coefficient matrix but different right-hand
sides, it suffices to use SSF and SNF only once each and then use SNS for each right-hand
side (PATH = 3). SDRV also provides the capabilities of calling only SSF (PATH = 4) or of
calling just SSF and SNF (PATH = 5).
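The economics of this split can be sketched schematically in Python. The step bodies here are just call counters standing in for the real subroutines; only the pattern of calls, which mirrors the PATH values described above, is the point:

```python
# Schematic of SDRV's step reuse: SSF runs once per nonzero structure,
# SNF once per matrix, SNS once per right-hand side.
calls = {"SSF": 0, "SNF": 0, "SNS": 0}
def SSF(): calls["SSF"] += 1     # symbolic factorization (structure only)
def SNF(): calls["SNF"] += 1     # numeric factorization (one matrix)
def SNS(): calls["SNS"] += 1     # back-solution (one right-hand side)

# PATH = 2 style reuse: three matrices sharing one nonzero structure
SSF()
for _ in range(3):
    SNF(); SNS()

# PATH = 3 style reuse: one matrix, four right-hand sides
SSF(); SNF()
for _ in range(4):
    SNS()
```

After both loops the counters read SSF = 2, SNF = 4, SNS = 7: the symbolic work is amortized over all matrices with the same structure, and the factorization over all right-hand sides.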

The demonstration driver (SDMO)


The demonstration driver (program SDMO) is intended to aid the user in installing the
package on a particular computer system and may be used as a guide to understanding how
to use the package. It generates the coefficient matrix for the standard five-point finite difference
approximation to the Poisson equation on a 3 x 3 grid and chooses the right-hand side so that
the solution vector x is (1, 2, 3, 4, 5, 6, 7, 8, 9)'. Since M is symmetric, one can store either
the entire matrix (CASE = 1) or only the upper triangle (CASE = 2). SDMO then calls ODRV
to reorder the variables and equations and SDRV to solve the linear system. At each stage,
the values of all relevant variables are printed out. It should be noted that SDMO is not a
thorough test routine for the package; it is only designed to ensure that every subroutine in
the package is called at least once.
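For reference, the model coefficient matrix that SDMO generates can be reproduced with a short Python sketch (dense, pure Python; the function name and layout are ours, and row-by-row grid numbering is assumed):

```python
def poisson_5pt(n):
    """Coefficient matrix of the standard five-point finite difference
    approximation to the Poisson equation on an n x n grid, returned as
    a dense n^2 x n^2 list of lists, grid points numbered row by row."""
    N = n * n
    M = [[0.0] * N for _ in range(N)]
    for r in range(n):
        for c in range(n):
            i = r * n + c
            M[i][i] = 4.0                          # center point
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:    # neighbor inside grid
                    M[i][rr * n + cc] = -1.0
    return M

M = poisson_5pt(3)
x = [float(k) for k in range(1, 10)]
# right-hand side chosen so that the solution is (1, 2, ..., 9)'
b = [sum(M[i][j] * x[j] for j in range(9)) for i in range(9)]
```

The resulting 9 x 9 matrix is symmetric and positive definite, with 4 on the diagonal and -1 for each pair of adjacent grid points, matching the system SDMO sets up.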

PERFORMANCE
One of the most important aspects of any package is its performance in terms of both time
and storage. In this section we present results for two classes of linear systems. The first typifies
problems to which the package might be applied in practice, while the second allows us to
evaluate the effects of several design decisions in the software. The computations were
performed on a CDC 6600 using the FTN optimizing Fortran compiler with OPT = 2. For
reference, we note that with this compiler and machine, execution of a simple assignment
statement with REAL variables A, B and C requires 1.6 × 10⁻⁶ sec, while execution of the
body of the innermost loop of SNF

    JUMUJ = JU(MU + J)
    D(JUMUJ) = D(JUMUJ) + UKIDI*U(J)

requires 3.25 × 10⁻⁶ sec.


In Tables I and II we present the time and storage required to solve the familiar nine-point
finite difference equations on an n x n grid for several values of n. (Note that the number of
equations and unknowns is N = n² for these examples.)
For comparative purposes, Tables I and II also include the time and storage requirements
for two available solvers for banded systems, the IMSL subroutine LEQ1PB⁷ and the LINPACK
subroutines SPBFA and SPBSL¹² (with the FORTRAN BLAS). These results were obtained using
the natural (row-by-row) grid ordering so that the resulting bandwidth was minimized. It
should be noted that both band solvers destroy M and b, a fact which significantly reduces
their storage requirements.
The results of our comparisons indicate that the sparse package can yield a significant savings
over the band codes in the time required to factor M and back-solve for x , especially for
large problems. However, at least for small problems, the additional time required for the
minimum degree ordering is sufficient to give the band code the overall edge in time. This
situation is mitigated somewhat in practice, since often one use of the ordering package will
suffice to fix an ordering for the solution of several linear systems; so, in a certain sense, it is
the comparisons of factorization and solution times that matter most.
Tables III and IV give the time and storage requirements for solving randomly generated
symmetric, positive definite linear systems of order 200. These results are included not because
we believe that performance on random systems is necessarily important in itself, but rather
because we believe that good performance on such problems can provide justification for
software design decisions. In this instance, the results indicate that our internal use of
compressed storage for U leads to substantial storage savings and that our subroutines achieve
great overall efficiency even for relatively small problems.
Table I. Time required for nine-point model problems (sec)

                                                  SNF/mults.         Total
 n      N     MD     SRO    SSF    SNF    Mults.   (×10⁻⁶)   SNS  (excl. MD)  LEQ1PB  SPBFA/SPBSL

15    225   0.220  0.049  0.079  0.154   24,520     6.28    0.017    0.299     0.236     0.293
25    625   0.637  0.148  0.244  0.732  133,280     5.49    0.082    1.206     1.343     1.474
35  1,225   1.285  0.256  0.554  1.947  383,821     5.07    0.162    2.919     4.474     4.353

Table II. Storage required for nine-point model problems

            Storage for A                    Storage for U              Required   Matrix storage
 n  Nonzeros   Total  Overhead/nonzero  Nonzeros   Total  Overhead/nonzero  value of NSP  for band codes

15    1,037   2,300       1.22           2,672    4,326       0.619          5,002        3,825
25    2,977   6,580       1.21          10,203   15,209       0.491         17,085       16,875
35    5,917  13,060       1.21          23,763   33,947       0.429         37,623       45,325

Table III. Time required for 200 x 200 random systems (sec), averages of 10 trials

Density                                           SNF/mults.
  (%)     MD     SRO    SSF    SNF    Mults.       (×10⁻⁶)    SNS    Total

 2.00   0.218  0.024  0.040  0.059    7,555         8.10     0.014   0.357
 3.02   0.483  0.033  0.076  0.235   47,380         4.99     0.021   0.848
 5.05   1.006  0.049  0.133  0.759  180,873         4.20     0.038   1.985
 9.97   2.082  0.085  0.243  1.911  488,606         3.91     0.066   4.387

Table IV. Storage required for 200 x 200 random systems, averages of 10 trials

              Storage for A                  Storage for U          Required
Density             Overhead/                      Overhead/        value of
  (%)   Nonzeros  Total  nonzero   Nonzeros   Total  nonzero          NSP

 2.00      500   1,201    1.40        979    2,044    1.110         2,646
 3.02      704   1,609    1.29      2,804    3,472    0.598         5,076
 5.05    1,110   2,421    1.18      6,187    8,524    0.378         9,126
 9.97    2,094   4,389    1.10     11,244   14,216    0.264        14,818

REFERENCES
1. A. Chang, ‘Application of sparse matrix methods in electric power system analysis’, in R. A. Willoughby, Ed.,
Sparse Matrix Proceedings, Report RA1, IBM Research, Yorktown Heights, New York, 1968.
2. A. R. Curtis and J. K. Reid, ‘Two FORTRAN subroutines for direct solution of linear equations whose matrix
is sparse, symmetric, and positive definite’, Harwell Report AERE-R7119, 1972.
3. S. C. Eisenstat, M. C. Gursky, M. H. Schultz and A. H. Sherman, ‘The Yale matrix package II: the non-symmetric
codes’, Report 114, Yale University Department of Computer Science, 1977.
4. S. C. Eisenstat, M. H. Schultz and A. H. Sherman, ‘Application of sparse matrix methods to partial differential
equations’, Proc. AICA Int. Symp. on Computer Methods for Partial Differential Equations, Bethlehem,
Pennsylvania, 1975, pp. 40-45.
5. S. C. Eisenstat, M. H. Schultz and A. H. Sherman, ‘Algorithms and data structures for sparse symmetric Gaussian
elimination’, SIAM J. Sci. Stat. Comput., 2, 225-237 (1981).
6. F. G. Gustavson, ‘Some basic techniques for solving sparse systems of linear equations’, in D. J. Rose and R.
A. Willoughby, Eds., Sparse Matrices and Their Applications, Plenum Press, 1972, pp. 41-52.
7. International Mathematical and Statistical Libraries, Inc. The IMSL Library 3, Edition 6, 1977.
8. W. C. Rheinboldt and C. K. Mesztenyi, ‘Programs for the solution of large sparse matrix problems based on the
arc-graph structure’, Technical Report TR-262, University of Maryland Computer Science, 1973.
9. D. J. Rose, ‘A graph-theoretic study of the numerical solution of sparse positive definite systems of linear
equations’, in R. Read, Ed., Graph Theory and Computing, Academic Press, 1972, pp. 183-217.
10. A. H. Sherman, ‘On the efficient solution of sparse systems of linear and nonlinear equations’, Ph.D. dissert.,
Department of Computer Science, Yale University, 1975.
11. A. H. Sherman, ‘Yale sparse matrix package user’s guide’, Report UCID-30114, Lawrence Livermore Laboratory,
1975.
12. J. J. Dongarra, J. R. Bunch, C. B. Moler and G. W. Stewart, ‘LINPACK user’s guide’, Society for Industrial
and Applied Mathematics, 1979.
13. I. S. Duff, ‘MA28 - a set of Fortran subroutines for sparse unsymmetric linear equations’, AERE Report R.8730,
HMSO, London, 1977.
14. J. A. George and J. W. H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, 1981.
15. Z. Zlatev, V. A. Barker and P. G. Thomsen, ‘SSLEST: a Fortran IV subroutine for solving sparse systems of
linear equations. User’s guide’, Technical Report 78-01, Numerisk Institut, Denmark, 1978.
16. N. Munksgaard, ‘Fortran subroutines for direct solution of sets of sparse and symmetric linear equations’, Report
NI-77-05, Technical University of Denmark, 1977.
17. S. C. Eisenstat, ‘An implementation of the minimum degree algorithm’ (to appear).
