Solving Nonlinear Eigenvalue Problems by Algorithmic Differentiation - P. ARBENZ and W. GANDER
© by Springer-Verlag 1986
Abstract - Zusammenfassung
Solving Nonlinear Eigenvalue Problems by Algorithmic Differentiation. The eigenvalues of a matrix A(λ),
the elements of which are complex functions of a complex variable λ, can be found with a zero finding
method applied to the determinant function det A(λ). It is proposed to evaluate the derivatives of det A(λ)
used in the zero finding method by algorithmic differentiation. This leads to a simple and lucid
algorithm. Program listings and numerical examples are given.
AMS Subject Classifications: 65H15, 65F40.
Key words: Algorithm, differentiation, eigenvalue, nonlinear.
Lösen von nichtlinearen Eigenwertproblemen durch Algorithmisches Differenzieren. The eigenvalues of a
matrix A(λ), whose elements are complex functions of a complex variable λ, can be found by means of
zero finding methods applied to the determinant function det A(λ). It is proposed to compute the
derivatives of det A(λ) used in the zero finding methods by algorithmic differentiation. This leads
to a simple and clear algorithm. Program listings and numerical examples are given.
1. Introduction
Let A(λ) be an n × n matrix whose elements are differentiable functions of the complex
parameter λ. The number λ is an eigenvalue of A if there exists a
nonzero vector x ∈ Cⁿ such that

A(λ) x = 0.   (1)

Vectors x ≠ 0 satisfying (1) are referred to as eigenvectors corresponding to the
eigenvalue λ. The set σ(A) of the eigenvalues of A is called the spectrum of A. For surveys
on numerical methods for the solution of nonlinear eigenvalue problems of the form
(1) see [9, 10, 14]. It is well known that (1) has nontrivial solutions if and only if

Δ(λ) := det A(λ) = 0.   (2)
Since it is easy and numerically stable to evaluate Δ(λ) by Gaussian elimination with
partial pivoting [2], an obvious way to compute eigenvalues of A(λ) is to apply the
secant or Muller's method (i.e. successive linear or quadratic interpolation,
respectively) to find zeros of the determinant Δ(λ) [11].
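As a sketch of this determinant-based approach (in Python rather than the Fortran 77 used in the paper, and with a hypothetical 2 × 2 matrix A(λ) whose determinant is λ² − 1), the secant method applied to det A(λ) might look as follows:

```python
import numpy as np

def A(lam):
    # Hypothetical example matrix: det A(lam) = lam**2 - 1, eigenvalues -1 and +1.
    return np.array([[lam, 1.0], [1.0, lam]], dtype=complex)

def secant(det, x0, x1, tol=1e-12, maxit=50):
    """Secant iteration on the determinant function Delta(lam) = det A(lam)."""
    f0, f1 = det(x0), det(x1)
    for _ in range(maxit):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, det(x2)
    return x1

# np.linalg.det itself uses a factorization by Gaussian elimination with pivoting
lam = secant(lambda z: np.linalg.det(A(z)), 0.5, 0.6)
```

Muller's method would replace the linear interpolant by a quadratic one; unlike the secant sketch above, it can then leave the real axis and reach complex zeros from real starting values.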
In this note we consider root finding methods, such as Newton's iteration, that
use derivatives of the function Δ(λ). While Lancaster [9] (see also [17] for an
equivalent formula) deduces for the reciprocal of the Newton step size

f(λ) := Δ'(λ)/Δ(λ) = trace {A⁻¹(λ) A'(λ)},   λ ∉ σ(A),   (3)
we propose here to evaluate f(λ) and its derivatives by algorithmic differentiation.
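With (3), one Newton step on Δ costs essentially one factorization of A(λ): λ_{k+1} = λ_k − 1/f(λ_k). A minimal sketch (Python; the 2 × 2 matrix is a hypothetical stand-in with Δ(λ) = λ² − 1):

```python
import numpy as np

def A(lam):
    # hypothetical example: det A(lam) = lam**2 - 1
    return np.array([[lam, 1.0], [1.0, lam]], dtype=complex)

def dA(lam):
    return np.eye(2, dtype=complex)     # elementwise derivative A'(lam)

def f(lam):
    # f(lam) = trace(A(lam)^{-1} A'(lam)), evaluated via a linear solve
    return np.trace(np.linalg.solve(A(lam), dA(lam)))

lam = 2.0 + 0.0j
for _ in range(50):
    step = 1.0 / f(lam)                 # Newton step Delta/Delta' = 1/f
    lam -= step
    if abs(step) < 1e-12:
        break
```

solve(A, A') factors A(λ) once and applies the factorization to all columns of A'(λ); only the trace of the result is needed.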
2. Algorithmic Differentiation
is inserted. Here da, db, and dc are the variables storing the derivatives of a, b, and c,
and expr' is the expression obtained from expr by formal differentiation.

A:  p := 0; for i := n downto 0 do p := p*z + a[i];

The differentiated algorithm A' computes both p(z) and p'(z). Thus algorithmic
differentiation yields the well-known scheme for evaluating a polynomial and its
derivative by Horner's rule [4, 6].
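In Python the differentiated Horner loop might read as follows; note that in this sketch the derivative assignment has to precede the value assignment, so that it uses the old value of p:

```python
def horner_ad(a, z):
    """Return (p(z), p'(z)) for p(z) = a[0] + a[1]*z + ... + a[n]*z**n."""
    p, dp = 0.0, 0.0
    for c in reversed(a):      # i = n downto 0
        dp = dp * z + p        # derivative statement for p := p*z + a[i]
        p = p * z + c
    return p, dp
```

For p(z) = z² − 1 this yields horner_ad([-1.0, 0.0, 1.0], 2.0) = (3.0, 4.0), i.e. p(2) = 3 and p'(2) = 4.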
There exist compilers that automatically translate A into A' [5, 6, 8, 12, 13, 15].
These compilers are often able to compute partial derivatives as well. However, for
short programs it is little work to make the changes by hand using an editor.
Let

P(λ) A(λ) = L(λ) R(λ)   (4)
We may differentiate the above algorithm again to obtain f'(λ). This makes the
application of cubically convergent iteration schemes possible, cf. Halley's iteration
[1]

λ_{k+1} = λ_k − Δ(λ_k) Δ'(λ_k) / (Δ'(λ_k)² − ½ Δ(λ_k) Δ''(λ_k))
        = λ_k − 2 f(λ_k) / (f(λ_k)² − f'(λ_k))   (10)
        = λ_k − 2 / (f(λ_k) − f'(λ_k)/f(λ_k)),
a formula Lancaster [10] reports favourably upon.
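In terms of f and f', Halley's iteration (10) can be sketched as follows (Python; the closed-form test function for Δ(λ) = λ² − 1 is a hypothetical stand-in):

```python
def halley(f, df, lam, tol=1e-12, maxit=50):
    """Halley's iteration (10) written with f = Delta'/Delta and its derivative."""
    for _ in range(maxit):
        try:
            fk, dfk = f(lam), df(lam)
        except ZeroDivisionError:      # landed exactly on a zero of Delta
            break
        step = 2.0 * fk / (fk * fk - dfk)
        lam -= step
        if abs(step) < tol:
            break
    return lam

# Hypothetical test function: Delta(lam) = lam**2 - 1, hence
# f = 2 lam/(lam**2 - 1) and f' = -2 (lam**2 + 1)/(lam**2 - 1)**2.
f = lambda x: 2*x / (x*x - 1)
df = lambda x: -2*(x*x + 1) / (x*x - 1)**2
lam = halley(f, df, 2.0)
```

The guard is needed because an iterate may land exactly on a zero of Δ, where f has a pole.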
For our test problems we obtained better results with Laguerre's iteration [9]

λ_{k+1} = λ_k − K Δ(λ_k) / (Δ'(λ_k) ± √((K−1) [(K−1) Δ'(λ_k)² − K Δ(λ_k) Δ''(λ_k)]))   (11)
        = λ_k − K / (f(λ_k) ± √((1−K) [K f'(λ_k) + f(λ_k)²])),
a method which can be recommended if Δ(λ) is a polynomial. The sign in the
denominator in (11) is chosen to minimize the step length. Laguerre's method
converges cubically to simple zeros for every real value K > 1; however, for global
convergence, choosing K equal to the degree of the polynomial may be more efficient.
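A corresponding sketch of Laguerre's iteration (11), with the sign in the denominator chosen to minimize the step length (Python; cmath.sqrt supplies the complex square root, and the test function is again the hypothetical Δ(λ) = λ² − 1):

```python
import cmath

def laguerre(f, df, lam, K, tol=1e-12, maxit=50):
    """Laguerre's iteration (11) written with f = Delta'/Delta and f'."""
    for _ in range(maxit):
        try:
            fk, dfk = f(lam), df(lam)
        except ZeroDivisionError:          # landed exactly on a zero of Delta
            break
        root = cmath.sqrt((1 - K) * (K * dfk + fk * fk))
        # pick the sign that maximizes the denominator, i.e. minimizes the step
        denom = fk + root if abs(fk + root) >= abs(fk - root) else fk - root
        step = K / denom
        lam -= step
        if abs(step) < tol:
            break
    return lam

f = lambda x: 2*x / (x*x - 1)                  # Delta(lam) = lam**2 - 1
df = lambda x: -2*(x*x + 1) / (x*x - 1)**2
lam = laguerre(f, df, 2.0 + 0.0j, K=2)
```

With K equal to the degree of the polynomial, this example converges in essentially one step, since Laguerre's method is exact for quadratics.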
The evaluation of f(λ) and f'(λ) by algorithmic differentiation costs 2n³ complex
flops. The use of formula (3) together with [9]

f'(λ) = trace {A(λ)⁻¹ A''(λ) − (A(λ)⁻¹ A'(λ))²}   (12)
The number ε' can be chosen greater than ε in (9). If (14) became true, we defined m to
be the integer nearest to −f(λ_k)²/f'(λ_k) and switched over to one of the iterations
λ_{k+1} = λ_k − 2m / (f(λ_k) − m f'(λ_k)/f(λ_k)),   (15)

λ_{k+1} = λ_k − K / (f(λ_k) ± √(((m−K)/m) [K f'(λ_k) + f(λ_k)²])),   (K > m)   (16)

or

λ_{k+1} = λ_k + f(λ_k)/f'(λ_k).   (17)
(15) and (16) are the Halley and Laguerre formulae converging cubically towards
roots of multiplicity m [3]. (17) is Newton's iteration applied to the
function Δ(λ)/Δ'(λ). It converges quadratically towards all zeros λ* of Δ(λ), since
Δ(λ)/Δ'(λ) has simple roots only.
Remarks:
Formulae (15) and (16) are obtained if Δ(λ) and K in (10) and (11) are replaced by
Δ̂(λ) := Δ(λ)^(1/m) and K̂ := K/m. (Hence f̂ = (1/m) f and f̂' = (1/m) f'.) In (16) the hat of
K̂ is omitted.
(17) can be obtained by replacing m in Halley's iteration (15) by −f(λ_k)²/f'(λ_k).
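The multiplicity handling above can be sketched as follows (Python; the closed-form values are those of the hypothetical Δ(λ) = (λ − 1)³, for which f = 3/(λ−1) and f' = −3/(λ−1)²):

```python
def multiplicity_estimate(fk, dfk):
    # near a zero of multiplicity m, f ~ m/(lam - lam*), so -f**2/f' -> m
    return -fk * fk / dfk

def modified_halley_step(fk, dfk, m):
    # step of formula (15): 2m / (f - m f'/f)
    return 2.0 * m / (fk - m * dfk / fk)

fk, dfk = 3.0, -3.0                           # f and f' of (lam-1)**3 at lam = 2
m = round(multiplicity_estimate(fk, dfk))     # -> 3
lam = 2.0 - modified_halley_step(fk, dfk, m)  # -> 1.0, the triple zero
```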
To be able to compute more than one eigenvalue one can proceed as follows: having
already computed λ₁, ..., λ_q, we avoid recomputation of one of these eigenvalues by
suppressing them, i.e. we replace Δ(λ) in (5) by

Δ̃(λ) = Δ(λ) / ∏_{j=1}^{q} (λ − λ_j).   (18)
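Logarithmic differentiation of (18) shows that suppression requires no recomputation of Δ itself: the suppressed quantities are f̃(λ) = f(λ) − Σ_j 1/(λ − λ_j) and f̃'(λ) = f'(λ) + Σ_j 1/(λ − λ_j)². A sketch (Python; the closed-form f below, belonging to the hypothetical Δ(λ) = λ² − 1, serves only as a check):

```python
def suppress(f, df, found):
    """Return f and f' of the suppressed determinant Delta(lam)/prod_j (lam - lam_j)."""
    def fs(lam):
        return f(lam) - sum(1.0 / (lam - lj) for lj in found)
    def dfs(lam):
        return df(lam) + sum(1.0 / (lam - lj)**2 for lj in found)
    return fs, dfs

f = lambda x: 2*x / (x*x - 1)                  # Delta(lam) = lam**2 - 1
df = lambda x: -2*(x*x + 1) / (x*x - 1)**2
fs, dfs = suppress(f, df, [1.0])               # suppress the zero at lam = 1
# fs now equals 1/(lam + 1), the logarithmic derivative of lam + 1
```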
4. Numerical Examples
with A₀ and A₁ given by 4 × 4 matrices whose entries are expressions in α and β (the
entries are garbled in the source) and A₂ = I.
Here α is a non-negative real parameter and β = α + 1. The eigenvalues of (21) are
listed in (22); in particular λ₅ = 0.
For positive α the eigenvalues of A(λ) are all distinct, and for α = 0 there are triple
eigenvalues −i and +i and a double eigenvalue 0.
For several values of α we computed 5 eigenvalues of A(λ) with Newton's, Halley's,
Laguerre's, and the modified Newton iteration (17), using suppression. The initial
value for every iteration was chosen λ₀ = −1 − 2i. The iterations were stopped if the
convergence criterion (9) was satisfied with ε = 10⁻⁸ or after 50 iteration steps. The
numerical results are listed in Table 1. For each α and each iteration method we give
two columns of numbers, one containing the index of the eigenvalue according to
(22) in the computed order of succession and the other containing the number of
iterations needed to obtain the specific eigenvalue to the desired accuracy. Beneath
these two columns the time to compute all 5 eigenvalues appears. The calculations
have been performed on an IBM 3084 in complex double precision arithmetic. The
actual programs were written in Fortran 77.
Table 1 shows that the cubically convergent iteration methods are faster than
Newton's iteration in spite of twice the amount of work per iteration step. While
Halley's iteration has only small advantages over Newton's iteration, Laguerre's
iteration is obviously superior to the other two methods. However, this may be
different if Δ(λ) is not a polynomial.
The modified Newton iteration performed badly. For some values of α the iteration
to find a third eigenvalue even failed to converge (and was stopped after 50 steps). In
this case the λ_k formed an infinite cycle. One of the points in the cycle was treated as
an eigenvalue and removed by suppression. A further drawback of the modified
Newton iteration is that it does not always converge towards one of the nearest
Table 1 (? = entry illegible in the source; - = no convergence within 50 steps)

α        Newton       Halley       Laguerre     mod. Newton (17)
0.5      ?  9         1  6         1  4         ?  7
         ? 10         3  7         2  5         ?  8
         ? 13         5 10         3  5         ? 21
         ? 16         2  7         4  6         ? 13
         ?  9         4  6         5  5         ?  7
         .0664 sec.   .0664 sec.   .0508 sec.   .0859 sec.
0.1      1 15         1  9         1  2         ? 13
         3 13         3  9         2  7         ? 16
         5 17         2  9         3  ?         - 50
         2 11         4  9         4  3         ?  8
         4 10         5  7         5  6         ? 10
         .0781 sec.   .0742 sec.   .0586 sec.   .141 sec.
0.01     1 21         1 13         1  9         2 13
         3 17         3 11         2  8         6 19
         4 20         2  8         3  5         5 17
         2 11         4 11         4  9         3  9
         5 11         5  7         5  5         7 17
         .0938 sec.   .0859 sec.   .0664 sec.   .113 sec.
10⁻⁴     1 33         1 19         1 13         3 13
         3 23         3 15         2 11         7 21
         4 27         2  8         3  5         - 50
         2 11         4 15         4 12         2 17
         5 11         5  7         5  5         4 12
         .113 sec.    .105 sec.    .0859 sec.   .160 sec.
10⁻⁶     1 42         1 25         1 18         2 11
         3 28         3 18         2 14         7 22
         4 32         2  8         3  5         - 50
         2 11         4 18         4 15         3 14
         5 11         5  7         5  5         4 12
         .133 sec.    .125 sec.    .0977 sec.   .160 sec.
10⁻⁸     1 49         1 29         2 21         1  9
         3 31         3 20         1 16         6 15
         4 36         2  8         3  6         - 50
         2 11         4 20         4 17         2 12
         5 11         5  7         5  5         4  7
         .145 sec.    .141 sec.    .113 sec.    .133 sec.
eigenvalues. It seems that, due to the poles of the function Δ(λ)/Δ'(λ), the initial
approximations λ₀ have to be chosen very close to a zero to obtain convergence
towards it.
Since the eigenvalues move together with decreasing α, the number of iterations
increases from top to bottom in Table 1. For α = 10⁻⁸ and α = 0 the algorithms
behave almost the same. This is not surprising, since in the neighbourhood of
multiple eigenvalues a significant loss in the accuracy of the computed function has
to be accepted. Indeed, we have to expect a relative error of about (macheps)^(1/m)
near a zero of multiplicity m [16]. Here macheps ≈ 10⁻¹⁶. Therefore we consider
our results satisfactory.
We modified Laguerre's and Halley's iteration methods as described in Section 3. As
soon as −f(λ_k)²/f'(λ_k) tended to an integer greater than 1, we switched over to
formula (15) or (16). In the convergence criterion (14) we set ε' = 10⁻⁴. The results
so obtained that differed from the ones given in Table 1 are listed in Table 2. The
additional "m" column indicates the computed multiplicities of the respective
eigenvalues. For α = 10⁻⁸, in the first columns the number of the nearest eigenvalue
appears. Once again Laguerre's iteration is superior to Halley's. On the other hand,
both modified algorithms have computed the eigenvalues of A(λ) with the desired
accuracy in a much shorter time than the original ones.
Table 2

α        Halley               Laguerre
         index  m  iter       index  m  iter
10⁻⁸     1      3   22        1      3   19
         4      2   15        4      2   12
         8      3    6        8      3    7
         .0742 sec.           .0664 sec.
0        1-3    3   19        1-3    3   14
         4-5    2   15        4-5    2   12
         6-8    3    3        6-8    3    3
         .0664 sec.           .0586 sec.
with A₀ and A₂ n × n matrices (their entries are garbled in the source) and A₁ = A₃ = I.
Numerical computations indicate that A(λ) has n purely real simple eigenvalues in
the interval (−n−1, 0) and n complex conjugate pairs of eigenvalues with positive
real parts. For n = 5, 10, and 20, two eigenvalues have been computed. The iterations
were started with λ₀ = −n − 1. The numerical results for this example are shown in
Table 3. They confirm the advantages of the cubically convergent methods over
Newton's iteration and particularly show the superiority of Laguerre's iteration.
Once again the modified Newton iteration method is the slowest and does not
converge to the nearest zeros.
Table 3 (? = eigenvalue index illegible in the source)

n    Newton       Halley       Laguerre     mod. Newton (17)
5    1   10       1    6       1    4       ?    8
     2   13       2    8       2    5       ?    8
     .0313 sec.   .0273 sec.   .0273 sec.   .0313 sec.
10   1   11       1    7       1    4       ?    7
     2   15       2    9       2    5       ?   13
     .133 sec.    .0977 sec.   .0781 sec.   .172 sec.
20   1   12       1    7       1    4       ?    7
     2   17       2   10       2    5       ?   15
     .918 sec.    .641 sec.    .500 sec.    1.28 sec.
    for k := i+1 to n do
      if abs(a[k,i]) > abs(a[kmax,i]) then kmax := k;
    max := a[kmax,i]; dmax := da[kmax,i];
    if max = 0 then f := 10/eps
    else
    begin
      f := f + dmax/max;
      if kmax <> i then
      begin
        h := a[i]; a[i] := a[kmax]; a[kmax] := h;
        h := da[i]; da[i] := da[kmax]; da[kmax] := h;
      end;
      (* elimination *)
      invmax := 1/max;
      for k := i+1 to n do
      begin
        fak := invmax * a[k,i];
        dfak := invmax * (da[k,i] - fak * dmax);
        for j := i+1 to n do
        begin
          a[k,j] := a[k,j] - fak * a[i,j];
          da[k,j] := da[k,j] - dfak * a[i,j] - fak * da[i,j]
        end;
      end
    end;
    i := i+1;
  until (i > n) or (max = 0)
end;
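A Python transcription of the listing above (an illustration, not the authors' program; a and da hold A(λ) and A'(λ) and are overwritten) accumulates f = Δ'/Δ as the sum of the logarithmic derivatives of the pivots; row interchanges only flip the sign of Δ and therefore do not affect f:

```python
import numpy as np

def trace_f(a, da):
    """f(lam) = Delta'(lam)/Delta(lam) by differentiated Gaussian elimination."""
    a, da = a.astype(complex), da.astype(complex)
    n = a.shape[0]
    f = 0.0 + 0.0j
    for i in range(n):
        kmax = i + np.argmax(np.abs(a[i:, i]))       # partial pivoting
        piv, dpiv = a[kmax, i], da[kmax, i]
        if piv == 0:                                 # the listing sets f := 10/eps here
            raise ZeroDivisionError("lam is (numerically) an eigenvalue")
        f += dpiv / piv
        if kmax != i:                                # interchange rows of a and da
            a[[i, kmax]] = a[[kmax, i]]
            da[[i, kmax]] = da[[kmax, i]]
        for k in range(i + 1, n):                    # elimination step and its derivative
            fak = a[k, i] / piv
            dfak = (da[k, i] - fak * dpiv) / piv
            a[k, i+1:] -= fak * a[i, i+1:]
            da[k, i+1:] -= dfak * a[i, i+1:] + fak * da[i, i+1:]
    return f

# hypothetical check: A(lam) = [[lam, 1], [1, lam]] at lam = 3, A' = I,
# where trace(A^-1 A') = 2*lam/(lam**2 - 1) = 0.75
val = trace_f(np.array([[3.0, 1.0], [1.0, 3.0]]), np.eye(2))
```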
References
[1] Gander, W.: On Halley's iteration method. Am. Math. Monthly 92, 131-134 (1985).
[2] Golub, G. H., van Loan, C. F. : Matrix Computations. Baltimore : The Johns Hopkins University
Press 1983.
[3] Hansen, E., Patrick, M.: A family of root finding methods. Numer. Math. 27, 257-269 (1977).
[4] Henrici, P. : Essentials of Numerical Analysis. New York: Wiley 1982.
[5] Hillstrom, K. E.: JAKEF - A Portable Symbolic Differentiator of Functions Given by
Algorithms. Techn. Report ANL-82-48. Argonne National Laboratory, Argonne, Illinois 1982.
[6] Joss, J.: Algorithmisches Differenzieren. Dissertation Nr. 5757, ETH Zürich 1976.
[7] Kahan, W. : Private communication (1981).
[8] Kedem, G. : Automatic differentiation of computer programs. ACM Trans. Math. Software 6,
150-165 (1980).
[9] Lancaster, P.: Lambda-Matrices and Vibrating Systems. Oxford: Pergamon Press 1966.
[10] Lancaster, P.: A review of numerical methods for eigenvalue problems nonlinear in the parameter.
In: Numerik und Anwendungen von Eigenwertaufgaben und Verzweigungsproblemen, pp. 43-67
(Bohl, E., Collatz, L., Hadeler, K. P., eds.). ISNM 38. Basel-Stuttgart: Birkhäuser 1977.
[11] Peters, G., Wilkinson, J. H.: Ax = λBx and the generalized eigenproblem. SIAM J. Numer. Anal.
7, 479-492 (1970).
[12] Rall, L. B.: Automatic Differentiation: Techniques and Applications. Lecture Notes in Computer
Science, vol. 120. Berlin-Heidelberg-New York: Springer-Verlag 1981.
[13] Rall, L. B.: Differentiation in Pascal-SC: Type GRADIENT. ACM Trans. Math. Software 10,
161-184 (1984).
[14] Ruhe, A.: Algorithms for the nonlinear eigenvalue problem. SIAM J. Numer. Anal. 10, 674-689
(1973).
[15] Speelpenning, B.: Compiling fast partial derivatives of functions given by algorithms. Techn.
Report UIUCDCS-R-80-1002, Dept. of Computer Science, Univ. of Illinois, Urbana, Illinois 1980.
[16] Stoer, J.: Einführung in die Numerische Mathematik I, 4. Aufl. Berlin-Heidelberg-New York-
Tokyo: Springer-Verlag 1983.
[17] Yang, W. H.: A method for eigenvalues of sparse λ-matrices. Int. J. Numer. Methods Engrg. 19,
943-984 (1983).
Computing 36/3