
Computing 36, 205-215 (1986)
© by Springer-Verlag 1986

Solving Nonlinear Eigenvalue Problems by Algorithmic Differentiation

P. Arbenz, Baden, and W. Gander, Zürich

Received May 10, 1985; revised August 20, 1985

Abstract -- Zusammenfassung

Solving Nonlinear Eigenvalue Problems by Algorithmic Differentiation. The eigenvalues of a matrix A(λ), the elements of which are complex functions of a complex variable λ, can be found with a zero finding method applied to the determinant function det A(λ). It is proposed to evaluate the derivatives of det A(λ) used in the zero finding method by algorithmic differentiation. This leads to a simple and lucid algorithm. Program listings and numerical examples are given.

AMS Subject Classifications: 65H15, 65F40.
Key words: Algorithm, differentiation, eigenvalue, nonlinear.

Solving nonlinear eigenvalue problems by algorithmic differentiation. (Translated from the German.) The eigenvalues of a matrix A(λ) whose elements are complex functions of a complex variable λ can be found by means of zero finding methods applied to the determinant function det A(λ). It is proposed to compute the derivatives of det A(λ) used in the zero finding methods by algorithmic differentiation. This leads to a simple and transparent algorithm. Program listings and numerical examples are given.

1. Introduction

Let A(λ) be an n × n matrix whose elements are differentiable functions with respect to the complex parameter λ. The number λ is an eigenvalue of A if there exists a nonzero vector x ∈ ℂⁿ such that

    A(λ) x = 0.                                                  (1)

Vectors x ≠ 0 satisfying (1) are referred to as eigenvectors corresponding to the eigenvalue λ. The set σ(A) of the eigenvalues of A is called the spectrum of A. For surveys on numerical methods for the solution of nonlinear eigenvalue problems of the form (1) see [9, 10, 14]. It is well known that (1) has nontrivial solutions if and only if

    Δ(λ) := det A(λ) = 0.                                        (2)

Since it is easy and numerically stable to evaluate Δ(λ) by Gaussian elimination with partial pivoting [2], an obvious way to compute eigenvalues of A(λ) is to apply the secant or Muller's method (i.e. successive linear or quadratic interpolation, respectively) to find zeros of the determinant Δ(λ) [11].
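As a concrete illustration of this determinant-based approach (a sketch in Python, not code from the paper), the secant method can be applied to det A(λ) for the hypothetical 2 × 2 example A(λ) = [[λ, 1], [2, λ]], whose eigenvalues are ±√2:

```python
# Secant iteration on the determinant function Delta(lam) = det A(lam),
# for the 2x2 example A(lam) = [[lam, 1], [2, lam]] with det = lam^2 - 2.

def det_A(lam):
    """Evaluate det A(lam) by the 2x2 determinant formula."""
    return lam * lam - 2.0

def secant(f, x0, x1, tol=1e-12, maxit=50):
    """Zero of f by successive linear interpolation (the secant method)."""
    f0, f1 = f(x0), f(x1)
    for _ in range(maxit):
        if f1 == f0:              # flat secant: cannot improve further
            return x1
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant step
        if abs(x2 - x1) <= tol * max(abs(x2), 1.0):
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return x1

lam = secant(det_A, 1.0, 1.5)
print(lam)   # an eigenvalue of A(lam): sqrt(2) = 1.41421...
```

The same idea works with complex starting values for complex eigenvalues; only the evaluation of det A(λ) changes for larger matrices.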

In this note we consider root finding methods, for instance Newton's iteration, that use derivatives of the function Δ(λ). While Lancaster [9] (see also [17] for an equivalent formula) deduces for the reciprocal of the Newton step size

    f(λ) := Δ'(λ)/Δ(λ) = trace{A⁻¹(λ) A'(λ)},   λ ∉ σ(A),        (3)

we propose here to evaluate f(λ) and its derivatives by algorithmic differentiation.

Remark: (3) is sometimes called Jacobi's formula [7].
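Formula (3) is easy to check numerically. The following Python sketch (an illustrative example, not from the paper) compares trace{A⁻¹(λ) A'(λ)} with Δ'(λ)/Δ(λ) for the hypothetical matrix A(λ) = [[λ, 1], [2, λ]], for which Δ(λ) = λ² − 2 and A'(λ) = I:

```python
# Check Jacobi's formula f(lam) = Delta'(lam)/Delta(lam) = trace(A^-1 A')
# on a small example: A(lam) = [[lam, 1], [2, lam]], so A'(lam) = I.

lam = 3.0
det = lam * lam - 2.0          # Delta(lam)
ddet = 2.0 * lam               # Delta'(lam)

# Inverse of the 2x2 matrix A = [[lam, 1], [2, lam]] via the adjugate.
inv = [[ lam / det, -1.0 / det],
       [-2.0 / det,  lam / det]]

# A'(lam) is the identity, so trace(A^-1 A') = trace(A^-1).
trace = inv[0][0] + inv[1][1]

print(trace, ddet / det)       # both equal 2*lam/(lam^2 - 2)
```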

2. Algorithmic Differentiation

Let f : ℂ ⊇ D → ℂ be a differentiable function and A an algorithm which for given s ∈ D computes the function value f(s).
By algorithmic differentiation, A is transformed into an algorithm A' computing not only f(s) but also f'(s). The differentiation is performed in two steps:
First, for each variable occurring in A and dependent on s, a new variable of the same type is declared to store the value of the derivative.
Second, before each statement of the form

    a := expr(b, c, ...)

in which a value is assigned to a variable dependent on s, a statement of the form

    da := expr'(b, db, c, dc, ...)

is inserted. Here da, db, and dc are the variables storing the derivatives of a, b, and c; expr' is the expression obtained from expr by formal differentiation.

If, e.g., A evaluates the polynomial p(s) = Σ_{i=0}^n a_i s^i for s = z,

    A:  p := 0;  for i := n downto 0 do p := p*z + a[i];

then its differentiated version

    A': dp := 0;  p := 0;
        for i := n downto 0 do
          begin dp := dp*z + p;  p := p*z + a[i] end;

computes both p(z) and p'(z). Thus algorithmic differentiation yields in A' the well-known scheme for evaluating the polynomial and its derivative by Horner's rule [4, 6].
There exist compilers that automatically translate A into A' [5, 6, 8, 12, 13, 15]. These compilers are often able to compute partial derivatives as well. However, for short programs it is little work to make the changes by hand using an editor.
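The transformed algorithm A' above can be transcribed directly; the following Python sketch mirrors the two-statement pattern (dp updated before p) for an example polynomial:

```python
# Algorithmically differentiated Horner's rule: each assignment to p is
# preceded by the corresponding assignment to its derivative variable dp.

def horner_ad(a, z):
    """Evaluate p(z) = sum a[i] z^i and p'(z) simultaneously (A' above)."""
    dp = 0.0
    p = 0.0
    for i in range(len(a) - 1, -1, -1):   # i := n downto 0
        dp = dp * z + p                   # inserted derivative statement
        p = p * z + a[i]                  # original statement of A
    return p, dp

# Example: p(s) = 1 + 2s + 3s^2, so p(2) = 17 and p'(s) = 2 + 6s, p'(2) = 14.
p, dp = horner_ad([1.0, 2.0, 3.0], 2.0)
print(p, dp)   # 17.0 14.0
```

Note that dp must be updated first, since its update uses the old value of p; this ordering is exactly what the insertion rule of the previous paragraph produces.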

3. Computing Eigenvalues of A(λ)

Let

    P(λ) A(λ) = L(λ) R(λ)                                        (4)

be the triangular decomposition of A(λ) obtained by Gaussian elimination with partial column pivoting. In (4), L is a unit lower triangular, R an upper triangular, and P a permutation matrix. Thus for the determinant we have

    Δ(λ) = det P(λ) ∏_{i=1}^n r_ii(λ)                            (5)

and

    Δ'(λ) = det P(λ) (d/dλ) ∏_{i=1}^n r_ii(λ)
          = det P(λ) Σ_{i=1}^n r'_ii(λ) ( ∏_{j=1, j≠i}^n r_jj(λ) ).   (6)

Using eqs. (5) and (6), f(λ) defined in (3) becomes

    f(λ) = Σ_{i=1}^n r'_ii(λ) / r_ii(λ),   λ ∉ σ(A).             (7)

Therefore, to perform the k-th step of the ordinary Newton-Raphson iteration we proceed as follows: Having evaluated the matrices A(λ_k) and A'(λ_k), we compute L(λ_k), L'(λ_k), R(λ_k), and R'(λ_k) by algorithmically differentiated Gaussian elimination (cf. Appendix). If any of the r_ii, 1 ≤ i ≤ n, vanishes, λ_k is an eigenvalue of A. Otherwise we form the sum f(λ_k) of (7) to obtain the next iterate by

    λ_{k+1} = λ_k - 1/f(λ_k).                                    (8)

The iteration is stopped as soon as

    |λ_{k+1} - λ_k| ≤ max{ε |λ_{k+1}|, ε}                        (9)

for some prescribed accuracy ε > 0.
Well programmed, the computation of f(λ_k) needs n^3 complex flops whether we use algorithmically differentiated Gaussian elimination or Jacobi's formula (3). A (complex) flop is the amount of work needed to perform a (complex) floating point multiplication/division, a (complex) floating point addition/subtraction, plus some index manipulations [2].
Algorithmic differentiation of Gaussian elimination has the advantage of providing in a very simple way an elegant and lucid algorithm for the computation of f(λ_k), without requiring the knowledge of linear algebra that is needed for the deduction of (3) (or (12) below). Furthermore, algorithmic differentiation is a valuable tool for obtaining derivatives in similar problems where explicit formulas like (3) or (12) are not available.
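A Pascal version of the differentiated elimination is listed in the Appendix; the same idea can be sketched compactly in Python (an illustrative transcription, not the authors' program; the 2 × 2 test matrix is a hypothetical example):

```python
# Differentiated Gaussian elimination with partial pivoting: alongside each
# update of the matrix a, the same update rule is applied to its derivative
# da. Returns f(lam) = Delta'(lam)/Delta(lam) = sum r'_ii / r_ii (eq. (7)).

def f_value(a, da):
    """a = A(lam), da = A'(lam) as lists of rows (complex entries allowed)."""
    n = len(a)
    a = [row[:] for row in a]          # work on copies
    da = [row[:] for row in da]
    f = 0.0
    for i in range(n):
        # partial pivoting in column i
        kmax = max(range(i, n), key=lambda k: abs(a[k][i]))
        a[i], a[kmax] = a[kmax], a[i]
        da[i], da[kmax] = da[kmax], da[i]
        piv, dpiv = a[i][i], da[i][i]
        if piv == 0:
            return None                # lam is an eigenvalue of A
        f += dpiv / piv                # contribution r'_ii / r_ii
        for k in range(i + 1, n):
            fak = a[k][i] / piv
            dfak = (da[k][i] - fak * dpiv) / piv   # derivative of fak
            for j in range(i + 1, n):
                da[k][j] -= dfak * a[i][j] + fak * da[i][j]
                a[k][j] -= fak * a[i][j]
    return f

# Newton iteration (8) on the example A(lam) = [[lam, 1], [2, lam]], A' = I.
lam = 1.0
for _ in range(20):
    f = f_value([[lam, 1.0], [2.0, lam]], [[1.0, 0.0], [0.0, 1.0]])
    if f is None:
        break                          # hit an eigenvalue exactly
    lam -= 1.0 / f                     # lam_{k+1} = lam_k - 1/f(lam_k)
print(lam)                             # converges to sqrt(2)
```

Note that det P(λ) cancels in the ratio Δ'/Δ, so the row exchanges need no bookkeeping beyond swapping the rows of both a and da.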

We may differentiate the above algorithm again to obtain f'(λ). This makes the application of cubically convergent iteration schemes possible, cf. Halley's iteration [1]

    λ_{k+1} = λ_k - Δ(λ_k) Δ'(λ_k) / ( Δ'(λ_k)^2 - (1/2) Δ(λ_k) Δ''(λ_k) )
            = λ_k - 2 f(λ_k) / ( f(λ_k)^2 - f'(λ_k) )                        (10)
            = λ_k - 2 / ( f(λ_k) - f'(λ_k)/f(λ_k) ),

a formula Lancaster [10] reports favourably upon.
We obtained better results for our test problems with Laguerre's iteration [9]

    λ_{k+1} = λ_k - K Δ(λ_k) / ( Δ'(λ_k) ± √( (K-1) [ (K-1) Δ'(λ_k)^2 - K Δ(λ_k) Δ''(λ_k) ] ) )
            = λ_k - K / ( f(λ_k) ± √( (1-K) [ K f'(λ_k) + f(λ_k)^2 ] ) ),    (11)

a method which can be recommended if Δ(λ) is a polynomial. The sign in the denominator of (11) is chosen to minimize the step length. Laguerre's method converges cubically to simple zeros for every real value K > 1; however, for global convergence, choosing K equal to the degree of the polynomial may be more efficient.
The evaluation of f(λ) and f'(λ) by algorithmic differentiation costs 2 n^3 complex flops. The use of formula (3) together with [9]

    f'(λ) = trace{ A(λ)⁻¹ A''(λ) - (A(λ)⁻¹ A'(λ))^2 }                        (12)

needs 2 n^3 complex flops as well.
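Given routines for f(λ) and f'(λ), the updates (10) and (11) are one-liners. The sketch below (illustrative Python, not from the paper) implements both steps, using cmath.sqrt for the complex square root and choosing the sign in (11) to minimize the step length; the scalar example Δ(λ) = λ² − 2, for which f(λ) = 2λ/(λ² − 2) and f'(λ) = −2(λ² + 2)/(λ² − 2)², is hypothetical:

```python
import cmath

def halley_step(lam, f, fp):
    """One Halley step, formula (10): lam - 2 f / (f^2 - f')."""
    return lam - 2.0 * f / (f * f - fp)

def laguerre_step(lam, f, fp, K):
    """One Laguerre step, formula (11); the sign in the denominator is
    chosen to maximize |denominator|, i.e. to minimize the step length."""
    root = cmath.sqrt((1.0 - K) * (K * fp + f * f))
    denom = f + root if abs(f + root) >= abs(f - root) else f - root
    return lam - K / denom

# Example: Delta(lam) = lam^2 - 2, so f = 2 lam/(lam^2 - 2) and
# f' = -2 (lam^2 + 2)/(lam^2 - 2)^2.
def f_pair(lam):
    d = lam * lam - 2.0
    return 2.0 * lam / d, -2.0 * (lam * lam + 2.0) / (d * d)

lam = 1.0 + 0j
for _ in range(10):
    f, fp = f_pair(lam)
    new = laguerre_step(lam, f, fp, K=2)
    if abs(new - lam) < 1e-12 * max(abs(new), 1.0):
        lam = new
        break
    lam = new
print(lam.real)   # converges to sqrt(2)
```

For this quadratic, Laguerre with K = 2 (the degree) lands on the root essentially in one step, which illustrates the remark about choosing K equal to the degree of the polynomial.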


All mentioned iteration methods converge only linearly to multiple roots. For iteration methods that use f(λ) and f'(λ) (for instance Halley's or Laguerre's iteration) we propose the following remedy: To detect slow convergence we make use of the function

    (d/dλ)(1/f(λ)) = - f'(λ)/f(λ)^2.                             (13)

If λ converges to an eigenvalue λ* with multiplicity m(λ*), then (13) converges to 1/m(λ*) with the same order of convergence. This is well known from the analysis of the Newton iteration method (see e.g. [16]). Therefore we simply checked in each iteration step whether

    | f'(λ_k)/f(λ_k)^2 - f'(λ_{k-1})/f(λ_{k-1})^2 | < ε' | f'(λ_k)/f(λ_k)^2 |,   k > 0.   (14)

The number ε' can be chosen greater than ε in (9). If (14) became true, we defined m to be the integer nearest to -f(λ_k)^2/f'(λ_k) and switched over to one of the iterations

    λ_{k+1} = λ_k - 2m / ( f(λ_k) - m f'(λ_k)/f(λ_k) ),                      (15)

    λ_{k+1} = λ_k - K / ( f(λ_k) ± √( ((m-K)/m) [ K f'(λ_k) + f(λ_k)^2 ] ) ),   K > m,   (16)

or

    λ_{k+1} = λ_k + f(λ_k)/f'(λ_k).                              (17)

(15) and (16) are the Halley and Laguerre formulae converging cubically towards roots of multiplicity m [3]. (17) is the Newton iteration method applied to the function Δ(λ)/Δ'(λ). It converges quadratically towards all zeros λ* of Δ(λ), since Δ(λ)/Δ'(λ) has simple roots only.
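The multiplicity estimate and the switch-over can be sketched as follows (illustrative Python, not the authors' Fortran). For the hypothetical Δ(λ) = (λ − λ*)^m we have f(λ) = m/(λ − λ*) and f'(λ) = −m/(λ − λ*)², so −f²/f' recovers m exactly and the modified Halley formula (15) reaches λ* in a single step:

```python
# Multiplicity estimate m ~ round(-f(lam)^2 / f'(lam)), cf. (13),
# and the modified Halley step (15) for a root of multiplicity m.

def estimate_multiplicity(f, fp):
    """Nearest integer to -f^2/f' (the reciprocal of the limit of (13))."""
    return round(-f * f / fp)

def modified_halley_step(lam, f, fp, m):
    """Formula (15): lam - 2m / (f - m f'/f)."""
    return lam - 2.0 * m / (f - m * fp / f)

# Example: Delta(lam) = (lam - 1)^3, so f = 3/(lam-1), f' = -3/(lam-1)^2.
lam = 4.0
f = 3.0 / (lam - 1.0)
fp = -3.0 / (lam - 1.0) ** 2
m = estimate_multiplicity(f, fp)
print(m)                                    # 3
print(modified_halley_step(lam, f, fp, m))  # the triple root 1, in one step
```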

Remarks:
Formulae (15) and (16) are obtained if Δ(λ) and K in (10) and (11) are replaced by Δ̃(λ) := Δ(λ)^{1/m} and K̂ := K/m. (Hence f̃ = (1/m) f and f̃' = (1/m) f'.) In (16) the hat of K̂ is omitted.
(17) can be obtained by replacing m in Halley's iteration (15) by -f(λ_k)^2/f'(λ_k).
To be able to compute more than one eigenvalue one can proceed as follows: Having already computed λ_1, ..., λ_q, we avoid recomputing one of these eigenvalues by suppression, i.e. we replace Δ(λ) in (5) by

    Δ̃(λ) := Δ(λ) / ∏_{j=1}^q (λ - λ_j).                         (18)

A simple computation shows that

    f̃(λ) = Δ̃'(λ)/Δ̃(λ) = f(λ) - Σ_{j=1}^q 1/(λ - λ_j)           (19)

and

    f̃'(λ) = f'(λ) + Σ_{j=1}^q 1/(λ - λ_j)^2                      (20)

with f(λ) defined in (7).
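The corrections (19) and (20) are cheap to apply once f and f' are available; a Python sketch on a hypothetical example (not from the paper):

```python
# Suppression of already computed eigenvalues, eqs. (19) and (20):
#   f~(lam)  = f(lam)  - sum_j 1/(lam - lam_j)
#   f~'(lam) = f'(lam) + sum_j 1/(lam - lam_j)^2

def suppress(lam, f, fp, found):
    """Return (f~, f~') with the eigenvalues in `found` divided out of Delta."""
    for lj in found:
        f -= 1.0 / (lam - lj)
        fp += 1.0 / (lam - lj) ** 2
    return f, fp

# Example: Delta(lam) = (lam-1)(lam-2), so f(lam) = 1/(lam-1) + 1/(lam-2).
lam = 3.0
f = 1.0 / (lam - 1.0) + 1.0 / (lam - 2.0)            # 1.5
fp = -1.0 / (lam - 1.0) ** 2 - 1.0 / (lam - 2.0) ** 2
ft, ftp = suppress(lam, f, fp, found=[1.0])
print(ft, ftp)   # 1.0 and -1.0: exactly f, f' of the remaining factor (lam-2)
```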

4. Numerical Examples

We consider first the test problem [9, 14]

    A(λ) = A0 + A1 λ + A2 λ^2                                    (21)

with A2 = I and with A0 and A1 the 4 × 4 matrices of [9, 14], whose entries are polynomials in α and β (they are not legibly reproduced in this copy). Here α is a non-negative real parameter and β = α + 1. The eigenvalues of (21) are

    λ_2 = λ̄_7 = -iβ
    λ_3 = λ̄_6 = -i                                               (22)
    λ_5 = 0.
For positive α the eigenvalues of A(λ) are all distinct, and for α = 0 there are triple eigenvalues -i and +i and a double eigenvalue 0.
For several values of α we computed 5 eigenvalues of A(λ) with Newton's, Halley's, Laguerre's, and the modified Newton iteration (17), using suppression. The initial value for every iteration was chosen as λ_0 = -1 - 2i. The iterations were stopped when the convergence criterion (9) was satisfied with ε = 10^-8, or after 50 iteration steps. The numerical results are listed in Table 1. For each α and each iteration method we give two columns of numbers, one containing the index of the eigenvalue according to (22) in the computed order of succession, and the other containing the number of iterations needed to obtain the specific eigenvalue to the desired accuracy. Beneath these two columns the time to compute all 5 eigenvalues appears. The calculations were performed on an IBM 3084 in complex double precision arithmetic. The programs were written in Fortran 77.
Table 1 shows that the cubically convergent iteration methods are faster than Newton's iteration in spite of the double amount of work per iteration step. While Halley's iteration has only small advantages over Newton's iteration, Laguerre's iteration is clearly superior to the other two methods. However, this may be different if A(λ) is not a polynomial.
The modified Newton iteration performed badly. For some values of α the iteration to find a third eigenvalue even failed to converge (and was stopped after 50 steps). In this case the λ_k formed an infinite cycle. One of the points of the cycle was treated as an eigenvalue and removed by suppression. A further drawback of the modified Newton iteration is that it does not always converge towards one of the nearest

Table 1

           Newton (8)       Halley (10)      Laguerre (11)    Mod. Newton (17)
  α        ev.  itns.       ev.  itns.       ev.  itns.       ev.  itns.

  0.5            9           1    6           1    4                7
                10           3    7           2    5                8
                13           5   10           3    5               21
                16           2    7           4    6               13
                 9           4    6           5    5                7
           .0664 sec.       .0664 sec.       .0508 sec.       .0859 sec.

  0.1       1   15           1    9           1                 2   13
            3   13           3    9           2                 7   16
            5   17           2    9           1                 3   50
            2   11           4    9           4                 3    8
            4   10           5    7           5                 6   10
           .0781 sec.       .0742 sec.       .0586 sec.       .141 sec.

  0.01      1   21           1   13           1    9           2   13
            3   17           3   11           2    8           6   19
            4   20           2    8           3    5           5   17
            2   11           4   11           4    9           3    9
            5   11           5    7           5    5           7   17
           .0938 sec.       .0859 sec.       .0664 sec.       .113 sec.

  10^-4     1   33           1   19           1   13           3   13
            3   23           3   15           2   11           7   21
            4   27           2    8           3    5           -   50
            2   11           4   15           4   12           2   17
            5   11           5    7           5    5           4   12
           .113 sec.        .105 sec.        .0859 sec.       .160 sec.

  10^-6     1   42           1   25           1   18           2   11
            3   28           3   18           2   14           7   22
            4   32           2    8           3    5           -   50
            2   11           4   18           4   15           3   14
            5   11           5    7           5    5           4   12
           .133 sec.        .125 sec.        .0977 sec.       .160 sec.

  10^-8     1   49           1   29           2   21           1    9
            3   31           3   20           1   16           6   15
            4   36           2    8           3    6           -   50
            2   11           4   20           4   17           2   12
            5   11           5    7           5    5           4    7
           .145 sec.        .141 sec.        .113 sec.        .133 sec.

  0        1-3  48          1-3  29          1-3  20          1-3   9
           1-3  30          1-3  19          1-3  16          6-8  14
           4-5  36          1-3  10          1-3   7           -   50
           1-3  12          4-5  20          4-5  17          1-3  12
           4-5  11          4-5   7          4-5   5          4-5   7
           .145 sec.        .133 sec.        .117 sec.        .133 sec.

eigenvalues. It seems that, due to the poles of the function Δ(λ)/Δ'(λ), the initial approximations λ_0 have to be chosen very close to a zero to obtain convergence towards it.
Since the eigenvalues move together with decreasing α, the number of iterations increases from top to bottom in Table 1. For α = 10^-8 and α = 0 the algorithms behave almost the same. This is not surprising, since in the neighbourhood of multiple eigenvalues a significant loss of accuracy in the computed function has to be accepted. Indeed, we have to expect a relative error of about (macheps)^{1/m} near a zero of multiplicity m [16]. Here macheps ≈ 10^-16. Therefore we consider our results satisfactory.
We modified Laguerre's and Halley's iteration methods as described in Section 3. As soon as -f(λ_k)^2/f'(λ_k) tended to an integer greater than 1, we switched over to formula (15) or (16). In the convergence criterion (14) we set ε' = 10^-4. The results so obtained that differ from the ones given in Table 1 are listed in Table 2. The additional "m" column indicates the computed multiplicities of the respective eigenvalues. For α = 10^-8, the first columns contain the index of the nearest eigenvalue. Once again Laguerre's iteration is superior to Halley's. On the other hand, both modified algorithms computed the eigenvalues of A(λ) with the desired accuracy in a much shorter time than the original ones.

Table 2

           Halley (10)-(15)            Laguerre (11)-(16)
  α        ev.    m    no. of itns.    ev.    m    no. of itns.

  10^-8    1      3    22              1      3    19
           4      2    15              4      2    12
           8      3     6              8      3     7
           .0742 sec.                  .0664 sec.

  0        1-3    3    19              1-3    3    14
           4-5    2    15              4-5    2    12
           6-8    3     3              6-8    3     3
           .0664 sec.                  .0586 sec.

The second test problem we considered was

    A(λ) = A0 + A1 λ + A2 λ^2 + A3 λ^3,   A_j ∈ ℝ^{n×n},         (23)

with A1 = A3 = I; the entries of the constant matrices A0 and A2 are not legible in this copy.

Numerical computations indicate that A(λ) has n purely real simple eigenvalues in the interval (-n-1, 0) and n complex conjugate pairs of eigenvalues with positive real parts. For n = 5, 10, and 20, two eigenvalues were computed. The iterations were started with λ_0 = -n-1. The numerical results for this example are shown in Table 3. They confirm the advantages of the cubically convergent methods over Newton's iteration and particularly show the superiority of Laguerre's iteration. Once again the modified Newton iteration is the slowest and does not converge to the nearest zeros.

Table 3

           Newton (8)       Halley (10)      Laguerre (11)    Mod. Newton (17)
  n        ev.  itns.       ev.  itns.       ev.  itns.       ev.  itns.

  5         1   10           1    6           1    4                8
            2   13           2    8           2    5                8
           .0313 sec.       .0273 sec.       .0273 sec.       .0313 sec.

  10        1   11           1    7           1    4                7
            2   15           2    9           2    5               13
           .133 sec.        .0977 sec.       .0781 sec.       .172 sec.

  20        1   12           1    7           1    4                7
            2   17           2   10           2    5               15
           .918 sec.        .641 sec.        .500 sec.        1.28 sec.

Appendix: Algorithmically Differentiated Gaussian Elimination

With the declarations

    const nn = 10;
    type vektor = array [1..nn] of real;
         matrix = array [1..nn] of vektor;

the following PASCAL procedure computes f(λ) from given a(λ) and da(λ) = a'(λ):

    procedure determinante (a, da: matrix; var f: real);
    const eps = 1e-7;
    var i, j, k, kmax: integer; h: vektor;
        invmax, fak, dfak, max, dmax: real;
    begin
      f := 0; i := 1;
      repeat
        (* look for pivot in i-th column *)
        kmax := i;
        for k := i+1 to n do
          if abs(a[k,i]) > abs(a[kmax,i]) then kmax := k;
        max := a[kmax,i]; dmax := da[kmax,i];
        if max = 0 then f := 10/eps
        else
          begin
            f := f + dmax/max;
            if kmax <> i then
              begin
                h := a[i]; a[i] := a[kmax]; a[kmax] := h;
                h := da[i]; da[i] := da[kmax]; da[kmax] := h
              end;
            (* elimination *)
            invmax := 1/max;
            for k := i+1 to n do
              begin
                fak := invmax * a[k,i];
                dfak := invmax * (da[k,i] - fak*dmax);
                for j := i+1 to n do
                  begin
                    a[k,j] := a[k,j] - fak*a[i,j];
                    da[k,j] := da[k,j] - dfak*a[i,j] - fak*da[i,j]
                  end
              end
          end;
        i := i+1
      until (i > n) or (max = 0)
    end;

In the case max = 0 we set f := 10/eps in order to satisfy the stopping criterion (9).

References

[1] Gander, W.: On Halley's iteration method. Amer. Math. Monthly 92, 131-134 (1985).
[2] Golub, G. H., Van Loan, C. F.: Matrix Computations. Baltimore: The Johns Hopkins University Press 1983.
[3] Hansen, E., Patrick, M.: A family of root finding methods. Numer. Math. 27, 257-269 (1977).
[4] Henrici, P.: Essentials of Numerical Analysis. New York: Wiley 1982.
[5] Hillstrom, K. E.: JAKEF - A Portable Symbolic Differentiator of Functions Given by Algorithms. Techn. Report ANL-82-48. Argonne National Laboratory, Argonne, Illinois 1982.
[6] Joss, J.: Algorithmisches Differenzieren. Dissertation Nr. 5757, ETH Zürich 1976.
[7] Kahan, W.: Private communication (1981).
[8] Kedem, G.: Automatic differentiation of computer programs. ACM Trans. Math. Software 6, 150-165 (1980).
[9] Lancaster, P.: Lambda-Matrices and Vibrating Systems. Oxford: Pergamon Press 1966.
[10] Lancaster, P.: A review of numerical methods for eigenvalue problems nonlinear in the parameter. In: Numerik und Anwendungen von Eigenwertaufgaben und Verzweigungsproblemen, pp. 43-67 (Bohl, E., Collatz, L., Hadeler, K. P., eds.). ISNM 38. Basel-Stuttgart: Birkhäuser 1977.
[11] Peters, G., Wilkinson, J. H.: Ax = λBx and the generalized eigenproblem. SIAM J. Numer. Anal. 7, 479-492 (1970).
[12] Rall, L. B.: Automatic Differentiation: Techniques and Applications. Lecture Notes in Computer Science, vol. 120. Berlin-Heidelberg-New York: Springer-Verlag 1981.
[13] Rall, L. B.: Differentiation in Pascal-SC: type GRADIENT. ACM Trans. Math. Software 10, 161-184 (1984).
[14] Ruhe, A.: Algorithms for the nonlinear eigenvalue problem. SIAM J. Numer. Anal. 10, 674-689 (1973).
[15] Speelpenning, B.: Compiling Fast Partial Derivatives of Functions Given by Algorithms. Techn. Report UIUCDCS-R-80-1002, Dept. of Computer Science, Univ. of Illinois, Urbana, Illinois 1980.
[16] Stoer, J.: Einführung in die Numerische Mathematik I, 4. Aufl. Berlin-Heidelberg-New York-Tokyo: Springer-Verlag 1983.
[17] Yang, W. H.: A method for eigenvalues of sparse λ-matrices. Int. J. Numer. Methods Engrg. 19, 943-984 (1983).

Dr. P. Arbenz                      Dr. W. Gander
Brown, Boveri & Cie.               Seminar für Angewandte Mathematik
Dept. CTT-Z                        ETH Zürich
CH-5401 Baden                      CH-8029 Zürich
Switzerland                        Switzerland

