Computational Implementation of the Multivariate Halley Method for Solving Nonlinear Systems of Equations

Article in ACM Transactions on Mathematical Software · March 1985
DOI: 10.1145/3147.3162 · Source: DBLP

Computational Implementation
of the Multivariate Halley Method for Solving
Nonlinear Systems of Equations
ANNIE A. M. CUYT
University of Antwerp
and
L. B. RALL
University of Wisconsin-Madison

Cubically convergent iterative methods for the solution of nonlinear systems of equations, such as
the multivariate Halley method, require first and second partial derivatives of the functions compris-
ing the system. Automatic differentiation is used to automate the Halley method, using the data type
HESSIAN and routines for the required operators and functions. A Pascal-SC program is given,
which implements this method in a single-step iteration mode. The program is applied to two
nonlinear systems, and the results are compared with Newton’s method.
Categories and Subject Descriptors: G.1.5 [Numerical Analysis]: Roots of Nonlinear Equations -
iterative methods, systems of equations; G.1.m [Numerical Analysis]: Miscellaneous
General Terms: Languages
Additional Key Words and Phrases: Automatic differentiation, cubic convergence, Halley method,
Pascal-SC, type HESSIAN

1. NONLINEAR SYSTEMS OF EQUATIONS


One of the central problems of scientific computation is the efficient numerical
solution of systems of $n$ equations
$$f_i(x_1, x_2, \ldots, x_n) = 0, \quad i = 1, 2, \ldots, n, \tag{1.1}$$
in $n$ unknowns $x_1, x_2, \ldots, x_n$. This is a special case of the operator equation
$$f(x) = 0, \tag{1.2}$$
in which $f: D \subset R^n \to R^n$, $0 \in R^n$ denotes the zero vector $0 = (0, 0, \ldots, 0)$, and
the point $x \in R^n$ is sought. If $f$ is an affine operator,
$$f(x) = Ax + b, \tag{1.3}$$

Research was sponsored by the Belgian National fund for Scientific Research (NFWO), and in part
by the U.S. Army under Contract No. DAAG29-80-C-0041.
Authors’ addresses: A. A. M. Cuyt, Department of Mathematics, University of Antwerp UIA,
B-2610 Wilrijk, Belgium; L. B. Rall, Mathematics Research Center, University of Wisconsin-
Madison, Madison WI 53706.
Permission to copy without fee all or part of this material is granted provided that the copies are not
made or distributed for direct commercial advantage, the ACM copyright notice and the title of the
publication and its date appear, and notice is given that copying is by permission of the Association
for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific
permission.
© 1985 ACM 0098-3500/85/0300-0020 $00.75
ACM Transactions on Mathematical Software, Vol. 11, No. 1, March 1985, Pages 20-36.

with the matrix $A = (a_{ij})$ and the vector $b = (b_1, b_2, \ldots, b_n)$ given, then the
system (1.1) is said to be linear. This important special case is now fairly well
understood in both theory and computational practice. Otherwise, (1.1) is a
nonlinear system, and the situation is quite different from the linear case with
respect to both theory and practice. Most of the methods for nonlinear systems
investigated to date [14, 15] involve some form of iteration, and many also involve
approximation of the nonlinear system by a linear system during the various
steps of the solution process, such as in the case of Newton’s method and its
many variants [14, 15]. It has been observed that some solution procedures work
better than others on a given problem, so that in the absence of a clear-cut
criterion for choosing the optimal method, it is advisable to have several choices
available in the form of computer programs that are easy to use.
It will be assumed that the operator $f$ corresponding to the system (1.1) has
first and second Fréchet derivatives $f'$, $f''$ on its domain $D \subset R^n$ [15]. In this
case, the first Fréchet derivative of $f$ at $x$ is represented by the Jacobian matrix

$$f'(x) = \left( \frac{\partial f_i(x)}{\partial x_j} \right) \tag{1.4}$$

and the second by the Hessian operator

$$f''(x) = \left( \frac{\partial^2 f_i(x)}{\partial x_j \, \partial x_k} \right) \tag{1.5}$$

[15]. Necessary values of the derivatives appearing in (1.4) and (1.5) will be
obtained by automatic differentiation [17], so that the user need only supply
expressions or subroutines for the $n$ functions $f_i(x_1, x_2, \ldots, x_n)$ appearing in
(1.1). This avoids both the labor of providing code for derivatives and the
inaccuracy of numerical differentiation. In [20], it was shown how to automate
the calculation of the Jacobian matrix (1.4) needed in Newton’s method by the
use of type GRADIENT. Here, type HESSIAN [18] will be used to evaluate both
(1.4) and (1.5), which are required for the computational implementation of a
cubically convergent iterative procedure, the multivariate Halley method due to
Cuyt [2, 3, 4].

2. THE MULTIVARIATE HALLEY METHOD


This method is based on the theory of abstract Padé approximants [2, 4], and
conditions for its numerical stability have been given by Cuyt [3]. The abstract
setting is a Banach algebra [15]: $R^n$ with multiplication and division of vectors
defined componentwise forms such a structure for the norm $\|x\| = \max_i |x_i|$,
for example. Halley’s method starts from an initial approximation $x^0$ to a solution
$x = x^*$ of (1.2), and then defines the sequence $\{x^\nu\}$ of successive approximations
by the following algorithm:

$$x^{\nu+1} = x^\nu + \frac{(a^\nu)^2}{a^\nu + \frac{1}{2} b^\nu}, \tag{2.1}$$

where
$$a^\nu = -f'(x^\nu)^{-1} f(x^\nu) \quad \text{(the Newton correction)},$$
and
$$b^\nu = f'(x^\nu)^{-1} f''(x^\nu) a^\nu a^\nu, \quad \nu = 0, 1, 2, \ldots.$$
In the actual computation, the Jacobian matrix $f'(x^\nu)$ is not inverted. Rather,
the linear system
$$f'(x^\nu) a^\nu = -f(x^\nu) \tag{2.2}$$
is solved for $a^\nu$, following which the linear system
$$f'(x^\nu) b^\nu = f''(x^\nu) a^\nu a^\nu \tag{2.3}$$
is then solved for $b^\nu$. Since the systems (2.2) and (2.3) have the same coefficient
matrix, the decomposition of the Jacobian matrix $f'(x^\nu)$ used to solve (2.2) can
also be used to solve (2.3), resulting in a saving of effort.
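
As a consistency check on (2.1), which is not part of the original text, the scalar case $n = 1$ can be worked out by hand. Writing $f$, $f'$, $f''$ for the values at $x^\nu$, the two corrections and the resulting step reduce to the classical Halley iteration:

```latex
a = -\frac{f}{f'}, \qquad
b = \frac{f''}{f'}\,a^{2} = \frac{f''\,f^{2}}{(f')^{3}},
\qquad
x^{\nu+1} = x^{\nu} + \frac{a^{2}}{a + \tfrac{1}{2} b}
          = x^{\nu} - \frac{2 f f'}{2 (f')^{2} - f f''} .
```

When $f'' = 0$, the correction $b$ vanishes and the step reduces to the Newton correction $a$.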
An outline of the computational effort for one step of Halley’s method is thus
(1) evaluation of $f(x^\nu)$, $f'(x^\nu)$, $f''(x^\nu)$;
(2) solution of (2.2) for $a^\nu$;
(3) evaluation of $f''(x^\nu) a^\nu a^\nu$;
(4) solution of (2.3) for $b^\nu$;
(5) calculation of the Halley correction $c^\nu = (a^\nu)^2 / (a^\nu + \frac{1}{2} b^\nu)$;
(6) addition of the Halley correction to $x^\nu$ to obtain $x^{\nu+1}$.
This sequence of operations is more elaborate than required for Newton’s
method [15, 20], which requires only the evaluation of $f(x^\nu)$, $f'(x^\nu)$, the solution
of (2.2) for $a^\nu$, and finally the addition of $a^\nu$ to $x^\nu$ to obtain $x^{\nu+1}$. However, in
favorable cases, the rate of convergence of Halley’s method is cubic, whereas
Newton’s method converges quadratically. Thus the greater effort required for
each step of Halley’s method could be offset if fewer steps are required to obtain
the accuracy desired. Two steps of Newton’s method can be combined to yield a
method with biquadratic convergence. However, this requires the solution of (2.2)
with different coefficient matrices $f'(x^\nu)$ and $f'(x^{\nu+1})$ and right sides $-f(x^\nu)$ and
$-f(x^{\nu+1})$.
For computational implementation, it is convenient to consider the steps of
Halley’s method to consist of a procedure for evaluation (Step 1), which will
depend on the specific system being solved, and a procedure for iteration (Steps
2-6), which will have the same form for all systems. The operations of componentwise
multiplication and division of vectors will also have to be provided in
addition to the standard vector operations. These are simple to define, since for
$a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n) \in R^n$, one has
$$ab = (a_1 b_1, a_2 b_2, \ldots, a_n b_n), \qquad a/b = (a_1/b_1, a_2/b_2, \ldots, a_n/b_n), \tag{2.4}$$
where division is defined in general only if $b_i \neq 0$, $i = 1, 2, \ldots, n$.
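
The six steps of this section can be illustrated by a minimal sketch, written here in Python rather than Pascal-SC and using only the standard library. The names `solve`, `halley_step`, `f`, `jac`, and `hess` are hypothetical and do not appear in the paper; the derivatives are coded by hand instead of being obtained by automatic differentiation, and the linear systems are solved twice from scratch by plain Gaussian elimination instead of reusing one decomposition as the text recommends. The test system is the two-equation exponential system that the paper treats later, with the starting point $(4.3, 2.0)$.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        if M[p][k] == 0.0:
            raise ZeroDivisionError("singular Jacobian matrix")
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def halley_step(x, f, jac, hess):
    """One step of (2.1); hess(x)[i] is the Hessian matrix of f_i."""
    n = len(x)
    fx, J, H = f(x), jac(x), hess(x)                 # Step 1
    a = solve(J, [-v for v in fx])                   # Step 2: Newton correction
    rhs = [sum(H[i][j][k] * a[j] * a[k]              # Step 3: f''(x) a a
               for j in range(n) for k in range(n)) for i in range(n)]
    b = solve(J, rhs)                                # Step 4
    c = []
    for ai, bi in zip(a, b):                         # Step 5: a^2 / (a + b/2)
        num, den = ai * ai, ai + 0.5 * bi
        c.append(0.0 if num == 0.0 and den == 0.0 else num / den)
    return [xi + ci for xi, ci in zip(x, c)]         # Step 6

# Hand-coded test system: exp(-x1 + x2) - 0.1 = 0, exp(-x1 - x2) - 0.1 = 0.
def f(x):
    return [math.exp(-x[0] + x[1]) - 0.1, math.exp(-x[0] - x[1]) - 0.1]

def jac(x):
    e1, e2 = math.exp(-x[0] + x[1]), math.exp(-x[0] - x[1])
    return [[-e1, e1], [-e2, -e2]]

def hess(x):
    e1, e2 = math.exp(-x[0] + x[1]), math.exp(-x[0] - x[1])
    return [[[e1, -e1], [-e1, e1]], [[e2, e2], [e2, e2]]]

x1 = halley_step([4.3, 2.0], f, jac, hess)
print(x1)
```

The first iterate produced this way can be compared with the values listed under RESULTS OF ITERATION 1 in Appendix D.1; small differences in the trailing digits are to be expected, since the sketch uses binary double precision rather than the Pascal-SC arithmetic.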



3. USE OF AUTOMATIC DIFFERENTIATION


Newton’s method and the method of Section 2 are sometimes shunned because
it is assumed that code has to be supplied for the derivatives of the functions $f_i$,
or because these functions are defined by subroutines rather than simple expressions.
However, since the rules for differentiation are well understood, the
computer itself can produce the required code by automatic differentiation of the
given expressions or subroutines [6, 17]. In the case of functions defined by
expressions, programs capable of obtaining first and second derivatives have
been in use for some time [5, 7, 15]. More recently, differentiation methods for
subroutines have also been developed [6, 16, 17, 18]. Since the latter case is the
most general, it will be examined here.
To illustrate the fundamental idea of automatic differentiation, consider a real
function $f$ of $n$ real variables $x = (x_1, x_2, \ldots, x_n)$. The pair $(f, f') = (f(x), \nabla f(x))$
is a datum of type GRADIENT for a given value of $x$ [18, 20], where $\nabla f(x)$
denotes the gradient vector
$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right). \tag{3.1}$$

Writing $F = (f, f')$ to represent an element of this new type of data, the next
step is to define the corresponding arithmetic operations to implement the rules
for differentiation in a computable form. For example, for $G = (g, g')$, addition
and multiplication are defined by
$$F + G = (f + g, \; f' + g'), \qquad F*G = (f*g, \; f*g' + g*f'), \tag{3.2}$$
respectively. Similarly, functions such as the sine function can be represented in
the form
$$\mathrm{GSIN}(F) = (\sin(f), \; \cos(f)*f'). \tag{3.3}$$
The independent variable $x_i$ is represented by the GRADIENT variable X[i] =
$(x_i, e_i)$, where $e_i$ is the $i$th unit vector, and the evaluation of a GRADIENT
expression will automatically yield both the values of the function $f(x)$ and its
gradient vector $\nabla f(x)$ at the given value of $x$. Thus the programmer need only
supply code for the evaluation of a function to get also its derivative, once the
standard set of GRADIENT operators and functions [20] is available.
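
The rules (3.2) and (3.3) translate almost literally into any language with operator overloading. The following Python sketch is an illustration only: the class name `Gradient` and the helpers `gsin` and `variable` are hypothetical, and only the two operators and the one function quoted above are provided.

```python
import math
from dataclasses import dataclass

@dataclass
class Gradient:
    f: float    # function value
    df: list    # gradient vector

    def __add__(self, other):
        # F + G = (f + g, f' + g')
        return Gradient(self.f + other.f,
                        [p + q for p, q in zip(self.df, other.df)])

    def __mul__(self, other):
        # F * G = (f*g, f*g' + g*f')
        return Gradient(self.f * other.f,
                        [self.f * q + other.f * p
                         for p, q in zip(self.df, other.df)])

def gsin(F):
    # GSIN(F) = (sin(f), cos(f)*f')
    return Gradient(math.sin(F.f), [math.cos(F.f) * p for p in F.df])

def variable(i, value, n):
    """Independent variable x_i = (value, e_i), with a 0-based index i."""
    e = [0.0] * n
    e[i] = 1.0
    return Gradient(value, e)

x1 = variable(0, 2.0, 2)
x2 = variable(1, 3.0, 2)
G = x1 * x2 + gsin(x1)      # g(x) = x1*x2 + sin(x1)
print(G.f, G.df)
```

Evaluating the expression for `G` yields the value of $g$ together with its gradient $(x_2 + \cos x_1, \; x_1)$ at the point $(2, 3)$, without any derivative code being written for $g$ itself.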
For the present purpose, second derivatives are needed, and so type GRADIENT
is extended to type HESSIAN, a datum of which is the triple $F = (f, f', f'') =
(f(x), \nabla f(x), Hf(x))$ [18], where $Hf(x)$ is the Hessian matrix

$$Hf(x) = \left( \frac{\partial^2 f(x)}{\partial x_j \, \partial x_k} \right). \tag{3.4}$$

Once again, there is no problem in the implementation of arithmetic operations
and standard functions, for example,
$$F + G = (f + g, \; f' + g', \; f'' + g''),$$
$$F*G = (f*g, \; f*g' + g*f', \; f*g'' + f'*g'^T + g'*f'^T + g*f''), \tag{3.5}$$
and
$$\mathrm{SIN}(F) = (\sin(f), \; \cos(f)*f', \; \cos(f)*f'' - \sin(f)*f'*f'^T). \tag{3.6}$$
The HESSIAN variables X[i] corresponding to the independent variables $x_i$
are X[i] = $(x_i, e_i, 0)$, where $e_i$ is the $i$th unit vector, and $0$ denotes the $n \times n$ zero
matrix. Thus evaluation of expressions of type HESSIAN yields the value of the
second derivative $f''(x)$ as well as the values of the function $f(x)$ and its first
derivative $f'(x)$. Although the formulations of HESSIAN operators and standard
functions are more complicated than those for type GRADIENT [20], programming
them is no real challenge, and this needs to be done only one time. Once
available, these subroutines shift the burden of differentiation from the programmer
to the computer, which is as it should be.
In order to calculate the Jacobian matrix (1.4) and Hessian operator (1.5) of a
vector-valued function $f(x) = (f_1(x), f_2(x), \ldots, f_n(x))$, each real-valued component
function $f_i(x)$ is defined to be of type HESSIAN. In this case, the $i$th row of the
Jacobian matrix $f'(x)$ is given by the gradient vector $\nabla f_i(x)$ of the $i$th component
function, and the Hessian matrix $Hf_i(x)$ of the $i$th component function will be
the $i$th “panel” of the Hessian operator $f''(x)$.
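
A compact Python sketch of the HESSIAN idea follows; it implements only the rules (3.5) and (3.6) and is not the Pascal-SC package of [18]. The class `Hessian`, the function `hsin`, and the two-component example function are assumptions made for illustration.

```python
import math

class Hessian:
    """Triple (f, df, hf): value, gradient vector, Hessian matrix."""
    def __init__(self, f, df, hf):
        self.f, self.df, self.hf = f, df, hf

    @staticmethod
    def variable(i, value, n):
        # Independent variable x_i = (value, e_i, 0), 0-based index i.
        df = [1.0 if j == i else 0.0 for j in range(n)]
        hf = [[0.0] * n for _ in range(n)]
        return Hessian(value, df, hf)

    def __add__(self, g):
        n = len(self.df)
        return Hessian(self.f + g.f,
                       [self.df[j] + g.df[j] for j in range(n)],
                       [[self.hf[j][k] + g.hf[j][k] for k in range(n)]
                        for j in range(n)])

    def __mul__(self, g):
        # f*g'' + f'*g'^T + g'*f'^T + g*f'', as in (3.5)
        n = len(self.df)
        df = [self.f * g.df[j] + g.f * self.df[j] for j in range(n)]
        hf = [[self.f * g.hf[j][k] + self.df[j] * g.df[k]
               + g.df[j] * self.df[k] + g.f * self.hf[j][k]
               for k in range(n)] for j in range(n)]
        return Hessian(self.f * g.f, df, hf)

def hsin(F):
    # SIN(F) = (sin f, cos f * f', cos f * f'' - sin f * f' f'^T), as in (3.6)
    n = len(F.df)
    s, c = math.sin(F.f), math.cos(F.f)
    return Hessian(s, [c * d for d in F.df],
                   [[c * F.hf[j][k] - s * F.df[j] * F.df[k]
                     for k in range(n)] for j in range(n)])

x = [Hessian.variable(0, 2.0, 2), Hessian.variable(1, 3.0, 2)]
F = [x[0] * x[1] + hsin(x[0]),    # f_1 = x1*x2 + sin(x1)
     x[0] * x[0] * x[1]]          # f_2 = x1^2 * x2
jacobian = [Fi.df for Fi in F]    # row i is the gradient of f_i
panels = [Fi.hf for Fi in F]      # panel i is the Hessian matrix of f_i
print(jacobian, panels)
```

The last two assignments mirror the closing paragraph above: the Jacobian matrix is assembled from the gradient components, and the Hessian operator is simply the list of panels.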

4. TYPE HESSIAN IN PASCAL-SC


In primitive computing languages such as FORTRAN, automatic differentiation
requires interpretation of expressions [5, 7] or precompilation [6, 17]. However,
in languages that permit user-defined operators, such as ALGOL 68, Ada, and
Pascal-SC [1, 13], statements can be written in ordinary notation, with derivatives
evaluated automatically. In order to make full use of the facilities already
available in Pascal-SC for vector and matrix arithmetic [22], type HESSIAN is
introduced in a way to make it consistent with the definitions of types RVECTOR
and RMATRIX for n-dimensional REAL vectors and matrices, respectively. The
standard declarations [22] are
CONST DIM = n;
TYPE DIMTYPE = 1..DIM;
     RVECTOR = ARRAY [DIMTYPE] OF REAL;                              (4.1)
     RMATRIX = ARRAY [DIMTYPE] OF RVECTOR;

Following these, type HESSIAN is declared by

TYPE HESSIAN = RECORD F: REAL; DF: RVECTOR; HF: RMATRIX END;         (4.2)

[19]. Thus, as the result of a subroutine for computation of $f(x)$ as the HESSIAN
variable F, one has

F.F = $f(x)$, F.DF = $\nabla f(x)$, F.HF = $Hf(x)$.                  (4.3)


A complete package of HESSIAN arithmetic operators and standard functions
has been prepared in Pascal-SC [18]. It is efficient to consider variables of types
INTEGER and REAL as constants for the purpose of differentiation, so a total
of 22 arithmetic operators are required. If K, R, and H denote generic variables
of types INTEGER, REAL, and HESSIAN, respectively, these are

+ H, K + H, H + K, R + H, H + R, H + H,
- H, K - H, H - K, R - H, H - R, H - H,
K * H, H * K, R * H, H * R, H * H,                                   (4.4)
K / H, H / K, R / H, H / R, H / H.

The power operator ** and various standard functions are also available for type
HESSIAN [18]. Typical examples of HESSIAN operators can be found in the
evaluation routine given in Appendix C.
In order to represent the vector $x = (x_1, x_2, \ldots, x_n)$ of independent variables
and the vector-valued function $f(x) = (f_1(x), f_2(x), \ldots, f_n(x))$, with components
of type HESSIAN, it is convenient to introduce the data type HESSVAR, defined
by
TYPE HESSVAR = ARRAY [DIMTYPE] OF HESSIAN;                           (4.5)
In this way, it is possible to code systems of equations (1.1) in a form that follows
ordinary mathematical notation. For example, the simple system

$$e^{-x_1 + x_2} - 0.1 = 0, \qquad e^{-x_1 - x_2} - 0.1 = 0, \tag{4.6}$$

investigated by Cuyt and Van der Cruyssen [2, 4] requires the following HESSIAN
operators and functions:
OPERATOR - (H: HESSIAN) RES: HESSIAN;
OPERATOR - (HA, HB: HESSIAN) RES: HESSIAN;
OPERATOR - (H: HESSIAN; R: REAL) RES: HESSIAN;                       (4.7)
OPERATOR + (HA, HB: HESSIAN) RES: HESSIAN;
FUNCTION HEXP(H: HESSIAN): HESSIAN;

The subroutines for these have to appear in the heading of the procedure
HESSEVAL(VAR X, F: HESSVAR) for the evaluation of $f(x)$ corresponding to
(4.6) (see Appendix C). The evaluation of $f$ and its first and second derivatives is
then carried out by the statements
F[1] := HEXP(-X[1] + X[2]) - 0.1;
                                                                     (4.8)
F[2] := HEXP(-X[1] - X[2]) - 0.1;

which follow the form of (4.6) exactly.


Similarly, HESSEVAL for the function $f(x)$ corresponding to the system
$$16x_1^4 + 16x_2^4 + x_3^4 - 16 = 0,$$
$$x_1^2 + x_2^2 + x_3^2 - 3 = 0, \tag{4.9}$$
$$x_1^3 - x_2 = 0,$$

considered in [20], requires the following operators:

OPERATOR * (K: INTEGER; H: HESSIAN) RES: HESSIAN;
OPERATOR ** (R: REAL; K: INTEGER) RES: REAL;
OPERATOR ** (H: HESSIAN; K: INTEGER) RES: HESSIAN;
OPERATOR + (HA, HB: HESSIAN) RES: HESSIAN;                           (4.10)
OPERATOR - (H: HESSIAN; K: INTEGER) RES: HESSIAN;
OPERATOR - (HA, HB: HESSIAN) RES: HESSIAN;

after which the evaluation of $f$ and its derivatives takes place by means
of the statements
F[1] := 16 * (X[1] ** 4) + 16 * (X[2] ** 4)
        + X[3] ** 4 - 16;                                            (4.11)
F[2] := X[1] ** 2 + X[2] ** 2 + X[3] ** 2 - 3;
F[3] := X[1] ** 3 - X[2];

which resemble (4.9). Parentheses are necessary in the first statement of
(4.11) because * and ** have the same priority in Pascal-SC [13]. The
coding for the system (4.9) is analogous to the statements given in Appendix C
for the system (4.6).
Step 1 of Halley’s method as described in Section 2 is thus carried
out simply by the evaluation of the function $f(x)$ as of type HESSVAR,
where the independent variable $x$ also has the same type, with current
value X[i].F = $x_i$, $i = 1, \ldots, n$. The value of the transformation F is
given by the RVECTOR B with components B[i] = F[i].F, and the Jacobian
matrix of F is the RMATRIX JACF with rows JACF[i] =
F[i].DF, $i = 1, \ldots, n$. As will be shown below, the panels F[i].HF of
the Hessian operator of F can be used directly in the computation, and
so it is not necessary to construct the operator itself.

5. SOLUTION OF LINEAR SYSTEMS OF EQUATIONS IN PASCAL-SC


Steps 2 and 4 of Halley’s method require the solution of linear systems
of equations, an operation that is also required by Newton’s method.
Pascal-SC provides the basic procedure LGLP for this purpose [22],
which is declared by
PROCEDURE LGLP(DIM, AKDIM: INTEGER; VAR A: RMATRIX; VAR B: RVECTOR;
               VAR Y: IVECTOR);

The meaning of DIM is the same as before; if one wishes to solve a smaller
system, the parameter AKDIM can be used to set the number of rows and
columns of the coefficient matrix A and components of the right side vector B
that enter into the computation. More significantly, instead of returning a
floating-point RVECTOR x as an approximate solution of the linear system

$$Ax = B, \tag{5.1}$$
LGLP returns an interval vector (IVECTOR) Y, which, if proper, is guaranteed
to contain the exact solution $x$ of (5.1) [10, 21, 22]. Furthermore, successful
completion of LGLP guarantees that the floating-point matrix A is nonsingular
[21, 22]. If A is singular or extremely badly conditioned, LGLP will return an
improper interval vector Y with all components equal to the improper interval
[+1, -1] [22].

In actual practice, LGLP is observed to be highly accurate, even for matrices
that are known to be poorly conditioned [21]. In any case, if
$$Y = ([a_1, b_1], [a_2, b_2], \ldots, [a_n, b_n]) \tag{5.2}$$
is proper ($a_i \le b_i$ for $i = 1, \ldots, n$), one has
$$a_i \le x_i \le b_i, \quad i = 1, 2, \ldots, n, \tag{5.3}$$
for all components $x_i$ of the exact solution $x = (x_1, x_2, \ldots, x_n)$ of (5.1), from which
an approximate solution with known error bounds can be constructed [19]. This
kind of guaranteed accuracy is possible because Pascal-SC completely supports
interval arithmetic [8, 11, 12, 22] as well as accurate floating-point arithmetic.
Since Step 4 of Halley’s method requires the solution of another linear system
of equations with the same coefficient matrix as in Step 2, it is more efficient to
use the decomposition of the coefficient matrix and other auxiliary results from
the first system to solve the second than starting anew. For this purpose, the
Pascal-SC procedure LGLPR is provided [22]:
PROCEDURE LGLPR(DIM, AKDIM: INTEGER; VAR A: RMATRIX; VAR B: RVECTOR;
                NRS: BOOLEAN; VAR R: RMATRIX; VAR MB: IMATRIX;
                VAR Y: IVECTOR);
EXTERNAL 522;

For the first system to be solved, one sets NRS = FALSE, and then subsequently
NRS = TRUE for each new right side. The results from the first solution needed
later are stored as the real matrix R and interval matrix MB.
After solution of the linear systems (2.2) or (2.3), the interval vector Y has to
be checked and converted to a real vector, before the computation can be
continued. This is done by the function MID given in Appendix B.
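
The check-and-convert step can be sketched as follows in Python. This is an illustration of the idea only, not of the Pascal-SC interval types: an interval vector is modeled here as a list of `(inf, sup)` pairs, and the function name `mid` and the exception raised are assumptions. The MID routine of Appendix B tests only the first component and returns control to the operating system, whereas this sketch tests every component and raises an exception.

```python
def mid(y):
    """Return the midpoints of a proper interval vector y = [(inf, sup), ...].

    An improper component (inf > sup) signals a singular or badly
    conditioned coefficient matrix, in which case an exception is raised.
    """
    if any(lo > hi for lo, hi in y):
        raise ArithmeticError(
            "Jacobian matrix is singular or badly conditioned")
    # inf + (sup - inf)/2, the same midpoint formula as in Appendix B
    return [lo + (hi - lo) / 2 for lo, hi in y]

print(mid([(1.0, 3.0), (-2.0, -1.0)]))
```

The half-width `(hi - lo) / 2` of each component is then an error bound for the corresponding midpoint, which is how an approximate solution with known error bounds is obtained from (5.3).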

6. COMPUTATION WITH BILINEAR OPERATORS


In Step 3 of Halley’s method, the right side $f''(x^\nu) a^\nu a^\nu$ of the system (2.3) is
constructed by operating with the bilinear operator $f''(x^\nu)$ twice on the vector $a^\nu$.
The first operation yields a matrix, and the second a vector [15]. The way in
which HESSIAN variables are defined makes it easy to implement these operations
in terms of the vector and matrix operators available in Pascal-SC [22]. In
general, a bilinear operator
$$B = (b_{ijk}) \tag{6.1}$$
will be considered to be composed of $n$ matrices
$$B_1 = (b_{1jk}), \; B_2 = (b_{2jk}), \; \ldots, \; B_n = (b_{njk}), \tag{6.2}$$
called $i$-panels, or simply panels of $B$. For a vector $x \in R^n$, the matrix

$$A = (a_{ij}) = Bx = \left( \sum_{k=1}^{n} b_{ijk} x_k \right) \tag{6.3}$$

will have rows $A_i$ given by the matrix-vector product
$$A_i = B_i x, \quad i = 1, 2, \ldots, n. \tag{6.4}$$

Once the matrix $A$ is formed by computing the vectors (6.4), then the vector

$$Bxx = Ax = \left( \sum_{j=1}^{n} \sum_{k=1}^{n} b_{ijk} x_j x_k \right) \tag{6.5}$$

is obtained by a single additional matrix-vector multiplication. In the case
$B = f''(x^\nu)$, one has $B_i = Hf_i(x^\nu)$, so that
$$A_i = Hf_i(x^\nu) a^\nu, \quad i = 1, 2, \ldots, n, \tag{6.6}$$
and thus
$$f''(x^\nu) a^\nu a^\nu = A a^\nu, \tag{6.7}$$
so the required vector is obtained by a total of $(n + 1)$ matrix-vector multiplications.
In Pascal-SC, matrices are stored rowwise, and so no transposition is
required when forming the matrix $A$ from the vectors $A_i$ in (6.6) [22]. It is also
important to note that in Pascal-SC, scalar products of vectors and also matrix-vector
products are computed with the minimum possible round-off error; that
is, their values are obtained to the closest floating-point numbers [9, 10, 22].
This accuracy is far greater than can be obtained by the usual method of
simulation of these operations by sums of products of floating-point numbers
[10]. The calculation of the rows of $A$ by (6.6) and the vector (6.7) require only
the Pascal-SC operator
OPERATOR * (A: RMATRIX; B: RVECTOR) RES: RVECTOR;

for matrix-by-vector multiplication [22].
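
The panel-by-panel evaluation of (6.6) and (6.7) can be sketched in a few lines of Python. The function names `matvec` and `apply_twice` are hypothetical, and ordinary floating-point sums are used, so the sketch does not reproduce the exactly rounded scalar products of Pascal-SC.

```python
def matvec(A, v):
    """Ordinary matrix-by-vector product."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in A]

def apply_twice(panels, a):
    """Compute B a a for a bilinear operator given by its panels B_1..B_n."""
    A = [matvec(B_i, a) for B_i in panels]   # n products: rows A_i = B_i a
    return matvec(A, a)                      # one more product: A a

# Panels for f_1 = x1*x2 and f_2 = x1^2 (constant Hessian matrices).
panels = [[[0.0, 1.0], [1.0, 0.0]],
          [[2.0, 0.0], [0.0, 0.0]]]
result = apply_twice(panels, [1.0, 2.0])
print(result)
```

Each component of the result is the quadratic form of the corresponding panel evaluated at the vector, so for this small example the two components are $2 a_1 a_2$ and $2 a_1^2$.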

7. AN ITERATION PROCEDURE FOR HALLEY’S METHOD


In order to write a procedure for one step of Halley’s method (2.1), all that is
needed in addition to the above is the calculation of the Halley correction in
Step 5 and the addition of this vector to the initial vector (Step 6). Calculation
of the Halley correction requires operators for the componentwise multiplication
and division of vectors. Suitable formulations of these are as follows:
OPERATOR * (VA, VB: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE; U: RVECTOR;
BEGIN
  FOR I := 1 TO DIM DO                                               (7.1)
    U[I] := VA[I] * VB[I];
  RES := U
END;

and
OPERATOR / (VA, VB: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE; U: RVECTOR;
BEGIN
  FOR I := 1 TO DIM DO
    IF (VA[I] = 0) AND (VB[I] = 0) THEN U[I] := 0
    ELSE U[I] := VA[I]/VB[I];                                        (7.2)
  RES := U
END;

In (7.2), the indeterminate form 0/0 is assigned the value 0, by continuity of the
Halley approximation. The calculation of the Halley correction also requires the
standard Pascal-SC operators
OPERATOR * (A: REAL; B: RVECTOR) RES: RVECTOR;
OPERATOR + (A, B: RVECTOR) RES: RVECTOR;

for multiplication of vectors by real numbers, and addition of vectors [22]. With
these and the componentwise operators (7.1) and (7.2), the Halley correction can
be evaluated by a statement of the form
CN := (AN * AN)/(AN + 0.5 * BN);                                     (7.3)
where, of course, AN = $a^\nu$, BN = $b^\nu$, CN = $c^\nu$. The current value of X is then
updated by the statement
FOR I := 1 TO DIM DO X[I].F := X[I].F + CN[I];                       (7.4)
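
For readers without access to Pascal-SC, the componentwise operators and the statement (7.3) can be mimicked in Python as sketched below. The function names `cmul`, `cdiv`, and `halley_correction` are hypothetical; the only convention taken from the text is that 0/0 is assigned the value 0.

```python
def cmul(u, v):
    """Componentwise product of two vectors, as in (7.1)."""
    return [ui * vi for ui, vi in zip(u, v)]

def cdiv(u, v):
    """Componentwise quotient, with 0/0 assigned the value 0, as in (7.2).

    Any other division by zero raises ZeroDivisionError.
    """
    return [0.0 if ui == 0.0 and vi == 0.0 else ui / vi
            for ui, vi in zip(u, v)]

def halley_correction(an, bn):
    """CN := (AN * AN)/(AN + 0.5 * BN), evaluated componentwise."""
    den = [ai + 0.5 * bi for ai, bi in zip(an, bn)]
    return cdiv(cmul(an, an), den)

print(halley_correction([2.0, 0.0], [4.0, 0.0]))
```

A component in which both corrections vanish therefore receives a zero Halley correction, and a component with $b_i = 0$ receives the Newton correction $a_i$.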
The steps required for a Halley iteration are collected in the form of the Pascal-
SC procedure given in Appendix B. Together with the procedure
HESSEVAL(VAR X, F: HESSVAR);

for the evaluation of the function $f(x)$ corresponding to the system of equations
(l.l), a program for the iterative solution of (1.1) by Halley’s method can be
constructed easily. A simple program of this type is given in Appendix A, which
presents the results of each iteration to the user, who can then decide whether
to iterate further, stop the iteration, or start over with another initial vector.
In the program of Appendix A, the compiler directive
$USES LGL, DIM = #;

brings in the necessary type declarations, sets the constant DIM to the dimension
of the system specified by the user [13], and refers the compiler to the external
library LGLLIB containing the linear equation-solving and matrix inversion
routines [13]. The $INCLUDE directives bring in the source code for the method
being used and for evaluation of the systems, which are in the external files
HALLEY.SRC and HESSEVAL.SRC, respectively [13]. In general, the program-
mer need only supply the file HESSEVAL.SRC for evaluation of the system
being solved, and modify the source code of the program ITERATE to set the
dimension and give the name of the method being used in the heading of the
output. The only place where modifications are necessary is indicated by “#” in
the source code file ITERATE.SRC.

8. NUMERICAL RESULTS
The method described in this paper was applied to the systems (4.8) and (4.9),
and the results were compared with those obtained by Newton’s method [20].
The initial approximations for the system (4.8) were
$$x_1 = 4.3, \quad x_2 = 2.0, \tag{8.1}$$
[3, 4], and the initial approximations for (4.9) were
$$x_1 = 1.0, \quad x_2 = 1.0, \quad x_3 = 1.0, \tag{8.2}$$
[20]. For the system (4.8), Newton’s method required 55 iterations to reduce the
residual to 0 to 12 decimal places, whereas Halley’s method required only five
iterations. On the other hand, for (4.9), the corresponding numbers were eight
iterations for Newton’s method, and five for Halley’s method, a result that is
more favorable to Newton’s method. The results are given in detail in Appendix
D.
The methodology presented in this paper can also be used to automate other
higher order methods for the solution of systems of equations, such as Chebyshev’s
method and the method of tangent hyperbolas [14, 15].
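
The comparison for the two-equation exponential system can be repeated informally with the Python sketch below. Several assumptions are made that are not in the paper: the derivatives are coded by hand, the 2 x 2 linear systems are solved by Cramer's rule, binary double precision is used, and iteration stops when the largest residual falls below a tolerance of 1e-12. The iteration counts obtained this way need not agree exactly with the 55 and five reported above.

```python
import math

def f(x):
    return [math.exp(-x[0] + x[1]) - 0.1, math.exp(-x[0] - x[1]) - 0.1]

def solve2(J, r):
    """Cramer's rule for a 2 x 2 system J y = r."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(r[0] * J[1][1] - J[0][1] * r[1]) / det,
            (J[0][0] * r[1] - r[0] * J[1][0]) / det]

def corrections(x):
    """Return the corrections a and b of (2.2) and (2.3) at x."""
    e1, e2 = math.exp(-x[0] + x[1]), math.exp(-x[0] - x[1])
    J = [[-e1, e1], [-e2, -e2]]
    a = solve2(J, [-(e1 - 0.1), -(e2 - 0.1)])
    # Panel i is e_i * g_i g_i^T with g_1 = (-1, 1) and g_2 = (-1, -1),
    # so f''(x) a a has components e_i * (g_i . a)^2.
    rhs = [e1 * (-a[0] + a[1]) ** 2, e2 * (-a[0] - a[1]) ** 2]
    b = solve2(J, rhs)
    return a, b

def iterate(x, method, tol=1e-12, max_iter=200):
    """Return (number of steps, final x) for 'newton' or 'halley'."""
    for k in range(max_iter + 1):
        if max(abs(v) for v in f(x)) < tol:
            return k, x
        a, b = corrections(x)
        if method == "newton":
            c = a
        else:
            c = [ai * ai / (ai + 0.5 * bi) if ai != 0.0 else 0.0
                 for ai, bi in zip(a, b)]
        x = [xi + ci for xi, ci in zip(x, c)]
    raise RuntimeError("no convergence within max_iter steps")

newton_steps, newton_x = iterate([4.3, 2.0], "newton")
halley_steps, halley_x = iterate([4.3, 2.0], "halley")
print(newton_steps, halley_steps, halley_x)
```

Both iterations approach the solution $x_1 = \ln 10$, $x_2 = 0$. The first Newton step from (8.1) overshoots far from the solution, after which the iterates move back only gradually, which is consistent with the large number of Newton iterations reported for this system.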

APPENDIX A. A SIMPLE PROGRAM TO DRIVE HIGHER ORDER ITERATIVE METHODS
PROGRAM ITERATE(INPUT,OUTPUT);
$USES LGL, DIM=#;                    (* DIMENSION OF SYSTEM *)

TYPE HESSIAN = RECORD F: REAL;       (* FUNCTION VALUE *)
                      DF: RVECTOR;   (* GRADIENT VECTOR *)
                      HF: RMATRIX    (* HESSIAN MATRIX *)
               END;
     HESSVAR = ARRAY [DIMTYPE] OF HESSIAN;

VAR X,F: HESSVAR;                    (* INDEPENDENT AND DEPENDENT VARIABLES *)
    I,J: DIMTYPE;                    (* INDEX VARIABLES *)
    C: CHAR;                         (* CONTROL CHARACTER *)
    K: INTEGER;                      (* ITERATION COUNTER *)

FUNCTION MRNULL: RMATRIX;            (* RETURNS THE ZERO MATRIX *)
VAR I,J: DIMTYPE;
    C: RMATRIX;
BEGIN
  FOR I:=1 TO DIM DO
    FOR J:=1 TO DIM DO
      C[I,J] := 0;
  MRNULL := C
END; (* FUNCTION MRNULL *)

$INCLUDE HESSEVAL.SRC;               (* SOURCE CODE FOR PROCEDURE HESSEVAL *)
$INCLUDE HALLEY.SRC;                 (* SOURCE CODE FOR ITERATION STEP *)

BEGIN (* PROGRAM ITERATE *)

WRITELN;                             (* SIGN-ON MESSAGE *)
WRITELN('HALLEY''S METHOD FOR SOLUTION OF SYSTEMS OF EQUATIONS');
WRITELN;

FOR I:=1 TO DIM DO                   (* INITIALIZATION OF INDEPENDENT VARIABLES *)
  BEGIN
    X[I].HF:=MRNULL;                 (* SETS HESSIANS TO ZERO MATRIX *)
    FOR J:=1 TO DIM DO X[I].DF[J]:=0;
    X[I].DF[I]:=1                    (* SETS GRADIENTS TO UNIT VECTORS *)
  END; (* INITIALIZATION OF INDEPENDENT VARIABLES *)

C:='R';WHILE C = 'R' DO
  BEGIN (* SYSTEM SOLUTION *)
    WRITELN;WRITELN('ENTER VALUES OF INDEPENDENT VARIABLES');
    FOR I:=1 TO DIM DO READ(X[I].F);
    WRITELN;WRITELN('INITIAL VALUES ARE');K:=0;
    C:='I';WHILE C = 'I' DO
      BEGIN (* ITERATION *)
        HESSEVAL(X,F);               (* EVALUATE SYSTEM AT CURRENT VALUE OF X *)
        WRITELN;
        FOR I:=1 TO DIM DO           (* PRINT VALUES OF X,F *)
          WRITELN('X[',I:2,'] = ',X[I].F,' F[',I:2,'] = ',F[I].F);
        WRITELN;
        WRITELN('ENTER "I" TO ITERATE, "R" TO RESTART, "Q" TO QUIT');
        READ(C,C);                   (* ITERATION CONTROL *)
        IF C = 'I' THEN
          BEGIN
            HALLEY(X,F);             (* ITERATION STEP *)
            K:=K+1;                  (* INCREASE ITERATION COUNTER *)
            WRITELN;WRITELN('RESULTS OF ITERATION ',K:3);
          END; (* ITERATION STEP *)
      END; (* ITERATION *)
  END (* SYSTEM SOLUTION *)

END. (* PROGRAM ITERATE *)

APPENDIX B. SOURCE CODE FOR THE MULTIVARIATE HALLEY METHOD

PROCEDURE HALLEY(VAR X,F: HESSVAR);

(* HALLEY METHOD *)

VAR I: DIMTYPE;
    JACF: RMATRIX;    (* THE JACOBIAN MATRIX *)
    AN: RVECTOR;      (* THE NEWTON CORRECTION *)
    BN: RVECTOR;
    CN: RVECTOR;      (* THE HALLEY CORRECTION *)
    A: RMATRIX;
    B: RVECTOR;       (* USED BY LGLPR *)
    R: RMATRIX;       (*       "       *)
    Y: IVECTOR;       (*       "       *)
    MB: IMATRIX;      (*       "       *)

(* STANDARD MATRIX AND VECTOR OPERATORS *)

OPERATOR + (A,B: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE;
BEGIN FOR I:=1 TO DIM DO A[I] := A[I]+B[I];
      RES:=A
END;

OPERATOR * (A: REAL; B: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE;
BEGIN FOR I:=1 TO DIM DO B[I] := A*B[I];
      RES:=B
END;

OPERATOR * (A: RMATRIX; B: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE;
    BVAR: RVECTOR;
BEGIN
  BVAR := B;
  FOR I:=1 TO DIM DO
    B[I] := SCALP(A[I],BVAR,0);
  RES:=B
END;

(* END OF STANDARD MATRIX AND VECTOR OPERATORS *)

(* OPERATORS FOR COMPONENTWISE MULTIPLICATION AND DIVISION OF VECTORS *)

OPERATOR * (A,B: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE;C: RVECTOR;
BEGIN FOR I:=1 TO DIM DO C[I]:=A[I]*B[I];
      RES:=C
END;

OPERATOR / (A,B: RVECTOR) RES: RVECTOR;
VAR I: DIMTYPE;C: RVECTOR;
BEGIN FOR I:=1 TO DIM DO
        IF (A[I]=0) AND (B[I]=0) THEN C[I]:=0
        ELSE C[I]:=A[I]/B[I];
      RES:=C
END;

(* FUNCTION TO CONVERT PROPER INTERVAL VECTOR TO REAL VECTOR *)

FUNCTION MID(VAR Y: IVECTOR): RVECTOR;
VAR I: DIMTYPE;C: RVECTOR;
BEGIN
  IF Y[1].INF <= Y[1].SUP THEN (* Y IS PROPER *)
    FOR I:=1 TO DIM DO C[I]:=Y[I].INF+(Y[I].SUP-Y[I].INF)/2
  ELSE (* Y IS IMPROPER *)
    BEGIN (* SEND ERROR MESSAGE AND RETURN TO OPERATING SYSTEM *)
      WRITELN('JACOBIAN MATRIX IS SINGULAR OR BADLY CONDITIONED');
      FOR I:=1 TO DIM DO Y[I].SUP:=Y[I].INF; (* RESET Y *)
      SVR(0) (* RETURN TO O/S *)
    END;
  MID:=C
END;

BEGIN (* HALLEY ITERATION *)

(* CALCULATE JACOBIAN MATRIX AND RIGHT SIDE OF (2.2) *)

FOR I:=1 TO DIM DO
  BEGIN
    JACF[I] := F[I].DF;  (* JACOBIAN MATRIX *)
    B[I] := -F[I].F      (* RIGHT HAND SIDE *)
  END;

(* SOLVE FOR NEWTON CORRECTION *)

LGLPR(DIM,DIM,JACF,B,FALSE,R,MB,Y);
AN := MID(Y);            (* NEWTON CORRECTION *)

(* CALCULATE RIGHT SIDE OF (2.3) AND SOLVE FOR BN *)

FOR I:=1 TO DIM DO A[I] := F[I].HF*AN;
B:=A*AN;
LGLPR(DIM,DIM,JACF,B,TRUE,R,MB,Y);
BN:=MID(Y);

(* CALCULATE HALLEY CORRECTION *)

CN := (AN*AN)/(AN + 0.5*BN);  (* HALLEY CORRECTION *)

FOR I:=1 TO DIM DO X[I].F := X[I].F + CN[I]; (* UPDATE VALUES OF X *)

END; (* HALLEY ITERATION *)

APPENDIX C. SOURCE CODE FOR HESSIAN EVALUATION OF THE SYSTEM (4.8)

PROCEDURE HESSEVAL(VAR X,F: HESSVAR);

(+ HESSIAN OPERATOM AND FUNCTIONS FOR SYSTEM EVALUATION l )

OPERATOR + (HA,HB: HFSSIAN) RES: HESSIAN; (* H + H l )


VAR 1.J: DIMTYPE;U: HESSIAN;
BEGIN U.F:=HA.F+HB.F;FOR I:=1 TO DIM DO
BEGIN U.DF[I]:=HA.DF[I]+HB.DF[I];
FOR J:=l TO DIM DO
U.HF[I,J]:=HA.HF[I,J]+HB.HF[I,J]
ENDi
REs:=u
END;

OPERATOR - (H: HESSIAN) RES: HESSIAN; (* -H l )


VAR 1,J: DIMTYPE;U: HESSIAN;
BEGIN U.F:=-H.F;FOR I:=1 TO DIM Do
BEGIN U.DF[I]:=-H.DF[I];
FOR J:=l TO DIM DO
U.HF[I,J]:=-H.HF[I,J]
END:
REs:=u
END:

OPERATOR - (H: HESSIAN;R: REAL) RES: HESSIAN; (* A - R l )


VAR U: HESSIAN;
BEGIN U.F:=H.F-R;U.DF:=H.DF;U.HF:=H.HF;
REs:=u
ENDi
ACM Transactions on Mathematical Software, Vol. 11, No. l,March1995.
34 l A. A. M. Cuyt and L. B. Rail

OPERATOR - (HA,HB: HESSIAN) RES: HESSIAN; (* H - H *)
VAR I,J: DIMTYPE;U: HESSIAN;
BEGIN U.F:=HA.F-HB.F;FOR I:=1 TO DIM DO
  BEGIN U.DF[I]:=HA.DF[I]-HB.DF[I];
    FOR J:=1 TO DIM DO
      U.HF[I,J]:=HA.HF[I,J]-HB.HF[I,J]
  END;
  RES:=U
END;

FUNCTION HEXP(H: HESSIAN): HESSIAN; (* HEXP *)
VAR I,J: DIMTYPE;U: HESSIAN;
BEGIN U.F:=EXP(H.F);
  FOR I:=1 TO DIM DO
    BEGIN U.DF[I]:=U.F*H.DF[I]; (* I LOOP *)
      FOR J:=1 TO I DO
        BEGIN U.HF[I,J]:=U.F*H.HF[I,J]+U.DF[I]*H.DF[J];
          IF I<>J THEN U.HF[J,I]:=U.HF[I,J]
        END;
    END; (* I LOOP *)
  HEXP:=U
END; (* FUNCTION HEXP *)
(* END OF HESSIAN OPERATORS AND FUNCTIONS *)

BEGIN (* HESSEVAL *)

(* DEFINITIONS OF FUNCTIONS IN SYSTEM *)

  F[1] := HEXP(-X[1] + X[2]) - 0.1;
  F[2] := HEXP(-X[1] - X[2]) - 0.1;

(* END OF DEFINITIONS OF SYSTEM FUNCTIONS *)

END; (* PROCEDURE HESSEVAL *)
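
The same operator set can be mimicked in a language with operator overloading. The following Python sketch is an illustration, not a translation of the Pascal-SC source: the class `Hessian`, its constructor `variable` (which seeds an independent variable with a unit gradient and a zero Hessian), and the function `hexp` are assumed names, and only the four operators and the exponential used by the system (4.8) are provided.

```python
import math

class Hessian:
    """Value f, gradient df[i], and Hessian hf[i][j] of one expression."""

    def __init__(self, f, df, hf):
        self.f, self.df, self.hf = f, df, hf

    @staticmethod
    def variable(value, index, dim):
        """Independent variable x[index] (assumed initialization):
        unit gradient in position index, zero Hessian."""
        df = [1.0 if i == index else 0.0 for i in range(dim)]
        hf = [[0.0] * dim for _ in range(dim)]
        return Hessian(value, df, hf)

    def __add__(self, other):                     # H + H
        n = len(self.df)
        return Hessian(self.f + other.f,
                       [self.df[i] + other.df[i] for i in range(n)],
                       [[self.hf[i][j] + other.hf[i][j] for j in range(n)]
                        for i in range(n)])

    def __neg__(self):                            # -H
        return Hessian(-self.f,
                       [-d for d in self.df],
                       [[-h for h in row] for row in self.hf])

    def __sub__(self, other):
        if isinstance(other, Hessian):            # H - H
            return self + (-other)
        # H - R: a real constant changes only the function value
        return Hessian(self.f - other, list(self.df),
                       [row[:] for row in self.hf])

def hexp(h):
    """exp(h) with first and second derivatives by the chain rule."""
    u = math.exp(h.f)
    df = [u * d for d in h.df]
    n = len(df)
    hf = [[u * h.hf[i][j] + df[i] * h.df[j] for j in range(n)]
          for i in range(n)]
    return Hessian(u, df, hf)

# System (4.8) at the initial values of Appendix D.1
x = [Hessian.variable(4.3, 0, 2), Hessian.variable(2.0, 1, 2)]
f1 = hexp(-x[0] + x[1]) - 0.1
f2 = hexp(-x[0] - x[1]) - 0.1
print(f1.f, f1.df, f1.hf)
print(f2.f, f2.df, f2.hf)
```

Writing the system once, in terms of overloaded operators, yields the function values together with the Jacobian rows `df` and the Hessians `hf` needed by the iteration, which is the point of the HESSIAN data type. Unlike HEXP, the sketch fills the whole Hessian rather than the lower triangle followed by a symmetric copy.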

APPENDIX D. NUMERICAL RESULTS

D.1. Halley’s Method for the System (4.8)

INITIAL VALUES ARE

X[ 1] =  4.30000000000E+00     F[ 1] =  2.58843723000E-04
X[ 2] =  2.00000000000E+00     F[ 2] = -9.81636952230E-02

RESULTS OF ITERATION 1

X[ 1] =  3.33615528246E+00     F[ 1] =  2.40511813000E-04
X[ 2] =  1.03597241993E+00     F[ 2] = -8.73756488903E-02

RESULTS OF ITERATION 2

X[ 1] =  2.56081800937E+00     F[ 1] =  1.44792584000E-04
X[ 2] =  2.59679794981E-01     F[ 2] = -4.04237220044E-02

RESULTS OF ITERATION 3

X[ 1] =  2.30817563469E+00     F[ 1] =  9.32479500000E-06
X[ 2] =  5.68378530500E-03     F[ 2] = -1.12110099570E-03

RESULTS OF ITERATION 4

X[ 1] =  2.30258515118E+00     F[ 1] =  3.02000000000E-10
X[ 2] =  6.12055700000E-08     F[ 2] = -1.19396000000E-08

RESULTS OF ITERATION 5

X[ 1] =  2.30258509299E+00     F[ 1] =  0.00000000000E+00
X[ 2] = -2.43356140000E-12     F[ 2] =  0.00000000000E+00
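
As a check on the final iterate, the system (4.8), as coded in HESSEVAL in Appendix C, can be solved in closed form by taking logarithms of both equations:

```latex
\begin{aligned}
e^{-x_1 + x_2} - 0.1 = 0 &\;\Longrightarrow\; -x_1 + x_2 = -\ln 10, \\
e^{-x_1 - x_2} - 0.1 = 0 &\;\Longrightarrow\; -x_1 - x_2 = -\ln 10,
\end{aligned}
\qquad\text{so that}\qquad
x_2 = 0, \quad x_1 = \ln 10 \approx 2.30258509299 .
```

The values of X[ 1] and X[ 2] after iteration 5 agree with this solution to the number of digits printed.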

D.2. Halley’s Method for the System (4.11)

INITIAL VALUES ARE

X[ 1] =  1.00000000000E+00     F[ 1] =  1.70000000000E+01
X[ 2] =  1.00000000000E+00     F[ 2] =  0.00000000000E+00
X[ 3] =  1.00000000000E+00     F[ 3] =  0.00000000000E+00

RESULTS OF ITERATION 1

X[ 1] =  8.91118701964E-01     F[ 1] =  9.37521623100E-01
X[ 2] =  7.05429347548E-01     F[ 2] = -9.44922445000E-03
X[ 3] =  1.30339083879E+00     F[ 3] =  2.20137281800E-03

RESULTS OF ITERATION 2

X[ 1] =  8.77982528233E-01     F[ 1] =  1.03685690000E-03
X[ 2] =  6.76786689302E-01     F[ 2] = -9.09324000000E-06
X[ 3] =  1.33082582033E+00     F[ 3] =  9.05738500000E-06

RESULTS OF ITERATION 3

X[ 1] =  8.77965760274E-01     F[ 1] =  0.00000000000E+00
X[ 2] =  6.76756970516E-01     F[ 2] =  0.00000000000E+00
X[ 3] =  1.33085541162E+00     F[ 3] =  2.00000000000E-12

RESULTS OF ITERATION 4

X[ 1] =  8.77965760274E-01     F[ 1] =  0.00000000000E+00
X[ 2] =  6.76756970517E-01     F[ 2] =  0.00000000000E+00
X[ 3] =  1.33085541162E+00     F[ 3] =  1.00000000000E-12

RESULTS OF ITERATION 5

X[ 1] =  8.77965760274E-01     F[ 1] =  0.00000000000E+00
X[ 2] =  6.76756970518E-01     F[ 2] =  0.00000000000E+00
X[ 3] =  1.33085541162E+00     F[ 3] =  0.00000000000E+00

REFERENCES
1. BOHLENDER, G., GRUNER, K., KAUCHER, E., KLATTE, R., KRAMER, W., KULISCH, U. W., RUMP,
S. M., ULLRICH, C., WOLFF VON GUDENBERG, J., AND MIRANKER, W. L. PASCAL-SC: A
PASCAL for contemporary scientific computation. Res. Rep. RC 9009, IBM Thomas J. Watson
Research Center, Yorktown Heights, N.Y., 1981.
2. CUYT, A. A. M. Abstract Padé approximants for operators: Theory and applications. Lecture
Notes in Mathematics, vol. 1065, Springer-Verlag, New York, 1984.
3. CUYT, A. A. M. Numerical stability of the Halley-iteration for the solution of a system of
nonlinear equations. Math. Comput. 38 (1982), 171-179.

4. CUYT, A. A. M., AND VAN DER CRUYSSEN, P. Abstract Padé approximants for the solution of
a system of nonlinear equations. Rep. 80-17, University of Antwerp UIA, Antwerp, Belgium,
1980.
5. GRAY, J. H., AND RALL, L. B. NEWTON: A general purpose program for solving nonlinear
systems. In Proceedings of the 1967 Army Numerical Analysis Conference. U. S. Army Research
Office, Durham, N.C., 1967, pp. 11-59.
6. KEDEM, G. Automatic differentiation of computer programs. ACM Trans. Math. Softw. 6, 2
(June 1980), 150-165.
7. KURA, D., AND RALL, L. B. A UNIVAC 1108 program for obtaining rigorous error estimates for
approximate solutions of systems of equations. Tech. Summary Rep. 1168, Mathematics Research
Center, University of Wisconsin-Madison, 1972.
8. KULISCH, U. A new arithmetic for scientific computation. In A New Approach to Scientific
Computation, U. Kulisch and W. L. Miranker, Eds. Academic Press, New York, 1983, pp. 1-26.
9. KULISCH, U., AND MIRANKER, W. L. Computer Arithmetic in Theory and Practice. Academic
Press, New York, 1981.
10. KULISCH, U., AND MIRANKER, W. L., Eds. A New Approach to Scientific Computation. Academic
Press, New York, 1983.
11. MOORE, R. E. Interval Analysis. Prentice-Hall, Englewood Cliffs, N. J., 1966.
12. MOORE, R. E. Techniques and Applications of Interval Analysis, vol. 2, SIAM Studies in Applied
Mathematics. SIAM, Philadelphia, Pa., 1979.
13. NEAGA, M. Pascal-SC Language Description and Programming Guide (German). Department
of Computer Science, University of Kaiserslautern, Kaiserslautern, W. Germany, 1982.
14. ORTEGA, J. M., AND RHEINBOLDT, W. C. Iterative Solution of Nonlinear Equations in Several
Variables. Academic Press, New York, 1970.
15. RALL, L. B. Computational Solution of Nonlinear Operator Equations. Krieger, Huntington,
N. Y., 1979.
16. RALL, L. B. Applications of software for automatic differentiation in numerical computation.
Computing, Suppl. 2 (1980), 141-156.
17. RALL, L. B. Automatic Differentiation: Techniques and Applications, Lecture Notes in Computer
Science, vol. 120. Springer-Verlag, Berlin, Heidelberg, New York, 1981.
18. RALL, L. B. Differentiation and generation of Taylor coefficients in PASCAL-SC. In A New
Approach to Scientific Computation, U. W. Kulisch and W. L. Miranker, Eds. Academic Press,
New York, 1983, pp. 291-309.
19. RALL, L. B. Representations of intervals and optimal error bounds. Math. Comput. 41, 163
(1983), 219-227.
20. RALL, L. B. Differentiation in Pascal-SC: Type GRADIENT. ACM Trans. Math. Softw. 10, 2
(June 1984), 161-184.
21. RUMP, S. Solving algebraic problems with high accuracy. In A New Approach to Scientific
Computation, U. W. Kulisch and W. L. Miranker, Eds. Academic Press, New York, 1983, pp.
53-120.
22. WOLFF VON GUDENBERG, J. Complete Arithmetic of the PASCAL-SC Computer: User Hand-
book (German). Institute for Applied Mathematics, University of Karlsruhe, Karlsruhe, W.
Germany, 1981.

Received February 1984; revised August 1984; accepted September 1984

ACM Transactions on Mathematical Software, Vol. 11, No. 1, March 1985.
