
J. Math. Anal. Appl. 298 (2004) 374–397

www.elsevier.com/locate/jmaa

A unifying local–semilocal convergence analysis
and applications for two-point Newton-like methods in Banach space
Ioannis K. Argyros
Cameron University, Department of Mathematical Sciences, Lawton, OK 73505, USA
Received 15 September 2003

Submitted by W.L. Wendland

Abstract
We provide a local as well as a semilocal convergence analysis for two-point Newton-like methods
in a Banach space setting under very general Lipschitz type conditions. Our equation contains a
Fréchet differentiable operator F and another operator G whose differentiability is not assumed.
Using more precise majorizing sequences than before we provide sufficient convergence conditions
for Newton-like methods to a locally unique solution of equation F (x) + G(x) = 0. In the semilocal
case we show under weaker conditions that our error estimates on the distances involved are finer
and the information on the location of the solution at least as precise as in earlier results. In the
local case a larger radius of convergence is obtained. Several numerical examples are provided to
show that our results compare favorably with earlier ones. As a special case we show that the famous
Newton–Kantorovich hypothesis is weakened under the same hypotheses as the ones contained in
the Newton–Kantorovich theorem.
© 2004 Elsevier Inc. All rights reserved.

Keywords: Newton-like method; Banach space; Majorizing sequence; Fréchet-derivative; Newton–Kantorovich method/hypothesis; Radius of convergence; Banach lemma on invertible operators

E-mail address: [email protected].

0022-247X/$ – see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.jmaa.2004.04.008

1. Introduction

In this study we are concerned with the problem of approximating a locally unique
solution x ∗ of the nonlinear equation
F (x) + G(x) = 0, (1)
where F, G are operators defined on a closed ball Ū(w, R) centered at a point w and of
radius R ≥ 0, which is a subset of a Banach space X, with values in a Banach space Y.
F is Fréchet-differentiable on Ū(w, R), while the differentiability of the operator G is not
assumed.
A large number of problems in applied mathematics and also in engineering are solved
by finding the solutions of certain equations [4,7,16]. For example, dynamic systems are
mathematically modeled by difference or differential equations, and their solutions usually
represent the states of the systems. For the sake of simplicity, assume that a time-invariant
system is driven by the equation ẋ = Q(x) (for some suitable operator Q), where x is
the state. Then the equilibrium states are determined by solving Eq. (1). Similar equa-
tions are used in the case of discrete systems. The unknowns of engineering equations can
be functions (difference, differential, and integral equations), vectors (systems of linear
or nonlinear algebraic equations), or real or complex numbers (single algebraic equations
with single unknowns). Except in special cases, the most commonly used solution methods
are iterative—when starting from one or several initial approximations a sequence is con-
structed that converges to a solution of the equation. Iteration methods are also applied for
solving optimization problems. In such cases, the iteration sequences converge to an op-
timal solution of the problem at hand. Since all of these methods have the same recursive
structure, they can be introduced and discussed in a general framework.
We use the two-point Newton method

y−1, y0 ∈ Ū(w, R),
yn+1 = yn − A(yn−1, yn)−1 [F(yn) + G(yn)] (n ≥ 0) (2)

to generate a sequence converging to x∗. Here A(x, y) ∈ L(X, Y), the space of bounded
linear operators from X into Y, for each fixed x, y ∈ Ū(w, R). We provide a local as well
as a semilocal convergence analysis for method (2) under very general Lipschitz-type
hypotheses (see (25), (26)).
Our new idea is to use center-Lipschitz conditions instead of Lipschitz conditions for the
upper bounds on the inverses of the linear operators involved. It turns out that this way we
obtain more precise majorizing sequences. Moreover, despite the fact that our conditions
are more general than related ones already in the literature [1–26], we can provide weaker
sufficient convergence conditions, and finer error bounds on the distances involved.
We note that our analysis is also useful in particular in the numerical solution of prob-
lems appearing in visco-elasticity [4,7]. However we leave the details to the motivated
reader. Finally we mention that our approach compares favorably with the classical and
elegant work of J.W. Schmidt on the Secant method (see [20–22] and our Example 2).
Several applications are provided: e.g., in the semilocal case we show that the famous
Newton–Kantorovich hypothesis (for its simplicity and transparency, see (5)) is weakened

(see (20)), whereas in the local case we can provide a larger convergence radius using the
same information (see (134) and (135)).

2. Semilocal convergence analysis of method (2)

Part A. Motivation

Deuflhard and Heindl [13] have proved the following affine invariant form of the
Newton–Kantorovich theorem [16] which is the motivation for this study.

Theorem 1. Let F : D ⊆ X → Y be a Fréchet-differentiable operator on an open convex
set D. Suppose that d0 ∈ D is such that F′(d0)−1 exists and

∥F′(d0)−1 F(d0)∥ ≤ η, (3)
∥F′(d0)−1 [F′(x) − F′(y)]∥ ≤ γ1 ∥x − y∥ for all x, y ∈ D and γ1 > 0, (4)
h = 2γ1η ≤ 1, (5)
Ū(d0, d1) = {x ∈ X | ∥x − d0∥ ≤ d1} ⊆ D, (6)

where

d1 = (1 − √(1 − h))/γ1. (7)

Then, the sequence {dn} (n ≥ 0) generated by Newton's method

dn+1 = dn − F′(dn)−1 F(dn) (n ≥ 0) (8)

is well defined, remains in Ū(d0, d1) for all n ≥ 0, and converges to a unique solution d∗
of the equation

F(d) = 0 (9)

in Ū(d0, d1) ∪ (D ∩ U(d0, d2)), where

d2 = (1 + √(1 − h))/γ1. (10)

Moreover the following error bounds hold:

∥dn+1 − dn∥ ≤ d̄n+1 − d̄n (11)

and

∥dn − d∗∥ ≤ d1 − d̄n, d1 = limn→∞ d̄n, (12)

where the sequence {d̄n} (n ≥ 0) is given by

d̄0 = 0, d̄1 = η, d̄n+2 = d̄n+1 + γ1(d̄n+1 − d̄n)² / (2(1 − γ1 d̄n+1)). (13)

Condition (5) is the famous Newton–Kantorovich hypothesis which is the essential


sufficient convergence condition for the semilocal convergence of Newton’s method (8).
However Newton’s method may converge to a solution of Eq. (9) even when (5) is violated.

Example 1. Let X = Y = R, d0 = 1, D = [p, 2 − p], p ∈ [0, 1/2), and define F on D by

F(d) = d³ − p. (14)

Using (3), (4) and (14) we get

η = (1/3)(1 − p), γ1 = 2(2 − p), (15)

which imply

h = (4/3)(1 − p)(2 − p) > 1 for all p ∈ [0, 1/2). (16)

That is, there is no guarantee that method (8) converges, since (5) is violated. However, one
can find values of p in [0, 1/2) such that method (8) converges. For example, if p = 0.48,
then using (8) we find d∗ = ∛0.48. Hence, we wonder if (5) can be weakened. Hypothesis
(5) is used to show that the majorizing sequence {d̄n} is monotonically increasing and
converges to d1. We have noticed that the sequence {d̿n} (n ≥ 0) given by

d̿0 = 0, d̿1 = η, d̿n+2 = d̿n+1 + γ1(d̿n+1 − d̿n)² / (2(1 − γ0 d̿n+1)) (17)

is also a majorizing sequence for Newton's method (8), and a more precise one than (13),
where γ0 is the center-Lipschitz constant such that

∥F′(d0)−1 [F′(x) − F′(d0)]∥ ≤ γ0 ∥x − d0∥ for all x ∈ D. (18)

In general the inequality

γ0 ≤ γ1 (19)

holds. Note also that in practice finding the constant γ1 requires the computation of γ0.
Hence no additional computational effort is required to compute (γ0, γ1) instead of γ1
alone. As is shown in a more general setting in what follows (see Application 2), in this
case (5) can be replaced by

h1 = (γ0 + γ1)η ≤ 1 (20)

(see also (72)). By comparing (5) and (20) we get h/2 ≤ h1 ≤ h. Moreover note that

h ≤ 1 ⇒ h1 ≤ 1, (21)

but not vice versa unless γ0 = γ1. Furthermore, as is shown in a more general setting,
for all n ≥ 0 (γ0 < γ1),

∥dn+1 − dn∥ ≤ d̿n+1 − d̿n < d̄n+1 − d̄n, (22)
∥dn − d∗∥ ≤ d3 − d̿n ≤ d1 − d̄n, (23)

and

d3 = limn→∞ d̿n ≤ d1 (24)

(see Remark 5). Hence we also obtain finer error bounds and at least as precise information
on the location of the solution d∗ as in Theorem 1. Returning to Example 1: since
γ0 = 3 − p, we find that (20) holds if p ∈ [(5 − √13)/3, 1/2), which improves Theorem 1.
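The comparison above can be checked numerically. The sketch below uses only the constants from (15) and (18), evaluates the classical condition (5) and the weaker condition (20) at p = 0.48, and runs Newton's method (8); the helper names are ours, not the paper's.

```python
# Example 1 in numbers: for F(d) = d^3 - p on D = [p, 2 - p] with d0 = 1,
# the Newton-Kantorovich quantity h = 2*gamma1*eta exceeds 1 at p = 0.48,
# yet Newton's method (8) still converges, and the weaker condition
# h1 = (gamma0 + gamma1)*eta <= 1 of (20) holds.

def constants(p):
    eta = (1 - p) / 3        # |F'(1)^{-1} F(1)|, see (15)
    gamma1 = 2 * (2 - p)     # Lipschitz constant of F'(1)^{-1} F', see (15)
    gamma0 = 3 - p           # center-Lipschitz constant, see (18)
    return eta, gamma0, gamma1

def newton(p, d=1.0, tol=1e-14, itmax=50):
    """Newton's method (8) applied to F(d) = d^3 - p."""
    for _ in range(itmax):
        step = (d**3 - p) / (3 * d**2)
        d -= step
        if abs(step) < tol:
            break
    return d

p = 0.48
eta, gamma0, gamma1 = constants(p)
h = 2 * gamma1 * eta            # condition (5): h > 1, so (5) fails
h1 = (gamma0 + gamma1) * eta    # condition (20): h1 <= 1, so (20) holds
print(h, h1, newton(p))
```

Here h ≈ 1.054 > 1 while h1 ≈ 0.964 ≤ 1, and the iteration still reaches d∗ = ∛0.48, illustrating that (20) genuinely enlarges the set of admissible problems.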

Part B. Main results

In order for us to show that these observations hold in a more general setting we first
need to introduce the following assumptions.
Let R ≥ 0 be given. Assume there exist v, w ∈ X such that A(v, w)−1 ∈ L(Y, X), and
for any x, y, z ∈ Ū(w, r) ⊆ Ū(w, R), t ∈ [0, 1], the following hold:

∥A(v, w)−1 [A(x, y) − A(v, w)]∥ ≤ h0(∥x − v∥, ∥y − w∥) + a (25)

and

∥A(v, w)−1 {[F′(y + t(z − y)) − A(x, y)](z − y) + G(z) − G(y)}∥
≤ {h1(∥y − w∥ + t∥z − y∥) − h2(∥y − w∥) + h3(∥z − x∥) + b}∥z − y∥, (26)

where h0(r, s), h1(r + r̄) − h2(r) (r̄ ≥ 0), h2(r), h3(r) are monotonically increasing
functions for all r, s on [0, R]², [0, R]², [0, R], [0, R], respectively, with h0(0, 0) = h1(0) =
h2(0) = h3(0) = 0, and the constants a, b satisfy a ≥ 0, b ≥ 0. Given y−1, y0, v, w in X,
define parameters c−1, c, c1 by

∥y−1 − v∥ ≤ c−1, ∥y−1 − y0∥ ≤ c, ∥v − w∥ ≤ c1. (27)

Remark 1. Conditions similar to (25) and (26), but less flexible, were considered by Chen and
Yamamoto in [11] in the special case when A(x, y) = A(x) for all x, y ∈ Ū(w, R) (A(x) ∈
L(X, Y)) (see also Theorem 4). Operator A(x) is intended there to be an approximation to
the Fréchet-derivative F′(x) of F. However we also want the choice of operator A to be
more flexible, and to be related to the difference G(z) − G(y) for all y, z ∈ Ū(w, R). It has
already been shown in special cases in [3,4,10] that in this way the rate of convergence of
method (2) is improved (see also Application 1). Note also that if we choose

A(x, y) = F′(x), G(x) = 0, w = d0,
h0(r, r) = γ0 r, h1(r) = h2(r) = γ1 r, h3(r) = 0 (28)

for all x, y ∈ Ū(w, R), r ∈ [0, R], and a = b = 0, then conditions (25) and (26) reduce to
(18) and (4), respectively. Other choices of operators, functions and constants appearing in
(25) and (26) can be found in the applications that follow.

With the above choices, we show the following result on majorizing sequences for
method (2).

Lemma 1. Assume that there exist parameters η ≥ 0, a ≥ 0, b ≥ 0, c−1 ≥ 0, c ≥ 0, δ ∈
[0, 2), r0 ∈ [0, R] such that

2∫₀¹ [h1(r0 + θη) − h2(r0) + b] dθ + 2h3(c + η) + δ[a + h0(c + c−1, η + r0)] ≤ δ, (29)

2η/(2 − δ) + r0 + c ≤ R, (30)

h0( ((1 − (δ/2)^(n+1))/(1 − δ/2))η + c + c−1, ((1 − (δ/2)^(n+2))/(1 − δ/2))η + r0 ) + a < 1, (31)

and

2∫₀¹ [h1( ((1 − (δ/2)^(n+1))/(1 − δ/2))η + θ(δ/2)^(n+1)η + r0 ) − h2( ((1 − (δ/2)^(n+1))/(1 − δ/2))η + r0 ) + b] dθ
+ 2h3( (1 + δ/2)(δ/2)^n η )
+ δ[a + h0( ((1 − (δ/2)^(n+1))/(1 − δ/2))η + c + c−1, ((1 − (δ/2)^(n+2))/(1 − δ/2))η + r0 )] ≤ δ (32)

for all n ≥ 0. Then, the iteration {tn} (n ≥ −1) given by

t−1 = r0, t0 = c + r0, t1 = c + r0 + η,
tn+2 = tn+1 + [{∫₀¹ [h1(tn − t0 + r0 + θ(tn+1 − tn)) − h2(tn − t0 + r0) + b] dθ + h3(tn+1 − tn−1)}(tn+1 − tn)]
/ [1 − a − h0(tn − t−1 + c−1, tn+1 − t0 + r0)] (33)

is monotonically increasing, bounded above by

t∗∗ = 2η/(2 − δ) + r0 + c, (34)

and converges to some t∗ such that

0 ≤ t∗ ≤ t∗∗ ≤ R. (35)

Moreover the following error bounds hold for all n ≥ 0:

0 ≤ tn+2 − tn+1 ≤ (δ/2)(tn+1 − tn) ≤ (δ/2)^(n+1) η. (36)

Proof. We must show

2∫₀¹ [h1(tk − t0 + r0 + θ(tk+1 − tk)) − h2(tk − t0 + r0) + b] dθ + 2h3(tk+1 − tk−1)
+ δ[a + h0(tk − t−1 + c−1, tk+1 − t0 + r0)] ≤ δ, (37)

0 ≤ tk+1 − tk, (38)

and

h0(tk − t−1 + c−1, tk+1 − t0 + r0) + a < 1 (39)

for all k ≥ 0. Estimate (36) then follows from (37)–(39) and (33).

We use induction on the integer k ≥ 0. For k = 0, (37)–(39) become

2∫₀¹ [h1(r0 + θη) − h2(r0) + b] dθ + 2h3(c + η) + δ[a + h0(c + c−1, η + r0)] ≤ δ,
0 ≤ t1 − t0,
h0(c + c−1, η + r0) + a < 1,

which hold by (29) and the definition of t1. By (33) we then get

0 ≤ t2 − t1 ≤ (δ/2)(t1 − t0).

Assume that (37)–(39) hold for all k ≤ n + 1. Using the induction hypotheses we obtain in turn

2∫₀¹ [h1(tk+1 − t0 + r0 + θ(tk+2 − tk+1)) − h2(tk+1 − t0 + r0) + b] dθ + 2h3(tk+2 − tk)
+ δ[a + h0(tk+1 − t−1 + c−1, tk+2 − t0 + r0)]
≤ 2∫₀¹ [h1( ((1 − (δ/2)^(k+1))/(1 − δ/2))η + θ(δ/2)^(k+1)η + r0 ) − h2( ((1 − (δ/2)^(k+1))/(1 − δ/2))η + r0 ) + b] dθ
+ 2h3( ((δ/2)^(k+1) + (δ/2)^k)η )
+ δ[a + h0( ((1 − (δ/2)^(k+1))/(1 − δ/2))η + c + c−1, ((1 − (δ/2)^(k+2))/(1 − δ/2))η + r0 )]
≤ δ

by (29) and (32). Hence (37) holds for k = n + 2. Moreover, we must show

tk ≤ t∗∗. (40)

We have

t−1 = r0 ≤ t∗∗, t0 = r0 + c ≤ t∗∗, t1 = c + r0 + η ≤ t∗∗,
t2 ≤ c + r0 + η + (δ/2)η = ((2 + δ)/2)η + r0 + c ≤ t∗∗.

Assume that (40) holds for all k ≤ n + 1. It follows from (33) and (37)–(39) that

tk+2 ≤ tk+1 + (δ/2)(tk+1 − tk) ≤ tk + (δ/2)(tk − tk−1) + (δ/2)(tk+1 − tk)
≤ ⋯ ≤ c + r0 + η + (δ/2)η + (δ/2)²η + ⋯ + (δ/2)^(k+1)η
= ((1 − (δ/2)^(k+2))/(1 − δ/2))η + r0 + c ≤ 2η/(2 − δ) + r0 + c = t∗∗. (41)

Hence, the sequence {tn} (n ≥ −1) is bounded above by t∗∗. Inequality (39) holds for k = n + 2
by (30) and (31). Moreover (38) holds for k = n + 2 by (41), and since (37) and (39) also
hold for k = n + 2. Furthermore, the sequence {tn} (n ≥ 0) is monotonically increasing by (38),
and as such it converges to some t∗ satisfying (35).

That completes the proof of Lemma 1. □
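To make Lemma 1 concrete, the sketch below evaluates iteration (33) with the h-functions passed as callables, instantiated with the Newton choices (28) for Example 1 at p = 0.48 (so a = b = c = c−1 = r0 = 0 and h3 = 0). Reading h0 as depending on its second argument is our own assumption, consistent with h0(r, r) = γ0 r in (28); the midpoint rule stands in for the θ-integral.

```python
# Majorizing iteration (33): t_{n+2} = t_{n+1} +
#   {int_0^1 [h1(t_n - t_0 + r0 + theta*dt) - h2(t_n - t_0 + r0) + b] dtheta
#    + h3(t_{n+1} - t_{n-1})} * dt / (1 - a - h0(t_n - t_{-1} + c_{-1}, t_{n+1} - t_0 + r0)).

def majorizing_33(h0, h1, h2, h3, eta, a=0.0, b=0.0, r0=0.0, c=0.0, c_m1=0.0,
                  steps=40, m=200):
    t = [r0, c + r0, c + r0 + eta]          # t_{-1}, t_0, t_1 as in (33)
    for _ in range(steps):
        tm1, tn, tn1 = t[-3], t[-2], t[-1]
        dt = tn1 - tn
        base = tn - t[1] + r0               # t_n - t_0 + r0
        integral = sum(h1(base + (j + 0.5) / m * dt) - h2(base) + b
                       for j in range(m)) / m   # midpoint rule in theta
        num = (integral + h3(tn1 - tm1)) * dt
        den = 1.0 - a - h0(tn - t[0] + c_m1, tn1 - t[1] + r0)
        t.append(tn1 + num / den)
    return t

# Newton choices (28) for Example 1, p = 0.48.
p = 0.48
eta, gamma0, gamma1 = (1 - p) / 3, 3 - p, 2 * (2 - p)
t = majorizing_33(lambda q1, q2: gamma0 * q2,   # h0 (assumed second-argument form)
                  lambda q: gamma1 * q,          # h1
                  lambda q: gamma1 * q,          # h2
                  lambda q: 0.0,                 # h3
                  eta)
delta = 1.0   # admissible by (72): (gamma1 + delta*gamma0)*eta <= delta
print(t[-1], 2 * eta / (2 - delta))   # t* stays below t** of (34)
```

With these choices (33) collapses to iteration (17); the computed sequence is monotonically increasing and settles well below t∗∗ = 2η/(2 − δ).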

We provide the main result on the semilocal convergence of method (2) using majorizing
sequence (33).

Theorem 2. Assume that the hypotheses of Lemma 1 hold, and that there exist

y−1 ∈ Ū(w, R), y0 ∈ Ū(w, r0) (42)

such that

∥A(y−1, y0)−1 [F(y0) + G(y0)]∥ ≤ η. (43)

Then, the sequence {yn} (n ≥ −1) generated by Newton-like method (2) is well defined,
remains in Ū(w, t∗) for all n ≥ −1, and converges to a solution x∗ of equation F(x) +
G(x) = 0. Moreover the following error bounds hold for all n ≥ −1:

∥yn+1 − yn∥ ≤ tn+1 − tn (44)

and

∥yn − x∗∥ ≤ t∗ − tn. (45)

Furthermore the solution x∗ is unique in Ū(w, t∗) if

∫₀¹ h1((1 + t)t∗) dt − h2(t∗) + h3(t∗) + h0(t∗ + c1, t∗) + a + b < 1, (46)

and in Ū(w, R0) for R0 ∈ (t∗, R] if

∫₀¹ h1(t∗ + tR0) dt − h2(t∗) + h3(R0) + h0(t∗ + c1, t∗) + a + b < 1, (47)

provided that x−1 = v and x0 = w.

Proof. We first show estimate (44), and that yn ∈ Ū(w, t∗) for all n ≥ −1. For n = −1, 0,
(44) follows from (27), (33) and (43). Suppose (44) holds for all n = 0, 1, . . . , k + 1; this
implies in particular (using (27), (42))

∥yk+1 − w∥ ≤ ∥yk+1 − yk∥ + ∥yk − yk−1∥ + ⋯ + ∥y1 − y0∥ + ∥y0 − w∥
≤ (tk+1 − tk) + (tk − tk−1) + ⋯ + (t1 − t0) + r0 = tk+1 − t0 + r0
≤ tk+1 ≤ t∗.

That is, yk+1 ∈ Ū(w, t∗).

We show (44) holds for n = k + 2. By (25) and (33) we have for all x, y ∈ Ū(w, t∗),

∥A(v, w)−1 [A(x, y) − A(v, w)]∥ ≤ h0(∥x − v∥, ∥y − w∥) + a. (48)

In particular, for x = yk and y = yk+1 we get, using (25) and (27),

∥A(v, w)−1 [A(yk, yk+1) − A(v, w)]∥ ≤ h0(∥yk − v∥, ∥yk+1 − w∥) + a
≤ h0(∥yk − y−1∥ + ∥y−1 − v∥, ∥yk+1 − y0∥ + ∥y0 − w∥) + a
≤ h0(tk − t−1 + c−1, tk+1 − t0 + r0) + a
≤ h0( ((1 − (δ/2)^k)/(1 − δ/2))η + c + c−1, ((1 − (δ/2)^(k+1))/(1 − δ/2))η + r0 ) + a < 1 (by (31)). (49)

It follows from (49) and the Banach lemma on invertible operators [16] that A(yk, yk+1)−1
exists, and

∥A(yk, yk+1)−1 A(v, w)∥ ≤ [1 − a − h0(tk − t−1 + c−1, tk+1 − t0 + r0)]−1. (50)

Using (2), (26), (33) and (50) we obtain in turn

∥yk+2 − yk+1∥ = ∥A(yk, yk+1)−1 [F(yk+1) + G(yk+1)]∥
= ∥A(yk, yk+1)−1 [F(yk+1) + G(yk+1) − A(yk−1, yk)(yk+1 − yk) − F(yk) − G(yk)]∥
≤ ∥A(yk, yk+1)−1 A(v, w)∥ ∥A(v, w)−1 [F(yk+1) − F(yk) − A(yk−1, yk)(yk+1 − yk) + G(yk+1) − G(yk)]∥
≤ [{∫₀¹ [h1(∥yk − w∥ + t∥yk+1 − yk∥) − h2(∥yk − w∥) + b] dt + h3(∥yk+1 − yk−1∥)} ∥yk+1 − yk∥]
/ [1 − a − h0(tk − t−1 + c−1, tk+1 − t0 + r0)]
≤ [{∫₀¹ [h1(tk − t0 + r0 + t(tk+1 − tk)) − h2(tk − t0 + r0) + b] dt + h3(tk+1 − tk−1)}(tk+1 − tk)]
/ [1 − a − h0(tk − t−1 + c−1, tk+1 − t0 + r0)]
= tk+2 − tk+1, (51)

which shows (44) for all n ≥ 0. Note also that

∥yk+2 − w∥ ≤ ∥yk+2 − yk+1∥ + ∥yk+1 − w∥ ≤ tk+2 − tk+1 + tk+1 − t0 + r0
= tk+2 − t0 + r0 ≤ tk+2 ≤ t∗. (52)

That is, yk+2 ∈ Ū(w, t∗).

It follows from (44) that {yn} (n ≥ −1) is a Cauchy sequence in the Banach space X, and
as such it converges to some x∗ ∈ Ū(w, t∗) (since Ū(w, t∗) is a closed set). By letting
k → ∞ in (51) we obtain F(x∗) + G(x∗) = 0. Estimate (45) follows from (44) by using
standard majorization techniques [8,16].

To show uniqueness in Ū(w, t∗), let y∗ be a solution of Eq. (1) in Ū(w, t∗). Define the
Newton-like iteration {xn} (n ≥ −1) by

x−1 = v, x0 = w,
xn+1 = xn − A(xn−1, xn)−1 [F(xn) + G(xn)] (n ≥ 0). (53)

Iteration {xn} (n ≥ −1) is a special case of {yn} (n ≥ −1). Hence, we have

∥xk+1 − xk∥ ≤ t̄k+1 − t̄k, limn→∞ xn = x∗,

and

∥x∗ − xk∥ ≤ t∗ − t̄k, limk→∞ t̄k = t∗, (54)

where {t̄n} is the sequence {tn} (n ≥ −1) for r0 = 0.

We shall show

∥y∗ − xk∥ ≤ t∗ − t̄k. (55)

For k = 0, (55) holds since y∗ ∈ Ū(w, t∗). Suppose (55) holds for all n ≤ k. Then, as in
(51), we obtain the identity

y∗ − xk+1 = y∗ − xk + A(xk−1, xk)−1 [F(xk) + G(xk)] − A(xk−1, xk)−1 [F(y∗) + G(y∗)]
= −A(xk−1, xk)−1 A(x−1, x0){A(x−1, x0)−1 [F(y∗) − F(xk) − A(xk−1, xk)(y∗ − xk) + G(y∗) − G(xk)]}. (56)

Using (56) we obtain in turn

∥y∗ − xk+1∥ ≤ [{∫₀¹ h1(∥xk − x0∥ + t∥y∗ − xk∥) dt − h2(∥xk − x0∥) + h3(∥y∗ − xk−1∥) + b} ∥y∗ − xk∥]
/ [1 − a − h0(∥xk−1 − x−1∥, ∥xk − x0∥)]
≤ [{∫₀¹ h1((1 + t)t∗) dt − h2(t∗) + h3(t∗) + b} ∥y∗ − xk∥] / [1 − a − h0(t∗ + c1, t∗)]
< ∥y∗ − xk∥ ≤ t∗ − t̄k → 0 as k → ∞. (57)

That is, x∗ = y∗.

If y∗ ∈ Ū(x0, R0), then as in (57) we get

∥y∗ − xk+1∥ ≤ [{∫₀¹ h1(t∗ + tR0) dt − h2(t∗) + h3(R0) + b} ∥y∗ − xk∥] / [1 − a − h0(t∗ + c1, t∗)]
< ∥y∗ − xk∥. (58)

Hence, again we get x∗ = y∗.

That completes the proof of Theorem 2. □

Remark 2. Conditions (31), (32) can be replaced by the stronger, but easier to check, conditions

h0(2η/(2 − δ) + c + c−1, 2η/(2 − δ) + r0) + a < 1 (59)

and

2∫₀¹ [h1(2η/(2 − δ) + θ(δ/2)η + r0) − h2(2η/(2 − δ) + r0) + b] dθ + 2h3((1 + δ/2)η)
+ δ[a + h0(2η/(2 − δ) + c + c−1, 2η/(2 − δ) + r0)] ≤ δ, (60)

respectively. Note also that conditions (29)–(32), (59), (60) are Newton–Kantorovich-type
hypotheses (see also (5)), which are always present in the study of Newton-like
methods [4,8,11,12,24,26].
Application 1. Let us consider some special choices of operator A, functions hi , i = 0, 1,
2, 3, parameters a, b and points v, w.
Define

A(x, y) = F′(y) + [x, y; G], (61)
v = y−1, w = y0, (62)

and set

r0 = 0, (63)

where F′ and [· , · ; G] denote the Fréchet-derivative of F and the divided difference of order
one of the operator G, respectively [4,8,10]. Hence, we consider Newton-like method (2) in
the form

yn+1 = yn − [F′(yn) + [yn−1, yn; G]]−1 [F(yn) + G(yn)] (n ≥ 0). (64)

This method was studied in [3,4,10]. It is shown to be of order (1 + √5)/2 ≈ 1.618 . . .
(the same as the order of the chord method), but higher than the order of

zn+1 = zn − F′(zn)−1 [F(zn) + G(zn)] (n ≥ 0) (65)

and

wn+1 = wn − A(wn)−1 [F(wn) + G(wn)] (n ≥ 0), (66)

where A(·) is an operator approximating F′ (see, e.g., [4,8,11,12,21]). Assume

∥A(y−1, y0)−1 [F′(y) − F′(y0)]∥ ≤ γ2 ∥y − y0∥, (67)
∥A(y−1, y0)−1 [F′(x) − F′(y)]∥ ≤ γ3 ∥x − y∥, (68)
∥A(y−1, y0)−1 ([x, y; G] − [y−1, y0; G])∥ ≤ γ4 (∥x − y−1∥ + ∥y − y0∥), (69)

and

∥A(y−1, y0)−1 ([x, y; G] − [z, x; G])∥ ≤ γ5 ∥z − y∥ (70)

for some nonnegative parameters γi, i = 2, 3, 4, 5, and all x, y ∈ Ū(y0, r) ⊆ Ū(y0, R).

Then we can define

a = b = 0, h1 = h2, h1(q) = γ3 q, h3(q) = γ5 q, and
h0(q1, q2) = γ4 q1 + (γ2 + γ4) q2. (71)

If the hypotheses of Theorem 2 hold for the above choices, the conclusions follow.
Note that conditions (67)–(70) are weaker than the corresponding ones in [10, pp. 48–
49], [3,4]. Indeed, the conditions

∥F′(x) − F′(y)∥ ≤ γ6 ∥x − y∥, ∥A(x, y)−1∥ ≤ γ7, ∥[x, y, z; G]∥ ≤ γ8,

and

∥[x, y; G] − [z, w; G]∥ ≤ γ9 (∥x − z∥ + ∥y − w∥)

for all x, y, z, w ∈ Ū(y0, r) are used there instead of (67)–(70), where [x, y, z; G] denotes
a second order divided difference of G at (x, y, z), and γi, i = 6, 7, 8, 9, are nonnegative
parameters.

Let us provide an example for this case.

Example 2. Let X = Y = (R², ∥ · ∥∞). Consider the system

3x²y + y² − 1 + |x − 1| = 0,
x⁴ + xy³ − 1 + |y| = 0.

Set ∥x∥∞ = ∥(x′, x″)∥∞ = max{|x′|, |x″|}, F = (F1, F2), and G = (G1, G2). For x =
(x′, x″) ∈ R² we take F1(x′, x″) = 3(x′)²x″ + (x″)² − 1, F2(x′, x″) = (x′)⁴ + x′(x″)³ − 1,
G1(x′, x″) = |x′ − 1|, G2(x′, x″) = |x″|. We take [x, y; G] ∈ M2×2(R) with entries

[x, y; G]i,1 = (Gi(y′, y″) − Gi(x′, y″))/(y′ − x′), [x, y; G]i,2 = (Gi(x′, y″) − Gi(x′, x″))/(y″ − x″),
i = 1, 2.
Using method (65) with z0 = (1, 0) we obtain
n    zn(1)              zn(2)              ∥zn − zn−1∥
0 1 0
1 1 0.333333333333333 3.333E−1
2 0.906550218340611 0.354002911208151 9.344E−2
3 0.885328400663412 0.338027276361322 2.122E−2
4 0.891329556832800 0.326613976593566 1.141E−2
5 0.895238815463844 0.326406852843625 3.909E−3
6 0.895154671372635 0.327730334045043 1.323E−3
7 0.894673743471137 0.327979154372032 4.809E−4
8 0.894598908977448 0.327865059348755 1.140E−4
9 0.894643228355865 0.327815039208286 5.002E−5
10 0.894659993615645 0.327819889264891 1.676E−5
11 0.894657640195329 0.327826728208560 6.838E−6
12 0.894655219565091 0.327827351826856 2.420E−6
13 0.894655074977661 0.327826643198819 7.086E−7
···
39 0.894655373334687 0.327826511746298 5.149E−19

Using the method of chord (i.e., (66) with A(wn) = [wn−1, wn; F + G]) with w0 = (5, 5),
w−1 = (1, 0), we obtain
n    wn(1)              wn(2)              ∥wn − wn−1∥
0 5 5
1 1 0 5.000E+00
2 0.989800874210782 0.012627489072365 1.262E−02
3 0.921814765493287 0.307939916152262 2.953E−01
4 0.900073765669214 0.325927010697792 2.174E−02
5 0.894939851625105 0.327725437396226 5.133E−03
6 0.894658420586013 0.327825363500783 2.814E−04
7 0.894655375077418 0.327826521051833 3.045E−04
8 0.894655373334698 0.327826521746293 1.742E−09
9 0.894655373334687 0.327826521746298 1.076E−14
10 0.894655373334687 0.327826521746298 5.421E−20

Using our method (64) with y0 = (5, 5), y−1 = (1, 0), we obtain
n    yn(1)              yn(2)              ∥yn − yn−1∥
0 5 5
1 1 0 5
2 0.909090909090909 0.363636363636364 3.636E−01
3 0.894886945874111 0.329098638203090 3.453E−02
4 0.894655531991499 0.327827544745569 1.271E−03
5 0.894655373334793 0.327826521746906 1.022E−06
6 0.894655373334687 0.327826521746298 6.089E−13
7 0.894655373334687 0.327826521746298 2.710E−20

We did not verify the hypotheses of Theorems 2 and 3 for the above starting points. However,
it is clear that these hypotheses are satisfied for all three methods for starting points closer
to the solution

x∗ = (0.894655373334687, 0.327826521746298),

read off from the tables displayed above.


Hence method (2) (i.e., method (64) in this case) converges faster in this example than
method (65), suggested in Chen and Yamamoto [11] and Zabrejko and Nguen [26], and
than the method of chord [21,22].
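Method (64) for the system of Example 2 is easy to reproduce. The sketch below hand-codes the 2 × 2 linear algebra and the divided difference [x, y; G] defined above; the guard returning 0 when a denominator vanishes (coincident components) is our own convention, not part of the paper.

```python
# Method (64) for the nondifferentiable system of Example 2:
#   3x^2*y + y^2 - 1 + |x - 1| = 0,  x^4 + x*y^3 - 1 + |y| = 0.

def F(x, y):
    return (3*x*x*y + y*y - 1, x**4 + x*y**3 - 1)

def Fprime(x, y):
    return [[6*x*y, 3*x*x + 2*y],
            [4*x**3 + y**3, 3*x*y*y]]

def G(x, y):
    return (abs(x - 1), abs(y))

def dd_G(u, v):
    """Divided difference [u, v; G] of order one, entries as in Example 2."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for i, Gi in enumerate([lambda a, b: abs(a - 1), lambda a, b: abs(b)]):
        m[i][0] = (Gi(v[0], v[1]) - Gi(u[0], v[1])) / (v[0] - u[0]) if v[0] != u[0] else 0.0
        m[i][1] = (Gi(u[0], v[1]) - Gi(u[0], u[1])) / (v[1] - u[1]) if v[1] != u[1] else 0.0
    return m

def solve2(A, b):
    """Cramer's rule for a 2x2 system A s = b."""
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    return ((b[0]*A[1][1] - b[1]*A[0][1]) / det,
            (b[1]*A[0][0] - b[0]*A[1][0]) / det)

def method_64(y_prev, y, iters=10):
    for _ in range(iters):
        Fp, D = Fprime(*y), dd_G(y_prev, y)
        A = [[Fp[i][j] + D[i][j] for j in range(2)] for i in range(2)]
        b = tuple(f + g for f, g in zip(F(*y), G(*y)))
        s = solve2(A, b)
        y_prev, y = y, (y[0] - s[0], y[1] - s[1])
    return y

sol = method_64((5.0, 5.0), (1.0, 0.0))
print(sol)  # approaches (0.894655373334687, 0.327826521746298)
```

The first step already reproduces the table entry (0.909090..., 0.363636...), and a handful of further steps reach the solution to machine precision, matching the superlinear behavior reported above.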
In the application that follows we show that the famous Newton–Kantorovich hypothe-
sis (see (5)) is weakened under the same hypotheses/information [16,18,19,23].

Application 2. Returning to Remark 1 and (28), iteration (2) reduces to the famous
Newton–Kantorovich method (8). Condition (29) reduces to

hδ = (γ1 + δγ0)η ≤ δ. (72)

Case 1. Let us restrict δ ∈ [0, 1]. Hypothesis (32) now becomes

2∫₀¹ γ1( ((1 − (δ/2)^(k+1))/(1 − δ/2))η + θ(δ/2)^(k+1)η ) dθ − 2γ1 ((1 − (δ/2)^(k+1))/(1 − δ/2))η
+ δγ0 ((1 − (δ/2)^(k+2))/(1 − δ/2))η ≤ 2∫₀¹ γ1 θη dθ + δγ0 η,

or

(γ0 δ²/(2 − δ) − γ1)(1 − (δ/2)^(k+1)) ≤ 0,

which is true for all k ≥ 0 by the choice of δ, since γ0 ≤ γ1 and δ²/(2 − δ) ≤ 1 for δ ∈ [0, 1].
Furthermore (31) gives

2γ0 η/(2 − δ) ≤ 2γ0 η < 1.

Hence in this case conditions (29), (31) and (32) reduce to (72) alone, provided δ ∈ [0, 1].
Condition (72) for, say, δ = 1 reduces to (20).

Case 2. It follows from Case 1 that (29), (31) and (32) reduce to (72),

2γ0 η/(2 − δ) ≤ 1 (73)

and

γ0 δ²/(2 − δ) ≤ γ1, (74)

respectively, provided δ ∈ [0, 2).

Case 3. It turns out that the range for δ can be extended (see also Example 3). Introduce the
conditions

γ0 η ≤ 1 − δ/2 for δ ∈ [δ0, 2),

where

δ0 = (−b + √(b² + 8b))/2, b = γ1/γ0, γ0 ≠ 0.

Indeed, the proof of Theorem 2 goes through if instead we show the weaker condition

γ1 (δ/2)^(k+1) η + (2δγ0 η/(2 − δ))(1 − (δ/2)^(k+2)) ≤ δ.

Using γ0 η ≤ 1 − δ/2, it suffices that

b − bδ/2 − δ²/2 ≤ 0,

that is, δ ≥ δ0, which is true by the choice of δ0.

Example 3. Returning to Example 1, but using Case 3, we can do better. Indeed, choose

p = p0 = 0.4505 < (5 − √13)/3 = 0.464816242 . . . .

Then we get

η = 0.183166 . . . , γ0 = 2.5495, γ1 = 3.099, and δ0 = 1.0656867 . . . .

Choose δ = δ0. Then we get

γ0 η = 0.466983415 . . . < 1 − δ0/2 = 0.46715665 . . . .

That is, the interval [(5 − √13)/3, 1/2) of Example 1 can be extended to at least [p0, 1/2).
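The figures in Example 3 can be reproduced directly; a sketch:

```python
# Example 3 checked numerically: at p = 0.4505 condition (20) fails, but the
# extended Case 3 condition gamma0*eta <= 1 - delta0/2, with
# delta0 = (-b + sqrt(b^2 + 8b))/2 and b = gamma1/gamma0, holds.

import math

p = 0.4505
eta = (1 - p) / 3        # see (15)
gamma0 = 3 - p           # see (18)
gamma1 = 2 * (2 - p)     # see (15)

b = gamma1 / gamma0
delta0 = (-b + math.sqrt(b * b + 8 * b)) / 2
print(eta, gamma0, gamma1, delta0)
print((gamma0 + gamma1) * eta)       # > 1: condition (20) fails here
print(gamma0 * eta, 1 - delta0 / 2)  # Case 3 condition holds
```

The printed values agree with the constants quoted in Example 3 to the displayed digits.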

In the example that follows we show that γ1 /γ0 can be arbitrarily large. Indeed

Example 4. Let X = Y = R, d0 = 0 and define functions F , G on R by


F(x) = c0 x + c1 + c2 sin(e^(c3 x)), G(x) = 0, (75)

where ci, i = 0, 1, 2, 3, are given parameters. Using (75), it can easily be seen that for c3
large and c2 sufficiently small, γ1/γ0 can be arbitrarily large.
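Example 4 can be probed numerically. The sketch below estimates γ0 and γ1 on D = [−1, 1] by sampling F′ on a grid; these grid suprema are rough stand-ins for the true Lipschitz constants (our own device, not the paper's), but they already show the ratio γ1/γ0 growing rapidly with c3.

```python
# F'(x) = c0 + c2*c3*exp(c3*x)*cos(exp(c3*x)); the scaling by F'(d0)^{-1}
# used in (4) and (18) cancels in the ratio gamma1/gamma0, so we estimate
# unscaled constants: gamma0 from |F'(x) - F'(0)|/|x|, gamma1 from the
# steepest sampled secant slope of F'.

import math

def Fprime(x, c0, c2, c3):
    return c0 + c2 * c3 * math.exp(c3 * x) * math.cos(math.exp(c3 * x))

def lipschitz_estimates(c0, c2, c3, n=2000):
    xs = [-1 + 2 * i / n for i in range(n + 1)]
    fp = [Fprime(x, c0, c2, c3) for x in xs]
    f0 = Fprime(0.0, c0, c2, c3)
    gamma0 = max(abs(f - f0) / abs(x) for x, f in zip(xs, fp) if x != 0.0)
    gamma1 = max(abs(fp[i + 1] - fp[i]) / (xs[i + 1] - xs[i]) for i in range(n))
    return gamma0, gamma1

g0_small, g1_small = lipschitz_estimates(1.0, 1e-4, 1.0)   # mild c3
g0_large, g1_large = lipschitz_estimates(1.0, 1e-4, 6.0)   # large c3
print(g1_small / g0_small, g1_large / g0_large)  # the ratio explodes with c3
```

For c3 = 1 the two constants are of the same order, while for c3 = 6 the sampled ratio is already in the hundreds, consistent with the claim that γ1/γ0 can be made arbitrarily large.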

Part C. Specialization to one-step methods

In order to compare with earlier results, we consider the case when x = y and v = w
(single-step methods). We can then prove, along the same lines as Lemma 1 and Theorem 2,
respectively, the following results, by assuming that there exists w ∈ X such that A(w)−1 ∈
L(Y, X), and that for any x, y ∈ Ū(w, r) ⊆ Ū(w, R), t ∈ [0, 1],

∥A(w)−1 [A(x) − A(w)]∥ ≤ g0(∥x − w∥) + α (76)

and

∥A(w)−1 {[F′(x + t(y − x)) − A(x)](y − x) + G(y) − G(x)}∥
≤ {g1(∥x − w∥ + t∥y − x∥) − g2(∥x − w∥) + g3(r) + β}∥y − x∥, (77)

where g0, g1, g2, g3, α, β are as h0 (now of one variable), h1, h2, h3, a, b, respectively.
Then we can show the following result on majorizing sequences.

Lemma 2. Assume that there exist η ≥ 0, α ≥ 0, β ≥ 0, δ ∈ [0, 2), r0 ∈ [0, R] such that

h̄δ = 2∫₀¹ [g1(r0 + θη) − g2(r0) + β] dθ + 2g3(r0 + η) + δ[α + g0(r0 + η)] ≤ δ, (78)

2η/(2 − δ) + r0 ≤ R, (79)

g0( (2η/(2 − δ))(1 − (δ/2)^(n+1)) + r0 ) + α < 1, (80)

and

2∫₀¹ g1( (2η/(2 − δ))(1 − (δ/2)^(n+1)) + r0 + θ(δ/2)^(n+1)η ) dθ
− 2g2( (2η/(2 − δ))(1 − (δ/2)^(n+1)) + r0 ) + 2g3( (2η/(2 − δ))(1 − (δ/2)^(n+1)) + r0 )
+ δ g0( (2η/(2 − δ))(1 − (δ/2)^(n+1)) + r0 ) ≤ δ (81)

for all n ≥ 0. Then, the iteration {sn} (n ≥ 0) given by

s0 = r0, s1 = r0 + η,
sn+2 = sn+1 + [{∫₀¹ [g1(sn + θ(sn+1 − sn)) − g2(sn) + β] dθ}(sn+1 − sn) + ∫ from sn to sn+1 of g3(θ) dθ]
/ [1 − α − g0(sn+1)] (82)

is monotonically increasing, bounded above by

s∗∗ = 2η/(2 − δ) + r0, (83)

and converges to some s∗ such that

0 ≤ s∗ ≤ s∗∗. (84)

Moreover the following error bounds hold for all n ≥ 0:

0 ≤ sn+2 − sn+1 ≤ (δ/2)(sn+1 − sn) ≤ (δ/2)^(n+1) η. (85)

Theorem 3. Assume that the hypotheses of Lemma 2 hold and that there exists y0 ∈ Ū(w, r0)
such that

∥A(y0)−1 [F(y0) + G(y0)]∥ ≤ η. (86)

Then, the sequence {wn} (n ≥ 0) generated by Newton-like method (66) is well defined,
remains in Ū(w, s∗) for all n ≥ 0, and converges to a solution x∗ of equation F(x) + G(x)
= 0. Moreover the following error bounds hold for all n ≥ 0:

∥wn+1 − wn∥ ≤ sn+1 − sn (87)

and

∥wn − x∗∥ ≤ s∗ − sn. (88)

Furthermore the solution x∗ is unique in Ū(w, s∗) if

∫₀¹ [g1(s∗ + θs∗) − g2(s∗)] dθ + g3(s∗) + g0(s∗) + α + β < 1, (89)

or in Ū(w, R0) if s∗ < R0 ≤ R and

∫₀¹ [g1(s∗ + θR0) − g2(s∗)] dθ + g3(s∗ + R0) + g0(s∗) + α + β < 1, (90)

provided that w0 = w.

We state the relevant results due to Chen and Yamamoto [11, p. 40]. We assume that
A(w)−1 exists, and that for any x, y ∈ Ū(w, r) ⊆ Ū(w, R),

0 < ∥A(w)−1 [F(w) + G(w)]∥ ≤ η̄, (91)
∥A(w)−1 [A(x) − A(w)]∥ ≤ ḡ0(∥x − w∥) + ᾱ, (92)
∥A(w)−1 [F′(x + t(y − x)) − A(x)]∥
≤ ḡ1(∥x − w∥ + t∥y − x∥) − ḡ0(∥x − w∥) + β̄, t ∈ [0, 1], (93)
∥A(w)−1 [G(x) − G(y)]∥ ≤ g3(r)∥x − y∥, (94)

where ḡ0, ḡ1, ᾱ, β̄ are as g0, g1, α, β, respectively, but ḡ0 is also differentiable with ḡ0′(r)
> 0, r ∈ [0, R], and ᾱ + β̄ < 1.

As in [11], set

ϕ(r) = η̄ − r + ∫₀ʳ ḡ1(t) dt, ψ(r) = ∫₀ʳ g3(t) dt, (95)
χ(r) = ϕ(r) + ψ(r) + (ᾱ + β̄)r. (96)

Denote the minimal value of χ(r) on [0, R] by χ∗, and the minimal point by r∗. If χ(R) ≤ 0,
denote the unique zero of χ on (0, r∗] by r0∗. Define the scalar sequence {rn} (n ≥ 0) by

r0 ∈ [0, R], rn+1 = rn + u(rn)/g(rn) (n ≥ 0), (97)

where

u(r) = χ(r) − χ∗ (98)

and

g(r) = 1 − ḡ0(r) − ᾱ. (99)

With the above notation they showed:

Theorem 4 [11, p. 40]. Suppose χ(R) ≤ 0. Then Eq. (1) has a solution x∗ ∈ Ū(w, r0∗),
which is unique in

Ũ = Ū(w, R) if χ(R) < 0 or ψ(R) = 0, and r0∗ < R;
Ũ = U(w, R) if χ(R) = 0 and r0∗ < R. (100)

Let

D∗ = ⋃ over r ∈ [0, r∗) of { y ∈ Ū(w, r) : ∥A(y)−1 [F(y) + G(y)]∥ ≤ u(r)/g(r) }. (101)

Then, for any y0 ∈ D∗, the sequence {yn} (n ≥ 0) generated by Newton-like method (66) is well
defined, remains in Ū(w, r∗) and satisfies

∥yn+1 − yn∥ ≤ rn+1 − rn (102)

and

∥yn − x∗∥ ≤ r∗ − rn, (103)

provided that r0 is chosen as in (97) so that r0 ∈ Ry0, where for y ∈ D∗,

Ry = { r ∈ [0, r∗) : ∥A(y)−1 [F(y) + G(y)]∥ ≤ u(r)/g(r), ∥y − w∥ ≤ r }. (104)

Remark 3. (a) The hypothesis on ḡ0 is stronger than the corresponding one on g0.
(b) Iteration (97) converges to r∗ (even if r0 = 0), not to r0∗.
(c) Choices of y−1, y0 other than the ones in Theorems 2, 3 can be given through (101)
and (104).

Remark 4. The conclusions of Theorem 4 hold (i.e., the results in [11] are improved) if
the more general conditions (76), (77) replace (92)–(94), and

ḡ0(r) ≤ g2(r), r ∈ [0, R], (105)

is satisfied. Moreover, if strict inequality holds in (105), we obtain more precise error
bounds. Indeed, define the sequence {r̄n} (n ≥ 0), using (77) and g2 instead of (93) and ḡ0,
respectively (with ḡ1 = g1, α = ᾱ, β = β̄), by

r̄0 = r0, r̄1 = r1,
r̄n+1 − r̄n = [u(r̄n) − u(r̄n−1) + (1 − g2(r̄n−1) − ᾱ)(r̄n − r̄n−1)] / g(r̄n) (n ≥ 1). (106)

It can easily be seen, using induction on n (see also the proof of Proposition 1 that follows),
that

r̄n+1 − r̄n < rn+1 − rn, (107)
r̄n < rn, (108)
r̄∗ − r̄n ≤ r∗ − rn, r̄∗ = limn→∞ r̄n, (109)

and

r̄∗ ≤ r∗. (110)

Furthermore, condition (77) allows us more flexibility in choosing functions and constants.
Furthermore condition (77) allows us more flexibility in choosing functions and constants.
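The comparison (107)–(110) between the two scalar sequences can be observed numerically. The sketch below uses illustrative data that are not from the paper: u(r) = 0.6(r − 5/6)², g(r) = 1 − 0.3r (i.e., ḡ0(r) = 0.3r, ᾱ = 0), and g2(r) = 0.5r, so that (105) holds strictly on (0, R].

```python
# Illustrative data (not from the paper): u(r) = 0.6 (r - 5/6)^2,
# g(r) = 1 - 0.3 r (g0bar(r) = 0.3 r, alphabar = 0), g2(r) = 0.5 r,
# so g0bar(r) < g2(r) on (0, R] and condition (105) holds strictly.
rstar = 5.0 / 6.0

def u(r):
    return 0.6 * (r - rstar) ** 2

def g(r):
    return 1.0 - 0.3 * r

def g2(r):
    return 0.5 * r

# original majorizing sequence (97)
r = [0.0]
for _ in range(2000):
    r.append(r[-1] + u(r[-1]) / g(r[-1]))

# refined sequence (106): rbar_0 = r_0, rbar_1 = r_1
rb = [r[0], r[1]]
for n in range(1, 2000):
    num = u(rb[n]) - u(rb[n - 1]) + (1.0 - g2(rb[n - 1])) * (rb[n] - rb[n - 1])
    rb.append(rb[n] + num / g(rb[n]))

assert all(b <= a + 1e-12 for a, b in zip(r, rb))  # (108): rbar_n <= r_n
assert rb[-1] < r[-1] <= rstar                     # finer limit, cf. (109)-(110)
```

In this run the refined increments decay geometrically while the original ones decay only sublinearly, so the refined sequence settles well below r∗, illustrating how (106) can yield strictly sharper bounds.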
Remark 5. Returning to Newton's method (8) (see also (28)), the iterations corresponding to (97) and (106) are (13) and (17), respectively. Moreover, condition (105) reduces to (19), and in the case γ0 < γ1, estimates (22)–(24) hold.

Remark 6. Our error bounds (87), (88) are finer than the corresponding ones (102) and (103), respectively, in many interesting cases.

Let us choose
α = ᾱ, β = β̄, g0(r) = ḡ0(r), g1(r) = g2(r) = ḡ1(r), and ḡ3(r) = g3(r) for all r ∈ [0, R].
Then we can show

Proposition 1. Under the hypotheses of Theorems 3 and 4, further assume
s1 < r1. (111)
Then, the following hold:
sn < rn (n ≥ 1), (112)
sn+1 − sn < rn+1 − rn (n ≥ 0), (113)
s∗ − sn ≤ r∗ − rn (n ≥ 0), (114)
and
s∗ ≤ r∗. (115)

Proof. It suffices to show (112) and (113), since (114) and (115), respectively, then easily follow. Inequality (112) holds for n = 1 by (111). By (82) and (97) we get in turn
s2 − s1 = [∫₀¹ {g1(s0 + θ(s1 − s0)) − g2(s0) + α} dθ (s1 − s0) + ∫_{s0}^{s1} g3(θ) dθ] / [1 − β − g0(s1)]
< [∫₀¹ {ḡ1(r0 + θ(r1 − r0)) − ḡ2(r0) + ᾱ} dθ (r1 − r0) + ∫_{r0}^{r1} ḡ3(θ) dθ] / [1 − β̄ − ḡ0(r1)]
= [u(r1) − u(r0) + g(r0)(r1 − r0)] / [1 − β̄ − ḡ0(r1)] = u(r1)/g(r1) = r2 − r1. (116)
Assume
sk+1 < rk+1 (117)
and
sk+1 − sk < rk+1 − rk (118)
hold for all k ≤ n.
Using (82), (88), and (118) we obtain
Using (82), (88), and (118) we obtain
sk+2 − sk+1 = [∫₀¹ {g1(sk + θ(sk+1 − sk)) − g2(sk) + α} dθ (sk+1 − sk) + ∫_{sk}^{sk+1} g3(θ) dθ] / [1 − β − g0(sk+1)]
< [∫₀¹ {ḡ1(rk + θ(rk+1 − rk)) − ḡ2(rk) + ᾱ} dθ (rk+1 − rk) + ∫_{rk}^{rk+1} ḡ3(θ) dθ] / [1 − β̄ − ḡ0(rk+1)]
= [u(rk+1) − u(rk) + g(rk)(rk+1 − rk)] / g(rk+1) = u(rk+1)/g(rk+1) = rk+2 − rk+1.
That completes the proof of Proposition 1. □

In order for us to include a case where operator G is nontrivial, we consider the follow-
ing example for Theorem 2 (or Theorem 3).

Example 5. Let X = Y = C[0, 1], the space of continuous functions on [0, 1] equipped with the sup-norm. Consider the integral equation on Ū(x0, R/2) given by
x(t) = ∫₀¹ k(t, s, x(s)) ds, (119)
where the kernel k(t, s, x(s)), (t, s) ∈ [0, 1] × [0, 1], is a nondifferentiable operator on Ū(x0, R/2). Define operators F, G on Ū(x0, R/2) by
F(x)(t) = I x(t) (I the identity operator), (120)
G(x)(t) = −∫₀¹ k(t, s, x(s)) ds. (121)
Choose x0 = 0, and assume there exist a constant θ0 ∈ [0, 1) and a real function θ1(t, s) such that
|k(t, s, x) − k(t, s, y)| ≤ θ1(t, s)‖x − y‖ (122)
and
sup_{t∈[0,1]} ∫₀¹ θ1(t, s) ds ≤ θ0 (123)
for all t, s ∈ [0, 1], x, y ∈ Ū(x0, R/2).


Moreover, choose in Theorem 2: r0 = 0, y0 = y−1, A(x, y) = A(x) = I, the identity operator on X, g0(r) = r, α = β = 0, g1(r) = g2(r) = 0, and g3(r) = θ0 for all x, y ∈ Ū(x0, R/2), r, s ∈ [0, 1] (with similar choices for Theorem 3). It can easily be seen that the conditions of Theorem 2 hold if
t∗ = η/(1 − θ0) ≤ R/2. (124)
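Example 5 can be made concrete with a hypothetical kernel of our own choosing (not from the paper): k(t, s, x) = (st/2)|x| + t/4, which is nondifferentiable in x and satisfies (122)–(123) with θ1(t, s) = st/2 and θ0 = 1/4. For this kernel the exact solution is x∗(t) = 0.3t, η = ‖x1 − x0‖ = 1/4, and the bound t∗ = η/(1 − θ0) = 1/3 from (124) indeed dominates ‖x∗ − x0‖ = 0.3. The sketch below discretizes (119) with the trapezoidal rule and runs successive approximations from x0 = 0.

```python
# Hypothetical kernel (our choice, not the paper's): k(t, s, x) = (s t / 2)|x| + t/4,
# nondifferentiable in x; theta1(t, s) = s t / 2, theta0 = sup_t int_0^1 theta1 ds = 1/4.
N = 101
h = 1.0 / (N - 1)
ts = [i * h for i in range(N)]

def trap(vals):
    # composite trapezoidal rule on the uniform grid over [0, 1]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def k(t, s, x):
    return (s * t / 2.0) * abs(x) + t / 4.0

x = [0.0] * N        # x0 = 0, as in the example
for _ in range(40):  # successive approximations; contraction factor <= theta0 = 1/4
    x = [trap([k(t, s, xs) for s, xs in zip(ts, x)]) for t in ts]

# exact solution of x(t) = int_0^1 k(t, s, x(s)) ds for this kernel: x*(t) = 0.3 t
err = max(abs(xt - 0.3 * t) for xt, t in zip(x, ts))
assert err < 1e-3
assert max(abs(v) for v in x) <= 1.0 / 3.0  # ||x* - x0|| is within t* = 1/3, cf. (124)
```

The remaining error is dominated by the O(h²) quadrature error of the trapezoidal rule, since the functional iteration itself contracts geometrically.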

3. Local convergence of method (2)

In order to cover the local case, let us assume x∗ is a zero of Eq. (1), A(x∗, x∗)⁻¹ exists, and for any x, y ∈ Ū(x∗, r) ⊆ Ū(x∗, R), t ∈ [0, 1],
‖A(x∗, x∗)⁻¹[A(x, y) − A(x∗, x∗)]‖ ≤ h̄0(‖x − x∗‖, ‖y − x∗‖) + ā (125)
and
‖A(x∗, x∗)⁻¹{[F′(x∗ + t(y − x∗)) − A(x, y)](y − x∗) + G(y) − G(x∗)}‖
≤ [h̄1(‖y − x∗‖(1 + t)) − h̄2(‖y − x∗‖) + h̄3(‖x − x∗‖) + b̄] ‖y − x∗‖, (126)
where h̄0, h̄1, h̄2, h̄3, ā, b̄ are as h0, h1, h2, h3, a, b, respectively. Then, exactly as in (56) but using (125), (126) instead of (25), (26), we can show the following local result for method (2).

Theorem 5. Assume that there exists a solution of the equation
f(λ) = 0 (127)
in [0, R], where
f(λ) = ∫₀¹ [h̄1((1 + t)λ) − h̄2(λ)] dt + h̄3(λ) + h̄0(λ, λ) + ā + b̄ − 1. (128)
Denote by λ0 the smallest of the solutions in [0, R]. Then, the sequence {xn} (n ≥ −1) generated by Newton-like method (2) is well defined, remains in Ū(x∗, λ0) for all n ≥ 0, and converges to x∗ provided that x−1, x0 ∈ Ū(x∗, λ0). Moreover, the following error bounds hold for all n ≥ 0:
‖x∗ − xn+1‖ ≤ pn, (129)
where
pn = [∫₀¹ [h̄1((1 + t)‖xn − x∗‖) − h̄2(‖xn − x∗‖)] dt + ā + h̄3(‖xn−1 − x∗‖)] / [1 − b̄ − h̄0(‖xn − x∗‖)] · ‖xn − x∗‖. (130)
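In practice the radius λ0 of Theorem 5 can be located by any bracketing root-finder applied to f in (128). The sketch below uses illustrative stand-in functions and constants that are not from the paper, chosen so that f is available in closed form for checking: h̄1(r) = h̄2(r) = 2r, h̄3(r) = 0.1r, h̄0(λ, λ) = λ, ā = b̄ = 0.05, which gives f(λ) = 2.1λ − 0.9 and hence λ0 = 3/7.

```python
# Illustrative data (not from the paper): h1bar(r) = h2bar(r) = 2 r, h3bar(r) = 0.1 r,
# h0bar(lam, lam) = lam, abar = bbar = 0.05; then (128) reduces to f(lam) = 2.1 lam - 0.9.
def f(lam):
    n = 200  # midpoint rule for the t-integral in (128); exact here (linear integrand)
    integral = sum(2.0 * (1.0 + (i + 0.5) / n) * lam - 2.0 * lam for i in range(n)) / n
    return integral + 0.1 * lam + lam + 0.05 + 0.05 - 1.0

lo, hi = 0.0, 1.0  # f(0) = -0.9 < 0 < f(1) = 1.2, so a zero lies in [0, R] = [0, 1]
for _ in range(60):  # plain bisection
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if f(mid) < 0.0 else (lo, mid)

lambda0 = 0.5 * (lo + hi)
assert abs(lambda0 - 3.0 / 7.0) < 1e-9  # smallest (here, only) zero of f on [0, 1]
```

Bisection is used rather than Newton's method on f so that the smallest zero in the bracket is found reliably even when f is only continuous.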

Application 3. Let us again consider Newton's method, i.e., F′(x) = A(x, y), G(x) = 0, and assume
‖F′(x∗)⁻¹[F′(x) − F′(x∗)]‖ ≤ λ1‖x − x∗‖ (131)
and
‖F′(x∗)⁻¹[F′(x) − F′(y)]‖ ≤ λ2‖x − y‖ (132)
for all x, y ∈ Ū(x∗, r) ⊆ Ū(x∗, R). Then we can set
ā = b̄ = 0, h̄3 = 0, h̄1(r) = h̄2(r) = λ2 r, and h̄0(r, r) = λ1 r for all r ∈ [0, R]. (133)

Using (131), (132) we get
λ0 = 2/(2λ1 + λ2). (134)
Local results were not given in [11,12,21,23]. However, Rheinboldt in [19] showed that under only (132) the convergence radius is given by
λ3 = 2/(3λ2). (135)
But in general
λ1 ≤ λ2. (136)
Hence we conclude
λ3 ≤ λ0. (137)
The corresponding error bounds become
‖xn+1 − x∗‖ ≤ en, (138)
‖xn+1 − x∗‖ ≤ en1, (139)
where
en = λ2‖xn − x∗‖² / (2[1 − λ1‖xn − x∗‖]) (140)
and
en1 = λ2‖xn − x∗‖² / (2[1 − λ2‖xn − x∗‖]). (141)
That is,
en ≤ en1 (n ≥ 0). (142)
If strict inequality holds in (136), then (137) and (142) hold as strict inequalities also (see also Example 4).

Remark 7. As noted in [1–9,25], the local results obtained here can be used for projection methods such as Arnoldi's, the generalized minimum residual method (GMRES), the generalized conjugate residual method (GCR), for combined Newton/finite projection methods, and in connection with the mesh independence principle to develop the cheapest and most efficient mesh refinement strategies.

Remark 8. The local results can also be used to solve equations of the form F(x) = 0, where F′ satisfies the autonomous differential equation [4,8,16]
F′(x) = P(F(x)), (143)
where P : Y → X is a known continuous operator. Since F′(x∗) = P(F(x∗)) = P(0), we can apply our results without actually knowing the solution x∗ of Eq. (1).

Example 6. Let X = Y = R, Ū(x∗, R) = Ū(0, 1), G = 0, A(x, y) = F′(x), and define the function F on Ū(0, 1) by
F(x) = eˣ − 1. (144)
Then we can set P(x) = x + 1 in (143). Using (132) we get λ2 = e. Moreover, by (144) we get
F′(x) − F′(x∗) = eˣ − 1 = x + x²/2! + ⋯ + xⁿ/n! + ⋯
= (1 + x/2! + ⋯ + xⁿ⁻¹/n! + ⋯)(x − x∗) (145)
and hence
‖F′(x∗)⁻¹[F′(x) − F′(x∗)]‖ ≤ (e − 1)‖x − x∗‖.
That is, λ1 = e − 1. By (134) and (135) we get
λ3 = 0.245252961 (146)
and
λ0 = 0.324947231. (147)
That is, our convergence radius λ0 is larger than the corresponding radius λ3 due to Rheinboldt, and our error bounds (140) are also finer than (141), so that (142) holds as a strict inequality. Finally, note that all these improvements are obtained using the same hypotheses/information as in the earlier results. This observation is important in computational mathematics, since a wider choice of initial guesses x0 becomes available (see also Remark 7).
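The numbers in Example 6 can be verified directly. The sketch below recomputes both radii from (134) and (135) with λ1 = e − 1 and λ2 = e, then runs Newton's method on F(x) = eˣ − 1 from a starting point inside both balls, checking at every step that the bound (140) dominates the actual error and is itself dominated by (141).

```python
import math

l1, l2 = math.e - 1.0, math.e          # lambda1, lambda2 from Example 6
l3 = 2.0 / (3.0 * l2)                  # Rheinboldt's radius (135)
l0 = 2.0 / (2.0 * l1 + l2)             # radius (134), = 2/(3e - 2)
assert abs(l3 - 0.245252961) < 1e-9    # matches (146)
assert abs(l0 - 0.324947231) < 1e-8    # the value given by (134)
assert l3 < l0                         # (137), strict since lambda1 < lambda2

# Newton's method for F(x) = exp(x) - 1, x* = 0, started inside both balls
x = 0.2
for _ in range(6):
    d = abs(x)                                   # ||x_n - x*||
    en = l2 * d * d / (2.0 * (1.0 - l1 * d))     # bound (140)
    en1 = l2 * d * d / (2.0 * (1.0 - l2 * d))    # bound (141)
    assert en <= en1                             # (142)
    x -= (math.exp(x) - 1.0) / math.exp(x)       # Newton step
    assert abs(x) <= en + 1e-15                  # (138) holds at every step
assert abs(x) < 1e-12                            # convergence to x* = 0
```

Note that the iteration converges quadratically until it reaches floating-point roundoff near x∗ = 0, which is why a small absolute slack is added when checking (138).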

The results obtained here can be extended to m-point methods (m > 2 an integer) [4–8,
17], and can be used in the solution of variational inequalities [23].

References

[1] E.L. Allgower, K. Böhmer, F.A. Potra, W.C. Rheinboldt, A mesh independence principle for operator equations and their discretizations, SIAM J. Numer. Anal. 23 (1986) 160–169.
[2] J. Appell, E. De Pascale, P.P. Zabrejko, On the application of the Newton–Kantorovich method to nonlinear integral equations of Uryson type, Numer. Funct. Anal. Optim. 12 (1991) 271–283.
[3] I.K. Argyros, Improving the rate of convergence of Newton methods on Banach spaces with a convergence structure and applications, Appl. Math. Lett. 10 (1997) 21–28.
[4] I.K. Argyros, Advances in the Efficiency of Computational Methods and Applications, World Scientific,
River Edge, NJ, 2000.
[5] I.K. Argyros, On the radius of convergence of Newton’s method, Internat. J. Comput. Math. 77 (2001)
389–400.
[6] I.K. Argyros, A Newton–Kantorovich theorem for equations involving m-Fréchet-differentiable operators
and applications in radiative transfer, J. Comput. Appl. Math. 131 (2001) 149–159.
[7] I.K. Argyros, F. Szidarovszky, Convergence of general iteration schemes, J. Math. Anal. Appl. 168 (1992)
42–52.
[8] I.K. Argyros, F. Szidarovszky, The Theory and Applications of Iteration Methods, CRC Press, Boca Raton,
FL, 1993.
[9] P.N. Brown, A local convergence theory for combined inexact-Newton/finite-difference projection methods,
SIAM J. Numer. Anal. 24 (1987) 407–434.
[10] E. Cătinaş, On some iterative methods for solving nonlinear equations, Rev. Anal. Numér. Théor. Approx. 23 (1994) 47–53.
[11] X. Chen, T. Yamamoto, Convergence domains of certain iterative methods for solving nonlinear equations,
Numer. Funct. Anal. Optim. 10 (1989) 37–48.
[12] J.E. Dennis, Toward a unified convergence theory for Newton-like methods, in: L.B. Rall (Ed.), Nonlinear
Functional Analysis and Applications, Academic Press, New York, 1971, pp. 425–472.
[13] P. Deuflhard, G. Heindl, Affine invariant convergence theorems for Newton’s method and extensions to
related methods, SIAM J. Numer. Anal. 16 (1979) 1–10.
[14] P. Deuflhard, F.A. Potra, Asymptotic mesh independence of Newton–Galerkin methods via a refined
Mysovskii theorem, SIAM J. Numer. Anal. 29 (1992) 1395–1412.
[15] J.M. Gutiérrez, M.A. Hernandez, M.A. Salanova, Accessibility of solutions by Newton’s method, Internat.
J. Comput. Math. 57 (1995) 239–247.
[16] L.V. Kantorovich, G.P. Akilov, Functional Analysis, Pergamon, Oxford, 1982.
[17] F.A. Potra, On an iterative algorithm of order 1.839 . . . for solving nonlinear equations, Numer. Funct. Anal.
Optim. 7 (1984–85) 75–106.
[18] F.A. Potra, V. Pták, Sharp error bounds for Newton's process, Numer. Math. 34 (1980) 67–72.
[19] W.C. Rheinboldt, An adaptive continuation process for solving systems of nonlinear equations, Banach
Center Publ. 3 (1977) 129–142.
[20] J.W. Schmidt, H. Schwetlick, Ableitungsfreie Verfahren mit höherer Konvergenzgeschwindigkeit, Computing 3 (1968) 215–226.
[21] J.W. Schmidt, H. Leonhardt, Eingrenzung von Lösungen mit der Regula Falsi, Computing 6 (1970) 318–329.
[22] J.W. Schmidt, Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Math. Hungar. 9 (1978) 241–247.
[23] L.U. Uko, J.O. Adeyeye, Generalized Newton-iterative methods for nonlinear operator equations, Nonlinear
Studies 8 (2001) 465–477.
[24] T. Yamamoto, A convergence theorem for Newton-like methods in Banach spaces, Numer. Math. 51 (1987)
545–557.
[25] T.J. Ypma, Local convergence of inexact Newton methods, SIAM J. Numer. Anal. 21 (1984) 583–590.
[26] P.P. Zabrejko, D.F. Nguen, The majorant method in the theory of Newton–Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim. 9 (1987) 671–684.