
J. Korean Math. Soc. 47 (2010), No. 4, pp. 767–788
DOI 10.4134/JKMS.2010.47.4.767

MODIFIED LIMITED MEMORY BFGS METHOD WITH NONMONOTONE LINE SEARCH FOR UNCONSTRAINED OPTIMIZATION

Gonglin Yuan, Zengxin Wei, and Yanlin Wu

Abstract. In this paper, we propose two limited memory BFGS algorithms with a nonmonotone line search technique for unconstrained optimization problems. The global convergence of the given methods will be established under suitable conditions. Numerical results show that the presented algorithms are more competitive than the normal BFGS method.

1. Introduction
Consider the following unconstrained optimization problem
(1.1)  min_{x ∈ R^n} f(x),

where f : R^n → R is continuously differentiable. The line search method is one of the most efficient numerical methods for solving (1.1); it is defined by
(1.2)  x_{k+1} = x_k + α_k d_k, k = 0, 1, 2, . . . ,
where the steplength α_k is determined by a line search, and d_k, whose choice distinguishes the different line search methods [35, 36, 37, 39, 40, 43, 44, 45, 46, 48, 50], is a search direction of f at x_k.
One of the most effective methods for the unconstrained optimization problem (1.1) is Newton's method. It normally requires the fewest function evaluations and is very good at handling ill-conditioning. However, its efficiency largely depends on the possibility of efficiently solving the linear system that arises when computing the search direction d_k at each iteration,
(1.3)  G(x_k) d_k = −g(x_k),

Received July 17, 2008.


2000 Mathematics Subject Classification. 65H10, 65K05, 90C26.
Key words and phrases. limited memory BFGS method, optimization, nonmonotone,
global convergence.
This work is supported by China NSF grant 10761001 and the Scientific Research Foundation of Guangxi University (Grant No. X081082).
©2010 The Korean Mathematical Society
where g(x_k) = ∇f(x_k) is the gradient of f(x) at x_k, and G(x_k) = ∇²f(x_k) is the Hessian matrix of f(x) at the current iterate. Moreover, computing the exact solution of the system (1.3) can be too burdensome, and it is not necessary when x_k is far from a solution [31]. Inexact Newton methods [8, 31] represent the basic approach underlying most Newton-type large-scale algorithms. At each iteration, the current estimate of the solution is updated by approximately solving the linear system (1.3) using an iterative algorithm. The inner iteration is typically "truncated" before the solution of the linear system is obtained.
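To make the truncated (inexact) Newton idea concrete, the following MATLAB sketch (our illustration, not code from the paper; the Hessian, gradient, tolerance, and iteration cap are assumed values) computes a search direction by solving (1.3) only approximately with the conjugate gradient method.

% Inexact Newton step: solve G(xk)*d = -g(xk) only approximately.
Gk = [4 1; 1 3];          % illustrative Hessian (symmetric positive definite)
gk = [1; 2];              % illustrative gradient
eta = 1e-2;               % loose relative residual: the "truncation"
maxit = 50;               % cap on inner CG iterations
[dk, flag] = pcg(Gk, -gk, eta, maxit);   % approximate Newton direction
if flag ~= 0
    dk = -gk;             % fall back to steepest descent if CG did not converge
end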
The limited memory BFGS (L-BFGS) method (see [3]) is an adaptation of the BFGS method for large-scale problems. The implementation is almost identical to that of the standard BFGS method; the only difference is that the inverse Hessian approximation is not formed explicitly, but defined by a small number of BFGS updates. It often provides a fast rate of linear convergence, and requires minimal storage.
Since the standard BFGS method is widely used to solve general minimization problems, most of the studies concerning limited memory methods concentrate on the L-BFGS method. We know that the BFGS update exploits only the gradient information, while the available function value information is neglected. Therefore, many efficient attempts have been made to modify the usual quasi-Newton methods using both gradient and function value information (e.g. [41, 51]). Lately, in order to get a higher order accuracy in approximating the second-order curvature of the objective function, Wei, Li, and Qi [41] and Zhang, Deng, and Chen [51] proposed modified BFGS-type methods for (1.1), and the reported numerical results show that their average performance is better than that of the standard BFGS method.
The monotone line search technique is often used to obtain the stepsize α_k; however, monotonicity may cause a series of very small steps if the contours of the objective function form a family of curves with large curvature [18]. More recently, a nonmonotone line search for solving unconstrained optimization was proposed by Grippo et al. in [18]. Han and Liu [21] presented a new nonmonotone BFGS method for (1.1) and established its global convergence for convex objective functions. Numerical results show that this method is more competitive than the normal BFGS method with a monotone line search. We [49] proved its superlinear convergence.
Motivated by the above observations, we propose two limited memory BFGS-type methods on the basis of Wei et al. [41], Zhang et al. [51], and [21], which are suitable for solving large-scale unconstrained optimization problems. The major contribution of this paper is the extension of the BFGS-type methods in [41] and [51], together with the nonmonotone line search technique, to the limited memory scheme. In the standard L-BFGS method the pairs {s_i, y_i}, i = k − m̃ + 1, . . . , k, are stored, where s_i = x_{i+1} − x_i, y_i = g_{i+1} − g_i, g_i = g(x_i) and g_{i+1} = g(x_{i+1}) are the gradients of f(x) at x_i and x_{i+1}, respectively, and m̃ > 0 is a constant. A distinguishing feature of our proposed L-BFGS methods is that, at each iteration, a triple
{s_i, y_i, A_i}, i = k − m̃ + 1, . . . , k,
is stored, where A_i is a scalar related to function values. Compared with the standard BFGS method, the proposed methods require no more function or derivative evaluations per iteration, and hardly more storage or arithmetic operations. Under suitable conditions, we establish the global convergence of the methods. Numerical experiments on a set of large problems indicate that the proposed methods are promising.
This paper is organized as follows. In the next section, modified BFGS
update and nonmonotone line search are stated. The proposed L-BFGS algo-
rithms are given in Section 3. Under some reasonable conditions, the global
convergence of the given methods is established in Section 4. Numerical results
and a conclusion are presented in Section 5 and in Section 6, respectively.

2. Modified BFGS update and nonmonotone line search


Quasi-Newton methods are iterative methods of the form
xk+1 = xk + αk dk ,
where xk is the kth iteration point, αk is a stepsize, and dk is a search direction.
Now we first state the search direction as follows.
2.1. Some modified BFGS update formulas
The search direction of the quasi-Newton method is defined by
(2.1) Bk dk + gk = 0,
where gk = g(xk ) = ∇f (xk ) is the gradient of f (x) at xk , Bk is an approx-
imation of ∇2 f (xk ). By tradition, {Bk } satisfies the following quasi-Newton
equation
(2.2) Bk+1 sk = yk ,
where s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Throughout the paper, we use the following notation: ‖ · ‖ is the Euclidean norm, g(x_k) and g(x_{k+1}) are written as g_k and g_{k+1}, and f(x_k) and f(x_{k+1}) as f_k and f_{k+1}, respectively.
The famous update of B_k is the standard BFGS formula
(2.3)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k).
Let H_k be the inverse of B_k. Then the inverse update formula corresponding to (2.3) is represented as
(2.4)  H_{k+1} = H_k − [y_k^T(s_k − H_k y_k) s_k s_k^T]/(y_k^T s_k)² + [(s_k − H_k y_k)s_k^T + s_k(s_k − H_k y_k)^T]/(y_k^T s_k)
              = (I − (s_k y_k^T)/(y_k^T s_k)) H_k (I − (y_k s_k^T)/(y_k^T s_k)) + (s_k s_k^T)/(y_k^T s_k),
770 GONGLIN YUAN, ZENGXIN WEI, AND YANLIN WU

which is the dual of the DFP update formula in the sense that H_k ↔ B_k, H_{k+1} ↔ B_{k+1}, and s_k ↔ y_k. It has been shown that the BFGS method is very efficient for solving unconstrained optimization problems (1.1) [11, 14, 47]. For convex minimization problems, the BFGS method is globally convergent if the exact line search or some special inexact line search is used [1, 2, 4, 12, 16, 32, 33, 38], and its local convergence has been well established [9, 10, 17]. For a general function f, Dai [5] constructed an example showing that the standard BFGS method may fail for non-convex functions with an inexact line search, and Mascarenhas [29] showed the nonconvergence of the standard BFGS method even with exact line searches.
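As an illustration only (not code from the paper), the compact form of the inverse update (2.4) can be realized in MATLAB as follows; skipping the update when y_k^T s_k is not sufficiently positive is a common safeguard that we assume here, not part of the formula itself.

function H = bfgs_inverse_update(H, s, y)
% One standard BFGS update of the inverse Hessian approximation, eq. (2.4):
%   H+ = (I - rho*s*y')*H*(I - rho*y*s') + rho*s*s',  rho = 1/(y'*s).
if y' * s <= 1e-12 * norm(s) * norm(y)   % safeguard (our assumption)
    return;                              % keep H unchanged
end
rho = 1 / (y' * s);
n = size(H, 1);
V = eye(n) - rho * (y * s');
H = V' * H * V + rho * (s * s');
end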
In order to obtain global convergence of the BFGS method without a convexity assumption on the objective function, Li and Fukushima [22, 23] made a slight modification to the standard BFGS method. Now we state their works as follows:
(i) A new quasi-Newton equation [22] of the following form
B_{k+1} s_k = y_k^{1*},
where y_k^{1*} = y_k + (max{0, −(y_k^T s_k)/‖s_k‖²} + φ(‖g_k‖)) s_k, and the function φ : R → R satisfies: (a) φ(t) > 0 for all t > 0; (b) φ(t) = 0 if and only if t = 0; (c) φ(t) is bounded if t is in a bounded set.
(ii) A modified BFGS update formula [23] of the following form
(2.5)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{1*} y_k^{1*T})/(y_k^{1*T} s_k), if (s_k^T y_k^{1*})/‖s_k‖² ≥ φ(‖g_k‖); B_{k+1} = B_k, otherwise.
Then it is not difficult to see that s_k^T y_k^{1*} > 0 always holds, which ensures that the update matrix B_{k+1} inherits the positive definiteness of B_k (see [14]). The global convergence and the superlinear convergence of these two methods for nonconvex functions have been established under appropriate conditions (see [22, 23] for details).
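A minimal sketch (our illustration, with the particular choice φ(t) = C·t made only for concreteness; [22] merely requires the properties (a)-(c) above) of the modified difference vector y_k^{1*}:

function y1 = modified_y_li_fukushima(s, y, g, C)
% y^{1*} = y + ( max{0, -y'*s/||s||^2} + phi(||g||) ) * s, with phi(t) = C*t here.
% The choice phi(t) = C*t is an assumption for illustration only.
phi = C * norm(g);
y1 = y + (max(0, -(y' * s) / norm(s)^2) + phi) * s;
end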
In order to get a better approximation of the Hessian matrix of the objective function, Wei, Li, and Qi [41] and Zhang, Deng, and Chen [51] proposed modified quasi-Newton equations, which are given as follows.
(i) The equation of Wei, Li, and Qi [41]:
(2.6)  B_{k+1} s_k = y_k^{2*} = y_k + A_k s_k,
where
(2.7)  A_k = (2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖².
They replaced all the y_k in (2.3) and obtained the following modified BFGS-type update formula
(2.8)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{2*} y_k^{2*T})/(y_k^{2*T} s_k).
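For illustration, a sketch (under the definitions (2.6)-(2.7); the function name is ours, not the authors' code) of how A_k and y_k^{2*} can be computed in MATLAB:

function [y2, A] = modified_y_wlq(fk, fk1, gk, gk1, s, y)
% A_k = ( 2*(f(xk) - f(xk+1)) + (g(xk+1) + g(xk))'*s ) / ||s||^2,  eq. (2.7)
% y^{2*} = y + A_k * s,                                            eq. (2.6)
A = (2 * (fk - fk1) + (gk1 + gk)' * s) / norm(s)^2;
y2 = y + A * s;
end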
Note that the quasi-Newton equation (2.6) contains both gradient and function value information at the current and the previous step, so one may expect that the resulting methods will outperform the original method. In fact, practical computation shows that this method is better than the normal BFGS method (see [41, 42] for details) for some given problems [30]. Furthermore, some theoretical advantages of the new quasi-Newton equation (2.6) can be seen from the following two theorems.
Theorem 2.1 ([42, Lemma 3.1]). Consider the quasi-Newton equation (2.6). Then we have, for all k ≥ 1,
f(x_k) = f(x_{k+1}) + g(x_{k+1})^T (x_k − x_{k+1}) + (1/2)(x_k − x_{k+1})^T B_{k+1} (x_k − x_{k+1}).
Theorem 2.2 ([24, Theorem 3.1]). Assume that the function f(x) is sufficiently smooth and ‖s_k‖ is sufficiently small. Then we have
(2.9)  s_k^T G_{k+1} s_k − s_k^T y_k^{2*} − (1/3) s_k^T (T_{k+1} s_k) s_k = O(‖s_k‖⁴)
and
(2.10)  s_k^T G_{k+1} s_k − s_k^T y_k − (1/2) s_k^T (T_{k+1} s_k) s_k = O(‖s_k‖⁴),
where G_{k+1} denotes the Hessian matrix of f at x_{k+1}, T_{k+1} is the tensor of f at x_{k+1}, and
s_k^T (T_{k+1} s_k) s_k = Σ_{i,j,l=1}^{n} [∂³f(x_{k+1})/(∂x^i ∂x^j ∂x^l)] s_k^i s_k^j s_k^l.

(ii) The equation of Zhang, Deng, and Chen [51]:
(2.11)  B_{k+1} s_k = y_k^{3*} = y_k + Ā_k s_k,
where
(2.12)  Ā_k = (6[f(x_k) − f(x_{k+1})] + 3[g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖².
They replaced all the y_k in (2.3) and obtained the following modified BFGS-type update formula
(2.13)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{3*} y_k^{3*T})/(y_k^{3*T} s_k).
Similarly to equation (2.6), the quasi-Newton equation (2.11) contains both gradient and function value information at the current and the previous step, so one may expect that the resulting methods will outperform the original method. In fact, practical computation shows that this method is better than the normal BFGS method (see [51] for details). Furthermore, some theoretical advantages of the new quasi-Newton equation (2.11) can be seen from the following theorem.
Theorem 2.3 ([51, Theorem 3.3]). Assume that the function f(x) is sufficiently smooth and ‖s_k‖ is sufficiently small. Then we have
(2.14)  s_k^T (G_{k+1} s_k − y_k^{3*}) = O(‖s_k‖⁴)
and
(2.15)  s_k^T (G_{k+1} s_k − y_k) = O(‖s_k‖³).
It is not difficult to deduce that s_k^T y_k^{2*} > 0 holds for a uniformly convex function f (or see [42]). We all know that the condition s_k^T y_k^{2*} > 0 ensures that the update matrix B_{k+1} from (2.8) inherits the positive definiteness of B_k. Similarly, in order to get the positive definiteness of B_k in (2.13) for each k, we give a modified BFGS update of (2.13), i.e., the modified update formula is defined by
(2.16)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{4*} y_k^{4*T})/(y_k^{4*T} s_k),
where y_k^{4*} = y_k + A_k^* s_k and A_k^* = (1/3) max{Ā_k, 0}. Then the corresponding quasi-Newton equation is
(2.17)  B_{k+1} s_k = y_k^{4*}.
From the definition of y_k^{4*}, we can obtain s_k^T y_k^{4*} > 0 if the objective function f is uniformly convex (see Lemma 4.1).
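Similarly, a sketch (again our illustration, using the definitions (2.12) and (2.16)-(2.17)) of Ā_k, A_k^* = (1/3)max{Ā_k, 0}, and y_k^{4*}:

function [y4, Astar] = modified_y4(fk, fk1, gk, gk1, s, y)
% Abar_k = ( 6*(f(xk) - f(xk+1)) + 3*(g(xk+1) + g(xk))'*s ) / ||s||^2,  eq. (2.12)
% A*_k   = max{Abar_k, 0} / 3,  y^{4*} = y + A*_k * s,                  eqs. (2.16)-(2.17)
Abar = (6 * (fk - fk1) + 3 * (gk1 + gk)' * s) / norm(s)^2;
Astar = max(Abar, 0) / 3;
y4 = y + Astar * s;
end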
Other modified formulas can be found in [7, 34]; we do not present them here.
2.2. One nonmonotone line search
Normally the steplength α_k is generated by the following weak Wolfe-Powell (WWP) rule: find a steplength α_k such that
(2.18)  f(x_k + α_k d_k) ≤ f(x_k) + σ_1 α_k g_k^T d_k,
(2.19)  g(x_k + α_k d_k)^T d_k ≥ σ_2 g_k^T d_k,
where 0 < σ_1 < σ_2 < 1. Many authors have analyzed the BFGS algorithm with generalized line search procedures [25, 26]. Recently, a nonmonotone line search technique for unconstrained optimization was proposed by Grippo et al. [18, 19, 20] and further studied in [27, 28], etc. Grippo, Lampariello, and Lucidi [18] proposed the following nonmonotone line search, which we call the GLL line search.
GLL line search: Select a steplength α_k satisfying
(2.20)  f(x_{k+1}) ≤ max_{0 ≤ j ≤ n(k)} f(x_{k−j}) + ε_1 α_k g_k^T d_k,
(2.21)  g(x_{k+1})^T d_k ≥ max{ε_2, 1 − (α_k ‖d_k‖)^p} g_k^T d_k,
where p ∈ (−∞, 1), k = 0, 1, 2, . . . , ε_1, ε_2 ∈ (0, 1), n(k) = min{H, k}, and H ≥ 0
is an integer constant. Combining this line search and the normal BFGS formula (2.3), Han and Liu [21] established the global convergence for convex objective functions. Numerical results show that this method is more competitive than the normal BFGS method with the WWP line search. Recently, the superlinear convergence of the new nonmonotone BFGS algorithm for convex functions was proved by Yuan and Wei [49].
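To illustrate the nonmonotone idea, the following sketch (our own simplification, not the authors' implementation) backtracks until the nonmonotone Armijo-type condition (2.20) holds; enforcing the curvature condition (2.21) as well would require a bracketing strategy, which we omit here. With eps1 = 0.1 it matches the ε_1 used in Section 5.

function [alpha, xnew, fnew] = gll_backtracking(fun, xk, fhist, gk, dk, eps1)
% Nonmonotone (GLL-type) backtracking on condition (2.20):
%   f(xk + alpha*dk) <= max(fhist) + eps1*alpha*gk'*dk,
% where fhist holds f(x_{k-j}), j = 0,...,n(k).  Illustrative sketch only.
alpha = 1;
fmax = max(fhist);
slope = gk' * dk;            % should be negative for a descent direction
for i = 1:25                 % cap on trial steps (our assumption)
    xnew = xk + alpha * dk;
    fnew = fun(xnew);
    if fnew <= fmax + eps1 * alpha * slope
        return;
    end
    alpha = alpha / 2;       % halve the stepsize and try again
end
end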

3. Limited memory BFGS-type method


In this section, we propose new algorithms to solve (1.1). To improve the performance of the line search, it is a better choice to use the GLL line search instead of the WWP line search. Then our methods generate a sequence of points {x_k} by
x_{k+1} = x_k + α_k d_k, k = 0, 1, 2, . . . ,
where α_k is determined by (2.20) and (2.21), and d_k is a descent direction of f at x_k. In the following, we describe the direction d_k in detail.
The limited memory BFGS (L-BFGS) method (see [3]) is an adaptation of the BFGS method for large-scale problems. In the L-BFGS method, the matrix H_k is obtained by updating the basic matrix H_0 m̃ (> 0) times using the BFGS formula with the previous m̃ iterations. The standard BFGS correction (2.4) has the following form
(3.22)  H_{k+1} = V_k^T H_k V_k + ρ_k s_k s_k^T,
where ρ_k = 1/(s_k^T y_k), V_k = I − ρ_k y_k s_k^T, and I is the unit matrix. Thus, H_{k+1} in the L-BFGS method has the following form:
H_{k+1} = V_k^T H_k V_k + ρ_k s_k s_k^T
        = V_k^T [V_{k−1}^T H_{k−1} V_{k−1} + ρ_{k−1} s_{k−1} s_{k−1}^T] V_k + ρ_k s_k s_k^T
        = · · ·
        = [V_k^T · · · V_{k−m̃+1}^T] H_{k−m̃+1} [V_{k−m̃+1} · · · V_k]
          + ρ_{k−m̃+1} [V_k^T · · · V_{k−m̃+2}^T] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2} · · · V_k]
          + · · ·
(3.23)    + ρ_k s_k s_k^T.
To improve the performance of the standard limited memory BFGS algorithm, it is a better choice to use the modified BFGS-type updates instead of the standard BFGS one. If we replace all the y_k in (3.23) with y_k^{4*} and with y_k^{2*}, respectively, the new limited memory BFGS-type updates can be obtained as
H_{k+1} = V_k^{*T} H_k V_k^* + ρ_k^* s_k s_k^T
        = V_k^{*T} [V_{k−1}^{*T} H_{k−1} V_{k−1}^* + ρ_{k−1}^* s_{k−1} s_{k−1}^T] V_k^* + ρ_k^* s_k s_k^T
        = · · ·
        = [V_k^{*T} · · · V_{k−m̃+1}^{*T}] H_{k−m̃+1} [V_{k−m̃+1}^* · · · V_k^*]
          + ρ_{k−m̃+1}^* [V_k^{*T} · · · V_{k−m̃+2}^{*T}] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2}^* · · · V_k^*]
          + · · ·
(3.24)    + ρ_k^* s_k s_k^T
and
H_{k+1} = V_k^{**T} H_k V_k^{**} + ρ̄_k^* s_k s_k^T
        = V_k^{**T} [V_{k−1}^{**T} H_{k−1} V_{k−1}^{**} + ρ̄_{k−1}^* s_{k−1} s_{k−1}^T] V_k^{**} + ρ̄_k^* s_k s_k^T
        = · · ·
        = [V_k^{**T} · · · V_{k−m̃+1}^{**T}] H_{k−m̃+1} [V_{k−m̃+1}^{**} · · · V_k^{**}]
          + ρ̄_{k−m̃+1}^* [V_k^{**T} · · · V_{k−m̃+2}^{**T}] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2}^{**} · · · V_k^{**}]
          + · · ·
(3.25)    + ρ̄_k^* s_k s_k^T,
where ρ_k^* = 1/(s_k^T y_k^{4*}), V_k^* = I − ρ_k^* y_k^{4*} s_k^T, ρ̄_k^* = 1/(s_k^T y_k^{2*}), and V_k^{**} = I − ρ̄_k^* y_k^{2*} s_k^T.
Now we state the new limited memory BFGS-type algorithms (L-BFGS-A) with the GLL line search as follows.
Algorithm 1. (L-BFGS-A1)
Step 0: Choose an initial point x_0 ∈ R^n, a basic symmetric positive definite matrix H_0 ∈ R^{n×n}, constants r, ε_1, ε_2 ∈ (0, 1), p ∈ (−∞, 1), H ≥ 0, and a positive integer m_1. Let k := 0;
Step 1: Stop if ‖g_k‖ = 0.
Step 2: Determine d_k by
(3.26)  d_k = −H_k g_k.
Step 3: Find α_k satisfying (2.20) and (2.21).
Step 4: Let the next iterate be x_{k+1} = x_k + α_k d_k.
Step 5: Let m̃ = min{k + 1, m_1}. Update H_0 m̃ times to get H_{k+1} by (3.24).
Step 6: Let k := k + 1. Go to Step 1.
Algorithm 11. (L-BFGS-A11) The same as Algorithm 1, except that Step 5 is replaced by:
Step 5: Let m̃ = min{k + 1, m_1}. Update H_0 m̃ times to get H_{k+1} by (3.25).
In the following, we assume that the algorithm updates B_k, the inverse of H_k. We also assume that the basic matrix B_0, and its inverse H_0, are bounded and positive definite. Algorithm 1 with B_k can be stated as follows.
Algorithm 2. (L-BFGS-A2) The same as Algorithm 1, except that Steps 2 and 5 are replaced by:
Step 2: Determine d_k by
(3.27)  B_k d_k = −g_k.
Step 5: Let m̃ = min{k + 1, m_1}. Put s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Update B_0 m̃ times, i.e., for l = k − m̃ + 1, . . . , k compute
(3.28)  B_k^{l+1} = B_k^l − (B_k^l s_l s_l^T B_k^l)/(s_l^T B_k^l s_l) + (y_l^{4*} y_l^{4*T})/(y_l^{4*T} s_l),
where s_l = x_{l+1} − x_l, y_l^{4*} = y_l + A_l^* s_l, and B_k^{k−m̃+1} = B_0 for all k.
Algorithm 22. (L-BFGS-A22) The same as Algorithm 1, except that Steps 2 and 5 are replaced by:
Step 2: Determine d_k by
(3.29)  B_k d_k = −g_k.
Step 5: Let m̃ = min{k + 1, m_1}. Put s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Update B_0 m̃ times, i.e., for l = k − m̃ + 1, . . . , k compute
(3.30)  B_k^{l+1} = B_k^l − (B_k^l s_l s_l^T B_k^l)/(s_l^T B_k^l s_l) + (y_l^{2*} y_l^{2*T})/(y_l^{2*T} s_l),
where s_l = x_{l+1} − x_l, y_l^{2*} = y_l + A_l s_l, and B_k^{k−m̃+1} = B_0 for all k.
Note that Algorithms 1 and 2 are mathematically equivalent, and Algorithms
11 and 22 are mathematically equivalent too. In our numerical experiments we
implement Algorithms 1 and 11, and Algorithms 2 and 22 are given only for the
purpose of analysis. Throughout this paper, we only discuss Algorithms 2 and
22. In the following section, we will concentrate on their global convergence.
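Algorithms 1 and 11 never form H_k explicitly. One standard way to apply a limited memory update of the form (3.24) or (3.25) is the two-loop recursion of [3]; the MATLAB sketch below (our illustration, not the authors' code) evaluates d_k = −H_k g_k from the m̃ most recent pairs stored as columns of S and Y, where for Algorithm 2 or 22 the columns of Y would hold the modified vectors y^{4*} or y^{2*}. A common choice for the scaling, assumed here, is gamma = s_k^T y_k/(y_k^T y_k).

function d = lbfgs_direction(g, S, Y, gamma)
% Two-loop recursion: returns d = -H*g, where H is the limited memory matrix
% built from the m most recent pairs (S(:,i), Y(:,i)) and H0 = gamma*I.
m = size(S, 2);
rho = zeros(m, 1);
alpha = zeros(m, 1);
q = g;
for i = m:-1:1
    rho(i) = 1 / (Y(:, i)' * S(:, i));
    alpha(i) = rho(i) * (S(:, i)' * q);
    q = q - alpha(i) * Y(:, i);
end
r = gamma * q;                       % apply H0 = gamma*I
for i = 1:m
    beta = rho(i) * (Y(:, i)' * r);
    r = r + (alpha(i) - beta) * S(:, i);
end
d = -r;                              % search direction d_k = -H_k*g_k
end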

4. Convergence analysis
This section is devoted to showing that Algorithm 2 is convergent for twice continuously differentiable and uniformly convex functions. In order to establish global convergence for Algorithm 2, we need the following assumptions.
Assumption A. (i) The level set Ω = {x | f(x) ≤ f(x_0)} is bounded.
(ii) The function f is twice continuously differentiable on Ω.
(iii) The function f is uniformly convex, i.e., there exist positive constants m and M such that
(4.1)  m‖d‖² ≤ d^T G(x) d ≤ M‖d‖²
holds for all x ∈ Ω and d ∈ R^n, where G(x) = ∇²f(x). These assumptions are the same as those in [42, 51].
It is obvious that Assumption A implies that there exists a constant M_* > 0 such that
‖G(x)‖ ≤ M_*, x ∈ Ω.
Assumption A (ii) implies that there exists a constant L ≥ 0 satisfying
(4.2)  ‖g(x) − g(y)‖ ≤ L‖x − y‖, x, y ∈ Ω.
Lemma 4.1. Let Assumption A hold. Then there exists a positive number M_1 such that
‖y_k^{4*}‖²/(s_k^T y_k^{4*}) ≤ M_1, k = 0, 1, 2, . . . .
Proof. Following the definition of y_k^{4*} and Taylor's formula, we get
s_k^T y_k^{4*} = s_k^T y_k + s_k^T A_k^* s_k
             = max{2[f_k − f_{k+1}] + 2g_{k+1}^T s_k, s_k^T y_k}
             = max{2[−g_{k+1}^T s_k + (1/2) s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k] + 2g_{k+1}^T s_k, s_k^T G(x_k + θ_1(x_{k+1} − x_k)) s_k}
             = max{s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k, s_k^T G(x_k + θ_1(x_{k+1} − x_k)) s_k},
where θ, θ_1 ∈ (0, 1). Combining this with Assumption A(iii), it is easy to obtain
(4.3)  m‖s_k‖² ≤ s_k^T y_k^{4*} ≤ M‖s_k‖².
By the definition of y_k^{4*} and Taylor's formula again, we have
‖y_k^{4*}‖ = ‖y_k + max{(2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖², 0} s_k‖
          ≤ max{‖y_k‖ + |2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k|/‖s_k‖, ‖y_k‖}
          ≤ 2‖y_k‖ + |s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k|/‖s_k‖
          ≤ 2L‖s_k‖ + M‖s_k‖
(4.4)     = (2L + M)‖s_k‖,
where θ ∈ (0, 1), and the third inequality follows from (4.1) and (4.2). By (4.3) and (4.4), we get
‖y_k^{4*}‖²/(s_k^T y_k^{4*}) ≤ (2L + M)²‖s_k‖²/(m‖s_k‖²) = (2L + M)²/m =: M_1.
The proof is complete. □

Lemma 4.2. Let B_k be generated by (3.28). Then we have
(4.5)  det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l),
where det(B_k) denotes the determinant of B_k.


Proof. To begin with, we take the determinant on both sides of (2.8):
det(B_{k+1}) = det(B_k (I − (s_k s_k^T B_k)/(s_k^T B_k s_k) + (B_k^{−1} y_k^{4*} y_k^{4*T})/(s_k^T y_k^{4*})))
            = det(B_k) det(I − (s_k s_k^T B_k)/(s_k^T B_k s_k) + (B_k^{−1} y_k^{4*} y_k^{4*T})/(s_k^T y_k^{4*}))
            = det(B_k) [(1 − s_k^T (B_k s_k)/(s_k^T B_k s_k))(1 + (B_k^{−1} y_k^{4*})^T y_k^{4*}/(y_k^{4*T} s_k)) − (−s_k^T y_k^{4*}/(y_k^{4*T} s_k)) ((B_k s_k)^T (B_k^{−1} y_k^{4*})/(s_k^T B_k s_k))]
            = det(B_k) (y_k^{4*T} s_k)/(s_k^T B_k s_k),
where the third equality follows from the formula (see, e.g., [9, Lemma 7.6])
det(I + u_1 u_2^T + u_3 u_4^T) = (1 + u_1^T u_2)(1 + u_3^T u_4) − (u_1^T u_4)(u_2^T u_3).
Therefore, there is also a simple expression for the determinant of (3.28):
det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l).
This completes the proof. □

Define the length of the orthogonal projection of −g_k on d_k by
(4.6)  η_k = −g_k^T d_k/‖d_k‖.
The following Lemmas 4.3-4.6 have been proved in [21]; here we only state them and omit the proofs.
Lemma 4.3. Let Assumption A be satisfied and consider the GLL line search. Then there exists a positive constant b_0 such that
‖s_k‖ ≥ b_0 min{η_k, (η_k)^{1/(1−p)}},
where η_k is defined by (4.6).
Lemma 4.4. Denote
f(x_{l(k)}) = max_{0 ≤ j ≤ n(k)} f(x_{k−j}), k − n(k) ≤ l(k) ≤ k.
If f_{k+1} ≤ f(x_{l(k)}), k = 0, 1, 2, . . . , then the sequence {f(x_{l(k)})} monotonically decreases, and x_k ∈ Ω for all k ≥ 0.
Lemma 4.5. If
(4.7)  f_{k+1} ≤ f(x_{l(k)}) − t_k, k = 0, 1, 2, . . . ,
where t_k ≥ 0, then
(4.8)  Σ_{k=0}^{∞} min_{0 ≤ j ≤ n(k)} t_{k+n(k)−j} < +∞.

Lemma 4.6. If a sequence of nonnegative numbers m_k (k = 0, 1, . . .) satisfies
(4.9)  ∏_{j=0}^{k} m_j ≥ c_1^k, c_1 > 0, k = 1, 2, . . . ,
then lim sup_k m_k > 0.


Lemma 4.7. Let {x_k} be generated by Algorithm 2 and let Assumption A hold. If
lim inf_{k→∞} ‖g_k‖ > 0,
then there exists a constant ε_0 > 0 such that
∏_{j=0}^{k} η_j ≥ (ε_0)^{k+1} for all k ≥ 0.

Proof. Assume that lim inf_k ‖g_k‖ > 0, i.e., there exists a constant c_2 > 0 such that
(4.10)  ‖g_k‖ ≥ c_2, k = 0, 1, 2, . . . .
From Assumption A(iii) and Taylor's formula, we have
(4.11)  m‖s_k‖² ≤ s_k^T G(x_k + θ_1 s_k) s_k = s_k^T y_k ≤ M‖s_k‖²;
combining this with (4.3), we get
(4.12)  (m/M) s_k^T y_k ≤ s_k^T y_k^{4*} ≤ (M/m) s_k^T y_k.
Taking the trace on both sides of (3.28), we get
(4.13)  Tr(B_{k+1}) = Tr(B_k^{k−m̃+1}) − Σ_{l=k−m̃+1}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=k−m̃+1}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*}),
where Tr(B_k) denotes the trace of B_k. Repeating this trace operation, we have
Tr(B_{k+1}) = Tr(B_k^{k−m̃+1}) − Σ_{l=k−m̃+1}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=k−m̃+1}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*})
            = · · ·
(4.14)      = Tr(B_0) − Σ_{l=0}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=0}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*}).

Combining (4.10), (4.14), (3.26), (3.29), and Lemma 4.1, we obtain
(4.15)  Tr(B_{k+1}) ≤ Tr(B_0) − Σ_{j=0}^{k} c_2²/(g_j^T H_j g_j) + (k + 1)M_1.
Since B_{k+1} is positive definite, we have Tr(B_{k+1}) > 0. By (4.15), we obtain
(4.16)  Σ_{j=0}^{k} c_2²/(g_j^T H_j g_j) ≤ Tr(B_0) + (k + 1)M_1
and
(4.17)  Tr(B_{k+1}) ≤ Tr(B_0) + (k + 1)M_1.
By the geometric-arithmetic mean value inequality we get
(4.18)  ∏_{j=0}^{k} g_j^T H_j g_j ≥ [(k + 1)c_2²/(Tr(B_0) + (k + 1)M_1)]^{k+1}.
Using Lemma 4.2, (4.12), and (2.21), we have
det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l)
            ≥ det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (m/M) (s_l^T y_l)/(s_l^T B_l s_l)
            ≥ det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (m/M) min{1 − ε_2, ‖s_l‖^p}/α_l
            ≥ · · ·
            ≥ det(B_0) (m/M)^{k+1} ∏_{j=0}^{k} min{1 − ε_2, ‖s_j‖^p}/α_j,
which implies
(4.19)  [det(B_0)/det(B_{k+1})] (m/M)^{k+1} ≤ ∏_{j=0}^{k} max{α_j/(1 − ε_2), α_j/‖s_j‖^p}.
By using the geometric-arithmetic mean value inequality again, we get
(4.20)  det(B_{k+1}) ≤ [Tr(B_{k+1})/n]^n.
Using (4.17), (4.19), and (4.20), we obtain
∏_{j=0}^{k} max{α_j/(1 − ε_2), α_j/‖s_j‖^p} ≥ (m/M)^{k+1} det(B_0) n^n/[Tr(B_0) + (k + 1)M_1]^n
    ≥ (m/M)^{k+1} [1/(k + 1)]^n det(B_0) n^n/[Tr(B_0) + M_1]^n
    ≥ [m/(M exp(n))]^{k+1} min{det(B_0) n^n/[Tr(B_0) + M_1]^n, 1}
(4.21)  ≥ c_3^{k+1},
where c_3 ≤ [m/(M exp(n))] min{det(B_0) n^n/[Tr(B_0) + M_1]^n, 1}. Let
cos θ_j = −g_j^T d_j/(‖g_j‖ ‖d_j‖).
Multiplying (4.18) by (4.21), we get for all k ≥ 0
∏_{j=0}^{k} max{‖s_j‖ ‖g_j‖ cos θ_j/(1 − ε_2), ‖g_j‖ cos θ_j/‖s_j‖^{p−1}} ≥ c_3^{k+1} [(k + 1)c_2²/(Tr(B_0) + (k + 1)M_1)]^{k+1}
(4.22)      ≥ [c_3 c_2²/(Tr(B_0) + M_1)]^{k+1}.
By
∏_{j=0}^{k} max{‖s_j‖ ‖g_j‖ cos θ_j/(1 − ε_2), ‖g_j‖ cos θ_j/‖s_j‖^{p−1}} ≤ [1/(1 − ε_2)]^{k+1} ∏_{j=0}^{k} max{‖s_j‖, ‖s_j‖^{1−p}} ‖g_j‖ cos θ_j,
we have
(4.23)  ∏_{j=0}^{k} max{‖s_j‖, ‖s_j‖^{1−p}} ‖g_j‖ cos θ_j ≥ [(1 − ε_2) c_3 c_2²/(Tr(B_0) + M_1)]^{k+1}.
According to Lemma 4.4 and Assumption A, we know that there exists a constant M_2' > 0 such that
(4.24)  ‖s_k‖ = ‖x_{k+1} − x_k‖ ≤ ‖x_{k+1}‖ + ‖x_k‖ ≤ 2M_2'.
Combining (4.23) and (4.24), and noting that ‖g_j‖ cos θ_j = η_j, we get for all k ≥ 0
∏_{j=0}^{k} η_j ≥ [(1 − ε_2) c_3 c_2²/((Tr(B_0) + M_1) max{2M_2', 1, (2M_2')^{1−p}})]^{k+1} = ε_0^{k+1}.
The proof is complete. □


Now we establish the global convergence theorem for Algorithm 2.
Theorem 4.1. Let Assumption A hold and the sequence {x_k} be generated by Algorithm 2. Then we have
(4.25)  lim inf_{k→∞} ‖g_k‖ = 0.

Proof. By Lemma 4.3 and (2.20), we get
f_{k+1} ≤ f(x_{l(k)}) − ε_1 ‖s_k‖ η_k
(4.26)  ≤ f(x_{l(k)}) − ε_1 b_0 min{η_k², η_k^{(2−p)/(1−p)}}.
Let t_k = ε_1 b_0 min{η_k², η_k^{(2−p)/(1−p)}}. By Lemma 4.5, we have
Σ_{k=0}^{∞} min_{0 ≤ j ≤ n(k)} min{η²_{k+n(k)−j}, η^{(2−p)/(1−p)}_{k+n(k)−j}} < +∞,
and hence
Σ_{q=1}^{∞} min_{0 ≤ j ≤ n(k)} min{η²_{(n(k)+1)q+n(k)−j}, η^{(2−p)/(1−p)}_{(n(k)+1)q+n(k)−j}} < +∞.

Define the sequence {p(q)} as follows:
min{η²_{p(q)}, η^{(2−p)/(1−p)}_{p(q)}} = min_{0 ≤ j ≤ n(k)} min{H_{1j}(q), H_{2j}(q)},
where H_{1j}(q) = η²_{(n(k)+1)q+n(k)−j}, H_{2j}(q) = η^{(2−p)/(1−p)}_{(n(k)+1)q+n(k)−j}, and q(n(k) + 1) ≤ p(q) ≤ (q + 1)n(k) + q. Therefore
p(1) < p(2) < p(3) < · · · < p(q − 1) < p(q) < · · · ,
lim_{q→∞} min{η²_{p(q)}, η^{(2−p)/(1−p)}_{p(q)}} = 0,
(4.27)  lim_{q→∞} η_{p(q)} = 0,

which means that
lim_{k∈K} η_k = 0,
where K is a subset of N = {1, 2, 3, . . .}. Since x_k ∈ Ω and Ω is bounded, we can assume that there exists a constant b_3 > 0 such that ‖g_k‖ ≤ b_3. Then we get
(4.28)  η_k = −g_k^T d_k/‖d_k‖ ≤ ‖g_k‖ ≤ b_3.
We prove our theorem by contradiction. Assume that
lim inf_{k→∞} ‖g_k‖ > 0,
so that there exists a constant c_2 > 0 such that
‖g_k‖ ≥ c_2, k = 0, 1, 2, . . . .
By Lemma 4.7, we know that there exists a constant ε_0 satisfying
(4.29)  ∏_{j=0}^{k} η_j ≥ ε_0^{k+1}.
Combining (4.29) and (4.28), we deduce that for any integer k ≥ 1,
ε_0^{(k+1)n(k)+k} ≤ ∏_{j=1}^{(k+1)n(k)+k} η_j
                 = (1/η_0) ∏_{q=0}^{k} ∏_{j=(n(k)+1)q}^{(q+1)n(k)+q} η_j
                 = (1/η_0) ∏_{q=0}^{k} ∏_{0 ≤ j ≤ n(k)} η_{q(n(k)+1)+n(k)−j}
                 ≤ (1/η_0) ∏_{q=0}^{k} [η_{p(q)} b_3^{n(k)}]
(4.30)           = (1/η_0) b_3^{kn(k)} ∏_{q=0}^{k} η_{p(q)}.

Then we have
∏_{q=0}^{k} η_{p(q)} ≥ η_0 ε_0^{n(k)} [ε_0^{n(k)+1}/b_3^{n(k)}]^k ≥ [ε_0^{n(k)+1}/b_3^{n(k)}]^k min{1, η_0 ε_0^{n(k)}}.
Using Lemma 4.6, we have
(4.31)  lim sup_{q→∞} η_{p(q)} > 0,
which contradicts (4.27). Therefore, we obtain
lim inf_{k→∞} ‖g_k‖ = 0.
The proof is complete. □
Similar to Algorithm 2, it is not difficult to get the global convergence of
Algorithm 22. Here, we only state it as follows but omit the proof.
Theorem 4.2. Let Assumption A hold and the sequence {xk } be generated by
Algorithm 22. Then we have (4.25).

5. Numerical results
In this section, we report some numerical results on the test problems from [30] with the given initial points. All codes were written in MATLAB 7.0 and run on a PC with a 2.60GHz CPU, 256MB of memory, and the Windows XP operating system. The parameters are chosen as: σ_1 = 0.1, σ_2 = 0.9, ε = 10⁻⁵, ε_1 = 0.1, ε_2 = 0.01, p = 5, H = 8, m_1 = 5, and the initial matrix B_0 = I is the unit matrix.
The following Himmelblau stopping rule is used [47]: If |f(x_k)| > e_1, let stop1 = |f(x_k) − f(x_{k+1})|/|f(x_k)|; otherwise, let stop1 = |f(x_k) − f(x_{k+1})|.
For each problem, if ‖g(x)‖ < ε or stop1 < e_2 is satisfied, the program is stopped, where e_1 = e_2 = 10⁻⁵.
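For illustration, this stopping test can be coded as follows (a sketch; the function name is ours, and fk, fk1, gk denote f(x_k), f(x_{k+1}), and g(x_k)):

function done = himmelblau_stop(fk, fk1, gk, e1, e2, epsilon)
% Himmelblau-type stopping test: relative (or absolute) change in f,
% combined with the gradient norm test ||g(x)|| < epsilon.
if abs(fk) > e1
    stop1 = abs(fk - fk1) / abs(fk);
else
    stop1 = abs(fk - fk1);
end
done = (norm(gk) < epsilon) || (stop1 < e2);
end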
Since the line search cannot always ensure the descent condition d_k^T g_k < 0, an uphill search direction may occur in the numerical experiments, in which case the line search rule may fail. In order to avoid this, the stepsize α_k is accepted if the number of trial steps in the line search exceeds twenty-five. We also stop the program if the number of iterations exceeds one thousand; the corresponding method is then considered to have failed.
In Figures 1-3, "BFGS-WP-Ak1" and "BFGS-WP-Ak2" stand for the modified BFGS formula (2.8) with the WWP rule and the modified BFGS formula (2.16) with the WWP rule, respectively. "L-BFGS-A1" and "L-BFGS-A11" stand for Algorithm 1 and Algorithm 11, respectively. The detailed numerical results are listed on the web site
http://210.36.16.53:8018/publication.asp?id=33331.

Dolan and Moré [13] gave a new tool to analyze the efficiency of algorithms. They introduced the notion of a performance profile as a means to evaluate and compare the performance of a set of solvers S on a test set P. Assuming that there are n_s solvers and n_p problems, for each problem p and solver s they defined t_{p,s} = computing time (the number of function evaluations, or some other measure) required to solve problem p by solver s.
Requiring a baseline for comparisons, they compared the performance on problem p by solver s with the best performance by any solver on this problem; that is, they used the performance ratio
r_{p,s} = t_{p,s}/min{t_{p,s} : s ∈ S}.
Suppose that a parameter r_M ≥ r_{p,s} for all p, s is chosen, and r_{p,s} = r_M if and only if solver s does not solve problem p.
The performance of solver s on any given problem might be of interest, but we would like to obtain an overall assessment of the performance of the solver, so they defined
ρ_s(t) = (1/n_p) size{p ∈ P : r_{p,s} ≤ t};
thus ρ_s(t) is the probability for solver s ∈ S that the performance ratio r_{p,s} is within a factor t ∈ R of the best possible ratio. The function ρ_s is the (cumulative) distribution function for the performance ratio. The performance profile ρ_s : R → [0, 1] for a solver is a nondecreasing, piecewise constant function, continuous from the right at each breakpoint. The value of ρ_s(1) is the probability that the solver will win over the rest of the solvers.
According to the above rules, we know that a solver whose performance profile plot lies toward the top right will win over the rest of the solvers.
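The profile can be computed directly from a table T of measurements, where T(p,s) is the cost (NI, NFG, or Time) of solver s on problem p and failures are recorded as Inf. The following sketch (our illustration, not the authors' script) evaluates ρ_s(t) on a grid; plotting rho against tgrid reproduces curves of the kind shown in Figures 1-3.

function rho = performance_profile(T, tgrid)
% T(p,s): cost of solver s on problem p (use Inf for failures).
% rho(i,s): fraction of problems with performance ratio r(p,s) <= tgrid(i).
[np, ns] = size(T);
best = min(T, [], 2);                  % best cost per problem
r = T ./ repmat(best, 1, ns);          % performance ratios r(p,s)
rho = zeros(numel(tgrid), ns);
for i = 1:numel(tgrid)
    rho(i, :) = sum(r <= tgrid(i), 1) / np;
end
end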
Figures 1, 2, and 3 show the performance of these methods relative to NI, NFG, and Time, respectively, where NI denotes the total number of iterations, NFG denotes the total number of function and gradient evaluations, counted as NT = NF + 5NG (see [6, 18]), and Time denotes the CPU time that these methods spent. From these three figures it is clear that the L-BFGS-A11 method has the most wins (has the highest probability of being the optimal solver).
Figure 1 shows that L-BFGS-A11 and L-BFGS-A1 outperform BFGS-WP-Ak1 and BFGS-WP-Ak2 on about 5% and 8% of the test problems, respectively. The L-BFGS-A11 method is predominant among the other three methods for t ≤ 5.
[Figure 1. Performance profiles of these methods (NI). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

Moreover, L-BFGS-A11 and L-BFGS-A1 solve 100% of the test problems, while the BFGS-WP-Ak1 and BFGS-WP-Ak2 methods solve about 95% and 92% of the test problems successfully, respectively.
Figure 2 shows that L-BFGS-A11 and L-BFGS-A1 are superior to BFGS-WP-Ak1 and BFGS-WP-Ak2 on about 15% of the test problems. The L-BFGS-A11 and L-BFGS-A1 methods solve 100% of the test problems successfully at t ≈ 4.2 and t ≈ 7.2, respectively. The BFGS-WP-Ak1 and BFGS-WP-Ak2 methods solve about 85% of the test problems successfully.
Figure 3 shows that L-BFGS-A11 outperforms the other three methods. The
L-BFGS-A1 method and the BFGS-WP-Ak1 method solve about 95% and 91%
of the test problems, respectively, and the BFGS-WP-Ak2 solves about 88% of
the test problems successfully.
In summary, the presented numerical results reveal that Algorithm 1 and Algorithm 11, compared with the other two methods with the WWP line search and BFGS update, have potential advantages for these problems.

[Figure 2. Performance profiles of these methods (NFG). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

[Figure 3. Performance profiles of these methods (Time). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

6. Conclusion
This paper gives two modified L-BFGS methods with a nonmonotone line search technique for solving unconstrained optimization problems; the methods use function value information at the current and next iterates. The global convergence for uniformly convex functions is established. The numerical results show that the given methods are competitive with the standard BFGS methods on the test problems.
For further research, we should study the performance of the new algorithms under different stopping rules and in different testing environments (such as [15]). Moreover, more numerical experiments on large practical problems should be done in the future.

References
[1] C. G. Broyden, J. E. Dennis Jr, and J. J. Moré, On the local and superlinear convergence
of quasi-Newton methods, J. Inst. Math. Appl. 12 (1973), 223–245.
[2] R. H. Byrd and J. Nocedal, A tool for the analysis of quasi-Newton methods with appli-
cation to unconstrained minimization, SIAM J. Numer. Anal. 26 (1989), no. 3, 727–739.
[3] R. H. Byrd, J. Nocedal, and R. B. Schnabel, Representations of quasi-Newton matrices
and their use in limited memory methods, Math. Programming 63 (1994), no. 2, Ser.
A, 129–156.
[4] R. Byrd, J. Nocedal, and Y. Yuan, Global convergence of a class of quasi-Newton meth-
ods on convex problems, SIAM J. Numer. Anal. 24 (1987), no. 5, 1171–1190.
[5] Y. Dai, Convergence properties of the BFGS algorithm, SIAM J. Optim. 13 (2002), no.
3, 693–701.
[6] Y. Dai and Q. Ni, Testing different conjugate gradient methods for large-scale uncon-
strained optimization, J. Comput. Math. 21 (2003), no. 3, 311–320.
[7] W. C. Davidon, Variable metric methods for minimization, A. E. C. Research and
Development Report ANL-599, 1959.
[8] R. Dembo and T. Steihaug, Truncated Newton algorithms for large-scale unconstrained
optimization, Math. Programming 26 (1983), no. 2, 190–212.
[9] J. E. Dennis Jr. and J. J. Moré, Quasi-Newton methods, motivation and theory, SIAM
Rev. 19 (1977), no. 1, 46–89.
[10] , A characterization of superlinear convergence and its application to quasi-
Newton methods, Math. Comp. 28 (1974), 549–560.
[11] J. E. Dennis Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations, Prentice Hall Series in Computational Mathematics. Prentice
Hall, Inc., Englewood Cliffs, NJ, 1983.
[12] L. C. W. Dixon, Variable metric algorithms: necessary and sufficient conditions for
identical behavior of nonquadratic functions, J. Optimization Theory Appl. 10 (1972),
34–40.
[13] E. D. Dolan and J. J. Moré, Benchmarking optimization software with performance
profiles, Math. Program. 91 (2002), no. 2, Ser. A, 201–213.
[14] R. Fletcher, Practical Methods of Optimization, Second edition. A Wiley-Interscience
Publication. John Wiley & Sons, Ltd., Chichester, 1987.
[15] N. I. M. Gould, D. Orban, and Ph. L. Toint, CUTEr (and SifDec), a constrained and unconstrained testing environment, revisited, ACM Transactions on Mathematical Software 29 (2003), 373–394.
[16] A. Griewank, The global convergence of partitioned BFGS on problems with convex
decompositions and Lipschitzian gradients, Math. Programming 50 (1991), no. 2, (Ser.
A), 141–175.
[17] A. Griewank and Ph. L. Toint, Local convergence analysis for partitioned quasi-Newton
updates, Numer. Math. 39 (1982), no. 3, 429–448.
[18] L. Grippo, F. Lampariello, and S. Lucidi, A nonmonotone line search technique for Newton's method, SIAM J. Numer. Anal. 23 (1986), no. 4, 707–716.
[19] , A truncated Newton method with nonmonotone line search for unconstrained
optimization, J. Optim. Theory Appl. 60 (1989), no. 3, 401–419.
[20] , A class of nonmonotone stabilization methods in unconstrained optimization,
Numer. Math. 59 (1991), no. 8, 779–805.
[21] J. Y. Han and G. H. Liu, Global convergence analysis of a new nonmonotone BFGS
algorithm on convex objective functions, Comput. Optim. Appl. 7 (1997), no. 3, 277–289.
[22] D. Li and M. Fukushima, A modified BFGS method and its global convergence in non-
convex minimization, J. Comput. Appl. Math. 129 (2001), no. 1-2, 15–35.
[23] , On the global convergence of the BFGS method for nonconvex unconstrained
optimization problems, SIAM J. Optim. 11 (2001), no. 4, 1054–1064.
[24] G. Li, C. Tang, and Z. Wei, New conjugacy condition and related new conjugate gradient
methods for unconstrained optimization, J. Comput. Appl. Math. 202 (2007), no. 2,
523–539.
[25] G. H. Liu and J. Y. Han, Notes on the general form of stepsize selection, OR and
Decision Making I (1992), 619–624.
[26] , Global convergence Analysis of the variable metric algorithm with a generalized
Wolf linesearch, Technical Report, Institute of Applied Mathematics, Academia Sinica,
Beijing, China, no. 029, 1993.
[27] G. H. Liu, J. Y. Han, and D. F. Sun, Global convergence of the BFGS algorithm with
nonmonotone linesearch, Optimization 34 (1995), no. 2, 147–159.
[28] G. H. Liu and J. M. Peng, The convergence properties of a nonmonotonic algorithm, J.
Comput. Math. 1 (1992), 65–71.
[29] W. F. Mascarenhas, The BFGS method with exact line searches fails for non-convex
objective functions, Math. Program. 99 (2004), no. 1, Ser. A, 49–61.
[30] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981), no. 1, 17–41.
[31] S. G. Nash, A survey of truncated-Newton methods, Numerical analysis 2000, Vol. IV,
Optimization and nonlinear equations. J. Comput. Appl. Math. 124 (2000), no. 1-2,
45–59.
[32] M. J. D. Powell, On the convergence of the variable metric algorithm, J. Inst. Math.
Appl. 7 (1971), 21–36.
[33] , Some global convergence properties of a variable metric algorithm for min-
imization without exact line searches, Nonlinear programming (Proc. Sympos., New
York, 1975), pp. 53–72. SIAM-AMS Proc., Vol. IX, Amer. Math. Soc., Providence, R.
I., 1976.
[34] , A new algorithm for unconstrained optimization, 1970 Nonlinear Programming
(Proc. Sympos., Univ. of Wisconsin, Madison, Wis., 1970) pp. 31–65 Academic Press,
New York.
[35] M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained
minimization problem, SIAM J. Optim. 7 (1997), no. 1, 26–33.
[36] J. Schropp, A note on minimization problems and multistep methods, Numer. Math. 78
(1997), no. 1, 87–101.
[37] , One-step and multistep procedures for constrained minimization problems, IMA
J. Numer. Anal. 20 (2000), no. 1, 135–152.
[38] Ph. L. Toint, Global convergence of the partitioned BFGS algorithm for convex partially
separable optimization, Math. Programming 36 (1986), no. 3, 290–306.
[39] D. J. van Wyk, Differential optimization techniques, Appl. Math. Modelling 8 (1984),
no. 6, 419–424.
[40] M. N. Vrahatis, G. S. Androulakis, J. N. Lambrinos, and G. D. Magolas, A class of gra-
dient unconstrained minimization algorithms with adaptive stepsize, J. Comput. Appl.
Math. 114 (2000), no. 2, 367–386.
[41] Z. Wei, G. Li, and L. Qi, New quasi-Newton methods for unconstrained optimization
problems, Appl. Math. Comput. 175 (2006), no. 2, 1156–1188.
[42] Z. Wei, G. Yu, G. Yuan, and Z. Lian, The superlinear convergence of a modified BFGS-
type method for unconstrained optimization, Comput. Optim. Appl. 29 (2004), no. 3,
315–332.
[43] G. L. Yuan, Modified nonlinear conjugate gradient methods with sufficient descent prop-
erty for large-scale optimization problems, Optim. Lett. 3 (2009), no. 1, 11–21.
[44] G. L. Yuan and X. W. Lu, A new line search method with trust region for unconstrained
optimization, Comm. Appl. Nonlinear Anal. 15 (2008), no. 1, 35–49.
[45] , A modified PRP conjugate gradient method, Ann. Oper. Res. 166 (2009), 73–90.
[46] G. L. Yuan, X. Lu, and Z. Wei, A conjugate gradient method with descent direction for
unconstrained optimization, J. Comput. Appl. Math. 233 (2009), no. 2, 519–530.
[47] Y. Yuan and W. Sun, Theory and Methods of Optimization, Science Press of China,
1999.
[48] G. L. Yuan and Z. X. Wei, New line search methods for unconstrained optimization, J.
Korean Statist. Soc. 38 (2009), no. 1, 29–39.
[49] , The superlinear convergence analysis of a nonmonotone BFGS algorithm on
convex objective functions, Acta Math. Sin. (Engl. Ser.) 24 (2008), no. 1, 35–42.
[50] , Convergence analysis of a modified BFGS method on convex minimizations,
Comput. Optim. Appl. doi: 10.1007/s10589-008-9219-0.
[51] J. Z. Zhang, N. Y. Deng, and L. H. Chen, New quasi-Newton equation and related
methods for unconstrained optimization, J. Optim. Theory Appl. 102 (1999), no. 1,
147–167.

Gonglin Yuan
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]

Zengxin Wei
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]

Yanlin Wu
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]
