
J. Korean Math. Soc. 47 (2010), No. 4, pp. 767–788
DOI 10.4134/JKMS.2010.47.4.767

MODIFIED LIMITED MEMORY BFGS METHOD WITH NONMONOTONE LINE SEARCH FOR UNCONSTRAINED OPTIMIZATION

Gonglin Yuan, Zengxin Wei, and Yanlin Wu

Abstract. In this paper, we propose two limited memory BFGS algorithms with a nonmonotone line search technique for unconstrained optimization problems. The global convergence of the given methods will be established under suitable conditions. Numerical results show that the presented algorithms are more competitive than the normal BFGS method.

1. Introduction
Consider the following unconstrained optimization problem
(1.1)  min_{x ∈ R^n} f(x),

where f : R^n → R is continuously differentiable. The line search method is one of the most efficient numerical methods for solving (1.1); it is defined by
(1.2)  x_{k+1} = x_k + α_k d_k, k = 0, 1, 2, . . . ,
where the steplength α_k is determined by a line search, and d_k, whose choice distinguishes the different line search methods [35, 36, 37, 39, 40, 43, 44, 45, 46, 48, 50], is a search direction of f at x_k.
One of the most effective methods for the unconstrained optimization problem (1.1) is Newton's method. It normally requires the fewest function evaluations and is very good at handling ill-conditioning. However, its efficiency largely depends on the possibility of efficiently solving the linear system that arises when computing the search direction d_k at each iteration,
(1.3)  G(x_k) d_k = −g(x_k),

Received July 17, 2008.


2000 Mathematics Subject Classification. 65H10, 65K05, 90C26.
Key words and phrases. limited memory BFGS method, optimization, nonmonotone,
global convergence.
This work is supported by China NSF grant 10761001 and the Scientific Research Foundation of Guangxi University (Grant No. X081082).
©2010 The Korean Mathematical Society
where g(x_k) = ∇f(x_k) is the gradient of f(x) at x_k, and G(x_k) = ∇²f(x_k) is the Hessian matrix of f(x) at the current iterate. Moreover, computing the exact solution of the system (1.3) can be too burdensome, and it is not necessary when x_k is far from a solution [31]. Inexact Newton methods [8, 31] represent the basic approach underlying most Newton-type large-scale algorithms. At each iteration, the current estimate of the solution is updated by approximately solving the linear system (1.3) using an iterative algorithm. The inner iteration is typically "truncated" before the solution of the linear system is obtained.
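To make the truncated (inexact) Newton idea concrete, the following MATLAB sketch (our illustration, not code from the paper; the Hessian, gradient, tolerance, and iteration cap are assumed values) computes a search direction by solving (1.3) only approximately with the conjugate gradient method.

% Inexact Newton step: solve G(xk)*d = -g(xk) only approximately.
Gk = [4 1; 1 3];          % illustrative Hessian (symmetric positive definite)
gk = [1; 2];              % illustrative gradient
eta = 1e-2;               % loose relative residual: the "truncation"
maxit = 50;               % cap on inner CG iterations
[dk, flag] = pcg(Gk, -gk, eta, maxit);   % approximate Newton direction
if flag ~= 0
    dk = -gk;             % fall back to steepest descent if CG did not converge
end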
The limited memory BFGS (L-BFGS) method (see [3]) is an adaptation of the BFGS method for large-scale problems. The implementation is almost identical to that of the standard BFGS method; the only difference is that the inverse Hessian approximation is not formed explicitly, but defined by a small number of BFGS updates. It often provides a fast rate of linear convergence, and requires minimal storage.
Since the standard BFGS method is widely used to solve general minimization problems, most of the studies concerning limited memory methods concentrate on the L-BFGS method. We know that the BFGS update exploits only the gradient information, while the available function value information is neglected. Therefore, many efficient attempts have been made to modify the usual quasi-Newton methods using both gradient and function value information (e.g. [41, 51]). Lately, in order to get a higher order accuracy in approximating the second-order curvature of the objective function, Wei, Li, and Qi [41] and Zhang, Deng, and Chen [51] proposed modified BFGS-type methods for (1.1), and the reported numerical results show that their average performance is better than that of the standard BFGS method.
The monotone line search technique is often used to obtain the stepsize α_k; however, monotonicity may cause a series of very small steps if the contours of the objective function form a family of curves with large curvature [18]. More recently, a nonmonotone line search for solving unconstrained optimization was proposed by Grippo et al. in [18]. Han and Liu [21] presented a new nonmonotone BFGS method for (1.1) and established its global convergence for convex objective functions. Numerical results show that this method is more competitive than the normal BFGS method with a monotone line search. We [49] proved its superlinear convergence.
Motivated by the above observations, we propose two limited memory BFGS-type methods on the basis of Wei et al. [41], Zhang et al. [51], and [21], which are suitable for solving large-scale unconstrained optimization problems. The major contribution of this paper is the extension of the BFGS-type methods in [41] and [51], together with the nonmonotone line search technique, to the limited memory scheme. In the standard L-BFGS method the pairs {s_i, y_i}, i = k − m̃ + 1, . . . , k, are stored, where s_i = x_{i+1} − x_i, y_i = g_{i+1} − g_i, g_i = g(x_i) and g_{i+1} = g(x_{i+1}) are the gradients of f(x) at x_i and x_{i+1}, respectively, and m̃ > 0 is a constant. A distinguishing feature of our proposed L-BFGS methods is that, at each iteration, a triple
{s_i, y_i, A_i}, i = k − m̃ + 1, . . . , k,
is stored, where A_i is a scalar related to function values. Compared with the standard BFGS method, the proposed methods require no more function or derivative evaluations per iteration, and hardly more storage or arithmetic operations. Under suitable conditions, we establish the global convergence of the methods. Numerical experiments on a set of large problems indicate that the proposed methods are promising.
This paper is organized as follows. In the next section, modified BFGS
update and nonmonotone line search are stated. The proposed L-BFGS algo-
rithms are given in Section 3. Under some reasonable conditions, the global
convergence of the given methods is established in Section 4. Numerical results
and a conclusion are presented in Section 5 and in Section 6, respectively.

2. Modified BFGS update and nonmonotone line search


Quasi-Newton methods are iterative methods of the form
xk+1 = xk + αk dk ,
where xk is the kth iteration point, αk is a stepsize, and dk is a search direction.
Now we first state the search direction as follows.
2.1. Some modified BFGS update formulas
The search direction of the quasi-Newton method is defined by
(2.1) Bk dk + gk = 0,
where gk = g(xk ) = ∇f (xk ) is the gradient of f (x) at xk , Bk is an approx-
imation of ∇2 f (xk ). By tradition, {Bk } satisfies the following quasi-Newton
equation
(2.2) Bk+1 sk = yk ,
where s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Throughout the paper, we use the following notation: ‖ · ‖ is the Euclidean norm, g(x_k) and g(x_{k+1}) are written as g_k and g_{k+1}, and f(x_k) and f(x_{k+1}) as f_k and f_{k+1}, respectively.
The famous update of B_k is the standard BFGS formula
(2.3)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k).
Let H_k be the inverse of B_k. Then the inverse update formula corresponding to (2.3) is represented as
(2.4)  H_{k+1} = H_k − [y_k^T(s_k − H_k y_k) s_k s_k^T]/(y_k^T s_k)² + [(s_k − H_k y_k)s_k^T + s_k(s_k − H_k y_k)^T]/(y_k^T s_k)
              = (I − (s_k y_k^T)/(y_k^T s_k)) H_k (I − (y_k s_k^T)/(y_k^T s_k)) + (s_k s_k^T)/(y_k^T s_k),
770 GONGLIN YUAN, ZENGXIN WEI, AND YANLIN WU

which is the dual of the DFP update formula in the sense that H_k ↔ B_k, H_{k+1} ↔ B_{k+1}, and s_k ↔ y_k. It has been shown that the BFGS method is very efficient for solving unconstrained optimization problems (1.1) [11, 14, 47]. For convex minimization problems, the BFGS method is globally convergent if the exact line search or some special inexact line search is used [1, 2, 4, 12, 16, 32, 33, 38], and its local convergence has been well established [9, 10, 17]. For a general function f, Dai [5] constructed an example showing that the standard BFGS method may fail for non-convex functions with an inexact line search, and Mascarenhas [29] showed the nonconvergence of the standard BFGS method even with exact line searches.
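As an illustration only (not code from the paper), the compact form of the inverse update (2.4) can be realized in MATLAB as follows; skipping the update when y_k^T s_k is not sufficiently positive is a common safeguard that we assume here, not part of the formula itself.

function H = bfgs_inverse_update(H, s, y)
% One standard BFGS update of the inverse Hessian approximation, eq. (2.4):
%   H+ = (I - rho*s*y')*H*(I - rho*y*s') + rho*s*s',  rho = 1/(y'*s).
if y' * s <= 1e-12 * norm(s) * norm(y)   % safeguard (our assumption)
    return;                              % keep H unchanged
end
rho = 1 / (y' * s);
n = size(H, 1);
V = eye(n) - rho * (y * s');
H = V' * H * V + rho * (s * s');
end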
In order to obtain global convergence of the BFGS method without a convexity assumption on the objective function, Li and Fukushima [22, 23] made a slight modification to the standard BFGS method. Now we state their works as follows:
(i) A new quasi-Newton equation [22] of the following form
B_{k+1} s_k = y_k^{1*},
where y_k^{1*} = y_k + (max{0, −(y_k^T s_k)/‖s_k‖²} + φ(‖g_k‖)) s_k, and the function φ : R → R satisfies: (a) φ(t) > 0 for all t > 0; (b) φ(t) = 0 if and only if t = 0; (c) φ(t) is bounded if t is in a bounded set.
(ii) A modified BFGS update formula [23] of the following form
(2.5)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{1*} y_k^{1*T})/(y_k^{1*T} s_k), if (s_k^T y_k^{1*})/‖s_k‖² ≥ φ(‖g_k‖); B_{k+1} = B_k, otherwise.
Then it is not difficult to see that s_k^T y_k^{1*} > 0 always holds, which ensures that the update matrix B_{k+1} inherits the positive definiteness of B_k (see [14]). The global convergence and the superlinear convergence of these two methods for nonconvex functions have been established under appropriate conditions (see [22, 23] for details).
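A minimal sketch (our illustration, with the particular choice φ(t) = C·t made only for concreteness; [22] merely requires the properties (a)-(c) above) of the modified difference vector y_k^{1*}:

function y1 = modified_y_li_fukushima(s, y, g, C)
% y^{1*} = y + ( max{0, -y'*s/||s||^2} + phi(||g||) ) * s, with phi(t) = C*t here.
% The choice phi(t) = C*t is an assumption for illustration only.
phi = C * norm(g);
y1 = y + (max(0, -(y' * s) / norm(s)^2) + phi) * s;
end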
In order to get a better approximation of the Hessian matrix of the objective function, Wei, Li, and Qi [41] and Zhang, Deng, and Chen [51] proposed modified quasi-Newton equations, which are given as follows.
(i) The equation of Wei, Li, and Qi [41]:
(2.6)  B_{k+1} s_k = y_k^{2*} = y_k + A_k s_k,
where
(2.7)  A_k = (2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖².
They replaced all the y_k in (2.3) and obtained the following modified BFGS-type update formula
(2.8)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{2*} y_k^{2*T})/(y_k^{2*T} s_k).
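For illustration, a sketch (under the definitions (2.6)-(2.7); the function name is ours, not the authors' code) of how A_k and y_k^{2*} can be computed in MATLAB:

function [y2, A] = modified_y_wlq(fk, fk1, gk, gk1, s, y)
% A_k = ( 2*(f(xk) - f(xk+1)) + (g(xk+1) + g(xk))'*s ) / ||s||^2,  eq. (2.7)
% y^{2*} = y + A_k * s,                                            eq. (2.6)
A = (2 * (fk - fk1) + (gk1 + gk)' * s) / norm(s)^2;
y2 = y + A * s;
end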
Note that the quasi-Newton equation (2.6) contains both gradient and function value information at the current and the previous step, so one may expect that the resulting methods will outperform the original method. In fact, practical computation shows that this method is better than the normal BFGS method (see [41, 42] for details) for some given problems [30]. Furthermore, some theoretical advantages of the new quasi-Newton equation (2.6) can be seen from the following two theorems.
Theorem 2.1 ([42, Lemma 3.1]). Consider the quasi-Newton equation (2.6). Then we have, for all k ≥ 1,
f(x_k) = f(x_{k+1}) + g(x_{k+1})^T (x_k − x_{k+1}) + (1/2)(x_k − x_{k+1})^T B_{k+1} (x_k − x_{k+1}).
Theorem 2.2 ([24, Theorem 3.1]). Assume that the function f(x) is sufficiently smooth and ‖s_k‖ is sufficiently small. Then we have
(2.9)  s_k^T G_{k+1} s_k − s_k^T y_k^{2*} − (1/3) s_k^T (T_{k+1} s_k) s_k = O(‖s_k‖⁴)
and
(2.10)  s_k^T G_{k+1} s_k − s_k^T y_k − (1/2) s_k^T (T_{k+1} s_k) s_k = O(‖s_k‖⁴),
where G_{k+1} denotes the Hessian matrix of f at x_{k+1}, T_{k+1} is the tensor of f at x_{k+1}, and
s_k^T (T_{k+1} s_k) s_k = Σ_{i,j,l=1}^{n} [∂³f(x_{k+1})/(∂x^i ∂x^j ∂x^l)] s_k^i s_k^j s_k^l.

(ii) The equation of Zhang, Deng, and Chen [51]:
(2.11)  B_{k+1} s_k = y_k^{3*} = y_k + Ā_k s_k,
where
(2.12)  Ā_k = (6[f(x_k) − f(x_{k+1})] + 3[g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖².
They replaced all the y_k in (2.3) and obtained the following modified BFGS-type update formula
(2.13)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{3*} y_k^{3*T})/(y_k^{3*T} s_k).
Similarly to equation (2.6), the quasi-Newton equation (2.11) contains both gradient and function value information at the current and the previous step, so one may expect that the resulting methods will outperform the original method. In fact, practical computation shows that this method is better than the normal BFGS method (see [51] for details). Furthermore, some theoretical advantages of the new quasi-Newton equation (2.11) can be seen from the following theorem.
Theorem 2.3 ([51, Theorem 3.3]). Assume that the function f(x) is sufficiently smooth and ‖s_k‖ is sufficiently small. Then we have
(2.14)  s_k^T (G_{k+1} s_k − y_k^{3*}) = O(‖s_k‖⁴)
and
(2.15)  s_k^T (G_{k+1} s_k − y_k) = O(‖s_k‖³).
It is not difficult to deduce that s_k^T y_k^{2*} > 0 holds for a uniformly convex function f (or see [42]). We all know that the condition s_k^T y_k^{2*} > 0 ensures that the update matrix B_{k+1} from (2.8) inherits the positive definiteness of B_k. Similarly, in order to get the positive definiteness of B_k in (2.13) for each k, we give a modified BFGS update of (2.13), i.e., the modified update formula is defined by
(2.16)  B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k^{4*} y_k^{4*T})/(y_k^{4*T} s_k),
where y_k^{4*} = y_k + A_k^* s_k and A_k^* = (1/3) max{Ā_k, 0}. Then the corresponding quasi-Newton equation is
(2.17)  B_{k+1} s_k = y_k^{4*}.
From the definition of y_k^{4*}, we can obtain s_k^T y_k^{4*} > 0 if the objective function f is uniformly convex (see Lemma 4.1).
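Similarly, a sketch (again our illustration, using the definitions (2.12) and (2.16)-(2.17)) of Ā_k, A_k^* = (1/3)max{Ā_k, 0}, and y_k^{4*}:

function [y4, Astar] = modified_y4(fk, fk1, gk, gk1, s, y)
% Abar_k = ( 6*(f(xk) - f(xk+1)) + 3*(g(xk+1) + g(xk))'*s ) / ||s||^2,  eq. (2.12)
% A*_k   = max{Abar_k, 0} / 3,  y^{4*} = y + A*_k * s,                  eqs. (2.16)-(2.17)
Abar = (6 * (fk - fk1) + 3 * (gk1 + gk)' * s) / norm(s)^2;
Astar = max(Abar, 0) / 3;
y4 = y + Astar * s;
end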
Other modified formulas can be found in [7, 34]; we do not present them here.
2.2. One nonmonotone line search
Normally the steplength α_k is generated by the following weak Wolfe-Powell (WWP) rule: find a steplength α_k such that
(2.18)  f(x_k + α_k d_k) ≤ f(x_k) + σ_1 α_k g_k^T d_k,
(2.19)  g(x_k + α_k d_k)^T d_k ≥ σ_2 g_k^T d_k,
where 0 < σ_1 < σ_2 < 1. Many authors have analyzed the BFGS algorithm with generalized line search procedures [25, 26]. Recently, a nonmonotone line search technique for unconstrained optimization was proposed by Grippo et al. [18, 19, 20] and further studied in [27, 28], etc. Grippo, Lampariello, and Lucidi [18] proposed the following nonmonotone line search, which we call the GLL line search.
GLL line search: Select a steplength α_k satisfying
(2.20)  f(x_{k+1}) ≤ max_{0 ≤ j ≤ n(k)} f(x_{k−j}) + ε_1 α_k g_k^T d_k,
(2.21)  g(x_{k+1})^T d_k ≥ max{ε_2, 1 − (α_k ‖d_k‖)^p} g_k^T d_k,
where p ∈ (−∞, 1), k = 0, 1, 2, . . . , ε_1, ε_2 ∈ (0, 1), n(k) = min{H, k}, and H ≥ 0
is an integer constant. Combining this line search and the normal BFGS formula (2.3), Han and Liu [21] established the global convergence for convex objective functions. Numerical results show that this method is more competitive than the normal BFGS method with the WWP line search. Recently, the superlinear convergence of the new nonmonotone BFGS algorithm for convex functions was proved by Yuan and Wei [49].
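To illustrate the nonmonotone idea, the following sketch (our own simplification, not the authors' implementation) backtracks until the nonmonotone Armijo-type condition (2.20) holds; enforcing the curvature condition (2.21) as well would require a bracketing strategy, which we omit here. With eps1 = 0.1 it matches the ε_1 used in Section 5.

function [alpha, xnew, fnew] = gll_backtracking(fun, xk, fhist, gk, dk, eps1)
% Nonmonotone (GLL-type) backtracking on condition (2.20):
%   f(xk + alpha*dk) <= max(fhist) + eps1*alpha*gk'*dk,
% where fhist holds f(x_{k-j}), j = 0,...,n(k).  Illustrative sketch only.
alpha = 1;
fmax = max(fhist);
slope = gk' * dk;            % should be negative for a descent direction
for i = 1:25                 % cap on trial steps (our assumption)
    xnew = xk + alpha * dk;
    fnew = fun(xnew);
    if fnew <= fmax + eps1 * alpha * slope
        return;
    end
    alpha = alpha / 2;       % halve the stepsize and try again
end
end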

3. Limited memory BFGS-type method


In this section, we propose new algorithms to solve (1.1). To improve the performance of the line search, it is a better choice to use the GLL line search instead of the WWP line search. Then our methods generate a sequence of points {x_k} by
x_{k+1} = x_k + α_k d_k, k = 0, 1, 2, . . . ,
where α_k is determined by (2.20) and (2.21), and d_k is a descent direction of f at x_k. In the following, we describe the direction d_k in detail.
The limited memory BFGS (L-BFGS) method (see [3]) is an adaptation of the BFGS method for large-scale problems. In the L-BFGS method, the matrix H_k is obtained by updating the basic matrix H_0 m̃ (> 0) times using the BFGS formula with the previous m̃ iterations. The standard BFGS correction (2.4) has the following form
(3.22)  H_{k+1} = V_k^T H_k V_k + ρ_k s_k s_k^T,
where ρ_k = 1/(s_k^T y_k), V_k = I − ρ_k y_k s_k^T, and I is the unit matrix. Thus, H_{k+1} in the L-BFGS method has the following form:
H_{k+1} = V_k^T H_k V_k + ρ_k s_k s_k^T
        = V_k^T [V_{k−1}^T H_{k−1} V_{k−1} + ρ_{k−1} s_{k−1} s_{k−1}^T] V_k + ρ_k s_k s_k^T
        = · · ·
        = [V_k^T · · · V_{k−m̃+1}^T] H_{k−m̃+1} [V_{k−m̃+1} · · · V_k]
          + ρ_{k−m̃+1} [V_k^T · · · V_{k−m̃+2}^T] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2} · · · V_k]
          + · · ·
(3.23)    + ρ_k s_k s_k^T.
To improve the performance of the standard limited memory BFGS algorithm, it is a better choice to use the modified BFGS-type updates instead of the standard BFGS one. If we replace all the y_k in (3.23) with y_k^{4*} and with y_k^{2*}, respectively, the new limited memory BFGS-type updates can be obtained as
H_{k+1} = V_k^{*T} H_k V_k^* + ρ_k^* s_k s_k^T
        = V_k^{*T} [V_{k−1}^{*T} H_{k−1} V_{k−1}^* + ρ_{k−1}^* s_{k−1} s_{k−1}^T] V_k^* + ρ_k^* s_k s_k^T
        = · · ·
        = [V_k^{*T} · · · V_{k−m̃+1}^{*T}] H_{k−m̃+1} [V_{k−m̃+1}^* · · · V_k^*]
          + ρ_{k−m̃+1}^* [V_k^{*T} · · · V_{k−m̃+2}^{*T}] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2}^* · · · V_k^*]
          + · · ·
(3.24)    + ρ_k^* s_k s_k^T
and
H_{k+1} = V_k^{**T} H_k V_k^{**} + ρ̄_k^* s_k s_k^T
        = V_k^{**T} [V_{k−1}^{**T} H_{k−1} V_{k−1}^{**} + ρ̄_{k−1}^* s_{k−1} s_{k−1}^T] V_k^{**} + ρ̄_k^* s_k s_k^T
        = · · ·
        = [V_k^{**T} · · · V_{k−m̃+1}^{**T}] H_{k−m̃+1} [V_{k−m̃+1}^{**} · · · V_k^{**}]
          + ρ̄_{k−m̃+1}^* [V_k^{**T} · · · V_{k−m̃+2}^{**T}] s_{k−m̃+1} s_{k−m̃+1}^T [V_{k−m̃+2}^{**} · · · V_k^{**}]
          + · · ·
(3.25)    + ρ̄_k^* s_k s_k^T,
where ρ_k^* = 1/(s_k^T y_k^{4*}), V_k^* = I − ρ_k^* y_k^{4*} s_k^T, ρ̄_k^* = 1/(s_k^T y_k^{2*}), and V_k^{**} = I − ρ̄_k^* y_k^{2*} s_k^T.
Now we state the new limited memory BFGS-type algorithms (L-BFGS-A) with the GLL line search as follows.
Algorithm 1. (L-BFGS-A1)
Step 0: Choose an initial point x_0 ∈ R^n, a basic symmetric positive definite matrix H_0 ∈ R^{n×n}, constants r, ε_1, ε_2 ∈ (0, 1), p ∈ (−∞, 1), H ≥ 0, and a positive integer m_1. Let k := 0;
Step 1: Stop if ‖g_k‖ = 0.
Step 2: Determine d_k by
(3.26)  d_k = −H_k g_k.
Step 3: Find α_k satisfying (2.20) and (2.21).
Step 4: Let the next iterate be x_{k+1} = x_k + α_k d_k.
Step 5: Let m̃ = min{k + 1, m_1}. Update H_0 m̃ times to get H_{k+1} by (3.24).
Step 6: Let k := k + 1. Go to Step 1.
Algorithm 11. (L-BFGS-A11) The same as Algorithm 1, except that Step 5 is replaced by:
Step 5: Let m̃ = min{k + 1, m_1}. Update H_0 m̃ times to get H_{k+1} by (3.25).
In the following, we assume that the algorithm updates B_k, the inverse of H_k. We also assume that the basic matrix B_0, and its inverse H_0, are bounded and positive definite. Algorithm 1 with B_k can be stated as follows.
Algorithm 2. (L-BFGS-A2) The same as Algorithm 1, except that Steps 2 and 5 are replaced by:
Step 2: Determine d_k by
(3.27)  B_k d_k = −g_k.
Step 5: Let m̃ = min{k + 1, m_1}. Put s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Update B_0 m̃ times, i.e., for l = k − m̃ + 1, . . . , k compute
(3.28)  B_k^{l+1} = B_k^l − (B_k^l s_l s_l^T B_k^l)/(s_l^T B_k^l s_l) + (y_l^{4*} y_l^{4*T})/(y_l^{4*T} s_l),
where s_l = x_{l+1} − x_l, y_l^{4*} = y_l + A_l^* s_l, and B_k^{k−m̃+1} = B_0 for all k.
Algorithm 22. (L-BFGS-A22) The same as Algorithm 1, except that Steps 2 and 5 are replaced by:
Step 2: Determine d_k by
(3.29)  B_k d_k = −g_k.
Step 5: Let m̃ = min{k + 1, m_1}. Put s_k = x_{k+1} − x_k = α_k d_k, y_k = g_{k+1} − g_k. Update B_0 m̃ times, i.e., for l = k − m̃ + 1, . . . , k compute
(3.30)  B_k^{l+1} = B_k^l − (B_k^l s_l s_l^T B_k^l)/(s_l^T B_k^l s_l) + (y_l^{2*} y_l^{2*T})/(y_l^{2*T} s_l),
where s_l = x_{l+1} − x_l, y_l^{2*} = y_l + A_l s_l, and B_k^{k−m̃+1} = B_0 for all k.
Note that Algorithms 1 and 2 are mathematically equivalent, and Algorithms
11 and 22 are mathematically equivalent too. In our numerical experiments we
implement Algorithms 1 and 11, and Algorithms 2 and 22 are given only for the
purpose of analysis. Throughout this paper, we only discuss Algorithms 2 and
22. In the following section, we will concentrate on their global convergence.
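Algorithms 1 and 11 never form H_k explicitly. One standard way to apply a limited memory update of the form (3.24) or (3.25) is the two-loop recursion of [3]; the MATLAB sketch below (our illustration, not the authors' code) evaluates d_k = −H_k g_k from the m̃ most recent pairs stored as columns of S and Y, where for Algorithm 2 or 22 the columns of Y would hold the modified vectors y^{4*} or y^{2*}. A common choice for the scaling, assumed here, is gamma = s_k^T y_k/(y_k^T y_k).

function d = lbfgs_direction(g, S, Y, gamma)
% Two-loop recursion: returns d = -H*g, where H is the limited memory matrix
% built from the m most recent pairs (S(:,i), Y(:,i)) and H0 = gamma*I.
m = size(S, 2);
rho = zeros(m, 1);
alpha = zeros(m, 1);
q = g;
for i = m:-1:1
    rho(i) = 1 / (Y(:, i)' * S(:, i));
    alpha(i) = rho(i) * (S(:, i)' * q);
    q = q - alpha(i) * Y(:, i);
end
r = gamma * q;                       % apply H0 = gamma*I
for i = 1:m
    beta = rho(i) * (Y(:, i)' * r);
    r = r + (alpha(i) - beta) * S(:, i);
end
d = -r;                              % search direction d_k = -H_k*g_k
end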

4. Convergence analysis
This section is devoted to showing that Algorithm 2 is convergent for twice continuously differentiable and uniformly convex functions. In order to establish global convergence for Algorithm 2, we need the following assumptions.
Assumption A. (i) The level set Ω = {x | f(x) ≤ f(x_0)} is bounded.
(ii) The function f is twice continuously differentiable on Ω.
(iii) The function f is uniformly convex, i.e., there exist positive constants m and M such that
(4.1)  m‖d‖² ≤ d^T G(x) d ≤ M‖d‖²
holds for all x ∈ Ω and d ∈ R^n, where G(x) = ∇²f(x). These assumptions are the same as those in [42, 51].
It is obvious that Assumption A implies that there exists a constant M_* > 0 such that
‖G(x)‖ ≤ M_*, x ∈ Ω.
Assumption A (ii) implies that there exists a constant L ≥ 0 satisfying
(4.2)  ‖g(x) − g(y)‖ ≤ L‖x − y‖, x, y ∈ Ω.
Lemma 4.1. Let Assumption A hold. Then there exists a positive number M_1 such that
‖y_k^{4*}‖²/(s_k^T y_k^{4*}) ≤ M_1, k = 0, 1, 2, . . . .
Proof. Following the definition of y_k^{4*} and Taylor's formula, we get
s_k^T y_k^{4*} = s_k^T y_k + s_k^T A_k^* s_k
             = max{2[f_k − f_{k+1}] + 2g_{k+1}^T s_k, s_k^T y_k}
             = max{2[−g_{k+1}^T s_k + (1/2) s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k] + 2g_{k+1}^T s_k, s_k^T G(x_k + θ_1(x_{k+1} − x_k)) s_k}
             = max{s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k, s_k^T G(x_k + θ_1(x_{k+1} − x_k)) s_k},
where θ, θ_1 ∈ (0, 1). Combining this with Assumption A(iii), it is easy to obtain
(4.3)  m‖s_k‖² ≤ s_k^T y_k^{4*} ≤ M‖s_k‖².
By the definition of y_k^{4*} and Taylor's formula again, we have
‖y_k^{4*}‖ = ‖y_k + max{(2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k)/‖s_k‖², 0} s_k‖
          ≤ max{‖y_k‖ + |2[f(x_k) − f(x_{k+1})] + [g(x_{k+1}) + g(x_k)]^T s_k|/‖s_k‖, ‖y_k‖}
          ≤ 2‖y_k‖ + |s_k^T G(x_k + θ(x_{k+1} − x_k)) s_k|/‖s_k‖
          ≤ 2L‖s_k‖ + M‖s_k‖
(4.4)     = (2L + M)‖s_k‖,
where θ ∈ (0, 1), and the third inequality follows from (4.1) and (4.2). By (4.3) and (4.4), we get
‖y_k^{4*}‖²/(s_k^T y_k^{4*}) ≤ (2L + M)²‖s_k‖²/(m‖s_k‖²) = (2L + M)²/m =: M_1.
The proof is complete. □

Lemma 4.2. Let B_k be generated by (3.28). Then we have
(4.5)  det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l),
where det(B_k) denotes the determinant of B_k.


Proof. To begin with, we take the determinant on both sides of (2.8):
det(B_{k+1}) = det(B_k (I − (s_k s_k^T B_k)/(s_k^T B_k s_k) + (B_k^{−1} y_k^{4*} y_k^{4*T})/(s_k^T y_k^{4*})))
            = det(B_k) det(I − (s_k s_k^T B_k)/(s_k^T B_k s_k) + (B_k^{−1} y_k^{4*} y_k^{4*T})/(s_k^T y_k^{4*}))
            = det(B_k) [(1 − s_k^T (B_k s_k)/(s_k^T B_k s_k))(1 + (B_k^{−1} y_k^{4*})^T y_k^{4*}/(y_k^{4*T} s_k)) − (−s_k^T y_k^{4*}/(y_k^{4*T} s_k)) ((B_k s_k)^T (B_k^{−1} y_k^{4*})/(s_k^T B_k s_k))]
            = det(B_k) (y_k^{4*T} s_k)/(s_k^T B_k s_k),
where the third equality follows from the formula (see, e.g., [9, Lemma 7.6])
det(I + u_1 u_2^T + u_3 u_4^T) = (1 + u_1^T u_2)(1 + u_3^T u_4) − (u_1^T u_4)(u_2^T u_3).
Therefore, there is also a simple expression for the determinant of (3.28):
det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l).
This completes the proof. □

Define the length of the orthogonal projection of −g_k on d_k by
(4.6)  η_k = −g_k^T d_k/‖d_k‖.
The following Lemmas 4.3-4.6 have been proved in [21]; here we only state them and omit the proofs.
Lemma 4.3. Let Assumption A be satisfied and consider the GLL line search. Then there exists a positive constant b_0 such that
‖s_k‖ ≥ b_0 min{η_k, (η_k)^{1/(1−p)}},
where η_k is defined by (4.6).
Lemma 4.4. Denote
f(x_{l(k)}) = max_{0 ≤ j ≤ n(k)} f(x_{k−j}), k − n(k) ≤ l(k) ≤ k.
If f_{k+1} ≤ f(x_{l(k)}), k = 0, 1, 2, . . . , then the sequence {f(x_{l(k)})} monotonically decreases, and x_k ∈ Ω for all k ≥ 0.
Lemma 4.5. If
(4.7)  f_{k+1} ≤ f(x_{l(k)}) − t_k, k = 0, 1, 2, . . . ,
where t_k ≥ 0, then
(4.8)  Σ_{k=0}^{∞} min_{0 ≤ j ≤ n(k)} t_{k+n(k)−j} < +∞.

Lemma 4.6. If a sequence of nonnegative numbers m_k (k = 0, 1, . . .) satisfies
(4.9)  ∏_{j=0}^{k} m_j ≥ c_1^k, c_1 > 0, k = 1, 2, . . . ,
then lim sup_k m_k > 0.


Lemma 4.7. Let {x_k} be generated by Algorithm 2 and let Assumption A hold. If
lim inf_{k→∞} ‖g_k‖ > 0,
then there exists a constant ε_0 > 0 such that
∏_{j=0}^{k} η_j ≥ (ε_0)^{k+1} for all k ≥ 0.

Proof. Assume that lim inf_k ‖g_k‖ > 0, i.e., there exists a constant c_2 > 0 such that
(4.10)  ‖g_k‖ ≥ c_2, k = 0, 1, 2, . . . .
From Assumption A(iii) and Taylor's formula, we have
(4.11)  m‖s_k‖² ≤ s_k^T G(x_k + θ_1 s_k) s_k = s_k^T y_k ≤ M‖s_k‖²;
combining this with (4.3), we get
(4.12)  (m/M) s_k^T y_k ≤ s_k^T y_k^{4*} ≤ (M/m) s_k^T y_k.
Taking the trace on both sides of (3.28), we get
(4.13)  Tr(B_{k+1}) = Tr(B_k^{k−m̃+1}) − Σ_{l=k−m̃+1}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=k−m̃+1}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*}),
where Tr(B_k) denotes the trace of B_k. Repeating this trace operation, we have
Tr(B_{k+1}) = Tr(B_k^{k−m̃+1}) − Σ_{l=k−m̃+1}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=k−m̃+1}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*})
            = · · ·
(4.14)      = Tr(B_0) − Σ_{l=0}^{k} ‖B_l s_l‖²/(s_l^T B_l s_l) + Σ_{l=0}^{k} ‖y_l^{4*}‖²/(s_l^T y_l^{4*}).

Combining (4.10), (4.14), (3.26), (3.29), and Lemma 4.1, we obtain
(4.15)  Tr(B_{k+1}) ≤ Tr(B_0) − Σ_{j=0}^{k} c_2²/(g_j^T H_j g_j) + (k + 1)M_1.
Since B_{k+1} is positive definite, we have Tr(B_{k+1}) > 0. By (4.15), we obtain
(4.16)  Σ_{j=0}^{k} c_2²/(g_j^T H_j g_j) ≤ Tr(B_0) + (k + 1)M_1
and
(4.17)  Tr(B_{k+1}) ≤ Tr(B_0) + (k + 1)M_1.
By the geometric-arithmetic mean value inequality we get
(4.18)  ∏_{j=0}^{k} g_j^T H_j g_j ≥ [(k + 1)c_2²/(Tr(B_0) + (k + 1)M_1)]^{k+1}.
Using Lemma 4.2, (4.12), and (2.21), we have
det(B_{k+1}) = det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (s_l^T y_l^{4*})/(s_l^T B_l s_l)
            ≥ det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (m/M) (s_l^T y_l)/(s_l^T B_l s_l)
            ≥ det(B_k^{k−m̃+1}) ∏_{l=k−m̃+1}^{k} (m/M) min{1 − ε_2, ‖s_l‖^p}/α_l
            ≥ · · ·
            ≥ det(B_0) (m/M)^{k+1} ∏_{j=0}^{k} min{1 − ε_2, ‖s_j‖^p}/α_j,
which implies
(4.19)  [det(B_0)/det(B_{k+1})] (m/M)^{k+1} ≤ ∏_{j=0}^{k} max{α_j/(1 − ε_2), α_j/‖s_j‖^p}.
By using the geometric-arithmetic mean value inequality again, we get
(4.20)  det(B_{k+1}) ≤ [Tr(B_{k+1})/n]^n.
Using (4.17), (4.19), and (4.20), we obtain
∏_{j=0}^{k} max{α_j/(1 − ε_2), α_j/‖s_j‖^p} ≥ (m/M)^{k+1} det(B_0) n^n/[Tr(B_0) + (k + 1)M_1]^n
    ≥ (m/M)^{k+1} [1/(k + 1)]^n det(B_0) n^n/[Tr(B_0) + M_1]^n
    ≥ [m/(M exp(n))]^{k+1} min{det(B_0) n^n/[Tr(B_0) + M_1]^n, 1}
(4.21)  ≥ c_3^{k+1},
where c_3 ≤ [m/(M exp(n))] min{det(B_0) n^n/[Tr(B_0) + M_1]^n, 1}. Let
cos θ_j = −g_j^T d_j/(‖g_j‖ ‖d_j‖).
Multiplying (4.18) by (4.21), we get for all k ≥ 0
∏_{j=0}^{k} max{‖s_j‖ ‖g_j‖ cos θ_j/(1 − ε_2), ‖g_j‖ cos θ_j/‖s_j‖^{p−1}} ≥ c_3^{k+1} [(k + 1)c_2²/(Tr(B_0) + (k + 1)M_1)]^{k+1}
(4.22)      ≥ [c_3 c_2²/(Tr(B_0) + M_1)]^{k+1}.
By
∏_{j=0}^{k} max{‖s_j‖ ‖g_j‖ cos θ_j/(1 − ε_2), ‖g_j‖ cos θ_j/‖s_j‖^{p−1}} ≤ [1/(1 − ε_2)]^{k+1} ∏_{j=0}^{k} max{‖s_j‖, ‖s_j‖^{1−p}} ‖g_j‖ cos θ_j,
we have
(4.23)  ∏_{j=0}^{k} max{‖s_j‖, ‖s_j‖^{1−p}} ‖g_j‖ cos θ_j ≥ [(1 − ε_2) c_3 c_2²/(Tr(B_0) + M_1)]^{k+1}.
According to Lemma 4.4 and Assumption A, we know that there exists a constant M_2' > 0 such that
(4.24)  ‖s_k‖ = ‖x_{k+1} − x_k‖ ≤ ‖x_{k+1}‖ + ‖x_k‖ ≤ 2M_2'.
Combining (4.23) and (4.24), and noting that ‖g_j‖ cos θ_j = η_j, we get for all k ≥ 0
∏_{j=0}^{k} η_j ≥ [(1 − ε_2) c_3 c_2²/((Tr(B_0) + M_1) max{2M_2', 1, (2M_2')^{1−p}})]^{k+1} = ε_0^{k+1}.
The proof is complete. □


Now we establish the global convergence theorem for Algorithm 2.
Theorem 4.1. Let Assumption A hold and the sequence {x_k} be generated by Algorithm 2. Then we have
(4.25)  lim inf_{k→∞} ‖g_k‖ = 0.

Proof. By Lemma 4.3 and (2.20), we get
f_{k+1} ≤ f(x_{l(k)}) − ε_1 ‖s_k‖ η_k
(4.26)  ≤ f(x_{l(k)}) − ε_1 b_0 min{η_k², η_k^{(2−p)/(1−p)}}.
Let t_k = ε_1 b_0 min{η_k², η_k^{(2−p)/(1−p)}}. By Lemma 4.5, we have
Σ_{k=0}^{∞} min_{0 ≤ j ≤ n(k)} min{η²_{k+n(k)−j}, η^{(2−p)/(1−p)}_{k+n(k)−j}} < +∞,
and hence
Σ_{q=1}^{∞} min_{0 ≤ j ≤ n(k)} min{η²_{(n(k)+1)q+n(k)−j}, η^{(2−p)/(1−p)}_{(n(k)+1)q+n(k)−j}} < +∞.

Define the sequence {p(q)} as follows:
min{η²_{p(q)}, η^{(2−p)/(1−p)}_{p(q)}} = min_{0 ≤ j ≤ n(k)} min{H_{1j}(q), H_{2j}(q)},
where H_{1j}(q) = η²_{(n(k)+1)q+n(k)−j}, H_{2j}(q) = η^{(2−p)/(1−p)}_{(n(k)+1)q+n(k)−j}, and q(n(k) + 1) ≤ p(q) ≤ (q + 1)n(k) + q. Therefore
p(1) < p(2) < p(3) < · · · < p(q − 1) < p(q) < · · · ,
lim_{q→∞} min{η²_{p(q)}, η^{(2−p)/(1−p)}_{p(q)}} = 0,
(4.27)  lim_{q→∞} η_{p(q)} = 0,

which means that
lim_{k∈K} η_k = 0,
where K is a subset of N = {1, 2, 3, . . .}. Since x_k ∈ Ω and Ω is bounded, we can assume that there exists a constant b_3 > 0 such that ‖g_k‖ ≤ b_3. Then we get
(4.28)  η_k = −g_k^T d_k/‖d_k‖ ≤ ‖g_k‖ ≤ b_3.
We prove our theorem by contradiction. Assume that
lim inf_{k→∞} ‖g_k‖ > 0,
so that there exists a constant c_2 > 0 such that
‖g_k‖ ≥ c_2, k = 0, 1, 2, . . . .
By Lemma 4.7, we know that there exists a constant ε_0 satisfying
(4.29)  ∏_{j=0}^{k} η_j ≥ ε_0^{k+1}.
Combining (4.29) and (4.28), we deduce that for any integer k ≥ 1,
ε_0^{(k+1)n(k)+k} ≤ ∏_{j=1}^{(k+1)n(k)+k} η_j
                 = (1/η_0) ∏_{q=0}^{k} ∏_{j=(n(k)+1)q}^{(q+1)n(k)+q} η_j
                 = (1/η_0) ∏_{q=0}^{k} ∏_{0 ≤ j ≤ n(k)} η_{q(n(k)+1)+n(k)−j}
                 ≤ (1/η_0) ∏_{q=0}^{k} [η_{p(q)} b_3^{n(k)}]
(4.30)           = (1/η_0) b_3^{kn(k)} ∏_{q=0}^{k} η_{p(q)}.

Then we have
∏_{q=0}^{k} η_{p(q)} ≥ η_0 ε_0^{n(k)} [ε_0^{n(k)+1}/b_3^{n(k)}]^k ≥ [ε_0^{n(k)+1}/b_3^{n(k)}]^k min{1, η_0 ε_0^{n(k)}}.
Using Lemma 4.6, we have
(4.31)  lim sup_{q→∞} η_{p(q)} > 0,
which contradicts (4.27). Therefore, we obtain
lim inf_{k→∞} ‖g_k‖ = 0.
The proof is complete. □
Similar to Algorithm 2, it is not difficult to get the global convergence of
Algorithm 22. Here, we only state it as follows but omit the proof.
Theorem 4.2. Let Assumption A hold and the sequence {xk } be generated by
Algorithm 22. Then we have (4.25).

5. Numerical results
In this section, we report some numerical results on the test problems from [30] with the given initial points. All codes were written in MATLAB 7.0 and run on a PC with a 2.60GHz CPU, 256MB of memory, and the Windows XP operating system. The parameters are chosen as: σ_1 = 0.1, σ_2 = 0.9, ε = 10⁻⁵, ε_1 = 0.1, ε_2 = 0.01, p = 5, H = 8, m_1 = 5, and the initial matrix B_0 = I is the unit matrix.
The following Himmelblau stopping rule is used [47]: If |f(x_k)| > e_1, let stop1 = |f(x_k) − f(x_{k+1})|/|f(x_k)|; otherwise, let stop1 = |f(x_k) − f(x_{k+1})|.
For each problem, if ‖g(x)‖ < ε or stop1 < e_2 is satisfied, the program is stopped, where e_1 = e_2 = 10⁻⁵.
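For illustration, this stopping test can be coded as follows (a sketch; the function name is ours, and fk, fk1, gk denote f(x_k), f(x_{k+1}), and g(x_k)):

function done = himmelblau_stop(fk, fk1, gk, e1, e2, epsilon)
% Himmelblau-type stopping test: relative (or absolute) change in f,
% combined with the gradient norm test ||g(x)|| < epsilon.
if abs(fk) > e1
    stop1 = abs(fk - fk1) / abs(fk);
else
    stop1 = abs(fk - fk1);
end
done = (norm(gk) < epsilon) || (stop1 < e2);
end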
Since the line search cannot always ensure the descent condition d_k^T g_k < 0, an uphill search direction may occur in the numerical experiments, in which case the line search rule may fail. In order to avoid this, the stepsize α_k is accepted if the number of trial steps in the line search exceeds twenty-five. We also stop the program if the number of iterations exceeds one thousand; the corresponding method is then considered to have failed.
In Figures 1-3, "BFGS-WP-Ak1" and "BFGS-WP-Ak2" stand for the modified BFGS formula (2.8) with the WWP rule and the modified BFGS formula (2.16) with the WWP rule, respectively. "L-BFGS-A1" and "L-BFGS-A11" stand for Algorithm 1 and Algorithm 11, respectively. The detailed numerical results are listed on the web site
http://210.36.16.53:8018/publication.asp?id=33331.

Dolan and Moré [13] gave a new tool to analyze the efficiency of algorithms. They introduced the notion of a performance profile as a means to evaluate and compare the performance of a set of solvers S on a test set P. Assuming that there are n_s solvers and n_p problems, for each problem p and solver s they defined t_{p,s} = computing time (the number of function evaluations, or some other measure) required to solve problem p by solver s.
Requiring a baseline for comparisons, they compared the performance on problem p by solver s with the best performance by any solver on this problem; that is, they used the performance ratio
r_{p,s} = t_{p,s}/min{t_{p,s} : s ∈ S}.
Suppose that a parameter r_M ≥ r_{p,s} for all p, s is chosen, and r_{p,s} = r_M if and only if solver s does not solve problem p.
The performance of solver s on any given problem might be of interest, but we would like to obtain an overall assessment of the performance of the solver, so they defined
ρ_s(t) = (1/n_p) size{p ∈ P : r_{p,s} ≤ t};
thus ρ_s(t) is the probability for solver s ∈ S that the performance ratio r_{p,s} is within a factor t ∈ R of the best possible ratio. The function ρ_s is the (cumulative) distribution function for the performance ratio. The performance profile ρ_s : R → [0, 1] for a solver is a nondecreasing, piecewise constant function, continuous from the right at each breakpoint. The value of ρ_s(1) is the probability that the solver will win over the rest of the solvers.
According to the above rules, we know that a solver whose performance profile plot lies toward the top right will win over the rest of the solvers.
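The profile can be computed directly from a table T of measurements, where T(p,s) is the cost (NI, NFG, or Time) of solver s on problem p and failures are recorded as Inf. The following sketch (our illustration, not the authors' script) evaluates ρ_s(t) on a grid; plotting rho against tgrid reproduces curves of the kind shown in Figures 1-3.

function rho = performance_profile(T, tgrid)
% T(p,s): cost of solver s on problem p (use Inf for failures).
% rho(i,s): fraction of problems with performance ratio r(p,s) <= tgrid(i).
[np, ns] = size(T);
best = min(T, [], 2);                  % best cost per problem
r = T ./ repmat(best, 1, ns);          % performance ratios r(p,s)
rho = zeros(numel(tgrid), ns);
for i = 1:numel(tgrid)
    rho(i, :) = sum(r <= tgrid(i), 1) / np;
end
end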
Figures 1, 2, and 3 show the performance of these methods relative to NI, NFG, and Time, respectively, where NI denotes the total number of iterations, NFG denotes the total number of function and gradient evaluations, counted as NT = NF + 5NG (see [6, 18]), and Time denotes the CPU time that these methods spent. From these three figures it is clear that the L-BFGS-A11 method has the most wins (has the highest probability of being the optimal solver).
Figure 1 shows that L-BFGS-A11 and L-BFGS-A1 outperform BFGS-WP-Ak1 and BFGS-WP-Ak2 on about 5% and 8% of the test problems, respectively. The L-BFGS-A11 method is predominant among the other three methods for t ≤ 5.
[Figure 1. Performance profiles of these methods (NI). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

Moreover, L-BFGS-A11 and L-BFGS-A1 solve 100% of the test problems, while the BFGS-WP-Ak1 and BFGS-WP-Ak2 methods solve about 95% and 92% of the test problems successfully, respectively.
Figure 2 shows that L-BFGS-A11 and L-BFGS-A1 are superior to BFGS-WP-Ak1 and BFGS-WP-Ak2 on about 15% of the test problems. The L-BFGS-A11 and L-BFGS-A1 methods solve 100% of the test problems successfully at t ≈ 4.2 and t ≈ 7.2, respectively. The BFGS-WP-Ak1 and BFGS-WP-Ak2 methods solve about 85% of the test problems successfully.
Figure 3 shows that L-BFGS-A11 outperforms the other three methods. The
L-BFGS-A1 method and the BFGS-WP-Ak1 method solve about 95% and 91%
of the test problems, respectively, and the BFGS-WP-Ak2 solves about 88% of
the test problems successfully.
In summary, the presented numerical results reveal that Algorithm 1 and Algorithm 11, compared with the other two methods with the WWP line search and BFGS update, have potential advantages for these problems.

[Figure 2. Performance profiles of these methods (NFG). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

[Figure 3. Performance profiles of these methods (Time). Vertical axis: proportion of problems with r_{p,s} ≤ t; horizontal axis: t. Curves: L-BFGS-A1, L-BFGS-A11, BFGS-WP-Ak1, BFGS-WP-Ak2.]

6. Conclusion
This paper gives two modified L-BFGS methods with a nonmonotone line search technique for solving unconstrained optimization problems; the methods use function value information at the current and next iterates. The global convergence for uniformly convex functions is established. The numerical results show that the given methods are competitive with the standard BFGS methods on the test problems.
For further research, we should study the performance of the new algorithms under different stopping rules and in different testing environments (such as [15]). Moreover, more numerical experiments on large practical problems should be done in the future.

References
[1] C. G. Broyden, J. E. Dennis Jr, and J. J. Moré, On the local and superlinear convergence
of quasi-Newton methods, J. Inst. Math. Appl. 12 (1973), 223–245.
[2] R. H. Byrd and J. Nocedal, A tool for the analysis of quasi-Newton methods with appli-
cation to unconstrained minimization, SIAM J. Numer. Anal. 26 (1989), no. 3, 727–739.
[3] R. H. Byrd, J. Nocedal, and R. B. Schnabel, Representations of quasi-Newton matrices
and their use in limited memory methods, Math. Programming 63 (1994), no. 2, Ser.
A, 129–156.
[4] R. Byrd, J. Nocedal, and Y. Yuan, Global convergence of a class of quasi-Newton meth-
ods on convex problems, SIAM J. Numer. Anal. 24 (1987), no. 5, 1171–1190.
[5] Y. Dai, Convergence properties of the BFGS algorithm, SIAM J. Optim. 13 (2002), no.
3, 693–701.
[6] Y. Dai and Q. Ni, Testing different conjugate gradient methods for large-scale uncon-
strained optimization, J. Comput. Math. 21 (2003), no. 3, 311–320.
[7] W. C. Davidon, Variable metric methods for minimization, A. E. C. Research and
Development Report ANL-599, 1959.
[8] R. Dembo and T. Steihaug, Truncated Newton algorithms for large-scale unconstrained
optimization, Math. Programming 26 (1983), no. 2, 190–212.
[9] J. E. Dennis Jr. and J. J. Moré, Quasi-Newton methods, motivation and theory, SIAM
Rev. 19 (1977), no. 1, 46–89.
[10] , A characterization of superlinear convergence and its application to quasi-
Newton methods, Math. Comp. 28 (1974), 549–560.
[11] J. E. Dennis Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations, Prentice Hall Series in Computational Mathematics. Prentice
Hall, Inc., Englewood Cliffs, NJ, 1983.
[12] L. C. W. Dixon, Variable metric algorithms: necessary and sufficient conditions for
identical behavior of nonquadratic functions, J. Optimization Theory Appl. 10 (1972),
34–40.
[13] E. D. Dolan and J. J. Moré, Benchmarking optimization software with performance
profiles, Math. Program. 91 (2002), no. 2, Ser. A, 201–213.
[14] R. Fletcher, Practical Methods of Optimization, Second edition. A Wiley-Interscience
Publication. John Wiley & Sons, Ltd., Chichester, 1987.
[15] N. I. M. Gould, D. Orban, and Ph. L. Toint, CUTEr (and SifDec), a constrained and unconstrained testing environment, revisited, ACM Transactions on Mathematical Software 29 (2003), 373–394.
[16] A. Griewank, The global convergence of partitioned BFGS on problems with convex
decompositions and Lipschitzian gradients, Math. Programming 50 (1991), no. 2, (Ser.
A), 141–175.
[17] A. Griewank and Ph. L. Toint, Local convergence analysis for partitioned quasi-Newton
updates, Numer. Math. 39 (1982), no. 3, 429–448.
[18] L. Grippo, F. Lampariello, and S. Lucidi, A nonmonotone line search technique for Newton's method, SIAM J. Numer. Anal. 23 (1986), no. 4, 707–716.
[19] , A truncated Newton method with nonmonotone line search for unconstrained
optimization, J. Optim. Theory Appl. 60 (1989), no. 3, 401–419.
[20] , A class of nonmonotone stabilization methods in unconstrained optimization,
Numer. Math. 59 (1991), no. 8, 779–805.
[21] J. Y. Han and G. H. Liu, Global convergence analysis of a new nonmonotone BFGS
algorithm on convex objective functions, Comput. Optim. Appl. 7 (1997), no. 3, 277–289.
[22] D. Li and M. Fukushima, A modified BFGS method and its global convergence in non-
convex minimization, J. Comput. Appl. Math. 129 (2001), no. 1-2, 15–35.
[23] , On the global convergence of the BFGS method for nonconvex unconstrained
optimization problems, SIAM J. Optim. 11 (2001), no. 4, 1054–1064.
[24] G. Li, C. Tang, and Z. Wei, New conjugacy condition and related new conjugate gradient
methods for unconstrained optimization, J. Comput. Appl. Math. 202 (2007), no. 2,
523–539.
[25] G. H. Liu and J. Y. Han, Notes on the general form of stepsize selection, OR and
Decision Making I (1992), 619–624.
[26] , Global convergence Analysis of the variable metric algorithm with a generalized
Wolf linesearch, Technical Report, Institute of Applied Mathematics, Academia Sinica,
Beijing, China, no. 029, 1993.
[27] G. H. Liu, J. Y. Han, and D. F. Sun, Global convergence of the BFGS algorithm with
nonmonotone linesearch, Optimization 34 (1995), no. 2, 147–159.
[28] G. H. Liu and J. M. Peng, The convergence properties of a nonmonotonic algorithm, J.
Comput. Math. 1 (1992), 65–71.
[29] W. F. Mascarenhas, The BFGS method with exact line searches fails for non-convex
objective functions, Math. Program. 99 (2004), no. 1, Ser. A, 49–61.
[30] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981), no. 1, 17–41.
[31] S. G. Nash, A survey of truncated-Newton methods, Numerical analysis 2000, Vol. IV,
Optimization and nonlinear equations. J. Comput. Appl. Math. 124 (2000), no. 1-2,
45–59.
[32] M. J. D. Powell, On the convergence of the variable metric algorithm, J. Inst. Math.
Appl. 7 (1971), 21–36.
[33] , Some global convergence properties of a variable metric algorithm for min-
imization without exact line searches, Nonlinear programming (Proc. Sympos., New
York, 1975), pp. 53–72. SIAM-AMS Proc., Vol. IX, Amer. Math. Soc., Providence, R.
I., 1976.
[34] , A new algorithm for unconstrained optimization, 1970 Nonlinear Programming
(Proc. Sympos., Univ. of Wisconsin, Madison, Wis., 1970) pp. 31–65 Academic Press,
New York.
[35] M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained
minimization problem, SIAM J. Optim. 7 (1997), no. 1, 26–33.
[36] J. Schropp, A note on minimization problems and multistep methods, Numer. Math. 78
(1997), no. 1, 87–101.
[37] , One-step and multistep procedures for constrained minimization problems, IMA
J. Numer. Anal. 20 (2000), no. 1, 135–152.
[38] Ph. L. Toint, Global convergence of the partitioned BFGS algorithm for convex partially
separable optimization, Math. Programming 36 (1986), no. 3, 290–306.
[39] D. J. van Wyk, Differential optimization techniques, Appl. Math. Modelling 8 (1984),
no. 6, 419–424.
[40] M. N. Vrahatis, G. S. Androulakis, J. N. Lambrinos, and G. D. Magolas, A class of gra-
dient unconstrained minimization algorithms with adaptive stepsize, J. Comput. Appl.
Math. 114 (2000), no. 2, 367–386.
[41] Z. Wei, G. Li, and L. Qi, New quasi-Newton methods for unconstrained optimization
problems, Appl. Math. Comput. 175 (2006), no. 2, 1156–1188.
[42] Z. Wei, G. Yu, G. Yuan, and Z. Lian, The superlinear convergence of a modified BFGS-
type method for unconstrained optimization, Comput. Optim. Appl. 29 (2004), no. 3,
315–332.
[43] G. L. Yuan, Modified nonlinear conjugate gradient methods with sufficient descent prop-
erty for large-scale optimization problems, Optim. Lett. 3 (2009), no. 1, 11–21.
[44] G. L. Yuan and X. W. Lu, A new line search method with trust region for unconstrained
optimization, Comm. Appl. Nonlinear Anal. 15 (2008), no. 1, 35–49.
[45] , A modified PRP conjugate gradient method, Ann. Oper. Res. 166 (2009), 73–90.
[46] G. L. Yuan, X. Lu, and Z. Wei, A conjugate gradient method with descent direction for
unconstrained optimization, J. Comput. Appl. Math. 233 (2009), no. 2, 519–530.
[47] Y. Yuan and W. Sun, Theory and Methods of Optimization, Science Press of China,
1999.
[48] G. L. Yuan and Z. X. Wei, New line search methods for unconstrained optimization, J.
Korean Statist. Soc. 38 (2009), no. 1, 29–39.
[49] , The superlinear convergence analysis of a nonmonotone BFGS algorithm on
convex objective functions, Acta Math. Sin. (Engl. Ser.) 24 (2008), no. 1, 35–42.
[50] , Convergence analysis of a modified BFGS method on convex minimizations,
Comput. Optim. Appl. doi: 10.1007/s10589-008-9219-0.
[51] J. Z. Zhang, N. Y. Deng, and L. H. Chen, New quasi-Newton equation and related
methods for unconstrained optimization, J. Optim. Theory Appl. 102 (1999), no. 1,
147–167.

Gonglin Yuan
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]

Zengxin Wei
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]

Yanlin Wu
College of Mathematics and Information Science
Guangxi University
Nanning, Guangxi, 530004, P. R. China
E-mail address: [email protected]
