

SNOPT: AN SQP ALGORITHM FOR LARGE-SCALE CONSTRAINED OPTIMIZATION

PHILIP E. GILL†, WALTER MURRAY‡, AND MICHAEL A. SAUNDERS‡
Abstract. Sequential quadratic programming (SQP) methods have proved highly effective for solving constrained optimization problems with smooth nonlinear functions in the objective and constraints. Here we consider problems with general inequality constraints (linear and nonlinear). We assume that first derivatives are available, and that the constraint gradients are sparse.

We discuss an SQP algorithm that uses a smooth augmented Lagrangian merit function and makes explicit provision for infeasibility in the original problem and the QP subproblems. SNOPT is a particular implementation that makes use of a semidefinite QP solver. It is based on a limited-memory quasi-Newton approximation to the Hessian of the Lagrangian, and uses a reduced-Hessian algorithm (SQOPT) for solving the QP subproblems. It is designed for problems with many thousands of constraints and variables but a moderate number of degrees of freedom (say, up to 2000). Numerical results are given for most problems in the CUTE test collection (about 800 smooth problems) and for a number of other applications, including trajectory optimization in the aerospace industry.

As much as possible, we isolate the SQP algorithm from the method used to solve the QP subproblems. In this way SNOPT may be extended to allow for an arbitrary number of active constraints or degrees of freedom, given an appropriate QP solver.

Key words. large-scale optimization, nonlinear programming, nonlinear inequality constraints, sequential quadratic programming, quasi-Newton methods, limited-memory methods

AMS subject classifications. 49J20, 49J15, 49M37, 49D37, 65F05, 65K05, 90C30
1. Introduction. We present a sequential quadratic programming method for solving large-scale optimization problems involving linear and nonlinear inequality constraints. SQP methods have proved reliable and efficient for many such problems. For example, under mild conditions the general-purpose solvers NLPQL [48] and NPSOL [25, 28] typically find a (local) optimum from an arbitrary starting point, and they require relatively few evaluations of the problem functions and gradients compared to other solvers.
1.1. The optimization problem. The algorithm we describe applies to constrained optimization problems of the form

$$
\mathrm{NP} \qquad
\begin{array}{ll}
\displaystyle \min_{x} & f(x) \\[2pt]
\text{subject to} & l \le \begin{pmatrix} x \\ c(x) \\ Ax \end{pmatrix} \le u,
\end{array}
$$

where $x \in \mathbb{R}^n$, $f(x)$ is a linear or nonlinear objective function, $c(x)$ is a vector of nonlinear constraint functions $c_i(x)$ with sparse derivatives, $A$ is a sparse matrix, and $l$ and $u$ are given bounds.

∗ This paper may be referenced as Report NA 97-2, Dept of Mathematics, University of California, San Diego, and Report SOL 97-3, Dept of EESOR, Stanford University. This research was partially supported by National Science Foundation grants DMI-9204208, DMI-9204547 and DMI-9424639, and Office of Naval Research grants N00014-90-J-1242 and N00014-96-1-0274.
† Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112 ([email protected]).
‡ Department of EESOR, Stanford University, Stanford, CA 94305-4023 ([email protected], [email protected]).

We assume that the nonlinear functions are smooth and that their first derivatives are available (and possibly expensive to evaluate). In the present implementation we assume that the number of active constraints at a solution is reasonably close to n. In other words, the number of degrees of freedom is not too large (say, less than 1000).

Important examples are control problems such as those arising in optimal trajectory calculations. For several years, the optimal trajectory system OTIS (Hargraves and Paris [32]) has been applied successfully within the aerospace industry, using NPSOL to solve the associated optimization problems. Although NPSOL has solved examples with over a thousand variables and constraints, it was not designed for large problems with sparse constraint derivatives. (The Jacobian of $c(x)$ is treated as a dense matrix.) Our aim here is to describe an SQP method that has the favorable theoretical properties of the NPSOL algorithm, but is suitable for large sparse problems such as those arising in trajectory calculations. The implementation is called SNOPT (Sparse Nonlinear Optimizer).
1.2. Infeasible constraints. SNOPT makes explicit allowance for infeasible constraints. Infeasible linear constraints are detected first by solving a problem of the form

$$
\mathrm{FLP} \qquad
\begin{array}{ll}
\displaystyle \min_{x,v,w} & e^T (v + w) \\[2pt]
\text{subject to} & l \le \begin{pmatrix} x \\ Ax - v + w \end{pmatrix} \le u, \quad v \ge 0, \; w \ge 0,
\end{array}
$$

where $e$ is a vector of ones. This is equivalent to minimizing the one-norm of the general linear constraint violations subject to the simple bounds. (In the linear programming literature, the approach is often called elastic programming. Other algorithms based on minimizing one-norms of infeasibilities are given by Conn [13] and Bartels [1].)
If the linear constraints are infeasible ($v \ne 0$ or $w \ne 0$), SNOPT terminates without computing the nonlinear functions. Otherwise, all subsequent iterates satisfy the linear constraints. (As with NPSOL, such a strategy allows linear constraints to be used to define a region in which $f$ and $c$ can be safely evaluated.)
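To make the elastic construction concrete, here is a minimal sketch of FLP, assuming scipy.optimize.linprog as a stand-in LP solver (the function solve_flp and its interface are ours, not SNOPT's; SQOPT handles elastic bounds implicitly rather than through explicit v, w columns):

```python
import numpy as np
from scipy.optimize import linprog

def solve_flp(A, l_x, u_x, l_A, u_A):
    """Elastic phase-1 LP (FLP): minimize e'(v + w) subject to
    l_x <= x <= u_x, l_A <= A x - v + w <= u_A, v >= 0, w >= 0."""
    m, n = A.shape
    cost = np.concatenate([np.zeros(n), np.ones(2 * m)])  # z = [x; v; w]
    G = np.hstack([A, -np.eye(m), np.eye(m)])             # rows give A x - v + w
    A_ub = np.vstack([G, -G])                             # two-sided range as <=
    b_ub = np.concatenate([u_A, -np.asarray(l_A, dtype=float)])
    keep = np.isfinite(b_ub)                              # drop infinite sides
    bounds = list(zip(l_x, u_x)) + [(0, None)] * (2 * m)
    res = linprog(cost, A_ub=A_ub[keep], b_ub=b_ub[keep], bounds=bounds)
    return res.x[:n], res.fun < 1e-8     # objective is zero iff feasible

# x1 + x2 <= 1 and x1 + x2 >= 3 cannot both hold: FLP reports infeasibility.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
x, feasible = solve_flp(A, [0, 0], [10, 10], [-np.inf, 3.0], [1.0, np.inf])
print(feasible)  # False
```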
SNOPT then proceeds to solve NP as given, using QP subproblems based on linearizations of the nonlinear constraints. If a QP subproblem proves to be infeasible or unbounded (or if the Lagrange multiplier estimates for the nonlinear constraints become large), SNOPT enters "nonlinear elastic" mode and solves the problem

$$
\mathrm{NP}(\gamma) \qquad
\begin{array}{ll}
\displaystyle \min_{x,v,w} & f(x) + \gamma e^T (v + w) \\[2pt]
\text{subject to} & l \le \begin{pmatrix} x \\ c(x) - v + w \\ Ax \end{pmatrix} \le u, \quad v \ge 0, \; w \ge 0,
\end{array}
$$

where $\gamma$ is a nonnegative penalty parameter and $f(x) + \gamma e^T(v + w)$ is called a composite objective. If NP has a feasible solution and $\gamma$ is sufficiently large, the solutions to NP and NP($\gamma$) are identical. If NP has no feasible solution, NP($\gamma$) will tend to determine a "good" infeasible point if $\gamma$ is again sufficiently large. (If $\gamma$ were infinite, the nonlinear constraint violations would be minimized subject to the linear constraints and bounds.) A similar $\ell_1$ formulation of NP is fundamental to the S$\ell_1$QP algorithm of Fletcher [18]. See also Conn [12].
1.3. Other work on large-scale SQP. There has been considerable interest elsewhere in extending SQP methods to the large-scale case. Some of this work has focused on problems with nonlinear equality constraints. The method of Lalee, Nocedal and Plantenga [35], somewhat related to the trust-region method of Byrd and Omojokun [42], uses either the exact Lagrangian Hessian or a limited-memory quasi-Newton approximation defined by the method of Zhu et al. [52]. The method of Biegler, Nocedal and Schmidt [3] is in the class of reduced-Hessian methods, which maintain a dense approximation to the reduced Hessian, using quasi-Newton updates.

For large problems with general inequality constraints as in Problem NP, SQP methods have been proposed by Eldersveld [17], Tjoa and Biegler [50], and Betts and Frank [2]. The first two approaches are also reduced-Hessian methods. In [17], a full but structured Hessian approximation is formed from the reduced Hessian. The implementation LSSQP solves the same class of problems as SNOPT. In [50], the QP subproblems are solved by eliminating variables using the (linearized) equality constraints. The remaining variables are optimized using a dense QP algorithm. Bounds on the eliminated variables become dense constraints in the reduced QP. The method is efficient for problems whose constraints are mainly nonlinear equalities, with few bounds on the variables. In contrast, the method of Betts and Frank uses the exact Lagrangian Hessian or a finite-difference approximation, and since the QP solver works with sparse KKT factorizations (see §7), the method is not restricted to problems with few degrees of freedom.
1.4. Other large-scale methods. Two existing packages MINOS [39, 40, 41]
and CONOPT [16] are designed for large problems with a modest number of degrees of
freedom. MINOS uses a projected Lagrangian or sequential linearly constrained (SLC)
method, whose subproblems require frequent evaluation of the problem functions.
CONOPT uses a generalized reduced gradient (GRG) method, which maintains near-
feasibility with respect to the nonlinear constraints, again at the expense of many
function evaluations. SNOPT is likely to outperform MINOS and CONOPT when the
functions (and their derivatives) are expensive to evaluate. Relative to MINOS, an
added advantage is the existence of a merit function to ensure global convergence.
This is especially important when the constraints are highly nonlinear.
LANCELOT Release A [14] is another widely used package in the area of large-
scale constrained optimization. It uses a sequential augmented Lagrangian (SAL)
method. All constraints other than simple bounds are included in an augmented La-
grangian function, which is minimized subject to the bounds. In general, LANCELOT
is recommended for large problems with many degrees of freedom. It complements
SNOPT and the other methods discussed above. A comparison between LANCELOT
and MINOS has been made in [6, 7].
2. The SQP iteration. Here we discuss the main features of an SQP method for solving a generic nonlinear program. All features are readily specialized to the more general constraints in Problem NP.

2.1. The generic problem. In this section we take the problem to be

$$
\mathrm{GNP} \qquad
\begin{array}{ll}
\displaystyle \min_{x} & f(x) \\[2pt]
\text{subject to} & c(x) \ge 0,
\end{array}
$$

where $x \in \mathbb{R}^n$, $c \in \mathbb{R}^m$, and the functions $f(x)$ and $c_i(x)$ have continuous second derivatives. The gradient of $f$ is denoted by the vector $g(x)$, and the gradients of each element of $c$ form the rows of the Jacobian matrix $J(x)$.

We assume that a Karush-Kuhn-Tucker (KKT) point $(x^*, \lambda^*)$ exists for GNP, satisfying the first-order optimality conditions:

$$
(2.1) \qquad c(x^*) \ge 0, \quad \lambda^* \ge 0, \quad c(x^*)^T \lambda^* = 0, \quad J(x^*)^T \lambda^* = g(x^*).
$$
2.2. Structure of the SQP method. An SQP method obtains search directions from a sequence of quadratic programming subproblems. Each QP subproblem minimizes a quadratic model of a certain Lagrangian function subject to linearized constraints. Some merit function is reduced along each search direction to ensure convergence from any starting point.

The basic structure of an SQP method involves major and minor iterations. The major iterations generate a sequence of iterates $(x_k, \lambda_k)$ that converge to $(x^*, \lambda^*)$. At each iterate a QP subproblem is used to generate a search direction towards the next iterate $(x_{k+1}, \lambda_{k+1})$. Solving such a subproblem is itself an iterative procedure, with the minor iterations of an SQP method being the iterations of the QP method.

For an overview of SQP methods, see, for example, Fletcher [19], Gill, Murray and Wright [29], Murray [36], and Powell [46].
2.3. The modified Lagrangian. Let $x_k$ and $\lambda_k$ be estimates of $x^*$ and $\lambda^*$. For several reasons, our SQP algorithm is based on the modified Lagrangian associated with GNP, namely

$$
(2.2) \qquad L(x, x_k, \lambda_k) = f(x) - \lambda_k^T d_L(x, x_k),
$$

which is defined in terms of the constraint linearization and the departure from linearity:

$$
c_L(x, x_k) = c(x_k) + J(x_k)(x - x_k), \qquad
d_L(x, x_k) = c(x) - c_L(x, x_k);
$$

see Robinson [47] and Van der Hoek [51]. The first and second derivatives of the modified Lagrangian with respect to $x$ are

$$
\nabla L(x, x_k, \lambda_k) = g(x) - (J(x) - J(x_k))^T \lambda_k, \qquad
\nabla^2 L(x, x_k, \lambda_k) = \nabla^2 f(x) - \sum_i (\lambda_k)_i \nabla^2 c_i(x).
$$

Observe that $\nabla^2 L$ is independent of $x_k$ (and is the same as the Hessian of the conventional Lagrangian). At $x = x_k$, the modified Lagrangian has the same function and gradient values as the objective:

$$
L(x_k, x_k, \lambda_k) = f(x_k), \qquad \nabla L(x_k, x_k, \lambda_k) = g(x_k).
$$
2.4. The QP subproblem. Let $L_Q$ be the quadratic approximation to $L$ at $x = x_k$:

$$
L_Q(x, x_k, \lambda_k) = f(x_k) + g(x_k)^T (x - x_k) + \tfrac{1}{2} (x - x_k)^T \nabla^2 L(x_k, x_k, \lambda_k)(x - x_k).
$$

If $(x_k, \lambda_k) = (x^*, \lambda^*)$, optimality conditions for the quadratic program

$$
\mathrm{GQP} \qquad
\begin{array}{ll}
\displaystyle \min_{x} & L_Q(x, x_k, \lambda_k) \\[2pt]
\text{subject to} & \text{linearized constraints } c_L(x, x_k) \ge 0
\end{array}
$$

are identical to those for the original problem GNP. This suggests that if $H_k$ is an approximation to $\nabla^2 L$ at the point $(x_k, \lambda_k)$, an improved estimate of the solution may be found from $(\widehat{x}_k, \widehat{\lambda}_k)$, the solution of the following QP subproblem:

$$
\mathrm{GQP}_k \qquad
\begin{array}{ll}
\displaystyle \min_{x} & f(x_k) + g(x_k)^T (x - x_k) + \tfrac{1}{2} (x - x_k)^T H_k (x - x_k) \\[2pt]
\text{subject to} & c(x_k) + J(x_k)(x - x_k) \ge 0.
\end{array}
$$

Optimality conditions for GQP$_k$ may be written as

$$
\begin{aligned}
g(x_k) + H_k(\widehat{x}_k - x_k) &= J(x_k)^T \widehat{\lambda}_k, \qquad & \widehat{\lambda}_k \ge 0, \quad \widehat{s}_k &\ge 0, \\
c(x_k) + J(x_k)(\widehat{x}_k - x_k) &= \widehat{s}_k, & \widehat{\lambda}_k^T \widehat{s}_k &= 0,
\end{aligned}
$$

where $\widehat{s}_k$ is a vector of slack variables for the linearized constraints. In this form, $(\widehat{x}_k, \widehat{\lambda}_k, \widehat{s}_k)$ can be regarded as estimates of $(x^*, \lambda^*, s^*)$, where the nonnegative variables $s^*$ satisfy $c(x^*) - s^* = 0$. The vector $\widehat{s}_k$ is needed explicitly for the line search (see §2.7).
2.5. The working-set matrix $W_k$. The working set is an important quantity for both the major and the minor iterations. It is the current estimate of the set of constraints that are binding at a solution. More precisely, suppose that GQP$_k$ has just been solved. Although we try to regard the QP solver as a "black box", we normally expect it to return an independent set of constraints that are active at the QP solution. This is an optimal working set for subproblem GQP$_k$.

The same constraint indices define a working set for GNP (and for subproblem GQP$_{k+1}$). The corresponding gradients form the rows of the working-set matrix $W_k$, an $n_Y \times n$ full-rank submatrix of the Jacobian $J(x_k)$.
2.6. The null-space matrix $Z_k$. Let $Z_k$ be an $n \times n_Z$ full-rank matrix that spans the null space of $W_k$. (Thus, $n_Z = n - n_Y$ and $W_k Z_k = 0$.) The QP solver will often return $Z_k$ as part of some matrix factorization. For example, in NPSOL it is part of an orthogonal factorization of $W_k$, while in LSSQP [17] (and in the current SNOPT) it is defined from a sparse LU factorization of part of $W_k$. In any event, $Z_k$ is useful for theoretical discussions, and its column dimension has strong practical implications. Important quantities are the reduced Hessian $Z_k^T H_k Z_k$ and the reduced gradient $Z_k^T g$.
2.7. The merit function. Once the QP solution $(\widehat{x}_k, \widehat{\lambda}_k, \widehat{s}_k)$ has been determined, new estimates of the GNP solution are computed using a line search on the augmented Lagrangian merit function

$$
(2.3) \qquad \mathcal{M}(x, \lambda, s) = f(x) - \lambda^T \bigl(c(x) - s\bigr) + \tfrac{1}{2} \bigl(c(x) - s\bigr)^T D \bigl(c(x) - s\bigr),
$$

where $D$ is a diagonal matrix of penalty parameters. If $(x_k, \lambda_k, s_k)$ are the current estimates of $(x^*, \lambda^*, s^*)$, the line search determines a step length $\alpha_k$ ($0 < \alpha_k \le 1$) such that the new point

$$
(2.4) \qquad
\begin{pmatrix} x_{k+1} \\ \lambda_{k+1} \\ s_{k+1} \end{pmatrix}
=
\begin{pmatrix} x_k \\ \lambda_k \\ s_k \end{pmatrix}
+ \alpha_k
\begin{pmatrix} \widehat{x}_k - x_k \\ \widehat{\lambda}_k - \lambda_k \\ \widehat{s}_k - s_k \end{pmatrix}
$$

gives a sufficient decrease in the merit function (2.3). Let $\varphi_k(\alpha)$ denote the merit function computed at the point $(x_k + \alpha(\widehat{x}_k - x_k),\ \lambda_k + \alpha(\widehat{\lambda}_k - \lambda_k),\ s_k + \alpha(\widehat{s}_k - s_k))$; i.e., $\varphi_k(\alpha)$ defines $\mathcal{M}$ as a univariate function of the step length. Initially $D$ is zero (for $k = 0$). When necessary, the penalties in $D$ are increased by the minimum-norm perturbation that ensures sufficient descent for $\varphi_k(\alpha)$ [28]. (Note: As in NPSOL, $s_{k+1}$ in (2.4) is redefined to minimize the merit function as a function of $s$, prior to the solution of GQP$_{k+1}$. For more details, see [25, 17].)

In the line search, for some vector $b > 0$ the condition

$$
(2.5) \qquad c(x_k + \alpha_k p_k) \ge -b
$$

is enforced. (We use $b_i = \tau \max\{1, -c_i(x_0)\}$, where $\tau$ is a specified constant, e.g., $\tau = 10$.) This defines a region in which the objective is expected to be defined and bounded below. Murray and Prieto [38] show that under certain conditions, convergence can be assured if the line search enforces (2.5). If the objective is bounded below in $\mathbb{R}^n$ then $b$ may be any positive vector.

If $\alpha_k$ is essentially zero (because $\|p_k\|$ is very large), the objective is considered "unbounded" in the expanded region. Elastic mode is entered (or continued) as described in §4.3.
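As an illustration, a minimal sketch of the merit function (2.3) and its univariate restriction $\varphi_k(\alpha)$ follows (hypothetical helper code, not SNOPT's implementation; f and c are assumed to be callables and D is stored as a vector of penalties):

```python
import numpy as np

def merit(f, c, D, x, lam, s):
    """Augmented Lagrangian merit function (2.3); D holds the diagonal
    penalty parameters as a vector."""
    r = c(x) - s                                  # residual c(x) - s
    return f(x) - lam @ r + 0.5 * r @ (D * r)

def phi(f, c, D, xk, lamk, sk, xhat, lamhat, shat, alpha):
    """phi_k(alpha): the merit function along the search direction (2.4)."""
    return merit(f, c, D,
                 xk + alpha * (xhat - xk),
                 lamk + alpha * (lamhat - lamk),
                 sk + alpha * (shat - sk))
```

A backtracking search would then reduce $\alpha$ from 1 until $\varphi_k(\alpha)$ shows sufficient decrease while (2.5) holds.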
2.8. The approximate Hessian. As suggested by Powell [44], we maintain a positive-definite approximate Hessian $H_k$. On completion of the line search, let the change in $x$ and the gradient of the modified Lagrangian be

$$
\delta_k = x_{k+1} - x_k \qquad \text{and} \qquad y_k = \nabla L(x_{k+1}, x_k, \lambda) - \nabla L(x_k, x_k, \lambda),
$$

for some vector $\lambda$. An estimate of the curvature of the modified Lagrangian along $\delta_k$ is incorporated using the BFGS quasi-Newton update,

$$
H_{k+1} = H_k + \theta_k y_k y_k^T - \phi_k q_k q_k^T,
$$

where $q_k = H_k \delta_k$, $\theta_k = 1/y_k^T \delta_k$ and $\phi_k = 1/q_k^T \delta_k$. When $H_k$ is positive definite, $H_{k+1}$ is positive definite if and only if the approximate curvature $y_k^T \delta_k$ is positive. The consequences of a negative or small value of $y_k^T \delta_k$ are discussed in the next section.

There are several choices for $\lambda$, including the QP multipliers $\widehat{\lambda}_{k+1}$ and least-squares multipliers $\mu_k$ (see, e.g., [22]). Here we use the updated multipliers $\lambda_{k+1}$ from the line search, because they are responsive to short steps in the search and they are available at no cost. The definition of $L$ (2.2) yields

$$
y_k = \nabla L(x_{k+1}, x_k, \lambda_{k+1}) - \nabla L(x_k, x_k, \lambda_{k+1})
    = g(x_{k+1}) - (J(x_{k+1}) - J(x_k))^T \lambda_{k+1} - g(x_k).
$$
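In dense-matrix form the update is the two-term rank-one correction sketched below (an illustration only; the modification logic of §2.9 is reduced here to a simple skip):

```python
import numpy as np

def bfgs_update(H, delta, y):
    """BFGS update H + theta y y' - phi q q' with q = H delta,
    theta = 1/(y'delta), phi = 1/(q'delta).  Positive definiteness is
    preserved only when the curvature y'delta is positive, so
    insufficient curvature is handled by skipping (cf. Sec. 2.9)."""
    q = H @ delta
    yd, qd = y @ delta, q @ delta
    if yd <= 0.0:
        return H                       # caller should modify (delta, y) first
    return H + np.outer(y, y) / yd - np.outer(q, q) / qd
```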
2.9. Maintaining positive-definiteness. Since the Hessian of the modified Lagrangian need not be positive definite at a local minimizer, the approximate curvature $y_k^T \delta_k$ can be negative or very small at points arbitrarily close to $(x^*, \lambda^*)$. The curvature is considered not sufficiently positive if

$$
(2.6) \qquad y_k^T \delta_k < \sigma_k, \qquad \sigma_k = \alpha_k (1 - \eta)\, p_k^T H_k p_k,
$$

where $\eta$ is a preassigned constant ($0 < \eta < 1$) and $p_k$ is the search direction $\widehat{x}_k - x_k$ defined by the QP subproblem. In such cases, if there are nonlinear constraints, two attempts are made to modify the update: the first modifying $\delta_k$ and $y_k$, the second modifying only $y_k$. If neither modification provides sufficiently positive approximate curvature, no update is made.
First modification. First, we define a new point $z_k$ and evaluate the nonlinear functions there to obtain new values for $\delta_k$ and $y_k$:

$$
\delta_k = x_{k+1} - z_k, \qquad y_k = \nabla L(x_{k+1}, x_k, \lambda_{k+1}) - \nabla L(z_k, x_k, \lambda_{k+1}).
$$

We choose $z_k = x_k + \alpha_k(\bar{x}_k - x_k)$, where $\bar{x}_k$ is the first feasible iterate found for problem GQP$_k$ (see §4).

The purpose of this modification is to exploit the properties of the reduced Hessian at a local minimizer of GNP. With this choice of $z_k$, $\delta_k = x_{k+1} - z_k = \alpha_k p_N$, where $p_N$ is the vector $\widehat{x}_k - \bar{x}_k$. Then,

$$
y_k^T \delta_k = \alpha_k\, y_k^T p_N \approx \alpha_k^2\, p_N^T \nabla^2 L(x_k, x_k, \lambda_k)\, p_N,
$$

and $y_k^T \delta_k$ approximates the curvature along $p_N$. If $\bar{W}_k$, the final working set of problem GQP$_k$, is also the working set at $\bar{x}_k$, then $\bar{W}_k p_N = 0$ and it follows that $y_k^T \delta_k$ approximates the curvature for the reduced Hessian, which must be positive semidefinite at a minimizer of GNP.
The assumption that the QP working set does not change once $z_k$ is known is always justified for problems with equality constraints (see Byrd and Nocedal [11] for a similar scheme in this context). With inequality constraints, we observe that $\bar{W}_k p_N \approx 0$, particularly during later major iterations, when the working set has settled down.

This modification exploits the fact that SNOPT maintains feasibility with respect to any linear constraints in GNP. (Such a strategy allows linear constraints to be used to define a region in which $f$ and $c$ can be safely evaluated.) Although an additional function evaluation is required at $z_k$, we have observed that even when the Hessian of the Lagrangian has negative eigenvalues at a solution, the modification is rarely needed more than a few times if used in conjunction with the augmented Lagrangian modification discussed next.
Second modification. If $(x_k, \lambda_k)$ is not close to $(x^*, \lambda^*)$, the modified approximate curvature $y_k^T \delta_k$ may not be sufficiently positive and a second modification may be necessary. We choose $\Delta y_k$ so that $(y_k + \Delta y_k)^T \delta_k = \sigma_k$ (if possible), and redefine $y_k$ as $y_k + \Delta y_k$. This approach was first suggested by Powell [45], who proposed redefining $y_k$ as a linear combination of $y_k$ and $H_k \delta_k$.

To obtain $\Delta y_k$, we consider the augmented modified Lagrangian [40]:

$$
(2.7) \qquad L_A(x, x_k, \lambda_k) = f(x) - \lambda_k^T d_L(x, x_k) + \tfrac{1}{2}\, d_L(x, x_k)^T \Omega\, d_L(x, x_k),
$$

where $\Omega$ is a matrix of parameters to be determined: $\Omega = \mathrm{diag}(\omega_i)$, $\omega_i \ge 0$, $i = 1, \ldots, m$. The perturbation

$$
\Delta y_k = (J(x_{k+1}) - J(x_k))^T \Omega\, d_L(x_{k+1}, x_k)
$$

is equivalent to redefining the gradient difference as

$$
(2.8) \qquad y_k = \nabla L_A(x_{k+1}, x_k, \lambda_{k+1}) - \nabla L_A(x_k, x_k, \lambda_{k+1}).
$$

We choose the smallest (minimum two-norm) $\omega_i$'s that increase $y_k^T \delta_k$ to $\sigma_k$ (2.6). They are determined by the linearly constrained least-squares problem

$$
\mathrm{LSP} \qquad
\begin{array}{ll}
\displaystyle \min_{\omega} & \|\omega\|_2 \\[2pt]
\text{subject to} & a^T \omega = \beta, \quad \omega \ge 0,
\end{array}
$$

where $\beta = \sigma_k - y_k^T \delta_k$ and $a_i = v_i w_i$ ($i = 1, \ldots, m$), with $v = (J(x_{k+1}) - J(x_k))\,\delta_k$ and $w = d_L(x_{k+1}, x_k)$. The optimal $\omega$ can be computed analytically [25, 17]. If no solution exists, or if $\|\omega\|$ is very large, no update is made.
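As a reconstruction (ours, not necessarily the exact formula of [25, 17]): for $\beta > 0$ the KKT conditions of LSP give $\omega_i = \lambda a_i$ on components with $a_i > 0$ and $\omega_i = 0$ otherwise, with $\lambda = \beta / \sum_{a_i > 0} a_i^2$; in code:

```python
import numpy as np

def solve_lsp(a, beta):
    """Minimum two-norm omega >= 0 with a'omega = beta (beta > 0 assumed).
    Returns None when no positive a_i exists (LSP infeasible)."""
    pos = a > 0
    denom = float(np.sum(a[pos] ** 2))
    if denom == 0.0:
        return None                       # no solution: skip the update
    lam = beta / denom                    # multiplier of a'omega = beta
    return np.where(pos, lam * a, 0.0)    # zero on the active bounds

a = np.array([0.5, -0.2, 1.0])
omega = solve_lsp(a, beta=2.0)
print(omega, a @ omega)                   # a'omega = 2.0 as required
```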
The approach just described is related to the idea of updating an approximation of the Hessian of the augmented Lagrangian, as suggested by Han [31] and Tapia [49]. However, we emphasize that the second modification is not required in the neighborhood of a solution because as $x \to x^*$, $\nabla^2 L_A$ converges to $\nabla^2 L$ and the first modification will already have been successful.
2.10. Convergence tests. A point $(x, \lambda)$ is regarded as a satisfactory solution if it satisfies the first-order optimality conditions (2.1) to within certain tolerances. Let $\tau_P$ and $\tau_D$ be specified small positive constants, and define $\tau_x = \tau_P (1 + \|x\|)$, $\tau_\lambda = \tau_D (1 + \|\lambda\|)$. The SQP algorithm terminates if

$$
(2.9) \qquad c_i(x) \ge -\tau_x, \quad \lambda_i \ge -\tau_\lambda, \quad c_i(x)\,\lambda_i \le \tau_\lambda, \quad |d_j| \le \tau_\lambda,
$$

where $d = g(x) - J(x)^T \lambda$. These conditions cannot be satisfied if GNP is infeasible, but in that case the SQP algorithm will eventually enter elastic mode and satisfy analogous tests for the problem

$$
\mathrm{GNP}(\gamma) \qquad
\begin{array}{ll}
\displaystyle \min_{x,v} & f(x) + \gamma e^T v \\[2pt]
\text{subject to} & c(x) + v \ge 0, \quad v \ge 0,
\end{array}
$$

whose optimality conditions include

$$
0 \le \lambda_i \le \gamma, \qquad (c_i(x) + v_i)\,\lambda_i = 0, \qquad v_i(\gamma - \lambda_i) = 0.
$$

The fact that $\|\lambda\|_\infty \le \gamma$ at a solution of GNP($\gamma$) leads us to initiate elastic mode if $\|\lambda_k\|_\infty$ exceeds $\gamma$.
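A direct transcription of the tests (2.9) might read as follows (illustrative only; the tolerances correspond to the Major feasibility and Major optimality options of §6):

```python
import numpy as np

def converged(c, lam, g, J, x, tau_P=1e-6, tau_D=1e-6):
    """First-order tests (2.9): feasibility, multiplier sign,
    complementarity, and the dual residual d = g - J'lam."""
    tau_x = tau_P * (1.0 + np.linalg.norm(x))
    tau_l = tau_D * (1.0 + np.linalg.norm(lam))
    d = g - J.T @ lam
    return (np.all(c >= -tau_x) and np.all(lam >= -tau_l) and
            np.all(c * lam <= tau_l) and np.all(np.abs(d) <= tau_l))
```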
3. Large-scale Hessians. In the large-scale case, we cannot treat $H_k$ as an $n \times n$ dense matrix. Nor can we maintain dense triangular factors of a transformed Hessian $Q^T H_k Q = R^T R$ as in NPSOL. We discuss the alternatives implemented in SNOPT.

3.1. Linear variables. If only some of the variables occur nonlinearly in the objective and constraint functions, the Hessian of the Lagrangian has structure that can be exploited during the optimization. We assume that the nonlinear variables are the first $\bar{n}$ components of $x$. By induction, if $H_0$ is zero in its last $n - \bar{n}$ rows and columns, the last $n - \bar{n}$ components of the BFGS update vectors $y_k$ and $H_k \delta_k$ are zero for all $k$, and every $H_k$ has the form

$$
(3.1) \qquad H_k = \begin{pmatrix} \bar{H}_k & 0 \\ 0 & 0 \end{pmatrix},
$$

where $\bar{H}_k$ is $\bar{n} \times \bar{n}$. Simple modifications of the methods of §2.9 can be used to keep $\bar{H}_k$ positive definite. A QP subproblem with Hessian of this form is either unbounded, or has at least $n - \bar{n}$ constraints in the final working set. This implies that the reduced Hessian need never have dimension greater than $\bar{n}$.

In order to treat semidefinite Hessians such as (3.1), the QP solver must include an inertia-controlling working-set strategy, which ensures that the reduced Hessian has at most one zero eigenvalue. See §4.2.
3.2. Dense Hessians. The Hessian approximations $H_k$ are matrices of order $\bar{n}$, the number of nonlinear variables. If $\bar{n}$ is not too large, it is efficient to treat each $H_k$ as a dense matrix and apply the BFGS updates explicitly. The storage requirement is fixed, and the number of major iterations should prove to be moderate. (We can expect 1-step Q-superlinear convergence.)
3.3. Limited-memory Hessians. To treat problems where the number of nonlinear variables $\bar{n}$ is very large, we use a limited-memory procedure to update an initial Hessian approximation $H_r$ a limited number of times. The present implementation is quite simple and has benefits in the SQP context when the constraints are linear.

Initially, suppose $\bar{n} = n$. Let $\ell$ be preassigned (say $\ell = 20$), and let $r$ and $k$ denote two major iterations such that $r \le k \le r + \ell$. Up to $\ell$ updates to a positive-definite $H_r$ are accumulated to represent the Hessian as

$$
(3.2) \qquad H_k = H_r + \sum_{j=r}^{k-1} \bigl(\theta_j y_j y_j^T - \phi_j q_j q_j^T\bigr),
$$

where $q_j = H_j \delta_j$, $\theta_j = 1/y_j^T \delta_j$ and $\phi_j = 1/q_j^T \delta_j$. The quantities $(y_j, q_j, \theta_j, \phi_j)$ are stored for each $j$. During major iteration $k$, the QP solver accesses $H_k$ by requesting products of the form $H_k v$. These are computed with work proportional to $k - r$:

$$
H_k v = H_r v + \sum_{j=r}^{k-1} \bigl(\theta_j (y_j^T v)\, y_j - \phi_j (q_j^T v)\, q_j\bigr).
$$

On completion of iteration $k = r + \ell$, the diagonals of $H_k$ are computed from (3.2) and saved to form the next positive-definite $H_r$ (with $r = k + 1$). Storage is then "reset" by discarding the previous updates. (Similar schemes are suggested by Buckley and LeNir [9, 10] and Gilbert and Lemaréchal [20].)
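A sketch of this storage scheme, with $H_r$ kept as a diagonal matrix (the class and its names are ours, not SNOPT's):

```python
import numpy as np

class LimitedMemoryHessian:
    """Up to ell update pairs accumulated on a diagonal H_r, as in (3.2)."""

    def __init__(self, diag_Hr, ell=20):
        self.d = np.asarray(diag_Hr, dtype=float).copy()  # diag of H_r
        self.ell = ell
        self.pairs = []                  # stored (y_j, q_j, theta_j, phi_j)

    def matvec(self, v):
        """H_k v = H_r v + sum_j (theta_j (y_j'v) y_j - phi_j (q_j'v) q_j)."""
        hv = self.d * v
        for y, q, theta, phi in self.pairs:
            hv += theta * (y @ v) * y - phi * (q @ v) * q
        return hv

    def update(self, delta, y):
        q = self.matvec(delta)           # q_j = H_j delta_j
        self.pairs.append((y, q, 1.0 / (y @ delta), 1.0 / (q @ delta)))
        if len(self.pairs) == self.ell:  # reset: keep only diag(H_k)
            for yj, qj, th, ph in self.pairs:
                self.d += th * yj * yj - ph * qj * qj
            self.pairs = []
```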
If $\bar{n} < n$, $H_k$ has the form (3.1) and the same procedure is applied to $\bar{H}_k$. Note that the vectors $y_j$ and $q_j$ have length $\bar{n}$ (a benefit when $\bar{n} \ll n$). The modified Lagrangian $L_A$ (2.7) retains this property for the modified $y_k$ in (2.8).
4. The QP solver SQOPT. Since SNOPT solves nonlinear programs of the form NP, it requires solution of QP subproblems of the same form (with $f(x)$ replaced by a quadratic function and $c(x)$ replaced by its current linearization):

$$
\mathrm{QP}_k \qquad
\begin{array}{ll}
\displaystyle \min_{x} & f(x_k) + g(x_k)^T (x - x_k) + \tfrac{1}{2} (x - x_k)^T H_k (x - x_k) \\[2pt]
\text{subject to} & l \le \begin{pmatrix} x \\ c(x_k) + J(x_k)(x - x_k) \\ Ax \end{pmatrix} \le u.
\end{array}
$$

If a QP subproblem proves to be infeasible, we redefine QP$_k$ (and all subsequent subproblems) to correspond to the linearization of NP($\gamma$). SNOPT is then in elastic mode thereafter.

At present, QP$_k$ is solved by the package SQOPT [23], which employs a two-phase active-set algorithm and implements elastic programming implicitly when necessary. SQOPT can treat any of the variable and constraint bounds as elastic, but SNOPT uses this feature only for the constraint bounds in problem FLP (§1.2) and for the linearized nonlinear constraint bounds in QP$_k$.
SQOPT maintains a dense Cholesky factorization of the QP reduced Hessian:

$$
(4.1) \qquad Z^T H_k Z = R^T R,
$$

where $Z$ is the null-space matrix for the working set $W$ in the QP minor iterations. Normally, $R$ is computed from (4.1) when the non-elastic constraints are first satisfied. It is then updated as the QP working set changes. For efficiency the dimension of $R$ should not be excessive (say, $n_Z \le 1200$). This is guaranteed if the number of nonlinear variables is moderate (because $n_Z \le \bar{n}$ at a solution).

As in MINOS, $Z$ is maintained in "reduced-gradient" form, using the package LUSOL [26] to maintain sparse LU factors of a square matrix $B$ that alters as the working set $W$ changes. The important pieces are

$$
(4.2) \qquad W_{BS}\, P = \begin{pmatrix} B & S \end{pmatrix}, \qquad
Z_{BS} = P \begin{pmatrix} -B^{-1} S \\ I \end{pmatrix},
$$

where $W_{BS}$ is part of the working set (some rows and columns of $J(x_k)$ and $A$) and $P$ is a permutation that ensures $B$ is nonsingular. Variables associated with $B$ and $S$ are called basic and superbasic; the remainder are called nonbasic. The number of degrees of freedom is the number of superbasic variables (the column dimension of $S$). Products of the form $Zv$ and $Z^T g$ are obtained by solving with $B$ or $B^T$.
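A dense illustration of how these products can be formed from the factors in (4.2) is sketched below (LUSOL maintains sparse LU factors; scipy's dense lu_factor stands in here, and the permutation handling is our own convention):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def make_Z_products(B, S, perm):
    """Operators v -> Z v and g -> Z'g for Z = P [-B^{-1} S; I] of (4.2);
    perm is an index vector standing in for the permutation P."""
    lu = lu_factor(B)                      # one factorization, many solves
    m = B.shape[0]
    inv_perm = np.argsort(perm)

    def Zv(v):                             # one entry of v per superbasic
        top = -lu_solve(lu, S @ v)         # solve B t = S v
        return np.concatenate([top, v])[perm]

    def Ztg(g):
        gp = g[inv_perm]                   # apply P^T
        return S.T @ (-lu_solve(lu, gp[:m], trans=1)) + gp[m:]

    return Zv, Ztg
```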
4.1. Condition control. If the basis matrix is not chosen carefully, the condition of a reduced-gradient $Z$ could be arbitrarily high. To guard against this, MINOS and SQOPT implement a "basis repair" feature in the following way. LUSOL is used to compute the rectangular factorization

$$
W_{BS}^T = LU,
$$

returning just the permutation $P$ that makes $PLP^T$ unit lower triangular. The pivot tolerance is set to require $|PLP^T|_{ij} \le 2$, and the permutation is used to define $P$ in (4.2). It can be shown that $\|Z\|$ is likely to be little more than 1. Hence, $Z$ should be well-conditioned regardless of the condition of $W$.

SQOPT applies this feature at the beginning of a warm start (when a potential $B$-$S$ ordering is known). To prevent basis repair at every warm start (i.e., every major iteration of SNOPT), a normal $B = LU$ factorization is computed first (with the usual loose pivot tolerance to improve the sparsity of the factors). If $U$ appears to be more ill-conditioned than after the last repair, a new repair is invoked.
4.2. Inertia control. If NP contains linear variables, $H_k$ in (3.1) is positive semidefinite. In SQOPT, only the last diagonal of $R$ (4.1) is allowed to be zero. (See [27] for discussion of a similar strategy for indefinite quadratic programming.) If the initial $R$ is singular, enough temporary constraints are added to the working set to give a nonsingular $R$. Thereafter, $R$ can become singular only when a constraint is deleted from the working set (in which case no further constraints are deleted until $R$ becomes nonsingular). When $R$ is singular at a non-optimal point, it is used to define a direction $d_Z$ such that

$$
(4.3) \qquad Z^T H_k Z d_Z = 0 \quad \text{and} \quad g^T Z d_Z < 0,
$$

where $g = g(x_k) + H_k(x - x_k)$ is the gradient of the quadratic objective. The vector $d = Z d_Z$ is a direction of unbounded descent for the QP in the sense that the QP objective is linear and decreases without bound along $d$. Normally, a step along $d$ reaches a new constraint, which is then added to the working set for the next iteration.
4.3. Unbounded QP subproblems. If the QP objective is unbounded along $d$, subproblem QP$_k$ terminates. The final QP search direction $d = Z d_Z$ is also a direction of unbounded descent for the objective of NP. To show this, we observe from (4.3) that if we choose $p = d$, then

$$
H_k p = 0 \quad \text{and} \quad g(x_k)^T p < 0.
$$

The imposed nonsingularity of $\bar{H}_k$ (3.1) implies that the nonlinear components of $p$ are zero and so the nonlinear terms of the objective and constraint functions are unaltered by steps of the form $x_k + \alpha p$. Since $g(x_k)^T p < 0$, the objective of NP is unbounded along $p$ because it must include a term in the linear variables that decreases without bound along $p$.

In short, NP behaves like an unbounded LP along $p$, with the nonlinear variables (and functions) frozen at their current values. Thus if $x_k$ is feasible for NP, unboundedness in QP$_k$ implies that $f(x)$ is unbounded for feasible points, and the problem is declared unbounded.

If $x_k$ is infeasible, unboundedness in QP$_k$ implies that $f(x)$ is unbounded for some expanded feasible region $c(x) \ge -b$ (see (2.5)). We enter or continue elastic mode (with an increased value of $\gamma$ if it has not already reached its maximum permitted value). Eventually, the QP subproblem will be bounded, or $x_k$ will become feasible, or the iterations will converge to a point that approximately minimizes the one-norm of the constraint violations.
5. Implementation details. SNOPT is a general-purpose Fortran 77 package for large-scale optimization. It implements most of the techniques described in §§2–4. It includes SQOPT, a sparse QP solver that maintains a dense Cholesky factorization of the reduced QP Hessian.
5.1. The initial point. To take advantage of a good starting point $x_0$, we apply the QP solver to the "proximal-point" QP problem

$$
\mathrm{PP} \qquad
\begin{array}{ll}
\displaystyle \min_{\bar{x}} & \|\bar{x} - \bar{x}_0\|^2 \\[2pt]
\text{subject to} & \text{the linear constraints and bounds,}
\end{array}
$$

where $\bar{x}$ and $\bar{x}_0$ correspond to the nonlinear variables in $x$. The solution defines a new starting point $x_0$ for the SQP iteration. The nonlinear functions are evaluated at this point, and a "crash" procedure is executed to find a working set $W_0$ for the linearized constraints.

Note that Problem PP may be "more nonlinear" than the original problem NP, in the sense that its exact solution may lie on fewer constraints (even though it is nonlinear in the same subset of variables, $\bar{x}$). To prevent the reduced Hessian from becoming excessively large, we terminate SQOPT early by specifying a loose optimality tolerance.
5.2. Linearly constrained problems. For problems with linear constraints only, the maximum step length is not necessarily one. Instead, it is the maximum feasible step along the search direction. If the line search is not restricted by the maximum step, the line search ensures that the approximate curvature is sufficiently positive and the BFGS update can always be applied. Otherwise, the update is skipped if the approximate curvature is not sufficiently positive.

For linear constraints, the working-set matrix $W_k$ does not change at the new major iterate $x_{k+1}$ and the basis $B$ need not be refactorized. If $B$ is constant, then so is $Z$, and the only change to the reduced Hessian between major iterations comes from the rank-two BFGS update. This implies that the reduced Hessian need not be refactorized if the BFGS update is applied explicitly to the reduced Hessian. This obviates factorizing the reduced Hessian at the start of each QP, saving considerable computation.
Given any nonsingular matrix $Q$, the BFGS update to $H_k$ implies the following update to $Q^T H_k Q$:

$$
(5.1) \qquad \bar{H}_Q = H_Q + \theta_k y_Q y_Q^T - \phi_k q_Q q_Q^T,
$$

where $\bar{H}_Q = Q^T H_{k+1} Q$, $H_Q = Q^T H_k Q$, $y_Q = Q^T y_k$, $\delta_Q = Q^{-1} \delta_k$, $q_Q = H_Q \delta_Q$, $\theta_k = 1/y_Q^T \delta_Q$ and $\phi_k = 1/q_Q^T \delta_Q$. If $Q$ is of the form $(\,Z \;\; Y\,)$ for some matrix $Y$, the reduced Hessian is the leading principal submatrix of $H_Q$.

The Cholesky factor $R$ of the reduced Hessian is simply the upper-left corner of the $n \times n$ upper-trapezoidal matrix $R_Q$ such that $H_Q = R_Q^T R_Q$. The update for $R$ is derived from the rank-one update to $R_Q$ implied by (5.1). Given $\delta_k$ and $y_k$, if we had the Cholesky factor $R_Q$, it could be updated directly as

$$
(5.2) \qquad R_Q + \frac{w}{\|w\|} \left( \sqrt{\theta_k}\, y_Q - \frac{R_Q^T w}{\|w\|} \right)^{\! T},
$$

where $w = R_Q \delta_Q$ (see Goldfarb [30], Dennis and Schnabel [15]). This rank-one modification of $R_Q$ could be restored to upper-triangular form by applying two sequences of plane rotations from the left [21].

To simplify the notation we write (5.2) as $R_Q + u v^T$, where $R_Q$ is an $n \times n$ upper-trapezoidal matrix, $u = w/\|w\|$ and $v = \sqrt{\theta_k}\, y_Q - R_Q^T u$. Let $v_Z$ be the first $n_Z$ elements of $v$. The following algorithm determines the Cholesky factor $\bar{R}$ of the first $n_Z$ rows and columns of $\bar{H}_Q$ (5.1).

1. Compute $q = H_k \delta_k$, $t = Z^T q$.
2. Define $\rho = \|w\|_2 = (\delta_k^T H_k \delta_k)^{1/2} = (q^T \delta_k)^{1/2}$.
3. Solve $R^T w_Z = t$.
4. Define $u_Z = w_Z / \rho$, $\zeta = (1 - \|u_Z\|_2^2)^{1/2}$.
5. Define a sweep of $n_Z$ rotations $P_1$ in the planes $(n_Z + 1, i)$, $i = n_Z, n_Z - 1, \ldots, 1$, such that

$$
P_1 \begin{pmatrix} R & u_Z \\ 0 & \zeta \end{pmatrix}
= \begin{pmatrix} \widehat{R} & 0 \\ r^T & 1 \end{pmatrix},
$$

where $\widehat{R}$ is upper triangular and $r^T$ is a "row spike" in row $n_Z + 1$.

6. Define a sweep of $n_Z$ rotations $P_2$ in the planes $(i, n_Z + 1)$, $i = 1, 2, \ldots, n_Z$, such that

$$
P_2 \begin{pmatrix} \widehat{R} \\ r^T + v_Z^T \end{pmatrix}
= \begin{pmatrix} \bar{R} \\ 0 \end{pmatrix},
$$

where $\bar{R}$ is upper triangular.
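For illustration, the following generic sketch restores a rank-one-modified triangular factor $R + uv^T$ with two sweeps of plane rotations (a standard QR-update procedure, not SNOPT's specialized algorithm above, which exploits the structure of $u$ and $\zeta$):

```python
import numpy as np

def givens(a, b):
    """c, s such that [[c, s], [-s, c]] @ [a, b]' = [r, 0]'."""
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def rank_one_triangular_update(R, u, v):
    """Upper-triangular Rbar with Rbar'Rbar = (R + u v')'(R + u v')."""
    n = R.shape[0]
    A, u = R.copy(), u.astype(float).copy()
    for i in range(n - 1, 0, -1):          # sweep 1: rotate u onto e_1;
        c, s = givens(u[i - 1], u[i])      # A becomes upper Hessenberg
        u[i - 1], u[i] = np.hypot(u[i - 1], u[i]), 0.0
        G = np.array([[c, s], [-s, c]])
        A[[i - 1, i], :] = G @ A[[i - 1, i], :]
    A[0, :] += u[0] * v                    # add ||u|| e_1 v'
    for i in range(n - 1):                 # sweep 2: re-triangularize
        c, s = givens(A[i, i], A[i + 1, i])
        G = np.array([[c, s], [-s, c]])
        A[[i, i + 1], :] = G @ A[[i, i + 1], :]
    return np.triu(A)

# Quick check: the factor reproduces (R + uv')'(R + uv').
rng = np.random.default_rng(0)
R = np.triu(rng.standard_normal((5, 5))) + 5 * np.eye(5)
u, v = rng.standard_normal(5), rng.standard_normal(5)
Rb = rank_one_triangular_update(R, u, v)
M = R + np.outer(u, v)
print(np.allclose(Rb.T @ Rb, M.T @ M))     # True
```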
5.3. The major iteration of SNOPT. The main steps of the SQP algorithm in SNOPT are as follows. We assume that a starting point $(x_0, \lambda_0)$ is available, and that the reduced-Hessian QP solver SQOPT is being used. We describe elastic mode verbally. Specific values for $\gamma$ are given at the start of §6.

0. Apply the QP solver to Problem PP to find the point closest to $x_0$ satisfying the linear constraints. If Problem PP is infeasible, declare Problem NP infeasible. Otherwise, Problem PP defines a working-set matrix $W_0$. Set $k = 0$.
1. Factorize $W_k$.
2. Find $\bar{x}_k$, a feasible point for the QP subproblem. (This is an intermediate point for the QP solver, which also provides a working-set matrix $\bar{W}_k$ and its null-space matrix $\bar{Z}_k$.) If no feasible point exists, initiate elastic mode and restart the QP.
3. Form the reduced Hessian $\bar{Z}_k^T H_k \bar{Z}_k$ and compute its Cholesky factorization.
4. Continue solving the QP subproblem to find $(\widehat{x}_k, \widehat{\lambda}_k)$, an optimal QP solution. (This provides a working-set matrix $\widehat{W}_k$ and its null-space matrix $\widehat{Z}_k$.)
   If elastic mode has not been initiated but $\|\widehat{\lambda}_k\|_\infty$ is "large", enter elastic mode and restart the QP.
   If the QP is unbounded and $x_k$ satisfies the nonlinear constraints, declare the problem unbounded. Otherwise (if the QP is unbounded), go to Step 6.
5. If $(x_k, \lambda_k)$ satisfies the convergence tests for NP analogous to (2.9), declare the solution optimal. If similar convergence tests are satisfied for NP($\gamma$), go to Step 6. Otherwise, go to Step 7.
6. If elastic mode has not been initiated, enter elastic mode and repeat Step 4. Otherwise, if $\gamma$ has not reached its maximum value, increase $\gamma$ and repeat Step 4. Otherwise, declare the problem infeasible.
7. Find a step length $\alpha_k$ that gives a sufficient reduction in the merit function. Set $x_{k+1} = x_k + \alpha_k(\widehat{x}_k - x_k)$ and $\lambda_{k+1} = \lambda_k + \alpha_k(\widehat{\lambda}_k - \lambda_k)$. Evaluate the Jacobian at $x_{k+1}$.
8. Define $\delta_k = x_{k+1} - x_k$ and $y_k = \nabla L(x_{k+1}, x_k, \lambda_{k+1}) - \nabla L(x_k, x_k, \lambda_{k+1})$. If $y_k^T \delta_k < \sigma_k$, recompute $\delta_k$ and $y_k$ with $x_k$ redefined as $x_k + \alpha_k(\bar{x}_k - x_k)$. (This requires an extra evaluation of the problem derivatives.) If necessary, increase $y_k^T \delta_k$ (if possible) by adding an augmented Lagrangian term to $y_k$.
9. If $y_k^T \delta_k \ge \sigma_k$, apply the BFGS update to $H_k$ using the pair $(H_k \delta_k, y_k)$.
10. Set $k \leftarrow k + 1$ and repeat from Step 1.
Apart from computing the problem functions and their first derivatives, most of the computational effort lies in Steps 1 and 3. Steps 2 and 4 may also involve significant work if the QP subproblem requires many minor iterations. Typically this will happen only during the early major iterations.

Note that all points $x_k$ satisfy the linear constraints and bounds (as do the points used to define extra derivatives in Step 8). Thus, SNOPT evaluates the nonlinear functions only at points where it is reasonable to assume that they are defined.
6. Numerical results. We give the results of applying SNOPT to several sets of optimization problems, including 3 standard sets of small dense problems, the CUTE collection, and some large sparse problems arising in optimal trajectory calculations. Sources for the problems are given in Table 6.1. Table 6.2 defines the notation used in the later tables of results.

Unless stated otherwise, all runs were made on an SGI Indigo2 Impact 10000 with 256MB of RAM. SNOPT is coded in Fortran and was compiled using f77 in 64-bit mode and full code optimization. Figure 6.1 gives the SNOPT optional parameters and their values, most of which are the default. In some cases we compare the performance of SNOPT with the SLC code MINOS 5.5 of Dec 1996 (see [41] and §1.4).
Table 6.1
Sets of test problems.

Problems Reference
bt Boggs and Tolle [4, 5]
hs Hock and Schittkowski [34]
CUTE Bongartz, Conn, Gould and Toint [8]
Spring Murtagh and Saunders [40]
Min-time Hargraves and Paris [32]
Table 6.2
Notation in tables of results.

QP   The problem is a quadratic program.
LC   The objective is nonlinear but the constraints are linear.
NC   Some of the constraints are nonlinear.
m    The number of general linear and nonlinear constraints.
n    The number of variables.
nZ   The number of degrees of freedom at a solution (columns in Z).
Mnr  The number of QP minor iterations.
Mjr  The number of major iterations required by the optimizer.
Fcn  The number of function and gradient evaluations.
cpu  The number of cpu seconds.
Obj  The final objective value (to help classify local solutions).
Con  The final constraint violation norm (to identify infeasible problems).
a    Almost optimal (within 10^{-2} of satisfying the optimality tolerance).
l    Linear constraints infeasible.
u    Unbounded functions.
c    Final point could not be improved.
t    Iterations terminated.
i    Nonlinear constraints locally infeasible.

The default MINOS optional parameters were used, such as Crash option 3 and Line search tolerance 0.1. The only exceptions were Superbasics limit 1200 and Major iterations 2000. The convergence criteria for SNOPT and MINOS are identical.

For the SNOPT Hessian approximations $H_k$, if the number of nonlinear variables is small enough ($\bar{n} \le 75$), a full dense BFGS Hessian is used. Otherwise, a limited-memory BFGS Hessian is used, with $H_k$ reset to the current Hessian diagonal every 20 major iterations.
To aid comparison with results given elsewhere, runs on published test problems used the associated "standard start" for $x_0$. In all cases, the starting multiplier estimates $\lambda_0$ were set to zero. On the Hock-Schittkowski test set, setting $\lambda_0$ to be the QP multipliers from the first subproblem led to fewer major iterations and function evaluations in most cases. Overall, however, SNOPT was more reliable with $\lambda_0 = 0$.

The default initial $\gamma$ for elastic mode is $\omega \|g(x_{k_1})\|_2$, where $\omega$ is the Elastic weight (default 100) and $x_{k_1}$ is the iterate at which $\gamma$ is first needed. Thereafter, if the $r$th increase to $\gamma$ occurs at iteration $k_2$, $\gamma = \omega\, 10^r \|g(x_{k_2})\|_2$.

The Major feasibility tolerance and Major optimality tolerance are the parameters $\tau_P$ and $\tau_D$ of §2.10 defined with respect to Problem NP. The Minor tolerances are analogous quantities for SQOPT as it solves QP$_k$. (The Minor feasibility tolerance incidentally applies to the bound and linear constraints in NP as well as QP$_k$.)

The Violation limit is the parameter $\tau$ of §2.7 that defines an expanded feasible region in which the objective is expected to be bounded below.
As in MINOS, the default is to scale the linear constraints and variables, and the first basis is essentially triangular (Crash option 3), except for NC problems, where SNOPT's default is the all-slack basis (Crash option 0).

BEGIN SNOPT Problem
Minimize
Jacobian                     sparse
Derivative level             3
Elastic weight               100.0
Hessian updates              20
Hessian dimension            1200
Superbasics limit            1200
Iterations                   1000000
Major iterations             1000
Minor iterations             9000
Major feasibility tolerance  1.0e-6
Major optimality tolerance   1.0e-6
Minor feasibility tolerance  1.0e-6
Minor optimality tolerance   1.0e-6
Crash option                 0 (3 for LC)
Line search tolerance        0.9
Step limit                   2.0
Unbounded objective          1.0e+15
Violation limit              10.0
Solution                     No
END SNOPT Problem

Fig. 6.1. The SNOPT optional parameter file.

6.1. Results on small problems. SNOPT was applied to 3 sets of small test problems that have appeared in the literature.

Table 6.3 gives results on the Boggs-Tolle problems [5], which are a set of small nonlinear problems. Where appropriate, the problems were run with the "close starts" (Start 1), "intermediate starts" (Start 2) and "far starts" (Start 3) suggested by Boggs and Tolle [4]. SNOPT solved all cases except bt4 with the intermediate starting point. In that case, the merit function became unbounded, with $\|c\| \to \infty$.

Tables 6.4–6.6 give results on the Hock-Schittkowski (HS) collection [34] as implemented in the CUTE test suite (see §6.3). Results from all CUTE HS problems are included except for hs67, hs85 and hs87, which are not smooth. In every case SNOPT found a point that satisfies the first-order conditions for optimality. However, since SNOPT uses only first derivatives, it is able to find only first-order solutions, which may or may not be minimizers. In some cases, e.g., hs16 and hs25, the final point was not a minimizer. Similarly, the constraint qualification does not hold at the final point obtained for hs13.

Table 6.7 summarizes the results of applying SNOPT and MINOS to the HS collection. SNOPT required more minor iterations than MINOS but fewer function evaluations. MINOS solved 119 of the 122 problems.
Table 6.3
SNOPT on the Boggs-Tolle test problems.

No. Problem Mnr Mjr Fcn Obj


1 bt1 7 5 8 -9.999999E-01
2 bt2 13 10 14 3.256820E-02
3 bt2 Start 2 19 16 22 3.256820E-02
4 bt2 Start 3 32 23 41 3.256820E-02
5 bt3 (LC) 6 3 7 4.093023E+00
6 bt3 Start 2 6 3 7 4.093023E+00
7 bt3 Start 3 6 3 7 4.093023E+00
8 bt4 22 18 42 -3.739424E+01
9 bt4 Start 2 t 51 50 180 -3.168674E+02
10 bt4 Start 3 15 11 19 -3.739424E+01
11 bt5 (hs63 ) 26 18 41 9.617152E+02
12 bt5 Start 2 13 8 11 9.617152E+02
13 bt5 Start 3 14 9 13 9.617152E+02
14 bt6 (hs77 ) 19 14 19 2.415051E-01
15 bt6 Start 2 82 69 80 2.415051E-01
16 bt7 20 14 22 3.065000E+02
17 bt8 14 11 15 1.000001E+00
18 bt8 Start 2 19 16 19 1.000000E+00
19 bt9 (hs39 ) 18 14 20 -1.000000E+00
20 bt9 Start 2 26 22 32 -1.000000E+00
21 bt9 Start 3 29 25 36 -1.000000E+00
22 bt10 2 6 9 -1.000000E+00
23 bt10 Start 2 2 13 17 -1.000000E+00
24 bt10 Start 3 4 15 18 -1.000000E+00
25 bt11 (hs79 ) 13 8 12 9.171343E-02
26 bt11 Start 2 22 17 22 9.171343E-02
27 bt11 Start 3 39 34 51 9.171343E-02
28 bt12 94 63 142 6.188119E+00
29 bt12 Start 2 25 18 25 6.188119E+00
30 bt12 Start 3 19 14 19 6.188119E+00
31 bt13 53 47 93 0.000000E+00

Problem hs104lnp could not be solved because of an uninitialized variable in the Standard Input Format (SIF) file (SGI Fortran does not initialize local variables to zero). The two other "failures" occurred when hs93 and hs98 were declared to be infeasible. Since MINOS does not yet have provision for minimizing the norm of the constraint violations once nonlinear constraint infeasibility becomes apparent, infeasibility may be declared incorrectly or at points where the constraint violations are not close to being minimized. The results of Table 6.7 and subsequent tables indicate that the treatment of infeasibility using the elastic variable approach has a substantial effect upon the reliability of the method.
6.2. Optimal control problems. Next we consider two problems that arise from the discretization of certain optimal control problems. A feature of these problems is that the optimization variables define a discretization of a function of a continuous variable (in this case, time). The accuracy of the approximation increases with the number of optimization variables and constraints.

Table 6.4
SNOPT on the CUTE Hock-Schittkowski suite: Part I.
No. Problem Mnr Mjr Fcn Obj
1 hs1 (BC) 39 24 31 3.535967E-15
2 hs2 (BC) 28 16 25 5.042619E-02
3 hs3 (QP) 5 3 10 1.972152E-34
4 hs3mod (QP) 6 4 10 1.925930E-34
5 hs4 (BC) 2 1 3 2.666667E+00
6 hs5 (BC) 9 6 10 -1.913223E+00
7 hs6 5 4 8 0.000000E+00
8 hs7 17 15 27 -1.732051E+00
9 hs8 (FP) 2 5 10 0.000000E+00
10 hs9 (LC) 5 4 10 -5.000000E-01
11 hs10 14 12 17 -1.000000E+00
12 hs11 11 9 17 -8.498464E+00
13 hs12 11 8 13 -3.000000E+01
14 hs13 1 4 11 1.434080E+00
15 hs14 3 5 9 1.393465E+00
16 hs15 2 2 5 3.065000E+02
17 hs16 1 4 7 2.314466E+01
18 hs17 18 13 26 1.000000E+00
19 hs18 15 13 24 5.000000E+00
20 hs19 8 12 24 -6.961814E+03
21 hs20 1 3 6 4.019873E+01
22 hs21 (QP) 2 2 6 -9.996000E+01
23 hs21mod (LC) 2 2 6 -9.596000E+01
24 hs22 3 0 3 1.000000E+00
25 hs23 8 7 17 2.000000E+00
26 hs24 (LC) 6 2 8 -1.000000E+00
27 hs25 (BC) 0 0 3 3.283500E+01
28 hs26 94 62 249 5.685474E-11
29 hs27 24 21 32 4.000000E-02
30 hs28 (LC) 5 3 7 4.830345E-18
31 hs29 26 18 25 -2.262742E+01
32 hs30 14 12 15 1.000000E+00
33 hs31 13 9 14 6.000000E+00
34 hs32 5 3 6 1.000000E+00
35 hs33 1 2 6 -3.993590E+00
36 hs34 4 5 8 -8.340324E-01
37 hs35 (QP) 12 7 10 1.111111E-01
38 hs35mod (QP) 9 6 9 2.500000E-01
39 hs36 (LC) 3 1 4 -3.300000E+03
40 hs37 (LC) 8 5 9 -3.456000E+03
41 hs38 (BC) 45 29 32 1.823912E-16
42 hs39 (bt9 ) 20 16 28 -1.000000E+00
43 hs40 9 5 9 -2.500000E-01
44 hs41 (LC) 3 0 3 1.925926E+00
45 hs42 11 7 12 1.385786E+01
46 hs43 16 9 14 -4.400000E+01
47 hs44 (QP) 9 5 10 -1.500000E+01

Table 6.5
SNOPT on the CUTE Hock-Schittkowski suite: Part II.
No. Problem Mnr Mjr Fcn Obj
48 hs44new (QP) 10 4 8 -1.500000E+01
49 hs45 (BC) 8 3 11 1.000000E+00
50 hs46 17 12 18 3.488613E-08
51 hs47 45 38 52 -2.671418E-02
52 hs48 (LC) 10 7 11 2.631260E-15
53 hs49 (LC) 34 30 34 1.299963E-11
54 hs50 (LC) 29 19 27 2.155810E-16
55 hs51 (QP) 4 3 7 5.321113E-29
56 hs52 (QP) 6 3 8 5.326648E+00
57 hs53 (QP) 6 3 7 4.093023E+00
58 hs54 (LC) 61 37 50 -8.674088E-01
59 hs55 (LC) 3 0 3 6.666667E+00
60 hs56 27 17 28 -3.456000E+00
61 hs57 40 33 46 2.845965E-02
62 hs59 24 16 28 -6.749505E+00
63 hs60 11 8 12 3.256820E-02
64 hs61 15 10 16 -1.436461E+02
65 hs62 (LC) 13 8 13 -2.627251E+04
66 hs63 (bt5 ) 6 14 39 9.723171E+02
67 hs64 49 38 42 6.299842E+03
68 hs65 17 9 12 9.535289E-01
69 hs66 7 4 6 5.181633E-01
70 hs68 55 36 56 -9.204250E-01
71 hs69 18 13 23 -9.567129E+02
72 hs70 33 28 34 7.498464E-03
73 hs71 9 5 8 1.701402E+01
74 hs72 55 43 59 7.267078E+02
75 hs73 11 3 5 2.989438E+01
76 hs74 14 6 9 5.126498E+03
77 hs75 10 5 8 5.174413E+03
78 hs76 (QP) 8 4 7 -4.681818E+00
79 hs77 (bt6 ) 20 15 20 2.415051E-01
80 hs78 12 7 12 -2.919700E+00
81 hs79 (bt11 ) 16 11 15 7.877682E-02
82 hs80 11 6 9 5.394985E-02
83 hs81 17 12 16 5.394985E-02
84 hs83 3 3 7 -3.066554E+04
85 hs84 15 6 18 -5.280335E+06
86 hs86 (LC) 20 8 12 -3.234868E+01
87 hs88 48 36 67 1.362657E+00
88 hs89 51 38 70 1.362657E+00
89 hs90 49 38 70 1.362657E+00
90 hs91 47 36 62 1.362657E+00
91 hs92 57 42 80 1.362657E+00
92 hs93 27 21 25 1.350760E+02

Table 6.6
SNOPT on the CUTE Hock-Schittkowski suite: Part III.
No. Problem Mnr Mjr Fcn Obj
93 hs95 1 1 4 1.561953E-02
94 hs96 1 1 4 1.561953E-02
95 hs97 10 13 34 3.135809E+00
96 hs98 10 13 34 3.135809E+00
97 hs99 55 20 30 -8.310799E+08
98 hs99exp 545 148 801 -1.260006E+12
99 hs100 23 13 22 6.806301E+02
100 hs100lnp 22 15 28 6.806301E+02
101 hs100mod 24 16 26 6.786796E+02
102 hs101 180 94 381 1.809765E+03
103 hs102 99 48 166 9.118806E+02
104 hs103 103 46 166 5.436680E+02
105 hs104 31 23 28 3.951163E+00
106 hs104lnp 28 20 26 3.951163E+00
107 hs105 (LC) 68 47 60 1.044725E+03
108 hs106 31 13 16 7.049248E+03
109 hs107 22 6 11 5.055012E+03
110 hs108 31 9 13 -8.660255E-01
111 hs109 40 14 20 5.362069E+03
112 hs110 (BC) 50 1 4 -9.990002E+09
113 hs111 221 148 259 -4.776109E+01
114 hs111lnp 99 57 96 -4.737066E+01
115 hs112 (LC) 65 24 41 -4.776109E+01
116 hs113 33 14 19 2.430621E+01
117 hs114 45 16 31 -1.768807E+03
118 hs116 284 69 88 9.759102E+01
119 hs117 54 16 21 3.234868E+01
120 hs118 (QP) 29 2 6 6.648205E+02
121 hs119 (LC) 36 9 12 2.448997E+02
122 hs268 (QP) 77 37 45 -1.091394E-11

Table 6.7
Summary: MINOS and SNOPT on the CUTE HS test set.

MINOS SNOPT
Problems attempted 124 124
Optimal 121 124
Cannot be improved 1 0
False infeasibility 2 0
Major iterations 7175 3573
Minor iterations 1230 2052
Function evaluations 18703 4300
Cpu time (secs) 8.45 3.66

Problem Spring. This problem computes the optimal control of a spring, mass and damper system described in [40]. Our implementation of Spring perpetuates a coding error that makes the problem badly conditioned unless exactly 100 discretized sample points are used (see Plantenga [43]). Corrected versions appear in the CUTE test collection as problems optcdeg2 and optcdeg3 (see the results of §6.3). Table 6.8 includes the dimensions of six spring problems of increasing size. Table 6.9 gives results for MINOS and SNOPT on these problems. SNOPT proved remarkably effective in terms of the number of major iterations and function values (around 20 regardless of problem size).
Table 6.8
Dimensions of Optimal Control problems.

Problem     m     n        Problem      m     n
Spring200   400   602      Min-time10   270   184
Spring300   600   902      Min-time15   410   274
Spring400   800   1202     Min-time20   550   384
Spring500   1000  1502     Min-time25   690   454
Spring600   1200  1802     Min-time30   830   544
Spring700   1400  2102     Min-time35   970   634
                           Min-time40   1110  724
                           Min-time45   1250  814
                           Min-time50   1390  904

Table 6.9
MINOS and SNOPT on Spring.

MINOS SNOPT
Problem nZ Mnr Mjr Fcn cpu nZ Mnr Mjr Fcn cpu
Spring200 26 935 18 935 4.2 40 886 16 19 5.9
Spring300 53 2181 29 1881 13.6 48 1259 14 17 9.9
Spring400 79 1601 32 2136 16.7 47 1602 17 20 15.6
Spring500 108 2188 42 3069 27.7 50 1990 17 20 22.9
Spring600 135 3147 58 4522 65.8 61 2432 17 15 24.9
Spring700 162 3453 64 5046 81.7 62 2835 17 20 38.4

For MINOS, the major iterations increase with problem size because they are terminated at most 40 minor iterations after the subproblem is feasible (whereas SNOPT always solves its subproblems to optimality). The function evaluations are proportional to the minor iterations, which increase steadily. The runtime would be greatly magnified if the functions or gradients were not trivial to compute.

For more complex examples (such as the F4 minimum-time-to-climb below), the solution to the smallest problem could provide a good starting point for the larger cases. This should help most optimizers, but the expensive functions would still leave MINOS at a disadvantage compared to SNOPT.
An optimal trajectory problem. Here we give the results of applying two
SQP methods to a standard optimal trajectory problem: the F4 minimum-time-to-
climb. In this problem, a pilot cruising in an F-4 at Mach 0.34 at sea level wishes
Table 6.10
NZOPT and SNOPT on the F4 minimum-time-to-climb.

NZOPT SNOPT
Problem Mjr Fcn cpu Mjr Fcn cpu
Min-time10 21 33 49.1 22 33 16.1
Min-time15 22 42 163.8 32 46 36.3
Min-time20 34 38 449.4 33 38 43.9
Min-time25 43 61 1368.3 33 40 67.1
Min-time30 37 41 2194.0 40 46 94.2
Min-time35 47 51 4324.5 40 46 118.6
Min-time40 47 51 6440.3 54 60 182.9
Min-time45 47 51 9348.0 49 56 209.3
Min-time50 53 57 14060.5 43 47 217.3
Table 6.11
MINOS and SNOPT on the F4 minimum-time-to-climb.

MINOS SNOPT
Problem nZ Mnr Mjr Fcn cpu nZ Mnr Mjr Fcn cpu
Min-time10 5 657 15 1996 1083.0 5 33 22 33 16.1
Min-time15 15 1586 16 5047 4076.3 15 46 32 42 36.3
Min-time20 23 2201 14 6972 7056.4 23 38 33 38 43.9
Min-time25 30 3044 14 9947 13754.0 30 40 33 61 67.1
Min-time30 40 7180 17 23443 37198.8 40 46 40 41 94.2
Min-time35 46 4698 14 15070 28371.6 46 46 40 51 118.6
Min-time40 55 4448 14 14351 30643.3 55 60 54 51 182.9
Min-time45 64 3806 13 10752 26712.5 64 56 49 51 209.3
Min-time50 72 6758 44 20515c 58026.1 72 47 43 57 217.3

to ascend to 65,000 feet at Mach 1.0 in minimum time. The problem has two path constraints on the maximum altitude and the maximum dynamic pressure.

The runs in this section were made on a Sun SPARCstation 20/61 using f77 with full code optimization.

Table 6.10 gives results for 9 optimization problems, each involving a finer level of discretization of the underlying continuous problem. The problems were generated by OTIS [33], and the constraint gradients are approximated by a sparse finite-difference scheme. In the table, SNOPT is compared with the code NZOPT, which implements an SQP method based on a dense reduced Hessian and a dense orthogonal factorization of the working-set matrix. (NZOPT is a special version of NPSOL that was developed in conjunction with McDonnell Douglas Space Systems for optimal trajectory problems.) Note that both codes required only 50–60 major iterations and function evaluations to solve a problem with O(1000) variables. Since the problem functions are very expensive in this application, it appears that SQP methods (even without the aid of second derivatives) are well suited to trajectory calculations.

Since SNOPT and NZOPT are based on similar SQP methods, the different iteration and function counts are due to the starting procedures and to SNOPT's limited-memory Hessian resets. Note that the limited-memory approximation did not significantly increase the counts for SNOPT.

Differences in cpu times are due to the QP solvers. In NZOPT, where the Jacobian is treated as a dense matrix, the cost of solving the QP subproblems grows as a cubic function of the number of variables, whereas the total cost of evaluating the problem functions grows quadratically. Eventually the linear algebra required to solve the QP subproblems dominates the computation. In SNOPT, sparse-matrix techniques for factoring the Jacobian greatly reduce this cost. On the larger problems, a speedup of almost two orders of magnitude has been achieved.

Table 6.11 compares MINOS and SNOPT on the F4 Min-time problems. The same sparse-matrix methods are used, but the times are dominated by the expensive functions and gradients.
6.3. Results on the CUTE test set. Extensive runs have been made on the
CUTE test collection dated 07/May/97. The various problem types in this distribution
are summarized in Table 6.12.
Table 6.12
CUTE problem categories

Frequency Type Characteristics


24 LP Linear obj, linear constraints
159 QP Quadratic obj, linear constraints
135 UC Nonlinear obj, no constraints
78 BC Nonlinear obj, bound constraints
68 LC Nonlinear obj, linear constraints
289 NC Nonlinear obj, nonlinear constraints
75 FP No objective
15 NS Nonsmooth
843

From the complete set of 843 problems, 47 were omitted as follows:
• The nonsmooth problems (bigbank, bridgend, britgas, concon, core1, core2, gridgena, hs67, hs85, hs87, mconcon, net1, net2, net3, stancmin).
• 26 problems with more than 1200 degrees of freedom at the solution (aug2d, aug2dc, aug2dcqp, aug2dqp, aug3d, aug3dc, aug3dcqp, aug3dqp, bqpgauss, dixmaanb, dtoc5, dtoc6, jimack, jnlbrng1, jnlbrng2, jnlbrnga, obstclae, obstclbm, odnamur, orthrdm2, orthrgdm, stcqp1, stcqp2, stnqp1, stnqp2, torsion6).
• 4 problems with undefined variables in the SIF file (lhaifam, pfit1, pfit3, recipe).
• 1 problem with incorrect gradients (himmelbj).
• 1 problem with excessively low accuracy in the objective gradients (bleachng). Requesting greater accuracy leads to excessive evaluation time.
SNOPT was applied to the remaining 796 problems, using the same options as
before (see Fig. 6.1). No special information was used in the case of LP, QP and FP
problems; i.e., each problem was assumed to have a general nonlinear objective.
Detailed results are presented for 3 subsets extracted using CUTE's interactive "select" facility:
1. Linearly constrained problems (QP, BC and LC);
2. NC and FP problems with fixed dimension;
3. NC and FP problems for which the problem size can be chosen by the user.
A selection of linearly constrained problems. Tables 6.13–6.15 give results
for SNOPT on the 109 CUTE LC problems. The selection for this case was
Objective function type         : *
Constraints type                : L (linear constraints)
Regularity                      : R (smooth)
Degree of available derivatives : *
Problem interest                : *
Explicit internal variables     : *
Number of variables             : *
Number of constraints           : *
The problem model has infeasible linear constraints, but was included anyway.
The objective for problem static3 is unbounded below in the feasible region.
SNOPT solved all 109 problems in the CUTE LC set, and both SNOPT and MINOS
correctly diagnosed the special features of problems model and static3. MINOS solved 101 of the problems, but could not improve the final (nonoptimal) point for problems ncvxqp1, ncvxqp2, ncvxqp4, ncvxqp6, ncvxqp8, powell20 and ubh1.
Table 6.16 summarizes the MINOS and SNOPT results on the CUTE LC problems.
The total cpu time for MINOS was less than one-fifth of that required for SNOPT, largely because of the five blockqp problems. (When these were excluded from the LC selection, the total time for SNOPT dropped from 1211.4 secs to 276.4 secs, which is comparable to the MINOS time.) On blockqp1, which is typical of this group of problems, MINOS requires 1021 function evaluations compared to 9 for SNOPT. The difference in cpu time comes from the number of minor iterations (1010 for MINOS, 2450 for SNOPT) and the size of the reduced Hessians. For MINOS, the reduced Hessian dimension (the number of superbasics) is never larger than four. By contrast, for SNOPT it expands to 1005 during the first QP subproblem, only to be reduced to four during the third major iteration. The intermediate minor iterations are very expensive, with the need to update a dense matrix R (4.1) of order 1000 at each step.
Although the ability to make many changes to the working set (between function
evaluations) has been regarded as an attractive feature of SQP methods, these ex-
amples illustrate that some caution is required. We anticipate that efficiency would be improved by allowing the QP subproblem to terminate early if the reduced Hessian dimension has increased significantly. (Other criteria for early termination are
discussed in [37].)
A selection of problems with variable dimensions. The next selection was
used to choose problems whose dimension can be one of several values. (We chose n
as close to 1000 as possible. Problems from the other 3 categories were deleted.)
Objective function type         : *
Constraints type                : Q O (quadratic, general nonlinear)
Regularity                      : R (smooth)
Degree of available derivatives : *
Problem interest                : M R (modelling, real application)
Explicit internal variables     : *
Number of variables             : v (variable dimension)
Number of constraints           : v (variable dimension)
Table 6.13
SNOPT on the CUTE LC problems: Part I.
No. Problem Mnr Mjr Fcn Obj
1 agg 101 0 0 -3.599177E+07
2 avion2 42 22 27 9.468013E+07
3 biggsc4 8 2 6 -2.437500E+01
4 blockqp1 2450 3 9 -9.965000E+02
5 blockqp2 2381 2 8 -9.961012E+02
6 blockqp3 2164 153 215 -4.975000E+02
7 blockqp4 2498 9 15 -4.980982E+02
8 blockqp5 2557 480 708 -4.975000E+02
9 booth 1 0 0 0.000000E+00
10 bt3 6 3 7 4.093023E+00
11 cvxqp1 1155 18 21 1.087512E+06
12 cvxqp2 1426 42 48 8.201554E+05
13 cvxqp3 887 15 18 1.362829E+06
14 degenlpa 15 0 0 3.060322E+00
15 degenlpb 15 0 0 -3.074235E+01
16 dtoc1l 53 19 22 7.359454E-02
17 dtoc3 15 4 7 2.245904E+02
18 eqc 2 0 3 -8.278941E+02
19 expfita 45 20 24 1.136612E-03
20 expfitb 284 23 29 5.019366E-03
21 expfitc 1582 27 33 2.330257E-02
22 extrasim 0 0 0 0.000000E+00
23 fccu 38 16 19 1.114911E+01
24 genhs28 7 3 7 9.271737E-01
25 goffin 25 0 0 -1.571798E-13
26 gouldqp2 566 44 47 1.845311E-04
27 gouldqp3 469 10 20 2.027592E+00
28 hager1 12 2 5 8.809722E-01
29 hager2 15 4 8 4.325700E-01
30 hager3 15 4 8 1.410332E-01
31 hager4 18 4 8 2.833914E+00
32 hatfldh 4 1 4 -2.437500E+01
33 himmelba 0 0 0 0.000000E+00
34 himmelbi 265 61 67 -1.735570E+03
35 hong 16 7 12 2.257109E+01
36 hubfit 9 6 9 1.689349E-02
37 huestis 1145 2 5 3.482456E+10
38 hydroell 573 4 11 -3.585547E+06
39 hydroelm 292 3 10 -3.582016E+06
40 hydroels 101 2 9 -3.582268E+06
41 ksip 2746 32 35 5.757979E-01
42 lin 4 1 5 -1.757754E-02
43 liswet1 37 2 5 2.474970E-01
44 liswet2 23 2 5 2.529889E-01
45 liswet3 21 2 5 2.529889E-01
Table 6.14
SNOPT on the CUTE LC problems: Part II.
No. Problem Mnr Mjr Fcn Obj
46 liswet4 25 1 4 2.513441E-01
47 liswet5 28 3 6 2.519595E-01
48 liswet6 21 2 5 2.540073E-01
49 liswet7 24 2 5 2.692336E-01
50 liswet8 28 2 5 2.688664E-01
51 liswet9 20 1 4 2.154389E+01
52 liswet10 53 2 5 2.507553E-01
53 liswet11 16 2 5 1.666122E+00
54 liswet12 12 1 4 2.079821E+01
55 loadbal 99 65 70 4.528510E-01
56 lotschd 6 1 4 2.398416E+03
57 lsqfit 8 6 9 3.378699E-02
58 makela4 1 0 0 0.000000E+00
59 model l 18 0 0 0.000000E+00
60 mosarqp1 190 35 40 -1.542001E+02
61 mosarqp2 931 12 15 -5.098246E+02
62 ncvxqp1 1009 5 8 -7.163832E+07
63 ncvxqp2 1122 3 6 -5.780383E+07
64 ncvxqp3 1273 17 20 -3.143376E+07
65 ncvxqp4 1135 3 6 -9.396719E+07
66 ncvxqp5 1231 10 13 -6.634101E+07
67 ncvxqp6 1337 24 27 -3.548614E+07
68 ncvxqp7 799 2 5 -4.338654E+07
69 ncvxqp8 811 3 6 -3.049409E+07
70 ncvxqp9 925 15 18 -2.157328E+07
71 odfits 14 9 15 -2.380027E+03
72 oet1 101 0 0 5.382431E-01
73 oet3 296 0 0 4.504972E-03
74 pentagon 31 22 34 1.365217E-04
75 powell20 6 0 3 5.781250E+01
76 pt 1 0 0 1.783942E-01
77 qc 9 1 4 -9.565377E+02
78 qcnew 2 0 3 -8.048683E+02
79 qpcblend 99 5 9 -7.842543E-03
80 qpcboei1 1200 12 17 1.150391E+07
81 qpcboei2 375 10 15 8.171964E+06
82 qpcstair 645 8 12 6.204392E+06
83 qpnblend 128 15 24 -9.136140E-03
84 qpnboei1 1937 48 57 6.777408E+06
85 qpnboei2 816 24 40 1.368276E+06
86 qpnstair 644 12 16 5.146033E+06
87 reading2 2607 0 0 -1.258335E-02
88 s268 77 37 45 -1.091394E-11
89 s277-280 6 0 0 5.076190E+00
90 simpllpa 3 0 0 1.000000E+00
Table 6.15
SNOPT on the CUTE LC problems: Part III.
No. Problem Mnr Mjr Fcn Obj
91 simpllpb 1 0 0 1.100000E+00
92 sipow1 296 0 0 -1.000000E+00
93 sipow1m 297 0 0 -1.000001E+00
94 sipow2 149 0 0 -1.000000E+00
95 sipow2m 149 0 0 -1.000005E+00
96 sipow3 59 0 0 5.346586E-01
97 sipow4 29 0 0 2.723613E-01
98 sosqp1 1 0 3 -4.864710E-16
99 sosqp2 2259 30 33 -4.987032E+02
100 sseblin 145 0 0 1.617060E+07
101 static3 u 483 78 1053 -1.000000E+15
102 supersim 1 0 0 6.666667E-01
103 tame 1 0 3 3.081488E-33
104 tfi2 34 0 0 6.490311E-01
105 tfi3 43 3 6 4.301158E+00
106 ubh1 1458 10 13 1.116324E+00
107 yao 2 0 3 2.731285E+02
108 zangwil3 2 0 0 0.000000E+00
109 zecevic2 3 2 4 -4.125000E+00
Table 6.16
Summary: MINOS and SNOPT on the CUTE LC problems.
MINOS SNOPT
Problems attempted 109 109
Optimal 100 107
Infeasible 1 1
Unbounded 1 1
Cannot be improved 7 0
Major iterations 83 1597
Minor iterations 42892 49619
Function evaluations 59976 3206
Cpu time (secs) 227.1 1239.4
Table 6.17 gives the problem dimensions and Table 6.18 gives the SNOPT results
on this set. SNOPT solved 38 of the 45 problems attempted. Among the successes we
have included 7 infeasible cases. These include 3 cases with infeasible linear constraints (flosp2hh, flosp2hl and flosp2hm), and 4 cases that SNOPT identified as having infeasible nonlinear constraints (drcavty3, lubrif, flosp2th and junkturn). Since SNOPT is not assured of finding a global minimizer of the sum of infeasibilities, failure to find a feasible point does not imply that none exists. To gain further assurance that drcavty3, lubrif, flosp2th and junkturn are indeed infeasible, they were re-solved using SNOPT's Feasible Point option, in which the true objective is ignored but "elastic mode" is invoked (as usual) if the constraint linearizations prove to be infeasible (i.e., f(x) = 0 and γ = 1 in problem NP(γ) of §1.1). In all 4 cases, the final sum of constraint violations was comparable to that obtained with the composite objective.
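For reference, the composite (elastic) objective referred to here can be sketched schematically as follows (our sketch, not a quotation of the paper's exact statement of NP(γ); v and w denote the elastic slack vectors on the nonlinear constraints and e a vector of ones):

NP(γ)    minimize    f(x) + γ e^T (v + w)    over x, v, w,
         subject to  l ≤ ( x, c(x) - v + w, Ax ) ≤ u,   v ≥ 0,  w ≥ 0.

With f(x) = 0 and γ = 1, minimizing the composite objective amounts to minimizing the one-norm of the nonlinear constraint violations, which is the quantity the Feasible Point option measures.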
Table 6.17
Dimensions of variable-dimensioned CUTE NC selection.
No. Problem Variables Linear Nonlinear
1 bdvalue 1002 1 1000
2 bratu2d 1024 1 900
3 bratu2dt 1024 1 900
4 bratu3d 1000 1 512
5 car2 1199 1 996
6 cbratu2d 1058 1 882
7 cbratu3d 686 1 250
8 chandheq 100 1 100
9 chemrcta 1000 5 996
10 chemrctb 1000 3 998
11 clnlbeam 903 301 300
12 drcav1lq 1225 1 0
13 drcav2lq 1225 1 0
14 drcav3lq 1225 1 0
15 drcavty1 1225 1 961
16 drcavty2 1225 1 961
17 drcavty3 1225 1 961
18 drugdis 904 1 600
19 drugdise 603 1 500
20 flosp2hh 867 579 225
21 flosp2hl 867 579 225
22 flosp2hm 867 579 225
23 flosp2th 867 579 225
24 flosp2tl 867 579 225
25 flosp2tm 867 579 225
26 hadamard 900 1801 465
27 junkturn 1999 1 1400
28 lubrif 751 1252 249
29 manne 1095 366 365
30 orbit2 898 1 898
31 porous1 1024 1 900
32 porous2 1024 1 900
33 reading1 2002 1 1000
34 reading2 3003 1 0
35 reading3 2002 2 1000
36 reading4 1001 1 1000
37 reading5 1001 1 1000
38 reading9 2001 1 1000
39 sreadin3 2002 2 1000
40 ssnlbeam 2003 1 1000
41 svanberg 1000 1 1000
42 trainf 2008 2 1001
43 trainh 3008 2 1001
44 ubh1 909 601 0
45 ubh5 1010 601 100
Table 6.18
SNOPT on the variable-dimensioned CUTE NC problems.
No. Problem Mnr Mjr Fcn Obj Con
1 bdvalue 1180 14 30 0.000000E+00 8.3E-09
2 bratu2d 900 3 5 0.000000E+00 9.3E-08
3 bratu2dt i 925 51 175 0.000000E+00 1.5E-05
4 bratu3d 512 4 7 0.000000E+00 1.3E-11
5 car2 5742 48 74 2.666122E+00 8.2E-06
6 cbratu2d 441 3 5 0.000000E+00 3.0E-07
7 cbratu3d 125 3 5 0.000000E+00 2.0E-08
8 chandheq 100 10 12 0.000000E+00 7.0E-07
9 chemrcta 1124 2 6 0.000000E+00 1.0E-07
10 chemrctb 1053 2 6 0.000000E+00 1.1E-08
11 clnlbeam 5550 10 20 3.481480E+02 2.2E-09
12 drcav1lq t 1961 1000 1103 4.628229E-04 0.0E+00
13 drcav2lq t 1961 1000 1128 6.985585E-04 0.0E+00
14 drcav3lq t 1961 1000 1089 6.441539E-03 0.0E+00
15 drcavty1 982 12 22 0.000000E+00 6.7E-14
16 drcavty2 961 9 20 0.000000E+00 1.1E-09
17 drcavty3 i 1655 65 138 0.000000E+00 2.9E-01
18 drugdis 12489 120 242 4.267653E+00 2.0E-05
19 drugdise t 3599 1000 1990 3.285726E+00 8.9E-01
20 flosp2hh l 567 255 0 0.000000E+00 0.0E+00
21 flosp2hl l 568 255 0 0.000000E+00 0.0E+00
22 flosp2hm l 568 255 0 0.000000E+00 0.0E+00
23 flosp2th i 2244 78 106 0.000000E+00 3.6E+01
24 flosp2tl 820 5 11 0.000000E+00 3.0E-10
25 flosp2tm 1086 24 37 0.000000E+00 6.0E-08
26 hadamard c 315535 34 73 1.906620E-04 3.0E+01
27 junkturn i 2724 132 223 3.623666E-04 3.2E-03
28 lubrif i 16323 146 271 2.914855E+01 1.8E+01
29 manne 742 0 3 -9.745726E-01 0.0E+00
30 orbit2 9855 95 110 3.123404E+02 4.4E-09
31 porous1 900 11 16 0.000000E+00 5.7E-07
32 porous2 900 6 16 0.000000E+00 4.6E-09
33 reading1 4205 38 54 -1.604804E-01 3.7E-12
34 reading2 2607 0 0 -1.258335E-02 0.0E+00
35 reading3 5155 42 93 -1.525715E-01 1.5E-10
36 reading4 4953 26 34 -2.904724E-01 1.9E-13
37 reading5 1000 5 10 -8.001323E-13 2.8E-09
38 reading9 t 2679 1000 2003 -3.082172E-03 1.7E-13
39 sreadin3 5348 49 101 -1.525715E-01 5.2E-11
40 ssnlbeam 10971 12 20 3.462756E+02 2.4E-11
41 svanberg 4866 31 55 1.671434E+03 3.3E-11
42 trainf 7145 25 33 3.103455E+00 1.7E-13
43 trainh 10801 36 49 1.231224E+01 2.0E-08
44 ubh1 1458 10 13 1.116324E+00 0.0E+00
45 ubh5 2853 33 60 1.116324E+00 1.2E-12
Table 6.19
Summary: SNOPT and MINOS on the variable-dimensioned CUTE NC problems.
MINOS SNOPT
Problems attempted 45 45
Optimal 29 31
Infeasible 6 7
Cannot be improved 1 1
False infeasibility 2 1
Terminated 4 5
False unboundedness 3 0
Major iterations 12062 6959
Minor iterations 217169 460094
Function evaluations 303718 9468
Cpu time (secs) 15421.5 12870.9
The remaining infeasible case for SNOPT was bratu2dt, which is listed as a "false infeasible" solution. However, the run gives a point that appears to be near-optimal, with a final nonlinear constraint violation of 1.5 × 10^-5. In this case, SNOPT's Feasible Point option also declared the problem infeasible, with a final nonlinear constraint violation of 1.5 × 10^-4. However, points satisfying the nonlinear constraints have been found in other runs (and by other algorithms).
SNOPT was unable to solve 5 problems within 1000 major iterations (drcav1lq, drcav2lq, drcav3lq, reading9 and drugdise). On termination, SNOPT was in elastic mode for drugdise with final constraint violation 5.5 × 10^-4 (implying that no feasible point may exist). The non-optimal final value for problem hadamard could not be improved.
MINOS solved 29 problems, and declared the 8 problems drugdise, flosp2hh, flosp2hl, flosp2hm, hadamard, lubrif, orbit2 and trainh to be infeasible. Feasible points for orbit2 and trainh are known, so these two cases are considered to have failed. Problems bratu2dt, flosp2tm and junkturn became unbounded at infeasible points. The non-optimal final value for ubh1 could not be improved, and the four problems drcavty1, drcavty2, drcavty3 and flosp2th could not be solved within 2000 major iterations.
Table 6.19 summarizes the performance of MINOS and SNOPT on the variable-
dimensioned NC problems. As with the other test sets, the better reliability of SNOPT
is partly explained by the use of elastic variables to treat infeasible problems. The
large number of function evaluations is the reason why MINOS required more time
than SNOPT even though fewer problems were solved. The unbounded cases for
MINOS are partly attributable to the absence of a suitable merit function.
A selection of problems with fixed dimensions. The next selection was used to choose nonlinearly constrained problems whose dimension is fixed:
Objective function type         : *
Constraints type                : Q O (quadratic, general nonlinear)
Regularity                      : R (smooth)
Degree of available derivatives : *
Problem interest                : M R (modelling, real application)
Explicit internal variables     : *
Number of variables             : any fixed number (fixed dimension)
Number of constraints           : any fixed number (fixed dimension)
Tables 6.20–6.21 give results for this set. SNOPT solved 54 of the 56 problems attempted. The successes include two problems that SNOPT identified as having infeasible nonlinear constraints (discs and nystrom5). The final sums of the nonlinear constraint violations for these problems were 4.00, 3.193 × 10^-3 and 1.72 × 10^-2 respectively. To our knowledge, no feasible point has ever been found for these problems. SNOPT was unable to solve problems cresc132 and leaknet in 1000 major iterations. For leaknet, the run gives a point that appears to be close to optimality, with a final nonlinear constraint violation of 6.3 × 10^-9.
MINOS declared 7 problems to be infeasible (cresc132, discs, lakes, nystrom5, robot, truspyr1 and truspyr2). Feasible points found by SNOPT imply that this diagnosis is correct only for discs and nystrom5. Unbounded iterations occurred in 8 cases (brainpc3, brainpc7, brainpc9, errinbar, tenbars1, tenbars2, tenbars3 and tenbars4). The major iteration limit was enforced for problem reading6.
Table 6.22 summarizes the MINOS and SNOPT results on the fixed-dimensioned NC problems. If the conjectured infeasible problems are counted as successes, the numbers of successes for MINOS and SNOPT are 42 and 54, respectively, out of a total of 56.
A selection of all smooth problems. Finally, SNOPT and MINOS were com-
pared on (almost) the entire CUTE collection. The resulting selection includes the
HS, LC and NC selections considered earlier, but only the additional problems are
discussed below. Table 6.23 summarizes the MINOS and SNOPT results.
SNOPT found an unbounded solution for the problem bratu1d. The 8 problems eigmina, orthrds2, orthregd, scon1ls, tointgor, vanderm1, vanderm2 and vanderm3 were terminated at a point within 10^-2 of satisfying the convergence test. These problems would have succeeded with a less stringent convergence tolerance.
SNOPT identified 11 infeasible problems: argauss, bratu2dt, eigenb, fletcher, growth, himmelbd, lewispol, lootsma, powellsq, s365mod and vanderm4. Of these, powellsq and vanderm4 must be counted as failures because they are known to have feasible points, as verified by calling SNOPT in Feasible Point mode. Similarly, fletcher and lootsma have feasible solutions, but their initial points are infeasible and stationary for the sum of infeasibilities, so SNOPT terminated immediately. These problems are also listed as failures. The final sums of infeasibilities for the remaining 7 problems were identical to those found by running SNOPT with the Feasible Point option. We conjecture that these problems are infeasible.
SNOPT was unable to solve 30 cases within the allotted 1000 major iterations (problems biggsb1, catena, catenary, chainwoo, chenhark, dixchlng, djtl, eigenbls, eigencls, fletcbv3, genrose, heart6ls, helsby, hydc20ls, maratosb, noncvxu2, noncvxun, nlmsurf, nonmsqrt, orthrgds, palmer5a, palmer5b, palmer5e, palmer7a, pfit2, pfit3ls, pfit4, qr3dls, snake and sparsine). Another 5 problems could not be improved at a non-optimal point: brownbs, hydcar20, meyer3, penalty3 and polak4. SNOPT essentially found the solution of the badly scaled problems brownbs, meyer3 and polak4, but was unable to declare optimality. The unconstrained problem penalty3 was terminated at a point where the objective gradient was 1.2 × 10^-4.
Table 6.20
SNOPT on the fixed-dimension CUTE NC problems: Part I.

No. Problem Mnr Mjr Fcn Obj Con
1 aircrfta 5 3 7 0.000000E+00 3.7E-12
2 aircrftb 48 43 53 4.086853E-17 0.0E+00
3 airport 129 44 84 4.795270E+04 4.7E-11
4 brainpc0 6975 39 43 1.499639E-03 2.5E-12
5 brainpc1 9512 31 35 1.011544E-09 2.1E-09
6 brainpc2 13897 58 77 4.105827E-08 7.6E-13
7 brainpc3 7007 59 63 1.687131E-04 1.8E-12
8 brainpc4 7006 61 65 1.287866E-03 3.4E-13
9 brainpc5 7004 56 60 1.362251E-03 2.5E-12
10 brainpc6 6999 59 63 5.925666E-05 2.9E-13
11 brainpc7 6981 38 44 3.822638E-05 1.6E-13
12 brainpc8 7004 59 74 1.651779E-04 3.1E-12
13 brainpc9 7036 85 116 8.227963E-04 1.1E-13
14 cantilvr 30 25 27 1.339956E+00 1.0E-09
15 coolhans 57 13 29 0.000000E+00 3.7E-13
16 cresc100 108 42 58 5.676027E-01 1.6E-12
17 cresc132 t 2047 1000 3993 6.852576E-01 1.2E-04
18 cresc4 35 12 18 8.718975E-01 1.0E-11
19 cresc50 281 127 413 5.934401E-01 5.6E-11
20 csfi1 25 11 16 -4.907520E+01 1.8E-06
21 csfi2 49 17 23 5.501761E+01 8.0E-09
22 dembo7 196 41 48 1.747870E+02 2.3E-10
23 disc2 56 19 22 1.562500E+00 1.1E-08
24 discs i 1004 20 48 1.200005E+01 4.0E+00
25 dnieper 133 15 29 1.874402E+04 5.9E-08
26 errinbar 171 81 95 2.804526E+01 1.2E-09
27 grouping 44 0 3 1.385040E+01 0.0E+00
28 heart6 517 358 1327 0.000000E+00 9.5E-11
29 heart8 55 33 54 0.000000E+00 2.4E-10
30 himmelbk 142 11 21 5.181434E-02 2.5E-07
31 lakes 410 63 151 3.505247E+05 3.0E-12
32 launch 40 8 15 9.004903E+00 2.3E-13
33 lch 733 426 447 -4.258149E+00 1.0E-13
34 leaknet t 1173 1000 2002 1.391811E+01 3.2E-13
35 methanb8 49 6 14 0.000000E+00 1.2E-07
36 methanl8 58 8 14 0.000000E+00 4.2E-08
37 mribasis 902 854 4177 1.821790E+01 1.3E-04
38 nystrom5 i 852 225 612 0.000000E+00 1.7E-02
39 prodpl0 118 28 55 5.879010E+01 2.6E-09
40 prodpl1 95 8 13 3.573897E+01 2.6E-10
41 reading6 245 78 87 -1.446597E+02 1.2E-11
42 reading7 1217 10 23 -1.291618E+03 1.7E-13
43 reading8 2437 120 243 -2.647934E+03 4.0E-11
44 res 6 0 0 0.000000E+00 0.0E+00
45 robot 140 87 147 5.462841E+00 7.2E-12
46 rotdisc 1828 22 43 7.872068E+00 2.6E-04
47 ssebnln 330 5 12 1.617060E+07 1.1E-01
48 swopf 153 24 35 6.786018E-02 2.0E-11
Table 6.21
SNOPT on the fixed-dimension CUTE NC problems: Part II.

No. Problem Mnr Mjr Fcn Obj Con
49 tenbars1 343 96 130 2.295373E+03 4.1E-11
50 tenbars2 258 98 126 2.277946E+03 5.0E-12
51 tenbars3 463 194 372 2.247129E+03 1.1E-10
52 tenbars4 163 45 66 3.684932E+02 7.6E-08
53 trigger 7 20 49 0.000000E+00 1.6E-06
54 truspyr1 52 34 39 1.122874E+01 8.0E-10
55 truspyr2 188 167 336 1.122874E+01 1.7E-13
56 twobars 10 8 15 1.508652E+00 2.1E-09
Table 6.22
Summary: SNOPT and MINOS on the fixed-dimensioned CUTE NC problems.
MINOS SNOPT
Problems attempted 56 56
Optimal 40 52
Infeasible 2 2
False infeasibility 5 0
Terminated 1 2
False unboundedness 8 0
Major iterations 3193 6094
Minor iterations 53795 96823
Function evaluations 94914 16231
Cpu time (secs) 2635.1 5003.0
MINOS incorrectly identified 12 infeasible problems (eigmaxa, eigmina, fletcher, hvycrash, lootsma, optcdeg3, orbit2, semicon1, vanderm1, vanderm2, vanderm3 and vanderm4), and was unable to solve 6 problems within 2000 major iterations (artif, himmelbd, minc44, oet2, palmer5a and powellsq). MINOS correctly found an unbounded solution for bratu1d, but another 20 problems were incorrectly diagnosed as being unbounded (problems bratu2dt, catena, catenary, dixchlng, dixchlnv, eigenb, eigenc2, eigencco, elattar, flosp2tm, indef, oet6, oet7, orthrds2, orthrega, orthrgds, pfit2, pfit4, s365mod and semicon2). Finally, the 7 problems fletcbv3, gulf, heart6ls, nonmsqrt, s365, spanhyd and dittert could not be improved at a non-optimal point.
If the LC and NC infeasible and unbounded problems are counted as successes, MINOS and SNOPT solved a grand total of 719 and 740, respectively, of the 796 problems attempted. Moreover, SNOPT found a feasible point that was within a factor of 10^-2 of satisfying the optimality tolerance for another 8 cases. This is strong evidence
of the robustness of SQP methods when implemented with an augmented Lagrangian
merit function and elastic variable strategy for treating infeasibility.
Table 6.23
Summary: SNOPT and MINOS on the smooth CUTE problems.
MINOS SNOPT
Problems attempted 796 796
Optimal 706 721
Unbounded 2 3
Infeasible 11 16
Almost optimal 0 8
Cannot be improved 15 6
False infeasibility 21 5
Terminated 11 37
False unboundedness 30 0
Major iterations 31328 74335
Minor iterations 903395 875344
Function evaluations 1641959 135143
Cpu time (secs) 26134.6 30863.1
7. Extensions. Where possible, we have defined the SQP algorithm to be independent of the QP solver. Of course, certain "warm start" features are highly desirable. For example, SQOPT can use a given starting point and working set, and for linearly constrained problems (§5.2) it can accept a known Cholesky factor R for the reduced Hessian.
Here we discuss other "black-box" QP solvers that could be used in future implementations of SNOPT. Recall that active-set methods solve KKT systems of the form

(7.1)        ( Hk  W^T ) ( p )   =   ( g )
             ( W    0  ) ( q )       ( h )

at each minor iteration, where W is the current working-set matrix. Reduced-Hessian methods such as SQOPT are efficient if W is nearly square and products Hk x can be formed efficiently.
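To make the reduced-Hessian mechanics concrete, the following Python sketch (dense linear algebra only, with names of our own choosing; SQOPT itself obtains Z implicitly from a sparse LU factorization and updates R rather than refactorizing) computes one null-space step for (7.1) with h = 0, using Z^T Hk Z = R^T R:

    import numpy as np
    from scipy.linalg import null_space, cholesky, solve_triangular

    def nullspace_step(H, W, g):
        """Reduced-Hessian step: minimize the quadratic model over W p = 0."""
        Z = null_space(W)                  # columns span the null space of W
        R = cholesky(Z.T @ H @ Z)          # upper triangular, Z'HZ = R'R
        y = solve_triangular(R, -Z.T @ g, trans='T')   # solve R'y = -Z'g
        pZ = solve_triangular(R, y)                    # solve R pZ = y
        return Z @ pZ                      # p = Z pZ satisfies W p = 0

    # Small strictly convex example with one working-set row.
    H = np.diag([2.0, 3.0, 4.0])
    W = np.array([[1.0, 1.0, 1.0]])
    g = np.array([1.0, -2.0, 0.5])
    p = nullspace_step(H, W, g)
    print(W @ p)                           # ~ 0: the step stays on the working set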
7.1. Approximate reduced Hessians. As the major iterations converge, the
QP subproblems require fewer changes to their working set, and with warm starts
they eventually solve in one minor iteration. Hence, the work required by SQOPT
becomes dominated by the computation of the reduced Hessian Z^T Hk Z and its factor
R (4.1), especially if there are many degrees of freedom.
For such cases, MINOS could be useful as the QP solver because it has two ways of approximating the reduced Hessian in the form Z^T Hk Z ≈ R^T R:
• R may be input from the previous major iteration and maintained using quasi-Newton updates during the QP minor iterations.
• If R is very large, it is maintained in the form

             R = ( Rr  0 )
                 ( 0   D ),

where Rr is a dense triangle of specified size and D is diagonal. This structure partitions the superbasic variables into two sets. After a few minor iterations involving all superbasics (with quasi-Newton updates to Rr and D), the variables associated with D are temporarily frozen. Iterations proceed with updates to Rr only, and superlinear convergence can be expected within that subspace. A frozen superbasic variable is then interchanged with one from Rr and the process is repeated (a structural sketch follows below).
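A minimal structural sketch of this partitioned factor (our own illustration with a deliberately crude interchange rule; MINOS updates the factors in place and chooses which variables to freeze more carefully):

    import numpy as np

    class PartitionedR:
        """R = [[Rr, 0], [0, D]]: dense triangle Rr for the active
        superbasics, diagonal D for the temporarily frozen ones."""

        def __init__(self, n_active, n_frozen):
            self.Rr = np.eye(n_active)     # dense triangle of specified size
            self.D = np.ones(n_frozen)     # diagonal block

        def as_matrix(self):
            k, m = self.Rr.shape[0], self.D.size
            R = np.zeros((k + m, k + m))
            R[:k, :k] = self.Rr
            R[k:, k:] = np.diag(self.D)
            return R

        def interchange(self, i, j):
            """Swap active superbasic i with frozen superbasic j,
            discarding the off-diagonal information in row/column i."""
            di = self.Rr[i, i]
            self.Rr[i, :] = 0.0
            self.Rr[:, i] = 0.0
            self.Rr[i, i] = self.D[j]
            self.D[j] = di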
Both of these features could be implemented in a future version of SQOPT. Thus,
SNOPT with MINOS or an enhanced SQOPT as the QP solver would provide a viable
SQP algorithm for optimization problems of arbitrary dimension. The cost per minor
iteration is controllable, and the only unpredictable quantity is the total number of
minor iterations.
Note that the SQP updates to Hk could be applied to R between major iterations as for the linear-constraint case (§5.2). However, the quasi-Newton updates during the first few minor iterations of each QP should achieve a similar effect.
7.2. Range-space methods. If all variables appear nonlinearly, Hk is positive definite. A "range-space" approach could then be used to solve systems (7.1) as W changes. This amounts to maintaining factors of Hk's Schur complement, S = W Hk^-1 W^T. It would be efficient if W did not have many rows, so that S could be treated as a dense matrix.
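A dense Python sketch of the underlying linear algebra (illustration only, with our own function names; in practice the factors of Hk and S would be maintained as W changes, not recomputed): eliminating p from (7.1) gives S q = W Hk^-1 g - h, after which p = Hk^-1 (g - W^T q).

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def range_space_solve(H, W, g, h):
        """Solve (7.1) via the Schur complement S = W H^{-1} W^T
        (requires H positive definite)."""
        cH = cho_factor(H)
        S = W @ cho_solve(cH, W.T)               # dense Schur complement
        q = np.linalg.solve(S, W @ cho_solve(cH, g) - h)
        p = cho_solve(cH, g - W.T @ q)
        return p, q

    H = np.diag([2.0, 3.0, 4.0])
    W = np.array([[1.0, 0.0, 1.0]])
    g = np.array([1.0, -1.0, 2.0])
    h = np.array([0.5])
    p, q = range_space_solve(H, W, g, h)
    print(H @ p + W.T @ q - g, W @ p - h)        # both residuals ~ 0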
7.3. Schur-complement methods. For limited-memory Hessians of the form Hk = H0 + V D V^T, where H0 is some convenient Hessian approximation, D = diag(I, -I) = D^-1, and V contains the BFGS update vectors, (7.1) is equivalent to

             ( H0   W^T   V  ) ( p )       ( g )
             ( W             ) ( q )   =   ( h )
             ( V^T       -D  ) ( r )       ( 0 )
Following [24, §3.6.2], if we define

             K0 = ( H0  W^T )
                  ( W       ),

it would be efficient to work with a sparse factorization of K0 and dense factors of its Schur complement S. (For a given QP subproblem, V is constant, but changes to W would be handled by appropriate updates to S.)
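The bordered solve can be sketched as follows (a dense stand-in for illustration, with names of our own; a sparse factorization of K0 would replace the dense solves, and S would be updated rather than rebuilt):

    import numpy as np

    def schur_kkt_solve(H0, W, V, D, g, h):
        """Solve the bordered system above: K0 holds (H0, W), and the
        columns of V (with block -D) border it."""
        n, m = H0.shape[0], W.shape[0]
        K0 = np.block([[H0, W.T], [W, np.zeros((m, m))]])
        U = np.vstack([V, np.zeros((m, V.shape[1]))])
        y0 = np.linalg.solve(K0, np.concatenate([g, h]))
        K0_inv_U = np.linalg.solve(K0, U)
        S = -D - U.T @ K0_inv_U                  # Schur complement of K0
        r = np.linalg.solve(S, -U.T @ y0)
        y = y0 - K0_inv_U @ r
        return y[:n], y[n:], r                   # p, q, r

    # Consistency check: the step solves (7.1) with Hk = H0 + V D V^T.
    rng = np.random.default_rng(0)
    n, m, k = 5, 2, 2
    H0, D = np.eye(n), np.diag([1.0, -1.0])
    W, V = rng.standard_normal((m, n)), rng.standard_normal((n, k))
    g, h = rng.standard_normal(n), rng.standard_normal(m)
    p, q, r = schur_kkt_solve(H0, W, V, D, g, h)
    Hk = H0 + V @ D @ V.T
    print(Hk @ p + W.T @ q - g, W @ p - h)       # both ~ 0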
This bordered approach has been explored by Betts and Frank [2, §5] with H0 = I (or possibly a sparse finite-difference Hessian approximation). As part of an SQP algorithm, its practical success depends greatly on the definition of H0 and the BFGS updates that define V. Our experience with SNOPT emphasizes the importance of ensuring positive definiteness in Hk; hence the precautions of §2.9.
If H0 were defined as in §3, the major iterates would be identical to those currently obtained with SQOPT.
8. Summary and conclusions. We have presented theoretical and practical
details about an SQP algorithm for solving nonlinear programs with large numbers of
constraints and variables, where the nonlinear functions are smooth and first derivatives are available.
As with interior-point methods, the most promising way to achieve efficiency with the linear algebra is to work with sparse second derivatives (i.e., an exact Hessian of the Lagrangian, or a sparse finite-difference approximation). However, indefinite QP subproblems raise many practical questions, and alternatives are needed when second derivatives are not available.
The present implementation, SNOPT, uses a positive definite quasi-Newton Hessian approximation Hk. If the number of nonlinear variables is moderate, Hk is stored as a dense matrix. Otherwise, limited-memory BFGS updates are employed, with resets to the current diagonal at a specified frequency (typically every 20 major iterations). An augmented Lagrangian merit function (the same as in NPSOL) ensures convergence from arbitrary starting points.
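The limited-memory scheme can be sketched as follows (our own simplified illustration of the Hk = H0 + V D V^T form of §7.3, omitting SNOPT's safeguards of §2.9; the O(n^2) diagonal extraction at a reset is for clarity only):

    import numpy as np

    class LimitedMemoryHessian:
        """Hk = H0 + V D V^T, with H0 diagonal and D = diag(+1,...,-1,...).
        Each BFGS update appends one +1 and one -1 column; every
        reset_freq major iterations the pairs are flushed into H0."""

        def __init__(self, n, reset_freq=20):
            self.d = np.ones(n)            # diagonal H0
            self.cols, self.signs = [], [] # columns of V, entries of D
            self.reset_freq, self.majors = reset_freq, 0

        def mult(self, x):
            hx = self.d * x
            for v, sgn in zip(self.cols, self.signs):
                hx += sgn * (v @ x) * v
            return hx                      # Hk x

        def update(self, s, y):
            """BFGS: Hk + y y'/(y's) - (Hk s)(Hk s)'/(s'Hk s); needs y's > 0."""
            Hs = self.mult(s)
            self.cols.append(y / np.sqrt(y @ s));   self.signs.append(+1.0)
            self.cols.append(Hs / np.sqrt(s @ Hs)); self.signs.append(-1.0)

        def end_major(self):
            self.majors += 1
            if self.majors % self.reset_freq == 0:
                n = self.d.size            # reset to the current diagonal
                self.d = np.array([self.mult(e)[i]
                                   for i, e in enumerate(np.eye(n))])
                self.cols, self.signs = [], []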
The present QP solver, SQOPT, maintains a dense reduced-Hessian factorization Z^T Hk Z = R^T R, where Z is obtained from a sparse LU factorization of part of the Jacobian. Efficiency improves with the number of constraints active at a solution; i.e., the number of degrees of freedom nZ should not be excessive (say, less than 1200). This is true for many important problems, such as trajectory optimization and process control. It is likely to be true for most control problems because the number of control variables is usually small compared to the number of state variables.
The numerical results of §6 confirm that SNOPT is efficient and reliable on several sets of such problems. Relative to the dense SQP solver NPSOL, the sparse-matrix techniques have produced speedups of over 50 on the larger optimal trajectory examples, and comparable reliability. Comparisons with MINOS demonstrate greater efficiency if the function evaluations are expensive, and greater reliability in general as a result of the merit function and the "elastic variables" treatment of infeasibility.
Future work will include alternative QP solvers to allow for many degrees of
freedom.
Acknowledgements. We extend sincere thanks to our colleagues Dan Young
and Rocky Nelson of McDonnell Douglas Space Systems, Huntington Beach, Califor-
nia, for their constant support during the development of SNOPT.
REFERENCES
[1] R. H. Bartels, A penalty linear programming method using reduced-gradient basis-exchange techniques, Linear Algebra Appl., 29 (1980), pp. 17–32.
[2] J. T. Betts and P. D. Frank, A sparse nonlinear optimization algorithm, J. Optim. Theory and Applics., 82 (1994), pp. 519–541.
[3] L. T. Biegler, J. Nocedal, and C. Schmid, A reduced Hessian method for large-scale constrained optimization, SIAM J. Optim., 5 (1995), pp. 314–347.
[4] P. T. Boggs and J. W. Tolle, An implementation of a quasi-Newton method for constrained optimization, Technical Report 81-3, University of North Carolina at Chapel Hill, 1981.
[5] P. T. Boggs, J. W. Tolle, and P. Wang, On the local convergence of quasi-Newton methods for constrained optimization, SIAM J. Control Optim., 20 (1982), pp. 161–171.
[6] I. Bongartz, A. R. Conn, N. I. M. Gould, M. A. Saunders, and P. L. Toint, A numerical comparison between the LANCELOT and MINOS packages for large-scale constrained optimization, report, 1997. To appear.
[7] ———, A numerical comparison between the LANCELOT and MINOS packages for large-scale constrained optimization: the complete numerical results, report, 1997. To appear.
[8] I. Bongartz, A. R. Conn, N. I. M. Gould, and P. L. Toint, CUTE: Constrained and unconstrained testing environment, ACM Trans. Math. Software, 21 (1995), pp. 123–160.
[9] A. Buckley and A. LeNir, QN-like variable storage conjugate gradients, Math. Prog., 27 (1983), pp. 155–175.
[10] ———, BBVSCG: a variable storage algorithm for function minimization, ACM Trans. Math. Software, 11 (1985), pp. 103–119.
[11] R. H. Byrd and J. Nocedal, An analysis of reduced Hessian methods for constrained optimization, Math. Prog., 49 (1991), pp. 285–323.
[12] A. R. Conn, Constrained optimization using a nondifferentiable penalty function, SIAM J. Numer. Anal., 10 (1973), pp. 760–779.
[13] ———, Linear programming via a nondifferentiable penalty function, SIAM J. Numer. Anal., 13 (1976), pp. 145–154.
[14] A. R. Conn, N. I. M. Gould, and P. L. Toint, LANCELOT: a Fortran package for large-scale nonlinear optimization (Release A), Lecture Notes in Computational Mathematics 17, Springer Verlag, Berlin, Heidelberg, New York, London, Paris and Tokyo, 1992. ISBN 3-540-55470-X.
[15] J. E. Dennis, Jr. and R. B. Schnabel, A new derivation of symmetric positive definite secant updates, in Nonlinear Programming 4, O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds., Academic Press, London and New York, 1981, pp. 167–199.
[16] A. Drud, CONOPT: A GRG code for large sparse dynamic nonlinear optimization problems, Math. Prog., 31 (1985), pp. 153–191.
[17] S. K. Eldersveld, Large-scale sequential quadratic programming algorithms, PhD thesis, Department of Operations Research, Stanford University, Stanford, CA, 1991.
[18] R. Fletcher, An ℓ1 penalty method for nonlinear constraints, in Numerical Optimization 1984, P. T. Boggs, R. H. Byrd, and R. B. Schnabel, eds., SIAM, Philadelphia, 1985, pp. 26–40.
[19] ———, Practical Methods of Optimization, John Wiley and Sons, Chichester, New York, Brisbane, Toronto and Singapore, second ed., 1987.
[20] J. C. Gilbert and C. Lemaréchal, Some numerical experiments with variable-storage quasi-Newton algorithms, Math. Prog., 45 (1989), pp. 407–435.
[21] P. E. Gill, G. H. Golub, W. Murray, and M. A. Saunders, Methods for modifying matrix factorizations, Math. Comput., 28 (1974), pp. 505–535.
[22] P. E. Gill and W. Murray, The computation of Lagrange multiplier estimates for constrained minimization, Math. Prog., 17 (1979), pp. 32–60.
[23] P. E. Gill, W. Murray, and M. A. Saunders, SQOPT: An algorithm for large-scale quadratic programming. To appear.
[24] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, Sparse matrix methods in optimization, SIAM J. on Scientific and Statistical Computing, 5 (1984), pp. 562–589.
[25] ———, User's guide for NPSOL (Version 4.0): a Fortran package for nonlinear programming, Report SOL 86-2, Department of Operations Research, Stanford University, Stanford, CA, 1986.
[26] ———, Maintaining LU factors of a general sparse matrix, Linear Algebra Appl., 88/89 (1987), pp. 239–270.
[27] ———, Inertia-controlling methods for general quadratic programming, SIAM Review, 33 (1991), pp. 1–36.
[28] ———, Some theoretical properties of an augmented Lagrangian merit function, in Advances in Optimization and Parallel Computing, P. M. Pardalos, ed., North Holland, 1992, pp. 101–128.
[29] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London and New York, 1981. ISBN 0-12-283952-8.
[30] D. Goldfarb, Factorized variable metric methods for unconstrained optimization, Math. Comput., 30 (1976), pp. 796–811.
[31] S. P. Han, Superlinearly convergent variable metric algorithms for general nonlinear programming problems, Math. Prog., 11 (1976), pp. 263–282.
[32] C. R. Hargraves and S. W. Paris, Direct trajectory optimization using nonlinear programming and collocation, J. of Guidance, Control, and Dynamics, 10 (1987), pp. 338–348.
[33] ———, OTIS: Optimal Trajectories by Implicit Integration, Boeing Aerospace Company, Contract No. F33615-85-c-3009, 1988.
[34] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems 187, Springer Verlag, Berlin, Heidelberg and New York, 1981.
[35] M. Lalee, J. Nocedal, and T. Plantenga, On the implementation of an algorithm for large-scale equality constrained optimization, manuscript, 1995.
[36] W. Murray, Sequential quadratic programming methods for large-scale problems, J. Comput. Optim. Appl., 7 (1997), pp. 127–142.
[37] W. Murray and F. J. Prieto, A sequential quadratic programming algorithm using an incomplete solution of the subproblem, SIAM J. Optim., 5 (1995), pp. 590–640.
[38] ———, A second-derivative method for nonlinearly constrained optimization. To appear, 1997.
[39] B. A. Murtagh and M. A. Saunders, Large-scale linearly constrained optimization, Math. Prog., 14 (1978), pp. 41–72.
[40] ———, A projected Lagrangian algorithm and its implementation for sparse nonlinear constraints, Math. Prog. Study, 16 (1982), pp. 84–117.
[41] ———, MINOS 5.4 User's Guide, Report SOL 83-20R, Department of Operations Research, Stanford University, Stanford, CA, revised 1995.
[42] E. O. Omojokun, Trust region algorithms for nonlinear equality and inequality constraints, PhD thesis, Department of Computer Science, University of Colorado, Boulder, 1989.
[43] T. Plantenga, A trust region method for nonlinear programming based on primal interior-point techniques, manuscript, 1996.
[44] M. J. D. Powell, Algorithms for nonlinear constraints that use Lagrangian functions, Math. Prog., 14 (1978), pp. 224–248.
[45] ———, The convergence of variable metric methods for nonlinearly constrained optimization calculations, in Nonlinear Programming 3, O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds., Academic Press, London and New York, 1978, pp. 27–63.
[46] ———, Variable metric methods for constrained optimization, in Mathematical Programming: The State of the Art, A. Bachem, M. Grötschel, and B. Korte, eds., Springer Verlag, London, Heidelberg, New York and Tokyo, 1983, pp. 288–311.
[47] S. M. Robinson, A quadratically-convergent algorithm for general nonlinear programming problems, Math. Prog., 3 (1972), pp. 145–156.
[48] K. Schittkowski, NLPQL: A Fortran subroutine for solving constrained nonlinear programming problems, Ann. Oper. Res., 11 (1985/1986), pp. 485–500.
[49] R. A. Tapia, A stable approach to Newton's method for general mathematical programming problems in R^n, J. Optim. Theory and Applics., 14 (1974), pp. 453–476.
[50] I.-B. Tjoa and L. T. Biegler, Simultaneous solution and optimization strategies for parameter estimation of differential algebraic equation systems, Ind. Eng. Chem. Res., 30 (1991), pp. 376–385.
[51] G. Van der Hoek, Asymptotic properties of reduction methods applying linearly equality constrained reduced problems, Math. Prog. Study, 16 (1982), pp. 162–189.
[52] C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal, L-BFGS-B: FORTRAN subroutines for large-scale bound constrained optimization, preprint, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, December 1994.