SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization
We assume that the nonlinear functions are smooth and that their first derivatives
are available (and possibly expensive to evaluate). In the present implementation we
assume that the number of active constraints at a solution is reasonably close to n. In
other words, the number of degrees of freedom is not too large (say, less than 1000).
Important examples are control problems such as those arising in optimal trajec-
tory calculations. For several years, the optimal trajectory system OTIS (Hargraves
and Paris [32]) has been applied successfully within the aerospace industry, using
NPSOL to solve the associated optimization problems. Although NPSOL has solved
examples with over a thousand variables and constraints, it was not designed for
large problems with sparse constraint derivatives. (The Jacobian of c(x) is treated as
a dense matrix.) Our aim here is to describe an SQP method that has the favorable
theoretical properties of the NPSOL algorithm, but is suitable for large sparse prob-
lems such as those arising in trajectory calculations. The implementation is called
SNOPT (Sparse Nonlinear Optimizer).
1.2. Infeasible constraints. SNOPT makes explicit allowance for infeasible
constraints. Infeasible linear constraints are detected first by solving a problem of
the form
\[
\mbox{FLP}\qquad
\begin{array}{ll}
\displaystyle\mathop{\mathrm{minimize}}_{x,\,v,\,w} & e^T(v+w)\\[4pt]
\mbox{subject to} & l \le \begin{pmatrix} x \\ Ax - v + w \end{pmatrix} \le u,
\qquad v \ge 0,\ w \ge 0,
\end{array}
\]
where e is a vector of ones. This is equivalent to minimizing the one-norm of the gen-
eral linear constraint violations subject to the simple bounds. (In the linear program-
ming literature, the approach is often called elastic programming. Other algorithms
based on minimizing one-norms of infeasibilities are given by Conn [13] and Bartels
[1].)
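The feasibility phase FLP is an ordinary linear program, so its behavior is easy to reproduce with any LP solver. Below is a minimal sketch using scipy.optimize.linprog in place of SNOPT's own LP machinery; the matrix A, the bounds, and the dimensions are illustrative assumptions, chosen so that the linear constraints are infeasible.

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of problem FLP: minimize e^T(v + w) subject to
# l <= (x, Ax - v + w) <= u, v >= 0, w >= 0.  All data are made up.
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
m, n = A.shape
lx, ux = np.full(n, -1.0), np.full(n, 1.0)            # bounds on x
lA, uA = np.array([3.0, 0.0]), np.array([4.0, 0.5])   # bounds on Ax (row 1 unreachable)

# Stack the variables as z = (x, v, w); the objective is e^T v + e^T w.
cvec = np.concatenate([np.zeros(n), np.ones(m), np.ones(m)])

# Range constraints lA <= Ax - v + w <= uA, written as two one-sided inequalities.
G = np.hstack([A, -np.eye(m), np.eye(m)])             # G z = Ax - v + w
A_ub = np.vstack([G, -G])
b_ub = np.concatenate([uA, -lA])

bounds = [(lx[j], ux[j]) for j in range(n)] + [(0, None)] * (2 * m)
res = linprog(cvec, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
x, v, w = res.x[:n], res.x[n:n + m], res.x[n + m:]
print("minimum one-norm of violations:", res.fun)     # positive => infeasible
```

A strictly positive optimal value signals infeasible linear constraints, in which case SNOPT terminates without ever evaluating f or c.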
If the linear constraints are infeasible (v ≠ 0 or w ≠ 0), SNOPT terminates
without computing the nonlinear functions. Otherwise, all subsequent iterates satisfy
the linear constraints. (As with NPSOL, such a strategy allows linear constraints to
be used to define a region in which f and c can be safely evaluated.)
SNOPT then proceeds to solve NP as given, using QP subproblems based on
linearizations of the nonlinear constraints. If a QP subproblem proves to be infeasible
or unbounded (or if the Lagrange multiplier estimates for the nonlinear constraints
become large), SNOPT enters "nonlinear elastic" mode and solves the problem
\[
\mbox{NP}(\gamma)\qquad
\begin{array}{ll}
\displaystyle\mathop{\mathrm{minimize}}_{x,\,v,\,w} & f(x) + \gamma\, e^T(v+w)\\[4pt]
\mbox{subject to} & l \le \begin{pmatrix} x \\ c(x) - v + w \\ Ax \end{pmatrix} \le u,
\qquad v \ge 0,\ w \ge 0,
\end{array}
\]
where γ is a nonnegative penalty parameter and f(x) + γ e^T(v + w) is called a composite objective. If NP has a feasible solution and γ is sufficiently large, the solutions to NP and NP(γ) are identical. If NP has no feasible solution, NP(γ) will tend to determine a "good" infeasible point if γ is again sufficiently large. (If γ were infinite, the nonlinear constraint violations would be minimized subject to the linear constraints and bounds.) A similar ℓ₁ formulation of NP is fundamental to the Sℓ₁QP algorithm of Fletcher [18]. See also Conn [12].
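A one-variable example makes the role of γ concrete. The toy problem below, which is our own illustration rather than anything from SNOPT, minimizes x² subject to x ≥ 2 and replaces the constraint by the composite objective x² + γ max(0, 2 − x). The elastic minimizer is x = γ/2 while γ < 4, and coincides with the constrained solution x* = 2 once γ ≥ 4, matching the claim that NP and NP(γ) agree for sufficiently large γ.

```python
from scipy.optimize import minimize_scalar

# Toy elastic problem: minimize x^2 subject to x >= 2, with the constraint
# replaced by the penalty term gamma * max(0, 2 - x).
def composite(x, gamma):
    return x**2 + gamma * max(0.0, 2.0 - x)

for gamma in [1.0, 2.0, 4.0, 8.0]:
    res = minimize_scalar(lambda x: composite(x, gamma),
                          bounds=(-10.0, 10.0), method="bounded")
    print(f"gamma = {gamma:4.1f}  x = {res.x:.4f}  "
          f"violation = {max(0.0, 2.0 - res.x):.4f}")
```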
1.3. Other work on large-scale SQP. There has been considerable interest
elsewhere in extending SQP methods to the large-scale case. Some of this work
has focused on problems with nonlinear equality constraints. The method of Lalee,
Nocedal and Plantenga [35], somewhat related to the trust-region method of Byrd
and Omojokun [42], uses either the exact Lagrangian Hessian or a limited-memory
quasi-Newton approximation defined by the method of Zhu et al. [52]. The method
of Biegler, Nocedal and Schmid [3] is in the class of reduced-Hessian methods, which
maintain a dense approximation to the reduced Hessian, using quasi-Newton updates.
For large problems with general inequality constraints as in Problem NP, SQP
methods have been proposed by Eldersveld [17], Tjoa and Biegler [50], and Betts and
Frank [2]. The first two approaches are also reduced-Hessian methods. In [17], a
full but structured Hessian approximation is formed from the reduced Hessian. The
implementation LSSQP solves the same class of problems as SNOPT. In [50], the
QP subproblems are solved by eliminating variables using the (linearized) equality
constraints. The remaining variables are optimized using a dense QP algorithm.
Bounds on the eliminated variables become dense constraints in the reduced QP. The
method is efficient for problems whose constraints are mainly nonlinear equalities,
with few bounds on the variables. In contrast, the method of Betts and Frank uses
the exact Lagrangian Hessian or a finite-difference approximation, and since the QP
solver works with sparse KKT factorizations (see §7), the method is not restricted to
problems with few degrees of freedom.
1.4. Other large-scale methods. Two existing packages, MINOS [39, 40, 41]
and CONOPT [16], are designed for large problems with a modest number of degrees of
freedom. MINOS uses a projected Lagrangian or sequential linearly constrained (SLC)
method, whose subproblems require frequent evaluation of the problem functions.
CONOPT uses a generalized reduced gradient (GRG) method, which maintains near-
feasibility with respect to the nonlinear constraints, again at the expense of many
function evaluations. SNOPT is likely to outperform MINOS and CONOPT when the
functions (and their derivatives) are expensive to evaluate. Relative to MINOS, an
added advantage is the existence of a merit function to ensure global convergence.
This is especially important when the constraints are highly nonlinear.
LANCELOT Release A [14] is another widely used package in the area of large-
scale constrained optimization. It uses a sequential augmented Lagrangian (SAL)
method. All constraints other than simple bounds are included in an augmented La-
grangian function, which is minimized subject to the bounds. In general, LANCELOT
is recommended for large problems with many degrees of freedom. It complements
SNOPT and the other methods discussed above. A comparison between LANCELOT
and MINOS has been made in [6, 7].
2. The SQP iteration. Here we discuss the main features of an SQP method
for solving a generic nonlinear program. All features are readily specialized to the
more general constraints in Problem NP.
2.1. The generic problem. In this section we take the problem to be
\[
\mbox{GNP}\qquad
\begin{array}{ll}
\displaystyle\mathop{\mathrm{minimize}}_{x} & f(x)\\[2pt]
\mbox{subject to} & c(x) \ge 0,
\end{array}
\]
where x ∈ ℝⁿ, c ∈ ℝᵐ, and the functions f(x) and c_i(x) have continuous second derivatives. The gradient of f is denoted by the vector g(x), and the gradients of each element of c form the rows of the Jacobian matrix J(x).
where ŝ_k is a vector of slack variables for the linearized constraints. In this form, (x̂_k, π̂_k, ŝ_k) can be regarded as estimates of (x*, π*, s*), where the nonnegative variables s* satisfy c(x*) − s* = 0. The vector ŝ_k is needed explicitly for the line search (see §2.7).
2.5. The working-set matrix W_k. The working set is an important quantity for both the major and the minor iterations. It is the current estimate of the set of constraints that are binding at a solution. More precisely, suppose that GQP_k has just been solved. Although we try to regard the QP solver as a "black box", we normally expect it to return an independent set of constraints that are active at the QP solution. This is an optimal working set for subproblem GQP_k.
The same constraint indices define a working set for GNP (and for subproblem GQP_{k+1}). The corresponding gradients form the rows of the working-set matrix W_k, an n_Y × n full-rank submatrix of the Jacobian J(x_k).
2.6. The null-space matrix Z_k. Let Z_k be an n × n_Z full-rank matrix that spans the null space of W_k. (Thus, n_Z = n − n_Y and W_k Z_k = 0.) The QP solver will often return Z_k as part of some matrix factorization. For example, in NPSOL it is part of an orthogonal factorization of W_k, while in LSSQP [17] (and in the current SNOPT) it is defined from a sparse LU factorization of part of W_k. In any event, Z_k is useful for theoretical discussions, and its column dimension has strong practical implications. Important quantities are the reduced Hessian Z_k^T H_k Z_k and the reduced gradient Z_k^T g.
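These quantities are easy to visualize with dense linear algebra. The sketch below builds Z from an orthogonal (QR) factorization of W^T, as NPSOL does; SNOPT's own Z comes from a sparse LU factorization of part of W and is never stored as an explicit dense matrix. The data are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, nY = 6, 2
W = rng.standard_normal((nY, n))            # working-set matrix, full row rank

Q, _ = np.linalg.qr(W.T, mode="complete")   # n x n orthogonal factor
Z = Q[:, nY:]                               # n x nZ, columns span null(W)

B = rng.standard_normal((n, n))
H = B @ B.T + n * np.eye(n)                 # positive-definite Hessian approximation
g = rng.standard_normal(n)                  # objective gradient

print("||W Z|| =", np.linalg.norm(W @ Z))   # ~ 0: Z spans the null space
print("reduced Hessian Z'HZ:", (Z.T @ H @ Z).shape)
print("reduced gradient Z'g:", (Z.T @ g).shape)
```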
2.7. The merit function. Once the QP solution (x̂_k, π̂_k, ŝ_k) has been determined, new estimates of the GNP solution are computed using a line search on the augmented Lagrangian merit function
\[
(2.3)\qquad
M(x,\pi,s) = f(x) - \pi^T\bigl(c(x)-s\bigr)
  + \tfrac{1}{2}\bigl(c(x)-s\bigr)^T D\,\bigl(c(x)-s\bigr),
\]
where D is a diagonal matrix of penalty parameters. If (x_k, π_k, s_k) are the current estimates of (x*, π*, s*), the line search determines a step length α_k (0 < α_k ≤ 1) such that the new point
\[
(2.4)\qquad
\begin{pmatrix} x_{k+1} \\ \pi_{k+1} \\ s_{k+1} \end{pmatrix}
=
\begin{pmatrix} x_k \\ \pi_k \\ s_k \end{pmatrix}
+ \alpha_k
\begin{pmatrix} \hat{x}_k - x_k \\ \hat{\pi}_k - \pi_k \\ \hat{s}_k - s_k \end{pmatrix}
\]
gives a sufficient decrease in the merit function (2.3). Let φ_k(α) denote the merit function computed at the point (x_k + α(x̂_k − x_k), π_k + α(π̂_k − π_k), s_k + α(ŝ_k − s_k)),
i.e., φ_k(α) defines M as a univariate function of the step length. Initially D is zero (for k = 0). When necessary, the penalties in D are increased by the minimum-norm perturbation that ensures sufficient descent for φ_k(α) [28]. (Note: As in NPSOL, s_{k+1} in (2.4) is redefined to minimize the merit function as a function of s, prior to the solution of GQP_{k+1}. For more details, see [25, 17].)
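A minimal sketch of the merit function (2.3) and of φ_k(α) follows, with placeholder problem data; cur and qp stand for the current point (x_k, π_k, s_k) and the QP solution (x̂_k, π̂_k, ŝ_k).

```python
import numpy as np

def merit(x, pi, s, f, c, D):
    # Augmented Lagrangian merit function (2.3); D holds the penalties.
    r = c(x) - s
    return f(x) - pi @ r + 0.5 * r @ (D * r)

def phi(alpha, cur, qp, f, c, D):
    # Merit function along the search direction, as a function of alpha.
    x = cur[0] + alpha * (qp[0] - cur[0])
    pi = cur[1] + alpha * (qp[1] - cur[1])
    s = cur[2] + alpha * (qp[2] - cur[2])
    return merit(x, pi, s, f, c, D)

# Tiny example: f(x) = x'x with one constraint c(x) = x1 + x2 - 1 >= 0.
f = lambda x: x @ x
c = lambda x: np.array([x[0] + x[1] - 1.0])
D = np.array([2.0])                                   # penalty parameter
cur = (np.zeros(2), np.zeros(1), np.zeros(1))         # (x_k, pi_k, s_k)
qp = (np.array([0.5, 0.5]), np.array([1.0]), np.array([0.0]))
for alpha in (0.0, 0.5, 1.0):
    print(f"phi({alpha}) = {phi(alpha, cur, qp, f, c, D):.4f}")
```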
In the line search, for some vector β > 0 the condition
\[
(2.5)\qquad c(x_k + \alpha_k p_k) \ge -\beta
\]
is enforced. (We use β_i = τ max{1, −c_i(x_0)}, where τ is a specified constant, e.g., τ = 10.) This defines a region in which the objective is expected to be defined and bounded below. Murray and Prieto [38] show that under certain conditions, convergence can be assured if the line search enforces (2.5). If the objective is bounded below in ℝⁿ, then β may be any positive vector.
If α_k is essentially zero (because ‖p_k‖ is very large), the objective is considered "unbounded" in the expanded region. Elastic mode is entered (or continued) as described in §4.3.
2.8. The approximate Hessian. As suggested by Powell [44], we maintain a positive-definite approximate Hessian H_k. On completion of the line search, let the change in x and the gradient of the modified Lagrangian be
\[
\delta_k = x_{k+1} - x_k \quad\mbox{and}\quad
y_k = \nabla L(x_{k+1}, x_k, \lambda) - \nabla L(x_k, x_k, \lambda),
\]
for some vector λ. An estimate of the curvature of the modified Lagrangian along δ_k is incorporated using the BFGS quasi-Newton update,
\[
H_{k+1} = H_k + \theta_k y_k y_k^T - \phi_k q_k q_k^T,
\]
where q_k = H_k δ_k, θ_k = 1/(y_k^T δ_k) and φ_k = 1/(q_k^T δ_k). When H_k is positive definite, H_{k+1} is positive definite if and only if the approximate curvature y_k^T δ_k is positive. The consequences of a negative or small value of y_k^T δ_k are discussed in the next section.
There are several choices for λ, including the QP multipliers π̂_{k+1} and least-squares multipliers λ_k (see, e.g., [22]). Here we use the updated multipliers π_{k+1} from the line search, because they are responsive to short steps in the search and they are available at no cost. The definition of L (2.2) yields
\[
y_k = \nabla L(x_{k+1}, x_k, \pi_{k+1}) - \nabla L(x_k, x_k, \pi_{k+1})
    = g(x_{k+1}) - \bigl(J(x_{k+1}) - J(x_k)\bigr)^T \pi_{k+1} - g(x_k).
\]
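The update itself is a few lines of linear algebra. The sketch below applies the formula with the scalars θ_k and φ_k defined above, guarded by a simple positive-curvature test (the production rule is (2.6) in the next section), and checks the secant condition H_{k+1} δ_k = y_k on random data.

```python
import numpy as np

def bfgs_update(H, delta, y):
    # H_{k+1} = H_k + theta*y y' - phi*q q', with q = H delta,
    # theta = 1/(y'delta), phi = 1/(q'delta); skip if curvature not positive.
    yd = y @ delta
    if yd <= 0.0:
        return H
    q = H @ delta
    return H + np.outer(y, y) / yd - np.outer(q, q) / (q @ delta)

rng = np.random.default_rng(1)
n = 4
H = np.eye(n)
delta = rng.standard_normal(n)
y = rng.standard_normal(n)
y = y if y @ delta > 0 else -y      # force positive curvature for the demo
H1 = bfgs_update(H, delta, y)
print("secant condition holds:", np.allclose(H1 @ delta, y))
print("H1 positive definite:", np.all(np.linalg.eigvalsh(H1) > 0))
```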
2.9. Maintaining positive-definiteness. Since the Hessian of the modified Lagrangian need not be positive definite at a local minimizer, the approximate curvature y_k^T δ_k can be negative or very small at points arbitrarily close to (x*, π*). The curvature is considered not sufficiently positive if
\[
(2.6)\qquad y_k^T \delta_k < \sigma_k, \qquad
\sigma_k = \alpha_k (1-\eta)\, p_k^T H_k p_k,
\]
where η is a preassigned constant (0 < η < 1) and p_k is the search direction x̂_k − x_k defined by the QP subproblem. In such cases, if there are nonlinear constraints, two attempts are made to modify the update: the first modifying δ_k and y_k, the second modifying only y_k. If neither modification provides sufficiently positive approximate curvature, no update is made.
First modification. First, we define a new point z_k and evaluate the nonlinear functions there to obtain new values for δ_k and y_k:
\[
\delta_k = x_{k+1} - z_k, \qquad
y_k = \nabla L(x_{k+1}, x_k, \pi_{k+1}) - \nabla L(z_k, x_k, \pi_{k+1}).
\]
We choose z_k = x_k + α_k(x̄_k − x_k), where x̄_k is the first feasible iterate found for problem GQP_k (see §4).
The purpose of this modification is to exploit the properties of the reduced Hessian at a local minimizer of GNP. With this choice of z_k, δ_k = x_{k+1} − z_k = α_k p_N, where p_N is the vector x̂_k − x̄_k. Then,
\[
y_k^T \delta_k = \alpha_k\, y_k^T p_N \approx
\alpha_k^2\, p_N^T\, \nabla^2 L(x_k, x_k, \pi_k)\, p_N,
\]
and y_k^T δ_k approximates the curvature along p_N. If Ŵ_k, the final working set of problem GQP_k, is also the working set at x̄_k, then Ŵ_k p_N = 0 and it follows that y_k^T δ_k approximates the curvature for the reduced Hessian, which must be positive semidefinite at a minimizer of GNP.
The assumption that the QP working set does not change once z_k is known is always justified for problems with equality constraints (see Byrd and Nocedal [11] for a similar scheme in this context). With inequality constraints, we observe that Ŵ_k p_N ≈ 0, particularly during later major iterations, when the working set has settled down.
This modification exploits the fact that SNOPT maintains feasibility with respect to any linear constraints in GNP. (Such a strategy allows linear constraints to be used to define a region in which f and c can be safely evaluated.) Although an additional function evaluation is required at z_k, we have observed that even when the Hessian of the Lagrangian has negative eigenvalues at a solution, the modification is rarely needed more than a few times if used in conjunction with the augmented Lagrangian modification discussed next.
Second modification. If (x_k, π_k) is not close to (x*, π*), the modified approximate curvature y_k^T δ_k may not be sufficiently positive and a second modification may be necessary. We choose Δy_k so that (y_k + Δy_k)^T δ_k = σ_k (if possible), and redefine y_k as y_k + Δy_k. This approach was first suggested by Powell [45], who proposed redefining y_k as a linear combination of y_k and H_k δ_k.
To obtain Δy_k, we consider the augmented modified Lagrangian [40]:
\[
(2.7)\qquad
L_A(x, x_k, \pi_k) = f(x) - \pi_k^T d_L(x, x_k)
  + \tfrac{1}{2}\, d_L(x, x_k)^T\, \Omega\, d_L(x, x_k),
\]
where Ω is a matrix of parameters to be determined: Ω = diag(ω_i), ω_i ≥ 0, i = 1, ..., m. The perturbation
\[
\Delta y_k = \bigl(J(x_{k+1}) - J(x_k)\bigr)^T \Omega\, d_L(x_{k+1}, x_k)
\]
is equivalent to redefining the gradient difference as
\[
(2.8)\qquad
y_k = \nabla L_A(x_{k+1}, x_k, \pi_{k+1}) - \nabla L_A(x_k, x_k, \pi_{k+1}).
\]
We choose the smallest (minimum two-norm) ω_i's that increase y_k^T δ_k to σ_k (2.6). They are determined by the linearly constrained least-squares problem
\[
\mbox{LSP}\qquad
\begin{array}{ll}
\displaystyle\mathop{\mathrm{minimize}}_{\omega} & \|\omega\|_2\\[4pt]
\mbox{subject to} & a^T\omega = \beta, \qquad \omega \ge 0,
\end{array}
\]
so is Z, and the only change to the reduced Hessian between major iterations comes
from the rank-two BFGS update. This implies that the reduced Hessian need not be
refactorized if the BFGS update is applied explicitly to the reduced Hessian. This
obviates factorizing the reduced Hessian at the start of each QP, saving considerable
computation.
Given any nonsingular matrix Q, the BFGS update to H_k implies the following update to Q^T H_k Q:
\[
(5.1)\qquad \bar{H}_Q = H_Q + \theta_k y_Q y_Q^T - \phi_k q_Q q_Q^T,
\]
where H̄_Q = Q^T H_{k+1} Q, H_Q = Q^T H_k Q, y_Q = Q^T y_k, δ_Q = Q^{-1} δ_k, q_Q = H_Q δ_Q, θ_k = 1/(y_Q^T δ_Q) and φ_k = 1/(q_Q^T δ_Q). If Q is of the form ( Z Y ) for some matrix Y, the reduced Hessian is the leading principal submatrix of H_Q.
The Cholesky factor R of the reduced Hessian is simply the upper-left corner of the n × n upper-trapezoidal matrix R_Q such that H_Q = R_Q^T R_Q. The update for R is derived from the rank-one update to R_Q implied by (5.1). Given δ_k and y_k, if we had the Cholesky factor R_Q, it could be updated directly as
\[
(5.2)\qquad
R_Q + \frac{w}{\|w\|}\Bigl(\sqrt{\theta_k}\, y_Q - \frac{R_Q^T w}{\|w\|}\Bigr)^{T},
\]
where w = R_Q δ_Q (see Goldfarb [30], Dennis and Schnabel [15]). This rank-one modification of R_Q could be restored to upper-triangular form by applying two sequences of plane rotations from the left [21].
To simplify the notation we write (5.2) as R_Q + u v^T, where R_Q is an n × n upper-trapezoidal matrix, u = w/‖w‖ and v = √θ_k y_Q − R_Q^T u. Let v_Z be the first n_Z elements of v. The following algorithm determines the Cholesky factor R̄ of the first n_Z rows and columns of H̄_Q (5.1).
1. Compute q = H_k δ_k, t = Z^T q.
2. Define σ = ‖w‖₂ = (δ_k^T H_k δ_k)^{1/2} = (q^T δ_k)^{1/2}.
3. Solve R^T w_Z = t.
4. Define u_Z = w_Z/σ, φ = (1 − ‖u_Z‖₂²)^{1/2}.
5. Define a sweep of n_Z rotations P₁ in the planes (n_Z + 1, i), i = n_Z, n_Z − 1, ..., 1, such that
\[
P_1 \begin{pmatrix} R & u_Z \\ 0 & \phi \end{pmatrix}
  = \begin{pmatrix} \hat{R} & 0 \\ r^T & 1 \end{pmatrix},
\]
where R̂ is upper triangular and r^T is a "row spike" in row n_Z + 1.
6. Define a sweep of n_Z rotations P₂ in the planes (i, n_Z + 1), i = 1, 2, ..., n_Z + 1, such that
\[
P_2 \begin{pmatrix} \hat{R} \\ r^T + v_Z^T \end{pmatrix}
  = \begin{pmatrix} \bar{R} \\ 0 \end{pmatrix},
\]
where R̄ is upper triangular.
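The rank-one form (5.2) is easy to confirm numerically. The check below takes Q = I (so that H_Q = H_k and R_Q is the ordinary Cholesky factor) and verifies that (R_Q + uv^T)^T(R_Q + uv^T) reproduces the BFGS update (5.1); the sweeps P₁ and P₂ that restore triangularity are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.standard_normal((n, n))
H = B @ B.T + n * np.eye(n)             # H = R'R with R upper triangular
R = np.linalg.cholesky(H).T
delta = rng.standard_normal(n)
y = rng.standard_normal(n)
y = y if y @ delta > 0 else -y          # positive approximate curvature

q = H @ delta
theta, phi = 1.0 / (y @ delta), 1.0 / (q @ delta)
H_bfgs = H + theta * np.outer(y, y) - phi * np.outer(q, q)

w = R @ delta                           # with Q = I, delta_Q = delta
u = w / np.linalg.norm(w)
v = np.sqrt(theta) * y - R.T @ u
R_new = R + np.outer(u, v)              # the rank-one update (5.2)
print("matches BFGS update (5.1):", np.allclose(R_new.T @ R_new, H_bfgs))
```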
5.3. The major iteration of SNOPT. The main steps of the SQP algorithm in SNOPT are as follows. We assume that a starting point (x_0, π_0) is available, and that the reduced-Hessian QP solver SQOPT is being used. We describe elastic mode verbally. Specific values for γ are given at the start of §6.
0. Apply the QP solver to Problem PP to find the point closest to x_0 satisfying the linear constraints. If Problem PP is infeasible, declare Problem NP infeasible. Otherwise, Problem PP defines a working-set matrix W_0. Set k = 0.
1. Factorize W_k.
2. Find x̄_k, a feasible point for the QP subproblem. (This is an intermediate point for the QP solver, which also provides a working-set matrix W̄_k and its null-space matrix Z̄_k.) If no feasible point exists, initiate elastic mode and restart the QP.
3. Form the reduced Hessian Z̄_k^T H_k Z̄_k and compute its Cholesky factorization.
4. Continue solving the QP subproblem to find (x̂_k, π̂_k), an optimal QP solution. (This provides a working-set matrix Ŵ_k and its null-space matrix Ẑ_k.)
If elastic mode has not been initiated but ‖π̂_k‖_∞ is "large", enter elastic mode and restart the QP.
If the QP is unbounded and x̄_k satisfies the nonlinear constraints, declare the problem unbounded. Otherwise (if the QP is unbounded), go to Step 6.
5. If (x_k, π_k) satisfies the convergence tests for NP analogous to (2.9), declare the solution optimal. If similar convergence tests are satisfied for NP(γ), go to Step 6. Otherwise, go to Step 7.
6. If elastic mode has not been initiated, enter elastic mode and repeat Step 4. Otherwise, if γ has not reached its maximum value, increase γ and repeat Step 4. Otherwise, declare the problem infeasible.
7. Find a step length α_k that gives a sufficient reduction in the merit function. Set x_{k+1} = x_k + α_k(x̂_k − x_k) and π_{k+1} = π_k + α_k(π̂_k − π_k). Evaluate the Jacobian at x_{k+1}.
8. Define δ_k = x_{k+1} − x_k and y_k = ∇L(x_{k+1}, x_k, π_{k+1}) − ∇L(x_k, x_k, π_{k+1}). If y_k^T δ_k < σ_k, recompute δ_k and y_k with x_k redefined as x_k + α_k(x̄_k − x_k). (This requires an extra evaluation of the problem derivatives.) If necessary, increase y_k^T δ_k (if possible) by adding an augmented Lagrangian term to y_k.
9. If y_k^T δ_k ≥ σ_k, apply the BFGS update to H_k using the pair (H_k δ_k, y_k).
10. Set k ← k + 1 and repeat from Step 1.
Apart from computing the problem functions and their first derivatives, most of the computational effort lies in Steps 1 and 3. Steps 2 and 4 may also involve significant work if the QP subproblem requires many minor iterations. Typically this will happen only during the early major iterations.
Note that all points x_k satisfy the linear constraints and bounds (as do the points used to define extra derivatives in Step 8). Thus, SNOPT evaluates the nonlinear functions only at points where it is reasonable to assume that they are defined.
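To make the control flow concrete, here is a bare-bones SQP loop patterned after Steps 0-10 but reduced to a single nonlinear equality constraint: each QP collapses to one KKT solve, full steps are taken (α_k = 1), and the working-set machinery, merit-function line search and elastic mode are all omitted. The toy problem and starting point are illustrative assumptions, not SNOPT internals.

```python
import numpy as np

# Illustrative skeleton of the major iteration for: minimize x1^2 + x2^2
# subject to c(x) = x1^2 + x2 - 1 = 0.
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: np.array([2.0*x[0], 2.0*x[1]])          # objective gradient
c = lambda x: np.array([x[0]**2 + x[1] - 1.0])        # constraint residual
J = lambda x: np.array([[2.0*x[0], 1.0]])             # constraint Jacobian

x, pi = np.array([0.9, 0.6]), np.zeros(1)
H = np.eye(2)                                         # approximate Hessian H_k
for k in range(50):
    gk, ck, Jk = g(x), c(x), J(x)
    if np.linalg.norm(np.concatenate([gk - Jk.T @ pi, ck])) < 1e-8:
        break                                         # KKT residual small: done
    # "Step 4": solve the QP via its KKT system for the step p and pihat.
    K = np.block([[H, Jk.T], [Jk, np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([-gk, -ck]))
    p, pihat = sol[:2], -sol[2:]
    x_new = x + p                                     # "Step 7" with alpha_k = 1
    # "Steps 8-9": BFGS pair from the modified-Lagrangian gradient difference.
    delta = x_new - x
    y = g(x_new) - (J(x_new) - Jk).T @ pihat - gk
    if y @ delta > 1e-12:                             # curvature test, cf. (2.6)
        q = H @ delta
        H += np.outer(y, y)/(y @ delta) - np.outer(q, q)/(q @ delta)
    x, pi = x_new, pihat
print("x =", x, " pi =", pi, " c(x) =", c(x))
```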
6. Numerical results. We give the results of applying SNOPT to several sets of
optimization problems, including 3 standard sets of small dense problems, the CUTE
collection, and some large sparse problems arising in optimal trajectory calculations.
Sources for the problems are given in Table 6.1. Table 6.2 defines the notation used
in the later tables of results.
Unless stated otherwise, all runs were made on an SGI Indigo2 Impact 10000 with
256MB of RAM. SNOPT is coded in Fortran and was compiled using f77 in 64-bit
mode and full code optimization. Figure 6.1 gives the SNOPT optional parameters
and their values, most of which are the default. In some cases we compare the per-
formance of SNOPT with the SLC code MINOS 5.5 of Dec. 1996 (see [41] and §1.4). The default MINOS optional parameters were used, such as Crash option 3 and Line search tolerance 0.1. The only exceptions were Superbasic limit 1200 and Major iterations 2000. The convergence criteria for SNOPT and MINOS are identical.
Table 6.1
Sets of test problems.
Problems Reference
bt Boggs and Tolle [4, 5]
hs Hock and Schittkowski [34]
CUTE Bongartz, Conn, Gould and Toint [8]
Spring Murtagh and Saunders [40]
Min-time Hargraves and Paris [32]
Table 6.2
Notation in tables of results.
For the SNOPT Hessian approximations H_k, if the number of nonlinear variables is small enough (n ≤ 75), a full dense BFGS Hessian is used. Otherwise, a limited-memory BFGS Hessian is used, with H_k reset to the current Hessian diagonal every 20 major iterations.
To aid comparison with results given elsewhere, runs on published test problems used the associated "standard start" for x_0. In all cases, the starting multiplier estimates π_0 were set to zero. On the Hock-Schittkowski test set, setting π_0 to be the QP multipliers from the first subproblem led to fewer major iterations and function evaluations in most cases. Overall, however, SNOPT was more reliable with π_0 = 0.
The default initial γ for elastic mode is γ = ω‖g(x_{k1})‖₂, where ω is the Elastic weight (default 100) and x_{k1} is the iterate at which γ is first needed. Thereafter, if the r-th increase to γ occurs at iteration k2, γ = ω 10^r ‖g(x_{k2})‖₂.
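Stated as code, the rule is one line; elastic_gamma below is a hypothetical helper that merely restates the quoted formula.

```python
import numpy as np

def elastic_gamma(omega, g_norm, r=0):
    # Initial gamma is omega*||g(x_k1)||_2; the r-th increase at iterate
    # x_k2 gives gamma = omega * 10**r * ||g(x_k2)||_2.
    return omega * 10.0**r * g_norm

g = np.array([0.3, -1.2, 0.8])          # stand-in gradient at the trigger iterate
print("initial gamma:", elastic_gamma(100.0, np.linalg.norm(g)))
print("after 2nd increase:", elastic_gamma(100.0, np.linalg.norm(g), r=2))
```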
The Major feasibility tolerance and Major optimality tolerance are the parameters τ_P and τ_D of §2.10, defined with respect to Problem NP. The Minor tolerances are analogous quantities for SQOPT as it solves QP_k. (The Minor feasibility tolerance incidentally applies to the bound and linear constraints in NP as well as QP_k.)
The Violation limit is the parameter τ of §2.7 that defines an expanded feasible region in which the objective is expected to be bounded below.
As in MINOS, the default is to scale the linear constraints and variables, and the first basis is essentially triangular (Crash option 3), except for NC problems, where SNOPT's default is the all-slack basis (Crash option 0).
6.1. Results on small problems. SNOPT was applied to 3 sets of small test
problems that have appeared in the literature.
Table 6.3 gives results on the Boggs-Tolle problems [5], which are a set of small nonlinear problems. Where appropriate, the problems were run with the "close starts" (Start 1), "intermediate starts" (Start 2) and "far starts" (Start 3) suggested by Boggs and Tolle [4]. SNOPT solved all cases except bt4 with the intermediate starting point. In that case, the merit function became unbounded, with ‖c‖ → ∞.
Tables 6.4–6.6 give results on the Hock-Schittkowski (HS) collection [34] as implemented in the CUTE test suite (see §6.3). Results from all CUTE HS problems are included except for hs67, hs85 and hs87, which are not smooth. In every case SNOPT found a point that satisfies the first-order conditions for optimality. However, since SNOPT uses only first derivatives, it is able to find only first-order solutions, which may or may not be minimizers. In some cases, e.g., hs16 and hs25, the final point was not a minimizer. Similarly, the constraint qualification does not hold at the final point obtained for hs13.
Table 6.7 summarizes the results of applying SNOPT and MINOS to the HS collection. SNOPT required more minor iterations than MINOS but fewer function evaluations. MINOS solved 119 of the 122 problems. Problem hs104lnp could not be improved.
Table 6.3
SNOPT on the Boggs-Tolle test problems.
Table 6.4
SNOPT on the CUTE Hock-Schittkowski suite: Part I.
No. Problem Mnr Mjr Fcn Obj
1 hs1 (BC) 39 24 31 3.535967E-15
2 hs2 (BC) 28 16 25 5.042619E-02
3 hs3 (QP) 5 3 10 1.972152E-34
4 hs3mod (QP) 6 4 10 1.925930E-34
5 hs4 (BC) 2 1 3 2.666667E+00
6 hs5 (BC) 9 6 10 -1.913223E+00
7 hs6 5 4 8 0.000000E+00
8 hs7 17 15 27 -1.732051E+00
9 hs8 (FP) 2 5 10 0.000000E+00
10 hs9 (LC) 5 4 10 -5.000000E-01
11 hs10 14 12 17 -1.000000E+00
12 hs11 11 9 17 -8.498464E+00
13 hs12 11 8 13 -3.000000E+01
14 hs13 1 4 11 1.434080E+00
15 hs14 3 5 9 1.393465E+00
16 hs15 2 2 5 3.065000E+02
17 hs16 1 4 7 2.314466E+01
18 hs17 18 13 26 1.000000E+00
19 hs18 15 13 24 5.000000E+00
20 hs19 8 12 24 -6.961814E+03
21 hs20 1 3 6 4.019873E+01
22 hs21 (QP) 2 2 6 -9.996000E+01
23 hs21mod (LC) 2 2 6 -9.596000E+01
24 hs22 3 0 3 1.000000E+00
25 hs23 8 7 17 2.000000E+00
26 hs24 (LC) 6 2 8 -1.000000E+00
27 hs25 (BC) 0 0 3 3.283500E+01
28 hs26 94 62 249 5.685474E-11
29 hs27 24 21 32 4.000000E-02
30 hs28 (LC) 5 3 7 4.830345E-18
31 hs29 26 18 25 -2.262742E+01
32 hs30 14 12 15 1.000000E+00
33 hs31 13 9 14 6.000000E+00
34 hs32 5 3 6 1.000000E+00
35 hs33 1 2 6 -3.993590E+00
36 hs34 4 5 8 -8.340324E-01
37 hs35 (QP) 12 7 10 1.111111E-01
38 hs35mod (QP) 9 6 9 2.500000E-01
39 hs36 (LC) 3 1 4 -3.300000E+03
40 hs37 (LC) 8 5 9 -3.456000E+03
41 hs38 (BC) 45 29 32 1.823912E-16
42 hs39 (bt9 ) 20 16 28 -1.000000E+00
43 hs40 9 5 9 -2.500000E-01
44 hs41 (LC) 3 0 3 1.925926E+00
45 hs42 11 7 12 1.385786E+01
46 hs43 16 9 14 -4.400000E+01
47 hs44 (QP) 9 5 10 -1.500000E+01
Table 6.5
SNOPT on the CUTE Hock-Schittkowski suite: Part II.
No. Problem Mnr Mjr Fcn Obj
48 hs44new (QP) 10 4 8 -1.500000E+01
49 hs45 (BC) 8 3 11 1.000000E+00
50 hs46 17 12 18 3.488613E-08
51 hs47 45 38 52 -2.671418E-02
52 hs48 (LC) 10 7 11 2.631260E-15
53 hs49 (LC) 34 30 34 1.299963E-11
54 hs50 (LC) 29 19 27 2.155810E-16
55 hs51 (QP) 4 3 7 5.321113E-29
56 hs52 (QP) 6 3 8 5.326648E+00
57 hs53 (QP) 6 3 7 4.093023E+00
58 hs54 (LC) 61 37 50 -8.674088E-01
59 hs55 (LC) 3 0 3 6.666667E+00
60 hs56 27 17 28 -3.456000E+00
61 hs57 40 33 46 2.845965E-02
62 hs59 24 16 28 -6.749505E+00
63 hs60 11 8 12 3.256820E-02
64 hs61 15 10 16 -1.436461E+02
65 hs62 (LC) 13 8 13 -2.627251E+04
66 hs63 (bt5 ) 6 14 39 9.723171E+02
67 hs64 49 38 42 6.299842E+03
68 hs65 17 9 12 9.535289E-01
69 hs66 7 4 6 5.181633E-01
70 hs68 55 36 56 -9.204250E-01
71 hs69 18 13 23 -9.567129E+02
72 hs70 33 28 34 7.498464E-03
73 hs71 9 5 8 1.701402E+01
74 hs72 55 43 59 7.267078E+02
75 hs73 11 3 5 2.989438E+01
76 hs74 14 6 9 5.126498E+03
77 hs75 10 5 8 5.174413E+03
78 hs76 (QP) 8 4 7 -4.681818E+00
79 hs77 (bt6 ) 20 15 20 2.415051E-01
80 hs78 12 7 12 -2.919700E+00
81 hs79 (bt11 ) 16 11 15 7.877682E-02
82 hs80 11 6 9 5.394985E-02
83 hs81 17 12 16 5.394985E-02
84 hs83 3 3 7 -3.066554E+04
85 hs84 15 6 18 -5.280335E+06
86 hs86 (LC) 20 8 12 -3.234868E+01
87 hs88 48 36 67 1.362657E+00
88 hs89 51 38 70 1.362657E+00
89 hs90 49 38 70 1.362657E+00
90 hs91 47 36 62 1.362657E+00
91 hs92 57 42 80 1.362657E+00
92 hs93 27 21 25 1.350760E+02
Table 6.6
SNOPT on the CUTE Hock-Schittkowski suite: Part III.
No. Problem Mnr Mjr Fcn Obj
93 hs95 1 1 4 1.561953E-02
94 hs96 1 1 4 1.561953E-02
95 hs97 10 13 34 3.135809E+00
96 hs98 10 13 34 3.135809E+00
97 hs99 55 20 30 -8.310799E+08
98 hs99exp 545 148 801 -1.260006E+12
99 hs100 23 13 22 6.806301E+02
100 hs100lnp 22 15 28 6.806301E+02
101 hs100mod 24 16 26 6.786796E+02
102 hs101 180 94 381 1.809765E+03
103 hs102 99 48 166 9.118806E+02
104 hs103 103 46 166 5.436680E+02
105 hs104 31 23 28 3.951163E+00
106 hs104lnp 28 20 26 3.951163E+00
107 hs105 (LC) 68 47 60 1.044725E+03
108 hs106 31 13 16 7.049248E+03
109 hs107 22 6 11 5.055012E+03
110 hs108 31 9 13 -8.660255E-01
111 hs109 40 14 20 5.362069E+03
112 hs110 (BC) 50 1 4 -9.990002E+09
113 hs111 221 148 259 -4.776109E+01
114 hs111lnp 99 57 96 -4.737066E+01
115 hs112 (LC) 65 24 41 -4.776109E+01
116 hs113 33 14 19 2.430621E+01
117 hs114 45 16 31 -1.768807E+03
118 hs116 284 69 88 9.759102E+01
119 hs117 54 16 21 3.234868E+01
120 hs118 (QP) 29 2 6 6.648205E+02
121 hs119 (LC) 36 9 12 2.448997E+02
122 hs268 (QP) 77 37 45 -1.091394E-11
Table 6.7
Summary: MINOS and SNOPT on the CUTE HS test set.
MINOS SNOPT
Problems attempted 124 124
Optimal 121 124
Cannot be improved 1 0
False infeasibility 2 0
Major iterations 7175 3573
Minor iterations 1230 2052
Function evaluations 18703 4300
Cpu time (secs) 8.45 3.66
Problem Spring. This problem computes the optimal control of a spring, mass and damper system described in [40]. Our implementation of Spring perpetuates a coding error that makes the problem badly conditioned unless exactly 100 discretized sample points are used (see Plantenga [43]). Corrected versions appear in the CUTE test collection as problems optcdeg2 and optcdeg3 (see the results of §6.3). Table 6.8 includes the dimensions of six spring problems of increasing size. Table 6.9 gives results for MINOS and SNOPT on these problems. SNOPT proved remarkably effective in terms of the number of major iterations and function values: around 20 regardless of problem size.
Table 6.8
Dimensions of optimal control problems.

Problem       m     n       Problem        m     n
Spring200    400   602      Min-time10    270   184
Spring300    600   902      Min-time15    410   274
Spring400    800  1202      Min-time20    550   384
Spring500   1000  1502      Min-time25    690   454
Spring600   1200  1802      Min-time30    830   544
Spring700   1400  2102      Min-time35    970   634
                            Min-time40   1110   724
                            Min-time45   1250   814
                            Min-time50   1390   904
Table 6.9
MINOS and SNOPT on Spring.
MINOS SNOPT
Problem nZ Mnr Mjr Fcn cpu nZ Mnr Mjr Fcn cpu
Spring200 26 935 18 935 4.2 40 886 16 19 5.9
Spring300 53 2181 29 1881 13.6 48 1259 14 17 9.9
Spring400 79 1601 32 2136 16.7 47 1602 17 20 15.6
Spring500 108 2188 42 3069 27.7 50 1990 17 20 22.9
Spring600 135 3147 58 4522 65.8 61 2432 17 15 24.9
Spring700 162 3453 64 5046 81.7 62 2835 17 20 38.4
For MINOS, the major iterations increase with problem size because they are
terminated at most 40 minor iterations after the subproblem is feasible (whereas
SNOPT always solves its subproblems to optimality). The function evaluations are
proportional to the minor iterations, which increase steadily. The runtime would be greatly magnified if the functions or gradients were not trivial to compute.
For more complex examples (such as the F4 minimum-time-to-climb below), the
solution to the smallest problem could provide a good starting point for the larger
cases. This should help most optimizers, but the expensive functions would still leave
MINOS at a disadvantage compared to SNOPT.
An optimal trajectory problem. Here we give the results of applying two
SQP methods to a standard optimal trajectory problem: the F4 minimum-time-to-
climb. In this problem, a pilot cruising in an F-4 at Mach 0.34 at sea level wishes
Table 6.10
NZOPT and SNOPT on the F4 minimum-time-to-climb.
NZOPT SNOPT
Problem Mjr Fcn cpu Mjr Fcn cpu
Min-time10 21 33 49.1 22 33 16.1
Min-time15 22 42 163.8 32 46 36.3
Min-time20 34 38 449.4 33 38 43.9
Min-time25 43 61 1368.3 33 40 67.1
Min-time30 37 41 2194.0 40 46 94.2
Min-time35 47 51 4324.5 40 46 118.6
Min-time40 47 51 6440.3 54 60 182.9
Min-time45 47 51 9348.0 49 56 209.3
Min-time50 53 57 14060.5 43 47 217.3
Table 6.11
MINOS and SNOPT on the F4 minimum-time-to-climb.
MINOS SNOPT
Problem nZ Mnr Mjr Fcn cpu nZ Mnr Mjr Fcn cpu
Min-time10 5 657 15 1996 1083.0 5 33 22 33 16.1
Min-time15 15 1586 16 5047 4076.3 15 46 32 42 36.3
Min-time20 23 2201 14 6972 7056.4 23 38 33 38 43.9
Min-time25 30 3044 14 9947 13754.0 30 40 33 61 67.1
Min-time30 40 7180 17 23443 37198.8 40 46 40 41 94.2
Min-time35 46 4698 14 15070 28371.6 46 46 40 51 118.6
Min-time40 55 4448 14 14351 30643.3 55 60 54 51 182.9
Min-time45 64 3806 13 10752 26712.5 64 56 49 51 209.3
Min-time50 72 6758 44 20515c 58026.1 72 47 43 57 217.3
to ascend to 65,000 feet at Mach 1.0 in minimum time. The problem has two path
constraints on the maximum altitude and the maximum dynamic pressure.
The runs in this section were made on a Sun SPARCstation 20/61 using f77 with
full code optimization.
Table 6.10 gives results for 9 optimization problems, each involving a finer level of
discretization of the underlying continuous problem. The problems were generated by
OTIS [33], and the constraint gradients are approximated by a sparse finite-difference
scheme. In the table, SNOPT is compared with the code NZOPT, which implements an
SQP method based on a dense reduced Hessian and a dense orthogonal factorization of
the working-set matrix. (NZOPT is a special version of NPSOL that was developed in
conjunction with McDonnell Douglas Space Systems for optimal trajectory problems.)
Note that both codes required only 50–60 major iterations and function evaluations
to solve a problem with O(1000) variables. Since the problem functions are very
expensive in this application, it appears that SQP methods (even without the aid of
second derivatives) are well suited to trajectory calculations.
Since SNOPT and NZOPT are based on similar SQP methods, the different iteration and function counts are due to the starting procedures and to SNOPT's limited-
memory Hessian resets. Note that the limited-memory approximation did not signif-
icantly increase the counts for SNOPT.
Differences in cpu times are due to the QP solvers. In NZOPT, where the Jacobian
is treated as a dense matrix, the cost of solving the QP subproblems grows as a cubic
function of the number of variables, whereas the total cost of evaluating the problem
functions grows quadratically. Eventually the linear algebra required to solve the QP
subproblems dominates the computation. In SNOPT, sparse-matrix techniques for
factoring the Jacobian greatly reduce this cost. On the larger problems, a speedup of
almost two orders of magnitude has been achieved.
Table 6.11 compares MINOS and SNOPT on the F4 Min-time problems. The
same sparse-matrix methods are used, but the times are dominated by the expensive
functions and gradients.
6.3. Results on the CUTE test set. Extensive runs have been made on the
CUTE test collection dated 07/May/97. The various problem types in this distribution
are summarized in Table 6.12.
Table 6.12
CUTE problem categories
The problem model has infeasible linear constraints, but was included anyway.
The objective for problem static3 is unbounded below in the feasible region.
SNOPT solved all 109 problems in the CUTE LC set, and both SNOPT and MINOS correctly diagnosed the special features of problems model and static3. MINOS solved 101 of the problems, but could not improve the final (nonoptimal) point for problems ncvxqp1, ncvxqp2, ncvxqp4, ncvxqp6, ncvxqp8, powell20 and ubh1.
Table 6.16 summarizes the MINOS and SNOPT results on the CUTE LC problems. The total cpu time for MINOS was less than one fifth of that required for SNOPT, largely because of the five blockqp problems. (When these were excluded from the LC selection, the total time for SNOPT dropped from 1211.4 secs to 276.4 secs, which is comparable to the MINOS time.) On blockqp1, which is typical of this group of problems, MINOS requires 1021 function evaluations compared to 9 for SNOPT. The difference in cpu time comes from the number of minor iterations (1010 for MINOS, 2450 for SNOPT) and the size of the reduced Hessians. For MINOS, the reduced Hessian dimension (the number of superbasics) is never larger than four. By contrast, for SNOPT it expands to 1005 during the first QP subproblem, only to be reduced to four during the third major iteration. The intermediate minor iterations are very expensive, owing to the need to update a dense matrix R (4.1) of order 1000 at each step.
Although the ability to make many changes to the working set (between function evaluations) has been regarded as an attractive feature of SQP methods, these examples illustrate that some caution is required. We anticipate that efficiency would be improved by allowing the QP subproblem to terminate early if the reduced Hessian dimension has increased significantly. (Other criteria for early termination are discussed in [37].)
A selection of problems with variable dimensions. The next selection consists of problems whose dimension can be one of several values. (We chose n as close to 1000 as possible. Problems from the other 3 categories were deleted.)
Table 6.13
SNOPT on the CUTE LC problems: Part I.
Table 6.14
SNOPT on the CUTE LC problems: Part II.
Table 6.16
Summary: MINOS and SNOPT on the CUTE LC problems.
MINOS SNOPT
Problems attempted 109 109
Optimal 100 107
Infeasible 1 1
Unbounded 1 1
Cannot be improved 7 0
Major iterations 83 1597
Minor iterations 42892 49619
Function evaluations 59976 3206
Cpu time (secs) 227.1 1239.4
Table 6.17 gives the problem dimensions and Table 6.18 gives the SNOPT results on this set. SNOPT solved 38 of the 45 problems attempted. Among the successes we have included 7 infeasible cases. These include 3 cases with infeasible linear constraints (flosp2hh, flosp2hl and flosp2hm), and 4 cases that SNOPT identified as having infeasible nonlinear constraints (drcavty3, lubrif, flosp2th and junkturn). Since SNOPT is not assured of finding a global minimizer of the sum of infeasibilities, failure to find a feasible point does not imply that none exists. To gain further assurance that drcavty3, lubrif, flosp2th and junkturn are indeed infeasible, they were re-solved using SNOPT's Feasible Point option, in which the true objective is ignored but "elastic mode" is invoked (as usual) if the constraint linearizations prove to be infeasible (i.e., f(x) = 0 and γ = 1 in problem NP(γ) of §1.1). In all 4 cases, the final sum of constraint violations was comparable to that obtained with the composite objective.
Table 6.17
Dimensions of variable-dimensioned CUTE NC selection.
Table 6.18
SNOPT on the variable-dimensioned CUTE NC problems.

Table 6.19
Summary: MINOS and SNOPT on the variable-dimensioned CUTE NC problems.
MINOS SNOPT
Problems attempted 45 45
Optimal 29 31
Infeasible 6 7
Cannot be improved 1 1
False infeasibility 2 1
Terminated 4 5
False unboundedness 3 0
Major iterations 12062 6959
Minor iterations 217169 460094
Function evaluations 303718 9468
Cpu time (secs) 15421.5 12870.9
The remaining infeasible case for SNOPT was bratu2dt, which is listed as a "false infeasible" solution. However, the run gives a point that appears to be near-optimal, with a final nonlinear constraint violation of 1.5 × 10⁻⁵. In this case, SNOPT's Feasible Point option also declared the problem infeasible, with final nonlinear constraint violation 1.5 × 10⁻⁴. However, points satisfying the nonlinear constraints have been found in other runs (and by other algorithms).
SNOPT was unable to solve 5 problems within 1000 major iterations (drcav1lq, drcav2lq, drcav3lq, reading9 and drugdise). On termination, SNOPT was in elastic mode for drugdise with final constraint violation 5.5 × 10⁻⁴ (implying that no feasible point may exist). The non-optimal final value for problem hadamard could not be improved.
MINOS solved 29 problems, and declared the 8 problems drugdise, flosp2hh, flosp2hl, flosp2hm, hadamard, lubrif, orbit2 and trainh to be infeasible. Feasible points for orbit2 and trainh are known, so these two cases are considered to have failed. Problems bratu2dt, flosp2tm and junkturn became unbounded at infeasible points. The non-optimal final value for ubh1 could not be improved, and the four problems drcavty1, drcavty2, drcavty3 and flosp2th could not be solved within 2000 major iterations.
Table 6.19 summarizes the performance of MINOS and SNOPT on the variable-
dimensioned NC problems. As with the other test sets, the better reliability of SNOPT
is partly explained by the use of elastic variables to treat infeasible problems. The
large number of function evaluations is the reason why MINOS required more time
than SNOPT even though fewer problems were solved. The unbounded cases for
MINOS are partly attributable to the absence of a suitable merit function.
Tables 6.20–6.21 give results for this set. SNOPT solved 54 of the 56 problems attempted. The successes include two problems that SNOPT identified as having infeasible nonlinear constraints (discs and nystrom5). The final sums of the nonlinear constraint violations for these problems were 4.00, 3.193 × 10⁻³ and 1.72 × 10⁻². To our knowledge, no feasible point has ever been found for these problems. SNOPT was unable to solve problems cresc132 and leaknet in 1000 major iterations. For leaknet, the run gives a point that appears to be close to optimality, with a final nonlinear constraint violation of 6.3 × 10⁻⁹.
MINOS declared 7 problems to be infeasible (cresc132, discs, lakes, nystrom5, robot, truspyr1 and truspyr2). Feasible points found by SNOPT imply that this diagnosis is correct only for discs and nystrom5. Unbounded iterations occurred in 8 cases (brainpc3, brainpc7, brainpc9, errinbar, tenbars1, tenbars2, tenbars3 and tenbars4). The major iteration limit was enforced for problem reading6.
Table 6.22 summarizes the MINOS and SNOPT results on the fixed-dimensioned NC problems. If the conjectured infeasible problems are counted as successes, the numbers of successes for MINOS and SNOPT are 42 and 54 respectively, out of a total of 56.
A selection of all smooth problems. Finally, SNOPT and MINOS were compared on (almost) the entire CUTE collection. The resulting selection includes the HS, LC and NC selections considered earlier, but only the additional problems are discussed below. Table 6.23 summarizes the MINOS and SNOPT results.
SNOPT found an unbounded solution for the problem bratu1d. The 8 problems eigmina, orthrds2, orthregd, scon1ls, tointgor, vanderm1, vanderm2 and vanderm3 were terminated at a point within 10⁻² of satisfying the convergence test. These problems would have succeeded with a less stringent convergence tolerance.
SNOPT identified 11 infeasible problems: argauss, bratu2dt, eigenb, fletcher, growth, himmelbd, lewispol, lootsma, powellsq, s365mod and vanderm4. Of these, powellsq and vanderm4 must be counted as failures because they are known to have feasible points, as verified by calling SNOPT in Feasible point mode. Similarly, fletcher and lootsma have feasible solutions, but their initial points are infeasible and stationary for the sum of infeasibilities, so SNOPT terminated immediately. These problems are also listed as failures. The final sums of infeasibilities for the remaining 7 problems were identical to those found by running SNOPT with the Feasible point option. We conjecture that these problems are infeasible.
SNOPT was unable to solve 30 cases within the allotted 1000 major iterations (problems biggsb1, catena, catenary, chainwoo, chenhark, dixchlng, djtl, eigenbls, eigencls, fletcbv3, genrose, heart6ls, helsby, hydc20ls, maratosb, noncvxu2, noncvxun,
Table 6.20
SNOPT on the fixed-dimension CUTE NC problems: Part I.
Table 6.22
Summary: SNOPT and MINOS on the fixed-dimensioned CUTE NC problems.
MINOS SNOPT
Problems attempted 56 56
Optimal 40 52
Infeasible 2 2
False infeasibility 5 0
Terminated 1 2
False unboundedness 8 0
Major iterations 3193 6094
Minor iterations 53795 96823
Function evaluations 94914 16231
Cpu time (secs) 2635.1 5003.0
Table 6.23
Summary: MINOS and SNOPT on all smooth CUTE problems.

MINOS SNOPT
Problems attempted 796 796
Optimal 706 721
Unbounded 2 3
Infeasible 11 16
Almost optimal 0 8
Cannot be improved 15 6
False infeasibility 21 5
Terminated 11 37
False unboundedness 30 0
Major iterations 31328 74335
Minor iterations 903395 875344
Function evaluations 1641959 135143
Cpu time (secs) 26134.6 30863.1
REFERENCES
[1] R. H. Bartels, A penalty linear programming method using reduced-gradient basis-exchange techniques, Linear Algebra Appl., 29 (1980), pp. 17–32.
[2] J. T. Betts and P. D. Frank, A sparse nonlinear optimization algorithm, J. Optim. Theory and Applics., 82 (1994), pp. 519–541.
[3] L. T. Biegler, J. Nocedal, and C. Schmid, A reduced Hessian method for large-scale constrained optimization, SIAM J. Optim., 5 (1995), pp. 314–347.
[4] P. T. Boggs and J. W. Tolle, An implementation of a quasi-Newton method for constrained optimization, Technical Report 81-3, University of North Carolina at Chapel Hill, 1981.
[5] P. T. Boggs, J. W. Tolle, and P. Wang, On the local convergence of quasi-Newton methods for constrained optimization, SIAM J. Control Optim., 20 (1982), pp. 161–171.
[6] I. Bongartz, A. R. Conn, N. I. M. Gould, M. A. Saunders, and P. L. Toint, A numerical comparison between the LANCELOT and MINOS packages for large-scale constrained optimization, report, 1997. To appear.
[7] I. Bongartz, A. R. Conn, N. I. M. Gould, M. A. Saunders, and P. L. Toint, A numerical comparison between the LANCELOT and MINOS packages for large-scale constrained optimization: the complete numerical results, report, 1997. To appear.
[8] I. Bongartz, A. R. Conn, N. I. M. Gould, and P. L. Toint, CUTE: Constrained and unconstrained testing environment, ACM Trans. Math. Software, 21 (1995), pp. 123–160.
[9] A. Buckley and A. LeNir, QN-like variable storage conjugate gradients, Math. Prog., 27 (1983), pp. 155–175.
[10] A. Buckley and A. LeNir, BBVSCG: a variable storage algorithm for function minimization, ACM Trans. Math. Software, 11 (1985), pp. 103–119.
[11] R. H. Byrd and J. Nocedal, An analysis of reduced Hessian methods for constrained optimization, Math. Prog., 49 (1991), pp. 285–323.
[12] A. R. Conn, Constrained optimization using a nondifferentiable penalty function, SIAM J. Numer. Anal., 10 (1973), pp. 760–779.
[13] A. R. Conn, Linear programming via a nondifferentiable penalty function, SIAM J. Numer. Anal., 13 (1976), pp. 145–154.
[14] A. R. Conn, N. I. M. Gould, and P. L. Toint, LANCELOT: a Fortran package for large-scale nonlinear optimization (Release A), Springer Series in Computational Mathematics 17, Springer Verlag, Berlin, Heidelberg, New York, London, Paris and Tokyo, 1992. ISBN 3-540-55470-X.
[15] J. E. Dennis, Jr. and R. B. Schnabel, A new derivation of symmetric positive definite secant updates, in Nonlinear Programming 4, O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds., Academic Press, London and New York, 1981, pp. 167–199.
[16] A. Drud, CONOPT: A GRG code for large sparse dynamic nonlinear optimization problems, Math. Prog., 31 (1985), pp. 153–191.
[17] S. K. Eldersveld, Large-scale sequential quadratic programming algorithms, PhD thesis, Department of Operations Research, Stanford University, Stanford, CA, 1991.
[18] R. Fletcher, An ℓ₁ penalty method for nonlinear constraints, in Numerical Optimization 1984, P. T. Boggs, R. H. Byrd, and R. B. Schnabel, eds., Philadelphia, 1985, SIAM, pp. 26–40.
[19] R. Fletcher, Practical Methods of Optimization, John Wiley and Sons, Chichester, New York, Brisbane, Toronto and Singapore, second ed., 1987.
[20] J. C. Gilbert and C. Lemaréchal, Some numerical experiments with variable-storage quasi-Newton algorithms, Math. Prog., (1989), pp. 407–435.
[21] P. E. Gill, G. H. Golub, W. Murray, and M. A. Saunders, Methods for modifying matrix factorizations, Math. Comput., 28 (1974), pp. 505–535.
[22] P. E. Gill and W. Murray, The computation of Lagrange multiplier estimates for constrained minimization, Math. Prog., 17 (1979), pp. 32–60.
[23] P. E. Gill, W. Murray, and M. A. Saunders, SQOPT: An algorithm for large-scale quadratic programming. To appear.
[24] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, Sparse matrix methods in optimization, SIAM J. on Scientific and Statistical Computing, 5 (1984), pp. 562–589.
[25] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, User's guide for NPSOL (Version 4.0): a Fortran package for nonlinear programming, Report SOL 86-2, Department of Operations Research, Stanford University, Stanford, CA, 1986.
[26] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, Maintaining LU factors of a general sparse matrix, Linear Algebra and its Applications, 88/89 (1987), pp. 239–270.
[27] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, Inertia-controlling methods for general quadratic programming, SIAM Review, 33 (1991), pp. 1–36.
[28] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright, Some theoretical properties of an augmented Lagrangian merit function, in Advances in Optimization and Parallel Computing, P. M. Pardalos, ed., North Holland, 1992, pp. 101–128.
[29] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London and New York, 1981. ISBN 0-12-283952-8.
[30] D. Goldfarb, Factorized variable metric methods for unconstrained optimization, Math. Comput., 30 (1976), pp. 796–811.
[31] S. P. Han, Superlinearly convergent variable metric algorithms for general nonlinear programming problems, Math. Prog., 11 (1976), pp. 263–282.
[32] C. R. Hargraves and S. W. Paris, Direct trajectory optimization using nonlinear programming and collocation, J. of Guidance, Control, and Dynamics, 10 (1987), pp. 338–348.
[33] C. R. Hargraves and S. W. Paris, OTIS: Optimal Trajectories by Implicit Integration, 1988. Boeing Aerospace Company, Contract No. F33615-85-c-3009.
[34] W. Hock and K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems 187, Springer Verlag, Berlin, Heidelberg and New York, 1981.
[35] M. Lalee, J. Nocedal, and T. Plantenga, On the implementation of an algorithm for large-scale equality constrained optimization. Manuscript, 1995.
[36] W. Murray, Sequential quadratic programming methods for large-scale problems, J. Comput. Optim. Appl., 7 (1997), pp. 127–142.
[37] W. Murray and F. J. Prieto, A sequential quadratic programming algorithm using an incomplete solution of the subproblem, SIAM J. Optim., 5 (1995), pp. 590–640.
[38] W. Murray and F. J. Prieto, A second-derivative method for nonlinearly constrained optimization. To appear, 1997.
[39] B. A. Murtagh and M. A. Saunders, Large-scale linearly constrained optimization, Math. Prog., 14 (1978), pp. 41–72.
[40] B. A. Murtagh and M. A. Saunders, A projected Lagrangian algorithm and its implementation for sparse nonlinear constraints, Math. Prog. Study, 16 (1982), pp. 84–117.
[41] B. A. Murtagh and M. A. Saunders, MINOS 5.4 User's Guide, Report SOL 83-20R, Department of Operations Research, Stanford University, Stanford, CA, Revised 1995.
[42] E. O. Omojokun, Trust region algorithms for nonlinear equality and inequality constraints, PhD thesis, Department of Computer Science, University of Colorado, Boulder, 1989.
[43] T. Plantenga, A trust region method for nonlinear programming based on primal interior-point techniques. Manuscript, 1996.
[44] M. J. D. Powell, Algorithms for nonlinear constraints that use Lagrangian functions, Math. Prog., 14 (1978), pp. 224–248.
[45] M. J. D. Powell, The convergence of variable metric methods for nonlinearly constrained optimization calculations, in Nonlinear Programming 3, O. L. Mangasarian, R. R. Meyer, and S. M. Robinson, eds., Academic Press, London and New York, 1978, pp. 27–63.
[46] M. J. D. Powell, Variable metric methods for constrained optimization, in Mathematical Programming: The State of the Art, A. Bachem, M. Grötschel, and B. Korte, eds., Springer Verlag, London, Heidelberg, New York and Tokyo, 1983, pp. 288–311.
[47] S. M. Robinson, A quadratically-convergent algorithm for general nonlinear programming problems, Math. Prog., 3 (1972), pp. 145–156.
[48] K. Schittkowski, NLPQL: A Fortran subroutine for solving constrained nonlinear programming problems, Ann. Oper. Res., 11 (1985/1986), pp. 485–500.
[49] R. A. Tapia, A stable approach to Newton's method for general mathematical programming problems in ℝⁿ, J. Optim. Theory and Applics., 14 (1974), pp. 453–476.
[50] I.-B. Tjoa and L. T. Biegler, Simultaneous solution and optimization strategies for parameter estimation of differential algebraic equation systems, Ind. Eng. Chem. Res., 30 (1991), pp. 376–385.
[51] G. Van der Hoek, Asymptotic properties of reduction methods applying linearly equality constrained reduced problems, Math. Prog. Study, 16 (1982), pp. 162–189.
[52] C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal, L-BFGS-B: FORTRAN subroutines for large-scale bound constrained optimization, preprint, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, December 1994.