
2016 IEEE 55th Conference on Decision and Control (CDC)

ARIA Resort & Casino


December 12-14, 2016, Las Vegas, USA

ADMM Prescaling for Model Predictive Control


Felix Rey, Damian Frick, Alexander Domahidi, Juan Jerez, Manfred Morari and John Lygeros

Abstract— The alternating direction method of multipliers (ADMM) is an iterative first order optimization algorithm for solving convex problems such as the ones arising in linear model predictive control (MPC). The ADMM convergence rate depends on a penalty (or step size) parameter that is often difficult to choose. In this paper we present an ADMM prescaling strategy for strongly convex quadratic problems with linear equality and box constraints. We apply this prescaling procedure to MPC-type problems with diagonal objective, which results in an elimination of the penalty parameter. Moreover, we illustrate our results in a numerical study that demonstrates the benefits of prescaling.

F. Rey (reyfe), D. Frick (dafrick), J. Jerez (juanj), J. Lygeros (lygeros) and M. Morari (morari) are with the Automatic Control Laboratory (IfA) at ETH Zurich, Physikstrasse 3, 8092 Zurich, Switzerland ({name}@control.ee.ethz.ch). A. Domahidi ([email protected]) is with inspire AG, Technoparkstrasse 1, 8005 Zurich, Switzerland.

I. INTRODUCTION

In this paper we introduce a novel prescaling strategy for quadratic problems when solved with the alternating direction method of multipliers (ADMM). In particular we look at problems generated by model predictive control (MPC) [1], which is an optimization-based control method that is suited to deal with constrained control tasks. Its basic idea is to determine the optimal control action based on the predicted future behavior of the controlled plant. The prediction is done in a receding horizon fashion, i.e., at each time step, a forecast with finite time horizon is made. Therefore, the control problem is transformed into a sequence of constrained optimization problems. MPC demands the timely solution of each optimization problem and often runs on embedded hardware. This motivates the need for fast and simple optimization algorithms.

ADMM is an iterative low-complexity optimization algorithm [2]. Compared to second order methods, e.g., interior point methods [3], ADMM is more suited to run on resource-constrained hardware, since it only relies on simple operations and is able to operate in fixed-point arithmetic [4].

When MPC problems are solved with ADMM, they need to be transformed into an admissible problem formulation. Usually, the formulation min_{z,w} f(z) + g(w) s.t. Az − w = 0 is used, where z and w are the optimization variables and the parameters f, g, A describe the MPC problem. In [5], [6], [7] the function f is required to be smooth, which either prevents the MPC problems from having state constraints or results in prohibitively difficult ADMM iteration steps. We rely on a different ADMM formulation, stated in [8], where A = I, and f, g are allowed to be non-smooth. This leads to a natural MPC problem reformulation, which allows for state constraints, preserves the sparsity of the problem data and results in simple ADMM iteration steps.

The convergence behavior of ADMM can be improved by acceleration techniques such as over-relaxation [2], Nesterov acceleration [9], constraint conditioning [5], [6] and optimal penalty parameter selection [5], [6], [8]. The main contribution of this paper lies in extending this list by a prescaling procedure that not only aims at improving the convergence rate, but can also reduce the computational cost of ADMM. Prescaling is a linear change of variables z̄ = P z, w̄ = P w, using the scaling matrix P. It should not be confused with constraint conditioning [5] or preconditioning [6], which is a scaling of the equality constraint Az − w = 0. In [6] it is shown that constraint conditioning is equivalent to prescaling on the dual problem. Prescaling on the primal, as discussed in this paper, is to the best of our knowledge not considered in the literature so far. Over-relaxation and Nesterov acceleration can be utilized alongside prescaling and are therefore not discussed further. Optimal penalty parameter selection, in contrast, is closely related to prescaling. In standard ADMM formulations, the penalty parameter can be chosen freely, and can have a dramatic effect on the convergence speed [5]. In the existing literature its optimal choice is motivated by improving the worst case convergence rate. Similarly, in this paper we make use of the convergence analysis in [8] to determine the best prescaling matrix.

In the second half of the paper we focus on an MPC scenario, where a linear time-varying system should track a reference quantity while obeying box constraints on states and inputs. The use of a diagonal positive definite tracking weight matrix results in strongly convex quadratic problems with diagonal Hessian. We show that for this problem class, the optimal penalty parameter for the scaled problem becomes equal to one, which effectively eliminates it from the ADMM formulation. This is desirable since the optimal parameter selection, according to [8], comes at a considerable computational cost, whereas prescaling is computationally inexpensive. The cost reduction is particularly useful in nonlinear MPC settings, where successive linearization methods such as sequential quadratic programming (SQP) [10] are used. There the linearized equality constraint matrices change for each optimization problem, which demands the repeated computation of the optimal penalty parameter. Given that a diagonal Hessian approximation is available (e.g., using a constrained Gauss-Newton method [11] for the linearization), our prescaling scheme can be used to eliminate the expensive optimal parameter computation. Hence, the computational complexity of the procedure is reduced, while it still benefits from the optimal parameter choice. The same

978-1-5090-1837-6/16/$31.00 ©2016 IEEE 3662


reasoning can be applied for time-varying systems or in fast-sampled nonlinear MPC, if the real-time iteration [12] is used.

A. Contribution and Outline

We describe an improved procedure to utilize ADMM for quadratic problems. The contributions are
• Introducing an ADMM prescaling strategy for strongly convex quadratic optimization problems with linear equality and box constraints.
• Analyzing the implications of prescaling for MPC-type problems with diagonal objective, in particular the elimination of the penalty parameter and the resulting reduction in computational complexity.
• Presenting a numerical study, including an SQP example, that shows how prescaling improves the convergence rate.
In Section II we develop the prescaling procedure and in Section III we apply it to MPC problems. The numerical study is presented in Section IV.

B. Notation

We denote by S^n_+ the space of n-dimensional, symmetric, positive definite matrices. D^n_+ denotes the diagonal matrices in S^n_+. The identity matrix of dimension n × n is denoted as I_n. O is the Bachmann-Landau notation to express computational complexity. Given x ∈ R^n, Q ∈ S^n_+ and a set C ⊆ R^n, the norm ‖x‖²_Q denotes the scaled, squared Euclidean norm x⊤Qx. The i-th eigenvalue of Q is denoted by λ_i(Q); its minimal and maximal eigenvalues are λ_min(Q) and λ_max(Q). If Q is symmetric, then the matrix norm ‖Q‖ is the maximal absolute eigenvalue max_i |λ_i(Q)|. I_C(x) is an indicator function that is zero if x ∈ C and infinity otherwise. Finally, the operator P_C(x) denotes the Euclidean projection of x onto the set C, i.e., P_C(x) = arg min_{y∈C} ‖y − x‖²_2.

II. PRESCALING FOR ADMM

In this section, inspired by the convergence analysis in [8], we present a novel technique for improving the convergence behavior of ADMM. First, we consider a general type of quadratic problems. The adaptation to MPC-type problems follows in Section III.

A. Optimization Problem

Similarly to [8], we consider strongly convex quadratic optimization problems of the form

    min_z  (1/2) z⊤Qz + q⊤z            (1a)
    s.t.   Az = b                       (1b)
           z ∈ [z_min, z_max],          (1c)

where z ∈ R^n, Q ∈ S^n_+ and A ∈ R^{m×n} has full row rank. We assume that the problem is feasible. To apply the convergence analysis presented in [8], we moreover assume that C = {z | z_min ≤ z ≤ z_max} has a nonempty interior, and the linear independence constraint qualification (LICQ) [13] holds at the solution.

B. ADMM Formulation

To apply ADMM to problem (1), the consensus constraint z − w = 0 is introduced and w ∈ R^n is added as an additional optimization variable. The constraints (1b) and (1c) can be moved into the objective using indicator functions. The allocation of objective terms depending on the primal variable z and the consensus variable w is called the splitting. The augmented Lagrangian of the resulting problem is given as

    L_ρ(z, w, ν) = f(z) + g(w) + ρ ν⊤(z − w) + (ρ/2) ‖z − w‖²,

where we use f(z) = (1/2) z⊤Qz + q⊤z + I_{{z|Az=b}}(z) and g(w) = I_C(w). Moreover, ν ∈ R^n is called the scaled Lagrange multiplier and ρ > 0 is the penalty parameter.

ADMM iteratively minimizes the augmented Lagrangian with respect to z and w and maximizes it with respect to ν. The resulting iterates are

    z^{k+1} = arg min_{z ∈ {z|Az=b}}  (1/2) z⊤(Q + ρ I_n) z − ρ (w^k − ν^k − (1/ρ) q)⊤ z    (2a)
    w^{k+1} = P_C(z^{k+1} + ν^k)                                                            (2b)
    ν^{k+1} = ν^k + z^{k+1} − w^{k+1},                                                      (2c)

where the w update (2b) is a Euclidean projection onto C. Usually, ADMM is only used in cases where this projection can be computed efficiently, e.g., when C is a box. As shown in [2], [8], the ADMM iteration (2) converges to a fixed point that is optimal for problem (1).

C. Convergence Analysis and Optimal Penalty Parameter

In [8] the convergence rate for problems of type (1) is analyzed. It is shown that the worst case rate can be computed by solving a quadratically constrained quadratic problem that depends on the parameters c⋆_F, α⋆_max and ‖M_Z‖. The smaller these quantities, the better the convergence rate. For details on the rate computation, we refer to [8, Theorem 3]. Here we focus on how these parameters can be influenced by the penalty parameter selection, and later by prescaling. The parameter c⋆_F relates to the cosine of the Friedrichs angle between the range space of A⊤ and the active inequality constraints at the solution. The parameter α⋆_max is determined by the initial iterate (w^1, ν^1), and the closest inactive inequality constraint to the solution. Both parameters depend on the a-priori unknown solution of the optimization problem and are therefore difficult to influence. The contraction matrix M_Z is defined as

    M_Z = 2 ((1/ρ) Z⊤QZ + I_{n−m})^{−1} − I_{n−m},    (3)

where Z ∈ R^{n×(n−m)} is an orthonormal basis for the null space of A, i.e., AZ = 0 and Z⊤Z = I_{n−m}. Among the quantities that influence the convergence rate, only the contraction matrix M_Z depends on the penalty parameter ρ. This is utilized in [8] where ρ is chosen to minimize ‖M_Z‖. The resulting optimal penalty parameter is

    ρ⋆ = sqrt( λ_min(Z⊤QZ) λ_max(Z⊤QZ) ).    (4)
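To make the iteration (2) and the penalty choice (4) concrete, the following Python sketch implements both for a small dense instance of problem (1). This is an illustrative sketch only, not the authors' implementation: the dense KKT solve in the z-update and the SVD-based null-space basis are generic stand-ins, and the problem data in the usage note below are invented for the example.

```python
import numpy as np

def admm_qp(Q, q, A, b, zmin, zmax, rho, iters=500):
    """ADMM iteration (2) for problem (1):
    min 1/2 z'Qz + q'z  s.t.  Az = b,  zmin <= z <= zmax."""
    n, m = Q.shape[0], A.shape[0]
    # KKT matrix of the equality-constrained z-update (2a)
    K = np.block([[Q + rho * np.eye(n), A.T],
                  [A, np.zeros((m, m))]])
    w, nu = np.zeros(n), np.zeros(n)
    for _ in range(iters):
        rhs = np.concatenate([rho * (w - nu) - q, b])
        z = np.linalg.solve(K, rhs)[:n]      # (2a): solve the KKT system
        w = np.clip(z + nu, zmin, zmax)      # (2b): projection onto the box C
        nu = nu + z - w                      # (2c): multiplier update
    return w

def rho_star(Q, A):
    """Optimal penalty parameter (4), using an orthonormal
    null-space basis Z of A obtained from the SVD."""
    _, _, Vt = np.linalg.svd(A)
    Z = Vt[A.shape[0]:].T                    # AZ = 0, Z'Z = I (A has full row rank)
    lam = np.linalg.eigvalsh(Z.T @ Q @ Z)
    return np.sqrt(lam.min() * lam.max())
```

For instance, with Q = diag(1, 2, 3, 4), q = 1, the two coupling constraints z_1 + z_3 = 1, z_2 + z_4 = 1 and the box [−1, 2]^4, the iteration run with ρ = ρ⋆ = √6 converges to the interior solution (3/4, 2/3, 1/4, 1/3).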

Later we will choose the prescaling matrix to reduce ‖M_Z‖ even further. Unfortunately, prescaling will also affect the quantities c⋆_F and α⋆_max, which makes its influence on the convergence rate less clear.

While (4) settles the penalty parameter choice in theory, in practice it is often computationally prohibitive to compute Z and the eigenvalues of Z⊤QZ. This is particularly relevant in the context of SQP, where the optimal penalty parameter needs to be recomputed at each time step.

D. Prescaling

Prescaling can be applied to optimization problems of type (1) to change their numerical properties. It is a linear transformation P of the optimization variable z,

    z = P z̄,    (5)

where the scaled variable is denoted by z̄ and the prescaling matrix P ∈ R^{n×n} is invertible. The scaled version of problem (1) is

    min_z̄  (1/2) z̄⊤Q̄z̄ + q̄⊤z̄    (6a)
    s.t.   Āz̄ = b                  (6b)
           z̄ ∈ C̄,                 (6c)

where Q̄ = P⊤QP, q̄ = P⊤q, C̄ = {P^{−1}z | z ∈ C} and Ā = AP. The prescaling matrix P is chosen from a structure-imposing set K ⊆ R^{n×n} to preserve the structure in Ā or C̄. The KKT conditions of problem (6) are

    Q̄z̄⋆ + Ā⊤η̄⋆ − ν̄⋆ = −q̄,
    Āz̄⋆ = b,
    (ν̄⋆)⊤(z̄ − z̄⋆) ≥ 0  ∀z̄ ∈ C̄.

Using z̄⋆ = P^{−1}z⋆, ν̄⋆ = P⊤ν⋆ and η̄⋆ = η⋆ allows us to recover the solution of the unscaled problem (1). Hence, we can solve any scaled problem in place of the original one. Given a certain K, the question of choosing a suitable P ∈ K remains. This is investigated in the following section.

E. Prescaling Matrix Choice

For the choice of the prescaling matrix P, we follow an argument similar to the one in [8] for the optimal penalty parameter. There ρ⋆ is chosen such that the norm of the contraction matrix M_Z in (3) is minimized. The contraction matrix M̄_Z of the scaled problem is also influenced by P through Q̄ and Ā. Therefore, we choose P such as to minimize ‖M̄_Z‖. The result is stated in the following proposition, where H̄ is called the reduced Hessian, and Z̄ ∈ R^{n×(n−m)} defines an orthonormal basis for the null space of the scaled equality constraint matrix Ā = AP.

Proposition 1. Given problem (1) and the prescaling strategy (5), the prescaling matrix P⋆ ∈ K that minimizes ‖M̄_Z‖ can be computed by solving

    (P⋆, Z̄⋆, µ⋆) = arg min_{P∈K, Z̄, µ}  µ       (7a)
    s.t.  I_{n−m} ≼ H̄ ≼ µ I_{n−m}                (7b)
          H̄ = Z̄⊤P⊤QP Z̄                         (7c)
          AP Z̄ = 0                               (7d)
          Z̄⊤Z̄ = I_{n−m}.                        (7e)

Proof. The i-th eigenvalue of M̄_Z can be written as

    λ_i(M̄_Z) = 2 / ((1/ρ) λ_i(H̄) + 1) − 1 ∈ (−1, 1),    (8)

where the inclusion follows from λ_i(H̄) > 0 ∀i, which is true due to λ_i(Q) > 0 ∀i, and semi-orthogonality of Z̄. The eigenvalues of M̄_Z are real but possibly negative. Hence its norm is the absolute value of either the minimal or maximal eigenvalue. For the moment we focus on H̄ = Z̄⊤P⊤QP Z̄. Minimizing ‖M̄_Z‖ yields

    H̄⋆ = arg min_{H̄}  max( |λ_min(M̄_Z)|, |λ_max(M̄_Z)| ).    (9)

We investigate the two possible outcomes of the max-operator separately. In the first case it is |λ_min(M̄_Z)| ≥ |λ_max(M̄_Z)|, which implies λ_min(M̄_Z) ≤ 0. Hence using (8) leads to

    H̄⋆_{λmin} = arg min_{H̄}  −( 2 / ((1/ρ) λ_max(H̄) + 1) − 1 ).

After algebraic reformulation and the use of definition (4) we arrive at

    H̄⋆_{λmin} = arg min_{H̄}  sqrt( λ_max(H̄) / λ_min(H̄) ).

If we go back to equation (9) to consider the second case of |λ_max(M̄_Z)| ≥ |λ_min(M̄_Z)|, we get the same result. Hence, we have H̄⋆ = H̄⋆_{λmin} which, by substituting H̄ = Z̄⊤P⊤QP Z̄, yields

    min_{P, Z̄}  λ_max(Z̄⊤P⊤QP Z̄) / λ_min(Z̄⊤P⊤QP Z̄)
    s.t.  AP Z̄ = 0,  Z̄⊤Z̄ = I_{n−m},  P ∈ K.

The eigenvalues can be expressed as optimization problems via λ_max(H̄) = arg min_{µ∈R} {µ | µ I_{n−m} ≽ H̄} and λ_min(H̄) = arg max_{α∈R} {α | H̄ ≽ α I_{n−m}}, which allows us to rewrite the previous problem as

    min_{P, Z̄, α, µ}  µ/α
    s.t.  α I_{n−m} ≼ Z̄⊤P⊤QP Z̄ ≼ µ I_{n−m}
          AP Z̄ = 0,  Z̄⊤Z̄ = I_{n−m},  P ∈ K.

Any feasible solution of this problem satisfies µ ≥ α. Without loss of generality we can set α = 1, which leads to problem formulation (7) and concludes the proof. ∎
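The balancing argument behind (4) can also be checked numerically: ‖M_Z‖ in (3), viewed as a function of ρ, is minimized at the geometric mean of the extreme eigenvalues of Z⊤QZ. The following sketch (an illustration with randomly generated data, not taken from the paper) compares ρ⋆ against a logarithmic grid of penalty values.

```python
import numpy as np

def contraction_norm(Q, A, rho):
    """||M_Z|| for the contraction matrix M_Z in (3)."""
    _, _, Vt = np.linalg.svd(A)
    Z = Vt[A.shape[0]:].T                          # orthonormal null-space basis of A
    H = Z.T @ Q @ Z
    I = np.eye(H.shape[0])
    M = 2.0 * np.linalg.inv(H / rho + I) - I
    return np.max(np.abs(np.linalg.eigvalsh(M)))

rng = np.random.default_rng(0)
n, m = 6, 2
B = rng.standard_normal((n, n))
Q = B.T @ B + np.eye(n)                            # random Q in S^n_+
A = rng.standard_normal((m, n))                    # full row rank (a.s.)

_, _, Vt = np.linalg.svd(A)
lam = np.linalg.eigvalsh(Vt[m:] @ Q @ Vt[m:].T)    # spectrum of Z'QZ
rho_opt = np.sqrt(lam.min() * lam.max())           # equation (4)

# no rho on a surrounding grid should beat rho*
grid = np.geomspace(rho_opt / 10, rho_opt * 10, 101)
norms = [contraction_norm(Q, A, r) for r in grid]
ok = contraction_norm(Q, A, rho_opt) <= min(norms) + 1e-9
print(ok)
```

By (8), each eigenvalue of M_Z equals (ρ − λ_i)/(ρ + λ_i) for an eigenvalue λ_i of Z⊤QZ, so the norm is governed by the two extreme eigenvalues, which are balanced exactly at ρ = sqrt(λ_min λ_max).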

Remark 1. The resulting prescaling matrix P⋆ does not necessarily optimize the worst case convergence rate; it only minimizes ‖M̄_Z‖, which has a beneficial influence on the rate. This is due to the fact that P (as opposed to ρ) also influences other parameters related to the convergence, i.e., c⋆_F and α⋆_max. This is why, although P⋆ is the result of an optimization problem, we do not use the term optimal prescaling.

The optimization problem (7) is a non-convex polynomial matrix inequality (PMI) problem [14] and is in general very hard to solve. In some cases, however, an analytic solution can be found. Corollary 1 describes this situation.

Corollary 1. Given the setup of Proposition 1. If Q^{−1/2} ∈ K, then
(i) P⋆ = Q^{−1/2},
(ii) Q̄ = I_n,
(iii) ρ̄⋆ = 1,
(iv) ‖M̄_Z‖ = 0.

Proof. To show (i) we recognize that P⋆ = Q^{−1/2} exists and is feasible for (7). Using P⋆ and Z̄⊤Z̄ = I_{n−m} in the first constraint results in µ⋆ = 1, which is the lowest attainable objective value. Since P⋆ ≻ 0, a Z̄⋆ that satisfies the remaining constraints can always be found. Hence (P⋆, Z̄⋆, µ⋆) solves (7). Part (ii) follows directly from (i). Using (4) and Z̄⊤Z̄ = I_{n−m} shows (iii) from (i) and (ii). The last part (iv) is easily verified by the definition of M̄_Z. ∎

It is important to realize that if Q^{−1/2} ∈ K, neither the orthonormal basis Z̄ nor the eigenvalues in (4) need to be computed in order to implement ADMM for the scaled problem with the optimal penalty parameter. The resulting reduction of the computational cost is quantified in Section III-C. Another aspect is that the prescaling matrix P⋆ reaches ‖M̄_Z‖ = 0, which usually cannot be achieved by the choice of the optimal penalty parameter alone. Hence prescaling might further improve the convergence rate. Finally, it can be observed that neither P⋆ nor ρ̄⋆ is influenced by the equality constraint matrix A. This is because A only influences the geometry of Z̄, which is always semi-orthogonal, i.e., Z̄⊤Z̄ = I_{n−m}. Therefore, the reduced Hessian Z̄⊤Q̄Z̄ is the identity matrix and the dependence on A is eliminated.

III. ADMM PRESCALING FOR MODEL PREDICTIVE CONTROL

In this section we focus on MPC-type problems, where a diagonal positive definite objective and prescaling matrix can be chosen, i.e., K = D^n_+. This allows us to utilize Corollary 1 while preserving the structure of C and therefore the ability to project efficiently. For the case where C does not constrain all variables, a less restrictive K can be chosen, and therefore more general objective matrices can be handled. Further, the computational complexity of the prescaling procedure is analyzed in this section, where also the particular structure of the MPC problem type is utilized.

A. MPC Problem

We consider reference tracking problems where the system dynamics are affine and time-varying. States and inputs need to satisfy upper and lower bounds. The MPC problems have the form

    min_{{x_{i+1}, u_i}}  (1/2) Σ_{i∈I} ( ‖x_{i+1} − r^x_{i+1}‖²_{Q_{i+1}} + ‖u_i − r^u_i‖²_{R_i} )    (10a)
    s.t.  x_{i+1} = A^s_i x_i + B^s_i u_i + c^s_i   ∀i ∈ I                                             (10b)
          x_{i+1} ∈ [x_min, x_max]                  ∀i ∈ I                                             (10c)
          u_i ∈ [u_min, u_max]                      ∀i ∈ I                                             (10d)

where I = {0, 1, ..., N−1}, N is the prediction horizon and x_i ∈ R^{n_x}, u_i ∈ R^{n_u} are the system states and inputs. Q_i ∈ D^{n_x}_+ and R_i ∈ D^{n_u}_+ are diagonal objective weight matrices. The reference quantities r^x_i and r^u_i define the target points for state and input. The matrices A^s_i, B^s_i and the offset c^s_i describe the time-varying affine system. Finally, [x_min, x_max] as well as [u_min, u_max] define the state and input bounds. We assume that each instance of problem (10) is feasible and that the LICQ holds at the solution.

Equation (10) is a parametric quadratic problem that needs to be solved repeatedly for changing parameter values. The parameters are the initial state x_0, the reference quantities r^x_i, r^u_i, the system parameters A^s_i, B^s_i and c^s_i, as well as the objective matrices Q_{i+1} and R_i.

To apply ADMM to problem (10) we need to state it in the form of (1). Using the ordering

    z⊤ = [ u⊤_0  x⊤_1  u⊤_1  x⊤_2  ...  u⊤_{N−1}  x⊤_N ]

as shown in [15] allows us to construct the quadratic problem parameters Q, q, A, b and C such that Q remains in D^n_+. The resulting equality constraint matrix A ∈ R^{m×n} has full row rank and is block banded. The dimensions are n = (n_x + n_u)N and m = n_x N.

B. Diagonal Prescaling

Proposition 1 tells us how to choose the prescaling matrix P from the structure-imposing set K. If each state and input in problem (10) has an upper and lower bound, we want to choose P diagonal to keep C̄ projectable and therefore the second ADMM step (2b) simple. Hence, we set K = D^n_+ and we recognize Q^{−1/2} ∈ K. Consequently, the MPC problem and the set K satisfy the conditions for Corollary 1, and prescaling can be used in its most beneficial form. In particular, we can use the diagonal prescaling matrix P⋆ = Q^{−1/2} and the optimal penalty parameter ρ̄⋆ = 1.

C. Implementation and Complexity Analysis

In this section we compare the prescaling procedure to conventional ADMM strategies in terms of implementation and computational complexity. In particular we show that prescaling leads to a reduced complexity compared to optimal penalty parameter selection.

We consider the scaled MPC problem using P⋆ = Q^{−1/2}, Q̄ = I_n and ρ̄⋆ = 1. The third step (2c) of the ADMM iteration is negligible in terms of computational complexity.
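Before detailing the implementation, the claims of Corollary 1 that underpin this setup are easy to verify numerically for a diagonal objective. The sketch below (illustrative only; the random data are invented for the example) checks that P⋆ = Q^{−1/2} yields Q̄ = I_n, that (4) then gives ρ̄⋆ = 1, and that M̄_Z vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
Q = np.diag(rng.uniform(0.5, 10.0, n))       # diagonal objective, Q in D^n_+
A = rng.standard_normal((m, n))              # full row rank (a.s.)

P = np.diag(1.0 / np.sqrt(np.diag(Q)))       # (i): P* = Q^{-1/2}, diagonal
Qbar = P.T @ Q @ P                           # (ii): equals I_n
Abar = A @ P

_, _, Vt = np.linalg.svd(Abar)
Zbar = Vt[m:].T                              # orthonormal null-space basis of Abar
H = Zbar.T @ Qbar @ Zbar                     # reduced Hessian, equals I_{n-m}
lam = np.linalg.eigvalsh(H)
rho_bar = np.sqrt(lam.min() * lam.max())     # (iii): equals 1 by (4)

I_red = np.eye(n - m)
M = 2.0 * np.linalg.inv(H / rho_bar + I_red) - I_red   # (iv): vanishes
print(np.allclose(Qbar, np.eye(n)), np.max(np.abs(M)) < 1e-9)
```

Note that neither Z̄ nor any eigenvalue computation is actually needed to run the scaled method; they appear here only to confirm the corollary.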

Since C̄ decouples in each component, the second step (2b) is simple as well. The first ADMM step (2a) is crucial for an efficient implementation. As suggested in [2], and similarly to [3], the structure of Ā can be exploited. We use the Schur complement method [13, Equation (16.16)], which yields

    z̄^{k+1} = C̄ (w̄^k + ν̄^k − q̄) + Ē b,    (11)

where C̄ = (1/2)(I − Ā⊤(Ā Ā⊤)^{−1} Ā) and Ē = Ā⊤(Ā Ā⊤)^{−1}. Note that the simplicity of this form comes from the fact that the inverse of Q̄ disappears. Computing Ā Ā⊤ and its sparse Cholesky factorization L̄ L̄⊤, e.g., using [16, Algorithm 4.3.5], has complexity O(N n²_x (n_x + n_u)). Then (11) can be evaluated in O(N n_x (n_x + n_u)) via forward-backward substitution, since L̄ is lower triangular with bandwidth 2 n_x, due to bandedness of Ā. Assembling all previous steps leads to Algorithm 1.

Algorithm 1 Prescaled ADMM for MPC-type Problems
Require: (Static data) Bounds z_min, z_max and number of iterations n_it ∈ N_+.
Require: (Parameters) x_0, (A^s_i, B^s_i, c^s_i), r^u_i, r^x_{i+1} and R_i, Q_{i+1} ∀i ∈ I. Initial iterates w̄^0 and ν̄^0.
 1: P⋆ ← Q(R_i, Q_{i+1})^{−1/2},  q̄ ← P⋆ q(r^u_i, r^x_{i+1})
 2: Ā ← A(B^s_i, A^s_{i+1}) P⋆,  b ← b(x_0, A^s_0, c^s_i)
 3: [z̄_min, z̄_max] ← [P⋆^{−1} z_min, P⋆^{−1} z_max]
 4: L̄ ← CHOLESKY(Ā Ā⊤)                                      ▷ Cholesky decomposition
 5: c̄ ← Ā⊤ (L̄ L̄⊤)^{−1} b
 6: for k ∈ {1, ..., n_it} do
 7:   z̄^{k+1} ← (1/2)(I − Ā⊤ (L̄ L̄⊤)^{−1} Ā)(w̄^k + ν̄^k − q̄) + c̄
 8:   w̄^{k+1} ← CLIP(z̄^{k+1} − ν̄^k)                        ▷ Clip to [z̄_min, z̄_max]
 9:   ν̄^{k+1} ← ν̄^k + w̄^{k+1} − z̄^{k+1}

We call the steps 1–5 the precomputation phase and 6–9 the iteration phase. This formalization allows us to analyze the complexity of prescaled ADMM and to compare it to other ADMM strategies. In particular, we compare (i) standard ADMM with an arbitrary fixed penalty parameter, (ii) ADMM with optimal penalty parameter ρ⋆ according to (4) and (iii) prescaled ADMM with optimal penalty parameter ρ̄⋆ = 1, as described in Algorithm 1. For brevity we omit the detailed complexity analysis of the ADMM strategies (i) and (ii). Since the structure of A can be exploited in all three strategies, they exhibit the same computational complexity O(n_it N n_x (n_x + n_u)) for the iteration phase. The precomputation phase differs as noted in Table I.

    ADMM strategy          precomputation complexity
    (i)   fixed penalty    O(N n²_x (n_x + n_u))
    (ii)  optimal penalty  O(N³ n²_x (n_x + n_u))
    (iii) prescaling       same as (i)

TABLE I: Precomputation complexity of different ADMM strategies.

Strategies (i) and (iii) show the same precomputation complexity, since the penalty parameter does not need to be computed. Strategy (ii), on the other hand, has a substantially larger computational cost, which grows cubically in the prediction horizon N due to the evaluation of (4). This additional burden can be prohibitive, since a large horizon is often needed for satisfactory closed-loop performance of MPC. A main contribution of this paper is to combine the optimal penalty parameter selection of strategy (ii) with the low computational complexity of strategy (i).

IV. NUMERICAL STUDY

This section illustrates the convergence performance of prescaled ADMM on (a) a set of MPC-type problems with random parameters, and (b) a nonlinear MPC example for the control of an overhead crane via SQP. To make the convergence behavior comparable, each problem instance is solved with the strategies (i) fixed penalty ρ = 1, (ii) optimal penalty and (iii) prescaling with the predefined choice of the optimal penalty.

The random problems (a) are generated based on the parameters σ, λ_min(Q), N, n_x, n_u, σ^s and ∆ in the following way: The entries of the diagonal quadratic objective Q are drawn from a uniform distribution between 0 and σ. A minimal curvature λ_min(Q) is added to ensure strong convexity. The entries of the linear objective q are drawn from a (0, σ)-normal distribution. For each problem instance, N random systems A^s_i, B^s_i, c^s_i with n_x states and n_u inputs are generated. The system matrices A^s_i, B^s_i and c^s_i are sampled from a uniform distribution between ±σ^s. If σ^s > 1, the system dynamics can be unstable. The initial condition x_0 and the box constraints are chosen randomly such that the problem is feasible and the bounds z_min, z_max have a minimal distance ∆.

             σ    λ_min(Q)  N   n_x  n_u  σ^s  ∆
    "easy"   100  0.1       10  6    3    1    250
    "hard"   100  0.1       15  12   5    1.6  100

TABLE II: Parametrization of the two problem types.

We use two types of random problems, generated by "easy" and "hard" parametrizations as shown in Table II. The "hard" problem instances are expected to be more difficult to solve since the dimension is higher, they tend to have more active constraints in the solution and they include about 50% unstable systems. For each random problem type 1000 instances are generated.

The overhead crane MPC example (b) is obtained from [17]. We use the sampling interval T_s = 0.1 and the horizon N = 10. Compared to [17] we use additional state constraints and a slightly different reference trajectory and objective parametrization to make the problems more demanding. We use the real-time iteration [12] in a 40 s simulation, resulting in 400 different problem instances.

For the algorithm comparison, first the solution to all example problems is determined using a commercial solver [18], [19]. Then, each of the three ADMM procedures solves the problems. The procedures are terminated as soon as their iterates come within Euclidean distance 10^{−5} of the actual solution. In the prescaled case, the scaling is removed before this distance is measured, to guarantee a fair comparison.
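As a companion to Algorithm 1, the following Python sketch linearizes the procedure for a generic dense instance. It is a simplified illustration, not the authors' code: the stacked data Q (diagonal), q, A, b are assumed to be pre-assembled, and a dense inverse of Ā Ā⊤ stands in for the banded Cholesky factorization exploited in the paper.

```python
import numpy as np

def prescaled_admm(Q_diag, q, A, b, zmin, zmax, n_iter=300):
    """Algorithm 1 (prescaled ADMM with P* = Q^{-1/2}, rho_bar = 1),
    with a dense solve standing in for the banded Cholesky."""
    # -- precomputation phase (steps 1-5) --
    p = 1.0 / np.sqrt(Q_diag)                 # diagonal of P* = Q^{-1/2}
    qbar = p * q                              # q_bar = P* q
    Abar = A * p[np.newaxis, :]               # A_bar = A P*
    zbar_min, zbar_max = zmin / p, zmax / p   # scaled bounds P*^{-1}[zmin, zmax]
    G = np.linalg.inv(Abar @ Abar.T)          # stands in for the Cholesky solve
    cbar = Abar.T @ (G @ b)                   # c_bar = A_bar'(A_bar A_bar')^{-1} b
    # -- iteration phase (steps 6-9) --
    w = np.zeros_like(q)
    nu = np.zeros_like(q)
    for _ in range(n_iter):
        v = w + nu - qbar
        z = 0.5 * (v - Abar.T @ (G @ (Abar @ v))) + cbar     # step 7
        w = np.clip(z - nu, zbar_min, zbar_max)              # step 8
        nu = nu + w - z                                      # step 9
    return p * w                              # undo the scaling: z = P* z_bar
```

Note that the precomputation touches no eigenvalue problem: the only factorization involves Ā Ā⊤, matching row (iii) of Table I.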
prescaling procedure performs well in simulation.
“easy” problems (a)

1
R EFERENCES
fraction of solved

0.8
0.6 [1] J. M. Maciejowski, Predictive Control with Constraints. Prentice
(i) fixed penalty Hall, 2002.
0.4
(ii) optimal penalty [2] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed
0.2 (iii) prescaling optimization and statistical learning via the alternating direction
0 method of multipliers,” Foundations and Trends in Machine Learning,
101 102 103 104 105 vol. 3, no. 1, pp. 1–122, Jan. 2011.
[3] A. Domahidi, A. U. Zgraggen, M. N. Zeilinger, M. Morari, and C. N.
“hard” problems (a)

1
fraction of solved

(i) fixed penalty Jones, “Efficient interior point methods for multistage problems arising
0.8 in receding horizon control,” in Decision and Control (CDC), 2012
(ii) optimal penalty
0.6 (iii) prescaling IEEE 51st Annual Conference on, Dec 2012, pp. 668–674.
0.4 [4] J. L. Jerez, P. J. Goulart, S. Richter, G. A. Constantinides, E. C.
Kerrigan, and M. Morari, “Embedded online optimization for model
0.2
predictive control at megahertz rates,” Automatic Control, IEEE Trans-
0 actions on, vol. 59, no. 12, pp. 3238–3251, Dec. 2014.
101 102 103 104 105 [5] E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson, “Optimal
parameter selection for the alternating direction method of multipliers
crane problems (b)

1
fraction of solved

0.8 (ADMM): Quadratic problems,” arXiv:1306.2454v2, Apr. 2014.


[Online]. Available: http://arxiv.org/abs/1306.2454
0.6 [6] P. Giselsson and S. Boyd, “Diagonal scaling in Douglas-Rachford
(i) fixed penalty
0.4 splitting and ADMM,” in Decision and Control, IEEE Conference
(ii) optimal penalty
0.2 (iii) prescaling on, 12 2014, pp. 5033–5039.
0 [7] Y. Pu, M. N. Zeilinger, C. N. Jones, and P. Ye, “Fast Alternating
101 102 103 104 105 Minimization Algorithm for Model Predictive Control,” 19th World
Congress of the International Federation of Automatic Control, pp.
number of iterations 11 980–11 986, 2014.
[8] A. U. Raghunathan and S. Di Cairano, “ADMM for convex
Fig. 1. Performance plots for the three ADMM procedures with “easy” quadratic programs: Q-linear convergence and infeasibility detection,”
random problems (above), “hard” random problems (middle) and the arXiv:1411.7288, 11 2014. [Online]. Available: http://arxiv.org/abs/
overhead crane problems (below). The vertical axis shows the fraction of 1411.7288
problems solved to 10−5 accuracy. The horizontal axis shows the number [9] T. Goldstein, B. O’Donoghue, S. Setzer, and R. Baraniuk, “Fast Al-
of performed iterations i.e., the number of evaluations of (2). ternating Direction Optimization Methods,” SIAM Journal on Imaging
Sciences, vol. 7, no. 3, pp. 1588–1623, aug 2014.
[10] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” Acta
In Figure 1 we compare the convergence speed (measured numerica, vol. 4, no. 1, pp. 1–51, 1995.
in the number of iterations each algorithm needs to terminate) [11] H. G. Bock, M. M. Diehl, D. B. Leineweber, and J. P. Schlöder,
for the three ADMM procedures and the three problem types. A Direct Multiple Shooting Method for Real-Time Optimization of
Nonlinear DAE Processes. Springer, 2000, vol. 26, pp. 245–267.
As intended, “hard” problem instances are more difficult to [12] M. Diehl, H. G. Bock, J. P. Schlöder, R. Findeisen, Z. Nagy, and
solve than “easy” problem instances. Prescaled ADMM (iii) F. Allgöwer, “Real-time optimization and nonlinear model predictive
clearly outperforms strategy (i) in all three problem types, control of processes governed by differential-algebraic equations,”
Journal of Process Control, vol. 12, no. 4, pp. 577–585, 2002.
without imposing an additional computational burden and [13] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed.
without the need for parameter tuning. Compared to the more Springer Science+Business Media, 2006.
costly optimal penalty strategy (ii), prescaled ADMM takes [14] D. Henrion and J.-B. Lasserre, “Convergent relaxations of polynomial
matrix inequalities and static output feedback,” Automatic Control,
the lead for the “easy” instances and for the crane problems. IEEE Transactions on, vol. 51, no. 2, pp. 192–202, Feb 2006.
For the “hard” problem instances it still remains competitive. [15] Y. Wang and S. Boyd, “Fast model predictive control using online
optimization,” Control Systems Technology, IEEE Transactions on,
vol. 18, no. 2, pp. 267–278, 3 2010.
V. C ONCLUSIONS [16] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore,
MD, USA: Johns Hopkins University Press, 1996.
We have stated an ADMM prescaling strategy for strongly [17] M. Vukov, W. Van Loock, B. Houska, H. J. Ferreau, J. Swevers,
convex quadratic problems with linear equality and box and M. Diehl, “Experimental validation of nonlinear MPC on an
constraints. This strategy was successfully applied to MPC- overhead crane using automatic code generation,” in American Control
Conference, June 2012, pp. 6264–6269.
type problems with diagonal objective. In this problem [18] J. Lofberg, “Yalmip: A toolbox for modeling and optimization in
class, prescaling also allows for a simple choice of the matlab,” in Computer Aided Control Systems Design, 2004 IEEE
optimal penalty parameter, which substantially reduces the International Symposium on. IEEE, 2004, pp. 284–289.
[19] Gurobi Optimization Inc., “Gurobi optimizer reference manual,”
computational cost. We have emphasized that these savings 2015. [Online]. Available: http://www.gurobi.com
are particularly useful in SQP settings, where the optimal

