Primal and dual algorithms for optimization over the efficient set
Zhengliang Liu$^a$ and Matthias Ehrgott$^b$
$^a$School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
$^b$Department of Management Science, Lancaster University Management School, Bailrigg, Lancaster, UK
1. Introduction
Multi-objective optimization deals with optimization problems with multiple conflicting objectives.
It has many applications in practice, e.g. minimizing cost versus minimizing adverse environmental
impacts in infrastructure projects [1], minimizing risk versus maximizing return in financial portfo-
lio management [2] or maximizing tumour control versus minimizing normal tissue complications
in radiotherapy treatment design [3]. Because a feasible solution simultaneously optimizing all of the
objectives does not usually exist, the goal of multi-objective optimization is to identify a set of so-
called efficient solutions. Efficient solutions have the property that it is not possible to improve any
of the objectives without deteriorating at least one of the others. In practical applications of multi-
objective optimization, it is generally necessary for a decision-maker to select one solution from the
efficient set for implementation. This selection process can be modelled as the optimization of a func-
tion over the efficient set of the underlying multi-objective optimization problem. For example, an
investor may aim at minimizing the transaction cost of establishing a portfolio with high return and
low risk. If such a function that describes the decision-maker’s preferences is explicitly available, one
might expect that it is computationally easier to directly optimize the function over the efficient set
rather than to solve the multi-objective optimization problem first and then obtain a most preferred
efficient solution – in particular, because the efficient set can in general be an infinite set (see Figures 1 and 2). Secondly, decision-makers may be overwhelmed by the large size of the whole efficient set and may not be able to choose a preferred solution from it in a very effective manner. These considerations have motivated research on the subject of optimization over the efficient set since the 1970s.

Figure 1. The image of the feasible set of the MOLP of Example 2.1 in objective space.
Figure 2. The image of the feasible set of the CMOP of Example 2.2 in objective space.
We contribute to this research with new algorithms and new results on how to identify optimal
solutions to this problem. In Section 2, we provide the mathematical preliminaries on multi-objective
optimization and necessary notation. A revised version of Benson’s outer approximation algorithm
and its dual variant are summarized in Section 3. In Section 4, we survey algorithms for optimization
over the efficient set from the literature. As a new contribution, we identify a subset of the vertices
of the feasible set in objective space at which an optimal solution must be attained. Based on the
outer approximation algorithm, we then propose a new primal algorithm in the all linear case by
incorporating a bounding procedure in the primal algorithm of [4] in Section 5. Furthermore, this
primal algorithm is extended to maximize a linear function over the non-dominated set of a convex
multi-objective optimization problem. Section 6 proposes a new dual algorithm for optimization over
the non-dominated set, which makes use of a new result providing a geometrical interpretation of
optimization over the non-dominated set. This dual algorithm is also further developed to maximize
a linear function over the non-dominated set of a convex problem. The numerical experiments in
Section 7 compare the performance of our new primal and dual algorithms with algorithms from the
literature that we discuss in Section 4. The results reveal that our algorithms, in particular, the dual
algorithm in the linear case, are much faster (up to about 10 times) than comparable algorithms from
the literature.
2. Preliminaries
A multi-objective optimization problem (MOP) can be written as
$$\min\{f(x) : x \in X\}, \qquad (1)$$
where $f = (f_1, \ldots, f_p)^T : \mathbb{R}^n \to \mathbb{R}^p$ is a vector-valued objective function and $X \subseteq \mathbb{R}^n$ is the feasible set.
Definition 2.1: A feasible solution $\hat{x} \in X$ is called a (weakly) efficient solution of MOP (1) if there is no $x \in X$ such that $f(x) \leq (<) f(\hat{x})$. The set of all (weakly) efficient solutions is called the (weakly) efficient set in decision space and is denoted by $X_{(W)E}$. Correspondingly, $\hat{y} = f(\hat{x})$ is called a (weakly) non-dominated point and $Y_{(W)N} := \{f(x) : x \in X_{(W)E}\}$ is the (weakly) non-dominated set in objective space.
Throughout this article, we assume that the multi-objective optimization problem is non-trivial, i.e. $y^I \notin Y$ and $Y \neq Y_N$, where $Y := \{f(x) : x \in X\}$ is the image of the feasible set in objective space and $y^I$ is the ideal point defined by $y^I_k := \min\{f_k(x) : x \in X\}$, $k = 1, \ldots, p$. This means that the objectives do not have a common minimizer and that not every feasible point is non-dominated. If $y^I \in Y$ then $Y_N = \{y^I\}$. In this case, as well as if $Y_N = Y$, the optimization problem we study in this paper becomes either trivial or becomes a standard single objective optimization problem.
In case all objectives of MOP (1) are linear and the feasible set $X$ is polyhedral, (1) is called a multi-objective linear programme (MOLP) and can be written as
$$\min\{Cx : x \in X\}, \qquad (2)$$
where $C$ is a $p \times n$ matrix. The feasible set $X$ is defined by linear constraints $Ax \geq b$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. A polyhedral convex set such as $X$ has a finite number of faces. A subset $F$ of $X$ is a face if and only if there are $\omega \in \mathbb{R}^n \setminus \{0\}$ and $\gamma \in \mathbb{R}$ such that $X \subseteq \{x \in \mathbb{R}^n : \omega^T x \geq \gamma\}$ and $F = \{x \in \mathbb{R}^n : \omega^T x = \gamma\} \cap X$. We call a hyperplane $H = \{x \in \mathbb{R}^n : \omega^T x = \gamma\}$ supporting $X$ at $x^0$ if $\omega^T x \geq \gamma$ for all $x \in X$ and $\omega^T x^0 = \gamma$. The proper $(r-1)$-dimensional faces of an $r$-dimensional polyhedral set $X$ are called facets of $X$. Proper faces of dimension zero are called extreme points or vertices of $X$.
In Example 2.1, we provide a numerical example of a multi-objective linear programme, which we
shall use throughout the paper.
Example 2.1:
$$\min \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{s.t.} \quad \begin{pmatrix} 4 & 1 \\ 3 & 2 \\ 1 & 5 \\ -1 & -1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \geq \begin{pmatrix} 4 \\ 6 \\ 5 \\ -6 \\ 0 \\ 0 \end{pmatrix}.$$
Figure 1 shows the feasible set X of Example 2.1, as well as its image Y in objective space (due to C
being the identity matrix). The bold line segments compose the non-dominated set YN . Furthermore,
in this example, XE is the same as YN .
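To make the small example tangible, the following snippet (our illustration, not code from the paper; it assumes scipy and numpy are available) enumerates the vertices of the feasible set of Example 2.1 numerically:

```python
# Enumerate the vertices of X = {x : Ax >= b} for Example 2.1.
# scipy expects halfspaces in the stacked form [normal | offset], n.x + d <= 0.
import numpy as np
from scipy.spatial import HalfspaceIntersection

A = np.array([[4, 1], [3, 2], [1, 5], [-1, -1], [1, 0], [0, 1]], dtype=float)
b = np.array([4, 6, 5, -6, 0, 0], dtype=float)

# Ax >= b  <=>  -Ax + b <= 0, so the halfspace matrix is [-A | b].
halfspaces = np.hstack([-A, b.reshape(-1, 1)])
interior_point = np.array([2.0, 2.0])  # strictly feasible: A @ [2,2] > b holds

hs = HalfspaceIntersection(halfspaces, interior_point)
vertices = np.unique(np.round(hs.intersections, 6), axis=0)
print(vertices)
# Expected vertices (cf. Figure 1): (0,4), (0,6), (0.4,2.4), (20/13,9/13),
# (5,0), (6,0); since C is the identity, these are also the vertices of Y.
```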
Example 2.2:
$$\min \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{s.t.} \quad \frac{(x_1 - 3)^2}{9} + \frac{(x_2 - 2)^2}{4} \leq 1.$$
The image of the feasible set in objective space is illustrated in Figure 2. The bold curve is the non-dominated set of this convex multi-objective optimization problem (CMOP).
Consider the problem of optimizing a linear function over the efficient set of the MOLP (2),
$$\max\{\mu^T Cx : x \in X_E\}. \qquad (4)$$
Equations (4) and (5) can be regarded as a bilevel optimization problem, where (4) is the upper level problem and (5), the underlying MOLP, is the lower level problem. The constraint $x \in X_E$ in (4) can be replaced by the well known optimality conditions for MOLP, see e.g. [11]: a feasible solution $x \in X$ is efficient if and only if there exist $\lambda \in \mathbb{R}^p_{>}$ and $u \in \mathbb{R}^m$ such that $A^T u = C^T\lambda$ and $\lambda^T Cx = b^T u$.
Therefore, Equation (4) can be rewritten as
$$\max\left\{\mu^T Cx : Ax \geq b,\ A^T u = C^T\lambda,\ \lambda^T Cx = b^T u,\ \lambda \in \mathbb{R}^p_{>}\right\}. \qquad (6)$$
While Equation (6) has a linear objective function and some linear constraints, it also has quadratic constraints (the term $\lambda^T Cx$ is bilinear in $\lambda$ and $x$) and requires that $\lambda$ be strictly positive. Hence, solving Problem (6) will be considerably more difficult than solving a linear programme. Example 2.3 thus provides one motivation for research in algorithms to solve Equation (3) in the linear case: is it possible to derive algorithms that only make use of linear programming techniques?
In multi-objective optimization, it is well known that because the number of objective functions is
usually much smaller than the number of decision variables, the structure of Y is most often simpler
than the structure of X . In particular, in multi-objective linear programming, the structure and prop-
erties of XE and YN are well investigated. Dauer [12] notes that Y often has fewer extreme points and
faces than X . Dauer [13] illustrates the concept of ‘collapsing’, which means that faces of X shrink
into non-facial subsets of Y . Benson [14] shows that the dimension of efficient faces in the feasible
set always exceeds or equals the dimension of their images in Y . Hence, it is arguably more compu-
tationally efficient to employ techniques and methods to solve Problem (3) in objective space, and
algorithms for optimization over the efficient set have followed this trend since 2000. The algorithms
we propose in this paper fall into this category, too.
In this context, we assume that the objective function $\Phi$ of Problem (3) is a composite function of a function $M : \mathbb{R}^p \to \mathbb{R}$ and the objective function $f$ of the underlying MOP, i.e. $\Phi = M \circ f$. Therefore, $\Phi(x) = M(f(x))$. Substituting $y = f(x)$ into Problem (3), we derive the problem of optimizing $M$ over the non-dominated set $Y_N$ of an MOP:
$$\max\{M(y) : y \in Y_N\}. \qquad (7)$$
Problem (7) is essentially the same problem as Problem (3) but appears to be more intuitive, because in practice decision-makers typically choose a preferred solution based on the objective function values rather than the values of the decision variables.
In this article, the problems we are interested in are two special cases of Problem (7), namely Problems (8) and (10) defined below.
$$\max\{\mu^T y : y \in P_N\}, \qquad (8)$$
where $\mu \in \mathbb{R}^p$ and $P_N$ is the non-dominated set of the upper image $P := Y + \mathbb{R}^p_{\geq}$ of an MOLP (2). Note that the set $V_P$ of vertices of $P$ satisfies $V_P \subset P_N = Y_N$, i.e. all vertices of $P$ are non-dominated, and the non-dominated sets of $Y$ and $P$ coincide. Moreover, $P_N$ is a subset of the boundary of $P$, see for example [11, Proposition 2.4]. It is easy to see that Theorem 2.1 holds for Problem (8).
Theorem 2.1: There exists an optimal solution y∗ of Problem (8) at a vertex of P , i.e. y∗ ∈ VP , the set
of vertices of P .
In the second special case, we consider a CMOP as the underlying MOP. Then, using once again $P_N = Y_N$, Problem (9) optimizes a linear function over the non-dominated set $P_N$ of the upper image $P = Y + \mathbb{R}^p_{\geq}$ of a CMOP:
$$\max\{\mu^T y : y \in P_N\}. \qquad (9)$$
Because the upper image $P$ of a CMOP is a convex (but not necessarily polyhedral) set, we will in general not be able to compute it or its non-dominated subset exactly. Hence, we consider approximations of $P_N$ using the concept of $\epsilon$-non-dominance (cf. the dual concept of $\epsilon$K-maximality in Definition 3.1 below). Consequently, we change Problem (9) by replacing $P_N$ with $P_{\epsilon N}$, an $\epsilon$-non-dominated set of the upper image of a CMOP:
$$\max\{\mu^T y : y \in P_{\epsilon N}\}. \qquad (10)$$
3. A revised version of Benson's outer approximation algorithm and its dual variant

For a point $y \in \mathbb{R}^p$, define
$$\lambda^*(y) := \left(y_1 - y_p, \ldots, y_{p-1} - y_p, -1\right)^T. \qquad (12)$$
Consider the weighted sum scalarisation $P_1(v)$ of (2),
$$\min\{\lambda(v)^T Cx : Ax \geq b\}, \qquad (P_1(v))$$
where $\lambda(v) := \left(v_1, \ldots, v_{p-1}, 1 - \sum_{k=1}^{p-1} v_k\right)^T$.

Proposition 3.1: Let $v \in \mathbb{R}^p$ with $v_k \geq 0$ for $k = 1, \ldots, p-1$ and $\sum_{k=1}^{p-1} v_k \leq 1$, so that $\lambda(v) \geq 0$. Then an optimal solution $\hat{x}$ to $(P_1(v))$ is a weakly efficient solution to the MOLP (2).

Given a point $y \in \mathbb{R}^p$ in objective space, the following LP $(P_2(y))$ serves to check the feasibility of $y$:
$$\min\{z : Ax \geq b,\ Cx - ze \leq y\}, \qquad (P_2(y))$$
where $e := (1, \ldots, 1)^T$. If the optimal value $\hat{z} > 0$, then $y$ is infeasible, otherwise $y \in P$. An optimal solution $(\hat{x}, \hat{z}) \in X \times \mathbb{R}$ to $(P_2(y))$ provides a weakly non-dominated point $\hat{y} = C\hat{x}$ of $P$.
Proposition 3.2 is the key result for the revised version of Benson's algorithm.

Proposition 3.2 ([4]): Let $(u^*, \lambda^*)$ be an optimal solution to $(D_2(y))$, the linear programming dual of $(P_2(y))$. Then
$$\left\{y \in \mathbb{R}^p : \sum_{k=1}^{p} \lambda^*_k y_k = b^T u^*\right\}$$
is a supporting hyperplane to $P$.

Therefore, by solving $(P_2(y))$, we not only check the feasibility of point $y$ but also obtain the dual variable values $(u^*, \lambda^*)$ as an optimal solution to $(D_2(y))$, by which we construct a supporting hyperplane to $P$.
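The following sketch illustrates Proposition 3.2 on Example 2.1 (our code, using scipy's HiGHS-based `linprog` instead of the CPLEX solver used in the paper): it solves $(P_2(y))$ for $y = (0, 0)^T$ and reads an optimal dual solution $(u^*, \lambda^*)$ off the LP marginals to build the supporting hyperplane.

```python
# Feasibility check P2(y) and the supporting hyperplane of Proposition 3.2.
# Variables w = (x1, x2, z); we minimize z subject to Ax >= b and Cx - ze <= y.
import numpy as np
from scipy.optimize import linprog

A = np.array([[4, 1], [3, 2], [1, 5], [-1, -1], [1, 0], [0, 1]], dtype=float)
b = np.array([4, 6, 5, -6, 0, 0], dtype=float)
C = np.eye(2)                 # objective matrix of Example 2.1
y = np.array([0.0, 0.0])      # point to be checked

# Inequalities in <= form: rows 0-5 are -Ax <= -b, rows 6-7 are Cx - ze <= y.
A_ub = np.vstack([np.hstack([-A, np.zeros((6, 1))]),
                  np.hstack([C, -np.ones((2, 1))])])
b_ub = np.concatenate([-b, y])
res = linprog(c=[0, 0, 1], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 3, method="highs")

z_hat = res.fun                    # 1.2 > 0, hence y is infeasible (y not in P)
u = -res.ineqlin.marginals[:6]     # duals of Ax >= b (nonnegative)
lam = -res.ineqlin.marginals[6:]   # duals of Cx - ze <= y; they sum to 1
print(z_hat, lam, b @ u)
# lam = (0.6, 0.4) and b@u = 1.2, i.e. the supporting hyperplane
# 0.6*y1 + 0.4*y2 = 1.2 (equivalently 3*y1 + 2*y2 = 6) of Proposition 3.2.
```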
Heyde and Löhne [18] introduced a concept of geometric duality for multi-objective linear programming. This theory relates an MOLP to a dual multi-objective linear programme in dual objective space $\mathbb{R}^p$. In dual objective space, we use the following notation to compare two vectors $v^1, v^2 \in \mathbb{R}^p$. We write $v^1 >_K v^2$ if $v^1_k = v^2_k$ for $k = 1, \ldots, p-1$ and $v^1_p > v^2_p$; $v^1 \geqq_K v^2$ if $v^1_k = v^2_k$ for $k = 1, \ldots, p-1$ and $v^1_p \geqq v^2_p$. Moreover, $v^1 \geq_K v^2$ is the same as $v^1 >_K v^2$.
The dual of the MOLP (2) is
$$\max\nolimits_K\left\{D^*(u, \lambda) : A^T u = C^T\lambda,\ u \geq 0,\ \lambda \geq 0,\ e^T\lambda = 1\right\}, \qquad (13)$$
where $D^*(u, \lambda) := (\lambda_1, \ldots, \lambda_{p-1}, b^T u)^T$ and maximization is with respect to the ordering cone $K := \{v \in \mathbb{R}^p : v_1 = \cdots = v_{p-1} = 0,\ v_p \geq 0\}$. We denote by $D$ the lower image of (13) in dual objective space, i.e. the set of all points $v$ with $v \leq_K D^*(u, \lambda)$ for some feasible $(u, \lambda)$.
Figure 3. Lower image of Equation (13) for the MOLP of Example 2.1.
Theorems 3.1 and 3.2 state a relationship between proper K-maximal faces of $D$ and proper weakly non-dominated faces of $P$. They involve the hyperplanes $H(v) := \{y \in \mathbb{R}^p : \lambda(v)^T y = v_p\}$ and $H^*(y) := \{v \in \mathbb{R}^p : \lambda^*(y)^T v = -y_p\}$ associated with points $v$ and $y$ of the dual and primal objective spaces, respectively.

Theorem 3.1 ([18]):
(1) A point $v$ is a K-maximal vertex of $D$ if and only if $H(v) \cap P$ is a weakly non-dominated facet of $P$.
(2) A point $y$ is a weakly non-dominated vertex of $P$ if and only if $H^*(y) \cap D$ is a K-maximal facet of $D$.

For a proper K-maximal face $F^*$ of $D$, define
$$\Psi(F^*) := \bigcap_{v \in F^*} H(v) \cap P.$$

Theorem 3.2 ([18]): $\Psi$ is an inclusion reversing one-to-one map between the set of all proper K-maximal faces of $D$ and the set of all proper weakly non-dominated faces of $P$, and the inverse map is given by
$$\Psi^{-1}(F) = \bigcap_{y \in F} H^*(y) \cap D.$$
Moreover, for every proper K-maximal face $F^*$ of $D$ it holds that $\dim F^* + \dim \Psi(F^*) = p - 1$.

Therefore, given a non-dominated extreme point $y^{ex} \in P$, the corresponding K-maximal facet of $D$ is $H^*(y^{ex}) \cap D$.
The revised version of Benson’s algorithm applies an outer approximation to P in the primal objec-
tive space, whereas its dual variant does the same to D in the dual objective space. Hamel et al. [4]
detail the dual algorithm. It iteratively generates supporting hyperplanes of D, which correspond to
weakly non-dominated faces of P . Eventually, a complete set of hyperplanes that define D, as well as
the set of all extreme points of D are obtained.
Löhne et al. [19] extend this outer approximation approach to the case of a convex MOP. Similar to the revised version of Benson's algorithm for computing $P_N$ in the case of an MOLP, this algorithm iteratively constructs a polyhedral outer approximation of the upper image $P$ of a CMOP. It starts with a polyhedron $S^0 = y^I + \mathbb{R}^p_{\geq}$ containing $P$. In each iteration $i$, a vertex of $S^{i-1}$ that is not an $\epsilon$-non-dominated point of $P$ is chosen to generate a supporting hyperplane to $P$. Then the approximating polyhedron is updated by intersecting it with the halfspace containing $P$ defined by the supporting hyperplane. The algorithm terminates when all of the vertices of $S^i$ are $\epsilon$-non-dominated, returning a vertex and inequality representation of the polyhedron $S^i$ containing $P$.
We introduce two pairs of single objective optimization problems to facilitate the description of
algorithms later. Problem P1 (v) is a weighted sum problem. Solving P1 (v) results in a weakly non-
dominated point. Problem D1 (v) is the Lagrangian dual of P1 (v). Because we work with Lagrangian
duals, we will from now on assume that the functions defining the MOP are differentiable. Problems
P2 (y) and D2 (y) are employed to generate supporting hyperplanes. These four optimization problems
are the non-linear extensions of the LPs P1 (v), D1 (v), P2 (y) and D2 (y), respectively. They involve
non-linear convex terms making them harder to solve than their LP counterparts.
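As an illustration of the convex counterpart of $(P_2(y))$ — a minimal sketch under the assumption that it takes the form $\min\{z : f(x) - ze \leq y,\ x \in X\}$, analogous to the linear case — the following solves it for Example 2.2 and $y = (0, 0)^T$ with a general-purpose NLP solver:

```python
# Convex feasibility problem P2(y) for Example 2.2 (our illustration).
# Decision vector w = (x1, x2, z); f is the identity for this example.
import numpy as np
from scipy.optimize import minimize

y = np.array([0.0, 0.0])

constraints = [
    # ellipse constraint defining X: 1 - (x1-3)^2/9 - (x2-2)^2/4 >= 0
    {"type": "ineq",
     "fun": lambda w: 1 - (w[0] - 3) ** 2 / 9 - (w[1] - 2) ** 2 / 4},
    # f(x) - ze <= y componentwise, rewritten as y + z - x >= 0
    {"type": "ineq", "fun": lambda w: y[0] + w[2] - w[0]},
    {"type": "ineq", "fun": lambda w: y[1] + w[2] - w[1]},
]
res = minimize(lambda w: w[2], x0=np.array([3.0, 2.0, 5.0]),
               constraints=constraints, method="SLSQP")

z_hat = res.fun  # approx. 0.709 > 0, so y lies outside the upper image P
print(z_hat, res.x[:2])
# y + z_hat * e = (0.709, 0.709) is a weakly non-dominated boundary point of P.
```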
The outer approximation algorithm of [19] for convex MOPs approximates the non-polyhedral $P$ (see Example 2.2), generating a set of $\epsilon$-non-dominated points. Löhne et al. [19] also propose a dual variant of the convex version of Benson's algorithm (see Section 5.3). The geometric dual of a convex MOP is defined as
$$\max\nolimits_K\left\{D(v) : v \in \mathbb{R}^p,\ \lambda(v) \geq 0\right\}, \qquad (16)$$
where $D(v) = \left(v_1, \ldots, v_{p-1}, \min_{x \in X}\left[\lambda(v)^T f(x)\right]\right)^T$. The ordering cone $K := \{v \in \mathbb{R}^p : v_1 = v_2 = \cdots = v_{p-1} = 0,\ v_p \geq 0\}$ is the same as in (13), and maximization is with respect to the order defined by $K$. Let $V$ denote the feasible set in the dual objective space; then its lower image in the dual objective space is $D := V - K$. The K-maximal set of Problem (16) is
$$D_K = \max\nolimits_K\left\{\left(\lambda_1, \ldots, \lambda_{p-1}, \min_{x \in X}\left[\lambda(v)^T f(x)\right]\right)^T : (u, \lambda) \geq 0,\ e^T\lambda = 1\right\}.$$
Figure 4 shows the lower image of Example 2.2 in dual objective space; the bold curve is the K-maximal set. For computational purposes, we consider an approximation of the K-maximal set. This is modelled by the concept of $\epsilon$K-maximality, which is defined in Definition 3.1.

Definition 3.1: Let $\epsilon \in \mathbb{R}$ and $\epsilon > 0$. A point $v$ is called an $\epsilon$K-maximal point if $v - \epsilon e^p \in D$ and there does not exist any $\hat{v} \in D$ such that $\hat{v}_j = v_j$ for $j = 1, \ldots, p-1$ and $\hat{v}_p > v_p$. Here $e^p$ denotes the $p$-th unit vector.
The dual version of the convex Benson algorithm for solving CMOPs computes an outer approximation to $D$. This algorithm first chooses an interior point of $D$; this is implemented in the way stated in Algorithm 2, Step i1 in [16]. Then a polyhedron $S^0$ containing $D$ is constructed. In each iteration, a vertex $s^i$ of $S^{i-1}$ that does not belong to $D_{\epsilon K}$ is chosen. By solving $(P_1(s^i))$, a supporting hyperplane to $D$ is determined. Eventually, a set of $\epsilon$K-maximal points of $D$ is obtained, the convex hull of which, extended by $-K$, is an outer approximating polyhedron of $D$.
Figure 4. The lower image of the CMOP of Example 2.2 in dual objective space.
4. Algorithms for optimization over the efficient set

For the bi-objective linear case, the outcome-based algorithms of [6,20] first determine two non-dominated points $y^1$ and $y^2$, and thereby a lower bound on $M(y)$. Then optimization Problem (17) is solved to obtain an upper bound,
$$\max\left\{M(y) : (y^2_2 - y^1_2)y_1 + (y^1_1 - y^2_1)y_2 \geq y^1_1 y^2_2 - y^1_2 y^2_1,\ y \in Y\right\}. \qquad (17)$$
In the objective space, Problem (17) means to find an optimal point over the region bounded by the non-dominated set and the line connecting $y^1$ and $y^2$. Having solved Problem (17), we have found an upper bound. In the case that $M(y)$ is non-linear, branching steps may now take place. Let the line segment connecting $y^1$ and $y^2$ shift parallel until it becomes a supporting hyperplane to $Y$ at some point $q$. By connecting both $y^1$ and $y^2$ with $q$, the branching process splits the problem into two subproblems. For each of the subproblems, the same process is repeated until the upper bound and the lower bound coincide. Computational experiments can be found in [6].
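For a linear function $M$, Problem (17) is just an LP. A minimal sketch for Example 2.1 with $M(y) = y_1 + y_2$ (our illustration; the chord through $y^1 = (0,4)^T$ and $y^2 = (5,0)^T$ is $4y_1 + 5y_2 = 20$) reads:

```python
# Upper-bound LP (17) for Example 2.1 with M(y) = y1 + y2: maximize M over
# Y intersected with the halfspace below the chord, 4*y1 + 5*y2 <= 20.
import numpy as np
from scipy.optimize import linprog

A = np.array([[4, 1], [3, 2], [1, 5], [-1, -1], [1, 0], [0, 1]], dtype=float)
b = np.array([4, 6, 5, -6, 0, 0], dtype=float)

A_ub = np.vstack([-A, [[4, 5]]])     # Y: Ax >= b, plus the chord cut
b_ub = np.concatenate([-b, [20.0]])
res = linprog(c=[-1, -1], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 2, method="highs")
print(res.x, -res.fun)  # (5, 0) with upper bound 5
# Since M is linear here, the upper bound is attained at a non-dominated
# point and no branching is required.
```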
In [21], this method is extended to maximize an increasing function M(y) over the non-dominated
set of a convex bi-objective optimization problem. The first step is to determine y1 and y2 and a lower
bound in the same way as the linear version by [6]. These two points and the ideal point yI define a
simplex. The objective function M(y) is maximized over the intersection of Y and the simplex so that
an upper bound on the optimal objective function value can be attained. A point q at the intersection
of a ray emanating from the origin and Y is determined. Point q splits the simplex into two, each
of which is to be explored in the subsequent iterations. The simplices with upper bounds that are
worse than the incumbent objective function values are pruned. This process is iterated until the
gap between the upper bound and the lower bound is within a tolerance determined initially by the
decision maker.
The conical branch and bound algorithm of [8] constructs a cone that contains $Y$, with vertex at the anti-ideal point $y^{AI}$ and spanned by the directions $y^k - y^{AI}$, $k = 1, \ldots, p$, where $U$ is a matrix containing the column vectors $y^k - y^{AI}$ for $k = 1, \ldots, p$. For the branching step, a ray emanating from $y^{AI}$ and passing through the centre point of the simplex spanned by the $y^k$ vectors hits the boundary of $Y$ at a non-dominated point and a lower bound is achieved. The initial cone is partitioned. By evaluating each new cone, the gap between the upper bound and the lower bound is narrowed. A cone is called active if there is a gap between the upper bound and lower bound; an active cone will be further explored. A cone is incumbent if the upper bound meets the lower bound with the best objective value so far. A cone is fathomed if the best feasible solution found in this cone is suboptimal. An optimal point is obtained when the upper bound coincides with the lower bound.

This algorithm can also deal with optimization over the non-dominated set of a CMOP. A cone is constructed that contains $Y$. An upper bound for $M(y)$ can be found through maximizing $M(y)$ over the intersection of the cone with $Y$. A ray emanating from $y^{AI}$ intersects the non-dominated set of $Y$ at some point, providing a lower bound and partitioning the cone. The objective function $M(y)$ is maximized over each of the regions obtained by intersecting the smaller cones with the feasible set in objective space, which provides new upper bounds. This process is repeated until the upper bound and the lower bound coincide or the gap between them is small enough.
5. Primal algorithms

By Theorem 2.1, an optimal solution of Problem (8) is attained at a vertex of $P$. In this section we identify a subset of the vertices of $P$ at which an optimal solution must be attained. We start this discussion by investigating the vertices of $Y$ more closely. Let $[a - b]$ denote an edge of polyhedron $Y$ with vertices $a$ and $b$. We call points $a$ and $b$ neighbouring vertices and denote with $N(a)$ the set of all neighbouring vertices of vertex $a$.
Definition 5.1: Let [a − b] be an edge of Y . [a − b] is called a non-dominated edge if for some point
y in the relative interior of [a − b], y ∈ YN .
According to [24] (Chapter 8), a point in the relative interior of a face of Y is non-dominated if
and only if the entire face is non-dominated.
Definition 5.2: Vertex $a \in V_P$ is called a complete vertex if all faces $F$ of $P$ containing $a$ are non-dominated; otherwise it is called an incomplete vertex. Let $V^c_P$ denote the set of complete vertices of $P$ and define $V^{ic}_P := V_P \setminus V^c_P$ as the set of incomplete vertices.
Definition 5.2 provides a partition of the vertices of P into complete and incomplete ones. In
Figure 1, the complete vertices are b and c. The incomplete vertices are a and d, whereas e and f are
vertices of Y but not of P .
Proposition 5.1: Let $a$ be a vertex of $Y$. If there does not exist a vertex $b \in N(a)$ such that $\mu^T(b - a) > 0$, then $a$ is an optimal solution of the linear programme (19),
$$\max\{\mu^T y : y \in Y\}. \qquad (19)$$
Proof: Assume $a$ is not an optimal solution. Then, since (19) is a linear programme, there exists at least one vertex $b \in N(a)$ such that $\mu^T b > \mu^T a$, which means $\mu^T(b - a) > 0$, a contradiction.
Theorem 5.1: Problem (19) has at least one optimal solution that is also an optimal solution to Problem (8) if and only if $\mu \in \mathbb{R}^p_{\leq}$, i.e. $\mu \leq 0$ componentwise.

Proof: (1) Let $\mu \in \mathbb{R}^p_{\leq}$. Define $\mu' := -\mu$; then $\mu' \in \mathbb{R}^p_{\geq}$ and we can rewrite Problem (19) as $\min\{\mu'^T Cx : x \in X\}$, which is a weighted sum scalarisation of the underlying MOLP. It is well known that there exists an efficient solution $x^* \in X$ which is an optimal solution to this problem. Therefore, $y^* = Cx^* \in P_N$. Hence, $y^*$ is an optimal solution to Problem (8).

(2) Now, let $\mu \notin \mathbb{R}^p_{\leq}$ and assume that $y^*$ is an optimal solution to Problem (8). Choose $d \in \mathbb{R}^p_{\geq}$ such that $d_j = 1$ if $\mu_j > 0$ and $d_j = 0$ otherwise, and let $y_\epsilon = y^* + \epsilon d$ with $\epsilon > 0$. Since $P = Y + \mathbb{R}^p_{\geq}$, $y_\epsilon \in Y$ for sufficiently small $\epsilon$. However, $\mu^T y_\epsilon = \mu^T y^* + \epsilon\mu^T d > \mu^T y^*$, and so $y^*$ is not an optimal solution to Problem (19).

Using similar arguments as in the proof of Theorem 5.1, it is in fact possible to show that $\mu \in \mathbb{R}^p_{<}$ if and only if the sets of optimal solutions of Problems (8) and (19) are identical.
Proposition 5.2: Let $\mu \in \mathbb{R}^p \setminus \mathbb{R}^p_{\leq}$. If $y \in V^c_P$, then there exists $y' \in N(y)$ such that $\mu^T(y' - y) > 0$.

Proof: Let $\mu$ and $y$ be as in the proposition. If there were no $y' \in N(y)$ such that $\mu^T(y' - y) > 0$, then by Proposition 5.1, $y$ would be an optimal solution to Problem (19). As $y$ is also a non-dominated point, $y$ would be an optimal solution to Problem (8). According to Theorem 5.1, this implies $\mu \in \mathbb{R}^p_{\leq}$, a contradiction.
Theorem 5.2: Let $\mu \in \mathbb{R}^p \setminus \mathbb{R}^p_{\leq}$. Then the set $V^{ic}_P$ of incomplete vertices contains an optimal solution of Problem (8).

Proof: Under the assumptions of the theorem, assume that $V^{ic}_P$ does not contain an optimal solution of Problem (8). Then, because of Theorem 2.1, there exists an optimal solution $y^* \in V^c_P$. According to Proposition 5.2, there exists $y' \in N(y^*)$ such that $\mu^T y' > \mu^T y^*$. Because $y^*$ is a complete vertex, the edge $[y^* - y']$ is non-dominated, so $y' \in Y_N$; hence $y'$ is feasible for Problem (8) and a contradiction is obtained.
According to Theorem 5.1, whenever $\mu \leq 0$, Problem (8) can be solved by solving the LP (19). In case $\mu \in \mathbb{R}^p \setminus \mathbb{R}^p_{\leq}$, an optimal solution to Problem (8) must be obtained at an incomplete vertex of $P$. Algorithm 1 can therefore be restricted to incomplete vertices. Unfortunately, due to the fact that the structure of $P_N$ can be very complex (see for example [25] for an investigation of the structure of the non-dominated set of tri-objective linear programmes), a necessary and sufficient condition for checking that a vertex is incomplete is not known. In the next section, we provide an algorithm that uses cutting planes to avoid the enumeration of all vertices of $P$.
Theorem 5.2 also explains why optimizing over the non-dominated set of a bi-objective linear programme is easy. In the case $p = 2$, the extreme points of $P$ can be ordered according to increasing values of one objective and therefore decreasing values of the other objective. It follows that $Y$ has exactly two incomplete non-dominated extreme points (in Figure 1 these are $a$ and $d$). These two extreme points are the non-dominated extreme points obtained by lexicographically optimizing the objectives in the order (1,2) and (2,1), respectively. Hence, for $p = 2$ objectives, Problem (8) can be solved by linear programming, solving two lexicographic LPs or four single objective LPs.
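For Example 2.1, this recipe can be carried out with any LP solver. The sketch below (ours, using scipy) computes the two lexicographically optimal points and picks the better one for $\mu = (1, 1)^T$, reproducing the optimal point of Example 5.1:

```python
# Solving Problem (8) for p = 2 by two lexicographic LPs
# (Example 2.1, where C is the identity so y = x).
import numpy as np
from scipy.optimize import linprog

A = np.array([[4, 1], [3, 2], [1, 5], [-1, -1], [1, 0], [0, 1]], dtype=float)
b = np.array([4, 6, 5, -6, 0, 0], dtype=float)
mu = np.array([1.0, 1.0])

def lexmin(first, second):
    """Minimize objective `first`, then `second` while keeping `first` optimal."""
    r1 = linprog(c=np.eye(2)[first], A_ub=-A, b_ub=-b,
                 bounds=[(None, None)] * 2, method="highs")
    # Fix the first objective to its optimal value via an equality constraint.
    r2 = linprog(c=np.eye(2)[second], A_ub=-A, b_ub=-b,
                 A_eq=np.eye(2)[[first]], b_eq=[r1.fun],
                 bounds=[(None, None)] * 2, method="highs")
    return r2.x

candidates = [lexmin(0, 1), lexmin(1, 0)]   # incomplete vertices (0,4) and (5,0)
best = max(candidates, key=lambda yv: mu @ yv)
print(candidates, best, mu @ best)          # best = (5, 0) with value 5
```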
Example 5.1: The primal algorithm is illustrated in Figure 5 by maximizing y1 + y2 over the non-
dominated set of Example 2.1.
10: Set $S^i := S^{i-1} \cap \left\{y \in \mathbb{R}^p : \sum_{k=1}^{p} \lambda^i_k y_k \geq b^T u^i\right\}$. Update $V_{S^i}$. Threshold := False.
11: end if
12: Set $i := i + 1$.
13: end while
14: $y^* := s^i$.
Figure 5 and Table 1 show in each iteration the extreme point chosen, the type of cut added and the incumbent objective function value. In the first iteration, an improvement cut is added, generating two new vertices, $(2, 0)^T$ and $(0, 3)^T$. Then $(0, 3)^T$ is chosen in the next iteration because it provides the best objective function value so far. The second iteration improves the function value to 4. Since $(0, 4)^T$ is feasible, a threshold cut, $y_1 + y_2 \geq 4$, is then added. Although it does not improve the objective function value, this cut generates a new infeasible vertex $(4, 0)^T$. An optimal point, $(5, 0)^T$, is found after another improvement cut has been generated.
10: Set $S^i := S^{i-1} \cap \left\{y \in \mathbb{R}^p : \sum_{k=1}^{p} \lambda^i_k y_k \geq b^T u^i\right\}$. Threshold := False. Update $V_{S^i}$.
11: end if
12: Set $i := i + 1$.
13: end while
14: $y^* \in \operatorname{argmax}\left\{\mu^T y : y \in V_{S^{i-1}}\right\}$.
15: Find an optimal solution $\hat{x}$ to $P_2(y^*)$. An approximately optimal solution to Problem (9) is $\hat{y} = f(\hat{x})$.
Now we extend the primal Algorithm 2 to solve Problem (10). The algorithm starts with a polyhedron $S^0$ containing $P$. In each iteration, a hyperplane is generated and added to the inequality representation of $S^{i-1}$, resulting in a set of new extreme points. Evaluating $\mu^T y$ at these points, we select the one with the best function value to construct the cut for the next iteration. If the selected point is an $\epsilon$-non-dominated point, a threshold cut is added; otherwise an improvement cut $\{y \in \mathbb{R}^p : \sum_{k=1}^{p} \lambda^i_k y_k \geq b^T u^i\}$ as in the algorithm of [19] for convex MOPs is added. A threshold cut is $\{y \in \mathbb{R}^p : \mu^T y \geq \mu^T \hat{y}\}$, where $\hat{y}$ is the incumbent solution; it removes the region where points are worse than $\hat{y}$. At the end of the algorithm, an $\epsilon$-non-dominated point $y^*$ is obtained, which is an optimal solution to Problem (10). Furthermore, by solving $(P_2(y^*))$, we can find an element $\hat{y}$ of $P_N$, which is an approximate solution to Problem (9). We also know that $y^* \leq \hat{y} \leq y^* + \epsilon e$, where $\epsilon$ is the approximation error predetermined by the decision-maker. As a result, we have solved Problem (10) exactly and determined an approximate solution $\hat{y}$ to Problem (9). This primal algorithm is detailed in Algorithm 3.
Example 5.2: We illustrate Algorithm 3 by maximizing $5y_1 + 7y_2$ over the non-dominated set of the CMOP of Example 2.2.

Figure 6 and Table 2 show the iterations of Algorithm 3. In the first iteration, an improvement cut is added, generating two new vertices, $(0, 1.27)^T$ and $(1.61, 0)^T$. Then $(0, 1.27)^T$ is chosen in the next iteration because it provides the best objective function value so far. The second iteration improves the function value to 11.94. Since $(0, 1.71)^T$ is $\epsilon$-non-dominated, a threshold cut, $5y_1 + 7y_2 \geq 11.94$, is then added. Although it does not improve the objective function value, this cut generates a new infeasible vertex $(2.39, 0)^T$. An $\epsilon$-non-dominated point, $y^* = (2.9, 0)^T$, is found after another improvement cut has been generated. By solving $P_2(y^*)$, we obtain an approximately optimal solution $(2.9001, 0.001)^T$ with objective function value 14.5075.
6. Dual algorithms

We notice that, without loss of generality, we can assume $\mu \geq 0$. Otherwise, we set $\hat{\mu}_k = -\mu_k$ and $\hat{y}_k = -y_k$ whenever $\mu_k < 0$, and $\hat{\mu}_k = \mu_k$ and $\hat{y}_k = y_k$ whenever $\mu_k \geq 0$, to rewrite the objective function of Problem (20) in the form $\hat{\mu}^T\hat{y}$ with $\hat{\mu} \geq 0$.
For a non-dominated extreme point $y^{ex}$ of $P$, the level set $\{y \in \mathbb{R}^p : \mu^T y = \mu^T y^{ex}\}$ of Equation (22) is a hyperplane in the primal objective space; it therefore corresponds to a point $v^\mu$ in dual objective space. According to geometric duality theory, in particular Equation (12), this point is nothing but
$$v^\mu = \left(\frac{\mu_1}{\bar{\mu}}, \ldots, \frac{\mu_{p-1}}{\bar{\mu}}, \frac{\mu^T y^{ex}}{\bar{\mu}}\right)^T, \quad \text{where } \bar{\mu} := \sum_{k=1}^{p} \mu_k.$$
Notice that only the last element of $v^\mu$ varies with $y^{ex}$, i.e. the first $p-1$ elements of $v^\mu$ are determined by $\mu$ alone. Geometrically, this means that $v^\mu$ with respect to various extreme points $y^{ex}$ lies on a vertical line $L_\mu := \{v \in \mathbb{R}^p : v_1 = \mu_1/\bar{\mu}, \ldots, v_{p-1} = \mu_{p-1}/\bar{\mu}\}$. Furthermore, the last element of $v^\mu$ is equal to the objective function value of Problem (8) at $y^{ex}$ divided by $\bar{\mu}$. Hence, geometrically, Problem (8) is equivalent to finding a point $v^\mu$ with the largest last element along $L_\mu$.

Theorem 6.1: For every non-dominated extreme point $y^{ex}$ of $P$, the point $v^\mu$ lies on the hyperplane $H^*(y^{ex})$, i.e. $v^\mu \in H^*(y^{ex})$.
Proof: Substitute the point $v^\mu$ into the equation of $H^*(y^{ex})$. The left hand side is
$$\begin{aligned}
\mathrm{LHS} &= (y^{ex}_1 - y^{ex}_p)\frac{\mu_1}{\bar{\mu}} + \cdots + (y^{ex}_{p-1} - y^{ex}_p)\frac{\mu_{p-1}}{\bar{\mu}} - \frac{\mu^T y^{ex}}{\bar{\mu}} \\
&= \frac{1}{\bar{\mu}}\left(\sum_{i=1}^{p-1}\mu_i y^{ex}_i - y^{ex}_p\sum_{i=1}^{p-1}\mu_i - \sum_{i=1}^{p-1}\mu_i y^{ex}_i - \mu_p y^{ex}_p\right) \\
&= -\frac{\sum_{i=1}^{p}\mu_i}{\bar{\mu}}\, y^{ex}_p = -y^{ex}_p,
\end{aligned}$$
which equals the right hand side of the equation of $H^*(y^{ex})$. Hence $v^\mu \in H^*(y^{ex})$.
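As a quick check of this geometry on Example 2.1 with $\mu = (1, 1)^T$, so that $\bar{\mu} = 2$, the two incomplete non-dominated extreme points give
$$v^\mu\big|_{y^{ex} = (0,4)^T} = \left(\tfrac{1}{2},\ \tfrac{0+4}{2}\right)^T = \left(\tfrac{1}{2},\ 2\right)^T, \qquad v^\mu\big|_{y^{ex} = (5,0)^T} = \left(\tfrac{1}{2},\ \tfrac{5+0}{2}\right)^T = \left(\tfrac{1}{2},\ \tfrac{5}{2}\right)^T,$$
so the highest point of $L_\mu$ reached by these hyperplanes corresponds to $y^{ex} = (5, 0)^T$, with objective value $\bar{\mu} \cdot \tfrac{5}{2} = 5$, in agreement with Examples 5.1 and 6.1.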
Figure 7. Projection of a three dimensional lower image $D$ onto the $v_1$-$v_2$ coordinate plane.
This discussion shows that, because we are just interested in finding a hyperplane $H^*(y^{ex})$ that intersects $L_\mu$ at the highest point, i.e. the point with the largest last element value, it is unnecessary to obtain the complete K-maximal set of $D$. We now characterize which elements of this set we need to consider.
In Section 5, we reached the conclusion that an optimal solution to Problem (8) can be found at an incomplete vertex of $Y$. An analogous idea applies to the facets of the dual polyhedron. In the rest of this section, we develop this idea through the association between the upper image $P$ of the primal MOLP and the lower image $D$ of the dual MOLP and exploit it to propose the dual algorithm.

In the discussion that follows, we make use of Theorem 3.1. Let $y^i$, $i = 1, \ldots, r$, be the extreme points of $P$. Then we know that the facets of $D$ are $F^i := \Psi^{-1}(y^i) = H^*(y^i) \cap D$ for $i = 1, \ldots, r$.
Figure 7 shows the projection of a three dimensional lower image $D$ onto the $v_1$-$v_2$ coordinate plane. The cells $P(F^1)$ to $P(F^5)$ are the projections of facets $F^1$ to $F^5$ of $D$, respectively. Notice that the projections of the facets have disjoint interiors, because by definition $D$ consists of points $(v_1, \ldots, v_{p-1}, v_p)$ such that $v_p \leq v^*$ for the point $(v_1, \ldots, v_{p-1}, v^*) \in D_K$, the K-maximal subset of $D$.
Definition 6.1: K-maximal facets F i and F j of D are called neighbouring facets if dim(F i ∩ F j ) =
p − 2.
Figure 7 shows that the neighbouring facets of facet F 2 are F 3 and F 5 . Neither F 1 nor F 4 are
neighbouring facets of F 2 .
Proposition 6.1: If $y^i$ and $y^j$ are neighbouring vertices of $P$, then the facets $F^i = H^*(y^i) \cap D$ and $F^j = H^*(y^j) \cap D$ are neighbouring facets of $D$.

Proof: Since $y^i$ and $y^j$ are neighbouring vertices, there is an edge $[y^i - y^j]$ connecting them. The edge $[y^i - y^j]$ has dimension one and, according to Theorem 3.2, $\dim([y^i - y^j]) + \dim(\Psi^{-1}([y^i - y^j])) = p - 1$, so $\dim(\Psi^{-1}([y^i - y^j])) = p - 2$. Moreover, $\Psi^{-1}(y^i) = H^*(y^i) \cap D = F^i$ and $\Psi^{-1}(y^j) = H^*(y^j) \cap D = F^j$ due to our notational convention. Hence $\dim(F^i \cap F^j) = p - 2$ and $F^i$ and $F^j$ are neighbouring facets.
Definition 6.2: Let F be a K-maximal facet of D. If all neighbouring facets of F are K-maximal facets,
then F is called a complete facet, otherwise it is called an incomplete facet. The set of all complete
facets of D is denoted by F c , the set of all incomplete facets of D is denoted by F ic .
In Figure 3, there are two incomplete facets, namely the facet attached to the origin and the facet attached to the point $(1, 0)^T$. The other two facets are complete. In Figure 7, the projections of the incomplete facets are $P(F^2)$, $P(F^3)$, $P(F^4)$ and $P(F^5)$. The only complete facet is $F^1$, hence $P(F^1)$ is surrounded by the projections of the incomplete facets.
Theorem 6.2: There exists a one-to-one correspondence between incomplete facets of D and incomplete
vertices of P .
Proof: Proposition 6.1 states that the facets of D have the same neighbouring relations as the vertices
of P , which implies that the completeness of a facet of D remains the same as that of its corresponding
vertex of P . Therefore, Theorem 6.2 is true.
Theorem 6.3: If $y^*$ is an optimal solution to Problem (8) at an incomplete vertex, then $H^*(y^*) \cap D$ is a K-maximal incomplete facet of $D$.

Proof: Theorem 6.3 follows directly from Theorems 5.2 and 6.2.
Theorem 6.3 says that a facet of D corresponding to an optimal extreme point solution to Problem
(8) is an incomplete facet. In other words, we do not need to investigate complete facets to find an
optimal solution because in the primal space the vertex corresponding to this facet has only non-
dominated neighbours, i.e. this vertex is a complete vertex, which cannot be an optimal solution of
Problem (8). On the other hand, an incomplete facet of D is the counterpart of an incomplete vertex of
Y . Hence, an optimal solution to Problem (8) can be obtained by investigating the incomplete facets
of D. In order to obtain the set of incomplete facets of D, let us define
p−1
WD := v ∈ Rp : vi = 0, 0 vj 1, j = 1 . . . (p − 1), j = i
i=1
⎧ ⎫
⎨ p−1
⎬
∪ v ∈ Rp : vi = 1, 0 vi 1, i = 1 . . . (p − 1) .
⎩ ⎭
i=1
In Figure 7, the highlighted triangle is the projection of $W_D$ onto the $v_1$-$v_2$ coordinate plane. The incomplete facets intersect with $W_D$ because their neighbouring facets are not all K-maximal facets. On the other hand, a complete facet 'surrounded' by K-maximal facets does not intersect with $W_D$. Hence, it is sufficient to consider the facets that intersect with $W_D$. The dual algorithm proposed below is designed to solve Problem (8) in the dual objective space through finding facets that intersect with $W_D$.
Theorem 6.2 shows that there is a one-to-one correspondence between incomplete vertices of $P$ and incomplete facets of $D$. But through intersections with $W_D$, incomplete facets of $D$ are easier to characterize than incomplete vertices of $P$ and can therefore be handled algorithmically.
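The membership test for $W_D$ is simple to implement. The following sketch (our illustration, directly checking the defining conditions of $W_D$ above) tests whether the first $p-1$ coordinates of a dual point lie on one of the $v_i = 0$ faces or on the $\sum v_i = 1$ face:

```python
# Membership test for W_D: the first p-1 coordinates of v must lie in the
# box [0,1]^{p-1} and either have a zero coordinate or sum to 1
# (the last coordinate v_p is unrestricted).
import numpy as np

def in_WD(v, tol=1e-9):
    w = np.asarray(v, dtype=float)[:-1]          # first p-1 coordinates
    in_box = np.all((w >= -tol) & (w <= 1 + tol))
    on_zero_face = in_box and np.any(np.abs(w) <= tol)
    on_sum_face = in_box and abs(w.sum() - 1) <= tol
    return bool(on_zero_face or on_sum_face)

# For p = 2 this reduces to v1 in {0, 1}, as in Example 6.1:
print(in_WD([0.0, 3.0]), in_WD([1.0, -2.0]), in_WD([0.5, 1.0]))
# True True False
```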
Example 6.1: Figure 8 illustrates the dual algorithm by maximizing $y_1 + y_2$ over the non-dominated set of Example 2.1. By employing the dual algorithm, only two of the four K-maximal facets need to be generated.

In this example, $W_D = \{v \in \mathbb{R}^2 : v_1 = 0\} \cup \{v \in \mathbb{R}^2 : v_1 = 1\}$. In Figure 8(a), $a, b \in W_D$. In Figure 8(b), vertex $a$ is used to generate a supporting hyperplane of $D$. This hyperplane corresponds to extreme point $(0, 4)^T$ in the primal space. A new vertex $c$ is found. Since $c \notin W_D$, in Figure 8(c) vertex $b$ is employed to generate another supporting hyperplane of $D$, corresponding to extreme point $(5, 0)^T$ in the primal space. At this stage, there is no infeasible vertex belonging to $W_D$ and the algorithm terminates with optimal solution $(5, 0)^T$.
Set $S^0 := \left\{v \in \mathbb{R}^p : \lambda(v) \geq 0,\ \sum_{k=1}^{p-1}(y^*_k - y^*_p)v_k - v_p \geq -y^*_p\right\}$ and $i := 1$.
while $W_D \cap V_{S^{i-1}} \not\subset D$ do
  Choose $v^i \in W_D \cap V_{S^{i-1}}$ such that $v^i \notin D$.
  Compute an optimal solution $x^i$ of $(P_1(v^i))$, set $y^i := Cx^i$.
  if $M^* < \mu^T y^i$ then
    Set $y^* := y^i$ and $M^* := \mu^T y^i$.
  end if
  Set $S^i := S^{i-1} \cap \left\{v \in \mathbb{R}^p : \sum_{k=1}^{p-1}(y^i_k - y^i_p)v_k - v_p \geq -y^i_p\right\}$. Update $V_{S^i}$.
  Set $i := i + 1$.
end while
In Figure 9(a), an initial polyhedron $S^0$ is constructed with two vertices $a, b \in W_D$. In the first iteration, vertex $a$ is selected to generate a supporting hyperplane to $D$ at $(0, 0)^T$, as shown in Figure 9(b). Now a new vertex $c$ is found. In Figure 9(c), vertex $b$ is employed to generate another supporting hyperplane to $D$ at $(1, 0)^T$, resulting in vertex $d$. Since $c, d \notin W_D$, there is no infeasible vertex belonging to $W_D$ and the algorithm terminates with the approximate solution $y^* = (3, 0)^T$ with value $\mu^T y^* = 15$. This is the optimal solution in this example.
3: Set $S^0 := \left\{v \in \mathbb{R}^p : \lambda(v) \geq 0,\ \sum_{k=1}^{p-1}(y^*_k - y^*_p)v_k - v_p \geq -y^*_p\right\}$ and $i := 1$.
4: while $W_D \cap V_{S^{i-1}} \not\subset D_{\epsilon K}$ do
5: Choose $s^i \in W_D \cap V_{S^{i-1}}$.
6: Compute an optimal solution $x^i$ of $(P_1(s^i))$ and the optimal value $z^i$ of $(P_1(s^i))$; $y^i := f(x^i)$; $M^i := \mu^T y^i$.
7: if $M^* < M^i$ then
8: $y^* := y^i$ and $M^* := M^i$.
9: end if
10: if $s^i_p - z^i > \epsilon$ then
11: Set $S^i := S^{i-1} \cap \left\{v \in \mathbb{R}^p : \sum_{k=1}^{p-1}(y^i_k - y^i_p)v_k - v_p \geq -y^i_p\right\}$. Update $V_{S^i}$.
12: else
13: $V_{\epsilon K} := V_{\epsilon K} \cup \{s^i\}$.
14: end if
15: Set $i := i + 1$.
16: end while
7. Computational results

7.1. The linear case

In this section, we use randomly generated instances to compare some of the algorithms for solving Problem (8). The method proposed by [26] is used to generate instances whose coefficients are uniformly distributed between −10 and 10. All of the algorithms were implemented in Matlab R2013b using CPLEX 12.5 as linear programming solver. The experiments were run on a computer with an Intel i7 processor (3.40GHz and 16GB RAM). We solved three instances of the same size for MOLPs with $p$ objectives, $m$ constraints and $n$ variables; note that $m = n$ for all instances. Table 3 shows the average CPU times (in seconds). We tested six algorithms, namely the brute force enumeration of all non-dominated extreme points (A1), the bi-objective branch and bound algorithm (A2), two branch and bound algorithms from the literature discussed in Section 4 (A3 and A4), and the primal and dual algorithms proposed in this paper.
Clearly, as the size of the instances grows, the CPU time increases rapidly. The crucial parame-
ter here is the number of objective functions. While with p = 2 objectives, even problems with 500
variables and constraints can be solved in less than a second, this takes between less than 1 minute
and about 10 minutes with p = 5 objectives for the different algorithms. We observe that for p = 2, the
bi-objective branch and bound algorithm turns out to be the fastest algorithm, but it cannot be gener-
alized to problems with more than two objectives. The dimension factor may play a less important role when $p$ is small. Other factors such as the number of variables and constraints may have a more
influential impact on CPU times. As p increases, the merit of the primal and the dual algorithms is
revealed. Specifically, when solving problems with 5 objectives and 500 variables and constraints, the
primal and the dual algorithms take much less time (one sixth, respectively, one tenth) than Benson’s
branch and bound algorithm. The brute force algorithm (A1) performs better than the branch and
bound algorithms (A3 and A4) because A1 only solves one LP in each iteration whereas A3 and A4
solve multiple LPs. Solving LPs is the most time-consuming step in the algorithms. Table 3 also shows
the dual algorithm performs better than the primal algorithm in solving instances of large scale. In
Figure 10, we plot the log-transformed CPU times of solving the instances with 500 variables and
constraints for the five different applicable algorithms. It shows that, as expected, the time required
to achieve optimality exponentially increases with the number of objectives, even for our primal and
dual algorithm, making the speed-up obtained by our algorithms even more important.
Figure 10. Log-transformed CPU times for instances with 500 variables and constraints.
7.2. The convex case

Convex quadratic functions are generated as the objective functions, of the form $f(x) = x^T Hx + a^T x$, where $H$ is a positive semi-definite matrix and $a$ is a column vector. Matrix $H = S^T S$, where $S$ is a square matrix. All coefficients are uniformly distributed between −10 and 10. All of the algorithms were implemented in Matlab R2013b using CPLEX 12.5 as a solver. The value of $\epsilon$ was set to $10^{-4}$. The experiments were run on a computer with an Intel i7 processor (3.40GHz, 16GB RAM). Table 4 shows the average CPU times (in seconds) of solving three instances of the same size for which the underlying CMOP has $p$ objectives, $m$ constraints and $n$ variables. We tested five algorithms, namely the brute force enumeration (A1), the extended bi-objective branch and bound algorithm of [21] (A2), the conical branch and bound algorithm of [8] (A3), and the primal (A4) and dual (A5) algorithms proposed in this paper.
Obviously, the CPU time increases rapidly as the size of the instances grows. The largest size of
instances we tested is 3 objectives and 100 variables and constraints due to the substantial amount of
time required to solve the instances to the chosen accuracy. We notice that the number of objective
functions is a crucial factor. All of the instances with p = 2 objectives can be solved within 30 sec-
onds. The most efficient algorithm is the extended bi-objective branch and bound algorithm (A2)
proposed by [21], which solves the largest instances with 100 variables and constraints within 2 sec-
onds. Unfortunately it is specific to the bi-objective case. The difference in time between A3, A4 and
A5 is not significant. We notice that adding one more dimension to the objective space (i.e. adding
one more objective function) leads to substantial increase in computational effort. For instances with
3 objectives, the required CPU time increases substantially so that even instances with 5 variables
and constraints take a few minutes to solve. Furthermore, the largest instances with 100 variables and
constraints, take a few hours. The dual algorithm solves the largest instances in half of the time used
by the conical branch and bound algorithm (A3). We also notice that the dual algorithm is faster
than the primal one in most of the cases. Throughout the test, the slowest algorithm is the brute force
algorithm (A1). This is due to the fact that this algorithm enumerates a large number of vertices which
are redundant. This also proves the advantage of the techniques employed in our new algorithms.
Additionally, in the implementation of the algorithms, we notice that the time required to solve instances is sensitive to the approximation accuracy (reflected by $\epsilon$ as stated in the algorithms). In this test, the level of accuracy is $10^{-4}$. It is expected that a lower level of accuracy results in faster solution times.
8. Conclusion
Optimization over the efficient set is a problem of concern when decision-makers have to select a
preferred point among an infinite number of efficient solutions in multiple criteria decision making.
We have addressed the case that this selection is based on the optimization of a linear function over
the non-dominated set of a linear multi-objective optimization problem. We have exploited primal
and dual variants of Benson’s algorithm, which compute all non-dominated extreme points and facets
of multi-objective linear programmes, as the basis of algorithms to solve this problem. In addition,
we have described structural properties of the problem of optimizing a linear function over the non-
dominated set to reduce the need for a complete enumeration of all non-dominated extreme points.
We have compared our algorithms to several algorithms from the literature, and the complete enu-
meration approach, and obtained speed-ups of up to 10 times on instances with up to 5 objectives
and 500 variables and constraints.
We also extended the primal and the dual algorithms to optimize a linear function over the non-
dominated set of a convex multi-objective optimization problem. We employed the techniques of the
linear primal and dual methods to facilitate the non-linear ones. Compared with other algorithms from the literature, our primal and dual algorithms are the fastest. In the future we plan to address
more challenging versions of this problem, e.g. when the function to be optimized is non-linear and
when the underlying MOP is non-convex, where we are particularly interested in the discrete case.
Acknowledgements
We express our gratitude to two anonymous referees, whose careful reading and comments helped us improve the
paper. More information on the data can be found at https://dx.doi.org/10.17635/lancaster/researchdata/224.
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
This research was supported by US Airforce Office for Scientific Research [grant number FA8655-13-1-3053].
References
[1] Ehrgott M, Naujoks B, Stewart TJ, et al. Multiple criteria decision making for sustainable energy and transportation
systems. Lecture notes in economics and mathematical systems. Vol. 634. Berlin: Springer; 2010.
[2] Markowitz H. Portfolio selection. J Finance. 1952;7(1):77–91.
[3] Ehrgott M, Güler Ç, Hamacher HW, et al. Mathematical optimization in intensity modulated radiation therapy.
Ann Oper Res. 2009;175(1):309–365.
[4] Hamel AH, Löhne A, Rudloff B. Benson type algorithms for linear vector optimization and applications. J Global
Optim. 2014;59(4):811–836.
[5] Benson HP. An all-linear programming relaxation algorithm for optimizing over the efficient set. J Global Optim.
1991;1(1):83–104.
[6] Fülöp J, Muu LD. Branch-and-bound variant of an outcome-based algorithm for optimizing over the efficient set
of a bicriteria linear programming problem. J Optim Theory Appl. 2000;105(1):37–54.
[7] Horst R, Tuy H. Global optimization: deterministic approaches. Berlin: Springer; 1993.
[8] Thoai NV. Conical algorithm in global optimization for optimizing over efficient sets. J Global Optim.
2000;18(4):321–336.
[9] Yamamoto Y. Optimization over the efficient set: overview. J Global Optim. 2002;22(1–4):285–317.
[10] Fülöp J. On the equivalency between a linear bilevel programming problem and linear optimization over the
efficient set. Hungarian Academy of Sciences; 1993. Technical report.
[11] Ehrgott M. Multicriteria optimization. 2nd ed. Berlin: Springer; 2005.
[12] Dauer JP. Analysis of the objective space in multiple objective linear programming. J Math Anal Appl.
1987;126(2):579–593.
[13] Dauer JP. On degeneracy and collapsing in the construction of the set of objective values in a multiple objective
linear program. Ann Oper Res. 1993;46–47(2):279–292.
[14] Benson HP. A geometrical analysis of the efficient outcome set in multiple objective convex programs with linear
criterion functions. J Global Optim. 1995;6(3):231–251.
[15] Benson HP. An outer approximation algorithm for generating all efficient extreme points in the outcome set of a
multiple objective linear programming problem. J Global Optim. 1998;13:1–24.
[16] Ehrgott M, Löhne A, Shao L. A dual variant of Benson's outer approximation algorithm for multiple objective linear programming. J Global Optim. 2011;52(4):757–778.
[17] Ehrgott M, Wiecek M. Multiobjective programming. In: Figueira J, Greco S, Ehrgott M, editors. Multiple criteria
decision analysis: state of the art surveys. New York (NY): Springer; 2005. p. 667–708. (International Series in
Operations Research and Management Science; Vol. 78).
[18] Heyde F, Löhne A. Geometric duality in multiple objective linear programming. SIAM J Optim. 2008;19(2):
836–845.
[19] Löhne A, Rudloff B, Ulus F. Primal and dual approximation algorithms for convex vector optimization problems.
J Global Optim. 2014;60(4):713–736.
[20] Benson HP, Lee D. Outcome-based algorithm for optimizing over the efficient set of a bicriteria linear program-
ming problem. J Optim Theory Appl. 1996;88(1):77–105.
[21] Kim NTB, Thang TN. Optimization over the efficient set of a bicriteria convex programming problem. Pac J
Optim. 2013;9:103–115.
[22] Benson HP. An outcome space algorithm for optimization over the weakly efficient set of a multiple objective
nonlinear programming problem. J Global Optim. 2011;52(3):553–574.
[23] Benson HP. Optimization over the efficient set. J Math Anal Appl. 1984;98(2):562–580.
[24] Yu P-L. Multiple-criteria decision making: concepts, techniques, and extensions. New York (NY): Plenum Press; 1985. (Mathematical Concepts and Methods in Science and Engineering; Vol. 30).
[25] Fruhwirth M, Mekelburg K. On the efficient point set of tricriteria linear programs. Eur J Oper Res.
1994;72:192–199.
[26] Charnes A, Raike WM, Stutz JD, et al. On generation of test problems for linear programming codes. Commun
ACM. 1974;17(10):583–586.