DC As I Am Convex Property
2003/9
page 1
10
1.2
We have already mentioned that convex functions are tractable in optimization (or
minimization) problems and this is mainly because of the following properties:
1. Local optimality (or minimality) guarantees global optimality;
2. Duality such as min-max relation and separation theorem holds good.
This section is to give more specific descriptions of these properties, and to discuss
their possible versions for discrete functions.
Let us first recall the definition of a convex function. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is said to be convex if
\[ \lambda f(x) + (1-\lambda) f(y) \geq f(\lambda x + (1-\lambda) y) \tag{1.2} \]
for all $x, y \in \mathbb{R}^n$ and for all $\lambda$ with $0 \leq \lambda \leq 1$, where it is understood that the
inequality is satisfied if $f(x)$ or $f(y)$ is equal to $+\infty$. The inequality (1.2) implies
that the set
\[ S = \{ x \in \mathbb{R}^n \mid f(x) < +\infty \}, \]
called the effective domain of $f$, is a convex set. Hence the present definition of
a convex function coincides with the one in (1.1) that makes an explicit reference
to the effective domain $S$. A special case of inequality (1.2) for $\lambda = 1/2$ yields the
midpoint convexity
\[ \frac{f(x) + f(y)}{2} \geq f\left( \frac{x+y}{2} \right) \qquad (x, y \in \mathbb{R}^n), \tag{1.3} \]
and, conversely, this implies convexity, provided $f$ is continuous. We often assume
(explicitly or implicitly) that $f(x) < +\infty$ for some $x \in \mathbb{R}^n$ whenever we talk about
a convex function $f$. A function $h : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty\}$ is said to be concave if $-h$ is
convex.
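The defining inequality (1.2) can be spot-checked numerically. The following is a minimal sketch, restricted to the univariate case ($n = 1$) and to finite-valued functions for brevity; the helper name `violates_convexity` and the sample grids are hypothetical choices made for illustration. A sampled check of this kind can refute convexity but cannot prove it.

```python
import itertools

def violates_convexity(f, points, lambdas, tol=1e-12):
    """Return the first (x, y, lam) with lam*f(x) + (1-lam)*f(y) < f(lam*x + (1-lam)*y),
    i.e. a witness that inequality (1.2) fails, or None if no sampled triple fails."""
    for x, y in itertools.product(points, repeat=2):
        for lam in lambdas:
            lhs = lam * f(x) + (1 - lam) * f(y)
            rhs = f(lam * x + (1 - lam) * y)
            if lhs < rhs - tol:  # tolerance absorbs floating-point rounding
                return (x, y, lam)
    return None

points = [i / 4 for i in range(-8, 9)]    # sample points in [-2, 2]
lambdas = [k / 10 for k in range(11)]     # lambda in {0, 0.1, ..., 1}

convex_witness = violates_convexity(lambda x: x * x, points, lambdas)
nonconvex_witness = violates_convexity(lambda x: -x * x, points, lambdas)
print(convex_witness, nonconvex_witness)
```

For $f(x) = x^2$ no violating triple is found, whereas for $-x^2$ a witness is returned.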
A point (or vector) $x$ is said to be a global optimum of $f$ if the inequality
\[ f(x) \leq f(y) \tag{1.4} \]
holds for every y, and x is a local optimum if this inequality holds for every y in
some neighborhood of x. Obviously, global optimality implies local optimality. The
converse is not true in general, but it is true for convex functions.
Theorem 1.1. For a convex function, global optimality (or minimality) is guaranteed by local optimality.
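Proof. Let $x$ be a local optimum of a convex function $f$. Then we have $f(z) \geq f(x)$
for any $z$ in some neighborhood $U$ of $x$. For any $y$, $z = \lambda x + (1-\lambda) y$ belongs to $U$
for $\lambda < 1$ sufficiently close to 1, and it follows from (1.2) that
\[ \lambda f(x) + (1-\lambda) f(y) \geq f(\lambda x + (1-\lambda) y) = f(z) \geq f(x). \]
Since $1 - \lambda > 0$, this implies $f(y) \geq f(x)$.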
\[ \sum_{i=1}^{n} f_i(x(i)), \tag{1.5} \]
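\[ f^{\bullet}(p) = \sup\{ \langle p, x \rangle - f(x) \mid x \in \mathbb{R}^n \} \qquad (p \in \mathbb{R}^n), \tag{1.6} \]
\[ \langle p, x \rangle = \sum_{i=1}^{n} p(i)\, x(i) \tag{1.7} \]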
\[ f^{\bullet}(p) = \begin{cases} p \log p - p & (p > 0) \\ 0 & (p = 0) \\ +\infty & (p < 0) \end{cases} \]
by a simple calculation. See Fig. 1.3 for the geometric meaning in the case of $n = 1$.
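The piecewise expression above can be compared with a direct numerical evaluation of the supremum. The sketch below assumes that the function being conjugated is $f(x) = \exp(x)$ with $n = 1$; this is an assumption on our part (it is the standard function having this conjugate). The helper names and the grid on $[-20, 5]$ are likewise hypothetical. Because the supremum is replaced by a maximum over a finite grid, the computed value can only underestimate the true one, and only slightly when the maximizer $x = \log p$ lies inside the grid.

```python
import math

def conjugate_on_grid(f, p, xs):
    """Approximate sup{p*x - f(x)} by a maximum over the finite grid xs."""
    return max(p * x - f(x) for x in xs)

def exp_conjugate(p):
    """Closed form: p log p - p (p > 0), 0 (p = 0), +inf (p < 0)."""
    if p > 0:
        return p * math.log(p) - p
    if p == 0:
        return 0.0
    return math.inf

xs = [i / 1000 for i in range(-20000, 5001)]  # grid on [-20, 5], spacing 0.001

# Pair (grid approximation, closed form) for a few positive slopes p.
results = {p: (conjugate_on_grid(math.exp, p, xs), exp_conjugate(p))
           for p in (0.5, 1.0, 2.0, 10.0)}
for p, (approx, exact) in results.items():
    print(p, approx, exact)
```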
The Legendre–Fenchel transformation gives a one-to-one correspondence in
the class of well-behaved convex functions, called closed proper convex functions,
where the precise meaning of this technical terminology (not important here) will
be explained later in §3.1. Notation $f^{\bullet\bullet}$ means $(f^{\bullet})^{\bullet}$, the conjugate of the conjugate
function of $f$.
Theorem 1.2 (Conjugacy). The Legendre–Fenchel transformation $f \mapsto f^{\bullet}$ gives
a symmetric one-to-one correspondence in the class of all closed proper convex functions.
That is, for a closed proper convex function $f$, $f^{\bullet}$ is a closed proper convex
function and $f^{\bullet\bullet} = f$.
5) A
[Figure 1.3. Conjugate function (Legendre–Fenchel transform); plotted: $Y = f(x)$ and $Y = \langle p, x \rangle - f^{\bullet}(p)$.]
[Figure 1.4. Separation for convex and concave functions; plotted: $Y = f(x)$, $Y = \alpha^{*} + \langle p^{*}, x \rangle$, and $Y = h(x)$.]
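\[ h^{\circ}(p) = \inf\{ \langle p, x \rangle - h(x) \mid x \in \mathbb{R}^n \} \qquad (p \in \mathbb{R}^n). \tag{1.8} \]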
The duality principle in convex analysis can be expressed in a number of different forms. One of the most appealing statements is in the form of the separation
theorem, which asserts the existence of a separating affine function $Y = \alpha^{*} + \langle p^{*}, x \rangle$
for a pair of convex and concave functions (see Fig. 1.4).
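Theorem 1.3 (Separation theorem). Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ and
$h : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty\}$ be convex and concave functions, respectively (satisfying certain
regularity conditions 6)). If
\[ f(x) \geq h(x) \qquad (x \in \mathbb{R}^n), \]
there exist $\alpha^{*} \in \mathbb{R}$ and $p^{*} \in \mathbb{R}^n$ such that
\[ f(x) \geq \alpha^{*} + \langle p^{*}, x \rangle \geq h(x) \qquad (x \in \mathbb{R}^n). \]
6) It is admitted that the statement above is mathematically incomplete, referring to certain regularity conditions, which will be specified later in §3.1.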
Another expression of the duality principle is in the form of the Fenchel duality.
This is a min-max relation between a pair of convex and concave functions and their
conjugate functions. The regularity conditions assumed in the statement below will
be specified later.
Theorem 1.4 (Fenchel duality). Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ and
$h : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty\}$ be convex and concave functions, respectively (satisfying certain regularity
conditions). Then
\[ \min\{ f(x) - h(x) \mid x \in \mathbb{R}^n \} = \max\{ h^{\circ}(p) - f^{\bullet}(p) \mid p \in \mathbb{R}^n \}. \]
Such a min-max theorem is computationally useful in that it affords a certificate of optimality. Suppose that we want to minimize $f(x) - h(x)$ and have
$x = \hat{x}$ as a candidate for the minimizer. How can we verify or prove that $\hat{x}$ is
indeed an optimal solution? One possible way is to demonstrate a vector $\hat{p}$ such
that $f(\hat{x}) - h(\hat{x}) = h^{\circ}(\hat{p}) - f^{\bullet}(\hat{p})$. This implies the optimality of $\hat{x}$ by virtue of
the min-max theorem. The vector $\hat{p}$, often called a dual optimal solution, serves as
a certificate for the optimality of $\hat{x}$. It is emphasized that the min-max theorem
guarantees the existence of such a certificate $\hat{p}$ for any optimal solution $\hat{x}$. It is also
mentioned that the min-max theorem does not tell us how to find optimal solutions
$\hat{x}$ and $\hat{p}$.
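The certificate idea can be illustrated on a small instance. The sketch below is an example of our own choosing, not one taken from the text: $n = 1$, $f(x) = x^2$ and $h(x) = -|x - 1|$. The closed forms $f^{\bullet}(p) = p^2/4$ and $h^{\circ}(p) = p$ for $|p| \leq 1$ ($-\infty$ otherwise) were derived by hand for this instance, assuming the concave conjugate is $h^{\circ}(p) = \inf_x \{ px - h(x) \}$. The candidate pair $\hat{x} = 1/2$, $\hat{p} = 1$ gives equal primal and dual values, which certifies optimality of $\hat{x}$.

```python
import math

def f(x):        # convex function of the example
    return x * x

def h(x):        # concave function of the example
    return -abs(x - 1.0)

def f_conj(p):   # convex conjugate: sup_x {p*x - x^2} = p^2 / 4
    return p * p / 4.0

def h_conj(p):   # concave conjugate: inf_x {p*x + |x - 1|}
    return p if abs(p) <= 1.0 else -math.inf

x_hat, p_hat = 0.5, 1.0
primal_value = f(x_hat) - h(x_hat)             # f(x) - h(x) at the candidate
dual_value = h_conj(p_hat) - f_conj(p_hat)     # dual objective at the certificate

# Weak duality on a sample grid: every primal value bounds every dual value.
grid = [i / 20 for i in range(-100, 101)]
weak_duality_ok = all(f(x) - h(x) >= h_conj(p) - f_conj(p) - 1e-12
                      for x in grid for p in grid)
print(primal_value, dual_value, weak_duality_ok)
```

Note that the code only verifies a given pair; as the text says, the min-max theorem itself does not tell us how to find $\hat{x}$ and $\hat{p}$.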
It is one of the recurrent themes in discrete convexity how the conjugacy
and the duality above should be adapted in discrete settings. To be specific, let
us consider integer-valued functions on integer lattice points, and discuss possible
notions of conjugacy and duality for $f : \mathbb{Z}^n \to \mathbb{Z} \cup \{+\infty\}$ and $h : \mathbb{Z}^n \to \mathbb{Z} \cup \{-\infty\}$.
Some ingredients of discreteness (integrality) are naturally expected in the
formulation of conjugacy and duality. This amounts to discussing another kind
of discreteness, discreteness in value so to speak, in contrast to discreteness in
direction mentioned above.
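Discrete versions of the Legendre–Fenchel transformations can be defined by
\[ f^{\bullet}(p) = \sup\{ \langle p, x \rangle - f(x) \mid x \in \mathbb{Z}^n \} \qquad (p \in \mathbb{Z}^n), \tag{1.9} \]
\[ h^{\circ}(p) = \inf\{ \langle p, x \rangle - h(x) \mid x \in \mathbb{Z}^n \} \qquad (p \in \mathbb{Z}^n). \tag{1.10} \]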
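The discrete transformation (1.9) is directly computable when the effective domain is finite, since the supremum is then a maximum. The following minimal sketch takes $n = 1$ and $f(x) = x^2$ on $\{-3, \ldots, 3\}$ (with $f = +\infty$ elsewhere); this instance, the helper name `discrete_conjugate`, and the window $-10 \leq p \leq 10$ are assumptions made for illustration. It also applies the transformation a second time, with $p$ restricted to the same window, and compares the result with $f$ on the domain.

```python
def discrete_conjugate(f, domain, ps):
    """Convex discrete conjugate as in (1.9), for a finite effective domain:
    f_conj(p) = max{p*x - f(x) : x in domain} for each integer p in ps."""
    return {p: max(p * x - f(x) for x in domain) for p in ps}

domain = range(-3, 4)      # effective domain {-3, ..., 3}
ps = range(-10, 11)        # finite window of integer slopes p

def f(x):
    return x * x

f_conj = discrete_conjugate(f, domain, ps)

# Conjugate of the conjugate, with p restricted to the window ps.
f_biconj = {x: max(p * x - f_conj[p] for p in ps) for x in domain}

print([f_conj[p] for p in range(0, 5)])
print(f_biconj)
```

All values are integers, and for this instance the second transform reproduces $f$ on the domain.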
$(x \in \mathbb{Z}^n)$,
$(x \in \mathbb{Z}^n)$.
$(x \in \mathbb{Z}^n)$. (1.11)
$(x \in \mathbb{Z})$. (1.12)
As is easily verified, the discrete separation theorem as well as the discrete Fenchel
duality holds with this definition in the case of n = 1.
[Figure 1.5. Discrete separation; plotted: $Y = f(x)$, $Y = \alpha^{*} + \langle p^{*}, x \rangle$, and $Y = h(x)$.]
When it comes to higher dimensions, the situation is not that simple. The
following examples demonstrate that the discrete separation fails with this naive
definition of convexity.
Example 1.5. [failure of discrete separation] Consider two discrete functions
defined by
\[ f(x) = \max(0, x(1) + x(2)), \]
there exists no integral vector $p^{*} \in \mathbb{Z}^2$ such that $f(x) \geq \langle p^{*}, x \rangle \geq h(x)$ for all
$x \in \mathbb{Z}^2$. This demonstrates the failure of the desired discreteness in the separating
affine function.
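The non-existence of an integral separating vector can be checked by exhaustive search. The sketch below needs a concrete $h$, whose definition is not legible in the text above; it ASSUMES $h(x) = \min(x(1), x(2))$, which is our assumption and should be replaced by the intended function if it differs. Under that assumption, the inequalities at $x = (1, 0)$ and $x = (0, 1)$ already force $0 \leq p(1), p(2) \leq 1$, so the search box for $p$ used here loses no candidates.

```python
from itertools import product

def f(x):
    return max(0, x[0] + x[1])

def h(x):
    # ASSUMPTION: h is not recoverable from the text; min(x(1), x(2)) is our choice.
    return min(x[0], x[1])

def separates(p, points):
    """True if f(x) >= <p, x> >= h(x) holds at every sample point x."""
    return all(f(x) >= p[0] * x[0] + p[1] * x[1] >= h(x) for x in points)

points = list(product(range(-3, 4), repeat=2))   # lattice points in [-3, 3]^2

f_dominates_h = all(f(x) >= h(x) for x in points)
integral_separators = [p for p in product(range(-5, 6), repeat=2)
                       if separates(p, points)]
half_separates = separates((0.5, 0.5), points)
print(f_dominates_h, integral_separators, half_separates)
```

Failure on a finite set of lattice points is enough to rule out an integral $p^{*}$ on all of $\mathbb{Z}^2$, whereas the check for the non-integral vector $(1/2, 1/2)$ is only a check on the sampled points.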
Example 1.6. [failure of real-valued separation] This example shows that even the
existence of a separating affine function can be denied. For the discrete functions
\[ f(x) = |x(1) + x(2) - 1|, \]
for $x = (x(1), x(2)) \in \mathbb{R}^2$, since $f(1/2, 1/2) < h(1/2, 1/2)$. This example also shows
that $f \geq h$ on $\mathbb{R}^n$ does not follow from $f \geq h$ on $\mathbb{Z}^n$.
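The obstruction in this example can be made explicit with a few evaluations. As before, the definition of $h$ is not legible in the text; the sketch ASSUMES $h(x) = 1 - |x(1) - x(2)|$, which is our assumption. Under it, $f \geq h$ holds on the sampled lattice points, yet $f = h$ at the four points $(0,0)$, $(1,1)$, $(1,0)$, $(0,1)$. Any affine function sandwiched between $h$ and $f$ would be pinned to those common values, and every affine function $a$ satisfies $a(0,0) + a(1,1) = a(1,0) + a(0,1)$, which the pinned values violate.

```python
def f(x1, x2):
    return abs(x1 + x2 - 1)

def h(x1, x2):
    # ASSUMPTION: h is not recoverable from the text; this form is our choice.
    return 1 - abs(x1 - x2)

box = range(-4, 5)
dominates_on_lattice = all(f(a, b) >= h(a, b) for a in box for b in box)

# At the non-integral point (1/2, 1/2) the inequality is reversed.
gap_at_half = (f(0.5, 0.5), h(0.5, 0.5))

# Points where f == h pin the value of any separating affine function.
corners = [(0, 0), (1, 1), (1, 0), (0, 1)]
pinned = {pt: f(*pt) for pt in corners if f(*pt) == h(*pt)}

# Affine functions satisfy a(0,0) + a(1,1) == a(1,0) + a(0,1).
affine_identity_holds = (pinned[(0, 0)] + pinned[(1, 1)]
                         == pinned[(1, 0)] + pinned[(0, 1)])
print(dominates_on_lattice, gap_at_half, pinned, affine_identity_holds)
```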
Similarly, the discrete Fenchel duality fails under the naive definition of convexity. The above two examples serve to demonstrate this.
Thus the naive approach to discrete convexity does not work, and some deep
combinatorial or discrete-mathematical considerations are needed. We are now
motivated to look at some results in the area of matroids and submodular functions,
which hopefully provide a clue for fruitful definitions of discrete convexity.