Func 20160919

Uploaded by

yueqi.yx
Copyright
© © All Rights Reserved
2.2 Convex Function

Yipeng Liu

School of Electronic Engineering/Center for Robotics/Center for Information in Medicine


University of Electronic Science and Technology of China (UESTC)

[email protected]

October 10, 2016

1 / 35
Overview

1. definition

2. basic properties

3. epigraph and sublevel set

4. Jensen’s inequality

5. operations that preserve convexity

6. conjugate function

7. log-concave and log-convex functions

8. convexity with respect to generalized inequalities

2 / 35
Definition

f : R^N → R is convex if dom f is a convex set and

f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

for all x, y ∈ dom f, 0 ≤ θ ≤ 1

• f is concave if −f is convex

• f is strictly convex if dom f is convex and

f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y)

for all x, y ∈ dom f, x ≠ y, 0 < θ < 1
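The defining inequality can be spot-checked numerically. The following is a minimal sketch, not a proof: the helper name `is_convex_on_samples`, the sampling interval, and the tolerance are illustrative assumptions. Finding a violation disproves convexity on the interval; finding none only fails to disprove it.

```python
import math
import random

def is_convex_on_samples(f, lo, hi, trials=10_000, tol=1e-12, seed=0):
    """Return False as soon as one sampled (x, y, theta) violates the inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        theta = rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return False
    return True

exp_ok = is_convex_on_samples(math.exp, -2.0, 2.0)  # exp is convex on R
log_ok = is_convex_on_samples(math.log, 0.1, 5.0)   # log is concave, so a violation is expected
print(exp_ok, log_ok)
```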

3 / 35
Examples on R

convex:

• affine: ax + b on R, for any a, b ∈ R

• exponential: e^{ax}, for any a ∈ R

• powers: x^α on R++, for all α ≥ 1 or α ≤ 0

• powers of absolute value: |x|^p on R, for p ≥ 1

• negative entropy: x log x on R++

concave:

• affine: ax + b on R, for any a, b ∈ R

• powers: x^α on R++, for all 0 ≤ α ≤ 1

• logarithm: log x on R++

4 / 35
Examples on R^N and R^{M×N}

affine functions are convex and concave; all norms are convex
examples on R^N:

• affine function: f(x) = a^T x + b

• norms: $\|x\|_p = \left( \sum_{n=1}^{N} |x_n|^p \right)^{1/p}$, for p ≥ 1; $\|x\|_\infty = \max_n |x_n|$

examples on R^{M×N}:

• affine function

$f(X) = \operatorname{tr}(A^T X) + b = \sum_{m=1}^{M} \sum_{n=1}^{N} A_{mn} X_{mn} + b$

• spectral (maximum singular value) norm

$f(X) = \|X\|_2 = \sigma_{\max}(X) = \left( \lambda_{\max}(X^T X) \right)^{1/2}$

5 / 35
Restriction of a convex function to a line

f : R^N → R is convex if and only if the function g : R → R,

g(t) = f(x + tv), dom g = {t | x + tv ∈ dom f}

is convex (in t) for any x ∈ dom f, v ∈ R^N

check convexity of f by checking convexity of functions of only one variable
example: f : S^N → R with f(X) = log det X, dom f = S^N_{++}

$g(t) = \log\det(X + tV) = \log\det X + \log\det\left( I + t X^{-1/2} V X^{-1/2} \right) = \log\det X + \sum_{n=1}^{N} \log(1 + t\lambda_n)$

where λ_n are the eigenvalues of X^{-1/2} V X^{-1/2}

g is concave in t (for any choice of X ≻ 0, V); hence f is concave
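The restriction-to-a-line idea can be tried on a small case. This sketch uses 2×2 matrices so that the determinant can be written out by hand; the matrices `X` and `V` and the range of t are arbitrary choices made for illustration, picked so that X + tV stays positive definite.

```python
import math

def g(t, X, V):
    """log det(X + tV) for 2x2 matrices given as nested lists."""
    a = X[0][0] + t * V[0][0]
    b = X[0][1] + t * V[0][1]
    c = X[1][0] + t * V[1][0]
    d = X[1][1] + t * V[1][1]
    return math.log(a * d - b * c)

X = [[2.0, 0.5], [0.5, 1.0]]      # symmetric positive definite
V = [[1.0, -0.3], [-0.3, -0.5]]   # symmetric direction

ts = [-0.5 + 0.05 * k for k in range(21)]  # X + tV stays positive definite on [-0.5, 0.5]

# Midpoint concavity: g((s + t)/2) >= (g(s) + g(t))/2 for every pair on the grid.
midpoint_concave = all(
    g((s + t) / 2, X, V) >= (g(s, X, V) + g(t, X, V)) / 2 - 1e-12
    for s in ts for t in ts
)
print(midpoint_concave)
```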

6 / 35
Extended-value extension

extended-value extension f̃ of f is

f̃(x) = f(x) for x ∈ dom f, f̃(x) = ∞ for x ∉ dom f

often simplifies notation; for example, the condition

0 ≤ θ ≤ 1 ⇒ f̃(θx + (1 − θ)y) ≤ θf̃(x) + (1 − θ)f̃(y)

(as an inequality in R ∪ {∞}) means the same as the two conditions

• dom f is convex

• for x, y ∈ dom f,

0 ≤ θ ≤ 1 ⇒ f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

7 / 35
First-order condition
f is differentiable if dom f is open and the gradient

$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \cdots, \frac{\partial f(x)}{\partial x_N} \right)$

exists at each x ∈ dom f

1st-order condition: differentiable f with convex domain is convex iff

f(y) ≥ f(x) + ∇f(x)^T (y − x), for all x, y ∈ dom f

first-order approximation of f is global underestimator
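As a quick illustration of the global-underestimator property, the sketch below checks the first-order inequality for the negative entropy x log x on a grid of positive points. The grid and the names `f`, `df`, and `gap` are assumptions made for this example.

```python
import math

def f(x):
    return x * math.log(x)      # negative entropy, convex on R++

def df(x):
    return math.log(x) + 1.0    # derivative of x log x

grid = [0.1 * k for k in range(1, 51)]  # points in (0, 5]

# Smallest value of f(y) - [f(x) + f'(x)(y - x)]; it should never be negative.
gap = min(f(y) - (f(x) + df(x) * (y - x)) for x in grid for y in grid)
print(gap)
```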

8 / 35
Second-order conditions

f is twice differentiable if dom f is open and the Hessian ∇²f(x) ∈ S^N,

$\nabla^2 f(x)_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}$, i, j = 1, 2, ···, N

exists at each x ∈ dom f

2nd-order conditions: for twice differentiable f with convex domain

• f is convex if and only if

∇²f(x) ⪰ 0, for all x ∈ dom f

• if ∇²f(x) ≻ 0, for all x ∈ dom f, then f is strictly convex

9 / 35
Examples
quadratic function: f(x) = (1/2) x^T P x + q^T x + r (with P ∈ S^N)

∇f(x) = Px + q, ∇²f(x) = P

convex if P ⪰ 0
least-squares objective: f(x) = ‖Ax − b‖₂²

∇f(x) = 2A^T (Ax − b), ∇²f(x) = 2A^T A

convex (for any A)

quadratic-over-linear: f(x, y) = x²/y

$\nabla^2 f(x, y) = \frac{2}{y^3} \begin{bmatrix} y \\ -x \end{bmatrix} \begin{bmatrix} y \\ -x \end{bmatrix}^T \succeq 0$

convex for y > 0
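The least-squares gradient 2A^T(Ax − b) can be checked against central finite differences. This is a small sketch in plain Python; the data `A`, `b`, `x` and the helper names are made up for illustration.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def f(A, b, x):
    """Least-squares objective ||Ax - b||_2^2."""
    return sum((r - bi) ** 2 for r, bi in zip(matvec(A, x), b))

def grad(A, b, x):
    """Analytic gradient 2 A^T (A x - b)."""
    res = [r - bi for r, bi in zip(matvec(A, x), b)]
    return [2 * sum(A[m][n] * res[m] for m in range(len(A))) for n in range(len(x))]

def numeric_grad(A, b, x, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    out = []
    for n in range(len(x)):
        xp, xm = x[:], x[:]
        xp[n] += h
        xm[n] -= h
        out.append((f(A, b, xp) - f(A, b, xm)) / (2 * h))
    return out

A = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.0]]
b = [1.0, 0.0, 2.0]
x = [0.3, -0.7]
max_err = max(abs(p - q) for p, q in zip(grad(A, b, x), numeric_grad(A, b, x)))
print(max_err)
```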

10 / 35
Examples

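log-sum-exp (soft max): $f(x) = \log \sum_{n=1}^{N} \exp x_n$ is convex

$\nabla^2 f(x) = \frac{1}{\mathbf{1}^T z} \operatorname{diag}(z) - \frac{1}{(\mathbf{1}^T z)^2} z z^T$, (z_n = exp x_n)

to show ∇²f(x) ⪰ 0, we must verify that v^T ∇²f(x) v ≥ 0 for all v:

$v^T \nabla^2 f(x) v = \frac{\left( \sum_n z_n v_n^2 \right) \left( \sum_n z_n \right) - \left( \sum_n v_n z_n \right)^2}{\left( \sum_n z_n \right)^2} \geq 0$

since $\left( \sum_n v_n z_n \right)^2 \leq \left( \sum_n z_n v_n^2 \right) \left( \sum_n z_n \right)$ (Cauchy-Schwarz inequality)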
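The quadratic form of the log-sum-exp Hessian can be evaluated directly from z_n = exp x_n without building the matrix. The sketch below samples random x and v and records the smallest value seen; the function name, dimension, and sampling ranges are illustrative assumptions.

```python
import math
import random

def lse_quadratic_form(x, v):
    """v^T H v for the Hessian H of log-sum-exp at x, using z_n = exp(x_n)."""
    z = [math.exp(xn) for xn in x]
    s = sum(z)                                      # sum_n z_n
    a = sum(zn * vn * vn for zn, vn in zip(z, v))   # sum_n z_n v_n^2
    c = sum(zn * vn for zn, vn in zip(z, v))        # sum_n z_n v_n
    return (a * s - c * c) / (s * s)

rng = random.Random(0)
worst = min(
    lse_quadratic_form([rng.uniform(-3, 3) for _ in range(5)],
                       [rng.uniform(-1, 1) for _ in range(5)])
    for _ in range(1000)
)
print(worst)  # expected to be nonnegative up to rounding
```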

geometric mean: $f(x) = \left( \prod_{n=1}^{N} x_n \right)^{1/N}$ on R^N_{++} is concave

(similar proof as for log-sum-exp)

11 / 35
Epigraph and sublevel set

α-sublevel set of f : R^N → R

C_α = {x ∈ dom f | f(x) ≤ α}

sublevel sets of convex functions are convex (converse is false)

epigraph of f : R^N → R

epi f = {(x, t) ∈ R^{N+1} | x ∈ dom f, f(x) ≤ t}

a function f (in black) is convex if and only if the region above its graph (in green, epi f) is a convex set

two kinds of relations between convex set and convex function

12 / 35
Some convex functions constructed from convex sets

Let K ⊆ R^N be a convex set.

1. The characteristic function (equivalent to the indicator function) of K is:

$\chi_K(x) = \begin{cases} 0, & \text{if } x \in K \\ +\infty, & \text{otherwise} \end{cases}$

2. Suppose that 0 ∈ K. The Minkowski function of K is:

µ_K(x) = inf {t > 0 : x ∈ tK}

note: epi µ_K is the light cone of K, which is a convex cone.

for t > 0, x, y ∈ R^N
• µ_K is positively homogeneous: µ_K(tx) = tµ_K(x)
• µ_K is subadditive: µ_K(x + y) ≤ µ_K(x) + µ_K(y)

µ_K(x) < 1 for x ∈ int K

13 / 35
Jensen’s inequality

basic inequality: if f is convex, then for 0 ≤ θ ≤ 1

f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y)

extension: if f is convex, then

f(E z) ≤ E f(z)

for any random variable z.

basic inequality is special case with discrete distribution

prob(z = x) = θ, prob(z = y) = 1 − θ
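For a discrete random variable the extension reduces to a finite weighted sum, which makes it easy to check. A minimal sketch with the convex function −log z; the support points, the probabilities, and the helper `expectation` are arbitrary illustrative choices.

```python
import math

values = [0.5, 1.0, 2.0, 4.0]   # support of z
probs = [0.1, 0.4, 0.3, 0.2]    # probabilities, summing to 1

def expectation(fn):
    return sum(p * fn(z) for p, z in zip(probs, values))

def f(z):
    return -math.log(z)         # convex on R++

lhs = f(expectation(lambda z: z))   # f(E z)
rhs = expectation(f)                # E f(z)
print(lhs, rhs)
```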

14 / 35
Operations that preserve convexity

practical methods for establishing convexity of a function

1. verify definition (often simplified by restricting to a line)


2. for twice differentiable functions, show ∇²f(x) ⪰ 0
3. show that f is obtained from simple convex functions by operations that
preserve convexity
• nonnegative weighted sum
• composition with affine function
• pointwise maximum and supremum
• composition
• minimization
• perspective

15 / 35
Positive weighted sum & composition with affine function

nonnegative multiple: αf is convex if f is convex, α ≥ 0


sum: f1 + f2 convex if f1 , f2 convex (extends to infinite sums, integrals).
composition with affine function: f (Ax + b) is convex if f is convex

examples

• log barrier for linear inequalities

$f(x) = -\sum_{m=1}^{M} \log\left( b_m - a_m^T x \right)$, $\operatorname{dom} f = \{ x \mid a_m^T x < b_m,\ m = 1, \cdots, M \}$

• (any) norm of affine function: f(x) = ‖Ax + b‖

16 / 35
Pointwise maximum

if f1 , · · · , fM are convex, then f (x) = max{f1 (x), · · · , fM (x)} is convex

proof: a function is convex iff its epigraph is convex, and the epigraph of a pointwise maximum is the intersection of the epigraphs; hence the pointwise maximum of convex functions is convex

examples

• piecewise-linear function f(x) = max_{m=1,···,M} (a_m^T x + b_m) is convex

• sum of K largest components of x ∈ R^N:

f(x) = x_[1] + x_[2] + ··· + x_[K]

is convex (x_[k] is the kth largest component of x)

f(x) = max {x_{n_1} + x_{n_2} + ··· + x_{n_K} | 1 ≤ n_1 < n_2 < ··· < n_K ≤ N}

the max of all the functions which select K entries from x and sum them.
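The two descriptions of the sum of the K largest components (sort and add, versus a maximum over all index subsets of size K) can be compared directly on a small vector. The function names and the test vector below are illustrative.

```python
from itertools import combinations

def sum_k_largest(x, k):
    """Sort in decreasing order and add the first k entries."""
    return sum(sorted(x, reverse=True)[:k])

def sum_k_largest_as_max(x, k):
    """Pointwise maximum over all linear functions that pick k entries of x."""
    return max(sum(x[i] for i in idx) for idx in combinations(range(len(x)), k))

x = [3.0, -1.0, 4.0, 1.5, 2.0]
agree = all(sum_k_largest(x, k) == sum_k_largest_as_max(x, k) for k in range(1, 6))
print(agree, sum_k_largest(x, 2))
```

The second form has a number of terms that grows combinatorially in N, so it is useful as a convexity argument rather than as a way to compute f.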
17 / 35
Pointwise supremum
if f(x, y) is convex in x for each y ∈ C, then

$g(x) = \sup_{y \in C} f(x, y)$

is convex
note that f(x, y) does not need to be convex in y (or jointly convex in (x, y))
examples
• support function of a set C:

$S_C(x) = \sup_{y \in C} x^T y$

• distance to farthest point in a set C:

$f(x) = \sup_{y \in C} \|x - y\|$

• maximum eigenvalue of symmetric matrix: for X ∈ S^N

$\lambda_{\max}(X) = \sup_{\|y\|_2 = 1} y^T X y$

18 / 35
Composition with scalar functions

composition of g : R^N → R and h : R → R:

f(x) = h(g(x))

f is convex if either
• g convex, h convex, h̃ nondecreasing, or
• g concave, h convex, h̃ nonincreasing

• proof (for N = 1, differentiable g, h)

f''(x) = h''(g(x)) g'(x)² + h'(g(x)) g''(x)

• note: monotonicity must hold for extended-value extension h̃

examples

• exp g(x) is convex if g is convex

• 1/g(x) is convex if g is concave and positive

19 / 35
Vector composition
composition of g : R^N → R^K and h : R^K → R:

f(x) = h(g(x)) = h(g_1(x), g_2(x), ···, g_K(x))

f is convex if either
• g_k convex, h convex, h̃ nondecreasing in each argument, or
• g_k concave, h convex, h̃ nonincreasing in each argument

proof (for N = 1, differentiable g, h)

f''(x) = g'(x)^T ∇²h(g(x)) g'(x) + ∇h(g(x))^T g''(x)

examples
• $\sum_{m=1}^{M} \log g_m(x)$ is concave if g_m are concave and positive
• $\log \sum_{m=1}^{M} \exp g_m(x)$ is convex if g_m are convex

20 / 35
Infimum

if f(x, y) is convex in (x, y) and C is a convex set, then

$g(x) = \inf_{y \in C} f(x, y)$

is convex
example

• f(x, y) = x^T A x + 2x^T B y + y^T C y with

$\begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0, \quad C \succ 0$

minimizing over y gives g(x) = inf_y f(x, y) = x^T (A − B C^{-1} B^T) x

g is convex, hence Schur complement A − B C^{-1} B^T ⪰ 0
• distance to a set: dist(x, S) = inf_{y∈S} ‖x − y‖ is convex if S is convex
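In the scalar case the quadratic example reads f(x, y) = a x² + 2b x y + c y², and the partial minimum is (a − b²/c) x². The sketch below compares that closed form with a brute-force minimum over a grid of y values; the coefficients, grid size, and function names are illustrative assumptions.

```python
a, b, c = 2.0, 1.0, 4.0   # [[a, b], [b, c]] is PSD since a*c - b*b >= 0, and c > 0

def f(x, y):
    return a * x * x + 2 * b * x * y + c * y * y

def g_closed(x):
    """Schur-complement form (a - b^2 / c) x^2."""
    return (a - b * b / c) * x * x

def g_grid(x, n=20001, span=10.0):
    """Brute-force minimum of f(x, y) over a uniform grid of y in [-span, span]."""
    ys = (-span + 2 * span * k / (n - 1) for k in range(n))
    return min(f(x, y) for y in ys)

x0 = 1.5
print(g_closed(x0), g_grid(x0))
```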

21 / 35
Perspective
the perspective of a function f : R^N → R is the function g : R^N × R → R,

g(x, t) = t f(x/t), dom g = {(x, t) | x/t ∈ dom f, t > 0}

g is convex if f is convex
examples

• f(x) = x^T x is convex; hence g(x, t) = x^T x / t is convex for t > 0

• negative logarithm f(x) = − log x is convex; hence relative entropy g(x, t) = t log t − t log x is convex on R²_{++}.
• if f is convex, then

$g(x) = (c^T x + d)\, f\left( \frac{Ax + b}{c^T x + d} \right)$

is convex on

$\left\{ x \;\middle|\; c^T x + d > 0,\ \frac{Ax + b}{c^T x + d} \in \operatorname{dom} f \right\}$

22 / 35
Conjugate function
the conjugate of a function f is

$f^*(y) = \sup_{x \in \operatorname{dom} f} \left( y^T x - f(x) \right)$

for fixed y, y^T x is a linear function of x: a line through the origin with slope y

• f*(y) is the maximum gap between the linear function y^T x and f(x)
• f* is convex (even if f is not), since it is the pointwise supremum of convex (affine) functions of y
• for differentiable f, conjugation is called the Legendre transform

23 / 35
Conjugate function
examples

• negative logarithm f(x) = − log x

$f^*(y) = \sup_{x > 0} (yx + \log x) = \begin{cases} -1 - \log(-y), & y < 0 \\ \infty, & \text{otherwise} \end{cases}$

• strictly convex quadratic f(x) = (1/2) x^T Q x with Q ∈ S^N_{++}

$f^*(y) = \sup_x \left( y^T x - \tfrac{1}{2} x^T Q x \right) = \tfrac{1}{2} y^T Q^{-1} y$

• indicator function f(x) = 1_C(x)

$f^*(y) = 1_C^*(y) = \sup_{x \in C} y^T x$

called the support function of C

• norm f(x) = ‖x‖

$f^*(y) = 1_{\{y : \|y\|_* \leq 1\}}(y)$
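The conjugate of the negative logarithm can be approximated by taking the supremum over a fine grid of x > 0 and comparing with the closed form −1 − log(−y). This is only a numerical sketch; the grid range, the resolution, and the function names are assumptions, and the grid must contain the maximizer x = −1/y for the comparison to be meaningful.

```python
import math

def conj_neg_log_numeric(y, n=200_000, x_max=20.0):
    """Approximate sup over x > 0 of y*x + log(x) on a uniform grid in (0, x_max]."""
    return max(y * x + math.log(x) for x in (x_max * k / n for k in range(1, n + 1)))

def conj_neg_log_closed(y):
    """Closed form of the conjugate of -log x."""
    return -1.0 - math.log(-y) if y < 0 else math.inf

for y in (-2.0, -0.5):
    print(y, conj_neg_log_numeric(y), conj_neg_log_closed(y))
```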

24 / 35
Conjugate function

Properties

• Fenchel’s inequality: for any x, y,

f(x) + f*(y) ≥ x^T y

• Hence the conjugate of the conjugate, f**, satisfies f** ≤ f

• If f is closed and convex, then f** = f

• If f is closed and convex, then for any x, y,

x ∈ ∂f*(y) ⇔ y ∈ ∂f(x) ⇔ f(x) + f*(y) = x^T y

• If f(u, v) = f_1(u) + f_2(v) (here u ∈ R^N, v ∈ R^M), then

f*(w, z) = f_1*(w) + f_2*(z)

25 / 35
Quasiconvex functions
Definition 1: f : R^N → R is quasiconvex if dom f is convex and the sublevel sets

S_α = {x ∈ dom f | f(x) ≤ α}

are convex for all α

• f is quasiconcave if −f is quasiconvex

• f is quasilinear if it is quasiconvex and quasiconcave

26 / 35
examples

• $\sqrt{|x|}$ is quasiconvex on R
• ceil(x) = inf {z ∈ Z | z ≥ x} is quasilinear

• log x is quasilinear on R++

• f(x_1, x_2) = x_1 x_2 is quasiconcave on R²_{++}

• linear-fractional function

$f(x) = \frac{a^T x + b}{c^T x + d}, \quad \operatorname{dom} f = \{ x \mid c^T x + d > 0 \}$

is quasilinear
• distance ratio

$f(x) = \frac{\|x - a\|_2}{\|x - b\|_2}, \quad \operatorname{dom} f = \{ x \mid \|x - a\|_2 \leq \|x - b\|_2 \}$

is quasiconvex

27 / 35
Quasiconvex functions

internal rate of return

• cash flow x = [x_0, ···, x_N]^T; x_n is payment in period n (to us if x_n > 0)

• we assume x_0 < 0 (investment) and x_0 + x_1 + ··· + x_N > 0

• present value of cash flow x, for interest rate r:

$\mathrm{PV}(x, r) = \sum_{n=0}^{N} (1 + r)^{-n} x_n$

• internal rate of return is the smallest interest rate for which PV(x, r) = 0:

IRR(x) = inf {r ≥ 0 | PV(x, r) = 0}

IRR is quasiconcave: superlevel set is intersection of halfspaces

$\mathrm{IRR}(x) \geq R \iff \sum_{n=0}^{N} (1 + r)^{-n} x_n > 0 \text{ for } 0 \leq r < R$
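The present value and internal rate of return can be computed for a concrete cash flow. The sketch below uses bisection, which is an assumption of this example rather than something the slides prescribe: it finds the smallest root only when PV changes sign once on the bracket, as it does for one investment followed by positive payments. The names `present_value`, `irr_bisect`, and the upper bound `r_hi` are hypothetical.

```python
def present_value(x, r):
    """PV(x, r) = sum over n of (1 + r)^(-n) x_n."""
    return sum(xn / (1.0 + r) ** n for n, xn in enumerate(x))

def irr_bisect(x, r_hi=10.0, tol=1e-10):
    """Bisection on [0, r_hi]; assumes PV(x, 0) > 0 > PV(x, r_hi) and a single sign change."""
    lo, hi = 0.0, r_hi
    if present_value(x, lo) <= 0 or present_value(x, hi) >= 0:
        raise ValueError("root not bracketed on [0, r_hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if present_value(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

cash_flow = [-100.0, 60.0, 60.0]   # invest 100 now, receive 60 in each of two periods
rate = irr_bisect(cash_flow)
print(rate, present_value(cash_flow, rate))
```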

28 / 35
Properties of quasiconvex functions

modified Jensen inequality (Definition 2): for quasiconvex f

0 ≤ θ ≤ 1 ⇒ f(θx + (1 − θ)y) ≤ max{f(x), f(y)}

first-order condition: differentiable f with convex domain is quasiconvex iff

f(y) ≤ f(x) ⇒ ∇f(x)^T (y − x) ≤ 0

sums of quasiconvex functions are not necessarily quasiconvex
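A concrete counterexample makes the last point tangible. Both √|x| and √|x − 2| are quasiconvex on R, but their sum violates the modified Jensen inequality at the midpoint of 0 and 2. The two functions are chosen here purely for illustration.

```python
import math

def f1(x):
    return math.sqrt(abs(x))         # quasiconvex on R

def f2(x):
    return math.sqrt(abs(x - 2.0))   # shifted copy, also quasiconvex

def h(x):
    return f1(x) + f2(x)

x, y, theta = 0.0, 2.0, 0.5
mid = theta * x + (1 - theta) * y
# Quasiconvexity would require h(mid) <= max(h(x), h(y)).
violated = h(mid) > max(h(x), h(y))
print(h(mid), max(h(x), h(y)), violated)
```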

29 / 35
Strictly local quasiconvex function

let x, z ∈ R^N, κ, ε > 0; f : R^N → R is (ε, κ, z)-strictly locally quasiconvex (SLQC) in x if at least one of the following applies:

• f(x) − f(z) ≤ ε

• ‖∇f(x)‖₂ > 0, and for every y ∈ B(z, ε/κ) it holds that ⟨∇f(x), y − x⟩ ≤ 0

L-Lipschitz + strictly quasiconvex = (ε, L, z)-SLQC

note: Lipschitz continuity: |f(x) − f(y)| ≤ L ‖x − y‖, ∀ x, y ∈ C

normalized gradient descent methods can solve SLQC optimization problems
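A one-dimensional sketch of normalized gradient descent follows. It is only an illustration under stated assumptions: the test function 1 − exp(−x²) (quasiconvex but not convex), the fixed step size `eta`, and the rule of returning the best iterate are choices made here, not details taken from the slides. In one dimension the normalized gradient is just the sign of the derivative.

```python
import math

def f(x):
    return 1.0 - math.exp(-x * x)      # quasiconvex on R, not convex

def df(x):
    return 2.0 * x * math.exp(-x * x)  # nearly zero far from the minimizer

def normalized_gd(x0, eta=0.01, steps=1000):
    """Step along -g/|g| with a fixed step size and return the best iterate seen."""
    x, best = x0, x0
    for _ in range(steps):
        g = df(x)
        if g == 0.0:                   # stationary point reached exactly
            break
        x -= eta * g / abs(g)
        if f(x) < f(best):
            best = x
    return best

x_best = normalized_gd(3.0)
print(x_best, f(x_best))
```

Because the step length does not depend on the gradient magnitude, the iterates still make progress at x = 3, where the raw derivative is already tiny.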

30 / 35
Log-concave and log-convex functions

a positive function f is log-concave if log f is concave:

f(θx + (1 − θ)y) ≥ f(x)^θ f(y)^{1−θ}, for 0 ≤ θ ≤ 1

f is log-convex if log f is convex

• powers: x^a on R++ is log-convex for a ≤ 0, log-concave for a ≥ 0

• many common probability densities are log-concave, e.g., normal:

$f(x) = \frac{1}{\sqrt{(2\pi)^N \det \Sigma}} \exp\left( -\frac{1}{2} (x - \bar{x})^T \Sigma^{-1} (x - \bar{x}) \right)$

• cumulative Gaussian distribution function φ is log-concave

$\phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left( -\frac{u^2}{2} \right) du$

31 / 35
Properties of log-concave functions

• twice differentiable f with convex domain is log-concave if and only if

f(x)∇²f(x) ⪯ ∇f(x)∇f(x)^T

for all x ∈ dom f


• product of log-concave functions is log-concave

• sum of log-concave functions is not always log-concave

• integration: if f : R^N × R^M → R is log-concave, then

$g(x) = \int f(x, y)\, dy$

is log-concave (not easy to show)

32 / 35
Properties of log-concave functions

consequences of integration property

• convolution f ∗ g of log-concave functions f, g is log-concave

$(f * g)(x) = \int f(x - y)\, g(y)\, dy$

• if C is convex and y is a random variable with log-concave pdf, then

f(x) = prob(x + y ∈ C)

is log-concave
proof: write f(x) as integral of product of log-concave functions

$f(x) = \int g(x + y)\, p(y)\, dy, \quad g(u) = \begin{cases} 1, & u \in C \\ 0, & u \notin C \end{cases}$

p is pdf of y

33 / 35
Properties of log-concave functions

example: yield function

h(x) = prob(x + w ∈ S)

• x ∈ R^N: nominal/target parameter values for product

• w ∈ R^N: random variations of parameters in manufactured product

• S: set of acceptable values

if S is convex and w has a log-concave pdf, then

• h is log-concave

• yield regions {x | h(x) ≥ α} are convex

34 / 35
Convexity with respect to generalized inequalities

f : R^N → R^M is K-convex if dom f is convex and

f(θx + (1 − θ)y) ⪯_K θf(x) + (1 − θ)f(y)

for x, y ∈ dom f, 0 ≤ θ ≤ 1
example: f : S^M → S^M, f(X) = X² is S^M_+-convex

proof: for fixed z ∈ R^M, z^T X² z = ‖Xz‖₂² is convex in X, i.e.,

z^T (θX + (1 − θ)Y)² z ≤ θ z^T X² z + (1 − θ) z^T Y² z

for X, Y ∈ S^M, 0 ≤ θ ≤ 1
therefore, (θX + (1 − θ)Y)² ⪯ θX² + (1 − θ)Y²
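The matrix inequality can be checked on 2×2 symmetric matrices, where positive semidefiniteness is equivalent to a nonnegative trace and a nonnegative determinant. The matrices `X` and `Y` and the helper names below are illustrative assumptions.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def combo(t, X, Y):
    """Convex combination t*X + (1 - t)*Y, entry by entry."""
    return [[t * X[i][j] + (1 - t) * Y[i][j] for j in range(2)] for i in range(2)]

def is_psd_2x2(S, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are nonnegative."""
    return S[0][0] + S[1][1] >= -tol and S[0][0] * S[1][1] - S[0][1] * S[1][0] >= -tol

X = [[1.0, 2.0], [2.0, -1.0]]
Y = [[0.0, 1.0], [1.0, 3.0]]

def gap(t):
    """t*X^2 + (1 - t)*Y^2 - (t*X + (1 - t)*Y)^2, which should be PSD."""
    Z = combo(t, X, Y)
    ZZ, XX, YY = matmul(Z, Z), matmul(X, X), matmul(Y, Y)
    return [[t * XX[i][j] + (1 - t) * YY[i][j] - ZZ[i][j] for j in range(2)] for i in range(2)]

all_psd = all(is_psd_2x2(gap(k / 10)) for k in range(11))
print(all_psd)
```

Expanding the squares shows that the gap equals θ(1 − θ)(X − Y)², which gives a second, algebraic reason for it to be positive semidefinite.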

35 / 35
