
L. Vandenberghe ECE236C (Spring 2020)

2. Subgradients

• definition

• subgradient calculus

• duality and optimality conditions

• directional derivative

2.1
Basic inequality

recall the basic inequality for differentiable convex functions:

f (y) ≥ f (x) + ∇ f (x)T (y − x) for all y ∈ dom f

(figure: graph of f with its first-order approximation at (x, f (x)); the vector (∇ f (x), −1) is an outward normal to epi f at that point)

• the first-order approximation of f at x is a global lower bound


• ∇ f (x) defines non-vertical supporting hyperplane to epigraph of f at (x, f (x)):

(∇ f (x), −1)T ((y, t) − (x, f (x))) ≤ 0 for all (y, t) ∈ epi f

Subgradients 2.2
Subgradient

g is a subgradient of a convex function f at x ∈ dom f if

f (y) ≥ f (x) + gT (y − x) for all y ∈ dom f

(figure: graph of f (y) with three affine lower bounds f (x1) + g1T (y − x1), f (x1) + g2T (y − x1), f (x2) + g3T (y − x2))
g1, g2 are subgradients at x1; g3 is a subgradient at x2

Subgradients 2.3
Subdifferential

the subdifferential ∂ f (x) of f at x is the set of all subgradients:

∂ f (x) = {g | gT (y − x) ≤ f (y) − f (x), ∀y ∈ dom f }

Properties

• ∂ f (x) is a closed convex set (possibly empty)


this follows from the definition: ∂ f (x) is an intersection of halfspaces

• if x ∈ int dom f then ∂ f (x) is nonempty and bounded


proof on next two pages

Subgradients 2.4
Proof: we show that ∂ f (x) is nonempty when x ∈ int dom f

• (x, f (x)) is in the boundary of the convex set epi f

• therefore there exists a supporting hyperplane to epi f at (x, f (x)):

∃(a, b) ≠ 0:   (a, b)T ((y, t) − (x, f (x))) ≤ 0   for all (y, t) ∈ epi f

• b > 0 gives a contradiction as t → ∞

• b = 0 gives a contradiction for y = x + εa with small ε > 0

• therefore b < 0 and g = (1/|b|) a is a subgradient of f at x

Subgradients 2.5
Proof: ∂ f (x) is bounded when x ∈ int dom f

• for small r > 0, define a set of 2n points

B = {x ± r ek | k = 1, . . . , n} ⊂ dom f

and define M = max_{y ∈ B} f (y) < ∞

• for every g ∈ ∂ f (x), there is a point y ∈ B with

r ‖g‖∞ = gT (y − x)

(choose an index k with |gk | = ‖g‖∞, and take y = x + r sign(gk ) ek)

• since g is a subgradient, this implies that

f (x) + r ‖g‖∞ = f (x) + gT (y − x) ≤ f (y) ≤ M

• we conclude that ∂ f (x) is bounded:

‖g‖∞ ≤ (M − f (x))/r for all g ∈ ∂ f (x)

Subgradients 2.6
Example

f (x) = max { f1(x), f2(x)} with f1, f2 convex and differentiable

(figure: graphs of f1(y), f2(y), and their pointwise maximum f (y))

• if f1( x̂) = f2( x̂), subdifferential at x̂ is line segment [∇ f1( x̂), ∇ f2( x̂)]
• if f1( x̂) > f2( x̂), subdifferential at x̂ is {∇ f1( x̂)}
• if f1( x̂) < f2( x̂), subdifferential at x̂ is {∇ f2( x̂)}

Subgradients 2.7
Examples

Absolute value f (x) = |x|

(figure: graph of f (x) = |x| and of its subdifferential ∂ f (x), which is {−1} for x < 0, the interval [−1, 1] at x = 0, and {1} for x > 0)

Euclidean norm f (x) = ‖x‖2

∂ f (x) = {x/‖x‖2 } if x ≠ 0,   ∂ f (x) = {g | ‖g‖2 ≤ 1} if x = 0

Subgradients 2.8
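these two examples translate directly into code; below is a minimal NumPy sketch (the helper name is mine, not from the slides) that returns one subgradient of the Euclidean norm, using the zero vector as a valid choice from the unit ball when x = 0

```python
import numpy as np

def subgrad_l2norm(x, tol=1e-12):
    """Return one subgradient of f(x) = ||x||_2 at x."""
    nrm = np.linalg.norm(x)
    if nrm > tol:
        return x / nrm          # unique subgradient (the gradient) when x != 0
    return np.zeros_like(x)     # 0 lies in the unit ball {g : ||g||_2 <= 1}

print(subgrad_l2norm(np.array([3.0, 4.0])))   # [0.6 0.8]
print(subgrad_l2norm(np.zeros(2)))            # [0. 0.]
```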
Monotonicity

the subdifferential of a convex function is a monotone operator:

(u − v)T (x − y) ≥ 0 for all x, y, u ∈ ∂ f (x), v ∈ ∂ f (y)

Proof: by definition

f (y) ≥ f (x) + uT (y − x), f (x) ≥ f (y) + v T (x − y)

combining the two inequalities shows monotonicity

Subgradients 2.9
Examples of non-subdifferentiable functions

the following functions are not subdifferentiable at x = 0

• f : R → R, dom f = R+

f (x) = 1 if x = 0, f (x) = 0 if x > 0

• f : R → R, dom f = R+
f (x) = −√x

the only supporting hyperplane to epi f at (0, f (0)) is vertical

Subgradients 2.10
Subgradients and sublevel sets

if g is a subgradient of f at x , then

f (y) ≤ f (x) =⇒ gT (y − x) ≤ 0


the nonzero subgradients at x define supporting hyperplanes to the sublevel set

{y | f (y) ≤ f (x)}

Subgradients 2.11
Outline

• definition

• subgradient calculus

• duality and optimality conditions

• directional derivative
Subgradient calculus

Weak subgradient calculus: rules for finding one subgradient

• sufficient for most nondifferentiable convex optimization algorithms


• if you can evaluate f (x), you can usually compute a subgradient

Strong subgradient calculus: rules for finding ∂ f (x) (all subgradients)

• some algorithms, optimality conditions, etc., need entire subdifferential


• can be quite complicated

we will assume that x ∈ int dom f

Subgradients 2.12
Basic rules

Differentiable functions: ∂ f (x) = {∇ f (x)} if f is differentiable at x

Nonnegative linear combination

if f (x) = α1 f1(x) + α2 f2(x) with α1, α2 ≥ 0, then

∂ f (x) = α1 ∂ f1(x) + α2 ∂ f2(x)

(right-hand side is addition of sets)

Affine transformation of variables: if f (x) = h(Ax + b), then

∂ f (x) = AT ∂h(Ax + b)

Subgradients 2.13
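as an illustration of the affine rule (my own example, not from the slides): for f (x) = ‖Ax + b‖2 one subgradient is AT (Ax + b)/‖Ax + b‖2 when Ax + b ≠ 0, and AT g for any ‖g‖2 ≤ 1 (for example g = 0) otherwise

```python
import numpy as np

def subgrad_l2_affine(A, b, x, tol=1e-12):
    """One subgradient of f(x) = ||A x + b||_2 via the rule: subgradients of f are A^T times subgradients of ||.||_2 at A x + b."""
    r = A @ x + b
    nrm = np.linalg.norm(r)
    g_h = r / nrm if nrm > tol else np.zeros_like(r)   # a subgradient of h = ||.||_2 at r
    return A.T @ g_h

A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([-1.0, 3.0])
print(subgrad_l2_affine(A, b, np.array([1.0, 0.0])))   # [0. 1.]
```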
Pointwise maximum

f (x) = max { f1(x), . . . , fm (x)}

define I(x) = {i | fi (x) = f (x)}, the ‘active’ functions at x

Weak result

to compute a subgradient at x , choose any k ∈ I(x), any subgradient of fk at x

Strong result
∂ f (x) = conv ( ∪_{i ∈ I(x)} ∂ fi (x) )

• the convex hull of the union of subdifferentials of ‘active’ functions at x


• if fi ’s are differentiable, ∂ f (x) = conv {∇ fi (x) | i ∈ I(x)}

Subgradients 2.14
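a minimal sketch of the weak rule for differentiable fi (helper names and the example functions are mine): evaluate all fi at x, pick any active index, and return its gradient

```python
import numpy as np

def subgrad_pointwise_max(fs, grads, x):
    """Weak rule: one subgradient of f(x) = max_i f_i(x) for differentiable f_i."""
    vals = np.array([f(x) for f in fs])
    k = int(np.argmax(vals))        # any active index works; argmax picks one
    return grads[k](x)

# example: f(x) = max(x^T x, 1 + a^T x)
a = np.array([1.0, -2.0])
fs = [lambda x: x @ x, lambda x: 1.0 + a @ x]
grads = [lambda x: 2.0 * x, lambda x: a]
print(subgrad_pointwise_max(fs, grads, np.array([0.5, 0.5])))
```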
Example: piecewise-linear function

f (x) = max_{i=1,...,m} (aiT x + bi )

(figure: a piecewise-linear function f (x), the maximum of affine functions aiT x + bi )
the subdifferential at x is a polyhedron

∂ f (x) = conv {ai | i ∈ I(x)}

with I(x) = {i | aiT x + bi = f (x)}


Subgradients 2.15
Example: `1-norm

f (x) = ‖x‖1 = max_{s ∈ {−1,1}^n} sT x

the subdifferential is a product of intervals

 [−1, 1]

 xk = 0
∂ f (x) = J1 × · · · × Jn, Jk = {1} xk > 0

 {−1} xk < 0

(figure: three panels showing these subdifferentials as subsets of R²)

∂ f (0, 0) = [−1, 1] × [−1, 1],   ∂ f (1, 0) = {1} × [−1, 1],   ∂ f (1, 1) = {(1, 1)}

Subgradients 2.16
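a sketch of picking one element of this product of intervals (my own helper; it chooses 0 from [−1, 1] on zero coordinates)

```python
import numpy as np

def subgrad_l1(x):
    """One subgradient of ||x||_1: sign(x), with 0 chosen from [-1, 1] where x_k = 0."""
    return np.sign(x)

print(subgrad_l1(np.array([1.0, 0.0])))    # [1. 0.]  (an element of {1} x [-1, 1])
print(subgrad_l1(np.array([1.0, 1.0])))    # [1. 1.]
```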
Pointwise supremum

f (x) = sup_{α ∈ A} fα (x),   fα (x) convex in x for every α

Weak result: to find a subgradient at x̂ ,

• find any β for which f ( x̂) = f β ( x̂) (assuming maximum is attained)


• choose any g ∈ ∂ f β ( x̂)

(Partial) strong result: define I(x) = {α ∈ A | fα (x) = f (x)}

conv ( ∪_{α ∈ I(x)} ∂ fα (x) ) ⊆ ∂ f (x)

equality requires extra conditions (for example, A compact, fα continuous in α)

Subgradients 2.17
Exercise: maximum eigenvalue

Problem: explain how to find a subgradient of

f (x) = λmax(A(x)) = sup_{‖y‖2 = 1} yT A(x)y

where A(x) = A0 + x1 A1 + · · · + xn An with symmetric coefficients Ai

Solution: to find a subgradient at x̂ ,

• choose any unit eigenvector y with eigenvalue λmax(A( x̂))


• the gradient of yT A(x)y at x̂ is a subgradient of f :

(yT A1 y, . . . , yT An y) ∈ ∂ f ( x̂)

Subgradients 2.18
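a NumPy sketch of this recipe (assuming the coefficient matrices are supplied as a list [A0, A1, ..., An]; function and variable names are mine)

```python
import numpy as np

def subgrad_lambda_max(A_list, x):
    """One subgradient of f(x) = lambda_max(A0 + x1*A1 + ... + xn*An)."""
    A0, As = A_list[0], A_list[1:]
    Ax = A0 + sum(xi * Ai for xi, Ai in zip(x, As))
    w, V = np.linalg.eigh(Ax)          # eigenvalues in ascending order
    y = V[:, -1]                       # unit eigenvector for lambda_max(A(x))
    return np.array([y @ Ai @ y for Ai in As])

A0 = np.diag([1.0, 2.0])
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.eye(2)
print(subgrad_lambda_max([A0, A1, A2], np.array([0.5, -0.3])))
```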
Minimization

f (x) = inf_y h(x, y),   h jointly convex in (x, y)

Weak result: to find a subgradient at x̂ ,

• find ŷ that minimizes h( x̂, y) (assuming minimum is attained)


• find subgradient (g, 0) ∈ ∂h( x̂, ŷ)

Proof: for all x, y,

h(x, y) ≥ h(x̂, ŷ) + gT (x − x̂) + 0T (y − ŷ)
        = f (x̂) + gT (x − x̂)

therefore
f (x) = inf_y h(x, y) ≥ f (x̂) + gT (x − x̂)

Subgradients 2.19
Exercise: Euclidean distance to convex set

Problem: explain how to find a subgradient of

f (x) = inf_{y ∈ C} ‖x − y‖2

where C is a closed convex set

Solution: to find a subgradient at x̂ ,

• if f ( x̂) = 0 (that is, x̂ ∈ C), take g = 0


• if f ( x̂) > 0, find projection ŷ = P( x̂) on C and take

g = (x̂ − ŷ)/‖ŷ − x̂‖2 = (x̂ − P(x̂))/‖x̂ − P(x̂)‖2

Subgradients 2.20
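a sketch of this recipe with C taken to be the Euclidean unit ball, so that the projection has a closed form (the choice of C and the helper names are mine, not from the slides)

```python
import numpy as np

def proj_unit_ball(x):
    """Euclidean projection onto C = {y : ||y||_2 <= 1}."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def subgrad_dist(x, proj, tol=1e-12):
    """One subgradient of f(x) = inf_{y in C} ||x - y||_2, given a projector onto C."""
    p = proj(x)
    d = np.linalg.norm(x - p)
    return np.zeros_like(x) if d <= tol else (x - p) / d   # g = 0 when x is in C

x = np.array([3.0, 4.0])
print(subgrad_dist(x, proj_unit_ball))   # [0.6 0.8]
```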
Composition

f (x) = h( f1(x), . . . , fk (x)), h convex and nondecreasing, fi convex

Weak result: to find a subgradient at x̂ ,

• find z ∈ ∂h( f1( x̂), . . . , fk ( x̂)) and gi ∈ ∂ fi ( x̂)


• then g = z1 g1 + · · · + z k gk ∈ ∂ f ( x̂)

reduces to standard formula for differentiable h, fi

Proof:

f (x) ≥ h( f1(x̂) + g1T (x − x̂), . . . , fk (x̂) + gkT (x − x̂) )
      ≥ h( f1(x̂), . . . , fk (x̂) ) + zT ( g1T (x − x̂), . . . , gkT (x − x̂) )
      = h( f1(x̂), . . . , fk (x̂) ) + (z1 g1 + · · · + zk gk )T (x − x̂)
      = f (x̂) + gT (x − x̂)

Subgradients 2.21
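a sketch of the weak composition rule for the (hypothetical) example f (x) = log(1 + exp(‖x‖1)), i.e. the nondecreasing convex function h(u) = log(1 + e^u) composed with f1(x) = ‖x‖1

```python
import numpy as np

def subgrad_softplus_of_l1(x):
    """One subgradient of f(x) = log(1 + exp(||x||_1)) via the composition rule."""
    u = np.sum(np.abs(x))
    z = 1.0 / (1.0 + np.exp(-u))   # h'(u) for h(u) = log(1 + e^u); h is nondecreasing
    g1 = np.sign(x)                # a subgradient of f1(x) = ||x||_1 at x
    return z * g1                  # g = z1 * g1 (only one inner function here)

print(subgrad_softplus_of_l1(np.array([1.0, -2.0, 0.0])))
```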
Optimal value function

define f (u, v) as the optimal value of convex problem

minimize f0(x)
subject to fi (x) ≤ ui, i = 1, . . . , m
Ax = b + v

(functions fi are convex; optimization variable is x )

Weak result: suppose f (û, v̂) is finite and strong duality holds with the dual

maximize   inf_x ( f0(x) + Σ_i λi ( fi (x) − ûi ) + νT (Ax − b − v̂) )
subject to λ ⪰ 0

if λ̂, ν̂ are optimal dual variables (for right-hand sides û, v̂ ) then (−λ̂, −ν̂) ∈ ∂ f (û, v̂)

Subgradients 2.22
Proof: by weak duality for problem with right-hand sides u, v

f (u, v) ≥ inf_x ( f0(x) + Σ_i λ̂i ( fi (x) − ui ) + ν̂T (Ax − b − v) )

         = inf_x ( f0(x) + Σ_i λ̂i ( fi (x) − ûi ) + ν̂T (Ax − b − v̂) ) − λ̂T (u − û) − ν̂T (v − v̂)

         = f (û, v̂) − λ̂T (u − û) − ν̂T (v − v̂)

Subgradients 2.23
Expectation

f (x) = E h(x, u) u random, h convex in x for every u

Weak result: to find a subgradient at x̂ ,

• choose a function u ↦ g(u) with g(u) ∈ ∂x h(x̂, u)


• then, g = Eu g(u) ∈ ∂ f ( x̂)

Proof: by convexity of h and definition of g(u),

f (x) = E h(x, u)
      ≥ E [ h(x̂, u) + g(u)T (x − x̂) ]
      = f (x̂) + gT (x − x̂)

Subgradients 2.24
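a Monte Carlo sketch under an assumed setup (not from the slides): h(x, u) = |x − u| with u standard normal, so g(u) = sign(x − u) and the sample mean of g(u) approximates a subgradient of f (x) = E |x − u|

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_subgrad_expectation(x, n_samples=100_000):
    """Monte Carlo estimate of g = E_u g(u), with g(u) = sign(x - u), a subgradient of E|x - u|."""
    u = rng.standard_normal(n_samples)
    return np.mean(np.sign(x - u))

print(approx_subgrad_expectation(0.0))  # close to 0 (x = 0 minimizes E|x - u| for symmetric u)
print(approx_subgrad_expectation(1.0))  # close to 2*Phi(1) - 1, about 0.68
```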
Outline

• definition

• subgradient calculus

• duality and optimality conditions

• directional derivative
Optimality conditions — unconstrained

x⋆ minimizes f (x) if and only if

0 ∈ ∂ f (x⋆)

this follows directly from the definition of subgradient:

f (y) ≥ f (x⋆) + 0T (y − x⋆) for all y ⇐⇒ 0 ∈ ∂ f (x⋆)

Subgradients 2.25
Example: piecewise-linear minimization

f (x) = max_{i=1,...,m} (aiT x + bi )

Optimality condition

0 ∈ conv {ai | i ∈ I(x⋆)} where I(x) = {i | aiT x + bi = f (x)}

• in other words, x⋆ is optimal if and only if there is a λ with

λ ⪰ 0,   1T λ = 1,   Σ_{i=1}^m λi ai = 0,   λi = 0 for i ∉ I(x⋆)

• these are the optimality conditions for the equivalent linear program and its dual

minimize   t                        maximize   bT λ
subject to Ax + b ⪯ t1              subject to AT λ = 0
                                               λ ⪰ 0,  1T λ = 1

Subgradients 2.26
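a sketch of the equivalent LP solved with scipy.optimize.linprog (the example data is mine): variables (x, t), objective t, constraints Ax + b ⪯ t1

```python
import numpy as np
from scipy.optimize import linprog

def minimize_piecewise_linear(A, b):
    """Minimize max_i (a_i^T x + b_i) via the LP: min t subject to A x + b <= t 1."""
    m, n = A.shape
    c = np.r_[np.zeros(n), 1.0]                 # objective: minimize t
    A_ub = np.c_[A, -np.ones(m)]                # A x - t 1 <= -b
    b_ub = -b
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
    return res.x[:n], res.x[n]

A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([0.0, 1.0, 0.5, 0.0])
x_opt, t_opt = minimize_piecewise_linear(A, b)
print(x_opt, t_opt)   # optimal value t_opt = 0.5 for this data
```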
Optimality conditions — constrained

minimize f0(x)
subject to fi (x) ≤ 0, i = 1, . . . , m

assume dom fi = Rn, so functions fi are subdifferentiable everywhere

Karush–Kuhn–Tucker conditions

if strong duality holds, then x⋆, λ⋆ are primal, dual optimal if and only if
1. x⋆ is primal feasible

2. λ⋆ ⪰ 0

3. λi⋆ fi (x⋆) = 0 for i = 1, . . . , m

4. x⋆ is a minimizer of L(x, λ⋆) = f0(x) + Σ_{i=1}^m λi⋆ fi (x):

0 ∈ ∂ f0(x⋆) + Σ_{i=1}^m λi⋆ ∂ fi (x⋆)

Subgradients 2.27
Outline

• definition

• subgradient calculus

• duality and optimality conditions

• directional derivative
Directional derivative

Definition (for general f ): the directional derivative of f at x in the direction y is

f ′(x; y) = lim_{α ↘ 0} ( f (x + αy) − f (x) ) / α
          = lim_{t → ∞} t ( f (x + (1/t) y) − f (x) )

(if the limit exists)

• f ′(x; y) is the right derivative of g(α) = f (x + αy) at α = 0

• f ′(x; y) is homogeneous in y :

f ′(x; λy) = λ f ′(x; y) for λ ≥ 0

Subgradients 2.28
Directional derivative of a convex function

Equivalent definition (for convex f ): replace lim with inf

f ′(x; y) = inf_{α > 0} ( f (x + αy) − f (x) ) / α
          = inf_{t > 0} t ( f (x + (1/t) y) − f (x) )

Proof

• the function h(y) = f (x + y) − f (x) is convex in y , with h(0) = 0

• its perspective th(y/t) is nonincreasing in t (ECE236B ex. A2.5); hence

f ′(x; y) = lim_{t → ∞} t h(y/t) = inf_{t > 0} t h(y/t)

Subgradients 2.29
Properties

consequences of the expressions (for convex f )

f ′(x; y) = inf_{α > 0} ( f (x + αy) − f (x) ) / α
          = inf_{t > 0} t ( f (x + (1/t) y) − f (x) )

• f ′(x; y) is convex in y (partial minimization of a convex function in y, t )

• f ′(x; y) defines a lower bound on f in the direction y :

f (x + αy) ≥ f (x) + α f ′(x; y) for all α ≥ 0

Subgradients 2.30
Directional derivative and subgradients

for convex f and x ∈ int dom f

f ′(x; y) = sup_{g ∈ ∂ f (x)} gT y

f ′(x; y) is support function of ∂ f (x)

(figure: the set ∂ f (x), the direction y, and the maximizing subgradient, at which gT y = f ′(x; y))

• generalizes f ′(x; y) = ∇ f (x)T y for differentiable functions

• implies that f ′(x; y) exists for all x ∈ int dom f , all y (see page 2.4)

Subgradients 2.31
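a quick numerical check of this identity for the (hypothetical) case f = ‖·‖1 at x = 0: there ∂ f (0) = [−1, 1]^n, so the support function of ∂ f (0) at y is ‖y‖1, which matches the difference-quotient definition of f ′(0; y)

```python
import numpy as np

def dir_deriv_l1_at_zero(y, alpha=1e-8):
    """Difference quotient (f(0 + alpha*y) - f(0)) / alpha for f = ||.||_1 at x = 0."""
    return np.sum(np.abs(alpha * y)) / alpha

y = np.array([2.0, -3.0, 0.5])
print(dir_deriv_l1_at_zero(y))      # 5.5
print(np.sum(np.abs(y)))            # support function of [-1, 1]^3 at y: ||y||_1 = 5.5
```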
Proof: if g ∈ ∂ f (x) then from page 2.29

f ′(x; y) ≥ inf_{α > 0} ( f (x) + α gT y − f (x) ) / α = gT y

it remains to show that f ′(x; y) = ĝT y for at least one ĝ ∈ ∂ f (x)

• f ′(x; y) is convex in y with domain Rn, hence subdifferentiable at all y

• let ĝ be a subgradient of f ′(x; y) at y : then for all v, λ ≥ 0,

λ f ′(x; v) = f ′(x; λv) ≥ f ′(x; y) + ĝT (λv − y)

• taking λ → ∞ shows that f ′(x; v) ≥ ĝT v ; from the lower bound on page 2.30,

f (x + v) ≥ f (x) + f ′(x; v) ≥ f (x) + ĝT v for all v

hence ĝ ∈ ∂ f (x)

• taking λ = 0 we see that f ′(x; y) ≤ ĝT y

Subgradients 2.32
Descent directions and subgradients

y is a descent direction of f at x if f ′(x; y) < 0

• the negative gradient of a differentiable f is a descent direction (if ∇ f (x) ≠ 0)


• negative subgradient is not always a descent direction

Example: f (x1, x2) = |x1 | + 2|x2 |


(figure: level curves of f in the (x1, x2)-plane, with the subgradient g = (1, 2) drawn at the point (1, 0))

g = (1, 2) ∈ ∂ f (1, 0), but y = (−1, −2) is not a descent direction at (1, 0):
f ((1, 0) + α(−1, −2)) = |1 − α| + 4α = 1 + 3α > f (1, 0) for 0 < α < 1

Subgradients 2.33
Steepest descent direction

Definition: (normalized) steepest descent direction at x ∈ int dom f is

∆xnsd = argmin_{‖y‖2 ≤ 1} f ′(x; y)

∆xnsd is the primal solution y of the pair of dual problems (BV §8.1.3)

minimize (over y)  f ′(x; y)          maximize (over g)  −‖g‖2
subject to ‖y‖2 ≤ 1                   subject to g ∈ ∂ f (x)

• dual optimal g⋆ is subgradient with least norm

• f ′(x; ∆xnsd) = −‖g⋆‖2

• if 0 ∉ ∂ f (x), ∆xnsd = −g⋆/‖g⋆‖2

• ∆xnsd can be expensive to compute

(figure: the set ∂ f (x), its least-norm element g⋆, and the direction ∆xnsd, which satisfies g⋆T ∆xnsd = f ′(x; ∆xnsd))
Subgradients 2.34
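for the example on page 2.33 this can be worked out by hand: ∂ f (1, 0) = {1} × [−2, 2], the least-norm subgradient is g⋆ = (1, 0), and ∆xnsd = (−1, 0); the sketch below (my own, using the box shape of this particular subdifferential) finds g⋆ by projecting 0 onto ∂ f (1, 0)

```python
import numpy as np

# subdifferential of f(x1, x2) = |x1| + 2|x2| at (1, 0) is the box {1} x [-2, 2]
lower = np.array([1.0, -2.0])
upper = np.array([1.0,  2.0])

g_star = np.clip(np.zeros(2), lower, upper)   # least-norm element: project 0 onto the box
dx_nsd = -g_star / np.linalg.norm(g_star)     # normalized steepest descent direction

print(g_star)   # [1. 0.]
print(dx_nsd)   # [-1.  0.]  -- a genuine descent direction, unlike -(1, 2)
```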
Subgradients and distance to sublevel sets

if f is convex, f (y) < f (x), g ∈ ∂ f (x), then for small t > 0,

‖x − tg − y‖2² = ‖x − y‖2² − 2t gT (x − y) + t² ‖g‖2²
              ≤ ‖x − y‖2² − 2t ( f (x) − f (y)) + t² ‖g‖2²
              < ‖x − y‖2²

• −g is descent direction for ‖x − y‖2, for any y with f (y) < f (x)

• in particular, −g is descent direction for distance to any minimizer of f

Subgradients 2.35
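this inequality is what drives the basic subgradient method; a minimal sketch (the step sizes and the example problem are mine, not from the slides) minimizing f (x) = ‖x − c‖1 with steps x ← x − tk g, tk = 1/k

```python
import numpy as np

c = np.array([1.0, -2.0, 3.0])          # minimizer of f(x) = ||x - c||_1

def f(x):
    return np.sum(np.abs(x - c))

def subgrad(x):
    return np.sign(x - c)               # one subgradient of f at x

x = np.zeros(3)
f_best = f(x)
for k in range(1, 2001):
    g = subgrad(x)
    x = x - (1.0 / k) * g               # diminishing step sizes
    f_best = min(f_best, f(x))          # f itself need not decrease monotonically

print(f_best)                           # close to 0
```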
References

• A. Beck, First-Order Methods in Optimization (2017), chapter 3.

• D. P. Bertsekas, A. Nedić, A. E. Ozdaglar, Convex Analysis and Optimization (2003), chapter 4.

• J.-B. Hiriart-Urruty, C. Lemaréchal, Convex Analysis and Minimization Algorithms (1993), chapter VI.

• Yu. Nesterov, Lectures on Convex Optimization (2018), section 3.1.

• B. T. Polyak, Introduction to Optimization (1987), section 5.1.

Subgradients 2.36
