
NATIONAL ECONOMICS UNIVERSITY
Faculty of Mathematical Economics

FINAL EXAM: Optimization
Test 02
Date: 09/12/2023
Time: 90 minutes

Instruction: Only two A4 sheets of handwritten notes are permitted in the examination room.

Question 1. (3p) Consider the problem


minimize   f(x)
subject to x ∈ Ω,

where f : R² → R is given by f(x) = 3x₁ − 3 with x = [x₁, x₂], and Ω = {x = [x₁, x₂] : x₁ + x₂² ≥ 2}.
a) Does the point x∗ = [2, 0] satisfy the first-order necessary condition?

b) Does the point x∗ = [2, 0] satisfy the second-order necessary condition?

c) Is the point x∗ = [2, 0] a local minimizer?
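
A minimal numerical sketch of how one might check part a) at x* = [2, 0], assuming Python with NumPy; the helper names grad_f and grad_g are ours, not the exam's:

import numpy as np

# Problem data: f(x) = 3*x1 - 3, inequality constraint g(x) = x1 + x2**2 - 2 >= 0.
def grad_f(x):
    return np.array([3.0, 0.0])         # gradient of f (constant)

def grad_g(x):
    return np.array([1.0, 2.0 * x[1]])  # gradient of g

x_star = np.array([2.0, 0.0])

# g(x*) = 0, so the constraint is active; the first-order (KKT) condition
# requires some mu >= 0 with grad_f(x*) = mu * grad_g(x*).
gf, gg = grad_f(x_star), grad_g(x_star)
mu = gf[0] / gg[0]                      # match the first components: mu = 3
print(mu, gf - mu * gg)                 # 3.0 [0. 0.] -> condition holds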

Question 2. (3p) Consider the problem


minimize  ∑ᵢ₌₁ᵐ h(aᵢᵀx − bᵢ),

where x ∈ Rⁿ is the variable, the vectors a₁, …, aₘ ∈ Rⁿ and b ∈ Rᵐ are given, and h is the function given by

h(z) = 0 if |z| ≤ 1,  and  h(z) = |z| − 1 if |z| > 1.

Note that this problem can be thought of as a sort of hybrid between the ℓ1- and ℓ∞-norm penalties, since there is no cost for residuals smaller than one and a linearly growing cost for residuals larger than one.

a) Graph the function h and show that h(z) = max{−z − 1, 0, z − 1}.

b) Express the problem as a linear program.

c) Derive its dual problem and simplify it as much as you can.
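
Purely as an illustration of parts a) and b), not part of the exam, here is a sketch assuming Python with NumPy and SciPy: it builds the epigraph linear program and checks its optimal value against the penalty objective. The data A, b are randomly generated for the demo.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 20, 3
A = rng.normal(size=(m, n))            # rows are a_i^T (made-up demo data)
b = rng.normal(size=m)

def h(z):
    # deadzone-linear penalty; equals max(-z - 1, 0, z - 1)
    return np.maximum(np.abs(z) - 1.0, 0.0)

# Epigraph LP over z = [x; t]: minimize sum(t) subject to
#   a_i^T x - b_i - 1 <= t_i,   -(a_i^T x - b_i) - 1 <= t_i,   t_i >= 0.
c = np.concatenate([np.zeros(n), np.ones(m)])
I = np.eye(m)
A_ub = np.vstack([np.hstack([A, -I]),    #  a_i^T x - t_i <= b_i + 1
                  np.hstack([-A, -I])])  # -a_i^T x - t_i <= 1 - b_i
b_ub = np.concatenate([b + 1.0, 1.0 - b])
bounds = [(None, None)] * n + [(0, None)] * m   # x free, t >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x = res.x[:n]
print(res.fun, h(A @ x - b).sum())     # the two values should agree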

Question 3. (3p) Suppose that f : R² → R is defined by

f(x) = (1/2) xᵀAx + bᵀx,

where A = [3 0; 0 4], b = [−3, 2]ᵀ, and the starting point is x⁽⁰⁾ = [5, 5]ᵀ.
a) Find an unconstrained local minimum point of f .

b) Is the above solution actually a global minimum point? Why?

c) Suppose that we use a fixed-step-size gradient algorithm:

x⁽ᵏ⁺¹⁾ = x⁽ᵏ⁾ − α∇f(x⁽ᵏ⁾).

Find the largest range of values of α for which the algorithm is globally convergent.

d) What would be the rate of convergence of the steepest descent method for this problem? How many steepest descent iterations would it take (at most) to reduce the function value to ε = 10⁻⁵?
     
e) Now consider A = [10 0; 0 0.1], b = [−3, 2]ᵀ, and x⁽⁰⁾ = [5, 5]ᵀ. How does the rate of convergence to the global minimum compare with that in the previous parts? What makes the difference?
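
As an illustration of parts c) to e) (our sketch, assuming Python with NumPy): for a quadratic with symmetric positive definite A, the fixed-step iteration is known to converge exactly when 0 < α < 2/λmax(A); here λmax = 4, and the experiment below is consistent with that range.

import numpy as np

A = np.diag([3.0, 4.0])                # Hessian; eigenvalues 3 and 4
b = np.array([-3.0, 2.0])
x_min = -np.linalg.solve(A, b)         # minimizer [1, -0.5] from A x + b = 0

def run(alpha, iters=300):
    x = np.array([5.0, 5.0])
    for _ in range(iters):
        x = x - alpha * (A @ x + b)    # x^{k+1} = x^k - alpha * grad f(x^k)
    return x

# Theory for SPD quadratics: converges iff 0 < alpha < 2 / lambda_max = 0.5.
print(run(0.25))   # -> close to [1, -0.5]
print(run(0.55))   # diverges (alpha above 2 / lambda_max)

Replacing A with diag(10, 0.1) raises the condition number from 4/3 to 100, which is what slows steepest descent in part e).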

Question 4. (1p)

a) What is the difference between gradient descent and mini-batch gradient descent? Compare their advantages and drawbacks.

b) Briefly describe the idea behind the Adaptive Gradient Algorithm (Adagrad) and its update rule.
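
For reference, a sketch of the standard Adagrad update (not the exam's model answer; η and ε are typical but arbitrary choices), assuming Python with NumPy and reusing the quadratic from Question 3: Adagrad keeps a running per-coordinate sum of squared gradients and divides each step by its square root.

import numpy as np

A = np.diag([3.0, 4.0])
b = np.array([-3.0, 2.0])

def grad(x):
    return A @ x + b

x = np.array([5.0, 5.0])
G = np.zeros(2)                # running sum of squared gradients, per coordinate
eta, eps = 1.0, 1e-8           # step-size and numerical-safety constants

for _ in range(1000):
    g = grad(x)
    G += g * g                             # G_k = G_{k-1} + g_k * g_k (elementwise)
    x -= eta * g / (np.sqrt(G) + eps)      # per-coordinate adaptive step size
print(x)                                   # drifts toward the minimizer [1, -0.5]

Because G grows monotonically, each coordinate's effective step size only shrinks over time; that is Adagrad's core idea and also its main drawback.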
