8. Nonlinear Optimization With Inequality Constraints

Non-linear optimization with inequality constraints

Introduction
 Nonlinear optimization refers to the problem of finding the optimal value of a given function f(X1, X2, …, Xn) of n variables, usually restricted to be non-negative (Xj ≥ 0), subject to one or more constraints (linear or non-linear) which are in general inequalities: gi(X1, X2, …, Xn) {≤ or ≥} bi. Not all of f and the gi are linear; otherwise this would be a different class of problem: linear programming.

 To introduce the problem, consider the following example:

Min F = (X1 − 4)² + (X2 − 3)²
s.t. 3X1 + 2X2 ≤ 12 (g1)
     −2X1 + 2X2 ≤ 3 (g2)
     2X1 − X2 ≤ 4 (g3)
     2X1 + 3X2 ≥ 6 (g4)
     X1, X2 ≥ 0

We can solve this problem with the aid of a graph, which allows us to see which constraint(s) are binding.
[Figure: the feasible region bounded by g1–g4 with a contour of F tangent to g1 at the optimum.]

The optimal point is X1* = 34/13, X2* = 27/13, F* = 2.77.
 To find this point we utilize the fact that at the tangency point (see graph) the slopes are equal. The slope of g1 (the binding constraint) is −3/2.
 To find the slope of the objective function, we take the total differential of a contour of the objective function, (X1 − 4)² + (X2 − 3)² − F = 0 (with dF = 0 along a contour), i.e.,

2(X1 − 4)dX1 + 2(X2 − 3)dX2 = 0

2(X1 − 4) + 2(X2 − 3)(dX2/dX1) = 0

But at the tangency between F and g1 the slope is dX2/dX1 = −3/2.

So, 2(X1 − 4) + 2(X2 − 3)(−3/2) = 0 → 2X1 − 3X2 = −1

Solving along with 3X1 + 2X2 = 12, we get the solution given above.

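The tangency solution can also be checked numerically. The sketch below assumes scipy's SLSQP solver (an assumption; any inequality-constrained solver would do), with each constraint rewritten in scipy's required form c(x) ≥ 0.

# Numerical check of the tangency solution above (a sketch, not part of the
# slides): minimize F subject to g1-g4 and non-negativity.
from scipy.optimize import minimize

F = lambda x: (x[0] - 4)**2 + (x[1] - 3)**2

cons = [
    {"type": "ineq", "fun": lambda x: 12 - 3*x[0] - 2*x[1]},  # g1: 3X1 + 2X2 <= 12
    {"type": "ineq", "fun": lambda x: 3 + 2*x[0] - 2*x[1]},   # g2: -2X1 + 2X2 <= 3
    {"type": "ineq", "fun": lambda x: 4 - 2*x[0] + x[1]},     # g3: 2X1 - X2 <= 4
    {"type": "ineq", "fun": lambda x: 2*x[0] + 3*x[1] - 6},   # g4: 2X1 + 3X2 >= 6
]
res = minimize(F, x0=[2.0, 2.0], method="SLSQP",
               bounds=[(0, None), (0, None)], constraints=cons)
print(res.x, res.fun)  # approx. (2.615, 2.077) = (34/13, 27/13), F* approx. 2.77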
 One can show that the optimum can occur inside the feasible set.
Example: Min Z = (X1 − 9/5)² + (X2 − 2)² (same constraints)

[Figure: the same feasible region; the optimum of Z is now at the interior point (9/5, 2).]

Now the optimal value of Z occurs at an interior point, (9/5, 2), and Z = 0 at the minimum.
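As a quick check (a plain-Python sketch; the point and constraints come from the slides above), the unconstrained minimizer (9/5, 2) satisfies every constraint, so it is indeed an interior optimum:

# The unconstrained minimizer of Z is (9/5, 2); verify it is feasible,
# hence an interior optimum with Z = 0.
x1, x2 = 9/5, 2.0
print(3*x1 + 2*x2 <= 12,      # g1
      -2*x1 + 2*x2 <= 3,      # g2
      2*x1 - x2 <= 4,         # g3
      2*x1 + 3*x2 >= 6,       # g4
      x1 >= 0 and x2 >= 0)    # all print True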
We can also show that a local optimum is not always a global one. Suppose we were looking for a maximum instead of a minimum (same objective function and constraints). Point A will provide a local maximum (Z = 169/100), but point B would give the global maximum; that is, Z at B > Z at A.

Finally, consider the following constraints:

X1X2 ≤ b1
X1² + X2² ≥ b2
X1, X2 ≥ 0

[Figure: the curves X1X2 = b1 and X1² + X2² = b2 in the first quadrant; the feasible points form two disjoint pieces.]

This results in a feasible set that is not convex (it is disjoint). A solution may not exist if the constraints contain inconsistencies or if the problem is unbounded.
First-order conditions (necessary conditions)

 In handling classical constrained optimization problems we used calculus techniques. In particular, a convenient method there was the Lagrangian method.

 Can we extend this approach to N.L.O. problems (that is, problems with inequality constraints)?
Karush*, Kuhn, and Tucker (KKT) have shown that:
(1) There is a wide class of such problems for which a Lagrangian approach can be
followed; when the Lagrangian function is optimized, the same values that optimize
the Lagrangian will also optimize the original function subject to the constraints.

(2) Consider a maximization problem: We set up the L-function and treat the L-
multipliers (λs) as variables (as we did earlier); then suppose we solve the problem.

Property of the solution: In the optimal solution, we’ll find the values of the
decision variables of the maximization problem (X1*, X2*,…) which maximize the
L-function, and at the same time, the solution values of the multipliers (λ*) will be
those which minimize the value of a dual L-function.
This is a saddle point solution.

*The KKT conditions were originally named after Harold W. Kuhn and Albert W. Tucker, who
first published the conditions in 1951. Later it was understood that the necessary conditions for
this problem had been stated by William Karush in his U of Chicago master's thesis (!) in 1939.
Here, the constrained maximization (minimization) problem is formulated as a
Lagrange function whose optimal point is a saddle point, i.e. a maximum (minimum)
over the domain of the decision variables and a minimum (maximum) over the multipliers
(λs), which is why the Karush–Kuhn–Tucker theorem is sometimes referred to as the
saddle-point theorem.

This saddle point is related to duality. The duality concept specifies a primal-dual, or
max-min relationship.
For example, suppose we have a (primal) output maximization problem subject to a cost
constraint.
- The dual will be a cost minimization problem subject to an output constraint.

Similarly, if the primal is the utility maximization problem, its dual is the expenditure
minimization problem.

 Let X* be the optimal vector of the max. problem variables and Y* be the
optimal vector of the dual (minimization) problem choice variables. Then:
 The values of the primal variables X* which maximize the primal objective function
(and the L-function), have L-multiplier solution values (λ*) which are the optimal
values (Y*) of the dual problem.

 Similarly, the optimal values Y* of the dual variables that minimize the dual L-function L(Y*, X*) have λ-multipliers whose optimal values are the primal solution values X*.

 Therefore, the Lagrangian multipliers (λs) in the primal problem are equal to
the dual optimal variables, and the dual multipliers (λs) are equal to the
primal optimal solution values.

The Lagrangian function


 To set up the L-function in the case of NLO problems we have to follow a similar
approach to the case of classical constrained optimization, but need to be a bit
more careful.
Given the problem:
Max. Z = f(X1, X2, …Xn)
s. t. g1(X1, X2, …Xn) ≤ b1
……………………….
gm(X1, X2, …Xn) ≤ bm
Then,
L(X, λ) = f(X1, X2,…Xn) + λ1[b1 - g1(X1, X2, …Xn)] +…+ λm[bm - gm(X1, X2, …Xn)]

 The Kuhn-Tucker conditions are necessary (but not sufficient) for optimization. In other words, they are the first-order conditions for NLO problems.

Below are the K-T conditions alongside the FOCs for classical constrained optimization:

Classical constrained optimization:
∂L(X, λ)/∂Xj = 0
∂L(X, λ)/∂λi = 0

K-T conditions for a maximum:
∂L(X, λ)/∂Xj ≤ 0, and Xj[∂L(X, λ)/∂Xj] = 0
∂L(X, λ)/∂λi ≥ 0, and λi[∂L(X, λ)/∂λi] = 0
Xj ≥ 0

(For minimization the inequalities are reversed.)
Example:
Max Z = X1³ + X1X2
s.t. X1 + X2⁵ ≤ 10
     X1² + 2X2 ≥ 4 [or −X1² − 2X2 ≤ −4]
     X1, X2 ≥ 0
Then,

L(X, λ) = X1³ + X1X2 + λ1(10 − X1 − X2⁵) + λ2(−4 + X1² + 2X2)

We entered the constraints in such a way that when we take the derivative of L with respect to λ1 and λ2, the relevant parts of the K-T conditions, ∂L/∂λi ≥ 0, are equivalent to the constraints:

∂L/∂λ1 = 10 − X1 − X2⁵ ≥ 0 (equivalent to X1 + X2⁵ ≤ 10)

∂L/∂λ2 = −4 + X1² + 2X2 ≥ 0 (equivalent to X1² + 2X2 ≥ 4)
Continued…
Now the K-T conditions are:

∂L/∂X1 = 3X1² + X2 − λ1 + 2λ2X1 ≤ 0 (1)
along with X1[3X1² + X2 − λ1 + 2λ2X1] = 0 (1a)

∂L/∂X2 = X1 − 5λ1X2⁴ + 2λ2 ≤ 0 (2)
along with X2[X1 − 5λ1X2⁴ + 2λ2] = 0 (2a)

and,

∂L/∂λ1 = 10 − X1 − X2⁵ ≥ 0 (3)
along with λ1[10 − X1 − X2⁵] = 0 (3a)

∂L/∂λ2 = −4 + X1² + 2X2 ≥ 0 (4)
along with λ2[−4 + X1² + 2X2] = 0 (4a)

and X1, X2 ≥ 0.
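As an illustration, the example can also be handed to a numerical solver. The sketch below again assumes scipy's SLSQP; the problem is nonconvex, so for a given starting point the solver returns at best a local KKT point, not a certified global maximum.

# Solve the example numerically (a sketch). Maximizing Z means minimizing -Z;
# both constraints are rewritten as c(x) >= 0 for scipy.
from scipy.optimize import minimize

neg_Z = lambda x: -(x[0]**3 + x[0]*x[1])
cons = [
    {"type": "ineq", "fun": lambda x: 10 - x[0] - x[1]**5},   # X1 + X2^5 <= 10
    {"type": "ineq", "fun": lambda x: x[0]**2 + 2*x[1] - 4},  # X1^2 + 2X2 >= 4
]
res = minimize(neg_Z, x0=[5.0, 1.0], method="SLSQP",
               bounds=[(0, None), (0, None)], constraints=cons)
print(res.success, res.x, -res.fun)  # candidate point; verify against (1)-(4a)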
 The K-T conditions are in fact generalizations of the FOCs in classical constrained optimization.

 Having inequality constraints amounts to allowing for the possibility that the optimum occurs at a point where one or more of the solution values equal zero (Xj = 0), i.e., the possibility of a boundary (corner) solution as opposed to an interior solution. They also allow for the possibility that the constraint(s) are not binding at the optimal solution (i.e., are satisfied as strict inequalities).

 The K-T conditions describe a more general case and therefore contain the FOCs for classical constrained optimization.
Some intuition:

[Figure: a curve Y = f(X) over X ≥ 0 with an interior peak at A, a second peak at B, and a downward slope at the vertical axis, point C, where X = 0.]

For Y = f(X), with X non-negative, a maximum can occur at a corner point as well as at an interior point (like A) where ∂Y/∂X = 0. At B, ∂Y/∂X = 0 as well (also a maximum).

At C we have ∂Y/∂X < 0 and X = 0. Can this represent a maximum? Yes, because the value of Y will decrease for any X > 0 (given the non-negativity constraint). However, for a maximum we cannot have ∂Y/∂X > 0 at X = 0, because then we could increase the value of Y by increasing X.
 Putting the above together and generalizing to functions of more than one variable, for a maximum we have:

 Either ∂Y/∂Xi = 0 and Xi > 0 (interior max, point A),
or ∂Y/∂Xi < 0 and Xi = 0 (corner max, point C),
while ∂Y/∂Xi = 0 and Xi = 0 is also possible (point B).

 Which one applies? From the K-T conditions: Xj[∂L/∂Xj] = 0.
This means that if Xj > 0 → ∂L/∂Xj = 0,
and if ∂L/∂Xj ≠ 0 → Xj = 0.
That is, either Xj = 0 or ∂L/∂Xj = 0 (or both); see the sketch below.

 Similar conclusions are derived from λi[∂L/∂λi] = 0.
[i.e., λi > 0 → ∂L/∂λi = 0, or ∂L/∂λi ≠ 0 → λi = 0, or both ∂L/∂λi = 0 and λi = 0]
When λi > 0, the corresponding constraint is binding.
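A minimal one-variable sketch of this either/or logic, assuming the hypothetical objective f(x) = −(x − a)² so that the unconstrained peak sits at x = a:

# For f(x) = -(x - a)^2 maximized over x >= 0: if a > 0 the maximum is
# interior (f'(x*) = 0, x* > 0); if a <= 0 it is the corner x* = 0 with
# f'(0) <= 0. Either way the product x* f'(x*) is zero.
def argmax_nonneg(a):
    x_star = max(a, 0.0)                # interior peak if a > 0, else corner
    slope = -2 * (x_star - a)           # f'(x*)
    assert abs(x_star * slope) < 1e-12  # complementary slackness holds
    return x_star, slope

print(argmax_nonneg(2.0))   # (2.0, 0.0): interior max, like point A
print(argmax_nonneg(-1.0))  # (0.0, -2.0): corner max, like point C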
If we have a minimization problem, the signs of the inequalities are reversed:

∂L/∂Xj ≥ 0 and ∂L/∂λi ≤ 0

[Figure: the one-variable graph revisited (points A and B) for the minimization case.]
Why the Lagrangian method works:

Max. Z = f(X1, X2, …, Xn)
s.t. g1(X1, X2, …, Xn) ≤ b1
     ……………………….
     gm(X1, X2, …, Xn) ≤ bm

Consider the Lagrangian expression:

L(X, λ) = f(X1, X2, …, Xn) + λ1[b1 − g1(X1, X2, …, Xn)] + … + λm[bm − gm(X1, X2, …, Xn)]

 The condition ∂L/∂λi ≥ 0 for a maximum means:

∂L/∂λi = bi − gi(X1, X2, …, Xn) ≥ 0, which is the ith constraint. That is, we know that the constraints are satisfied.
 Then, by the complementarity condition λi(∂L/∂λi) = 0, i.e., λi[bi − gi(X1, X2, …, Xn)] = 0 for all i, the λ-terms vanish and we are left with:
L(X*, λ*) = f(X1*, X2*, …, Xn*), which is the objective function.

 Therefore, a solution which satisfies the K-T conditions for the L-function must also:

(1) satisfy the constraints, and
(2) make the value of L equal to that of the objective function.
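A numeric sketch of this argument using the first example (the multiplier value λ1* = 12/13 is hand-derived from the tangency condition and is an assumption here; the point is only that every λi[bi − gi] term vanishes):

# At the optimum of the first example only g1 binds, so each term
# lambda_i * [b_i - g_i(X*)] is zero and L(X*, lambda*) equals F(X*).
x1, x2 = 34/13, 27/13
lam = [12/13, 0.0, 0.0, 0.0]     # hand-derived; non-binding constraints get 0
slack = [12 - (3*x1 + 2*x2),     # g1: binding, slack = 0
         3 - (-2*x1 + 2*x2),     # g2: slack > 0, so lambda2 = 0
         4 - (2*x1 - x2),        # g3: slack > 0
         (2*x1 + 3*x2) - 6]      # g4: slack > 0
F = (x1 - 4)**2 + (x2 - 3)**2
L = F + sum(l*s for l, s in zip(lam, slack))
print(F, L)                      # both approx. 2.769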
• The KKT conditions (also known as Kuhn-Tucker conditions) give the first-order (necessary) conditions for a solution of a nonlinear programming problem to be optimal, provided that some "regularity conditions" are satisfied.
• The constraint qualification imposes restrictions on the constraint functions to rule out certain irregularities on the boundary of the feasible region. If the constraint qualification is not satisfied, the K-T conditions may fail at an optimal solution that occurs at such an irregular point (if interested, see the section on "Constraint Qualification" in the book).
Since the FOCs (K-T conditions) are inequalities, we don't have second-order conditions to test as we did under equality constraints. The sufficient conditions for a max/min are instead given in terms of concavity/convexity requirements on the objective and constraint functions.

Sufficient conditions:
The candidate point will be a global max (min) if:
(1) For a maximum the objective function is concave; for a minimum it is convex.
(2) The constraints form a feasible region which is convex; for a maximum (with ≤ constraints) this requires the constraint functions to be convex, and for a minimum (with ≥ constraints) concave.
(3) The K-T conditions are satisfied.
[Notice that the convex vs. concave distinction can refer to a function or to the feasible region. If needed, please review pages 327-330 in CW.]
 Note also that a linear function is both convex and concave.
 So, we use the test given earlier to determine the shape of each function.
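A sketch of checking condition (1) for the first example with sympy (an assumed tool choice; any symbolic or numeric Hessian test works): the objective's Hessian is positive definite, so the objective is convex, and since all constraints are linear the feasible region is convex, making the K-T point found earlier a global minimum.

# Sufficiency check for the first example: the Hessian of F is diag(2, 2).
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
F = (x1 - 4)**2 + (x2 - 3)**2
H = sp.hessian(F, (x1, x2))
print(H)                                   # Matrix([[2, 0], [0, 2]])
# H is diagonal with positive entries, hence positive definite: F is
# (strictly) convex, so the K-T point is a global minimum.
print(all(H[i, i] > 0 for i in range(2)))  # True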
