8. Nonlinear Optimization With Inequality Constraints
Introduction
Nonlinear optimization with inequality constraints refers to the problem of finding the optimal value of a given function f(X1, X2, …, Xn) of n variables, usually restricted to be non-negative (Xj ≥ 0), subject to one or more constraints (linear or non-linear) which are in general inequalities: gi(X1, X2, …, Xn) {≥ or ≤} bi. Not all of f and the gi are linear; otherwise this would be a different class of problem: linear programming.
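In standard form, the problem just described can be restated compactly as follows (a sketch of the notation used throughout this section):

```latex
% General form of a nonlinear program with inequality constraints
\begin{aligned}
\max \text{ (or } \min\text{)} \quad & f(X_1, X_2, \ldots, X_n) \\
\text{s.t.} \quad & g_i(X_1, X_2, \ldots, X_n) \;\{\le \text{ or } \ge\}\; b_i, \qquad i = 1, \ldots, m, \\
& X_j \ge 0, \qquad j = 1, \ldots, n.
\end{aligned}
```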
[Figure: the feasible region, with a circular contour of the objective function tangent to the binding constraint g1 at the point (34/13, 27/13); axes X1 and X2.]
To find this point we use the fact that at the tangency point (see graph) the slopes are equal. The slope of g1 (the binding constraint) is -3/2.
To find the slope of the objective function's contour, we take the total differential of the objective function, written as (X1 − 4)² + (X2 − 3)² − F = 0:
2(X1 − 4)dX1 + 2(X2 − 3)dX2 = 0, i.e., dX2/dX1 = −(X1 − 4)/(X2 − 3).
Setting this slope equal to -3/2 and solving it together with the binding constraint gives the tangency point (34/13, 27/13).
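As a numerical cross-check (a sketch, not from the slides): assuming the binding constraint is 3X1 + 2X2 ≤ 12, which is consistent with the stated slope of -3/2 and the tangency point (34/13, 27/13) in the graph, SciPy reproduces the point:

```python
# Numerical check of the tangency point (sketch; the constraint
# 3*x1 + 2*x2 <= 12 is inferred from the slide's slope of -3/2).
import numpy as np
from scipy.optimize import minimize

objective = lambda x: (x[0] - 4)**2 + (x[1] - 3)**2   # Min Z

res = minimize(objective, x0=[0.0, 0.0], method="SLSQP",
               bounds=[(0, None), (0, None)],          # X1, X2 >= 0
               constraints=[{"type": "ineq",           # 3X1 + 2X2 <= 12
                             "fun": lambda x: 12 - 3*x[0] - 2*x[1]}])

print(res.x)                       # ~ [2.6154, 2.0769]
print(np.array([34/13, 27/13]))    # tangency point from the graph
```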
One can show that the optimum can occur inside the feasible set.
Example: Min Z = (X1 − 9/5)² + (X2 − 2)² (same constraints)
[Figure: the same feasible region; the optimum A now lies at the interior point (9/5, 2).]
Now the optimal value of Z occurs at an interior point, A = (9/5, 2), and Z = 0 at the minimum.
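As a quick check (a sketch under the same assumed binding constraint 3X1 + 2X2 ≤ 12 as above), the unconstrained minimizer (9/5, 2) is feasible, so the constraint is slack and the optimum is interior:

```python
# Interior-optimum check (sketch; constraint 3*x1 + 2*x2 <= 12 is assumed).
x1, x2 = 9/5, 2.0                      # unconstrained minimizer of Z
print(3*x1 + 2*x2 <= 12)               # True: 9.4 <= 12, constraint is slack
print((x1 - 9/5)**2 + (x2 - 2)**2)     # 0.0: Z = 0 at the interior optimum A
```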
We can also show that a local optimum is not always a global one. Suppose we were looking for a maximum instead of a minimum (same objective function and constraints). Point A would provide a local maximum (Z = 169/100), but point B would give the global maximum, that is, Z at B > Z at A.
Karush*, Kuhn, and Tucker (KKT) have shown that:
(1) There is a wide class of such problems for which a Lagrangian approach can be
followed; when the Lagrangian function is optimized, the same values that optimize
the Lagrangian will also optimize the original function subject to the constraints.
(2) Consider a maximization problem: We set up the L-function and treat the L-
multipliers (λs) as variables (as we did earlier); then suppose we solve the problem.
Property of the solution: In the optimal solution, we’ll find the values of the
decision variables of the maximization problem (X1*, X2*,…) which maximize the
L-function, and at the same time, the solution values of the multipliers (λ*) will be
those which minimize the value of a dual L-function.
This is a saddle point solution.
*The KKT conditions were originally named after Harold W. Kuhn and Albert W. Tucker, who
first published the conditions in 1951. Later it was understood that the necessary conditions for
this problem had been stated by William Karush in his U of Chicago master's thesis (!) in 1939.
Here, the constrained maximization (minimization) problem is formulated as a
Lagrange function whose optimal point is a saddle point, i.e. a maximum (minimum)
over the domain of the decision variables and a minimum (maximum) over the multipliers
(λs), which is why the Karush–Kuhn–Tucker theorem is sometimes referred to as the
saddle-point theorem.
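For a maximization problem, this saddle-point property can be written compactly (a standard restatement, not shown explicitly above):

```latex
% Saddle-point property of the Lagrangian at the optimum (X*, lambda*):
% a maximum over the decision variables, a minimum over the multipliers.
L(X, \lambda^{*}) \;\le\; L(X^{*}, \lambda^{*}) \;\le\; L(X^{*}, \lambda)
\qquad \text{for all feasible } X \ge 0,\ \lambda \ge 0 .
```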
This saddle point is related to duality. The duality concept specifies a primal-dual, or
max-min relationship.
For example, suppose we have a (primal) output maximization problem subject to a cost
constraint.
- The dual will be a cost minimization problem subject to an output constraint.
Similarly, if the primal is the utility maximization problem, its dual is the expenditure
minimization problem.
Let X* be the optimal vector of the max. problem variables and Y* be the
optimal vector of the dual (minimization) problem choice variables. Then:
The values of the primal variables X* that maximize the primal objective function (and the L-function) have multiplier solution values (λ*) which are the optimal values (Y*) of the dual problem.
Similarly, the optimal values of the dual variables Y* that minimize the dual objective function (and its L-function) have multipliers which are the optimal values X* of the primal problem.
Therefore, the Lagrangian multipliers (λs) in the primal problem are equal to
the dual optimal variables, and the dual multipliers (λs) are equal to the
primal optimal solution values.
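A small linear program gives a concrete (hypothetical) numerical illustration of this relationship, since both problems and their multipliers are easy to compute there; SciPy reports the multipliers as constraint "marginals", up to the solver's sign convention:

```python
# Primal-dual multiplier check on a small LP (hypothetical example numbers).
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 5.0])              # primal: max 3*x1 + 5*x2
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 2.0]])
b = np.array([4.0, 12.0, 18.0])       # subject to A @ x <= b, x >= 0

# linprog minimizes, so negate the objective to maximize.
primal = linprog(-c, A_ub=A, b_ub=b, method="highs")

# Dual: min b'y  s.t.  A'y >= c, y >= 0  (rewritten as -A'y <= -c).
dual = linprog(b, A_ub=-A.T, b_ub=-c, method="highs")

print("x* =", primal.x, " y* =", dual.x)
# The primal constraint marginals reproduce the dual solution y*
# (negated because the primal was handed to the solver as a minimization):
print("primal multipliers:", -primal.ineqlin.marginals)
print("strong duality:", np.isclose(-primal.fun, dual.fun))
```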
Below are the K-T conditions side by side with the FOCs for classical constrained optimization (stated here for a maximization problem, as developed in the following slides):
Classical FOCs (equality constraints): ∂L/∂Xj = 0 for all j; ∂L/∂λi = 0 for all i.
K-T conditions (inequality constraints, Xj ≥ 0): ∂L/∂Xj ≤ 0, Xj ≥ 0, Xj(∂L/∂Xj) = 0 for all j; ∂L/∂λi ≥ 0, λi ≥ 0, λi(∂L/∂λi) = 0 for all i.
Example:
Max Z = X1³ + X1X2
s.t. X1 + X2⁵ ≤ 10
X1² + 2X2 ≥ 4 [or −X1² − 2X2 ≤ −4]
X1, X2 ≥ 0
Then,
L = X1³ + X1X2 + λ1(10 − X1 − X2⁵) + λ2(−4 + X1² + 2X2)
We entered the constraints in such a way that when we take the derivative of L with respect to λ1 and λ2, the relevant part of the K-T conditions, i.e., ∂L/∂λi ≥ 0, is equivalent to the constraints:
∂L/∂λ2 = −4 + X1² + 2X2 ≥ 0 (equivalent to X1² + 2X2 ≥ 4)
Continued…
Now the K-T conditions are:
∂L/∂X1 = 3X1² + X2 − λ1 + 2λ2X1 ≤ 0 (1)
Along with: X1[3X1² + X2 − λ1 + 2λ2X1] = 0 (1a)
∂L/∂X2 = X1 − 5λ1X2⁴ + 2λ2 ≤ 0 (2)
Along with: X2[X1 − 5λ1X2⁴ + 2λ2] = 0 (2a)
and,
∂L/∂λ1 = 10 − X1 − X2⁵ ≥ 0 (3)
Along with: λ1[10 − X1 − X2⁵] = 0 (3a)
∂L/∂λ2 = −4 + X1² + 2X2 ≥ 0 (4)
Along with: λ2[−4 + X1² + 2X2] = 0 (4a)
together with the non-negativity conditions X1, X2, λ1, λ2 ≥ 0.
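A minimal numerical cross-check of this example (a sketch, assuming SciPy is available) maximizes Z by minimizing −Z and then inspects each constraint's slackness at the solution:

```python
# Numerical solution of: max X1^3 + X1*X2 s.t. X1 + X2^5 <= 10,
# X1^2 + 2*X2 >= 4, X1, X2 >= 0.
from scipy.optimize import minimize

def neg_Z(x):
    x1, x2 = x
    return -(x1**3 + x1 * x2)          # minimize -Z to maximize Z

constraints = [
    {"type": "ineq", "fun": lambda x: 10 - x[0] - x[1]**5},   # X1 + X2^5 <= 10
    {"type": "ineq", "fun": lambda x: x[0]**2 + 2*x[1] - 4},  # X1^2 + 2X2 >= 4
]
bounds = [(0, None), (0, None)]         # X1, X2 >= 0

res = minimize(neg_Z, x0=[2.0, 1.0], method="SLSQP",
               bounds=bounds, constraints=constraints)
x1, x2 = res.x
print("X* =", res.x, " Z* =", -res.fun)
print("slack of constraint 1:", 10 - x1 - x2**5)   # ~0 => binding, lambda1 > 0
print("slack of constraint 2:", x1**2 + 2*x2 - 4)  # > 0 => slack,  lambda2 = 0
```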
Having inequality constraints amounts to allowing for the possibility that the optimum occurs at a point where one or more of the solution values equal zero (Xj = 0), i.e., the possibility of a boundary (corner) solution as opposed to an interior solution. It also allows for the possibility that one or more constraints are not binding at the optimal solution (i.e., are satisfied as strict inequalities).
The K-T conditions describe a more general case and therefore contain the FOCs for classical constrained optimization.
Some intuition: for Y = f(X), with X non-negative, a maximum can occur at a corner point as well as at an interior point (like A) where ∂Y/∂X = 0. At B, ∂Y/∂X = 0 as well (also a maximum).
[Figure: a curve Y = f(X) for X ≥ 0, with interior maxima at A and B, and a point C at X = 0 where the curve slopes downward.]
At C we have ∂Y/∂X < 0 and X = 0. Can this represent a maximum? Yes, because the value of Y will decrease for any X > 0 (given the non-negativity constraint). However, at a maximum we cannot have ∂Y/∂X > 0 at X = 0, because then we could increase the value of Y by increasing X.
Putting the above together and generalizing to functions of more than one variable, for a maximum we have:
∂L/∂Xj ≤ 0, Xj ≥ 0, and Xj(∂L/∂Xj) = 0 for all j
∂L/∂λi ≥ 0, λi ≥ 0, and λi(∂L/∂λi) = 0 for all i
(For a minimum, the signs of the derivative conditions are reversed: ∂L/∂Xj ≥ 0 and ∂L/∂λi ≤ 0.)
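These conditions can be bundled into a small numerical check (a sketch; the function name and tolerance are hypothetical, and the caller supplies the gradient values):

```python
# Generic check of the maximization K-T conditions listed above.
import numpy as np

def kkt_max_satisfied(x, lam, dL_dx, dL_dlam, tol=1e-8):
    """True if dL/dXj <= 0, Xj >= 0, Xj*(dL/dXj) = 0 for all j, and
    dL/dlam_i >= 0, lam_i >= 0, lam_i*(dL/dlam_i) = 0 for all i."""
    x, lam = np.asarray(x, float), np.asarray(lam, float)
    gx, gl = np.asarray(dL_dx, float), np.asarray(dL_dlam, float)
    return bool(np.all(gx <= tol) and np.all(x >= -tol)
                and np.all(np.abs(x * gx) <= tol)
                and np.all(gl >= -tol) and np.all(lam >= -tol)
                and np.all(np.abs(lam * gl) <= tol))
```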
Why the Lagrangian method works: with L = f(X1, X2, …, Xn) + Σi λi[bi − gi(X1, X2, …, Xn)], we get ∂L/∂λi = bi − gi(X1, X2, …, Xn) ≥ 0, which is the ith constraint. That is, we know that the constraints are satisfied.
Then, by the complementary condition λi(∂L/∂λi) = 0, or
λi[bi − gi(X1, X2, …, Xn)] = 0 for all i, the constraint terms added to f vanish at the optimum and we are left with:
L(X*, λ*) = f(X1*, X2*, …, Xn*), which is the objective function.
Therefore, a solution which satisfies the K-T conditions for the L-function must also optimize the original objective function subject to the constraints.
Sufficient conditions:
The candidate point will be a global max (min) if:
(1) For a maximum the objective function must be concave; for a minimum it must be convex.
(2) The constraints must form a feasible region that is a convex set; for this, constraint functions entering as gi ≤ bi must be convex, and constraint functions entering as gi ≥ bi must be concave.
(3) The K-T conditions must be satisfied.
[Notice that the distinction convex vs. concave can refer to a function or to a feasible region. If needed, please review pages 327-330 in CW.]
Note also that a linear function is both convex and concave.
So, we use the test given earlier to determine the shape of each function, as in the sketch below.
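For instance (a sketch assuming sympy is available), the Hessian test applied to the first example's objective confirms that it is convex, so its K-T point is a global minimum:

```python
# Convexity test via the Hessian for Z = (X1 - 4)^2 + (X2 - 3)^2.
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
Z = (x1 - 4)**2 + (x2 - 3)**2

H = sp.hessian(Z, (x1, x2))     # Hessian matrix of Z
print(H)                        # Matrix([[2, 0], [0, 2]])
print(H.eigenvals())            # {2: 2}: all eigenvalues > 0, so Z is
                                # strictly convex; with linear (hence convex)
                                # constraints, as assumed earlier, the K-T
                                # point is a global minimum.
```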