Multivariable Optimization
Multivariable Functions
A multivariable function depends on more than one variable.
For a multivariable function, the first derivative is not a scalar quantity; instead, it is a vector quantity known as the gradient.
The objective function is a function of N variables
represented by x1, x2, . . . , xN.
The gradient vector at any point x(t) is represented by ∇f(x(t)), an N-dimensional vector given as follows:
∇f(x(t)) = (∂f/∂x1, ∂f/∂x2, . . . , ∂f/∂xN)ᵀ, with each partial derivative evaluated at x(t).
Gradient of a multivariable function
Geometrically, the gradient vector at a point x* is normal to the tangent plane of the level surface (contour) passing through x*. It also points in the direction of maximum increase of the function value.
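As a quick numerical illustration (the function f(x1, x2) = x1² + 2x2² and the point (1, 1) below are assumptions, not taken from the slides), a central-difference sketch in Python can approximate the gradient vector and show the direction of maximum increase:

import numpy as np

def f(x):
    # Assumed example function of two variables (not from the slides)
    return x[0]**2 + 2.0 * x[1]**2

def gradient(func, x, h=1e-6):
    # Central-difference approximation of the gradient vector ∇f(x)
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        g[i] = (func(x + e) - func(x - e)) / (2.0 * h)
    return g

x = np.array([1.0, 1.0])
g = gradient(f, x)
print(g)                      # approximately [2.0, 4.0]
print(g / np.linalg.norm(g))  # unit vector in the direction of maximum increase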
Function of Two Variables (contour)
Unidirectional Search
Many multivariable optimization techniques use successive unidirectional searches to find the minimum point along a particular search direction.
A unidirectional search is a search performed by
comparing function values only along a specified
direction.
A unidirectional search is performed from a point x(t) in a specified direction s(t).
Any arbitrary point on that line can be expressed as follows:
x(α) = x(t) + α s(t),
where α is a scalar parameter; the search reduces to finding the α that minimizes f(x(α)).
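A minimal sketch of such a unidirectional search, assuming an illustrative objective function and using SciPy's bounded scalar minimizer to find the best α along the line x(α) = x(t) + α s(t); the function, point, and direction below are assumptions:

import numpy as np
from scipy.optimize import minimize_scalar

def f(x):
    # Assumed test function (not from the slides)
    return (x[0] - 1.0)**2 + (x[1] - 2.0)**2

def unidirectional_search(f, x_t, s_t, alpha_max=5.0):
    # Minimize f along the line x(alpha) = x_t + alpha * s_t
    phi = lambda alpha: f(x_t + alpha * s_t)      # single-variable function of alpha
    res = minimize_scalar(phi, bounds=(0.0, alpha_max), method='bounded')
    return x_t + res.x * s_t, res.x

x_t = np.array([0.0, 0.0])
s_t = np.array([1.0, 1.0]) / np.sqrt(2.0)         # specified search direction
x_star, alpha_star = unidirectional_search(f, x_t, s_t)
print(alpha_star, x_star)                         # alpha ≈ 2.12, x ≈ (1.5, 1.5)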
Simplex Search Method
Step 3 Calculate the reflected point xr = 2xc − xh, where xh, xg, and xl are the worst, next-to-worst, and best points of the current simplex and xc is the centroid of all points excluding xh. Set xnew = xr.
If f(xr) < f(xl), set xnew = (1 + γ)xc − γxh (expansion);
Else if f(xr) ≥ f(xh), set xnew = (1 − β)xc + βxh (contraction);
Else if f(xg) < f(xr) < f(xh), set xnew = (1 + β)xc − βxh (contraction).
Calculate f(xnew) and replace xh by xnew.
Step 4 If the function values at the simplex points are sufficiently close to one another, Terminate;
Else go to Step 2.
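The following Python sketch implements one pass of the Step 3 rules above; the test function, initial simplex, and the coefficient values γ = 1.5 and β = 0.5 are assumptions chosen purely for illustration:

import numpy as np

def simplex_step(f, simplex, gamma=1.5, beta=0.5):
    # One application of the reflection/expansion/contraction rules in Step 3.
    # gamma (> 1) and beta (0 < beta < 1) are assumed coefficient values.
    simplex = sorted(simplex, key=f)                       # order by function value
    x_l, x_g, x_h = simplex[0], simplex[-2], simplex[-1]   # best, next-to-worst, worst
    x_c = np.mean(simplex[:-1], axis=0)                    # centroid excluding x_h

    x_r = 2.0 * x_c - x_h                                  # reflected point
    x_new = x_r
    if f(x_r) < f(x_l):                                    # expansion
        x_new = (1.0 + gamma) * x_c - gamma * x_h
    elif f(x_r) >= f(x_h):                                 # contraction toward x_h
        x_new = (1.0 - beta) * x_c + beta * x_h
    elif f(x_g) < f(x_r) < f(x_h):                         # contraction away from x_h
        x_new = (1.0 + beta) * x_c - beta * x_h

    simplex[-1] = x_new                                    # replace the worst point
    return simplex

f = lambda x: x[0]**2 + x[1]**2                            # assumed test function
simplex = [np.array(p) for p in [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]]
for _ in range(30):
    simplex = simplex_step(f, simplex)
print(min(simplex, key=f))                                 # moves toward the minimum at (0, 0)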
Minimize
Take the points defining the initial
simplex as
Step 4 If ∥d∥ is small, Terminate; Else replace s(j) by s(j−1) for all
j = N, N − 1, . . . , 2. Set s(1) = d/∥d∥ and go to Step 2.
Minimize the given function from the starting point X1
using Powell's method.
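The function and starting point of this example are not reproduced above, so the sketch below simply shows how Powell's method can be invoked through SciPy on an assumed quadratic test function:

import numpy as np
from scipy.optimize import minimize

# Assumed test function and starting point (the ones on the slide are not
# reproduced here); this only illustrates invoking Powell's method.
f = lambda x: (x[0] - 1.0)**2 + 2.0 * (x[1] + 0.5)**2

x1 = np.array([0.0, 0.0])
res = minimize(f, x1, method='Powell')
print(res.x, res.fun)   # approximately [1.0, -0.5], 0.0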
Minimize
Cauchy's (steepest descent) Method
The search direction used at a point x(k) is the negative of the gradient at that point:
s(k) = −∇f(x(k)).
Since this direction gives maximum descent in function values, it is also known as the steepest descent method.
At every iteration, the derivative is computed at the current point and a unidirectional search is performed along the negative of this derivative direction to find the minimum point along that direction.
The minimum point becomes the current point and the
search is continued from this point.
The algorithm continues until a point having a small enough
gradient vector is found. This algorithm guarantees
improvement in the function value at every iteration.
Algorithm: Cauchy's (steepest descent) Method
Step 1 Choose a maximum number of iterations M to be
performed, an initial point x(0), two termination parameters
ϵ1, ϵ2, and set k = 0.
Step 2 Calculate ∇f(x(k)), the first derivative at the point x(k).
Step 3 If ∥∇f(x(k))∥ ≤ ϵ1, Terminate; Else if k ≥ M,
Terminate; Else go to Step 4.
Step 4 Perform a unidirectional search to find α(k) using ϵ2
such that f(x(k+1)) = f(x(k) − α(k)∇f(x(k))) is minimum. One
criterion for terminating this search is when |∇f(x(k+1)) · ∇f(x(k))| ≤ ϵ2.
Step 5 Is ∥x(k+1)−x(k)∥ /∥x(k)∥ ≤ ϵ1? If yes, Terminate; Else set
k = k + 1 and go to Step 2.
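A minimal Python sketch of Steps 1-5, with the unidirectional search of Step 4 done by a bounded scalar minimization; the test function and the bound on α(k) are assumptions:

import numpy as np
from scipy.optimize import minimize_scalar

def f(x):
    # Assumed test function (not the example from the slides)
    return (x[0] - 1.0)**2 + 10.0 * (x[1] - 2.0)**2

def grad(x, h=1e-6):
    # Central-difference gradient, as in the derivative discussion
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def cauchy(x0, M=100, eps1=1e-6):
    x = np.asarray(x0, dtype=float)                 # Step 1
    for k in range(M):
        g = grad(x)                                 # Step 2
        if np.linalg.norm(g) <= eps1:               # Step 3
            break
        # Step 4: unidirectional search along the negative gradient direction
        phi = lambda a: f(x - a * g)
        a_k = minimize_scalar(phi, bounds=(0.0, 1.0), method='bounded').x
        x_new = x - a_k * g
        # Step 5: relative change in x small enough? (guard avoids division by zero)
        if np.linalg.norm(x_new - x) / max(np.linalg.norm(x), 1e-12) <= eps1:
            x = x_new
            break
        x = x_new
    return x

print(cauchy([0.0, 0.0]))   # should approach the minimum at (1, 2)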
Second order derivative
The second-order derivatives of a multivariable function form a matrix, ∇²f(x(t)), better known as the Hessian matrix, whose (i, j)-th element is ∂²f/∂xi∂xj evaluated at x(t).
Derivatives
The computation of the first derivative with respect to each variable by central difference requires
two function evaluations, thus totaling 2N function evaluations for the
complete first derivative vector. The computation of a second derivative ∂²f/∂xi²
requires three function evaluations, while a mixed second-order partial derivative ∂²f/∂xi∂xj
requires four function evaluations. Thus, the computation of the Hessian matrix
requires (2N² + 1) function evaluations.
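The evaluation counts above can be checked with a central-difference sketch; the instrumented test function below is an assumption, and the call counter is only there to confirm the 2N² + 1 total:

import numpy as np

calls = 0
def f(x):
    # Assumed test function, instrumented to count function evaluations
    global calls
    calls += 1
    return x[0]**2 + x[0] * x[1] + 2.0 * x[1]**2

def hessian(f, x, h=1e-4):
    # Central-difference Hessian: 2N + 1 evaluations for the diagonal terms
    # (f(x) is shared among them) plus 4 per mixed partial, i.e. 2N^2 + 1 in total.
    n = len(x)
    H = np.zeros((n, n))
    fx = f(x)
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        H[i, i] = (f(x + ei) - 2.0 * fx + f(x - ei)) / h**2
        for j in range(i + 1, n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = H[j, i] = (f(x + ei + ej) - f(x + ei - ej)
                                 - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h**2)
    return H

x = np.array([1.0, 2.0])
calls = 0
H = hessian(f, x)
print(H)        # approximately [[2, 1], [1, 4]]
print(calls)    # 2*2^2 + 1 = 9 evaluations for N = 2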
Newton’s Method
Newton’s method presented in single variable optimization
can be extended for the minimization of multivariable
functions.
Consider the quadratic approximation of the function f(X) at
X = Xi using the Taylor's series expansion:
f(X) ≈ f(Xi) + ∇fiᵀ(X − Xi) + ½ (X − Xi)ᵀ[Ji](X − Xi),
where [Ji] = [∇²f(Xi)] is the Hessian matrix of f evaluated at Xi. Setting the gradient of this quadratic approximation to zero gives the Newton iteration
Xi+1 = Xi − [Ji]⁻¹∇fi,
where ∇fi is the gradient of f at Xi.
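A minimal sketch of this iteration on an assumed quadratic test function, with the gradient and Hessian [Ji] supplied analytically for simplicity (so the example converges in a single step):

import numpy as np

# Assumed test function with known gradient and Hessian (for illustration only)
f    = lambda x: x[0]**2 - x[0] * x[1] + 2.0 * x[1]**2 - 2.0 * x[0]
grad = lambda x: np.array([2.0 * x[0] - x[1] - 2.0, -x[0] + 4.0 * x[1]])
hess = lambda x: np.array([[2.0, -1.0], [-1.0, 4.0]])   # the matrix [Ji]

def newton(x0, max_iter=20, eps=1e-8):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        # X_{i+1} = X_i - [Ji]^{-1} ∇fi  (solve the linear system instead of inverting)
        x = x - np.linalg.solve(hess(x), g)
    return x

print(newton([5.0, 5.0]))   # a quadratic converges in one step: approximately [1.1429, 0.2857]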
Minimize with
Despite its advantages (such as fast, quadratic convergence near the minimum), the method is not very useful
in practice, due to the following features of the method:
1. It requires storing the n × n matrix [Ji ].
2. It becomes very difficult and sometimes impossible to
compute the elements of the matrix [Ji ].
3. It requires the inversion of the matrix [Ji ] at each step.
4. It requires the evaluation of the quantity [Ji ]−1∇fi at
each step.
These features make the method impractical for problems
involving a complicated objective function with a large
number of variables.