Maximum Slope Method

This paper describes the maximum slope method for minimizing functions of several variables. The method iterates toward the solution by evaluating the function and its gradient, and using the gradient to determine the direction of steepest descent. A quadratic polynomial is then interpolated to calculate an approximately optimal step and update the approximation. The process repeats until it converges to a local minimum of the function.

MAXIMUM SLOPE METHOD

I. Introduction

One of the oldest methods for minimizing a function of several variables is the maximum slope method, also called the gradient method or the steepest descent method.

The maximum slope method generally converges to the solution only linearly, but it is global in nature; that is, convergence occurs from almost any initial value, even when the initial approximations are poor. Consequently, it is often used to obtain sufficiently accurate initial approximations for techniques based on Newton's method, in the same way that the bisection method is used for a single equation.

The maximum slope method determines a local minimum of a function of several variables of the form g: ℝⁿ → ℝ. The method is very useful independently of its application as a first method for solving nonlinear systems.

The connection between the problem of minimizing a function from ℝⁿ into ℝ and the solution of a system of nonlinear equations lies in the fact that a system of the form

f_1(x_1, x_2, …, x_n) = 0,
f_2(x_1, x_2, …, x_n) = 0,
⋮
f_n(x_1, x_2, …, x_n) = 0

has a solution at x = (x_1, x_2, …, x_n)^t precisely when the function g defined by

g(x_1, x_2, …, x_n) = ∑_{i=1}^{n} [f_i(x_1, x_2, …, x_n)]²

has the minimum value zero.
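As an illustration, this transformation is easy to express in MATLAB. The two-equation system F below is a hypothetical example chosen only to show the construction, not the system treated later in Section III:

% Minimal sketch: build g(x) = sum_i f_i(x)^2 from a residual vector F(x).
% F is a hypothetical two-equation example.
F = @(x) [x(1)^2 + x(2) - 1; x(1) - x(2)^2];
g = @(x) sum(F(x).^2);      % g is zero exactly at a solution of F(x) = 0
g([1; 0])                   % evaluates g at the point (1, 0)^t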

The maximum slope method for finding a local minimum of an arbitrary function g from ℝⁿ into ℝ can be described intuitively as follows:

→ Evaluate the function g at an initial approximation x^(0) = (x_1^(0), x_2^(0), …, x_n^(0))^t.

→ Determine a direction in which, starting from x^(0), the value of g decreases.

→ Move an appropriate amount in that direction and call the new vector x^(1).

→ Repeat the three previous steps, replacing x^(0) by x^(1).


Before describing how to select the correct direction and the appropriate
distance to travel in that direction, it is necessary to review some results of
infinitesimal calculus.

 Theorem 1. Extreme value theorem

This theorem states that a differentiable function of a single variable can have a relative minimum only when its derivative is zero.

To extend this result to functions of several variables, the following definition is needed.

 Definition 1. If g: ℝⁿ → ℝ, the gradient of g at x = (x_1, x_2, …, x_n)^t, denoted by ∇g(x), is defined by:

∇g(x) = (∂g/∂x_1 (x), ∂g/∂x_2 (x), …, ∂g/∂x_n (x))^t

The gradient of a function of several variables is analogous to the derivative of a function of a single variable, in the sense that a differentiable function of several variables can have a local minimum at a point x only when its gradient at x is the zero vector.

The gradient has another very important property related to the minimization of functions of several variables. Suppose that v = (v_1, v_2, …, v_n)^t is a unit vector of ℝⁿ; that is to say:

‖v‖_2² = ∑_{i=1}^{n} v_i² = 1

 Definition 2. The directional derivative of g at x in the direction of v is defined by:

D_v g(x) = lim_{h→0} (1/h) [g(x + hv) − g(x)] = v^t ∇g(x).

The directional derivative of g at x in the direction of v measures the variation of the values of the function g with respect to changes of its variable in the direction of v.

[Figure 1: graphical interpretation of the directional derivative when g is a function of two variables.]
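For instance, the identity D_v g(x) = v^t ∇g(x) can be checked numerically. The following MATLAB sketch uses a hypothetical function g and its hand-computed gradient:

% Sketch: compare the limit definition of the directional derivative with v'*gradg(x).
g     = @(x) x(1)^2 + 3*x(2)^2;          % hypothetical example function
gradg = @(x) [2*x(1); 6*x(2)];           % its gradient
x = [1; 2];
v = [3; 4] / norm([3; 4]);               % unit direction
h = 1e-6;
(g(x + h*v) - g(x)) / h                  % difference quotient
v' * gradg(x)                            % v^t * grad g(x); the two values agree closely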

A standard result from the calculus of functions of several variables states that, when g is differentiable, the direction of the largest directional derivative is obtained when v is parallel to the gradient ∇g(x), provided that ∇g(x) ≠ 0. Consequently, the direction of maximum decrease in the values of g from x is the direction given by −∇g(x).

Since the objective is to reduce g(x) to its minimum value of zero, given the initial approximation x^(0), we take:

x^(1) = x^(0) − α ∇g(x^(0))    …… (**)

for some constant α > 0.


The problem, then, is reduced to choosing a value of α so that g(x^(1)) is significantly less than g(x^(0)). To determine an appropriate choice of the value of α, consider the function of a single variable

h(α) = g(x^(0) − α ∇g(x^(0)))

The value of α that minimizes h is the value required in equation (**).
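In MATLAB terms, h is simply a one-variable function built from g. In the sketch below the handles g and gradg and the current point x0 are assumed to be already defined (hypothetical names):

% Sketch: the one-dimensional function h(alpha) = g(x0 - alpha*gradg(x0)),
% assuming g, gradg and a current point x0 are already available.
d = gradg(x0);                      % gradient at the current point
h = @(alpha) g(x0 - alpha*d);       % minimizing h over alpha > 0 gives the step in (**)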

Obtaining a minimum value of h directly would require differentiating h and then solving a root-finding problem to determine the critical points of h. This procedure is generally too expensive in terms of the calculations required. Therefore, three numbers α_1 < α_2 < α_3 are selected which, it is hoped, are close to where h(α) reaches its minimum value. Next, the second-degree polynomial P(α) is constructed which interpolates h at α_1, α_2 and α_3. A value α̂ in [α_1, α_3] is taken such that P(α̂) is the minimum of P(α) in [α_1, α_3], and P(α̂) is used as an approximation of the minimum value of h(α).

Then α̂ is the value used to determine the new iteration in the search for the minimum value of g:

x^(1) = x^(0) − α̂ ∇g(x^(0))

Since g(x^(0)) is already available, to reduce the computational effort as much as possible the first point chosen is α_1 = 0. Then a number α_3 is taken such that h(α_3) < h(α_1). (Since α_1 does not minimize h, such a number α_3 exists.) Finally, α_2 is taken equal to α_3/2.

The point α̂ at which the minimum value of P(α) in [α_1, α_3] occurs is either the only critical point of P or the right endpoint α_3 of the interval because, by assumption, P(α_3) = h(α_3) < h(α_1) = P(α_1). Since P(α) is a polynomial of the second degree, this critical point can be determined easily.
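The one-dimensional search just described can be sketched in MATLAB as follows, assuming h is the one-variable function of the previous sketch and TOL is a given tolerance:

% Sketch of one quadratic-interpolation line search for h(alpha),
% following the choices alpha1 = 0, h(alpha3) < h(alpha1), alpha2 = alpha3/2.
alpha1 = 0;  g1 = h(alpha1);
alpha3 = 1;  g3 = h(alpha3);
while g3 >= g1                       % halve alpha3 until h decreases
    alpha3 = alpha3/2;
    g3 = h(alpha3);
    if alpha3 < TOL/2
        error('Improvement unlikely')
    end
end
alpha2 = alpha3/2;  g2 = h(alpha2);
h1 = (g2 - g1)/alpha2;               % forward divided differences
h2 = (g3 - g2)/(alpha3 - alpha2);
h3 = (h2 - h1)/alpha3;
alpha0 = 0.5*(alpha2 - h1/h3);       % critical point of the quadratic P(alpha)
if h(alpha0) < g3                    % keep whichever of alpha0, alpha3 gives the smaller h
    alphaHat = alpha0;
else
    alphaHat = alpha3;
end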

II. Mathematical algorithm of the method

To approximate a solution p of the minimization problem

g(p) = min_{x ∈ ℝⁿ} g(x)

given an initial approximation x:

Input: number n of variables; initial approximation x = (x_1, x_2, …, x_n)^t; tolerance TOL; maximum number of iterations N.

Output: approximate solution x = (x_1, x_2, …, x_n)^t or a failure message.

Step 1. Take k = 1.
Step 2. While (k ≤ N), do Steps 3–15.
Step 3. Take g_1 = g(x_1, x_2, …, x_n);   (Note: g_1 = g(x^(k)).)
        z = ∇g(x_1, x_2, …, x_n);   (Note: z = ∇g(x^(k)).)
        z_0 = ‖z‖_2.
Step 4. If z_0 = 0, then OUTPUT ('Zero gradient');
        OUTPUT (x_1, x_2, …, x_n, g_1);
        (Procedure finished; x may be a minimum.)
        STOP.
Step 5. Take z = z / z_0;   (make z a unit vector.)
        α_1 = 0;
        α_3 = 1;
        g_3 = g(x − α_3 z).

Step 6. While (g_3 ≥ g_1), do Steps 7 and 8.

Step 7. Take α_3 = α_3 / 2;
        g_3 = g(x − α_3 z).
Step 8. If α_3 < TOL/2, then
        OUTPUT ('Improvement unlikely');
        OUTPUT (x_1, x_2, …, x_n, g_1);
        (Procedure finished; x may be a minimum.)
        STOP.

Step 9. Take α_2 = α_3 / 2;
        g_2 = g(x − α_2 z).

Step 10. Take h_1 = (g_2 − g_1)/α_2;
         h_2 = (g_3 − g_2)/(α_3 − α_2);
         h_3 = (h_2 − h_1)/α_3.
         (Note: Newton's forward divided-differences formula is used to find the quadratic P(α) = g_1 + h_1 α + h_3 α(α − α_2) that interpolates h(α) at α = 0, α = α_2 and α = α_3.)

Step 11. Take α_0 = 0.5(α_2 − h_1/h_3);   (the critical point of P occurs at α_0.)
         g_0 = g(x − α_0 z).

Step 12. Find α in {α_0, α_3} such that g = g(x − αz) = min {g_0, g_3}.

Step 13. Take x = x − αz.

Step 14. If |g − g_1| < TOL, then
         OUTPUT (x_1, x_2, …, x_n, g);
         (Procedure completed successfully.)
         STOP.

Step 15. Take k = k + 1.
Step 16. OUTPUT ('Maximum number of iterations exceeded');
         (Procedure completed without success.)
         STOP.
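The following MATLAB function is a sketch of Steps 1–16; the name maxSlopeNL and its argument list are illustrative choices, not part of the original algorithm:

function [x, g1] = maxSlopeNL(g, gradg, x, TOL, N)
% Sketch of the maximum slope (steepest descent) algorithm of Section II.
% g     : handle returning g(x) for a column vector x
% gradg : handle returning the gradient of g at x (column vector)
% x     : initial approximation; TOL : tolerance; N : maximum iterations
for k = 1:N
    g1 = g(x);
    z  = gradg(x);
    z0 = norm(z);
    if z0 == 0
        disp('Zero gradient'); return       % x may already be a minimum
    end
    z = z/z0;                               % unit descent direction
    alpha3 = 1;
    g3 = g(x - alpha3*z);
    while g3 >= g1                          % Steps 6-8: shrink alpha3
        alpha3 = alpha3/2;
        g3 = g(x - alpha3*z);
        if alpha3 < TOL/2
            disp('Improvement unlikely'); return
        end
    end
    alpha2 = alpha3/2;
    g2 = g(x - alpha2*z);
    h1 = (g2 - g1)/alpha2;                  % Step 10: divided differences
    h2 = (g3 - g2)/(alpha3 - alpha2);
    h3 = (h2 - h1)/alpha3;
    alpha0 = 0.5*(alpha2 - h1/h3);          % Step 11: critical point of P
    g0 = g(x - alpha0*z);
    if g0 < g3                              % Step 12: best of alpha0, alpha3
        alpha = alpha0; gmin = g0;
    else
        alpha = alpha3; gmin = g3;
    end
    x = x - alpha*z;                        % Step 13
    if abs(gmin - g1) < TOL                 % Step 14: convergence test
        g1 = gmin; return
    end
end
disp('Maximum number of iterations exceeded')
end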

III. Example

Let the following nonlinear system of equations be given:

f_1(x_1, x_2, x_3) = 3x_1 − cos(x_2 x_3) − 1/2 = 0,

f_2(x_1, x_2, x_3) = x_1² − 81(x_2 + 0.1)² + sin x_3 + 1.06 = 0,

f_3(x_1, x_2, x_3) = e^(−x_1 x_2) + 20x_3 + (10π − 3)/3 = 0.

Using the maximum slope method, calculate an approximation of the solution, starting at the initial point x^(0) = (0, 0, 0)^t.

Solution:

Let g(x_1, x_2, x_3) = [f_1(x_1, x_2, x_3)]² + [f_2(x_1, x_2, x_3)]² + [f_3(x_1, x_2, x_3)]²; then:

∇g(x_1, x_2, x_3) ≡ ∇g(x) = (2f_1(x) ∂f_1/∂x_1 (x) + 2f_2(x) ∂f_2/∂x_1 (x) + 2f_3(x) ∂f_3/∂x_1 (x),

2f_1(x) ∂f_1/∂x_2 (x) + 2f_2(x) ∂f_2/∂x_2 (x) + 2f_3(x) ∂f_3/∂x_2 (x),

2f_1(x) ∂f_1/∂x_3 (x) + 2f_2(x) ∂f_2/∂x_3 (x) + 2f_3(x) ∂f_3/∂x_3 (x))^t

= 2 J(x)^t F(x)
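In MATLAB, the system, its Jacobian, g and the gradient 2 J(x)^t F(x) can be written as follows (a sketch; the handle names are illustrative, and the Jacobian entries are obtained by differentiating the f_i above):

% Sketch: the example system, its Jacobian, g and gradg = 2*J(x)'*F(x).
F = @(x) [ 3*x(1) - cos(x(2)*x(3)) - 1/2; ...
           x(1)^2 - 81*(x(2)+0.1)^2 + sin(x(3)) + 1.06; ...
           exp(-x(1)*x(2)) + 20*x(3) + (10*pi - 3)/3 ];
J = @(x) [ 3,                      x(3)*sin(x(2)*x(3)),   x(2)*sin(x(2)*x(3)); ...
           2*x(1),                 -162*(x(2)+0.1),        cos(x(3)); ...
           -x(2)*exp(-x(1)*x(2)),  -x(1)*exp(-x(1)*x(2)),  20 ];
g     = @(x) sum(F(x).^2);
gradg = @(x) 2*J(x)'*F(x);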

With x^(0) = (0, 0, 0)^t, we have:

g(x^(0)) = 111.975 and z_0 = ‖∇g(x^(0))‖_2 = 419.554.

Let

z = (1/z_0) ∇g(x^(0)) = (−0.0214514, −0.0193062, 0.999583)^t.

For α_1 = 0 we have g_1 = g(x^(0) − α_1 z) = g(x^(0)) = 111.975. Arbitrarily, we take α_3 = 1, so that:

g_3 = g(x^(0) − α_3 z) = 93.5649.

Since g_3 < g_1, we accept α_3 and take α_2 = 0.5. Then,

g_2 = g(x^(0) − α_2 z) = 2.53557.

Now we construct the Newton interpolating polynomial with forward divided differences:

P(α) = g_1 + h_1 α + h_3 α(α − α_2)

which interpolates the values of

g(x^(0) − αz)

at α_1 = 0, α_2 = 0.5 and α_3 = 1, as follows:

α_1 = 0,     g_1 = 111.975

α_2 = 0.5,   g_2 = 2.53557,   h_1 = (g_2 − g_1)/(α_2 − α_1) = −218.878,

α_3 = 1,     g_3 = 93.5649,   h_2 = (g_3 − g_2)/(α_3 − α_2) = 182.059,

h_3 = (h_2 − h_1)/(α_3 − α_1) = 400.937.

Therefore:

P(α) = 111.975 − 218.878 α + 400.937 α(α − 0.5).

We have P′(α) = 0 when α = α_0 = 0.522959. Since g_0 = g(x^(0) − α_0 z) = 2.32762 is less than g_1 and g_3, we take:

α = α_0 = 0.522959,   x^(1) = x^(0) − α_0 z = (0.0112182, 0.0100964, −0.522741)^t

and

g(x^(1)) = 2.32762.

The following table contains the rest of the results. A real solution of the nonlinear system is (0.5, 0, −0.5235988)^t.

k     x_1^(k)       x_2^(k)        x_3^(k)       g(x_1^(k), x_2^(k), x_3^(k))

2     0.137860      -0.205453      -0.522059     1.27406
3     0.266959       0.00551102    -0.558494     1.06813
4     0.272734      -0.00811751    -0.522006     0.468309
5     0.308689      -0.0204026     -0.533112     0.381087
6     0.314308      -0.0147046     -0.520923     0.318837
7     0.324267      -0.00852549    -0.528431     0.287024
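Assuming the handles of the previous sketch and the sketch function maxSlopeNL given after Section II, the computation can be reproduced approximately as follows (the tolerance and iteration limit below are illustrative choices):

% Usage sketch, assuming F, J, g, gradg are defined as above and
% maxSlopeNL is the sketch implementation given after Section II.
x0 = [0; 0; 0];
[x, gmin] = maxSlopeNL(g, gradg, x0, 0.005, 20);
% the successive iterates should roughly follow the rows of the table above,
% with x approaching the real solution (0.5, 0, -0.5235988)^t only slowly.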

IV. Application

This method can also be used to find maximum peaks (by applying it to −g), which is useful in various fields. In statistics, for example, it is used to study the behaviour of different situations, such as that of a population, by following the direction of maximum slope determined by the variability the data present.

V. Computational algorithm
[Flowchart of the algorithm of Section II: read f(x), N, n, X, TOL; set k = 1; while k ≤ N, evaluate g_1 = g(x^(k)), z = ∇g(x^(k)) and z_0 = ‖z‖_2; if z_0 = 0, report a zero gradient and stop; otherwise normalize z, set α_1 = 0, α_3 = 1 and halve α_3 while g_3 ≥ g_1 (stopping with "Improvement unlikely" if α_3 < TOL/2); set α_2 = α_3/2, compute the divided differences h_1, h_2, h_3, take α_0 = 0.5(α_2 − h_1/h_3) as the critical point of the interpolating quadratic, update x = x − αz with the better of α_0 and α_3, and stop when |g − g_1| < TOL; otherwise set k = k + 1, reporting "Maximum number of iterations (N) exceeded" if the loop ends without convergence.]

VI. Conclusions and recommendations

 The maximum slope method converges only linearly to the solution.
 The method will almost always converge, even with poor initial approximations.
 The maximum slope method converges slowly, so it will take many iterations to come closer and closer to the solution.
 It is recommended to be very careful when calculating each iteration, since a bad calculation could force the entire procedure to be repeated.
 The maximum slope method allows many variations, some of which include more complex techniques for determining the value of α.
 Knowledge of Newton's interpolation method, or of some other interpolation method, is needed for the construction of the quadratic polynomial required by the maximum slope method.

VII. Annex

Program in MATLAB
function [x,varargout] = maxSlope(a,b,varargin)
% Gradient (steepest descent) method for the linear system a*x = b,
% where a is a symmetric positive definite matrix.
% Optional inputs : maximum number of iterations, tolerance, initial guess.
% Optional outputs: iterations used, residual history, success flag.

n=length(a); x=zeros(n,1);
mmax=40; eps=1e-6;

if nargin>2
    mmax=varargin{1};
end
if nargin>3
    eps=varargin{2};
end
if nargin>4
    x=varargin{3};
end

res=zeros(1,mmax);
r=b-a*x;                        % initial residual (descent direction)
res(1)=dot(r,r); aux=norm(b);
for m=1:mmax
    p=a*r;
    xi=res(m)/dot(r,p);         % optimal step length along r
    x=x+xi*r;
    r=r-xi*p;
    res(m+1)=dot(r,r);          % we store the residuals
    if sqrt(res(m+1))<eps*aux
        break
    end
end
res=res(1:m+1);
if (m==mmax) && nargout<=3
    disp('maximum number of iterations exceeded')
end

if nargout>1
    varargout{1}=m;
end
if nargout>2
    varargout{2}=sqrt(res(:));
end
if nargout>3
    if m==mmax
        varargout{3}=0;
    else
        varargout{3}=1;
    end
end

return
