B.N. Pshenichny, Yu.M. Danilin - Numerical Methods in Extremal Problems - Mir - 1978

The book by B.N. Pshenichny and Yu.M. Danilin focuses on methods and algorithms for the numerical solution of extremum problems related to functions and functionals in various fields, including mathematical programming and economics. It emphasizes algorithms with fast convergence rates that can be implemented on computers, discussing both unconstrained and constrained minimization techniques. This work is intended for specialists in mathematical programming, computational mathematics, and optimal control theory, as well as students and engineers involved in function minimization problems.


NUMERICAL METHODS IN EXTREMAL PROBLEMS

B. N. Pshenichny and Yu. M. Danilin

MIR PUBLISHERS
MOSCOW

The book describes methods and algorithms for numerical solution of problems of finding extrema of functions and functionals met with in mathematical programming, economics, optimal control theory and other fields of science and practice.

Special attention is paid to algorithms with a fast rate of convergence and implementable on computers. Methods of unconstrained and constrained minimization of functions of independent variables are discussed. The book will be useful to specialists in mathematical programming, computational mathematics and optimal control theory and to broad circles of students and engineers, who in their practical work have to solve problems of function minimisation.
B. N. Pshenichny
Yu. M. Danilin

NUMERICAL METHODS
IN EXTREMAL PROBLEMS
(Russian title: Численные методы в экстремальных задачах)

NAUKA PUBLISHERS
MAIN EDITORIAL OFFICE FOR PHYSICAL AND MATHEMATICAL LITERATURE
MOSCOW
B. N. Pshenichny and Yu. M. Danilin

Translated from the Russian by V. Zhitomirsky, D. Sc. (Eng.)

First published 1978
Revised from the 1975 Russian edition

In English

© Main Editorial Office for Physical and Mathematical Literature, Nauka Publishers, 1975
© English translation, Mir Publishers, 1978
CONTENTS

PREFACE

CHAPTER I. INTRODUCTION TO THE THEORY OF MATHEMATICAL PROGRAMMING

1. CONVEX SETS
Definition. Separation Theorem. Convex Cones. Strictly and Strongly Convex Sets.

2. CONVEX FUNCTIONS
Definition. Basic Properties. Differential Properties. Strictly and Strongly Convex Functions. Concave Functions.

3. CONVEX PROGRAMMING
Formulation of the Problem. Basic Properties. Necessary Conditions for a Minimum. The Kuhn-Tucker Theorem. Dual Problem. Problem of Linear Programming. Problem of Quadratic Programming.

4. NECESSARY CONDITIONS FOR A MINIMUM
Basic Definitions. Necessary Conditions for a Minimum. Minimax Problem. Necessary Conditions of the Second Order.

5. SOME ADDITIONAL INFORMATION

Bibliographic Notes

CHAPTER II. METHODS OF UNCONSTRAINED FUNCTION MINIMIZATION

1. GRADIENT METHODS
Method of Steepest Descent. Variants of the Method. Other Gradient Methods. Qualitative Analysis of the Methods.

2. NEWTON'S METHOD WITH STEP ADJUSTMENT
Construction of the Method. Theorems about Properties of the Method. Modifications of the Generalized Newton Method. Discussion of the Properties of Newton's Method.

3. METHODS OF DUAL DIRECTIONS
Considerations on the Choice of Schemes of the Methods. Substantiation of the Methods. Construction of Various Algorithms. Determining Vector $p_k$. The Initial Stage of the Process. Minimization of Quadratic Form. Discussion of Properties of the Methods.

4. METHODS OF CONJUGATE DIRECTIONS. MINIMIZATION OF QUADRATIC FUNCTIONS
Conjugate Directions and Their Properties. Construction of the Methods. General Properties of the Methods. Concrete Algorithms. Minimization of a Convex Quadratic Function. Discussion of Results.

5. METHODS OF CONJUGATE DIRECTIONS. MINIMIZATION OF ARBITRARY FUNCTIONS
Considerations about the Applicability of the Methods. Theorem on Convergence of the Methods. Study of Properties of Different Algorithms. Further Study of the Rate of Convergence. Discussion of Results.

6. METHODS WITHOUT CALCULATING DERIVATIVES
Introductory Remarks. Constructing Methods of Dual Directions. Remarks on the Implementation of Methods of Dual Directions. Methods of Conjugate Directions. Discussion of Results.

Bibliographic Notes

CHAPTER III. METHODS OF CONSTRAINED FUNCTION MINIMIZATION

1. PROBLEM OF QUADRATIC PROGRAMMING
Operators of Projection. Minimization of a Quadratic Function in a Subspace. Algorithm of General Problem of Quadratic Programming. Computational Aspects. Problem of Quadratic Programming with Simple Constraints.

2. METHOD OF FEASIBLE DIRECTIONS
Method of Choosing Feasible Directions. Algorithm of Method of Feasible Directions. Substantiation of Convergence of the Algorithm. Construction of the Initial Approximation.

3. METHOD OF CONDITIONAL GRADIENT AND NEWTON'S METHOD
Rule for Choosing the Step Length. Description of the Algorithm. Substantiation of Convergence of the Algorithm and Estimation of Its Rate of Convergence. Estimate of Convergence for a Strongly Convex Region. Newton's Method with Step Adjustment. Properties of Newton's Method.

4. CUTTING HYPERPLANE METHOD
Algorithm. Computational Aspects. Concluding Remarks.

5. LINEARIZATION METHOD
Basic Assumptions. Formulation of the Algorithm. Convergence of the Algorithm. Computational Aspects. Some Generalizations. Problem of Linear Programming. Local Estimate of the Rate of Convergence.

6. LINEARIZATION METHOD: SOLVING SYSTEMS OF EQUALITIES AND INEQUALITIES AND FINDING THE MINIMAX
Systems of Equalities and Inequalities. Convergence of the Algorithm. Remarks. Sufficient Conditions of Convergence. Solving the Problem of Finding the Minimax.

7. LOCAL ACCELERATION OF CONVERGENCE
Formulation of the Problem. Basic Formulas. Algorithm. Computational Aspects. Application to the Problem of Mathematical Programming. Minimization Problem with Equality Constraints.

8. METHOD OF PENALTY FUNCTIONS
Substantiation of the Penalty Function Method. Convex Programming. Computational Aspects. Fiacco and McCormick Method.

9. PROJECTION METHODS WITH RESTORATION OF TIES
Construction of the Methods. Methods of the First Order. Method of the Second Order. Minimization Methods of Higher Effectiveness. On the Solving of the General Problem of Mathematical Programming. Conclusive Remarks.

Bibliographic Notes

APPENDIX. COMPUTATIONAL SCHEMES OF THE MAIN ALGORITHMS
LITERATURE
INDEX
PREFACE

Computational methods of solving extremal problems have developed very intensively in recent years.

Therefore special stress is laid on the description of the algorithms that require the finding only of the first derivative or only of the value of the function.

In describing the computational methods we consider only the finite dimensional case. This is due to two reasons. First, in using a computer for calculations, the problem is to be approximated anyway by a finite dimensional one. Secondly, most of the known algorithms are comparatively simply generalized for the minimization of functionals without essential changes. This approach made it possible to make the book easily understood by a broad circle of readers, since in order to grasp most of the results described only a knowledge of the principles of mathematical analysis and linear algebra is required.

To avoid the necessity of frequent cross-referencing, not many references are given in the text. Short bibliographic notes follow some of the chapters. The authors did not attempt to cover all the literature on the questions treated, this being simply impossible because of its vastness. This is why the list of literature given at the end of the book includes only papers and monographs directly used in writing this book.

It should be noted that the authors have not discussed the methods of solving a broad and important class of ill-posed extremal problems, which are treated in the works of A. N. Tikhonov and his followers. The authors have only slightly touched on the solving of optimal control problems. These problems have been studied from various points of view and the methods for their solution are given in N. N. Moiseev's monograph Numerical Methods in the Theory of Optimal Systems.
The algorithms set forth below are iterative in character. This means that we construct a finite or infinite sequence of points $x_k$, $k = 0, 1, \ldots$, which converges to the solution of a minimization problem.

The points of the sequence are related by the equation

$$x_{k+1} = x_k + \alpha_k p_k$$

where $p_k$ is the vector of shift from point $x_k$ and $\alpha_k$ is a step along the direction of $p_k$. Therefore the description of any of the algorithms given below consists in establishing the method of choosing the vector $p_k$ and the length of the step $\alpha_k$. It should be noted that the method of choosing the vector $p_k$ determines the general rate of convergence of the process and the method of choosing $\alpha_k$ has an important influence on the amount of calculations at each iteration. Therefore the authors' aim was to give, in all cases of choosing $\alpha_k$, a method such that the required value of $\alpha_k$ could be found after a finite number of iterations without affecting the general rate of convergence.
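
The scheme $x_{k+1} = x_k + \alpha_k p_k$ can be illustrated by a minimal Python sketch. The choices made here are assumptions for illustration only and are not taken from the book: the direction $p_k$ is the antigradient, the step $\alpha_k$ is found by halving a trial step until a sufficient-decrease test holds, and the names `iterate`, `f`, `grad` are hypothetical.

```python
import numpy as np

def iterate(f, grad, x0, max_iter=200, tol=1e-8):
    """Generic scheme x_{k+1} = x_k + alpha_k * p_k.

    Illustrative assumptions: p_k is the antigradient and alpha_k is
    obtained by halving a trial step until f decreases sufficiently.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = -g                      # direction p_k (assumed choice)
        alpha = 1.0                 # trial step
        # Halve alpha until the sufficient-decrease test is passed;
        # the lower bound on alpha guards against rounding stalls.
        while alpha > 1e-12 and f(x + alpha * p) > f(x) - 0.5 * alpha * np.dot(g, g):
            alpha *= 0.5
        x = x + alpha * p
    return x

# Example: convex quadratic f(x) = 0.5 (x, A x) - (b, x), minimum at A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
x_min = iterate(f, grad, [0.0, 0.0])
```

In this sketch the step is found after finitely many halvings at each iteration, which mirrors the aim stated above; other rules for $p_k$ and $\alpha_k$ fit the same loop.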


Let us briefly review the estimates of the rate of convergence, which are in most cases used in this book.

We say that a sequence $\{x_k\}$ converges to point $x_*$ at a linear rate or at the rate of geometrical progression (with the ratio $q$) if from a certain $k$ the inequality $\|x_{k+1} - x_*\| \le q \|x_k - x_*\|$, where $0 < q < 1$, is satisfied. If the inequality $\|x_{k+1} - x_*\| \le q_k \|x_k - x_*\|$ is satisfied, where $q_k \to 0$ with $k \to \infty$, we say that the rate of convergence of the sequence $\{x_k\}$ is superlinear, or faster than the rate of convergence of any geometric progression. If $q_k \le C \|x_k - x_*\| \to 0$, then $\|x_{k+1} - x_*\| \le C \|x_k - x_*\|^2$. This estimate is a characteristic of the quadratic rate of convergence.

The above estimates will occur in this book also in several other equivalent forms.
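
The difference between the linear and the quadratic estimates can be seen numerically. The following Python sketch is only an illustration under assumed examples: a halving sequence stands for the linear rate, and Newton's iteration for $\sqrt{2}$ stands for the quadratic rate. The helper name `error_ratios` is hypothetical.

```python
import math

def error_ratios(errors, order):
    """Return e_{k+1} / e_k**order for consecutive errors e_k = ||x_k - x_*||."""
    return [e1 / e0**order for e0, e1 in zip(errors, errors[1:]) if e0 > 0]

# Linear rate: x_{k+1} = x_k / 2 with x_* = 0, so the ratio q equals 1/2.
linear = [0.5**k for k in range(10)]

# Quadratic rate: Newton's iteration for sqrt(2), x_{k+1} = (x_k + 2 / x_k) / 2.
x, newton = 2.0, []
for _ in range(4):
    newton.append(abs(x - math.sqrt(2.0)))
    x = 0.5 * (x + 2.0 / x)

q_linear = error_ratios(linear, order=1)      # constant ratio q
c_quadratic = error_ratios(newton, order=2)   # bounded by a constant C
```

For the Newton sequence the error satisfies $e_{k+1} = e_k^2 / (2 x_k)$, so the ratios in `c_quadratic` stay below $1/(2\sqrt{2})$, which plays the role of the constant $C$.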
Some remarks on the notations used.

As mentioned before, the subject is treated for the case of an $n$-dimensional vector space which will be denoted by $E^n$. The vectors will be denoted by lower-case letters $x$, $y$, $z$, etc. and their components by using superscripts so that $x^i$ is the $i$-th component of vector $x$. The subscripts denote the elements of a sequence. Matrices are denoted by capital letters $A$, $B$, $C$, etc. An asterisk as upper index denotes transposition, i.e. $A^*$ is the transposed matrix $A$. As a rule vector $x$ means a column-vector so that $x^*$ denotes a row-vector. The scalar product of two vectors is denoted by $(x, y)$, i.e.

$$(x, y) = \sum_{i=1}^{n} x^i y^i.$$

The norm of the vector is understood to be its Euclidean norm, unless otherwise specified:

$$\|x\| = \sqrt{(x, x)}.$$
In conclusion, the authors express their sincere gratitude to G. E. Lybarskaya, L. A. Sobolenko, E. I. Boguslavskaya and V. M. Panin for the invaluable assistance in preparing this book.

Chapter I (except Sec. 5 and partly Sec. 2) and Chap. III (except Sec. 9 and partly Sec. 3) have been written by B. N. Pshenichny. Chapter II, the third and the fourth subsections of Sec. 2 and the fifth and sixth subsections of Sec. 3, and Sec. 9 of Chap. III have been written by Yu. M. Danilin.

CHAPTER I
INTRODUCTION TO THE THEORY OF MATHEMATICAL PROGRAMMING

This chapter describes some facts from the theory of convex sets and the necessary conditions of the extrema; these facts are necessary for understanding the matter set forth in subsequent chapters.

1. CONVEX SETS

In this section we consider the basic properties of convex sets in an $n$-dimensional Euclidean space.

Definition. Separation Theorem

Definition 1.1. A set of points $X$ in $E^n$ is called convex if together with any $x_1, x_2 \in X$ it contains also all points of the form

$$x = \lambda x_1 + (1 - \lambda) x_2, \quad 0 \le \lambda \le 1.$$

In geometrical terms this means that if the end points of a segment belong to a convex set $X$ then the whole segment belongs to the set too.

Lemma 1.1. The following statements hold:
(1) The intersection of any number of convex sets is convex.
(2) If $x_i \in X$, $i = 1, \ldots, m$, then with any $\lambda_i$, $i = 1, \ldots, m$, such that $\sum_{i=1}^{m} \lambda_i = 1$, $\lambda_i \ge 0$, the point $x = \sum_{i=1}^{m} \lambda_i x_i$ belongs to $X$.

The following theorem and its corollaries are the basic tools by which it is possible to obtain results characterising various properties of convex sets.

Theorem 1.1. Let $X$ be a convex set, and $\bar{X}$ its closure. If point $x_0$ does not belong to $\bar{X}$, then there exist a vector $a \in E^n$, $a \ne 0$, and a number $\varepsilon > 0$ such that for all $x \in X$

$$(a, x) \le (a, x_0) - \varepsilon.$$


Proof. $\bar{X}$ is a closed set, by definition. Let us show that it is convex. Indeed, if $x \in \bar{X}$, then there is a sequence $\{x_k\}$, $k = 1, 2, \ldots$, such that $x_k \in X$, $x_k \to x$. Now let $x, y \in \bar{X}$, $0 \le \lambda \le 1$. Let us prove that $\lambda x + (1 - \lambda) y \in \bar{X}$. Since $X$ is a convex set, it follows from $x_k, y_k \in X$, $x_k \to x$, $y_k \to y$ that

$$\lambda x_k + (1 - \lambda) y_k \in X,$$
$$\lambda x_k + (1 - \lambda) y_k \to \lambda x + (1 - \lambda) y.$$

This means that $\lambda x + (1 - \lambda) y \in \bar{X}$, i.e. $\bar{X}$ is convex.

Let us take a point $y \in \bar{X}$ whose distance from $x_0$ is the least, i.e.

$$\|x - x_0\| \ge \|y - x_0\|, \quad x \in \bar{X}.$$

Since $\bar{X}$ is convex, for all $x \in \bar{X}$ and $0 \le \lambda \le 1$ we have

$$\lambda x + (1 - \lambda) y = y + \lambda (x - y) \in \bar{X}.$$

Therefore

$$\|\lambda x + (1 - \lambda) y - x_0\|^2 = \|y - x_0 + \lambda (x - y)\|^2$$
$$= (y - x_0 + \lambda (x - y),\ y - x_0 + \lambda (x - y))$$
$$= (y - x_0, y - x_0) + 2\lambda (y - x_0, x - y) + \lambda^2 (x - y, x - y)$$
$$= \|y - x_0\|^2 + 2\lambda (y - x_0, x - y) + \lambda^2 \|x - y\|^2 \ge \|y - x_0\|^2.$$

The last inequality holds for any $\lambda$ varying between zero and unity. Simplifying it (cancelling $\|y - x_0\|^2$ and dividing by $\lambda > 0$) we obtain

$$2 (y - x_0, x - y) + \lambda \|x - y\|^2 \ge 0;$$

hence, letting $\lambda \to 0$,

$$(y - x_0, x - y) \ge 0.$$

Let $a = x_0 - y$. The last inequality can then be written in the form $(a, x) \le (a, y)$. But

$$(a, y) = (a, x_0) - (a, x_0 - y) = (a, x_0) - \|a\|^2.$$

Setting $\varepsilon = \|a\|^2$, we finally obtain

$$(a, x) \le (a, x_0) - \varepsilon.$$

This inequality holds for any $x \in \bar{X}$. Besides, $\varepsilon > 0$ as $x_0 \notin \bar{X}$ and consequently $y \ne x_0$. Therefore

$$\varepsilon = \|a\|^2 = \|x_0 - y\|^2 > 0.$$

Q.E.D.
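
The construction used in the proof (nearest point $y$, vector $a = x_0 - y$, number $\varepsilon = \|a\|^2$) can be tried numerically. In the Python sketch below the convex set is assumed to be a box, chosen only because its nearest point is a componentwise clip; the function name `separate_point_from_box` is hypothetical.

```python
import numpy as np

def separate_point_from_box(x0, lo, hi):
    """Return (a, eps, y) with (a, x) <= (a, x0) - eps for every x in the box.

    Follows the proof: y is the point of the closed convex set nearest to x0,
    a = x0 - y and eps = ||a||^2. The box [lo, hi] is an illustrative choice.
    """
    x0 = np.asarray(x0, dtype=float)
    y = np.clip(x0, lo, hi)          # nearest point of the box to x0
    a = x0 - y
    eps = float(a @ a)
    if eps == 0.0:
        raise ValueError("x0 belongs to the set; no separation exists")
    return a, eps, y

lo, hi = np.zeros(2), np.ones(2)
x0 = np.array([2.0, 0.5])
a, eps, y = separate_point_from_box(x0, lo, hi)

# Check the separating inequality on random points of the box.
rng = np.random.default_rng(0)
samples = rng.uniform(lo, hi, size=(1000, 2))
holds = bool(np.all(samples @ a <= a @ x0 - eps + 1e-12))
```

For a general closed convex set the clip would have to be replaced by a projection routine suited to that set.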


Remark. In proving theorem 1.1 we have proved at the same time that the closure of a convex set is convex too. As a simple exercise the reader can prove that the set of interior points of a convex set is convex too.

Corollary 1.1. Let $X$ be a convex set and $x_0$ a frontier point of $X$. Then there is a vector $a \ne 0$ such that

$$(a, x) \le (a, x_0), \quad x \in X.$$

Corollary 1.2. If $X$ and $Y$ are convex sets that do not intersect, then there is a vector $a \ne 0$ such that

$$(a, x) \le (a, y), \quad x \in X, \ y \in Y.$$

Corollary 1.3. If $X$ and $Y$ are closed convex sets which do not intersect and one of them is bounded, then there exist a vector $a \ne 0$ and a number $\varepsilon > 0$ such that

$$(a, x) \le (a, y) - \varepsilon, \quad x \in X, \ y \in Y.$$

Convex Cones

Definition 1.2. A set $K$ is called a convex cone if the set is convex and together with every point $x \in K$ it contains all points $\lambda x$ with $\lambda > 0$.

It is clear that if $x, y \in K$ then $x + y \in K$. In fact, since $K$ is a convex set, point $\frac{1}{2} x + \frac{1}{2} y$ belongs to $K$. But

$$x + y = 2 \left( \tfrac{1}{2} x + \tfrac{1}{2} y \right),$$

whence $x + y \in K$ by the definition of a cone. The most important properties of cones are formulated in terms which establish the relation between the original cone and the cone that is its conjugate or dual.

Definition 1.3. Let $K$ be a convex cone. The set of all vectors $y \in E^n$ satisfying for any $x \in K$ the inequality $(x, y) \ge 0$ is called a conjugate cone and denoted by $K^*$.

An elementary check shows that $K^*$ is also a convex cone.

Lemma 1.2. $K^*$ is a closed convex cone.

Lemma 1.3. Let $K$ be a convex cone. Then $x_0 \in \bar{K}$ if and only if $(x_0, y) \ge 0$ for all $y \in K^*$. If $K$ is closed, then

$$(K^*)^* = K.$$

Proof. It is evident that if $x_0 \in \bar{K}$, then $(x_0, y) \ge 0$ for all $y \in K^*$. Suppose the converse is false. Let $(x_0, y) \ge 0$ for any $y \in K^*$, but $x_0 \notin \bar{K}$.


Since $\bar{K}$ is a closed convex set and using theorem 1.1, we can assert that there are a vector $a$ and a number $\varepsilon > 0$ such that

$$(a, x_0) \le (a, x) - \varepsilon, \quad x \in \bar{K}.$$

Now a closed cone $\bar{K}$ always contains point 0. Therefore in particular

$$(a, x_0) \le -\varepsilon. \tag{1.1}$$

On the other hand

$$(a, x) \ge 0, \quad x \in K. \tag{1.2}$$

Indeed, if for a certain $x_1 \in K$ we have $(a, x_1) < 0$, then since $\lambda x_1 \in K$ with $\lambda > 0$,

$$(a, x_0) \le \lambda (a, x_1) - \varepsilon$$

and the last inequality must be valid for any $\lambda > 0$; this is impossible if $(a, x_1) < 0$. Thus (1.2) is valid and consequently $a \in K^*$. Then $(a, x_0) \ge 0$ and this contradicts (1.1). This proves the first part of the lemma.

Let us now prove its second part. If $x \in K$, then $(x, y) \ge 0$ for all $y \in K^*$, by definition, and therefore $x \in (K^*)^*$, i.e. $K \subset (K^*)^*$. Conversely, by definition, $x \in (K^*)^*$ if and only if $(x, y) \ge 0$ with any $y \in K^*$. However, it was proved above that in this case $x \in \bar{K} = K$, i.e. $(K^*)^* \subset K$. Thus $(K^*)^* = K$. Q.E.D.

Polyhedral cones are an important class of cones encountered in the theory of linear programming.

Definition 1.4. A cone $K$ is called polyhedral if there exists a finite set of $n$-dimensional vectors $a_i$, $i = 1, \ldots, m$, such that with $x \in K$ the expansion

$$x = \sum_{i=1}^{m} \lambda_i a_i, \quad \lambda_i \ge 0, \quad i = 1, \ldots, m \tag{1.3}$$

is valid and conversely (1.3) implies that $x \in K$.

Thus a polyhedral cone $K$ is a set of points which can be represented in the form (1.3). A given point $x \in K$, generally speaking, is not uniquely represented in the form (1.3).
Lemma 1.4. Let $x \in K$, $K$ being a polyhedral cone. Then there is such an expansion of $x$ in vectors $a_i$ with nonnegative coefficients $\lambda_i$ that the number of indices $i$ for which $\lambda_i \ne 0$ does not exceed $n$, the number of dimensions of the space; the vectors $a_i$ corresponding to nonzero $\lambda_i$ are linearly independent.

Proof. Let $x \in K$, i.e. $x = \sum_{i=1}^{m} \lambda_i a_i$, and let $J$ be the set of those indices $i$ such that $\lambda_i > 0$. Suppose that the number of elements in $J$ is greater than $n$, or does not exceed $n$, but the vectors $a_i$, $i \in J$, are linearly dependent. Since more than $n$ linearly independent vectors cannot exist in an $n$-dimensional space, there are coefficients $\alpha_i$, not all zero, such that $\sum_{i \in J} \alpha_i a_i = 0$. Besides, by definition of $J$, $\lambda_i = 0$ if $i \notin J$ and so

$$x = \sum_{i \in J} \lambda_i a_i, \quad \lambda_i > 0, \quad i \in J.$$

Subtracting from this relation the preceding one multiplied by $\varepsilon$, we obtain

$$x = \sum_{i \in J} (\lambda_i - \varepsilon \alpha_i) a_i.$$

Without loss of generality we can take that $\alpha_i > 0$ for some $i \in J$. Setting $\varepsilon_0 = \min_{\alpha_i > 0} \dfrac{\lambda_i}{\alpha_i}$ and $\bar{\lambda}_i = \lambda_i - \varepsilon_0 \alpha_i$, we have

$$x = \sum_{i \in J} \bar{\lambda}_i a_i$$

where $\bar{\lambda}_i \ge 0$ and for one $i$ at least $\bar{\lambda}_i = 0$.

Thus we have obtained an expansion of $x$ in vectors $a_i$ with nonnegative coefficients; however the number of strictly positive coefficients has been diminished.

This process can now be applied further until the number of nonzero coefficients becomes less than $n$ or equal to $n$ and vectors $a_i$ for which $\lambda_i > 0$ become linearly independent. Since we have a process of diminishing a whole number, this process obviously cannot be continued infinitely and after a certain number of steps we shall get an expansion which satisfies the conditions of our lemma.
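
The reduction process of this proof is itself a small algorithm. The following Python sketch follows its steps under stated assumptions: the coefficients $\alpha_i$ are taken from a singular value decomposition, the tolerance `tol` replaces exact zero tests, and the name `reduce_expansion` is hypothetical.

```python
import numpy as np

def reduce_expansion(vectors, lam, tol=1e-12):
    """Reduce x = sum lam_i a_i (lam_i >= 0) to linearly independent a_i.

    vectors: (m, n) array whose rows are the a_i; lam: length-m weights.
    Returns new nonnegative weights that represent the same point x.
    """
    a = np.asarray(vectors, dtype=float)
    lam = np.asarray(lam, dtype=float).copy()
    while True:
        J = np.flatnonzero(lam > tol)              # indices with lam_i > 0
        if J.size == 0 or np.linalg.matrix_rank(a[J]) == J.size:
            break                                  # remaining a_i are independent
        # Coefficients alpha, not all zero, with sum_{i in J} alpha_i a_i = 0:
        # the last right singular vector of the n x |J| matrix a[J].T.
        _, _, vt = np.linalg.svd(a[J].T)
        alpha = vt[-1]
        if not np.any(alpha > tol):
            alpha = -alpha                         # make some alpha_i positive
        pos = alpha > tol
        eps0 = np.min(lam[J][pos] / alpha[pos])    # eps_0 = min lam_i / alpha_i
        lam[J] = lam[J] - eps0 * alpha
        lam[lam < tol] = 0.0                       # at least one weight vanishes
    return lam

# Example in E^2: three generators, so at most two are needed.
a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
lam0 = np.array([1.0, 1.0, 1.0])
lam = reduce_expansion(a, lam0)
```

Each pass removes at least one index from $J$, so the loop stops after finitely many passes, as the proof argues. Which of the possible reduced expansions is returned depends on the sign of the computed null vector.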
Lemma 1.5. A polyhedral cone is closed.

Lemma 1.6. Let the cone $K$ be defined by a system of linear inequalities

$$(a_i, x) \ge 0, \quad i = 1, \ldots, m$$

where $a_i \in E^n$. Then the conjugate cone $K^*$ is a polyhedral cone and consists of points $y$, which can be presented in the form

$$y = \sum_{i=1}^{m} \lambda_i a_i, \quad \lambda_i \ge 0, \quad i = 1, \ldots, m.$$

Proof. Let us consider the cone

$$\tilde{K} = \left\{ y : y = \sum_{i=1}^{m} \lambda_i a_i, \ \lambda_i \ge 0, \ i = 1, \ldots, m \right\}.$$

By definition, $\tilde{K}^*$ is a set of points $x$, for which $(x, y) \ge 0$, $y \in \tilde{K}$, i.e. $\left( x, \sum_{i=1}^{m} \lambda_i a_i \right) \ge 0$ for all $\lambda_i \ge 0$. Then

$$\left( x, \sum_{i=1}^{m} \lambda_i a_i \right) = \sum_{i=1}^{m} \lambda_i (x, a_i) \ge 0.$$

The last inequality can obviously be valid for any $\lambda_i \ge 0$ only if $(a_i, x) \ge 0$, $i = 1, \ldots, m$, i.e. if $x \in K$. Thus $\tilde{K}^* = K$. Since $\tilde{K}$ is a polyhedral cone, it is closed and by lemma 1.3 $(\tilde{K}^*)^* = \tilde{K}$. Thus $K^* = \tilde{K}$. Q.E.D.

Remark. The lemma proved above is known as the Farkas-Minkowski lemma and is used as the basic tool for obtaining the necessary conditions for extrema.

Strictly and Strongly Convex Sets

Definition 1.5. A set $X \subset E^n$ is called strictly convex if for any $x_1, x_2 \in X$, $x_1 \ne x_2$, all points of the form

$$\lambda x_1 + (1 - \lambda) x_2, \quad 0 < \lambda < 1$$

are internal points of this set.

Definition 1.6. A set $X \subset E^n$ is called strongly convex if there is a constant $\gamma > 0$ such that any point

$$\frac{x_1 + x_2}{2} + y \in X$$

if $x_1, x_2 \in X$ and $\|y\| \le \gamma \|x_2 - x_1\|^2$.

It is easily ascertained that a strongly convex set is also strictly convex (but not the converse).

2. CONVEX FUNCTIONS

Convex functions have a number of important properties and constitute one of the main objects of study in the theory of mathematical programming. The problem of convex programming, which is the most investigated one for extrema, is formulated in terms of convex sets. However, convex functions play a decisive role in the general nonlinear problem too, since sufficiently general and comprehensive necessary conditions of extrema can be formulated only for the case where the directional derivatives of the functions at the given point are convex functions.

We shall mainly study convex functions defined over the whole space so that the value of any given convex function is finite at each point $x \in E^n$. From the viewpoint of general theory it is sometimes expedient to consider convex functions which can at some points take the value of $+\infty$. In what follows, however, such functions will occur only in studying dual problems of convex programming. Therefore in all statements of this section, unless otherwise specified, we suppose that the convex function under consideration is defined over the whole space $E^n$ and takes finite values.

Definition. Basic Properties

Definition 2.1. A function $f(x)$ defined for all $x \in E^n$ is called convex if for any $x_1$, $x_2$ and $\lambda_1, \lambda_2 \ge 0$, $\lambda_1 + \lambda_2 = 1$,

$$f(\lambda_1 x_1 + \lambda_2 x_2) \le \lambda_1 f(x_1) + \lambda_2 f(x_2).$$

Remark. If $f(x) = +\infty$ for some $x$, the definition remains valid.

Lemma 2.1. Let $f_1(x)$ and $f_2(x)$ be convex functions and $c_1$, $c_2$ nonnegative numbers. Then

$$f(x) = c_1 f_1(x) + c_2 f_2(x)$$

is a convex function too.

Lemma 2.2. Let $f_i(x)$, $i = 1, \ldots, m$, be convex functions. Then $f(x) = \max_i f_i(x)$ is also a convex function.
L e m m a 2 . 3 . I f f ( x ) is a c o n v e x f u n c t i o n , t h e n w e h a v e
\f ( X j X i ” 1” k 2X 2 • . • “ F" X m £ m )
^ ^ i f (^l) ^ 2 / (^2) • • • “ 1” ^ m f f a m )
f o r a n y n o n n e g a t i v e X*, w h i c h satisfy the c o n d i t i o n
X* + . . • + Xm = 1.
Proof. With $m = 2$ this statement follows from the definition of a convex function. Suppose the lemma has been proved for $m \le k$. Let us show that the statement is valid for $m = k + 1$. Let $\lambda_i \ge 0$, $i = 1, \dots, k + 1$, $\lambda_1 + \dots + \lambda_{k+1} = 1$. Evidently one can consider all $\lambda_i$ to be strictly greater than zero; otherwise we should have the case where the above inequality is satisfied by hypothesis. Thus $\lambda_{k+1} > 0$ and $1 - \lambda_{k+1} = \lambda_1 + \dots + \lambda_k > 0$.
From the definition of a convex function we have
$$f(\lambda_1 x_1 + \dots + \lambda_k x_k + \lambda_{k+1} x_{k+1}) \le (1 - \lambda_{k+1})\, f\!\left(\frac{\lambda_1}{1 - \lambda_{k+1}} x_1 + \dots + \frac{\lambda_k}{1 - \lambda_{k+1}} x_k\right) + \lambda_{k+1} f(x_{k+1}). \tag{2.1}$$
But by induction
$$f\!\left(\frac{\lambda_1}{1 - \lambda_{k+1}} x_1 + \dots + \frac{\lambda_k}{1 - \lambda_{k+1}} x_k\right) \le \frac{\lambda_1}{1 - \lambda_{k+1}} f(x_1) + \dots + \frac{\lambda_k}{1 - \lambda_{k+1}} f(x_k) \tag{2.2}$$
since
$$\frac{\lambda_1}{1 - \lambda_{k+1}} + \dots + \frac{\lambda_k}{1 - \lambda_{k+1}} = 1.$$
Comparing (2.1) and (2.2) we obtain the required result.
The lemma has been proved using the principle of mathematical induction.
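
The inequality of lemma 2.3 can be checked numerically. The following is a minimal sketch, assuming an illustrative convex function (the squared Euclidean norm on $E^2$) and randomly generated points and weights; none of these choices come from the text.

```python
import random

def f(x):
    # Illustrative convex function on E^2 (an assumption): squared Euclidean norm.
    return x[0] ** 2 + x[1] ** 2

def jensen_gap(points, weights):
    """Return sum_i w_i f(x_i) - f(sum_i w_i x_i); nonnegative for convex f."""
    dim = len(points[0])
    mean = [sum(w * p[j] for w, p in zip(weights, points)) for j in range(dim)]
    return sum(w * f(p) for w, p in zip(weights, points)) - f(mean)

random.seed(0)
gaps = []
for _ in range(1000):
    m = random.randint(2, 6)
    pts = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(m)]
    raw = [random.random() for _ in range(m)]
    total = sum(raw)
    lam = [r / total for r in raw]  # nonnegative weights summing to 1
    gaps.append(jensen_gap(pts, lam))

# Up to rounding error, no sampled combination violates the inequality.
print(min(gaps) >= -1e-9)
```

A random test of this kind cannot prove convexity; it can only fail to find a counterexample.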
Lemma 2.4. The function $f(x)$ is convex if and only if for any $x$ and $p \in E^n$ the function of the one-dimensional variable $t$
$$\varphi_{x,p}(t) = f(x + tp) \tag{2.3}$$
is a convex function.

Differential Properties
Let $f(x)$ be a differentiable function with continuous gradient $f'(x)$.
Lemma 2.5. The following statements are equivalent:
(1) $f(x)$ is a convex function.
(2) $f(x_2) - f(x_1) \ge (f'(x_1), x_2 - x_1)$ for any $x_1, x_2 \in E^n$.
(3) $(f'(x + \lambda p), p)$ is a nondecreasing function of $\lambda$ for any $x, p \in E^n$.
If $f(x)$ is a twice continuously differentiable function, then
(4) $f''(x)$, the matrix of second derivatives, is positive definite, i.e. $(f''(x) p, p) \ge 0$ for any $x, p \in E^n$.
Proof. Note first of all that if
$$\varphi_{x,p}(\lambda) = f(x + \lambda p),$$
then, as shown above, $\varphi_{x,p}(\lambda)$ is a convex function whenever $f(x)$ is convex, and
$$\varphi'_{x,p}(\lambda) = (f'(x + \lambda p), p), \qquad \varphi''_{x,p}(\lambda) = (p, f''(x + \lambda p) p). \tag{2.4}$$
Let us show that statement (2) follows from statement (1). In fact, since
$$f((1 - \lambda) x_1 + \lambda x_2) \le (1 - \lambda) f(x_1) + \lambda f(x_2), \quad 0 < \lambda \le 1,$$
we have
$$\frac{f(x_1 + \lambda (x_2 - x_1)) - f(x_1)}{\lambda} \le f(x_2) - f(x_1).$$
Taking the limit with $\lambda \to 0$ we obtain
$$(f'(x_1), x_2 - x_1) \le f(x_2) - f(x_1).$$
Thus statement (2) follows from (1), or briefly (1) → (2).
Let us show that (2) → (3). From statement (2) we have for $\varphi_{x,p}(\lambda)$:
$$\varphi'_{x,p}(\lambda_1)(\lambda_2 - \lambda_1) \le \varphi_{x,p}(\lambda_2) - \varphi_{x,p}(\lambda_1),$$
$$\varphi'_{x,p}(\lambda_2)(\lambda_1 - \lambda_2) \le \varphi_{x,p}(\lambda_1) - \varphi_{x,p}(\lambda_2).$$
The two inequalities with $\lambda_2 > \lambda_1$ give
$$\varphi'_{x,p}(\lambda_1) \le \frac{\varphi_{x,p}(\lambda_2) - \varphi_{x,p}(\lambda_1)}{\lambda_2 - \lambda_1} \le \varphi'_{x,p}(\lambda_2),$$
i.e. $(f'(x + \lambda_1 p), p) \le (f'(x + \lambda_2 p), p)$. Q.E.D.


(3) → (1). Let $(f'(x + \lambda p), p)$ be a nondecreasing function of $\lambda$. Then $\varphi'_{x,p}(\lambda_1) \le \varphi'_{x,p}(\lambda_2)$ with $\lambda_2 \ge \lambda_1$.
If $0 < \mu < 1$, then, since the integrand below is nonnegative,
$$0 \le \mu (\lambda_2 - \lambda_1) \int_0^1 \left[\varphi'_{x,p}(\lambda_1 + \tau (\lambda_2 - \lambda_1)) - \varphi'_{x,p}(\lambda_1 + \tau \mu (\lambda_2 - \lambda_1))\right] d\tau$$
$$= (1 - \mu)\, \varphi_{x,p}(\lambda_1) + \mu\, \varphi_{x,p}(\lambda_2) - \varphi_{x,p}((1 - \mu) \lambda_1 + \mu \lambda_2),$$
i.e. $\varphi_{x,p}(\lambda)$ is a convex function of $\lambda$. Then, as follows from lemma 2.4, $f(x)$ is a convex function.
(3) → (4). Since $\varphi'_{x,p}(\lambda) = (f'(x + \lambda p), p)$ is a nondecreasing function, $\varphi''_{x,p}(\lambda) \ge 0$, i.e.
$$(p, f''(x + \lambda p) p) \ge 0. \tag{2.5}$$
Hence the matrix $f''(x)$ is positive definite.
(4) → (3). Conversely, if (2.5) is satisfied, then $\varphi''_{x,p}(\lambda)$ is nonnegative and consequently the function $\varphi'_{x,p}(\lambda) = (f'(x + \lambda p), p)$ is nondecreasing.
Since it has been shown that (1) → (2) → (3) → (1), (4) → (3) and (3) → (4), the equivalence of all four statements in lemma 2.5 is proved.
Corollary 2.1. The quadratic function
$$f(x) = \tfrac{1}{2}(x, Ax) + (b, x)$$
is convex if and only if matrix $A$ is positive definite.
Indeed, $f(x)$ is twice continuously differentiable and $f''(x) = A$. Therefore the statement of the corollary follows directly from statement (4) of lemma 2.5.
Lemma 2.5 provides a series of criteria of convexity of a function which enable us to establish whether a given function is convex.
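
The criteria of lemma 2.5 can be tried on the quadratic function of corollary 2.1. The sketch below assumes a symmetric 2×2 matrix $A$, so that the eigenvalues have a closed form; the two matrices and the vector $b$ are illustrative assumptions. It tests statement (4) through the least eigenvalue and statement (2) on random pairs of points.

```python
import math
import random

def eigenvalues_sym2(a11, a12, a22):
    """Eigenvalues of the symmetric 2x2 matrix [[a11, a12], [a12, a22]]."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius, mean + radius

def is_convex_quadratic(a11, a12, a22, tol=1e-12):
    # Statement (4): (Ap, p) >= 0 for all p, i.e. the least eigenvalue is >= 0.
    return eigenvalues_sym2(a11, a12, a22)[0] >= -tol

def f(A, b, x):
    Ax = [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]
    return 0.5 * (x[0] * Ax[0] + x[1] * Ax[1]) + b[0] * x[0] + b[1] * x[1]

def grad(A, b, x):
    # f'(x) = Ax + b for symmetric A.
    return [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1]]

def gradient_inequality_holds(A, b, trials=500, seed=1):
    # Statement (2): f(x2) - f(x1) >= (f'(x1), x2 - x1) on random pairs.
    rng = random.Random(seed)
    for _ in range(trials):
        x1 = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
        x2 = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
        g = grad(A, b, x1)
        lhs = f(A, b, x2) - f(A, b, x1)
        rhs = g[0] * (x2[0] - x1[0]) + g[1] * (x2[1] - x1[1])
        if lhs < rhs - 1e-9:
            return False
    return True

A_CONVEX = [[2.0, 1.0], [1.0, 2.0]]      # eigenvalues 1 and 3
A_INDEFINITE = [[1.0, 2.0], [2.0, 1.0]]  # eigenvalues -1 and 3
B = [1.0, -1.0]

print(is_convex_quadratic(2.0, 1.0, 2.0), gradient_inequality_holds(A_CONVEX, B))
print(is_convex_quadratic(1.0, 2.0, 1.0), gradient_inequality_holds(A_INDEFINITE, B))
```

For the indefinite matrix both tests fail, in agreement with the equivalence of statements (2) and (4).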
Definition 2.2. Let the convex function $f(x)$ be defined at point $x_0$ and have a finite value. Vector $g$ is called a subgradient or support vector for function $f(x)$ at point $x_0$ if for any $x$ the inequality
$$f(x) - f(x_0) \ge (g, x - x_0) \tag{2.6}$$
is satisfied.
It can be shown that if $f(x)$ is continuous at point $x_0$, then at this point there exist subgradients and the set of these subgradients is convex, closed and bounded. It follows from lemma 2.5 (statement 2) that $f'(x_0)$ is a subgradient of function $f(x)$ at point $x_0$ if $f(x)$ is differentiable. Thus the concept of subgradient is a generalization of the gradient concept.
It is clear from the definition that if $g_1$ and $g_2$ are subgradients of convex functions $f_1(x)$ and $f_2(x)$ at point $x_0$, then $c_1 g_1 + c_2 g_2$ is a subgradient of function $c_1 f_1(x) + c_2 f_2(x)$, $c_1, c_2 \ge 0$. Thus knowing the subgradients of certain convex functions it is easy to compute the subgradient for their linear combination with nonnegative coefficients.
Now let $f(x) = \max_{i = 1, \dots, m} f_i(x)$, where $f_i(x)$ is a convex function, and let $g_i$ be subgradients of $f_i(x)$ at point $x_0$. Then vector
$$g = \sum_{i=1}^{m} \lambda_i g_i,$$
where $\sum_{i=1}^{m} \lambda_i = 1$, $\lambda_i \ge 0$, $i = 1, \dots, m$, and $\lambda_i = 0$ if $f_i(x_0) < f(x_0)$, is a subgradient of function $f(x)$.
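
The rule for the maximum function can be illustrated as follows. This is a sketch under assumptions: the functions $f_i$ are taken to be affine, $f_i(x) = (a_i, x) + c_i$, with hypothetical data `A` and `C`, and equal weights $\lambda_i$ are placed on the active indices (any other convex weights on the active set would serve equally well).

```python
import random

# Hypothetical data: f_i(x) = (a_i, x) + c_i, each affine and hence convex.
A = [[1.0, 0.0], [-1.0, 2.0], [0.0, -1.0]]
C = [0.0, 1.0, -0.5]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def f_i(i, x):
    return dot(A[i], x) + C[i]

def f(x):
    return max(f_i(i, x) for i in range(len(A)))

def subgradient(x0, tol=1e-12):
    """Convex combination of the gradients a_i over indices with f_i(x0) = f(x0)."""
    fx0 = f(x0)
    active = [i for i in range(len(A)) if f_i(i, x0) >= fx0 - tol]
    lam = 1.0 / len(active)  # equal weights on the active set
    return [sum(lam * A[i][j] for i in active) for j in range(len(x0))]

def inequality_holds(x0, trials=1000, seed=2):
    # Check inequality (2.6) at randomly sampled points x.
    rng = random.Random(seed)
    g = subgradient(x0)
    fx0 = f(x0)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in x0]
        if f(x) - fx0 < dot(g, [xi - x0i for xi, x0i in zip(x, x0)]) - 1e-9:
            return False
    return True

x0 = [1.0, 0.5]  # here f_1 and f_2 are both active
print(subgradient(x0), inequality_holds(x0))
```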

Strictly and Strongly Convex Functions

Functions for which the condition of convexity is satisfied in a strong sense play a very important role in mathematical programming.
Definition 2.3. Function $f(x)$ is called strictly convex if
$$f((1 - \lambda) x + \lambda y) < (1 - \lambda) f(x) + \lambda f(y), \quad 0 < \lambda < 1, \quad x \ne y.$$
If a strictly convex function is sufficiently smooth, then statements similar to those formulated in lemma 2.5 are valid for it.
Lemma 2.6. The following statements are equivalent:
(1) $f(x)$ is a strictly convex function.
(2) $f(x_2) - f(x_1) > (f'(x_1), x_2 - x_1)$ for any $x_1, x_2 \in E^n$, $x_1 \ne x_2$.
(3) $(f'(x + \lambda p), p)$ is a strictly increasing function of $\lambda$.
Definition 2.4. Function $f(x)$ is called strongly convex if for any $x_1, x_2 \in E^n$
$$f\!\left(\frac{x_1 + x_2}{2}\right) \le \frac{1}{2}\left[f(x_1) + f(x_2)\right] - \gamma \|x_2 - x_1\|^2 \tag{2.7}$$
where $\gamma > 0$ is some constant, which may be arbitrarily small.
A strongly convex function, as can be easily ascertained, is also strictly convex, but, generally speaking, the converse does not hold.
In what follows we shall consider twice continuously differentiable strongly convex functions.


Lemma 2.7. If $f(x)$ is a twice continuously differentiable function, then the condition of strong convexity (2.7) is equivalent to the condition
$$(f''(x) p, p) \ge m \|p\|^2, \quad m > 0, \tag{2.8}$$
for any $x$ and $p \in E^n$.
Inequality (2.8) implies that matrix $f''(x)$ is strongly positive.
Corollary 2.2. A strictly convex quadratic function $f(x) = \tfrac{1}{2}(Ax, x) + (b, x)$ defined over the space $E^n$ is strongly convex too, and the converse is valid.
Proof. It is necessary to prove only the first statement.
From (2) of lemma 2.6 it follows that for any $x \ne 0$
$$(Ax, x) > 0. \tag{2.9}$$
At the same time
$$(Ax, x) \ge k (x, x) = k \|x\|^2 \tag{2.10}$$
where $k$ is the least eigenvalue of the matrix of second derivatives, $A$. From (2.9) and (2.10) it follows that $k > 0$ and $f(x)$ is a strongly convex function.
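
As a worked check of corollary 2.2, the constant in (2.7) can be made explicit for the quadratic function. The computation below is a sketch that assumes $A$ is symmetric; the value $\gamma = k/8$ is derived here and is not stated in the text. The linear terms $(b, x)$ cancel, which leaves

```latex
\begin{aligned}
\tfrac12\bigl[f(x_1)+f(x_2)\bigr]-f\!\left(\tfrac{x_1+x_2}{2}\right)
&=\tfrac14(Ax_1,x_1)+\tfrac14(Ax_2,x_2)-\tfrac18\bigl(A(x_1+x_2),\,x_1+x_2\bigr)\\
&=\tfrac18\bigl(A(x_1-x_2),\,x_1-x_2\bigr)\;\ge\;\tfrac{k}{8}\,\|x_1-x_2\|^2 .
\end{aligned}
```

Hence, under this assumption, (2.7) holds with $\gamma = k/8$, where $k$ is the least eigenvalue from (2.10).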
Let $x_0$ be an arbitrary point in $E^n$. Consider the set
$$Y = \{x : f(x) \le f(x_0)\}.$$
Lemma 2.8. If $f(x)$ is a twice continuously differentiable strongly convex function, then $Y$ is a closed bounded strongly convex set.
Proof. The set $Y$ is closed since $f(x)$ is a continuous function.
Let us prove that $Y$ is bounded. By Taylor's formula
$$f(x) = f(x_0) + (f'(x_0), x - x_0) + \tfrac{1}{2}(f''(\xi)(x - x_0), x - x_0)$$
where $\xi = x_0 + \theta (x - x_0)$, $\theta \in [0, 1]$. Using (2.8) we have for $x \in Y$
$$f(x_0) \ge f(x) \ge f(x_0) + (f'(x_0), x - x_0) + \frac{m}{2} \|x - x_0\|^2.$$
Hence
$$\frac{m}{2} \|x - x_0\|^2 + (f'(x_0), x - x_0) \le 0,$$
i.e.
$$\frac{m}{2} \|x - x_0\|^2 \le |(f'(x_0), x - x_0)| \le \|f'(x_0)\| \, \|x - x_0\|$$
or
$$\|x - x_0\| \le \frac{2 \|f'(x_0)\|}{m}.$$
This last inequality proves that $Y$ is bounded.


Finally, let us establish that $Y$ is a strongly convex set. Let $x_1, x_2 \in Y$. Using Lagrange's formula and condition (2.7) we obtain
$$f\!\left(\frac{x_1 + x_2}{2} + y\right) = f\!\left(\frac{x_1 + x_2}{2}\right) + (f'(\xi), y) \le \frac{1}{2}\left[f(x_1) + f(x_2)\right] - \gamma \|x_1 - x_2\|^2 + M \|y\| \tag{2.11}$$
where $\xi = \frac{x_1 + x_2}{2} + \theta y$, $\theta \in [0, 1]$, and $M$ is the maximum value of $\|f'(x)\|$, the norm of the derivative, on the set $Y$.
Setting $f(x_1) \ge f(x_2)$ we have $\frac{1}{2}\left[f(x_1) + f(x_2)\right] \le f(x_1)$. If $\|y\| \le \frac{\gamma}{M} \|x_2 - x_1\|^2$, then from (2.11) it follows that $f\!\left(\frac{x_1 + x_2}{2} + y\right) \le f(x_1)$, i.e. $\frac{x_1 + x_2}{2} + y \in Y$. By definition this means that $Y$ is a strongly convex set.
The lemma is proved.
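
The boundedness estimate $\|x - x_0\| \le 2\|f'(x_0)\|/m$ from the proof can be tried numerically. The sketch below assumes a quadratic function with a diagonal matrix of second derivatives, so that $m$ is simply the smallest diagonal entry; the data and the sampling box are illustrative assumptions.

```python
import math
import random

A_DIAG = [1.0, 4.0]   # hypothetical diagonal Hessian; least eigenvalue m = 1
B = [-1.0, 2.0]
M_SMALL = min(A_DIAG)

def f(x):
    return sum(0.5 * a * xi * xi + bi * xi for a, bi, xi in zip(A_DIAG, B, x))

def grad(x):
    return [a * xi + bi for a, bi, xi in zip(A_DIAG, B, x)]

def radius_bound(x0):
    # Bound from the proof of lemma 2.8: 2 * ||f'(x0)|| / m.
    return 2.0 * math.hypot(*grad(x0)) / M_SMALL

def level_set_within_bound(x0, trials=20000, seed=3):
    """Sample points around x0; every sampled point of Y must lie within the bound."""
    rng = random.Random(seed)
    r = radius_bound(x0)
    found = 0
    for _ in range(trials):
        x = [x0[0] + rng.uniform(-2 * r, 2 * r), x0[1] + rng.uniform(-2 * r, 2 * r)]
        if f(x) <= f(x0):
            found += 1
            if math.dist(x, x0) > r + 1e-9:
                return False
    return found > 0  # require that some points of Y were actually sampled

x0 = [3.0, 1.0]
print(radius_bound(x0), level_set_within_bound(x0))
```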
Remark. The set $Y$ remains closed and strongly convex also in case $f(x)$ is a differentiable or continuous strongly convex function. The proof of $Y$ being strongly convex is based on the fact that a continuous strongly convex function satisfies Lipschitz' condition in every bounded set (see N. Bourbaki).
Lemma 2.9. If the matrix $f''(x)$ satisfies condition (2.8), then there exists the inverse matrix $f''^{-1}(x)$ and also
$$(f''^{-1}(x) p, p) \le \frac{1}{m} \|p\|^2.$$
If moreover the matrix $f''(x)$ is bounded, i.e.
$$(f''(x) p, p) \le M \|p\|^2 \tag{2.12}$$
then
$$(f''^{-1}(x) p, p) \ge \frac{1}{M} \|p\|^2.$$
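
The two estimates of lemma 2.9 can be checked on a small example. The sketch below assumes a hypothetical symmetric 2×2 matrix `H` in the role of $f''(x)$ at a fixed point, takes $m$ and $M$ to be its least and greatest eigenvalues, and tests both bounds on random vectors $p$.

```python
import math
import random

# Hypothetical symmetric 2x2 matrix standing in for f''(x) at some point.
H = [[3.0, 1.0], [1.0, 2.0]]

def eigen_bounds(h):
    """Least and greatest eigenvalues (m, M) of a symmetric 2x2 matrix."""
    mean = 0.5 * (h[0][0] + h[1][1])
    radius = math.hypot(0.5 * (h[0][0] - h[1][1]), h[0][1])
    return mean - radius, mean + radius

def inverse2(h):
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [[h[1][1] / det, -h[0][1] / det], [-h[1][0] / det, h[0][0] / det]]

def quad_form(mat, p):
    return sum(p[i] * mat[i][j] * p[j] for i in range(2) for j in range(2))

def inverse_bounds_hold(h, trials=1000, seed=5):
    # Check ||p||^2 / M <= (H^{-1} p, p) <= ||p||^2 / m on random p.
    m, big_m = eigen_bounds(h)
    h_inv = inverse2(h)
    rng = random.Random(seed)
    for _ in range(trials):
        p = [rng.uniform(-10, 10), rng.uniform(-10, 10)]
        norm2 = p[0] ** 2 + p[1] ** 2
        q = quad_form(h_inv, p)
        if not (norm2 / big_m - 1e-9 <= q <= norm2 / m + 1e-9):
            return False
    return True

print(eigen_bounds(H), inverse_bounds_hold(H))
```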

Concave Functions
Definition. If for any $x_1, x_2 \in E^n$ and any $0 \le \lambda \le 1$ the inequality
$$f(\lambda x_1 + (1 - \lambda) x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2)$$
is satisfied, then the function $f(x)$ is called concave.
It follows that the function $f(x)$ is concave if and only if the function $-f(x)$ is convex. Taking this into account, all the properties of concave functions can be obtained by a simple reformulation of the corresponding properties of convex functions.


In a way analogous to that used for convex functions, strictly and strongly concave functions can be defined and their properties studied.

3. CONVEX PROGRAMMING
The subject matter of convex programming is minimization of a convex function in a convex domain. Convex programming is the most elaborated part of mathematical programming.

Formulation of the Problem. Basic Properties

Given a convex continuous function $f(x)$ defined for all $x \in E^n$, and a convex set $X$. It is required to find the minimum of $f(x)$ in the set $X$, i.e. to find point $x_*$ such that
$$f(x_*) \le f(x), \quad x \in X.$$

Lemma 3.1. A convex continuous function $f(x)$ attains its minimum in a compact convex set $X$.
Proof. The statement is just a particular case of the well-known Weierstrass theorem which states that a continuous function attains its minimum in a compact set.
Lemma 3.2. Let $X$ be a closed set and $f(x)$ a twice continuously differentiable strongly convex function. Then $f(x)$ attains its minimum in $X$.
Proof. Let $x_0 \in X$. Consider the set
$$Y = \{x : f(x) \le f(x_0)\}.$$
By lemma 2.8 it is closed and bounded. Consider now the intersection $X \cap Y$. Obviously, if $x_*$ is the minimum point of $f(x)$ in the set $X \cap Y$, then this point is the minimum point of $f(x)$ in $X$ as well. But the set $X \cap Y$ is bounded and closed, being the intersection of two closed sets one of which is bounded. Therefore $f(x)$ attains its minimum in $X \cap Y$ and consequently in $X$ as a whole.
Convex and strictly convex functions can fail to attain their minimum (for example, $f(x) = e^x$ on the real line).
Lemma 3.3. A set of points $X_* \subset X$ at which the convex function $f(x)$ attains its minimum in $X$ is convex.
Lemma 3.4. A strictly convex function attains its minimum in a convex set $X$ at no more than one point.


Proof. Let $x_1$ and $x_2$ be different points of minimum of $f(x)$ in $X$. Then
$$f\!\left(\tfrac{1}{2} x_1 + \tfrac{1}{2} x_2\right) < \tfrac{1}{2} f(x_1) + \tfrac{1}{2} f(x_2) = f(x_1), \qquad \tfrac{1}{2} x_1 + \tfrac{1}{2} x_2 \in X.$$
This contradicts the fact that $x_1$ is a point of minimum of $f(x)$.

Necessary Conditions for a Minimum
Let $f(x)$ be a continuously differentiable convex function and $X$ a convex set. We have to consider the following question: if $x_*$ is the minimum point of $f(x)$ in $X$, what conditions are to be satisfied at this point?
Definition 3.1. Let $x_0 \in X$. We denote by $K(x_0)$ a set of vectors $p$ such that $p \in K(x_0)$ if and only if there is an $\alpha > 0$ such that $x_0 + \alpha p \in X$.
The set $K(x_0)$ is called the cone of admissible directions for $X$ at point $x_0$.
Lemma 3.5. $K(x_0)$ is a convex cone. If $p \in K(x_0)$ and $x_0 + \alpha_0 p \in X$, then $x_0 + \alpha p \in X$ with any $0 \le \alpha \le \alpha_0$.
Theorem 3.1. Let $x_*$ be the minimum point of a continuously differentiable convex function $f(x)$ in a convex set $X$. Then
$$f'(x_*) \in K^*(x_*). \tag{3.1}$$
Conversely, if (3.1) holds, then $x_*$ is the minimum point of $f(x)$ in $X$.
Proof. Let (3.1) be satisfied at point $x_*$. Then $(f'(x_*), p) \ge 0$, $p \in K(x_*)$. Further, if $x \in X$, then $p = x - x_* \in K(x_*)$, for $x_* + (x - x_*) = x \in X$. Therefore
$$(f'(x_*), x - x_*) \ge 0, \quad x \in X.$$
By lemma 2.5 we have for a convex function
$$f(x) - f(x_*) \ge (f'(x_*), x - x_*).$$
Hence
$$f(x) - f(x_*) \ge 0, \quad x \in X;$$
this shows that $x_*$ is the minimum point of $f(x)$ in $X$.
Let us now prove that condition (3.1) is necessary. Let $x_*$ be the minimum point. Then for any $x \in X$ and $\lambda$, $0 < \lambda \le 1$, we have
$$f((1 - \lambda) x_* + \lambda x) = f(x_* + \lambda (x - x_*)) \ge f(x_*)$$
or
$$\frac{f(x_* + \lambda (x - x_*)) - f(x_*)}{\lambda} \ge 0.$$


In the limit with $\lambda \to +0$ we obtain
$$(f'(x_*), x - x_*) \ge 0, \quad x \in X. \tag{3.2}$$
Let now $p \in K(x_*)$. Then $x_* + \alpha p = x \in X$, $\alpha > 0$, or
$$p = \frac{1}{\alpha}(x - x_*).$$
Then by (3.2) and with $\alpha > 0$, we have
$$(f'(x_*), p) = \frac{1}{\alpha}(f'(x_*), x - x_*) \ge 0. \tag{3.3}$$
Inequality (3.3) is valid for any $p \in K(x_*)$. Hence
$$f'(x_*) \in K^*(x_*).$$
Corollary 3.1. By theorem 3.1 point $x_*$ is the minimum point of $f(x)$ in $X$ if and only if the inequality
$$(f'(x_*), x - x_*) \ge 0, \quad x \in X$$
is satisfied. In fact, as just shown, (3.2) is equivalent to (3.1).
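
The criterion of corollary 3.1 lends itself to a simple numerical test. The sketch below assumes the function $f(x) = \tfrac{1}{2}\|x - c\|^2$ and the box $X = [0, 1]^2$, for which the minimum point is obtained by clipping $c$ to the box; the vector `C` and the sampling procedure are illustrative assumptions.

```python
import random

C = [1.5, -0.3]      # hypothetical target point
LO, HI = 0.0, 1.0    # X is the box [0, 1]^2

def grad(x):
    # f(x) = 0.5 * ||x - C||^2, so f'(x) = x - C.
    return [xi - ci for xi, ci in zip(x, C)]

def minimizer():
    # For this f the minimum over the box is the componentwise clipping of C.
    return [min(max(ci, LO), HI) for ci in C]

def satisfies_condition(x_star, trials=2000, seed=4):
    """Check (f'(x_star), x - x_star) >= 0 at randomly sampled x in X."""
    rng = random.Random(seed)
    g = grad(x_star)
    for _ in range(trials):
        x = [rng.uniform(LO, HI) for _ in C]
        if sum(gj * (xj - sj) for gj, xj, sj in zip(g, x, x_star)) < -1e-12:
            return False
    return True

print(minimizer(), satisfies_condition(minimizer()))
print(satisfies_condition([0.5, 0.5]))  # an interior point that is not the minimum
```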
Let us show how theorem 3.1 is applied to the case where domain $X$ is defined by a system of linear inequalities.
Given vectors $a_i \in E^n$, $i \in J^- \cup J^0$, where $J^-$ and $J^0$ are finite sets of indices, and corresponding numbers $b_i$. Let domain $X$ be defined by a system of equalities and inequalities:
$$(a_i, x) - b_i \le 0, \ i \in J^-, \qquad (a_i, x) - b_i = 0, \ i \in J^0. \tag{3.4}$$
Let us describe a cone $K(x_0)$ at an arbitrary point $x_0 \in X$. We set
$$J^-(x_0) = \{i : (a_i, x_0) - b_i = 0, \ i \in J^-\}.$$
By definition, $p \in K(x_0)$ if $x_0 + \alpha p \in X$ with sufficiently small $\alpha$. It is clear that $x_0 + \alpha p \in X$, i.e. that point $x_0 + \alpha p$ satisfies (3.4) with small $\alpha$, if and only if
$$(a_i, p) \le 0, \ i \in J^-(x_0), \qquad (a_i, p) = 0, \ i \in J^0. \tag{3.5}$$
Thus cone $K(x_0)$ is described by system (3.5), which we can rewrite in an equivalent form:
$$(-a_i, p) \ge 0, \ i \in J^-(x_0), \qquad (a_i, p) \ge 0, \ i \in J^0, \qquad (-a_i, p) \ge 0, \ i \in J^0.$$
By lemma 1.6 vector $y \in K^*(x_0)$ can be presented in the form
$$y = -\sum_{i \in J^-(x_0)} u^i a_i - \sum_{i \in J^0} u^{+i} a_i + \sum_{i \in J^0} u^{-i} a_i,$$
where $u^i$, $u^{+i}$, $u^{-i}$ are nonnegative numbers. Denoting $u^i = u^{+i} - u^{-i}$, $i \in J^0$, we obtain
$$y = -\sum_{i \in J^-(x_0)} u^i a_i - \sum_{i \in J^0} u^i a_i, \qquad u^i \ge 0, \ i \in J^-(x_0). \tag{3.6}$$


Theorem 3.2. Let $f(x)$ be a convex differentiable function and set $X$ be defined by system (3.4). Then for point $x_*$ to be the minimum point of $f(x)$ in $X$ it is necessary and sufficient that there exist numbers $u^i$, $i \in J^- \cup J^0$, such that
$$f'(x_*) + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \qquad u^i \ge 0, \ i \in J^-, \qquad u^i = 0 \ \text{if} \ (a_i, x_*) - b_i < 0, \ i \in J^-.$$
Proof. The result is obtained directly by using theorem 3.1 and $y$ in the form (3.6) for elements of $K^*(x_0)$ with $x_0 = x_*$, and also taking $u^i = 0$ for $i \notin J^-(x_*)$, $i \in J^-$.
Corollary 3.2. For point $x_*$ to be the minimum point of the convex differentiable function over the whole space it is necessary and sufficient to satisfy the equality
$$f'(x_*) = 0.$$
Corollary 3.3. For point $x_*$ to be the minimum point of the convex differentiable function in the set
$$x^j \ge 0, \quad j \in J',$$
where $J'$ is a subset of the set $j = 1, 2, \dots, n$, it is necessary and sufficient to satisfy the relations
$$\frac{\partial f(x_*)}{\partial x^j} \ge 0 \quad \text{if} \ x_*^j = 0, \ j \in J',$$
$$\frac{\partial f(x_*)}{\partial x^j} = 0 \quad \text{if} \ x_*^j \ne 0 \ \text{or} \ j \notin J'.$$
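
The relations of corollary 3.3 translate directly into a componentwise test. The following sketch assumes the function $f(x) = \tfrac{1}{2}\|x - c\|^2$, whose gradient is $x - c$; the function name `satisfies_corollary`, the vector `C_VEC` and the index set `CONSTRAINED` (indices counted from zero) are hypothetical.

```python
C_VEC = [-2.0, 3.0, -1.0]   # hypothetical data for f(x) = 0.5 * ||x - c||^2
CONSTRAINED = {0, 1}        # indices j with the constraint x^j >= 0

def grad(x):
    return [xi - ci for xi, ci in zip(x, C_VEC)]

def satisfies_corollary(x, tol=1e-12):
    """Feasibility plus the two relations of corollary 3.3, component by component."""
    for j, (xj, gj) in enumerate(zip(x, grad(x))):
        if j in CONSTRAINED:
            if xj < -tol:
                return False          # infeasible point
            if abs(xj) <= tol:
                if gj < -tol:
                    return False      # derivative must be >= 0 where x^j = 0
            elif abs(gj) > tol:
                return False          # derivative must vanish where x^j > 0
        elif abs(gj) > tol:
            return False              # derivative must vanish for unconstrained j
    return True

# Clip c at zero on the constrained coordinates, keep it elsewhere.
x_star = [max(cj, 0.0) if j in CONSTRAINED else cj for j, cj in enumerate(C_VEC)]
print(x_star, satisfies_corollary(x_star))
```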

The Kuhn-Tucker Theorem

The necessary and sufficient conditions for a minimum considered above were based on an abstract description of an admissible set $X$ in which function $f(x)$ was minimized. In a broad class of problems the set is defined by a system of inequalities and equalities. This section considers the necessary conditions for a minimum in this concrete case.
Given convex functions $f_i(x)$, $i = 0, 1, \dots, m$ and convex set $X$. It is required to minimize $f_0(x)$ with the following constraints
$$f_i(x) \le 0, \ i = 1, \dots, m, \quad x \in X. \tag{3.7}$$
Theorem 3.3 (Kuhn-Tucker). Let $x_*$ be the minimum point of $f_0(x)$ with the constraints (3.7) and let there be a point $x_1 \in X$ such that
$$f_i(x_1) < 0, \quad i = 1, \dots, m.$$


Then there are numbers $u^i \ge 0$, $i = 1, \dots, m$ such that
$$f_0(x_*) + \sum_{i=1}^{m} u^i f_i(x_*) \le f_0(x) + \sum_{i=1}^{m} u^i f_i(x), \quad x \in X,$$
$$u^i f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{3.8}$$
These conditions are necessary and sufficient.
Definition 3.2. Numbers $u^i$ used in the theorem are called Lagrange multipliers.

Dual Problem
Consider again the problem of minimization of convex function $f_0(x)$ with constraints (3.7). Let $u^i \ge 0$, $i = 1, \dots, m$ be fixed. Let us compute
$$\varphi(u) = \inf_{x \in X} \left[f_0(x) + \sum_{i=1}^{m} u^i f_i(x)\right]. \tag{3.9}$$
Thus function $\varphi(u)$ with $u \ge 0$ has been determined; it can take the value $-\infty$ as well. We leave it to the reader to prove that $\varphi(u)$ is a concave function.
Theorem 3.4. Let $u \ge 0$ and $x$ satisfy the constraints (3.7). Then
$$\varphi(u) \le f_0(x).$$
If however the conditions of theorem 3.3 are satisfied, then
$$\max_{u \ge 0} \varphi(u) = \min_{x \in D} f_0(x)$$
where $D$ is a set of points $x$ which satisfy (3.7).
Proof. For $x \in D$, $u \ge 0$ we have
$$\varphi(u) \le f_0(x) + \sum_{i=1}^{m} u^i f_i(x) \le f_0(x).$$
Let now the conditions of theorem 3.3 be satisfied. Then there exists a vector $u_0 \ge 0$ such that the relations (3.8) are satisfied for it. These relations imply
$$\varphi(u_0) = f_0(x_*) + \sum_{i=1}^{m} u_0^i f_i(x_*) = f_0(x_*),$$
and since $\varphi(u) \le f_0(x_*)$ it follows that vector $u_0$ provides the maximum of function $\varphi(u)$ in the domain $u \ge 0$ and
$$\max \varphi(u) = \varphi(u_0) = f_0(x_*) = \min f_0(x).$$
Q.E.D.
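
Both statements of theorem 3.4 can be followed on a one-dimensional example. The sketch below assumes the hypothetical problem of minimizing $f_0(x) = x^2$ subject to $f_1(x) = 1 - x \le 0$ with $X = E^1$. Here the infimum in (3.9) is attained at $x = u/2$, so that $\varphi(u) = u - u^2/4$; the ternary search used to maximize $\varphi$ is an illustrative choice, not a method from the text.

```python
def f0(x):
    return x * x

def f1(x):
    return 1.0 - x

def phi(u):
    # Inner infimum of x^2 + u * (1 - x) over all real x: derivative 2x - u = 0.
    x = 0.5 * u
    return f0(x) + u * f1(x)

def maximize_concave(func, lo, hi, iters=200):
    """Ternary search for the maximum of a concave function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

u_star = maximize_concave(phi, 0.0, 10.0)
x_star = 1.0  # the constraint is active at the primal minimum

# Weak duality on a grid: phi(u) <= f0(x) for u >= 0 and feasible x >= 1.
weak_duality = all(
    phi(0.1 * i) <= f0(1.0 + 0.1 * j) + 1e-12
    for i in range(100) for j in range(100)
)

print(u_star, phi(u_star), f0(x_star), weak_duality)
```

In this example the maximum of the dual function and the minimum of the primal problem coincide, and the maximizing $u$ is the Lagrange multiplier of the single constraint.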


The problem of maximization of $\varphi(u)$ with the constraint $u \ge 0$ is known as the dual problem of convex programming and $u$ as the vector of dual variables.
The essence of theorem 3.4 can now be interpreted as follows: under the conditions of the Kuhn-Tucker theorem the value of the maximum of the objective function in the dual problem is that of the minimum of the objective function of the primal problem. The Lagrange multipliers of the primal problem are at the same time the solution of the dual problem.
The problem of convex programming often arises in the form of the minimization of $f_0(x)$ with the constraints
$$f_i(x) \le 0, \ i \in J^-, \qquad f_i(x) = 0, \ i \in J^0, \qquad x \in X \tag{3.10}$$
where $J^-$ and $J^0$ are finite sets of indices, $f_0(x)$ and $f_i(x)$, $i \in J^-$, are convex functions of $x$, $f_i(x)$, $i \in J^0$, are linear functions and $X$ is a convex set.
The dual problem for this case is formulated as a maximization problem of $\varphi(u)$ with the constraints $u^i \ge 0$, $i \in J^-$, where $u$ has components $u^i$, $i \in J^- \cup J^0$, and
$$\varphi(u) = \inf_{x \in X} \left[f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x)\right]. \tag{3.11}$$
Thus the number of dual variables is equal to the number of constraints (3.10), and the variable $u^i$ corresponding to the $i$-th constraint takes nonnegative values if it corresponds to an inequality constraint and arbitrary values if it corresponds to an equality constraint.

Problem of Linear Programming
The problem of linear programming is the problem of minimization of the function $f_0(x) = (a_0, x)$ subject to constraints (3.4)
$$(a_i, x) - b_i \le 0, \ i \in J^-, \qquad (a_i, x) - b_i = 0, \ i \in J^0.$$
This problem coincides with the problem (3.10) if
$$f_i(x) = (a_i, x) - b_i, \quad X = E^n.$$
Lemma 3.6. If constraints (3.4) are consistent, then the problem of linear programming either has a solution $x_*$ or the value of the lower bound of $f_0(x) = (a_0, x)$ with the constraints (3.4) is $-\infty$.
The proof of this lemma can be found in textbooks on linear programming.
The necessary conditions characterizing $x_*$, the solution of the problem of linear programming, are obtained just by reformulating theorem 3.2, since $f_0'(x) = a_0$.


Theorem 3.5. In order that point $x_*$ be the solution of a problem of linear programming it is necessary and sufficient that there be numbers $u^i$, $i \in J^- \cup J^0$, such that
$$a_0 + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \qquad u^i \ge 0, \ i \in J^-, \qquad u^i = 0 \ \text{if} \ (a_i, x_*) - b_i < 0, \ i \in J^-. \tag{3.12}$$
Let us construct the dual of the problem of linear programming. By definition,
$$\varphi(u) = \inf_{x \in E^n} \left[f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x)\right] = \inf_{x \in E^n} \left[(a_0, x) + \sum_{i \in J^- \cup J^0} u^i ((a_i, x) - b_i)\right]$$
$$= \inf_{x \in E^n} \left[\left(a_0 + \sum_{i \in J^- \cup J^0} u^i a_i, \ x\right) - \sum_{i \in J^- \cup J^0} u^i b_i\right] = \begin{cases} -\sum_{i \in J^- \cup J^0} u^i b_i & \text{if} \ a_0 + \sum u^i a_i = 0, \\ -\infty & \text{if} \ a_0 + \sum u^i a_i \ne 0. \end{cases}$$
Thus the dual of the problem of linear programming, i.e. the problem of the maximization of $\varphi(u)$ with $u^i \ge 0$, $i \in J^-$, is equivalent to the maximization of
$$-\sum_{i \in J^- \cup J^0} u^i b_i \tag{3.13}$$
with the constraints
$$a_0 + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \qquad u^i \ge 0, \ i \in J^-. \tag{3.14}$$
Theorem 3.6. If the primal problem of linear programming has a solution, then the Lagrange multipliers are the solution of the dual problem, and at the same time the value of the minimum of the objective function of the primal problem is equal to the value of the maximum of the objective function of the dual problem.
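
Theorem 3.6 can be followed on a small problem solved by hand. The sketch below assumes the hypothetical data $a_0 = (-1, -1)$ with the two inequality constraints $x^1 \le 1$, $x^2 \le 1$ (so $J^0$ is empty); the candidate solutions `x_star` and `u_star` are supplied by hand and merely verified against (3.4), (3.12), (3.13) and (3.14).

```python
A0 = [-1.0, -1.0]               # objective f0(x) = (a0, x)
A = [[1.0, 0.0], [0.0, 1.0]]    # constraint vectors a_i, all of inequality type
B = [1.0, 1.0]                  # right-hand sides b_i

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def primal_value(x):
    return dot(A0, x)

def dual_value(u):
    # Objective (3.13): -sum_i u^i b_i.
    return -dot(u, B)

def primal_feasible(x, tol=1e-12):
    return all(dot(a, x) - b <= tol for a, b in zip(A, B))

def dual_feasible(u, tol=1e-12):
    # Constraints (3.14): a0 + sum_i u^i a_i = 0 and u^i >= 0.
    residual = [A0[j] + sum(ui * a[j] for ui, a in zip(u, A)) for j in range(len(A0))]
    return all(ui >= -tol for ui in u) and all(abs(r) <= tol for r in residual)

def complementary(x, u, tol=1e-12):
    # Last condition of (3.12): u^i = 0 whenever constraint i is inactive.
    return all(abs(ui) <= tol or abs(dot(a, x) - b) <= tol
               for ui, a, b in zip(u, A, B))

x_star = [1.0, 1.0]
u_star = [1.0, 1.0]
print(primal_feasible(x_star), dual_feasible(u_star), complementary(x_star, u_star))
print(primal_value(x_star), dual_value(u_star))
```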
In addition to constraints (3.4), problems of linear programming often contain constraints of the type
$$x^j \ge 0, \quad j \in J' \tag{3.4'}$$
where $J'$ is a subset of the set $j = 1, 2, \dots, n$. Using theorem 3.6 the reader can easily prove the following.
Theorem 3.7. If a problem of linear programming with constraints (3.4), (3.4') has a solution, then the Lagrange multipliers corresponding to constraints (3.4) are the solution of the dual problem: the maximization of
$$-\sum_{i \in J^- \cup J^0} u^i b_i$$
with constraints
$$a_0^j + \sum_{i \in J^- \cup J^0} u^i a_i^j \ge 0, \quad j \in J',$$
$$a_0^j + \sum_{i \in J^- \cup J^0} u^i a_i^j = 0, \quad j \notin J', \qquad u^i \ge 0, \ i \in J^-,$$
where $a_i^j$ is the $j$-th component of vector $a_i$. The values of the minimum of the objective function of the primal problem and of the maximum of the dual one coincide.

Problem of Quadratic Programming
The problem of quadratic programming consists in the minimization of the quadratic function
$$f_0(x) = \tfrac{1}{2}(x, Cx) + (d, x)$$
with constraints (3.4); here $C$ is an $n \times n$ symmetric, positive definite matrix, and $d$ is an $n$-dimensional vector.
Lemma 3.7. In a problem of quadratic programming the lower bound is either attained or is $-\infty$.
The proof of this lemma will be omitted.
Theorem 3.8. In order that point $x_*$ be the solution of a problem of quadratic programming, it is necessary and sufficient that there be numbers $u^i$, $i \in J^- \cup J^0$, such that
$$C x_* + d + \sum_{i \in J^- \cup J^0} u^i a_i = 0,$$
$$u^i = 0 \ \text{if} \ (a_i, x_*) - b_i < 0, \ i \in J^-; \qquad u^i \ge 0, \ i \in J^-.$$
The statement can be proved by directly using theorem 3.2.
Let now matrix $C$ be strictly positive definite, i.e. there is a $\gamma > 0$ such that $(x, Cx) \ge \gamma \|x\|^2$. In this case matrix $C$ is nonsingular and has an inverse matrix $C^{-1}$. Let us construct the dual problem:
$$\varphi(u) = \inf_{x \in E^n} \left[f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x)\right] = \inf_{x \in E^n} \left[\tfrac{1}{2}(x, Cx) + (d, x) + \sum_{i \in J^- \cup J^0} u^i ((a_i, x) - b_i)\right]$$
$$= \inf_{x \in E^n} \left[-\sum_{i \in J^- \cup J^0} u^i b_i + \tfrac{1}{2}(x, Cx) + \left(x, \ d + \sum_{i \in J^- \cup J^0} u^i a_i\right)\right].$$
Equating the derivatives of the right-hand side to zero, we find that the minimum is attained with
$$x(u) = -C^{-1}\left(d + \sum_{i \in J^- \cup J^0} u^i a_i\right).$$
At the same time
$$\varphi(u) = -\sum_{i \in J^- \cup J^0} u^i b_i - \frac{1}{2}\left(d + \sum_{i \in J^- \cup J^0} u^i a_i, \ C^{-1}\left(d + \sum_{i \in J^- \cup J^0} u^i a_i\right)\right). \tag{3.15}$$
Thus the dual problem consists in the maximization of (3.15) with constraints $u^i \ge 0$, $i \in J^-$.
Theorem 3.9. If the minimum in the problem of quadratic programming is attained and matrix $C$
is strictly positive definite, then the Kuhn-Tucker theorem and theorem 3.4 are valid for the
problem of quadratic programming. In this case the Lagrange multipliers of the primal problem
are the solution of the dual problem, and if $u_*$ is the solution of the dual problem, then the
solution of the primal one can be found by the following formula:
$$x(u_*) = -C^{-1}\Big(d + \sum_{i \in J^- \cup J^0} u_*^i a_i\Big). \tag{3.16}$$
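The formulas for $x(u)$, (3.15) and (3.16) can be tried out on a small example. The Python sketch below is only an illustration under stated assumptions: the data `C`, `d`, `a_list`, `b` and the helper names (`solve`, `dot`, `shifted`, `x_of_u`, `phi`) are hypothetical and do not come from the text, and the dual is maximized in closed form, which is possible here only because the example has a single equality constraint.

```python
# Illustrative sketch with hypothetical data and helper names.
# Primal problem: minimize 0.5*(x, Cx) + (d, x) subject to (a_1, x) - b_1 = 0.

def solve(A, rhs):
    """Solve A y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (M[i][n] - s) / M[i][i]
    return y

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def shifted(d, a_list, u):
    """Return the vector d + sum_i u^i a_i."""
    return [d[k] + sum(ui * a[k] for ui, a in zip(u, a_list))
            for k in range(len(d))]

def x_of_u(C, d, a_list, u):
    """x(u) = -C^{-1} (d + sum_i u^i a_i)."""
    return [-v for v in solve(C, shifted(d, a_list, u))]

def phi(C, d, a_list, b, u):
    """Dual function (3.15)."""
    s = shifted(d, a_list, u)
    return -dot(u, b) - 0.5 * dot(s, solve(C, s))

C = [[2.0, 0.0], [0.0, 2.0]]   # strictly positive definite
d = [-2.0, -4.0]
a_list = [[1.0, 1.0]]          # one equality constraint: x^1 + x^2 = 1
b = [1.0]

# With one equality constraint phi is a concave parabola in u^1, so its
# maximizer follows from d(phi)/du = -b - (a, C^{-1}(d + u a)) = 0.
Cinv_a = solve(C, a_list[0])
Cinv_d = solve(C, d)
u_star = [-(b[0] + dot(a_list[0], Cinv_d)) / dot(a_list[0], Cinv_a)]
x_star = x_of_u(C, d, a_list, u_star)   # formula (3.16)
```

For these data the recovered point satisfies the constraint, and the primal objective at `x_star` coincides with `phi` at `u_star`, as theorem 3.9 leads one to expect. With inequality constraints the maximization over $u^i \ge 0$ would need an iterative method instead of the closed form used here.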

4. NECESSARY CONDITIONS FOR A MINIMUM
The general problem of mathematical programming consists in minimizing function $f_0(x)$,
$x \in E^n$, in a set defined by a system of equalities and inequalities
$$f_i(x) \le 0,\ i \in J^-; \qquad f_i(x) = 0,\ i \in J^0; \qquad x \in X, \tag{4.1}$$
where $J^-$ and $J^0$ are finite sets of indices. In this section it is always assumed that
$f_i(x)$ are continuously differentiable functions whose gradient is $f_i'(x)$. No assumption is
made about set $X$ for the present.
The main object of this section is to deduce the necessary conditions which must be satisfied
at point $x_*$ providing the minimum of $f_0(x)$ subject to constraints (4.1).

Basic Definitions
Definition 4.1. A set $D$ of points which satisfy constraints (4.1) is called an admissible domain.
We assume that this set is nonempty.
Definition 4.2. Function $f_0(x)$ being minimized in $D$ is called an objective function.
Definition 4.3. Point $x_*$ satisfying (4.1) for which
$$f_0(x_*) \le f_0(x), \quad x \in D,$$
is called the minimum point.
Definition 4.4. Point $x_*$ is called a point of local minimum of $f_0(x)$ in $D$ if there is
a neighbourhood $\Omega$ of point $x_*$ such that
$$f_0(x_*) \le f_0(x), \quad x \in D \cap \Omega.$$
In what follows the problem of the minimization of $f_0(x)$ will generally be considered.
Obviously, the problem of the maximization of a function $f(x)$ in $D$ is reduced to that of the
minimization in $D$ of the function $f_0(x) = -f(x)$.

Necessary Conditions for a Minimum
Definition 4.5. Vector $p \in E^n$ defines the admissible direction with respect to set $X$ at
point $x_0 \in X$ if for any vectors $e_i \in E^n$, $i \in J^0$, and any functions $r^i(\lambda)$,
$i \in J^0$, which satisfy the condition
$$\lim_{\lambda \to 0} \frac{r^i(\lambda)}{\lambda} = 0, \tag{4.2}$$
the inclusion
$$x_0 + \lambda p + \sum_{i \in J^0} r^i(\lambda) e_i \in X \tag{4.3}$$
is valid with sufficiently small $\lambda > 0$.
The basic result to be proved in this section can now be formulated.
Theorem 4.1. Let $x_*$ be a point of local minimum of $f_0(x)$ in $D$. Besides, let the set of
admissible directions with respect to set $X$ at point $x_*$ form a convex cone $K(x_*)$. Then
there are numbers $u^0$, $u^i$, $i \in J^- \cup J^0$, not all zero, such that
$$u^0 f_0'(x_*) + \sum_{i \in J^- \cup J^0} u^i f_i'(x_*) \in K^*(x_*),$$
$$u^i f_i(x_*) = 0,\ i \in J^-; \qquad u^i \ge 0,\ i = 0 \ \text{or}\ i \in J^-. \tag{4.4}$$
Proof. Consider two cases.
(1) Vectors $f_i'(x_*)$, $i \in J^0$, are linearly dependent. Then there are numbers $u^i$,
$i \in J^0$, not all zero, such that
$$\sum_{i \in J^0} u^i f_i'(x_*) = 0.$$
Taking $u^0 = 0$, $u^i = 0$, $i \in J^-$, we see that all the conditions of theorem 4.1 are satisfied.
(2) Vectors $f_i'(x_*)$, $i \in J^0$, are linearly independent. Then there are vectors $e_j$,
$j \in J^0$, such that
$$(f_i'(x_*), e_j) = \delta_{ij}, \quad i, j \in J^0,$$
where $\delta_{ij} = 0$ if $i \ne j$ and $\delta_{ii} = 1$.
Let the total number of indices $i$ in the set $J^- \cup J^0$ be $m$. Consider set $Z$ in space
$E^{m+1}$, the set being defined as follows. Vector $z$ belongs to $Z$ if and only if there is
a vector $p \in K(x_*)$ such that $z^i = (f_i'(x_*), p)$ if $f_i(x_*) = 0$, $i \in J^- \cup J^0$,
or $i = 0$. The components $z^i$ (of vector $z \in Z$) for which $f_i(x_*) < 0$ are arbitrary.
Since $K(x_*)$ is a convex cone it is easily seen that $Z$ is a convex cone too.
Let us now define the set $P$. Vector $w$ belongs to $P$ if and only if
$$w^i < 0 \ \text{with}\ f_i(x_*) = 0,\ i \in J^-,\ \text{or}\ i = 0,$$
$$w^i = 0 \ \text{with}\ i \in J^0.$$
The remaining components of vector $w$ are arbitrary. Obviously, $P$ is a convex set too.
Let us demonstrate that $Z$ and $P$ do not intersect. Suppose that the opposite is true. Then
there must be a vector $p_0 \in K(x_*)$ such that
$$(f_i'(x_*), p_0) < 0 \ \text{with}\ i = 0 \ \text{or}\ i \in J^-,\ f_i(x_*) = 0,$$
and
$$(f_i'(x_*), p_0) = 0 \ \text{with}\ i \in J^0. \tag{4.5}$$
We now construct a system of equations in functions $r^i(\lambda)$, $i \in J^0$:
$$f_i\Big(x_* + \lambda p_0 + \sum_{j \in J^0} r^j e_j\Big) = 0, \quad i \in J^0. \tag{4.6}$$
Let us denote
$$g_i(\lambda, r) = f_i\Big(x_* + \lambda p_0 + \sum_{j \in J^0} r^j e_j\Big).$$
Then from (4.6) we have
$$g_i(\lambda, r) = 0, \quad i \in J^0, \tag{4.7}$$
which defines $r^i$ as implicit functions of $\lambda$. Since it was assumed that $f_i(x)$ are
continuously differentiable functions, the functions $g_i(\lambda, r)$ are also continuously
differentiable in $\lambda$ and $r^j$. Then using (4.5) we write
$$\frac{\partial g_i(0, 0)}{\partial \lambda} = (f_i'(x_*), p_0) = 0, \tag{4.8}$$
$$\frac{\partial g_i(0, 0)}{\partial r^j} = (f_i'(x_*), e_j) = \delta_{ij}. \tag{4.9}$$

Let us denote by $\partial g / \partial r$ a matrix whose components are
$\partial g_i / \partial r^j$, $i, j \in J^0$. By the theorem on implicit functions, system (4.7)
can be solved for $r$ with small $\lambda$ if matrix $\partial g(0, 0) / \partial r$ is
nonsingular. In this case $r(\lambda)$ is a differentiable function of $\lambda$, $r(0) = 0$ and
$$r'(\lambda) = -\Big(\frac{\partial g}{\partial r}\Big)^{-1} \frac{\partial g}{\partial \lambda}, \tag{4.10}$$
where $\partial g / \partial \lambda$ is a vector whose components are
$\partial g_i / \partial \lambda$. In the case under consideration
$$\frac{\partial g(0, 0)}{\partial r} = I, \qquad \frac{\partial g(0, 0)}{\partial \lambda} = 0, \tag{4.11}$$
where $I$ is an identity matrix. This follows from (4.8) and (4.9).
Thus we see that with small $\lambda$ continuously differentiable functions $r^i(\lambda)$,
$i \in J^0$, are defined. From (4.10) and (4.11) we have
$$\lim_{\lambda \to 0} \frac{r^i(\lambda) - r^i(0)}{\lambda} = \lim_{\lambda \to 0} \frac{r^i(\lambda)}{\lambda} = r^{i\prime}(0) = 0. \tag{4.12}$$
Let now $x(\lambda) = x_* + \lambda p_0 + \sum_{i \in J^0} r^i(\lambda) e_i$. Then
$x(\lambda) \in X$ with small $\lambda > 0$ by the definition of $K(x_*)$, for $p_0 \in K(x_*)$.
Further, $f_i(x(\lambda)) = 0$, $i \in J^0$, since $r^i(\lambda)$ satisfy (4.6) by definition.
Further, $f_0(x(\lambda)) < f_0(x_*)$ with small $\lambda > 0$. In fact, by Taylor's formula
$$f_0(x(\lambda)) = f_0(x_*) + (f_0'(\xi), x(\lambda) - x_*),$$
where $\xi$ is a point of the segment joining $x_*$ and $x(\lambda)$. Therefore
$$\frac{f_0(x(\lambda)) - f_0(x_*)}{\lambda} = (f_0'(\xi), p_0) + \sum_{i \in J^0} \frac{r^i(\lambda)}{\lambda} (f_0'(\xi), e_i).$$
Since from (4.5) $(f_0'(x_*), p_0) < 0$, and since $r^i(\lambda)/\lambda \to 0$ and
$\xi \to x_*$ as $\lambda \to 0$, we obtain with small positive $\lambda$
$$\frac{f_0(x(\lambda)) - f_0(x_*)}{\lambda} < 0.$$
Similarly, if $i \in J^-$ and $f_i(x_*) = 0$, then by (4.5)
$$f_i(x(\lambda)) < 0, \quad f_i(x_*) = 0.$$
If $f_i(x_*) < 0$, $i \in J^-$, then $f_i(x(\lambda)) < 0$ by continuity.
Thus point $x(\lambda)$ with small positive $\lambda$ satisfies all constraints (4.1) and
$f_0(x(\lambda)) < f_0(x_*)$. But this contradicts the fact that $x_*$ is a point of local minimum.

The contradiction obtained shows that sets $Z$ and $P$ do not intersect. Since these sets are
convex, they can be separated. This means that there are numbers $u^0$, $u^i$,
$i \in J^- \cup J^0$, not all of them zero, such that
$$u^0 z^0 + \sum_{i \in J^- \cup J^0} u^i z^i \ge u^0 w^0 + \sum_{i \in J^- \cup J^0} u^i w^i, \quad z \in Z,\ w \in P. \tag{4.13}$$
The structure of sets $Z$ and $P$ makes it possible to draw certain conclusions about numbers
$u^i$. By the definition of $P$, $w^0$ can take any value less than zero. Hence $u^0 \ge 0$,
otherwise the right-hand side could take arbitrarily large values and this contradicts (4.13).
Similarly,
$$u^i \ge 0 \ \text{if}\ f_i(x_*) = 0,\ i \in J^-. \tag{4.14}$$
Further, if $i \in J^-$ and $f_i(x_*) < 0$, then $w^i$ is arbitrary. Therefore for the
inequality (4.13) to be valid it is necessary that
$$u^i = 0 \ \text{if}\ f_i(x_*) < 0,\ i \in J^-. \tag{4.15}$$
Letting now $w$ in (4.13) tend to zero so that $w \in P$ and taking into account (4.15) and the
definition of $Z$, we obtain
$$u^0 (f_0'(x_*), p) + \sum_{i \in J^- \cup J^0} u^i (f_i'(x_*), p) \ge 0, \quad p \in K(x_*),$$
or
$$\Big(u^0 f_0'(x_*) + \sum_{i \in J^- \cup J^0} u^i f_i'(x_*),\ p\Big) \ge 0, \quad p \in K(x_*). \tag{4.16}$$
The statements proved (4.14), (4.15), (4.16) are obviously equivalent to the above theorem.
This completes the proof.
Corollary 4.1. If $X = E^n$, then for point $x_*$ to provide a local minimum it is necessary
that there be numbers $u^i$, not all zero, such that
$$u^0 f_0'(x_*) + \sum_{i \in J^- \cup J^0} u^i f_i'(x_*) = 0,$$
$$u^0 \ge 0, \quad u^i \ge 0,\ i \in J^-, \quad u^i f_i(x_*) = 0,\ i \in J^-. \tag{4.17}$$
Proof. If $X = E^n$, then any direction $p$ is admissible, i.e. $K(x_*) = E^n$. Therefore cone
$K^*(x_*)$ consists of the zero vector alone and relations (4.4) directly take the form (4.17).

Corollary 4.2. For point $x_*$ to provide the minimum of $f_0(x)$ in the domain
$$x^j \ge 0, \quad j \in I',$$
where $I'$ is a subset of the set $I = \{1, 2, \dots, n\}$, it is necessary to satisfy the conditions
$$\frac{\partial f_0(x_*)}{\partial x^i} \ge 0 \ \text{if}\ x_*^i = 0,\ i \in I',$$
$$\frac{\partial f_0(x_*)}{\partial x^i} = 0 \ \text{if}\ x_*^i > 0,\ i \in I',\ \text{or}\ i \notin I'. \tag{4.18}$$
Proof. Constraints $x^j \ge 0$, $j \in I'$, can be rewritten in the form $(-a_j, x) \le 0$,
$j \in I'$, where $a_j$ is a vector whose components are $a_j^i = \delta_{ij}$,
$i = 1, \dots, n$. Using the preceding corollary we obtain that there are numbers $u^0$ and
$u^j$, $j \in I'$, not all zero, such that
$$u^0 f_0'(x_*) - \sum_{j \in I'} u^j a_j = 0, \quad u^0, u^j \ge 0, \quad u^j x_*^j = 0, \quad j \in I'. \tag{4.19}$$
The first equality can be written in terms of components in the following form:
$$u^0 \frac{\partial f_0(x_*)}{\partial x^i} - \sum_{j \in I'} u^j \delta_{ij} = 0$$
or
$$u^0 \frac{\partial f_0(x_*)}{\partial x^i} = u^i,\ i \in I'; \qquad u^0 \frac{\partial f_0(x_*)}{\partial x^i} = 0,\ i \notin I'. \tag{4.20}$$
It follows from (4.20) that $u^0 > 0$, since if $u^0 = 0$ then all $u^i = 0$, but this
contradicts corollary 4.1. Therefore we can assume that $u^0 = 1$.
It follows directly from (4.20) and (4.19) that corollary 4.2 is valid.
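Conditions (4.18) are easy to test numerically at a candidate point. The following Python sketch is an illustration only: the function name `satisfies_4_18`, the tolerance `tol`, and the quadratic example are assumptions introduced here, coordinates are indexed from 0 rather than 1, and feasibility of the point ($x^j \ge 0$) is assumed rather than checked.

```python
def satisfies_4_18(grad, x, constrained, tol=1e-9):
    """Check conditions (4.18) at x for the domain x^j >= 0, j in `constrained`.

    grad        -- gradient of f0 at x (list of partial derivatives)
    constrained -- set of 0-based indices j with the constraint x^j >= 0
    """
    for i, (g, xi) in enumerate(zip(grad, x)):
        if i in constrained and abs(xi) <= tol:
            if g < -tol:       # on the boundary the derivative must be >= 0
                return False
        elif abs(g) > tol:     # elsewhere the derivative must vanish
            return False
    return True

# Hypothetical example: f0(x) = (x1 + 1)^2 + (x2 - 2)^2 with x1 >= 0 only.
def grad_f0(x):
    return [2.0 * (x[0] + 1.0), 2.0 * (x[1] - 2.0)]

x_min = [0.0, 2.0]     # constrained minimizer
x_other = [1.0, 2.0]   # feasible, but not a minimizer
ok_min = satisfies_4_18(grad_f0(x_min), x_min, {0})
ok_other = satisfies_4_18(grad_f0(x_other), x_other, {0})
```

Here `ok_min` is true and `ok_other` is false. Since (4.18) is only a necessary condition, a point that passes the check is a candidate for the minimum, not a certified minimizer.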
Definition 4.6. The point $x_*$ providing the minimum of $f_0(x)$ with constraints (4.1), where
$X = E^n$, is called a regular point if gradients $f_i'(x_*)$ for indices $i$ such that
$i \in J^- \cup J^0$, $f_i(x_*) = 0$ are linearly independent.
Corollary 4.3. If $x_*$ is a regular point, then in (4.17) we can take $u^0 = 1$ and the
multipliers $u^i$, $i \in J^- \cup J^0$, are unique.
Proof. In fact, $u^0 > 0$, since if $u^0 = 0$ then by (4.17) the gradients $f_i'(x_*)$ for which
$i \in J^- \cup J^0$, $f_i(x_*) = 0$ would prove linearly dependent. Further, by (4.17)
$u^i = 0$ if $f_i(x_*) < 0$. Therefore the first of relations (4.17) with $u^0 = 1$ yields the
expansion
$$f_0'(x_*) = -\sum_{i \in J^- \cup J^0} u^i f_i'(x_*)$$
of vector $f_0'(x_*)$ in linearly independent vectors $f_i'(x_*)$ and defines uniquely $u^i$.
Let there be now only equality constraints in problem (4.1)
$$f_i(x) = 0, \quad i \in J^0,$$
and $X = E^n$. If with such constraints $x_*$ provides the minimum of $f_0(x)$ and gradients
$f_i'(x_*)$ are linearly independent, then the necessary conditions for a minimum (4.17) can be
written in the form
$$f_0'(x_*) + \sum_{i \in J^0} u^i f_i'(x_*) = 0.$$
The set of vectors $p$ such that
$$(f_i'(x_*), p) = 0, \quad i \in J^0,$$
in the case under consideration is called a tangent (bounding) manifold at point $x_*$ to the set
$$D = \{x: f_i(x) = 0,\ i \in J^0\}.$$
Corollary 4.4. For point $x_*$ at which $f_i'(x_*)$, $i \in J^0$, are linearly independent to
provide the minimum of the function $f_0(x)$ in set $D$ it is necessary that gradient
$f_0'(x_*)$ be orthogonal to a manifold tangent to $D$ at point $x_*$, i.e. if $p$ belongs to
the bounding manifold then $(f_0'(x_*), p) = 0$. In other words the projection of vector
$f_0'(x_*)$ on the tangent manifold vanishes.
Proof. If $x_*$ with the above assumptions provides the minimum then
$$(f_0'(x_*), p) = -\sum_{i \in J^0} u^i (f_i'(x_*), p) = 0$$
for any vector $p$ of the tangent manifold. Conversely, if $(f_0'(x_*), p)$ is equal to zero
with any $p$ which belongs to the bounding manifold, then we can write
$$f_0'(x_*) = -\sum_{i \in J^0} u^i f_i'(x_*).$$
This follows from lemma 1.6 if each of the equalities $(f_i'(x_*), p) = 0$ is written down in
the form of two inequalities:
$$(f_i'(x_*), p) \ge 0, \quad -(f_i'(x_*), p) \ge 0.$$

Minimax Problem
It is required to find the minimum point of the function
$$f(x) = \max_{i = 1, \dots, m} f_i(x), \tag{4.21}$$
where $f_i(x)$ are continuously differentiable functions, $x \in E^n$. In order to apply the
results obtained in the preceding subsection let us reduce the problem of minimization of
$f(x)$ to the equivalent problem of mathematical programming. It is easily seen that if we
introduce a supplementary variable $x^{n+1}$, then $x_*$, the point of minimum of $f(x)$, will
also be the solution of the following problem: to find the minimum of
$g_0(x, x^{n+1}) = x^{n+1}$ with the constraints
$$g_i(x, x^{n+1}) \equiv f_i(x) - x^{n+1} \le 0, \quad i = 1, \dots, m. \tag{4.22}$$
The minimum value of $g_0(x, x^{n+1})$ is $x_*^{n+1} = f(x_*)$.
Let us apply corollary 4.1 of theorem 4.1 to problem (4.22). It is necessary to take into
account that the problem will now be solved in space $E^{n+1}$ of variables
$x^1, \dots, x^n, x^{n+1}$ so that the gradients of the functions $g_i(x, x^{n+1})$ have the form
$$g_i'(x, x^{n+1}) = \begin{pmatrix} f_i'(x) \\ -1 \end{pmatrix},\ i = 1, \dots, m, \qquad g_0'(x, x^{n+1}) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
By corollary 4.1 we now have: there are numbers $u^0$, $u^i$, $i = 1, \dots, m$, not all zero,
such that
$$u^0 \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \sum_{i=1}^m u^i \begin{pmatrix} f_i'(x_*) \\ -1 \end{pmatrix} = 0,$$
$$u^i \ge 0, \quad i = 0, 1, \dots, m,$$
$$u^i (f_i(x_*) - x_*^{n+1}) = u^i (f_i(x_*) - f(x_*)) = 0, \quad i = 1, \dots, m. \tag{4.23}$$
The first of relations (4.23) shows that $u^0 = \sum_{i=1}^m u^i$. Hence, since $u^i \ge 0$, we
have $u^0 > 0$, for with $u^0 = 0$ all $u^i$ would also be zero. Since expression (4.23) is
homogeneous with respect to $u^i$ we can take $u^0 = 1$.
Thus we have finally obtained the following result.
Theorem 4.2. For point $x_*$ to provide the minimum of $f(x)$ defined by relation (4.21) it is
necessary that there be numbers $u^i$, $i = 1, \dots, m$, such that
$$\sum_{i=1}^m u^i f_i'(x_*) = 0,$$
$$\sum_{i=1}^m u^i = 1, \quad u^i \ge 0, \quad i = 1, \dots, m,$$
$$u^i (f_i(x_*) - f(x_*)) = 0, \quad i = 1, \dots, m. \tag{4.24}$$
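To make conditions (4.24) concrete, the Python sketch below checks them for a given point and a given set of multipliers. It is a hedged illustration: the function name `check_4_24`, the tolerance, and the two-parabola example are assumptions introduced here, and the code verifies multipliers supplied by the caller rather than searching for them.

```python
def check_4_24(fs, grads, x, u, tol=1e-9):
    """Return True if multipliers u satisfy conditions (4.24) at point x.

    fs    -- list of functions f_i, each taking a list x
    grads -- list of gradient functions f_i', each returning a list
    """
    values = [f(x) for f in fs]
    f_max = max(values)                       # f(x) as defined by (4.21)
    n = len(x)
    # First relation: sum_i u^i f_i'(x) = 0
    combo = [sum(ui * g(x)[k] for ui, g in zip(u, grads)) for k in range(n)]
    stationary = all(abs(c) <= tol for c in combo)
    # Second relation: sum_i u^i = 1 and u^i >= 0
    normalized = abs(sum(u) - 1.0) <= tol and all(ui >= -tol for ui in u)
    # Third relation: u^i (f_i(x) - f(x)) = 0
    complementary = all(abs(ui * (v - f_max)) <= tol
                        for ui, v in zip(u, values))
    return stationary and normalized and complementary

# Hypothetical example in E^1: f(x) = max(x^2, (x - 2)^2), minimized at x = 1.
fs = [lambda x: x[0] ** 2, lambda x: (x[0] - 2.0) ** 2]
grads = [lambda x: [2.0 * x[0]], lambda x: [2.0 * (x[0] - 2.0)]]

at_minimum = check_4_24(fs, grads, [1.0], [0.5, 0.5])
elsewhere = check_4_24(fs, grads, [0.5], [0.0, 1.0])
```

At $x = 1$ both functions are active and the multipliers $u = (1/2, 1/2)$ balance the two gradients, so `at_minimum` is true. At $x = 0.5$ only the second function is active and its gradient is nonzero, so no admissible multipliers exist and the sample choice fails.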

Necessary Conditions of the Second Order
Let us again return to the problem of the minimization of $f_0(x)$ with constraints (4.1),
$X = E^n$. We use the notation $L(x, u)$ as follows:
$$L(x, u) = f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x). \tag{4.25}$$
Assume that point $x_*$, the solution of this minimization problem, is regular (definition 4.6).
Then using corollary 4.3 of theorem 4.1 we can write the first of relations (4.17) in the form
$$L_x'(x_*, u) = 0. \tag{4.26}$$
Assume now that all functions $f_i(x)$ are twice continuously differentiable, i.e. that there
are continuous matrices of second derivatives $f_i''(x)$. Then the matrix of second derivatives
$L_{xx}''(x_*, u)$ of function $L(x, u)$ with respect to $x$ is also defined.
The assumption that $x_*$ is a regular point implies that the multipliers $u^i$,
$i \in J^- \cup J^0$, are uniquely determined by relation (4.17).
We introduce the following notations
$$J_0^-(x_*) = \{i: u^i > 0,\ i \in J^-\},$$
$$J^-(x_*) = \{i: f_i(x_*) = 0,\ i \in J^-\}.$$
Using (4.17) we have $J_0^-(x_*) \subset J^-(x_*)$. Let vector $p$ satisfy the relations
$$(f_i'(x_*), p) \le 0, \quad i \in J^-(x_*),\ i \notin J_0^-(x_*),$$
$$(f_i'(x_*), p) = 0, \quad i \in J_0^-(x_*) \cup J^0. \tag{4.27}$$
Assume
$$J_p(x_*) = \{i \in J^-(x_*) \cup J^0: (f_i'(x_*), p) = 0\}. \tag{4.28}$$
Point $x_*$ being regular, vectors $f_i'(x_*)$, $i \in J_p(x_*)$, are linearly independent.
Therefore it can be demonstrated that there is a function $r(\lambda) \in E^n$ such that
$$f_i(x(\lambda)) = 0, \quad i \in J_p(x_*), \tag{4.29}$$
where $x(\lambda) = x_* + \lambda p + r(\lambda)$, $\lim_{\lambda \to 0} r(\lambda)/\lambda = 0$.
The proof is quite similar to that of theorem 4.1.

Further, if $i \notin J_p(x_*)$, then either $f_i(x_*) < 0$ or $(f_i'(x_*), p) < 0$, which
ensures the inequality $f_i(x(\lambda)) < 0$ with small positive $\lambda$. Thus point
$x(\lambda)$ with small $\lambda$ satisfies all the constraints (4.1), $X = E^n$. Using this
fact as well as (4.27)-(4.29) we obtain
$$f_0(x(\lambda)) = L(x(\lambda), u),$$
since if $u^i \ne 0$ then from (4.29) $f_i(x(\lambda)) = 0$. At the same time we obtain from
(4.17) that $f_0(x_*) = L(x_*, u)$. Taking into account that $x(\lambda)$ satisfies all of the
relations (4.1) and that $x_*$ is the minimum point of $f_0(x)$ with constraints (4.1) we obtain
for small $\lambda$:
$$L(x(\lambda), u) \ge L(x_*, u).$$
Expanding $L(x(\lambda), u)$ to second-order terms in powers of $\lambda$ we obtain
$$L(x(\lambda), u) = L(x_*, u) + (L_x'(x_*, u), x(\lambda) - x_*) + \tfrac{1}{2}\big(L_{xx}''(\xi(\lambda), u)(x(\lambda) - x_*),\ x(\lambda) - x_*\big) \ge L(x_*, u),$$
where $\xi(\lambda)$ is a point in the segment which joins $x_*$ and $x(\lambda)$ so that
$\xi(\lambda) \to x_*$ as $\lambda \to 0$. Using (4.26) we obtain
$$\frac{\lambda^2}{2}\Big(L_{xx}''(\xi(\lambda), u)\Big(p + \frac{r(\lambda)}{\lambda}\Big),\ p + \frac{r(\lambda)}{\lambda}\Big) \ge 0.$$
Dividing by $\lambda^2$ and taking $\lambda \to 0$ we finally obtain
$$(L_{xx}''(x_*, u) p, p) \ge 0.$$
The following theorem is proved.
Theorem 4.3. Let functions $f_i(x)$ be twice continuously differentiable and $x_*$ be a regular
point of minimum of $f_0(x)$ with constraints (4.1), $X = E^n$. Then there are numbers $u^i$,
$i \in J^- \cup J^0$, such that
$$L_x'(x_*, u) = 0, \quad u^i \ge 0,\ i \in J^-, \quad u^i f_i(x_*) = 0,\ i \in J^-,$$
and
$$(L_{xx}''(x_*, u) p, p) \ge 0$$
for all $p$ which satisfy relations (4.27).
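A small worked example may help to show that theorem 4.3 restricts $L_{xx}''$ only on the vectors $p$ singled out by (4.27), not on the whole space. The problem below is an assumption chosen for illustration and is not taken from the text.

```latex
% Hypothetical example: one equality constraint in E^2, x = (x^1, x^2).
\begin{aligned}
&\text{minimize } f_0(x) = (x^1)^2 - (x^2)^2
  \quad\text{subject to } f_1(x) = x^2 = 0,\ 1 \in J^0. \\
&\text{On the admissible set } f_0 = (x^1)^2, \text{ so } x_* = (0, 0);\
  f_1'(x_*) = (0, 1) \ne 0, \text{ hence } x_* \text{ is regular.} \\
&L(x, u) = (x^1)^2 - (x^2)^2 + u^1 x^2, \qquad
  L_x'(x_*, u) = (0,\ u^1) = 0 \ \Rightarrow\ u^1 = 0. \\
&L_{xx}''(x_*, u) = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}
  \quad\text{(an indefinite matrix).} \\
&\text{Relations (4.27): } (f_1'(x_*), p) = p^2 = 0
  \ \Rightarrow\ p = (p^1, 0). \\
&(L_{xx}''(x_*, u)\, p,\ p) = 2 (p^1)^2 \ge 0 .
\end{aligned}
```

The second-order condition holds on the vectors satisfying (4.27) even though $L_{xx}''(x_*, u)$ has a negative eigenvalue in the direction excluded by the constraint.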

5. SOME ADDITIONAL INFORMATION
The Newton-Leibnitz formula which establishes the connection between a scalar function $f(x)$
and its derivative is treated in mathematical analysis. This formula is generalized and applies
to operator functions.

If $F(x)$ is a differentiable operator function defined in an open convex set
$\Omega \subset E^n$ and $x, x + h \in \Omega$, then
$$F(x + h) - F(x) = \int_0^1 F'(x + \alpha h) h \, d\alpha. \tag{5.1}$$
The proof of the formula (which is valid also for operators defined in functional spaces) can
be found, for instance, in the book by A. N. Kolmogorov and S. V. Fomin.
Let us state one more property of operator functions.
If $F(x)$ is a nonlinear differentiable operator function, then for any $x, h, y \in E^n$ the
following formula is valid:
$$(F(x + h) - F(x), y) = (F'(x + \theta h) h, y), \quad 0 < \theta < 1. \tag{5.2}$$
This formula is called Lagrange's formula for operators (or Lagrange's generalized formula).
Its proof (for operators of a more general form) can be found in M. M. Vainberg's monograph [1].
In the following chapters we shall have many occasions of using Taylor's formula with the
remainder term in Lagrange's form.
If $f(x)$ is a twice continuously differentiable function in a convex set $\Omega$, then for
any $x, x + h \in \Omega$ and $\alpha \in [0, 1]$
$$f(x + \alpha h) - f(x) = \alpha (f'(x + \alpha \theta_1 h), h)$$
and
$$f(x + \alpha h) = f(x) + \alpha (f'(x), h) + \frac{\alpha^2}{2} (f''(x + \alpha \theta_2 h) h, h),$$
where $\theta_1, \theta_2 \in [0, 1]$.

Bibliographic Notes
The properties of convex sets and convex functions in finite-dimensional spaces are described
by S. Karlin, G. Zoutendijk [1], H. Künzi and W. Krelle. The most comprehensive description is
given in R. T. Rockafellar's book. The properties of convex sets and functions in functional
spaces are investigated in detail by N. Dunford and J. T. Schwartz.
Special properties connected with strict and strong convexities are considered by E. S. Levitin
and B. T. Polyak.
Among many works devoted to the theory of necessary conditions for extrema we note the books by
S. Karlin, G. Zoutendijk [1], S. I. Zukhovitskii and L. I. Avdeyeva, H. Künzi and W. Krelle
which consider the problems of linear and convex programming in finite-dimensional spaces.
A more complete theory of the necessary conditions of the first order in the general case the
reader can find in the works by A. Ya. Dubovitsky and A. A. Milyutin, L. W. Neustadt, H. Halkin
and L. W. Neustadt, B. N. Pshenichny [1], M. R. Hestenes [2], K. Arrow, L. B. Hurwitz and
H. Uzawa. M. R. Hestenes considers also the necessary conditions of the second order.
The problems of linear programming as well as computational algorithms and the theory of
duality in linear programming are treated in detail by D. Gale, G. B. Dantzig [1], S. Karlin,
S. I. Zukhovitskii and L. I. Avdeyeva. The general case of the theory of duality in convex
programming is treated by E. G. Golstein [1], [2] and R. T. Rockafellar.
The theorem on implicit functions, which is used in deducing the necessary conditions for
extrema, can be found in G. M. Fikhtenhol'ts' book.
CHAPTER II
METHODS OF UNCONSTRAINED FUNCTION MINIMIZATION

This chapter is devoted to the problem of minimization of the function $f(x)$ defined in an
$n$-dimensional Euclidean space $E^n$. Accordingly, in this chapter $x$ is always an
$n$-dimensional vector.
In solving the problem we shall use iterative processes of the type
$$x_{k+1} = x_k + \alpha_k p_k, \tag{0.1}$$
where $p_k$ is a vector determining the direction of motion from point $x_k$ and $\alpha_k$ is
a numerical factor whose value determines the length of the step in the direction of $p_k$.
The process (0.1) will be defined if the methods of constructing vector $p_k$ and computing the
value of $\alpha_k$ are given for every iteration. The properties of the process (the values of
the function for different elements of the sequence $\{x_k\}$, convergence of the sequence to
the solution, the rate of convergence, etc.) depend directly on the method chosen. At the same
time various methods of constructing vector $p_k$ and determining $\alpha_k$ require different
amounts of calculations and involve different constraints on the function to be minimized.
Let us state the considerations on which we shall base our choice of the direction of motion
and the step length.
In order to get nearer to point $x_*$ (in the general case $x_*$ is the point at which the
necessary conditions for an extremum of function $f(x)$ are satisfied, possibly within
a certain accuracy), one should naturally move from point $x_k$ in the direction in which the
function decreases, i.e. in the direction of descent. If point $x_k$ is not the point of
minimum or a stationary point, then there is an infinite number of vectors $p$ which determine
the direction of descent from point $x_k$ and each vector is defined by
$$(f'(x_k), p) < 0$$
($f(x)$ is differentiable).
This is seen from the following argument.
Let $x = x_k + \alpha p$. Expansion of the function in Taylor's series about $x_k$ (it is
obviously assumed that the function is differentiable and has a sufficient number of
derivatives) gives
$$f(x) = f(x_k) + \alpha (f_k', p) + \frac{\alpha^2}{2} (f_{kc}'' p, p).$$
We set here $f_k' = f'(x_k)$, $f_{kc}'' = f''(x_{kc})$, $x_{kc} = x_k + \theta (x - x_k)$,
$\theta \in [0, 1]$. For the sake of brevity, these notations will often be used further on in
this chapter.
If $(f_k', p) < 0$ then at least with small positive values of $\alpha$, $f(x) < f(x_k)$, since
the sign of the difference $f(x) - f(x_k)$ is then determined by the term which is linear with
respect to $\alpha$.
By applying various methods in choosing the direction of the descent and factor $\alpha_k$, we
can obtain different minimization algorithms.

1. GRADIENT METHODS
Method of Steepest Descent
The simplest approach to the choice of the direction of $p_k$ in order to satisfy the condition
$(f_k', p_k) < 0$ (i.e. of the direction of the descent of $f(x)$) is to assume $p_k = -f_k'$.
The iterative process
$$x_{k+1} = x_k - \alpha_k f_k', \quad \alpha_k > 0, \quad k = 0, 1, \dots, \tag{1.1}$$
which results from such a choice of the direction of motion is called the method of steepest
descent or gradient method.
In terms of coordinates, process (1.1) is written down in the following form:
$$x_{k+1}^i = x_k^i - \alpha_k \frac{\partial f(x_k)}{\partial x^i}, \quad i = 1, \dots, n.$$
At present the method of steepest descent is one of the best known minimization methods.
The popularity of the method has been favoured by its being comparatively simple and suitable
for application to the minimization of a very broad class of functions.
We turn now to the study of the properties of the algorithm (1.1). First of all we describe the
method of choosing the magnitude of the scalar factor $\alpha_k$.
(1) Take an arbitrary value of $\alpha$ (the same at all iterations) and determine point
$x = x_k - \alpha f_k'$.
(2) Compute $f(x) = f(x_k - \alpha f_k')$.

(3) Verify the inequality
$$f(x) - f(x_k) \le \varepsilon \alpha (f_k', p_k), \tag{1.2}$$
where $0 < \varepsilon < 1$ is an arbitrarily chosen constant (the same with any $k = 0, 1, \dots$).
(4) If inequality (1.2) is satisfied, then the value of $\alpha$ is taken to be the sought
one: $\alpha_k = \alpha$. However if the inequality is not satisfied, we reduce $\alpha$
(multiplying $\alpha$ by an arbitrary factor $\lambda$, $0 < \lambda < 1$) until inequality
(1.2) is satisfied.
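Steps (1)-(4) together with process (1.1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `steepest_descent`, the parameter names and default values (`alpha0`, `eps`, `shrink`, `tol`, `max_iter`), the stopping test on the gradient norm, and the cap of 60 reductions per iteration are all choices made here, and the quadratic test function is hypothetical.

```python
def steepest_descent(f, grad, x0, alpha0=1.0, eps=0.5, shrink=0.5,
                     tol=1e-8, max_iter=10_000):
    """Process (1.1) with the step alpha_k chosen by rule (1)-(4).

    alpha0 -- trial value of alpha, the same at all iterations (step 1)
    eps    -- the constant 0 < eps < 1 of inequality (1.2)
    shrink -- the factor 0 < shrink < 1 used to reduce alpha (step 4)
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 <= tol:          # assumed stopping rule
            break
        fx = f(x)
        alpha = alpha0
        for _ in range(60):                  # guard against endless reduction
            x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
            # Inequality (1.2) with p_k = -f'_k:
            #   f(x) - f(x_k) <= -eps * alpha * ||f'_k||^2
            if f(x_new) - fx <= -eps * alpha * g_norm_sq:
                break
            alpha *= shrink
        x = x_new                            # last trial point is accepted
    return x

# Hypothetical test function with minimum at (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

x_found = steepest_descent(f, grad, [0.0, 0.0])
```

The trial value `alpha0` is reset at every iteration, as step (1) prescribes, so each iteration may need several reductions; reusing the previous accepted step would save function evaluations but is a different rule from the one described above.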
The above method of choosing $\alpha_k$ needs substantiation; the conditions of the existence
of nonzero values of $\alpha$ which satisfy inequality (1.2) must be established. Such
a substantiation is given in the following theorem.
Theorem 1.1. If function $f(x)$ is bounded from below, its gradient $f'(x)$ satisfies
Lipschitz' condition
$$\|f'(x) - f'(y)\| \le R \|x - y\| \tag{1.3}$$
with any $x, y \in E^n$ and the choice of the value of $\alpha_k$ is made as described above,
then in the process (1.1) $\|f_k'\| \to 0$ as $k \to \infty$ whatever the initial point $x_0$.
Proof. By the mean-value theorem we have
$$f(x) - f(x_k) = (f'(x_{kc}), x - x_k), \tag{1.4}$$
where $x_{kc} = x_k + \theta (x - x_k)$, $\theta \in [0, 1]$. We note that in what follows the
index $kc$ ($c$) will be used to denote an intermediate point in a corresponding segment.
Equality (1.4) can be transformed as follows
$$f(x) - f_k = (f_k', x - x_k) + (f_{kc}' - f_k', x - x_k).$$
Hence, noticing that $x - x_k = -\alpha f_k'$ and using (1.3) we obtain
$$f(x) - f_k \le -\alpha \|f_k'\|^2 + \alpha R \|x_{kc} - x_k\| \, \|f_k'\| \le -\alpha \|f_k'\|^2 + \alpha R \|x - x_k\| \, \|f_k'\| = \alpha \|f_k'\|^2 (-1 + \alpha R).$$
The estimate obtained shows that there are values $\alpha > 0$ such that the inequality (1.2)
is satisfied; to obtain this result it suffices to choose $\alpha$ such that
$-1 + \alpha R \le -\varepsilon$. This is always feasible, since $R$ is a finite quantity and
$0 < \varepsilon < 1$. Consequently, (1.2) will always be satisfied with
$\alpha \le (1 - \varepsilon)/R$. Thus, choosing $\alpha_k$ in accordance with the above
algorithm we obtain
$$f_{k+1} - f_k \le -\varepsilon \alpha_k \|f_k'\|^2, \tag{1.5}$$
i.e. with any $k$ we have $f_{k+1} - f_k < 0$ (provided $\|f_k'\| \ne 0$). Since by hypothesis
the function has a lower bound, the last inequality gives as $k \to \infty$
$$f_{k+1} - f_k \to 0. \tag{1.6}$$
It follows from (1.5) that
$$\|f_k'\|^2 \le \frac{f_k - f_{k+1}}{\varepsilon \alpha_k}. \tag{1.7}$$
We note that this algorithm for choosing $\alpha_k$ ensures that
$\alpha_k \ge \bar{\alpha} > 0$ with any $k$, where $\bar{\alpha}$ can be any constant which
does not exceed the quantity $(1 - \varepsilon)/R$, since, as it was mentioned, the inequality
(1.2) or (1.5) is certainly satisfied with $\alpha = (1 - \varepsilon)/R$. With this remark,
conditions (1.6) and (1.7) imply that $\|f_k'\| \to 0$ as $k \to \infty$, and this
proves the theorem.
The class of functions satisfying the requirements of theorem 1.1 is very broad. Such functions can have no minimum point at all, can have local minima, saddle points, etc. Theorem 1.1 shows that the gradient method provides for convergence either to the exact lower bound $\inf_x f(x)$ or to a value of the function at a certain stationary point. The convergence of the sequence $\{x_k\}$ to a stationary point (if such a point exists) also takes place. It is difficult however to determine the rate of convergence with the conditions of theorem 1.1.
If the requirements concerning the smoothness and convexity of the function are sufficiently strict, then not only can the convergence of the sequence $\{x_k\}$ be proved but the rate of convergence can also be estimated.
Theorem 1.2. Let $f(x)$ be a twice continuously differentiable function, and the matrix of second derivatives satisfy the conditions
$$m \|y\|^2 \le (f''(x) y, y) \le M \|y\|^2, \qquad M > m > 0 \tag{1.8}$$
with any $x, y \in E_n$, and the sequence $\{x_k\}$ be constructed by method (1.1), where $\alpha_k$ is chosen in the way described above. Then with any initial point $x_0$ we have $x_k \to x_*$, $f(x_k) \to f(x_*)$, where $x_*$ is the (unique) point of minimum of $f(x)$.
Then the following estimates of the rate of convergence hold:
$$f_k - f_* \le q^k [f_0 - f_*], \qquad \|x_k - x_*\| \le C q^{k/2}, \qquad C < \infty, \quad 0 < q < 1. \tag{1.9}$$

Proof. The existence of the unique minimizer of $f(x)$ with the conditions of the theorem follows from the results of lemma 3.2 (Ch. I). Therefore we have only to prove the convergence of sequence $\{x_k\}$ to point $x_*$ and to obtain estimates (1.9). Let us first establish that the first of estimates (1.9) holds. Using Taylor's formula we obtain
$$f(x_*) = f(x) + (f'(x), x_* - x) + \tfrac{1}{2} (f''(x_c)(x_* - x), x_* - x).$$
Hence, applying (1.8) we have
$$f(x) - f(x_*) \le (f'(x), x - x_*) - \frac{m}{2} \|x - x_*\|^2 \le \|f'(x)\| \, \|x - x_*\| - \frac{m}{2} \|x - x_*\|^2. \tag{1.10}$$
At the same time (since $f'(x_*) = 0$)
$$f(x) - f(x_*) = \tfrac{1}{2} (f''(x_{c1})(x - x_*), x - x_*)$$
and therefore, from (1.8), we obtain
$$\frac{m}{2} \|x - x_*\|^2 \le f(x) - f(x_*) \le \frac{M}{2} \|x - x_*\|^2. \tag{1.11}$$
We have from the left-hand side of inequality (1.11) and from (1.10)
$$\|x - x_*\| \le \frac{\|f'(x)\|}{m}, \tag{1.12}$$
and from the right-hand side of inequality (1.11)
$$\|x - x_*\|^2 \ge \frac{2}{M} [f(x) - f(x_*)].$$
With the estimates obtained we can write (1.10) in the form
$$f(x) - f(x_*) \le \frac{1}{m} \|f'(x)\|^2 - \frac{m}{M} [f(x) - f(x_*)].$$
Hence
$$\|f'(x)\|^2 \ge m \left(1 + \frac{m}{M}\right) [f(x) - f(x_*)]. \tag{1.13}$$
Applying this estimate we can obtain from inequality (1.5)
$$f_{k+1} - f_k \le -\varepsilon \alpha_k m \left(1 + \frac{m}{M}\right) (f_k - f_*). \tag{1.14}$$


With the conditions of the theorem we have
$$f(x) - f(x_k) = (f'_k, x - x_k) + \tfrac{1}{2} (f''_{kc}(x - x_k), x - x_k) = -\alpha \|f'_k\|^2 + \frac{\alpha^2}{2} (f''_{kc} f'_k, f'_k) \le -\alpha \left(1 - \frac{\alpha M}{2}\right) \|f'_k\|^2.$$
It follows that the inequality (1.2) is always satisfied if $1 - \frac{\alpha M}{2} \ge \varepsilon$, i.e. if $\alpha \le \bar{\alpha} = \frac{2(1 - \varepsilon)}{M}$. Then we have from (1.14)
$$f_{k+1} - f_* \le \left[1 - \varepsilon \alpha_k m \left(1 + \frac{m}{M}\right)\right] (f_k - f_*) \le q (f_k - f_*)$$
where $q = 1 - \varepsilon \bar{\alpha} m \left(1 + \frac{m}{M}\right) < 1$, i.e.
$$f_k - f_* \le q^k (f_0 - f_*). \tag{1.15}$$
Since $\bar{\alpha} = \frac{2(1 - \varepsilon)}{M}$ we have
$$q = 1 - \frac{2 \varepsilon (1 - \varepsilon) m}{M} \left(1 + \frac{m}{M}\right).$$
Hence the minimum value of the ratio of the progression $q_{\min}$ is attained with $\varepsilon = \frac{1}{2}$ and then
$$q_{\min} = 1 - \frac{m}{2M} \left(1 + \frac{m}{M}\right).$$
Consequently it is expedient to take $\varepsilon = \frac{1}{2}$ in the condition (1.2).
Estimate (1.15) together with the left-hand one of estimates (1.11) makes it possible to establish the convergence and estimate the rate of convergence of the sequence $\{x_k\}$ to the point of minimum:
$$\|x_k - x_*\| \le \left(\frac{2}{m}\right)^{1/2} (f_k - f_*)^{1/2} \le \left(\frac{2}{m}\right)^{1/2} (f_0 - f_*)^{1/2} q^{k/2} \le C q^{k/2}.$$
The theorem is proved.
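The two quantitative ingredients of this proof, inequality (1.13) and the ratio $q$, can be checked numerically on a simple example. The sketch below assumes a diagonal quadratic $f(x) = \tfrac{1}{2} \sum_i \lambda_i x_i^2$ (so that $x_* = 0$, $f_* = 0$, $m = \min \lambda_i$, $M = \max \lambda_i$); the function names and the sample eigenvalues are hypothetical and serve only as an illustration.

```python
import random


def check_gradient_bound(eigs, trials=1000, seed=0):
    """Check ||f'(x)||^2 >= m (1 + m/M) [f(x) - f_*] at random points, cf. (1.13)."""
    m, M = min(eigs), max(eigs)
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10.0, 10.0) for _ in eigs]
        grad_sq = sum((lam * xi) ** 2 for lam, xi in zip(eigs, x))
        gap = 0.5 * sum(lam * xi * xi for lam, xi in zip(eigs, x))  # f(x) - f_*
        if grad_sq < m * (1.0 + m / M) * gap:
            return False
    return True


def ratio_q(m, M, eps):
    """q = 1 - eps * alpha_bar * m (1 + m/M), with alpha_bar = 2 (1 - eps) / M."""
    alpha_bar = 2.0 * (1.0 - eps) / M
    return 1.0 - eps * alpha_bar * m * (1.0 + m / M)


bound_holds = check_gradient_bound([1.0, 4.0, 25.0])
q_half = ratio_q(1.0, 10.0, 0.5)    # epsilon = 1/2, the value recommended in the text
q_other = ratio_q(1.0, 10.0, 0.3)   # any other epsilon gives a ratio at least as large
```

Since $\varepsilon (1 - \varepsilon)$ is largest at $\varepsilon = \tfrac{1}{2}$, `q_half` is the smallest ratio this estimate can give for the chosen $m$ and $M$.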
Analyzing the above proof we see that in order to obtain estimate (1.15) we used in the end only conditions (1.2) and (1.13). We conclude that the class of functions for which estimate (1.15) holds is actually much broader than the class of functions which satisfy conditions (1.8), viz. estimate (1.15) is valid for all functions which satisfy the conditions of theorem 1.1 and moreover the condition
$$\|f'(x)\|^2 \ge \delta [f(x) - f_*], \qquad \delta > 0.$$
The proof of the validity of estimate (1.15) in this case is not really connected with the existence of a minimum; one can suppose that $f_* = \inf f(x)$ without trying to establish whether the precise lower bound is attained. It should be stressed however that functions of this class do have a minimum (not necessarily the only one) and that sequence $\{x_k\}$ converges to a certain point $x_*$, the second of estimates (1.9) holding true for the rate of convergence.
Indeed from (1.1) and (1.7) we have
$$\|x_{k+1} - x_k\|^2 = \alpha_k^2 \|f'_k\|^2 \le \frac{\alpha_{\max}}{\varepsilon} (f_k - f_{k+1})$$
where $\alpha_{\max}$ is the maximum value of the parameter at which we start to reduce $\alpha$. Taking this into account we obtain for any $m > k$
$$\|x_m - x_k\| \le \sum_{i=k}^{m-1} \|x_{i+1} - x_i\| \le C_1^{1/2} \sum_{i=k}^{m-1} q^{i/2} \le C_1^{1/2} \frac{q^{k/2}}{1 - q^{1/2}}.$$
Hence $\|x_m - x_k\| \to 0$ with $k \to \infty$, i.e. sequence $\{x_k\}$ converges (to a certain minimum $x_*$), and also
$$\|x_* - x_k\| = \lim_{m \to \infty} \|x_m - x_k\| \le C_1^{1/2} \frac{q^{k/2}}{1 - q^{1/2}}.$$

Variants of the Method
The method of choosing $\alpha_k$ in process (1.1) which involves the checking of inequality (1.2) as described above is not the only one possible. We shall now consider several other methods of choosing the value of $\alpha_k$; each of these methods determines a different variant of the gradient method. In proving theorems 1.1 and 1.2 it was established that inequality (1.2) is always satisfied with values of $\alpha \le \frac{1 - \varepsilon}{R}$ (theorem 1.1) or $\alpha \le \frac{2(1 - \varepsilon)}{M}$ (theorem 1.2). This circumstance made it possible to prove the statements about the properties of method (1.1) in choosing $\alpha_k$ under the condition of satisfying inequality (1.2). If constants $R$ or $M$ are known which characterize the function $f(x)$ being minimized, then in applying method (1.1) we can choose beforehand $\alpha_k = \alpha$, where $0 < \alpha \le \frac{1 - \varepsilon}{R}$ or $0 < \alpha \le \frac{2(1 - \varepsilon)}{M}$, and theorems 1.1 and 1.2 will remain valid. This variant of the gradient method makes it possible to determine more exactly the value of the ratio $q$ in the estimates of the rate of convergence (1.9).
Theorem 1.3. If function $f(x)$ satisfies the conditions of theorem 1.2 and in method (1.1) $\alpha_k = \alpha$, $0 < \alpha < \frac{2}{M}$, then for the rate of convergence of sequence $\{x_k\}$ the estimate
$$\|x_k - x_*\| \le q^k \|x_0 - x_*\|, \qquad q = \max \{ |1 - \alpha m|, \, |1 - \alpha M| \}$$
is valid, the minimum value $q_{\min} = \frac{M - m}{M + m}$ being attained with $\alpha = \frac{2}{M + m}$.
Proof. We have
$$\|x_{k+1} - x_*\|^2 = (x_k - \alpha f'_k - x_*, \, x_{k+1} - x_*) = (x_k - x_*, \, x_{k+1} - x_*) - \alpha (f'_k - f'_*, \, x_{k+1} - x_*).$$
Applying Lagrange's formula for operators (5.2) (Chap. I) we obtain
$$\alpha (f'_k - f'_*, \, x_{k+1} - x_*) = (\alpha f''_{kc} (x_k - x_*), \, x_{k+1} - x_*).$$
Using this result we have
$$\|x_{k+1} - x_*\|^2 = ((I - \alpha f''_{kc})(x_k - x_*), \, x_{k+1} - x_*) \le \|I - \alpha f''_{kc}\| \, \|x_k - x_*\| \, \|x_{k+1} - x_*\|,$$
i.e.
$$\|x_{k+1} - x_*\| \le \|I - \alpha f''_{kc}\| \, \|x_k - x_*\| \le q \|x_k - x_*\|,$$
since by conditions (1.8)
$$\|I - \alpha f''_{kc}\| \le \max \{ |1 - \alpha m|, \, |1 - \alpha M| \} = q.$$
In the interval $\left[0, \frac{2}{M}\right]$ the linear function $1 - \alpha M$, as we know, changes its sign. Therefore the minimum value $q_{\min}(\alpha)$ will be attained with $1 - \alpha m = -(1 - \alpha M)$, i.e. with $\alpha = \frac{2}{M + m}$, and obviously $q_{\min} = \frac{M - m}{M + m}$.
The theorem is proved.
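The contraction estimate of theorem 1.3 can be observed on a small quadratic. The following sketch assumes $f(x) = \tfrac{1}{2}(Ax, x) - (b, x)$ with a hypothetical $2 \times 2$ matrix whose eigenvalues are $m = 2$ and $M = 4$; the matrix, the vector `b` and the starting point are illustrative choices, not taken from the text.

```python
import math

A = [[3.0, 1.0], [1.0, 3.0]]   # symmetric, eigenvalues m = 2 and M = 4
b = [1.0, 2.0]
x_star = [0.125, 0.625]        # solves A x = b, i.e. f'(x_*) = 0
m, M = 2.0, 4.0

alpha = 2.0 / (M + m)          # constant step recommended by theorem 1.3
q = (M - m) / (M + m)          # corresponding ratio q_min


def grad(x):
    """Gradient f'(x) = A x - b of the quadratic."""
    return [A[0][0] * x[0] + A[0][1] * x[1] - b[0],
            A[1][0] * x[0] + A[1][1] * x[1] - b[1]]


def dist_to_minimum(x):
    return math.hypot(x[0] - x_star[0], x[1] - x_star[1])


x = [5.0, -3.0]
errors = [dist_to_minimum(x)]
for _ in range(20):
    g = grad(x)
    x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]   # method (1.1) with alpha_k = alpha
    errors.append(dist_to_minimum(x))
```

For a quadratic the intermediate matrix $f''_{kc}$ is simply $A$, so each entry of `errors` should not exceed $q$ times the previous one, up to rounding.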
Note that with $\alpha = \frac{2}{M + m}$ the first of estimates (1.9) can be defined more exactly:
$$f_{k+1} - f_* \le \left(\frac{M - m}{M + m}\right)^2 (f_k - f_*). \tag{1.16}$$

We describe another method of choosing the step length. One can select the value of $\alpha_k$ providing the minimum of the function in the direction of descent, i.e. the chosen value of $\alpha_k$ must satisfy the condition
$$f(x_k - \alpha_k f'_k) = \min_{\alpha > 0} f(x_k - \alpha f'_k). \tag{1.17}$$
With this method of choosing the step length all the above results concerning the properties of method (1.1) remain valid; moreover we obtain more exact bounds on the rate of convergence.
We prove a statement similar to that of theorem 1.1.
Theorem 1.4. If function $f(x)$ satisfies the requirements of theorem 1.1 and in using method (1.1) $\alpha_k$ is chosen by (1.17), then $\|f'_k\| \to 0$ as $k \to \infty$ whatever the initial point $x_0$.
Proof. As in theorem 1.1 we obtain the estimate
$$f(x) - f(x_k) = -\alpha \|f'_k\|^2 - \alpha (f'_{kc} - f'_k, f'_k) \le -\alpha \|f'_k\|^2 + \alpha^2 R \|f'_k\|^2.$$
The minimum of function $\varphi(\alpha) = -\alpha \|f'_k\|^2 + \alpha^2 R \|f'_k\|^2$ is attained with $\alpha_{\min} = \frac{1}{2R}$ and $\varphi(\alpha_{\min}) = -\frac{1}{4R} \|f'_k\|^2$. Since $\alpha^2 R \|f'_k\|^2$ is the upper bound of the term $-\alpha (f'_{kc} - f'_k, f'_k)$, the value of $\alpha_k$ which satisfies condition (1.17) is obviously not less than $\alpha_{\min}$ and
$$f_{k+1} - f_k \le -\frac{1}{4R} \|f'_k\|^2. \tag{1.18}$$
Hence by the same argument as in theorem 1.1 we find that $\|f'_k\| \to 0$. Q.E.D.
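For reference, the minimization of the quadratic majorant used in this proof can be written out explicitly. The computation below is a routine worked step under the notation of the proof, with $\varphi(\alpha) = -\alpha \|f'_k\|^2 + \alpha^2 R \|f'_k\|^2$:

```latex
\varphi'(\alpha) = -\|f'_k\|^2 + 2 \alpha R \|f'_k\|^2 = 0
\quad\Longrightarrow\quad
\alpha_{\min} = \frac{1}{2R},
\qquad
\varphi(\alpha_{\min}) = -\frac{1}{2R}\|f'_k\|^2 + \frac{1}{4R}\|f'_k\|^2
                      = -\frac{1}{4R}\|f'_k\|^2 .
```

Because the exact minimum along the descent direction cannot exceed the value of the function at the trial step $\alpha_{\min}$, this gives $f_{k+1} - f_k \le \varphi(\alpha_{\min})$, which is the bound (1.18).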
The estimates (1.9) for the variant of the gradient method where the step length is chosen according to (1.17) can be verified in a way analogous to that used in theorem 1.2 with the only difference that expression (1.13) should be used in the estimate $f_{k+1} - f_k \le -\frac{1}{2M} \|f'_k\|^2$ obtained by the same argument as for (1.18). However we proceed using the results of theorem 1.3. This ensures higher accuracy for the value of the ratio $q$.
Set $\tilde{x}_{k+1} = x_k - \frac{2}{M + m} f'_k$; then estimate (1.16)
$$f(\tilde{x}_{k+1}) - f(x_*) \le \left(\frac{M - m}{M + m}\right)^2 (f_k - f_*)$$
holds.
If point $x_{k+1}$ is chosen by applying the condition for function minimization in the direction of descent, then
$$f(x_{k+1}) - f(x_*) \le f(\tilde{x}_{k+1}) - f(x_*) \le \left(\frac{M - m}{M + m}\right)^2 (f_k - f_*).$$
Using now estimates (1.11) we obtain
$$\|x_{k+1} - x_*\|^2 \le \frac{2}{m} (f_{k+1} - f_*) \le \frac{2}{m} \left(\frac{M - m}{M + m}\right)^{2(k+1)} \frac{M}{2} \|x_0 - x_*\|^2$$
and finally
$$\|x_k - x_*\| \le C q^k,$$
where
$$q = \frac{M - m}{M + m}, \qquad C = \left(\frac{M}{m}\right)^{1/2} \|x_0 - x_*\|.$$

Thus the following theorem is valid.
Theorem 1.5. If function $f(x)$ satisfies the conditions of theorem 1.2 and in applying method (1.1) $\alpha_k$ is chosen according to condition (1.17), then sequence $\{x_k\}$ converges to the minimum at the rate of a geometric progression whose ratio is $q = \frac{M - m}{M + m}$.
Note that the variant of method (1.1) where the step length is chosen according to the condition of function minimization in the direction of descent is often called in the literature the method of steepest descent.
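For a quadratic function the one-dimensional minimization (1.17) has a closed form, which makes the method of steepest descent easy to try out. The sketch below assumes $f(x) = \tfrac{1}{2}(Ax, x) - (b, x)$, for which the exact step is $\alpha_k = (f'_k, f'_k) / (A f'_k, f'_k)$; the matrix, right-hand side and starting point are hypothetical, chosen so that $m = 2$ and $M = 4$.

```python
A = [[3.0, 1.0], [1.0, 3.0]]   # symmetric, eigenvalues m = 2 and M = 4
b = [1.0, 2.0]
x_star = [0.125, 0.625]        # solves A x = b
m, M = 2.0, 4.0
q = (M - m) / (M + m)


def mat_vec(v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def f(x):
    return 0.5 * dot(mat_vec(x), x) - dot(b, x)


f_star = f(x_star)
x = [5.0, -3.0]
gaps = [f(x) - f_star]
for _ in range(10):
    Ax = mat_vec(x)
    g = [Ax[0] - b[0], Ax[1] - b[1]]          # gradient f'(x_k)
    gg = dot(g, g)
    if gg == 0.0:                             # already at the minimum
        break
    alpha = gg / dot(mat_vec(g), g)           # exact minimizer of f(x_k - alpha f'_k)
    x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
    gaps.append(f(x) - f_star)
```

Each entry of `gaps` should be at most $q^2$ times the previous one, in line with the estimate derived above from (1.16).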

Other Gradient Methods


Let $F(x)$ be an arbitrary symmetrical matrix which satisfies the conditions
$$\rho \|y\|^2 \le (F(x) y, y) \le R \|y\|^2, \qquad \rho > 0 \tag{1.19}$$
with any $x, y \in E_n$. If we choose vector $p = -F(x) f'(x)$, then $(f'(x), p) = -(f', F f') \le -\rho \|f'\|^2 < 0$ provided $\|f'(x)\| \ne 0$. Thus vector $p = -F(x) f'(x)$ determines a direction of decrease of function $f(x)$. Then in order to minimize $f(x)$ we can construct the iterative process
$$x_{k+1} = x_k - \alpha_k F_k f'_k, \qquad k = 0, 1, \ldots$$
where $\{F_k\}$ is a sequence of arbitrary matrices which satisfy conditions (1.19). To connect this process with the descriptions of the following part of this chapter (to be more precise, to make the notations used consistent), we shall consider the process
$$x_{k+1} = x_k - \alpha_k F_k^{-1} f'_k, \qquad \alpha_k > 0 \tag{1.20}$$
in which the matrix $F_k^{-1}$ is the inverse of matrix $F_k$. This does not affect the heart of the matter, for if matrix $F_k$ satisfies conditions (1.19), then for matrix $F_k^{-1}$ the conditions (cf. lemma 2.9 of Ch. I)
$$m_1 \|y\|^2 \le (F_k^{-1} y, y) \le M_1 \|y\|^2, \qquad m_1 = \frac{1}{R} > 0, \quad M_1 = \frac{1}{\rho} \tag{1.21}$$
will be satisfied and therefore
$$(f'_k, p_k) = -(f'_k, F_k^{-1} f'_k) \le -m_1 \|f'_k\|^2 < 0. \tag{1.22}$$
Different iterative processes will correspond to different sequences $\{F_k^{-1}\}$.
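A small experiment can illustrate process (1.20) and inequality (1.22). The sketch assumes a constant diagonal matrix $F_k = F$ (diagonal only to keep the inverse trivial), a hypothetical test function, and the step rule (1.2) with starting value 1 and halving; none of these specific choices come from the text.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


# Hypothetical test function f = 1/2 (x1^2 + 10 x2^2).
def f(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad(x):
    return [x[0], 10.0 * x[1]]


F_diag = [2.0, 5.0]                    # F = diag(2, 5): rho = 2, R = 5 in (1.19)
F_inv_diag = [1.0 / d for d in F_diag]
m1 = 1.0 / max(F_diag)                 # m_1 = 1/R

x = [3.0, -2.0]
eps, lam = 0.5, 0.5
values = [f(x)]
slopes_ok = True
for _ in range(30):
    g = grad(x)
    p = [-d * gi for d, gi in zip(F_inv_diag, g)]      # p_k = -F^{-1} f'_k
    slope = dot(g, p)
    # inequality (1.22): (f'_k, p_k) <= -m_1 ||f'_k||^2
    slopes_ok = slopes_ok and slope <= -m1 * dot(g, g) + 1e-15
    alpha, fx = 1.0, f(x)
    for _ in range(60):                                 # bounded reduction of alpha
        x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
        if f(x_new) - fx <= eps * alpha * slope:        # inequality (1.2)
            break
        alpha *= lam
    x = x_new
    values.append(f(x))
```

The list `values` should be non-increasing, and `slopes_ok` records whether (1.22) held at every iterate.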
As far as the principles of the methods are concerned the study of method (1.20) does not involve any new elements as compared to the “pure” gradient method (1.1). All the results obtained for method (1.1) remain valid also for method (1.20) with the same requirements on the function being minimized and the same methods of choosing the step length. Only the technique of proving the corresponding statements is slightly changed. Of course, the quantitative values of the parameters in (1.20) will differ from the values of analogous parameters in method (1.1). In particular, this relates to the value of the ratio $q$ in the estimates of the rate of convergence.
We shall dwell now only on the results of method (1.20) which will be made use of later on.
Theorem 1.6. The results of theorem 1.2 remain valid for method (1.20).
Proof. If $x = x_k + \alpha p_k$, where $p_k = -F_k^{-1} f'_k$, then
$$f(x) - f(x_k) = \alpha (f'_k, p_k) + \frac{\alpha^2}{2} (f''_{kc} p_k, p_k) = \alpha (f'_k, p_k) \left[ 1 + \frac{\alpha}{2} \frac{(f''_{kc} p_k, p_k)}{(f'_k, p_k)} \right].$$
Now using (1.19) we have
$$(f'_k, p_k) = -(F_k p_k, p_k) \le -\rho \|p_k\|^2. \tag{1.23}$$
Consequently
$$f(x) - f(x_k) \le \alpha (f'_k, p_k) \left( 1 - \frac{\alpha M}{2 \rho} \right).$$
Hence inequality (1.2) will be satisfied without fail if $1 - \frac{\alpha M}{2 \rho} \ge \varepsilon$, i.e. if $\alpha \le \bar{\alpha} = \frac{2(1 - \varepsilon) \rho}{M}$. This substantiates the method of choosing $\alpha_k$.
Since $(f'_k, p_k) < 0$ with $\|f'_k\| \ne 0$, it follows from the condition
$$f_{k+1} - f_k \le \varepsilon \alpha_k (f'_k, p_k) \tag{1.24}$$
that $f_{k+1} < f_k$. Using now (1.24) and having in mind that $f(x)$ is bounded from below, by analogy to the proof in theorem 1.1 that $\|f'_k\| \to 0$, we establish that $(f'_k, p_k) \to 0$ as $k \to \infty$. By (1.22), it means that $\|f'_k\| \to 0$. Hence, since $f(x)$ is strongly convex, sequence (1.20) converges to the solution $x_*$. In order to obtain bounds on the rate of convergence of $f_k \to f_*$, $x_k \to x_*$, let us write inequality (1.24), using (1.22), in the form $f_{k+1} - f_k \le -\varepsilon \alpha_k m_1 \|f'_k\|^2$. Further, bounding $\|f'_k\|$ in this expression with the aid of inequality (1.13) and applying the argument of theorem 1.2 we establish the validity for method (1.20) of the estimates of the convergence rate (1.9). The value of the ratio of the progression is
$$q = 1 - \varepsilon \bar{\alpha} m_1 m \left( 1 + \frac{m}{M} \right) = 1 - \varepsilon \frac{2 (1 - \varepsilon) \rho}{M} m_1 m \left( 1 + \frac{m}{M} \right).$$
The minimum of $q$ is attained with $\varepsilon = \frac{1}{2}$:
$$q_{\min} = 1 - \frac{\rho}{2R} \cdot \frac{m}{M} \left( 1 + \frac{m}{M} \right).$$
The theorem is proved.


It follows from the proof that process (1.20) remains convergent if we set $\alpha_k = \alpha$, $0 < \alpha < \frac{2 \rho}{M}$ (a variant of the method with constant step). By the same argument as in theorem 1.3, one can obtain the estimate
$$\|x_{k+1} - x_*\| \le \|I - \alpha F_k^{-1} f''_{kc}\| \, \|x_k - x_*\|.$$
However it is impossible to obtain an estimate of the ratio of the progression as was done in theorem 1.3, since the matrix $F_k^{-1} f''_{kc}$ is not positive definite in the general case (this last property is fulfilled only on condition that matrices $F_k^{-1}$ and $f''(x_{kc})$ commute).
We can consider a variant of method (1.20) in which the step length is chosen using the condition that $f(x)$ attains its minimum in the direction of descent.
Theorem 1.7. If function $f(x)$ satisfies the conditions of theorem 1.2 and in applying method (1.20) parameter $\alpha_k$ is chosen using the condition
$$f(x_k + \alpha_k p_k) = \min_{\alpha > 0} f(x_k + \alpha p_k),$$
then sequence $\{x_k\}$ converges to the point of minimum at the rate of a geometric progression.
The theorem can be proved according to the following scheme. Expanding the function into Taylor's series to second-order terms about point $x_k$ and reasoning as in theorem 1.4 we can obtain the estimate:
$$f_{k+1} - f_k \le -\frac{1}{2} \frac{(f'_k, p_k)^2}{M \|p_k\|^2}.$$
This inequality, because of (1.22) and (1.23), yields
$$f_{k+1} - f_k \le -\frac{1}{2} \frac{\rho m_1}{M} \|f'_k\|^2.$$
Further, bounding $\|f'_k\|$ with the use of inequality (1.13) one should repeat completely the argument of theorem 1.2. We cannot obtain in this case a more precise value of the ratio $q$ since we know it must be greater than in the method of steepest descent.

Qualitative Analysis of the Methods
Let us compare the gradient methods considered above and consider certain statements on the quality of these algorithms, i.e. on their effectiveness in solving minimization problems.
We have studied three variants of method (1.1) differing in the method of choosing the step length. The properties of the variants closely resemble one another. They can be used in minimizing functions of like classes, their rate of convergence being nearly the same (in the cases where it can be estimated). Consequently it is expedient in solving problems to use that variant of the method which requires the minimum of labour. Computational effort at each iteration in the variants of process (1.1) can evidently be different only owing to the difference in the method of choosing parameter $\alpha_k$. The variant of the method which uses a constant step $\alpha_k = \alpha$ requires the least amount of computation per iteration (for in this case it is necessary to calculate only the gradient $f'(x_k)$). However in most problems such a method of choosing $\alpha_k$ is practically impossible since usually the values of constants $R$, $M$ characterizing the function are unknown.
Let us compare the amounts of work required by the methods of choosing the step length; this is connected with the checking of conditions (1.2) and (1.17). We have established that if the function $f(x)$ satisfies certain requirements (theorems 1.1, 1.2), then inequality (1.2) is always satisfied, at least with sufficiently small values of $\alpha_k$ (which are determined by the values of constants $R$, $M$). By virtue of this, whatever the value $\alpha_{\max}$ from which we began to verify inequality (1.2), the inequality will be satisfied after a certain finite number of reductions of the parameter, i.e. we have to calculate the function value a finite number of times to choose the required value of $\alpha_k$. As to the choice of $\alpha_k$ using condition (1.17), it is in the general case a procedure with an infinite number of possible values.
Of course, in practice we have to determine a point of minimum in the direction of descent by evaluating the function also a finite number of times. Clearly for a more or less precise solution of a one-dimensional minimization problem we have to perform more calculations of the function value than for satisfying inequality (1.2). The above considerations show that one should prefer the method of choosing the step length which involves checking of inequality (1.2). All of the above remarks can be applied to method (1.20) too.
The reader surely understands that the above arguments are based only on the use of the most general properties of the function being minimized and the algorithms being studied and do not make use of the specific properties of concrete functions. Therefore the above recommendations should not be considered as absolutely applicable. This remark should be kept in mind in what follows.
We turn now to the discussion of the effectiveness of gradient methods. From the point of view of solving problems of function minimization, for sufficiently good functions (smooth, convex) gradient methods give convergence to the minimum at the rate of a geometric progression. The value of the ratio of the progression, in particular for strongly convex functions, depends on the maximum $M$ and minimum $m$ eigenvalues of the matrix of second derivatives of function $f(x)$. The ratio $q$ will be sufficiently small only when the eigenvalues $m$ and $M$ are but slightly different, i.e. when the matrix $f''(x)$ is well conditioned. In this case the rate of convergence is fast. However in computational practice such problems occur very seldom. As a rule we have to find the minimum of functions whose matrix $f''(x)$ is ill conditioned ($\frac{m}{M} \ll 1$). The smaller the ratio $\frac{m}{M}$, the closer to unity will be the ratio $q$ of the progression and the slower the rate of convergence. This fact can be given a geometric interpretation. With the diminishing of the ratio $\frac{m}{M}$ the level surfaces $f(x) = C$ of the function being minimized become more elongated and the direction of vector $f'(x)$ at most points deviates more and more from the direction to the point of minimum. This leads to slowing down the rate of convergence. This can be particularly well visualized by considering, for instance, a strictly convex quadratic function $f(x)$ in space $E_2$, e.g. $f = \frac{1}{2} \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} \right)$. The matrix of second derivatives of this function has constant elements, its level surfaces are ellipses $\frac{1}{2} \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} \right) = C$, and the point of minimum coincides with the centre of the ellipses. The eigenvalues of the matrix of second derivatives are $\frac{1}{a^2}$ and $\frac{1}{b^2}$. The more the ratio $\frac{a^2}{b^2}$ differs from unity, the more are the level lines extended along one of the axes $OX$ or $OY$ and the greater is the number of steps in the direction of the antigradient which have to be taken in moving from an arbitrary point $(x_0, y_0)$ in order to attain a sufficiently small neighbourhood of the minimum point.
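The dependence of the number of steps on the conditioning can be seen in a short experiment. The sketch below assumes a two-dimensional quadratic $f = \tfrac{1}{2}(\lambda_1 x^2 + \lambda_2 y^2)$ with eigenvalues $\lambda_1 = m$ and $\lambda_2 = M$, uses the exact step of steepest descent, and starts from a point at which both gradient components are equal (a deliberately unfavourable choice); the function name, tolerance and eigenvalues are hypothetical.

```python
def steepest_descent_iterations(l1, l2, rel_tol=1e-8, max_iter=100000):
    """Count exact-line-search gradient steps on f = 1/2 (l1 x^2 + l2 y^2)."""
    x, y = l2, l1                      # gradient components l1*x and l2*y are equal here
    f0 = 0.5 * (l1 * x * x + l2 * y * y)
    for k in range(max_iter):
        fk = 0.5 * (l1 * x * x + l2 * y * y)
        if fk <= rel_tol * f0:         # stop once f has dropped by the factor rel_tol
            return k
        gx, gy = l1 * x, l2 * y
        # exact minimizer of f along the antigradient for a quadratic
        alpha = (gx * gx + gy * gy) / (l1 * gx * gx + l2 * gy * gy)
        x, y = x - alpha * gx, y - alpha * gy
    return max_iter


# Ratios m/M = 1, 0.1, 0.01: the step count grows as the ellipses elongate.
counts = [steepest_descent_iterations(1.0, M) for M in (1.0, 10.0, 100.0)]
```

With equal eigenvalues the level lines are circles and a single step reaches the minimum; the counts for the elongated cases grow roughly in proportion to $M/m$.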
The slow convergence of gradient methods prevents their being used in solving complicated minimization problems since too much time is required even with the use of modern high speed computers. Therefore at present minimization methods have been and are being worked out which have a faster rate of convergence, and gradient methods are often used in combination with other more effective ones at the initial stage of solving the problem when the current point is at a great distance from the minimum and steps along the antigradient make it possible to obtain a significant decrease of the function. We also stress once more the unmistakable advantages of gradient methods and their suitability for minimizing functions of very different characters.

2. NEWTON'S METHOD WITH STEP ADJUSTMENT
Construction of the Method
I n gradient m e t h o d s o n l y the linear t e r m of the e x p a n s i o n of
t h e f u n c t i o n i n T a y l o r ’s s e r i e s i s u s e d i n c h o o s i n g t h e d i r e c t i o n o f
m o t i o n , i.e. u s e is m a d e o f t h e c r u d e s t a p p r o x i m a t i o n t o t h e f u n c t i o n
being minimized.
L e t f u n c t i o n / (a:) w h o s e m i n i m u m i s t o b e d e t e r m i n e d b e s t r i c t l y
c o n v e x a n d sufficiently s m o o t h .
Consider the function

(y) + ( f (y), x — y ) + 4 r ( f " ( y ) { x — y), * — y)

which is a quadratic approximation to $f(x)$ in the neighbourhood
of a certain point $y$. Since function $f(x)$ is strictly convex, function
$\psi(x)$, as can easily be ascertained, is also strictly convex; therefore
the minimum of this function is attained at a unique point $\bar{y}$, and
vector $p = \bar{y} - y$ which minimizes $\psi(x)$ is determined from the
formula $p = -(f''(y))^{-1} f'(y)$. The direction determined by vector $p$
is that of descent of $f(x)$ since $(f'(y), p) = -(f''(y)p, p) < 0$
by virtue of $f(x)$ being convex. The quadratic function $\psi(x)$ in the
neighbourhood of point $y$ is a far better approximation to the
function being minimized than a linear function. Therefore one
naturally expects, at least if point $y$ is in a sufficiently small
neighbourhood of solution $x_*$, that by moving from point $y$ in the
direction $p = -(f''(y))^{-1} f'(y)$ one can attain a more significant decrease
of the function and obtain a more accurate approximation to the
solution than by moving in the direction $-f'(y)$ which is used
in the gradient method. On the ground of the above argument we
suppose that the iterative process

$$x_{k+1} = x_k - \alpha_k (f''_k)^{-1} f'_k, \quad \alpha_k > 0, \quad k = 0, 1, \ldots \tag{2.1}$$

when used to construct successive approximations to the solution
of the problem of minimization of function $f(x)$ will prove more
effective than the method of steepest descent, i.e. that the rate of
convergence of $x_k \to x_*$, $f(x_k) \to f(x_*)$ when using algorithm (2.1)
will be faster than when applying the gradient method. The results
of this section will show that our expectation is justified.
We shall call method (2.1) Newton's method with adjustment of
steps, or generalized Newton method.
The usual Newton method corresponds to the case when $\alpha_k = 1$.
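The formula for the direction $p$ used in (2.1) can be checked in one line. The derivation below is only a sketch; it assumes that $f''(y)$ is positive definite (hence invertible), and the symbol $\bar{y}$ for the minimizer of $\psi$ is a notational assumption.

```latex
\psi'(x) = f'(y) + f''(y)(x - y) = 0
\;\Longrightarrow\;
p = \bar{y} - y = -\bigl(f''(y)\bigr)^{-1} f'(y),
\qquad
(f'(y), p) = -(f''(y)\,p, p) < 0 \quad \text{for } p \neq 0 .
```

The second relation follows by substituting $f'(y) = -f''(y)\,p$ into the scalar product, which is why $p$ is a descent direction whenever $f'(y) \neq 0$.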
Denoting the elements of matrix $(f''_k)^{-1}$ by $c_{ij}(x_k)$, $i, j = 1, 2, \ldots, n$,
where $i$ is the row index, we can write method (2.1)
in its coordinate form:

$$x^i_{k+1} = x^i_k - \alpha_k \sum_{j=1}^{n} c_{ij}(x_k)\, \frac{\partial f(x_k)}{\partial x^j}, \quad i = 1, \ldots, n.$$
Note that method (2.1) can also be presented in the following form:

$$f''_k\, p_k = -f'_k, \quad x_{k+1} = x_k + \alpha_k p_k$$

or in coordinate form

$$\sum_{j=1}^{n} \frac{\partial^2 f(x_k)}{\partial x^i\, \partial x^j}\, p^j_k = -\frac{\partial f(x_k)}{\partial x^i}, \quad x^i_{k+1} = x^i_k + \alpha_k p^i_k, \quad i = 1, \ldots, n.$$

Consequently, in order to determine vector $p_k$ one can solve a
system of linear equations instead of inverting the matrix $f''(x_k)$.
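The remark about solving a linear system instead of inverting the matrix can be sketched as follows. This is a minimal illustration, not the authors' program: the test function $f(x) = x_1^4 + x_1^2 + x_1 x_2 + x_2^2$ and the names `solve_linear`, `grad`, `hess` and `newton_direction` are assumptions made for the example.

```python
def solve_linear(a, b):
    """Solve a p = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix [a | b]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (m[r][n] - s) / m[r][r]
    return p


def grad(x):
    """Gradient f'(x) of the hypothetical test function."""
    return [4 * x[0] ** 3 + 2 * x[0] + x[1], x[0] + 2 * x[1]]


def hess(x):
    """Matrix of second derivatives f''(x) of the same function."""
    return [[12 * x[0] ** 2 + 2, 1.0], [1.0, 2.0]]


def newton_direction(x):
    """Direction p_k obtained from f''_k p_k = -f'_k, without forming the inverse."""
    g = grad(x)
    return solve_linear(hess(x), [-gi for gi in g])


p = newton_direction([1.0, 1.0])
```

At the point $(1, 1)$ the system reads $14p^1 + p^2 = -7$, $p^1 + 2p^2 = -3$, so the direction can be verified by hand.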
We shall study two variants of the generalized Newton method
in which different methods of choosing parameter $\alpha_k$ will be used.
The first of these methods consists of the following four steps:
(1) Setting $\alpha = 1$ calculate point $x = x_k + \alpha p_k$.
(2) Evaluate $f(x) = f(x_k + \alpha p_k)$.
(3) Check the inequality

$$f(x) - f(x_k) \le \varepsilon \alpha\, (f'_k, p_k), \quad 0 < \varepsilon < \tfrac{1}{2}. \tag{2.2}$$

(4) If the inequality is satisfied, then take the value $\alpha = 1$ to be
the sought one: $\alpha_k = 1$. Otherwise proceed to reduce $\alpha$ until
inequality (2.2) is satisfied.
In what follows we shall refer to this procedure as
choosing $\alpha_k$ according to condition (2.2). It can
be seen that this method of choosing the step length is analogous
to that in the method of steepest descent, involving the checking
of inequality (1.2).
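The four steps above can be sketched as a short program. The text does not say how $\alpha$ is to be reduced, so halving is an assumption here, as are the test function, the starting point, the value $\varepsilon = 0.25$ and all function names.

```python
def f(x):
    """Hypothetical strongly convex test function (not from the text)."""
    return x[0] ** 4 + x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def grad(x):
    return [4 * x[0] ** 3 + 2 * x[0] + x[1], x[0] + 2 * x[1]]


def hess(x):
    return [[12 * x[0] ** 2 + 2, 1.0], [1.0, 2.0]]


def solve_2x2(a, b):
    """Solve the 2x2 system a p = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def newton_with_step_adjustment(x0, eps=0.25, tol=1e-10, max_iter=50):
    """Method (2.1) with alpha_k chosen according to condition (2.2)."""
    x, steps = list(x0), []
    for _ in range(max_iter):
        g = grad(x)
        if (g[0] ** 2 + g[1] ** 2) ** 0.5 < tol:
            break
        p = solve_2x2(hess(x), [-g[0], -g[1]])  # f''_k p_k = -f'_k
        slope = g[0] * p[0] + g[1] * p[1]       # (f'_k, p_k), negative
        alpha, fx = 1.0, f(x)                   # step (1): start from alpha = 1
        for _ in range(60):                     # safety cap on the reductions
            trial = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
            if f(trial) - fx <= eps * alpha * slope:  # inequality (2.2)
                break
            alpha *= 0.5                        # assumed reduction rule: halving
        x = trial
        steps.append(alpha)
    return x, steps


x_min, steps = newton_with_step_adjustment([3.0, -2.0])
```

The list `steps` records the accepted values of $\alpha_k$, so one can inspect whether the full step $\alpha_k = 1$ is accepted near the solution.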
The other variant of method (2.1) requires that the value of $\alpha_k$
provide the minimum of the function in the direction of motion:

$$f\bigl(x_k - \alpha_k (f''_k)^{-1} f'_k\bigr) = \min_{\alpha \ge 0} f\bigl(x_k - \alpha (f''_k)^{-1} f'_k\bigr). \tag{2.3}$$
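Condition (2.3) asks for a one-dimensional minimization along the direction of motion. The text does not prescribe how to carry it out; the sketch below uses a golden-section search after a simple bracketing loop, which is a swapped-in technique chosen for illustration. The function names and the one-dimensional example $f(x) = x^4 + x^2$ are assumptions.

```python
def golden_section(phi, lo, hi, tol=1e-10):
    """Minimize a unimodal function phi on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2            # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                          # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
        else:                                # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
    return (a + b) / 2


def exact_step(phi):
    """Approximate argmin of a convex phi(alpha) over alpha >= 0."""
    hi = 1.0
    while phi(2.0 * hi) < phi(hi):           # expand until the minimum is bracketed
        hi *= 2.0
    return golden_section(phi, 0.0, 2.0 * hi)


def f(x):
    """Hypothetical one-dimensional example, strongly convex with minimum at 0."""
    return x ** 4 + x ** 2


x_k = 1.0
p_k = -(4 * x_k ** 3 + 2 * x_k) / (12 * x_k ** 2 + 2)   # -f'/f'' = -3/7
alpha_k = exact_step(lambda a: f(x_k + a * p_k))
```

In this example the minimum along the direction is reached where $x_k + \alpha p_k = 0$, that is at $\alpha = 7/3$, so the step given by (2.3) can differ noticeably from the unit step tried first under condition (2.2).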

Theorems about Properties of the Method

As follows from formula (2.1) Newton's method can only be
applied to the minimization of functions that have an invertible
matrix of second derivatives, and the matrix $(f''_k)^{-1}$, as will be shown
later on, must be bounded.
Strongly convex twice continuously differentiable functions possess
such properties. Therefore in this section we always assume the
function $f(x)$ to satisfy the following conditions:

$$m\|y\|^2 \le (f''(x)\,y, y) \le M\|y\|^2, \quad m > 0 \tag{2.4}$$

for any $x, y \in E^n$. Recall that such functions have a lower bound
and a unique minimum point $x_*$.
Theorem 2.1. If in minimizing function $f(x)$ that satisfies conditions
(2.4) use is made of method (2.1) and $\alpha_k$ is chosen according to
condition (2.2), then whatever the initial point $x_0$ the sequence $\{x_k\}$
converges to the minimum point at a superlinear rate

$$\|x_{N+l+1} - x_*\| \le C \lambda_N \lambda_{N+1} \cdots \lambda_{N+l} \tag{2.5}$$

where $N, C < \infty$, $\lambda_{N+l} < 1$ with any $l \ge 0$, $\lambda_{N+l} \to 0$ as $l \to \infty$.
Proof. Method (2.1) can be considered as a process of the gradient
type (1.20), assuming $F_k^{-1} = (f''_k)^{-1}$. Since matrix $f''_k$ possesses the
required properties, the convergence of method (2.1) to the solution
follows from the general results pertaining to the convergence of
gradient methods (theorem 1.6).
Let us establish the validity of estimate (2.5). Note first of all
that

$$(f'_k, p_k) = -(f''_k p_k, p_k) \le -m\|p_k\|^2. \tag{2.6}$$

Since $(f'_k, p_k) < 0$ and $(f'_k, p_k) \to 0$ (theorem 1.6), it follows from
(2.6) that $\|p_k\| \to 0$ as $k \to \infty$. Let us show now that from a certain
iteration on in method (2.1) we have $\alpha_k = 1$. Using Taylor's formula
and (2.6) we obtain

$$f_{k+1} - f_k = \alpha_k (f'_k, p_k) + \frac{\alpha_k^2}{2}(f''_k p_k, p_k) + \frac{\alpha_k^2}{2}\bigl((f''_{kc} - f''_k)\,p_k, p_k\bigr)$$
$$\le \alpha_k (f'_k, p_k)\left(1 - \frac{\alpha_k}{2} - \frac{\alpha_k \|f''_{kc} - f''_k\|}{2m}\right)$$

where $x_{kc} = x_k + \theta(x_{k+1} - x_k)$, $\theta \in [0, 1]$. Since $\|x_k - x_*\| \to 0$, we
have as $k \to \infty$

$$\|f''_{kc} - f''_k\| \le \|f''_{kc} - f''_*\| + \|f''_* - f''_k\| \to 0$$

by virtue of the (operator) function $f''(x)$ being continuous. Then
for any constant $0 < \varepsilon < \tfrac{1}{2}$ there is a number $N_0(\varepsilon)$ such that
with $k \ge N_0(\varepsilon)$ the condition

$$1 - \frac{\alpha_k}{2} - \frac{\alpha_k \|f''_{kc} - f''_k\|}{2m} \ge \varepsilon$$

will be satisfied with $\alpha_k = 1$. This means that inequality (2.2)
will also be satisfied with $\alpha_k = 1$. Thus the method used in choosing
the step length under the conditions of the theorem guarantees that
method (2.1) from a certain iteration on will be applied with a step
equal to unity, i.e. will be transformed into the usual Newton
method. We can now obtain bounds on the rate of convergence of
the method:

$$(x_{k+1} - x_*, x_{k+1} - x_*) = \bigl(x_k - x_* - (f''_k)^{-1} f'_k,\; x_{k+1} - x_*\bigr).$$

By Lagrange's formula for operators we have

$$\bigl((f''_k)^{-1} f'_k,\; x_{k+1} - x_*\bigr) = \bigl((f''_k)^{-1}(f'_k - f'_*),\; x_{k+1} - x_*\bigr) = \bigl((f''_k)^{-1} f''_{kc}\,(x_k - x_*),\; x_{k+1} - x_*\bigr)$$

where $x_{kc} = x_* + \theta(x_k - x_*)$, $\theta \in [0, 1]$. Consequently

$$\|x_{k+1} - x_*\|^2 = \bigl((I - (f''_k)^{-1} f''_{kc})(x_k - x_*),\; x_{k+1} - x_*\bigr) = \bigl((f''_k)^{-1}(f''_k - f''_{kc})(x_k - x_*),\; x_{k+1} - x_*\bigr)$$
$$\le \frac{1}{m}\,\|f''_k - f''_{kc}\|\,\|x_k - x_*\|\,\|x_{k+1} - x_*\|$$

or

$$\|x_{k+1} - x_*\| \le \lambda_k \|x_k - x_*\| \tag{2.7}$$

where $\lambda_k = \frac{1}{m}\|f''_k - f''_{kc}\|$. Since $\|f''_k - f''_{kc}\| \to 0$, there is a number
$N$ such that with $k = N + l$, $l = 0, 1, \ldots$ we have $\lambda_{N+l} < 1$
and $\lambda_{N+l} \to 0$ as $l \to \infty$. Setting $\|x_N - x_*\| = C$ and taking into
account the above remarks we obtain estimate (2.5).
The theorem is proved.
Let us suppose now that matrix $f''(x)$ satisfies besides conditions
(2.4) also Lipschitz' condition

$$\|f''(x) - f''(y)\| \le R\|x - y\|, \quad x, y \in E^n. \tag{2.8}$$

In this case we have in estimate (2.7)

$$\lambda_k = \frac{1}{m}\|f''_k - f''_{kc}\| \le \frac{R}{m}\|x_k - x_*\|$$

and therefore

$$\|x_{k+1} - x_*\| \le \frac{R}{m}\|x_k - x_*\|^2. \tag{2.9}$$

Consequently the following theorem is valid.
Theorem 2.2. If function $f(x)$ is such that conditions (2.4) and (2.8)
are satisfied, then sequence (2.1) in which the values of $\alpha_k$ are chosen
according to condition (2.2), whatever the initial point $x_0$, converges
to the solution at a quadratic rate, i.e. estimate (2.9) is valid.

Estimate (2.9) can be written also in the following form.
We set $\delta_k = \frac{R}{m}\|x_k - x_*\|$. There is a number $L$ such that with
$k = L + l$, $l = 0, 1, \ldots$ we have $\delta_{L+l} < 1$ and

$$\frac{R}{m}\|x_{L+l} - x_*\| \le \left(\frac{R}{m}\|x_{L+l-1} - x_*\|\right)^2 \le \ldots \le \left(\frac{R}{m}\|x_L - x_*\|\right)^{2^l}.$$

Finally we can write

$$\|x_{L+l} - x_*\| \le \frac{m}{R}\,\delta_L^{2^l}.$$
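Estimate (2.9) can be observed numerically. The sketch below applies the usual Newton method ($\alpha_k = 1$) to the one-dimensional function $f(x) = x^2/2 + e^x - x$, which is an example chosen here and not taken from the text; its minimum point is $x_* = 0$ and $f''(x) = 1 + e^x \ge 1$, so that $m = 1$. The function name `newton_errors` is hypothetical.

```python
import math


def newton_errors(x0, n_steps):
    """Distances |x_k - x_*| for pure Newton steps on f(x) = x^2/2 + e^x - x."""
    errors = [abs(x0)]
    x = x0
    for _ in range(n_steps):
        d1 = x + math.exp(x) - 1.0   # f'(x)
        d2 = 1.0 + math.exp(x)       # f''(x) >= 1
        x = x - d1 / d2              # Newton step with alpha_k = 1
        errors.append(abs(x))        # x_* = 0, so the error is |x|
    return errors


errors = newton_errors(1.0, 4)
# Ratios |x_{k+1} - x_*| / |x_k - x_*|^2, which (2.9) bounds by a constant.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(4)]
```

For this example the ratios stay below $1/2$, in line with a bound of the form (2.9), while the errors themselves fall from $1$ to below $10^{-8}$ in four steps. Only four steps are taken because a further step would bring the error down to the level of rounding errors.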

Let us consider now the variant of method (2.1) with the step
length being chosen according to condition (2.3). The convergence
of sequence $\{x_k\}$ to the solution in this case follows from the general
results about the convergence of gradient methods (theorem 1.7).
The rate of convergence, as in the case of choosing $\alpha_k$ according to
condition (2.2), will be superlinear if condition (2.4) is satisfied and
quadratic if condition (2.8) is also satisfied. This can be proved
as follows.
Let $\bar{x}_{k+1} = x_k - (f''_k)^{-1} f'_k$ and $x_{k+1} = x_k - \alpha_k (f''_k)^{-1} f'_k$ where $\alpha_k$
is chosen according to condition (2.3). Then using estimate (1.11)
we obtain

$$\frac{m}{2}\|x_{k+1} - x_*\|^2 \le f_{k+1} - f_* \le f(\bar{x}_{k+1}) - f_* \le \frac{M}{2}\|\bar{x}_{k+1} - x_*\|^2.$$

By (2.7), $\|\bar{x}_{k+1} - x_*\| \le \lambda_k \|x_k - x_*\|$, $\lambda_k \to 0$ as $k \to \infty$.
Consequently if conditions (2.4) are satisfied, then

$$\|x_{k+1} - x_*\| \le \left(\frac{M}{m}\right)^{1/2} \lambda_k \|x_k - x_*\| = \gamma_k \|x_k - x_*\| \tag{2.10}$$

where $\gamma_k \to 0$ as $k \to \infty$. If estimate (2.8) holds, then

$$\lambda_k \le \frac{R}{m}\|x_k - x_*\|, \quad \|x_{k+1} - x_*\| \le \left(\frac{M}{m}\right)^{1/2} \frac{R}{m}\,\|x_k - x_*\|^2. \tag{2.11}$$

Modifications of the Generalized Newton Method

As one of the possible modifications of method (2.1) we shall
consider an algorithm in which the sequence of approximations to the
solution is constructed by the following formula:

$$x_{k+1} = x_k - \alpha_k (f''_0)^{-1} f'_k, \quad \alpha_k \ge 0. \tag{2.12}$$

In this method $p_k = -(f''_0)^{-1} f'_k$, i.e. in order to determine the
directions of descent use is made of the same matrix $(f''_0)^{-1}$. Method
(2.12) is a particular case of algorithm (1.20) ($F_k^{-1} = (f''_0)^{-1}$). Therefore
it can be asserted that sequence (2.12), whatever the initial point $x_0$,
will converge to the solution at the rate of a geometric progression
both if the step length is chosen according to condition (2.2) and
if it is chosen using condition (2.3) (theorems 1.6 and 1.7). However
the value of the ratio $q$, i.e. the actual rate of convergence, will
significantly depend on the initial approximation chosen, $x_0$.
In fact, taking into account that in method (2.12) $(f'_k, p_k) = -(f''_0 p_k, p_k)$
and using Taylor's formula we can obtain the following
bound (as in theorem 2.1):

$$f_{k+1} - f_k \le \alpha_k (f'_k, p_k)\left(1 - \frac{\alpha_k}{2} - \frac{\alpha_k \|f''_{kc} - f''_0\|}{2m}\right) \tag{2.13}$$

where $x_{kc} = x_k + \theta(x_{k+1} - x_k)$, $\theta \in [0, 1]$. If $x_0 \to x_*$ then since
the matrix of second derivatives is continuous we have

$$\max_{x \in S} \|f''(x) - f''(x_0)\| \to 0$$

($S = \{x : f(x) \le f(x_0)\}$). Thus the closer the initial point $x_0$ is
to point $x_*$, the greater will be the values of $\alpha_k$ which satisfy
inequality (2.2), i.e. the greater will be the step in the process (2.12)
if the step length is chosen according to condition (2.2). In particular
for any constant $0 < \varepsilon < \tfrac{1}{2}$ there is a constant $\rho(\varepsilon)$ such
that if the initial approximation $x_0$ was chosen in a sphere of
radius $\rho$ about $x_*$, we shall have

$$1 - \frac{1}{2} - \frac{1}{2m}\max_{x \in S} \|f''(x) - f''_0\| \ge \varepsilon.$$

This means, by (2.13), that provided the initial approximation
was chosen sufficiently close to point $x_*$, inequality (2.2) will be
satisfied with $\alpha_k = 1$, i.e. process (2.12) will converge with a step
equal to unity. Then proceeding as in the proof of theorem 2.1 we
obtain the following estimate:

$$\|x_{k+1} - x_*\| \le \|(f''_0)^{-1}\|\,\|f''_0 - f''_{kc}\|\,\|x_k - x_*\| \le q\,\|x_k - x_*\| \tag{2.14}$$

where $q = \frac{1}{m}\max_{x \in S} \|f''_0 - f''(x)\|$. This shows that the value of the
ratio $q$ depends on the choice of the initial point $x_0$, the value
of $q$ becoming the smaller the closer point $x_0$ lies to the solution $x_*$.
For the variant of method (2.12) in which the step length is chosen
under the condition of $f(x)$ attaining its minimum in the direction of
motion, one can find using inequalities (2.14) and reasoning as in
obtaining estimates (2.10) and (2.11) that

$$\|x_{k+1} - x_*\| \le \left(\frac{M}{m}\right)^{1/2} q\,\|x_k - x_*\| = \tilde{q}\,\|x_k - x_*\|$$

where $q = \frac{1}{m}\max \|f''_0 - f''_{kc}\| \to 0$ if $x_0 \to x_*$.

Another possible modification of Newton's method is the following.
Let $k = \xi t + i$, $\xi = 0, 1, \ldots$, $i = 0, 1, \ldots, t - 1$, where $t \ge 1$ is
an arbitrary integer. We can construct an iterative process

$$x_{\xi t+i+1} = x_{\xi t+i} - \alpha_{\xi t+i}\,(f''_{\xi t})^{-1} f'_{\xi t+i}, \quad \alpha_{\xi t+i} \ge 0$$

or, with the original notations,

$$x_{k+1} = x_k - \alpha_k (f''_{\xi t})^{-1} f'_k, \quad \alpha_k > 0. \tag{2.15}$$

Such a method takes an intermediate position between algorithm
(2.1), in which for the construction of vector $p_k$ a new matrix $(f''_k)^{-1}$
is used, and algorithm (2.12), in which in determining the direction
of motion the same matrix $(f''_0)^{-1}$ is always used. In method (2.15)
a new matrix is applied after $t$ steps. This algorithm as well as
methods (2.1), (2.12) can be considered to be variants of the gradient
method (1.20); therefore its convergence with different ways of
choosing the step length follows from theorems 1.6 and 1.7.
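Methods (2.1), (2.12) and (2.15) differ only in how often the matrix of second derivatives is recomputed, which the following sketch makes explicit through a parameter `t`. The test function, the halving rule for reducing $\alpha$, the tolerance and the function names are assumptions made for the illustration; `t = 1` gives method (2.1), and a value of `t` larger than the number of iterations behaves like method (2.12).

```python
def f(x):
    """Hypothetical strongly convex test function (not from the text)."""
    return x[0] ** 4 + x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def grad(x):
    return [4 * x[0] ** 3 + 2 * x[0] + x[1], x[0] + 2 * x[1]]


def hess(x):
    return [[12 * x[0] ** 2 + 2, 1.0], [1.0, 2.0]]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def newton_refresh_every(x0, t, eps=0.25, tol=1e-8, max_iter=1000):
    """Method (2.15): the inverse matrix is recomputed only when k is a multiple of t."""
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if (g[0] ** 2 + g[1] ** 2) ** 0.5 < tol:
            return x, k
        if k % t == 0:
            h_inv = inverse_2x2(hess(x))        # new matrix after t steps
        p = [-(h_inv[0][0] * g[0] + h_inv[0][1] * g[1]),
             -(h_inv[1][0] * g[0] + h_inv[1][1] * g[1])]
        slope = g[0] * p[0] + g[1] * p[1]
        alpha, fx = 1.0, f(x)
        for _ in range(60):                     # step chosen by condition (2.2)
            trial = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
            if f(trial) - fx <= eps * alpha * slope:
                break
            alpha *= 0.5                        # assumed reduction rule: halving
        x = trial
    return x, max_iter


_, iters_newton = newton_refresh_every([1.0, 1.0], t=1)
_, iters_mixed = newton_refresh_every([1.0, 1.0], t=3)
_, iters_frozen = newton_refresh_every([1.0, 1.0], t=10 ** 6)
```

Comparing the three iteration counts for the same starting point shows the trade-off discussed in the text: fewer matrix evaluations are paid for with more iterations.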
We consider the rate of convergence of method (2.15), assuming
that the step length is chosen according to condition (2.2) and that
conditions (2.4) and (2.8) are valid for function $f(x)$.
Using Taylor's formula we obtain

$$f_{k+1} - f_k \le \alpha_k (f'_k, p_k)\left(1 - \frac{\alpha_k}{2} - \frac{\alpha_k \|f''_{kc} - f''_{\xi t}\|}{2m}\right).$$

Because of the convergence of the process we have

$$\|x_{kc} - x_{\xi t}\| = \|x_{(\xi t+i)c} - x_{\xi t}\| \le \|x_{\xi t} - x_{\xi t+1}\| + \ldots + \|x_{\xi t+i} - x_{\xi t+i+1}\| \to 0$$

as $k \to \infty$, therefore $\|f''_{kc} - f''_{\xi t}\| \to 0$. Taking this into account
and reasoning as we did in theorem 2.1 we can prove that from a
certain iteration on, method (2.15) will converge with a step equal
to unity: $\alpha_k = 1$. Then, by theorem 2.2, the following estimate is
valid

$$\|x_{\xi t+1} - x_*\| \le \frac{R}{m}\|x_{\xi t} - x_*\|^2 \tag{2.16}$$

with any $\xi \ge L$ where $L$ is a positive number. Further, proceeding
as in theorem 2.1 we can obtain the estimate

$$\|x_{\xi t+2} - x_*\| = \|x_{\xi t+1} - (f''_{\xi t})^{-1} f'_{\xi t+1} - x_*\| \le \|(f''_{\xi t})^{-1}\|\,\|f''_{\xi t} - f''_{(\xi t+1)c}\|\,\|x_{\xi t+1} - x_*\|$$

where $x_{(\xi t+1)c} = x_{\xi t+1} + \theta(x_* - x_{\xi t+1})$, $\theta \in [0, 1]$. We have by (2.8)

$$\|f''_{\xi t} - f''_{(\xi t+1)c}\| \le \|f''_{\xi t} - f''_*\| + \|f''_* - f''_{(\xi t+1)c}\| \le R\bigl(\|x_{\xi t} - x_*\| + \|x_{\xi t+1} - x_*\|\bigr).$$

Using estimate (2.16) we now obtain

$$\|x_{\xi t+2} - x_*\| \le \frac{R}{m}\bigl(\|x_{\xi t} - x_*\| + \|x_{\xi t+1} - x_*\|\bigr)\|x_{\xi t+1} - x_*\| \le \frac{R^2}{m^2}\|x_{\xi t} - x_*\|^3\left(1 + \frac{R}{m}\|x_{\xi t} - x_*\|\right),$$

i.e.

$$\|x_{\xi t+2} - x_*\| \le C_2 \|x_{\xi t} - x_*\|^3, \quad C_2 < \infty.$$

Suppose that with a certain $2 \le j \le t - 1$ the following estimate
is satisfied:

$$\|x_{\xi t+j} - x_*\| \le C_j \|x_{\xi t} - x_*\|^{j+1}, \quad C_j < \infty.$$

Then we have

$$\|x_{\xi t+j+1} - x_*\| = \|x_{\xi t+j} - (f''_{\xi t})^{-1} f'_{\xi t+j} - x_*\| \le \|(f''_{\xi t})^{-1}\|\,\|f''_{\xi t} - f''_{(\xi t+j)c}\|\,\|x_{\xi t+j} - x_*\|$$
$$\le \frac{R}{m}\bigl(\|x_{\xi t} - x_*\| + \|x_{\xi t+j} - x_*\|\bigr)\|x_{\xi t+j} - x_*\| \le C_{j+1}\|x_{\xi t} - x_*\|^{j+2}, \quad C_{j+1} = \frac{R}{m}\,C_j\bigl(1 + C_j\|x_{\xi t} - x_*\|^{j}\bigr).$$

Thus the following bound on the rate of convergence is valid
for method (2.15):

$$\|x_{(\xi+1)t} - x_*\| \le C\|x_{\xi t} - x_*\|^{t+1}. \tag{2.17}$$

This estimate means that sequence $\{x_{\xi t}\}$ converges to the solution
at a rate of the order of $t + 1$.

Discussion of the Properties of Newton's Method

We have established that Newton's method with adjustment of
steps converges to the solution whatever the initial point $x_0$, at
a rate either superlinear or quadratic depending on the requirements
satisfied by function $f(x)$.
The convergence of method (2.1) from any initial approximation
is its essential advantage over the usual Newton method, in
which the convergence is ensured only if the initial approximation is
sufficiently good (i.e. sufficiently close to the solution of the problem).
Besides, in applying Newton's method the check of the conditions
which guarantee that the initial approximation ensures the
convergence of the process is in practice difficult to perform, since
it requires data about the function that are usually unknown
(for instance, values of the constants $m$, $M$).
The comparison of the two methods of choosing the step length
according to condition (2.2) or (2.3) is in favour of the former, for
it proves less laborious as to the number of function evaluations
(in particular, from a certain iteration on the former method
requires the function to be evaluated only once since $\alpha_k = 1$) and
guarantees a rate of convergence not slower than with the latter
method.
If we compare Newton's method and gradient methods as applied
to solving problems of convex function minimization, it becomes
evident that Newton's method ensures a faster rate of convergence
of the sequence of approximations to the solution. Thus if we consider
the rate of convergence to mean effectiveness of a method, then our
supposition stated at the beginning of this section that Newton's
method must be far more effective than the gradient methods is
justified. However a more precise meaning of the concept of
effectiveness of a method is based on estimating the amount of
computation involved when applying a concrete algorithm for the solution
of a problem to the required accuracy. Consequently the effectiveness
of an algorithm can be estimated by the number of iterations which
are necessary for solving the problem and the amount of
computation at each iteration.
The amount of computation per iteration in Newton's method
is as a rule considerably greater than in the gradient methods because
of the necessary computations and inversions of the matrices of
second derivatives. On the other hand, Newton's method usually
involves tens or hundreds of times fewer iterations than gradient
methods; by virtue of this fact Newton's method proves to be
considerably more effective.
Nevertheless in many problems the labour per iteration in Newton's
method can prove excessively great because it is necessary
to calculate matrices of second derivatives $f''(x)$ (as a rule in solving
extremal problems the greatest difficulty is the calculation of the
matrix $f''(x)$ and not its inversion). Such problems will be considered
later on. In order to solve the problem in such a case, one can make
use of one of the modifications of Newton's method which we have
studied. In one of the modifications we have to calculate and invert
the matrix $f''(x)$ only once, in the other this is done after a finite
number of iterations. If the initial approximation is good enough,
then the rate of convergence to the solution will be fast. However,
using modifications of Newton's method is not a cardinal solution
of the problem of reducing the amount of work required to solve the
problem (speaking generally, it can become even greater). Therefore
we come to the question of the possibility of constructing
minimization methods which would be close to Newton's method as to their
rate of convergence and would require considerably less
computation at every iteration.
Several such methods have been worked out; they are based on
different ideas. As a rule they prove more effective than Newton's
method and this is why they are used more and more at present.
The next three sections are devoted to the study of such algorithms.

3. METHODS OF DUAL DIRECTIONS

Considerations on the Choice of Schemes of the Methods

In the preceding section we noted that the main difficulty in
applying Newton's method is the necessity of evaluating the matrix
of second derivatives of the function being minimized. Consequently
algorithms which would be more effective than Newton's method
should exclude the calculation of second derivatives, while retaining
the rate of convergence of Newton's method.
The question arises whether it is possible in constructing the
sequence of approximations to the solution to determine directions
$p_k$ which would be close to those in Newton's method, by using for
this purpose only the first derivative of the function being minimized.
The first and the second derivatives of $f(x)$ can be related by
Taylor's formula for operators (the gradient $f'(x)$ is one):

$$f'(y) - f'(x) = f''(x)(y - x) + \omega(x, y - x) \tag{3.1}$$

where $\|\omega(x, y - x)\| = o(\|y - x\|)$.
The equality (3.1) suggests that if we calculate the derivatives
$f'(x)$ at arbitrary but close points $x_1, \ldots, x_{n+1}$ and determine
the square $n \times n$-matrix $A$ with the aid of a system of (vector)
equations

$$f'(x_{i+1}) - f'(x_i) = A(x_{i+1} - x_i), \quad i = 1, \ldots, n \tag{3.2}$$

(assuming of course vectors $x_{i+1} - x_i$, $i = 1, \ldots, n$ to be linearly
independent), then matrix $A$ must be close to the matrix of second
derivatives calculated at any point $x_i$.
In fact, by (3.1) with any $i$ we have

$$f'(x_{i+1}) - f'(x_i) = f''(x_i)(x_{i+1} - x_i) + \omega(x_i, x_{i+1} - x_i)$$

and therefore using (3.2) we obtain

$$A(x_{i+1} - x_i) = f''(x_i)(x_{i+1} - x_i) + \omega(x_i, x_{i+1} - x_i), \quad i = 1, \ldots, n.$$

This system of equations can be rewritten in the following form:

$$A(x_{i+1} - x_i) = f''_j\,(x_{i+1} - x_i) + (f''_i - f''_j)(x_{i+1} - x_i) + \omega(x_i, x_{i+1} - x_i), \quad i = 1, \ldots, n, \quad 1 \le j \le n. \tag{3.3}$$

If matrix $f''(x)$ is continuous and nonsingular, then by virtue of the
assumed closeness of points $x_i$ the sum of the two last terms of the
right-hand side of each equation in system (3.3) must be considerably
less than the first term, i.e.

$$A(x_{i+1} - x_i) \approx f''(x_j)(x_{i+1} - x_i), \quad i = 1, \ldots, n$$

and this in general means that matrices $A$ and $f''_j$, $j = 1, \ldots, n$,
must be close to each other. It can be easily imagined how the above
considerations can be used for the construction of iterative processes
of minimization. If $\{x_k\}$ is an arbitrarily constructed sequence which
converges to the minimum point of $f(x)$, then in a sufficiently small
neighbourhood of the minimum point points $x_k, x_{k-1}, \ldots, x_{k-n}$
are close to one another. Therefore having defined matrix $A_k$ by the
system of equations

$$A_k(x_{k-i} - x_{k-i-1}) = f'(x_{k-i}) - f'(x_{k-i-1}), \quad i = 0, 1, \ldots, n - 1$$

we can construct the $(k + 1)$-th approximation using now the formula

$$x_{k+1} = x_k - \alpha_k A_k^{-1} f'_k, \quad \alpha_k > 0. \tag{3.4}$$

If matrix $A_k$ happens to be sufficiently close to matrix $f''_k$, then
direction $p_k = -A_k^{-1} f'_k$ will be close to direction $-(f''_k)^{-1} f'_k$ (i.e. to
the direction of motion in Newton's method) and therefore will be
the direction of descent. If we continue in a similar way to determine
matrices $A_{k+1}, A_{k+2}, \ldots$, by virtue of their being close
to matrices $f''_{k+1}, f''_{k+2}, \ldots$ process (3.4) must be close as to its
properties to Newton's method. At the same time method (3.4) does
not require the calculation of the second derivatives of the function.
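The construction of the direction $p_k = -A_k^{-1} f'_k$ from gradient differences can be sketched for $n = 2$ as follows. The quadratic test function, the three sample points and the function names are assumptions made for the illustration. Instead of forming $A_k$ explicitly, the sketch uses the fact that $A_k r_i = e_i$ implies $A_k^{-1} e_i = r_i$, where $r_i$ and $e_i$ denote the differences of points and of gradients; this is one possible way of organizing the computation, not necessarily the one the authors develop later.

```python
def grad(x):
    """Gradient of the hypothetical quadratic f(x) = x1^2 + x1*x2 + 2*x2^2."""
    return [2 * x[0] + x[1], x[0] + 4 * x[1]]


def solve_2x2(a, b):
    """Solve the 2x2 system a c = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def secant_direction(points):
    """points = [x_{k-2}, x_{k-1}, x_k]; returns p_k = -A_k^{-1} f'(x_k)."""
    r, e = [], []
    for i in range(2):
        newer, older = points[-1 - i], points[-2 - i]
        g_new, g_old = grad(newer), grad(older)
        r.append([newer[0] - older[0], newer[1] - older[1]])   # x_{k-i} - x_{k-i-1}
        e.append([g_new[0] - g_old[0], g_new[1] - g_old[1]])   # gradient difference
    g = grad(points[-1])
    # Expand f'(x_k) = c0 * e_0 + c1 * e_1; then A_k^{-1} f'(x_k) = c0 * r_0 + c1 * r_1.
    e_cols = [[e[0][0], e[1][0]], [e[0][1], e[1][1]]]
    c = solve_2x2(e_cols, g)
    return [-(c[0] * r[0][0] + c[1] * r[1][0]),
            -(c[0] * r[0][1] + c[1] * r[1][1])]


points = [[2.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
p_k = secant_direction(points)
```

For a quadratic function the gradient differences reproduce the matrix of second derivatives exactly, so in this example a unit step from $x_k = (1, 2)$ along $p_k$ leads straight to the minimum point $(0, 0)$, as a Newton step would.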
On the basis of the preceding reasoning it proves possible to
construct a whole class of descent processes with a superlinear rate of
convergence whose implementation does not require the
calculation of second derivatives of the function. We call these processes
methods of dual directions. The origin of this name will become
clear later on when we shall discuss methods for calculating matrix
$A_k^{-1}$ and vector $p_k$.
We now turn to the strict substantiation of methods of the (3.4)
type.
Substantiation of the Methods

Suppose that $f(x)$ is a function which has continuous first and
second derivatives. Let an infinite sequence of elements $\{x_k\}$ be given.
We take the sequence $\{y_k\}$ corresponding to $\{x_k\}$ in accordance with
the formula

$$y_k = x_k + r_k \tag{3.5}$$

where vectors $r_k$ are such that the following two conditions are
satisfied:
(1) If $\Delta_k$ is a determinant whose columns are vectors $r_{k-i}/\|r_{k-i}\|$,
$i = 0, 1, \ldots, n - 1$, then with any $k \ge n - 1$ we have $|\Delta_k| \ge \varepsilon$,
where $\varepsilon$ is an arbitrarily small positive number.
(2) $\|r_k\| \to 0$ as $k \to \infty$. In other respects the choice of vectors
$r_k$ is arbitrary.
The first of the requirements that vectors $r_k$ must satisfy is, in
point of fact, the requirement of their linear independence.
Lemma 3.1. Let $\{x_k\}$ be a bounded sequence, where $\|x_{k+1} - x_k\| \to 0$
as $k \to \infty$, and with any $k \ge n - 1$ let matrix $A_k$ be defined by the
following system of equations

$$A_k r_{k-i} = e_{k-i}, \quad i = 0, 1, \ldots, n - 1, \tag{3.6}$$

where $e_{k-i} = f'(y_{k-i}) - f'(x_{k-i})$ and $r_k$, $y_k$ are elements of sequence
(3.5). Then we have

$$\lim_{k \to \infty} \|A_k - f''(x_k)\| = 0.$$

Proof. Using formula (5.1) of Chap. I to give the operator $f'(x)$
in terms of its derivatives, we can write

$$f'(y_{k-i}) - f'(x_{k-i}) = \int_0^1 f''\bigl(x_{k-i} + \tau(y_{k-i} - x_{k-i})\bigr)\, r_{k-i}\, d\tau$$
$$= \int_0^1 f''(x_{k-i})\, r_{k-i}\, d\tau + \int_0^1 \bigl[f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\bigr]\, r_{k-i}\, d\tau$$
$$= f''(x_{k-i})\, r_{k-i} + \int_0^1 \bigl[f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\bigr]\, r_{k-i}\, d\tau.$$

Using this expression we have

$$\bigl(A_k - f''(x_k)\bigr)\, r_{k-i} = \bigl(f''(x_{k-i}) - f''(x_k)\bigr)\, r_{k-i} + \int_0^1 \bigl[f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\bigr]\, r_{k-i}\, d\tau.$$
n


Introducing the notation $B_k = A_k - f''_k$ we obtain
$$\|B_k r_{k-i}\| \le \|f''_{k-i} - f''_k\|\, \|r_{k-i}\| + \sup_{0 \le \tau \le 1} \|f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\|\, \|r_{k-i}\|. \tag{3.7}$$
Since $\{x_k\}$ is a bounded sequence, with any $k$ we have $x_k \in Q$, where
$Q \subset E^n$ is a closed bounded set. Function $f''(x)$ is uniformly
continuous in set $Q$; consequently $\|f''_{k-i} - f''_k\| = q_{k-i} \to 0$ and
$$\sup_{0 \le \tau \le 1} \|f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\| = \rho_{k-i} \to 0 \quad \text{as } k \to \infty.$$
Thus it follows from (3.7) that
$$\|B_k r_{k-i}\| \le (q_{k-i} + \rho_{k-i})\, \|r_{k-i}\| = h_{k-i}\, \|r_{k-i}\| \tag{3.8}$$
where $h_{k-i} \to 0$ as $k \to \infty$.
According to the definition of the operator norm,
$\|B_k\| = \max_{\|z\| = 1} \|B_k z\|$. Let the maximum be attained at element $z_k$. If
$$z_k = \delta_k \frac{r_k}{\|r_k\|} + \dots + \delta_{k-n+1} \frac{r_{k-n+1}}{\|r_{k-n+1}\|},$$
then because of the condition $|\Delta_k| \ge \varepsilon > 0$ the coefficients $\delta_{k-i}$
will be bounded: $|\delta_{k-i}| \le C$, $i = 0, 1, \dots, n-1$. Using the
expression of vector $z_k$ we obtain
$$\|B_k\| = \|B_k z_k\| = \left\| \sum_{i=0}^{n-1} \delta_{k-i} \frac{B_k r_{k-i}}{\|r_{k-i}\|} \right\| \le \sum_{i=0}^{n-1} |\delta_{k-i}| \frac{\|B_k r_{k-i}\|}{\|r_{k-i}\|}.$$
Hence, taking into account (3.8) and the fact that $|\delta_{k-i}|$ is
bounded, we have
$$\|B_k\| \le \sum_{i=0}^{n-1} |\delta_{k-i}| \frac{\|B_k r_{k-i}\|}{\|r_{k-i}\|} \le \sum_{i=0}^{n-1} |\delta_{k-i}|\, h_{k-i} \to 0$$
as $k \to \infty$. The lemma is proved.
The results of the above lemma open the way to the construction
of methods of the type (3.4).
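The statement of lemma 3.1 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the test function, the point `x`, and the helper names `grad`, `hessian`, `secant_matrix` and `max_abs_diff` are not from the text, and all gradient differences are taken from the same point rather than from successive iterates. It builds a matrix from the secant equations $A r_i = e_i$ with $r_i$ directed along the coordinate axes and compares it with the exact matrix of second derivatives as $\|r_i\|$ decreases.

```python
import math

def grad(x):
    # Gradient of the assumed test function f(x) = exp(x0) + x0*x1 + x1**4.
    return [math.exp(x[0]) + x[1], x[0] + 4.0 * x[1] ** 3]

def hessian(x):
    # Exact matrix of second derivatives of the same function, for comparison.
    return [[math.exp(x[0]), 1.0], [1.0, 12.0 * x[1] ** 2]]

def secant_matrix(x, step):
    # Solve A r_i = e_i with r_i = step * (i-th unit vector) and
    # e_i = f'(x + r_i) - f'(x); column i of A is then e_i / step.
    n = len(x)
    g0 = grad(x)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        y = list(x)
        y[i] += step
        e = [gy - gx for gy, gx in zip(grad(y), g0)]
        for row in range(n):
            a[row][i] = e[row] / step
    return a

def max_abs_diff(a, b):
    # Largest entrywise deviation between two matrices.
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

x = [0.3, -0.7]
errors = [max_abs_diff(secant_matrix(x, h), hessian(x)) for h in (1e-1, 1e-2, 1e-3)]
print(errors)  # the deviation shrinks as the difference step shrinks
```

The coordinate directions keep the normalized determinant equal to one, so condition (1) holds trivially in this example.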
Lemma 3.2. If $f(x)$ is a continuously differentiable strongly convex
function and sequence $\{x_k\}$ is such that $f_{k+1} \le f_k$ and
$(f'_k, x_{k+1} - x_k) \to 0$ as $k \to \infty$, then $\|x_{k+1} - x_k\| \to 0$.
Proof. According to the condition $f_{k+1} \le f_k$ we have $x_{k+1} \in S_k$,
$S_k = \{x \mid f(x) \le f(x_k)\}$, with any $k$. Set $S_k$ is strongly convex since
$f(x)$ is a strongly convex function (lemma 2.8 of Chap. I). Then there is
a positive number $\lambda > 0$ such that any point $\frac{x_{k+1} + x_k}{2} + \xi$, where
$\|\xi\| \le \lambda \|x_{k+1} - x_k\|^2$, is an internal point of set $S_k$. Let
$x_{k+1} - x_k = v + \omega$, where $v \in T_k$, $T_k$ is a plane tangent to set $S_k$
at point $x_k$, and $\omega \perp T_k$. Then, noting that $f'(x_k) \perp T_k$, we obtain
$$|(f'_k, x_{k+1} - x_k)| = |(f'_k, v + \omega)| = \|f'_k\|\, \|\omega\|.$$
But $\tfrac{1}{2} \|\omega\| \ge \|\xi\|$, since otherwise in addition to point $x_k$ set $S_k$
and plane $T_k$ would have other points in common, which contradicts
the strong convexity of $S_k$. Therefore
$$\tfrac{1}{2}\, |(f'_k, x_{k+1} - x_k)| \ge \lambda\, \|f'_k\|\, \|x_{k+1} - x_k\|^2.$$
Hence if $\|f'_k\| \not\to 0$, then $\|x_{k+1} - x_k\| \to 0$. But if $\|f'_k\| \to 0$, then,
since $f(x)$ is strongly convex, the maximum diameter $d_k$ of set $S_k$
tends to zero, which implies that $\|x_{k+1} - x_k\| \to 0$. The lemma is proved.
We can now discuss the properties of process (3.4). We shall study
this process assuming that the value of parameter $\alpha_k$ is chosen
according to condition (2.2), taking into account that in this case
$p_k = -A_k^{-1} f'_k$. The function being minimized is assumed to be
smooth and strongly convex. We justify the possibility of choosing
parameter $\alpha_k$ in the same way as was done in the preceding sections.
Expanding the function in Taylor's series, we obtain the following
estimate:
$$f_{k+1} - f_k \le \alpha_k (f'_k, p_k) \left[ 1 + \frac{\alpha_k (A_k p_k, p_k)}{2 (f'_k, p_k)} + \frac{\alpha_k a_k \|p_k\|^2}{2 (f'_k, p_k)} \right]$$
where $a_k = \|f''_{kc} - A_k\|$, $f''_{kc} = f''(x_{kc})$, $x_{kc} = x_k + \theta (x_{k+1} - x_k)$,
$\theta \in [0, 1]$. Note that
$$(f'_k, p_k) = -(A_k p_k, p_k) = -(f'_k, A_k^{-1} f'_k). \tag{3.9}$$
Taking (3.9) into account we have
$$f_{k+1} - f_k \le \alpha_k (f'_k, p_k) \left[ 1 - \frac{\alpha_k}{2} + \frac{\alpha_k a_k \|p_k\|^2}{2 (f'_k, p_k)} \right].$$
Hence condition (2.2) will always be satisfied if $\alpha_k$ satisfies the
following inequality:
$$1 - \frac{\alpha_k}{2} + \frac{\alpha_k a_k \|p_k\|^2}{2 (f'_k, p_k)} \ge \varepsilon. \tag{3.10}$$

Theorem 3.1. If $f(x)$ is a twice continuously differentiable function
for which conditions (2.4) are valid, matrix $A_k$ with any $k \ge n-1$
is defined by system (3.6) and satisfies the condition
$$(A_k^{-1} f'_k, f'_k) > 0 \tag{3.11}$$
and $\alpha_k$ is determined according to condition (2.2), then whatever the
initial point $x_0$ the following statements are valid for sequence (3.4):
$f_{k+1} < f_k$ and $\|x_k - x_*\| \to 0$ at a superlinear rate of convergence:
$$\|x_{N+l+1} - x_*\| \le C \lambda_N \cdots \lambda_{N+l} \tag{3.12}$$
where $C, N < \infty$, $\lambda_{N+l} < 1$ with any $l \ge 0$, and $\lambda_i \to 0$ as $i \to \infty$.
Proof. In order to make use of the results of lemma 3.1 we must
first of all show that the conditions of the theorem imply that
$\|x_{k+1} - x_k\| \to 0$ for sequence (3.4).
According to conditions (3.9) and (3.11) we have $(f'_k, p_k) < 0$
with any $k$. Hence it follows, first, that there is always a value
$\alpha_k > 0$ such that inequality (3.10) is satisfied (and consequently
also (2.2)); secondly, by (2.2) we shall have $f_{k+1} < f_k$. This means
that $x_{k+1} \in S = \{x : f(x) \le f(x_0)\}$ with any $k$ and, besides, since $f(x)$
has a lower bound, $f_k - f_{k+1} \to 0$; by virtue of this it follows from
(2.2) that
$$\alpha_k (f'_k, p_k) = (f'_k, x_{k+1} - x_k) \to 0. \tag{3.13}$$
Since $f_{k+1} < f_k$ and condition (3.13) is fulfilled, sequence $\{x_k\}$
satisfies the requirements of lemma 3.2. Hence, as $k \to \infty$,
$$\|x_{k+1} - x_k\| \to 0. \tag{3.14}$$
Thus the conditions of the theorem provide for the fulfillment of
all the requirements of lemma 3.1. It follows that
$$\|A_k - f''_k\| \to 0 \tag{3.15}$$
as $k \to \infty$. Taking into account conditions (2.4), for any $M_1$ and $m_1$
such that $M_1 > M$ and $0 < m_1 < m$ there is a number $L$ such that
with $k \ge L$ for any $y \in E^n$ we have
$$m_1 \|y\|^2 \le (A_k y, y) \le M_1 \|y\|^2. \tag{3.16}$$
Because of conditions (3.9) and (3.16), from a certain $k$ on we shall
have $(f'_k, p_k) \le -m_1 \|p_k\|^2$. Consequently inequality (3.10) will
always be satisfied if the following inequality is satisfied:
$$1 - \frac{\alpha_k}{2} \left( 1 + \frac{a_k}{m_1} \right) \ge \varepsilon, \quad 0 < \varepsilon < \frac{1}{2}. \tag{3.17}$$
Noting that $a_k \le M + M_1 < \infty$, it is easily ascertained that
there is a constant $\bar{\alpha} > 0$ such that with any $k$ inequality (3.17)
will be satisfied with $\alpha_k \ge \bar{\alpha}$. By virtue of this it follows from (3.14)
that $\|p_k\| = \frac{1}{\alpha_k} \|x_{k+1} - x_k\| \to 0$.


Hence
$$\|f'_k\| = \|A_k p_k\| \le M_1 \|p_k\| \to 0.$$
The last condition, as shown by inequality (1.12), means that
$x_k \to x_*$. Let us establish that estimate (3.12) holds.
Since $\|p_k\| \to 0$, condition (3.15) is fulfilled and the second
derivatives of function $f(x)$ are uniformly continuous on set $S$,
we have as $k \to \infty$
$$a_k \le \|f''(x_k + \theta (x_{k+1} - x_k)) - f''(x_k)\| + \|f''(x_k) - A_k\| \to 0$$
and it follows that with any $0 < \varepsilon < \frac{1}{2}$ from a certain iteration
on inequality (3.17) will be satisfied with $\alpha_k = 1$. This means that
process (3.4) will be implemented with a step equal to unity.
The left-hand inequality of estimates (3.16) means that with
$k \ge L$ we shall have $\|A_k^{-1}\| \le \frac{1}{m_1}$ (lemma 2.9 of Chap. I). Together
with the possibility of inverting matrix $A_k$ with any $k \ge n-1$
this estimate makes it possible to conclude that there is a constant $M_2$
such that $\|A_k^{-1}\| \le M_2$ with $k \ge n-1$. Taking this into account
we can establish the bounds on the rate of convergence as we did
in theorem 2.1:
$$\|x_{k+1} - x_*\| \le \|I - A_k^{-1} f''_{kc}\|\, \|x_k - x_*\| \le \|A_k^{-1}\|\, \|A_k - f''_{kc}\|\, \|x_k - x_*\| \le M_2 \|A_k - f''_{kc}\|\, \|x_k - x_*\|$$
or $\|x_{k+1} - x_*\| \le \lambda_k \|x_k - x_*\|$, where $\lambda_k = M_2 \|A_k - f''_{kc}\|$ and $f''_{kc}$
is now taken at the intermediate point $x_k + \theta (x_* - x_k)$. Since
$\|A_k - f''_{kc}\| \le \|A_k - f''_k\| + \|f''_k - f''(x_k + \theta (x_* - x_k))\| \to 0$, there is
a number $N$ such that with $k = N + l$, $l = 0, 1, \dots$ we shall have
$\lambda_{N+l} < 1$ and, as $l \to \infty$, $\lambda_{N+l} \to 0$.
Setting $\|x_N - x_*\| = C$ and taking into account our remarks
about the values of $\lambda_k$, we obtain estimate (3.12). The theorem is
proved.
Condition (3.11) used in the theorem, because of (3.9), means
that the direction $p_k$ is a direction of descent of $f(x)$. It can occur
that $(A_k^{-1} f'_k, f'_k) \le 0$ at some iterations of the process. In this
case we can either change vector $r_k$ and construct a new matrix $A_k^{-1}$
(such that condition (3.11) is satisfied) or make a step in the
direction of the antigradient. The number of such steps will always
be finite, since in descending along the antigradient $\|x_{k+1} - x_k\| \to 0$
and, if vectors $r_k$ satisfy the requirements formulated above, then
$\|A_k - f''_k\| \to 0$. Consequently, according to conditions (3.16) and
(3.9), from a certain iteration on we shall have of necessity
$(A_k^{-1} f'_k, f'_k) > 0$. However, if with a certain $k$ the condition
$(A_k^{-1} f'_k, f'_k) = -(p_k, f'_k) < 0$ is fulfilled, then it is easier to change
in formula (3.4) the sign of the scalar factor $\alpha_k$; then the motion from
point $x_k$ will be in the direction $-p_k$, i.e. in the direction of descent.


Construction of Various Algorithms
The requirements which must be satisfied by vectors $r_k$ used in
constructing sequence (3.5) are not strict and leave us great freedom
of choice of these vectors. This makes it possible to construct
different algorithms of type (3.4), since different sequences $\{r_k\}$ will
define (by (3.6)) different sequences of matrices $A_k$.
Let us discuss some possible ways of constructing vectors $r_k$. We
can take as $r_k$ vectors directed along the axes of coordinates. For
example, if $r_0 = \lambda_0 v_1$, then with $k = tn + i$, where $t$ is an integer
and $i = 0, 1, \dots, n-1$, we have $r_k = \lambda_k v_{i+1}$; here $v_{i+1}$ is the unit
vector of the corresponding axis and $\lambda_k$ is a numerical factor such
that $\lambda_k \to 0$ as $k \to \infty$. Such a choice of vector $r_k$ guarantees that
the condition $|\Delta_k| \ge \varepsilon$ will be satisfied. In this case, in order to
determine matrix $A_k$, it is necessary to calculate at each iteration
the derivatives at two points, $x_k$ and $y_k$. The law of the decrease
of $\lambda_k$ may be chosen arbitrarily; computational practice shows,
however, that the maximum rate of convergence is obtained with
monotonically diminishing $\lambda_k$; one can, for instance, assume that
$\lambda_k =$ [...].
Another possible method of determining vectors $r_k$ is as follows.
With $k \ge n-1$ we can, instead of (3.5), use directly sequence (3.4),
i.e. assume $r_k = x_{k+1} - x_k = -\alpha_k A_k^{-1} f'_k$. In fact, the proof of
theorem 3.1 shows clearly that if $A_k$ is an arbitrary matrix which
satisfies only condition (3.11) and $\alpha_k$ is chosen according to
condition (2.2), then $\|x_{k+1} - x_k\| \to 0$ as $k \to \infty$. Consequently, if we
use sequence (3.4) for constructing vectors $r_k$, then the requirement
$\|r_k\| \to 0$ will of necessity be fulfilled and we only need to establish
$|\Delta_k| \ge \varepsilon$. If this condition is not satisfied with a certain $k$, another
vector $r_k$ must be chosen (using not (3.4) but a new formula). In
such an algorithm, for determining matrix $A_k$ at every iteration
(where sequence (3.4) provides the fulfillment of the requirements
to be met by vectors $r_k$) the gradient must be calculated only at one
point $x_k$.
Of course, other methods of constructing vectors $r_k$ may be used.
In the system of equations (3.6) which defines matrix $A_k$ only
one new vector $r_k$ and the corresponding vector $e_k$ are used with
any $k$; the remaining vectors $r_{k-1}, \dots, r_{k-n+1}$ and $e_{k-1}, \dots, e_{k-n+1}$
are taken from preceding iterations. The system (3.6) can be
modified so that at each iteration of process (3.4) an arbitrary number
$j \le n$ of vectors $r_{k-i_1}, \dots, r_{k-i_j}$ (and of their corresponding
vectors $e_{k-i_1}, \dots, e_{k-i_j}$) is renewed and the remaining $n - j$
vectors $r_{k-i_{j+1}}, \dots, r_{k-i_n}$ are taken from preceding iterations.
In this case system (3.6) should preferably be written in the form
$$A_k r_i = e_i, \quad i = 1, \dots, n. \tag{3.18}$$


If the requirements of lemma 3.1 are retained, then, repeating its
proof word for word, it can be ascertained that for matrix $A_k$ defined
by system (3.18) the condition $\|A_k - f''_k\| \to 0$ as $k \to \infty$ is also
satisfied.
Using different methods for constructing vectors $r_i$ in system (3.18)
one can obtain several well-known minimization algorithms. For
instance, if we set $r_i = v_{ki}$ (and $y_i = x_k + v_{ki}$), where $v_{ki}$ is a vector
directed along the $i$-th axis of coordinates and such that $\|v_{ki}\| \to 0$
as $k \to \infty$, then system (3.18) will take the following form:
$$A_k v_{ki} = f'(x_k + v_{ki}) - f'(x_k), \quad i = 1, \dots, n.$$
Matrix $A_k$ defined by this system is a finite-difference analogue of
the matrix of second derivatives $f''(x_k)$; thus in this case process (3.4)
transforms into a finite-difference analogue of Newton's method
with adjustment of step length. On the basis of theorem 3.1 it can
be asserted that Newton's finite-difference method with adjustment
of step length converges from any initial approximation at a
superlinear rate. Assuming that matrix $f''(x)$ satisfies Lipschitz'
condition (2.8), it can be shown, using the preceding results, that if
$\|v_{ki}\| \le \|f'_k\|$, Newton's finite-difference method converges at
a quadratic rate. This can be seen from the following argument.
In lemma 3.1 the bound on the quantity $B_k r_i = B_k v_{ki}$ takes the form
$$\|B_k v_{ki}\| \le \sup_{0 \le \tau \le 1} \|f''(x_k + \tau v_{ki}) - f''(x_k)\|\, \|v_{ki}\|.$$
Hence, using (2.8), $\|B_k v_{ki}\| \le R \|v_{ki}\|^2$. From this inequality and
the estimate $\|v_{kj}\| \le \|f'_k\|$, as in lemma 3.1, we obtain
$$\|B_k\| \le \sum_{j=1}^{n} |\delta_j| \frac{\|B_k v_{kj}\|}{\|v_{kj}\|} \le R \sum_{j=1}^{n} |\delta_j|\, \|f'_k\| = \bar{R}\, \|f'_k\|.$$
According to (2.4) the gradient $f'(x)$ satisfies Lipschitz' condition
with constant $M$. Consequently,
$$\|B_k\| \le \bar{R}\, \|f'_k\| = \bar{R}\, \|f'_k - f'_*\| \le \bar{R} M \|x_k - x_*\|.$$
The bounds on the rate of convergence obtained in theorem 3.1
can now be defined more exactly as follows:
$$\begin{aligned}
\|x_{k+1} - x_*\| &\le M_2 \|A_k - f''_{kc}\|\, \|x_k - x_*\| \\
&\le M_2 \left( \|A_k - f''_k\| + \|f''_k - f''(x_k + \theta (x_* - x_k))\| \right) \|x_k - x_*\| \\
&\le M_2 (\bar{R} M + R)\, \|x_k - x_*\|^2,
\end{aligned}$$
i.e.
$$\|x_{k+1} - x_*\| \le C \|x_k - x_*\|^2.$$
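The finite-difference analogue of Newton's method with adjustment of step length can be sketched as follows. This is a minimal illustration, not the authors' program: the strongly convex test function, the helper names `f`, `grad`, `solve` and `fd_newton`, the cap `1e-3` on the difference step, the stopping tolerance, and the form assumed for condition (2.2), namely $f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha (f'_k, p_k)$ with $\alpha$ halved from 1, are all assumptions.

```python
import math

def f(x):
    # Assumed strongly convex test function.
    return 0.5 * (x[0] ** 2 + 2.0 * x[1] ** 2) + math.exp(x[0] - x[1])

def grad(x):
    t = math.exp(x[0] - x[1])
    return [x[0] + t, 2.0 * x[1] - t]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            k = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= k * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x

def fd_newton(x, iters=20, eps=0.25, tol=1e-6):
    n = len(x)
    for _ in range(iters):
        g = grad(x)
        gnorm = math.sqrt(sum(v * v for v in g))
        if gnorm < tol:
            break
        h = min(gnorm, 1e-3)              # keeps ||v_ki|| <= ||f'_k||
        a = [[0.0] * n for _ in range(n)]
        for i in range(n):                # A v_ki = f'(x + v_ki) - f'(x)
            y = list(x)
            y[i] += h
            gy = grad(y)
            for r in range(n):
                a[r][i] = (gy[r] - g[r]) / h
        p = solve(a, [-v for v in g])     # p = -A^{-1} f'
        slope = sum(gi * pi for gi, pi in zip(g, p))
        alpha = 1.0
        for _ in range(40):               # step adjustment, assumed form of (2.2)
            trial = [xi + alpha * pi for xi, pi in zip(x, p)]
            if f(trial) - f(x) <= eps * alpha * slope:
                break
            alpha *= 0.5
        x = trial
    return x

x_min = fd_newton([1.0, 1.0])
print(x_min, grad(x_min))
```

Solving the linear system for `p` instead of inverting the matrix is a design choice of this sketch; the recursive construction described in the following subsections avoids both.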


Let us describe another method of choosing vectors $r_i$. We set
$r_i = \lambda_k f'(y_i)$, $y_1 = x_k$, $y_{i+1} = y_i + r_i$, $i = 1, 2, \dots, n$. Then
the system (3.18) takes the following form:
$$\lambda_k A_k f'(y_i) = f'(y_{i+1}) - f'(y_i), \quad i = 1, \dots, n.$$
Matrix $A_k$ defined by such a system of equations (accurate to
a numerical factor $\lambda_k$) is used to construct an iterative process (Aitken
and Steffensen's method). We shall not study the properties of this
method here. Note only that by varying the value of factor $\lambda_k$
a quadratic rate of convergence of the process can be obtained.

Determining Vector $p_k$
The amount of work required to compute vector $p_k$ determines
to a considerable extent the computational effort in process (3.4).
We shall now consider a method of constructing vector
$p_k = -A_k^{-1} f'_k$ which makes use of the specific properties of system (3.6)
that defines matrix $A_k$. By taking account of these properties one
can considerably simplify the construction of vector $p_k$; we begin
with the inversion of matrix $A_k$.
The necessary condition for the existence of matrix $A_k^{-1}$ is that
matrix $A_k$ be nonsingular, and this in turn necessitates linear
independence of the vector system $e_k, \dots, e_{k-n+1}$. With sufficiently
large $k$, matrix $A_k$ is nonsingular as shown by (3.16). However, at
some iterations of the initial stage of process (3.4) the vector system
$e_k, \dots, e_{k-n+1}$ can prove linearly dependent. In this case we can
either change one of the vectors or make a step in the direction
of the antigradient, which changes the system $e_k, \dots, e_{k-n+1}$.
We assume in what follows that with any $k \ge n-1$ the system
$e_k, \dots, e_{k-n+1}$ is linearly independent.
In this case system (3.6) can be written in the form
$$A_k^{-1} e_{k-i} = r_{k-i}, \quad i = 0, 1, \dots, n-1,$$
or in the form of a matrix equation:
$$A_k^{-1} E_k = R_k \tag{3.19}$$
where $E_k$, $R_k$ are matrices whose columns are coordinates of vectors
$e_{k-i}$ and $r_{k-i}$ respectively. From (3.19) we obtain
$$A_k^{-1} = R_k E_k^{-1}. \tag{3.20}$$
Thus in order to construct matrix $A_k^{-1}$ it is necessary first of all
to calculate matrix $E_k^{-1}$. It is known from linear algebra (see, for
instance, D. K. Faddeev and V. N. Faddeeva) that the rows of matrix
$E_k^{-1}$ are the vectors of the basis $s_k, \dots, s_{k-n+1}$ which is dual for
(or biorthogonal to) the basis $e_k, \dots, e_{k-n+1}$. Recall that linearly
independent systems of vectors $a_1, \dots, a_n$ and $b_1, \dots, b_n$ are called
dual if they satisfy the conditions
$$(a_i, b_j) = 0 \text{ with } i \ne j, \quad (a_i, b_i) = 1.$$
If $s_k, \dots, s_{k-n+1}$ is the dual of basis $e_k, \dots, e_{k-n+1}$, then
(according to the relations of duality) $S_k^* E_k = I$, where $S_k$ is a matrix
whose columns are vectors $s_{k-i}$. It follows that $S_k^* = E_k^{-1}$.
Each of the matrices $E_k$, $k = 0, 1, \dots$, differs from its left and
right neighbours in the sequence only by one column. By virtue
of this fact the process of constructing the basis $s_k, \dots, s_{k-n+1}$ can
be performed by recursive relations, and this reduces the
computational effort to a considerable extent.
Suppose we have calculated matrix $E_k^{-1}$, i.e. have constructed the
basis $s_k, \dots, s_{k-n+1}$. Let us construct the system of vectors
$\bar{s}_{k+1}, \bar{s}_k, \dots, \bar{s}_{k-n+2}$ as follows:
$$\bar{s}_{k+1} = \frac{s_{k-n+1}}{(s_{k-n+1}, e_{k+1})}, \qquad \bar{s}_{k+1-j} = s_{k+1-j} - (s_{k+1-j}, e_{k+1})\, \bar{s}_{k+1}, \quad j = 1, \dots, n-1. \tag{3.21}$$
It follows from the linear independence of vectors $e_{k+1}, e_k, \dots, e_{k-n+2}$ that
$$(s_{k-n+1}, e_{k+1}) \ne 0. \tag{3.22}$$
Indeed, by virtue of the duality of the bases $s_k, \dots, s_{k-n+1}$ and
$e_k, \dots, e_{k-n+1}$ we have $(s_{k-n+1}, e_{k-j}) = 0$ with $j = 0, 1, \dots, n-2$,
and if we had $(s_{k-n+1}, e_{k+1}) = 0$, then it would follow
that vectors $e_{k+1}, \dots, e_{k-n+2}$ are linearly dependent. Thus, in
order to check the linear independence of vectors $e_{k+1}, e_k, \dots, e_{k-n+2}$,
it is sufficient to check condition (3.22).
Let us show that the vector system (3.21) is a basis which is dual
for the basis $e_{k+1}, \dots, e_{k-n+2}$.
Indeed,
$$(\bar{s}_{k+1}, e_{k+1}) = \frac{(s_{k-n+1}, e_{k+1})}{(s_{k-n+1}, e_{k+1})} = 1,$$
$$(\bar{s}_{k+1-j}, e_{k+1}) = (s_{k+1-j}, e_{k+1}) - (s_{k+1-j}, e_{k+1}) = 0,$$
$$(\bar{s}_{k+1}, e_{k+1-j}) = \frac{(s_{k-n+1}, e_{k+1-j})}{(s_{k-n+1}, e_{k+1})} = 0$$
by virtue of the duality of the bases $e_k, \dots, e_{k-n+1}$ and $s_k, \dots, s_{k-n+1}$, and
$$(\bar{s}_{k+1-j}, e_{k+1-m}) = (s_{k+1-j}, e_{k+1-m}) - (s_{k+1-j}, e_{k+1}) (\bar{s}_{k+1}, e_{k+1-m}) = \delta_{jm},$$
where $j, m = 1, 2, \dots, n-1$ and $\delta_{jm}$ is Kronecker's delta ($\delta_{jj} = 1$,
$\delta_{jm} = 0$ with $j \ne m$). Consequently, the conditions of duality are
satisfied for vector systems $e_{k+1}, \dots, e_{k-n+2}$ and $\bar{s}_{k+1}, \dots, \bar{s}_{k-n+2}$, and
this corroborates our statement.
Thus, using recursive relations (3.21), the construction of basis
$\bar{s}_{k+1}, \dots, \bar{s}_{k-n+2}$ (i.e. of matrix $E_{k+1}^{-1}$) is performed quite easily.
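The recursion (3.21) can be sketched in a few lines. The function name `update_dual_basis`, the list ordering (newest vector first) and the absolute tolerance used for the check of condition (3.22) are assumptions made for this illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def update_dual_basis(s, e_new, tol=1e-12):
    """One step of the recursion (3.21).

    s     -- list [s_k, s_{k-1}, ..., s_{k-n+1}] dual to [e_k, ..., e_{k-n+1}]
    e_new -- the incoming vector e_{k+1}, which replaces e_{k-n+1}
    Returns the basis dual to [e_{k+1}, e_k, ..., e_{k-n+2}], newest first.
    """
    denom = dot(s[-1], e_new)        # (s_{k-n+1}, e_{k+1}), condition (3.22)
    if abs(denom) < tol:
        raise ValueError("e_new is linearly dependent on the retained vectors")
    s_new = [c / denom for c in s[-1]]
    kept = []
    for sj in s[:-1]:
        coef = dot(sj, e_new)
        kept.append([a - coef * b for a, b in zip(sj, s_new)])
    return [s_new] + kept

# Usage: start from the unit basis, which is dual to itself.
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = [row[:] for row in e]
e_new = [2.0, 1.0, 4.0]
s2 = update_dual_basis(s, e_new)
e2 = [e_new] + e[:-1]                # the oldest vector drops out
```

Each update costs on the order of $n^2$ operations, against roughly $n^3$ for inverting $E_{k+1}$ from scratch.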
We can now derive a simple formula for the determination of the
direction of motion $p_k$. For this purpose let us write equation (3.20)
in the following form:
$$A_k^{-1} = \sum_{i=0}^{n-1} r_{k-i}\, s_{k-i}^* \tag{3.23}$$
(where superscript $*$ denotes a vector-row). Using (3.23) we obtain
$$p_k = -A_k^{-1} f'_k = -\sum_{i=0}^{n-1} r_{k-i}\, s_{k-i}^* f'_k = -\sum_{i=0}^{n-1} (s_{k-i}, f'_k)\, r_{k-i}. \tag{3.24}$$
It was this formula for the determination of the direction of descent
in methods of the type (3.4) that caused such algorithms to be called
methods of dual directions. Using (3.24), formula (3.4) can be
written
$$x_{k+1} = x_k - \alpha_k \sum_{i=0}^{n-1} (s_{k-i}, f'_k)\, r_{k-i} \tag{3.25}$$
or in coordinate form
$$x_{k+1}^{\nu} = x_k^{\nu} - \alpha_k \sum_{i=0}^{n-1} \sum_{j=1}^{n} s_{k-i}^{j}\, \frac{\partial f(x_k)}{\partial x^{j}}\, r_{k-i}^{\nu}, \quad \nu = 1, \dots, n.$$
In practice, sequences of approximations should always be constructed
by this formula.
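Formula (3.24) needs only scalar products and no stored matrix. A minimal sketch, in which the function name `dual_direction` and the small diagonal example are assumptions:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_direction(r, s, g):
    # p = -sum_i (s_i, g) r_i, cf. (3.24); r and s list the vectors
    # r_{k-i} and s_{k-i} in the same order, g is the gradient f'_k.
    p = [0.0] * len(g)
    for ri, si in zip(r, s):
        w = dot(si, g)
        for j in range(len(g)):
            p[j] -= w * ri[j]
    return p

# Example with A = diag(2, 4): r_i are unit vectors, e_i = A r_i,
# so the dual basis is s_1 = (1/2, 0), s_2 = (0, 1/4).
r = [[1.0, 0.0], [0.0, 1.0]]
s = [[0.5, 0.0], [0.0, 0.25]]
g = [2.0, 8.0]
p = dual_direction(r, s, g)   # equals -A^{-1} g
```

In this example the result can be compared by hand with $-A^{-1} g = (-1, -2)$.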
Note also that, using expression (3.23), a recursive formula can be
obtained for the calculation of matrix $A_k^{-1}$. We give it here without
its derivation:
$$A_{k+1}^{-1} = A_k^{-1} + (r_{k+1} - A_k^{-1} e_{k+1})\, \bar{s}_{k+1}^*.$$
It is easy to check its validity by direct multiplication of a matrix
constructed by this formula and vectors $e_{k+1}, e_k, \dots, e_{k-n+2}$.
It can be seen then that
$$A_{k+1}^{-1} e_{k+1-i} = r_{k+1-i}, \quad i = 0, 1, \dots, n-1,$$
i.e. that matrix $A_{k+1}$ satisfies system (3.6).
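The direct multiplication mentioned here can be written out explicitly. The lines below are a worked check, assuming that $\bar{s}_{k+1}$ denotes the first vector of the updated dual basis (3.21), so that $(\bar{s}_{k+1}, e_{k+1}) = 1$ and $(\bar{s}_{k+1}, e_{k+1-i}) = 0$ for $i = 1, \dots, n-1$, and using $A_k^{-1} e_{k+1-i} = r_{k+1-i}$ for the same $i$, which holds by system (3.6) at step $k$.

```latex
\begin{align*}
A_{k+1}^{-1} e_{k+1}
  &= A_k^{-1} e_{k+1} + (r_{k+1} - A_k^{-1} e_{k+1})\,(\bar{s}_{k+1}, e_{k+1})
   = A_k^{-1} e_{k+1} + r_{k+1} - A_k^{-1} e_{k+1} = r_{k+1}, \\
A_{k+1}^{-1} e_{k+1-i}
  &= A_k^{-1} e_{k+1-i} + (r_{k+1} - A_k^{-1} e_{k+1})\,(\bar{s}_{k+1}, e_{k+1-i})
   = r_{k+1-i}, \qquad i = 1, \dots, n-1.
\end{align*}
```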


The Initial Stage of the Process
Until now we considered the iterative process (3.4) beginning
with $k = n-1$, since for the definition of matrix $A_k$, $n$ vectors $r_k$
and corresponding vectors $e_k$ are required.
The first iterations of the process ($k < n-1$) can be performed
in various ways. For instance, use can be made of the method of
steepest descent: $x_{k+1} = x_k - \alpha_k f'_k$, $\alpha_k > 0$, $k = 0, 1, \dots, n-2$.
In order to secure uniformity of the algorithm from the first
iteration on, we can proceed as follows with $0 \le k < n-1$.
Set $A_0^{-1} = I$. Present this matrix in the form $A_0^{-1} = R_0 E_0^{-1}$,
where $R_0 = I$, $E_0^{-1} = I$, or, using (3.23),
$$A_0^{-1} = \sum_{i=0}^{n-1} r_{0-i}\, s_{0-i}^*$$
where $r_0, r_{-1}, \dots, r_{-n+1}$ and $s_0, s_{-1}, \dots, s_{-n+1}$ are vectors of
a unit orthonormalized basis. Taking this into account, we have:
$$x_1 = x_0 - \alpha_0 \sum_{i=0}^{n-1} (f'_0, s_{0-i})\, r_{0-i}.$$
Further, having calculated vectors $r_1$ and $e_1$, by (3.21) we construct
the basis:
$$\bar{s}_1 = \frac{s_{-n+1}}{(s_{-n+1}, e_1)}, \qquad \bar{s}_{1-j} = s_{1-j} - (s_{1-j}, e_1)\, \bar{s}_1, \quad j = 1, \dots, n-1,$$
and the next approximation:
$$x_2 = x_1 - \alpha_1 \sum_{i=0}^{n-1} (f'_1, \bar{s}_{1-i})\, r_{1-i}.$$
The construction of the successive iterations is straightforward.

Minimization of Quadratic Form
Let us consider as an example the application of the methods of
dual directions to the finding of the minimum point of a quadratic
function. Let
$$f(x) = \tfrac{1}{2} (Ax, x) + (b, x) + c$$
where $A$ is an $n \times n$ symmetric, strictly positive definite matrix
with constant elements: $(Ax, x) > 0$ for any $x \ne 0$, $b$ is a vector,
$c$ is a scalar quantity. The gradient of this function is
$f'(x) = Ax + b$, and the vector
$$e_i = f'(x + r_i) - f'(x) = A r_i. \tag{3.26}$$


Therefore, if $r_1, \dots, r_n$ is a linearly independent system of
vectors and $s_1, \dots, s_n$ is the basis dual for the basis $e_1, \dots, e_n$, then
because of (3.20) and (3.23) we have
$$A_n^{-1} = R_n E_n^{-1} = \sum_{i=1}^{n} r_i s_i^*.$$
Now, since it follows from (3.26) that matrix $A$ is defined by the
system of equations $A r_i = e_i$, $i = 1, 2, \dots, n$, we can write
$$A^{-1} = R_n E_n^{-1} = \sum_{i=1}^{n} r_i s_i^*, \tag{3.27}$$
i.e. $A_n^{-1} = A^{-1}$. Hence
$$x_{n+1} = x_n - A_n^{-1} f'_n = x_n - A^{-1} (A x_n + b) = -A^{-1} b \tag{3.28}$$
and $f'_{n+1} = -A A^{-1} b + b = 0$, i.e. $x_{n+1} = x_*$.
Thus, in order to minimize a quadratic function by the method
of dual directions we have to calculate the gradient of the function
at $n + 1$ points and construct a basis which is dual for the basis of
vectors $e_1, \dots, e_n$. If we consider the process of successive
calculations of vectors $e_1, \dots, e_n$ as a certain iterative procedure, then
it can be said that methods of dual directions make it possible to
minimize a quadratic function after a finite number of steps.
Note also that the problem under consideration is equivalent
to solving a system of linear equations $Ax = -b$. Consequently,
methods of dual directions make it possible to solve a system of
linear equations by performing a finite number of iterations.
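The finite termination on a quadratic function can be reproduced in a short sketch. It is illustrative only: the function name `minimize_quadratic`, the choice of coordinate directions for $r_i$, the initial unit basis ordered so that the $i$-th unit vector is discarded at step $i$, and the sample data are assumptions. All gradient differences are taken from the same point, which for a quadratic function gives $e_i = A r_i$ as in (3.26).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def minimize_quadratic(a, b, x):
    """Minimize 0.5*(Ax, x) + (b, x) + c for symmetric positive definite A."""
    n = len(b)

    def grad(z):
        # f'(z) = A z + b
        return [dot(row, z) + bi for row, bi in zip(a, b)]

    g = grad(x)
    # Unit vectors ordered so that the i-th one is discarded at step i.
    r = [[float(i == j) for j in range(n)] for i in reversed(range(n))]
    s = [row[:] for row in r]
    for i in range(n):
        r_new = [float(i == j) for j in range(n)]        # assumed: coordinate directions
        y = [xi + ri for xi, ri in zip(x, r_new)]
        e_new = [gy - gx for gy, gx in zip(grad(y), g)]  # e_i = A r_i, cf. (3.26)
        denom = dot(s[-1], e_new)                        # nonzero by (3.22)
        s_new = [c / denom for c in s[-1]]
        s = [s_new] + [[p - dot(sj, e_new) * q for p, q in zip(sj, s_new)]
                       for sj in s[:-1]]                 # recursion (3.21)
        r = [r_new] + r[:-1]
    # Unit step along p = -sum_i (s_i, f') r_i, cf. (3.24) and (3.28).
    weights = [dot(si, g) for si in s]
    return [xi - sum(w * ri[j] for w, ri in zip(weights, r))
            for j, xi in enumerate(x)]

x_star = minimize_quadratic([[4.0, 1.0], [1.0, 3.0]], [-1.0, -2.0], [2.0, -1.0])
print(x_star)  # should agree with the solution of A x = -b
```

The gradient is evaluated at $n + 1$ points in total, as stated in the text; for the sample data the solution of $Ax = -b$ is $(1/11, 7/11)$.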

Discussion of Properties of the Methods
Methods of dual directions make it possible to solve the problem
of minimizing a strictly convex smooth function whatever the initial
approximation chosen, and the rate of convergence of the sequence
$\{x_k\}$ to the solution is superlinear. The method of choosing
parameter $\alpha_k$ guarantees the determination of the required value of $\alpha_k$
after a finite number of reductions. Of course, in process (3.4), as
in the methods described in the preceding sections, $\alpha_k$ can be chosen
under the condition of obtaining the minimum function value in
the direction of motion; however, such a method is more laborious.
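The choice of $\alpha_k$ by successive reductions can be sketched as follows. Condition (2.2) is not restated in this section, so the sketch assumes the form $f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha (f'_k, p_k)$ with $0 < \varepsilon < 1/2$; the function name `choose_step`, the halving factor, the cap on the number of reductions and the example data are likewise assumptions.

```python
def choose_step(f, x, p, slope, eps=0.25, alpha=1.0, max_halvings=60):
    """Halve alpha until f(x + alpha*p) - f(x) <= eps * alpha * slope.

    slope is the scalar product (f'(x), p); it must be negative,
    i.e. p must be a direction of descent.
    """
    fx = f(x)
    for _ in range(max_halvings):
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        if f(trial) - fx <= eps * alpha * slope:
            return alpha
        alpha *= 0.5
    raise RuntimeError("no acceptable step found")

# Example: f(x) = x0^2 + 10*x1^2 with a step along the antigradient.
def quad(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2

x0 = [1.0, 1.0]
g0 = [2.0, 20.0]
p0 = [-2.0, -20.0]
alpha0 = choose_step(quad, x0, p0, sum(gi * pi for gi, pi in zip(g0, p0)))
```

Starting from $\alpha = 1$ matters for the rate of convergence: by theorem 3.1 the unit step is accepted from a certain iteration on.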
T h e m e t h o d s o f t h e c l a s s u n d e r c o n s i d e r a t i o n a p p r o a c h N e w t o n ’s
m e t h o d as to the es tim ate of their rate of convergence. L e t u s c o m ­
p a re t h e l a b o u r p e r iteration in t h e m e t h o d s of d u a l directions a n d
i n N e w t o n ’s m e t h o d .
In processes of type (3.4) with matrix $A_k$ defined by system (3.6),
in order to calculate matrix $A_k^{-1}$ we have to calculate vector
$e_k$ and then, using recursive formulas (3.21), to construct a basis
dual for the basis $e_k, e_{k-1}, \ldots, e_{k-n+1}$.
For constructing vector $e_k$, as stated in the subsection on p. 74,
it is necessary to calculate the gradient of the function at one or two
points. The amount of work required to construct the dual basis by
formulas (3.21) is only $1/n$ of that required by the usual methods
(D. K. Faddeev and V. N. Faddeeva); it is reduced even more if we
check the conditions under which the denominator in the general
formulas for the construction of a dual basis is nonzero.
Thus, first, the methods of dual directions, as distinct from
Newton's method, do not require calculation of the second derivatives
of the function. Let us compare the dual direction methods to Newton's
finite differences method. We find that the amount of computation in
the former methods required for the construction of matrix $A_k^{-1}$
is about $1/n$ as much, since Newton's method necessitates the
calculation at each iteration of derivatives at $n + 1$ points and the
inversion of matrix $A_k$ without using recursive relations.
In Newton's finite differences method, in order to determine the
direction of motion, instead of inverting the matrix $A_k$ one can
solve a system of linear equations (as can be done in the usual
Newton method). In such a case the quantitative estimate of the
computational effort in methods of type (3.4) and in Newton's
method depends on the method of solving the system of equations;
however, the ratio is also approximately equal to $n$. For instance,
if for solving the system of linear equations we use the method of
dual directions (see the subsection on p. 79), we have, in practice,
to calculate matrix $A_k^{-1}$ without using recursive relations, which,
as was mentioned above, requires $n$ times as many calculations as
are required by formulas (3.21).
Computational effort is nearly the same in solving a system of
linear equations by methods of conjugate directions; we shall discuss
this method in the following section.
Thus methods of dual directions, converging at a rate close to that
of Newton's method, require at the same time far fewer calculations
per iteration. The shortcoming of these methods is that their
implementation on computers requires a larger storage capacity, since
it is necessary to store two systems of vectors
$r_k, r_{k-1}, \ldots, r_{k-n+1}$ and $s_{k-1}, \ldots, s_{k-n+1}$,
i.e. actually two $n \times n$ matrices. This is an obstacle to the
application of methods of dual directions to the solving of large
problems on computers with a limited working storage. It should be
noted, however, that this shortcoming is partially compensated if
we use vectors directed along coordinate axes as vectors $r_k$, for
in this case we have to store in the computer memory only one
$n$-dimensional vector instead of the system $r_k, \ldots, r_{k-n+1}$.


4. METHODS OF CONJUGATE DIRECTIONS.
MINIMIZATION OF QUADRATIC FUNCTIONS
Conjugate Directions and Their Properties
Let us turn again to the problem of minimizing quadratic functions
of the form

$$f(x) = \tfrac{1}{2}(Ax, x) + (b, x) + c \tag{4.1}$$

where $(Ax, x) > 0$ for any $x \ne 0$; we considered this problem
in the preceding section (the subsection on p. 79). It is easy to
ascertain that the problem of quadratic function minimization can be
reduced to the inversion of matrix $A$; if matrix $A^{-1}$ is known,
then the solution is immediately found by using formulas (3.28):

$$x_* = x_0 - A^{-1} f'_0 = -A^{-1} b \tag{4.2}$$

where $x_0$ is an arbitrary point.
If one calculates matrix $A^{-1}$ using expression (3.27), it is
necessary, having chosen an arbitrary linearly independent system of
vectors $p_0, \ldots, p_{n-1}$ (we use here the notation $p_i$ instead
of $r_i$), to calculate the corresponding vectors

$$e_i = f'(x_i + p_i) - f'(x_i) = A p_i, \quad i = 0, 1, \ldots, n-1 \tag{4.3}$$

where $x_i$ are arbitrary points, and to construct a basis
$s_0, \ldots, s_{n-1}$ dual for the basis $e_0, \ldots, e_{n-1}$, i.e.
one which satisfies the conditions

$$(s_i, e_i) = 1, \quad (s_i, e_j) = 0 \ \text{with} \ i \ne j. \tag{4.4}$$

These relations, because of (4.3), can be written in the form

$$(s_i, A p_i) = 1, \quad (s_i, A p_j) = 0, \quad i \ne j. \tag{4.5}$$

Of particular interest is the case in which vectors
$p_0, \ldots, p_{n-1}$ are $A$-orthogonal or, as they are sometimes
called, conjugate, i.e. such that

$$(p_i, A p_j) = 0, \quad i \ne j. \tag{4.6}$$

The system of (nonzero) vectors $p_0, \ldots, p_{n-1}$ which satisfies
conditions (4.6) is linearly independent (being orthogonal in a metric
defined by a nonsingular matrix) and accordingly can be used to
determine vectors $e_i$ by formulas (4.3); vectors $s_i$ which satisfy
(4.5) in this case can be calculated by very simple formulas:

$$s_i = \frac{p_i}{(A p_i, p_i)}, \quad i = 0, 1, \ldots, n-1. \tag{4.7}$$


Thus if vectors $p_0, \ldots, p_{n-1}$ are $A$-orthogonal, matrix
$A^{-1}$ is calculated by the following formula (see (3.27)):

$$A^{-1} = \sum_{i=0}^{n-1} p_i s_i^* = \sum_{i=0}^{n-1} \frac{p_i p_i^*}{(A p_i, p_i)} \tag{4.8}$$

i.e. the problem of the inversion of matrix $A$, and thus that of the
minimization of function $f(x)$, is solved quite easily.
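Formula (4.8) can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the sample matrix `A`, the helper names, and the way the conjugate vectors are obtained (Gram-Schmidt orthogonalization of the coordinate axes in the metric $(Au, v)$) are all assumptions made for illustration. Exact rational arithmetic is used so that the identity $A A^{-1} = I$ holds without rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def conjugate_basis(A):
    """A-orthogonalize the coordinate axes (an assumed construction, not the book's)."""
    n = len(A)
    ps = []
    for i in range(n):
        p = [F(int(i == j)) for j in range(n)]
        for q in ps:
            Aq = matvec(A, q)
            c = dot(p, Aq) / dot(q, Aq)          # remove the A-component along q
            p = [pi - c * qi for pi, qi in zip(p, q)]
        ps.append(p)
    return ps

def inverse_from_conjugates(A, ps):
    """Formula (4.8): A^{-1} = sum_i p_i p_i^* / (A p_i, p_i)."""
    n = len(A)
    inv = [[F(0)] * n for _ in range(n)]
    for p in ps:
        d = dot(matvec(A, p), p)
        for r in range(n):
            for c in range(n):
                inv[r][c] += p[r] * p[c] / d
    return inv

# Hypothetical symmetric positive definite sample matrix.
A = [[F(4), F(1), F(0)],
     [F(1), F(3), F(1)],
     [F(0), F(1), F(2)]]
ps = conjugate_basis(A)
A_inv = inverse_from_conjugates(A, ps)
product = [[dot(A[r], [A_inv[k][c] for k in range(3)]) for c in range(3)]
           for r in range(3)]
```

Here `product` is the matrix $A A^{-1}$ and should come out as the identity; any other set of nonzero $A$-orthogonal vectors could be passed to `inverse_from_conjugates` in place of `ps`.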
Let us now consider the problem of determining point $x_*$ with
the aid of conjugate vectors from a somewhat different viewpoint;
at the same time we shall study a number of interesting properties
of conjugate directions.
Since $p_0, \ldots, p_{n-1}$ is a basis of space $E^n$, point $x_*$ may
be presented in the following form:

$$x_* = x_0 + \sum_{i=0}^{n-1} \alpha_i p_i. \tag{4.9}$$

But by (4.2) and (4.8),

$$x_* = x_0 - \sum_{i=0}^{n-1} \frac{p_i p_i^*}{(A p_i, p_i)} f'_0. \tag{4.10}$$

It follows from (4.9) and (4.10) that

$$x_0 + \sum_i \alpha_i p_i = x_0 - \sum_i \frac{p_i p_i^*}{(A p_i, p_i)} f'_0$$

or in another form

$$x_0 + \sum_i \alpha_i p_i = x_0 - \sum_i \frac{(f'_0, p_i)}{(A p_i, p_i)} p_i. \tag{4.11}$$

Since a vector has only one resolution along the basis axes, the
last equality determines the values of coefficients $\alpha_i$ in the
expansion (4.9):

$$\alpha_i = -\frac{(f'_0, p_i)}{(A p_i, p_i)} = -\frac{(f'_0, p_i)}{(e_i, p_i)}, \quad i = 0, 1, \ldots, n-1. \tag{4.12}$$

Thus if a certain system of conjugate vectors is known, then the
minimum point of quadratic function (4.1) is easily found by
formulas (4.9), (4.12).
The procedure of determining point $x_*$ by formula (4.9) can be
considered as a process of construction of successive points:

$$x_{i+1} = x_i + \alpha_i p_i, \quad i = 0, 1, \ldots, n-1 \tag{4.13}$$


where parameters $\alpha_i$ are determined by formulas (4.12). It follows
that using the method of conjugate directions one can solve the problem
of quadratic function minimization after performing a finite number
of steps not exceeding $n$ (the number of points in the iterative
process (4.13) can prove less than $n$ if some of the coefficients
$\alpha_i$ in expansion (4.9) prove equal to zero, i.e. if for some $i$
we have $(f'_0, p_i) = 0$). The above property of the method of
conjugate directions is the most important one. It shows how effective
the application of conjugate vectors to quadratic function minimization
is; this is the reason for the wide application of methods of conjugate
directions.
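The finite termination property can be illustrated with a small sketch of process (4.13) with coefficients (4.12). The sample data (`A`, `b`, the starting point and the two conjugate vectors) and all function names are hypothetical choices made for this illustration; exact fractions are used so the result can be compared without rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def grad(A, b, x):
    """Gradient of f(x) = 1/2 (Ax, x) + (b, x) + c, i.e. f'(x) = Ax + b."""
    return [g + bi for g, bi in zip(matvec(A, x), b)]

def conjugate_directions(A, b, x0, ps):
    """Process (4.13) with alpha_i = -(f'_0, p_i) / (A p_i, p_i) from (4.12)."""
    g0 = grad(A, b, x0)
    points = [x0]
    x = x0
    for p in ps:
        alpha = -dot(g0, p) / dot(matvec(A, p), p)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        points.append(x)
    return points

A = [[F(2), F(1)], [F(1), F(2)]]
b = [F(-3), F(-3)]
# p0 and p1 are conjugate: A p1 = (0, -3), so (p0, A p1) = 0.
ps = [[F(1), F(0)], [F(1), F(-2)]]
points = conjugate_directions(A, b, [F(0), F(0)], ps)
```

With these data the process visits $x_1 = (3/2, 0)$ and stops at $x_2 = (1, 1)$, where $Ax + b = 0$, after $n = 2$ steps.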
As an interesting corollary to the result obtained, it can be shown
that point $x_i$ constructed by formulas (4.13), (4.12) is the minimum
point of function (4.1) on the subspace formed by vectors
$p_0, \ldots, p_{i-1}$ and passing through point $x_0$. Let

$$\tilde{x}_i = x_0 + \sum_{k=0}^{i-1} \tilde{\alpha}_k p_k$$

where $\tilde{\alpha}_k$ are arbitrary coefficients. For point
$\tilde{x}_i$ to be the minimum of a strictly convex differentiable
function in the subspace formed by vectors $p_0, \ldots, p_{i-1}$, it is
necessary and sufficient (corollary 4.4 of Chap. I) that the following
conditions be satisfied:

$$(f'(\tilde{x}_i), p_j) = 0, \quad j = 0, 1, \ldots, i-1. \tag{4.14}$$
Now for any $0 \le j \le i-1$, we have

$$(f'(\tilde{x}_i), p_j) = (A\tilde{x}_i + b, p_j) = \Big(A\Big(x_0 + \sum_{k=0}^{i-1} \tilde{\alpha}_k p_k\Big) + b, \ p_j\Big)$$
$$= (A x_0 + b, p_j) + \sum_{k=0}^{i-1} \tilde{\alpha}_k (A p_k, p_j) = (f'_0, p_j) + \tilde{\alpha}_j (A p_j, p_j).$$

Hence, taking into account (4.14), we have that point $\tilde{x}_i$
provides the minimum of the function in the subspace formed by vectors
$p_0, \ldots, p_{i-1}$ and passing through point $x_0$ if and only if
$(f'_0, p_j) + \tilde{\alpha}_j (A p_j, p_j) = 0$, i.e.

$$\tilde{\alpha}_j = -\frac{(f'_0, p_j)}{(A p_j, p_j)}.$$

Now these coefficients coincide with the coefficients $\alpha_j$
calculated by formula (4.12), i.e. point $\tilde{x}_i$ which provides
the required minimum coincides with point $x_i$ (4.13). Consequently

$$(f'(x_i), p_j) = 0, \quad j = 0, 1, \ldots, i-1. \tag{4.15}$$
It is now clear that minimization of a quadratic function in space
$E^n$ by formulas (4.13), (4.12) can be interpreted as a process of
successive minimization of the function in subspaces of $i + 1$
dimensions, $i = 0, 1, \ldots, n-1$, it being necessary to calculate
only one coefficient $\alpha_i$ in order to find every time the next
minimum point.
Note that in finding $\alpha_i$ by formulas (4.12) we need not
actually calculate the matrix of second derivatives $A$; it is
necessary to calculate only vectors $e_i = f'(x_i + p_i) - f'(x_i)$
(see (4.3)), i.e. only the first derivatives of the function.
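The remark above can be sketched in code: the routine below never sees matrix $A$ and works only with a gradient callable. The sample function, the conjugate vectors passed in, and the names `gradient` and `minimize` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gradient(x):
    # Sample quadratic (an assumption): f(x) = x1^2 + x1*x2 + x2^2 - 3*x1 - 3*x2
    return [2 * x[0] + x[1] - 3, x[0] + 2 * x[1] - 3]

def minimize(gradient, x0, directions):
    """Process (4.13); alpha_i from (4.12) with e_i obtained as a gradient difference."""
    g0 = gradient(x0)
    x = x0
    for p in directions:
        shifted = gradient([xi + pi for xi, pi in zip(x, p)])
        e = [s - g for s, g in zip(shifted, gradient(x))]   # e_i = A p_i by (4.3)
        alpha = -dot(g0, p) / dot(e, p)                     # formula (4.12)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x

# The two directions are conjugate with respect to the Hessian [[2, 1], [1, 2]].
x_min = minimize(gradient, [F(0), F(0)], [[F(1), F(0)], [F(1), F(-2)]])
```

Each step costs two gradient evaluations here; for a quadratic function the difference of gradients reproduces $A p_i$ exactly, so the result is the same as with the matrix known explicitly.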
It is easy to ascertain that formulas (4.12) can be transformed as
follows. If $x_i$ is determined by formula (4.13), then

$$(f'_0, p_i) = (f'_0 - f'_i + f'_i, \ p_i) = (-\alpha_0 A p_0 - \alpha_1 A p_1 - \ldots - \alpha_{i-1} A p_{i-1} + f'_i, \ p_i)$$

and according to the $A$-orthogonality of vectors $p_0, \ldots, p_i$ we
have $(f'_0, p_i) = (f'_i, p_i)$.
Consequently

$$\alpha_i = -\frac{(f'_i, p_i)}{(A p_i, p_i)} = -\frac{(f'_i, p_i)}{(e_i, p_i)}, \quad i = 0, 1, \ldots, n-1. \tag{4.16}$$

It follows that if for a certain $i$, $0 \le i \le n-1$, in formula
(4.13) $\alpha_i = 0$ (i.e. $x_{i+1} = x_i$), then this means that
$(f'_i, p_i) = 0$. Combining this equality with (4.15) we obtain

$$(f'_{i+1}, p_j) = (f'_i, p_j) = 0, \quad j = 0, 1, \ldots, i.$$

Thus the fact that the coefficient $\alpha_i$ becomes zero means that the
corresponding point $x_i$ provides the minimum of the quadratic function
in the subspace formed by vectors $p_0, \ldots, p_i$ and passing through
point $x_0$.
Finally, note that by (4.15) $(f'_i, p_{i-1}) = 0$. This means that
the choice of coefficients $\alpha_i$ by formulas (4.12) or (4.16)
corresponds to choosing $\alpha_i$ under the condition

$$f(x_i + \alpha_i p_i) = \min_{\alpha} f(x_i + \alpha p_i).$$
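The equivalence with one-dimensional minimization can also be seen directly. The short derivation below is a worked check added for illustration; the auxiliary function $\varphi$ is notation introduced here and does not appear in the original. It uses only the quadratic form (4.1) with a symmetric matrix $A$.

```latex
\begin{aligned}
\varphi(\alpha) &= f(x_i + \alpha p_i)
   = f(x_i) + \alpha\,(f'_i, p_i) + \tfrac{1}{2}\,\alpha^2 (A p_i, p_i),\\
\varphi'(\alpha) &= (f'_i, p_i) + \alpha\,(A p_i, p_i) = 0
   \;\Longrightarrow\; \alpha = -\frac{(f'_i, p_i)}{(A p_i, p_i)},\\
\varphi''(\alpha) &= (A p_i, p_i) > 0 \quad \text{for } p_i \ne 0 .
\end{aligned}
```

The stationary point coincides with the value given by (4.16), and the positive second derivative confirms that it is a minimum along the direction $p_i$.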

Construction of the Methods
In considering in the preceding subsection the effectiveness of
methods of conjugate directions for the minimization of a quadratic
function, we did not even mention the methods of constructing
such vectors and the work involved in this procedure.
We turn to the study of methods of constructing $A$-orthogonal
vectors. Each of these methods determines one or another method of
conjugate directions, which consists in the construction of successive
approximations to the solution of the problem of minimization of
function (4.1) making use of formulas (4.13), (4.12) (or (4.16)).


The effectiveness of the methods of conjugate directions depends
directly on the amount of calculations to be performed in order to
construct the system of conjugate vectors. If the method chosen for
constructing the conjugate vectors proves too laborious, then the
effectiveness of the corresponding method of conjugate directions
may prove to be low (as compared to algorithms of other classes).
Therefore, it is worthwhile to specify the general requirements
which must be satisfied by any method of constructing conjugate
vectors in order that the corresponding method of conjugate
directions be effective.
First, the process of constructing conjugate vectors should use
only calculations of the function and its gradient and should not
require the calculation of second derivatives of the function. If this
requirement is not satisfied, then minimization of a quadratic
function by method (4.13) can involve the need of calculating the
matrix of second derivatives, and moreover require the calculation
of gradients at several points. Therefore, in general, a method of
conjugate directions which requires the calculation of the matrix
of second derivatives proves less effective than Newton's method
(with the possible exception only of problems in which the inversion
of matrix $A$ is far more laborious than its calculation).
Secondly, the information about the function should be used only
at points of the sequence (4.13). In other words, the process of
constructing conjugate vectors should be such that in determining
vector $p_i$, $0 \le i \le n-1$, the function and its gradient be
evaluated only at points $x_0, \ldots, x_i$.
It follows from this requirement that one should consider only
such methods of constructing conjugate vectors that

$$(f'_i, p_i) = 0, \quad 0 \le i \le n-1 \tag{4.17}$$

is satisfied if and only if $f'_i = 0$. Indeed, if condition (4.17) is
satisfied, then by (4.16) $\alpha_i = 0$ and therefore in sequence (4.13)
$x_{i+1} = x_i$. This means that at the $(i+1)$-th iteration of the
process we shall not receive any additional information about the
function and therefore shall not be able to construct vector
$p_{i+1} \ne p_i$. The process will accordingly degenerate (stop)
without reaching the solution if $f'_i \ne 0$.
Thus for any of the methods of constructing conjugate vectors
(and for the corresponding method of conjugate directions), the
condition

$$(f'_i, p_i) \ne 0 \quad \text{if} \quad f'_i \ne 0 \tag{4.18}$$

must be satisfied. This condition guarantees that at any of the
iterations of the process we shall have $\alpha_i \ne 0$.


In working out algorithms for the construction of conjugate vectors
we should assume that condition (4.18) is satisfied. Then, the
algorithms having been constructed, it is necessary to check whether
this condition is actually satisfied and, if necessary, to impose
additional constraints on the algorithm in order to satisfy the
condition.
Taking into account the above remarks, let us turn to the actual
working out of the relations for the construction of $A$-orthogonal
vectors.
In what follows we use the notations

$$r_i = x_{i+1} - x_i = \alpha_i p_i, \quad e_i = f'_{i+1} - f'_i = \alpha_i A p_i. \tag{4.19}$$

An arbitrary direction of descent of function (4.1) may be chosen
to be vector $p_0 = -H_0^* f'_0$, where $H_0$ is a symmetric, strictly
positive definite matrix.
Let us establish the requirements which vector $p_k$,
$1 \le k \le n-1$, must satisfy in order to fulfill the conditions of
$A$-orthogonality:

$$(p_k, A p_j) = 0, \quad 0 \le j \le k-1. \tag{4.20}$$

To this end, we make use of the fact that, according to the
properties of conjugate directions (see (4.15)), in choosing $\alpha_i$
in process (4.13) by formula (4.16), conditions (4.20) and at the same
time also the equality

$$(f'_k, p_j) = 0, \quad 0 \le j \le k-1 \tag{4.21}$$

must be satisfied. If we set

$$p_k = -H_k^* f'_k \tag{4.22}$$

where $H_k$ is an $n \times n$ square matrix, then conditions (4.20) can
be written in the following form:

$$(f'_k, H_k A p_j) = 0, \quad 0 \le j \le k-1.$$

Comparison of the equalities obtained with (4.21) shows that if
(4.21) is satisfied, then (4.20) will also be satisfied, provided matrix
$H_k$ satisfies the relations

$$H_k A p_j = a p_j, \quad 0 \le j \le k-1$$

where $a$ is an arbitrary constant.
Since according to condition (4.18) and the strict convexity of
function (4.1) we have $0 < |\alpha_i| < \infty$ for any
$0 \le i \le n-1$, equalities (4.20) and (4.21) can be written in the
following form:

$$(r_k, e_j) = 0, \quad 0 \le j \le k-1, \tag{4.23}$$

$$(f'_k, r_j) = 0, \quad 0 \le j \le k-1, \tag{4.24}$$

and the conditions for determining matrix $H_k$ can be written as
follows:

$$H_k e_j = a r_j, \quad 0 \le j \le k-1. \tag{4.25}$$

Thus the conditions of $A$-orthogonality (4.20) will be satisfied if
matrix $H_k$ which determines vector $p_k$ by formula (4.22) satisfies
equations (4.25).
With $k \le n-1$, the number of vector equations (4.25) will be
less than $n$; it follows that matrix $H_k$ is not uniquely defined.
Besides, with different values of constant $a$ the systems of equations
for defining matrix $H_k$ will also be different. All this suggests the
diversity of algorithms which can be used to construct conjugate
directions, as we may use various methods of constructing different
matrices $H_k$.
Since equations (4.25) must be satisfied for any
$k = 1, 2, \ldots, n-1$, it is natural to try to construct matrix $H_k$
by recursive relations.
Let us write (4.25) in the following form:

$$(H_{k-1} + \Delta H_{k-1})\, e_j = a r_j, \quad 0 \le j \le k-1. \tag{4.26}$$

Since matrix $H_{k-1}$ must satisfy the equations

$$H_{k-1} e_j = a r_j, \quad 0 \le j \le k-2,$$

it follows from (4.26) that matrix $\Delta H_{k-1}$ is defined by the
following conditions:

$$\Delta H_{k-1} e_j = 0, \quad 0 \le j \le k-2,$$
$$\Delta H_{k-1} e_{k-1} = a r_{k-1} - H_{k-1} e_{k-1}. \tag{4.27}$$
The latter equality will evidently be satisfied if we assume

$$\Delta H_{k-1} = a\,\frac{r_{k-1} u_{k-1}^*}{(u_{k-1}, e_{k-1})} - \frac{H_{k-1} e_{k-1} v_{k-1}^*}{(v_{k-1}, e_{k-1})} \tag{4.28}$$

where $u_{k-1}$, $v_{k-1}$ are unknown vectors. It is necessary that the
vectors be such that the first of the conditions (4.27) is satisfied,
i.e.

$$(u_{k-1}, e_j) = 0, \quad (v_{k-1}, e_j) = 0, \quad 0 \le j \le k-2. \tag{4.29}$$

Clearly, vectors $u_{k-1}$, $v_{k-1}$ must also satisfy the conditions

$$(u_{k-1}, e_{k-1}) \ne 0, \quad (v_{k-1}, e_{k-1}) \ne 0. \tag{4.30}$$
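That a matrix of the form $a\,r u^*/(u, e) - H e\, v^*/(v, e)$ meets both requirements in (4.27) can be verified by applying it to the vectors $e_{k-1}$ and $e_j$. The following worked check is added for illustration and uses only the orthogonality conditions (4.29) and the nonvanishing denominators (4.30).

```latex
\begin{aligned}
\Delta H_{k-1} e_{k-1}
  &= a\, r_{k-1} \frac{(u_{k-1}, e_{k-1})}{(u_{k-1}, e_{k-1})}
   - H_{k-1} e_{k-1} \frac{(v_{k-1}, e_{k-1})}{(v_{k-1}, e_{k-1})}
   = a\, r_{k-1} - H_{k-1} e_{k-1},\\
\Delta H_{k-1} e_{j}
  &= a\, r_{k-1} \frac{(u_{k-1}, e_{j})}{(u_{k-1}, e_{k-1})}
   - H_{k-1} e_{k-1} \frac{(v_{k-1}, e_{j})}{(v_{k-1}, e_{k-1})}
   = 0, \qquad 0 \le j \le k-2 .
\end{aligned}
```

The first line needs only that the denominators be nonzero; the second line is where the orthogonality of $u_{k-1}$ and $v_{k-1}$ to the earlier vectors $e_j$ is used.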

Taking into account (4.23) it is clear that conditions (4.29) will be
satisfied if we choose $u_{k-1} = v_{k-1} = r_{k-1}$. Conditions (4.30)
will also be satisfied since

$$(r_{k-1}, e_{k-1}) = (r_{k-1}, A r_{k-1}) > 0 \tag{4.31}$$

according to the properties of matrix $A$.


Vectors $u_{k-1}$, $v_{k-1}$ can also be chosen by using the following
considerations. If condition (4.20) is satisfied, then we have

$$(A p_{k-1}, p_j) = \frac{1}{\alpha_{k-1} \alpha_j}\,(e_{k-1}, r_j) = 0, \quad 0 \le j \le k-2.$$

Making use of (4.25) we have then

$$(e_{k-1}, r_j) = \Big(e_{k-1}, \tfrac{1}{a} H_{k-1} e_j\Big) = \tfrac{1}{a}\,(H_{k-1}^* e_{k-1}, e_j) = 0, \quad 0 \le j \le k-2.$$

It follows that in order to satisfy (4.29) we can assume

$$u_{k-1} = v_{k-1} = H_{k-1}^* e_{k-1}.$$

In general, if we choose vectors $u_{k-1}$ and $v_{k-1}$ in the form

$$u_{k-1} = t_{1,k}\, r_{k-1} + t_{2,k}\, H_{k-1}^* e_{k-1},$$
$$v_{k-1} = t_{3,k}\, r_{k-1} + t_{4,k}\, H_{k-1}^* e_{k-1} \tag{4.32}$$

where $t_{1,k}$, $t_{2,k}$, $t_{3,k}$, $t_{4,k}$ are arbitrary numbers
(which in principle can change with changing $k$), then conditions
(4.29) will obviously be satisfied. In order to satisfy conditions
(4.30) the quantities $t_{i,k}$, $i = 1, \ldots, 4$, should be adjusted
if necessary (in particular, as has been noted, conditions (4.30) will
be satisfied with $t_{1,k} = t_{3,k} = 1$, $t_{2,k} = t_{4,k} = 0$;
see (4.31)).
Thus, choosing vectors $u_{k-1}$, $v_{k-1}$ in the form (4.32) we are
able to construct matrix $\Delta H_{k-1}$ by formula (4.28) and in this
way establish the recursive relations for constructing such a matrix
$H_k$ that the vector $p_k$ which it determines will satisfy the
conditions of $A$-orthogonality (4.20). To each pair of vectors
$u_{k-1}$, $v_{k-1}$ and constant $a$ chosen there will correspond its
particular matrix $\Delta H_{k-1}$ and, consequently, matrix $H_k$. In
other words, with different vectors $u_{k-1}$, $v_{k-1}$ and constant
$a$ we shall obtain different algorithms for constructing conjugate
vectors, i.e. shall construct different methods of conjugate directions.
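The construction just described can be tried out on a small example. The code below is a minimal sketch under stated assumptions, not the authors' program: $H_0$ is taken to be the identity matrix, $a = 1$, the sample data `A`, `b` are hypothetical, and the function names are invented for this illustration. The tuple `t` holds the four numbers $t_{1,k}, \ldots, t_{4,k}$ of (4.32), kept constant over $k$.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def update(H, r, e, t, a=F(1)):
    """Return H + Delta H, with Delta H as in (4.28) and u, v taken from (4.32)."""
    t1, t2, t3, t4 = t
    Hte = matvec(transpose(H), e)                       # H* e
    u = [t1 * ri + t2 * hi for ri, hi in zip(r, Hte)]
    v = [t3 * ri + t4 * hi for ri, hi in zip(r, Hte)]
    He = matvec(H, e)
    ue, ve = dot(u, e), dot(v, e)                       # must be nonzero, see (4.30)
    n = len(H)
    return [[H[i][j] + a * r[i] * u[j] / ue - He[i] * v[j] / ve
             for j in range(n)] for i in range(n)]

def conjugate_directions(A, b, x0, t, a=F(1)):
    n = len(A)
    H = [[F(int(i == j)) for j in range(n)] for i in range(n)]   # H_0 = I (assumed)
    x = x0
    g = [gi + bi for gi, bi in zip(matvec(A, x), b)]             # f'(x) = Ax + b
    dirs, rs, es = [], [], []
    for _ in range(n):
        if not any(g):
            break
        p = [-c for c in matvec(transpose(H), g)]                # formula (4.22)
        alpha = -dot(g, p) / dot(matvec(A, p), p)                # formula (4.16)
        r = [alpha * pi for pi in p]
        x = [xi + ri for xi, ri in zip(x, r)]
        g_new = [gi + bi for gi, bi in zip(matvec(A, x), b)]
        e = [gn - go for gn, go in zip(g_new, g)]
        H = update(H, r, e, t, a)
        g = g_new
        dirs.append(p); rs.append(r); es.append(e)
    return x, H, dirs, rs, es

A = [[F(2), F(1)], [F(1), F(2)]]
b = [F(-1), F(0)]
x, H, dirs, rs, es = conjugate_directions(A, b, [F(0), F(0)], t=(1, 0, 1, 0))
```

With `t=(1, 0, 1, 0)`, i.e. $u_{k-1} = v_{k-1} = r_{k-1}$, the two directions come out $A$-orthogonal, the final `H` satisfies (4.25) for both stored pairs, and `x` is the minimum point $(2/3, -1/3)$. In this run the final `H` also coincides with $A^{-1}$, which is what (4.25) with $a = 1$ and $e_j = A r_j$ implies once $n$ independent steps have been made.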

General Properties of the Methods
Let us try to establish the general properties of methods of
conjugate directions which can be constructed in the manner described
above.
First of all, it is necessary to ascertain whether condition (4.18)
is satisfied by the methods under consideration, since in working out
the methods of constructing the algorithms it was assumed that this
condition is fulfilled.
Another interesting question is whether the directions $p_j$,
$j = 0, 1, \ldots, n-1$, which are determined by different matrices
$H_j$ differ from one another, i.e. whether points
$x_1, \ldots, x_{n-1}$ are different for different algorithms (on
condition that point $x_0$ remains the same) or they coincide.
In order to answer these questions let us present vector
$-p_j = H_j^* f'_j$, using the recursive formula for matrix $H_j$ and
expressions (4.28), (4.32), in the following form:

$$H_j^* f'_j = (H_{j-1} + \Delta H_{j-1})^* f'_j.$$

Making use of (4.24) we can write

$$\Delta H_{j-1}^* f'_j = -v_{j-1}\,\frac{(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})} = -\big(t_{3,j}\, r_{j-1} + t_{4,j}\, H_{j-1}^* e_{j-1}\big)\,\frac{(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})}.$$

If we also take into account that

$$H_{j-1}^* f'_{j-1} = -p_{j-1} = -\frac{r_{j-1}}{\alpha_{j-1}},$$

then vector $-p_j$ can be written in the following form:

$$H_j^* f'_j = H_{j-1}^* f'_j \left[1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})}\right] - r_{j-1}\,\frac{(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})}\left(t_{3,j} + \frac{t_{4,j}}{\alpha_{j-1}}\right).$$

Further,

$$(v_{j-1}, e_{j-1}) = (t_{3,j}\, r_{j-1} + t_{4,j}\, H_{j-1}^* e_{j-1}, \ e_{j-1})$$
$$= t_{3,j}\,(r_{j-1}, e_{j-1}) + t_{4,j}\,(e_{j-1}, H_{j-1}^* f'_j) + t_{4,j}\,(e_{j-1}, p_{j-1}), \tag{4.33}$$

hence

$$\frac{(r_{j-1}, e_{j-1})}{(v_{j-1}, e_{j-1})}\left(t_{3,j} + \frac{t_{4,j}}{\alpha_{j-1}}\right) = 1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})}.$$

Using this expression to transform the formula for $H_j^* f'_j$, we
obtain

$$H_j^* f'_j = \gamma_j \left(H_{j-1}^* f'_j - r_{j-1}\,\frac{(e_{j-1}, H_{j-1}^* f'_j)}{(r_{j-1}, e_{j-1})}\right) \tag{4.34}$$

where

$$\gamma_j = 1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^* f'_j)}{(v_{j-1}, e_{j-1})}.$$

If vector $v_{j-1}$ satisfies condition (4.30) and
$t_{3,j} \ne -t_{4,j}/\alpha_{j-1}$ for any $j = 1, 2, \ldots$, then
factor $\gamma_j \ne 0$, since

$$t_{4,j}\,(e_{j-1}, H_{j-1}^* f'_j) \ne (v_{j-1}, e_{j-1}).$$

This inequality is easily checked by comparing the numerator of the
ratio with expression (4.33). Further, suppose that factors $t_{3,j}$
and $t_{4,j}$ are such that for $j \ge 1$ the conditions
$(v_{j-1}, e_{j-1}) \ne 0$ and $\gamma_j \ne 0$ are satisfied.
Let us perform the scalar multiplication of the two sides of
equality (4.34) by $f'_k$:

$$(f'_k, H_j^* f'_j) = \gamma_j \left[(f'_k, H_{j-1}^* f'_j) - (f'_k, r_{j-1})\,\frac{(e_{j-1}, H_{j-1}^* f'_j)}{(r_{j-1}, e_{j-1})}\right]. \tag{4.35}$$

Since $\gamma_j \ne 0$ and $(f'_k, H_j^* f'_j) = 0$,
$(f'_k, r_{j-1}) = 0$ with $j \le k-1$ (by (4.21) and (4.24)), it
follows from (4.35) that

$$(f'_k, H_{j-1}^* f'_j) = 0, \quad 1 \le j \le k-1. \tag{4.36}$$

Subtracting the equalities $(f'_k, H_{j-1}^* f'_{j-1}) = 0$,
$1 \le j \le k-1$, from (4.36) we obtain:

$$(f'_k, H_i^* e_i) = 0, \quad 0 \le i \le k-2. \tag{4.37}$$

Let us now prove, using the relations obtained, that with
$0 \le i \le j-2$ the equalities $H_{i+1}^* f'_j = H_i^* f'_j$ hold, i.e.

$$H_{j-1}^* f'_j = H_{j-2}^* f'_j = \ldots = H_0^* f'_j. \tag{4.38}$$

Using the recursive formula for matrix $H_i$ and taking into account
conditions (4.24), we can write vector $H_{i+1}^* f'_j$ in the following
form:

$$H_{i+1}^* f'_j = H_i^* f'_j - v_i\,\frac{(f'_j, H_i e_i)}{(v_i, e_i)}. \tag{4.39}$$

In order to prove equalities (4.38) it is necessary to show that

$$(f'_j, H_i e_i) = 0, \quad 0 \le i \le j-2. \tag{4.40}$$

Using again the recursive formula for $H_i$ we obtain:

$$(f'_j, H_{i+1} e_{i+1}) = (f'_j, H_0 e_{i+1}) - \sum_{s=0}^{i} \frac{(f'_j, H_s e_s)\,(v_s, e_{i+1})}{(v_s, e_s)}, \quad 0 \le i \le j-2, \tag{4.41}$$

$$(f'_j, H_{i+1}^* e_{i+1}) = (f'_j, H_0^* e_{i+1}) - \sum_{s=0}^{i} \frac{(f'_j, v_s)\,(H_s e_s, e_{i+1})}{(v_s, e_s)}, \quad 0 \le i \le j-2. \tag{4.42}$$

Because of conditions (4.24) and (4.37), we have

$$(v_i, f'_j) = t_{3,i+1}\,(r_i, f'_j) + t_{4,i+1}\,(H_i^* e_i, f'_j) = 0, \quad 0 \le i \le j-2. \tag{4.43}$$


Taking into account equalities (4.43) and conditions (4.37), it
follows from (4.42) that

$$(f'_j, H_0^* e_{i+1}) = 0, \quad 0 \le i \le j-3. \tag{4.44}$$

Let us now consider the relations (4.41). With $i = 0$ we have by
(4.37) $(H_0 e_0, f'_j) = (H_0^* e_0, f'_j) = 0$ and by (4.44), matrix
$H_0$ being symmetric, $(f'_j, H_0 e_1) = 0$. Consequently,
$(H_1 e_1, f'_j) = 0$. Proceeding similarly for the subsequent values
of $i$, we establish (4.40) and hence that equalities (4.38) hold.
Taking into account these equalities, we can write expression
(4.34) in the following form:

$$H_j^* f'_j = \gamma_j \left(H_0^* f'_j - r_{j-1}\,\frac{(e_{j-1}, H_0^* f'_j)}{(r_{j-1}, e_{j-1})}\right). \tag{4.45}$$

This is the formula that enables us to answer the questions
formulated at the beginning of this subsection.
By scalar multiplication of the two sides of (4.45) by $f'_j$ and
using condition (4.24), we obtain

$$-(f'_j, p_j) = \gamma_j\,(f'_j, H_0 f'_j), \quad j > 0. \tag{4.46}$$

If $H_0$ is a strictly positive definite matrix, then
$(f'_j, H_0 f'_j) > 0$ for $f'_j \ne 0$. Consequently, if
$\gamma_j \ne 0$, then it follows from (4.46) that
$(f'_j, p_j) \ne 0$. Thus, it follows from (4.46) that the assumption
that condition (4.18) can be satisfied, used in working out the methods
of constructing conjugate vectors, proves to hold if $H_0$ is a
symmetric, strictly positive definite matrix.
In order to ascertain whether vectors $p_i$ and points $x_{i+1}$,
$i = 0, 1, \ldots, n-1$, are different in different algorithms we turn
again to formula (4.45).
The first step in any method of conjugate directions is the same
(given the same matrix $H_0$), since $x_1 = x_0 - \alpha_0 H_0^* f'_0$
and $\alpha_0$ is chosen under the condition
$\min_{\alpha} f(x_0 - \alpha H_0^* f'_0)$. Consequently, point $x_1$
and therefore vectors $r_0$, $e_0$, $f'_1$ will also be the same in any
algorithm which can be constructed by the method described. But then,
as follows from (4.45), the direction

$$p_1 = -\gamma_1 \left(H_0^* f'_1 - r_0\,\frac{(e_0, H_0^* f'_1)}{(r_0, e_0)}\right)$$

also will not depend on the choice of vectors $u_0$, $v_0$ (which
satisfy the requirements formulated), i.e. will not depend on the
method of constructing matrix $H_1$. To be more precise, vectors $p_1$
obtained with different methods of constructing matrix $H_1$ will
differ only by the scalar factor $\gamma_1$. However, since the
quantity $\alpha_1$ is chosen under the condition
$\min_{\alpha} f(x_1 + \alpha p_1)$, point $x_2$ which provides the
minimum will be the same whatever the method of constructing matrix
$H_1$ (by virtue of the strict convexity of $f(x)$). Consequently, the
quantities $r_1$, $e_1$, $f'_2$ will be the same for different methods
of conjugate directions. Continuing this argument, based on expressing
vector $p_k$ by formula (4.45), we conclude that points
$x_0, x_1, \ldots, x_n$ are independent of the choice of vectors
$u_k$, $v_k$, i.e. of the method of constructing matrix $H_k$. Thus the
successive approximations to the solution of the problem of
minimization of a quadratic function are the same for different methods
of conjugate directions.
One more remark.

It could be noted above that the first of the two matrices that form matrix $\Delta H_j$ (4.28), $j = 0, 1, \ldots, k - 1$, takes no part in constructing vector $p_k$. Indeed, in determining vector $\Delta H_j^* f'_k$ we find according to conditions (4.24) that

$$a \frac{u_j r_j^*}{(u_j, e_j)} f'_k = 0,$$

i.e. matrices

$$a \frac{r_j u_j^*}{(u_j, e_j)}$$

take no part in constructing vector

$$-p_k = H_k^* f'_k = \Big( H_0 + \sum_{j=0}^{k-1} \Delta H_j \Big)^* f'_k.$$

However, they affect considerably the properties of matrix $H_k$, in particular the properties of matrix $H_n$. We shall take notice of this fact in studying concrete algorithms in the following subsection. Here we note only that the difference in the properties of matrix $H_n$ affects the properties of the methods of conjugate directions in the minimization of nonquadratic functions.

Concrete Algorithms

Let us now consider several formulas which can be used in constructing conjugate directions. Let us repeat that each such formula determines a method of conjugate directions consisting in constructing successive approximations to the solution by the formulas

$$x_{k+1} = x_k + \alpha_k p_k, \quad p_k = -H_k^* f'_k, \quad k = 0, 1, \ldots, n - 1, \qquad (4.47)$$

where $\alpha_k$ is chosen under the condition $\min_\alpha f(x_k + \alpha p_k)$, i.e. is determined by expressions (4.12) or (4.16).


(1) We set in (4.28) $a = 1$, $u_{k-1} = r_{k-1}$, $v_{k-1} = H_{k-1}^* e_{k-1}$ (i.e. in formulas (4.32) $t_{1,k} = t_1 = 1$, $t_{2,k} = t_2 = 0$, $t_{3,k} = t_3 = 0$, $t_{4,k} = t_4 = 1$). Then

$$H_k = H_{k-1} + \frac{r_{k-1} r_{k-1}^*}{(r_{k-1}, e_{k-1})} - \frac{H_{k-1} e_{k-1} e_{k-1}^* H_{k-1}}{(H_{k-1} e_{k-1}, e_{k-1})}. \qquad (4.48)$$

Let us study some properties of matrix $H_k$ obtained by this method.

Matrix $H_k$ is symmetric. This fact is easily established by induction. Matrix $H_0$ is symmetric. The two matrices which form $\Delta H_0$ are symmetric too (the second one by virtue of the symmetry of $H_0$). Therefore, $H_1$ is a symmetric matrix. Similar arguments hold for any $k = 2, \ldots, n$.
Matrix $H_k$ is strictly positive definite. We give a proof by induction. Matrix $H_0$ is strictly positive definite. Let $H_k$ be a strictly positive definite matrix. Then for any $x \in E^n$

$$(H_{k+1} x, x) = \frac{(H_k x, x)(H_k e_k, e_k) - (H_k e_k, x)^2}{(H_k e_k, e_k)} + \frac{(r_k, x)^2}{(r_k, e_k)}.$$

By hypothesis, there is a square root $H_k^{1/2}$ (D. K. Faddeev and V. N. Faddeeva). Consequently, taking into account the symmetry of matrix $H_k$, we have

$$(H_k x, x) = (H_k^{1/2} H_k^{1/2} x, x) = (H_k^{1/2} x, H_k^{1/2} x) = (y, y);$$

similarly

$$(H_k e_k, e_k) = (H_k^{1/2} e_k, H_k^{1/2} e_k) = (z, z),$$

$$(H_k e_k, x) = (H_k^{1/2} e_k, H_k^{1/2} x) = (z, y).$$

Making use of these relations and applying the Cauchy-Buniakowski inequality we conclude that the following inequality holds:

$$(H_k x, x)(H_k e_k, e_k) - (H_k e_k, x)^2 = (y, y)(z, z) - (z, y)^2 \geq 0,$$

and equality is attained only if $y$ and $z$ are collinear, i.e. (since $H_k$ is nonsingular and $x \neq 0$) only if $x = \lambda e_k$ with some $\lambda \neq 0$. But in this case $(r_k, x)^2 = \lambda^2 (r_k, e_k)^2 = \lambda^2 (r_k, A r_k)^2 > 0$. Thus for any $x \neq 0$ we have

$$(H_{k+1} x, x) = \frac{(y, y)(z, z) - (z, y)^2}{(H_k e_k, e_k)} + \frac{(r_k, x)^2}{(r_k, e_k)} > 0,$$

and this completes the induction.
Matrix $H_n = A^{-1}$. Indeed, $H_k$ satisfies (4.25) with $a = 1$, i.e. $H_n e_j = r_j$, $j = 0, 1, \ldots, n - 1$, or, making use of (4.19),

$$H_n A r_j = r_j, \quad j = 0, 1, \ldots, n - 1.$$

It follows that vectors $r_0, \ldots, r_{n-1}$ are eigenvectors of matrix $H_n A$ with eigenvalues equal to unity. Hence, taking into account the linear independence of conjugate vectors $r_i$, $i = 0, 1, \ldots, n - 1$, we have $H_n A = I$, i.e.

$$H_n = A^{-1}.$$

But from (4.8)

$$A^{-1} = \sum_{i=0}^{n-1} \frac{r_i r_i^*}{(r_i, e_i)},$$

i.e. we find that matrix $H_n$ is determined only by the matrices

$$a \frac{r_i u_i^*}{(u_i, e_i)} = \frac{r_i r_i^*}{(r_i, e_i)}$$

(this was mentioned at the end of the preceding subsection).
(2) Another method of constructing $H_k$ is obtained if we take $a = 1$ in (4.28) and choose $u_{k-1} = v_{k-1} = r_{k-1}$. We have then

$$H_k = H_{k-1} + \frac{(r_{k-1} - H_{k-1} e_{k-1}) r_{k-1}^*}{(r_{k-1}, e_{k-1})}. \qquad (4.49)$$

Matrix $H_k$ thus determined is no longer symmetric. Since $a = 1$, we have again $H_n = A^{-1}$, and this can be demonstrated in just the same way as for method (4.48).

Using (4.49) we can obtain a somewhat different formula for constructing $H_k$. We write (4.49) in the following form:

$$H_k = H_0 + \sum_{i=0}^{k-1} \frac{(r_i - H_i e_i) r_i^*}{(r_i, e_i)}. \qquad (4.50)$$

According to the conditions of conjugateness (4.20) (taking into account formulas (4.19)), we have $(e_k, r_j) = 0$, $0 \leq j \leq k - 1$. Consequently, it follows from (4.50) that

$$H_k e_k = H_0 e_k, \quad k = 0, 1, \ldots, n - 1. \qquad (4.51)$$

Thus formula (4.49) can be written as follows:

$$H_k = H_{k-1} + \frac{(r_{k-1} - H_0 e_{k-1}) r_{k-1}^*}{(r_{k-1}, e_{k-1})}. \qquad (4.52)$$

If $H_0 = I$, this formula proves somewhat simpler than (4.49).
(3) Let us choose $a = 0$, $v_{k-1} = r_{k-1}$. In this case

$$H_k = H_{k-1} - \frac{H_{k-1} e_{k-1} r_{k-1}^*}{(r_{k-1}, e_{k-1})}. \qquad (4.53)$$

With $a = 0$ it follows from (4.25) that $H_n e_j = 0$, $j = 0, 1, \ldots, n - 1$. Since vectors $e_0, \ldots, e_{n-1}$ are linearly independent, these equalities imply that $H_n = 0$ (the linear independence of vectors $e_i = A r_i$, $i = 0, 1, \ldots, n - 1$, follows from the linear independence of conjugate vectors $r_i$ and the properties of matrix $A$).

Since condition (4.51) is satisfied for formula (4.53) too, the latter can be written in the following form:

$$H_k = H_{k-1} - \frac{H_0 e_{k-1} r_{k-1}^*}{(r_{k-1}, e_{k-1})}. \qquad (4.54)$$
The construction of methods of conjugate directions can be continued by choosing various combinations of the constant $a$ and vectors $u_k$, $v_k$ by formulas (4.32), but we shall not do so (here and below, speaking of a concrete method, for example (4.48), we have in mind method (4.47) in which formula (4.48) is used for constructing matrix $H_k$).
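The three updates (4.48), (4.49) and (4.53) can be compared numerically. What follows is only a minimal sketch in plain Python, not taken from the text: the helper names (`update`, `run`), the 3×3 test matrix, the choice $H_0 = I$ and the closed-form exact step $\alpha_k = -(f'_k, p_k)/(A p_k, p_k)$ for a quadratic are illustrative assumptions. It runs process (4.47) once per update rule on the same quadratic and returns the visited points together with the final matrix $H_n$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def rank_one(u, v, scale):
    # the matrix scale * u v^*
    return [[scale * a * c for c in v] for a in u]

def add(M, N):
    return [[a + c for a, c in zip(r, s)] for r, s in zip(M, N)]

def update(H, r, e, method):
    """One step H_{k-1} -> H_k by (4.48), (4.49) or (4.53)."""
    He = matvec(H, e)
    re = dot(r, e)
    if method == "4.48":      # a = 1, u = r, v = H^* e
        v = matvec(transpose(H), e)
        return add(H, add(rank_one(r, r, 1.0 / re),
                          rank_one(He, v, -1.0 / dot(v, e))))
    if method == "4.49":      # a = 1, u = v = r
        diff = [s - t for s, t in zip(r, He)]
        return add(H, rank_one(diff, r, 1.0 / re))
    if method == "4.53":      # a = 0, v = r
        return add(H, rank_one(He, r, -1.0 / re))
    raise ValueError(method)

def run(A, b, x0, method):
    """Process (4.47) for f(x) = 1/2 (Ax, x) + (b, x), starting from H_0 = I."""
    n = len(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = list(x0)
    g = [s + t for s, t in zip(matvec(A, x), b)]      # f'(x) = Ax + b
    points = [list(x)]
    for _ in range(n):
        p = [-v for v in matvec(transpose(H), g)]     # p_k = -H_k^* f'_k
        alpha = -dot(g, p) / dot(matvec(A, p), p)     # exact line search
        r = [alpha * v for v in p]                    # r_k = x_{k+1} - x_k
        x = [s + t for s, t in zip(x, r)]
        g_new = [s + t for s, t in zip(matvec(A, x), b)]
        e = [s - t for s, t in zip(g_new, g)]         # e_k = f'_{k+1} - f'_k
        H = update(H, r, e, method)
        g = g_new
        points.append(list(x))
    return points, H

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [-1.0, -2.0, -3.0]
results = {m: run(A, b, [0.0, 0.0, 0.0], m) for m in ("4.48", "4.49", "4.53")}
```

On this example the three runs should visit the same points up to rounding, while the final matrices differ: $H_n A$ should be close to $I$ for (4.48) and (4.49), and $H_n$ close to the zero matrix for (4.53), in agreement with the properties derived above.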
Let us make a remark here. Strictly speaking, it was necessary in each of the methods treated above to check whether conditions (4.30) were satisfied by vectors $u_k$, $v_k$. It is easy to ascertain that in all the methods studied these conditions were satisfied. For instance, in the case $u_k = v_k = r_k$ the satisfying of conditions (4.30) was already mentioned in the subsection on p. 85. In method (4.48) $v_k = H_k^* e_k$; however, since matrix $H_k$ is positive definite, we have $(v_k, e_k) = (H_k e_k, e_k) > 0$, i.e. conditions (4.30) are also satisfied. Thus, in accordance with the results of the subsection on p. 89, condition (4.18) is satisfied by the methods discussed, i.e. the methods are guaranteed to be nondegenerate.
Let us now derive formulas directly applicable to the calculation of vectors $p_k$ defined by different matrices $H_k$. This is easily done by using formula (4.45). Since $r_{k-1} = \alpha_{k-1} p_{k-1}$, we have from (4.45)

$$p_k = -\gamma_k (H_0 f'_k - \beta_k p_{k-1}), \qquad (4.55)$$

where

$$\beta_k = \frac{(H_0 f'_k, e_{k-1})}{(p_{k-1}, e_{k-1})}. \qquad (4.56)$$

If $v_k = H_k^* e_k$, then

$$\gamma_k = 1 - \frac{(e_{k-1}, H_{k-1}^* f'_k)}{(H_{k-1} e_{k-1}, e_{k-1})}.$$

According to equalities (4.38) we have $(e_{k-1}, H_{k-1}^* f'_k) = (e_{k-1}, H_0 f'_k)$. Further, because of (4.51) and (4.38), $(H_k e_k, e_k) = (H_k f'_{k+1}, f'_{k+1}) + (H_k f'_k, f'_k) = (H_0 f'_{k+1}, f'_{k+1}) - (p_k, f'_k)$. The equalities obtained show that

$$\gamma_k = 1 - \frac{(H_0 f'_k, e_{k-1})}{(H_0 f'_k, f'_k) - (p_{k-1}, f'_{k-1})}. \qquad (4.57)$$


Note that from (4.45), because of (4.21) and (4.24), it follows that (provided $\gamma_j \neq 0$)

$$(f'_i, H_0 f'_j) = 0, \quad i \neq j. \qquad (4.58)$$

Taking this into account, we find that

$$(H_0 f'_k, e_{k-1}) = (H_0 f'_k, f'_k). \qquad (4.59)$$

Using equalities (4.59) and (4.57) we find that

$$\gamma_k = -\frac{(p_{k-1}, f'_{k-1})}{(H_0 f'_k, f'_k) - (p_{k-1}, f'_{k-1})}. \qquad (4.60)$$

Note also that

$$(f'_k, p_k) = (f'_k, p_k) - (f'_{k+1}, p_k) = -(e_k, p_k). \qquad (4.61)$$

Comparing formulas (4.56) and (4.60) and taking into account (4.59) and (4.61), it is easy to establish that $\beta_k = (1 - \gamma_k)/\gamma_k$. Hence $\gamma_k \beta_k = 1 - \gamma_k$. Consequently, formula (4.55) which determines vector $p_k$, in the case where in constructing matrix $H_k$ we use vector $v_{k-1} = H_{k-1}^* e_{k-1}$, can be written in the form

$$p_k = -\gamma_k H_0 f'_k + (1 - \gamma_k) p_{k-1}, \qquad (4.62)$$

where coefficient $\gamma_k$ is determined by one of the formulas (4.57) or (4.60). Vector $p_k$ can be written in the alternative form:

$$p_k = -H_0 f'_k + \rho_k (H_0 f'_k + p_{k-1}), \qquad (4.63)$$

where

$$\rho_k = 1 - \gamma_k = \frac{(H_0 f'_k, f'_k)}{(H_0 f'_k, f'_k) - (p_{k-1}, f'_{k-1})}. \qquad (4.64)$$
Other expressions can be obtained for coefficient $\rho_k$ if equalities (4.59), (4.61), (4.46) are made use of; note that the last of these formulas can be written in the form

$$(f'_k, p_k) = (\rho_k - 1)(H_0 f'_k, f'_k). \qquad (4.65)$$

Calculating coefficients $\gamma_k$, $\rho_k$ in expressions (4.62), (4.63) by different formulas, we obtain in practice different methods of conjugate directions. It should be stressed that in the minimization of nonquadratic functions different formulas for $p_k$ determine different vectors (both in magnitude and in direction). The construction of vector $p_k$ using expression (4.45) is especially simple if vector $v_{k-1} = r_{k-1}$ is chosen in constructing matrix $H_k$. In this case $t_{4,k} = 0$, therefore $\gamma_k = 1$, and from (4.55) we obtain

$$p_k = -H_0 f'_k + \beta_k p_{k-1}, \qquad (4.66)$$

where $\beta_k$ is calculated by formula (4.56). If we use equalities (4.59), (4.61) and (4.46) (the last one in the case under consideration takes the form

$$(p_k, f'_k) = -(H_0 f'_k, f'_k)), \qquad (4.67)$$

then for determining coefficient $\beta_k$ one of the following formulas can be obtained:

$$\beta_k = \frac{(H_0 f'_k, e_{k-1})}{(p_{k-1}, e_{k-1})} = -\frac{(H_0 f'_k, f'_k)}{(p_{k-1}, f'_{k-1})} = \frac{(H_0 f'_k, f'_k)}{(H_0 f'_{k-1}, f'_{k-1})}. \qquad (4.68)$$

Expressions (4.62), (4.63), (4.66) which determine vector $p_k$ can in their turn be given the form $p_k = -H_k^* f'_k$, where matrix $H_k$ depends on the coefficients $\gamma_k$, $\beta_k$, $\rho_k$. For instance, if coefficient $\rho_k$ in (4.63) is calculated by formula (4.64), then the corresponding matrix is

$$H_k = H_0 - \frac{H_0 f'_k (H_0 f'_k + p_{k-1})^*}{(H_0 f'_k, f'_k) - (p_{k-1}, f'_{k-1})}, \qquad (4.69)$$

where $H_n = H_0$ (since $f'_n = 0$). If in (4.66) $\beta_k$ is calculated by the first of formulas (4.68), then vector $p_k$ is determined by matrix

$$H_k = H_0 - \frac{H_0 e_{k-1} p_{k-1}^*}{(p_{k-1}, e_{k-1})}, \qquad (4.70)$$

and if use is made of the second of formulas (4.68), then

$$H_k = H_0 + \frac{H_0 f'_k p_{k-1}^*}{(p_{k-1}, f'_{k-1})}. \qquad (4.71)$$

Note that in formula (4.71) $H_n = H_0$ (since $f'_n = 0$); in (4.70) $H_n \neq H_0$.
The reader can obtain other formulas for constructing $H_k$ independently.

The simplest formula for calculating $A$-orthogonal vectors can be obtained by choosing $H_0 = I$ in (4.66). In this case

$$p_k = -f'_k + \beta_k p_{k-1}, \qquad (4.72)$$

where $\beta_k$ is determined, for instance, by one of the following formulas:

$$\beta_k = \frac{(f'_k, e_{k-1})}{(p_{k-1}, e_{k-1})} = -\frac{(f'_k, f'_k)}{(p_{k-1}, f'_{k-1})} = \frac{(f'_k, f'_k)}{(f'_{k-1}, f'_{k-1})}. \qquad (4.73)$$

Method (4.47) in which conjugate vectors are constructed by (4.72), (4.73) is widely known as the method of conjugate gradients (this name is due to conditions (4.58)).
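Formulas (4.72), (4.73) translate almost line by line into code. The following plain-Python sketch is an illustration only: the function name `conjugate_gradients`, the test data and the closed-form exact step $\alpha_k = -(f'_k, p_k)/(A p_k, p_k)$ are assumptions, not part of the text. At every step it evaluates all three expressions for $\beta_k$ so that they can be compared, and uses the third one to build the next direction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def conjugate_gradients(A, b, x0):
    """Minimize f(x) = 1/2 (Ax, x) + (b, x) by (4.47), (4.72), (4.73)."""
    x = list(x0)
    g = [s + t for s, t in zip(matvec(A, x), b)]     # f'_0 = A x_0 + b
    p = [-v for v in g]                               # p_0 = -f'_0
    beta_log = []
    for _ in range(len(x0)):                          # at most n steps
        if dot(g, g) == 0.0:                          # already at the minimum
            break
        alpha = -dot(g, p) / dot(matvec(A, p), p)     # exact line search
        x = [s + alpha * t for s, t in zip(x, p)]
        g_new = [s + t for s, t in zip(matvec(A, x), b)]
        e = [s - t for s, t in zip(g_new, g)]         # e_k = f'_{k+1} - f'_k
        betas = (dot(g_new, e) / dot(p, e),           # first form in (4.73)
                 -dot(g_new, g_new) / dot(p, g),      # second form
                 dot(g_new, g_new) / dot(g, g))       # third form
        beta_log.append(betas)
        p = [-s + betas[2] * t for s, t in zip(g_new, p)]   # (4.72)
        g = g_new
    return x, beta_log

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [-1.0, -2.0, -3.0]
x_min, beta_log = conjugate_gradients(A, b, [0.0, 0.0, 0.0])
```

For a quadratic function the three values stored in each entry of `beta_log` should coincide up to rounding; for a nonquadratic function they generally differ, which is why the choice of formula matters there.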


Minimization of a Convex Quadratic Function

Until now we have considered methods of $A$-orthogonal directions for the minimization of a strictly convex quadratic function, i.e. we assumed matrix $A$ to be strictly positive definite.

Let now the function

$$f(x) = \tfrac{1}{2}(Ax, x) + (b, x) + c$$

be convex, i.e. let matrix $A$ be positive definite in the nonstrict sense: $(Ax, x) \geq 0$ for any $x \neq 0$. Suppose that this function has a minimum.

Let us study the problem of the application of methods of conjugate directions in this case. Consider first certain properties of function $f(x)$.
(1) If $(Ap, p) = 0$ then of necessity

$$Ap = 0. \qquad (4.74)$$

In fact, if $(Ap, p) = 0$, then $p$ is the minimum point of the convex function $\varphi(x) = \tfrac{1}{2}(Ax, x)$. But at the minimum point the necessary condition of an extremum

$$\varphi'(p) = Ap = 0$$

must be satisfied.

(2) If $p$ is the minimum point of the convex function $\varphi(x) = \tfrac{1}{2}(Ax, x)$, then of necessity

$$(b, p) = 0. \qquad (4.75)$$

In fact, if $(Ap, p) = 0$ and $(b, p) > 0$, then $f(\alpha p) = \alpha (b, p) + c \to -\infty$ as $\alpha \to -\infty$, i.e. $f(x)$ does not attain the minimum and this contradicts the assumption. The case where $(b, p) < 0$ is treated in a similar manner.

(3) The minimum point of function $f(x)$ is not unique.

Indeed, any minimum point of a convex quadratic function $f(x)$ must be a solution of the linear system $Ax + b = 0$ and conversely, since the condition $f'(x) = Ax + b = 0$ is a necessary and (since there is a minimum of $f(x)$) sufficient condition for an extremum of the convex function $f(x)$ (corollary 3.2 of Chap. I). However, the rank of matrix $A$ is lower than the number of unknowns (condition $(Ax, x) \geq 0$ means that matrix $A$ is singular, see (4.74)) and so the solution of the system $Ax + b = 0$ is not unique.

(4) If $(Ap, p) = 0$ and $z \in E^n$ is an arbitrary point, then of necessity

$$(f'(z), p) = 0. \qquad (4.76)$$

Indeed, according to conditions (4.74) and (4.75)

$$(f'(z), p) = (Az + b, p) = (Ap, z) + (b, p) = 0.$$
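Properties (1)-(4) can be checked by hand on a small instance. The matrix and vector below are chosen purely for illustration and do not come from the text. Take $n = 2$ and

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \quad f(x) = \tfrac{1}{2} x_1^2 - x_1 + c.$$

For $p = (0, 1)^*$ we have $(Ap, p) = 0$ and indeed $Ap = 0$, as in (4.74), and $(b, p) = 0$, as in (4.75). The minimum points are all $x = (1, t)^*$ with arbitrary $t$, so the minimum point is not unique. Finally, at any point $z$

$$f'(z) = Az + b = \begin{pmatrix} z_1 - 1 \\ 0 \end{pmatrix}, \quad (f'(z), p) = 0,$$

which is (4.76): every gradient lies in the one-dimensional subspace spanned by $(1, 0)^*$, and here the rank of $A$ is $1$.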

Equality (4.76) can be interpreted as follows. The set of solutions of the problem of the minimization of the function $\varphi(x)$ forms an $(n - q)$-dimensional hyperplane, where $q$ is the rank of matrix $A$. (This hyperplane belongs to a level surface of function $f(x)$, since if $p$ is an arbitrary point of minimum of $\varphi(x) = \tfrac{1}{2}(Ax, x)$ then using (4.75) we have

$$f(p) = \tfrac{1}{2}(Ap, p) + (b, p) + c = c.)$$

Consequently, equality (4.76) means that the gradient of function $f(x)$ at any point is situated in a $q$-dimensional subspace which is orthogonal to the plane $Ap = 0$. It follows that the number of linearly independent vectors $f'(x)$ is equal to $q < n$ (for a convex quadratic function $f(x)$).

Having in mind the above properties of function $f(x)$, let us again turn to the problem of the application of methods of conjugate directions to the solving of the problem under consideration.
For simplicity, suppose that $H_0 = I$ and consider method (4.72). We denote the subspace which contains vectors $f'(x)$ by $E^q$. It is easy to ascertain that vector $p_k$ defined by formula (4.72) belongs to $E^q$. Indeed, $p_0 = -f'_0 \in E^q$ and consequently for any $k$ vector $p_k$ is a linear combination of vectors which belong to subspace $E^q$. Consequently, the function minimization (process (4.47)) by method (4.72) is in practice performed in subspace $E^q$. Now in this subspace, condition $(Ax, x) > 0$ is satisfied for any $x \neq 0$. This means that, owing to space $E^q$ being finite-dimensional, we have for any $x \in E^q$

$$m_1 \|x\|^2 \leq (Ax, x) \leq M_1 \|x\|^2, \quad m_1 > 0, \quad M_1 \leq M.$$

Hence function $f(x)$ is strictly convex in the subspace under consideration, and therefore all of the properties of methods of conjugate directions discussed in the preceding subsections hold also in our case. In particular, equalities (4.58), which show that vectors $f'_i$, $i = 0, 1, \ldots, k$, are linearly independent, hold in our case. However, there cannot be more than $q$ linearly independent vectors in subspace $E^q$. Consequently, with a certain $k \leq q - 1$ the process of constructing conjugate vectors must be truncated. Since the method is nondegenerate (property (4.18)), this will occur only if $f'_{k+1} = 0$.

It is clear from what has been stated above that in minimizing a function by method (4.72) at a certain $k \leq q - 1$ we shall of necessity find that $f'_{k+1} = 0$. Since directions $p_k$ determined by different methods of conjugate directions coincide (up to a scalar factor), everything stated above holds not only for method (4.72) but also for the other algorithms studied in this section. By a somewhat more complicated reasoning it can be shown that the result obtained holds also in the case where $H_0$ is an arbitrary strictly positive definite matrix.

Thus, methods of conjugate directions make it possible to find the minimum point of a convex quadratic function, and the solution of the problem is obtained in fewer than $n$ steps.
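The early termination can be observed on a small singular example. The sketch below is an illustration under stated assumptions: the function name, the stopping tolerance and the matrix (of rank $q = 2$ in $E^3$, with $(b, p) = 0$ for the null vector $p = (0, 0, 1)^*$, so that a minimum exists) are hypothetical choices, not part of the text. It applies method (4.72) and counts the steps taken until the gradient vanishes.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def cg_convex_quadratic(A, b, x0, tol=1e-20):
    """Method (4.72) for a convex (possibly singular) quadratic.

    Stops as soon as the squared gradient norm drops below tol,
    so that no step is attempted once f'_{k+1} = 0 up to rounding.
    """
    x = list(x0)
    g = [s + t for s, t in zip(matvec(A, x), b)]
    p = [-v for v in g]
    steps = 0
    while steps < len(x0) and dot(g, g) > tol:
        alpha = -dot(g, p) / dot(matvec(A, p), p)
        x = [s + alpha * t for s, t in zip(x, p)]
        g_new = [s + t for s, t in zip(matvec(A, x), b)]
        beta = dot(g_new, g_new) / dot(g, g)
        p = [-s + beta * t for s, t in zip(g_new, p)]
        g = g_new
        steps += 1
    return x, steps

# A has rank q = 2; its null space is spanned by (0, 0, 1)
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
b = [-1.0, -1.0, 0.0]
x_min, steps = cg_convex_quadratic(A, b, [3.0, -1.0, 5.0])
```

In this run the third coordinate of the iterate is never changed, because every gradient and every direction $p_k$ has a zero third component: the whole process takes place in the subspace $E^q$, as described above.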
We suppose now that the convex function $f(x)$ does not attain the minimum; this will be the case, as can be seen from the proof of equality (4.75), if some vector $p$ which minimizes function $\varphi(x) = \tfrac{1}{2}(Ax, x)$ is such that $(b, p) \neq 0$.

Let us consider what the processes of constructing conjugate directions lead to in this case.

We first describe one property of $A$-orthogonal vectors which holds under the condition that $f(x)$ is a convex function (but not strictly convex).

If matrix $A$ is positive definite (in the nonstrict sense), then at least one of the conjugate vectors $p_0, \ldots, p_{n-1}$ satisfies the condition

$$(A p_k, p_k) = 0. \qquad (4.77)$$

That this statement holds is shown as follows.

If the conditions

$$(A p_i, p_i) > 0 \qquad (4.78)$$

were satisfied for any $i = 0, 1, \ldots, n - 1$, then the system of vectors $p_0, p_1, \ldots, p_{n-1}$ would be linearly independent. Indeed, suppose that conditions (4.78) are satisfied and that

$$\sum_{i=0}^{n-1} \delta_i p_i = 0,$$

where, for example, $\delta_0 \neq 0$. Then by (scalar) multiplication of both sides of the equality by $A p_0$ we obtain on the left-hand side $\delta_0 (A p_0, p_0) \neq 0$, i.e. the result is a contradiction. Consequently, with conditions (4.78) satisfied, vectors $p_0, \ldots, p_{n-1}$ would form a basis in $E^n$ and therefore any vector $z \neq 0$ could be written in the form

$$z = \sum_{i=0}^{n-1} a_i p_i,$$

where at least one of the coefficients $a_i \neq 0$. However, in this case we would have

$$(Az, z) = \Big( A \sum_i a_i p_i, \sum_i a_i p_i \Big) = \sum_i a_i^2 (A p_i, p_i) > 0.$$

This contradicts the assumption that $f(x)$ is not strictly convex, i.e. that $(Ax, x) = 0$ for some $x \neq 0$.

Thus the initial assumption that conditions (4.78) are fulfilled is false.

According to the property of $A$-orthogonal vectors just discussed, in applying any method of conjugate directions we shall find equality (4.77) satisfied at a certain $k \geq 0$. Since for the function under consideration properties (4.75), (4.76) do not hold, parameter $\alpha_k$, which is calculated by formula (4.16), will become infinite when (4.77) is satisfied, i.e. the further construction of conjugate directions will prove impossible.

Discussion of Results

Thus, we have considered a general scheme of constructing methods of conjugate directions and on its basis obtained many concrete algorithms. Any of these methods makes it possible to find the minimum of a convex quadratic function after a number of steps in process (4.47) not exceeding $n$. Besides, we have made it clear that the successive approximations to the solution $x_0, x_1, \ldots, x_k$ obtained by using different algorithms prove to be identical.

If algorithms are judged by the amount of calculation per iteration, then algorithms (4.62), (4.63), (4.66) should certainly be preferred. These methods are especially easy to implement if the identity matrix $I$ is chosen as the initial matrix $H_0$, and it seems that for most problems the choice $H_0 = I$ is the most expedient. In this case methods (4.63), (4.66) differ only slightly from the gradient method in the work per iteration but are considerably more effective owing to the fact that process (4.47) proves to have a finite number of steps.

The advantage of methods (4.62), (4.63) and (4.66) when implemented on computers is that they require only slightly more computer storage than the methods of steepest descent.

Methods of conjugate directions in which matrices ((4.48), (4.49), (4.52)-(4.54)) are first constructed in order to determine the direction of motion are somewhat inferior to methods (4.63), (4.66) in the aspects considered; however, they retain a considerable advantage over gradient methods. All the methods of the class under consideration have an advantage over Newton's method in that they do not require the calculation of second derivatives of the function.

The question can be raised whether it is worthwhile to consider methods which rely on the preliminary construction of matrices if they are inferior to methods (4.63), (4.72) in the computational effort and computer memory required.


However, it should be kept in mind that we are making a purely theoretical estimate of the methods and do not take into account such an important factor as the sensitivity of an algorithm to errors in computations. This factor can change considerably the relation between the amounts of computation involved in solving the problem by different algorithms. It should also be noted that methods (4.48), (4.49), (4.52), for instance, in solving a minimization problem make it possible to obtain at the same time the inverse matrix $A^{-1}$, and this may be useful in some cases.

The difference in the properties of the algorithms matters considerably when they are used for the minimization of nonquadratic functions; this will be discussed in the next section.
Methods of conjugate directions prove useful in one more respect: they make it possible to establish whether the matrix is sign-definite. Thus, according to the results of the subsection "Minimization of a Convex Quadratic Function" (p. 99), if matrix $A$ is positive definite (in the nonstrict sense) and function $f(x)$ does not attain the minimum, then at a certain step we shall have $\alpha_k = \infty$. However, if matrix $A$ is not positive definite, we find at a certain step of process (4.47) that $\alpha_k < 0$. Thus the value of parameter $\alpha_k$ determines the sign of matrix $A$.
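This diagnostic use of $\alpha_k$ can be sketched in a few lines. The code below is a hedged illustration, not the authors' procedure: the function name `probe_sign`, the returned labels and the relative tolerance used to decide that $(A p_k, p_k)$ vanishes are assumptions introduced here. It runs method (4.72) and reports the first of the three situations it encounters.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def probe_sign(A, b, x0, tol=1e-12):
    """Run (4.47), (4.72) and classify the quadratic by the step alpha_k."""
    x = list(x0)
    g = [s + t for s, t in zip(matvec(A, x), b)]
    p = [-v for v in g]
    for _ in range(len(x0)):
        if dot(g, g) <= tol:
            return "minimum found"
        curvature = dot(matvec(A, p), p)              # (A p_k, p_k)
        if abs(curvature) <= tol * dot(p, p):
            return "alpha infinite"                   # f has no minimum
        alpha = -dot(g, p) / curvature
        if alpha < 0.0:
            return "alpha negative"                   # A not positive definite
        x = [s + alpha * t for s, t in zip(x, p)]
        g_new = [s + t for s, t in zip(matvec(A, x), b)]
        beta = dot(g_new, g_new) / dot(g, g)
        p = [-s + beta * t for s, t in zip(g_new, p)]
        g = g_new
    return "minimum found" if dot(g, g) <= tol else "undecided"

# strictly convex case, convex case without a minimum, indefinite case
case_convex = probe_sign([[2.0, 0.0], [0.0, 1.0]], [-2.0, -1.0], [0.0, 0.0])
case_unbounded = probe_sign([[1.0, 0.0], [0.0, 0.0]], [0.0, 1.0], [1.0, 0.0])
case_indefinite = probe_sign([[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0], [1.0, 2.0])
```

In floating-point arithmetic an exactly infinite $\alpha_k$ never appears, which is why the sketch tests the denominator $(A p_k, p_k)$ against a small threshold instead; the threshold value is a matter of choice.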
The effectiveness of methods of conjugate directions is the reason for their increasingly extensive application to the minimization of quadratic functions and the solution of systems of linear equations.

5. METHODS OF CONJUGATE DIRECTIONS.
MINIMIZATION OF ARBITRARY FUNCTIONS
Considerations about the Applicability
of the Methods
Suppose that we intend to make use of the process

$$x_{k+1} = x_k + \alpha_k p_k, \quad p_k = -H_k^* f'_k, \quad k = 0, 1, \ldots, \qquad (5.1)$$

where vector $p_k$ (or matrix $H_k$) is determined by one of the methods studied in the preceding section, for the minimization of an arbitrary (not quadratic) convex function $f(x)$. In this case, matrix $f''(x)$ will have different elements at different points of sequence (5.1); by virtue of this fact vectors $p_0, \ldots, p_{n-1}$ constructed by any of the methods described in the subsection "Concrete Algorithms" (p. 93) will not satisfy conditions (4.20), i.e. will not be conjugate. However, if the initial point $x_0$ is in a close neighbourhood of the minimum point $x_*$ of a smooth convex function $f(x)$, then at any point of this region matrix $f''(x)$ is close enough to matrix $f''(x_*)$, i.e. the quadratic function

$$\varphi(x) = f(x_*) + \tfrac{1}{2} (f''(x_*)(x - x_*), x - x_*)$$

is a good approximation to the function $f(x)$. Thus, we can expect that the properties of vectors $p_0, \ldots, p_k$ determined by the methods of Sec. 4 will be close enough to the properties of conjugate ($f''(x_*)$-orthogonal) vectors, and therefore the properties of process (5.1), in which parameter $\alpha_k$ is chosen on condition that the minimum of function $f(x)$ is attained in the direction of $p_k$, will be close enough to the properties of methods of conjugate directions. In other words, we can expect the methods of the preceding section to prove sufficiently effective in minimizing nonquadratic functions too. In this case, the methods will no longer yield the result after a finite number of steps, since the conditions

$$(f''(x_*) p_k, p_i) = 0, \quad i \neq k,$$

will not be strictly satisfied, whatever the initial point $x_0$.


Iterative processes of the type (5.1) in which vector $p_k$ is constructed by algorithms of Sec. 4 and the value of parameter $\alpha_k$ is chosen on condition that
$f(x_k + \alpha_k p_k) = \min_{\alpha} f(x_k + \alpha p_k)$

will be called, as before, methods of conjugate directions.


Note that the condition under which parameter $\alpha_k$ is chosen can be written also in the following form:
$(f'_{k+1}, p_k) = (f'(x_k + \alpha_k p_k), p_k) = 0$. (5.2)
The object of this section is to substantiate the convergence of methods of conjugate directions in the minimization of nonquadratic functions and to obtain bounds on the rate of convergence.
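Condition (5.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the test function, the starting point and the bisection routine `exact_step` are all illustrative assumptions. It finds the step along a descent direction by bisection on the directional derivative and then evaluates the left-hand side of (5.2).

```python
import math

def grad(x):
    # Gradient of a hypothetical test function (an assumption for illustration):
    # f(x) = 0.5*(x0^2 + 2*x1^2) + log(cosh(x0 + x1)), which is strongly convex.
    t = math.tanh(x[0] + x[1])
    return [x[0] + t, 2.0 * x[1] + t]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def exact_step(x, p, tol=1e-12):
    """Find alpha > 0 with (f'(x + alpha*p), p) = 0 by bisection on the slope."""
    def slope(alpha):
        y = [xi + alpha * pi for xi, pi in zip(x, p)]
        return dot(grad(y), p)
    lo, hi = 0.0, 1.0
    while slope(hi) < 0.0:          # expand until the directional derivative changes sign
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = [1.0, -2.0]
p0 = [-g for g in grad(x0)]         # descent direction for the choice H_0 = I
alpha0 = exact_step(x0, p0)
x1 = [xi + alpha0 * pi for xi, pi in zip(x0, p0)]
residual = dot(grad(x1), p0)        # left-hand side of (5.2); zero up to the bisection tolerance
```

Bisection is used only because it needs nothing beyond the gradient; any one-dimensional minimizer that drives the slope along $p_k$ to zero would serve equally well.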

Theorem on Convergence of the Methods
In what follows we shall assume that $f(x)$ is a strongly convex differentiable function whose first and second derivatives are continuous, i.e. that conditions
$m\|y\|^2 \le (f''(x)y, y) \le M\|y\|^2$, $m > 0$ (5.3)
are satisfied for all $x, y \in E^n$ and that a symmetric, strictly positive definite matrix has been chosen as $H_0$, i.e.
$m_0\|y\|^2 \le (H_0 y, y) \le M_0\|y\|^2$, $m_0 > 0$ (5.4)
for all $y \in E^n$.
Processes of type (5.1) can be realized either with restoration of matrix $H_k$ after a finite number of steps, or without such a reinitialization. Speaking of processes with restoration, say, after $n$ steps we mean that with any $\xi = 0, 1, \ldots$ matrix $H$ is restored, i.e.
$H_{\xi n} = H_0$.
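A process of type (5.1) with restoration after $n$ steps can be sketched as follows. This is a hedged illustration only: formula (4.48) and the other updates of Sec. 4 are not reproduced in this section, so a DFP-type update is used here as a stand-in, and the test function, the function names and the bisection line search are assumptions made for the example.

```python
import math

def grad(x):
    # Illustrative strongly convex function (assumption, not from the text):
    # f(x) = 0.5*(x0^2 + 2*x1^2) + log(cosh(x0 + x1)), minimized at x* = (0, 0).
    t = math.tanh(x[0] + x[1])
    return [x[0] + t, 2.0 * x[1] + t]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def matvec(H, v):
    return [dot(row, v) for row in H]

def line_search(x, p, tol=1e-14):
    # Step alpha with (f'(x + alpha*p), p) = 0, i.e. condition (5.2), by bisection.
    def slope(a):
        return dot(grad([xi + a * pi for xi, pi in zip(x, p)]), p)
    lo, hi = 0.0, 1.0
    while slope(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dfp_update(H, r, e):
    # DFP-type update, used as a stand-in for the formulas of Sec. 4.
    He = matvec(H, e)
    re, eHe = dot(r, e), dot(e, He)
    if re <= 0.0 or eHe <= 0.0:     # no measurable progress: keep H unchanged
        return H
    n = len(r)
    return [[H[i][j] + r[i] * r[j] / re - He[i] * He[j] / eHe
             for j in range(n)] for i in range(n)]

def minimize_with_restoration(x, cycles, n=2):
    H0 = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    cycle_points = [x]
    for _ in range(cycles):
        H = H0                      # restoration: H_{xi*n} = H_0
        for _ in range(n):
            g = grad(x)
            if dot(g, g) < 1e-30:   # gradient vanishes to machine precision
                break
            p = [-v for v in matvec(H, g)]
            alpha = line_search(x, p)
            x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
            r = [a - b for a, b in zip(x_new, x)]
            e = [a - b for a, b in zip(grad(x_new), g)]
            H = dfp_update(H, r, e)
            x = x_new
        cycle_points.append(x)
    return cycle_points

points = minimize_with_restoration([1.0, -2.0], cycles=6)
errors = [math.hypot(pt[0], pt[1]) for pt in points]   # distances of x_{xi*n} from x* = (0, 0)
```

The list `errors` holds the distances of the restoration points $x_{\xi n}$ from the minimum, which is the sequence whose rate of convergence the section goes on to study.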


From the beginning note the following fact. If a process with restoration of matrix $H_k$ after a finite number of steps is being realized, then for any of the methods of conjugate directions the following condition is fulfilled:
$\lim_{k \to \infty} \|f'(x_k)\| = 0$ (5.5)
since each first step of the process after restoration is a step of gradient descent, for which according to (5.3) the conditions of convergence of gradient methods (theorem 1.6) are satisfied, and in the following steps between restorations we have a descent to the minimum of the function in the direction of motion. The fulfillment of condition (5.5) for a strictly convex function means that any of the methods discussed in Sec. 4, if realized with restoration of matrix $H_k$ after a finite number of steps, converges to the solution. Therefore in order to judge the effectiveness of such a process it is important to obtain bounds on its rate of convergence.
Note that condition (5.5) for processes with restoration is satisfied not only for strictly convex functions but for any function if the fulfillment of (5.5) is guaranteed for it in applying gradient methods (see theorem 1.4).
However, if processes are realized without restoration of $H_k$, then their convergence must be substantiated. Besides, it is also necessary to estimate their rate of convergence.
Let us now formulate the theorem whose contents are the main result of this section.
Theorem 5.1. For the minimization of function $f(x)$ which satisfies conditions (5.3) let there be applied process (5.1) in which the construction of matrix $H_k$ is performed by one of the methods of Sec. 4 ((4.48), (4.49), (4.52)-(4.54), (4.69)-(4.71)) with restoration of $H_k$ after $n$ steps. If the value of $\alpha_k$ is chosen under the condition that the minimum of the function be in the direction of $p_k$, then the sequence $\{x_{\xi n}\}$, whatever the initial point $x_0$ chosen, converges to the solution at a superlinear rate.
Let us outline the general scheme of the proof of this theorem. Suppose that the assertion is not true, i.e. that for the iterative processes described the following condition is satisfied with any $k$:
$\|x_{k+1} - x_*\| \ge \lambda\|x_k - x_*\|$ (5.6)
where $\lambda > 0$ is a constant. Using inequality (1.12) and the expression
$\|f'(x)\| = \|f'(x) - f'(x_*)\| \le M\|x - x_*\|$ (5.7)
which hold for a function which satisfies condition (5.3), we find that condition (5.6) is equivalent to the following one:
$\|f'_{k+1}\| \ge \delta\|f'_k\|$ (5.8)
where $\delta > 0$ is a constant.
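The passage from (5.6) to (5.8) can be spelled out as follows. The sketch assumes that inequality (1.12), which is not reproduced in this section, is the lower bound $\|f'(x)\| \ge m\|x - x_*\|$ that strong convexity (5.3) provides; the value of $\delta$ obtained is one admissible choice, not necessarily the one used by the authors.

```latex
\begin{aligned}
\|f'_{k+1}\| &\ge m\,\|x_{k+1} - x_*\| && \text{(lower bound, assumed form of (1.12))}\\
             &\ge m\lambda\,\|x_k - x_*\| && \text{(by (5.6))}\\
             &\ge \frac{m\lambda}{M}\,\|f'_k\| && \text{(by (5.7))},
\end{aligned}
\qquad \text{so that one may take } \delta = \frac{m\lambda}{M}.
```

Running the same chain in the opposite direction gives $\|x_{k+1} - x_*\| \ge (m\delta/M)\|x_k - x_*\|$ from (5.8), which is why the two conditions are treated as interchangeable up to the value of the constant.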


Studying the properties of process (5.1) and assuming that condition (5.8) is fulfilled we find that independently of what algorithm is used for constructing matrix $H_k$ the following estimates hold:
$C\|f'_k\| \le \|r_k\| \le N\|f'_k\|$ (5.9)

where $C$, $N$ are constants independent of $k$, $C > 0$ and

$(e_{\xi n+i}, r_{\xi n+j}) = o(\|e_{\xi n+i}\|\,\|r_{\xi n+j}\|)$, $i \ne j$, $0 \le i, j \le n - 1$. (5.10)

It will be demonstrated below that if these estimates are fulfilled, then sequence (5.1) converges to the solution at a superlinear rate. However, this contradicts our initial assumption (5.6) (or (5.8)), i.e. condition (5.6) cannot be satisfied for process (5.1). Using this fact it will be easy to establish that the theorem holds.
Thus, the pattern of the proof is the same for all the methods discussed, but the validity of (5.9) and (5.10) is established in different ways. The proof that these estimates hold for different algorithms will be given in the next subsection and here we shall describe that part of the proof which all these methods have in common.
We shall first make a remark about the notation. In what follows, for simplicity, we shall often in using vectors and parameters
$f'_{\xi n+i}$, $r_{\xi n+i}$, $\alpha_{\xi n+i}$, $p_{\xi n+i}$, $i = 0, 1, \ldots, n - 1$
omit index $\xi n$, i.e. operate with vectors and parameters $r_i$, $f'_i$, $e_i$, $\alpha_i$, $p_i$ etc. However, it should be stressed that this is done only to simplify the written form; the real index of the corresponding quantity is $\xi n + i$.
We turn now to the proof of the theorem and assume that estimates (5.9) and (5.10) are satisfied. Using Lagrange's formula for operators we obtain
$(e_i, r_j) = (f''_{ic} r_i, r_j) = (f''_{\xi n} r_i, r_j) + ((f''_{ic} - f''_{\xi n}) r_i, r_j)$, (5.11)

where, as usual, index $ic$ denotes an intermediate point in the corresponding segment:
$x_{ic} = x_i + \theta r_i$, $0 < \theta < 1$.
If $\|r_i\| \to 0$, then because of the uniform continuity of second derivatives of function $f(x)$ on set $S = \{x : f(x) \le f(x_0)\}$ we have $\|f''_{ic} - f''_{\xi n}\| \to 0$ and it follows from (5.11) that, if (5.10) is satisfied, estimates
$(f''_{\xi n} r_i, r_j) = o(\|r_i\|\,\|r_j\|) + o(\|e_i\|\,\|r_j\|)$, $i \ne j$, $0 \le i, j \le n - 1$
hold too.


Under conditions (5.3) $\|e_i\| = \|f'_{i+1} - f'_i\| \le M\|r_i\|$, consequently $\|e_i\|$ and $\|r_i\|$ are of the same order of smallness. Taking this into account, we have

$(f''_{\xi n} r_i, r_j) = o(\|r_i\|\,\|r_j\|)$, $i \ne j$, $0 \le i, j \le n - 1$. (5.12)

If estimates (5.12) are fulfilled, then there are vectors

$\bar r_i = r_i + \omega_i$, $i = 0, 1, \ldots, n - 1$, (5.13)

where $\|\omega_i\| = o(\|r_i\|)$, such that

$(f''_{\xi n} r_i, \bar r_j) = 0$, $i \ne j$, $0 \le i, j \le n - 1$. (5.14)

This can be shown as follows.
Let us normalize vectors $r_i$:
$\tilde r_i = \dfrac{r_i}{(f''_{\xi n} r_i, r_i)^{1/2}}$.
Then $(f''_{\xi n}\tilde r_i, \tilde r_i) = 1$ and as $\xi \to \infty$, since process (5.1) is convergent (with restoration of $H_k$) and by (5.3) and (5.12), we have

$|(f''_{\xi n}\tilde r_i, \tilde r_j)| \le \dfrac{1}{m\|r_i\|\,\|r_j\|}\,|(e_i, r_j) + ((f''_{\xi n} - f''_{ic}) r_i, r_j)| \to 0$,

$i \ne j$, $0 \le i, j \le n - 1$.

Therefore if $R_{\xi n}$ is a matrix whose columns are vectors $\tilde r_i$ and $F_{\xi n} = R^*_{\xi n} f''_{\xi n} R_{\xi n}$, then as $\xi \to \infty$
$F_{\xi n} \to I$.

Since $F^{-1}_{\xi n} F_{\xi n} = I$ we obtain, denoting $F^{-1}_{\xi n} R^*_{\xi n}$ by $Q^*_{\xi n}$,

$Q^*_{\xi n} f''_{\xi n} R_{\xi n} = I$. (5.15)

Now, since $F_{\xi n} \to I$, we have also $F^{-1}_{\xi n} \to I$ and, consequently, $R_{\xi n} - Q_{\xi n} \to 0$, i.e. vector-columns $q_i$ of matrix $Q_{\xi n}$ can be written in the form
$q_i = \tilde r_i + \tilde\omega_i$, $i = 0, 1, \ldots, n - 1$

where $\|\tilde\omega_i\| \to 0$ as $\xi \to \infty$. Let us write the equalities obtained in the following form:
$(f''_{\xi n} r_i, r_i)^{1/2} q_i = r_i + (f''_{\xi n} r_i, r_i)^{1/2}\tilde\omega_i$.


Because of (5.15), vectors $r_i$ and $\bar r_i = (f''_{\xi n} r_i, r_i)^{1/2} q_i$, $i = 0, 1, \ldots, n - 1$, satisfy conditions (5.14). At the same time, vectors $\bar r_i$ satisfy conditions (5.13) since by (5.3)

$\dfrac{\|\omega_i\|}{\|r_i\|} = \dfrac{(f''_{\xi n} r_i, r_i)^{1/2}\|\tilde\omega_i\|}{\|r_i\|} \le \dfrac{M^{1/2}\|r_i\|\,\|\tilde\omega_i\|}{\|r_i\|} \to 0$.

Thus it has been shown that (5.14) holds.
Vectors $\bar r_i$ with sufficiently large $\xi$ are linearly independent. Indeed, let there be factors $\delta_i$, $i = 0, 1, \ldots, n - 1$ (of which at least two are nonzero) such that $\sum_{i=0}^{n-1}\delta_i\bar r_i = 0$. If $\delta_0 \ne 0$, then we obtain

$\delta_0 (f''_{\xi n} r_0, \bar r_0) + \sum_{j=1}^{n-1}\delta_j (f''_{\xi n} r_0, \bar r_j) = 0$.

However, with sufficiently large values of $\xi$ this equality is not satisfied. Indeed, since $\|\omega_i\| = o(\|r_i\|)$ and as $\xi \to \infty$, $\|r_i\| \to 0$, with sufficiently large $\xi$ according to (5.3), we have

$(f''_{\xi n} r_0, \bar r_0) = (f''_{\xi n} r_0, r_0) + (f''_{\xi n} r_0, \omega_0) > 0$

and at the same time $(f''_{\xi n} r_0, \bar r_j) = 0$, $j = 1, \ldots, n - 1$, because of (5.14). Thus we come to a contradiction, i.e. vectors $\bar r_i$, $i = 0, 1, \ldots, n - 1$ are really linearly independent.
Let $z_{\xi n}$ be the minimum of the quadratic function

$\varphi(x) = (f'_{\xi n}, x - x_{\xi n}) + \frac{1}{2}(f''_{\xi n}(x - x_{\xi n}), x - x_{\xi n})$.

Let us write vector $z_{\xi n} - x_{\xi n}$ in the form

$z_{\xi n} - x_{\xi n} = \sum_{i=0}^{n-1} a_i\bar r_i$. (5.16)

Since $\varphi'(z_{\xi n}) = f'_{\xi n} + f''_{\xi n}(z_{\xi n} - x_{\xi n}) = 0$, then using (5.16) we obtain

$\sum_{i=0}^{n-1} a_i f''_{\xi n}\bar r_i = -f'_{\xi n}$.

Hence, taking into account (5.14), it follows that coefficients $a_i$ can be calculated by the following formulas:

$a_i = -\dfrac{(f'_{\xi n}, r_i)}{(f''_{\xi n}\bar r_i, r_i)}$, $i = 0, 1, \ldots, n - 1$.
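The coefficient formula can be illustrated in the special case $\omega_i = 0$, where $\bar r_i = r_i$ and the vectors are exactly conjugate. The sketch below is an illustration under that assumption: the matrix `A` (standing in for $f''_{\xi n}$), the vector `g` (standing in for $f'_{\xi n}$) and the helper `conjugate_basis` are hypothetical. It builds conjugate vectors by Gram-Schmidt in the inner product $(Au, v)$ and checks that the expansion reproduces the minimizer of the quadratic model.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_basis(A, vectors):
    """A-orthogonalize the given vectors (Gram-Schmidt in the inner product (A u, v))."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            Ab = matvec(A, b)
            coeff = dot(Ab, v) / dot(Ab, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]            # symmetric positive definite, plays the role of f''
g = [1.0, -2.0, 0.5]             # plays the role of f'
R = conjugate_basis(A, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

# a_i = -(g, r_i) / (A r_i, r_i): the formula above with bar r_i = r_i
coeffs = [-dot(g, r) / dot(matvec(A, r), r) for r in R]
step = [sum(a * r[k] for a, r in zip(coeffs, R)) for k in range(3)]   # z - x
residual = [gi + hi for gi, hi in zip(g, matvec(A, step))]            # phi'(z) = g + A (z - x)
```

In the text the vectors $r_i$ are only approximately conjugate, which is why the auxiliary vectors $\bar r_i$ and condition (5.14) are needed in place of exact conjugacy.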


Let us write the numerator on the right-hand side in the following form:

$(f'_{\xi n}, r_i) = (f'_0 - f'_1 + f'_1 - \cdots - f'_{i+1} + f'_{i+1}, r_i) = -\sum_{s=0}^{i}(e_s, r_i)$

(we have taken into account that by (5.2) $(f'_{i+1}, r_i) = 0$). Hence, having in mind estimates (5.10), it follows that
$(f'_{\xi n}, r_i) = -(e_i, r_i) + \sum_{s=0}^{i-1} o(\|e_s\|\,\|r_i\|)$. (5.17)

According to conditions (5.8) and (5.9) all of the vectors $r_0, \ldots, r_{n-1}$ are of the same order of smallness (recall that vectors $r_{\xi n+i}$ are actually meant). Since, as was mentioned above, $\|e_i\| \le M\|r_i\|$, vectors $e_0, \ldots, e_{n-1}$ are of the same order of smallness. Taking into account the above remarks the equalities (5.17) can be written in the following form:
$(f'_{\xi n}, r_i) = -(e_i, r_i) + o(\|r_i\|^2)$, $i = 0, 1, \ldots, n - 1$.
Further, taking into account (5.13), we find that

$(f''_{\xi n}\bar r_i, r_i) = (f''_{\xi n} r_i, r_i) + (f''_{\xi n}\omega_i, r_i)$
$= (f''_{ic} r_i, r_i) + ((f''_{\xi n} - f''_{ic}) r_i, r_i) + (f''_{\xi n}\omega_i, r_i) = (e_i, r_i) + o_1(\|r_i\|^2)$.
Thus
$a_i = \dfrac{(e_i, r_i) + o(\|r_i\|^2)}{(e_i, r_i) + o_1(\|r_i\|^2)}$.
By (5.3), we have
$(e_i, r_i) = (f''_{ic} r_i, r_i) \ge m\|r_i\|^2$, $i = 0, 1, \ldots, n - 1$. (5.18)
Consequently, as $\xi \to \infty$ (i.e. as $\|r_i\| \to 0$)
$a_i \to 1$, $i = 0, 1, \ldots, n - 1$. (5.19)
Since $x_{(\xi+1)n} - x_{\xi n} = \sum_{i=0}^{n-1} r_i$, we have

$x_{(\xi+1)n} - z_{\xi n} = (x_{(\xi+1)n} - x_{\xi n}) - (z_{\xi n} - x_{\xi n}) = \sum_{i=0}^{n-1}(r_i - a_i\bar r_i)$.

Hence, taking into account (5.13) and (5.19), we obtain

$\|x_{(\xi+1)n} - z_{\xi n}\| = \sum_{i} o(\|r_i\|)$

or using (5.8) and (5.9)
$\|x_{(\xi+1)n} - z_{\xi n}\| = o(\|f'_{\xi n}\|)$. (5.20)


Since $z_{\xi n} - x_{\xi n} = -(f''_{\xi n})^{-1} f'_{\xi n}$ and taking into account (5.20), we have
$x_{(\xi+1)n} - x_{\xi n} = (x_{(\xi+1)n} - z_{\xi n}) + (z_{\xi n} - x_{\xi n}) = -(f''_{\xi n})^{-1} f'_{\xi n} + \eta_{\xi n}$
where $\|\eta_{\xi n}\| = o(\|f'_{\xi n}\|)$.
It follows that there is a sequence of matrices $D_{\xi n} \to (f''_{\xi n})^{-1}$ such that
$x_{(\xi+1)n} - x_{\xi n} = -D_{\xi n} f'_{\xi n}$. (5.21)
(We can take, for instance,
$D_{\xi n} = (f''_{\xi n})^{-1} + \dfrac{(z_{\xi n} - x_{(\xi+1)n})\, f'^{*}_{\xi n}}{(f'_{\xi n}, f'_{\xi n})}$.)
Equality (5.21) shows that sequence $\{x_{\xi n}\}$, $\xi = 0, 1, \ldots$ converges at a superlinear rate to the solution; the corresponding bounds on the rate of convergence can be obtained just as it was done in theorem 3.1 for sequence $\{x_k\}$.
Thus assuming that condition (5.6) is satisfied and taking estimates (5.10) and (5.9) to be satisfied, we have demonstrated that for $\{x_{\xi n}\}$ inequality
$\|x_{(\xi+1)n} - x_*\| \le \lambda_{\xi n}\|x_{\xi n} - x_*\|$ (5.22)
where $\lambda_{\xi n} \to 0$ as $\xi \to \infty$, holds. However, if condition (5.6) holds, then inequality (5.22) cannot be satisfied since with (5.6) fulfilled we have
$\|x_{(\xi+1)n} - x_*\| \ge \lambda^n\|x_{\xi n} - x_*\|$. (5.23)
Thus we have come to a contradiction. This means that condition (5.6) (or (5.23)) cannot be fulfilled for process (5.1).
The impossibility of fulfilling condition (5.6) with any $k$ means, in fact, that for process (5.1) (with reinitialization) inequality (5.23) cannot be fulfilled in any subsequence $\{\xi_m\}$, $m = 0, 1, \ldots$. If there were a sequence $\{\xi_m\}$ such that
$\|x_{(\xi_m+1)n} - x_*\| \ge \lambda\|x_{\xi_m n} - x_*\|$, (5.24)
then with any $\xi_m n \le k \le (\xi_m + 1)n$ for methods with reinitialization estimates (5.9), (5.10) would be fulfilled; this will be found in studying the properties of such processes in the next subsection. Therefore, repeating the above argument we would have concluded that at iterations which correspond to sequence $\{\xi_m\}$ inequality (5.22) is fulfilled and this contradicts (5.24).
Thus, for process (5.1) with restoration of matrix $H_k$ inequality (5.24) cannot be fulfilled. It follows that for any constant $\lambda > 0$ there is a number $T$ such that with $\xi > T$ condition (5.22) is fulfilled, i.e. sequence $\{x_{\xi n}\}$ converges to the solution at a superlinear rate.


Study of Properties of Different Algorithms
We turn now to the proof of the validity of estimates (5.9), (5.10) for different methods of conjugate directions with restoration of matrix $H_k$ after $n$ steps, assuming that inequality (5.6) (or (5.8)) is fulfilled.
The fact that for any of these methods the estimates hold is established by induction; it is demonstrated that estimates (5.9), (5.10) take place with $i \ne j$, $i, j = 0, 1$; and then supposing that these estimates take place with $0 \le i, j \le \tau < n - 1$ we prove that they remain valid also with $0 \le i, j \le \tau + 1$.
1. Method (4.48). If restoration of matrix (4.48) is performed after a finite number of steps, then with any $k$ matrix $H_k$ is bounded:
$\|H_k\| \le L < \infty$. (5.25)
We show now how this can be proved.
By (5.2),
$(H_k f'_k, f'_{k+1}) = -(p_k, f'_{k+1}) = 0$.
Therefore
$(H_k e_k, e_k) = (H_k f'_k, f'_k) + (H_k f'_{k+1}, f'_{k+1})$. (5.26)
Since $H_k$ is positive definite (Sec. 4), we have $(H_k e_k, e_k) \ge -(p_k, f'_k) = (p_k, e_k)$. Hence, according to (5.18)
$(H_k e_k, e_k) \ge \dfrac{m}{\alpha_k}\|r_k\|^2$. (5.27)

Taking into account moreover that $\|e_k\| \le M\|r_k\|$, we obtain from (4.48) that
$\|H_{k+1}\| \le \|H_k\| + \dfrac{\|r_k\|^2}{m\|r_k\|^2} + \dfrac{\alpha_k\|H_k\|^2 M^2\|r_k\|^2}{m\|r_k\|^2}$.
Using condition (5.4) one can easily ascertain that $\alpha_{\xi n} \le \bar\alpha < \infty$ and on the strength of this it follows from the recursive inequality for $\|H_{k+1}\|$ that estimate (5.25) holds for $H_{\xi n+1}$. On this ground we shall prove below by induction that $\alpha_{\xi n+i} \le \bar\alpha < \infty$ with any $i = 1, \ldots, n - 1$. Taking this into account we find that (5.25) holds.
Let us prove now that with $i = 1$ the following relations hold:
$(r_1, e_0) = 0$, $(e_1, r_0) = o(\|r_0\|\,\|r_1\|)$,
$C_1\|f'_1\| \le \|r_1\| \le N_1\|f'_1\|$ (5.28)
where the constants $C_1$, $N_1$ are independent of $k$ and $C_1 > 0$. The first of these estimates is found as follows:
$(r_1, e_0) = -\alpha_1 (H_1 f'_1, e_0) = -\alpha_1 (f'_1, H_1 e_0)$.


But $H_1 e_0 = r_0$, therefore $(r_1, e_0) = -\alpha_1 (f'_1, r_0) = 0$. Further,

$(e_1, r_0) = (f''_{1c} r_1, r_0) = (r_1, f''_{0c} r_0) + (r_1, (f''_{1c} - f''_{0c}) r_0) = (r_1, e_0) + o(\|r_1\|\,\|r_0\|) = o(\|r_1\|\,\|r_0\|)$.
Let us now show that the estimates hold for $\|r_1\|$. It follows from (4.48), taking into account (5.2) and (5.26), that

$(H_1 f'_1, f'_1) = \dfrac{(H_0 f'_1, f'_1)(H_0 f'_0, f'_0)}{(H_0 f'_1, f'_1) + (H_0 f'_0, f'_0)}$.
Using estimates (1.14), (1.15) and (5.7) we deduce that for a function which satisfies (5.3)

m ( l + - £ - ) ( / ( * ) - / . ) < II (/(*)-/.). (5-29)

Taking into account estimates (5.4) and (5.29) we have on set $S_0 = \{x : f(x) \le f(x_0)\}$

$\dfrac{(H_0 f'_1, f'_1)}{(H_0 f'_0, f'_0)} \le \dfrac{M_0\|f'_1\|^2}{m_0\|f'_0\|^2} \le \dfrac{d_1 (f_1 - f_*)}{d_2 (f_0 - f_*)} \le \dfrac{d_1}{d_2}$

where $d_1$, $d_2$ are constants independent of $\xi$. By virtue of this,

$(H_1 f'_1, f'_1) \ge a_1\|f'_1\|^2$ (5.30)

where $a_1 = m_0/(1 + d_1/d_2)$ is independent of $\xi$.
Let us use now inequality (5.30) in order to estimate the value of parameter $\alpha_1$. Since
$f_2 - f_1 = \alpha_1 (f'_1, p_1) + \dfrac{\alpha_1^2}{2}(f''_{1c} p_1, p_1)$

and $\alpha_1$ is chosen according to condition (5.2), it is clear that
$-\dfrac{(f'_1, p_1)}{M\|p_1\|^2} \le \alpha_1 \le -\dfrac{(f'_1, p_1)}{m\|p_1\|^2}$.
Now by (5.30), we have $-(f'_1, p_1) = (H_1 f'_1, f'_1) \ge a_1\|f'_1\|^2$ and by (5.25), $\|p_1\| = \|H_1 f'_1\| \le L\|f'_1\|$; taking these estimates into account, we have $\alpha_1 \ge \dfrac{a_1}{M L^2} = \underline\alpha > 0$. At the same time it follows from (5.30) that $\|p_1\| \ge a_1\|f'_1\|$. Using this estimate we can easily establish that $\alpha_1 \le \dfrac{L}{m a_1^2} = \bar\alpha < \infty$. Thus we find that

$N_1\|f'_1\| = \bar\alpha L\|f'_1\| \ge \|r_1\| = \alpha_1\|H_1 f'_1\| \ge \underline\alpha\, a_1\|f'_1\| = C_1\|f'_1\|$

where constants $N_1$, $C_1$ are independent of $\xi$, i.e. we have established that estimates (5.28) hold.
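The two-sided bound on the step length can be checked numerically. The sketch below is an illustration under stated assumptions: the test function is hypothetical, the values $m = 1$ and $M = 4$ are bounds on its Hessian eigenvalues worked out for this particular function, and the matrix `H` is an arbitrary symmetric positive definite stand-in for $H_1$.

```python
import math

def grad(x):
    # Hypothetical test function f(x) = 0.5*(x0^2 + 2*x1^2) + log(cosh(x0 + x1)).
    # Its Hessian is diag(1, 2) + sech^2(x0 + x1) * [[1, 1], [1, 1]],
    # so (5.3) holds with m = 1 and M = 4.
    t = math.tanh(x[0] + x[1])
    return [x[0] + t, 2.0 * x[1] + t]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def exact_step(x, p, tol=1e-12):
    # alpha with (f'(x + alpha*p), p) = 0, found by bisection on the slope
    def slope(a):
        return dot(grad([xi + a * pi for xi, pi in zip(x, p)]), p)
    lo, hi = 0.0, 1.0
    while slope(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m, M = 1.0, 4.0
H = [[2.0, 0.5], [0.5, 1.0]]                 # arbitrary positive definite matrix
x = [1.0, -2.0]
g = grad(x)
p = [-(H[0][0] * g[0] + H[0][1] * g[1]),
     -(H[1][0] * g[0] + H[1][1] * g[1])]     # p = -H f'
alpha = exact_step(x, p)
lower = -dot(g, p) / (M * dot(p, p))         # -(f', p) / (M ||p||^2)
upper = -dot(g, p) / (m * dot(p, p))         # -(f', p) / (m ||p||^2)
```

The bounds follow because condition (5.2) gives $\alpha = -(f', p)/(\bar f'' p, p)$, where $\bar f''$ is the Hessian averaged along the step and therefore obeys (5.3) with the same $m$ and $M$.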


Suppose that the estimates

$(r_i, e_j) = o(\|r_i\|\,\|r_j\|)$, $i \ne j$, $0 \le i, j \le \tau < n - 1$, (5.31)
$C_i\|f'_i\| \le \|r_i\| \le N_i\|f'_i\|$, $0 \le i \le \tau$ (5.32)
where constants $N_i$, $C_i > 0$ are independent of $\xi$, hold. Let us show that similar estimates take place also with $i, j \le \tau + 1$. We have
$(f'_{\tau+1}, r_j) = (f'_{j+1}, r_j) + (e_{j+1} + \cdots + e_\tau, r_j)$, $0 \le j < \tau$. (5.33)
According to condition (5.8) and estimates (5.32) quantities $\|f'_{\tau+1}\|$, $\|f'_\tau\|$ and $\|r_i\|$ with any $0 \le i \le \tau$ are of the same order of smallness. Taking this into account and using conditions (5.2) and (5.31) we find according to (5.33) that
$(f'_{\tau+1}, r_j) = o(\|f'_{\tau+1}\|\,\|r_j\|) = o(\|r_\tau\|^2)$, $0 \le j < \tau$.
Since $(f'_{\tau+1}, r_\tau) = 0$, according to (5.2), we obtain finally
$(f'_{\tau+1}, r_j) = o(\|r_\tau\|^2)$, $0 \le j \le \tau$. (5.34)
Let us estimate now the quantity $(H_{j+1} f'_{\tau+1}, f'_{\tau+1})$. Using formula (4.48) and taking into account (5.26) we obtain for any $0 \le j \le \tau$:

(tf,+1/;+ 1 , /;+ i ) - /;+.)+ {r^ ! l +f

+ ( H j f r + i , / ; + i ) ( H 1f-, / i ) - ( / / , / ; + 1 , / ; + o 2
- ( H j f ] , / ; + , ) ’ + 2 { H , f i + u f't+i) { H i f ' h / ; + « ) ! •
On the right-hand side of this inequality the difference between the first and the third terms of the numerator, by the Cauchy-Bunyakovsky inequality, is nonnegative. Taking into account estimates (5.34), (5.27) and (5.25) and that $\alpha_j$, $j \le \tau$, is bounded, it is easy to ascertain that the ratio of the last two terms of the numerator to the denominator is of the order of $o(\|r_j\|\,\|f'_{\tau+1}\|) = o(\|f'_{\tau+1}\|^2)$. Hence
(#,•/;+! i /;+!> /;•)
( H j+ifx + u f x + i ) 7 ^ ' {Hjej, ej) O d l / i + l l l 2 )-

Estimates (5.32) imply that there are constants $a_j$ independent of $\xi$ such that $(H_j f'_j, f'_j) = -(p_j, f'_j) \ge a_j\|f'_j\|^2$. Making use of this fact and of (5.25) we have

(Hj+ifr+1, / ; + ! ) > ( g j / ; + 1 ’ r '+i)


- o ( \ \ r r + i \ \ 2 ) > a j ( H i f'x + u / ; + , ) - o ( | | / ; + 1 ||2 ) (5.35)

where $\bar a_j > 0$ and is independent of $\xi$ (by (5.32)).


It was noted in the preceding subsection that for processes with restoration of $H_k$ as $k \to \infty$ we have $\|f'_k\| \to 0$. Therefore, it follows from inequalities (5.35), taking into account that matrix $H_k$ is positive definite, that if with any $\xi$ we have $(H_j f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_j\|f'_{\tau+1}\|^2$ where $\gamma_j > 0$ and is independent of $\xi$, then there is a constant $\gamma_{j+1} > 0$ such that with any $\xi$ we shall have $(H_{j+1} f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_{j+1}\|f'_{\tau+1}\|^2$. But in estimating the quantity $(H_1 f'_{\tau+1}, f'_{\tau+1})$ we find, since $(H_0 f'_{\tau+1}, f'_{\tau+1}) \ge m_0\|f'_{\tau+1}\|^2$, that there is a constant $\gamma_1$ such that $(H_1 f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_1\|f'_{\tau+1}\|^2$ with any $\xi$. Taking this into account, our argument by induction shows that there is a constant $a_{\tau+1}$ independent of $\xi$ and such that $(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge a_{\tau+1}\|f'_{\tau+1}\|^2$. We establish now just as we did above that
$\dfrac{a_{\tau+1}}{M L^2} \le -\dfrac{(f'_{\tau+1}, p_{\tau+1})}{M\|p_{\tau+1}\|^2} \le \alpha_{\tau+1} \le -\dfrac{(f'_{\tau+1}, p_{\tau+1})}{m\|p_{\tau+1}\|^2} \le \dfrac{L}{m a_{\tau+1}^2}$.

Therefore, we have
$N_{\tau+1}\|f'_{\tau+1}\| \ge \|r_{\tau+1}\| = \alpha_{\tau+1}\|H_{\tau+1} f'_{\tau+1}\| \ge C_{\tau+1}\|f'_{\tau+1}\|$. (5.36)
Let us show now that
$H_{\tau+1} e_j = r_j + \eta_j$, $0 \le j \le \tau$ (5.37)
where $\|\eta_j\| = o(\|r_j\|)$.
Multiplying both sides of formula (4.48) by $e_j$ we obtain
$H_{s+1} e_j = H_s e_j + \dfrac{r_s (r_s, e_j)}{(r_s, e_s)} - \dfrac{(H_s e_s, e_j)}{(H_s e_s, e_s)} H_s e_s$. (5.38)
If we assume that with a certain $s$, $j + 1 \le s \le \tau$, equalities $H_s e_j = r_j + \eta_j^s$ take place where $\|\eta_j^s\| = o(\|r_j\|)$, then using estimates (5.31), (5.27), (5.25) and taking into account that all of the quantities $\|r_s\|$ are of the same order of smallness we also have by (5.38) that $H_{s+1} e_j = r_j + \eta_j^{s+1}$, where $\|\eta_j^{s+1}\| = o(\|r_j\|)$. But $H_{j+1} e_j = r_j$, and we establish by induction that equalities (5.37) hold true.
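The starting point of this induction, $H_{j+1} e_j = r_j$, is an exact algebraic property of the update and is easy to verify. The sketch below assumes that (4.48) has the DFP-type form suggested by relation (5.38); formula (4.48) itself is not reproduced in this section, and the numerical values are arbitrary.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def matvec(H, v):
    return [dot(row, v) for row in H]

def dfp_update(H, r, e):
    # Assumed DFP-type form: H + r r^T / (r, e) - (H e)(H e)^T / (H e, e)
    He = matvec(H, e)
    re, eHe = dot(r, e), dot(e, He)
    n = len(r)
    return [[H[i][j] + r[i] * r[j] / re - He[i] * He[j] / eHe
             for j in range(n)] for i in range(n)]

H = [[2.0, 0.5], [0.5, 1.0]]     # arbitrary symmetric positive definite H_s
r = [0.3, -0.1]                  # step r_s
e = [0.5, 0.2]                   # gradient difference e_s, chosen so that (r_s, e_s) > 0
H_next = dfp_update(H, r, e)
H_next_e = matvec(H_next, e)     # reproduces r_s, since H e + r - H e = r

# Relation (5.38) applied to another vector e_j
e_j = [0.1, 0.4]
He = matvec(H, e)
formula = [hv + rv * dot(r, e_j) / dot(r, e) - dot(He, e_j) / dot(He, e) * hev
           for hv, rv, hev in zip(matvec(H, e_j), r, He)]
direct = matvec(H_next, e_j)
```

For a nonquadratic function the relation survives later updates only up to the terms $\eta_j$ of (5.37), which is what the induction in the text controls.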
Taking into account (5.37) we have
$(r_{\tau+1}, e_j) = -\alpha_{\tau+1}(H_{\tau+1} f'_{\tau+1}, e_j) = -\alpha_{\tau+1}(f'_{\tau+1}, r_j + \eta_j)$,
therefore by (5.34), we find that
$(r_{\tau+1}, e_j) = o(\|r_\tau\|^2) + o(\|f'_{\tau+1}\|\,\|r_j\|)$, $0 \le j \le \tau$.
Inequalities (5.8) and (5.36) show that $\|r_{\tau+1}\|$ is of the same order of smallness as $\|f'_{\tau+1}\|$ and consequently as $\|r_j\|$, $0 \le j \le \tau$. It follows that
$(r_{\tau+1}, e_j) = o(\|r_{\tau+1}\|\,\|r_j\|) = o(\|r_{\tau+1}\|^2)$, $0 \le j \le \tau$. (5.39)
Taking this into account we establish in a manner analogous to that used before with $i = 1$ that also
$(e_{\tau+1}, r_j) = o(\|r_{\tau+1}\|^2)$, $0 \le j \le \tau$. (5.40)


The relations (5.36), (5.39) and (5.40) show that estimates (5.31) and (5.32) really take place with $\tau + 1$ too.
Thus it has been established that estimates (5.9), (5.10) hold for method (4.48) if it is assumed that process (5.1) is realized with restoration of matrix $H_k$ after a finite number of steps.
The above argument can be repeated word for word if we assume that inequality (5.24) (or the corresponding inequality $\|f'_{(\xi_m+1)n}\| \ge \delta_n\|f'_{\xi_m n}\|$) is satisfied rather than condition (5.6) (or (5.23)) and consider only iterations which correspond to subsequence $\{\xi_m\}$. At these iterations estimates (5.9) and (5.10) remain valid.
The superlinear rate of convergence of the method follows from this fact as was shown in the preceding subsection.
2. Method (4.49). If matrix $H_k$ is restored after a finite number of steps, then with any $k$ the matrix has a bound. This follows from inequality
$\|H_{k+1}\| \le \|H_k\| + \dfrac{\|r_k\|^2}{m\|r_k\|^2} + \dfrac{\|H_k\| M\|r_k\|^2}{m\|r_k\|^2}$.

With $i = 1$
$(H_1 f'_1, f'_1) = (H_0 f'_1, f'_1) \ge m_0\|f'_1\|^2$.
Making use of these relations and reasoning as in studying method (4.48) we establish that estimates (5.28) hold and then assuming that estimates (5.31) and (5.32) hold we demonstrate that estimate (5.34) holds.
Further we have that

$(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) = (H_0 f'_{\tau+1}, f'_{\tau+1}) + \sum_{i=0}^{\tau}\dfrac{(f'_{\tau+1}, r_i)(r_i - H_i e_i, f'_{\tau+1})}{(r_i, e_i)}$.
Using estimates (5.18), (5.34) and the fact that $H_k$ has a bound and taking into account that all the quantities $\|r_i\|$, $\|e_i\|$, $\|f'_{\tau+1}\|$ are of the same order of smallness we find that

$(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge m_0\|f'_{\tau+1}\|^2 + o(\|f'_{\tau+1}\|^2)$.

Consequently with sufficiently small values of $\|f'_{\tau+1}\|$ (i.e. with sufficiently large $\xi$) we have

$(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge a_{\tau+1}\|f'_{\tau+1}\|^2$

where $a_{\tau+1} > 0$ and is independent of $\xi$. This being so, we find that

$\bar\alpha \ge \alpha_{\tau+1} \ge \underline\alpha > 0$ and $C_{\tau+1}\|f'_{\tau+1}\| \le \|r_{\tau+1}\| \le N_{\tau+1}\|f'_{\tau+1}\|$.
Using equalities
$H_{s+1} e_i = H_s e_i + (r_s - H_s e_s)\dfrac{(r_s, e_i)}{(r_s, e_s)}$, $i + 1 \le s \le \tau$,


taking into account estimates (5.31), (5.18), the fact that $H_k$ has a bound and reasoning as we did in studying (4.48), we ascertain that matrix $H_{\tau+1}$ satisfies equations (5.37) and consequently estimates (5.39) and (5.40) remain valid.
This completes the proof that our reasoning by induction holds.
The study of method (4.52) is carried out in a similar way.
Remark. If at an iteration of the initial stage of the process we find that $p_i = -H_i f'_i = 0$ then it is necessary to restart the process restoring matrix $H_0$.
3. Method (4.53). The technique of the proofs pertaining to matrix $H_k$ in this case is just the same as in method (4.49). Note only that in this method matrix $H_{\tau+1}$ satisfies equations
$H_{\tau+1} e_j = \eta_j$, $0 \le j \le \tau$
where $\|\eta_j\| = o(\|r_j\|)$, rather than conditions (5.37). This, however, simplifies the derivation of estimate (5.39).
The study of method (4.54) is analogous.
4. Method (4.69). Matrix $H_k$ (4.69) determines vector $p_k$ (4.63) in which $\beta_k$ is calculated by formula (4.64).
For this method
II f'k H 2 ~ t~ M o II ll II H k - \ II |l f ' k - i |l
II H k I K A A . +
m o II V h ||*

B y (5.8) with any k w e have where d0 is


i n d e p e n d e n t o f £. T a k i n g t h i s i n t o a c c o u n t w e f i n d t h a t w i t h a n y k
f o r t h e p r o c e s s w i t h r e s t o r a t i o n , | | / / /t | | ^ L .
Let us show now that for any $k$, $\beta_k \le \beta < 1$, i.e. that

$$1 - \beta_k \ge 1 - \beta = \rho > 0 \tag{5.41}$$

where $\rho$ is independent of $\xi$.
Using (4.65) we find that
f _ _ _ _ _ _ _ _ _ W i - W _ _ _ _ _ _ _

(- ^ 0 //> i )
w , . r„)

Because of (5.4) and (5.8), for any $k$
wi-i-/;-!) ^ ii* ^ „

where $\gamma$ is independent of $\xi$.


Besides, using (4.65) and (4.64) it is easy to establish by induction that for any $k$ we have $0 < \beta_k < 1$. Consequently

^ — •

Hence, taking into account that $\beta_{\xi n} = 0$, $\xi = 0, 1, \ldots$, it can be established that (5.41) holds. It follows from (4.65), using (5.41), that for any $k$
$$-(p_k, f'_k) \ge \rho m_0 \|f'_k\|^2,$$
i.e.
$$\|p_k\| \ge \rho m_0 \|f'_k\|.$$
Taking into account this inequality and the fact that $H_k$ is bounded, as in the methods considered above, we establish that $0 < \underline{\alpha} \le \alpha_k \le \bar{\alpha}$ and $C\|f'_k\| \le \|r_k\| \le N\|f'_k\|$. Thus it has been proved that estimates (5.9) hold.
Let us show that estimates (5.10) hold. With $k = \xi n + 1$ we have

(Pi, «o) = — ( H , 1 \ , e 0) + /;)■' V H o / ; . « o ) + (P o . «.)]•

But $(H_0 f'_1, e_0) = (H_0 f'_1, f'_1)$ and, because of (4.61), $(p_0, e_0) = -(p_0, f'_0)$.

Making use of these equalities we find that $(p_1, e_0) = 0$.
If estimates (5.31) hold, then in the same way as was done for method (4.48) it can be proved that estimates (5.34) hold.
Further, using (4.63) we establish that
$$(p_{\tau+1}, e_j) = (\beta_{\tau+1} - 1)(H_0 f'_{\tau+1}, e_j) + \beta_{\tau+1}(p_\tau, e_j).$$
Let us estimate the quantity $(H_0 f'_{\tau+1}, e_j)$. It follows from (4.63) that
$$H_0 f'_j = \frac{1}{\beta_j - 1}(p_j - \beta_j p_{j-1}). \tag{5.42}$$
Taking this into account we have
$$H_0 e_j = \frac{1}{\beta_{j+1} - 1}(p_{j+1} - \beta_{j+1} p_j) - \frac{1}{\beta_j - 1}(p_j - \beta_j p_{j-1}).$$
Using this expression and taking into account estimates (5.34), (5.41) we establish that
$$(f'_{\tau+1}, H_0 e_j) = o(\|p_{\tau+1}\|^2) = o(\|r_{\tau+1}\|^2), \quad 0 \le j < \tau.$$
Since by (5.31) we have also $(p_\tau, e_j) = o(\|r_{\tau+1}\|^2)$, $0 \le j < \tau$, we find that
$$(p_{\tau+1}, e_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j < \tau.$$
With $j = \tau$,

$$(p_{\tau+1}, e_\tau) = (\beta_{\tau+1} - 1)(H_0 f'_{\tau+1}, e_\tau) + \beta_{\tau+1}(p_\tau, e_\tau).$$


Hence, using (4.64) and (4.61), we obtain:


, , (*o/;+ 1 ./y (*,/;)
( P t + i . e x) - (/ / 0 / ; + 1 , / ; + 1 ) _ ( P t , / ; , •

Since $(H_0 f'_{\tau+1}, f'_{\tau+1}) > 0$, we have

( ,Pt + 1» ^ x ) ^ ( ^ o / t + 1 > / t )-
If the quantity $(H_0 f'_{\tau+1}, f'_\tau)$ is estimated using expression (5.42) with $j = \tau$, then making use of (5.34) and (5.41) we find that

$$(H_0 f'_{\tau+1}, f'_\tau) = o(\|r_{\tau+1}\|^2)$$
(it has been taken into account that since (5.9) holds, the quantities $\|r_i\|$ and $\|f'_i\|$, $i = 0, 1, \ldots, n-1$, are of the same order of smallness). Thus
$$(p_{\tau+1}, e_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j \le \tau,$$

i.e. for the method under consideration estimates (5.39) hold. Consequently, estimates (5.40) hold too.
Thus we have established that for method (4.69), assuming that condition (5.6) is fulfilled, estimates (5.9) and (5.10) hold too.
5. Method (4.71). By (4.67) and (5.4) we have for this method
-(/>*, \\ti\\2 . (5.43)
Taking this into account we have
l l * o II I I / i IIII ill/it-ill
II ll<llff.||+ m 0 || j ||2

Since for any $k$ the ratio $\|f'_k\| / \|f'_{k-1}\|$ on set $S_0$ is a bounded quantity and $\|H_0\| \le M_0$, we have
II*J < M 0 + d W H u ^ W
where $d$ is a constant. It follows that if the matrix is restored after a finite number of steps, then for any $k$ matrix $H_k$ is bounded: $\|H_k\| \le L$. Making use of this fact and estimate (5.43), we ascertain that for any $k$, $\bar{\alpha} \ge \alpha_k \ge \underline{\alpha} > 0$ and $N\|f'_k\| \ge \|r_k\| \ge C\|f'_k\|$.
Consequently, for method (4.71) estimates (5.9) hold. Let us prove that estimates (5.10) hold:

( p * + 1. **) = - W i + i . e „ ) ~ {Holkip h \ ) \ " J ' ) ( P * . « * > •

Hence, taking (4.61) into account,


$$(p_{k+1}, e_k) = -(H_0 f'_{k+1}, e_k) + (H_0 f'_{k+1}, f'_{k+1}) = (H_0 f'_{k+1}, f'_k). \tag{5.44}$$


With $k = \xi n$, (5.44) yields

$$(p_1, e_0) = (H_0 f'_1, f'_0) = -(f'_1, p_0) = 0.$$

Using (4.66) we can obtain the following expressions:

$$(p_{\tau+1}, e_j) = -(H_0 f'_{\tau+1}, e_j) + \beta_{\tau+1}(p_\tau, e_j),$$
$$H_0 f'_j = -p_j + \beta_j p_{j-1},$$
$$H_0 e_j = -p_{j+1} + p_j + \beta_{j+1} p_j - \beta_j p_{j-1}.$$

If we assume that estimates (5.31) hold, then arguing as in method (4.48) we can prove that estimates (5.34) hold. Note also that

o ^ a / o i f / ; 11*
P *' ( I > h - 1, r h - x ) ^ w o l l / i _ , II* ^ d2 *

Taking this into account and reasoning as we did for method (4.69), we establish that
$$(p_{\tau+1}, e_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j < \tau$$
and, besides,
$$(H_0 f'_\tau, f'_{\tau+1}) = o(\|r_{\tau+1}\|^2),$$
hence it follows from (5.44) that
$$(p_{\tau+1}, e_\tau) = o(\|r_{\tau+1}\|^2).$$
Thus
$$(p_{\tau+1}, e_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j \le \tau$$
and this proves that estimates (5.39) hold. The validity of estimates (5.40) is established in the same manner as for method (4.48). Consequently, estimates (5.10) hold for the method under consideration.
In studying method (4.48) we noted that the proof of estimates (5.9), (5.10) can be given assuming that condition (5.24) is fulfilled; it is only necessary to repeat the above argument for the corresponding iterations. This remark is applicable to the other methods considered above, and this implies that they converge at a superlinear rate.

Further Study of the Rate of Convergence
1. Suppose now that matrix $f''(x)$, besides conditions (5.3), satisfies Lipschitz' condition (2.8). In this case it is possible to obtain a more precise bound on the rate of convergence of sequence $\{x_{\xi n}\}$.


To make referring more convenient, we list the relations which hold if (5.3) does (many of them were used before):
$$m\|x - x_*\| \le \|f'(x)\| \le M\|x - x_*\| \tag{5.45}$$
$$d_1[f(x) - f(x_*)] \le \|f'(x)\|^2 \le d_2[f(x) - f(x_*)] \tag{5.46}$$

(the constants $d_1$ and $d_2$ are independent of the choice of point $x$)

$$m\|r_k\| \le \|e_k\| \le M\|r_k\|. \tag{5.47}$$

Let $x$, $y$ be arbitrary points such that
$$f(y) \le f(x). \tag{5.48}$$
Then making use of (5.46) we establish that
$$\|f'(y)\| \le C\|f'(x)\|. \tag{5.49}$$
Here and further on in this subsection $C$ denotes various constants (not equal to zero) which are independent of the choice of points $x, y \in E^n$.
If (5.48) is satisfied, we have by (5.45) and (5.49)
$$\|y - x_*\| \le C\|x - x_*\|. \tag{5.50}$$
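As an illustration only, relations (5.45) and (5.46) can be checked numerically on a simple strongly convex function. The sketch below assumes a hypothetical diagonal quadratic $f(x) = \frac{1}{2}\sum a_i x_i^2$ with $x_* = 0$; for this particular function one may take $m = \min a_i$, $M = \max a_i$, $d_1 = 2m$, $d_2 = 2M$. The function, the sample points and the names `A_DIAG` and `check_bounds` are assumptions made for the example, not part of the text.

```python
import math
import random

# Hypothetical test function: f(x) = 0.5 * sum(a_i * x_i^2), minimised at x* = 0.
A_DIAG = [1.0, 4.0, 10.0]            # Hessian eigenvalues
m, M = min(A_DIAG), max(A_DIAG)      # constants of (5.3) for this example


def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A_DIAG, x))


def grad(x):
    return [a * xi for a, xi in zip(A_DIAG, x)]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def check_bounds(x, rel=1e-9):
    """Return True if x satisfies (5.45) and (5.46) with d1 = 2m, d2 = 2M."""
    g = norm(grad(x))
    dist = norm(x)                   # ||x - x*|| because x* = 0
    gap = f(x)                       # f(x) - f(x*) because f(x*) = 0
    slack = rel * (1.0 + g * g)      # tolerance for rounding errors
    ok_545 = m * dist - slack <= g <= M * dist + slack
    ok_546 = 2.0 * m * gap - slack <= g * g <= 2.0 * M * gap + slack
    return ok_545 and ok_546


random.seed(0)
points = [[random.uniform(-5.0, 5.0) for _ in A_DIAG] for _ in range(1000)]
all_ok = all(check_bounds(x) for x in points)
print(all_ok)
```

For a general function satisfying (5.3) the constants $d_1$, $d_2$ need not equal $2m$, $2M$; that choice is specific to the quadratic used here.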

2. Suppose that for the iterative processes being studied the following estimate holds:

$$\|f'_{\xi n+i}\| \, \|f'_{\xi n+i+1}\| \le \lambda_\xi \|f'_{\xi n+j}\|, \quad 0 \le i \le j \le n-1. \tag{5.51}$$

Here and further on, $\lambda_\xi$ will denote different variables tending to zero as $\xi \to \infty$.
In what follows we shall limit ourselves to the study of the properties of method (4.48). However, the results obtained (lemma 5.1, theorem 5.2) hold also for other algorithms of conjugate directions.
Lemma 5.1. Let process (5.1) be used for the minimization of a twice continuously differentiable function $f(x)$ which also satisfies conditions (5.3) and (2.8); in this process the construction of matrix $H_k$ is performed by formula (4.48). Then if inequalities (5.51) hold, estimates (5.9) also hold and moreover
$$|(r_{\xi n+i}, e_{\xi n+j})| \le C\|r_{\xi n+t}\|^2 \|r_{\xi n+t+1}\|, \quad t = \min\{i, j\}, \; i \ne j, \; 0 \le i, j \le n-1. \tag{5.52}$$

This lemma is proved in the same way as estimates (5.9) and (5.10) for method (4.48) in the subsection on p. 111; only the order of smallness of some quantities is determined more precisely. Therefore,


we abstain from giving a detailed proof and shall dwell only on the changes involved.
With $i = 1$,
$$(e_1, r_0) = (r_1, e_0) + (r_1, (f''_{1c} - f''_{0c}) r_0).$$
Taking (2.8) into account, we obtain

$$\|f''_{1c} - f''_{0c}\| = \|f''(x_1 + \theta_1 r_1) - f''(x_0 + \theta_0 r_0)\| \le R(\|x_1 - x_0\| + \|r_1\| + \|r_0\|) \le R(2\|r_0\| + \|r_1\|).$$
By (5.50) and (5.45), we have
$$\|r_k\| \le \|x_k - x_*\| + \|x_{k+1} - x_*\| \le C\|f'_k\|. \tag{5.53}$$

Using (5.53) and (5.49) we establish that $\|r_1\| \le C\|f'_1\| \le C\|f'_0\|$. Taking into account also that $\|r_0\| \ge C\|f'_0\|$, we obtain $\|r_1\| \le C\|r_0\|$. Consequently, $\|f''_{1c} - f''_{0c}\| \le C\|r_0\|$. Using this we find that
$$|(e_1, r_0)| \le C\|r_0\|^2 \|r_1\|.$$

Suppose that estimates (5.9) and (5.52) hold with $0 \le i, j \le \tau < n - 1$. Then since $(f'_{j+1}, r_j) = 0$, we have

$$|(f'_{\tau+1}, r_j)| = |(e_\tau + \ldots + e_{j+1}, r_j)| \le C\|r_j\|^2 \|r_{j+1}\|, \quad 0 \le j < \tau. \tag{5.54}$$

Hence, using (5.53),
$$|(f'_{\tau+1}, r_j)| \le C\|r_j\| \, \|f'_j\| \, \|f'_{j+1}\|. \tag{5.55}$$
If (5.51) is satisfied, then $\|f'_j\| \, \|f'_{j+1}\| \le \lambda_\xi \|f'_{\tau+1}\|$. Making use of this inequality we obtain from (5.55) that
$$|(f'_{\tau+1}, r_j)| \le \lambda_\xi \|r_j\| \, \|f'_{\tau+1}\|, \quad 0 \le j < \tau.$$
Since $(f'_{\tau+1}, r_\tau) = 0$, we finally obtain
$$|(f'_{\tau+1}, r_j)| \le \lambda_\xi \|r_j\| \, \|f'_{\tau+1}\|, \quad 0 \le j \le \tau. \tag{5.56}$$

Using estimates (5.27), (5.56) and (5.25) we obtain also:


< * ■j j f j ” ^ ^ T + l - r j)Z || r> ||2 /r T7\
WH :j e j , e j ) ^ m | | r j ||2 II ’ (o.o7)
I (Jijrj + V /;+ 1 ) ( H j f y /;+ l ) i c || || u /;+ 1 1| | (/;+ 1 , n ) \
^ ™ II o i l ”
h \ \ f i + i \ \ * ii ii
(5.58)
II r j II
Taking into account that $\|r_j\| \ge C\|f'_j\|$, $0 \le j \le \tau$, and using (5.53), (5.49) we obtain
$$\|r_i\| \le C\|f'_i\| \le C\|f'_j\| \le C\|r_j\|, \quad 0 \le j \le i \le \tau. \tag{5.59}$$


Making use of (5.59) we have
II / x _j_| |l“ II /j- 4-i II
11 / x I 1 II2 , T. (s.no)
iiTTii
Using (5.57), (5.58) and (5.60) we establish, in the same way as in the subsection on p. 111, that
(Hx+iK+i, 1) > f l x + i ii/;+iii*.
Let us show now that
$$H_{\tau+1} e_j = r_j + \eta_{j,\tau+1}, \quad 0 \le j \le \tau \tag{5.61}$$
where
T
I K . x + i l K C 2 (5.62)
v=j+l
Indeed,
( < V J J se j ) I I s e s
(5.63)

we have $H_{j+1} e_j = r_j$. Using estimates (5.52), which hold by assumption with $0 \le s, j \le \tau$, (5.25), (5.27), (5.47) and taking into account that $|\alpha_i| \le C$, $0 \le i \le \tau$, we obtain with $s = j + 1$
|l l t ' y + ! • H j + i e j) M j + i e j + i II II (e j + i » r j ) H j + i e j + i II < ---^ H r j l | 2 II r 7 + l II

(■H j + i e j + l ' e j + 1 ) {-H j + l e j - r h e j + 1 ) |l r j + i II

Noting also that $(r_{j+1}, e_j) = 0$, we obtain from (5.63):


it . ii w ^ r H r i I I " II r i * \ II
H j + 2e j — rj + 7+2 • || i \ j , ; + • * II ^ C - - j| ||
Suppose that for a certain $s$, $j + 1 < s$, we have
rj | ii r* II r i !l’ II r 7+ U I
H s ej rj + f]j, s* II T | 7*. s \ \ < C 2 j j| || •
v=j+l V
Then, because of the same conditions used with $s = j + 1$, we have:
ll ( < V M s e j ) || || r j ||- || f j + i || „ || r j ||~ J| f’j + i II
( U se s , e s ) ||7JI 2i [fTTil ?
V=i-H

U s i n g these estimates in (5.63) w e establish that

,r . Il || „ si ^r\ H ^ 7* I I 2 II 0 " + * II
H * + \ ej rj + T/, s+i» || 7, s + 1 I I ^ ^ p j j -•
v=j+ 1
Thus, b y induction, w e c a n consider (5.61) to hold.


We can prove now that estimates (5.52) hold with $i = \tau + 1$.

Making use of (5.61) we have
$$(r_{\tau+1}, e_j) = -\alpha_{\tau+1}(H_{\tau+1} f'_{\tau+1}, e_j) = -\alpha_{\tau+1}(f'_{\tau+1}, r_j + \eta_{j,\tau+1}). \tag{5.64}$$
By (5.59) and (5.49), we have
$$\|r_\nu\| \ge C\|f'_\nu\| \ge C\|f'_{\tau+1}\|, \quad 0 \le \nu \le \tau. \tag{5.65}$$
Taking this into account and using (5.62), we obtain $|(f'_{\tau+1}, \eta_{j,\tau+1})| \le C\|r_j\|^2 \|r_{j+1}\|$. Using this inequality and also estimates (5.54) in (5.64), and taking into account that $\alpha_{\tau+1} \le C$, we find that
$$|(r_{\tau+1}, e_j)| \le C\|r_j\|^2 \|r_{j+1}\|, \quad 0 \le j \le \tau. \tag{5.66}$$
Further,
$$(e_{\tau+1}, r_j) = (r_{\tau+1}, e_j) + (r_{\tau+1}, (f''(x_{\tau+1} + \theta_{\tau+1} r_{\tau+1}) - f''(x_j + \theta_j r_j)) r_j). \tag{5.67}$$
By (5.65) and (5.53),
$$\|r_{\tau+1}\| \le C\|r_j\|, \quad 0 \le j \le \tau.$$

Consequently, using (2.8) we have

$$\|f''(x_{\tau+1} + \theta_{\tau+1} r_{\tau+1}) - f''(x_j + \theta_j r_j)\| \le R(\|x_{\tau+1} - x_j\| + \|r_{\tau+1}\| + \|r_j\|) \le C\|r_j\|.$$
Using this estimate and (5.66) in (5.67) we obtain:
$$|(e_{\tau+1}, r_j)| \le C\|r_j\|^2 \|r_{j+1}\|, \quad 0 \le j \le \tau.$$
Thus one step of induction has been completed, i.e. it has been established that estimates (5.9) and (5.52) hold. The proof of the lemma is completed.
Theorem 5.2. Let $f(x)$ be a twice continuously differentiable function and let matrix $f''(x)$ satisfy conditions (5.3) and (2.8). If $f(x)$ is minimized by algorithm {(5.1), (4.48)}, then for any sufficiently large $\xi$ the following estimate holds:

$$\|x_{(\xi+1)n} - x_*\| \le C\|x_{\xi n+1} - x_*\| \, \|x_{\xi n} - x_*\|. \tag{5.68}$$

Proof. By (5.45), estimate (5.68) is equivalent to
$$\|f'_n\| \le C\|f'_1\| \, \|f'_0\|. \tag{5.69}$$
Suppose that estimate (5.69) does not hold for all sufficiently large $\xi$. Then there must exist an infinite subsequence $\{\xi_m\}$ such that for the corresponding points the following inequalities hold:

$$\|f'_0\| \, \|f'_1\| \le \lambda_{\xi_m}\|f'_n\|, \quad \lambda_{\xi_m} \to 0 \text{ as } \xi_m \to \infty. \tag{5.70}$$


Without loss of generality, it can be assumed that the subsequence $\{\xi_m\}$ coincides with the whole sequence $\xi = 0, 1, \ldots$. Taking into account (5.49) we can ascertain that if (5.70) holds, estimates (5.51) hold too. Consequently, if we assume estimates (5.70) to be satisfied, then the requirements of theorem 5.2 provide for the fulfilment of the conditions of lemma 5.1. Thus, if (5.70) is fulfilled, the estimates (5.9) and (5.52) hold.
Taking this into account we have
$$|(f'_n, r_j)| = |(e_{n-1} + \ldots + e_{j+1}, r_j)| \le C\|r_j\|^2 \|r_{j+1}\|, \quad 0 \le j \le n - 2.$$

Now, in a way analogous to that used in establishing (5.56), we can show that
$$|(f'_n, r_j)| \le \lambda_\xi \|f'_n\| \, \|r_j\|, \quad \lambda_\xi \to 0, \quad 0 \le j \le n - 1. \tag{5.71}$$
Let us demonstrate now that if (5.51) holds, then the system $r_0, \ldots, r_{n-1}$ is linearly independent. Note first of all that, due to estimates (5.9), it follows from (5.51) that
$$\|r_i\| \, \|r_{i+1}\| \le \lambda_\xi \|r_j\|, \quad 0 \le i \le j \le n - 1. \tag{5.72}$$

Making use of (5.72) and (5.47), estimates (5.52) can take the following form:
$$|(r_i, e_j)| \le \lambda_\xi \|r_i\| \, \|r_j\| \le \lambda_\xi \|r_i\| \, \|e_j\|, \quad \lambda_\xi \to 0, \quad 0 \le i \ne j \le n - 1. \tag{5.73}$$

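We denote $\bar r_i = r_i / \|r_i\|$ and let

$$\|\varphi\| = \min_{\sum_{i=0}^{n-1} |\beta_i| = 1} \Big\| \sum_{i=0}^{n-1} \beta_i \bar r_i \Big\| = \Big\| \sum_{i=0}^{n-1} \bar\beta_i \bar r_i \Big\|.$$

Then

$$|(\varphi, e_j)| \ge |\bar\beta_j (\bar r_j, e_j)| - \Big| \sum_{i \ne j} \bar\beta_i (\bar r_i, e_j) \Big|. \tag{5.74}$$

Since $\sum_{i=0}^{n-1} |\bar\beta_i| = 1$, we have $|\bar\beta_j| \ge \beta > 0$ at least for one index $j$, $0 \le j \le n - 1$. Then by (5.18) and (5.47) we have
$$|\bar\beta_j (\bar r_j, e_j)| \ge C\|r_j\| \ge C\|e_j\|.$$
With $i \ne j$, making use of (5.73), we have
$$|\bar\beta_i (\bar r_i, e_j)| \le \lambda_\xi |\bar\beta_i| \, \|\bar r_i\| \, \|e_j\| = \lambda_\xi |\bar\beta_i| \, \|e_j\|, \quad \lambda_\xi \to 0 \text{ as } \xi \to \infty.$$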


Using the inequalities obtained in (5.74), we have for sufficiently large $\xi$ that $|(\varphi, e_j)| \ge C\|e_j\|$, i.e.
$$\|\varphi\| \ge C. \tag{5.75}$$
Hence it follows that the system of vectors $r_0, \ldots, r_{n-1}$ is linearly independent. Besides, using (5.75) it is easy to establish that the
f o l l o w i n g s t a t e m e n t h o l d s : if • • •» ^ n - i i s a s y s t e m b i o r t h o g o n a l
t o r 0 , . . ., r n _ lt t h e n w i t h s u f f i c i e n t l y l a r g e £ w e h a v e
Ihllll^lKC, 1. (5.76)
Finally, it can be ascertained that under conditions (5.71) and (5.76) the system of vectors $f'_n, r_0, \ldots, r_{n-1}$ is also linearly independent. Indeed, suppose that
n -l n - 1
i'n = 2 Y i ' t ’i = 2 1 ( / n .
i= 0 i=u
Then by (5.71) and (5.76) we obtain
$$\|f'_n\| \le \lambda_\xi \|f'_n\|.$$
Since $\lambda_\xi \to 0$, the last inequality cannot be satisfied for sufficiently large $\xi$. Hence it follows that the system $f'_n, r_0, \ldots, r_{n-1}$ is linearly independent.
Thus, having assumed that estimate (5.68) does not hold for any $\xi \ge \xi_0$ (where $\xi_0$ is a certain sufficiently large number), we have proved that a system of $n + 1$ vectors $f'_n, r_0, \ldots, r_{n-1}$ in space $E^n$ is linearly independent. However, this is impossible. Thus the initial assumption was wrong, i.e. estimate (5.68) holds.
The theorem is proved.

Discussion of Results
Thus we have made it clear that all of the methods studied in Sec. 4 can be applied for minimizing nonquadratic functions and that the convergence of the processes can be guaranteed for a class of functions that can be minimized by gradient methods. In the case where methods of conjugate directions are used for minimization of strongly convex functions, the rate of convergence proves not slower than superlinear.
The rate of convergence of methods of conjugate directions was established in a somewhat different manner than that of methods of other classes studied in the preceding sections: the sequence considered was $\{x_{\xi n}\}$ rather than $\{x_k\}$, i.e. actually we considered as one iteration a unified group of $n$ usual iterations of the process, from $x_{\xi n}$ to $x_{(\xi+1)n}$. Speaking generally, the real rate of convergence of such processes may prove slower than of methods of dual


directions (Sec. 3), and all the more so than of Newton's method (Sec. 2) (i.e. the decrease of the function value at each iteration $|f_{k+1} - f_k|$ in methods of the class under consideration may prove less than in methods of Secs. 2, 3, and the ratio $\|x_{k+1} - x_*\| / \|x_k - x_*\|$ greater). Thus, if for instance, in some algorithm we have

$$x_{k+1} - x_k = -D_k f'_k \tag{5.77}$$

and in a method of conjugate directions

*£<&+l)n % In ^£n/|n

a n d D h = D S n — >> f k , t h e n t h i s m e a n s t h a t n i t e r a t i o n s o f t h e m e t h ­
o d of c o n j u g a t e directions are equivalent, as to their conve rge nce ,
to only one iteration of process (5.77). Nevertheless, the rate of convergence of the methods of the class under consideration is in practice rather fast and exceeds by far that of gradient methods.
At the same time, as mentioned in Sec. 4, the methods of conjugate directions differ but slightly from the gradient methods as to the labour per iteration.
The foregoing makes it possible to conclude that the methods of conjugate directions are among the most effective for solving minimization problems.
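The comparison with gradient methods can be illustrated on a quadratic function. The sketch below is not one of the algorithms (4.48)-(4.71) of the text; it assumes the classical linear conjugate gradient iteration (with the Fletcher-Reeves coefficient) and the gradient method with exact line search, applied to a hypothetical $3 \times 3$ positive definite matrix `A` and vector `b` chosen only for the example.

```python
import math

# Hypothetical data: f(x) = 0.5*(Ax, x) - (b, x) with A symmetric positive definite.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]


def matvec(mat, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in mat]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def grad(x):
    """Gradient f'(x) = Ax - b."""
    return [ai - bi for ai, bi in zip(matvec(A, x), b)]


def steepest_descent(x, tol=1e-10, max_iter=10_000):
    """Gradient method with exact line search; returns (x, iterations used)."""
    for k in range(max_iter):
        g = grad(x)
        if math.sqrt(dot(g, g)) < tol:
            return x, k
        alpha = dot(g, g) / dot(g, matvec(A, g))    # exact minimiser along -g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x, max_iter


def conjugate_gradient(x, tol=1e-10, max_iter=10_000):
    """Linear conjugate gradients; returns (x, iterations used)."""
    g = grad(x)
    p = [-gi for gi in g]
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            return x, k
        alpha = -dot(g, p) / dot(p, matvec(A, p))   # exact minimiser along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = grad(x)
        beta = dot(g_new, g_new) / dot(g, g)        # Fletcher-Reeves coefficient
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return x, max_iter


x0 = [0.0, 0.0, 0.0]
x_sd, sd_iters = steepest_descent(x0)
x_cg, cg_iters = conjugate_gradient(x0)
print(sd_iters, cg_iters)
```

On a quadratic in $n$ variables the conjugate direction iteration stops after at most $n$ steps (up to rounding), while the gradient method only converges linearly, so `sd_iters` is expected to be noticeably larger than `cg_iters`; each step of both loops costs one gradient evaluation and one matrix-vector product.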
In this section we limited ourselves to the study of several concrete algorithms constructed in Sec. 4, though we could also have studied the properties of other algorithms that can be constructed according to the general scheme discussed in Sec. 4. However, the technique of studying other algorithms would not differ considerably from that used in Sec. 5. Indeed, the difference in the technique of proving theorem 5.1 amounts only to somewhat different ways of investigating the properties of matrix $H_k$. But in any method of the class under consideration, vectors $u_k$ and $v_k$ used for constructing $H_{k+1}$ can be but various combinations of vectors $r_k$ and $H_k e_k$ (see (4.32)), and the algorithms discussed in Secs. 4, 5 were chosen so as to use in constructing matrices $H_k$ various combinations of these elements.
Using the results obtained, we now compare the properties of different algorithms in the minimization of nonquadratic functions.
The results of theorem 5.2 (estimate (5.68)) show that the rate of convergence of sequence $\{x_{\xi n}\}$ depends considerably on the properties of matrix $H_{\xi n}$. If, as $\xi \to \infty$,
H ln (5.78)

and the rate of convergence increases. This fact is of the greatest practical interest for algorithms having the property that in minimizing a quadratic function we have
$$H_n = A^{-1}. \tag{5.79}$$
Algorithms (4.48), (4.49), (4.52) belong to methods of this group. If in implementing one of such algorithms condition (5.78) is fulfilled, then, by the above considerations, it is expedient not to restore matrix $H_k$.
In method (4.70), property (5.79) is not fulfilled; therefore the variant of this method without restoration gives no advantage (as regards the rate of convergence) over the variant with restoration. The same can be said also of other algorithms that have the property that in minimizing a quadratic function we have $H_n = H_0$ (for instance, methods (4.69), (4.71)) or $H_n$ is close to $H_0$ (just this is the case of matrix $H_n$ in method (4.70): its effect on the system of linearly independent vectors $e_0, \ldots, e_{n-1}$ is the same as that of matrix $H_0$, except on the vector $e_{n-1}$). Therefore it is not worthwhile to consider variants of such methods without restoration of matrix $H_k$.
However, the rate of convergence of methods (4.70), (4.71) will increase if we use, instead of the fixed matrix $H_0$, a sequence of positive definite matrices $H_{\xi 0}$ which satisfy the condition

$$H_{\xi 0} \to (f''(x_*))^{-1}. \tag{5.80}$$


In cases where the requirements of lemma 5.1 are met, matrices $H_{\xi 0}$ which satisfy condition (5.80) can be constructed using vectors $r_{\xi n}, r_{\xi n+1}, \ldots, r_{\xi n+n-1}$ by the formula
$$H_{\xi+1,0} = \sum_{i=0}^{n-1} \frac{r_{\xi n+i} \, r_{\xi n+i}^{T}}{(r_{\xi n+i}, e_{\xi n+i})}.$$
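For a quadratic function the sum of rank-one terms $r_i r_i^T / (r_i, e_i)$ over $n$ mutually conjugate steps reproduces $A^{-1}$ exactly, which is the motivation for using it as the restored matrix. The following sketch checks this numerically. It assumes a hypothetical $3 \times 3$ matrix `A`, generates the steps $r_i$ by the classical linear conjugate gradient iteration (an assumption made for the example, not one of the methods of the text), and uses $e_i = f'_{i+1} - f'_i$.

```python
# Hypothetical data: f(x) = 0.5*(Ax, x) - (b, x) with A symmetric positive definite.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
n = len(b)


def matvec(mat, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in mat]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def grad(x):
    return [ai - bi for ai, bi in zip(matvec(A, x), b)]


def conjugate_steps(x):
    """Run n conjugate gradient steps; return the steps r_i and gradient changes e_i."""
    g = grad(x)
    p = [-gi for gi in g]
    rs, es = [], []
    for _ in range(n):
        alpha = -dot(g, p) / dot(p, matvec(A, p))    # exact line search
        r = [alpha * pi for pi in p]                 # r_i = x_{i+1} - x_i
        x = [xi + ri for xi, ri in zip(x, r)]
        g_new = grad(x)
        rs.append(r)
        es.append([gn - go for gn, go in zip(g_new, g)])   # e_i = f'_{i+1} - f'_i
        beta = dot(g_new, g_new) / dot(g, g)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return rs, es


def restored_matrix(rs, es):
    """H = sum_i r_i r_i^T / (r_i, e_i)."""
    H = [[0.0] * n for _ in range(n)]
    for r, e in zip(rs, es):
        scale = 1.0 / dot(r, e)
        for i in range(n):
            for j in range(n):
                H[i][j] += scale * r[i] * r[j]
    return H


rs, es = conjugate_steps([0.0] * n)
H = restored_matrix(rs, es)
# Compare H*A with the identity matrix.
HA = [[dot(H[i], [A[k][j] for k in range(n)]) for j in range(n)] for i in range(n)]
max_err = max(abs(HA[i][j] - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))
print(max_err)
```

For a nonquadratic function the same sum is only an approximation to the inverse Hessian at the minimum, which is why the text requires the conditions of lemma 5.1.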
In the light of the above considerations, the most effective methods of the class of methods of conjugate directions, from the viewpoint of the rate of convergence in the minimization of strictly convex functions, should be methods that have property (5.79).
In practice, deviations from this conclusion can of course occur in the sense that, for example, using method (4.70) the solution of the minimization problem (to a given accuracy) can be obtained after a smaller number of iterations as compared to method (4.48), say. The fact, as we stressed many times, is that the rate of convergence of any method is affected by many additional factors, for instance, by errors in calculations and by the choice of the values of $\alpha_k$ being made not precisely enough, and the sensitivity of the methods to perturbations is different. Besides, the comparison of the rates of convergence makes sense only in a sufficiently small region about the


minimum, and at a point distant from it one can compare the effectiveness of different algorithms only on the ground of numerical experiments.
Many works published up to the present time (J.D. Pearson, J. Greenstadt, B.T. Polyak [2], H.G. Huang and A.V. Levy) contain results of numerical solution of various problems by methods of conjugate directions. The most comprehensive comparative analysis of the effectiveness of different algorithms is given in the last of the works named. On the whole, the results of numerical experiments confirm the conclusion that the most effective methods are those for which condition (5.79) is fulfilled. At the same time, method (4.71) proves more effective in the case where the matrix is restored after $n$ iterations (as compared with a process without restoration). It seems that in practice method (4.70) should also be used with restoration of matrix $H_k$.
Finally, we dwell on problems that are involved in the choice of the step length in methods of the class under consideration. As was already mentioned, in methods of conjugate directions the step length is chosen under the condition that the minimum of the function is in the direction of motion. It was stressed many times that the main shortcoming of such a procedure is the necessity of performing a considerable number of calculations of function values, which makes the computational effort very considerable in problems in which the function evaluation requires much time. In some cases the selected method of choosing the step length is not practically suited if, for instance, the value of parameter $\alpha_k$ changes greatly at every step. This shortcoming of the method of conjugate directions was stressed in many works (C.G. Broyden [2], B.N. Pshenichny [3], W.C. Davidon [2], M.J.D. Powell [1], R. Fletcher [1], and others). To avoid the above shortcoming, these works consider methods in which the value $\alpha_k$ is chosen so that it guarantees only a certain degree of decrease of the function. However, in other respects the construction of these methods is based on the same ideas which were described above (the work of B.N. Pshenichny [3] excepted).
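A step length that guarantees only a certain degree of decrease can be chosen, for example, by repeatedly reducing a trial value. The sketch below assumes an Armijo-type sufficient decrease test, which is a common generic rule and not necessarily the one used in the works cited; the function name `backtracking_step`, its parameters `alpha0`, `shrink`, `c`, and the test function are all hypothetical.

```python
def backtracking_step(f, x, fx, g, p, alpha0=1.0, shrink=0.5, c=1e-4, max_trials=50):
    """Return alpha with f(x + alpha*p) <= f(x) + c*alpha*(g, p).

    f  : function to minimise, fx : f(x), g : gradient at x,
    p  : descent direction, i.e. (g, p) < 0.
    """
    slope = sum(gi * pi for gi, pi in zip(g, p))
    if slope >= 0.0:
        raise ValueError("p is not a descent direction")
    alpha = alpha0
    for _ in range(max_trials):
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        if f(trial) <= fx + c * alpha * slope:      # sufficient decrease reached
            return alpha
        alpha *= shrink                             # otherwise reduce the step
    raise RuntimeError("no acceptable step length found")


# Hypothetical nonquadratic test function (Rosenbrock) and its gradient.
def f(x):
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2


def grad(x):
    return [-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
            200.0 * (x[1] - x[0] ** 2)]


x = [-1.2, 1.0]
g = grad(x)
p = [-gi for gi in g]                               # direction of steepest descent
alpha = backtracking_step(f, x, f(x), g, p)
x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
print(alpha, f(x), f(x_new))
```

Each trial costs one function evaluation and no minimization along the direction is attempted, which is the point of such rules; the price, as the text goes on to note, is that the theoretical study of the resulting methods becomes more difficult.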
The study of the properties of methods in which the choice of the step length is not connected with finding the function minimum along the direction of motion becomes much more difficult, and the theoretical substantiation of many of them has not been performed even in the case of the minimization of a quadratic function.
From the viewpoint of the method of choosing the $\alpha_k$ value, methods of dual directions, Sec. 3, are preferable. The rate of convergence of these methods also proves faster. However, methods of dual directions require a greater storage capacity of the computer (as noted in Sec. 3, two $n \times n$ matrices must be stored); therefore, using them, one can solve only minimization problems of smaller size. One can, though, use a smaller storage capacity of the computer by


choosing in methods of dual directions vectors $r_k$ along the coordinate axes; however, in this case it is necessary to calculate the derivative twice at every iteration, and this increases the amount of work required.

6. METHODS WITHOUT CALCULATING DERIVATIVES
Introductory Remarks
Until now we have described minimization methods in which it was necessary at each iteration to calculate, besides the function $f(x)$, its gradient $f'(x)$ (methods of Secs. 1, 3, 4, 5) and, in Newton's method (Sec. 2), also the matrix of second derivatives $f''(x)$. We have stressed many times that the calculation of second derivatives is often the most complicated and laborious part of the construction of the iterative process, and the methods of Secs. 3-5 were worked out precisely with the aim of avoiding the calculation of second derivatives. However, in many problems the calculation of the gradient can also prove considerably more complicated than the evaluation of the function (in some cases it is impossible to obtain an analytical expression of $f'(x)$ at all). In such cases it is desirable to use methods which require only the function evaluation.
The calculation of a gradient by an analytical formula can be replaced by an approximate one, for instance, by using the finite-difference approximation to the partial derivatives. In this way one can construct modifications of the methods discussed in the preceding sections which involve only function evaluation. If we require a definite degree of accuracy of the approximation and impose certain additional requirements on the construction of the iterative process, then in most cases we can ensure that the properties of such modified methods (convergence, rate of convergence) approximate the properties of the original algorithms in which $f'(x)$, $f''(x)$ are evaluated by analytical expressions.
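As an illustration of the finite-difference idea just mentioned, the following minimal Python sketch approximates a gradient from $n + 1$ function values. The function name `forward_difference_gradient`, the sample function and the step value are assumptions made for this example only; they do not come from the text.

```python
from typing import Callable, List

Vector = List[float]


def forward_difference_gradient(
    f: Callable[[Vector], float], x: Vector, step: float
) -> Vector:
    """Approximate the gradient of f at x using n + 1 function values."""
    if step == 0.0:
        raise ValueError("step must be nonzero")
    f_x = f(x)
    gradient = []
    for j in range(len(x)):
        shifted = list(x)
        shifted[j] += step          # move along the j-th coordinate axis
        gradient.append((f(shifted) - f_x) / step)
    return gradient


def sample_function(x: Vector) -> float:
    """Illustrative smooth convex function (an assumption, not from the text)."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 5.0 * x[1] ** 2


approx = forward_difference_gradient(sample_function, [1.0, 2.0], 1e-6)
```

For this sample function the analytical gradient at $(1, 2)$ is $(8, 23)$, and the approximation differs from it by a quantity of the order of the step.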
The study of methods without calculation of gradients is interesting also in another respect. In determining the accuracy of approximating the derivatives with which the properties of such algorithms coincide with those of the original methods, we find in fact the allowable calculation errors that do not lead to violations of the properties of the algorithms (with the calculation of $f'(x)$, $f''(x)$).
In this section we study only those algorithms whose construction is based on the methods of dual directions (Sec. 3); for this reason we retain the same name for them. Besides, we shall dwell on algorithms of another type in which the idea of the construction of conjugate directions is realized but without the calculation of the gradient or its approximation by finite differences.

Constructing Methods of Dual Directions
In these methods, successive approximations to the solution are constructed by the formula
$$x_{k+1} = x_k - \alpha_k D_k^{-1} g_k \tag{6.1}$$
where $D_k$ is an $n \times n$ matrix and $g_k$ is a vector. The scalar factor $\alpha_k$, which determines the step length, can, as distinct from the methods discussed before, take positive as well as negative values; this depends on which of the directions $-D_k^{-1} g_k$ or $D_k^{-1} g_k$ is the direction of descent of the function $f(x)$.
One can also use another approach and assume that $\alpha_k \ge 0$, but then the direction of motion should be either $p_k = -D_k^{-1} g_k$ or $p_k = D_k^{-1} g_k$, chosen so that the following condition holds:
$$(f'_k, p_k) < 0. \tag{6.2}$$
We assume, as we did in Sec. 3, that $f(x)$ is a twice continuously differentiable strongly convex function.
Constructing matrix $D_k$ and vector $g_k$. Let us determine vectors $\theta_k$ and $\varphi_k$:
$$\theta_k = \left( \frac{f(x_k + \mu_k v_1) - f(x_k)}{\mu_k}, \ldots, \frac{f(x_k + \mu_k v_n) - f(x_k)}{\mu_k} \right),$$
$$\varphi_k = \left( \frac{f(y_k + \mu_k v_1) - f(y_k)}{\mu_k}, \ldots, \frac{f(y_k + \mu_k v_n) - f(y_k)}{\mu_k} \right),$$
$$\psi_k = \varphi_k - \theta_k$$
where $0 < |\mu_k| \le \|r_k\|^t$, $t > 1$, $y_k$, $r_k$ are elements of sequence (3.5), and $v_i$ is the unit vector of the corresponding axis.
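The construction of $\psi_k$ can be sketched in Python as follows. This is only an illustration under stated assumptions: the point $y_k$ is taken as $x_k + r_k$ (sequence (3.5) is not reproduced here, so this relation is an assumption), and the names `difference_vector`, `psi_vector`, `A`, `b` and `quadratic` are hypothetical. For a quadratic function the difference error is the same at both points and cancels, so $\psi$ coincides with $A r$.

```python
from typing import Callable, List

Vector = List[float]


def difference_vector(f: Callable[[Vector], float], point: Vector, mu: float) -> Vector:
    """Forward-difference vector built from f at `point` with increment mu."""
    f_point = f(point)
    result = []
    for j in range(len(point)):
        shifted = list(point)
        shifted[j] += mu
        result.append((f(shifted) - f_point) / mu)
    return result


def psi_vector(f: Callable[[Vector], float], x: Vector, r: Vector, mu: float) -> Vector:
    """psi = phi - theta, with theta taken at x and phi at y = x + r (assumed)."""
    y = [xi + ri for xi, ri in zip(x, r)]
    theta = difference_vector(f, x, mu)
    phi = difference_vector(f, y, mu)
    return [p - t for p, t in zip(phi, theta)]


A = [[4.0, 1.0], [1.0, 3.0]]      # illustrative symmetric positive definite matrix
b = [-1.0, 2.0]


def quadratic(x: Vector) -> float:
    """f(x) = 1/2 (Ax, x) + (b, x)."""
    ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return 0.5 * sum(ax[i] * x[i] for i in range(2)) + sum(b[i] * x[i] for i in range(2))


# mu = 0.01 respects |mu| <= ||r||^t for t = 2, since ||r||^2 = 0.05.
psi = psi_vector(quadratic, [0.5, -1.0], [0.2, 0.1], 0.01)
```

Here $A r = (0.9, 0.5)$, and `psi` agrees with it up to rounding.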
Lemma 6.1. Let $\{x_k\}$ be a bounded sequence, $\|x_{k+1} - x_k\| \to 0$ as $k \to \infty$, and let matrix $D_k$ with any $k \ge n - 1$ be defined by the following system of equations:
$$D_k r_{k-i} = \psi_{k-i}, \quad i = 0, 1, \ldots, n - 1 \tag{6.3}$$
where $r_{k-i}$ are elements of sequence (3.5). Then
$$\lim_{k \to \infty} \|D_k - f''_k\| = 0.$$
The proof of this lemma coincides in its essential features with that of lemma 3.1. We shall consider only the differences that arise.
The components of vectors $\theta_k$ and $\varphi_k$ can be written thus:
$$\theta_k^j = \left. \frac{\partial f}{\partial x^j} \right|_{x = x_k + \tau_j \mu_k v_j}, \qquad \varphi_k^j = \left. \frac{\partial f}{\partial x^j} \right|_{x = y_k + \sigma_j \mu_k v_j}, \qquad j = 1, \ldots, n,$$
with some $\tau_j$, $\sigma_j$ between 0 and 1 (by the mean value theorem).

Taking into account this and the continuity of second derivatives of the function, it is easy to ascertain that the following estimates hold:
$$\|\theta_k - f'(x_k)\| \le C_1 |\mu_k| n^{1/2} \le C_2 \|r_k\|^t, \tag{6.4}$$
$$\|\varphi_k - f'(y_k)\| \le C_3 |\mu_k| n^{1/2} \le C_4 \|r_k\|^t \tag{6.5}$$
where $C_2, C_4 < \infty$.
Let us write vector $\psi_{k-i}$ in the form
$$\psi_{k-i} = f'(y_{k-i}) - f'(x_{k-i}) + (\varphi_{k-i} - f'(y_{k-i})) - (\theta_{k-i} - f'(x_{k-i}));$$
then, denoting as before $e_{k-i} = f'(y_{k-i}) - f'(x_{k-i})$, we obtain
$$D_k r_{k-i} = e_{k-i} + (\varphi_{k-i} - f'(y_{k-i})) - (\theta_{k-i} - f'(x_{k-i})), \quad i = 0, 1, \ldots, n - 1. \tag{6.6}$$
Let us take $B_k = D_k - f''(x_k)$. Proceeding as in proving lemma 3.1, we obtain the following estimate:
$$\|B_k r_{k-i}\| \le h_{k-i} \|r_{k-i}\| + \|\theta_{k-i} - f'(x_{k-i})\| + \|\varphi_{k-i} - f'(y_{k-i})\|$$
where $h_{k-i} \to 0$ as $k \to \infty$. Whence, due to (6.4) and (6.5), we have
$$\|B_k r_{k-i}\| \le h_{k-i} \|r_{k-i}\| + C_5 \|r_{k-i}\|^t, \quad C_5 < \infty$$
or
$$\|B_k r_{k-i}\| \le \bar h_{k-i} \|r_{k-i}\|$$
where $\bar h_{k-i} = h_{k-i} + C_5 \|r_{k-i}\|^{t-1} \to 0$ as $k \to \infty$.
The remaining part of the proof repeats the argument of lemma 3.1.
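System (6.3) determines $D_k$ from $n$ pairs of vectors. The sketch below shows one straightforward way of recovering such a matrix by direct elimination; it is an illustration only (the helper names `solve_rows` and `matrix_from_pairs` are hypothetical, and the book itself works with dual bases rather than with an explicit linear solve).

```python
from typing import List

Matrix = List[List[float]]


def solve_rows(coeffs: Matrix, rhs: Matrix) -> Matrix:
    """Solve coeffs * X = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(coeffs)
    aug = [list(coeffs[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("vectors r are (nearly) linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * c for a, c in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def matrix_from_pairs(r_vectors: Matrix, psi_vectors: Matrix) -> Matrix:
    """Return D with D r_i = psi_i for every pair.

    Row i of the system reads r_i^T D^T = psi_i^T, so we solve for D^T
    and transpose the result.
    """
    d_transposed = solve_rows(r_vectors, psi_vectors)
    n = len(d_transposed)
    return [[d_transposed[j][i] for j in range(n)] for i in range(n)]


# Pairs generated by the illustrative matrix [[4, 1], [1, 3]]: psi_i = A r_i.
D = matrix_from_pairs([[1.0, 0.0], [1.0, 1.0]], [[4.0, 1.0], [5.0, 4.0]])
```

With exact pairs $\psi_i = A r_i$ the recovered matrix equals $A$, which is the situation of a quadratic function.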
Let us now determine vector $g_k$:
$$g_k = \left( \frac{f(x_k + \rho_k v_1) - f(x_k)}{\rho_k}, \ldots, \frac{f(x_k + \rho_k v_n) - f(x_k)}{\rho_k} \right) \tag{6.7}$$
where $|\rho_k|$ is bounded above (if $\rho_k = \mu_k$, then $g_k = \theta_k$).


It is clear that the convergence and the rate of convergence of sequence (6.1) depend not only on the value of matrix $D_k$ but also on how closely vector $g_k$ approximates the gradient $f'(x_k)$. It will become clear from what follows that, in order to guarantee a fast rate of convergence of sequence (6.1) to the solution, it is required that with any $k$ the inequalities
$$0 < |\rho_k| \le \xi_k \|p_k\| \tag{6.8}$$
where $\xi_k \to 0$ in an arbitrary manner as $k \to \infty$, be satisfied.
If at a certain iteration the chosen value of $\rho_k$ does not satisfy conditions (6.8), it becomes necessary to take a smaller $\rho_k$, calculate a new vector $g_k$, then calculate a new vector $p_k$ and check once more whether (6.8) is satisfied or not. Since $g_k \to f'_k$ as $|\rho_k| \to 0$ and at the same time $\|p_k\| \to \|D_k^{-1} f'_k\|$, and $\|D_k^{-1} f'_k\| > 0$ with any $x_k \ne x^*$ (matrix $D_k^{-1}$ is nonsingular, being the inverse of matrix $D_k$; on the calculation of $D_k^{-1}$ see the subsection on p. 111, calculation of vector $p_k$), conditions (6.8) are satisfied with sufficiently small values of $\rho_k$.
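The repeated reduction of $\rho_k$ described above can be sketched as a simple loop. The halving rule, the function name `choose_increment` and the callable `apply_inverse` (standing for multiplication by $D_k^{-1}$) are assumptions of this example; the text only requires that a smaller $\rho_k$ be taken.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def choose_increment(
    f: Callable[[Vector], float],
    x: Vector,
    apply_inverse: Callable[[Vector], Vector],
    rho: float,
    xi: float,
    max_halvings: int = 60,
) -> Tuple[float, Vector, Vector]:
    """Shrink rho until 0 < |rho| <= xi * ||p||, recomputing g and p each time."""
    for _ in range(max_halvings):
        f_x = f(x)
        g = []
        for j in range(len(x)):
            shifted = list(x)
            shifted[j] += rho
            g.append((f(shifted) - f_x) / rho)          # vector (6.7)
        p = [-value for value in apply_inverse(g)]      # p = -D^{-1} g
        p_norm = math.sqrt(sum(value * value for value in p))
        if 0.0 < abs(rho) <= xi * p_norm:               # condition (6.8)
            return rho, g, p
        rho *= 0.5
    raise RuntimeError("condition (6.8) not reached; x may already be a minimum")


def sum_of_squares(x: Vector) -> float:
    return sum(value * value for value in x)


rho, g, p = choose_increment(
    sum_of_squares, [1.0, 1.0], lambda v: [0.5 * value for value in v], 1.0, 0.01
)
```

Every reduction costs $n + 1$ further function evaluations, which is why a good initial guess for $\rho_k$ matters.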
Determining the direction of motion. This is done as follows. Setting a certain value of $\gamma_0$ (naturally this should be chosen sufficiently small), $f(x)$ is evaluated at points $x_k \pm \gamma_0 D_k^{-1} g_k$. If at one of these points the function value is less than $f(x_k)$, then the corresponding vector ($-D_k^{-1} g_k$ or $D_k^{-1} g_k$) is taken as $p_k$ (condition (6.2) is satisfied since $f(x)$ is convex). However, if both function values are greater than $f(x_k)$, we reduce $\gamma_0$ until one of the function values becomes less than $f(x_k)$, and the corresponding vector is taken as $p_k$.
However, it can occur that with small values of $\gamma$ the function does not decrease in either of the directions $\pm D_k^{-1} g_k$. This can mean that either we have not reached values of $\gamma$ with which the function decreases or the condition $(f'_k, D_k^{-1} g_k) = 0$ is satisfied (it will be seen from what follows that such a case is possible only at the initial stage of the process, and then, obviously, neither of the vectors $\pm D_k^{-1} g_k$ can be chosen as $p_k$). In order to exclude such an occurrence it is necessary to calculate a new vector $g_{k,1}$, having changed $\rho_k$ (but so that conditions (6.8) be satisfied), calculate a new vector $D_k^{-1} g_{k,1}$ and, from a certain $\gamma \le \gamma_0$ on, evaluate the function at points $x_k \pm \gamma D_k^{-1} g_{k,1}$ as well. If $x_k \ne x^*$, then one of the directions $\pm D_k^{-1} g_k$ or $\pm D_k^{-1} g_{k,1}$ is, of necessity, the direction of descent. The corresponding vector is then taken as $p_k$.
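A minimal sketch of this sign test is given below. It covers only the basic case (trial steps along $\pm D_k^{-1} g_k$ with a halved $\gamma$); the fallback with a recomputed vector $g_{k,1}$ is left to the caller. The name `choose_direction`, the halving rule and the limit on the number of reductions are assumptions of the example.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def choose_direction(
    f: Callable[[Vector], float],
    x: Vector,
    q: Vector,
    gamma: float,
    max_halvings: int = 40,
) -> Optional[Tuple[Vector, float]]:
    """Pick p = -q or p = +q (q stands for D^{-1} g) by trial function values.

    Returns (p, gamma) for the first trial step that lowers f, or None when
    neither sign gives a decrease within max_halvings reductions of gamma.
    """
    f_x = f(x)
    for _ in range(max_halvings):
        for sign in (-1.0, 1.0):
            trial = [xi + sign * gamma * qi for xi, qi in zip(x, q)]
            if f(trial) < f_x:
                return [sign * qi for qi in q], gamma
        gamma *= 0.5
    return None


def sum_of_squares(x: Vector) -> float:
    return sum(value * value for value in x)


result = choose_direction(sum_of_squares, [1.0, 1.0], [1.0, 1.0], 4.0)
```

A returned `None` corresponds to the situation in which a new vector $g_{k,1}$ has to be calculated.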
The algorithm of choosing the step. Let us choose $\alpha_k$ in the following way: suppose that
$$\bar\alpha_k = \min\left\{ 1,\ R\, \frac{|(g_k, p_k)|}{\|p_k\|^3} \right\} \tag{6.9}$$
where $0 < R < \infty$, and check the validity of the inequality
$$f(x) - f(x_k) \le \varepsilon \alpha^2 \beta_k (g_k, p_k) \tag{6.10}$$
where $x = x_k + \alpha p_k$, $\beta_k = -\operatorname{sgn}(g_k, p_k)$, $0 < \varepsilon < \frac{1}{2}$.
If (6.10) holds with $\alpha = \bar\alpha_k$, then the value $\bar\alpha_k$ is taken as the required one, and if not, we reduce $\alpha$ until (6.10) is satisfied; the value of $\alpha_k$ thus obtained is taken to be the one sought.
This method of choosing $\alpha_k$ presupposes, of course, that $(g_k, p_k) \ne 0$. If at a certain iteration we find that $(g_k, p_k) = 0$ (this can occur only at the initial stage of the process), then it is necessary to reduce $\rho_k$ and calculate vector $g_k$ anew.
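The step rule can be sketched as a backtracking loop. The sketch rests on two loudly stated assumptions: that the cap (6.9) has the form $\min\{1, R\,|(g_k, p_k)| / \|p_k\|^3\}$ and that the right-hand side of test (6.10) equals $-\varepsilon \alpha^2 |(g_k, p_k)|$. The halving rule and the name `choose_step` are likewise choices made for this example only.

```python
import math
from typing import Callable, List

Vector = List[float]


def choose_step(
    f: Callable[[Vector], float],
    x: Vector,
    g: Vector,
    p: Vector,
    cap_factor: float,
    eps: float,
    max_halvings: int = 60,
) -> float:
    """Backtrack from the capped trial step until the decrease test passes."""
    gp = sum(gi * pi for gi, pi in zip(g, p))
    if gp == 0.0:
        raise ValueError("(g, p) = 0: reduce the increment and recompute g")
    p_norm = math.sqrt(sum(pi * pi for pi in p))
    alpha = min(1.0, cap_factor * abs(gp) / p_norm ** 3)   # assumed form of (6.9)
    f_x = f(x)
    for _ in range(max_halvings):
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        # Assumed form of (6.10): right-hand side is -eps * alpha^2 * |(g, p)|.
        if f(trial) - f_x <= -eps * alpha ** 2 * abs(gp):
            return alpha
        alpha *= 0.5
    raise RuntimeError("no acceptable step found; p may not be a descent direction")


def sum_of_squares(x: Vector) -> float:
    return sum(value * value for value in x)


full_step = choose_step(sum_of_squares, [1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], 10.0, 0.25)
```

In this example $p$ is the exact Newton direction, so the full step $\alpha = 1$ is accepted at once; an overlong direction such as $(-4, -4)$ triggers the reduction.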

We now study the properties of sequence (6.1) when matrix $D_k$, vector $g_k$ and parameter $\alpha_k$ are constructed by the method described above.
Theorem 6.1. If $f(x)$ is a twice continuously differentiable function that satisfies conditions (2.4), matrix $D_k$ with any $k \ge n - 1$ is defined by system (6.3), vector $g_k$ is determined by expression (6.7) where $\rho_k$ satisfies conditions (6.8), and $\alpha_k$ is determined by the method described above, then for sequence (6.1) statements analogous to those proved in theorem 3.1 hold.
Proof. In order to take advantage of lemma 6.1, it is necessary first of all to show that under the conditions of the theorem, condition $\|x_{k+1} - x_k\| \to 0$ holds for sequence (6.1).
Expanding function $f(x)$ into Taylor's series up to the second-order terms about point $x_k$, we obtain:
$$f_{k+1} - f_k = \alpha_k \beta_k (g_k, p_k) \left[ \frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} + \frac{\alpha_k}{2} \frac{(f''_{kc} p_k, p_k)}{\beta_k (g_k, p_k)} \right]$$
where $x_{kc} = x_k + \theta (x_{k+1} - x_k)$, $0 < \theta \le 1$. Since $\beta_k (g_k, p_k) < 0$, inequality (6.10) is satisfied if
$$\frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} + \frac{\alpha_k}{2} \frac{(f''_{kc} p_k, p_k)}{\beta_k (g_k, p_k)} \ge \varepsilon \alpha_k$$
or, which is the same thing,
$$\frac{1}{\alpha_k} \frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} + \frac{1}{2} \frac{(f''_{kc} p_k, p_k)}{\beta_k (g_k, p_k)} \ge \varepsilon. \tag{6.11}$$
Due to (6.2) and the choice of $\beta_k$,
$$\frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} > 0.$$
Consequently, with a certain $\alpha_k > 0$ the inequality (6.11) is satisfied and, therefore, (6.10) as well. This proves the possibility of choosing $\alpha_k$ by the method described above.
Thus, by (6.10), $f_{k+1} < f_k$. This means that $x_k \in S = \{x : f(x) \le f(x_0)\}$ with any $k$, and since $f(x)$ has a lower bound, $f_k - f_{k+1} \to 0$. Hence, it follows from (6.10) that as $k \to \infty$
$$\alpha_k^2\, |(g_k, p_k)| \to 0. \tag{6.12}$$

Since $\alpha_k \le \bar\alpha_k$, it follows from (6.9) that
$$\alpha_k \|p_k\|^3 \le R\, |(g_k, p_k)|.$$
With account taken of the last inequality, condition (6.12) implies that $\|x_{k+1} - x_k\| = \alpha_k \|p_k\| \to 0$ as $k \to \infty$. Consequently, the conditions of the theorem provide for the satisfaction of the requirements of lemma 6.1 and therefore
$$\|D_k - f''_k\| \to 0. \tag{6.13}$$
We show now that under the conditions of the theorem, as $k \to \infty$,
$$\frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} \to 1. \tag{6.14}$$
We have
$$\frac{(f'_k, p_k)}{\beta_k (g_k, p_k)} = \frac{1}{\beta_k} + \frac{(f'_k - g_k, p_k)}{\beta_k (g_k, p_k)} \le \frac{1}{\beta_k} + \frac{\|f'_k - g_k\|\, \|p_k\|}{|(g_k, p_k)|}. \tag{6.15}$$
For vector $g_k$ the following estimate (analogous to (6.4)) holds:
$$\|g_k - f'_k\| \le C_6 |\rho_k| n^{1/2} = C_7 |\rho_k|. \tag{6.16}$$
Since
$$|(g_k, p_k)| = |(D_k p_k, p_k)| \tag{6.17}$$
it follows, because of conditions (6.13) and (2.4), that from a certain iteration on
$$|(g_k, p_k)| \ge m_1 \|p_k\|^2 \tag{6.18}$$
where $m_1 > 0$. Using estimates (6.8), (6.16) and (6.18), we find that from a certain iteration on the following conditions will be fulfilled:
$$\frac{\|f'_k - g_k\|\, \|p_k\|}{|(g_k, p_k)|} \le \frac{C_7 |\rho_k|\, \|p_k\|}{m_1 \|p_k\|^2} \le \frac{C_7}{m_1} \xi_k \to 0.$$
Hence, it follows from (6.15) that from a certain iteration on we have $\beta_k = +1$ (since the left-hand side of (6.15) is positive) and therefore condition (6.14) is really satisfied.
On set $S$ the gradient $f'(x)$ is bounded: $\|f'_k\| \le L$. Taking into account also that $|\rho_k| \le \rho < \infty$, we find using (6.16) that $\|g_k\| \le L_1$ with any $k$. By analogy with theorem 3.1 we can establish that with any $k \ge n - 1$ we have $\|D_k^{-1}\| \le M_2$. Consequently,
$$\|p_k\| \le \|D_k^{-1}\|\, \|g_k\| \le C_8.$$
Using this estimate and inequality (6.18), we establish that with sufficiently great $k$
$$\frac{|(g_k, p_k)|}{\|p_k\|^3} \ge \frac{m_1 \|p_k\|^2}{\|p_k\|^3} = \frac{m_1}{\|p_k\|} \ge C_9 > 0. \tag{6.19}$$
Hence, it follows from (6.9) that from a certain $k$ on we shall have
$$\bar\alpha_k \ge \bar\alpha > 0. \tag{6.20}$$

It can also be easily ascertained, using conditions (6.14) and (6.18), that with sufficiently great $k$ inequality (6.11) and, therefore, (6.10) too are satisfied with values $\alpha \ge \tilde\alpha > 0$. These inequalities together with estimate (6.20) show that from a certain $k$ on we find that
$$\alpha_k \ge C_{10} > 0.$$
Because of this estimate it follows from the condition $\alpha_k \|p_k\| \to 0$, the fulfilment of which was discussed above, that as $k \to \infty$
$$\|p_k\| \to 0. \tag{6.21}$$

Since $\|g_k\| = \|D_k p_k\| \le M_1 \|p_k\|$, provided (6.21) is satisfied, we have
$$\|g_k\| \to 0. \tag{6.22}$$
In accordance with conditions (6.8), (6.16), (6.21) and (6.22), we can assert that as $k \to \infty$
$$\|f'(x_k)\| \to 0.$$
But this means, due to inequality (1.12) which holds for strongly convex functions, that sequence (6.1) converges to the solution.
Let us obtain an estimate of the rate of convergence of the method. Due to conditions (6.17), (6.21) and the uniform continuity of second derivatives of the function on set $S$, as $k \to \infty$ we have
$$\frac{(f''_{kc} p_k, p_k)}{|(g_k, p_k)|} \to 1.$$
Using this condition and (6.14), it is easy to ascertain that inequality (6.11) and, therefore, (6.10) as well are satisfied with $\alpha_k = 1$ for sufficiently great $k$. From relations (6.19), with (6.21) fulfilled, it follows that
$$\frac{|(g_k, p_k)|}{\|p_k\|^3} \to \infty.$$
Therefore, in choosing $\bar\alpha_k$ according to condition (6.9), from a certain $k$ on we have $\bar\alpha_k = 1$.
The above remarks show that from a certain iteration on, $\alpha_k = 1$ and
$$x_{k+1} = x_k - D_k^{-1} g_k.$$
At the same time there is a matrix $\tilde D_k^{-1}$ such that
$$x_{k+1} = x_k - \tilde D_k^{-1} f'_k.$$

The sequence of matrices $\tilde D_k$ under conditions (6.8) and (6.16) can be chosen so that
$$\tilde D_k - D_k \to 0. \tag{6.23}$$
In order to obtain (6.23) we can, for instance, assume that
$$\tilde D_k = D_k - \frac{(f'_k - g_k)(x_{k+1} - x_k)^T}{\|x_{k+1} - x_k\|^2}.$$
It is now easy to prove that sequence (6.1) converges at a superlinear rate. We proceed as in theorem 2.1 and establish that the following inequality holds:
$$\|x_{k+1} - x^*\| \le \|\tilde D_k^{-1}\|\, \|\tilde D_k - f''_{kc}\|\, \|x_k - x^*\|.$$
Further, using conditions (6.13), (6.23) and the continuity of second derivatives, we ascertain that as $k \to \infty$, $\|\tilde D_k - f''_{kc}\| \to 0$ and the quantity $\|\tilde D_k^{-1}\|$ is bounded. Hence, as $k \to \infty$ we have
$$\|x_{k+1} - x^*\| \le \lambda_k \|x_k - x^*\| \tag{6.24}$$
where $\lambda_k \to 0$, and this proves that the rate of convergence of $\{x_k\}$ is superlinear.
The theorem is proved.

Remarks on the Implementation of Methods of Dual Directions
Various algorithms. The requirements which should be met by vectors $r_k$ used in constructing matrix $D_k$ are the same as those considered in constructing sequence (3.5). Therefore, all that was said in the subsection on p. 74 about the construction of various algorithms of type (3.4) holds for process (6.1).
Calculation of vector $p_k$. The results of the subsection on p. 76 are fully applicable here. Thus, basis $s_{k+1}, s_k, \ldots, s_{k-n+2}$, the dual of basis $\psi_{k+1}, \ldots, \psi_{k-n+2}$, is constructed by the following formulas (analogous to (3.21)):
$$\bar s_{k+1} = \frac{s_{k-n+1}}{(s_{k-n+1}, \psi_{k+1})}, \qquad \bar s_{k+1-j} = s_{k+1-j} - \bar s_{k+1}\, (s_{k+1-j}, \psi_{k+1}), \quad j = 1, \ldots, n - 1,$$
where the bar marks the vectors of the new basis and the unbarred vectors on the right-hand sides belong to the previous dual basis.
In this case, in order to check that vectors $\psi_{k+1}, \ldots, \psi_{k-n+2}$ are linearly independent, it suffices to calculate the scalar product $(s_{k-n+1}, \psi_{k+1})$. If $(s_{k-n+1}, \psi_{k+1}) \ne 0$, then vectors $\psi_{k+1}, \ldots, \psi_{k-n+2}$ are linearly independent. But if we find that $(s_{k-n+1}, \psi_{k+1}) = 0$, then it is necessary to change either vector $r_{k+1}$ or one of the vectors $\theta_{k+1}$, $\varphi_{k+1}$, thus changing vector $\psi_{k+1}$.

In practice, successive approximations should be constructed by the following formula (analogous to (3.25)):
$$x_{k+1} = x_k - \alpha_k \sum_{i=0}^{n-1} (s_{k-i}, g_k)\, r_{k-i}. \tag{6.25}$$

The initial stage of the process. There are several ways of performing the first iterations of the process (with $k < n - 1$). For instance, the descent can be realized in one of the directions $\beta_k g_k$, $\beta_k = \pm 1$, choosing the sign of $\beta_k$ so that $f(x)$ decreases.
In order to ensure uniformity of the iterative process (6.25), it can be started in a way analogous to that given in the subsection on p. 79.
Minimizing a quadratic form. Let $f(x) = \frac{1}{2}(Ax, x) + (b, x) + c$, where $(Ax, x) > 0$ with any $x \ne 0$. In this case it is easily ascertained that vector $\theta_k = g_k = f'(x_k)$, $\varphi_k = f'(y_k)$, $\psi_k = e_k$, i.e. $D_k = A$, and process (6.1) coincides with (3.4). Consequently (see the subsection on p. 79), process (6.1) makes it possible to find the minimum of a quadratic function after $n$ steps. It is necessary in this case to calculate $(n + 1)^2$ function values.
Choosing vector $g_k$. In method (6.1), besides approximating matrix $f''(x)$, we also substitute for gradient $f'(x)$ its finite-difference analogue, vector $g_k$. In this case, as was noted above, in order to obtain a superlinear rate of convergence, conditions (6.8) are to be satisfied. If, to satisfy these conditions, we have to calculate vector $g_k$ several times at a certain iteration, the amount of work required in the process increases (particularly for a multidimensional space).
Note that if $\|p_k\| < \|p_{k-1}\|$ at each iteration, then one can choose $|\rho_k| = \|p_{k-1}\|^2$. It is very probable that with such a manner of choosing $\rho_k$, the right-hand one of the inequalities (6.8) will be satisfied, at least from a certain iteration on. Indeed, in the end we obtain bounds (6.24) on the rate of convergence.
The rate of convergence of a process estimated in this mode is usually slower than the quadratic one:

II *^fcll ^ 1 1 % h *^h-llP» II ^fc-lll ~ ^ 9 ,

i.e. with bounds (6.24) usually $\|p_{k-1}\|^2 < \|p_k\|$ (recall that from a certain $k$ on we have $\alpha_k = 1$, i.e. $p_k = x_{k+1} - x_k$). Therefore, if sequence $\{\xi_k\}$ is chosen such that $\xi_k \to 0$ at a sufficiently slow rate, we can expect that with $\rho_k = \|p_{k-1}\|^2$ it will not be necessary to calculate $g_k$ many times in order to satisfy (6.8). If, however, conditions (6.8) are not satisfied from the beginning (i.e. if we have to reduce $\rho_k$), this will suggest that the rate of convergence is close to the quadratic one.

In conclusion, note also that using the results of the subsection on p. 74 and of this section it is possible to establish the conditions of convergence of the modification of Newton's method that does not require calculation of derivatives.

Methods of Conjugate Directions

We consider a method of constructing conjugate directions which differs in essence from the methods discussed in Sec. 4.
Let again
$$f(x) = \frac{1}{2}(Ax, x) + (b, x) + c$$
where $(Ax, x) > 0$ with any $x \ne 0$. Suppose that directions $p_1, \ldots, p_m$, $m < n$ (not equal to zero) are $A$-orthogonal and $E_m(x_0)$ and $E_m(x_{0,m})$ are two different $m$-dimensional subspaces of space $E_n$ that are formed by vectors $p_1, \ldots, p_m$ and pass through points $x_0$ and $x_{0,m}$.
If $x_m$ and $x_{m,m}$ are points of the minimum of $f(x)$ in subspaces $E_m(x_0)$ and $E_m(x_{0,m})$, then
$$(f'(x_m), p_i) = 0, \quad (f'(x_{m,m}), p_i) = 0, \quad i = 1, 2, \ldots, m.$$
Consequently, $(f'(x_m) - f'(x_{m,m}), p_i) = 0$ or
$$(A(x_m - x_{m,m}), p_i) = 0, \quad i = 1, 2, \ldots, m.$$
Thus if points of the minimum of $f(x)$ are determined in different subspaces formed by $A$-orthogonal directions $p_1, \ldots, p_m$, then the direction $p_{m+1} = x_{m,m} - x_m$ proves to be conjugate to directions $p_1, \ldots, p_m$.
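The property just derived can be checked numerically in the simplest case $m = 1$. The sketch below is an illustration only: the matrix `A`, the vector `b` and the two starting points are arbitrary choices, and the line minimum is computed from the closed-form expression for a quadratic purely to verify the statement.

```python
from typing import List

Vector = List[float]

A = [[4.0, 1.0], [1.0, 3.0]]      # illustrative symmetric positive definite matrix
b = [-1.0, 2.0]


def mat_vec(m: List[Vector], v: Vector) -> Vector:
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def line_minimum(x: Vector, p: Vector) -> Vector:
    """Exact minimizer of f(x + alpha p) for f(x) = 1/2 (Ax, x) + (b, x)."""
    gradient = [gi + bi for gi, bi in zip(mat_vec(A, x), b)]
    alpha = -dot(gradient, p) / dot(mat_vec(A, p), p)
    return [xi + alpha * pi for xi, pi in zip(x, p)]


p1 = [1.0, 0.0]
x_first = line_minimum([0.0, 0.0], p1)     # minimum on the line through (0, 0)
x_second = line_minimum([0.0, 1.0], p1)    # minimum on a parallel line
p2 = [s - t for s, t in zip(x_second, x_first)]
conjugacy = dot(mat_vec(A, p2), p1)        # (A p2, p1), zero up to rounding
```

The two minima lie on parallel lines, and the vector joining them is $A$-orthogonal to $p_1$, as the argument above predicts.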
The method described of constructing conjugate vectors does not require calculation of the gradient or its finite-difference approximation. Let us now describe a concrete algorithm for the minimization of a quadratic function in which the construction of conjugate vectors is performed by the method described.
We choose arbitrarily point $x_0$ and vector $p_1$; the $m$-th iteration of the algorithm ($m = 1, 2, \ldots, n$) is performed as follows:
(1) Calculate point
$$x_m = x_{m-1} + \alpha_m p_m \tag{6.26}$$
where $\alpha_m$ is determined under the condition of the minimum function value:
$$f(\alpha) = f(x_{m-1} + \alpha p_m).$$
(2) Calculate point
$$x_{0,m} = x_m + r_m \tag{6.27}$$
where $r_m$ is an arbitrary vector which is not a linear combination of vectors $p_1, \ldots, p_m$ (below we shall dwell at some length on the question of the choice of $r_m$).
(3) Calculate points
$$x_{k,m} = x_{k-1,m} + \alpha_{k,m} p_k, \quad k = 1, \ldots, m$$
where factor $\alpha_{k,m}$ is determined under the condition of the value of function $f(\alpha) = f(x_{k-1,m} + \alpha p_k)$ being minimum.
(4) Now calculate vector $p_{m+1} = x_{m,m} - x_m$. This is the end of the $m$-th iteration.
Vector $r_m$ (in (6.27)) must not belong to the subspace $E_m(x_0)$ so that point $x_{0,m}$ would not belong to subspace $E_m(x_0)$. Since point $x_m$ is the minimum point of $f(x)$ in subspace $E_m(x_0)$, it is clear that any vector $x - x_m$ in whose direction function $f(x)$ decreases does not belong to $E_m(x_0)$. Consequently, any direction of descent of $f(x)$ from point $x_m$ can be taken as $r_m$. In particular, it is convenient to choose vector $r_m$ along one of the coordinate axes; then if such a vector proves not to be the direction of descent, it is necessary to take as $r_m$ a vector along another axis.
According to the results of Sec. 4, point $x_n$ calculated by formula (6.26) is the minimum point of $f(x)$: $x_n = x^*$. In order to find point $x^*$ we have to solve one-dimensional minimization problems (to determine factors $\alpha_m$ and $\alpha_{k,m}$)
$$1 + 2 + \ldots + n = \frac{n(n+1)}{2}$$
times.
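Steps (1)-(4) can be sketched in Python for a quadratic function, using function values only. Two choices in the sketch are assumptions of the example rather than part of the text: the one-dimensional minimum is found from a three-point parabola (exact for a quadratic), and $r_m$ is searched for among coordinate steps of decreasing length. All names (`line_minimum`, `descent_axis_step`, `conjugate_directions_minimize`, `A`, `b`) are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def line_minimum(f: Callable[[Vector], float], x: Vector, p: Vector) -> Vector:
    """Minimize f along p from three values; exact when f is quadratic."""
    plus = f([xi + pi for xi, pi in zip(x, p)])
    minus = f([xi - pi for xi, pi in zip(x, p)])
    curvature = plus - 2.0 * f(x) + minus          # equals (Ap, p) for a quadratic
    alpha = -(plus - minus) / (2.0 * curvature)
    return [xi + alpha * pi for xi, pi in zip(x, p)]


def descent_axis_step(f: Callable[[Vector], float], x: Vector) -> Vector:
    """Find r along a coordinate axis with f(x + r) < f(x); zero vector if none."""
    f_x, length = f(x), 1.0
    for _ in range(50):
        for j in range(len(x)):
            for sign in (1.0, -1.0):
                r = [0.0] * len(x)
                r[j] = sign * length
                if f([xi + ri for xi, ri in zip(x, r)]) < f_x:
                    return r
        length *= 0.5
    return [0.0] * len(x)


def conjugate_directions_minimize(
    f: Callable[[Vector], float], x0: Vector, p1: Vector
) -> Vector:
    n = len(x0)
    directions = [list(p1)]
    x = list(x0)
    for m in range(1, n + 1):
        x = line_minimum(f, x, directions[m - 1])             # step (1), formula (6.26)
        if m == n:
            break
        r = descent_axis_step(f, x)                           # step (2), formula (6.27)
        if not any(r):
            break                                             # x is already a minimum
        y = [xi + ri for xi, ri in zip(x, r)]
        for p in directions:                                  # step (3)
            y = line_minimum(f, y, p)
        directions.append([yi - xi for yi, xi in zip(y, x)])  # step (4)
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]   # illustrative data
b = [-1.0, 2.0, -3.0]


def quadratic(x: Vector) -> float:
    ax = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
    return 0.5 * sum(ax[i] * x[i] for i in range(3)) + sum(b[i] * x[i] for i in range(3))


x_min = conjugate_directions_minimize(quadratic, [0.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

For $n = 3$ the sketch performs $1 + 2 + 3 = 6$ one-dimensional minimizations, in agreement with the count given above.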
Using this approach to the construction of the method of conjugate directions, one can construct various algorithms for the minimization of nonquadratic functions. Of course, in any algorithm of this kind, the directions $p_1, \ldots, p_m$, $m \le n$, will no longer be conjugate (see the subsection on p. 103). However, we can expect that suitably worked out methods in a sufficiently small neighbourhood of the minimum point $x^*$ (of a convex smooth function) will make possible the construction of vectors that are close enough in their properties to the conjugate ones. Such algorithms may prove effective in minimizing nonquadratic functions.
We shall consider below an algorithm based on the above considerations.
Let $x_{1,0}$ be an arbitrary point and $v_{1,1}, \ldots, v_{1,n}$ be an orthonormalized coordinate basis; the $k$-th iteration of the algorithm, $k = 1, 2, \ldots$, consists of the following steps:
(1) For $i = 1, 2, \ldots, n$ calculate
$$x_{k,i} = x_{k,i-1} + \alpha_{k,i} v_{k,i}$$
where $\alpha_{k,i}$ are determined under the condition that the function value is minimum:
$$f(\alpha) = f(x_{k,i-1} + \alpha v_{k,i}).$$
(2) Assume that
$$v_{k,n+1} = \frac{x_{k,n} - x_{k,0}}{\gamma_k}$$
where $\gamma_k = \|x_{k,n} - x_{k,0}\|$, and calculate point $x_{k,n+1} = x_{k,n} + \alpha_{k,n+1} v_{k,n+1}$, where $\alpha_{k,n+1}$ is determined under the condition that the value of the function
$$f(x_{k,n} + \alpha v_{k,n+1})$$
be minimum.
(3) Let $\alpha_{k,s} = \max\{\alpha_{k,i} : i = 1, 2, \ldots, n\}$, $\Delta_k$ be a determinant whose columns are vectors $v_{k,1}, \ldots, v_{k,n}$, and $\varepsilon > 0$ be an arbitrary small positive constant. If
$$\frac{\alpha_{k,s} \Delta_k}{\gamma_k} \ge \varepsilon,$$
we set $v_{k+1,i} = v_{k,i}$ with $i \ne s$ and $v_{k+1,s} = v_{k,n+1}$; then we have
$$\Delta_{k+1} = \frac{\alpha_{k,s} \Delta_k}{\gamma_k}. \tag{6.28}$$
If we find that $\alpha_{k,s} \Delta_k / \gamma_k < \varepsilon$, we take $v_{k+1,i} = v_{k,i}$ for all $i = 1, 2, \ldots, n$; then $\Delta_{k+1} = \Delta_k$.
(4) Take $x_{k+1,0} = x_{k,n+1}$; this is the end of the $k$-th iteration.
Equality (6.28) must be proved. However, preliminarily we shall discuss the algorithm proposed.
Let us consider a simplified variant of the algorithm whose $k$-th iteration is performed as follows:
(1) Construct points $x_{k,i}$, $i = 1, 2, \ldots, n$ in the same way as in step (1) of the original algorithm.
(2) Calculate $x_{k,n+1} = x_{k,n} + \alpha_{k,n+1} v_{k,n+1}$, where $v_{k,n+1} = x_{k,n} - x_{k,0}$ and $\alpha_{k,n+1}$ provides the minimum of function $f(x_{k,n} + \alpha v_{k,n+1})$.
(3) Set $v_{k+1,i} = v_{k,i+1}$, $i = 1, 2, \ldots, n$.
(4) Set $x_{k+1,0} = x_{k,n+1}$.
Let $k = 2$. Then we have

$$x_{2,0} = x_{1,n+1} = x_{1,n} + \alpha_{1,n+1} v_{1,n+1},$$

$$x_{2,n} = x_{2,n-1} + \alpha_{2,n} v_{2,n} = x_{2,n-1} + \alpha_{2,n} v_{1,n+1},$$

i.e. points $x_{2,0}$ and $x_{2,n}$ are minimum points of $f(x)$ on the two lines parallel to vector $v_{1,n+1}$ which pass through the two different points $x_{1,n}$ and $x_{2,n-1}$. If $f(x)$ is a quadratic function, then according to the foregoing the direction $v_{2,n+1} = x_{2,n} - x_{2,0}$ is found to be conjugate to the direction $v_{1,n+1} = v_{2,n}$. By similar reasoning, it can be ascertained that if with any $k = 1, 2, \ldots, n$ vectors $v_{k,1}, \ldots, v_{k,n}$ are linearly independent, then after the $k$-th iteration vectors $v_{k,n+1}, v_{k,n}, \ldots, v_{k,n-k+2}$ will prove conjugate, i.e. after $n$ iterations of the process we shall have constructed $n$ conjugate vectors. However, it is impossible with this method of constructing vectors $v_{k,1}, \ldots, v_{k,n}$ to guarantee their linear independence. Indeed, if with a certain $k$ we have $\alpha_{k,1} = 0$ then, as is easily ascertained,

$$v_{k,n+1} = x_{k,n} - x_{k,0} = x_{k,n} - x_{k,1} = \sum_{i=2}^{n} \alpha_{k,i} v_{k,i},$$

i.e. at the $(k+1)$-th iteration the system of vectors $v_{k+1,i} = v_{k,i+1}$, $i = 1, 2, \ldots, n$, is found to be linearly dependent. In this case it is not possible to construct a system of $n$ conjugate vectors; this means that with the application of this simplified algorithm we cannot guarantee that a solution will be obtained even for a quadratic function. The more complicated steps (2) and (3) of the original algorithm are used just in order to avoid linear dependence of vectors $v_{k,i}$, $i = 1, 2, \ldots, n$ (we find that $\Delta_k \ge \varepsilon$).
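
The loss of linear independence can be observed numerically. The sketch below is an illustration under stated assumptions: the quadratic test function `f_demo`, the helper names `line_min` and `det3`, and the starting point are all hypothetical and chosen so that $\alpha_{1,1} = 0$. It performs steps (1) to (3) of the simplified variant once in three dimensions and evaluates the determinant of the shifted system $v_{1,2}, v_{1,3}, v_{1,4}$.

```python
def line_min(f, x, v):
    """Step length minimizing f(x + a*v); exact only for a convex quadratic f (assumption)."""
    f0 = f(x)
    fp = f([xi + vi for xi, vi in zip(x, v)])
    fm = f([xi - vi for xi, vi in zip(x, v)])
    return -(fp - fm) / (2.0 * (fp + fm - 2.0 * f0))

def det3(c1, c2, c3):
    """Determinant of the 3x3 matrix with columns c1, c2, c3 (scalar triple product)."""
    cross = [c2[1] * c3[2] - c2[2] * c3[1],
             c2[2] * c3[0] - c2[0] * c3[2],
             c2[0] * c3[1] - c2[1] * c3[0]]
    return sum(a * b for a, b in zip(c1, cross))

def f_demo(x):
    # Hypothetical test function: the first coordinate is already optimal at the origin.
    return x[0] ** 2 + (x[1] - 1.0) ** 2 + (x[2] - 2.0) ** 2 + x[1] * x[2]

V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # v_{1,1}, v_{1,2}, v_{1,3}
x0 = [0.0, 0.0, 0.0]
x, alphas = list(x0), []
for v in V:                                    # step (1): successive line minimizations
    a = line_min(f_demo, x, v)
    x = [xi + a * vi for xi, vi in zip(x, v)]
    alphas.append(a)
v_new = [xi - x0i for xi, x0i in zip(x, x0)]   # step (2): v_{1,4} = x_{1,3} - x_{1,0}
shifted = V[1:] + [v_new]                      # step (3): v_{2,i} = v_{1,i+1}
dependent_det = det3(*shifted)
```

Here the first step length vanishes, so `v_new` lies in the span of the second and third basis vectors and `dependent_det` is zero: the shifted system no longer spans the whole space.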
However, note that in minimizing a quadratic function with the aid of the original algorithm, it is impossible to guarantee that the problem will be solved after a finite number of iterations. Indeed, if we go over from the system of vectors $v_{k,1}, \ldots, v_{k,n}$ to the system $v_{k+1,1}, \ldots, v_{k+1,n}$, it can occur that one of the conjugate vectors already constructed is changed (see step (3)); therefore it is impossible to guarantee that $n$ conjugate vectors will be obtained after a finite number of iterations. Besides, the system of vectors $v_{k,i}$ can remain unchanged in going over to the $(k+1)$-th iteration.
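We show now that equality (6.28) holds:

$$\Delta_{k+1} = \det[(v_{k,1}, \ldots, v_{k,s-1}, v_{k,n+1}, v_{k,s+1}, \ldots, v_{k,n})].$$

But

$$v_{k,n+1} = \frac{1}{\gamma_k}(x_{k,n} - x_{k,0}) = \frac{1}{\gamma_k} \sum_{i=1}^{n} \alpha_{k,i} v_{k,i}.$$

Consequently

$$\Delta_{k+1} = \frac{\alpha_{k,s}}{\gamma_k} \det[(v_{k,1}, \ldots, v_{k,s}, \ldots, v_{k,n})] = \frac{\alpha_{k,s}}{\gamma_k} \Delta_k.$$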

Thus with any $k$ we find that $\Delta_k \ge \varepsilon$, and just this guarantees that vectors $v_{k,1}, \ldots, v_{k,n}$ are linearly independent.
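
Relation (6.28) is an instance of the multilinearity of the determinant, and it can be checked numerically. The following sketch uses arbitrary illustrative data (the three unit vectors in `V`, the step lengths in `alphas` and the helper `det3` are assumptions of the example, not values from the text): column $s$ is replaced by the normalized sum $\frac{1}{\gamma_k}\sum_i \alpha_{k,i} v_{k,i}$ and the new determinant is compared with $\frac{\alpha_{k,s}}{\gamma_k}\Delta_k$.

```python
import math

def det3(c1, c2, c3):
    """Determinant of the 3x3 matrix with columns c1, c2, c3 (scalar triple product)."""
    cross = [c2[1] * c3[2] - c2[2] * c3[1],
             c2[2] * c3[0] - c2[0] * c3[2],
             c2[0] * c3[1] - c2[1] * c3[0]]
    return sum(a * b for a, b in zip(c1, cross))

V = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.6, 0.8]]   # unit vectors v_{k,1..3}
alphas = [0.5, 2.0, -1.0]                                  # illustrative step lengths

# x_{k,n} - x_{k,0} is the sum of the individual steps alpha_i * v_i.
step = [sum(a * v[j] for a, v in zip(alphas, V)) for j in range(3)]
gamma = math.sqrt(sum(c * c for c in step))
v_new = [c / gamma for c in step]                          # v_{k,n+1}

s = max(range(3), key=lambda i: alphas[i])                 # index of the largest alpha
V_next = V[:s] + [v_new] + V[s + 1:]                       # replace column s

delta_k = det3(*V)
delta_next = det3(*V_next)
predicted = alphas[s] / gamma * delta_k                    # right-hand side of (6.28)
```

The terms of the sum with $i \ne s$ repeat columns already present in the determinant and contribute nothing, which is why only $\alpha_{k,s}$ survives.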
Let us study certain properties of this algorithm.
Theorem 6.2. Let $f(x)$ be a continuously differentiable strictly convex function such that the set $S = \{x: f(x) \le f(x_{1,0})\}$ is bounded for an arbitrarily chosen point $x_{1,0}$. Then the sequence

$$\{x_{k,i}\}, \quad i = 0, 1, \ldots, n, \quad k = 1, 2, \ldots \tag{6.29}$$

constructed by the method described above converges to the minimum point of function $f(x)$.
Proof. The existence and uniqueness of the minimum point $x^*$ of function $f(x)$ under the conditions of the theorem follow from the results of Lemmas 3.1 and 3.4 (Chap. I). Therefore it remains only to prove that the sequence $\{x_{k,i}\}$ is convergent. Any point of sequence (6.29) $x_{k,i} \in S$, since $f(x_{k,i}) = \min_{\alpha} f(x_{k,i-1} + \alpha v_{k,i}) \le f(x_{k,i-1})$ and $f(x_{k+1,0}) = f(x_{k,n+1}) \le f(x_{k,n})$. Set $S$ is bounded, i.e. (in $E^n$) it is compact. Consequently, from every infinite sequence of elements of this set it is possible to pick out a subsequence which converges to a certain element of $S$. If we consider the sequence $\{x_{k,i}\}$ with a fixed $i = 0, 1, \ldots, n$, then, by virtue of what has been stated, there is an infinite subsequence $\{x_{k_m,i}\}$ that converges to a point $\bar{x}_i \in S$. At the same time, since $f(x)$ has a lower bound, we have $f(x_{k_m,i+1}) - f(x_{k_m,i}) \to 0$. It follows, taking into account the continuity of $f(x)$, that the following equalities hold:

$$f(\bar{x}_{i+1}) = \lim_{m \to \infty} f(x_{k_m,i+1}) = \lim_{m \to \infty} f(x_{k_m,i}) = f(\bar{x}_i). \tag{6.30}$$

Let us demonstrate that $\bar{x}_{i+1} = \bar{x}_i$ for all $i = 0, 1, \ldots, n - 1$. By construction, $\|v_{k,i}\| = 1$ with any $k$ and $i$; consequently, vectors $v_{k,i}$ can be considered to be elements of a unit sphere (of a bounded set) and therefore with any fixed $i = 1, \ldots, n$ there is a subsequence $\{v_{k_m,i}\}$ that converges to a certain vector $\bar{v}_i$. Since $x_{k_m,i+1} = x_{k_m,i} + \alpha_{k_m,i+1} v_{k_m,i+1}$ and $v_{k_m,i+1} \to \bar{v}_{i+1}$, we have

$$\bar{x}_{i+1} = \bar{x}_i + \bar{\alpha}_{i+1} \bar{v}_{i+1}, \quad i = 0, 1, \ldots, n - 1,$$

where $\bar{\alpha}_{i+1} = \lim_{m \to \infty} \alpha_{k_m,i+1}$. Since the condition $f(x_{k_m,i+1}) = \min_{\alpha} f(x_{k_m,i} + \alpha v_{k_m,i+1})$ is satisfied at point $x_{k_m,i+1}$, we must have:

$$f(\bar{x}_{i+1}) = \min_{\alpha} f(\bar{x}_i + \alpha \bar{v}_{i+1}), \quad i = 0, 1, \ldots, n - 1, \tag{6.31}$$

i.e. the minimum of $f(x)$ in the direction $\bar{v}_{i+1}$ is attained at point $\bar{x}_{i+1}$. But it follows from (6.30) that $f(\bar{x}_{i+1}) = f(\bar{x}_i)$. Since $f(x)$ is strictly convex, there is a unique minimum point in the direction $\bar{v}_{i+1}$; hence $\bar{x}_{i+1} = \bar{x}_i$.

Thus we find that $\bar{x}_0 = \bar{x}_1 = \ldots = \bar{x}_n$. Denoting this common point by $\bar{x}$ we can rewrite condition (6.31) as follows:

$$f(\bar{x}) \le f(\bar{x} + \alpha \bar{v}_i), \quad i = 1, 2, \ldots, n, \tag{6.32}$$

with any $\alpha$. For a differentiable function these conditions are equivalent to the following ones:

$$(f'(\bar{x}), \bar{v}_i) = 0, \quad i = 1, 2, \ldots, n. \tag{6.33}$$

Note now that since $\det[(v_{k,1}, \ldots, v_{k,n})] \ge \varepsilon$, we have also $\det[(\bar{v}_1, \ldots, \bar{v}_n)] \ge \varepsilon$. It follows that vectors $\bar{v}_1, \ldots, \bar{v}_n$ are linearly independent. Taking this into account we have from (6.33) that $f'(\bar{x}) = 0$. Due to the strict convexity of $f(x)$, this means that $\bar{x}$ is the minimum point of $f(x)$: $\bar{x} = x^*$.
Thus we have proved that there is a subsequence $\{x_{k_m,i}\}$ which converges to point $x^*$. However, since with any fixed $i = 0, 1, \ldots, n$ we have $f(x_{k+1,i}) \le f(x_{k,i})$ and $f(x)$ has a lower bound, the following condition is satisfied:

$$\lim_{k \to \infty} f(x_{k,i}) = \lim_{m \to \infty} f(x_{k_m,i}) = f(\bar{x}_i) = f(x^*).$$

It follows that with a fixed $i$ the sequence $\{x_{k,i}\}$ is a minimizing one; consequently the sequence (6.29) is a minimizing one as well, and therefore, since there is only one minimum, the sequence converges to point $x^*$. The theorem is proved.
It is easy to ascertain that in proving that conditions (6.32) hold, we made no use of the fact that function $f(x)$ is differentiable, i.e. these inequalities hold also for a strictly convex continuous function. However, point $\bar{x}$ (the limit point of sequence (6.29)) in this case need not be the minimum point of $f(x)$ (at the same time sequence (6.29) can have more than one limit point).

Discussion of Results
Note first of all that the field of application of the method of conjugate directions is broader than that of methods of dual directions; this is easily ascertained by comparing the requirements imposed on the function being minimized in Theorems 6.1 and 6.2.
The properties of the method of conjugate directions under consideration have as yet been studied insufficiently. Thus it is not yet clear what the rate of convergence of the algorithm is. Nevertheless, it is evidently slower (in minimizing functions of the same class) than that of methods of Sec. 5; this can be judged even by the fact that the algorithm under consideration does not guarantee the finding of the minimum of a quadratic form after $n$ iterations (and in fact, after a finite number of steps), i.e. it does not guarantee the construction of a system of $n$ conjugate vectors after a finite number of iterations. Consequently, from the viewpoint of the rate of convergence, methods of dual directions with their superlinear rate of convergence have an advantage over the methods of conjugate directions.
Let us make an attempt to compare the amounts of computations at iterations of the algorithms studied.
In a method of type (6.1), it is necessary at each iteration to calculate the function value $n + 1$ or $2(n + 1)$ times for the construction of matrix $D_k^{-1}$ and $n + 1$ times for the construction of vector $g_k$, depending on the variant applied (see the subsection on p. 136); at the same time it can occur at some iterations that in determining $g_k$ there is no need to perform new evaluations of the function (if $p_k = \ldots$) or the amount of calculations can increase several times depending on how closely the gradient is approximated by $g_k$. Besides, it is necessary to perform some more calculations of function values in order to choose the direction of motion and the step size.
In the method of conjugate directions it is necessary to calculate at each iteration the minimum of the function in the direction of motion $n + 1$ times. If we assume that in solving a one-dimensional minimization problem we have to calculate on the average 3 or 4 function values, then the amount of calculations at each iteration with the method being studied is about the same. It is not as yet clear though what accuracy the computation of the minimum in a direction of motion must be performed with in the method of conjugate directions so that the properties of the process be not violated. From the viewpoint of the influence on the convergence, the algorithm of $\alpha_k$ choice in process (6.1) is to be preferred.
On the whole, given the possibility of using methods of type (6.1), they must be more effective than the method of conjugate directions; however, it should be stressed once more that the field of application of the latter is broader.
Finally, it should be noted that in studying process (6.1) we made it practically clear that the calculation errors in determining vector $e_k$ of the order of $O(\|r_k\|^2)$ (see (6.4), (6.5), (6.6)) and in determining vector $f'(x_k)$ of the order of $O(\varepsilon_k \|p_k\|)$ (see (6.16)) do not violate the properties of process (3.4) (convergence, bounds on the rate of convergence). If we consider the variant of process (3.4) in which $r_{k+1} = x_{k+1} - x_k$, then we can obtain other expressions for estimating the errors. From a certain step on in process (3.4), $\alpha_k = 1$ and, consequently, we have

$$\|r_k\| = \|p_{k-1}\| = \|x_k - x_{k-1}\| = \ldots \le m^{-1} \|f'_{k-1}\|.$$

Taking into account (1.12), we obtain $\|r_k\| \le m^{-1} M \|x_{k-1} - x^*\|$.
Thus if $r_{k+1} = x_{k+1} - x_k$, the errors in the calculation of vectors $e_k$ and $f'_k$ of the order of $O(\|x_k - x^*\|^2)$ and $O(\|x_k - x^*\|)$ do not tell on the properties of process (3.4).

Bibliographic Notes
To Sec. 1. The idea of the gradient method was first stated by A. L. Cauchy. Methods of the gradient type were studied by L. V. Kantorovich [1], J. E. Kelley, B. T. Polyak [1], M. Altman, Yu. I. Lyubich, Yu. I. Lyubich and G. D. Maistrovsky. These works contain long lists of literature.
The variant of the method with the choice of the step size according to condition (1.2) which is described in this section is published for the first time.
The study of the rate of convergence of gradient methods given in this section is based on the results of the work of B. T. Polyak [1].
To Sec. 2. Newton's method for solving minimization problems and equations was studied by L. V. Kantorovich [2], L. V. Kantorovich and G. P. Akilov, and L. Collatz.
M. N. Yakovlev proved that the rate of convergence of the generalized method is superlinear if the step size is chosen from the condition that the function attains its minimum in the direction of motion. A. A. Goldstein and J. F. Price, J. W. Daniel, Yu. M. Danilin [1, 2] studied Newton's method with adjustment of step length and used a mode of choosing the step value which did not involve the finding of the function minimum in the direction of motion.
To Sec. 3. This section is based on a paper by Yu. M. Danilin and B. N. Pshenichny [1].
To Sec. 4. The first of the methods of conjugate directions, the method of conjugate gradients, was proposed for solving problems of linear algebra by M. R. Hestenes and E. Stiefel. Another approach to the construction of methods of conjugate directions as applied to quadratic function minimization was proposed by W. C. Davidon [1] and developed by R. Fletcher, M. J. D. Powell and others.
In B. N. Pshenichny's algorithm [3] the construction of conjugate directions does not involve the finding of the function minimum in the direction of motion.
Many properties of conjugate directions are discussed by D. K. Faddeev and V. N. Faddeeva. The general method of constructing conjugate directions which is used in this section was worked out by H. Y. Huang. Some of the results are new, e.g. formula (4.45), method (4.63).
To Sec. 5. R. Fletcher and C. M. Reeves suggested the use of the method of conjugate gradients for the minimization of nonquadratic functions. The problems of the convergence and bounds on the rate of convergence of the method of conjugate gradients were studied by J. W. Daniel [1, 2], B. T. Polyak [2], G. D. Maistrovsky [1, 2], S. A. Smolyak. The convergence of method (4.48) and the bounds on the rate of convergence were established by M. J. D. Powell [3] (these results are described in the books by J. W. Daniel [1] and E. Polak [2]). The proof of the convergence of methods of conjugate directions described in this section is based on a paper by Yu. M. Danilin [4].
To Sec. 6. Methods of dual directions without calculating derivatives of the function are discussed in a paper by Yu. M. Danilin and B. N. Pshenichny [2]. Methods of conjugate directions were studied by C. S. Smith, M. J. D. Powell [2], W. I. Zangwill [1], J. W. Daniel [1]. The works named have been used in writing this section. A review of minimization methods without calculating derivatives was written by R. P. Brent.



CHAPTER III
METHODS OF CONSTRAINED FUNCTION MINIMIZATION

This chapter describes various methods of function minimization with constraints on the variables. The first section develops methods of solving problems of quadratic programming, which is a subsidiary problem in many algorithms. The following sections describe the algorithms for solving problems of convex and nonconvex programming. Everywhere, if only feasible, the bounds on the rate of convergence are given.

1. PROBLEM OF QUADRATIC PROGRAMMING
Usually the problem of quadratic programming is understood to be the problem of the minimization of a quadratic function with linear constraints. Thus the problem of quadratic programming is the minimization of the function

$$f(x) = \tfrac{1}{2}(x, Cx) + (d, x) \tag{1.1}$$

with the following constraints:

$$(a_i, x) - b_i \le 0, \quad i \in J^-,$$
$$(a_i, x) - b_i = 0, \quad i \in J^0, \tag{1.2}$$

where $x \in E^n$, $a_i \in E^n$, $i \in J^- \cup J^0$, $d \in E^n$, $b_i$ are numbers, $C$ is an $n \times n$ symmetric, nonnegative definite matrix, i.e. $(x, Cx) \ge 0$ for all $x$, and $J^-$ and $J^0$ are finite sets of indices.
The basis of the numerical method of solving this problem is the method of conjugate gradients. The main idea of the application of this method to problem (1.1)-(1.2) is as follows.
Let $x_0$ be a point which satisfies constraints (1.2). We pick out among the constraints those which are satisfied as equalities. These constraints determine a certain face of the polyhedral set defined by linear inequalities (1.2). We find the minimum of $f(x)$ on this face using the method of conjugate gradients. The point obtained is the solution of our problem or indicates a transition to a new face, and then the procedure is repeated. Since the method of conjugate gradients minimizes function $f(x)$ after a finite number of steps, and the number of faces of the polyhedral set is finite, it is clear that an algorithm of this kind converges after a finite number of steps.

Operators of Projection
Let now $J = J^- \cup J^0$ and $\mathcal{J}$ be a subset of the set of indices $J$. We form matrix $A_{\mathcal{J}}$ whose rows are vectors $a_i$, $i \in \mathcal{J}$, so that the matrix is $m \times n$-dimensional, where $m$ is the number of elements in set $\mathcal{J}$.
Lemma 1.1. If vectors $a_i$, $i \in \mathcal{J}$, are linearly independent, then matrix $A_{\mathcal{J}} A_{\mathcal{J}}^*$ is nonsingular.
Proof. Let $y \in E^m$ be a nonzero vector such that

$$A_{\mathcal{J}} A_{\mathcal{J}}^* y = 0. \tag{1.3}$$

Then

$$y^* A_{\mathcal{J}} A_{\mathcal{J}}^* y = (A_{\mathcal{J}}^* y)^* A_{\mathcal{J}}^* y = (A_{\mathcal{J}}^* y, A_{\mathcal{J}}^* y) = \|A_{\mathcal{J}}^* y\|^2 = 0,$$
$$A_{\mathcal{J}}^* y = 0. \tag{1.4}$$

But $A_{\mathcal{J}}^* y$ is just a linear combination of vectors $a_i$, $i \in \mathcal{J}$, with coefficients $y^i$, $i = 1, \ldots, m$, where $y^i$ are components of vector $y$. By the assumption that $a_i$, $i \in \mathcal{J}$, are linearly independent, this combination cannot be zero. Therefore, (1.4) and consequently (1.3), from which (1.4) was obtained, are not true. Thus, matrix $A_{\mathcal{J}} A_{\mathcal{J}}^*$ maps only the zero vector to zero, and this means that this matrix is nonsingular. Let us now define operator $P$:

$$P = A_{\mathcal{J}}^* (A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1} A_{\mathcal{J}}. \tag{1.5}$$

It is easily seen that operator $P$ has the following properties:

$$P P = P, \tag{1.6}$$
$$P^* = P, \tag{1.7}$$
$$P (I - P) = (I - P) P = 0. \tag{1.8}$$

Operator $P$ is the operator of orthogonal projection into a subspace spanned by vectors $a_i$, $i \in \mathcal{J}$.
Indeed, for any vector $x \in E^n$

$$x = P x + (I - P) x.$$
Further, by (1.7), (1.8),

$$(P x, (I - P) x) = (x, P^* (I - P) x) = 0$$

and so $P x$ and $(I - P) x$ are components of the orthogonal resolution of vector $x$. Moreover

$$P x = A_{\mathcal{J}}^* u = \sum_{i \in \mathcal{J}} a_i u^i$$

where vector $u \in E^m$ with its components $u^i$ is defined by formula

$$u = (A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1} A_{\mathcal{J}} x.$$

The expression for vector $P x$ shows that it is wholly in the subspace spanned by vectors $a_i$, $i \in \mathcal{J}$.
Note now that

$$A_{\mathcal{J}} (I - P) = A_{\mathcal{J}} - (A_{\mathcal{J}} A_{\mathcal{J}}^*)(A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1} A_{\mathcal{J}} = 0. \tag{1.9}$$

Therefore for any $x \in E^n$, vector $y = (I - P) x$ satisfies the system of equations $A_{\mathcal{J}} y = 0$.
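
Properties (1.6)-(1.9) can be verified on a small example. The sketch below is an illustration only: the matrix `A` (two linearly independent constraint vectors in three dimensions) and all helper names are assumptions of the example, and the inverse of $A_{\mathcal{J}} A_{\mathcal{J}}^*$ is written out explicitly for the $2 \times 2$ case.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def max_abs(X):
    return max(abs(v) for row in X for v in row)

def inv2(M):
    """Inverse of a 2x2 matrix; it exists when the rows of A are independent (Lemma 1.1)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projector(A):
    """P = A* (A A*)^{-1} A for a matrix A with two rows, formula (1.5)."""
    At = transpose(A)
    return matmul(matmul(At, inv2(matmul(A, At))), A)

A = [[1.0, 1.0, 0.0],      # a_1 (illustrative)
     [0.0, 1.0, 2.0]]      # a_2 (illustrative)
I = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
P = projector(A)
Q = sub(I, P)              # I - P

err_idempotent = max_abs(sub(matmul(P, P), P))      # (1.6)
err_symmetric = max_abs(sub(transpose(P), P))       # (1.7)
err_orthogonal = max_abs(matmul(P, Q))              # (1.8)
err_constraint = max_abs(matmul(A, Q))              # (1.9)
```

All four residuals should vanish up to rounding, which confirms in particular that every vector of the form $(I - P)x$ is orthogonal to the constraint vectors.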

Minimization of a Quadratic Function in a Subspace
Suppose now that we have to minimize a quadratic function $f(x)$ defined by (1.1) with the constraints

$$(a_i, x) - b_i = 0, \quad i \in \mathcal{J}. \tag{1.10}$$

We assume that vectors $a_i$, $i \in \mathcal{J}$, are linearly independent.
Let $x_0$ be a point which satisfies (1.10).
Note that if we denote by $b_{\mathcal{J}}$ a vector whose components are $b_i$, $i \in \mathcal{J}$, then the system of equations (1.10) can be written in the form $A_{\mathcal{J}} x - b_{\mathcal{J}} = 0$, so that $A_{\mathcal{J}} x_0 - b_{\mathcal{J}} = 0$.
We now introduce a new variable $y$ defined as follows:

$$x = x_0 + (I - P) y \tag{1.11}$$

and consider the quadratic function

$$\varphi(y) = f(x_0 + (I - P) y).$$

The gradients of functions $\varphi(y)$ and $f(x)$, according to the rules of differentiation of a composite function and the symmetry of operator $P$, are related as follows:

$$\varphi'(y) = (I - P) f'(x) \tag{1.12}$$

where $x$ and $y$ are connected by (1.11).
Lemma 1.2. Let $y$ be the point of absolute minimum of function $\varphi(y)$. Then the corresponding point

$$x = x_0 + (I - P) y$$

is the minimum point of function $f(x)$ with constraints (1.10).
Proof. At point $y$ the gradient of function $\varphi(y)$ becomes zero: $\varphi'(y) = 0$. Therefore, by (1.12),

$$(I - P) f'(x) = 0$$

or

$$f'(x) - A_{\mathcal{J}}^* (A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1} A_{\mathcal{J}} f'(x) = 0.$$

Taking $u = -(A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1} A_{\mathcal{J}} f'(x)$, we obtain

$$f'(x) + A_{\mathcal{J}}^* u = 0. \tag{1.13}$$

Using (1.9) we obtain also

$$A_{\mathcal{J}} x = A_{\mathcal{J}} x_0 + A_{\mathcal{J}} (I - P) y = A_{\mathcal{J}} x_0 = b_{\mathcal{J}},$$

i.e. $x$ satisfies conditions (1.10).
Thus $x$ is a feasible point and at this point conditions (1.13) are satisfied, which are necessary and sufficient for $x$ to be the minimum point of $f(x)$ with conditions (1.10). The lemma is proved.
Lemma 1.2 shows that the problem under consideration can be reduced to the minimization of quadratic function $\varphi(y)$ without constraints. To minimize $\varphi(y)$, we apply the method of conjugate gradients (Chap. II, Sec. 4):

$$y_0 = 0, \quad p_1 = -\varphi'(0),$$
$$y_{k+1} = y_k + \alpha_{k+1} p_{k+1},$$
$$p_{k+1} = -\varphi'(y_k) + \frac{\|\varphi'(y_k)\|^2}{\|\varphi'(y_{k-1})\|^2} p_k.$$

The quantity $\alpha_{k+1}$ in these formulas is calculated as follows:

$$\alpha_{k+1} = -\frac{(\varphi'(y_k), p_{k+1})}{(p_{k+1}, (I - P) C (I - P) p_{k+1})}$$

since it is easy to ascertain that the matrix which determines the quadratic term of function $\varphi(y)$ has the form

$$(I - P) C (I - P).$$
These formulas determine the process involving the additional variables $y$. It is, however, expedient to go over to the original variables $x$. We preliminarily prove that the following relation holds:

$$(I - P) p_k = p_k. \tag{1.14}$$

Indeed, with $k = 1$ we have:

$$(I - P) p_1 = -(I - P) \varphi'(0) = -(I - P)(I - P) f'(x_0) = -(I - P) f'(x_0) = -\varphi'(0) = p_1,$$

where we have made use of (1.12) and of the fact that

$$(I - P)(I - P) = I - P - (I - P) P = I - P.$$

Now suppose that (1.14) holds for $k$ and prove that it holds for $k + 1$, where we have again made use of (1.12) and (1.14):

$$(I - P) p_{k+1} = -(I - P) \varphi'(y_k) + \frac{\|\varphi'(y_k)\|^2}{\|\varphi'(y_{k-1})\|^2} (I - P) p_k = -\varphi'(y_k) + \frac{\|\varphi'(y_k)\|^2}{\|\varphi'(y_{k-1})\|^2} p_k = p_{k+1}.$$

It follows now from (1.11) that

$$x_{k+1} = x_0 + (I - P) y_{k+1},$$
$$x_{k+1} = x_k + (I - P)(y_{k+1} - y_k) = x_k + (I - P) \alpha_{k+1} p_{k+1},$$
$$x_{k+1} = x_k + \alpha_{k+1} p_{k+1}.$$

Let us transform the formula for $p_{k+1}$, using (1.12):

$$p_{k+1} = -(I - P) f'(x_k) + \frac{\|(I - P) f'(x_k)\|^2}{\|(I - P) f'(x_{k-1})\|^2} p_k.$$

The formula for $\alpha_{k+1}$ now takes the following form:

$$\alpha_{k+1} = -\frac{((I - P) f'(x_k), p_{k+1})}{((I - P) p_{k+1}, C (I - P) p_{k+1})} = -\frac{(f'(x_k), p_{k+1})}{(p_{k+1}, C p_{k+1})}.$$

Theorem 1.1. The problem of the minimization of quadratic function $f(x)$ with constraints (1.10), given the initial point $x_0$ which satisfies (1.10), is solved after a finite number of steps by the following process:

$$p_1 = -(I - P) f'(x_0),$$
$$x_{k+1} = x_k + \alpha_{k+1} p_{k+1},$$
$$p_{k+1} = -(I - P) f'(x_k) + \frac{\|(I - P) f'(x_k)\|^2}{\|(I - P) f'(x_{k-1})\|^2} p_k,$$
$$\alpha_{k+1} = -\frac{(f'(x_k), p_{k+1})}{(p_{k+1}, C p_{k+1})}, \quad k = 0, 1, \ldots$$

The proof of this theorem was given practically in the argument used in deriving the formulas of the process.
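
The process of Theorem 1.1 can be sketched as follows. This is a minimal Python illustration under simplifying assumptions, not a general implementation: there is a single equality constraint $(a, x) = b$, so that $(A_{\mathcal{J}} A_{\mathcal{J}}^*)^{-1}$ reduces to the scalar $1/(a, a)$; the function names, the stopping tolerance and the test data are hypothetical; and the check for nonpositive curvature is an added guard for the case in which $f$ has no lower bound on the constraint set.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(C, x):
    return [dot(row, x) for row in C]

def project_out(a, g):
    """(I - P) g for one constraint vector a: remove the component of g along a."""
    t = dot(a, g) / dot(a, a)
    return [gi - t * ai for gi, ai in zip(g, a)]

def constrained_cg(C, d, a, x0, tol=1e-20):
    """Minimize 1/2 (x, Cx) + (d, x) subject to (a, x) = (a, x0); x0 must be feasible."""
    x = list(x0)
    grad = [ci + di for ci, di in zip(matvec(C, x), d)]   # f'(x) = Cx + d
    r = project_out(a, grad)                               # (I - P) f'(x_0)
    p = [-ri for ri in r]                                  # p_1
    for _ in range(len(x0)):                               # at most n - m steps are needed
        rr = dot(r, r)
        if rr <= tol:                                      # projected gradient vanished
            break
        curvature = dot(p, matvec(C, p))
        if curvature <= 0.0:
            raise ValueError("f has no lower bound on the constraint set")
        alpha = -dot(grad, p) / curvature                  # alpha_{k+1}
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # x_{k+1}
        grad = [ci + di for ci, di in zip(matvec(C, x), d)]
        r_new = project_out(a, grad)
        beta = dot(r_new, r_new) / rr
        p = [-rn + beta * pi for rn, pi in zip(r_new, p)]  # next direction
        r = r_new
    return x

# Illustrative data: n = 3, one constraint x1 + x2 + x3 = 1.
C_demo = [[2.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
d_demo = [-1.0, 0.0, 2.0]
a_demo = [1.0, 1.0, 1.0]
x_min = constrained_cg(C_demo, d_demo, a_demo, [1.0, 0.0, 0.0])
grad_min = [ci + di for ci, di in zip(matvec(C_demo, x_min), d_demo)]
residual = project_out(a_demo, grad_min)                   # should be close to zero
```

Since every direction $p_k$ satisfies $(I - P)p_k = p_k$, the iterates stay on the constraint plane, and at the final point the projected gradient `residual` vanishes, which is condition (1.13) with $u = -(a, f'(x))/(a, a)$.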
Remark. As we know (Chap. II, Sec. 4), if the method of conjugate gradients is applied to a quadratic function with a singular matrix $C$, then the process converges after a number of steps not exceeding $n - l$, where $l$ is the number of zero eigenvalues of matrix $C$. In minimizing $\varphi(y)$ we applied this method to a function whose matrix was $(I - P) C (I - P)$. But since $A_{\mathcal{J}} (I - P) = 0$, that is, $(I - P) A_{\mathcal{J}}^* = 0$, we have $(I - P) a_i = 0$, $i \in \mathcal{J}$. Therefore, in the case under consideration the number of zero eigenvalues of matrix $(I - P) C (I - P)$ is not less than $m$, where $m$ is the number of vectors $a_i$, $i \in \mathcal{J}$. Therefore, the process suggested either converges to the minimum point or shows that the quadratic function $f(x)$ has no lower bound under constraints (1.10), after a number of steps not exceeding $n - m$.

Algorithm of General Problem of Quadratic Programming
Let us now return to the general problem (1.1), (1.2). For each point $x$ which satisfies (1.2) we set
$$\mathcal{J}(x) = \{ i : (a_i, x) - b_i = 0,\ i \in J^- \cup J^0 \}.$$
In what follows we assume that the following condition of nondegeneracy is fulfilled: with any $x$ the vectors $a_i$, $i \in \mathcal{J}(x)$, are linearly independent.
We now propose the algorithm for solving the problem.
Let $x_0$ be an arbitrary point which satisfies (1.2) and is the first approximation. Take the set of indices $\mathcal{J}_0 = \mathcal{J}(x_0)$ and construct the operator $P_{\mathcal{J}_0}$:
$$P_{\mathcal{J}_0} = A^*_{\mathcal{J}_0} (A_{\mathcal{J}_0} A^*_{\mathcal{J}_0})^{-1} A_{\mathcal{J}_0}.$$
Calculate the quantities
$$u_0 = -(A_{\mathcal{J}_0} A^*_{\mathcal{J}_0})^{-1} A_{\mathcal{J}_0} f'(x_0), \qquad (I - P_{\mathcal{J}_0}) f'(x_0) = f'(x_0) + A^*_{\mathcal{J}_0} u_0.$$
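A minimal NumPy sketch of these two quantities follows. The function name and argument names are hypothetical; `A_act` is assumed to hold the vectors $a_i$, $i \in \mathcal{J}_0$, as rows, and `grad` stands for $f'(x_0)$.

```python
import numpy as np

def multipliers_and_projected_gradient(A_act, grad):
    """Return u0 = -(A A^T)^{-1} A f'(x0) and (I - P) f'(x0) = f'(x0) + A^T u0
    for the active-constraint matrix A_act (rows are assumed independent)."""
    u0 = -np.linalg.solve(A_act @ A_act.T, A_act @ grad)
    projected = grad + A_act.T @ u0
    return u0, projected
```

Solving the small system with `A_act @ A_act.T` avoids forming the projector itself; a zero projected gradient corresponds to case (1), a nonzero one to case (2).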

There are two possible cases:
(1) $(I - P_{\mathcal{J}_0}) f'(x_0) = 0$. Here
$$f'(x_0) + A^*_{\mathcal{J}_0} u_0 = 0 \qquad (1.15)$$
and point $x_0$ is the minimum point of $f(x)$ on the face defined by the system of equations
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{J}_0$$
(see Chap. I, Sec. 3).
If there are no negative components among $u_0^i$, the components of vector $u_0$, with $i \in \mathcal{J}(x_0) \cap J^-$, then (see Chap. I, Sec. 3) point $x_0$ is the solution of the primal problem (1.1), (1.2), for in this case (1.15) are the necessary and sufficient conditions for the minimum of function $f(x)$ with constraints (1.2).

Suppose now that there is an index $j \in \mathcal{J}(x_0) \cap J^-$ such that $u_0^j < 0$. Construct a new set of indices $\mathcal{J}_0'$ by deleting index $j$. We apply the method of conjugate gradients described in the subsection on p. 148 to solving the problem of minimization of $f(x)$ with constraints
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{J}_0'. \qquad (1.16)$$
However, in applying the method of conjugate gradients, the process must not transgress the limits (1.2). Therefore at every step of the algorithm the following check should be made. Compute the quantity
$$\bar\alpha_{k+1} = \min_i \frac{b_i - (a_i, x_k)}{(a_i, p_{k+1})}, \qquad (1.17)$$
where the minimum is taken over all $i$ for which $(a_i, p_{k+1}) > 0$. In this formula $x_k$ is the point just constructed by the algorithm and $p_{k+1}$ is the conjugate direction at this point.
Let now $\alpha_{k+1}$ be the corresponding step length in the method of conjugate gradients. If $\alpha_{k+1} < \bar\alpha_{k+1}$, then $x_{k+1} = x_k + \alpha_{k+1} p_{k+1}$ and the process goes on. If however $\alpha_{k+1} \geq \bar\alpha_{k+1}$, then $x_{k+1} = x_k + \bar\alpha_{k+1} p_{k+1}$ and the process stops.
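The feasibility check of formula (1.17) and the choice between the two step lengths can be illustrated with plain Python lists. The helper names `max_feasible_step` and `take_step` are hypothetical, and an empty minimum is represented by `math.inf`.

```python
import math

def max_feasible_step(a_rows, b, x, p):
    """Largest alpha keeping (a_i, x + alpha*p) <= b_i for every row a_i.
    Only rows with (a_i, p) > 0 can become violated, as in formula (1.17)."""
    best = math.inf
    for a_i, b_i in zip(a_rows, b):
        slope = sum(a * pj for a, pj in zip(a_i, p))
        if slope > 0:
            slack = b_i - sum(a * xj for a, xj in zip(a_i, x))
            best = min(best, slack / slope)
    return best

def take_step(x, p, alpha_cg, alpha_bar):
    """Move by the conjugate-gradient step unless the boundary is reached first.
    The flag is True when the step was cut and the process must stop."""
    hit_boundary = alpha_cg >= alpha_bar
    alpha = alpha_bar if hit_boundary else alpha_cg
    return [xi + alpha * pi for xi, pi in zip(x, p)], hit_boundary
```

For the box $x^1 \leq 2$, $x^2 \leq 3$, $-x^1 \leq 0$, the point $(0, 0)$ and the direction $(1, 1)$, the first constraint is reached first, at step length 2.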
Thus, either we find the minimum point of $f(x)$ under conditions (1.16) or the process will be truncated when $\alpha_{k+1} \geq \bar\alpha_{k+1}$. In both cases we take the point obtained to be the initial point and proceed using the new point as we did with the initial one, $x_0$.
(2) $(I - P_{\mathcal{J}_0}) f'(x_0) \neq 0$.
In this case we apply the method of conjugate gradients to solving the problem of minimization of $f(x)$ with constraints
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{J}_0 \qquad (1.18)$$
starting at point $x_0$. At every step, as before, a check is made whether the points obtained are feasible or not, i.e. we calculate $\bar\alpha_{k+1}$ by formula (1.17) and apply the process of conjugate gradients until either we find the minimum point of $f(x)$ with constraints (1.18) or the condition $\alpha_{k+1} \geq \bar\alpha_{k+1}$ is satisfied and the point $x_{k+1} = x_k + \bar\alpha_{k+1} p_{k+1}$ is obtained. In both cases we take the point obtained as the initial one and repeat at it the operations performed with $x_0$.
Let us substantiate the convergence of the method after a finite number of steps. We must first of all show that in case (1) as well as in case (2) a successful step will be made, i.e. we move from point $x_0$ to a new point at which the value of function $f(x)$ will be strictly less than $f(x_0)$.
New points are obtained by the method of conjugate gradients and in this method the function decreases at each step. Therefore, the only thing we have to show is that $\bar\alpha_{k+1} > 0$ always, i.e. constraints (1.2) permit a nonzero step in the direction chosen, $p_{k+1}$, and besides that in case (1) point $x_0$ is not the minimum point of $f(x)$ with constraints (1.16), for if it were so the method of conjugate gradients would not have moved the process from point $x_0$.
Let us prove several subsidiary lemmas.
Lemma 1.3. Vector $p_1 = -(I - P_{\mathcal{J}_0}) f'(x_0)$ is the solution of the problem of minimizing function
$$\varphi(p) = (f'(x_0), p) + \tfrac{1}{2} \|p\|^2$$
with constraints
$$A_{\mathcal{J}_0} p = 0. \qquad (1.19)$$
Proof. Indeed, by (1.9), $p_1$ satisfies (1.19). Moreover, $\varphi'(p) = p + f'(x_0)$. Therefore,
$$\varphi'(p_1) = p_1 + f'(x_0) = -(I - P_{\mathcal{J}_0}) f'(x_0) + f'(x_0) = P_{\mathcal{J}_0} f'(x_0) = -A^*_{\mathcal{J}_0} u_0.$$
Hence,
$$\varphi'(p_1) + A^*_{\mathcal{J}_0} u_0 = 0. \qquad (1.20)$$
The last expression is the necessary and sufficient condition for convex function $\varphi(p)$ to attain its minimum at point $p_1$ with constraints (1.19). The lemma is proved.
We formulate now a problem which is the dual of the problem of minimizing $\varphi(p)$ with constraints (1.19). According to the rules stated in Sec. 3 of Chap. I, we have to find the minimum over $p$ of function $\varphi(p) + u^* A_{\mathcal{J}_0} p$. Differentiating with respect to $p$ and equating the derivatives to zero, we obtain $p + f'(x_0) + A^*_{\mathcal{J}_0} u = 0$, i.e.
$$p = -f'(x_0) - A^*_{\mathcal{J}_0} u.$$
Substituting this expression for $p$ we obtain that
$$\min_p \{ \varphi(p) + u^* A_{\mathcal{J}_0} p \} = -\tfrac{1}{2} \| f'(x_0) + A^*_{\mathcal{J}_0} u \|^2.$$
Thus the dual problem consists in finding over all possible vectors $u$ the maximum of the function
$$\varphi^*(u) = -\tfrac{1}{2} \| f'(x_0) + A^*_{\mathcal{J}_0} u \|^2.$$
Now differentiating $\varphi^*(u)$ and equating the derivatives to zero, we can easily ascertain that vector
$$u_0 = -(A_{\mathcal{J}_0} A^*_{\mathcal{J}_0})^{-1} A_{\mathcal{J}_0} f'(x_0)$$
is the solution of the dual problem, i.e. maximizes $\varphi^*(u)$. Recall that $u_0^i$, $i \in \mathcal{J}_0$, are the components of $u_0$. Thus, vector $u_0$ is the vector of Lagrange's multipliers in the problem of the minimization of $\varphi(p)$ with constraints (1.19). Besides, we obtain that the value of the minimum of $\varphi(p)$ with constraints (1.19) and of the maximum of $\varphi^*(u)$ over $u$, which is the same by the duality theorems, is equal to
$$-\tfrac{1}{2} \| f'(x_0) + A^*_{\mathcal{J}_0} u_0 \|^2 \quad \text{or} \quad -\tfrac{1}{2} \| (I - P_{\mathcal{J}_0}) f'(x_0) \|^2.$$
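The equality of the primal and dual values can be checked numerically on random data. In the sketch below the matrix `A` and the vector `g` are arbitrary stand-ins for $A_{\mathcal{J}_0}$ and $f'(x_0)$; nothing here comes from a particular problem.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 5))   # rows play the role of a_i, i in the active set
g = rng.standard_normal(5)        # stands in for f'(x_0)

u0 = -np.linalg.solve(A @ A.T, A @ g)   # maximizer of the dual function
p1 = -(g + A.T @ u0)                    # minimizer of phi subject to A p = 0

primal = g @ p1 + 0.5 * p1 @ p1                    # phi(p1)
dual = -0.5 * np.linalg.norm(g + A.T @ u0) ** 2    # phi*(u0)
```

Both values coincide up to rounding, `A @ p1` vanishes, and any other choice of `u` gives a dual value that is not larger.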

Lemma 1.4. Let matrix $A_{\mathcal{J}_0'}$ be formed from $A_{\mathcal{J}_0}$ by deleting the row with index $j$ for which $u_0^j < 0$ and let $(I - P_{\mathcal{J}_0}) f'(x_0) = 0$. Then vector $p_1 = -(I - P_{\mathcal{J}_0'}) f'(x_0)$ is not zero and $(a_j, p_1) < 0$.
Proof. Vector $p_1$ can be written in the following form:
$$p_1 = -(f'(x_0) + A^*_{\mathcal{J}_0'} v), \qquad v = -(A_{\mathcal{J}_0'} A^*_{\mathcal{J}_0'})^{-1} A_{\mathcal{J}_0'} f'(x_0).$$
If $p_1 = 0$, then $f'(x_0) + A^*_{\mathcal{J}_0'} v = 0$. But on the other hand, by assumptions,
$$(I - P_{\mathcal{J}_0}) f'(x_0) = f'(x_0) + A^*_{\mathcal{J}_0} u_0 = 0. \qquad (1.21)$$
Subtracting from (1.21) the first equality, we obtain
$$A^*_{\mathcal{J}_0} u_0 - A^*_{\mathcal{J}_0'} v = u_0^j a_j + \sum_{i \neq j} (u_0^i - v^i) a_i = 0$$
which, as $u_0^j \neq 0$, is impossible since vectors $a_i$, $i \in \mathcal{J}_0$, are linearly independent. We now prove the second part of the lemma.
Rewrite (1.21) in the component form:
$$f'(x_0) + \sum_{i \neq j} u_0^i a_i + (-u_0^j)(-a_j) = 0. \qquad (1.22)$$
Note that $-u_0^j > 0$ since $u_0^j < 0$. Consider the problem of minimizing $\varphi(p) = (p, f'(x_0)) + \|p\|^2 / 2$ with constraints
$$(a_i, p) = 0, \quad i \in \mathcal{J}_0', \qquad -(a_j, p) \leq 0. \qquad (1.23)$$
Since $\varphi'(p) = f'(x_0) + p$, we have $\varphi'(0) = f'(x_0)$, and therefore (1.22) is the necessary and sufficient condition for the point $p = 0$ to be the solution of the problem of minimization of $\varphi(p)$ with constraints (1.23). On the other hand, by lemma 1.3, $p_1$ is the solution of the problem of minimization of $\varphi(p)$ with constraints $A_{\mathcal{J}_0'} p = 0$ or in the component form
$$(a_i, p) = 0, \quad i \in \mathcal{J}_0'. \qquad (1.24)$$
Suppose that $(a_j, p_1) \geq 0$. Since $p_1$ satisfies constraints (1.24), it satisfies all constraints (1.23) too. But
$$(f'(x_0), p_1) = -(f'(x_0), (I - P_{\mathcal{J}_0'}) f'(x_0))$$
$$= -(P_{\mathcal{J}_0'} f'(x_0) + (I - P_{\mathcal{J}_0'}) f'(x_0),\ (I - P_{\mathcal{J}_0'}) f'(x_0))$$
$$= -((I - P_{\mathcal{J}_0'}) f'(x_0),\ (I - P_{\mathcal{J}_0'}) f'(x_0))$$
$$= -\|p_1\|^2.$$
Therefore,
$$\varphi(p_1) = (f'(x_0), p_1) + \tfrac{1}{2} \|p_1\|^2 = -\tfrac{1}{2} \|p_1\|^2 < 0.$$
The last inequality contradicts the fact that the minimum value of $\varphi(p)$ with constraints (1.23) is attained with $p = 0$ and is equal to zero. This contradiction shows that $(a_j, p_1) < 0$. The lemma is proved.
We return now to the algorithm constructed. Let us consider case (1) and let point $x_0$ be not the solution of the problem of quadratic programming. According to the algorithm, we should apply the method of conjugate gradients in order to minimize function $f(x)$ with constraints (1.16). In accordance with the formulas of the method, the first step is made in the direction of vector
$$p_1 = -(I - P_{\mathcal{J}_0'}) f'(x_0).$$
By lemma 1.4, $p_1 \neq 0$ and consequently point $x_0$ is not the solution of the subsidiary problem of minimization under consideration. We now demonstrate that $\bar\alpha_1 > 0$.
Indeed, vector $p_1$ satisfies condition (1.24), and $(a_j, p_1) < 0$ according to lemma 1.4. Therefore
$$(a_i, p_1) \leq 0, \quad i \in \mathcal{J}_0. \qquad (1.25)$$
For $i \notin \mathcal{J}_0$, by the method of choosing set $\mathcal{J}_0$,
$$(a_i, x_0) - b_i < 0.$$
Therefore
$$\bar\alpha_1 = \min_i \frac{b_i - (a_i, x_0)}{(a_i, p_1)} > 0$$
since the minimum is taken only over those $i$ for which $(a_i, p_1) > 0$ and consequently over a certain subset of indices $i$ that does not intersect $\mathcal{J}_0$ according to (1.25), and
$$b_i - (a_i, x_0) > 0$$
with such $i$.

The fact that $\bar\alpha_1 > 0$ indicates that all of the points $x_0 + \alpha p_1$ with $0 \leq \alpha \leq \bar\alpha_1$ satisfy conditions (1.2). Indeed, for $i \in \mathcal{J}_0$
$$(a_i, x_0 + \alpha p_1) - b_i = (a_i, x_0) - b_i + \alpha (a_i, p_1) = \alpha (a_i, p_1) \begin{cases} = 0, & i \neq j, \\ \leq 0, & i = j. \end{cases}$$
For $i \notin \mathcal{J}_0$
$$(a_i, x_0 + \alpha p_1) - b_i = (a_i, x_0) - b_i + \alpha (a_i, p_1) < 0,$$
if $(a_i, p_1) \leq 0$; however if $(a_i, p_1) > 0$, then
$$\alpha \leq \bar\alpha_1 \leq \frac{b_i - (a_i, x_0)}{(a_i, p_1)}$$
and therefore
$$(a_i, x_0 + \alpha p_1) - b_i \leq (a_i, x_0) - b_i + \frac{b_i - (a_i, x_0)}{(a_i, p_1)} (a_i, p_1) = 0.$$
Note that the sign of inequality in the last expression is to be considered strict if $\alpha < \bar\alpha_1$ or $\bar\alpha_1 < \dfrac{b_i - (a_i, x_0)}{(a_i, p_1)}$.
According to the algorithm, two cases are possible: $\alpha_1 < \bar\alpha_1$ and $\alpha_1 \geq \bar\alpha_1$. In the first case we obtain a new point $x_1 = x_0 + \alpha_1 p_1$ that satisfies the relations
$$(a_i, x_1) - b_i = 0, \quad i \in \mathcal{J}_0', \qquad (a_i, x_1) - b_i < 0, \quad i \notin \mathcal{J}_0'. \qquad (1.26)$$
In the second case, we obtain point $x_1 = x_0 + \bar\alpha_1 p_1$ and it is taken as a new initial point from which the algorithm begins to operate in checking case (1) or (2). Point $x_1$ satisfies conditions $(a_i, x_1) - b_i = 0$, $i \in \mathcal{J}_0'$, and moreover equalities $(a_i, x_1) - b_i = 0$ with all $i \notin \mathcal{J}_0'$ such that
$$\frac{b_i - (a_i, x_0)}{(a_i, p_1)} = \bar\alpha_1.$$
Thus $\mathcal{J}(x_1) \supset \mathcal{J}_0'$, the inclusion being strict.


We return now to the case when $\alpha_1 < \bar\alpha_1$. Here the application of the method of conjugate gradients goes on, and so long as $\alpha_{k+1} < \bar\alpha_{k+1}$ holds, all of the points $x_{k+1}$ continue to satisfy the relations (1.26) (like $x_1$), since by (1.14) and (1.9) with $P = P_{\mathcal{J}_0'}$ we have
$$A_{\mathcal{J}_0'} p_k = A_{\mathcal{J}_0'} (I - P_{\mathcal{J}_0'}) p_k = 0,$$
i.e. if written in the component form
$$(a_i, p_k) = 0, \quad i \in \mathcal{J}_0'.$$
The inequalities
$$(a_i, x_k) - b_i < 0, \quad i \notin \mathcal{J}_0',$$
also will not be violated, for their violation would mean that the case where $\alpha_{k+1} \geq \bar\alpha_{k+1}$ takes place.
Thus, we have demonstrated that in case (1) the iterative process constructs successively points $x_0, x_1, \ldots, x_k$, $k \geq 1$, and the value of $f(x)$ strictly decreases along this sequence because it is constructed by the method of conjugate directions. The last point $x_k$ is either the minimum of $f(x)$ with constraints (1.16) or at this point $\mathcal{J}(x_k)$ strictly contains set $\mathcal{J}_0'$.
In case (2) the direction of motion from point $x_0$ coincides with vector $p_1 = -(I - P_{\mathcal{J}_0}) f'(x_0) \neq 0$, $(a_i, p_1) = 0$, $i \in \mathcal{J}_0$ $(= \mathcal{J}(x_0))$, and therefore it can be easily shown that $\bar\alpha_1 > 0$ and the method of conjugate directions makes it possible to make at least one nonzero step to the new point $x_1$ at which the value of $f(x)$ is strictly smaller. All the proofs in this case are analogous to those given above. We obtain as a result the sequence of points $x_0, x_1, \ldots, x_k$, $k \geq 1$, and $x_k$ is either the minimum of $f(x)$ with constraints (1.18), or $\mathcal{J}(x_k) \supset \mathcal{J}_0$.
Note that in case (1) as well as in case (2), if point $x_k$ is the minimum point of $f(x)$ with constraints (1.16) or, respectively, (1.18), then in both cases $x_k$ is the minimum of $f(x)$ on that face of the polyhedral set which is determined by expressions
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{J}(x_k) \qquad (1.27)$$
since, by construction, $\mathcal{J}(x_k) \supseteq \mathcal{J}_0'$ in case (1) and $\mathcal{J}(x_k) \supseteq \mathcal{J}_0$ in case (2) and the minimum point on the broader set is the minimum point also on the narrower one.
We show now that after a finite number of steps starting with point $x_0$ we shall inevitably come to point $x_k$ which is itself the minimum point of $f(x)$ with constraints (1.27). Indeed, it can be seen from the foregoing that if the method of conjugate gradients does not result in the finding of the minimum point, then it follows immediately that the set of indices $i$ for which the next point obtained satisfies the relations $(a_i, x_k) - b_i = 0$ is extended. Since, by assumption, vectors $a_i$, $i \in \mathcal{J}(x_k)$, are linearly independent, it is clear that this extension must be truncated after a finite number of steps not exceeding $n$, where $n$ is the dimension of $x$.
Thus after a number of steps not exceeding $n$ the algorithm described constructs the next point $x_k$ which is the minimum of $f(x)$ with constraints (1.27).
Note that the sets $\mathcal{J}(x_k)$ with different $x_k$ are different, for the value of function $f(x)$ decreases monotonically along the sequence constructed. Indeed, let $x_m$ and $x_k$, $m > k$, be minimum points of $f(x)$ with constraints
$$(a_i, x) - b_i = 0,$$
$i \in \mathcal{J}(x_m)$ and $i \in \mathcal{J}(x_k)$ respectively. If $\mathcal{J}(x_m) = \mathcal{J}(x_k)$, then clearly $f(x_m) = f(x_k)$. But according to the construction of the process, $f(x_m) < f(x_k)$ with $m > k$ and thus the equality $\mathcal{J}(x_m) = \mathcal{J}(x_k)$ does not hold true.
On the other hand, all sets $\mathcal{J}(x_k)$ are subsets of a finite set $J = J^- \cup J^0$ and therefore the number of such subsets is limited. It follows that the process proposed must be truncated after a finite number of steps. But this can occur only if the minimum point of $f(x)$ with constraints (1.2) has been found, otherwise as shown above the process can go on.
This proves that the process converges after a finite number of steps.
Remark. If matrix $C$ is singular, then according to the theory of methods of conjugate directions it can occur that $(f'(x_k), p_{k+1}) \neq 0$ at point $x_k$ but $(p_{k+1}, C p_{k+1}) = 0$. In this case $\alpha_{k+1}$ cannot be calculated since
$$\alpha_{k+1} = -\frac{(f'(x_k), p_{k+1})}{(p_{k+1}, C p_{k+1})}.$$
However in this case $f(x_k + \alpha p_{k+1})$ decreases with increasing $\alpha$ and therefore we can assume that $\alpha_{k+1} = +\infty$ and perform the calculations as usual. If $\bar\alpha_{k+1} < +\infty$, then the application of the method of conjugate gradients will terminate at point $x_{k+1} = x_k + \bar\alpha_{k+1} p_{k+1}$; this does not affect the above argument. However, if $\bar\alpha_{k+1}$ also has no limit, i.e. if $(a_i, p_{k+1}) \leq 0$ for all $i$, then the motion along the ray $x_k + \alpha p_{k+1}$ results in a decrease, without limit, of function $f(x)$. This means that the problem of quadratic programming which we consider has no solution since the lower bound of $f(x)$ with constraints (1.2) is $-\infty$.

Computational Aspects
The algorithm proposed comprises in essence only one complicated computational operation: the projecting of the gradient on a subspace, i.e. the calculation of the quantity $(I - P_{\mathcal{J}}) f'(x)$. There are two ways of performing this calculation.
The first one consists in a direct calculation of matrix $P_{\mathcal{J}}$, i.e. $P_{\mathcal{J}} = A^*_{\mathcal{J}} (A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}}$. This involves calculating matrix $(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1}$. If this matrix is known, then the calculation of the required vector $u = -(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)$ is reduced to multiplying the matrix by the vector.

In order to diminish the amount of computations at each step, when set $\mathcal{J}$ changes, one can make use of the fact that by deleting index $j$ we delete in matrix $A_{\mathcal{J}} A^*_{\mathcal{J}}$ one column and one row, thus obtaining matrix $A_{\mathcal{J}'} A^*_{\mathcal{J}'}$. Just in the same way, in adding an additional index to set $\mathcal{J}$, matrix $A_{\mathcal{J}} A^*_{\mathcal{J}}$ acquires an additional column and row.
This makes it possible to use the following recursive formulas known from linear algebra (see D. K. Faddeev and V. N. Faddeeva).
Let $B$ be an arbitrary symmetric $n \times n$ matrix which can be written in the form
$$B = \begin{pmatrix} D & u \\ u^* & b \end{pmatrix}$$
where $D$ is an $(n - 1) \times (n - 1)$ matrix, $u$ is an $(n - 1)$-dimensional column-vector, $u^*$ is its transposed vector, $b$ is a number. Then it can be easily ascertained that
$$B^{-1} = \begin{pmatrix} D^{-1} + \dfrac{D^{-1} u u^* D^{-1}}{a} & -\dfrac{D^{-1} u}{a} \\ -\dfrac{u^* D^{-1}}{a} & \dfrac{1}{a} \end{pmatrix}, \qquad a = b - u^* D^{-1} u.$$
Thus if matrix $D^{-1}$ is known, then matrix $B^{-1}$, where $B$ is obtained by adding the last column and the last row, can be obtained by simple calculations.
Conversely, if matrix $B^{-1}$ is of the form
$$B^{-1} = \begin{pmatrix} G & p \\ p^* & m \end{pmatrix},$$
then for matrix $D^{-1}$ we have
$$D^{-1} = G - \frac{p p^*}{m}.$$
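The two bordering formulas translate directly into NumPy. The function names below are hypothetical, $D$ is assumed symmetric and nonsingular, and the added or deleted row and column are taken to be the last ones.

```python
import numpy as np

def inverse_after_adding(D_inv, u, b):
    """Inverse of B = [[D, u], [u^T, b]] computed from the known inverse of D."""
    Du = D_inv @ u                       # D^{-1} u
    a = b - u @ Du                       # a = b - u^T D^{-1} u
    top_left = D_inv + np.outer(Du, Du) / a
    return np.block([[top_left, -Du[:, None] / a],
                     [-Du[None, :] / a, np.array([[1.0 / a]])]])

def inverse_after_deleting(B_inv):
    """Inverse of D computed from B^{-1} = [[G, p], [p^T, m]]."""
    G, p, m = B_inv[:-1, :-1], B_inv[:-1, -1], B_inv[-1, -1]
    return G - np.outer(p, p) / m
```

Each update costs on the order of $n^2$ operations instead of the $n^3$ of a fresh inversion, which is the saving the text refers to; the accumulated rounding error is the price.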

Thus if the new matrix is obtained from the original one by deleting the last row and the last column or by adding a row and a column, then the inverse matrices are obtained by simple arithmetic operations. The fact that in the formulas given above we deleted the last column and row does not matter, for it can be easily checked that the transposition of rows in the original matrix leads simply to a transposition of columns in the inverted matrix, and the transposition of columns to the transposition of rows.
Thus we have shown that the calculation of the matrix of projection can be performed by recursive formulas. The drawback of these recursive computations is that they may lead to a great cumulative computation error.

Let us describe another way of computation.
It was shown in the subsection on p. 151 that vector $p_0 = -(I - P_{\mathcal{J}}) f'(x)$ is the solution of the problem of minimizing $(f'(x), p) + \tfrac{1}{2} \|p\|^2$ with constraints $A_{\mathcal{J}} p = 0$. It is expedient to go over to the dual problem which, as demonstrated above, consists in the maximization of the quadratic function
$$-\tfrac{1}{2} \| f'(x) + A^*_{\mathcal{J}} u \|^2$$
over vector $u$ without constraints. This problem can be easily solved by the method of conjugate directions. As was shown in the subsection on p. 151 its solution is vector $u_0 = -(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)$, i.e. the vector which is required for the application of the algorithm for solving the general problem of quadratic programming. Vector $p_0$ is easily calculated using $u_0$ and the following formula:
$$p_0 = -(I - P_{\mathcal{J}}) f'(x) = -[f'(x) - A^*_{\mathcal{J}} (A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)] = -[f'(x) + A^*_{\mathcal{J}} u_0],$$
i.e.
$$p_0 = -[f'(x) + A^*_{\mathcal{J}} u_0].$$
Thus in using the second way of computing, the operation is reduced to applying many times the standard procedure of the method of conjugate directions.
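A possible realization of this second way is sketched below: the dual function is maximized by ordinary conjugate gradients applied to the system with matrix $A_{\mathcal{J}} A^*_{\mathcal{J}}$, which is never formed or inverted. The function name `multipliers_by_cg`, the zero starting vector and the tolerance are assumptions made for the illustration.

```python
import numpy as np

def multipliers_by_cg(A, g, tol=1e-12):
    """Maximize -0.5*||g + A^T u||^2 over u by conjugate gradients and
    return u together with the projected direction p0 = -(g + A^T u)."""
    m = A.shape[0]
    u = np.zeros(m)
    r = -(A @ (g + A.T @ u))      # gradient of the dual function at u
    p = r.copy()
    for _ in range(m):            # at most m steps in exact arithmetic
        rr = r @ r
        if rr <= tol ** 2:
            break
        Ap = A @ (A.T @ p)        # product with A A^T, one factor at a time
        alpha = rr / (p @ Ap)
        u = u + alpha * p
        r = r - alpha * Ap
        p = r + (r @ r) / rr * p
    return u, -(g + A.T @ u)
```

Only products with $A_{\mathcal{J}}$ and its transpose are needed, so no error is carried over from one active set to the next, in contrast to the recursive inverse updates.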

Problem of Quadratic Programming with Simple Constraints
The problem with simple constraints is understood to be the problem of minimizing
$$f(x) = \tfrac{1}{2} (x, Cx) + (d, x)$$
with constraints $x^i \geq 0$, $i \in J$, where $J$ is a subset of the set $\{1, 2, \ldots, n\}$. In this case the algorithm of the subsection on p. 151 is considerably simplified. Instead of performing these simplifications formally, we shall formulate an algorithm for solving the problem. Its description will make it clear that the proof of its convergence after a finite number of steps coincides with the proof of the algorithm of the subsection on p. 151. So, let $x_0$ be an arbitrary point which satisfies constraints $x^i \geq 0$, $i \in J$. Let
$$\mathcal{J}(x) = \{ i : x^i = 0,\ i \in J \}.$$
We shall describe now the procedure for one iteration with the initial point $x_0$. We calculate the set $\mathcal{J}(x_0)$. Two cases are possible.
(1) $(f'(x_0))^i = 0$, $i \notin \mathcal{J}(x_0)$, where $(f'(x_0))^i$ is the $i$-th component of vector $f'(x_0)$.
In this case point $x_0$ is the minimum point of $f(x)$ with constraints $x^i = 0$, $i \in \mathcal{J}(x_0)$. If at the same time $(f'(x_0))^i \geq 0$ with $i \in \mathcal{J}(x_0)$, then $x_0$ is the solution of the problem, for at point $x_0$ the necessary and sufficient conditions for a minimum are satisfied (see Chap. I, Sec. 3).
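The test that separates the cases at a given point can be written as a short pure-Python routine. The name `classify_point`, the returned strings, the zero-based indices and the tolerance `tol` are all hypothetical; `J` is the set of indices of sign-constrained variables and `grad` stands for $f'(x_0)$.

```python
def classify_point(x, grad, J, tol=1e-12):
    """Decide which case of the iteration applies at the feasible point x."""
    active = {i for i in J if abs(x[i]) <= tol}           # indices with x^i = 0
    free = [i for i in range(len(x)) if i not in active]
    if any(abs(grad[i]) > tol for i in free):
        return "case 2"            # gradient nonzero in some free component
    if all(grad[i] >= -tol for i in active):
        return "optimal"           # case 1 with nonnegative active components
    return "case 1: release"       # some active component of the gradient is negative
```

For instance, with $x = (0, 1)$ and both variables constrained, a gradient $(2, 0)$ gives an optimal point, while $(-1, 0)$ signals that the first constraint should be released.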
Let now $(f'(x_0))^i < 0$ for some $i \in \mathcal{J}(x_0)$, and set
$$\mathcal{J}' = \{ i \in \mathcal{J}(x_0) : (f'(x_0))^i > 0 \}.$$
We apply the method of conjugate gradients to minimize $f(x)$ taking as variables only $x^i$, $i \notin \mathcal{J}'$, and taking all the $x^i$ for $i \in \mathcal{J}'$ to be zero. The method of conjugate gradients requires that the quantity $\bar\alpha_{k+1}$ be computed:
$$\bar\alpha_{k+1} = \min_i \left( -\frac{x_k^i}{p_{k+1}^i} \right),$$
where the minimum is taken over all $i \in J$, $i \notin \mathcal{J}'$, for which $p_{k+1}^i < 0$. Then we compare $\alpha_{k+1}$ and $\bar\alpha_{k+1}$.
If $\alpha_{k+1} < \bar\alpha_{k+1}$, then $x_{k+1}^i = x_k^i + \alpha_{k+1} p_{k+1}^i$, $i \notin \mathcal{J}'$, and $x_{k+1}^i = x_k^i = 0$, $i \in \mathcal{J}'$. If $\alpha_{k+1} \geq \bar\alpha_{k+1}$, then $x_{k+1}^i = x_k^i + \bar\alpha_{k+1} p_{k+1}^i$, $i \notin \mathcal{J}'$, and $x_{k+1}^i = x_k^i = 0$, $i \in \mathcal{J}'$. The calculation process will be truncated after a finite number of steps and a point $x_{k+1}$ will be found such that it provides the minimum of $f(x)$ subject to $x^i = 0$, $i \in \mathcal{J}'$, or such that $\alpha_{k+1} \geq \bar\alpha_{k+1}$. In the latter case $\mathcal{J}(x_{k+1}) \supset \mathcal{J}'$ and the inclusion is such that there are $i \in \mathcal{J}(x_{k+1})$, but $i \notin \mathcal{J}'$. In both cases point $x_{k+1}$ is taken as the initial point and the process is repeated.
(2) There are indices $i$ such that $(f'(x_0))^i \neq 0$, $i \notin \mathcal{J}(x_0)$. In this case we apply the method of conjugate gradients to minimize $f(x)$ with variables $x^i$, $i \notin \mathcal{J}(x_0)$. The components $x^i$, $i \in \mathcal{J}(x_0)$, all the time remain equal to zero. Moreover as in case (1), at every step we calculate the quantity
$$\bar\alpha_{k+1} = \min_i \left( -\frac{x_k^i}{p_{k+1}^i} \right),$$
where the minimum is taken over all $i \in J$, $i \notin \mathcal{J}(x_0)$, with $p_{k+1}^i < 0$. The process stops in the same way as in case (1).
It is easily seen that an argument analogous to that given in the subsection on p. 151 results either in the proof of the convergence of the algorithm after a finite number of steps or in establishing the fact that $f(x)$ has no lower bound with conditions $x^i \geq 0$, $i \in J$.

2. METHOD OF FEASIBLE DIRECTIONS
The method of feasible directions was one of the first methods suggested for solving the problem of convex programming.
Suppose it is required to minimize function $f_0(x)$ with the constraints:
$$f_i(x) \leq 0, \quad i = 1, \ldots, m, \qquad A x - b = 0 \qquad (2.1)$$
where $x \in E^n$, $f_i(x)$, $i = 0, 1, \ldots, m$, are convex continuously differentiable functions, $A$ is an $l \times n$ matrix, $b$ is an $l$-dimensional vector. Moreover, suppose that the gradients of functions $f_i(x)$, $i = 0, 1, \ldots, m$, satisfy Lipschitz' condition:
$$\| f_i'(x_1) - f_i'(x_2) \| \leq C \| x_1 - x_2 \| \qquad (2.2)$$
and $\| f_i'(x) \| \leq K$ for all points $x$ which are considered in what follows. We denote by $D$ the admissible region, i.e. the set
$$D = \{ x : f_i(x) \leq 0,\ i = 1, \ldots, m,\ A x - b = 0 \}.$$
We shall assume in what follows that set $D$ is compact and the condition of the gradients having bounds is fulfilled. Let $x_0$ be a point of $D$. We find a direction $p \in E^n$ such that with small $\alpha$ we have $(x_0 + \alpha p) \in D$, and, besides, $f_0(x_0 + \alpha p) < f_0(x_0)$. Such a direction is called feasible. Moving along this direction by one step $\alpha_1$ we obtain a new point $x_1 = x_0 + \alpha_1 p \in D$. We take this point as the initial point and the process is repeated. The problem consists now in working out an effective method of finding feasible directions and choosing step $\alpha$ so as to provide for convergence to the minimum point.
Below we assume always that the following condition of nondegeneracy is fulfilled: there is a point $x$ such that
$$A x - b = 0, \quad f_i(x) < 0, \quad i = 1, \ldots, m.$$

Method of Choosing Feasible Directions
Let
$$J_\delta(x) = \{i : f_i(x) \ge -\delta,\ i = 1, \ldots, m\}$$
for each point $x \in D$. Let $\xi_i > 0$, $i = 0, 1, \ldots, m$, be arbitrary numbers. Consider the following problem at each point $x \in D$:
$$\min \eta,$$
$$(f_i'(x), p) \le \xi_i \eta, \quad i \in J_\delta(x) \cup \{0\},$$
$$Ap = 0, \quad \|p\| \le 1, \tag{2.3}$$
where $\eta$ is a number and $\|p\|$ is an arbitrary norm. To make (2.3) a problem of linear programming it is convenient to take as the norm
$$\|p\| = \max_i |p^i|.$$
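As an illustration, with the max-norm chosen above problem (2.3) can be written out as a linear program in the $n + 1$ unknowns $(p^1, \ldots, p^n, \eta)$. The component notation $\partial f_i / \partial x^j$ is introduced here only for this sketch:

$$\begin{aligned}
&\text{minimize } \eta \\
&\text{subject to } \sum_{j=1}^{n} \frac{\partial f_i}{\partial x^j}(x)\, p^j - \xi_i \eta \le 0, \quad i \in J_\delta(x) \cup \{0\}, \\
&\phantom{\text{subject to }} Ap = 0, \qquad -1 \le p^j \le 1, \quad j = 1, \ldots, n.
\end{aligned}$$

Since every $\xi_i > 0$, the optimal value can equivalently be read as a min-max quantity:

$$\eta_\delta(x) = \min_{Ap = 0,\ \|p\| \le 1}\ \max_{i \in J_\delta(x) \cup \{0\}} \frac{(f_i'(x), p)}{\xi_i}.$$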

Let $p_\delta(x)$, $\eta_\delta(x)$ be a solution of problem (2.3). Since the vector $p = 0$ together with $\eta = 0$ satisfies constraints (2.3), clearly $\eta_\delta(x) \le 0$. We demonstrate that $p_\delta(x)$ is a feasible direction if $\eta_\delta(x) < 0$. Indeed, let $\alpha > 0$. For $i = 0$ we have by Taylor's formula

$$\begin{aligned}
f_0(x + \alpha p_\delta(x)) &= f_0(x) + \alpha\,(f_0'(\theta_0), p_\delta(x)) \\
&= f_0(x) + \alpha\,(f_0'(x), p_\delta(x)) + \alpha\,(f_0'(\theta_0) - f_0'(x), p_\delta(x)) \\
&\le f_0(x) + \alpha\,(f_0'(x), p_\delta(x)) + \alpha^2 C\,\|p_\delta(x)\|^2,
\end{aligned}$$

where $\theta_0 = x + \zeta_0 \alpha p_\delta(x)$, $0 \le \zeta_0 \le 1$, and where we have used the fact that
$$\|f_0'(\theta_0) - f_0'(x)\| \le C\,\|\theta_0 - x\| \le C\alpha\,\|p_\delta(x)\|.$$
Further, by (2.3), $(f_0'(x), p_\delta(x)) \le \xi_0 \eta_\delta(x)$. Therefore

$$f_0(x + \alpha p_\delta(x)) \le f_0(x) + \alpha \xi_0 \eta_\delta(x) + \alpha^2 C\,\|p_\delta(x)\|^2. \tag{2.4}$$

By analogy, for $i \in J_\delta(x)$
$$f_i(x + \alpha p_\delta(x)) \le f_i(x) + \alpha \xi_i \eta_\delta(x) + \alpha^2 C\,\|p_\delta(x)\|^2. \tag{2.5}$$
Further, for $i \notin J_\delta(x)$
$$f_i(x + \alpha p_\delta(x)) = f_i(x) + \alpha\,(f_i'(\theta_i), p_\delta(x)) \le f_i(x) + \alpha K\,\|p_\delta(x)\|. \tag{2.6}$$
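We now choose $\alpha > 0$ so as to satisfy the inequalities

$$\begin{aligned}
&f_0(x + \alpha p_\delta(x)) \le f_0(x) + \tfrac{1}{2}\alpha \xi_0 \eta_\delta(x), \\
&f_i(x + \alpha p_\delta(x)) \le 0, \quad i \in J_\delta(x), \qquad f_i(x + \alpha p_\delta(x)) \le 0, \quad i \notin J_\delta(x).
\end{aligned} \tag{2.7}$$

To satisfy these inequalities it is sufficient that

$$\begin{aligned}
&\tfrac{1}{2}\xi_0 \eta_\delta(x) + \alpha C\,\|p_\delta(x)\|^2 \le 0, \\
&\xi_i \eta_\delta(x) + \alpha C\,\|p_\delta(x)\|^2 \le 0, \quad i \in J_\delta(x), \\
&-\delta + \alpha K\,\|p_\delta(x)\| \le 0, \quad i \notin J_\delta(x),
\end{aligned} \tag{2.8}$$

be true since, by (2.4) and (2.5), the following inequalities hold:

$$f_0(x + \alpha p_\delta(x)) \le f_0(x) + \tfrac{1}{2}\alpha \xi_0 \eta_\delta(x) + \alpha\left[\tfrac{1}{2}\xi_0 \eta_\delta(x) + \alpha C\,\|p_\delta(x)\|^2\right],$$

$$f_i(x + \alpha p_\delta(x)) \le f_i(x) + \alpha\left[\xi_i \eta_\delta(x) + \alpha C\,\|p_\delta(x)\|^2\right], \quad i \in J_\delta(x),$$

and since $f_i(x) < -\delta$ for $i \notin J_\delta(x)$, then by (2.6)

$$f_i(x + \alpha p_\delta(x)) < -\delta + \alpha K\,\|p_\delta(x)\|.$$
From inequalities (2.8) we obtain:
$$\alpha \le \frac{-\xi_0 \eta_\delta(x)}{2C\,\|p_\delta(x)\|^2}, \qquad \alpha \le \frac{-\xi_i \eta_\delta(x)}{C\,\|p_\delta(x)\|^2},\ i \in J_\delta(x), \qquad \alpha \le \frac{\delta}{K\,\|p_\delta(x)\|}. \tag{2.9}$$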
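The three bounds in (2.9) are simple to evaluate once $\eta_\delta(x)$ and $p_\delta(x)$ are known. The following is a minimal sketch, not part of the original text: the function name `step_bound`, the argument layout, and the use of the Euclidean norm for $\|p_\delta(x)\|$ are assumptions made for illustration.

```python
import math

def step_bound(eta, p, xi, active, delta, C, K):
    """Largest alpha permitted by the three bounds in (2.9).

    eta    -- value eta_delta(x) < 0 obtained from problem (2.3)
    p      -- direction p_delta(x) as a list of floats
    xi     -- weights xi_0, ..., xi_m (xi[0] belongs to the objective f_0)
    active -- indices i >= 1 that belong to J_delta(x)
    C, K   -- the constants of condition (2.2) and of the gradient bound
    """
    norm_p = math.sqrt(sum(c * c for c in p))   # Euclidean norm (assumption)
    bounds = [-xi[0] * eta / (2.0 * C * norm_p ** 2)]
    bounds += [-xi[i] * eta / (C * norm_p ** 2) for i in active]
    bounds.append(delta / (K * norm_p))
    return min(bounds)

# Hypothetical numbers, chosen only to exercise the formula.
alpha_max = step_bound(eta=-0.5, p=[1.0, 0.0], xi=[1.0, 1.0, 2.0],
                       active=[1], delta=0.1, C=2.0, K=4.0)
```

In this made-up instance the third bound, $\delta / (K\|p_\delta(x)\|)$, is the binding one, which illustrates how a small $\delta$ limits the step through the inactive constraints.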
Thus if $\alpha$ satisfies inequalities (2.9), then inequalities (2.7) are fulfilled, and it follows that $p_\delta(x)$ is really a feasible direction, since $A p_\delta(x) = 0$ and therefore $A(x + \alpha p_\delta(x)) - b = Ax - b + \alpha A p_\delta(x) = 0$. We now show that if the point $x$ does not coincide with the solution $x_*$ of the problem, i.e. with the minimum point of $f_0(x)$ in region $D$, then $\eta_\delta(x) < 0$ for any sufficiently small $\delta$.
Lemma 2.1. Let $x \in D$ be not a solution of the problem of minimization of $f_0(x)$ with constraints (2.1). Then $\eta_\delta(x) < 0$ for any sufficiently small $\delta$.
Proof. Recall that we assume the condition of nondegeneracy to be fulfilled, i.e. there is a point $\bar{x}$ such that
$$A\bar{x} - b = 0, \qquad f_i(\bar{x}) \le a, \quad i = 1, \ldots, m, \quad a < 0. \tag{2.10}$$
Let the point $x_*$ be the solution of the problem. Further, if
$$J_0(x) = \{i : f_i(x) = 0,\ i = 1, \ldots, m\},$$
then for $\delta < \delta_0$, where
$$-\delta_0 = \max_{i \notin J_0(x)} f_i(x),$$
we have $J_\delta(x) = J_0(x)$. Indeed, if $i \in J_\delta(x)$ and $\delta < \delta_0$, then $f_i(x) \ge -\delta$. But for all $i \notin J_0(x)$ we have $f_i(x) \le -\delta_0 < -\delta$, i.e. $i \in J_0(x)$. Suppose that $\delta < \delta_0$, so that $J_\delta(x) = J_0(x)$. We set
$$x_\mu = \mu \bar{x} + (1 - \mu)\, x_*, \quad 0 < \mu < 1.$$

Then, due to the convexity of the functions $f_i(x)$, $i = 0, 1, \ldots, m$, and the fact that $f_i(x_*) \le 0$, $i = 1, \ldots, m$, we obtain
$$f_i(x_\mu) \le \mu f_i(\bar{x}) + (1 - \mu) f_i(x_*) \le \mu a, \quad i = 1, \ldots, m.$$
Further, for $i \in J_\delta(x)$ we have $f_i(x) = 0$ and therefore, for $0 < \lambda < 1$,
$$\begin{aligned}
\lambda \mu a \ge \lambda f_i(x_\mu) &= \lambda f_i(x_\mu) + (1 - \lambda) f_i(x) \ge f_i(\lambda x_\mu + (1 - \lambda) x) \\
&= f_i(x + \lambda(x_\mu - x)) - f_i(x) \ge \lambda\,(f_i'(x), x_\mu - x),
\end{aligned}$$
where we have used the inequality (Chap. I, Sec. 2)
$$f(y) - f(x) \ge (f'(x), y - x),$$
which holds for any differentiable convex function.

Thus
$$\mu a \ge (f_i'(x), x_\mu - x), \quad i \in J_\delta(x). \tag{2.11}$$
Further, since the point $x$ does not provide a minimum of $f_0(x)$ in $D$,
$$0 > \gamma = f_0(x_*) - f_0(x) \ge (f_0'(x), x_* - x).$$
Hence
$$\begin{aligned}
(f_0'(x), x_\mu - x) &= \mu\,(f_0'(x), \bar{x} - x) + (1 - \mu)\,(f_0'(x), x_* - x) \\
&\le \mu\,(f_0'(x), \bar{x} - x) + (1 - \mu)\gamma.
\end{aligned} \tag{2.12}$$
It follows from (2.11) and (2.12) that for sufficiently small $\mu > 0$ the following inequalities are satisfied:

$$(f_0'(x), p_\mu) < 0, \qquad (f_i'(x), p_\mu) < 0, \quad i \in J_\delta(x), \tag{2.13}$$

where $p_\mu = x_\mu - x$, and the fact that $a < 0$, $\gamma < 0$ has been taken into account. Take $\bar{p}_\mu = p_\mu$ if $\|p_\mu\| \le 1$, and $\bar{p}_\mu = p_\mu / \|p_\mu\|$ if $\|p_\mu\| > 1$, so that $\|\bar{p}_\mu\| \le 1$. Besides, we take
$$\eta_\mu = \max_{i \in J_\delta(x) \cup \{0\}} \frac{(f_i'(x), \bar{p}_\mu)}{\xi_i}.$$

By (2.13), $\eta_\mu < 0$ and the inequalities

$$(f_i'(x), \bar{p}_\mu) \le \xi_i \eta_\mu, \quad i \in J_\delta(x) \cup \{0\}, \qquad \eta_\mu < 0, \tag{2.14}$$
are satisfied. Further, since $x_\mu = \mu \bar{x} + (1 - \mu) x_*$ and the equalities
$$A\bar{x} - b = 0, \qquad A x_* - b = 0$$
hold, then
$$A x_\mu - b = 0.$$

Note that $\bar{p}_\mu = \alpha p_\mu = \alpha (x_\mu - x)$, where $0 < \alpha \le 1$. Therefore

$$A\bar{p}_\mu + (Ax - b) = \alpha\,[A x_\mu - b] + (1 - \alpha)\,[Ax - b] = 0. \tag{2.15}$$
It follows from (2.14) and (2.15) that the vector $\bar{p}_\mu$ and the quantity $\eta_\mu$ satisfy conditions (2.3). Since $\eta_\mu < 0$, so much the more $\eta_\delta(x) < 0$, for $\eta_\delta(x) \le \eta_\mu$ by definition. The lemma is proved.

Algorithm of Method of Feasible Directions
Let $x_0 \in D$ be an arbitrary first approximation and $\delta_0 > 0$. We describe the general step of the algorithm. Let the point $x_k \in D$ be obtained at the $k$-th step and let $\delta_k > 0$.

Having solved the problem

$$\min \eta,$$
$$(f_i'(x_k), p) \le \xi_i \eta, \quad i \in J_{\delta_k}(x_k) \cup \{0\},$$
$$Ap = 0, \quad \|p\| \le 1,$$

we obtain $p_{\delta_k}(x_k) = p_k$ and $\eta_{\delta_k}(x_k) = \eta_k$.
Remark. If we take the quantity $\max_i |p^i|$ as the norm of the vector $p$, then the above problem is a problem of linear programming and can be solved by one of the standard methods.
Two cases are possible.

(1) $\eta_k < -\delta_k$. We take successively $\alpha = 1/2^i$, $i = 0, 1, \ldots$, and find the first $i_0$ such that the following inequalities

$$f_0\!\left(x_k + \frac{1}{2^{i_0}}\, p_k\right) \le f_0(x_k) + \frac{1}{2} \cdot \frac{1}{2^{i_0}}\, \xi_0 \eta_k,$$

$$f_i\!\left(x_k + \frac{1}{2^{i_0}}\, p_k\right) \le 0, \quad i = 1, \ldots, m,$$

are satisfied. We take $\alpha_k = 1/2^{i_0}$ and $x_{k+1} = x_k + \alpha_k p_k$, $\delta_{k+1} = \delta_k$, so that

$$f_0(x_{k+1}) \le f_0(x_k) + \tfrac{1}{2}\alpha_k \xi_0 \eta_k, \qquad f_i(x_{k+1}) \le 0, \quad i = 1, \ldots, m. \tag{2.16}$$

(2) $\eta_k \ge -\delta_k$. We take

$$x_{k+1} = x_k, \qquad \delta_{k+1} = \tfrac{1}{2}\delta_k.$$

Thus in the first case there is a shift to a new point, while in the second one there is no such shift.
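The two cases of the general step can be sketched in code. The following is a rough illustration under stated assumptions, not the book's implementation: it handles two variables only, omits the equality constraints $Ax - b = 0$, and replaces the linear program by a search over a coarse grid of directions with $\max_i |p^i| \le 1$. Any grid point with $\eta < 0$ still satisfies the constraints of (2.3), so the step rule applies unchanged. The name `feasible_directions` and the test problem are hypothetical.

```python
def feasible_directions(f, grad, x, delta=0.5, xi=None, steps=60):
    """f[0] is the objective, f[1:] are constraints f_i(x) <= 0; grad[i] is f_i'."""
    m = len(f) - 1
    xi = xi or [1.0] * (m + 1)
    grid = [j / 10.0 for j in range(-10, 11)]      # candidate components of p
    for _ in range(steps):
        # index set J_delta(x) together with the objective index 0
        active = [0] + [i for i in range(1, m + 1) if f[i](x) >= -delta]
        g = {i: grad[i](x) for i in active}
        # crude stand-in for the LP: minimise max_i (f_i'(x), p) / xi_i on the grid
        eta, p = min(
            (max((g[i][0] * a + g[i][1] * b) / xi[i] for i in active), (a, b))
            for a in grid for b in grid)
        if eta < -delta:                           # case (1): move to a new point
            alpha = 1.0
            while True:
                y = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
                if (f[0](y) <= f[0](x) + 0.5 * alpha * xi[0] * eta
                        and all(f[i](y) <= 0.0 for i in range(1, m + 1))):
                    break
                alpha *= 0.5                       # try 1/2^i for i = 0, 1, ...
            x = y
        else:                                      # case (2): keep x, halve delta
            delta *= 0.5
    return x, delta

# Hypothetical test problem: minimise the squared distance to (2, 1) on the unit disc.
f = [lambda x: (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2,
     lambda x: x[0] ** 2 + x[1] ** 2 - 1.0]
grad = [lambda x: (2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)),
        lambda x: (2.0 * x[0], 2.0 * x[1])]
x_start = [0.0, 0.0]
x_final, delta_final = feasible_directions(f, grad, x_start)
```

By construction every accepted point stays in the admissible region and lowers the objective; how close `x_final` comes to the true minimizer depends on the grid resolution and the number of steps.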
We now also formulate the condition for the halt of the algorithm: if at a certain step $k$ we have $\delta_k < \delta^0(x_k)$, where

$$\delta^0(x_k) = -\max_{i \notin J_0(x_k)} f_i(x_k),$$

and $\eta_k = 0$, then $x_k$ is the solution of the problem set above, i.e. $x_k$ is the minimum point of $f_0(x)$ with constraints (2.1).

Substantiation of Convergence of the Algorithm
We show that if the sequence $\{x_k\}$ is truncated at a certain step $k$ because the conditions of the halt have been fulfilled, then $x_k$ is really the solution of the problem. Indeed, let the conditions of the halt be fulfilled, i.e. $\eta_k = \eta_{\delta_k}(x_k) = 0$ and

$$\delta_k < \delta^0(x_k) = -\max_{i \notin J_0(x_k)} f_i(x_k). \tag{2.17}$$

But, as was shown in proving lemma 2.1, if condition (2.17) is fulfilled, then $\eta_{\delta_k}(x_k) < 0$ provided $x_k$ is not the solution of the problem. Since it was assumed that $\eta_k = 0$, it follows that $x_k$ is the minimum point of $f_0(x)$ with $x \in D$. Let the iterative process now be continued without limit, so that we have an infinite sequence $\{x_k\}$, $k = 0, 1, \ldots$. Let $x_k$ be a point at which $\eta_k < -\delta_k$, i.e. case (1) takes place. Then, making use of estimates (2.9) and the fact that $\|p_k\| = \|p_{\delta_k}(x_k)\| \le 1$, we can state that if

$$\alpha \le \min\left\{ -\frac{1}{2}\,\frac{\xi_0 \eta_k}{C},\ \frac{\delta_k}{K},\ \min_{i \in J_k}\left(-\frac{\xi_i \eta_k}{C}\right) \right\}, \tag{2.18}$$

where to shorten the notation we set $J_k = J_{\delta_k}(x_k)$, then the inequalities (2.7) will hold:

$$f_0(x_k + \alpha p_k) \le f_0(x_k) + \tfrac{1}{2}\alpha \xi_0 \eta_k,$$
$$f_i(x_k + \alpha p_k) \le 0, \quad i = 1, \ldots, m.$$

Now recall that, according to the algorithm, the quantity $\alpha_k$ coincides with the first of the quantities $1/2^i$, $i = 0, 1, \ldots$, satisfying inequalities (2.16). It follows that after a finite number of trials these inequalities will be satisfied.
Let $i_0$ be the first of the indices satisfying the inequalities, so that $\alpha_k = 1/2^{i_0}$. This means that with $\alpha = 1/2^{i_0 - 1}$ the inequalities (2.16) were not satisfied and therefore this $\alpha$ did not satisfy (2.18), i.e.

$$\frac{1}{2^{i_0 - 1}} > \min\left\{ -\frac{1}{2}\,\frac{\xi_0 \eta_k}{C},\ \frac{\delta_k}{K},\ \min_{i \in J_k}\left(-\frac{\xi_i \eta_k}{C}\right) \right\}.$$

Therefore

$$\alpha_k = \frac{1}{2^{i_0}} > \frac{1}{2}\min\left\{ -\frac{1}{2}\,\frac{\xi_0 \eta_k}{C},\ \frac{\delta_k}{K},\ \min_{i \in J_k}\left(-\frac{\xi_i \eta_k}{C}\right) \right\}. \tag{2.19}$$

If we take into account that in the case under consideration $-\eta_k > \delta_k$, then inequality (2.19) remains valid when $\delta_k$ is substituted for $-\eta_k$. Then we obtain:

$$\alpha_k \ge \frac{\delta_k}{2}\,\varepsilon_0, \qquad \varepsilon_0 = \min\left\{ \frac{\xi_0}{2C},\ \frac{\xi_1}{C},\ \ldots,\ \frac{\xi_m}{C},\ \frac{1}{K} \right\}. \tag{2.20}$$

Using now inequalities (2.16), (2.20) and the fact that $\eta_k < -\delta_k < 0$, we obtain

$$f_0(x_{k+1}) \le f_0(x_k) + \tfrac{1}{2}\alpha_k \xi_0 \eta_k \le f_0(x_k) - \frac{\xi_0 \varepsilon_0}{4}\,\delta_k^2. \tag{2.21}$$

It follows from this inequality that $\delta_k \to 0$ as $k \to \infty$. Indeed, the sequence $\{\delta_k\}$, $k = 0, 1, \ldots$, does not increase, and if $\delta_{k+1} < \delta_k$, then $\delta_{k+1} = \tfrac{1}{2}\delta_k$; so the fact that $\delta_k$ does not tend to zero can mean only that $\delta_k = \delta > 0$ for all sufficiently large $k$. But if $\delta_k$ remains constant, then the condition $\eta_k < -\delta_k$ is fulfilled and thus the inequality (2.21) holds.
Thus for all sufficiently large $k$ ($k \ge k_0$) we have $\delta_k = \delta$ and the inequality

$$f_0(x_{k+1}) \le f_0(x_k) - \frac{\xi_0 \varepsilon_0}{4}\,\delta^2$$

is fulfilled.
Therefore

$$f_0(x_N) \le f_0(x_{k_0}) - (N - k_0)\,\frac{\xi_0 \varepsilon_0}{4}\,\delta^2,$$

i.e. $f_0(x_N) \to -\infty$ as $N \to \infty$. But this contradicts the fact that the continuous function $f_0(x)$ is bounded in the compact region $D$.
Thus we have shown that $\delta_k \to 0$. This means that the initial $\delta_0$ is successively halved an infinite number of times, i.e. that case (2), $\eta_k \ge -\delta_k$, takes place an infinite number of times.
Let $\mathcal{K}$ be the set of indices $k$ for which case (2) took place. Then $\eta_k \to 0$ as $k \to \infty$, $k \in \mathcal{K}$. This follows immediately from the inequality $-\delta_k \le \eta_k \le 0$ and from the fact that $\delta_k \to 0$. Consider the sequence of points $x_k \in D$, $k \in \mathcal{K}$. As $D$ is a compact set, one can assume, without loss of generality, that $\{x_k\}$ converges to a certain point $x_*$. We shall now demonstrate that $x_*$ is the minimum point of $f_0(x)$ in $D$.

Suppose that the opposite is true, i.e. that the point $x_*$ is not the minimum point of $f_0(x)$ in $D$. Then, on the basis of lemma 2.1, we can affirm that for all $\delta < \delta^0(x_*)$, where
$$\delta^0(x_*) = -\max_{i \notin J_0(x_*)} f_i(x_*),$$
we have $J_\delta(x_*) = J_0(x_*)$ and $\eta_\delta(x_*) < 0$. Moreover, since $J_\delta(x_*) = J_0(x_*)$, we have $\eta_\delta(x_*) = \eta_0(x_*) < 0$. Further, $J_{\delta_k}(x_k) \subseteq J_0(x_*)$ for sufficiently large $k \in \mathcal{K}$. Indeed, suppose that $i \notin J_0(x_*)$; then $f_i(x_*) < 0$. Therefore, because $\delta_k \to 0$, for sufficiently large $k$ we have $f_i(x_*) < -\delta_k$, and since $x_k \to x_*$, for large $k$ we also have $f_i(x_k) < -\delta_k$, i.e. $i \notin J_{\delta_k}(x_k)$. Thus if $i \notin J_0(x_*)$, then for large $k$ we have $i \notin J_{\delta_k}(x_k)$, i.e. $J_{\delta_k}(x_k) \subseteq J_0(x_*)$. Since by assumption $x_*$ is not the minimum point of $f_0(x)$ in $D$, there is a vector $p(x_*)$ such that $A p(x_*) = 0$, $\|p(x_*)\| \le 1$,
$$(f_i'(x_*), p(x_*)) \le \xi_i \eta_0(x_*), \quad i \in J_0(x_*) \cup \{0\},$$
and, as mentioned above, $\eta_0(x_*) < 0$. But then, by continuity, for large $k$ the following relations hold:

$$(f_i'(x_k), p(x_*)) \le \tfrac{1}{2}\xi_i \eta_0(x_*), \quad i \in J_k \cup \{0\},$$

$$A p(x_*) = 0, \quad \|p(x_*)\| \le 1,$$
for $x_k \to x_*$ and $J_k \subseteq J_0(x_*)$. However, the last relations mean that

$$\eta_k = \eta_{\delta_k}(x_k) \le \tfrac{1}{2}\eta_0(x_*) < 0$$

for all sufficiently large $k$, and this contradicts the fact, proved above, that $\eta_k \to 0$ as $k \to +\infty$, $k \in \mathcal{K}$. The contradiction obtained proves that $x_*$ is the minimum point of $f_0(x)$ in $D$.
Theorem 2.1. The sequence of points $\{x_k\}$ constructed by the method of feasible directions has the property that $f_0(x_k)$ tends monotonically, without increasing, to $f_0(x_*)$, where $x_*$ is the minimum point of $f_0(x)$ in region $D$.
Proof. By construction of the sequence $\{x_k\}$ we have $f_0(x_{k+1}) \le f_0(x_k)$, and so this sequence of numbers is monotonically non-increasing. Since it has a lower bound, it converges to a certain limit $\bar{f}_0$. However, it was shown above that there is a subsequence of $\{x_k\}$, $k \in \mathcal{K}$, such that $x_k \to x_*$. Therefore $f_0(x_k) \to f_0(x_*)$ for $k \in \mathcal{K}$. As the whole sequence converges to the same limit as the subsequence, it follows that $f_0(x_k) \to f_0(x_*)$. Q.E.D.
Remark 1. Among the constraints $f_i(x) \le 0$ there can be some for which the functions $f_i(x)$ are linear. It is easy to show by a slight extension of the preceding argument that for these indices $i$ we can take $\xi_i = 0$.
Besides, in this case the condition of nondegeneracy can also be weakened, viz. it suffices to require that there be a point $\bar{x} \in D$ such that $f_i(\bar{x}) < 0$ only for those indices $i$ for which $f_i(x)$ are nonlinear.
Remark 2. The sequence $\{x_k\}$ itself, generally speaking, need not converge; however, if the point $x_*$ of the minimum of $f_0(x)$ with $x \in D$ is unique, it can easily be seen that $x_k \to x_*$. Unfortunately, the rate of convergence of the method of feasible directions is as yet unknown.

Construction of the Initial Approximation
The application of the method of feasible directions requires the knowledge of an initial approximation in region $D$. To obtain this initial approximation we can use the same method of feasible directions by applying it to the problem of minimization of the number $\eta$ with constraints
$$f_i(x) - \eta \le 0, \quad i = 1, \ldots, m, \qquad Ax - b = 0. \tag{2.22}$$

As there is a point $\bar{x}$ such that
$$f_i(\bar{x}) < 0, \quad i = 1, \ldots, m, \qquad A\bar{x} - b = 0,$$
the minimum value of $\eta$ with the constraints described is strictly less than zero, and therefore after a finite number of steps we obtain a point $x$ and a number $\eta$ such that $\eta < 0$ and the inequalities (2.22) are satisfied. This means that the obtained point $x$ satisfies the constraints of the original problem and can be taken as the initial point for applying the method of feasible directions.
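A starting pair for the auxiliary problem (2.22) is easy to write down once some point satisfying the equality constraints is available. As a sketch, under the assumption that a point $x^0$ with $Ax^0 - b = 0$ is at hand (for instance from solving the linear system; the symbols $x^0$, $\eta^0$ are introduced here for illustration only), one may take

$$\eta^0 = \max_{1 \le i \le m} f_i(x^0), \qquad \text{so that} \qquad f_i(x^0) - \eta^0 \le 0, \quad i = 1, \ldots, m, \qquad Ax^0 - b = 0,$$

i.e. the pair $(x^0, \eta^0)$ is admissible for (2.22), and the auxiliary iterations can be stopped as soon as the current $\eta$ becomes negative.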

3. METHOD OF CONDITIONAL GRADIENT AND NEWTON'S METHOD
The method of conditional gradient can be used for solving the problem of the minimization of a nonlinear function in a region in which the problem of the minimization of a linear function can be solved without great difficulty.
Suppose that $f(x)$, $x \in E^n$, is a continuously differentiable function in a compact convex region $\Omega$, and that the gradient $f'(x)$ of the function $f(x)$ satisfies the Lipschitz condition in $\Omega$, i.e.
$$\|f'(x_1) - f'(x_2)\| \le L\,\|x_1 - x_2\| \tag{3.1}$$
for all points of region $\Omega$.

The method of conditional gradient consists in the following.

Let $x_k$, the approximation at the $k$-th step of the iterative process, be already constructed. Calculate $f'(x_k)$ and find the minimum point of the linear function $(f'(x_k), z)$ in $\Omega$. Let it be the point $z(x_k)$. Take $p_k = z(x_k) - x_k$ and $x_{k+1} = x_k + \alpha_k p_k$, where $\alpha_k > 0$ is the length of the step in direction $p_k$. The point $x_{k+1}$ is taken as the initial one and the procedure is then repeated.
It will be demonstrated below that with a definite rule for calculating $\alpha_k$ the process converges, and bounds on the rate of convergence will be established. The same problems will be analyzed for Newton's method, which differs from the method of conditional gradient in that the function being minimized is approximated at each iteration by a quadratic function (while in the method of conditional gradient the approximation is linear).

Rule for Choosing the Step Length
Let $x$ be an arbitrary point in $\Omega$. We denote by $z(x)$ a minimum point of the function $(f'(x), z)$ in $\Omega$, so that
$$(f'(x), z(x)) \le (f'(x), z), \quad z \in \Omega. \tag{3.2}$$
We take $p(x) = z(x) - x$,
$$\eta(x) = \min_{z \in \Omega}\,(f'(x), z - x) = (f'(x), p(x)).$$

By (3.2), $\eta(x) \le 0$. We are interested in an estimate for the increment of the function value in moving from the point $x$ in direction $p(x)$. Using Taylor's formula and (3.1) we obtain:

$$\begin{aligned}
f(x + \alpha p(x)) &= f(x) + \alpha\,(f'(\theta), p(x)) \\
&= f(x) + \alpha\,(f'(x), p(x)) + \alpha\,(f'(\theta) - f'(x), p(x)) \\
&\le f(x) + \alpha \eta(x) + \alpha^2 L\,\|p(x)\|^2,
\end{aligned}$$

where $\theta = x + \xi \alpha p(x)$, $0 \le \xi \le 1$. Thus

$$f(x + \alpha p(x)) \le f(x) + \alpha\left(\eta(x) + \alpha L\,\|p(x)\|^2\right). \tag{3.3}$$

It follows directly from this formula that with

$$\alpha \le -\frac{1}{2}\,\frac{\eta(x)}{L\,\|p(x)\|^2} \tag{3.4}$$

the following estimate holds:

$$f(x + \alpha p(x)) \le f(x) + \frac{\alpha \eta(x)}{2}. \tag{3.5}$$
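For completeness, the step from (3.3) and (3.4) to (3.5) can be spelled out as follows (a short worked check, added for illustration). Condition (3.4) is the same as $\alpha L\,\|p(x)\|^2 \le -\tfrac{1}{2}\eta(x)$, hence

$$\eta(x) + \alpha L\,\|p(x)\|^2 \le \eta(x) - \tfrac{1}{2}\eta(x) = \tfrac{1}{2}\eta(x),$$

and multiplying by $\alpha > 0$ and substituting into (3.3) gives $f(x + \alpha p(x)) \le f(x) + \tfrac{1}{2}\alpha \eta(x)$, which is (3.5).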

Description of the Algorithm
The algorithm begins with an arbitrary point $x_0$ of region $\Omega$. We now describe the general step.
Let the point $x_k$ be already constructed, $k \ge 0$. Having solved the problem of minimization of $(f'(x_k), z)$ in $\Omega$, calculate $z(x_k)$, $p(x_k)$, $\eta(x_k)$. Construct the point $x_{k+1} = x_k + \alpha_k p(x_k)$, where $\alpha_k$ is taken to be $2^{-i_0}$ and $i_0$ is the first of the indices $i = 0, 1, \ldots$ to satisfy the inequality

$$f(x_k + 2^{-i} p(x_k)) \le f(x_k) + 2^{-i}\,\frac{\eta(x_k)}{2}. \tag{3.6}$$

The condition of the halt: the process stops if $\eta(x_k) = 0$.
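The general step can be sketched for the case where $\Omega$ is a box, for which the linear subproblem is solved coordinate by coordinate. This is a minimal illustration under assumptions that are not in the text: the box region, the quadratic test function, the tolerance `tol` replacing the exact test $\eta(x_k) = 0$, and all names are hypothetical.

```python
def conditional_gradient(f, grad, lo, hi, x, iters=50, tol=1e-12):
    """Minimise f over the box lo <= x <= hi using step rule (3.6)."""
    history = []                       # pairs (f(x_k), eta(x_k))
    for _ in range(iters):
        g = grad(x)
        # linear minimisation over a box: lower bound where g_i > 0, else upper
        z = [l if gi > 0 else h for gi, l, h in zip(g, lo, hi)]
        p = [zi - xi for zi, xi in zip(z, x)]
        eta = sum(gi * pi for gi, pi in zip(g, p))
        history.append((f(x), eta))
        if eta >= -tol:                # halt condition eta(x_k) = 0, up to tol
            break
        alpha = 1.0
        while f([xi + alpha * pi for xi, pi in zip(x, p)]) > f(x) + 0.5 * alpha * eta:
            alpha *= 0.5               # try 2^-i for i = 0, 1, ...
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x, history

# Hypothetical test function: half the squared distance to c = (0.3, 1.5).
c = [0.3, 1.5]
f = lambda x: 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
grad = lambda x: [xi - ci for xi, ci in zip(x, c)]
x_end, history = conditional_gradient(f, grad, [0.0, 0.0], [1.0, 1.0], [1.0, 0.0])
f_star = 0.125                         # minimum over the unit box, attained at (0.3, 1.0)
```

The recorded values of $f(x_k)$ never increase, and for a convex $f$ the quantity $-\eta(x_k)$ bounds the gap $f(x_k) - f_*$ from above, so it can serve as a practical stopping measure.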

Substantiation of Convergence of the Algorithm and Estimation of Its Rate of Convergence

According to the rule just given for choosing the step length, the following inequality is fulfilled:

$$f(x_{k+1}) \le f(x_k) + \tfrac{1}{2}\alpha_k \eta(x_k). \tag{3.7}$$

In order to substantiate the convergence of the algorithm, it is necessary first of all to demonstrate that inequalities (3.6), (3.7) can always be satisfied. In fact, by (3.4) and (3.5), inequality (3.6) will be satisfied as soon as the inequality
$$2^{-i} \le -\frac{1}{2}\,\frac{\eta(x_k)}{L\,\|p(x_k)\|^2}$$

is satisfied, and since $i_0$ is the first index satisfying (3.6), we have

$$2\alpha_k = 2^{-(i_0 - 1)} > -\frac{1}{2}\,\frac{\eta(x_k)}{L\,\|p(x_k)\|^2},$$

hence
$$\alpha_k \ge -\frac{\eta(x_k)}{4L\,\|p(x_k)\|^2}. \tag{3.8}$$

It follows from the foregoing that if $\eta(x_k) < 0$, then inequality (3.6) will be satisfied after a finite number of trials and the $\alpha_k$ chosen will satisfy inequality (3.8).
Lemma 3.1. If $\{x_k\}$, $k = 0, 1, \ldots$, is a sequence of points obtained in implementing the algorithm of the method of conditional gradient, then $x_k \in \Omega$, $f(x_k)$ decreases monotonically and $\eta(x_k) \to 0$ as $k \to \infty$.
Proof. Let $x_k \in \Omega$ for $k \le m$. We show that $x_{m+1} \in \Omega$. Indeed, $0 \le \alpha_k \le 1$ and $z(x_k) \in \Omega$. Therefore

$$x_{m+1} = x_m + \alpha_m p(x_m) = x_m + \alpha_m (z(x_m) - x_m) = (1 - \alpha_m)\, x_m + \alpha_m z(x_m) \in \Omega,$$
for region $\Omega$ is convex.
Note now that $\|p(x_k)\|$ is bounded by a certain constant $C$, since $p(x_k) = z(x_k) - x_k$, $z(x_k) \in \Omega$, $x_k \in \Omega$ and $\Omega$ is a compact set.
Using now formulas (3.8), (3.7) we obtain

$$f(x_{k+1}) - f(x_k) \le -\frac{1}{8LC^2}\,\eta^2(x_k). \tag{3.9}$$

Adding (3.9) for all $k = 0, 1, \ldots, m - 1$ we obtain
$$f(x_m) - f(x_0) \le -\frac{1}{8LC^2} \sum_{k=0}^{m-1} \eta^2(x_k).$$
Since region $\Omega$ is compact and the function $f(x)$ is continuous, we have $f(x_m) \ge f_*$, where $f_*$ is the minimum value of $f(x)$ in $\Omega$. Therefore

$$\sum_{k=0}^{m-1} \eta^2(x_k) \le 8LC^2\,[f(x_0) - f(x_m)] \le 8LC^2\,[f(x_0) - f_*].$$

Hence it follows that the series
$$\sum_{k=0}^{\infty} \eta^2(x_k)$$
converges. This is possible only if $\eta(x_k) \to 0$. The lemma is proved.
It follows from the condition of the halt of the algorithm and lemma 3.1 that, in general, either the algorithm stops after a finite number of steps and the condition $\eta(x_k) = 0$ is fulfilled, or a monotonically decreasing sequence of values $f(x_k)$ of the function $f(x)$ is obtained.
In the first case, the condition $\eta(x_k) = 0$ is, by (3.2), equivalent to the following one:
$$(f'(x_k), x_k) = (f'(x_k), z(x_k)) \le (f'(x_k), z), \quad z \in \Omega.$$
The last expression is nothing else but the necessary condition for the function $f(x)$ to assume its minimum value at the point $x_k$ (see Chap. I, Sec. 3).
The second case is the subject of the following lemma.
Lemma 3.2. At any limit point of the sequence $\{x_k\}$, $k = 0, 1, \ldots$, the necessary conditions for the minimum of $f(x)$ in the set $\Omega$ are fulfilled.

Proof. Let $x_*$ be a limit point of $\{x_k\}$, i.e. there is a subsequence $\{x_{k_j}\}$, $j \to \infty$, such that $x_{k_j} \to x_*$. The following relations hold:

$$\eta(x_{k_j}) = (f'(x_{k_j}), z(x_{k_j}) - x_{k_j}),$$

$$(f'(x_{k_j}), z(x_{k_j})) \le (f'(x_{k_j}), z), \quad z \in \Omega.$$
Without loss of generality, we can assume that $z(x_{k_j}) \to z_*$. Since $\eta(x_k) \to 0$ and $f'(x)$ depends on $x$ continuously, it follows from the above relations that
$$(f'(x_*), z_* - x_*) = 0,$$
$$(f'(x_*), z_*) \le (f'(x_*), z), \quad z \in \Omega.$$
Hence
$$(f'(x_*), x_*) \le (f'(x_*), z), \quad z \in \Omega,$$
and this proves the lemma.
Theorem 3.1. Let the function $f(x)$ be convex. Then
$$\lim_{k \to \infty} f(x_k) = f_*,$$
where $f_* = \min_{x \in \Omega} f(x)$. Moreover, the estimate
$$f(x_k) - f_* \le \frac{C}{k}$$
($C$ is a positive constant) holds.
Proof. As $f(x)$ is a convex function, the following inequality holds:
$$f_* - f(x) = f(x_*) - f(x) \ge (f'(x), x_* - x) \ge \min_{z \in \Omega}\,(f'(x), z - x) = \eta(x).$$

Thus $0 \le f(x) - f_* \le -\eta(x)$. Therefore for all $k$

$$0 \le f(x_k) - f_* \le -\eta(x_k). \tag{3.10}$$
It follows from lemma 3.1 that $\eta(x_k) \to 0$. Therefore the last inequality shows that $f(x_k) \to f_*$, and this proves the first part of the theorem.
Combining (3.9) and (3.10) we obtain

$$(f(x_{k+1}) - f_*) - (f(x_k) - f_*) \le -\frac{1}{8LC^2}\,[f(x_k) - f_*]^2.$$

With the notation $\varphi_k = f(x_k) - f_*$ we obtain

$$\varphi_{k+1} \le \varphi_k \left(1 - \frac{\varphi_k}{8LC^2}\right),$$

or

$$\varphi_{k+1} \le \varphi_k\,(1 - \varkappa \varphi_k), \qquad \varkappa = \frac{1}{8LC^2}.$$

Taking $\gamma_k = k \varphi_k$, we obtain

$$\frac{\gamma_{k+1}}{k + 1} \le \frac{\gamma_k}{k}\left(1 - \frac{\varkappa \gamma_k}{k}\right),$$
or
$$\frac{\gamma_{k+1}}{\gamma_k} \le 1 + \frac{1}{k} - \frac{\varkappa (k + 1)\,\gamma_k}{k^2}. \tag{3.11}$$
With each $k$ there are two possible cases:

(1) $\gamma_{k+1} / \gamma_k \le 1$, i.e. $\gamma_{k+1} \le \gamma_k$.

(2) $\gamma_{k+1} / \gamma_k > 1$.

Then $\dfrac{1}{k} - \dfrac{\varkappa (k + 1)\,\gamma_k}{k^2} > 0$, i.e.

$$\gamma_k < \frac{1}{\varkappa}\,\frac{k}{k + 1} < \frac{1}{\varkappa}.$$
Further, from (3.11) we obtain that in this case
$$\gamma_{k+1} \le \left(1 + \frac{1}{k}\right)\gamma_k < \frac{2}{\varkappa}$$

with $k \ge 1$. Now only two situations are possible.
(1) There is only a finite number of indices $k$ for which $\gamma_k < 1/\varkappa$. Then, due to the above statements, for all large $k$ the sequence $\{\gamma_k\}$ is monotonically non-increasing, i.e. remains bounded.
(2) There is an infinite number of indices $k$ for which $\gamma_k < 1/\varkappa$. We shall denote the set of such indices $k$ by $\mathcal{K}$, so that $\gamma_k < 1/\varkappa$ for $k \in \mathcal{K}$. Let now $j \notin \mathcal{K}$. Then there will be two indices $k_1, k_2 \in \mathcal{K}$ such that $k_1 < j < k_2$ and $k \notin \mathcal{K}$ for all $k_1 < k < k_2$. Then
$$\gamma_{k_1 + 1} < \frac{2}{\varkappa}$$
and $\gamma_{i+1} \le \gamma_i$ for all $i = k_1 + 1, \ldots, k_2 - 1$. Therefore
$$\gamma_j \le \gamma_{k_1 + 1} < \frac{2}{\varkappa}.$$
This shows that in this case as well the sequence is bounded by a certain constant $C$. It follows that
$$f(x_k) - f_* = \varphi_k = \frac{\gamma_k}{k} \le \frac{C}{k},$$
and this completes the proof of the theorem.

The estimate obtained shows that the algorithm does not converge very rapidly. But we have obtained only an upper bound, and it may seem that in reality the algorithm converges at a faster rate. Unfortunately this is not so in general. As shown by M. D. Canon and C. D. Cullum, the bounds obtained are precise if the function being minimized is convex on a polyhedron.
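The $1/k$ behaviour can be observed directly on the recurrence $\varphi_{k+1} \le \varphi_k (1 - \varkappa \varphi_k)$ used in the proof of theorem 3.1, taken with equality as the worst case. The following is a small numerical sketch; the values of `kappa` and of the starting `phi` are arbitrary assumptions, chosen so that $\varkappa \varphi_0 < 1$.

```python
kappa = 0.5            # plays the role of 1 / (8 L C^2); value chosen arbitrarily
phi = 1.0              # phi_0, chosen so that kappa * phi_0 < 1
scaled = []            # gamma_k = k * phi_k for k = 1, 2, ...
for k in range(1, 2001):
    phi = phi * (1.0 - kappa * phi)    # worst case: equality in the recurrence
    scaled.append(k * phi)
worst = max(scaled)    # largest observed value of k * phi_k
```

Since $1/\varphi_{k+1} \ge 1/\varphi_k + \varkappa$ along this recurrence, the products $k \varphi_k$ stay below $1/\varkappa$, in agreement with an estimate of the form $\varphi_k \le C/k$.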

Estimate of Convergence for a Strongly Convex Region
Let region $\Omega$ be strongly convex, i.e. there is a number $\delta > 0$ such that for any $x, y \in \Omega$ the points $\dfrac{x + y}{2} + w$ belong to $\Omega$ for all $w$ such that $\|w\| \le \delta\,\|x - y\|^2$. Then

$$\begin{aligned}
\eta(x) = \min_{z \in \Omega}\,(f'(x), z - x) &\le \min_{\|w\| \le \delta \|z(x) - x\|^2} \left(f'(x),\ \frac{z(x) + x}{2} + w - x\right) \\
&\le \frac{1}{2}\,(f'(x), z(x) - x) - \delta\,\|z(x) - x\|^2\,\|f'(x)\|.
\end{aligned}$$

Hence, since $(f'(x), z(x) - x) = \eta(x)$,

$$-\frac{1}{2}\,\eta(x) \ge \delta\,\|f'(x)\|\,\|z(x) - x\|^2,$$

or

$$-\frac{\eta(x)}{\|p(x)\|^2} \ge 2\delta\,\|f'(x)\|. \tag{3.12}$$
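As a concrete instance of the definition of a strongly convex region used above, consider the Euclidean unit ball $\Omega = \{x : \|x\| \le 1\}$; this example and the value $\delta = 1/8$ are supplied here for illustration. For $x, y \in \Omega$ the parallelogram identity gives

$$\left\|\frac{x + y}{2}\right\|^2 = \frac{\|x\|^2 + \|y\|^2}{2} - \frac{\|x - y\|^2}{4} \le 1 - \frac{\|x - y\|^2}{4},$$

and since $\sqrt{1 - t} \le 1 - t/2$ for $0 \le t \le 1$,

$$\left\|\frac{x + y}{2}\right\| \le 1 - \frac{\|x - y\|^2}{8}.$$

Hence $\left\|\dfrac{x + y}{2} + w\right\| \le 1$ whenever $\|w\| \le \dfrac{1}{8}\,\|x - y\|^2$, so the unit ball satisfies the definition with $\delta = 1/8$.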
Theorem 3.2. If $f(x)$ is a convex function, region $\Omega$ is strongly convex and $\|f'(x)\| \ge \varepsilon_0 > 0$ for all $x \in \Omega$, then the method of conditional gradient converges at the rate of a geometric progression, i.e. $\|x_k - x_*\| \le C q_0^k$, $q_0 < 1$.
Proof. From (3.7) and (3.8) we obtain
$$\varphi_k - \varphi_{k+1} = f(x_k) - f(x_{k+1}) \ge \frac{1}{8L}\,\frac{\eta^2(x_k)}{\|p(x_k)\|^2}.$$

Using (3.12) and (3.10), we obtain

$$\varphi_k - \varphi_{k+1} \ge \frac{\delta \varepsilon_0}{4L}\,(-\eta(x_k)) \ge \frac{\delta \varepsilon_0}{4L}\,\varphi_k,$$

i.e.

$$\varphi_{k+1} \le \left(1 - \frac{\delta \varepsilon_0}{4L}\right)\varphi_k.$$
Therefore

$$\varphi_m \le q^m \varphi_0, \qquad q = 1 - \frac{\delta \varepsilon_0}{4L}.$$

Because of the necessary and sufficient conditions for a minimum, we have
$$(f'(x_*), x - x_*) \ge 0, \quad x \in \Omega.$$
Therefore, for all $w$ with $\|w\| \le \delta\,\|x - x_*\|^2$,
$$\left(f'(x_*),\ \frac{x + x_*}{2} + w - x_*\right) \ge 0.$$
Hence

$$\frac{1}{2}\,(f'(x_*), x - x_*) \ge \delta\,\|x - x_*\|^2\,\|f'(x_*)\|.$$

But $\|f'(x_*)\| \ge \varepsilon_0$, and since $f(x)$ is convex, we have

$$f(x) - f(x_*) \ge (f'(x_*), x - x_*).$$
Finally, we obtain
$$f(x) - f(x_*) \ge 2\delta \varepsilon_0\,\|x - x_*\|^2. \tag{3.13}$$
Hence

$$\|x_m - x_*\|^2 \le \frac{\varphi_m}{2\delta \varepsilon_0} \le \frac{q^m \varphi_0}{2\delta \varepsilon_0}.$$
With the notation
$$C = \left(\frac{\varphi_0}{2\delta \varepsilon_0}\right)^{1/2}, \qquad q_0 = \left(1 - \frac{\delta \varepsilon_0}{4L}\right)^{1/2}$$
we obtain
$$\|x_m - x_*\| \le C q_0^m.$$
Q.E.D.

Newton's Method with Step Adjustment
We now consider the problem of the minimization of a convex smooth function $f(x)$ in a (convex, compact) set $\Omega$.
For solving this problem the iterative process
$$x_{k+1} = x_k + \alpha_k p_k, \quad \alpha_k > 0, \tag{3.14}$$
can be used, in which the direction of motion is $p_k = \bar{x}_k - x_k$, where $\bar{x}_k$ is the solution of the problem of minimization in the set $\Omega$ of the quadratic function

$$\psi_k(x) = (f'(x_k), x - x_k) + \tfrac{1}{2}\,(f''(x_k)(x - x_k), x - x_k),$$

and as $\alpha_k$ we take the maximum value of the parameter $\alpha$ obtained by reducing the initial $\alpha = 1$ until the parameter satisfies the inequality
$$f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha\,\psi_k(\bar{x}_k), \quad 0 < \varepsilon < 1. \tag{3.15}$$

The parameter $\alpha_k$ can also be chosen by other methods analogous to those described in Sec. 2, Chap. II (methods (2.2), (2.3)).
We shall ascertain below that the rate of convergence of Newton's method under definite conditions is either superlinear or quadratic. Consequently, if the problem of the minimization of the function $\psi_k(x)$, $x \in \Omega$, can be solved easily enough, then Newton's method proves very effective.

Properties of Newton's Method
Theorem 3.3. If for the minimization of a convex twice continuously
differentiable function $f(x)$ in a convex closed bounded set $Q$ we use
method (3.14) in which $\alpha_k$ and $p_k$ are determined as described above,
then (whatever initial approximation $x_0 \in Q$ is chosen):
(1) $f(x_k)$ decreases monotonically,
(2) $\lim_{k \to \infty} f(x_k) = f(x_*) = \min_{x \in Q} f(x)$.

Proof. There is a minimum point $\bar{x}_k$ (possibly not unique) of the
continuous function $\psi_k(x)$ in compact set $Q$ (Weierstrass' theorem).
With any $k$ point $x_{k+1} \in Q$ since $x_{k+1} = x_k + \alpha_k(\bar{x}_k - x_k) = \alpha_k \bar{x}_k + (1 - \alpha_k) x_k$
and $\alpha_k \in [0, 1]$. Since function $\psi_k(x)$ is convex, we
have $\psi_k(x_{k+1}) = \psi_k(\alpha_k \bar{x}_k + (1 - \alpha_k) x_k) \le \alpha_k \psi_k(\bar{x}_k) + (1 - \alpha_k)\psi_k(x_k)$.
But $\psi_k(x_k) = 0$, therefore
$\psi_k(x_{k+1}) \le \alpha_k \psi_k(\bar{x}_k)$. (3.16)

Now making use of Taylor's formula and of (3.16), we have:

$f(x_{k+1}) - f(x_k) = \psi_k(x_{k+1}) + \frac{\alpha_k^2}{2}(F_k p_k, p_k) \le \alpha_k \psi_k(\bar{x}_k)\left(1 + \frac{\alpha_k \|F_k\| \|p_k\|^2}{2\psi_k(\bar{x}_k)}\right)$ (3.17)

where $F_k = f''(x_{kc}) - f''(x_k)$, $x_{kc} = x_k + \theta(x_{k+1} - x_k)$, $\theta \in [0, 1]$.
It follows that if $\psi_k(\bar{x}_k) \ne 0$ (in this case $\psi_k(\bar{x}_k) < \psi_k(x_k) = 0$),
then with a certain $\alpha_k > 0$ the inequality

$1 + \frac{\alpha_k \|F_k\| \|p_k\|^2}{2\psi_k(\bar{x}_k)} \ge \varepsilon$ (3.18)

holds. But at the same time the inequality (3.15) holds as well and
this proves that the described method of choosing $\alpha_k$ may be applied.
It follows from (3.15) that $f(x_{k+1}) \le f(x_k)$. We show now that
$\psi_k(\bar{x}_k) \to 0$ as $k \to \infty$. In a closed bounded set $Q$ continuous
function $f''(x)$ has an upper bound: $\|f''(x)\| \le M$. Consequently
$\|F_k\| \le 2M$, and vector $p_k$ has an upper bound too:


$\|p_k\| \le \max_{x, y \in Q} \|x - y\| = d$. Suppose that with any $k$
we have $\psi_k(\bar{x}_k) \le -\beta < 0$. Then

$1 + \frac{\alpha_k \|F_k\| \|p_k\|^2}{2\psi_k(\bar{x}_k)} \ge 1 - \frac{\alpha_k M d^2}{\beta}$

and, hence, inequality (3.18) (and therefore (3.15) too) is always
satisfied even with $\alpha_k = (1 - \varepsilon)\beta / (M d^2) = C > 0$. But it follows from
(3.15) that at the same time $f(x_{k+1}) - f(x_k) \le -\varepsilon C \beta$ with any $k$
and this contradicts the fact that in compact set $Q$ function $f(x)$ has
a lower bound.
Thus the condition $\psi_k(\bar{x}_k) \le -\beta$ with any $k$ cannot be fulfilled,
i.e. in any case as $k \to \infty$ the condition $\psi_k(\bar{x}_k) \to 0$ must be fulfilled.
This means that at any limit point of sequence (3.14) the necessary
(and sufficient, because of the convexity of $f(x)$) condition for a
minimum of function $f(x)$ in set $Q$ (see Chap. I, Sec. 4) is fulfilled.
Taking this into account, the last assertion of the theorem can be
proved as in theorem 3.1.
The theorem just proved shows that in the problem under consideration
Newton's method can be applied to minimize functions that are merely
convex, since set $Q$ is bounded; this is in contrast to unconstrained
minimization problems, where Newton's method can be applied only to
strongly convex functions. However, the application of
Newton's method to the minimization of strongly convex functions
is of the greatest interest, for just in this case the method converges
to the solution at a fast rate.
Theorem 3.4. If in addition to the conditions of theorem 3.3, function
$f(x)$ is strongly convex, i.e.
$m\|y\|^2 \le (f''(x)y, y) \le M\|y\|^2$, $m > 0$, $x \in Q$, $y \in E^n$, (3.19)
then sequence (3.14) converges to the solution at a superlinear rate (i.e.
the estimate (2.5), Chap. II, holds).
Proof. The existence and uniqueness of the solution of the problem
under consideration follow from the general results of Sec. 3, Chap. I.
At point $\bar{x}_k$ the necessary condition for a minimum of function $\psi_k(x)$
in set $Q$ (Sec. 4 of Chap. I)
$(\psi_k'(\bar{x}_k), x_k - \bar{x}_k) \ge 0$
is fulfilled, i.e.
$(f'(x_k), x_k - \bar{x}_k) + (f''(x_k)(\bar{x}_k - x_k), x_k - \bar{x}_k) \ge 0$.
Hence
$(f'(x_k), p_k) \le -(f''(x_k)p_k, p_k)$. (3.20)


Taking into account this and the left-hand estimate (3.19), we find
that

$\psi_k(\bar{x}_k) \le -\frac{m}{2}\|p_k\|^2$. (3.21)

Using this estimate we obtain from (3.17)

$f(x_{k+1}) - f(x_k) \le \alpha_k \psi_k(\bar{x}_k)\left(1 - \frac{\alpha_k \|F_k\|}{m}\right)$. (3.22)

Since $\psi_k(\bar{x}_k) \to 0$ (theorem 3.3), it follows from (3.21) that $\|p_k\| \to 0$
as $k \to \infty$. Hence, because of the uniform continuity of the
second derivative $f''(x)$ on set $Q$, we have that $\|F_k\| \to 0$. But (3.22)
implies that from a certain $k = N_1(\varepsilon)$ on, inequality (3.15) will
be satisfied with $\alpha_k = 1$, i.e. method (3.14) is transformed into the
usual Newton's method with a unit step. With $k \ge N_1(\varepsilon)$, taking
into account the convexity of $\psi_k(x)$, we obtain

$\psi_k(\bar{x}_k) = \psi_k(x_{k+1}) \ge (f'(x_k), x_{k+1} - x_k)$
$= (f'(x_{k-1}), x_{k+1} - x_k) + (f'(x_k) - f'(x_{k-1}), x_{k+1} - x_k)$.

Transforming the last scalar product by means of Lagrange's formula
for operators (Chap. I, Sec. 5) we obtain

$\psi_k(\bar{x}_k) \ge (f'(x_{k-1}) + f''(x_{k-1})(x_k - x_{k-1}), x_{k+1} - x_k) + (\Phi_k(x_k - x_{k-1}), x_{k+1} - x_k)$ (3.23)

where
$\Phi_k = f''(x_{k-1} + \theta(x_k - x_{k-1})) - f''(x_{k-1})$, $\theta \in [0, 1]$.
Note that $(f'(x_{k-1}) + f''(x_{k-1})(x_k - x_{k-1}), x_{k+1} - x_k) = (\psi_{k-1}'(x_k), x_{k+1} - x_k)$.
Since $\psi_{k-1}(x_k) = \min_{x \in Q} \psi_{k-1}(x)$, we shall have with any
$x \in Q$ that $(\psi_{k-1}'(x_k), x - x_k) \ge 0$ (the necessary condition for a
minimum). Consequently, $(\psi_{k-1}'(x_k), x_{k+1} - x_k) \ge 0$ holds and therefore
it follows from (3.23) that
$-\psi_k(\bar{x}_k) \le \|\Phi_k\| \|x_k - x_{k-1}\| \|x_k - x_{k+1}\| = \|\Phi_k\| \|p_{k-1}\| \|p_k\|$. (3.24)
Comparing estimates (3.21) and (3.24), we obtain

$\|x_{k+1} - x_k\| \le \frac{2\|\Phi_k\|}{m} \|x_k - x_{k-1}\|$. (3.25)

Because of the uniform continuity of $f''(x)$ on set $Q$, we have
$\|\Phi_k\| \to 0$. Consequently, there is a number $N(\varepsilon)$ such that with
$k \ge N(\varepsilon)$ we find $\lambda_k = 2\|\Phi_k\|/m < 1$. Let us take $\|x_N - x_{N-1}\| = C_1$,


$C = C_1/(1 - \lambda)$, where $\lambda = \sup_{k \ge N} \lambda_k < 1$. Then for $i > N + l$

$\|x_i - x_{N+l}\| \le \sum_{k=N+l}^{i-1} \|x_{k+1} - x_k\| \le C_1 \lambda_N \cdots \lambda_{N+l}\,(1 + \lambda_{N+l+1} + \lambda_{N+l+1}\lambda_{N+l+2} + \cdots) \le C \lambda_N \cdots \lambda_{N+l}$.

Consequently, as $i, l \to \infty$, $\|x_i - x_{N+l}\| \to 0$,
i.e. sequence $\{x_k\}$ is fundamental and, because of the completeness
of space $E^n$, has a limit $x_* \in Q$, and
$\|x_{N+l} - x_*\| \le C \lambda_N \lambda_{N+1} \cdots \lambda_{N+l}$. (3.26)
By theorem 3.3,
$\lim_{k \to \infty} f(x_k) = f(x_*) = \min_{x \in Q} f(x)$.

Thus sequence (3.14) converges to the solution and the rate of
convergence, as can be seen from estimate (3.26), is superlinear. The
theorem is proved.
If the second derivative $f''(x)$ on set $Q$ satisfies Lipschitz' condition
with constant $R$, then inequality (3.25) takes the form

$\|x_{k+1} - x_k\| \le \frac{2R}{m}\|x_k - x_{k-1}\|^2$. (3.27)

We shall use the notation $\rho_k = \frac{2R}{m}\|x_{k+1} - x_k\|$. Since $\|x_{k+1} - x_k\| \to 0$,
there is a number $L(\varepsilon)$ such that with $k \ge L(\varepsilon)$ we have
$\rho_k < 1$. Taking into account (3.27) with $k > L$, we have

$\rho_k \le \rho_{k-1}^2 \le \cdots \le \rho_L^{2^{k-L}}$.

Consequently, for any $i > L + l$, $l = 0, 1, \ldots$,

$\|x_i - x_{L+l}\| \le \sum_{k=L+l}^{i-1} \|x_{k+1} - x_k\| = \frac{m}{2R}\sum_{k=L+l}^{i-1} \rho_k \le \frac{m}{2R}\sum_{s=l}^{\infty} \rho_L^{2^s}$.

Since $x_k \to x_*$, we have $\|x_{L+l} - x_*\| = \lim_{i \to \infty}\|x_{L+l} - x_i\|$, i.e.

$\|x_{L+l} - x_*\| \le \frac{m}{2R}\sum_{s=l}^{\infty} \rho_L^{2^s}$.

This estimate can be given the form

$\|x_{L+l} - x_*\| \le C\rho_L^{2^l}$, $C < \infty$

(taking into account that the series $\sum_{s=l}^{\infty} \rho_L^{2^s}$ converges). The estimate
obtained means that the following theorem is valid.
Theorem 3.5. If the conditions of theorem 3.4 are fulfilled and, besides,
matrix $f''(x)$ on set $Q$ satisfies Lipschitz' condition with constant


$R$, then sequence (3.14) (in which $\alpha_k$ and $p_k$ are chosen by the method
described above) converges to the solution at a quadratic rate.
We shall study now the properties of Newton's method when
$\alpha_k$ is chosen so that $f(x)$ attains its minimum value
in the direction of motion.
The arguments we used to estimate the rate of convergence of
Newton's method in unconstrained problems (Sec. 2, Chap. II) are
not suited to this case (since the right-hand estimate (1.11) of Chap. II
does not hold).
Lemma 3.3. If function $f(x)$ for which conditions (3.19) are fulfilled
is being minimized and $\alpha_k$ in method (3.14) is chosen under the condition
$f(x_k + \alpha_k p_k) = \min_{0 \le \alpha \le 1} f(x_k + \alpha p_k)$, (3.28)
then $x_k \to x_*$ and $\alpha_k \to 1$ as $k \to \infty$.
Proof. By Taylor's formula

$f(x_{k+1}) - f(x_k) = \alpha_k (f'(x_k), p_k) + \frac{\alpha_k^2}{2}(f''(x_{kc})p_k, p_k)$.

With the value of $\alpha_k$ satisfying (3.28), the right-hand side of the
last equality considered as a function of the variable $\alpha$ must attain
its minimum. Therefore, it can be easily ascertained, taking into
account estimates (3.19) and (3.20), that

$\alpha_k \ge -\frac{(f'(x_k), p_k)}{M\|p_k\|^2} \ge \frac{m\|p_k\|^2}{M\|p_k\|^2} = \frac{m}{M}$.

Thus $\alpha_k \ge C > 0$; therefore in the same way as in theorem 3.3
it can be shown that $\psi_k(\bar{x}_k) \to 0$, i.e. sequence (3.14) in which $\alpha_k$
is chosen under condition (3.28) converges to the solution. At the
same time $\|p_k\| \to 0$ and $\|F_k\| \to 0$ (theorem 3.4).
We shall demonstrate that $\alpha_k \to 1$:

$f(x_{k+1}) = f(x_k) + \psi_k(x_{k+1}) + \frac{\alpha_k^2}{2}(F_k p_k, p_k)$
$= f(x_k) + \psi_k(\bar{x}_k) + (\psi_k'(\bar{x}_k), x_{k+1} - \bar{x}_k) + \tfrac{1}{2}(\psi_k''(\bar{x}_k)(x_{k+1} - \bar{x}_k), x_{k+1} - \bar{x}_k) + \frac{\alpha_k^2}{2}(F_k p_k, p_k)$.

Taking into account that $x_{k+1} - \bar{x}_k = (\alpha_k - 1)p_k$ we obtain:

$f(x_{k+1}) = f(x_k) + \psi_k(\bar{x}_k) + (\psi_k'(\bar{x}_k), x_{k+1} - \bar{x}_k) + \frac{(1 - \alpha_k)^2}{2}(\psi_k''(\bar{x}_k)p_k, p_k) + \frac{\alpha_k^2}{2}(F_k p_k, p_k)$.


Note $(\psi_k'(\bar{x}_k), x_{k+1} - \bar{x}_k) \ge 0$, $(\psi_k''(\bar{x}_k)p_k, p_k) = (f''(x_k)p_k, p_k) \ge m\|p_k\|^2$.
At the same time $(F_k p_k, p_k) = o(\|p_k\|^2)$
(since $\|F_k\| \to 0$ as $\|p_k\| \to 0$). Consequently, the minimum of
the difference $f(x_{k+1}) - f(x_k) - \psi_k(\bar{x}_k)$ can be attained only as $\alpha_k \to 1$;
otherwise, with any $k$ we should have $1 - \alpha_k \ge \beta > 0$ and at the
same time $f(x_{k+1}) - f(x_k) - \psi_k(\bar{x}_k) = O(\|p_k\|^2) > 0$, whereas with $\alpha = 1$
the difference $f(\bar{x}_k) - f(x_k) - \psi_k(\bar{x}_k) = \tfrac{1}{2}(F_k p_k, p_k) = o(\|p_k\|^2)$, i.e.
with a sufficiently great $k$ we should always have $f(\bar{x}_k) < f(x_{k+1})$;
this contradicts the condition for choosing $\alpha_k$. The lemma is proved.
Theorem 3.6. If function $f(x)$ satisfies the requirements of theorem
3.4 and in method (3.14) parameter $\alpha_k$ is chosen under the condition
(3.28), then $x_k \to x_*$ at a superlinear rate.
Proof. By estimates (3.16) and (3.21),
$\psi_k(x_{k+1}) \le \alpha_k \psi_k(\bar{x}_k) \le -\frac{\alpha_k m}{2}\|p_k\|^2$,
i.e.

$\psi_k(x_{k+1}) \le -\frac{m}{2\alpha_k}\|x_{k+1} - x_k\|^2$. (3.29)

On the other hand, $\psi_k(x_{k+1}) \ge (f'(x_k), x_{k+1} - x_k) = (f'(x_k), x_{k+1} - \bar{x}_{k-1}) + (f'(x_k), \bar{x}_{k-1} - x_k)$.
Since at point $x_k$ the minimum of $f(x)$ is attained in the direction $\bar{x}_{k-1} - x_{k-1}$, we
have $(f'(x_k), \bar{x}_{k-1} - x_k) \ge 0$. Hence, we have

$\psi_k(x_{k+1}) \ge (f'(x_k), x_{k+1} - \bar{x}_{k-1})$
$= (f'(x_{k-1}), x_{k+1} - \bar{x}_{k-1}) + (f'(x_k) - f'(x_{k-1}), x_{k+1} - \bar{x}_{k-1})$.

Expressing the last term on the right-hand side by means of Lagrange's
formula for operators, we obtain after some transformation

$\psi_k(x_{k+1}) \ge (f'(x_{k-1}) + f''(x_{k-1})(x_k - x_{k-1}), x_{k+1} - \bar{x}_{k-1}) + (\Phi_k(x_k - x_{k-1}), x_{k+1} - \bar{x}_{k-1})$

where $\Phi_k = f''(x_{k-1} + \theta(x_k - x_{k-1})) - f''(x_{k-1})$, $\theta \in [0, 1]$. Taking
into account that $x_k - x_{k-1} = \alpha_{k-1}(\bar{x}_{k-1} - x_{k-1})$, we find that

$\psi_k(x_{k+1}) \ge (f'(x_{k-1}) + f''(x_{k-1})(\bar{x}_{k-1} - x_{k-1}), x_{k+1} - \bar{x}_{k-1})$
$+ ((\alpha_{k-1} - 1) f''(x_{k-1})(\bar{x}_{k-1} - x_{k-1}) + \Phi_k(x_k - x_{k-1}), x_{k+1} - \bar{x}_{k-1})$.

Since
$\psi_{k-1}(\bar{x}_{k-1}) = \min_{x \in Q} \psi_{k-1}(x)$,


we have $(\psi_{k-1}'(\bar{x}_{k-1}), \bar{x}_{k-1} - x) \le 0$ for all $x \in Q$ (the necessary
condition for a minimum). Consequently

$(\psi_{k-1}'(\bar{x}_{k-1}), \bar{x}_{k-1} - x_{k+1}) = (f'(x_{k-1}) + f''(x_{k-1})(\bar{x}_{k-1} - x_{k-1}), \bar{x}_{k-1} - x_{k+1}) \le 0$.

Taking this into account we see that the following estimate holds:

$\psi_k(x_{k+1}) \ge \left(\left\{\frac{\alpha_{k-1} - 1}{\alpha_{k-1}} f''(x_{k-1}) + \Phi_k\right\}(x_k - x_{k-1}), x_{k+1} - \bar{x}_{k-1}\right)$.

Hence, with the notation $\left\|\frac{\alpha_{k-1} - 1}{\alpha_{k-1}} f''(x_{k-1}) + \Phi_k\right\| = b_k$,
we obtain
$-\psi_k(x_{k+1}) \le b_k \|x_k - x_{k-1}\| \|x_{k+1} - \bar{x}_{k-1}\| \le b_k \|x_k - x_{k-1}\| (\|x_{k+1} - x_k\| + \|\bar{x}_{k-1} - x_k\|)$.
Taking now into account that

$\bar{x}_{k-1} - x_k = \frac{1 - \alpha_{k-1}}{\alpha_{k-1}}(x_k - x_{k-1})$

and denoting $\frac{1 - \alpha_{k-1}}{\alpha_{k-1}} b_k = c_k$, we obtain
$-\psi_k(x_{k+1}) \le b_k \|x_k - x_{k-1}\| \|x_{k+1} - x_k\| + c_k \|x_k - x_{k-1}\|^2$.
Since $\alpha_k \to 1$, $\|p_k\| \to 0$ (lemma 3.3), we have $b_k \to 0$, $c_k \to 0$.
Comparing the estimate obtained with (3.29) we establish that
$\|x_{k+1} - x_k\|^2 \le l_k \|x_k - x_{k-1}\| \|x_{k+1} - x_k\| + \rho_k \|x_k - x_{k-1}\|^2$
where
$l_k = \frac{2b_k}{m}$, $\rho_k = \frac{2c_k}{m}$.
Finally, having solved the quadratic inequality obtained for
$\|x_{k+1} - x_k\|$, we find that
$\|x_{k+1} - x_k\| \le \mu_k \|x_k - x_{k-1}\|$
where

$\mu_k = \frac{l_k}{2} + \sqrt{\frac{l_k^2}{4} + \rho_k} \to 0$ as $k \to \infty$.

The remaining part of the proof is performed just as in theorem 3.4.
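The step of solving the quadratic inequality can be written out as follows. This is only a sketch of the omitted algebra; the shorthand $t = \|x_{k+1} - x_k\|$ and $s = \|x_k - x_{k-1}\|$ is introduced here for illustration and is not the book's notation.

```latex
t^2 \le l_k\, s\, t + \rho_k s^2
\;\Longleftrightarrow\;
\left(t - \tfrac{l_k s}{2}\right)^2 \le \left(\tfrac{l_k^2}{4} + \rho_k\right) s^2
\;\Longrightarrow\;
t \le \left(\tfrac{l_k}{2} + \sqrt{\tfrac{l_k^2}{4} + \rho_k}\,\right) s = \mu_k s .
```

Since $l_k \to 0$ and $\rho_k \to 0$, the factor $\mu_k$ tends to zero, which is what the superlinear rate requires.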

4. CUTTING HYPERPLANE METHOD
The cutting hyperplane method is meant for solving problems of
convex programming. The basic idea of the method is that the
admissible domain is approximated by a certain polyhedron which


diminishes from one iteration step to the next, giving a better and
better approximation to the admissible region about the solution.
The method is applied to the problem of convex programming
formulated as follows: to minimize $f_0(x) = (c, x)$ with the constraint
$f(x) \le 0$ (4.1)
where $f(x)$ is a continuous convex function.
The fact that function $f_0(x)$ to be minimized is linear and the
constraint (4.1) consists of only one inequality is of no great
consequence, since if the region is defined by several inequalities
$f_i(x) \le 0$, $i = 1, \ldots, m$ (4.2)
with convex functions $f_i(x)$, then the system of inequalities can be
rewritten in the form (4.1) setting
$f(x) = \max_{1 \le i \le m} f_i(x)$.
If convex function $f_0(x)$ is nonlinear, then by introducing an
additional coordinate $x^{n+1}$ and adding the inequality
$f_{m+1}(x, x^{n+1}) = f_0(x) - x^{n+1} \le 0$
to the system (4.2), we can reduce the problem to the minimization
of the linear function $x^{n+1}$ with constraints (4.2). Therefore, we
shall study the problem in the form (4.1). Before going on to the
description of the algorithm we remind the reader that vector $a$
is a support vector to $f(x)$ at point $x_0$ if $f(x) \ge f(x_0) + (a, x - x_0)$
for all $x$. It follows from the results of Sec. 2, Chap. I that for a
continuous convex function the set of such vectors is nonempty at
any point of space.

Algorithm
Let
$Q = \{x: f(x) \le 0\}$
be a nonempty admissible region. Suppose also that $Q$ is compact and
vectors $a_k$, $k = -l, -(l - 1), \ldots, -1, 0$ and numbers
$b_k$ are known and that region
$S = \{x: (a_k, x) - b_k \le 0, k = -l, \ldots, 0\}$
is compact and contains $Q$.
For $k \ge 0$ the successive approximations are determined by the
following rule. We set $S_0 = S$. If $S_k$ has been constructed, then $x_k$
is any solution of the problem of linear programming: to minimize
$f_0(x) = (c, x)$ with $x \in S_k$. The next region $S_{k+1}$ is constructed by


the following rule:
$S_{k+1} = \{x: (a_{k+1}, x) - b_{k+1} \le 0\} \cap S_k$ (4.3)
where $a_{k+1}$ is a support vector to $f(x)$ at point $x_k$ and
$b_{k+1} = (a_{k+1}, x_k) - f(x_k)$. (4.4)
It follows from (4.3) that $S_{k+1} \subset S_k$ and for $k \ge 1$
$S_k = \{x: (a_j, x) - b_j \le 0, j = -l, \ldots, -1, 0, \ldots, k\}$. (4.5)
Lemma 4.1. For all $k \ge 1$, $Q \subset S_k$.
Proof. Let $x \in Q$, i.e. $f(x) \le 0$. Then
$0 \ge f(x) \ge f(x_{j-1}) + (a_j, x - x_{j-1}) = (a_j, x) - b_j$
and, consequently, $(a_j, x) - b_j \le 0$, $j = 1, \ldots, k$. With $j \le 0$
the last inequalities hold by virtue of choosing $a_j$ and $b_j$ for $j \le 0$.
The lemma is proved.
It follows directly from lemma 4.1 that
$f_0(x_0) \le f_0(x_1) \le \ldots \le f_0(x_k) \le f_0(x_{k+1}) \le \ldots$.
On the other hand, if $x_*$ is the minimum point of $f_0(x)$ in $Q$, then
$f_0(x_k) \le f_0(x_*)$ since $S_k \supset Q$.
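As an illustration of rules (4.3) and (4.4), the following sketch runs the method in one dimension, where each region $S_k$ is an interval and the linear programming step reduces to picking an endpoint. The function name cutting_plane_1d, the tolerance tol and the test problem (minimize $-x$ subject to $x^2 - 2 \le 0$) are assumptions made for this example only; in higher dimensions the endpoint rule would have to be replaced by a general linear programming solver.

```python
def cutting_plane_1d(f, df, c, lo, hi, tol=1e-10, max_iter=100):
    """Sketch of the cutting hyperplane method on the real line.

    Minimizes c*x subject to f(x) <= 0, with f convex and differentiable,
    starting from the interval S_0 = [lo, hi], which must contain Q.
    """
    xs = []
    for _ in range(max_iter):
        # Linear programming step: minimize c*x over the interval S_k.
        x = lo if c > 0 else hi
        xs.append(x)
        fx = f(x)
        if fx <= tol:          # x is (numerically) admissible, hence optimal
            break
        a = df(x)              # support vector a_{k+1} of f at x_k
        if a == 0:
            raise ValueError("f(x) > 0 at a minimum of f: region Q is empty")
        b = a * x - fx         # b_{k+1} = (a_{k+1}, x_k) - f(x_k)
        # Cut (4.3): keep only the points with a*x - b <= 0.
        if a > 0:
            hi = min(hi, b / a)
        else:
            lo = max(lo, b / a)
    return xs


# Hypothetical test problem: minimize -x subject to x^2 - 2 <= 0 on S_0 = [-3, 3].
xs = cutting_plane_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, c=-1.0, lo=-3.0, hi=3.0)
```

In this example every iterate lies outside $Q$ or on its boundary and the values $f_0(x_k) = -x_k$ do not decrease, in agreement with the chain of inequalities above.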
Theorem 4.1. Let $f(x)$ be a continuous convex function, region $Q$
be compact and there be a number $K$ such that with each $x \in S$ vector
$a$ which is a support vector to $f(x)$ at point $x$ satisfies the inequality
$\|a\| \le K$. Then any limit point of sequence $\{x_k\}$, $k = 0, 1, \ldots$,
is a solution of problem (4.1) and $f(x_k) \to 0$.
Proof. Since $S_0 = S$, $S_k \supset S_{k+1}$, the whole sequence $\{x_k\}$ belongs
to the compact set $S$. Therefore there are always limit points of this
sequence.
Note now that if $f(x_k) \le 0$ for a certain $k$, then $x_k \in Q$ and,
consequently, $f_0(x_k) \ge f_0(x_*)$. However, as was shown, $f_0(x_k) \le f_0(x_*)$.
Thus, $f_0(x_k) = f_0(x_*)$, i.e. $x_k$ is the solution of the
original problem.
Let now sequence $\{x_k\}$ be infinite and $f(x_k) > 0$ for all $k$. We
shall show that $f(x_k) \to 0$. Suppose that the contrary is true. Then
there is a number $r > 0$ and a subsequence of indices $k$ (we denote
it by $J$) such that $f(x_k) \ge r$, $k \in J$. Without loss of generality, we
can take that $x_k \to \bar{x}$, $k \in J$ since $\{x_k\}$ belongs to a compact set.
Let now $k$ and $j$ belong to $J$ and $k > j$. Then by construction,
point $x_k$ satisfies the inequality
$(a_{j+1}, x_k) - b_{j+1} = (a_{j+1}, x_k - x_j) + f(x_j) \le 0$,
hence
$f(x_j) \le (a_{j+1}, x_j - x_k) \le K\|x_j - x_k\|$.


But $\{x_k\}$, $k \in J$ converges to $\bar{x}$ and therefore $\|x_j - x_k\| \le r/(2K)$
for all sufficiently great $k$ and $j$ and so $f(x_j) \le r/2$ for great $j$, and
this contradicts the fact that $f(x_j) \ge r$, $j \in J$.
Thus we have shown that $f(x_k) \to 0$. Let now $\bar{x}$ be any limit point,
i.e. $x_k \to \bar{x}$, $k \in J$, where $J$ is a subsequence of indices. Then because
of the continuity of $f(x)$,

$f(\bar{x}) = \lim_{k \in J} f(x_k) = 0$,

i.e. $\bar{x} \in Q$. On the other hand, $f_0(x_k) \le f_0(x_*)$ and therefore
$f_0(\bar{x}) \le f_0(x_*)$; it follows directly that $f_0(\bar{x}) = f_0(x_*)$ and $\bar{x}$ is
also a solution of problem (4.1). The theorem is proved.

Computational Aspects
The algorithm of the cutting hyperplane method requires at each
step the solution of the problem of linear programming: to minimize
$f_0(x) = (c, x)$ with constraints
$(a_i, x) - b_i \le 0$, $i = -l, \ldots, k$. (4.6)
Thus the size of the problem being solved increases at every step.
The computer memory required for storing vectors $a_i$ also increases.
In order to simplify the solving of problem (4.6), it is expedient to
solve instead the dual problem which in this case takes the following
form: to maximize $-\sum_{i=-l}^{k} u^i b_i$ with constraints

$\sum_{i=-l}^{k} u^i a_i + c = 0$, $u^i \ge 0$, $i = -l, \ldots, k$.

In solving this problem by the simplex method the solution of
the preceding problem serves as the trial solution for the next one.
It is also a good approximation so that the solution of the new
problem is obtained after a small number of iterations.
At each step of the algorithm it is necessary to compute vector
$a_{k+1}$, a support vector to $f(x)$ at point $x_k$. Recall (see Chap. I) that
if $f(x)$ is a differentiable function, then $a_{k+1} = f'(x_k)$. If $f(x)$ is
the maximum of differentiable functions, i.e. $f(x) = \max_{1 \le i \le m} f_i(x)$,
then we can take as $a_{k+1}$ any vector of the form
$\sum_{i \in J(x_k)} \lambda_i f_i'(x_k)$, where $\lambda_i \ge 0$, $\sum_{i \in J(x_k)} \lambda_i = 1$,
$J(x_k) = \{i: f_i(x_k) = f(x_k), 1 \le i \le m\}$; in particular, we can take $a_{k+1} = f_i'(x_k)$,
where $i$ is any index from $J(x_k)$.
The foregoing follows from what was set forth in Sec. 2, Chap. I.
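The rule for choosing $a_{k+1}$ when $f$ is a maximum of differentiable functions can be sketched as follows. The helper name support_vector_of_max, the tolerance used to detect active indices and the two sample functions are hypothetical; the sketch simply returns the gradient of one active function, which is the particular choice mentioned above.

```python
def support_vector_of_max(fs, grads, x, tol=1e-12):
    """Return f(x) = max_i f_i(x), a support vector of f at x, and the active set J(x)."""
    values = [fi(x) for fi in fs]
    f_max = max(values)
    # Active indices J(x): functions attaining the maximum (up to a tolerance).
    active = [i for i, v in enumerate(values) if v >= f_max - tol]
    i = active[0]                      # any index from J(x) may be used
    return f_max, grads[i](x), active


# Sample data (assumed for illustration): f_1 is a disc constraint, f_2 a half-plane.
fs = [lambda x: x[0] ** 2 + x[1] ** 2 - 1.0, lambda x: x[0] - 0.5]
grads = [lambda x: (2.0 * x[0], 2.0 * x[1]), lambda x: (1.0, 0.0)]

x_k = (2.0, 0.0)
f_k, a_next, active = support_vector_of_max(fs, grads, x_k)
# Right-hand side of the new cut: b_{k+1} = (a_{k+1}, x_k) - f(x_k).
b_next = a_next[0] * x_k[0] + a_next[1] * x_k[1] - f_k
```

Only the first function is active at the sample point, so the cut is built from its gradient alone; at a point where several functions attain the maximum, any convex combination of their gradients would serve equally well.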


Concluding Remarks
In describing the cutting hyperplane method we followed
J. E. Kelly's paper. At present there are many modifications of this
method. They can be found in a paper by E. S. Levitin and
B. T. Polyak. However, none of these modifications seems to improve
the main property which is of interest to us, viz. the rate of
convergence. This rate has not been precisely estimated for the method
described; however, the results obtained in the paper mentioned
suggest that it is not even that of a
geometric progression.

5. LINEARIZATION METHOD
In this section we shall describe the method of solving the general
problem of mathematical programming without making any assumption
concerning the convexity of the functions to be dealt with. An
important property of this method is the possibility of taking into
account nonlinear equality constraints, this being a stumbling-block
for most other methods.
It is required to minimize function $f_0(x)$, $x \in E^n$ with constraints
$f_i(x) \le 0$, $i \in J^-$, $f_i(x) = 0$, $i \in J^0$ (5.1)
where $J^-$ and $J^0$ are finite sets of indices. We assume that all the
functions $f_i(x)$ are continuously differentiable. (The assumptions
under which the problem is considered will be specified more fully below.)
At point $x_0$ we substitute linear constraints for all (5.1) and a linear
function for $f_0(x)$ by linearizing $f_i(x)$ at point $x_0$. As a result we
obtain a problem of linear programming. It would be natural to
take the solution of the linearized problem as the next approximation
as we do in Newton's method for solving systems of nonlinear
equations. Unfortunately, this way does not lead directly to the
aim since as a rule the subsidiary problem of linear programming
has no solution. Therefore, it is necessary to impose certain
constraints on the increment of vector $x$ at $x_0$ in order that the solution
of the linearized problem should not shift too far from $x_0$ and should
remain in a neighbourhood of $x_0$ in which the linearization is
still valid. This will be performed below by adding a quadratic
term to the linearized objective function.
Note that each of the equalities $f_i(x) = 0$ is equivalent to the
following two inequalities
$f_i(x) \le 0$, $-f_i(x) \le 0$.
Therefore we can limit ourselves to considering only the case
with inequality constraints. Such a restriction is convenient at least


in the theoretical substantiation of the algorithm though the doubling
of the number of inequalities can be inconvenient in calculations.
We shall give below the theoretical substantiation of the algorithm
for the problem of the minimization of $f_0(x)$ with constraints
$f_i(x) \le 0$, $i \in J$. (5.2)

A modification of the algorithm for the general problem (5.1) will
be considered separately.
Thus we shall study the algorithm for problem (5.2) without loss
of generality. Clearly, we can always assume that among the inequalities
(5.2), there is the trivial one: $0 \le 0$. Therefore it will be assumed
that among the functions $f_i(x)$, $i \in J$ there is one identically equal
to zero: $f_i(x) \equiv 0$.

Basic Assumptions
We set
$F(x) = \max_{i \in J} f_i(x)$,
$J_\delta(x) = \{i \in J: f_i(x) \ge F(x) - \delta\}$, $\delta > 0$. (5.3)

By assumption, $F(x) \ge 0$ with all the $x$. Suppose that there are
constants $N > 0$, $\delta > 0$ such that:
(a) the set
$Q_N = \{x: f_0(x) + N F(x) \le C_0\}$, $C_0 = f_0(x_0) + N F(x_0)$

is bounded;
(b) the gradients of functions $f_i(x)$, $i \in \{0\} \cup J$ in $Q_N$ satisfy
Lipschitz' condition, i.e.
$\|f_i'(x_1) - f_i'(x_2)\| \le L\|x_1 - x_2\|$;
(c) the problem of quadratic programming

$\min_p \; (f_0'(x), p) + \tfrac{1}{2}\|p\|^2$,

$(f_i'(x), p) + f_i(x) \le 0$, $i \in J_\delta(x)$ (5.4)

is solvable for $p \in E^n$ with any $x \in Q_N$ and there are Lagrange
multipliers $u^i(x)$, $i \in J_\delta(x)$ such that $\sum_{i \in J_\delta(x)} u^i(x) \le N$. In this section,
$\|p\|$ will always denote the Euclidean norm of vector $p$.
In what follows we shall denote the solution of problem (5.4)
by $p(x)$ and Lagrange multipliers by $u^i(x)$, $i \in J_\delta(x)$.


Formulation of the Algorithm
Let $x_0$ be the initial approximation and let $\varepsilon$ be chosen such that
$0 < \varepsilon < 1$. Let point $x_k$ be already constructed by the algorithm.
The construction of the next approximation will be performed in
two stages:
(1) We solve problem (5.4) with $x = x_k$ and find its solution, vector
$p_k = p(x_k)$.
(2) We find the first of the values of $i = 0, 1, \ldots$ satisfying
the inequality

$f_0\left(x_k + \frac{1}{2^i} p_k\right) + N F\left(x_k + \frac{1}{2^i} p_k\right) \le f_0(x_k) + N F(x_k) - \frac{1}{2^i}\varepsilon\|p_k\|^2$.

If this inequality is satisfied for the first time with $i = i_0$, then we
take $\alpha_k = 2^{-i_0}$, $x_{k+1} = x_k + \alpha_k p_k$.
Thus the following inequality is satisfied at each step:
$f_0(x_{k+1}) + N F(x_{k+1}) \le f_0(x_k) + N F(x_k) - \alpha_k\varepsilon\|p_k\|^2$. (5.5)
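The two stages can be sketched for a problem on the real line, where problem (5.4) has a closed-form solution: the linearized constraints cut out an interval for $p$, and the minimizer of $f_0'(x)p + p^2/2$ is the point $-f_0'(x)$ clipped to that interval. Everything specific below is an assumption made for illustration: the name linearization_1d, the sample values of $N$, $\delta$ and $\varepsilon$, the stopping tolerance, and the test problem (minimize $x^2$ subject to $1 - x \le 0$).

```python
def linearization_1d(f0, df0, cons, x0, N=10.0, delta=0.5, eps=0.25, tol=1e-12, max_iter=100):
    """Sketch of the linearization method on the real line.

    cons is a list of pairs (f_i, f_i'); the trivial constraint 0 <= 0 is added here.
    """
    cons = list(cons) + [(lambda x: 0.0, lambda x: 0.0)]
    big_f = lambda x: max(fi(x) for fi, _ in cons)        # F(x) from (5.3)
    merit = lambda x: f0(x) + N * big_f(x)
    x = x0
    for _ in range(max_iter):
        # Stage 1: problem (5.4). In one dimension the linearized constraints
        # with indices in J_delta(x) cut out an interval [p_lo, p_hi] for p.
        p_lo, p_hi = float("-inf"), float("inf")
        level = big_f(x) - delta
        for fi, dfi in cons:
            v, g = fi(x), dfi(x)
            if v < level:
                continue                                   # i is not in J_delta(x)
            if g > 0:
                p_hi = min(p_hi, -v / g)
            elif g < 0:
                p_lo = max(p_lo, -v / g)
            elif v > 0:
                raise ValueError("linearized constraints are inconsistent")
        if p_lo > p_hi:
            raise ValueError("linearized constraints are inconsistent")
        # Minimizer of f0'(x) p + p^2 / 2 over the interval.
        p = min(max(-df0(x), p_lo), p_hi)
        if abs(p) <= tol:
            break
        # Stage 2: successive halvings of the unit step until the decrease test holds.
        alpha = 1.0
        while merit(x + alpha * p) > merit(x) - alpha * eps * p * p and alpha > 1e-16:
            alpha *= 0.5
        x += alpha * p
    return x


# Assumed test problem: minimize x^2 subject to 1 - x <= 0 (solution x = 1).
x_star = linearization_1d(lambda x: x * x, lambda x: 2.0 * x,
                          [(lambda x: 1.0 - x, lambda x: -1.0)], x0=3.0)
```

In this run the first step is shortened to $\alpha_0 = 1/4$ because the full step would violate the constraint and raise the penalized function $f_0 + NF$; afterwards the linearized constraint becomes active and unit steps are accepted.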

Convergence of the Algorithm

We show that the step $\alpha_k$ is chosen at each iteration after a finite number of successive halvings of unity, and we substantiate the convergence of the algorithm.

From the results of Sec. 3, Chap. I it follows that $p(x)$ is the solution of problem (5.4) if and only if there are $u^i(x) \ge 0$, $i \in J_\delta(x)$, such that

$$f_0'(x) + p(x) + \sum_{i \in J_\delta(x)} u^i(x) f_i'(x) = 0, \qquad u^i(x)\big((f_i'(x), p(x)) + f_i(x)\big) = 0, \quad i \in J_\delta(x). \tag{5.6}$$

Therefore

$$(f_0'(x), p(x)) = -\sum_{i \in J_\delta(x)} u^i(x) (f_i'(x), p(x)) - \|p(x)\|^2 = \sum_{i \in J_\delta(x)} u^i(x) f_i(x) - \|p(x)\|^2. \tag{5.7}$$

Lemma 5.1. In order that point $x$ satisfy inequalities (5.2) and the necessary conditions for the minimum of $f_0(x)$ with constraints (5.2), it is necessary and sufficient that the equality $p(x) = 0$ be satisfied.

Proof. Let point $x$ satisfy (5.2) and the necessary conditions for the minimum of $f_0(x)$. Then there are numbers $u^i \ge 0$, $i \in J$, such that

$$f_0'(x) + \sum_{i \in J} u^i f_i'(x) = 0, \qquad u^i f_i(x) = 0, \quad i \in J. \tag{5.8}$$

LINEARIZATION METHOD

If $x$ satisfies (5.2), then $F(x) = 0$ and therefore $J_0(x)$ coincides with the set of $i$ for which $f_i(x) = 0$. Besides, from the second of relations (5.8), $u^i = 0$ if $f_i(x) < 0$, i.e. if $i \notin J_0(x)$. Therefore, taking into account that $J_\delta(x) \supseteq J_0(x)$, we can rewrite (5.8) in the following form:

$$f_0'(x) + \sum_{i \in J_\delta(x)} u^i f_i'(x) = 0, \qquad u^i f_i(x) = 0, \quad i \in J_\delta(x).$$

The comparison of the last expressions with (5.6) shows that vector $p = 0$ is the solution of problem (5.4): with $p = 0$ all constraints of (5.4) are satisfied (since (5.2) are satisfied), and the fulfilment of (5.6) with $p = 0$ is the necessary and sufficient condition for vector $p = 0$ to be the solution of (5.4).

Let now $p(x) = 0$. This means that the constraints of (5.4) are satisfied with $p = 0$, i.e. $f_i(x) \le 0$, $i \in J_\delta(x)$. Since for $i \notin J_\delta(x)$ we have

$$f_i(x) < F(x) - \delta \le f_j(x) \le 0,$$

where $j \in J_\delta(x)$, point $x$ satisfies all constraints (5.2). Besides, with $p = 0$ relations (5.6) are transformed into (5.8) if we take $u^i = 0$, $i \notin J_\delta(x)$. Thus, the necessary conditions for the minimum of $f_0(x)$ with constraints (5.2) are satisfied too, and this completes the proof.
Let us now estimate the changes of all the functions comprised by the problem with a shift from point $x_k$ in direction $p_k$.

For $i \in J_\delta(x_k)$, using Taylor's formula we obtain

$$f_i(x_k + \alpha p_k) = f_i(x_k) + \alpha (p_k, f_i'(x_k)) + \alpha (p_k, f_i'(\theta_i) - f_i'(x_k)),$$

where $\theta_i = x_k + \alpha \xi_i p_k$, $0 \le \xi_i \le 1$. Since $p_k$ is the solution of (5.4) with $x = x_k$, we have

$$f_i(x_k + \alpha p_k) \le f_i(x_k) - \alpha f_i(x_k) + \alpha^2 \|p_k\|^2 L \le (1 - \alpha) f_i(x_k) + \alpha^2 \|p_k\|^2 L, \tag{5.9}$$

in deriving which we made use of the fact that the gradients of $f_i(x)$ satisfy Lipschitz' condition.

For $i \notin J_\delta(x_k)$,

$$f_i(x_k + \alpha p_k) = f_i(x_k) + \alpha (p_k, f_i'(\theta_i)) < F(x_k) - \delta + \alpha K \|p_k\|, \tag{5.10}$$

where $K$ is a quantity which bounds $\|f_i'(x)\|$ in $\Omega_N$.

Since

$$(1 - \alpha) F(x_k) \ge F(x_k) - \delta + \alpha K \|p_k\|$$

for $\alpha$ such that $\alpha \le 1$,

$$0 \le \alpha \le \frac{\delta}{F(x_k) + K \|p_k\|}, \tag{5.11}$$

it follows from (5.9) and (5.10) that for all the $i$ the following inequality holds:

$$f_i(x_k + \alpha p_k) \le (1 - \alpha) F(x_k) + \alpha^2 L \|p_k\|^2, \tag{5.12}$$

provided $\alpha$ satisfies condition (5.11).
By analogy to the preceding estimates,

$$f_0(x_k + \alpha p_k) = f_0(x_k) + \alpha (p_k, f_0'(x_k)) + \alpha (p_k, f_0'(\theta_0) - f_0'(x_k)), \qquad \theta_0 = x_k + \alpha \xi_0 p_k, \quad 0 \le \xi_0 \le 1.$$

Using (5.7) and Lipschitz' condition for gradients we obtain

$$f_0(x_k + \alpha p_k) \le f_0(x_k) + \alpha \sum_{i \in J_\delta(x_k)} u^i(x_k) f_i(x_k) - \alpha \|p_k\|^2 + \alpha^2 L \|p_k\|^2.$$

Hence and from (5.12) it follows that

$$\begin{aligned} f_0(x_k + \alpha p_k) + N F(x_k + \alpha p_k) \le{} & f_0(x_k) + N F(x_k) + \alpha \Big( \sum_{i \in J_\delta(x_k)} u^i(x_k) f_i(x_k) - N F(x_k) \Big) \\ & - \alpha \|p_k\|^2 + \alpha^2 (N + 1) L \|p_k\|^2. \end{aligned} \tag{5.13}$$

Recall now that $u^i(x_k) \ge 0$, $F(x_k) \ge 0$ and

$$\sum_{i \in J_\delta(x_k)} u^i(x_k) \le N.$$

Therefore

$$\sum_{i \in J_\delta(x_k)} u^i(x_k) f_i(x_k) - N F(x_k) \le 0.$$

But then (5.13) can be written in the form

$$f_0(x_k + \alpha p_k) + N F(x_k + \alpha p_k) \le f_0(x_k) + N F(x_k) - \alpha \|p_k\|^2 \big(1 - \alpha (N + 1) L\big)$$

or, if

$$\alpha \le \frac{1 - \varepsilon}{(N + 1) L}, \tag{5.14}$$

then

$$f_0(x_k + \alpha p_k) + N F(x_k + \alpha p_k) \le f_0(x_k) + N F(x_k) - \alpha \varepsilon \|p_k\|^2. \tag{5.15}$$

Thus if

$$0 \le \alpha \le \bar\alpha_k, \qquad \bar\alpha_k = \min\Big\{1, \ \frac{\delta}{F(x_k) + K \|p_k\|}, \ \frac{1 - \varepsilon}{(N + 1) L}\Big\},$$

then the inequality (5.15) holds.
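The bound $\bar\alpha_k$ is easy to evaluate when the constants are known. The following sketch is meant only to make the three competing terms tangible; in practice $K$ and $L$ are rarely available, and the algorithm never needs them. The function name and the sample numbers are assumptions made for the illustration, and a vanishing denominator is treated as imposing no restriction.

```python
def guaranteed_step(delta, eps, N, L, K, F_xk, p_norm):
    """alpha_bar_k = min{1, delta / (F(x_k) + K*||p_k||), (1 - eps) / ((N + 1)*L)}.
    A term whose denominator is zero is skipped, since it imposes no restriction."""
    bounds = [1.0]
    denom = F_xk + K * p_norm
    if denom > 0.0:
        bounds.append(delta / denom)
    if L > 0.0:
        bounds.append((1.0 - eps) / ((N + 1.0) * L))
    return min(bounds)


# Far from the solution the curvature term dominates ...
print(guaranteed_step(delta=0.5, eps=0.5, N=3.0, L=1.0, K=2.0, F_xk=1.0, p_norm=1.0))   # 0.125
# ... while for linear functions (L = 0) near the solution the full step is admissible.
print(guaranteed_step(delta=0.5, eps=0.5, N=3.0, L=0.0, K=2.0, F_xk=0.0, p_norm=0.01))  # 1.0
```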


But this means that inequality (5.5) is satisfied after a finite number of trials of $\alpha = 2^{-i}$, $i = 0, 1, \ldots$, and that the inequality

$$\alpha_k \ge \frac{1}{2} \bar\alpha_k \tag{5.16}$$

holds. We shall prove now the following theorem on the convergence of the process.
Theorem 5.1. If the assumptions of the subsection on p. 189 hold, then the process has the following properties:

(a) $F(x_k) \to 0$ as $k \to \infty$;

(b) at any limit point $x_*$ of sequence $\{x_k\}$, $k = 0, 1, \ldots$, inequalities (5.2) are satisfied and also the necessary conditions for the minimum of $f_0(x)$ with constraints (5.2).

Remark. The fact that $F(x_k)$ tends to zero means that sequence $\{x_k\}$ satisfies constraints (5.2) more and more precisely.

Proof. All points of $\{x_k\}$ belong to $\Omega_N$ since function $f_0(x) + N F(x)$, by (5.15), decreases from step to step. Further, as $\Omega_N$ is a compact set, $f_0(x) + N F(x)$ is bounded in this set since the function is continuous. Hence

$$\alpha_k \|p_k\|^2 \to 0 \tag{5.17}$$

as $k \to \infty$, for otherwise $f_0(x) + N F(x)$ would decrease without limit along sequence $\{x_k\}$.
We shall prove now that $p_k \to 0$. Indeed, if $p_k$ does not tend to zero, then it follows from (5.17) that $\alpha_k \to 0$ along a certain subsequence of indices $k$. But it follows from (5.16) and the expression of $\bar\alpha_k$ that then for great $k$

$$\alpha_k \ge \frac{1}{2} \bar\alpha_k = \frac{1}{2} \cdot \frac{\delta}{F(x_k) + K \|p_k\|}.$$

Therefore, the right-hand part of the last inequality must tend to zero. As $F(x)$ is a continuous function in the compact set $\Omega_N$, $F(x)$ has an upper bound and the expression $\delta / (F(x_k) + K \|p_k\|)$ can tend to zero only if $\|p_k\| \to +\infty$. But from (5.6) we obtain that

$$\|p(x_k)\| = \Big\| f_0'(x_k) + \sum_{i \in J_\delta(x_k)} u^i(x_k) f_i'(x_k) \Big\| \le K (N + 1).$$

Thus we have come to a contradiction through the assumption that $p_k$ does not tend to zero.

By definition of $p_k$, the relations

$$(f_i'(x_k), p_k) + f_i(x_k) \le 0, \quad i \in J_\delta(x_k),$$

hold. Therefore

$$f_i(x_k) \le -(f_i'(x_k), p_k) \le K \|p_k\|, \quad i \in J_\delta(x_k).$$

But $f_j(x_k) < f_i(x_k)$, $j \notin J_\delta(x_k)$, $i \in J_\delta(x_k)$. Hence

$$F(x_k) = \max_i f_i(x_k) \le K \|p_k\|.$$

Consequently, $F(x_k) \to 0$ as $k \to \infty$, for $F(x_k) \ge 0$. Further, let us take $u^i(x) = 0$, $i \notin J_\delta(x)$. Then we can rewrite (5.6) along sequence $\{x_k\}$ in the following form:

$$f_0'(x_k) + p_k + \sum_{i \in J} u^i(x_k) f_i'(x_k) = 0, \qquad u^i(x_k)\big((f_i'(x_k), p_k) + f_i(x_k)\big) = 0, \quad i \in J. \tag{5.18}$$

Let now $x_*$ be a limit point of $\{x_k\}$. As $x_k \in \Omega_N$ and $\Omega_N$ is compact, there are always such points. Without loss of generality, we may take that $x_k \to x_*$. Besides, since $u^i(x) \ge 0$, $i \in J$, and their sum is bounded, we can take that $u^i(x_k) \to u_*^i$ as $k \to \infty$. Taking the limit in (5.18) we obtain:

$$f_0'(x_*) + \sum_{i \in J} u_*^i f_i'(x_*) = 0, \qquad u_*^i f_i(x_*) = 0, \quad i \in J.$$

Besides, $u_*^i \ge 0$ since $u^i(x_k) \ge 0$, and point $x_*$ satisfies all the constraints (5.2) since $f_i(x_k) \le F(x_k)$ and $F(x_k) \to 0$. Hence taking the limit, we obtain $f_i(x_*) \le 0$. Thus we have ascertained that the necessary conditions for a minimum are fulfilled at point $x_*$. The theorem is proved.

Corollary. If the only point at which the necessary conditions for a minimum are fulfilled is the minimum point, then the sequence generated by the algorithm converges to the minimum point of $f_0(x)$ with constraints (5.2).

Indeed, in this case by theorem 5.1, the only limit point of sequence $\{x_k\}$ can be but the minimum point.
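The behaviour described by theorem 5.1 can be observed on a toy problem. The sketch below is an illustration under simplifying assumptions and not the authors' program: it minimizes $f_0(x) = x_1^2 + x_2^2$ subject to the single constraint $f_1(x) = 1 - x_1 - x_2 \le 0$, always keeps that constraint in the subsidiary problem (a large $\delta$), and uses the closed-form solution which problem (5.4) admits when there is only one linearized constraint. The values $N = 2$ and $\varepsilon = 0.5$ are arbitrary choices for the example.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def f0(x):
    return x[0] ** 2 + x[1] ** 2

def grad_f0(x):
    return [2.0 * x[0], 2.0 * x[1]]

def f1(x):
    return 1.0 - x[0] - x[1]

GRAD_F1 = [-1.0, -1.0]  # gradient of the linear constraint f1

def direction(x):
    """Solve (5.4) with one constraint: p = -f0' - u*f1', where the multiplier
    u >= 0 is chosen so that the linearized constraint holds (closed form)."""
    g = grad_f0(x)
    u = max(0.0, (f1(x) - dot(GRAD_F1, g)) / dot(GRAD_F1, GRAD_F1))
    return [-gi - u * ci for gi, ci in zip(g, GRAD_F1)], u

def move(x, p, alpha):
    return [xi + alpha * pi for xi, pi in zip(x, p)]

def linearization_method(x, N=2.0, eps=0.5, tol=1e-10, max_iter=100):
    merit = lambda y: f0(y) + N * max(0.0, f1(y))
    for _ in range(max_iter):
        p, _u = direction(x)
        p_sq = dot(p, p)
        if p_sq <= tol ** 2:  # p(x) = 0 characterizes a stationary point (lemma 5.1)
            break
        alpha = 1.0           # successive halvings of unity
        while alpha > 1e-12 and merit(move(x, p, alpha)) > merit(x) - alpha * eps * p_sq:
            alpha *= 0.5
        x = move(x, p, alpha)
    return x

x_star = linearization_method([2.0, 0.0])
print(x_star)  # [0.5, 0.5]
```

Starting from $(2, 0)$, the first step is halved once, the second is taken in full, and the third direction vanishes at the constrained minimum $(0.5, 0.5)$.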

Computational Aspects

The basic operation which requires considerable computation at every step in implementing the algorithm is the solving of problem (5.4). This is a problem of quadratic programming. In choosing the method for solving this problem it is necessary to take into account that problem (5.4) must be solved after a finite number of steps, since it is a subsidiary problem. Besides, since constant $N$ is not known beforehand, it is expedient to obtain the corresponding Lagrange multipliers $u^i(x)$ in solving problem (5.4) in order to check whether the choice of $N$ was right or not. On these grounds it seems expedient to pass to the dual problem and to solve it by the method of conjugate gradients which was discussed in the subsection on p. 160.


We shall construct now the dual of problem (5.4). As stated in Chap. I, Sec. 3, the objective function of the dual problem has the following form:

$$\varphi(u) = \min_p \Big[ (f_0'(x), p) + \frac{1}{2} \|p\|^2 + \sum_{i \in J_\delta(x)} u^i \big((f_i'(x), p) + f_i(x)\big) \Big]. \tag{5.19}$$

Equating to zero the derivatives with respect to $p$ of the right-hand side of the last equality, we find that the minimum is attained with

$$p = -f_0'(x) - \sum_{i \in J_\delta(x)} u^i f_i'(x). \tag{5.20}$$

Thus, point $p$ is uniquely defined by vector $u$ with components $u^i$, $i \in J_\delta(x)$.

Substituting (5.20) into the right-hand side of (5.19), we obtain

$$\varphi(u) = -\frac{1}{2} \Big\| f_0'(x) + \sum_{i \in J_\delta(x)} u^i f_i'(x) \Big\|^2 + \sum_{i \in J_\delta(x)} u^i f_i(x). \tag{5.21}$$

Thus, we have calculated the objective function of the dual problem. The dual problem consists now in the maximization of $\varphi(u)$ with constraints $u^i \ge 0$, $i \in J_\delta(x)$.

Thus we have come to a problem of the maximization of a quadratic form with simple constraints; it is expedient to solve this problem by the method of conjugate gradients (the subsection on p. 160). As a result of the solution of the dual problem we obtain Lagrange multipliers $u^i(x)$ and, according to what was stated in Sec. 3, Chap. I, the substitution of $u^i(x)$ into (5.20) gives vector $p(x)$, the solution of the primal problem.
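The passage to the dual can be tried out numerically. The sketch below maximizes (5.21) over $u \ge 0$ and recovers $p$ from (5.20). It deliberately swaps in plain projected gradient ascent for the conjugate gradient method recommended in the text, because that method is described in another subsection; the fixed step length, the iteration count and the sample data are assumptions of the illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def primal_from_dual(g0, grads, u):
    """Formula (5.20): p = -f_0'(x) - sum_i u^i f_i'(x)."""
    return [-g0[j] - sum(ui * g[j] for ui, g in zip(u, grads))
            for j in range(len(g0))]

def solve_dual(g0, grads, values, iterations=500):
    """Maximize phi(u) of (5.21) subject to u >= 0 by projected gradient ascent
    (a simple stand-in for the conjugate gradient method of the text)."""
    u = [0.0] * len(grads)
    # The sum of squared gradient norms bounds the curvature of phi,
    # so its reciprocal is a safe fixed step length.
    step = 1.0 / max(sum(dot(g, g) for g in grads), 1e-12)
    for _ in range(iterations):
        p = primal_from_dual(g0, grads, u)
        # d(phi)/d(u^i) = (f_i'(x), p) + f_i(x)
        slope = [dot(g, p) + f for g, f in zip(grads, values)]
        u = [max(0.0, ui + step * si) for ui, si in zip(u, slope)]
    return u, primal_from_dual(g0, grads, u)

# Data at x = (0, 0) for f_0 = x_1^2 + x_2^2 with the constraints
# 1 - x_1 - x_2 <= 0 (violated at x) and -x_1 - 5 <= 0 (slack at x).
g0 = [0.0, 0.0]
grads = [[-1.0, -1.0], [-1.0, 0.0]]
values = [1.0, -5.0]
u, p = solve_dual(g0, grads, values)
print(u, p)  # approximately [0.5, 0.0] and [0.5, 0.5]
```

The multiplier of the slack constraint stays at zero, and the sum of the multipliers can be compared directly with the current value of $N$.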
Another problem is the choosing of constants $N$ and $\delta$. Speaking generally, the quantity $N$ is unknown. Choosing it too great can be a disadvantage, since by formula (5.14), this can involve a considerable reduction of the step. Therefore, it is expedient to estimate $N$ during the implementation of the algorithm. For example, if at a certain step it occurred that

$$N < \sum_{i \in J_\delta(x_k)} u^i(x_k),$$

then $N$ should be changed to

$$N = 2 \sum_{i \in J_\delta(x_k)} u^i(x_k). \tag{5.22}$$

Experience shows that such a correction brings success. Besides, it is clear from theoretical reasons that if $x_k$ is sufficiently close to the limit point, then in the regular case $u^i(x_k)$ proves close to the Lagrange multipliers at point $x_*$ which is the solution of the problem, and therefore formula (5.22) leads to success. The behaviour of factors $u^i(x_k)$ will be considered at greater length below.

As to quantity $\delta$, it should be reduced if subsidiary problem (5.4) proves insolvable at a certain step.
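The correction (5.22) amounts to a one-line rule. A minimal sketch, with a hypothetical function name and made-up multiplier values:

```python
def update_penalty(N, multipliers):
    """Rule (5.22): if N is smaller than the sum of the current Lagrange
    multipliers, replace it by twice that sum; otherwise keep it."""
    total = sum(multipliers)
    return 2.0 * total if N < total else N


print(update_penalty(1.0, [0.5, 1.5]))  # 4.0 (N was too small)
print(update_penalty(5.0, [0.5, 1.5]))  # 5.0 (N is left unchanged)
```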
We shall now describe the conditions under which there exist constants $N$ and $\delta$. In fact, they exist in a considerably broader class of problems.

Theorem 5.2. Let all functions $f_0(x)$, $f_i(x)$, $i \in J$, be convex and there be a point $\bar x$ such that

$$f_i(\bar x) < 0, \quad i \in J.$$

Besides, let $f_0(x)$ tend to $+\infty$ as $\|x\| \to +\infty$ and let point $x_0$ satisfy constraints (5.2). Then with any $\delta > 0$, multipliers $u^i(x)$, $i \in J_\delta(x)$, have an upper bound on set $\Omega_N$ with sufficiently great $N$, and $\Omega_N$ is compact.

Proof. Recall that

$$\Omega_N = \{x : f_0(x) + N F(x) \le f_0(x_0) + N F(x_0)\}.$$

Hence and from the continuity of $f_0(x)$ and $F(x)$, it follows that $\Omega_N$ is a closed set. On the other hand, $\Omega_N$ is bounded since, by hypothesis, $f_0(x) \to +\infty$ as $\|x\| \to +\infty$ and therefore

$$f_0(x) + N F(x) > f_0(x_0) + N F(x_0)$$

with all $x$ sufficiently great in norm. Further, since $x_0$ satisfies (5.2), $F(x_0) = 0$. Therefore with all $N$ we have $\Omega_N \subseteq \Omega_0$. Indeed, it follows from $x \in \Omega_N$ that

$$f_0(x) \le f_0(x) + N F(x) \le f_0(x_0) + N F(x_0) = f_0(x_0),$$

i.e. $f_0(x) \le f_0(x_0)$. Obviously, set $\Omega_0$ is also compact by the assumptions of the theorem.

Further, since all $f_i(x)$ are convex we have

$$f_i(x) + (f_i'(x), \bar x - x) \le f_i(\bar x) < 0. \tag{5.23}$$

Therefore, the system of constraints of problem (5.4) is consistent with any $\delta > 0$, as vector $\bar p = \bar x - x$ satisfies it.
Let now $u^i(x)$ be Lagrange multipliers of problem (5.4). Then by Kuhn-Tucker's theorem

$$\frac{1}{2} \|p(x)\|^2 + (f_0'(x), p(x)) \le \frac{1}{2} \|p\|^2 + (f_0'(x), p) + \sum_{i \in J_\delta(x)} u^i(x) \big((f_i'(x), p) + f_i(x)\big)$$

for all $p$. In particular, with $p = \bar p = \bar x - x$, by (5.23), we have

$$\begin{aligned} \frac{1}{2} \|p(x)\|^2 + (f_0'(x), p(x)) &\le \frac{1}{2} \|\bar p\|^2 + (f_0'(x), \bar p) + \sum_{i \in J_\delta(x)} u^i(x) \big((f_i'(x), \bar p) + f_i(x)\big) \\ &\le \frac{1}{2} \|\bar p\|^2 + (f_0'(x), \bar p) + \sum_{i \in J_\delta(x)} u^i(x) f_i(\bar x) \\ &\le \frac{1}{2} \|\bar p\|^2 + (f_0'(x), \bar p) + u^i(x) f_i(\bar x). \end{aligned}$$

Hence ($f_i(\bar x) < 0$!)

$$u^i(x) \le \frac{\big[\frac{1}{2} \|p(x)\|^2 + (f_0'(x), p(x))\big] - \big[\frac{1}{2} \|\bar p\|^2 + (f_0'(x), \bar p)\big]}{f_i(\bar x)}, \tag{5.24}$$

where the numerator on the right-hand side of (5.24) contains a nonpositive quantity since $p(x)$ is the solution of problem (5.4) and $\bar p$ satisfies the constraints of (5.4).
We shall show now that the right-hand side of (5.24) is bounded in $\Omega_0$. Indeed, since functions $f_i(x)$ are continuously differentiable, the quantity

$$\frac{1}{2} \|\bar p\|^2 + (f_0'(x), \bar p) = \frac{1}{2} \|\bar x - x\|^2 + (f_0'(x), \bar x - x)$$

is bounded in compact region $\Omega_0$. Therefore, the smaller quantity

$$\frac{1}{2} \|p(x)\|^2 + (f_0'(x), p(x))$$

has an upper bound. As to its lower bound we have

$$\frac{1}{2} \|p(x)\|^2 + (f_0'(x), p(x)) \ge \frac{1}{2} \|p(x)\|^2 - \|f_0'(x)\| \, \|p(x)\| \ge -\frac{1}{2} \|f_0'(x)\|^2,$$

i.e. with $x \in \Omega_0$ the quantity under consideration also has a lower bound.

Thus we have shown that in $\Omega_0$ the right-hand sides of (5.24) have upper bounds, i.e. $u^i(x) \le M$, $x \in \Omega_0$. The statement of the theorem directly follows from this fact.

Thus if the primal problem was a problem of convex programming, then any $\delta > 0$ suits the algorithm, provided the admissible region contains an interior point.


Some Generalizations

At the beginning of this section it was stated that if there are equality constraints, i.e. if the constraints are of the form (5.1), then the problem is reduced to the form (5.2) by substituting two inequalities for each equality.

Thus, the algorithm can be applied to the general problem (5.1) too. It should only be taken into account that if with a certain $x$ we have

$$f_i(x) \ge F(x) - \delta \quad \text{and} \quad -f_i(x) \ge F(x) - \delta,$$

where $i \in J^0$, then the system (5.4) comprises two inequalities

$$(f_i'(x), p) + f_i(x) \le 0, \qquad -(f_i'(x), p) - f_i(x) \le 0, \tag{5.25}$$

which are equivalent to one equality

$$(f_i'(x), p) + f_i(x) = 0. \tag{5.26}$$

Therefore it is expedient to substitute in (5.4) one equality (5.26) for each pair of inequalities of the type (5.25). In passing to the dual problem this will lead to the corresponding multiplier $u^i$ having an arbitrary sign, which however does not impede the possibility of applying the algorithm of conjugate gradients (the subsection on p. 160).

Suppose now that in the primal problem, in addition to constraints (5.2), there is a constraint imposed by the condition that point $x$ belongs to a set $X$ of simple structure. In this case it is expedient that the approximations obtained should lie in set $X$. We shall describe now how the algorithm is to be modified in this case. As we did previously we shall consider, without loss of generality, only the case with inequality constraints.
Thus let it be required to minimize $f_0(x)$, $x \in E^n$, with constraints

$$f_i(x) \le 0, \quad i \in J, \qquad x \in X, \tag{5.27}$$

where $J$ is a finite set of indices and $X$ is a convex closed set. It is assumed that there is an index $i$ such that $f_i(x) \equiv 0$.

Suppose that there are constants $N > 0$ and $\delta > 0$ such that the following conditions are fulfilled:

(a) set

$$\Omega_N = \{x : f_0(x) + N F(x) \le C_0, \ x \in X\}, \qquad C_0 = f_0(x_0) + N F(x_0),$$

is bounded and the initial approximation $x_0$ belongs to $X$;

(b) the gradients of functions $f_i(x)$, $i \in \{0\} \cup J$, in $\Omega_N$ satisfy Lipschitz' condition, i.e.

$$\|f_i'(x_1) - f_i'(x_2)\| \le L \|x_1 - x_2\|;$$

(c) the problem

$$\min_p \ (f_0'(x), p) + \frac{1}{2} \|p\|^2, \qquad (f_i'(x), p) + f_i(x) \le 0, \quad i \in J_\delta(x), \qquad x + p \in X, \tag{5.28}$$

is solvable for $p$ with any $x \in \Omega_N$ and there are Lagrange multipliers $u^i(x)$, $i \in J_\delta(x)$, such that

$$\sum_{i \in J_\delta(x)} u^i(x) \le N.$$
Remark. Recall that Lagrange multipliers for problem (5.28) are all nonnegative numbers such that the following conditions are satisfied:

$$\begin{aligned} & (f_0'(x), p(x)) + (p(x), p(x)) + \sum_{i \in J_\delta(x)} u^i(x) \big[(f_i'(x), p(x)) + f_i(x)\big] \\ & \qquad \le (f_0'(x), p) + (p(x), p) + \sum_{i \in J_\delta(x)} u^i(x) \big[(f_i'(x), p) + f_i(x)\big] \end{aligned} \tag{5.29}$$

for all $p$ such that

$$x + p \in X. \tag{5.30}$$

Besides,

$$u^i(x) \big[(f_i'(x), p(x)) + f_i(x)\big] = 0, \quad i \in J_\delta(x). \tag{5.31}$$

Thus condition (c) implies not only that the subsidiary problem (5.28) is solvable, but also that the minimum point $p = p(x)$ satisfies the necessary and sufficient conditions required by the Kuhn-Tucker theorem.

The algorithm for solving problem (5.27) is constructed now as it was expounded in the subsection on p. 190. Only we take now as $p_k$ vector $p(x_k)$ which is the solution of the new subsidiary problem (5.28).
We shall show that the algorithm is convergent, i.e. that the conclusions of theorem 5.1 hold and also that $x_k \in X$ with all $k$. It follows from the last assertion that any limit point of sequence $\{x_k\}$ lies in $X$. Since the proof of convergence differs from the proof of theorem 5.1 only in some details, there is no need to give this proof completely. We shall point out only the main specific details.

First, since $x_k + p_k \in X$ and $X$ is convex, we have $x_k + \alpha p_k \in X$ with all $\alpha$ lying between 0 and 1. Therefore if $x_k \in X$, then $x_{k+1} \in X$ too. And since $x_0 \in X$, by assumption, the whole sequence $\{x_k\}_{k=0}^{\infty}$ lies in $X$. Secondly, from (5.29)-(5.31) with $p = 0$ we obtain that

$$(f_0'(x), p(x)) + \|p(x)\|^2 \le \sum_{i \in J_\delta(x)} u^i(x) f_i(x),$$

i.e.

$$(f_0'(x), p(x)) \le \sum_{i \in J_\delta(x)} u^i(x) f_i(x) - \|p(x)\|^2. \tag{5.32}$$

This inequality should be substituted for expression (5.7) which was used in obtaining estimate (5.13). All other calculations made in obtaining the estimates remain unchanged.
Finally, if at point $x_*$ we have $p(x_*) = 0$, then it follows from (5.29)-(5.31) that conditions

$$(f_0'(x_*), p) + \sum_{i \in J_\delta(x_*)} u^i(x_*) (f_i'(x_*), p) \ge 0, \quad x_* + p \in X, \qquad u^i(x_*) f_i(x_*) = 0, \tag{5.33}$$

are fulfilled. Besides, in this case it follows from (5.28) that

$$f_i(x_*) \le 0, \quad i \in J_\delta(x_*), \qquad x_* \in X,$$

and it is also obvious that

$$f_i(x_*) < 0, \quad i \notin J_\delta(x_*).$$

Thus point $x_*$ satisfies all constraints (5.27), and (5.33) show that at this point the necessary conditions for an extremum are fulfilled.

So we have shown, as we did above, that if $p(x_*) = 0$, then at point $x_*$ the necessary conditions for an extremum are satisfied. It is easy to show that the converse also holds, i.e. that the condition $p(x) = 0$ is necessary and sufficient for point $x$ to satisfy the necessary conditions for an extremum.

The proof that every limit point $x_*$ of sequence $\{x_k\}$, $k = 0, 1, \ldots$, satisfies the necessary conditions for an extremum is performed in just the same way as was used in proving theorem 5.1, i.e. by taking the limit in passing from relations (5.29)-(5.31) satisfied at points $x_k$ to relations (5.33) satisfied at the limit point.

Problem of Linear Programming

Let now all functions $f_0(x)$, $f_i(x)$, $i \in J$, in problem (5.2) be linear. We have then the problem of linear programming. Though the algorithm described is mostly important for the nonlinear case, its application to the problem of linear programming is also of use. In particular, if set $J$ comprises a great number of indices, then the problem of linear programming is one with many constraints. At the same time, with a small $\delta$ the subsidiary problem (5.4) has but a small number of constraints, so that the general problem is reduced to the solving of a series of simpler problems. Besides, as distinct from the simplex method, the method proposed does not accumulate computation errors as it does not transform the original matrix of constraints from step to step.

For the problem of linear programming the conditions (a) and (c) (condition (b) is satisfied automatically) of the basic assumption are too strict for the convergence of the algorithm. We shall not dwell on the conditions of convergence for the problem of linear programming since our main purpose is to obtain an algorithm for the nonlinear case. It will be shown below that if the assumptions (a) and (c) for the problem of linear programming hold, then the algorithm converges after a finite number of steps. This fact characterizes to some extent the rate of convergence of the algorithm.
Theorem 5.3. Let assumptions (a), (c) of the subsection (p. 189) hold and all functions $f_0(x)$, $f_i(x)$ which define problem (5.2) have the form

$$f_i(x) = (a_i, x) - b_i.$$

Then the algorithm of the subsection on p. 190 converges after a finite number of steps.

Proof. Note from the beginning that in the case under consideration step $\alpha_k$ is equal to unity for sufficiently great $k$. Indeed, since all $f_i(x)$ are linear, Lipschitz' constant $L$ is zero. It follows therefore from the formula for $\bar\alpha_k$ on p. 190 that

$$\bar\alpha_k = \min\Big\{1, \ \frac{\delta}{F(x_k) + K \|p_k\|}, \ \frac{1 - \varepsilon}{(N + 1) L}\Big\} = \min\Big\{1, \ \frac{\delta}{F(x_k) + K \|p_k\|}\Big\}, \tag{5.34}$$

since with $L = 0$ the last term imposes no restriction. But it was proved above that $F(x_k) \to 0$, $\|p_k\| \to 0$. Therefore for sufficiently great $k$ we have $\delta / (F(x_k) + K \|p_k\|) \ge 1$ and $\bar\alpha_k = 1$. But the construction of $\bar\alpha_k$ is such that inequality (5.15) is fulfilled with $\alpha = \bar\alpha_k$. Since at each iteration the choice of $\alpha_k$ begins with $\alpha = 1$, it follows that inequality (5.5) which determines the choice of $\alpha_k$ will be satisfied immediately without any additional halvings, and step $\alpha_k$ will be just equal to 1.

Let point $x_*$ be now a limit point of sequence $\{x_k\}$ generated by the algorithm. As we already know, this point is the solution of problem (5.2) for it satisfies all the constraints of the problem and, by theorem 5.1, the necessary conditions for a minimum as well; these conditions in our problem of linear programming are also sufficient.

We set

$$J_0(x_*) = \{i : i \in J, \ f_i(x_*) = 0\}. \tag{5.35}$$


Then $f_i(x_*) < 0$ for $i \notin J_0(x_*)$ so that

$$\varepsilon_0 = \max_{i \notin J_0(x_*)} f_i(x_*) < 0. \tag{5.36}$$

To simplify the notations in what follows we shall assume, without loss of generality, that the whole sequence $\{x_k\}$ converges to $x_*$.

We consider now the subsidiary problem (5.4) at points of $\{x_k\}$:

$$\min_p \ (f_0'(x_k), p) + \frac{1}{2} \|p\|^2, \qquad (f_i'(x_k), p) + f_i(x_k) \le 0, \quad i \in J_\delta(x_k). \tag{5.37}$$

Its solution is $p_k = p(x_k)$. We denote the corresponding Lagrange multipliers by $u_k^i$, $i \in J_\delta(x_k)$, so that

$$u_k^i \big[(f_i'(x_k), p_k) + f_i(x_k)\big] = 0. \tag{5.38}$$

Let us show now that $J_0(x_*) \subseteq J_\delta(x_k)$ for all sufficiently great $k$. Indeed, if $i \notin J_\delta(x_k)$, then

$$f_i(x_k) < F(x_k) - \delta,$$

and taking the limit with respect to $k$, since $F(x_k) \to 0$, we obtain that $f_i(x_*) \le -\delta$, which would contradict the fact that $i \in J_0(x_*)$.

Further, we introduce the notation

$$\mathcal{J}(x_k) = \{i \in J_\delta(x_k) : u_k^i > 0\}.$$

We assert next that for great indices $k$,

$$\mathcal{J}(x_k) \subseteq J_0(x_*). \tag{5.39}$$
Indeed, if $i \notin J_0(x_*)$, then $f_i(x_*) \le \varepsilon_0$. Since $p_k \to 0$, $f_i'(x_k)$ are bounded and $x_k \to x_*$, we have with great $k$

$$|(f_i'(x_k), p_k)| \le -\frac{\varepsilon_0}{4}, \qquad f_i(x_k) \le \frac{\varepsilon_0}{2},$$

and therefore

$$(f_i'(x_k), p_k) + f_i(x_k) \le \frac{\varepsilon_0}{4} < 0.$$

Therefore if $u_k^i > 0$, then

$$u_k^i \big[(f_i'(x_k), p_k) + f_i(x_k)\big] < 0,$$

but this contradicts (5.38).

Remark. In the argument no use was made of the linearity of $f_i(x)$ and therefore the statements that $J_0(x_*) \subseteq J_\delta(x_k)$ and $\mathcal{J}(x_k) \subseteq J_0(x_*)$ hold in the general case of the nonlinear problem. These statements will be used in what follows.


A s w a s s h o w n in t h e s u b s e c t i o n o n p. 1 9 4 t h e d u a l of t h e s u b s i d i a r y
p r o b l e m ( 5 . 3 7 ) is t h e m a x i m i z a t i o n o f f u n c t i o n ( 5 . 2 1 ) w i t h t h e c o n ­
s t r a i n t s u x ^ 0 , i 6 J a (#*)• T h e L a g r a n g e m u l t i p l i e r s are the
solution of the dual p r o b l e m a n d the equality of the o p t i m u m values
i n t h e o r i g i n a l a n d d u a l p r o b l e m s h o l d s , i.e. w e h a v e

Uo(*k),Pk) + ~\\Pk\\2
2
2 /;(**)+ 2 «*/«(**) 2 U h f i (d?ft).

S i n c e p * — >-0, t h e l e f t - h a n d s i d e of t h e last e x p r e s s i o n t e n d s t o
zero a n d consequently
Jj_ 2
2 /o ( X h ) + 2 “ */< ( * » ) 2 u t i ‘ (*«.) °- (5-40)

Note now that $u_k^i > 0$ only if $i \in J(x_k)$. Besides,
$$f_i(x) = (a_i, x) - b_i, \quad i \in \{0\} \cup J,$$
so that $f_i'(x) = a_i$ and is independent of $x$. Therefore, (5.40) can be rewritten in the following form:
$$-\frac{1}{2}\Big\| a_0 + \sum_{i \in J(x_k)} u_k^i a_i \Big\|^2 + \sum_{i \in J(x_k)} u_k^i f_i(x_k) \to 0.$$
But $J(x_k) \subset J_0(x_*)$ as shown above and therefore $f_i(x_k) \to f_i(x_*) = 0$, since $f_i(x_*) = 0$ for $i \in J_0(x_*)$ by definition. Therefore
$$-\frac{1}{2}\Big\| a_0 + \sum_{i \in J(x_k)} u_k^i a_i \Big\|^2 \to 0.$$

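But
$$-\frac{1}{2}\Big\| a_0 + \sum_{i \in J(x_k)} u_k^i a_i \Big\|^2 \le \max_{u^i \ge 0,\ i \in J(x_k)} \left( -\frac{1}{2}\Big\| a_0 + \sum_{i \in J(x_k)} u^i a_i \Big\|^2 \right) \le 0. \tag{5.41}$$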

We introduce the notation

$$\omega(\mathcal{Y}) = \max_{u^i \ge 0,\ i \in \mathcal{Y}} \left( -\frac{1}{2}\Big\| a_0 + \sum_{i \in \mathcal{Y}} u^i a_i \Big\|^2 \right);$$

$\omega(\mathcal{Y})$ is a function defined on the sets of indices $\mathcal{Y} \subset J$. Since $\mathcal{Y} \subset J$ and $J$ is finite, this function can take only a finite number of values. It follows from (5.41) that

$$\omega(J(x_k)) \to 0.$$

But this means that $\omega(J(x_k)) = 0$ for all sufficiently large $k$ since, as was just mentioned, $\omega(\mathcal{Y})$ can take only a finite number of values. Thus, for large $k$
$$\omega(J(x_k)) = 0. \tag{5.42}$$

We now choose $k$ so large that $\alpha_k = 1$, condition (5.42) is fulfilled and $J(x_k) \subset J_0(x_*)$. As $\alpha_k = 1$, we have $x_{k+1} = x_k + p_k$. Since $x_k \to x_*$, $p_k \to 0$, we can take that

$$f_i(x_{k+1}) < 0, \quad i \notin J_0(x_*). \tag{5.43}$$

Let us consider again subsidiary problem (5.37). As $p_k$ satisfies the constraints of (5.37) and $f_i(x)$ are linear, we have

$$f_i(x_{k+1}) = (f_i'(x_k), p_k) + f_i(x_k) \le 0 \tag{5.44}$$
for $i \in J_\delta(x_k)$ and consequently for $i \in J_0(x_*)$ too, as $J_0(x_*) \subset J_\delta(x_k)$. We have thus shown that $x_{k+1}$ satisfies all constraints of problem (5.2).
We demonstrate that $x_{k+1}$ is really the solution of problem (5.2). Indeed, it follows from (5.38) and the definition of set $J(x_k)$ that

$$f_i(x_{k+1}) = 0, \quad i \in J(x_k). \tag{5.45}$$

But (5.42) means that there are numbers $u^i \ge 0$, $i \in J(x_k)$, such that
$$a_0 + \sum_{i \in J(x_k)} u^i a_i = 0. \tag{5.46}$$

Setting now $u^i = 0$, $i \notin J(x_k)$, we obtain that there are numbers $u^i \ge 0$ such that the conditions

$$a_0 + \sum_{i \in J} u^i a_i = 0, \qquad u^i f_i(x_{k+1}) = 0, \quad i \in J,$$

are fulfilled.
But the last relations (see Chap. I, Sec. 3) are the necessary and sufficient conditions for point $x_{k+1}$ to be the solution of the problem of linear programming.
Thus the algorithm provides the solution after a finite number of steps. Q.E.D.


Local Estimate of the Rate of Convergence
It was shown in the preceding subsection that the algorithm proposed converges after a finite number of steps in the linear case. Here we shall show that in the general nonlinear case the algorithm converges at a geometrical rate and, under certain favourable circumstances, even at a quadratic rate.
Theorem 5.4. Let $x_*$ be the solution of problem (5.2) and let the following conditions hold:
(a) For any sufficiently small $\delta > 0$ the subsidiary problem (5.4) is solvable.
(b) Functions $f_i(x)$ are twice continuously differentiable and the gradients $f_i'(x_*)$, $i \in J_0(x_*)$, where

$$J_0(x_*) = \{\, i : f_i(x_*) = 0,\ i \in J \,\},$$
are linearly independent.
(c) At point $x_*$ the necessary condition for a minimum is satisfied in the form

$$f_0'(x_*) + \sum_{i \in J_0(x_*)} u_0^i f_i'(x_*) = 0$$

and $u_0^i > 0$, $i \in J_0(x_*)$.
(d) The sufficient condition for a local minimum, i.e.
$$(p, L''(x_*, u_0)\, p) > 0,$$
holds for all $p \ne 0$ which also satisfy the condition

$$(p, f_i'(x_*)) = 0, \quad i \in J_0(x_*),$$
where
$$L(x, u) = f_0(x) + \sum_{i \in J_0(x_*)} u^i f_i(x)$$

and $L''$ is the matrix of second derivatives of $L(x, u)$ with respect to $x$.
Then there is a neighbourhood $\Omega$ of point $x_*$, $\delta_0 > 0$ and $\alpha > 0$ such that the process
$$x_{k+1} = x_k + \alpha p_k \tag{5.47}$$
converges to point $x_*$ from any initial approximation $x_0 \in \Omega$ at a geometric rate, i.e. there is a number $0 \le q < 1$ such that $\|x_* - x_k\| \le C q^k$ for all sufficiently large $k$.
Proof. The basic idea of the proof is as follows. As was shown above, at point $x_*$ the equation
$$p(x_*) = 0$$
is satisfied.


Process (5.47) is a simple iterative process for solving the last equation. Therefore in order to estimate the rate of convergence we can use Ostrowski's theorem, which is formulated below. This theorem requires an estimate of the eigenvalues of the matrix of first derivatives of $p(x)$ at point $x_*$. Therefore the main problem will be the calculation of this matrix and its eigenvalues.
We shall break the proof of the theorem into several parts.
We take
$$J_0(x_*) = \{\, i : f_i(x_*) = 0 \,\},$$
$$\varepsilon_0 = \max_{i \notin J_0(x_*)} f_i(x_*) < 0.$$
Lemma 5.2. Let the conditions of the theorem be fulfilled and let $0 < \delta < -\varepsilon_0/2$. Then there is a neighbourhood of point $x_*$ such that $J_\delta(x) = J_0(x_*)$ and $p(x)$ is continuously differentiable with respect to $x$ in this neighbourhood. Moreover, the set
$$J(x) = \{\, i \in J_\delta(x) : (f_i'(x), p(x)) + f_i(x) = 0 \,\}$$
coincides with set $J_0(x_*)$.
Proof. Since all functions $f_i(x)$ are continuous, there is a neighbourhood of point $x_*$ such that

$$|f_i(x)| < \frac{\delta}{4}, \quad i \in J_0(x_*), \tag{5.48}$$

$$f_i(x) < \frac{\varepsilon_0}{2}, \quad i \notin J_0(x_*). \tag{5.49}$$

Recall now that
$$F(x) = \max\{\, 0, \max_{i \in J} f_i(x) \,\}$$
and $i \in J_\delta(x)$ if $f_i(x) \ge F(x) - \delta$. It follows from (5.48), (5.49) that
$$0 \le F(x) < \frac{\delta}{4} \tag{5.50}$$

and if $i \notin J_0(x_*)$, then
$$f_i(x) < \frac{\varepsilon_0}{2} < -\delta \le F(x) - \delta,$$
i.e. $i \notin J_\delta(x)$. On the other hand, if $i \in J_0(x_*)$, then it follows from (5.50) that $F(x) - \delta < -\frac{3\delta}{4}$ and consequently

$$f_i(x) > -\frac{\delta}{4} > F(x) - \delta,$$

i.e. $i \in J_\delta(x)$.

Thus we have shown that $J_\delta(x) = J_0(x_*)$ in a certain neighbourhood of $x_*$.
Recall now that if $p(x)$ is the solution of problem (5.4), then conditions (5.6) hold; these conditions can be rewritten in the equivalent form:

$$p(x) + f_0'(x) + \sum_{i \in J(x)} u^i(x) f_i'(x) = 0, \tag{5.51.1}$$

$$(f_i'(x), p(x)) + f_i(x) = 0, \quad i \in J(x), \tag{5.51.2}$$

$$(f_i'(x), p(x)) + f_i(x) < 0, \quad i \notin J(x),\ i \in J_\delta(x), \tag{5.51.3}$$

where $u^i(x) \ge 0$.
We introduce the following notations: $\mathcal{Y} = J(x)$ $(\subset J_0(x_*))$; $f_{\mathcal{Y}}'(x)$ for a matrix whose rows are $f_i'(x)$, $i \in \mathcal{Y}$; $f_{\mathcal{Y}}(x)$ for a column-vector with components $f_i(x)$, $i \in \mathcal{Y}$; and $u_{\mathcal{Y}}$ for a column-vector with components $u^i$, $i \in \mathcal{Y}$. Then equations (5.51.1), (5.51.2) can be rewritten as follows:

$$p(x) + f_0'(x) + f_{\mathcal{Y}}'^{*}(x)\, u_{\mathcal{Y}}(x) = 0,$$
$$f_{\mathcal{Y}}'(x)\, p(x) + f_{\mathcal{Y}}(x) = 0, \quad \mathcal{Y} = J(x). \tag{5.52}$$

The last expressions can be considered to be a linear system of equations to be solved for $p(x)$ and $u_{\mathcal{Y}}(x)$. It is easy to see that in a certain neighbourhood of $x_*$, system (5.52) has only one solution expressed by the formulas

$$u_{\mathcal{Y}}(x) = \left( f_{\mathcal{Y}}'(x)\, f_{\mathcal{Y}}'^{*}(x) \right)^{-1} \left[ f_{\mathcal{Y}}(x) - f_{\mathcal{Y}}'(x)\, f_0'(x) \right],$$

$$p(x) = -f_0'(x) - f_{\mathcal{Y}}'^{*}(x)\, u_{\mathcal{Y}}(x). \tag{5.53}$$

It follows from these formulas that if set $\mathcal{Y}$ is fixed, then $u_{\mathcal{Y}}(x)$ and $p(x)$ depend continuously on $x$.
Let now $x_k \to x_*$. We shall show that for all large $k$

$$J(x_k) = J_0(x_*).$$

Suppose that our statement is not fulfilled and there are arbitrarily large numbers $k$ such that $J(x_k)$ is a proper subset of $J_0(x_*)$. Since there can be but a finite number of different sets $J(x)$, we can take, without loss of generality, that a sequence $x_k \to x_*$ is chosen such that
$$J(x_k) = \mathcal{Y}, \quad \mathcal{Y} \subset J_0(x_*), \quad \mathcal{Y} \ne J_0(x_*).$$


Substituting now $x_k$ for $x$ in (5.51) and taking the limit ($p(x_k) \to p$, $u^i(x_k) \to u^i$, $i \in \mathcal{Y}$), we obtain that

$$p + f_0'(x_*) + \sum_{i \in \mathcal{Y}} u^i f_i'(x_*) = 0,$$

$$(f_i'(x_*), p) + f_i(x_*) = 0, \quad i \in \mathcal{Y},$$
$$(f_i'(x_*), p) + f_i(x_*) \le 0, \quad i \notin \mathcal{Y},\ i \in J_\delta(x_*) = J_0(x_*),$$

where $u^i \ge 0$, for $u^i(x_k) \ge 0$. But these last relations show that $p$ is the solution of the subsidiary problem (5.4) for point $x_*$, i.e. $p = p(x_*)$. But point $x_*$ is the solution of problem (5.2) and therefore $p(x_*) = 0$. Consequently,

$$f_0'(x_*) + \sum_{i \in \mathcal{Y}} u^i f_i'(x_*) = 0.$$

Using now condition (c) of the theorem we obtain from the last expression that

$$\sum_{i \in \mathcal{Y}} (u_0^i - u^i) f_i'(x_*) + \sum_{i \in J_0(x_*),\ i \notin \mathcal{Y}} u_0^i f_i'(x_*) = 0,$$
and this contradicts condition (b) of the theorem. Thus, in a certain neighbourhood of point $x_*$, set $J(x)$ coincides with $J_0(x_*)$. From $J(x)$ being constant and formulas (5.53) it follows directly that $u_{\mathcal{Y}}(x)$ and $p(x)$, $\mathcal{Y} = J_0(x_*)$, are continuously differentiable with respect to $x$ since, by condition (b), $f_i(x)$ are twice continuously differentiable.
Remark. Thus, in a small neighbourhood of $x_*$, $p(x)$ and $u_{\mathcal{Y}}(x)$ are the solution of the system of equations (5.52) with a constant set $\mathcal{Y} = J_0(x_*)$. Therefore we shall omit index $\mathcal{Y}$ in $u_{\mathcal{Y}}(x)$.
Lemma 5.3. The matrix $p'(x)$ of derivatives of vector $p(x)$, i.e. the matrix with elements $\partial p^i(x) / \partial x^j$, $i, j = 1, \ldots, n$, where $p^i(x)$ is the $i$-th component of vector $p(x)$, at point $x_*$ has the following form:
$$p'(x_*) = -\left[ P + (I - P)\, L''(x_*, u_0) \right]$$
where
$$P = f_{J_0(x_*)}'^{*}(x_*) \left( f_{J_0(x_*)}'(x_*)\, f_{J_0(x_*)}'^{*}(x_*) \right)^{-1} f_{J_0(x_*)}'(x_*)$$
and $u_0 = u(x_*)$.
Proof. By differentiating the first of formulas (5.52) we obtain

$$p'(x_*) = -L''(x_*, u_0) - \sum_{i \in J_0(x_*)} f_i'(x_*) \left( u^{i\prime}(x_*) \right)^{*} \tag{5.54}$$

where $u^{i\prime}(x)$ is the column-vector with components
$$\frac{\partial u^i(x)}{\partial x^1}, \ \ldots, \ \frac{\partial u^i(x)}{\partial x^n}.$$

From formula (5.51.2), by differentiating and using $p(x_*) = 0$, we obtain

$$f_i'^{*}(x_*)\, p'(x_*) + f_i'^{*}(x_*) = 0, \quad i \in J_0(x_*). \tag{5.55}$$
Note now that operator $P$ defined in formulating the lemma is the operator of projecting on the subspace spanned by the vectors $f_i'(x_*)$, $i \in J_0(x_*)$. Indeed, this can be seen (see also the subsection on p. 147) from easily verified relations:
(1) $P f_i'(x_*) = f_i'(x_*)$, $i \in J_0(x_*)$;
(2) $P^2 = P$, $P^{*} = P$;
(3) $(I - P) P = 0$.
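The three relations can be checked numerically on a small example. The sketch below is only an illustration: the matrix `A` (two hypothetical gradients in $E^3$ stored as rows) and the helper names are assumptions made here, not taken from the text. It forms $P = A^{*}(A A^{*})^{-1} A$ and leaves the verification to plain comparisons.

```python
# Illustrative check of the projector relations (1)-(3).
# A holds two hypothetical gradient vectors as rows; all names are assumptions.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inv2(M):
    """Inverse of a 2x2 matrix (the Gram matrix A A* below)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projector(A):
    """P = A* (A A*)^{-1} A, the projector onto the span of the rows of A."""
    At = transpose(A)
    return matmul(matmul(At, inv2(matmul(A, At))), A)

def close(X, Y, tol=1e-12):
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 1.0]]
P = projector(A)
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
I_minus_P = [[I3[i][j] - P[i][j] for j in range(3)] for i in range(3)]
```

With these definitions, `close(matmul(P, transpose(A)), transpose(A))` corresponds to relation (1), `close(matmul(P, P), P)` and `close(P, transpose(P))` to relation (2), and `matmul(I_minus_P, P)` should vanish up to rounding, as in relation (3).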
If we now rewrite (5.55) in the form

$$f_{J_0(x_*)}'(x_*)\, p'(x_*) + f_{J_0(x_*)}'(x_*) = 0,$$

then taking into account the expression for $P$, we obtain
$$P\, p'(x_*) = -P. \tag{5.56}$$
Further, it follows from relation (1) for $P$ that $(I - P) f_i'(x_*) = 0$. Therefore applying $(I - P)$ to both sides of (5.54), we obtain
$$(I - P)\, p'(x_*) = -(I - P)\, L''(x_*, u_0). \tag{5.57}$$
Adding (5.56) and (5.57) we obtain the required formula for $p'(x_*)$.
Lemma 5.4. Eigenvalues $\gamma_j$ of matrix $p'(x_*)$ can be characterized as follows: $\gamma_j = -1$ for $j = 1, 2, \ldots, m$, where $m \le n$ is the number of indices in set $J_0(x_*)$; $\gamma_j = -\lambda_{j-m}$, $j = m + 1, \ldots, n$, where $\lambda_j$, $j = 1, \ldots, n - m$, are the nonzero eigenvalues of the matrix $(I - P)\, L''(x_*, u_0)\, (I - P)$ and $\lambda_j > 0$, $j = 1, \ldots, n - m$.
Proof. Let $\sigma$ be an eigenvalue and $y$ the corresponding eigenvector of matrix $p'(x_*)$. Then, according to lemma 5.3, we have
$$-P y - (I - P)\, L''(x_*, u_0)\, y = \sigma y = \sigma P y + \sigma (I - P) y.$$
We use now the relation $P (I - P) = 0$ and by multiplying the last equality in turn by $P$ and $I - P$ we obtain:
$$-P y = \sigma P y, \tag{5.58}$$
$$-(I - P)\, L''(x_*, u_0)\, y = \sigma (I - P) y. \tag{5.59}$$
There are two possible cases.

(1) $P y \ne 0$. Then it follows from (5.58) that $\sigma = -1$.
(2) $P y = 0$. In this case, $(I - P) y = y$ and (5.59) can be rewritten as follows:
$$(I - P)\, L''(x_*, u_0)\, (I - P)\, y = -\sigma y, \tag{5.60}$$
i.e. $-\sigma$ is an eigenvalue of matrix $(I - P) L'' (I - P)$. This matrix is symmetric since $P = P^{*}$ and $L'' = (L'')^{*}$, being the matrix of second derivatives of function $L$. Moreover, the matrix under consideration is nonnegative definite. Indeed, for any $w$ we have
$$(w, (I - P) L'' (I - P) w) = (z, L'' z)$$
where
$$z = (I - P) w.$$

But $f_{J_0(x_*)}'(x_*)\, z = f_{J_0(x_*)}'(x_*)\, (I - P) w = 0$ and therefore $(z, L'' z) \ge 0$ by condition (d) of theorem 5.4, the equality sign being possible only if $z = (I - P) w = 0$. It follows from the symmetry of matrix $(I - P) L'' (I - P)$ that its eigenvalues and eigenvectors are real. As $y \ne 0$ and $y = P y + (I - P) y$, it follows from $P y = 0$ that $(I - P) y \ne 0$, and therefore from (5.60) we obtain
$$-\sigma (y, y) = (y, (I - P) L'' (I - P) y) = (y, L'' y) > 0.$$

Thus, $-\sigma > 0$ and consequently $\sigma = -\lambda_j$, where $\lambda_j > 0$ is an eigenvalue of matrix $(I - P) L'' (I - P)$.
Thus we have proved that the eigenvalues of matrix $p'(x_*)$ are real and equal either to $-1$ or to $-\lambda_j$, $\lambda_j > 0$. It remains to determine only the number of eigenvalues which are equal to $-1$.
Due to the fact that

$$P f_i'(x_*) = f_i'(x_*), \quad i \in J_0(x_*),$$

operator $(I - P)$ has $m$ eigenvectors $f_i'(x_*)$ which correspond to the zero eigenvalue. Therefore, matrix $(I - P) L'' (I - P)$ also has $m$ zero eigenvalues. On the other hand, as we have seen, all $n$ eigenvalues of matrix $p'(x_*)$ are nonzero, each being equal either to $-1$ or to the negative of a nonzero eigenvalue of matrix $(I - P) L'' (I - P)$. Clearly, this is possible only if the statement of lemma 5.4 holds true.
We now complete the proof of theorem 5.4. It follows from Ostrowski's theorem that if $x_*$ is the solution of the equation $p(x) = 0$ and the eigenvalues of matrix $I + \alpha p'(x_*)$ have a modulus less than unity, then the method of simple iteration $x_{k+1} = x_k + \alpha p(x_k)$ converges from all points of a certain neighbourhood of point $x_*$, and at the same time the following rate of convergence holds: for each $\varepsilon > 0$ there is a number $C(\varepsilon)$ such that $\|x_* - x_k\| \le C(\varepsilon)\,(q_0 + \varepsilon)^k$, where $q_0$ is the greatest of the moduli of the eigenvalues of matrix $I + \alpha p'(x_*)$.


Consider now the eigenvalues of matrix $I + \alpha p'(x_*)$. They are equal either to $1 - \alpha$ or to $1 - \alpha \lambda_j$. We choose now $\alpha$ so that all of the following inequalities are satisfied:
$$1 - \alpha > -1, \qquad 1 - \alpha \lambda_j > -1, \quad j = 1, \ldots, n - m,$$
i.e. that $0 < \alpha < \min\{2, 2/\lambda_0\}$, where $\lambda_0 = \max \lambda_j$, $j = 1, \ldots, n - m$. Then all the eigenvalues of matrix $I + \alpha p'(x_*)$ will have moduli less than unity; hence, referring also to Ostrowski's results, we have that theorem 5.4 holds.
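The choice of $\alpha$ can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the function names and the sample eigenvalues are hypothetical, and the spectrum of $I + \alpha p'(x_*)$ is simply assembled from lemma 5.4 ($m$ eigenvalues $1 - \alpha$ and the values $1 - \alpha\lambda_j$) rather than computed from a matrix.

```python
# Contraction factor q0 of the simple iteration x_{k+1} = x_k + alpha * p(x_k),
# built from the eigenvalue description in lemma 5.4 (names are illustrative).

def contraction_factor(alpha, lambdas, m):
    """Largest modulus among 1 - alpha (m times) and 1 - alpha * lambda_j."""
    eigenvalues = [1.0 - alpha] * m + [1.0 - alpha * lam for lam in lambdas]
    return max(abs(e) for e in eigenvalues)

def alpha_upper_bound(lambdas):
    """Upper bound min{2, 2 / lambda_0} with lambda_0 = max lambda_j."""
    if not lambdas:
        return 2.0
    return min(2.0, 2.0 / max(lambdas))

lambdas = [0.5, 4.0]                              # hypothetical positive eigenvalues
bound = alpha_upper_bound(lambdas)                # min{2, 2/4}
q_inside = contraction_factor(0.4, lambdas, m=1)  # alpha below the bound
q_outside = contraction_factor(0.6, lambdas, m=1) # alpha above the bound
```

For these sample values a step inside the admissible interval gives a factor below one, while a step beyond $2/\lambda_0$ pushes the eigenvalue $1 - \alpha\lambda_0$ below $-1$ and the factor above one.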
Theorem 5.5. Let the conditions of the preceding theorem be satisfied and, besides, let $m$ (the number of indices in set $J_0(x_*)$) be equal to $n$ (the dimension of the space). In this case, process (5.47) with $\alpha = 1$ converges from a certain neighbourhood of point $x_*$ at a quadratic rate.
Proof. It follows from lemma 5.4 for the case under consideration that all the eigenvalues of matrix $p'(x_*)$ are equal to $-1$, and therefore the eigenvalues of matrix $I + \alpha p'(x_*)$ are equal to $1 - \alpha$. If $\alpha = 1$, then all the eigenvalues are equal to zero and $q_0 = 0$. Therefore according to Ostrowski's theorem, we obtain $\|x_* - x_k\| \le C(\varepsilon)\, \varepsilon^k$, and this means that the process converges at a higher rate than that of any geometric progression. In fact in this case, process (5.47) passes into Newton's method for solving the system of equations $f_i(x) = 0$, $i \in J_0(x_*)$, which, as is well known and as shown below in Sec. 6, converges quadratically.
Remark. All the arguments in this subsection were conducted for the case of a problem with only inequality constraints. It is obvious, however, that all the results obtained can be applied to the case with equality constraints.

6. LINEARIZATION METHOD:
SOLVING SYSTEMS OF EQUALITIES AND
INEQUALITIES AND FINDING THE MINIMAX
In this section, the linearization method is applied to two problems closely connected to the usual problem of mathematical programming. It turns out that in this case one can succeed in constructing effective algorithms which have a fast rate of convergence.

Systems of Equalities and Inequalities
Given two finite sets of indices $J^-$ and $J^0$ and functions $f_i(x)$, $x \in E^n$, find the solution of the following system:
$$f_i(x) \le 0, \quad i \in J^-; \qquad f_i(x) = 0, \quad i \in J^0. \tag{6.1}$$
Suppose that functions $f_i(x)$ have continuous gradients $f_i'(x)$ and also that the gradients satisfy Lipschitz' condition with constant $L$:

$$\|f_i'(x_1) - f_i'(x_2)\| \le L \|x_1 - x_2\|.$$
The norm of vectors is everywhere Euclidean.
We use the notation:
$$F(x) = \max\Big( \max_{i \in J^-} f_i(x),\ \max_{i \in J^0} |f_i(x)| \Big),$$
$$J_\delta^-(x) = \{\, i : i \in J^-,\ f_i(x) \ge F(x) - \delta \,\},$$
$$J_\delta^0(x) = \{\, i : i \in J^0,\ |f_i(x)| \ge F(x) - \delta \,\}.$$

We choose an initial point $x_0$ and assume that for all $x$ that satisfy the inequality $F(x) \le F(x_0)$, the gradients $f_i'(x)$ are bounded in norm by constant $K$.
Basic assumption. There are numbers $\delta > 0$ and $C > 0$ such that for all $x$ for which $F(x) > 0$, $F(x) \le F(x_0)$, the following system is solvable for $p$:
$$(f_i'(x), p) + f_i(x) \le 0, \quad i \in J_\delta^-(x),$$
$$(f_i'(x), p) + f_i(x) = 0, \quad i \in J_\delta^0(x). \tag{6.2}$$

Let $p(x)$ be the solution of (6.2) that has the minimum norm. Then for $x$ such that $F(x) > 0$,
$$\|p(x)\| \le C F(x). \tag{6.3}$$

The inequality (6.3) characterizes to a certain extent the regular solvability of system (6.2). In particular, if system (6.2) is transformed into a system of $n$ equations in $n$ unknowns, condition (6.3) is equivalent to the assumption that the matrix of the corresponding system is nonsingular. As will be shown further on, (6.3) holds if the gradients $f_i'(x)$, $i \in J_\delta^-(x) \cup J_\delta^0(x)$, are linearly independent for all $x$, $F(x) > 0$.
We turn now to the construction of the algorithm. The successive approximations are constructed by the formula
$$x_{k+1} = x_k + \alpha_k p_k, \quad p_k = p(x_k), \tag{6.4}$$

where parameter $\alpha_k$ is chosen by sequentially halving unity until the following inequality is satisfied:
$$F(x_k + \alpha_k p_k) \le (1 - \varepsilon \alpha_k) F(x_k), \tag{6.5}$$

where $\varepsilon$ is any number, chosen from the beginning, $0 < \varepsilon < 1$. Clearly, formula (6.4) is applicable if $F(x_k) > 0$. Otherwise, the process stops and $x_k$ is the solution of (6.1).
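The iteration (6.4) with the halving rule (6.5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `damped_iteration`, its parameters, and the stopping tolerance `tol` (used in place of the exact test $F(x_k) \le 0$) are hypothetical, and the direction $p(x)$ is supplied by the caller. In the example the system is the single equation $\arctan x = 0$, for which the minimum-norm solution of the linearized equation is the Newton step.

```python
from math import atan

def damped_iteration(F, direction, x0, eps=0.5, tol=1e-12, max_iter=100):
    """Iterate x_{k+1} = x_k + alpha_k * p_k, halving alpha from 1 until
    F(x_k + alpha * p_k) <= (1 - eps * alpha) * F(x_k), as in (6.5)."""
    x = list(x0)
    alphas = []
    for _ in range(max_iter):
        Fx = F(x)
        if Fx <= tol:            # assumption: tolerance instead of F(x_k) <= 0
            break
        p = direction(x)
        alpha = 1.0
        while True:
            trial = [xi + alpha * pi for xi, pi in zip(x, p)]
            if F(trial) <= (1.0 - eps * alpha) * Fx:
                break
            alpha *= 0.5
            if alpha < 1e-16:    # safeguard for the sketch only
                raise RuntimeError("step size underflow")
        x = trial
        alphas.append(alpha)
    return x, alphas

# Example: one equation f(x) = arctan(x) = 0 in one unknown, started at x = 3.
f = lambda x: atan(x[0])
F = lambda x: abs(f(x))
newton_direction = lambda x: [-f(x) * (1.0 + x[0] ** 2)]   # p = -f / f'
x, alphas = damped_iteration(F, newton_direction, [3.0])
```

From this starting point the full step $\alpha = 1$ is rejected on the first iteration, which shows why the halving is needed far from the solution; close to the solution the rule accepts $\alpha = 1$.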


Convergence of the Algorithm
The behaviour of the algorithm proposed is characterized by the following theorem.
Theorem 6.1. Let all the assumptions of the preceding subsection be fulfilled. Then sequence $\{x_k\}$, $k = 0, 1, \ldots$, generated by the algorithm according to formula (6.4) converges to $\bar{x}$, the solution of system (6.1), and at the same time
(a) for a sufficiently large $k$, $\alpha_k = 1$;
(b) for a sufficiently large $k$,
$$F(x_{k+1}) \le L C^2 F^2(x_k);$$
(c) for any $q$, $0 < q < 1$, there is a number $k(q)$ such that

$$\|\bar{x} - x_k\| \le \frac{q^{2^{k - k(q)}}}{L C (1 - q)} \tag{6.6}$$

for all $k \ge k(q)$.
Proof. Obviously, if $F(x_k) \le 0$ at a certain step, all the statements are proved. Therefore we suppose that $F(x_k) > 0$ for all $k$.
First of all we show that the choice of $\alpha_k$ under condition (6.5) is always feasible. For $i \in J_\delta^-(x_k)$, using Taylor's formula, we have:
$$f_i(x_k + \alpha p_k) = f_i(x_k) + \alpha (f_i'(x_k + \theta_i \alpha p_k), p_k)$$
$$= f_i(x_k) + \alpha (f_i'(x_k), p_k) + \alpha (f_i'(x_k + \theta_i \alpha p_k) - f_i'(x_k), p_k)$$
where $0 \le \theta_i \le 1$. But since $p_k$ satisfies (6.2), we have
$$(f_i'(x_k), p_k) \le -f_i(x_k).$$
Further
$$(f_i'(x_k + \theta_i \alpha p_k) - f_i'(x_k), p_k) \le \|p_k\| \, \|f_i'(x_k + \theta_i \alpha p_k) - f_i'(x_k)\|$$
$$\le \|p_k\| \, L \, \|\theta_i \alpha p_k\| \le \alpha L \|p_k\|^2.$$
Therefore using (6.3), we obtain
$$f_i(x_k + \alpha p_k) \le f_i(x_k) - \alpha f_i(x_k) + \alpha^2 L \|p_k\|^2$$
$$\le (1 - \alpha) F(x_k) + \alpha^2 L C^2 F^2(x_k). \tag{6.7}$$
For $i \in J^-$, $i \notin J_\delta^-(x_k)$, we have $f_i(x_k) < F(x_k) - \delta$ and therefore
$$f_i(x_k + \alpha p_k) = f_i(x_k) + \alpha (f_i'(x_k + \theta_i \alpha p_k), p_k)$$
$$\le F(x_k) - \delta + \alpha K \|p_k\| \le F(x_k) - \delta + \alpha K C F(x_k). \tag{6.8}$$
Quite similarly, we have for $i \in J_\delta^0(x_k)$
$$|f_i(x_k + \alpha p_k)| \le (1 - \alpha) F(x_k) + \alpha^2 L C^2 F^2(x_k) \tag{6.9}$$
and for $i \in J^0$, $i \notin J_\delta^0(x_k)$
$$|f_i(x_k + \alpha p_k)| \le F(x_k) - \delta + \alpha K C F(x_k). \tag{6.10}$$


Note now that
$$(1 - \alpha) F(x_k) \ge F(x_k) - \delta + \alpha C K F(x_k)$$
if $\alpha \le \alpha_k^1$, where
$$\alpha_k^1 = \frac{\delta}{(1 + C K) F(x_k)}.$$
Therefore for $\alpha \le \alpha_k^1$ it follows from (6.7)-(6.10) that
$$F(x_k + \alpha p_k) \le (1 - \alpha) F(x_k) + \alpha^2 L C^2 F^2(x_k)$$
or
$$F(x_k + \alpha p_k) \le F(x_k) - \alpha F(x_k) \left[ 1 - \alpha L C^2 F(x_k) \right]. \tag{6.11}$$
If $\alpha \le \alpha_k^2$, where
$$\alpha_k^2 = \frac{1 - \varepsilon}{L C^2 F(x_k)},$$

then $1 - \alpha L C^2 F(x_k) \ge \varepsilon$ and therefore (6.11) can be rewritten as

$$F(x_k + \alpha p_k) \le F(x_k) - \alpha \varepsilon F(x_k), \quad \alpha \le \min\{\alpha_k^1, \alpha_k^2\}. \tag{6.12}$$
It is now clear that if we proceed to reduce $\alpha$ beginning with $\alpha = 1$, then inequality (6.12) will hold after a finite number of trials and the $\alpha_k$ chosen will satisfy the inequality

$$\alpha_k \ge \min\Big\{ 1, \frac{1}{2}\alpha_k^1, \frac{1}{2}\alpha_k^2 \Big\}. \tag{6.13}$$

Thus, we have proved that the choice of $\alpha_k$ under condition (6.5) is feasible and that this choice can be realized after a finite number of operations.
We show that $F(x_k) \to 0$. Indeed, it follows from (6.5) that $F(x_k)$ decreases monotonically. Therefore, it can be concluded from the formulas for $\alpha_k^1$ and $\alpha_k^2$ that these quantities increase with increasing $k$. Consequently, formula (6.13) permits us to conclude that $\alpha_k \ge \bar{\alpha} > 0$ and so
$$F(x_{k+1}) \le (1 - \varepsilon \alpha_k) F(x_k) \le (1 - \varepsilon \bar{\alpha}) F(x_k).$$

Therefore $F(x_k) \le (1 - \varepsilon \bar{\alpha})^k F(x_0)$, hence $F(x_k) \to 0$. But then $\alpha_k^1 \to +\infty$, $\alpha_k^2 \to +\infty$, as can be seen directly from the formulas for these quantities. Therefore, (6.13) permits us to conclude that $\alpha_k = 1$ for a sufficiently large $k$. But (6.11) shows for all such $k$, if $\alpha = 1$ is substituted into it, that
$$F(x_{k+1}) \le L C^2 F^2(x_k). \tag{6.14}$$
Thus statements (a) and (b) of the theorem have been proved.
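To see what the quadratic estimate (6.14) means in practice, write $v_k = L C^2 F(x_k)$, so that (6.14) reads $v_{k+1} \le v_k^2$. The sketch below is only an illustration with assumed numbers: it counts how many steps the worst case allowed by this bound needs to bring $v_k$ from 0.5 below $10^{-12}$, and compares it with a geometric decrease of ratio 0.5 (the ratio, the start value and the target are all choices made here).

```python
# Worst case of (6.14), v_{k+1} = v_k ** 2, versus a geometric decrease
# v_{k+1} = 0.5 * v_k. All numerical values are illustrative assumptions.

def steps_until(v0, target, update):
    """Number of updates needed until the value drops to target or below."""
    v, k = v0, 0
    while v > target:
        v = update(v)
        k += 1
    return k

quadratic_steps = steps_until(0.5, 1e-12, lambda v: v * v)    # 6 steps
geometric_steps = steps_until(0.5, 1e-12, lambda v: 0.5 * v)  # 39 steps
```

The squaring rule doubles the exponent at every step, which is the mechanism behind the factor $q^{2^{k - k(q)}}$ in statement (c) of the theorem.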


We can now assert that there is a $k_0$ such that $\alpha_k = 1$ for $k \ge k_0$ and (6.14) is satisfied. Therefore, by (6.3),
$$\|x_{k+1} - x_k\| = \|p_k\| \le C F(x_k).$$
We set $v_k = L C^2 F(x_k)$. Then $v_k \to 0$ and (by (6.14)) $v_{k+1} \le v_k^2$. Let $q$ be such that $0 < q < 1$. Then there is a $k(q)$ such that $v_k < q$ for $k \ge k(q)$. Therefore $v_{k+1} \le q v_k$, $k \ge k(q)$. Hence

$$v_k \le v_{k(q)}^{2^{k - k(q)}} \le q^{2^{k - k(q)}}, \qquad v_m \le q^{m - k} v_k, \quad m \ge k \ge k(q).$$

This permits us to obtain the following estimate:

$$\|x_m - x_k\| \le \sum_{j=k}^{m-1} \|x_{j+1} - x_j\| \le \frac{1}{L C} \sum_{j=k}^{m-1} v_j \le \frac{v_k}{L C} \sum_{j=0}^{m-1-k} q^j \le \frac{v_k}{L C (1 - q)} \le \frac{q^{2^{k - k(q)}}}{L C (1 - q)}. \tag{6.15}$$

It follows from this estimate (according to the well-known Cauchy criterion) that sequence $\{x_k\}$ converges to a certain point $\bar{x}$. Since $F(x_k) \to 0$, we have $F(\bar{x}) = 0$, i.e. $\bar{x}$ is the solution of system (6.1). Moreover, taking the limit in (6.15) as $m \to \infty$ we obtain
$$\|\bar{x} - x_k\| \le \frac{q^{2^{k - k(q)}}}{L C (1 - q)}.$$
Q.E.D.

Remarks
Remark 1. Suppose we are solving a system of $n$ equations $f_i(x) = 0$, $i = 1, \ldots, n$, where $x \in E^n$. Then
$$|f_i(x)| \ge F(x) - \delta, \quad i = 1, \ldots, n,$$
$$F(x) = \max_{1 \le i \le n} |f_i(x)|$$
for any $\delta > 0$, provided $x$ is sufficiently close to the solution $\bar{x}$. Therefore, $J_\delta^0(x) = \{1, 2, \ldots, n\}$ and system (6.2) takes the form
$$(f_i'(x), p) + f_i(x) = 0, \quad i = 1, \ldots, n. \tag{6.16}$$
Therefore the method proposed coincides with Newton's method, in which iterations are performed by the formula $x_{k+1} = x_k + p(x_k)$, where $p(x)$ is the solution of system (6.16). The condition for the convergence of Newton's method is the nonsingularity at point $\bar{x}$ of matrix $f'(x)$, where $f'(x)$ is an $n \times n$ matrix whose rows are $f_i'(x)$. In this case, $p(x) = -(f'(x))^{-1} f(x)$, where $f(x)$ is a column-vector whose components are $f_i(x)$. But it follows from the last formula that
$$\|p(x)\| \le \|(f'(x))^{-1}\| \, \|f(x)\| \le C_0 \|(f'(x))^{-1}\| F(x)$$
where $C_0$ is a constant. It can be seen from this inequality that (6.3) holds in a certain neighbourhood of point $\bar{x}$.
Thus it follows from the theorem proved that the usual Newton's method is locally convergent in solving a system of $n$ equations with $n$ unknowns.
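For two equations in two unknowns, the Newton step $p(x) = -(f'(x))^{-1} f(x)$ of Remark 1 can be written out with Cramer's rule. The sketch below is a hedged illustration: the function `newton_2x2`, the tolerance, and the test system (a circle of radius 2 intersected with the line $x = y$) are assumptions made for this example only.

```python
# Newton's method for a 2x2 system, solving f'(x) p = -f(x) by Cramer's rule.
# The function name, tolerance and example system are illustrative assumptions.

def newton_2x2(f, jac, x, steps=20, tol=1e-14):
    for _ in range(steps):
        f1, f2 = f(x)
        if max(abs(f1), abs(f2)) <= tol:   # F(x) = max |f_i(x)|
            break
        (a, b), (c, d) = jac(x)            # rows are the gradients f_i'(x)
        det = a * d - b * c                # must be nonzero (nonsingular f'(x))
        p1 = (-f1 * d + b * f2) / det
        p2 = (-a * f2 + c * f1) / det
        x = (x[0] + p1, x[1] + p2)
    return x

# Example system: x^2 + y^2 - 4 = 0, x - y = 0, with solution (sqrt 2, sqrt 2).
f = lambda x: (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])
jac = lambda x: ((2.0 * x[0], 2.0 * x[1]), (1.0, -1.0))
solution = newton_2x2(f, jac, (1.0, 2.0))
```

Here the Jacobian determinant $-2x - 2y$ stays away from zero near the solution, which is the nonsingularity condition mentioned in the remark.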
Remark 2. If only one equation $f(x) = 0$ in $n$ unknowns is to be solved, then system (6.2) takes the form
$$(f'(x), p) + f(x) = 0 \tag{6.17}$$
and it is required to find the solution of this equation with a minimum norm, i.e. to find the minimum of $\|p\|^2$ with constraint (6.17). Using the rule of Lagrange multipliers, we have in this case

$$p(x) = -\frac{f(x)}{\|f'(x)\|^2} f'(x),$$
hence

$$\|p(x)\| = \frac{|f(x)|}{\|f'(x)\|}.$$
Clearly, formula (6.3) will be satisfied if $\|f'(x)\| \ge \gamma > 0$ for all $x$.
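The minimum-norm step of Remark 2 is easy to verify on a concrete point. The following sketch is illustrative only: the helper `min_norm_step` and the sample equation $x^2 + y^2 - 1 = 0$ evaluated at the point $(2, 1)$ are assumptions introduced here.

```python
# Minimum-norm solution of (f'(x), p) + f(x) = 0 for a single equation:
# p = -f(x) * f'(x) / ||f'(x)||^2. Names and sample data are illustrative.

def min_norm_step(fx, grad):
    g2 = sum(g * g for g in grad)          # ||f'(x)||^2, assumed nonzero
    return [-fx * g / g2 for g in grad]

# Sample: f(x, y) = x^2 + y^2 - 1 at the point (2, 1).
fx = 2.0 ** 2 + 1.0 ** 2 - 1.0             # f = 4
grad = [2.0 * 2.0, 2.0 * 1.0]              # f' = (4, 2)
p = min_norm_step(fx, grad)

residual = sum(g * pi for g, pi in zip(grad, p)) + fx   # left side of (6.17)
norm_p = sum(pi * pi for pi in p) ** 0.5
predicted = abs(fx) / sum(g * g for g in grad) ** 0.5   # |f| / ||f'||
```

The residual of the linearized equation vanishes up to rounding, and the computed norm agrees with $|f(x)| / \|f'(x)\|$.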
Remark 3. Finding vector $p(x)$ at each step involves solving the problem of minimizing $\|p\|^2$ with constraints (6.2). This is a problem of quadratic programming. Concerning the methods of solving it, we can use the same information given in Sec. 5 about solving the subsidiary problem of quadratic programming which arises in the linearization method.

Sufficient Conditions of Convergence
The main condition (6.3) which guarantees the convergence of the algorithm is not easy to check. This subsection describes conditions that can be checked more effectively. In particular, for the convex case, if there is an interior point in the domain defined by expressions (6.1), the conditions guarantee the convergence of the algorithm.
Let the system contain only inequality constraints, i.e.
$$f_i(x) \le 0, \quad i \in J^-. \tag{6.18}$$
Then the subsidiary system (6.2) takes the form
$$(f_i'(x), p) + f_i(x) \le 0, \quad i \in J_\delta^-(x). \tag{6.19}$$
Clearly, this system can be solved with $F(x) > 0$ if the system
$$(f_i'(x), p) + F(x) \le 0, \quad i \in J_\delta^-(x) \tag{6.20}$$
is solvable.


Lemma 6.1. If $F(x) > 0$, then system (6.20) has a solution if and only if
$$L_\delta(x) = \min \Big\| \sum_{i \in J_\delta^-(x)} \lambda_i f_i'(x) \Big\| > 0$$

where the minimum is taken over all $\lambda_i \ge 0$ such that

$$\sum_{i \in J_\delta^-(x)} \lambda_i = 1.$$
Then the solution $p(x)$ of system (6.20) with a minimum norm satisfies the equality
$$\|p(x)\| = \frac{F(x)}{L_\delta(x)}.$$

Proof. Let $\lambda_i \ge 0$ be such that their sum over all $i \in J_\delta(x)$ is equal to unity. If $p$ is a solution of (6.20), then

$$-\sum_{i \in J_\delta(x)} \lambda_i (f_i'(x), p) \ge F(x),$$

or

$$\Big(-\sum_{i \in J_\delta(x)} \lambda_i f_i'(x),\ p\Big) \ge F(x).$$

Using the inequality $(x, y) \le \|x\|\,\|y\|$, we obtain

$$\Big\| \sum_{i \in J_\delta(x)} \lambda_i f_i'(x) \Big\|\,\|p\| \ge F(x).$$

But the last inequality holds with any $\lambda_i$ chosen as mentioned above and therefore

$$L_\delta(x)\,\|p\| \ge F(x),$$

i.e. $L_\delta(x) > 0$ and

$$\|p\| \ge \frac{F(x)}{L_\delta(x)}. \tag{6.21}$$

Thus, it has been proved that the conditions of the lemma are necessary.

Suppose now that $L_\delta(x) > 0$. Consider the problem: to find the minimum of $\rho$ with the following constraints:

$$(f_i'(x), p) + F(x) - \rho \le 0, \quad i \in J_\delta(x),$$
$$\|p\| \le r_0, \quad r_0 = \frac{F(x)}{L_\delta(x)} > 0. \tag{6.22}$$

This is a problem of convex programming, and all the conditions of the Kuhn-Tucker theorem, in particular Slater's condition, are obviously fulfilled. Let $p_0, \rho_0$ be the solution. Applying the Kuhn-Tucker theorem, we obtain that there are $\lambda_i \ge 0$ such that for all $p$, $\|p\| \le r_0$ and for all $\rho$ the following inequality holds:

$$\rho_0 + \sum_{i \in J_\delta(x)} \lambda_i \big((f_i'(x), p_0) + F(x) - \rho_0\big) \le \rho + \sum_{i \in J_\delta(x)} \lambda_i \big((f_i'(x), p) + F(x) - \rho\big). \tag{6.23}$$

Besides, we have

$$\lambda_i \big((f_i'(x), p_0) + F(x) - \rho_0\big) = 0, \quad i \in J_\delta(x). \tag{6.24}$$

Since $\rho$ is arbitrary, it follows directly from (6.23) that

$$\sum_{i \in J_\delta(x)} \lambda_i = 1.$$

Because of (6.24), we can rewrite (6.23) in the form

$$\rho_0 \le \Big(\sum_{i \in J_\delta(x)} \lambda_i f_i'(x),\ p\Big) + F(x).$$

Taking the minimum with respect to $p$, $\|p\| \le r_0$, of the right-hand side of the last inequality we obtain

$$\rho_0 \le -r_0 \Big\| \sum_{i \in J_\delta(x)} \lambda_i f_i'(x) \Big\| + F(x) \le -r_0 L_\delta(x) + F(x) = 0.$$

Thus, $\rho_0 \le 0$, i.e. vector $p_0$ satisfies the system of inequalities (see (6.22))

$$(f_i'(x), p_0) + F(x) \le \rho_0 \le 0, \quad i \in J_\delta(x),$$

and we have also $\|p_0\| \le r_0 = F(x)/L_\delta(x)$. But it follows from (6.21) that

$$\|p_0\| \ge \frac{F(x)}{L_\delta(x)}.$$

Therefore

$$\|p_0\| = \frac{F(x)}{L_\delta(x)}$$

and vector $p_0$ is the solution of system (6.20). Moreover, (6.21) shows that this is a solution with a minimum norm. Thus, $p_0 = \bar p(x)$ and the lemma is proved.
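For two active gradients the quantity $L_\delta(x)$ of lemma 6.1 has a closed form, which makes the equality $\|\bar p(x)\| = F(x)/L_\delta(x)$ easy to check numerically. The sketch below is an illustration under assumptions: the data `g1`, `g2`, `F` are hypothetical, and the candidate $\bar p = -F\,v/\|v\|^2$, where $v$ is the shortest vector in the convex hull of the gradients, is a standard projection argument rather than a formula taken from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def min_norm_in_segment(g1, g2):
    """Shortest vector lam*g1 + (1 - lam)*g2 with 0 <= lam <= 1."""
    d = [a - b for a, b in zip(g1, g2)]
    dd = dot(d, d)
    # unconstrained minimizer of |g2 + lam*d|^2, clipped to [0, 1]
    lam = 0.0 if dd == 0.0 else min(1.0, max(0.0, -dot(g2, d) / dd))
    return [lam * a + (1.0 - lam) * b for a, b in zip(g1, g2)]

# Hypothetical data: two active gradients and the value F(x) = 2.
g1, g2, F = [1.0, 0.0], [0.0, 1.0], 2.0
v = min_norm_in_segment(g1, g2)
vv = dot(v, v)
L_delta = math.sqrt(vv)                          # L_delta(x) for this data
p_bar = [-F * c / vv for c in v]                 # candidate minimum-norm solution
norm_p_bar = math.sqrt(dot(p_bar, p_bar))
slack = [dot(g, p_bar) + F for g in (g1, g2)]    # (6.20) requires each <= 0
```

For this data the shortest vector is $v = (0.5, 0.5)$, so $L_\delta = \sqrt{0.5}$ and the candidate $\bar p = (-2, -2)$ has norm $F/L_\delta$.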
Theorem 6.2. Let all the assumptions of the subsection on p. 211 hold, except the main one. Moreover, let $L_\delta(x) \ge \gamma > 0$ for all $x$ such that $0 < F(x) \le F(x_0)$. Then the conditions of the main assumption are fulfilled too and all the results of theorem 6.1 hold for problem (6.18).


Proof. Since any solution of system (6.20) is also a solution of system (6.19), we have

$$\|p(x)\| \le \|\bar p(x)\|.$$

Therefore by lemma 6.1, we have

$$\|p(x)\| \le \frac{F(x)}{L_\delta(x)} \le \frac{1}{\gamma}\,F(x);$$

this shows that all the conditions of theorem 6.1 are satisfied.

Note that the condition $L_\delta(x) \ge \gamma > 0$ is natural enough, for it requires linear independence of vectors $f_i'(x)$, $i \in J_\delta(x)$.
Theorem 6.3. Let functions $f_i(x)$ in problem (6.18) be convex and continuously differentiable. Besides, let the domain defined by the inequality $F(x) \le F(x_0)$ be compact, the gradients $f_i'(x)$ in this domain satisfy Lipschitz' condition and there be a point $\bar x$ such that $F(\bar x) = \gamma < 0$. Then with $\delta < -\gamma$ all the conditions of theorem 6.1 are fulfilled.

Proof. As $f_i(x)$ are convex, we have

$$f_i(\bar x) \ge f_i(x) + (f_i'(x), \bar x - x), \quad i \in J^-.$$

For $i \in J_\delta(x)$ with $p = \bar x - x$ we have

$$f_i(\bar x) + \delta \ge f_i(x) + \delta + (f_i'(x), p).$$

But $f_i(\bar x) + \delta \le F(\bar x) + \delta = \gamma + \delta < 0$ and $f_i(x) + \delta \ge F(x)$, $i \in J_\delta(x)$. Therefore

$$0 > \gamma + \delta \ge F(x) + (f_i'(x), p), \quad i \in J_\delta(x).$$

Setting $\gamma + \delta = -\varepsilon$, we obtain that

$$(f_i'(x), p) + (F(x) + \varepsilon) \le 0, \quad i \in J_\delta(x).$$

But by lemma 6.1, this means that for all $x$ such that $F(x) \le F(x_0)$, $F(x) + \varepsilon \ge 0$, system (6.20) is solvable and for such $x$ also $L_\delta(x) > 0$. Now since the domain $F(x) \le F(x_0)$ is compact and the functions are continuous, it can be easily ascertained that $L_\delta(x) \ge \gamma' > 0$ for all $x$ such that $0 \le F(x) \le F(x_0)$, where $\gamma'$ is a positive constant playing the role of $\gamma$ in theorem 6.2.

Thus all the conditions of theorem 6.2 are satisfied and this completes the proof of theorem 6.3.

Solving the Problem of Finding the Minimax

Given functions $f_i(x)$, $i = 1, \dots, m$. We compose the function

$$F(x) = \max_{1 \le i \le m} f_i(x). \tag{6.25}$$

The problem is now to find point $x \in E^n$ which minimizes $F(x)$.


It is easy to see that this problem can be reduced to the following one by introducing an additional variable $x^{n+1}$: to minimize $f_0(x, x^{n+1}) = x^{n+1}$ with constraints

$$f_i(x) - x^{n+1} \le 0, \quad i = 1, \dots, m.$$

Therefore the methods described above, in particular the linearization method, are now applicable. Note also that in this way we can solve also the problem of the minimization of $F(x)$ if $x$ varies in a certain domain $\Omega$ defined by a system of equalities or inequalities.
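A small worked instance may make the reduction concrete. The two functions below are chosen here purely for illustration, and the additional variable $x^{n+1}$ is written $t$ to avoid confusion with a square.

```latex
\begin{aligned}
&n = 1,\qquad f_1(x) = x^2,\qquad f_2(x) = (x-2)^2,\qquad
  F(x) = \max\{x^2,\ (x-2)^2\},\\[4pt]
&\text{reduced problem:}\quad \min_{x,\,t}\ t
  \quad\text{subject to}\quad x^2 - t \le 0,\quad (x-2)^2 - t \le 0,\\[4pt]
&\text{at the solution both constraints are active:}\quad
  x^2 = (x-2)^2 \ \Longrightarrow\ x_* = 1,\quad t_* = F(x_*) = 1 .
\end{aligned}
```

For $x < 1$ the second function exceeds 1 and for $x > 1$ the first one does, so $x_* = 1$ is indeed the minimax point of this example.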
In this subsection, we shall discuss the method of minimization of $F(x)$ with $x \in E^n$. This method is based on a slight modification of the linearization method.

Let us introduce at each point $x$ the following subsidiary problem:

$$\min \Big(\rho + \tfrac{1}{2}\|p\|^2\Big),$$
$$(f_i'(x), p) + f_i(x) - \rho \le 0, \quad i \in J_\delta(x), \tag{6.26}$$

where $\delta > 0$ and

$$J_\delta(x) = \{i:\ 1 \le i \le m,\ f_i(x) \ge F(x) - \delta\}.$$

Note that problem (6.26) is a problem of convex programming for which Slater's condition is satisfied, for taking $\rho$ sufficiently great we can always satisfy strictly constraints (6.26). By applying directly the Kuhn-Tucker theorem in its differential form, we now find that $p(x)$ and $\rho(x)$ is the solution of problem (6.26) if and only if there are $u^i \ge 0$, $i \in J_\delta(x)$ such that

$$\sum_{i \in J_\delta(x)} u^i = 1,$$
$$p(x) + \sum_{i \in J_\delta(x)} u^i f_i'(x) = 0,$$
$$u^i \big((f_i'(x), p(x)) + f_i(x) - \rho(x)\big) = 0, \quad i \in J_\delta(x). \tag{6.27}$$

Further, point $p = 0$, $\rho = F(x)$, obviously satisfies constraints (6.26). Therefore

$$\rho(x) + \tfrac{1}{2}\|p(x)\|^2 \le F(x). \tag{6.28}$$

We formulate now the algorithm for solving the problem. Let $x_0$ be a certain initial approximation. Let points $x_j$, $j = 0, 1, \dots, k$ be already constructed. Then

$$x_{k+1} = x_k + \alpha_k p_k \tag{6.29}$$

where $p_k = p(x_k)$. We choose $\alpha_k$ equal to $2^{-i_0}$, where $i_0$ is the first of indices $i = 0, 1, \dots$ which satisfies the following inequality:

$$F(x_k + 2^{-i} p_k) \le F(x_k) - 2^{-i}\varepsilon \|p_k\|^2, \quad 0 < \varepsilon < \tfrac{1}{2}.$$


Thus the condition

$$F(x_{k+1}) \le F(x_k) - \alpha_k \varepsilon \|p_k\|^2 \tag{6.30}$$

is fulfilled. We formulate now the conditions for convergence of the algorithm.
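The iteration (6.29) with the halving rule for the step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it works in one dimension with exactly two functions, treats both indices as belonging to $J_\delta(x)$ (i.e. $\delta$ is taken large), and solves the subsidiary problem (6.26) by a closed form that is valid only in this special case. The names `subproblem_1d` and `minimax_1d` are hypothetical.

```python
def subproblem_1d(f, a):
    """Solve min over (p, rho) of rho + p*p/2 with a[i]*p + f[i] <= rho, i = 0, 1.

    Closed form for two constraints in one dimension (illustration only).
    """
    for i, j in ((0, 1), (1, 0)):
        p = -a[i]                       # minimizer if only constraint i is active
        if a[i] * p + f[i] >= a[j] * p + f[j]:
            return p, a[i] * p + f[i]
    p = (f[1] - f[0]) / (a[0] - a[1])   # both constraints active at the solution
    return p, a[0] * p + f[0]

def minimax_1d(funcs, grads, x, eps=0.25, tol=1e-10, max_iter=100):
    """Iterate x <- x + alpha*p, halving alpha from 1 until (6.30) holds."""
    F = lambda t: max(fn(t) for fn in funcs)
    for _ in range(max_iter):
        p, _rho = subproblem_1d([fn(x) for fn in funcs], [g(x) for g in grads])
        if abs(p) < tol:                # p(x) = 0: necessary conditions hold
            break
        alpha = 1.0
        while F(x + alpha * p) > F(x) - alpha * eps * p * p:
            alpha *= 0.5
        x += alpha * p
    return x

# Hypothetical test problem: F(x) = max(x^2, (x - 2)^2), minimized at x = 1.
funcs = [lambda t: t * t, lambda t: (t - 2.0) ** 2]
grads = [lambda t: 2.0 * t, lambda t: 2.0 * (t - 2.0)]
x_min = minimax_1d(funcs, grads, x=5.0)
```

In a general setting the subsidiary problem is a quadratic program and would need a proper solver; only the step-size rule carries over unchanged.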
Lemma 6.2. $p(x) = 0$ if and only if the necessary conditions for a minimum of $F(x)$ are satisfied at point $x$.

To prove the lemma we must recall the necessary conditions for a minimum of $F(x)$ and use an argument analogous to that used for proving lemma 5.1.
Theorem 6.4. Let $f_i(x)$ be continuously differentiable, domain $\Omega = \{x:\ F(x) \le F(x_0)\}$ be bounded and $f_i'(x)$ satisfy in $\Omega$ Lipschitz' condition with constant $L$. Then any limit point $x_*$ of sequence $\{x_k\}$, $k = 0, 1, \dots$, satisfies the necessary conditions for a minimum of $F(x)$ with $x \in E^n$. If $f_i(x)$ are convex, then $x_*$ is the solution of the problem.
Proof. As in proving theorem 5.1, it is easy to obtain the following estimates:

$$f_i(x_k + \alpha p_k) \le f_i(x_k) + \alpha (f_i'(x_k), p_k) + \alpha^2 L \|p_k\|^2, \quad i \in J_\delta(x_k),$$
$$f_i(x_k + \alpha p_k) \le F(x_k) - \delta + \alpha K \|p_k\|, \quad i \notin J_\delta(x_k),$$

where $K = \max_{i,\ x \in \Omega} \|f_i'(x)\|$.

If we use now the condition (see (6.26))

$$(f_i'(x_k), p_k) \le \rho_k - f_i(x_k), \quad \rho_k = \rho(x_k)$$

and also (6.28), then the first estimate takes the following form:

$$f_i(x_k + \alpha p_k) \le (1 - \alpha) f_i(x_k) + \rho_k \alpha + \alpha^2 L \|p_k\|^2$$
$$\le F(x_k) - \alpha (F(x_k) - \rho_k) + \alpha^2 L \|p_k\|^2$$
$$\le F(x_k) - \frac{\alpha}{2}\|p_k\|^2 + \alpha^2 L \|p_k\|^2.$$

Further, since for $0 \le \alpha \le \alpha_k^1$,

$$\alpha_k^1 = \frac{\delta}{\|p_k\| \big(K + \tfrac{1}{2}\|p_k\|\big)},$$

we have

$$F(x_k) - \frac{\alpha}{2}\|p_k\|^2 \ge F(x_k) - \delta + \alpha K \|p_k\|,$$

it follows that

$$f_i(x_k + \alpha p_k) \le F(x_k) - \frac{\alpha}{2}\|p_k\|^2 + \alpha^2 L \|p_k\|^2 \tag{6.31}$$

with $0 \le \alpha \le \alpha_k^1$ for all $i$. Therefore

$$F(x_k + \alpha p_k) \le F(x_k) - \alpha \|p_k\|^2 \Big(\tfrac{1}{2} - \alpha L\Big), \quad 0 \le \alpha \le \alpha_k^1. \tag{6.32}$$


If now

$$\bar\alpha_k = \min\Big\{\alpha_k^1,\ \frac{\tfrac{1}{2} - \varepsilon}{L}\Big\} \tag{6.33}$$

then

$$F(x_k + \alpha p_k) \le F(x_k) - \alpha \|p_k\|^2 \varepsilon \quad \text{for } 0 \le \alpha \le \bar\alpha_k.$$

It follows immediately that inequality (6.30) holds with

$$\alpha_k \ge \tfrac{1}{2}\bar\alpha_k \tag{6.34}$$

after a finite number of reductions starting with unity.

It follows immediately from (6.30) that $\alpha_k \|p_k\|^2 \to 0$. This means that $\|p_k\| \to 0$. Indeed, $\alpha_k^1 \ge \gamma > 0$ since by (6.27) $\|p(x)\|$ has an upper bound in $\Omega$. But it follows from (6.33), (6.34) that $\alpha_k$ also has a lower bound, a certain positive constant.
Thus, $p_k \to 0$. Let now $x_*$ be a limit point of the sequence. Without loss of generality, we can take that $x_k \to x_*$. Moreover, since $u_k^i$, $i \in J_\delta(x_k)$ are nonnegative and their sum is equal to unity, we can, setting $u_k^i = 0$, $i \notin J_\delta(x_k)$, take that $u_k^i \to u_*^i$ and $u_*^i \ge 0$, their sum being

$$\sum_{i=1}^{m} u_*^i = 1. \tag{6.35}$$

We rewrite now (6.27) and (6.26) for points $x_k$ as follows

$$p_k + \sum_{i=1}^{m} u_k^i f_i'(x_k) = 0,$$
$$u_k^i \big((f_i'(x_k), p_k) + f_i(x_k) - \rho_k\big) = 0, \quad i = 1, \dots, m,$$
$$(f_i'(x_k), p_k) + f_i(x_k) \le \rho_k, \quad i \in J_\delta(x_k). \tag{6.36}$$

It follows from the last inequality (6.36), if we choose $i \in J_\delta(x_k)$ such that $f_i(x_k) = F(x_k)$, that

$$\rho_k \ge f_i(x_k) - K \|p_k\| = F(x_k) - K \|p_k\|.$$

But (6.28) shows that $\rho_k \le F(x_k) - \tfrac{1}{2}\|p_k\|^2$. Therefore $\rho_k \to F(x_*)$.

Taking the limit in (6.36) we obtain:

$$\sum_{i=1}^{m} u_*^i f_i'(x_*) = 0,$$
$$u_*^i \big(f_i(x_*) - F(x_*)\big) = 0, \quad i = 1, \dots, m,$$
$$\sum_{i=1}^{m} u_*^i = 1, \quad u_*^i \ge 0. \tag{6.37}$$


But these are just the necessary conditions for $F(x)$ to attain its minimum at point $x_*$ (see Chap. I). If $f_i(x)$ are convex, then these conditions will be at the same time sufficient and this proves the theorem.

Let us now give a local estimate of the convergence of the algorithm.
Theorem 6.5. Let $x_*$ be the minimum point of $F(x)$ and functions $f_i(x)$ be twice continuously differentiable. Besides, let the gradients $f_i'(x_*)$, $i \in J_0(x_*)$, where $J_0(x_*) = \{i:\ f_i(x_*) = F(x_*)\}$, be such that the differences

$$f_i'(x_*) - f_{i_0}'(x_*), \quad i \ne i_0, \quad i, i_0 \in J_0(x_*)$$

are linearly independent and the multipliers $u_*^i$ strictly greater than zero for $i \in J_0(x_*)$ and $(y, L''_{xx}(x_*, u_*)\,y) > 0$ for all $y \ne 0$. Here

$$L(x, u) = \sum_{i=1}^{m} u^i f_i(x)$$

and $L''_{xx}(x, u)$ is the matrix of second derivatives with respect to $x$. Then with sufficiently small $\delta > 0$ and $\alpha > 0$ there is a neighbourhood of point $x_*$ such that the process

$$x_{k+1} = x_k + \alpha p(x_k), \quad k = 0, 1, \dots,$$

converges starting from any initial approximation $x_0$ of this region and $\|x_* - x_k\| \le C q^k$, where $0 < q < 1$.
Proof. We shall give only the general scheme of the proof since the complete proof is quite analogous to the proof of theorem 5.4 and in fact is reduced to it.

If we take

$$J(x) = \{i \in J_\delta(x):\ (f_i'(x), p(x)) + f_i(x) - \rho(x) = 0\},$$

then it can be shown (see lemma 5.2) that with a small $\delta$ we have $J(x) = J_0(x_*) = J_\delta(x)$ for all $x$ close to $x_*$. Therefore it follows from (6.26) and (6.27) that vector $p(x)$ and the corresponding Lagrange multipliers $u^i$ satisfy the system of equations

$$p(x) + \sum_{i \in J_0(x_*)} u^i f_i'(x) = 0,$$
$$(f_i'(x), p(x)) + f_i(x) = \rho(x), \quad i \in J_0(x_*),$$
$$\sum_{i \in J_0(x_*)} u^i = 1. \tag{6.38}$$

Let $i_0$ be an index from $J_0(x_*)$ and

$$\bar f_i(x) = f_i(x) - f_{i_0}(x), \quad \bar f_0(x) = f_{i_0}(x).$$


Then system (6.38) is equivalent to the following one:

$$p(x) + \bar f_0'(x) + \sum_{i \in \bar J_0(x_*)} u^i \bar f_i'(x) = 0,$$
$$(\bar f_i'(x), p(x)) + \bar f_i(x) = 0, \quad i \in \bar J_0(x_*),$$
$$\bar J_0(x_*) = J_0(x_*) \setminus \{i_0\}. \tag{6.39}$$

But this system has exactly the same form as system (5.51.1), (5.51.2). As the proof of theorem 5.4 was reduced to a study of the properties of $p(x)$, the solution of system (5.51.1), (5.51.2), it follows that the further proof of theorem 6.5 is simply reduced to checking the conditions of theorem 5.4. But it can be easily ascertained that the assumptions of theorem 6.5 provide completely for the fulfilment of the conditions of theorem 5.4 for functions $\bar f_i$ and this completes the proof of the theorem.

The following theorem is a direct analogue of theorem 5.5.
Theorem 6.6. Let the conditions of theorem 6.5 be fulfilled and, besides, the number of indices in set $J_0(x_*)$ be equal to $n + 1$. In this case, with a small $\delta$ the process

$$x_{k+1} = x_k + p(x_k) \tag{6.40}$$

converges at a quadratic rate to point $x_*$.

Proof. In the case under consideration, vector $p(x)$ is uniquely defined by the system of equations

$$(\bar f_i'(x), p(x)) + \bar f_i(x) = 0, \quad i \in \bar J_0(x_*),$$

since vectors $\bar f_i'(x)$, $i \in \bar J_0(x_*)$ are linearly independent for $x$ close to $x_*$ by the assumptions. But then process (6.40) is just Newton's method for solving the system of equations

$$\bar f_i(x) = 0, \quad i \in \bar J_0(x_*) \tag{6.41}$$

which by theorem 6.1 and remark 1 on p. 215, converges quadratically in the neighbourhood of point $x_*$. Note that point $x_*$ satisfies (6.41), for $f_i(x_*) = F(x_*)$, $i \in J_0(x_*)$ and therefore

$$\bar f_i(x_*) = f_i(x_*) - f_{i_0}(x_*) = 0, \quad i \in \bar J_0(x_*).$$

7. LOCAL ACCELERATION OF CONVERGENCE

As was shown in Sec. 5, the linearization method, speaking generally, converges at the rate of a geometric progression. In a number of problems this can prove insufficient and the problem arises of how to accelerate the convergence of the process.


In this section we shall describe methods that make this possible provided an approximation sufficiently close to the solution has been found. The last circumstance is a shortcoming of the process; there are, however, no methods at present that make it possible to construct a process with an asymptotically superlinear rate of convergence whatever the initial approximation, this having been achieved only in the problem of unconstrained minimization.

The methods described below are based on the following idea. The minimization problem is reduced to a certain system of nonlinear equations and then Newton's method or its modification is applied to the solving of this system. At the end of this section we shall describe a method that uses this idea directly, i.e. the necessary conditions for a minimum will be established and Newton's method applied to the solving of the equations obtained. Such a method has many shortcomings of which the principal one is the necessity of calculating second derivatives of the original functions. Therefore, this method can be applied only to problems in which such derivatives are easily calculated.

A second method is based on the fact that point $x_*$ is the solution of minimization problem (5.1) only if it satisfies the equation $p(x_*) = 0$, where vector $p(x)$ is the solution of the subsidiary problem (5.4). We shall describe a method that makes it possible to solve a system of nonlinear equations without calculating derivatives. As was mentioned above, this method will converge only from a sufficiently good initial approximation.

Formulation of the Problem. Basic Formulas

It is required to solve a system of equations

$$p(x) = 0 \tag{7.1}$$

where $p(x)$ is a vector with components $p^i(x)$, $i = 1, \dots, n$, $x \in E^n$. Note that $p(x)$ is an arbitrary vector-function that is not as yet connected with the problem of mathematical programming.

Let $x_*$ be the solution of system (7.1). We shall always assume further on that $p(x)$ is a vector-function which is differentiable in the neighbourhood of point $x_*$ and that the matrix of derivatives

$$p'(x) = \Big(\frac{\partial p^i(x)}{\partial x^j}\Big)_{i = 1, \dots, n;\ j = 1, \dots, n}$$

satisfies Lipschitz' condition, i.e.

$$\|p'(x) - p'(y)\| \le L \|x - y\|$$

where all the norms are Euclidean.


Further without loss of generality, we can take that $x_* = 0$. We denote

$$p'(0) = A, \quad \omega(x) = p(x) - Ax,$$
$$\omega(x, y) = \frac{1}{\|x - y\|}\,\big[p(x) - p(y) - A(x - y)\big]. \tag{7.2}$$

Suppose that matrix $A$ is nonsingular so that the following estimates hold:

$$m \|x\| \le \|Ax\| \le M \|x\| \tag{7.3}$$

where $M \ge m > 0$.
Lemma 7.1. The estimates

$$\|\omega(x)\| \le C_1 \|x\|^2, \quad \|\omega(x, y)\| \le C_2 \max\{\|x\|, \|y\|\}$$

hold.

Proof. Let $(p^i)'(x)$ be the gradient of function $p^i(x)$. Then by Taylor's formula we have

$$p^i(x) = p^i(0) + ((p^i)'(0), x) + ((p^i)'(z) - (p^i)'(0), x)$$

where $z = \theta x$, $0 \le \theta \le 1$. Using the fact that $p^i(0) = 0$ and Lipschitz' condition for $(p^i)'(x)$, we obtain that

$$|p^i(x) - ((p^i)'(0), x)| \le L \|x\|^2, \quad i = 1, \dots, n.$$

Hence, we have

$$\|\omega(x)\| = \|p(x) - p'(0)x\| \le C_1 \|x\|^2.$$

Further

$$p^i(y) = p^i(x) + ((p^i)'(x), y - x) + ((p^i)'(z) - (p^i)'(x), y - x)$$

where $z = \theta x + (1 - \theta) y$, $0 \le \theta \le 1$. Therefore

$$p^i(y) - p^i(x) - ((p^i)'(0), y - x) = ((p^i)'(x) - (p^i)'(0), y - x) + ((p^i)'(z) - (p^i)'(x), y - x).$$

Hence after simple transformations (using Lipschitz' condition), we obtain

$$|p^i(y) - p^i(x) - ((p^i)'(0), y - x)|$$
$$\le L \|x\|\,\|y - x\| + L \|z - x\|\,\|y - x\|$$
$$= L \|y - x\| \big(\|x\| + (1 - \theta)\|y - x\|\big)$$
$$\le L \|y - x\| \big((2 - \theta)\|x\| + (1 - \theta)\|y\|\big)$$
$$\le 3L \|y - x\| \max\{\|x\|, \|y\|\}.$$

The second statement of the lemma follows immediately from the last inequality.
Let now points $x_1, x_2, \dots, x_n$ be already constructed, $p(x_k) \ne 0$, $k = 1, \dots, n$, $e_k$ be unit vectors in the direction of the $k$-th coordinate axis.


We introduce the notations:

$$y_k = x_k + \|p(x_k)\|\,e_k,$$
$$r_k = y_k - x_k = \|p(x_k)\|\,e_k,$$
$$z_k = p(y_k) - p(x_k), \quad k = 1, \dots, n.$$

We shall introduce a measure of the linear independence of an arbitrary set of vectors $b_k$, $k = 1, \dots, n$. We take

$$\Delta(b_1, \dots, b_n) = \min_{\sum_i |\alpha_i| = 1} \Big\| \sum_{i=1}^{n} \alpha_i \frac{b_i}{\|b_i\|} \Big\|.$$

It is easily seen that $\Delta(b_1, \dots, b_n) > 0$ if and only if vectors $b_1, \dots, b_n$ are linearly independent. Note also that

$$\Delta(b_1, \dots, b_n) \le \frac{1}{\sqrt{n}}.$$
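As a check on this definition, the bound $1/\sqrt{n}$ is attained by an orthonormal set. The short derivation below is a worked example added for illustration; it takes $\Delta$ to be the minimum of $\|\sum_i \alpha_i b_i/\|b_i\|\|$ over $\sum_i |\alpha_i| = 1$ and uses only the Cauchy-Schwarz inequality.

```latex
\begin{aligned}
\Delta(e_1,\dots,e_n)
  &= \min_{\sum_i |\alpha_i| = 1} \Big\| \sum_{i=1}^{n} \alpha_i e_i \Big\|
   = \min_{\sum_i |\alpha_i| = 1} \Big( \sum_{i=1}^{n} \alpha_i^2 \Big)^{1/2},\\[4pt]
1 &= \sum_{i=1}^{n} |\alpha_i| \le \sqrt{n}\,\Big( \sum_{i=1}^{n} \alpha_i^2 \Big)^{1/2}
  \quad\Longrightarrow\quad
  \Delta(e_1,\dots,e_n) = \frac{1}{\sqrt{n}}
  \quad\text{(attained at } |\alpha_i| = 1/n \text{)} .
\end{aligned}
```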
Lemma 7.2. There is a neighbourhood of point $x_* = 0$ such that

$$\Delta(z_1, \dots, z_n) \ge \gamma > 0,$$

provided $x_1, \dots, x_n$ are in this region.

Proof. By the definition of $\omega(x, y)$, we have

$$z_k = A r_k + \omega(y_k, x_k)\,\|r_k\| = \|p(x_k)\| \big(A e_k + \omega(y_k, x_k)\big).$$

Therefore

$$\frac{z_k}{\|z_k\|} = \frac{A e_k + \omega(y_k, x_k)}{\|A e_k + \omega(y_k, x_k)\|}.$$

If $x_k \to 0$, then

$$\frac{z_k}{\|z_k\|} \to \frac{A e_k}{\|A e_k\|}.$$

However, it is easy to see that $\Delta(z_1, \dots, z_n)$ depends continuously on $z_k \|z_k\|^{-1}$. Therefore for $x_k$ sufficiently close to zero we have

$$\Delta(z_1, \dots, z_n) \ge \tfrac{1}{2}\,\Delta(A e_1, \dots, A e_n).$$

But $\Delta(A e_1, \dots, A e_n) > 0$, for vectors $A e_k$, $k = 1, \dots, n$ are simply columns of matrix $A$, and since matrix $A$ is nonsingular, its columns are linearly independent. Thus

$$\Delta(z_1, \dots, z_n) \ge \tfrac{1}{2}\,\Delta(A e_1, \dots, A e_n) = \gamma > 0$$

for all $x_k$ from the neighbourhood of $x_* = 0$. Q.E.D.

Let $\delta > 0$ be the radius of the region about zero in which lemmas 7.1 and 7.2 hold. Let points $x_1, \dots, x_n$ be chosen in this region.


Let us find quantities $\beta_i$, $i = 1, \dots, n$ from the system of equations

$$-p(x_n) = \sum_{k=1}^{n} \beta_k z_k. \tag{7.4}$$

By lemma 7.2, this system is solvable. We take

$$x_{n+1} = x_n + \sum_{k=1}^{n} \beta_k r_k. \tag{7.5}$$

Let us estimate the norm of $x_{n+1}$. Since

$$p(x_n) = A x_n + \omega(x_n),$$
$$z_k = A r_k + \omega(y_k, x_k)\,\|r_k\|, \tag{7.6}$$

we obtain from (7.4) that

$$-A x_n - \omega(x_n) = \sum_{k=1}^{n} \beta_k A r_k + \sum_{k=1}^{n} \beta_k\,\omega(y_k, x_k)\,\|r_k\|$$

or

$$A x_{n+1} = -\omega(x_n) - \sum_{k=1}^{n} \beta_k\,\omega(y_k, x_k)\,\|r_k\|.$$

It follows from the last equality that

$$m \|x_{n+1}\| \le \|A x_{n+1}\| \le \|\omega(x_n)\| + \sum_{k=1}^{n} |\beta_k|\,\|r_k\|\,\|\omega(y_k, x_k)\|. \tag{7.7}$$

But by lemma 7.1

$$\|\omega(y_k, x_k)\| \le C_2 \max\{\|y_k\|, \|x_k\|\}$$
$$= C_2 \max\{\|x_k + \|p(x_k)\|\,e_k\|,\ \|x_k\|\}$$
$$\le C_2 \big(\|x_k\| + \|p(x_k)\|\big).$$

In the region under consideration

$$\|p(x_k)\| = \|A x_k + \omega(x_k)\| \le M \|x_k\| + C_1 \|x_k\|^2.$$

Therefore

$$\|\omega(y_k, x_k)\| \le C_2 (1 + M + C_1 \delta)\,\|x_k\| = C_3 \|x_k\|.$$

Using this inequality we can rewrite estimate (7.7) in the following form:

$$\|x_{n+1}\| \le \frac{1}{m} \Big[ C_1 \|x_n\|^2 + C_3 \max_{1 \le k \le n} \|x_k\| \sum_{k=1}^{n} |\beta_k|\,\|r_k\| \Big]. \tag{7.8}$$


Note now that

$$\Big\| \sum_{k=1}^{n} \beta_k z_k \Big\| = \Big( \sum_{i=1}^{n} |\beta_i|\,\|z_i\| \Big) \Big\| \sum_{k=1}^{n} \frac{\beta_k \|z_k\|}{\sum_{i=1}^{n} |\beta_i|\,\|z_i\|}\,\frac{z_k}{\|z_k\|} \Big\| \ge \Delta(z_1, \dots, z_n) \sum_{k=1}^{n} |\beta_k|\,\|z_k\|.$$

Taking into account (7.4), we obtain

$$\|p(x_n)\| \ge \Delta(z_1, \dots, z_n) \Big( \sum_{k=1}^{n} |\beta_k|\,\|z_k\| \Big). \tag{7.9}$$

Further

$$\|z_k\| = \big\| A r_k + \omega(y_k, x_k)\,\|r_k\| \big\|$$
$$\ge \|A r_k\| - \|r_k\|\,\|\omega(y_k, x_k)\| \ge \|r_k\| \big(m - C_3 \|x_k\|\big)$$
$$\ge \|r_k\| \big(m - C_3 \max_{1 \le i \le n} \|x_i\|\big).$$

Therefore

$$\sum_{k=1}^{n} |\beta_k|\,\|z_k\| \ge \Big( \sum_{k=1}^{n} |\beta_k|\,\|r_k\| \Big) \big(m - C_3 \max_{1 \le k \le n} \|x_k\|\big).$$

If

$$\max_{1 \le k \le n} \|x_k\| < \frac{m}{C_3}, \tag{7.10}$$

then inequality (7.9) yields now

$$\sum_{k=1}^{n} |\beta_k|\,\|r_k\| \le \frac{\|p(x_n)\|}{\Delta(z_1, \dots, z_n) \big(m - C_3 \max_{1 \le k \le n} \|x_k\|\big)}. \tag{7.11}$$

Taking into account that

$$\|p(x_n)\| = \|A x_n + \omega(x_n)\| \le M \|x_n\| + C_1 \|x_n\|^2 \le C_4 \|x_n\|$$

and also (7.11) and lemma 7.2, we can finally rewrite (7.8) as follows:

$$\|x_{n+1}\| \le \frac{\|x_n\|}{m} \Big[ C_1 \|x_n\| + \frac{C_3 C_4 \max_{1 \le k \le n} \|x_k\|}{\gamma \big(m - C_3 \max_{1 \le k \le n} \|x_k\|\big)} \Big]. \tag{7.12}$$

We formulate now the result obtained in the form of a lemma.


Lemma 7.3. If points $x_k$, $k = 1, \dots, n$ are chosen in a region about point $x_* = 0$ such that the conditions of lemmas 7.1 and 7.2 and inequality (7.10) are satisfied, then estimate (7.12) holds true.

Algorithm

We formulate now the algorithm for solving the system of equations (7.1).

Choose initial points $x_1, x_2, \dots, x_n$. Let points $x_1, \dots, x_n, \dots, x_k$ be already constructed. Then point $x_{k+1}$ is constructed by the following formula:

$$x_{k+1} = x_k + \sum_{i=1}^{n} \beta_i r_{k-n+i} \tag{7.13}$$

where

$$r_j = \|p(x_j)\|\,e_{m(j)}, \quad y_j = x_j + r_j, \quad z_j = p(y_j) - p(x_j)$$

and the quantities $\beta_i$, $i = 1, \dots, n$ are determined from the system of equations

$$-p(x_k) = \sum_{i=1}^{n} \beta_i z_{k-n+i}. \tag{7.14}$$

Index $m(j)$ is calculated by the following rule: if $j = ln + p$, $1 \le p \le n - 1$, where $l$ is an integer, then $m(j) = p$; and if $j = ln$, then $m(j) = n$.

Thus vectors $r_1, r_2, \dots, r_k$ are proportional to unit vectors of the coordinate axes which are taken in cyclic order.

It can be seen from the above formulas that the scheme of the algorithm is simple enough. At every step it comprises the calculation of $p(x)$ at points $x_k$ and $y_k$ and the solving of the system of equations (7.14).
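The scheme (7.13), (7.14) can be sketched as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the names `solve_linear`, `probe` and `solve_system`, the stopping tolerance, and the test system are all hypothetical, the linear system is solved from scratch at every step by plain Gaussian elimination, and the starting points are assumed to be close enough to a root and to have $p(x_j) \ne 0$.

```python
import math

def solve_linear(cols, b):
    """Solve sum_i beta[i]*cols[i] = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # augmented matrix: row r holds the r-th components of all columns, then b[r]
    m = [[cols[c][r] for c in range(n)] + [b[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            t = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= t * m[c][k]
    beta = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][k] * beta[k] for k in range(r + 1, n))
        beta[r] = s / m[r][r]
    return beta

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def probe(p, x, j):
    """Return (r_j, z_j) for the point x = x_j, as defined after (7.13)."""
    n = len(x)
    px = p(x)
    axis = (j - 1) % n                  # zero-based form of the cyclic index m(j)
    r = [0.0] * n
    r[axis] = norm(px)
    y = [xi + ri for xi, ri in zip(x, r)]
    z = [a - b for a, b in zip(p(y), px)]
    return r, z

def solve_system(p, starts, tol=1e-10, max_iter=50):
    n = len(starts)
    hist = [probe(p, x, j) for j, x in enumerate(starts, start=1)]
    x, k = list(starts[-1]), n
    while norm(p(x)) > tol and k < max_iter:
        beta = solve_linear([z for _, z in hist], [-c for c in p(x)])   # (7.14)
        for b, (r, _) in zip(beta, hist):                               # (7.13)
            x = [xi + b * ri for xi, ri in zip(x, r)]
        k += 1
        hist = hist[1:] + [probe(p, x, k)]   # keep the last n pairs (r_j, z_j)
    return x

# Hypothetical test system with a root at (3, 0) and a nonsingular Jacobian there.
def p(x):
    return [x[0] + x[1] - 3.0, x[0] ** 2 + x[1] ** 2 - 9.0]

root = solve_system(p, [[2.8, 0.2], [3.1, -0.1]])
```

Each step costs two evaluations of $p$ (at $x_k$ and $y_k$) and one linear solve, and no derivatives of $p$ are computed.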
Theorem 7.1. Let $\delta_0 > 0$ be such that for all $x$ satisfying the inequality $\|x\| \le \delta_0$ the conditions of lemmas 7.1 and 7.2 are fulfilled and moreover, the following two inequalities:

$$m - C_3 \|x\| \ge \frac{m}{2}, \tag{7.15}$$

$$\frac{\delta_0}{m} \Big[ C_1 + \frac{2 C_3 C_4}{m \gamma} \Big] < 1. \tag{7.16}$$

Let $x_1, \dots, x_n$ be chosen such that $\|x_k\| \le \delta_0$, $k = 1, \dots, n$. Then the algorithm described above converges to the solution $x_*$ of equations (7.1) at a superlinear rate.
Proof. First, we show that $\|x_k\| \le \delta_0$ for all points $x_k$ constructed by the algorithm. Indeed, if $x_1, \dots, x_k$ are in the $\delta_0$-region about point $x_*$, then the conditions of lemma 7.3 are satisfied for points $x_{k-n+1}, x_{k-n+2}, \dots, x_k$ and therefore the following inequality analogous to inequality (7.12) holds:

$$\|x_{k+1}\| \le \frac{\|x_k\|}{m} \Big[ C_1 \|x_k\| + \frac{C_3 C_4 \max_{1 \le i \le n} \|x_{k-n+i}\|}{\gamma \big(m - C_3 \max_{1 \le i \le n} \|x_{k-n+i}\|\big)} \Big].$$

Hence it follows, taking into account (7.15), that

$$\|x_{k+1}\| \le \|x_k\| \max_{1 \le i \le n} \|x_{k-n+i}\|\; C_5 \tag{7.17}$$

where

$$C_5 = \frac{1}{m} \Big[ C_1 + \frac{2 C_3 C_4}{\gamma m} \Big].$$

But $\|x_{k-n+i}\| \le \delta_0$ by assumption. Therefore

$$\|x_{k+1}\| \le \|x_k\|\,\delta_0 C_5 < \|x_k\| \le \delta_0.$$

Q.E.D. Moreover, it follows from the last inequality, with the notation $q_0 = \delta_0 C_5$, that

$$\|x_{k+1}\| \le q_0 \|x_k\|. \tag{7.18}$$

Since by (7.16) $q_0 < 1$, we have the estimate $\|x_k\| \le q_0^{k-n} \|x_n\|$, i.e. $x_k \to 0$, and this proves the first statement of the theorem. Further, from (7.17) we obtain that

$$\frac{\|x_{k+1}\|}{\|x_k\|} \le C_5 \max_{1 \le i \le n} \|x_{k-n+i}\|. \tag{7.19}$$

Since $x_k \to 0$, the estimate means that

$$\frac{\|x_{k+1}\|}{\|x_k\|} \to 0.$$

The last relation shows that $x_k \to 0$ at a faster rate than that of any geometric progression. The theorem is proved.
We shall give now more precise bounds on the rate of convergence. We set $v_k = C_5 \|x_k\|$. Then (7.17) can be written in the form

$$v_{k+1} \le v_k \max_{1 \le i \le n} v_{k-n+i}. \tag{7.20}$$

Let us take now $\tilde v_j = \max_{1 \le i \le n} v_i$, $j = 1, \dots, n$ and determine $\tilde v_k$, $k > n$ by the following recursive formula:

$$\tilde v_{k+1} = \tilde v_k \max_{1 \le i \le n} \tilde v_{k-n+i}. \tag{7.21}$$

It is now easy to see that $v_k \le \tilde v_k$ for all $k$. Further, since

$$v_i = C_5 \|x_i\| \le C_5 \delta_0 = q_0, \quad i = 1, \dots, n,$$

we have $\tilde v_i \le q_0 < 1$, $i = 1, \dots, n$, and therefore sequence $\{\tilde v_k\}$ decreases monotonically. This fact can be proved in an elementary way by induction on $k$. It follows that $\max_{1 \le i \le n} \tilde v_{k-n+i} = \tilde v_{k-n+1}$ and (7.21) can be rewritten in the form

$$\tilde v_{k+1} = \tilde v_k\,\tilde v_{k-n+1}. \tag{7.22}$$

We denote $w_k = \ln \tilde v_k$. Then

$$w_{k+1} = w_k + w_{k-n+1}, \quad k \ge n,$$
$$w_k = \ln \tilde v_k, \quad k = 1, \dots, n. \tag{7.23}$$

It follows from Ostrowski's results (his theorems 12.1 and 12.2) that

$$\lim_{k \to \infty} \frac{w_{k+1}}{w_k} = \lambda_0 \tag{7.24}$$

where $\lambda_0$ is the greatest positive root of the equation

$$\varphi(\lambda) = \lambda^n - \lambda^{n-1} - 1 = 0. \tag{7.25}$$

As $\varphi(1) = -1 < 0$, and for great $\lambda$, $\varphi(\lambda) > 0$, we have $\lambda_0 > 1$.

It follows from (7.24) that for any $\varepsilon > 0$, $\lambda_0 - \varepsilon > 1$ there is a number $k(\varepsilon)$ such that $\dfrac{w_{k+1}}{w_k} \ge \lambda_0 - \varepsilon$ or $\dfrac{\ln \tilde v_{k+1}}{\ln \tilde v_k} \ge \lambda_0 - \varepsilon$. As $\ln \tilde v_k < 0$ ($\tilde v_k \le q_0 < 1$), we have for $k \ge k(\varepsilon)$

$$\ln \tilde v_{k+1} \le (\lambda_0 - \varepsilon) \ln \tilde v_k = \ln \tilde v_k^{\,(\lambda_0 - \varepsilon)}$$

or $\tilde v_{k+1} \le \tilde v_k^{\,\lambda_0 - \varepsilon}$, $k \ge k(\varepsilon)$. It follows from the last formula that $\tilde v_k \le \tilde v_{k(\varepsilon)}^{\,(\lambda_0 - \varepsilon)^{k - k(\varepsilon)}}$. But the sequence $\{\tilde v_k\}$ decreases monotonically and $\tilde v_n \le q_0 < 1$. Therefore

$$\tilde v_k \le q_0^{\,(\lambda_0 - \varepsilon)^{k - k(\varepsilon)}}, \quad k \ge k(\varepsilon). \tag{7.26}$$

Theorem 7.2. If the conditions of theorem 7.1 are fulfilled, then for every $\varepsilon > 0$, $\lambda_0 - \varepsilon > 1$, where $\lambda_0$ is the greatest root of the equation $\lambda^n - \lambda^{n-1} - 1 = 0$, there is a number $k(\varepsilon)$ such that for all $k \ge k(\varepsilon)$ the inequality
$$\|x_k\| \le \frac{1}{C_5}\, q_0^{(\lambda_0 - \varepsilon)^{k - k(\varepsilon)}}, \quad q_0 < 1, \tag{7.27}$$
holds.
Proof. Recall that $v_k = C_5 \|x_k\|$, $v_k \le \tilde v_k$. These inequalities and (7.26) yield immediately the result required.
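The order of convergence in theorem 7.2 is governed by the root of (7.25), so a small numerical check may help to make it concrete. The following Python sketch is an illustration only and is not part of the original text: the function names `lambda0` and `log_ratio`, the bisection bracket and the starting value `q0 = 0.5` are assumptions chosen for the example. It computes the greatest positive root of $\lambda^n - \lambda^{n-1} - 1 = 0$ and compares it with the limiting ratio $w_{k+1}/w_k$ produced by the recursion (7.23).

```python
import math


def lambda0(n, tol=1e-14):
    """Greatest positive root of phi(lam) = lam**n - lam**(n-1) - 1, found by bisection.

    phi(1) = -1 < 0 and phi(2) = 2**(n-1) - 1 >= 0, and phi is increasing
    for lam >= 1, so the root lies in (1, 2] and is unique there.
    """
    def phi(lam):
        return lam ** n - lam ** (n - 1) - 1.0

    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_ratio(n, q0=0.5, steps=60):
    """Ratio w_{k+1} / w_k after `steps` applications of w_{k+1} = w_k + w_{k-n+1}.

    The starting values w_1 = ... = w_n = ln(q0) are an assumption of this sketch.
    """
    w = [math.log(q0)] * n
    for _ in range(steps):
        w.append(w[-1] + w[-n])  # w[-n] is w_{k-n+1} when w[-1] is w_k
    return w[-1] / w[-2]


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, lambda0(n), log_ratio(n))
```

For $n = 1$ the root is 2 (the quadratic rate of Newton's method), and for $n = 2$ it is the golden ratio, the familiar order of the secant method; the root decreases toward 1 as $n$ grows.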


Computational Aspects.
Application to the Problem
of Mathematical Programming
The algorithm described in the preceding subsection is simple enough. It requires at every step the calculation of vector $p(x)$ at points $x_k$ and $y_k$ and the solving of the system of linear equations (7.14). If we denote by $Z_k$ the matrix whose columns are $z_{k-n+i}$, $i = 1, \dots, n$, then equations (7.14) can be rewritten in the form $Z_k \beta_k = -p(x_k)$, where $\beta_k$ is a column-vector whose components are $\beta_k^i$, $i = 1, \dots, n$.
It follows from the algorithm that matrices $Z_k$ and $Z_{k+1}$ differ only by one column: column $z_{k+1}$ is substituted for column $z_k$ and $z_{k-n+i+1}$ for $z_{k-n+i}$, $i \le n - 1$. Therefore to calculate $\beta_{k+1}$, one can proceed as in the subsections on pp. 76 and 79. Note that the procedure leads to the accumulation of calculation errors. Therefore, if the calculation of $p(y_k)$ requires considerably more operations than the solving of system (7.14), the standard program for solving a system of linear equations should be used for calculating $\beta_k$ rather than using recursive formulas.
We now turn again to the problem (5.1)-(5.2) discussed in Sec. 5. According to lemma 5.1, in order to find a local minimum it suffices to solve the equation $p(x) = 0$, where $p(x)$ is the solution of problem (5.4). If the assumptions of theorem 5.4 hold, then by lemmas 5.2, 5.4 in a sufficiently small region about the solution $x_*$, the conditions of theorem 7.1 are also satisfied. Therefore, the application of the algorithm described in this section makes it possible to accelerate the convergence of the linearization method. In applying this method one should use as $p(x)$ the vector that is the solution of problem (5.4).

Minimization Problem
with Equality Constraints
Let us consider the problem of minimization of function $f_0(x)$ with constraints
$$f_i(x) = 0, \quad i = 1, \dots, m. \tag{7.28}$$
Let $x_*$ be the solution of the problem and the following assumptions hold.
(a) Functions $f_i(x)$ are twice continuously differentiable and their second derivatives satisfy Lipschitz' condition.
(b) At point $x_*$ the gradients $f_i'(x_*)$, $i = 1, \dots, m$, are linearly independent so that the necessary conditions for a minimum at $x_*$ are satisfied in their regular form (see Chap. I, Sec. 4). Thus, there are Lagrange multipliers $u_*^i$, $i = 1, \dots, m$, such that
$$f_0'(x_*) + \sum_{i=1}^{m} u_*^i f_i'(x_*) = 0,$$
$$f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{7.29}$$
(c) The sufficient conditions for a local minimum, i.e.
$$(y, L''(x_*, u_*)\, y) > 0,$$
hold if $y \ne 0$ and $(f_i'(x_*), y) = 0$, $i = 1, \dots, m$. Here $L(x, u) = f_0(x) + \sum_{i=1}^{m} u^i f_i(x)$ and $L''(x, u)$ is the matrix of second derivatives of $L(x, u)$ with respect to $x$.
Theorem 7.3. Let the above conditions (a), (b), (c) be fulfilled. Then sequences $\{x_k\}$, $\{u_k^i\}$, $i = 1, \dots, m$, $k = 0, 1, \dots$, calculated by the following recursive formulas
$$L''(x_k, u_k)\, p_k + \sum_{i=1}^{m} \Delta u_k^i f_i'(x_k) + L'(x_k, u_k) = 0,$$
$$(f_i'(x_k), p_k) + f_i(x_k) = 0, \quad i = 1, \dots, m, \tag{7.30}$$
$$x_{k+1} = x_k + p_k,$$
$$u_{k+1}^i = u_k^i + \Delta u_k^i, \quad i = 1, \dots, m, \tag{7.31}$$
converge to $x_*$ and $u_*^i$ respectively at a quadratic rate, whatever the initial approximation $x_0$, $u_0^i$, $i = 1, \dots, m$, sufficiently close to the solution $x_*$, $u_*^i$, $i = 1, \dots, m$.
Proof. The process defined by formulas (7.30), (7.31) is simply the one generated by Newton's method when it is applied to system (7.29). Therefore, in order to prove the theorem it suffices to check, using remark 1 on p. 215, that the matrix of the first derivatives of the left-hand sides of (7.29) with respect to all $x$ and $u^i$ is nonsingular.
If we denote by $f'(x)$ a matrix whose rows are $f_i'(x)$, $i = 1, \dots, m$, then it is easy to see that the matrix of the first derivatives of the left-hand sides of (7.29) has the form of the following block matrix of order $n + m$:
$$\begin{pmatrix} L''(x_*, u_*) & f'^{*}(x_*) \\ f'(x_*) & 0 \end{pmatrix}.$$
In order to ascertain that this matrix is nonsingular, it is sufficient to show that the homogeneous system of equations
$$L''(x_*, u_*)\, \bar y + f'^{*}(x_*)\, \bar u = 0,$$
$$f'(x_*)\, \bar y = 0 \tag{7.32}$$
has only a zero solution. In this system, $\bar y \in E^n$, $\bar u$ is a vector whose components are $\bar u^i$, $i = 1, \dots, m$. Let $\bar y$, $\bar u$ be the solution of system (7.32). Performing scalar multiplication of the first of equations (7.32) by $\bar y$, we obtain on the basis of the second equation that
$$(\bar y, L''(x_*, u_*)\, \bar y) + (\bar y, f'^{*}(x_*)\, \bar u) = (\bar y, L''(x_*, u_*)\, \bar y) + (f'(x_*)\, \bar y, \bar u) = (\bar y, L''(x_*, u_*)\, \bar y) = 0.$$
But by assumption (c), the last expression shows that $\bar y = 0$. Therefore, the first of relations (7.32) can be rewritten as follows:
$$f'^{*}(x_*)\, \bar u = \sum_{i=1}^{m} \bar u^i f_i'(x_*) = 0,$$
and this is possible only if $\bar u^i = 0$, $i = 1, \dots, m$, since vectors $f_i'(x_*)$ are linearly independent by assumption (b).
Thus we have shown that the conditions of convergence of Newton's method are fulfilled and consequently the theorem is proved.
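The iteration (7.30)-(7.31) can be tried on a small problem. The sketch below is an illustration under stated assumptions, not the authors' program: the test problem (minimize $x + y$ subject to $x^2 + y^2 - 2 = 0$, with solution $x_* = (-1, -1)$, $u_* = 1/2$), the starting point, and the helper names `solve` and `newton_lagrange` are all chosen for this example. Each step solves the linear system (7.30) for $p_k$ and $\Delta u_k$ and then applies (7.31).

```python
def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def newton_lagrange(x, y, u, iters=8):
    """Steps (7.30)-(7.31) for: minimize x + y subject to x**2 + y**2 - 2 = 0.

    Returns the final iterate and the history of the largest deviation
    from the known solution x = y = -1, u = 1/2.
    """
    history = []
    for _ in range(iters):
        grad_l = [1.0 + 2.0 * u * x, 1.0 + 2.0 * u * y]  # L'(x_k, u_k)
        f1 = x * x + y * y - 2.0                          # f_1(x_k)
        df1 = [2.0 * x, 2.0 * y]                          # f_1'(x_k)
        # Block matrix [[L'', f'^T], [f', 0]]; here L'' = 2 u I.
        kkt = [[2.0 * u, 0.0, df1[0]],
               [0.0, 2.0 * u, df1[1]],
               [df1[0], df1[1], 0.0]]
        px, py, du = solve(kkt, [-grad_l[0], -grad_l[1], -f1])
        x, y, u = x + px, y + py, u + du                  # (7.31)
        history.append(max(abs(x + 1.0), abs(y + 1.0), abs(u - 0.5)))
    return x, y, u, history


if __name__ == "__main__":
    *_, errors = newton_lagrange(-1.2, -0.8, 0.4)
    for k, e in enumerate(errors, start=1):
        print(k, e)
```

With a starting point this close to the solution, the printed errors should roughly square from one step to the next until rounding error dominates, which is the behaviour the theorem describes.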

8. METHOD OF PENALTY FUNCTIONS
The method of penalty functions is one of the simplest and most widely known of the methods for solving the problem of mathematical programming. The basic idea of the method consists in approximately reducing the constrained minimization problem to the unconstrained minimization of a certain function. The subsidiary function is chosen so that it coincides with the function to be minimized in the admissible domain and increases steeply outside it.
Suppose we study the problem of minimization of function $f_0(x)$, $x \in E^n$, with constraints
$$f_i(x) \le 0, \quad i = 1, \dots, m. \tag{8.1}$$
All the functions $f_i(x)$, $i = 0, 1, \dots, m$, are continuous.
We introduce the notations
$$\varphi_0(t) = \begin{cases} t^2, & t \ge 0, \\ 0, & t < 0; \end{cases} \qquad \varphi_1(t) = \begin{cases} t, & t \ge 0, \\ 0, & t < 0. \end{cases} \tag{8.2}$$
Let us compose a function
$$\psi(x, r) = r \sum_{i=1}^{m} \varphi_0(f_i(x)). \tag{8.3}$$
It is easy to see that
$$\psi(x, r) = 0, \quad x \in \Omega,$$
where
$$\Omega = \{x : f_i(x) \le 0, \ i = 1, \dots, m\}.$$

If $x \notin \Omega$, then $\psi(x, r) > 0$ and $\psi(x, r) \to +\infty$ as $r \to +\infty$. The subsidiary problem is now the minimization of the function
$$F(x, r) = f_0(x) + \psi(x, r). \tag{8.4}$$
It is natural to expect that the solution of this problem $x(r)$ will be close to the solution of the original one. The precise conditions under which this fact will be realized are formulated below.
Note that the choice of function $\psi(x, r)$ in the way it was done above is not the only one permissible. It is sufficient for the function to have certain general properties that ensure the convergence of the method. The method has different properties depending on the way of choosing functions $\psi(x, r)$. In particular, if we set
$$\psi(x, r) = r \max_{1 \le i \le m} \varphi_1(f_i(x)),$$
then the linearization method described in Sec. 5 can be taken as suitable for the minimization of function (8.4). In this case, as was shown in Sec. 5, it is not necessary that $r$ should tend to infinity. However, $F(x, r)$ will not be a smooth function.
In the general case, $F(x, r)$ is constructed so as to be smooth and make it possible to apply one of the methods of Chap. II, which converge at a fast rate. Unfortunately in this case, $r$ must tend to infinity and this fact involves a number of implicit difficulties which, in the opinion of the authors, considerably impair the value of the penalty function method. We shall discuss below these difficulties and give a cursory description of one more method whose idea is close to that of the penalty function method.

Substantiation of the Penalty Function Method
Let a certain continuous function $\psi(x, r)$ have the following properties:
(1) $\psi(x, r) = 0$ if $x \in \Omega$, $\psi(x, r) > 0$, $x \notin \Omega$, and $\psi(x_k, r_k) \to +\infty$ if $x_k \to x_0$, $x_0 \notin \Omega$, $r_k \to +\infty$;
(2) $\psi(x, r)$ increases monotonically with increasing $r$.
Theorem 8.1. Let the set
$$\Omega_C(r) = \{x : F(x, r) \le C\},$$
$$F(x, r) = f_0(x) + \psi(x, r)$$
be compact. Then function $F(x, r)$ assumes its minimum $m(r)$ for all $x$ at a certain point $x(r)$ and $m(r) \le m$, where
$$m = \min_{x \in \Omega} f_0(x), \qquad m(r) \to m \ \text{as} \ r \to \infty,$$
and $m(r)$ increases monotonically with increasing $r$. Moreover, if $x(r_k) \to x_0$, $k \to \infty$, $r_k \to \infty$, then $x_0$ is the solution of the original problem.


Proof. Let $\bar x$ be a point of $\Omega$, $C = f_0(\bar x)$. Then set $\bar\Omega$ of those $x \in \Omega$ for which $f_0(x) \le C$ is a closed subset of compact set $\Omega_C(r)$. Indeed, for $x \in \bar\Omega$ by the properties of $\psi(x, r)$ we have
$$f_0(x) + \psi(x, r) = f_0(x) \le C,$$
i.e. $x \in \Omega_C(r)$. But it is clear that the minimum of $f_0(x)$ in $\Omega$ must be in subset $\bar\Omega$. Therefore, we should seek the minimum of continuous function $f_0(x)$ in compact set $\bar\Omega$. Since the continuous function attains its minimum in a compact set, it follows that problem (8.1) is solvable. An analogous reasoning shows that function $F(x, r)$ assumes its minimum $m(r)$ at a certain point $x(r)$.
Let $x_*$ be a certain point of the minimum of $f_0(x)$ in $\Omega$. Then
$$F(x_*, r) = f_0(x_*) + \psi(x_*, r) = f_0(x_*),$$
for $x_* \in \Omega$ and $\psi(x, r) = 0$ with $x \in \Omega$. Therefore
$$\min_x F(x, r) = m(r) \le m,$$
i.e. $x(r) \in \Omega_m(r)$.
Consider now the sets
$$\Omega_m(r) = \{x : f_0(x) + \psi(x, r) \le f_0(x_*)\}.$$
These sets are compact by assumption, and because of the increase of $\psi(x, r)$ with increasing $r$, we have
$$\Omega_m(r_2) \subset \Omega_m(r_1), \quad r_1 < r_2.$$
Let now $\{r_k\}$ be an increasing sequence of $r_k$ and $r_k \to +\infty$. Then
$$\Omega_m(r_k) \subset \Omega_m(r_1).$$
Since as was shown above $x(r) \in \Omega_m(r)$, all points $x(r_k)$ belong to compact set $\Omega_m(r_1)$. Therefore without loss of generality, we can take that sequence $\{x(r_k)\}$ converges to a certain point $x_0$.
Let us show that $x_0 \in \Omega$ and $f_0(x_0) = m$. Indeed, if $x_0 \notin \Omega$, then $\psi(x(r_k), r_k) \to +\infty$ and consequently $F(x(r_k), r_k) \to +\infty$, for
$$f_0(x(r_k)) \ge \min_{x \in \Omega_m(r_1)} f_0(x).$$
But this contradicts the fact that $F(x(r_k), r_k) = m(r_k) \le m$. Thus $x_0 \in \Omega$.
Further,
$$m(r_k) = F(x(r_k), r_k) = f_0(x(r_k)) + \psi(x(r_k), r_k) \le f_0(x_0) + \psi(x_0, r_k) = f_0(x_0).$$
Hence
$$\lim_{k \to \infty} m(r_k) = \lim_{k \to \infty} \bigl(f_0(x(r_k)) + \psi(x(r_k), r_k)\bigr) \le f_0(x_0). \tag{8.5}$$
But $f_0(x(r_k)) \to f_0(x_0)$. Therefore
$$\lim_{k \to \infty} \psi(x(r_k), r_k) \le f_0(x_0) - \lim_{k \to \infty} f_0(x(r_k)) = 0.$$
As $\psi(x, r) \ge 0$, it follows that
$$\lim_{k \to \infty} \psi(x(r_k), r_k) = 0.$$
Thus,
$$\lim_{k \to \infty} m(r_k) = \lim_{k \to \infty} f_0(x(r_k)) + \lim_{k \to \infty} \psi(x(r_k), r_k) = f_0(x_0) \ge m.$$
On the other hand, $m(r_k) \le m$. Therefore $\lim_{k \to \infty} m(r_k) = f_0(x_0) \le m$.
Comparing the last inequality with the preceding one, we see that $\lim_{k \to \infty} m(r_k) = f_0(x_0) = m$, and this completes the proof of the theorem.
The theorem proved shows that the substitution of the minimization of function $F(x, r)$ for the solving of problem (8.1) with great $r$ permits us to come nearer to the solution of the original problem. Let us estimate the character of this convergence for the nonconvex case. The problem of convex programming will be studied in the next subsection.
Theorem 8.2. Let functions $f_i(x)$, $i = 0, 1, \dots, m$, be continuously differentiable and the conditions of theorem 8.1 be satisfied. Moreover, we take that:
(1) problem (8.1) has a unique solution,
(2) function $\psi(x, r)$ is chosen in form (8.3) and the minimum of $F(x, r)$ is attained at a unique point $x(r)$ with great $r$,
(3) in the solution $x_*$ of problem (8.1) the gradients $f_i'(x_*)$, $i \in J_0(x_*)$, are linearly independent and
$$J_0(x_*) = \{i : f_i(x_*) = 0, \ i = 1, \dots, m\}.$$
Then $\lim_{r \to \infty} x(r) = x_*$ and
$$\lim_{r \to \infty} 2 r \varphi_1(f_i(x(r))) = u_*^i, \quad i = 1, \dots, m,$$
where $u_*^i$ are Lagrange multipliers of problem (8.1).
Remark. Recall that according to theorem 4.1 (Chap. I) the necessary conditions for a minimum are fulfilled at point $x_*$ in the following form:
$$f_0'(x_*) + \sum_{i=1}^{m} u_*^i f_i'(x_*) = 0,$$
$$u_*^i \ge 0, \quad u_*^i f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{8.6}$$
Proof. First, we shall show that $x(r) \to x_*$. Suppose that the opposite holds. Then there is a sequence $r_k \to \infty$ such that $\|x(r_k) - x_*\| \ge \delta_0 > 0$. Since in proving theorem 8.1 it was shown that $x(r_k) \in \Omega_m(r_1)$ and set $\Omega_m(r_1)$ is compact, we can take, without loss of generality (if required we can take a subsequence), that $x(r_k) \to x_{**}$ and clearly $\|x_{**} - x_*\| \ge \delta_0 > 0$. However, it follows from theorem 8.1 that $x_{**}$ is the solution of problem (8.1). Thus we have obtained two different solutions of problem (8.1). But this contradicts the assumption.
Thus, it has been shown that $x(r) \to x_*$. We turn to the proof of the second statement of the theorem. As $x(r)$ is the minimum point of function $F(x, r)$, at this point the gradient of the function
$$F(x, r) = f_0(x) + r \sum_{i=1}^{m} \varphi_0(f_i(x))$$
must be equal to zero. Simple calculations show that we come to the equality
$$F'(x(r), r) = f_0'(x(r)) + \sum_{i=1}^{m} \bigl(2 r \varphi_1(f_i(x(r)))\bigr) f_i'(x(r)) = 0$$
or, with the notation
$$u^i(r) = 2 r \varphi_1(f_i(x(r))),$$
to the equality
$$f_0'(x(r)) + \sum_{i=1}^{m} u^i(r) f_i'(x(r)) = 0. \tag{8.7}$$
Note now that as $x(r) \to x_*$ we have $f_i(x(r)) < 0$ for $i \notin J_0(x_*)$ with great $r$, since $f_i(x_*) < 0$, $i \notin J_0(x_*)$. Therefore with great $r$
$$u^i(r) = 2 r \varphi_1(f_i(x(r))) = 0, \quad i \notin J_0(x_*).$$
But $u_*^i f_i(x_*) = 0$ and therefore for $i \notin J_0(x_*)$ we have $u_*^i = 0$. Thus taking into account the expression for $u^i(r)$, the statement of the theorem is proved for $i \notin J_0(x_*)$.
Due to the foregoing, we can rewrite (8.7) and (8.6) as follows:
$$f_0'(x(r)) + \sum_{i \in J_0(x_*)} u^i(r) f_i'(x(r)) = 0,$$
$$f_0'(x_*) + \sum_{i \in J_0(x_*)} u_*^i f_i'(x_*) = 0. \tag{8.8}$$
If we take into account that $x(r) \to x_*$, $f_i'(x)$ are continuous for all $x$ and $f_i'(x_*)$, $i \in J_0(x_*)$, are linearly independent, then it is easy to conclude from (8.8) that $u^i(r) \to u_*^i$ and this, with account taken of the expression for $u^i(r)$, completes the proof of the theorem.
Remark. It follows from theorem 8.2 that if $u_*^i > 0$, then with great $r$ functions $f_i(x(r))$ are strictly greater than zero and $f_i(x(r))$ tends to zero at the same rate as quantity $u_*^i r^{-1}$ does. Thus the approximate solution will always violate the constraint $f_i(x) \le 0$ if $u_*^i > 0$.
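The remark can be observed on a one-dimensional example. The following Python sketch is only an illustration: the test problem (minimize $(x - 1)^2$ subject to $x \le 0$, for which $x_* = 0$ and $u_* = 2$), the ternary search used as the minimizer, and the names `argmin_convex` and `penalty_minimizer` are assumptions of this example and do not come from the book. It minimizes $F(x, r)$ of (8.4) with $\psi$ from (8.3) and reports $x(r)$ together with the multiplier estimate $2 r \varphi_1(f_1(x(r)))$ of theorem 8.2.

```python
def phi0(t):
    """phi_0 of (8.2): t**2 for t >= 0, zero otherwise."""
    return t * t if t >= 0.0 else 0.0


def phi1(t):
    """phi_1 of (8.2): t for t >= 0, zero otherwise."""
    return t if t >= 0.0 else 0.0


def f0(x):
    return (x - 1.0) ** 2  # objective of the illustrative problem


def f1(x):
    return x               # constraint f1(x) <= 0


def penalized(x, r):
    """F(x, r) = f0(x) + r * phi0(f1(x)), i.e. (8.4) with psi from (8.3)."""
    return f0(x) + r * phi0(f1(x))


def argmin_convex(fun, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if fun(a) <= fun(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def penalty_minimizer(r):
    """Return x(r) and the multiplier estimate 2 r phi1(f1(x(r)))."""
    x_r = argmin_convex(lambda x: penalized(x, r), -1.0, 2.0)
    return x_r, 2.0 * r * phi1(f1(x_r))


if __name__ == "__main__":
    for r in (1.0, 10.0, 100.0, 1.0e4):
        print(r, *penalty_minimizer(r))
```

For this problem the minimizer is available in closed form, $x(r) = 1/(1 + r)$, so every approximation lies outside the admissible set and the estimate $2r/(1 + r)$ approaches $u_* = 2$ as $r$ grows.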

Convex Programming
In the case of the problem of convex programming, the estimates of the approximation of $x(r)$ to the sought solution $x_*$ can be made more precise.
Theorem 8.3. Let all the functions $f_i(x)$, $i = 0, 1, \dots, m$, be convex, the conditions of theorem 8.1 for function $\psi(x, r)$ in form (8.3) be satisfied and, besides, the necessary conditions in the form of the Kuhn-Tucker theorem be fulfilled at point $x_*$ which is the solution of problem (8.1), i.e. there are numbers $u_*^i \ge 0$ such that
$$f_0(x_*) \le \sum_{i=1}^{m} u_*^i f_i(x) + f_0(x) \quad \text{for all } x,$$
$$u_*^i f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{8.9}$$
Then
$$f_i(x(r)) \le \frac{\bar u}{r}, \quad \text{if } f_i(x(r)) > 0, \tag{8.10}$$
$$f_0(x(r)) \ge f_0(x_*) - \frac{5}{4} \frac{\bar u^2}{r}, \tag{8.11}$$
where
$$\bar u = \sqrt{\sum_{i=1}^{m} (u_*^i)^2}.$$
Proof. We introduce the notation $J_0(x) = \{i : f_i(x) \ge 0, \ i = 1, \dots, m\}$. As
$$F(x(r), r) = f_0(x(r)) + r \sum_{i=1}^{m} \varphi_0(f_i(x(r))) \le f_0(x_*),$$
it follows from (8.9) that
$$f_0(x(r)) + r \sum_{i=1}^{m} \varphi_0(f_i(x(r))) \le f_0(x(r)) + \sum_{i=1}^{m} u_*^i f_i(x(r))$$
or
$$r \sum_{i=1}^{m} \varphi_0(f_i(x(r))) \le \sum_{i=1}^{m} u_*^i f_i(x(r)).$$
But for $i \notin J_0(x(r))$
$$\varphi_0(f_i(x(r))) = 0, \quad f_i(x(r)) < 0,$$
and $\varphi_0(f_i(x(r))) = f_i^2(x(r))$ for $i \in J_0(x(r))$. Therefore the inequality obtained may be made stronger:
$$r \sum_{i \in J_0(x(r))} f_i^2(x(r)) \le \sum_{i \in J_0(x(r))} u_*^i f_i(x(r)) \le \bar u \sqrt{\sum_{i \in J_0(x(r))} f_i^2(x(r))}.$$
In deriving the last inequality we used the well known Cauchy-Buniakowski inequality.
Thus
$$f_i(x(r)) \le \sqrt{\sum_{i \in J_0(x(r))} f_i^2(x(r))} \le \frac{\bar u}{r} \tag{8.12}$$
and it follows that (8.10) holds.
Further, for all $x$
$$f_0(x_*) \le f_0(x) + \sum_{i=1}^{m} u_*^i f_i(x) \le f_0(x) + \sum_{i \in J_0(x)} u_*^i f_i(x)$$
$$= f_0(x) + r \sum_{i \in J_0(x)} f_i^2(x) - r \sum_{i \in J_0(x)} \Bigl(f_i(x) - \frac{u_*^i}{2r}\Bigr)^2 + \sum_{i \in J_0(x)} \frac{(u_*^i)^2}{4r}$$
$$\le f_0(x) + r \sum_{i=1}^{m} \varphi_0(f_i(x)) + \frac{\bar u^2}{4r} = F(x, r) + \frac{\bar u^2}{4r}.$$
In writing this formula it was taken into account that
$$\sum_{i=1}^{m} \varphi_0(f_i(x)) = \sum_{i \in J_0(x)} f_i^2(x), \tag{8.13}$$
by the definition of $\varphi_0(t)$ and $J_0(x)$. Thus
$$f_0(x_*) \le f_0(x(r)) + r \sum_{i=1}^{m} \varphi_0(f_i(x(r))) + \frac{\bar u^2}{4r}.$$
But it follows from (8.12) and (8.13) that
$$r \sum_{i=1}^{m} \varphi_0(f_i(x(r))) \le \frac{\bar u^2}{r}.$$
Therefore
$$f_0(x(r)) \ge f_0(x_*) - \frac{5}{4} \frac{\bar u^2}{r}.$$
Q.E.D.

Computational Aspects
The methods expounded above reduce problem (8.1) to the minimization of function $F(x, r)$. It is possible now, in order to obtain an approximate solution, to use one of the methods described in Chap. II. The following specific circumstances should be, however, taken into account. If functions $f_i(x)$ are not convex, then function $F(x, r)$ is also not convex with respect to $x$. Therefore, it can have local minima while in all the preceding text it was assumed that we were determining the global minimum $x(r)$.
As all the methods of Chap. II are meant for finding a local minimum, if the function to be minimized is not convex, it is the local minimum that will be found if the initial approximation is poor. This affects the convergence and is an important shortcoming of the penalty function method in its application to nonconvex problems.
If the problem under consideration is one of convex programming with the use of function (8.3) as $\psi(x, r)$, then it is easily ascertained that $F(x, r)$ is convex too; therefore, the difficulty mentioned above is removed. However, another difficulty arises. The fact is that one should take $r$ sufficiently great in order to obtain a good approximation; this follows from the estimates obtained above. In this case all derivatives of $F(x, r)$ with respect to $x$ will also be great, for they are proportional to $r$. But it was established in analysing all methods described in Chap. II whose rate of convergence is superlinear that the size of the region in which the rate of convergence becomes superlinear is in inverse proportion to Lipschitz' constant of second derivatives, i.e. in the case under consideration this region will also be small and even a method which, at the limit, theoretically converges rapidly can become ineffective. Moreover, as function $\varphi_0(t)$ with $t = 0$ has no second derivative, $F(x, r)$ calculated with the use of $\psi(x, r)$ given by formula (8.3) will also have no second derivatives at points $x$ for which $f_i(x) = 0$ for a certain $i$. But if the solution $x_*$ lies on the boundary of the domain, it is this case that will take place. On the other hand, all methods which converge at a fast rate require that the function being minimized have second derivatives at least in a certain region about the point sought.
All of the difficulties mentioned are, as a rule, observed in calculations in practice and this lowers the effectiveness of the method.
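Both difficulties, derivatives growing in proportion to $r$ and the missing second derivative on the boundary, can be seen numerically. The Python sketch below is an illustration only, under assumptions made for this example: the one-dimensional problem (minimize $(x - 1)^2$ subject to $x \le 0$), the step size, and the helper name `second_difference` are hypothetical. It estimates the second derivative of $F(x, r)$ by a central difference just inside and just outside the admissible set.

```python
def penalized(x, r):
    """F(x, r) = (x - 1)**2 + r * phi_0(x) for the constraint x <= 0."""
    penalty = x * x if x >= 0.0 else 0.0
    return (x - 1.0) ** 2 + r * penalty


def second_difference(fun, x, h=1e-4):
    """Central-difference estimate of the second derivative of fun at x."""
    return (fun(x + h) - 2.0 * fun(x) + fun(x - h)) / (h * h)


if __name__ == "__main__":
    for r in (1.0e1, 1.0e3, 1.0e5):
        inside = second_difference(lambda x: penalized(x, r), -0.01)
        outside = second_difference(lambda x: penalized(x, r), 0.01)
        print(r, inside, outside)
```

In this example the estimate stays near 2 inside the admissible set and is near $2 + 2r$ outside it, so the second derivative jumps at the boundary point $x = 0$ and the jump grows with $r$.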

Fiacco and McCormick Method


This method is practically applicable only to problems of convex programming. It is based on an idea close to that of the penalty function method; however, in this method the approximations approach the solution from inside the domain and not from outside as in the penalty function method.
Let us again consider problem (8.1). Suppose that all the functions $f_i(x)$ are convex and there is a point $x$ such that $f_i(x) < 0$, $i = 1, \dots, m$, so that the interior of the admissible set $\Omega$ is nonempty. We compose the function
$$P(x, r) = f_0(x) - r \sum_{i=1}^{m} \frac{1}{f_i(x)}, \quad r > 0,$$
defined inside set $\Omega$. It is easily ascertained that $P(x, r)$ is convex with respect to $x$ inside $\Omega$. If we denote by $x(r)$ the minimum point of $P(x, r)$ in $\Omega$, then with sufficiently general assumptions analogous to those of theorems 8.1 and 8.2 it can be shown that
$$\lim_{r \to +0} x(r) = x_*,$$
$$\lim_{r \to +0} \frac{r}{f_i^2(x(r))} = u_*^i, \quad i = 1, \dots, m.$$
Thus, the approximate method of solving problem (8.1) again has been reduced to the problem of unconstrained minimization of function $P(x, r)$.
Of the specific traits of this subsidiary problem the same things can be said that were said of the penalty function method in the preceding subsection on computational aspects (p. 242). In order to illustrate these traits and to show why even effective methods of minimization of $F(x, r)$ or $P(x, r)$ may fail to provide for a fast rate of convergence, we shall adduce a simple example.
Let $f_0(x) = -x$, $f_1(x) = x$, $x \in E^1$, i.e. we are solving the problem of minimization of $-x$ subject to $x \le 0$. The obvious solution is $x_* = 0$:
$$P(x, r) = -x - \frac{r}{x}.$$


Equating to zero the derivative of $P(x, r)$ with respect to $x$ we obtain
$$P'(x, r) = -1 + \frac{r}{x^2} = 0. \tag{8.14}$$
Hence $x(r) = -\sqrt{r}$. Let us now apply to the solving of (8.14) a method that converges at a quadratic rate, Newton's method, i.e. obtain approximations by the formula
$$x_{k+1} = x_k - \frac{P'(x_k, r)}{P''(x_k, r)}.$$
Substituting expressions for $P'(x, r)$ and $P''(x, r)$ we obtain after simple transformations
$$v_{k+1} = \frac{2\sqrt{r} - x_k}{2r}\, v_k^2, \quad v_k = x_k + \sqrt{r}. \tag{8.15}$$
It is clear from formula (8.15) that the deviation of $x_k$ from the solution $x(r) = -\sqrt{r}$ tends to zero monotonically only with initial points such that
$$|v_0|\, \frac{2\sqrt{r} - x_0}{2r} < 1.$$
As $x_k < 0$ (the approximation is sought in the region $x < 0$), we have
$$|v_0| < \frac{2r}{2\sqrt{r} - x_0} < \sqrt{r}.$$
Thus the last formula shows that a quadratic rate of convergence of Newton's method will be guaranteed only in a domain such that in it $x_k$ deviates from the solution by not more than $\sqrt{r}$, i.e. the domain of convergence of Newton's method tends to zero with decreasing $r$, and the size of this domain is of the order of magnitude of the deviation of $x(r)$ from the true solution of the problem, $x_*$. This indicates that a greater amount of calculation work will be required to hit the region of convergence of Newton's method, while in cases where Newton's method has a good convergence it is no longer necessary, as the approximation obtained deviates from $x_*$ by as much as $x(r)$ does.
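The example is easy to reproduce. The Python sketch below is an illustration under assumptions made here (the value $r = 10^{-4}$, the two starting points, and the function name `newton_barrier` are not from the book): it runs Newton's method on $P'(x, r) = -1 + r/x^2$ from a point whose deviation from $-\sqrt{r}$ is less than $\sqrt{r}$, and from a point whose deviation is larger.

```python
import math


def newton_barrier(x0, r, iters=50):
    """Newton iterates for P'(x, r) = -1 + r / x**2 = 0, where P(x, r) = -x - r / x.

    The iteration stops early if an iterate leaves the region x < 0,
    where the barrier function of the example is defined.
    """
    xs = [x0]
    x = x0
    for _ in range(iters):
        d1 = -1.0 + r / (x * x)   # P'(x, r)
        d2 = -2.0 * r / (x ** 3)  # P''(x, r), positive for x < 0
        x = x - d1 / d2
        xs.append(x)
        if x >= 0.0:
            break
    return xs


if __name__ == "__main__":
    r = 1.0e-4
    s = math.sqrt(r)
    print("start -1.5*sqrt(r):", newton_barrier(-1.5 * s, r)[:6])
    print("start -3.0*sqrt(r):", newton_barrier(-3.0 * s, r))
```

From $x_0 = -1.5\sqrt{r}$ the iterates settle on $-\sqrt{r}$, whereas from $x_0 = -3\sqrt{r}$ the very first step lands at $9\sqrt{r} > 0$, outside the admissible region, even though both starting points are within a few multiples of $\sqrt{r}$ of the solution.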

9. P R O J E C T I O N M E T H O D S
W I T H R E S T O R A T I O N O F TIES
C o n s t r u c t i o n of t h e M e t h o d s
C o nsi der the p r o b l e m of m i n i m i z a t i o n of function /0 (x) w i t h
the following conditions:
f i ( x ) = 0 , i = * 1 , . . ., m , m < n. (9*1)

244
P R O J E C T I O N M E T H O D S

W e set g = ( / lf . . ., / m ), S g = { x : g ( x ) = 0 } a n d s u p p o s e t h a t
all f u n c t i o n s f 0 (a;), (.x ), . . f m ( x ) a r e c o n t i n u o u s l y d i f f e r e n t i a b l e
a n d S g is a s m o o t h m a n i f o l d ( ( n — m ) d i m e n s i o n a l ) , i.e. t h a t
at a n y point x £ S g t h e r a n k o f m a t r i x g ' ( x ) i s e q u a l t o m ( g ' (x ) =
( d f ‘ (x)
= | ^ J, i = 1, . . m> 7 = 1 , . . ., n , i i s t h e r o w index).
Conse que ntl y, at a n y point x £ S g a h y p e r p l a n e ta n g e n t to S g c a n b e
constructed:
g' (x)(x — x ) = 0. (9.2)
F u r t h e r o n , w e d e n o t e t h i s h y p e r p l a n e (i.e. t h e s e t o f p o i n t s t h a t
s a t i s f y e q u a t i o n (9.2)) b y T {x).
One possible approach to the construction of iterative processes for solving the problem formulated is based on the following considerations.
Let $x_0$ be an arbitrary point of $S_g$ such that the gradient $f_0'(x_0)$ is not orthogonal to the hyperplane $T(x_0)$ (i.e. the necessary condition for an extremum of function $f_0(x)$ on the manifold $S_g$ is not fulfilled at point $x_0$). Then in plane $T(x_0)$ there are infinitely many directions of descent of $f_0(x)$ (i.e. there are infinitely many directions $\hat{x}_0 - x_0$ which belong to $T(x_0)$ and such that $(f_0'(x_0), \hat{x}_0 - x_0) < 0$). Suppose we have determined one of these directions $v_0 = \hat{x}_0 - x_0$ and constructed point $\hat{x}_0(\alpha) = x_0 + \alpha v_0$ such that $f_0(\hat{x}_0(\alpha)) < f_0(x_0)$. Point $\hat{x}_0(\alpha)$ no longer satisfies the equations of conditions (9.1). However, if the value of parameter $\alpha$ is sufficiently small (the quantity $\|x_0 - \hat{x}_0(\alpha)\|$ is small), then using point $\hat{x}_0(\alpha)$ we can construct in several ways a point $x_1 \in S_g$ such that
$$f_0(x_1) < f_0(x_0). \tag{9.3}$$
This statement is based on the fact that we can choose a point $x_1(\alpha)$ on the smooth manifold $S_g$ (and not a single one) so that the following condition is fulfilled:
$$x_1(\alpha) - x_0 = \hat{x}_0(\alpha) - x_0 + \omega_0(\alpha)$$
where
$$\|\omega_0(\alpha)\| = \|x_1(\alpha) - \hat{x}_0(\alpha)\| = o(\|\hat{x}_0(\alpha) - x_0\|). \tag{9.4}$$
(This can be proved strictly by using the theorem on the mapping onto one another of the region about point $x_0$ in manifold $S_g$ and in the tangent manifold $T(x_0)$; the theorem holds in space $E^n$; see L. A. Lusternik and V. I. Sobolev.)
If (9.4) holds, we have, since $f_0(x)$ is differentiable,
$$f_0(x_1) = f_0(x_0) + (f_0'(x_0), x_1 - x_0) + o(\|x_1 - x_0\|)$$
$$= f_0(x_0) + (f_0'(x_0), \hat{x}_0 - x_0) + (f_0'(x_0), x_1 - \hat{x}_0) + o(\|x_1 - x_0\|)$$
$$= f_0(x_0) + (f_0'(x_0), \hat{x}_0 - x_0) + o_1(\|\hat{x}_0 - x_0\|).$$


Hence if parameter $\alpha$ is sufficiently small, the inequality (9.3) is satisfied.
By constructing point $x_1 \in S_g$ at which condition (9.3) is fulfilled, we have performed in essence an iteration of a certain process of descent for constructing successive approximations to the solution. Thus the $k$-th iteration of the process of the type being described consists in the following.
1. The direction of descent $v_k = \hat{x}_k - x_k$ of function $f_0(x)$ in the tangent hyperplane $T(x_k)$ is determined.
2. A step of definite length is made in the direction $v_k$: $\hat{x}_k(\alpha) = x_k + \alpha v_k$ (so that $f_0(\hat{x}_k) < f_0(x_k)$).
3. Using point $\hat{x}_k(\alpha)$, point $x_{k+1} \in S_g$ is determined such that condition $f_0(x_{k+1}) < f_0(x_k)$ is satisfied.
It is clear from the foregoing that, for moving from point $x_k$, we can choose different directions of descent in plane $T(x_k)$. The choice of quantity $\alpha_k$ and the final step of the iteration, the construction of point $x_{k+1}$, are also not uniquely determined. Performing each of the three stages of the iteration in different ways, we can construct a whole class of processes of descent of the type described.
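The three stages above can be sketched on a toy problem. The following is only an illustration under assumptions that are not in the text: the objective $f_0(x) = x^1 + x^2$, the single tie $(x^1)^2 + (x^2)^2 - 1 = 0$, restoration by radial scaling back to the circle (one of the "several ways" of returning to $S_g$), and all function names are hypothetical choices.

```python
import math

def f0(x):
    # Illustrative objective (an assumption): f0(x) = x1 + x2
    return x[0] + x[1]

def grad_f0(x):
    return (1.0, 1.0)

def grad_g(x):
    # Gradient of the single tie g(x) = x1^2 + x2^2 - 1
    return (2.0 * x[0], 2.0 * x[1])

def tangent_descent_direction(x):
    # Stage 1: project the antigradient onto the tangent line T(x).
    f, g = grad_f0(x), grad_g(x)
    lam = (f[0] * g[0] + f[1] * g[1]) / (g[0] ** 2 + g[1] ** 2)
    return (-(f[0] - lam * g[0]), -(f[1] - lam * g[1]))

def restore(x_hat):
    # Stage 3: return to the manifold, here simply by radial scaling.
    r = math.hypot(x_hat[0], x_hat[1])
    return (x_hat[0] / r, x_hat[1] / r)

def iterate(x, alpha0=1.0, max_halvings=50):
    v = tangent_descent_direction(x)
    alpha = alpha0
    for _ in range(max_halvings):
        # Stage 2: step of length alpha along v in the tangent plane.
        x_hat = (x[0] + alpha * v[0], x[1] + alpha * v[1])
        x_new = restore(x_hat)
        if f0(x_new) < f0(x):
            return x_new
        alpha *= 0.5
    return x  # no decrease found: x is numerically stationary

x = (1.0, 0.0)
for _ in range(200):
    x = iterate(x)
print(x, f0(x))
```

In this sketch the step is halved until the restored point actually decreases $f_0$, which mirrors the requirement $f_0(x_{k+1}) < f_0(x_k)$ of stage 3.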
Consider now several possible methods of choosing vector $v_k$. We can take as vector $v_k$ the projection of the antigradient $-f_0'(x_k)$ onto the plane $T(x_k)$. The construction of such a vector is equivalent to solving the problem of minimization of function
$$F_k(x) = (f_0'(x_k), x - x_k) + \frac{1}{2}\|x - x_k\|^2 \tag{9.5}$$
provided that $x \in T(x_k)$. Applying the method of Lagrange multipliers, we find that
$$v_k = -\left(I - g'^{*}(g' g'^{*})^{-1} g'\right) f_0'(x_k) \tag{9.6}$$
where $g' = g'(x_k)$ and $g'^{*}$ denotes the transpose of $g'$.
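For the reader's convenience, the Lagrange multiplier computation behind (9.6) can be written out as follows. This is a sketch of the standard argument; the multiplier vector $\lambda \in E^m$ is a symbol introduced here and does not appear in the text.

```latex
% Minimize F_k(x) over the tangent plane g'(x_k)(x - x_k) = 0, with v = x - x_k.
\begin{aligned}
L(v, \lambda) &= (f_0'(x_k), v) + \tfrac{1}{2}\|v\|^2 + (\lambda, g' v), \\
\partial L / \partial v = 0 \;&\Rightarrow\; v = -f_0'(x_k) - g'^{*}\lambda, \\
g' v = 0 \;&\Rightarrow\; g' g'^{*} \lambda = -g' f_0'(x_k)
  \;\Rightarrow\; \lambda = -(g' g'^{*})^{-1} g' f_0'(x_k), \\
v_k &= -f_0'(x_k) + g'^{*}(g' g'^{*})^{-1} g' f_0'(x_k)
     = -\bigl(I - g'^{*}(g' g'^{*})^{-1} g'\bigr) f_0'(x_k).
\end{aligned}
```

The matrix $g' g'^{*}$ is invertible because the rank of $g'(x_k)$ is equal to $m$.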
More effective projection methods with restoration of ties can be constructed by choosing as $v_k$ a vector that minimizes the function
$$F_k(x) = (f_0'(x_k), x - x_k) + \frac{1}{2}(f_0''(x_k)(x - x_k), x - x_k) \tag{9.7}$$
on plane $T(x_k)$ (there is such a vector if $F_k$ is a convex function). Since in this case the quadratic approximation to the function being minimized was practically used for the construction of the direction of motion, we shall call the methods in which $v_k$ is constructed in the way described methods of the second order.
Consider now the method of restoration of ties (the third stage of an iteration) which will be used in what follows.
Let the system of equations (9.1) in a certain region about any point $x \in S_g$ define the function $y = y(z)$, where $y$ is an $m$-dimensional vector of coordinates and $z$ is an $(n - m)$-dimensional vector. Without loss of generality, we can take $y = (x^1, \dots, x^m)$, $z = (x^{m+1}, \dots, x^n)$. By the theorem of implicit functions, it is sufficient for the existence of function $y(z)$ and its derivatives that at any point $x \in S_g$ the determinant
$$|g_y(x)| = \left|\left\{\frac{\partial f_i}{\partial x^j}\right\}\right| \ne 0, \quad i, j = 1, \dots, m. \tag{9.8}$$
In this case, point $x_{k+1} = (z_{k+1}, y_{k+1})$ can be constructed by the formulas
$$z_{k+1} = z_k + \alpha_k p_k, \quad y_{k+1} = y(z_{k+1}) \tag{9.9}$$
where $p_k = \hat{z}_k - z_k$ is the corresponding part of vector $v_k$. To construct sequence (9.9), there is no need to find the explicit expression of function $y(z)$; it is sufficient to be able to evaluate it (i.e. to solve system (9.1)) with a fixed vector $z$.
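Evaluating $y(z)$ without an explicit expression can be sketched as follows. This is a minimal illustration assuming a single tie ($m = 1$) and Newton's method as the solver for system (9.1) with $z$ fixed; the text does not prescribe a particular solver, and the names `restore_y`, `tie` and `tie_dy` are hypothetical.

```python
def restore_y(z, y_start, tie, tie_dy, tol=1e-12, max_iter=50):
    """Solve tie(z, y) = 0 for y by Newton's method, with z held fixed."""
    y = y_start
    for _ in range(max_iter):
        residual = tie(z, y)
        if abs(residual) < tol:
            return y
        # The Newton step requires g_y != 0, i.e. condition (9.8).
        y -= residual / tie_dy(z, y)
    raise RuntimeError("restoration did not converge")

# Illustrative tie (an assumption): z^2 + y^2 - 1 = 0
tie = lambda z, y: z * z + y * y - 1.0
tie_dy = lambda z, y: 2.0 * y

# Start from the previous value y_k = 1.0 and restore the tie at z = 0.6.
y_new = restore_y(0.6, 1.0, tie, tie_dy)
print(y_new)
```

Starting the solver from the previous value $y_k$ is a natural choice here, since for small steps $y_{k+1}$ lies close to $y_k$.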
The construction of sequence (9.9) (besides realizing in this way one of the possible methods of restoration of ties) can be considered in another respect: as an iterative process of minimization of function $\varphi(z) = f_0(z, y(z))$. Obviously, with condition (9.8) fulfilled, the minimization of $\varphi(z)$ is equivalent to solving the original problem. Vector $p_k$ (which is the direction of descent of function $\varphi(z)$) can be considered in this case to be the solution of the problem of minimization of function $\psi_k(z) = F_k(z, y_l(z))$, where function $F_k(z, y)$ is determined by formula (9.5) or (9.7), and vector-function $y_l(z)$ is determined from the linearized equation of ties (i.e. from the equation of the tangent plane $T(x_k)$)
$$g_y(x_k)(y - y_k) + g_z(x_k)(z - z_k) = 0.$$
Hence
$$y_l(z) = y_k - g_y^{-1}(x_k) g_z(x_k)(z - z_k).$$
The fact that vector $p_k$ determined by the methods described is the direction of descent of $\varphi(z)$ follows from the equality $\psi_k'(z_k) = \varphi'(z_k)$, where
$$\varphi'(z_k) = f_{0z}(x_k) + y'^{*}(z_k) f_{0y}(x_k),$$
$$f_{0z} = \left(\frac{\partial f_0}{\partial x^{m+1}}, \dots, \frac{\partial f_0}{\partial x^n}\right), \quad f_{0y} = \left(\frac{\partial f_0}{\partial x^1}, \dots, \frac{\partial f_0}{\partial x^m}\right),$$
$$y'(z_k) = -g_y^{-1}(x_k) g_z(x_k). \tag{9.10}$$
Since the process of the (9.9) type can be considered to be a method of minimization of function $\varphi(z)$, it is easy to see that vector $-\varphi'(z_k)$ can be taken as $p_k$; then sequence (9.9) will be the gradient method of minimization of $\varphi(z)$.
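Formula (9.10) can be checked numerically on a small example. The sketch below assumes $n = 2$, $m = 1$, the tie $z^2 + y^2 - 1 = 0$ on the branch $y > 0$, and the objective $f_0(z, y) = (z - 2)^2 + (y - 2)^2$; these choices and the helper names are illustrative assumptions, not taken from the text.

```python
import math

def reduced_gradient(f_z, f_y, g_z, g_y):
    # Scalar case of (9.10): y'(z_k) = -g_y^{-1} g_z,
    # then phi'(z_k) = f_0z + y'(z_k) * f_0y.
    y_prime = -g_z / g_y
    return f_z + y_prime * f_y

def phi(z):
    # phi(z) = f0(z, y(z)) with y(z) known in closed form for this tie.
    y = math.sqrt(1.0 - z * z)
    return (z - 2.0) ** 2 + (y - 2.0) ** 2

z_k = 0.5
y_k = math.sqrt(1.0 - z_k ** 2)

analytic = reduced_gradient(
    f_z=2.0 * (z_k - 2.0), f_y=2.0 * (y_k - 2.0),
    g_z=2.0 * z_k, g_y=2.0 * y_k,
)

# Central finite difference of phi as an independent check.
h = 1e-6
numeric = (phi(z_k + h) - phi(z_k - h)) / (2.0 * h)
print(analytic, numeric)
```

Only first derivatives of $f_0$ and of the tie at the point $x_k$ enter the formula; the function $y(z)$ itself is never differentiated explicitly.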


Note that vector $p_k$ providing the minimum of function $F_k(z, y_l(z))$, where $F_k(z, y)$ is defined by expression (9.5), is calculated by the following formula:
$$p_k = -\left(I + y'^{*}(z_k) y'(z_k)\right)^{-1} \varphi'(z_k). \tag{9.11}$$
Consequently, sequence (9.9) in which $p_k$ is determined by formula (9.11) is also a method of the gradient type for the minimization of $\varphi(z)$. Methods of the gradient type we shall call methods of the first order.
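Formula (9.11) follows from a short computation, sketched here with the abbreviations $s = z - z_k$ and $y' = y'(z_k)$, which are introduced only for this derivation.

```latex
% Substitute y_l(z) - y_k = y' s into (9.5), with x - x_k = (s, y' s).
\begin{aligned}
\psi_k(z) &= (f_{0z}, s) + (f_{0y}, y' s) + \tfrac{1}{2}\|s\|^2 + \tfrac{1}{2}\|y' s\|^2 \\
          &= (\varphi'(z_k), s) + \tfrac{1}{2}\bigl((I + y'^{*} y')\, s, s\bigr), \\
\psi_k'(z) = 0 \;&\Rightarrow\; \varphi'(z_k) + (I + y'^{*} y')\, s = 0
  \;\Rightarrow\; p_k = -(I + y'^{*} y')^{-1} \varphi'(z_k), \\
(\varphi'(z_k), p_k) &= -\bigl((I + y'^{*} y')^{-1} \varphi'(z_k), \varphi'(z_k)\bigr) < 0
  \quad \text{if } \varphi'(z_k) \ne 0 .
\end{aligned}
```

The matrix $I + y'^{*} y'$ is positive definite, so $p_k$ is well defined and is a direction of descent of $\varphi(z)$.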
Newton's method and its modifications can be applied in principle to the minimization of function $\varphi(z)$, provided certain necessary requirements are fulfilled. However, it should be noted that the calculation of the second derivative $\varphi''(z)$ is, as a rule, very laborious, for it requires the calculation of the second derivative of the vector-function $y(z)$, i.e., practically, the calculation of the second derivatives of functions $f_1(x), \dots, f_m(x)$.
Suppose that condition (9.8) is not fulfilled and the following weaker requirement is satisfied: at any point $x \in S_g$ at least one determinant of order $m$ is not zero:
$$\left|\left\{\frac{\partial f_i}{\partial x^{j_s}}\right\}\right| \ne 0, \quad i, s = 1, \dots, m. \tag{9.12}$$
The weakening of the requirements on functions $f_i$ is that at different points of set $S_g$ different determinants may be nonzero. In this case, the coordinates of point $x \in S_g$ which form vector $z$ and vector-function $y(z)$ can be, generally speaking, different at different points of manifold $S_g$: $z = (x^{j_{m+1}}, \dots, x^{j_n})$, $y = (x^{j_1}, \dots, x^{j_m})$. Taking this into account we can, as before, use formula (9.9) for the restoration of ties. Each step of process (9.9) can be treated as a step of the process of minimization of a certain function $\varphi(x^{j_{m+1}}, \dots, x^{j_n})$ for which the corresponding vector $p_k$ is the direction of descent.
Methods of the (9.9) type will be studied below. It will be convenient to denote any vector-function $(x^{j_1}, \dots, x^{j_m})$ by $y$ and a vector of independent variables by $z$ (as we did when condition (9.8) was fulfilled). Accordingly, any of the determinants $|\{\partial f_i / \partial x^{j_s}\}|$ of order $m$ will be denoted by $|g_y|$ and function $f_0(z, y(z))$ by $\varphi(z)$. The absolute value of function $|g_y(x)|$ shall be denoted by $|g_y(x)|_A$.
In the following two subsections we shall study the properties of the methods of the first and second order. In the fourth subsection we shall consider methods of dual and conjugate directions for the minimization of $\varphi(z)$ (or the algorithms based on methods of this type). From the viewpoint of practical computations it is just these algorithms that are of the greatest interest.


Methods of the First Order
We shall study the properties of methods based on the linearization of function $f_0(x)$ and ties $f_i$, $i = 1, \dots, m$.
Consider the algorithm whose every step is a step of the gradient method for minimization of a certain function $\varphi(z)$:
$$z_{k+1} = z_k - \alpha_k \varphi'(z_k), \quad y_{k+1} = y(z_{k+1}) \tag{9.13}$$
where $z_k$ is a vector corresponding to the determinant $|g_y(x)|$ which has at point $x_k \in S_g$ the maximum absolute value of all the determinants $|g_y|$, the gradient $\varphi'(z_k)$ is calculated by formula (9.10) and parameter $\alpha_k$ can be determined by one of the methods described in studying gradient methods (Sec. 1, Chap. II). We shall choose as $\alpha_k$ the maximum value of the parameter, obtained by successive reductions of a certain positive constant, which satisfies
$$f_0(z, y(z)) - f_0(z_k, y_k) \le -\varepsilon \alpha \|\varphi'(z_k)\|^2, \quad 0 < \varepsilon < 1 \tag{9.14}$$
where $z = z_k - \alpha \varphi'(z_k)$ (this is an analogue of the method of choosing $\alpha_k$ according to condition (1.2), Chap. II).
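Algorithm (9.13) with the step rule (9.14) can be sketched on a toy problem. The example assumes $n = 2$, $m = 1$, the tie $z^2 + y^2 - 1 = 0$ (branch $y > 0$, restored in closed form), the objective $f_0(z, y) = (z - 2)^2 + (y - 2)^2$, halving as the reduction rule, and the stopping tolerance; all of these, and the function names, are assumptions made only for illustration.

```python
import math

def f0(z, y):
    return (z - 2.0) ** 2 + (y - 2.0) ** 2

def restore(z):
    # y(z) on the branch y > 0 of the tie z^2 + y^2 - 1 = 0.
    return math.sqrt(1.0 - z * z)

def phi_prime(z, y):
    # Scalar case of (9.10): f_0z + y' f_0y with y' = -g_z / g_y.
    return 2.0 * (z - 2.0) + (-(2.0 * z) / (2.0 * y)) * 2.0 * (y - 2.0)

def first_order_method(z, eps=0.25, alpha0=1.0, tol=1e-5, max_iter=500):
    y = restore(z)
    for _ in range(max_iter):
        grad = phi_prime(z, y)
        if abs(grad) < tol:
            break
        alpha = alpha0
        accepted = False
        for _ in range(60):  # successive reductions of the step
            z_new = z - alpha * grad
            if abs(z_new) < 1.0:  # y(z) is defined only for |z| < 1
                y_new = restore(z_new)
                # Inequality (9.14)
                if f0(z_new, y_new) - f0(z, y) <= -eps * alpha * grad * grad:
                    accepted = True
                    break
            alpha *= 0.5
        if not accepted:
            break
        z, y = z_new, y_new
    return z, y

z_star, y_star = first_order_method(0.0)
print(z_star, y_star)
```

Every trial value of $\alpha$ costs one restoration of the tie; in this toy case restoration is a square root, but in general it means solving the nonlinear system (9.1).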
Theorem 9.1. If functions $f_0(x)$ and $f_i(x)$, $i = 1, \dots, m$ are twice continuously differentiable and, besides, functions $f_i$ are such that condition (9.12) is fulfilled, and set $S = S_g \cap S_0$ ($S_0 = \{x : f_0(x) \le f_0(x_0)\}$) is bounded with an arbitrary choice of point $x_0$, then on sequence (9.13) $f_0(x_{k+1}) \le f_0(x_k)$ and $\|\varphi'(z_k)\| \to 0$ as $k \to \infty$.
Proof. The possibility of constructing sequence (9.13) follows from condition (9.12): with sufficiently small values of parameter $\alpha$, point $z_{k+1}$ lies in the region about point $z_k$ where function $y(z)$ is defined. In this region, by the assumptions of the theorem, function $\varphi(z) = f_0(z, y(z))$ is twice continuously differentiable. Taking this into account, the following estimate holds:
$$\varphi(z_{k+1}) - \varphi(z_k) \le \alpha_k \|\varphi'(z_k)\|^2 \left(-1 + \frac{\alpha_k}{2}\|\varphi''(z_{kc})\|\right) \tag{9.15}$$
where $\varphi''(z_{kc}) = \varphi''(z_k + \theta(z_{k+1} - z_k))$, $\theta \in [0, 1]$; hence, if the value of $\alpha_k$ is sufficiently small, the inequality (9.14) will be satisfied. This means that on elements of sequence (9.13) function $f_0(x)$ decreases monotonically.
We shall prove now that $\|\varphi'(z_k)\| \to 0$. On the closed bounded set $S$ the continuous function $\max |g_y(x)|_A$ (the largest absolute value among all the determinants $|g_y|$ at point $x$) assumes its minimum value $\gamma$ (Weierstrass' theorem), and by (9.12) $\gamma > 0$ (this function is continuous, being the maximum of the continuous functions $|g_y(x)|_A$). The number of different functional determinants $|g_y|$ is finite and, since functions $f_i$ are differentiable on set $S$, these determinants are all uniformly continuous. Therefore, for any constant $0 < \gamma_1 < \gamma$, there is a constant $\rho > 0$ such that at any point of set $S$ which belongs to the sphere $S_\theta$ of radius $\rho$ with its centre at an arbitrary point $\theta \in S$, the absolute value of the determinant $|g_y(x)|$ that has the largest absolute value at point $\theta$ will be not less than $\gamma_1$. Moreover, since set $S$ is bounded and the first and second partial derivatives of functions $f_i$ are continuous, these derivatives at any point of $S$ chosen in sphere $S_\theta$ of radius $\rho$ are bounded (by a constant $M$). Taking this statement into account and according to the theorems on implicit functions, we can assert that in a certain parallelepiped
$$[\theta^i \pm \delta\theta^i], \quad i = 1, \dots, n$$
belonging to sphere $S_\theta$, system (9.1) defines at least one twice continuously differentiable vector-function $y(z)$ and in this parallelepiped the derivatives of any such function are bounded:
$$\|y'(z)\| \le N_1, \quad \|y''(z)\| \le N_1. \tag{9.16}$$
Since the derivatives $y'(z)$, $y''(z)$ are bounded and the first and second derivatives of function $f_0(x)$ on set $S$ are bounded too, the derivatives of function $\varphi(z)$ in parallelepiped $[\theta^i \pm \delta\theta^i]$ are also bounded: $\|\varphi'(z)\| \le N_2$, $\|\varphi''(z)\| \le N_2$.
Taking into account the above remarks, we can ascertain that there is a constant $\delta > 0$ such that if $\alpha_k \le \delta$ then point $x_{k+1}$ is in sphere $S_{x_k}$ of radius $\rho$. Indeed, suppose that $\|x_{k+1} - x_k\|^2 = \|z_{k+1} - z_k\|^2 + \|y_{k+1} - y_k\|^2 = \rho^2$. Since these equalities mean that $x_{k+1} \in S_{x_k}$, this point also belongs to parallelepiped $[x_k^i \pm \delta x_k^i]$, $i = 1, \dots, n$ in which estimates (9.16) hold for derivatives of function $y(z)$. Consequently, $\|y_{k+1} - y_k\| \le N_3 \|z_{k+1} - z_k\|$ and therefore it follows from the preceding equalities that $(N_3^2 + 1)\|z_{k+1} - z_k\|^2 = \alpha_k^2 N_4^2 \|\varphi'(z_k)\|^2 \ge \rho^2$ (where $N_4^2 = N_3^2 + 1$); hence
$$\alpha_k \ge \frac{\rho}{N_4 \|\varphi'(z_k)\|} \ge \frac{\rho}{N_4 N_2}.$$
This estimate shows that the equality $\|x_{k+1} - x_k\| = \rho$ holds with $\alpha_k \ge \rho / (N_4 N_2)$, i.e. that we can choose as $\delta$ any constant not exceeding $\rho / (N_4 N_2)$.
Using now inequality (9.15) (and taking into account that derivative $\varphi''(z)$ is bounded), it is easy to ascertain that inequality (9.14) will certainly hold with $\alpha_k = \min\{\delta, 2(1 - \varepsilon)/N_2\}$. But this means, since $f_0(x)$ has a lower bound (on set $S$), that as $k \to \infty$ of necessity $\|\varphi'(z_k)\| \to 0$. The theorem is proved.
The condition $\|\varphi'(z_k)\| \to 0$ means in the general case that sequence (9.13) (or a certain subsequence of it) converges to a point $x^*$ which satisfies the necessary condition for an extremum of function $f_0(x)$ on manifold $S_g$ (at point $x^*$ the gradient $f_0'(x^*)$ is orthogonal to the tangent hyperplane $g'(x^*)(x - x^*) = 0$) (Chap. I, Sec. 4). Since function $f_0(x)$ is continuous, its minimum on set $S$ exists.
If sequence (9.13) converges to the solution and function $\varphi(z)$, to whose minimization in a certain region about the minimum the solving of the original problem is reduced, satisfies the conditions $m_0 \|v\|^2 \le (\varphi''(z) v, v) \le M \|v\|^2$ for any $v \in E^{n-m}$, then the rate of convergence will not be slower than that of a certain geometric progression; this follows from the results on the convergence of gradient methods (theorem 1.2, Chap. II). We shall dwell on several questions connected with the implementation of the method.
In the theorem proved above, vectors $z_k$ and $y_k$ were defined by means of the determinant $|g_y(x_k)|$ of maximum absolute value. To find this determinant it is necessary at each iteration to calculate all the determinants $|g_y|$. However, in practice there is no need to do so (this determinant is used in the theorem only to simplify the proof). The convergence of the method is retained if we choose vectors $z_k$ and $y_k$ corresponding to any of the determinants $|g_y|$ whose absolute value at point $x_k$ is not less than an arbitrary small positive constant $\rho$ (the same for all $k$). With the conditions of the theorem such a constant exists since there is the constant $\gamma$. Therefore in implementing the algorithm, vectors $z$ and $y$ should be chosen corresponding to the same determinant until at a certain point its absolute value becomes less than $\rho$; only if this has occurred is it necessary to pass to other vectors $z$ and $y$, i.e. to calculate another determinant $|g_y|$. The constant $\rho$ is arbitrary. It can occur that at a certain point $x_k$ all the determinants $|g_y|$ have an absolute value less than $\rho$. We have then to choose a new constant $\rho_1 < \rho$. At each reduction of parameter $\alpha$ which is necessary to fulfill the inequality (9.14), a new evaluation of function $y(z)$ is required (for the evaluation of $f_0(x) = f_0(z, y(z))$), i.e. we have to solve the system of nonlinear equations (9.1) with a fixed value of vector $z$. To reduce the amount of computation, the required value of the parameter should be determined by establishing the following inequality:
$$f_0(z, y_l(z)) - f_0(z_k, y_k) \le -\varepsilon \alpha \|\varphi'(z_k)\|^2, \quad 0 < \varepsilon < 1. \tag{9.17}$$
As soon as this inequality is satisfied, the inequality (9.14) should be checked with the $\alpha$ obtained; if (9.14) is not satisfied, the reduction of $\alpha$ should be continued; otherwise the obtained value of the parameter should be retained or one should attempt to increase it while checking (9.14). Note that
$$y(z) = y_k + y'(z_k)(z - z_k) + O(\|z - z_k\|^2) = y_l(z) + O(\|z - z_k\|^2).$$


With a sufficiently small $z_{k+1} - z_k$, we obtain $f_0(z_{k+1}, y_l(z_{k+1})) \to f_0(z_{k+1}, y(z_{k+1}))$; therefore if (9.17) is satisfied, inequality (9.14) will also be satisfied, i.e. no additional reductions of the step length will be required.
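The second-order closeness of $y_l(z)$ to $y(z)$, which justifies testing with the cheap inequality (9.17) first, can be observed numerically. The sketch below assumes the illustrative tie $z^2 + y^2 - 1 = 0$ on the branch $y > 0$; the tie and the function names are hypothetical choices.

```python
import math

def y_exact(z):
    # Exact solution of the tie z^2 + y^2 - 1 = 0 on the branch y > 0.
    return math.sqrt(1.0 - z * z)

def y_linearized(z, z_k, y_k):
    # y_l(z) = y_k - g_y^{-1} g_z (z - z_k), with g_y = 2 y_k, g_z = 2 z_k.
    return y_k - (z_k / y_k) * (z - z_k)

z_k = 0.5
y_k = y_exact(z_k)

errors = []
for h in (0.02, 0.01):
    errors.append(abs(y_exact(z_k + h) - y_linearized(z_k + h, z_k, y_k)))

# Halving the step should divide the error by about four.
ratio = errors[0] / errors[1]
print(errors, ratio)
```

Evaluating $y_l(z)$ needs only quantities already computed at $x_k$, whereas each evaluation of $y(z)$ requires solving system (9.1).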
Remark. The requirements of theorem 9.1 on the smoothness of functions $f_0(x)$ and $f_i(x)$ can be somewhat weakened; however, this leads to a more complicated proof.
We dwell briefly also on a method of the (9.9) type in which vector $p_k$ is chosen by formula (9.11) (vectors $z_k$, $y_k$ are determined in the same way as in the preceding method) and $\alpha_k$ is the maximum value of the parameter (obtained by reductions) which satisfies the following inequality:
$$f_0(z, y(z)) - f_0(z_k, y_k) \le \varepsilon \alpha (\varphi'(z_k), p_k), \quad z = z_k + \alpha p_k.$$
For such an algorithm, theorem 9.1 holds true. The proof will differ only in some details (analogously to the difference between the proof of the theorem on the properties of methods of the gradient type and the proof of the theorems on the method of steepest descent, Sec. 1, Chap. II).
Note that the amount of work per iteration in such an algorithm is greater than in method (9.13).

Method of the Second Order
Suppose that $f_0(x)$ is a strongly convex function. Then the quadratic function $F_k(x)$ (9.7) is strictly convex and since function $y_l(z)$ is linear, function $\psi_k(z) = F_k(z, y_l(z))$ is strictly convex too. More precisely, due to the strong convexity of $f_0(x)$, the following conditions are fulfilled for any function $\psi_k(z)$ with any vector $v \in E^{n-m}$:
$$m_0 \|v\|^2 \le (\psi_k'' v, v) \le M_0 \|v\|^2, \quad m_0 > 0 \tag{9.18}$$
where matrix $\psi_k'' = f_{0zz} + y'^{*} f_{0yy} y' + 2 y'^{*} f_{0zy}$ (all the derivatives are calculated at point $x_k$). In this case vector $p_k$ which minimizes $\psi_k(z)$ is calculated by the formula
$$p_k = -(\psi_k'')^{-1} \varphi'(z_k). \tag{9.19}$$
In the method of the second order, point $x_{k+1}$, $k = 0, 1, \dots$ is constructed as follows:
$$z_{k+1} = z_k - \alpha_k (\psi_k'')^{-1} \varphi'(z_k), \quad y_{k+1} = y(z_{k+1}) \tag{9.20}$$
where vectors $z_k$ and $y_k$ are determined in the same way as in constructing method (9.13), and for $\alpha_k$ we take the maximum value of the parameter (obtained by reductions) which satisfies the following inequality:
$$f_0(x) - f_0(x_k) \le \varepsilon \alpha (\varphi'(z_k), p_k), \quad 0 < \varepsilon < \frac{1}{2} \tag{9.21}$$
where $x = (z, y(z))$, $z = z_k + \alpha p_k$.
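Method (9.20) with the step rule (9.21) can be sketched in the scalar case. The example assumes $n = 2$, $m = 1$, the tie $z^2 + y^2 - 1 = 0$ (branch $y > 0$), the objective $f_0(z, y) = (z - 2)^2 + (y - 2)^2$ (so that $f_{0zz} = f_{0yy} = 2$ and $f_{0zy} = 0$), halving as the reduction rule, and the stopping tolerance; these and the function names are illustrative assumptions.

```python
import math

def f0(z, y):
    return (z - 2.0) ** 2 + (y - 2.0) ** 2

def second_order_method(z, eps=0.25, tol=1e-5, max_iter=500):
    y = math.sqrt(1.0 - z * z)
    for _ in range(max_iter):
        y_prime = -z / y                                    # y' = -g_z / g_y
        grad = 2.0 * (z - 2.0) + y_prime * 2.0 * (y - 2.0)  # formula (9.10)
        if abs(grad) < tol:
            break
        # psi_k'' = f_0zz + y' f_0yy y' + 2 y' f_0zy, with f_0zy = 0 here
        psi2 = 2.0 + y_prime * 2.0 * y_prime
        p = -grad / psi2                                    # formula (9.19)
        alpha = 1.0
        accepted = False
        for _ in range(60):  # successive reductions of the step
            z_new = z + alpha * p
            if abs(z_new) < 1.0:  # y(z) is defined only for |z| < 1
                y_new = math.sqrt(1.0 - z_new * z_new)
                # Inequality (9.21); note (grad, p) < 0
                if f0(z_new, y_new) - f0(z, y) <= eps * alpha * grad * p:
                    accepted = True
                    break
            alpha *= 0.5
        if not accepted:
            break
        z, y = z_new, y_new
    return z, y

z_opt, y_opt = second_order_method(0.0)
print(z_opt, y_opt)
```

Note that $\psi_k''$ uses second derivatives of $f_0$ only; the curvature of the tie enters the process solely through the restoration $y_{k+1} = y(z_{k+1})$.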
Theorem 9.2. Let $f_0(x)$ be a twice continuously differentiable function and for any vector $\omega \in E^n$
$$m \|\omega\|^2 \le (f_0''(x)\omega, \omega) \le M \|\omega\|^2, \quad m > 0$$
and let functions $f_i(x)$, $i = 1, \dots, m$ satisfy the requirements of theorem 9.1. Then, whatever the point $x_0$ chosen, the results of theorem 9.1 hold for method (9.20).
The proof of the theorem follows the same scheme as that used in proving theorem 9.1. Therefore, we shall dwell only on those changes in the proof, as compared to that of theorem 9.1, which arise because of the different method of choosing vector $p_k$.
Due to the strong convexity of $f_0(x)$, the set $S_0$ is bounded. It follows that set $S = S_g \cap S_0$ is bounded and closed (since sets $S_0$ and $S_g$ are closed). Taking this into account, we prove, in the same way as in theorem 9.1, that estimates (9.16) hold and establish that the derivatives $\varphi'(z)$, $\varphi''(z)$ are bounded in the parallelepiped $[\theta^i \pm \delta\theta^i]$, $i = 1, \dots, n$.
Further, by (9.18), $\|(\psi_k'')^{-1}\| \le 1/m_0$; consequently, $\|p_k\| = \|(\psi_k'')^{-1} \psi_k'(z_k)\| = \|(\psi_k'')^{-1} \varphi'(z_k)\| \le N_2 / m_0$, and, therefore, if $(N_3^2 + 1)\|z_{k+1} - z_k\|^2 = \alpha_k^2 N_4^2 \|p_k\|^2 \ge \rho^2$, then
$$\alpha_k \ge \frac{\rho}{N_4 \|p_k\|} \ge \frac{\rho m_0}{N_4 N_2}. \tag{9.22}$$
It follows that any constant, provided it does not exceed $\rho m_0 / (N_4 N_2)$, can be chosen as $\delta$.
Using the expansion of function $\varphi(z)$ into Taylor's series, we find that in sphere $S_{x_k}$ of radius $\rho$
$$\varphi(z_{k+1}) - \varphi(z_k) = \alpha_k (\varphi'(z_k), p_k) + \frac{\alpha_k^2}{2}(\varphi''(z_{kc}) p_k, p_k) \le \alpha_k (\varphi'(z_k), p_k) + \frac{\alpha_k^2 N_2}{2}(p_k, p_k).$$
It follows from (9.19), with account of (9.18), that
$$(\psi_k'' p_k, p_k) = -(\varphi'(z_k), p_k) \ge m_0 \|p_k\|^2.$$
This implies that
$$\varphi(z_{k+1}) - \varphi(z_k) \le \alpha_k (\varphi'(z_k), p_k)\left(1 - \frac{\alpha_k N_2}{2 m_0}\right).$$


Taking into account this estimate and inequalities (9.22), we establish that inequality (9.21) will certainly hold with
$$\alpha_k = \min\left\{\delta, \frac{2 m_0 (1 - \varepsilon)}{N_2}\right\}.$$
This means, since $f_0(x)$ has a lower bound, that
$$(\varphi'(z_k), p_k) \to 0. \tag{9.23}$$
Since $-(\varphi'(z_k), p_k) = ((\psi_k'')^{-1} \varphi'(z_k), \varphi'(z_k)) \ge \frac{1}{M_0}\|\varphi'(z_k)\|^2$, it follows from (9.23) that $\|\varphi'(z_k)\| \to 0$. This completes the proof of the theorem.
In implementing the algorithm (9.20) one should take account of the remarks concerning the choice of vectors $z_k$, $y_k$ and parameter $\alpha_k$ made in studying method (9.13).
If sequence (9.20) converges to the solution and the condition
$$\psi_k'' - \varphi''(z_k) \to 0 \tag{9.24}$$
holds for function $\varphi(z)$ to whose minimization the solving of the original problem is reduced, then the rate of convergence of the method is superlinear. In order to ascertain this, one should take into account that if (9.24) holds (and also $(\psi_k'' p_k, p_k) = -(\varphi'(z_k), p_k)$), then
$$\varphi(z_{k+1}) - \varphi(z_k) = \alpha_k (\varphi'(z_k), p_k)\left(1 - \frac{\alpha_k (\varphi''(z_k) p_k, p_k)}{2 (\psi_k'' p_k, p_k)} - \frac{\alpha_k ((\varphi''(z_{kc}) - \varphi''(z_k)) p_k, p_k)}{2 (\psi_k'' p_k, p_k)}\right)$$
$$\approx \alpha_k (\varphi'(z_k), p_k)\left(1 - \frac{\alpha_k}{2} - \frac{\alpha_k ((\varphi''(z_{kc}) - \varphi''(z_k)) p_k, p_k)}{2 (\psi_k'' p_k, p_k)}\right).$$
At the same time, by (9.18) and (9.24), function $\varphi(z)$ will be strongly convex in a certain region about the minimum. With the above remarks, the proof of the superlinear rate of convergence can be the same as that, for example, in studying Newton's method (Sec. 2, Chap. II).
Thus the rate of convergence of method (9.20) in a number of problems will be faster than that of methods of the first order. However, the amount of work per iteration in method (9.20) may prove considerably greater owing to the necessary calculations of the second derivatives of function $f_0(x)$.

Minimization Methods
of Higher Effectiveness
The projection methods described in the preceding subsections are, in a sense, analogues of the gradient methods and Newton's method for solving problems of finding an absolute extremum.

254
P R O J E C T I O N M E T H O D S

T h e y share the s h o r t c o m i n g s of the c o r r e s p o n d i n g m e t h o d s : either


t h e y h a v e a s l o w r a t e o f c o n v e r g e n c e ( l i k e m e t h o d s o f t h e first o r d e r )
or t h e y i n v o l v e a greater a m o u n t of w o r k p e r iteration (like m e t h o d s
of t h e s e c o n d order). H o w e v e r , t h e fact t h a t in t h e a l g o r i t h m u n d e r
c o n s i d e r a t i o n t h e s o l v i n g o f t h e o r i g i n a l p r o b l e m is r e d u c e d t o
u n c o n s t r a i n e d m i n i m i z a t i o n of a f u n c t i o n (one or several functions,
d e p e n d i n g o n w h e t h e r c o n d i t i o n ( 9 . 8 ) o r ( 9 . 1 2 ) h o l d s ) m a k e s it
possible to use s u c h effective m i n i m i z a t i o n a l g o r i t h m s as m e t h o d s
o f d u a l a n d c o n j u g a t e d i r e c t i o n s ( S e c s . 3 - 5 , C h a p . II). T h u s , if
f u n c t i o n s /* ( x ) y i = 1 , . . ., m a r e s u c h t h a t c o n d i t i o n ( 9 . 8 ) i s
satisfied a n d w i t h a n y fixed z s y s t e m (9.1) h a s a u n i q u e s o l u t i o n
y = y ( z ) , t h e n , if tp ( z ) = / ( z , y , ( z )) i s a t w i c e c o n t i n u o u s l y d i f f e r ­
entiable strongly c o n v e x function, a n y m e t h o d of d u a l or c o nju gat e
d i r e c t i o n s ( f o r t h e m i n i m i z a t i o n o f cp ( z ) ) c o n v e r g e s t o t h e s o l u t i o n
a t a s u p e r l i n e a r r a t e . A t t h e s a m e t i m e if u s e i s m a d e o f t h e v a r i a n t s
of m e t h o d s w i t h r e s t o r a t i o n of m a t r i c e s A & 1 a n d H h after a finite
n u m b e r of steps (see t h e s u b s e c t i o n o n p. 104), t h e n t h e c o n v e r g e n c e
of m e t h o d s of d u a l a n d c o n j u g a t e directions will b e g u a r a n t e e d w i t h
t h e s a m e a s s u m p t i o n s a b o u t f u n c t i o n cp ( z ) a s i n g r a d i e n t m e t h o d s .
Consider, for example, the problem of the minimization of quadratic
function $f_0(x)$ with linear constraints: $g(x) = Ax + b = 0$,
where $A = (a_{ij})$ is an $m \times n$ matrix, $b = (b^1, \dots, b^m)$.
Let the determinant $|(a_{il})| \ne 0$, $i, l = 1, \dots, m$; then we can
take $y = (x^1, \dots, x^m)$, $z = (x^{m+1}, \dots, x^n)$.
Since $y(z)$ is a linear function, $\varphi(z)$ is a quadratic function of the
variable $z$ and it is strictly convex if the original function $f_0(x)$
is strictly convex. The application of any method of dual or conjugate
directions makes it possible to find the minimum of function
$\varphi(z)$ after $n - m$ steps.
If functions $f_0(x)$ and $f_i(x)$, $i = 1, \dots, m$ satisfy the requirements
of theorem 9.2 (the condition (9.8) in this case is replaced
by the weaker requirement (9.12)), then methods of dual or conjugate
directions can be used to minimize each of the functions $\varphi(z)$ that
are to be dealt with in solving the problem. In other words, in
algorithms of the (9.9) type, vector $p_k$ and parameter $\alpha_k$ can be
determined in the same way as in dual or conjugate directions methods,
and vectors $z_k$ and $y_k$ can be chosen as in studying method (9.13).
Algorithms thus constructed (implemented with restoration of matrix
$A_k^{-1}$ or $H_k$) converge under the same conditions as projection
methods of the first and second orders. At the same time, their
effectiveness is higher than that of the projection methods studied;
in particular, with a small increase in the amount of work per iteration
as compared to that in the method of the first order, a superlinear
rate of convergence can be attained.
In practice, it is precisely the algorithms of the type described in this
subsection that should be used for solving the problem being studied.
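The quadratic example with linear constraints discussed in this subsection can be illustrated numerically. The sketch below is not taken from the book: it assumes $f_0(x) = \frac{1}{2}x^{T}Qx + c^{T}x$ with a positive definite $Q$, and the names `reduced_quadratic` and `conjugate_gradients`, as well as the test data, are hypothetical. It eliminates $y = (x^1, \dots, x^m)$ from $Ax + b = 0$, minimizes the resulting quadratic $\varphi(z)$ by conjugate gradients in at most $n - m$ steps, and compares the result with the solution of the Lagrange system.

```python
import numpy as np

def reduced_quadratic(Q, c, A, b, m):
    """Eliminate y = x[:m] from A x + b = 0; return phi(z) = 0.5 z'Qr z + cr'z + const."""
    n = Q.shape[0]
    A_y, A_z = A[:, :m], A[:, m:]
    Y = -np.linalg.solve(A_y, A_z)          # y(z) = Y z + y0 (linear in z)
    y0 = -np.linalg.solve(A_y, b)
    T = np.vstack([Y, np.eye(n - m)])       # x(z) = T z + x0
    x0 = np.concatenate([y0, np.zeros(n - m)])
    Qr = T.T @ Q @ T                        # Hessian of phi
    cr = T.T @ (Q @ x0 + c)                 # linear term of phi
    return Qr, cr, T, x0

def conjugate_gradients(Qr, cr, z):
    """Minimize the reduced quadratic; takes at most len(z) steps."""
    g = Qr @ z + cr
    p = -g
    steps = 0
    for _ in range(len(z)):
        if np.linalg.norm(g) < 1e-12:
            break
        Qp = Qr @ p
        alpha = -(g @ p) / (p @ Qp)         # exact line search for a quadratic
        z = z + alpha * p
        g_new = g + alpha * Qp
        beta = (g_new @ g_new) / (g @ g)    # Fletcher-Reeves coefficient (one possible choice)
        p = -g_new + beta * p
        g = g_new
        steps += 1
    return z, steps

# Hypothetical test data: n = 4 variables, m = 2 linear constraints.
n, m = 4, 2
Q = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, 3.0, 0.0, 1.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 1.0, 0.0, 5.0]])
c = np.array([1.0, -2.0, 0.5, 3.0])
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0, 1.0]])
b = np.array([-1.0, 2.0])

Qr, cr, T, x0 = reduced_quadratic(Q, c, A, b, m)
z_min, steps = conjugate_gradients(Qr, cr, np.zeros(n - m))
x_cg = T @ z_min + x0

# Reference: Lagrange system  Q x + c + A' lam = 0,  A x + b = 0.
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
x_kkt = np.linalg.solve(K, np.concatenate([-c, -b]))[:n]
```

The elimination is done once, so each conjugate gradient step works only in the $(n - m)$-dimensional space of the variable $z$.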

Note also that the amount of work per iteration (method (9.20))
can be reduced if instead of matrix $f_0''(x_k)$ use is made of matrix
$D_k$ defined by the following system of equations:

$$D_k(x_{k-i} - x_{k-i-1}) = f_0'(x_{k-i}) - f_0'(x_{k-i-1}), \quad i = 0, 1, \dots, n-1$$

(analogue of system (3.6) of Chap. II which is used in constructing
methods of dual directions) and vector $p_k = -F_k^{-1}\varphi'(z_k)$ is constructed,
where $F_k = D_{kzz} + y'^{*}D_{kyy}y' + 2y'^{*}D_{kzy}$ and matrices
$D_{kzz}$, $D_{kyy}$, $D_{kzy}$ are parts of matrix $D_k$ and correspond to matrices
$f''_{0zz}$, $f''_{0yy}$, $f''_{0zy}$, respectively.

On the Solving of the General Problem
of Mathematical Programming
It is required to minimize function $f_0(x)$ with constraints
$$f_i(x) \le 0, \quad i = 1, \dots, m. \qquad (9.25)$$
Such constraints can be reduced to equality constraints in several
ways. For instance, if we introduce additional variables
$x^{n+1}, \dots, x^{n+m}$, then constraints (9.25) will hold with the same values
of variables $x^1, \dots, x^n$ which satisfy the equalities
$$(x^{n+i})^2 + f_i(x) = 0, \quad i = 1, \dots, m. \qquad (9.26)$$

Consequently, the minimum of function $f_0(x)$ with constraints
(9.25) will coincide with the minimum of $f_0(x)$ with constraints
(9.26). For the minimization of $f_0(x)$ with constraints (9.26), methods
of the first order described in the subsection on p. 249 can be used.
Method (9.20) with conditions (9.26) cannot be applied to the
minimization of $f_0(x)$, for in space $E^{n+m}$ function $f_0(x)$ is not
strictly convex; matrix $f_0''(x)$, as is easily ascertained, is singular
in $E^{n+m}$. Due to this fact, there is no point in using methods of dual
and conjugate directions in this case.
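A short worked check of the singularity claim, given as a sketch under the assumption that $f_0$ is twice differentiable and is regarded as a function of the extended vector $w = (x^1, \dots, x^n, x^{n+1}, \dots, x^{n+m})$ (the symbol $w$ is introduced here for illustration only):

```latex
\frac{\partial f_0}{\partial x^{n+i}} \equiv 0, \quad i = 1, \dots, m
\quad\Longrightarrow\quad
f_0''(w) =
\begin{pmatrix}
f_0''(x) & 0 \\
0 & 0
\end{pmatrix},
\qquad \det f_0''(w) = 0 .
```

Since $f_0$ does not depend on the additional variables, it is constant along each of the $m$ new coordinate directions, so it cannot be strictly convex in $E^{n+m}$ even when it is strictly convex in $E^n$.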

Conclusive Remarks
Of the class of projection methods with restoration of ties we have
discussed only those algorithms in which use is made of formulas
(9.9) to restore ties. In a number of problems this method of carrying
out the concluding (third) stage of iteration may prove inconvenient;
in this case it is worthwhile to carry out this stage in another way.
For instance, one can determine point $x_{k+1} \in S_g$ by making the quantity
$\|x_{k+1}(\alpha) - \hat{x}_k(\alpha)\|$ minimize the distance between point $\hat{x}_k(\alpha)$
and set $S_g$.
The amount of work per iteration in projection methods diminishes
when approaching the solution (or a stationary point of function
$f_0(x)$ on $S_g$) since the problem of determining point $x_{k+1}$ (using point
$x_k$) becomes simpler. So, for example, the amount of work involved
in solving system (9.1) with a fixed vector $z_{k+1}$ (i.e. in evaluating
function $y(z_{k+1})$) diminishes as we approach the solution of the problem
because point $y_k(z_{k+1})$ with increasing $k$ approximates the solution
$y(z_{k+1})$ better and better. In this sense, projection methods differ
advantageously from penalty function methods where we have to
solve problems of greater and greater complexity in order to obtain
a more precise approximation to the solution.

Bibliographic Notes
To Sec. 1. The method of solving problems of quadratic programming
expounded in this section is based on applying the method of conjugate gradients.
It is the simplest and most expedient method if the constraints on the variables
are simple. There are a great many other methods of solving the problem of
quadratic programming which converge either after a finite number or an infinite
number of steps. These methods are described by H. Künzi and W. Krelle,
S. I. Zukhovitskii and L. I. Avdeyeva, G. Zoutendijk [1], V. F. Dem'yanov and
V. N. Malozemov [2].
The problems of the effectiveness and accuracy of different algorithms are
analyzed by V. V. Ivanov [2], [3], V. V. Ivanov and V. E. Truten'.
To Sec. 2. In describing the method of feasible directions we followed mainly
the works by S. I. Zukhovitskii, R. A. Polyak and M. E. Primak [1], [2]. The
method described in Sec. 2 differs from the traditional one by the rule of choosing
the step length at each iteration. Several variants of the method of feasible
directions have been studied in detail and substantiated by G. Zoutendijk [1],
and also in papers by G. Zoutendijk [2], D. M. Topkis and A. Veinott.
To Sec. 3. The method of conditional gradient was for the first time
described by M. Frank and P. Wolfe. Later on it was studied by V. F. Dem'yanov
and A. M. Rubinov, E. S. Levitin and B. T. Polyak who gave the bounds on
the rate of convergence. The proof of the accuracy of these estimates is given
in a paper by M. D. Canon and C. D. Cullum.
The generalization of Newton's method of solving constrained problems was
carried out by E. S. Levitin and B. T. Polyak. Newton's method with step
adjustment with constraints was studied by Yu. M. Danilin [1], [2].
To Sec. 4. The cutting hyperplane method as described in this section follows
the work by J. E. Kelley. Different generalizations of this method and also
t h e b o u n d s o n i t s r a t e o f c o n v e r g e n c e (| / ( x h ) — / ( x * ) | < 1 C l n ) f o r a s t r o n g l y
convex function $f(x)$ are given in the paper by E. S. Levitin and B. T. Polyak.
To Secs. 5, 6. The description of the linearization method in this chapter follows
the work by B. N. Pshenichny [4] where he proves that the method is convergent.
The subtler results such as the finite convergence in linear programming,
the local estimate of the rate of convergence and also quadratic rates of convergence
in special cases, had not been published before. The same can be said of
the application of this method to the problem of finding the minimax. The
last problem was studied by V. F. Dem'yanov and V. N. Malozemov [1], [2]
who constructed a number of algorithms of descent for solving the problem.
Note also that the minimax problem can be solved by using the algorithm of
the generalized gradient descent and its variants developed by N. Z. Shor.


To Sec. 7. The methods of the acceleration of convergence described on
pp. 225-233 had not been published before; however, in their basic ideas they are
closely connected with the works by Yu. M. Danilin and B. N. Pshenichny [1],
[2]. Methods of solving systems of equations without using derivatives of the
left-hand sides were described in many works; of these we mention here only
the works by F. J. Zeleznik, J. Barnes, C. G. Broyden [1], V. E. Shamansky [1].
The application of Newton's method of solving systems of equations to those
arising in formulating the necessary conditions for an extremum is described
in the subsection on p. 233. The problems involved are treated in detail by
B. T. Polyak [3].
To Sec. 8. The literature on the very popular penalty function method is
extensive. The different properties of this method can be found in monographs
by J. Cea, A. V. Fiacco and G. P. McCormick, E. Polak [2], in papers by
W. I. Zangwill [2], E. S. Levitin and B. T. Polyak, A. V. Fiacco. The bounds
on the rate of convergence were studied by I. I. Eremin [1], D. Luenberger. The
method of centers closely connected with the penalty function method was
suggested by P. Huard. A new, little studied variant of the penalty function
method is given in a work by M. R. Hestenes [1].

...ang and J. C. Heideman projection methods with restoration of ties are described
in which use is made of the gradient method or the conjugate gradient method
for determining the direction of movement and at the conclusive stage the problem
of the minimization of the distance between the point and the tie is solved.
The projection algorithms considered in this section were described by
Yu. M. Danilin [3]. Finally, we mention here several of the methods that have
not been described in this book: the method of Fejér's approximations developed
by I. I. Eremin [2], [3], and also N. Z. Shor's method of the generalized gradient
[1]-[4]. Combined methods of finding extrema were studied by V. V. Ivanov [1].
Many interesting results on the convergence of minimization algorithms have
been obtained by V. G. Karmanov.
Of papers devoted to reviews of numerical methods we suggest those by
Yu. M. Ermol'ev, G. Zoutendijk [2], and also books by H. Künzi and
W. Oettli, F. P. Vasil'ev; the latter contains an extensive bibliography on the
problem under consideration.

APPENDIX

COMPUTATIONAL SCHEMES
OF THE MAIN ALGORITHMS

I. METHOD OF DUAL DIRECTIONS
(CHAP. II, SEC. 3)
This method is intended for the minimization of a convex function $f(x)$,
$x \in E^n$.
Iteration scheme.
Let $x_0$ be an arbitrary point, $r_0 \ne 0$, and $s_{0,0}, \dots, s_{0,-n+1}$ be an arbitrary linearly
independent vector system.
With $0 \le k \le n-1$, the iteration is as follows:
(1) Construct the point
$$x_{k+1} = x_k - \alpha_k f'(x_k) \qquad (1)$$
where $\alpha_k$ is chosen by any of the methods described in Chap. II, Sec. 1.
(2) Set:
$$r_{k+1} = x_{k+1} - x_k, \quad e_{k+1} = f'(x_{k+1}) - f'(x_k). \qquad (2)$$
(3) Compute $(s_{k,k-n+1}, e_{k+1})$.
If
$$|(s_{k,k-n+1}, e_{k+1})| \ge \gamma \|s_{k,k-n+1}\| \, \|e_{k+1}\| \qquad (3)$$
where $\gamma > 0$ is an arbitrarily small constant, go to step (5).
If
$$|(s_{k,k-n+1}, e_{k+1})| < \gamma \|s_{k,k-n+1}\| \, \|e_{k+1}\|, \qquad (4)$$
go to step (4).
(4) Set
$$r_{k+1} = \beta_{k+1} s_{k,k-n+1} \qquad (5)$$
where the quantity $\beta_{k+1} > 0$ is chosen such that the condition $\|r_{k+1}\| < \|r_k\|$
be fulfilled.
Compute the gradient $f'(x_k + r_{k+1})$ and then construct vector
$e_{k+1} = f'(x_k + r_{k+1}) - f'(x_k)$. Then go to step (5).
(5) Construct the vector system
$$s_{k+1,k+1} = \frac{s_{k,k-n+1}}{(s_{k,k-n+1}, e_{k+1})},$$
$$s_{k+1,k-j} = s_{k,k-j} - (s_{k,k-j}, e_{k+1}) \, s_{k+1,k+1}, \quad j = 0, 1, \dots, n-2. \qquad (6)$$

This is the end of the iteration.
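The update of the vector system in step (5) can be checked numerically. The sketch below is an illustration, not the book's program: it assumes that the system $s$ is meant to be biorthogonal to the last $n$ difference vectors $e$ (which is how the term "dual basis" is read here), and the function name `update_dual_system` and the data are hypothetical.

```python
import numpy as np

def update_dual_system(S, e_new):
    """One application of formulas (6).

    S holds the rows s_{k,k}, s_{k,k-1}, ..., s_{k,k-n+1} (newest first);
    the result holds s_{k+1,k+1}, s_{k+1,k}, ..., s_{k+1,k-n+2}.
    """
    s_old = S[-1]                                      # s_{k,k-n+1}
    s_new = s_old / (s_old @ e_new)                    # s_{k+1,k+1}
    rest = [s - (s @ e_new) * s_new for s in S[:-1]]   # s_{k+1,k-j}, j = 0..n-2
    return np.array([s_new] + rest)

# Hypothetical data: rows of E are e_k, e_{k-1}, e_{k-2}; S is dual to E.
E = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 3.0]])
S = np.linalg.inv(E).T                 # rows satisfy (s_i, e_j) = delta_ij
e_new = np.array([1.0, -1.0, 2.0])     # new difference vector e_{k+1}

S_next = update_dual_system(S, e_new)
E_next = np.vstack([e_new, E[:-1]])    # newest difference first, oldest dropped
```

Under this reading, test (3) guards the division by $(s_{k,k-n+1}, e_{k+1})$: when that scalar product is too small relative to the norms, step (4) replaces $r_{k+1}$ before the update is carried out.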
With $k \ge n$:
(1) Construct vector
$$p_k = -\sum_{i=0}^{n-1} (f'(x_k), s_{k,k-i}) \, r_{k-i}.$$

(2) Compute $(f'(x_k), p_k)$.
If $(f'(x_k), p_k) \ne 0$, construct point $x_{k+1}$ by one of the formulas
$x_{k+1} = x_k \pm \alpha_k p_k$ where $\alpha_k$ is chosen according to condition (2.2), Chap. II.
If $(f'(x_k), p_k) = 0$, construct point $x_{k+1}$ using the gradient method (see (1))
of the iteration with $k \le n-1$ (further, construct the iteration in the same way
as you did with $k \le n-1$ (steps (2)-(5))).
Remarks. 1. We have adduced only one of the possible computation schemes
of the methods of dual directions. Here the first iterations ($k \le n-1$) are
carried out by the gradient method. Since at the initial steps of the iterative
process the gradient method usually provides for a sufficiently steep decrease of
the function, such an initial stage of the process is expedient in solving many
problems.
2. We have changed here for the sake of convenience the notations of the
vectors of the dual basis (cf. (6), and (3.21) of Chap. II).
3. If vector $r_{k+1}$ is chosen in form (5) and function $f(x)$ is smooth and strongly
convex (i.e. conditions (2.4) of Chap. II hold true), then inequality (3) will
automatically hold, provided the constant $\gamma$ has been chosen sufficiently small.
Indeed, if function $f(x)$ satisfies the requirements formulated, then
$\|e_k\| \le M\|r_k\|$ and estimate (5.18) of Chap. II holds true. By virtue of this fact,
we have
$$(s_{k,k-n+1}, e_{k+1}) = \frac{1}{\beta_{k+1}}(r_{k+1}, e_{k+1}) \ge \frac{m}{\beta_{k+1}}\|r_{k+1}\|^2 \ge \frac{m}{M\beta_{k+1}}\|r_{k+1}\| \, \|e_{k+1}\| = \frac{m}{M}\|s_{k,k-n+1}\| \, \|e_{k+1}\|.$$

Thus if $\gamma \le \frac{m}{M}$, then inequality (3) is satisfied.
4. The practice of computations shows that quantity $\gamma$ may be chosen very
small: $\gamma = 10^{-6}$ to $10^{-15}$. If condition (3) is not satisfied even with vector
$r_{k+1}$ having been chosen in form (5), it means that matrix $f''(x)$ as $x \to x^*$
becomes ill conditioned, i.e. the minimized function is not strongly convex.
In particular, the surfaces of the levels of this function may have forms of a long,
deep and narrow valley. In this case, a very accurate approximation to the solution
with respect to the variable is not attainable. However, the computation
practice shows that one can obtain function values sufficiently close to the minimum
in minimizing even nonconvex functions with deep valley level surfaces.

II. CONJUGATE GRADIENTS METHOD
(CHAP. II, SEC. 4)
This method is intended for the minimization of convex function $f(x)$, $x \in E^n$.
Iteration scheme.
Let $x_0$ be an arbitrary point, $p_0 = -f'(x_0)$.
(1) If $0 \le k \le n-1$, go to (2).
If $k = n$, go to (5).
(2) Construct point
$$x_{k+1} = x_k + \alpha_k p_k$$
where factor $\alpha_k$ is determined under the condition
$$f(x_k + \alpha_k p_k) = \min_{\alpha > 0} f(x_k + \alpha p_k).$$
(3) Compute vector
$$p_{k+1} = -f'(x_{k+1}) + \beta_{k+1} p_k$$
where
$$\beta_{k+1} = -\frac{(f'(x_{k+1}), f'(x_{k+1}) - f'(x_k))}{(f'(x_k), p_k)}.$$
(4) Go to (1).
(5) Set $x_0 = x_n$, $p_0 = -f'(x_n)$ and repeat the process (go to (1)).
Remark. Coefficient $\beta_{k+1}$ can be determined by any of the formulas (4.73),
Chap. II.
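The scheme with restart after $n$ steps can be sketched as follows. This is an illustration under stated assumptions rather than the book's program: the function name `conjugate_gradients_restart`, the user-supplied `line_search` callable and the quadratic test function are hypothetical, and the coefficient $\beta_{k+1}$ is taken with the sign for which it reduces to $(f'(x_{k+1}), f'(x_{k+1}) - f'(x_k)) / \|f'(x_k)\|^2$ under an exact line search.

```python
import numpy as np

def conjugate_gradients_restart(grad, line_search, x0, n_cycles=50, tol=1e-10):
    """Conjugate gradients with a restart from the gradient direction every n steps."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(n_cycles):
        g = grad(x)
        p = -g                                   # p_0 = -f'(x_0), also used at each restart
        for _k in range(n):                      # steps (2)-(4) for 0 <= k <= n-1
            if np.linalg.norm(g) < tol:
                return x
            alpha = line_search(x, p)            # step (2): minimize along p
            x = x + alpha * p
            g_new = grad(x)
            beta = -(g_new @ (g_new - g)) / (g @ p)   # step (3)
            p = -g_new + beta * p
            g = g_new
        # step (5): the next cycle starts again from the current point
    return x

# Hypothetical test problem: f(x) = 0.5 x'Qx - b'x, for which the
# one-dimensional minimization in step (2) has a closed form.
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda x: Q @ x - b
exact_step = lambda x, p: -(grad(x) @ p) / (p @ Q @ p)

x_min = conjugate_gradients_restart(grad, exact_step, np.zeros(2))
```

For a non-quadratic function the closed-form step would have to be replaced by a numerical one-dimensional search; the restart in step (5) is what keeps the directions from degrading in that case.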

III. METHOD OF FEASIBLE DIRECTIONS
(CHAP. III, SEC. 2)
This method is intended for solving problems of convex programming: to
minimize function $f_0(x)$ with constraints
$$f_i(x) \le 0, \quad i = 1, \dots, m,$$
$$Ax - b = 0$$
where $x \in E^n$, $f_i(x)$, $i = 0, \dots, m$ are convex continuously differentiable
functions, $A$ is an $l \times n$ matrix, $b$ is an $l$-dimensional vector.
Notations:
$$I_\delta(x) = \{i : f_i(x) \ge -\delta, \ i = 1, \dots, m\},$$

$$\|p\| = \max_{1 \le j \le n} |p^j|$$
where $p \in E^n$, $p^j$ are components of vector $p$.
Initial data: $x_0$ is the initial approximation satisfying all the constraints;
$\delta_0 > 0$, $\xi_i > 0$, $i = 0, \dots, m$ are positive numbers which, speaking generally,
are arbitrary.
The common step of the algorithm.
Point $x_k$ and number $\delta_k > 0$ have been computed.
(1) Solve the problem of linear programming
$$\min \eta,$$
$$(f_i'(x_k), p) \le \xi_i \eta, \quad i \in I_{\delta_k}(x_k) \cup \{0\},$$
$$Ap = 0,$$
$$-1 \le p^j \le +1, \quad j = 1, \dots, n.$$
The solution is $\eta_k$, $p_k$.
(2) If $\eta_k < -\delta_k$, then
$$x_{k+1} = x_k + \alpha_k p_k, \quad \delta_{k+1} = \delta_k$$
where $\alpha_k = \frac{1}{2^{q_0}}$, and $q_0$ is the first integer of $q = 0, 1, \dots$ for which the following
inequalities hold:

f o ( x k + -^q p h ) < / o (^) + y

$$f_i\left(x_k + \frac{1}{2^q} p_k\right) \le 0, \quad i = 1, \dots, m.$$

(3) If $\eta_k \ge -\delta_k$, then $x_{k+1} = x_k$, $\delta_{k+1} = \frac{1}{2}\delta_k$.

(4) Return to (1).
Remark. The choice of numbers $\delta_0$, $\xi_i$ can influence the course of the process;
the choice should be made on the basis of an analysis of the problem under
consideration. The algorithm can be used also for nonconvex problems.

IV. LINEARIZATION METHOD
(CHAP. III, SEC. 5)
This method is intended for solving the problem: to minimize $f_0(x)$ with
constraints:
$$f_i(x) \le 0, \quad i = 1, \dots, m,$$
$$f_i(x) = 0, \quad i = m+1, \dots, m+l$$

where $f_i(x)$ are continuously differentiable functions.
Notations:
$$F(x) = \max\{0, f_1(x), \dots, f_m(x), |f_{m+1}(x)|, \dots, |f_{m+l}(x)|\},$$
$$I_\delta(x) = \{i : f_i(x) \ge F(x) - \delta, \ i = 1, \dots, m\},$$
$$I_\delta^0(x) = \{i : |f_i(x)| \ge F(x) - \delta, \ i = m+1, \dots, m+l\},$$
$$\Phi_N(x) = f_0(x) + N F(x),$$

$$\|p\|^2 = \sum_{j=1}^{n} (p^j)^2.$$
Initial data: the initial approximation $x_0$ is arbitrary; $N_0$ is sufficiently
great, $\delta_0 > 0$, $0 < \varepsilon < 1$.
The common step of the algorithm.
Point $x_k$ is constructed and numbers $N_k$ and $\delta_k$ are chosen.
(1) Solve the problem:

$$\min \ (f_0'(x_k), p) + \frac{1}{2}\|p\|^2,$$

$$(f_i'(x_k), p) + f_i(x_k) \le 0, \quad i \in I_{\delta_k}(x_k),$$

$$(f_i'(x_k), p) + f_i(x_k) = 0, \quad i \in I_{\delta_k}^0(x_k).$$

The solution is $p_k$. If the problem is incompatible, then set $x_{k+1} = x_k$,
$\delta_{k+1} = \frac{1}{2}\delta_k$, $N_{k+1} = N_k$ and return to (1).

(2) If the problem is consistent and $p_k$ is found, then set
$$x_{k+1} = x_k + \alpha_k p_k, \quad \delta_{k+1} = \delta_k$$
where $\alpha_k$ is chosen equal to $\frac{1}{2^{q_0}}$ and $q_0$ is the first of integers $q = 0, 1, \dots$
for which the relation

$$\Phi_{N_k}\left(x_k + \frac{1}{2^q} p_k\right) \le \Phi_{N_k}(x_k) - \frac{1}{2^q}\varepsilon\|p_k\|^2$$
holds.
(3) Let numbers $u_k^i$, $i \in I_{\delta_k}(x_k) \cup I_{\delta_k}^0(x_k)$ be the Lagrange multipliers of the
subsidiary problem that was solved at the first stage. If now

$$N_k > \sum_{i \in I_{\delta_k}(x_k)} u_k^i + \sum_{i \in I_{\delta_k}^0(x_k)} |u_k^i|,$$
then $N_{k+1} = N_k$.
Otherwise

$$N_{k+1} = 2\left(\sum_{i \in I_{\delta_k}(x_k)} u_k^i + \sum_{i \in I_{\delta_k}^0(x_k)} |u_k^i|\right).$$
(4) Return to (1).
Remark. Numbers $\delta_k$ and $N_k$ cease to change from a certain step on. The
algorithm requires an effectively working standard program for solving the
problem of quadratic programming.
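Steps (2) and (3) can be sketched without a quadratic programming solver if the direction $p_k$ and the multipliers are supplied from outside. The code below is an illustration only: it assumes the merit function $\Phi_N(x) = f_0(x) + N F(x)$, the step-halving test $\Phi_N(x_k + 2^{-q}p_k) \le \Phi_N(x_k) - 2^{-q}\varepsilon\|p_k\|^2$ and the doubling rule for $N$; the names `merit`, `halving_step` and `update_penalty` are hypothetical, and the direction and multiplier in the example were worked out by hand for a tiny problem.

```python
import numpy as np

def merit(f0, ineq, eq, N):
    """Phi_N(x) = f0(x) + N * F(x), F = max{0, f_i(x), |f_j(x)|}."""
    def phi(x):
        F = max([0.0] + [f(x) for f in ineq] + [abs(f(x)) for f in eq])
        return f0(x) + N * F
    return phi

def halving_step(phi, x, p, eps, max_halvings=60):
    """Return alpha = 2**(-q0), q0 being the first q that passes the decrease test."""
    phi_x = phi(x)
    pp = float(p @ p)
    for q in range(max_halvings + 1):
        alpha = 0.5 ** q
        if phi(x + alpha * p) <= phi_x - alpha * eps * pp:
            return alpha
    raise RuntimeError("no acceptable step found")

def update_penalty(N, u_ineq, u_eq):
    """Keep N if it exceeds the multiplier sum, otherwise set it to twice that sum."""
    total = sum(u_ineq) + sum(abs(u) for u in u_eq)
    return N if N > total else 2.0 * total

# Hypothetical example: minimize x1^2 + x2^2 subject to 1 - x1 - x2 <= 0.
f0 = lambda x: float(x @ x)
f1 = lambda x: 1.0 - x[0] - x[1]
x_k = np.array([0.0, 0.0])
# At x_k the subsidiary problem is min 0.5*||p||^2 s.t. 1 - p1 - p2 <= 0,
# whose solution (found by hand) is p = (0.5, 0.5) with multiplier u = 0.5.
p_k = np.array([0.5, 0.5])
u_k = [0.5]

phi = merit(f0, [f1], [], N=10.0)
alpha_k = halving_step(phi, x_k, p_k, eps=0.5)
N_same = update_penalty(10.0, u_k, [])
N_raised = update_penalty(0.2, u_k, [])
```

In this example the full step $\alpha_k = 1$ is accepted and leads directly to the constrained minimum $(0.5, 0.5)$; a penalty coefficient smaller than the multiplier sum is raised to twice that sum.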

V. ALGORITHM FOR SOLVING A SYSTEM
OF EQUATIONS WITHOUT CALCULATING
DERIVATIVES
(CHAP. III, SEC. 6)
This algorithm is intended for solving the system of equations
$$P(x) = 0$$
where $x \in E^n$, $P(x)$ is an $n$-dimensional vector-function whose components
$P_i(x)$, $i = 1, \dots, n$ are differentiable.
Initial data: initial approximations $x_1, \dots, x_n$ are arbitrarily chosen in
a sufficiently small region about the solution. In a particular case all $x_k$,
$k = 1, \dots, n$ can coincide.
Notations: $\|P(x)\|^2 = \sum_{i=1}^{n} (P_i(x))^2$; $\varphi(k)$ is equal to $1, 2, \dots, n-1$ if $k$
when divided by $n$ leaves a remainder $1, 2, \dots, n-1$ respectively, $\varphi(k) = n$
if it divides by $n$.
The common step of the algorithm. $x_1, \dots, x_k$ have been constructed.
(1) Solve for unknowns $\beta_i$, $i = 1, \dots, n$ the system of equations

$$\sum_{i=1}^{n} z_{k-n+i} \beta_i = -P(x_k)$$

where
$$z_j = \frac{1}{\|P(x_j)\|}\left[P\left(x_j + \|P(x_j)\| e_{\varphi(j)}\right) - P(x_j)\right],$$

$e_i$ is a vector with zero components, except the $i$-th one which is equal to 1.
(2) Set
$$x_{k+1} = x_k + \sum_{i=1}^{n} \beta_i e_{\varphi(k-n+i)}.$$

(3) Return to (1).
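A minimal sketch of this scheme follows. It rests on an assumed reading of the printed formulas, namely that the columns are $z_j = [P(x_j + \|P(x_j)\| e_{\varphi(j)}) - P(x_j)] / \|P(x_j)\|$, that step (1) solves $\sum_i z_{k-n+i}\beta_i = -P(x_k)$, and that all initial points coincide; the function names and the linear test system are hypothetical.

```python
import numpy as np

def phi(k, n):
    """Index map from the Notations: remainder of k modulo n, with n in place of 0."""
    r = k % n
    return r if r != 0 else n

def difference_column(P, x, j):
    """Difference quotient of P at x along coordinate phi(j), with step ||P(x)||."""
    n = x.size
    Px = P(x)
    h = np.linalg.norm(Px)            # assumed nonzero when this is called
    e = np.zeros(n)
    e[phi(j, n) - 1] = 1.0
    return (P(x + h * e) - Px) / h

def solve_without_derivatives(P, x_start, max_steps=50, tol=1e-12):
    x = np.asarray(x_start, dtype=float)
    n = x.size
    if np.linalg.norm(P(x)) <= tol:
        return x
    # Particular case mentioned in the text: x_1 = ... = x_n coincide.
    z = {j: difference_column(P, x, j) for j in range(1, n + 1)}
    k = n
    for _ in range(max_steps):
        Z = np.column_stack([z[k - n + i] for i in range(1, n + 1)])
        beta = np.linalg.solve(Z, -P(x))              # step (1)
        step = np.zeros(n)
        for i in range(1, n + 1):                     # step (2)
            step[phi(k - n + i, n) - 1] += beta[i - 1]
        x = x + step
        k += 1
        if np.linalg.norm(P(x)) <= tol:
            break
        z[k] = difference_column(P, x, k)             # only one new column per step
    return x

# Hypothetical linear test system A x = b, for which the difference
# quotients are exact and a single step reaches the solution.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([5.0, 5.0])
x_sol = solve_without_derivatives(lambda x: A @ x - b, np.zeros(2))
```

Each step needs one new difference quotient rather than $n$ of them, which is the saving over recomputing a full finite-difference matrix; for nonlinear systems the behaviour depends on the starting region, as the text notes.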

LITERATURE

Altman, M. “Generalized gradient methods of minimizing a functional”, Bull.
    Acad. Polon. Sci. 14, No. 6 (1966), 313-318.
Arrow, K., Hurwicz, L., and Uzawa, H. (eds.) Studies in Linear and Nonlinear
    Programming, Stanford, Calif., 1958.
Auslender, A. “Méthodes et théorèmes de dualité”, Comptes Rendus, Académie
    des Sciences, Paris, A 267 (1968), 114-117.
Balakrishnan, A. V., and Neustadt, L. W. (eds.) Computing Methods in
    Optimization Problems, New York, 1964.
Barnes, J. “An algorithm for solving nonlinear equations based on the secant
    method”, Computer J. 8 (1965), 66-72.
Bershchansky, Ya. M. “A method for solving problems of linear and convex
    programming”, Zh. Vychisl. Mat. i Mat. Fiz. 10, No. 3 (1970), 621-629
    (in Russian).
Bourbaki, N. Éléments de mathématique. Livre 5. Espaces vectoriels topologiques,
    Paris, 1953.
Brent, R. P. “Algorithms for finding zeros and extrema of functions without
    calculating derivatives”, STAN-CS, 71-198, Feb. 1971.
Brown, K. M., and Dennis, J. E. “On Newton-like iteration functions: general
    convergence theorems and a specific algorithm”, Num. Math. 12 (1968).
Broyden, C. G. 1. “A class of methods for solving nonlinear simultaneous
    equations”, Math. Comp. 19 (1965), 577-593.
    2. “Quasi-Newton methods and their application to function
    minimization”, Math. Comp. 21 (1967), 368-381.
    3. “The convergence of single-rank quasi-Newton methods”, Math.
    Comp. 24 (1970).
Budak, B. M., and Goldman, N. R. “On the application of the Newton method
    for solving nonlinear boundary-value problems” in Vychislitel'nye metody
    i programmirovanie (Computational Methods and Programming), Issue VI,
    Moscow University Press, 1967, pp. 17-33 (in Russian).
Canon, M. D., and Cullum, C. D. “A tight upper bound on the rate of
    convergence of the Frank-Wolfe algorithm”, SIAM J. Control 6 (1968), 509-516.
C a n o n , M . D . , a n d E a t o n , J. H . “ A n e w a l g o r i t h m f o r a c la ss o f q u a d r a t i c p r o -
• i i » , i l ■ _x O T A X / T f * _ _ _ i . . » /

D a n i e l , J. W . 1. T h e A p p r o x i m a t e M i n i m i z a t i o n o f F u n c t i o n a l s , E n g l e w o o d
Cliff, N . J., 1 9 7 1 .
2. “ C o n v e r g e n c e o f t h e c o n j u g a t e g r a d i e n t m e t h o d w i t h c o m p u t a ­
tionally efficient m o d i f i c a t i o n s ” , N u m . M a t h . 10, N o . 2 (1967), 1 2 5 - 1 3 1 .
3. “ T h e c o n j u g a t e g r a d i e n t m e t h o d f o r l i n e a r a n d n o n l i n e a r o p e r a t o r
e q u a t i o n s ” , S I A M J . N u m . A n a l . 4 (1967), 10-28.
Danilin Yu. M. 1. “On an approach to minimization problems”, Doklady
Akad. Nauk SSSR 188, No. 6 (1969), 1221 [Engl. trans.: Soviet Math.
Doklady 10 (1969), 1274].
2. “Minimization methods based on approximation of the initial
functional by a convex one”, Zh. Vychisl. Mat. i Mat. Fiz. 10, No. 5 (1970),
1067-1080 (in Russian).
3. “On function minimization in problems with equality constraints”,
Kibernetika, No. 2 (1971), 88-95 (in Russian).
4. “Methods of conjugate directions for solving minimization problems”,
Kibernetika, No. 5 (1971), 122-136 (in Russian).
Danilin Yu. M. and Pshenichny B. N. 1. “On minimization methods with
accelerated convergence”, Zh. Vychisl. Mat. i Mat. Fiz. 10, No. 6 (1970),
1341-1354 (in Russian).
2. “Minimization method without calculating derivatives”, Zh. Vychisl.
Mat. i Mat. Fiz. 11, No. 1 (1971), 12-21.
Dantzig G. B. 1. Linear Programming and Extensions, Princeton, 1963.
2. “Linear control processes and mathematical programming”, SIAM J.
Contr. 4, No. 1 (1966), 56-60.
Davidon W. C. 1. “Variable metric methods for minimization”, AEC Research and
Development, Rept. ANL 5990 (Rev.), 1959.
2. “Variance algorithm for minimization”, Comp. J. 10, No. 4 (1968),
406-410.
Dem’yanov V. F. and Malozemov V. N. 1. “Contribution to the theory of
nonlinear minimax problems”, Uspekhi Mat. Nauk 26, No. 3 (1971), 53-104
(in Russian).
2. Introduction to Minimax, New York, 1974.
Dem’yanov V. F. and Rubinov A. M. Approximate Methods of Optimization
Problems, New York, 1970.
Dennis I. E. “On Newton-like methods”, Numer. Math. Band 11, Heft 4 (1968).
Dubovitsky A. Ya. and Milyutin A. A. “Extremal problems with constraints”,
Zh. Vychisl. Mat. 5, No. 3 (1965), 395-453 (in Russian).
Dunford N. and Schwartz J. T. Linear Operators. Part 1: General Theory. New
York, 1962.
Eremin I. I. 1. “Penalty method in convex programming”, Kibernetika, No. 4
(1967), 63-67 (in Russian).
2. “Method of Fejer approximations in convex programming”, Matem.
Zametki 3 (1968), 217-234 (in Russian).
3. “Rate of convergence of the method of Fejer approximations”, ibid. 4
(1968), 53-61 (in Russian).
Ermol’ev Yu. M. “Methods for solving nonlinear extremal problems”,
Kibernetika, No. 4 (1966), 1-17 (in Russian).
Faddeev D. K. and Faddeeva V. N. Computational Methods of Linear Algebra,
San Francisco, 1963.
Fiacco A. V. “Penalty method for mathematical programming in E^n with
general constraint sets”, J. Opt. Theory and Appl. 6, No. 3 (1970), 252-268.
Fiacco A. V. and McCormick G. P. Nonlinear Programming: Sequential
Unconstrained Minimization Techniques, New York, 1968.
Fikhtenhol’ts G. M. Course of Differential and Integral Calculus, Moscow, 1959
(in Russian).
Fletcher R. 1. “A new approach to variable metric algorithms”, Comput. J. 13,
No. 3 (1970), 317-322.
2. “Function minimization without evaluating derivatives. A review”,
Comput. J. 8, No. 1 (1965), 33-41.
Fletcher R. and Powell M. J. D. “A rapidly convergent descent method for
minimization”, Comput. J. 6, No. 2 (1963), 163-168.
Fletcher R. and Reeves C. M. “Function minimization by conjugate gradients”,
Comput. J. 7, No. 2 (1964), 149-154.
Frank M. and Wolfe P. “An algorithm for quadratic programming”, Nav. Res.
Log. Quart. 3 (1956), 95-110.

Fridman V. M. “On the convergence of methods of steepest descent type”, Uspekhi
Mat. Nauk 17, No. 3 (1962) (in Russian).
Gale D. The Theory of Linear Economic Models, New York, 1960.
Gilbert E. G. “An iterative procedure for computing the minimum of a quadratic
form on a convex set”, SIAM J. Contr. 4, No. 1 (1966), 61-80.
Glasman I. M. “Relaxation methods”, in Trudy pervoi zimnei shkoly po matem.
progr. (Drogobych, 1968), v. I, Moscow, 1969 (in Russian).
Goldfarb D. “A family of variable-metric methods derived by variational means”,
Math. Comput. 24, No. 109 (1970), 23-26.
Goldstein A. A. 1. “On steepest descent”, SIAM J. Contr. 3 (1965), 147-151.
2. “Minimizing functionals on normed linear spaces”, SIAM J. Contr.
4, No. 1 (1966), 81-89.
Goldstein A. A. and Price J. F. “An effective algorithm for minimization”, Num.
Math. 10 (1967), 184-189.
Golstein E. G. 1. “Dual problems of convex and partly-convex programming in
functional spaces” in Issledovaniya po matematicheskomu programmirovaniyu,
Moscow, 1968 (in Russian).
2. Convex programming: Elementary theory, Moscow, 1970 (in Russian).
Greenstadt J. “Variations on variable-metric methods”, Math. Comput. 24,
No. 109 (1970), 1-22.
Hayes R. M. “Iterative methods of solving linear problems in Hilbert space”
in “Contributions to the solution of systems of linear equations and
determination of eigenvalues”, Nat. Bur. Stand. Appl. Math. 39 (1954), 71-104.
Halkin H. and Neustadt L. W. “General necessary conditions for optimization
problems”, Proc. Nat. Acad. Sci. 56 (1956), 1066-1071.
Hestenes M. R. 1. “Multiplier and gradient methods”, J. Optim. Theory Applic.
4, No. 5 (1969).
2. Calculus of Variations and Optimal Control Theory, New York, 1966.
3. “The conjugate gradient method for solving linear systems”, Proc.
Symp. Appl. Math. 6 (1956), 86-102.
Hestenes M. R. and Stiefel E. “Methods of conjugate gradients for solving linear
equations”, J. Res. Nat. Bur. Stand. 49, No. 6 (1952), 409-436.
Horwitz L. B. and Sarachik P. E. “Davidon’s method in Hilbert space”, SIAM
J. Appl. Math. 16, 4 (1968).
Huang H. Y. “Unified approach to quadratically convergent algorithm for
function minimization”, J. Opt. Theory Appl. 5, No. 6 (1970).
Huang N. Y. and Levy A. V. “Numerical experiments on quadratically
convergent algorithms for function minimization”, J. Opt. Theory Appl. 6, No. 3
(1970), 269-282.
Huard P. “Resolution of mathematical programming with nonlinear constraints
by the method of centers” in Nonlinear Programming, J. Abadie (ed.),
Amsterdam, 1967, pp. 208-219.
Ivanov V. V. 1. Theory of Approximation Methods and Its Application to the
Numerical Solution of Singular Integral Equations, Kiev, 1968 (in Russian).
2. On the Accuracy and Effectiveness of Computational Algorithms,
Kiev, 1969 (in Russian).
3. “On optimal algorithms of minimizing functions of certain classes”,
Kibernetika, No. 4 (1972), 81-94 (in Russian).
Ivanov V. V. and Truten’ V. E. “Analysis of the accuracy of quadratic
programs”, Kibernetika, No. 4 (1969), 94-105 (in Russian).
Isaev V. K. and Sonin V. V. “On a modification of the Newton method for
numerical solving of boundary value problems”, Zh. Vychisl. Mat. i Mat.
Fiz. 3, No. 6 (1963), 1114-1116 (in Russian).
Kantorovich L. V. 1. “On the method of steepest descent”, Dokl. Akad. Nauk
SSSR 56 (1947), 233-236 (in Russian).
2. “On the Newton method for functional equations”, ibid. 60, No. 7
(1948) (in Russian).

Kantorovich L. V. and Akilov G. P. Functional Analysis in Normed Spaces,
New York, 1964.
Karlin S. Mathematical Methods and Theory in Games, Programming and
Economics, Reading, Mass., 1962.
Karmanov V. G. “Estimates of rate of convergence of iterative methods of
minimization”, Zh. Vychisl. Mat. i Mat. Fiz. 14, No. 1 (1974) (in Russian).
Kelley H. J. “The gradient method” in Optimization techniques, New York,
1962.
Kelley J. E. “The cutting plane method for solving convex programs”, J. Soc.
Ind. Appl. Math. 8, No. 4 (1960), 703-712.
Kolmogorov A. N. and Fomin S. V. Introductory Real Analysis, Englewood Cliffs,
N. J., 1970.
Künzi H. and Krelle W. Nichtlineare Programmierung, Berlin, 1962.
Künzi H. and Oettli W. Nichtlineare Optimierung: neuere Verfahren, Bibliographie,
Berlin-Heidelberg-New York, 1969.
Laenberger D. “Convergence rate of penalty-function scheme”, J. Opt. Theory
and Appl. 7, No. 1 (1971), 39-51.
Lavrov S. S. “Application of barycentric coordinates for solving some numerical
problems”, Zh. Vychisl. Mat. i Mat. Fiz. 4, No. 5 (1964), 905-911 (in Russian).
Levitin E. S. and Polyak B. T. “Constrained minimization methods”, Zh. Vychisl.
Mat. i Mat. Fiz. 6, No. 5 (1966), 787-823 (in Russian).
Lusternik L. A. and Sobolev V. I. Elements of Functional Analysis, New York,
1974.
Lyubich Yu. I. “Steepest descent”, Trudy vtoroi zimnei shkoly po matem. progr.
i smezhn. vopr. Issue 1, Moscow, 1969, 113-151 (in Russian).
Lyubich Yu. I. and Maistrovsky G. D. “General theory of relaxation processes
for convex functionals”, Uspekhi Mat. Nauk 25, Issue 1, 1970 (in Russian).
Maistrovsky G. D. 1. “On the convergence of the conjugate gradient method”,
Zh. Vychisl. Mat. i Mat. Fiz. 11, No. 5 (1971), 1291-1294 (in Russian).
2. “Proof of quadratic convergence of the conjugate gradient method”,
Vychisl. Mat. i Vychisl. Tekhn. Fiz.-techn. inst. nizkikh temperatur,
Kharkov, Issue 2 (1971), 3-5 (in Russian).
McCormick G. P. and Pearson J. D. “Variable metric methods and unconstrained
optimization”, Confer. on Optimiz., Keele Hall, England, March 1968.
Miele A., Huang H. Y. and Heideman J. C. “Sequential gradient-restoration
algorithm for the minimization of constrained function - ordinary and
conjugate gradient versions”, J. Opt. Theory Appl. 4, No. 4 (1969).
Moiseev N. N. (ed.) Numerical Methods in the Theory of Optimal Systems, Moscow,
1971 (in Russian).
Murtagh B. A. and Sargent R. W. H. “Computational experience with
quadratically convergent minimization methods”, Comput. J. 13, No. 2 (1970),
185-194.
Neustadt L. W. “An abstract variational theory with applications to broad class
of optimization problems. I. General theory”, SIAM J. Contr. 4, No. 3
(1966), 505-527.
Oblomskaya L. Ya. “A comparison of the rate of convergence of the conjugate
gradient method and of the gradient method for quadratic functionals”, in
Voprosy tochnosti i effectivnosti vychislitelnykh algoritmov (Problems of
Accuracy and Effectiveness of Computation Algorithms), 4 (1968), Kiev,
94-103 (in Russian).
Ostrowski A. M. Solution of Equations and Systems of Equations, 2nd ed., New
York, 1966.
Pearson J. D. “Variable metric methods of minimization”, Comput. J. 12,
No. 2 (1969), 171-178.
Polak E. 1. “On primal and dual methods of solving discrete optimal control
problems” in Computing Methods in Optimization Problems - 2, New York,
1969.
2. Computational Methods in Optimization: a Unified Approach, New
York, 1971.
Polyak B. T. 1. “Gradient methods for functional minimization”, Zh. Vychisl.
Mat. i Mat. Fiz. 3, No. 4 (1963), 643-654 (in Russian).
2. “Method of conjugate gradients”, Trudy vtoroi zimnei shkoly po matem.
program. i smezhn. vopr. Issue 1, Moscow, 1969, 152-201 (in Russian).
3. “Iterative methods using Lagrange multipliers for solving extremal
problems with equality constraints”, Zh. Vychisl. Mat. i Mat. Fiz. 10,
No. 5 (1970), 1098-1106 (in Russian).
4. “On the rate of convergence of the penalty function method”, ibid.
11, No. 1 (1971), 3-11 (in Russian).
Powell M. J. D. 1. “A survey of numerical methods for unconstrained
optimization”, SIAM Rev. 12, No. 1 (1970), 79-97.
2. “An efficient method for finding the minimum of a function of several
variables without calculating derivatives”, Comput. J. 7, No. 2 (1964),
155-162.
3. “On the Convergence of the Variable Metric Algorithm”, Mathematics
Branch, Atomic Energy Research Establishment, Harwell, Berkshire,
England, October 1969 (mimeo).
Pshenichny B. N. 1. Necessary Conditions for an Extremum, New York, 1974.
2. “The duality principle in problems of convex programming”, Zh.
Vychisl. Mat. i Mat. Fiz. 5, No. 1 (1965), 98-106 (in Russian).
3. “On a descent algorithm”, ibid. 8, No. 3 (1968), 649-652 (in Russian).
4. “Algorithms for the general problem of mathematical programming”,
Kibernetika, No. 5 (1970), 120-125 (in Russian).
5. “On the acceleration of convergence of algorithms for solving optimal
control problems” in Computing Methods in Optimization Problems,
New York, 1969.
Pshenichny B. N. and Ganzhela I. F. “An algorithm for solving the problem of
convex programming with linear constraints”, Kibernetika, No. 3 (1970),
81-85 (in Russian).
Rockafellar R. T. Convex Analysis, Princeton, N. J., 1970.
Rosen J. B. “The gradient projection method for nonlinear programming.
Part I: Linear constraints. Part II: Nonlinear constraints”, SIAM J. Appl.
Math. 8 (1960), pp. 181-217; 9 (1961), 514-532.
Shamansky V. E. 1. Methods of Numerical Solving of Boundary Value Problems
on Computers. Part II, Kiev, 1966 (in Russian).
2. “On some computation schemes of iterative processes”, Uspekhi
Mat. Nauk, 14, No. 1 (1962), 100-109 (in Russian).
Shor N. Z. 1. “Generalized gradient descent”, Trudy pervoi zimnei shkoly po mat.
progr. Drogobych, Moscow, 1969, 578-585 (in Russian).
2. “On the rate of convergence of the generalized gradient descent”,
Kibernetika, No. 3 (1968), 98-99 (in Russian).
3. “Using the operation of space stretching in problems of convex
function minimization”, Kibernetika, No. 1 (1970), 6-12 (in Russian).
4. “On the rate of convergence of the generalized gradient method
with space stretching”, Kibernetika, No. 2 (1970), 80-85 (in Russian).
Smith C. S. “The automatic computation of maximum likelihood estimates”,
N.C.B. Sci. Dept. Report SC 846 (MR) 40.
Smolyak S. A. “Quadratic rate of convergence of the conjugate gradients method”,
Trudy tret’ei zimnei shkoly po mat. progr. Moscow Building Institute,
Moscow, 1970.
Sorensen H. W. “Comparison of some conjugate direction procedures for function
minimization”, J. Franklin Institute, 288, 421 (1969).
Tikhonov A. N. 1. “Regularization of incorrectly posed problems”, Dokl. Akad.
Nauk SSSR, 153 (1963), 49-52 [Engl. trans.: Soviet Math. Doklady 4 (1963)].
2. “On the stability of algorithms for solving degenerate systems of
linear algebraic equations”, Zh. Vychisl. Mat. i Mat. Fiz. 5, No. 4 (1965)
(in Russian).
Tokumaru H., Adachi N. and Goto K. “Davidon’s method for minimization
problems in Hilbert space with an application to control problems”,
SIAM J. Contr. 8, No. 2 (1970).
Topkis D. M. and Veinott A., Jr. “On the convergence of some feasible
directions algorithms for nonlinear programming”, SIAM J. Contr. 5, No. 2
(1967), 268-279.
Vainberg M. M. 1. Variational Methods for the Study of Nonlinear Operators,
San Francisco, 1964.
2. Variational Method and Method of Monotone Operators in the Theory
of Nonlinear Equations, New York, 1973.
Vasil’ev F. P. Lectures on the Methods for Solving Extremal Problems, Moscow
State University, 1974 (in Russian).
Warga J. “A convergent procedure for convex programming”, J. Soc. Ind. and
Appl. Math. 11, No. 3 (1963), 579-587.
Yakovlev M. N. “On some methods of solving nonlinear equations”, Trudy Matem.
Inst. AN SSSR 84 (1965), 8-40 (in Russian).
Zangwill W. I. 1. “Minimizing a function without calculating derivatives”,
Comput. J. 10, No. 3 (1967), 293-296.
2. “Nonlinear programming via penalty functions”, Management
Science 13, No. 5 (1967), 344-368.
Zeleznik F. J. “Quasi-Newton methods for nonlinear equations”, J. Assoc.
Comput. Mach. 15, No. 2 (1968), 265-271.
Zoutendijk G. 1. Methods of Feasible Directions, Amsterdam, 1960.
2. “Nonlinear programming: A numerical survey”, SIAM J. Contr.
4, No. 1 (1966), 194-210.
Zukhovitskii S. I. and Avdeyeva L. I. Linear and Convex Programming,
Philadelphia, Pa., 1966.
Zukhovitskii S. I., Polyak R. A. and Primak M. E. 1. “An algorithm for the
solution of the problem of convex Chebyshev approximation”, Doklady
Akad. Nauk SSSR 151, No. 1 (1963), 27-30 [Engl. trans.: Soviet Math.
Doklady 4 (1963), 901].
2. “An algorithm for the solution of the convex programming problem”,
Doklady Akad. Nauk SSSR 153, No. 5 (1963), 991-994 [Engl. trans.:
Soviet Math. Doklady 4 (1963), 1754].

INDEX

A
Abadie, J. 267
acceleration of convergence 224ff
    algorithm for 230
    and mathematical programming 232
Adachi, N. 270
admissible direction 33
admissible domain 32
Akilov, G. P. 268
algorithm(s)
    computational schemes of 259ff
    for conjugate directions method 93, 111, 126, 129, 260
    for cutting hyperplane method 185
    for dual directions method 129, 259
    effectiveness of 60, 66
    for feasible directions method 261
    for linearization method 262
    for minimizing functions 139
    for quadratic programming 151, 158, 160
Altman, M. 145, 265
Arrow, K. 43, 265
Auslender, A. 265
Avdeyeva, L. I. 42, 43, 257, 270

B
Balakrishnan, A. V. 265
Barnes, J. 258, 265
Bershchansky, Ya. M. 265
biorthogonalization 125
boundary manifold 38
Bourbaki, N. 23, 265
Brent, R. P. 145, 265
Brown, K. M. 265
Broyden, C. G. 128, 258, 265
Budak, B. M. 265

C
Canon, M. D. 176, 257, 265
Cauchy, A. L. 145, 265
Cea, J. 258, 265
closure of convex set 14
Collatz, L. 145, 265
concave function 23
conditional gradient method 170ff
    and step adjustment 177
cone
    conjugate 14
    convex 14
    polyhedral 15
conjugate directions 82
    and dual directions 143
conjugate directions methods 138, 143, 145, 255
conjugate gradients method 98, 145-147, 152, 153, 156, 161
    algorithm for 260
conjugate vector 82, 138, 141, 144
    algorithms for construction 87
constrained function minimization 146ff
constraints
    linear 146
    simple 160
convergence
    acceleration of 224ff
    for conditional gradient algorithms 172, 176
    of conjugate directions methods 104, 119, 125, 144
    of dual directions methods 144
    rates of 11
convex cone 14
convex functions 17ff
    in finite-dimensional space 42

    in functional spaces 42
convex programming 24ff
    and feasible directions 162
    Fiacco and McCormick method 243
    in finite-dimensional spaces 42
    and the penalty function method 240, 242
    quality of 43
convex quadratic function
    minimization of 99
convex sets 12ff
    in finite-dimensional space 42
    in functional spaces 42
Cullum, C. D. 176, 257, 265
cutting hyperplane method 184ff
    algorithm for 185
    computational aspects of 187
    for convex programming 185
    and the dual problem 187
    and linear programming 187
    rate of convergence of 188

D
Daniel, J. W. 145, 265
Danilin, Yu. M. 145, 257, 258, 265, 266
Dantzig, G. B. 43, 266
Davidon, W. C. 128, 145, 266
Dem’yanov, V. F. 266
Dennis, I. E. 266
Dennis, J. E. 265
direction feasibility method 162ff
direction of motion 132
dual directions
    and conjugate directions 143
dual directions method(s) 130, 136, 143, 255
    algorithm for 259
dual problem
    in convex programming 28, 29
    and the cutting hyperplane method 187
    in linear programming 30
    and the minimization problem 29
    and projection operators 160
    in quadratic programming 194, 195
dual systems of vectors 77
Dubovitsky, A. Ya. 43, 266
Dunford, N. 42, 266

E
Eaton, J. H. 265
effectiveness
    of conjugate directions methods 128
effectiveness of algorithm 60
    and the number of iterations 66
effectiveness of methods
    and rate of convergence 66
Eremin, I. I. 266
Ermol’ev, Yu. M. 258, 266
extrema
    necessary conditions for 42, 43

F
Faddeev, D. K. 76, 81, 94, 145, 159, 266
Faddeeva, V. N. 76, 81, 94, 145, 159, 266
feasible directions 162
    choice of 162
    and convex programming 162
feasible directions method 162ff
    algorithm for 261
feasible point 149
Fiacco, A. V. 258, 266
Fiacco and McCormick method 243
Fikhtenhol’ts, G. M. 43, 266
finite differences
    and gradient 129, 137
first order methods 248, 249
    rate of convergence of 251
Fletcher, R. 128, 145, 266
Fomin, S. V. 42, 268
Frank, M. 257, 266
Fridman, V. N. 267
function minimization 137, 139
    constrained 146ff
    unconstrained 44ff

G
Gale, D. 43, 267
Ganzhela, I. F. 269
generalized Newton’s method 58ff

geometrical-progression rate of convergence 11
Gilbert, E. G. 267
Glasman, I. M. 267
Goldfarb, D. 267
Goldman, N. R. 265
Goldstein, A. A. 145, 267
Golstein, E. G. 43, 267
Goto, K. 270
gradient
    and finite differences 129, 137
gradient method(s) 45ff
    effectiveness of 56, 57
    qualitative analysis of 55
    rate of convergence of 145
    of steepest descent 45
Greenstadt, J. 128, 267

H
Halkin, H. 43, 267
Hayes, R. M. 267
Heideman, J. C. 258, 268
Hestenes, M. R. 43, 145, 258, 267
Horwitz, L. B. 43, 265, 267
Huang, H. Y. 128, 145, 258, 268
Huang, N. Y. 267
Huard, P. 258, 267

I
Isaev, V. K. 267
Ivanov, V. V. 257, 258, 267

K
Kantorovich, L. V. 145, 267, 268
Karlin, S. 42, 43, 268
Karmanov, V. G. 258, 268
Kelley, H. J. 268
Kelley, J. E. 145, 188, 257, 268
Kolmogorov, A. N. 42, 268
Krelle, W. 42, 257, 268
Kuhn-Tucker theorem 27, 196, 217, 218, 220
Künzi, H. 42, 257, 258, 268

L
Laenberger, D. 258, 268
Lagrange multipliers 28
Lagrange’s formula for operators 42
Lagrange’s generalization formula 42
Lavrov, S. S. 268
Levitin, E. S. 42, 188, 257, 258, 268
Levy, A. V. 128, 267
linear constraints 146
linear independence of vectors 219
linearization method 188ff
    acceleration of convergence for 233
    algorithm for 190, 262
    convergence of algorithm 190, 193, 194, 213
    and the dual problem 198
    and finding the minimax 211ff
    and linear programming 200
    and simplex method 200
    and systems of equalities and inequalities 211ff
linearized objective function 188
linear programming 29, 43
    algorithms for 43
    and cutting hyperplane method 187
    in finite dimensional spaces 42
    and necessary conditions for a minimum 201, 204
    and sufficient conditions for a minimum 204
linear rate of convergence 11
Lipschitz’ condition 46
local acceleration of convergence 224ff
local minimum
    point of 33
Lusternik, L. A. 245, 268
Lyubich, Yu. I. 145, 268

M
Maistrovsky, G. D. 10, 145, 268
Malozemov, V. N. 257, 266
mathematical programming 12ff
    the general problem of 256
matrices of second derivatives 66
    finite differences analogue 75
McCormick, G. P. 258, 266, 268
method of conditional gradient 170ff
    and Newton’s method with step adjustment 177
    rate of convergence 172, 176

method(s) of conjugate directions 82ff
    algorithms for 93
    applicability of 103
    construction of 85
    convergence of 104, 109, 125f
    effectiveness of 84, 86, 128
    and minimization of functions 103ff
    and minimization of quadratic functions 82
    properties of 89
    properties of algorithms 111, 126
    rate of convergence 144
method of conjugate gradients 98ff
    algorithm for 260
method(s) of dual directions 67ff
    algorithm for 74, 259
    choice of schemes for 67, 68
    construction of iterative processes 68
    and minimization of quadratic functions 79, 80
    and Newton’s method 80, 81
    properties of 80
    rate of convergence 144
    substantiation of 69
    without calculating derivatives 145
method of feasible directions 162ff
    algorithm for 165, 261
    applied to linear programming 163, 166
    convergence of 167, 170
methods of first order 248, 249
    rate of convergence 251
methods of second order 246, 252
    rate of convergence 254
method of steepest descent 45
method(s) without calculating derivatives 129ff
Miele, A. 258, 268
Milyutin, A. A. 43, 266
minimax problem 39
minimization
    of a convex quadratic function 79, 99
    and equality constraints 233
    of nonlinear functions 170
    of quadratic functions 82, 146, 148, 149
minimization of functions
    methods of conjugate directions 103ff
minimizing quadratic functions
    by method of dual directions 80
minimum of function
    necessary conditions for 25, 32ff
    second-order necessary conditions 40
minimum point 33
Moiseev, N. N. 10, 268
Murtagh, B. A. 268

N
necessary conditions for a minimum 180, 194
    in linear programming 201
Neustadt, L. W. 43, 265, 267, 268
Newton-Leibnitz formula 41
Newton’s finite difference method 81
Newton’s method
    and the conditional gradient method 177
    and conjugate gradient 170ff
    convergence of 215, 244
    and dual directions 81
    generalized 58ff
    rate of convergence 178, 182
Newton’s method with step adjustment 58ff
    choosing of step 59
    and gradient methods 66
    and methods of dual directions 80
    properties of 59-67
nondegeneracy assumption 164
nonlinear equality constraints
    and the linearization method 188
nonlinear functions
    minimization of 170

O
objective function 33
    linearized 188


Oblomskaya, L. Ya. 268
Oettli, W. 258, 268
Ostrowski, A. M. 232, 268
Ostrowski’s theorem 206, 210, 211

P

Pearson, J. D. 128, 268
penalty function method 235ff
    computational aspects 242
    in convex programming 240
    substantiation of 236
point of local minimum 33
Polak, E. 268, 269
Polyak, B. T. 42, 128, 145, 188, 257, 258, 268, 269
Polyak, R. A. 270
polyhedral cone 15
Powell, M. J. D. 128, 145, 266, 269
Price, J. F. 145, 267
Primak, M. E. 257, 270
primal problem in linear programming 30
projection methods with restoration of ties 244ff
projection operator 147, 209
    and the dual problem 160
    matrix for 158
Pshenichny, B. N. 43, 128, 145, 257, 258, 266, 269

Q

quadratic forms
    minimization of 137, 143
quadratic functions
    minimization of 146, 148, 149
quadratic programming 31, 146ff
    algorithm for 151, 158, 160
    and the dual problem 194, 195
    iterative process for 157
    with simple constraints 160
quadratic rate of convergence

R

rate of convergence
    for conditional gradient algorithms 172, 176
    and effectiveness of method 60
    of first order methods 251
    geometrical-progression 11
    linear 11
    of methods of conjugate directions 119, 125f
    quadratic 11
    of second order methods 254
    superlinear 11
Reeves, C. M. 145, 266
regular point 37
restoration process 104, 105
Rockafellar, R. T. 42, 43, 269
Rosen, J. B. 258, 269
Rubinov, A. M. 257, 266

S

Sarachik, P. E. 267
Sargent, R. W. H. 268
Schwartz, J. T. 42, 266
second order methods 246, 252
    rate of convergence of 254
separation theorem 12
Shamansky, V. E. 258, 269
Shor, N. Z. 257, 258, 269
simple constraints 160
simplex method
    and the linearization method 200
    in solving the dual problem 187
Slater’s condition 217, 220
Smith, C. S. 145, 269
Smolyak, S. A. 145, 269
Sobolev, V. I. 245, 268
Sonin, V. V. 267
Sorensen, H. W. 269
step length
    algorithm for choosing 132
    choice of 171, 172
Stiefel, E. 145, 267
strictly convex functions 21, 42
strictly convex sets 17
strongly convex functions 21, 42
strongly convex sets 17
strongly positive matrix 22
subgradient 20


sufficient condition
    for a local minimum 234
superlinear rate of convergence 11
support vector 20, 186
system(s) of conjugate vectors 141, 144

T

tangent manifold 38
Tikhonov, A. N. 10, 269, 270
Tokumaru, H. 270
Topkis, D. M. 257, 270
Truten’, V. E. 257, 267

U

unconstrained function minimization 44ff
Uzawa, H. 43, 265

V

Vainberg, M. M. 42, 270
Vasil’ev, F. P. 258, 270
vector of dual variables 29
Veinott, A. 257, 270

W

Warga, J. 270
Weierstrass’ theorem 24, 178, 249
Wolfe, P. 257, 266

Y

Yakovlev, M. N. 145, 270

Z

Zangwill, W. I. 145, 258, 270
Zeleznik, F. J. 258, 270
Zoutendijk, G. 42, 257, 258, 270
Zukhovitskii, S. I. 42, 43, 257, 270

To the Reader

Mir Publishers would be grateful for your comments on the content, translation and design of this book. We would also be pleased to receive any other suggestions you may wish to make.

Our address is:
USSR, 129820, Moscow 1-110, GSP
Pervy Rizhsky Pereulok 2
Mir Publishers

Printed in the Union of Soviet Socialist Republics


OTHER MIR TITLES

N. S. Bakhvalov
Numerical Methods

Professor N. S. Bakhvalov of Moscow State University has written a textbook for advanced courses in applied mathematics. It is designed especially for universities and higher technical schools and is concerned with the theory and practice of numerical methods. The text covers a wide range of subjects including the essentials of numerical methods dealing with approximation of functions, integration, problems of algebra and optimization, and the solution of ordinary differential equations. Particular attention is paid to questions involving choice of method and organization of computations in the solution of large sets of problems of a single type. The book includes a bibliography of Russian and English references.

V. I. Krylov and N. S. Skoblya
A Handbook of Methods of Approximate Fourier Transformation and Inversion of the Laplace Transformation

Harmonic analysis and Laplace transformations are very often used in the solution of many theoretical and practical problems. This text contains most of the known methods of the approximate inversion of the Laplace transformation and methods of calculating Fourier integrals. It is designed for scientists and engineers who have to do with the theory of applications of the Laplace transformation and the Fourier integrals. It can serve as a useful reference work for computing centres and design bureaus.

V. P. Maslov
Operational Methods

This is a textbook for second- and third-year university mathematics and physics students based on the author’s lectures at the faculty of applied mathematics of the Moscow Institute of Electronic Engineering and the physics faculty of Moscow University. It illustrates the theoretical material concerning specific physical problems, which are taken as the model, comparing the formulas of the operational method with the numerical solution. The book will be of interest to scientific workers in general.
