B.N. Pshenichny, Yu.M. Danilin - Numerical Methods in Extremal Problems - Mir - 1978
B. N. Pshenichny and Yu. M. Danilin
NUMERICAL METHODS IN EXTREMAL PROBLEMS
MIR PUBLISHERS
MOSCOW
The book describes methods and algorithms for the numerical solution of problems of finding extrema of functions and functionals met with in mathematical programming, economics, optimal control theory and other fields of science and practice.

Special attention is paid to algorithms that have a fast rate of convergence and are implementable on computers. Methods of unconstrained and constrained minimization of functions of independent variables are discussed. The book will be useful to specialists in mathematical programming, computational mathematics and optimal control theory, and to broad circles of students and engineers who in their practical work have to solve problems of function minimization.
Russian edition:
B. N. Pshenichny, Yu. M. Danilin
Chislennye metody v ekstremal'nykh zadachakh (Numerical Methods in Extremal Problems)
Izdatel'stvo «Nauka», Main Editorial Board for Physical and Mathematical Literature
Moscow
In English
CONTENTS

PREFACE

CHAPTER I. INTRODUCTION TO THE THEORY OF MATHEMATICAL PROGRAMMING
1. CONVEX SETS
Definition. Separation Theorem. Convex Cones. Strictly and Strongly Convex Sets.
2. CONVEX FUNCTIONS
3. CONVEX PROGRAMMING
Formulation of the Problem. Basic Properties. Necessary Conditions for a Minimum. The Kuhn-Tucker Theorem. Dual Problem. Problem of Linear Programming. Problem of Quadratic Programming.
4. NECESSARY CONDITIONS FOR A MINIMUM
Basic Definitions. Necessary Conditions for a Minimum. Minimax Problem. Necessary Conditions of the Second Order.
5. SOME ADDITIONAL INFORMATION
Bibliographic Notes

CHAPTER II. METHODS OF UNCONSTRAINED FUNCTION MINIMIZATION
1. GRADIENT METHODS
Method of Steepest Descent. Variants of the Method. Other Gradient Methods. Qualitative Analysis of the Methods.
2. NEWTON'S METHOD WITH STEP ADJUSTMENT
Construction of the Method. Theorems about Properties of the Method. Modifications of the Generalized Newton Method. Discussion of the Properties of Newton's Method.
3. METHODS OF DUAL DIRECTIONS
Considerations on the Choice of Schemes of the Methods. Substantiation of the Methods. Construction of Various Algorithms. Determining Vector $p_k$. The Initial Stage of the Process. Minimization of Quadratic Form. Discussion of Properties of the Methods.
4. METHODS OF CONJUGATE DIRECTIONS. MINIMIZATION OF QUADRATIC FUNCTIONS
Conjugate Directions and Their Properties. Construction of the Methods. General Properties of the Methods. Concrete Algorithms. Minimization of a Convex Quadratic Function. Discussion of Results.
5. METHODS OF CONJUGATE DIRECTIONS. MINIMIZATION OF ARBITRARY FUNCTIONS
Considerations about the Applicability of the Methods. Theorem on Convergence of the Methods. Study of Properties of Different Algorithms. Further Study of the Rate of Convergence. Discussion of Results.
6. METHODS WITHOUT CALCULATING DERIVATIVES
Introductory Remarks. Constructing Methods of Dual Directions. Remarks on the Implementation of Methods of Dual Directions. Methods of Conjugate Directions. Discussion of Results.
Bibliographic Notes

CHAPTER III. METHODS OF CONSTRAINED FUNCTION MINIMIZATION
1. PROBLEM OF QUADRATIC PROGRAMMING
Operators of Projection. Minimization of a Quadratic Function in a Subspace. Algorithm of General Problem of Quadratic Programming. Computational Aspects. Problem of Quadratic Programming with Simple Constraints.
2. METHOD OF FEASIBLE DIRECTIONS
Method of Choosing Feasible Directions. Algorithm of Method of Feasible Directions. Substantiation of Convergence of the Algorithm. Construction of the Initial Approximation.
3. METHOD OF CONDITIONAL GRADIENT AND NEWTON'S METHOD
Rule for Choosing the Step Length. Description of the Algorithm. Substantiation of Convergence of the Algorithm and Estimation of Its Rate of Convergence. Estimate of Convergence for a Strongly Convex Region. Newton's Method with Step Adjustment. Properties of Newton's Method.
4. CUTTING HYPERPLANE METHOD
5. LINEARIZATION METHOD
Basic Assumptions. Formulation of the Algorithm. Convergence of the Algorithm. Computational Aspects. Some Generalizations. Problem of Linear Programming. Local Estimate of the Rate of Convergence.
6. LINEARIZATION METHOD: SOLVING SYSTEMS OF EQUALITIES AND INEQUALITIES AND FINDING THE MINIMAX
Systems of Equalities and Inequalities. Convergence of the Algorithm. Remarks. Sufficient Conditions of Convergence. Solving the Problem of Finding the Minimax.
7. LOCAL ACCELERATION OF CONVERGENCE
Formulation of the Problem. Basic Formulas. Algorithm. Computational Aspects. Application to the Problem of Mathematical Programming. Minimization Problem with Equality Constraints.
8. METHOD OF PENALTY FUNCTIONS
Substantiation of the Penalty Function Method. Convex Programming. Computational Aspects. Fiacco and McCormick Method.
9. PROJECTION METHODS WITH RESTORATION OF TIES
Construction of the Methods. Methods of the First Order. Method of the Second Order. Minimization Methods of Higher Effectiveness. On the Solving of the General Problem of Mathematical Programming. Conclusive Remarks.
Bibliographic Notes

APPENDIX. COMPUTATIONAL SCHEMES OF THE MAIN ALGORITHMS
LITERATURE
INDEX

PREFACE

Computational methods of solving extremal problems have developed very intensively in recent years.

Therefore special stress is laid on the description of algorithms that require finding only the first derivative, or only the value, of the function.

In describing the computational methods we consider only the finite-dimensional case. This is due to two reasons. First, when a computer is used for calculations, the problem has to be approximated anyway by a finite-dimensional one. Secondly, most of the known algorithms are comparatively simply generalized to the minimization of functionals without essential changes. This approach has made it possible to make the book easily understood by a broad circle of readers, since only a knowledge of the principles of mathematical analysis and linear algebra is required to grasp most of the results described.

To avoid the necessity of frequent cross-referencing, not many references are given in the text. Short bibliographic notes follow some of the chapters. The authors did not attempt to cover all the literature on the questions treated, this being simply impossible because of its vastness. This is why the list of literature given at the end of the book includes only papers and monographs directly used in writing this book.

It should be noted that the authors have not discussed methods of solving a broad and important class of ill-posed ("noncorrect") extremal problems, which are treated in the works of A. N. Tikhonov and his followers. The authors have only slightly touched on the solution of optimal control problems. These problems have been studied from various points of view, and methods for their solution are given in N. N. Moiseev's monograph Numerical Methods in the Theory of Optimal Systems.

The algorithms set forth below are iterative in character. This means that we construct a finite or infinite sequence of points $x_k$, $k = 0, 1, \ldots$, which is said to converge to the solution of a minimization problem.

The points of the sequence are related by the equation

$$x_{k+1} = x_k + \alpha_k p_k$$

where $p_k$ is the vector of shift from the point $x_k$ and $\alpha_k$ is a step along the direction of $p_k$. Therefore the description of any of the algorithms given below consists in establishing the method of choosing the vector $p_k$ and the length of the step $\alpha_k$. It should be noted that the method of choosing the vector $p_k$ determines the general rate of convergence of the process, while the method of choosing $\alpha_k$ has an important influence on the amount of calculation at each iteration. Therefore the authors' aim was to give, in all cases, a method of choosing $\alpha_k$ such that the required value of $\alpha_k$ could be found after a finite number of iterations without affecting the general rate of convergence.
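
The iterative scheme $x_{k+1} = x_k + \alpha_k p_k$ can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the direction $p_k$ is taken to be the antigradient, and the step $\alpha_k$ is found by repeated halving until a sufficient-decrease test holds, so that each step is found after a finite number of trials. The function names `minimize`, `f` and `grad`, the constant 0.5 in the test and the sample quadratic are all hypothetical choices, not taken from the book.

```python
def minimize(f, grad, x0, max_iter=200, tol=1e-8):
    """Iterate x_{k+1} = x_k + alpha_k * p_k (assumed here: p_k = -grad f(x_k))."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 < tol:
            break                      # gradient is (numerically) zero
        p = [-gi for gi in g]          # shift vector p_k
        alpha = 1.0                    # trial step alpha_k
        fx = f(x)
        # Halve alpha until f decreases by at least 0.5 * alpha * |grad|^2.
        while f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx - 0.5 * alpha * g_norm_sq:
            alpha *= 0.5
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x


def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min = minimize(f, grad, [0.0, 0.0])   # approaches the minimizer (1, -3)
```

The choice of $p_k$ and the rule for $\alpha_k$ are the two places where a concrete algorithm would differ from this sketch.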

CHAPTER I
INTRODUCTION TO THE THEORY OF MATHEMATICAL PROGRAMMING

1. CONVEX SETS

In this section we consider the basic properties of convex sets in an $n$-dimensional Euclidean space.

Definition. Separation Theorem

Definition 1.1. A set of points $X$ in $E^n$ is called convex if together with any $x_1, x_2 \in X$ it also contains all points of the form

$$x = \lambda x_1 + (1 - \lambda) x_2, \quad 0 \le \lambda \le 1.$$

In geometrical terms this means that if the endpoints of a segment belong to a convex set $X$, then the whole segment belongs to the set too.

Lemma 1.1. The following statements hold:
(1) The intersection of any number of convex sets is convex.
(2) If $x_i \in X$, $i = 1, \ldots, m$, then for any $\lambda_i$, $i = 1, \ldots, m$, such that $\sum_{i=1}^{m} \lambda_i = 1$, $\lambda_i \ge 0$, the point $x = \sum_{i=1}^{m} \lambda_i x_i$ belongs to $X$.

The following theorem and its corollaries are the basic tools with which it is possible to obtain results characterizing various properties of convex sets.

Theorem 1.1. Let $X$ be a convex set and $\bar{X}$ its closure. If a point $x_0$ does not belong to $\bar{X}$, then there exist a vector $a \in E^n$, $a \ne 0$, and a number $\varepsilon > 0$ such that for all $x \in X$

$$(a, x) \le (a, x_0) - \varepsilon.$$

Proof. $\bar{X}$ is a closed set, by definition. Let us show that it is convex. Indeed, if $x \in \bar{X}$, then there is a sequence $\{x_k\}$, $k = 1, 2, \ldots$, such that $x_k \in X$, $x_k \to x$. Now let $x, y \in \bar{X}$, $0 \le \lambda \le 1$. Let us prove that $\lambda x + (1 - \lambda) y \in \bar{X}$. Since $X$ is a convex set, it follows from $x_k, y_k \in X$, $x_k \to x$, $y_k \to y$ that

$$\lambda x_k + (1 - \lambda) y_k \in X,$$
$$\lambda x_k + (1 - \lambda) y_k \to \lambda x + (1 - \lambda) y.$$

This means that $\lambda x + (1 - \lambda) y \in \bar{X}$, i.e. $\bar{X}$ is convex.

Let us take a point $y \in \bar{X}$ whose distance from $x_0$ is the least, i.e.

$$\|x - x_0\| \ge \|y - x_0\|, \quad x \in \bar{X}.$$

Since $\bar{X}$ is convex, for all $x \in \bar{X}$ and $0 \le \lambda \le 1$ we have

$$\lambda x + (1 - \lambda) y = y + \lambda (x - y) \in \bar{X}.$$

Therefore

$$\|\lambda x + (1 - \lambda) y - x_0\|^2 = \|y - x_0 + \lambda (x - y)\|^2$$
$$= (y - x_0 + \lambda (x - y), \; y - x_0 + \lambda (x - y))$$
$$= (y - x_0, y - x_0) + 2\lambda (y - x_0, x - y) + \lambda^2 (x - y, x - y)$$
$$= \|y - x_0\|^2 + 2\lambda (y - x_0, x - y) + \lambda^2 \|x - y\|^2 \ge \|y - x_0\|^2.$$
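
The nearest-point construction used in the proof can be tried numerically. The sketch below is an illustration only: the convex set is assumed to be a box, for which the nearest point $y$ is obtained by clipping coordinates, and the separating pair is taken as $a = x_0 - y$, $\varepsilon = \|x_0 - y\|^2$, which is one natural choice consistent with the statement of Theorem 1.1 rather than a formula quoted from the text. The helper names are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def project_onto_box(x0, lower, upper):
    """Nearest point of the box [lower, upper] to x0 (coordinate-wise clipping)."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x0, lower, upper)]


def separating_hyperplane(x0, lower, upper):
    """Return (a, eps) with (a, x) <= (a, x0) - eps for every x in the box.

    Assumes x0 lies outside the box, so that a != 0.
    """
    y = project_onto_box(x0, lower, upper)
    a = [xi - yi for xi, yi in zip(x0, y)]   # assumed choice: a = x0 - y
    eps = dot(a, a)                          # assumed choice: eps = |x0 - y|^2
    return a, eps


lower, upper = [0.0, 0.0], [1.0, 1.0]
x0 = [2.0, 3.0]
a, eps = separating_hyperplane(x0, lower, upper)   # a = [1.0, 2.0], eps = 5.0

# A linear function attains its maximum over a box at a corner,
# so checking the corners checks the whole set.
corners = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
holds = all(dot(a, x) <= dot(a, x0) - eps for x in corners)
```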

Remark. In proving Theorem 1.1 we have proved at the same time that the closure of a convex set is convex too. As a simple exercise the reader can prove that the set of interior points of a convex set is convex too.

Corollary 1.1. Let $X$ be a convex set and $x_0$ a frontier (boundary) point of $X$. Then there is a vector $a \ne 0$ such that

$$(a, x) \le (a, x_0), \quad x \in X.$$

Corollary 1.2. If $X$ and $Y$ are convex sets that do not intersect, then there is a vector $a \ne 0$ such that

$$(a, x) \le (a, y), \quad x \in X, \; y \in Y.$$

Corollary 1.3. If $X$ and $Y$ are closed convex sets which do not intersect and one of them is bounded, then there exist a vector $a \ne 0$ and a number $\varepsilon > 0$ such that

$$(a, x) \le (a, y) - \varepsilon, \quad x \in X, \; y \in Y.$$

Convex Cones

Definition 1.2. A set $K$ is called a convex cone if the set is convex and together with every point $x \in K$ it contains all points $\lambda x$ with $\lambda > 0$.

It is clear that if $x, y \in K$ then $x + y \in K$. In fact, since $K$ is a convex set, the point $\frac{1}{2} x + \frac{1}{2} y$ belongs to $K$. But

$$x + y = 2 \left( \tfrac{1}{2} x + \tfrac{1}{2} y \right),$$

whence $x + y \in K$ by the definition of a cone. The most important properties of cones are formulated in terms which establish the relation between the original cone and the cone that is its conjugate or dual.

Definition 1.3. Let $K$ be a convex cone. The set of all vectors $y \in E^n$ satisfying the inequality $(x, y) \ge 0$ for any $x \in K$ is called the conjugate cone and is denoted by $K^*$.

An elementary check shows that $K^*$ is also a convex cone.

Lemma 1.2. $K^*$ is a closed convex cone.

Lemma 1.3. Let $K$ be a convex cone and $\bar{K}$ its closure. Then $x_0 \in \bar{K}$ if and only if $(x_0, y) \ge 0$ for all $y \in K^*$. If $K$ is closed, then

$$(K^*)^* = K.$$

Proof. It is evident that if $x_0 \in \bar{K}$, then $(x_0, y) \ge 0$ for all $y \in K^*$. Suppose the converse is false: let $(x_0, y) \ge 0$ for any $y \in K^*$, but $x_0 \notin \bar{K}$.
Since $\bar{K}$ is a closed convex set, Theorem 1.1 allows us to assert that there are a vector $a$ and a number $\varepsilon > 0$ such that

$$(a, x_0) \le (a, x) - \varepsilon, \quad x \in \bar{K}.$$

Now a closed cone $\bar{K}$ always contains the point 0. Therefore, in particular,

$$(a, x_0) \le -\varepsilon. \qquad (1.1)$$

On the other hand

$$(a, x) \ge 0, \quad x \in K. \qquad (1.2)$$

Indeed, if $(a, x_1) < 0$ for a certain $x_1 \in K$, then, since $\lambda x_1 \in K$ with $\lambda > 0$,

$$(a, x_0) \le \lambda (a, x_1) - \varepsilon$$

and the last inequality must be valid for any $\lambda > 0$; this is impossible if $(a, x_1) < 0$. Thus (1.2) is valid and consequently $a \in K^*$. Then $(a, x_0) \ge 0$ and this contradicts (1.1). This proves the first part of the lemma.

Let us now prove its second part. If $x \in K$, then $(x, y) \ge 0$ for all $y \in K^*$, by definition, and therefore $x \in (K^*)^*$, i.e. $K \subset (K^*)^*$. Conversely, by definition, $x \in (K^*)^*$ if and only if $(x, y) \ge 0$ for any $y \in K^*$. However, it was proved above that in this case $x \in \bar{K} = K$, i.e. $(K^*)^* \subset K$. Thus $(K^*)^* = K$. Q.E.D.

Polyhedral cones are an important class of cones encountered in the theory of linear programming.

Definition 1.4. A cone $K$ is called polyhedral if there exists a finite set of $n$-dimensional vectors $a_i$, $i = 1, \ldots, m$, such that for $x \in K$ the expansion

$$x = \sum_{i=1}^{m} \lambda_i a_i, \quad \lambda_i \ge 0, \qquad (1.3)$$

is valid and, conversely, (1.3) implies that $x \in K$.

Thus a polyhedral cone $K$ is the set of points which can be represented in the form (1.3). Generally speaking, the representation of a given point $x \in K$ in the form (1.3) is not unique.

Lemma 1.4. Let $x \in K$, $K$ being a polyhedral cone. Then there is an expansion of $x$ in the vectors $a_i$ with nonnegative coefficients $\lambda_i$ such that the number of indices $i$ for which $\lambda_i > 0$ does not exceed $n$, the number of dimensions of the space, and the vectors $a_i$ corresponding to nonzero $\lambda_i$ are linearly independent.

Proof. Let $x \in K$, i.e. $x = \sum_{i=1}^{m} \lambda_i a_i$, and let $J$ be the set of those indices $i$ such that $\lambda_i > 0$. Suppose that the number of elements in $J$ is greater than $n$, or does not exceed $n$, but the vectors $a_i$, $i \in J$, are
Subtracting from this relation the preceding one multiplied by $\varepsilon$, we obtain

$$x = \sum_{i \in J} (\lambda_i - \varepsilon \alpha_i) a_i.$$

For a suitable choice of $\varepsilon$ this takes the form

$$x = \sum_{i \in J} \bar{\lambda}_i a_i$$

where $\bar{\lambda}_i \ge 0$ and for one $i$ at least $\bar{\lambda}_i = 0$.

Thus we have obtained an expansion of $x$ in the vectors $a_i$ with nonnegative coefficients; however, the number of strictly positive coefficients has been diminished.

This process can now be applied further until the number of nonzero coefficients becomes less than or equal to $n$ and the vectors $a_i$ for which $\lambda_i > 0$ become linearly independent. Since we have a process of diminishing a whole number, this process obviously cannot be continued indefinitely, and after a certain number of steps we shall get an expansion which satisfies the conditions of our lemma.
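
The reduction process of Lemma 1.4 can be written out as a short program. The following is a sketch under assumptions, not the book's own computational scheme: a dependence $\sum \alpha_i a_i = 0$ among the currently used vectors is found by Gaussian elimination in exact rational arithmetic, and $\varepsilon$ is taken as the largest value that keeps all coefficients $\lambda_i - \varepsilon \alpha_i$ nonnegative. The names `find_dependence` and `reduce_expansion` are hypothetical.

```python
from fractions import Fraction


def find_dependence(vectors):
    """Return alpha with sum(alpha[i] * vectors[i]) == 0 and some alpha[i] > 0,
    or None if the vectors are linearly independent."""
    m, n = len(vectors), len(vectors[0])
    # Row i holds coordinate i of every vector, so columns correspond to vectors.
    rows = [[Fraction(v[i]) for v in vectors] for i in range(n)]
    pivot_cols, r = [], 0
    for c in range(m):
        pivot = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pivot is None:
            continue                       # column c has no pivot: it is free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    free = [c for c in range(m) if c not in pivot_cols]
    if not free:
        return None
    alpha = [Fraction(0)] * m
    alpha[free[0]] = Fraction(1)           # guarantees a positive entry
    for row_index, c in enumerate(pivot_cols):
        alpha[c] = -rows[row_index][free[0]]
    return alpha


def reduce_expansion(vectors, coeffs):
    """Drop terms from x = sum(coeffs[i] * vectors[i]), coeffs >= 0, until the
    vectors with positive coefficients are linearly independent."""
    lam = {i: Fraction(c) for i, c in enumerate(coeffs) if c > 0}
    while lam:
        idx = sorted(lam)
        alpha = find_dependence([vectors[i] for i in idx])
        if alpha is None:
            break
        # Largest eps keeping every lam[i] - eps * alpha[i] nonnegative.
        eps = min(lam[i] / a for i, a in zip(idx, alpha) if a > 0)
        lam = {i: lam[i] - eps * a for i, a in zip(idx, alpha)}
        lam = {i: v for i, v in lam.items() if v > 0}
    return lam


vectors = [[1, 0], [0, 1], [1, 1]]
reduced = reduce_expansion(vectors, [1, 1, 1])     # x = (2, 2)
x_rebuilt = [sum(reduced[i] * vectors[i][j] for i in reduced) for j in range(2)]
```

Each pass makes at least one coefficient exactly zero, which is the "diminishing of a whole number" that the proof relies on; exact fractions are used so that this zero is not lost to rounding.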

Lemma 1.5. A polyhedral cone is closed.

Lemma 1.6. Let the cone $K$ be defined by a system of linear inequalities

$$(a_i, x) \ge 0, \quad i = 1, \ldots, m,$$

where $a_i \in E^n$. Then the conjugate cone $K^*$ is a polyhedral cone and consists of the points $y$ which can be represented in the form

$$y = \sum_{i=1}^{m} \lambda_i a_i, \quad \lambda_i \ge 0, \; i = 1, \ldots, m.$$

Proof. Let $\tilde{K}$ denote the polyhedral cone of points $y = \sum_{i=1}^{m} \lambda_i a_i$, $\lambda_i \ge 0$. By definition, $\tilde{K}^*$ is the set of points $x$ for which $(x, y) \ge 0$, $y \in \tilde{K}$, i.e. $\left(x, \sum_{i=1}^{m} \lambda_i a_i\right) \ge 0$ for all $\lambda_i \ge 0$. Then

$$\left(x, \sum_{i=1}^{m} \lambda_i a_i\right) = \sum_{i=1}^{m} \lambda_i (x, a_i) \ge 0.$$

The last inequality can obviously be valid for any $\lambda_i \ge 0$ only if $(a_i, x) \ge 0$, $i = 1, \ldots, m$, i.e. if $x \in K$. Thus $\tilde{K}^* = K$. Since $\tilde{K}$ is a polyhedral cone, it is closed and by Lemma 1.3 $(\tilde{K}^*)^* = \tilde{K}$. Thus $K^* = \tilde{K}$. Q.E.D.

Remark. The lemma proved above is known as the Farkas-Minkowski lemma and is used as the basic tool for obtaining the necessary conditions for extrema.
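
One direction of Lemma 1.6 is easy to check numerically: any nonnegative combination $y$ of the vectors $a_i$ has a nonnegative scalar product with every point of $K$. The sketch below does this for a small hypothetical cone in the plane; the vectors, coefficients and sample points are illustrative assumptions.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def in_cone(x, normals):
    """Membership in K = {x : (a_i, x) >= 0 for every a_i in normals}."""
    return all(dot(a, x) >= 0 for a in normals)


def conic_combination(normals, lambdas):
    """y = sum(lambda_i * a_i) with lambda_i >= 0."""
    n = len(normals[0])
    return [sum(l * a[j] for l, a in zip(lambdas, normals)) for j in range(n)]


normals = [[1, 0], [1, 1]]                      # K: x1 >= 0 and x1 + x2 >= 0
y = conic_combination(normals, [2, 3])          # y = (5, 3)

samples = [[0, 0], [1, 0], [1, -1], [3, -2], [2, 5], [-1, 0], [1, -4]]
points_in_K = [x for x in samples if in_cone(x, normals)]
all_nonnegative = all(dot(x, y) >= 0 for x in points_in_K)
```

The converse direction, that every vector of $K^*$ is such a combination, is the substantive part of the lemma and is not verified by this check.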

Strictly and Strongly Convex Sets

Definition 1.5. A set $X \subset E^n$ is called strictly convex if for any $x_1, x_2 \in X$, $x_1 \ne x_2$, all points of the form

$$\lambda x_1 + (1 - \lambda) x_2, \quad 0 < \lambda < 1,$$

are interior points of this set.

Definition 1.6. A set $X \subset E^n$ is called strongly convex if there is a constant $\gamma > 0$ such that any point

$$\frac{x_1 + x_2}{2} + y \in X$$

if $x_1, x_2 \in X$ and $\|y\| \le \gamma \|x_2 - x_1\|^2$.

It is easily ascertained that a strongly convex set is also strictly convex (but not conversely).

2. CONVEX FUNCTIONS

Convex functions have a number of important properties and constitute one of the main objects of study in the theory of mathematical programming. The problem of convex programming, which is the most thoroughly investigated extremal problem, is formulated in terms of convex sets. However, convex functions play a decisive role in the general nonlinear problem too, since sufficiently general and comprehensive necessary conditions for extrema can be formulated only for the case where the directional derivatives of the functions at the given point are convex functions.

We shall mainly study convex functions defined over the whole space, so that the value of any given convex function is finite at each point $x \in E^n$. From the viewpoint of general theory it is sometimes expedient to consider convex functions which can at some points

Definition. Basic Properties

Definition 2.1. A function $f(x)$ defined for all $x \in E^n$ is called convex if for any $x_1, x_2$ and $\lambda_1, \lambda_2 \ge 0$, $\lambda_1 + \lambda_2 = 1$,

$$f(\lambda_1 x_1 + \lambda_2 x_2) \le \lambda_1 f(x_1) + \lambda_2 f(x_2).$$

Remark. If $f(x) = +\infty$ for some $x$, the definition remains valid.

Lemma 2.1. Let $f_1(x)$ and $f_2(x)$ be convex functions and $c_1, c_2$ nonnegative numbers. Then

$$f(x) = c_1 f_1(x) + c_2 f_2(x)$$

is a convex function too.

Lemma 2.2. Let $f_i(x)$, $i = 1, \ldots, m$, be convex functions. Then $f(x) = \max_i f_i(x)$ is also a convex function.

Lemma 2.3. If $f(x)$ is a convex function, then we have

$$f(\lambda_1 x_1 + \lambda_2 x_2 + \ldots + \lambda_m x_m) \le \lambda_1 f(x_1) + \lambda_2 f(x_2) + \ldots + \lambda_m f(x_m)$$

for any nonnegative $\lambda_i$ which satisfy the condition

$$\lambda_1 + \ldots + \lambda_m = 1.$$

Proof. With $m = 2$ this statement follows from the definition of a convex function. Suppose the lemma has been proved for $m \le k$. Let us show that the statement is valid for $m = k + 1$. Let $\lambda_i \ge 0$, $i = 1, \ldots, k + 1$, $\lambda_1 + \ldots + \lambda_{k+1} = 1$. Evidently one can consider all $\lambda_i$ to be strictly greater than zero; otherwise we should have the case where the above inequality is satisfied by hypothesis. Thus $\lambda_{k+1} > 0$ and $1 - \lambda_{k+1} = \lambda_1 + \ldots + \lambda_k > 0$.

From the definition of a convex function we have

$$f(\lambda_1 x_1 + \ldots + \lambda_k x_k + \lambda_{k+1} x_{k+1}) \le (1 - \lambda_{k+1}) f\left( \frac{\lambda_1}{1 - \lambda_{k+1}} x_1 + \ldots + \frac{\lambda_k}{1 - \lambda_{k+1}} x_k \right) + \lambda_{k+1} f(x_{k+1}). \qquad (2.1)$$

By the induction hypothesis,

$$f\left( \frac{\lambda_1}{1 - \lambda_{k+1}} x_1 + \ldots + \frac{\lambda_k}{1 - \lambda_{k+1}} x_k \right) \le \frac{\lambda_1}{1 - \lambda_{k+1}} f(x_1) + \ldots + \frac{\lambda_k}{1 - \lambda_{k+1}} f(x_k) \qquad (2.2)$$

since

$$\frac{\lambda_1}{1 - \lambda_{k+1}} + \ldots + \frac{\lambda_k}{1 - \lambda_{k+1}} = 1.$$

Comparing (2.1) and (2.2) we obtain the required result. The lemma has been proved using the principle of mathematical induction.
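
The inequality of Lemma 2.3 can be checked on a concrete example. The sketch below is illustrative only: the function $f(x) = x_1^2 + |x_2|$, the three points and the weights are hypothetical choices, and `jensen_gap` is a helper name introduced here. It returns the right-hand side minus the left-hand side, which should be nonnegative for a convex function.

```python
def jensen_gap(f, points, weights):
    """sum(w_i * f(x_i)) - f(sum(w_i * x_i)); nonnegative when f is convex,
    weights are nonnegative and sum to one."""
    dim = len(points[0])
    combo = [sum(w * p[j] for w, p in zip(weights, points)) for j in range(dim)]
    return sum(w * f(p) for w, p in zip(weights, points)) - f(combo)


def f(x):
    # Convex as a sum of the convex functions x1^2 and |x2| (Lemma 2.1).
    return x[0] ** 2 + abs(x[1])


points = [[0.0, 1.0], [2.0, -1.0], [4.0, 3.0]]
weights = [0.25, 0.25, 0.5]
gap = jensen_gap(f, points, weights)   # 11.0 - 7.75 = 3.25
```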

Lemma 2.4. The function $f(x)$ is convex if and only if for any $x$ and $p \in E^n$ the function of the one-dimensional variable $t$

$$\varphi_{x,p}(t) = f(x + tp) \qquad (2.3)$$

is a convex function.

Differential Properties

Let $f(x)$ be a convex differentiable function whose continuous gradient is $f'(x)$.

Lemma 2.5. The following statements are equivalent:
(1) $f(x)$ is a convex function.
(2) $f(x_2) - f(x_1) \ge (f'(x_1), x_2 - x_1)$ for any $x_1, x_2 \in E^n$.
(3) $(f'(x + \lambda p), p)$ is a nondecreasing function of $\lambda$.
If $f(x)$ is a twice continuously differentiable function, then
(4) $f''(x)$, the matrix of second derivatives, is positive definite in the non-strict sense (positive semidefinite), i.e. $(f''(x) p, p) \ge 0$ for any $x, p \in E^n$.

Proof. Note first of all that if

$$\varphi_{x,p}(\lambda) = f(x + \lambda p),$$

then, as shown above, $\varphi_{x,p}(\lambda)$ is a convex function and

$$\varphi'_{x,p}(\lambda) = (f'(x + \lambda p), p), \quad \varphi''_{x,p}(\lambda) = (p, f''(x + \lambda p) p). \qquad (2.4)$$

Let us show that statement (2) follows from statement (1). In fact, since

$$f((1 - \lambda) x_1 + \lambda x_2) \le (1 - \lambda) f(x_1) + \lambda f(x_2), \quad 0 < \lambda < 1,$$

we have

$$\frac{f(x_1 + \lambda (x_2 - x_1)) - f(x_1)}{\lambda} \le f(x_2) - f(x_1).$$

Corollary 2.1. The quadratic function

$$f(x) = \tfrac{1}{2} (x, Ax) + (b, x)$$

is convex if and only if the matrix $A$ is positive definite in the non-strict sense (positive semidefinite).

Indeed, $f(x)$ is twice continuously differentiable and $f''(x) = A$. Therefore the statement of the corollary follows directly from statement (4) of Lemma 2.5.

Lemma 2.5 provides a series of criteria of convexity of a function which enable us to establish whether a given function is convex.
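
As an illustration of how such criteria are used, the sketch below tests the gradient inequality (2) of Lemma 2.5 on a grid of points for two quadratic functions $f(x) = \frac{1}{2}(x, Ax) + (b, x)$ with symmetric $A$. The matrices, the grid and the helper names are hypothetical. A failed check disproves convexity, whereas a passed check on finitely many points is only supporting evidence.

```python
from itertools import product


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(A, x):
    return [dot(row, x) for row in A]


def make_quadratic(A, b):
    """Return f(x) = 0.5*(x, Ax) + (b, x) and its gradient Ax + b (A symmetric)."""
    def f(x):
        return 0.5 * dot(x, mat_vec(A, x)) + dot(b, x)

    def grad(x):
        return [ax + bi for ax, bi in zip(mat_vec(A, x), b)]

    return f, grad


def satisfies_gradient_inequality(f, grad, points, tol=1e-12):
    """Check f(x2) - f(x1) >= (f'(x1), x2 - x1) for all pairs of sample points."""
    for x1, x2 in product(points, repeat=2):
        step = [u - v for u, v in zip(x2, x1)]
        if f(x2) - f(x1) < dot(grad(x1), step) - tol:
            return False
    return True


points = [list(p) for p in product([-2, -1, 0, 1, 2], repeat=2)]
f_convex, grad_convex = make_quadratic([[2, 1], [1, 2]], [1, -1])
f_saddle, grad_saddle = make_quadratic([[1, 0], [0, -1]], [0, 0])

convex_ok = satisfies_gradient_inequality(f_convex, grad_convex, points)   # True
saddle_ok = satisfies_gradient_inequality(f_saddle, grad_saddle, points)   # False
```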

Definition 2.2. Let the convex function $f(x)$ be defined at the point $x_0$ and have a finite value there. A vector $g$ is called a subgradient, or support vector, of the function $f(x)$ at the point $x_0$ if for any $x$ the inequality

$$f(x) - f(x_0) \ge (g, x - x_0) \qquad (2.6)$$

is satisfied.

It can be shown that if $f(x)$ is continuous at the point $x_0$, then subgradients exist at this point and the set of these subgradients is convex, closed and bounded. It follows from Lemma 2.5 (statement 2)
that $f'(x_0)$ is a subgradient of the function $f(x)$ at the point $x_0$ if $f(x)$ is differentiable. Thus the concept of subgradient is a generalization of the gradient concept.

It is clear from the definition that if $g_1$ and $g_2$ are subgradients of convex functions $f_1(x)$ and $f_2(x)$ at the point $x_0$, then $c_1 g_1 + c_2 g_2$ is a subgradient of the function $c_1 f_1(x) + c_2 f_2(x)$, $c_1, c_2 \ge 0$. Thus, knowing the subgradients of certain convex functions, it is easy to compute a subgradient of their linear combination with nonnegative coefficients.

Now let $f(x) = \max_{i = 1, \ldots, m} f_i(x)$, where the $f_i(x)$ are convex functions, and let $g_i$ be subgradients of $f_i(x)$ at the point $x_0$. Then the vector

$$g = \sum_{i=1}^{m} \lambda_i g_i,$$

where $\sum_{i=1}^{m} \lambda_i = 1$, $\lambda_i \ge 0$, $i = 1, \ldots, m$, and $\lambda_i = 0$ if $f_i(x_0) < f(x_0)$, is a subgradient of the function $f(x)$.
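
The rule for the maximum of convex functions can be illustrated on affine pieces $f_i(x) = (c_i, x) + d_i$, whose gradients $c_i$ serve as the $g_i$. In the sketch below the pieces, the point $x_0$ and the equal weights over the active pieces are assumptions made for the example (any admissible weights would do), and the helper names are hypothetical. The subgradient inequality (2.6) is then checked on a grid.

```python
from itertools import product


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def max_affine(pieces, x):
    """f(x) = max_i ((c_i, x) + d_i) for pieces given as (c_i, d_i)."""
    return max(dot(c, x) + d for c, d in pieces)


def subgradient_of_max(pieces, x0, tol=1e-12):
    """Average the gradients c_i of the pieces that attain the maximum at x0."""
    values = [dot(c, x0) + d for c, d in pieces]
    f0 = max(values)
    active = [i for i, v in enumerate(values) if f0 - v <= tol]
    lam = 1.0 / len(active)            # equal weights: one admissible choice
    return [sum(lam * pieces[i][0][j] for i in active) for j in range(len(x0))]


pieces = [([1.0, 0.0], 0.0), ([-1.0, 0.0], 0.0), ([0.0, 2.0], -1.0)]
x0 = [1.0, 1.0]                        # pieces 0 and 2 are active here
g = subgradient_of_max(pieces, x0)     # [0.5, 1.0]

grid = [list(p) for p in product([-2.0, -1.0, 0.0, 0.5, 1.0, 2.0], repeat=2)]
inequality_holds = all(
    max_affine(pieces, x) - max_affine(pieces, x0)
    >= dot(g, [xi - x0i for xi, x0i in zip(x, x0)]) - 1e-12
    for x in grid
)
```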

$$\ldots \; \gamma \|x_2 - x_1\|^2 \qquad (2.7)$$

where $\gamma > 0$ is an arbitrarily small constant.

A strongly convex function, as can be easily ascertained, is also strictly convex but, generally speaking, the converse does not hold.

In what follows we shall consider twice continuously differentiable strongly convex functions.

Lemma 2.7. If $f(x)$ is a twice continuously differentiable function, then the condition of strong convexity (2.7) is equivalent to the condition

$$(f''(x) p, p) \ge m \|p\|^2, \quad m > 0, \qquad (2.8)$$

for any $x$ and $p \in E^n$.

Inequality (2.8) implies that the matrix $f''(x)$ is strongly positive.

Corollary 2.2. A strictly convex quadratic function $f(x) = \frac{1}{2} (Ax, x) + (b, x)$ defined over the space $E^n$ is strongly convex too, and the converse is valid.

Proof. It is necessary to prove only the first statement. From (2) of Lemma 2.6 it follows that for any $x \ne 0$

$$(Ax, x) > 0. \qquad (2.9)$$

At the same time

$$(Ax, x) \ge k (x, x) = k \|x\|^2 \qquad (2.10)$$

where $k$ is the least eigenvalue of the matrix of second derivatives, $A$. Since equality in (2.10) is attained on the corresponding eigenvector, it follows from (2.9) and (2.10) that $k > 0$ and $f(x)$ is a strongly convex function.
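
Inequality (2.10) can be verified directly in the two-dimensional case, where the least eigenvalue of a symmetric matrix has a closed form. The matrix and the grid below are hypothetical examples, and `least_eigenvalue_2x2` is a helper introduced only for this sketch.

```python
import math
from itertools import product


def least_eigenvalue_2x2(A):
    """Least eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = A
    return 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b * b)


def quad_form(A, x):
    """(Ax, x) for a 2x2 matrix A."""
    return sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


A = [[2.0, 1.0], [1.0, 2.0]]
k = least_eigenvalue_2x2(A)            # eigenvalues are 1 and 3, so k = 1.0

grid = [list(p) for p in product([-2.0, -1.0, 0.0, 1.0, 2.0], repeat=2)]
bound_holds = all(
    quad_form(A, x) >= k * (x[0] ** 2 + x[1] ** 2) - 1e-12 for x in grid
)
```

Here $(Ax, x) - k\|x\|^2 = (x_1 + x_2)^2$, so the bound is tight along the eigenvector $(1, -1)$, which is why $k > 0$ follows from (2.9).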

Let $x_0$ be an arbitrary point in $E^n$. Consider the set

$$Y = \{x : f(x) \le f(x_0)\}.$$

Lemma 2.8. If $f(x)$ is a twice continuously differentiable strongly convex function, then $Y$ is a closed bounded strongly convex set.

Proof. The set $Y$ is closed since $f(x)$ is a continuous function. Let us prove that $Y$ is bounded. By Taylor's formula

$$f(x) = f(x_0) + (f'(x_0), x - x_0) + \tfrac{1}{2} (f''(\xi)(x - x_0), x - x_0)$$

where $\xi = x_0 + \theta (x - x_0)$, $\theta \in [0, 1]$. Using (2.8) we have

$$f(x) - f(x_0) \ge (f'(x_0), x - x_0) + \frac{m}{2} \|x - x_0\|^2.$$

For $x \in Y$ the left-hand side is nonpositive, so that $\|x - x_0\| \le \frac{2}{m} \|f'(x_0)\|$. This last inequality proves that $Y$ is bounded.
Finally, let us establish that $Y$ is a strongly convex set. Let $x_1, x_2 \in Y$. Using Lagrange's formula and condition (2.7) we obtain
$$f\left(\frac{x_1 + x_2}{2} + y\right) = f\left(\frac{x_1 + x_2}{2}\right) + (f'(\xi), y)$$
( T W p , p ) > ^ r \ \ p \ \ z-
Concave Functions
Definition. If for any $x_1, x_2 \in E^n$ and any $0 \le \lambda \le 1$ the inequality
$$f(\lambda x_1 + (1 - \lambda)x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2)$$
is satisfied, then the function $f(x)$ is called concave.
It follows that the function $f(x)$ is concave if and only if the function $-f(x)$ is convex. Taking this into account, all the properties of concave functions can be obtained by a simple reformulation of the corresponding properties of convex functions.
3. CONVEX PROGRAMMING
The subject matter of convex programming is minimization of a convex function in a convex domain. Convex programming is the most elaborated part of mathematical programming.
Formulation of the Problem.
Basic Properties
Given a convex continuous function $f(x)$, defined for all $x \in E^n$, and a convex set $X$. It is required to find the minimum of $f(x)$ in the set $X$, i.e. to find point $x_*$ such that
$$f(x_*) = \min_{x \in X} f(x).$$
Lemma 3.1. A convex continuous function $f(x)$ attains its minimum in a compact convex set $X$.
Proof. The statement is just a particular case of the well-known Weierstrass theorem which states that a continuous function attains its minimum in a compact set.
Lemma 3.2. Let $X$ be a closed set and $f(x)$ a twice continuously differentiable strongly convex function. Then $f(x)$ attains its minimum in $X$.
Proof. Let $x_0 \in X$. Consider the set
$$Y = \{x : f(x) \le f(x_0)\}.$$
By lemma 2.8 it is closed and bounded. Consider now the intersection $X \cap Y$. Obviously, if $x_*$ is the minimum of $f(x)$ in the set $X \cap Y$, then this point is the minimum point of $f(x)$ in $X$ as well. But the set $X \cap Y$ is bounded and closed, being the intersection of two closed sets one of which is bounded. Therefore $f(x)$ attains its minimum in $X \cap Y$ and consequently in $X$ as a whole.
Convex and strictly convex functions can fail to attain their minimum.
Lemma 3.3. A set of points $X_* \subset X$ at which the convex function $f(x)$ attains its minimum in $X$ is convex.
Lemma 3.4. A strictly convex function attains its minimum in a convex set $X$ at one and only one point.
Proof. Let $x_1$ and $x_2$ be different points of minimum of $f(x)$ in $X$. Then
$$f\left(\tfrac12 x_1 + \tfrac12 x_2\right) < \tfrac12 f(x_1) + \tfrac12 f(x_2) = f(x_1), \qquad \tfrac12 x_1 + \tfrac12 x_2 \in X.$$
This contradicts the fact that $x_1$ is a point of minimum of $f(x)$.
Necessary Conditions for a Minimum
Let $f(x)$ be a continuously differentiable convex function and $X$ a convex set. We have to consider the following question: if $x_*$ is the minimum point of $f(x)$ in $X$, what conditions are to be satisfied at this point?
Definition 3.1. Let $x_0 \in X$. We denote by $K(x_0)$ a set of vectors $p$ such that $p \in K(x_0)$ if and only if there is an $\alpha > 0$ such that $x_0 + \alpha p \in X$.
The set $K(x_0)$ is called the cone of admissible directions for $X$ at point $x_0$.
Lemma 3.5. $K(x_0)$ is a convex cone. If $p \in K(x_0)$ and $x_0 + \alpha_0 p \in X$, then $x_0 + \alpha p \in X$ with any $0 \le \alpha \le \alpha_0$.
Theorem 3.1. Let $x_*$ be the minimum point of a continuously differentiable convex function $f(x)$ in a convex set $X$. Then
$$f'(x_*) \in K^*(x_*). \tag{3.1}$$
Conversely, if (3.1) holds, then $x_*$ is the minimum point of $f(x)$ in $X$.
Proof. Let (3.1) be satisfied at point $x_*$. Then $(f'(x_*), p) \ge 0$, $p \in K(x_*)$. Further, if $x \in X$, then $p = x - x_* \in K(x_*)$, for $x_* + (x - x_*) = x \in X$. Therefore
$$(f'(x_*), x - x_*) \ge 0, \quad x \in X.$$
By lemma 2.5 we have for a convex function
$$f(x) - f(x_*) \ge (f'(x_*), x - x_*).$$
Hence
$$f(x) - f(x_*) \ge 0, \quad x \in X;$$
this shows that $x_*$ is the minimum point of $f(x)$ in $X$.
Let us now prove that condition (3.1) is necessary. Let $x_*$ be the minimum point. Then for any $x \in X$ and $\lambda$, $0 < \lambda \le 1$, we have
$$f((1 - \lambda)x_* + \lambda x) = f(x_* + \lambda(x - x_*)) \ge f(x_*)$$
or
$$\frac{f(x_* + \lambda(x - x_*)) - f(x_*)}{\lambda} \ge 0$$
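The inequality $(f'(x_*), x - x_*) \ge 0$, $x \in X$, used in the proof of theorem 3.1 can be tried on a one-dimensional case. The following is a minimal sketch under assumptions that are not in the text: the function $f(x) = (x - 2)^2$, the set $X = [0, 1]$, the sampling grid and the helper names are all hypothetical.

```python
def f_prime(x):
    """Derivative of the illustrative convex function f(x) = (x - 2)**2."""
    return 2.0 * (x - 2.0)


def satisfies_first_order_condition(x_star, points, tol=1e-12):
    """Check (f'(x_star), x - x_star) >= 0 for every sampled x in X."""
    g = f_prime(x_star)
    return all(g * (x - x_star) >= -tol for x in points)


# Sample points of the hypothetical feasible interval X = [0, 1].
grid = [i / 100.0 for i in range(101)]

at_boundary = satisfies_first_order_condition(1.0, grid)  # candidate x_* = 1
at_interior = satisfies_first_order_condition(0.5, grid)  # not a minimizer
```

The condition holds at the boundary point 1, which is the minimizer of this function on the interval, and fails at 0.5, where moving to the right still decreases the function.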
where $u^i$, $u_+^i$, $u_-^i$ are nonnegative numbers. Denoting $u^i = u_+^i - u_-^i$, $i \in J^0$, we obtain
$$y = -\sum_{i \in J^-(x_0)} u^i a_i - \sum_{i \in J^0} u^i a_i, \quad u^i \ge 0, \quad i \in J^-(x_0). \tag{3.6}$$
Theorem 3.2. Let $f(x)$ be a convex differentiable function and set $X$ be defined by system (3.4). Then for point $x_*$ to be the minimum point of $f(x)$ in $X$ it is necessary and sufficient that there exist numbers $u^i$, $i \in J^- \cup J^0$, such that
$$f'(x_*) + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \quad u^i \ge 0, \ i \in J^-, \quad u^i = 0 \ \text{if } (a_i, x_*) - b_i < 0, \ i \in J^-.$$
Proof. The result is obtained directly by using theorem 3.1 and $y$ in the form (3.6) for elements of $K^*(x_0)$ and also taking $u^i = 0$ for $i \notin J^-(x_*)$.
Corollary 3.2. For point $x_*$ to be the minimum point of the convex differentiable function over the whole space it is necessary and sufficient to satisfy the equality
$$f'(x_*) = 0.$$
Corollary 3.3. For point $x_*$ to be the minimum point of the convex differentiable function in the set
$$x^j \ge 0, \quad j \in J',$$
where $J'$ is a subset of the set $j = 1, 2, \ldots, n$, it is necessary and sufficient to satisfy the relations
$$\frac{\partial f(x_*)}{\partial x^j} \ge 0 \quad \text{if } x_*^j = 0, \ j \in J',$$
$$\frac{\partial f(x_*)}{\partial x^j} = 0 \quad \text{if } x_*^j \ne 0 \ \text{or } j \notin J'.$$
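The relations of corollary 3.3 are easy to test mechanically. The sketch below assumes a hypothetical function $f(x) = (x^1 - 1)^2 + (x^2 + 1)^2$ with both coordinates constrained to be nonnegative; the helper `satisfies_corollary_3_3`, the tolerance and the zero-based indexing of coordinates are illustrative choices, not part of the text.

```python
def satisfies_corollary_3_3(grad, x, constrained, tol=1e-9):
    """Return True if the gradient and point meet the relations of corollary 3.3."""
    for j, (g, xj) in enumerate(zip(grad, x)):
        if j in constrained and xj < -tol:
            return False          # the point violates x^j >= 0
        if j in constrained and abs(xj) <= tol:
            if g < -tol:          # need df/dx^j >= 0 on the boundary
                return False
        elif abs(g) > tol:        # need df/dx^j = 0 otherwise
            return False
    return True


def grad_f(x):
    """Gradient of the hypothetical function (x1 - 1)^2 + (x2 + 1)^2."""
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 1.0)]


constrained = {0, 1}  # zero-based indices of the coordinates with x^j >= 0

ok_at_minimum = satisfies_corollary_3_3(grad_f([1.0, 0.0]), [1.0, 0.0], constrained)
ok_elsewhere = satisfies_corollary_3_3(grad_f([0.0, 0.0]), [0.0, 0.0], constrained)
```

At the point (1, 0) the first partial derivative vanishes and the second is positive on the active bound, so the relations hold; at the origin the first partial derivative is negative on an active bound, so they fail.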
The necessary and sufficient conditions for a minimum considered above were based on an abstract description of an admissible set $X$ in which function $f(x)$ was minimized. In a broad class of problems the set is defined by a system of inequalities and equalities. This section considers the necessary conditions for a minimum in this concrete case.
Given convex functions $f_i(x)$, $i = 0, 1, \ldots, m$ and convex set $X$. It is required to minimize $f_0(x)$ with the following constraints
$$f_i(x) \le 0, \quad i = 1, \ldots, m, \quad x \in X. \tag{3.7}$$
Theorem 3.3 (Kuhn-Tucker). Let $x_*$ be the minimum point of $f_0(x)$ with the constraints (3.7) and let there be a point $x_1 \in X$ such that
$$f_i(x_1) < 0, \quad i = 1, \ldots, m.$$
Dual Problem
Consider again the problem of minimization of convex function $f_0(x)$ with constraints (3.7). Let $u^i \ge 0$, $i = 1, \ldots, m$ be fixed. Let us compute
$$\varphi(u) = \inf_{x \in X}\Big[f_0(x) + \sum_{i=1}^m u^i f_i(x)\Big].$$
Thus function $\varphi(u)$ with $u \ge 0$ has been determined; it can take the value $-\infty$ as well. We leave it to the reader to prove that $\varphi(u)$ is a concave function.
Theorem 3.4. Let $u \ge 0$ and $x$ satisfy the constraints (3.7). Then
$$\varphi(u) \le f_0(x).$$
If however the conditions of theorem 3.3 are satisfied, then
$$\max_{u \ge 0} \varphi(u) = \min_{x \in D} f_0(x)$$
where $D$ is a set of points $x$ which satisfy (3.7).
Proof. For $x \in D$, $u \ge 0$ we have
$$\varphi(u) \le f_0(x) + \sum_{i=1}^m u^i f_i(x) \le f_0(x).$$
Let now the conditions of theorem 3.3 be satisfied. Then there exists a vector $u_0 \ge 0$ such that the relations (3.8) are satisfied for it. These relations imply
$$\varphi(u_0) = f_0(x_*) + \sum_{i=1}^m u_0^i f_i(x_*) = f_0(x_*),$$
and since $\varphi(u) \le f_0(x_*)$ it follows that vector $u_0$ provides the maximum of function $\varphi(u)$ in the domain $u \ge 0$ and
$$\max \varphi(u) = \varphi(u_0) = f_0(x_*) = \min f_0(x).$$
Q.E.D.
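Both statements of theorem 3.4 can be observed on a one-variable problem. The sketch below is an illustration under assumptions not found in the text: minimize $f_0(x) = x^2$ subject to $f_1(x) = 1 - x \le 0$ with $X$ the whole line. For this hypothetical problem the infimum defining $\varphi(u)$ is attained at $x = u/2$, which the code uses directly; the grids are arbitrary.

```python
def f0(x):
    """Hypothetical objective function."""
    return x * x


def f1(x):
    """Hypothetical constraint function; feasibility means f1(x) <= 0."""
    return 1.0 - x


def phi(u):
    """Dual function: inf over x of f0(x) + u * f1(x), attained at x = u / 2."""
    x = 0.5 * u
    return f0(x) + u * f1(x)


feasible_xs = [1.0 + k / 10.0 for k in range(31)]  # sample of D = {x >= 1}
us = [k / 10.0 for k in range(51)]                 # sample of u >= 0

weak_duality = all(phi(u) <= f0(x) + 1e-12 for u in us for x in feasible_xs)
dual_max = max(phi(u) for u in us)
primal_min = min(f0(x) for x in feasible_xs)
```

On these samples every dual value lies below every primal value, and the largest dual value (at u = 2) coincides with the smallest primal value (at x = 1), as the theorem states for problems satisfying the conditions of theorem 3.3.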
The problem of maximization of $\varphi(u)$ with the constraint $u \ge 0$ is known as the dual problem of convex programming and $u$ as the vector of dual variables.
The essence of theorem 3.4 can now be interpreted as follows: under the conditions of the Kuhn-Tucker theorem the value of the maximum of the objective function in the dual problem is that of the minimum of the objective function of the primal problem. The Lagrange multipliers of the primal problem are at the same time the solution of the dual problem.
The problem of convex programming often arises in the form of the minimization of $f_0(x)$ with the constraints
$$f_i(x) \le 0, \ i \in J^-, \quad f_i(x) = 0, \ i \in J^0, \quad x \in X \tag{3.10}$$
where $J^-$ and $J^0$ are finite sets of indices, $f_0(x)$ and $f_i(x)$, $i \in J^-$, are convex functions of $x$, $f_i(x)$, $i \in J^0$, are linear functions and $X$ is a convex set.
The dual problem for this case is formulated as a maximization problem of $\varphi(u)$ with the constraints $u^i \ge 0$, $i \in J^-$, where $u$ has components $u^i$, $i \in J^- \cup J^0$, and
$$\varphi(u) = \inf_{x \in X}\Big[f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x)\Big]. \tag{3.11}$$
Thus the number of dual variables is equal to the number of constraints (3.10), and the variable $u^i$ corresponding to the $i$-th constraint takes nonnegative values if it corresponds to an inequality constraint and arbitrary values if it corresponds to an equality constraint.
Problem of Linear Programming
The problem of linear programming is the problem of minimization of the function $f_0(x) = (a_0, x)$ subject to constraints (3.4)
$$(a_i, x) - b_i \le 0, \ i \in J^-; \quad (a_i, x) - b_i = 0, \ i \in J^0.$$
This problem coincides with the problem (3.10) if
$$f_i(x) = (a_i, x) - b_i, \quad X = E^n.$$
Lemma 3.6. If constraints (3.4) are consistent, then the problem of linear programming either has a solution $x_*$ or the value of the lower bound of $f_0(x) = (a_0, x)$ with the constraints (3.4) is $-\infty$.
The proof of this lemma can be found in textbooks on linear programming.
The necessary conditions characterizing $x_*$, the solution of the problem of linear programming, are obtained just by reformulating theorem 3.2 since $f_0'(x) = a_0$.
$$a_0 + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \quad u^i \ge 0, \ i \in J^-, \quad u^i = 0 \ \text{if } (a_i, x_*) - b_i < 0, \ i \in J^-. \tag{3.12}$$
Let us construct the dual of the problem of linear programming. By definition,
$$\varphi(u) = \inf_{x \in E^n}\Big[f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x)\Big] = \inf_{x \in E^n}\Big[\Big(a_0 + \sum_{i \in J^- \cup J^0} u^i a_i, x\Big) - \sum_{i \in J^- \cup J^0} u^i b_i\Big].$$
The infimum equals $-\sum u^i b_i$ if $a_0 + \sum u^i a_i = 0$ and $-\infty$ otherwise.
Thus the dual of the problem of linear programming, i.e. the problem of the maximization of $\varphi(u)$ with $u^i \ge 0$, $i \in J^-$, is equivalent to the maximization of
$$-\sum_{i \in J^- \cup J^0} u^i b_i \tag{3.13}$$
with the constraints
$$a_0 + \sum_{i \in J^- \cup J^0} u^i a_i = 0, \quad u^i \ge 0, \ i \in J^-. \tag{3.14}$$
Theorem 3.6. If the primal problem of linear programming has a solution, then the Lagrange multipliers are the solution of the dual problem, and at the same time the value of the minimum of the objective function of the primal problem is equal to the value of the maximum of the objective function of the dual problem.
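A tiny linear program makes the primal-dual relation of theorem 3.6 concrete. The data below are hypothetical and chosen only for illustration: minimize $-x^1 - x^2$ subject to $x^1 \le 1$, $x^2 \le 2$ (two inequality constraints, no equalities). The candidate solution and multipliers are stated by hand, and the code only verifies feasibility, the stationarity relation of the form (3.14), and equality of the two objective values.

```python
def dot(a, b):
    """Scalar product of two vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical problem data: objective vector a0, constraint rows a_i, right-hand sides b_i.
a0 = [-1.0, -1.0]
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]

x_star = [1.0, 2.0]   # candidate primal solution
u = [1.0, 1.0]        # candidate Lagrange multipliers

primal_feasible = all(dot(ai, x_star) - bi <= 1e-12 for ai, bi in zip(A, b))

# Stationarity: a0 + sum_i u^i a_i = 0, together with u^i >= 0.
stationarity = [a0[j] + sum(u[i] * A[i][j] for i in range(2)) for j in range(2)]
dual_feasible = all(abs(s) <= 1e-12 for s in stationarity) and all(ui >= 0 for ui in u)

primal_value = dot(a0, x_star)   # value of (a0, x)
dual_value = -dot(u, b)          # value of -sum_i u^i b_i
```

Both points are feasible and the objective values coincide, so by the inequality $\varphi(u) \le f_0(x)$ neither value can be improved.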
In addition to constraints (3.4), problems of linear programming often contain constraints of the type
$$x^j \ge 0, \quad j \in J' \tag{3.4'}$$
where $J'$ is a subset of the set $j = 1, 2, \ldots, n$. Using theorem 3.6 the reader can easily prove the following.
Theorem 3.7. If a problem of linear programming with constraints (3.4), (3.4') has a solution, then the Lagrange multipliers corresponding
to constraints (3.4) are the solution of the dual problem: the maximization of
$$-\sum_{i \in J^- \cup J^0} u^i b_i$$
with constraints
$$a_0^j + \sum_{i \in J^- \cup J^0} u^i a_i^j \ge 0, \quad j \in J',$$
$$a_0^j + \sum_{i \in J^- \cup J^0} u^i a_i^j = 0, \quad j \notin J', \qquad u^i \ge 0, \ i \in J^-,$$
where $a_i^j$ is the $j$-th component of vector $a_i$. The values of the minimum of the objective function of the primal problem and of the maximum of the dual one coincide.
Problem of Quadratic Programming
The problem of quadratic programming consists in the minimization of the quadratic function
$$f_0(x) = \tfrac12 (x, Cx) + (d, x)$$
subject to constraints (3.4). By definition (3.11),
$$\varphi(u) = \inf_{x \in E^n}\Big[\tfrac12 (x, Cx) + (d, x) + \sum_{i \in J^- \cup J^0} u^i \big((a_i, x) - b_i\big)\Big]$$
$$= \inf_{x \in E^n}\Big[-\sum_{i \in J^- \cup J^0} u^i b_i + \tfrac12 (x, Cx) + \Big(x, d + \sum_{i \in J^- \cup J^0} u^i a_i\Big)\Big]$$
$$x(u) = -C^{-1}\Big(d + \sum_{i \in J^- \cup J^0} u^i a_i\Big). \tag{3.16}$$
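Formula (3.16) can be exercised on a small problem. The sketch below rests on several assumptions that are not in the text: a diagonal matrix $C$ (so that $C^{-1}$ is applied componentwise), a single hypothetical inequality constraint $x^1 + x^2 - 1 \le 0$, and a multiplier value worked out by hand for this particular data.

```python
# Hypothetical data: C = diag(2, 4), d = (-2, -4), one constraint (a1, x) - b1 <= 0.
C_diag = [2.0, 4.0]
d = [-2.0, -4.0]
a1 = [1.0, 1.0]
b1 = 1.0


def x_of_u(u):
    """x(u) = -C^{-1}(d + u * a1) for the diagonal matrix C, as in (3.16)."""
    return [-(d[j] + u * a1[j]) / C_diag[j] for j in range(2)]


def f0(x):
    """Quadratic objective (1/2)(x, Cx) + (d, x)."""
    quadratic = 0.5 * sum(C_diag[j] * x[j] ** 2 for j in range(2))
    linear = sum(d[j] * x[j] for j in range(2))
    return quadratic + linear


def phi(u):
    """Dual function evaluated through the minimizer x(u)."""
    x = x_of_u(u)
    return f0(x) + u * (sum(a1[j] * x[j] for j in range(2)) - b1)


u_star = 4.0 / 3.0          # multiplier for which the constraint is active at x(u)
x_star = x_of_u(u_star)     # equals (1/3, 2/3) for this data
```

For this data the constraint is active at `x_star`, so the dual value `phi(u_star)` coincides with the primal value `f0(x_star)`, and nearby multipliers give smaller dual values.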
4. NECESSARY CONDITIONS FOR A MINIMUM
The general problem of mathematical programming consists in minimizing function $f_0(x)$, $x \in E^n$, in a set defined by a system of equalities and inequalities
$$f_i(x) \le 0, \ i \in J^-, \quad f_i(x) = 0, \ i \in J^0, \quad x \in X, \tag{4.1}$$
where $J^-$ and $J^0$ are finite sets of indices. In this section it is always assumed that $f_i(x)$ are continuously differentiable functions whose gradient is $f_i'(x)$. No assumption is made about set $X$ for the present.
The main object of this section is to deduce the necessary conditions which must be satisfied at point $x_*$ providing the minimum of $f_0(x)$ subject to constraints (4.1).
Basic Definitions
Definition 4.1. A set $D$ of points which satisfy constraints (4.1) is called an admissible domain.
We assume that this set is nonempty.
Definition 4.2. Function $f_0(x)$ being minimized in $D$ is called an objective function.
Definition 4.3. Point $x_*$ satisfying (4.1) for which
$$f_0(x_*) \le f_0(x), \quad x \in D,$$
is called the minimum point.
Definition 4.4. Point $x_*$ is called a point of local minimum of $f_0(x)$ in $D$ if there is a neighbourhood $\Omega$ of point $x_*$ such that
$$f_0(x_*) \le f_0(x), \quad x \in D \cap \Omega.$$
Necessary Conditions for a Minimum
(4.2)
A.-0 A
the expression
£ o + A,/>+ 2 (4.3)
ieJ°
is valid with sufficiently small $\lambda > 0$.
The basic result to be proved in this section can now be formulated.
Theorem 4.1. Let $x_*$ be a point of local minimum of $f_0(x)$ in $D$. Besides, let the set of admissible directions with respect to set $X$ at point $x_*$ form a convex cone $K(x_*)$. Then there are numbers $u^0$, $u^i$, $i \in J^- \cup J^0$, such that
$$u^0 f_0'(x_*) + \sum_{i \in J^- \cup J^0} u^i f_i'(x_*) \in K^*(x_*),$$
$$u^i f_i(x_*) = 0, \ i \in J^-; \qquad u^i \ge 0, \ i = 0 \ \text{and} \ i \in J^-. \tag{4.4}$$
Proof. Consider two cases.
(1) Vectors $f_i'(x_*)$, $i \in J^0$, are linearly dependent. Then there are numbers $u^i$, $i \in J^0$, not all zero, such that
$$\sum_{i \in J^0} u^i f_i'(x_*) = 0.$$
Taking $u^0 = 0$, $u^i = 0$, $i \in J^-$, we see that all the conditions of theorem 4.1 are satisfied.
(2) Vectors $f_i'(x_*)$, $i \in J^0$, are linearly independent. Then there are vectors $e_i$, $i \in J^0$, such that
$$(f_i'(x_*), e_j) = \delta_{ij}, \quad i, j \in J^0,$$
where $\delta_{ij} = 0$ if $i \ne j$ and $\delta_{ii} = 1$.
Let the total number of indices $i$ in the set $J^- \cup J^0$ be $m$. Consider set $Z$ in space $E^{m+1}$, the set being defined as follows. Vector $z$ belongs to $Z$ if and only if there is a vector $p \in K(x_*)$ such that $z^i = (f_i'(x_*), p)$ if $f_i(x_*) = 0$, $i \in J^- \cup J^0$, or $i = 0$. The components $z^i$ (of vector $z \in Z$) for which $f_i(x_*) < 0$ are arbitrary. Since $K(x_*)$ is a convex cone it is easily seen that $Z$ is a convex cone too.
Let us now define the set $P$. Vector $w$ belongs to $P$ if and only if
$$w^i < 0 \quad \text{with } f_i(x_*) = 0, \ i \in J^- \ \text{or } i = 0,$$
$$w^i = 0 \quad \text{with } i \in J^0.$$
The remaining components of vector $w$ are arbitrary. Obviously, $P$ is a convex set too.
Let us demonstrate that $Z$ and $P$ do not intersect. Suppose that the opposite is true. Then there must be a vector $p_0 \in K(x_*)$ such that
$$(f_i'(x_*), p_0) < 0 \quad \text{with } i = 0 \ \text{or } i \in J^- \ \text{and } f_i(x_*) = 0$$
and
$$(f_i'(x_*), p_0) = 0 \quad \text{with } i \in J^0. \tag{4.5}$$
We now construct a system of equations in functions $r^i(\lambda)$, $i \in J^0$:
$$f_i\Big(x_* + \lambda p_0 + \sum_{j \in J^0} r^j e_j\Big) = 0, \quad i \in J^0. \tag{4.6}$$
Let us denote
$$g_i(\lambda, r) = f_i\Big(x_* + \lambda p_0 + \sum_{j \in J^0} r^j e_j\Big), \quad i \in J^0.$$
Then from (4.6) we have
$$g_i(\lambda, r) = 0, \quad i \in J^0, \tag{4.7}$$
which defines $r^i$ as implicit functions of $\lambda$. Since it was assumed that $f_i(x)$ are continuously differentiable functions, the functions $g_i(\lambda, r)$ are also continuously differentiable in $\lambda$ and $r^i$. Then using (4.5) we write
$$\frac{\partial g_i(0, 0)}{\partial \lambda} = (f_i'(x_*), p_0) = 0, \tag{4.8}$$
r’( v = - ( f y ‘i <4 -1 0 )
where $I$ is an identity matrix. This follows from (4.8) and (4.9).
Thus we see that with small $\lambda$ continuously differentiable functions $r^i(\lambda)$, $i \in J^0$, are defined. From (4.10) and (4.11) we have
i£t)°
Since from (4.5) $(f_0'(x_*), p_0) < 0$ and $r^i(\lambda)/\lambda \to 0$ we obtain, with small positive $\lambda$, $x(\lambda) \to x_*$ and
$$\frac{f_0(x(\lambda)) - f_0(x_*)}{\lambda} < 0.$$
Similarly, if $i \in J^-$ and $f_i(x_*) = 0$, then by (4.5)
$$f_i(x(\lambda)) < 0, \quad f_i(x_*) = 0.$$
If $f_i(x_*) < 0$, $i \in J^-$, then $f_i(x(\lambda)) < 0$, by continuity.
Thus point $x(\lambda)$ with small positive $\lambda$ satisfies all constraints (4.1) and $f_0(x(\lambda)) < f_0(x_*)$. But this contradicts the fact that $x_*$ is a point of local minimum.
$$u^0 z^0 + \sum u^i z^i \ge u^0 w^0 + \sum u^i w^i, \quad z \in Z, \quad w \in P. \tag{4.13}$$
The structure of sets $Z$ and $P$ makes it possible to draw certain conclusions about numbers $u^i$. By the definition of $P$, $w^0$ can take any value less than zero. Hence $u^0 \ge 0$, otherwise the right-hand side could be made arbitrarily large, and this contradicts (4.13). Similarly,
$$u^i \ge 0 \quad \text{if } f_i(x_*) = 0, \ i \in J^-. \tag{4.14}$$
Further, if $i \in J^-$ and $f_i(x_*) < 0$, then $w^i$ is arbitrary. Therefore for the inequality (4.13) to be valid it is necessary that
$$u^0 (f_0'(x_*), p) + \sum u^i (f_i'(x_*), p) \ge 0, \quad p \in K(x_*)$$
or
$$u^0 f_0'(x_*) + \sum u^i f_i'(x_*) = 0,$$
Proof. If $X = E^n$, then any direction $p$ is admissible, i.e. $K(x_*) = E^n$. Therefore cone $K^*(x_*)$ consists of one and only one zero vector and relations (4.4) directly take the form (4.17).
Corollary 4.2. For point $x_*$ to provide the minimum of $f_0(x)$ in the domain
$$x^i \ge 0, \quad i \in J',$$
where $J'$ is a subset of the set $i = 1, 2, \ldots, n$, it is necessary to satisfy the conditions
$$\frac{\partial f_0(x_*)}{\partial x^i} \ge 0 \quad \text{if } x_*^i = 0, \ i \in J',$$
$$\frac{\partial f_0(x_*)}{\partial x^i} = 0 \quad \text{if } x_*^i > 0, \ i \in J', \ \text{or } i \notin J'. \tag{4.18}$$
“• ■ ^ - 2 ^ 1 = 0
ief
or
$$u^0 \frac{\partial f_0(x_*)}{\partial x^i} = u^i, \ i \in J'; \qquad u^0 \frac{\partial f_0(x_*)}{\partial x^i} = 0, \ i \notin J'. \tag{4.20}$$
It follows from (4.20) that $u^0 > 0$, since if $u^0 = 0$ then all $u^i = 0$, but this contradicts corollary 4.1. Therefore we can assume that $u^0 = 1$.
It follows directly from (4.20) and (4.19) that corollary 4.2 is valid.
Definition 4.6. The point $x_*$ providing the minimum of $f_0(x)$ with constraints (4.1), where $X = E^n$, is called a regular point if gradients $f_i'(x_*)$ for indices $i$ such that $i \in J^- \cup J^0$, $f_i(x_*) = 0$, are linearly independent.
Corollary 4.3. If $x_*$ is a regular point, then in (4.17) we can take $u^0 = 1$ and the multipliers $u^i$, $i \in J^- \cup J^0$, are unique.
Proof. In fact, $u^0 > 0$, since if $u^0 = 0$ then by (4.17) the gradients $f_i'(x_*)$ for which $i \in J^- \cup J^0$, $f_i(x_*) = 0$, would prove linearly dependent. Further, by (4.17) $u^i = 0$ if $f_i(x_*) < 0$. Therefore (4.17) with $u^0 = 1$ is an expansion
$$f_0'(x_*) = -\sum u^i f_i'(x_*)$$
of vector $f_0'(x_*)$ in linearly independent vectors $f_i'(x_*)$ and defines $u^i$ uniquely.
Let there be now only equality constraints in problem (4.1)
$$f_i(x) = 0, \quad i \in J^0,$$
and $X = E^n$. If with such constraints $x_*$ provides the minimum of $f_0(x)$ and gradients $f_i'(x_*)$ are linearly independent, then the necessary conditions for a minimum (4.17) can be written in the form
$$f_0'(x_*) + \sum_{i \in J^0} u^i f_i'(x_*) = 0.$$
The set of vectors $p$ such that
$$(f_i'(x_*), p) = 0, \quad i \in J^0,$$
in the case under consideration is called a tangent (bounding) manifold at point $x_*$ to the set
$$D = \{x : f_i(x) = 0, \ i \in J^0\}.$$
Corollary 4.4. For point $x_*$ at which $f_i'(x_*)$, $i \in J^0$, are linearly independent to provide the minimum of the function $f_0(x)$ in set $D$ it is necessary that gradient $f_0'(x_*)$ be orthogonal to a manifold tangent to $D$ at point $x_*$, i.e. if $p$ belongs to the bounding manifold then $(f_0'(x_*), p) = 0$. In other words, the projection of vector $f_0'(x_*)$ on the tangent manifold vanishes.
P r o o f . If x * w i t h t h e a b o v e a s s u m p t i o n s p r o v i d e s t h e m i n i m u m
then
( / ; ( * . ) . />) = - s »*(/*(*.). p ) ~ o
f o r a n y v e c t o r p o f t h e t a n g e n t m a n i f o l d . C o n v e r s e l y i f (f t ( x * ) , p )
is e q u a l t o z e r o w i t h a n y p , w h i c h b e l o n g s t o t h e h o u n d i n g m a n i f o l d ,
then w e can write
/; ( * . ) = - s . » v j (*.)
s eJ°
T h i s f o l l o w s f r o m l e m m a 1 . 6 if e a c h o f t h e e q u a l i t i e s (ft ( x * ) , p ) = 0
is w r i t t e n d o w n i n t h e f o r m o f t w o i n e q u a l i t i e s :
Minimax Problem
It is required to find the minimum point of the function
$$f(x) = \max_{i = 1, \ldots, m} f_i(x) \tag{4.21}$$
where $f_i(x)$ are continuously differentiable functions, $x \in E^n$. In order to apply the results obtained in the preceding subsection let us reduce the problem of minimization of $f(x)$ to the equivalent problem of mathematical programming. It is easily seen that if we introduce a supplementary variable $x^{n+1}$, then $x_*$, the point of minimum of $f(x)$, will also be the solution of the following problem: to find the minimum of $g_0(x, x^{n+1}) = x^{n+1}$ with the constraints
$$g_i(x, x^{n+1}) \equiv f_i(x) - x^{n+1} \le 0, \quad i = 1, \ldots, m. \tag{4.22}$$
The minimum value of $g_0(x, x^{n+1})$ is $x_*^{n+1} = f(x_*)$.
Let us apply corollary 4.1 of theorem 4.1 to problem (4.22). It is necessary to take into account that the problem will now be solved in space $E^{n+1}$ of variables $x^1, \ldots, x^n, x^{n+1}$ so that the gradients of the functions $g_i(x, x^{n+1})$ have the form
$$g_0' = (0, 1), \qquad g_i' = (f_i'(x), -1), \quad i = 1, \ldots, m.$$
By corollary 4.1 we now have: there are numbers $u^0$, $u^i$, $i = 1, \ldots, m$, not all zero such that
$$u^0 (0, 1) + \sum_{i=1}^m u^i (f_i'(x_*), -1) = 0,$$
$$u^i \ge 0, \quad i = 0, 1, \ldots, m,$$
$$u^i (f_i(x_*) - x_*^{n+1}) = u^i (f_i(x_*) - f(x_*)) = 0, \quad i = 1, \ldots, m. \tag{4.23}$$
The first of relations (4.23) shows that $u^0 = \sum_{i=1}^m u^i$. Hence, since $u^i \ge 0$, we have $u^0 > 0$, for with $u^0 = 0$ all $u^i$ would also be zero. Since expression (4.23) is homogeneous with respect to $u^i$ we can take $u^0 = 1$. Thus we have finally obtained the following result.
Theorem 4.2. For point $x_*$ to provide the minimum of $f(x)$ defined by relation (4.21) it is necessary that there be numbers $u^i$, $i = 1, \ldots, m$, such that
$$\sum_{i=1}^m u^i f_i'(x_*) = 0,$$
$$\sum_{i=1}^m u^i = 1, \quad u^i \ge 0, \quad i = 1, \ldots, m,$$
$$u^i (f_i(x_*) - f(x_*)) = 0, \quad i = 1, \ldots, m. \tag{4.24}$$
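Conditions (4.24) can be verified directly for a candidate point and candidate multipliers. The sketch below assumes a hypothetical one-dimensional problem with $f_1(x) = (x - 1)^2$ and $f_2(x) = (x + 1)^2$; the function `check_theorem_4_2` and the chosen multipliers are illustrative and not part of the text.

```python
def f1(x):
    return (x - 1.0) ** 2


def f2(x):
    return (x + 1.0) ** 2


def df1(x):
    return 2.0 * (x - 1.0)


def df2(x):
    return 2.0 * (x + 1.0)


def check_theorem_4_2(x, u, tol=1e-12):
    """Check the three groups of relations (4.24) at point x with multipliers u."""
    values = [f1(x), f2(x)]
    grads = [df1(x), df2(x)]
    f_max = max(values)  # f(x) = max_i f_i(x)
    stationarity = abs(sum(ui * gi for ui, gi in zip(u, grads))) <= tol
    normalized = abs(sum(u) - 1.0) <= tol and all(ui >= 0 for ui in u)
    complementary = all(abs(ui * (vi - f_max)) <= tol for ui, vi in zip(u, values))
    return stationarity and normalized and complementary


holds_at_zero = check_theorem_4_2(0.0, [0.5, 0.5])  # both functions active at x = 0
holds_at_one = check_theorem_4_2(1.0, [0.0, 1.0])   # only f2 active at x = 1
```

At $x = 0$ both functions equal 1 and their derivatives cancel with equal weights, so the conditions hold; at $x = 1$ only $f_2$ is active and its derivative is nonzero, so no admissible multipliers exist and the check fails for the ones shown.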
Necessary Conditions of the Second Order
Let us again return to the problem of the minimization of $f_0(x)$ with constraints (4.1), $X = E^n$. We use the notation $L(x, u)$ as follows:
$$L(x, u) = f_0(x) + \sum_{i \in J^- \cup J^0} u^i f_i(x). \tag{4.25}$$
In this notation, relation (4.17) with $u^0 = 1$ takes the form
$$L_x'(x_*, u) = 0. \tag{4.26}$$
Assume now that all functions $f_i(x)$ are twice continuously differentiable, i.e. that there are continuous matrices of second derivatives $f_i''(x)$. Then the matrix of second derivatives $L_{xx}''(x_*, u)$ of function $L(x, u)$ with respect to $x$ is also defined.
The assumption that $x_*$ is a regular point implies that the multipliers $u^i$, $i \in J^- \cup J^0$, in relation (4.17) are uniquely determined.
We introduce the following notations
J o ( * ♦ ) = {*: a f > 0 ,
$$J^-(x_*) = \{i : f_i(x_*) = 0, \ i \in J^-\}.$$
Point $x_*$ being regular, vectors $f_i'(x_*)$, $i \in J_p(x_*)$, are linearly independent. Therefore it can be demonstrated that there is a function $x(\lambda) \in E^n$ such that
$$f_i(x(\lambda)) = 0, \quad i \in J_p(x_*). \tag{4.29}$$
Further, if $i \notin J_p(x_*)$, then either $f_i(x_*) < 0$ or $(f_i'(x_*), p) < 0$, which ensures the inequality $f_i(x(\lambda)) < 0$ with small $\lambda$. Thus point $x(\lambda)$ with small $\lambda$ satisfies all the constraints (4.1), $X = E^n$. Using this fact as well as (4.27)-(4.29) we obtain
$$f_0(x(\lambda)) = L(x(\lambda), u)$$
since if $u^i \ne 0$ then from (4.29) $f_i(x(\lambda)) = 0$. At the same time we obtain from (4.17) that $f_0(x_*) = L(x_*, u)$. Taking into account that $x(\lambda)$ satisfies all of the relations (4.1) and that $x_*$ is the minimum point of $f_0(x)$ with constraints (4.1) we obtain for small $\lambda$:
$$L(x(\lambda), u) \ge L(x_*, u).$$
Expanding $L(x(\lambda), u)$ to second-order terms in powers of $\lambda$ we obtain
$$L(x(\lambda), u) = L(x_*, u) + (L_x'(x_*, u), x(\lambda) - x_*) + \tfrac12 \big(L_{xx}''(\xi(\lambda), u)(x(\lambda) - x_*), x(\lambda) - x_*\big)$$
where $\xi(\lambda)$ is a point in the segment which joins $x_*$ and $x(\lambda)$, so that $\xi(\lambda) \to x_*$ as $\lambda \to 0$. Using (4.26) we obtain
$$\big(L_{xx}''(\xi(\lambda), u)(x(\lambda) - x_*), x(\lambda) - x_*\big) \ge 0.$$
Dividing by $\lambda^2$ and taking $\lambda \to 0$ we finally obtain
$$(L_{xx}''(x_*, u)p, p) \ge 0.$$
The following theorem is proved.
Theorem 4.3. Let functions $f_i(x)$ be twice continuously differentiable and $x_*$ be a regular point of minimum of $f_0(x)$ with constraints (4.1), $X = E^n$. Then there are numbers $u^i$, $i \in J^- \cup J^0$, such that
$$L_x'(x_*, u) = 0, \quad u^i \ge 0, \ i \in J^-, \quad u^i f_i(x_*) = 0, \ i \in J^-,$$
and
$$(L_{xx}''(x_*, u)p, p) \ge 0$$
for all $p$ which satisfy inequalities (4.27).
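The second-order condition concerns only a restricted set of directions, which a small example makes visible. The sketch below is built on assumptions not in the text: the hypothetical problem of minimizing $f_0(x) = (x^1)^2 - (x^2)^2$ subject to the single equality constraint $f_1(x) = x^2 = 0$, for which $x_* = 0$ and $u = 0$. It is also assumed that, for an equality constraint, the directions $p$ in question satisfy $(f_1'(x_*), p) = 0$.

```python
def quad_form(M, p):
    """Value of (Mp, p) for a 2x2 matrix M given as nested lists."""
    return sum(p[i] * M[i][j] * p[j] for i in range(2) for j in range(2))


# Second derivatives of L(x, u) = x1^2 - x2^2 + u * x2 with respect to x.
L_xx = [[2.0, 0.0], [0.0, -2.0]]

# Gradient of the hypothetical constraint f1(x) = x2.
grad_f1 = [0.0, 1.0]

# Directions with (grad_f1, p) = 0, i.e. p = (t, 0) for a few sample t.
tangent_dirs = [[t, 0.0] for t in (-2.0, -0.5, 1.0, 3.0)]

nonneg_on_tangent = all(quad_form(L_xx, p) >= 0.0 for p in tangent_dirs)
indefinite_overall = quad_form(L_xx, [0.0, 1.0]) < 0.0
```

The matrix is indefinite on the whole space, yet the quadratic form is nonnegative along every sampled direction orthogonal to the constraint gradient, which is all that the necessary condition requires.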
5. SOME ADDITIONAL INFORMATION
The Newton-Leibnitz formula which establishes the connection between a scalar function $f(x)$ and its derivative is treated in mathematical analysis. This formula is generalized and applies to operator functions.
If $F(x)$ is a differentiable operator function defined in an open convex set $Q \subseteq E_n$ and $x, x + h \in Q$, then
$$F(x + h) - F(x) = \int_0^1 F'(x + \alpha h)\, h \, d\alpha. \tag{5.1}$$
The proof of this formula (which is also valid for operators defined in functional spaces) can be found, for instance, in the book by A. N. Kolmogorov and S. V. Fomin.
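Formula (5.1) can be checked numerically on a small example. The sketch below is only an illustration: the operator `F`, its hand-written derivative and the midpoint quadrature are assumptions introduced here, not taken from the text.

```python
def F(x):
    # Hypothetical operator F: E_2 -> E_2, chosen only for illustration.
    return [x[0] ** 2 + x[1], x[0] * x[1]]


def F_prime_times(x, h):
    # Derivative of F at x applied to h; the Jacobian is [[2*x0, 1], [x1, x0]].
    return [2.0 * x[0] * h[0] + h[1], x[1] * h[0] + x[0] * h[1]]


def integral_of_derivative(x, h, steps=1000):
    # Midpoint rule for the integral over alpha in [0, 1] of F'(x + alpha*h) h.
    total = [0.0, 0.0]
    for j in range(steps):
        alpha = (j + 0.5) / steps
        point = [x[0] + alpha * h[0], x[1] + alpha * h[1]]
        term = F_prime_times(point, h)
        total = [total[0] + term[0] / steps, total[1] + term[1] / steps]
    return total


x, h = [1.0, 2.0], [0.5, -1.0]
x_plus_h = [x[0] + h[0], x[1] + h[1]]
lhs = [a - b for a, b in zip(F(x_plus_h), F(x))]  # F(x + h) - F(x)
rhs = integral_of_derivative(x, h)                # right-hand side of (5.1)
print(lhs, rhs)
```

For this particular `F` the integrand is linear in $\alpha$, so the midpoint rule reproduces the integral up to rounding and both sides agree.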
Let us state one more property of operator functions.
If $F(x)$ is a nonlinear differentiable operator function, then for any $x, h, y \in E_n$ the following formula is valid:
$$(F(x + h) - F(x), y) = (F'(x + \theta h)h, y), \quad 0 \le \theta \le 1. \tag{5.2}$$
This formula is called Lagrange's formula for operators (or Lagrange's generalized formula). Its proof (for operators of a more general form) can be found in M. M. Vainberg's monograph [1].
In the following chapters we shall have many occasions to use Taylor's formula with the remainder term in Lagrange's form.
If $f(x)$ is a twice continuously differentiable function in a convex set $Q$, then for any $x, x + h \in Q$ and $\alpha \in [0, 1]$
$$f(x + \alpha h) = f(x) + \alpha (f'(x + \alpha\theta_1 h), h),$$
$$f(x + \alpha h) = f(x) + \alpha (f'(x), h) + \frac{\alpha^2}{2}(f''(x + \alpha\theta_2 h)h, h)$$
where $\theta_1, \theta_2 \in [0, 1]$.
Bibliographic Notes
The properties of convex sets and convex functions in finite-dimensional spaces are described by S. Karlin, G. Zoutendijk [1], H. Künzi and W. Krelle. The most comprehensive description is given in R. T. Rockafellar's book. The properties of convex sets and functions in functional spaces are investigated in detail by N. Dunford and J. T. Schwartz.
Special properties connected with strict and strong convexity are considered by E. S. Levitin and B. T. Polyak.
Among the many works devoted to the theory of necessary conditions for extrema we note the books by S. Karlin, G. Zoutendijk [1], S. I. Zukhovitskii and L. I. Avdeyeva, H. Künzi and W. Krelle, which consider the problems of linear and convex programming in finite-dimensional spaces. A more complete theory of the first-order necessary conditions in the general case can be found in the works by A. Ya. Dubovitsky and A. A. Milyutin, L. W. Neustadt, H. Halkin and L. W. Neustadt, B. N. Pshenichny [1], M. R. Hestenes [2], K. Arrow, L. Hurwicz and H. Uzawa. M. R. Hestenes also considers the second-order necessary conditions.
The problems of linear programming as well as computational algorithms and the theory of duality in linear programming are treated in detail by D. Gale, G. B. Dantzig [1], S. Karlin, S. I. Zukhovitskii and L. I. Avdeyeva. The general case of the theory of duality in convex programming is treated by E. G. Gol'stein [1], [2] and R. T. Rockafellar.
The theorem on implicit functions, which is used in deducing the necessary conditions for extrema, can be found in G. M. Fikhtengol'ts' book.
CHAPTER II
METHODS OF UNCONSTRAINED FUNCTION MINIMIZATION
This chapter is devoted to the problem of minimization of the function $f(x)$ defined in an $n$-dimensional Euclidean space $E_n$. Accordingly, in this chapter $x$ is always an $n$-dimensional vector.
In solving the problem we shall use iterative processes of the type
$$x_{k+1} = x_k + \alpha_k p_k \tag{0.1}$$
where $p_k$ is a vector determining the direction of motion from point $x_k$ and $\alpha_k$ is a numerical factor whose value determines the length of the step in the direction of $p_k$.
The process (0.1) will be defined if the methods of constructing vector $p_k$ and computing the value of $\alpha_k$ are given for every iteration. The properties of the process (the values of the function for different elements of the sequence $\{x_k\}$, convergence of the sequence to the solution, the rate of convergence, etc.) depend directly on the method chosen. At the same time various methods of constructing vector $p_k$ and determining $\alpha_k$ require different amounts of calculation and impose different constraints on the function to be minimized.
Let us state the considerations on which we shall base our choice of the direction of motion and the step length.
In order to get nearer to point $x_*$ (in the general case $x_*$ is the point at which the necessary conditions for an extremum of function $f(x)$ are satisfied, possibly within a certain accuracy), one should naturally move from point $x_k$ in the direction in which the function decreases, i.e. in the direction of descent. If point $x_k$ is not the point of minimum or a stationary point, then there is an infinite number of vectors $p$ which determine a direction of descent from point $x_k$, and each such vector is defined by
$$(f'(x_k), p) < 0$$
($f(x)$ is differentiable).
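The descent condition $(f'(x_k), p) < 0$ is easy to test directly. The following is a minimal sketch under stated assumptions: the function `f`, the point `x_k` and the candidate directions are hypothetical and serve only to illustrate the inequality.

```python
def inner(u, v):
    # Scalar product (u, v) in E_n.
    return sum(a * b for a, b in zip(u, v))


def f(x):
    # Hypothetical smooth test function.
    return x[0] ** 2 + 3.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 6.0 * x[1]]


def is_descent_direction(g, p):
    # p is a direction of descent at x_k when (f'(x_k), p) < 0.
    return inner(g, p) < 0.0


x_k = [1.0, 1.0]
g = grad_f(x_k)
candidates = {
    "anti-gradient": [-2.0, -6.0],
    "tilted": [-1.0, 0.2],
    "uphill": [1.0, 0.0],
}
results = {name: is_descent_direction(g, p) for name, p in candidates.items()}

# A small step along p decreases f exactly for the directions flagged above.
t = 1e-3
decreases = {
    name: f([x_k[0] + t * p[0], x_k[1] + t * p[1]]) < f(x_k)
    for name, p in candidates.items()
}
print(results, decreases)
```

The anti-gradient is only one of infinitely many admissible directions; the "tilted" vector also satisfies the inequality.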
This is seen from the following argument.
Let $x = x_k + \alpha p$. Expansion of the function in Taylor's series about $x_k$ (it is obviously assumed that the function is differentiable and has a sufficient number of derivatives) gives
1. GRADIENT METHODS
Method of Steepest Descent
The simplest approach to the choice of the direction $p_k$ in order to satisfy the condition $(f'_k, p_k) < 0$ (i.e. to obtain a direction of descent of $f(x)$) is to assume $p_k = -f'_k$.
The iterative process
$$x_{k+1} = x_k - \alpha_k f'_k, \quad \alpha_k > 0, \quad k = 0, 1, \ldots \tag{1.1}$$
which results from such a choice of the direction of motion is called the method of steepest descent or gradient method.
In terms of coordinates, process (1.1) is written down in the following form:
$$x^i_{k+1} = x^i_k - \alpha_k \frac{\partial f(x_k)}{\partial x^i}, \quad i = 1, \ldots, n; \quad k = 0, 1, \ldots$$
At present the method of steepest descent is one of the best known minimization methods.
The popularity of the method is due to its being comparatively simple and suitable for application to the minimization of a very broad class of functions.
We turn now to the study of the properties of the algorithm (1.1). First of all we describe the method of choosing the magnitude of the scalar factor $\alpha_k$.
(1) Take an arbitrary value of $\alpha$ (the same at all iterations) and determine point $x = x_k - \alpha f'_k$.
(2) Compute $f(x) = f(x_k - \alpha f'_k)$.
(3) Verify the inequality
$$f(x) - f(x_k) \le \varepsilon\alpha (f'_k, p_k) \tag{1.2}$$
where $0 < \varepsilon < 1$ is an arbitrarily chosen constant (the same for any $k = 0, 1, \ldots$).
(4) If inequality (1.2) is satisfied, then the value of $\alpha$ is taken to be the sought one: $\alpha_k = \alpha$. However, if the inequality is not satisfied, we reduce $\alpha$ (multiplying $\alpha$ by an arbitrary $\lambda < 1$) until inequality (1.2) is satisfied.
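Steps (1)-(4) can be sketched as a short routine. This is an illustrative sketch, not the authors' implementation: the names `choose_step` and `gradient_method`, the default values of `eps` and `lam`, the cap on the number of reductions and the quadratic test function are all assumptions made here.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def choose_step(f, x_k, g_k, alpha=1.0, eps=0.5, lam=0.5, max_reductions=60):
    # Steps (1)-(4): multiply alpha by lam until inequality (1.2) holds.
    p_k = [-g for g in g_k]                 # p_k = -f'_k
    f_k = f(x_k)
    slope = inner(g_k, p_k)                 # (f'_k, p_k) = -||f'_k||^2
    for _ in range(max_reductions):
        x = [xi + alpha * pi for xi, pi in zip(x_k, p_k)]
        if f(x) - f_k <= eps * alpha * slope:
            break
        alpha *= lam
    return alpha, x


def gradient_method(f, grad, x0, tol=1e-12, max_iter=5000):
    # Process (1.1); stops when ||f'_k||^2 <= tol.
    x, values = list(x0), [f(x0)]
    for _ in range(max_iter):
        g = grad(x)
        if inner(g, g) <= tol:
            break
        _, x = choose_step(f, x, g)
        values.append(f(x))
    return x, values


def f(x):
    # Hypothetical test function with minimum value 0 at the origin.
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_min, values = gradient_method(f, grad_f, [3.0, 1.0])
print(x_min, values[-1], len(values) - 1)
```

Because every accepted step satisfies (1.2), the recorded function values decrease strictly from one iteration to the next.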
The above method of choosing $\alpha_k$ needs substantiation; the conditions for the existence of nonzero values of $\alpha$ which satisfy inequality (1.2) must be established. Such a substantiation is given in the following theorem.
Theorem 1.1. If function $f(x)$ is bounded from below, its gradient $f'(x)$ satisfies Lipschitz' condition (with constant $R$) ...
$$f(x) - f(x_k) \le -\alpha\|f'_k\|^2 + \alpha R\,\|x - x_k\|\,\|f'_k\| = \alpha\|f'_k\|^2(-1 + \alpha R).$$
The estimate obtained shows that there are values $\alpha > 0$ such that the inequality (1.2) is satisfied; to obtain this result it suffices to choose $\alpha$ such that $-1 + \alpha R \le -\varepsilon$. This is always feasible, since $R$ is a finite quantity and $0 < \varepsilon < 1$. Consequently, (1.2) will always be satisfied with $\alpha \le \frac{1 - \varepsilon}{R}$. Thus, choosing $\alpha_k$ in accordance with the above algorithm we obtain
$$f_{k+1} - f_k \le -\varepsilon\alpha_k\|f'_k\|^2, \tag{1.5}$$
i.e. for any $k$ we have $f_{k+1} - f_k < 0$ (provided $\|f'_k\| \ne 0$). Since by hypothesis the function has a lower bound, the last inequality gives, as $k \to \infty$,
$$f_{k+1} - f_k \to 0. \tag{1.6}$$
It follows from (1.5) that
$$\|f'_k\|^2 \le \frac{f_k - f_{k+1}}{\varepsilon\alpha_k}. \tag{1.7}$$
We note that this algorithm for choosing $\alpha_k$ ensures that $\alpha_k \ge \bar{\alpha} > 0$ for any $k$, where $\bar{\alpha}$ can be any constant which does not exceed the quantity $\frac{1 - \varepsilon}{R}$, since, as was mentioned, the inequality (1.2) or (1.5) is certainly satisfied with $\alpha \le \frac{1 - \varepsilon}{R}$. With this remark, conditions (1.6) and (1.7) imply that $\|f'_k\| \to 0$ as $k \to \infty$, and this proves the theorem.
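The bound $\alpha \le (1 - \varepsilon)/R$ from the proof can be tried on a function that is bounded below but not convex. The sketch below assumes a hypothetical function $f(x) = \cos x^1 + (x^2)^2$, whose gradient has Lipschitz constant $R = 2$; neither the function nor the sample points come from the text.

```python
import math


def f(x):
    # Hypothetical test function: bounded below by -1 and not convex.
    return math.cos(x[0]) + x[1] ** 2


def grad_f(x):
    return [-math.sin(x[0]), 2.0 * x[1]]


R = 2.0    # Lipschitz constant of grad_f: its Jacobian is diag(-cos x0, 2)
eps = 0.5
alpha = (1.0 - eps) / R


def satisfies_1_2(x_k):
    # Checks f(x) - f(x_k) <= eps*alpha*(f'_k, p_k) with p_k = -f'_k.
    g = grad_f(x_k)
    x = [xi - alpha * gi for xi, gi in zip(x_k, g)]
    gg = sum(gi * gi for gi in g)          # ||f'_k||^2
    return f(x) - f(x_k) <= -eps * alpha * gg


points = [[0.3, 1.0], [2.0, -1.5], [-1.0, 0.2], [4.0, 3.0]]
checks = [satisfies_1_2(p) for p in points]
print(checks)
```

At every sample point the step $\alpha = (1 - \varepsilon)/R$ is accepted without any reduction, as the estimate predicts.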
The class of functions satisfying the requirements of theorem 1.1 is very broad. Such functions can have no minimum point at all, can have local minima, saddle points, etc. Theorem 1.1 shows that the gradient method provides for convergence either to the exact lower bound $\inf_x f(x)$ or to a value of the function at a certain stationary point ...
Proof. The existence of the unique minimizer of $f(x)$ under the conditions of the theorem follows from the results of lemma 3.2 (Ch. I). Therefore we have only to prove the convergence of sequence $\{x_k\}$ to point $x_*$ and to obtain estimates (1.9). Let us first establish that the first of estimates (1.9) holds. Using Taylor's formula we obtain
$$f(x_*) = f(x) + (f'(x), x_* - x) + \frac{1}{2}(f''(x_c)(x_* - x), x_* - x).$$
Hence, applying (1.8) we have
...
and therefore, from (1.8), we obtain
$$\|x - x_*\| \le \frac{1}{m}\|f'(x)\|, \tag{1.12}$$
and from the right-hand side of inequality (1.11)
$$\|x - x_*\|^2 \ge \frac{2}{M}[f(x) - f(x_*)].$$
With the estimates obtained we can write (1.10) in the form
...
$$f(x_k - \alpha f'_k) - f_k = -\alpha\|f'_k\|^2 + \frac{\alpha^2}{2}(f''_{kc} f'_k, f'_k) \le -\alpha\left(1 - \frac{\alpha M}{2}\right)\|f'_k\|^2.$$
...
where $q = 1 - \varepsilon\bar{\alpha} m\left(1 + \frac{m}{M}\right) < 1$, i.e.
$$f_k - f_* \le q^k (f_0 - f_*). \tag{1.15}$$
Since $\bar{\alpha} = \frac{2(1 - \varepsilon)}{M}$ we have
$$q = 1 - 2\varepsilon(1 - \varepsilon)\frac{m}{M}\left(1 + \frac{m}{M}\right).$$
Hence the minimum value $q_{\min}$ of the ratio of the progression is attained with $\varepsilon = \frac{1}{2}$, and then
$$q_{\min} = 1 - \frac{m}{2M}\left(1 + \frac{m}{M}\right).$$
Consequently it is expedient to take $\varepsilon = \frac{1}{2}$ in the condition (1.2).
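The choice $\varepsilon = 1/2$ can be confirmed by a one-line calculation. The derivation below assumes that the ratio has the form $q(\varepsilon) = 1 - 2\varepsilon(1 - \varepsilon)\frac{m}{M}\left(1 + \frac{m}{M}\right)$; it is a worked check, not part of the original proof.

```latex
\frac{dq}{d\varepsilon}
  = -2(1 - 2\varepsilon)\,\frac{m}{M}\left(1 + \frac{m}{M}\right) = 0
  \;\Longrightarrow\; \varepsilon = \tfrac{1}{2},
\qquad
\frac{d^2 q}{d\varepsilon^2}
  = 4\,\frac{m}{M}\left(1 + \frac{m}{M}\right) > 0,
\qquad
q\!\left(\tfrac{1}{2}\right)
  = 1 - \frac{m}{2M}\left(1 + \frac{m}{M}\right).
```

Since the second derivative is positive, the stationary point is a minimum of $q$ on $0 < \varepsilon < 1$.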
Estimate (1.15) together with the left-hand one of estimates (1.11) makes it possible to establish the convergence and estimate the rate of convergence of the sequence $\{x_k\}$ to the point of minimum:
$$\|x_k - x_*\|^2 \le \frac{2}{m}(f_k - f_*) \le \frac{2}{m}\, q^k (f_0 - f_*).$$
...
The proof of the validity of estimate (1.15) in this case is not really connected with the existence of a minimum; one can suppose that $f_* = \inf f(x)$ without trying to establish whether the exact lower bound is attained. It should be stressed, however, that functions of this class do have a minimum (not necessarily a unique one) and that sequence $\{x_k\}$ converges to a certain point $x_*$, the second of estimates (1.9) holding true for the rate of convergence.
Indeed, from (1.1) and (1.7) we have
$$\|x_{k+1} - x_k\|^2 = \alpha_k^2\|f'_k\|^2 \le \frac{\alpha_{\max}}{\varepsilon}(f_k - f_{k+1})$$
where $\alpha_{\max}$ is the maximum value of the parameter at which we ...
$$\|x_k - x_*\| = \lim_{m \to \infty}\|x_m - x_k\| \le \frac{C q^{k/2}}{1 - q^{1/2}}.$$
Variants of the Method
The method of choosing $\alpha_k$ in process (1.1) which involves the checking of inequality (1.2) as described above is not the only one possible. We shall now consider several other methods of choosing the value of $\alpha_k$; each of these methods determines a different variant of the gradient method. In proving theorems 1.1 and 1.2 it was established that inequality (1.2) is always satisfied with values of $\alpha \le \frac{1 - \varepsilon}{R}$ (theorem 1.1) or $\alpha \le \frac{2(1 - \varepsilon)}{M}$ (theorem 1.2). This circumstance made it possible to prove the statements about the properties of method (1.1) when $\alpha_k$ is chosen under the condition of satisfying inequality (1.2). If constants $R$ or $M$ are known which characterize the function $f(x)$ being minimized, then in applying method (1.1) ...
$$\|x_k - x_*\| \le q^k\|x_0 - x_*\|,$$
$$q = \max\{|1 - \alpha m|,\ |1 - \alpha M|\}.$$
...
Applying Lagrange's formula for operators (5.2) (Chap. I) we obtain
...
Using this result we have
$$\|x_{k+1} - x_*\|^2 = ((I - \alpha f''_{kc})(x_k - x_*),\, x_{k+1} - x_*) \le \|I - \alpha f''_{kc}\|\,\|x_k - x_*\|\,\|x_{k+1} - x_*\|,$$
i.e.
$$\|x_{k+1} - x_*\| \le \|I - \alpha f''_{kc}\|\,\|x_k - x_*\| = q\|x_k - x_*\|.$$
By conditions (1.8)
$$q = \|I - \alpha f''_{kc}\| \le \max\{|1 - \alpha m|,\ |1 - \alpha M|\}.$$
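The contraction estimate with a constant step can be observed on a diagonal quadratic, where the bound is attained. The sketch below assumes the hypothetical function $f(x) = \frac{1}{2}(m (x^1)^2 + M (x^2)^2)$ with $x_* = 0$; the step $\alpha = 2/(M + m)$, which equalizes the two terms under the maximum, is used as the reference choice.

```python
import math


def ratio(alpha, m, M):
    # q = max{|1 - alpha*m|, |1 - alpha*M|}
    return max(abs(1.0 - alpha * m), abs(1.0 - alpha * M))


def constant_step_norms(x0, alpha, m, M, iters=20):
    # Gradient method with constant step on f(x) = (m*x0^2 + M*x1^2)/2.
    x, norms = list(x0), [math.hypot(x0[0], x0[1])]
    for _ in range(iters):
        x = [x[0] - alpha * m * x[0], x[1] - alpha * M * x[1]]
        norms.append(math.hypot(x[0], x[1]))
    return norms


m, M = 1.0, 9.0
alpha_best = 2.0 / (M + m)
q_best = ratio(alpha_best, m, M)          # equals (M - m)/(M + m) here
norms = constant_step_norms([1.0, 1.0], alpha_best, m, M)
print(q_best, norms[:3])
```

Any other constant step gives a larger value of `ratio`, for example `ratio(0.1, m, M)` or `ratio(0.21, m, M)`.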
... (1.18)
Hence by the same argument as in theorem 1.1 we find that $\|f'_k\| \to 0$. Q.E.D.
The estimates (1.9) for the variant of the gradient method where the step length is chosen according to (1.17) can be verified in a way analogous to that used in theorem 1.2, with the only difference that expression (1.13) should be used in the estimate $f_{k+1} - f_k \le -\frac{1}{2M}\|f'_k\|^2$ obtained by the same argument as for (1.18). However, we proceed using the results of theorem 1.3. This ensures higher accuracy for the value of the ratio $q$.
Set $\tilde{x}_{k+1} = x_k - \frac{2}{M + m} f'_k$; then estimate (1.16)
$$f(\tilde{x}_{k+1}) - f_* \le \ldots\,(f_k - f_*)$$
holds.
If point $x_{k+1}$ is chosen by applying the condition for function minimization in the direction of descent, then
...
where
...
Thus the following theorem is valid.
Theorem 1.5. If function $f(x)$ satisfies the conditions of theorem 1.2 and in applying method (1.1) $\alpha_k$ is chosen according to condition (1.17), then sequence $\{x_k\}$ converges to the minimum at the rate of a geometric progression whose ratio is $q = \frac{M - m}{M + m}$.
Note that the variant of method (1.1) where the step length is chosen according to the condition of function minimization in the direction of descent is often called in the literature the method of steepest descent.
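For a quadratic function the minimizing step along the anti-gradient has a closed form, which makes this variant easy to try out. The sketch below is an illustration under assumptions: the diagonal quadratic, the starting point and the closed-form step $\alpha_k = (g, g)/(Ag, g)$ are introduced here, and the per-step bound $((M - m)/(M + m))^2$ on the function values is the classical estimate for quadratics rather than a formula quoted from the text.

```python
def steepest_descent_quadratic(x0, m, M, iters=15):
    # f(x) = (m*x0^2 + M*x1^2)/2 with matrix A = diag(m, M); the exact
    # minimizing step along -f'_k is alpha_k = (g, g) / (A g, g).
    x = list(x0)
    values = [0.5 * (m * x[0] ** 2 + M * x[1] ** 2)]
    for _ in range(iters):
        g = [m * x[0], M * x[1]]
        gg = g[0] ** 2 + g[1] ** 2
        if gg == 0.0:
            break
        alpha = gg / (m * g[0] ** 2 + M * g[1] ** 2)
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
        values.append(0.5 * (m * x[0] ** 2 + M * x[1] ** 2))
    return values


m, M = 1.0, 9.0
q = (M - m) / (M + m)
values = steepest_descent_quadratic([9.0, 1.0], m, M)
print(q, values[:3])
```

The starting point is chosen so that the iterates zigzag between two directions, which is the slowest case for this quadratic.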
... $(F_k^{-1}p, p) \ge m_1\|p\|^2$, $m_1 > 0$, ...
will be satisfied and therefore
$$(f'_k, p_k) = -(F_k^{-1}f'_k, f'_k) \le -m_1\|f'_k\|^2 \le 0. \tag{1.22}$$
Different iterative processes will correspond to different sequences $\{F_k^{-1}\}$.
As far as the principles of the methods are concerned, the study of method (1.20) does not involve any new elements as compared to the "pure" gradient method (1.1). All the results obtained for method (1.1) remain valid also for method (1.20) with the same requirements on the function being minimized and the same methods of choosing the step length. Only the technique of proving the corresponding statements is slightly changed. Of course, the quantitative values of the parameters in (1.20) will differ from the values of the analogous parameters in method (1.1). In particular, this relates to the value of the ratio $q$ in the estimates of the rate of convergence.
We shall dwell now only on the results for method (1.20) which will be made use of later on.
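A scaled gradient iteration of this kind can be sketched as follows. The form $x_{k+1} = x_k + \alpha_k p_k$, $p_k = -F^{-1} f'_k$ with a fixed positive diagonal matrix $F^{-1}$ is an assumption made for illustration; the function name, the scaling values and the test function are hypothetical, and the step is chosen by the check of inequality (1.2).

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def scaled_gradient_method(f, grad, x0, inv_diag, eps=0.25, lam=0.5,
                           tol=1e-12, max_iter=200):
    # Assumed form of the scaled method: p_k = -F^{-1} f'_k with the fixed
    # positive diagonal matrix F^{-1} = diag(inv_diag).
    x, slopes = list(x0), []
    for _ in range(max_iter):
        g = grad(x)
        if inner(g, g) <= tol:
            break
        p = [-d * gi for d, gi in zip(inv_diag, g)]
        slope = inner(g, p)                 # (f'_k, p_k), negative by (1.22)
        slopes.append(slope)
        alpha, f_k = 1.0, f(x)
        for _ in range(60):
            trial = [xi + alpha * pi for xi, pi in zip(x, p)]
            if f(trial) - f_k <= eps * alpha * slope:
                break
            alpha *= lam
        x = trial
    return x, slopes


def f(x):
    # Hypothetical test function.
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_min, slopes = scaled_gradient_method(f, grad_f, [3.0, 1.0], [1.0 / 2.0, 1.0 / 16.0])
print(x_min, len(slopes))
```

With a scaling close to the inverse of the second-derivative matrix the iteration needs far fewer steps than the unscaled process on the same function.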
Theorem 1.6. The results of theorem 1.2 remain valid for method (1.20).
Proof. If $x = x_k + \alpha p_k$, where $p_k = -F_k^{-1} f'_k$, then
$$f(x) - f(x_k) = \alpha (f'_k, p_k) + \frac{\alpha^2}{2}(f''_{kc} p_k, p_k) \le \alpha (f'_k, p_k)(1 - \ldots)$$
... that $f_{k+1} < f_k$. Using now (1.24) and having in mind that $f(x)$ is bounded from below, by analogy with the proof in theorem 1.1 that $\|f'_k\| \to 0$, we establish that $(f'_k, p_k) \to 0$ as $k \to \infty$. By (1.22), this means that $\|f'_k\| \to 0$. Hence, since $f(x)$ is strongly convex, sequence (1.20) converges to the solution $x_*$. In order to obtain bounds on the rate of convergence of $f_k \to f_*$, $x_k \to x_*$, let us write inequality (1.24), using (1.22), in the form $f_{k+1} - f_k \le -\varepsilon\alpha_k m_1\|f'_k\|^2$. Further, estimating $\|f'_k\|$ in this expression with the aid of inequality (1.13) and applying the argument of theorem 1.2, we establish the validity for method (1.20) of the estimates of the convergence rate (1.9). The value of the ratio of the progression is
$$q = 1 - \varepsilon\bar{\alpha}\, m_1 m\left(1 + \frac{m}{M}\right) = \ldots$$
The minimum of $q$ is attained with $\varepsilon = \frac{1}{2}$:
$$q_{\min} = \ldots$$
The theorem is proved.
It follows from the proof that process (1.20) remains convergent if we set $\alpha_k = \alpha$, $0 < \alpha < 2/\ldots$ (a variant of the method with constant step). By the same argument as in theorem 1.3, one can obtain the estimate
$$\|x_{k+1} - x_*\| \le \|I - \alpha F_k^{-1} f''_{kc}\|\,\|x_k - x_*\|.$$
However, it is impossible to obtain an estimate of the ratio of the progression as was done in theorem 1.3, since the matrix $F_k^{-1} f''_{kc}$ is not positive definite in the general case (this last property holds only on condition that matrices $F_k^{-1}$ and $f''(x_{kc})$ commute).
We can consider a variant of method (1.20) in which the step length is chosen using the condition that $f(x)$ attains its minimum in the direction of descent.
Theorem 1.7. If function $f(x)$ satisfies the conditions of theorem 1.2 and in applying method (1.20) parameter $\alpha_k$ is chosen using the condition
$$f(x_k + \alpha_k p_k) = \min_{\alpha \ge 0} f(x_k + \alpha p_k),$$
then sequence $\{x_k\}$ converges to the point of minimum at the rate of a geometric progression.
The theorem can be proved according to the following scheme. Expanding the function into Taylor's series to second-order terms about point $x_k$ and reasoning as in theorem 1.4 we can obtain the estimate:
$$f_{k+1} - f_k \le -\frac{(f'_k, p_k)^2}{2M\|p_k\|^2}.$$
This inequality, because of (1.22) and (1.23), is equivalent to
$$f_{k+1} - f_k \le -\frac{\ldots}{2M}\|f'_k\|^2.$$
Further, estimating $\|f'_k\|$ with the use of inequality (1.13), one should repeat completely the argument of theorem 1.2. We cannot obtain in this case a more precise value of the ratio $q$, since we know it must be greater than in the method of steepest descent.
Qualitative Analysis of the Methods
Let us compare the gradient methods considered above and consider certain statements on the quality of these algorithms, i.e. on their effectiveness in solving minimization problems.
We have studied three variants of method (1.1) differing in the method of choosing the step length. The properties of the variants resemble each other closely. They can be used in minimizing functions of like ...
2. NEWTON'S METHOD WITH STEP ADJUSTMENT
Construction of the Method
In gradient methods only the linear term of the expansion of the function in Taylor's series is used in choosing the direction of motion, i.e. use is made of the crudest approximation to the function being minimized.
Let function $f(x)$ whose minimum is to be determined be strictly convex and sufficiently smooth.
Consider the function
$$\psi(x) = f(y) + (f'(y), x - y) + \frac{1}{2}(f''(y)(x - y), x - y)$$
which is a quadratic approximation to $f(x)$ in the neighbourhood of a certain point $y$. Since function $f(x)$ is strictly convex, function $\psi(x)$, as can easily be ascertained, is also strictly convex; therefore the minimum of this function is attained at a unique point $\bar{y}$, and the vector $p = \bar{y} - y$ which minimizes $\psi(x)$ is determined from the formula $p = -(f''(y))^{-1} f'(y)$. The direction determined by vector $p$ is that of descent of $f(x)$, since $(f'(y), p) = -(f''(y)p, p) < 0$ by virtue of $f(x)$ being convex. The quadratic function $\psi(x)$ is, in the neighbourhood of point $y$, a far better approximation to the function being minimized than a linear function. Therefore one naturally expects, at least if point $y$ is in a sufficiently small neighbourhood of solution $x_*$, that by moving from point $y$ in the direction $p = -(f''(y))^{-1} f'(y)$ one can attain a more significant decrease of the function and obtain a more accurate approximation to the solution than by moving in the direction $-f'(y)$ which is used in the gradient method. On the grounds of the above argument we suppose that the iterative process
$$x_{k+1} = x_k - \alpha_k (f''_k)^{-1} f'_k, \quad \alpha_k > 0, \quad k = 0, 1, \ldots \tag{2.1}$$
when used to construct successive approximations to the solution of the problem of minimization of function $f(x)$ will prove more effective than the method of steepest descent, i.e. that the rate of convergence of $x_k \to x_*$, $f(x_k) \to f(x_*)$ when using algorithm (2.1) will be faster than when applying the gradient method. The results of this section will show that our expectation is justified.
We shall call method (2.1) Newton's method with adjustment of steps, or the generalized Newton method.
The usual Newton method corresponds to the case when $\alpha_k = 1$.
Denoting the elements of matrix $(f''_k)^{-1}$ by $c_{ij}(x_k)$, $i, j = 1, 2, \ldots, n$, where $i$ is the row index, we can write method (2.1) in its coordinate form:
$$x^i_{k+1} = x^i_k - \alpha_k \sum_{j=1}^{n} c_{ij}(x_k)\frac{\partial f(x_k)}{\partial x^j}, \quad i = 1, \ldots, n.$$
Note that method (2.1) can also be presented in the following form:
$$f''_k p_k = -f'_k, \quad x_{k+1} = x_k + \alpha_k p_k$$
or in coordinate form
$$\sum_{j=1}^{n} \frac{\partial^2 f(x_k)}{\partial x^i\, \partial x^j}\, p^j_k = -\frac{\partial f(x_k)}{\partial x^i}, \quad x^i_{k+1} = x^i_k + \alpha_k p^i_k, \quad i = 1, \ldots, n.$$
Consequently, in order to determine vector $p_k$ one can solve a system of linear equations instead of inverting the matrix $f''(x_k)$.
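Computing the direction from the linear system $f''_k p_k = -f'_k$ can be sketched as follows. The elimination routine `solve` and the strictly convex test function $f(x) = (x^1)^2 + (x^2)^2 + e^{x^1 + x^2}$ are assumptions introduced for illustration only.

```python
import math


def solve(A, b):
    # Gaussian elimination with partial pivoting for the system A p = b.
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * p[j] for j in range(r + 1, n))
        p[r] = (M[r][n] - s) / M[r][r]
    return p


def grad_f(x):
    # Gradient of the hypothetical function x0^2 + x1^2 + exp(x0 + x1).
    e = math.exp(x[0] + x[1])
    return [2.0 * x[0] + e, 2.0 * x[1] + e]


def hess_f(x):
    e = math.exp(x[0] + x[1])
    return [[2.0 + e, e], [e, 2.0 + e]]


x_k = [1.0, -0.5]
g = grad_f(x_k)
H = hess_f(x_k)
p_k = solve(H, [-gi for gi in g])            # f''_k p_k = -f'_k
residual = [sum(H[i][j] * p_k[j] for j in range(2)) + g[i] for i in range(2)]
slope = sum(gi * pi for gi, pi in zip(g, p_k))   # (f'_k, p_k)
print(p_k, residual, slope)
```

The computed `slope` is negative, which confirms that the direction obtained from the linear system is a direction of descent.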
We shall study two variants of the generalized Newton method in which different methods of choosing parameter $\alpha_k$ will be used. The first of these methods consists of the following four steps:
(1) Setting $\alpha = 1$ calculate point $x = x_k + \alpha p_k$.
(2) Evaluate $f(x) = f(x_k + \alpha p_k)$.
(3) Check the inequality
$$f(x) - f(x_k) \le \varepsilon\alpha (f'_k, p_k), \quad 0 < \varepsilon < \frac{1}{2}. \tag{2.2}$$
(4) If the inequality is satisfied, then take the value $\alpha = 1$ to be the sought one: $\alpha_k = 1$. Otherwise proceed to reduce $\alpha$ until inequality (2.2) is satisfied.
We shall further call the above method of choosing the value of $\alpha_k$ the method of choosing according to condition (2.2). It can be seen that this method of choosing the step length is analogous to that in the method of steepest descent, involving the checking of inequality (1.2).
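The four steps above can be sketched in two dimensions as follows. This is an illustration under assumptions, not the authors' code: the test function $f(x) = \sqrt{1 + (x^1)^2} + \sqrt{1 + (x^2)^2}$, the reduction factor `lam`, the stopping tolerance and all function names are hypothetical. The function is chosen because the full step $\alpha = 1$ overshoots far from the minimum.

```python
import math


def solve2(H, b):
    # Cramer's rule for a 2x2 system H p = b.
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(b[0] * H[1][1] - H[0][1] * b[1]) / det,
            (H[0][0] * b[1] - H[1][0] * b[0]) / det]


def f(x):
    return math.sqrt(1.0 + x[0] ** 2) + math.sqrt(1.0 + x[1] ** 2)


def grad_f(x):
    return [xi / math.sqrt(1.0 + xi ** 2) for xi in x]


def hess_f(x):
    return [[(1.0 + x[0] ** 2) ** -1.5, 0.0],
            [0.0, (1.0 + x[1] ** 2) ** -1.5]]


def newton_with_step_adjustment(x0, eps=0.25, lam=0.5, tol=1e-6, max_iter=50):
    x, alphas = list(x0), []
    for _ in range(max_iter):
        g = grad_f(x)
        if math.hypot(g[0], g[1]) < tol:
            break
        p = solve2(hess_f(x), [-g[0], -g[1]])   # f''_k p_k = -f'_k
        slope = g[0] * p[0] + g[1] * p[1]       # (f'_k, p_k)
        alpha, f_k = 1.0, f(x)                  # step (1): start from alpha = 1
        for _ in range(60):
            trial = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
            if f(trial) - f_k <= eps * alpha * slope:   # inequality (2.2)
                break
            alpha *= lam                        # step (4): reduce alpha
        x = trial
        alphas.append(alpha)
    return x, alphas


x_min, alphas = newton_with_step_adjustment([2.0, -3.0])
print(x_min, alphas)
```

Far from the solution the step is reduced below one, while near the solution the full step $\alpha_k = 1$ passes the check, so the iteration turns into the usual Newton method.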
The other variant of method (2.1) requires that the value of $\alpha_k$ provide the minimum of the function in the direction of motion:
$$f(x_k - \alpha_k (f''_k)^{-1} f'_k) = \min_{\alpha \ge 0} f(x_k - \alpha (f''_k)^{-1} f'_k). \tag{2.3}$$
...
will be satisfied with $\alpha_k = 1$. This means that inequality (2.2) will also be satisfied with $\alpha_k = 1$. Thus the method used in choosing ...
By Lagrange's formula for operators we have
$$((f''_k)^{-1} f'_k,\, x_{k+1} - x_*) = ((f''_k)^{-1}(f'_k - f'_*),\, x_{k+1} - x_*) = ((f''_k)^{-1} f''_{kc}(x_k - x_*),\, x_{k+1} - x_*)$$
where $x_{kc} = x_* + \theta(x_k - x_*)$, $\theta \in [0, 1]$. Consequently
$$\|x_{k+1} - x_*\|^2 = ((I - (f''_k)^{-1} f''_{kc})(x_k - x_*),\, x_{k+1} - x_*) = ((f''_k)^{-1}(f''_k - f''_{kc})(x_k - x_*),\, x_{k+1} - x_*),$$
i.e.
$$\|x_{k+1} - x_*\| \le \lambda_k\|x_k - x_*\|$$
where $\lambda_k = \frac{1}{m}\|f''_k - f''_{kc}\|$. Since $\|f''_k - f''_{kc}\| \to 0$, there is a number $N$ such that with $k = N + l$, $l = 0, 1, \ldots$ we have
...
and $\lambda_{N+l} \to 0$ as $l \to \infty$. Setting $\|x_N - x_*\| = C$ and taking into account the above remarks we obtain estimate (2.5).
The theorem is proved.
Let us suppose now that matrix $f''(x)$ satisfies, besides conditions (2.4), also Lipschitz' condition
...
Consequently the following theorem is valid.
Theorem 2.2. If function $f(x)$ is such that conditions (2.4) and (2.8) are satisfied, then sequence (2.1) in which the values of $\alpha_k$ are chosen according to condition (2.2), whatever the initial point $x_0$, converges to the solution at a quadratic rate, i.e. estimate (2.9) is valid.
Let us consider now the variant of method (2.1) with the step length being chosen according to condition (2.3). The convergence of sequence $\{x_k\}$ to the solution in this case follows from the general results about the convergence of gradient methods (theorem 1.7). The rate of convergence, as in the case of choosing $\alpha_k$ according to condition (2.2), will be superlinear if condition (2.4) is satisfied and quadratic if condition (2.8) is also satisfied. This can be proved as follows.
Let $\tilde{x}_{k+1} = x_k - (f''_k)^{-1} f'_k$ and $x_{k+1} = x_k - \alpha_k (f''_k)^{-1} f'_k$ where $\alpha_k$ is chosen according to condition (2.3). Then using estimate (1.11) we obtain
$$\frac{m}{2}\|x_{k+1} - x_*\|^2 \le f_{k+1} - f_* \le f(\tilde{x}_{k+1}) - f_* \le \frac{M}{2}\|\tilde{x}_{k+1} - x_*\|^2.$$
By (2.7), $\|\tilde{x}_{k+1} - x_*\| \le \lambda_k\|x_k - x_*\|$, $\lambda_k \to 0$ as $k \to \infty$. Consequently, if conditions (2.4) are satisfied, then
...
where ... $\to 0$ as $k \to \infty$. If estimate (2.8) holds, then
...
Modifications of the Generalized Newton Method
As one of the possible modifications of method (2.1) we shall consider an algorithm in which the sequence of approximations to the solution is constructed by the following formula:

$$x_{k+1} = x_k - \alpha_k (f''_0)^{-1} f'_k, \quad \alpha_k \ge 0. \qquad (2.12)$$

In this method $p_k = -(f''_0)^{-1} f'_k$, i.e. in order to determine the directions of descent use is made of the same matrix $(f''_0)^{-1}$. Method (2.12) is a particular case of algorithm (1.20) ($F^{-1} = (f''_0)^{-1}$). Therefore it can be asserted that sequence (2.12), whatever the initial point $x_0$,
(/;, p h ) (t (2.i3)
where $x_{kc} = x_k + \theta (x_{k+1} - x_k)$, $\theta \in [0, 1]$. If $x_0 \to x_*$, then since the matrix of second derivatives is continuous we have

$$\max_{x \in S} \|f''(x) - f''(x_0)\| \to 0$$

($S = \{x : f(x) \le f(x_0)\}$). Thus the closer the initial point $x_0$ is to point $x_*$, the greater will be the values of $\alpha_k$ which satisfy inequality (2.2), i.e. the greater will be the step in process (2.12) if the step length is chosen according to condition (2.2). In particular, for any constant $0 < \varepsilon < \frac{1}{2}$ there is a constant $\rho(\varepsilon)$ such that if the initial approximation $x_0$ was chosen in a sphere $S$ of radius $\rho$, we shall have
$$\|x_{k+1} - x_*\| \le \|(f''_0)^{-1}\|\,\|f''_0 - f''_{kc}\|\,\|x_k - x_*\| \le \frac{\max_{x \in S} \|f''(x) - f''_0\|}{m}\,\|x_k - x_*\| \le q\,\|x_k - x_*\| \qquad (2.14)$$
where $q = \frac{1}{m} \max_{x \in S} \|f''_0 - f''(x)\|$. This shows that the value of the ratio $q$ depends on the choice of the initial point $x_0$, the value of $q$ becoming the smaller the closer point $x_0$ lies to the solution $x_*$.
For the variant of method (2.12) in which the step length is chosen under the condition of $f(x)$ attaining a minimum in the direction of
where $q = \max \|f''_0 - f''_{kc}\| \to 0$ if $x_0 \to x_*$.
Another possible modification of Newton's method is the following.
Let $k = \xi t + i$, $\xi = 0, 1, \ldots$, $i = 0, 1, \ldots, t - 1$, where $t \ge 1$ is an arbitrary integer. We can construct an iterative process

$$x_{\xi t+i+1} = x_{\xi t+i} - \alpha_{\xi t+i} (f''_{\xi t})^{-1} f'_{\xi t+i}, \quad \alpha_{\xi t+i} \ge 0$$

or, with the original notations,

$$x_{k+1} = x_k - \alpha_k (f''_{\xi t})^{-1} f'_k, \quad \alpha_k \ge 0. \qquad (2.15)$$
Such a method takes an intermediate position between algorithm (2.1), in which a new matrix $(f''_k)^{-1}$ is used for the construction of each vector $p_k$, and algorithm (2.12), in which the same matrix $(f''_0)^{-1}$ is always used in determining the direction of motion. In method (2.15) a new matrix is applied after every $t$ steps. This algorithm, as well as methods (2.1) and (2.12), can be considered to be a variant of the gradient method (1.20); therefore its convergence with different ways of choosing the step length follows from theorems 1.6 and 1.7.
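The scheme (2.15) can be illustrated by a minimal sketch. It rests on several assumptions that are not taken from the text: the test function `f`, the starting point, the tolerance, and the reading of condition (2.2) as the inequality $f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha (f'_k, p_k)$, with $\alpha$ halved from 1 until it holds. The parameter `t` is the refresh period of the matrix of second derivatives: `t = 1` corresponds to method (2.1), and a very large `t` to method (2.12).

```python
import math

def f(x):
    # Hypothetical strongly convex test function (not from the book).
    return x[0] ** 2 + 2.0 * x[1] ** 2 + math.exp(x[0] + x[1])

def grad(x):
    e = math.exp(x[0] + x[1])
    return [2.0 * x[0] + e, 4.0 * x[1] + e]

def hess(x):
    e = math.exp(x[0] + x[1])
    return [[2.0 + e, e], [e, 4.0 + e]]

def solve2(H, g):
    # Solve the 2x2 system H d = g by Cramer's rule.
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(H[1][1] * g[0] - H[0][1] * g[1]) / det,
            (H[0][0] * g[1] - H[1][0] * g[0]) / det]

def modified_newton(x0, t, eps=0.25, tol=1e-6, max_iter=500):
    """Newton-type iteration that recomputes f'' only every t steps."""
    x = list(x0)
    H = hess(x)
    for k in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) < tol:
            return x, k
        if k % t == 0:
            H = hess(x)                      # new matrix after t steps
        d = solve2(H, g)
        p = [-d[0], -d[1]]                   # direction p_k
        slope = g[0] * p[0] + g[1] * p[1]    # (f'_k, p_k), negative here
        alpha = 1.0
        # Assumed form of condition (2.2): halve alpha until the decrease
        # of f is at least eps * alpha * |slope| (with a floor on alpha).
        while (f([x[0] + alpha * p[0], x[1] + alpha * p[1]]) - f(x)
               > eps * alpha * slope) and alpha > 1e-12:
            alpha *= 0.5
        x = [x[0] + alpha * p[0], x[1] + alpha * p[1]]
    return x, max_iter

if __name__ == "__main__":
    for period in (1, 5):
        point, iterations = modified_newton([1.0, 1.0], period)
        print(period, iterations, point)
```

Running the sketch with different values of `t` lets one compare the number of iterations against the number of evaluations of the matrix of second derivatives, which is the trade-off discussed in this section.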
We consider the rate of convergence of method (2.15), assuming that the step length is chosen according to condition (2.2) and that conditions (2.4) and (2.8) are valid for function $f(x)$.
Using Taylor's formula we obtain
/ * « - / » < « * ( / * . pi ,) ( i - - j - - 1
Because of the convergence of the process we have

$$\|x_{kc} - x_{\xi t}\| = \|x_{(\xi t+i)c} - x_{\xi t}\| \le \|x_{\xi t} - x_{\xi t+1}\| + \ldots + \|x_{\xi t+i} - x_{\xi t+i+1}\| \to 0$$

as $k \to \infty$, therefore $\|f''_{kc} - f''_{\xi t}\| \to 0$. Taking this into account and reasoning as we did in theorem 2.1 we can prove that from a certain iteration on, method (2.15) will converge with a step equal to unity: $\alpha_k = 1$. Then, by theorem 2.2, the following estimate is valid
(2.16)
with any … where $L$ is a positive number. Further, proceeding as in theorem 2.1, we can obtain the estimate
$$\|x_{\xi t+2} - x_*\| = \|x_{\xi t+1} - (f''_{\xi t})^{-1} f'_{\xi t+1} - x_*\| \le \|(f''_{\xi t})^{-1}\|\,\|f''_{\xi t} - f''_{(\xi t+1)c}\|\,\|x_{\xi t+1} - x_*\|,$$

$$\|x_{\xi t+2} - x_*\| \le \frac{R}{m}\,(\|x_{\xi t} - x_*\| + \|x_{\xi t+1} - x_*\|)\,\|x_{\xi t+1} - x_*\| \le \frac{RL}{m}\,\|x_{\xi t} - x_*\|^3\,(1 + L\|x_{\xi t} - x_*\|),$$

i.e.

$$\|x_{\xi t+2} - x_*\| \le C_2\,\|x_{\xi t} - x_*\|^3, \quad C_2 < \infty.$$
Suppose that with a certain $2 \le j \le t - 1$ the following estimate is satisfied:

$$\|x_{\xi t+j} - x_*\| \le C_j\,\|x_{\xi t} - x_*\|^{j+1}, \quad C_j < \infty.$$

Then we have

$$\|x_{\xi t+j+1} - x_*\| = \|x_{\xi t+j} - (f''_{\xi t})^{-1} f'_{\xi t+j} - x_*\| \le \|(f''_{\xi t})^{-1}\|\,\|f''_{\xi t} - f''_{(\xi t+j)c}\|\,\|x_{\xi t+j} - x_*\|$$

$$\le \frac{R}{m}\,(\|x_{\xi t} - x_*\| + \|x_{\xi t+j} - x_*\|)\,\|x_{\xi t+j} - x_*\| \le C_{j+1}\,\|x_{\xi t} - x_*\|^{j+2}, \quad C_{j+1} = \frac{R}{m}\,C_j\,(1 + C_j\|x_{\xi t} - x_*\|^{j}).$$
Thus the following bound on the rate of convergence is valid for method (2.15):
We have established that Newton's method with adjustment of steps converges to the solution whatever the initial point $x_0$ at a rate either superlinear or quadratic depending on the requirements satisfied by function $f(x)$.
The convergence of method (2.1) from any initial approximation is its essential advantage over the usual Newton method, in which convergence is ensured only if the initial approximation is sufficiently good (i.e. sufficiently close to the solution of the problem). Besides, in applying Newton's method, the check of the conditions which guarantee that the initial approximation ensures the convergence of the process is in practice difficult to perform, since it requires data about the function that are usually unknown (for instance, values of the constants $m$, $M$).
The comparison of the two methods of choosing the step length, according to condition (2.2) or (2.3), is in favour of the former, for it proves less laborious as to the number of function evaluations (in particular, from a certain iteration on the former method requires the function to be evaluated only once, since $\alpha_k = 1$) and guarantees a rate of convergence not slower than that of the latter method.
If we compare Newton's method and gradient methods as applied to solving problems of convex function minimization, it becomes evident that Newton's method ensures a faster rate of convergence of the sequence of approximations to the solution. Thus if we take the rate of convergence as the measure of the effectiveness of a method, then our supposition stated at the beginning of this section, that Newton's method must be far more effective than the gradient methods, is justified. However, a more precise meaning of the concept of effectiveness of a method is based on estimating the amount of computation involved when applying a concrete algorithm for the solution of a problem to the required accuracy. Consequently the effectiveness of an algorithm can be estimated by the number of iterations which are necessary for solving the problem and the amount of computation at each iteration.
The amount of computation per iteration in Newton's method is as a rule considerably greater than in the gradient methods because of the necessary computation and inversion of the matrices of second derivatives. On the other hand, Newton's method usually involves tens or hundreds of times fewer iterations than gradient methods; by virtue of this fact Newton's method proves to be considerably more effective.
Nevertheless in many problems the labour per iteration in Newton's method can prove excessively great because it is necessary to calculate the matrices of second derivatives $f''(x)$ (as a rule in solving extremal problems the greatest difficulty is the calculation of the matrix $f''(x)$ and not its inversion). Such problems will be considered later on. In order to solve the problem in such a case, one can make use of one of the modifications of Newton's method which we have studied. In one of the modifications we have to calculate and invert the matrix $f''(x)$ only once, in the other this is done after a finite number of iterations. If the initial approximation is good enough, then the rate of convergence to the solution will be fast. However, using modifications of Newton's method is not a cardinal solution of the problem of reducing the amount of work required to solve the problem (speaking generally, it can become even greater). Therefore we come to the question of the possibility of constructing minimization methods which would be close to Newton's method as to their rate of convergence and would require considerably less computation at every iteration.
Several such methods have been worked out; they are based on different ideas. As a rule they prove more effective than Newton's method and this is why they are used more and more at present. The next three sections are devoted to the study of such algorithms.
3. METHODS OF DUAL DIRECTIONS
Considerations on the Choice of Schemes of the Methods
In the preceding section we noted that the main difficulty in applying Newton's method is the necessity of evaluating the matrix of second derivatives of the function being minimized. Consequently algorithms which would be more effective than Newton's method should exclude the calculation of second derivatives while retaining the rate of convergence of Newton's method.
The question arises whether it is possible, in constructing the sequence of approximations to the solution, to determine directions $p_k$ which would be close to those in Newton's method by using for this purpose only the first derivative of the function being minimized.
The first and the second derivatives of $f(x)$ can be related by Taylor's formula for operators (the gradient $f'(x)$ is one):
This system of equations can be rewritten in the following form:

$$A(x_{i+1} - x_i) = f''_j (x_{i+1} - x_i) + (f''_i - f''_j)(x_{i+1} - x_i) + o(x_i, x_{i+1} - x_i),$$
$$i = 1, \ldots, n, \quad 1 \le j \le n. \qquad (3.3)$$
If matrix $f''(x)$ is continuous and nonsingular, then by virtue of the assumed closeness of points $x_i$ the sum of the two last terms of the right-hand side of each equation in system (3.3) must be considerably less than the first term, i.e.

$$A(x_{i+1} - x_i) \approx f''(x_j)(x_{i+1} - x_i), \quad i = 1, \ldots, n$$

and this in general means that matrices $A$ and $f''_j$, $j = 1, \ldots, n$, must be close to each other. It can be easily imagined how the above considerations can be used for the construction of iterative processes of minimization. If $\{x_k\}$ is an arbitrarily constructed sequence which converges to the minimum point of $f(x)$, then in a sufficiently small neighbourhood of the minimum point the points $x_k, x_{k-1}, \ldots, x_{k-n}$ are close to one another. Therefore, having defined matrix $A_k$ by the system of equations

$$A_k (x_{k-i} - x_{k-i-1}) = f'(x_{k-i}) - f'(x_{k-i-1}), \quad i = 0, 1, \ldots, n - 1$$
Substantiation of the Methods
Suppose that $f(x)$ is a function which has continuous first and second derivatives. Given an infinite sequence of elements
We take the sequence $\{y_k\}$ corresponding to $\{x_k\}$ in accordance with the formula

$$y_k = x_k + r_k$$

where vectors $r_k$ are such that the following two conditions are satisfied:
(1) If A A is a d e t e r m i n a n t w h o s e c o l u m n s a r e v e c t o r s y ~ [ | *
Introducing the notation $A_k - f''_k = B_k$ we obtain

$$\|B_k r_{k-i}\| \le \|f''_{k-i} - f''_k\|\,\|r_{k-i}\| + \sup_{0 \le \tau \le 1} \|f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\|\,\|r_{k-i}\|. \qquad (3.7)$$

Since $\{x_k\}$ is a bounded sequence, with any $k$ we have $x_k \in Q$, where $Q \subset E_n$ is a closed bounded set. Function $f''(x)$ is uniformly continuous on the set $Q$; consequently $\|f''_{k-i} - f''_k\| \to 0$ and

$$\sup_{0 \le \tau \le 1} \|f''(x_{k-i} + \tau r_{k-i}) - f''(x_{k-i})\| = \rho_{k-i} \to 0 \quad \text{as } k \to \infty.$$

Thus it follows from (3.7) that
as $k \to \infty$. The lemma is proved.
The results of the above lemma open the way to the construction of methods of the type (3.4).
Lemma 3.2. If $f(x)$ is a continuously differentiable strongly convex function and sequence $\{x_k\}$ is such that $f_{k+1} \le f_k$ and $(f'_k, x_{k+1} - x_k) \to 0$ as $k \to \infty$, then $\|x_{k+1} - x_k\| \to 0$.
Proof. According to condition $f_{k+1} \le f_k$ we have $x_{k+1} \in S_k$, $S_k = \{x : f(x) \le f(x_k)\}$ with any $k$. Set $S_k$ is strongly convex since $f(x)$ is a strongly convex function (lemma 2.8 of Chap. I). Then there is
a positive n u m b e r A , > 0 such that a n y point Xk+i^ **— hi * w h e r e
I = ! ( / * . ', + ® ) l = l l / * l l I I ® | | -
B u t || co || > | | H I s i n c e o t h e r w i s e i n a d d i t i o n t o p o i n t x h s e t S k a n d
plane T k w o u l d h a v e other points in c o m m o n , w h i c h contradicts
t h e s t r o n g c o n v e x i t y o f S *. T h e r e f o r e
2 ' 2 (f k , p k ) = 3 ’ 8 - ^ - 1 U )
Theorem 3.1. If $f(x)$ is a twice continuously differentiable function for which conditions (2.4) are valid, matrix $A_k$ with any $k \ge n - 1$ is defined by system (3.6) and satisfies the condition

$$(A_k^{-1} f'_k, f'_k) > 0 \qquad (3.11)$$

and $\alpha_k$ is determined according to condition (2.2), then whatever the initial point $x_0$ the following statements are valid for sequence (3.4): $f_{k+1} < f_k$ and $\|x_k - x_*\| \to 0$ at a superlinear rate of convergence:

$$\|x_{N+l} - x_*\| \le C \lambda_N \ldots \lambda_{N+l} \qquad (3.12)$$

where $C, N < \infty$, $\lambda_{N+l} < 1$ with any $l \ge 0$, $\lambda_i \to 0$ as $i \to \infty$.
Proof. In order to make use of the results of lemma 3.1 we must first of all show that the conditions of the theorem imply that $\|x_{k+1} - x_k\| \to 0$ for sequence (3.4).
According to conditions (3.9) and (3.11) we have $(f'_k, p_k) < 0$ with any $k$. Hence it follows, first, that there is always a value $\alpha_k > 0$ such that inequality (3.10) is satisfied (and consequently also (2.2)); secondly, by (2.2) we shall have $f_{k+1} < f_k$. This means that $x_{k+1} \in S = \{x : f(x) \le f(x_0)\}$ with any $k$ and besides, since $f(x)$ has a lower bound, $f_k - f_{k+1} \to 0$; by virtue of this it follows from (2.2) that

$$\alpha_k (f'_k, p_k) = (f'_k, x_{k+1} - x_k) \to 0. \qquad (3.13)$$

Since $f_{k+1} \le f_k$ and condition (3.13) is fulfilled, sequence $\{x_k\}$ satisfies the requirements of lemma 3.2. Hence, as $k \to \infty$
Noting that $a_k \le M + M_1 < \infty$, it is easily ascertained that there is a constant $\alpha > 0$ such that with any $k$ inequality (3.17) will be satisfied with $\alpha_k \ge \alpha$. By virtue of this it follows from (3.14) that $\|p_k\| = \frac{1}{\alpha_k}\,\|x_{k+1} - x_k\| \to 0$.
Hence

$$\|f'_k\| = \|A_k p_k\| \le M_1 \|p_k\| \to 0.$$

The last condition, as shown by inequality (1.12), means that $x_k \to x_*$. Let us establish that estimate (3.12) holds.
Since $\|p_k\| \to 0$ and condition (3.15) is fulfilled and the second derivatives of function $f(x)$ are uniformly continuous on set $S$, we have as $k \to \infty$

$$a_k \le \|f''(x_k + \theta (x_{k+1} - x_k)) - f''(x_k)\| + \|f''(x_k) - A_k\| \to 0$$
Construction of Various Algorithms
The requirements which must be satisfied by vectors $r_k$ used in constructing sequence (3.5) are not strict and leave us a great freedom of choice of these vectors. This makes it possible to construct different algorithms of type (3.4), since different sequences $\{r_k\}$ will define (by (3.6)) different sequences of matrices $A_k$.
Let us discuss some possible ways of constructing vectors $r_k$. We can take as $r_k$ vectors directed along the axes of coordinates. For example, if $r_0 = \lambda_0 v_1$, then with $k = tn + i$, where $t$ is an integer and $i = 0, 1, \ldots, n - 1$, we have $r_k = \lambda_k v_{i+1}$; $v_{i+1}$ is the unit vector of the corresponding axis and $\lambda_k$ is a numerical factor such that $\lambda_k \to 0$ as $k \to \infty$. Such a choice of vector $r_k$ guarantees that the condition $|\Delta_k| \ge \varepsilon$ will be satisfied. In this case, in order to determine matrix $A_k$, it is necessary to calculate at each iteration the derivatives at two points, $x_k$ and $y_k$. The law of the decrease of $\lambda_k$ may be chosen arbitrarily; computational practice shows however that the maximum rate of convergence is obtained with monotonically diminishing $\lambda_k$; one can, for instance, assume that $\lambda_k = \ldots$
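The cyclic choice of coordinate directions can be sketched as follows. The law $\lambda_k = 1/(k+1)$ used here is only a hypothetical monotonically decreasing choice, and the determinant is taken, as an assumption, over the normalized vectors $r_{k-i} / \|r_{k-i}\|$, $i = 0, \ldots, n-1$. The function names are illustrative.

```python
def coordinate_vectors(n, count, lam=lambda k: 1.0 / (k + 1)):
    """Return r_0, ..., r_{count-1} with r_k = lam(k) * v_{i+1}, i = k mod n."""
    vectors = []
    for k in range(count):
        r = [0.0] * n
        r[k % n] = lam(k)        # v_{i+1} is the unit vector of axis i + 1
        vectors.append(r)
    return vectors

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda row: abs(a[row][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d               # a row swap changes the sign
        d *= a[c][c]
        for row in range(c + 1, n):
            factor = a[row][c] / a[c][c]
            for j in range(c, n):
                a[row][j] -= factor * a[c][j]
    return d

def normalized_determinant(vectors):
    """Determinant whose columns are r / ||r|| for the n given vectors."""
    n = len(vectors)
    cols = []
    for r in vectors:
        norm = sum(v * v for v in r) ** 0.5
        cols.append([v / norm for v in r])
    return det([[cols[j][i] for j in range(n)] for i in range(n)])

if __name__ == "__main__":
    n = 3
    rs = coordinate_vectors(n, 9)
    for k in range(n - 1, 9):
        window = rs[k - n + 1:k + 1]     # r_{k-n+1}, ..., r_k
        print(k, abs(normalized_determinant(window)))
```

Any $n$ consecutive vectors produced this way point along $n$ distinct axes, so the normalized determinant has absolute value 1 regardless of how fast $\lambda_k$ decreases.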
Another possible method of determining vectors $r_k$ is as follows. With $k \ge n - 1$ we can, instead of (3.5), use directly sequence (3.4), i.e. assume $r_k = x_{k+1} - x_k = -\alpha_k A_k^{-1} f'_k$. In fact, the proof of theorem 3.1 shows clearly that if $A_k$ is an arbitrary matrix which satisfies only condition (3.11) and $\alpha_k$ is chosen according to condition (2.2), then $\|x_{k+1} - x_k\| \to 0$ as $k \to \infty$. Consequently, if we use sequence (3.4) for constructing vectors $r_k$, then the requirement $\|r_k\| \to 0$ will of necessity be fulfilled and we only need to establish $|\Delta_k| \ge \varepsilon$. If this condition is not satisfied with a certain $k$, another vector $r_k$ must be chosen (using not (3.4) but a new formula). In such an algorithm, for determining matrix $A_k$ at every iteration (where sequence (3.4) provides the fulfillment of the requirements to be met by vectors $r_k$) the gradient must be calculated only at one point $x_k$.
Of course, other methods of constructing vectors $r_k$ may be used.
In the system of equations (3.6) which defines matrix $A_k$ only one new vector $r_k$ and the corresponding vector $e_k$ are used with any $k$; the remaining vectors $r_{k-1}, \ldots, r_{k-n+1}$ and $e_{k-1}, \ldots, e_{k-n+1}$ are taken from preceding iterations. The system (3.6) can be modified so that at each iteration of process (3.4) an arbitrary number of vectors $r_{k-i_1}, \ldots, r_{k-i_j}$, $1 \le j \le n$ (and of their corresponding vectors $e_{k-i_1}, \ldots, e_{k-i_j}$) are renewed and the remaining $n - j$ vectors $r_{k-i_{j+1}}, \ldots, r_{k-i_n}$ are taken from preceding iterations. In this case system (3.6) should preferably be written in the form

$$A_k r_i = e_i, \quad i = 1, \ldots, n. \qquad (3.18)$$
If the requirements of lemma 3.1 are retained, then, repeating its proof word for word, it can be ascertained that for matrix $A_k$ defined by system (3.18) the condition $\|A_k - f''_k\| \to 0$ as $k \to \infty$ is also satisfied.
Using different methods for constructing vectors $r_i$ in system (3.18) one can obtain several well known minimization algorithms. For instance, if we set $r_i = v_{ki}$ (and $y_i = x_k + v_{ki}$), where $v_{ki}$ is a vector directed along the $i$-th axis of coordinates and such that $\|v_{ki}\| \to 0$ as $k \to \infty$, then system (3.18) will take the following form:

$$A_k v_{ki} = f'(x_k + v_{ki}) - f'(x_k), \quad i = 1, \ldots, n.$$
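The system above can be solved column by column when each $v_{ki}$ is a multiple of a unit vector. The following minimal sketch assumes $v_{ki} = h\,u_i$ with a scalar step `h` and unit vector $u_i$; the test gradient `grad` and the evaluation point are hypothetical and serve only to compare the result with a known matrix of second derivatives.

```python
import math

def grad(x):
    # Gradient of the hypothetical test function
    # f(x) = x0^2 + 2*x1^2 + exp(x0 + x1).
    e = math.exp(x[0] + x[1])
    return [2.0 * x[0] + e, 4.0 * x[1] + e]

def fd_hessian(gradient, x, h):
    """Matrix A with A * (h * u_i) = f'(x + h * u_i) - f'(x) for every axis i."""
    n = len(x)
    g0 = gradient(x)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        y = list(x)
        y[i] += h                        # y_i = x + v_i, v_i = h * u_i
        gi = gradient(y)
        for row in range(n):
            # Since v_i has a single nonzero component h,
            # column i of A is (f'(y_i) - f'(x)) / h.
            A[row][i] = (gi[row] - g0[row]) / h
    return A

if __name__ == "__main__":
    point = [0.3, -0.2]
    for step in (1e-2, 1e-4):
        print(step, fd_hessian(grad, point, step))
```

As `h` decreases the matrix approaches $f''(x)$, in line with the condition $\|v_{ki}\| \to 0$; in floating-point arithmetic `h` cannot be taken arbitrarily small, because the gradient difference then loses accuracy.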
Matrix $A_k$ defined by this system is a finite differences analogue of the matrix of second derivatives $f''(x_k)$; thus in this case process (3.4) transforms into a finite differences analogue of Newton's method with adjustment of step length. On the basis of theorem 3.1 it can be asserted that Newton's finite differences method with adjustment of step length converges from any initial approximation at a superlinear rate. Assuming that matrix $f''(x)$ satisfies Lipschitz' condition (2.8), it can be shown, using the preceding results, that if $\|v_{ki}\| \le \|f'_k\|$, Newton's finite differences method converges at a quadratic rate. This can be seen from the following argument.
In lemma 3.1 the bound on the quantity $B_k r_i = B_k v_{ki}$ takes the form
Hence, using (2.8), $\|B_k v_{ki}\| \le R \|v_{ki}\|^2$. From this inequality and the estimate $\|v_{ki}\| \le \|f'_k\|$, as in lemma 3.1, we obtain
s l < M I I M I * £ * 2 |8«|||/il|-»ll/il|.
7= 1 7= 1
According to (2.4) the gradient $f'(x)$ satisfies Lipschitz' condition with constant $M$. Consequently,

$$\|B_k\| \le R\,\|f'_k\| = R\,\|f'_k - f'_*\| \le R M\,\|x_k - x_*\|.$$

The bounds on the rate of convergence obtained in theorem 3.1 can now be defined more exactly as follows:
Let us describe another method of choosing vectors $r_i$. We set
Tj = W l/i = = J f i + i = i/i ~1“ ^ 2 , • • •» w . T h e n
the system (3.18) takes the following form:
Determining Vector $p_k$
The amount of work required to compute vector $p_k$ determines to a considerable extent the computational effort in process (3.4). We shall now consider a method of constructing vector $p_k = -A_k^{-1} f'_k$ which makes use of the specific properties of system (3.6) that defines matrix $A_k$. By taking account of these properties one can considerably simplify the construction of vector $p_k$; we begin with the inversion of matrix $A_k$.
The necessary condition for the existence of matrix $A_k^{-1}$ is that matrix $A_k$ be nonsingular, and this in turn necessitates linear independence of the vector system $e_k, \ldots, e_{k-n+1}$. With sufficiently large $k$, matrix $A_k$ is nonsingular as shown by (3.16). However, at some iterations of the initial stage of process (3.4) the vector system $e_k, \ldots, e_{k-n+1}$ can prove linearly dependent. In this case we can either change one of the vectors or make a step in the direction of the antigradient, this causing a change of system $e_k, \ldots, e_{k-n+1}$. We assume in what follows that with any $k \ge n - 1$ system $e_k, \ldots, e_{k-n+1}$ is linearly independent.
In this case system (3.6) can be written in the form

$$A_k^{-1} e_{k-i} = r_{k-i}, \quad i = 0, 1, \ldots, n - 1$$

or in the form of a matrix equation:

$$A_k^{-1} E_k = R_k \qquad (3.19)$$

where $E_k$, $R_k$ are matrices whose columns are coordinates of vectors $e_{k-i}$ and $r_{k-i}$ respectively. From (3.19) we obtain

$$A_k^{-1} = R_k E_k^{-1}. \qquad (3.20)$$
Thus in order to construct matrix $A_k^{-1}$ it is necessary first of all to calculate matrix $E_k^{-1}$. It is known from linear algebra (see, for instance, D. K. Faddeev and V. N. Faddeeva) that the rows of matrix $E_k^{-1}$ are the vectors of the basis $s_k, \ldots, s_{k-n+1}$ which is dual, or biorthogonal, to basis $e_k, \ldots, e_{k-n+1}$. Recall that linearly independent systems of vectors $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are called dual if they satisfy the conditions

$$(a_i, b_j) = 0 \ \text{with} \ i \ne j, \quad (a_i, b_i) = 1.$$

If $s_k, \ldots, s_{k-n+1}$ is the dual of basis $e_k, \ldots, e_{k-n+1}$, then (according to the relations of duality) $S_k^* E_k = I$, where $S_k$ is a matrix whose columns are vectors $s_{k-i}$. It follows that $S_k^* = E_k^{-1}$.
Each of the matrices $E_k$, $k = 0, 1, \ldots$ differs from the next one on the left and on the right side only by one column. By virtue of this fact the process of constructing the basis $s_k, \ldots, s_{k-n+1}$ can be performed by recursive relations and this to a considerable extent reduces computational efforts.
Suppose we have calculated matrix $E_k^{-1}$, i.e. have constructed the basis $s_k, \ldots, s_{k-n+1}$. Let us construct the system of vectors $\bar{s}_{k+1}, \bar{s}_k, \ldots, \bar{s}_{k-n+2}$ as follows:

$$\bar{s}_{k+1} = \frac{s_{k-n+1}}{(s_{k-n+1}, e_{k+1})}, \qquad \bar{s}_{k+1-j} = s_{k+1-j} - (s_{k+1-j}, e_{k+1})\,\bar{s}_{k+1}, \quad j = 1, \ldots, n - 1. \qquad (3.21)$$
(sh + 1 , e k + 1 .j) = 1 — — ^ = 0
$j, m = 1, 2, \ldots, n - 1$, where $\delta_{jm}$ is Kronecker's delta ($\delta_{jj} = 1$, $\delta_{jm} = 0$ with $j \ne m$). Consequently, the conditions of duality are satisfied for vector systems $e_{k+1}, \ldots, e_{k-n+2}$ and $\bar{s}_{k+1}, \ldots, \bar{s}_{k-n+2}$, and this corroborates our statement.
Thus, using recursive relations (3.21), the construction of basis $s_{k+1}, \ldots, s_{k-n+2}$ (i.e. matrix $E_{k+1}^{-1}$) is performed quite easily.
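A minimal sketch of such a recursive update of a dual basis is given below. It assumes the usual form of the update: the dual vector paired with the replaced vector is rescaled by its inner product with the new vector, and every other dual vector has its component along the new vector subtracted. The function name `replace_vector`, the index `m` of the replaced vector, and the tolerance are illustrative choices, not notation from the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def replace_vector(E, S, m, e_new, tol=1e-12):
    """Replace E[m] by e_new and update the dual basis S accordingly.

    E and S are lists of vectors with (S[i], E[j]) = 1 if i == j else 0.
    Returns new lists with the same property for the modified system.
    """
    denom = dot(S[m], e_new)
    if abs(denom) < tol:
        # The new system of vectors would be (nearly) linearly dependent.
        raise ValueError("new vector makes the system linearly dependent")
    s_m = [v / denom for v in S[m]]          # now (s_m, e_new) = 1
    S_new = []
    for j, s in enumerate(S):
        if j == m:
            S_new.append(s_m)
        else:
            c = dot(s, e_new)
            # Remove the component that would violate (s_j, e_new) = 0.
            S_new.append([sv - c * mv for sv, mv in zip(s, s_m)])
    E_new = [list(e) for e in E]
    E_new[m] = list(e_new)
    return E_new, S_new

if __name__ == "__main__":
    E = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    S = [row[:] for row in E]                # the unit basis is self-dual
    for m, e in enumerate([[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 4.0]]):
        E, S = replace_vector(E, S, m, e)
    print([[round(dot(s, e), 12) for e in E] for s in S])
```

Each update costs on the order of $n^2$ operations, compared with the order of $n^3$ for inverting the matrix anew, which is the saving referred to above.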
We can now derive a simple formula for the determination of the direction of motion $p_k$. For this purpose let us write equation (3.20) in the following form:

$$A_k^{-1} = \sum_{i=0}^{n-1} r_{k-i} s^*_{k-i} \qquad (3.23)$$
It was this formula for the determination of the direction of descent in methods of the type (3.4) that caused such algorithms to be called methods of dual directions. Using (3.24), formula (3.4) can be written

$$x_{k+1}^{\nu} = x_k^{\nu} - \alpha_k \sum_{i=0}^{n-1} \sum_{j=1}^{n} \frac{\partial f(x_k)}{\partial x^{j}}\, s_{k-i}^{j}\, r_{k-i}^{\nu}, \quad \nu = 1, \ldots, n.$$
It is easy to check its validity by direct multiplication of a matrix constructed by this formula and vectors $e_{k+1}, e_k, \ldots, e_{k-n+2}$. It can be seen then that

$$A_{k+1}^{-1} e_{k+1-i} = r_{k+1-i}, \quad i = 0, 1, \ldots, n - 1,$$
The Initial Stage of the Process
Until now we considered the iterative process (3.4) beginning with $k = n - 1$, since for the definition of matrix $A_k$, $n$ vectors $r_k$ and corresponding vectors $e_k$ are required.
The first iterations of the process ($k < n - 1$) can be performed in various ways. For instance, use can be made of the method of steepest descent: $x_{k+1} = x_k - \alpha_k f'_k$, $\alpha_k > 0$, $k = 0, 1, \ldots, n - 2$.
In order to secure uniformity of the algorithm from the first iteration on, we can proceed as follows with $0 \le k < n - 1$.
S e t A ' 1 = /. P r e s e n t t h i s m a t r i x i n t h e f o r m A ~ * = i?02?“l,
w h e r e R 0 = /, E ~ x = /, or, u s i n g (3.23),
n - 1
A £ = 2 '•o-.so-t
i=0
w h e r e r 0 , r _ j , . . ., r _ n + 1 a n d s 0 , s _ lt . . ., s _ n + 1 a r e v e c t o r s o f
a u n i t y o r t h o n o r m a l i z e d basis. T a k i n g this into a c c o u n t , w e h a v e :
T 7 - 1
X i = x 0 — a 0 2 (/o» s o - i ) r o-t*
7=0
F u r t h e r , h a v i n g c a l c u l a t e d v e c t o r s rx a n d e ± b y (3.21) w e c o n s t r u c t
the basis:
,, + li
ls-n . ,
e l)
~Si-i=Si-h ei)«if 7 = 1 , — i
X 2 = x1— Oi 2 (/!.
i=0
T h e c o n s t r u c t i o n o f t h e s u c c e s s i v e i t e r a t i o n s is s t r a i g h t f o r w a r d .
Minimization of Quadratic Form
Let us consider as an example the application of the methods of dual directions to the finding of the minimum point of a quadratic function. Let
$$f(x) = \tfrac{1}{2}(Ax, x) + (b, x) + c,$$
where $A$ is an $n \times n$ symmetric, strictly positive definite matrix with constant elements: $(Ax, x) > 0$ for any $x \ne 0$, $b$ is a vector, $c$ is a scalar quantity. The gradient of this function is $f'(x) = Ax + b$, and the vector
$$e_i = f'(x + r_i) - f'(x) = A r_i. \tag{3.26}$$
Therefore, if $r_1, \ldots, r_n$ is a linearly independent system of vectors and $s_1, \ldots, s_n$ is the basis dual for the basis $e_1, \ldots, e_n$, then because of (3.20) and (3.23), we have
$$A^{-1} = R_n E_n^{-1} = \sum_{i=1}^{n} r_i s_i^{*}.$$
Now, since it follows from (3.26) that matrix $A$ is defined by the system of equations $A r_i = e_i$, $i = 1, 2, \ldots, n$, we can write
$$x_{n+1} = x_n - A^{-1} f'_n = -A^{-1} b,$$
and $f'_{n+1} = -A A^{-1} b + b = 0$, i.e. $x_{n+1} = x_*$.
Thus, in order to minimize a quadratic function by the method of dual directions we have to calculate the gradient of the function at $n + 1$ points and construct a basis which is dual for the basis of vectors $e_1, \ldots, e_n$. If we consider the process of successive calculations of vectors $e_1, \ldots, e_n$ as a certain iterative procedure, then it can be said that methods of dual directions make it possible to minimize a quadratic function after a finite number of steps.
Note also that the problem under consideration is equivalent to solving a system of linear equations $Ax = -b$. Consequently, methods of dual directions make it possible to solve a system of linear equations by performing a finite number of iterations.
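The finite termination on a quadratic can be illustrated by a small sketch. It rests on several assumptions that are not taken from the text: the displacements $r_i$ are chosen as the unit coordinate vectors, the dual basis is built by replacing one vector at a time starting from the unit basis, and the names `gradient` and `minimize_quadratic_dual` are hypothetical. The function uses $n + 1$ gradient evaluations and then takes a single step $x - \sum_i (f'(x), s_i)\, r_i$.

```python
def dot(u, v):
    """Scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def gradient(A, b, x):
    """Gradient f'(x) = A x + b of the quadratic function."""
    return [dot(row, x) + bi for row, bi in zip(A, b)]


def minimize_quadratic_dual(A, b, x):
    """Minimum of 1/2 (Ax, x) + (b, x) + c from n + 1 gradient evaluations."""
    n = len(x)
    g = gradient(A, b, x)                                    # first gradient
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    S = [row[:] for row in R]                                # dual of the unit basis
    for i in range(n):                                       # n further gradients
        shifted = [xj + rj for xj, rj in zip(x, R[i])]
        e = [gs - gj for gs, gj in zip(gradient(A, b, shifted), g)]  # e_i = A r_i
        denom = dot(S[i], e)
        s_new = [c / denom for c in S[i]]
        for j in range(n):                                   # restore duality
            if j != i:
                coef = dot(S[j], e)
                S[j] = [a - coef * c for a, c in zip(S[j], s_new)]
        S[i] = s_new
    # A^{-1} f'(x) = sum_i (f'(x), s_i) r_i
    step = [sum(dot(g, S[i]) * R[i][nu] for i in range(n)) for nu in range(n)]
    return [xj - dj for xj, dj in zip(x, step)]


A = [[4.0, 1.0], [1.0, 3.0]]
b = [-1.0, -2.0]
x_min = minimize_quadratic_dual(A, b, [0.0, 0.0])
```

For this symmetric, strictly positive definite example the result solves $Ax = -b$, so the same routine doubles as a direct linear solver.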
Discussion of Properties of the Methods
Methods of dual directions make it possible to solve the problem of minimizing a strictly convex smooth function whatever the initial approximation chosen, and the rate of convergence of the sequence $\{x_k\}$ to the solution is superlinear. The method of choosing parameter $\alpha_k$ guarantees the determination of the required value of $\alpha_k$ after a finite number of reductions. Of course, in process (3.4), as in the methods described in the preceding sections, $\alpha_k$ can be chosen under the condition of obtaining the minimum function value in the direction of motion; however, such a method is more laborious.
The methods of the class under consideration approach Newton's method as to the estimate of their rate of convergence. Let us compare the labour per iteration in the methods of dual directions and in Newton's method.
In processes of type (3.4) with matrix $A_k$ defined by system (3.6), in order to calculate matrix $A_k^{-1}$ we have to calculate vector $e_k$ and [...]
4. METHODS OF CONJUGATE DIRECTIONS.
MINIMIZATION OF QUADRATIC FUNCTIONS
Conjugate Directions and Their Properties
Let us turn again to the problem of minimizing quadratic functions of the form
$$f(x) = \tfrac{1}{2}(Ax, x) + (b, x) + c. \tag{4.1}$$
[...]
where $x_i$ are arbitrary points, and to construct a basis $s_0, \ldots, s_{n-1}$ dual for the basis $e_0, \ldots, e_{n-1}$, i.e. which satisfies the conditions
$$(s_i, e_i) = 1, \qquad (s_i, e_j) = 0 \ \text{with} \ i \ne j. \tag{4.4}$$
These relations, because of (4.3), can be written in the form
$$(s_i, A p_i) = 1, \qquad (s_i, A p_j) = 0, \quad i \ne j. \tag{4.5}$$
Of particular interest is the case in which vectors $p_0, \ldots, p_{n-1}$ are $A$-orthogonal or, as they are sometimes called, conjugate, i.e. such that
$$(p_i, A p_j) = 0, \qquad i \ne j. \tag{4.6}$$
The system of (nonzero) vectors $p_0, \ldots, p_{n-1}$ which satisfies conditions (4.6) is linearly independent (being orthogonal in a metric defined by a nonsingular matrix) and accordingly can be used to determine vectors $e_i$ by formulas (4.3); vectors $s_i$ which satisfy (4.5) in this case can be calculated by very simple formulas:
$$s_i = \frac{p_i}{(A p_i, p_i)}, \qquad i = 0, 1, \ldots, n - 1. \tag{4.7}$$
Thus if vectors $p_0, \ldots, p_{n-1}$ are $A$-orthogonal, matrix $A^{-1}$ is calculated by the following formula (see (3.27)):
$$A^{-1} = \sum_{i=0}^{n-1} p_i s_i^{*} = \sum_{i=0}^{n-1} \frac{p_i p_i^{*}}{(A p_i, p_i)}, \tag{4.8}$$
i.e. the problem of the inversion of matrix $A$, and thus that of the minimization of function $f(x)$, is solved quite easily.
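Formula (4.8) is easy to check numerically. The sketch below is an illustration only: the $A$-orthogonal vectors are produced by a Gram-Schmidt process in the metric of $A$, which is one convenient assumption and not the construction studied later in this section, and the names `a_orthogonalize` and `inverse_from_conjugate` are hypothetical.

```python
def dot(u, v):
    """Scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Product of matrix A (list of rows) and vector x."""
    return [dot(row, x) for row in A]


def a_orthogonalize(A, vectors):
    """Gram-Schmidt in the metric (A u, v): returns A-orthogonal vectors."""
    P = []
    for v in vectors:
        p = list(v)
        for q in P:
            Aq = matvec(A, q)
            c = dot(v, Aq) / dot(q, Aq)
            p = [pi - c * qi for pi, qi in zip(p, q)]
        P.append(p)
    return P


def inverse_from_conjugate(A, P):
    """A^{-1} = sum_i p_i p_i^* / (A p_i, p_i), i.e. formula (4.8)."""
    n = len(P)
    inv = [[0.0] * n for _ in range(n)]
    for p in P:
        d = dot(matvec(A, p), p)
        for i in range(n):
            for j in range(n):
                inv[i][j] += p[i] * p[j] / d
    return inv


A = [[4.0, 1.0], [1.0, 3.0]]
P = a_orthogonalize(A, [[1.0, 0.0], [0.0, 1.0]])
A_inv = inverse_from_conjugate(A, P)
```

Multiplying `A` by `A_inv` gives the unit matrix up to rounding, which is exactly the statement of (4.8).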
Let us now consider the problem of determining point $x_*$ with the aid of conjugate vectors from a somewhat different viewpoint; at the same time we shall study a number of interesting properties of conjugate directions.
Since $p_0, \ldots, p_{n-1}$ is the basis of space $E^n$, point $x_*$ may be presented in the following form:
$$x_* = x_0 + \sum_{i=0}^{n-1} \alpha_i p_i. \tag{4.9}$$
But by (4.2) and (4.8),
$$x_* = x_0 - \sum_{i=0}^{n-1} \frac{p_i p_i^{*}}{(A p_i, p_i)}\, f'_0. \tag{4.10}$$
It follows from (4.9) and (4.10) that
$$\sum_{i} \alpha_i p_i = -\sum_{i} \frac{p_i p_i^{*}}{(A p_i, p_i)}\, f'_0,$$
or in another form
$$x_0 + \sum_{i} \alpha_i p_i = x_0 - \sum_{i} \frac{(f'_0, p_i)}{(A p_i, p_i)}\, p_i. \tag{4.11}$$
Since a vector has only one resolution along the basis axes, the last equality determines the values of coefficients $\alpha_i$ in the expansion (4.9):
$$\alpha_i = -\frac{(f'_0, p_i)}{(A p_i, p_i)} = -\frac{(f'_0, p_i)}{(e_i, p_i)}, \qquad i = 0, 1, \ldots, n - 1. \tag{4.12}$$
Thus if a certain system of conjugate vectors is known, then the minimum point of quadratic function (4.1) is easily found by formulas (4.9), (4.12).
The procedure of determining point $x_*$ by formula (4.9) can be considered as a process of construction of successive points:
$$x_{i+1} = x_i + \alpha_i p_i, \qquad i = 0, 1, \ldots, n - 1, \tag{4.13}$$
where parameters $\alpha_i$ are determined by formulas (4.12). It follows that using the method of conjugate directions one can solve the problem of quadratic function minimization after performing a finite number of steps not exceeding $n$ (the number of points in the iterative process (4.13) can prove less than $n$ if some of the coefficients $\alpha_i$ in expansion (4.9) prove equal to zero, i.e. if for some $i$ we have $(f'_0, p_i) = 0$). The above property of the method of conjugate directions is the most important one. It shows how effective the application of conjugate vectors to quadratic function minimization is; this is the reason for the wide application of methods of conjugate directions.
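The process (4.13) with coefficients (4.12) can be sketched as follows. The conjugate vectors are assumed to be already known; here they are precomputed by hand for one particular matrix, and the function name `conjugate_directions_minimize` is a hypothetical choice for the illustration.

```python
def dot(u, v):
    """Scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Product of matrix A (list of rows) and vector x."""
    return [dot(row, x) for row in A]


def conjugate_directions_minimize(A, b, x0, directions):
    """Successive points x_{i+1} = x_i + alpha_i p_i with alpha_i from (4.12)."""
    g0 = [gi + bi for gi, bi in zip(matvec(A, x0), b)]   # f'_0 = A x_0 + b
    x = list(x0)
    points = [list(x)]
    for p in directions:
        alpha = -dot(g0, p) / dot(matvec(A, p), p)       # formula (4.12)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        points.append(list(x))
    return x, points


A = [[4.0, 1.0], [1.0, 3.0]]
b = [-1.0, -2.0]
# Vectors conjugate with respect to this particular A: (p_0, A p_1) = 0.
directions = [[1.0, 0.0], [-0.25, 1.0]]
x_star, points = conjugate_directions_minimize(A, b, [0.0, 0.0], directions)
```

After the two steps the gradient $Ax + b$ vanishes at `x_star`, in agreement with the bound of $n$ steps.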
As an interesting corollary to the result obtained, it can be shown that point $x_i$ constructed by formulas (4.13), (4.12) is the minimum point of function (4.1) on the subspace formed by vectors $p_0, \ldots, p_{i-1}$ and passing through point $x_0$. Let
$$\tilde{x}_i = x_0 + \sum_{k=0}^{i-1} \tilde{\alpha}_k p_k,$$
where $\tilde{\alpha}_k$ are arbitrary coefficients. For point $\tilde{x}_i$ to be the minimum of a strictly convex differentiable function in the subspace formed by vectors $p_0, \ldots, p_{i-1}$, it is necessary and sufficient (corollary 4.4 of Chap. I) that the following conditions be satisfied:
$$(f'(\tilde{x}_i), p_j) = 0, \qquad j = 0, 1, \ldots, i - 1. \tag{4.14}$$
Now for any $0 \le j \le i - 1$, we have
$$(f'(\tilde{x}_i), p_j) = (A\tilde{x}_i + b, p_j) = \Big(A\Big(x_0 + \sum_{k=0}^{i-1} \tilde{\alpha}_k p_k\Big) + b,\ p_j\Big) = (A x_0 + b, p_j) + \sum_{k=0}^{i-1} \tilde{\alpha}_k (A p_k, p_j) = (f'_0, p_j) + \tilde{\alpha}_j (A p_j, p_j).$$
Hence, taking into account (4.14), we have that point $\tilde{x}_i$ provides the minimum of the function in the subspace formed by vectors $p_0, \ldots, p_{i-1}$ and passing through point $x_0$ if and only if $(f'_0, p_j) + \tilde{\alpha}_j (A p_j, p_j) = 0$, i.e.
$$\tilde{\alpha}_j = -\frac{(f'_0, p_j)}{(A p_j, p_j)}.$$
It follows that if with a certain $i \le n - 1$ in formula (4.13) $\alpha_i = 0$ (i.e. $x_{i+1} = x_i$), then this means that $(f'_i, p_i) = 0$. Combining this equality with (4.15) we obtain
$$(f'_{i+1}, p_j) = (f'_i, p_j) = 0, \qquad j = 0, 1, \ldots, i.$$
Thus the fact that the coefficient $\alpha_i$ becomes zero means that the corresponding point $x_i$ provides the minimum of the quadratic function in the subspace formed by vectors $p_0, \ldots, p_i$ and passing through point $x_0$.
Finally, note that by (4.15) $(f'_i, p_{i-1}) = 0$. This means that the choice of coefficients $\alpha_i$ by formulas (4.12) or (4.16) corresponds to choosing $\alpha_i$ under the condition
$$f(x_i + \alpha_i p_i) = \min_{\alpha} f(x_i + \alpha p_i).$$
Construction of the Methods
In considering in the preceding subsection the effectiveness of methods of conjugate directions for the minimization of a quadratic function, we did not even mention the methods of constructing such vectors and the work involved in this procedure.
We turn to the study of methods of constructing $A$-orthogonal vectors. Each of these methods determines one or other method of conjugate directions, which consists in the construction of successive approximations to the solution of the problem of minimization of function (4.1) making use of formulas (4.13), (4.12) (or (4.16)).
is satisfied if and only if $f'_i = 0$. Indeed, if condition (4.17) is satisfied, then by (4.16) $\alpha_i = 0$ and therefore in sequence (4.13) $x_{i+1} = x_i$. This means that at the $(i + 1)$-th iteration of the process we shall not receive any additional information about the function and therefore shall not be able to construct vector $p_{i+1} \ne p_i$. The process will accordingly degenerate (stop) without reaching the solution if $f'_i \ne 0$.
Thus for any of the methods of constructing conjugate vectors (and for the corresponding method of conjugate directions), the condition
$$(f'_i, p_i) \ne 0 \ \text{if} \ f'_i \ne 0 \tag{4.18}$$
must be satisfied. This condition guarantees that at any of the iterations of the process we shall have $\alpha_i \ne 0$.
where $u_{k-1}$, $v_{k-1}$ are unknown vectors. It is necessary that the vectors be such that the first of the conditions (4.27) is satisfied, i.e.
$$(u_{k-1}, e_j) = 0, \qquad (v_{k-1}, e_j) = 0, \qquad 0 \le j \le k - 2. \tag{4.29}$$
Clearly, vectors $u_{k-1}$, $v_{k-1}$ must also satisfy conditions
$$(u_{k-1}, e_{k-1}) \ne 0, \qquad (v_{k-1}, e_{k-1}) \ne 0. \tag{4.30}$$
Taking into account (4.23) it is clear that conditions (4.29) will be satisfied if we choose $u_{k-1} = v_{k-1} = r_{k-1}$. Conditions (4.30) will also be satisfied since
$$(r_{k-1}, e_{k-1}) = (r_{k-1}, A r_{k-1}) > 0 \tag{4.31}$$
according to the properties of matrix $A$.
Vectors $u_{k-1}$, $v_{k-1}$ can also be chosen by using the following considerations. If condition (4.20) is satisfied, then we have
$$(A p_{k-1}, p_j) = \frac{1}{\alpha_{k-1}\alpha_j}\,(e_{k-1}, r_j) = 0, \qquad 0 \le j \le k - 2.$$
Making use of (4.25) we have then
$$(H_{k-1}^{*} e_{k-1}, e_j) = (e_{k-1}, H_{k-1} e_j) = a\,(e_{k-1}, r_j) = 0, \qquad 0 \le j \le k - 2.$$
It follows that in order to satisfy (4.29) we can assume [...]
General Properties of the Methods
Let us try to establish the general properties of methods of conjugate directions, which can be constructed in the manner described above.
First of all, it is necessary to ascertain whether condition (4.18) is satisfied by the methods under consideration, since in working out the methods of constructing the algorithms it is assumed that this condition is fulfilled.
Another interesting question is whether the directions $p_j$, $j = 0, 1, \ldots, n - 1$, which are determined by different matrices $H_j$, differ from one another, i.e. whether points $x_1, \ldots, x_{n-1}$ are different for different algorithms (on condition that point $x_0$ remains the same) or they coincide.
In order to answer these questions let us present vector $-p_j = H_j^{*} f'_j$, using the recursive formula for matrix $H_j$ and expressions (4.28), (4.32), in the following form:
$$H_j^{*} f'_j = (H_{j-1} + \Delta H_{j-1})^{*} f'_j.$$
Making use of (4.24) we can write
$$\Delta H_{j-1}^{*} f'_j = -\frac{(e_{j-1}, H_{j-1}^{*} f'_j)}{(v_{j-1}, e_{j-1})}\, v_{j-1}.$$
If we also take into account that
$$H_{j-1}^{*} e_{j-1} = H_{j-1}^{*} f'_j + p_{j-1},$$
then vector $-p_j$ can be written in the following form:
$$-p_j = \Big(1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^{*} f'_j)}{(v_{j-1}, e_{j-1})}\Big) H_{j-1}^{*} f'_j - \frac{(e_{j-1}, H_{j-1}^{*} f'_j)}{(v_{j-1}, e_{j-1})}\Big(t_{3,j} + \frac{t_{4,j}}{\alpha_{j-1}}\Big) r_{j-1}.$$
Further,
$$(v_{j-1}, e_{j-1}) = (t_{3,j} r_{j-1} + t_{4,j} H_{j-1}^{*} e_{j-1},\ e_{j-1}) = t_{3,j}\,(r_{j-1}, e_{j-1}) + t_{4,j}\,(e_{j-1}, H_{j-1}^{*} f'_j) + t_{4,j}\,(e_{j-1}, p_{j-1}), \tag{4.33}$$
hence
$$\frac{(r_{j-1}, e_{j-1})}{(v_{j-1}, e_{j-1})}\Big(t_{3,j} + \frac{t_{4,j}}{\alpha_{j-1}}\Big) = 1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^{*} f'_j)}{(v_{j-1}, e_{j-1})}.$$
Using this expression to transform the formula for $H_j^{*} f'_j$, we obtain
$$H_j^{*} f'_j = \gamma_j \Big(H_{j-1}^{*} f'_j - \frac{(e_{j-1}, H_{j-1}^{*} f'_j)}{(r_{j-1}, e_{j-1})}\, r_{j-1}\Big), \tag{4.34}$$
where
$$\gamma_j = 1 - \frac{t_{4,j}\,(e_{j-1}, H_{j-1}^{*} f'_j)}{(v_{j-1}, e_{j-1})}.$$
If vector $v_{j-1}$ satisfies condition (4.30) and $t_{3,j} \ne -t_{4,j}/\alpha_{j-1}$, then with any $j = 1, 2, \ldots$ factor $\gamma_j \ne 0$ since
$$t_{4,j}\,(e_{j-1}, H_{j-1}^{*} f'_j) \ne (v_{j-1}, e_{j-1}).$$
This inequality is easily checked by comparing the numerator of the ratio with expression (4.33). Further, suppose that factors $t_{3,j}$ and $t_{4,j}$ are such that with $j \ge 1$ the conditions $(v_{j-1}, e_{j-1}) \ne 0$ and $\gamma_j \ne 0$ are satisfied.
Let us perform the scalar multiplication of the two sides of equality (4.34) by $f'_k$:
$$(f'_k, H_j^{*} f'_j) = \gamma_j \Big[(f'_k, H_{j-1}^{*} f'_j) - \frac{(e_{j-1}, H_{j-1}^{*} f'_j)}{(r_{j-1}, e_{j-1})}\,(f'_k, r_{j-1})\Big]. \tag{4.35}$$
Since $\gamma_j \ne 0$ and $(f'_k, H_j^{*} f'_j) = 0$, $(f'_k, r_j) = 0$ with $j \le k - 1$ (by (4.21) and (4.24)), it follows from (4.35) that
$$(f'_k, H_{j-1}^{*} f'_j) = 0.$$
[...]
$$H_{j-1}^{*} f'_j = H_{j-2}^{*} f'_j = \cdots = H_0^{*} f'_j. \tag{4.38}$$
Using the recursive formula for matrix $H_i$ and taking into account conditions (4.24), we can write vector $H_{i+1}^{*} f'_j$ in the following form:
$$H_{i+1}^{*} f'_j = H_i^{*} f'_j - \frac{(f'_j, H_i e_i)}{(v_i, e_i)}\, v_i. \tag{4.39}$$
In order to prove equalities (4.38) it is necessary to show that
$$(f'_j, H_i e_i) = 0, \tag{4.40}$$
(/-. # ? + , « , + , ) = ( / ; , , ) - s <itl).
s=0
— 2. (4.42)
Because of conditions (4.24) and (4.37), we have
$$(v_i, f'_j) = t_{3,i+1}\,(r_i, f'_j) + t_{4,i+1}\,(H_i^{*} e_i, f'_j) = 0, \qquad 0 \le i \le j - 2. \tag{4.43}$$
Taking into account equalities (4.43) and conditions (4.37), it follows from (4.42) that
(/J, # ® e i + i ) = 0, 0 < i < ; - 3. (4.44)
L e t u s n o w c o n s i d e r t h e relations (4.41). W i t h i = 0 w e h a v e b y
( 4 . 3 7 ) ( f T o * 0 , / j ) = ( f f * e o, f ) ) = 0 a n d b y ( 4 . 4 4 ) (/}, H Q e x ) = 0 .
C o n s e q u e n t l y , ( H & , ff) = 0 . F u r t h e r , w e s i m i l a r l y e s t a b l i s h t h a t
equalities (4.38) hold.
Taking into account these equalities, we can write expression (4.34) in the following form:
$$H_j^{*} f'_j = \gamma_j \Big(H_0^{*} f'_j - \frac{(e_{j-1}, H_0^{*} f'_j)}{(r_{j-1}, e_{j-1})}\, r_{j-1}\Big). \tag{4.45}$$
This is the formula that enables us to answer the questions formulated at the beginning of this subsection.
By scalar multiplication of the two sides of (4.45) by $f'_j$ and using condition (4.24), we obtain
$$(f'_j, p_j) = -\gamma_j\,(f'_j, H_0 f'_j). \tag{4.46}$$
If $H_0$ is a strictly positive definite matrix, then $(f'_j, H_0 f'_j) > 0$. Consequently, if $\gamma_j \ne 0$, then it follows from (4.46) that $(f'_j, p_j) \ne 0$. Thus, it follows from (4.46) that the assumption that condition (4.18) can be satisfied, used in working out the methods of constructing conjugate vectors, proves to hold if $H_0$ is a symmetric, strictly positive definite matrix.
In order to ascertain whether vectors $p_i$ and points $x_{i+1}$, $i = 0, 1, \ldots, n - 1$, are different in different algorithms we turn again to formula (4.45).
The first step in any method of conjugate directions is the same (given the same matrix $H_0$) since $x_1 = x_0 - \alpha_0 H_0^{*} f'_0$ and $\alpha_0$ is chosen under the condition $\min_{\alpha} f(x_0 - \alpha H_0^{*} f'_0)$. Consequently, point $x_1$ and therefore vectors $r_0$, $e_0$, $f'_1$ will also be the same in any algorithm which can be constructed by the method described. But then, as it follows from (4.45), the direction [...]
$H_1$ (by virtue of the strict convexity of $f(x)$). Consequently, the quantities $r_1$, $e_1$, $f'_2$ will be the same for different methods of conjugate directions. Continuing this argument based on expressing vector $p_k$ by formula (4.45) we conclude that points $x_0, x_1, \ldots, x_n$ are independent of the choice of vectors $u_k$, $v_k$, i.e. of the method of constructing matrix $H_k$. Thus the successive approximations to the solution of the problem of minimization of a quadratic function are the same for different methods of conjugate directions.
One more remark.
It could be noted above that the first of the two matrices that form matrix $\Delta H_j$ (4.28), $j = 0, 1, \ldots, k - 1$, takes no part in constructing vector $p_k$. Indeed, in determining vector $\Delta H_j^{*} f'_k$ we find according to conditions (4.24) that
$$a\, u_j\, \frac{(r_j, f'_k)}{(u_j, e_j)} = 0,$$
i.e. matrices
$$a\, \frac{r_j u_j^{*}}{(u_j, e_j)}$$
[...]
Concrete Algorithms
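(1) We set in (4.28) $a = 1$, $u_{k-1} = r_{k-1}$, $v_{k-1} = H_{k-1}^{*} e_{k-1}$ (i.e. in formulas (4.32) $t_{1,k} = t_1 = 1$, $t_{2,k} = t_2 = 0$, $t_{3,k} = t_3 = 0$, $t_{4,k} = t_4 = 1$). Then
$$H_k = H_{k-1} + \frac{r_{k-1} r_{k-1}^{*}}{(r_{k-1}, e_{k-1})} - \frac{H_{k-1} e_{k-1} e_{k-1}^{*} H_{k-1}}{(H_{k-1} e_{k-1}, e_{k-1})}. \tag{4.48}$$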
Let us study some properties of matrix $H_k$ obtained by this method.
Matrix $H_k$ is symmetric. This fact is easily established by induction. Matrix $H_0$ is symmetric. The two matrices which form $\Delta H_0$ are symmetric too (the second one by virtue of the symmetry of $H_0$). Therefore, $H_1$ is a symmetric matrix. Similar arguments hold for any $k = 2, \ldots, n$.
Matrix $H_k$ is strictly positive definite. We give a proof by induction. Matrix $H_0$ is strictly positive definite. Let $H_k$ be a strictly positive definite matrix. Then for any $x \in E^n$
[...]
Making use of these relations and applying the Cauchy-Buniakowski inequality we conclude that the following inequality holds:
$$(H_k x, x)(H_k e_k, e_k) - (H_k e_k, x)^2 = (y, y)(z, z) - (z, y)^2 \ge 0,$$
and equality is attained only if $y$ and $z$ are collinear, i.e., since $H_k$ is nonsingular, only if $x = \lambda e_k$. But in this case, for $x \ne 0$, $(r_k, x) = \lambda (r_k, e_k) = \lambda (r_k, A r_k) \ne 0$. Thus for any $x \ne 0$, we have
$$(H_{k+1} x, x) = \frac{(H_k x, x)(H_k e_k, e_k) - (H_k e_k, x)^2}{(H_k e_k, e_k)} + \frac{(r_k, x)^2}{(r_k, e_k)} > 0,$$
and this proves that our reasoning by induction holds.
Matrix $H_n = A^{-1}$. Indeed, $H_k$ satisfies (4.25) with $a = 1$, i.e. $H_n e_j = r_j$, $j = 0, 1, \ldots, n - 1$, or, making use of (4.19),
$$H_n A r_j = r_j, \qquad j = 0, 1, \ldots, n - 1.$$
It follows that vectors $r_0, \ldots, r_{n-1}$ are eigenvectors of matrix $H_n A$ with eigenvalues equal to unity. Hence, taking into account the linear independence of conjugate vectors $r_i$, $i = 0, 1, \ldots, n - 1$, we have $H_n A = I$, i.e.
$$H_n = A^{-1}.$$
But from (4.8)
$$A^{-1} = \sum_{i=0}^{n-1} \frac{r_i r_i^{*}}{(r_i, e_i)},$$
i.e. we find that matrix $H_n$ is determined only by the matrices
$$a\, \frac{r_i u_i^{*}}{(u_i, e_i)} = \frac{r_i r_i^{*}}{(r_i, e_i)}$$
(this was mentioned at the end of the preceding subsection).
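The property $H_n = A^{-1}$ of update (4.48) can be observed on a small example. The sketch below is only an illustration under assumptions: it takes $H_0 = I$, chooses the step by exact minimization along $p_k = -H_k f'_k$, and uses the hypothetical names `quadratic_by_update_4_48`, `outer` and `matvec`.

```python
def dot(u, v):
    """Scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, x):
    """Product of matrix M (list of rows) and vector x."""
    return [dot(row, x) for row in M]


def outer(u, v, scale):
    """Matrix u v^* multiplied by a scalar."""
    return [[scale * ui * vj for vj in v] for ui in u]


def quadratic_by_update_4_48(A, b, x0):
    """Minimize 1/2 (Ax, x) + (b, x) with matrices H_k built by formula (4.48)."""
    n = len(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # H_0 = I
    x = list(x0)
    g = [gi + bi for gi, bi in zip(matvec(A, x), b)]
    for _ in range(n):
        if dot(g, g) < 1e-24:                       # gradient already zero
            break
        p = [-c for c in matvec(H, g)]              # direction p_k = -H_k f'_k
        alpha = -dot(g, p) / dot(matvec(A, p), p)   # exact minimum along p_k
        r = [alpha * pi for pi in p]                # r_k = x_{k+1} - x_k
        x = [xi + ri for xi, ri in zip(x, r)]
        g_new = [gi + bi for gi, bi in zip(matvec(A, x), b)]
        e = [a2 - a1 for a2, a1 in zip(g_new, g)]   # e_k = f'_{k+1} - f'_k
        He = matvec(H, e)
        plus = outer(r, r, 1.0 / dot(r, e))
        minus = outer(He, He, 1.0 / dot(He, e))     # H symmetric: e^* H = (H e)^*
        H = [[H[i][j] + plus[i][j] - minus[i][j] for j in range(n)] for i in range(n)]
        g = g_new
    return x, H


A = [[4.0, 1.0], [1.0, 3.0]]
b = [-1.0, -2.0]
x_min, H_n = quadratic_by_update_4_48(A, b, [0.0, 0.0])
```

For this example two steps are performed, after which `H_n` agrees with $A^{-1}$ to rounding accuracy. If the gradient vanished before $n$ steps, the loop would stop early and `H_n` would in general not be the full inverse.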
(2) Another method of constructing $H_k$ is obtained if we take $a = 1$ in (4.28) and choose $u_{k-1} = v_{k-1} = r_{k-1}$. We have then
$$H_k = H_{k-1} + (r_{k-1} - H_{k-1} e_{k-1})\, \frac{r_{k-1}^{*}}{(r_{k-1}, e_{k-1})}. \tag{4.49}$$
Matrix $H_k$ thus determined is no longer symmetric. Since $a = 1$, we have now $H_n = A^{-1}$ and this can be demonstrated just in the same way as for method (4.48).
Using (4.49) we can obtain a somewhat different formula for constructing $H_k$. We write (4.49) in the following form:
$$H_k = H_0 + \sum_{i=0}^{k-1} (r_i - H_i e_i)\, \frac{r_i^{*}}{(r_i, e_i)}. \tag{4.50}$$
According to the conditions of conjugateness (4.20) (taking into account formulas (4.19)), we have $(e_k, r_j) = 0$, $0 \le j \le k - 1$. Consequently, it follows from (4.50) that
$$H_k e_k = H_0 e_k, \qquad k = 0, 1, \ldots, n - 1. \tag{4.51}$$
Thus formula (4.49) can be written as follows:
$$H_k = H_{k-1} + (r_{k-1} - H_0 e_{k-1})\, \frac{r_{k-1}^{*}}{(r_{k-1}, e_{k-1})}. \tag{4.52}$$
If $H_0 = I$, this formula proves somewhat simpler than (4.49).
(3) Let us choose $a = 0$, $v_{k-1} = r_{k-1}$. In this case
$$H_k = H_{k-1} - \frac{H_{k-1} e_{k-1} r_{k-1}^{*}}{(r_{k-1}, e_{k-1})}. \tag{4.53}$$
With $a = 0$ it follows from (4.25) that $H_n e_j = 0$, $j = 0, 1, \ldots, n - 1$. Since vectors $e_0, \ldots, e_{n-1}$ are linearly independent, [...]
^h { H k - i e h - u «ft-i)*
According to equalities (4.38) we have $(e_{k-1}, H_{k-1}^{*} f'_k) = (e_{k-1}, H_0^{*} f'_k)$.
F u r t h e r , b e c a u s e o f ( 4 5 1 ) a n d (4. 3 8 ) , ( H h P h i e h) — ( H h f h + i * fk+ i ) +
- h ( H k , f k ) = ( H 0 f h + u / i + i ) — ( p ft, f h ). T h e equalities obtained
s h o w that
. «k-i)
Yft = (4.57)
Note that from (4.45), because of (4.21) and (4.24), it follows that (provided $\gamma_j \ne 0$)
$$(f'_k, H_0^{*} f'_j) = 0, \qquad j < k. \tag{4.58}$$
Taking this into account, it proves that [...]
Note also that
$$(f'_k, p_k) = (f'_k, p_k) - (f'_{k+1}, p_k) = -(e_k, p_k). \tag{4.61}$$
where $\beta_k$ is calculated by formula (4.56). If we use equalities (4.59), (4.61) and (4.46) (the last one in the case under consideration takes the form
$$(p_k, f'_k) = -(H_0 f'_k, f'_k) \tag{4.67}$$
), then for determining coefficient $\beta_k$ one of the following formulas can be obtained:
$$\beta_k = \frac{(H_0 f'_k, e_{k-1})}{(p_{k-1}, e_{k-1})} = -\frac{(H_0 f'_k, f'_k)}{(p_{k-1}, f'_{k-1})} = \frac{(H_0 f'_k, f'_k)}{(H_0 f'_{k-1}, f'_{k-1})}, \tag{4.68}$$
and if use is made of the second of formulas (4.68), then
* > - * • + ! S S b - - ( 4 -7 1 )
Note that in formula (4.71) $H_n = H_0$ (since $f'_n = 0$); in (4.70) $H_n \ne H_0$.
The reader himself can obtain other formulas for constructing $H_k$.
The simplest formula for calculating $A$-orthogonal vectors can be obtained by choosing $H_0 = I$ in (4.66). In this case
$$p_k = -f'_k + \beta_k p_{k-1}, \tag{4.72}$$
where $\beta_k$ is determined, for instance, by one of the following formulas:
$$\beta_k = \frac{(f'_k, e_{k-1})}{(p_{k-1}, e_{k-1})} = -\frac{(f'_k, f'_k)}{(p_{k-1}, f'_{k-1})} = \frac{(f'_k, f'_k)}{(f'_{k-1}, f'_{k-1})}. \tag{4.73}$$
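Formulas (4.72), (4.73) lead to a very short algorithm. The following is a minimal sketch, assuming the last of the expressions (4.73) for $\beta_k$ and a step chosen by exact minimization along $p_k$; the function name `conjugate_gradient` and the stopping threshold are hypothetical choices for the illustration.

```python
def dot(u, v):
    """Scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Product of matrix A (list of rows) and vector x."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, x0, tiny=1e-24):
    """Minimize 1/2 (Ax, x) + (b, x) + c using directions (4.72)."""
    x = list(x0)
    g = [gi + bi for gi, bi in zip(matvec(A, x), b)]     # f'_0 = A x_0 + b
    p = [-gi for gi in g]                                # p_0 = -f'_0
    for _ in range(len(x0)):                             # at most n steps
        gg = dot(g, g)
        if gg < tiny:
            break
        alpha = -dot(g, p) / dot(matvec(A, p), p)        # exact step along p_k
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [gi + bi for gi, bi in zip(matvec(A, x), b)]
        beta = dot(g_new, g_new) / gg                    # last formula of (4.73)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return x


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [-1.0, -2.0, -3.0]
x_min = conjugate_gradient(A, b, [0.0, 0.0, 0.0])
residual = [ri + bi for ri, bi in zip(matvec(A, x_min), b)]
```

Only vectors are stored, so no matrix $H_k$ has to be kept in memory; this is the practical attraction of choosing $H_0 = I$.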
Minimization of a Convex Quadratic Function
Until now we considered methods of $A$-orthogonal directions for the minimization of a strictly convex quadratic function, i.e. assumed matrix $A$ to be strictly positive definite.
Let now the function
$$f(x) = \tfrac{1}{2}(Ax, x) + (b, x) + c$$
be convex, i.e. matrix $A$ be positive definite: $(Ax, x) \ge 0$ with any $x \ne 0$. Suppose that this function has a minimum.
Let us study the problem of the application of methods of conjugate directions in this case. Consider preliminarily certain properties of function $f(x)$.
(1) If $(Ap, p) = 0$ then of necessity
$$Ap = 0. \tag{4.74}$$
In fact, if $(Ap, p) = 0$, then $p$ is the minimum point of the convex function $\varphi(x) = \tfrac{1}{2}(Ax, x)$. But at the minimum point the necessary condition of an extremum
$$\varphi'(p) = Ap = 0$$
must be satisfied.
(2) If $p$ is the minimum point of the convex function $\varphi(x) = \tfrac{1}{2}(Ax, x)$, then of necessity
$$(b, p) = 0. \tag{4.75}$$
In fact, if $(Ap, p) = 0$ and $(b, p) > 0$, then $f(\alpha p) = \alpha (b, p) + c \to -\infty$ as $\alpha \to -\infty$, i.e. $f(x)$ does not attain the minimum and this contradicts the assumption. The case where $(b, p) < 0$ is treated in a similar manner.
(3) The minimum point of function $f(x)$ is not unique.
Indeed, any minimum point of a convex quadratic function $f(x)$ must be a solution of the linear system $Ax + b = 0$ and conversely, since the condition $f'(x) = Ax + b = 0$ is a necessary and (since there is a minimum of $f(x)$) sufficient condition for an extremum of the convex function $f(x)$ (corollary 3.2 of Chap. I). However, the rank of matrix $A$ is lower than the number of unknowns (the condition $(Ax, x) = 0$ for some $x \ne 0$ means that matrix $A$ is singular, see (4.74)) and so the system $Ax + b = 0$ has no unique solution.
(4) If $(Ap, p) = 0$ and $x \in E^n$ is an arbitrary point, then of necessity
$$(f'(x), p) = 0. \tag{4.76}$$
[...]
$$f(p) = \tfrac{1}{2}(Ap, p) + (b, p) + c = c.)$$
That this statement holds is shown as follows.
If the conditions
$$(A p_i, p_i) > 0 \tag{4.78}$$
were satisfied for any $i = 0, 1, \ldots, n - 1$, then the system of vectors $p_0, p_1, \ldots, p_{n-1}$ would be linearly independent. Indeed, suppose that conditions (4.78) are satisfied and that
$$\sum_{i=0}^{n-1} \delta_i p_i = 0$$
2 1 ^iPi
i=0
This contradicts condition $(Ax, x) > 0$.
Discussion of Results
However, it should be kept in mind that we are making a purely theoretical estimate of the methods and do not take into account such an important factor as the sensitivity of an algorithm to errors in computations. This factor can change considerably the relation between the amounts of computation involved in solving the problem by different algorithms. It should also be noted that methods (4.48), (4.49), (4.52), for instance, in solving a minimization problem make it possible to obtain at the same time the inverse matrix $A^{-1}$, and this may be useful in some cases.
The difference in properties of the algorithms shows up considerably when they are used for minimization of nonquadratic functions; this will be discussed in the next section.
Methods of conjugate directions prove useful in one more aspect; they make it possible to establish whether the sign of the matrix is fixed. Thus according to the results of the subsection on p. 99, if matrix $A$ is positive semidefinite and function $f(x)$ does not attain the minimum, then at a certain step we shall have $\alpha_k = \infty$. However, if matrix $A$ is not positive semidefinite, we find at a certain step of process (4.47) that $\alpha_k < 0$. Thus the value of parameter $\alpha_k$ determines the sign of matrix $A$.
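The remark about the sign of the matrix can be illustrated with a minimal sketch. It does not reproduce process (4.47), whose formulas are not given in this section; it assumes the standard conjugate gradient recurrences for $f(x) = \tfrac{1}{2}(Ax, x) + (b, x)$ and simply reports the curvature $(Ap_k, p_k)$ met along the search directions. The function name `classify_by_curvature`, the tolerance `eps` and the returned labels are hypothetical choices made for the illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def classify_by_curvature(A, b, x0, eps=1e-12):
    """Run conjugate gradients on f(x) = 1/2 (Ax, x) + (b, x) and report
    what the curvature (A p_k, p_k) along the search directions reveals."""
    x = list(x0)
    g = [ax + bi for ax, bi in zip(matvec(A, x), b)]  # gradient f'(x) = Ax + b
    p = [-gi for gi in g]
    for _ in range(len(b)):
        if dot(g, g) <= eps:
            return "minimum found", x
        Ap = matvec(A, p)
        curvature = dot(Ap, p)
        if curvature < -eps:
            # the exact step -(g, p) / (Ap, p) would be negative
            return "negative curvature", x
        if curvature <= eps:
            # f is linear along p, so the exact step is infinite
            return "zero curvature", x
        alpha = -dot(g, p) / curvature
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [gi + alpha * api for gi, api in zip(g, Ap)]
        beta = dot(g_new, g_new) / dot(g, g)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return ("minimum found" if dot(g, g) <= eps else "not converged"), x
```

With $A = \mathrm{diag}(2, 1)$ the sketch terminates at the minimum after two steps; with $A = \mathrm{diag}(1, 0)$ and $b = (0, -1)$ it meets zero curvature (the case of an infinite step); with $A = \mathrm{diag}(1, -1)$ it meets negative curvature (the case of a negative step). The sketch only reports what it observes along the directions it generates.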
The effectiveness of methods of conjugate directions is the reason for their increasingly extensive application to the minimization of quadratic functions and the solution of systems of linear equations.
5. METHODS OF CONJUGATE DIRECTIONS.
MINIMIZATION OF ARBITRARY FUNCTIONS
Considerations about the Applicability
of the Methods
Suppose that we intend to make use of the process
$x_{k+1} = x_k + \alpha_k p_k, \quad p_k = -H_k f'_k, \quad k = 0, 1, \dots,$ (5.1)
where vector $p_k$ (or matrix $H_k$) is determined by one of the methods studied in the preceding section, for the minimization of an arbitrary (not quadratic) convex function $f(x)$. In this case, matrix $f''(x)$ will have different elements at different points of sequence (5.1); by virtue of this fact vectors $p_0, \dots, p_k$ constructed by any of the methods described in the subsection on p. 93 will not satisfy conditions (4.20), i.e. will not be conjugate. However, if the initial point $x_0$ is in a close neighbourhood of the minimum $x_*$ of a smooth convex function $f(x)$, then at any point of this region matrix $f''(x)$ is close enough to matrix $f''(x_*)$, i.e. the quadratic function
<P (*) = y ( H * .
is a good approximation to the function $f(x)$. Thus, we can expect that the properties of vectors $p_0, \dots, p_k$ determined by methods of Sec. 4 will be close enough to the properties of conjugate vectors ($f''(x_*)$-orthogonal) and therefore the properties of process (5.1), in which parameter $\alpha_k$ is chosen on condition that the minimum of function $f(x)$ occurs in the direction of $p_k$, will be close enough to the properties of methods of conjugate directions. In other words, we can expect the methods of the preceding section to prove sufficiently effective in minimizing nonquadratic functions too. In this case, the methods will no longer yield the result after a finite number of steps since the conditions
$(f''(x_*) p_k, p_i) = 0, \quad i \ne k$
Theorem on Convergence of the Methods
In what follows we shall assume that $f(x)$ is a strongly convex differentiable function whose first and second derivatives are continuous, i.e. that conditions
$m\|y\|^2 \le (f''(x) y, y) \le M\|y\|^2, \quad m > 0$ (5.3)
are satisfied for all $x, y \in E^n$, and that a symmetric, strictly positive definite matrix has been chosen as $H_0$, i.e.
$m_0\|y\|^2 \le (H_0 y, y) \le M_0\|y\|^2, \quad m_0 > 0$ (5.4)
for all $y \in E^n$.
Processes of type (5.1) can be realized either with restoration of matrix $H_k$ after a finite number of steps, or without such a reinitialization. Speaking of processes with restoration, say, after $n$ steps we mean that with any $\xi = 0, 1, \dots$ matrix $H$ is restored, i.e.
$H_{\xi n} = H_0.$
To begin with, note the following fact. If a process with restoration of matrix $H_k$ after a finite number of steps is being realized, then for any of the methods of conjugate directions the following condition is fulfilled:
$\lim_{k \to \infty} \|f'(x_k)\| = 0$ (5.5)
since each first step of the process after restoration is a step of gradient descent, for which according to (5.3) the conditions of convergence of gradient methods (theorem 1.6) are satisfied, and in the following steps between restorations we have a descent to the minimum of the function in the direction of motion. The fulfillment of condition (5.5) for a strictly convex function means that any of the methods discussed in Sec. 4, if realized with restoration of matrix $H_k$ after a finite number of steps, converges to the solution. Therefore in order to judge the effectiveness of such a process it is important to obtain bounds on its rate of convergence.
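A process of type (5.1) with restoration can be sketched as follows. This is an illustration under stated assumptions, not the book's algorithm: $H_0$ is taken to be the identity matrix, the matrix update is the Davidon-Fletcher-Powell formula (assumed here as a stand-in, since formula (4.48) is not reproduced in this section), the exact step $\alpha_k$ is approximated by bisection on the directional derivative, and the test function $f(x, y) = x^2 + y^2 + e^{x+y}$ is a hypothetical strongly convex example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(H, v):
    return [dot(row, v) for row in H]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exact_step(grad, x, p):
    """Step alpha >= 0 at which (f'(x + alpha p), p) = 0, found by bisection.
    For a convex f this slope is nondecreasing in alpha."""
    slope = lambda a: dot(grad([xi + a * pi for xi, pi in zip(x, p)]), p)
    lo, hi = 0.0, 1.0
    while slope(hi) < 0.0:          # expand until the minimum is bracketed
        lo, hi = hi, 2.0 * hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def minimize_with_restoration(grad, x0, tol=1e-8, max_iter=500):
    n = len(x0)
    x, g = list(x0), grad(x0)
    H = identity(n)
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            return x, k
        if k % n == 0:
            H = identity(n)                    # restoration: H_{xi n} = H_0 (here I)
        p = [-v for v in matvec(H, g)]         # p_k = -H_k f'_k
        alpha = exact_step(grad, x, p)
        x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        r = [a - b for a, b in zip(x_new, x)]  # r_k = x_{k+1} - x_k
        e = [a - b for a, b in zip(g_new, g)]  # e_k = f'_{k+1} - f'_k
        He = matvec(H, e)
        re, eHe = dot(r, e), dot(e, He)
        if re > 0.0 and eHe > 0.0:             # skip an ill-defined update
            H = [[H[i][j] + r[i] * r[j] / re - He[i] * He[j] / eHe
                  for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x, max_iter


def example_grad(x):
    # gradient of the hypothetical test function f(x, y) = x^2 + y^2 + exp(x + y)
    s = math.exp(x[0] + x[1])
    return [2.0 * x[0] + s, 2.0 * x[1] + s]
```

Calling `minimize_with_restoration(example_grad, [1.0, -2.0])` returns a point with both coordinates close to $-0.2835716$, the solution of $2x + e^{2x} = 0$. Every step that follows a restoration is a plain gradient step, which is what the convergence argument above relies on.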
Note that condition (5.5) for processes with restoration is satisfied not only for strictly convex functions but for any function if the fulfillment of (5.5) is guaranteed for it in applying gradient methods (see theorem 1.4).
However, if processes are realized without restoration of $H_k$, then their convergence must be substantiated. Besides, it is also necessary to estimate their rate of convergence.
Let us now formulate the theorem whose contents are the main result of this section.
Theorem 5.1. For the minimization of function $f(x)$ which satisfies conditions (5.3) let there be applied process (5.1) in which the construction of matrix $H_k$ is performed by one of the methods of Sec. 4 ((4.48), (4.49), (4.52)-(4.54), (4.69)-(4.71)) with restoration of $H_k$ after $n$ steps. If the value of $\alpha_k$ is chosen under the condition that the minimum of the function be in the direction of $p_k$, then the sequence $\{x_{\xi n}\}$, whatever the initial point $x_0$ chosen, converges to the solution at a superlinear rate.
Let us outline the general scheme of the proof of this theorem.
Suppose that the assertion is not true, i.e. that for the iterative processes described the following condition is satisfied with any $k$:
$\|x_{k+1} - x_*\| \ge \lambda \|x_k - x_*\|$ (5.6)
where $\lambda > 0$ is a constant. Using inequality (1.12) and the expression
$\|f'(x)\| = \|f'(x) - f'(x_*)\| \le M \|x - x_*\|$ (5.7)
which hold for a function which satisfies condition (5.3), we find that condition (5.6) is equivalent to the following one:
$\|f'_{k+1}\| \ge \delta \|f'_k\|$ (5.8)
where $\delta > 0$ is a constant.
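The passage from (5.6) to (5.8) and back can be traced in two lines. The sketch below assumes (an assumption, since inequality (1.12) is not reproduced in this section) that (1.12) is the strong-convexity bound $\|f'(x)\| \ge m\|x - x_*\|$, and it uses (5.7) for the upper bound; under that assumption the constants are related as shown.

```latex
% (5.6) => (5.8), with \delta = m\lambda / M:
\|f'_{k+1}\| \;\ge\; m\,\|x_{k+1} - x_*\| \;\ge\; m\lambda\,\|x_k - x_*\|
             \;\ge\; \frac{m\lambda}{M}\,\|f'_k\| .

% (5.8) => (5.6), with \lambda = m\delta / M:
\|x_{k+1} - x_*\| \;\ge\; \frac{1}{M}\,\|f'_{k+1}\| \;\ge\; \frac{\delta}{M}\,\|f'_k\|
                  \;\ge\; \frac{m\delta}{M}\,\|x_k - x_*\| .
```

In both directions only the two-sided comparison $m\|x - x_*\| \le \|f'(x)\| \le M\|x - x_*\|$ is used, so the two conditions differ only in the value of the constant.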
Studying the properties of process (5.1) and assuming that condition (5.8) is fulfilled we find that independently of what algorithm is used for constructing matrix $H_k$ the following estimates hold:
$C\|f'_k\| \le \|r_k\| \le N\|f'_k\|$ (5.9)
It will be demonstrated below that if these estimates are fulfilled, then sequence (5.1) converges to the solution at a superlinear rate. However, this contradicts our initial assumption (5.6) (or (5.8)), i.e. condition (5.6) cannot be satisfied for process (5.1). Using this fact it will be easy to establish that the theorem holds.
Thus, the pattern of the proof is the same for all the methods discussed, but the validity of (5.9) and (5.10) is established in different ways. The proof that these estimates hold for different algorithms will be given in the next subsection and here we shall describe that part of the proof which all these methods have in common.
We shall first make a remark about the notation. In what follows, for simplicity, in using vectors and parameters
$f'_{\xi n+i}, \; r_{\xi n+i}, \; \alpha_{\xi n+i}, \; p_{\xi n+i}, \quad i = 0, 1, \dots, n-1,$
we shall often omit index $\xi n$, i.e. operate with vectors and parameters $r_i$, $f'_i$, $e_i$, $\alpha_i$, $p_i$ etc. However, it should be stressed that this is done only to simplify the written form; the real index of the corresponding quantity is $\xi n + i$.
We turn now to the proof of the theorem and assume that estimates (5.9) and (5.10) are satisfied. Using Lagrange's formula for operators we obtain
$(e_i, r_j) = (f''_{ic} r_i, r_j) = (f''_{\xi n} r_i, r_j) + ((f''_{ic} - f''_{\xi n}) r_i, r_j),$ (5.11)
where, as usual, index $ic$ denotes an intermediate point in the corresponding segment:
$x_{ic} = x_i + \theta r_i, \quad 0 \le \theta \le 1.$
If $\|r_i\| \to 0$, then because of the uniform continuity of second derivatives of function $f(x)$ on set $S = \{x : f(x) \le f(x_0)\}$ we have $\|f''_{ic} - f''_{\xi n}\| \to 0$ and it follows from (5.11) that, if (5.10) is satisfied, estimates
$(f''_{\xi n} r_i, r_j) = o(\|r_i\| \|r_j\|) + o(\|e_i\| \|r_j\|), \quad i \ne j, \quad 0 \le i, j \le n-1$ (5.12)
hold too.
If estimates (5.12) are fulfilled, then there are vectors
$\bar r_i = r_i + \omega_i, \quad i = 0, 1, \dots, n-1,$ (5.13)
where $\|\omega_i\| = o(\|r_i\|)$, such that
$(f''_{\xi n} \bar r_i, \bar r_j) = 0, \quad i \ne j.$ (5.14)
This can be shown as follows.
Let us normalize vectors $r_i$:
$\tilde r_i = \dfrac{r_i}{(f''_{\xi n} r_i, r_i)^{1/2}}.$
Then $(f''_{\xi n} \tilde r_i, \tilde r_i) = 1$ and, as $\xi \to \infty$, since process (5.1) is convergent (with restoration of $H_k$), by (5.3) and (5.12) we have
$(f''_{\xi n} \tilde r_i, \tilde r_j) \to 0, \quad i \ne j, \quad 0 \le i, j \le n-1.$
Therefore if $R_{\xi n}$ is a matrix whose columns are vectors $\tilde r_i$ and $F_{\xi n} = R_{\xi n}^{*} f''_{\xi n} R_{\xi n}$, then as $\xi \to \infty$
$F_{\xi n} \to I.$
For the matrix $Q_{\xi n} = R_{\xi n} F_{\xi n}^{-1/2}$ we have
$Q_{\xi n}^{*} f''_{\xi n} Q_{\xi n} = I.$ (5.15)
Now, since $F_{\xi n} \to I$, we have also $F_{\xi n}^{-1/2} \to I$ and, consequently, $R_{\xi n} - Q_{\xi n} \to 0$, i.e. vector-columns $q_i$ of matrix $Q_{\xi n}$ can be written in the form
$q_i = \tilde r_i + \tilde\omega_i, \quad i = 0, 1, \dots, n-1$
Because of (5.15), vectors $q_i$ and $\bar r_i = (f''_{\xi n} r_i, r_i)^{1/2} q_i$, $i = 0, 1, \dots, n-1$, satisfy conditions (5.14). At the same time, vectors $\bar r_i$ satisfy conditions (5.13) since by (5.3)
Thus it has been shown that (5.14) holds.
Vectors $\bar r_i$ with sufficiently large $\xi$ are linearly independent. Indeed, let there be factors $\delta_i$, $i = 0, 1, \dots, n-1$ (of which at least two are nonzero) such that $\sum_{i=0}^{n-1} \delta_i \bar r_i = 0$. If $\delta_0 \ne 0$, then we obtain
$\delta_0 (f''_{\xi n} \bar r_0, \bar r_0) + \sum_{j=1}^{n-1} \delta_j (f''_{\xi n} \bar r_0, \bar r_j) = 0.$
However, with sufficiently large values of $\xi$ this equality is not satisfied. Indeed, since $\|\omega_i\| = o(\|r_i\|)$ and, as $\xi \to \infty$, $\|r_i\| \to 0$, with sufficiently large $\xi$ according to (5.3), we have
i f r 0) — ^*o) “ 1“ (/fen^* 0 » ^ o )
and at the same time $(f''_{\xi n} \bar r_0, \bar r_j) = 0$, $j = 1, \dots, n-1$, because of (5.14). Thus we come to a contradiction, i.e. vectors $\bar r_i$, $i = 0, 1, \dots, n-1$, are really linearly independent.
Let $z_{\xi n}$ be the minimum of the quadratic function
Let us write vector $z_{\xi n} - x_{\xi n}$ in the form
7 1 - 1
2 a ifiiir i ~ / g »•
i=0
Hence, taking into account (5.14), it follows that coefficients $a_i$ can be calculated by the following formulas:
di - 1 , ..., n — 1.
According to conditions (5.8) and (5.9) all of the vectors $r_0, \dots, r_{n-1}$ are of the same order of smallness (recall that vectors $r_{\xi n+i}$ are actually meant). Since, as was mentioned above, $\|e_i\| \le M\|r_i\|$, vectors $e_0, \dots, e_{n-1}$ are of the same order of smallness. Taking into account the above remarks the equalities (5.17) can be written in the following form:
$(f'_{\xi n}, \bar r_i) = -(e_i, r_i) + o(\|r_i\|^2), \quad i = 0, 1, \dots, n-1.$
Further, taking into account (5.13), we find that
$(f''_{\xi n} \bar r_i, \bar r_i) = (f''_{\xi n} r_i, r_i) + (f''_{\xi n} \omega_i, r_i + \bar r_i)$
$= (f''_{ic} r_i, r_i) + ((f''_{\xi n} - f''_{ic}) r_i, r_i) + (f''_{\xi n} \omega_i, r_i + \bar r_i) = (e_i, r_i) + o_1(\|r_i\|^2).$
Thus
$a_i = \dfrac{(e_i, r_i) + o(\|r_i\|^2)}{(e_i, r_i) + o_1(\|r_i\|^2)}.$
By (5.3), we have
$(e_i, r_i) = (f''_{ic} r_i, r_i) \ge m\|r_i\|^2, \quad i = 0, 1, \dots, n-1.$ (5.18)
Consequently, as $\xi \to \infty$ (i.e. as $\|r_i\| \to 0$)
$a_i \to 1, \quad i = 0, 1, \dots, n-1.$ (5.19)
Since $x_{(\xi+1)n} - x_{\xi n} = \sum_{i=0}^{n-1} r_i$, we have
n - 1 _
Study of Properties
of Different Algorithms
We turn now to the proof of the validity of estimates (5.9), (5.10) for different methods of conjugate directions with restoration of matrix $H_k$ after $n$ steps, assuming that inequality (5.6) (or (5.8)) is fulfilled.
The fact that for any of these methods the estimates hold is established by induction; it is demonstrated that estimates (5.9), (5.10) hold with $i \ne j$, $i, j = 0, 1$; and then, supposing that these estimates hold with $0 \le i, j \le \tau < n-1$, we prove that they remain valid also with $0 \le i, j \le \tau + 1$.
1. Method (4.48). If restoration of matrix (4.48) is performed after a finite number of steps, then with any $k$ matrix $H_k$ is bounded:
$\|H_k\| \le L < \infty.$ (5.25)
We show now how this can be proved.
By (5.2),
$(H_k f'_k, f'_{k+1}) = -(p_k, f'_{k+1}) = 0.$
Therefore
$(H_k e_k, e_k) = (H_k f'_k, f'_k) + (H_k f'_{k+1}, f'_{k+1}).$ (5.26)
Since $H_k$ is positive definite (Sec. 4), we have $(H_k e_k, e_k) \ge (H_k f'_k, f'_k) = -(p_k, f'_k) = (p_k, e_k)$. Hence, according to (5.18)
$(H_k e_k, e_k) \ge \dfrac{m}{\alpha_k} \|r_k\|^2.$ (5.27)
where a i = m 0 j (l+-^") *s i n d e p e n d e n t o f I.
Let us use now inequality (5.30) in order to estimate the value of parameter $\alpha_1$. Since
$f_2 - f_1 = \alpha_1 (f'_1, p_1) + \dfrac{\alpha_1^2}{2} (f''_{1c} p_1, p_1)$
and $\alpha_1$ is chosen according to condition (5.2), it is clear that
$-\dfrac{(f'_1, p_1)}{M\|p_1\|^2} \le \alpha_1 \le -\dfrac{(f'_1, p_1)}{m\|p_1\|^2}.$
Now by (5.30), we have $-(f'_1, p_1) = (H_1 f'_1, f'_1) \ge a_1 \|f'_1\|^2$ and by (5.25), $\|p_1\| = \|H_1 f'_1\| \le L\|f'_1\|$; taking these estimates into account, we have $\alpha_1 \ge \dfrac{a_1}{M L^2} = \underline{\alpha}_1 > 0$. At the same time it follows from (5.30) that $\|p_1\| \ge a_1 \|f'_1\|$. Using this estimate we can easily establish that $\alpha_1 \le \dfrac{L}{m a_1^2} = \bar\alpha < \infty$. Thus we find that
+ ( H j f r + i , / ; + i ) ( H 1f-, / i ) - ( / / , / ; + 1 , / ; + o 2
- ( H j f ] , / ; + , ) ’ + 2 { H , f i + u f't+i) { H i f ' h / ; + « ) ! •
On the right-hand side of this inequality the difference between the first and the third terms of the numerator, by the Cauchy-Bunyakovsky inequality, is nonnegative. Taking into account estimates (5.34), (5.27) and (5.25) and that $\alpha_j$, $j \le \tau$, is bounded, it is easy to ascertain that the ratio of the last two terms of the numerator to the denominator is of the order of $o(\|r_j\| \|f'_{\tau+1}\|) = o(\|f'_{\tau+1}\|^2)$. Hence
(#,•/;+! i /;+!> /;•)
( H j+ifx + u f x + i ) 7 ^ ' {Hjej, ej) O d l / i + l l l 2 )-
where $a_j > 0$ and is independent of $\xi$ (by (5.32)).
It was noted in the preceding subsection that for processes with restoration of $H_k$ as $k \to \infty$ we have $\|f'_k\| \to 0$. Therefore, it follows from inequalities (5.35), taking into account that matrix $H_k$ is positive definite, that if with any $\xi$ we have $(H_j f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_j \|f'_{\tau+1}\|^2$ where $\gamma_j > 0$ and is independent of $\xi$, then there is a constant $\gamma_{j+1} > 0$ such that with any $\xi$ we shall have $(H_{j+1} f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_{j+1} \|f'_{\tau+1}\|^2$. But in estimating the quantity $(H_1 f'_{\tau+1}, f'_{\tau+1})$ we find, since $(H_0 f'_{\tau+1}, f'_{\tau+1}) \ge m_0 \|f'_{\tau+1}\|^2$, that there is a constant $\gamma_1$ such that $(H_1 f'_{\tau+1}, f'_{\tau+1}) \ge \gamma_1 \|f'_{\tau+1}\|^2$ with any $\xi$. Taking this into account, our argument by induction shows that there is a constant $a_{\tau+1}$ independent of $\xi$ and such that $(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge a_{\tau+1} \|f'_{\tau+1}\|^2$. We establish now just as we did above that
$\dfrac{a_{\tau+1}}{M L^2} \le -\dfrac{(f'_{\tau+1}, p_{\tau+1})}{M\|p_{\tau+1}\|^2} \le \alpha_{\tau+1} \le -\dfrac{(f'_{\tau+1}, p_{\tau+1})}{m\|p_{\tau+1}\|^2} \le \dfrac{L}{m a_{\tau+1}^2}.$
Therefore, we have
$N_{\tau+1} \|f'_{\tau+1}\| \ge \|r_{\tau+1}\| = \alpha_{\tau+1} \|H_{\tau+1} f'_{\tau+1}\| \ge C_{\tau+1} \|f'_{\tau+1}\|.$ (5.36)
Let us show now that
$H_{\tau+1} e_j = r_j + \eta_j, \quad 0 \le j \le \tau$ (5.37)
where $\|\eta_j\| = o(\|r_j\|)$.
Multiplying both sides of formula (4.48) by $e_j$ we obtain
$H_{s+1} e_j = H_s e_j + \dfrac{r_s (r_s, e_j)}{(r_s, e_s)} - \dfrac{(H_s e_s, e_j)\, H_s e_s}{(H_s e_s, e_s)}.$ (5.38)
If we assume that with a certain $s$, $j + 1 \le s \le \tau$, equalities $H_s e_j = r_j + \eta_{js}$ take place where $\|\eta_{js}\| = o(\|r_j\|)$, then using estimates (5.31), (5.27), (5.25) and taking into account that all of the quantities $\|r_s\|$ are of the same order of smallness we also have by (5.38) that $H_{s+1} e_j = r_j + \eta_{j,s+1}$, where $\|\eta_{j,s+1}\| = o(\|r_j\|)$. But $H_{j+1} e_j = r_j$, and we establish by induction that equalities (5.37) hold true.
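The starting point of this induction, $H_{j+1} e_j = r_j$, can be checked numerically. The sketch below assumes that the update has the Davidon-Fletcher-Powell form $H + r r^{T}/(r, e) - (He)(He)^{T}/(He, e)$; this is an assumption made for the illustration, since formula (4.48) itself is not reproduced in this section, and the name `dfp_update` is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(H, v):
    return [dot(row, v) for row in H]


def dfp_update(H, r, e):
    """Return H + r r^T / (r, e) - (H e)(H e)^T / (H e, e) for symmetric H.
    Requires (r, e) != 0 and (H e, e) != 0."""
    He = matvec(H, e)
    re, eHe = dot(r, e), dot(e, He)
    n = len(r)
    return [[H[i][j] + r[i] * r[j] / re - He[i] * He[j] / eHe
             for j in range(n)] for i in range(n)]


# Example with H = I, r = (1, 2), e = (3, 1):
H_next = dfp_update([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], [3.0, 1.0])
image = matvec(H_next, [3.0, 1.0])   # should reproduce r = (1, 2)
```

Applying the updated matrix to $e$ gives $He + r - He = r$, so the relation holds exactly for the most recent pair $(r, e)$. For a nonquadratic function the relation is only approximately preserved for earlier pairs, which is what the terms $\eta_{js}$ in the argument above account for.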
Taking into account (5.37) we have
$(r_{\tau+1}, e_j) = -\alpha_{\tau+1} (H_{\tau+1} f'_{\tau+1}, e_j) = -\alpha_{\tau+1} (f'_{\tau+1}, r_j + \eta_j),$
therefore by (5.34), we find that
$(r_{\tau+1}, e_j) = o(\|r_j\|^2) + o(\|f'_{\tau+1}\| \|r_j\|), \quad 0 \le j \le \tau.$
Inequalities (5.8) and (5.36) show that $\|r_{\tau+1}\|$ is of the same order of smallness as $\|f'_{\tau+1}\|$ and consequently as $\|r_j\|$, $0 \le j \le \tau$. It follows that
$(r_{\tau+1}, e_j) = o(\|r_{\tau+1}\| \|r_j\|) = o(\|r_{\tau+1}\|^2), \quad 0 \le j \le \tau.$ (5.39)
Taking this into account we establish in a manner analogous to that used before with $i = 1$ that also
$(e_{\tau+1}, r_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j \le \tau.$ (5.40)
W i t h i = 1
( H t r , /;) = ( t f 0 /;, /;) > m 0 m w
Making use of these relations and reasoning as in studying method (4.48) we establish that estimates (5.28) hold and then, assuming that estimates (5.31) and (5.32) hold, we demonstrate that estimate (5.34) holds.
Further we have that
$(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge m_0 \|f'_{\tau+1}\|^2 + o(\|f'_{\tau+1}\|^2).$
Consequently with sufficiently small values of $\|f'_{\tau+1}\|$ (i.e. with sufficiently large $\xi$) we have
(- ^ 0 //> i )
w , . r„)
Because of (5.4) and (5.8) with any $k$
wi-i-/;-!) ^ ii* ^ „
where $\gamma$ is independent of $\xi$.
B e s i d e s , u s i n g ( 4 . 6 5 ) a n d ( 4 . 6 4 ) it is e a s y t o e s t a b l i s h b y indue-
tion that w i t h a n y k w e ha v e 0 < C p f t < C l - C o n s e q u e n t l y
^ — •
Hence taking into account that $\beta_{\xi n} = 0$, $\xi = 0, 1, \dots$, it can be established that (5.41) holds. It follows from (4.65) using (5.41) that with any $k$
- ( . P k , f'k) > P ^ o H/ill2 ,
i.e.
llPkll > P ™ o llfill-
Taking into account this inequality and the fact that $H_k$ is bounded as in the methods considered above we establish that $0 < \underline{\alpha} \le \alpha_k \le \bar\alpha$ and $C\|f'_k\| \le \|r_k\| \le N\|f'_k\|$. Thus it has been proved that estimates (5.9) hold.
Let us show that estimates (5.10) hold. With $k = \xi n + 1$ we have
S i n c e ( H 0 f x + lt f x + 1 ) ; > 0 , w e h a v e
( ,Pt + 1» ^ x ) ^ ( ^ o / t + 1 > / t )-
If quantity $(H_0 f'_{\tau+1}, f'_\tau)$ is estimated using expression (5.42) with $j = \tau$, then making use of (5.34) and (5.41) we find that
$(H_0 f'_{\tau+1}, f'_\tau) = o(\|r_{\tau+1}\|^2)$
(it has been taken into account that since (5.9) holds, the quantities $\|r_i\|$ and $\|f'_i\|$, $i = 0, 1, \dots, n-1$, are of the same order of smallness). Thus
$(p_{\tau+1}, e_j) = o(\|r_{\tau+1}\|^2), \quad 0 \le j \le \tau,$
i.e. for the method under consideration estimates (5.39) hold. Consequently estimates (5.40) hold too.
Thus we have established that for method (4.69), assuming that condition (5.6) is fulfilled, estimates (5.9) and (5.10) hold too.
5. Method (4.71). By (4.67) and (5.4) we have for this method
$-(p_k, f'_k) \ge m_0 \|f'_k\|^2.$ (5.43)
Taking this into account we have
l l * o II I I / i IIII ill/it-ill
II ll<llff.||+ m 0 || j ||2
Since with any $k$ the ratio $\|f'_k\| / \|f'_{k-1}\|$ on set $S_0$ is a bounded quantity and $\|H_0\| \le M_0$, we have
$\|H_k\| \le M_0 + d\,\|H_{k-1}\|$
where $d$ is a constant. It follows that if the matrix is restored after a finite number of steps, then with any $k$ matrix $H_k$ has a bound: $\|H_k\| \le L$. Making use of this fact and estimate (5.43) we ascertain that with any $k$, $\alpha_k \ge \underline{\alpha} > 0$ and $N\|f'_k\| \ge \|r_k\| \ge C\|f'_k\|$.
Consequently, for method (4.71) estimates (5.9) hold. Let us prove that estimates (5.10) hold:
(Pi, e 0) - ( H J \ % f 0) = (/I, P a ) - 0.
$(p_{\tau+1}, e_j) = -(H_0 f'_{\tau+1}, e_j) + \beta_{\tau+1} (p_\tau, e_j),$
$H_0 f'_j = -p_j + \beta_j p_{j-1}.$
If we assume that estimates (5.31) hold, then arguing as in method (4.48) we can prove that estimates (5.34) hold. Note also that
o ^ a / o i f / ; 11*
P *' ( I > h - 1, r h - x ) ^ w o l l / i _ , II* ^ d2 *
Further Study
of the Rate of Convergence
1. Suppose now that matrix $f''(x)$, besides conditions (5.3), satisfies Lipschitz' condition (2.8). In this case it is possible to obtain a more precise bound on the rate of convergence of sequence $\{x_{\xi n}\}$.
This lemma is proved in the same way as estimates (5.9) and (5.10) for method (4.48) in the subsection on p. 111; only the order of smallness of some quantities is determined more precisely. Therefore,
Using (5.53) and (5.49) we establish that $\|r_1\| \le C\|f'_1\| \le C\|f'_0\|$. Taking into account also that $\|r_0\| \ge C\|f'_0\|$ we obtain $\|r_1\| \le C\|r_0\|$. Consequently, $\|f''_{1c} - f''_{0c}\| \le C\|r_0\|$. Using this we find that
$|(e_1, r_0)| \le C\|r_0\|^2 \|r_1\|.$
Making use of (5.59) we have
II / x _j_| |l“ II /j- 4-i II
11 / x I 1 II2 , T. (s.no)
iiTTii
Using (5.57), (5.58) and (5.60) we establish in the same way as in the subsection on p. 111 that
$(H_{\tau+1} f'_{\tau+1}, f'_{\tau+1}) \ge a_{\tau+1} \|f'_{\tau+1}\|^2.$
Let us show now that
$H_{\tau+1} e_j = r_j + \eta_{j,\tau+1}, \quad 0 \le j \le \tau$ (5.61)
where
T
I K . x + i l K C 2 (5.62)
v=j+l
Indeed,
( < V J J se j ) I I s e s
(5.63)
we have $H_{j+1} e_j = r_j$. Using estimates (5.52), which hold by assumption with $0 \le s, j \le \tau$, (5.25), (5.27), (5.47) and taking into account that $|\alpha_i| \le C$, $0 \le i \le \tau$, we obtain with $s = j + 1$
|l l t ' y + ! • H j + i e j) M j + i e j + i II II (e j + i » r j ) H j + i e j + i II < ---^ H r j l | 2 II r 7 + l II
,r . Il || „ si ^r\ H ^ 7* I I 2 II 0 " + * II
H * + \ ej rj + T/, s+i» || 7, s + 1 I I ^ ^ p j j -•
v=j+ 1
Thus, by induction, we can consider (5.61) to hold.
Without loss of generality, it can be assumed that the subsequence $\{\xi_m\}$ coincides with the whole sequence $\xi = 0, 1, \dots$. Taking into account (5.49) we can ascertain that if (5.70) holds, estimates (5.51) hold too. Consequently, if we assume estimates (5.70) to be satisfied, then the requirements of theorem 5.2 provide for the fulfilment of the conditions of lemma 5.1. Thus, if (5.70) is fulfilled, the estimates (5.9) and (5.52) hold.
Taking this into account we have
I (/n» r j)l = I (e n - l + • • • 4~ e j + 1? 0 ) 1
II Oil2 ||rm ||, 0 < / < « - 2 .
We denote $\tilde r_i = r_i / \|r_i\|$ and let
7 7 - 1 _
Discussion of Results
Thus we have made it clear that all of the methods studied in Sec. 4 can be applied for minimizing nonquadratic functions and that the convergence of the processes can be guaranteed for the class of functions that can be minimized by gradient methods. In the case where methods of conjugate directions are used for minimization of strongly convex functions, the rate of convergence proves no slower than superlinear.
The rate of convergence of methods of conjugate directions was established in a somewhat different manner than that of methods of other classes studied in the preceding sections: the sequence considered was $\{x_{\xi n}\}$ rather than $\{x_k\}$, i.e. actually we considered as one iteration a unified group of $n$ usual iterations of the process starting from $x_{\xi n}$. Speaking generally, the real rate of convergence of such processes may prove slower than that of methods of dual
directions (Sec. 3) and the more so than that of Newton's method (Sec. 2) (i.e. the decrease of the function value at each iteration $|f_{k+1} - f_k|$ in methods of the class under consideration may prove less than in methods of Secs. 2, 3 and the ratio $\|x_{k+1} - x_*\| / \|x_k - x_*\|$ greater). Thus, if for instance, in some algorithm we have
and in a method of conjugate directions
*£<&+l)n % In ^£n/|n
a n d D h = D S n — >> f k , t h e n t h i s m e a n s t h a t n i t e r a t i o n s o f t h e m e t h
o d of c o n j u g a t e directions are equivalent, as to their conve rge nce ,
to only one iteration of process (5.77). Nevertheless the rate of convergence of the methods of the class under consideration is practically rather fast and exceeds by far that of gradient methods.
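The comparison with gradient methods can be tried on a small example. The sketch below is an illustration under assumptions, not one of the book's algorithms: it uses a Fletcher-Reeves-type coefficient with $H_0 = I$ (the formulas (4.69)-(4.71) are not reproduced in this section), an approximate exact step found by bisection, and a hypothetical strongly convex test function $f(x) = \sum_i (c_i x_i^2 / 2 + x_i^4 / 4)$ with $c = (1, 10, 100)$.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad(x, c=(1.0, 10.0, 100.0)):
    # gradient of the hypothetical test function sum(c_i x_i^2 / 2 + x_i^4 / 4)
    return [ci * xi + xi ** 3 for ci, xi in zip(c, x)]


def exact_step(x, p):
    # bisection on the slope (f'(x + a p), p), which is nondecreasing in a
    slope = lambda a: dot(grad([xi + a * pi for xi, pi in zip(x, p)]), p)
    lo, hi = 0.0, 1.0
    while slope(hi) < 0.0:
        lo, hi = hi, 2.0 * hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def iterations(x0, restart, tol=1e-8, max_iter=20000):
    """Count steps until ||f'(x)|| < tol.
    restart=1 gives steepest descent; restart=n gives a conjugate-gradient-type
    process restored (reset to a gradient step) after every n steps."""
    x, g = list(x0), grad(x0)
    p = [-gi for gi in g]
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            return k
        a = exact_step(x, p)
        x = [xi + a * pi for xi, pi in zip(x, p)]
        g_new = grad(x)
        if (k + 1) % restart == 0:
            p = [-gi for gi in g_new]                 # restoration
        else:
            beta = dot(g_new, g_new) / dot(g, g)      # Fletcher-Reeves coefficient
            p = [-gi + beta * pi for gi, pi in zip(g_new, p)]
        g = g_new
    return max_iter


gd_steps = iterations([1.0, 1.0, 1.0], restart=1)
cg_steps = iterations([1.0, 1.0, 1.0], restart=3)
```

On this example the restored conjugate-direction process needs fewer steps than steepest descent, while each step costs about the same (one line search and one gradient evaluation). The step counts depend on the test function and the tolerance, so the sketch illustrates the tendency and is not a benchmark.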
At the same time, as mentioned in Sec. 4, the methods of conjugate directions differ only slightly from the gradient methods as to the labour per iteration.
The foregoing makes it possible to conclude that the methods of conjugate directions are among the most effective for solving minimization problems.
In this section we limited ourselves to the study of several concrete algorithms constructed in Sec. 4, though we could also have studied the properties of other algorithms that can be constructed according to the general scheme discussed in Sec. 4. However, the technique of studying other algorithms would not differ considerably from that used in Sec. 5. Indeed, the difference in the technique of proving theorem 5.1 amounts only to somewhat different ways of investigating the properties of matrix $H_k$. But in any method of the class under consideration, the vectors used for constructing $H_{k+1}$ can only be various combinations of the vectors $r_k$ and $H_k e_k$ (see (4.32)), and the algorithms discussed in Secs. 4, 5 were chosen so as to use various combinations of these elements in constructing the matrices $H_k$.
Using the results obtained, we now compare the properties of different algorithms in the minimization of nonquadratic functions.
The results of theorem 5.2 (estimate (5.68)) show that the rate of convergence of sequence $\{x_{\xi n}\}$ depends considerably on the properties of matrix $H_{\xi n}$. If, as $\xi \to \infty$,
H ln (5.78)
and the rate of convergence increases. This fact is of the greatest practical interest for algorithms having the property that in minimizing a quadratic function we have
$$H_n = A^{-1}. \tag{5.79}$$
Algorithms (4.48), (4.49), (4.52) belong to methods of this group. If condition (5.78) is fulfilled in implementing one of such algorithms, then, by the above considerations, it is expedient not to restore matrix $H_k$.
In method (4.70), property (5.79) is not fulfilled; therefore the variant of this method without restoration gives no advantage (as regards the rate of convergence) over the variant with restoration. The same can be said of other algorithms that have the property that in minimizing a quadratic function we have $H_n = H_0$ (for instance, methods (4.69), (4.71)) or $H_n$ is close to $H_0$ (this is precisely the case of matrix $H_n$ in method (4.70): its effect on the system of linearly independent vectors $e_0, \ldots, e_{n-1}$ is the same as that of matrix $H_0$, except on the vector $e_{n-1}$). Therefore it is not worthwhile to consider variants of such methods without restoration of matrix $H_k$.
However, the rate of convergence of methods (4.70), (4.71) will increase if we use, instead of the fixed matrix $H_0$, a sequence of positive definite matrices $H_{\xi 0}$ which satisfy the condition
minimum, and at a point distant from it one can compare the effectiveness of different algorithms only on the ground of numerical experiments.
Many works published up to the present time (J.D. Pearson, J. Greenstadt, B.T. Polyak [2], H.G. Huang and A.V. Levy) contain results of the numerical solution of various problems by methods of conjugate directions. The most comprehensive comparative analysis of the effectiveness of different algorithms is given in the last of the works named. On the whole, the results of numerical experiments confirm the conclusion that the most effective methods are those for which condition (5.79) is fulfilled. At the same time, method (4.71) proves more effective in the case where the matrix is restored after $n$ iterations (as compared with a process without restoration). It seems that in practice method (4.70) should also be used with restoration of matrix $H_k$.
Finally, we dwell on problems that are involved in the choice of the step length in methods of the class under consideration. As was already mentioned, in methods of conjugate directions the step length is chosen from the condition that the function attains its minimum along the direction of motion. It was stressed many times that the main shortcoming of such a procedure is the necessity of performing a considerable number of function evaluations, which makes the computational effort very considerable in problems in which the function evaluation requires much time. In some cases this method of choosing the step length is not suited for practice if, for instance, the value of parameter $\alpha_k$ changes greatly at every step. This shortcoming of the method of conjugate directions was stressed in many works (C.G. Broyden [2], B.N. Pshenichny [3], W.C. Davidon [2], M.J.D. Powell [1], R. Fletcher [1], and others). To avoid the above shortcoming, these works consider methods in which the value $\alpha_k$ is chosen so that it guarantees only a certain degree of decrease of the function. In other respects, however, the construction of these methods is based on the same ideas which were described above (the work of B.N. Pshenichny [3] excepted).
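As an illustration of a step-length rule that guarantees only a certain degree of decrease, the following minimal Python sketch halves a trial step until the function decreases by a prescribed fraction of the predicted linear decrease. This is a generic backtracking rule and not the procedure of any of the works cited; the name `backtracking_step` and the parameters `eps`, `shrink` and `max_iter` are assumptions made for the example.

```python
import numpy as np

def backtracking_step(f, x, p, slope, eps=1e-4, alpha=1.0, shrink=0.5, max_iter=50):
    """Return alpha with f(x + alpha*p) - f(x) <= eps * alpha * slope.

    `slope` is the directional derivative (f'(x), p); it must be negative,
    i.e. p must be a direction of descent (assumption of this sketch).
    """
    fx = f(x)
    for _ in range(max_iter):
        # Accept the first trial step that gives the required decrease.
        if f(x + alpha * p) - fx <= eps * alpha * slope:
            return alpha
        alpha *= shrink  # otherwise reduce the trial step
    raise RuntimeError("no acceptable step found")
```

Only function values are compared here, so each rejected trial step costs a single function evaluation instead of a full one-dimensional minimization.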
The study of the properties of methods in which the choice of the step length is not connected with finding the function minimum along the direction of motion becomes much more difficult, and the theoretical substantiation of many of them has not been performed even for the minimization of a quadratic function.
From the viewpoint of the method of choosing the $\alpha_k$ value, methods of dual directions (Sec. 3) are preferable. The rate of convergence of these methods also proves faster. However, methods of dual directions require a greater storage capacity of the computer (as noted in Sec. 3, two $n \times n$ matrices must be stored); therefore, using them, one can solve only minimization problems of smaller size. One can, though, use a smaller storage capacity by choosing the vectors $r_k$ in methods of dual directions along the coordinate axes; however, in this case it is necessary to calculate the derivative twice at every iteration, and this increases the amount of work required.
6. METHODS WITHOUT CALCULATING DERIVATIVES
Introductory Remarks
Until now we have described minimization methods in which it was necessary at each iteration to calculate, besides the function $f(x)$, its gradient $f'(x)$ (methods of Secs. 1, 3, 4, 5), and in Newton's method (Sec. 2), moreover, the matrix of second derivatives $f''(x)$. We stressed many times that the calculation of second derivatives is often the most complicated and laborious part of the construction of the iterative process, and the methods of Secs. 3-5 were worked out precisely with the aim of avoiding the calculation of second derivatives. However, in many problems the calculation of the gradient can also prove considerably more complicated than the evaluation of the function (in some cases it is impossible to obtain an analytical expression of $f'(x)$ at all). In such cases it is desirable to use methods which require only the function evaluation.
The calculation of a gradient by an analytical formula can be replaced by an approximate one, for instance, by using the finite differences approximation to partial derivatives. In this way one can construct modifications of the methods discussed in the preceding sections which involve only function evaluation. If we require a definite degree of accuracy of the approximation and impose certain additional requirements on the construction of the iterative process, then in most cases we can ensure that the properties of such modified methods (convergence, rate of convergence) approximate the properties of the original algorithms in which $f'(x)$, $f''(x)$ are evaluated by analytical expressions.
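The finite differences approximation mentioned above can be sketched in a few lines of Python. The code below is only an illustration under stated assumptions: it uses forward differences along the coordinate axes, and the function name `fd_gradient` and the default step `rho` are chosen for the example rather than taken from the text.

```python
import numpy as np

def fd_gradient(f, x, rho=1e-6):
    """Forward-difference approximation to the gradient of f at x.

    Component i is (f(x + rho * v_i) - f(x)) / rho, where v_i is the
    unit vector of the i-th coordinate axis. Costs n + 1 evaluations of f.
    """
    x = np.asarray(x, dtype=float)
    fx = f(x)
    g = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = rho
        g[i] = (f(x + step) - fx) / rho
    return g
```

The error of each component is of the order of `rho` times the corresponding second derivative, so the step must be small, but not so small that rounding errors in the function values dominate the difference.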
The study of methods without calculation of gradients is interesting in another respect as well. In determining the accuracy of approximating the derivatives with which the properties of such algorithms coincide with those of the original methods, we find in fact the allowable calculation errors that do not lead to violations of the properties of the algorithms (with the calculation of $f'(x)$, $f''(x)$).
In this section we study only those algorithms whose construction is based on methods of dual directions (Sec. 3); for this reason we retain the same name for them. Besides, we shall dwell on algorithms of another type in which the idea of the construction of conjugate directions is realized, but without the calculation of the gradient or its approximation by finite differences.
Constructing Methods of Dual Directions
In these methods, successive approximations to the solution are constructed by the formula
$$x_{k+1} = x_k - \alpha_k D_k^{-1} g_k \tag{6.1}$$
where $D_k$ is an $n \times n$ matrix and $g_k$ is a vector. The scalar factor $\alpha_k$, which determines the step length, can take positive as well as negative values, as distinct from the methods discussed before; this depends on which of the directions $-D_k^{-1} g_k$ or $D_k^{-1} g_k$ is the direction of descent of the function $f(x)$.
One can also use another approach and assume that $\alpha_k \ge 0$, but then the direction of motion should be either $p_k = -D_k^{-1} g_k$ or $p_k = D_k^{-1} g_k$, so that the following condition holds:
$$(f'_k, p_k) < 0. \tag{6.2}$$
We assume, as we did in Sec. 3, that $f(x)$ is a twice continuously differentiable strongly convex function.
Constructing matrix $D_k$ and vector $g_k$. Let us determine vectors $\theta_k$ and $\varphi_k$:
$$\theta_k = \left( \frac{f(x_k + \rho_k v_1) - f(x_k)}{\rho_k}, \ldots, \frac{f(x_k + \rho_k v_n) - f(x_k)}{\rho_k} \right),$$
$$\varphi_k = \left( \frac{f(y_k + \rho_k v_1) - f(y_k)}{\rho_k}, \ldots, \frac{f(y_k + \rho_k v_n) - f(y_k)}{\rho_k} \right),$$
$$\psi_k = \varphi_k - \theta_k$$
where $0 < |\rho_k| \le \|r_k\|^t$, $t > 1$, $y_k$, $r_k$ are elements of sequence (3.5), and $v_i$ is the unit vector of the corresponding axis.
Lemma 6.1. Let $\{x_k\}$ be a bounded sequence, $\|x_{k+1} - x_k\| \to 0$ as $k \to \infty$, and let the matrix $D_k$ for any $k \ge n - 1$ be defined by the following system of equations:
$$D_k r_{k-i} = \psi_{k-i}, \quad i = 0, 1, \ldots, n - 1 \tag{6.3}$$
where $r_{k-i}$ are elements of sequence (3.5). Then
$$\lim_{k \to \infty} \|D_k - f''_k\| = 0.$$
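The construction in the lemma can be tried out numerically. The Python sketch below forms forward-difference vectors at pairs of points $x$ and $y = x + r$, takes their difference as $\psi$, and solves the system $D r_j = \psi_j$ for $D$. It rests on assumptions not stated in this passage: that the points of sequence (3.5) are related by $y = x + r$, that $\rho = \|r\|^t$ is an admissible step, and the helper names `fd_vector` and `dual_direction_matrix` are hypothetical.

```python
import numpy as np

def fd_vector(f, x, rho):
    """Vector with components (f(x + rho * v_i) - f(x)) / rho."""
    fx = f(x)
    return np.array([(f(x + rho * v) - fx) / rho for v in np.eye(x.size)])

def dual_direction_matrix(f, xs, rs, t=2.0):
    """Solve D r_j = psi_j, j = 1..n, for the n x n matrix D.

    xs: list of n points; rs: list of n linearly independent increments.
    Assumes y_j = x_j + r_j and uses rho_j = ||r_j||**t with t > 1.
    """
    psis = []
    for x, r in zip(xs, rs):
        rho = np.linalg.norm(r) ** t
        theta = fd_vector(f, x, rho)       # differences taken at x_j
        phi = fd_vector(f, x + r, rho)     # differences taken at y_j
        psis.append(phi - theta)
    R = np.column_stack(rs)
    Psi = np.column_stack(psis)
    # D R = Psi is equivalent to R^T D^T = Psi^T.
    return np.linalg.solve(R.T, Psi.T).T
```

For a smooth function and small increments the matrix returned is close to the matrix of second derivatives, which is what the lemma asserts in the limit; the sketch makes no claim about the rate at which this happens.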
The proof of this lemma coincides in its essential features with that of lemma 3.1. We shall consider only the differences that arise.
The components of vectors $\theta_k$ and $\varphi_k$ can be written thus:
w h e r e C 2, C k < ; oo.
Let us write vector $\psi_{k-i}$ in the form
$$\psi_{k-i} = f'(y_{k-i}) - f'(x_{k-i}) + (\varphi_{k-i} - f'(y_{k-i})) - (\theta_{k-i} - f'(x_{k-i}));$$
then, denoting as before $e_{k-i} = f'(y_{k-i}) - f'(x_{k-i})$, we obtain
$$D_k r_{k-i} = e_{k-i} + (\varphi_{k-i} - f'(y_{k-i})) - (\theta_{k-i} - f'(x_{k-i})), \quad i = 0, 1, \ldots, n - 1. \tag{6.6}$$
Let us take $B_k = D_k - f''(x_k)$. Proceeding as in proving lemma 3.1, we obtain the following estimate:
$$\|B_k r_{k-i}\| \le \tilde{h}_{k-i} \|r_{k-i}\|$$
where $\tilde{h}_{k-i} = h_{k-i} + C_5 \|r_{k-i}\|^{t-1} \to 0$ as $k \to \infty$.
The remaining part of the proof repeats the argument of lemma 3.1.
Let us now determine vector $g_k$:
$$g_k = \left( \frac{f(x_k + \rho_k v_1) - f(x_k)}{\rho_k}, \ldots, \frac{f(x_k + \rho_k v_n) - f(x_k)}{\rho_k} \right) \tag{6.7}$$
once more whether (6.8) is satisfied or not. Since $g_k \to f'_k$ as $|\rho_k| \to 0$ and at the same time $\|p_k\| \to \|D_k^{-1} f'_k\|$, and $\|D_k^{-1} f'_k\| > 0$ for any $x_k \ne x^*$ (matrix $D_k^{-1}$ is nonsingular, being the inverse of matrix $D_k$; on the calculation of $D_k^{-1}$ see the subsection on p. 111, calculation of vector $p_k$), conditions (6.8) are satisfied for sufficiently small values of $\rho_k$.
Determining the direction of motion. This is done as follows. Having set a certain value of $\gamma_0$ (naturally, this should be chosen sufficiently small), $f(x)$ is evaluated at the points $x_k \pm \gamma_0 D_k^{-1} g_k$. If at one of these points the function value is less than $f(x_k)$, then the corresponding vector ($-D_k^{-1} g_k$ or $D_k^{-1} g_k$) is taken as $p_k$ (condition (6.2) is satisfied since $f(x)$ is convex). However, if both function values are greater than $f(x_k)$, we reduce $\gamma_0$ until one of the function values becomes less than $f(x_k)$, and the corresponding vector is taken as $p_k$.
However, it can occur that for small values of $\gamma$ the function does not decrease in either of the directions $\pm D_k^{-1} g_k$. This can mean that either we have not reached values of $\gamma$ for which the function decreases, or the condition $(f'_k, D_k^{-1} g_k) = 0$ is satisfied (it will be seen from what follows that such a case is possible only at the initial stage of the process, and then, obviously, neither of the vectors $\pm D_k^{-1} g_k$ can be chosen as $p_k$). In order to exclude such an occurrence it is necessary to calculate a new vector $g_{k,1}$, having changed $\rho_k$ (but so that conditions (6.8) are satisfied), to calculate a new vector $D_k^{-1} g_{k,1}$ and, from a certain $\gamma \le \gamma_0$ on, to evaluate the function at the points $x_k \pm \gamma D_k^{-1} g_{k,1}$ as well. If $x_k \ne x^*$, then one of the directions $\pm D_k^{-1} g_k$ or $\pm D_k^{-1} g_{k,1}$ is, of necessity, the direction of descent. The corresponding vector is then taken as $p_k$.
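The first part of this procedure, probing both signs of the vector $D_k^{-1} g_k$ with a decreasing trial value of $\gamma$, can be sketched in Python as follows. The function name `choose_direction`, the halving factor and the cap on the number of reductions are assumptions of the sketch; the text does not prescribe them. The recalculation of $g_k$ with a changed $\rho_k$ is left to the caller.

```python
import numpy as np

def choose_direction(f, x, d, gamma0=1e-2, shrink=0.5, max_halvings=20):
    """Pick p in {-d, +d} along which f decreases for some trial gamma.

    `d` stands for the vector D_k^{-1} g_k. Returns None when neither sign
    gives a decrease for any trial gamma; the caller would then recompute
    g_k with a different finite-difference step.
    """
    fx = f(x)
    gamma = gamma0
    for _ in range(max_halvings):
        if f(x - gamma * d) < fx:
            return -d
        if f(x + gamma * d) < fx:
            return d
        gamma *= shrink  # both trial points are worse: reduce gamma
    return None
```

Each trial value of $\gamma$ costs at most two function evaluations, and no derivative information is used.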
The algorithm of choosing the step. Let us choose $\alpha_k$ in the following way: suppose that
= min (6.9)
f (x) — f ( x k ) s g e a 2 p ft ( g k , p k ) (6.10)
We now study the properties of sequence (6.1) when the matrix $D_k$, the vector $g_k$ and the parameter $\alpha_k$ are constructed by the method described above.
Theorem 6.1. If $f(x)$ is a twice continuously differentiable function that satisfies conditions (2.4), the matrix $D_k$ for any $k \ge n - 1$ is defined by system (6.3), the vector $g_k$ is determined by expression (6.7), where $\rho_k$ satisfies conditions (6.8), and $\alpha_k$ is determined by the method described above, then for sequence (6.1) statements analogous to those proved in theorem 3.1 hold.
Proof. In order to take advantage of lemma 6.1, it is necessary first of all to show that under the conditions of the theorem the condition $\|x_{k+1} - x_k\| \to 0$ holds for sequence (6.1).
Expanding the function $f(x)$ into Taylor's series up to the second-order terms about the point $x_k$, we obtain:
, _ Q „ V r (/ * ’ P k ) . a* U h c P P h ) -|
= Pk) L K { t k + — p ; (-*■; S ) J
Due to (6.2) a n d t h e c h o i c e o f p ft
(/fc» P h ) n
Pfe ( g h , P h )
Consequently, for a certain $\alpha_k > 0$ the inequality (6.11) is satisfied and, therefore, (6.10) as well. This proves the possibility of choosing $\alpha_k$ by the method described above.
Thus, by (6.10), $f_{k+1} < f_k$. This means that $x_k \in S = \{x \mid f(x) \le f(x_0)\}$ for any $k$, and since $f(x)$ has a lower bound, $f_k - f_{k+1} \to 0$. Hence, it follows from (6.10) that as $k \to \infty$
I (g h . P h ) I - > 0 . (6.12)
Since a h ^ a h , it f o l l o w s f r o m ( 6 . 9 ) t h a t
Hence, it follows from (6.15) that from a certain iteration on we have $\beta_k = +1$ (since the left-hand side of (6.15) is positive), and therefore condition (6.14) is really satisfied.
On the set $S$ the gradient $f'(x)$ is bounded: $\|f'_k\| \le L$. Taking into account also that $|\rho_k| \le \rho < \infty$, we find using (6.16) that $\|g_k\| \le L_1$ for any $k$. By analogy with theorem 3.1 we can establish that for any $k \ge n - 1$ we have $\|D_k^{-1}\| \le M_2$. Consequently,
IIP* I K I I W I I I l f o l K f t -
Using this estimate and inequality (6.18) we establish that for sufficiently great $k$
Hence, it follows from (6.9) that from a certain $k$ on we shall have
^ a > 0. (6.20)
Because of this estimate it follows from the condition $\alpha_k \|p_k\| \to 0$, the fulfilment of which was discussed above, that as $k \to \infty$
$$\|p_k\| \to 0. \tag{6.21}$$
Since $\|g_k\| = \|D_k p_k\| \le M_1 \|p_k\|$, provided (6.21) is satisfied, we have
1 1 ^ 1 1 + 0 . (6.22)
I (S h , P h ) |
Using this condition and (6.14), it is easy to ascertain that inequality (6.11) and, therefore, (6.10) as well are satisfied with $\alpha_k = 1$ for sufficiently great $k$. From relations (6.19), with (6.21) fulfilled, it follows that
I (gft, P h ) I nn
ii P k i i 3
Therefore, in choosing $\alpha_k$ according to condition (6.9), from a certain $k$ on we have $\alpha_k = 1$.
The above remarks show that from a certain iteration on, $\alpha_k = 1$ and
$$x_{k+1} = x_k - D_k^{-1} g_k.$$
At the same time there is a matrix $\tilde{D}_k^{-1}$ such that
$$x_{k+1} = x_k - \tilde{D}_k^{-1} f'_k.$$
The sequence of matrices $\tilde{D}_k$ under conditions (6.8) and (6.16) can be chosen so that
$$\tilde{D}_k - D_k \to 0. \tag{6.23}$$
In order to obtain (6.23) we can, for instance, assume that
$$\tilde{D}_k = D_k - \frac{(f'_k - g_k)(x_{k+1} - x_k)^T}{\|x_{k+1} - x_k\|^2}.$$
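A short check of this choice may be helpful. It is a sketch under the assumption that the iteration has reached the stage where $\alpha_k = 1$, so that $p_k = x_{k+1} - x_k = -D_k^{-1} g_k$, and that the matrix in question is the rank-one correction $\tilde{D}_k = D_k - (f'_k - g_k) p_k^T / \|p_k\|^2$:

$$\tilde{D}_k p_k = D_k p_k - (f'_k - g_k) \frac{(p_k, p_k)}{\|p_k\|^2} = -g_k - (f'_k - g_k) = -f'_k ,$$

so that, when $\tilde{D}_k$ is nonsingular, $x_{k+1} = x_k + p_k = x_k - \tilde{D}_k^{-1} f'_k$. Moreover, the correction term has rank one, and its norm is

$$\|\tilde{D}_k - D_k\| = \frac{\|f'_k - g_k\|}{\|x_{k+1} - x_k\|} ,$$

so $\tilde{D}_k - D_k \to 0$ exactly when the error of the finite differences vector $g_k$ tends to zero faster than the step $\|x_{k+1} - x_k\|$.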
It is now easy to prove that sequence (6.1) converges at a superlinear rate. We proceed as in theorem 2.1 and establish that the following inequality holds:
$$\|x_{k+1} - x^*\| \le \|\tilde{D}_k^{-1}\| \, \|\tilde{D}_k - f''_{kc}\| \, \|x_k - x^*\|.$$
Further, using conditions (6.13), (6.23) and the continuity of second derivatives, we ascertain that as $k \to \infty$, $\|\tilde{D}_k - f''_{kc}\| \to 0$ and the quantity $\|\tilde{D}_k^{-1}\|$ is bounded. Hence, as $k \to \infty$ we have
$$\|x_{k+1} - x^*\| \le \lambda_k \|x_k - x^*\| \tag{6.24}$$
where $\lambda_k \to 0$, and this proves that the rate of convergence of $\{x_k\}$ is superlinear.
The theorem is proved.
Remarks on the Implementation of Methods of Dual Directions
Various algorithms. The requirements which should be met by the vectors $r_k$ used in constructing matrix $D_k$ are the same as those considered in constructing sequence (3.5). Therefore, all that was said in the subsection on p. 74 about the construction of various algorithms of type (3.4) holds for process (6.1).
Calculation of vector $p_k$. The results of the subsection on p. 76 are fully applicable here. Thus, the basis $s_{k+1}, s_k, \ldots, s_{k-n+2}$, the dual of the basis $\psi_{k+1}, \ldots, \psi_{k-n+2}$, is constructed by the following formulas (analogous to (3.21)):
The initial stage of the process. There are several ways of performing the first iterations of the process (with $k < n - 1$). For instance, the descent can be realized in one of the directions $\beta_k g_k$, $\beta_k = \pm 1$, choosing the sign of $\beta_k$ so that $f(x)$ decreases.
In order to ensure uniformity of the iterative process (6.25), it can be started in a way analogous to that given in the subsection on p. 79.
Minimizing a quadratic form. Let $f(x) = \frac{1}{2}(Ax, x) + (b, x) + c$, where $(Ax, x) > 0$ for any $x \ne 0$. In this case it is easily ascertained that $\theta_k = g_k = f'(x_k)$, $\varphi_k = f'(y_k)$, $\psi_k = e_k$, i.e. $D_k = A$, and process (6.1) coincides with (3.4). Consequently (see the subsection on p. 79), process (6.1) makes it possible to find the minimum of a quadratic function after $n$ steps. It is necessary in this case to calculate $(n + 1)^2$ function values.
Choosing vector $g_k$. In method (6.1), besides approximating matrix $f''(x)$, we also substitute for the gradient $f'(x)$ its finite differences analogue, the vector $g_k$. In this case, as was noted above, conditions (6.8) are to be satisfied in order to obtain a superlinear rate of convergence. If, to satisfy these conditions, we have to calculate vector $g_k$ several times at a certain iteration, the amount of work required in the process increases (particularly for a multidimensional space).
Note that if $\|p_k\| < \|p_{k-1}\|$ at each iteration, then one can choose $|\rho_k| = \|p_{k-1}\|^2$. It is very probable that with such a manner of choosing $\rho_k$, the right-hand one of the inequalities (6.8) will be satisfied, at least from a certain iteration on. Indeed, in the end we obtain bounds (6.24) on the rate of convergence.
The rate of convergence of a process estimated in this way is usually slower than the quadratic one:
i.e. with bounds (6.24) usually $\|p_{k-1}\|^2 < \|p_k\|$ (recall that from a certain $k$ on we have $\alpha_k = 1$, i.e. $p_k = x_{k+1} - x_k$). Therefore, if the sequence $\{\xi_k\}$ is chosen such that $\xi_k \to 0$ at a sufficiently slow rate, we can expect that with $\rho_k = \|p_{k-1}\|^2$ it will not be necessary to calculate $g_k$ many times in order to satisfy (6.8). If, however, conditions (6.8) are not satisfied from the beginning (i.e. if we have to reduce $\rho_k$), this will suggest that the rate of convergence is close to the quadratic one.
Methods of Conjugate Directions
$$(A(x_{m,m} - x_m), p_i) = 0, \quad i = 1, 2, \ldots, m.$$
Thus, if points of the minimum of $f(x)$ are determined in different subspaces formed by the $A$-orthogonal directions $p_1, \ldots, p_m$, then the direction $p_{m+1} = x_{m,m} - x_m$ proves to be conjugate to the directions $p_1, \ldots, p_m$.
The method described of constructing conjugate vectors does not require calculation of the gradient or its finite differences approximation. Let us now describe a concrete algorithm for the minimization of a quadratic function in which the construction of conjugate vectors is performed by the method described.
We choose arbitrarily a point $x_0$ and a vector $p_1$; the $m$-th iteration of the algorithm ($m = 1, 2, \ldots, n$) is performed as follows:
(1) Calculate point
$$x_m = x_{m-1} + \alpha_m p_m \tag{6.26}$$
where $\alpha_m$ is determined from the condition of the minimum function value:
$$f(x_m) = \min_{\alpha} f(x_{m-1} + \alpha p_m).$$
(2) Calculate point
$$x_{0,m} = x_m + r_m \tag{6.27}$$
where $r_m$ is an arbitrary vector which is not a linear combination of the vectors $p_1, \ldots, p_m$ (below we shall dwell at some length on the question of the choice of $r_m$).
(3) Calculate points
$$x_{k,m} = x_{k-1,m} + \alpha_{k,m} p_k, \quad k = 1, \ldots, m$$
where the factor $\alpha_{k,m}$ is determined from the condition that the value of the function $f(\alpha) = f(x_{k-1,m} + \alpha p_k)$ be minimum.
(4) Now calculate vector $p_{m+1} = x_{m,m} - x_m$. This is the end of the $m$-th iteration.
Vector $r_m$ (in (6.27)) must not belong to the subspace $E_m(x_0)$, so that point $x_{0,m}$ would not belong to subspace $E_m(x_0)$. Since point $x_m$ is the minimum point of $f(x)$ in subspace $E_m(x_0)$, it is clear that any vector $x - x_m$ in whose direction the function $f(x)$ decreases does not belong to $E_m(x_0)$. Consequently, any direction of descent of $f(x)$ from point $x_m$ can be taken as $r_m$. In particular, it is convenient to choose vector $r_m$ along one of the coordinate axes; then, if such a vector proves not to be a direction of descent, it is necessary to take as $r_m$ a vector along another axis.
According to the results of Sec. 4, point $x_n$ calculated by formula (6.26) is the minimum point of $f(x)$: $x_n = x^*$. In order to find point $x^*$ we have to solve one-dimensional minimization problems (to determine the factors $\alpha_m$ and $\alpha_{k,m}$) $1 + 2 + \ldots + n = \frac{n(n+1)}{2}$ times.
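Steps (1)-(4) can be sketched in Python for a quadratic function as follows. Several details are assumptions of the sketch and not part of the text: each one-dimensional minimum is found from three function values (which is exact for a quadratic), the vector $r_m$ is taken as the first coordinate vector that is linearly independent of $p_1, \ldots, p_m$ rather than by testing for descent, and the names `line_min_quadratic`, `pick_r` and `conjugate_directions_no_gradient` are hypothetical.

```python
import numpy as np

def line_min_quadratic(f, x, p):
    """Minimum point of a quadratic f along p, from three function values."""
    f0, fp, fm = f(x), f(x + p), f(x - p)
    curvature = fp - 2.0 * f0 + fm        # equals (Ap, p) > 0 for a quadratic
    alpha = -(fp - fm) / (2.0 * curvature)
    return x + alpha * p

def pick_r(directions, n):
    """First coordinate vector independent of the given directions."""
    for e in np.eye(n):
        M = np.column_stack(directions + [e])
        if np.linalg.matrix_rank(M, tol=1e-8) == len(directions) + 1:
            return e
    raise ValueError("no independent coordinate vector found")

def conjugate_directions_no_gradient(f, x0, p1):
    """Return (x_n, [p_1, ..., p_n], number of one-dimensional searches)."""
    n = x0.size
    x = np.asarray(x0, dtype=float)
    dirs = [np.asarray(p1, dtype=float)]
    searches = 0
    for m in range(1, n + 1):
        x = line_min_quadratic(f, x, dirs[m - 1])   # step (1)
        searches += 1
        if m == n:
            break
        z = x + pick_r(dirs, n)                     # step (2)
        for p in dirs:                              # step (3)
            z = line_min_quadratic(f, z, p)
            searches += 1
        dirs.append(z - x)                          # step (4)
    return x, dirs, searches
```

The sketch stops after the $n$-th application of (6.26), so steps (2)-(4) are carried out only $n - 1$ times; this is what gives the count $n + (1 + 2 + \ldots + (n - 1)) = n(n+1)/2$ of one-dimensional searches.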
Using this approach to the construction of the method of conjugate directions, one can construct various algorithms for the minimization of nonquadratic functions. Of course, in any algorithm of this kind, the directions $p_1, \ldots, p_m$, $m \le n$, will no longer be conjugate (see the subsection on p. 103). However, we can expect that suitably worked out methods will make it possible, in a sufficiently small neighbourhood of the minimum point $x^*$ (of a convex smooth function), to construct vectors that are close enough in their properties to conjugate ones. Such algorithms may prove effective in minimizing nonquadratic functions.
We shall consider below an algorithm based on the above considerations.
Let $x_{1,0}$ be an arbitrary point and $v_{1,1}, \ldots, v_{1,n}$ be an orthonormalized coordinate basis; the $k$-th iteration of the algorithm, $k = 1, 2, \ldots$, consists of the following steps:
(1) For $i = 1, 2, \ldots, n$ calculate
$$x_{k,i} = x_{k,i-1} + \alpha_{k,i} v_{k,i}$$
/ (**.n + OW*,n+l)
be m i n i m u m .
(3) L e t a AjS = max{a*,i: i = 1, 2, . . n}, A* be a deter
m i n a n t w h o s e c o l u m n s a r e v e c t o r s v ht u . . ., v h t n and e > 0
b e a n a r b i t r a r y s m a l l p o s i t i v e c o n s t a n t . If
> e .
Vfc
w e set t h at v k+lt t f with i s and i;A + lf 3 — v h>n+1; then
w e have
(6-28)
i.e. points $x_{2,0}$ and $x_{2,n}$ are minimum points of $f(x)$ in the
one-dimensional subspace (formed by vector $v_{1,n+1}$) which passes through
two different points $x_{1,n}$ and $x_{2,n-1}$. If $f(x)$ is a quadratic function,
then according to the foregoing the direction $v_{2,n+1} = x_{2,n} - x_{2,0}$
is found to be conjugate to the direction $v_{1,n+1} = v_{2,n}$. By similar
reasoning, it can be ascertained that if with any $k = 1, 2, \ldots, n$
vectors $v_{k,1}, \ldots, v_{k,n}$ are linearly independent, then after the
$k$-th iteration vectors $v_{k,n+1}, v_{k,n}, \ldots, v_{k,n-k+2}$ will prove
conjugate, i.e. after $n$ iterations of the process we shall have
constructed $n$ conjugate vectors. However, it is impossible with this
method of constructing vectors $v_{k,1}, \ldots, v_{k,n}$ to guarantee their
linear independence. Indeed, if with a certain $k$ we have $\alpha_{k,1} = 0$
then, as is easily ascertained,
$$v_{k,n+1} = x_{k,n} - x_{k,0} = x_{k,n} - x_{k,1} = \sum_{i=2}^{n} \alpha_{k,i} v_{k,i},$$
i.e. at the $(k+1)$-th iteration the system of vectors $v_{k+1,i} = v_{k,i+1}$,
$i = 1, 2, \ldots, n$, is found to be linearly dependent. In this case
it is not possible to construct a system of $n$ conjugate vectors; this
means that with the application of this simplified algorithm we
cannot guarantee that a solution will be obtained even for a quadratic
function. The more complicated steps (2) and (3) of the original
algorithm are used just in order to avoid linear dependence of
vectors $v_{k,i}$, $i = 1, 2, \ldots, n$ (we find that $\Delta_k > \varepsilon$).
However, note that in minimizing a quadratic function with
the aid of the original algorithm, it is impossible to guarantee that
the problem will be solved after a finite number of iterations. Indeed,
if we go over from the system of vectors $v_{k,1}, \ldots, v_{k,n}$ to the
system $v_{k+1,1}, \ldots, v_{k+1,n}$, it can occur that one of the conjugate
vectors already constructed is changed (see step (3)); therefore it is
impossible to guarantee that $n$ conjugate vectors will be obtained
after a finite number of iterations. Besides, the system of vectors $v_{k,i}$
can remain unchanged in going over to the $(k+1)$-th iteration.
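The behaviour of the simplified algorithm on a quadratic function can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes $f(x) = \frac{1}{2}(x, Cx) + (d, x)$ with a positive definite matrix $C$, uses the exact one-dimensional minimizer available for a quadratic, and updates the directions by $v_{k+1,i} = v_{k,i+1}$ with the unnormalized new direction $x_{k,n} - x_{k,0}$. The names `line_min` and `simplified_method` are hypothetical and do not come from the book.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(C, x):
    return [dot(row, x) for row in C]

def line_min(C, d, x, v):
    """Exact minimizer over a of f(x + a*v) for f(x) = 0.5*(x, Cx) + (d, x)."""
    grad = [g + di for g, di in zip(matvec(C, x), d)]   # f'(x) = Cx + d
    curv = dot(v, matvec(C, v))
    if curv <= 0.0:            # degenerate (zero) direction: take no step
        return 0.0
    return -dot(grad, v) / curv

def simplified_method(C, d, x0, iterations):
    n = len(x0)
    # start from the coordinate basis
    dirs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = list(x0)
    for _ in range(iterations):
        start = list(x)                                   # x_{k,0}
        for v in dirs:                                    # searches along v_{k,1..n}
            a = line_min(C, d, x, v)
            x = [xi + a * vi for xi, vi in zip(x, v)]
        new_dir = [xi - si for xi, si in zip(x, start)]   # x_{k,n} - x_{k,0}
        if dot(new_dir, new_dir) <= 1e-24:                # no movement at all
            break
        a = line_min(C, d, x, new_dir)
        x = [xi + a * vi for xi, vi in zip(x, new_dir)]   # x_{k+1,0}
        dirs = dirs[1:] + [new_dir]                       # v_{k+1,i} = v_{k,i+1}
    return x

C = [[4.0, 1.0], [1.0, 3.0]]
d = [-1.0, -2.0]          # the minimum point solves Cx = -d, i.e. x* = (1/11, 7/11)
good = simplified_method(C, d, [0.0, 0.0], iterations=2)
stuck = simplified_method(C, d, [0.25, 0.0], iterations=10)
```

From the starting point $(0, 0)$ two iterations reach the minimum point. From $(0.25, 0)$ the very first one-dimensional step is zero, the direction set degenerates to multiples of the second coordinate vector, and the process stalls away from the minimum, which is the situation described above.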
We show now that equality (6.28) holds:
But
$$v_{k,n+1} = \frac{1}{\gamma_k}(x_{k,n} - x_{k,0}) = \frac{1}{\gamma_k}\sum_{i=1}^{n} \alpha_{k,i} v_{k,i}.$$
Consequently
continuity of $f(x)$, that the following equalities hold:
$$f(\bar{x}_{i+1}) = \lim_{m \to \infty} f(x_{k_m,i+1}) = \lim_{m \to \infty} f(x_{k_m,i}) = f(\bar{x}_i). \tag{6.30}$$
Let us demonstrate that $\bar{x}_{i+1} = \bar{x}_i$ for all $i = 0, 1, \ldots, n - 1$.
By construction, $\|v_{k,i}\| = 1$ with any $k$ and $i$; consequently,
vectors $v_{k,i}$ can be considered to be elements of a unit sphere (of
a bounded set) and therefore with any fixed $i = 1, \ldots, n$ there is
a subsequence $\{v_{k_m,i}\}$ that converges to a certain vector $\bar{v}_i$. Since
$x_{k_m,i+1} = x_{k_m,i} + \alpha_{k_m,i+1} v_{k_m,i+1}$ and
$v_{k_m,i+1} \to \bar{v}_{i+1}$, we have
$$\bar{x}_{i+1} = \bar{x}_i + \bar{\alpha}_{i+1} \bar{v}_{i+1}, \quad i = 0, 1, \ldots, n - 1,$$
where $\bar{\alpha}_{i+1} = \lim_{m \to \infty} \alpha_{k_m,i+1}$. Since condition
$f(x_{k,i+1}) = \min_{\alpha} f(x_{k,i} + \alpha v_{k,i+1})$ is satisfied at point $x_{k,i+1}$, we must have:
$$f(\bar{x}_{i+1}) = \min_{\alpha} f(\bar{x}_i + \alpha \bar{v}_{i+1}), \quad i = 0, 1, \ldots, n - 1, \tag{6.31}$$
i.e. the minimum of $f(x)$ in the direction $\bar{v}_{i+1}$ is attained at point
$\bar{x}_{i+1}$. But it follows from (6.30) that $f(\bar{x}_{i+1}) = f(\bar{x}_i)$. Since $f(x)$ is
strictly convex, there is a unique minimum point in the direction
$\bar{v}_{i+1}$; hence $\bar{x}_{i+1} = \bar{x}_i$.
It follows that with a fixed $i$ the sequence $\{x_{k,i}\}$ is a minimizing
one; consequently the sequence (6.29) is a minimizing one as well,
and therefore, since there is only one minimum, the sequence
converges to point $x^*$. The theorem is proved.
It is easy to ascertain that in proving that conditions (6.32) hold,
we made no use of the fact that function $f(x)$ is differentiable, i.e.
these inequalities hold also for a strictly convex continuous
function. However, point $\bar{x}$ (the limit point of sequence (6.29)) in this
case may fail to be the minimum point of $f(x)$ (at the same time
sequence (6.29) can have more than one limit point).
Discussion of Results
Note first of all that the field of application of the method of
conjugate directions is broader than that of methods of dual
directions; this is easily ascertained by comparing the requirements
imposed on the function being minimized in theorems 6.1 and 6.2.
The properties of the method of conjugate directions under
consideration have as yet been studied insufficiently. Thus it is
not yet clear what the rate of convergence of the algorithm is.
Nevertheless, it is evidently slower (in minimizing functions of the
same class) than that of methods of Sec. 5; this can be judged even
by the fact that the algorithm under consideration does not
guarantee the finding of the minimum of a quadratic form after $n$
iterations (or indeed after any finite number of steps), i.e. it does not
guarantee the construction of a system of $n$ conjugate vectors after
a finite number of iterations. Consequently, from the viewpoint
of the rate of convergence, methods of dual directions with their
superlinear rate of convergence have an advantage over the methods
of conjugate directions.
Let us make an attempt to compare the amounts of computations
at iterations of the algorithms studied.
In a method of type (6.1), it is necessary at each iteration to
calculate the function value $n + 1$ or $2(n + 1)$ times for the
construction of matrix $D_k^{-1}$ and $n + 1$ times for the construction of vector $g_k$,
depending on the variant applied (see the subsection on p. 136);
at the same time it can occur at some iterations that in determining
$g_k$ there is no need to perform new evaluations of the function (if
$p_k = \ldots$) or the amount of calculations can increase several times
depending on how closely the gradient is approximated by $g_k$. Besides,
it is necessary to perform some more calculations of function values
in order to choose the direction of motion and the step size.
In the method of conjugate directions it is necessary to calculate
at each iteration the minimum of the function in the direction of
motion $n + 1$ times. If we assume that in solving a one-dimensional
minimization problem we have to calculate on the average 3 or
4 function values, then the amount of calculations at each iteration
with the method being studied is about the same. It is not as yet
clear, though, with what accuracy the minimum in a direction of
motion must be computed in the method of conjugate directions so
that the properties of the process are not violated. From the
viewpoint of the influence on the convergence,
the algorithm of $\alpha_k$ choice in process (6.1) is to be preferred.
On the whole, given the possibility of using methods of type (6.1),
they must be more effective than the method of conjugate
directions; however, it should be stressed once more that the field of
application of the latter is broader.
Finally, it should be noted that in studying process (6.1) we
essentially showed that calculation errors of the order of
$O(\|r_k\|^2)$ in determining vector $e_k$ (see (6.4), (6.5), (6.6)) and of
the order of $O(\varepsilon_k \|p_k\|)$ in determining vector $f'(x_k)$ (see (6.16))
do not violate the properties of process (3.4) (convergence, bounds
on the rate of convergence). If we consider the variant of process (3.4)
in which $r_{k+1} = x_{k+1} - x_k$, then we can obtain other expressions for
estimating the errors. From a certain step on in process (3.4), $\alpha_k = 1$
and, consequently, we have
II '•*11 = | | P s - x l l = 1 1 x h — x^ l l = P i L i / * - 1 || > m i l l /i-ill-
Thus if $r_{k+1} = x_{k+1} - x_k$, errors in the calculation of vectors $e_k$
and $f'(x_k)$ of the order of $O(\|x_k - x^*\|^2)$ and $O(\varepsilon_k \|x_k - x^*\|)$ do not
affect the properties of process (3.4).
Bibliographic Notes
To Sec. 1. The idea of the gradient method was first stated by A. L. Cauchy.
Methods of the gradient type were studied by L. V. Kantorovich [1], J. E. Kelley,
B. T. Polyak [1], M. Altman, Yu. I. Lyubich, Yu. I. Lyubich and G. D. Maistrovsky.
These works contain long lists of literature.
The variant of the method with the choice of the step size according to
condition (1.2) which is described in this section is published for the first time.
The study of the rate of convergence of gradient methods given in this
section is based on the results of the work of B. T. Polyak [1].
To Sec. 2. Newton's method for solving minimization problems and
equations was studied by L. V. Kantorovich [2], L. V. Kantorovich and G. P. Akilov,
and L. Collatz.
M. N. Yakovlev proved that the rate of convergence of the generalized
method is superlinear if the step size is chosen from the condition that the
function attains its minimum in the direction of motion. A. A. Goldstein and J. F. Price,
J. W. Daniel, Yu. M. Danilin [1, 2] studied Newton's method with adjustment
of step length and used a rule for choosing the step length which did not
involve the finding of the function minimum in the direction of motion.
To Sec. 3. This section is based on a paper by Yu. M. Danilin and
B. N. Pshenichny [1].
To Sec. 4. The first of the methods of conjugate directions, the method of
conjugate gradients, was proposed for solving problems of linear algebra by
M. R. Hestenes and E. Stiefel. Another approach to the construction of methods
of conjugate directions as applied to quadratic function minimization was
proposed by W. C. Davidon [1] and developed by R. Fletcher, M. J. D. Powell
and others.
In B. N. Pshenichny's algorithm [3] the construction of conjugate
directions does not involve the finding of the function minimum in the direction
of motion.
Many properties of conjugate directions are discussed by D. K. Faddeev and
V. N. Faddeeva. The general method of constructing conjugate directions which
is used in this section was worked out by H. Y. Huang. Some of the results are
new, e.g. formula (4.45), method (4.63).
To Sec. 5. R. Fletcher and C. M. Reeves suggested the use of the method of
conjugate gradients for the minimization of nonquadratic functions. The
problems of the convergence and bounds on the rate of convergence of the method of
conjugate gradients were studied by J. W. Daniel [1, 2], B. T. Polyak [2],
G. D. Maistrovsky [1, 2], S. A. Smolyak. The convergence of method (4.48) and
the bounds on the rate of convergence were established by M. J. D. Powell [3]
(these results are described in the books by J. W. Daniel [1] and E. Polak [2]).
The proof of the convergence of methods of conjugate directions described in
this section is based on a paper by Yu. M. Danilin [4].
To Sec. 6. Methods of dual directions without calculating derivatives of
the function are discussed in a paper by Yu. M. Danilin and B. N. Pshenichny [2].
Methods of conjugate directions were studied by C. S. Smith, M. J. D. Powell [2],
W. I. Zangwill [1], J. W. Daniel [1]. The works named have been used in writing
this section. A review of minimization methods without calculating derivatives
was written by R. P. Brent.
This chapter describes various methods of function minimization
with constraints on the variables. The first section develops methods
of solving problems of quadratic programming, which is a subsidiary
problem in many algorithms. The following sections describe the
algorithms for solving problems of convex and nonconvex
programming. Wherever feasible, the bounds on the rate of
convergence are given.
1. PROBLEM OF QUADRATIC PROGRAMMING
Usually the problem of quadratic programming is understood
to be the problem of the minimization of a quadratic function with
linear constraints. Thus the problem of quadratic programming is the
minimization of the function
by linear inequalities (1.2). We find the minimum of $f(x)$ on this
face using the method of conjugate gradients. The point obtained is
either the solution of our problem or indicates a transition to a new
face, and then the procedure is repeated. Since the method of conjugate
gradients minimizes function $f(x)$ after a finite number of steps,
and the number of faces of the polyhedral set is finite, it is clear
that an algorithm of this kind converges after a finite number of
steps.
Operators of Projection
Let now $\mathcal{J} = \mathcal{J}^- \cup \mathcal{J}^0$ and $\mathcal{I}$ be a subset of the set of indices $\mathcal{J}$.
We form matrix $A_{\mathcal{I}}$ whose rows are vectors $a_i$, $i \in \mathcal{I}$, so that the
matrix is $m \times n$-dimensional, where $m$ is the number of elements
in set $\mathcal{I}$.
Lemma 1.1. If vectors $a_i$, $i \in \mathcal{I}$, are linearly independent, then
matrix $A_{\mathcal{I}} A_{\mathcal{I}}^*$ is nonsingular.
Proof. Let $y \in E^m$ be a nonzero vector such that
$$A_{\mathcal{I}} A_{\mathcal{I}}^* y = 0. \tag{1.3}$$
Then
$$y^* A_{\mathcal{I}} A_{\mathcal{I}}^* y = (A_{\mathcal{I}}^* y)^* A_{\mathcal{I}}^* y = (A_{\mathcal{I}}^* y, A_{\mathcal{I}}^* y) = \|A_{\mathcal{I}}^* y\|^2 = 0,$$
$$A_{\mathcal{I}}^* y = 0. \tag{1.4}$$
But $A_{\mathcal{I}}^* y$ is just a linear combination of vectors $a_i$, $i \in \mathcal{I}$, with
coefficients $y^i$, $i = 1, \ldots, m$, where $y^i$ are components of vector $y$.
By the assumption that $a_i$, $i \in \mathcal{I}$, are linearly independent, this
combination cannot be zero. Therefore, (1.4) and consequently (1.3)
from which (1.4) was obtained are not true. Thus, matrix $A_{\mathcal{I}} A_{\mathcal{I}}^*$
maps only the zero vector to zero, and this means that this
matrix is nonsingular. Let us now define operator $P$:
$$P = A_{\mathcal{I}}^* (A_{\mathcal{I}} A_{\mathcal{I}}^*)^{-1} A_{\mathcal{I}}. \tag{1.5}$$
It is easily seen that operator $P$ has the following properties:
$$PP = P, \tag{1.6}$$
$$P^* = P, \tag{1.7}$$
$$P(I - P) = (I - P)P = 0. \tag{1.8}$$
Operator $P$ is the operator of orthogonal projection into a subspace
spanned by vectors $a_i$, $i \in \mathcal{I}$.
Indeed, for any vector $x \in E^n$
$$x = Px + (I - P)x.$$
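Properties (1.6) to (1.8) are easy to check numerically for a small matrix. The following is a minimal sketch rather than anything from the book: the helper names (`transpose`, `matmul`, `inverse`, `projector`) and the sample matrix `A` are assumptions made for illustration, and the inverse is computed by plain Gauss-Jordan elimination.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inverse(M):
    """Gauss-Jordan inverse of a small square matrix."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[piv][c]) < 1e-12:
            raise ValueError("matrix is singular: rows of A are linearly dependent")
        aug[c], aug[piv] = aug[piv], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def projector(A):
    """P = A* (A A*)^{-1} A for a matrix A with linearly independent rows."""
    At = transpose(A)
    return matmul(matmul(At, inverse(matmul(A, At))), A)

def close(X, Y, tol=1e-12):
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]                      # m = 2 rows a_i in E^3
P = projector(A)
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
I_minus_P = [[I3[i][j] - P[i][j] for j in range(3)] for i in range(3)]
zero3 = [[0.0] * 3 for _ in range(3)]
```

With these definitions `matmul(P, P)` reproduces `P`, `transpose(P)` equals `P`, and `matmul(P, I_minus_P)` vanishes up to rounding. In addition `matmul(A, I_minus_P)` is zero, so every vector of the form $(I - P)y$ is orthogonal to all rows $a_i$.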
C O N S T R A I N E D F U N C T I O N M I N I M I Z A T I O N
Minimization of a Quadratic Function in a Subspace
Suppose now that we have to minimize a quadratic function $f(x)$
defined by (1.1) with the constraints
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{I}. \tag{1.10}$$
We assume that vectors $a_i$, $i \in \mathcal{I}$, are linearly independent.
Let $x_0$ be a point which satisfies (1.10).
Note that if we denote by $b_{\mathcal{I}}$ a vector whose components are $b_i$,
$i \in \mathcal{I}$, then the system of equations (1.10) can be written in the
form $A_{\mathcal{I}} x - b_{\mathcal{I}} = 0$ so that $A_{\mathcal{I}} x_0 - b_{\mathcal{I}} = 0$.
We now introduce a new variable $y$ defined as follows:
$$x = x_0 + (I - P) y \tag{1.11}$$
and consider the quadratic function
$$\varphi(y) = f(x_0 + (I - P) y).$$
The gradients of functions $\varphi(y)$ and $f(x)$, according to the rules
of differentiation of a composite function and the symmetry of
operator $P$, are related as follows:
$$\varphi'(y) = (I - P) f'(x) \tag{1.12}$$
where $x$ and $y$ are connected by (1.11).
Lemma 1.2. Let $y$ be the point of absolute minimum of function
$\varphi(y)$. Then the corresponding point
$$x = x_0 + (I - P) y$$
is the minimum point of function $f(x)$ with constraints (1.10).
Proof. At point $y$ the gradient of function $\varphi(y)$ becomes zero:
$\varphi'(y) = 0$. Therefore, by (1.12),
$$(I - P) f'(x) = 0$$
or
$$f'(x) - A_{\mathcal{I}}^* (A_{\mathcal{I}} A_{\mathcal{I}}^*)^{-1} A_{\mathcal{I}} f'(x) = 0.$$
Taking $u = -(A_{\mathcal{I}} A_{\mathcal{I}}^*)^{-1} A_{\mathcal{I}} f'(x)$, we obtain
$$f'(x) + A_{\mathcal{I}}^* u = 0. \tag{1.13}$$
Using (1.9) we obtain also
$$A_{\mathcal{I}} x = A_{\mathcal{I}} x_0 + A_{\mathcal{I}} (I - P) y = A_{\mathcal{I}} x_0 = b_{\mathcal{I}},$$
i.e. $x$ satisfies conditions (1.10).
Thus $x$ is a feasible point and at this point conditions (1.13)
are satisfied, which are necessary and sufficient for $x$ to be the
minimum point of $f(x)$ with conditions (1.10). The lemma is proved.
Lemma 1.2 shows that the problem under consideration can be
reduced to the minimization of quadratic function $\varphi(y)$ without
constraints. To minimize $\varphi(y)$, we apply the method of conjugate
gradients (Chap. II, Sec. 4):
$$y_0 = 0, \quad p_1 = -\varphi'(0),$$
$$y_{k+1} = y_k + \alpha_{k+1} p_{k+1},$$
$$p_{k+1} = -\varphi'(y_k) + \frac{\|\varphi'(y_k)\|^2}{\|\varphi'(y_{k-1})\|^2} p_k.$$
The quantity $\alpha_{k+1}$ in these formulas is calculated as follows
$$\alpha_{k+1} = -\frac{(\varphi'(y_k), p_{k+1})}{(p_{k+1}, (I - P) C (I - P) p_{k+1})}$$
since it is easy to ascertain that the matrix which determines the
quadratic term of function $\varphi(y)$ has the form
$$(I - P) C (I - P).$$
These formulas determine the process involving the additional
variables $y$. It is, however, expedient to go over to the original
variables $x$. We first prove that the following relation holds:
$$(I - P) p_k = p_k. \tag{1.14}$$
Indeed, with $k = 1$ we have:
$$(I - P) p_1 = -(I - P) \varphi'(0) = -(I - P)(I - P) f'(x_0) = -(I - P) f'(x_0) = -\varphi'(0) = p_1.$$
If (1.14) holds for $p_k$, then, since $(I - P) \varphi'(y_k) = \varphi'(y_k)$ by (1.12) and (1.6),
$$(I - P) p_{k+1} = -\varphi'(y_k) + \frac{\|\varphi'(y_k)\|^2}{\|\varphi'(y_{k-1})\|^2} (I - P) p_k = p_{k+1}.$$
It follows now from (1.11) that
$$x_{k+1} = x_0 + (I - P) y_{k+1},$$
$$x_{k+1} = x_k + (I - P)(y_{k+1} - y_k) = x_k + (I - P) \alpha_{k+1} p_{k+1},$$
$$x_{k+1} = x_k + \alpha_{k+1} p_{k+1}.$$
Let us transform the formula for $p_{k+1}$, using (1.12):
$$p_{k+1} = -(I - P) f'(x_k) + \frac{\|(I - P) f'(x_k)\|^2}{\|(I - P) f'(x_{k-1})\|^2} p_k.$$
The formula for $\alpha_{k+1}$ now takes the following form:
$$\alpha_{k+1} = -\frac{((I - P) f'(x_k), p_{k+1})}{((I - P) p_{k+1}, C (I - P) p_{k+1})} = -\frac{(f'(x_k), p_{k+1})}{(p_{k+1}, C p_{k+1})}.$$
Theorem 1.1. The problem of the minimization of quadratic
function $f(x)$ with constraints (1.10), given the initial point $x_0$ which
satisfies (1.10), is solved after a finite number of steps by the following
process:
$$p_1 = -(I - P) f'(x_0),$$
$$x_{k+1} = x_k + \alpha_{k+1} p_{k+1},$$
$$p_{k+1} = -(I - P) f'(x_k) + \frac{\|(I - P) f'(x_k)\|^2}{\|(I - P) f'(x_{k-1})\|^2} p_k,$$
$$\alpha_{k+1} = -\frac{(f'(x_k), p_{k+1})}{(p_{k+1}, C p_{k+1})}, \quad k = 0, 1, \ldots$$
minimizing $\varphi(y)$ we applied this method to a function whose matrix
was $(I - P) C (I - P)$. But since $A_{\mathcal{I}} (I - P) = 0$, that is,
$(I - P) A_{\mathcal{I}}^* = 0$, we have $(I - P) a_i = 0$, $i \in \mathcal{I}$. Therefore, in the
case under consideration the number of zero eigenvalues of matrix
$(I - P) C (I - P)$ is not less than $m$, where $m$ is the number of $a_i$,
$i \in \mathcal{I}$. Therefore, the process suggested either converges to the
minimum point or shows that quadratic function $f(x)$ has no lower bound
under constraints (1.10) after a number of steps not exceeding $n - m$.
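The process of Theorem 1.1 can be sketched in a few lines. The code below is an illustration under assumptions, not the book's program: it takes $f(x) = \frac{1}{2}(x, Cx) + (d, x)$ (the exact form of (1.1) is assumed here), and instead of forming $P$ explicitly it applies $I - P$ by subtracting the components along an orthonormalized set of the vectors $a_i$, which gives the same projection. The function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(C, x):
    return [dot(row, x) for row in C]

def orthonormal_rows(A):
    """Gram-Schmidt orthonormalization of the constraint vectors a_i."""
    basis = []
    for a in A:
        w = list(a)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            raise ValueError("constraint vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

def project_out(basis, g):
    """Return (I - P) g, where P projects onto the span of the a_i."""
    for q in basis:
        c = dot(g, q)
        g = [gi - c * qi for gi, qi in zip(g, q)]
    return g

def constrained_cg(C, d, A, x0, tol=1e-24):
    """Minimize 0.5*(x, Cx) + (d, x) subject to (a_i, x) = b_i; x0 must be feasible."""
    basis = orthonormal_rows(A)
    grad = lambda x: [g + di for g, di in zip(matvec(C, x), d)]
    x = list(x0)
    r = project_out(basis, grad(x))          # (I - P) f'(x_0)
    p = [-ri for ri in r]                    # p_1
    for _ in range(len(x0) - len(A)):        # at most n - m steps
        rr = dot(r, r)
        if rr <= tol:                        # projected gradient vanished
            break
        curv = dot(p, matvec(C, p))
        if curv <= 0.0:
            raise ValueError("f(x) has no lower bound on the constraint set")
        alpha = -dot(grad(x), p) / curv      # alpha_{k+1}
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = project_out(basis, grad(x))  # (I - P) f'(x_k)
        beta = dot(r_new, r_new) / rr
        p = [-rn + beta * pi for rn, pi in zip(r_new, p)]
        r = r_new
    return x

C = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
d = [0.0, 0.0, 0.0]
A = [[1.0, 1.0, 1.0]]                        # single constraint x1 + x2 + x3 = 1
x_min = constrained_cg(C, d, A, [1.0, 0.0, 0.0])
```

For this example $n - m = 2$, and two steps take the feasible starting point $(1, 0, 0)$ to $(6/11, 3/11, 2/11)$, the constrained minimum. Because every direction $p_k$ satisfies $(I - P) p_k = p_k$, all iterates stay on the plane $x_1 + x_2 + x_3 = 1$.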
Algorithm of General Problem
of Quadratic Programming
Let us now return to the general problem (1.1), (1.2). For each
point $x$ which satisfies (1.2) we set
$$\mathcal{I}(x) = \{i : (a_i, x) - b_i = 0, \; i \in \mathcal{J}^- \cup \mathcal{J}^0\}.$$
In what follows we assume that the following condition of
nondegeneracy is fulfilled: with any $x$ vectors $a_i$, $i \in \mathcal{I}(x)$, are linearly
independent.
We now propose the algorithm for solving the problem.
Let $x_0$ be an arbitrary point which satisfies (1.2) and is the first
approximation. Take a set of indices $\mathcal{I}_0 = \mathcal{I}(x_0)$ and construct
operator $P_{\mathcal{I}_0}$. Two cases are possible.
(1) $(I - P_{\mathcal{I}_0}) f'(x_0) = 0$. Then
$$f'(x_0) + A_{\mathcal{I}_0}^* u_0 = 0, \quad u_0 = -(A_{\mathcal{I}_0} A_{\mathcal{I}_0}^*)^{-1} A_{\mathcal{I}_0} f'(x_0), \tag{1.15}$$
and point $x_0$ is the minimum point of $f(x)$ on the face defined by the
system of equations
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{I}_0$$
(see Chap. I, Sec. 3).
If there are no negative components among $u_0^i$, components of
vector $u_0$, $i \in \mathcal{I}(x_0) \cap \mathcal{J}^-$, then (see Chap. I, Sec. 3) point $x_0$ is the
solution of the primal problem (1.1), (1.2), for in this case (1.15)
are the necessary and sufficient conditions for the minimum of
function $f(x)$ with constraints (1.2).
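The sign test on the components of $u_0$ can be illustrated as follows. This is a sketch under assumptions: the inequality constraints are taken in the form $(a_i, x) - b_i \le 0$, the vector $u_0$ is obtained by solving $(A A^*) u = -A f'(x_0)$ with a small Gaussian elimination, and the names `solve`, `multipliers` and `first_negative` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(M, rhs):
    """Solve M u = rhs for a small nonsingular matrix (Gaussian elimination)."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[piv][c]) < 1e-12:
            raise ValueError("active constraint vectors are linearly dependent")
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    u = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * u[j] for j in range(i + 1, n))
        u[i] = s / aug[i][i]
    return u

def multipliers(A_active, grad):
    """u_0 = -(A A*)^{-1} A f'(x_0) for the rows a_i of the active constraints."""
    gram = [[dot(ai, aj) for aj in A_active] for ai in A_active]
    rhs = [-dot(ai, grad) for ai in A_active]
    return solve(gram, rhs)

def first_negative(u, is_inequality, tol=1e-12):
    """Index of an active inequality constraint with u_i < 0, or None."""
    for i, (ui, flag) in enumerate(zip(u, is_inequality)):
        if flag and ui < -tol:
            return i
    return None

# Two active inequality constraints x1 <= 1 and x2 <= 2, gradient f'(x_0) = (-1, 2).
A_active = [[1.0, 0.0], [0.0, 1.0]]
u0 = multipliers(A_active, [-1.0, 2.0])
j = first_negative(u0, [True, True])
```

Here `u0` is `[1.0, -2.0]`, so the second constraint has a negative component and is the one to be dropped. When `first_negative` returns `None`, conditions (1.15) hold with nonnegative components for the inequality constraints and the point is the solution.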
Suppose now that there is an index $j \in \mathcal{I}(x_0) \cap \mathcal{J}^-$ such that
$u_0^j < 0$. Construct a new set of indices by deleting index $j$ from $\mathcal{I}_0$.
We apply the method of conjugate gradients described in the subsection
on p. 148 to solving the problem of minimization of $f(x)$ with
constraints
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{I}_0 \setminus \{j\}. \tag{1.16}$$
However, in applying the method of conjugate gradients, the process
must not transgress the limits (1.2). Therefore at every step of the
algorithm the following check should be made. Compute the quantity
$$\bar{\alpha}_{k+1} = \min_i \frac{b_i - (a_i, x_k)}{(a_i, p_{k+1})} \tag{1.17}$$
where the minimum is taken over all $i$ for which $(a_i, p_{k+1}) > 0$.
In this formula $x_k$ is the point just constructed by the algorithm and
$p_{k+1}$ is the conjugate direction at this point.
Let now $\alpha_{k+1}$ be the corresponding step length in the method of
conjugate gradients. If $\alpha_{k+1} < \bar{\alpha}_{k+1}$, then $x_{k+1} = x_k + \alpha_{k+1} p_{k+1}$
and the process goes on. If however $\alpha_{k+1} \ge \bar{\alpha}_{k+1}$, then
$x_{k+1} = x_k + \bar{\alpha}_{k+1} p_{k+1}$ and the process stops.
Thus, either we find the minimum point of $f(x)$ under
conditions (1.16) or the process will be truncated when $\alpha_{k+1} \ge \bar{\alpha}_{k+1}$.
In both cases we take the point obtained to be the initial point and
proceed using the new point as we did with the initial one, $x_0$.
(2) $(I - P_{\mathcal{I}_0}) f'(x_0) \ne 0$.
In this case we apply the method of conjugate gradients to solving
the problem of minimization of $f(x)$ with constraints
$$(a_i, x) - b_i = 0, \quad i \in \mathcal{I}_0 \tag{1.18}$$
starting at point $x_0$. At every step, as before, a check is made
whether the points obtained are feasible or not, i.e. we calculate
$\bar{\alpha}_{k+1}$ by formula (1.17) and apply the process of conjugate gradients
until either we find the minimum point of $f(x)$ with constraints (1.18)
or the condition $\alpha_{k+1} \ge \bar{\alpha}_{k+1}$ is satisfied and the point
$x_{k+1} = x_k + \bar{\alpha}_{k+1} p_{k+1}$ is obtained. In both cases we take the point obtained
as the initial one and repeat at it the operations performed with $x_0$.
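The feasibility check used in both cases amounts to a ratio test along the direction of motion. The sketch below assumes that the inequality constraints have the form $(a_i, x) - b_i \le 0$ and that the rows of `A` and entries of `b` passed in are those of the constraints to be checked; the names `max_feasible_step` and `truncated_step` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def max_feasible_step(A, b, x, p):
    """Largest step along p that keeps (a_i, x) - b_i <= 0, as in formula (1.17).

    Only constraints with (a_i, p) > 0 can become violated; if there are none,
    the step is unbounded and infinity is returned.
    """
    best = float("inf")
    for a_i, b_i in zip(A, b):
        slope = dot(a_i, p)
        if slope > 0.0:
            best = min(best, (b_i - dot(a_i, x)) / slope)
    return best

def truncated_step(A, b, x, p, alpha):
    """Take the conjugate gradient step alpha, cut back to the boundary if needed.

    Returns the new point and a flag telling whether the process was truncated.
    """
    alpha_bar = max_feasible_step(A, b, x, p)
    if alpha < alpha_bar:
        step, truncated = alpha, False
    else:
        step, truncated = alpha_bar, True
    return [xi + step * pi for xi, pi in zip(x, p)], truncated

A = [[1.0, 0.0], [0.0, 1.0]]     # constraints x1 <= 1 and x2 <= 2
b = [1.0, 2.0]
inside, cut_a = truncated_step(A, b, [0.0, 0.0], [1.0, 1.0], alpha=0.5)
on_face, cut_b = truncated_step(A, b, [0.0, 0.0], [1.0, 1.0], alpha=3.0)
```

In the first call the step $0.5$ is below the bound $\bar{\alpha} = 1$ and the process goes on; in the second the step is cut back to $\bar{\alpha} = 1$, the point lands on the face $x_1 = 1$, and the flag signals that the set of active indices has to be rebuilt at the new point.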
Let us substantiate the convergence of the method after a finite
number of steps. We must first of all show that in case (1) as well
as in case (2) a successful step will be made, i.e. we move from
point $x_0$ to a new point at which the value of function $f(x)$ will be
strictly less than $f(x_0)$.
New points are obtained by the method of conjugate gradients
and in this method the function decreases at each step. Therefore,
the only thing we have to show is that $\bar{\alpha}_{k+1} > 0$ always, i.e. con-
with constraints
$$A_{\mathcal{I}_0} p = 0. \tag{1.19}$$
The last expression is the necessary and sufficient condition for
convex function $\varphi(p)$ to attain its minimum at point $p_1$ with
constraints (1.19). The lemma is proved.
We formulate now a problem which is the dual of the problem of
minimizing $\varphi(p)$ with constraints (1.19). According to the rules
stated in Sec. 3 of Chap. I, we have to find the minimum of function
$\varphi(p) + u^* A_{\mathcal{I}_0} p$. Differentiating with respect to $p$ and equating the
derivatives to zero, we obtain $p + f'(x_0) + A_{\mathcal{I}_0}^* u = 0$, i.e.
$$p = -f'(x_0) - A_{\mathcal{I}_0}^* u.$$
Substituting this expression for $p$ we obtain that
Thus the dual problem consists in finding over all possible vectors $u$
the minimum of the function
Now differentiating $\varphi^*(u)$ and equating the derivatives to zero,
we can easily ascertain that vector
$$u_0 = -(A_{\mathcal{I}_0} A_{\mathcal{I}_0}^*)^{-1} A_{\mathcal{I}_0} f'(x_0)$$
$$p_1 = -(f'(x_0) + A_{\mathcal{I}_0 \setminus \{j\}}^* v), \quad v = -(A_{\mathcal{I}_0 \setminus \{j\}} A_{\mathcal{I}_0 \setminus \{j\}}^*)^{-1} A_{\mathcal{I}_0 \setminus \{j\}} f'(x_0).$$
If $p_1 = 0$, then $f'(x_0) + A_{\mathcal{I}_0 \setminus \{j\}}^* v = 0$. But on the other hand, by
assumptions,
$$(I - P_{\mathcal{I}_0}) f'(x_0) = f'(x_0) + A_{\mathcal{I}_0}^* u_0 = 0. \tag{1.21}$$
Subtracting from (1.21) the first equality, we obtain
$$A_{\mathcal{I}_0}^* u_0 - A_{\mathcal{I}_0 \setminus \{j\}}^* v = u_0^j a_j + \sum_{i \ne j} (u_0^i - v^i) a_i = 0$$
which, as $u_0^j \ne 0$, is impossible since vectors $a_i$, $i \in \mathcal{I}_0$, are linearly
independent. We now prove the second part of the lemma.
Rewrite (1.21) in the component form:
$$f'(x_0) + \sum_{i \ne j} u_0^i a_i + (-u_0^j)(-a_j) = 0. \tag{1.22}$$
Note that $-u_0^j > 0$ since $u_0^j < 0$. Consider the problem of
minimizing $\varphi(p) = (p, f'(x_0)) + \|p\|^2 / 2$ with constraints
$= -\|p_1\|^2.$

Therefore,

The last inequality contradicts the fact that the minimum value of $\varphi(p)$ with constraints (1.23) is attained with $p = 0$ and is equal to zero. This contradiction shows that $(a_j, p_1) < 0$. The lemma is proved.

We return now to the algorithm constructed. Let us consider case (1) and let point $x_0$ not be the solution of the problem of quadratic programming. According to the algorithm, we should apply the method of conjugate gradients in order to minimize function $f(x)$ with constraints (1.16). In accordance with the formulas of the method, the first step is made in the direction of vector
Therefore
The fact that $\bar{\alpha}_1 > 0$ indicates that all of the points $x_0 + \alpha p_1$ with $0 \le \alpha \le \bar{\alpha}_1$ satisfy conditions (1.2). Indeed, for $i \in \mathcal{J}_0$

$$(a_i, x_0 + \alpha p_1) - b_i = (a_i, x_0) - b_i + \alpha (a_i, p_1) = \alpha (a_i, p_1) \begin{cases} = 0, & i \ne j, \\ < 0, & i = j. \end{cases}$$

For $i \notin \mathcal{J}_0$

$$(a_i, x_0 + \alpha p_1) - b_i = (a_i, x_0) - b_i + \alpha (a_i, p_1) < 0$$

if $(a_i, p_1) \le 0$; however, if $(a_i, p_1) > 0$, then

$$\alpha \le \bar{\alpha}_1 \le \frac{b_i - (a_i, x_0)}{(a_i, p_1)}$$

and therefore

$$(a_i, x_0 + \alpha p_1) - b_i \le (a_i, x_0) - b_i + \frac{b_i - (a_i, x_0)}{(a_i, p_1)} (a_i, p_1) = 0.$$

Note that the inequality in the last expression is strict if $\alpha < \bar{\alpha}_1$ or $\bar{\alpha}_1 < \dfrac{b_i - (a_i, x_0)}{(a_i, p_1)}$.

According to the algorithm, two cases are possible: $\alpha_1 < \bar{\alpha}_1$ and $\bar{\alpha}_1 \le \alpha_1$. In the first case we obtain a new point $x_1 = x_0 + \alpha_1 p_1$ that satisfies the relations

$$(a_i, x_1) - b_i = 0,\ i \in \mathcal{J}_0', \qquad (a_i, x_1) - b_i < 0,\ i \notin \mathcal{J}_0'. \tag{1.26}$$

In the second case we obtain point $x_1 = x_0 + \bar{\alpha}_1 p_1$, and it is taken as a new initial point from which the algorithm begins to operate by checking case (1) or (2). Point $x_1$ satisfies conditions $(a_i, x_1) - b_i = 0$, $i \in \mathcal{J}_0'$, and moreover the equalities $(a_i, x_1) - b_i = 0$ with all $i \notin \mathcal{J}_0$ such that

$$\frac{b_i - (a_i, x_0)}{(a_i, p_1)} = \bar{\alpha}_1,$$
i.e. if w r i t t e n i n t h e c o m p o n e n t f o r m
(«i. P h ) = o, i 6 r „
The inequalities
(at, x h ) — bi < 0 , i £ f o
Computational Aspects
The proposed algorithm comprises in essence only one complicated computational operation: the projection of the gradient onto a subspace, i.e. the calculation of the quantity $(I - P_{\mathcal{J}}) f'(x)$. There are two ways of performing this calculation.

The first consists in a direct calculation of matrix $P_{\mathcal{J}}$, i.e. $P_{\mathcal{J}} = A^*_{\mathcal{J}} (A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}}$. This involves calculating matrix $(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1}$. If this matrix is known, then the calculation of the required vector $u = -(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)$ is reduced to multiplying the matrix by the vector.
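
The first way can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `A` (the matrix of active constraints), `g` (the gradient), `solve` and `projected_direction` are hypothetical, the rows of `A` are assumed linearly independent, and instead of forming $(A A^*)^{-1}$ explicitly the sketch solves the system $(A A^*) u = -A g$ by Gaussian elimination, which yields the same vector $u$.

```python
def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= factor * aug[c][j]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * y[j] for j in range(r + 1, n))
        y[r] = (aug[r][n] - s) / aug[r][r]
    return y


def projected_direction(A, g):
    """Return (u, p) with u = -(A A*)^{-1} A g and p = -(g + A* u)."""
    m, n = len(A), len(g)
    AAt = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
           for i in range(m)]
    Ag = [sum(A[i][k] * g[k] for k in range(n)) for i in range(m)]
    u = solve(AAt, [-v for v in Ag])
    p = [-(g[k] + sum(A[i][k] * u[i] for i in range(m))) for k in range(n)]
    return u, p


# Illustrative data (not from the text): two active constraints in E^3.
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
g = [1.0, 2.0, 3.0]
u, p = projected_direction(A, g)
```

For this sample data the resulting direction satisfies $A p = 0$ up to rounding, as a projection onto the subspace should.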
$$\alpha = b - u^* D^{-1} u.$$

Thus if matrix $D^{-1}$ is known, then matrix $B^{-1}$, where $B$ is obtained from $D$ by adding the last column and the last row, can be obtained by simple calculations.

Conversely, if matrix $B^{-1}$ is of the form

$$B^{-1} = \begin{pmatrix} G & p \\ p^* & m \end{pmatrix},$$

then for matrix $D^{-1}$ we have

$$D^{-1} = G - \frac{p p^*}{m}.$$

Thus if the new matrix is obtained from the original one by deleting the last row and the last column or by adding a row and a column, then the inverse matrices are obtained by simple arithmetic operations. The fact that in the formulas given above we deleted the last column and row does not matter, for it can easily be checked that an interchange of rows in the original matrix leads simply to the same interchange of columns in the inverted matrix, and an interchange of columns leads to the same interchange of rows.

Thus we have shown that the projection matrix can be calculated by recursive formulas. The drawback of these recursive computations is that they may lead to a large cumulative computation error.
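
The recursive formulas can be illustrated by the following Python sketch. Only the quantity $b - u^* D^{-1} u$ and the deletion formula $D^{-1} = G - p p^* / m$ are taken from the text; the explicit block expression used in `add_border` is the standard bordering identity for a symmetric matrix and is stated here as an assumption, as are the function names.

```python
def matvec(M, v):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def add_border(D_inv, u, b):
    """Inverse of B = [[D, u], [u*, b]] computed from D^{-1} (D symmetric)."""
    w = matvec(D_inv, u)                                  # D^{-1} u
    alpha = b - sum(ui * wi for ui, wi in zip(u, w))      # b - u* D^{-1} u
    n = len(u)
    G = [[D_inv[i][j] + w[i] * w[j] / alpha for j in range(n)]
         for i in range(n)]
    p = [-wi / alpha for wi in w]
    return [G[i] + [p[i]] for i in range(n)] + [p + [1.0 / alpha]]


def delete_border(B_inv):
    """D^{-1} = G - p p* / m, where B^{-1} = [[G, p], [p*, m]]."""
    n = len(B_inv) - 1
    m = B_inv[n][n]
    p = [B_inv[i][n] for i in range(n)]
    return [[B_inv[i][j] - p[i] * p[j] / m for j in range(n)]
            for i in range(n)]


# Illustrative data: D = [[2]], bordered to B = [[2, 1], [1, 2]].
B_inv = add_border([[0.5]], [1.0], 2.0)
D_back = delete_border(B_inv)
```

Adding a border and then deleting it returns the original inverse, which gives a cheap consistency check against the accumulation of error mentioned above.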
Let us describe another way of computation.

It was shown in the subsection on p. 151 that vector $p_0 = -(I - P_{\mathcal{J}}) f'(x)$ is the solution of the problem of minimizing $(f'(x), p) + \tfrac{1}{2} \|p\|^2$ with constraints $A_{\mathcal{J}} p = 0$. It is expedient to go over to the dual problem which, as demonstrated above, consists in the maximization of the quadratic function

$$-\tfrac{1}{2} \| f'(x) + A^*_{\mathcal{J}} u \|^2$$

over vectors $u$ without constraints. This problem can be easily solved by the method of conjugate directions. As was shown in the subsection on p. 151, its solution is vector $u_0 = -(A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)$, i.e. the vector which is required for the application of the algorithm for solving the general problem of quadratic programming. Vector $p_0$ is easily calculated using $u_0$ and the following formula:

$$p_0 = -(I - P_{\mathcal{J}}) f'(x) = -[f'(x) - A^*_{\mathcal{J}} (A_{\mathcal{J}} A^*_{\mathcal{J}})^{-1} A_{\mathcal{J}} f'(x)] = -[f'(x) + A^*_{\mathcal{J}} u_0],$$

i.e.

$$p_0 = -[f'(x) + A^*_{\mathcal{J}} u_0].$$

Thus in using the second way of computing, the operation is reduced to applying many times the standard procedure of the method of conjugate directions.
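
A minimal sketch of the second way follows, assuming the dual problem is solved by the ordinary conjugate gradient iteration applied to the quadratic $\tfrac{1}{2} \| g + A^* u \|^2$ (maximizing its negative is the same problem). The names `A`, `g`, `dual_direction` and the tolerance `tol` are hypothetical, and the rows of `A` are assumed linearly independent.

```python
def dual_direction(A, g, tol=1e-24):
    """Minimise 0.5*||g + A* u||^2 over u by conjugate gradients,
    then return (u, p) with p = -(g + A* u)."""
    m, n = len(A), len(g)

    def a_times(v):                     # A v
        return [sum(A[i][k] * v[k] for k in range(n)) for i in range(m)]

    def at_times(u):                    # A* u
        return [sum(A[i][k] * u[i] for i in range(m)) for k in range(n)]

    u = [0.0] * m
    r = [-v for v in a_times(g)]        # minus the gradient at u = 0
    d = r[:]
    rr = sum(x * x for x in r)
    for _ in range(m):                  # at most m steps in exact arithmetic
        if rr <= tol:
            break
        Hd = a_times(at_times(d))       # (A A*) d
        step = rr / sum(x * y for x, y in zip(d, Hd))
        u = [ui + step * di for ui, di in zip(u, d)]
        r = [ri - step * hi for ri, hi in zip(r, Hd)]
        rr_new = sum(x * x for x in r)
        d = [ri + (rr_new / rr) * di for ri, di in zip(r, d)]
        rr = rr_new
    p = [-(gk + ak) for gk, ak in zip(g, at_times(u))]
    return u, p


# Illustrative data (not from the text).
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
g = [1.0, 2.0, 3.0]
u0, p0 = dual_direction(A, g)
```

No matrix is inverted or stored here; only products with $A$ and $A^*$ are needed, which is the practical attraction of the dual route.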
Problem of Quadratic Programming with Simple Constraints

The problem with simple constraints is understood to be the problem of minimizing

with constraints $x^i \ge 0$, $i \in \mathcal{J}$, where $\mathcal{J}$ is a subset of the set $\{1, 2, \ldots, n\}$. In this case the algorithm of the subsection on p. 151 is considerably simplified. Instead of performing these simplifications formally, we shall formulate an algorithm for solving the problem. Its description will make it clear that the proof of its convergence after a finite number of steps coincides with the proof for the algorithm of the subsection on p. 151. So, let $x_0$ be an arbitrary point which satisfies constraints $x_0^i \ge 0$, $i \in \mathcal{J}$. Suppose that
®fc+i = min (- - - - - )
1 \ H + i I
w h e r e t h e m i n i m u m is t a k e n o v e r all i ^ for w h i c h < 0 .
T h e n w e c o m p a r e a * + i ^a n d a ^ . • • a
I f a * + i < i a h + 1» t h e n X i + j = X i + a j H - i P f t + i * i S I f ' , x \ + \ = =
= 7 0> * 6 f''- If «fc+i ^ otfe+n t h e n x l + i — x \ + ctfe+iPft+i* i
x \ + i = x\ = 0, i £ T ' • T h e c a l c u l a t i o n p r o c e s s will b e t r u n c a t e d after
a f i n i t e n u m b e r o f s t e p s a n d a p o i n t x k + x w i l l b e f o u n d s u c h t h a t it
p r o v i d e s m i n i m u m of / (x) s u b j e c t t o x* — 0, i £ or such that
otfc+x ^ a h + 1 . H e r e , f ( x k ) n > ' f a n d t h e i n c l u s i o n i s s u c h t h a t
t h e r e a r e i £ ' f ( x ft), b u t i ^ . I n b o t h c a s e s p o i n t x ft+ x i s t a k e n a s
t h e i n i t i a l p o i n t a n d t h e p r o c e s s is r e p e a t e d .
( 2 ) T h e r e a r e i n d i c e s i s u c h t h a t (/' ( x 0 ) ) 1 ^ = 0 , i £ ' f ( x 0 ). I n t h i s
c a s e w e a p p l y t h e m e t h o d of c o n j u g a t e g r a d i e n t s t o m i n i m i z e / (x)
w i t h v a r i a b l e s x x, i £ ( x 0 ). T h e c o m p o n e n t s x l , £ 6 ^ ( x 0 ) a l l t h e
t i m e r e m a i n e q u a l t o z e r o . M o r e o v e r a s i n c a s e (1), a t e v e r y s t e p w e
calculate the quantity
a h+1 = m •i n ( - - *'<
-- - \
i \ Ph+i I
w h e r e t h e m i n i m u m i s t a k e n o v e r a l l i £ ‘f ( x 0 ), 1 < 0 . T h e process
s t o p s i n t h e s a m e w a y a s i n c a s e (1).
It is easily seen that an argument analogous to that given in the subsection on p. 151 results either in a proof of the convergence of the algorithm after a finite number of steps or in establishing the fact that $f(x)$ has no lower bound under the conditions $x^i \ge 0$, $i \in \mathcal{J}$.
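
The step-length comparison used at every step of cases (1) and (2) can be sketched as follows. The sketch assumes that the bound is the largest step keeping $x^i + \alpha p^i \ge 0$ for the constrained coordinates, i.e. the minimum of $-x^i / p^i$ over those $i$ with $p^i < 0$; the names `boundary_step`, `take_step`, `alpha_cg` and `constrained` are hypothetical.

```python
def boundary_step(x, p, constrained):
    """Largest alpha with x[i] + alpha * p[i] >= 0 for all i in `constrained`."""
    ratios = [-x[i] / p[i] for i in constrained if p[i] < 0]
    return min(ratios) if ratios else float("inf")


def take_step(x, p, alpha_cg, constrained):
    """Move by the conjugate gradient step unless the boundary is reached first.

    Returns the new point and a flag telling whether a constraint became active.
    """
    alpha_bar = boundary_step(x, p, constrained)
    alpha = min(alpha_cg, alpha_bar)
    x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x_new, alpha_bar <= alpha_cg


# Illustrative data: coordinates 0 and 2 must stay non-negative.
x_new, hit = take_step([1.0, 2.0, 0.5], [-2.0, 1.0, -0.25], 1.0, [0, 2])
```

When the flag is set, the coordinate that reached zero joins the set of fixed coordinates and the process restarts from the new point, as described above.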
2. METHOD OF FEASIBLE DIRECTIONS

The method of feasible directions was one of the first methods suggested for solving the problem of convex programming.

Suppose it is required to minimize function $f_0(x)$ with the constraints

$$f_i(x) \le 0,\ i = 1, \ldots, m, \qquad A x - b = 0, \tag{2.1}$$

where $x \in E^n$, the functions $f_i(x)$, $i = 0, 1, \ldots, m$, are convex and continuously differentiable, $A$ is an $l \times n$ matrix, and $b$ is an $l$-dimensional vector. Moreover, suppose that the gradients of the functions $f_i(x)$, $i = 0, 1, \ldots, m$, satisfy Lipschitz' condition

$$\| f_i'(x_1) - f_i'(x_2) \| \le C \| x_1 - x_2 \| \tag{2.2}$$

and $\| f_i'(x) \| \le K$ for all points $x$ which are considered in what follows. We denote by $D$ the admissible region, i.e. the set

$$D = \{ x : f_i(x) \le 0,\ i = 1, \ldots, m,\ A x - b = 0 \}.$$

We shall assume in what follows that set $D$ is compact and that the condition of boundedness of the gradients is fulfilled. Let $x_0$ be a point of $D$. We find a direction $p \in E^n$ such that with small $\alpha$ we have $(x_0 + \alpha p) \in D$ and, besides, $f_0(x_0 + \alpha p) < f_0(x_0)$. Such a direction is called feasible. Moving along this direction by a step $\alpha_1$ we obtain a new point $x_1 = x_0 + \alpha_1 p \in D$. We take this point as the initial point and the process is repeated. The problem now consists in working out an effective method of finding feasible directions and choosing step $\alpha$ so as to provide for convergence to the minimum point.

Below we always assume that the following condition of nondegeneracy is fulfilled: there is a point $\bar{x}$ such that

$$A \bar{x} - b = 0, \qquad f_i(\bar{x}) < 0,\ i = 1, \ldots, m.$$
Method of Choosing Feasible Directions

Let

$$\mathcal{J}_\delta(x) = \{ i : f_i(x) \ge -\delta,\ i = 1, \ldots, m \}$$

for each point $x \in D$. Let $\xi_i > 0$, $i = 0, 1, \ldots, m$, be arbitrary numbers. Consider the following problem at each point $x \in D$:

$$\min \eta, \qquad (f_i'(x), p) \le \xi_i \eta,\ i \in \mathcal{J}_\delta(x) \cup \{0\}, \qquad A p = 0, \qquad \|p\| \le 1, \tag{2.3}$$
where $\eta$ is a number and $\|p\|$ an arbitrary norm. To make (2.3) a problem of linear programming it is convenient to take as a norm $\|p\| = \max_i |p^i|$.

Let $p_\delta(x)$, $\eta_\delta(x)$ be a solution of problem (2.3). Since $p = 0$, $\eta = 0$ satisfies constraints (2.3), clearly we have $\eta_\delta(x) \le 0$. We demonstrate that $p_\delta(x)$ is a feasible direction if $\eta_\delta(x) < 0$. Indeed, let $\alpha > 0$. For $i = 0$ we have by Taylor's formula
f i ( x + C t P i ( * ) ) < / o ( x ) + — Mi a l o i i a ( x ) ,
fi(x + ctpi(x))^0, I£Ji(x), ft ( x + a p i ( x ) X 0 , (x). ( 2 . 7 )
To s a t i s f y t h e s e i n e q u a l i t i e s it is s u f f i c i e n t t h a t
5 i V | s ( x ) + a C || p « ( * ) | | 2 < 0 . *€.?«(*).
fo ( * + < * P a ( * ) ) < / o ( * ) H - o S o T | a ( * ) S o T ) 6 (x ) J ’
*
Then due to the convexity of the functions $f_i(x)$, $i = 0, 1, \ldots, m$, and the fact that $f_i(x_*) \le 0$, $i = 1, \ldots, m$, we obtain

$$f_i(x_\rho) \le \rho f_i(\bar{x}) + (1 - \rho) f_i(x_*) \le \rho \sigma, \quad i = 1, \ldots, m.$$

Further, for $i \in \mathcal{J}_0(x)$, $f_i(x) = 0$ and therefore for $0 < \lambda < 1$,

$$\lambda \rho \sigma \ge \lambda f_i(x_\rho) = \lambda f_i(x_\rho) + (1 - \lambda) f_i(x) \ge f_i(\lambda x_\rho + (1 - \lambda) x) = f_i(x + \lambda (x_\rho - x)) - f_i(x) \ge \lambda (f_i'(x), x_\rho - x),$$

where we have used the inequality (Chap. I, Sec. 2)

$$f(y) - f(x) \ge (f'(x), y - x),$$

which holds for any differentiable convex function.
Thus

$$\rho \sigma \ge (f_i'(x), x_\rho - x), \quad i \in \mathcal{J}_0(x). \tag{2.11}$$

Further, since point $x$ does not provide a minimum of $f_0(x)$ in $D$,

$$0 > \gamma = f_0(x_*) - f_0(x) \ge (f_0'(x), x_* - x).$$

Hence

$$(f_0'(x), x_\rho - x) = \rho (f_0'(x), \bar{x} - x) + (1 - \rho)(f_0'(x), x_* - x) \le \rho (f_0'(x), \bar{x} - x) + (1 - \rho) \gamma. \tag{2.12}$$

It follows from (2.11) and (2.12) that with sufficiently small $\rho > 0$ the following inequalities are satisfied:
Algorithm of Method of Feasible Directions

Let $x_0 \in D$ be an arbitrary first approximation and $\delta_0 > 0$. We describe the general step of the algorithm. Let point $x_k \in D$ be obtained at the $k$-th step and $\delta_k > 0$.
$$\min \eta, \qquad (f_i'(x_k), p) \le \xi_i \eta,\ i \in \mathcal{J}_{\delta_k}(x_k) \cup \{0\}, \qquad A p = 0, \qquad \|p\| \le 1,$$

we obtain $p_{\delta_k}(x_k) = p_k$ and $\eta_{\delta_k}(x_k) = \eta_k$.

Remark. If we take the quantity $\max_i |p^i|$ as the norm of vector $p$, then the above problem is a problem of linear programming and can be solved by one of the standard methods.

Two cases are possible.
fi ( x h + - U p i , ) < 0 ,
1i(xi,+i)^0, i = l, m. (2.16)
Thus in the first case there is a shift to a new point; in the second there is no such shift.

We now formulate also the condition of the halt of the algorithm: if at a certain step $k$ we have $\delta_k < \delta^0(x_k)$, where

$$\delta^0(x_k) = -\max_{i \notin \mathcal{J}_0(x_k)} f_i(x_k),$$

and $\eta_k = 0$, then $x_k$ is the solution of the problem set above, i.e. $x_k$ is the minimum point of $f_0(x)$ with constraints (2.1).
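
The index sets and the halt test can be illustrated with a short Python sketch. It assumes that the constraint values $f_i(x)$ at a feasible point are already available in a list `values` (so every entry is non-positive); the function names are hypothetical, and the direction-finding linear program itself is not solved here.

```python
def active_set(values, delta):
    """Indices i (1-based) with f_i(x) >= -delta, where values[i-1] = f_i(x)."""
    return {i for i, v in enumerate(values, start=1) if v >= -delta}


def delta_zero(values):
    """-max f_i(x) over the constraints that are not active at x (f_i(x) < 0)."""
    inactive = [v for v in values if v < 0]
    return -max(inactive) if inactive else float("inf")


def halt(values, delta_k, eta_k):
    """Halt test: delta_k < delta^0(x_k) and eta_k = 0."""
    return delta_k < delta_zero(values) and eta_k == 0


# Illustrative constraint values at a feasible point.
values = [0.0, -0.05, -0.4]
wide = active_set(values, 0.1)      # constraints within 0.1 of the boundary
exact = active_set(values, 0.0)     # constraints active in the strict sense
```

For any $\delta$ below `delta_zero(values)` the set returned by `active_set` coincides with the strictly active set, which is why the halt test requires $\delta_k$ to be that small.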
Substantiation of Convergence of the Algorithm

We show that if the sequence $\{x_k\}$ is truncated at a certain step $k$ because the conditions of the halt have been fulfilled, then $x_k$ is really the solution of the problem. Indeed, let the conditions of the halt be fulfilled, i.e. $\eta_k = \eta_{\delta_k}(x_k) = 0$, and

But as was shown in proving lemma 2.1, if conditions (2.17) are fulfilled, then $\eta_{\delta_k}(x_k) < 0$ provided $x_k$ is not the solution of the problem. But it was assumed that $\eta_k = 0$, so it follows that $x_k$ is the minimum point of $f_0(x)$ with $x \in D$. Let the iterative process now be continued without limit so that we have an infinite sequence $\{x_k\}$, $k = 0, 1, \ldots$. Let $x_k$ be a point at which $\eta_k < -\delta_k$, i.e. case (i) takes place. Then making use of estimates (2.9) and the fact that $\|p_k\| = \|p_{\delta_k}(x_k)\| \le 1$, we can state that if
^ 1 Son*
2 c ,
(2-i8)
Therefore
If we take into account that in the case under consideration $-\eta_k > \delta_k$, then inequality (2.19) can be made stronger by substituting $\delta_k$ for $-\eta_k$. Then we obtain:
f o (x k + l ) < fo (Xk) — y y 62
is fulfilled.
Therefore
Suppose that the opposite is true, i.e. that point $x_*$ is not the minimum point of $f_0(x)$ in $D$. Then on the basis of lemma 2.1, we can affirm that with all

$$\delta < \delta^0(x_*) = -\max_{i \notin \mathcal{J}_0(x_*)} f_i(x_*)$$

we have $\mathcal{J}_\delta(x_*) = \mathcal{J}_0(x_*)$ and $\eta_\delta(x_*) < 0$. Moreover, since $\mathcal{J}_\delta(x_*) = \mathcal{J}_0(x_*)$, we have $\eta_\delta(x_*) = \eta_0(x_*) < 0$. Further, $\mathcal{J}_{\delta_k}(x_k) \subset \mathcal{J}_0(x_*)$ with sufficiently large $k \in K$. Indeed, suppose that $i \notin \mathcal{J}_0(x_*)$; then $f_i(x_*) < 0$. Therefore, because $\delta_k \to 0$, with sufficiently large $k$ we have $f_i(x_*) < -\delta_k$, and since $x_k \to x_*$, with large $k$ we also have $f_i(x_k) < -\delta_k$, i.e. $i \notin \mathcal{J}_{\delta_k}(x_k)$. Thus if $i \notin \mathcal{J}_0(x_*)$, then with large $k$ we have $i \notin \mathcal{J}_{\delta_k}(x_k)$, i.e. $\mathcal{J}_{\delta_k}(x_k) \subset \mathcal{J}_0(x_*)$. Since by assumption $x_*$ is not the minimum point of $f_0(x)$ in $D$, there is a vector $p(x_*)$ such that $A p(x_*) = 0$, $\|p(x_*)\| \le 1$,

$$(f_i'(x_*), p(x_*)) \le \xi_i \eta_0(x_*), \quad i \in \mathcal{J}_0(x_*) \cup \{0\},$$

and, as mentioned above, $\eta_0(x_*) < 0$. But then by continuity with large $k$ the following relations hold:

$$A p(x_*) = 0, \qquad \|p(x_*)\| \le 1,$$

for $x_k \to x_*$, $\mathcal{J}_{\delta_k}(x_k) \subset \mathcal{J}_0(x_*)$. However, the last relations mean that
Construction of the Initial Approximation

The application of the method of feasible directions requires the knowledge of an initial approximation in region $D$. To obtain this initial approximation we can use the same method of feasible directions by applying it to the problem of minimization of the number $\eta$ with constraints

$$f_i(x) - \eta \le 0,\ i = 1, \ldots, m, \qquad A x - b = 0. \tag{2.22}$$

As there is a point $\bar{x}$ such that

$$f_i(\bar{x}) < 0,\ i = 1, \ldots, m, \qquad A \bar{x} - b = 0,$$

the minimum value of $\eta$ with the constraints described is strictly less than zero, and therefore after a finite number of steps we obtain a point $x$ and $\eta$ such that $\eta < 0$ and the inequalities (2.22) are satisfied. This means that the obtained point $x$ satisfies the constraints of the original problem and can be taken as the initial point for applying the method of feasible directions.
3. METHOD OF CONDITIONAL GRADIENT AND NEWTON'S METHOD

The method of conditional gradient can be used for solving the problem of the minimization of a nonlinear function in a region in which the problem of the minimization of a linear function can be solved without great difficulty.

Suppose that $f(x)$, $x \in E^n$, is a continuously differentiable function in a compact convex region $\Omega$, and the gradient $f'(x)$ of function $f(x)$ in $\Omega$ satisfies Lipschitz' condition, i.e.

$$\| f'(x_1) - f'(x_2) \| \le L \| x_1 - x_2 \| \tag{3.1}$$

for all the points of region $\Omega$.
Rule for Choosing the Step Length

Let $x$ be an arbitrary point in $\Omega$. We denote by $z(x)$ a minimum point of function $(f'(x), z)$ in $\Omega$, so that

$$(f'(x), z(x)) \le (f'(x), z), \quad z \in \Omega. \tag{3.2}$$

We take $p(x) = z(x) - x$,

$$\eta(x) = \min_{z \in \Omega} (f'(x), z - x) = (f'(x), p(x)).$$

By (3.2), $\eta(x) \le 0$. We are interested in an estimate for the increment of the function value in moving from point $x$ in direction $p(x)$. Using Taylor's formula and (3.1) we obtain:
/ ( j + a p (j ) ) < / ( x ) + a $ i x) . (3 5)
Description of the Algorithm

The algorithm begins with an arbitrary point $x_0$ of region $\Omega$. We describe now the general step.

Let point $x_k$ be already constructed, $k \ge 0$. Having solved the problem of minimization of $(f'(x_k), z)$ in $\Omega$, calculate $z(x_k)$, $p(x_k)$, $\eta(x_k)$. Construct point $x_{k+1} = x_k + \alpha_k p(x_k)$, where $\alpha_k$ is taken to be $2^{-i_0}$ and $i_0$ is the first of the indices $i = 0, 1, \ldots$ to satisfy the inequality

The condition of the halt: the process stops if $\eta(x_k) = 0$.
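
The general step can be sketched in Python for the special case where the region is a box, since the linear subproblem is then solved coordinate by coordinate. Everything specific here is an assumption made for illustration: the box region, the function names, the tolerance `tol` replacing the exact halt test, and the acceptance test $f(x_k + \alpha p) \le f(x_k) + \tfrac{1}{2} \alpha \eta(x_k)$ used while halving $\alpha$.

```python
def conditional_gradient(f, grad, lower, upper, x0, tol=1e-6, max_iter=1000):
    """Conditional gradient method on the box lower <= x <= upper."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # Linear subproblem over the box: choose the vertex minimising (g, z).
        z = [lo if gi > 0 else hi for gi, lo, hi in zip(g, lower, upper)]
        p = [zi - xi for zi, xi in zip(z, x)]
        eta = sum(gi * pi for gi, pi in zip(g, p))   # eta(x) <= 0
        if eta >= -tol:                              # approximate halt test
            break
        alpha, fx = 1.0, f(x)
        # Halve alpha = 2^{-i} until the assumed decrease test is satisfied.
        while alpha > 1e-12 and \
                f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx + 0.5 * alpha * eta:
            alpha *= 0.5
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
    return x


def f(x):
    return (x[0] - 0.3) ** 2 + (x[1] + 1.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 0.3), 2.0 * (x[1] + 1.0)]


# Illustrative problem: the minimum over the unit square is at (0.3, 0).
x_min = conditional_gradient(f, grad, [0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

For other regions only the line computing `z` changes; it must return a minimizer of the linear function $(f'(x_k), z)$ over the region.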
Substantiation of Convergence of the Algorithm and Estimation of Its Rate of Convergence
/ (X k ) + — nY *) • (3-7)
In order to substantiate the convergence of the algorithm, it is necessary first of all to demonstrate that inequalities (3.6), (3.7) can always be satisfied. In fact, by (3.4) and (3.5), inequality (3.6) will be satisfied as soon as the inequality
2 " z< r — — —
^ 2 L || p ( x h ) ||*
is satisfied, and since $i_0$ is the first index satisfying (3.6), we have

It follows from the foregoing that if $\eta(x_k) < 0$, then inequality (3.6) will be satisfied after a finite number of trials and the $\alpha_k$ chosen will satisfy inequality (3.8).
Lemma 3.1. If $\{x_k\}$, $k = 0, 1, \ldots$, is a sequence of points obtained in implementing the algorithm of the method of conditional gradient, then $x_k \in \Omega$, $f(x_k)$ decreases monotonically and $\eta(x_k) \to 0$ as $k \to \infty$.
f(Xk+i) — f ( X h ) ^ — g 2 ^ r 1 )2 ( * * ) • (3.9)
Adding (3.9) for all $k = 0, 1, \ldots, m - 1$ we obtain
m- 1
i(xm )— f{xo X — 2 ’I2 ( * * ) •
k=Q
Since region $\Omega$ is compact and function $f(x)$ continuous, we have $f(x_m) \ge f_*$, where $f_*$ is the minimum value of $f(x)$ in $\Omega$. Therefore
m - 1
Hence it follows that the series

$$\sum_{k=0}^{\infty} \eta^2(x_k)$$

converges. This is possible only if $\eta(x_k) \to 0$. The lemma is proved.

It follows from the condition of the halt of the algorithm and lemma 3.1 that, in general, either the algorithm stops after a finite number of steps and the condition $\eta(x_k) = 0$ is fulfilled, or a monotonically decreasing sequence of values $f(x_k)$ of function $f(x)$ is obtained.
In the first case, the condition $\eta(x_k) = 0$, by (3.2), is equivalent to the following one:

$$(f'(x_k), x_k) = (f'(x_k), z(x_k)) \le (f'(x_k), z), \quad z \in \Omega.$$

The last expression is nothing else but the necessary condition for function $f(x)$ to assume its minimum value at point $x_k$ (see Chap. I, Sec. 3).

The second case is the subject of the following lemma.

Lemma 3.2. At any limit point of sequence $\{x_k\}$, $k = 0, 1, \ldots$, the necessary conditions for the minimum of $f(x)$ in set $\Omega$ are fulfilled.
Proof. Let $x_*$ be a limit point of $\{x_k\}$, i.e. there is a subsequence $\{x_{k_j}\}$, $j \to \infty$, such that $x_{k_j} \to x_*$. The following relations hold:

$$(f'(x_{k_j}), z(x_{k_j})) \le (f'(x_{k_j}), z), \quad z \in \Omega.$$

Without loss of generality, we can assume that $z(x_{k_j}) \to z_*$. Since $\eta(x_k) \to 0$ and $f'(x)$ depends on $x$ continuously, it follows from the above relations that

$$(f'(x_*), z_* - x_*) = 0,$$

$$(f'(x_*), z_*) \le (f'(x_*), z), \quad z \in \Omega.$$

Hence

$$(f'(x_*), x_*) \le (f'(x_*), z), \quad z \in \Omega,$$

and this proves the lemma.

Theorem 3.1. Let function $f(x)$ be convex. Then

$$\lim_{k \to \infty} f(x_k) = f_*,$$

where $f_* = \min_{x \in \Omega} f(x)$. Moreover, the estimate

($C$ is a positive constant) holds.

Proof. As $f(x)$ is a convex function, the following inequality holds:

$$f_* - f(x) = f(x_*) - f(x) \ge (f'(x), x_* - x) \ge \min_{z \in \Omega} (f'(x), z - x) = \eta(x).$$
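
The inequality $f_* - f(x) \ge \eta(x)$ means that $-\eta(x)$ bounds the distance of $f(x)$ from the minimum value, so it can serve as a computable accuracy estimate. The Python sketch below checks this on a small example; the box region, the test function and the name `gap` are assumptions made for illustration only.

```python
def gap(grad, x, lower, upper):
    """Return -eta(x) for the box lower <= z <= upper."""
    g = grad(x)
    # Minimiser of the linear function (g, z) over the box.
    z = [lo if gi > 0 else hi for gi, lo, hi in zip(g, lower, upper)]
    return -sum(gi * (zi - xi) for gi, zi, xi in zip(g, z, x))


def f(x):
    return (x[0] - 0.3) ** 2 + (x[1] + 1.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 0.3), 2.0 * (x[1] + 1.0)]


# On the unit square the minimum value of f is 1.0, attained at (0.3, 0).
x = [0.5, 0.5]
bound = gap(grad, x, [0.0, 0.0], [1.0, 1.0])
excess = f(x) - 1.0
```

Here `bound` is an upper estimate of `excess` that requires no knowledge of the minimum point, only the solution of one linear subproblem.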
fP f c + X 1 8LC~ )
or
1
<P/<+X<Pfc (1 — ^ ~ ~ 8 L C * '
T a k i n g <fi, w e obtain
Y* * ' k >
or
Yft+j 1 x(*+*)Yfc
yh ^ 1 + * **
With e a c h k there are t w o possible cases:
X /c+l ^ X *
Further, f r o m (3.11) w e obtain that
1 2
with k ^ 1. N o w o n l y t w o s i t u a t i o n s a r e p o s s i b l e .
(1 ) T h e r e is o n l y a f i n i t e n u m b e r o f i n d i c e s k for w h i c h y h .
T h e n d u e t o t h e a b o v e s t a t e m e n t s f o r a l l g r e a t k s e q u e n c e { y ft} d o e s
n o t i n c r e a s e m o n o t o n i c a l l y , i.e. r e m a i n s b o u n d e d .
M
(2) T h e r e is a n i n f i n i t e n u m b e r o f i n d i c e s k f o r w h i c h y h < z — .
X
I
W e s h a l l d e n o t e t h e s e t o f s u c h i n d i c e s k b y f- s o t h a t y ft ^ — f o r
&£'!'• Let n o w j £ T h e n there will be t w o indices k u k 2 £ f s u c h
t h a t k\ < 4 j < z k 2 a n d k £ f for all k x k < C k 2. T h e n
Estimate of Convergence for a Strongly Convex Region

Let region $\Omega$ be strongly convex, i.e. there is a number $\delta > 0$ such that for any $x, y \in \Omega$ the points $\dfrac{x + y}{2} + w$ belong to $\Omega$ for all $w$ such that $\|w\| \le \delta \|x - y\|^2$. Then
Hence
~ h ( 3 -1 2 )
Theorem 3.2. If $f(x)$ is a convex function and region $\Omega$ is strongly convex, and if $\|f'(x)\| \ge \varepsilon_0 > 0$ for all $x \in \Omega$, then the method of conditional gradient converges at the rate of a geometric progression, i.e. $\|x_k - x_*\| \le C q_0^k$, $q_0 < 1$.

Proof. From (3.7) and (3.8) we obtain
, / v , / \ ^ 1 il2 (x u)
9/t ~ W k + l — / (^'/{) / (Z/i+l) ^ g /y jj p ^ X f j ) ||* *
<pi.+i^('1 — 1 7 7 - ) <th-
Therefore
Because of the necessary and sufficient conditions for a minimum, we have

$$(f'(x_*), x - x_*) \ge 0.$$

Therefore for all $w$ with $\|w\| \le \delta \|x - x_*\|^2$,
Hence
ii
W i t h the notation
1/2
r — I yp \ „ /\ ^ eo \
C \2te0 ) ’ 9 o _ I1 4L )
w e obtain
II II ^ To-
Q.E.D.
Newton's Method with Step Adjustment

We consider now the problem of the minimization of a convex smooth function $f(x)$ in a (convex, compact) set $\Omega$.

For solving this problem the iterative process

$$x_{k+1} = x_k + \alpha_k p_k, \quad \alpha_k > 0 \tag{3.14}$$

can be used, in which the direction of motion is $p_k = \bar{x}_k - x_k$, where $\bar{x}_k$ is the solution of the problem of minimization in set $\Omega$ of the quadratic function

$$\psi_k(x) = (f'(x_k), x - x_k) + \tfrac{1}{2} (f''(x_k)(x - x_k), x - x_k),$$
Parameter $\alpha_k$ can be chosen by other methods analogous to those described in Sec. 2, Chap. II (methods (2.2), (2.3)).

We shall ascertain below that the rate of convergence of Newton's method under definite conditions is either superlinear or quadratic. Consequently, if the problem of the minimization of function $\psi_k(x)$, $x \in \Omega$, can be solved easily enough, then Newton's method proves very effective.
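
A one-dimensional sketch shows the structure of process (3.14) in the simplest setting, where the set is an interval and the quadratic subproblem is solved by clamping the unconstrained Newton point. The interval, the function names, the parameter `eps`, the tolerance `tol` and the acceptance test $f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha \psi_k(\bar{x}_k)$ with halving of $\alpha$ are all assumptions made for illustration, not the rule of the text.

```python
import math


def newton_interval(f, df, d2f, lo, hi, x0, eps=0.25, tol=1e-12, max_iter=50):
    """Newton's method with step adjustment for a convex f on [lo, hi]."""
    x = x0
    for _ in range(max_iter):
        g, h = df(x), d2f(x)
        # Minimise psi(y) = g*(y - x) + 0.5*h*(y - x)^2 over [lo, hi].
        if h > 0:
            x_bar = min(max(x - g / h, lo), hi)
        else:
            x_bar = lo if g > 0 else hi
        p = x_bar - x
        psi = g * p + 0.5 * h * p * p        # psi(x_bar) <= 0
        if psi >= -tol:                      # no further decrease predicted
            break
        alpha = 1.0
        # Halve alpha until the assumed sufficient-decrease test holds.
        while alpha > 1e-12 and f(x + alpha * p) - f(x) > eps * alpha * psi:
            alpha *= 0.5
        x += alpha * p
    return x


def f(x):
    return math.exp(x) - 2.0 * x


def df(x):
    return math.exp(x) - 2.0


def d2f(x):
    return math.exp(x)


interior = newton_interval(f, df, d2f, 0.0, 2.0, 2.0)   # minimum at ln 2
boundary = newton_interval(f, df, d2f, 1.0, 2.0, 2.0)   # minimum at x = 1
```

In several dimensions the clamping step is replaced by a quadratic programming problem over the set, which is where the methods of Sec. 1 of this chapter come in.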
Properties of Newton's Method

Theorem 3.3. If for the minimization of a convex twice continuously differentiable function $f(x)$ in a convex closed bounded set $\Omega$ we use method (3.14) in which $\alpha_k$ and $p_k$ are determined as described above, then (whatever the initial approximation $x_0 \in \Omega$ chosen):

(1) $f(x_k)$ decreases monotonically,

(2) $\lim_{k \to \infty} f(x_k) = f(x_*) = \min_{x \in \Omega} f(x)$.

Proof. There is a minimum point $\bar{x}_k$ (possibly not unique) of the continuous function $\psi_k(x)$ in the compact set $\Omega$ (Weierstrass' theorem). With any $k$, point $x_{k+1} \in \Omega$, since $x_{k+1} = x_k + \alpha_k (\bar{x}_k - x_k) = \alpha_k \bar{x}_k + (1 - \alpha_k) x_k$ and $\alpha_k \in [0, 1]$. Since function $\psi_k(x)$ is convex, we have $\psi_k(x_{k+1}) = \psi_k(\alpha_k \bar{x}_k + (1 - \alpha_k) x_k) \le \alpha_k \psi_k(\bar{x}_k) + (1 - \alpha_k) \psi_k(x_k)$. But $\psi_k(x_k) = 0$, therefore

$$\psi_k(x_{k+1}) \le \alpha_k \psi_k(\bar{x}_k). \tag{3.16}$$

Now making use of Taylor's formula and of (3.16), we have:
(3.18)
2 V k i*k) ^
holds. But at the same time the inequality (3.15) holds as well, and this proves that the described method of choosing $\alpha_k$ may be applied.

It follows from (3.15) that $f(x_{k+1}) \le f(x_k)$. We show now that $\psi_k(\bar{x}_k) \to 0$ as $k \to \infty$. In a closed bounded set $\Omega$ the continuous function $f''(x)$ has an upper bound: $\|f''(x)\| \le M$. Consequently $\|F_k\| \le 2M$, and vector $p_k$ has an upper bound too:
we have $\psi_k(\bar{x}_k) \le -\beta < 0$. Then
1 , ah II F k || \ \ j k | P > 1 a M £ _
2 (*fe) ^ k P
and, hence, inequality (3.18) (and therefore (3.15) too) is always satisfied even with a constant step $\alpha_k = C > 0$. But it follows from (3.15) that at the same time $f(x_{k+1}) - f(x_k) \le -\varepsilon C \beta$ with any $k$, and this contradicts the fact that in the compact set $\Omega$ function $f(x)$ has a lower bound.
Thus the condition $\psi_k(\bar{x}_k) \le -\beta$ with any $k$ cannot be fulfilled, i.e. in any case as $k \to \infty$ the condition $\psi_k(\bar{x}_k) \to 0$ must be fulfilled. This means that at any limit point of sequence (3.14) the necessary (and sufficient, because of the convexity of $f(x)$) condition for a minimum of function $f(x)$ in set $\Omega$ (see Chap. I, Sec. 4) is fulfilled. Taking this into account, the last assertion of the theorem can be proved as in theorem 3.1.
The theorem just proved shows that in the problem under consideration, as distinct from unconstrained minimization problems where Newton's method can be applied only to minimize strongly convex functions, this method can also be applied to minimize convex functions, since set $\Omega$ is bounded. However, the application of Newton's method to the minimization of strongly convex functions is of the greatest interest, for it is just in this case that the method converges to the solution at a fast rate.
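The iteration discussed in theorem 3.3 can be illustrated by a minimal one-dimensional sketch. It rests on assumptions that are not spelled out in this part of the text: the set is an interval `[lo, hi]`, the quadratic model is $\psi_k(y) = f'(x_k)(y - x_k) + \frac{1}{2} f''(x_k)(y - x_k)^2$, and the step test (3.15) is taken in the form $f(x_k + \alpha p_k) - f(x_k) \le \varepsilon \alpha \psi_k(\bar{x}_k)$ with $\alpha$ halved until it holds. The function name `newton_on_interval` and the test function $e^x - 2x$ are hypothetical choices made only for illustration.

```python
import math

def newton_on_interval(f, df, d2f, lo, hi, x0, eps=0.5, tol=1e-12, max_iter=50):
    """1-D sketch: Newton's method on [lo, hi] with step halving (assumed test)."""
    x = x0
    history = [f(x)]
    for _ in range(max_iter):
        g, h = df(x), d2f(x)
        # Minimizer x_bar of the model psi(y) = g*(y-x) + 0.5*h*(y-x)**2 on [lo, hi].
        if h > 0:
            x_bar = min(max(x - g / h, lo), hi)   # clamp the Newton point
        elif g > 0:
            x_bar = lo                            # linear model: best endpoint
        elif g < 0:
            x_bar = hi
        else:
            x_bar = x
        p = x_bar - x
        if abs(p) < tol:
            break
        psi = g * p + 0.5 * h * p * p             # psi(x_bar) <= psi(x) = 0
        alpha = 1.0
        # Assumed acceptance test: f(x + alpha*p) - f(x) <= eps*alpha*psi(x_bar).
        while alpha > 1e-12 and f(x + alpha * p) - f(x) > eps * alpha * psi:
            alpha *= 0.5
        if alpha <= 1e-12:
            break                                 # no acceptable step at working precision
        x = x + alpha * p
        history.append(f(x))
    return x, history

# Hypothetical test problem: minimize exp(x) - 2x on [0, 3]; the minimizer is ln 2.
f = lambda x: math.exp(x) - 2.0 * x
df = lambda x: math.exp(x) - 2.0
d2f = lambda x: math.exp(x)
x_min, values = newton_on_interval(f, df, d2f, 0.0, 3.0, 3.0)
```

In this sketch every accepted step satisfies the acceptance test, so the recorded values of $f$ cannot increase, which mirrors assertion (1) of theorem 3.3.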
Theorem 3.4. If in addition to the conditions of theorem 3.3, function $f(x)$ is strongly convex, i.e.
$m\|y\|^2 \le (f''(x)y, y) \le M\|y\|^2, \quad m > 0, \quad x \in \Omega, \quad y \in E^n,$ (3.19)
then sequence (3.14) converges to the solution at a superlinear rate (i.e. the estimate (2.5), Chap. II, holds).
Proof. The existence and uniqueness of the solution of the problem under consideration follow from the general results of Sec. 3, Chap. I. At point $\bar{x}_k$ the necessary condition for a minimum of function $\psi_k(x)$ in set $\Omega$ (Sec. 4 of Chap. I)
$(\psi_k'(\bar{x}_k), x_k - \bar{x}_k) \ge 0$
is fulfilled, i.e.
$(f'(x_k), x_k - \bar{x}_k) + (f''(x_k)(\bar{x}_k - x_k), x_k - \bar{x}_k) \ge 0.$
Hence
$(f'(x_k), p_k) \le -(f''(x_k)p_k, p_k).$ (3.20)
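Therefore, by (3.19) and (3.20),
$\psi_k(\bar{x}_k) = (f'(x_k), p_k) + \frac{1}{2}(f''(x_k)p_k, p_k) \le -\frac{1}{2}(f''(x_k)p_k, p_k) \le -\frac{m}{2}\|p_k\|^2.$ (3.21)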
(3.22)
Since $\psi_k(\bar{x}_k) \to 0$ (theorem 3.3), it follows from (3.21) that $\|p_k\| \to 0$ as $k \to \infty$. Hence, because of the uniform continuity of the second derivative $f''(x)$ on set $\Omega$, we have that $\|F_k\| \to 0$. But (3.22) implies that from a certain $k = N_1(\varepsilon)$ on, inequality (3.15) will be satisfied with $\alpha_k = 1$, i.e. method (3.14) is transformed into the usual Newton's method with a unit step. With $k > N_1(\varepsilon)$, taking into account the convexity of $\psi_k(x)$, we obtain
$\psi_k(\bar{x}_k) \ge (f'(x_k), x_{k+1} - x_k) = (f'(x_{k-1}) + f''(x_{k-1})(x_k - x_{k-1}), x_{k+1} - x_k) + (\Phi_k(x_k - x_{k-1}), x_{k+1} - x_k)$ (3.23)
where
$\Phi_k = f''(x_{k-1} + \theta(x_k - x_{k-1})) - f''(x_{k-1}), \quad \theta \in [0, 1].$
Note that $(f'(x_{k-1}) + f''(x_{k-1})(x_k - x_{k-1}), x_{k+1} - x_k) = (\psi_{k-1}'(x_k), x_{k+1} - x_k)$. Since $\psi_{k-1}(x_k) = \min_{x \in \Omega} \psi_{k-1}(x)$, we shall have with any $x \in \Omega$ that $(\psi_{k-1}'(x_k), x - x_k) \ge 0$ (the necessary condition for a minimum). Consequently, $(\psi_{k-1}'(x_k), x_{k+1} - x_k) \ge 0$ holds and therefore it follows from (3.23) that
$-\psi_k(\bar{x}_k) \le \|\Phi_k\| \|x_k - x_{k-1}\| \|x_k - x_{k+1}\| = \|\Phi_k\| \|p_{k-1}\| \|p_k\|.$ (3.24)
Comparing estimates (3.21) and (3.24), we obtain
$\|x_{k+1} - x_k\| \le \lambda_k \|x_k - x_{k-1}\|, \quad \lambda_k = \frac{2\|\Phi_k\|}{m}.$ (3.25)
Because of the uniform continuity of $f''(x)$ on set $\Omega$, we have $\|\Phi_k\| \to 0$. Consequently, there is a number $N(\varepsilon)$ such that with $k \ge N(\varepsilon)$ we find $\lambda_k = 2\|\Phi_k\|/m < 1$. Let us take $\|x_N - x_{N-1}\| = C_1$,
i— i
1 — = *v 0* T h e n ||a?j ll^fe+i ••••
h=N+l
P f c ^ p l - i ^ •••^ P l
Consequently, for a n y i > L - \ - l , 1 = = 0 , 1, ...,
II*«-*I.+|||< S II * » + ! - * * I K - ^ - s = SPi*.
h=L+l k=L+l s=l
II % L + l — X * II ^ 2 P*- •
s=l
This estimate can be given the form
$\|x_{L+1} - x_*\| \le C\beta_L, \quad C < \infty$
(taking into account that the series $\sum_{s=1}^{\infty} \beta_L^s$ converges). The estimate obtained means that the following theorem is valid.
Theorem 3.5. If the conditions of theorem 3.4 are fulfilled and, besides, matrix $f''(x)$ on set $\Omega$ satisfies Lipschitz' condition with constant $R$, then sequence (3.14) (in which $\alpha_k$ and $p_k$ are chosen by the method described above) converges to the solution at a quadratic rate.
We shall study now the properties of Newton's method with $\alpha_k$ chosen so that $f(x)$ attains its minimum value in the direction of motion.
The arguments we used to estimate the rate of convergence of Newton's method in unconstrained problems (Sec. 2, Chap. II) are not suited to this case (since the right-hand estimate (1.11) of Chap. II does not hold).
Lemma 3.3. If function $f(x)$ for which conditions (3.19) are fulfilled is being minimized and $\alpha_k$ in method (3.14) is chosen under the condition
$f(x_k + \alpha_k p_k) = \min_{0 \le \alpha \le 1} f(x_k + \alpha p_k),$ (3.28)
then $x_k \to x_*$ and $\alpha_k \to 1$ as $k \to \infty$.
Proof. By Taylor's formula
Thus $\alpha_k \ge C > 0$; therefore in the same way as in theorem 3.3 it can be shown that $\psi_k(\bar{x}_k) \to 0$, i.e. sequence (3.14) in which $\alpha_k$ is chosen under condition (3.28) converges to the solution. At the same time $\|p_k\| \to 0$ and $\|F_k\| \to 0$ (theorem 3.4).
We shall demonstrate that $\alpha_k \to 1$:
a*
/ (Zft+i) = W S x k + i ) + - y - ( F h p ,>. P k )
= '|)fc(^i.) + (’l’i ( a : ) . ) . * a + i — * f c )
rI _ _ af
+ y ( * K ( x k ) ( X h + i — x h ), x s + 1 — * * ) + — ( F k P k , P h ) -
/ (*a+i) (x k ) + W > i ( x k ), x k + , — x k)
2
+ W k ( x k) p k , P k) + ^ j - ( F kp k , Pa).
Expressing the last term on the right-hand side by means of Lagrange's formula for operators, we obtain after some transformation
( % - i (x k-l)i x h -1—
= (/' ( * * - i ) + / " ( X k - 1 ) (x h - 1 — x h ~ i), x h- 1“ *a+i) < 0.
tyk (x k+i) ^ X h +i — X h - l ) •
Hence, with the notation $\|[(\alpha_{k-1} - 1)/\alpha_{k-1}] f''(x_{k-1}) + \Phi_k\| = b_k$, we obtain
$-\psi_k(x_{k+1}) \le b_k \|x_k - x_{k-1}\| \|x_{k+1} - \bar{x}_{k-1}\| \le b_k \|x_k - x_{k-1}\| (\|x_{k+1} - x_k\| + \|\bar{x}_{k-1} - x_k\|).$
Taking now into account that
$\bar{x}_{k-1} - x_k = \frac{1 - \alpha_{k-1}}{\alpha_{k-1}} (x_k - x_{k-1})$
and denoting $[(1 - \alpha_{k-1})/\alpha_{k-1}] b_k = c_k$, we obtain
$-\psi_k(x_{k+1}) \le b_k \|x_k - x_{k-1}\| \|x_{k+1} - x_k\| + c_k \|x_k - x_{k-1}\|^2.$
Since $\alpha_k \to 1$, $\|p_k\| \to 0$ (lemma 3.3), we have $b_k \to 0$, $c_k \to 0$. Comparing the estimate obtained with (3.29) we establish that
$\|x_{k+1} - x_k\|^2 \le \lambda_k \|x_k - x_{k-1}\| \|x_{k+1} - x_k\| + \beta_k \|x_k - x_{k-1}\|^2$
where
$\lambda_k = \frac{2b_k}{m}, \quad \beta_k = \frac{2c_k}{m}.$
Finally, having solved the quadratic inequality obtained for $\|x_{k+1} - x_k\|$, we find that
$\|x_{k+1} - x_k\| \le \mu_k \|x_k - x_{k-1}\|$
where
$\mu_k = \frac{1}{2}\left(\lambda_k + \sqrt{\lambda_k^2 + 4\beta_k}\right) \to 0$ as $k \to \infty$.
The remaining part of the proof is performed just as in theorem 3.4.
4. CUTTING HYPERPLANE METHOD
The cutting hyperplane method is meant for solving problems of convex programming. The basic idea of the method is that the admissible domain is approximated by a certain polyhedron which
Algorithm
Let
$\Omega = \{x : f(x) \le 0\}$
be a nonempty admissible region. Suppose also that $\Omega$ is compact and vectors $a_k$, $k = -l, -(l - 1), \ldots, -1, 0$, and numbers $b_k$ are known and that region
$S = \{x : (a_k, x) - b_k \le 0, \; k = -l, \ldots, 0\}$
is compact and contains $\Omega$.
For $k \ge 0$ the successive approximations are determined by the following rule. We set $S_0 = S$. If $S_k$ has been constructed, then $x_k$ is any solution of the problem of linear programming: to minimize $f_0(x) = (c, x)$ with $x \in S_k$. The next region $S_{k+1}$ is constructed by the following rule:
$S_{k+1} = \{x : (a_{k+1}, x) - b_{k+1} \le 0\} \cap S_k$ (4.3)
where $a_{k+1}$ is a support vector to $f(x)$ at point $x_k$ and
$b_{k+1} = (a_{k+1}, x_k) - f(x_k).$ (4.4)
It follows from (4.3) that $S_{k+1} \subset S_k$ and for $k \ge 1$
$S_k = \{x : (a_j, x) - b_j \le 0, \; j = -l, \ldots, -1, 0, \ldots, k\}.$ (4.5)
Lemma 4.1. For all $k \ge 1$, $\Omega \subset S_k$.
Proof. Let $x \in \Omega$, i.e. $f(x) \le 0$. Then
$f(x) \ge f(x_{j-1}) + (a_j, x - x_{j-1}) = (a_j, x) - b_j$
and, consequently, $(a_j, x) - b_j \le 0$, $j = 1, \ldots, k$. With $j \le 0$ the last inequalities hold by virtue of the choice of $a_j$ and $b_j$ for $j \le 0$. The lemma is proved.
It follows directly from lemma 4.1 that
$f_0(x_0) \le f_0(x_1) \le \ldots \le f_0(x_k) \le f_0(x_{k+1}) \le \ldots.$
On the other hand, if $x_*$ is the minimum point of $f_0(x)$ in $\Omega$, then
$f_0(x_k) \le f_0(x_*)$ since $S_k \supset \Omega$.
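The construction of the regions $S_k$ can be traced on a minimal one-dimensional sketch. It is only an illustration under assumptions that are not in the text: the initial region $S$ is an interval `[lo, hi]`, the function $f$ is differentiable so that its derivative serves as the support vector, and the linear programming step is solved by inspection, since a linear objective on an interval attains its minimum at an endpoint. The function name `cutting_plane_1d` and the test problem are hypothetical.

```python
def cutting_plane_1d(c, f, df, lo, hi, tol=1e-10, max_iter=100):
    """Minimize c*x subject to f(x) <= 0 (f convex, differentiable) on [lo, hi]."""
    xs = []
    for _ in range(max_iter):
        # LP step on S_k = [lo, hi]: a linear objective is minimized at an endpoint.
        x = lo if c > 0 else hi
        xs.append(x)
        fx = f(x)
        if fx <= tol:
            break                      # x_k is admissible, hence optimal
        a = df(x)                      # support vector to f at x_k (here: derivative)
        b = a * x - fx                 # b_{k+1} = (a_{k+1}, x_k) - f(x_k)
        # Add the cut a*y - b <= 0 to the interval.
        if a > 0:
            hi = min(hi, b / a)
        elif a < 0:
            lo = max(lo, b / a)
        else:
            raise ValueError("f(x) > 0 with zero slope: the admissible region is empty")
    return xs

# Hypothetical test problem: maximize x (c = -1) subject to x**2 - 1 <= 0 on [-2, 2].
c = -1.0
xs = cutting_plane_1d(c, lambda x: x * x - 1.0, lambda x: 2.0 * x, -2.0, 2.0)
```

For this test problem the first points are 2, 1.25 and 1.025; the values $f_0(x_k) = c\,x_k$ do not decrease and stay below $f_0(x_*) = -1$, in agreement with the two inequalities stated above.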
Theorem 4.1. Let $f(x)$ be a continuous convex function, region $\Omega$ be compact and there be a number $K$ such that for each $x \in S$ any vector $a$ which is a support vector to $f(x)$ at point $x$ satisfies the inequality $\|a\| \le K$. Then any limit point of sequence $\{x_k\}$, $k = 0, 1, \ldots$, is a solution of problem (4.1) and $f(x_k) \to 0$.
Proof. Since $S_0 = S$, $S_k \supset S_{k+1}$, the whole sequence $\{x_k\}$ belongs to the compact set $S$. Therefore there are always limit points of this sequence.
Note now that if $f(x_k) \le 0$ for a certain $k$, then $x_k \in \Omega$ and, consequently, $f_0(x_k) \ge f_0(x_*)$. However, as was shown, $f_0(x_k) \le f_0(x_*)$. Thus, $f_0(x_k) = f_0(x_*)$, i.e. $x_k$ is the solution of the original problem.
Let now sequence $\{x_k\}$ be infinite and $f(x_k) > 0$ for all $k$. We shall show that $f(x_k) \to 0$. Suppose that the contrary is true. Then there is a number $r > 0$ and a subsequence of indices $k$ (we denote it by $J$) such that $f(x_k) \ge r$, $k \in J$. Without loss of generality, we can take that $x_k \to \bar{x}$, $k \in J$, since $\{x_k\}$ belongs to a compact set.
Let now $k$ and $j$ belong to $J$ and $k > j$. Then by construction, point $x_k$ satisfies the inequality
$(a_{j+1}, x_k) - b_{j+1} = (a_{j+1}, x_k - x_j) + f(x_j) \le 0,$
hence
$f(x_j) \le (a_{j+1}, x_j - x_k) \le K\|x_j - x_k\|.$
But $\{x_k\}$, $k \in J$, converges to $\bar{x}$ and therefore $\|x_j - x_k\| \le r/(2K)$ for all sufficiently great $k$ and $j$, and so $f(x_j) \le r/2$ for great $j$, and this contradicts the fact that $f(x_j) \ge r$, $j \in J$.
Thus we have shown that $f(x_k) \to 0$. Let now $\bar{x}$ be any limit point, i.e. $x_k \to \bar{x}$, $k \in J$, where $J$ is a subsequence of indices. Then because of the continuity of $f(x)$,
$f(\bar{x}) = \lim_{k \in J} f(x_k) = 0,$
i.e. $\bar{x} \in \Omega$. On the other hand, $f_0(x_k) \le f_0(x_*)$ and therefore $f_0(\bar{x}) \le f_0(x_*)$; it follows directly that $f_0(\bar{x}) = f_0(x_*)$ and $\bar{x}$ is also a solution of problem (4.1). The theorem is proved.
Computational Aspects
The algorithm of the cutting hyperplane method requires at each step the solution of the problem of linear programming: to minimize $f_0(x) = (c, x)$ with constraints
$(a_i, x) - b_i \le 0, \quad i = -l, \ldots, k.$ (4.6)
Thus the size of the problem being solved increases at every step. The computer memory required for storing vectors $a_i$ also increases.
In order to simplify the solving of problem (4.6), it is expedient to solve instead the dual problem which in this case takes the following form: to maximize $-\sum_{i=-l}^{k} u^i b_i$ with constraints
$\sum_{i=-l}^{k} u^i a_i + c = 0, \quad u^i \ge 0, \quad i = -l, \ldots, k.$
Concluding Remarks
In describing the cutting hyperplane method we followed J. E. Kelley's paper. At present there are many modifications of this method. They can be found in a paper by E. S. Levitin and B. T. Polyak. However, none of these modifications seems to improve the main property which is of interest to us, viz. the rate of convergence. This rate has not been precisely estimated for the method described; however, the results obtained in the paper mentioned suggest that it is not even that of a geometric progression.
5. LINEARIZATION METHOD
In this section we shall describe a method of solving the general problem of mathematical programming without making any assumption concerning the convexity of the functions to be dealt with. An important property of this method is the possibility of taking into account nonlinear equality constraints, this being a stumbling-block for most other methods.
It is required to minimize function $f_0(x)$, $x \in E^n$, with constraints
$f_i(x) \le 0, \; i \in J^-, \quad f_i(x) = 0, \; i \in J^0$ (5.1)
where $J^-$ and $J^0$ are finite sets of indices. We assume that all the functions $f_i(x)$ are continuously differentiable. (The conditions under which the problem is considered will be specified more fully below.)
At point $x_0$ we substitute linear constraints for all (5.1) and a linear function for $f_0(x)$ by linearizing $f_i(x)$ at point $x_0$. As a result we obtain a problem of linear programming. It would be natural to take the solution of the linearized problem as the next approximation as we do in Newton's method for solving systems of nonlinear equations. Unfortunately, this way does not lead directly to the aim since as a rule the subsidiary problem of linear programming has no solution. Therefore, it is necessary to impose certain constraints on the increment of vector $x$ at $x_0$ in order that the solution of the linearized problem should not shift too far from $x_0$ and should remain in a neighbourhood of $x_0$ in which the linearization is still valid. This will be done below by adding a quadratic term to the linearized objective function.
Note that each of the equalities $f_i(x) = 0$ is equivalent to the following two inequalities
$f_i(x) \le 0, \quad -f_i(x) \le 0.$
Therefore we can limit ourselves to considering only the case with inequality constraints. Such a restriction is convenient at least in the theoretical substantiation of the algorithm though the doubling of the number of inequalities can be inconvenient in calculations.
We shall give below the theoretical substantiation of the algorithm for the problem of the minimization of $f_0(x)$ with constraints
$f_i(x) \le 0, \quad i \in J.$ (5.2)
Basic Assumptions
We set
$F(x) = \max_{i \in J} f_i(x),$
$J_\delta(x) = \{i \in J : f_i(x) \ge F(x) - \delta\}, \quad \delta > 0.$ (5.3)
By assumption, $F(x) \ge 0$ for all $x$. Suppose that there are constants $N > 0$, $\delta > 0$ such that:
(a) the set
$\Omega_N = \{x : f_0(x) + N F(x) \le C_0\}, \quad C_0 = f_0(x_0) + N F(x_0)$
is bounded;
(b) the gradients of functions $f_i(x)$, $i \in \{0\} \cup J$, in $\Omega_N$ satisfy Lipschitz' condition, i.e.
$\|f_i'(x_1) - f_i'(x_2)\| \le L\|x_1 - x_2\|;$
(c) the problem of quadratic programming
$\min_p \; (f_0'(x), p) + \frac{1}{2}\|p\|^2,$
$(f_i'(x), p) + f_i(x) \le 0, \quad i \in J_\delta(x)$ (5.4)
is solvable for $p \in E^n$ with any $x \in \Omega_N$ and there are Lagrange multipliers $u^i(x)$, $i \in J_\delta(x)$, such that $\sum_{i \in J_\delta(x)} u^i(x) \le N$. In this section, $\|p\|$ will always denote the Euclidean norm of vector $p$.
In what follows we shall denote the solution of problem (5.4) by $p(x)$ and Lagrange multipliers by $u^i(x)$, $i \in J_\delta(x)$.
Formulation of the Algorithm
Let $x_0$ be the initial approximation and we take $\varepsilon$ such that $0 < \varepsilon < 1$. Let point $x_k$ be already constructed by the algorithm. The construction of the next approximation will be performed in two stages:
(1) We solve problem (5.4) with $x = x_k$ and find its solution, vector $p_k = p(x_k)$.
(2) We find the first of the values of $i = 0, 1, \ldots$ satisfying the inequality
$f_0(x_k + 2^{-i} p_k) + N F(x_k + 2^{-i} p_k) \le f_0(x_k) + N F(x_k) - 2^{-i} \varepsilon \|p_k\|^2.$
If this inequality is satisfied for the first time with $i = i_0$, then we take $\alpha_k = 2^{-i_0}$, $x_{k+1} = x_k + \alpha_k p_k$.
Thus the following inequality is satisfied at each step:
$f_0(x_{k+1}) + N F(x_{k+1}) \le f_0(x_k) + N F(x_k) - \alpha_k \varepsilon \|p_k\|^2.$ (5.5)
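The two stages can be put together in a minimal sketch for a problem with a single inequality constraint $f_1(x) \le 0$. Several simplifications are assumptions made only for this illustration: $F(x) = \max(0, f_1(x))$, the constraint is always kept in the subsidiary problem (as if $\delta$ were large), and the quadratic programming problem (5.4) with one constraint is solved in closed form, $p = -(f_0' + u f_1')$ with $u = \max(0, (f_1 - (f_1', f_0')) / \|f_1'\|^2)$, which requires $f_1'(x) \ne 0$. All function names and the test problem are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def solve_subproblem(g0, g1, f1_val):
    """min (g0,p) + 0.5*|p|^2  subject to (g1,p) + f1_val <= 0 (one constraint)."""
    u = max(0.0, (f1_val - dot(g1, g0)) / dot(g1, g1))   # Lagrange multiplier
    p = [-(a + u * b) for a, b in zip(g0, g1)]
    return p, u

def linearization_method(x, f0, grad_f0, f1, grad_f1, N=2.0, eps=0.25,
                         tol=1e-12, max_iter=200):
    def merit(y):
        # f0 + N*F with F(y) = max(0, f1(y)) (assumed form of F for one constraint)
        return f0(y) + N * max(0.0, f1(y))
    for _ in range(max_iter):
        # Stage (1): direction p_k from the subsidiary quadratic problem.
        p, _u = solve_subproblem(grad_f0(x), grad_f1(x), f1(x))
        pp = dot(p, p)
        if pp < tol:
            break
        # Stage (2): halve alpha = 1, 1/2, 1/4, ... until the decrease test holds.
        alpha = 1.0
        y = [xi + pi for xi, pi in zip(x, p)]
        while merit(y) > merit(x) - alpha * eps * pp and alpha > 1e-12:
            alpha *= 0.5
            y = [xi + alpha * pi for xi, pi in zip(x, p)]
        x = y
    return x

# Hypothetical test problem: minimize x1 + x2 subject to x1^2 + x2^2 - 2 <= 0.
f0 = lambda x: x[0] + x[1]
grad_f0 = lambda x: [1.0, 1.0]
f1 = lambda x: x[0] ** 2 + x[1] ** 2 - 2.0
grad_f1 = lambda x: [2.0 * x[0], 2.0 * x[1]]
x_star = linearization_method([2.0, 0.0], f0, grad_f0, f1, grad_f1)
```

For this test problem the solution is $(-1, -1)$ with multiplier $1/2$, so the value $N = 2$ used in the sketch exceeds the multiplier, as condition (c) requires.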
Convergence of the Algorithm
We show that the choice of the step $\alpha_k$ at each iteration is performed after a finite number of successive halvings of unity and substantiate the convergence of the algorithm.
From the results of Sec. 3, Chap. I it follows that $p(x)$ is the solution of problem (5.4) if and only if there are $u^i(x) \ge 0$, $i \in J_\delta(x)$, such that
$f_0'(x) + p(x) + \sum_{i \in J_\delta(x)} u^i(x) f_i'(x) = 0,$
$(f_0'(x), p(x)) = -\sum_{i \in J_\delta(x)} u^i(x) (f_i'(x), p(x)) - \|p(x)\|^2$
The comparison of the last expressions with (5.6) shows that vector $p = 0$ is the solution of problem (5.4), for with $p = 0$ all constraints (5.4) are satisfied (since (5.2) are satisfied) and the fulfilment of (5.6) with $p = 0$ is the necessary and sufficient condition for vector $p = 0$ to be the solution of (5.4).
Let now $p(x) = 0$. This means that the constraints of (5.4) are satisfied with $p = 0$, i.e. $f_i(x) \le 0$, $i \in J_\delta(x)$. Since for $i \notin J_\delta(x)$ we have
$f_i(x) < F(x) - \delta \le f_j(x) \le 0$
$+ \; \alpha^2 (N + 1) L \|p_k\|^2.$ (5.13)
Recall now that $u^i(x_k) \ge 0$, $F(x_k) \ge 0$ and
$\sum_{i \in J_\delta(x_k)} u^i(x_k) \le N.$
Therefore
$\sum_{i \in J_\delta(x_k)} u^i(x_k) f_i(x_k) - N F(x_k) \le 0.$
But this means that inequality (5.5) is satisfied after a finite number of trials of $\alpha = 2^{-i}$, $i = 0, 1, \ldots$, and with the inequality
(5.16)
Therefore, the right-hand part of the last inequality must tend to zero. As $F(x)$ is a continuous function in the compact set $\Omega_N$, $F(x)$
has an upper bound and the expression y p h || c a n t e n d
t o z e r o o n l y i f || p * || - » - + oo. B u t f r o m (5.6) w e o b t a i n t h a t
But $f_j(x_k) < f_i(x_k)$, $j \notin J_\delta(x_k)$, $i \in J_\delta(x_k)$. Hence
$F(x_k) = \max_{i \in J} f_i(x_k) \le K\|p_k\|.$
Consequently, $F(x_k) \to 0$ as $k \to \infty$, for $F(x_k) \ge 0$. Further, let us take $u^i(x) = 0$, $i \notin J_\delta(x)$. Then we can rewrite (5.6) along sequence $\{x_k\}$ in the following form:
$f_0'(x_k) + p_k + \sum_{i \in J} u^i(x_k) f_i'(x_k) = 0,$
$u^i(x_k)((f_i'(x_k), p_k) + f_i(x_k)) = 0, \quad i \in J.$ (5.18)
Let now $x_*$ be a limit point of $\{x_k\}$. As $x_k \in \Omega_N$ and $\Omega_N$ is compact, there are always such points. Without loss of generality, we may take that $x_k \to x_*$. Besides, since $u^i(x) \ge 0$, $i \in J$, and their sum is bounded, we can take that $u^i(x_k) \to u^i$ as $k \to \infty$.
Taking the limit in (5.18) we obtain:
Computational Aspects
The basic operation which requires considerable computations at every step in implementing the algorithm is the solving of problem (5.4). This is a problem of quadratic programming. In choosing the method for solving this problem it is necessary to take into account that problem (5.4) must be solved after a finite number of steps, since it is a subsidiary problem. Besides, since constant $N$ is not known beforehand, it is expedient to obtain the corresponding Lagrange multipliers $u^i(x)$ in solving problem (5.4) in order to check whether the choice of $N$ was right or not. On these grounds it seems expedient to pass to the dual problem and to solve it by the method of conjugate gradients which was discussed in the subsection on p. 160.
We shall construct now the dual of problem (5.4). As stated in Chap. I, Sec. 3, the objective function of the dual problem has the following form:
$\varphi(u) = \min_p \left[ (f_0'(x), p) + \frac{1}{2}\|p\|^2 + \sum_{i \in J_\delta(x)} u^i ((f_i'(x), p) + f_i(x)) \right].$ (5.19)
The minimum with respect to $p$ is attained at
$p = -\left( f_0'(x) + \sum_{i \in J_\delta(x)} u^i f_i'(x) \right).$ (5.20)
Thus, point $p$ is uniquely defined by vector $u$ with components $u^i$, $i \in J_\delta(x)$.
Substituting (5.20) into the right-hand side of (5.19), we obtain
$\varphi(u) = -\frac{1}{2} \left\| f_0'(x) + \sum_{i \in J_\delta(x)} u^i f_i'(x) \right\|^2 + \sum_{i \in J_\delta(x)} u^i f_i(x).$ (5.21)
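The dual function can be evaluated and maximized numerically as in the following minimal sketch. It assumes the form $\varphi(u) = -\frac{1}{2}\|f_0' + \sum u^i f_i'\|^2 + \sum u^i f_i$ with $p(u) = -(f_0' + \sum u^i f_i')$, and it deliberately replaces the conjugate gradient method recommended in the text by plain projected gradient ascent on $u \ge 0$, chosen only for brevity. The names `dual_value` and `maximize_dual`, and the numerical data, are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def dual_value(u, g0, G, f):
    """Return (phi(u), p(u)) for gradients g0 = f0'(x), rows G[i] = fi'(x), values f[i]."""
    v = [g0[j] + sum(u[i] * G[i][j] for i in range(len(u))) for j in range(len(g0))]
    p = [-vj for vj in v]                       # minimizer of the Lagrangian in p
    return -0.5 * dot(v, v) + dot(u, f), p

def maximize_dual(g0, G, f, iters=500):
    """Projected gradient ascent on u >= 0 (a stand-in for conjugate gradients)."""
    u = [0.0] * len(G)
    # Step 1/trace(G G^T) does not exceed 1/lambda_max of the Hessian of -phi.
    step = 1.0 / sum(dot(g, g) for g in G)
    for _ in range(iters):
        _, p = dual_value(u, g0, G, f)
        # d(phi)/d(u^i) = (fi'(x), p(u)) + fi(x): the linearized constraint value.
        grad = [dot(G[i], p) + f[i] for i in range(len(G))]
        u = [max(0.0, ui + step * gi) for ui, gi in zip(u, grad)]
    _, p = dual_value(u, g0, G, f)
    return u, p

# Hypothetical data: one active constraint at some point x.
g0 = [1.0, 1.0]
G = [[1.5, -2.5]]
f = [0.125]
u_opt, p_opt = maximize_dual(g0, G, f)
```

Since the multipliers $u^i$ are obtained together with $p$, their sum can be compared with $N$ directly, which is the check that the text recommends.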
then $N$ should be changed to
$N = 2 \sum_{i \in J_\delta(x_k)} u^i(x_k).$ (5.22)
Experience shows that such a correction brings success. Besides, it is clear from theoretical reasons that if $x_k$ is sufficiently close to the
$\frac{1}{2}\|p(x)\|^2 + (f_0'(x), p(x)) \le \frac{1}{2}\|p\|^2 + (f_0'(x), p) + \sum_{i \in J_\delta(x)} u^i(x) ((f_i'(x), p) + f_i(x))$
4 iip<») « * + « / ; <*).?(*))
<yii?ii*+(/;(*),?)+ S »*(*)(«(*).?)■+/,(*»
i € j fi(x)
< 4 - i i p i p + ( / k « ) i p ) + 2 u i ( x ) f i (x)
u 1 (X) <
r 4 - u p ( * ) » * + w w . p ( * ) ) i - rL 4f -- n-p -n !,+- -( -f t- M-, - ?)i
l i - - - - - - - - - - - - - - J— - - - - - - ! (5.24)
$\frac{1}{2}\|\bar{p}\|^2 + (f_0'(x), \bar{p}) = \frac{1}{2}\|\bar{x} - x\|^2 + (f_0'(x), \bar{x} - x)$
is bounded in the compact region $\Omega_0$. Therefore, the smaller quantity
$\frac{1}{2}\|p(x)\|^2 + (f_0'(x), p(x))$
has an upper bound. As to its lower bound we have
i.e. with $x \in \Omega_0$ the quantity under consideration also has a lower bound.
Thus we have shown that in $\Omega_0$ the right-hand sides of (5.24) have upper bounds, i.e. $u^i(x) \le M$, $x \in \Omega_0$. The statement of the theorem directly follows from this fact.
Thus if the primal problem is a problem of convex programming, then any $\delta > 0$ suits the algorithm, provided the admissible region contains an interior point.
Some Generalizations
At the beginning of this section it was stated that if there are equality constraints, i.e. if the constraints are of the form (5.1), then the problem is reduced to the form (5.2) by substituting two inequalities for each equality.
Thus, the algorithm can be applied to the general problem (5.1) too. It should only be taken into account that if with a certain $x$ we have
$f_i(x) \ge F(x) - \delta$ and $-f_i(x) \ge F(x) - \delta$
where $i \in J^0$, then the system (5.4) comprises two inequalities
$(f_i'(x), p) + f_i(x) \le 0, \quad -(f_i'(x), p) - f_i(x) \le 0$ (5.25)
which are equivalent to one equality
$(f_i'(x), p) + f_i(x) = 0.$ (5.26)
Therefore it is expedient to substitute in (5.4) one equality (5.26) for each pair of inequalities of the type (5.25). In passing to the dual problem this will lead to the corresponding multiplier $u^i$ having an arbitrary sign, which however does not impede the possibility of applying the algorithm of conjugate gradients (the subsection on p. 160).
Suppose now that in the primal problem in addition to constraints (5.2), there is a constraint imposed by the condition that point $x$ belongs to a set $X$ of simple structure. In this case it is expedient that the approximations obtained should lie in set $X$. We shall describe now how the algorithm is to be modified in this case. As we did previously we shall consider, without loss of generality, only the case with inequality constraints.
Thus let it be required to minimize $f_0(x)$, $x \in E^n$, with constraints
$f_i(x) \le 0, \; i \in J, \quad x \in X$ (5.27)
where $J$ is a finite set of indices and $X$ is a convex closed set. It is assumed that there is an index $i$ such that $f_i(x) \equiv 0$.
Suppose that there are constants $N > 0$ and $\delta > 0$ such that the following conditions are fulfilled:
(a) set
$\Omega_N = \{x : f_0(x) + N F(x) \le C_0, \; x \in X\},$
$C_0 = f_0(x_0) + N F(x_0),$
is bounded and the initial approximation $x_0$ belongs to $X$;
(b) the gradients of functions $f_i(x)$, $i \in \{0\} \cup J$, in $\Omega_N$ satisfy Lipschitz' condition, i.e.
$\|f_i'(x_1) - f_i'(x_2)\| \le L\|x_1 - x_2\|;$
(c) the problem
$\min_p \; (f_0'(x), p) + \frac{1}{2}\|p\|^2,$
$(f_i'(x), p) + f_i(x) \le 0, \quad i \in J_\delta(x)$
Remark. Recall that Lagrange multipliers for problem (5.28) are all nonnegative numbers such that the following conditions are satisfied:
for all $p$ such that
$x + p \in X.$ (5.30)
Besides
$u^i(x) [(f_i'(x), p(x)) + f_i(x)] = 0, \quad i \in J_\delta(x).$ (5.31)
Thus condition (c) implies not only that the subsidiary problem (5.28) is solvable, but also that the minimum point $p = p(x)$ satisfies the necessary and sufficient conditions required by the Kuhn-Tucker theorem.
The algorithm for solving problem (5.27) is constructed now as it was expounded in the subsection on p. 190. Only we take now as $p_k$ vector $p(x_k)$ which is the solution of the new subsidiary problem (5.28).
We shall show that the algorithm is convergent, i.e. that the conclusions of theorem 5.1 hold and also that $x_k \in X$ for all $k$. It follows from the last assertion that any limit point of sequence $\{x_k\}$ lies in $X$. Since the proof of convergence differs from the proof of theorem 5.1 only in some details, there is no need to give this proof completely. We shall point out only the main specific details.
First, since $x_k + p_k \in X$ and $X$ is convex, we have $x_k + \alpha p_k \in X$ for all $\alpha$ lying between 0 and 1. Therefore if $x_k \in X$, then $x_{k+1} \in X$ too. And since $x_0 \in X$, by assumption, the whole sequence $\{x_k\}_{k=0}^{\infty}$ lies in $X$. Secondly, from (5.29)-(5.31) with $p = 0$ we obtain that
i.e.
This inequality should be substituted for expression (5.7) which was used in obtaining estimate (5.13). All other calculations made in obtaining the estimates remain unchanged.
Finally, if at point $x_*$ we have $p(x_*) = 0$, then it follows from (5.29)-(5.31) that conditions
Problem of Linear Programming
Let now all functions $f_0(x)$, $f_i(x)$, $i \in J$, in problem (5.2) be linear. We then have the problem of linear programming. Though the algorithm described is important mostly for the nonlinear case, its application to the problem of linear programming is also useful. In particular, if set $J$ comprises a great number of indices, then the problem of linear programming is one with many constraints. At the same time, with a small $\delta$ the subsidiary problem (5.4) has only a small number of constraints, so that the general problem is reduced to the solving of a series of simpler problems. Besides, as distinct from the simplex method, the method proposed does not accumulate computation errors, as it does not transform the original matrix of constraints from step to step.
For the problem of linear programming the conditions (a) and (c) of the basic assumption (condition (b) is satisfied automatically) are too strict for the convergence of the algorithm. We shall not dwell on the conditions of convergence for the problem of linear programming since our main purpose is to obtain an algorithm for the nonlinear case. It will be shown below that if assumptions (a) and (c) hold for the problem of linear programming, then the algorithm converges after a finite number of steps. This fact characterizes to some extent the rate of convergence of the algorithm.
Theorem 5.3. Let assumptions (a), (c) of the subsection (p. 189) hold and all functions $f_0(x)$, $f_i(x)$ which define problem (5.2) have the form
$$f_i(x) = (a_i, x) - b_i$$
$$\min\ (f_0'(x_k), p) + \frac{1}{2}\|p\|^2,$$
| {fi (x k ) y P h ) | ^ ^ i f i (x h ) ^ “ 2 " »
and therefore
$$(f_i'(x_k), p_k) + f_i(x_k) < 0.$$
Therefore if $u_k^i > 0$, then
As was shown in the subsection on p. 194, the dual of the subsidiary problem (5.37) is the maximization of function (5.21) with the constraints $u^i \ge 0$, $i \in \mathcal{J}_\delta(x_k)$. The Lagrange multipliers $u_k^i$ are the solution of the dual problem and the equality of the optimum values in the original and dual problems holds, i.e. we have
$$(f_0'(x_k), p_k) + \frac{1}{2}\|p_k\|^2 = -\frac{1}{2}\Big\|f_0'(x_k) + \sum_{i} u_k^i f_i'(x_k)\Big\|^2 + \sum_{i} u_k^i f_i(x_k).$$
Since $p_k \to 0$, the left-hand side of the last expression tends to zero and consequently
$$-\frac{1}{2}\Big\|f_0'(x_k) + \sum_{i} u_k^i f_i'(x_k)\Big\|^2 + \sum_{i} u_k^i f_i(x_k) \to 0. \tag{5.40}$$
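Note now that $u_k^i > 0$ only if $i \in \mathcal{J}(x_k)$. Besides,
$$f_i(x) = (a_i, x) - b_i, \quad i \in \{0\} \cup J,$$
so that $f_i'(x) = a_i$ and is independent of $x$. Therefore, (5.40) can be rewritten in the following form:
$$-\frac{1}{2}\Big\|a_0 + \sum_{i \in \mathcal{J}(x_k)} u_k^i a_i\Big\|^2 + \sum_{i \in \mathcal{J}(x_k)} u_k^i f_i(x_k) \to 0.$$
But $\mathcal{J}(x_k) \subset \mathcal{J}_0(x_*)$ as shown above and therefore $f_i(x_k) \to f_i(x_*) = 0$, since $f_i(x_*) = 0$ for $i \in \mathcal{J}_0(x_*)$, by definition. Therefore
$$-\frac{1}{2}\Big\|a_0 + \sum_{i \in \mathcal{J}(x_k)} u_k^i a_i\Big\|^2 \to 0.$$
But
$$-\frac{1}{2}\Big\|a_0 + \sum_{i \in \mathcal{J}(x_k)} u_k^i a_i\Big\|^2 \le \max_{u^i \ge 0,\ i \in \mathcal{J}(x_k)} \Big(-\frac{1}{2}\Big\|a_0 + \sum_{i \in \mathcal{J}(x_k)} u^i a_i\Big\|^2\Big) \le 0. \tag{5.41}$$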
$$\omega(\mathcal{J}(x_k)) \to 0.$$
But this means that $\omega(\mathcal{J}(x_k)) = 0$ for all sufficiently great $k$ since, as was just mentioned, $\omega(\mathcal{J})$ can take only a finite number of values. Thus, for great $k$
$$\omega(\mathcal{J}(x_k)) = 0. \tag{5.42}$$
We now choose $k$ so great that $\alpha_k = 1$, condition (5.42) is fulfilled and $\mathcal{J}(x_k) \subset \mathcal{J}_0(x_*)$. As $\alpha_k = 1$, we have $x_{k+1} = x_k + p_k$. Since $x_k \to x_*$, $p_k \to 0$, we can take that
$$f_i(x_{k+1}) < 0, \quad i \notin \mathcal{J}_0(x_*). \tag{5.43}$$
Let us consider again subsidiary problem (5.37). As $p_k$ satisfies the constraints of (5.37) and $f_i(x)$ are linear, we have
$$f_i(x_{k+1}) = (f_i'(x_k), p_k) + f_i(x_k) \le 0 \tag{5.44}$$
for $i \in \mathcal{J}_\delta(x_k)$ and consequently for $i \in \mathcal{J}_0(x_*)$ too, as $\mathcal{J}_0(x_*) \subset \mathcal{J}_\delta(x_k)$. We have thus shown that $x_{k+1}$ satisfies all constraints of problem (5.2).
We demonstrate that $x_{k+1}$ is really the solution of problem (5.2). Indeed, it follows from (5.38) and the definition of set $\mathcal{J}(x_k)$ that
$$f_i(x_{k+1}) = 0, \quad i \in \mathcal{J}(x_k). \tag{5.45}$$
are fulfilled.
But the last relations (see Chap. I, Sec. 3) are the necessary and sufficient conditions for point $x_{k+1}$ to be the solution of the problem of linear programming.
Thus the algorithm provides the solution after a finite number of steps. Q.E.D.
Local Estimate of the Rate of Convergence
It was shown in the preceding subsection that the algorithm proposed converges after a finite number of steps in the linear case. Here we shall show that in the general nonlinear case the algorithm converges at a geometric rate and, under certain favourable circumstances, even at a quadratic rate.
Theorem 5.4. Let $x_*$ be the solution of problem (5.2) and the following conditions hold:
(a) For any sufficiently small $\delta > 0$ the subsidiary problem (5.4) is solvable.
(b) Functions $f_i(x)$ are twice continuously differentiable and the gradients $f_i'(x_*)$, $i \in \mathcal{J}_0(x_*)$, where
$$\mathcal{J}_0(x_*) = \{i : f_i(x_*) = 0,\ i \in J\},$$
are linearly independent.
(c) At point $x_*$ the necessary condition for a minimum is satisfied in the form
$$f_0'(x_*) + \sum_{i \in \mathcal{J}_0(x_*)} u_0^i f_i'(x_*) = 0$$
and $u_0^i > 0$, $i \in \mathcal{J}_0(x_*)$.
(d) The sufficient condition for a local minimum, i.e.
$$(p, L''(x_*, u_0)\,p) > 0,$$
holds for all $p \ne 0$ which also satisfy the condition
$$(p, f_i'(x_*)) = 0, \quad i \in \mathcal{J}_0(x_*),$$
where
$$L(x, u) = f_0(x) + \sum_{i \in \mathcal{J}_0(x_*)} u^i f_i(x)$$
and $L''$ is the matrix of second derivatives of $L(x, u)$ with respect to $x$.
Then there is a neighbourhood $\Omega$ of point $x_*$, $\delta_0 > 0$ and $\alpha > 0$ such that the process
$$x_{k+1} = x_k + \alpha p_k \tag{5.47}$$
converges to point $x_*$ from any initial approximation $x_0 \in \Omega$ at a geometric rate, i.e. there is a number $0 \le q < 1$ such that $\|x_* - x_k\| \le C q^k$ for all sufficiently great $k$.
Proof. The basic idea of the proof is as follows. As was shown above, at point $x_*$ the equation
$$p(x_*) = 0$$
is satisfied.
Process (5.47) is a simple iterative process for solving the last equation. Therefore in order to estimate the rate of convergence we can use Ostrowski's theorem, which is formulated below. This theorem requires an estimate of the eigenvalues of the matrix of first derivatives of $p(x)$ at point $x_*$. Therefore the main problem will be the calculation of this matrix and its eigenvalues.
We shall break the proof of the theorem into several parts.
We take
$$\mathcal{J}_0(x_*) = \{i : f_i(x_*) = 0\},$$
$$\varepsilon_0 = \max_{i \notin \mathcal{J}_0(x_*)} f_i(x_*) < 0.$$
Lemma 5.2. Let the conditions of the theorem be fulfilled and let
6 . Then there is a neighbourhood of point x* such that
$\mathcal{J}_\delta(x) = \mathcal{J}_0(x_*)$ and $p(x)$ is continuously differentiable with respect to $x$ in this neighbourhood. Moreover, the set
$$\mathcal{J}(x) = \{i \in \mathcal{J}_\delta(x) : (f_i'(x), p(x)) + f_i(x) = 0\}$$
coincides with set $\mathcal{J}_0(x_*)$.
Proof. Since all functions $f_i(x)$ are continuous, there is a neighbourhood of point $x_*$ such that
and if $i \in \mathcal{J}_0(x_*)$, then
i.e. $i \in \mathcal{J}_\delta(x)$.
where $u^i(x) \ge 0$.
We introduce the following notations: $\mathcal{J} \subseteq \mathcal{J}_\delta(x)$ $(= \mathcal{J}_0(x_*))$; $f'_{\mathcal{J}}(x)$ for a matrix whose rows are $f_i'(x)$, $i \in \mathcal{J}$; $f_{\mathcal{J}}(x)$ for a column-vector with components $f_i(x)$, $i \in \mathcal{J}$; and $u_{\mathcal{J}}$ for a column-vector with components $u^i$, $i \in \mathcal{J}$. Then equations (5.51.1), (5.51.2) can be rewritten as follows:
$$p(x) + f_0'(x) + f_{\mathcal{J}}'^{*}(x)\,u_{\mathcal{J}}(x) = 0,$$
$$f'_{\mathcal{J}}(x)\,p(x) + f_{\mathcal{J}}(x) = 0, \quad \mathcal{J} = \mathcal{J}(x). \tag{5.52}$$
The last expressions can be considered to be a linear system of equations to be solved for $p(x)$ and $u_{\mathcal{J}}(x)$. It is easy to see that in a certain neighbourhood of $x_*$, system (5.52) has only one solution, expressed by the formulas
It follows from these formulas that if set $\mathcal{J}$ is fixed, then $u_{\mathcal{J}}(x)$ and $p(x)$ depend continuously on $x$.
Let now $x_k \to x_*$. We shall show that for all great $k$
$$\mathcal{J}(x_k) = \mathcal{J}_0(x_*).$$
Suppose that our statement is not fulfilled and there are arbitrarily great numbers $k$ such that $\mathcal{J}(x_k)$ is a proper subset of $\mathcal{J}_0(x_*)$. Since there can be only a finite number of different sets $\mathcal{J}(x)$, we can take, without loss of generality, that a sequence $x_k \to x_*$ is chosen such that
$$\mathcal{J}(x_k) = \mathcal{J}, \quad \mathcal{J} \subset \mathcal{J}_0(x_*).$$
$$(f_i'(x_*), p) + f_i(x_*) = 0, \quad i \in \mathcal{J},$$
$$(f_i'(x_*), p) + f_i(x_*) \le 0, \quad i \in \mathcal{J}_\delta(x_*) = \mathcal{J}_0(x_*),$$
where $u^i \ge 0$, for $u^i(x_k) \ge 0$. But these last relations show that $p$ is the solution of the subsidiary problem (5.4) for point $x_*$, i.e. $p = p(x_*)$. But point $x_*$ is the solution of problem (5.2) and therefore $p(x_*) = 0$. Consequently,
$$f_0'(x_*) + \sum_{i \in \mathcal{J}} u^i f_i'(x_*) = 0.$$
$$\sum_{i \in \mathcal{J}} (u_0^i - u^i)\,f_i'(x_*) + \sum_{i \in \mathcal{J}_0(x_*) \setminus \mathcal{J}} u_0^i f_i'(x_*) = 0$$
and this contradicts condition (b) of the theorem. Thus, in a certain neighbourhood of point $x_*$, set $\mathcal{J}(x)$ coincides with $\mathcal{J}_0(x_*)$. From $\mathcal{J}(x)$ being constant and formulas (5.53) it follows directly that $u_{\mathcal{J}}(x)$ and $p(x)$, $\mathcal{J} = \mathcal{J}_0(x_*)$, are continuously differentiable with respect to $x$ since, by condition (b), $f_i(x)$ are twice continuously differentiable.
Remark. Thus, in a small neighbourhood of $x_*$, $p(x)$ and $u_{\mathcal{J}}(x)$ are the solution of the system of equations (5.52) with a constant set $\mathcal{J} = \mathcal{J}_0(x_*)$. Therefore we shall omit index $\mathcal{J}$ in $u_{\mathcal{J}}(x)$.
Lemma 5.3. The matrix $p'(x)$ of derivatives of vector $p(x)$, i.e. the matrix with elements $\partial p^i(x)/\partial x^j$, $i, j = 1, \ldots, n$, where $p^i(x)$ is the $i$-th component of vector $p(x)$, at point $x_*$ has the following form:
$$p'(x_*) = -[P + (I - P)\,L''(x_*, u_0)]$$
where
$$P = f_{\mathcal{J}_0(x_*)}'^{*}(x_*)\,\big(f'_{\mathcal{J}_0(x_*)}(x_*)\,f_{\mathcal{J}_0(x_*)}'^{*}(x_*)\big)^{-1} f'_{\mathcal{J}_0(x_*)}(x_*)$$
and $u_0 = u(x_*)$.
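The formula of lemma 5.3 can be checked numerically on a very small case. The sketch below is only an illustration under stated assumptions: it takes $n = 2$ and a single active constraint, so that $P$ reduces to $a a^{*}/(a, a)$ for the gradient $a$. The vector `a` and the matrix `L2` (standing in for $L''(x_*, u_0)$) are hypothetical data, not taken from the text.

```python
def matmul(A, B):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


# Hypothetical data: one active constraint with gradient a, and a symmetric
# matrix L2 playing the role of L''(x_*, u_0).
a = [1.0, 0.0]
L2 = [[2.0, 1.0], [1.0, 3.0]]
n = len(a)

identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
aa = sum(x * x for x in a)
# With a single gradient row, P = a a^T / (a, a) projects onto span{a}.
P = [[a[i] * a[j] / aa for j in range(n)] for i in range(n)]
I_minus_P = [[identity[i][j] - P[i][j] for j in range(n)] for i in range(n)]

# p'(x_*) = -[P + (I - P) L'']
Q = matmul(I_minus_P, L2)
dp = [[-(P[i][j] + Q[i][j]) for j in range(n)] for i in range(n)]

y1 = [2.0, -1.0]  # P y1 != 0: eigenvector with eigenvalue -1
y2 = [0.0, 1.0]   # P y2 == 0: eigenvector with eigenvalue -3
print(dp, matvec(dp, y1), matvec(dp, y2))
```

In this example one eigenvector has a nonzero projection $Py$ and eigenvalue $-1$, while the other lies in the null space of $P$ and its eigenvalue is minus an eigenvalue of $(I - P)L''(I - P)$, which is the dichotomy used later in the proof.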
Proof. By differentiating the first of formulas (5.52) we obtain
where $u'(x)$ is the matrix with elements $\partial u^i(x)/\partial x^j$, $j = 1, \ldots, n$.
(1) $Py \ne 0$. Then it follows from (5.58) that $\sigma = -1$.
(2) $Py = 0$. In this case, $(I - P)y = y$ and (5.59) can be rewritten as follows:
$$(I - P)\,L''(x_*, u_0)\,(I - P)\,y = -\sigma y, \tag{5.60}$$
i.e. $-\sigma$ is an eigenvalue of matrix $(I - P)L''(I - P)$. This matrix is symmetric since $P = P^{*}$ and $L'' = (L'')^{*}$, $L''$ being the matrix of second derivatives of function $L$. Moreover, the matrix under consideration is nonnegative definite. Indeed, for any $w$ we have
$$(w, (I - P)\,L''\,(I - P)\,w) = (z, L''z)$$
where
$$z = (I - P)\,w.$$
Consider now the eigenvalues of matrix $I + \alpha p'(x_*)$. They are equal either to $1 - \alpha$ or to $1 - \alpha\lambda_j$. We choose now $\alpha$ so that all of the following inequalities are satisfied:
$$1 - \alpha > -1, \quad 1 - \alpha\lambda_j > -1, \quad j = 1, \ldots, n - m,$$
i.e. that $0 < \alpha < \min\{2, 2/\lambda_0\}$, where $\lambda_0 = \max \lambda_j$, $j = 1, \ldots, n - m$. Then all the eigenvalues of matrix $I + \alpha p'(x_*)$ will have moduli less than unity; hence, referring also to Ostrowski's results, we have that theorem 5.4 holds.
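The choice of $\alpha$ can be illustrated by a short computation. The sketch below is an illustration only: it assumes the $\lambda_j$ are positive, the sample values in `lambdas` are hypothetical, and the function names are introduced here rather than taken from the text. It evaluates the bound $\min\{2, 2/\lambda_0\}$ and the largest modulus among the eigenvalues $1 - \alpha$ and $1 - \alpha\lambda_j$ of $I + \alpha p'(x_*)$.

```python
def alpha_upper_bound(lambdas):
    """Upper end of the admissible interval 0 < alpha < min{2, 2/lambda_0}."""
    if not lambdas:  # case m = n: only the eigenvalue 1 - alpha remains
        return 2.0
    return min(2.0, 2.0 / max(lambdas))


def contraction_factor(alpha, lambdas):
    """Largest modulus among 1 - alpha and 1 - alpha * lambda_j."""
    moduli = [abs(1.0 - alpha)]
    moduli += [abs(1.0 - alpha * lam) for lam in lambdas]
    return max(moduli)


lambdas = [0.5, 4.0]  # hypothetical positive values of lambda_j
bound = alpha_upper_bound(lambdas)
q = contraction_factor(0.4, lambdas)
print(bound, q)
```

For any $\alpha$ strictly inside the interval the factor is below one, while at the right end of the interval it reaches one, which is why the inequalities are strict.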
Theorem 5.5. Let the conditions of the preceding theorem be satisfied and, besides, $m$ (the number of indices in set $\mathcal{J}_0(x_*)$) be equal to $n$ (the dimension of the space). In this case, process (5.47) converges from a certain neighbourhood of point $x_*$ with $\alpha = 1$ at a quadratic rate.
Proof. It follows from lemma 5.4 for the case under consideration that all the eigenvalues of matrix $p'(x_*)$ are equal to $-1$, and therefore the eigenvalues of matrix $I + \alpha p'(x_*)$ are equal to $1 - \alpha$. If $\alpha = 1$, then all the eigenvalues are equal to zero and $q_0 = 0$. Therefore according to Ostrowski's theorem, we obtain $\|x_* - x_k\| \le C(\varepsilon)\,\varepsilon^k$ and this means that the process converges at a higher rate than that of any geometric progression. In fact in this case, process (5.47) passes into Newton's method for solving the system of equations $f_i(x) = 0$, $i \in \mathcal{J}_0(x_*)$, which, as is well known and as shown below in Sec. 6, converges quadratically.
Remark. All the arguments in this subsection were conducted for the case of a problem with only inequality constraints. It is obvious, however, that all the results obtained can be applied to the case with equality constraints.
6. LINEARIZATION METHOD:
SOLVING SYSTEMS OF EQUALITIES AND
INEQUALITIES AND FINDING THE MINIMAX
In this section, the linearization method is applied to two problems closely connected to the usual problem of mathematical programming. It turns out that in this case one can succeed in constructing effective algorithms which have a fast rate of convergence.
Systems of Equalities and Inequalities
Given two finite sets of indices $\mathcal{J}^-$ and $\mathcal{J}^0$ and functions $f_i(x)$, $x \in E^n$, it is required to find the solution of the following system:
$$f_i(x) \le 0, \quad i \in \mathcal{J}^-, \qquad f_i(x) = 0, \quad i \in \mathcal{J}^0. \tag{6.1}$$
Suppose that functions $f_i(x)$ have continuous gradients $f_i'(x)$ and
We choose an initial point $x_0$ and assume that for all $x$ that satisfy the inequality $F(x) \le F(x_0)$, the gradients $f_i'(x)$ are bounded in norm by constant $K$.
Basic assumption. There are numbers $\delta > 0$ and $C > 0$ such that for all $x$ for which $F(x) > 0$, $F(x) \le F(x_0)$ the following system is solvable for $p$:
$$(f_i'(x), p) + f_i(x) \le 0, \quad i \in \mathcal{J}_\delta^-(x),$$
$$(f_i'(x), p) + f_i(x) = 0, \quad i \in \mathcal{J}_\delta^0(x). \tag{6.2}$$
Let $p(x)$ be the solution of (6.2) that has the minimum norm. Then for $x$ such that $F(x) > 0$,
$$\|p(x)\| \le C\,F(x). \tag{6.3}$$
The successive approximations are constructed by the formula
$$x_{k+1} = x_k + \alpha_k p_k, \quad p_k = p(x_k), \tag{6.4}$$
where parameter $\alpha_k$ is chosen by sequentially halving unity until the following inequality is satisfied:
$$F(x_k + \alpha_k p_k) \le (1 - \varepsilon\alpha_k)\,F(x_k) \tag{6.5}$$
where $\varepsilon$ is any number, chosen from the beginning, $0 < \varepsilon < 1$. Clearly, formula (6.4) is applicable if $F(x_k) > 0$. Otherwise, the process stops and $x_k$ is the solution of (6.1).
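The step-halving rule (6.5) is simple to state in code. The following is a minimal sketch, not the authors' implementation: the function name `choose_step`, the cap `max_halvings` and the one-dimensional test equation $x^2 - 4 = 0$ are all assumptions introduced for illustration, and the direction `p` is computed here by the Newton formula rather than by solving system (6.2) in general.

```python
def choose_step(F, x, p, eps=0.5, max_halvings=60):
    """Halve alpha, starting from 1, until F(x + alpha p) <= (1 - eps alpha) F(x)."""
    alpha = 1.0
    Fx = F(x)
    for _ in range(max_halvings):
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        if F(trial) <= (1.0 - eps * alpha) * Fx:
            return alpha
        alpha *= 0.5
    raise RuntimeError("no admissible step found within max_halvings")


def f(x):
    """Hypothetical single equation f(x) = 0 in one unknown."""
    return x[0] ** 2 - 4.0


def F(x):
    """Measure of violation of the system; here simply |f(x)|."""
    return abs(f(x))


x = [0.5]
p = [-f(x) / (2.0 * x[0])]  # solves the linearized equation f'(x) p + f(x) = 0
alpha = choose_step(F, x, p)
print(p, alpha)
```

In this example the full step overshoots the root and increases $F$, so one halving is needed before inequality (6.5) holds.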
Convergence of the Algorithm
The behaviour of the algorithm proposed is characterized by the following theorem.
Theorem 6.1. Let all the assumptions of the preceding subsection be fulfilled. Then sequence $\{x_k\}$, $k = 0, 1, \ldots$, generated by the algorithm according to formula (6.4) converges to $\bar{x}$, the solution of system (6.1), and at the same time
(a) for a sufficiently great $k$, $\alpha_k = 1$;
(b) for a sufficiently great $k$,
$$F(x_{k+1}) \le L C^2 F^2(x_k);$$
(c) for any $q$, $0 < q < 1$, there is a number $k(q)$ such that
$$\|\bar{x} - x_k\| \le \frac{q^{2^{k - k(q)}}}{LC(1 - q)}.$$
Note now that
$$(1 - \alpha)\,F(x_k) \ge F(x_k) - \delta + \alpha C K F(x_k)$$
if $\alpha \le \alpha_k^1$, where
$$\alpha_k^1 = \frac{\delta}{(1 + CK)\,F(x_k)}.$$
Therefore for $\alpha \le \alpha_k^1$ it follows from (6.7)-(6.10) that
$$F(x_k + \alpha p_k) \le (1 - \alpha)\,F(x_k) + \alpha^2 L C^2 F^2(x_k)$$
or
$$F(x_k + \alpha p_k) \le F(x_k) - \alpha F(x_k)\,[1 - \alpha L C^2 F(x_k)]. \tag{6.11}$$
If $\alpha \le \alpha_k^2$, where
$$\alpha_k^2 = \frac{1 - \varepsilon}{L C^2 F(x_k)},$$
Thus, we have proved that the choice of $\alpha_k$ under condition (6.5) is feasible and that this choice can be realized after a finite number of operations.
We show that $F(x_k) \to 0$. Indeed, it follows from (6.5) that $F(x_k)$ decreases monotonically. Therefore, it can be concluded from the formulas for $\alpha_k^1$ and $\alpha_k^2$ that these quantities increase with increasing $k$. Consequently, formula (6.13) permits us to conclude that $\alpha_k \ge \bar{\alpha} > 0$ and so
$$F(x_{k+1}) \le (1 - \varepsilon\alpha_k)\,F(x_k) \le (1 - \varepsilon\bar{\alpha})\,F(x_k).$$
Therefore $F(x_k) \le (1 - \varepsilon\bar{\alpha})^k F(x_0)$, hence $F(x_k) \to 0$. But then $\alpha_k^1 \to +\infty$, $\alpha_k^2 \to +\infty$, as can be seen directly from the formulas for these quantities. Therefore, (6.13) permits us to conclude that $\alpha_k = 1$ for a sufficiently great $k$. But (6.11) shows for all such $k$, if $\alpha = 1$ is substituted into it, that
$$F(x_{k+1}) \le L C^2 F^2(x_k). \tag{6.14}$$
Thus statements (a) and (b) of the theorem have been proved.
We can now assert that there is a $k_0$ such that $\alpha_k = 1$ for $k \ge k_0$ and (6.14) is satisfied. Therefore, by (6.3),
$$\|x_{k+1} - x_k\| = \|p_k\| \le C\,F(x_k).$$
We set $v_k = L C^2 F(x_k)$. Then $v_k \to 0$ and (by (6.14)) $v_{k+1} \le v_k^2$. Let $q$ be such that $0 < q < 1$. Then there is a $k(q)$ such that $v_k \le q$ for $k \ge k(q)$. Therefore $v_{k+1} \le q v_k$, $k \ge k(q)$. Hence
$$\|x_m - x_k\| \le \sum_{j=k}^{m-1} \|x_{j+1} - x_j\| \le \frac{1}{LC} \sum_{j=k}^{m-1} v_j \le \frac{v_k}{LC} \sum_{j=0}^{m-1-k} q^j \le \frac{v_k}{LC(1 - q)} \le \frac{q^{2^{k - k(q)}}}{LC(1 - q)}. \tag{6.15}$$
It follows from this estimate (according to the well-known Cauchy criterion) that sequence $\{x_k\}$ converges to a certain point $\bar{x}$. Since $F(x_k) \to 0$, we have $F(\bar{x}) = 0$, i.e. $\bar{x}$ is the solution of system (6.1). Moreover, taking the limit in (6.15) as $m \to \infty$ we obtain
$$\|\bar{x} - x_k\| \le \frac{q^{2^{k - k(q)}}}{LC(1 - q)}.$$
Q.E.D.
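The doubly exponential factor $q^{2^{k - k(q)}}$ in the estimate comes from repeatedly squaring a quantity not exceeding $q$. A minimal numerical illustration follows; it treats the extreme case $v_{k+1} = v_k^2$ with the hypothetical starting value $v_0 = q = 0.5$, and the function name is introduced here only for the example.

```python
def squared_sequence(v0, steps):
    """Return [v_0, v_1, ..., v_steps] with v_{k+1} = v_k ** 2."""
    values = [v0]
    for _ in range(steps):
        values.append(values[-1] ** 2)
    return values


q = 0.5  # hypothetical value with 0 < q < 1
vs = squared_sequence(q, 4)
bounds = [q ** (2 ** k) for k in range(5)]      # q^(2^k)
geometric = [q ** (k + 1) for k in range(5)]    # a geometric progression for comparison
print(vs)
print(geometric)
```

After a few steps the squared sequence is already far below the geometric progression with the same ratio, which is the practical meaning of the quadratic rate.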
Remarks
Remark 1. Suppose we are solving a system of $n$ equations $f_i(x) = 0$, $i = 1, \ldots, n$, where $x \in E^n$. Then
$$|f_i(x)| \ge F(x) - \delta, \quad i = 1, \ldots, n,$$
$$F(x) = \max_{1 \le i \le n} |f_i(x)|$$
for any $\delta$, provided $x$ is sufficiently close to the solution $\bar{x}$. Therefore, $\mathcal{J}_\delta^0(x) = \{1, 2, \ldots, n\}$ and system (6.2) takes the form
$$(f_i'(x), p) + f_i(x) = 0, \quad i = 1, \ldots, n. \tag{6.16}$$
Therefore the method proposed coincides with Newton's method in which iterations are performed by the formula $x_{k+1} = x_k + p(x_k)$, where $p(x)$ is the solution of system (6.16). The condition for the convergence of Newton's method is the nonsingularity at point $\bar{x}$ of matrix $f'(x)$, where $f'(x)$ is an $n \times n$ matrix whose rows are $f_i'(x)$. In this case, $p(x) = -(f'(x))^{-1} f(x)$, where $f(x)$ is a column-vector whose components are $f_i(x)$. But it follows from the last formula that
$$\|p(x)\| \le \|(f'(x))^{-1}\|\,\|f(x)\| \le C_0\,\|(f'(x))^{-1}\|\,F(x)$$
where $C_0$ is a constant. It can be seen from this inequality that (6.3) holds in a certain neighbourhood of point $\bar{x}$.
Thus it follows from the theorem proved that the usual Newton's method is locally convergent in solving a system of $n$ equations with $n$ unknowns.
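The Newton iteration of Remark 1 can be sketched for $n = 2$ as follows. This is an illustration under assumptions, not code from the book: the test system (a circle of radius 2 intersected with the line $x_1 = x_2$), the starting point and the helper name `newton_2x2` are hypothetical, and the linear system (6.16) is solved by Cramer's rule, which is adequate only for such a tiny case.

```python
import math


def newton_2x2(f, jac, x, iterations=10):
    """Iterate x <- x + p, where p solves f'(x) p + f(x) = 0 (two unknowns)."""
    for _ in range(iterations):
        f1, f2 = f(x)
        (a, b), (c, d) = jac(x)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("matrix f'(x) is singular")
        # Cramer's rule for the system [[a, b], [c, d]] p = (-f1, -f2).
        p1 = (-f1 * d + b * f2) / det
        p2 = (-a * f2 + c * f1) / det
        x = (x[0] + p1, x[1] + p2)
    return x


def system(x):
    """Hypothetical system: x1^2 + x2^2 - 4 = 0 and x1 - x2 = 0."""
    return (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])


def jacobian(x):
    """Rows are the gradients of the two functions above."""
    return ((2.0 * x[0], 2.0 * x[1]), (1.0, -1.0))


solution = newton_2x2(system, jacobian, (1.0, 2.0))
print(solution, math.sqrt(2.0))
```

The matrix $f'(x)$ is nonsingular at the solution $(\sqrt{2}, \sqrt{2})$ of this example, so the condition of the remark is met.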
Remark 2. If only one equation $f(x) = 0$ in $n$ unknowns is to be solved, then system (6.2) takes the form
$$(f'(x), p) + f(x) = 0 \tag{6.17}$$
and it is required to find the solution of this equation with a minimum norm, i.e. to find the minimum of $\|p\|^2$ with constraints (6.17). Using the rule of Lagrange multipliers, we have in this case
$$p(x) = -\frac{f(x)}{\|f'(x)\|^2}\,f'(x),$$
hence
$$\|p(x)\| = \frac{|f(x)|}{\|f'(x)\|}.$$
Clearly, formula (6.3) will be satisfied if $\|f'(x)\| \ge \gamma$ for all $x$.
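The minimum-norm step of Remark 2 is easy to try out. The sketch below is an assumption-laden illustration: the equation (the unit circle in the plane), the starting point and the names `min_norm_step`, `f` and `grad_f` are hypothetical, and full steps $\alpha = 1$ are used throughout instead of the halving rule (6.5).

```python
import math


def min_norm_step(f_val, grad):
    """Minimum-norm solution p of (grad, p) + f_val = 0: p = -f_val * grad / ||grad||^2."""
    g2 = sum(g * g for g in grad)
    return [-f_val * g / g2 for g in grad]


def f(x):
    """Hypothetical single equation in two unknowns: the unit circle."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_f(x):
    return [2.0 * x[0], 2.0 * x[1]]


x = [2.0, 1.0]
p0 = min_norm_step(f(x), grad_f(x))
# Residual of the linearized equation (6.17) at the starting point.
residual0 = sum(g * pi for g, pi in zip(grad_f(x), p0)) + f(x)
norm_p0 = math.sqrt(sum(pi * pi for pi in p0))

for _ in range(20):
    p = min_norm_step(f(x), grad_f(x))
    x = [xi + pi for xi, pi in zip(x, p)]

print(p0, residual0, x, f(x))
```

Because the step is always parallel to the gradient, the iterates of this example move along the ray through the origin and the starting point until they reach the circle.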
Remark 3. Finding vector $p(x)$ at each step involves solving the problem of minimization of $\|p\|^2$ with constraints (6.2). This is a problem of quadratic programming. Concerning the methods of solving it, we can use the same information given in Sec. 5 about the solving of the subsidiary problem of quadratic programming which arises in the linearization method.
Sufficient Conditions of Convergence
The main condition (6.3) which guarantees the convergence of the algorithm is not easy to check. This subsection describes conditions that can be checked more effectively. In particular, for the convex case, if there is an interior point in the domain defined by expressions (6.1), the conditions guarantee the convergence of the algorithm.
Let the system contain only inequality constraints, i.e.
$$f_i(x) \le 0, \quad i \in \mathcal{J}^-. \tag{6.18}$$
Then the subsidiary system (6.2) takes the form
$$(f_i'(x), p) + f_i(x) \le 0, \quad i \in \mathcal{J}_\delta^-(x). \tag{6.19}$$
Clearly, this system can be solved with $F(x) > 0$ if the system
$$(f_i'(x), p) + F(x) \le 0, \quad i \in \mathcal{J}_\delta^-(x) \tag{6.20}$$
is solvable.
$$\sum_{i \in \mathcal{J}_\delta^-(x)} \lambda_i = 1.$$
Then the solution $p(x)$ of system (6.20) with a minimum norm satisfies the equality
Proof. Let $\lambda_i \ge 0$ be such that their sum over all $i \in \mathcal{J}_\delta^-(x)$ is equal to unity. If $p$ is a solution of (6.20), then
$$-\sum_{i \in \mathcal{J}_\delta^-(x)} \lambda_i\,(f_i'(x), p) \ge F(x),$$
or
$$\Big(-\sum_{i \in \mathcal{J}_\delta^-(x)} \lambda_i f_i'(x),\ p\Big) \ge F(x).$$
Using the inequality $(x, y) \le \|x\|\,\|y\|$, we obtain
(6.21)
Thus, it has been proved that the conditions of the lemma are necessary.
Suppose now that $L_\delta(x) > 0$.
Consider the problem: to find the minimum of $\rho$ with the following constraints:
$$(f_i'(x), p) + F(x) - \rho \le 0, \quad i \in \mathcal{J}_\delta^-(x),$$
I I p I K ' - o , r » = T ^ i > 0 . ( 6 . 2 2 )
This is a problem of convex programming, and all the conditions of the Kuhn-Tucker theorem, in particular Slater's condition, are obviously fulfilled. Let $p_0$, $\rho_0$ be the solution. Applying the Kuhn-Tucker theorem, we obtain that there are $\lambda_i \ge 0$ such that for all $p$, $\|p\| \le r_0$, and for all $\rho$ the following inequality holds:
$$\rho_0 + \sum \lambda_i\,\big((f_i'(x), p_0) + F(x) - \rho_0\big)$$
$$\sum \lambda_i = 1.$$
$$\rho_0 \le \Big(\sum \lambda_i f_i'(x),\ p\Big) + F(x).$$
Proof. Since any solution of system (6.20) is also a solution of system (6.19), we have
IIp (*) I K U p ( * ) II.
Therefore by lemma 6.1, we have
fi ( x ) + 6 > ft ( x ) + 6 + ( / ; (a;), p ) .
B u t fi ( x ) + 6 ^ F ( x ) + 6 = V + fi < 0 and fi ( x ) + 6 ^ F ( x ),
i 6 Cfl (x). T h e r e f o r e
0 > y + 6 > F ( x ) -{- (ft ( x ) , p ) , i 6 Cfl (*).
Setting y + 6 = — e, w e obtain that
(fi P ) + (F (x) + e)<0, i 6 C f l (a:).
But by lemma 6.1, this means that for all $x$ such that $F(x) \le F(x_0)$, $F(x) + \varepsilon \ge 0$, system (6.20) is solvable and for such $x$ also $L_\delta(x) > 0$. Now since the domain $F(x) \le F(x_0)$ is compact and the functions are continuous, it can be easily ascertained that $L_\delta(x) \ge \gamma > 0$ for all $x$ such that $0 \le F(x) \le F(x_0)$.
Thus all the conditions of theorem 6.2 are satisfied and this completes the proof of theorem 6.3.
It is easy to see that this problem can be reduced to the following one by introducing an additional variable $x^{n+1}$: to minimize $f_0(x, x^{n+1}) = x^{n+1}$ with constraints
$$f_i(x) - x^{n+1} \le 0, \quad i = 1, \ldots, m.$$
Therefore the methods described above, in particular the linearization method, are now applicable. Note also that in this way we can also solve the problem of the minimization of $F(x)$ if $x$ varies in a certain domain $\Omega$ defined by a system of equalities or inequalities.
In this subsection, we shall discuss the method of minimization of $F(x)$ with $x \in E^n$. This method is based on a slight modification of the linearization method.
Let us introduce at each point $x$ the following subsidiary problem:
$$\min\ \Big(\rho + \frac{1}{2}\|p\|^2\Big),$$
$$(f_i'(x), p) + f_i(x) - \rho \le 0, \quad i \in \mathcal{J}_\delta(x), \tag{6.26}$$
where $\delta > 0$ and
$$\mathcal{J}_\delta(x) = \{i : 1 \le i \le m,\ f_i(x) \ge F(x) - \delta\}.$$
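The index set $\mathcal{J}_\delta(x)$ selects the functions whose values lie within $\delta$ of the maximum. A minimal sketch of this selection is given below; it assumes that $F(x) = \max_i f_i(x)$, as the minimax reduction above suggests, uses 0-based indices instead of the book's $1, \ldots, m$, and the sample values are hypothetical.

```python
def active_set(values, delta):
    """Indices i with f_i(x) >= F(x) - delta, where F(x) = max_i f_i(x).

    `values` holds the numbers f_1(x), ..., f_m(x) at a fixed point x;
    the returned indices are 0-based.
    """
    F = max(values)
    return [i for i, v in enumerate(values) if v >= F - delta]


values = [1.0, 0.95, 0.2]  # hypothetical values f_i(x) at some point x
print(active_set(values, 0.1), active_set(values, 0.0))
```

With a small $\delta$ only the nearly maximal functions enter the subsidiary problem (6.26), which keeps the number of its constraints small.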
Note that problem (6.26) is a problem of convex programming for which Slater's condition is satisfied, for by taking $\rho$ sufficiently great we can always satisfy constraints (6.26) strictly. By applying directly the Kuhn-Tucker theorem in its differential form, we now find that $p(x)$ and $\rho(x)$ are the solution of problem (6.26) if and only if there are $u^i \ge 0$, $i \in \mathcal{J}_\delta(x)$, such that
$$\sum_{i \in \mathcal{J}_\delta(x)} u^i = 1,$$
$$p(x) + \sum_{i \in \mathcal{J}_\delta(x)} u^i f_i'(x) = 0,$$
$$u^i\,\big((f_i'(x), p(x)) + f_i(x) - \rho(x)\big) = 0, \quad i \in \mathcal{J}_\delta(x). \tag{6.27}$$
Further, point $p = 0$, $\rho = F(x)$ obviously satisfies constraints (6.26). Therefore
$$\rho(x) + \frac{1}{2}\|p(x)\|^2 \le F(x). \tag{6.28}$$
If n o w
1
e
2
a k = min a L (6.33)
L
then
$$F(x_k + \alpha p_k) \le F(x_k) - \alpha\varepsilon\,\|p_k\|^2.$$
It follows immediately that inequality (6.30) holds with
$$\alpha_k \ge \frac{1}{2}\alpha_k' \tag{6.34}$$
after a finite number of reductions starting with unity.
It follows immediately from (6.30) that $\alpha_k \|p_k\|^2 \to 0$. This means that $\|p_k\| \to 0$. Indeed, $\alpha_k' \ge \gamma > 0$ since by (6.27) $\|p(x)\|$ has an upper bound in $\Omega$. But it follows from (6.33), (6.34) that $\alpha_k$ also has a lower bound, a certain positive constant.
Thus, $p_k \to 0$. Let now $x_*$ be a limit point of the sequence. Without loss of generality, we can take that $x_k \to x_*$. Moreover, since $u_k^i$, $i \in \mathcal{J}_\delta(x_k)$, are nonnegative and their sum is equal to unity, we can, setting $u_k^i = 0$, $i \notin \mathcal{J}_\delta(x_k)$, take that $u_k^i \to u_*^i$ and $u_*^i \ge 0$, their sum being
$$\sum_{i=1}^{m} u_*^i = 1. \tag{6.35}$$
We rewrite now (6.27) and (6.26) for points $x_k$ as follows:
$$p_k + \sum_{i=1}^{m} u_k^i f_i'(x_k) = 0,$$
$$u_k^i\,\big((f_i'(x_k), p_k) + f_i(x_k) - \rho_k\big) = 0, \quad i = 1, \ldots, m,$$
$$(f_i'(x_k), p_k) + f_i(x_k) \le \rho_k, \quad i \in \mathcal{J}_\delta(x_k). \tag{6.36}$$
It follows from the last inequality (6.36), if we choose $i \in \mathcal{J}_\delta(x_k)$ such that $f_i(x_k) = F(x_k)$, that
$$\rho_k \ge f_i(x_k) - K\|p_k\| = F(x_k) - K\|p_k\|.$$
$$\sum_{i=1}^{m} u_*^i f_i'(x_*) = 0,$$
$$u_*^i\,\big(f_i(x_*) - F(x_*)\big) = 0, \quad i = 1, \ldots, m,$$
$$\sum_{i=1}^{m} u_*^i = 1, \quad u_*^i \ge 0. \tag{6.37}$$
$$p(x) + \sum_{i \in J_0(x_*)} u^i f_i'(x) = 0,$$
$$(f_i'(x), p(x)) + f_i(x) = \beta(x), \quad i \in J_0(x_*),$$
$$\sum_{i \in J_0(x_*)} u^i = 1. \tag{6.38}$$
Let $i_0$ be an index from $J_0(x_*)$ and
$$\bar{f}_i(x) = f_i(x) - f_{i_0}(x), \quad i \in J_0(x_*).$$
Then system (6.38) is equivalent to the following one:
$$p(x) + f_{i_0}'(x) + \sum_{i \in \bar{J}_0(x_*)} u^i \bar{f}_i'(x) = 0,$$
$$(\bar{f}_i'(x), p(x)) + \bar{f}_i(x) = 0, \quad i \in \bar{J}_0(x_*),$$
$$\bar{J}_0(x_*) = J_0(x_*) \setminus \{i_0\}. \tag{6.39}$$
But this system is absolutely equivalent to system (5.51.1), (5.51.2). As the proof of theorem 5.4 was reduced to a study of the properties of $p(x)$, the solution of system (5.51.1), (5.51.2), it follows that the further proof of theorem 6.5 is simply reduced to checking the conditions of theorem 5.4. But it can be easily ascertained that the assumptions of theorem 6.5 provide completely for the fulfilment of the conditions of theorem 5.4 for functions $\bar{f}_i$, and this completes the proof of the theorem.
The following theorem is an absolute analogue of theorem 5.5.
Theorem 6.6. Let the conditions of theorem 6.5 be fulfilled and, besides, the number of indices in set $J_0(x_*)$ be equal to $n + 1$. In this case, with a small $\delta$ the process
$$x_{k+1} = x_k + p(x_k) \tag{6.40}$$
converges at a quadratic rate to point $x_*$.
Proof. In the case under consideration, vector $p(x)$ is uniquely defined by the system of equations
$$(\bar{f}_i'(x), p) + \bar{f}_i(x) = 0, \quad i \in \bar{J}_0(x_*),$$
since vectors $\bar{f}_i'(x)$, $i \in \bar{J}_0(x_*)$ are linearly independent for $x$ close to $x_*$ by the assumptions. But then process (6.40) is just Newton's method for solving the system of equations
$$\bar{f}_i(x) = 0, \quad i \in \bar{J}_0(x_*).$$
7. LOCAL ACCELERATION OF CONVERGENCE
As was shown in Sec. 5, the linearization method, generally speaking, converges at the rate of a geometric progression. In a number of problems this can prove insufficient, and the problem arises of how to accelerate the convergence of the process.
In this section we shall describe methods that make this possible provided an approximation sufficiently close to the solution has been found. The last circumstance is a shortcoming of the process; there are, however, no methods at present that permit constructing a process with an asymptotically superlinear rate of convergence whatever the initial approximation, this having been achieved only in the problem of unconstrained minimization.
The methods described below are based on the following idea. The minimization problem is reduced to a certain system of nonlinear equations and then Newton's method or its modification is applied to solve this system. At the end of this section we shall describe a method that uses this idea directly, i.e. the necessary conditions for a minimum will be established and Newton's method applied to the equations obtained. Such a method has many shortcomings, of which the principal one is the necessity of calculating second derivatives of the original functions. Therefore, this method can be applied only to problems in which such derivatives are easily calculated.
A second method is based on the fact that point $x_*$ is the solution of minimization problem (5.1) only if it satisfies the equation $p(x_*) = 0$, where vector $p(x)$ is the solution of the subsidiary problem (5.4). We shall describe a method that permits solving a system of nonlinear equations without calculating derivatives. As was mentioned above, this method will converge only from a sufficiently good initial approximation.
satisfies Lipschitz' condition, i.e.
$$\|p^{i\prime}(x) - p^{i\prime}(y)\| \le L\|x - y\|, \quad i = 1, \dots, n.$$
With $A = p'(0)$ we write $\omega(x) = p(x) - Ax$ and
$$\omega(x, y) = \frac{1}{\|x - y\|}\big[p(x) - p(y) - A(x - y)\big]. \tag{7.2}$$
Suppose that matrix $A$ is nonsingular so that the following estimates hold:
$$m\|x\| \le \|Ax\| \le M\|x\| \tag{7.3}$$
where $M \ge m > 0$.
Lemma 7.1. The estimates
$$\|\omega(x)\| \le C_1\|x\|^2, \qquad \|\omega(x, y)\| \le C_2 \max\{\|x\|, \|y\|\}$$
hold.
Proof. Let $p^{i\prime}(x)$ be the gradient of function $p^i(x)$. Then by Taylor's formula we have
$$p^i(x) = p^i(0) + (p^{i\prime}(0), x) + (p^{i\prime}(z) - p^{i\prime}(0), x)$$
where $z = \theta x$, $0 \le \theta \le 1$. Using the fact that $p^i(0) = 0$ and Lipschitz' condition for $p^{i\prime}(x)$, we obtain that
$$|p^i(x) - (p^{i\prime}(0), x)| \le L\|x\|^2, \quad i = 1, \dots, n.$$
Hence, we have
$$\|\omega(x)\| = \|p(x) - p'(0)x\| \le C_1\|x\|^2.$$
Further
$$p^i(y) = p^i(x) + (p^{i\prime}(x), y - x) + (p^{i\prime}(z) - p^{i\prime}(x), y - x)$$
where $z = \theta x + (1 - \theta)y$, $0 \le \theta \le 1$. Therefore
$$p^i(y) - p^i(x) - (p^{i\prime}(0), y - x) = (p^{i\prime}(x) - p^{i\prime}(0), y - x) + (p^{i\prime}(z) - p^{i\prime}(x), y - x).$$
Hence after simple transformations (using Lipschitz' condition), we obtain
$$|p^i(y) - p^i(x) - (p^{i\prime}(0), y - x)| \le L\|x\|\,\|y - x\| + L\|z - x\|\,\|y - x\|$$
$$= L\|y - x\|\big(\|x\| + (1 - \theta)\|y - x\|\big)$$
$$\le L\|y - x\|\big((2 - \theta)\|x\| + (1 - \theta)\|y\|\big)$$
$$\le 3L\|y - x\| \max\{\|x\|, \|y\|\}.$$
The second statement of the lemma follows immediately from the last inequality.
Let now points $x_1, x_2, \dots, x_n$ be already constructed, $p(x_k) \ne 0$, $k = 1, \dots, n$, and $e_k$ be unit vectors in the direction of the $k$-th coordinate axis.
It is easily seen that $\Delta(b_1, \dots, b_n) > 0$ if and only if vectors $b_1, \dots, b_n$ are linearly independent. Note also that
A (^i, • • • i ^ n ) /— •
V »
Lemma 7.2. There is a neighbourhood of point $x_* = 0$ such that
$$\Delta(z_1, \dots, z_n) \ge \gamma > 0,$$
provided $x_1, \dots, x_n$ are in this region.
Proof. By the definition of $\omega(x, y)$, we have
$$z_k = A r_k + \omega(y_k, x_k)\|r_k\| = \|p(x_k)\|\big(A e_k + \omega(y_k, x_k)\big).$$
Therefore
$$\frac{z_k}{\|z_k\|} = \frac{A e_k + \omega(y_k, x_k)}{\|A e_k + \omega(y_k, x_k)\|}.$$
If $x_k \to 0$, then
$$\frac{z_k}{\|z_k\|} \to \frac{A e_k}{\|A e_k\|}.$$
However, it is easy to see that $\Delta(z_1, \dots, z_n)$ depends continuously on $z_k\|z_k\|^{-1}$. Therefore for $x_k$ sufficiently close to zero we have
$$\Delta(z_1, \dots, z_n) \ge \tfrac{1}{2}\Delta(A e_1, \dots, A e_n).$$
But $\Delta(A e_1, \dots, A e_n) > 0$, for vectors $A e_k$, $k = 1, \dots, n$ are simply columns of matrix $A$, and since matrix $A$ is nonsingular, its columns are linearly independent. Thus
$$\Delta(z_1, \dots, z_n) \ge \tfrac{1}{2}\Delta(A e_1, \dots, A e_n) = \gamma > 0$$
for all $x_k$ from the neighbourhood of $x_* = 0$. Q.E.D.
Let $\delta > 0$ be the radius of the region about zero in which lemmas 7.1 and 7.2 hold. Let points $x_1, \dots, x_n$ be chosen in this region.
Let us find quantities $\beta^i$, $i = 1, \dots, n$ from the system of equations
$$\sum_{k=1}^n \beta^k z_k = -p(x_n) \tag{7.4}$$
and set
$$x_{n+1} = x_n + \sum_{k=1}^n \beta^k r_k. \tag{7.5}$$
Let us estimate the norm of $x_{n+1}$. Since
$$p(x_n) = A x_n + \omega(x_n),$$
$$z_k = A r_k + \omega(y_k, x_k)\|r_k\|, \tag{7.6}$$
we obtain from (7.4) that
$$-A x_n - \omega(x_n) = \sum_{k=1}^n \beta^k A r_k + \sum_{k=1}^n \beta^k \omega(y_k, x_k)\|r_k\|$$
or
$$A x_{n+1} = -\omega(x_n) - \sum_{k=1}^n \beta^k \omega(y_k, x_k)\|r_k\|.$$
It follows from the last equality that
Note n o w that
Pft II z h 11
2
h= 1 2 i i ii ** ii
i= l
(^i? • • • j 2 ^ )
h=1
T a k i n g i n t o a c c o u n t (7.4), w e obtain
Therefore
m a x || x k ! ! < - £ - , (7.10)
i^k^n ^3
t h e n i n e q u a l i t y (7.9) y i e l d s n o w
l| X n + l | K | I * » || m
‘^ n ; . (7.12)
Lemma 7.3. If points $x_k$, $k = 1, \dots, n$ are chosen in a region about point $x_* = 0$ such that the conditions of lemmas 7.1 and 7.2 and inequality (7.10) are satisfied, then estimate (7.12) holds true.
Algorithm
We formulate now the algorithm for solving the system of equations (7.1).
Choose initial points $x_1, x_2, \dots, x_n$. Let points $x_1, \dots, x_n, \dots, x_k$ be already constructed. Then point $x_{k+1}$ is constructed by the following formula:
n
m - C 3 ||*||>-£, (7.15)
about point $x_*$, then the conditions of lemma 7.3 are satisfied for points $x_{k-n+1}, x_{k-n+2}, \dots, x_k$ and therefore the following inequality analogous to inequality (7.12) holds:
$$\|x_{k+1}\| \le \|x_k\|\,\frac{1}{m}\left[C_1\|x_k\| + \frac{C_3 C_4 \max_{1 \le i \le n}\|x_{k-n+i}\|}{\gamma\big(m - C_3 \max_{1 \le i \le n}\|x_{k-n+i}\|\big)}\right].$$
Hence it follows, taking into account (7.15), that
$$\|x_{k+1}\| \le C_5\,\|x_k\| \max_{1 \le i \le n}\|x_{k-n+i}\| \tag{7.17}$$
where
$$C_5 = \frac{1}{m}\left[C_1 + \frac{2 C_3 C_4}{\gamma m}\right].$$
But $\|x_{k-n+i}\| \le \delta_0$ by assumption. Therefore
$$\|x_{k+1}\| \le \|x_k\|\,\delta_0 C_5 < \|x_k\| \le \delta_0.$$
Q.E.D. Moreover, it follows from the last inequality, with the notation $q_0 = \delta_0 C_5$, that
$$\|x_{k+1}\| \le q_0\|x_k\|. \tag{7.18}$$
Since by (7.16) $q_0 < 1$, we have the estimate $\|x_k\| \le q_0^{k-n}\|x_n\|$, i.e. $x_k \to 0$, and this proves the first statement of the theorem.
Further, from (7.17) we obtain that
$$\frac{\|x_{k+1}\|}{\|x_k\|} \le C_5 \max_{1 \le i \le n}\|x_{k-n+i}\|. \tag{7.19}$$
Since $x_k \to 0$, the estimate means that
$$\frac{\|x_{k+1}\|}{\|x_k\|} \to 0.$$
The last relation shows that $x_k \to 0$ at a faster rate than that of any geometric progression. The theorem is proved.
We shall now give more precise bounds on the rate of convergence. We set $v_k = C_5\|x_k\|$. Then (7.17) can be written in the form
$$v_{k+1} \le v_k \max_{1 \le i \le n} v_{k-n+i}. \tag{7.20}$$
We have $v_i \le q_0 < 1$, $i = 1, \dots, n$, and therefore sequence $\{v_k\}$ decreases monotonically. This fact can be proved in an elementary way by induction on $k$. It follows that $\max_{1 \le i \le n} v_{k-n+i} = v_{k-n+1}$ and (7.21) can be rewritten in the form
$$v_{k+1} \le v_k v_{k-n+1}.$$
We denote $w_k = \ln v_k$. Then
$$w_{k+1} \le w_k + w_{k-n+1}.$$
It follows from Ostrowski's results (his theorems 12.1 and 12.2) that
(7.24)
u>h
I
where $\lambda_0$ is the greatest positive root of the equation
$$\varphi(\lambda) = \lambda^n - \lambda^{n-1} - 1 = 0. \tag{7.25}$$
I n v k + i ^ ( X o — e ) I n v h = I n u l X o " e)
<?o<1 ( 7 -2 7 >
holds.
P r o o f . R e c a l l t h a t v h = C 5 || x h ||, v h vk. These inequalities
a n d (7.26) yield i m m e d i a t e l y the result required.
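To get a feel for the order of convergence governed by equation (7.25), a small numerical sketch may help. It assumes the reading $\varphi(\lambda) = \lambda^n - \lambda^{n-1} - 1$ of (7.25); the function name `secant_order` is hypothetical, and NumPy is used only as a convenient polynomial root finder.

```python
import numpy as np

def secant_order(n):
    """Largest positive real root of lambda**n - lambda**(n-1) - 1 = 0."""
    coeffs = np.zeros(n + 1)
    coeffs[0] += 1.0   # coefficient of lambda**n
    coeffs[1] -= 1.0   # coefficient of lambda**(n-1)
    coeffs[n] -= 1.0   # constant term (coincides with the previous one when n = 1)
    roots = np.roots(coeffs)
    real_positive = [z.real for z in roots if abs(z.imag) < 1e-9 and z.real > 0]
    return max(real_positive)

for n in (1, 2, 3, 5, 10):
    print(n, secant_order(n))
```

For $n = 1$ the root is 2 and for $n = 2$ it is the golden ratio; the root decreases towards 1 as the dimension $n$ grows, so under this reading the guaranteed superlinear order weakens in higher dimensions.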
Computational Aspects. Application to the Problem of Mathematical Programming
The algorithm described in the preceding subsection is simple enough. It requires at every step the calculation of vector $p(x)$ at points $x_k$ and $y_k$ and the solving of the system of linear equations (7.14). If we denote by $Z_k$ the matrix whose columns are $z_{k-n+i}$, $i = 1, \dots, n$, then equations (7.14) can be rewritten in the form $Z_k \beta_k = -p(x_k)$, where $\beta_k$ is a column-vector whose components are $\beta_k^i$, $i = 1, \dots, n$.
It follows from the algorithm that matrices $Z_k$ and $Z_{k+1}$ differ only by one column: column $z_{k-n+1}$ is dropped, column $z_{k+1}$ is added, and the remaining columns $z_{k-n+i+1}$, $i \le n - 1$, take the places of $z_{k-n+i}$. Therefore to calculate $\beta_{k+1}$, one can proceed as in the subsections on pp. 76 and 79.
Note that the procedure leads to the accumulation of calculation errors. Therefore, if the calculation of $p(y_k)$ requires considerably more operations than the solving of system (7.14), the standard program for solving a system of linear equations should be used for calculating $\beta_k$ rather than using recursive formulas.
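The iteration can be sketched in a few lines. This is only an illustration under stated assumptions, not the book's exact formulas: it uses the relations $Z_k \beta_k = -p(x_k)$ and $x_{k+1} = x_k + \sum_i \beta_k^i r_{k-n+i}$ with $r_k = y_k - x_k$, $z_k = p(y_k) - p(x_k)$, and it assumes the auxiliary point is $y_k = x_k + \|p(x_k)\| e_j$ with the coordinate index $j$ cycling. The names `multipoint_secant` and `p_example` are hypothetical, and the linear system is solved from scratch at every step, as recommended above when evaluating $p$ is the dominant cost.

```python
import numpy as np

def multipoint_secant(p, starts, max_iter=30, tol=1e-12):
    """Derivative-free iteration for p(x) = 0 from n starting points near the root.

    Assumes len(starts) == n and p(x) != 0 at every starting point.
    """
    starts = [np.asarray(x, dtype=float) for x in starts]
    n = starts[0].size
    axes = np.eye(n)
    R, Z = [], []  # r_k = y_k - x_k and z_k = p(y_k) - p(x_k)

    def probe(x, px, k):
        # Assumed auxiliary point: y_k = x_k + ||p(x_k)|| * e_j, j cycling over axes.
        r = np.linalg.norm(px) * axes[k % n]
        return r, p(x + r) - px

    for k, x in enumerate(starts):
        r, z = probe(x, p(x), k)
        R.append(r)
        Z.append(z)

    x = starts[-1]
    px = p(x)
    for k in range(n, n + max_iter):
        if np.linalg.norm(px) < tol:
            break
        # Solve Z_k beta_k = -p(x_k) using the n most recent columns.
        beta = np.linalg.solve(np.column_stack(Z[-n:]), -px)
        x = x + np.column_stack(R[-n:]) @ beta
        px = p(x)
        if np.linalg.norm(px) >= tol:
            r, z = probe(x, px, k)
            R.append(r)
            Z.append(z)
    return x


A = np.array([[2.0, 1.0], [1.0, 3.0]])

def p_example(x):
    # Hypothetical test map with root x = 0 and Jacobian A at the root.
    return A @ x + 0.1 * np.array([x[0] ** 2, x[0] * x[1]])

root = multipoint_secant(p_example, [[0.02, -0.01], [0.01, 0.02]])
print(root, np.linalg.norm(p_example(root)))
```

Each step costs two evaluations of $p$ (at $x_k$ and $y_k$) and one $n \times n$ linear solve; no derivatives of $p$ are needed.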
We now turn again to the problem (5.1)-(5.2) discussed in Sec. 5. According to lemma 5.1, in order to find a local minimum it suffices to solve the equation $p(x) = 0$, where $p(x)$ is the solution of problem (5.4). If the assumptions of theorem 5.4 hold, then by lemmas 5.2, 5.4 in a sufficiently small region about the solution $x_*$, the conditions of theorem 7.1 are also satisfied. Therefore, the application of the algorithm described in this section makes it possible to accelerate the convergence of the linearization method. In applying this method one should use as $p(x)$ the vector that is the solution of problem (5.4).
Minimization Problem with Equality Constraints
Let us consider the problem of minimization of function $f_0(x)$ with constraints
$$f_i(x) = 0, \quad i = 1, \dots, m. \tag{7.28}$$
Let $x_*$ be the solution of the problem and the following assumptions hold.
(a) Functions $f_i(x)$ are twice continuously differentiable and their second derivatives satisfy Lipschitz' condition.
(b) At point $x_*$ the gradients $f_i'(x_*)$, $i = 1, \dots, m$ are linearly independent so that the necessary conditions for a minimum at $x_*$ are satisfied in their regular form (see Chap. I, Sec. 4). Thus, there are numbers $u_*^i$ such that
$$f_0'(x_*) + \sum_{i=1}^m u_*^i f_i'(x_*) = 0,$$
$$f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{7.29}$$
$$(f_i'(x_k), p_k) + f_i(x_k) = 0, \quad i = 1, \dots, m, \tag{7.30}$$
$$x_{k+1} = x_k + p_k,$$
$$u_{k+1}^i = u_k^i + \Delta u_k^i, \quad i = 1, \dots, m, \tag{7.31}$$
converge to $x_*$ and $u_*$ respectively at a quadratic rate, whatever the initial approximation $x_0$, $u_0^i$, $i = 1, \dots, m$ sufficiently close to the solution $x_*$, $u_*^i$, $i = 1, \dots, m$.
Proof. The process defined by formulas (7.30), (7.31) is simply the one generated by Newton's method when it is applied to system (7.29). Therefore, in order to prove the theorem it suffices to check, using remark 1 on p. 215, that the matrix of the first derivatives of the left-hand sides of (7.29) with respect to all $x$ and $u^i$ is nonsingular.
If we denote by $f'(x)$ a matrix whose rows are $f_i'(x)$, $i = 1, \dots, m$, then it is easy to see that the matrix of the first derivatives of the left-hand sides of (7.29) is the following block matrix of order $n + m$:
$$\begin{pmatrix} L''(x_*, u_*) & f'^{*}(x_*) \\ f'(x_*) & 0 \end{pmatrix}.$$
In order to ascertain that this matrix is nonsingular, it is sufficient to show that the homogeneous system of equations
$$L''(x_*, u_*)\,y + f'^{*}(x_*)\,\bar{u} = 0, \qquad f'(x_*)\,y = 0 \tag{7.32}$$
has only a zero solution. In this system, $y \in E^n$, and $\bar{u}$ is a vector whose components are $\bar{u}^i$, $i = 1, \dots, m$. Let $y$, $\bar{u}$ be the solution of system (7.32). Performing scalar multiplication of the first of equations (7.32) by $y$, we obtain on the basis of the second equation that
$$(y, L''(x_*, u_*)y) + (y, f'^{*}(x_*)\bar{u}) = (y, L''(x_*, u_*)y) + (f'(x_*)y, \bar{u}) = (y, L''(x_*, u_*)y) = 0.$$
But by assumption (c), the last expression shows that $y = 0$. Therefore, the first of relations (7.32) can be rewritten as follows:
$$f'^{*}(x_*)\bar{u} = \sum_{i=1}^m \bar{u}^i f_i'(x_*) = 0,$$
and this is possible only if $\bar{u}^i = 0$, $i = 1, \dots, m$ since vectors $f_i'(x_*)$ are linearly independent by assumption (b).
Thus we have shown that the conditions of convergence of Newton's method are fulfilled and consequently the theorem is proved.
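As an illustration, Newton's method applied to the system of necessary conditions can be sketched as follows. This is a minimal sketch under assumptions: the Lagrange function is taken as $L(x, u) = f_0(x) + \sum_i u^i f_i(x)$, each step solves the block system with the matrix shown in the proof, and the function name `newton_kkt` together with the test problem (minimize $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$, with solution $x_* = (-1, -1)$, $u_* = 1/2$) are chosen here only for demonstration.

```python
import numpy as np

def newton_kkt(grad_f0, hess_f0, g, jac_g, hess_g, x, u, iters=20, tol=1e-12):
    """Newton iteration on f0'(x) + sum_i u_i f_i'(x) = 0, f_i(x) = 0."""
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    n, m = x.size, u.size
    for _ in range(iters):
        J = jac_g(x)                         # m x n, rows are the gradients f_i'(x)
        residual = np.concatenate([grad_f0(x) + J.T @ u, g(x)])
        if np.linalg.norm(residual) < tol:
            break
        H = hess_g(x)                        # list of m Hessians of the constraints
        hess_L = hess_f0(x) + sum(u[i] * H[i] for i in range(m))
        K = np.block([[hess_L, J.T], [J, np.zeros((m, m))]])
        step = np.linalg.solve(K, -residual)
        x = x + step[:n]                     # x_{k+1} = x_k + p_k
        u = u + step[n:]                     # u_{k+1} = u_k + delta u_k
    return x, u

# Demonstration problem (an assumption of this sketch, not taken from the text).
x_opt, u_opt = newton_kkt(
    grad_f0=lambda x: np.array([1.0, 1.0]),
    hess_f0=lambda x: np.zeros((2, 2)),
    g=lambda x: np.array([x @ x - 2.0]),
    jac_g=lambda x: 2.0 * x.reshape(1, -1),
    hess_g=lambda x: [2.0 * np.eye(2)],
    x=[-1.2, -0.8],
    u=[0.4],
)
print(x_opt, u_opt)
```

The second block row of the linear system is exactly the linearized constraint $(f_i'(x_k), p_k) + f_i(x_k) = 0$ of (7.30); the need for `hess_f0` and `hess_g` reflects the shortcoming noted at the start of the section, namely that second derivatives of the original functions must be available.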
8. METHOD OF PENALTY FUNCTIONS
The method of penalty functions is one of the simplest and most widely known methods for solving the problem of mathematical programming. The basic idea of the method consists in approximately reducing the constrained minimization problem to the unconstrained minimization of a certain function. The subsidiary function is chosen so that it coincides with the function to be minimized in the admissible domain and increases steeply outside it.
Suppose we study the problem of minimization of function $f_0(x)$, $x \in E^n$ with constraints
$$f_i(x) \le 0, \quad i = 1, \dots, m. \tag{8.1}$$
All the functions $f_i(x)$, $i = 0, 1, \dots, m$ are continuous.
We introduce the notations
$$\varphi_0(t) = \begin{cases} t^2, & t \ge 0, \\ 0, & t < 0; \end{cases} \qquad \varphi_1(t) = \begin{cases} t, & t \ge 0, \\ 0, & t < 0. \end{cases} \tag{8.2}$$
Let us compose a function
$$\psi(x, r) = r \sum_{i=1}^m \varphi_0(f_i(x)). \tag{8.3}$$
Substantiation of the Penalty Function Method
Let a certain continuous function $\psi(x, r)$ have the following properties:
(1) $\psi(x, r) = 0$ if $x \in \Omega$, $\psi(x, r) > 0$, $x \notin \Omega$, and $\psi(x_k, r_k) \to +\infty$ if $x_k \to x_0$, $x_0 \notin \Omega$, $r_k \to +\infty$;
(2) $\psi(x, r)$ increases monotonically with increasing $r$.
Theorem 8.1. Let the set
$$\Omega_C(r) = \{x: F(x, r) \le C\},$$
$$F(x, r) = f_0(x) + \psi(x, r)$$
be compact. Then function $F(x, r)$ assumes its minimum $m(r)$ over all $x$ at a certain point $x(r)$ and $m(r) \le m$, where
$$m = \min_{x \in \Omega} f_0(x), \qquad m(r) \to m,$$
and $m(r)$ increases monotonically with increasing $r$. Moreover, if $x(r_k) \to x_0$, $k \to \infty$, $r_k \to \infty$, then $x_0$ is the solution of the original problem.
Proof. Let $\bar{x}$ be a point of $\Omega$, $C = f_0(\bar{x})$. Then set $\bar{\Omega}$ of those $x \in \Omega$ for which $f_0(x) \le C$ is a closed subset of compact set $\Omega_C(r)$. Indeed, for $x \in \bar{\Omega}$ by the properties of $\psi(x, r)$ we have
$$F(x, r) = f_0(x) + \psi(x, r) = f_0(x) \le C,$$
i.e. $x \in \Omega_C(r)$. But it is clear that the minimum of $f_0(x)$ in $\Omega$ must be in subset $\bar{\Omega}$. Therefore, we should seek the minimum of continuous function $f_0(x)$ in compact set $\bar{\Omega}$. Since a continuous function attains its minimum in a compact set, it follows that problem (8.1) is solvable. An analogous reasoning shows that function $F(x, r)$ assumes its minimum $m(r)$ at a certain point $x(r)$.
Let $x_*$ be a certain point of the minimum of $f_0(x)$ in $\Omega$. Then
$$F(x_*, r) = f_0(x_*) + \psi(x_*, r) = f_0(x_*).$$
Hence
But $f_0(x(r_k)) \to f_0(x_0)$. Therefore
$$\lim_{k \to \infty} \psi(x(r_k), r_k) = 0.$$
Thus,
$$\lim_{k \to \infty} m(r_k) = \lim_{k \to \infty} f_0(x(r_k)) + \lim_{k \to \infty} \psi(x(r_k), r_k) = f_0(x_0) \ge m.$$
Then $\lim_{r \to \infty} x(r) = x_*$ and
where $u_*^i$ are Lagrange multipliers of problem (8.1).
Remark. Recall that according to theorem 4.1 (Chap. I) the necessary conditions for a minimum are fulfilled at point $x_*$ in the following form:
$$f_0'(x_*) + \sum_{i=1}^m u_*^i f_i'(x_*) = 0,$$
$$u_*^i \ge 0, \quad u_*^i f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{8.6}$$
Proof. First, we shall show that $x(r) \to x_*$. Suppose that the opposite holds. Then there is a sequence $r_k \to \infty$ such that $\|x(r_k) - x_*\| \ge \delta_0 > 0$. Since in proving theorem 8.1 it was shown that $x(r_k) \in \Omega_m(r_1)$ and set $\Omega_m(r_1)$ is compact, we can take, without loss of generality (if required we can take a subsequence), that $x(r_k) \to x_{**}$ and clearly $\|x_{**} - x_*\| \ge \delta_0 > 0$. However, it follows from theorem 8.1 that $x_{**}$ is the solution of problem (8.1). Thus we have obtained two different solutions of problem (8.1). But this contradicts the assumption.
Thus, it has been shown that $x(r) \to x_*$. We turn to the proof of the second statement of the theorem. As $x(r)$ is the minimum point of function $F(x, r)$, at this point the gradient of the function
$$F(x, r) = f_0(x) + r \sum_{i=1}^m \varphi_0(f_i(x))$$
must be equal to zero. Simple calculations show that we come to the equality
$$f_0'(x(r)) + \sum_{i=1}^m u^i(r) f_i'(x(r)) = 0,$$
$$f_0'(x_*) + \sum_{i \in J_0(x_*)} u_*^i f_i'(x_*) = 0. \tag{8.8}$$
Convex Programming
In the case of the problem of convex programming, the estimates of the approximation of $x(r)$ to the sought solution $x_*$ can be made more precise.
Theorem 8.3. Let all the functions $f_i(x)$, $i = 0, 1, \dots, m$ be convex, the conditions of theorem 8.1 for function $\psi(x, r)$ in form (8.3) be satisfied and, besides, the necessary conditions in the form of the Kuhn-Tucker theorem be fulfilled at point $x_*$ which is the solution of problem (8.1), i.e. there are numbers $u_*^i \ge 0$ such that
$$f_0(x_*) \le \sum_{i=1}^m u_*^i f_i(x) + f_0(x) \quad \text{for all } x,$$
$$u_*^i f_i(x_*) = 0, \quad i = 1, \dots, m. \tag{8.9}$$
Then
$$f_i(x(r)) \le \frac{\bar{u}}{r}, \quad \text{if } f_i(x(r)) > 0 \tag{8.10}$$
(8.11)
where
$$\bar{u} = \sqrt{\sum_{i=1}^m (u_*^i)^2}.$$
Proof. We introduce the notation $J_0(x) = \{i: f_i(x) \ge 0,\ i = 1, \dots, m\}$. As
$$F(x(r), r) = f_0(x(r)) + r \sum_{i=1}^m \varphi_0(f_i(x(r))) \le f_0(x_*),$$
it follows from (8.9) that
$$f_0(x(r)) + r \sum_{i=1}^m \varphi_0(f_i(x(r))) \le f_0(x(r)) + \sum_{i=1}^m u_*^i f_i(x(r))$$
or
$$r \sum_{i=1}^m \varphi_0(f_i(x(r))) \le \sum_{i=1}^m u_*^i f_i(x(r)).$$
But for $i \notin J_0(x(r))$
$$\varphi_0(f_i(x(r))) = 0, \quad f_i(x(r)) < 0,$$
and $\varphi_0(f_i(x(r))) = f_i^2(x(r))$ for $i \in J_0(x(r))$. Therefore the inequality obtained may be made stronger:
$$r \sum_{j \in J_0(x(r))} f_j^2(x(r)) \le \sum_{j \in J_0(x(r))} u_*^j f_j(x(r)) \le \bar{u} \sqrt{\sum_{j \in J_0(x(r))} f_j^2(x(r))},$$
whence, for $i \in J_0(x(r))$,
$$f_i(x(r)) \le \sqrt{\sum_{j \in J_0(x(r))} f_j^2(x(r))} \le \frac{\bar{u}}{r} \tag{8.12}$$
and it follows that (8.10) holds.
F u r t h e r , for all x
m
b y t h e d e f i n i t i o n o f <p0 (;r) a n d u 0 ( x ) . T h u s
m __
fa (r)) + r 2 <Po (/ * ( * (r)))
1=1
B u t it f o l l o w s f r o m ( 8 . 1 2 ) a n d (8.13) that
(/i ( * ( * ■ ) ) ) < £ .
Therefore
Computational Aspects
The methods expounded above reduce problem (8.1) to the minimization of function $F(x, r)$. It is possible now, in order to obtain an approximate solution, to use one of the methods described in Chap. II. The following specific circumstances should, however, be taken into account. If functions $f_i(x)$ are not convex, then function $F(x, r)$ is, in general, also not convex with respect to $x$. Therefore, it can have local minima, while in all the preceding text it was assumed that we were determining the global minimum $x(r)$.
As all the methods of Chap. II are meant for finding a local minimum, if the function to be minimized is not convex, it is the local minimum that will be found if the initial approximation is poor. This affects the convergence and is an important shortcoming of the penalty function method in its application to nonconvex problems.
If the problem under consideration is one of convex programming with the use of function (8.3) as $\psi(x, r)$, then it is easily ascertained that $F(x, r)$ is convex too; therefore, the difficulty mentioned above is removed. However, another difficulty arises. The fact is that one should take $r$ sufficiently great in order to obtain a good approximation; this follows from the estimates obtained above. In this case all derivatives of $F(x, r)$ with respect to $x$ will also be great, for they are proportional to $r$. But it was established in analysing all methods described in Chap. II whose rate of convergence is superlinear that the size of the region in which the rate of convergence becomes superlinear is in inverse proportion to Lipschitz' constant of second derivatives, i.e. in the case under consideration this region will also be small and even a method which, in the limit, theoretically converges rapidly can become ineffective. Moreover, as function $\varphi_0(t)$ has no second derivative at $t = 0$, $F(x, r)$ calculated by formula (8.3) with the use of $\psi(x, r)$ will also have no second derivatives at points $x$ for which $f_i(x) = 0$ for a certain $i$. But if the solution $x_*$ lies on the boundary of the domain, it is this case that will take place. On the other hand, all methods which converge at a fast rate require that the function being minimized have second derivatives at least in a certain region about the point sought.
All of the difficulties mentioned are, as a rule, observed in calculations in practice and this lowers the effectiveness of the method.
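The behaviour of the quadratic penalty can be seen on a one-dimensional toy problem. The sketch below is an illustration under assumptions: it takes $\psi(x, r) = r \sum_i \varphi_0(f_i(x))$ with $\varphi_0(t) = t^2$ for $t > 0$ and $0$ otherwise, uses the problem of minimizing $-x$ subject to $x \le 0$ (solution $x_* = 0$, multiplier $u_* = 1$), and minimizes $F(x, r)$ by a plain golden-section search. The names `phi0`, `penalized` and `golden_min` are hypothetical helpers.

```python
import math

def phi0(t):
    """Quadratic penalty term: t**2 for violated constraints, 0 otherwise."""
    return t * t if t > 0 else 0.0

def penalized(x, r, f0, constraints):
    """F(x, r) = f0(x) + r * sum_i phi0(f_i(x))."""
    return f0(x) + r * sum(phi0(f(x)) for f in constraints)

def golden_min(fun, a, b, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [a, b]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if fun(c) < fun(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

f0 = lambda x: -x
f1 = lambda x: x          # constraint f1(x) <= 0

for r in (1.0, 10.0, 100.0, 1000.0):
    xr = golden_min(lambda x: penalized(x, r, f0, [f1]), -1.0, 1.0)
    # Columns: r, minimizer x(r), constraint violation, multiplier estimate 2 r f1(x(r)).
    print(r, xr, max(f1(xr), 0.0), 2 * r * max(f1(xr), 0.0))
```

For this problem the minimizer is $x(r) = 1/(2r)$: it approaches the solution from outside the admissible domain, the violation decays only like $1/r$, and $2 r f_1(x(r))$ stays equal to the multiplier $u_* = 1$. A good approximation therefore requires large $r$, which is exactly the regime in which the derivatives of $F(x, r)$ become large.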
$$P(x, r) = f_0(x) - r \sum_{i=1}^m \frac{1}{f_i(x)}, \quad r > 0$$
defined inside set $\Omega$. It is easily ascertained that $P(x, r)$ is convex with respect to $x$ inside $\Omega$. If we denote by $x(r)$ the minimum point of $P(x, r)$ in $\Omega$, then with sufficiently general assumptions analogous to those of theorems 8.1 and 8.2 it can be shown that
$$\lim_{r \to +0} x(r) = x_*,$$
$$\lim_{r \to +0} \frac{r}{f_i^2(x(r))} = u_*^i, \quad i = 1, \dots, m.$$
Thus, the approximate method of solving problem (8.1) again has been reduced to the problem of unconstrained minimization of function $P(x, r)$.
Of the specific traits of this subsidiary problem the same things can be said that were said of the penalty function method in the subsection on p. 242. In order to illustrate these traits and to show why even effective methods of minimization of $F(x, r)$ or $P(x, r)$ may fail to provide for a fast rate of convergence, we shall adduce a simple example.
Let $f_0(x) = -x$, $f_1(x) = x$, $x \in E^1$, i.e. we are solving the problem of minimization of $-x$ subject to $x \le 0$. The obvious solution is $x_* = 0$:
$$P(x, r) = -x - \frac{r}{x}.$$
Equating to zero the derivative of $P(x, r)$ with respect to $x$ we obtain
$$P'(x, r) = -1 + \frac{r}{x^2} = 0. \tag{8.14}$$
Hence $x(r) = -\sqrt{r}$. Let us now apply to the solving of (8.14) a method that converges at a quadratic rate, Newton's method, i.e. obtain approximations by the formula
$$x_{k+1} = x_k - \frac{P'(x_k, r)}{P''(x_k, r)}.$$
Substituting expressions for $P'(x, r)$ and $P''(x, r)$ we obtain after simple transformations
It is c l e a r f r o m f o r m u l a ( 8 . 1 5 ) t h a t t h e d e v i a t i o n o f x k f r o m t h e
s o l u t i o n x (r) = — Y r t e n d s t o z e r o m o n o t o n i c a l l y o n l y w i t h i n i
tial p o i n t s s u c h t h a t
As xh < c O ( t h e a p p r o x i m a t i o n is s o u g h t i n t h e r e g i o n £ < 0 ) , we
have
T h u s t h e last f o r m u l a s h o w s t h a t a q u a d r a t i c rate of c o n v e r g e n c e of
N e w t o n ’s m e t h o d w i l l b e g u a r a n t e e d o n l y i n a d o m a i n s u c h t h a t i n
it X h d e v i a t e s f r o m t h e s o l u t i o n b y n o t m o r e t h a n ] A r , i.e. t h e d o m a i n
o f c o n v e r g e n c e o f N e w t o n ’s m e t h o d t e n d s t o z e r o w i t h d e c r e a s i n g r ,
a n d t h e s i z e o f t h i s d o m a i n is o f t h e o r d e r o f m a g n i t u d e o f t h e d e v i a
t i o n o f x (r) f r o m t h e t r u e s o l u t i o n o f t h e p r o b l e m — x + . T h i s i n d i c a t e s
that the greater a m o u n t of calculation w o r k will b e required to
h i t t h e r e g i o n o f c o n v e r g e n c e o f N e w t o n ’s m e t h o d , w h i l e i n c a s e s
w h e r e N e w t o n ’s m e t h o d h a s a g o o d c o n v e r g e n c e i t i s n o m o r e n e c e s
sary as the a p p r o x i m a t i o n o b tai ned deviates f r o m x * b y as m u c h
a s x (r) d o e s .
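The shrinking domain of convergence can be observed numerically. The following is a minimal sketch for the example $P(x, r) = -x - r/x$ only; the function names `newton_step` and `newton_iterates`, the value of `r` and the two starting points are illustrative assumptions, not part of the text.

```python
import math


def newton_step(x, r):
    """One Newton step for the equation P'(x, r) = -1 + r / x**2 = 0."""
    d1 = -1.0 + r / (x * x)        # P'(x, r)
    d2 = -2.0 * r / (x * x * x)    # P''(x, r)
    return x - d1 / d2


def newton_iterates(x0, r, steps):
    """Return the list x_0, x_1, ..., x_steps of Newton approximations."""
    xs = [x0]
    for _ in range(steps):
        xs.append(newton_step(xs[-1], r))
    return xs


if __name__ == "__main__":
    r = 1e-4                       # assumed barrier parameter
    target = -math.sqrt(r)         # x(r) = -sqrt(r)
    # Start within sqrt(r) of x(r): the iterates settle on x(r).
    good = newton_iterates(-0.015, r, 10)
    print("deviation after 10 steps:", abs(good[-1] - target))
    # Start farther than sqrt(r) from x(r): the first step already
    # leaves the region x < 0 in which the approximation is sought.
    print("first step from -0.05:", newton_step(-0.05, r))
```

With a smaller `r` the admissible starting interval around $-\sqrt{r}$ narrows accordingly, which is the difficulty the example is meant to expose.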
9. PROJECTION METHODS WITH RESTORATION OF TIES
Construction of the Methods
Consider the problem of minimization of function $f_0(x)$ with the following conditions:
$$f_i(x) = 0, \quad i = 1, \ldots, m, \quad m < n. \tag{9.1}$$
We set $g = (f_1, \ldots, f_m)$, $S_g = \{x : g(x) = 0\}$ and suppose that all functions $f_0(x), f_1(x), \ldots, f_m(x)$ are continuously differentiable and $S_g$ is a smooth ($(n - m)$-dimensional) manifold, i.e. that at any point $x \in S_g$ the rank of matrix $g'(x)$ is equal to $m$ ($g'(x) = \{\partial f_i(x)/\partial x^j\}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, $i$ is the row index). Consequently, at any point $x \in S_g$ a hyperplane tangent to $S_g$ can be constructed:
$$g'(x)(\hat{x} - x) = 0. \tag{9.2}$$
Further on, we denote this hyperplane (i.e. the set of points $\hat{x}$ that satisfy equation (9.2)) by $T(x)$.
One possible approach to the construction of iterative processes for solving the problem formulated is based on the following considerations.
Let $x_0$ be an arbitrary point of $S_g$ such that the gradient $f_0'(x_0)$ is not orthogonal to the hyperplane $T(x_0)$ (i.e. the necessary condition for an extremum of function $f_0(x)$ on the manifold $S_g$ is not fulfilled at point $x_0$). Then in plane $T(x_0)$ there are infinitely many directions of descent of $f_0(x)$ (i.e. there are infinitely many directions $\hat{x}_0 - x_0$ which belong to $T(x_0)$ and are such that $(f_0'(x_0), \hat{x}_0 - x_0) < 0$). Suppose we have determined one of these directions $v_0 = \hat{x}_0 - x_0$ and constructed point $\hat{x}_0(\alpha) = x_0 + \alpha v_0$ such that $f_0(\hat{x}_0(\alpha)) < f_0(x_0)$. Point $\hat{x}_0(\alpha)$ no longer satisfies the equations of conditions (9.1). However, if the value of parameter $\alpha$ is sufficiently small (the quantity $\|x_0 - \hat{x}_0(\alpha)\|$ is small), then using point $\hat{x}_0(\alpha)$ we can construct in several ways a point $x_1 \in S_g$ such that
$$f_0(x_1) < f_0(x_0). \tag{9.3}$$
This statement is based on the fact that we can choose a point $x_1(\alpha)$ on the smooth manifold $S_g$ (and not a single one) so that the following condition is fulfilled:
$$x_1(\alpha) - x_0 = \hat{x}_0(\alpha) - x_0 + \omega_0(\alpha)$$
where
$$\|\omega_0(\alpha)\| = \|x_1(\alpha) - \hat{x}_0(\alpha)\| = o(\|\hat{x}_0(\alpha) - x_0\|). \tag{9.4}$$
(This can be proved strictly by using the theorem of mapping on one another of the region about point $x_0$ in manifold $S_g$ and in the tangent manifold $T(x_0)$; the theorem holds in space $E^n$; see L. A. Lusternik and V. I. Sobolev.)
If (9.4) holds, we have, since $f_0(x)$ is differentiable (writing $\hat{x}_0$ for $\hat{x}_0(\alpha)$ and $x_1$ for $x_1(\alpha)$),
$$f_0(x_1) = f_0(x_0) + (f_0'(x_0), x_1 - x_0) + o(\|x_1 - x_0\|)$$
$$= f_0(x_0) + (f_0'(x_0), \hat{x}_0 - x_0) + (f_0'(x_0), x_1 - \hat{x}_0) + o(\|x_1 - x_0\|)$$
$$= f_0(x_0) + (f_0'(x_0), \hat{x}_0 - x_0) + o_1(\|\hat{x}_0 - x_0\|).$$
Hence if parameter $\alpha$ is sufficiently small, the inequality (9.3) is satisfied.
By constructing point $x_1 \in S_g$ at which condition (9.3) is fulfilled, we have performed in essence an iteration of a certain process of descent for constructing successive approximations to the solution. Thus the $k$-th iteration of the process of the type being described consists in the following.
1. The direction of descent $v_k = \hat{x}_k - x_k$ of function $f_0(x)$ in the tangent hyperplane $T(x_k)$ is determined.
2. A step of definite length is made in the direction $v_k$: $\hat{x}_k(\alpha) = x_k + \alpha v_k$ (so that $f_0(\hat{x}_k) < f_0(x_k)$).
3. Using point $\hat{x}_k(\alpha)$, point $x_{k+1} \in S_g$ is determined such that condition $f_0(x_{k+1}) < f_0(x_k)$ is satisfied.
It is clear from the foregoing that we can choose different directions of descent in plane $T(x_k)$ for moving from point $x_k$. The choice of quantity $\alpha_k$ and the final step of the iteration (construction of point $x_{k+1}$) are also not uniquely determined. Performing in different ways each of the three stages of the iteration we can construct a whole class of processes of descent of the type described.
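The three stages can be traced on a small example. This is only a sketch under assumptions that are not in the text: the test problem (minimize $x^1 + x^2$ on the unit circle), all function names, the restoration rule (keep one coordinate of the tangent point and solve the constraint for the other) and the halving of $\alpha$ until $f_0$ decreases are illustrative choices.

```python
import math


def f0(x):
    return x[0] + x[1]


def grad_f0(x):
    return (1.0, 1.0)


def grad_g(x):
    # Single tie g(x) = x1**2 + x2**2 - 1 = 0 (assumed example).
    return (2.0 * x[0], 2.0 * x[1])


def tangent_descent(x):
    """Stage 1: project the antigradient of f0 onto the tangent line at x."""
    c, a = grad_f0(x), grad_g(x)
    t = (c[0] * a[0] + c[1] * a[1]) / (a[0] * a[0] + a[1] * a[1])
    return (-(c[0] - t * a[0]), -(c[1] - t * a[1]))


def restore(x_hat, x):
    """Stage 3: keep one coordinate of x_hat, solve g = 0 for the other."""
    j = 0 if abs(x[0]) >= abs(x[1]) else 1   # dependent coordinate
    i = 1 - j                                # independent coordinate
    z = x_hat[i]
    if abs(z) >= 1.0:
        return None                          # no point of the circle above z
    out = [0.0, 0.0]
    out[i] = z
    out[j] = math.copysign(math.sqrt(1.0 - z * z), x[j])
    return tuple(out)


def iterate(x, alpha0=1.0):
    """One iteration: direction, step (stage 2), restoration, descent check."""
    v = tangent_descent(x)
    alpha = alpha0
    while alpha > 1e-12:
        x_hat = (x[0] + alpha * v[0], x[1] + alpha * v[1])
        x_new = restore(x_hat, x)
        if x_new is not None and f0(x_new) < f0(x):
            return x_new
        alpha *= 0.5
    return x                                 # no further decrease found


def solve(x0, iters=50):
    x = x0
    for _ in range(iters):
        x = iterate(x)
    return x


if __name__ == "__main__":
    print(solve((1.0, 0.0)))
```

The dependent coordinate is taken to be the one with the larger absolute value so that the constraint can be solved for it without dividing by a quantity near zero.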
Consider now several possible methods of choosing vector $v_k$. We can take as vector $v_k$ the projection of the antigradient $-f_0'(x_k)$ onto the plane $T(x_k)$. The construction of such a vector is equivalent to the solving of the problem of minimization of function
$$F_k(x) = (f_0'(x_k), x - x_k) + \frac{1}{2}\|x - x_k\|^2 \tag{9.5}$$
provided that $x \in T(x_k)$. Applying the method of Lagrange multipliers, we find that
$$v_k = -\bigl(I - g'^{*}(g' g'^{*})^{-1} g'\bigr) f_0'(x_k) \tag{9.6}$$
where $g' = g'(x_k)$.
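The Lagrange-multiplier step behind (9.6) can be written out as follows. The symbols $c$, $d$ and $\lambda$ are introduced here only for this derivation; the inverse exists because the rank of $g'$ is assumed equal to $m$.

```latex
% Notation (assumed for this sketch): c = f_0'(x_k), d = x - x_k, g' = g'(x_k).
\begin{aligned}
&\min_{d}\; (c, d) + \tfrac12 \|d\|^2 \quad \text{subject to} \quad g' d = 0, \\
&L(d, \lambda) = (c, d) + \tfrac12 \|d\|^2 + (\lambda, g' d), \\
&\nabla_d L = c + d + g'^{*}\lambda = 0
   \;\Longrightarrow\; d = -c - g'^{*}\lambda, \\
&g' d = 0
   \;\Longrightarrow\; g' c + g' g'^{*} \lambda = 0
   \;\Longrightarrow\; \lambda = -(g' g'^{*})^{-1} g' c, \\
&d = -\bigl(I - g'^{*}(g' g'^{*})^{-1} g'\bigr) c .
\end{aligned}
```

The resulting $d$ is the vector $v_k$ of (9.6): it satisfies $g' d = 0$, and $(c, d) = -\|d\|^2 \le 0$, so it is a direction of descent whenever it is nonzero.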
More effective projection methods with restoration of ties can be constructed by choosing as $v_k$ a vector that minimizes the function
$$F_k(x) = (f_0'(x_k), x - x_k) + \frac{1}{2}\bigl(f_0''(x_k)(x - x_k), x - x_k\bigr) \tag{9.7}$$
on plane $T(x_k)$ (there is such a vector if $F_k$ is a convex function). Since in this case the quadratic approximation to the function being minimized is in effect used for the construction of the direction of motion, we shall call the methods in which $v_k$ is constructed in the way described methods of the second order.
Consider now the method of restoration of ties (the third stage of an iteration) which will be used in what follows.
Let the system of equations (9.1) in a certain region about any point $x \in S_g$ define the function $y = y(z)$, where $y$ is an $m$-dimensional vector of coordinates and $z$ is an $(n - m)$-dimensional vector. Without loss of generality, we can take $y = (x^1, \ldots, x^m)$, $z = (x^{m+1}, \ldots, x^n)$. By the theorem of implicit functions, it is sufficient for the existence of function $y(z)$ and its derivatives that at any point $x \in S_g$ we have the determinant
$$|g_y(x)| = \left|\left\{\frac{\partial f_i}{\partial x^j}\right\}\right| \ne 0, \quad i, j = 1, \ldots, m. \tag{9.8}$$
Since the process of the (9.9) type can be considered to be a method of minimization of function $\varphi(z)$, it is easy to see that vector $\varphi'(z_k)$ can be taken as $p_k$; then sequence (9.9) will be the gradient method of minimization of $\varphi(z)$.
(9.12)
., m .
The weakening of the requirements to functions $f_i$ is that at different points of set $S_g$ different determinants may be nonzero. In this case, the coordinates of point $x \in S_g$ which form vector $z$ and vector-function $y(z)$ can be, speaking generally, different at different points of manifold $S_g$: $z = (x^{j_{m+1}}, \ldots, x^{j_n})$, $y = (x^{j_1}, \ldots, x^{j_m})$. Taking this into account we can, as before, use formula (9.9) for the restoration of ties. Each step of process (9.9) can be treated as a step of the process of minimization of a certain function $\varphi(x^{j_{m+1}}, \ldots, x^{j_n})$ for which the corresponding vector $p_k$ is the direction of descent.
Methods of the (9.9) type will be studied below. It will be convenient to denote any vector-function $(x^{j_1}, \ldots, x^{j_m})$ by $y$ and a vector of independent variables by $z$ (as we did in fulfilling the condition (9.8)). Accordingly, any of the determinants $|\{\partial f_i/\partial x^{j}\}|$ of order $m$ will be denoted by $|g_y|$ and function $f_0(z, y(z))$ by $\varphi(z)$. The absolute value of function $|g_y(x)|$ shall be denoted by $|g_y(x)|_A$.
In the following two subsections we shall study the properties of the methods of the first and second order. In the fourth subsection we shall consider methods of dual and conjugate directions for the minimization of $\varphi(z)$ (or the algorithms based on methods of this type). From the viewpoint of practical computations it is precisely these algorithms that are of the greatest interest.
Methods of the First Order
We shall study the properties of methods based on the linearization of function $f_0(x)$ and ties $f_i$, $i = 1, \ldots, m$.
Consider the algorithm whose every step is a step of the gradient method for minimization of a certain function $\varphi(z)$:
$$z_{k+1} = z_k - \alpha_k \varphi'(z_k), \quad y_{k+1} = y(z_{k+1}) \tag{9.13}$$
where $z_k$ is a vector corresponding to the determinant $|g_y(x)|$ which has at point $x_k \in S_g$ the maximum absolute value of all the determinants $|g_y|$, the gradient $\varphi'(z_k)$ is calculated by formula (9.10) and parameter $\alpha$ can be determined by one of the methods described in studying gradient methods (Sec. 1, Chap. II). We shall choose as $\alpha_k$ the maximum value of the parameter, obtained by successive reductions of a certain positive constant, which satisfies
$$f_0(z, y(z)) - f_0(z_k, y_k) \le -\varepsilon \alpha \|\varphi'(z_k)\|^2, \quad 0 < \varepsilon < 1 \tag{9.14}$$
where $z = z_k - \alpha \varphi'(z_k)$ (this is an analogue of the method of choosing $\alpha_k$ according to condition (1.2), Chap. II).
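The rule for choosing the step by successive reductions can be sketched as below. The function name `choose_step`, its parameters (`alpha0`, `eps`, `shrink`, `max_reductions`) and the halving factor are assumptions made for illustration; `phi` stands for the reduced function $\varphi(z) = f_0(z, y(z))$, which the caller is assumed to be able to evaluate.

```python
def choose_step(phi, grad, z, alpha0=1.0, eps=0.5, shrink=0.5, max_reductions=60):
    """Largest alpha among alpha0 * shrink**j that satisfies
    phi(z - alpha * grad) - phi(z) <= -eps * alpha * ||grad||**2,
    i.e. a condition of the form (9.14). Returns 0.0 if none is found."""
    g2 = sum(g * g for g in grad)          # ||phi'(z_k)||**2
    base = phi(z)
    alpha = alpha0
    for _ in range(max_reductions):
        trial = [zi - alpha * gi for zi, gi in zip(z, grad)]
        if phi(trial) - base <= -eps * alpha * g2:
            return alpha
        alpha *= shrink                    # successive reduction of the step
    return 0.0


if __name__ == "__main__":
    # Assumed test function phi(z) = 2 * z**2 with gradient 4 * z at z = 1.
    print(choose_step(lambda z: 2.0 * z[0] ** 2, [4.0], [1.0]))
```

Starting from a fixed positive constant and only reducing it keeps the number of evaluations of $\varphi$ per iteration small when the constant is well matched to the problem.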
Theorem 9.1. If functions $f_0(x)$ and $f_i(x)$, $i = 1, \ldots, m$ are twice continuously differentiable and, besides, functions $f_i$ are such that condition (9.12) is fulfilled, and set $S = S_g \cap S_0$ ($S_0 = \{x \mid f_0(x) \le f_0(x_0)\}$) is bounded with an arbitrary choice of point $x_0$, then on sequence (9.13) $f_0(x_{k+1}) \le f_0(x_k)$ and $\|\varphi'(z_k)\| \to 0$ as $k \to \infty$.
Proof. The possibility of constructing sequence (9.13) follows from condition (9.12): with sufficiently small values of parameter $\alpha$, point $z_{k+1}$ lies in the region about point $z_k$ where function $y(z)$ is defined. In this region, by the assumptions of the theorem, function $\varphi(z) = f_0(z, y(z))$ is twice continuously differentiable. Taking this into account, the following estimate holds:
This estimate shows that the equality $\|x_{k+1} - x_k\| = \rho$ holds with $\alpha_k \ge \rho/(N_4 N_2)$, i.e. that we can choose as $\delta$ any constant not exceeding $\rho/(N_4 N_2)$.
Using now inequality (9.15) (and taking into account that derivative $\varphi''(z)$ is bounded), it is easy to ascertain that inequality (9.14) will certainly hold with $\alpha_k = \min\{\ldots\}$. But this means, since $f_0(x)$ has a lower bound (on set $S$), that as $k \to \infty$ of necessity $\|\varphi'(z_k)\| \to 0$. The theorem is proved.
The condition $\|\varphi'(z_k)\| \to 0$ means in the general case that sequence (9.13) (or a certain of its subsequences) converges to point $x^*$ which satisfies the necessary condition for an extremum of function $f_0(x)$ on manifold $S_g$ (at point $x^*$ the gradient $f_0'(x^*)$ is orthogonal
As soon as this inequality is satisfied, the inequality (9.14) should be checked with the $\alpha$ obtained; if (9.14) is not satisfied, the reduction of $\alpha$ should be continued; otherwise the obtained value of the parameter should be retained, or one should attempt to increase it while checking (9.14). Note that
= V » « + 0 ( l | z - z » II2).
With a sufficiently small $z_{k+1} - z_k$, we obtain $f_0(z_{k+1}, y_l(z_{k+1})) \to f_0(z_{k+1}, y(z_{k+1}))$; therefore if (9.17) is satisfied, inequality (9.14) will also be satisfied, i.e. no additional reductions of the step length will be required.
Remark. The requirements of theorem 9.1 to the smoothness of functions $f_0(x)$ and $f_i(x)$ can be taken somewhat weaker; however, this leads to a more complicated proof.
We dwell briefly also on a method of the (9.9) type in which vector $p_k$ is chosen by formula (9.11) (vectors $z_k$, $y_k$ are determined in the same way as in the preceding method) and $\alpha_k$ is the maximum value of the parameter (obtained by reductions) which satisfies the following inequality:
$$f_0(z, y(z)) - f_0(z_k, y_k) \le \varepsilon \alpha (\varphi'(z_k), p_k), \quad z = z_k + \alpha p_k.$$
For such an algorithm, theorem 9.1 holds true. The proof will differ only in some details (analogously to the difference between the proof of the theorem on the properties of methods of the gradient type and the proof of the theorems on the method of steepest descent, Sec. 1, Chap. II).
Note that the amount of work per iteration in such an algorithm is greater than in method (9.13).
Method of the Second Order
Suppose that $f_0(x)$ is a strongly convex function. Then the quadratic function $F_k(x)$ (9.7) is strictly convex and since function $y_l(z)$ is linear, function $\psi_k(z) = F_k(z, y_l(z))$ is strictly convex too. More precisely, due to the strong convexity of $f_0(x)$, the following conditions are fulfilled for any function $\psi_k(z)$ with any vector $v \in E^{n-m}$:
$$m_0 \|v\|^2 \le (\psi_k'' v, v) \le M_0 \|v\|^2, \quad m_0 > 0 \tag{9.18}$$
ing inequality:
$$f_0(x) - f_0(x_k) \le \varepsilon \alpha (\varphi'(z_k), p_k), \quad 0 < \varepsilon < \frac{1}{2} \tag{9.21}$$
where $x = (z, y(z))$, $z = z_k + \alpha p_k$.
Theorem 9.2. Let $f_0(x)$ be a twice continuously differentiable function and for any vector $\omega \in E^n$
$$m \|\omega\|^2 \le (f_0''(x)\omega, \omega) \le M \|\omega\|^2, \quad m > 0$$
and let functions $f_i(x)$, $i = 1, \ldots, m$ satisfy the requirements of theorem 9.1. Then, whatever the point $x_0$ chosen, the results of theorem 9.1 hold for method (9.20).
The proof of the theorem follows the same scheme as that used in proving theorem 9.1. Therefore, we shall dwell only on those changes in the proof, as compared to that of theorem 9.1, which arise because of the different method of choosing vector $p_k$.
Due to the strong convexity of $f_0(x)$, the set $S_0$ is bounded. It follows that set $S = S_g \cap S_0$ is bounded and closed (since sets $S_0$ and $S_g$ are closed). Taking this into account, we prove, in the same way as in theorem 9.1, that estimates (9.16) hold and establish that the derivatives $\varphi'(z)$, $\varphi''(z)$ have bounds in the parallelepiped
[ 0 * ± 6 0 * ], i = 1 , . . ., n .
Further, by (9.18), $\|(\psi_k'')^{-1}\| \le 1/m_0$; consequently, $\|p_k\| = \|(\psi_k'')^{-1}\psi_k'(z_k)\| = \|(\psi_k'')^{-1}\varphi'(z_k)\| \le N_2/m_0$, and, therefore, if $\alpha_k^2 N_4^2 \|p_k\|^2 \ge \rho^2$, then
$$\alpha_k \ge \frac{\rho}{N_4 \|p_k\|} \ge \frac{\rho\, m_0}{N_4 N_2}. \tag{9.22}$$
It follows that any constant, provided it does not exceed $\rho m_0/(N_4 N_2)$, can be chosen as $\delta$.
Using the expansion of function $\varphi(z)$ into Taylor's series, we find that in sphere $S_{x_k}$ of radius $\rho$
$$\varphi(z_{k+1}) - \varphi(z_k) = \alpha_k (\varphi'(z_k), p_k) + \frac{\alpha_k^2}{2}\bigl(\varphi''(z_{kc}) p_k, p_k\bigr)$$
p h)
It follows from (9.19), with account of (9.18), that
holds for function $\varphi(z)$ to whose minimization the solving of the original problem is reduced, then the rate of convergence of the method is superlinear. In order to ascertain this, one should take into account that if (9.24) holds (and also $(\psi_k'' p_k, p_k) = -(\varphi'(z_k), p_k)$), then
2
At the same time, by (9.18) and (9.24), function $\varphi(z)$ will be strongly convex in a certain region about the minimum. With the above remarks, the proof of the superlinear rate of convergence can be the same as that, for example, in studying Newton's method (Sec. 2, Chap. II).
Thus the rate of convergence of method (9.20) in a number of problems will be faster than that of methods of the first order. However, the amount of work per iteration in method (9.20) may prove considerably greater owing to the necessary calculations of the second derivatives of function $f_0(x)$.
Minimization Methods of Higher Effectiveness
The projection methods described in the preceding subsections are, in a sense, analogues of the gradient methods and Newton's method for solving problems of finding an absolute extremum.
$$D_k (x_{k-i} - x_{k-i-1}) = f_0'(x_{k-i}) - f_0'(x_{k-i-1}), \quad i = 0, 1, \ldots, n - 1$$
(analogue of system (3.6) of Chap. II which is used in constructing methods of dual direction) and vector $p_k = -F_k^{-1} \varphi'(z_k)$ is constructed, where $F_k = D_{kzz} + y'^{*} D_{kyy}\, y' + 2 y'^{*} D_{kzy}$ and matrices $D_{kzz}$, $D_{kyy}$, $D_{kzy}$ are parts of matrix $D_k$ and correspond to matrices $f''_{0zz}$, $f''_{0yy}$, $f''_{0zy}$, respectively.
On the Solving of the General Problem of Mathematical Programming
It is required to minimize function $f_0(x)$ with constraints
$$f_i(x) \le 0, \quad i = 1, \ldots, m. \tag{9.25}$$
Such constraints can be reduced to equality constraints in several ways. For instance, if we introduce additional variables $x^{n+1}, \ldots, x^{n+m}$, then constraints (9.25) will hold with the same values of variables $x^1, \ldots, x^n$ which satisfy the equalities
$$(x^{n+i})^2 + f_i(x) = 0, \quad i = 1, \ldots, m. \tag{9.26}$$
Consequently, the minimum of function $f_0(x)$ with constraints (9.25) will coincide with the minimum of $f_0(x)$ with constraints (9.26). For the minimization of $f_0(x)$ with constraints (9.26), methods of the first order described in the subsection "Methods of the First Order" (p. 249) can be used.
Method (9.20) with conditions (9.26) cannot be applied to the minimization of $f_0(x)$, for in space $E^{n+m}$ function $f_0(x)$ is not strictly convex; matrix $f_0''(x)$, as is easily ascertained, is singular in $E^{n+m}$. Due to this fact, there is no point in using methods of dual and conjugate directions in this case.
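The reduction (9.25) to (9.26) by additional variables can be sketched as follows. The helper names `to_equalities` and `slacks_for`, and the sample constraints $0 \le x \le 1$, are hypothetical and serve only to illustrate the substitution.

```python
import math


def to_equalities(ineqs):
    """Turn constraints f_i(x) <= 0 into h_i(x, s) = s_i**2 + f_i(x) = 0,
    where s_i plays the role of the additional variable x^{n+i}."""
    def h(x, s):
        return [si * si + f(x) for f, si in zip(ineqs, s)]
    return h


def slacks_for(ineqs, x):
    """Additional variables that satisfy the equalities at a feasible x,
    or None when some f_i(x) > 0 (no real s_i exists then)."""
    vals = [f(x) for f in ineqs]
    if any(v > 0 for v in vals):
        return None
    return [math.sqrt(-v) for v in vals]


if __name__ == "__main__":
    # Assumed sample constraints: x - 1 <= 0 and -x <= 0, i.e. 0 <= x <= 1.
    ineqs = [lambda x: x[0] - 1.0, lambda x: -x[0]]
    h = to_equalities(ineqs)
    s = slacks_for(ineqs, [0.25])
    print(s, h([0.25], s))
    print(slacks_for(ineqs, [2.0]))
```

Because $f_0$ does not depend on the additional variables, its matrix of second derivatives in the extended space has a zero block in those variables, which is the singularity mentioned above.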
Concluding Remarks
Of the class of projection methods with restoration of ties we have discussed only those algorithms in which use is made of formulas (9.9) to restore ties. In a number of problems this method of carrying out the concluding (third) stage of iteration may prove inconvenient; in this case it is worthwhile to carry out this stage in another way. For instance, one can determine point $x_{k+1} \in S_g$ by requiring that the quantity $\|x_{k+1}(\alpha) - \hat{x}_k(\alpha)\|$ be the minimum distance between point $\hat{x}_k(\alpha)$ and set $S_g$.
Bibliographic Notes
To Sec. 1. The method of solving problems of quadratic programming expounded in this section is based on applying the method of conjugate gradients. It is the simplest and most expedient method if the constraints on the variables are simple. There are a great many other methods of solving the problem of quadratic programming which converge either after a finite number or an infinite number of steps. These methods are described by H. Künzi and W. Krelle, S. I. Zukhovitskii and L. I. Avdeyeva, G. Zoutendijk [1], V. F. Dem'yanov and V. N. Malozemov [2].
The problems of the effectiveness and accuracy of different algorithms are analyzed by V. V. Ivanov [2], [3], V. V. Ivanov and V. E. Truten'.
To Sec. 2. In describing the method of feasible directions we followed mainly the works by S. I. Zukhovitskii, R. A. Polyak and M. E. Primak [1], [2]. The method described in Sec. 2 differs from the traditional one by the rule of choosing the step length at each iteration. Several variants of the method of feasible directions have been studied in detail and substantiated by G. Zoutendijk [1], and also in papers by G. Zoutendijk [2], D. M. Topkis and A. Veinott.
To Sec. 3. The method of conditional gradient was first described by M. Frank and P. Wolfe. Later on it was studied by V. F. Dem'yanov and A. M. Rubinov, E. S. Levitin and B. T. Polyak who gave the bounds on the rate of convergence. The proof of the accuracy of these estimates is given in a paper by M. D. Canon and C. D. Cullum.
The generalization of Newton's method of solving constrained problems was carried out by E. S. Levitin and B. T. Polyak. Newton's method with step adjustment with constraints was studied by Yu. M. Danilin [1], [2].
To Sec. 4. The cutting hyperplane method as described in this section follows the work by J. E. Kelley. Different generalizations of this method and also
t h e b o u n d s o n i t s r a t e o f c o n v e r g e n c e (| / ( x h ) — / ( x * ) | < 1 C l n ) f o r a s t r o n g l y
c o n v e x f u n c t i o n / (x ) a r e g i v e n i n t h e p a p e r b y E . S . L e v i t i n a n d B . T . P o l y a k .
To Secs. 5, 6. The description of the linearization method in this chapter follows the work by B. N. Pshenichny [4] where he proves that the method is convergent. The subtler results, such as the finite convergence in linear programming, the local estimate of the rate of convergence and also quadratic rates of convergence in special cases, had not been published before. The same can be said of the application of this method to the problem of finding the minimax. The last problem was studied by V. F. Dem'yanov and V. N. Malozemov [1], [2] who constructed a number of algorithms of descent for solving the problem. Note also that the minimax problem can be solved by using the algorithm of the generalized gradient descent and its variants developed by N. Z. Shor
APPENDIX
COMPUTATIONAL SCHEMES OF THE MAIN ALGORITHMS
I. METHOD OF DUAL DIRECTIONS (CHAP. II, SEC. 3)
This method is intended for the minimization of a convex function $f(x)$, $x \in E^n$.
Iteration scheme.
L e t x 0 b e a n a r b i t r a r y p o i n t , r0f0 , s 0 , - n • • s o , - 7i + i b e a n arbitrary linearly
independent vector system.
With $0 \le k \le n - 1$, the iteration is as follows:
(1) Construct the point
$$x_{k+1} = x_k - \alpha_k f'(x_k) \tag{1}$$
where $\alpha_k$ is chosen by any of the methods described in Chap. II, Sec. 1.
(2) Set:
$$r_{k+1} = x_{k+1} - x_k, \qquad e_{k+1} = f'(x_{k+1}) - f'(x_k). \tag{2}$$
(3) Compute $(s_{k,k-n+1}, e_{k+1})$.
If
$$|(s_{k,k-n+1}, e_{k+1})| \ge \gamma \|s_{k,k-n+1}\| \, \|e_{k+1}\| \tag{3}$$
where $\gamma > 0$ is an arbitrarily small constant, go to step (5).
If
$$|(s_{k,k-n+1}, e_{k+1})| < \gamma \|s_{k,k-n+1}\| \, \|e_{k+1}\|, \tag{4}$$
go to step (4).
(4) Set
$$r_{k+1} = \beta_{k+1} s_{k,k-n+1} \tag{5}$$
where the quantity $\beta_{k+1} > 0$ is chosen such that the condition $\|r_{k+1}\| < \|r_k\|$ is fulfilled.
Compute the gradient $f'(x_k + r_{k+1})$ and then construct vector $e_{k+1} = f'(x_k + r_{k+1}) - f'(x_k)$. Then go to step (5).
(5) Construct the vector system
$$s_{k+1,k+1} = \frac{s_{k,k-n+1}}{(s_{k,k-n+1}, e_{k+1})},$$
$$s_{k+1,k-j} = s_{k,k-j} - (s_{k,k-j}, e_{k+1})\, s_{k+1,k+1}, \quad j = 0, 1, \ldots, n - 2. \tag{6}$$
This is the end of the iteration.
With $k \ge n$:
(1) Construct vector
$$p_k = -\sum_{i=0}^{n-1} (f'(x_k), s_{k,k-i})\, r_{k-i}.$$
(2) Compute $(f'(x_k), p_k)$.
If $(f'(x_k), p_k) \ne 0$, construct point $x_{k+1}$ by one of the formulas $x_{k+1} = x_k \pm \alpha_k p_k$ where $\alpha_k$ is chosen according to condition (2.2), Chap. II.
If $(f'(x_k), p_k) = 0$, construct point $x_{k+1}$ using the gradient method (see (1)) of the iteration with $k \le n - 1$. Further, construct the iteration in the same way as with $k \le n - 1$ (steps (2)-(5)).
R e m a r k s . 1. W e h a v e a d d u c e d o n l y o n e o f t h e p o s s i b l e c o m p u t a t i o n s c h e m e s
o f t h e m e t h o d s o f d u a l d i r e c t i o n s . H e r e t h e f i r s t i t e r a t i o n s (A: n — 1) a r e
carried o u t b y t h e g r a d i e n t m e t h o d . S i n c e at t h e initial steps of t h e iterative
process the gradient m e t h o d us u a l l y pro v i d e s for a sufficiently steep decrease of
t h e f u n c t i o n , s u c h a n i n i t i a l s t a g e o f t h e p r o c e s s is e x p e d i e n t i n s o l v i n g m a n y
problems.
2. W c h a v e c h a n g e d h e r e f o r t h e s a k e o f c o n v e n i e n c e t h e n o t a t i o n s o f t h e
v e c t o r s o f t h e d u a l b a s i s ( c f . ( 6 ), a n d ( 3 . 2 1 ) o f C h a p . I I ) .
3. If the vector $r_{k+1}$ is chosen in form (5) and the function $f(x)$ is smooth and strongly convex (i.e. conditions (2.4) of Chap. II hold true), then inequality (3) will automatically hold, provided the constant $\gamma$ has been chosen sufficiently small.
Indeed, if the function $f(x)$ satisfies the requirements formulated, then $\|e_k\| \le M \|r_k\|$ and estimate (5.18) of Chap. II holds true. By virtue of this fact, we have
$(s_{k,\,k-n+1},\ e_{k+1}) = \dfrac{1}{\rho_k}\,(r_{k+1},\ e_{k+1}) \ge \dfrac{m}{\rho_k}\,\|r_{k+1}\|^2 \ge \dfrac{m}{\rho_k M}\,\|r_{k+1}\|\,\|e_{k+1}\| \ge \dfrac{m}{M}\,\|s_{k,\,k-n+1}\|\,\|e_{k+1}\|$.
Thus if $\gamma \le m/M$, then inequality (3) is satisfied.
4. Computational practice shows that the quantity $\gamma$ may be chosen very small: $\gamma = 10^{-6}$ to $10^{-15}$. If condition (3) is not satisfied even with the vector $r_{k+1}$ chosen in form (5), it means that the matrix $f''(x)$ becomes ill conditioned as $x \to x^*$, i.e. the minimized function is not strongly convex. In particular, the level surfaces of this function may have the form of a long, deep and narrow valley. In this case, a very accurate approximation to the solution with respect to the variable is not attainable. However, computational practice shows that one can obtain function values sufficiently close to the minimum even when minimizing nonconvex functions with deep-valley level surfaces.
II. CONJUGATE GRADIENTS METHOD
(CHAP. II, SEC. 4)
This method is intended for the minimization of a convex function $f(x)$, $x \in E^n$.
Iteration scheme.
Let $x_0$ be an arbitrary point, $p_0 = -f'(x_0)$.
(1) If $0 \le k \le n-1$, go to (2); if $k = n$, go to (5).
(2) Construct the point
$x_{k+1} = x_k + \alpha_k p_k$
where the factor $\alpha_k$ is determined by the condition
$f(x_{k+1}) = \min_{\alpha \ge 0} f(x_k + \alpha p_k)$.
(3) Compute the vector
$p_{k+1} = -f'(x_{k+1}) + \beta_{k+1} p_k$
where
$\beta_{k+1} = \dfrac{(f'(x_{k+1}),\ f'(x_k) - f'(x_{k+1}))}{(f'(x_k),\ p_k)}$.
(4) Go to (1).
(5) Set $x_0 = x_n$, $p_0 = -f'(x_n)$ and repeat the process (go to (1)).
Remark. The coefficient $\beta_{k+1}$ can be determined by any of the formulas (4.73), Chap. II.
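The iteration scheme with a restart after every $n$ steps can be sketched as follows. This is an illustration under stated assumptions, not the book's program: the function is taken to be the quadratic $f(x) = \tfrac12 (Ax, x) - (b, x)$ with a symmetric positive definite matrix $A$, so that the one-dimensional minimization has the closed form $\alpha_k = -(f'(x_k), p_k)/(p_k, A p_k)$; the coefficient $\beta_{k+1}$ is computed with the denominator $(f'(x_k), p_k)$, which equals $-\|f'(x_k)\|^2$ under exact line search. The name `cg_with_restarts` and the tolerance are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def cg_with_restarts(A, b, x0, tol=1e-10, max_cycles=20):
    """Minimize 0.5 (Ax, x) - (b, x) by conjugate gradients, restarting every n steps."""
    n = len(x0)

    def grad(x):
        return [dot(row, x) - bi for row, bi in zip(A, b)]

    x = list(x0)
    for _ in range(max_cycles):
        g = grad(x)
        p = [-gi for gi in g]                    # restart: p_0 = -f'(x_0)
        for _k in range(n):                      # steps k = 0, ..., n-1
            if dot(g, g) ** 0.5 <= tol:
                return x
            Ap = [dot(row, p) for row in A]
            alpha = -dot(g, p) / dot(p, Ap)      # exact line search for a quadratic
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            g_new = grad(x)
            diff = [gk - gn for gk, gn in zip(g, g_new)]
            beta = dot(g_new, diff) / dot(g, p)
            p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
            g = g_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = cg_with_restarts(A, b, [2.0, 1.0])
```

For a general convex function the closed-form `alpha` would have to be replaced by a numerical one-dimensional minimization; the rest of the loop stays the same.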
III. METHOD OF FEASIBLE DIRECTIONS
(CHAP. III, SEC. 2)
This method is intended for solving problems of convex programming: to minimize the function $f_0(x)$ with constraints
$f_i(x) \le 0$, $i = 1, \ldots, m$,
$Ax - b = 0$
where $x \in E^n$; $f_i(x)$, $i = 0, \ldots, m$, are convex continuously differentiable functions; $A$ is an $l \times n$ matrix; $b$ is an $l$-dimensional vector.
Notations:
$\mathcal{J}_\delta(x) = \{i : f_i(x) \ge -\delta,\ i = 1, \ldots, m\}$,
$\|p\| = \max_{1 \le j \le n} |p_j|$
where $p \in E^n$, $p_j$ are components of the vector $p$.
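The two notations above translate directly into code. The sketch below is only an illustration: it assumes the constraints are supplied as Python callables and uses the hypothetical names `active_set` and `max_norm`.

```python
def active_set(constraints, x, delta):
    """Indices i (numbered from 1) with f_i(x) >= -delta: the delta-active constraints."""
    return [i for i, f in enumerate(constraints, start=1) if f(x) >= -delta]


def max_norm(p):
    """The norm ||p|| = max_j |p_j| used to bound the direction."""
    return max(abs(c) for c in p)


# Two inequality constraints f_i(x) <= 0 in the plane (illustrative data).
constraints = [
    lambda x: x[0] + x[1] - 1.0,   # f_1
    lambda x: -x[0],               # f_2
]
x = [0.5, 0.45]
J = active_set(constraints, x, delta=0.1)
```

At this point $f_1(x) = -0.05$ lies within $\delta = 0.1$ of its bound while $f_2(x) = -0.5$ does not, so only the first constraint enters the direction-finding problem.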
Initial data: $x_0$ is the initial approximation satisfying all the constraints; $\delta_0 > 0$, $\xi_i > 0$, $i = 0, \ldots, m$, are positive numbers which, generally speaking, are arbitrary.
The common step of the algorithm.
Point $x_k$ and number $\delta_k > 0$ have been computed.
(1) Solve the problem of linear programming
$\min \eta$,
$(f_i'(x_k), p) \le \xi_i \eta$, $i \in \mathcal{J}_{\delta_k}(x_k) \cup \{0\}$,
$Ap = 0$,
$-1 \le p_j \le +1$, $j = 1, \ldots, n$.
The solution is $\eta_k$, $p_k$.
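(2) If $\eta_k < -\delta_k$, then
$x_{k+1} = x_k + \alpha_k p_k$, [remainder of the formula missing in the source]
where $\alpha_k = 1/2^{q_0}$, and $q_0$ is the first of the integers $q = 0, 1, \ldots$ for which the following inequalities hold:
$f_i(x_k + 2^{-q} p_k) \le \ldots$ [right-hand sides illegible in the source]
(3) If $\eta_k \ge -\delta_k$, then $x_{k+1} = x_k$, [remainder of the formula missing in the source]
(4) Return to (1).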
Remark. The choice of the numbers $\delta_0$, $\xi_i$ can influence the course of the process; the choice should be made on the basis of an analysis of the problem under consideration. The algorithm can also be used for nonconvex problems.
IV. LINEARIZATION METHOD
(CHAP. III, SEC. 5)
This method is intended for solving the problem: to minimize $f_0(x)$ with constraints
$f_i(x) \le 0$, $i = 1, \ldots, m$,
$f_i(x) = 0$, $i = m+1, \ldots, m+l$
where $f_i(x)$ are continuously differentiable functions.
Notations:
$F(x) = \max\{0, f_1(x), \ldots, f_m(x), |f_{m+1}(x)|, \ldots, |f_{m+l}(x)|\}$,
$\mathcal{J}_\delta(x) = \{i : f_i(x) \ge F(x) - \delta,\ i = 1, \ldots, m\}$,
$\mathcal{J}^0_\delta(x) = \{i : |f_i(x)| \ge F(x) - \delta,\ i = m+1, \ldots, m+l\}$,
$\Phi_N(x) = f_0(x) + N F(x)$,
$\|p\|^2 = \sum_{j=1}^{n} p_j^2$.
Initial data: the initial approximation $x_0$ is arbitrary; $N_0$ is sufficiently great, $\delta_0 > 0$, $0 < \varepsilon < 1$.
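The quantities in the notations can be computed as follows. This is a minimal sketch, assuming the constraint functions are given as Python callables, inequalities first and equalities after them; the helper names `violation`, `merit` and `near_active` are hypothetical.

```python
def violation(ineq, eq, x):
    """F(x): the largest constraint violation, never below zero."""
    return max([0.0] + [f(x) for f in ineq] + [abs(f(x)) for f in eq])


def merit(f0, ineq, eq, N, x):
    """The function f_0(x) + N F(x) that the step-size rule decreases."""
    return f0(x) + N * violation(ineq, eq, x)


def near_active(ineq, eq, x, delta):
    """Index sets of constraints within delta of the largest violation F(x).

    Inequalities are numbered 1..m, equalities m+1..m+l.
    """
    F = violation(ineq, eq, x)
    m = len(ineq)
    J = [i for i, f in enumerate(ineq, start=1) if f(x) >= F - delta]
    J0 = [m + i for i, f in enumerate(eq, start=1) if abs(f(x)) >= F - delta]
    return J, J0


# Illustrative problem: minimize x0^2 + x1^2 subject to 1 - x0 <= 0, x1 - 2 = 0.
f0 = lambda x: x[0] ** 2 + x[1] ** 2
ineq = [lambda x: 1.0 - x[0]]
eq = [lambda x: x[1] - 2.0]
x = [0.0, 0.0]

F_x = violation(ineq, eq, x)
Phi_x = merit(f0, ineq, eq, 10.0, x)
J, J0 = near_active(ineq, eq, x, delta=0.5)
```

Only the constraints selected by `near_active` are linearized in the subsidiary problem, which keeps that problem small when many constraints are far from being violated.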
The common step of the algorithm.
Point $x_k$ is constructed and numbers $N_k$ and $\delta_k$ are chosen.
(1) Solve the problem:
$\min\ (f_0'(x_k), p) + \frac{1}{2}\|p\|^2$,
$(f_i'(x_k), p) + f_i(x_k) \le 0$, $i \in \mathcal{J}_{\delta_k}(x_k)$,
$(f_i'(x_k), p) + f_i(x_k) = 0$, $i \in \mathcal{J}^0_{\delta_k}(x_k)$.
The solution is $p_k$. If the problem is incompatible, then set $x_{k+1} = x_k$, $\delta_{k+1} = \frac{1}{2}\delta_k$, $N_{k+1} = N_k$ and return to (1).
(2) If the problem is consistent and $p_k$ is found, then set
$x_{k+1} = x_k + \alpha_k p_k$,
$\delta_{k+1} = \delta_k$
where $\alpha_k$ is chosen equal to $1/2^{q_0}$ and $q_0$ is the first of the integers $q = 0, 1, \ldots$ for which the inequality
$\Phi_{N_k}(x_k + 2^{-q} p_k) \le \Phi_{N_k}(x_k) - 2^{-q}\,\varepsilon\,\|p_k\|^2$
holds.
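(3) Let the numbers $u_i^k$, $i \in \mathcal{J}_{\delta_k}(x_k) \cup \mathcal{J}^0_{\delta_k}(x_k)$, be the Lagrange multipliers of the subsidiary problem that was solved at the first stage. If now
$N_k > \sum_{i \in \mathcal{J}_{\delta_k}(x_k)} u_i^k + \sum_{i \in \mathcal{J}^0_{\delta_k}(x_k)} |u_i^k|$,
then $N_{k+1} = N_k$.
Otherwise
$N_{k+1} = 2\Bigl(\sum_{i \in \mathcal{J}_{\delta_k}(x_k)} u_i^k + \sum_{i \in \mathcal{J}^0_{\delta_k}(x_k)} |u_i^k|\Bigr)$.
(4) Return to (1).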
Remark. The numbers $\delta_k$ and $N_k$ cease to change from a certain step on. The algorithm requires an effectively working standard program for solving the problem of quadratic programming.
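Steps (2) and (3) do not depend on how the quadratic subproblem is solved, so they can be sketched on their own. The code below is an illustration under assumptions: the step-size test is taken to be a sufficient decrease of the merit function by $2^{-q}\varepsilon\|p_k\|^2$, and the penalty parameter is doubled relative to the sum of multiplier magnitudes when it is not already larger than that sum. The direction `p` and the multipliers are assumed to come from some external quadratic programming routine; all names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def halving_step(phi, x, p, eps, max_q=60):
    """Return (alpha, x_new) with alpha = 2**(-q0), q0 the first q passing the test."""
    phi0 = phi(x)
    p2 = dot(p, p)
    for q in range(max_q + 1):
        alpha = 0.5 ** q
        trial = [xi + alpha * pi for xi, pi in zip(x, p)]
        if phi(trial) <= phi0 - alpha * eps * p2:
            return alpha, trial
    raise RuntimeError("no admissible step found; p may not be a descent direction")


def update_penalty(N, u_ineq, u_eq):
    """Keep N if it exceeds the multiplier sum, otherwise set it to twice that sum."""
    total = sum(u_ineq) + sum(abs(u) for u in u_eq)
    return N if N > total else 2.0 * total


# One-dimensional illustration with a deliberately long direction p.
phi = lambda x: x[0] ** 2
alpha, x_new = halving_step(phi, [1.0], [-3.0], eps=0.25)
```

In this example the full step and the half step both overshoot the minimum, and the quarter step is the first one that gives the required decrease.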
V. ALGORITHM FOR SOLVING A SYSTEM
OF EQUATIONS WITHOUT CALCULATING
DERIVATIVES
(CHAP. III, SEC. 6)
This algorithm is intended for solving the system of equations
$P(x) = 0$
where $x \in E^n$, $P(x)$ is an $n$-dimensional vector-function whose components $P_i(x)$, $i = 1, \ldots, n$, are differentiable.
Initial data: the initial approximations $x_1, \ldots, x_n$ are arbitrarily chosen in a sufficiently small region about the solution. In a particular case all $x_k$, $k = 1, \ldots, n$, can coincide.
Notations: $\|P(x)\|^2 = \sum_{i=1}^{n} (P_i(x))^2$; $\varphi(k)$ is equal to $1, 2, \ldots, n-1$ if $k$, when divided by $n$, leaves a remainder of $1, 2, \ldots, n-1$ respectively, and $\varphi(k) = n$ if $k$ is divisible by $n$.
The common step of the algorithm. $x_1, \ldots, x_k$ have been constructed.
(1) Solve for the unknowns $\beta_i$, $i = 1, \ldots, n$, the system of equations
$\sum_{i=1}^{n} \beta_i z_{k-n+i} = -P(x_k)$
where
$z_j = \dfrac{1}{\|P(x_j)\|}\,\bigl[P(x_j + \|P(x_j)\|\, e_{\varphi(j)}) - P(x_j)\bigr]$,
$e_i$ is a vector with zero components, except the $i$-th one which is equal to 1.
(2) Set
$x_{k+1} = x_k + \sum_{i=1}^{n} \beta_i e_{\varphi(k-n+i)}$.
(3) Return to (1).
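The scheme can be tried out on a small system. The sketch below is an illustration under assumptions rather than the book's program: each vector $z_j$ is taken to be a finite-difference quotient of $P$ at $x_j$ along the coordinate direction $\varphi(j)$ with step $\|P(x_j)\|$; the coefficients $\beta_i$ solve $\sum_i \beta_i z_{k-n+i} = -P(x_k)$; and the linear system is solved by a small Gaussian elimination routine written here for self-containment. The names `solve_system`, `solve_linear`, the tolerance and the test problem are hypothetical.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def phi(k, n):
    """Cyclic index: remainder of k modulo n, with n in place of 0."""
    r = k % n
    return r if r != 0 else n


def solve_linear(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[piv][c] == 0.0:
            raise ValueError("singular system")
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (m[r][n] - sum(m[r][j] * y[j] for j in range(r + 1, n))) / m[r][r]
    return y


def solve_system(P, x_start, tol=1e-10, max_iter=50):
    """Derivative-free iteration for P(x) = 0; all n initial points coincide."""
    n = len(x_start)
    xs = [list(x_start) for _ in range(n)]       # x_1, ..., x_n
    for _ in range(max_iter):
        k = len(xs)
        xk = xs[-1]
        Pk = P(xk)
        if norm(Pk) <= tol:
            break
        cols, dirs = [], []
        for i in range(1, n + 1):
            j = k - n + i                        # index of the point used for z_j
            xj = xs[j - 1]
            Pj = P(xj)
            h = norm(Pj)                         # difference step ||P(x_j)||
            d = phi(j, n) - 1                    # zero-based coordinate direction
            shifted = xj[:]
            shifted[d] += h
            Ps = P(shifted)
            cols.append([(Ps[t] - Pj[t]) / h for t in range(n)])
            dirs.append(d)
        a = [[cols[i][t] for i in range(n)] for t in range(n)]  # columns are z_j
        beta = solve_linear(a, [-c for c in Pk])
        x_new = xk[:]
        for bi, d in zip(beta, dirs):
            x_new[d] += bi
        xs.append(x_new)
    return xs[-1]


# Illustrative system with the solution (1, 1).
P = lambda x: [x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1]]
root = solve_system(P, [1.2, 0.9])
```

Because the difference step shrinks together with $\|P(x_j)\|$, the quotients approach the columns of the Jacobian near the solution, which is what makes the iteration behave like Newton's method without any derivative evaluations.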
INDEX
A c
A b a d i e , J. 2 6 7 C a n o n , M . D. 176, 257, 2 6 5
acceleration of c o n v e r g e n c e 224ff C a u c h y , A. L. 145, 2 6 5
a l g o r i t h m for 2 3 0 C e a , J. 2 5 8 , 2 6 5
and mathematical programming 232 closure of c o n v e x set 14
Adachi, N. 270 Collatz, L. 145, 2 6 5
admissible direction 33 concave function 23
admissible d o m ain 32 conditional gradient m e t h o d 170ff
Aki l o v , G . P. 2 6 8 a n d step adjustment 177
a l g o r i t h m (s) cone
c o m p u t a t i o n a l s c h e m e s of 259ff conjugate 14
for c o n j u g a t e directions m e t h o d 93, convex 14
111, 126, 129, 2 6 0 polyhedral 15
for c u t t i n g h y p e r p l a n e m e t h o d 1 8 5 conjugate directions 82
for d u a l directions m e t h o d 129, 2 5 9 a n d dual directions 143
effectiveness of 60, 6 6 conjugate directions m e t h o d s 138, 143,
for feasible directions m e t h o d 2 6 1 145, 25 5
for linearization m e t h o d 2 6 2 conjugate gradients m e t h o d 98, 145-
for m i n i m i z i n g f u n c t i o n s 1 3 9 147, 152, 153, 156, 161
for q u a d r a t i c p r o g r a m m i n g 151. a l g o r i t h m for 2 6 0
158, 160 c o n j u g a t e vector 82, 138, 141, 144
A l t m a n , M . 145, 265 a l g o r i t h m s for constru ctio n 8 7
A r r o w , K . 43, 2 6 5 constrained function minimization
Auslender, A. 265 146ff
A v d e y e v a , L . I. 4 2 , 4 3 , 2 5 7 , 2 7 0 constraints
linear 146
B simple 160
convergence
Balakrishnan, A. V. 265 acceleration of 224ff
B a r n e s , J. 2 5 8 , 2 6 5 for c o n d i t i o n a l g r a d i e n t a l g o r i t h m s
Bershchansky, Ya. M. 265 172, 176
biorthogonalization, 125 of c on juga te directions m e t h o d s
boundary manifold, 38 104, 119, 125, 144
B o u r b a k i , N . 23. 2 6 5 of dual directions m e t h o d s 144
Bre nt, R . P. 145, 2 6 5 rates of 11
Brown, K. M. 265 c o n v e x cone 14
B r o y d c n , C. G. 128, 258, 2 6 5 c o n v e x f u n c t i o n s 17ff
B u d a k , B. M. 265 in f i n i t e - d i m e n s i o n a l s p a c e 4 2
271
I N D E X
272
I N D E X
273
I N D E X
274
I N D E X
O b l o m s k a y a , L. Y a . 2 6 8 a n d effectiveness of m e t h o d 6 0
Oettli, W . 258, 2 6 8 o f first o r d e r m e t h o d s 2 5 1
Ostrowski, A. M . 232, 268 geometrical-progression 11
O s t r o w s k i ’s t h e o r e m 2 0 6 , 2 1 0 , 2 1 1 linear 11
of m e t h o d s of con jugate directions
P
119, 125f
P e a r s o n , J. D . 128, 2 6 8 quadratic 11
p e n a l t y function m e t h o d 235ff of second order m e t h o d s 2 5 4
computational aspects 242 superlinear 11
in c o n v e x p r o g r a m m i n g 2 4 0 Reeves, C. M . 145, 266
substantiation of 236 regular point 37
po i n t of local m i n i m u m 3 3 restoration process 104, 105
Polak, E. 268, 269 Rockafellar, R . T. 42, 43, 269
P o l y a k , B. T. 42, 128, 145, 188, 257, R o s e n , J. B . 2 5 8 , 2 6 9
258, 268, 269 R u b i n o v , A. M . 257, 266
Polyak, R. A. 270
polyhedral cone 15 S
P o w e l l , M . J. D . 128, 145, 2 6 6 , 2 6 9
Price, J. F. 145 , 2 6 7 Sarachik, P. E. 267
Primak, M . E. 257, 270 Sargent, R. W . H. 268
primal problem'in linear p r o g r a m m i n g S c h w a r t z , J. T . 4 2 , 2 6 6
30 second order m e t h o d s 246, 252
projection m e t h o d s with restoration rate of c o n v e r g e n c e of 2 5 4
o f t i e s 2 4 4 ff separation theorem 12
projection operator 147, 20 9 S h a m a n s k y , V. E. 258, 269
and the dual pr o b l e m 160 Shor, N . Z. 257, 258, 269
m a t r i x for 15 8 simple constraints 160
P s h e n i c h n y , B. N . 43, 128, 145, 257. simplex method
258, 266, 269 a n d the linearization m e t h o d 2 0 0
in s o l v i n g t h e d u a l p r o b l e m 1 8 7
Q S l a t e r ’s c o n d i t i o n 2 1 7 , 2 2 0
quadratic forms S m i t h , C. S. 145, 2 6 9
m i n i m i z a t i o n of 137, 143 S m o l y a k , S. A . 145, 2 6 9
quadratic functions S o b o l e v , V . I. 2 4 5 , 2 6 8
m i n i m i z a t i o n of 146, 148, 149 Sonin, V. V. 267
q u a d r a t i c p r o g r a m m i n g 31, 146ff Sorensen, H. W . 269
a l g o r i t h m for 151, 158, 160 step length
a nd the dual p r o b l e m 194, 195 a l g o r i t h m for c h o o s i n g 1 3 2
iterative p r o c e s s for 157.' choice of 171, 172
with simple constraints 160 Stiefel, E . 145, 2 6 7
quadratic rate of convergence strictly c o n v e x functions 21, 4 2
strictly c o n v e x sets 17
R strongly c o n v e x functions 21, 4 2
rate of c o n v e r g e n c e str ongl y c o n v e x sets 17
for c o n d i t i o n a l g r a d i e n t a l g o r i t h m s strongly positive m a t r i x 22
172, 1 7 6 subgradient 20
275
I N D E X
sufficient condition
  for a local minimum 234
superlinear rate of convergence 11
support vector 20, 186
system(s) of conjugate vectors 141, 144
T
tangent manifold 38
Tikhonov, A. N. 10, 269, 270
Tokumaru, H. 270
Topkis, D. M. 257, 270
Truten', V. E. 257, 267
U
unconstrained function minimization 44ff
Uzawa, H. 43, 265
V
Vainberg, M. M. 42, 270
Vasil'ev, F. P. 258, 270
vector of dual variables 29
Veinott, A. 257, 270
W
Warga, J. 270
Weierstrass' theorem 24, 178, 249
Wolfe, P. 257, 266
Y
Yakovlev, M. N. 145, 270
Z
Zangwill, W. I. 145, 258, 270
Zeleznik, F. J. 258, 270
Zoutendijk, G. 42, 257, 258, 270
Zukhovitskii, S. I. 42, 43, 257, 270
To the Reader
Mir Publishers would be grateful
for your comments on the content,
USSR, 129820, Moscow I-110, GSP
Mir Publishers
N. S. Bakhvalov
Numerical Methods
Professor N. S. Bakhvalov of Moscow State University has written
a textbook for advanced courses in applied mathematics. It is
designed especially for universities and higher technical schools and
is concerned with the theory and practice of numerical methods.
The text covers a wide range of subjects including the essentials of
numerical methods dealing with approximation of functions,
integration, problems of algebra and optimization, and the solution of
ordinary differential equations. Particular attention is paid to
questions involving choice of method and organization of
computations in the solution of large sets of problems of a single type. The
book includes a bibliography of Russian and English references.
V. I. Krylov and N. S. Skoblya
A Handbook of Methods of Approximate
Fourier Transformation and Inversion
of the Laplace Transformation
Harmonic analysis and Laplace transformations are very often
used in the solution of many theoretical and practical problems.
This text contains most of the known methods of the approximate
inversion of the Laplace transformation and methods of calculating
Fourier integrals. It is designed for scientists and engineers who have
to do with the theory of applications of the Laplace transformation
and the Fourier integrals. It can serve as a useful reference work
for computing centres and design bureaus.
V. P. Maslov
Operational Methods
This is a textbook for second- and third-year university
mathematics and physics students based on the author's lectures at the
faculty of applied mathematics of the Moscow Institute of Electronic
Engineering and the physics faculty of Moscow University. It
illustrates the theoretical material concerning specific physical problems,
which are taken as the model, comparing the formulas of the
operational method with the numerical solution. The book will be of
interest to scientific workers in general.