A. BEN-TAL
J. ZOWE
The purpose of this paper is to derive, in a unified way, second order necessary and
sufficient optimality criteria for four types of nonsmooth minimization problems: the discrete
minimax problem, the discrete l1-approximation, the minimization of the exact penalty
function and the minimization of the classical exterior penalty function. Our results correct
and supplement conditions obtained by various authors in recent papers.
Key words: Necessary and Sufficient Second Order Conditions, Nonsmooth Optimization,
Minimax Problem, l1-Approximation, Penalty Functions, Directional Derivatives.
1. Preliminaries
1.1. Introduction
This paper presents second order optimality criteria for a class of nonsmooth
problems which includes the following four classical examples of nonsmooth
optimization, among them the discrete l1-approximation problem

minimize f(x) := Σ_{i=1}^{s} |g_i(x)|.   (1.2)
A. Ben-Tal and J. Zowe/ Optimality of nonsmooth problems 71
Here s' and s are natural numbers, μ and p are positive reals and the g_i's are
smooth functions defined on a real normed vector space X.
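The first two model problems can be sketched numerically. Below is a minimal Python sketch, with hypothetical affine g_i; the names A, b, f_minimax and f_l1 are illustrative choices, not taken from the paper:

```python
import numpy as np

# Illustrative smooth residual functions g_i (hypothetical choices,
# not from the paper): g_i(x) = <a_i, x> - b_i on X = R^2.
A = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]])
b = np.array([1.0, 0.0, 2.0])

def g(x):
    return A @ x - b

def f_minimax(x):            # (P1): f(x) = max_i g_i(x)
    return np.max(g(x))

def f_l1(x):                 # (P2): f(x) = sum_i |g_i(x)|
    return np.sum(np.abs(g(x)))

x = np.array([0.0, 0.0])
print(f_minimax(x))          # max(-1, 0, -2) = 0.0
print(f_l1(x))               # 1 + 0 + 2 = 3.0
```

Both objectives are built from smooth g_i but are themselves nonsmooth wherever the maximizing index or a sign of g_i changes.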
Problem (P1) is widely studied in the literature, see e.g. the books by
Dem'yanov and Malozemov [7] and by Danskin [6]. In the former a second order
sufficient condition is given [7, Chapter III, Theorem 4.2].
(P2) was treated by Charalambous in [4], where necessary first order conditions
and sufficient second order conditions are derived, however with a
wrong proof for the latter.
The exact penalty function f in (P3) is associated with the non-linear pro-
gramming problem
min g_0(x),
subject to g_i(x) ≤ 0 for 1 ≤ i ≤ s',   (1.5)
g_i(x) = 0 for s'+1 ≤ i ≤ s.
It is clear that f is not differentiable at every x even if all the data q_i and g_ij are
smooth. To see that (P) contains (P1)-(P4) as special cases put for example
q_i(t) := t, k_i := 2, g_i1 := g_i and g_i2 := −g_i for i = 1, ..., m. Then problem (P)
reduces to the discrete l1-approximation problem.
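This reduction can be checked numerically: with q_i(t) = t the inner max over {g_i, −g_i} is exactly |g_i|. A sketch with hypothetical g_i (illustrative functions, not from the paper):

```python
import numpy as np

# Hypothetical smooth g_i on R^2, for illustration only.
def g(x):
    return np.array([x[0] - 1.0, x[0] + x[1], x[1] ** 2 - 2.0])

def f_general(x):
    # (P) with q_i(t) = t, k_i = 2, g_i1 = g_i, g_i2 = -g_i:
    # f(x) = sum_i max(g_i(x), -g_i(x))
    gx = g(x)
    return np.sum(np.maximum(gx, -gx))

def f_l1(x):
    # the discrete l1-approximation objective sum_i |g_i(x)|
    return np.sum(np.abs(g(x)))

x = np.array([0.5, -1.5])
assert np.isclose(f_general(x), f_l1(x))
print(f_general(x))          # 0.5 + 1.0 + 0.25 = 1.75
```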
In Section 2 we summarize the primal and dual necessary and sufficient
optimality criteria for (P). These conditions are free of any additional regularity
assumptions. Moreover, the gap between the sufficient conditions and their
necessary counterparts is minimal, in the sense that they differ only in a strict
inequality (>) versus a weak one (≥) in the second order inequality. The primal
conditions will be phrased in terms of first and second order directional deriva-
tives of f. Section 3 is devoted to the computation of these directional deriva-
tives, the main tool being a chain rule. In Section 5 we will give a proof for the
optimality conditions stated in Section 2. A theorem of the alternative, proved in
Section 4, enables us to transform the primal necessary conditions into dual
conditions. The optimality criteria for the special cases (P1)-(P4) are easy
corollaries of our general theorems. These corollaries are collected in Section 6.
In Section 7 we add some remarks about the nonsmooth problem with constraints.
Throughout this paper X is a real normed vector space with topological dual
X*. The canonical bilinear form on X* × X will be denoted by ⟨·, ·⟩. We write
g'(x) and g''(x) for the first and the second Fréchet-derivative of a function
g : X → R. To simplify the notation we often use g'(x)d instead of ⟨g'(x), d⟩. For
fixed x, g''(x) is interpreted as a bilinear functional on X × X. In case X is the
n-dimensional Euclidean space R^n, then g'(x) = ∇g(x)^t and g''(x) = ∇²g(x), the
gradient and Hessian of g at x, respectively.
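In the finite-dimensional case these derivatives can be checked against difference quotients. A sketch with an illustrative smooth g on R^2 (the function choice and tolerances are assumptions for the demonstration, not from the paper):

```python
import numpy as np

# For smooth g on R^n: g'(x)d = grad g(x)^t d and g''(x)(d,d) = d^t Hess g(x) d.
def g(x):
    return x[0] ** 2 + np.sin(x[1]) * x[0]

def grad_g(x):
    return np.array([2 * x[0] + np.sin(x[1]), x[0] * np.cos(x[1])])

def hess_g(x):
    return np.array([[2.0, np.cos(x[1])],
                     [np.cos(x[1]), -x[0] * np.sin(x[1])]])

x = np.array([1.0, 0.5]); d = np.array([0.3, -0.7]); t = 1e-5

# First order: (g(x + t d) - g(x)) / t -> g'(x)d
fd1 = (g(x + t * d) - g(x)) / t
assert abs(fd1 - grad_g(x) @ d) < 1e-4

# Second order: (g(x + t d) - g(x) - t g'(x)d) / (t^2 / 2) -> g''(x)(d, d)
fd2 = (g(x + t * d) - g(x) - t * (grad_g(x) @ d)) / (t ** 2 / 2)
assert abs(fd2 - d @ hess_g(x) @ d) < 1e-2
print("Frechet derivative checks passed")
```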
Let m and k_i, 1 ≤ i ≤ m, be fixed natural numbers and let q_i : R → R and
g_ij : X → R for 1 ≤ i ≤ m and 1 ≤ j ≤ k_i, be C2-functions. We always denote by x̄
the optimal point and assume henceforth that
Further we put
K_i := {1, 2, ..., k_i} for i = 1, ..., m.
With these data we define our general minimization problem
(P)   minimize f(x) := Σ_{i=1}^{m} q_i(max_{j∈K_i} g_ij(x)).
In addition to the sets K i we will use the following sets of indices for given x
and d:
K_i(x) = {j ∈ K_i | g_ij(x) = max_{l∈K_i} g_il(x)}
Using this notation the max-terms in (1.8) can be written in a much shorter form
It should be noted that g'_{i,K_i(x)}(x)d does not stand for the directional derivative of
g_{i,K_i(x)}(·) at x applied to d.
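A small sketch of how the active index set K_i(x) can be computed in practice; the numbers and the tolerance are illustrative implementation choices, not from the paper:

```python
import numpy as np

# Active index set K_i(x) = { j in K_i : g_ij(x) = max_l g_il(x) },
# computed with a small tolerance for floating point comparison.
def active_set(values, tol=1e-10):
    m = np.max(values)
    return [j for j, v in enumerate(values) if v >= m - tol]

g_vals = np.array([2.0, 2.0, -1.0])       # illustrative g_i1(x), g_i2(x), g_i3(x)
g_grad_d = np.array([1.5, -0.5, 4.0])     # illustrative g'_ij(x)d for each j

K_x = active_set(g_vals)
print(K_x)                                 # [0, 1]

# The directional derivative of x -> max_j g_ij(x) at x in direction d
# is the max of g'_ij(x)d over the ACTIVE set only, not over all of K_i:
dd = max(g_grad_d[j] for j in K_x)
print(dd)                                  # 1.5, not 4.0
```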
In the sequel we will use the above notation without further mentioning.
For z = 0 definition (2.2) coincides with the proposal for a second directional
derivative due to Dem'yanov and Pevnyi [8].
(2.2) reduces to (2.1) for d = 0:
(ii) Σ_{i=1}^{m} Σ_{j∈K_i(x̄,d)} y_ij g'_ij(x̄) = 0,
(iii) Σ_{i=1}^{m} Σ_{j∈K_i(x̄,d)} y_ij g''_ij(x̄)(d,d) + Σ_{i=1}^{m} q''_i(max_{j∈K_i} g_ij(x̄)) [max_{j∈K_i(x̄)} g'_ij(x̄)d]² ≥ 0.
local minimizer of f is that for every d ≠ 0 satisfying (o) there are y_ij ≥ 0 for
j ∈ K_i(x̄, d) and i = 1, ..., m such that (i)-(iii) hold with strict inequality in (iii).
Remark 1. Fix d = 0. Then (2.7) implies f'(x̄; z) = f''(x̄; 0, z) ≥ 0 for all z ∈ X, i.e.
our primal necessary second order condition contains the necessary first order
condition. Further note that (2.8) is automatically satisfied whenever the
sufficient first order condition holds, i.e. whenever f'(x̄; d) > 0 for every d ≠ 0.
If we choose d = 0 in Theorem 2.2(a), then our dual necessary second order
condition reduces to the necessary first order condition: there are multipliers
y_ij ≥ 0 satisfying (i) and (ii).
Necessary condition:
f'(x̄) = 0 and f''(x̄)(d, d) ≥ 0 for all d ∈ X.
f'(x̄; d) = g'_1(x̄; d) = 0 for all d ∈ Ω,
Σ_{i=1}^{m} Σ_{j∈K_i(x̄)} ȳ_ij g'_ij(x̄) = 0, ȳ_ij ≥ 0 for j ∈ K_i(x̄).
Corollary 2.3. Consider the general problem (P) and assume that (CQP) holds at
x̄. Then a necessary condition for x̄ to be a local minimizer of f is the existence of
a unique set of multipliers ȳ_ij ≥ 0, i = 1, 2, ..., m and j ∈ K_i(x̄), such that
(i)
Σ_{i=1}^{m} Σ_{j∈K_i(x̄)} ȳ_ij g'_ij(x̄) = 0,
Lemma 3.1 [Chain rule]. Let φ(·) = h(k(·)) with functions k : X → Y and h : Y → Z
where X, Y, Z are real normed vector spaces. Let x, d, z ∈ X be given. Put
y := k(x) and suppose v := k'(x; d), w := k''(x; d, z) and h'(y; v), h''(y; v, w)
exist. Further, suppose that h is locally Lipschitz continuous at y = k(x), i.e.
there exists a neighborhood U of y and some real L such that
one obtains
This proves (ii).
f(x) = Σ_{i=1}^{m} q_i(max_{j∈K_i} g_ij(x))
+ (1/2) Σ_{i=1}^{m} q''_i(g_{i,K_i}(x))[g'_{i,K_i(x)}(x)d]².
Proof. Fix i and put k(·) := g_{i,K_i}(·), i.e.
k(·) = l(g(·))
with l as defined in Lemma 3.2 and g(·) = (g_i1(·), ..., g_{i,k_i}(·))^t. Note that l is
locally Lipschitz continuous. Hence Lemmas 3.1 and 3.2 show
Similarly we get
The assertion follows if we apply the chain rule once more to q_i(k(x)) and sum
up over i.
Theorem 4.1. Suppose α_i ≥ 0 for i = 1, ..., m. Then the following two statements
hold.
(a) One and only one of the systems (I) and (II) has a solution.
(I) Find z ∈ X such that
(i) Σ_{j∈B_i} y_ij = α_i for i = 1, 2, ..., m,
(ii) Σ_{i=1}^{m} Σ_{j∈B_i} y_ij a_ij = 0,
(b) Suppose dim X < ∞. Then one and only one of the systems (I) and (II) has a
solution.
(I) Find z ∈ X such that
(II) Find y_ij ≥ 0 for i = 1, ..., m and j ∈ B_i such that (i) and (ii) hold and
inf_{z∈X} Σ_{i=1}^{m} α_i max_{j∈B_i} {⟨a_ij, z⟩ + β_ij} ≥ −β,
has an optimal solution ȳ and ω_D = ω_P ≥ −β. Feasibility means that ȳ ≥ 0 and
ω_P = Σ_{i=1}^{m} Σ_{j∈B_i} ... ≥ −β,
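Theorem 4.1 is a Gordan-type theorem of the alternative. A minimal numeric sketch of its flavor, with hypothetical data a_j and m = 1, α_1 = 1 (this illustrates the mutual exclusivity of the two systems, not the theorem's exact statement):

```python
import numpy as np

# Gordan-type alternative: for vectors a_j, either some z satisfies
# <a_j, z> < 0 for all j, or 0 is a convex combination of the a_j.
a = np.array([[1.0, 1.0], [-1.0, 1.0], [0.0, -1.0]])

# System (II): multipliers y_j >= 0 with sum y_j = 1 and sum y_j a_j = 0.
y = np.array([0.25, 0.25, 0.5])
assert np.all(y >= 0) and np.isclose(y.sum(), 1.0)
assert np.allclose(y @ a, 0.0)

# Hence system (I) is infeasible: for every z, max_j <a_j, z> >= 0,
# because 0 = sum_j y_j <a_j, z> <= max_j <a_j, z>.
rng = np.random.default_rng(0)
for _ in range(1000):
    z = rng.standard_normal(2)
    assert np.max(a @ z) >= 0.0
print("alternative verified on sampled directions")
```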
The proof of Theorem 2.1(b) will be given after the proof of Theorem 2.2.
β := (1/2) Σ_{i=1}^{m} q''_i(g_{i,K_i}(x̄))[g'_{i,K_i(x̄)}(x̄)d]².
Note that, as required in Theorem 4.1, the above α_i's are nonnegative by
assumption (1.7). With the above notation we have (see Proposition 3.3(ii))
(a) Let x̄ be optimal. Fix some d satisfying (o), i.e. f'(x̄; d) ≤ 0 by Proposition
3.3(i). Hence, by Theorem 2.1(a) and by (5.2),
In other words: system (I) of the theorem of the alternative has no solution and
thus (II) has a solution. This proves (a).
(b) Without loss of generality we may take m = 1. We write K for K_1, q for q_1
etc. Our proof is by negation. Suppose x̄ is not an isolated local minimizer, i.e.
there exists a sequence {x_n}_{n=1,2,...}, x_n ≠ x̄, with
x_n → x̄ but f(x_n) ≤ f(x̄) for all n.
Put d_n := x_n − x̄ and t_n := ‖d_n‖.
A compactness argument yields the existence of some d ≠ 0 in X such that (for a
suitable subsequence)
d = lim_{n→∞} t_n^{-1} d_n.
Let us see that d satisfies (o). By assumption, f(x_n) ≤ f(x̄) and thus for all n,
0 ≥ f(x_n) − f(x̄)
= q(g_K(x_n)) − q(g_K(x̄))
= q'(g_K(x̄))[g_K(x_n) − g_K(x̄)]
+ (1/2) q''(g_K(x̄) + θ_n[g_K(x_n) − g_K(x̄)])[g_K(x_n) − g_K(x̄)]²   (5.3)
with 0 ≤ θ_n ≤ 1. For convenience let us write A_n and B_n, respectively, for the first
and the second part of the right-hand side of the last equation. From (i) and a
Taylor expansion of the g_j's we obtain with suitable 0 ≤ σ_jn ≤ 1,
A_n = q'(g_K(x̄))[g_K(x_n) − g_K(x̄)]
= Σ_{j∈K(x̄,d)} ȳ_j [g_K(x_n) − g_K(x̄)]
≥ Σ_{j∈K(x̄,d)} ȳ_j [g_j(x_n) − g_j(x̄)]
= Σ_{j∈K(x̄,d)} ȳ_j g'_j(x̄)d_n + (1/2) Σ_{j∈K(x̄,d)} ȳ_j g''_j(x̄ + σ_jn d_n)(d_n, d_n).
(g_K(x̄ + t_n d) − g_K(x̄)) / t_n → g'_{K(x̄)}(x̄)d as n → ∞.
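The convergence of such a difference quotient can be observed numerically. A sketch with hypothetical affine g_j (illustrative data); note that only the indices active at x̄ determine the limit:

```python
import numpy as np

# Difference quotient of g_K(x) := max_j g_j(x) converges to the max,
# over the active set at x_bar, of the directional derivatives g'_j d.
def gs(x):
    return np.array([x[0] + x[1],        # g_1, active at x_bar = 0
                     -x[0] + 2 * x[1],   # g_2, active at x_bar = 0
                     x[0] - 1.0])        # g_3, inactive at x_bar = 0

def gK(x):
    return np.max(gs(x))

x_bar = np.array([0.0, 0.0])
d = np.array([1.0, 0.0])

# Here g_1'd = 1, g_2'd = -1, g_3'd = 1; the active set is {g_1, g_2},
# so the directional derivative is max(1, -1) = 1 (not the max over all j).
for t in [1e-2, 1e-4, 1e-6]:
    q = (gK(x_bar + t * d) - gK(x_bar)) / t
    print(t, q)                          # tends to 1.0
assert abs((gK(x_bar + 1e-6 * d) - gK(x_bar)) / 1e-6 - 1.0) < 1e-6
```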
We end up with a contradiction to the assumption that (iii) holds with strict
inequality.
The general problem (P) reduces to the minimax problem (P1) by setting
m = 1, q_1(t) = t, K_1 = {1, 2, ..., s}, g_1j(x) = g_j(x) for j ∈ K_1.   (6.1)
Note that (1.7) is satisfied. Using (6.1), Theorem 2.2 reduces to the following
statement.
Σ_{i∈K(x̄,d)} y_i g'_i(x̄) = 0,   (6.4)
Here
max{−d_1, 3d_1} ≤ 0.
i.e.
g_i2(x) = −g_i(x) for i = 1, ..., s.
Moreover, we set
σ_i := sign g_i(x̄),  θ_i = θ_i(d) := sign g'_i(x̄)d.   (6.10)
Proof. First note that for the l1-case the index sets K_i(x̄) and K_i(x̄, d) in
Theorem 2.2 reduce to
(ii) says
Using y_i1, y_i2 ≥ 0 for all i and y_i1 + y_i2 = 1 for i ∈ V(d) and setting u_i := 2y_i1 − 1,
it is seen that (6.15)-(6.17) convert to (6.12)-(6.14).
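The bookkeeping behind this substitution can be checked mechanically. A sketch of the identity y_i1 − y_i2 = u_i under the stated normalization (this verifies only the change of variables, not the full equivalence of the condition sets):

```python
# With g_i1 = g_i and g_i2 = -g_i, a multiplier pair contributes
# y_i1 * g_i' - y_i2 * g_i' = (y_i1 - y_i2) * g_i' = u_i * g_i',
# and y_i1, y_i2 >= 0 with y_i1 + y_i2 = 1 corresponds exactly to |u_i| <= 1.
for y1 in [0.0, 0.25, 0.5, 0.75, 1.0]:
    y2 = 1.0 - y1
    u = 2 * y1 - 1
    assert abs(u) <= 1                     # feasible u-range
    assert abs((y1 - y2) - u) < 1e-12      # same coefficient on g_i'
print("conversion u_i = 2*y_i1 - 1 checked")
```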
Using the expression for K_i(x̄) in the above proof, one finds that the regularity
condition (CQP) simplifies for the l1-problem (P2) to the classical constraint
qualification
(CQP2) the set {g'_i(x̄) | i ∈ A} is linearly independent.
Parallel to Corollary 2.3 one obtains
Corollary 6.3. Consider the l1-problem (P2) and assume that at x̄, (CQP2) holds.
Then a necessary condition for x̄ to be a local minimizer is that there exist
(unique) multipliers
|u_i| ≤ 1, i ∈ A,
such that
≤ 0 if i ∈ A and u_i = −1
the following inequality holds
Proof. The only new thing which needs a proof is the appearance of the set N
instead of the set
M := {d | Σ_{i∈A} |g'_i(x̄)d| + Σ_{i∈A*} σ_i g'_i(x̄)d ≤ 0}.
To see that M = N, let first d ∈ M. Use (6.18) to obtain
As |u_i| ≤ 1, this can only hold if each term in the summation is zero; d ∈ N
follows easily. Let now d ∈ N. Then
Σ_{i∈A} (|g'_i(x̄)d| − u_i g'_i(x̄)d) = 0
and
Σ_{i∈A} |g'_i(x̄)d| + Σ_{i∈A*} σ_i g'_i(x̄)d = 0,
i.e. d ∈ M.
Remark. With strict inequality in (6.19) and dim X < ∞ the optimality conditions
in Corollary 6.3 become sufficient. This is a special case of Theorem 6.2(b). Such
a result was stated in [4], however with a wrong proof (see the counterexample
in [3]).
We close this part of the section with a simple example for an application of
Theorem 6.2 (in fact the corollary).
Example. Consider
= (0, 0)^t.   (6.20)
(−a + 2)d_1² ≥ 0 for all d_1 ∈ R.
Hence x̄ = (0, 0)^t will be an isolated local minimizer for a < 2 but not a local
minimizer for a > 2. Note that the necessary first order condition (6.20) holds
regardless of the value of a, and hence does not provide any information.
The exact penalty function in problem (P3) is obtained for the general problem
(P) by setting m = s + 1 and
|w_i| ≤ 1, i ∈ V(d),
μ g'_0(x̄) + Σ_{i∈V(d)} w_i g'_i(x̄) + Σ_{i∈V*(d)} θ_i g'_i(x̄) + Σ_{i∈A*} σ_i g'_i(x̄) + Σ_{i=1}^{s'} g'_i(x̄) = 0,
Remark. For problem (P3) the regularity assumption guaranteeing the independence
of the multipliers from d is also (CQP2). In the simplified version
which results under such a constraint qualification one can also replace the set M
with the set N (see Corollary 6.3). This simplified necessary condition was
obtained in [5, Corollary 2]. The sufficient counterpart of our simplified condition
is the correct version of the conditions given in [5, Corollary 3].
Theorem 6.5 [Necessary and sufficient conditions for problem (P4); the case
p = 2].
(a) A necessary condition for x̄ to be a local minimizer of problem (6.22) is
that
(b) If dim X < ∞, a sufficient condition for x̄ to be an isolated local minimizer for
problem (6.22) is as above but with strict inequality in (6.24) for d ≠ 0.
Here A, A⁺ are defined as in (6.8).
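The reason second order directional information matters for (P4) with p = 2 is that a squared-hinge term max{t, 0}² is C¹ but not C² at t = 0. A one-dimensional sketch (illustrative scalar model of a single penalty term):

```python
# A quadratic exterior penalty term phi(t) = max{t, 0}^2 is once but
# not twice continuously differentiable at t = 0.
def phi(t):
    return max(t, 0.0) ** 2

def dphi(t):                   # derivative 2*max(t, 0): continuous at 0
    return 2.0 * max(t, 0.0)

h = 1e-7
# first derivative is continuous at 0:
assert abs(dphi(h) - dphi(-h)) < 1e-6
# second derivative jumps: 2 from the right, 0 from the left
right = (dphi(h) - dphi(0.0)) / h
left = (dphi(0.0) - dphi(-h)) / h
print(right, left)             # 2.0 and 0.0
assert abs(right - 2.0) < 1e-6 and abs(left) < 1e-6
```

This one-sided second-order behaviour is exactly what the directional second derivatives of Section 3 capture.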
Proof. The multipliers y_ij in Theorem 2.2 are independent of d for problem
(6.22). They are uniquely determined by condition (i) (note that at least d = 0
satisfies (o)):
7. Nonsmooth constraints
the necessary optimality conditions at a feasible solution x̄ are (see [2], Theorem
9.1):
For every d ∈ ∩_{i: f_i(x̄)=0} D_{f_i}(x̄) ∩ D_{f_0}(x̄) there correspond continuous
linear functionals l_i ∈ dom δ*(· | Q_i), not all zero, satisfying
(B-Z)  l_0 + Σ_{i: f_i(x̄)=0} l_i = 0  (the Euler-Lagrange equation),
The results of Sections 3 and 4 of this paper can be interpreted as follows: For
the functional
we have
(i) D_f(x̄) = {d | f'(x̄; d) ≤ 0; f'(x̄; d) given by Proposition 3.3},
(ii) for d ∈ D_f(x̄),
Q_f(x̄, d) = {z | f''(x̄; d, z) < 0; f''(x̄; d, z) given by Proposition 3.3},
(iii) for Q_f given in (ii) and d ∈ D_f(x̄);
Thus all the necessary elements to deal with objective functions and constraints
involving functionals of the general form (7.1) are given in (i)-(iii), and can be
directly incorporated into the (B-Z) optimality conditions.
References
[1] A. Auslender, "Penalty methods for computing points that satisfy second order necessary
conditions", Mathematical Programming 17 (1979) 229-238.
[2] A. Ben-Tal and J. Zowe, "A unified theory of first and second order conditions for extremum
problems in topological vector spaces", Mathematical Programming Study 19 (1982) 39-76.
[3] A. Ben-Tal and J. Zowe, "Discrete l1-approximation and related nonlinear nondifferentiable
problems", Preprint, Institute of Mathematics, University of Bayreuth, Bayreuth (1980).
[4] C. Charalambous, "On the condition for optimality of the non-linear l1-problem", Mathematical
Programming 19 (1980) 178-185.
[5] T.F. Coleman and A.R. Conn, "Second order conditions for an exact penalty function",
Mathematical Programming 19 (1980) 178-185.
[6] J.M. Danskin, The theory of max-min (Springer, Berlin, 1967).
[7] V.F. Dem'yanov and V.N. Malozemov, Introduction to minimax (Wiley, New York, 1974).
[8] V.F. Dem'yanov and A.B. Pevnyi, "Expansion with respect to a parameter of the extremal
values of game problems", USSR Computational Mathematics and Mathematical Physics 14
(1974) 33-45.
[9] R.J. Duffin, "Infinite programs", in: H.W. Kuhn and A.W. Tucker, eds., Linear inequalities and
related systems (Princeton University Press, Princeton, NJ, 1956) pp. 157-171.
[10] R. Fletcher and G.A. Watson, "First and second order conditions for a class of nondifferenti-
able optimization problems", Mathematical Programming 18 (1980) 291-307.
[11] S.P. Han and O.L. Mangasarian, "Exact penalty functions in nonlinear programming",
Mathematical Programming 17 (1979) 251-269.
[12] W. Krabs, Optimization and approximation (Wiley, New York, 1979).
[13] H. Maurer and J. Zowe, "First and second order necessary and sufficient optimality conditions
for infinite-dimensional programming problems", Mathematical Programming 16 (1979) 98-110.
[14] T. Pietrzykowski, "An exact penalty method for constrained maxima", SIAM Journal on
Numerical Analysis 6 (1969) 299-304.