Purdue Libraries

Purdue e-Pubs

ECE Technical Reports, Electrical and Computer Engineering

11-1-1991

NEURAL NETWORKS FOR CONSTRAINED OPTIMIZATION PROBLEMS

Walter E. Lillo
Purdue University School of Electrical Engineering

Stefen Hui
San Diego State University Department of Mathematical Sciences

Stanislaw H. Zak
Purdue University School of Electrical Engineering

Lillo, Walter E.; Hui, Stefen; and Zak, Stanislaw H., "NEURAL NETWORKS FOR CONSTRAINED OPTIMIZATION
PROBLEMS" (1991). ECE Technical Reports. Paper 322.
http://docs.lib.purdue.edu/ecetr/322

This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact
[email protected] for additional information.
Neural Networks for
Constrained Optimization
Problems

Walter E. Lillo
Stefen Hui
Stanislaw H. Zak

TR-EE 91-45
November 1991
NEURAL NETWORKS FOR CONSTRAINED
OPTIMIZATION PROBLEMS

Walter E. Lillo
School of Electrical Engineering
Purdue University
West Lafayette, IN 47907

Stefen Hui
Department of Mathematical Sciences
San Diego State University
San Diego, CA 92182

Stanislaw H. Zak

School of Electrical Engineering

Purdue University

West Lafayette, IN 47907


Abstract

This paper is concerned with utilizing neural networks and analog circuits

to solve constrained optimization problems. A novel neural network architec-

ture is proposed for solving a class of nonlinear programming problems. The

proposed neural network, or more precisely a physically realizable approxima-

tion, is then used to solve minimum norm problems subject to linear con-

straints. Minimum norm problems have many applications in various areas,

but we focus on their applications to the control of discrete dynamic processes.

The applicability of the proposed neural network is demonstrated on numerical

examples.

Key Words:

Constrained optimization, Minimum norm problems, Analog circuits

1. Introduction

The idea of using analog circuits to solve mathematical programming

problems seems to have been first proposed by Dennis (1959). Since then, vari-

ous types of "neural" networks have been proposed to obtain solutions to con-

strained optimization problems. In particular Chua and Lin (1984) developed

the canonical nonlinear programming circuit, using the Kuhn-Tucker condi-

tions from mathematical programming theory, for simulating general nonlinear


programming problems. Later Tank and Hopfield (1986) developed an optimi-

zation network for solving linear programming problems using general princi-

ples resulting from the basic collective computational properties of a class of

analog-processor networks. Practical design aspects of the Tank and Hopfield

network along with its stability properties were discussed by Smith and Port-

mann (1989). An extension of the results of Tank and Hopfield to more gen-

eral nonlinear programming problems was presented by Kennedy and Chua

(1988). In addition, they noted that the network introduced by Tank and

Hopfield could be considered to be a special case of the canonical nonlinear

programming network proposed by Chua and Lin (1984), with capacitors

added to account for the dynamic behavior of the circuit. Lillo et al. (1991)

have shown that the above discussed approach implicitly utilizes the penalty

function method. The idea behind the penalty method is to approximate a

constrained optimization problem by an unconstrained optimization problem -


see Luenberger (1984, Chp. 12) for a discussion of this approach.

In this paper we use the penalty function method approach to synthesize a

new neural optimization network capable of solving a general class of con-

strained optimization problems. The proposed programming network is dis-

cussed in section 2 along with its circuit implementation. We show that the

penalty function approach allows one to better control the effects of the physi-

cal constraints of the network's building blocks than the previously proposed

approaches. Our proposed architecture can be viewed as a continuous non-

linear neural network model. For a historical account of nonlinear neural net-

works, the reader may consult Grossberg (1988). In section 3 we discuss appli-

cations of the proposed neural optimization network to solving minimum norm

problems of the form:


minimize ||x||_p

subject to Ax = b,

where p = 1, 2, or ∞. The minimum norm problems are important, for exam-

ple, in the context of the control of discrete processes (see Cadzow (1971) or

LaSalle (1986, Chp. 17) for more information related to the issue). The

behavior of the proposed networks is then tested on a numerical example and

computer simulations are given in section 4. Conclusions are found in section

5.

2. Networks for Constrained Optimization

In this paper we are concerned with finding minimizers of constrained

optimization problems. We consider the following general form of a con-

strained optimization problem

minimize f(x)

subject to

g(x) ≥ 0
h(x) = 0,

where x ∈ R^n, f : R^n → R, g = [g_1, g_2, ..., g_q]^T : R^n → R^q, and
h = [h_1, h_2, ..., h_m]^T : R^n → R^m are vector valued functions of n variables with
dimensions q and m respectively. Since we are dealing with physical devices it
is reasonable to restrict the functions f, g, and h to be continuously
differentiable.

Chua and Lin (1984), and later Kennedy and Chua (1988), proposed
canonical nonlinear programming circuits for simulating the constrained
optimization problems of the above type (see Fig. 1). They analyzed the case

Figure 1. Dynamical canonical nonlinear programming circuit of Kennedy


and Chua (1988).

when only the inequality constraints are present. Their development was
based on the Kuhn-Tucker conditions from mathematical programming theory
(see for example Luenberger (1984) for more information on the Kuhn-Tucker
conditions). The functions φ_j, j = 1, ..., q, on the left side of Fig. 1 are defined
by:

v = φ_j(I) = { -cI  if I ≥ 0
             { 0    if I < 0.

Thus the μ_j terms have the form:

μ_j = φ_j(-g_j(x)) = { c g_j(x)  if g_j(x) ≤ 0
                     { 0         if g_j(x) > 0,     j = 1, ..., q.

Now applying Kirchhoff's current law (see for example Halliday and Resnick

(1978, p. 702)) to the circuit on the right side of Fig. 1 we obtain

C_k dx_k/dt = -( ∂f(x)/∂x_k + Σ_{j=1}^q μ_j ∂g_j(x)/∂x_k ),  k = 1, ..., n.

Solving for dx_k/dt we obtain

dx_k/dt = -(1/C_k) ( ∂f(x)/∂x_k + Σ_{j=1}^q μ_j ∂g_j(x)/∂x_k ).

Note that if c → ∞ then, in the steady state, the Kuhn-Tucker conditions are

satisfied.
In this paper we examine the case where we have equality constraints as

well as inequality constraints. An equality constraint h_j(x) = 0 can be
represented in terms of inequality constraint(s) in one of the following ways:

h_j(x) ≥ 0 and -h_j(x) ≥ 0 ,   or   -|h_j(x)| ≥ 0 .

However, to implement equality constraints in terms of inequality constraints

would be inefficient as will be seen later. In this paper we propose an alterna-

tive circuit which has a more efficient implementation of equality constraints

and a general form which more readily lends itself to implementation. This

alternative approach utilizes the penalty method. Utilizing the penalty

method, a constrained optimization problem considered in this paper can be

approximated by an unconstrained problem of the form:

minimize ( f(x) + c P(x) ) ,

where c > 0 is a constant, often referred to as a weight, and P(x) is a penalty

function. A penalty function is a continuous non-negative function which is

zero a t a point if and only if all constraints are satisfied at that point. In this

paper we consider penalty functions of the form:

P(x) = Σ_{j=1}^q g_j^-(x) + Σ_{j=1}^m |h_j(x)| ,

where g_j^-(x) = -min(0, g_j(x)). If we consider an equality constraint as two ine-
quality constraints, then the penalty function can be rewritten as:

P(x) = Σ_{j=1}^q g_j^-(x) + Σ_{j=1}^m ( g_{j1}^-(x) + g_{j2}^-(x) ) ,

where

g_{j1}(x) = h_j(x) and g_{j2}(x) = -h_j(x) .

The above penalty function P(x) is often referred to as an exact penalty func-
tion because for a sufficiently large finite value of c the penalty method
approximation, with the above P(x), yields the same global minimizers as the
constrained problem. The exact penalty functions have the drawback that
they are not usually differentiable.
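To make the penalty construction concrete, the following sketch (our own illustration in software, not a circuit from this report) evaluates P(x) and the penalized objective f(x) + cP(x); the quadratic f and the particular g and h in the example are hypothetical.

```python
import numpy as np

def exact_penalty(x, g, h):
    """Exact (L1-type) penalty: sum of g_j^-(x) = -min(0, g_j(x)) plus sum of |h_j(x)|."""
    g_minus = -np.minimum(0.0, g(x))        # zero wherever g_j(x) >= 0 (constraint satisfied)
    return g_minus.sum() + np.abs(h(x)).sum()

def penalized_objective(x, f, g, h, c=1000.0):
    """Unconstrained approximation f(x) + c*P(x) of the constrained problem."""
    return f(x) + c * exact_penalty(x, g, h)

# Hypothetical example: minimize ||x||_2^2 subject to x_1 >= 1 and x_1 + x_2 = 3.
f = lambda x: np.dot(x, x)
g = lambda x: np.array([x[0] - 1.0])         # inequality constraint g(x) >= 0
h = lambda x: np.array([x[0] + x[1] - 3.0])  # equality constraint h(x) = 0
x = np.array([2.0, 1.0])
print(exact_penalty(x, g, h))                # 0.0: x is feasible, so the penalty vanishes
```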
Having reviewed the penalty method we now introduce the proposed network.
The functions S̄_{α,β} and S̄_γ in Fig. 2 are smooth versions of the saturation
functions S_{α,β} defined by:

S_{α,β}(x) = { -α        for x < -β
             { (α/β) x   for -β ≤ x ≤ β
             { α         for x > β.
Figure 2. The proposed network for constrained optimization.
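For intuition, the ideal saturation S_{α,β} and one possible smooth version can be sketched as follows; the report does not specify the smoothing used in the op-amp realization, so the tanh form below is an assumption.

```python
import numpy as np

def sat(x, alpha, beta):
    """Ideal saturation S_{alpha,beta}: slope alpha/beta near zero, clipped at +/- alpha."""
    return np.clip((alpha / beta) * x, -alpha, alpha)

def sat_smooth(x, alpha, beta):
    """A smooth surrogate for S_{alpha,beta} (assumed form; any sigmoid would do)."""
    return alpha * np.tanh(x / beta)

x = np.linspace(-2.0, 2.0, 5)
print(sat(x, alpha=12.0, beta=0.5))   # saturates quickly because beta is small
```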

When α = β, we write S_{α,β} as S_α. We assume that α > γ. The μ̄_j and λ̄_j
terms are defined through bounded smooth nonlinearities of the constraint
values g_j(x) and h_j(x). When the smoothing is sharp, the μ̄_j and λ̄_j terms
can be approximated as:

μ̄_j ≈ (c/2) sgn(g_j(x)) - c/2 = { 0   if g_j(x) > 0
                                 { -c  if g_j(x) < 0,

λ̄_j ≈ c sgn(h_j(x)) .

Remark

The μ̄_j terms differ from the μ_j terms in the canonical dynamical circuit of
Kennedy and Chua (Fig. 1) in that their values are bounded. This
modification was made in order to accommodate the saturation limits of the
op-amps used in implementing the functions. As a result of replacing the μ_j
terms, it is necessary to replace the linear current sources of the dynamical
canonical circuit with nonlinear current sources in order to effectively enforce
the constraints.

Applying Kirchhoff's current law to the circuits on the right hand side of
Fig. 2 yields:

C_k dx_k/dt = -S̄_γ( ∂f(x)/∂x_k ) - S̄_{α,β}( Σ_{j=1}^q μ̄_j ∂g_j(x)/∂x_k + Σ_{j=1}^m λ̄_j ∂h_j(x)/∂x_k ),  k = 1, ..., n.

Substituting for μ̄_j and λ̄_j, we have

C_k dx_k/dt ≈ -S̄_γ( ∂f(x)/∂x_k ) - S̄_{α,β}( -c Σ_{j∈J} ∂g_j(x)/∂x_k + c Σ_{j=1}^m sgn(h_j(x)) ∂h_j(x)/∂x_k ),

where J is the index set of violated inequality constraints. In the region where
the gradient of P(x) is defined, this equation can be rewritten as

C_k dx_k/dt = -S̄_γ( ∂f(x)/∂x_k ) - S̄_{α,β}( c ∂P(x)/∂x_k ).

Note that if

c |∂P(x)/∂x_k| > β ,

then the term S̄_{α,β} saturates. If we assume the trajectory is in a region where
c ∂P(x)/∂x_k > β, then by the design assumption that α > γ, we obtain:

C_k dx_k/dt ≈ -S̄_γ( ∂f(x)/∂x_k ) - α ≤ γ - α < 0 ,

and, by symmetry, C_k dx_k/dt ≥ α - γ > 0 when c ∂P(x)/∂x_k < -β.
In addition, since C_k > 0, we conclude that if c |∂P(x)/∂x_k| > β, then dx_k/dt and
c ∂P(x)/∂x_k have opposite signs. Hence, if c |∂P(x)/∂x_k| > β, then

( ∂P(x)/∂x_k )( dx_k/dt ) ≤ -((α - γ)/C_k) |∂P(x)/∂x_k| < 0 .

Thus, when this saturation condition holds for every k with ∂P(x)/∂x_k ≠ 0,

dP/dt = Σ_k ( ∂P(x)/∂x_k )( dx_k/dt ) < 0 .

This implies that whenever S̄_{α,β} saturates and the trajectory is in the region
where P(x) is differentiable, then P(x) is decreasing along that trajectory. Note
that the set of points where P(x) is not differentiable has an n-dimensional
Lebesgue measure zero and that the circuits are designed so that β is small and
thus S̄_{α,β} will be saturated at almost all points outside the feasible region.
Thus, one would expect that the penalty function P(x) would decrease along
the trajectories outside the feasible region. Note that if S̄_{α,β} operates in the
saturated mode, then the bound for the rate of decrease of the penalty function
P(x) is independent of the form of the objective function.


It should be noted that if the initial condition is such that the system tra-
jectory reaches the feasible region, then the circuit dynamics are governed by
the equations

C_k dx_k/dt = -S̄_γ( ∂f(x)/∂x_k ),  k = 1, ..., n.

Having examined the dynamical behavior of the circuit in Fig. 2, we will
now consider its implementation. For the case of quadratic programming
problems subject to linear equality and inequality constraints the circuit shown
in Fig. 2 could be implemented using a neural network with the structure dep-
icted in Fig. 3. The implementation of the μ node is the same as was proposed
by Kennedy and Chua (1988) and is shown in Fig. 4. The implementations for
the λ and x nodes are depicted in Figs. 5 and 6. It should be clear from the

implementation of the various nodes that to represent an equality constraint in

terms of inequality constraints would be rather inefficient since an inequality

constraint node requires more hardware than an equality constraint node. We

would like to note that one may also use switched-capacitor circuits to imple-

ment neural optimization networks (Cichocki and Unbehauen (1991)).


Figure 3. Neural network for solving quadratic programming problems sub-

ject to linear constraints.

Having given an implementation corresponding to the general case of qua-

dratic programming we will now examine how a network of this basic structure

can be used to solve some minimum norm problems of interest.


Figure 4. Circuit implementation for an inequality constraint node. The
unlabeled resistances are chosen in such a way that I_{μ_j} = -g_j(x) mA.

Figure 5. Circuit implementation for an equality constraint node. The values of
the unlabeled resistors are chosen so that I_{λ_j} = -h_j(x) mA.

Figure 6. Circuit implementation for an x node. The values of the unla-
beled resistors are chosen so that I_F = ∂f(x)/∂x_k mA.

3. Networks for Solving Minimum Norm Problems

In this section we show how the previously proposed neural network archi-

tecture can be applied to control discrete dynamic systems modeled by the

equation

ξ_{k+1} = F ξ_k + G u_k ,

where ξ_k ∈ R^m, u_k ∈ R^l, for k = 1, 2, ..., and F, G are constant matrices with

appropriate dimensions. If we iteratively apply the previous equation we

obtain

ξ_N = F^N ξ_0 + Σ_{k=0}^{N-1} F^{N-1-k} G u_k .

We assume that our system is completely controllable (Kailath (1980)). This
implies that we can drive the system to an arbitrary desired state, ξ_d, from an
arbitrary initial state, ξ_0. Thus for sufficiently large N (N ≥ m) we can find
a sequence of inputs (u_0, u_1, ..., u_{N-1}) such that

ξ_d = F^N ξ_0 + Σ_{k=0}^{N-1} F^{N-1-k} G u_k .

In the case where N > m there are an infinite number of input sequences which
would drive the system to the desired state. This can be seen more clearly if
we rewrite the previous equation using the following definitions:

A = [G, FG, ..., F^{N-2}G, F^{N-1}G] ,   x^T = [u_{N-1}^T, u_{N-2}^T, ..., u_0^T] .

With these definitions, we have

ξ_d = F^N ξ_0 + A x ,

ξ_d - F^N ξ_0 = A x .

If we let b = ξ_d - F^N ξ_0 then we have

Ax = b .
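As an illustration of this construction (a sketch of our own; the system matrices F, G and the states below are hypothetical), the constraint data A and b can be assembled as follows:

```python
import numpy as np

def build_constraints(F, G, xi0, xid, N):
    """Build A = [G, FG, ..., F^{N-1}G] and b = xi_d - F^N xi_0 so that Ax = b,
    with x^T = [u_{N-1}^T, ..., u_0^T]."""
    blocks, Fk_G = [], G
    for _ in range(N):
        blocks.append(Fk_G)              # F^k G for k = 0, 1, ..., N-1
        Fk_G = F @ Fk_G
    A = np.hstack(blocks)
    b = xid - np.linalg.matrix_power(F, N) @ xi0
    return A, b

# Hypothetical double-integrator-like system with one input, m = 2, N = 4.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
A, b = build_constraints(F, G, xi0=np.zeros(2), xid=np.array([1.0, 0.0]), N=4)
print(A.shape, np.linalg.matrix_rank(A))   # (2, 4), rank 2: underdetermined
```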

If we define n = lN then A is m×n, b is m×1 and x is n×1. Since the system is
completely controllable and N > m, the rank of A is m and the null space of A has
dimension n - m > 0. From this it should be clear that the system of equa-
tions Ax = b is underdetermined (i.e. there is an infinite number of possible

solutions). Since there are many possible solutions, secondary criteria are often

used to determine which of the input sequences satisfying the constraints

should be used. Often it is desirable to find the solution which in some sense

minimizes the input x. This is the reason we consider the following con-

strained optimization problem

minimize ||x||_p

subject to Ax = b ,

where p = 1, 2, or ∞. The solutions corresponding to these problems are

referred to as the minimum fuel, minimum energy, and minimum amplitude


solutions respectively. Because of the importance of these problems they have

been studied fairly extensively (see for example Cadzow (1971, 1973), Kolev
(1975), or LaSalle (1986)). For the case of p = 2, there are algorithms based on
linear algebra which solve this problem. When p = 1 or p = ∞, the problems
are somewhat more complex. There are algorithms based on results from func-
tional analysis which have been proposed to solve these problems (Cadzow
(1971, 1973)). In applications such as real time control the speed at which a
solution can be obtained is of the utmost importance. It is for this reason that
we propose the use of analog circuits, or neural networks, which are capable of
obtaining solutions in times on the order of a few time constants.

We will now examine how the quadratic programming implementation

given in the previous section can be applied to solving the problems of interest.

The first thing we notice with all these problems is that the constraints are

linear. Thus in the case where p = 2, since the objective function of the

equivalent problem can be expressed as a quadratic, the network given in the

previous section can be used to solve the problem.
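For comparison with the network's output, the p = 2 (minimum energy) solution also admits a closed form: for A of full row rank, the minimum 2-norm solution of Ax = b is the pseudoinverse solution, sketched below.

```python
import numpy as np

def min_energy_solution(A, b):
    """Minimum 2-norm solution of the underdetermined system Ax = b:
    x = A^T (A A^T)^{-1} b, i.e. the pseudoinverse solution (A full row rank)."""
    return A.T @ np.linalg.solve(A @ A.T, b)

# Equivalent one-liner: np.linalg.pinv(A) @ b
```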

For the case of p = 1, the objective function cannot be expressed as a qua-
dratic. However, as shown below, the components of the gradient of the objec-
tive function are still simple functions of the variables x_1, ..., x_n:

∂||x||_1 / ∂x_k = sgn(x_k) ,  x_k ≠ 0 .

This being the case, a component of the gradient of ||x||_1 can be approximated
by the circuit depicted in Fig. 7.


Figure 7. Implementation for approximating a component of the gradient

of the objective function for the case where p = 1.
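In software, the role of the circuit of Fig. 7 can be mimicked by a steep saturation standing in for sgn (our sketch; the report does not reproduce Fig. 7's component values):

```python
import numpy as np

def grad_l1_smooth(x, eps=1e-3):
    """Smooth approximation of the gradient of ||x||_1: a steep saturation
    that approaches sgn(x_k) componentwise as eps -> 0."""
    return np.clip(x / eps, -1.0, 1.0)
```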

The x-nodes would then be modified as shown in Fig. 8.

For the case p = ∞ the objective function cannot be expressed as a qua-

dratic. In addition, we can see from the equation below that the components

of the gradient of the objective function cannot be expressed in a simple


manner as was the case when p = 1. They have the form

∂||x||_∞ / ∂x_k = { sgn(x_k)  if |x_k| = ||x||_∞
                  { 0         otherwise.

Rather than try to implement this problem directly by building a circuit to


approximate the components of the gradient of the objective function given

above, we transform the problem into an equivalent one which can be


simulated by a network of the form given in section 2. To understand how

this is done, consider the level surface ||x||_∞ = a, where a > 0. This level
surface corresponds to the boundary of the closed hypercube:

H_a = { x : -a ≤ x_k ≤ a, k = 1, ..., n } .

Thus the problem can be viewed as finding the smallest value of a > 0 such
that the constraint Ax = b is satisfied and x is an element of the set H_a. If we
let x_{n+1} = a and x* = [x_1, x_2, ..., x_n]^T, then the problem can be written as:

minimize x_{n+1}

subject to

h(x) = A x* - b = 0
g_1(x) = x_{n+1} ≥ 0
g_{11}(x) = x_{n+1} - x_1 ≥ 0
g_{12}(x) = x_{n+1} + x_1 ≥ 0
g_{21}(x) = x_{n+1} - x_2 ≥ 0
g_{22}(x) = x_{n+1} + x_2 ≥ 0
...
g_{n1}(x) = x_{n+1} - x_n ≥ 0
g_{n2}(x) = x_{n+1} + x_n ≥ 0 .

We have transformed the original problem into a linear programming problem
and the quadratic programming network introduced in the previous section can

be used to solve this problem.
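For reference, the reformulated linear program can also be checked numerically with an off-the-shelf LP solver; the sketch below (our own, assuming SciPy is available) is independent of the analog network.

```python
import numpy as np
from scipy.optimize import linprog

def min_amplitude_solution(A, b):
    """Solve min ||x||_inf s.t. Ax = b via the LP: minimize a = x_{n+1}
    subject to Ax = b and -a <= x_j <= a for all j."""
    m, n = A.shape
    cost = np.zeros(n + 1)
    cost[-1] = 1.0                                     # minimize the bound a
    A_eq = np.hstack([A, np.zeros((m, 1))])            # [A 0][x; a] = b
    # x_j - a <= 0 and -x_j - a <= 0, j = 1, ..., n
    A_ub = np.vstack([np.hstack([np.eye(n), -np.ones((n, 1))]),
                      np.hstack([-np.eye(n), -np.ones((n, 1))])])
    b_ub = np.zeros(2 * n)
    bounds = [(None, None)] * n + [(0, None)]          # x free, a >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b, bounds=bounds)
    return res.x[:n], res.x[-1]                        # solution and its inf-norm
```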

Figure 8. Implementation for an x node for the case where p = 1. The
unlabeled resistances are chosen so that the current I_P = -∂P(x)/∂x_k .

For some other interesting applications of neural networks for quadratic


minimization the reader may consult Sudharsanan and Sundareshan (1991).
4. Case Study

In order to test the ideas presented in this paper, simulations of the pro-
posed implementations were performed on a digital computer. The simulations
are based on the following differential equations (see section 2):

C_k dx_k/dt = -S̄_γ( ∂f(x)/∂x_k ) - S̄_{α,β}( Σ_{j=1}^q μ̄_j ∂g_j(x)/∂x_k + Σ_{j=1}^m λ̄_j ∂h_j(x)/∂x_k ),

where S̄_{α,β} and S̄_γ are as defined in Section 2 with α = 12, β = 0.5, and γ = 6.
We use c = 1000 in the definitions of the variables μ̄_j, j = 1, ..., q, and
λ̄_j, j = 1, ..., m. We approximate the signum function sgn(x) by a steep smooth
saturation function.
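A discrete-time sketch of these dynamics (our own illustration using the report's parameter values; the unit capacitances, step size, smooth sgn form, and the example problem are assumptions) is given below.

```python
import numpy as np

ALPHA, BETA, GAMMA, C_PEN = 12.0, 0.5, 6.0, 1000.0

def sat(v, level, knee):
    """Saturation at +/- level with linear slope level/knee near zero."""
    return np.clip((level / knee) * v, -level, level)

def simulate(grad_f, grad_cP, x0, dt=1e-3, steps=5000):
    """Forward-Euler integration of
    dx_k/dt = -S_gamma(df/dx_k) - S_{alpha,beta}(c * dP/dx_k), unit capacitances."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        x += dt * (-sat(grad_f(x), GAMMA, GAMMA)    # S_gamma term (alpha = beta case)
                   - sat(grad_cP(x), ALPHA, BETA))  # penalty term, saturates quickly
    return x

# Example: minimize ||x||_2^2 subject to x_1 + x_2 = 1 (hypothetical data).
grad_f = lambda x: 2.0 * x
grad_cP = lambda x: C_PEN * np.sign(x[0] + x[1] - 1.0) * np.ones(2)
print(simulate(grad_f, grad_cP, np.zeros(2)))       # approaches [0.5, 0.5]
```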

The problem which we choose to simulate is taken from Cadzow (1973) and
has the form:

minimize ||x||_p

subject to Ax = b ,

where p = 1, 2, or ∞, and A and b are as given in Cadzow (1973). The variables
x_j, j = 1, ..., n, are constrained to be in the interval [-12, 12]. The
results of the simulations for p = 1, 2 and ∞ are given below.

For p = 1, as shown in Fig. 9, the trajectories converged to a point
with ||x||_1 = 1.36.

For p = 2, as shown in Fig. 10, the trajectories converged to a point
with ||x||_2 = 0.769.

For p = ∞, as shown in Figs. 11 and 12, the trajectories converged to a
point with ||x||_∞ = 0.372.

These norms closely correspond to those of the analytical solutions to the
three problems.

Another important consideration is the speed with which the network con-
verges to the correct solution. This depends on the value of the time constants
and the initial condition of the network. In the above simulations we assumed
there was no initial charge on the capacitors in the networks. This corresponds
to the condition x_j(0) = 0, j = 1, ..., n. From the following plots of the trajectories
of the variables for the three problems we can see that the network converged
to the solution within a few time constants.


Figure 9. Trajectories corresponding to the case p = 1 (variables plotted
against time, in units of network time constants).


Figure 10. Trajectories corresponding to the case p = 2 (variables plotted
against time, in units of network time constants).


Figure 11. Trajectories corresponding to the case p = ∞ (variables plotted
against time, in units of network time constants).


Figure 12. Trajectory of the augmented variable x_{n+1} for the case p = ∞.

5. Conclusions

A general form of a network was given which can minimize a function
subject to both equality and inequality constraints. An implementation was
given for the case of quadratic programming with linear equality and inequal-
ity constraints. Next the minimum norm problems were introduced and it was
shown how the previously introduced implementation could be used and
modified to solve the various minimum norm problems of interest. The net-
works were then simulated on a digital computer and successfully tested on a
benchmark problem.
References

Cadzow, J.A. (1971). "Algorithm for the minimum-effort problem." IEEE

Trans. Automatic Control, vol. AC-16, no. 1, pp. 60-63.

Cadzow, J.A. (1973). "Functional analysis and the optimal control of linear

discrete systems." Int. J. Control, vol. 17, no. 3, pp. 481-495.

Chua, L.O., and Lin, G.-N. (1984). "Nonlinear programming without computa-
tion," IEEE Trans. Circuits and Systems, vol. CAS-31, no. 2, pp. 182-188.

Cichocki, A., and Unbehauen, R. (1991). "Switched-capacitor neural networks

for differential optimization," Int. J. Circuit Theory and Applications, vol. 19,

no. 2, pp. 161-187.

Dennis, J.B. (1959). Mathematical Programming and Electrical Networks,


London, England, Chapman & Hall.

Grossberg, S. (1988). "Non-linear neural networks: Principles, mechanisms, and

architectures," Neural Networks, vol. 1, no. 1, pp. 17-61.

Halliday, D., and Resnick, R. (1978). Physics, Third Edition, J. Wiley & Sons,

New York.

Kailath, T. (1980). Linear Systems. Englewood Cliffs, New Jersey, Prentice-

Hall.
Kennedy, M.P., and Chua, L.O. (1988). "Neural networks for nonlinear pro-

gramming." IEEE Trans. Circuits and Systems, vol. 35, no. 5, pp. 554-562.

Kolev, L. (1975). "Iterative algorithm for the minimum fuel and minimum
amplitude problems for linear discrete systems." Int. J. Control, vol. 21, no. 5,
pp. 779-784.

LaSalle, J.P. (1986). The Stability and Control of Discrete Processes. New

York, Springer-Verlag.

Lillo, W.E., Loh, M.H., Hui, S., and Zak, S.H. (1991). "On solving constrained
optimization problems with neural networks: A penalty method approach,"

Technical Report TR-EE-91-43, School of EE, Purdue Univ., West Lafayette,

IN.

Luenberger, D.G. (1984). Linear and Nonlinear Programming. Reading,

Massachusetts, Addison-Wesley.

Smith, M.J.S., and Portmann, C.L. (1989). "Practical design and analysis of a
simple 'neural' optimization circuit," IEEE Trans. Circuits and Systems, vol.
36, no. 1, pp. 42-50.

Sudharsanan, S.I., and Sundareshan, M.K. (1991). "Exponential stability and a

systematic synthesis of a neural network for quadratic minimization", Neural

Networks, vol. 4, no. 5, pp. 599-613.


Tank, D.W., and Hopfield, J.J. (1986). "Simple neural optimization networks:

An A/D converter, signal decision circuit, and a linear programming circuit."


IEEE Trans. Circuits and Systems, vol. CAS-33, no. 5, pp. 533-541.
