Solving Constraint Satisfaction Problems Using Neural Networks
C. J. Wang, E. P. K. Tsang
University of Essex, UK
Authorized licensed use limited to: JILIN UNIVERSITY. Downloaded on December 07,2023 at 06:57:33 UTC from IEEE Xplore. Restrictions apply.
assigned to a variable at any time and that this value violates a minimum number of constraints.
By convention, the state of node i is denoted as $s_i$, which is either 1 for active or 0 for inactive. The connection weight between nodes i and j is denoted as $w_{ij}$, which is always a negative integer and is initially given the value -1. The input to a node is the weighted sum of the states of all the nodes connected to it. To illustrate how a network can be constructed for a CSP, let us take a simple example.
EXAMPLE: A TIGHT BINARY CSP
Suppose there are five variables, $Z_1$ to $Z_5$, and all the variables have the domain d = {1, 2, 3}. Let $V_i$ denote the value taken by the variable $Z_i$. The binary constraint between $V_i$ and $V_{i+1}$, for i = 1 to 4, is that the sum of $V_i$ and $V_{i+1}$ must be even. The binary constraint on $V_1$ and $V_5$ together is that: $V_1 = 2$ OR $V_5 = 2$.
The network will be constructed as in Figure 1, where nodes in column i form a cluster representing the variable $Z_i$, nodes in row j represent the jth value that can be assigned to a variable, and all the connections shown are negative.
Figure 1. The network structure for the Example (columns $Z_1$ to $Z_5$)
For non-binary CSPs in which the values are not simple ground terms and constraints are imposed on the attributes of the value structures, a multilayer network structure will be required. The number of layers required depends on the complexity of the value structure. We will show how this can be realized later when solving the car sequencing problem.
selecting the winning node in a cluster if there is more than one node that has the maximum input. In this case, if none of them is on, one will be randomly selected to turn on. If one of them is already on, it will remain on. This is to avoid chaotic or cyclic wandering of the network states. We have considered and experimented with breaking ties randomly, as is done in the Heuristic Repair Method, but found it ineffective in problems which have few solutions.
Take the above example for instance. The Heuristic Repair Method will fail in most runs. This is because only one out of the 243 possible network states represents a solution. Moreover, there are 88 local minima and about another 63 states that lead to a local minimum. Therefore, the network will fall into a local minimum with a probability of 151/243, or approximately 62%. This simple example shows the inadequacy of the Heuristic Repair Method and the pitfall of using neural networks to solve CSPs in general.
ESCAPING LOCAL MINIMA
When the network settles in a local minimum, there are some active nodes that have negative input, indicating that some constraints are violated. This happens because the state update of a node is a local decision based on the principle that the activated node in a cluster should violate a minimal number of constraints. Of course, this does not necessarily lead to a globally optimal decision (one which finds a solution). In the case of local minima, the state update rule would fail to make alternative choices. It would appear that introducing randomness or noise in the state update rule, in the manner of simulated annealing as suggested in the literature [13], might help in escaping local minima. However, this will degrade the overall performance so drastically that this approach will not be effective for solving real life problems.
In order to overcome these problems, we propose a learning rule that heuristically updates the connection weights to help make alternative selections of active nodes to escape local minima. The change of weight for the connection between every pair of nodes i and j, $\Delta w_{ij}$, is defined as follows:
$$w_{ij} \leftarrow w_{ij} - s_i s_j, \qquad \text{i.e.} \quad \Delta w_{ij} = -s_i s_j$$
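The example network, the state update rule with its tie-breaking convention, and the weight update can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' simulator: all names (`compatible`, `build_weights`, `node_input`, `genet`) are hypothetical, a connection of weight -1 is assumed between every pair of nodes whose values are incompatible, and learning is assumed to be applied whenever a full cycle changes no node while a constraint is still violated.

```python
import random

DOMAIN = (1, 2, 3)
N_VARS = 5

def compatible(i, vi, j, vj):
    """True if Z_i = vi and Z_j = vj violate no constraint (0-based, i < j)."""
    if j == i + 1:                       # adjacent variables: sum must be even
        return (vi + vj) % 2 == 0
    if i == 0 and j == N_VARS - 1:       # V1 = 2 OR V5 = 2
        return vi == 2 or vj == 2
    return True                          # no constraint between this pair

def is_solution(values):
    return all(compatible(i, values[i], j, values[j])
               for i in range(N_VARS) for j in range(i + 1, N_VARS))

def build_weights():
    """One connection of weight -1 per incompatible pair of nodes (assumption)."""
    weights = {}
    for i in range(N_VARS):
        for j in range(i + 1, N_VARS):
            for vi in DOMAIN:
                for vj in DOMAIN:
                    if not compatible(i, vi, j, vj):
                        weights[(i, vi), (j, vj)] = -1
    return weights

def node_input(weights, values, k, v):
    """Weighted sum of the states of the active nodes connected to node (k, v)."""
    total = 0
    for j in range(N_VARS):
        if j != k:
            a, b = (k, v), (j, values[j])
            total += weights.get((a, b) if a < b else (b, a), 0)
    return total

def genet(seed=0, max_cycles=10_000):
    """Return a solution as a list of values, or None if the cycle limit is hit."""
    rng = random.Random(seed)
    weights = build_weights()
    values = [rng.choice(DOMAIN) for _ in range(N_VARS)]
    for _ in range(max_cycles):
        changed = False
        for k in range(N_VARS):
            inputs = {v: node_input(weights, values, k, v) for v in DOMAIN}
            best = max(inputs.values())
            winners = [v for v in DOMAIN if inputs[v] == best]
            if values[k] not in winners:  # an active winner stays on in a tie
                values[k] = rng.choice(winners)
                changed = True
        if is_solution(values):
            return values
        if not changed:                   # local minimum: apply the learning rule
            for i in range(N_VARS):
                for j in range(i + 1, N_VARS):
                    key = ((i, values[i]), (j, values[j]))
                    if key in weights:    # both nodes active, so s_i * s_j = 1
                        weights[key] -= 1
    return None
```

Calling `genet(seed=0)` returns either `None` (cycle limit reached) or an assignment that satisfies every constraint; for this example the only such assignment is `[2, 2, 2, 2, 2]`.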
with an average of 23 convergence cycles over thousands of runs. As we shall report later, our extensive experiments show that this learning rule is so effective that for all the tested CSPs that are solvable, the network always finds a solution.
THE CAR-SCHEDULING PROBLEM
The purpose of this experiment is to show how GENET can be applied to non-binary, highly complex problems. For clarity, we will describe only a simplified version of our experiments. The car sequencing problem is a highly constrained problem appearing in GM production lines and considered intractable by earlier researchers [14]. Cars to be manufactured can be classified into different types (models), with each type requiring different options (e.g. sun roof, radio, etc.) to be installed. Production requirements specify the number of cars of each type to be manufactured. The problem is to position the cars to be manufactured on a conveyor belt (for production), satisfying a set of capacity constraints. The capacity constraint of a particular work area w limits the frequency of cars which require work to be done in w arriving in any sub-sequence. For example, because of limitations in the work force, no more than 3 out of any 4 consecutive cars on the conveyor belt should require air-conditioning to be fitted.
For instance, the production line may be able to produce three types of cars, and the cars may have up to three optional accessories such as air-conditioning, sun-roof, and stereo tape player. The options for each type of car are shown in Table 1, where a 1 means the option is required by that type of car. Table 1 also shows capacity constraints on the work areas which install the three options. A constraint of m/n indicates that at most m out of any n consecutive cars may have this option. This problem is complicated because not only do the capacity constraints have to be satisfied, but the production quota also has to be met. In the above problem, for instance, the task is to schedule 10 cars of Type 1 and 20 cars each of Types 2 and 3 so that none of the three work areas (for installing the three options) is overloaded. In an industrial environment the constraints are normally high, and the options are many, and therefore the scheduling problem is highly difficult.
NETWORK DESIGN FOR CAR-SEQUENCING
Remember that a CSP is defined as a triple (Z, D, C). In this problem, Z is the set of positions in a sequence of cars to be scheduled on to the production line. The domains for the variables are the car types. The constraints are the capacity constraints (which are n-ary constraints for a capacity constraint m/n) and production requirements (which are 50-ary constraints). The network is constructed as follows.
The first layer of the network consists of N clusters of nodes, with each cluster representing one variable, i.e. a position in the car sequence. Each cluster has three nodes representing the three possible values, i.e. the three car types. For convenience, we call the nodes in this layer variable-nodes. If a variable represents position j in the car sequence, then we shall call the cluster which corresponds to this variable the jth cluster.
The second layer also has N clusters of nodes. Each cluster corresponds to one variable, and each node represents one option which might be required. For convenience, we shall call the nodes in the second layer option-nodes. The clusters in the second layer are assumed to be ordered in the same way as those in the first layer. The connections between the first and the second layers simply map the car types into options. For example, if the variable-node in the ith cluster of the first layer which represents car type 1 is on, then the nodes in the ith cluster of the second layer which represent options 1 and 2 will be on (because car type 1 requires options 1 and 2).
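The mapping from car types to options, together with the m/n capacity constraints and the production quota of the example, can be checked directly on a candidate sequence. The Python sketch below is a plain constraint checker written for illustration, not the network itself; the data are taken from Table 1, while the names `OPTIONS_BY_TYPE`, `CAPACITY`, `REQUIRED`, `option_layer`, `capacity_violations` and `quota_met` are hypothetical.

```python
from collections import Counter

# Data from Table 1 (the identifiers are illustrative, not from the paper).
OPTIONS_BY_TYPE = {1: {1, 2}, 2: {1, 3}, 3: {2, 3}}   # car type -> required options
CAPACITY = {1: (2, 3), 2: (3, 4), 3: (2, 3)}          # option -> (m, n), i.e. m/n
REQUIRED = {1: 10, 2: 20, 3: 20}                      # car type -> cars to build

def option_layer(sequence):
    """Map the car type at each position to the set of options it requires."""
    return [OPTIONS_BY_TYPE[car_type] for car_type in sequence]

def capacity_violations(sequence):
    """List (option, start) for each window of n consecutive cars
    in which more than m cars require the option."""
    options = option_layer(sequence)
    violations = []
    for option, (m, n) in CAPACITY.items():
        for start in range(len(sequence) - n + 1):
            window = options[start:start + n]
            if sum(option in needed for needed in window) > m:
                violations.append((option, start))
    return violations

def quota_met(sequence, required=REQUIRED):
    """True if the sequence contains exactly the required number of each type."""
    return Counter(sequence) == Counter(required)
```

For example, `capacity_violations([1, 1, 1])` reports `[(1, 0)]`, because three consecutive cars of Type 1 all need option 1 while its capacity is 2/3.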
                  Car types               Capacity
             Type 1  Type 2  Type 3      constraints
Option 1        1       1       0           2/3
Option 2        1       0       1           3/4
Option 3        0       1       1           2/3
Required:      10      20      20
Table 1. Options, Requirements and Constraints
Figure 2. A template of the network for the Car-Scheduling Problem, where $op_i$ stands for option i, $ct_i$ for car type i, dotted lines are mapping connections (+1), and solid lines are constraints (-1). For simplicity, the requirement-nodes are not shown.
The final layer is constructed according to the capacity constraints. We shall call the nodes in the final layer constraint-nodes. If the capacity constraint of option k is m/n, then there is one constraint-node connected to the option-nodes of every n consecutive clusters. If a constraint-node X receives active inputs from more than m option-nodes, it is turned on, which signals the violation of this constraint. X will then transmit a negative signal to the
corresponding variable-nodes to reduce their chance of being switched on.
The production requirement constraints are implemented similarly. One node, called a requirement-node, is used for every car type and is connected to all the variable-nodes that represent the value of that type. If the number of cars of any type exceeds the requirement, the corresponding requirement-node will be turned on and will send a negative signal back to those variable-nodes. Figure 2 shows a template of the network structure. Our experiments show that the network always converges to a valid solution if the problem is solvable, and heuristic learning is not even required when the number of cars is below 20.
RANDOMLY GENERATED CSPS
In order to test the effectiveness of GENET for solving CSPs in general, we have performed thousands of experiments on different types of randomly generated CSPs, which were generated by varying the following 5 parameters:
N      the number of variables;
D      the maximum size of domains;
d      the average size of individual domains (d ≤ D);
$p_1$  the percentage of constraints between the variables, i.e. in the generated problem, there are $p_1 N(N-1)/2$ constraints; and
$p_2$  the percentage of the value compatibility between every two constrained variables, i.e. $p_2 d^2$ combinations of the value assignments of the variables $V_i$ and $V_j$ are legal.
The probability of two random assignments being compatible, $p_c$, can be obtained as follows:
$$p_c = 1 - p_1 + p_1 \times p_2$$
We have tested on problems with varying numbers of variables, varying domain sizes, and varying degrees of tightness (by varying $p_1$ and $p_2$). The results of GENET are checked against programs which perform complete search (using the FC-FFP approach). GENET is given a limit on the number of convergence cycles. If this limit is exceeded before a solution is found, GENET is instructed to report a failure. For all the problems tested, GENET is found to be capable of converging on solutions in solvable problems, and reporting failure in insoluble problems.
ANALYSIS OF EFFECTIVENESS
In the simulator, the state of one node is changed at a time. However, in a hardware implementation, the nodes could change their states in parallel. The performance of the hardware implementation should be measured by the time it takes to find solutions, which can be estimated by the number of cycles in the simulator. Table 2 shows the number of cycles that it takes to solve CSPs which have D = d = 6, $p_1$ = 10% and $p_2$ = 85%. Our test under these parameters is limited to 170 variables because the exhaustive search program fails to terminate within 24 hours for problems with 180 variables or more. It should be mentioned that GENET terminates faster than the exhaustive search program in problems with 160 variables or more.
No. of variables   Average cycles   Max. cycles
10                   1.480            2.00
20                   1.950            2.00
30                   2.000            2.00
40                   2.020            3.00
50                   2.120            5.00
60                   2.560            8.00
70                   2.920            7.00
80                   5.050           24.00
90                   6.180           28.00
100                  7.370           21.00
110                 10.640           26.00
120                 14.080           38.00
130                 18.720           51.00
140                 26.970           94.00
150                 35.440          102.00
160                 58.240          223.00
170                107.500          337.00
Table 2. Average and maximum number of cycles taken by GENET to solve randomly generated problems, with 100 runs per case
As can be seen from Table 2, the number of cycles taken by GENET to find solutions grows exponentially with the number of variables N. Statistical analysis shows that:
$$\text{no\_of\_cycles} = e^{0.026 \times N - 0.306}$$
with the correlation R = 0.975, which suggests a good fit. This is not surprising, as the CSP is an NP-hard problem. However, we should note that the absolute number of cycles required to find a solution is bounded for the following reasons. The number of cycles that is required to find a solution is influenced by the number of times that learning takes place, which is in turn influenced by the number of local minima in the search space. Tracing in the simulator reveals that the tighter the problem is, the more times the learning takes place. The probable number of solutions as a percentage of the total search space, $S_p$, can be expected to be:
$$S_p = \prod_{i=1}^{N-1} p_c^{\,i} = p_c^{N(N-1)/2}$$
Clearly, as $S_p$ decreases much more quickly than $d^N$ increases, the tightness of a problem grows super-exponentially as N grows (all other parameters being kept unchanged). When N grows to over 200, problems are normally insoluble. Therefore, the number of cycles taken by GENET is bounded.
It is important to note that the absolute number of cycles is of the order of hundreds. When N = 200, the expected number of cycles is 133.00. For an analog computer that takes $10^{-8}$ to $10^{-6}$ seconds to process one cycle (as a rough estimation), a problem of size $O(6^{200})$ can be solved in the order of $10^{-6}$ to $10^{-4}$ seconds.
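The quantities used in this analysis are easy to evaluate numerically. The short Python sketch below computes $p_c$, the logarithm of $S_p$, and the fitted cycle count; the function names are hypothetical. The function `log_expected_solutions` rests on an added assumption that is not stated in the text: it treats $S_p$ as the fraction of a search space of size $d^N$, so that the expected number of solutions is $d^N S_p$. Logarithms are used because $S_p$ itself underflows for large N.

```python
import math

def compat_probability(p1, p2):
    """p_c = 1 - p1 + p1 * p2, with p1 and p2 given as fractions."""
    return 1.0 - p1 + p1 * p2

def log_solution_fraction(n, p_c):
    """ln S_p, where S_p = p_c ** (N (N - 1) / 2)."""
    return n * (n - 1) / 2 * math.log(p_c)

def log_expected_solutions(n, d, p_c):
    """ln(d**N * S_p): assumes S_p is a fraction of a d**N search space."""
    return n * math.log(d) + log_solution_fraction(n, p_c)

def fitted_cycles(n):
    """Fitted number of cycles: exp(0.026 * N - 0.306)."""
    return math.exp(0.026 * n - 0.306)
```

With the parameters of Table 2 (d = 6, $p_1$ = 0.10, $p_2$ = 0.85), `compat_probability` gives 0.985 and `fitted_cycles(200)` is about 133. Under the stated assumption, `log_expected_solutions` is still positive at N = 200 and negative at N = 250, which is in line with the observation that problems become insoluble once N grows well beyond 200.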