SOLVING CONSTRAINT SATISFACTION PROBLEMS USING NEURAL NETWORKS

C J Wang, E P K Tsang

University of Essex, UK

INTRODUCTION

A constraint satisfaction problem (CSP) is defined as a triple (Z, D, C), where Z is a finite set of variables, D is the set of domains for the variables, and C is a set of constraints. Each constraint in C restricts the values that one can assign to a set of variables simultaneously. A constraint is n-ary if it applies to n variables. The task is to assign one value per variable, satisfying all the constraints in C [1]. A binary constraint CSP is a CSP with unary and binary constraints only.

Problems in many application domains can be formulated as CSPs, for instance, the N-queens problem [2], line labelling in vision [3], temporal reasoning [4, 5], and scheduling [6], to name a few. The majority of existing work in CSP focuses on problem reduction and heuristic search [1, 2]. The search space in a CSP is $O(d^N)$, where d is the domain size (assuming, for simplicity, that all domains have the same size) and N is the number of variables in the problem. We define the tightness of a CSP to be the number of solutions over the search space.

Constraint programming languages based on heuristic search, such as CHIP [7], have been claimed to be efficient in a number of applications [6]. For handling variables with finite domains, these languages use the Forward Checking algorithm coupled with the Fail First Principle (we shall refer to this later as the FC-FFP approach) [2]. One problem of such heuristic search approaches is that they have exponential complexity. For CSPs with, say, 10000 variables and an average domain size of 50, the search space will be so huge that even a heuristic search approach may not produce any answer within a tolerable period of time. Thus, many real life CSPs are computationally intractable by heuristic search.

In principle, neural networks may be able to overcome the above problems. Whilst a heuristic search approach guarantees to find a valid solution if there is any, the probabilistic nature of the neural network convergence procedure may produce a solution much more quickly. Hopfield and Tank's work on the Travelling Salesman Problem [8] is such an example. The attempt to apply neural network techniques to solve CSPs has led to the discovery of an algorithm called the Heuristic Repair Method [9], which uses the so-called Min-conflict Heuristic. The Heuristic Repair Method is extracted from the GDS neural network model [10], and it manages to solve the million-queens problem in minutes.

A CSP can be represented as a network structure in which the variable assignments are represented by the activation of nodes and the constraints are represented by the connections, possibly with different weights. Hopefully, when the network converges, the set of nodes which are on represent a set of assignments which form a solution. One problem with applying neural network techniques for solving CSPs is that the network may settle in local minima, i.e. a set of assignments which violates a small number of constraints but does not represent a solution of the problem. The Heuristic Repair Method has only shown its effectiveness in binary CSPs where many solutions exist. In the N-queens problem, the larger N is, the more solutions there are. Our experiments have shown that the Heuristic Repair Method will fail to solve CSPs which have few solutions or problems for which there are many local minima. This will be discussed in detail later.

In this paper, we describe GENET, a generic neural network simulator that can solve general CSPs with finite domains. GENET generates a sparsely connected network for a given CSP with constraints C specified as binary matrices, and simulates the network convergence procedure. In case the network falls into local minima, a heuristic learning rule will be applied to escape from them. The network model lends itself to massively parallel processing. The experimental results of applying GENET to randomly generated CSPs, including very tightly constrained ones, and to the real life problem of car sequencing will be reported, and an analysis of the effectiveness of GENET will be given.

NETWORK MODEL

The network model is based on the Interactive Activation model (IA) with modifications to suit the nature of the CSPs as defined at the beginning of this paper. The IA model in its original form can be characterized as weak constraint satisfaction, in which the connections represent the coherence, or compatibility, between the connected nodes. This model was developed for associative information retrieval or pattern matching [11, 12]. However, it is not adequate for solving CSPs in general, for which all the constraints are absolute and none of them should be violated at all. For this purpose, the following modifications have been developed.

• The nodes in the network are grouped into clusters, with each cluster representing a variable in Z, and the nodes in each cluster represent the values that can be assigned to the variable.

• Only inhibitory connections are allowed. The inhibitory connections represent the constraints that do not allow the connected nodes to be active (i.e. turned on) simultaneously.

• The nodes in the same cluster compete with each other in convergence cycles. The node that receives maximum input will be turned on and the others turned off. This is to ensure that only one value is

Authorized licensed use limited to: JILIN UNIVERSITY. Downloaded on December 07,2023 at 06:57:33 UTC from IEEE Xplore. Restrictions apply.
assigned to a variable at any time and that this value violates a minimum number of constraints.

By convention, the state of node i is denoted as $s_i$, which is either 1 for active or 0 for inactive. The connection weight between nodes i and j is denoted as $w_{ij}$, which is always a negative integer and initially given the value -1. The input to a node is the weighted sum of all the nodes' states connected to it. To illustrate how a network can be constructed for a CSP, let's take a simple example as follows.

EXAMPLE: A TIGHT BINARY CSP

Suppose there are five variables, $Z_1$ to $Z_5$, and all the variables have the domain d = {1, 2, 3}. Let $V_i$ denote the value taken by the variable $Z_i$. The binary constraint between $V_i$ and $V_{i+1}$, for i = 1 to 4, is that the sum of $V_i$ and $V_{i+1}$ must be even. The binary constraint on $V_1$ and $V_5$ together is that: $V_1 = 2$ OR $V_5 = 2$.

The network will be constructed as in Figure 1, where nodes in column i form a cluster representing the variable $Z_i$, nodes in row j represent the jth value that can be assigned to a variable, and all the connections shown are negative.

[Figure 1. The network structure for the Example. Column labels: Z1, Z2, Z3, Z4, Z5.]

For non-binary CSPs in which the values are not simple ground terms and constraints are imposed on the attributes of the value structures, a multilayer network structure will be required. The number of layers required depends on the complexity of the value structure. We will show how this can be realized later when solving the car sequencing problem.

NETWORK CONVERGENCE

Initially, one node in each cluster is randomly selected, which means randomly assigning a value to each variable. Then, in each convergence cycle, every node calculates its input and the node in each cluster that has the maximum input will be selected to turn on and the others will be turned off. Since there exist only negative connections (representing the constraints in the problem), the winner in each cluster represents a value assigned to the corresponding variable which would violate the fewest constraints. This effectively resembles the Min-Conflict Heuristic [9]. After a number of cycles, the network will settle in a stable state. In a stable state, if all the active nodes have zero input, a valid solution has been found. Otherwise, the network is in a local minimum.

When updating the network state, care has to be taken in selecting the winning node in a cluster if there is more than one node that has the maximum input. In this case, if none of them is on, one will be randomly selected to turn on. If one of them is already on, it will remain on. This is to avoid chaotic or cyclic wandering of the network states. We have considered and experimented with breaking ties randomly, as is done in the Heuristic Repair Method, but found it ineffective in problems which have few solutions.

Take the above example for instance. The Heuristic Repair Method will fail in most runs. This is because only one out of the 243 possible network states represents a solution. Moreover, there are 88 local minima and about another 63 states will lead to a local minimum. Therefore, the network will fall into a local minimum with a probability of 151/243, or approximately 62%. This simple example shows the inadequacy of the Heuristic Repair Method and the pitfall of using neural networks to solve CSPs in general.

ESCAPING LOCAL MINIMA

When the network settles in a local minimum, there are some active nodes that have negative input, indicating that some constraints are violated. This happens because the state update of a node is a local decision based on the principle that the activated node in a cluster should violate a minimal number of constraints. Of course, this does not necessarily lead to a globally optimal decision (one which finds a solution). In the case of local minima, the state update rule would fail to make alternative choices. It would appear that introducing randomness or noise in the state update rule, in the manner of simulated annealing as suggested in the literature [13], might help in escaping local minima. However, this will degrade the overall performance so drastically that this approach will not be effective for solving real life problems.

In order to overcome these problems, we propose a learning rule that heuristically updates the connection weights to help make alternative selections of active nodes to escape local minima. The change of weight for the connection between every pair of nodes i and j, $\Delta w_{ij}$, is defined as follows:

$$\Delta w_{ij} = w_{ij} - s_i \times s_j$$

that is, the weight of a connection is decreased by one whenever both of the nodes it connects are active.

To show why this heuristic learning rule is effective in escaping local minima, let's consider the case in which the network is in a local minimum. Since the network is in a local minimum, there must exist at least two active nodes connected by a negative weight. Let one such pair of nodes be i and j. By stipulation, nodes i and j must have the maximum input in their own clusters. However, their inputs will be reduced by one after every learning cycle, as long as the network state does not change. Clearly, after sufficient (normally one or a few) learning cycles, either i or j will not win the competition in their own clusters. Hence, the state of the network will eventually find its way out of the local minimum. This learning rule is effectively developing a weighting of the constraints that guides the network state trajectory towards solutions, if any exist. When this learning rule is applied to solve the above example problem, the network always converges to the solution,


with an average of 23 convergence cycles over thousands of runs. As we shall report later, our extensive experiments show that this learning rule is so effective that for all the tested CSPs that are solvable, the network always finds a solution.

THE CAR-SCHEDULING PROBLEM

The purpose of this experiment is to show how GENET can be applied to non-binary, highly complex problems. For clarity, we will describe only a simplified version of our experiments. The car sequencing problem is a highly constrained problem appearing in GM production lines and considered intractable by earlier researchers [14]. Cars to be manufactured can be classified into different types (models), with each type requiring different options (e.g. sun roof, radio, etc.) to be installed. Production requirements specify the number of cars of each type to be manufactured. The problem is to position the cars to be manufactured on a conveyor belt (for production), satisfying a set of capacity constraints. The capacity constraint of a particular work area w limits the frequency of cars which require work to be done in w arriving in any sub-sequence. For example, because of limitations in the work force, no more than 3 out of any 4 consecutive cars on the conveyor belt should require air-conditioning to be fitted.

For example, the production line may be able to produce three types of cars, and the cars may have up to three optional accessories such as air-conditioning, sun-roof, and stereo tape player. The options for the types of cars are shown in Table 1, where a 1 means the option is required by that type of car. Table 1 also shows capacity constraints on the work areas which install the three options. A constraint of m/n indicates that at most m out of any n consecutive cars may have this option. This problem is complicated because not only will the capacity constraints have to be satisfied, but the production quota will also have to be met. In the above problem, for instance, the task is to schedule 10 cars of Type 1 and 20 cars each of Types 2 and 3 so that none of the three work areas (for installing the three options) is overloaded. In an industrial environment the constraints are normally high, and the options are many, and therefore the scheduling problem is highly difficult.

                    Car types                Capacity
              Type 1   Type 2   Type 3      constraints
Option 1         1        1        0            2/3
Option 2         1        0        1            3/4
Option 3         0        1        1            2/3
Required:       10       20       20

Table 1. Options, Requirements and Constraints

NETWORK DESIGN FOR CAR-SEQUENCING

Remember that a CSP is defined as a triple (Z, D, C). In this problem, Z is the set of positions in a sequence of cars to be scheduled on to the production line. The domains for the variables are the car types. The constraints are the capacity constraints (which are n-ary constraints for capacity constraint m/n) and production requirements (which are 50-ary constraints). The network is constructed as follows.

The first layer of the network consists of N clusters of nodes, with each cluster representing one variable, i.e. a position in the car sequence. Each cluster has three nodes representing the three possible values, i.e. the three car types. For convenience, we call the nodes in this layer variable-nodes. If a variable represents position j in the car sequence, then we shall call the cluster which corresponds to this variable the jth cluster.

The second layer also has N clusters of nodes. Each cluster corresponds to one variable, and each node represents one option which might be required. For convenience, we shall call the nodes in the second layer option-nodes. The clusters in the second layer are assumed to be ordered in the same way as those in the first layer. The connections between the first and the second layers simply map the car types into options. For example, if the variable-node in the ith cluster of the first layer which represents car type 1 is on, then the nodes in the ith cluster of the second layer which represent options 1 and 2 will be on (because car type 1 requires options 1 and 2).

[Figure 2. A template of the network for the Car-Scheduling Problem, where $op_i$ stands for option i, $ct_i$ for car type i, dotted lines are mapping connections (+1), and solid lines are constraints (-1). For simplicity, the requirement-nodes are not shown. Layer labels: Constraint nodes, Option nodes.]

The final layer is constructed according to the capacity constraints. We shall call the nodes in the final layer constraint-nodes. If the capacity constraint of option k is m/n, then there is one constraint-node connected to the option-nodes of every n consecutive clusters. If a constraint-node X receives active inputs from more than m option-nodes, it is turned on, which signals the violation of this constraint. X will then transmit a negative signal to the


corresponding variable-nodes to reduce their chance of being switched on.

The production requirement constraints are implemented similarly. One node, called a requirement-node, is used for every car type and it is connected to all the variable-nodes that represent the value of that type. If the number of cars of any type exceeds the requirement, the corresponding requirement-node will be turned on and it will send a negative signal back to those variable-nodes. Figure 2 shows a template of the network structure. Our experiments show that the network always converges to a valid solution if the problem is solvable, and heuristic learning is not even required when the number of cars is below 20.

RANDOMLY GENERATED CSPS

In order to test the effectiveness of GENET for solving CSPs in general, we have performed thousands of experiments on different types of randomly generated CSPs, which were generated by varying the following 5 parameters:

N      the number of variables;
D      the maximum size of domains;
d      the average size of individual domains (d ≤ D);
$p_1$  the percentage of constraints between the variables, i.e. in the generated problem, there are $p_1 N(N-1)/2$ constraints; and
$p_2$  the percentage of value compatibility between every two constrained variables, i.e. $p_2 d_i d_j$ combinations of the value assignments of the variables $V_i$ and $V_j$ are legal.

The probability of two random assignments being compatible, $p_c$, can be obtained as follows:

$$p_c = 1 - p_1 + p_1 \times p_2$$

We have tested on problems with varying numbers of variables, varying domain sizes, and varying degrees of tightness (by varying $p_1$ and $p_2$).

The results of GENET are checked against programs which perform complete search (using the FC-FFP approach). GENET is given a limit on the number of convergence cycles. If this limit is exceeded before a solution is found, GENET is instructed to report a failure. For all the problems tested, GENET is found to be capable of converging on solutions in solvable problems, and reporting failure in insoluble problems.

ANALYSIS OF EFFECTIVENESS

In the simulator, the state of one node is changed at a time. However, in a hardware implementation, the nodes could change their states in parallel. The performance of the hardware implementation should be measured by the time it takes to find solutions, which can be estimated by the number of cycles in the simulator. Table 2 shows the number of cycles that it takes to solve CSPs which have D = d = 6, $p_1$ = 10% and $p_2$ = 85%. Our test under these parameters is limited to 170 variables because the exhaustive search program fails to terminate in over 24 hours for problems with 180 variables or more. It should be mentioned that GENET terminates faster than the exhaustive search program in problems with 160 variables or more.

No. of variables    average cycles    max. cycles
       10                1.480             2.00
       20                1.950             2.00
       30                2.000             2.00
       40                2.020             3.00
       50                2.120             5.00
       60                2.560             8.00
       70                2.920             7.00
       80                5.050            24.00
       90                6.180            28.00
      100                7.370            21.00
      110               10.640            26.00
      120               14.080            38.00
      130               18.720            51.00
      140               26.970            94.00
      150               35.440           102.00
      160               58.240           223.00
      170              107.500           337.00

Table 2. Average and maximum number of cycles taken by GENET to solve randomly generated problems, with 100 runs per case

As can be seen from Table 2, the number of cycles taken by GENET to find solutions grows exponentially with the number of variables N. Statistical analysis shows that:

$$\text{no\_of\_cycles} = e^{0.026 \times N - 0.306}$$

with the correlation R = 0.975, which suggests a good fit. This is not surprising, as the CSP is an NP-hard problem. However, we should note that the absolute number of cycles required to find a solution is bounded for the following reasons. The number of cycles that is required to find a solution is influenced by the number of times that learning takes place, which is in turn influenced by the number of local minima in the search space. Tracing in the simulator reveals that the tighter the problem is, the more times the learning takes place. The probable number of solutions as a percentage of the total search space, $S_p$, can be expected to be:

$$S_p = \prod_{i=1}^{N-1} p_c^{\,i} = p_c^{\frac{1}{2}N(N-1)}$$

Clearly, as $S_p$ decreases much more quickly than $d^N$ increases, the tightness of a problem grows super-exponentially as N grows (all other parameters being kept unchanged). When N grows to over 200, problems are normally insoluble. Therefore, the number of cycles taken by GENET is bounded.

It is important to note that the absolute number of cycles is of the order of hundreds. When N = 200, the expected number of cycles is 133.00. For an analog computer that takes $10^{-8}$ to $10^{-6}$ seconds to process one cycle (as a rough estimation), a problem of size $O(6^{200})$ can be solved in terms of $10^{-6}$ to $10^{-4}$ seconds.
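
The arithmetic behind this analysis can be checked with a short sketch. It evaluates the compatibility probability, the expected solution fraction and the fitted cycle count for the Table 2 parameters ($p_1$ = 10%, $p_2$ = 85%, N = 200). The function names are hypothetical, and the per-cycle times of 1e-8 s and 1e-6 s are our reading of the rough estimate above, so treat the timing figures as an assumption rather than a result reported by the authors.

```python
import math

def compatibility_probability(p1, p2):
    """p_c = 1 - p1 + p1 * p2: chance that two random assignments are compatible."""
    return 1.0 - p1 + p1 * p2

def log10_solution_fraction(n, p_c):
    """log10 of S_p = p_c ** (n * (n - 1) / 2); the log avoids floating-point underflow."""
    return (n * (n - 1) / 2) * math.log10(p_c)

def fitted_cycles(n):
    """Fitted curve quoted in the text: no_of_cycles = e ** (0.026 * n - 0.306)."""
    return math.exp(0.026 * n - 0.306)

if __name__ == "__main__":
    n = 200
    p_c = compatibility_probability(0.10, 0.85)
    cycles = fitted_cycles(n)
    print(f"p_c = {p_c:.3f}")
    print(f"log10(S_p) for N = {n}: {log10_solution_fraction(n, p_c):.1f}")
    print(f"fitted cycles for N = {n}: {cycles:.1f}")
    # Assumed per-cycle times for an analog implementation (rough estimate only).
    for seconds_per_cycle in (1e-8, 1e-6):
        print(f"{seconds_per_cycle:.0e} s/cycle -> {cycles * seconds_per_cycle:.1e} s in total")
```

For N = 200 the fitted curve evaluates to roughly 133 cycles, which agrees with the expected number of cycles quoted in the text.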


DISCUSSION

The number of nodes required by GENET for binary constraint problems is N × d, where N is the number of variables and d is the domain size (assuming all domains have the same size). When k-ary constraints, where k > 2, are considered, the number of nodes required is $O(N^k d)$ in the worst case.

Recently, Guesgen proposed a neural network approach for solving CSPs [15]. The number of nodes required for binary constraint problems in this method is $O(N^2 d^2)$, or $O(N^2 \ldots)$ if the operation of each node is to be simplified. When k-ary constraints, where k > 2, are being considered, the number of nodes will be $O(N^k d)$. The set up and the operations involved in each node are significantly more complex than those in GENET.

Although the FC-FFP approach and other search algorithms can be parallelized, they cannot provide satisfactory solutions even with a polynomial number of processors [16]. The GENET approach, however, requires no more processors than the number of nodes required in the problem, as discussed above.

SUMMARY AND FUTURE WORK

This is a report of on-going research. We have presented a general framework for applying neural network techniques to CSPs. The network model is developed from the Interactive Activation Model.

We have proposed to use structured multilayer recurrent neural networks to represent non-binary CSPs, and a learning algorithm to propagate constraints effectively through the network so as to escape local minima. To justify our approach, we have looked at, apart from a large number of randomly constructed CSPs, some specially designed problems for which the Heuristic Repair Method has failed to produce solutions, and the car sequencing problem, which is a highly constrained non-binary CSP. In all our tests so far, the simulator GENET has succeeded in finding solutions when one exists and reporting failure when the problem is insoluble, although the proof of completeness has not yet been theoretically developed. In any case, we argue that this approach gives hope of solving real life CSPs on a scale that would be intractable by conventional methods.

Our next task is to investigate the properties of our model more thoroughly and improve GENET's efficiency. Our long term objective is to design hardware, based on our formalism, for solving CSPs very efficiently.

ACKNOWLEDGEMENT

The authors are grateful to Dr John Ford for his help in analysing the experimental results and Jenny Emby for her help in improving the presentation.

REFERENCES

1. Mackworth, A.K., "Consistency in networks of relations", Artificial Intelligence 8(1), 1977, 99-118
2. Haralick, R.M. & Elliott, G.L., "Increasing tree search efficiency for constraint satisfaction problems", Artificial Intelligence 14 (1980), 263-313
3. Waltz, D.L., "Understanding line drawings of scenes with shadows", in Winston, P.H. (ed.) The Psychology of Computer Vision, McGraw-Hill, New York, 1975, 19-91
4. Tsang, E.P.K., "The consistent labelling problem in temporal reasoning", Proc. AAAI Conference, Seattle, July, 1987, 251-255
5. Dechter, R., Meiri, I. & Pearl, J., "Temporal constraint networks", Artificial Intelligence, 49, 1991, 61-95
6. Dincbas, M., Simonis, H. & Van Hentenryck, P., "Solving car sequencing problem in constraint logic programming", Proceedings, European Conference on AI, 1988, 290-295
7. Dincbas, M., Van Hentenryck, P., Simonis, H., Aggoun, A. & Graf, T., "Applications of CHIP to industrial and engineering problems", First International Conference on Industrial and Engineering Applications of AI and Expert Systems, June, 1988
8. Hopfield, J.J. & Tank, D.W., "'Neural' Computation of Decisions in Optimization Problems", Biol. Cybern. 52, 141-152
9. Minton, S., Johnston, M.D., Philips, A.B. & Laird, P., "Solving large-scale constraint-satisfaction and scheduling problems using a heuristic repair method", American Association for Artificial Intelligence (AAAI), 1990, 17-24
10. Adorf, H.M. & Johnston, M.D., "A discrete stochastic neural network algorithm for constraint satisfaction problems", Proceedings, International Joint Conference on Neural Networks, 1990
11. McClelland, J.L. & Rumelhart, D.E., "An interactive activation model of context effects in letter perception: Part 1. An account of basic findings", Psychological Review, 88, 375-407
12. Rumelhart, D.E. & McClelland, J.L., "An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model", Psychological Review, 89, 60-94
13. Davis, L. (ed.), "Genetic algorithms and simulated annealing", Research notes in AI, Pitman/Morgan Kaufmann, 1987
14. Parrello, B.D., Kabat, W.C. & Wos, L., "Job-shop scheduling using automated reasoning: a case study of the car sequencing problem", Journal of Automated Reasoning, 2(1), 1986, 1-42
15. Guesgen, H.W., "Connectionist networks for constraint satisfaction", AAAI Spring Symposium on Constraint-based Reasoning, March, 1991, 182-190
16. Kasif, S., "On the parallel complexity of discrete relaxation in constraint satisfaction networks", Artificial Intelligence (45) 1990, 275-286
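
As a closing illustration, the convergence procedure of NETWORK CONVERGENCE and the learning rule of ESCAPING LOCAL MINIMA can be sketched on the tight binary example (five variables with domain {1, 2, 3}). This is a minimal sketch under stated assumptions, not the authors' simulator: the names `build_connections`, `node_input` and `genet_solve` are hypothetical, clusters are updated one after another, the learning step lowers by one the weight of every connection whose two end nodes are both active, and the cycle limit of 10000 is arbitrary.

```python
import random

N_VARS = 5
VALUES = (1, 2, 3)

def build_connections():
    """Inhibitory connections (initial weight -1) between incompatible value nodes."""
    weights = {}
    for i in range(N_VARS - 1):
        for a in VALUES:
            for b in VALUES:
                if (a + b) % 2 != 0:        # violates "V_i + V_{i+1} must be even"
                    weights[((i, a), (i + 1, b))] = -1
    for a in VALUES:
        for b in VALUES:
            if a != 2 and b != 2:           # violates "V_1 = 2 OR V_5 = 2"
                weights[((0, a), (N_VARS - 1, b))] = -1
    return weights

def node_input(node, assignment, weights, neighbours):
    """Weighted sum of the states of the active nodes connected to `node`."""
    return sum(weights[key] for other, key in neighbours[node]
               if assignment[other[0]] == other[1])

def genet_solve(max_cycles=10_000, seed=None):
    rng = random.Random(seed)
    weights = build_connections()
    neighbours = {(v, a): [] for v in range(N_VARS) for a in VALUES}
    for key in weights:
        n1, n2 = key
        neighbours[n1].append((n2, key))
        neighbours[n2].append((n1, key))
    # Random initial state: one active node (value) per cluster (variable).
    assignment = [rng.choice(VALUES) for _ in range(N_VARS)]
    for cycle in range(1, max_cycles + 1):
        changed = False
        for v in range(N_VARS):
            inputs = {a: node_input((v, a), assignment, weights, neighbours)
                      for a in VALUES}
            best = max(inputs.values())
            winners = [a for a in VALUES if inputs[a] == best]
            # Tie-breaking: an active node that ties for the maximum stays on.
            if assignment[v] not in winners:
                assignment[v] = rng.choice(winners)
                changed = True
        if changed:
            continue
        # Stable state: a solution if every active node has zero input.
        if all(node_input((v, assignment[v]), assignment, weights, neighbours) == 0
               for v in range(N_VARS)):
            return assignment, cycle
        # Local minimum: penalise every connection between two active nodes.
        for (n1, n2) in weights:
            if assignment[n1[0]] == n1[1] and assignment[n2[0]] == n2[1]:
                weights[(n1, n2)] -= 1
    return None, max_cycles

if __name__ == "__main__":
    print(genet_solve(seed=0))
```

Because an assignment is returned only when every active node has zero input, any assignment this sketch returns violates no constraint; for this example the only such assignment gives every variable the value 2.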
