Computationally Effective Search and Optimization Procedure Using Coarse-to-Fine Approximations
Abstract- This paper presents a concept of combining genetic algorithms (GAs) with an approximate evaluation technique to achieve a computationally effective search and optimization procedure. The major objective of this work is to enable the use of GAs on computationally expensive problems, while retaining their basic robust search capabilities. Starting with a coarse approximation model of the problem, GAs successively use finer models, thereby allowing the proposed algorithm to find the optimal or a near-optimal solution of computationally expensive problems faster. A general methodology is proposed for combining any approximating technique with a GA. The proposed methodology is also tested successfully in conjunction with one particular approximating technique, namely the artificial neural network, on a B-spline curve fitting problem. Savings of up to 32% in the exact function evaluations are achieved. The computational advantage demonstrated here should encourage the use of the proposed approach on more complex and computationally demanding real-world problems.

1 Introduction

One of the main hurdles faced by an optimization algorithm in solving real-world problems is its need of a reasonably large computational time in finding an optimal or a near-optimal solution. In order to reduce the overall computational time, researchers in the area of search and optimization look for efficient algorithms which demand only a few function evaluations to arrive at a near-optimal solution. Although successes in this direction have been achieved by using new and unorthodox techniques (such as evolutionary algorithms, tabu search, simulated annealing, etc.) involving problem-specific operators, such techniques still demand a considerable amount of simulation time, particularly in solving computationally expensive problems. In such problems, the main difficulty arises due to the large computational time required in evaluating a solution. This is because such problems either involve many decision variables or a computationally involved evaluation procedure, such as the use of a finite element procedure or a network flow computation. Although the use of a parallel computer is a remedy to these problems in reducing the overall computational time, in this paper we suggest a fundamental algorithmic change to the usual optimization procedure which can be used either serially or in parallel. Most search and optimization algorithms begin their search from one or more random guess solutions. Thus, the main task of a search algorithm in the initial few iterations is to provide a direction towards the optimal region in the search space. To achieve such a task, it may not be necessary to use an exact (or a very fine-grained) model of the optimization problem early on. An approximate model of the problem may be adequate to provide a reasonably good search direction. However, as the iterations progress, finer models can be used successively to converge closer to the true optimum of the actual problem. Although this idea of using an approximate model in the beginning of a search algorithm and refining the model with iterations is not new [1, 6, 9, 11, 12], we suggest a generic procedure which can be used in any arbitrary problem.

In the remainder of this paper, we describe the proposed coarse-to-fine grained modeling procedure. Thereafter, we suggest an artificial neural network (ANN) based procedure, specifically to model an approximate version of the actual problem, and show simulation results of the proposed technique applied to a two-objective geometric design problem. Different variations of the ANN design and training procedures are compared. Simulation results show a large computational advantage of the proposed procedure, thereby suggesting the applicability of the proposed procedure in more complex real-world search and optimization problems.

2 Coarse to Fine Grained Methods Used in Past Studies

The use of coarse-to-fine grained modeling in optimization can be found in various application papers available in the literature, particularly in computational fluid dynamics applications and complex mechanical component design problems. One such application is the optimal design of elastic flywheels using the injection island GA (iiGA) suggested by Eby et al. [6]. It uses a finite element code to assist the iiGA in evaluating solutions to find the specific energy density of flywheels. Similar work using the hierarchical genetic algorithm (HGA) for a computational fluid dynamics problem is reported by Sefrioui et al. [11]. They used a multi-layered hierarchical topology to solve a classical exploration/exploitation dilemma while using multiple models for optimization problems. They have reported achieving the same quality of results as that obtained by a simple GA, but spending only about one-third of the computational time. The other important work in this area is reported by Poloni et al. [12]. They have developed a methodology which uses a multi-objective genetic algorithm (MOGA) for
If the approximating function is optimized, it is likely that a GA will proceed in the right direction. However, as a GA tends to converge to the optimum of the approximating function, the approximating function needs to be modified to make a better approximated function than before. Since the population diversity will be reduced while approximating the first approximated function, the second approximating function need not be defined over the whole range of the decision variables, as shown in Figure 1. Since the approximating function will be defined over a smaller search region, more local details can appear in successive approximations. This process may continue till no further approximation results in an improvement in the function value. Although the successive approximation technique described above seems a reasonable plan, care must be taken to ensure that adequate diversity is left while switching from one approximating function to a better one.

Figure 1: Progressive approximate modeling.

Figure 2 outlines a schematic of a plausible plan for the proposed procedure. The combined procedure begins with a set of randomly created N solutions, where N is the population size. Since an adequate number of solutions is required to arrive at an approximated problem, we execute a GA with exact function evaluations for n generations, thereby collecting a total of N' = nN solutions for approximation. At the end of n generations, the approximation technique is invoked with the N' solutions and the first approximated problem is created. The GA is then performed for the next (Q - n) generations with this approximated problem. Thereafter, the GA is performed with the exact function for the next n generations and a new approximated problem is created. This procedure is continued till the termination criterion is met. Thus, this procedure requires only a fraction n/Q of the total evaluations to be performed exactly. With generations, the approximations continue to happen in smaller regions and, therefore, the training set N' can be reduced in size. We follow a linear reduction in this paper.

Figure 2: A line diagram of the proposed technique.
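In pseudocode form, this schedule can be sketched as follows. This is a minimal illustrative sketch (ours, not the implementation used in the study): exact_evaluate, train_approximation, and ga_generation are hypothetical placeholders for the expensive model, the chosen approximation technique, and one generational step of the GA, respectively, and the linear shrinking of the training set is omitted for brevity.

    def coarse_to_fine_ga(population, Q, n, max_gen):
        # Alternate n generations of exact evaluation with (Q - n)
        # generations run on the current approximate model.
        model, data = None, []
        for gen in range(max_gen):
            if gen % Q < n:
                fitness = [exact_evaluate(x) for x in population]  # expensive phase
                data += list(zip(population, fitness))             # collect N' = n*N points
                if gen % Q == n - 1:
                    model = train_approximation(data)              # build next approximation
                    data = []
            else:
                fitness = [model(x) for x in population]           # cheap surrogate phase
            population = ga_generation(population, fitness)        # selection + variation
        return population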
If a problem cannot be evaluated exactly, and instead some approximations (such as those involving FFT or finite element techniques) are needed for evaluation, the parameter n is set to zero and GAs are run for Q generations with the coarsest model (in the case of a finite element method, only a few elements can be chosen to start with). It is interesting to note that a set of basis functions with varying importance to local variations can be used as approximating functions here. In such cases, a GA may be started with an approximating function involving only a few basis functions, thereby allowing a quicker computation of the optimal solution. With generations, more and more basis functions can be added to make the model more similar to the actual optimization problem. In this study we concentrate on solving problems for which an exact evaluation method exists but is computationally expensive. However, a similar methodology can also be used to solve problems for which no exact evaluation method exists.

3.1 Approximation Through Artificial Neural Networks

We propose combining a GA with artificial neural networks (ANNs) as the basic approximating technique for fitness computation. The primary reason for using an ANN as the basic approximating tool is its proven capability as a function approximation tool from a given data set. The multilayer perceptron trained with the back-propagation algorithm may be viewed as a practical vehicle for performing a non-linear input-output mapping of a general nature [8]. The overall GA-ANN procedure is shown in a flowchart in Figure 3.

An advantage of the proposed technique is its adaptability. Initially the GA population will be randomly spread over the entire search space of the problem undertaken. Since the same information is available in the ANN training database, the approximated model generated using this database will also be very general in nature and hence may miss some finer details of the search space. However, as the generations proceed, the GA population will start drifting and focusing on the important regions which it identifies based on the current model. So when the proposed technique updates its model using exact function evaluation for the current generation, it will have more information about the local regions of interest as more population
Figure 3: A flowchart of the overall GA-ANN procedure.
- Minimize the overall area between the saw-tooth function and the B-spline fitted curve, and

- Minimize the maximum curvature of the B-spline fitted curve.

Figure 4 also shows a typical B-spline curve fitted to the saw-tooth function. The first objective is calculated as the overall area between the two curves. When this objective is minimized (that is, the B-spline fitted curve is very similar to the saw-tooth function), there is bound to be a large curvature near x = 0.5. Since in some applications a large curvature is to be avoided, the sole minimization of the second objective will produce a curve which would be a straight line joining the first (x = 0) and the last (x = 1) points. As can be seen from the figure, such a curve will make the error (the first objective) large. Thus, the above two objectives constitute a pair of conflicting objectives in the curve fitting problem. Since the problem is posed as a multi-objective problem, we have chosen to use NSGA-II (non-dominated sorting GA-II) [5] to solve it.

For any given B-spline curve S, the exact evaluation of the first objective can be achieved in the following manner:

    F_1(S) = \int_{x=0}^{x=1} |f_{\text{saw-tooth}}(x) - S(x)| \, dx.    (1)

Since such a computation is difficult to achieve exactly (mainly because of the non-differentiable modulus function used in the operand), we compute the above integral numerically by using the trapezoidal rule. The accuracy of the integral will depend on the number of divisions considered during the integration. The more the divisions, the better is the approximation. We have used 400 divisions in the entire range of x and call the procedure an exact evaluation procedure.
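As a concrete illustration of this evaluation (our sketch, not the study's code), the integral of Eq. (1) can be computed with the trapezoidal rule as follows. The triangular saw-tooth peaking at x = 0.5 is a hypothetical stand-in for the target curve, and bspline is assumed to evaluate the fitted curve at given abscissae.

    import numpy as np

    # Hypothetical target: a triangular saw-tooth peaking at x = 0.5.
    def saw_tooth(x):
        return np.where(x <= 0.5, 2.0 * x, 2.0 * (1.0 - x))

    def f1_area_error(bspline, divisions=400):
        # Trapezoidal rule on |f_saw-tooth(x) - S(x)| over [0, 1],
        # with 400 divisions as in the exact evaluation procedure.
        x = np.linspace(0.0, 1.0, divisions + 1)
        return np.trapz(np.abs(saw_tooth(x) - bspline(x)), x)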
The second objective can be written as follows:

    F_2(S) = \max_{\text{all segments}} \kappa(u^*),    (2)

where

    u^* = \begin{cases} 0, & \text{if } u_c \le 0, \\ 1, & \text{if } u_c \ge 1, \\ u_c, & \text{otherwise}, \end{cases}    (3)

    u_c = \frac{x_{uu}(x_0 - x_1) + y_{uu}(y_0 - y_1)}{x_{uu}^2 + y_{uu}^2}, \qquad x_{uu} = x_0 - 2x_1 + x_2, \qquad y_{uu} = y_0 - 2y_1 + y_2.

Here (x_0, y_0), (x_1, y_1) and (x_2, y_2) are the three control points of each segment. Once the optimal u^* is calculated, the corresponding curvature can be calculated as follows:

    \kappa(u^*) = \frac{|x_u y_{uu} - y_u x_{uu}|}{(x_u^2 + y_u^2)^{3/2}},    (4)

where the first derivatives x_u and y_u are calculated as follows:

    x_u = (u^* - 1)x_0 + (1 - 2u^*)x_1 + u^* x_2,
    y_u = (u^* - 1)y_0 + (1 - 2u^*)y_1 + u^* y_2.

Such computations can be performed for all segments and the maximum curvature of the entire B-spline curve can be determined. For a large number of B-spline segments, many such computations are required, thereby involving a large computation time to evaluate the second objective. If such computations are extended to 3-D curve or surface fitting, the computations become even more expensive.
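In code, the per-segment computation of Eqs. (2)-(4) may be sketched as follows (our illustration, based on the expressions reconstructed above; control_points is assumed to hold the (x, y) control points of the fitted quadratic B-spline).

    import numpy as np

    def segment_max_curvature(p0, p1, p2):
        # Second derivatives are constant over a quadratic segment.
        xuu, yuu = p0[0] - 2*p1[0] + p2[0], p0[1] - 2*p1[1] + p2[1]
        denom = xuu**2 + yuu**2
        if denom == 0.0:
            return 0.0                      # straight segment: zero curvature
        # Candidate parameter of extremal curvature, clamped as in Eq. (3).
        uc = (xuu*(p0[0] - p1[0]) + yuu*(p0[1] - p1[1])) / denom
        u = min(max(uc, 0.0), 1.0)
        # First derivatives at u*, as given above.
        xu = (u - 1)*p0[0] + (1 - 2*u)*p1[0] + u*p2[0]
        yu = (u - 1)*p0[1] + (1 - 2*u)*p1[1] + u*p2[1]
        # Curvature of the parametric segment, Eq. (4).
        return abs(xu*yuu - yu*xuu) / (xu**2 + yu**2)**1.5

    def f2_max_curvature(control_points):
        # Maximum curvature over all consecutive quadratic segments, Eq. (2).
        return max(segment_max_curvature(control_points[i],
                                         control_points[i + 1],
                                         control_points[i + 2])
                   for i in range(len(control_points) - 2))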
The ANN module used in conjunction with NSGA-II employs two different types of training procedures, namely batch training and incremental training, applied after every Q generations for the next n generations. Thus, the test problem is solved with two different models, namely the incremental (I-Q-n) model and the batch (B-Q-n) model. Each model is tried with various combinations of parameter settings. A three-layer ANN with one hidden layer is used. The input layer has 40 neurons and the output layer has 2 neurons. A momentum factor of 0.1 is used. A unipolar sigmoidal activation function with a logistic constant of 0.5 is chosen. In all cases, we have used a permissible normalized RMS error of 0.005. All input data are scaled in [0.1, 0.9] and the output data are scaled between zero and one. For the initial and final 25% of generations, we have used 200n and 50n training cases, respectively. For intermediate generations, we have linearly reduced the number of training cases. NSGA-II with a population size of 200, an SBX crossover probability of 0.9 with a distribution index of 10, and a polynomial mutation probability of 1/39 with a distribution index of 50 is used.
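For concreteness, a one-hidden-layer perceptron with these ingredients (unipolar sigmoid with logistic constant 0.5, momentum factor 0.1, stopping on the permissible normalized RMS error) can be sketched as below. This is our minimal illustration, not the network used in the study: the learning rate, the hidden-layer size, and the epoch limit are placeholder values, biases are omitted for brevity, and inputs and targets are assumed pre-scaled as described above.

    import numpy as np

    def sigmoid(z, c=0.5):
        # Unipolar sigmoid with logistic constant c.
        return 1.0 / (1.0 + np.exp(-c * z))

    def train_mlp(X, T, hidden=10, lr=0.3, momentum=0.1,
                  tol=0.005, max_epochs=10000):
        # X: inputs scaled to [0.1, 0.9]; T: targets scaled to [0, 1].
        rng = np.random.default_rng(0)
        W1 = rng.uniform(-0.5, 0.5, (X.shape[1], hidden))
        W2 = rng.uniform(-0.5, 0.5, (hidden, T.shape[1]))
        dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
        for _ in range(max_epochs):
            H = sigmoid(X @ W1)                      # hidden activations
            Y = sigmoid(H @ W2)                      # network outputs
            E = T - Y
            if np.sqrt(np.mean(E**2)) < tol:         # permissible normalized RMS error
                break
            # Back-propagation; the sigmoid derivative is c*y*(1 - y) with c = 0.5.
            d2 = E * 0.5 * Y * (1.0 - Y)
            d1 = (d2 @ W2.T) * 0.5 * H * (1.0 - H)
            dW2 = lr * (H.T @ d2) + momentum * dW2   # momentum factor 0.1
            dW1 = lr * (X.T @ d1) + momentum * dW1
            W2 += dW2
            W1 += dW1
        return W1, W2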
Figure 9: Two extreme non-dominated solutions from the B-10-3 model.

In order to investigate the suitable working ranges for the proposed approach, we have tried various parameter settings, particularly by varying the number of hidden neurons in the ANN, the ANN learning rate, Q, and n.
Figure 5: Batch model results trained with 400 patterns.

Figure 6: Incremental model results trained with 400 patterns.

Figure 7: Best of incremental and batch model results.

Figure 8: Comparison of the best of the proposed technique and the exact solution at generation 1100.
As a benchmark result, we have run NSGA-II with a population size of 200 using the exact function evaluations (as described above) for 750 generations. The obtained non-dominated front is shown in Figures 5 to 8 using a solid line. The overall function evaluations required in this exact simulation run are 200 x 750, or 150,000. All NSGA-II-ANN simulations are also performed for the same number of exact function evaluations.

After the non-dominated solutions are found, they are evaluated using the exact model and plotted in all figures (5 to 8). In all simulations with batch and incremental learning models with different parameter settings, the obtained non-dominated front is better than that obtained using the exact model. The best result for incremental training is found with the I-20-2 model, while in the case of batch training slightly better results are found with the B-10-3 model. This demonstrates that although approximate models are used, the combined NSGA-II-ANN procedure proposed in this paper is able to find a better non-dominated front than the exact model. In order to investigate how many generations it would take the NSGA-II with exact evaluations to obtain a front similar to that obtained using the proposed approach, we have continued the NSGA-II run with exact evaluations. Figure 8 shows that the B-10-3 model reaches a similar front in about 1,100 generations (except that for larger error values the obtained front is still inferior to that obtained using the proposed approach). In comparison to these evaluations, the approximate model makes a saving of around 32% of the exact evaluations.

Figure 9 shows two extreme non-dominated solutions obtained by the B-10-3 model. The saw-tooth function is also shown in dots. The figure shows that one solution (marked as A) is a better fit with the saw-tooth function, whereas the
If the permissible normalized rms error is chosen too small, the ANN will match almost exactly with the training pattern data, but will lose its generalization capability. This indicates that there should be a critical value of the permissible normalized rms error at which the proposed NSGA-II-ANN procedure should give the best performance.

In order to investigate the effect of the permissible normalized rms error, various simulations were performed. Figure 10 shows one such study. Various values of the permissible normalized rms error in [0.001, 0.010] were tried with the B-10-3 model. It was found that a permissible normalized rms error of less than 0.003 leads to over-approximation of the true problem; hence the NSGA-II-ANN procedure fails to converge for the current problem. Permissible normalized rms error values up to 0.004 found a non-dominated front which was better than that of exact function evaluations for 750 generations, indicating savings of exact evaluations. At a permissible normalized rms error value of 0.005, the performance of the proposed NSGA-II-ANN procedure is best, as the non-dominated front is pushed to the extreme left in the objective space for the minimization of both objectives. However, increasing the permissible normalized rms error value further, at and above 0.006, shows the case of poor approximation of the exact problem with no savings in the exact evaluations. For clarity of observation, Figure 10 shows only three such simulations along with exact evaluations for 750 generations. It clearly shows the best performance of the B-10-3 model at the 0.005 permissible normalized rms error value.

Figure 10: Effect of permissible normalized rms error on B-10-3 model performance.
For each solution in the obtained front F, the smallest normalized Euclidean distance to P* is calculated. Next, the convergence metric value is calculated by averaging the normalized distances of all points in F. Lastly, in order to keep the convergence metric within [0, 1], we divide the convergence metric value by the maximum value found among all simulations.
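A minimal sketch of this metric follows (ours, not the study's code). The objective-wise normalization via assumed ideal and nadir vectors is only a placeholder, since the exact normalization scheme is not specified in the surviving text; the final division by the maximum metric value across all simulations is applied afterwards.

    import numpy as np

    def convergence_metric(F, P_star, ideal, nadir):
        # Normalize both sets objective-wise (placeholder scheme), then
        # average each front member's smallest Euclidean distance to P*.
        Fn = (F - ideal) / (nadir - ideal)
        Pn = (P_star - ideal) / (nadir - ideal)
        d = np.sqrt(((Fn[:, None, :] - Pn[None, :, :]) ** 2).sum(axis=2))
        return d.min(axis=1).mean()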
Table 1 shows the normalized convergence metric values calculated for the various simulations of the proposed NSGA-II-ANN procedure. Figure 11 shows the same normalized convergence metric values plotted for the B-10-3 model at various permissible normalized rms error values. The best normalized convergence levels obtained with the I-20-2 model and by exact function evaluation for 750 and 1100 generations with NSGA-II are shown by horizontal lines on the same plot. It can now be safely concluded that the proposed NSGA-II-ANN procedure with both the I-20-2 and B-10-3 models has outperformed the NSGA-II run with exact function evaluations for 1100 generations. Thus a saving of 32% of exact evaluations can be claimed for both the I-20-2 and B-10-3 models. Figure 11 also shows that the overall best performance is obtained with the B-10-3 model.

5 Conclusions and Extensions

Many real-world search and optimization problems involve evaluation procedures that are too computationally expensive to be useful in practice. Although researchers and practitioners adopt different techniques, such as using a parallel computer or using problem-specific operators, in this paper we have suggested the use of successive approximation models for a faster run-time. Starting from a coarse approximated model of the original problem, captured by an artificial neural network (ANN) using a set of initial solutions as a training data set, the proposed genetic algorithm (GA) uses the approximate model. It has been argued that such coarse-to-fine grained approximated models, if coordinated correctly, may direct a GA in the right direction and enable a GA to find a near-optimal or an optimal solution quickly.

The proposed technique is applied to a geometric two-objective curve fitting problem of minimizing the difference between the fitted and the desired curve and of minimizing the maximum curvature in the fitted curve. Simulation results involving a batch-learning ANN and an incremental-learning ANN, obtained using different numbers of training cases and durations of exploiting the approximate model, have shown that the proposed GA-ANN approach can find a non-dominated front with only about 68% of the overall function evaluations needed if exact function evaluations alone were used. Simulations with different values of the permissible normalized rms error for the ANN have shown that though the proposed approach works successfully over a range of rms error values, there exists a critical value of the permissible normalized rms error at which the GA-ANN approach gives the best performance. However, the overall procedure introduces a number of new parameters, the sensitivity of which on the obtained speed-up must be established by performing a more elaborate parametric study.

Acknowledgments

The first author acknowledges the support from British Telecom under contract number ML83283YCT405106.

Bibliography

[1] Branke, J., and Schmidt, C.: Faster convergence by means of fitness estimation. Soft Computing Journal (in press).

[2] Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. First Edition. Chichester, UK: Wiley, 2001.

[3] Deb, K.: Optimization for Engineering Design: Algorithms and Examples. New Delhi: Prentice-Hall, 1995.

[4] Deb, K., and Jain, S.: Running performance metrics for evolutionary multi-objective optimization. In Proceedings of the Fourth Asia-Pacific Conference on Simulated Evolution and Learning (SEAL'02), Singapore, 2002, pp. 13-20.

[5] Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T.: A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182-197, 2002.

[6] Eby, D., Averill, R. C., Punch III, W. F., and Goodman, E. D.: Evaluation of injection island GA performance on flywheel design optimization. In Proceedings of the Third Conference on Adaptive Computing in Design and Manufacture. Springer, 1998.

[7] Goldberg, D. E., Deb, K., and Clark, J. H.: Genetic algorithms, noise, and the sizing of populations. Complex Systems, 6, 333-362, 1992.

[8] Haykin, S.: Neural Networks: A Comprehensive Foundation, second edition. Singapore: Addison Wesley, 2001, p. 208.

[9] Jin, Y., and Sendhoff, B.: Fitness approximation in evolutionary computation - a survey. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2002). Morgan Kaufmann, 2002, pp. 1105-1112.

[10] Reklaitis, G. V., Ravindran, A., and Ragsdell, K. M.: Engineering Optimization: Methods and Applications. New York: Wiley, 1983.

[11] Sefrioui, M., and Périaux, J.: A hierarchical genetic algorithm using multiple models for optimization. In Proceedings of the 6th International Conference on Parallel Problem Solving from Nature (PPSN VI). Lecture Notes in Computer Science 1917, Springer, 2000.

[12] Poloni, C., Giurgevich, A., Onesti, L., and Pediroda, V.: Hybridization of a multi-objective genetic algorithm, a neural network and a classical optimizer for a complex design problem in fluid dynamics. Computer Methods in Applied Mechanics and Engineering, 186, 2000, pp. 403-420.

[13] Nair, P. B., Keane, A. J., and Shimpi, R. P.: Combining approximation concepts with genetic algorithm-based structural optimization procedures. In Proceedings of the First ISSMO/NASA/AIAA Internet Conference on Approximations and Fast Reanalysis in Engineering Optimization, 1998.

[14] Zeid, I.: CAD/CAM Theory and Practice. New Delhi, India: Tata McGraw-Hill Publishing Company, 2000.