CHAPMAN & HALL/CRC COMPUTER and INFORMATION SCIENCE SERIES
Handbook of
Bioinspired Algorithms
and Applications
PUBLISHED TITLES
HANDBOOK OF SCHEDULING: ALGORITHMS, MODELS, AND PERFORMANCE ANALYSIS
Joseph Y.-T. Leung
THE PRACTICAL HANDBOOK OF INTERNET COMPUTING
Munindar P. Singh
HANDBOOK OF DATA STRUCTURES AND APPLICATIONS
Dinesh P. Mehta and Sartaj Sahni
DISTRIBUTED SENSOR NETWORKS
S. Sitharama Iyengar and Richard R. Brooks
SPECULATIVE EXECUTION IN HIGH PERFORMANCE COMPUTER ARCHITECTURES
David Kaeli and Pen-Chung Yew
SCALABLE AND SECURE INTERNET SERVICES AND ARCHITECTURE
Cheng-Zhong Xu
Handbook of
Bioinspired Algorithms
and Applications
Edited by
Stephan Olariu
Old Dominion University
Norfolk, Virginia, U.S.A.
Albert Y. Zomaya
University of Sydney
NSW, Australia
Published in 2006 by
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with
permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish
reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials
or for the consequences of their use.
No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or
other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
The Handbook of Bioinspired Algorithms and Applications seeks to provide an opportunity for researchers
to explore the connection between biologically inspired (or bioinspired) techniques and the development
of solutions to problems that arise in a variety of problem domains. The power of bioinspired paradigms
lies in their capability to deal with complex problems using little or no knowledge of the search space;
they are thus particularly well suited to a wide range of computationally intractable optimization and
decision-making applications.
A vast literature exists on bioinspired approaches for solving an impressive array of problems, and there
is a great need to develop repositories of “how to apply” bioinspired paradigms to difficult problems. The
material of the handbook is by no means exhaustive; it focuses on paradigms that are “bioinspired,”
and therefore chapters on fuzzy logic or simulated annealing were not included. The number of chapters
was deliberately limited so that the handbook remains manageable within a single volume.
The handbook endeavors to strike a balance between theoretical and practical coverage of a range of
bioinspired paradigms and applications. The handbook is organized into two main sections: Models and
Paradigms and Application Domains; the titles of the various chapters are self-explanatory and give a
good indication of what is covered. The theoretical chapters are intended to provide the fundamentals of
each of the paradigms in a way that allows readers to utilize these techniques in their own fields.
The application chapters show detailed examples and case studies of how to actually develop a solution
to a problem based on a bioinspired technique. The handbook should serve as a repository of significant
reference material, as the list of references that each chapter provides will become a useful source of further
study.
Stephan Olariu
Albert Y. Zomaya
First and foremost we would like to thank and acknowledge the contributors of this book for their support
and patience, and the reviewers for their useful comments and suggestions that helped in improving
the earlier outline of the handbook and presentation of the material. Professor Zomaya would like to
acknowledge the support from CISCO Systems and members of the Advanced Networks Research Group
at Sydney University. We also extend our deepest thanks to Jessica Vakili and Bob Stern from CRC Press
for their collaboration, guidance, and, most importantly, patience in finalizing this handbook. Finally,
we thank Mr. Mohan Kumar for leading the production process of this handbook in a very professional
manner.
Stephan Olariu
Albert Y. Zomaya
Stephan Olariu received his M.Sc. and Ph.D. degrees in computer science from McGill University,
Montreal, in 1983 and 1986, respectively. In 1986 he joined the Old Dominion University where he is a
professor of computer science. Dr. Olariu has published extensively in various journals, book chapters,
and conference proceedings. His research interests include image processing and machine vision, parallel
architectures, design and analysis of parallel algorithms, computational graph theory, computational geo-
metry, and mobile computing. Dr. Olariu serves on the Editorial Board of IEEE Transactions on Parallel
and Distributed Systems, Journal of Parallel and Distributed Computing, VLSI Design, Parallel Algorithms
and Applications, International Journal of Computer Mathematics, and International Journal of Foundations
of Computer Science.
Albert Y. Zomaya is currently the CISCO Systems chair professor of internetworking in the School of
Information Technologies, The University of Sydney. Prior to that he was a full professor in the Electrical
and Electronic Engineering Department at the University of Western Australia, where he also led the
Parallel Computing Research Laboratory from 1990 to 2002. He served as associate, deputy, and acting
head in the same department, and held visiting positions at Waterloo University and the University of
Missouri–Rolla. He is the author/co-author of 6 books and 200 publications in technical journals and
conferences, and the editor of 6 books and 7 conference volumes. He is currently an associate editor
for 14 journals, the founding editor of the Wiley Book Series on Parallel and Distributed Computing, and
the editor-in-chief of the Parallel and Distributed Computing Handbook (McGraw-Hill 1996). Professor
Zomaya was the chair of the IEEE Technical Committee on Parallel Processing (1999–2003) and currently
serves on its executive committee. He has been actively involved in the organization of national and
international conferences. He received the 1997 Edgeworth David Medal from the Royal Society of New
South Wales for outstanding contributions to Australian science. In September 2000 he was awarded the
IEEE Computer Society’s Meritorious Service Award. Professor Zomaya is a chartered engineer (CEng), a
fellow of the IEEE, a fellow of the Institution of Electrical Engineers (U.K.), and member of the ACM. He also
serves on the boards of two startup companies. His research interests are in the areas of high performance
computing, parallel algorithms, networking, mobile computing, and bioinformatics.
1.1 Introduction
One of the most striking features of Nature is the existence of living organisms adapted for surviving in
almost any ecosystem, even the most inhospitable: from abyssal depths to mountain heights, from volcanic
vents to polar regions. The magnificence of this fact becomes more evident when we consider that the
life environment is continuously changing, driving certain life forms to extinction whereas
other beings evolve and prevail thanks to their adaptation to the new scenario. It is remarkable
that living beings exert no conscious effort to evolve (actually, it would be rather awkward to talk
about consciousness in amoebas or earthworms); on the contrary, the driving force for change is
controlled by supraorganic mechanisms such as natural evolution.
Can we learn — and use for our own profit — the lessons that Nature is teaching us? The answer is a big
YES, as the optimization community has repeatedly shown in recent decades. “Evolutionary algorithm”
is the key word here. The term evolutionary algorithm (EA henceforth) is used to designate a collection
of optimization techniques whose functioning is loosely based on metaphors of biological processes.
This rough definition is rather broad and tries to encompass the numerous approaches currently
existing in the field of evolutionary computation [1]. Quite appropriately, this field itself is continuously
evolving; a quick inspection of the proceedings of the relevant conferences and symposia suffices to
demonstrate the impetus of the field, and the great diversity of the techniques that can be considered
“evolutionary.”
This variety notwithstanding, it is possible to find a number of common features of all (or at least
most of) EAs. The following quote from Reference 2 illustrates such common points:
The algorithm maintains a collection of potential solutions to a problem. Some of these possible
solutions are used to create new potential solutions through the use of operators. Operators act on
and produce collections of potential solutions. The potential solutions that an operator acts on are
selected on the basis of their quality as solutions to the problem at hand. The algorithm uses this
process repeatedly to generate new collections of potential solutions until some stopping criterion
is met.
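The quoted loop can be made concrete with a minimal Python sketch. All names, parameters, and the OneMax toy problem below are illustrative assumptions, not code from the chapter:

```python
import random

def evolutionary_algorithm(fitness, create, mutate, pop_size=20, generations=50, seed=0):
    """Skeleton of the quoted process: maintain a collection of potential
    solutions, select promising ones, create new ones with an operator,
    and repeat until a stopping criterion is met."""
    rng = random.Random(seed)
    population = [create(rng) for _ in range(pop_size)]
    for _ in range(generations):  # stopping criterion: a fixed iteration budget
        # selection: pick the better half as parents, based on solution quality
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # reproduction: operators act on selected solutions to produce new ones
        offspring = [mutate(rng.choice(parents), rng) for _ in range(pop_size)]
        # replacement: keep the best individuals among old and new solutions
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)

# Toy instantiation (OneMax): maximize the number of ones in a 20-bit string.
def create(rng):
    return [rng.randint(0, 1) for _ in range(20)]

def mutate(solution, rng):
    child = solution[:]
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]  # flip one randomly chosen bit
    return child

best = evolutionary_algorithm(sum, create, mutate)
```

Any concrete EA family then corresponds to a particular choice of representation and operators in such a template.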
This definition can usually be found in the literature expressed in a technical language that uses terms
such as genes, chromosomes, population, etc. This jargon is reminiscent of the biological inspiration
mentioned before, and has deeply permeated the field. We will return to the connection with biology
later on.
The objective of this work is to present a gentle overview of these techniques comprising both the
classical “canonical” models of EAs as well as some modern directions for the development of the field,
namely, the use of parallel computing, and the introduction of problem-dependent knowledge.
• Evolution is a process that does not operate on organisms directly, but on chromosomes. These
are the organic tools by means of which the structure of a certain living being is encoded, that is,
the features of a living being are defined by the decoding of a collection of chromosomes. These
chromosomes (more precisely, the information they contain) pass from one generation to another
through reproduction.
• The evolutionary process takes place precisely during reproduction. Nature exhibits a plethora
of reproductive strategies. The most essential ones are mutation (that introduces variability in
the gene pool) and recombination (that introduces the exchange of genetic information among
individuals).
• Natural selection is the mechanism that relates chromosomes to the adequacy of the entities they
represent, favoring the proliferation of effective, environment-adapted organisms, and conversely
causing the extinction of less effective, nonadapted organisms.
These principles are embodied in the most orthodox theory of evolution, the Synthetic
Theory [6]. Although alternative scenarios that introduce some variety into this description have been
proposed — for example, the Neutral Theory [7], and very remarkably the Theory of Punctuated
Equilibria [8] — it is worth considering the former basic model. It is amazing to see that despite the
apparent simplicity of the principles upon which it rests, Nature exhibits unparalleled power in developing
and expanding new life forms.
Not surprisingly, this power has attracted the interest of many researchers, who have tried to translate the
principles of evolution to the realm of algorithmics, pursuing the construction of computer systems with
analogous features. An important point must be stressed here: evolution is an undirected process, that is,
there exists no scientific evidence that evolution is headed to a certain final goal. On the contrary, it can
be regarded as a reactive process that makes organisms change in response to environmental variations.
However, it is a fact that human-designed systems do pursue a definite final goal. Furthermore, whatever
this goal might be, it is in principle, desirable to reach it quickly and efficiently. This leads to the distinction
between two approaches to the construction of nature-inspired systems:
1. Trying to reproduce Nature’s principles with the highest possible accuracy, that is, simulating Nature.
2. Using these principles as inspiration, adapting them in whatever way required so as to obtain
efficient systems for performing the desired task.
Researchers nowadays concentrate their efforts on both approaches. The first has given rise to
the field of Artificial Life (e.g., see Reference 9), and is interesting because it allows re-creating and
studying numerous natural phenomena such as parasitism, predator/prey relationships, etc. The second
approach can be considered more practical, and constitutes the source of EAs. Notice, anyway, that these
two approaches are not hermetic compartments, and have frequently interacted, with certainly successful
results.
An EA begins by generating an initial population
of solutions onto which the EA will subsequently work, iteratively applying some evolutionary operators
to modify its contents. More precisely, the process comprises three major stages: selection (promising
solutions are picked from the population by using a selection function σ ), reproduction (new solutions
are created by modifying selected solutions using some reproductive operators ωi ), and replacement (the
population is updated by replacing some existing solutions by the newly created ones, using a replacement
function ψ). This process is repeated until a certain termination criterion (usually reaching a maximum
number of iterations) is satisfied. Each iteration of this process is commonly termed a generation.
According to this description, it is possible to express the pseudocode of an EA as shown in Figure 1.2.
Every possible instantiation of this algorithmic template1 will give rise to a different EA. More precisely,
it is possible to distinguish different EA families, by considering some guidelines on how to perform this
instantiation.
• Evolutionary Programming (EP): This EA family originated in the work of Fogel et al. [11].
EP focuses on the adaption of individuals rather than on the evolution of their genetic informa-
tion. This implies a much more abstract view of the evolutionary process, in which the behavior of
individuals is directly modified (as opposed to manipulating their genes). This behavior is typically
modeled by using complex data structures such as finite automata or graphs (see Figure 1.3[a]).
Traditionally, EP uses asexual reproduction — also known as mutation, that is, introducing slight
changes in an existing solution — and selection techniques based on direct competition among
individuals.
• Evolution Strategies (ESs): These techniques were initially developed in Germany by Rechenberg
[12] and Schwefel [13]. Their original goal was serving as a tool for solving engineering problems.
With this goal in mind, these techniques are characterized by manipulating arrays of floating-point
numbers (there exist versions of ES for discrete problems, but they are much more popular for
continuous optimization). As in EP, mutation is sometimes the only reproductive operator used
in ES; it is not rare, though, to also consider recombination (i.e., the construction of new solutions by
combining portions of some individuals). A very important feature of ES is the utilization
of self-adaptive mechanisms for controlling the application of mutation. These mechanisms are
aimed at optimizing the progress of the search by evolving not only the solutions for the problem
being considered, but also some parameters for mutating these solutions (in a typical situation,
1 The mere fact that this high-level heuristic template can host a low-level heuristic justifies using the term
metaheuristic, as will be seen later.
FIGURE 1.3 Two examples of complex representations. (a) A graph representing a neural network. (b) A tree
representing a fuzzy rule. [Figure artwork not reproduced.]
an ES individual is a pair (x , σ ), where σ is a vector of standard deviations used to control the
Gaussian mutation exerted on the actual solution x ).
• Genetic Algorithms (GAs): GAs are possibly the most widespread variant of EAs. They were con-
ceived by Holland [14]. His work has had a great influence on the development of the field, to the
point that some portions of it — arguably extrapolated — were taken almost as dogma (e.g., the
ubiquitous use of binary strings as chromosomes). The main feature of GAs is the use of a recombination
(or crossover) operator as the primary search tool. The rationale is the assumption that
different parts of the optimal solution can be independently discovered, and be later combined to
create better solutions. Additionally, mutation is also used, but it was usually considered a second-
ary background operator whose purpose is merely “keeping the pot boiling” by introducing new
information in the population (this classical interpretation is no longer considered valid though).
These families have not grown in complete isolation from each other. On the contrary, numerous
researchers built bridges among them. As a result of this interaction, the borders of these classical families
tend to be fuzzy (the reader may check [15] for a unified presentation of EA families), and new variants
have emerged. We can cite the following:
• Evolution Programs (EPs): This term is due to Michalewicz [5], and comprises those techniques
that, while using the principles of functioning of GAs, evolve complex data structures, as in EP.
In addition to the different EA variants mentioned above, there exist several other techniques that could
also fall within the scope of EAs, such as Ant Colony Optimization [20], Distribution Estimation Algorithms
[21], or Scatter Search [22] among others. All of them rely on achieving some kind of balance between
the exploration of new regions of the search space, and the exploitation of regions known to be promising
[23], so as to minimize the computational effort for finding the desired solution. Nevertheless, these
techniques exhibit very distinctive features that make them depart from the general pseudocode depicted
in Figure 1.2. The broader term metaheuristic (e.g., see Reference 24) is used to encompass this larger set
of modern optimization techniques, including EAs.
so as to have a genetic reservoir of worthwhile information in the past, and thus be capable of tackling
dynamic changes in the fitness function.
Notice that there may even exist more than one criterion for guiding the search (e.g., we would like to
evolve the shape of a set of pillars, so that their strength is maximal, but so that their cost is also minimal).
These criteria will be typically partially conflicting. In this case, a multiobjective problem is being faced.
This can be tackled in different ways, such as performing an aggregation of these multiple criteria into a
single value, or using the notion of Pareto dominance (i.e., solution x dominates solution y if, and only
if, fi (x) yields a better or equal value than fi (y) for all i, where the fi ’s represent the multiple criteria being
optimized). See References 26 and 27 for details.
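The dominance test just described can be sketched directly over vectors of objective values. Note that the usual definition additionally requires a strict improvement in at least one criterion; the pillar objectives below are a hypothetical illustration, with cost negated so that both criteria are maximized:

```python
def dominates(fx, fy):
    """Pareto dominance over objective vectors (maximization assumed):
    fx dominates fy iff it is at least as good in every criterion
    and strictly better in at least one."""
    return all(a >= b for a, b in zip(fx, fy)) and any(a > b for a, b in zip(fx, fy))

# Hypothetical pillars example: objectives = (strength, -cost), both maximized.
strong_cheap = (10.0, -3.0)
weak_pricey = (7.0, -5.0)
strong_pricey = (10.0, -5.0)
```

Non-dominated solutions form the Pareto front, which multiobjective EAs try to approximate.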
1.4.2 Initialization
In order to have the EA started, it is necessary to create the initial population of solutions. This is
typically addressed by randomly generating the desired number of solutions. When the alphabet used
for representing solutions has low cardinality, this random initialization provides a more or less uniform
sample of the solution space. The EA can subsequently start exploring the wide area covered by the initial
population, in search of the most promising regions.
In some cases, there exists the risk of not having the initial population adequately scattered all over the
search space (e.g., when using small populations and/or large alphabets for representing solutions). It is
then necessary to resort to systematic initialization procedures [28], so as to ensure that all symbols are
uniformly present in the initial population.
This random initialization can be complemented with the inclusion of heuristic solutions in the initial
population. The EA can thus benefit from the existence of other algorithms, using the solutions they
provide. This is termed seeding, and it is known to be very beneficial in terms of convergence speed, and
quality of the solutions achieved [29,30]. The potential drawback of this technique is having the injected
solutions taking over the whole population in a few iterations, provoking the stagnation of the algorithm.
This problem can be remedied by tuning the selection intensity by some means (e.g., by making an
adequate choice of the selection operator, as it will be shown below).
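Random initialization with optional seeding can be sketched as follows (the function and parameter names are assumptions, not the chapter's):

```python
import random

def initialize(pop_size, length, seeds=(), rng=None):
    """Create an initial population of binary strings, optionally injecting
    heuristic solutions (seeding). Keeping the number of injected seeds
    small reduces the takeover risk described in the text."""
    rng = rng or random.Random(1)
    population = [list(s) for s in seeds][:pop_size]  # injected heuristic solutions
    while len(population) < pop_size:                 # fill the rest at random
        population.append([rng.randint(0, 1) for _ in range(length)])
    return population
```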
1.4.3 Selection
In combination with replacement, selection is responsible for the competition aspects of individuals in
the population. In fact, replacement can be intuitively regarded as the complementary application of the
selection operation.
Using the information provided by the fitness function, a sample of individuals from the population is
selected for breeding. This sample is obviously biased towards better individuals; that is, good — according
to the fitness function — solutions should be more likely to appear in the sample than bad solutions.2
The most popular techniques are fitness-proportionate methods. In these methods, the probability of
selecting an individual for breeding is proportional to its fitness, that is,
pi = fi / Σj∈P fj ,    (1.1)
where fi is the fitness3 of individual i, and pi is the probability of i getting into the reproduction stage. This
proportional selection can be implemented in a number of ways. For example, roulette-wheel selection rolls
2 At least, this is customary in genetic algorithms. In other EC families, selection is less important for biasing
evolution, and it is done at random (a typical option in evolution strategies), or exhaustively, that is, all individuals
undergo reproduction (as it is typical in evolutionary programming).
3 Maximization is assumed here. In case we were dealing with a minimization problem, fitness should be transformed
so as to obtain an appropriate value for this purpose, for example, subtracting it from the highest possible value of the
guiding function, or taking the inverse of it.
a die with |P| sides, such that the ith side has probability pi . This is repeated as many times as individuals
are required in the sample. A drawback of this procedure is that the actual number of instances of
individual i in the sample can largely deviate from the expected |P| · pi . Stochastic Universal Sampling [31]
(SUS) does not have this problem, and produces a sample with minimal deviation from expected values.
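Both sampling schemes can be sketched as follows (illustrative code, not the chapter's; positive fitness values and maximization are assumed):

```python
import random

def roulette_wheel(fitnesses, k, rng):
    """Spin a |P|-sided biased die k times; the count of copies of
    individual i can deviate substantially from the expected k * p_i."""
    total = sum(fitnesses)
    sample = []
    for _ in range(k):
        r, acc = rng.uniform(0, total), 0.0
        for i, f in enumerate(fitnesses):
            acc += f
            if r <= acc:
                sample.append(i)
                break
    return sample

def sus(fitnesses, k, rng):
    """Stochastic Universal Sampling: one spin, k equally spaced pointers,
    so individual i appears floor(k * p_i) or ceil(k * p_i) times."""
    total = sum(fitnesses)
    step = total / k
    start = rng.uniform(0, step)
    pointers = [start + j * step for j in range(k)]
    sample, acc, i = [], fitnesses[0], 0
    for p in pointers:
        while p > acc:
            i += 1
            acc += fitnesses[i]
        sample.append(i)
    return sample
```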
Fitness-proportionate selection faces problems when the fitness values of individuals are very similar
among them. In this case, pi would be approximately |P|−1 for all i ∈ P, and hence selection would be
essentially random. This can be remedied by using fitness scaling. Typical options are (see Reference 5):
Another problem is the appearance of an individual whose fitness is much better than the remaining
individuals. Such super-individuals can quickly take over the population. To avoid this, the best option is
using a nonfitness-proportionate mechanism. A first possibility is ranking selection [32]: individuals are
ranked according to fitness (best first, worst last), and later selected — for example, by means of SUS —
using the following probabilities:
pi = (1/|P|) · [η− + (η+ − η−) · (i − 1)/(|P| − 1)],    (1.2)
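The linear ranking probabilities of Equation (1.2) can be sketched as follows. The values of η+ and η− below are illustrative; for the probabilities to sum to 1 they must satisfy η− + η+ = 2:

```python
def ranking_probabilities(n, eta_minus=0.5, eta_plus=1.5):
    """Linear ranking: selection probability depends only on the rank i
    (1..n), not on the raw fitness values, which damps super-individuals."""
    return [(eta_minus + (eta_plus - eta_minus) * (i - 1) / (n - 1)) / n
            for i in range(1, n + 1)]
```

Because probabilities depend only on rank, a single outstanding fitness value cannot dominate the sample.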
1.4.4 Recombination
Recombination is a process that models information exchange among several individuals (typically two
of them, but a higher number is possible [37]). This is done by constructing new solutions using the
information contained in a number of selected parents. If it is the case that the resulting individuals (the
offspring) are entirely composed of information taken from the parents, then the recombination is said to
be transmitting [38,39]. This is the case of classical recombination operators for bitstrings such as single-point
crossover, or uniform crossover [40], among others. Figure 1.4 shows an example of the application
of these operators.

[FIGURE 1.4: single-point crossover (cut point) and uniform crossover (binary mask) applied to two binary parents; artwork not reproduced.]

FIGURE 1.5 PMX at work. The numbers in brackets indicate the order in which elements are copied to the
descendant. [Figure artwork not reproduced.]
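These two operators admit a compact sketch (illustrative Python; list-based genomes assumed). Note that both are transmitting: every gene of an offspring comes from one of the parents:

```python
import random

def single_point_crossover(p1, p2, rng):
    """Cut both parents at one random position and exchange the tails."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform_crossover(p1, p2, rng):
    """A random binary mask decides, gene by gene, which parent contributes."""
    mask = [rng.randint(0, 1) for _ in p1]
    c1 = [a if m else b for m, a, b in zip(mask, p1, p2)]
    c2 = [b if m else a for m, a, b in zip(mask, p1, p2)]
    return c1, c2
```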
This property captures the a priori role of recombination: combining good parts of solutions that have
been independently discovered. It can be difficult to achieve for certain problem domains though (the
Traveling Salesman Problem (TSP) is a typical example). In those situations, it is possible to consider other
properties of interest such as respect or assortment. The former refers to the fact that the recombination
operator generates descendants carrying all features common to all parents; thus, this property can be seen
as a part of the exploitative side of the search. On the other hand, assortment represents the exploratory side
of recombination. A recombination operator is said to be properly assorting if, and only if, it can generate
descendants carrying any combination of compatible features taken from the parents. The assortment is
said to be weak if it is necessary to perform several recombinations within the offspring to achieve this
effect.
The recombination operator must match the particulars of the representation of solutions chosen.
In the GA context, the representation was typically binary, and hence operators such as those depicted
in Figure 1.4 were used. The situation is different in other EA families (and indeed in modern GAs too).
Without leaving GAs, another very typical representation is that of permutations. Many ad hoc operators
have been defined for this purpose, for example, order crossover (OX) [41], partially mapped crossover
(PMX; see Figure 1.5) [42], and uniform cycle crossover (UCX) [43] among others. The reader may check
[43] for a survey of these different operators.
When used in continuous parameter optimization, recombination can exploit the richness of the
representation, and utilize a variety of alternate strategies to create the offspring. Let (x1 , . . . , xn ) and
(y1 , . . . , yn ) be two arrays of real valued elements to be recombined, and let (z1 , . . . , zn ) be the resulting
array. Some possibilities for performing recombination are the following:
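The original list of options did not survive this reproduction; two standard possibilities from the ES literature, discrete and arithmetic (intermediate) recombination, can be sketched under the stated (x, y, z) notation:

```python
import random

def discrete_recombination(x, y, rng):
    """Each z_i is copied verbatim from either x_i or y_i."""
    return [xi if rng.random() < 0.5 else yi for xi, yi in zip(x, y)]

def arithmetic_recombination(x, y, alpha=0.5):
    """Each z_i is a convex combination alpha*x_i + (1-alpha)*y_i,
    placing the offspring between its parents in each coordinate."""
    return [alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y)]
```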
In the case of self-adaptive schemes as those typically used in ES, the parameters undergoing self-
adaption would be recombined as well, using some of these operators. More details on self-adaption will
follow in next subsection.
Solutions can be also represented by means of some complex data structure, and the recombination
operator must be adequately defined to deal with these (e.g., References 46 to 48). In particular, the field
of GP normally uses trees to represent LISP programs [17], rule-bases [49], mathematical expressions,
etc. Recombination is usually performed here by swapping branches of the trees involved, as exemplified
in Figure 1.6.
1.4.5 Mutation
From a classical point of view (at least in the GA arena [50]), this was a secondary operator whose mission is
to keep the pot boiling, continuously injecting new material into the population, but at a low rate (otherwise,
the search would degrade into a random walk in the solution space). EP practitioners [11] would disagree
with this characterization, claiming a central role for mutation. Actually, it is considered the crucial part
of the search engine in this context. This latter vision has nowadays propagated to most EC researchers
(at least in the sense of considering mutation as important as recombination).
As was the case for recombination, the choice of a mutation operator depends on the representation
used. In bitstrings (and, in general, in linear strings spanning Σn , where Σ is an arbitrary alphabet), mutation
is done by randomly substituting the symbol contained at a certain position by a different symbol.
permutation representation is used, such a procedure cannot be used for it would not produce a valid
permutation. Typical strategies in this case are swapping two randomly chosen positions, or inverting a
segment of the permutation. The interested reader may check [51] or [5] for an overview of different
options.
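The two permutation mutations just mentioned can be sketched as follows (illustrative code; both preserve the permutation property):

```python
import random

def swap_mutation(perm, rng):
    """Exchange two randomly chosen positions of the permutation."""
    child = perm[:]
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def inversion_mutation(perm, rng):
    """Reverse a randomly chosen segment of the permutation."""
    child = perm[:]
    i, j = sorted(rng.sample(range(len(child) + 1), 2))
    child[i:j] = reversed(child[i:j])
    return child
```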
If solutions are represented by complex data structures, mutation has to be implemented accordingly.
In particular, this is the case of EP, in which, for example, finite automata [52], layered graphs [53],
directed acyclic graphs [54], etc., are often evolved. In this domain, it is customary to use more than one
mutation operator, making for each individual a choice of which operators will be deployed on it.
In the case of ES applied to continuous optimization, mutation is typically done using Gaussian
perturbations, that is,
zi = xi + Ni (0, σi ), (1.3)
where σi is a parameter controlling the amplitude of the mutation, and N (a, b) is a random number
drawn from a normal distribution with mean a and standard deviation b. The parameters σi usually
undergo self-adaption. In this case, they are mutated prior to mutating the xi ’s as follows:
σ_i' = σ_i · e^{N(0,τ') + N_i(0,τ)},   (1.4)

where τ and τ' are two parameters termed the local and global learning rate, respectively. Advanced schemes
have also been defined in which a covariance matrix is used rather than independent σ_i's. However, these
schemes tend to be impractical when solutions are high-dimensional. For a better understanding of ES
mutation see Reference 55.
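Equations (1.3) and (1.4) can be sketched in Python as follows (an illustrative sketch; the function name and argument conventions are ours, and a single shared draw implements the global N(0, τ') term that is common to all components):

```python
import math
import random

def es_mutate(x, sigma, tau, tau_prime):
    """Self-adaptive ES mutation: first update the step sizes
    (Equation 1.4), then perturb the solution with them (Equation 1.3)."""
    shared = random.gauss(0.0, tau_prime)        # global term N(0, tau'), drawn once
    new_sigma = [s * math.exp(shared + random.gauss(0.0, tau))  # local term N_i(0, tau)
                 for s in sigma]
    new_x = [xi + random.gauss(0.0, si) for xi, si in zip(x, new_sigma)]
    return new_x, new_sigma
```

Because the step sizes are multiplied by an exponential, they remain strictly positive after any number of mutations.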
1.4.6 Replacement
The role of replacement is to keep the population size constant.4 To do so, some individuals from the
population have to be replaced by some of the individuals created during reproduction. This can be
done in several ways:
• Replacement-of-the-worst : the population is sorted according to fitness, and the new individuals
replace the worst ones from the population.
• Random replacement : the individuals to be replaced are selected at random.
• Tournament replacement : a subset of α individuals is selected at random, and the worst one is
selected for replacement. Notice that if α = 1 we have random replacement.
• Direct replacement : the offspring replace their parents.
Some variants of these strategies are possible. For example, it is possible to consider the elitist versions
of these, and only perform replacement if the new individual is better than the individual it has to replace.
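A tournament-replacement step, together with its elitist variant, might be sketched as follows (an illustrative sketch under our own naming; fitness is assumed to be maximized):

```python
import random

def tournament_replacement(population, fitness, newcomer, alpha, elitist=False):
    """Pick alpha random individuals; the worst of them is replaced by the
    newcomer (with alpha = 1 this degenerates to random replacement).
    If elitist, replace only when the newcomer is fitter."""
    contenders = random.sample(range(len(population)), alpha)
    worst = min(contenders, key=lambda i: fitness(population[i]))
    if not elitist or fitness(newcomer) > fitness(population[worst]):
        population[worst] = newcomer
```

With alpha equal to the population size this behaves like replacement-of-the-worst, and the elitist flag implements the variant in which replacement only happens when the new individual is better.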
Two replacement strategies (comma and plus) are also typically considered in the context of ES and
EP. Comma replacement is analogous to replacement-of-the-worst, with the addition that the number of
new individuals |P′| (also denoted by λ) can be larger than the population size |P| (also denoted by µ).
In this case, the population is constructed using the best µ out of the λ new individuals. As to the plus
strategy, it is the elitist counterpart of the former, that is, pick the best µ individuals out of the µ
old individuals plus the λ new ones. The notation (µ, λ)-EA and (µ + λ)-EA is used to denote these
two strategies.
It must be noted that the term "elitism" is often used as well to denote replacement-of-the-worst
strategies in which |P′| < |P|. This strategy is very commonly used, and ensures that the best individual
found so far is never lost. An extreme situation takes place when |P′| = 1, that is, just a single individual is
generated in each iteration of the algorithm. This is known as steady-state reproduction, and it is usually
associated with faster convergence of the algorithm. The term generational is used to designate the classical
situation in which |P′| = |P|.
4 Although it is not mandatory to do so [56], it is common practice to use populations of fixed size.
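The comma and plus strategies above can be sketched as follows (an illustrative sketch; it assumes fitness is maximized and that µ is given by the number of parents):

```python
def comma_replacement(parents, offspring, fitness):
    """(mu, lambda): the next population is the best mu of the lambda offspring."""
    mu = len(parents)
    assert len(offspring) >= mu, "comma replacement requires lambda >= mu"
    return sorted(offspring, key=fitness, reverse=True)[:mu]

def plus_replacement(parents, offspring, fitness):
    """(mu + lambda): the best mu out of parents and offspring together."""
    mu = len(parents)
    return sorted(parents + offspring, key=fitness, reverse=True)[:mu]
```

The plus strategy is elitist by construction: a good parent can never be displaced by a worse offspring, whereas the comma strategy discards all parents regardless of quality.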
Telecommunications is another field that has witnessed the successful application of EAs. For example,
EAs have been applied to the placement of antennas and converters [82,83], frequency assignment
[84–86], digital data network design [87], predicting bandwidth demands in ATM networks [88], error
code design [89,90], etc. See also Reference 91.
Evolutionary algorithms have been actively used in electronics and engineering as well. For example,
work has been done in structure optimization [92], aeronautic design [93], power planning [94], circuit
design [95] computer-aided design [96], analogue-network synthesis [97], and service restoration [98]
among other areas.
Besides the precise application areas mentioned before, EAs have been also utilized in many other
fields such as, for example, medicine [99,100], economics [101,102], mathematics [103,104], biology
[105–107], etc. The reader may try querying any bibliographical database or web search engine for
“evolutionary algorithm application” to get an idea of the vast number of problems that have been tackled
with EAs.
1.6 Conclusions
EC is a fascinating field. Its optimization philosophy is appealing, and its practical power is striking.
Whenever the user is faced with a hard search/optimization task that she cannot solve by classical means,
trying EAs is a must. The extremely brief overview of EA applications presented before can convince the
reader that a “killer approach” is in her hands.
EC is also a very active research field. One of the main weaknesses of the field is the absence of
a conclusive general theoretical basis, although great advances are being made in this direction, and
in-depth knowledge is available about certain idealized EA models.
Regarding the more practical aspects of the paradigm, two main lines of work can be identified:
parallelization and hybridization. The use of decentralized EAs in the context of multiprocessor or
networked systems can result in enormous performance improvements [108], and constitutes an ideal option
for exploiting the availability of distributed computing resources. As to hybridization, it has become
evident in recent years that it constitutes a crucial factor for the successful use of EAs in real-world
endeavors. This can be achieved by hard-wiring problem knowledge within the EA, or by combining it
with other techniques. In this sense, the reader is encouraged to read other essays in this volume to get
valuable ideas on suitable candidates for this hybridization.
Acknowledgments
This work has been partially funded by the Spanish Ministry of Science and Technology (MCYT) and the
European Regional Development Fund (FEDER) under contract TIC2002-04498-C05-02 (the TRACER project)
http://tracer.lcc.uma.es.
References
[1] T. Bäck, D.B. Fogel, and Z. Michalewicz. Handbook of Evolutionary Computation. Oxford
University Press, New York, 1997.
[2] T.C. Jones. Evolutionary Algorithms, Fitness Landscapes and Search. Ph.D. thesis, University of
New Mexico, 1995.
[3] C. Darwin. On the Origin of Species by Means of Natural Selection. John Murray, London, 1859.
[4] G. Mendel. Versuche über pflanzen-hybriden. Verhandlungen des Naturforschendes Vereines in
Brünn, 4: 3–47, 1865.
[5] Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag,
Berlin, 1992.
[6] J. Huxley. Evolution, the Modern Synthesis. Harper, New York, 1942.
[7] M. Kimura. Evolutionary rate at the molecular level. Nature, 217: 624–626, 1968.
[8] S.J. Gould and N. Eldredge. Punctuated equilibria: The tempo and mode of evolution reconsidered.
Paleobiology, 32: 115–151, 1977.
[9] C.G. Langton. Artificial life. In C.G. Langton, Ed., Artificial Life 1. Addison-Wesley, Santa Fe, NM,
1989, pp. 1–47.
[10] D.B. Fogel. Evolutionary Computation: The Fossil Record. Wiley-IEEE Press, Piscataway, NJ, 1998.
[11] L.J. Fogel, A.J. Owens, and M.J. Walsh. Artificial Intelligence Through Simulated Evolution. John
Wiley & Sons, New York, 1966.
[12] I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologis-
chen Evolution. Frommann-Holzboog Verlag, Stuttgart, 1973.
[13] H.P. Schwefel. Numerische Optimierung von Computer–Modellen mittels der Evolutionsstrategie,
Vol. 26 of Interdisciplinary Systems Research. Birkhäuser, Basel, 1977.
[14] J.H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press,
Ann Arbor, MI, 1975.
[15] T. Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, New York, 1996.
[16] M.L. Cramer. A representation for the adaptive generation of simple sequential programs.
In J.J. Grefenstette, Ed., Proceedings of the First International Conference on Genetic Algorithms.
Lawrence Erlbaum Associates, Hillsdale, NJ, 1985.
[17] J.R. Koza. Genetic Programming. MIT Press, Cambridge, MA, 1992.
[18] P. Moscato. On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards
Memetic Algorithms. Technical report Caltech Concurrent Computation Program, Report 826,
California Institute of Technology, Pasadena, CA, USA, 1989.
[19] P. Moscato and C. Cotta. A gentle introduction to memetic algorithms. In F. Glover and
G. Kochenberger, Eds., Handbook of Metaheuristics. Kluwer Academic Publishers, Boston, MA,
2003, pp. 105–144.
[20] M. Dorigo and G. Di Caro. The ant colony optimization meta-heuristic. In D. Corne, M. Dorigo,
and F. Glover, Eds., New Ideas in Optimization. Maidenhead, UK, 1999, pp. 11–32.
[21] P. Larrañaga and J.A. Lozano. Estimation of Distribution Algorithms. A New Tool for Evolutionary
Computation. Kluwer Academic Publishers, Boston, MA, 2001.
[22] M. Laguna and R. Martí. Scatter Search. Methodology and Implementations in C. Kluwer Academic
Publishers, Boston, MA, 2003.
[23] C. Blum and A. Roli. Metaheuristics in combinatorial optimization: Overview and conceptual
comparison. ACM Computing Surveys, 35: 268–308, 2003.
[24] F. Glover and G. Kochenberger. Handbook of Metaheuristics. Kluwer Academic Publishers, Boston,
MA, 2003.
[25] R.E. Smith. Diploid genetic algorithms for search in time varying environments. In Annual
Southeast Regional Conference of the ACM. ACM Press, New York, 1987, pp. 175–179.
[26] C.A. Coello. A comprehensive survey of evolutionary-based multiobjective optimization
techniques. Knowledge and Information Systems, 1: 269–308, 1999.
[27] C.A. Coello and A.D. Christiansen. An approach to multiobjective optimization using genetic
algorithms. In C.H. Dagli, M. Akay, C.L.P. Chen, B.R. Fernández, and J. Ghosh, Eds., Intelligent
Engineering Systems Through Artificial Neural Networks, Vol. 5. ASME Press, St. Louis, MO, 1995,
pp. 411–416.
[28] C.R. Reeves. Using genetic algorithms with small populations. In S. Forrest, Ed., Proceedings of the
Fifth International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA, 1993,
pp. 92–99.
[29] C. Cotta. On the evolutionary inference of temporal Boolean networks. In J. Mira and
J.R. Álvarez, Eds., Computational Methods in Neural Modeling, Vol. 2686 of Lecture Notes in
Computer Science. Springer-Verlag, Berlin, Heidelberg, 2003, pp. 494–501.
[30] C. Ramsey and J.J. Grefensttete. Case-based initialization of genetic algorithms. In S. Forrest,
Ed., Proceedings of the Fifth International Conference on Genetic Algorithms. Morgan Kaufmann,
San Mateo, CA, 1993, pp. 84–91.
[31] J.E. Baker. Reducing bias and inefficiency in the selection algorithm. In J.J. Grefenstette, Ed.,
Proceedings of the Second International Conference on Genetic Algorithms. Lawrence Erlbaum
Associates, Hillsdale, NJ, 1987, pp. 14–21.
[32] D.L. Whitley. Using reproductive evaluation to improve genetic search and heuristic discovery.
In J.J. Grefenstette, Ed., Proceedings of the Second International Conference on Genetic Algorithms.
Lawrence Erlbaum Associates, Hillsdale, NJ, 1987, pp. 116–121.
[33] T. Bickle and L. Thiele. A mathematical analysis of tournament selection. In L.J. Eshelman,
Ed., Proceedings of the Sixth International Conference on Genetic Algorithms. Morgan Kaufmann,
San Francisco, CA, 1995, pp. 9–16.
[34] E. Cantú-Paz. Order statistics and selection methods of evolutionary algorithms. Information
Processing Letters, 82: 15–22, 2002.
[35] K. Deb and D. Goldberg. A comparative analysis of selection schemes used in genetic algorithms.
In G.J. Rawlins, Ed., Foundations of Genetic Algorithms. San Mateo, CA, 1991, pp. 69–93.
[36] E. Alba and J.M. Troya. A survey of parallel distributed genetic algorithms. Complexity, 4: 31–52,
1999.
[37] A.E. Eiben, P.-E. Raue, and Zs. Ruttkay. Genetic algorithms with multi-parent recombination.
In Y. Davidor, H.-P. Schwefel, and R. Männer, Eds., Parallel Problem Solving from Nature
III, Vol. 866 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, Heidelberg, 1994,
pp. 78–87.
[38] C. Cotta and J.M. Troya. Information processing in transmitting recombination. Applied
Mathematics Letters, 16: 945–948, 2003.
[39] N.J. Radcliffe. The algebra of genetic algorithms. Annals of Mathematics and Artificial Intelligence,
10: 339–384, 1994.
[40] G. Syswerda. Uniform crossover in genetic algorithms. In J.D. Schaffer, Ed., Proceedings of the
Third International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA, 1989,
pp. 2–9.
[41] L. Davis. Handbook of Genetic Algorithms. Van Nostrand Reinhold Computer Library, New York,
1991.
[42] D.E. Goldberg and R. Lingle, Jr. Alleles, loci and the traveling salesman problem.
In J.J. Grefenstette, Ed., Proceedings of an International Conference on Genetic Algorithms.
Lawrence Erlbaum Associates, Hillsdale, NJ, 1985.
[43] C. Cotta and J.M. Troya. Genetic forma recombination in permutation flowshop problems.
Evolutionary Computation, 6: 25–44, 1998.
[44] L.J. Eshelman and J.D. Schaffer. Real-coded genetic algorithms and interval-schemata. In
D. Whitley, Ed., Foundations of Genetic Algorithms 2. Morgan Kaufmann Publishers, San Mateo,
CA, 1993, pp. 187–202.
[45] F. Herrera, M. Lozano, and J.L. Verdegay. Dynamic and heuristic fuzzy connectives-based
crossover operators for controlling the diversity and convergence of real coded genetic algorithms.
Journal of Intelligent Systems, 11: 1013–1041, 1996.
[46] E. Alba, J.F. Aldana, and J.M. Troya. Full automatic ANN design: A genetic approach. In J. Cabestany,
J. Mira, and A. Prieto, Eds., New Trends in Neural Computation, Vol. 686 of Lecture Notes in
Computer Science. Springer-Verlag, Heidelberg, 1993, pp. 399–404.
[47] E. Alba and J.M. Troya. Genetic algorithms for protocol validation. In H.M. Voigt, W. Ebeling,
I. Rechenberg, and H.-P. Schwefel, Eds., Parallel Problem Solving from Nature IV. Springer-Verlag,
Berlin, Heidelberg, 1996, pp. 870–879.
[48] C. Cotta and J.M. Troya. Analyzing directed acyclic graph recombination. In B. Reusch, Ed.,
Computational Intelligence: Theory and Applications, Vol. 2206 of Lecture Notes in Computer
Science. Springer-Verlag, Berlin, Heidelberg, 2001, pp. 739–748.
[49] E. Alba, C. Cotta, and J.M. Troya. Evolutionary design of fuzzy logic controllers using strongly-
typed GP. Mathware & Soft Computing, 6: 109–124, 1999.
[50] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-
Wesley, Reading, MA, 1989.
[51] A.E. Eiben and J.E. Smith. Introduction to Evolutionary Computing. Springer-Verlag, Berlin,
Heidelberg, 2003.
[52] C.H. Clelland and D.A. Newlands. PFSA modelling of behavioural sequences by evolutionary
programming. In R.J. Stonier and X.H. Yu, Eds., Complex Systems: Mechanism for Adaptation.
IOS Press, Rockhampton, Queensland, Australia, 1994, pp. 165–172.
[53] X. Yao and Y. Liu. A new evolutionary system for evolving artificial neural networks. IEEE
Transactions on Neural Networks, 8: 694–713, 1997.
[54] M.L. Wong, W. Lam, and K.S. Leung. Using evolutionary programming and minimum description
length principle for data mining of Bayesian networks. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 21: 174–178, 1999.
[55] H.-G. Beyer. The Theory of Evolution Strategies. Springer-Verlag, Berlin, Heidelberg, 2001.
[56] F. Fernandez, L. Vanneschi, and M. Tomassini. The effect of plagues in genetic programming:
A study of variable-size populations. In C. Ryan et al., Eds., Genetic Programming, Proceedings of
EuroGP’2003, Vol. 2610 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, Heidelberg,
2003, pp. 320–329.
[57] S. Chatterjee, C. Carrera, and L. Lynch. Genetic algorithms and traveling salesman problems.
European Journal of Operational Research, 93: 490–510, 1996.
[58] D.B. Fogel. An evolutionary approach to the traveling salesman problem. Biological Cybernetics,
60: 139–144, 1988.
[59] P. Merz and B. Freisleben. Genetic local search for the TSP: New Results. In Proceedings of the
1997 IEEE International Conference on Evolutionary Computation. IEEE Press, Indianapolis, USA,
1997, pp. 159–164.
[60] C. Cotta and J.M. Troya. A hybrid genetic algorithm for the 0–1 multiple knapsack problem.
In G.D. Smith, N.C. Steele, and R.F. Albrecht, Eds., Artificial Neural Nets and Genetic Algorithms
3. Springer-Verlag, Wien New York, 1998, pp. 251–255.
[61] S. Khuri, T. Bäck, and J. Heitkötter. The zero/one multiple knapsack problem and genetic
algorithms. In E. Deaton, D. Oppenheim, J. Urban, and H. Berghel, Eds., Proceedings of the 1994
ACM Symposium of Applied Computation proceedings. ACM Press, New York, 1994, pp. 188–193.
[62] R. Berretta, C. Cotta, and P. Moscato. Enhancing the performance of memetic algorithms by
using a matching-based recombination algorithm: Results on the number partitioning problem.
In M. Resende and J. Pinho de Sousa, Eds., Metaheuristics: Computer-Decision Making. Kluwer
Academic Publishers, Boston, MA, 2003, pp. 65–90.
[63] D.R. Jones and M.A. Beltramo. Solving partitioning problems with genetic algorithms. In
R.K. Belew and L.B. Booker, Eds., In Proceedings of the Fourth International Conference on Genetic
Algorithms. Morgan Kaufmann, San Mateo, CA, 1991, pp. 442–449.
[64] C.C. Aggarwal, J.B. Orlin, and R.P. Tai. Optimized crossover for the independent set problem.
Operations Research, 45: 226–234, 1997.
[65] M. Hifi. A genetic algorithm-based heuristic for solving the weighted maximum independent set
and some equivalent problems. Journal of the Operational Research Society, 48: 612–622, 1997.
[66] D. Costa, N. Dubuis, and A. Hertz. Embedding of a sequential procedure within an evolutionary
algorithm for coloring problems in graphs. Journal of Heuristics, 1: 105–128, 1995.
[67] C. Fleurent and J.A. Ferland. Genetic and hybrid algorithms for graph coloring. Annals of
Operations Research, 63: 437–461, 1997.
[68] S. Cavalieri and P. Gaiardelli. Hybrid genetic algorithms for a multiple-objective scheduling
problem. Journal of Intelligent Manufacturing, 9: 361–367, 1998.
[69] D. Costa. An evolutionary tabu search algorithm and the NHL scheduling problem. INFOR, 33:
161–178, 1995.
[70] C.F. Liaw. A hybrid genetic algorithm for the open shop scheduling problem. European Journal
of Operational Research, 124: 28–42, 2000.
[71] L. Ozdamar. A genetic algorithm approach to a general category project scheduling problem.
IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 29: 44–59,
1999.
[72] E.K. Burke, J.P. Newall, and R.F. Weare. Initialisation strategies and diversity in evolutionary
timetabling. Evolutionary Computation, 6: 81–103, 1998.
[73] B. Paechter, R.C. Rankin, and A. Cumming. Improving a lecture timetabling system for university
wide use. In E.K. Burke and M. Carter, Eds., The Practice and Theory of Automated Timetabling
II, Vol. 1408 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1998, pp. 156–165.
[74] K. Haase and U. Kohlmorgen. Parallel genetic algorithm for the capacitated lot-sizing prob-
lem. In Kleinschmidt et al., Eds., Operations Research Proceedings. Springer-Verlag, Berlin, 1996,
pp. 370–375.
[75] J. Berger and M. Barkaoui. A hybrid genetic algorithm for the capacitated vehicle routing prob-
lem. In E. Cantú-Paz, Ed., Proceedings of the Genetic and Evolutionary Computation Conference
2003, Vol. 2723 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, Heidelberg, 2003,
pp. 646–656.
[76] J. Berger, M. Salois, and R. Begin. A hybrid genetic algorithm for the vehicle routing problem with
time windows. In R.E. Mercer and E. Neufeld, Eds., Advances in Artificial Intelligence. 12th Biennial
Conference of the Canadian Society for Computational Studies of Intelligence. Springer-Verlag,
Berlin, 1998, pp. 114–127.
[77] P. Merz and B. Freisleben. Genetic algorithms for binary quadratic programming. In W. Banzhaf
et al., Eds., Proceedings of the 1999 Genetic and Evolutionary Computation Conference,
Morgan Kaufmann, San Francisco, CA, 1999, pp. 417–424.
[78] P. Merz and B. Freisleben. Fitness landscape analysis and memetic algorithms for the quadratic
assignment problem. IEEE Transactions on Evolutionary Computation, 4: 337–352, 2000.
[79] E. Hopper and B. Turton. A genetic algorithm for a 2d industrial packing problem. Computers &
Industrial Engineering, 37: 375–378, 1999.
[80] R.M. Krzanowski and J. Raper. Hybrid genetic algorithm for transmitter location in wireless
networks. Computers, Environment and Urban Systems, 23: 359–382, 1999.
[81] M. Gen, K. Ida, and L. Yinzhen. Bicriteria transportation problem by hybrid genetic algorithm.
Computers & Industrial Engineering, 35: 363–366, 1998.
[82] P. Calegar, F. Guidec, P. Kuonen, and D. Wagner. Parallel island-based genetic algorithm for radio
network design. Journal of Parallel and Distributed Computing, 47: 86–90, 1997.
[83] C. Vijayanand, M.S. Kumar, K.R. Venugopal, and P.S. Kumar. Converter placement in all-optical
networks using genetic algorithms. Computer Communications, 23: 1223–1234, 2000.
[84] C. Cotta and J.M. Troya. A comparison of several evolutionary heuristics for the frequency
assignment problem. In J. Mira and A. Prieto, Eds., Connectionist Models of Neurons, Learning
Processes, and Artificial Intelligence, Vol. 2084 of Lecture Notes in Computer Science. Springer-
Verlag, Berlin, Heidelberg, 2001, pp. 709–716.
[85] R. Dorne and J.K. Hao. An evolutionary approach for frequency assignment in cellular radio
networks. In 1995 IEEE International Conference on Evolutionary Computation. IEEE Press, Perth,
Australia, 1995, pp. 539–544.
[86] A. Kapsalis, V.J. Rayward-Smith, and G.D. Smith. Using genetic algorithms to solve the radio link
frequency assignment problem. In D.W. Pearson, N.C. Steele, and R.F. Albretch, Eds., Artificial
Neural Nets and Genetic Algorithms. Springer-Verlag, Wien New York, 1995, pp. 37–40.
[87] C.H. Chu, G. Premkumar, and H. Chou. Digital data networks design using genetic algorithms.
European Journal of Operational Research, 127: 140–158, 2000.
[88] N. Swaminathan, J. Srinivasan, and S.V. Raghavan. Bandwidth-demand prediction in virtual path
in ATM networks using genetic algorithms. Computer Communications, 22: 1127–1135, 1999.
[89] H. Chen, N.S. Flann, and D.W. Watson. Parallel genetic simulated annealing: A massively parallel
SIMD algorithm. IEEE Transactions on Parallel and Distributed Systems, 9: 126–136, 1998.
[90] K. Dontas and K. De Jong. Discovery of maximal distance codes using genetic algorithms.
In Proceedings of the Second International IEEE Conference on Tools for Artificial Intelligence. IEEE
Press, Herndon, VA, 1990, pp. 805–811.
[91] D.W. Corne, M.J. Oates, and G.D. Smith. Telecommunications Optimization: Heuristic and
Adaptive Techniques. John Wiley, New York, 2000.
[92] I.C. Yeh. Hybrid genetic algorithms for optimization of truss structures. Computer Aided Civil
and Infrastructure Engineering, 14: 199–206, 1999.
[93] D. Quagliarella and A. Vicini. Hybrid genetic algorithms as tools for complex optimisation prob-
lems. In P. Blonda, M. Castellano, and A. Petrosino, Eds., New Trends in Fuzzy Logic II. Proceedings
of the Second Italian Workshop on Fuzzy Logic. World Scientific, Singapore, 1998, pp. 300–307.
[94] A.J. Urdaneta, J.F. Gómez, E. Sorrentino, L. Flores, and R. Díaz. A hybrid genetic algorithm for
optimal reactive power planning based upon successive linear programming. IEEE Transactions
on Power Systems, 14: 1292–1298, 1999.
[95] M. Guotian and L. Changhong. Optimal design of the broadband stepped impedance transformer
based on the hybrid genetic algorithm. Journal of Xidian University, 26: 8–12, 1999.
[96] B. Becker and R. Drechsler. Ofdd based minimization of fixed polarity Reed-Muller expressions
using hybrid genetic algorithms. In Proceedings of the IEEE International Conference on Computer
Design: VLSI in Computers and Processor. IEEE, Los Alamitos, CA, 1994, pp. 106–110.
[97] J.B. Grimbleby. Hybrid genetic algorithms for analogue network synthesis. In Proceedings of the
1999 Congress on Evolutionary Computation. IEEE, Washington D.C., 1999, pp. 1781–1787.
[98] A. Augugliaro, L. Dusonchet, and E. Riva-Sanseverino. Service restoration in compensated dis-
tribution networks using a hybrid genetic algorithm. Electric Power Systems Research, 46: 59–66,
1998.
[99] M. Sipper and C.A. Peña Reyes. Evolutionary computation in medicine: An overview. Artificial
Intelligence in Medicine, 19: 1–23, 2000.
[100] R. Wehrens, C. Lucasius, L. Buydens, and G. Kateman. HIPS, A hybrid self-adapting expert system
for nuclear magnetic resonance spectrum interpretation using genetic algorithms. Analytica
Chimica ACTA, 277: 313–324, 1993.
[101] J. Alander. Indexed Bibliography of Genetic Algorithms in Economics. Technical report
94-1-ECO, University of Vaasa, Department of Information Technology and Production
Economics, 1995.
[102] F. Li, R. Morgan, and D. Williams. Economic environmental dispatch made easy with hybrid
genetic algorithms. In Proceedings of the International Conference on Electrical Engineering, Vol.
2. International Academic Publishers, Beijing, China, 1996, pp. 965–969.
[103] C. Reich. Simulation of imprecise ordinary differential equations using evolutionary algorithms.
In J. Carroll, E. Damiani, H. Haddad, and D. Oppenheim, Eds., ACM Symposium on Applied
Computing 2000. ACM Press, New York, 2000, pp. 428–432.
[104] X. Wei and F. Kangling. A hybrid genetic algorithm for global solution of nondifferentiable
nonlinear function. Control Theory & Applications, 17: 180–183, 2000.
[105] C. Cotta and P. Moscato. Inferring phylogenetic trees using evolutionary algorithms.
In J.J. Merelo, P. Adamidis, H.-G. Beyer, J.-L. Fernández-Villacañas, and H.-P. Schwefel, Eds.,
Parallel Problem Solving from Nature VII, Vol. 2439 of Lecture Notes in Computer Science.
Springer-Verlag, Berlin, 2002, pp. 720–729.
[106] G.B. Fogel and D.W. Corne. Evolutionary Computation in Bioinformatics. Morgan Kaufmann,
San Francisco, CA, 2003.
[107] R. Thomsen, G.B. Fogel, and T. Krink. A clustal alignment improver using evolution-
ary algorithms. In David B. Fogel, Xin Yao, Garry Greenwood, Hitoshi Iba, Paul Marrow,
and Mark Shackleton, Eds., Proceedings of the Fourth Congress on Evolutionary Computation
(CEC-2002) Vol. 1. 2002, pp. 121–126.
[108] E. Alba. Parallel evolutionary algorithms can achieve super-linear performance. Information
Processing Letters, 82: 7–13, 2002.
2.1 Introduction
Artificial Neural Networks have been one of the most active areas of research in computer science over
the last 50 years with periods of intense activity interrupted by episodes of hiatus [1]. The premise for
the evolution of the theory of artificial Neural Networks stems from the basic neurological structure of
living organisms. A cell is the most important constituent of these life forms. These cells are connected
by "synapses," the links that carry messages between cells. By using synapses to carry the
pulses, cells can activate each other with different threshold values to form a decision or memorize an
event. Inspired by this simplistic vision of how messages are transferred between cells, scientists invented
a new computational approach, which became popularly known as Artificial Neural Networks (or Neural
Networks for short) and used it extensively to target a wide range of problems in many application
areas.
Although the shapes or configurations of different Neural Networks may look different at first
glance, they are quite similar in structure. Every neural network consists of "cells" and "links." Cells are
the computational part of the network that perform reasoning and generate activation signals for other
cells, while links connect the different cells and enable messages to flow between them. Each link is usually
a one-directional connection with a weight that affects the carried message in a certain way. This means
that a link receives a value (message) from an input cell, multiplies it by a given weight, and then passes it
to the output cell. In its simplest form, a cell can have three states (of activation): +1 (TRUE), 0, and −1
(FALSE) [1].
y = f(W · X + b),
W = (w1 w2 . . . wn),
X = (x1 x2 . . . xn)^T.
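A single cell of this kind can be sketched as follows (a minimal Python illustration; the three-state activation follows the +1/0/−1 convention above, with a small tolerance eps added by us for floating-point safety):

```python
def neuron(weights, bias, inputs, f):
    """y = f(W . X + b) for a single cell."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return f(s)

def sign3(s, eps=1e-9):
    """Three-state activation: +1 (TRUE), 0, and -1 (FALSE)."""
    if s > eps:
        return 1
    if s < -eps:
        return -1
    return 0
```

For example, with weights (1, 1) and bias −1.5, the cell fires +1 only when both inputs are active.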
The above-mentioned basic structure can be extended to produce networks with more than one output.
In this case, each output has its own weights and is completely uncorrelated to the other outputs. Figure 2.3
shows the structure of such a network.
FIGURE 2.1 (a) Unbiased and (b) biased structure of a neural network.
[Figure 2.2: a single cell with inputs x1 . . . xn, weights w1 . . . wn, bias b, a summation stage, and activation function f(·).]
[Figure 2.3: a single-layer network with n inputs, weights wi,j, biases b1 . . . bm, and outputs y1 . . . ym.]
Y = F(W · X + B),

where

W = (wi,j), an m × n matrix whose ith row is (wi,1 wi,2 . . . wi,n),
X = (x1 x2 . . . xn)^T,
Y = (y1 y2 . . . ym)^T,
B = (b1 b2 . . . bm)^T,
F(·) = (f1(·) f2(·) . . . fm(·))^T,
[Figure 2.4: a multi-layer perceptron with p layers; layer i has weight matrix W^i = (w^i_{j,k}), biases b^i_j, activation functions f^i_j(·), and outputs z^i_j.]
where n is the number of inputs, m the number of outputs, W the weight matrix, X the input vector,
Y the output vector, and F(·) the array of output functions.
A multi-layer perceptron can simply be constructed by concatenating several single-layer perceptron
networks. Figure 2.4 shows the basic structure of such network with the following parameters [1]: X is
the input vector, Y the output vector, n the number of inputs, m the number of outputs, p the total number
of layers in the network, mi the number of outputs for the ith layer and, ni the number of inputs for the
ith layer.
Note that in this network, every internal layer can have its own number of inputs and
outputs, subject only to the concatenation rule, that is, ni = mi−1. The output of the first layer is
calculated as follows:
Z^1 = F^1(W^1 · X + B^1),

where W^1 = (w^1_{i,j}) is the m1 × n weight matrix of the first layer and X = (x1 x2 . . . xn)^T.
The second layer then produces

Z^2 = F^2(W^2 · Z^1 + B^2),

where W^2 = (w^2_{i,j}) is an m2 × m1 matrix. Proceeding in the same way through the remaining
layers, the output of the network is

Y = Z^p = F^p(W^p · Z^{p−1} + B^p),

where W^p = (w^p_{i,j}) is an mp × mp−1 matrix, and

B^p = (b^p_1 b^p_2 . . . b^p_{mp})^T,
Z^p = (z^p_1 z^p_2 . . . z^p_{mp})^T,
F^p(·) = (f^p_1(·) f^p_2(·) . . . f^p_{mp}(·))^T.
Notice that in such networks, the complexity of the network grows rapidly with the number of
layers. As has been observed in practice, every multi-layer perceptron can be emulated by a single-layer
perceptron with a comparatively large number of nodes.
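The layer-by-layer evaluation described above can be sketched as follows (an illustrative Python sketch; for simplicity the same activation f is used for every node, whereas the formulation above allows a different f^i_j per node):

```python
import math

def layer(W, B, X, f):
    """One layer: Z = F(W . X + B), with W given row-wise."""
    return [f(sum(w * x for w, x in zip(row, X)) + b)
            for row, b in zip(W, B)]

def mlp_forward(layers, X, f=math.tanh):
    """Concatenate p single-layer perceptrons: the output of layer i-1
    feeds layer i, enforcing the concatenation rule n_i = m_{i-1}."""
    Z = X
    for W, B in layers:
        Z = layer(W, B, Z, f)
    return Z
```

Each element of `layers` is a (W, B) pair for one layer, so adding depth is just a matter of appending pairs to the list.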
[Figure: an example perceptron with weights 1.4 and 1.4, bias −0.7, and activation sgn(·), together with plots of the corresponding decision regions over inputs x1, x2 ∈ {−1, +1}.]
W = (w0 w1 ... wn ),
T = {(R 1 , S 1 ), (R 2 , S 2 ), . . . , (R L , S L )},
where n is the number of inputs, R^i is the ith input pattern, S^i represents the desired output for the ith
pattern, and L is the size of the training set. Note that, for the above vector W, wn is used to adjust the
bias in the values of the weights. Perceptron Learning can be summarized as follows:
Step 1: Set all elements of the weighting vector to zero, that is, W = (0 0 · · · 0).
Step 2: Select a training pattern at random, say the kth one.
Step 3: IF the current pattern is not classified correctly, that is, W · R^k ≠ S^k, then modify the
weighting vector as follows: W ← W + R^k S^k.
Step 4: Repeat steps 2 and 3 until all data are classified correctly.
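The steps above can be sketched in Python as follows (a sketch under our own assumptions: targets S^k ∈ {−1, +1}, each pattern R already carries a trailing 1 so that the last weight plays the bias role, and "not classified correctly" is read as sign(W · R^k) ≠ S^k):

```python
import random

def perceptron_train(T, n, max_epochs=1000):
    """T is a list of (R, S) pairs; R has n + 1 entries (trailing 1 for bias)."""
    W = [0.0] * (n + 1)                          # Step 1: start from the zero vector
    for _ in range(max_epochs):
        misclassified = False
        for R, S in random.sample(T, len(T)):    # Step 2: random order
            s = sum(w * r for w, r in zip(W, R))
            if (1 if s > 0 else -1) != S:        # Step 3: wrong class -> update
                W = [w + r * S for w, r in zip(W, R)]
                misclassified = True
        if not misclassified:                    # Step 4: stop when all correct
            return W
    return W
```

By the perceptron convergence theorem, this loop terminates for any linearly separable training set; for inseparable data it would cycle forever, which is exactly the deficiency discussed later in this section.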
⟨T_i, T_j⟩ ≈ 1 if i = j,  and  ⟨T_i, T_j⟩ ≈ 0 if i ≠ j,

where T_i is the ith training pattern and ⟨·, ·⟩ is the inner product of two vectors. Based on the above
assumption, the weight matrix for this network is calculated as follows, where ⊗ stands for the outer
product of two vectors:

W = Σ_{i=1}^{N} T_i ⊗ T_i.
As can be seen, the main advantage of this network is its one-shot learning process, provided the
data are orthogonal. Note that, even if the input data are not orthogonal in the first place, they can be
transformed into a new space by a simple transfer function.
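One-shot learning via the outer-product rule can be sketched as follows (an illustrative sketch; as the text requires, the stored patterns are assumed to be approximately orthonormal):

```python
def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def one_shot_train(patterns):
    """W = sum_i T_i (outer) T_i, computed in a single pass over the data."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for T in patterns:
        O = outer(T, T)
        for i in range(n):
            for j in range(n):
                W[i][j] += O[i][j]
    return W

def recall(W, x):
    """W . x reproduces a stored pattern when x matches one of them."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
```

For orthonormal patterns, recall of any stored pattern returns that pattern exactly, since the cross terms ⟨T_i, T_j⟩ vanish.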
2.3.1.3 Iterative Learning
Iterative learning is another approach that can be used to train a network. In contrast to one-shot
learning algorithms, the network's weights are modified gradually. In general, the network weights are
first set to some arbitrary values; then, the training data are fed to the network, and in each training
cycle the weights are adjusted slightly. The training process proceeds until the network reaches an
acceptable level of performance. The training data may be selected either sequentially or randomly in
each training cycle [9–11].
2.3.1.4 Hopfield’s Model
A Hopfield neural network is another example of an auto-associative network [1,12–14]. There are two
main differences between this network and the previously described auto-associative network. In this
network, self-connection is not allowed, that is, wi,i = 0 for all nodes. Also, inputs and outputs are either
0 or 1. The node activations are recomputed in each convergence cycle as follows:
S_i = Σ_{j=1}^{N} w_{i,j} · u_j(t),   (2.1)

u_i = 1 if S_i ≥ 0, and u_i = 0 if S_i < 0.   (2.2)
After a datum is fed into the network, in each convergence cycle the nodes are selected by a uniform random function; the inputs are used to evaluate Equation (2.1), followed by Equation (2.2) to generate the output. This procedure continues until the network converges.
The proof of convergence for this network uses the notion of “energy.” This means that an energy value
is assigned to each state of the network and through the different iterations of the algorithm, the overall
energy is decreased until it reaches a steady state.
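The asynchronous update of Equations (2.1) and (2.2) can be sketched as follows. The function name and the stopping rule (a sweep with no changes) are choices of this sketch; the weight matrix is assumed to be symmetric with a zero diagonal, as the text requires.

```python
import numpy as np

def hopfield_converge(W, u, max_sweeps=100, seed=0):
    """Asynchronous Hopfield update: visit the nodes in uniformly
    random order, recompute S_i = sum_j w_ij * u_j (Eq. 2.1) and
    threshold it to {0, 1} (Eq. 2.2); stop when a full sweep changes
    nothing, i.e., the energy has reached a steady state."""
    rng = np.random.default_rng(seed)
    u = u.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(u)):
            s = W[i] @ u                     # Eq. (2.1); w_ii = 0 assumed
            new = 1 if s >= 0 else 0         # Eq. (2.2)
            if new != u[i]:
                u[i] = new
                changed = True
        if not changed:
            return u                         # converged
    return u
```

A stored pattern is then a fixed point of the dynamics, and a corrupted copy of it is pulled back to the stored state.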
2.3.1.5 Mean Square Error Algorithms
These techniques emerged as an answer to the deficiencies encountered when using Perceptrons and other simple networks [1,15]. One of the most important is the inseparability of the training data: if the data used to train the network are not linearly separable, the training algorithm never terminates (Figure 2.8).
The other reason for using this technique is to converge to a better solution. In Perceptron learning, the training process terminates as soon as a first answer is found, regardless of its quality (i.e., the sensitivity of the answer). Figure 2.9 shows an example of such a case. Note that, although the answer found by the Perceptron algorithm is correct (Figure 2.9[a]), the answer in Figure 2.9(b) is more robust. Finally, another reason for using Mean Square Error (MSE) algorithms, which is crucial for most neural network algorithms, is the speed of convergence.
The MSE algorithm attempts to modify the network weights based on the overall error of all data. In this
case, assume that network input and output data are represented by Ti , Ri for i = 1, . . . , N , respectively.
Now the MSE error is defined as follows:

E = (1/N) Σ_{i=1}^{N} (W · T_i − R_i)².
Note that the stated error is the sum of the individual errors over all the training data. In spite of the advantages gained by this training technique, there are several disadvantages; for example, the network might not be able to classify the data correctly if they are widely spread apart (Figure 2.10). The other
disadvantage is the speed of convergence, which may vary completely from one set of data to another.
2.3.1.6 The Widrow–Hoff Rule or LMS Algorithm
In this technique, the network weights are modified after each iteration [1,16]. A training datum is selected at random; then the network weights are modified based on the corresponding error. This procedure continues until the weights converge. For a randomly selected kth entry of the training data, the error is calculated as follows:

ε = (W · T_k − R_k)².

The gradient of this error with respect to the weights is

∇ε = (∂ε/∂W_0, ∂ε/∂W_1, …, ∂ε/∂W_N).
Hence,

∂ε/∂W_j = 2(W · T_k − R_k) · T_{k,j},

that is, ∇ε = 2(W · T_k − R_k) · T_k.
Based on the Widrow–Hoff algorithm, the weights should be modified in the direction opposite to the gradient. As a result, the final update formula for the weight vector W is:

W ← W − ρ · (W · T_k − R_k) · T_k.

Note that ρ is known as the learning rate; it absorbs the constant multiplier 2.
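The Widrow–Hoff update can be sketched as follows. The stopping rule (a fixed number of random steps), the default learning rate, and the function name are assumptions of this sketch; in practice one would monitor the error instead.

```python
import numpy as np

def lms_train(T, R, rho=0.05, steps=800, seed=0):
    """Widrow-Hoff / LMS: repeatedly pick a training pair (T_k, R_k)
    at random and apply W <- W - rho * (W . T_k - R_k) * T_k."""
    rng = np.random.default_rng(seed)
    W = np.zeros(T.shape[1])
    for _ in range(steps):
        k = rng.integers(len(T))            # random kth entry
        err = W @ T[k] - R[k]               # signed error on that pair
        W = W - rho * err * T[k]            # step against the gradient
    return W
```

On data generated by a noise-free linear rule, the weights converge to that rule; too large a ρ, however, makes the iteration diverge, which is the speed-of-convergence sensitivity noted above.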
FIGURE 2.11 Results for a K-means clustering with (a) correct (b) incorrect number of clusters.
Figure 2.11 shows an instance of applying such a network to data classification with the correct and incorrect number of clusters.
2.4.3 ART1
This neural classifier, known as "Adaptive Resonance Theory" or ART, deals with digital inputs (T_i ∈ {0, 1}^n). In this network, each "1" in the input vector represents information, while a "0" entry is considered noise or unwanted information. In ART, there is no predefined number of classes before classification starts; instead, the classes are generated during the classification process. Moreover, each class prototype may include the characteristics of more than one training datum. The basic idea of this network relies on a similarity factor for data classification. In summary, every time a datum is to be assigned to a cluster, the class nearest to this datum is found first; then, if the similarity between this datum and the class prototype exceeds a predefined value, known as the vigilance factor, the datum is assigned to this class and the class prototype is modified to be more similar to the new data entry [1,22,23].
The following procedure shows how this algorithm is implemented; the parameters it uses are defined in Step 1:
Step 1: Let β be a small number, n the dimension of the input data, and ρ the vigilance factor (0 ≤ ρ < 1). For a binary vector X, let |X| denote its number of 1s, so that C_i · T_k = |C_i ∩ T_k|.
Step 2: Start with no class prototypes.
Step 3: Select a training datum at random, T_k.
Step 4: Find the nearest unchecked class prototype, C_i, to this datum by maximizing (C_i · T_k)/(β + |C_i|).
Step 5: Test whether C_i is sufficiently close to T_k by verifying that (C_i · T_k)/(β + |C_i|) > |T_k|/(β + n).
Step 6: If it is not similar enough, create a new class prototype from T_k and go to step 3.
Step 7: If it is sufficiently similar, check the vigilance factor: (C_i · T_k)/|T_k| ≥ ρ.
Step 8: If the vigilance test is passed, modify the class prototype by C_i = C_i ∩ T_k and go to step 3.
Step 9: If the vigilance test fails, try to find another unchecked class prototype in step 4.
Step 10: Repeat steps 3 to 9 until none of the training data causes any change in class prototypes.
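Steps 1 to 10 can be sketched as follows for binary NumPy vectors, reading the inner product C_i · T_k as the bit count |C_i ∩ T_k|. The sequential (rather than random) presentation order, the default parameter values, and the function name are choices of this sketch, not the chapter's.

```python
import numpy as np

def art1(data, rho=0.6, beta=1.0, max_passes=20):
    """ART1 sketch for binary vectors; |X| is the number of 1-bits."""
    protos = []                                   # Step 2: no prototypes yet
    for _ in range(max_passes):                   # Step 10: loop until stable
        changed = False
        for t in data:                            # Step 3 (sequential here)
            unchecked = list(range(len(protos)))
            placed = False
            while unchecked and not placed:
                # Step 4: best unchecked prototype by (C.T)/(beta + |C|)
                i = max(unchecked,
                        key=lambda j: (protos[j] & t).sum() / (beta + protos[j].sum()))
                inter = (protos[i] & t).sum()
                if inter / (beta + protos[i].sum()) <= t.sum() / (beta + len(t)):
                    break                         # Step 5 fails -> Step 6
                if inter / t.sum() >= rho:        # Step 7: vigilance test
                    new = protos[i] & t           # Step 8: AND update
                    if not np.array_equal(new, protos[i]):
                        protos[i] = new
                        changed = True
                    placed = True
                else:
                    unchecked.remove(i)           # Step 9: try the next one
            if not placed:
                protos.append(t.copy())           # Step 6: new class prototype
                changed = True
        if not changed:
            break
    return protos
```

On two well-separated groups of binary patterns, this sketch settles on two class prototypes without the number of classes being fixed in advance.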
2.4.4 ART2
This is a variation of ART1 with the following differences:
[Figure 2.13: a feed-forward network with inputs x_1, …, x_n, hidden units z_1, …, z_s, and outputs y_1, …, y_m; each unit sums its weighted inputs (first-layer weights w^1, second-layer weights w^2) and applies an activation function f.]
In this approach, an input is presented to the network and allowed to propagate "forward" through the network, and the output is calculated. The output is then compared with a "desired" output (from the training set) and an error is calculated. This error is propagated "backward" into the network and the different weights are updated accordingly. To simplify the description of this algorithm, consider the network with a single hidden layer (and two layers of weights) given in Figure 2.13.
In relation to the above network, the following definitions apply. Of course, the same definitions can
be easily extended to larger networks.
It is important to note that, in such a network, different combinations of weights might produce the same input/output relationship. However, this is not crucial as long as the network is able to "learn" this association. As a result, the network weights may converge to different sets of values, depending on the order of the training data and the algorithm used for training, although their stability may differ.
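A single forward/backward step for such a two-weight-layer network might look as follows. Sigmoid activations, a squared-error loss, the learning rate, and the variable names are assumptions of this sketch, not the chapter's exact notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, d, W1, W2, eta=0.1):
    """One training step for a single-hidden-layer network (two weight
    layers, as in Figure 2.13): forward pass, error, backward pass."""
    z = sigmoid(W1 @ x)                     # forward: input -> hidden
    y = sigmoid(W2 @ z)                     # forward: hidden -> output
    delta2 = (y - d) * y * (1 - y)          # output-layer error term
    delta1 = (W2.T @ delta2) * z * (1 - z)  # error propagated backward
    W2 -= eta * np.outer(delta2, z)         # update second weight layer
    W1 -= eta * np.outer(delta1, x)         # update first weight layer
    return y
```

Each call performs one forward propagation, one backward propagation of the error, and one in-place weight update; for small enough eta, a step reduces the squared error on the presented example.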