Computational Intelligence-Based Optimization Algorithms: From Theory to Practice
Babak Zolghadr-Asli
Designed cover image: Shutterstock
First edition published 2024
by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431
and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
CRC Press is an imprint of Taylor & Francis Group, LLC
© 2024 Babak Zolghadr-Asli
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-
750-8400. For works that are not available on CCC please contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used
only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Zolghadr-Asli, Babak, author.
Title: Computational intelligence-based optimization algorithms :
from theory to practice / Babak Zolghadr-Asli.
Description: First edition. | Boca Raton, FL : CRC Press, 2024. |
Includes bibliographical references and index.
Identifiers: LCCN 2023019666 (print) | LCCN 2023019667 (ebook) |
ISBN 9781032544168 (hardback) | ISBN 9781032544151 (paperback) |
ISBN 9781003424765 (ebook)
Subjects: LCSH: Computer algorithms. | Computational intelligence.
Classification: LCC QA76.9.A43 Z65 2024 (print) |
LCC QA76.9.A43 (ebook) | DDC 005.13–dc23/eng/20230623
LC record available at https://lccn.loc.gov/2023019666
LC ebook record available at https://lccn.loc.gov/2023019667
ISBN: 978-1-032-54416-8 (hbk)
ISBN: 978-1-032-54415-1 (pbk)
ISBN: 978-1-003-42476-5 (ebk)
DOI: 10.1201/9781003424765
Typeset in Times
by Newgen Publishing UK
Contents
List of Figures xi
Foreword xv
Preface xvii
3 Genetic Algorithm 49
3.1 Introduction 49
3.2 Algorithmic Structure of the Genetic Algorithm 53
3.2.1 Initiation Stage 53
3.2.2 Reproduction Stage 55
3.2.3 Termination Stage 60
3.3 Parameter Selection and Fine-Tuning of the Genetic Algorithm 61
3.4 Python Codes 62
3.5 Concluding Remarks 65
Index 337
Foreword
This is a unique reference book providing in one place: information on the main
meta-heuristic optimization algorithms and an example of their algorithmic
implementation in Python. These algorithms belong to the class of computa-
tional intelligence-based optimization methods that have addressed one of the key
challenges plaguing mathematical optimization for years – that of dealing with dif-
ficult and realistic problems facing any industry with resource restrictions. What do
I mean by difficult and realistic? Instead of simplifying the problem that needs to be
solved due to the limitations of the method, as was the case with many mathemat-
ical optimization algorithms, these meta-heuristics can now tackle large, complex,
and previously often intractable problems.
The book includes 20 meta-heuristic algorithms, from the now-classical gen-
etic algorithm to more “exotic” flower pollination or bat algorithms. Each of the
algorithms is presented as far as possible using the same structure so the reader can
easily see the similarities or differences among them. The Python code provides
an easy-to-access library of these algorithms that can be of use to both novices
and more proficient users and developers interested in implementing and testing
some of the algorithms they may not be fully familiar with. From my own experi-
ence, it is much easier to get into a subject when somebody has already prepared
the grounds. That is the case with this book. If I had it on my desk 30 years ago, I would have been able to try many more different ways of solving problems in engineering. With this book, I may still do it now!
Dragan Savic
Professor of Hydroinformatics
University of Exeter, United Kingdom
and
Distinguished Professor of Hydroinformatics
The National University of Malaysia, Malaysia
Preface
chapter, which we highly encourage you to do, you can go to a given chapter and
learn all there is to understand and implement an algorithm fully. Each chapter also
contains a brief literature review of the algorithm’s background and showcases
where it has been implemented successfully. As stated earlier, there is a Python
code for all algorithms at the end of each chapter. It is important to note that, while
these are not the most efficient ways to code these algorithms, they may very well be the best way for beginner to intermediate programmers to understand them. As
such, if, as a reader, you have a semi-solid understanding of the Python syntax
and its numeric library NumPy, you could easily understand and implement these
methods on your own.
1 An Introduction to Meta-Heuristic
Optimization
Summary
Before we can embark upon this journey of ours to learn about computational intelligence-based optimization methods, we must first establish a common language to see what an optimization problem actually is. In this chapter, we take a deep dive into the world of optimization to understand the fundamental components that make up the structure of a typical optimization problem. We will be introduced to the technical terminology used in this field, and, more importantly, we aim to grasp the basic principles of optimization methods. As a final note, we will learn about the general idea behind meta-heuristic optimization algorithms and what this term essentially means. By the end of this chapter, we will also come to understand why it is essential to have more than one of these optimization algorithms in our repertoire if we intend to use this branch of optimization methods as the primary option to handle complex real-world optimization problems.
1.1 Introduction
What is optimization? That is perhaps the first and arguably the most critical
question we need to get out of the way first. In the context of mathematics, opti-
mization, or what is referred to from time to time as mathematical programming,
is the process of identifying the best option from a set of available alternatives.
The subtle yet crucial fact that should be noted here is that one’s interpretation of
what is “best” may differ from the others (Bozorg-Haddad et al., 2021; Zolghadr-
Asli et al., 2021). That is why explicitly determining an optimization problem’s
objective is essential. So, in a nutshell, in optimization, we are ultimately trying to
search for the optimum solution to find an answer that minimizes or maximizes a
given criterion under specified conditions.
Optimization problems have become an integral part of most, if not all, quantitative disciplines, ranging from engineering to operations research and economics.
In fact, developing novel mathematical programming frameworks has managed to
remain a topical subject in mathematics for centuries. Come to think of it, there is
a valid reason that optimization has incorporated itself into our professional and
personal modern-day life to the extent it has. This is more understandable in the
context of engineering and management problems, where there are often limited
available resources, and the job at hand is to make the best out of what is at our
disposal. Failing to do so would simply mean that in the said procedure, whatever
that may be, there is going to be some waste of resources. This could, in turn,
imply that we are cutting our margin of profits, wasting limited natural resources,
time, or workforce over something that could have been prevented if the process
were optimized. So, in a way, it could be said that optimization is simply just good
common sense.
There are several viable approaches to mathematical programming. The traditional approaches to solving optimization problems, categorized under the umbrella term of analytical approaches, are basically a series of calculus-based optimization methods. Often these frameworks are referred to as derivative-based optimization methods, given that they rely heavily on the ideas of differential algebra and gradient-oriented information to solve the problem. As such, the core idea of these methods is to utilize the information extracted from the gradient of a differentiable function, often from the first- or second-order derivative, as a guide to find and locate the optimal solution. The main issue here is that this is rarely a practical way to approach real-world optimization problems, as these problems are often associated with high dimensionality, multimodality, epistasis, non-differentiability, and discontinuous search spaces imposed by constraints (Yang, 2010; Du & Swamy, 2016; Bozorg-Haddad et al., 2017). As such, these methods are often dismissed when it comes to handling intricate real-world problems.
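To ground the idea of a derivative-based search, here is a minimal sketch of plain gradient descent on a simple differentiable function; the quadratic objective, step size, and stopping rule are illustrative assumptions, not anything prescribed in this chapter:

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Follow the negative gradient until the updates become negligible."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = x - step * grad(x)          # move against the gradient
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Example: minimize f(x) = (x1 - 3)^2 + (x2 + 1)^2, whose gradient is known.
grad_f = lambda x: 2 * (x - np.array([3.0, -1.0]))
print(gradient_descent(grad_f, x0=[0.0, 0.0]))  # approaches [3, -1]
```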
The alternative approach here would be to use a series of methods that are categorized under the umbrella term of sampling-based approaches. These, to some extent, use the simple principle of trial-and-error search to locate what could be the optimum solution. These methods are based either on an unguided, or untargeted, search or on a searching process that is guided or targeted by some criterion.
Some of the most notable subcategories of unguided search optimization
methods are sampling grid, random sampling, and enumeration-based methods.
The sampling grid is the most primitive approach here, where all possible solutions
would be tested and recorded to identify the best solution (Bozorg-Haddad et al.,
2017). In computer science, such methods are said to be based on brute force com-
putation, given that to find the solution, basically, any possible solution is being
tested here. As you can imagine, this could be quite computationally taxing. While this seems manageable when the number of potential solutions is finite, in most, if not all, practical cases, it is borderline impossible to implement such an approach to find the optimum solution. If, for instance, the search space
consists of continuous variables, the only way to implement this method is to
deconstruct the space into a discrete decision space. This procedure, known as
discretization, transforms a continuous space into a discrete one by transposing an
arbitrarily defined mesh grid network over the said space. Obviously, the finer this
grid system, the better the chance of getting closer to the actual optimum solution.
Not only does it become more computationally taxing to carry out this task, but from a theoretical point of view, it is also considered impossible to locate the exact optimal solution for a continuous space with such an approach. However, it is possible to get a close approximation of the said value through this method.
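As a rough illustration of the sampling-grid idea, the following sketch brute-forces a discretized two-dimensional space; the objective function, bounds, and grid resolution are all hypothetical choices:

```python
import numpy as np

def grid_search(f, bounds, points_per_axis=101):
    """Exhaustively evaluate f on a mesh grid and return the best point found."""
    axes = [np.linspace(lo, hi, points_per_axis) for lo, hi in bounds]
    grid = np.meshgrid(*axes)
    candidates = np.stack([g.ravel() for g in grid], axis=1)  # one row per node
    values = np.apply_along_axis(f, 1, candidates)
    best = np.argmin(values)                                  # minimization
    return candidates[best], values[best]

# Illustrative objective with its minimum at (3, -1).
f = lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2
x_best, f_best = grid_search(f, bounds=[(-5, 5), (-5, 5)])
print(x_best, f_best)
```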
Another unguided approach is random sampling. The idea here is to simply
take a series of random samples from the search space and evaluate their perform-
ance against the optimization criterion (Bozorg-Haddad et al., 2017). The most
suitable solution found in this process would then be returned as the optimal solu-
tion. Though this process is, for the most part, easy to execute, and the amount of computational power needed to carry out this task can be managed by limiting the number of samples taken from the search space, as one can imagine, the odds of locating the actual optimum solution are exceptionally slim. This is, of course, more pronounced in complex real-world problems where there are often numerous continuous variables.
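A minimal sketch of random sampling might look as follows; the sample size, bounds, and test function are again illustrative assumptions:

```python
import numpy as np

def random_sampling(f, bounds, n_samples=10_000, seed=42):
    """Draw uniform random samples and keep the best one (minimization)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    samples = rng.uniform(lo, hi, size=(n_samples, len(bounds)))
    values = np.apply_along_axis(f, 1, samples)
    best = np.argmin(values)
    return samples[best], values[best]

f = lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2
print(random_sampling(f, bounds=[(-5, 5), (-5, 5)]))
```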
The other notable approach in the unguided search category is enumeration-
based methods (Du & Swamy, 2016). These methods are basically a bundle of
computation tasks that would be executed iteratively until a specific termination
criterion is met, at which point the final results would be returned by the method as
the solution to the optimization problem at hand. Like any other unguided method,
here, there is no perception of the search space and the optimization function itself.
As such, the enumeration through the search space would be solely guided by the
sequence of computational tasks embedded within the method. In other words,
such methods cannot learn from their encounters with the search space to alter their searching strategies, which is in and of itself the most notable drawback of all the unguided searching methods.
Alternatively, there are also targeted searching methods. One of the most notable
features of this branch of optimization is that they can, in a sense, implement what
they have learned about the search space as a guiding mechanism to help navigate
their searching process. As such, they attempt to draw each sample batch based on what they learned in their last attempt. As a result, step by step, they improve the odds that the next set of samples will be better than the last until, eventually, they gradually move toward what could be the optimum solution. It is important to note that one of the distinctive features of this approach, like any other sampling method, is that it aims to settle for a close-enough approximation of the global optima, better known as near-optimal solutions. The idea here is to sacrifice the accuracy of the emerging solution to an acceptable degree to find a close-enough solution with considerably less computational effort. One of the most well-known sub-classes of the guided sampling methods is meta-heuristic optimization algorithms.
However, before diving into what these methods actually are and what they are cap-
able of doing, it is crucial that we improve our understanding of the structure of an
optimization problem and its components from a mathematical point of view.
Optimize f(X), X ∈ R^N (1.1)
Subject to:
gk(X) ≤ bk ∀k (1.2)
Lj ≤ xj ≤ Uj ∀j (1.3)
in which f() represents the objective function, X is a point in the search space of an
optimization problem with N decision variables, N denotes the number of decision
variables, gk() is the kth constraint of the optimization problem, bk denotes the con-
stant value of the kth constraint, xj represents the value associated to the jth deci-
sion variable, and Uj and Lj represent the upper and lower feasible boundaries of
the jth decision variable, respectively. Note that in an optimization problem with N decision variables, an N-dimensional coordinate system can be used to represent the search space. In this case, any point within the search space, say X, can be represented mathematically as a 1×N array as follows:

X = (x1, x2, x3, …, xj, …, xN) (1.4)
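To make this formulation concrete, here is a minimal sketch of how a hypothetical two-variable problem in the form of Equations (1.1)–(1.3), with a candidate solution shaped as in Equation (1.4), could be written down with NumPy; the particular objective, constraint, and bounds are invented for illustration:

```python
import numpy as np

def f(X):                      # objective function f(X), Equation (1.1)
    return X[0] ** 2 + X[1] ** 2

def g1(X):                     # left-hand side of the first constraint, Equation (1.2)
    return X[0] + X[1]

b1 = 1.0                       # constant value b1 of the first constraint
L = np.array([-5.0, -5.0])     # lower boundaries Lj, Equation (1.3)
U = np.array([5.0, 5.0])       # upper boundaries Uj

X = np.array([0.4, 0.2])       # a candidate solution as a 1xN array, Equation (1.4)
feasible = (g1(X) <= b1) and np.all(L <= X) and np.all(X <= U)
print(f(X), feasible)          # 0.2 True
```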
variable; that is to say, we want to figure out which type of filter should be installed
to get the best result. The variable may also be binary in nature. This means that
only two possible values can be passed for that variable. For instance, suppose we want to figure out whether an industrial site should be constructed when we aim to maximize the margin of profit. Here, the variable could be either going ahead with the project or shutting it down. Mathematical programming
terminology refers to all three cases as discrete variables. Alternatively, a deci-
sion variable may also be a float number, which is a number drawn from the real
number set. An example of this would be when you want to determine the max-
imum amount of partially refined industrial site wastewater that can be released
back into the stream without violating the environmental regulatory thresholds set
to protect the natural ecosystem. In mathematical programming terminology, such
a case is an example of a continuous variable. Of course, in real-world optimiza-
tion problems, we may have a combination of discrete and continuous variables.
These are said to be mixed-type optimization problems.
1.2.4 Constraints
Usually, optimization problems are set up in a way that decision variables cannot
assume any given value. In other words, an optimization problem can be limited by a set of restrictions or constraints that often bound the variables between two acceptable thresholds. Often, this is because resources are limited, and as such, it is impos-
sible to pour unlimited supplies into a process or an operation. For instance, if you
intend to optimize the company’s workflow, there are budget and human resources
limitations that need to be accounted for. In addition to this, there are some legal
or physical restrictions that pose some limitations to the problem. For instance, in
optimizing the schedule of a power plant, some safety measures restrict how you
can operate the plant, even if there are resources available to pour into the system.
Another example of this would be optimizing the design of an infrastructure, where
tons of regulatory and physical restrictions must be considered in the optimization
process.
From a mathematical standpoint, two types of constraints can be found in an
optimization problem. The first type of restriction is what in mathematical pro-
gramming dialect is referred to as boundary conditions. Here the restrictions are
directly imposed on the decision or state variables themselves. As such, a decision
or state variable should always be within a specified boundary. This could either
mean that the said variable should always assume a value between a lower and
upper boundary or that the boundary is only applied in one direction. A general
representation of such boundary conditions can be seen in Equation (1.3).
The other form of restriction is non-boundary conditions, which pretty much
sums up any other form of restriction that can be imposed on an optimization
problem. From a mathematical standpoint, this means that a function of two or
more decision or state variables is bound to hold a specific condition. This could
mean that the said function should be equal to, greater, or less than a specified
constant. A general representation of such boundary conditions can be seen in
Equation (1.2). Note that while an optimization problem may have no condition, it
may also be restricted by multiple conditions of different sorts.
In mathematical programming lingo, a solution that can hold all the possible
constraints of the optimization problem is referred to as a feasible solution. On
the other hand, if a solution violates even one condition, it is labeled an infeasible
solution. It is important to note that the ultimate goal of optimization is to find a
feasible solution that yields the best objective function value. This simply means
that an infeasible solution can never be the answer to an optimization problem,
even if it yields the best result.
1.2.5 Search Space
The idea behind optimization is to identify the so-called optimum solution out
of all available solutions. As we have seen earlier, in an optimization problem
with N decision variables, any solution can be mathematically expressed as an N-
dimensional array. By the same token, the combination of all the possible solutions
to an optimization problem can be interpreted mathematically as an N-dimensional
Cartesian coordinate system, where each decision variable denotes an axis to the
said system. This would create a hypothetical N-dimensional space, which in math-
ematical programming dialect is referred to as the search space. The idea is that
any solution is a point within this coordinate system, and the main point of optimization is to search through this space to locate the optimum solution.
As the previous section shows, not all solutions can be valid answers to an
optimization problem. In fact, any answer that violates at least one constraint is
considered an infeasible solution and, as such, cannot be passed as a viable result to
the optimization process. With that in mind, the search space could be divided into
two mutually exclusive sections: the feasible space and the infeasible space. The former constitutes the portion of the search space where all constraints hold, while in the latter, one or more constraints are violated. Naturally, the solution
to the optimization problem must be selected from the former space. Note that the
constraints can be assumed as a hyperplane that divides the search space into two
mutually exclusive sections. Based on the condition of the said constraint, either
one of these parts constitutes the feasible space, or the hyperplane itself denotes
this space. If more than one condition is involved, the intersection of all the feasible
portions will denote the feasible search space. In other words, the feasible space
should be able to satisfy all the conditions of the optimization problem at hand.
Note that, in mathematical programming lingo, the portion of the search space in
which all the boundary conditions are met is called the decision space. Figure 1.1
illustrates the search space of a hypothetical constrained two-dimensional optimiza-
tion problem.
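A minimal sketch of how one might classify a candidate point as feasible or infeasible, assuming constraints of the form gk(X) ≤ bk plus boundary conditions; all the values below are hypothetical:

```python
import numpy as np

def is_feasible(X, constraints, lower, upper):
    """Label a point feasible only if it satisfies every constraint and boundary."""
    within_bounds = np.all(lower <= X) and np.all(X <= upper)
    holds_all = all(g(X) <= b for g, b in constraints)  # intersection of conditions
    return within_bounds and holds_all

constraints = [(lambda X: X[0] + X[1], 1.0),
               (lambda X: X[0] - X[1], 2.0)]
lower, upper = np.array([-5.0, -5.0]), np.array([5.0, 5.0])
print(is_feasible(np.array([0.3, 0.3]), constraints, lower, upper))  # True
print(is_feasible(np.array([4.0, 4.0]), constraints, lower, upper))  # False
```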
1.2.6 Simulator
As we have seen earlier, state variables play a crucial role in the optimization
process. Their role is often more pronounced in intricate real-world optimization
f(X*) ≤ f(X) ∀X (1.5)
f(X*) ≥ f(X) ∀X (1.6)
f(X′) ≤ f(X) ∀X, X′ − ε ≤ X ≤ X′ + ε (1.7)
f(X′) ≥ f(X) ∀X, X′ − ε ≤ X ≤ X′ + ε (1.8)
Figure 1.2 depicts the relative and absolute extrema of an optimization problem
with a single decision variable.
All the local and global maxima and minima of an optimization problem are
collectively known as the extrema of the said problem. As we have seen, it is possible for an optimization problem to have more than one extremum. In mathematical programming lingo, an optimization problem with only a single extremum is referred to as a single-modal or unimodal optimization problem. On the other hand,
if a problem has more than one extremum, it is called a multimodal optimization
problem. Figure 1.3 depicts the schematic form of the single-modal and multi-
modal maximization problem with one decision variable.
While in some practical problems you may also need to identify all the extrema of the multimodal optimization problem at hand, often, as one can imagine, the main objective of the optimization process is to identify the global or absolute optima of a given problem. This, however, as we will explore throughout this book, is a bit challenging for sampling-based optimization methods. In this branch of optimization, it is often hard to distinguish local and global optima from one another. More importantly, there is no guarantee that you have encountered them all through the searching process, or whether a point is, in fact, an extremum or we just did not happen to run into a point with better properties in our stochastic sampling process. So in optimization methods that are established on the idea of sampling, it is crucial to be aware of these so-called pitfalls as we strive to locate the absolute optimum solutions.
Figure 1.3 The generic scheme of (a) single-modal and (b) multimodal one-dimensional
maximization problem.
f(P) ≤ f(X*) + ε (1.9)
f(P) ≥ f(X*) − ε (1.10)
phase. In the exploration phase, also known as the diversification phase, the algo-
rithm usually tends to explore the search space with more random-based moves that
are more lengthy and sporadic by nature. After this phase, the algorithm would tend
to focus more on specific regions of the search space where it deemed them more
likely to house the optimum solution. This is the task of the exploitation phase,
which is also referred to as the intensification phase. Having a search strategy that
emphasizes adequacy on both these pillars is crucial for an efficient and thorough
search. Often, the algorithms would tend to transition smoothly from the explor-
ation phase to the exploitation. However, as we would see later in this book, there
are those algorithms that are designed in a way that they can execute both phases
simultaneously.
Today, meta-heuristic algorithms have established themselves as one of the
most fundamental and pragmatic ways to tackle complex real-world optimiza-
tion problems. These methods could theoretically handle high-dimensional, dis-
continuous, constrained, or non-differentiable objective functions. This does
not mean that these methods are without their drawbacks. Firstly, the algorithm pursues near-optimal solutions rather than the exact global optimum. This, of course, does not pose a serious challenge in most practical cases. Secondly,
given the stochastic nature of this process, and the trial-and-error component
that is the inseparable part of meta-heuristic optimization methods, there are
no guarantees that these algorithms could converge to the optimum solution.
Also, there is always the possibility that the algorithm could be trapped in local
optima, as there is no sure way here to distinguish the relative and absolute
extrema. Of course, one can always try to play with the algorithm’s param-
eter and, through a series of trial-and-error experiences, find a more suitable
setting for the problem. The bottom line, however, is that while fine-tuning the
parameters of a meta-heuristic optimization algorithm can undoubtedly have an
impact on the final solution, and there are certain measures to see if an algo-
rithm is performing the way it should, there can never be any guarantee that the
emerged solution is indeed the global optimum. All in all, these shortcomings
should not undermine the importance of these methods and their capacity to
handle complex real-world problems.
position of the search agents. The idea here is that the algorithm’s current stage
only depends on its previous stage. As such, to reposition an agent, the algorithm
would only need to know where a given agent is at the moment. Again, given the
many advantages of having a properly designed memory-like feature in the struc-
ture of an algorithm, most modern-day meta-heuristic optimization algorithms are
designed with this feature embedded in their algorithmic architecture (Bozorg-
Haddad et al., 2017).
As we have seen in the previous sections, there are two general types of deci-
sion variables; those that are of a continuous type and, of course, discrete decision
variables. By the same token, there are two general types of search spaces, namely,
continuous and discrete search spaces. What is important to note here is that, from
a computational standpoint, searching through these search spaces would require
different strategies. While some algorithms are innately designed to handle discrete
search spaces, others can only tackle problems with continuous search spaces. That
said, there are certain tricks to switch between these two so that an algorithm would be
compatible with both search spaces. The most notable idea here is discretization,
where a continuous space would be transformed into a discrete one by transposing
an arbitrarily defined mesh grid system over the said space. By doing so, you can
use a discrete meta-heuristic algorithm to enumerate through a continuous search
space. That, as you can imagine, is not without its drawbacks. The main problem
is that you essentially lose some information in these transitions. As such, you
cannot be sure if the optimum solution is, in fact, in the newly formed grid system,
no matter how fine of a mesh grid you use for this transition. Often, however, the
researchers would tend to use the governing computational principles of an algo-
rithm to rebuild a revised or modified version of the algorithm that is compatible
with the other type of search space. Note that researchers may also use this strategy
to add new features or tweak the structure of an algorithm to create a more effi-
cient modified algorithm (Yaghoubzadeh-Bavandpour et al., 2022). While the new
algorithm is still built on the same principles, it can usually outperform its original
standard version in one way or the other.
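As a small sketch of the discretization trick described above, the following helper snaps a continuous point onto the nearest node of an arbitrarily defined mesh grid, which is the discrete stand-in a grid-based algorithm would actually visit; the bounds and resolution are illustrative assumptions:

```python
import numpy as np

def snap_to_grid(X, lower, upper, points_per_axis):
    """Map a continuous point onto the nearest node of a uniform mesh grid."""
    step = (upper - lower) / (points_per_axis - 1)
    idx = np.rint((X - lower) / step)       # nearest grid index per axis
    return lower + idx * step

lower, upper = np.array([0.0, -2.0]), np.array([1.0, 2.0])
print(snap_to_grid(np.array([0.33, 0.70]), lower, upper, points_per_axis=11))
# -> [0.3 0.8]
```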
As we have seen thus far, the structure of the meta-heuristic algorithms is built on
the idea of guided random sampling. As the random sampling goes, we have three
main ideas to draw a sample from a set; one being to select the samples based on
pure randomness; we can select the samples using a systematic deterministic struc-
ture; and finally, there is the stochastic selection, which is something in between
the previous approaches, meaning that while there is a random component to the
selection procedure, it also follows some deterministic instruction as well. By the
same token, meta-heuristic optimization algorithms can be categorized into two
classes that are stochastic and deterministic algorithms.
of view, there are two known ways to create such an effect. One is that the algorithm always preserves the best or even some of the best solutions as it iterates to locate the optimum solution. This idea is, in computer science lingo, called elitism, and an algorithm based on this idea is called an elitist algorithm. The other known approach is to permit only the improving moves to occur and reject the non-improving ones. This idea, in computer science lingo, is known as the greedy
strategy. Both these ideas can preserve the best solutions encountered through the
search. But this also means that they would lose some of their exploration capaci-
ties. More importantly, the algorithm is more likely to be trapped in local optima, as
they tend to only move toward better positions in the search space. These problems
are more pronounced for single-solution-based algorithms and greedy strategy,
given that in the first case, the search only relies on a single search agent, and in the
second case, non-improving moves are not permitted through the search, meaning that there is no escaping the local optima if search agents assume these values.
Notice that elitism and greedy strategy for a single-solution-based algorithm are,
in effect, telling the same story, as in the end, they would have the same overall
effect. Also, it is essential to bear in mind that either strategy can be later added to
an algorithm to, perhaps, enhance its performance.
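The two preservation strategies can be sketched in a few lines, assuming a minimization problem; the function names and signatures here are illustrative, not a fixed recipe:

```python
import numpy as np

def greedy_step(x_current, f_current, x_candidate, f):
    """Greedy strategy: accept a move only if it improves the objective."""
    f_candidate = f(x_candidate)
    if f_candidate < f_current:        # minimization: only improving moves pass
        return x_candidate, f_candidate
    return x_current, f_current        # reject the non-improving move

def update_elite(elite_x, elite_f, population, f_values):
    """Elitism: let the population move freely, but keep a copy of the
    best-so-far solution on the side (population is assumed to be an array)."""
    best = int(np.argmin(f_values))
    if f_values[best] < elite_f:
        return population[best].copy(), f_values[best]
    return elite_x, elite_f
```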
about the search space’s landscape. This is mainly due to the fact that we are not
making any distinction between a solution that slightly violates the constraints and a case where there are severe violations of the set restrictions. In the former case, the
infeasible solution could have been used as a guide to help the search agent locate
the optimum solution. All in all, while this is undoubtedly an easy way to handle
constraints, it certainly is not the most efficient way to do so.
Alternatively, based on the refinement strategy, you can salvage the infeasible solutions produced by the process by adjusting their variables until they are deemed feasible again. As such, unlike the removal strategy, here, you are not directly eradicating any infeasible solution from the search process. While this is a manageable strategy when the violations are restricted to the boundary conditions, any other type of violation is rather hard to refine with this method. In the former case, you can simply
replace the value of the violating variable with one of the threshold values, which
is a reasonably simple task. However, if the constraints involve multiple decision
variables, there is no universal or straightforward approach to alter the solution in
a way that transforms it into a feasible solution. And more importantly, even if you
can make this transition, there is no way to tell if this is the best way to change an
infeasible solution into a feasible one when it comes to preserving additional infor-
mation about the search space, which is the whole point behind these strategies. As
such, while this is an acceptable remedy to keep the solutions within the decision
space, it is not necessarily the best option for constraints that involve multiple deci-
sion or state variables.
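For boundary violations specifically, the refinement described above amounts to clipping each violating variable to the threshold it crossed; a minimal sketch, with hypothetical bounds:

```python
import numpy as np

def refine_boundaries(X, lower, upper):
    """Replace any out-of-range variable with the threshold value it crossed."""
    return np.clip(X, lower, upper)

lower, upper = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print(refine_boundaries(np.array([1.7, -0.2]), lower, upper))  # -> [1. 0.]
```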
The last option to address constraints is implementing the idea of penalty
functions. The idea here is to penalize the objective function value proportional
to how much the said solution has violated the constraints of the problem at hand
through a mechanism called the penalty function. In effect, a penalty function
determines the severity of these penalties based on how much a solution has
violated the said constraint. Naturally, the more violations a solution accumulates, the more severe these penalties get. So in a maximization problem, as a penalty, a
positive value would be subtracted from the objective function value rendering
it less suitable. In contrast, a positive value would be added to the objective
function value in a minimization problem. It is important to note that the amount
of this penalty function must be directly proportional to how much a solution
violates an optimization problem’s constraints. The other notable thing is that
each constraint could have its own penalty function mechanism. The overall pen-
alty imposed on a solution is the accumulated values of all the said penalties. In
mathematical programming lingo, a penalized objective function is referred to
as the cost function or fitness function. This can be mathematically expressed as
follows:
F(X) = f(X) ± p(X) (1.11)
where F() represents the fitness function, and p() denotes the penalty function.
Note that we would subtract the penalty values from the objective function in maximization problems and add them to it in minimization problems.
pk(X) = ϑk if gk(X) > bk, and pk(X) = 0 if gk(X) ≤ bk ∀k (1.12)

ϑk = αk × |gk(X) − bk|^βk + γk ∀k (1.13)
in which pk() denotes the penalty function for the kth constraint; ϑk represents the
amount of penalty that is associated with the kth constraint; and αk, βk, and γk are
all constant values for adjusting the magnitude of the penalty function for the kth
constraint. Using these constants, you can control the amount of penalty that would
be applied to the objective function. In other words, increasing these values would
apply a more severe penalty to the objective function. However, using the proper
penalizing mechanism is a critical subject if you want to get good results out of these algorithms. The point is that applying acute penalties would be, in effect, equivalent to omitting the infeasible solutions, while applying mild penalties could have no visible effect on the search algorithm's emerging solution. So it is crucial to keep a balance between these two extreme situations. Finding the proper values for these parameters, however, requires some prior experience and perhaps a sensitivity analysis.
By the same token, you can create a penalty function for any other constraint
type. Of course, the general idea is to apply the penalty whenever the said solu-
tion violates the said constraint. All these penalties would be collectively applied
to the objective function to get the fitness function value of the solution. Through
this mechanism, in effect, you would map the objective function to the fitness
function values, which would be used from that point onward as the evaluation
mechanism. As such, the algorithm would use these values to determine how good
a solution is and, in turn, select the optimum solution based on these values. Again,
it is essential to note that, like meta-heuristic algorithms, this is also, by nature, a
trial-and-error procedure, meaning that there can never be an absolute guarantee that the final solution is not infeasible or that there is no better feasible solution in the search space.
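A sketch of such a penalty mechanism, loosely following Equations (1.12) and (1.13) for a minimization problem; the constants αk, βk, and γk, and the test problem, are illustrative choices that would normally need tuning:

```python
import numpy as np

def penalty(X, constraints, alpha=1e3, beta=2.0, gamma=0.0):
    """Accumulated penalty: each violated constraint g_k(X) <= b_k
    contributes alpha * |g_k(X) - b_k|**beta + gamma."""
    total = 0.0
    for g, b in constraints:
        violation = g(X) - b
        if violation > 0:                  # only penalize actual violations
            total += alpha * abs(violation) ** beta + gamma
    return total

def fitness(X, f, constraints):
    """Penalized objective for minimization: penalties are added."""
    return f(X) + penalty(X, constraints)

f = lambda X: X[0] ** 2 + X[1] ** 2
constraints = [(lambda X: X[0] + X[1], 1.0)]
print(fitness(np.array([2.0, 2.0]), f, constraints))  # heavily penalized point
```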
through the search space by tracing the movement of the search agents. And if
the algorithm is set up correctly, little by little, it gets closer and closer to a spe-
cific position in the search space until there is no room for any improvement. At
this point, it is said that the algorithm has converged to the optimum solution. If,
however, the algorithm’s searching process terminates before this point, it is said
that there is premature convergence; that is to say, the solution that emerged from this search is not actually the optimum solution, and the search should have continued
for a bit longer. So, as can be seen here, the final solution of a meta-heuristic algo-
rithm, in and of itself, cannot be sufficient to interpret the quality of the result.
Instead, you need to trace the progression of the search process to see how reliable
the emerging solution actually is. Analyzing the convergence rate of an algorithm
is one of the primary tools for understanding the performance of an algorithm and
the reliability of the outcome solutions.
One of the simplest ways to tackle this matter is to plot the algorithm’s conver-
gence rate throughout the search. Here, like any data-driven science, visualization
could offer reliable guidance to unravel the intricacies of a problem at hand. In
order to do that, either the objective or the fitness function values would be plotted
against either the run time, number of iterations, or, more commonly, a technical
measure called the number of function evaluations, or NFE for short. NFE is a tech-
nical term that refers to the number of times the objective function has been called
during the search. To understand this concept, we must first understand how the
algorithm actually converges throughout the search.
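As a sketch of such a convergence plot, assuming matplotlib is available and using a synthetic best-so-far log rather than the output of a real run:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical log of the best fitness recorded at each objective-function call.
nfe = np.arange(1, 501)
trace = 50 * np.exp(-nfe / 60) + 1 + 0.05 * np.random.default_rng(0).random(500)
best_so_far = np.minimum.accumulate(trace)   # monotone record, as an elitist keeps

plt.plot(nfe, best_so_far)
plt.xlabel("Number of function evaluations (NFE)")
plt.ylabel("Best objective value found")
plt.title("Convergence of a hypothetical minimization run")
plt.show()
```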
Figure 1.5 illustrates how the algorithm’s solutions converge as the search
progresses for a minimization problem. For simplicity, let us assume that we are
tracing the progression of the search against the iteration or the run time metrics.
This should be reasonably straightforward for the single-solution-based algorithms,
as we deal with a single search agent in this type of algorithmic structure. However,
in population-based algorithms, we could select the best solution in the population
set as the sole representative of the said set. As such, the said values would be
plotted against time or the iteration count to trace the progression of the search pro-
cess. As can be seen here, often, the algorithm experiences a sudden and massive
drop within the first few iterations. But as the algorithm progresses, these changes
become more subtle and less pronounced until, eventually, there are no visible
changes in the values of the objective function. Again, it is essential to note that
if there are notable changes in the tail of the convergence graph, this means that there is premature convergence, and you need to recalibrate the algorithm to get the optimum result. This can often be addressed by extending the termination criterion so the algorithm can do more iterations. It is also essential to note that there are two
generic types of convergence graphs. In meta-heuristic algorithms that are based
on the idea of preservation strategies, say elitism or greedy strategy, the graphs are
monotonic, meaning that if the algorithm’s returned values are decreasing, there
are no sudden upticks in the graph, and if the values are increasing, there cannot be
a decreasing section in the plot. This is mainly because these sorts of algorithms
always preserve the best solutions in each iteration, so the algorithm is either con-
tinuously improving or at least staying the same. The graph of algorithms that do not rely on such preservation strategies, by contrast, may show occasional non-improving fluctuations.
unit (CPU), or graphics processing unit (GPU), depending on which processor was
used to do the computation, and even the operating system (OS) can play a role
here. It is important to note that comparing the convergence speed of two given
meta-heuristic optimization algorithms or different parameter settings for a given
algorithm would only make sense if the systems that executed them were identical.
Lastly, there is the quality of the code in which the algorithm was implemented on the given system. Naturally, while this is indeed a crucial factor in how fast or smoothly an algorithm handles a given optimization problem, it is quite challenging to actually quantify this factor in an objective way. The programmers' skills
and knowledge of a programming language could help them find a more efficient
way to implement an algorithm. For instance, it is entirely plausible for a more
experienced programmer to find an elegant way to improve the performance of a
coded algorithm by a beginner programmer in terms of the speed and efficiency of
the program. And while this is indeed an essential factor in the convergence speed,
it is impossible to report any objective information on this matter. All in all, while
it is not a bad idea to have some idea about the run time of an algorithm, as we have
seen here, it is not necessarily the best metric to use as an analytical tool to study
the convergence of an algorithm.
The other option here is the number of iterations it took the algorithm to reach a given objective function value. Here, we could plot the objective function values
against the iteration numbers. While, in this approach, we are certainly excluding
the innate problems that we have named earlier, which are the system’s compu-
tational power, the technical properties of the programming language, and, more
importantly, the programmers’ skill set from our analysis, there is still one major
subtle problem with the way that we are breaking down the performance of the
algorithms.
As we have seen in the previous section, some algorithms use one agent to
enumerate through the search space, while others may employ multiple agents to
carry out this task. So in a single iteration, the former group is arguably doing less
computation than the latter group. This is important to note, given that in most
real-world optimization problems, the most computationally taxing part of the pro-
cess is calling the simulator that gets activated as you call the objective function
in your algorithm. Some algorithms may even call this function multiple times
within the same iteration for each search agent, which requires additional compu-
tational power. Of course, this would also mean that such algorithms may have a
better chance of converging to near-optimal solutions. So, the metric that should
be monitored here is not the number of iterations per se but rather the number of
times the objective function is called during the search. This is, in fact, the defin-
ition of the NFE. By using this metric, you are factoring in the number of search
agents and, more importantly, the innate efficiency of the algorithm itself and not
the programmer’s skill set. For instance, an algorithm that often calls the objective
function would consequently have a greater NFE than a more efficiently designed
algorithm that rarely calls this function during the search. As such, plotting the
objective function against the NFE values is a more reasonable way to analyze or
even compare the convergence of meta-heuristic algorithms.
An Introduction to Meta-Heuristic Optimization 27
Aside from the visualization of the convergence rate, we need other quantifiable
metrics to analyze the performance of an algorithm. These would not only be used
in the fine-tuning procedure to get the best result out of an algorithm for a specific
optimization problem, but they are also helpful when comparing the performance
of two or even more algorithms. Again, to apply these measures, we must first run
the algorithm with the same setting a specified number of times. This is mainly due to the fact that these algorithms are often based on stochastic sampling, which means that the algorithm may return a slightly different solution on each run. So the idea is that after fully running the algorithms numerous times, you would use these metrics to evaluate their performance. As such, in addition to reporting the solutions obtained in each run, you can report these metrics to give a more wholesome idea about the performance of the algorithms. There are four general areas in which to evaluate an algorithm's performance: efficacy, efficiency, reliability, and, finally, robustness.
In this context, efficacy-based metrics measure how an algorithm is performing
regardless of its convergence speed. Three known metrics here are mean best
fitness (MBF), the best solution in all runs (BSAR), and the worst solution in all
runs (WSAR) (Du & Swamy, 2016; Bozorg-Haddad et al., 2017). MBF denotes the
average of the best objective or fitness function value of all runs. BSAR denotes
the best objective or fitness function value that was found among all the runs. By
the same token, WSAR denotes the worst objective or fitness function value that
was found among all the runs.
The idea behind reliability-based metrics is to measure the ability of an algo-
rithm to provide acceptable results. The most notable measure here is the success
rate (SR) (Du & Swamy, 2016). To work on this measure, you must first identify the
best solution among all the runs. Any time the solution comes within an acceptable
range of this value, meaning that you also need to set a predefined threshold here as
well, the said runs are deemed successful. SR is the ratio of successfully executed
runs over the total number of runs. As seen here, the SR has a probabilistic nature;
in a way, it can be interpreted as the probability of executing a successful run.
Efficiency-based measures tend to quantify the speed of an algorithm in identi-
fying the optimum solution. The most notable measure here for this is the average
number of evaluations of a solution (AES) (Du & Swamy, 2016). As the name
suggests, this metric measures the average number of evaluations it takes for
the algorithm to be deemed successful. If an algorithm did not have a so-called
successful run, it is said that AES is undefined.
Robustness-based measures determine how persistent an algorithm is in
different runs. A so-called robust algorithm would always return reasonably similar
solutions, while a non-robust algorithm’s solutions vary significantly in each run.
Figure 1.6 depicts the difference between robust and non-robust algorithms. As
can be seen here, the non-robust algorithm's final solution relies heavily on the initiation stage, which is naturally not a good feature to have in a meta-heuristic algorithm. The most notable measures for this metric are the standard deviation and the coefficient of variation of solutions in different runs (Bozorg-Haddad et al., 2017).
Figure 1.6 Convergence rate of an algorithm that is considered (a) robust and (b) non-robust.
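A sketch of how these run-level metrics might be computed from repeated minimization runs; the success threshold, the per-run NFE records, and the sample data are illustrative assumptions:

```python
import numpy as np

def performance_report(best_per_run, nfe_to_success=None, tol=1e-2):
    """Summarize repeated runs: efficacy (MBF, BSAR, WSAR),
    reliability (SR), efficiency (AES), and robustness (STD, CV)."""
    best_per_run = np.asarray(best_per_run, dtype=float)
    bsar = best_per_run.min()                 # best solution in all runs
    success = best_per_run <= bsar + tol      # runs close enough to the best
    report = {
        "MBF": best_per_run.mean(),
        "BSAR": bsar,
        "WSAR": best_per_run.max(),
        "SR": success.mean(),                 # share of successful runs
        "STD": best_per_run.std(),
        "CV": best_per_run.std() / abs(best_per_run.mean()),
    }
    if nfe_to_success is not None:            # NFE at which each run succeeded
        evals = np.asarray(nfe_to_success, dtype=float)[success]
        report["AES"] = evals.mean() if evals.size else float("nan")
    return report

print(performance_report([1.02, 1.00, 1.45, 1.01],
                         nfe_to_success=[900, 700, 2000, 850]))
```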
look for the best parameter setting for the said algorithm for that specific problem
at hand. This means that a universally best parameter setting for an algorithm could
never exist, and as such, fine-tuning an algorithm is an inseparable part of meta-
heuristic optimization algorithms. Of course, it is always possible to use our intu-
ition, experience, and default values suggested for an algorithm’s parameters as
a good starting point. That said, one should bear in mind that fine-tuning these
parameters is, more than anything, a trial-and-error process. Thus, while it is
possible to get a good enough result by having an educated guess for setting the
parameters of these algorithms, to get the best possible performance, it is necessary
to go through this fine-tuning process.
the context of meta-heuristic optimization algorithms are that, if there are no prior assumptions about the problem at hand, there can never be a universally superior algorithm, nor can there be a universally ideal parameter setting for a given algorithm.
This means we have to have multiple meta-heuristic optimization algorithms in
our repertoire and do parameter fine-tuning to get the best performance out of the
algorithm. That said, we can finally start learning about different algorithms in
this field.
References
Bozorg-Haddad, O., Solgi, M., & Loáiciga, H.A. (2017). Meta-heuristic and evolutionary
algorithms for engineering optimization. John Wiley & Sons. ISBN: 9781119386995
Bozorg-Haddad, O., Zolghadr-Asli, B., & Loaiciga, H.A. (2021). A handbook on multi-
attribute decision-making methods. John Wiley & Sons. ISBN: 9781119563495
Capo, L. & Blandino, M. (2021). Minimizing yield losses and sanitary risks through an
appropriate combination of fungicide seed and foliar treatments on wheat in different pro-
duction situations. Agronomy, 11(4), 725.
Culberson, J.C. (1998). On the futility of blind search: An algorithmic view of “no free
lunch”. Evolutionary Computation, 6(2), 109–127.
Du, K.L. & Swamy, M.N.S. (2016). Search and optimization by metaheuristics: Techniques
and algorithms inspired by nature. Springer International Publishing Switzerland.
ISBN: 9783319411910
George, M., Kumar, V., & Grewal, D. (2013). Maximizing profits for a multi-category
catalog retailer. Journal of Retailing, 89(4), 374–396.
Glover, F. (1986). Future paths for integer programming and links to artificial intelligence.
Computers & Operations Research, 13(5), 533–549.
Ho, Y.C. & Pepyne, D.L. (2002). Simple explanation of the no-free-lunch theorem and its
implications. Journal of Optimization Theory and Applications, 115(3), 549–570.
Hooke, R. & Jeeves, T.A. (1961). “Direct Search” solution of numerical and statistical
problems. Journal of the ACM, 8(2), 212–229.
Husted, B.W. & de Jesus Salazar, J. (2006). Taking Friedman seriously: Maximizing profits
and social performance. Journal of Management Studies, 43(1), 75–91.
Issa, U.H. (2013). Implementation of lean construction techniques for minimizing the risks
effect on project construction time. Alexandria Engineering Journal, 52(4), 697–704.
Kamrad, B., Ord, K., & Schmidt, G.M. (2021). Maximizing the probability of realizing
profit targets versus maximizing expected profits: A reconciliation to resolve an agency
problem. International Journal of Production Economics, 238, 108154.
Wolpert, D.H. & Macready, W.G. (1997). No free lunch theorems for optimization. IEEE
Transactions on Evolutionary Computation, 1(1), 67–82.
Yaghoubzadeh-Bavandpour, A., Bozorg-Haddad, O., Zolghadr-Asli, B., & Gandomi, A.H.
(2022). Improving approaches for meta-heuristic algorithms: A brief overview. In Bozorg-
Haddad, O., Zolghadr-Asli, B. eds. Computational intelligence for water and environ-
mental sciences, Springer Singapore, 35–61.
Yang, X.S. (2010). Nature-inspired metaheuristic algorithms. Luniver Press.
ISBN: 9781905986286
Yanofsky, N.S. (2011). Towards a definition of an algorithm. Journal of Logic and
Computation, 21(2), 253–286.
Zolghadr-Asli, B., Bozorg-Haddad, O., & Loáiciga, H.A. (2018). Stiffness and sensitivity
criteria and their application to water resources assessment. Journal of Hydro-Environment
Research, 20, 93–100.
Zolghadr-Asli, B., Bozorg-Haddad, O., & van Cauwenbergh, N. (2021). Multi-attribute
decision-making: A view of the world of decision-making. In Bozorg-Haddad,
O. ed. Essential tools for water resources analysis, planning, and management. Springer,
305–322.