Algorithms 09 00040
Abstract: A direct search algorithm is proposed for minimizing an arbitrary real-valued function.
The algorithm uses a new function transformation and three simplex-based operations. The function
transformation provides global exploration features, while the simplex-based operations guarantee
the termination of the algorithm and provide global convergence to a stationary point if the cost
function is differentiable and its gradient is Lipschitz continuous. The algorithm’s performance
has been extensively tested using benchmark functions and compared to some well-known global
optimization algorithms. The results of the computational study show that the algorithm combines
both simplicity and efficiency and is competitive with the heuristics-based strategies presently used
for global optimization.
Keywords: global optimization; direct search methods; search space transformation; derivative-free
optimization; heuristics-based optimization
1. Introduction
An optimization problem consists of finding an element of a given set that minimizes
(or maximizes) a certain value associated with each element of the set. In this paper, we are interested
in minimizing a real-valued function defined over the n-dimensional Euclidean space.
Without loss of generality, our problem is equivalent to minimizing a real-valued function over the
open unit n-box (0, 1)^n ⊂ R^n.
Let f denote a real function defined over the open unit n-box (0, 1)^n ⊂ R^n, and consider the
following optimization problem:

min { f(x) : x ∈ (0, 1)^n }     (1)

The function to be minimized f is called the cost function, and the unit n-box (0, 1)^n is the search
space of the problem. Our aim is to obtain a point x* ∈ (0, 1)^n such that the cost function f attains a
minimum at x*, i.e., f(x*) ≤ f(x), ∀x ∈ (0, 1)^n. We shall make the following assumption:
A1 The cost function f attains its minimum at a point of the search space.
∇ f(x) = 0     (2)
Many optimization methods make use of the stationarity condition in Equation (2) to compute
candidate optimizer points. These methods require explicit knowledge of the gradient, but they do
not guarantee that the point obtained is an optimizer, except when the cost function is pseudoconvex.
We shall also make an additional assumption that will restrict our alternatives to design a
procedure to compute the optimum.
A2 The gradient of the cost function is not available for the optimization mechanism.
Remark 1. The problem Equation (1) is equivalent to unconstrained optimization. Consider the
following unconstrained optimization problem:
min { f(ξ) : ξ ∈ R^n }     (3)

and let ξ = [ξ_1 ξ_2 ··· ξ_n]^T be the vector of variables of the cost function.
The invertible transformation:

x_i = 1 / (1 + e^(−ξ_i))     (4)
transforms the real line into the unit interval. Consequently, we can convert an unconstrained
minimization problem into an open unit n-box constrained minimization problem by transforming
each of its variables. Similarly, the linear transformation:
x_i = (ξ_i − a) / (b − a)     (5)
converts the open interval ( a, b) into the open unit interval. Consequently, both unconstrained and
arbitrary open n-box optimization problems are included in our formulation. Furthermore, under
Assumption A2 and by choosing appropriate values of a and b, the global minimum of function f is
attained at an interior point of the open n-box (a, b)^n.
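As an illustration, the two variable changes of Equations (4) and (5) can be sketched in a few lines of Python; the function names are ours:

```python
import math

def logistic_to_unit(xi):
    # Equation (4): map a real variable onto the open unit interval (0, 1).
    return 1.0 / (1.0 + math.exp(-xi))

def affine_to_unit(xi, a, b):
    # Equation (5): map a variable in the open interval (a, b) onto (0, 1).
    return (xi - a) / (b - a)

# Transform an unconstrained point of R^3 into the open unit 3-box.
point = [-2.0, 0.0, 5.0]
unit_point = [logistic_to_unit(x) for x in point]
assert all(0.0 < x < 1.0 for x in unit_point)
```

Applying either map componentwise converts an unconstrained or open n-box problem into the unit n-box formulation of Equation (1).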
The rest of this paper is organized as follows. Section 2 is a review of the literature on direct
search and global optimization algorithms. The direct search algorithm for global optimization is
designed in Section 3. The algorithm makes use of a set of simplex-based operations for a transformed
cost function. We begin by reviewing some basic results about n-dimensional simplices that will be
used later to define the algorithm and study its properties. Then, we will introduce a transformation
of the cost function that is crucial to improving the exploratory properties of the algorithm. Finally, we
conclude this section by explaining the algorithm in detail and analyzing its convergence properties.
In Section 4, an experimental study of the algorithm’s performance is accomplished by using three
well-known test functions. Its performance is also compared to other global optimization strategies.
The conclusions are presented in Section 5.
In many applications, no derivative information is available, but there is a need for optimization. Some relevant examples are: tuning of algorithmic
parameters, automatic error analysis, structural design, circuit design, molecular geometry or dynamic
pricing. See Chapter 1 of [6] for a detailed description of the applications.
Initially, direct search methods were dismissed for several reasons [8]. Some of them are
the following: they are based on heuristics, but not on rigorous mathematical theory; they are slow
to converge and only appropriate for small-sized problems; and finally, there were no mathematical
analysis tools to accompany them. However, these drawbacks can be refuted today. Regarding slow
convergence, in practice, one only needs improvement rather than optimality. In such a case, direct
methods can be a good option because they can obtain an improved solution even faster than local
methods that require gradient information. In addition, direct search methods are straightforward
to implement. If we consider not only the elapsed computation time, but the total time
needed to formulate the problem, program the algorithm and obtain an answer, then a direct search is
also a compelling alternative. Regarding the problem size, it is usually considered that direct search
methods are best suited for problems with a small number of variables. However, they have been
successfully applied to problems with a few hundred variables. Finally, regarding analytical tools,
several convergence analysis studies for a direct search algorithm were published starting in the early
1970s [9–12]. These results prove that it is possible to provide rigorous guarantees of convergence for
a large number of direct search methods [4], even in the presence of heuristics. Consequently, direct
search methods are regarded today as a respectable choice, and sometimes the only option, for solving
certain classes of difficult and practical optimization problems, and they should not be dismissed as a
compelling alternative in these cases. Direct search methods are broadly classified into local and global
optimization approaches.
Model-based search methods are based on the idea of using function evaluations to
compute a model of the cost function and to obtain derivative approximations from this model.
Many model-based methods are trust-region algorithms [17] that interpolate the cost function in
some region of appropriate shape and size to obtain a good approximation of the cost function that
can later be easily optimized on that region. Typically, the model to be adjusted is a fixed-order
polynomial, and the trust region is a sphere in some norm of the search space. The model function
is sequentially optimized in the trust region, and the trust region is updated for the next iteration
until the approximated model satisfies a first-order optimality condition. Model-based methods
also include [18–20]. During the last decade, there has been a considerable amount of work to
handle constraints in the field of direct search methods. New algorithms have been developed to
deal with bounds and linear inequalities [21,22] smooth nonlinear constraints [23,24] or non-smooth
constraints [25–27]. In this work, we do not deal with constraints. Our aim here is to develop a simple
and efficient algorithm that improves the heuristics-based optimization methods that are presently
used to obtain the global minimum in a multidimensional and multimodal continuous function.
The proposed transformation of the cost function has not been previously used in the literature, and it is a key ingredient of our approach,
because it provides global exploration of the search space. The new algorithm is an efficient alternative
to heuristics-based optimization methods for global optimization and preserves the convergence
properties of a well-designed direct search method.
The algorithm makes use of an initial point that is allocated as a qualified vertex of an
n-dimensional initial simplex. This simplex is evolved by a sequence of operations that depend
on the value of the transformed cost function at each vertex. Each operation produces a similar simplex
of a different size. The sequence of operations is designed in such a way that the simplex tends to
asymptotically collapse in a single point. The algorithm terminates when the simplex vertices are close
enough to each other.
Before explaining the algorithm and its operators in detail, we shall review some basic geometric
results about n-simplices. Later, we shall introduce the cost function transformation and its
main properties.
3.1. Notation
A brief summary of the notation used in this paper is included next for quick reference.
Σ = Co(V ) (6)
Next, we introduce certain specific types of n-simplices that are non-standard in the literature.
Definition 1 (Right n-simplex). A right n-simplex has a vertex, such that the n edges intersecting at it
are pairwise orthogonal. This vertex is called the right vertex.
Definition 2 (Isosceles right n-simplex). A right n-simplex, such that every edge intersecting at the
right vertex has the same length ∆, is an isosceles right n-simplex of length ∆.
Definition 3 (Standard n-simplex). The standard n-simplex is an isosceles right n-simplex of
unit length.
By locating the origin of R^n at the right vertex, the standard n-simplex Σ_0^n is the n-dimensional
polytope with vertex set V = {b_i ∈ R^n : i ∈ Z_[0,n]}, where b_0 is the zero vector and b_i is the
i-th element of the standard basis of R^n. Alternatively, the standard n-simplex can be defined as
the intersection of the closed unit ball in the one-norm of R^n and the non-negative orthant, i.e.,
Σ_0^n = {x ∈ [0, 1]^n : ‖x‖_1 ≤ 1}.
The content of an n-dimensional object is the measure of the amount of space inside its boundary.
For instance, the content of a two-simplex is its area, and the content of a three-simplex is its volume.
It can be proven by induction that the content of a standard n-simplex is 1/n!.
A regular n-box is an n-dimensional box with edges of equal length. If the content of an
n-simplex is not zero, then a geometric object of nonzero content can fit in its interior. The following
lemma provides the edge length of the maximum regular n-box that can fit in the interior of a
standard n-simplex.
Lemma 1. The regular n-box of maximum size that can be contained inside a standard n-simplex has edge
length 1/n and content 1/n^n.
Proof. The standard n-simplex is given as Σ_0^n = {x ∈ [0, 1]^n : ‖x‖_1 ≤ 1}. The maximum n-box that
can be allocated inside the standard n-simplex has a vertex at the origin, and it is oriented along the
coordinate directions. The vertex opposite to the origin of this regular n-box has coordinates x_i = 1/n
for i ∈ Z_[1,n]; therefore, its edge length is 1/n, and its content is 1/n^n.
The edges that intersect at the right vertex of a standard n-simplex are oriented along the
coordinate directions, so these edge directions define a nonnegative orthant. Any direction inside
this orthant can be expressed by a nonnegative n-dimensional vector of unit norm, i.e., ⟨d, d⟩ = 1.
A direction d forms angles θ_i with the coordinate directions, whose cosines are called the directional
cosines and are given by cos θ_i = ⟨d, b_i⟩. The following lemma proves that any direction d contained
in the nonnegative orthant has at least one directional cosine that is never less than 1/√n.
Lemma 2. The largest directional cosine of any direction contained in the orthant formed by the edges
intersecting at the right vertex of a standard n-simplex is never less than 1/√n.
Proof. Without loss of generality, suppose that the right vertex is the zero point and that the edges are in
the coordinate directions, characterized by the elements of the standard basis of R^n, i.e., b_i for i ∈ Z_[1,n].
Consider the direction d* = [1/√n ··· 1/√n]^T, whose directional cosines are cos θ_i* = ⟨d*, b_i⟩ = 1/√n
for all i ∈ Z_[1,n]. Suppose that, unlike the lemma statement, there exists a direction d with unit
norm ⟨d, d⟩ = 1 such that cos θ_i = ⟨d, b_i⟩ < 1/√n for all i ∈ Z_[1,n]; then
⟨d, d⟩ = ∑_{i ∈ Z_[1,n]} ⟨d, b_i⟩^2 < ∑_{i ∈ Z_[1,n]} 1/n = 1.
This contradicts the fact that d has unit norm, and the lemma is proven.
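Lemma 2 can also be checked numerically: for any unit direction in the nonnegative orthant, the largest component, and hence the largest directional cosine, is at least 1/√n. A small sampling sketch (the function name is ours):

```python
import math
import random

def smallest_max_cosine(n, trials=2000):
    # Sample random directions in the nonnegative orthant of R^n and record
    # the largest directional cosine max_i <d, b_i> = max_i d_i / ||d||.
    worst = 1.0
    for _ in range(trials):
        d = [random.random() for _ in range(n)]
        norm = math.sqrt(sum(di * di for di in d))
        worst = min(worst, max(di / norm for di in d))
    return worst

# The bound of Lemma 2 holds for every sampled direction.
n = 5
assert smallest_max_cosine(n) >= 1.0 / math.sqrt(n) - 1e-9
```

Equality is approached only for the diagonal direction d* with all components equal to 1/√n.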
A consequence of this lemma is that any direction contained in the nonnegative n-orthant always
forms an angle of less than π/2 radians with at least one of the coordinate directions that form the orthant.
In addition, if we consider an arbitrary direction d ∈ R^n, then there always exists an n-simplex similar
to the standard n-simplex such that the edge directions of its right vertex define an orthant whose
largest directional cosine with respect to d is never less than 1/√n. This n-simplex is obtained by rotating
by π radians the edges of the standard n-simplex corresponding to the negative components of the direction vector d.
Let ξ be an arbitrary n-dimensional vector of real numbers and P a positive integer. Let us
introduce the following map:

x(ξ) = frac(Pξ)

where frac(z) = z − ⌊z⌋ denotes the componentwise fractional part. This map is onto and transforms
the n-dimensional Euclidean space R^n into the open unit n-box (0, 1)^n. Let f be the cost function of
the minimization problem given in Equation (1), and introduce the following function transformation:

φ(ξ) = f(frac(Pξ))

where P is a positive integer. Since the cost function f is not necessarily defined outside of the open
unit n-box, the domain of the transformed cost function φ is:

dom φ = {ξ ∈ R^n : Pξ ∉ Z^n }     (10)
The reason is that, if Pξ ∈ Z^n, then frac(Pξ) = 0, and f is not necessarily defined at x = 0.
The transformed function φ has certain interesting properties. The most important are
periodicity with period P^-1 and piecewise continuity, both in each variable.
Lemma 3. The transformed function φ(ξ) = f(frac(Pξ)) satisfies the following properties:
(i) The function φ is continuous for any ξ ∈ R^n such that Pξ ∈ (0, 1)^n.
(ii) Let ξ, η ∈ R^n be such that Pξ ∈ (0, 1)^n and Pη ∈ Z^n; then φ(ξ + η) = φ(ξ).
Proof. (i) If Pξ ∈ (0, 1)^n, then frac(Pξ) = Pξ and φ(ξ) = f(Pξ). Since f is continuous for any
x ∈ (0, 1)^n, φ is continuous for any ξ ∈ R^n such that Pξ ∈ (0, 1)^n. (ii) If ξ and η are
n-dimensional vectors such that Pξ ∈ (0, 1)^n and Pη ∈ Z^n, respectively, then frac(Pη) = 0 and
frac(P(ξ + η)) = frac(Pξ) = Pξ; consequently, φ(ξ + η) = f(frac(Pξ)) = φ(ξ).
Let x ∈ R^n and let B_P(x) denote the following n-box:

B_P(x) = {ξ ∈ R^n : ⌊Pξ⌋ = ⌊Px⌋, Pξ ∉ Z^n }     (11)

The transformed function φ is continuous on B_P(x), and there always exists ξ* ∈ B_P(x) such that
φ(ξ*) = min{φ(ξ) : ξ ∈ dom φ}. Note that the n-box B_P(x) is given by B_P(x) = {ξ = P^-1 (z + ⌊Px⌋) :
z ∈ (0, 1)^n }. If x = 0, then B_P(0) is the fundamental n-box of the function φ, i.e., B_P(0) = (0, P^-1)^n.
For any other x ∈ R^n, B_P(x) is an n-box of length P^-1 where the function φ takes the same values as
in the fundamental n-box B_P(0). Note that there are P^n different n-boxes for x ∈ (0, 1)^n; hence, the
function φ repeats P^n times in the unit n-box (0, 1)^n.
Example 1. The function f(x) = |x| − |√x sin(3πx)| and the transformed function
φ(ξ) = f(frac(Pξ)) for P = 10 are depicted over the open unit interval in Figure 1. The function
f is continuous on the open unit interval (0, 1) and has three local minima. The transformed function φ
is periodic, with period P^-1 = 0.1, and piecewise continuous on the open unit interval. The fundamental
period of φ is B_P(0) = (0, 0.1), and the function values repeat ten times on the unit interval. The function
φ is continuous on each interval B_P(x) = {ξ : ⌊10ξ⌋ = ⌊10x⌋, 10ξ ∉ Z} for x ∈ R, but discontinuous at
the boundary points of these intervals, which are given by ξ = 0.1k for any integer k. This discontinuity
always occurs if f(0 + e) ≠ f(1 − e) for small e > 0, as occurs in this example and is clearly
visible in Figure 1. Moreover, the function φ is not necessarily defined at these points of discontinuity.
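The transformation is straightforward to reproduce in code. The sketch below assumes the Example 1 cost function f(x) = |x| − |√x sin(3πx)| and checks the P^-1-periodicity stated in Lemma 3:

```python
import math

P = 10  # number of periods per unit interval, as in Example 1

def f(x):
    # Cost function of Example 1 on the open unit interval (0, 1).
    return abs(x) - abs(math.sqrt(x) * math.sin(3 * math.pi * x))

def frac(z):
    # Fractional part: frac(z) = z - floor(z).
    return z - math.floor(z)

def phi(xi):
    # Transformed cost function phi(xi) = f(frac(P * xi)).
    return f(frac(P * xi))

# Periodicity with period 1/P (Lemma 3, part ii), up to floating-point error.
assert abs(phi(0.234) - phi(0.234 + 1.0 / P)) < 1e-9
```

Note that phi is evaluated only at points whose image frac(P·ξ) lies in (0, 1), in agreement with the domain in Equation (10).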
Consider the unconstrained optimization problem:

min { φ(ξ) : ξ ∈ dom φ }     (12)

The function φ has a global optimizer at ξ* if φ(ξ*) ≤ φ(ξ) for all ξ ∈ dom φ. The following
theorem proves that the transformed unconstrained minimization problem in Equation (12) also provides
the minimum value of the original optimization problem in Equation (1).
Figure 1. A continuous function f(x) = |x| − |√x sin(3πx)| with three local minima in the unit
interval (0, 1) and the corresponding transformed function φ(ξ) := f(frac(Pξ)) with P = 10 for the
same interval.
Theorem 1. Let f be a real-valued continuous function on the open unit n-box, and let φ be defined as
φ(ξ) = f(frac(Pξ)), where P is a positive integer; then λ* = min{ f(x) : x ∈ (0, 1)^n } if and only if
λ* = min{φ(ξ) : ξ ∈ dom φ}.
Proof. (If): Let x* be such that λ* = f(x*) = min{ f(x) : x ∈ (0, 1)^n } and φ(ξ) = f(frac(Pξ)), where
P is a positive integer; define ξ* = P^-1 (x* + η) for any η ∈ Z^n; then φ(ξ*) = f(x*) = λ*.
(Only if): Let ξ* be such that λ* = φ(ξ*) = min{φ(ξ) : ξ ∈ dom φ}; define x* = frac(Pξ*); then
f(x*) = φ(ξ*) = λ*.
Theorem 1 allows us to obtain a global minimizer of the original function f on the open unit
n-box by solving the unconstrained optimization problem (12). If we have a method that computes a
global minimizer ξ* of the unconstrained optimization problem (12), then x* = frac(Pξ*) ∈ (0, 1)^n,
and it satisfies φ(ξ*) = f(x*).
The following lemma proves that any regular n-box of edge length greater than or equal to P^-1 always
contains a point ξ* that attains the global minimum of the unconstrained optimization problem (12).
Lemma 4. Let B be a regular n-box of edge length ∆_B such that P∆_B ≥ 1, and let
λ* = min{ f(x) : x ∈ (0, 1)^n }; then the n-box B always contains a point ξ* ∈ R^n such that φ(ξ*) = λ*.
Proof. Let x* ∈ (0, 1)^n be such that f(x*) = λ* ≤ f(x) for any x ∈ (0, 1)^n, and let ξ̄ ∈ B be such
that ⌊Pξ̄⌋ = ⌊Pξ̄ + x*⌋. Consider the open n-box B_P(ξ̄) = {ξ ∈ R^n : ⌊Pξ⌋ = ⌊Pξ̄⌋, Pξ ∉ Z^n };
then B_P(ξ̄) has edge length P^-1, and the point ξ* = P^-1 (x* + ⌊Pξ̄⌋) satisfies ξ* ∈ B_P(ξ̄) ∩ B and
φ(ξ*) = f(x*) = λ*.
We are also interested in determining the size of an n-simplex similar to the standard n-simplex
that always contains a global minimizer of {φ(ξ) : ξ ∈ dom φ}. Since Lemma 4 establishes that such a
point is always contained in a regular n-box of edge length not less than P^-1, the solution is an isosceles
right n-simplex that contains a regular n-box of edge length P^-1. From Lemma 1, this n-simplex has edge
length nP^-1 and content n^n P^-n / n!. The following theorem states the result.
Theorem 2. Let Σ be an isosceles right n-simplex of length ∆ such that P∆ ≥ n, and let λ* = min{ f(x) :
x ∈ (0, 1)^n }; then the n-simplex Σ always contains a point ξ* ∈ dom φ such that φ(ξ*) = λ*.
Proof. A regular n-box of edge length ∆ B = ∆/n can be inscribed inside any isosceles right n-simplex
of length ∆. Since P∆ ≥ n, then P∆ B ≥ 1, and the theorem statement is a straightforward consequence
of Lemma 4.
In the rest of this section, we shall design a direct search algorithm to obtain a solution for the
optimization problem (1). We distinguish two algorithms, the basic algorithm and the complete
algorithm. The basic algorithm is a simplex-based algorithm for the transformed cost function.
An initial point is embedded at the right vertex of an isosceles right n-simplex of edge length ∆0 .
This n-simplex evolves by a sequence of operations. As a result of each iteration, another n-simplex
is obtained that is similar to the standard n-simplex and has the point with minimum value of the
transformed cost function located in the right vertex. The edge length of the n-simplex decreases
by a constant factor if no operations performed during the iteration produce a vertex with a smaller
value of the transformed cost function. Whenever the edge length is no less than nP^-1, we say that
the algorithm is in the exploratory phase, because the n-simplex always contains a global minimizer of
{φ(ξ) : ξ ∈ dom φ}. When the edge length of the n-simplex is less than nP^-1, we say that the
algorithm is in the convergence phase, and it aims to approach a stationary point. Thus, the basic algorithm
is designed to preserve the good properties of direct search algorithms, such as convergence to a
stationary point, but increases the possibility of reaching the global optimum at the end of the exploratory
phase. Another feature of the basic algorithm is that there is no interruption or change from one phase
to the other; the transition is a straightforward consequence of using the transformed cost function.
This transformation aims to perform global improvement at each iteration k ∈ Z[0,∞) whenever
the n-simplex has an edge length, such that P∆k ≥ n.
Given a point ξ ∈ R^n and a length ∆, define the vertex set:

vset(ξ, ∆) = {ξ + ∆ b_i : i ∈ Z_[0,n] }

where b_0 is an n-dimensional zero vector and b_i for i ∈ Z_[1,n] is the i-th element of the standard basis
of R^n. The initial n-simplex is obtained by varying an initial point at a distance ∆_0 in each coordinate
direction, such that P∆_0 > n. Let ξ_0 ∈ (0, 1)^n be an initial point; then, the vertex set of the initial
simplex is given by:

V_0 = vset(ξ_0, ∆_0)     (15)
If the initial point ξ 0 is not provided as an input of the algorithm, then it can be randomly chosen.
Consequently, the initial simplex is an isosceles right n-simplex with edge length ∆_0 and
content ∆_0^n / n!.

3.4.3. Expansive Translation

Let V denote the vertex set of an n-simplex with right vertex ξ_0, and let ξ_i be the vertex with the
minimum value of the transformed cost function. The vertices of the translated n-simplex are given by:

ξ'_j = ξ_i + ρ(ξ_j − ξ_0),  ξ_j ∈ V     (16)

where 1 ≤ ρ < ∞. If the vertex set of the current simplex is given by V = vset(ξ_0, ∆), then the
transformed n-simplex has the vertex set V' = vset(ξ_i, ∆'), where ∆' = ρ∆, and its edge length is |∆'|.
The new n-simplex, after the expansive translation, preserves its shape, but increases its content by a
factor of ρn . From a practical point of view, it is convenient to choose the expansive factor ρ close to
one, but slightly greater, because this avoids obtaining candidate points that were already visited in
previous iterations, in the next rotation operation.
Let c be an arbitrary positive constant; if φ(ξ'_i) < φ(ξ_0) − c∆^2 for some ξ'_i ∈ V', then the
algorithm proceeds to a new iteration and performs an expansive translation operation. Otherwise, the
algorithm executes a rotation operation.
The introduction of the positive term c∆^2 guarantees that the iteration is considered successful
only if it provides a point with a sufficient decrease of the cost function.
3.4.4. Rotation
This operation produces a similar n-simplex, but where each edge is rotated π radians with
respect to the right vertex. A graphical interpretation is depicted in Figure 2.
Figure 2. Graphical representation of the simplex translation, rotation and shrinkage operations
for three points in R^2. The initial vertex set is V = {ξ_0, ξ_i, ξ_j}, and the final vertex set is
V' = {ξ'_0, ξ'_i, ξ'_j}. The three operations produce similar simplices.
Let V denote the vertex set of an n-simplex. The vertices of the rotated n-simplex are given by the
following expression:

ξ'_j = ξ_0 − (ξ_j − ξ_0),  ξ_j ∈ V     (17)

If the vertex set of the current simplex is given by V = vset(ξ_0, ∆), then the transformed n-simplex
has the vertex set V' = vset(ξ_0, ∆'), where ∆' = −∆, and its edge length is |∆'|. If φ(ξ'_i) < φ(ξ_0) − c∆^2
for some ξ'_i ∈ V', then the rotation operation succeeds, and the algorithm proceeds to start a new
iteration and performs an expansive translation operation. Otherwise, the rotation operation fails; the
rotated vertex set is discarded; and the algorithm executes a shrinkage operation with the original
vertex set V .
3.4.5. Shrinkage
This operation performs a contraction of the simplex by fixing the right vertex and placing the
remaining vertices along the same directions, but at a distance that is reduced by a constant factor
σ ∈ (0, 1). A graphical interpretation of the shrinkage operation is depicted in Figure 2 for R2 .
Let V denote the vertex set of an n-simplex. The vertices of the transformed n-simplex are
given by:

ξ'_j = ξ_0 + σ(ξ_j − ξ_0),  ξ_j ∈ V     (18)

If the vertex set of the current simplex is given by V = vset(ξ_0, ∆), then the transformed
n-simplex has the vertex set V' = vset(ξ_0, ∆'), where ∆' = σ∆, and its edge length is |∆'|. After the
shrinkage operation, the algorithm proceeds to start a new iteration by performing an expansive
translation operation.
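The three vertex-set operations of Equations (16)-(18) can be sketched directly from their definitions; the list-based vertex representation and the helper `vset` are our own illustrative choices:

```python
def vset(xi0, delta, n):
    # Vertex set of an isosceles right n-simplex: the right vertex xi0 plus
    # one vertex at signed distance delta along each coordinate direction.
    vertices = [list(xi0)]
    for i in range(n):
        v = list(xi0)
        v[i] += delta
        vertices.append(v)
    return vertices

def expansive_translation(vertices, best_index, rho):
    # Equation (16): move the simplex so the best vertex becomes the right
    # vertex, expanding every edge by the factor rho >= 1.
    xi0, xi_best = vertices[0], vertices[best_index]
    return [[b + rho * (v[k] - xi0[k]) for k, b in enumerate(xi_best)]
            for v in vertices]

def rotation(vertices):
    # Equation (17): rotate every edge pi radians about the right vertex.
    xi0 = vertices[0]
    return [[x0 - (v[k] - x0) for k, x0 in enumerate(xi0)] for v in vertices]

def shrinkage(vertices, sigma):
    # Equation (18): contract edges toward the right vertex by sigma in (0, 1).
    xi0 = vertices[0]
    return [[x0 + sigma * (v[k] - x0) for k, x0 in enumerate(xi0)]
            for v in vertices]
```

All three functions return a simplex similar to the input one, in agreement with Figure 2.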
An iteration is considered successful only if it satisfies the sufficient decrease condition
φ(ξ'_i) < φ(ξ_0) − c∆^2 for some ξ'_i ∈ V', where c is a positive constant. This condition is key to
proving the termination of the basic algorithm and global convergence to a stationary point. The constant
c is arbitrary, but usually a small value is chosen.
The basic algorithm stops when the edge length of the simplex satisfies:

P|∆| ≤ e     (19)

and the solution is recovered through the map:

x(ξ) = frac(Pξ)     (20)

Therefore, when the n-simplex with vertex set V reaches an edge length such that P|∆| ≤ e, the basic
algorithm stops, and the point obtained is x̄ = frac(Pξ̄), where ξ̄ ∈ arg min{φ(ξ) : ξ ∈ V }; it attains
the function value f(x̄) = φ(ξ̄).
3.4.9. Implementation
An implementation of the basic algorithm in pseudocode is given in Algorithm 1.
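As a companion to the pseudocode, the following is a minimal Python sketch of the basic algorithm's main loop, not the authors' Algorithm 1: under our own simplifications, the simplex is tracked only through its right vertex and a signed edge length ∆ (the sign plays the role of the rotation operation), and all parameter defaults are illustrative:

```python
import math
import random

def frac(z):
    # Fractional part of a scalar: frac(z) = z - floor(z).
    return z - math.floor(z)

def basic_gds(f, n, xi0=None, P=1000, delta0=None, rho=1.05, sigma=0.5,
              c=0.01, eps=1e-4):
    # Sketch of the basic algorithm on the transformed cost
    # phi(xi) = f(frac(P * xi)), run until P * |delta| <= eps.
    phi = lambda xi: f([frac(P * x) for x in xi])
    xi = list(xi0) if xi0 is not None else [random.random() for _ in range(n)]
    # P * delta0 > n as required; the extra 0.5 keeps P * delta0 non-integer
    # so the candidate vertices do not coincide with the current point.
    delta = delta0 if delta0 is not None else (n + 0.5) / P
    while P * abs(delta) > eps:          # stopping criterion, Equation (19)
        f0 = phi(xi)
        # Evaluate the n candidate vertices along the coordinate directions.
        candidates = []
        for i in range(n):
            v = list(xi)
            v[i] += delta
            candidates.append((phi(v), v))
        best_val, best = min(candidates, key=lambda t: t[0])
        if best_val < f0 - c * delta * delta:
            # Successful iteration: translate to the best vertex and expand.
            xi, delta = best, rho * delta
        elif delta > 0:
            delta = -delta               # rotation: try the opposite directions
        else:
            delta = -sigma * delta       # shrinkage after a failed rotation
    return [frac(P * x) for x in xi], phi(xi)
```

On a smooth function this sketch behaves as a coordinate search with a sufficient decrease condition; it omits the full vertex bookkeeping of the simplex operations above.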
Proof. Let S and U denote the index sets of successful and unsuccessful iterations of the basic
algorithm, respectively. Let x_k = frac(Pξ_k); then f(x_k) = φ(ξ_k). If k ∈ S, then f(x_{k+1}) < f(x_k) − c∆_k^2
and ∆_{k+1} = ρ∆_k, while if k ∈ U, then f(x_{k+1}) ≥ f(x_k) − c∆_k^2 and ∆_{k+1} = σ∆_k. Let us begin by
proving that lim_{k→∞} |∆_k| = 0. Suppose not; then for any k ∈ Z_[0,∞) there exists ℓ ≥ k such that ℓ ∈ S,
because if such an ℓ did not exist, the set of successful iterations would be finite, and when k approaches
infinity, the update rule ∆_{k+1} = σ∆_k would yield lim_{k→∞} |∆_k| = 0. In addition, there exists ∆ > 0
such that for each successful iteration ℓ ∈ S, |∆_ℓ| ≥ ∆ and:

f(x_{ℓ+1}) < f(x_ℓ) − c∆^2

However, since ℓ approaches infinity, this implies that f(x_{ℓ+1}) → −∞, which contradicts the fact
that the function f is bounded below on (0, 1)^n. Consequently, the boundedness of f implies that no
such ∆ > 0 exists and lim_{k→∞} |∆_k| = 0. This proves the termination of the basic algorithm.
For the second part of the theorem, if f is continuous and differentiable on (0, 1)^n, then so is
φ(ξ) = f(frac(Pξ)) on the n-box B_P(x) = {ξ ∈ R^n : ⌊Pξ⌋ = ⌊Px⌋, Pξ ∉ Z^n } for any x ∈ R^n.
Furthermore, if ∇f is Lipschitz continuous with constant M on (0, 1)^n, then so is ∇φ on B_P(x) with
constant PM. Since we proved that lim_{k→∞} |∆_k| = 0, there always exists a positive integer L
such that ⌊Pξ_k⌋ = ⌊Pξ_L⌋ for any positive integer k > L. The basic algorithm searches points along
the coordinate directions; then, from Lemma 2 and for any positive integer k > L, no matter the
value of ∇φ(ξ_k), there is always at least one (positive or negative) coordinate direction d_i ∈ {b_i, −b_i}
such that:

1/√n ≤ cos θ = ⟨−∇φ(ξ_k), d_i⟩ / (‖∇φ(ξ_k)‖ ‖d_i‖)

or equivalently:

(1/√n) ‖∇φ(ξ_k)‖ ≤ −⟨∇φ(ξ_k), d_i⟩     (21)
because ‖d_i‖ = 1. For any unsuccessful iteration k ∈ U, φ(ξ_k + |∆_k| d_i) ≥ φ(ξ_k) − c∆_k^2, and the
mean value theorem establishes that:

φ(ξ_k + |∆_k| d_i) − φ(ξ_k) = ⟨∇φ(ξ_k + α_k |∆_k| d_i), d_i⟩ |∆_k|

for some α_k ∈ [0, 1]. Subtracting ⟨∇φ(ξ_k), d_i⟩ |∆_k| from both sides and using Equation (21) leads to:

(1/√n) ‖∇φ(ξ_k)‖ ≤ ⟨∇φ(ξ_k + α_k |∆_k| d_i) − ∇φ(ξ_k), d_i⟩ + c|∆_k|
Taking into account that φ is continuously differentiable on B_P(ξ_k) and its gradient ∇φ is Lipschitz
with constant PM on B_P(ξ_k),

(1/√n) ‖∇φ(ξ_k)‖ ≤ PM ‖α_k ∆_k d_i‖ + c|∆_k| ≤ (PM + c) |∆_k|

and taking the limit when k approaches infinity, it is possible to conclude that:

lim_{k→∞} ‖∇φ(ξ_k)‖ ≤ √n (PM + c) lim_{k→∞} |∆_k| = 0,  for k ∈ U
Figure 3. Evolution of the algorithm for f(x) = |x| − |√x sin(3πx)| and x(ξ) = frac(Pξ).
The parameters of the algorithm are P = 10^4, σ = 0.5, ρ = 1.05, c = 0.01, e = 10^-4 / P.
If the cost function has better points in some coordinate direction, the algorithm may jump to one of these points during the
exploratory phase of a new execution. Unfortunately, not every optimization problem satisfies the
property that, for any local minimizer, there are points in some coordinate direction where the cost
function is improved. In such a case, a strategy that turns out to be more effective is to start from a new
random initial point. The class of cost functions that we are facing is not usually known in advance,
and we do not know which strategy is more appropriate for a new execution of the basic algorithm.
A trade-off solution is to repeat the basic algorithm N times, but starting from a new random initial
point after a number R ≤ N of repetitions.
where mod is the modulo operator that represents the remainder of the integer division of its
arguments. Note that if R = 0, the initial point is always a random point, whereas if R = N, the initial
point is always the point obtained in the previous execution of the basic algorithm.
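The restart rule can be sketched as follows; the helper name is ours and `r` is the repetition counter of the basic algorithm:

```python
import random

def initial_point(r, R, previous, n):
    # Restart rule: reuse the previously obtained point, but draw a fresh
    # random point every R repetitions (via the mod operator).
    # R = 0 means "always random"; R = N means "always reuse".
    if R == 0 or previous is None or r % R == 0:
        return [random.random() for _ in range(n)]
    return list(previous)
```

With R between the two extremes, the strategy alternates between exploiting the previous solution and exploring from fresh random points.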
4. Experimental Study
Table 1. Benchmark functions used in the experimental study.

Func # Description
1.01 Sphere
1.02 Ellipsoid separable with monotone x-transformation, condition 1e + 06
1.03 Rastrigin separable with both x-transformations condition 10
1.04 Skew Rastrigin–Bueche separable, condition 10, skew-condition 100
1.05 Linear slope, neutral extension outside the domain (not flat)
2.06 Attractive sector function
2.07 Step-ellipsoid, condition 100
2.08 Rosenbrock, original
2.09 Rosenbrock, rotated
3.10 Ellipsoid with monotone x-transformation, condition 1e6
3.11 Discus with monotone x-transformation, condition 1e6
3.12 Bent cigar with asymmetric x-transformation, condition 1e6
3.13 Sharp ridge, slope 1:100, condition 10
3.14 Sum of different powers
4.15 Rastrigin with both x-transformations, condition 10
4.16 Weierstrass with monotone x-transformation, condition 100
4.17 Schaffer F7 with asymmetric x-transformation, condition 10
4.18 Schaffer F7 with asymmetric x-transformation, condition 1000
4.19 F8F2 composition of 2D Griewank–Rosenbrock
5.20 Schwefel x sin(x) with tridiagonal transformation, condition 10
5.21 Gallagher 101 Gaussian peaks, condition up to 1000
5.22 Gallagher 21 Gaussian peaks, condition up to 1000, 1000 for global opt
5.23 Katsuuras repetitive rugged function
5.24 Lunacek bi-Rastrigin, condition 100
Regarding the expansive factor ρ, values that are too large produce slow convergence. Consequently, values in the interval [1.01, 1.10] are usually a good option
for most problems. Regarding the contractive factor σ, the closer to one, the larger the number of
iterations of the basic GDS algorithm will be. From our experiments, σ = 0.5 is a convenient value
that produces a satisfactory trade-off between the efficiency of the algorithm and the convergence
time. The constant c of the sufficient decrease condition is chosen as c = 0.01. Any positive value
of c provides the termination of the algorithm, but small values are recommended to improve the
algorithm’s performance during the initial iterations when ∆ is large. The parameters P and R were
fixed to P = 1000 and R = 5. These values were selected after extensive experimentation. In our
experiments, we analyzed the mean value of the CPU time for one execution of the basic GDS algorithm
for different values of the parameter R. The empirical results showed that the larger the value of R,
the lower the CPU time for execution. This means that when R increases, the number of executions
N can also be increased to a certain extent without compromising the total CPU time. The reduction
is larger for smaller values of R, e.g., the reduction is about 40% from R = 0 to R = 1 and about
30% from R = 1 to R = 2. Since increasing R is also beneficial for functions that can be globally
optimized along the coordinate directions, a good option that trades off between moderate CPU time
and efficiency in computing the global optimum is to choose R ∈ [2, 10] and increase the number of
repetitions of the basic algorithm N as much as possible, to maintain a moderate total CPU time. In
fact, we have verified that a careful selection of P and R can improve the results depending on the problem. However, since a practitioner usually does not know in advance which class their problem belongs to, it is important to fix a set of parameters that performs well across a large class of problems. In
conclusion, the parameter values of the GDS algorithm for the experimental study were selected as
P = 1000, R = 5, ρ = 1.05, σ = 0.5, c = 0.01 and N = 1e9, and the stopping criterion is the number of
function evaluations.
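The interplay of the contraction factor σ, the sufficient decrease constant c, the repetitions N and the evaluation-budget stopping criterion can be sketched with a generic restart driver. The pattern_search step below is a deliberately simplified stand-in for the basic GDS simplex operations, not the algorithm of the paper, and all names are illustrative:

```python
import random

# Parameter values selected for the experimental study; rho, P and R
# are recorded here but not exercised by this simplified sketch.
PARAMS = dict(P=1000, R=5, rho=1.05, sigma=0.5, c=0.01, N=10**9)

def pattern_search(f, x0, budget, sigma=0.5, c=0.01, delta0=0.25, tol=1e-9):
    """Simplified coordinate pattern search illustrating the sufficient
    decrease condition f(y) <= f(x) - c*delta**2 and the contraction
    factor sigma; a stand-in for the basic GDS step."""
    x, fx = list(x0), f(x0)
    evals, delta = 1, delta0
    while delta > tol and evals < budget:
        improved = False
        for i in range(len(x)):
            for step in (delta, -delta):
                y = list(x)
                y[i] = min(max(y[i] + step, 0.0), 1.0)  # stay in the unit box
                fy = f(y)
                evals += 1
                if fy <= fx - c * delta ** 2:           # sufficient decrease
                    x, fx, improved = y, fy, True
                    break
            if improved:
                break
        if not improved:
            delta *= sigma                  # contract on an unsuccessful poll
    return x, fx, evals

def multistart(f, n, total_budget, seed=0):
    """Repeat the basic search from random initial points until the
    function-evaluation budget (the stopping criterion of the study)
    is exhausted, keeping the best point found."""
    rng = random.Random(seed)
    best_x, best_f, used = None, float("inf"), 0
    while used < total_budget:
        x, fx, ev = pattern_search(f, [rng.random() for _ in range(n)],
                                   total_budget - used)
        used += ev
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f
```

For a smooth test function such as a shifted sphere, multistart(lambda v: sum((t - 0.5) ** 2 for t in v), 2, 5000) locates the minimizer (0.5, 0.5) well within the budget.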
and function evaluations is much smaller for GDS; see Figures 4 and 5. Note that GDS is more than 100 times faster than DIRECT for several functions of dimension 40 with 10^5 function evaluations.
Table 2. Comparative study of the median of the error for problems of dimensions 5 and 10 using 10^3 and 10^4 function evaluations.
Table 3. Comparative study of the median of the error for problems of dimensions 20 and 40 using 10^4 and 10^5 function evaluations.
Algorithms 2016, 9, 40 19 of 22
The results of a study of the GDS algorithm for functions of high dimension are presented in Figure 6. The GDS algorithm is executed on functions 1.02, 1.03, 1.04 and 1.05 of dimensions 100 and 200 to search for the global minimum. The error to the global minimum on a logarithmic scale and the CPU time are shown for 10^4, 10^5, 10^6 and 10^7 function evaluations. It is interesting to remark that the CPU time grows linearly with the dimension of the problem.
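The linear-growth claim can be checked empirically with a simple timing harness; the random-sampling loop below is only a proxy for one GDS run, used to expose how the cost per function evaluation scales with the dimension n:

```python
import random
import time

def sphere(x):
    return sum(v * v for v in x)

def cpu_time_for(n, evals, seed=0):
    """CPU time for a fixed budget of evaluations in dimension n.
    Random sampling stands in for the search itself: only the scaling
    of the per-evaluation cost with n is of interest here."""
    rng = random.Random(seed)
    t0 = time.process_time()
    best = float("inf")
    for _ in range(evals):
        best = min(best, sphere([rng.random() for _ in range(n)]))
    return time.process_time() - t0

times = {n: cpu_time_for(n, 20000) for n in (50, 100, 200)}
# If the cost per evaluation is linear in n, times[200] / times[100]
# should be roughly 2 (up to timer noise).
```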
Figure 4. Comparative study of the CPU time of GDS (blue) and DIRECT (red) for solving 24 problems of dimension five (a) and dimension 10 (b) using 10^3 function evaluations (c) and 10^4 function evaluations (d).
Figure 5. Comparative study of the CPU time of GDS (blue) and DIRECT (red) for solving 24 problems of dimension 20 (a) and dimension 40 (b) using 10^4 function evaluations (c) and 10^5 function evaluations (d).
Figure 6. Study of the error to attain the global optimum (a) and CPU time (b) for functions of
dimension 100 (c) and dimension 200 (d) using the GDS algorithm.
Since GDS is simple and computationally efficient, it can be successfully applied to high-dimensional problems, where the number of function evaluations required to obtain a good result is large for any direct search algorithm.
5. Conclusions
A direct search algorithm for the global optimization of multivariate functions has been presented. The algorithm performs a sequence of operations on an initial isosceles right n-simplex. The result of these operations is always a similar n-simplex of a different size. The operations depend on the values at the vertices of the n-simplex, which are computed using a transformed cost function that improves the exploratory features of the algorithm. The convergence properties of
the algorithm have been proven. The results of an extensive experimental study demonstrate that
the algorithm is capable of approaching the global optimum for black box problems with moderate
computation time. It is competitive with most of the heuristic-based global optimization strategies
presently used. Furthermore, it has been compared to DIRECT, a well-known Lipschitzian-based direct
search algorithm. The new algorithm is very simple and computationally cost-effective and can be
applied to problems of high dimension, because the empirical results demonstrate that its computation
time increases linearly with the dimension of the problem. In future work, we plan to extend the
algorithm to constrained optimization.
Acknowledgments: The authors thank Instituto de las Tecnologías Avanzadas de la Producción (ITAP) for
supporting the publication costs.
Author Contributions: Enrique Baeyens and Alberto Herreros designed the GDS algorithm and developed the experiments. José R. Perán analyzed the results. The three authors contributed equally to writing the paper.
Conflicts of Interest: The authors declare no conflict of interest.
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC-BY) license (http://creativecommons.org/licenses/by/4.0/).