manual-7.1.0
Franz Wilhelmstötter
[email protected]
https://jenetics.io
7.1.0-2022/06/15
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. To view
a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a
letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041,
USA.
Contents
1 Fundamentals
1.1 Introduction
1.2 Architecture
1.3 Base classes
1.3.1 Domain classes
1.3.1.1 Gene
1.3.1.2 Chromosome
1.3.1.3 Genotype
1.3.1.4 Phenotype
1.3.1.5 Population
1.3.2 Operation classes
1.3.2.1 Selector
1.3.2.2 Alterer
1.3.3 Engine classes
1.3.3.1 Fitness function
1.3.3.2 Engine
1.3.3.3 Evolution
1.3.3.4 EvolutionStream
1.3.3.5 EvolutionResult
1.3.3.6 EvolutionStatistics
1.3.3.7 Evaluator
1.4 Nuts and bolts
1.4.1 Concurrency
1.4.1.1 Basic configuration
1.4.1.2 Concurrency tweaks
1.4.2 Randomness
1.4.3 Serialization
1.4.4 Utility classes
2 Advanced topics
2.1 Extending Jenetics
2.1.1 Genes
2.1.2 Chromosomes
2.1.3 Selectors
2.1.4 Alterers
2.1.5 Statistics
2.1.6 Engine
2.2 Encoding
3 Modules
3.1 io.jenetics.ext
3.1.1 Data structures
3.1.1.1 Tree
3.1.1.2 Parentheses tree
3.1.1.3 Flat tree
3.1.1.4 Tree formatting
3.1.1.5 Tree reduction
3.1.2 Rewriting
3.1.2.1 Tree pattern
3.1.2.2 Tree rewriter
3.1.2.3 Tree rewrite rule
3.1.2.4 Tree rewrite system (TRS)
3.1.2.5 Constant expression rewriter
3.1.3 Genes
3.1.3.1 BigInteger gene
3.1.3.2 Tree gene
3.1.4 Operators
3.1.5 Weasel program
4 Internals
4.1 PRNG testing
4.2 Random seeding
5 Examples
5.1 Ones counting
5.2 Real function
5.3 Rastrigin function
5.4 0/1 Knapsack
5.5 Traveling salesman
5.6 Evolving images
5.7 Symbolic regression
5.8 Grammar based regression
5.9 DTLZ1
6 Build
Bibliography
Chapter 1
Fundamentals
1.1 Introduction
Jenetics is a library, written in Java, which provides a Genetic algorithm (GA),
Evolutionary algorithm (EA), Multi-objective optimization (MOO) and Genetic
programming (GP) implementation. It has no runtime dependencies on other
libraries, except the Java 17 runtime. Jenetics is available on the Maven Central
repository and can be easily integrated into existing projects. The clear
structuring of the different parts of the GA allows an easy adaptation to different
problem domains.
To give you a first impression of how to use Jenetics, let's start with a
simple »Hello World« program. This first example implements the well-known
bit counting problem.
1 The classes described in this chapter reside in the io.jenetics.base module or
import io.jenetics.BitChromosome;
import io.jenetics.BitGene;
import io.jenetics.Genotype;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionResult;
import io.jenetics.util.Factory;

public final class HelloWorld {
    // 2.) Definition of the fitness function.
    private static int eval(final Genotype<BitGene> gt) {
        return gt.chromosome()
            .as(BitChromosome.class)
            .bitCount();
    }

    public static void main(final String[] args) {
        // 1.) Define the genotype (factory) suitable
        //     for the problem.
        final Factory<Genotype<BitGene>> gtf =
            Genotype.of(BitChromosome.of(10, 0.5));

        // 3.) Create the execution environment.
        final Engine<BitGene, Integer> engine = Engine
            .builder(HelloWorld::eval, gtf)
            .build();

        // 4.) Start the execution (evolution) and
        //     collect the result.
        final Genotype<BitGene> result = engine.stream()
            .limit(100)
            .collect(EvolutionResult.toBestGenotype());

        System.out.println("Hello World:\n\t" + result);
    }
}
Listing 1.1: »Hello World« GA
4. In the last step, we will create a new EvolutionStream from our Engine.
The EvolutionStream is the model (or view) of the evolutionary process.
It serves as a »process handle« and allows us, among other things, to
control the termination of the evolution. In our example, we simply
truncate the stream after 100 generations. If you don’t limit the stream,
the EvolutionStream will never terminate and run forever. The final
result, the best Genotype in our example, is then collected with one of
the predefined collectors of the EvolutionResult class.
As the example shows, Jenetics makes heavy use of the Stream and Collector
classes. Lambda expressions and the functional interfaces (SAM types) also play
an important role in the library design.
There are many other GA implementations out there and they may slightly
differ in the order of the single execution steps. Jenetics uses a classical
approach. Listing 1.2 shows the (imperative) pseudocode of the Jenetics genetic
algorithm steps.
1 P_0 ← P_initial
2 F(P_0)
3 while !finished do
4     g ← g + 1
5     S_g ← select_S(P_{g−1})
6     O_g ← select_O(P_{g−1})
7     O_g ← alter(O_g)
8     P_g ← filter[g_i ≥ g_max](S_g) + filter[g_i ≥ g_max](O_g)
9     F(P_g)
Listing 1.2: Genetic algorithm
In line (1) the initial population is created and line (2) calculates the fitness
values of the individuals. The initial population is created implicitly before the
first evolution step is performed. Line (4) increases the generation number,
and lines (5) and (6) select the survivor and the offspring population. The
offspring/survivors fraction is determined by the offspringFraction property
of the Engine.Builder. The selected offspring are altered in line (7). The next
line combines the survivor population and the altered offspring population, after
removing the killed individuals, into the new population. The steps from line (4)
to (9) are repeated until a given termination criterion is fulfilled.
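The steps of listing 1.2 can be sketched in plain Java, without the Jenetics classes. This is a minimal illustration, not the library's implementation: the class GaLoopSketch, the binary tournament used for both selections, the bit-flip mutation with probability 0.05, and the replacement of overaged survivors by fresh random individuals are all assumptions made for this example. The comments refer to the line numbers of listing 1.2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class GaLoopSketch {

    /** An individual: its bits, its fitness and the generation it was created in. */
    static final class Individual {
        final boolean[] bits;
        final int fitness;
        final long generation;

        Individual(final boolean[] bits, final long generation) {
            this.bits = bits;
            this.fitness = countOnes(bits); // F(.) of the pseudocode
            this.generation = generation;
        }
    }

    static int countOnes(final boolean[] bits) {
        int ones = 0;
        for (boolean bit : bits) {
            if (bit) ++ones;
        }
        return ones;
    }

    static Individual random(final int length, final long generation, final Random rnd) {
        final boolean[] bits = new boolean[length];
        for (int i = 0; i < length; ++i) bits[i] = rnd.nextBoolean();
        return new Individual(bits, generation);
    }

    // Binary tournament selection, used for survivors and offspring alike.
    static List<Individual> select(final List<Individual> pop, final int count, final Random rnd) {
        final List<Individual> selected = new ArrayList<>();
        for (int i = 0; i < count; ++i) {
            final Individual a = pop.get(rnd.nextInt(pop.size()));
            final Individual b = pop.get(rnd.nextInt(pop.size()));
            selected.add(a.fitness >= b.fitness ? a : b);
        }
        return selected;
    }

    // Bit-flip mutation: every bit is flipped with probability p.
    static Individual alter(final Individual parent, final double p, final long generation, final Random rnd) {
        final boolean[] bits = parent.bits.clone();
        for (int i = 0; i < bits.length; ++i) {
            if (rnd.nextDouble() < p) bits[i] = !bits[i];
        }
        return new Individual(bits, generation);
    }

    static Individual evolve(
        final int size, final int length, final int generations,
        final double offspringFraction, final long maxAge, final long seed
    ) {
        final Random rnd = new Random(seed);
        List<Individual> pop = new ArrayList<>();
        for (int i = 0; i < size; ++i) pop.add(random(length, 0, rnd)); // lines (1), (2)

        final int offspringCount = (int)Math.round(size*offspringFraction);
        for (long g = 1; g <= generations; ++g) {                       // lines (3), (4)
            final List<Individual> survivors = select(pop, size - offspringCount, rnd); // (5)
            final List<Individual> offspring = select(pop, offspringCount, rnd);        // (6)

            final List<Individual> next = new ArrayList<>();
            for (Individual s : survivors) {
                // (8) Assumption: overaged survivors are replaced by random individuals.
                next.add(g - s.generation > maxAge ? random(length, g, rnd) : s);
            }
            for (Individual o : offspring) {
                next.add(alter(o, 0.05, g, rnd));                       // (7), (8)
            }
            pop = next; // (9) the fitness is evaluated in the Individual constructor
        }

        Individual best = pop.get(0);
        for (Individual ind : pop) {
            if (ind.fitness > best.fitness) best = ind;
        }
        return best;
    }

    public static void main(final String[] args) {
        final Individual best = evolve(50, 10, 100, 0.6, 70, 42L);
        System.out.println("Best fitness: " + best.fitness);
    }
}
```

The fixed generation count stands in for the termination criterion; in the library this role is played by the limit applied to the EvolutionStream.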
1.2 Architecture
The basic metaphor of the Jenetics library is the Evolution Stream, implemented
as a Java Stream. An evolution stream is powered by (and bound to) an Evolution
Engine, which performs the needed evolution steps for each generation; the steps
are described in the body of the while loop of listing 1.2.
Figure 1.2.2 shows the static view of the main evolution classes, together
with their dependencies. Since the Engine class itself is immutable, and can't
be changed after creation, it is instantiated (configured) via a builder. The
Engine can be used to create an arbitrary number of EvolutionStreams. The
EvolutionStream is used to control the evolutionary process and collect the
final result. This is done in the same way as for the normal
java.util.stream.Stream classes. With the additional limit(Predicate) method, it
5 See section 2.6 on page 67 for a detailed description of the available termination strategies.
1.3. BASE CLASSES CHAPTER 1. FUNDAMENTALS
Figure 1.2.3 shows the package structure of the library, which consists of
the following packages:
io.jenetics This is the base package of the Jenetics library and contains all
domain classes like Gene, Chromosome, Genotype or Phenotype. All of
these types are immutable data classes. It also contains the Selector and
Alterer interfaces and their implementations. The classes in this package
are (almost) sufficient to implement your own evolution engine.
io.jenetics.engine This package contains the actual GA implementation
classes, e. g. Engine, EvolutionStream or EvolutionResult. They
mainly operate on the domain classes in the io.jenetics package.
online: https://jenetics.io/javadoc/jenetics/7.1/index.html.
Domain classes These classes form the domain model of the evolutionary
algorithm and contain the structural classes like Gene and Chromosome.
They are directly located in the io.jenetics package.
Operation classes These classes operate on the domain classes and include the
Alterer and Selector interfaces. They are also located in the io.jenetics
package.
Engine classes These classes implement the actual evolutionary algorithm and
can be found in the io.jenetics.engine package.
Figure 1.3.1 shows the class diagram of the domain classes. The Gene is
the base of the class structure. Genes are aggregated in Chromosomes, and
one to n Chromosomes are aggregated in Genotypes. A Genotype and a fitness
Function form the Phenotype, which are collected into a population Seq.
1.3.1.1 Gene
The basic building blocks of the Jenetics library contain the actual information
of the encoded solution, the allele. Some of the implementations also contain
domain information of the wrapped allele. This is the case for all BoundedGenes,
which contain the allowed minimum and maximum values. All Gene
implementations are final and immutable. In fact, they are all value-based classes
and fulfill the properties which are described in the Java API documentation[33]
(see footnote 8).
Besides the container functionality for the allele, every Gene is its own factory
and is able to create new, random instances of the same type and with the same
constraints (see footnote 9). The factory methods are used by the Alterers for
creating new Genes from the existing ones and play a crucial role in the
exploration of the problem space.
7 https://en.wikipedia.org/wiki/Value_object
8 It is also worth reading the blog entry from Stephen Colebourne:
http://blog.joda.org/2014/03/valjos-value-java-objects.html
9 A constraint can restrict the space of valid values of a given problem domain. An example
is a DoubleGene, where the allowed minimal and maximal value of the double allele is
part of the gene.
public interface Gene<A, G extends Gene<A, G>>
    extends Factory<G>, Verifiable
{
    A allele();
    boolean isValid();
    G newInstance();
    G newInstance(A allele);
}
Listing 1.3: Gene interface
Listing 1.3 shows the most important methods of the Gene interface. The
isValid method, defined in the Verifiable interface, allows the gene to mark
itself as invalid, e. g. when its allele is not within the allowed range. All invalid
genes are replaced with new ones during the evolution phase. The available
Gene implementations in the Jenetics library cover a wide range of problem
encodings. Refer to chapter 2.1.1 for how to implement your own Gene types.
1.3.1.2 Chromosome
A Chromosome is a collection of Genes which must contain at least one Gene.
This allows defining problems which require more than one Gene to encode. Like
the Gene interface, the Chromosome is also its own factory and allows creation
of a new Chromosome from a given Gene sequence.
public interface Chromosome<G extends Gene<?, G>>
    extends Factory<Chromosome<G>>, BaseSeq<G>, Verifiable
{
    G get(int index);
    int length();
    Chromosome<G> newInstance(ISeq<G> genes);
}
Listing 1.4: Chromosome interface
Listing 1.4 shows the main methods of the Chromosome interface. These are
the methods for accessing single Genes by their index and the factory method for
creating a new Chromosome from a given sequence of Genes. The factory method
is used by the Alterer classes, which are able to create altered Chromosomes
from a (changed) Gene sequence. Most of the Chromosome implementations can
be created with variable length. E. g. the IntegerChromosome can be created
with variable length, where the minimum value of the length range is included
and the maximum value of the length range is excluded.
IntegerChromosome chromosome = IntegerChromosome.of(
    0, 1_000, IntRange.of(5, 9)
);
The factory method of the IntegerChromosome will now create chromosome
instances with a length in the range $[range_{min}, range_{max})$, equally distributed.
Figure 1.3.2 shows the structure of a Chromosome with variable length.
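The half-open length range can be illustrated with a few lines of plain Java. The sketch below only shows what "equally distributed in [min, max)" means for the range IntRange.of(5, 9); the class and method names are hypothetical, and whether the library draws the length in exactly this way is an assumption.

```java
import java.util.Random;

final class VariableLengthSketch {

    // Draws a length uniformly from the half-open range [min, max).
    static int randomLength(final int min, final int max, final Random random) {
        if (min >= max) {
            throw new IllegalArgumentException("Empty range: [" + min + ", " + max + ")");
        }
        return min + random.nextInt(max - min);
    }

    public static void main(final String[] args) {
        final Random random = new Random();
        final int[] histogram = new int[9];
        for (int i = 0; i < 10_000; ++i) {
            ++histogram[randomLength(5, 9, random)];
        }
        // Only the lengths 5, 6, 7 and 8 occur; 9 is excluded.
        for (int length = 5; length < 9; ++length) {
            System.out.println(length + ": " + histogram[length]);
        }
    }
}
```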
1.3.1.3 Genotype
The central processing class, the evolution Engine is working with, is the
Genotype. It is the structural and immutable representative of an individual
The code snippet in the listing above creates a Genotype with the same structure
as shown in figure 1.3.3. In this example the DoubleGene has been chosen as
the Gene type.
If the problem space allows equal Gene constraints, the row major Genotype
vector encoding should be chosen. It is easier to create and the available
Recombinator classes are more efficient in exploring the search domain.
The following code snippet shows the creation of a row major Genotype
vector. All Alterers derived from the Recombinator do a fairly good job in
exploring the problem space for a row major Genotype vector.
Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0, 8)
);
The column major Genotype vector layout must be chosen when the problem
space requires Genes with different constraints. This is almost the only reason for
choosing the column major layout. The layout of this Genotype vector is shown
in figure 1.3.5. For a vector of length n, n Chromosomes of length one are needed.
The code snippet below shows how to create a Genotype vector in column
major layout. It’s a little bit more effort to create such a vector, since every
Gene has to be wrapped into a separate Chromosome. The DoubleChromosome
in the given example has a length of one, when the length parameter is omitted.
Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0),
    DoubleChromosome.of(1.0, 2.0),
    DoubleChromosome.of(0.0, 10.0),
    DoubleChromosome.of(0.1, 0.9)
);
The greater flexibility of a column major Genotype vector comes at the price of
a lower exploration capability of the Recombinator alterers. Using Crossover
alterers will have the same effect as the SwapMutator, when used with row major
Genotype vectors. Recommended alterers for vectors of NumericGenes are:
• MeanAlterer,
• LineCrossover and
• IntermediateCrossover
See also 2.3.2 for an advanced description on how to use the predefined vector
codecs.
See also 2.3.1 for an advanced description on how to use the predefined scalar
codecs.
1.3.1.4 Phenotype
The Phenotype is the actual representative of an individual and consists of
the Genotype, the generation where the Phenotype has been created and an
optional fitness value. Like the Genotype, the Phenotype is immutable and
can’t be changed after creation.
public final class Phenotype<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Comparable<Phenotype<G, C>>
{
    public Genotype<G> genotype();
    public long generation();
    public C fitness();
    public boolean isEvaluated();
    public Phenotype<G, C> withFitness(C fitness);
}
Listing 1.5: Phenotype class
Listing 1.5 shows the main methods of the Phenotype. The fitness property
will return the actual fitness value of the Genotype, and the Genotype can be
fetched with the genotype() method. If no fitness value is associated with the
Phenotype yet, the fitness() method will throw a NoSuchElementException.
Whether the fitness value has been set can be checked with the isEvaluated()
method. Setting a fitness value can be done with the withFitness(C) method.
Since the Phenotype is immutable, this method returns a new Phenotype with
the set fitness value. In addition to the fitness value, the Phenotype contains
the generation in which it was created. This allows the current age to be
calculated and overaged individuals to be removed from the population.
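The behavior described above (an immutable value, an optional fitness, and an age derived from the creation generation) can be sketched with a small stand-alone class. PhenotypeSketch is a hypothetical simplification written for this illustration: it stores the genotype as a plain String and is not the library's Phenotype class. The age(long) helper is likewise an assumption of this sketch.

```java
import java.util.NoSuchElementException;
import java.util.Objects;

final class PhenotypeSketch<C extends Comparable<? super C>> {
    private final String genotype;  // stand-in for a real Genotype
    private final long generation;  // generation in which the individual was created
    private final C fitness;        // null as long as the individual is not evaluated

    private PhenotypeSketch(final String genotype, final long generation, final C fitness) {
        this.genotype = Objects.requireNonNull(genotype);
        this.generation = generation;
        this.fitness = fitness;
    }

    static <C extends Comparable<? super C>> PhenotypeSketch<C> of(
        final String genotype, final long generation
    ) {
        return new PhenotypeSketch<>(genotype, generation, null);
    }

    String genotype() { return genotype; }

    long generation() { return generation; }

    boolean isEvaluated() { return fitness != null; }

    C fitness() {
        if (fitness == null) {
            throw new NoSuchElementException("Phenotype has no assigned fitness value.");
        }
        return fitness;
    }

    // Immutability: a new instance is returned, this instance stays unevaluated.
    PhenotypeSketch<C> withFitness(final C fitness) {
        return new PhenotypeSketch<>(genotype, generation, Objects.requireNonNull(fitness));
    }

    // Age of the individual, relative to the given current generation.
    long age(final long currentGeneration) {
        return currentGeneration - generation;
    }

    public static void main(final String[] args) {
        final PhenotypeSketch<Integer> created = PhenotypeSketch.of("0110", 3);
        final PhenotypeSketch<Integer> evaluated = created.withFitness(2);
        System.out.println(created.isEvaluated());   // the original is unchanged
        System.out.println(evaluated.isEvaluated());
        System.out.println(evaluated.age(10));
    }
}
```

An individual whose age exceeds a configured maximum can then be filtered out of the population with a simple comparison on age(currentGeneration).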
1.3.1.5 Population
There is no special class which represents a population. It's just a collection of
Phenotypes. As a collection class, the ISeq interface is used. The ISeq interface
expresses the immutability of the population at the type level
and makes the code more readable. For a detailed description of these collection
classes see section 1.4.4.
1.3.2.1 Selector
Selectors are responsible for selecting a given number of individuals from the
population. The selectors are used to divide the population into survivors
and offspring. The selectors for offspring and for the survivors can be set
independently.
The selection process of the Jenetics library acts on Phenotypes and indi-
rectly, via the fitness function, on Genotypes. Direct Gene or population
selection is not supported by the library.
• TournamentSelector
• TruncationSelector
• MonteCarloSelector
• ProbabilitySelector
• RouletteWheelSelector
• LinearRankSelector
• ExponentialRankSelector
• BoltzmannSelector
• StochasticUniversalSelector
• EliteSelector
the worst individual never survives, and the best individual wins in all the
tournaments in which it participates. The selection pressure can be varied by
changing the tournament size, s. For large values of s, weak individuals have
less chance of being selected. Compared with fitness proportional selectors, the
tournament selector is often used in practice because of its lack of stochastic
noise. Tournament selectors are also independent of the scaling of the genetic
algorithm fitness function.
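A tournament selection can be sketched in a few lines of plain Java. The class TournamentSketch and its method names are hypothetical, and the sketch assumes that the s participants are drawn with replacement, which is a simplification; it is not the code of the TournamentSelector class.

```java
import java.util.Arrays;
import java.util.Random;

final class TournamentSketch {

    // Runs one tournament of size s and returns the index of the winner.
    static int tournament(final double[] fitness, final int s, final Random random) {
        int winner = random.nextInt(fitness.length);
        for (int i = 1; i < s; ++i) {
            final int candidate = random.nextInt(fitness.length);
            // Only the order of the fitness values matters, not their magnitude.
            if (fitness[candidate] > fitness[winner]) winner = candidate;
        }
        return winner;
    }

    // Selects 'count' individuals (by index), one tournament per selected individual.
    static int[] select(final double[] fitness, final int count, final int s, final Random random) {
        final int[] selected = new int[count];
        for (int i = 0; i < count; ++i) {
            selected[i] = tournament(fitness, s, random);
        }
        return selected;
    }

    public static void main(final String[] args) {
        final double[] fitness = {1.0, 4.0, 2.0, 8.0, 5.0};
        System.out.println(Arrays.toString(select(fitness, 5, 3, new Random())));
    }
}
```

Because the winner is determined by comparisons only, applying a strictly increasing transformation to all fitness values (for example 1000·f + 5) leaves the selected indices unchanged for the same random sequence. This is the scaling independence mentioned above.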
Monte Carlo selector The Monte Carlo selector selects the individuals from
a given population randomly. Instead of a directed search, the Monte Carlo
selector performs a random search. This selector can be used to measure the
performance of other selectors. In general, the performance of a selector should
be better than the selection performance of the Monte Carlo selector. If the
Monte Carlo selector is used for selecting the parents for the population, it will
be a little bit more disruptive, on average, than roulette wheel selection.[40]
where $N$ is the number of individuals and $f_i$ the fitness value of the $i$th individual.
The probability selector works the same way, only the fitness value, $f_i$, is replaced
by the individual's selection probability, $P(i)$. It is not necessary to sort the
population. The selection probability of an individual, $i$, follows a binomial
distribution
$$P(i, k) = \binom{n}{k} P(i)^k \left(1 - P(i)\right)^{n-k} \tag{1.3.5}$$
where $n$ is the overall number of selected individuals and $k$ the number of
individuals $i$ in the set of selected individuals. The runtime complexity of the
implemented probability selectors is $O(n + \log(n))$ instead of $O(n^2)$ as for the
naive approach: A binary (index) search is performed on the summed probability
array.
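The binary search on the summed probability array can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the class ProbabilitySelectorSketch and its methods are hypothetical names, and the random value is scaled by the total sum so that small rounding errors in the probabilities do not matter.

```java
import java.util.Arrays;
import java.util.Random;

final class ProbabilitySelectorSketch {

    // Builds the summed (cumulative) probability array: sums[i] = P(0) + ... + P(i).
    static double[] cumulative(final double[] probabilities) {
        final double[] sums = new double[probabilities.length];
        double sum = 0.0;
        for (int i = 0; i < probabilities.length; ++i) {
            sum += probabilities[i];
            sums[i] = sum;
        }
        return sums;
    }

    // Binary search for the first index whose summed probability is greater than r.
    static int indexOf(final double[] sums, final double r) {
        int lo = 0;
        int hi = sums.length - 1;
        while (lo < hi) {
            final int mid = (lo + hi) >>> 1;
            if (sums[mid] > r) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    // Selects 'count' indexes according to the given selection probabilities.
    static int[] select(final double[] probabilities, final int count, final Random random) {
        final double[] sums = cumulative(probabilities);
        final double total = sums[sums.length - 1];
        final int[] selected = new int[count];
        for (int k = 0; k < count; ++k) {
            selected[k] = indexOf(sums, random.nextDouble()*total);
        }
        return selected;
    }

    public static void main(final String[] args) {
        final double[] probabilities = {0.1, 0.2, 0.7};
        System.out.println(Arrays.toString(select(probabilities, 10, new Random())));
    }
}
```

The cumulative array is built once in linear time; every single draw then costs only a logarithmic search instead of a linear scan.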
where
$$f_{min} = \min_{i \in [0,N)} \{f_i, 0\}$$
As you can see, the worst fitness value, $f_{min}$, if negative, now has a selection
probability of zero. In the case that the sum of the corrected fitness values is
zero, the selection probability of all fitness values will be set to $\frac{1}{N}$.
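A minimal sketch of this correction is given below. It assumes that the corrected fitness of an individual is $f_i - f_{min}$ and that the selection probability is this value divided by the sum of all corrected values; the class name RouletteWheelSketch is hypothetical and the code is not taken from the library.

```java
import java.util.Arrays;

final class RouletteWheelSketch {

    // Selection probabilities from (possibly negative) fitness values.
    static double[] probabilities(final double[] fitness) {
        final int n = fitness.length;

        // f_min = min{f_i, 0}: only negative fitness values shift the scale.
        double fmin = 0.0;
        for (double f : fitness) fmin = Math.min(fmin, f);

        // Sum of the corrected fitness values (assumption: f_i - f_min).
        double sum = 0.0;
        for (double f : fitness) sum += f - fmin;

        final double[] p = new double[n];
        for (int i = 0; i < n; ++i) {
            // If all corrected values are zero, every individual gets 1/N.
            p[i] = sum == 0.0 ? 1.0/n : (fitness[i] - fmin)/sum;
        }
        return p;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(probabilities(new double[] {-2, 0, 2})));
        System.out.println(Arrays.toString(probabilities(new double[] {0, 0, 0, 0})));
    }
}
```

For the fitness values {−2, 0, 2} the corrected values are {0, 2, 4}, so the worst individual gets a probability of zero and the other two get 1/3 and 2/3.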
$$P(i) = \frac{1}{N}\left(n^{-} + \left(n^{+} - n^{-}\right)\frac{i-1}{N-1}\right). \tag{1.3.8}$$
Here $\frac{n^{-}}{N}$ is the probability of the worst individual to be selected and $\frac{n^{+}}{N}$ the
$$P(i) = (c - 1)\frac{c^{i-1}}{c^{N} - 1}, \tag{1.3.9}$$
where c must be within the range [0, 1). A small value of c increases the
probability of the best individual to be selected. If c is set to zero, the selection
probability of the best individual is set to one. The selection probability of all
other individuals is zero. A value near one equalizes the selection probabilities.
This selector sorts the population in descending order before calculating the
selection probabilities.
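Equation (1.3.9) can be evaluated directly. The following sketch computes the selection probabilities for the ranks $i = 1, \dots, N$, assuming that rank 1 denotes the best individual of the sorted population; the class name ExponentialRankSketch is hypothetical.

```java
import java.util.Arrays;

final class ExponentialRankSketch {

    // P(i) = (c - 1)*c^(i-1)/(c^N - 1) for the ranks i = 1..N (rank 1 = best).
    static double[] probabilities(final int n, final double c) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least one: " + n);
        }
        if (c < 0.0 || c >= 1.0) {
            throw new IllegalArgumentException("c must be within [0, 1): " + c);
        }
        final double norm = (c - 1.0)/(Math.pow(c, n) - 1.0);
        final double[] p = new double[n];
        for (int i = 1; i <= n; ++i) {
            p[i - 1] = norm*Math.pow(c, i - 1);
        }
        return p;
    }

    public static void main(final String[] args) {
        // Small c: strong preference for the best individual.
        System.out.println(Arrays.toString(probabilities(5, 0.1)));
        // c near one: nearly equal probabilities.
        System.out.println(Arrays.toString(probabilities(5, 0.99)));
    }
}
```

The probabilities form a geometric sequence with ratio $c$ and sum up to one, which follows from the closed form of the geometric series.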
1.3.2.2 Alterer
The problem encoding/representation determines the bounds of the search space,
but the Alterers determine how the space can be traversed: Alterers are
responsible for the genetic diversity of the EvolutionStream. The two Alterer
hierarchies used in Jenetics are:
1. mutation and
First we will have a look at the mutation — There are two distinct
roles mutation plays in the evolution process:
$$\hat{\mu} = N_P \cdot N_g \cdot P(m) \tag{1.3.12}$$
Mutator The mutator has to deal with the problem that the genes are
arranged in a hierarchical structure with three levels (see chapter 1.3.1.3). The
mutator selects the gene which will be mutated in three steps:
1. Select a genotype, $G[i]$, from the population with probability $P_G(m)$,
2. select a chromosome, $C[j]$, from the selected genotype, $G[i]$, with
probability $P_C(m)$ and
3. select a gene, $g[k]$, from the selected chromosome, $C[j]$, with probability
$P_g(m)$.
The needed sub selection probabilities are set to
Swap mutator The swap mutator changes the order of genes in a chromosome,
with the hope of bringing related genes closer together, thereby facilitating the
production of building blocks. This mutation operator can also be used for
combinatorial problems, where no duplicated genes within a chromosome are
allowed, e. g. for the TSP.
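The reason why swapping is safe for combinatorial problems can be seen in a small plain-Java sketch: exchanging two positions never creates or removes a gene, so a permutation stays a permutation. The class SwapMutationSketch is a hypothetical illustration on an int array, and the rule "each position is swapped with a random partner with probability p" is an assumption of this sketch, not a statement about the SwapMutator class.

```java
import java.util.Arrays;
import java.util.Random;

final class SwapMutationSketch {

    // Returns a mutated copy: each position is swapped with a randomly chosen
    // position with probability p. The input array is not modified.
    static int[] mutate(final int[] genes, final double p, final Random random) {
        final int[] result = genes.clone();
        if (result.length > 1) {
            for (int i = 0; i < result.length; ++i) {
                if (random.nextDouble() < p) {
                    final int j = random.nextInt(result.length);
                    final int tmp = result[i];
                    result[i] = result[j];
                    result[j] = tmp;
                }
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        // A tour through six cities, encoded as a permutation of city indexes.
        final int[] tour = {0, 1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(mutate(tour, 0.3, new Random())));
    }
}
```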
        random.nextInt(min(gt1.length(), gt2.length()));
    final MSeq<Chromosome<G>> c1 = MSeq.of(gt1);
    final MSeq<Chromosome<G>> c2 = MSeq.of(gt2);
    final MSeq<G> genes1 = MSeq.of(c1.get(chIndex));
    final MSeq<G> genes2 = MSeq.of(c2.get(chIndex));

    // Perform the crossover.
    crossover(genes1, genes2);
    c1.set(chIndex, c1.get(chIndex).newInstance(genes1.toISeq()));
    c2.set(chIndex, c2.get(chIndex).newInstance(genes2.toISeq()));

    // Creating two new Phenotypes and replace the old one.
    pop.set(i1, Phenotype.of(Genotype.of(c1.toISeq())));
    pop.set(i2, Phenotype.of(Genotype.of(c2.toISeq())));
}
Listing 1.6: Chromosome selection for recombination
Listing 1.6 shows how two chromosomes are selected for recombination. It is
done this way for preserving the given constraints and to avoid the creation of
invalid individuals.
In figure 1.3.11 you can see how the crossover works for an odd number of
crossover points.
Combine alterer This alterer changes two genes by combining them. The
combine function can be defined when the alterer is created. How this is done,
is shown in the code snippet below.
Mean alterer The Mean alterer works on genes which implement the Mean
interface. All numeric genes implement this interface by calculating the arithmetic
mean of two genes. This alterer is a specialization of the CombineAlterer.
Line crossover The line crossover takes two numeric chromosomes and
treats them as real number vectors. Each of these vectors can also be seen as a
point in $\mathbb{R}^n$. If we draw a line through these two points (chromosomes), we have
the possible values of the new chromosomes, which all lie on this line.
Figure 1.3.14 shows how the two chromosomes form the two three-dimensional
vectors (black circles). The dashed line, connecting the two points, forms the
possible solutions created by the line crossover. An additional variable, $p$,
determines how far out along the line the created children will be. If $p = 0$ then
the children will be located along the line within the hypercube. If $p > 0$, the
children may be located at an arbitrary place on the line, even outside of the
hypercube. This is useful if you want to explore unknown regions, and you need
a way to generate chromosomes further out than the parents are.
The internal random parameters, which define the location of the new
crossover point, are generated once for the whole vector (chromosome). If
the LineCrossover generates numeric genes which lie outside the allowed min-
imum and maximum value, it simply uses the original gene and rejects the
generated, invalid one.
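The description above can be turned into a small plain-Java sketch. It is one plausible reading, not the code of the LineCrossover class: the sketch assumes that two random line parameters, a and b, are drawn once per chromosome pair from the interval [−p, 1 + p), that the children are the corresponding points on the line through the parents, and that a gene outside [min, max] is rejected in favor of the original one. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class LineCrossoverSketch {

    // Recombines the vectors v and w in place. Genes which would leave the
    // range [min, max] are rejected and the original gene is kept.
    static void crossover(
        final double[] v, final double[] w,
        final double p, final double min, final double max,
        final Random random
    ) {
        // Assumption: the line parameters are drawn once for the whole vector.
        final double a = -p + random.nextDouble()*(1.0 + 2.0*p);
        final double b = -p + random.nextDouble()*(1.0 + 2.0*p);

        for (int i = 0, n = Math.min(v.length, w.length); i < n; ++i) {
            // Both children are computed from the original parent genes.
            final double t = a*v[i] + (1.0 - a)*w[i];
            final double s = b*w[i] + (1.0 - b)*v[i];
            if (t >= min && t <= max) v[i] = t;
            if (s >= min && s <= max) w[i] = s;
        }
    }

    public static void main(final String[] args) {
        final double[] v = {0.0, 0.0, 0.0};
        final double[] w = {1.0, 2.0, 3.0};
        crossover(v, w, 0.0, -10.0, 10.0, new Random());
        // With p = 0 both children lie on the line segment between the parents.
        System.out.println(Arrays.toString(v));
        System.out.println(Arrays.toString(w));
    }
}
```

With p = 0 the parameters a and b stay within [0, 1), so both children are convex combinations of the parents. A larger p widens the interval and lets the children move beyond the parents along the same line.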
created more than once, because they are not in the valid range. The probability
for gene recreation rises sharply with the value of p. Setting p to a value greater
than one, doesn’t make sense in most of the cases. A value greater than 10
should be avoided.
Partial alterer Alterers are working on the whole population, which is ef-
fectively a sequence of genotypes. If your genotype consists of more than one
chromosome, the alterer is applied to all chromosomes. There is no way to bind
an alterer to a specific chromosome. The PartialAlterer class overcomes this
shortcoming and allows you to define the chromosomes the wrapped Alterer is
using.
final Genotype<DoubleGene> gtf = Genotype.of(
    DoubleChromosome.of(0, 1),
    DoubleChromosome.of(1, 2),
    DoubleChromosome.of(2, 3),
    DoubleChromosome.of(3, 4)
);
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .alterers(
        PartialAlterer.of(new Mutator<DoubleGene, Double>(), 0, 3),
        PartialAlterer.of(new MeanAlterer<DoubleGene, Double>(), 1),
        new LineCrossover<>())
    .build();
The example above shows how to use the PartialAlterer. The wrapped
Mutator will only operate on the chromosomes with the indexes 0 and 3, the
wrapped MeanAlterer will alter the chromosome with index 1 and the
LineCrossover will work on all chromosomes. A potential drawback of the
PartialAlterer is a possible performance penalty. This is because the chromosomes
must be sliced into different population sequences for each PartialAlterer.
Whether this is an issue for the overall performance depends on the concrete application.
The following example shows the simplest possible fitness Function. This
Function just returns the allele of a 1x1 float Genotype.
import java.util.function.Function;

import io.jenetics.DoubleGene;
import io.jenetics.Genotype;

public class Main {
    static Double identity(final Genotype<DoubleGene> gt) {
        return gt.gene().allele();
    }

    public static void main(final String[] args) {
        // Create fitness function from method reference.
        Function<Genotype<DoubleGene>, Double> ff1 =
            Main::identity;

        // Create fitness function from lambda expression.
        Function<Genotype<DoubleGene>, Double> ff2 = gt ->
            gt.gene().allele();
    }
}
The first type parameter of the Function defines the kind of Genotype from
which the fitness value is calculated and the second type parameter determines
the return type, which must at least implement the Comparable interface.
1.3.3.2 Engine
The evolution Engine controls how the evolution steps are executed. Once the
Engine is created, via its Builder class, it can't be changed. It doesn't contain
any mutable global state and can therefore be safely used/called from different
threads. This allows you to create more than one EvolutionStream from the same
Engine and execute them independently.
public final class Engine<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Evolution<G, C>,
        EvolutionStreamable<G, C>
{
    // The evolution function, performs one evolution step.
    public EvolutionResult<G, C> evolve(EvolutionStart<G, C> start);

    // Evolution stream for "normal" evolution execution.
    public EvolutionStream<G, C> stream();
}
Listing 1.7: Engine class
Listing 1.7 shows the main methods of the Engine class. The Engine is used
for performing the actual evolution of a given population. One evolution step is
executed by calling the Engine.evolve method, which returns an EvolutionResult
object. This object contains the evolved population plus additional
information like the killed individuals and the individuals marked as invalid.
With the stream() method you create a new EvolutionStream, which is used for
controlling the evolution process. For more information about the EvolutionStream
see section 1.3.3.4.
As already shown in previous examples, the Engine can only be created
via its Builder class. Only the fitness Function and the Chromosomes, which
represent the problem encoding, must be specified for creating an Engine
instance. For the rest of the parameters, default values have been specified. These
are the Engine parameters that can be configured:
alterers A list of Alterers which are applied to the offspring population, in
the defined order. The default value of this property is set to Single-
PointCrossover<>(0.2) followed by Mutator<>(0.15).
clock The java.time.InstantSource used for calculating the execution durations.
An InstantSource with nanosecond precision (System.nanoTime())
is used as default.
constraint This property lets you override the default implementation of the
Phenotype::isValid method, which is useful if the validity of a Phenotype
depends on more than the validity of the elements it consists of. A
description of the Constraint interface is given in section 2.5.
executor With this property it is possible to change the java.util.concur-
rent.Executor engine used for evaluating the evolution steps. This prop-
erty can be used to define an application wide Executor or for controlling
the number of execution threads. The default value is set to ForkJoin-
Pool.commonPool().
fitnessFunction This property defines the fitness Function used by the evo-
lution Engine. (See section 1.3.3.1.)
genotypeFactory Defines the Genotype Factory used for creating new indi-
viduals. Since the Genotype is its own Factory, it is sufficient to create
a Genotype, which serves as a template.
interceptor The interceptor lets you define functions, which are able to
change the EvolutionResult before and after an evolution step. An
EvolutionInterceptor can be seen as a crosscutting aspect of the evo-
lution process. One implementation of the EvolutionInterceptor is the
FitnessNullifier, which allows you to enforce the reevaluation of the
fitness values of all individuals. This might be handy, if the fitness function
is not time invariant and can change during the evolution process.
maximalPhenotypeAge Sets the maximal allowed age of an individual (Phenotype).
This prevents super individuals from living forever. The default value is set to
70.
offspringFraction Through this property it is possible to define the fraction of
offspring (and survivors) for evaluating the next generation. The fraction
value must lie within the interval [0, 1]. The default value is set to 0.6.
In addition to this property, it is also possible to set the survivorsFraction,
survivorsSize or offspringSize. All these additional properties
effectively set the offspringFraction.
offspringSelector This property defines the Selector used for selecting the
offspring population. The default value is TournamentSelector<>(3).
optimize With this property it is possible to define whether the fitness Function
should be maximized or minimized. By default, the fitness Function is
maximized.
populationSize Defines the number of individuals of a population. The evolu-
tion Engine keeps the number of individuals constant. That means, the
population of the EvolutionResult always contains the number of entries
defined by this property. The default value is set to 50.
selector This method sets the offspringSelector and survivorsSelector
in one step, using the same selector for both.
survivorsSelector This property defines the Selector used for selecting the
survivors population. The default value is TournamentSelector<>(3).
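How populationSize and offspringFraction interact can be illustrated with a little arithmetic. The following JDK-only sketch assumes that the offspring count is obtained by rounding populationSize * offspringFraction to the nearest integer and that the survivors fill the remainder; the class FractionSketch and this rounding rule are assumptions made for illustration, not the library's code.

```java
// Hypothetical helper for illustration only; not part of Jenetics.
final class FractionSketch {

    // Number of offspring for a given population size and offspring fraction.
    // Assumption: the product is rounded to the nearest integer.
    static int offspringSize(final int populationSize, final double offspringFraction) {
        if (offspringFraction < 0.0 || offspringFraction > 1.0) {
            throw new IllegalArgumentException("Fraction must be within [0, 1].");
        }
        return (int)Math.round(populationSize*offspringFraction);
    }

    // The survivors fill the rest, so the population size stays constant.
    static int survivorsSize(final int populationSize, final double offspringFraction) {
        return populationSize - offspringSize(populationSize, offspringFraction);
    }

    public static void main(final String[] args) {
        // Default values named in the text: populationSize = 50, offspringFraction = 0.6.
        System.out.println("offspring: " + offspringSize(50, 0.6));
        System.out.println("survivors: " + survivorsSize(50, 0.6));
    }
}
```

Under these assumptions, setting a survivors fraction of 0.4 describes the same split as an offspring fraction of 0.6.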
The EvolutionStreams created by the Engine class are unlimited. Such
streams must be limited by calling one of the available EvolutionStream::limit
methods. Alternatively, the Engine instance itself can be limited with the
Engine::limit methods. Such limited Engines no longer create infinite
EvolutionStreams; the streams are truncated by the limit predicate defined by the Engine. This
feature is needed for concatenating evolution Engines (see section 3.1.6.1).
final EvolutionStreamable<DoubleGene, Double> engine =
    Engine.builder(problem)
        .minimizing()
        .build()
        .limit(() -> Limits.bySteadyFitness(10));
As shown in the example code above, one important difference between the
Engine::limit and the EvolutionStream::limit method is that the limit
method of the Engine takes a Supplier of the limiting Predicate instead of the
Predicate itself. The reason for this is that some Predicates have to maintain
internal state to work properly. This means that every time the Engine creates
a new stream, it must also create a new limiting Predicate. The
Engine::limit method returns an EvolutionStreamable instead of an Engine.
This interface lets you create EvolutionStreams, which is what you usually
want to do with the Engine.
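Why a Supplier is needed for stateful Predicates can be seen in a small JDK-only sketch. The class SteadyLimitSketch and its bySteadyFitness method are hypothetical stand-ins written for this illustration; they only mimic the idea of a limit that stops after a number of non-improving values and are not the Jenetics implementation.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical stand-in for a stateful limit; not the Jenetics implementation.
final class SteadyLimitSketch {

    // Returns a predicate that stays true until the best value seen so far
    // has not improved for 'generations' consecutive elements.
    static Predicate<Double> bySteadyFitness(final int generations) {
        return new Predicate<>() {
            private double best = Double.NEGATIVE_INFINITY;
            private int stable = 0;

            @Override
            public boolean test(final Double fitness) {
                if (fitness > best) {
                    best = fitness;
                    stable = 0;
                } else {
                    ++stable;
                }
                return stable < generations;
            }
        };
    }

    // Counts how many leading elements the limit lets through.
    static long accepted(final Predicate<Double> limit, final List<Double> values) {
        return values.stream().takeWhile(limit).count();
    }

    public static void main(final String[] args) {
        final List<Double> fitness = List.of(1.0, 2.0, 2.0, 2.0, 3.0);
        final Supplier<Predicate<Double>> supplier = () -> bySteadyFitness(2);

        // A fresh predicate per run: both runs see untouched counters.
        System.out.println(accepted(supplier.get(), fitness));
        System.out.println(accepted(supplier.get(), fitness));

        // A shared predicate carries its counters over into the second run.
        final Predicate<Double> shared = bySteadyFitness(2);
        System.out.println(accepted(shared, fitness));
        System.out.println(accepted(shared, fitness));
    }
}
```

With a shared predicate the second run starts with the counters left behind by the first one and is cut short, which is the situation the Supplier avoids.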
1.3.3.3 Evolution
This functional interface represents the evolution function, which is implemented
by the Engine class. The main purpose of the Evolution interface is to decouple
the evolution function/strategy from the actual evolution process, represented
by the EvolutionStream. Listing 1.8 shows the definition of the Evolution
functional interface.
@FunctionalInterface
public interface Evolution<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    EvolutionResult<G, C> evolve(EvolutionStart<G, C> start);
}
Listing 1.8: Evolution interface
1.3.3.4 EvolutionStream
The EvolutionStream controls the execution of the evolution process and can
be seen as a kind of execution handle. This handle can be used to define
the termination criteria and to collect the final evolution result. Since the
EvolutionStream extends the Java Stream interface, it integrates smoothly
with the rest of the Java Stream API.14
public interface EvolutionStream<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    extends Stream<EvolutionResult<G, C>>
{
    EvolutionStream<G, C>
    limit(Predicate<? super EvolutionResult<G, C>> proceed);
}
Listing 1.9: EvolutionStream interface
In cases where you appreciate the usage of the EvolutionStream but need
a different Engine implementation, you can use the EvolutionStream::of
factory method for creating a new EvolutionStream.
static <G extends Gene<?, G>, C extends Comparable<? super C>>
EvolutionStream<G, C> of(
    Supplier<EvolutionStart<G, C>> start,
    Function<? super EvolutionStart<G, C>, EvolutionResult<G, C>> f
);
1.3.3.5 EvolutionResult
The EvolutionResult contains the result data of an evolution step and is the
element type of the EvolutionStream, as described in section 1.3.3.4.
public final class EvolutionResult<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Comparable<EvolutionResult<G, C>>
{
    public ISeq<Phenotype<G, C>> population();
    public long generation();
}
Listing 1.10: EvolutionResult class
Listing 1.10 shows the two most important properties, the population and
the generation the result belongs to. These are also the two properties needed
for the next evolution step. The generation is, of course, incremented by
one. To make collecting the EvolutionResult object easier, it also implements
the Comparable interface. Two EvolutionResults are compared by their best
Phenotype, depending on the optimization direction. The EvolutionResult
class has three predefined factory methods, which return Collectors
usable for the EvolutionStream:
toBestEvolutionResult() Collects the best EvolutionResult of an
EvolutionStream according to the defined optimization strategy (minimization
or maximization).
toBestPhenotype() This collector can be used if you are only interested in the
best Phenotype.
toBestGenotype() Use this collector if you only need the best Genotype of the
EvolutionStream.
The following code snippets show how to use the different EvolutionStream
collectors.
// Collecting the best EvolutionResult of the EvolutionStream.
final EvolutionResult<DoubleGene, Double> result = stream
    .collect(EvolutionResult.toBestEvolutionResult());

// Collecting the best Phenotype of the EvolutionStream.
final Phenotype<DoubleGene, Double> result = stream
    .collect(EvolutionResult.toBestPhenotype());

// Collecting the best Genotype of the EvolutionStream.
final Genotype<DoubleGene> result = stream
    .collect(EvolutionResult.toBestGenotype());
Sometimes it is useful not only to collect one final result, but to collect the n
best evolution results instead. This can be achieved by combining the
MinMax::toStrictlyIncreasing and ISeq::toISeq(int) methods.
final ISeq<EvolutionResult<DoubleGene, Double>> results = engine
    .stream()
    .limit(1000)
    .flatMap(MinMax.toStrictlyIncreasing())
    .collect(ISeq.toISeq(10));
The code snippet above collects the best 10 evolution results into the results
sequence in increasing order.
1.3.3.6 EvolutionStatistics
The EvolutionStatistics class allows you to gather additional statistical in-
formation from the EvolutionStream. This is especially useful during the devel-
opment phase of an application, when you have to find the right parametrization
of the evolution Engine. Besides other information, the EvolutionStatistics
contains (statistical) information about the fitness, invalid and killed Phenotypes
and runtime information of the different evolution steps. Since the Evolution-
Statistics class implements the Consumer<EvolutionResult<?, C>> inter-
face, it can be easily plugged into the EvolutionStream, adding it with the
peek method of the stream.
final Engine<DoubleGene, Double> engine = ...;
// The first type parameter is the fitness type, the second the statistics type.
final EvolutionStatistics<Double, ?> statistics =
    EvolutionStatistics.ofNumber();
engine.stream()
    .limit(100)
    .peek(statistics)
    .collect(toBestGenotype());
Listing 1.11: EvolutionStatistics usage
+---------------------------------------------------------------------------+
|             Selection: sum=0.046538278000 s; mean=0.003878189833 s        |
|              Altering: sum=0.086155457000 s; mean=0.007179621417 s        |
|   Fitness calculation: sum=0.022901606000 s; mean=0.001908467167 s        |
|     Overall execution: sum=0.147298067000 s; mean=0.012274838917 s        |
+---------------------------------------------------------------------------+
|  Evolution statistics                                                     |
+---------------------------------------------------------------------------+
|           Generations: 12                                                 |
|               Altered: sum=7,331; mean=610.916666667                      |
|                Killed: sum=0; mean=0.000000000                            |
|              Invalids: sum=0; mean=0.000000000                            |
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=11; mean=1.951000; var=5.545190                |
|               Fitness:                                                    |
|                      min  = 0.000000000000                                |
|                      max  = 481.748227114537                              |
|                      mean = 384.430345078660                              |
|                      var  = 13006.132537301528                            |
+---------------------------------------------------------------------------+
Listing 1.12 shows how to implement manual statistics gathering. The update
method is called whenever a new EvolutionResult has been calculated. If a
new best Phenotype is available, it is stored and logged. With the TSM::update
method, which is called on every finished generation, you create a live view of
the evolution progress.
1.3.3.7 Evaluator
The Evaluator is responsible for evaluating the fitness values for a given population.
It is the most general way of doing the fitness evaluation. Usually, it is
not necessary to implement your own evaluation strategy. If you are creating an
evolution Engine with a fitness function, this is done for you automatically. The
fitness values are then evaluated concurrently and independently of each other.
Using the Evaluator interface is helpful if you have performance problems when
the fitness function is evaluated serially, or in small concurrent batches as
implemented by the default strategy. In this case, the Evaluator interface can
be used to calculate the fitness function for a population in one batch. Another
use case for the Evaluator interface is when the fitness value also depends on
the current composition of the population. For example, it is possible to normalize the
population’s fitness values.
@FunctionalInterface
public interface Evaluator<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    ISeq<Phenotype<G, C>> eval(Seq<Phenotype<G, C>> population);
}
Listing 1.13: Evaluator interface
The implementer is free to evaluate the whole population, or only evaluate the
not yet evaluated Phenotypes. There are only two requirements which must be
fulfilled:
1. the size of the returned, evaluated, phenotype sequence must be exactly
the size of the input phenotype sequence and
2. all phenotypes of the returned population must have a fitness value assigned.
That means, the expression pop.forAll(Phenotype::isEvaluated) must
be true.
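The two requirements, together with the normalization use case mentioned earlier, can be sketched without the library. In the following JDK-only example, Individual is a hypothetical, much simplified stand-in for a Phenotype and eval plays the role of an Evaluator; the raw fitness x*x is an arbitrary choice made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical, simplified stand-ins; not the Jenetics Phenotype/Evaluator types.
final class BatchEvaluatorSketch {

    // An individual with an optional, possibly not yet evaluated, fitness value.
    record Individual(double x, OptionalDouble fitness) {
        boolean isEvaluated() {
            return fitness.isPresent();
        }
    }

    // Evaluates the whole population in one batch and normalizes the raw
    // fitness values (here x*x) so that they sum up to 1.
    static List<Individual> eval(final List<Individual> population) {
        final double[] raw = population.stream()
            .mapToDouble(i -> i.x()*i.x())
            .toArray();
        final double sum = Arrays.stream(raw).sum();

        final List<Individual> result = new ArrayList<>(population.size());
        for (int i = 0; i < raw.length; ++i) {
            final double normalized = sum == 0.0 ? 0.0 : raw[i]/sum;
            result.add(new Individual(
                population.get(i).x(),
                OptionalDouble.of(normalized)
            ));
        }
        // Requirement 1: same size as the input.
        // Requirement 2: every element has a fitness value assigned.
        return List.copyOf(result);
    }

    public static void main(final String[] args) {
        final List<Individual> population = List.of(
            new Individual(1.0, OptionalDouble.empty()),
            new Individual(2.0, OptionalDouble.empty()),
            new Individual(3.0, OptionalDouble.empty())
        );
        eval(population).forEach(System.out::println);
    }
}
```

Because the normalized value of one individual depends on all others, this kind of fitness cannot be expressed as a function of a single genotype, which is where a batch evaluator becomes useful.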
The code snippet below creates an evaluator which evaluates the fitness values
of the whole population serially in the main thread.
1.4. NUTS AND BOLTS CHAPTER 1. FUNDAMENTALS
To use the fitness Evaluator, you have to use the Engine.Builder constructor
directly, instead of one of the factory methods.
// The diamond operator avoids a raw 'Engine.Builder' type.
final Engine<G, C> engine = new Engine.Builder<>(evaluator, gtf)
    .build();
The Evaluators class contains factory methods which allow you to create
Evaluator instances from fitness functions which don’t return the fitness value
directly, but return Future<T> or CompletableFuture<T> instead. With these
methods, there is no need to wait for the fitness value if the fitness function
is already asynchronous.
static Future<Double> fitness(final double x) {
    return ...;
}

public static void main(final String[] args) {
    final Codec<Double, DoubleGene> codec = ...;
    final Evaluator<DoubleGene, Double> evaluator = Evaluators
        .async(Main::fitness, codec);

    final Engine<DoubleGene, Double> engine =
        new Engine.Builder<>(evaluator, codec.encoding())
            .build();
}
The code snippet above shows how to do the Engine operations in the main
thread, whereas the snippet below executes the Engine operations in a single
thread other than the main thread.
final Engine<DoubleGene, Double> engine = Engine.builder(...)
    // Doing the Engine operations in a single thread.
    .executor(Executors.newSingleThreadExecutor())
    .build();
The fitness function shouldn’t acquire locks for achieving thread safety. It
is also recommended to avoid calls to blocking methods. If such calls are
unavoidable, consider using the ForkJoinPool.managedBlock method,
especially if you are using a ForkJoinPool executor, which is the default.
are held by the current worker thread than there are other worker threads that might steal
them. This value may be useful for heuristic decisions about whether to fork other tasks. In
many usages of ForkJoinTasks , at steady state, each worker should aim to maintain a small
constant surplus (for example, 3) of tasks, and to process computations locally if this threshold
is exceeded.
17 The number of sub-populations actually depends on the number of available CPU cores,
1.4.2 Randomness
In general, GAs heavily depend on pseudo random number generators (PRNG)
for creating new individuals and for the selection and mutation algorithms.
Jenetics uses the Java RandomGenerator interface for generating random num-
bers. To make the random engine pluggable, the RandomGenerator object is
always fetched from the RandomRegistry. This makes it possible to change the
implementation of the random generator without changing the client code. The
central RandomRegistry also allows for easily changing the RandomGenerator
even for specific parts of the code.
The following example shows how to change and restore the RandomGenerator
object. When opening the with scope, changes to the RandomRegistry are
only visible within this scope. Once the with scope is left, the original Random-
Generator object is restored.
final var rgf = RandomGeneratorFactory.getDefault();
// An expression lambda is used, so the created list is actually returned.
final List<Genotype<DoubleGene>> genotypes =
    RandomRegistry.with(rgf.create(123), r ->
        Genotype.of(DoubleChromosome.of(0.0, 100.0, 10))
            .instances()
            .limit(100)
            .toList()
    );
With the previous listing, a random, but reproducible, list of genotypes is created.
This might be useful while testing your application or when you want to evaluate
the EvolutionStream several times with the same initial population.
final Engine<DoubleGene, Double> engine = ...;
// Create a new evolution stream with the given
// initial genotypes.
final Phenotype<DoubleGene, Double> best = engine.stream(genotypes)
    .limit(10)
    .collect(EvolutionResult.toBestPhenotype());
The example above uses the generated genotypes for creating the Evolution-
Stream. Each created stream uses the same starting population, but will, most
likely, create a different result. This is because the stream evaluation is still
nondeterministic.
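The scoping and seeding behaviour described in this section can be pictured with the JDK alone. The class ScopedRandomSketch below is a hypothetical miniature registry, written only for illustration: it keeps the current generator in a ThreadLocal, swaps it for the duration of a with call and restores the previous one afterwards. It is not the RandomRegistry implementation.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.random.RandomGenerator;
import java.util.random.RandomGeneratorFactory;

// Hypothetical miniature registry for illustration; not the Jenetics RandomRegistry.
final class ScopedRandomSketch {

    // The generator currently visible to the calling thread.
    private static final ThreadLocal<RandomGenerator> CURRENT =
        ThreadLocal.withInitial(RandomGenerator::getDefault);

    static RandomGenerator random() {
        return CURRENT.get();
    }

    // Makes 'scoped' the current generator while 'body' runs and
    // restores the previous generator afterwards, even on exceptions.
    static <T> T with(
        final RandomGenerator scoped,
        final Function<RandomGenerator, T> body
    ) {
        final RandomGenerator previous = CURRENT.get();
        CURRENT.set(scoped);
        try {
            return body.apply(scoped);
        } finally {
            CURRENT.set(previous);
        }
    }

    // Draws five values from a generator seeded with 'seed'. The client
    // code inside the scope only ever calls 'random()'.
    static double[] sample(final long seed) {
        final RandomGenerator seeded =
            RandomGeneratorFactory.getDefault().create(seed);
        return with(seeded, r -> random().doubles(5, 0.0, 100.0).toArray());
    }

    public static void main(final String[] args) {
        // The same seed yields the same values on every call.
        System.out.println(Arrays.equals(sample(123), sample(123)));
    }
}
```

The client code inside the scope never names a concrete generator, which is what makes the random engine exchangeable without touching that code.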
The Fair Play property of a PRNG guarantees that the quality of the
genetic algorithm (evolution stream) does not depend on the degree of
parallelization.
Random seeding Every thread uses the same kind of PRNG but with a
different seed. This is the default strategy used by the Jenetics library. Random
seeding works well for most problems, but it lacks a theoretical foundation.19
The RandomRegistry is initialized with the Java L64X256MixRandom class.20
Parameterization All threads use the same kind of PRNG but with different
parameters. This requires the PRNG to be parameterizable, which is not the
case for the Random object of the JDK. You can use the LCG64ShiftRandom
class if you want to use this strategy. The theoretical foundation for these
methods is weak. In a massively parallel environment you will need a reliable set
of parameters for every random stream, which is not trivial to find.
Block splitting With this method each thread is assigned a non-overlapping,
contiguous block of random numbers, which should be enough for the whole
runtime of the process. If the number of threads is not known in advance, the
length of each block should be chosen much larger than the maximal expected
number of threads. This strategy is used when using the LCG64ShiftRandom
class. This class assigns every thread a block of 2^56 ≈ 7.2 · 10^16 random numbers.
After 128 threads, the blocks are recycled, but with a changed seed.
Leapfrog With the leapfrog method, each thread t ∈ [0, P) consumes only every
P-th random number and jumps ahead in the random sequence by the number of
threads, P. This method requires the ability to jump very quickly ahead in the
sequence of random numbers by a given amount. Figure 1.4.3 graphically shows
the concept of the leapfrog method.
19 This is also expressed by Donald Knuth’s advice: »Random number generators should not
be chosen at random.«
20 RandomRegistry.random(RandomGeneratorFactory.of("L64X256MixRandom"))
Listing 1.14 shows the interface used for implementing the block splitting and
leapfrog parallelization techniques. These methods have the following meaning:
split Changes the internal state of the PRNG in a way that future calls to
nextLong will generate the s-th of p sub-streams. s must be
within the range of [0, p). This method is used for parallelization via
leapfrogging.
jump Changes the internal state of the PRNG in such a way that the engine
jumps s steps ahead. This method is used for parallelization via block
splitting.
21 The LCG64ShiftRandom PRNG is part of the io.jenetics.prngine module (see section
jumpPowerOfTwo Changes the internal state of the PRNG in such a way that
the engine jumps 2^s steps ahead. This method is used for parallelization
via block splitting.
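How split, jump and jumpPowerOfTwo can work is easiest to see on a plain linear congruential generator, where jumping n steps ahead amounts to composing the step function x -> a*x + c with itself n times. The class LcgSketch below is a hypothetical, JDK-only illustration of that idea; it is not the LCG64ShiftRandom class, its constants are the well-known MMIX LCG parameters, and it returns the raw state without any output scrambling.

```java
// Illustrative 64-bit linear congruential generator; NOT the LCG64ShiftRandom class.
final class LcgSketch {

    // Step function x -> a*x + c (mod 2^64); the constants are Knuth's MMIX values.
    private long a = 6364136223846793005L;
    private long c = 1442695040888963407L;
    private long state;

    LcgSketch(final long seed) {
        this.state = seed;
    }

    // Returns the current state and advances it by one step of the step function.
    long nextLong() {
        final long result = state;
        state = a*state + c;
        return result;
    }

    // Coefficients {an, cn} of the n-fold composition of x -> a*x + c,
    // computed in O(log n) steps by repeated squaring.
    private static long[] compose(long a, long c, long n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative.");
        }
        long an = 1;
        long cn = 0;
        while (n > 0) {
            if ((n & 1) == 1) {
                cn = a*cn + c;
                an = a*an;
            }
            c = (a + 1)*c;
            a = a*a;
            n >>= 1;
        }
        return new long[] {an, cn};
    }

    // Block splitting: skips the next s values of the current sequence.
    void jump(final long s) {
        final long[] k = compose(a, c, s);
        state = k[0]*state + k[1];
    }

    // Block splitting with a block length of 2^s.
    void jumpPowerOfTwo(final int s) {
        jump(1L << s);
    }

    // Leapfrog: afterwards the generator yields the s-th of p sub-streams,
    // that is the values with index s, s + p, s + 2p, ... of the original sequence.
    void split(final int p, final int s) {
        if (p < 1 || s < 0 || s >= p) {
            throw new IllegalArgumentException("Require p >= 1 and 0 <= s < p.");
        }
        jump(s);
        final long[] k = compose(a, c, p);
        a = k[0];
        c = k[1];
    }

    public static void main(final String[] args) {
        final LcgSketch base = new LcgSketch(42);
        final LcgSketch second = new LcgSketch(42);
        second.split(3, 1);

        base.nextLong(); // Skip the value with index 0.
        // Index 1 of the original sequence is the first value of sub-stream 1.
        System.out.println(base.nextLong() == second.nextLong());
    }
}
```

Interleaving the p sub-streams reproduces the original sequence, which is the property the leapfrog method relies on.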
1.4.3 Serialization
Jenetics supports serialization for a number of classes, most of which are located
in the io.jenetics package. Only the concrete implementations of the Gene
and the Chromosome interfaces implement the Serializable interface. This
gives greater flexibility when implementing your own Genes and Chromosomes.
• BitGene
• BitChromosome
• CharacterGene
• CharacterChromosome
• IntegerGene
• IntegerChromosome
• LongGene
• LongChromosome
• DoubleGene
• DoubleChromosome
• EnumGene
• PermutationChromosome
• Genotype
• Phenotype
With the serialization mechanism you can write a population to disk and load
it into a new EvolutionStream at a later time. It can also be used to transfer
populations to evolution engines, running on different hosts, over a network link.
The IO class, located in the io.jenetics.util package, supports native Java
serialization in a convenient way.
// Creating result population.
final EvolutionResult<DoubleGene, Double> result = stream
    .limit(100)
    .collect(toBestEvolutionResult());

// Writing the population to disk.
final File file = new File("population.obj");
IO.object.write(result.population(), file);

// Reading the population from disk.
final ISeq<Phenotype<DoubleGene, Double>> population =
    (ISeq<Phenotype<DoubleGene, Double>>)IO.object.read(file);
// Continue the evolution with the restored population.
final EvolutionStream<DoubleGene, Double> resumed = Engine
    .builder(ff, gtf)
    .build()
    .stream(population, 1);
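What native Java serialization does underneath can be sketched with the JDK's object streams. The class SerializationSketch below is a hypothetical helper written for illustration and serializes into a byte array instead of a file; it does not show how the IO class is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the Jenetics IO class.
final class SerializationSketch {

    // Serializes the given object into a byte array.
    static byte[] write(final Serializable object) {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Restores an object from the given bytes. The cast is unchecked, as it
    // is with every read of a natively serialized object.
    @SuppressWarnings("unchecked")
    static <T> T read(final byte[] data) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data)))
        {
            return (T)in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        final ArrayList<Double> fitness = new ArrayList<>(List.of(1.5, 2.5, 3.5));
        final List<Double> restored = read(write(fitness));
        System.out.println(restored.equals(fitness));
    }
}
```

The unchecked cast on reading mirrors the cast in the population example: the serialized form carries no generic type information.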
io.jenetics.util.Seq Most notable are the Seq interfaces and their implementations.
They are used, among others, in the Chromosome and Genotype classes
and hold the Genes and Chromosomes, respectively. The Seq interface itself
represents a fixed-sized, ordered sequence of elements. It is an abstraction over
the Java built-in array type, but much safer to use for generic elements, because
there are no casts needed when using nested generic types.
Figure 1.4.4 shows the Seq class diagram with its most important methods.
The interfaces MSeq and ISeq are the mutable and immutable specializations
of the base interface, respectively. Creating instances of the Seq interfaces is possible
via the static factory methods of the interfaces.
// Create "different" sequences.
final Seq<Integer> a1 = Seq.of(1, 2, 3);
final MSeq<Integer> a2 = MSeq.of(1, 2, 3);
final ISeq<Integer> a3 = MSeq.of(1, 2, 3).toISeq();
final MSeq<Integer> a4 = a3.copy();

// The 'equals' method performs element-wise comparison.
assert(a1.equals(a2) && a1 != a2);
assert(a2.equals(a3) && a2 != a3);
assert(a3.equals(a4) && a3 != a4);
How to create instances of the three Seq types is shown in the listing above.
The Seq classes also allow a more functional programming style. For a full
method description refer to the Javadoc.
The code snippet above shows how to sort an IntFunction. With the proxy
array you are now able to access the elements of the access function in ascending order. The
ProxySorter uses the Timsort24 algorithm for sorting the proxy int[] array.
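The idea of a proxy sort can be sketched with the JDK alone: instead of moving the elements, an array of indexes is sorted by the values the indexes point to. The class ProxySortSketch and its sortedProxy method are hypothetical names used for this illustration; the sketch relies on the stream sort of the JDK and is not the ProxySorter implementation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.IntToDoubleFunction;
import java.util.stream.IntStream;

// Hypothetical helper for illustration; not the Jenetics ProxySorter.
final class ProxySortSketch {

    // Returns the indexes 0..length-1, ordered so that
    // access.applyAsDouble(proxy[i]) is ascending in i.
    // The data behind 'access' is left untouched.
    static int[] sortedProxy(final IntToDoubleFunction access, final int length) {
        return IntStream.range(0, length)
            .boxed()
            .sorted(Comparator.comparingDouble(
                (Integer i) -> access.applyAsDouble(i)))
            .mapToInt(Integer::intValue)
            .toArray();
    }

    public static void main(final String[] args) {
        final double[] data = {3.0, 1.0, 2.0};
        final int[] proxy = sortedProxy(i -> data[i], data.length);

        // Visit the elements in ascending order via the proxy array.
        System.out.println(Arrays.toString(proxy));
        for (int index : proxy) {
            System.out.println(data[index]);
        }
    }
}
```

Because only the index array is reordered, several proxy arrays with different orderings can exist for the same, unchanged data.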
• minimum,
• maximum,
• sum,
• mean,
• variance,
• skewness and
• kurtosis value.
Table 1.4.1 contains the available statistical moments for the different numeric
types. The following code snippet shows an example on how to collect double
statistics from a given DoubleGene stream.
23 For this specific problem you could also do this by copying the population and sorting the
copy instead of the original. But using a sorted proxy array can lead to simpler code.
24 https://en.wikipedia.org/wiki/Timsort
// Collecting into a statistics object.
final DoubleChromosome chromosome = ...;
final DoubleMomentStatistics statistics = chromosome.stream()
    .collect(DoubleMomentStatistics
        .toDoubleMomentStatistics(v -> v.doubleValue()));

// Collecting into a moments object.
final DoubleMoments moments = chromosome.stream()
    .collect(DoubleMoments.toDoubleMoments(v -> v.doubleValue()));
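Statistical moments can be accumulated in a single pass over the values, without storing them. The following JDK-only sketch uses Welford's online algorithm for the mean and the sample variance; the class MomentsSketch is a hypothetical name, and the choice of algorithm is an assumption made for illustration, not a statement about how DoubleMomentStatistics is implemented.

```java
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;

// Hypothetical one-pass accumulator; not the Jenetics DoubleMomentStatistics class.
final class MomentsSketch implements DoubleConsumer {
    private long count = 0;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private double sum = 0.0;
    private double mean = 0.0;
    private double m2 = 0.0; // Sum of squared deviations from the running mean.

    // Welford's update: adjusts mean and m2 with every new value.
    @Override
    public void accept(final double value) {
        ++count;
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;

        final double delta = value - mean;
        mean += delta/count;
        m2 += delta*(value - mean);
    }

    long count() { return count; }
    double min() { return min; }
    double max() { return max; }
    double sum() { return sum; }
    double mean() { return count > 0 ? mean : Double.NaN; }

    // Sample variance; undefined for fewer than two values.
    double variance() { return count > 1 ? m2/(count - 1) : Double.NaN; }

    public static void main(final String[] args) {
        final MomentsSketch moments = new MomentsSketch();
        DoubleStream.of(1, 2, 3, 4).forEach(moments);
        System.out.println(moments.mean() + " " + moments.variance());
    }
}
```

Since the accumulator implements DoubleConsumer, it can be passed directly to DoubleStream::forEach.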
The stat package also contains a class for calculating the quantile25 of a stream
of double values. Its implementing algorithm, which is described in [21], calcu-
lates—or estimates—the quantile value on the fly, without storing the consumed
double values. This allows for using the Quantile class even for very large
sets of double values. How to calculate the first quartile of a given, random
DoubleStream is shown in the code snippet below.
final Quantile quartile = new Quantile(0.25);
RandomGenerator.getDefault()
    .doubles(10_000)
    .forEach(quartile);
final double value = quartile.value();
25 https://en.wikipedia.org/wiki/Quantile
Chapter 2
Advanced topics
This section describes some advanced topics for setting up an evolution Engine
or EvolutionStream. It contains some problem encoding examples and how to
override the default validation strategy of the given Genotypes. The last section
contains a detailed description of the implemented termination strategies.
2.1.1 Genes
Genes are the starting point in the class hierarchy. They hold the actual
information, the alleles, of the problem domain. Besides the classical bit-gene,
Jenetics comes with gene implementations for numbers (double, int and
long values), characters and enumeration types.
For implementing your own gene type you have to implement the Gene
interface with three methods: (1) the Gene::allele method, which returns
the wrapped data, (2) the Gene::newInstance method for creating new,
random instances of the gene (which must be of the same type and have the same
constraint) and (3) the Gene::isValid method, which checks whether the gene fulfills
the expected constraints. The gene constraint might be violated after mutation
and/or recombination. If you want to implement a new number-gene, e.g. a gene
which holds complex values, you may want to extend it from the NumericGene
interface.
2.1. EXTENDING JENETICS CHAPTER 2. ADVANCED TOPICS
If you want to support your own allele type, but want to avoid the ef-
fort of implementing the Gene interface, you can alternatively use the Any-
Gene class. It can be created with AnyGene::of(Supplier, Predicate). The
given Supplier is responsible for creating new random alleles, similar to the
newInstance method in the Gene interface. Additional validity checks are
performed by the given Predicate.
class LastMonday {
    // Creates new random 'LocalDate' objects.
    private static LocalDate nextMonday() {
        final var random = RandomRegistry.random();
        return LocalDate
            .of(2015, 1, 5)
            .plusWeeks(random.nextInt(1000));
    }

    // Do some additional validity check.
    private static boolean isValid(final LocalDate date) { ... }

    // Create a new gene from the random 'Supplier' and
    // validation 'Predicate'.
    private final AnyGene<LocalDate> gene = AnyGene
        .of(LastMonday::nextMonday, LastMonday::isValid);
}
Listing 2.1: AnyGene example
Example listing 2.1 shows the (almost) minimal setup for creating user defined
Gene allele types. By convention, the RandomGenerator, used for creating the
new LocalDate objects, must be requested from the RandomRegistry. With
the optional validation function, isValid, it is possible to reject Genes whose
alleles don’t conform to some criteria. The simple usage of the AnyGene also has
its downsides. Since the AnyGene instances are created from function objects,
serialization is not supported by the AnyGene class. It is also not possible to
use some Alterer implementations with the AnyGene, like:
• GaussianMutator,
• MeanAlterer and
• PartiallyMatchedCrossover
2.1.2 Chromosomes
A new Gene type usually comes with a corresponding Chromosome implemen-
tation. One of the important parts of a Chromosome is the factory method
newInstance(ISeq), which lets the evolution Engine create a new Chromosome
instance from a sequence of Genes. This method is used by the Alterer when
Listing 2.2 shows a full usage example of the AnyGene and AnyChromosome. The
example tries to find a Monday with a maximal day of month. An interesting
detail is that a Codec2 definition is used for creating new Genotypes and
1 http://www.oracle.com/technetwork/articles/java/javaserial-1536170.html
2 See section 2.3 on page 54 for a more detailed Codec description.
for converting them back to LocalDate alleles. The convenient usage of the
AnyChromosome comes at the price of the same restrictions as for the AnyGene:
no serialization support for the chromosome and not usable for all Alterer
implementations.
2.1.3 Selectors
If you want to implement your own selection strategy you only have to implement
the Selector interface with the select method.
@FunctionalInterface
public interface Selector<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    ISeq<Phenotype<G, C>> select(
        Seq<Phenotype<G, C>> population,
        int count,
        Optimize opt
    );
}
Listing 2.3: Selector interface
The first parameter is the original population from which the sub-population
is selected. The second parameter, count, is the number of individuals of the
returned sub-population. Depending on the selection algorithm, it is possible
that the sub-population contains more elements than the original one. The
last parameter, opt, determines the optimization strategy which must be used
by the selector. This is exactly the point where it is decided whether the GA
minimizes or maximizes the fitness function.
Before implementing a selector from scratch, consider extending your selector
from the ProbabilitySelector (or any other available Selector implementation).
It is worth the effort to try to express your selection strategy in terms of
the selection probability P(i). Another way of reusing existing Selector
implementations is composition.
    public class EliteSelector<
        G extends Gene<?, G>,
        C extends Comparable<? super C>
    >
        implements Selector<G, C>
    {
        private final TruncationSelector<G, C>
        _elite = new TruncationSelector<>();

        private final TournamentSelector<G, C>
        _rest = new TournamentSelector<>(3);

        public EliteSelector() {
        }

        @Override
        public ISeq<Phenotype<G, C>> select(
            final Seq<Phenotype<G, C>> population,
            final int count,
            final Optimize opt
        ) {
            ISeq<Phenotype<G, C>> result;
Listing 2.4 shows how an elite selector could be implemented by using the existing
Truncation- and TournamentSelector. With elite selection, the quality of the
best solution in each generation monotonically increases over time.[6] It is not
necessary to use an elite selector if you want to preserve the best individual
in the final result. The evolution Engine/Stream doesn’t throw away the best
solution found during the evolution process.
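The composition idea can be illustrated without the Jenetics types. The following is a minimal sketch in plain Java, assuming that individuals are represented only by their fitness values (plain doubles, larger is better). The class EliteSketch and its methods are hypothetical; they only mirror the roles of the TruncationSelector (one elite individual) and the TournamentSelector (the rest) and are not the library implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical, library-free sketch of elite selection by composition.
final class EliteSketch {

    // Truncation selection: the 'count' best values (larger is better).
    static List<Double> truncation(final List<Double> population, final int count) {
        return population.stream()
            .sorted(Comparator.reverseOrder())
            .limit(count)
            .collect(Collectors.toList());
    }

    // Tournament selection: 'count' winners of random tournaments of 'size'.
    static List<Double> tournament(
        final List<Double> population,
        final int count,
        final int size,
        final Random random
    ) {
        final List<Double> result = new ArrayList<>();
        for (int i = 0; i < count; ++i) {
            double winner = population.get(random.nextInt(population.size()));
            for (int j = 1; j < size; ++j) {
                final double candidate =
                    population.get(random.nextInt(population.size()));
                winner = Math.max(winner, candidate);
            }
            result.add(winner);
        }
        return result;
    }

    // Elite selection: the best individual first, the rest by tournament.
    static List<Double> select(
        final List<Double> population,
        final int count,
        final Random random
    ) {
        if (population.isEmpty() || count <= 0) {
            return new ArrayList<>();
        }
        final List<Double> result = new ArrayList<>(truncation(population, 1));
        result.addAll(tournament(population, count - 1, 3, random));
        return result;
    }

    public static void main(final String[] args) {
        final List<Double> population = List.of(0.3, 0.9, 0.1, 0.5, 0.7);
        System.out.println(select(population, 4, new Random(42)));
    }
}
```

Because the elite individual is always placed first, the best fitness value of the selected sub-population can never be worse than the best value of the original population.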
2.1.4 Alterers
For implementing a new alterer class it is necessary to implement the Alterer
interface. You might do this if your new Gene type needs a special kind of
alterer not available in the Jenetics project.
    @FunctionalInterface
    public interface Alterer<
        G extends Gene<?, G>,
        C extends Comparable<? super C>
    > {
        AltererResult<G, C> alter(
            Seq<Phenotype<G, C>> population,
            long generation
        );
    }
Listing 2.5: Alterer interface
The first parameter of the alter method is the population which has to
be altered. The second parameter is the generation of the newly created
individuals. The return value aggregates the number of genes that have been
altered and the altered population in the AltererResult class.
2.1.5 Statistics
During the development phase of an application which uses the Jenetics library,
additional statistical data about the evolution process is crucial. Such data
can help to optimize the parametrization of the evolution Engine. A good
2.1.6 Engine
The evolution Engine itself can’t be extended, but it is still possible to create an
EvolutionStream without using the Engine class.³ Because the EvolutionStream
has no direct dependency on the Engine, it is possible to use a different,
special evolution Function.
    public final class SpecialEngine {
        // The Genotype factory.
        private static final Factory<Genotype<DoubleGene>> GTF =
            Genotype.of(DoubleChromosome.of(0, 1));

        // Create new evolution start object.
        private static EvolutionStart<DoubleGene, Double>
        start(final int populationSize, final long generation) {
            final ISeq<Phenotype<DoubleGene, Double>> population = GTF
                .instances()
                .map(gt -> Phenotype.of(gt, generation))
                .limit(populationSize)
                .collect(ISeq.toISeq());

            return EvolutionStart.of(population, generation);
        }

        // The special evolution function.
        private static EvolutionResult<DoubleGene, Double>
        evolve(final EvolutionStart<DoubleGene, Double> start) {
            return ...; // Add implementation!
        }

        public static void main(final String[] args) {
            final Genotype<DoubleGene> best = EvolutionStream
                .of(() -> start(50, 0), SpecialEngine::evolve)
                .limit(Limits.bySteadyFitness(10))
                .limit(100)
                .collect(EvolutionResult.toBestGenotype());

            System.out.println("Best Genotype: " + best);
        }
    }
Listing 2.6: Special evolution engine
Listing 2.6 shows an implementation stub for using your own special evolution
Function. It is also possible to change the evolution function used, depending on
the actual population. The EvolutionStream::ofAdjustableEvolution method gives
you this possibility. In the following example two evolution functions are used,
depending on the fitness variance of the previous population.
3 Also refer to section 1.3.3.4 on page 26 on how to create an EvolutionStream from an
evolution Function.
2.2. ENCODING CHAPTER 2. ADVANCED TOPICS
2.2 Encoding
This section presents some encoding examples for common optimization problems.
The encoding should be a complete and minimal representation of the problem
domain. An encoding is complete if it contains enough information to represent
every solution to the problem, whereas a minimal encoding contains only the
information needed to represent a solution to the problem. If an encoding
contains more information than is needed to uniquely identify solutions to the
problem, the search space will be larger than necessary. In the best case, there is
a one-to-one mapping from the Genotype space to the problem domain. Whenever
• DoubleGene/Chromosome.
It is quite easy to encode a real function. Only the minimum and maximum
value of the function domain must be defined. The DoubleChromosome of length
1 is then wrapped into a Genotype.
    Genotype.of(
        DoubleChromosome.of(min, max, 1)
    );
Decoding the double value from the Genotype is also straightforward. Just get
the first Gene from the first Chromosome, with the gene method, and convert it
to a double.
    static double toDouble(final Genotype<DoubleGene> gt) {
        return gt.gene().doubleValue();
    }
When the Genotype only contains scalar Chromosomes⁵, it should be clear that
it can’t be altered by every Alterer. In particular, none of the Crossover
alterers will be able to create modified Genotypes. For scalars the appropriate
alterers would be the MeanAlterer, GaussianMutator and Mutator.
With the first encoding you have the possibility to use all available alterers,
including all Crossover alterer classes.
The second encoding must be used if the minimum and maximum value of
the variables xi can’t be the same for all i. For the different domains, each
variable, xi , is represented by a Numeric Chromosome with length one. The final
Genotype will consist of n Chromosomes with length one.
    Genotype.of(
        DoubleChromosome.of(min1, max1),
        DoubleChromosome.of(min2, max2),
        ...
        DoubleChromosome.of(minN, maxN)
    );
With the help of the Java Stream API, the decoding of the Genotype can be
done in a few lines. The DoubleChromosome stream, which is created from the
Chromosome Seq, is first mapped to double values and then collected into an
array.
    static double[] toScalars(final Genotype<DoubleGene> gt) {
        return gt.stream()
            .mapToDouble(c -> c.gene().doubleValue())
            .toArray();
    }
As already mentioned, with the use of scalar Chromosomes we can only use the
MeanAlterer, GaussianMutator or Mutator alterer class. If there are performance
issues in converting the Genotype into a double[] array, or any other
numeric array, you can access the Genes directly via the
Genotype.get(i).get(j) method and then convert them to the desired numeric
value by calling intValue(), longValue() or doubleValue().
    Genotype.of(
        DoubleChromosome.of(min1, max1, m),
        DoubleChromosome.of(min2, max2, m),
        ...
        DoubleChromosome.of(minN, maxN, m)
    );
The decoding of the vectors is quite easy with the help of the Java Stream API. In
the first map we have to cast the Chromosome<DoubleGene> object to the actual
DoubleChromosome. The second map then converts each DoubleChromosome to
a double[] array, which is collected into a two-dimensional double[n][m] array
afterwards.
    static double[][] toVectors(final Genotype<DoubleGene> gt) {
        return gt.stream()
            .map(dc -> dc.as(DoubleChromosome.class).toArray())
            .toArray(double[][]::new);
    }
For the special case of n = 1, the decoding of the Genotype can be simplified to
the decoding we introduced for scalar functions in section 2.2.2.
    static double[] toVector(final Genotype<DoubleGene> gt) {
        return gt.chromosome().as(DoubleChromosome.class).toArray();
    }
The drawback of this kind of encoding is that we will create a lot of invalid
(non-affine) transformation matrices during the evolution process, which must
be detected and discarded. It is also difficult to find the right parameters for
the min and max values of the DoubleChromosomes.
A better approach will be to encode the transformation parameters instead
of the transformation matrix. The affine transformation can be expressed by the
following parameters:
• sx – the scale factor in x direction
6 https://en.wikipedia.org/wiki/Affine_transformation
7 http://mathworld.wolfram.com/AffineTransformation.html
8 https://en.wikipedia.org/wiki/Homogeneous_coordinates
9 https://en.wikipedia.org/wiki/Transformation_matrix
This encoding ensures that no invalid Genotype will be created during the
evolution process, since the crossover will be only performed on the same kind of
chromosome (same chromosome index). To convert the Genotype back to the
transformation matrix A, the following equations can be used [20]:
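$$
\begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix}
\cdot
\begin{pmatrix} 1 & k_x & 0 \\ k_y & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\cdot
\begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix}
\cdot
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.
$$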
In Java code, the conversion from the Genotype to the transformation matrix
will look like this:
    static double[][] toMatrix(final Genotype<DoubleGene> gt) {
        final double sx = gt.get(0).gene().doubleValue();
        final double sy = gt.get(1).gene().doubleValue();
        final double tx = gt.get(2).gene().doubleValue();
        final double ty = gt.get(3).gene().doubleValue();
        final double th = gt.get(4).gene().doubleValue();
        final double kx = gt.get(5).gene().doubleValue();
        final double ky = gt.get(6).gene().doubleValue();

For the introduced encoding all kinds of alterers can be used. Since we have one
scalar DoubleChromosome, the rotation angle θ, it is recommended to also add
a MeanAlterer or GaussianMutator to the list of alterers.
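The matrix product used for converting the transformation parameters back to the matrix A can be checked independently of the library. The following is a minimal sketch in plain Java, assuming the seven parameters are already available as double values in the order sx, sy, tx, ty, th, kx, ky. The class AffineSketch and its helper mul are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;

// Library-free sketch: builds the matrix T * K * S * R (translation, shear,
// scale, rotation) from the seven transformation parameters.
final class AffineSketch {

    // Multiplies two 3x3 matrices.
    static double[][] mul(final double[][] a, final double[][] b) {
        final double[][] c = new double[3][3];
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j) {
                for (int k = 0; k < 3; ++k) {
                    c[i][j] += a[i][k]*b[k][j];
                }
            }
        }
        return c;
    }

    static double[][] toMatrix(
        final double sx, final double sy,
        final double tx, final double ty,
        final double th,
        final double kx, final double ky
    ) {
        final double cos = Math.cos(th);
        final double sin = Math.sin(th);

        final double[][] t = {{1, 0, tx}, {0, 1, ty}, {0, 0, 1}};
        final double[][] k = {{1, kx, 0}, {ky, 1, 0}, {0, 0, 1}};
        final double[][] s = {{sx, 0, 0}, {0, sy, 0}, {0, 0, 1}};
        final double[][] r = {{cos, -sin, 0}, {sin, cos, 0}, {0, 0, 1}};

        return mul(mul(mul(t, k), s), r);
    }

    public static void main(final String[] args) {
        // Scale by (2, 3) and translate by (5, 7); no rotation, no shear.
        final double[][] a = toMatrix(2, 3, 5, 7, 0, 0, 0);
        System.out.println(Arrays.deepToString(a));
    }
}
```

Whatever parameter values are chosen, the last row of the result is always (0, 0, 1), which is why this encoding cannot produce a non-affine matrix.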
2.2.5 Graph
A graph can be represented in many different ways. The best-known graph
representation is the adjacency matrix. The following encoding examples use
adjacency matrices with different characteristics.
Figure 2.2.1 shows an undirected graph and its corresponding matrix
representation. Since the edges between the nodes have no direction, the values
of the lower triangular part of the matrix are not taken into account. An
application which optimizes an undirected graph has to ignore this part of the
matrix.¹⁰
10 This property violates the minimal encoding requirement we mentioned at the beginning
of section 2.2 on page 47. For simplicity reasons this will be ignored for the undirected graph
encoding.
    final int n = 6;
    final Genotype<BitGene> gt = Genotype.of(BitChromosome.of(n), n);
The code snippet above shows how to create an adjacency matrix for a graph
with n = 6 nodes. It creates a Genotype which consists of n BitChromosomes of
length n each. Whether the node i is connected to node j can be easily checked
by calling gt.get(i-1).get(j-1).booleanValue(). For extracting the whole
matrix as an int[][] array, the following code can be used.
    final int[][] array = gt.toSeq().stream()
        .map(c -> c.toSeq().stream()
            .mapToInt(gene -> gene.bit() ? 1 : 0)
            .toArray())
        .toArray(int[][]::new);
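How an application might ignore the unused part of the matrix can be sketched in plain Java. The example below assumes the matrix has already been extracted as an int[][] array, uses zero-based node indexes, and also ignores the diagonal (no self-loops); the class name UndirectedGraphSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free sketch: reads an undirected graph from its adjacency matrix.
final class UndirectedGraphSketch {

    // Returns the edges {i, j} with i < j (zero-based node indexes). Only
    // the upper triangle is read; the lower triangle and the diagonal are
    // ignored.
    static List<int[]> edges(final int[][] adjacency) {
        final List<int[]> result = new ArrayList<>();
        for (int i = 0; i < adjacency.length; ++i) {
            for (int j = i + 1; j < adjacency[i].length; ++j) {
                if (adjacency[i][j] == 1) {
                    result.add(new int[]{i, j});
                }
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        final int[][] adjacency = {
            {0, 1, 0},
            {0, 0, 1},
            {1, 1, 0}  // lower triangle: not taken into account
        };
        for (int[] edge : edges(adjacency)) {
            System.out.println(edge[0] + " - " + edge[1]);
        }
    }
}
```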
Directed graph A directed graph (digraph) is a graph where the edges between
the nodes have a direction associated with them. The encoding of a directed
graph looks exactly like the encoding of an undirected graph. This time the
whole matrix is used and the lower triangular part is no longer ignored.
Figure 2.2.2 shows the adjacency matrix of a digraph.
The following code snippet shows how the Genotype of the matrix is created.
2.3. CODEC CHAPTER 2. ADVANCED TOPICS
    final int n = 6;
    final double min = -1;
    final double max = 20;
    final Genotype<DoubleGene> gt = Genotype
        .of(DoubleChromosome.of(min, max, n), n);
For accessing the single matrix elements, you can simply call
Genotype.get(i).get(j).doubleValue(). If the interaction with another library
requires a double[][] array, the following code can be used.
    final double[][] array = gt.stream()
        .map(dc -> dc.as(DoubleChromosome.class).toArray())
        .toArray(double[][]::new);
2.3 Codec
The Codec interface, located in the io.jenetics.engine package, narrows the
gap between the fitness Function, which should be maximized/minimized, and
the Genotype representation, which can be understood by the evolution Engine.
With the Codec interface it is possible to implement the encodings of section
2.2 in a more formalized way.
Normally, the Engine expects a fitness function which takes a Genotype as
input. This Genotype then has to be transformed into an object of the problem
domain. Using the Codec interface allows a tighter coupling of the Genotype
definition and the transformation code.¹¹
    public interface Codec<T, G extends Gene<?, G>> {
        Factory<Genotype<G>> encoding();
        Function<Genotype<G>, T> decoder();
        default T decode(final Genotype<G> gt) { ... }
    }
Listing 2.8: Codec interface
Listing 2.8 shows the Codec interface. The encoding method returns the
Genotype factory, which is used by the Engine for creating new Genotypes.
The decoder Function, which is returned by the decoder method, transforms
the Genotype to the argument type of the fitness Function. Without the
Codec interface, the implementation of the fitness Function is polluted with code,
which transforms the Genotype into the argument type of the actual fitness
Function.
    static double eval(final Genotype<DoubleGene> gt) {
        final double x = gt.gene().doubleValue();
        // Do some calculation with 'x'.
        return ...;
    }
The Codec for the example above is quite simple and is shown below. It is not
necessary to implement the Codec interface; instead you can use the Codec::of
factory method for creating new Codec instances.
    final DoubleRange domain = DoubleRange.of(0, 2*PI);
    final Codec<Double, DoubleGene> codec = Codec.of(
        Genotype.of(DoubleChromosome.of(domain)),
11 Section 2.2 on page 47 describes some possible encodings for common optimization
problems.
When using a Codec instance, the fitness Function solely contains code from
your actual problem domain, with no dependencies on classes of the Jenetics library.
    static double eval(final double x) {
        // Do some calculation with 'x'.
        return ...;
    }
Jenetics comes with a set of standard encodings, which are created via static
factory methods in the io.jenetics.engine.Codecs class. The following sub-
sections describe the most important predefined Codecs.
The usage of the Codec, created by this factory method, simplifies the imple-
mentation of the fitness Function and the creation of the evolution Engine.
For scalar types, the saving, in complexity and lines of code, is not that big, but
using the factory method is still quite handy. The following listing demonstrates
the interaction between Codec, fitness Function and evolution Engine.
    class Main {
        // Fitness function directly takes an 'int' value.
        static double fitness(int arg) {
            return ...;
        }
        public static void main(String[] args) {
            final Engine<IntegerGene, Double> engine = Engine
                .builder(Main::fitness, ofScalar(IntRange.of(0, 100)))
                .build();
            ...
        }
    }
                .toArray()
        );
    }
Listing 2.10: Codec factory method: ofVector
The usage example of the vector Codec is almost the same as for the scalar
Codec. As an additional parameter, we need to define the length of the desired
array and we define our fitness function with an int[] array.
    class Main {
        // Fitness function directly takes an 'int[]' array.
        static double fitness(int[] args) {
            return ...;
        }
        public static void main(String[] args) {
            final Engine<IntegerGene, Double> engine = Engine
                .builder(
                    Main::fitness,
                    ofVector(IntRange.of(0, 100), 10))
                .build();
            ...
        }
    }
Variable sized subsets A Codec for variable sized subsets can be easily
implemented with the use of a BitChromosome, as shown in listing 2.12.
    static <T> Codec<ISeq<T>, BitGene> ofSubSet(ISeq<T> basicSet) {
        return Codec.of(
            Genotype.of(BitChromosome.of(basicSet.length())),
            gt -> gt.chromosome()
                .as(BitChromosome.class).ones()
                .mapToObj(basicSet)
                .collect(ISeq.toISeq())
        );
    }
Listing 2.12: Codec factory method: ofSubSet
The following usage example of the subset Codec shows a simplified version of the
Knapsack problem (see section 5.4). We try to find a subset of the given
basic SET where the sum of the values is as big as possible, but less than or
equal to 20.
    class Main {
        // The basic set from where to choose an 'optimal' subset.
        final static ISeq<Integer> SET =
            ISeq.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Fitness function directly takes an 'ISeq<Integer>' subset.
        static int fitness(ISeq<Integer> subset) {
            assert(subset.size() <= SET.size());
            final int size = subset.stream().collect(
                Collectors.summingInt(Integer::intValue));
            return size <= 20 ? size : 0;
        }
        public static void main(String[] args) {
            final Engine<BitGene, Integer> engine = Engine
                .builder(Main::fitness, ofSubSet(SET))
                .build();
            ...
        }
    }
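The relationship between the bit encoding and the decoded subset can be made concrete without the library. The following is a minimal sketch in plain Java, assuming that bit i of an int plays the role of the i-th BitGene; the class SubsetSketch is hypothetical. Since the basic set has only ten elements, all 2^10 encodings can be enumerated exhaustively, which gives a reference value for the result a GA should approach.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free sketch of the variable sized subset encoding.
final class SubsetSketch {

    static final List<Integer> SET = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    // Decoder: bit i set means that SET.get(i) is part of the subset.
    static List<Integer> decode(final int bits) {
        final List<Integer> subset = new ArrayList<>();
        for (int i = 0; i < SET.size(); ++i) {
            if ((bits & (1 << i)) != 0) {
                subset.add(SET.get(i));
            }
        }
        return subset;
    }

    // Same fitness as in the example above: the sum, if it is at most 20.
    static int fitness(final List<Integer> subset) {
        final int sum = subset.stream().mapToInt(Integer::intValue).sum();
        return sum <= 20 ? sum : 0;
    }

    // Exhaustive search over all bit patterns.
    static int bestFitness() {
        int best = 0;
        for (int bits = 0; bits < (1 << SET.size()); ++bits) {
            best = Math.max(best, fitness(decode(bits)));
        }
        return best;
    }

    public static void main(final String[] args) {
        System.out.println(decode(0b101));
        System.out.println(bestFitness());
    }
}
```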
Fixed sized subsets The second kind of subset Codec allows you to find the
best subset of a given, fixed size. A classical usage for this encoding is the Subset
sum problem¹²:
Given a set (or multi-set) of integers, is there a non-empty subset whose sum is
zero? For example, given the set {−7, −3, −2, 5, 8}, the answer is yes because
the subset {−3, −2, 5} sums to zero. The problem is NP-complete.¹³
12 https://en.wikipedia.org/wiki/Subset_sum_problem
13 https://en.wikipedia.org/wiki/NP-completeness
    public class SubsetSum
        implements Problem<ISeq<Integer>, EnumGene<Integer>, Integer>
    {
        private final ISeq<Integer> _basicSet;
        private final int _size;

        public SubsetSum(ISeq<Integer> basicSet, int size) {
            _basicSet = basicSet;
            _size = size;
        }

        @Override
        public Function<ISeq<Integer>, Integer> fitness() {
            return subset -> abs(
                subset.stream().mapToInt(Integer::intValue).sum());
        }

        @Override
        public Codec<ISeq<Integer>, EnumGene<Integer>> codec() {
            return Codecs.ofSubSet(_basicSet, _size);
        }
    }
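For a set as small as the one in the quoted example, the fitness function of the listing above can be verified by brute force. The following plain Java sketch enumerates all subsets of a fixed size and returns the one whose absolute sum is minimal; the class SubsetSumSketch is a hypothetical helper and does not use the Jenetics Codec.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free sketch: exhaustive search for the fixed sized subset with
// the smallest absolute sum.
final class SubsetSumSketch {

    // The fitness of the SubsetSum problem: |sum of the subset|.
    static int fitness(final List<Integer> subset) {
        return Math.abs(subset.stream().mapToInt(Integer::intValue).sum());
    }

    // Returns the first subset of the given size with minimal fitness, or
    // an empty list if no subset of that size exists.
    static List<Integer> bestSubset(final List<Integer> basicSet, final int size) {
        List<Integer> best = new ArrayList<>();
        int bestFitness = Integer.MAX_VALUE;
        for (int bits = 0; bits < (1 << basicSet.size()); ++bits) {
            if (Integer.bitCount(bits) != size) {
                continue;
            }
            final List<Integer> subset = new ArrayList<>();
            for (int i = 0; i < basicSet.size(); ++i) {
                if ((bits & (1 << i)) != 0) {
                    subset.add(basicSet.get(i));
                }
            }
            if (fitness(subset) < bestFitness) {
                bestFitness = fitness(subset);
                best = subset;
            }
        }
        return best;
    }

    public static void main(final String[] args) {
        final List<Integer> set = List.of(-7, -3, -2, 5, 8);
        System.out.println(bestSubset(set, 3));
    }
}
```

The enumeration visits 2^n bit patterns and is therefore only usable for very small sets; this exponential growth is the reason a heuristic such as a GA becomes interesting for larger instances.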
Listing 2.13 shows the implementation of a permutation Codec, where the order
of the given alleles influences the value of the fitness function. An alternate
formulation of the traveling salesman problem is shown in the following listing.
It uses the permutation Codec in listing 2.13 and io.jenetics.jpx.WayPoints,
from the JPX¹⁴ project, for representing the city locations.
    public class TSM {
        // The locations to visit.
        static final ISeq<WayPoint> POINTS = ISeq.of(...);

        // The permutation codec.
        static final Codec<ISeq<WayPoint>, EnumGene<WayPoint>>
            CODEC = Codecs.ofPermutation(POINTS);

        // The fitness function (in the problem domain).
        static double dist(final ISeq<WayPoint> path) {
            return path.stream()
                .collect(Geoid.DEFAULT.toTourLength())
                .to(Length.Unit.METER);
        }

        // The evolution engine.
        static final Engine<EnumGene<WayPoint>, Double> ENGINE = Engine
            .builder(TSM::dist, CODEC)
            .optimize(Optimize.MINIMUM)
            .build();

        // Find the solution.
14 https://github.com/jenetics/jpx
It is not necessary that the source and target set are of the same size. If |source| >
|target|, the returned mapping function is surjective, if |source| < |target|, the
mapping is injective and if |source| = |target|, the created mapping is bijective.
In every case the size of the encoded Map is |target|. Figure 2.3.1 shows the
described different mapping types in graphical form.
With |source| = |target|, you will create a Codec for the assignment problem.
The problem is defined by a number of workers and a number of jobs. Every
worker can be assigned to perform any job. The cost for a worker may vary
depending on the worker job assignment. It is required to perform all jobs by
assigning exactly one worker to each job and exactly one job to each worker in
a way that optimizes the total assignment costs.¹⁵ The costs for such
worker job assignments are usually given by a matrix. Such an example matrix
is shown in table 2.3.1.
If your worker job cost can be expressed by a matrix, the Hungarian
algorithm¹⁶ can find an optimal solution in $O(n^3)$ time. You should consider this
deterministic algorithm before using a GA.
As you can see from the method definition, the combining Codecs and the
combined Codec have the same Gene type.
Only Codecs with the same Gene type can be composed by the combining
factory methods of the Codec class.
The following listing shows a full example which uses a combined Codec. It
uses the subset Codec, introduced in section 2.3.4, and combines it into a Tuple
of subsets.
    class Main {
        static final ISeq<Integer> SET =
            ISeq.of(1, 2, 3, 4, 5, 6, 7, 8, 9);

        // Result type of the combined 'Codec'.
        static final class Tuple<A, B> {
            final A first;
            final B second;
            Tuple(final A first, final B second) {
                this.first = first;
                this.second = second;
            }
        }

        static int fitness(Tuple<ISeq<Integer>, ISeq<Integer>> args) {
            return args.first.stream()
                .mapToInt(Integer::intValue)
                .sum() -
                args.second.stream()
                    .mapToInt(Integer::intValue)
                    .sum();
        }

        public static void main(String[] args) {
            // Combined 'Codec'.
            final Codec<Tuple<ISeq<Integer>, ISeq<Integer>>, BitGene>
                codec = Codec.of(
                    Codecs.ofSubSet(SET),
                    Codecs.ofSubSet(SET),
                    Tuple::new
                );

            final Engine<BitGene, Integer> engine = Engine
                .builder(Main::fitness, codec)
                .build();

            final Phenotype<BitGene, Integer> pt = engine.stream()
                .limit(100)
                .collect(EvolutionResult.toBestPhenotype());

            // Use the codec for converting the result 'Genotype'.
            final Tuple<ISeq<Integer>, ISeq<Integer>> result =
                codec.decoder().apply(pt.genotype());
        }
    }
If you have to combine more than two Codecs into one, you have to use the
second, more general, combining function: Codec::of(ISeq<Codec<?, G>>,
Function<Object[], T>). The example below shows how to use the general
combining function. It is just a little bit more verbose and requires explicit casts
for the sub-codec types.
    final Codec<Triple<Long, Long, Long>, LongGene>
        codec = Codec.of(ISeq.of(
            Codecs.ofScalar(LongRange.of(0, 100)),
            Codecs.ofScalar(LongRange.of(0, 1000)),
            Codecs.ofScalar(LongRange.of(0, 10000))),
            values -> {
                final Long first = (Long)values[0];
                final Long second = (Long)values[1];
                final Long third = (Long)values[2];
                return new Triple<>(first, second, third);
            }
        );
2.4. PROBLEM CHAPTER 2. ADVANCED TOPICS
2.4 Problem
The Problem interface is a further abstraction level, which allows you to bind
the problem encoding and the fitness function into one data structure.
    public interface Problem<
        T,
        G extends Gene<?, G>,
        C extends Comparable<? super C>
    > {
        Function<T, C> fitness();
        Codec<T, G> codec();
    }
Listing 2.16: Problem interface
Listing 2.16 shows the Problem interface. The generic type T represents the
type of the native problem domain. This is the argument type of the fitness
Function, and C the Comparable result of the fitness Function. G is the Gene
type, which is used by the evolution Engine.
    // Definition of the Ones counting problem.
    final Problem<ISeq<BitGene>, BitGene, Integer> ONES_COUNTING =
        Problem.of(
            // Fitness Function<ISeq<BitGene>, Integer>
            genes -> (int)genes.stream()
                .filter(BitGene::bit).count(),
            Codec.of(
                // Genotype Factory<Genotype<BitGene>>
                Genotype.of(BitChromosome.of(20, 0.15)),
                // Genotype conversion
                // Function<Genotype<BitGene>, ISeq<BitGene>>
                gt -> gt.chromosome().toSeq()
            )
        );

    // Engine creation for Problem solving.
    final Engine<BitGene, Integer> engine = Engine
        .builder(ONES_COUNTING)
        .populationSize(150)
        .survivorsSelector(new TournamentSelector<>(5))
        .offspringSelector(new RouletteWheelSelector<>())
        .alterers(
            new Mutator<>(0.03),
            new SinglePointCrossover<>(0.125))
        .build();
The listing above shows how a new Engine is created by using a predefined
Problem instance. This allows the complete decoupling of problem and Engine
definition.
2.5 Constraint
Constraints delimit the feasible space of solutions of an optimization problem
and are considered in evolutionary algorithms [13, 27, 12, 28]. They influence the
desirability of each possible solution. If the constraints are satisfied, the solution
is accepted and it is called a feasible solution; otherwise the solution is removed
or modified. For a fitness function, $f(x)$, the constraints are usually given as a
list of inequalities,
$$g_i(x) \le 0, \tag{2.5.1}$$
and a list of equations,
$$h_j(x) = 0. \tag{2.5.2}$$
[Figure: a search space bounded by x min, x max, y min and y max, cut by the constraint $4x + 7y - 32 \le 0$; the region beyond the constraint is labeled "Outside of search space".]
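The feasibility test described by the inequalities (2.5.1) and the equations (2.5.2) can be sketched in plain Java. The example assumes that a candidate solution is a double[] vector and uses the constraint 4x + 7y − 32 ≤ 0 from the figure. The class FeasibilitySketch and the tolerance EPSILON for the equality constraints are assumptions made for this illustration; they are not part of the Jenetics API.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Library-free sketch of a feasibility check for g_i(x) <= 0 and h_j(x) = 0.
final class FeasibilitySketch {

    // Tolerance for the equality constraints (an assumption of this sketch).
    static final double EPSILON = 1e-9;

    // The example constraint from the figure: 4x + 7y - 32 <= 0.
    static final ToDoubleFunction<double[]> G = p -> 4*p[0] + 7*p[1] - 32;

    static boolean isFeasible(
        final double[] x,
        final List<ToDoubleFunction<double[]>> inequalities,
        final List<ToDoubleFunction<double[]>> equations
    ) {
        for (ToDoubleFunction<double[]> g : inequalities) {
            if (g.applyAsDouble(x) > 0) {
                return false;
            }
        }
        for (ToDoubleFunction<double[]> h : equations) {
            if (Math.abs(h.applyAsDouble(x)) > EPSILON) {
                return false;
            }
        }
        return true;
    }

    // Convenience method for the two-dimensional example.
    static boolean isFeasible(final double x, final double y) {
        return isFeasible(new double[]{x, y}, List.of(G), List.of());
    }

    public static void main(final String[] args) {
        System.out.println(isFeasible(1, 1));
        System.out.println(isFeasible(5, 4));
    }
}
```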
Usually, a given problem should be encoded in a way that makes it impossible for
the evolution Engine to create invalid individuals (Genotypes). Some possible
encodings for common data structures are described in section 2.2. The Engine
creates new individuals in the altering step, by rearranging (or creating new)
Genes within a Chromosome. Since a Genotype is treated as valid if every
single Gene in every Chromosome is valid, the validity property of the Genes
determines the validity of the whole Genotype. The Engine tries to create
only valid individuals when creating the initial population and when it replaces
Genotypes which have been destroyed by the altering step. Individuals which
have exceeded their lifetime are also replaced by new ones. Although this behavior
will work for most Genotypes, it is still possible that invalid individuals will be
created during the evolution. If you need a more advanced validation strategy,
the Constraint interface comes into play.
    public interface Constraint<
        G extends Gene<?, G>,
        C extends Comparable<? super C>
    > {
        boolean test(Phenotype<G, C> ind);
        Phenotype<G, C> repair(Phenotype<G, C> ind, long gen);
    }
Listing 2.17: Constraint interface
Listing 2.17 shows the definition of the Constraint interface. The test method of
the interface checks the validity of the given Phenotype and the repair method
creates a new individual using the invalid individual as template.
The RetryConstraint class is the default implementation of the Constraint
interface. It implements the repair method by creating new Phenotypes until
the created individual is valid. Although this approach seems a little bit simplistic,
it has an important and desirable property: the repaired individuals follow the
same distribution as the original. This means that no part of the problem
domain is left out or is overcrowded. The number of necessary retries is also not
a problem for normal constraints. For example, the probability that a point
created uniformly at random in the square $[-1, 1]^2$ lies outside the unit circle
is $1 - \frac{\pi}{4} \approx 0.2146$. This leads to a failure
probability after 10 retries, which is the default value of the RetryConstraint,
of $\left(1 - \frac{\pi}{4}\right)^{10} \approx 0.000000207$. You can parameterize a different Constraint
Figure 2.5.2 shows the distribution of the domain points in our unit circle
example. Rejecting invalid points and creating new ones leads to a uniform
point distribution. Every part of the domain is explored with the same probability.
This is a very welcome property of the RetryConstraint strategy.
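The retry figures quoted for the unit circle example can be checked with a few lines of Java. The following is only a minimal sketch and not part of Jenetics: the class RetryFailure and its methods are hypothetical names, and it assumes that candidate points are drawn uniformly from the square [-1, 1] x [-1, 1] and that the tries are independent.

```java
final class RetryFailure {

    // Probability that a uniformly drawn point of the square [-1, 1]^2
    // lies outside the unit circle: 1 - (circle area)/(square area).
    static double invalidProbability() {
        return 1 - Math.PI/4;
    }

    // Probability that all of the given independent tries are invalid.
    static double failureProbability(final int retries) {
        return Math.pow(invalidProbability(), retries);
    }

    // Hypothetical helper: smallest number of tries which pushes the
    // failure probability to the given target or below.
    static int retriesFor(final double target) {
        return (int)Math.ceil(
            Math.log(target)/Math.log(invalidProbability())
        );
    }
}
```

Calling RetryFailure.failureProbability(10) reproduces the value given for ten retries, and RetryFailure.retriesFor lets you estimate how many retries a stricter bound would need.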
Trying to create only valid domain points can sometimes lead to a nonuniform
distribution. This can be seen in figure 2.5.3. The points were created by choosing
the angle, α, and the radius, r, randomly, and calculating the point coordinates,
$x = (r \cos\alpha, r \sin\alpha)$, where $r \in [-1, 1]$ and $\alpha \in [0, 2\pi)$. As you can see, the
points near the center are much denser than at the domain border. This makes
it harder for the Engine to explore the whole problem domain.
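The difference between the two sampling strategies can be made visible numerically. The sketch below is an illustration only, with hypothetical class and method names: it counts how many sampled points fall into the inner disc of radius 0.5, which covers one quarter of the unit circle area. A uniform sampler should put about 25% of the points there, while the polar construction described above puts about 50% there, because |r| < 0.5 holds for half of the uniformly chosen radii.

```java
import java.util.Random;

final class UnitCircleSampling {

    // Retry strategy: draw from the enclosing square until the point is valid.
    static double[] byRetry(final Random rnd) {
        while (true) {
            final double x = 2*rnd.nextDouble() - 1;
            final double y = 2*rnd.nextDouble() - 1;
            if (x*x + y*y <= 1) {
                return new double[] {x, y};
            }
        }
    }

    // Polar strategy: r in [-1, 1) and alpha in [0, 2*pi), always valid.
    static double[] byPolar(final Random rnd) {
        final double r = 2*rnd.nextDouble() - 1;
        final double alpha = 2*Math.PI*rnd.nextDouble();
        return new double[] {r*Math.cos(alpha), r*Math.sin(alpha)};
    }

    // Fraction of n sampled points whose distance to the center is < 0.5.
    static double innerFraction(final boolean polar, final int n, final long seed) {
        final Random rnd = new Random(seed);
        int inner = 0;
        for (int i = 0; i < n; ++i) {
            final double[] p = polar ? byPolar(rnd) : byRetry(rnd);
            if (p[0]*p[0] + p[1]*p[1] < 0.25) {
                ++inner;
            }
        }
        return (double)inner/n;
    }
}
```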
The RetryConstraint is the default implementation of the Constraint
interface, but it might not be the best one for every given problem. If it is
possible, it is better to try to repair an invalid Phenotype instead of creating a
new one. Suppose you need to optimize the fitness function, $f: \mathbb{R}^3 \to \mathbb{R}$, with
the following constraints:
$$
\begin{aligned}
x_1 + x_2 - 1 &\le 0 \\
x_2 \cdot x_3 - 0.5 &\le 0.
\end{aligned}
$$
The implementation of the newPhenotype method depends on your actual encoding
and might look like this:
Phenotype<DoubleGene, Double> newPhenotype(double[] r, long gen) {
    final Genotype<DoubleGene> gt = Genotype.of(
        DoubleChromosome.of(
            DoubleStream.of(r).boxed()
                .map(v -> DoubleGene
                    .of(v, DoubleRange.of(0, 1)))
                .collect(ISeq.toISeq())
        )
    );
    return Phenotype.of(gt, gen);
}
The following listing shows how to create a constraint, which fulfills the
desired codec property.
final InvertibleCodec<Double, DoubleGene> codec =
    Codecs.ofScalar(DoubleRange.of(0, 10));
final Constraint<DoubleGene, Double> constraint = Constraint.of(
    codec,
    v -> v < 2 || v >= 8,
    v -> {
        if (v >= 2 && v < 8) {
            return v < 5 ? ((v - 2)/3)*2 : ((8 - v)/3)*2 + 8;
        }
        return v;
    }
);
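To see what the repair function of the listing does, it helps to look at it in isolation. The following sketch copies the validity predicate and the repair expression into a plain Java class (IntervalRepair is a hypothetical name, nothing here is Jenetics API). Invalid values from [2, 5) are mapped into the valid range [0, 2), and invalid values from [5, 8) are mapped into the valid range (8, 10].

```java
final class IntervalRepair {

    // Valid values lie outside the forbidden interval [2, 8).
    static boolean isValid(final double v) {
        return v < 2 || v >= 8;
    }

    // Maps a value of the forbidden interval into one of the valid ranges;
    // valid values are returned unchanged.
    static double repair(final double v) {
        if (v >= 2 && v < 8) {
            return v < 5 ? ((v - 2)/3)*2 : ((8 - v)/3)*2 + 8;
        }
        return v;
    }
}
```

For example, 3.5 is repaired to 1.0 and 6.5 is repaired to 9.0, while a valid value such as 9.0 stays untouched.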
The Constraint defined in the Engine only fixes individuals which have
been destroyed during the evolution process. Individuals created by the
Genotype factory may still be invalid. Use the Constraint::constrain
method for creating safe Genotype factories.
2.6 Termination
Termination is the criterion by which the evolution stream decides whether
to continue or truncate the stream. This section gives a deeper insight into
the different ways of terminating or truncating the EvolutionStream. The
EvolutionStream of the Jenetics library offers an additional method for limiting
the evolution. With the limit(Predicate<EvolutionResult<G,C>>) method
it is possible to use more advanced termination strategies. If the predicate, given
All termination strategies described in the following sections are part of the
library and can be created by factory methods of the io.jenetics.engine.Limits
class. The termination strategies were tested by solving the Knapsack
problem17 (see section 5.4) with 250 items. This makes it a real problem with a
search space size of $2^{250} \approx 10^{75}$ elements.
Table 2.6.1 shows the evolution parameters used for the termination tests. To
make the tests comparable, all test runs use the same evolution parameters and
the very same set of knapsack items. Each termination test was repeated 1,000
times, which should give enough data to draw the given candlestick diagrams.
Some of the implemented termination strategies need to maintain an internal
state. These strategies can’t be reused in different evolution streams. To be
on the safe side, it is recommended to always create a new Predicate instance for
each stream. Calling Stream.limit(Limits::byTerminationStrategy) will
always work as expected.
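Why a stateful termination predicate must not be shared can be shown without any Jenetics class. The sketch below uses the JDK method Stream::takeWhile as a stand-in for the limit method of the EvolutionStream; the class StatefulLimitDemo and the predicate factory firstN are hypothetical and only serve as illustration.

```java
import java.util.function.Predicate;
import java.util.stream.Stream;

final class StatefulLimitDemo {

    // Hypothetical stateful limit: lets the first n elements pass.
    static <T> Predicate<T> firstN(final int n) {
        final int[] seen = {0};
        return t -> seen[0]++ < n;
    }

    // Counts how many elements of an endless stream pass the given limit.
    static long count(final Predicate<Integer> limit) {
        return Stream.iterate(1, i -> i + 1)
            .takeWhile(limit)
            .count();
    }
}
```

If one predicate instance is used twice, the first call of count lets three elements pass, but the second call lets none pass, since the internal counter is already exhausted. Creating a fresh predicate for every stream, e.g. count(firstN(3)), gives the expected result each time.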
itory: https://github.com/jenetics/jenetics/blob/master/jenetics.example/src/main/java/io/jenetics/example/Knapsack.java
[Figure: Fitness (y-axis) versus Generation (x-axis, logarithmic scale)]
Figure 2.6.1 shows the best fitness values of the used Knapsack problem after
a given number of generations, whereas the candlestick points represent the min,
25th percentile, median, 75th percentile and max fitness after 250 repetitions per
generation. The solid line shows the mean of the best fitness values. For a
small increase of the fitness value, the number of needed generations grows
exponentially. This is especially the case when the fitness is approaching its
maximal value.
    private C _fitness;

    public SteadyFitnessLimit(final int generations) {
        _generations = generations;
    }

    @Override
    public boolean test(final EvolutionResult<?, C> er) {
        if (!_proceed) return false;
        if (_fitness == null) {
            // First generation: remember the current best fitness.
            _fitness = er.bestFitness();
            _stable = 1;
        } else {
            final Optimize opt = er.optimize();
            if (opt.compare(_fitness, er.bestFitness()) >= 0) {
                // No improvement: count the steady generation.
                _proceed = ++_stable <= _generations;
            } else {
                // Improvement: reset the steady generation counter.
                _fitness = er.bestFitness();
                _stable = 1;
            }
        }
        return _proceed;
    }
}
Listing 2.18: Steady fitness
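The counting logic of the steady fitness strategy can be tried out in isolation. The following self-contained sketch mirrors the logic of the test method, but works on plain Double fitness values instead of EvolutionResult objects and assumes a maximization problem. The class name SteadyLimit is hypothetical and not part of the library.

```java
import java.util.function.Predicate;

final class SteadyLimit implements Predicate<Double> {
    private final int generations;
    private Double best;
    private int stable;
    private boolean proceed = true;

    SteadyLimit(final int generations) {
        this.generations = generations;
    }

    @Override
    public boolean test(final Double fitness) {
        if (!proceed) return false;
        if (best == null || fitness > best) {
            // First value or improvement: reset the steady counter.
            best = fitness;
            stable = 1;
        } else {
            // No improvement: proceed only while the counter is in range.
            proceed = ++stable <= generations;
        }
        return proceed;
    }
}
```

With new SteadyLimit(3) and the fitness sequence 1, 2, 2, 2, 2 the first four values pass and the fifth one is rejected, because the best fitness of 2 has then been seen for more than three generations in a row. Once the predicate has returned false, it stays false, which is the reason why such an instance can't be reused.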
[Figure: Total generation and Fitness versus Steady generation (logarithmic scale)]
    .limit(Limits.byExecutionTime(Duration.ofMillis(500)));
[Figure: Total generation and Fitness panels]
    .limit(Limits.byFitnessThreshold(10.5))
    .limit(5000);
[Figure: Total generation and Fitness versus Fitness threshold]
Listing 2.19 shows the factory method which creates the generic fitness
convergence predicate. This method allows you to define the evolution termination
according to the statistical moments of the short and long fitness filter.
public static <N extends Number & Comparable<? super N>>
Predicate<EvolutionResult<?, N>> byFitnessConvergence(
    final int shortFilterSize,
    final int longFilterSize,
    final double epsilon
);
Listing 2.20: Mean fitness convergence
The second factory method (shown in listing 2.20) creates a fitness convergence
predicate, which uses the moving average19 for the two filters. The smoothed
fitness value is calculated as follows:
$$\sigma_F(N) = \frac{1}{N} \sum_{i=0}^{N-1} F_{[G-i]} \qquad (2.6.1)$$
19 https://en.wikipedia.org/wiki/Moving_average
where N is the length of the filter, F[i] the fitness value at generation i and G
the current generation. If the condition
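$$\frac{\left|\sigma_F(N_S) - \sigma_F(N_L)\right|}{\delta} < \epsilon \qquad (2.6.2)$$
is fulfilled, the EvolutionStream is truncated, where δ is defined as follows:
$$\delta = \begin{cases} \max\left(\left|\sigma_F(N_S)\right|, \left|\sigma_F(N_L)\right|\right) & \text{if } \neq 0 \\ 1 & \text{otherwise} \end{cases} \qquad (2.6.3)$$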
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byFitnessConvergence(10, 30, 10E-4));
For using the fitness convergence strategy you have to specify three parameters.
The length of the short filter, NS , the length of the long filter, NL , and the
relative difference between the smoothed fitness values, ϵ.
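The convergence condition of equations (2.6.1) to (2.6.3) translates almost literally into code. The following sketch is not the library implementation; the class FitnessConvergence is a hypothetical helper which works on a plain array of best fitness values, one entry per generation, with the newest value at the end. It assumes that no decision is made before the long filter is completely filled.

```java
final class FitnessConvergence {

    // Smoothed fitness: mean of the last n fitness values.
    static double smoothed(final double[] fitness, final int n) {
        final int g = fitness.length - 1;
        double sum = 0;
        for (int i = 0; i < n; ++i) {
            sum += fitness[g - i];
        }
        return sum/n;
    }

    // Relative difference of the short and long filter compared to epsilon.
    static boolean converged(
        final double[] fitness,
        final int shortSize,
        final int longSize,
        final double epsilon
    ) {
        if (fitness.length < longSize) {
            return false; // Assumption: wait until the long filter is filled.
        }
        final double s = smoothed(fitness, shortSize);
        final double l = smoothed(fitness, longSize);
        final double max = Math.max(Math.abs(s), Math.abs(l));
        final double delta = max != 0 ? max : 1;
        return Math.abs(s - l)/delta < epsilon;
    }
}
```

A history of 30 identical fitness values is reported as converged for the filter lengths 10 and 30, whereas a steadily rising history is not, since the short filter mean is still far above the long filter mean.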
[Figure: Total generation and Fitness versus Epsilon, with N_S = 10 and N_L = 30]
Figure 2.6.5 shows the termination behavior of the fitness convergence
termination strategy. It can be seen that the minimum number of evolved generations
is the length of the long filter, NL .
Figure 2.6.6 shows the generations needed for terminating the evolution for
higher values of the NS and NL parameters.
[Figure: Total generation and Fitness versus Epsilon, with N_S = 50 and N_L = 150]
the current population is less than a user specified percentage away from the
best fitness of the current population. The population is deemed as converged
and the EvolutionStream is truncated if
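$$\frac{f_{max} - \bar{f}}{\delta} < \epsilon, \qquad (2.6.4)$$
where
$$\bar{f} = \frac{1}{N} \sum_{i=0}^{N-1} f_i, \qquad (2.6.5)$$
and
$$\delta = \begin{cases} \max\left(\left|f_{max}\right|, \left|\bar{f}\right|\right) & \text{if } \neq 0 \\ 1 & \text{otherwise} \end{cases}. \qquad (2.6.7)$$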
N denotes the number of individuals of the population.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byPopulationConvergence(0.1));
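The population convergence condition can be sketched in the same way. The helper below is hypothetical and not taken from the library: it receives the fitness values of all individuals of one population, assumes a maximization problem and evaluates the condition given by the equations above.

```java
final class PopulationConvergence {

    static boolean converged(final double[] fitness, final double epsilon) {
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        for (double f : fitness) {
            max = Math.max(max, f);
            sum += f;
        }
        final double mean = sum/fitness.length;

        // Normalization factor, falls back to 1 if both values are zero.
        final double m = Math.max(Math.abs(max), Math.abs(mean));
        final double delta = m != 0 ? m : 1;
        return (max - mean)/delta < epsilon;
    }
}
```

For the fitness values 10, 9.5, 9.8 and 9.9 the mean lies only 2% below the best value, so the population counts as converged for ϵ = 0.1. For the values 10, 1, 2 and 3 the relative distance is 0.6 and the evolution would continue.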
2.7 Reproducibility
Some problems can be defined with different kinds of fitness functions or
encodings. Which combination works best can’t usually be decided a priori. To choose
one, some testing is needed. Jenetics allows you to set up an evolution Engine
in a way that will produce the very same result on every run.
final Engine<DoubleGene, Double> engine =
    Engine.builder(fitnessFunction, codec)
        .executor(Runnable::run)
        .build();
final EvolutionResult<DoubleGene, Double> result =
    RandomRegistry.with(new Random(456), r ->
        engine.stream(population)
            .limit(100)
            .collect(EvolutionResult.toBestEvolutionResult())
    );
Listing 2.21: Reproducible evolution Engine
Listing 2.21 shows the basic setup of such a reproducible evolution Engine.
Firstly, you have to make sure that all evolution steps are executed serially.
This is done by configuring a single-threaded executor. In the simplest case
the evolution is performed solely on the main thread (Runnable::run). If the
evolution Engine uses more than one worker thread, the reproducibility is
no longer guaranteed. The second step configures the random generator the
evolution Engine is working with. Just wrap the EvolutionStream execution
in a RandomRegistry::with block. Additionally, you can start the EvolutionStream
with a predefined initial population. Once you have set up the Engine,
you can vary the fitness function and the Codec and compare the results.
If you are using user-defined implementations of the Gene and Chromosome
interface, make sure to obtain the RandomGenerator object from the
RandomRegistry. This is also required for every initialization code used in
your problem implementation. Also check your code for hidden nondeterministic
parts, e.g. the Collections::shuffle method.
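The Collections::shuffle pitfall mentioned above is easy to reproduce with the JDK alone. The one-argument variant uses a hidden default source of randomness, whereas the two-argument variant takes the random generator from the caller. The following sketch (ReproducibleShuffle is a hypothetical class name) shows the reproducible variant; in a real problem implementation the seeded Random would be replaced by a generator you control, for example one derived from the random generator used for the evolution.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ReproducibleShuffle {

    // Shuffles the values 0..9 with an explicitly seeded generator.
    static List<Integer> shuffled(final long seed) {
        final List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 10; ++i) {
            values.add(i);
        }
        // Explicit generator: the same seed gives the same permutation.
        Collections.shuffle(values, new Random(seed));
        return values;
    }
}
```

Two calls of shuffled with the same seed return the same permutation, which is not guaranteed when Collections.shuffle(values) is called without a generator.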
2.9. EVOLUTION STRATEGIES CHAPTER 2. ADVANCED TOPICS
Selector is used for creating the comparison (random search) fitness values.
[Figure: Fitness versus Generation (logarithmic scale) for the MonteCarloSelector and the evolutionary Selector]
Figure 2.8.1 shows the evolution performance of the Selector20 used by the
examples in section 2.6. The lower, blue line shows the (mean) fitness values of
the Knapsack problem when using the MonteCarloSelector for selecting the
survivors and offspring population. It can easily be seen that the performance
of the real evolutionary Selectors is much better than a random search.
Listing 2.22 shows how to configure the evolution Engine for (µ, λ)-ES. The
population size is set to λ and the survivors size to zero, since the best parents are
not part of the final population. Step three is configured by setting the offspring
selector to the TruncationSelector. Additionally, the TruncationSelector
is parameterized with µ. This lets the TruncationSelector only select the µ
best individuals, which corresponds to step two of the ES.
There are mainly three levers for the (µ, λ)-ES where we can adjust
exploration versus exploitation:[26]
• Population size λ: This parameter controls the sample size for each
population. For the extreme case, as λ approaches ∞, the algorithm would
perform a simple random search.
• Survivors size µ: This parameter controls how selective the ES is.
Relatively low µ values push the algorithm towards exploitative search,
because only the best individuals are used for reproduction.23
• Mutation probability p: A high mutation probability pushes the
algorithm toward a fairly random search, regardless of the selectivity of
µ.
must be set indirectly via the TruncationSelector parameter. This is necessary, since for the
(µ, λ)-ES, the selected best µ individuals are not part of the population of the next generation.
2.10. EVOLUTION INTERCEPTION CHAPTER 2. ADVANCED TOPICS
    .selector(new TruncationSelector<>(mu))
    .alterers(new Mutator<>(p))
    .build();
Listing 2.23: (µ + λ) Engine configuration
Since the selected µ parents are part of the next generation, the survivorsSize
property must be set to µ. This also requires setting the survivors selector to the
TruncationSelector. With the selector(Selector) method, both selectors,
the one for the survivors and the one for the offspring, can be set. Because the
best parents are also part of the next generation, the (µ + λ)-ES may be more
exploitative than the (µ, λ)-ES. This carries the risk that very fit parents defeat
other individuals over and over again, which leads to a premature convergence
to a local optimum.
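The difference between the two strategies can be illustrated with a small, library-independent sketch. Everything in it is an assumption made for illustration: the class EvolutionStrategySketch, the one-dimensional fitness function with its maximum at x = 3, and the Gaussian mutation with a fixed step size. One generation selects the µ best individuals and creates λ mutated offspring; the flag plus decides whether the parents survive.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

final class EvolutionStrategySketch {

    // Illustrative fitness function with its maximum at x = 3.
    static double fitness(final double x) {
        return -(x - 3)*(x - 3);
    }

    // One generation of a (mu + lambda)-ES (plus == true)
    // or a (mu, lambda)-ES (plus == false).
    static List<Double> step(
        final List<Double> population,
        final int mu,
        final int lambda,
        final boolean plus,
        final Random rnd
    ) {
        final Comparator<Double> byFitness =
            Comparator.comparingDouble(x -> fitness(x));

        // Truncation selection: keep the mu best individuals as parents.
        final List<Double> parents = population.stream()
            .sorted(byFitness.reversed())
            .limit(mu)
            .collect(Collectors.toList());

        final List<Double> next = new ArrayList<>();
        if (plus) {
            next.addAll(parents); // The parents survive in the plus strategy.
        }
        for (int i = 0; i < lambda; ++i) {
            final double parent = parents.get(rnd.nextInt(parents.size()));
            next.add(parent + rnd.nextGaussian()*0.5);
        }
        return next;
    }

    static double best(final List<Double> population) {
        return population.stream()
            .mapToDouble(EvolutionStrategySketch::fitness)
            .max()
            .getAsDouble();
    }
}
```

In the plus variant the best fitness can never decrease from one generation to the next, because the best parent is always carried over. In the comma variant the best individual may be lost, which keeps the search more explorative.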
The code snippet above shows the correct way for intercepting the evolution
stream. The mapper given to the Engine will change the stream of
EvolutionResults and will also feed the altered result back to the evolution Engine.
Changing the evolved EvolutionResult is a powerful tool and should be used
cautiously.
and the same individual is created by chance. You can control the number of
Genotype creation retries using the EvolutionResult::toUniquePopulation(int)
method, which allows you to define the maximal number of retries if an
individual already exists.
Chapter 3
Modules
The Jenetics library has been split into several modules, which allows keeping
the base EA module as small as possible. It currently consists of the modules
shown in table 3.0.1, including the Jenetics base module.1
Module               Artifact
io.jenetics.base     io.jenetics:jenetics:7.1.0
io.jenetics.ext      io.jenetics:jenetics.ext:7.1.0
io.jenetics.prog     io.jenetics:jenetics.prog:7.1.0
io.jenetics.xml      io.jenetics:jenetics.xml:7.1.0
io.jenetics.prngine  io.jenetics:prngine:2.0.0
With this module split, the code is easier to maintain and doesn’t force
users to depend on parts of the library they don’t need. This keeps the
io.jenetics.base module as small as possible. The additional Jenetics modules will be
described in this chapter. Figure 3.0.1 shows the dependency graph of the
Jenetics modules.
1 The used module names follow the recommended naming scheme for the JPMS automatic
modules: http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html.
3.1 io.jenetics.ext
The io.jenetics.ext module implements additional nonstandard Genes and
evolutionary operations. It also contains data structures which are used by these
additional Genes and operations.
Listing 3.1 shows the Tree interface with its basic abstract tree methods. All
other needed tree methods, e. g. for node traversal and search, are implemented
by default methods, which are derived from these four abstract tree methods. A
mutable default implementation of the Tree interface is given by the TreeNode
class.
        .attach(8)
        .attach(9));
Listing 3.2: Example TreeNode
Listing 3.2 shows the TreeNode representation of the given example tree. New
children are added by using the attach method. For the full list of Tree
methods, have a look at the Javadoc documentation.
0(1(4,5),2(6),3(7(10,11),8,9))
As you can see, nodes on the same tree level are separated by a comma, ’,’. New
tree levels are opened with an opening parenthesis ’(’ and closed with a closing
parenthesis ’)’. No additional spaces are inserted between the separator character
and the node value. Any spaces in the parentheses tree string will be part of
the node value. Figure 3.1.2 shows the syntax diagram of the parentheses tree.
The NodeValue in the diagram is the string representation of the Tree::value
object.
To get the parentheses tree representation, you just have to call
Tree::toParenthesesString. This method uses the Object::toString method for
serializing the tree node value. If you need a different string representation,
you can use the Tree::toParenthesesString(Function<? super V, String>)
method. A simple example of how to use this method is shown in the code
snippet below.
final Tree<Path, ?> tree = ...;
// Path::getFileName returns a Path, so it is converted to a String.
final String string = tree
    .toParenthesesString(p -> p.getFileName().toString());
If the string representation of the tree node value contains one of the protected
characters, ’,’, ’(’ or ’)’, they will be escaped with a ’\’ character.
final Tree<String, ?> tree = TreeNode.of("(root)")
    .attach(",", "(", ")");
The tree in the code snippet above will be represented as the following parentheses
string:
2 https://www.i-programmer.info/programming/theory/3458-parentheses-are-trees.html
3 http://evolution.genetics.washington.edu/phylip/newicktree.html
\(root\)(\,,\(,\))
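The serialization and escaping rules described above are simple enough to be sketched in a few lines. The following class is not the Jenetics TreeNode; ParenNode is a hypothetical, minimal tree type whose attach method adds the given nodes as children of the node it is called on. It only illustrates how the separator and escape rules lead to the strings shown in this section.

```java
import java.util.ArrayList;
import java.util.List;

final class ParenNode {
    final String value;
    final List<ParenNode> children = new ArrayList<>();

    ParenNode(final String value) {
        this.value = value;
    }

    static ParenNode of(final String value) {
        return new ParenNode(value);
    }

    // Adds the given nodes as children of this node and returns this node.
    ParenNode attach(final ParenNode... nodes) {
        for (ParenNode node : nodes) {
            children.add(node);
        }
        return this;
    }

    // Escapes the protected characters ',', '(' and ')' with a backslash.
    static String escape(final String value) {
        final StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == ',' || c == '(' || c == ')') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    String toParenthesesString() {
        final StringBuilder out = new StringBuilder(escape(value));
        if (!children.isEmpty()) {
            out.append('(');
            for (int i = 0; i < children.size(); ++i) {
                if (i > 0) out.append(',');
                out.append(children.get(i).toParenthesesString());
            }
            out.append(')');
        }
        return out.toString();
    }
}
```

A root node with the value "(root)" and the three children ",", "(" and ")" is serialized to the escaped string shown above.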
Serializing a tree into parentheses form is just one part of the story. It is
also possible to read back the parentheses string as tree object. The
TreeNode::parse(String) method allows you to parse a tree string back to a
TreeNode<String> object. If you need to create a tree with the original node
type, you can call the parse method with an additional string mapper function.
How you can parse a given parentheses tree string is shown in the code below.
final Tree<Integer, ?> tree = TreeNode.parse(
    "0(1(4,5),2(6),3(7(10,11),8,9))",
    Integer::parseInt
);
The code snippet above shows how to flatten a given integer tree and convert it
back to a regular tree. The first element of the flattened tree node sequence is
always the root node.
TreeFormatter.DOT Creates a tree string in the dot format, which can be used
to create nice graphs with Graphviz6.
TreeFormatter.LISP Creates a Lisp tree from a given Tree instance, e.g.
(mul (div (cos 1.0) (cos 3.14)) (sin (mul 1.0 z)))
6 https://www.graphviz.org/
Listing 3.4 shows the signature of the reduce method. The neutral array is
used in the reducer function for leaf elements. After the reduction process, one
element is returned, which might be null for empty trees. The following code
snippet shows how to use the Tree::reduce method for evaluating a simple
arithmetic expression tree.
final Tree<String, ?> formula = TreeNode.parse(
    "add(sub(6,div(230,10)),mul(5,6))"
);
final double result = formula.reduce(new Double[0], (op, args) ->
    switch (op) {
        case "add" -> args[0] + args[1];
        case "sub" -> args[0] - args[1];
        case "mul" -> args[0] * args[1];
        case "div" -> args[0] / args[1];
        default -> Double.parseDouble(op);
    }
);
3.1.2 Rewriting
Tree rewriting is a synonym for term rewriting, i.e., the process of transforming
trees (tree structured data) into other trees by applying rewriting rules. Rewriting
trees is not necessarily deterministic. One rewrite rule can be applied in many
different ways to a term, or more than one rule can be applicable to a tree
node. The rewriting system implementation in Jenetics is currently used for
simplifying program trees, which are evolved in genetic programming problems
(see sections 3.2 and 3.2.4). A good introduction to tree/term rewriting systems
can be found in [3].
Definition. (Tree rewrite rule): A tree rewrite rule is a pair of terms (sub
trees), l → r. The notation indicates that the left-hand side, l, can be replaced
by the right-hand side, r.
Listing 3.5 shows the constructor and main methods of the TreePattern. The
matcher method is used for the left-hand side of a rewrite rule and the expand
method for the right-hand side. How to create a simple tree pattern is shown in
the code snippet below.
final Tree<Decl<String>, ?> t = TreeNode
    .<Decl<String>>of(new Val<>("add"))
    .attach(new Var<>("x"), new Val<>("1"));
final TreePattern<String> p = new TreePattern<>(t);
assert p.matcher(TreeNode.parse("add(sub(x,y),1)")).matches();
You can see that the variable x will match arbitrary sub trees. For more
complicated patterns it is quite cumbersome to create them via a Decl tree. Usually
you will create a TreePattern object by compiling a proper pattern string.
For creating the same pattern as in the example above you can write
TreePattern.compile("add($x,1)"). The base syntax for the tree pattern follows
the parentheses tree DSL described in 3.1.1.2. It only differs in the declaration of
tree variables, which start with a ’$’ and must be a valid Java identifier. If you
want to match non-string trees you must specify an additional mapper function
with the compile method.
final TreePattern<Integer> pattern = TreePattern
    .compile("0($x,1)", Integer::parseInt);
The right-hand side functionality of the rewrite rule is used to expand a given
pattern. For expanding a given pattern you have to deliver a Var to sub tree
mapping.
final TreePattern<String> pattern = TreePattern
    .compile("add($x,$y,1)");
final Map<Var<String>, Tree<String, ?>> vars = new HashMap<>();
vars.put(new TreePattern.Var<>("x"), TreeNode.parse("sin(x)"));
vars.put(new TreePattern.Var<>("y"), TreeNode.parse("sin(y)"));

final Tree<String, ?> tree = pattern.expand(vars);
assert tree.toParenthesesString().equals("add(sin(x),sin(y),1)");
7 https://en.wikipedia.org/wiki/Algebraic_data_type
The example above defines a tree rewrite system with four rewrite rules, which
are applied in the given order. Each rule is applied until the given tree stays
unchanged. This also means that the termination of the TRS can’t be guaranteed.
It’s mainly your responsibility to create a rewrite system which will always
terminate. If you are not sure whether the system is terminating or not, you
should call the TreeRewriter.rewrite(TreeNode, int) method, which also
takes the maximal number of times the rules should be applied to the input tree.
final TreeNode<String> t = parse("add(S(0),S(mul(S(0),S(S(0)))))");
trs.rewrite(t);
assert t.equals(parse("S(S(S(S(0))))"));
Since the given tree rewrite system is terminating, we can safely apply the TRS
to add(S(0),S(mul(S(0),S(S(0))))), which will then be rewritten to S(S(S(S(0)))).
3.1.3 Genes
3.1.3.1 BigInteger gene
The BigIntegerGene implements the NumericGene interface and can be used
when the range of the existing LongGene or DoubleGene is not enough. Its
allele type is a BigInteger, which can store arbitrary precision integers. There
also exists a corresponding BigIntegerChromosome.
3.1.4 Operators
Simulated binary crossover The SimulatedBinaryCrossover performs
the simulated binary crossover (SBX) on NumericChromosomes such that each
position is either crossed contracted or expanded with a certain probability. The
probability distribution is designed such that the children will lie closer to their
parents than is the case with the single point binary crossover. It is implemented
as described in [16].
8 https://en.wikipedia.org/wiki/Weasel_program
9 The classes are located in the io.jenetics.ext module.
The search space of the 28 character long target string is $27^{28} \approx 10^{40}$. If the
monkey writes 1,000,000 different sentences per second, it would take about $10^{26}$
years (on average) to write the correct one. Although Dawkins did not provide
the source code for his program, a »Weasel« style algorithm could run as follows:
            .stream()
            .limit(byFitnessThreshold(TARGET.length() - 1))
            .peek(r -> System.out.println(
                r.totalGenerations() + ": " +
                r.bestPhenotype()))
            .collect(toBestPhenotype());
        System.out.println(result);
    }
}
Listing 3.7: Weasel program
Listing 3.7 shows how to implement the WeaselProgram with Jenetics. Steps (1)
and (2) of the algorithm are done implicitly when the initial population is created.
The third step is done by the WeaselMutator, with a mutation probability of 0.05.
Step (4) is done by the WeaselSelector together with the configured offspring
fraction of one. The EvolutionStream is limited by the Limits.byFitnessThreshold,
which is set to $score_{max} - 1$. In the current example this value is
set to TARGET.length() - 1 = 27.
1: [ UBNHLJUS RCOXR LFIYLAWRDCCNY ] --> 6
2: [ UBNHLJUS RCOXR LFIYLAWWDCCNY ] --> 7
3: [ UBQHLJUS RCOXR LFIYLAWWECCNY ] --> 8
5: [ UBQHLJUS RCOXR LFICLAWWECCNL ] --> 9
6: [ W QHLJUS RCOXR LFICLA WEGCNL ] --> 10
7: [ W QHLJKS RCOXR LFIHLA WEGCNL ] --> 11
8: [ W QHLJKS RCOXR LFIHLA WEGSNL ] --> 12
9: [ W QHLJKS RCOXR LFIS A WEGSNL ] --> 13
10: [ M QHLJKS RCOXR LFIS A WEGSNL ] --> 14
11: [ MEQHLJKS RCOXR LFIS A WEGSNL ] --> 15
12: [ MEQHIJKS ICOXR LFIN A WEGSNL ] --> 17
14: [ MEQHINKS ICOXR LFIN A WEGSNL ] --> 18
16: [ METHINKS ICOXR LFIN A WEGSNL ] --> 19
18: [ METHINKS IMOXR LFKN A WEGSNL ] --> 20
19: [ METHINKS IMOXR LIKN A WEGSNL ] --> 21
20: [ METHINKS IMOIR LIKN A WEGSNL ] --> 22
23: [ METHINKS IMOIR LIKN A WEGSEL ] --> 23
26: [ METHINKS IMOIS LIKN A WEGSEL ] --> 24
27: [ METHINKS IM IS LIKN A WEHSEL ] --> 25
32: [ METHINKS IT IS LIKN A WEHSEL ] --> 26
42: [ METHINKS IT IS LIKN A WEASEL ] --> 27
46: [ METHINKS IT IS LIKE A WEASEL ] --> 28
The (shortened) output of the Weasel program (listing 3.7) shows that the
optimal solution is reached in generation 46.
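For comparison, a Weasel style search can also be written without any library support. The following sketch is not the Jenetics implementation and makes a few assumptions of its own: the class WeaselSketch is hypothetical, every generation creates a fixed number of mutated copies of the current best string, and the parent is kept if no copy is better, so that the score never decreases.

```java
import java.util.Random;

final class WeaselSketch {
    static final String TARGET = "METHINKS IT IS LIKE A WEASEL";
    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ";

    // Number of characters which are equal to the target string.
    static int score(final String s) {
        int matches = 0;
        for (int i = 0; i < TARGET.length(); ++i) {
            if (s.charAt(i) == TARGET.charAt(i)) ++matches;
        }
        return matches;
    }

    static char randomChar(final Random rnd) {
        return ALPHABET.charAt(rnd.nextInt(ALPHABET.length()));
    }

    // Replaces every character with probability p by a random one.
    static String mutate(final String parent, final double p, final Random rnd) {
        final char[] chars = parent.toCharArray();
        for (int i = 0; i < chars.length; ++i) {
            if (rnd.nextDouble() < p) chars[i] = randomChar(rnd);
        }
        return new String(chars);
    }

    static String evolve(
        final Random rnd,
        final int offspring,
        final double p,
        final int maxGenerations
    ) {
        final char[] start = new char[TARGET.length()];
        for (int i = 0; i < start.length; ++i) start[i] = randomChar(rnd);

        String best = new String(start);
        for (int gen = 0;
            gen < maxGenerations && score(best) < TARGET.length();
            ++gen)
        {
            String candidate = best; // Keeping the parent avoids regressions.
            for (int i = 0; i < offspring; ++i) {
                final String child = mutate(best, p, rnd);
                if (score(child) > score(candidate)) candidate = child;
            }
            best = candidate;
        }
        return best;
    }
}
```

A call like WeaselSketch.evolve(new Random(42), 100, 0.05, 100000) uses the same mutation probability of 0.05 as the listing above; the number of offspring and the generation cap are arbitrary choices for this sketch.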
3.1.6.1 ConcatEngine
The ConcatEngine class allows for creating more than one Engine with different
configurations, and combining them into one EvolutionStreamable (Engine).
    ConcatEngine.of(
        engine1.limit(50),
        engine2.limit(() -> Limits.bySteadyFitness(30)))
    .stream()
    .collect(EvolutionResult.toBestGenotype());
A practical use case for the Engine concatenation is when you want to do a
broader exploration of the search space at the beginning and narrow it with the
following Engine. In such a setup, the first Engine would be configured with a
Mutator with a relatively big mutation probability. The mutation probabilities
of the following Engines would then be gradually reduced.
3.1.6.2 CyclicEngine
The CyclicEngine is similar to the ConcatEngine. Whereas the ConcatEngine
stops the evolution when the EvolutionStream of the last engine terminates,
the CyclicEngine continues with a new EvolutionStream from the first Engine.
The evolution flow of the CyclicEngine is shown in figure 3.1.6.
The reason for using a cyclic EvolutionStream is similar to the reason for using a
concatenated EvolutionStream. It allows you to do a broad search, followed by
a narrowed exploration. This cycle is then repeated until the limiting predicate
of the outer stream terminates the evolution process.
criteria which are usually in conflict with each other. Hence, the
term »optimize« means finding such a solution which would give the
values of all the objective functions acceptable to the decision maker.
[34]
There are several ways for solving multiobjective problems. An excellent
theoretical foundation is given in [10]. The algorithms implemented by Jenetics are
based on Pareto optimality as described in [18], [15] and [23].
1. irreflexive: u ⊁ u,
2. transitive: u ≻ v ∧ v ≻ w ⇒ u ≻ w and
3. asymmetric: u ≻ v ⇒ v ⊁ u .
The io.jenetics.ext.moea package contains the classes needed for doing
multi-objective optimization. One of the central types is the Vec interface, which
allows you to wrap a vector of any element type into a Comparable.
public interface Vec<T> extends Comparable<Vec<T>> {
    T data();
    int length();
    ElementComparator<T> comparator();
    ElementDistance<T> distance();
    Comparator<T> dominance();
}
Listing 3.9: Vec interface
Listing 3.9 shows the necessary methods of the Vec interface. These methods are
sufficient to do all the optimization calculations. The data() method returns the
underlying vector type, like double[] or int[]. With the ElementComparator,
which is returned by the comparator() method, it is possible to compare single
elements of the vector type T. This is similar to the ElementDistance function,
returned by the distance() method, which calculates the distance of two
vector elements. The last method, dominance(), returns the Pareto dominance
comparator, ≻. Since it is quite bothersome to implement all these
methods, the Vec interface comes with a set of factory methods, which allow
for creating Vec instances for some primitive array types.
final Vec<int[]> ivec = Vec.of(1, 2, 3);
final Vec<long[]> lvec = Vec.of(1L, 2L, 3L);
final Vec<double[]> dvec = Vec.of(1.0, 2.0, 3.0);
For efficiency reasons, the primitive arrays are not copied when the Vec instance
is created. This lets you, theoretically, change the value of a created Vec instance,
which will lead to unexpected results.
Although the Vec interface extends the Comparable interface, it violates its
general contract. It only implements the Pareto dominance relation, which
defines a partial order. Trying to sort a list of Vec objects might lead to
an exception (thrown by the sorting method) at runtime.
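Why the Pareto dominance relation is only a partial order, and therefore a poor fit for sorting, can be seen with a small stand-alone comparison function. The sketch below is not the Vec implementation of the library; ParetoDominance is a hypothetical helper for double arrays of equal length, where bigger component values are considered better.

```java
final class ParetoDominance {

    // Returns 1 if u dominates v, -1 if v dominates u, and 0 if the
    // vectors are equal or incomparable (assuming maximization).
    static int dominance(final double[] u, final double[] v) {
        boolean uBetter = false;
        boolean vBetter = false;
        for (int i = 0; i < u.length; ++i) {
            if (u[i] > v[i]) {
                uBetter = true;
            } else if (u[i] < v[i]) {
                vBetter = true;
            }
        }
        if (uBetter == vBetter) {
            return 0;
        }
        return uBetter ? 1 : -1;
    }
}
```

For a = (1, 4), b = (3, 2) and c = (2, 5) the comparisons of a with b and of b with c both return 0, but c dominates a. A sorting algorithm which treats the result 0 as "equal" can't deal with such a relation consistently, which is the contract violation mentioned above.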
        engine.stream()
            .limit(100)
            .collect(MOEA.toParetoSet(IntRange.of(30, 50)));
Since there exists a potentially infinite number of Pareto optimal solutions,
you have to define the desired number of set elements. This is done with an
IntRange object, where you can specify the minimal and maximal set size. The
example above will return a Pareto set with a size in the range of [30, 50).
For reducing the Pareto set size, the distance between two vector elements is
taken into account. Points which lie very close to each other are removed. This
leads to a result where the Pareto optimal solutions are, more or less, evenly
distributed over the whole Pareto front. The crowding distance¹¹ measure is used
for calculating the proximity of two points; it is described in [10] and [18].
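The idea behind the crowding distance can be sketched in a few lines of plain Java. The code below follows the common NSGA-II formulation (sum of the normalized neighbor distances per objective, boundary points get an infinite distance). This formulation is an assumption for illustration; the library's exact computation may differ, and the names CrowdingDistance, distances and FRONT are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class CrowdingDistance {
    private CrowdingDistance() {}

    // Example front: the second point lies very close to the first one.
    static final double[][] FRONT = {
        {0.0, 1.0}, {0.1, 0.9}, {0.5, 0.5}, {1.0, 0.0}
    };

    /** Returns the crowding distance of every point of the given front. */
    static double[] distances(final double[][] points) {
        final int n = points.length;
        final double[] result = new double[n];
        if (n == 0) return result;

        final int objectives = points[0].length;
        for (int k = 0; k < objectives; ++k) {
            final int obj = k;
            // Indexes of the points, sorted by the current objective.
            final Integer[] idx = new Integer[n];
            for (int i = 0; i < n; ++i) idx[i] = i;
            Arrays.sort(idx, Comparator.comparingDouble(i -> points[i][obj]));

            // Boundary points are always kept.
            result[idx[0]] = Double.POSITIVE_INFINITY;
            result[idx[n - 1]] = Double.POSITIVE_INFINITY;

            final double range = points[idx[n - 1]][obj] - points[idx[0]][obj];
            if (range == 0) continue;
            for (int i = 1; i < n - 1; ++i) {
                result[idx[i]] +=
                    (points[idx[i + 1]][obj] - points[idx[i - 1]][obj])/range;
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(Arrays.toString(distances(FRONT)));
    }
}
```

For the example front, the point (0.1, 0.9) gets a smaller distance than (0.5, 0.5), so it would be removed first when the Pareto set has to be reduced.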
So far we have described the multi-objective result type (Vec) and the final
collecting of the Pareto optimal solutions. So let's create a simple
multi-objective problem and an appropriate Engine.
    final Problem<double[], DoubleGene, Vec<double[]>> problem =
        Problem.of(
            v -> Vec.of(v[0]*cos(v[1]) + 1, v[0]*sin(v[1]) + 1),
            Codecs.ofVector(
                DoubleRange.of(0, 1),
                DoubleRange.of(0, 2*PI)
            )
        );

    final Engine<DoubleGene, Vec<double[]>> engine =
        Engine.builder(problem)
            .offspringSelector(new TournamentSelector<>(4))
            .survivorsSelector(UFTournamentSelector.ofVec())
            .build();
The fitness function in the example problem above creates 2D points which all
lie within a circle with center (1, 1). Figure 3.1.8 shows what the resulting
solution looks like. There is almost no difference in creating an evolution
Engine for single- or multi-objective optimization. You only have to take care
to choose the right Selector. Not all Selectors work for multi-objective
optimization. This includes all Selectors which need a Number fitness type and
those where the population needs to be sorted¹². A Selector which works fine in
a multi-objective setup is the TournamentSelector. Additionally, you can use one
of the special MO selectors: NSGA2Selector and UFTournamentSelector.
NSGA2 selector This selector selects the first elements of the population,
which has been sorted by the crowded-comparison operator (equation 3.1.1),
$\succ_n$, as described in [15]:

$$ i \succ_n j \quad \text{if} \quad (i_{rank} < j_{rank}) \lor \big((i_{rank} = j_{rank}) \land i_{dist} > j_{dist}\big), \tag{3.1.1} $$
where $i_{rank}$ denotes the non-domination rank of individual $i$ and
$i_{dist}$ the crowding distance of individual $i$.

11 The crowding distance value of a solution provides an estimate of the density of solutions
surrounding that solution. The crowding distance value of a particular solution is the average
distance of its two neighboring solutions.
https://www.igi-global.com/dictionary/crowding-distance/42740.
12 Since the ≻ relation doesn't define a total order, sorting the population will lead to an
IllegalArgumentException at runtime.
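The crowded-comparison operator translates directly into a comparator. The following sketch assumes that rank and crowding distance have already been calculated for every individual; the record Ranked and the names precedes, BEST_FIRST and select are hypothetical and only serve this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CrowdedComparison {
    private CrowdedComparison() {}

    /** An individual with its non-domination rank and crowding distance. */
    record Ranked(String name, int rank, double dist) {}

    /** The relation of equation 3.1.1: i precedes j. */
    static boolean precedes(final Ranked i, final Ranked j) {
        return i.rank() < j.rank() ||
            (i.rank() == j.rank() && i.dist() > j.dist());
    }

    /** Orders the preferred individuals first. */
    static final Comparator<Ranked> BEST_FIRST =
        (i, j) -> precedes(i, j) ? -1 : (precedes(j, i) ? 1 : 0);

    /** Selects the names of the first 'count' individuals. */
    static List<String> select(final List<Ranked> population, final int count) {
        final List<Ranked> sorted = new ArrayList<>(population);
        sorted.sort(BEST_FIRST);

        final List<String> names = new ArrayList<>();
        for (Ranked r : sorted.subList(0, Math.min(count, sorted.size()))) {
            names.add(r.name());
        }
        return names;
    }

    public static void main(final String[] args) {
        final List<Ranked> population = List.of(
            new Ranked("a", 1, 0.5),
            new Ranked("b", 0, 0.2),
            new Ranked("c", 0, 1.5),
            new Ranked("d", 1, Double.POSITIVE_INFINITY)
        );
        System.out.println(select(population, 3));
    }
}
```

Individuals of a lower rank always come first. Within the same rank, the individual from the less crowded region (larger distance) is preferred.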
3.1.7.3 Termination
Most of the existing termination strategies, implemented in the Limits class,
presume a total order of the fitness values. This assumption holds for
single-objective optimization problems, but not for multi-objective problems.
Only termination strategies which don't rely on a total order of the fitness
values can be used safely. The following termination strategies can be used for
multi-objective problems:
• Limits::byFixedGeneration,
• Limits::byExecutionTime and
• Limits::byGeneConvergence.
All other strategies don't have a well-defined termination behavior.
Instead of creating the solution Vec instances directly, the fitness function
must create them with a properly configured VecFactory instance.
    final VecFactory<double[]> factory = VecFactory.ofDoubleVec(
        Optimize.MINIMUM,
        Optimize.MAXIMUM,
        Optimize.MINIMUM
    );

    Vec<double[]> fitness(final double[] point) {
        final double x = point[0];
        final double y = point[1];
        return factory.newVec(new double[] {
            sin(x)*y,
            cos(y)*x,
            x + y
        });
    }
The example code above shows how the VecFactory must be configured to
create Vec<double[]> objects with the desired optimization properties. In the
fitness function you then use the VecFactory instance for creating the fitness
values instead of the Vec::of(double...) factory method. The optimization
direction of the evolution Engine remains at its default value,
Optimize.MAXIMUM. If you configure the Engine for minimization, the optimization
directions configured in the VecFactory are reversed. That means the first
objective will be maximized instead of minimized, and so on.
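How per-objective optimization directions affect the dominance relation can be shown with a small stand-alone sketch. It does not use the VecFactory class; the enum Direction and the names DirectedDominance, EXAMPLE and dominance are assumptions made for this illustration.

```java
final class DirectedDominance {
    private DirectedDominance() {}

    enum Direction { MINIMUM, MAXIMUM }

    // Same configuration as in the listing above: minimize, maximize, minimize.
    static final Direction[] EXAMPLE = {
        Direction.MINIMUM, Direction.MAXIMUM, Direction.MINIMUM
    };

    /**
     * Returns 1 if u dominates v, -1 if v dominates u and 0 otherwise.
     * The comparison of a minimized objective is simply inverted.
     */
    static int dominance(
        final Direction[] directions,
        final double[] u,
        final double[] v
    ) {
        boolean uBetter = false;
        boolean vBetter = false;
        for (int i = 0; i < directions.length; ++i) {
            int c = Double.compare(u[i], v[i]);
            if (directions[i] == Direction.MINIMUM) c = -c;
            if (c > 0) uBetter = true;
            else if (c < 0) vBetter = true;
        }
        if (uBetter == vBetter) return 0;
        return uBetter ? 1 : -1;
    }

    public static void main(final String[] args) {
        final double[] u = {1, 5, 1};
        final double[] v = {2, 4, 2};
        System.out.println(dominance(EXAMPLE, u, v));
    }
}
```

With this configuration (1, 5, 1) dominates (2, 4, 2): it is smaller in the two minimized objectives and larger in the maximized one.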
[Figure: Mapping (Codec), Rules (CFG), Program (Tree), Executed program]
of the grammar.
expressions to create production rules. The names of the production rules are
put within angle brackets, <...>, and the alternatives are separated by vertical
bars, |. A simple BNF production rule might look like this:
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In this example num is the name of the production rule and the values 1..9
represent the terminal symbols, associated with the rule. If a non-terminal
symbol appears on the right hand side, it means that there will be another
production rule (or a set of rules) to define its replacement.
<expr> ::= <num> | <var> | <expr> <op> <expr>
This shows that expr is either a num, a var, or two expr concatenated by an op.
All components are non-terminals, therefore further production rules are required.
<op> ::= + | - | * | /
<var> ::= x | y
These two rules define four op terminals, {+, -, *, /}, and two var terminals,
{x, y}. Production rules for the syntax of a language might come as a large set
of BNF statements that specify how every aspect of the language is defined.
Every non-terminal symbol on the right hand side of a production rule must have
a rule that has the symbol on the left side. This continues until everything
can be specified in relation to terminal symbols. Here is the whole grammar,
which defines a language for simple arithmetic expressions:
<expr> ::= <num> | <var> | <expr> <op> <expr>
<op> ::= + | - | * | /
<var> ::= x | y
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The symbol of the first rule, expr, will automatically serve as start symbol of
the grammar, G, defined by the given BNF.
2. n ← L[min{i}] ∈ N
   Pick the leftmost non-terminal symbol, L[min{i}], from the current
   sentence list, L.
3. E ← R(n)[rand]
   Get the rule, R(n), for the chosen non-terminal, n, and select a random
   rule alternative, E. E will contain one or more terminal and/or
   non-terminal symbols.
4. L[min{i}] ← E
   Replace the chosen non-terminal, n, with the selected symbols, E.
5. Repeat steps 2..4 until the symbol list, L, contains only terminals.
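The steps above can be sketched in plain Java for the arithmetic expression grammar. The initialization step is not part of this excerpt, so starting with a list that contains only the start symbol is an assumption. The grammar representation as a Map, the length limit and all names (LeftmostDerivation, RULES, generate, isTerminalOnly) are hypothetical and not the library's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class LeftmostDerivation {
    private LeftmostDerivation() {}

    // Rule name -> list of alternatives; every alternative is a symbol list.
    static final Map<String, List<List<String>>> RULES = Map.of(
        "<expr>", List.of(
            List.of("<num>"), List.of("<var>"),
            List.of("<expr>", "<op>", "<expr>")),
        "<op>", List.of(
            List.of("+"), List.of("-"), List.of("*"), List.of("/")),
        "<var>", List.of(List.of("x"), List.of("y")),
        "<num>", List.of(
            List.of("1"), List.of("2"), List.of("3"),
            List.of("4"), List.of("5"), List.of("6"),
            List.of("7"), List.of("8"), List.of("9"))
    );

    /** Returns the sentence, or an empty list if it exceeds the limit. */
    static List<String> generate(
        final String start,
        final Random random,
        final int limit
    ) {
        final List<String> sentence = new ArrayList<>(List.of(start));
        while (true) {
            // Step 2: pick the leftmost non-terminal.
            int pos = -1;
            for (int i = 0; i < sentence.size() && pos < 0; ++i) {
                if (RULES.containsKey(sentence.get(i))) pos = i;
            }
            // Step 5: stop if only terminals are left.
            if (pos < 0) return sentence;
            if (sentence.size() > limit) return List.of();

            // Step 3: select a random alternative of the rule.
            final var alternatives = RULES.get(sentence.get(pos));
            final var expansion =
                alternatives.get(random.nextInt(alternatives.size()));

            // Step 4: replace the non-terminal with the selected symbols.
            sentence.remove(pos);
            sentence.addAll(pos, expansion);
        }
    }

    static boolean isTerminalOnly(final List<String> sentence) {
        return sentence.stream().noneMatch(RULES::containsKey);
    }

    public static void main(final String[] args) {
        System.out.println(
            String.join(" ", generate("<expr>", new Random(), 1_000)));
    }
}
```

The length limit guards against sentences which grow without bound, since the recursive alternative of the expr rule can be selected repeatedly.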
3.1.8.4 Mapping
Now we have everything in place for the last step: creating a sentence from a
given chromosome. The original paper [37] uses a bit string for creating
a sentence from a given chromosome and calls this process mapping. Since
we have already described how to create a sentence from a grammar, we only have
to describe the last missing part of the process, the selection of alternative
symbols from a given rule. This is step three in the described algorithms 3.1
and 3.2. Instead of selecting a random alternative, the index of the selected
alternative must be determined by the given chromosome.

2. n ← L[max{i}] ∈ N
   Pick the rightmost non-terminal symbol, L[max{i}], from the current
   sentence list, L.
3. E ← R(n)[rand]
   Get the rule, R(n), for the chosen non-terminal, n, and select a random
   rule alternative, E. E will contain one or more terminal and/or
   non-terminal symbols.
4. L[max{i}] ← E
   Replace the chosen non-terminal, n, with the selected symbols, E.
5. Repeat steps 2..4 until the symbol list, L, contains only terminals.
[Figure 3.1.11: Codons. The bit string
00000101|10110001|01110000|11101001|01111111|01000010
is split into the codons 005|177|112|233|127|066.]
Figure 3.1.11 shows the GE mapping process. The binary string is split into
8-bit chunks and interpreted as unsigned integers, with values in the range of
[0, 256). These 8-bit chunks are called codons. Whenever a new rule alternative
must be selected, a new codon is read from the chromosome. A simple modulo
operation is used to get an index within the desired range. If more codons are
needed during the mapping process than available in the input chromosome,
the reading of the codons wraps around and starts at the beginning of the
chromosome again.
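The mapping process can be traced with a small stand-alone sketch, which decodes the bit string of the figure into codons and uses them for a leftmost derivation of the arithmetic expression grammar. The second chunk, 10110001, decodes to 177. The code is an illustration under the assumption that one codon is consumed per expanded non-terminal; the names CodonMapping, RULES, EXAMPLE_BITS, decode and map are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class CodonMapping {
    private CodonMapping() {}

    static final String EXAMPLE_BITS =
        "00000101" + "10110001" + "01110000" +
        "11101001" + "01111111" + "01000010";

    static final Map<String, List<List<String>>> RULES = Map.of(
        "<expr>", List.of(
            List.of("<num>"), List.of("<var>"),
            List.of("<expr>", "<op>", "<expr>")),
        "<op>", List.of(
            List.of("+"), List.of("-"), List.of("*"), List.of("/")),
        "<var>", List.of(List.of("x"), List.of("y")),
        "<num>", List.of(
            List.of("1"), List.of("2"), List.of("3"),
            List.of("4"), List.of("5"), List.of("6"),
            List.of("7"), List.of("8"), List.of("9"))
    );

    /** Splits the bit string into 8-bit chunks, read as unsigned integers. */
    static int[] decode(final String bits) {
        if (bits.length()%8 != 0) {
            throw new IllegalArgumentException("Length must be a multiple of 8.");
        }
        final int[] codons = new int[bits.length()/8];
        for (int i = 0; i < codons.length; ++i) {
            codons[i] = Integer.parseInt(bits.substring(i*8, i*8 + 8), 2);
        }
        return codons;
    }

    /** Returns the sentence, or an empty list if it exceeds the limit. */
    static List<String> map(
        final String start,
        final int[] codons,
        final int limit
    ) {
        final List<String> sentence = new ArrayList<>(List.of(start));
        int read = 0;
        while (sentence.size() <= limit) {
            // Find the leftmost non-terminal.
            int pos = -1;
            for (int i = 0; i < sentence.size() && pos < 0; ++i) {
                if (RULES.containsKey(sentence.get(i))) pos = i;
            }
            if (pos < 0) return sentence;

            // Read the next codon (wrapping around) and apply the modulo.
            final var alternatives = RULES.get(sentence.get(pos));
            final int codon = codons[read++ % codons.length];
            final int index = codon % alternatives.size();

            sentence.remove(pos);
            sentence.addAll(pos, alternatives.get(index));
        }
        return List.of();
    }

    public static void main(final String[] args) {
        final int[] codons = decode(EXAMPLE_BITS);
        System.out.println(String.join(" ", map("<expr>", codons, 100)));
    }
}
```

For the example codons the selected alternatives are 5 mod 3 = 2, 177 mod 3 = 0, 112 mod 9 = 4, 233 mod 4 = 1, 127 mod 3 = 1 and 66 mod 2 = 0, which leads to the sentence "5 - x" in this sketch.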
Let's start with the Cfg class and how to create grammars. The Cfg class comes
with a set of factory methods, which let you create Cfg objects quite easily.
Our already known arithmetic expression grammar can be created with the
following code. The used factory methods, N, T, E, R, were statically imported.
    final Cfg<String> cfg = Cfg.of(
        R("expr",
            E(N("num")), E(N("var")),
            E(N("expr"), N("op"), N("expr"))
        ),
        R("op", E(T("+")), E(T("-")), E(T("*")), E(T("/"))),
        R("var", E(T("x")), E(T("y"))),
        R("num",
            E(T("1")), E(T("2")), E(T("3")),
            E(T("4")), E(T("5")), E(T("6")),
            E(T("7")), E(T("8")), E(T("9"))
        )
    );
As an alternative, the above CFG can also be created from a string in BNF form,
as described in section 3.1.8.2 on page 101. The code snippet below shows how
to do this.
    final Cfg<String> cfg = Bnf.parse("""
        <expr> ::= <num> | <var> | <expr> <op> <expr>
        <op>   ::= + | - | * | /
        <var>  ::= x | y
        <num>  ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
        """
    );
18 See section 2.3 on page 54.
The SymbolIndex interface is responsible for selecting the index of the rule
alternative to choose. The reason for this interface is to decouple the selection
algorithm from the data structure, which determines the selection; it should
decouple the algorithm from the underlying BitChromosome or IntegerChromosome.
    @FunctionalInterface
    public interface SymbolIndex {
        int next(Cfg.Rule<?> rule, int bound);
    }
Listing 3.11: SymbolIndex interface
Listing 3.11 shows the single method of the SymbolIndex interface. It takes
the rule for which an alternative is selected and the desired index bound, and
returns the selected symbol index. The rule parameter of the next function
allows you to use different index selection strategies, or different index
sources, for the different rules. This interface makes it possible to let the
sentence creation be controlled by an IntegerChromosome instead of the classical
bit string.
The Codons class implements the SymbolIndex interface and can be created
from BitChromosome or IntegerChromosome instances. Codons created from
a BitChromosome give you an implementation of the classic symbol selection
algorithm.
    // Create codons backed by a binary string.
    var bch = BitChromosome.of(100*8);
    var bcodons = Codons.ofBitGenes(bch);
    // Create codons backed by an integer string.
    var ich = IntegerChromosome.of(IntRange.of(0, 256), 100);
    var icodons = Codons.ofIntegerGenes(ich);
The code snippet above shows two ways for creating essentially the same codons.
In the first variant, the codons are read from a BitChromosome with a length
of 800, which results in 100 different indexes readable from the created Codons
object. In the second variant an IntegerChromosome, with a value range of
[0, 256) and a length of 100, is used for the Codons object. Using an
IntegerChromosome instead of a BitChromosome gives you greater flexibility, as
it allows you to specify an explicit range for the codon values.
Implementations of the functional Generator interface are responsible for
creating sentences from a given grammar. It takes a Cfg object as input and
creates a generic result object of type R. This genericity lets you use the same
interface for different result types, like List<Symbol<String>> or
Tree<Symbol<String>, ?>.
    @FunctionalInterface
    public interface Generator<T, R> {
        R generate(Cfg<? extends T> cfg);
    }
Listing 3.12: Generator interface
Listing 3.12 shows the definition of the Generator interface. It takes a Cfg
object as input and returns the created result sentence of type R. The kind of
In the code snippet above you can see how to create a SentenceGenerator which
creates random sentences with a maximal length of 1000. The generated
sentence is a list of terminal symbols, but it is quite easy to convert it to a
string. The DerivationTreeGenerator can be used similarly. Instead of
a list of terminals, it creates a Tree<Symbol<String>, ?>.
The factory methods of the Mappings class put it all together and let you
easily specify the needed parts for the mapping process. Jenetics already has
an interface for expressing such a mapping process, the Codec¹⁹ interface. So
no additional interface or class was introduced and the Mappings factories
return Codec instances instead. In the code snippet below you can see how to
create the classical mapping, with a single BitChromosome used for the rule
symbol selection. It is created via the Mappers::singleBitChromosomeMapper
method.
    final Codec<List<Terminal<String>>, BitGene> codec =
        Mappers.singleBitChromosomeMapper(
            cfg,
            // Length of the used BitChromosome.
            100*8,
            // The used generator, created from SymbolIndex.
            index -> new SentenceGenerator<>(index, 1_000)
        );
For creating this codec you must specify the CFG, the length of the chromosome
and the sentence Generator. The Generator is given as a factory function, which
takes a SymbolIndex as input. The codec created in the code snippet below
is essentially the same as the one created with the
Mappers::singleBitChromosomeMapper method. It only differs in the way the codons
are created. Using an IntegerChromosome as codon source gives you greater
flexibility and allows you to change the value range of the created codons.
    final Codec<List<Terminal<String>>, IntegerGene> codec =
        Mappers.singleIntegerChromosomeMapper(
            cfg,
            // Value range of created codons.
            IntRange.of(0, 256),
            // Length of the used IntegerChromosome.
            100,
            index -> new SentenceGenerator<>(index, 1_000)
        );
19 See section 2.3 on page 54.
Besides the classical mapping algorithms, Jenetics also contains a novel method
for genotype to sentence mapping. This new approach uses a separate
IntegerChromosome for every Cfg.Rule. This allows you to align the value range
of the chromosome with the number of alternatives of the rule. With this
property, no modulo operation is needed for creating codons with the correct
bounds. Without the modulo operation, all rule alternatives have the same
probability of being selected for randomly created chromosomes. There is no
accidental bias towards a given rule alternative or symbol.
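The bias introduced by the modulo operation is easy to quantify. The following sketch counts how many of the possible codon values are mapped onto each alternative index; the class and method names (ModuloBias, histogram) are assumptions made for this example.

```java
import java.util.Arrays;

final class ModuloBias {
    private ModuloBias() {}

    /**
     * Counts how many codon values of the range [0, codonValues) are mapped
     * to each alternative index by the modulo operation.
     */
    static int[] histogram(final int codonValues, final int alternatives) {
        final int[] counts = new int[alternatives];
        for (int codon = 0; codon < codonValues; ++codon) {
            counts[codon % alternatives]++;
        }
        return counts;
    }

    public static void main(final String[] args) {
        // 8-bit codons and the three alternatives of the <expr> rule.
        System.out.println(Arrays.toString(histogram(256, 3)));
        // Codon range aligned with the number of alternatives.
        System.out.println(Arrays.toString(histogram(3, 3)));
    }
}
```

For 8-bit codons and three alternatives the counts are [86, 85, 85], so the first alternative is selected slightly more often. If the codon range equals the number of alternatives, every index occurs exactly once and the bias disappears.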
    final Codec<List<Terminal<String>>, IntegerGene> codec =
        Mappers.multiIntegerChromosomeMapper(
            cfg,
            // Chromosome length depends on the number of alternatives.
            rule -> IntRange.of(rule.alternatives().size()*25),
            index -> new SentenceGenerator<>(index, 1_000)
        );
The code snippet above shows how to create a mapping which uses a genotype
with one chromosome for every rule in the given CFG. Our example CFG
               [0]     [1]     [2]
    <expr> ::= <num> | <var> | <expr> <op> <expr>
               [0]   [1]   [2]   [3]
    <op>   ::= +   | -   | *   | /
               [0]   [1]
    <var>  ::= x   | y
               [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]   [8]
    <num>  ::= 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9
will use a Genotype with the following structure for the encoding of the codons.
    Genotype.of(
        IntegerChromosome.of(IntRange.of(0, 3), 3*25), // <expr>
        IntegerChromosome.of(IntRange.of(0, 4), 4*25), // <op>
        IntegerChromosome.of(IntRange.of(0, 2), 2*25), // <var>
        IntegerChromosome.of(IntRange.of(0, 9), 9*25)  // <num>
    );
As you can see, the number range for each rule chromosome reflects the number
of available rule alternatives and the chromosome length can be expressed as a
multiple of the number of alternatives of the corresponding rule. If you don’t
need this flexibility, it is of course also possible to use constant chromosome
lengths. Just use a length function like this: rule -> IntRange.of(1_000).
3.2 io.jenetics.prog
In artificial intelligence, genetic programming (GP) is a technique whereby com-
puter programs are encoded as a set of genes that are then modified (evolved) us-
3.2.1 Operations
When creating your own genetic programs, it is not necessary to derive classes
from ProgramGene or ProgramChromosome. The intended extension point is the
Op interface.
    public interface Op<T> {
        String name();
        int arity();
        T apply(T[] args);
    }
Listing 3.13: GP Op interface
The generic type of the Op interface (see listing 3.13) enforces the data type
constraints for the created program tree and makes the implementation a strongly
typed GP. Using the Op.of factory method, a new operation is created by defining
the desired operation function.
    final Op<Double> add = Op.of("+", 2, v -> v[0] + v[1]);
    final Op<String> concat = Op.of("+", 2, v -> v[0] + v[1]);
genes and chromosomes. It was a requirement, that the existing Alterer and Selector classes
could also be used for the new GP classes. This has been achieved by flattening the AST of a
genetic program to fit into the 1-dimensional (flat) structure of a chromosome.
Var The Var operation defines a variable of a program, which is set from
outside when it is evaluated.
    final Var<Double> x = Var.of("x", 0);
    final Var<Double> y = Var.of("y", 1);
    final Var<Double> z = Var.of("z", 2);
    final ISeq<Op<Double>> terminals = ISeq.of(x, y, z);
The terminal operations defined in the listing above can be used for defining
a program which takes a 3-dimensional vector as input parameters, x, y, and
z, with the argument indices 0, 1, and 2. If you look again at the apply method
of the operation interface, you can see that this method takes an object array
of type T. The variable x will return the first element of the input arguments,
because it has been created with index 0.
Const The Const operation will always return the same, constant value when
evaluated.
    final Const<Double> one = Const.of(1.0);
    final Const<Double> pi = Const.of("PI", Math.PI);
You can create a constant operation in two flavors: with a value only, and with
a dedicated name. If a constant has a name, the symbolic name is used, instead
of the value, when the program tree is printed.
The ephemeral constant value is determined when it is inserted in the tree and
never changes until it is replaced by another ephemeral constant.
The code snippet above will create a perfect program tree²³ of depth 5. All
non-leaf nodes will contain operations, randomly selected from the given
operations, whereas all leaf nodes are filled with operations from the terminals.
The created program tree is perfect, which means that all leaf nodes have
the same depth. If new trees need to be created during evolution, they
will be created with the depth, operations and terminals defined by the
template program tree.
23 All leaves of a perfect tree have the same depth and all internal nodes have degree Op.arity.
During the evolution phase, the size of the ProgramChromosome can grow and
shrink. The SingleNodeCrossover, which is part of the jenetics.ext module,
is responsible for this change in the program size. When a smaller sub tree is
exchanged with a bigger sub tree, the size of the first tree will grow and the
size of the second tree will shrink. This can lead to undesirably large
programs. For this reason, it is possible to create a ProgramChromosome with an
additional validation predicate.
    final ProgramChromosome<Double> program = ProgramChromosome.of(
        depth, ch -> ch.root().size() <= 50,
        operations, terminals
    );
The predicate, ch -> ch.root().size() <= 50, marks all programs with more
than 50 nodes as invalid. Invalid chromosomes will then be replaced by newly
created ones. When defining a validation predicate, you have to take care that
the desired depth and the validation predicate match. If the given program
tree depth is too big, e. g. 51, every newly created program will be immediately
marked as invalid. This is because a tree with depth 51 will have for sure more
than 50 nodes.
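The relation between tree depth and node count can be made explicit with a short calculation. It assumes that the depth is counted in edges (a single node has depth 0) and, for the second formula, a perfect tree in which every inner node has exactly $k$ children; both assumptions are made for this illustration only.

```latex
% Smallest possible and perfect node counts of a tree with depth d.
N_{\min}(d) = d + 1, \qquad
N_{\mathrm{perfect}}(d, k) = \sum_{i=0}^{d} k^{i} = \frac{k^{d+1} - 1}{k - 1}
\quad (k \ge 2)

% Examples for the size limit of 50 nodes.
N_{\min}(51) = 52 > 50, \qquad
N_{\mathrm{perfect}}(5, 2) = 2^{6} - 1 = 63 > 50
```

Under these assumptions even a perfect tree of depth 5 which consists of binary operations only would exceed a limit of 50 nodes, so it is worth checking the node count of the initial trees against the predicate.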
The evolution Engine used for solving GP problems is created the same
way as for normal GA problems. Also the execution of the EvolutionStream
stays the same. The first Gene of the collected final Genotype represents the
evolved program, which can be used to calculate function values from arbitrary
arguments.
    final Engine<ProgramGene<Double>, Double> engine = Engine
        .builder(Main::error, program)
        .minimizing()
        .alterers(
            new SingleNodeCrossover<>(),
            new Mutator<>())
        .build();

    final ProgramGene<Double> program = engine.stream()
        .limit(300)
        .collect(EvolutionResult.toBestGenotype())
        .gene();
    final double result = program.eval(3.4);
For a complete GP example have a look at the examples in section 5.7. The code
example above also shows that the program is represented by the first gene (aka
root gene) of the ProgramChromosome. Since the ProgramGene implements the
Tree<Op<A>,ProgramGene<A>> interface, it smoothly integrates with existing
tree algorithms. Some possible program gene assignments are shown in the code
snippet below, which will compile without warnings or additional casts.
    final ProgramChromosome<Double> chromosome = ...;
In the example above, half of the expression trees are simplified in each generation.
If you want to prune the final result, you can do this with the MathExpr::rewrite
method, which uses the MathExpr.REWRITER tree rewriter for the rewrite task.
24 See section 3.1.2 on page 86 for a detailed description of the implemented tree rewrite
system.
The algorithm used for pruning the expression tree currently only uses some
basic mathematical identities, like x + 0 = x, x · 1 = x or x · 0 = 0. More
advanced simplification algorithms may be implemented in the future. The
MathExpr helper class can also be used for creating mathematical expression
trees from the usual textual representation.
    final MathExpr expr = MathExpr
        .parse("5*z + 6*x + sin(y)^3 + (1 + sin(z*5)/4)/6");
    final double value = expr.eval(5.5, 4, 2.3);
The variables in an expression string are sorted alphabetically and bound to
the arguments in this order. This means that the expression is evaluated with
x = 5.5, y = 4 and z = 2.3, which leads to a result value of 44.19673085074048.
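The stated result can be checked by writing the expression down in plain Java with the alphabetical variable binding described above. This is only a cross-check of the arithmetic and does not use the MathExpr class; the names ExpressionCheck and eval are assumptions made for this example.

```java
final class ExpressionCheck {
    private ExpressionCheck() {}

    /** Evaluates 5*z + 6*x + sin(y)^3 + (1 + sin(z*5)/4)/6. */
    static double eval(final double x, final double y, final double z) {
        return 5*z + 6*x + Math.pow(Math.sin(y), 3) + (1 + Math.sin(z*5)/4)/6;
    }

    public static void main(final String[] args) {
        // Arguments in alphabetical variable order: x = 5.5, y = 4, z = 2.3.
        System.out.println(eval(5.5, 4, 2.3));
    }
}
```

The individual terms are 5 · 2.3 = 11.5, 6 · 5.5 = 33, sin(4)³ ≈ −0.4335 and (1 + sin(11.5)/4)/6 ≈ 0.1302, which add up to approximately 44.1967.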
The code snippet above shows how to create a codec with two independent
program roots. These programs are then mapped, in the fitness function, to
the combined fitness value. It is also possible to use different operations and
terminals for each ProgramChromosome.
Since symbolic regression is quite a common task in GP, Jenetics comes with
classes and interfaces supporting the implementation of such problems. These
classes are defined in the io.jenetics.prog.regression package. The following
sections describe these classes and interfaces and their usage. A complete
symbolic regression example is given in section 5.7.
Mean squared error The mean squared error is the default loss function
used for regression analysis. It is also known as quadratic loss or L2 loss and is
calculated as the average of the squared differences between the predicted and
actual values.

$$ MSE = \frac{1}{n} \sum_{i=0}^{n-1} (y_i - \tilde{y}_i)^2, \tag{3.2.1} $$

where $y_i$ denotes the expected function value and $\tilde{y}_i$ the calculated
(estimated) value for data point $i$. The result is never negative and the
perfect value is 0. Because of the squaring, larger mistakes contribute
disproportionately more to the error than smaller ones, which means that the
model penalizes larger mistakes. The mean squared error is the preferred loss
function for regression problems.
Mean absolute error The mean absolute error, also known as L1 loss, is
calculated as the average of the absolute differences between the expected and
calculated values.

$$ MAE = \frac{1}{n} \sum_{i=0}^{n-1} |y_i - \tilde{y}_i| \tag{3.2.2} $$

This loss function is suitable for regression problems where the distribution of
the target variable may be mostly Gaussian, but may have outliers, e. g. large or
small values far from the mean value. The MAE is more robust to outliers than
the MSE, which is useful if the sample data is corrupted with outliers. Its
derivatives are not continuous, however, which makes it less efficient to find
the correct solution. The MSE is sensitive to corrupt data, but finds more
stable and closed form solutions.
The interface used for calculating the loss between calculated and expected
values is shown in listing 3.14.
    @FunctionalInterface
    public interface LossFunction<T> {
        double apply(T[] calculated, T[] expected);
    }
Listing 3.14: LossFunction interface
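The two loss functions of equations 3.2.1 and 3.2.2 can be written down in a few lines. The sketch below works on primitive double arrays and is not the library's implementation; the names Losses, mse and mae are assumptions made for this example.

```java
final class Losses {
    private Losses() {}

    /** Mean squared error: average of the squared differences. */
    static double mse(final double[] calculated, final double[] expected) {
        double sum = 0;
        for (int i = 0; i < expected.length; ++i) {
            final double diff = expected[i] - calculated[i];
            sum += diff*diff;
        }
        return sum/expected.length;
    }

    /** Mean absolute error: average of the absolute differences. */
    static double mae(final double[] calculated, final double[] expected) {
        double sum = 0;
        for (int i = 0; i < expected.length; ++i) {
            sum += Math.abs(expected[i] - calculated[i]);
        }
        return sum/expected.length;
    }

    public static void main(final String[] args) {
        final double[] calculated = {1, 2, 3, 4};
        // The last expected value is an outlier.
        final double[] expected = {1, 2, 3, 14};
        System.out.println("MSE: " + mse(calculated, expected));
        System.out.println("MAE: " + mae(calculated, expected));
    }
}
```

With a single outlier of size 10 in four data points, the MSE is 100/4 = 25, whereas the MAE is only 10/4 = 2.5. This illustrates why the MAE is considered more robust to outliers.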
[Figure 3.2.1: Program complexity C(P) as a function of the node count N(P)]
The graph in figure 3.2.1 shows how the program complexity increases with
the number of nodes. For the example graph the maximal node count was set to
28.
    @FunctionalInterface
    public interface Complexity<T> {
        double apply(Tree<? extends Op<T>, ?> program);
    }
Listing 3.15: Complexity interface
Listing 3.15 shows the interface which calculates the complexity measure of a
given program tree.
Listing 3.16 shows the error (fitness) function used for evolving symbolic
regression problems. Instead of implementing the error function from scratch,
you will probably want to use one of the factory methods for creating it from
one of the predefined LossFunction and Complexity measures.
    final Error<Double> error1 = Error.of(LossFunction::mse);
    final Error<Double> error2 = Error.of(
        LossFunction::mae,
        Complexity.ofNodeCount(28)
    );
    final Error<Double> error3 = Error.of(
        LossFunction::mse,
        Complexity.ofNodeCount(28),
        (loss, complexity) -> loss + loss*complexity
    );
The code snippet above shows the three possibilities to create an error function
by using the predefined loss functions and complexity measures. error1 is created
by using the mean squared error, MSE. error2 and error3 define the same
error function. The only difference is that error3 defines the loss-complexity
composition function explicitly.
    }
Listing 3.17: Sample interface
The arity of the sample point returns the dimension, n. To make it easier to
create double sample points, some factory methods are also given in the Sample
interface.
    final Sample<Double> sample1 = Sample.ofDouble(0.0, 0.0);
    final Sample<Double> sample2 = Sample.ofDouble(1.0, 1.0);
    final Sample<Double> sample3 = Sample.ofDouble(2.0, 2.0);
The code snippet above shows how to create three sample points for a function
f : R → R.
As you can see in the code snippet above, the Regression class implements the
Problem interface and can therefore be used easily for setting up an appropriate
evolution Engine. A complete regression example can be found in section 5.7.
3.3 io.jenetics.xml
The io.jenetics.xml module allows for writing and reading Chromosomes and
Genotypes to and from XML. Since the existing JAXB marshalling is part
of the deprecated javax.xml.bind module, the io.jenetics.xml module is
now the recommended way for XML marshalling of the Jenetics classes. The XML
marshalling, implemented in this module, is based on the Java XMLStreamWriter
and XMLStreamReader classes of the java.xml module.
    @FunctionalInterface
    public interface Writer<T> {
        void write(XMLStreamWriter xml, T data)
            throws XMLStreamException;

        static <T> Writer<T> attr(String name);
        static <T> Writer<T> attr(String name, Object value);
        static <T> Writer<T> text();

        static <T> Writer<T>
        elem(String name, Writer<? super T>... children);

        static <T> Writer<Iterable<T>>
        elems(Writer<? super T> writer);
    }
Listing 3.18: XMLWriter interface
Together with the static Writer factory methods, it is possible to define
arbitrary writers through composition. There is no need for implementing the
Writer interface. A simple example will show you how to create (compose) a
Writer for the IntegerChromosome. The created XML should look like the example
given below.
    <int-chromosome length="3">
        <min>-2147483648</min>
        <max>2147483647</max>
        <alleles>
            <allele>-1878762439</allele>
            <allele>-957346595</allele>
            <allele>-88668137</allele>
        </alleles>
    </int-chromosome>
The following writer will create the desired XML from an integer Chromosome.
As the example shows, the structure of the XML can easily be grasped from the
XML writer definition and vice versa.
    final Writer<IntegerChromosome> writer =
        elem("int-chromosome",
            attr("length").map(ch -> ch.length()),
            elem("min", Writer.<Integer>text().map(ch -> ch.min())),
            elem("max", Writer.<Integer>text().map(ch -> ch.max())),
            elem("alleles",
                elems("allele", Writer.<Integer>text())
                    .map(ch -> ch.toSeq().map(g -> g.allele()))
            )
        );
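For comparison, the same XML structure can be written with the plain XMLStreamWriter of the java.xml module, on which the module is based. The sketch below does not use the Jenetics Writer interface; the class IntChromosomeXml and its methods are hypothetical, and the output is written without indentation.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class IntChromosomeXml {
    private IntChromosomeXml() {}

    /** Writes the given range and alleles as 'int-chromosome' element. */
    static String write(final int min, final int max, final int... alleles) {
        final StringWriter out = new StringWriter();
        try {
            final XMLStreamWriter xml = XMLOutputFactory.newFactory()
                .createXMLStreamWriter(out);

            xml.writeStartElement("int-chromosome");
            xml.writeAttribute("length", Integer.toString(alleles.length));
            element(xml, "min", min);
            element(xml, "max", max);
            xml.writeStartElement("alleles");
            for (int allele : alleles) {
                element(xml, "allele", allele);
            }
            xml.writeEndElement(); // </alleles>
            xml.writeEndElement(); // </int-chromosome>
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    /** Writes a simple element with the given int value as text. */
    private static void element(
        final XMLStreamWriter xml,
        final String name,
        final int value
    ) throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(Integer.toString(value));
        xml.writeEndElement();
    }

    public static void main(final String[] args) {
        System.out.println(write(
            Integer.MIN_VALUE, Integer.MAX_VALUE,
            -1878762439, -957346595, -88668137
        ));
    }
}
```

Compared to this imperative code, the composed writer definition above mirrors the structure of the XML directly, which is the main motivation for the compositional approach.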
Table 3.3.1 shows the required space of the marshalled genotypes for different
marshalling methods: (a) Java serialization, (b) JAXB²⁵ serialization and (c)
XMLWriter.
Using the Java serialization will create the smallest files and the XMLWriter
of the io.jenetics.xml module will create files roughly 75% the size of the
JAXB serialized genotypes. The size of the marshaled objects also influences
the write performance. As you can see in diagram 3.3.1 the Java serialization
is the fastest marshalling method, followed by the JAXB marshalling. The
XMLWriter is the slowest one, but still comparable to the JAXB method.
[Diagram 3.3.1: Genotype write performance. Marshalling time [µs] over
chromosome count, both on logarithmic axes, for JAXB, Java serialization and
XML writer.]
For reading the serialized genotypes, we will see similar results (see diagram
3.3.2). Reading Java serialized genotypes has the best read performance, followed
by JAXB and the XML Reader. This time the difference between JAXB and
the XML Reader is hardly visible.
3.4 io.jenetics.prngine
The prngine26 module contains pseudo-random number generators for sequential
and parallel Monte Carlo simulations27. It has been designed to work smoothly
with the Jenetics GA library, but it has no dependency on it. All PRNG
implementations of this library implement the Java RandomGenerator interface,
which makes them easy to use in other projects.
26 This module is not part of the Jenetics project directly. Since it has no dependency
on any of the Jenetics modules, it has been extracted to a separate GitHub repository
(https://github.com/jenetics/prngine) with an independent versioning.
27 https://de.wikipedia.org/wiki/Monte-Carlo-Simulation
[Diagram 3.3.2: Genotype read performance. Marshalling time [µs] over
chromosome count, both on logarithmic axes, for JAXB, Java serialization and
XML reader.]
30 http://digitalcommons.wayne.edu/jmasm/vol2/iss1/2/
31 Measured on an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz with Java(TM) SE Runtime
Appendix
Chapter 4
Internals
This section contains internal implementation details which don't fit into one
of the previous sections. They are not essential for using the library, but
give the user a deeper insight into some design decisions made when implementing
the library. It also introduces tools and classes which were developed for
testing purposes. These classes are not exported and not part of the official API.
The example above demonstrates how to stream a raw binary stream of bits
to the stdin (raw) interface of dieharder. With the DieHarder class, which
is part of the io.jenetics.prngine.internal package, it is easily possible
to test PRNGs implementing the java.util.random.RandomGenerator interface.
The only requirement is that the PRNG must be default-constructible and part
of the classpath.
Calling the command above will create an instance of the given random engine
and stream the random data (bytes) to the raw interface of the dieharder process.
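The streaming step described above can be sketched in plain Java. This is only a minimal illustration and not the DieHarder class of the prngine module: the class name RawRandomStream, the method write and the bounded number of values are assumptions made for this example.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.random.RandomGenerator;

// Hypothetical helper; not part of the prngine module.
final class RawRandomStream {

    // Writes 'count' random long values as raw big-endian bytes to 'out'.
    static void write(
        final RandomGenerator random,
        final OutputStream out,
        final long count
    ) {
        final DataOutputStream data = new DataOutputStream(out);
        try {
            for (long i = 0; i < count; ++i) {
                data.writeLong(random.nextLong());
            }
            data.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        // Java 17+: look up the generator by its algorithm name.
        final RandomGenerator random = RandomGenerator.of("L64X256MixRandom");
        // Bounded run for illustration; a real test consumes far more data.
        write(random, System.out, 1_000_000);
    }
}
```

The standard output of such a program could then be piped into the dieharder process. The generator option for raw stdin input is assumed here to be -g 200; check the dieharder documentation of your installation.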
#=============================================================================#
# Testing: L64X256MixRandom (2022-02-18 19:11)                                #
#=============================================================================#
#=============================================================================#
# Mac OS X 10.15.7 (x86_64)                                                   #
4.2. RANDOM SEEDING CHAPTER 4. INTERNALS
In the listing above, a part of the created dieharder report is shown. For
testing the LCG64ShiftRandom class, which is part of the io.jenetics.prngine
module, the following command can be called:
Table 4.1.1 shows the summary of the dieharder tests. The full report is part
of the source file of the LCG64ShiftRandom class.2
Before applying this method throughout the whole library, I decided to perform
some statistical tests. For this purpose I treated the seed method itself as
PRNG and analyzed the created long values with the DieHarder class. The
2 https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/jenetics/prngine/LCG64ShiftRandom.java
3 See section 1.4.2 on page 34.
will perform the statistical tests for the nano time random engine. The statistical
quality is rather bad: every single test failed. Table 4.2.1 shows the summary of
the dieharder report.4
This seed method has been wrapped into the ObjectHashRandom class and
tested as well with
Table 4.2.2 shows the summary of the dieharder report5, which is already
excellent.
After additional experimentation, a combination of the nano time seed and
the object hash seeding seems to be the right solution. The rationale behind
this is that the PRNG seed shouldn't rely on a single source of entropy.
4 The detailed test report can be found in the source of the NanoTimeRandom
class. https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/jenetics/prngine/internal/NanoTimeRandom.java
5 Full report: https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/jenetics/prngine/internal/ObjectHashRandom.java
public static long seed() {
    final long a = mixStafford13(System.currentTimeMillis());
    final long b = mixStafford13(System.nanoTime());
    return seed(mix(a, b));
}

private static long seed(final long base) {
    return mix(base, objectSeed());
}

private static long mix(final long a, final long b) {
    long c = a^b;
    c ^= c << 17;
    c ^= c >>> 31;
    c ^= c << 8;
    return c;
}
Listing 4.1: Random seeding
The code in listing 4.1 shows how the nano time seed is mixed with the object
seed. The mix method was inspired by the mixing step of the lcg64_shift6
random engine, which has been reimplemented in the LCG64ShiftRandom class.
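Listing 4.1 uses two helper methods, mixStafford13 and objectSeed, which are not shown. The following self-contained sketch fills them in under stated assumptions: mixStafford13 is assumed to be Stafford's "Mix13" finalizer as used by java.util.SplittableRandom, and objectSeed is a hypothetical stand-in built from identity hash codes. Only the mix method is taken from the listing.

```java
final class SeedSketch {

    // Assumption: Stafford's "Mix13" finalizer, as used by
    // java.util.SplittableRandom. Listing 4.1 does not show this method.
    static long mixStafford13(final long value) {
        long z = value;
        z = (z ^ (z >>> 30))*0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27))*0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Hypothetical stand-in for objectSeed(): combines the identity hash
    // codes of two freshly created objects into one long value.
    static long objectSeed() {
        final long hi = System.identityHashCode(new Object());
        final long lo = System.identityHashCode(new Object());
        return (hi << 32) | (lo & 0xffffffffL);
    }

    // The xor-shift mixing step as given in listing 4.1.
    static long mix(final long a, final long b) {
        long c = a^b;
        c ^= c << 17;
        c ^= c >>> 31;
        c ^= c << 8;
        return c;
    }

    // Combines wall clock time, nano time and the object seed.
    static long seed() {
        final long a = mixStafford13(System.currentTimeMillis());
        final long b = mixStafford13(System.nanoTime());
        return mix(mix(a, b), objectSeed());
    }

    public static void main(final String[] args) {
        System.out.println(Long.toHexString(seed()));
        System.out.println(Long.toHexString(seed()));
    }
}
```

Note that mix is a pure function of its two arguments, so the variation between two seed calls comes only from the time sources and the object seed.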
Running the tests with
Open questions
• How does this method perform on operating systems other than Linux?
• How does this method perform on other JVM implementations?
6 This class is part of the TRNG library: https://github.com/rabauke/trng4/blob/master/src/lcg64_shift.hpp
7 Full report: https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/jenetics/prngine/internal/SeedRandom.java
Chapter 5
Examples
This section contains some coding examples which should give you a feeling
of how to use the Jenetics library. The given examples are complete, in the
sense that they will compile, run and produce the given example output.
The examples delivered with the Jenetics library can be started with
the run-examples.sh script.
$ ./jenetics.example/src/main/scripts/run-examples.sh
Since the script uses JARs located in the build directory, you have to build
the library with the jar Gradle target first; see section 6.
5.1. ONES COUNTING CHAPTER 5. EXAMPLES
    }

    public static void main(String[] args) {
        // Configure and build the evolution engine.
        final Engine<BitGene, Integer> engine = Engine
            .builder(
                OnesCounting::count,
                BitChromosome.of(20, 0.15))
            .populationSize(500)
            .selector(new RouletteWheelSelector<>())
            .alterers(
                new Mutator<>(0.55),
                new SinglePointCrossover<>(0.06))
            .build();

        // Create evolution statistics consumer.
        final EvolutionStatistics<Integer, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<BitGene, Integer> best = engine.stream()
            // Truncate the evolution stream after 7 "steady"
            // generations.
            .limit(bySteadyFitness(7))
            // The evolution will stop after maximal 100
            // generations.
            .limit(100)
            // Update the evaluation statistics after
            // each generation
            .peek(statistics)
            // Collect (reduce) the evolution stream to
            // its best phenotype.
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}
2 For the other default values (population size, maximal age, ...) have a look at the Javadoc:
https://jenetics.io/javadoc/jenetics/7.1/index.html
5.2. REAL FUNCTION CHAPTER 5. EXAMPLES
The given example will print the overall timing statistics onto the console. In
the Evolution statistics section you can see that it actually takes 15 generations
to fulfill the termination criteria—finding no better result after 7 consecutive
generations.
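The termination criterion mentioned above, no better result after a given number of consecutive generations, can be illustrated with a small stateful predicate. This is a sketch in plain Java and not the implementation behind Limits.bySteadyFitness; the class name SteadyFitness and the helper method count are made up for this example.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of a "steady fitness" limit for a maximization
// problem; not the Jenetics implementation.
final class SteadyFitness implements Predicate<Integer> {
    private final int generations;
    private int best = Integer.MIN_VALUE;
    private int steady = 0;

    SteadyFitness(final int generations) {
        this.generations = generations;
    }

    // Returns true as long as the evolution should proceed.
    @Override
    public boolean test(final Integer fitness) {
        if (fitness > best) {
            best = fitness;
            steady = 0;
        } else {
            ++steady;
        }
        return steady < generations;
    }

    // Counts the generations which pass the limit, given the best fitness
    // value of each generation.
    static long count(final List<Integer> fitnesses, final int generations) {
        return fitnesses.stream()
            .takeWhile(new SteadyFitness(generations))
            .count();
    }

    public static void main(final String[] args) {
        System.out.println(count(List.of(1, 2, 3, 3, 3, 3), 3));
    }
}
```

The predicate is stateful, which is acceptable for a sequential stream as used here, but it must not be shared between streams.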
$$ f(x) = \cos\left(\frac{1}{2} + \sin(x)\right) \cdot \cos(x). \tag{5.2.1} $$
[Figure 5.2.1: Graph of function 5.2.1; y over x in the range [0, 2π], with
y between -1 and 1.]
The graph of function 5.2.1, in the range of [0, 2π], is shown in figure 5.2.1
and the listing beneath shows the GA implementation which will minimize the
function.
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static java.lang.Math.sin;
import static io.jenetics.engine.EvolutionResult.toBestPhenotype;
import static io.jenetics.engine.Limits.bySteadyFitness;

import io.jenetics.DoubleGene;
import io.jenetics.MeanAlterer;
import io.jenetics.Mutator;
import io.jenetics.Optimize;
import io.jenetics.Phenotype;
import io.jenetics.engine.Codecs;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionStatistics;
import io.jenetics.util.DoubleRange;

public class RealFunction {

    // The fitness function.
    private static double fitness(final double x) {
        return cos(0.5 + sin(x))*cos(x);
    }

    public static void main(final String[] args) {
        final Engine<DoubleGene, Double> engine = Engine
            // Create a new builder with the given fitness
            // function and chromosome.
            .builder(
                RealFunction::fitness,
                Codecs.ofScalar(DoubleRange.of(0.0, 2.0*PI)))
            .populationSize(500)
            .optimize(Optimize.MINIMUM)
            .alterers(
                new Mutator<>(0.03),
                new MeanAlterer<>(0.6))
            // Build an evolution engine with the
            // defined parameters.
            .build();

        // Create evolution statistics consumer.
        final EvolutionStatistics<Double, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<DoubleGene, Double> best = engine.stream()
            // Truncate the evolution stream after 7 "steady"
            // generations.
            .limit(bySteadyFitness(7))
            // The evolution will stop after maximal 100
            // generations.
            .limit(100)
            // Update the evaluation statistics after
            // each generation
            .peek(statistics)
            // Collect (reduce) the evolution stream to
            // its best phenotype.
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}
5.3. RASTRIGIN FUNCTION CHAPTER 5. EXAMPLES
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=9; mean=1.104381; var=1.962625                 |
|               Fitness:                                                    |
|                      min  = -0.938171897696                               |
|                      max  = 0.936310125279                                |
|                      mean = -0.897856583665                               |
|                      var  = 0.027246274838                                |
|                      std  = 0.165064456617                                |
+---------------------------------------------------------------------------+
[[[3.389125782657314]]] --> -0.9381718976956661
The GA will generate a console output like the one above. The exact position
of the minimum, for the given range, is x = 3.3891257828907939... You can also
see that we reached the final result after 19 generations.
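Because the function has only one variable, the GA result can be cross-checked with a brute-force scan of the range [0, 2π]. The following sketch is not part of the Jenetics examples; the class name RealFunctionScan and the grid resolution are choices made for this illustration.

```java
// Brute-force cross-check of the GA result; hypothetical helper class.
final class RealFunctionScan {

    // The function 5.2.1, as implemented in the fitness method.
    static double f(final double x) {
        return Math.cos(0.5 + Math.sin(x))*Math.cos(x);
    }

    // Returns the grid point in [0, 2*PI] with the smallest function value.
    static double argmin(final int steps) {
        double bestX = 0.0;
        double bestY = f(bestX);
        for (int i = 1; i <= steps; ++i) {
            final double x = 2.0*Math.PI*i/steps;
            final double y = f(x);
            if (y < bestY) {
                bestX = x;
                bestY = y;
            }
        }
        return bestX;
    }

    public static void main(final String[] args) {
        final double x = argmin(1_000_000);
        System.out.println(x + " --> " + f(x));
    }
}
```

With a sufficiently fine grid, the scan should agree with the GA output in the leading digits. Such a scan is only feasible for low-dimensional problems, which is one reason for using a GA in the first place.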
As the plot in figure 5.3.1 shows, the Rastrigin function has many local minima,
which makes it difficult for standard, gradient-based methods to find the global
minimum. If A = 10 and x_i ∈ [−5.12, 5.12], the function has only one global
minimum at x = 0 with f(x) = 0.
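For reference, the function can be written down from the fitness method of the RastriginFunction listing in this section. The notation below is reconstructed from that code, with n denoting the dimension (N = 2 in the listing):

```latex
f(\mathbf{x}) = A n + \sum_{i=1}^{n} \left( x_i^2 - A \cos(2 \pi x_i) \right),
\qquad
f(\mathbf{0}) = A n + \sum_{i=1}^{n} \left( 0 - A \cos(0) \right) = A n - A n = 0.
```

The second expression is the worked check of the global minimum stated above.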
The following listing shows the Engine setup for minimizing the Rastrigin
function, which is very similar to the setup for the real function in section
5.2. Besides the different fitness function, the Codec for double vectors is
used instead of the double scalar Codec.
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static io.jenetics.engine.EvolutionResult.toBestPhenotype;
import static io.jenetics.engine.Limits.bySteadyFitness;

import io.jenetics.DoubleGene;
import io.jenetics.MeanAlterer;
import io.jenetics.Mutator;
import io.jenetics.Optimize;
import io.jenetics.Phenotype;
import io.jenetics.engine.Codecs;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionStatistics;
import io.jenetics.util.DoubleRange;

public class RastriginFunction {
    private static final double A = 10;
    private static final double R = 5.12;
    private static final int N = 2;

    private static double fitness(final double[] x) {
        double value = A*N;
        for (int i = 0; i < N; ++i) {
            value += x[i]*x[i] - A*cos(2.0*PI*x[i]);
        }

        return value;
    }

    public static void main(final String[] args) {
        final Engine<DoubleGene, Double> engine = Engine
            .builder(
                RastriginFunction::fitness,
                // Codec for 'x' vector.
                Codecs.ofVector(DoubleRange.of(-R, R), N))
            .populationSize(500)
            .optimize(Optimize.MINIMUM)
            .alterers(
                new Mutator<>(0.03),
                new MeanAlterer<>(0.6))
            .build();

        final EvolutionStatistics<Double, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<DoubleGene, Double> best = engine.stream()
            .limit(bySteadyFitness(7))
            .peek(statistics)
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}
The console output of the program shows that Jenetics finds the optimal
solution after 38 generations.
5.4. 0/1 KNAPSACK CHAPTER 5. EXAMPLES
+---------------------------------------------------------------------------+
|  Time statistics                                                          |
+---------------------------------------------------------------------------+
|             Selection: sum=0.209185134000 s; mean=0.005504871947 s        |
|              Altering: sum=0.295102044000 s; mean=0.007765843263 s        |
|   Fitness calculation: sum=0.176879937000 s; mean=0.004654735184 s        |
|     Overall execution: sum=0.664517256000 s; mean=0.017487296211 s        |
+---------------------------------------------------------------------------+
|  Evolution statistics                                                     |
+---------------------------------------------------------------------------+
|           Generations: 38                                                 |
|               Altered: sum=7,549; mean=198.657894737                      |
|                Killed: sum=0; mean=0.000000000                            |
|              Invalids: sum=0; mean=0.000000000                            |
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=8; mean=1.100211; var=1.814053                 |
|               Fitness:                                                    |
|                      min  = 0.000000000000                                |
|                      max  = 63.672604047475                               |
|                      mean = 3.484157452128                                |
|                      var  = 71.047475139018                               |
|                      std  = 8.428966433616                                |
+---------------------------------------------------------------------------+
[[[-1.3226168588424143E-9],[-1.096964971404292E-9]]] --> 0.0
        public final double value;

        Item(final double size, final double value) {
            this.size = size;
            this.value = value;
        }

        // Create a new random knapsack item.
        static Item random() {
            final var r = RandomRegistry.random();
            return new Item(
                r.nextDouble()*100,
                r.nextDouble()*100
            );
        }

        // Collector for summing up the knapsack items.
        static Collector<Item, ?, Item> toSum() {
            return Collector.of(
                () -> new double[2],
                (a, b) -> {a[0] += b.size; a[1] += b.value;},
                (a, b) -> {a[0] += b[0]; a[1] += b[1]; return a;},
                r -> new Item(r[0], r[1])
            );
        }
    }

    // Creating the fitness function.
    static Function<ISeq<Item>, Double>
    fitness(final double size) {
        return items -> {
            final Item sum = items.stream().collect(Item.toSum());
            return sum.size <= size ? sum.value : 0;
        };
    }

    public static void main(final String[] args) {
        final int nitems = 15;
        final double kssize = nitems*100.0/3.0;

        final ISeq<Item> items =
            Stream.generate(Item::random)
                .limit(nitems)
                .collect(ISeq.toISeq());

        // Defining the codec.
        final Codec<ISeq<Item>, BitGene> codec =
            Codecs.ofSubSet(items);

        // Configure and build the evolution engine.
        final Engine<BitGene, Double> engine = Engine
            .builder(fitness(kssize), codec)
            .populationSize(500)
            .survivorsSelector(new TournamentSelector<>(5))
            .offspringSelector(new RouletteWheelSelector<>())
            .alterers(
                new Mutator<>(0.115),
                new SinglePointCrossover<>(0.16))
            .build();

        // Create evolution statistics consumer.
        final EvolutionStatistics<Double, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<BitGene, Double> best = engine.stream()
            // Truncate the evolution stream after 7 "steady"
            // generations.
            .limit(bySteadyFitness(7))
            // The evolution will stop after maximal 100
            // generations.
            .limit(100)
            // Update the evaluation statistics after
            // each generation
            .peek(statistics)
            // Collect (reduce) the evolution stream to
            // its best phenotype.
            .collect(toBestPhenotype());

        final ISeq<Item> knapsack = codec.decode(best.genotype());

        System.out.println(statistics);
        System.out.println(best);
        System.out.println("\n\n");
        System.out.printf(
            "Genotype of best item: %s%n",
            best.genotype()
        );

        final double fillSize = knapsack.stream()
            .mapToDouble(it -> it.size)
            .sum();

        System.out.printf("%.2f%% filled.%n", 100*fillSize/kssize);
    }
}
The console output for the Knapsack GA will look like the listing beneath.
+---------------------------------------------------------------------------+
|  Time statistics                                                          |
+---------------------------------------------------------------------------+
|             Selection: sum=0.044465978000 s; mean=0.005558247250 s        |
|              Altering: sum=0.067385211000 s; mean=0.008423151375 s        |
|   Fitness calculation: sum=0.037208189000 s; mean=0.004651023625 s        |
|     Overall execution: sum=0.126468539000 s; mean=0.015808567375 s        |
+---------------------------------------------------------------------------+
|  Evolution statistics                                                     |
+---------------------------------------------------------------------------+
|           Generations: 8                                                  |
|               Altered: sum=4,842; mean=605.250000000                      |
|                Killed: sum=0; mean=0.000000000                            |
|              Invalids: sum=0; mean=0.000000000                            |
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=7; mean=1.387500; var=2.780039                 |
|               Fitness:                                                    |
|                      min  = 0.000000000000                                |
|                      max  = 542.363235999342                              |
|                      mean = 436.098248628661                              |
|                      var  = 11431.801291812390                            |
|                      std  = 106.919601999878                              |
+---------------------------------------------------------------------------+
[01111011|10111101] --> 542.3632359993417
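The fitness function of the Knapsack example can be tried out without the evolution engine. The sketch below mirrors the toSum collector and the fitness method of the listing, but it is a simplified stand-in: it uses a record and java.util.List instead of the Item class and ISeq, and the class name KnapsackFitness is made up for this example.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collector;

// Simplified, Jenetics-free stand-in for the Knapsack fitness function.
final class KnapsackFitness {

    record Item(double size, double value) {
        // Collector for summing up the sizes and values of the items.
        static Collector<Item, ?, Item> toSum() {
            return Collector.of(
                () -> new double[2],
                (a, b) -> {a[0] += b.size(); a[1] += b.value();},
                (a, b) -> {a[0] += b[0]; a[1] += b[1]; return a;},
                r -> new Item(r[0], r[1])
            );
        }
    }

    // Selections which exceed the knapsack size get the fitness zero.
    static Function<List<Item>, Double> fitness(final double size) {
        return items -> {
            final Item sum = items.stream().collect(Item.toSum());
            return sum.size() <= size ? sum.value() : 0.0;
        };
    }

    public static void main(final String[] args) {
        final List<Item> items = List.of(new Item(10, 5), new Item(20, 7));
        System.out.println(fitness(30).apply(items));
        System.out.println(fitness(25).apply(items));
    }
}
```

Assigning the fitness zero to overfull knapsacks keeps such individuals in the population, but makes it unlikely that they are selected.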
5.5. TRAVELING SALESMAN CHAPTER 5. EXAMPLES
5.6. EVOLVING IMAGES CHAPTER 5. EXAMPLES

        out.println(statistics);
        out.println("Known min path length: " + minPathLength);
        out.println("Found min path length: " + best.fitness());
    }

}
The Traveling Salesman problem is a very good example which shows you
how to solve combinatorial problems with a GA. Jenetics contains several
classes which work very well with this kind of problem. Wrapping the base
type into an EnumGene is the first thing to do. In our example, every city has
a unique number, which means we are wrapping an Integer into an EnumGene.
Creating a genotype for integer values is very easy with the factory method
of the PermutationChromosome. For other data types you have to use one of
the constructors of the permutation chromosome. As alterers, we are using a
swap-mutator and a partially-matched crossover. These alterers guarantee that
no invalid solutions are created, i.e. every city exists exactly once in the
altered chromosomes.
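Why a swap mutation can't create invalid paths is easy to see in a small, Jenetics-free sketch: exchanging two positions only reorders the cities. The class SwapMutation and its methods are hypothetical names for this illustration and are not the SwapMutator implementation of the library.

```java
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical illustration of a swap mutation on a path of city indexes.
final class SwapMutation {

    // Swaps two randomly chosen positions of the path in place.
    static void mutate(final int[] path, final Random random) {
        final int i = random.nextInt(path.length);
        final int j = random.nextInt(path.length);
        final int tmp = path[i];
        path[i] = path[j];
        path[j] = tmp;
    }

    // Checks that every city index 0..n-1 occurs exactly once.
    static boolean isPermutation(final int[] path) {
        final boolean[] seen = new boolean[path.length];
        for (int city : path) {
            if (city < 0 || city >= path.length || seen[city]) {
                return false;
            }
            seen[city] = true;
        }
        return true;
    }

    public static void main(final String[] args) {
        final int[] path = IntStream.range(0, 20).toArray();
        final Random random = new Random();
        for (int i = 0; i < 1000; ++i) {
            mutate(path, random);
        }
        System.out.println(isPermutation(path));
    }
}
```

A mutation which replaces a single position with a random city would, in contrast, usually duplicate one city and drop another.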
+---------------------------------------------------------------------------+
|  Time statistics                                                          |
+---------------------------------------------------------------------------+
|             Selection: sum=0.077451297000 s; mean=0.000619610376 s        |
|              Altering: sum=0.205351688000 s; mean=0.001642813504 s        |
|   Fitness calculation: sum=0.097127225000 s; mean=0.000777017800 s        |
|     Overall execution: sum=0.371304464000 s; mean=0.002970435712 s        |
+---------------------------------------------------------------------------+
|  Evolution statistics                                                     |
+---------------------------------------------------------------------------+
|           Generations: 125                                                |
|               Altered: sum=177,200; mean=1417.600000000                   |
|                Killed: sum=173; mean=1.384000000                          |
|              Invalids: sum=0; mean=0.000000000                            |
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=11; mean=1.677872; var=5.617299                |
|               Fitness:                                                    |
|                      min  = 62.573786016092                               |
|                      max  = 344.248763720487                              |
|                      mean = 144.636749974591                              |
|                      var  = 5082.947247878953                             |
|                      std  = 71.294791169334                               |
+---------------------------------------------------------------------------+
Known min path length: 62.57378601609235
Found min path length: 62.57378601609235
The listing above shows the output generated by our example. The last line
represents the phenotype of the best solution found by the GA, which represents
the traveling path. As you can see, the GA has found the shortest path, in
reverse order.
$ ./gradlew jar
$ ./jrun io.jenetics.example.image.EvolvingImages
Figure 5.6.1 shows the GUI after evolving the default image for about 4,000
generations. With the »Open« button it is possible to load other images for
polygonization. The »Save« button allows storing polygonized images in PNG
format to disk. At the bottom of the UI, you can change some of the GA
parameters of the example:
Population size The number of individuals in the population.
5.7. SYMBOLIC REGRESSION CHAPTER 5. EXAMPLES
Every command line argument has proper default values, so that it is possible
to start it without parameters. Listing 5.1 shows the default values for the GA
engine if the --engine-properties parameter is not specified.
population_size=50
tournament_size=3
mutation_rate=0.025
mutation_multitude=0.15
polygon_length=4
polygon_count=250
reference_image_width=60
reference_image_height=60
Listing 5.1: Default engine.properties
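Since listing 5.1 uses the plain key=value format, such a file can be read with java.util.Properties. The following sketch only illustrates the file format; how the example program itself reads the file is not shown in this manual, and the class name EngineProperties is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical reader for the values of listing 5.1.
final class EngineProperties {

    // The default values, copied from listing 5.1.
    static final String DEFAULTS = """
        population_size=50
        tournament_size=3
        mutation_rate=0.025
        mutation_multitude=0.15
        polygon_length=4
        polygon_count=250
        reference_image_width=60
        reference_image_height=60
        """;

    // Parses the given key=value text into a Properties object.
    static Properties parse(final String text) {
        final Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(final String[] args) {
        final Properties properties = parse(DEFAULTS);
        final int size =
            Integer.parseInt(properties.getProperty("population_size"));
        final double rate =
            Double.parseDouble(properties.getProperty("mutation_rate"));
        System.out.println(size + ", " + rate);
    }
}
```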
The images in figure 5.6.2 show the resulting polygon images after the given
number of generations. They were created with the command line version of
the program using the default engine.properties file (listing 5.1):
more effort than the setup of a GA. First, you have to define the set of atomic
mathematical operations the GP is working with. These operations influence
the search space and are a kind of a priori knowledge put into the GP. As a
second step, you have to define the terminal operations. Terminals are either
constants or variables. The number of variables defines the domain dimension of
the fitness function.
import static io.jenetics.util.RandomRegistry.random;

import java.util.List;

import io.jenetics.Mutator;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionResult;
import io.jenetics.engine.Limits;
import io.jenetics.util.ISeq;

import io.jenetics.ext.SingleNodeCrossover;
import io.jenetics.ext.util.TreeNode;

import io.jenetics.prog.ProgramGene;
import io.jenetics.prog.op.EphemeralConst;
import io.jenetics.prog.op.MathExpr;
import io.jenetics.prog.op.MathOp;
import io.jenetics.prog.op.Op;
import io.jenetics.prog.op.Var;
import io.jenetics.prog.regression.Error;
import io.jenetics.prog.regression.LossFunction;
import io.jenetics.prog.regression.Regression;
import io.jenetics.prog.regression.Sample;

public class SymbolicRegression {

    // Definition of the allowed operations.
    private static final ISeq<Op<Double>> OPS =
        ISeq.of(MathOp.ADD, MathOp.SUB, MathOp.MUL);

    // Definition of the terminals.
    private static final ISeq<Op<Double>> TMS = ISeq.of(
        Var.of("x", 0),
        EphemeralConst.of(() -> (double)random().nextInt(10))
    );

    // Lookup table for {@code 4*x^3 - 3*x^2 + x}
    static final List<Sample<Double>> SAMPLES = List.of(
        Sample.ofDouble(-1.0, -8.0000),
        Sample.ofDouble(-0.9, -6.2460),
        Sample.ofDouble(-0.8, -4.7680),
        Sample.ofDouble(-0.7, -3.5420),
        Sample.ofDouble(-0.6, -2.5440),
        Sample.ofDouble(-0.5, -1.7500),
        Sample.ofDouble(-0.4, -1.1360),
        Sample.ofDouble(-0.3, -0.6780),
        Sample.ofDouble(-0.2, -0.3520),
        Sample.ofDouble(-0.1, -0.1340),
        Sample.ofDouble(0.0, 0.0000),
        Sample.ofDouble(0.1, 0.0740),
        Sample.ofDouble(0.2, 0.1120),
        Sample.ofDouble(0.3, 0.1380),
        Sample.ofDouble(0.4, 0.1760),
        Sample.ofDouble(0.5, 0.2500),
        Sample.ofDouble(0.6, 0.3840),
        Sample.ofDouble(0.7, 0.6020),
        Sample.ofDouble(0.8, 0.9280),
        Sample.ofDouble(0.9, 1.3860),
        Sample.ofDouble(1.0, 2.0000)
    );

    static final Regression<Double> REGRESSION =
        Regression.of(
            Regression.codecOf(
                OPS, TMS, 5,
                t -> t.gene().size() < 30
            ),
            Error.of(LossFunction::mse),
            SAMPLES
        );

    public static void main(final String[] args) {
        final Engine<ProgramGene<Double>, Double> engine = Engine
            .builder(REGRESSION)
            .minimizing()
            .alterers(
                new SingleNodeCrossover<>(0.1),
                new Mutator<>())
            .build();

        final EvolutionResult<ProgramGene<Double>, Double> er =
            engine.stream()
                .limit(Limits.byFitnessThreshold(0.01))
                .collect(EvolutionResult.toBestEvolutionResult());

        final ProgramGene<Double> program = er.bestPhenotype()
            .genotype()
            .gene();

        final TreeNode<Op<Double>> tree = program.toTreeNode();
        MathExpr.rewrite(tree);
        System.out.println("G: " + er.totalGenerations());
        System.out.println("F: " + new MathExpr(tree));
        System.out.println("E: " + REGRESSION.error(tree));
    }
}
The error function uses the mean squared error as its loss function and no
additional tree complexity metric. One output of a GP run is shown in figure
5.7.1. If we simplify this program tree, we get exactly the polynomial which
created the sample data.
[Figure 5.7.1: Program tree produced by one GP run, composed of sub, add and mul nodes over the variable x]
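To make the loss function concrete, the following sketch computes the mean squared error of two candidate functions over a few of the sample points from the lookup table. It is plain Java and does not use the Jenetics Error or LossFunction classes; the class name MseSketch, its mse method and the reduced sample array are assumptions made for illustration only.

```java
import java.util.function.DoubleUnaryOperator;

public class MseSketch {

    // A few (x, y) pairs copied from the SAMPLES lookup table of the listing.
    static final double[][] SAMPLES = {
        {-1.0, -8.0}, {-0.5, -1.75}, {0.0, 0.0}, {0.5, 0.25}, {1.0, 2.0}
    };

    // Mean squared error: the average of (calculated - expected)^2.
    static double mse(final DoubleUnaryOperator f, final double[][] samples) {
        double sum = 0.0;
        for (double[] sample : samples) {
            final double diff = f.applyAsDouble(sample[0]) - sample[1];
            sum += diff*diff;
        }
        return sum/samples.length;
    }

    public static void main(final String[] args) {
        // The polynomial which created the sample data: 4*x^3 - 3*x^2 + x.
        final DoubleUnaryOperator exact = x -> 4*x*x*x - 3*x*x + x;
        // A deliberately wrong candidate, for comparison.
        final DoubleUnaryOperator wrong = x -> x;

        System.out.println("exact: " + mse(exact, SAMPLES));
        System.out.println("wrong: " + mse(wrong, SAMPLES));
    }
}
```

The exact polynomial yields an error of zero on these samples, while any other candidate yields a positive error. This is why the engine in the listing is configured with minimizing().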
5.8 Grammar based regression
import static java.util.Objects.requireNonNull;
import static java.util.stream.Collectors.joining;
import static io.jenetics.prog.op.MathExpr.parseTree;

import java.util.List;
import java.util.function.Function;

import io.jenetics.IntegerGene;
import io.jenetics.Phenotype;
import io.jenetics.SinglePointCrossover;
import io.jenetics.SwapMutator;
import io.jenetics.engine.Codec;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionResult;
import io.jenetics.engine.Limits;
import io.jenetics.engine.Problem;
import io.jenetics.util.IntRange;

import io.jenetics.ext.grammar.Bnf;
import io.jenetics.ext.grammar.Cfg;
import io.jenetics.ext.grammar.Cfg.Symbol;
import io.jenetics.ext.grammar.Mappers;
import io.jenetics.ext.grammar.SentenceGenerator;
import io.jenetics.ext.util.Tree;
import io.jenetics.ext.util.TreeNode;

import io.jenetics.prog.op.Const;
import io.jenetics.prog.op.MathExpr;
import io.jenetics.prog.op.Op;
import io.jenetics.prog.regression.Error;
import io.jenetics.prog.regression.LossFunction;
import io.jenetics.prog.regression.Sample;
import io.jenetics.prog.regression.Sampling;
import io.jenetics.prog.regression.Sampling.Result;

public class GrammarBasedRegression
    implements Problem<Tree<Op<Double>, ?>, IntegerGene, Double>
{

    private static final Cfg<String> GRAMMAR = Bnf.parse("""
        <expr> ::= x | <num> | <expr> <op> <expr>
        <op>   ::= + | - | * | /
        <num>  ::= 2 | 3 | 4
        """
    );

    private static final Codec<Tree<Op<Double>, ?>, IntegerGene>
    CODEC = Mappers.multiIntegerChromosomeMapper(
            GRAMMAR,
            // The length of the chromosome is 25 times the length
            // of the alternatives of a given rule. Every rule
            // gets its own chromosome. It would also be possible
            // to define variable chromosome length with the
            // returned integer range.
            rule -> IntRange.of(rule.alternatives().size()*25),
            // The used generator defines the generated data type,
            // which is `List<Terminal<String>>`.
            index -> new SentenceGenerator<>(index, 50)
        )
        // Map the type of the codec from `List<Terminal<String>>`
        // to `String`
        .map(s -> s.stream().map(Symbol::name).collect(joining()))
        // Map the type of the codec from `String` to
        // `Tree<Op<Double>, ?>`
        .map(e -> e.isEmpty()
            ? TreeNode.of(Const.of(0.0))
            : parseTree(e));

    private static final Error<Double> ERROR =
        Error.of(LossFunction::mse);

    private final Sampling<Double> _sampling;

    public GrammarBasedRegression(Sampling<Double> sampling) {
        _sampling = requireNonNull(sampling);
    }

    public GrammarBasedRegression(List<Sample<Double>> samples) {
        this(Sampling.of(samples));
    }

    @Override
    public Codec<Tree<Op<Double>, ?>, IntegerGene> codec() {
        return CODEC;
    }

    @Override
    public Function<Tree<Op<Double>, ?>, Double> fitness() {
        return program -> {
            final Result<Double> result = _sampling.eval(program);
            return ERROR.apply(
                program, result.calculated(), result.expected()
            );
        };
    }

    public static void main(final String[] args) {
        final var regression = new GrammarBasedRegression(
            SymbolicRegression.SAMPLES
        );

        final Engine<IntegerGene, Double> engine = Engine
            .builder(regression)
            .alterers(
                new SwapMutator<>(),
                new SinglePointCrossover<>())
            .minimizing()
            .build();

        final EvolutionResult<IntegerGene, Double> result = engine
            .stream()
            .limit(Limits.byFitnessThreshold(0.05))
            .collect(EvolutionResult.toBestEvolutionResult());

        final Phenotype<IntegerGene, Double> best =
            result.bestPhenotype();

        final Tree<Op<Double>, ?> program =
            regression.decode(best.genotype());

        System.out.println(
            "Generations: " + result.totalGenerations());
        System.out.println(
            "Function: " + new MathExpr(program).simplify());
    }
}
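The comments in the listing state that every grammar rule gets its own integer chromosome and that the generated sentence is limited to 50 symbols. The following plain-Java sketch illustrates the general idea of such a mapping: integer codons select rule alternatives during a leftmost derivation. It is a generic illustration of grammatical evolution, not the implementation used by Mappers or SentenceGenerator. The class GrammarMappingSketch, its derive method and the modulo wrapping of codons are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GrammarMappingSketch {

    // The grammar of the listing: each rule maps to its alternatives.
    static final Map<String, List<List<String>>> GRAMMAR = Map.of(
        "<expr>", List.of(
            List.of("x"),
            List.of("<num>"),
            List.of("<expr>", "<op>", "<expr>")),
        "<op>", List.of(
            List.of("+"), List.of("-"), List.of("*"), List.of("/")),
        "<num>", List.of(
            List.of("2"), List.of("3"), List.of("4"))
    );

    // Derives a sentence from <expr> by always expanding the leftmost
    // non-terminal. Every rule reads from its own codon array. An empty
    // string is returned if a rule runs out of codons or if the sentence
    // would exceed 'limit' terminals.
    static String derive(final Map<String, int[]> codons, final int limit) {
        final Map<String, Integer> used = new HashMap<>();
        final Deque<String> todo = new ArrayDeque<>();
        todo.push("<expr>");

        final StringBuilder sentence = new StringBuilder();
        int terminals = 0;
        while (!todo.isEmpty()) {
            final String symbol = todo.pop();
            final List<List<String>> alternatives = GRAMMAR.get(symbol);
            if (alternatives == null) {
                // Terminal symbol: append it to the sentence.
                if (++terminals > limit) return "";
                sentence.append(symbol);
            } else {
                final int[] ruleCodons =
                    codons.getOrDefault(symbol, new int[0]);
                final int position = used.merge(symbol, 1, Integer::sum) - 1;
                if (position >= ruleCodons.length) return "";

                // The codon selects one alternative of the rule.
                final List<String> alternative = alternatives.get(
                    Math.floorMod(ruleCodons[position], alternatives.size()));

                // Push in reverse order, so the leftmost symbol is on top.
                for (int i = alternative.size() - 1; i >= 0; --i) {
                    todo.push(alternative.get(i));
                }
            }
        }
        return sentence.toString();
    }

    public static void main(final String[] args) {
        final Map<String, int[]> codons = Map.of(
            "<expr>", new int[] {2, 0, 1},
            "<op>", new int[] {2},
            "<num>", new int[] {2}
        );
        // prints x*4
        System.out.println(derive(codons, 50));
    }
}
```

Returning an empty string for a failed derivation mirrors the e.isEmpty() check in the listing, where an empty sentence is mapped to the constant program 0.0.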
5.9 DTLZ1
Deb, Thiele, Laumanns and Zitzler have proposed a set of generational
multi-objective optimization problems (MOPs) for testing and comparing
multi-objective evolutionary algorithms (MOEAs). This suite of benchmarks
attempts to define generic MOEA test problems that are scalable to a
user-defined number of objectives. Because of the last names of its creators,
this test suite is known as DTLZ (Deb-Thiele-Laumanns-Zitzler). [10]
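DTLZ1 is an M-objective problem with linear Pareto-optimal front: [17]
\[
\begin{aligned}
f_1(x) &= \tfrac{1}{2}\, x_1 x_2 \cdots x_{M-1} \left(1 + g(x_M)\right), \\
f_2(x) &= \tfrac{1}{2}\, x_1 x_2 \cdots (1 - x_{M-1}) \left(1 + g(x_M)\right), \\
&\;\;\vdots \\
f_{M-1}(x) &= \tfrac{1}{2}\, x_1 (1 - x_2) \left(1 + g(x_M)\right), \\
f_M(x) &= \tfrac{1}{2}\, (1 - x_1) \left(1 + g(x_M)\right), \\
\forall i \in [1, \ldots, n] &: 0 \le x_i \le 1
\end{aligned}
\]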
The functional $g(x_M)$ requires $|x_M| = k$ variables and can be any function
with $g \ge 0$. Typically, $g$ is defined as:
\[
g(x_M) = 100 \left[ |x_M| + \sum_{x_i \in x_M} \left( \left(x_i - \tfrac{1}{2}\right)^2 - \cos\left(20\pi \left(x_i - \tfrac{1}{2}\right)\right) \right) \right].
\]
In the above problem, the total number of variables is $n = M + k - 1$. The
search space contains $11^k - 1$ local Pareto-optimal fronts, each of which can
attract an MOEA.
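As a cross-check of the formulas, the following plain-Java sketch evaluates the DTLZ1 objective vector for a given point. It does not use any Jenetics classes and is not the encoding from the manual's listing. The class name Dtlz1Sketch and its methods are assumptions, and g is taken as a sum over the last k variables, as in the usual DTLZ1 definition.

```java
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static java.lang.Math.pow;

import java.util.Arrays;

public class Dtlz1Sketch {

    // g(x_M), evaluated over the last k = n - M + 1 variables of x.
    static double g(final double[] x, final int m) {
        final int k = x.length - m + 1;
        double sum = 0.0;
        for (int i = x.length - k; i < x.length; ++i) {
            sum += pow(x[i] - 0.5, 2) - cos(20.0*PI*(x[i] - 0.5));
        }
        return 100.0*(k + sum);
    }

    // Returns the objective vector (f_1, ..., f_M) for the point x.
    static double[] f(final double[] x, final int m) {
        final double gx = g(x, m);
        final double[] result = new double[m];
        for (int i = 0; i < m; ++i) {
            double value = 0.5*(1.0 + gx);
            // Leading product x_1 * ... * x_{M-1-i} (empty for f_M).
            for (int j = 0; j < m - 1 - i; ++j) {
                value *= x[j];
            }
            // Factor (1 - x_{M-i}), present for all objectives except f_1.
            if (i > 0) {
                value *= 1.0 - x[m - 1 - i];
            }
            result[i] = value;
        }
        return result;
    }

    public static void main(final String[] args) {
        // M = 3 objectives and k = 3, hence n = M + k - 1 = 5 variables.
        // All variables of x_M are 0.5, which makes g(x_M) zero.
        final double[] x = {0.2, 0.7, 0.5, 0.5, 0.5};
        final double[] objectives = f(x, 3);

        System.out.println("g = " + g(x, 3));
        System.out.println("f = " + Arrays.toString(objectives));
        System.out.println("sum = " + Arrays.stream(objectives).sum());
    }
}
```

With g equal to zero, the objective values sum to 0.5 (up to floating-point rounding), which corresponds to the linear Pareto-optimal front mentioned above.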
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static java.lang.Math.pow;

import io.jenetics.DoubleGene;
import io.jenetics.Mutator;
import io.jenetics.Phenotype;
import io.jenetics.TournamentSelector;
import io.jenetics.engine.Codecs;
import io.jenetics.engine.Engine;
import io.jenetics.engine.Problem;
import io.jenetics.util.DoubleRange;
import io.jenetics.util.ISeq;
import io.jenetics.util.IntRange;

import io.jenetics.ext.SimulatedBinaryCrossover;
import io.jenetics.ext.moea.MOEA;
The listing above shows the encoding of the DTLZ1 problem with the Jenetics
library. Figure 5.9.1 shows the Pareto-optimal front of the DTLZ1 optimization.
[Figure 5.9.1: Pareto-optimal front of the DTLZ1 optimization; the axes f1, f2 and f3 each range from 0 to 0.5]
Chapter 6
Build
For building the Jenetics library from source, download the most recent, stable
package version from https://github.com/jenetics/jenetics/releases and
extract it to some build directory.
<version> denotes the actual Jenetics version and <builddir> the actual build
directory. Alternatively you can check out the latest version from the Git master
branch.
Jenetics uses Gradle as build system and organizes the source into sub-projects
(modules). Each sub-project is located in its own sub-directory. If you call the
gradlew script, which is part of the downloaded package, the proper Gradle
version is automatically downloaded and you don't have to install Gradle
explicitly.
Published projects
• jenetics: This project contains the source code and tests for the Jenetics
base-module.
Non-published projects
• jenetics.example: This project contains example code for the base-
module.
• jenetics.incubator: This project contains experimental code which
might find its way into one of the official modules.
• jenetics.doc: Contains the code of the website and this manual.
• jenetics.tool: This module contains classes used for doing integration
testing and algorithmic performance testing. It is also used for creating
GA performance measures and creating diagrams from the performance
measures.
For building the library, change into the <builddir> directory (or one of the
module directories) and call one of the available tasks:
• compileJava: Compiles the Jenetics sources and copies the class files to
the <builddir>/<module-dir>/build/classes/main directory.
• jar: Compiles the sources and creates the JAR files. The artifacts are
copied to the <builddir>/<module-dir>/build/libs directory.
• test: Compiles and executes the unit tests. The test results are printed
onto the console and a test report, created by TestNG, is written to the
<builddir>/<module-dir> directory.
• javadoc: Generates the API documentation. The Javadoc is stored in the
<builddir>/<module-dir>/build/docs directory.
• clean: Deletes the <builddir>/build/* directories and removes all
generated artifacts.
For building the library from the source, call
$ cd <builddir>
$ gradle jar
or
$ ./gradlew jar
if you don't have the Gradle build system installed. Calling the Gradle
wrapper script will download all needed files and trigger the build task afterwards.
Maven Central The whole Jenetics package can also be downloaded from
the Maven Central repository http://repo.maven.apache.org/maven2:
Gradle
'io.jenetics:module-name:7.1.0'
3 https://github.com/jenetics/prngine
License The library itself is licensed under the Apache License, Version 2.0.
Copyright 2007-2022 Franz Wilhelmstötter
http://www.apache.org/licenses/LICENSE-2.0
Bibliography
[1] Otman Abdoun, Jaafar Abouchabaka, and Chakir Tajani. Analyzing the
performance of mutation operators to solve the travelling salesman problem.
CoRR, abs/1203.3099, 2012.
[2] Otman Abdoun, Chakir Tajani, and Jaafar Abouchabaka. Hybridizing
PSM and RSM operator for solving np-complete problems: Application to
travelling salesman problem. CoRR, abs/1203.5028, 2012.
[3] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge
University Press, 1998.
[4] Thomas Back. Evolutionary Algorithms in Theory and Practice. Oxford
University Press, 1996.
[5] James E. Baker. Reducing bias and inefficiency in the selection algorithm.
Proceedings of the Second International Conference on Genetic Algorithms
and their Application, pages 14–21, 1987.
[6] Shumeet Baluja and Rich Caruana. Removing the genetics from the standard
genetic algorithm. pages 38–46. Morgan Kaufmann Publishers, 1995.
[7] Heiko Bauke. Tina's random number generator library.
https://github.com/rabauke/trng4/blob/master/doc/trng.pdf, 2011.
[8] Tobias Blickle and Lothar Thiele. A comparison of selection schemes used
in evolutionary algorithms. Evolutionary Computation, 4:361–394, 1997.
[9] Joshua Bloch. Effective Java. Addison-Wesley Professional, 3rd edition,
2018.
[10] Carlos A. Coello Coello, Gary B. Lamont, and David A. Van Veldhuizen.
Evolutionary Algorithms for Solving Multi-Objective Problems. Genetic and
Evolutionary Computation. Springer, Berlin, Heidelberg, 2nd edition, 2007.
[11] P.K. Chawdhry, R. Roy, and R.K. Pant. Soft Computing in Engineering
Design and Manufacturing. Springer London, 1998.
[12] Carlos A. Coello Coello. Theoretical and numerical constraint-handling
techniques used with evolutionary algorithms: A survey of the state of the
art. Computer Methods in Applied Mechanics and Engineering,
191(11-12):1245–1287, 2002.
[14] Richard Dawkins. The Blind Watchmaker. New York: W. W. Norton &
Company, 1986.
[15] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist
multiobjective genetic algorithm: NSGA-II. Trans. Evol. Comp, 6(2):182–197,
April 2002.
[21] Raj Jain and Imrich Chlamtac. The P² algorithm for dynamic calculation
of quantiles and histograms without storing observations. Commun. ACM,
28(10):1076–1085, October 1985.
[22] David Jones. Good practice in (pseudo) random number generation for
bioinformatics applications, May 2010.
[26] Sean Luke. Essentials of Metaheuristics. Lulu, second edition, 2013.
Available for free at http://cs.gmu.edu/~sean/book/metaheuristics/.
[27] Efrén Mezura-Montes. Constraint-Handling in Evolutionary Optimization,
volume 198. 01 2009.
Index
Undirected graph, 52
Uniform crossover, 20
Unique fitness tournament selector, 99
Unique population, 79
Validation, 7
Vec, 96
VecFactory, 99
Vector codec, 55
Weasel program, 90
WeaselMutator, 91
WeaselSelector, 91
Weighted graph, 53