
JENETICS

LIBRARY USER’S MANUAL 7.1

Franz Wilhelmstötter
[email protected]

https://jenetics.io

7.1.0-2022/06/15

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. To view
a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a
letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041,
USA.
Contents

1 Fundamentals
  1.1 Introduction
  1.2 Architecture
  1.3 Base classes
    1.3.1 Domain classes
      1.3.1.1 Gene
      1.3.1.2 Chromosome
      1.3.1.3 Genotype
      1.3.1.4 Phenotype
      1.3.1.5 Population
    1.3.2 Operation classes
      1.3.2.1 Selector
      1.3.2.2 Alterer
    1.3.3 Engine classes
      1.3.3.1 Fitness function
      1.3.3.2 Engine
      1.3.3.3 Evolution
      1.3.3.4 EvolutionStream
      1.3.3.5 EvolutionResult
      1.3.3.6 EvolutionStatistics
      1.3.3.7 Evaluator
  1.4 Nuts and bolts
    1.4.1 Concurrency
      1.4.1.1 Basic configuration
      1.4.1.2 Concurrency tweaks
    1.4.2 Randomness
    1.4.3 Serialization
    1.4.4 Utility classes

2 Advanced topics
  2.1 Extending Jenetics
    2.1.1 Genes
    2.1.2 Chromosomes
    2.1.3 Selectors
    2.1.4 Alterers
    2.1.5 Statistics
    2.1.6 Engine
  2.2 Encoding
    2.2.1 Real function
    2.2.2 Scalar function
    2.2.3 Vector function
    2.2.4 Affine transformation
    2.2.5 Graph
  2.3 Codec
    2.3.1 Scalar codec
    2.3.2 Vector codec
    2.3.3 Matrix codec
    2.3.4 Subset codec
    2.3.5 Permutation codec
    2.3.6 Mapping codec
    2.3.7 Composite codec
    2.3.8 Invertible codec
  2.4 Problem
  2.5 Constraint
  2.6 Termination
    2.6.1 Fixed generation
    2.6.2 Steady fitness
    2.6.3 Evolution time
    2.6.4 Fitness threshold
    2.6.5 Fitness convergence
    2.6.6 Population convergence
    2.6.7 Gene convergence
  2.7 Reproducibility
  2.8 Evolution performance
  2.9 Evolution strategies
    2.9.1 (µ, λ) evolution strategy
    2.9.2 (µ + λ) evolution strategy
  2.10 Evolution interception

3 Modules
  3.1 io.jenetics.ext
    3.1.1 Data structures
      3.1.1.1 Tree
      3.1.1.2 Parentheses tree
      3.1.1.3 Flat tree
      3.1.1.4 Tree formatting
      3.1.1.5 Tree reduction
    3.1.2 Rewriting
      3.1.2.1 Tree pattern
      3.1.2.2 Tree rewriter
      3.1.2.3 Tree rewrite rule
      3.1.2.4 Tree rewrite system (TRS)
      3.1.2.5 Constant expression rewriter
    3.1.3 Genes
      3.1.3.1 BigInteger gene
      3.1.3.2 Tree gene
    3.1.4 Operators
    3.1.5 Weasel program
    3.1.6 Modifying Engine
      3.1.6.1 ConcatEngine
      3.1.6.2 CyclicEngine
    3.1.7 Multi-objective optimization
      3.1.7.1 Pareto efficiency
      3.1.7.2 Implementing classes
      3.1.7.3 Termination
      3.1.7.4 Mixed optimization
    3.1.8 Grammatical evolution
      3.1.8.1 Context-free grammar
      3.1.8.2 Backus-Naur form
      3.1.8.3 Sentence generation
      3.1.8.4 Mapping
      3.1.8.5 Implementing classes
  3.2 io.jenetics.prog
    3.2.1 Operations
    3.2.2 Program creation
    3.2.3 Program repair
    3.2.4 Program pruning
    3.2.5 Multi-root programs
    3.2.6 Symbolic regression
      3.2.6.1 Loss function
      3.2.6.2 Complexity function
      3.2.6.3 Error function
      3.2.6.4 Sample points
      3.2.6.5 Regression problem
    3.2.7 Boolean programs
  3.3 io.jenetics.xml
    3.3.1 XML writer
    3.3.2 XML reader
    3.3.3 Marshalling performance
  3.4 io.jenetics.prngine

4 Internals
  4.1 PRNG testing
  4.2 Random seeding

5 Examples
  5.1 Ones counting
  5.2 Real function
  5.3 Rastrigin function
  5.4 0/1 Knapsack
  5.5 Traveling salesman
  5.6 Evolving images
  5.7 Symbolic regression
  5.8 Grammar based regression
  5.9 DTLZ1

6 Build

Bibliography
Chapter 1

Fundamentals

Jenetics is an advanced Genetic Algorithm, Evolutionary Algorithm and Genetic Programming library, written in modern-day Java. It is designed with a clear separation of the several algorithm concepts, e. g. Gene, Chromosome, Genotype, Phenotype, population and fitness Function. Jenetics allows you to minimize or maximize a given fitness function without tweaking it. In contrast to other GA implementations, the library uses the concept of an evolution stream (EvolutionStream) for executing the evolution steps. Since the EvolutionStream implements the Java Stream interface, it works smoothly with the rest of the Java Stream API. This chapter describes the design concepts and their implementation. It also gives some basic examples and best practice tips.1

1.1 Introduction
Jenetics is a library, written in Java2, which provides a Genetic algorithm (GA), Evolutionary algorithm (EA), Multi-objective optimization (MOO) and Genetic programming (GP) implementation. It has no runtime dependencies on other libraries, except the Java 17 runtime. Jenetics is available on the Maven central repository3 and can be easily integrated into existing projects. The very clear structuring of the different parts of the GA allows an easy adaptation to different problem domains.

This manual is not an introduction or a tutorial for genetic and/or evolutionary algorithms in general. It is assumed that the reader has knowledge about the structure and the functionality of genetic algorithms. Good introductions to GAs can be found in [40], [29], [39], [26], [30] or [44]. For genetic programming you can have a look at [24] or [25].

To give you a first impression on how to use Jenetics, let's start with a simple »Hello World« program. This first example implements the well known bit counting problem.

1 The classes described in this chapter reside in the io.jenetics.base module or the io.jenetics:jenetics:7.1.0 artifact, respectively.
2 The library is built with and depends on Java SE 17: https://adoptium.net/
3 https://mvnrepository.com/artifact/io.jenetics/jenetics: If you are using Gradle, you can use the following dependency string: »io.jenetics:jenetics:7.1.0«.
import io.jenetics.BitChromosome;
import io.jenetics.BitGene;
import io.jenetics.Genotype;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionResult;
import io.jenetics.util.Factory;

public final class HelloWorld {
    // 2.) Definition of the fitness function.
    private static int eval(final Genotype<BitGene> gt) {
        return gt.chromosome()
            .as(BitChromosome.class)
            .bitCount();
    }

    public static void main(final String[] args) {
        // 1.) Define the genotype (factory) suitable
        //     for the problem.
        final Factory<Genotype<BitGene>> gtf =
            Genotype.of(BitChromosome.of(10, 0.5));

        // 3.) Create the execution environment.
        final Engine<BitGene, Integer> engine = Engine
            .builder(HelloWorld::eval, gtf)
            .build();

        // 4.) Start the execution (evolution) and
        //     collect the result.
        final Genotype<BitGene> result = engine.stream()
            .limit(100)
            .collect(EvolutionResult.toBestGenotype());

        System.out.println("Hello World:\n\t" + result);
    }
}
Listing 1.1: »Hello World« GA

In contrast to other GA implementations, Jenetics uses the concept of an evolution stream (EvolutionStream) for executing the evolution steps. Since the EvolutionStream extends the Java Stream interface, it works smoothly with the rest of the Java Stream API. Now let's have a closer look at listing 1.1 and discuss this simple program step by step:
1. Probably the most challenging part when setting up a new evolution Engine is to transform the native problem domain into an appropriate Genotype (factory) representation.4 In our example we want to count
the number of ones of a BitChromosome. Since we are counting only
the ones of one chromosome, we are adding only one BitChromosome
to our Genotype. In general, the Genotype can be created with 1 to n
chromosomes. For a detailed description of the genotype’s structure have
a look at section 1.3.1.3.
2. Once this is done, the fitness function, which we are trying to maximize,
can be defined. We can simply write a private static method, which
4 Section 2.2 on page 47 describes some common problem encodings.


takes the Genotype we defined and calculate its fitness value. If we


want to use the optimized bit counting method, bitCount(), we have to
cast the Chromosome<BitGene> class to the actual used BitChromosome
class. Since we know for sure that we created the Genotype with a
BitChromosome, this is a safe operation. A reference to the eval method
is then used as a fitness function and passed to the Engine::build method.
3. In the third step we are creating the evolution Engine, which is responsible
for evolving the given population. The Engine is highly configurable and
takes parameters for controlling the evolutionary and the computational
environment. For changing the evolutionary behavior, you can set different
alterers and selectors (see section 1.3.2). By changing the used Executor service, you control the number of threads the Engine is allowed to use (a configuration sketch follows this list). A new Engine instance can only be created via its builder, which is created by calling the Engine.builder method.

4. In the last step, we will create a new EvolutionStream from our Engine.
The EvolutionStream is the model (or view) of the evolutionary process.
It serves as a »process handle« and allows us, among other things, to
control the termination of the evolution. In our example, we simply
truncate the stream after 100 generations. If you don’t limit the stream,
the EvolutionStream will never terminate and run forever. The final
result, the best Genotype in our example, is then collected with one of
the predefined collectors of the EvolutionResult class.
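As mentioned in step 3, the following sketch shows how the Engine and the EvolutionStream of listing 1.1 could be configured further. The thread count, population size, alterer probabilities and limits are illustrative values, not part of the original example.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import io.jenetics.BitGene;
import io.jenetics.Genotype;
import io.jenetics.Mutator;
import io.jenetics.SinglePointCrossover;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionResult;
import io.jenetics.engine.Limits;

// Assumed configuration values; 'gtf' and 'HelloWorld::eval' are taken from listing 1.1.
final ExecutorService executor = Executors.newFixedThreadPool(4);

final Engine<BitGene, Integer> engine = Engine
    .builder(HelloWorld::eval, gtf)
    .populationSize(500)                   // size of the evolved population
    .executor(executor)                    // controls the number of worker threads
    .alterers(
        new Mutator<>(0.015),              // mutation probability
        new SinglePointCrossover<>(0.16))  // crossover probability
    .build();

// Terminate after 20 generations without fitness improvement,
// but after at most 500 generations.
final Genotype<BitGene> best = engine.stream()
    .limit(Limits.bySteadyFitness(20))
    .limit(500)
    .collect(EvolutionResult.toBestGenotype());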
As the example shows, Jenetics makes heavy use of the Stream and Collector classes. Also lambda expressions and the functional interfaces (SAM types) play an important role in the library design.
There are many other GA implementations out there and they may slightly differ in the order of the single execution steps. Jenetics uses a classical approach. Listing 1.2 shows the (imperative) pseudocode of the Jenetics genetic algorithm steps.
1  P0 ← Pinitial
2  F(P0)
3  while !finished do
4      g ← g + 1
5      Sg ← selectS(Pg−1)
6      Og ← selectO(Pg−1)
7      Og ← alter(Og)
8      Pg ← filter[gi ≥ gmax](Sg) + filter[gi ≥ gmax](Og)
9      F(Pg)
Listing 1.2: Genetic algorithm

In line (1) the initial population is created and line (2) calculates the fitness
value of the individuals. The initial population is created implicitly before the
first evolution step is performed. Line (4) increases the generation number
and line (5) and (6) selects the survivor and the offspring population. The
offspring/survivors fraction is determined by the offspringFraction property
of the Engine.Builder. The selected offspring are altered in line (7). The next
line combines the survivor population and the altered offspring population—after
removing the killed individuals—to the new population. The steps from line (4)
to (9) are repeated until a given termination criterion is fulfilled.


1.2 Architecture
The basic metaphor of the Jenetics library is the Evolution Stream, implemented
as Java Stream. An evolution stream is powered by—and bound to—an Evolution
Engine, which performs the needed evolution steps for each generation; the steps
are described in the body of the while loop of listing 1.2.

Figure 1.2.1: Evolution workflow

The described evolution workflow is also illustrated in figure 1.2.1, where ES(i) denotes the EvolutionStart object at generation i and ER(i) the EvolutionResult at the ith generation. Once the evolution Engine is created, it
can be used by multiple EvolutionStreams which can be safely used in different
execution threads. This is possible because the evolution Engine doesn’t have
any mutable global state and is therefore thread safe. It is practically a stateless
function, fE : P → P, which maps a start population, P, to an evolved result
population. The Engine function, fE , is, of course, nondeterministic. Calling it
twice with the same start population will lead to different result populations.
The evolution process terminates, if the EvolutionStream is truncated.
The EvolutionStream truncation is controlled by the limit predicate. As
long as the predicate returns true, the evolution is continued.5 At last, the
EvolutionResult is collected from the EvolutionStream by one of the available
EvolutionResult collectors.
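Because the Engine is stateless and thread safe, one engine instance can power several independent evolution streams. A minimal sketch, assuming an already configured engine such as the one from listing 1.1:

import io.jenetics.BitGene;
import io.jenetics.Genotype;
import io.jenetics.engine.EvolutionResult;

// 'engine' is assumed to be an already configured Engine<BitGene, Integer>.
final Genotype<BitGene> best1 = engine.stream()
    .limit(100)
    .collect(EvolutionResult.toBestGenotype());

// A second, independent stream from the same engine; both streams could also
// run concurrently in different threads.
final Genotype<BitGene> best2 = engine.stream()
    .limit(100)
    .collect(EvolutionResult.toBestGenotype());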

Figure 1.2.2: Evolution engine model

Figure 1.2.2 shows the static view of the main evolution classes, together
with its dependencies. Since the Engine class itself is immutable, and can’t
be changed after creation, it is instantiated (configured) via a builder. The
Engine can be used to create an arbitrary number of EvolutionStreams. The
EvolutionStream is used to control the evolutionary process and collect the
final result. This is done in the same way as for the normal java.util.stream.Stream classes. With the additional limit(Predicate) method, it is possible to truncate the EvolutionStream if some termination criterion is fulfilled. The separation of Engine and EvolutionStream is the separation of evolution definition and evolution execution.

5 See section 2.6 on page 67 for a detailed description of the available termination strategies.

Figure 1.2.3: Package structure

In figure 1.2.3 the package structure of the library is shown and it consists of
the following packages:
io.jenetics This is the base package of the Jenetics library and contains all domain classes like Gene, Chromosome, Genotype or Phenotype. All of these types are immutable data classes. It also contains the Selector and Alterer interfaces and their implementations. The classes in this package are (almost) sufficient to implement an own evolution engine.
io.jenetics.engine This package contains the actual GA implementation
classes, e. g. Engine, EvolutionStream or EvolutionResult. They
mainly operate on the domain classes in the io.jenetics package.

io.jenetics.stat This package contains additional statistics classes which are not available in the Java core library. Java only includes classes for calculating the sum and the average of a given numeric stream (e. g. DoubleSummaryStatistics). With these additions it is also possible to calculate the variance, skewness and kurtosis—using the DoubleMomentStatistics class. The EvolutionStatistics object, which can be calculated for every generation, relies on the classes in this package.
io.jenetics.util This package contains the collection classes (BaseSeq, Seq,
ISeq and MSeq) which are used in the public interfaces of the Chromosome
and Genotype. It also contains the RandomRegistry, which implements
the global PRNG lookup, as well as helper IO classes for serializing Genotypes and whole populations.

1.3 Base classes


This chapter describes the base classes which are needed to set up and run a genetic algorithm with the Jenetics6 library. They can roughly be divided into three types:
6 The documentation of the whole API is part of the download package or can be viewed online: https://jenetics.io/javadoc/jenetics/7.1/index.html.


Domain classes These classes form the domain model of the evolutionary
algorithm and contain the structural classes like Gene and Chromosome.
They are directly located in the io.jenetics package.
Operation classes These classes operate on the domain classes and include the Alterer and Selector interfaces. They are also located in the io.jenetics package.
Engine classes These classes implement the actual evolutionary algorithm and
can be found in the io.jenetics.engine package.

1.3.1 Domain classes


Most of the domain classes are pure data classes and can be treated as value
objects7 . All Gene and Chromosome implementations are immutable as well as
the Genotype and Phenotype class.

Figure 1.3.1: Domain model

Figure 1.3.1 shows the class diagram of the domain classes. The Gene is
the base of the class structure. Genes are aggregated in Chromosomes, and
one to n Chromosomes are aggregated in Genotypes. A Genotype and a fitness Function form a Phenotype; the Phenotypes are collected into a population Seq.

1.3.1.1 Gene
The basic building blocks of the Jenetics library contain the actual information of the encoded solution, the allele. Some of the implementations also contain domain information of the wrapped allele. This is the case for all BoundedGenes, which contain the allowed minimum and maximum values. All Gene implementations are final and immutable. In fact, they are all value-based classes and fulfill the properties which are described in the Java API documentation[33].8 Besides the container functionality for the allele, every Gene is its own factory and is able to create new, random instances of the same type and with the same constraints9. The factory methods are used by the Alterers for creating new
7 https://en.wikipedia.org/wiki/Value_object
8 It is also worth reading the blog entry from Stephen Colebourne: http://blog.joda.org/2014/03/valjos-value-java-objects.html
9 A constraint can restrict the space of valid values of a given problem domain. An example would be a DoubleGene, where the allowed minimal and maximal value of the double allele is part of the gene.


Genes from the existing ones and play a crucial role in the exploration of the problem space.

public interface Gene<A, G extends Gene<A, G>>
    extends Factory<G>, Verifiable
{
    A allele();
    boolean isValid();
    G newInstance();
    G newInstance(A allele);
}
Listing 1.3: Gene interface
Listing 1.3 shows the most important methods of the Gene interface. The
isValid method, defined in the Verifiable interface, allows the gene to mark
itself as invalid, e. g. when its allele is not within the allowed range. All invalid
genes are replaced with new ones during the evolution phase. The available
Gene implementations in the Jenetics library cover a wide range of problem
encodings. Refer to chapter 2.1.1 for how to implement your own Gene types.

1.3.1.2 Chromosome
A Chromosome is a collection of Genes which must contain at least one Gene.
This allows defining problems which require more than one Gene to encode. Like
the Gene interface, the Chromosome is also its own factory and allows creation
of a new Chromosome from a given Gene sequence.
public interface Chromosome<G extends Gene<?, G>>
    extends Factory<Chromosome<G>>, BaseSeq<G>, Verifiable
{
    G get(int index);
    int length();
    Chromosome<G> newInstance(ISeq<G> genes);
}
Listing 1.4: Chromosome interface
Listing 1.4 shows the main methods of the Chromosome interface. These are the methods for accessing single Genes by their index and the factory method for creating a new Chromosome from a given sequence of Genes. The factory method is used by the Alterer classes which are able to create altered Chromosomes from a (changed) Gene sequence. Most of the Chromosome implementations can be created with variable length. E. g. the IntegerChromosome can be created with variable length, where the minimum value of the length range is included and the maximum value of the length range is excluded.
IntegerChromosome chromosome = IntegerChromosome.of(
    0, 1_000, IntRange.of(5, 9)
);

The factory method of the IntegerChromosome will now create chromosome instances with a length between [range_min, range_max), equally distributed. Figure 1.3.2 shows the structure of a Chromosome with variable length.

1.3.1.3 Genotype
The central class the evolution Engine is working with is the Genotype. It is the structural and immutable representative of an individual


Figure 1.3.2: Chromosome structure

and consists of one to n Chromosomes. All Chromosomes must be parameterized with the same Gene type, but each Chromosome is allowed to have different lengths and constraints. The allowed minimal and maximal values of a NumericChromosome are an example of such a constraint. Within the same Chromosome, all alleles must lie within the defined minimal and maximal values.

Figure 1.3.3: Genotype structure

Figure 1.3.3 shows the Genotype structure. A Genotype consists of NG Chromosomes and a Chromosome consists of NC[i] Genes (depending on the Chromosome). The overall number of Genes of a Genotype is given by the sum of the Chromosome's Genes, which can be accessed via the Genotype.geneCount() method:

    N_g = \sum_{i=0}^{N_G - 1} N_{C[i]}        (1.3.1)

As already mentioned, the Chromosomes of a Genotype don't necessarily have to have the same size. It is only required that all genes are from the same type and the Genes within a Chromosome have the same constraints; e. g. the same minimal and maximal values for numerical Genes.

Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0, 8),
    DoubleChromosome.of(1.0, 2.0, 10),
    DoubleChromosome.of(0.0, 10.0, 9),
    DoubleChromosome.of(0.1, 0.9, 5)
);


The code snippet in the listing above creates a Genotype with the same structure
as shown in figure 1.3.3. In this example the DoubleGene has been chosen as
the Gene type.
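For the Genotype created above, equation 1.3.1 gives a gene count of 8 + 10 + 9 + 5 = 32, which can be verified directly. A minimal check, reusing the genotype variable from the snippet above:

// The overall number of genes is the sum of the four chromosome lengths.
final int count = genotype.geneCount();
assert count == 32;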

Genotype vector The Genotype is essentially a two-dimensional composition of Genes. This makes it trivial to create Genotypes which can be treated as Gene matrices. If it's needed to create a vector of Genes, there are two possibilities to do so:
1. creating a row major or
2. creating a column major
Genotype vector. Each of the two possibilities has specific advantages and disadvantages.

Figure 1.3.4: Row major Genotype vector

Figure 1.3.4 shows a Genotype vector in row major layout. A Genotype vector of length n needs one Chromosome of length n. Each Gene of such a vector must obey the same constraints. E. g., for Genotype vectors containing NumericGenes, all Genes must have the same minimum and maximum values. If the problem space doesn't need to have different minimum and maximum values, the row major Genotype vector is the preferred choice. Besides the easier Genotype creation, the available Recombinator alterers are more efficient in exploring the search domain.

If the problem space allows equal Gene constraints, the row major Genotype vector encoding should be chosen. It is easier to create and the available Recombinator classes are more efficient in exploring the search domain.

The following code snippet shows the creation of a row major Genotype
vector. All Alterers derived from the Recombinator do a fairly good job in
exploring the problem space for a row major Genotype vector.
Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0, 8)
);

The column major Genotype vector layout must be chosen when the problem
space requires Genes with different constraints. This is almost the only reason for
choosing the column major layout. The layout of this Genotype vector is shown in figure 1.3.5. For a vector of length n, n Chromosomes of length one are needed.


Figure 1.3.5: Column major Genotype vector

The code snippet below shows how to create a Genotype vector in column
major layout. It’s a little bit more effort to create such a vector, since every
Gene has to be wrapped into a separate Chromosome. The DoubleChromosome
in the given example has a length of one, when the length parameter is omitted.
Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0),
    DoubleChromosome.of(1.0, 2.0),
    DoubleChromosome.of(0.0, 10.0),
    DoubleChromosome.of(0.1, 0.9)
);

The greater flexibility of a column major Genotype vector comes at the cost of a lower exploration capability of the Recombinator alterers. Using Crossover alterers will have the same effect as the SwapMutator, when used with row major Genotype vectors. Recommended alterers for vectors of NumericGenes are:
• MeanAlterer10 ,
• LineCrossover11 and

• IntermediateCrossover12
See also 2.3.2 for an advanced description on how to use the predefined vector
codecs.

Genotype scalar A special case of a Genotype contains only one Chromosome with length one. The layout of such a Genotype scalar is shown in figure 1.3.6. Such Genotypes are mostly used for encoding real function problems.
How to create a Genotype for a real function optimization problem is shown in the code snippet below. The recommended Alterers are the same as for column major Genotype vectors: MeanAlterer, LineCrossover and IntermediateCrossover.
Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0)
);
10 See 1.3.2.2 on page 21.
11 See 1.3.2.2 on page 21.
12 See 1.3.2.2 on page 21.


Figure 1.3.6: Genotype scalar

See also 2.3.1 for an advanced description on how to use the predefined scalar
codecs.

1.3.1.4 Phenotype
The Phenotype is the actual representative of an individual and consists of
the Genotype, the generation where the Phenotype has been created and an
optional fitness value. Like the Genotype, the Phenotype is immutable and
can’t be changed after creation.
public final class Phenotype<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Comparable<Phenotype<G, C>>
{
    public Genotype<G> genotype();
    public long generation();
    public C fitness();
    public boolean isEvaluated();
    public Phenotype<G, C> withFitness(C fitness);
}
Listing 1.5: Phenotype class

Listing 1.5 shows the main methods of the Phenotype. The fitness property will return the actual fitness value of the Genotype, and the Genotype can be fetched with the genotype() method. If no fitness value is associated with the Phenotype yet, the fitness() method will throw a NoSuchElementException. Whether the fitness value has been set can be checked with the isEvaluated() method. Setting a fitness value can be done with the withFitness(C) method. Since the Phenotype is immutable, this method returns a new Phenotype with the set fitness value. In addition to the fitness value, the Phenotype contains the generation in which it was created. This allows calculating the current age of an individual and removing overaged individuals from the population.
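As a small illustration of these methods, a Phenotype could be created and evaluated as sketched below; the genotype, generation and fitness values are arbitrary examples.

import io.jenetics.DoubleChromosome;
import io.jenetics.DoubleGene;
import io.jenetics.Genotype;
import io.jenetics.Phenotype;

// Create a phenotype at generation 1; its fitness is not yet evaluated.
final Genotype<DoubleGene> gt = Genotype.of(DoubleChromosome.of(0.0, 1.0));
final Phenotype<DoubleGene, Double> pt = Phenotype.of(gt, 1);
assert !pt.isEvaluated();

// Attach a fitness value; since the Phenotype is immutable, a new instance is returned.
final Phenotype<DoubleGene, Double> evaluated = pt.withFitness(42.0);
assert evaluated.isEvaluated();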

1.3.1.5 Population
There is no special class which represents a population. It’s just a collection of
Phenotypes. As a collection class, the ISeq interface is used. The ISeq interface
allows for the expression of the immutability of the population at the type level
and makes the code more readable. For a detailed description of these collection classes see section 1.4.4.


1.3.2 Operation classes


Genetic operators are used for creating genetic diversity (Alterer) and selecting
potentially useful solutions for recombination (Selector). This section gives an
overview about the genetic operators available in the Jenetics library. It also
contains some theoretical information, which should help you to choose the right
combination of operators and parameters, for the problem to be solved.

1.3.2.1 Selector
Selectors are responsible for selecting a given number of individuals from the
population. The selectors are used to divide the population into survivors
and offspring. The selectors for offspring and for the survivors can be set
independently.

The selection process of the Jenetics library acts on Phenotypes and indi-
rectly, via the fitness function, on Genotypes. Direct Gene or population
selection is not supported by the library.

Engine<DoubleGene, Double> engine = Engine.builder(...)
    .offspringFraction(0.7)
    .survivorsSelector(new RouletteWheelSelector<>())
    .offspringSelector(new TournamentSelector<>())
    .build();

The offspringFraction, fO ∈ [0, 1], determines the number of selected offspring

    N_{O_g} = \|O_g\| = \mathrm{rint}(\|P_g\| \cdot f_O)        (1.3.2)

and the number of selected survivors

    N_{S_g} = \|S_g\| = \|P_g\| - \|O_g\|.        (1.3.3)
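For example, with a population size of ∥Pg∥ = 100 and fO = 0.7, equation 1.3.2 yields 70 selected offspring and equation 1.3.3 the remaining 30 survivors.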

The Jenetics library contains the following selector implementations:

• TournamentSelector
• TruncationSelector
• MonteCarloSelector
• ProbabilitySelector
• RouletteWheelSelector
• LinearRankSelector
• ExponentialRankSelector
• BoltzmannSelector
• StochasticUniversalSelector
• EliteSelector

Besides the well known standard selector implementations, the ProbabilitySelector is the base of a set of fitness proportional selectors.

Tournament selector In tournament selection the best individual from a random sample of s individuals is chosen from the population, Pg. The samples are drawn with replacement. An individual will win a tournament only if the fitness is greater than the fitness of the other s − 1 competitors. Note that the worst individual never survives, and the best individual wins in all the tournaments in which it participates. The selection pressure can be varied by changing the tournament size, s. For large values of s, weak individuals have less chance of being selected. Compared with fitness proportional selectors, the tournament selector is often used in practice because of its lack of stochastic noise. Tournament selectors are also independent of the scaling of the genetic algorithm fitness function.
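A minimal usage sketch, assuming a tournament size of s = 5 and a fitness function ff and genotype factory gtf as in the earlier engine example:

import io.jenetics.DoubleGene;
import io.jenetics.TournamentSelector;
import io.jenetics.engine.Engine;

// Use tournament selection (sample size 5) for the offspring population.
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .offspringSelector(new TournamentSelector<DoubleGene, Double>(5))
    .build();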

Truncation selector In truncation selection individuals are sorted according to their fitness and only the n best individuals are selected. Truncation selection is a very basic selection algorithm. Its strength is the fast selection of individuals in large populations, but it is not very often used in practice. In animal and plant breeding, however, truncation selection is a standard method: only the best animals, ranked by their phenotypic value, are selected for reproduction.

Monte Carlo selector The Monte Carlo selector selects the individuals from
a given population randomly. Instead of a directed search, the Monte Carlo
selector performs a random search. This selector can be used to measure the
performance of other selectors. In general, the performance of a selector should
be better than the selection performance of the Monte Carlo selector. If the
Monte Carlo selector is used for selecting the parents for the population, it will
be a little bit more disruptive, on average, than roulette wheel selection.[40]

Probability selectors Probability selectors are a variation of fitness proportional selectors and select individuals from a given population based on their selection probability, P(i). Fitness proportional selection works as shown in

Figure 1.3.7: Fitness proportional selection

figure 1.3.7. A uniformly distributed random number r ∈ [0, F) specifies which individual is selected, by argument minimization:

    i \leftarrow \operatorname*{argmin}_{n \in [0, N)} \left\{ r < \sum_{i=0}^{n} f_i \right\}        (1.3.4)

where N is the number of individuals and fi the fitness value of the ith individual. The probability selector works the same way, only the fitness value, fi, is replaced by the individual's selection probability, P(i). It is not necessary to sort the population. The selection probability of an individual, i, follows a binomial distribution

    P(i, k) = \binom{n}{k} P(i)^k \left(1 - P(i)\right)^{n-k}        (1.3.5)

where n is the overall number of selected individuals and k the number of individuals i in the set of selected individuals. The runtime complexity of the implemented probability selectors is O(n + log(n)) instead of O(n²) as for the naive approach: a binary (index) search is performed on the summed probability array.

Roulette-wheel selector The roulette-wheel selector is also known as the fitness proportional selector and Jenetics implements it as a probability selector. For calculating the selection probability, P(i), the fitness value, fi, of individual i is used.

    P(i) = \frac{f_i}{\sum_{j=0}^{N-1} f_j}        (1.3.6)

Selecting n individuals from a given population is equivalent to playing n times on the roulette-wheel. The population doesn't have to be sorted before selecting the individuals. Notice that equation 1.3.6 assumes that all fitness values are positive and that the sum of the fitness values is not zero. To cope with negative fitnesses, an adapted formula is used for calculating the selection probabilities.

    P'(i) = \frac{f_i - f_{\min}}{\sum_{j=0}^{N-1} (f_j - f_{\min})},        (1.3.7)

where

    f_{\min} = \min_{i \in [0, N)} \{f_i, 0\}

As you can see, the worst fitness value, fmin, if negative, now has a selection probability of zero. In the case that the sum of the corrected fitness values is zero, the selection probability of all fitness values will be set to 1/N.
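As a short worked example of equation 1.3.7: for the fitness values {−1, 2, 3} we get fmin = −1, the corrected values {0, 3, 4} sum to 7, and the resulting selection probabilities are {0, 3/7, 4/7}.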

Linear-rank selector The roulette-wheel selector will have problems when the fitness values differ very much. If the best Chromosome fitness is 90%, its circumference occupies 90% of the roulette-wheel, and then other Chromosomes have too few chances to be selected.[40] In linear-ranking selection the individuals are sorted according to their fitness values. The rank N is assigned to the best individual and the rank 1 to the worst individual. The selection probability, P(i), of individual i is linearly assigned to the individuals according to their rank.

    P(i) = \frac{1}{N} \left( n^- + \left( n^+ - n^- \right) \frac{i - 1}{N - 1} \right)        (1.3.8)

Here n−/N is the probability of the worst individual being selected and n+/N the probability of the best individual being selected. As the population size is held constant, the conditions n+ = 2 − n− and n− ≥ 0 must be fulfilled. Note that all individuals get a different rank, respectively a different selection probability, even if they have the same fitness value.[8]
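As an illustration of equation 1.3.8, with the (arbitrarily chosen) parameters N = 4, n− = 0.5 and n+ = 1.5, the selection probabilities from worst to best individual are 0.125, ≈0.208, ≈0.292 and 0.375, which sum up to one.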


Exponential-rank selector An alternative to the weak linear-rank selector is to assign survival probabilities to the sorted individuals using an exponential function:

    P(i) = (c - 1) \frac{c^{i-1}}{c^N - 1},        (1.3.9)

where c must be within the range [0, 1). A small value of c increases the probability of the best individual to be selected. If c is set to zero, the selection probability of the best individual is set to one. The selection probability of all other individuals is zero. A value near one equalizes the selection probabilities. This selector sorts the population in descending order before calculating the selection probabilities.

Boltzmann selector The selection probability of the Boltzmann selector is defined as

    P(i) = \frac{e^{b \cdot f_i}}{Z},        (1.3.10)

where b is a parameter which controls the selection intensity and Z is defined as

    Z = \sum_{i=1}^{n} e^{f_i}.        (1.3.11)

Positive values of b increase the selection probability of individuals with high fitness values and negative values of b decrease it. If b is zero, the selection probability of all individuals is set to 1/N.

Stochastic-universal selector Stochastic-universal selection[4] (SUS) is a method for selecting individuals according to some given probability in a way that minimizes the chance of fluctuations. It can be viewed as a type of roulette game where we now have p equally spaced points which we spin. SUS uses a single random value for selecting individuals by choosing them at equally spaced intervals. Weaker members of the population (according to their fitness) have a better chance to be chosen, which reduces the unfair nature of fitness-proportional selection methods. The selection method was introduced by James Baker.[5] Figure 1.3.8 shows the function of the stochastic-universal selection,
Figure 1.3.8: Stochastic-universal selection

where n is the number of individuals to select. Stochastic-universal sampling ensures a selection of offspring, which is closer to what is deserved than roulette wheel selection.[40]


Elite selector The EliteSelector copies a small proportion of the fittest candidates, without changes, into the next generation. This may have a dramatic impact on performance by ensuring that the GA doesn't waste time rediscovering previously discarded partial solutions. Individuals that are preserved through elitism remain eligible for selection as parents of the next generation. Elitism is also related to memory: remember the best solution found so far. A problem with elitism is that it may cause the GA to converge to a local optimum, so pure elitism is a race to the nearest local optimum. The elite selector implementation of the Jenetics library also lets you specify the selector for the non-elite individuals.
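A usage sketch, assuming three elite individuals per generation and a tournament selector for the non-elite part of the selection (ff and gtf as in the earlier examples):

import io.jenetics.DoubleGene;
import io.jenetics.EliteSelector;
import io.jenetics.TournamentSelector;
import io.jenetics.engine.Engine;

// Copy the 3 best individuals unchanged; select the remaining survivors via tournament selection.
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .survivorsSelector(new EliteSelector<DoubleGene, Double>(
        3,
        new TournamentSelector<DoubleGene, Double>(5)))
    .build();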

1.3.2.2 Alterer
The problem encoding/representation determines the bounds of the search space,
but the Alterers determine how the space can be traversed: Alterers are
responsible for the genetic diversity of the EvolutionStream. The two Alterer
hierarchies used in Jenetics are:
1. mutation and

2. recombination (e. g. crossover).

First we will have a look at the mutation — There are two distinct
roles mutation plays in the evolution process:

1. Exploring the search space: By making small moves, mutation allows a population to explore the search space. This exploration is often slow compared to crossover, but in problems where crossover is disruptive this can be an important way to explore the landscape.

2. Maintaining diversity: Mutation prevents a population from converging to a local minimum by stopping the solutions from becoming too close to one another. A genetic algorithm can improve the solution solely by the mutation operator. Even if most of the search is being performed by crossover, mutation can be vital to provide the diversity which crossover needs.
The mutation probability, P (m), is the parameter that must be optimized. The
optimal value of the mutation rate depends on the role mutation plays. If
mutation is the only source of exploration (if there is no crossover), the mutation
rate should be set to a value that ensures that a reasonable neighborhood of
solutions is explored.
The mutation probability, P(m), is defined as the probability that a specific gene, over the whole population, is mutated. That means, the (average) number of genes mutated by a mutator is

    \hat{\mu} = N_P \cdot N_g \cdot P(m)        (1.3.12)

where Ng is the number of available genes of a genotype and NP the population size (refer to equation 1.3.1).
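As a short worked example of equation 1.3.12: for a population of NP = 50 genotypes with Ng = 32 genes each (the genotype from section 1.3.1.3) and a mutation probability of P(m) = 0.01, on average µ̂ = 50 · 32 · 0.01 = 16 genes are mutated per generation.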


Mutator The mutator has to deal with the problem that the genes are arranged in a hierarchical structure with three levels (see chapter 1.3.1.3). The mutator selects the gene which will be mutated in three steps:
1. Select a genotype, G[i], from the population with probability PG(m),
2. select a chromosome, C[j], from the selected genotype, G[i], with probability PC(m) and
3. select a gene, g[k], from the selected chromosome, C[j], with probability Pg(m).
The needed sub-selection probabilities are set to

    P_G(m) = P_C(m) = P_g(m) = \sqrt[3]{P(m)}.        (1.3.13)
Gaussian mutator The Gaussian mutator performs the mutation of number genes. This mutator picks a new value based on a Gaussian distribution around the current value of the gene. The variance of the new value (before clipping to the allowed gene range) will be

    \hat{\sigma}^2 = \left( \frac{g_{\max} - g_{\min}}{4} \right)^2        (1.3.14)

where gmin and gmax are the valid minimum and maximum values of the number gene. The new value will be cropped to the gene's boundaries.
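For a DoubleGene bounded to [0, 1], equation 1.3.14 gives a standard deviation of (1 − 0)/4 = 0.25. A usage sketch, with an assumed mutation probability of 0.1 and ff/gtf as in the earlier examples:

import io.jenetics.DoubleGene;
import io.jenetics.GaussianMutator;
import io.jenetics.engine.Engine;

// Mutate, on average, 10% of the genes using a Gaussian perturbation around the current value.
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .alterers(new GaussianMutator<DoubleGene, Double>(0.1))
    .build();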

Swap mutator The swap mutator changes the order of genes in a chromosome,
with the hope of bringing related genes closer together, thereby facilitating the
production of building blocks. This mutation operator can also be used for
combinatorial problems, where no duplicated genes within a chromosome are
allowed, e. g. for the TSP.
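A sketch of how the swap mutator might be combined with a permutation encoding, e.g. for a TSP-like problem; the tour length of 25 and the probability of 0.2 are assumed values.

import io.jenetics.EnumGene;
import io.jenetics.Genotype;
import io.jenetics.PermutationChromosome;
import io.jenetics.SwapMutator;
import io.jenetics.util.Factory;

// A permutation of the integers 0..24, e.g. a tour over 25 cities.
final Factory<Genotype<EnumGene<Integer>>> gtf =
    Genotype.of(PermutationChromosome.ofInteger(25));

// Swap two genes within a chromosome with a probability of 0.2.
final SwapMutator<EnumGene<Integer>, Double> mutator = new SwapMutator<>(0.2);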

The second alterer type is the recombination — An enhanced genetic algorithm (EGA) combines elements of existing solutions in order to create a new
solution, with some of the properties of each parent. Recombination creates a
new chromosome by combining parts of two (or more) parent chromosomes. This
combination of chromosomes can be made by selecting one or more crossover
points, splitting these chromosomes on the selected points, and merging those
portions of different chromosomes to form new ones.
void recombine(final MSeq<Phenotype<G, C>> pop) {
    // Select the Genotypes for crossover.
    final RandomGenerator random = RandomRegistry.random();
    final int i1 = random.nextInt(pop.length());
    final int i2 = random.nextInt(pop.length());
    final Phenotype<G, C> pt1 = pop.get(i1);
    final Phenotype<G, C> pt2 = pop.get(i2);
    final Genotype<G> gt1 = pt1.genotype();
    final Genotype<G> gt2 = pt2.genotype();

    // Choosing the Chromosome for crossover.
    final int chIndex =
        random.nextInt(min(gt1.length(), gt2.length()));
    final MSeq<Chromosome<G>> c1 = MSeq.of(gt1);
    final MSeq<Chromosome<G>> c2 = MSeq.of(gt2);
    final MSeq<G> genes1 = MSeq.of(c1.get(chIndex));
    final MSeq<G> genes2 = MSeq.of(c2.get(chIndex));

    // Perform the crossover.
    crossover(genes1, genes2);
    c1.set(chIndex, c1.get(chIndex).newInstance(genes1.toISeq()));
    c2.set(chIndex, c2.get(chIndex).newInstance(genes2.toISeq()));

    // Creating two new Phenotypes and replace the old one.
    pop.set(i1, Phenotype.of(Genotype.of(c1.toISeq())));
    pop.set(i2, Phenotype.of(Genotype.of(c2.toISeq())));
}
Listing 1.6: Chromosome selection for recombination

Listing 1.6 shows how two chromosomes are selected for recombination. It is
done this way for preserving the given constraints and to avoid the creation of
invalid individuals.

Because of the possibly different Chromosome lengths and/or Chromosome constraints within a Genotype, only Chromosomes with the same Genotype position are recombined (see listing 1.6).

The recombination probability, P(r), determines the probability that a given individual (genotype) of a population is selected for recombination. The (mean) number of changed individuals depends on the concrete implementation and can vary from P(r) · NG to P(r) · NG · OR, where OR is the order of the recombination, which is the number of individuals involved in the combine method.

Single-point crossover The single-point crossover changes two children chromosomes by taking two chromosomes and cutting them at some, randomly chosen, site. If we create a child and its complement we preserve the total number of genes in the population, preventing any genetic drift. Single-point crossover is the classic form of crossover. However, it produces very slow mixing compared with multi-point crossover or uniform crossover. For problems where the site position has some intrinsic meaning to the problem, single-point crossover can lead to smaller disruption than multi-point or uniform crossover.
Figure 1.3.9 shows how the SinglePointCrossover class is performing the
crossover for different crossover points. Sub figure a) shows the two chromosomes
chosen for crossover. The examples in sub figures b) to f) illustrate the crossover
results for indexes 0,1,3,6 and 7.

Multi-point crossover If the MultiPointCrossover class is created with one crossover point, it behaves exactly like the single-point crossover. The figures in 1.3.10 show how the multi-point crossover works with two crossover points. Figure a) shows the two chromosomes chosen for crossover, b) shows the crossover result for the crossover points at index 0 and 4, c) uses crossover points at index 3 and 6 and d) at index 0 and 7.


Figure 1.3.9: Single-point crossover

Figure 1.3.10: 2-point crossover

In figure 1.3.11 you can see how the crossover works for an odd number of crossover points.

Figure 1.3.11: 3-point crossover

Partially-matched crossover The partially-matched crossover guarantees that all genes are found exactly once in each chromosome. No gene is duplicated by this crossover strategy. The partially-matched crossover (PMX) can be applied usefully in the TSP or other permutation problem encodings. Permutation encoding is useful for all problems where the fitness only depends on the ordering of the genes within the chromosome. This is the case in many combinatorial optimization problems. Other crossover operators for combinatorial optimization are:

• order crossover
• cycle crossover
• edge recombination crossover
• edge assembly crossover

The PMX is similar to the two-point crossover. A crossing region is chosen by selecting two crossing points (see figure 1.3.12 a)).


Figure 1.3.12: Partially-matched crossover

After performing the crossover we normally get two invalid chromosomes (figure 1.3.12 b)). Chromosome 1 contains the value 6 twice and misses the value 3. On the other side chromosome 2 contains the value 3 twice and misses the value 6. We can observe that this crossover is equivalent to the exchange of the values 3→6, 4→5 and 5→4. To repair the two chromosomes we have to apply this exchange outside the crossing region (figure 1.3.12 b)). At the end figure 1.3.12 c) shows the repaired chromosome.

Uniform crossover In uniform crossover, the genes at index i of two chromosomes are swapped with the swap probability, pS. Empirical studies show that uniform crossover is a more exploratory approach than the traditional exploitative approach that maintains longer schemata. This leads to a better search of the design space while maintaining the exchange of good information.[11]

Figure 1.3.13: Uniform crossover

Figure 1.3.13 shows an example of a uniform crossover with four crossover points. A gene is swapped, if a uniformly created random number, r ∈ [0, 1], is smaller than the swap probability, pS. The following code snippet shows how these swap indexes are calculated, in a functional way.
final RandomGenerator random = RandomRegistry.random();
final int length = 8;
final double ps = 0.5;
final int[] indexes = IntStream.range(0, length)
    .filter(i -> random.nextDouble() < ps)
    .toArray();

Combine alterer This alterer changes two genes by combining them. The
combine function can be defined when the alterer is created. How this is done,
is shown in the code snippet below.


final var alterer = new CombineAlterer<DoubleGene, Double>(
    (g1, g2) -> g1.newInstance(g1.doubleValue() / g2.doubleValue())
);

Mean alterer The Mean alterer works on genes which implement the Mean
interface. All numeric genes implement this interface by calculating the arithmetic
mean of two genes. This alterer is a specialization of the CombineAlterer.

Line crossover The line crossover13 takes two numeric chromosomes and treats them as real number vectors. Each of these vectors can also be seen as a point in Rn. If we draw a line through these two points (chromosomes), we have the possible values of the new chromosomes, which all lie on this line.

Figure 1.3.14: Line crossover hypercube

Figure 1.3.14 shows how the two chromosomes form the two three-dimensional vectors (black circles). The dashed line, connecting the two points, forms the possible solutions created by the line crossover. An additional variable, p, determines how far out along the line the created children will be. If p = 0 then the children will be located along the line within the hypercube. If p > 0, the children may be located at an arbitrary place on the line, even outside of the hypercube. This is useful if you want to explore unknown regions, and you need a way to generate chromosomes further out than the parents are.
The internal random parameters, which define the location of the new
crossover point, are generated once for the whole vector (chromosome). If
the LineCrossover generates numeric genes which lie outside the allowed minimum and maximum value, it simply uses the original gene and rejects the
generated, invalid one.
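A usage sketch combining the recommended alterers for numeric genotype vectors; the probabilities, the p parameter of 0.1 and the two-argument LineCrossover constructor (probability, p) are assumptions, and ff/gtf are as in the earlier examples.

import io.jenetics.DoubleGene;
import io.jenetics.LineCrossover;
import io.jenetics.MeanAlterer;
import io.jenetics.Mutator;
import io.jenetics.engine.Engine;

// Line crossover with alteration probability 0.25 and p = 0.1, combined with
// a mean alterer and a mutator.
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .alterers(
        new LineCrossover<>(0.25, 0.1),
        new MeanAlterer<>(0.15),
        new Mutator<>(0.05))
    .build();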

Intermediate crossover The intermediate crossover is quite similar to the line crossover. It differs in the way the internal random parameters are generated and in the handling of invalid, out-of-range genes. The internal random parameters of the IntermediateCrossover class are generated for each gene of the chromosome, instead of once for all genes. If the newly generated gene is not within the allowed range, a new one is created. This is repeated until a valid gene is built.
The crossover parameter, p, has the same properties as for the line crossover.
If the chosen value for p is greater than 0, it is likely that some genes must be
13 The line crossover, also known as line recombination, was originally described by Heinz Mühlenbein and Dirk Schlierkamp-Voosen.[31]


created more than once, because they are not in the valid range. The probability of gene recreation rises sharply with the value of p. Setting p to a value greater than one doesn't make sense in most cases. A value greater than 10 should be avoided.
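
The following sketch shows how both numeric crossovers could be registered; it assumes two-argument constructors taking the crossover probability and the parameter p, and ff and gtf again stand for an already defined fitness function and genotype factory.

final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .alterers(
        // Line crossover: p = 0 keeps the children inside the hypercube.
        new LineCrossover<DoubleGene, Double>(0.2, 0),
        // Intermediate crossover: p = 0.5 widens the sampled interval per gene.
        new IntermediateCrossover<DoubleGene, Double>(0.2, 0.5))
    .build();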

Partial alterer Alterers work on the whole population, which is effectively a sequence of genotypes. If your genotype consists of more than one chromosome, an alterer is applied to all chromosomes. There is no way to bind an alterer to a specific chromosome. The PartialAlterer class overcomes this shortcoming and allows you to define the chromosomes the wrapped Alterer operates on.
final Genotype<DoubleGene> gtf = Genotype.of(
    DoubleChromosome.of(0, 1),
    DoubleChromosome.of(1, 2),
    DoubleChromosome.of(2, 3),
    DoubleChromosome.of(3, 4)
);
final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .alterers(
        PartialAlterer.of(new Mutator<DoubleGene, Double>(), 0, 3),
        PartialAlterer.of(new MeanAlterer<DoubleGene, Double>(), 1),
        new LineCrossover<>())
    .build();

The example above shows how to use the PartialAlterer. The wrapped Mutator will only operate on the chromosomes with index 0 and 3, the wrapped MeanAlterer will only alter the chromosome with index 1, and the LineCrossover will work on all chromosomes. A potential drawback of the PartialAlterer is a possible performance penalty. This is because the chromosomes must be sliced into different population sequences for each PartialAlterer. Whether this is an issue for the overall performance depends on the concrete application.

1.3.3 Engine classes


The executing classes, which perform the actual evolution, are located in the io.jenetics.engine package. The evolution stream (EvolutionStream) is the base metaphor for performing a GA. On the EvolutionStream you can define the termination predicate and collect the final EvolutionResult. This decouples the static data structure from the executing evolution part. The EvolutionStream is also very flexible when it comes to collecting the final result. The EvolutionResult class has several predefined collectors, but you are free to create your own, which can be seamlessly plugged into the existing stream.
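
As a first orientation, the typical interplay of these classes looks like the following sketch. It assumes an already built engine and uses the Limits factory class for the termination predicate; concrete setups are shown in the following sections.

final Engine<DoubleGene, Double> engine = ...; // created via Engine.builder(...)

final EvolutionResult<DoubleGene, Double> result = engine.stream()
    // Truncate the stream after 25 generations without fitness improvement.
    .limit(Limits.bySteadyFitness(25))
    // Collect the overall best result.
    .collect(EvolutionResult.toBestEvolutionResult());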

1.3.3.1 Fitness function


The fitness Function is also an important part when modeling a genetic algorithm. It takes a Genotype as argument and returns a fitness value. The returned fitness value must implement the Comparable interface. This allows the evolution Engine, or more precisely the selection operators, to select the offspring and survivor populations. Some selectors have stronger requirements for the fitness value than a Comparable, but these constraints are checked by the Java type system at compile time.


The fitness Function must be deterministic. Whenever it is applied to the same Genotype, it must return the same fitness value. Nondeterministic fitness functions can lead to unexpected behavior, since the calculated fitness value is cached by the Phenotype.

The following example shows the simplest possible fitness Function. This Function just returns the allele of a 1x1 double Genotype.
public class Main {
    static Double identity(final Genotype<DoubleGene> gt) {
        return gt.gene().allele();
    }

    public static void main(final String[] args) {
        // Create fitness function from method reference.
        Function<Genotype<DoubleGene>, Double> ff1 =
            Main::identity;

        // Create fitness function from lambda expression.
        Function<Genotype<DoubleGene>, Double> ff2 = gt ->
            gt.gene().allele();
    }
}

The first type parameter of the Function defines the kind of Genotype from
which the fitness value is calculated and the second type parameter determines
the return type, which must at least implement the Comparable interface.

1.3.3.2 Engine
The evolution Engine controls how the evolution steps are executed. Once the Engine is created, via its Builder class, it can't be changed. It doesn't contain any mutable global state and can therefore be safely used/called from different threads. This allows you to create more than one EvolutionStream from the same Engine and execute them independently.
public final class Engine<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Evolution<G, C>,
        EvolutionStreamable<G, C>
{
    // The evolution function, performs one evolution step.
    public EvolutionResult<G, C> evolve(EvolutionStart<G, C> start);

    // Evolution stream for "normal" evolution execution.
    public EvolutionStream<G, C> stream();
}
Listing 1.7: Engine class

Listing 1.7 shows the main methods of the Engine class. The Engine is used for performing the actual evolution of a given population. One evolution step is executed by calling the Engine.evolve method, which returns an EvolutionResult object. This object contains the evolved population plus additional

information, like the individuals that have been killed or marked as invalid. With the stream() method you create a new EvolutionStream, which is used for controlling the evolution process. For more information about the EvolutionStream see section 1.3.3.4.
As already shown in previous examples, the Engine can only be created via its Builder class. Only the fitness Function and the Chromosomes, which represent the problem encoding, must be specified for creating an Engine instance. For the rest of the parameters, default values have been specified. These are the Engine parameters which can be configured:
alterers A list of Alterers which are applied to the offspring population, in the defined order. The default value of this property is set to SinglePointCrossover<>(0.2) followed by Mutator<>(0.15).
clock The java.time.InstantSource used for calculating the execution durations. An InstantSource with nanosecond precision (System.nanoTime()) is used as default.
constraint This property lets you override the default implementation of the Phenotype::isValid method, which is useful if the Phenotype validity does not only depend on the validity of the elements it consists of. A description of the Constraint interface is given in section 2.5.
executor With this property it is possible to change the java.util.concurrent.Executor engine used for evaluating the evolution steps. This property can be used to define an application-wide Executor or for controlling the number of execution threads. The default value is set to ForkJoinPool.commonPool().
fitnessFunction This property defines the fitness Function used by the evolution Engine. (See section 1.3.3.1.)
genotypeFactory Defines the Genotype Factory used for creating new individuals. Since the Genotype is its own Factory, it is sufficient to create a Genotype, which serves as a template.
interceptor The interceptor lets you define functions, which are able to change the EvolutionResult before and after an evolution step. An EvolutionInterceptor can be seen as a crosscutting aspect of the evolution process. One implementation of the EvolutionInterceptor is the FitnessNullifier, which allows you to enforce the reevaluation of the fitness values of all individuals. This might be handy if the fitness function is not time invariant and can change during the evolution process.
maximalPhenotypeAge Sets the maximal allowed age of an individual (Phenotype). This prevents super individuals from living forever. The default value is set to 70.
offspringFraction Through this property it is possible to define the fraction of offspring (and survivors) for evaluating the next generation. The fraction value must be within the interval [0, 1]. The default value is set to 0.6. In addition to this property, it is also possible to set the survivorsFraction, survivorsSize or offspringSize. All these additional properties effectively set the offspringFraction.


offspringSelector This property defines the Selector used for selecting the offspring population. The default value is set to TournamentSelector<>(3).
optimize With this property it is possible to define whether the fitness Function should be maximized or minimized. By default, the fitness Function is maximized.
populationSize Defines the number of individuals of a population. The evolution Engine keeps the number of individuals constant. That means the population of the EvolutionResult always contains the number of entries defined by this property. The default value is set to 50.
selector This method allows you to set the offspringSelector and survivorsSelector in one step, using the same selector.
survivorsSelector This property defines the Selector used for selecting the survivors population. The default value is set to TournamentSelector<>(3).
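
The following sketch shows how several of these parameters might be set on the Builder. The concrete values are only examples, and ff and gtf again stand for a fitness function and a genotype factory.

final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    .populationSize(500)
    .offspringFraction(0.7)
    .maximalPhenotypeAge(30)
    .survivorsSelector(new RouletteWheelSelector<>())
    .offspringSelector(new TournamentSelector<>(5))
    .alterers(
        new Mutator<>(0.1),
        new SinglePointCrossover<>(0.3))
    .optimize(Optimize.MINIMUM)
    .build();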
The EvolutionStreams, created by the Engine class, are unlimited. Such streams must be limited by calling the available EvolutionStream::limit methods. Alternatively, the Engine instance itself can be limited with the Engine::limit methods. Such limited Engines no longer create infinite EvolutionStreams; the streams are truncated by the limit predicate defined by the Engine. This feature is needed for concatenating evolution Engines (see section 3.1.6.1).
final EvolutionStreamable<DoubleGene, Double> engine =
    Engine.builder(problem)
        .minimizing()
        .build()
        .limit(() -> Limits.bySteadyFitness(10));

As shown in the example code above, one important difference between the Engine::limit and the EvolutionStream::limit method is that the limit method of the Engine takes a limiting Predicate Supplier instead of the Predicate itself. The reason for this is that some Predicates have to maintain internal state to work properly. This means that, every time the Engine creates a new stream, it must also create a new limiting Predicate. The Engine::limit function returns an EvolutionStreamable instead of an Engine. This interface lets you create EvolutionStreams, which is what you usually want to do with the Engine.

1.3.3.3 Evolution
This functional interface represents the evolution function, which is implemented
by the Engine class. The main purpose of the Evolution interface is to decouple
the evolution function/strategy from the actual evolution process, represented
by the EvolutionStream. Listing 1.8 shows the definition of the Evolution
functional interface.
@FunctionalInterface
public interface Evolution<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    EvolutionResult<G, C> evolve(EvolutionStart<G, C> start);
}
Listing 1.8: Evolution interface

1.3.3.4 EvolutionStream
The EvolutionStream controls the execution of the evolution process and can
be seen as a kind of execution handle. This handle can be used to define
the termination criteria and to collect the final evolution result. Since the
EvolutionStream extends the Java Stream interface, it integrates smoothly
with the rest of the Java Stream API.14
public interface EvolutionStream<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    extends Stream<EvolutionResult<G, C>>
{
    EvolutionStream<G, C>
    limit(Predicate<? super EvolutionResult<G, C>> proceed);
}
Listing 1.9: EvolutionStream interface

Listing 1.9 shows the whole EvolutionStream interface. As can be seen, it only adds one additional method. But this additional limit method allows you to truncate the EvolutionStream based on a Predicate which takes an EvolutionResult. Once the Predicate returns false, the evolution process is stopped. Since the limit method returns an EvolutionStream, it is possible to define more than one Predicate, all of which must be fulfilled for the evolution process to continue.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine.stream()
    .limit(predicate1)
    .limit(predicate2)
    .limit(100);

The EvolutionStream, created in the example above, will be truncated if one of the two predicates is false or if the maximal number of 100 generations is reached. An EvolutionStream is usually created via the Engine::stream method. The immutable and stateless nature of the evolution Engine allows you to create more than one EvolutionStream with the same Engine.

The generations of the EvolutionStream are evolved serially. Calls of the EvolutionStream methods (e. g. limit, peek, ...) are executed in the thread context of the created Stream. In a typical setup, no additional synchronization and/or locking is needed.

14 It is recommended to make yourself familiar with the Java Stream API. A good introduction can be found here: http://winterbe.com/posts/2014/07/31/java8-stream-tutorial-examples/


In cases where you appreciate the usage of the EvolutionStream but need a different Engine implementation, you can use the EvolutionStream::of factory method for creating a new EvolutionStream.
static <G extends Gene<?, G>, C extends Comparable<? super C>>
EvolutionStream<G, C> of(
    Supplier<EvolutionStart<G, C>> start,
    Function<? super EvolutionStart<G, C>, EvolutionResult<G, C>> f
);

This factory method takes a start value, of type EvolutionStart, and an evolution Function. The evolution Function takes the start value and returns an EvolutionResult object. To make the runtime behavior more predictable, the start value is fetched/created lazily at the evolution start time.
final Supplier<EvolutionStart<DoubleGene, Double>> start = ...
final EvolutionStream<DoubleGene, Double> stream =
    EvolutionStream.of(start, new MySpecialEngine());

1.3.3.5 EvolutionResult
The EvolutionResult contains the result data of an evolution step and is the
element type of the EvolutionStream, as described in section 1.3.3.4.
public final class EvolutionResult<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Comparable<EvolutionResult<G, C>>
{
    public ISeq<Phenotype<G, C>> population();
    public long generation();
}
Listing 1.10: EvolutionResult class

Listing 1.10 shows the two most important properties, the population and the generation the result belongs to. These are also the two properties needed for the next evolution step. The generation is, of course, incremented by one. To make collecting the EvolutionResult objects easier, it also implements the Comparable interface. Two EvolutionResults are compared by their best Phenotype, depending on the optimization direction. The EvolutionResult class has three predefined factory methods, which return Collectors usable with the EvolutionStream:
toBestEvolutionResult() Collects the best EvolutionResult of an EvolutionStream according to the defined optimization strategy (minimization or maximization).
toBestPhenotype() This collector can be used if you are only interested in the
best Phenotype.
toBestGenotype() Use this collector if you only need the best Genotype of the
EvolutionStream.
The following code snippets show how to use the different EvolutionStream
collectors.


// Collecting the best EvolutionResult of the EvolutionStream.
final EvolutionResult<DoubleGene, Double> result = stream
    .collect(EvolutionResult.toBestEvolutionResult());

// Collecting the best Phenotype of the EvolutionStream.
final Phenotype<DoubleGene, Double> result = stream
    .collect(EvolutionResult.toBestPhenotype());

// Collecting the best Genotype of the EvolutionStream.
final Genotype<DoubleGene> result = stream
    .collect(EvolutionResult.toBestGenotype());

Sometimes it is useful not only to collect one final result, but to collect the n best evolution results instead. This can be achieved by combining the MinMax::toStrictlyIncreasing and ISeq::toISeq(int) methods.
final ISeq<EvolutionResult<DoubleGene, Double>> results = engine
    .stream()
    .limit(1000)
    .flatMap(MinMax.toStrictlyIncreasing())
    .collect(ISeq.toISeq(10));

The code snippet above collects the best 10 evolution results into the results
sequence in increasing order.

1.3.3.6 EvolutionStatistics
The EvolutionStatistics class allows you to gather additional statistical information from the EvolutionStream. This is especially useful during the development phase of an application, when you have to find the right parametrization of the evolution Engine. Besides other information, the EvolutionStatistics contains (statistical) information about the fitness, invalid and killed Phenotypes and runtime information of the different evolution steps. Since the EvolutionStatistics class implements the Consumer<EvolutionResult<?, C>> interface, it can be easily plugged into the EvolutionStream, adding it with the peek method of the stream.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStatistics<Double, ?> statistics =
    EvolutionStatistics.ofNumber();
engine.stream()
    .limit(100)
    .peek(statistics)
    .collect(toBestGenotype());
Listing 1.11: EvolutionStatistics usage

Listing 1.11 shows how to add the EvolutionStatistics to the EvolutionStream. Once the algorithm tuning is finished, it can be removed in the production environment.
There are two different specializations of the EvolutionStatistics object available. The first is the general one, which works for every kind of Gene and fitness type. It can be created via the EvolutionStatistics::ofComparable method. The second one collects additional statistical data for numerical fitness values. This can be created with the EvolutionStatistics::ofNumber method.
+---------------------------------------------------------------------------+
|  Time statistics                                                          |
+---------------------------------------------------------------------------+
|             Selection: sum=0.046538278000 s; mean=0.003878189833 s        |
|              Altering: sum=0.086155457000 s; mean=0.007179621417 s        |
|   Fitness calculation: sum=0.022901606000 s; mean=0.001908467167 s        |
|     Overall execution: sum=0.147298067000 s; mean=0.012274838917 s        |
+---------------------------------------------------------------------------+
|  Evolution statistics                                                     |
+---------------------------------------------------------------------------+
|           Generations: 12                                                 |
|               Altered: sum=7,331; mean=610.916666667                      |
|                Killed: sum=0; mean=0.000000000                            |
|              Invalids: sum=0; mean=0.000000000                            |
+---------------------------------------------------------------------------+
|  Population statistics                                                    |
+---------------------------------------------------------------------------+
|                   Age: max=11; mean=1.951000; var=5.545190                |
|               Fitness:                                                    |
|                  min  = 0.000000000000                                    |
|                  max  = 481.748227114537                                  |
|                  mean = 384.430345078660                                  |
|                  var  = 13006.132537301528                                |
+---------------------------------------------------------------------------+

A typical output of a number EvolutionStatistics object will look like the example above.
The EvolutionStatistics object is a simple way of inspecting the EvolutionStream after it is finished. It doesn't give you a live view of the current evolution process, which can be necessary for long running streams. In such cases you have to maintain/update the statistics yourself.
public class TSM {
    // The locations to visit.
    static final ISeq<Point> POINTS = ISeq.of(...);

    // The permutation codec.
    static final Codec<ISeq<Point>, EnumGene<Point>>
        CODEC = Codecs.ofPermutation(POINTS);

    // The fitness function (in the problem domain).
    static double dist(final ISeq<Point> p) {...}

    // The evolution engine.
    static final Engine<EnumGene<Point>, Double> ENGINE = Engine
        .builder(TSM::dist, CODEC)
        .optimize(Optimize.MINIMUM)
        .build();

    // Best phenotype found so far.
    static Phenotype<EnumGene<Point>, Double> best = null;

    // You will be informed on new results. This allows to
    // react on new best phenotypes, e.g. log it.
    private static void update(
        final EvolutionResult<EnumGene<Point>, Double> result
    ) {
        if (best == null ||
            best.compareTo(result.bestPhenotype()) < 0)
        {
            best = result.bestPhenotype();
            System.out.print(result.generation() + ": ");
            System.out.println("Found best phenotype: " + best);
        }
    }

    // Find the solution.
    public static void main(final String[] args) {
        final ISeq<Point> result = CODEC.decode(
            ENGINE.stream()
                .peek(TSM::update)
                .limit(10)
                .collect(EvolutionResult.toBestGenotype())
        );
        System.out.println(result);
    }
}
Listing 1.12: Live evolution statistics

Listing 1.12 shows how to implement manual statistics gathering. The update method is called whenever a new EvolutionResult has been calculated. If a new best Phenotype is available, it is stored and logged. With the TSM::update method, which is called on every finished generation, you create a live view of the evolution progress.

1.3.3.7 Evaluator
The Evaluator is responsible for evaluating the fitness values for a given population. It is the most general way of doing the fitness evaluation. Usually, it is not necessary to implement your own evaluation strategy. If you are creating an evolution Engine with a fitness function, this is done for you automatically. Each fitness value is then evaluated concurrently, but independently from each other. Using the Evaluator interface is helpful if you have performance problems when the fitness function is evaluated serially—or in small concurrent batches, as it is implemented by the default strategy. In this case, the Evaluator interface can be used to calculate the fitness function for a population in one batch. Another use case for the Evaluator interface is when the fitness value also depends on the current composition of the population. E. g. it is possible to normalize the population's fitness values.
@FunctionalInterface
public interface Evaluator<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    ISeq<Phenotype<G, C>> eval(Seq<Phenotype<G, C>> population);
}
Listing 1.13: Evaluator interface

The implementer is free to evaluate the whole population, or only evaluate the
not yet evaluated Phenotypes. There are only two requirements which must be
fulfilled:
1. the size of the returned, evaluated, phenotype sequence must be exactly
the size of the input phenotype sequence and
2. all phenotypes of the returned population must have a fitness value assigned.
That means, the expression pop.forAll(Phenotype::isEvaluated) must
be true.
The code snippet below creates an evaluator which evaluates the fitness values
of the whole population serially in the main thread.


final Function<? super Genotype<G>, ? extends C> fitness = ...;
final Evaluator<G, C> evaluator = population -> population
    .map(pt -> pt.eval(fitness))
    .asISeq();

To use the fitness Evaluator, you have to use the Engine.Builder constructor
directly, instead of one of the factory methods.
final Engine<G, C> engine = new Engine.Builder(evaluator, gtf)
    .build();

The Evaluators class contains factory methods, which allow you to create Evaluator instances from fitness functions which don't return the fitness value directly, but return Future<T> or CompletableFuture<T> instead. With these methods, there is no need to wait for the fitness value if the fitness function is already asynchronous.
static Future<Double> fitness(final double x) {
    return ...;
}

public static void main(final String[] args) {
    final Codec<Double, DoubleGene> codec = ...;
    final Evaluator<DoubleGene, Double> evaluator = Evaluators
        .async(Main::fitness, codec);

    final Engine<DoubleGene, Double> engine =
        new Engine.Builder<>(evaluator, codec.encoding())
            .build();
}

1.4 Nuts and bolts


1.4.1 Concurrency
The Jenetics library parallelizes independent tasks whenever possible. Especially the evaluation of the fitness function is done concurrently. That means that the fitness function must be thread-safe and deterministic. The easiest way of achieving thread safety is to make the fitness function immutable and reentrant. Since the number of individuals of one population, which determines the number of fitness functions to be evaluated, is usually much higher than the number of available CPU cores, the fitness evaluation is done in batches. This reduces the evaluation overhead for lightweight fitness functions.

Figure 1.4.1: Evaluation batch

Figure 1.4.1 shows an example population with 12 individuals. The fitness functions of the phenotypes are evaluated in batches of three elements. For this purpose, the individuals of one batch are wrapped into a Runnable object. The batch size is automatically adapted to the available number of CPU cores. It is assumed that the evaluation cost of one fitness function is quite small. If this assumption doesn't hold, you can configure the maximal number


of batch elements with the io.jenetics.concurrency.maxBatchSize system property. The usage of this property is described in section 1.4.1.2.

1.4.1.1 Basic configuration


The used Executor can be defined when building the evolution Engine object.
How to do this is shown in the code example below.
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    private static Double eval(final Genotype<DoubleGene> gt) {
        // calculate and return fitness
    }

    public static void main(final String[] args) {
        // Creating a fixed size ExecutorService.
        final ExecutorService executor = Executors
            .newFixedThreadPool(10);
        final Factory<Genotype<DoubleGene>> gtf = ...
        final Engine<DoubleGene, Double> engine = Engine
            .builder(Main::eval, gtf)
            // Using 10 threads for evolving.
            .executor(executor)
            .build();
        ...
    }
}

If no Executor is given, Jenetics uses a common ForkJoinPool15 for concurrency. Sometimes it might be useful to run the evolution Engine single-threaded, or even execute all operations in the main thread. This can be easily achieved by setting the appropriate Executor.
final Engine<DoubleGene, Double> engine = Engine.builder(...)
    // Doing the Engine operations in the main thread.
    .executor((Executor)Runnable::run)
    .build();

The code snippet above shows how to do the Engine operations in the main thread. The snippet below, in contrast, executes the Engine operations in a single thread other than the main thread.
final Engine<DoubleGene, Double> engine = Engine.builder(...)
    // Doing the Engine operations in a single thread.
    .executor(Executors.newSingleThreadExecutor())
    .build();

Such a configuration can be useful for performing reproducible (performance) tests, without the uncertainty of a concurrent execution environment.

1.4.1.2 Concurrency tweaks


Jenetics uses different strategies for minimizing the concurrency overhead, depending on the configured Executor. For the ForkJoinPool, the fitness evaluation of the population is done by recursively dividing it into subpopulations using
15 https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/ForkJoinPool.html


the abstract RecursiveAction class. If a minimal subpopulation size is reached, the fitness values for this subpopulation are directly evaluated. The default value of this threshold is five and can be controlled via the io.jenetics.concurrency.splitThreshold system property. Besides the splitThreshold, the size of the evaluated subpopulation is dynamically determined by the ForkJoinTask::getSurplusQueuedTaskCount method.16 If this value is greater than three, the fitness values of the current subpopulation are also evaluated immediately. The default value can be overridden by the io.jenetics.concurrency.maxSurplusQueuedTaskCount system property.
$ java -Dio.jenetics.concurrency.splitThreshold=1 \
-Dio.jenetics.concurrency.maxSurplusQueuedTaskCount=2 \
-cp jenetics-7.1.0.jar:app.jar \
com.foo.bar.MyJeneticsApp
You may want to tweak these parameters if you notice low CPU utilization during the fitness value evaluation. Long running fitness functions could lead to CPU underutilization while evaluating the last subpopulation. In this case, only one core is busy, while the other cores are idle, because they have already finished the fitness evaluation. Since the workload has already been distributed, no work stealing is possible. Reducing the splitThreshold can help to achieve a more equal workload distribution between the available CPUs. Reducing the maxSurplusQueuedTaskCount property will create a more uniform workload for fitness functions with heavily varying computation cost for different genotype values.

The fitness function shouldn’t acquire locks for achieving thread safety. It
is also recommended to avoid calls to blocking methods. If such calls are
unavoidable, consider using the ForkJoinPool.managedBlock method.
Especially if you are using a ForkJoinPool executor, which is the default.

If the Engine is using an ExecutorService, a different optimization strategy is used for reducing the concurrency overhead. The original population is divided into a fixed number17 of subpopulations, and the fitness values of each subpopulation are evaluated by one thread. For long running fitness functions, it is better to have smaller subpopulations for a better CPU utilization. With the io.jenetics.concurrency.maxBatchSize system property, it is possible to reduce the subpopulation size. The default value is set to Integer.MAX_VALUE. This means that only the number of CPU cores influences the batch size.
$ java -Dio.jenetics.concurrency.maxBatchSize=3 \
-cp jenetics-7.1.0.jar:app.jar \
com.foo.bar.MyJeneticsApp
16 Excerpt from the Javadoc: Returns an estimate of how many more locally queued tasks are held by the current worker thread than there are other worker threads that might steal them. This value may be useful for heuristic decisions about whether to fork other tasks. In many usages of ForkJoinTasks, at steady state, each worker should aim to maintain a small constant surplus (for example, 3) of tasks, and to process computations locally if this threshold is exceeded.
17 The number of sub-populations actually depends on the number of available CPU cores, which are determined with Runtime.availableProcessors().


Another source of underutilized CPUs is lock contention. It is therefore strongly recommended to avoid locking and blocking calls in your fitness function altogether. If blocking calls are unavoidable, consider using the managed block functionality of the ForkJoinPool.18

1.4.2 Randomness
In general, GAs heavily depend on pseudo random number generators (PRNG)
for creating new individuals and for the selection and mutation algorithms.
Jenetics uses the Java RandomGenerator interface for generating random num-
bers. To make the random engine pluggable, the RandomGenerator object is
always fetched from the RandomRegistry. This makes it possible to change the
implementation of the random generator without changing the client code. The
central RandomRegistry also allows for easily changing the RandomGenerator
even for specific parts of the code.
The following example shows how to change and restore the RandomGenerator object. When opening the with scope, changes to the RandomRegistry are only visible within this scope. Once the with scope is left, the original RandomGenerator object is restored.
final var rgf = RandomGeneratorFactory.getDefault();
final List<Genotype<DoubleGene>> genotypes =
    RandomRegistry.with(rgf.create(123), r ->
        Genotype.of(DoubleChromosome.of(0.0, 100.0, 10))
            .instances()
            .limit(100)
            .toList()
    );

With the previous listing, a random, but reproducible, list of genotypes is created.
This might be useful while testing your application or when you want to evaluate
the EvolutionStream several times with the same initial population.
final Engine<DoubleGene, Double> engine = ...;
// Create a new evolution stream with the given
// initial genotypes.
final Phenotype<DoubleGene, Double> best = engine.stream(genotypes)
    .limit(10)
    .collect(EvolutionResult.toBestPhenotype());

The example above uses the generated genotypes for creating the EvolutionStream. Each created stream uses the same starting population, but will, most likely, create a different result. This is because the stream evaluation is still nondeterministic.

Setting the PRNG to a RandomGenerator object with a defined seed has the effect that every evolution stream will produce the same result, in a single-threaded environment.
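
A sketch of such a reproducible setup; it uses the RandomRegistry::random(RandomGenerator) setter mentioned in section 2.1.1 and runs all engine operations in the main thread (ff and gtf are placeholders for a fitness function and genotype factory).

// Use one fixed-seed random generator for the whole run.
RandomRegistry.random(RandomGeneratorFactory.getDefault().create(123));

final Engine<DoubleGene, Double> engine = Engine.builder(ff, gtf)
    // Single-threaded execution removes scheduling nondeterminism.
    .executor((Executor)Runnable::run)
    .build();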

The parallel nature of the GA implementation requires the creation of streams of random numbers, ti,j , which are statistically independent. These streams
18 A good introduction on how to use managed blocks, and the motivation behind it, is given in this talk: https://www.youtube.com/watch?v=rUDGQQ83ZtI


are numbered with j = 1, 2, 3, ..., p, and p denotes the number of processes. We expect statistical independence between the streams as well. The used PRNG should enable the GA to play fair, which means that the outcome of the GA is strictly independent from the underlying hardware and the number of parallel processes or threads. This is essential for reproducing results in parallel environments where the number of parallel tasks may vary from run to run.

The Fair Play property of a PRNG guarantees that the quality of the
genetic algorithm (evolution stream) does not depend on the degree of
parallelization.

When the RandomGenerator is used in a multi-threaded environment, there must be a way to parallelize the sequential PRNG. Usually this is done by taking the elements of the sequence of pseudo random numbers and distributing them among the threads. There are essentially four different parallelization techniques used in practice: Random seeding, Parameterization, Block splitting and Leapfrogging.

Random seeding Every thread uses the same kind of PRNG but with a different seed. This is the default strategy used by the Jenetics library. Random seeding works well for most problems, but without theoretical foundation.19 The RandomRegistry is initialized with the Java L64X256MixRandom class.20

Parameterization All threads use the same kind of PRNG but with different
parameters. This requires the PRNG to be parameterizable, which is not the
case for the Random object of the JDK. You can use the LCG64ShiftRandom
class if you want to use this strategy. The theoretical foundation for these
methods is weak. In a massively parallel environment you will need a reliable set of parameters for every random stream, which is not trivial to find.

Block splitting With this method each thread will be assigned a non-overlapping, contiguous block of random numbers, which should be enough for the whole runtime of the process. If the number of threads is not known in advance, the length of each block should be chosen much larger than the maximal expected number of threads. This strategy is used when using the LCG64ShiftRandom class. This class assigns every thread a block of 2^56 ≈ 7.2·10^16 random numbers. After 128 threads, the blocks are recycled, but with changed seed.

Leapfrog With the leapfrog method each thread t ∈ [0, P ) only consumes every P -th random number and jumps ahead in the random sequence by the number of threads, P . This method requires the ability to jump very quickly ahead in the sequence of random numbers by a given amount. Figure 1.4.3 graphically shows the concept of the leapfrog method.
19 This is also expressed by Donald Knuth’s advice: »Random number generators should not be chosen at random.«
20 RandomRegistry.random(RandomGeneratorFactory.of("L64X256MixRandom"))


Figure 1.4.2: Block splitting

Figure 1.4.3: Leapfrogging

LCG64ShiftRandom21 The LCG64ShiftRandom class is a port of the trng::lcg64_shift PRNG class of the TRNG22 library, implemented in C++.[7] It implements additional methods, which allow implementing the block splitting and also the leapfrog method.
public class LCG64ShiftRandom
    implements RandomGenerator.ArbitrarilyJumpableGenerator
{
    public void jump(final double step);
    public void jumpPowerOfTwo(final int s);
    ...
}
Listing 1.14: LCG64ShiftRandom class

Listing 1.14 shows the interface used for implementing the block splitting and leapfrog parallelization techniques. These methods have the following meaning:
split Changes the internal state of the PRNG in a way that future calls to nextLong will generate the s-th of p sub-streams. s must be within the range of [0, p − 1). This method is used for parallelization via leapfrogging.
jump Changes the internal state of the PRNG in such a way that the engine
jumps s steps ahead. This method is used for parallelization via block
splitting.
21 The LCG64ShiftRandom PRNG is part of the io.jenetics.prngine module (see section 3.4 on page 120).
22 http://numbercrunch.de/trng/


jumpPowerOfTwo Changes the internal state of the PRNG in such a way that
the engine jumps 2s steps ahead. This method is used for parallelization
via block splitting.

1.4.3 Serialization
Jenetics supports serialization for a number of classes, most of which are located in the io.jenetics package. Only the concrete implementations of the Gene and the Chromosome interfaces implement the Serializable interface. This gives greater flexibility when implementing your own Genes and Chromosomes.

• BitGene
• BitChromosome
• CharacterGene
• CharacterChromosome
• IntegerGene
• IntegerChromosome
• LongGene
• LongChromosome
• DoubleGene
• DoubleChromosome
• EnumGene
• PermutationChromosome
• Genotype
• Phenotype

With the serialization mechanism you can write a population to disk and load
it into a new EvolutionStream at a later time. It can also be used to transfer
populations to evolution engines, running on different hosts, over a network link.
The IO class, located in the io.jenetics.util package, supports native Java
serialization in a convenient way.
// Creating result population.
final EvolutionResult<DoubleGene, Double> result = stream
    .limit(100)
    .collect(toBestEvolutionResult());

// Writing the population to disk.
final File file = new File("population.obj");
IO.object.write(result.population(), file);

// Reading the population from disk.
final ISeq<Phenotype<G, C>> population =
    (ISeq<Phenotype<G, C>>)IO.object.read(file);
final EvolutionStream<DoubleGene, Double> stream = Engine
    .builder(ff, gtf)
    .build()
    .stream(population, 1);

1.4.4 Utility classes


The io.jenetics.util and the io.jenetics.stat packages of the library contain utility and helper classes which are essential for the implementation of the GA.

io.jenetics.util.BaseSeq This interface defines a minimal contract for sequential data, which can be accessed by its index or position. The algorithms implemented by the Jenetics library assume that accessing elements of a BaseSeq is done in O (1). Chromosome and Genotype implement the BaseSeq interface. This expresses the intent that the Chromosome is a sequence of Genes and the Genotype is a sequence of Chromosomes.
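
A small sketch of this uniform, index-based access; the concrete chromosome lengths are arbitrary.

final Genotype<DoubleGene> gt = Genotype.of(
    DoubleChromosome.of(0.0, 1.0, 5),
    DoubleChromosome.of(1.0, 2.0, 3)
);

// The Genotype is a BaseSeq of chromosomes ...
for (int i = 0; i < gt.length(); ++i) {
    // ... and every Chromosome is a BaseSeq of genes.
    final Chromosome<DoubleGene> ch = gt.get(i);
    for (int j = 0; j < ch.length(); ++j) {
        System.out.println(ch.get(j).doubleValue());
    }
}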

io.jenetics.util.Seq Most notable are the Seq interfaces and their implementations. They are used, among others, in the Chromosome and Genotype classes and hold the Genes and Chromosomes, respectively. The Seq interface itself represents a fixed-sized, ordered sequence of elements. It is an abstraction over the Java built-in array type, but much safer to use for generic elements, because there are no casts needed when using nested generic types.

Figure 1.4.4: Seq class diagram

Figure 1.4.4 shows the Seq class diagram with its most important methods. The interfaces MSeq and ISeq are the mutable and immutable specializations of the base interface, respectively. Creating instances of the Seq interfaces is possible via the static factory methods of the interfaces.
// Create "different" sequences.
final Seq<Integer> a1 = Seq.of(1, 2, 3);
final MSeq<Integer> a2 = MSeq.of(1, 2, 3);
final ISeq<Integer> a3 = MSeq.of(1, 2, 3).toISeq();
final MSeq<Integer> a4 = a3.copy();

// The 'equals' method performs element-wise comparison.
assert (a1.equals(a2) && a1 != a2);
assert (a2.equals(a3) && a2 != a3);
assert (a3.equals(a4) && a3 != a4);

How to create instances of the three Seq types is shown in the listing above. The Seq classes also allow a more functional programming style. For a full method description refer to the Javadoc.


io.jenetics.util.ProxySorter This is a special sorter, which allows you to sort even an immutable collection. As the name suggests, it doesn't sort a given sequence directly. Instead it sorts or rearranges a proxy int[] array, which can then be used for accessing the original sequence in a sorted order. The main usage for this special sorter in Jenetics is where you need to access the population in sorted order, but have to preserve the original order of the population.23 Many of the algorithms implemented in the io.jenetics.ext.moea package use the ProxySorter, which leads to simpler code in these places. How the proxy sorter is used can be seen in the following code snippet.
final double[] array = RandomGenerator.getDefault()
    .doubles(100)
    .toArray();
final int[] proxy = ProxySorter.sort(array);

// Doing 'classical' array sort.
final double[] sorted = array.clone();
Arrays.sort(sorted);

// Iterating the array in ascending order.
for (int i = 0; i < array.length; ++i) {
    assert sorted[i] == array[proxy[i]];
}

The ProxySorter increases the set of sortable objects. It is even possible to sort objects where you only know the access function to the elements and the number of elements.
final IntFunction<String> access = ...;
final int length = 100;
final int[] proxy = ProxySorter.sort(
    access, length,
    (a, i, j) -> a.apply(i).compareTo(a.apply(j))
);

The code snippet above shows how to sort an IntFunction. With the proxy array you are now able to access the elements of the access function in ascending order. The ProxySorter uses the Timsort24 algorithm for sorting the proxy int[] array.

io.jenetics.stat This package contains classes for calculating statistical moments. They are designed to work smoothly with the Java Stream API and are divided into mutable (number) consumers and immutable value classes, which hold the statistical moments. The additional classes calculate the
• minimum,
• maximum,
• sum,
• mean,
• variance,
• skewness and
• kurtosis value.

Table 1.4.1 contains the available statistical moments for the different numeric types. The following code snippet shows an example of how to collect double statistics from a given DoubleGene stream.
23 For this specific problem you could also do this by copying the population and sorting the copy instead of the original. But using a sorted proxy array can lead to simpler code.
24 https://en.wikipedia.org/wiki/Timsort


Numeric type    Consumer class            Value class
int             IntMomentStatistics       IntMoments
long            LongMomentStatistics      LongMoments
double          DoubleMomentStatistics    DoubleMoments

Table 1.4.1: Statistics classes

// Collecting into a statistics object.
final DoubleChromosome chromosome = ...
final DoubleMomentStatistics statistics = chromosome.stream()
    .collect(DoubleMomentStatistics
        .toDoubleMomentStatistics(v -> v.doubleValue()));

// Collecting into a moments object.
final DoubleMoments moments = chromosome.stream()
    .collect(DoubleMoments.toDoubleMoments(v -> v.doubleValue()));

The stat package also contains a class for calculating the quantile25 of a stream of double values. Its implementing algorithm, which is described in [21], calculates—or estimates—the quantile value on the fly, without storing the consumed double values. This allows for using the Quantile class even for very large sets of double values. How to calculate the first quartile of a given, random DoubleStream is shown in the code snippet below.
final Quantile quartile = new Quantile(0.25);
RandomGenerator.getDefault()
    .doubles(10_000)
    .forEach(quartile);
final double value = quartile.value();

Be aware that the calculated quartile is just an estimation. For sufficient accuracy, the stream size should be sufficiently large (size ≫ 100).

25 https://en.wikipedia.org/wiki/Quantile

Chapter 2

Advanced topics

This section describes some advanced topics for setting up an evolution Engine
or EvolutionStream. It contains some problem encoding examples and how to
override the default validation strategy of the given Genotypes. The last section
contains a detailed description of the implemented termination strategies.

2.1 Extending Jenetics


The Jenetics library was designed to give you a great flexibility in transforming
your problem into a structure that can be solved by a GA. It also comes with
different implementations for the base data-types (genes and chromosomes) and
operators (alterers and selectors). If it is still some functionality missing, this
section describes how you can extend the existing classes. Most of the extensible
classes are defined by an interface and have an abstract implementation which
makes it easier to extend it.

2.1.1 Genes
Genes are the starting point in the class hierarchy. They hold the actual information, the alleles, of the problem domain. Besides the classical bit-gene, Jenetics comes with gene implementations for numbers (double-, int- and long values), characters and enumeration types.
For implementing your own gene type you have to implement the Gene interface with three methods: (1) the Gene::allele method, which will return the wrapped data, (2) the Gene::newInstance method for creating new, random instances of the gene, which must be of the same type and have the same constraint, and (3) the Gene::isValid method, which checks if the gene fulfills the expected constraints. The gene constraint might be violated after mutation and/or recombination. If you want to implement a new number-gene, e. g. a gene which holds complex values, you may want to extend it from the NumericGene interface.
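
A rough sketch of such a custom gene. The WeekdayGene name and its allele range are made up for illustration, and the sketch assumes that allele(), newInstance(), newInstance(A) and isValid() are the methods to be provided; check the Javadoc of your Jenetics version for the exact set of abstract methods.

// Hypothetical gene holding a weekday index in the range [0, 6].
public record WeekdayGene(Integer allele)
    implements Gene<Integer, WeekdayGene>
{
    // Creates a new, random gene of the same type and constraint.
    public WeekdayGene newInstance() {
        return new WeekdayGene(RandomRegistry.random().nextInt(7));
    }

    // Creates a new gene wrapping the given allele.
    public WeekdayGene newInstance(final Integer value) {
        return new WeekdayGene(value);
    }

    // Checks the gene constraint, which might be violated after alteration.
    public boolean isValid() {
        return allele >= 0 && allele < 7;
    }
}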


The custom Genes and Chromosomes implementations must use the RandomGenerator available via the RandomRegistry::random method when implementing their factory methods. Otherwise it is not possible to seamlessly change the RandomGenerator by using the RandomRegistry::random(RandomGenerator) method.

If you want to support your own allele type, but want to avoid the effort of implementing the Gene interface, you can alternatively use the AnyGene class. It can be created with AnyGene::of(Supplier, Predicate). The given Supplier is responsible for creating new random alleles, similar to the newInstance method in the Gene interface. Additional validity checks are performed by the given Predicate.
class LastMonday {
    // Creates new random 'LocalDate' objects.
    private static LocalDate nextMonday() {
        final var random = RandomRegistry.random();
        return LocalDate
            .of(2015, 1, 5)
            .plusWeeks(random.nextInt(1000));
    }

    // Do some additional validity check.
    private static boolean isValid(final LocalDate date) {...}

    // Create a new gene from the random 'Supplier' and
    // validation 'Predicate'.
    private final AnyGene<LocalDate> gene = AnyGene
        .of(LastMonday::nextMonday, LastMonday::isValid);
}
Listing 2.1: AnyGene example

Example listing 2.1 shows the (almost) minimal setup for creating user-defined Gene allele types. By convention, the RandomGenerator, used for creating the new LocalDate objects, must be requested from the RandomRegistry. With the optional validation function, isValid, it is possible to reject Genes whose alleles don't conform to some criteria. The simple usage of the AnyGene also has its downsides. Since the AnyGene instances are created from function objects, serialization is not supported by the AnyGene class. It is also not possible to use some Alterer implementations with the AnyGene, like:
• GaussianMutator,
• MeanAlterer and
• PartiallyMatchedCrossover

2.1.2 Chromosomes
A new Gene type usually comes with a corresponding Chromosome implemen-
tation. One of the important parts of a Chromosome is the factory method
newInstance(ISeq), which lets the evolution Engine create a new Chromosome
instance from a sequence of Genes. This method is used by the Alterer when


creating a new, combined Chromosome. The newly created Chromosome may have a different length than the original one. The other methods should be self-explanatory. The Chromosome implementations use the same serialization mechanism as the Gene. In the minimal case, a Chromosome can extend the Serializable interface.1

Just implementing the Serializable interface is sometimes not enough. You might also need to implement the readObject and writeObject methods for a more concise serialization result. Consider using the serialization proxy pattern, item 90, described in Effective Java [9].
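
Continuing the hypothetical WeekdayGene sketched above, a corresponding chromosome could look roughly like the following. The sketch assumes that, besides newInstance(ISeq), only the BaseSeq access methods, the no-argument newInstance() factory method and isValid() have to be provided; again, consult the Javadoc of your Jenetics version for the exact contract.

// Hypothetical chromosome for the WeekdayGene sketched in section 2.1.1.
public record WeekdayChromosome(ISeq<WeekdayGene> genes)
    implements Chromosome<WeekdayGene>
{
    // BaseSeq access methods.
    public WeekdayGene get(final int index) {
        return genes.get(index);
    }
    public int length() {
        return genes.length();
    }

    // Factory method used by the Alterers; the new chromosome
    // may have a different length than this one.
    public Chromosome<WeekdayGene> newInstance(final ISeq<WeekdayGene> genes) {
        return new WeekdayChromosome(genes);
    }

    // Creates a new chromosome with newly created, random genes.
    public Chromosome<WeekdayGene> newInstance() {
        return new WeekdayChromosome(genes.map(WeekdayGene::newInstance));
    }

    // The chromosome is valid if all of its genes are valid.
    public boolean isValid() {
        return genes.forAll(WeekdayGene::isValid);
    }
}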

Corresponding to the AnyGene, it is possible to create chromosomes with arbitrary allele types with the AnyChromosome.
public class LastMonday {
    // The used problem Codec.
    private static final Codec<LocalDate, AnyGene<LocalDate>>
        CODEC = Codec.of(
            Genotype.of(AnyChromosome.of(LastMonday::nextMonday)),
            gt -> gt.gene().allele()
        );

    // Creates new random 'LocalDate' objects.
    private static LocalDate nextMonday() {
        final var random = RandomRegistry.random();
        return LocalDate
            .of(2015, 1, 5)
            .plusWeeks(random.nextInt(1000));
    }

    // The fitness function: find a Monday at the end of the month.
    private static int fitness(final LocalDate date) {
        return date.getDayOfMonth();
    }

    public static void main(final String[] args) {
        final Engine<AnyGene<LocalDate>, Integer> engine = Engine
            .builder(LastMonday::fitness, CODEC)
            .offspringSelector(new RouletteWheelSelector<>())
            .build();

        final Phenotype<AnyGene<LocalDate>, Integer> best =
            engine.stream()
                .limit(50)
                .collect(EvolutionResult.toBestPhenotype());

        System.out.println(best);
    }
}
Listing 2.2: AnyChromosome example

Listing 2.2 shows a full usage example of the AnyGene and AnyChromosome. The example tries to find a Monday with a maximal day of month. An interesting detail is that a Codec2 definition is used for creating new Genotypes and
1 http://www.oracle.com/technetwork/articles/java/javaserial-1536170.html
2 See section 2.3 on page 54 for a more detailed Codec description.


for converting them back to LocalDate alleles. The convenient usage of the AnyChromosome comes with the same restrictions as the AnyGene: no serialization support for the chromosome, and it is not usable with all Alterer implementations.

2.1.3 Selectors
If you want to implement your own selection strategy, you only have to implement
the Selector interface with its select method.
@FunctionalInterface
public interface Selector<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    ISeq<Phenotype<G, C>> select(
        Seq<Phenotype<G, C>> population,
        int count,
        Optimize opt
    );
}
Listing 2.3: Selector interface

The first parameter is the original population from which the sub-population
is selected. The second parameter, count, is the number of individuals of the
returned sub-population. Depending on the selection algorithm, it is possible
that the sub-population contains more elements than the original one. The
last parameter, opt, determines the optimization strategy which must be used
by the selector. This is exactly the point where it is decided whether the GA
minimizes or maximizes the fitness function.
Before implementing a selector from scratch, consider extending your selector
from the ProbabilitySelector (or any other available Selector implementation).
It is worth the effort to try to express your selection strategy in terms of the
selection probability P(i). Another way of reusing existing Selector
implementations is composition.
public class EliteSelector<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
>
    implements Selector<G, C>
{
    private final TruncationSelector<G, C>
        _elite = new TruncationSelector<>();

    private final TournamentSelector<G, C>
        _rest = new TournamentSelector<>(3);

    // Number of individuals taken over by the elite selector. This field is
    // not shown in the original listing; a value of 1 is assumed here.
    private final int _eliteCount = 1;

    public EliteSelector() {
    }

    @Override
    public ISeq<Phenotype<G, C>> select(
        final Seq<Phenotype<G, C>> population,
        final int count,
        final Optimize opt
    ) {
        ISeq<Phenotype<G, C>> result;
        if (population.isEmpty() || count <= 0) {
            result = ISeq.empty();
        } else {
            // min/max are static imports of Math.min/Math.max.
            final int ec = min(count, _eliteCount);
            result = _elite.select(population, ec, opt);
            result = result.append(
                _rest.select(population, max(0, count - ec), opt)
            );
        }
        return result;
    }
}
Listing 2.4: Elite selector

Listing 2.4 shows how an elite selector could be implemented by using the existing
Truncation- and TournamentSelector. With elite selection, the quality of the
best solution in each generation monotonically increases over time.[6] It is not
necessary to use an elite selector if you want to preserve the best individual
in the final result. The evolution Engine/Stream doesn’t throw away the best
solution found during the evolution process.
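If you nevertheless want to use the EliteSelector of listing 2.4, it can be registered on the evolution Engine like any built-in selector. The following sketch assumes an existing fitness function and Codec, which are not part of the original example:

final Engine<DoubleGene, Double> engine = Engine
    .builder(fitness, codec)
    // Elite selection for the survivors, tournament selection for the offspring.
    .survivorsSelector(new EliteSelector<>())
    .offspringSelector(new TournamentSelector<>(3))
    .build();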

2.1.4 Alterers
For implementing a new alterer class it is necessary to implement the Alterer
interface. You might do this if your new Gene type needs a special kind of
alterer not available in the Jenetics project.
@FunctionalInterface
public interface Alterer<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    AltererResult<G, C> alter(
        Seq<Phenotype<G, C>> population,
        long generation
    );
}
Listing 2.5: Alterer interface

The first parameter of the alter method is the population which has to be
altered. The second parameter is the generation of the newly created individuals.
The return value aggregates the altered population and the number of genes that
have been altered in the AltererResult class.

To maximize the range of application of an Alterer, it is recommended that it
can handle Genotypes and Chromosomes with variable length.

2.1.5 Statistics
During the development phase of an application which uses the Jenetics library,
additional statistical data about the evolution process is crucial. Such data
can help to optimize the parametrization of the evolution Engine. A good
starting point is to use the EvolutionStatistics class in the io.jenetics.engine
package (see listing 1.11). If the data in the EvolutionStatistics class doesn’t
fit your needs, you simply have to write your own statistics class. It is not
possible to derive from the existing EvolutionStatistics class. This is not a
real restriction, since you can still use the class by delegation. Just
implement the Java Consumer<EvolutionResult<G, C>> interface.
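A minimal sketch of such a statistics class is shown below. It simply records the best fitness of every generation; the class name and the recorded values are only illustrative, and the required imports (java.util and io.jenetics.engine) are omitted as in the other listings.

final class BestFitnessStatistics<C extends Comparable<? super C>>
    implements Consumer<EvolutionResult<?, C>>
{
    private final List<C> _best = new ArrayList<>();

    @Override
    public void accept(final EvolutionResult<?, C> result) {
        // Record the best fitness of the current generation.
        _best.add(result.bestFitness());
    }

    List<C> best() {
        return List.copyOf(_best);
    }
}

Like the EvolutionStatistics collector, the statistics object is attached to the evolution stream via the peek method:

final BestFitnessStatistics<Double> statistics = new BestFitnessStatistics<>();
final Genotype<DoubleGene> best = engine.stream()
    .limit(100)
    .peek(statistics)
    .collect(EvolutionResult.toBestGenotype());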

2.1.6 Engine
The evolution Engine itself can’t be extended, but it is still possible to create an
EvolutionStream without using the Engine class.3 Because the EvolutionStream
has no direct dependency on the Engine, it is possible to use a different, special
evolution Function.
public final class SpecialEngine {
    // The Genotype factory.
    private static final Factory<Genotype<DoubleGene>> GTF =
        Genotype.of(DoubleChromosome.of(0, 1));

    // Create new evolution start object.
    private static EvolutionStart<DoubleGene, Double>
    start(final int populationSize, final long generation) {
        final ISeq<Phenotype<DoubleGene, Double>> population = GTF
            .instances()
            .map(gt -> Phenotype.of(gt, generation))
            .limit(populationSize)
            .collect(ISeq.toISeq());

        return EvolutionStart.of(population, generation);
    }

    // The special evolution function.
    private static EvolutionResult<DoubleGene, Double>
    evolve(final EvolutionStart<DoubleGene, Double> start) {
        return ...; // Add implementation!
    }

    public static void main(final String[] args) {
        final Genotype<DoubleGene> best = EvolutionStream
            .of(() -> start(50, 0), SpecialEngine::evolve)
            .limit(Limits.bySteadyFitness(10))
            .limit(100)
            .collect(EvolutionResult.toBestGenotype());

        System.out.println("Best Genotype: " + best);
    }
}
Listing 2.6: Special evolution engine

Listing 2.6 shows an implementation stub for using your own special evolution
Function. It is also possible to change the used evolution function, depending on
the actual population. The EvolutionStream::ofAdjustableEvolution method gives
you this possibility. In the following example two evolution functions are used,
depending on the fitness variance of the previous population.
3 Also refer to section 1.3.3.4 on page 26 on how to create an EvolutionStream from an evolution Function.


public static void main(final String[] args) {
    final Problem<double[], DoubleGene, Double> problem = ...;

    // Engine.Builder template.
    final Engine.Builder<DoubleGene, Double> builder = Engine
        .builder(problem)
        .minimizing();

    // Evolution used for low fitness variance.
    final Evolution<DoubleGene, Double> lowVar = builder.copy()
        .alterers(new Mutator<>(0.5))
        .selector(new MonteCarloSelector<>())
        .build();

    // Evolution used for high fitness variance.
    final Evolution<DoubleGene, Double> highVar = builder.copy()
        .alterers(
            new Mutator<>(0.05),
            new MeanAlterer<>())
        .selector(new RouletteWheelSelector<>())
        .build();

    final EvolutionStream<DoubleGene, Double> stream =
        EvolutionStream.ofAdjustableEvolution(
            EvolutionStart::empty,
            er -> var(er) < 0.2 ? lowVar : highVar
        );

    final Genotype<DoubleGene> result = stream
        .limit(Limits.bySteadyFitness(50))
        .collect(EvolutionResult.toBestGenotype());
}

static double var(final EvolutionResult<DoubleGene, Double> er) {
    return er != null
        ? er.population().stream()
            .map(Phenotype::fitness)
            .collect(toDoubleMoments())
            .variance()
        : 0.0;
}
Listing 2.7: Adjustable evolution stream
The purpose of such an adjustment is to broaden the search if the population
variance tends to be too small. This can reduce the risk of converging to a local
minimum. If the population variance is too big, a different engine configuration
can help to speed up the optimization.

2.2 Encoding
This section presents some encoding examples for common optimization problems.
The encoding should be a complete and minimal representation of the problem
domain. An encoding is complete if it contains enough information to represent
every solution to the problem. Whereas a minimal encoding contains only the
information needed to represent a solution to the problem. If an encoding
contains more information than is needed to uniquely identify solutions to the
problem, the search space will be larger than necessary. In the best case, there is
a one-to-one mapping from the Genotype space to problem domain. Whenever


possible, the encoding should not represent infeasible solutions. If a Genotype
represents an infeasible solution, care must be taken in the fitness function to give
partial credit to the Genotype for its »good« genetic material while sufficiently
penalizing it for being infeasible. Implementing a specialized Chromosome, which
won’t create invalid encodings can be a solution to this problem. In general, it
is much more desirable to design a representation that can only represent valid
solutions so that the fitness function measures only fitness, not validity. An
encoding that includes invalid individuals enlarges the search space and makes
the search more costly. A deeper analysis of how to create encodings can be
found in [36] and [35].
Some of the encodings represented in the following sections have been im-
plemented by Jenetics, using the Codec4 interface, and are available through
static factory methods of the io.jenetics.engine.Codecs class.

2.2.1 Real function


Jenetics contains three different numeric Gene and Chromosome implementations,
which can be used to encode a real function, f : R → R:
• IntegerGene/Chromosome,
• LongGene/Chromosome and

• DoubleGene/Chromosome.
It is quite easy to encode a real function. Only the minimum and maximum
value of the function domain must be defined. The DoubleChromosome of length
1 is then wrapped into a Genotype.
Genotype.of(
    DoubleChromosome.of(min, max, 1)
);

Decoding the double value from the Genotype is also straightforward. Just get
the first Gene from the first Chromosome, with the gene method, and convert it
to a double.
static double toDouble(final Genotype<DoubleGene> gt) {
    return gt.gene().doubleValue();
}

When the Genotype only contains scalar Chromosomes5, it should be clear that
it can’t be altered by every Alterer. That means that none of the Crossover
alterers will be able to create modified Genotypes. For scalars the appropriate
alterers would be the MeanAlterer, GaussianMutator and Mutator.

Scalar Chromosomes and/or Genotypes can only be altered by the MeanAlterer,
GaussianMutator and Mutator classes. Other alterers are allowed, but will have
no effect on the Chromosomes.

4 See section 2.3 on page 54.
5 Scalar chromosomes contain only one gene.


2.2.2 Scalar function


Optimizing a function, f(x1, ..., xn), of one or more variables whose range is
one-dimensional, we have two possibilities for the Genotype encoding.[42] For
the first encoding we expect that all variables, xi, have the same minimum
and maximum value. In this case we can simply create a Genotype with a
numeric Chromosome of the desired length n.
Genotype.of(
    DoubleChromosome.of(min, max, n)
);

The decoding of the Genotype requires a cast of the first Chromosome to a
DoubleChromosome. With a call to the DoubleChromosome.toArray() method
we return the variables (x1, ..., xn) as a double[] array.
static double[] toScalars(final Genotype<DoubleGene> gt) {
    return gt.chromosome()
        .as(DoubleChromosome.class)
        .toArray();
}

With the first encoding you have the possibility to use all available alterers,
including all Crossover alterer classes.
The second encoding must be used if the minimum and maximum value of
the variables xi can’t be the same for all i. For the different domains, each
variable, xi , is represented by a Numeric Chromosome with length one. The final
Genotype will consist of n Chromosomes with length one.
Genotype.of(
    DoubleChromosome.of(min1, max1),
    DoubleChromosome.of(min2, max2),
    ...
    DoubleChromosome.of(minn, maxn)
);

With the help of the Java Stream API, the decoding of the Genotype can be
done in a few lines. The DoubleChromosome stream, which is created from the
Chromosome Seq, is first mapped to double values and then collected into an
array.
static double[] toScalars(final Genotype<DoubleGene> gt) {
    return gt.stream()
        .mapToDouble(c -> c.gene().doubleValue())
        .toArray();
}

As already mentioned, with the use of scalar Chromosomes we can only use the
MeanAlterer, GaussianMutator or Mutator alterer classes. If there are performance
issues in converting the Genotype into a double[] array, or any other numeric
array, you can access the Genes directly via the Genotype.get(i).get(j) method
and then convert them to the desired numeric value, by calling intValue(),
longValue() or doubleValue().
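The following sketch illustrates this direct access. It sums up all gene values of a Genotype without creating an intermediate array; the summation itself is only a placeholder for a real fitness calculation:

static double sum(final Genotype<DoubleGene> gt) {
    double sum = 0;
    // Iterate over all chromosomes and genes without any array conversion.
    for (int i = 0; i < gt.length(); ++i) {
        for (int j = 0; j < gt.get(i).length(); ++j) {
            sum += gt.get(i).get(j).doubleValue();
        }
    }
    return sum;
}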

2.2.3 Vector function


A function, f(X1, ..., Xn), of one to n variables whose range is m-dimensional, is
encoded by n DoubleChromosomes of length m.[43] The domain, i.e. the minimum
and maximum values, of one variable Xi is the same in this encoding.


Genotype.of(
    DoubleChromosome.of(min1, max1, m),
    DoubleChromosome.of(min2, max2, m),
    ...
    DoubleChromosome.of(minn, maxn, m)
);

The decoding of the vectors is quite easy with the help of the Java Stream API.
Each Chromosome<DoubleGene> object is cast to the actual DoubleChromosome
and converted to a double[] array, which is then collected into a 2-dimensional
double[n][m] array.
static double[][] toVectors(final Genotype<DoubleGene> gt) {
    return gt.stream()
        .map(dc -> dc.as(DoubleChromosome.class).toArray())
        .toArray(double[][]::new);
}

For the special case of n = 1, the decoding of the Genotype can be simplified to
the decoding we introduced for scalar functions in section 2.2.2.
static double[] toVector(final Genotype<DoubleGene> gt) {
    return gt.chromosome().as(DoubleChromosome.class).toArray();
}

2.2.4 Affine transformation


An affine transformation6,7 is usually performed by a matrix multiplication with
a transformation matrix in a homogeneous coordinate system8. For a
transformation in R2, we can define the matrix A9:
 
        | a11  a12  a13 |
    A = | a21  a22  a23 |                                      (2.2.1)
        |  0    0    1  |

A simple representation can be done by creating a Genotype which contains two
DoubleChromosomes with a length of 3.
Genotype.of(
    DoubleChromosome.of(min, max, 3),
    DoubleChromosome.of(min, max, 3)
);

The drawback of this kind of encoding is that we will create a lot of invalid
individuals (non-affine transformation matrices) during the evolution process,
which must be detected and discarded. It is also difficult to find the right
parameters for the min and max values of the DoubleChromosomes.
A better approach is to encode the transformation parameters instead of the
transformation matrix. The affine transformation can be expressed by the
following parameters:
• sx – the scale factor in x direction
6 https://en.wikipedia.org/wiki/Affine_transformation
7 http://mathworld.wolfram.com/AffineTransformation.html
8 https://en.wikipedia.org/wiki/Homogeneous_coordinates
9 https://en.wikipedia.org/wiki/Transformation_matrix


• sy – the scale factor in y direction


• tx – the offset in x direction
• ty – the offset in y direction
• θ – the rotation angle clockwise around origin
• kx – shearing parallel to x axis
• ky – shearing parallel to y axis
These parameters can then be represented by the following Genotype.
Genotype.of(
    // Scale
    DoubleChromosome.of(sxMin, sxMax),
    DoubleChromosome.of(syMin, syMax),
    // Translation
    DoubleChromosome.of(txMin, txMax),
    DoubleChromosome.of(tyMin, tyMax),
    // Rotation
    DoubleChromosome.of(thMin, thMax),
    // Shear
    DoubleChromosome.of(kxMin, kxMax),
    DoubleChromosome.of(kyMin, kyMax)
)

This encoding ensures that no invalid Genotype will be created during the
evolution process, since the crossover will only be performed on the same kind of
chromosome (same chromosome index). To convert the Genotype back to the
transformation matrix A, the following equations can be used [20]:

    a11 = sx cos θ + kx sy sin θ
    a12 = sy kx cos θ − sx sin θ
    a13 = tx
    a21 = ky sx cos θ + sy sin θ                               (2.2.2)
    a22 = sy cos θ − sx ky sin θ
    a23 = ty

This corresponds to a transformation order of T · Sh · Sc · R:

    | 1  0  tx |   | 1   kx  0 |   | sx  0   0 |   | cos θ  −sin θ  0 |
    | 0  1  ty | · | ky  1   0 | · | 0   sy  0 | · | sin θ   cos θ  0 | .
    | 0  0  1  |   | 0   0   1 |   | 0   0   1 |   |   0       0    1 |

In Java code, the conversion from the Genotype to the transformation matrix
will look like this:
static double[][] toMatrix(final Genotype<DoubleGene> gt) {
    final double sx = gt.get(0).gene().doubleValue();
    final double sy = gt.get(1).gene().doubleValue();
    final double tx = gt.get(2).gene().doubleValue();
    final double ty = gt.get(3).gene().doubleValue();
    final double th = gt.get(4).gene().doubleValue();
    final double kx = gt.get(5).gene().doubleValue();
    final double ky = gt.get(6).gene().doubleValue();

    final double cos_th = cos(th);
    final double sin_th = sin(th);
    final double a11 = cos_th*sx + kx*sy*sin_th;
    final double a12 = cos_th*kx*sy - sx*sin_th;
    final double a21 = cos_th*ky*sx + sy*sin_th;
    final double a22 = cos_th*sy - ky*sx*sin_th;

    return new double[][] {
        {a11, a12, tx},
        {a21, a22, ty},
        {0.0, 0.0, 1.0}
    };
}

For the introduced encoding all kinds of alterers can be used. Since we have one
scalar DoubleChromosome, the rotation angle θ, it is recommended to also add
a MeanAlterer or GaussianMutator to the list of alterers.
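A possible Engine setup for this encoding is sketched below. The fitness function and the Genotype factory, gtf, are assumed to exist and are not part of the original example:

final Engine<DoubleGene, Double> engine = Engine
    .builder(fitness, gtf)
    .alterers(
        // Crossover recombines the seven chromosomes of the Genotype ...
        new SinglePointCrossover<>(0.2),
        // ... while these alterers also modify the scalar (single gene) chromosomes.
        new MeanAlterer<>(0.15),
        new Mutator<>(0.05))
    .build();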

2.2.5 Graph
A graph can be represented in many different ways. The best-known graph
representation is the adjacency matrix. The following encoding examples use
adjacency matrices with different characteristics.

Undirected graph In an undirected graph the edges between the vertices have
no direction. If there is a path between nodes i and j, it is assumed that there
is also a path from j to i.

Figure 2.2.1: Undirected graph and adjacency matrix

Figure 2.2.1 shows an undirected graph and its corresponding matrix
representation. Since the edges between the nodes have no direction, the values
of the lower triangular part of the matrix are not taken into account. An
application which optimizes an undirected graph has to ignore this part of the
matrix.10
final int n = 6;
final Genotype<BitGene> gt = Genotype.of(BitChromosome.of(n), n);

The code snippet above shows how to create an adjacency matrix for a graph
with n = 6 nodes. It creates a Genotype which consists of n BitChromosomes of
10 This property violates the minimal encoding requirement we mentioned at the beginning of section 2.2 on page 47. For simplicity reasons, this will be ignored for the undirected graph encoding.


length n each. Whether node i is connected to node j can be easily checked by
calling gt.get(i-1).get(j-1).booleanValue(). For extracting the whole matrix
as an int[][] array, the following code can be used.
final int[][] array = gt.toSeq().stream()
    .map(c -> c.toSeq().stream()
        .mapToInt(gene -> gene.bit() ? 1 : 0)
        .toArray())
    .toArray(int[][]::new);

Directed graph A directed graph (digraph) is a graph where the edges between
the nodes have a direction associated with them. The encoding of a directed
graph looks exactly like the encoding of an undirected graph. This time the
whole matrix is used and the lower triangular part is no longer ignored.

Figure 2.2.2: Directed graph and adjacency matrix

Figure 2.2.2 shows the adjacency matrix of a digraph. This time the whole
matrix is used for representing the graph.

Weighted directed graph A weighted graph associates a weight (label) with
every edge in the graph. Weights are usually real numbers. They may be
restricted to rational numbers or integers.

Figure 2.2.3: Weighted graph and adjacency matrix

The following code snippet shows how the Genotype of the matrix is created.


final int n = 6;
final double min = -1;
final double max = 20;
final Genotype<DoubleGene> gt = Genotype
    .of(DoubleChromosome.of(min, max, n), n);

To access single matrix elements, you can simply call
Genotype.get(i).get(j).doubleValue(). If the interaction with another library
requires a double[][] array, the following code can be used.
final double[][] array = gt.stream()
    .map(dc -> dc.as(DoubleChromosome.class).toArray())
    .toArray(double[][]::new);

2.3 Codec
The Codec interface, located in the io.jenetics.engine package, narrows the
gap between the fitness Function, which should be maximized/minimized, and
the Genotype representation, which can be understood by the evolution Engine.
With the Codec interface it is possible to implement the encodings of section
2.2 in a more formalized way.
Normally, the Engine expects a fitness function which takes a Genotype as
input. This Genotype has then to be transformed into an object of the problem
domain. The usage of the Codec interface allows a tighter coupling of the
Genotype definition and the transformation code.11
public interface Codec<T, G extends Gene<?, G>> {
    Factory<Genotype<G>> encoding();
    Function<Genotype<G>, T> decoder();
    default T decode(final Genotype<G> gt) { ... }
}
Listing 2.8: Codec interface

Listing 2.8 shows the Codec interface. The encoding method returns the
Genotype factory, which is used by the Engine for creating new Genotypes.
The decoder Function, which is returned by the decoder method, transforms
the Genotype to the argument type of the fitness Function. Without the
Codec interface, the implementation of the fitness Function is polluted with code,
which transforms the Genotype into the argument type of the actual fitness
Function.
static double eval(final Genotype<DoubleGene> gt) {
    final double x = gt.gene().doubleValue();
    // Do some calculation with 'x'.
    return ...
}

The Codec for the example above is quite simple and is shown below. It is not
necessary to implement the Codec interface, instead you can use the Codec::of
factory method for creating new Codec instances.
final DoubleRange domain = DoubleRange.of(0, 2*PI);
final Codec<Double, DoubleGene> codec = Codec.of(
    Genotype.of(DoubleChromosome.of(domain)),
    gt -> gt.chromosome().gene().allele()
);
11 Section 2.2 on page 47 describes some possible encodings for common optimization problems.

When using a Codec instance, the fitness Function solely contains code from
your actual problem domain and has no dependencies on classes of the Jenetics
library.
static double eval(final double x) {
    // Do some calculation with 'x'.
    return ...
}

Jenetics comes with a set of standard encodings, which are created via static
factory methods in the io.jenetics.engine.Codecs class. The following sub-
sections describe the most important predefined Codecs.

2.3.1 Scalar codec


Listing 2.9 shows the implementation of the Codecs::ofScalar factory method
for Integer scalars.
static Codec<Integer, IntegerGene> ofScalar(IntRange domain) {
    return Codec.of(
        Genotype.of(IntegerChromosome.of(domain)),
        gt -> gt.chromosome().gene().allele()
    );
}
Listing 2.9: Codec factory method: ofScalar

The usage of the Codec created by this factory method simplifies the
implementation of the fitness Function and the creation of the evolution Engine.
For scalar types, the saving in complexity and lines of code is not that big, but
using the factory method is still quite handy. The following listing demonstrates
the interaction between Codec, fitness Function and evolution Engine.
class Main {
    // Fitness function directly takes an 'int' value.
    static double fitness(int arg) {
        return ...;
    }
    public static void main(String[] args) {
        final Engine<IntegerGene, Double> engine = Engine
            .builder(Main::fitness, ofScalar(IntRange.of(0, 100)))
            .build();
        ...
    }
}

2.3.2 Vector codec


In listing 2.10, the ofVector factory method returns a Codec for an int[]
array. The domain parameter defines the allowed range of the int values and
the length defines the length of the encoded int array.
static Codec<int[], IntegerGene>
ofVector(IntRange domain, int length) {
    return Codec.of(
        Genotype.of(IntegerChromosome.of(domain, length)),
        gt -> gt.chromosome()
            .as(IntegerChromosome.class)
            .toArray()
    );
}
Listing 2.10: Codec factory method: ofVector

The usage example of the vector Codec is almost the same as for the scalar
Codec. As an additional parameter, we need to define the length of the desired
array and we define our fitness function with an int[] array.
class Main {
    // Fitness function directly takes an 'int[]' array.
    static double fitness(int[] args) {
        return ...;
    }
    public static void main(String[] args) {
        final Engine<IntegerGene, Double> engine = Engine
            .builder(
                Main::fitness,
                ofVector(IntRange.of(0, 100), 10))
            .build();
        ...
    }
}

2.3.3 Matrix codec


In listing 2.11, the ofMatrix factory method returns a Codec for an int[][]
matrix. The domain parameter defines the allowed range of the int values and
the rows and cols parameters define the dimension of the matrix.
static Codec<int[][], IntegerGene> ofMatrix(
    IntRange domain,
    int rows,
    int cols
) {
    return Codec.of(
        Genotype.of(
            IntegerChromosome.of(domain, cols).instances()
                .limit(rows)
                .collect(ISeq.toISeq())
        ),
        gt -> gt.stream()
            .map(ch -> ch.stream()
                .mapToInt(IntegerGene::intValue)
                .toArray())
            .toArray(int[][]::new)
    );
}
Listing 2.11: Codec factory method: ofMatrix
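This section has no separate usage listing for the matrix codec; the following sketch, with a made-up fitness function which simply sums the main diagonal of the matrix, shows that it is used exactly like the scalar and vector codecs:

class MatrixMain {
    static final Codec<int[][], IntegerGene> CODEC =
        Codecs.ofMatrix(IntRange.of(0, 100), 10, 10);

    // Example fitness: sum of the main diagonal of the matrix.
    static double fitness(final int[][] matrix) {
        double sum = 0;
        for (int i = 0; i < matrix.length; ++i) {
            sum += matrix[i][i];
        }
        return sum;
    }

    public static void main(final String[] args) {
        final Engine<IntegerGene, Double> engine = Engine
            .builder(MatrixMain::fitness, CODEC)
            .build();

        final int[][] best = CODEC.decode(
            engine.stream()
                .limit(100)
                .collect(EvolutionResult.toBestGenotype())
        );
        System.out.println(best.length + "x" + best[0].length + " matrix found");
    }
}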

2.3.4 Subset codec


There are currently two kinds of subset codecs you can choose from: finding
subsets with variable size and with fixed size.


Variable sized subsets A Codec for variable sized subsets can be easily
implemented with the use of a BitChromosome, as shown in listing 2.12.
static <T> Codec<ISeq<T>, BitGene> ofSubSet(ISeq<T> basicSet) {
    return Codec.of(
        Genotype.of(BitChromosome.of(basicSet.length())),
        gt -> gt.chromosome()
            .as(BitChromosome.class).ones()
            .mapToObj(basicSet)
            .collect(ISeq.toISeq())
    );
}
Listing 2.12: Codec factory method: ofSubSet

The following usage example of the subset Codec shows a simplified version of
the Knapsack problem (see section 5.4). We try to find a subset, from the given
basic SET, where the sum of the values is as big as possible, but smaller than or
equal to 20.
class Main {
    // The basic set from where to choose an 'optimal' subset.
    final static ISeq<Integer> SET =
        ISeq.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    // The fitness function directly takes the decoded subset.
    static int fitness(ISeq<Integer> subset) {
        assert(subset.size() <= SET.size());
        final int size = subset.stream().collect(
            Collectors.summingInt(Integer::intValue));
        return size <= 20 ? size : 0;
    }
    public static void main(String[] args) {
        final Engine<BitGene, Integer> engine = Engine
            .builder(Main::fitness, ofSubSet(SET))
            .build();
        ...
    }
}

Fixed sized subsets The second kind of subset Codec allows you to find the
best subset of a given, fixed size. A classical usage for this encoding is the Subset
sum problem12 :
Given a set (or multi-set) of integers, is there a non-empty subset whose sum is
zero? For example, given the set {−7, −3, −2, 5, 8}, the answer is yes because
the subset {−3, −2, 5} sums to zero. The problem is NP-complete.13
12 https://en.wikipedia.org/wiki/Subset_sum_problem
13 https://en.wikipedia.org/wiki/NP-completeness

public class SubsetSum
    implements Problem<ISeq<Integer>, EnumGene<Integer>, Integer>
{
    private final ISeq<Integer> _basicSet;
    private final int _size;

    public SubsetSum(ISeq<Integer> basicSet, int size) {
        _basicSet = basicSet;
        _size = size;
    }

    @Override
    public Function<ISeq<Integer>, Integer> fitness() {
        return subset -> abs(
            subset.stream().mapToInt(Integer::intValue).sum());
    }

    @Override
    public Codec<ISeq<Integer>, EnumGene<Integer>> codec() {
        return Codecs.ofSubSet(_basicSet, _size);
    }
}
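A usage sketch for this Problem implementation, using the example set from the quote above, might look like the following. Since the fitness is the absolute value of the subset sum, the engine has to minimize it; the chosen stream limit is illustrative only.

final SubsetSum problem = new SubsetSum(
    ISeq.of(-7, -3, -2, 5, 8), 3
);

final Engine<EnumGene<Integer>, Integer> engine = Engine
    .builder(problem)
    .minimizing()
    .build();

final ISeq<Integer> best = problem.codec().decode(
    engine.stream()
        .limit(Limits.bySteadyFitness(30))
        .collect(EvolutionResult.toBestGenotype())
);
System.out.println(best); // e.g. [-3,-2,5]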

2.3.5 Permutation codec


This kind of Codec can be used for problems where the optimal solution depends
on the order of the input elements. A classical example for such problems is the
Traveling Salesman Problem (chapter 5.5).
static <T> Codec<T[], EnumGene<T>> ofPermutation(T... alleles) {
    return Codec.of(
        Genotype.of(PermutationChromosome.of(alleles)),
        gt -> gt.chromosome().stream()
            .map(EnumGene::allele)
            .toArray(length -> (T[])Array.newInstance(
                alleles[0].getClass(), length))
    );
}
Listing 2.13: Codec factory method: ofPermutation

Listing 2.13 shows the implementation of a permutation Codec, where the order
of the given alleles influences the value of the fitness function. An alternative
formulation of the traveling salesman problem is shown in the following listing.
It uses the permutation Codec of listing 2.13 and uses io.jenetics.jpx.WayPoints,
from the JPX14 project, for representing the city locations.
public class TSM {
    // The locations to visit.
    static final ISeq<WayPoint> POINTS = ISeq.of(...);

    // The permutation codec.
    static final Codec<ISeq<WayPoint>, EnumGene<WayPoint>>
        CODEC = Codecs.ofPermutation(POINTS);

    // The fitness function (in the problem domain).
    static double dist(final ISeq<WayPoint> path) {
        return path.stream()
            .collect(Geoid.DEFAULT.toTourLength())
            .to(Length.Unit.METER);
    }

    // The evolution engine.
    static final Engine<EnumGene<WayPoint>, Double> ENGINE = Engine
        .builder(TSM::dist, CODEC)
        .optimize(Optimize.MINIMUM)
        .build();

    // Find the solution.
    public static void main(final String[] args) {
        final ISeq<WayPoint> result = CODEC.decode(
            ENGINE.stream()
                .limit(10)
                .collect(EvolutionResult.toBestGenotype())
        );

        System.out.println(result);
    }
}
14 https://github.com/jenetics/jpx

2.3.6 Mapping codec


This Codec is a variation of the permutation Codec. Instead of permuting the
elements of a given array, it permutes the mapping of elements of a source set to
the elements of a target set. The code snippet below shows the method of the
Codecs class, which creates a mapping codec from a given source and target set.
public static <A, B> Codec<Map<A, B>, EnumGene<Integer>>
ofMapping(ISeq<? extends A> source, ISeq<? extends B> target);

It is not necessary that the source and target set are of the same size. If |source| >
|target|, the returned mapping function is surjective, if |source| < |target|, the
mapping is injective and if |source| = |target|, the created mapping is bijective.
In every case the size of the encoded Map is |target|. Figure 2.3.1 shows the
described different mapping types in graphical form.

Figure 2.3.1: Mapping codec types

With |source| = |target|, you will create a Codec for the assignment problem.
The problem is defined by a number of workers and a number of jobs. Every
worker can be assigned to perform any job. The cost of a worker may vary
depending on the worker-job assignment. It is required to perform all jobs by
assigning exactly one worker to each job and exactly one job to each worker in
such a way that the total assignment costs are optimized.15 The costs for such
worker-job assignments are usually given by a matrix. An example matrix is
shown in table 2.3.1.
If your worker-job costs can be expressed by a matrix, the Hungarian
algorithm16 can find an optimal solution in O(n^3) time. You should consider
this deterministic algorithm before using a GA.

              Job 1   Job 2   Job 3   Job 4
  Worker 1      13      4       7       6
  Worker 2       1     11       5       4
  Worker 3       6      7       3       8
  Worker 4       1      3       5       9

         Table 2.3.1: Worker job cost
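If you nevertheless want to attack the assignment problem with a GA, the mapping codec can be used as sketched below. Workers and jobs are simply encoded as indices into the cost matrix of table 2.3.1, and it is assumed here that the keys of the decoded Map are the source (worker) elements:

class AssignmentExample {
    // Cost matrix of table 2.3.1: COST[worker][job].
    static final double[][] COST = {
        {13,  4, 7, 6},
        { 1, 11, 5, 4},
        { 6,  7, 3, 8},
        { 1,  3, 5, 9}
    };

    static final ISeq<Integer> WORKERS = ISeq.of(0, 1, 2, 3);
    static final ISeq<Integer> JOBS = ISeq.of(0, 1, 2, 3);

    static final Codec<Map<Integer, Integer>, EnumGene<Integer>> CODEC =
        Codecs.ofMapping(WORKERS, JOBS);

    // Total cost of a worker-job assignment.
    static double fitness(final Map<Integer, Integer> assignment) {
        return assignment.entrySet().stream()
            .mapToDouble(entry -> COST[entry.getKey()][entry.getValue()])
            .sum();
    }

    public static void main(final String[] args) {
        final Engine<EnumGene<Integer>, Double> engine = Engine
            .builder(AssignmentExample::fitness, CODEC)
            .minimizing()
            .build();

        final Map<Integer, Integer> best = CODEC.decode(
            engine.stream()
                .limit(100)
                .collect(EvolutionResult.toBestGenotype())
        );
        System.out.println(best);
    }
}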

2.3.7 Composite codec


The composite Codec factory method allows combining two or more Codecs into
one. Listing 2.14 shows the method signature of the factory method, which is
implemented directly in the Codec interface.
15 https://en.wikipedia.org/wiki/Assignment_problem
16 https://en.wikipedia.org/wiki/Hungarian_algorithm




static <G extends Gene<?, G>, A, B, T> Codec<T, G> of(
    final Codec<A, G> codec1,
    final Codec<B, G> codec2,
    final BiFunction<A, B, T> decoder
);
Listing 2.14: Composite Codec factory method

As you can see from the method definition, the combining Codecs and the
combined Codec have the same Gene type.

Only Codecs with the same Gene type can be composed by the combining
factory methods of the Codec class.

The following listing shows a full example which uses a combined Codec. It uses
the subset Codec, introduced in section 2.3.4, and combines two subset Codecs
into a Tuple of subsets.
class Main {
    static final ISeq<Integer> SET =
        ISeq.of(1, 2, 3, 4, 5, 6, 7, 8, 9);

    // Result type of the combined 'Codec'.
    static final class Tuple<A, B> {
        final A first;
        final B second;
        Tuple(final A first, final B second) {
            this.first = first;
            this.second = second;
        }
    }

    static int fitness(Tuple<ISeq<Integer>, ISeq<Integer>> args) {
        return args.first.stream()
                .mapToInt(Integer::intValue)
                .sum() -
            args.second.stream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        // Combined 'Codec'.
        final Codec<Tuple<ISeq<Integer>, ISeq<Integer>>, BitGene>
            codec = Codec.of(
                Codecs.ofSubSet(SET),
                Codecs.ofSubSet(SET),
                Tuple::new
            );

        final Engine<BitGene, Integer> engine = Engine
            .builder(Main::fitness, codec)
            .build();

        final Phenotype<BitGene, Integer> pt = engine.stream()
            .limit(100)
            .collect(EvolutionResult.toBestPhenotype());

        // Use the codec for converting the result 'Genotype'.
        final Tuple<ISeq<Integer>, ISeq<Integer>> result =
            codec.decoder().apply(pt.genotype());
    }
}

If you have to combine more than two Codecs into one, you have to use the
second, more general combining function: Codec::of(ISeq<Codec<?, G>>,
Function<Object[], T>). The following example shows how to use this general
combining function. It is just a little bit more verbose and requires explicit casts
for the sub-codec types.
final Codec<Triple<Long, Long, Long>, LongGene>
    codec = Codec.of(ISeq.of(
        Codecs.ofScalar(LongRange.of(0, 100)),
        Codecs.ofScalar(LongRange.of(0, 1000)),
        Codecs.ofScalar(LongRange.of(0, 10000))),
        values -> {
            final Long first = (Long)values[0];
            final Long second = (Long)values[1];
            final Long third = (Long)values[2];
            return new Triple<>(first, second, third);
        }
    );

2.3.8 Invertible codec


The InvertibleCodec interface extends the Codec interface and allows creating a
Genotype from a given value of the native problem domain.
public interface InvertibleCodec<T, G extends Gene<?, G>>
    extends Codec<T, G>
{
    Function<T, Genotype<G>> encoder();
    default Genotype<G> encode(final T value) { ... }
}
Listing 2.15: InvertibleCodec interface

Listing 2.15 shows the additional methods of the InvertibleCodec interface.
Creating a Genotype from a given domain value simplifies the implementation of
the Constraint::repair method. Most of the factory methods in the Codecs
class will return InvertibleCodec instances. The encoder function is not
necessarily the inverse of the decoder function of the Codec interface. This is
the case if more than one Genotype maps to the same value of the problem
domain.
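The following small sketch shows the round trip between the problem domain and the Genotype, using the predefined scalar codec:

final InvertibleCodec<Double, DoubleGene> codec =
    Codecs.ofScalar(DoubleRange.of(0, 10));

// Domain value -> Genotype (encoder function).
final Genotype<DoubleGene> gt = codec.encode(4.5);
// Genotype -> domain value (decoder function).
final double value = codec.decode(gt);

System.out.println(value); // prints 4.5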


2.4 Problem
The Problem interface is a further abstraction level, which allows you to bind
the problem encoding and the fitness function into one data structure.
public interface Problem<
    T,
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    Function<T, C> fitness();
    Codec<T, G> codec();
}
Listing 2.16: Problem interface

Listing 2.16 shows the Problem interface. The generic type T represents the
type of the native problem domain. This is the argument type of the fitness
Function, and C is the Comparable result type of the fitness Function. G is the
Gene type, which is used by the evolution Engine.
// Definition of the Ones counting problem.
final Problem<ISeq<BitGene>, BitGene, Integer> ONES_COUNTING =
    Problem.of(
        // Fitness Function<ISeq<BitGene>, Integer>
        genes -> (int)genes.stream()
            .filter(BitGene::bit).count(),
        Codec.of(
            // Genotype Factory<Genotype<BitGene>>
            Genotype.of(BitChromosome.of(20, 0.15)),
            // Genotype conversion
            // Function<Genotype<BitGene>, ISeq<BitGene>>
            gt -> gt.chromosome().toSeq()
        )
    );

// Engine creation for Problem solving.
final Engine<BitGene, Integer> engine = Engine
    .builder(ONES_COUNTING)
    .populationSize(150)
    .survivorsSelector(new TournamentSelector<>(5))
    .offspringSelector(new RouletteWheelSelector<>())
    .alterers(
        new Mutator<>(0.03),
        new SinglePointCrossover<>(0.125))
    .build();

The listing above shows how a new Engine is created by using a predefined
Problem instance. This allows the complete decoupling of problem and Engine
definition.

2.5 Constraint
Constraints delimit the feasible space of solutions of an optimization problem
and are considered in evolutionary algorithms [13, 27, 12, 28]. They influence the
desirability of each possible solution. If the constraints are satisfied, the solution
is accepted and is called a feasible solution; otherwise the solution is removed
or modified. For a fitness function, f(x), the constraints are usually given as a


list of inequalities,
gi (x) ≤ 0, (2.5.1)
and a list of equations,
hj (x) = 0. (2.5.2)


Figure 2.5.1: Constrained 2-dimensional search space

Figure 2.5.1 shows how the inequality, 4x + 7y − 32 ≤ 0, divides the search
space into a feasible and an infeasible part. There are different approaches for
handling constraints. Penalty methods try to convert a constrained optimization
problem into an unconstrained one by incorporating its constraints into the
fitness function. Transformation methods try to map the feasible region into a
regular mapped space while preserving the feasibility somehow. The Constraint
interface of Jenetics takes the second approach and tries to preserve feasibility
through a repair step for invalid candidate solutions.

Usually, a given problem should be encoded in a way that it is not possible for
the evolution Engine to create invalid individuals (Genotypes). Some possible
encodings for common data structures are described in section 2.2. The Engine
creates new individuals in the altering step, by rearranging (or creating new)
Genes within a Chromosome. Since a Genotype is treated as valid if every
single Gene in every Chromosome is valid, the validity property of the Genes
determines the validity of the whole Genotype. The Engine tries to create
only valid individuals when creating the initial population and when it replaces
Genotypes which have been destroyed by the altering step. Individuals which
have exceeded their lifetime are also replaced by new ones. Although this behavior
will work for most Genotypes, it is still possible that invalid individuals will be
created during the evolution. If you need a more advanced validation strategy,
the Constraint interface comes into play.


public interface Constraint<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    boolean test(Phenotype<G, C> ind);
    Phenotype<G, C> repair(Phenotype<G, C> ind, long gen);
}
Listing 2.17: Constraint interface

Listing 2.17 shows the definition of the Constraint interface. The test method of
the interface checks the validity of the given Phenotype and the repair method
creates a new individual, using the invalid individual as a template.
The RetryConstraint class is the default implementation of the Constraint
interface. It implements the repair method by creating new Phenotypes until
the created individual is valid. Although this approach seems a little bit simplistic,
it has an important and desirable property: the repaired individuals follow the
same distribution as the original ones. This means that no part of the problem
domain is left out or overcrowded. The number of necessary retries is also not
a problem for normal constraints. For example, the probability that a randomly
created point lies outside the unit circle is 1 − π/4 ≈ 0.2146. This leads to a
failure probability after 10 retries, which is the default value of the
RetryConstraint, of (1 − π/4)^10 ≈ 0.000000207. You can parameterize a
different Constraint definition with the constraint method of the Engine.Builder.

The behavior of the Phenotype::isValid method is overridden by the Constraint
interface. A Phenotype is treated as invalid if the Constraint::test method
returns false, even if the Phenotype::isValid method returns true.

Figure 2.5.2 shows the distribution of the domain points in our unit circle
example. Rejecting invalid points and recreating new ones leads to a uniform
point distribution. Every part of the domain is explored with the same probability.
This is a very welcome property of the RetryConstraint strategy.

Figure 2.5.2: Domain points with retry-constraint
Trying to create only valid domain points can sometimes lead to a nonuniform
distribution. This can be seen in figure 2.5.3. The points were created by choosing
the angle, α, and the radius, r, randomly, and calculating the point coordinates as
x = (r cos α, r sin α), where r ∈ [−1, 1] and α ∈ [0, 2π). As you can see, the
points near the center are much denser than at the domain border. This makes
it harder for the Engine to explore the whole problem domain.

Figure 2.5.3: Only valid domain points
The RetryConstraint is the default implementation of the Constraint
interface, but it might not be the best one for every given problem. If it is
possible, it is better to try to repair an invalid Phenotype instead of creating a
new one. Suppose you need to optimize the fitness function, f : R3 → R, with
the following constraints:

x1 + x2 − 1 ≤ 0
x2 · x3 − 0.5 ≤ 0.

A repairing Constraint implementation checks the validity of a Phenotype and
repairs it, if it’s invalid.


public class RepairingConstraint
    implements Constraint<DoubleGene, Double>
{
    @Override
    public boolean test(Phenotype<DoubleGene, Double> pt) {
        return isValid(
            pt.genotype().chromosome()
                .as(DoubleChromosome.class)
                .toArray()
        );
    }
    static boolean isValid(double[] x) {
        return x[0] + x[1] <= 1 && x[1]*x[2] <= 0.5;
    }

    @Override
    public Phenotype<DoubleGene, Double> repair(
        final Phenotype<DoubleGene, Double> pt,
        final long generation
    ) {
        final double[] x = pt.genotype().chromosome()
            .as(DoubleChromosome.class)
            .toArray();

        return newPhenotype(repair(x), generation);
    }
    static double[] repair(final double[] x) {
        if (x[0] + x[1] > 1) x[0] = 1 - x[1];
        if (x[1]*x[2] > 0.5) x[2] = 0.5/x[1];
        return x;
    }
}

The implementation of the newPhenotype method depends on your actual encoding
and might look like the following:
Phenotype<DoubleGene, Double> newPhenotype(double[] r, long gen) {
    final Genotype<DoubleGene> gt = Genotype.of(
        DoubleChromosome.of(
            DoubleStream.of(r).boxed()
                .map(v -> DoubleGene
                    .of(v, DoubleRange.of(0, 1)))
                .collect(ISeq.toISeq())
        )
    );
    return Phenotype.of(gt, gen);
}

Writing a repair method this way is quite tedious. The InvertibleCodec
interface, see listing 2.15, allows implementing the repair function in a more
natural way. Imagine you want to encode a split range, as shown in figure 2.5.4.
Only values in the ranges [0, 2) and [8, 10) are valid.


Figure 2.5.4: Split range domain

The following listing shows how to create a constraint, which fulfills the
desired codec property.
final InvertibleCodec<Double, DoubleGene> codec =
    Codecs.ofScalar(DoubleRange.of(0, 10));
final Constraint<DoubleGene, Double> constraint = Constraint.of(
    codec,
    v -> v < 2 || v >= 8,
    v -> {
        if (v >= 2 && v < 8) {
            return v < 5 ? ((v - 2)/3)*2 : ((8 - v)/3)*2 + 8;
        }
        return v;
    }
);

Using an InvertibleCodec instead of a Codec, the repair function can be
expressed in the problem domain. In the given example, the repair function
maps the invalid range [2, 5) to [0, 2) and the invalid range [5, 8) to [8, 10). An
alternative implementation for this Codec can also be created by mapping the
scalar range Codec directly, as shown in the following listing.
final Codec<Double, DoubleGene> codec = Codecs
    .ofScalar(DoubleRange.of(0, 10))
    .map(v -> {
        if (v >= 2 && v < 8) {
            return v < 5 ? ((v - 2)/3)*2 : ((8 - v)/3)*2 + 8;
        }
        return v;
    });

Creating a new evolution Engine with a Constraint only repairs individuals
which have been destroyed by the alterer step. It is still possible that the defined
Genotype factory will create invalid individuals. If your Genotype factory can’t
guarantee that only valid individuals are created, an additional setup step is
necessary.
final Constraint<DoubleGene, Double> constraint = ...;
final Factory<Genotype<DoubleGene>> gtf = ...;
final Engine<DoubleGene, Double> engine = Engine
    .builder(fitness, constraint.constrain(gtf))
    .constraint(constraint)
    .build();

The Constraint::constrain method takes an unreliable genotype factory and
wraps it into a reliable one. As long as the constraint is implemented correctly,
only valid individuals are generated by the Engine.

The Constraint defined in the Engine only fixes individuals which have been
destroyed during the evolution process. Individuals created by the Genotype
factory may still be invalid. Use the Constraint::constrain method for creating
safe Genotype factories.

2.6 Termination
Termination is the criterion by which the evolution stream decides whether
to continue or truncate the stream. This section gives a deeper insight into
the different ways of terminating or truncating the EvolutionStream. The
EvolutionStream of the Jenetics library offers an additional method for limiting
the evolution. With the limit(Predicate<EvolutionResult<G,C>>) method
it is possible to use more advanced termination strategies. If the predicate, given


to the limit function, returns false, the EvolutionStream is truncated. The
EvolutionStream.limit(r -> true) will create an infinite evolution stream.

The predicate given to the EvolutionStream::limit function must return false
for truncating the evolution stream. If it returns true, the evolution is continued.
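As a small sketch, the following stream is truncated once the best fitness reaches a (made-up) threshold value, and additionally after at most 1,000 generations; a maximizing engine is assumed:

final EvolutionStream<DoubleGene, Double> stream = engine.stream()
    // Continue (true) as long as the best fitness is below the threshold.
    .limit(result -> result.bestFitness() < 10.5)
    // Hard upper bound for the number of generations.
    .limit(1000);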

All termination strategies described in the following sections are part of the
library and can be created by factory methods of the io.jenetics.engine.Limits
class. The termination strategies were tested by solving the Knapsack problem17
(see section 5.4) with 250 items. This makes it a real problem with a search
space size of 2^250 ≈ 10^75 elements.

Population size:     150
Survivors selector:  TournamentSelector<>(5)
Offspring selector:  RouletteWheelSelector<>()
Alterers:            Mutator<>(0.03) and
                     SinglePointCrossover<>(0.125)

Table 2.6.1: Knapsack evolution parameters

Table 2.6.1 shows the evolution parameters used for the termination tests. To
make the tests comparable, all test runs use the same evolution parameters and
the very same set of knapsack items. Each termination test was repeated 1,000
times, which should give enough data to draw the given candlestick diagrams.
Some of the implemented termination strategies need to maintain an internal
state. These strategies can’t be reused in different evolution streams. To be on
the safe side, it is recommended to always create a new Predicate instance for
each stream, e.g. by calling the corresponding Limits factory method directly in
the limit call: stream.limit(Limits.bySteadyFitness(15)).

2.6.1 Fixed generation


The simplest way of terminating the evolution process is to define a maximal
number of generations on the EvolutionStream. It just uses the existing limit
method of the Java Stream interface.
final long MAX_GENERATIONS = 100;
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(MAX_GENERATIONS);

This kind of termination method can always be applied, usually in addition to
other evolution terminators, to guarantee the truncation of the EvolutionStream
and to define an upper limit for the number of executed generations. Additionally,
the Limits::byFixedGeneration(long) predicate can be used instead of the
Stream::limit(long) method.
17 The actual implementation used for the termination tests can be found in the Github repository: https://github.com/jenetics/jenetics/blob/master/jenetics.example/src/main/java/io/jenetics/example/Knapsack.java


This predicate is mainly there for completeness and behaves exactly like the
Stream::limit(long) function, except for the number of evaluations performed
by the resulting stream. The number of population evaluations is max
generations + 1. This is because the limiting predicate works on the
EvolutionResult object, which is guaranteed to contain an evaluated population.
That means that the population must be evaluated at least once, even for a
generation limit of zero. If this is an unacceptable performance penalty, better
use the Stream::limit(long) function instead.
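For completeness, the equivalent limit, expressed with the Limits factory method, looks like this:

final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byFixedGeneration(MAX_GENERATIONS));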


Figure 2.6.1: Fixed generation termination

Figure 2.6.1 shows the best fitness values of the used Knapsack problem after
a given number of generations, whereas the candlestick points represent the min,
25th percentile, median, 75th percentile and max fitness after 250 repetitions per
generation. The solid line shows the mean of the best fitness values. For a small
increase of the fitness value, the number of needed generations grows exponentially.
This is especially the case when the fitness is approaching its maximal value.

2.6.2 Steady fitness


The steady fitness strategy truncates the EvolutionStream if its best fitness
hasn’t changed after a given number of generations. The predicate maintains an
internal state, the number of generations with non-increasing fitness, and must
be newly created for every EvolutionStream.
final class SteadyFitnessLimit<C extends Comparable<? super C>>
    implements Predicate<EvolutionResult<?, C>>
{
    private final int _generations;
    private boolean _proceed = true;
    private int _stable = 0;

    private C _fitness;

    public SteadyFitnessLimit(final int generations) {
        _generations = generations;
    }

    @Override
    public boolean test(final EvolutionResult<?, C> er) {
        if (!_proceed) return false;
        if (_fitness == null) {
            _fitness = er.bestFitness();
            _stable = 1;
        } else {
            final Optimize opt = er.optimize();
            if (opt.compare(_fitness, er.bestFitness()) >= 0) {
                _proceed = ++_stable <= _generations;
            } else {
                _fitness = er.bestFitness();
                _stable = 1;
            }
        }
        return _proceed;
    }
}
Listing 2.18: Steady fitness

Listing 2.18 shows the implementation of Limits::bySteadyFitness(int)
in the io.jenetics.engine package. It should give you an impression of how
to implement your own termination strategies, which possibly hold an internal state.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.bySteadyFitness(15));

The steady fitness terminator can be created by the bySteadyFitness factory
method of the io.jenetics.engine.Limits class. In the example above, the
evolution stream is terminated after 15 stable generations.
Figure 2.6.2 shows the actual total number of executed generations depending on the
desired number of steady fitness generations. The variation of the total generations
is quite big, as shown by the candlesticks. Though the variation can be quite
big—the termination test has been repeated 250 times for each data point—the
tests showed that the steady fitness termination strategy always terminated, at
least for the given test setup. The lower diagram gives an overview of the fitness
progression; only the mean values of the maximal fitness are shown.

2.6.3 Evolution time


This termination strategy stops the evolution when the elapsed evolution time
exceeds a user-specified maximal value. The EvolutionStream is only truncated
at the end of a generation and will not interrupt the current evolution step. A
maximal evolution time of zero ms will still evaluate at least one generation. In a
time-critical environment, where a solution must be found within a maximal
time period, this terminator lets you define the desired guarantees.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byExecutionTime(Duration.ofMillis(500)));

[Candlestick plot: total generations (10^0–10^6) and mean fitness vs. number of steady generations, log scale]

Figure 2.6.2: Steady fitness termination


In the code example above, the byExecutionTime(Duration) method is used for
creating the termination object. Another method, byExecutionTime(Duration,
InstantSource), lets you define the java.time.InstantSource which is used
for measuring the execution time. Jenetics uses the nano-precision clock io.je-
netics.util.NanoClock for measuring the time. The possibility to define a
different InstantSource implementation is especially useful for testing
purposes.
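A minimal sketch of the two-argument variant is shown below; the InstantSource used here and the engine variable are only assumptions for the example:

// Terminate after 500 ms, measured with an explicitly given InstantSource.
final InstantSource clock = InstantSource.system();
final Genotype<DoubleGene> best = engine.stream()
    .limit(Limits.byExecutionTime(Duration.ofMillis(500), clock))
    .collect(EvolutionResult.toBestGenotype());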
Figure 2.6.3 shows the evaluated generations depending on the execution
time. Except for very small execution times, the number of evaluated generations per time
unit stays quite stable.18 That means that doubling the execution time will
double the number of evolved generations.

2.6.4 Fitness threshold


This termination method stops the evolution when the best fitness in the
current population becomes less than the specified fitness threshold and the
objective is to minimize the fitness. It likewise stops the
evolution when the best fitness in the current population becomes greater than
the specified fitness threshold and the objective is to maximize the fitness.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byFitnessThreshold(10.5))
    .limit(5000);
18 While running the tests, all other CPU-intensive processes have been stopped. The measuring started after a warm-up phase.


[Candlestick plot: total generations (10^0–10^6) and mean fitness vs. execution time in ms, log scale]

Figure 2.6.3: Execution time termination


When limiting the EvolutionStream by a fitness threshold, you need knowledge
about the expected maximal fitness. This can be the case if you are
minimizing an error function with a known optimal value of zero. If there is no
such knowledge, it is advisable to add an additional fixed-size generation limit
as a safety net.
Figure 2.6.4 shows the executed generations depending on the minimal fitness
value. The total generations grow exponentially with the desired fitness value.
This means that this termination strategy will (practically) not terminate if
the value for the fitness threshold is chosen too high. And it will definitely not
terminate if the fitness threshold is higher than the global maximum of the
fitness function. It is a perfect strategy if you can define some good enough
fitness value which can be achieved easily.

2.6.5 Fitness convergence


In this termination strategy, the evolution stops when the fitness is deemed as
converged. Two filters of different lengths are used to smooth the best fitness
across the generations. When the best smoothed fitness of the long filter is less
than a specified percentage away from the best smoothed fitness of the short
filter, the fitness is deemed as converged. Jenetics offers a generic version of the fitness
convergence predicate and a version where the smoothed fitness is the moving
average of the used filters.


[Candlestick plot: total generations (10^0–10^7) and mean fitness vs. fitness threshold (7.0–11.0)]

Figure 2.6.4: Fitness threshold termination

public static <N extends Number & Comparable<? super N>>
Predicate<EvolutionResult<?, N>> byFitnessConvergence(
    final int shortFilterSize,
    final int longFilterSize,
    final BiPredicate<DoubleMoments, DoubleMoments> proceed
);
Listing 2.19: General fitness convergence

Listing 2.19 shows the factory method which creates the generic fitness convergence
predicate. This method allows you to define the evolution termination
according to the statistical moments of the short and long fitness filter.
public static <N extends Number & Comparable<? super N>>
Predicate<EvolutionResult<?, N>> byFitnessConvergence(
    final int shortFilterSize,
    final int longFilterSize,
    final double epsilon
);
Listing 2.20: Mean fitness convergence

The second factory method (shown in listing 2.20) creates a fitness convergence
predicate which uses the moving average19 for the two filters. The smoothed
fitness value is calculated as follows:

\sigma_F(N) = \frac{1}{N} \sum_{i=0}^{N-1} F_{[G-i]}    (2.6.1)
19 https://en.wikipedia.org/wiki/Moving_average


where N is the length of the filter, F_{[i]} the fitness value at generation i, and G
the current generation. If the condition

\frac{|\sigma_F(N_S) - \sigma_F(N_L)|}{\delta} < \epsilon    (2.6.2)

is fulfilled, the EvolutionStream is truncated, where \delta is defined as follows:

\delta = \begin{cases} \max\left(|\sigma_F(N_S)|, |\sigma_F(N_L)|\right) & \text{if } \neq 0 \\ 1 & \text{otherwise} \end{cases}    (2.6.3)

final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byFitnessConvergence(10, 30, 10E-4));

For using the fitness convergence strategy you have to specify three parameters.
The length of the short filter, NS , the length of the long filter, NL , and the
relative difference between the smoothed fitness values, ϵ.

[Candlestick plot: total generations (0–500) and mean fitness vs. epsilon (10^-1–10^-10), N_S = 10, N_L = 30]

Figure 2.6.5: Fitness convergence termination: NS = 10, NL = 30

Figure 2.6.5 shows the termination behavior of the fitness convergence termi-
nation strategy. It can be seen that the minimum number of evolved generations
is the length of the long filter, NL .
Figure 2.6.6 shows the generations needed for terminating the evolution for
higher values of the NS and NL parameters.

2.6.6 Population convergence


This termination method stops the evolution when the population is deemed as
converged. A population is deemed as converged when the average fitness across


[Candlestick plot: total generations (0–2500) and mean fitness vs. epsilon (10^-1–10^-10), N_S = 50, N_L = 150]

Figure 2.6.6: Fitness convergence termination: NS = 50, NL = 150

the current population is less than a user-specified percentage away from the
best fitness of the current population. The population is deemed as converged
and the EvolutionStream is truncated if

\frac{f_{max} - \bar{f}}{\delta} < \epsilon,    (2.6.4)

where

\bar{f} = \frac{1}{N} \sum_{i=0}^{N-1} f_i,    (2.6.5)

f_{max} = \max_{i \in [0, N)} \{ f_i \}    (2.6.6)

and

\delta = \begin{cases} \max\left(|f_{max}|, |\bar{f}|\right) & \text{if } \neq 0 \\ 1 & \text{otherwise} \end{cases}.    (2.6.7)

N denotes the number of individuals of the population.
final Engine<DoubleGene, Double> engine = ...
final EvolutionStream<DoubleGene, Double> stream = engine
    .stream()
    .limit(Limits.byPopulationConvergence(0.1));

The EvolutionStream in the example above will terminate if the difference
between the population's mean fitness value and the maximal fitness value of
the population is less than 10%.


2.6.7 Gene convergence


This termination strategy is different in the sense that it takes the genes, or alleles
respectively, into account for terminating the EvolutionStream. In the gene convergence
termination strategy, the evolution stops when a specified percentage of the genes
of a genotype are deemed as converged. A gene is treated as converged when the
average value of that gene across all the genotypes in the current population
is less than a given percentage away from the maximum allele value across the
genotypes.
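The Limits class also offers a byGeneConvergence factory for this strategy. The following sketch assumes the two-argument byGeneConvergence(double, double) variant and uses arbitrary rates; consult the Javadoc for the exact meaning and order of the parameters:

final Genotype<DoubleGene> best = engine.stream()
    // Hypothetical rates: a gene counts as converged at a 10% relative
    // deviation, and 90% of the genes must have converged.
    .limit(Limits.byGeneConvergence(0.1, 0.9))
    .collect(EvolutionResult.toBestGenotype());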

2.7 Reproducibility
Some problems can be defined with different kinds of fitness functions or encodings.
Which combination works best usually can't be decided a priori. To choose
one, some testing is needed. Jenetics allows you to set up an evolution Engine
in a way that it produces the very same result on every run.
final Engine<DoubleGene, Double> engine =
    Engine.builder(fitnessFunction, codec)
        .executor(Runnable::run)
        .build();
final EvolutionResult<DoubleGene, Double> result =
    RandomRegistry.with(new Random(456), r ->
        engine.stream(population)
            .limit(100)
            .collect(EvolutionResult.toBestEvolutionResult())
    );
Listing 2.21: Reproducible evolution Engine
Listing 2.21 shows the basic setup of such a reproducible evolution Engine.
Firstly, you have to make sure that all evolution steps are executed serially.
This is done by configuring a single-threaded executor. In the simplest case
the evolution is performed solely on the main thread—Runnable::run. If the
evolution Engine uses more than one worker thread, the reproducibility is
no longer guaranteed. The second step configures the random generator the
evolution Engine is working with. Just wrap the EvolutionStream execution
in a RandomRegistry::with block. Additionally, you can start the Evolution-
Stream with a predefined, initial population. Once you have set up the Engine,
you can vary the fitness function and the Codec and compare the results.

If you are using user-defined implementations of the Gene and Chromosome
interface, make sure to obtain the RandomGenerator object from the
RandomRegistry. This is also required for every initialization code used in
your problem implementation. Also check your code for hidden nondeterministic
parts, e.g. the Collections::shuffle method.
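As an illustration—the helper class and the allele creation below are not part of the library, only a sketch—initialization code should fetch its generator from the RandomRegistry instead of creating its own:

import java.util.random.RandomGenerator;

import io.jenetics.util.RandomRegistry;

final class AlleleSupport {
    private AlleleSupport() {}

    // Uses the generator of the RandomRegistry, so that evolution runs
    // stay reproducible when the registry is seeded.
    static double nextAllele(final double min, final double max) {
        final RandomGenerator random = RandomRegistry.random();
        return random.nextDouble(min, max);
    }
}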

2.8 Evolution performance


This section contains an empirical proof that evolutionary selectors deliver
significantly better fitness results than a random search. The MonteCarloSelector
is used for creating the comparison (random search) fitness values.



[Plot: mean best fitness vs. generation (10^0–10^5, log scale) for the MonteCarloSelector and the evolutionary Selector setups]

Figure 2.8.1: Selector performance (Knapsack)

Figure 2.8.1 shows the evolution performance of the Selectors20 used by the
examples in section 2.6. The lower, blue line shows the (mean) fitness values of
the Knapsack problem when using the MonteCarloSelector for selecting the
survivors and offspring population. It can easily be seen that the performance
of the real evolutionary Selectors is much better than that of a random search.

2.9 Evolution strategies


Evolution Strategies (ES) were developed by Ingo Rechenberg and Hans-Paul
Schwefel at the Technical University of Berlin in the mid-1960s.[41] ES is a
global optimization algorithm for continuous search spaces and is an instance
of an Evolutionary Algorithm from the field of Evolutionary Computation. ES
uses truncation selection21 for selecting the individuals and usually mutation22
for changing the next generation. This section describes how to configure the
evolution Engine of the library for the (µ, λ)- and (µ + λ)-ES.

2.9.1 (µ, λ) evolution strategy


The (µ, λ) algorithm starts by generating λ individuals randomly. After eval-
uating the fitness of all the individuals, all but the µ fittest ones are deleted.
20 The termination tests are using a TournamentSelector, with tournament size 5, for selecting the survivors, and a RouletteWheelSelector for selecting the offspring.
21 See 1.3.2.1 on page 13.
22 See 1.3.2.2 on page 17.


Each of the µ fittest individuals gets to produce λ/µ children through an ordinary
mutation. The newly created children just replace the discarded parents.[26]
To summarize: µ is the number of parents which survive, and λ is the
number of offspring created by the µ parents. The value of λ should be a
multiple of µ. ES practitioners usually refer to their algorithm by the choice of
µ and λ. If we set µ = 5 and λ = 20, then we have a (5, 20)-ES.
final Engine<DoubleGene, Double> engine =
    Engine.builder(fitness, codec)
        .populationSize(lambda)
        .survivorsSize(0)
        .offspringSelector(new TruncationSelector<>(mu))
        .alterers(new Mutator<>(p))
        .build();
Listing 2.22: (µ, λ) Engine configuration

Listing 2.22 shows how to configure the evolution Engine for a (µ, λ)-ES. The
population size is set to λ and the survivors size to zero, since the best parents are
not part of the final population. Step three is configured by setting the offspring
selector to the TruncationSelector. Additionally, the TruncationSelector
is parameterized with µ. This lets the TruncationSelector select only the µ
best individuals, which corresponds to step two of the ES.
There are mainly three levers for the (µ, λ)-ES where we can adjust exploration
versus exploitation:[26]

• Population size λ: This parameter controls the sample size for each
population. For the extreme case, as λ approaches ∞, the algorithm would
perform a simple random search.
• Survivors size µ: This parameter controls how selective the ES is.
Relatively low µ values push the algorithm towards an exploitative search,
because only the best individuals are used for reproduction.23
• Mutation probability p: A high mutation probability pushes the algorithm
toward a fairly random search, regardless of the selectivity of µ.

2.9.2 (µ + λ) evolution strategy


In the (µ + λ)-ES, the next generation consists of the selected best µ parents and
the λ new children. This is also the main difference compared to (µ, λ), where the
µ parents are not part of the next generation. Thus the next and all successive
generations are µ + λ in size.[26] Jenetics works with a constant population
size and it is therefore not possible to implement an increasing population size.
Besides this restriction, the Engine configuration for the (µ + λ)-ES is shown
in listing 2.23.
23 As you can see in listing 2.22, the survivors size (reproduction pool size) for the (µ, λ)-ES must be set indirectly via the TruncationSelector parameter. This is necessary, since for the (µ, λ)-ES the selected best µ individuals are not part of the population of the next generation.


final Engine<DoubleGene, Double> engine =
    Engine.builder(fitness, codec)
        .populationSize(lambda)
        .survivorsSize(mu)
        .selector(new TruncationSelector<>(mu))
        .alterers(new Mutator<>(p))
        .build();
Listing 2.23: (µ + λ) Engine configuration

Since the selected µ parents are part of the next generation, the survivorsSize
property must be set to µ. This also requires setting the survivors selector to the
TruncationSelector. With the selector(Selector) method, both selectors—the
selector for the survivors and the one for the offspring—can be set. Because the
best parents are also part of the next generation, the (µ + λ)-ES may be more
exploitative than the (µ, λ)-ES. This bears the risk that very fit parents can defeat
other individuals over and over again, which leads to a premature convergence
to a local optimum.

2.10 Evolution interception


Once the EvolutionStream is created, it will continuously create Evolution-
Result objects, one for every generation. It is not possible to alter these results,
although it is tempting to use the Stream::map method for this purpose. The
problem with the map method is that the altered EvolutionResult will not be
fed back to the Engine when evolving the next generation.
private EvolutionResult<DoubleGene, Double>
mapping(EvolutionResult<DoubleGene, Double> result) { ... }

final Genotype<DoubleGene> result = engine.stream()
    .map(this::mapping)
    .limit(100)
    .collect(toBestGenotype());

Performing the EvolutionResult mapping as shown in the code snippet above
will only change the results for the operations after the mapper definition. The
evolution processing of the Engine is not affected. If we want to intercept the
evolution process, the interceptor must be defined when the Engine is created.
final Engine<DoubleGene, Double> engine = Engine.builder(problem)
    .interceptor(EvolutionInterceptor.ofAfter(this::mapping))
    .build();

The code snippet above shows the correct way of intercepting the evolution
stream. The mapper given to the Engine will change the stream of Evolution-
Results and will also feed the altered result back to the evolution Engine.
Changing the evolved EvolutionResult is a powerful tool and should be used
cautiously.

Distinct population   This kind of intercepting the evolution process is very
flexible. Jenetics comes with one predefined stream interception method, which
allows for removing duplicate individuals from the resulting population.
final Engine<DoubleGene, Double> engine = Engine.builder(problem)
    .interceptor(EvolutionResult.toUniquePopulation())
    .build();

Despite the de-duplication, it is still possible to have duplicate individuals. This
will be the case when the domain of the possible Genotypes is not big enough
and the same individual is created by chance. You can control the number of
Genotype creation retries using the EvolutionResult::toUniquePopulation(int)
method, which allows you to define the maximal number of retries if an
individual already exists.
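A short sketch with an explicit retry count is shown below; the retry value of 100 and the problem variable are only illustrative:

final Engine<DoubleGene, Double> engine = Engine.builder(problem)
    // Retry at most 100 times to replace a duplicate individual.
    .interceptor(EvolutionResult.toUniquePopulation(100))
    .build();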

Chapter 3

Modules

The Jenetics library has been split into several modules, which allows keeping
the base EA module as small as possible. It currently consists of the modules
shown in table 3.0.1, including the Jenetics base module.1

Module                 Artifact
io.jenetics.base       io.jenetics:jenetics:7.1.0
io.jenetics.ext        io.jenetics:jenetics.ext:7.1.0
io.jenetics.prog       io.jenetics:jenetics.prog:7.1.0
io.jenetics.xml        io.jenetics:jenetics.xml:7.1.0
io.jenetics.prngine    io.jenetics:prngine:2.0.0

Table 3.0.1: Jenetics modules

With this module split, the code is easier to maintain and doesn't force the
user to use parts of the library he or she isn't using. This keeps the io.jenetics.base
module as small as possible. The additional Jenetics modules will be
described in this chapter. Figure 3.0.1 shows the dependency graph of the
Jenetics modules.

Figure 3.0.1: Module graph

1 The used module names follow the recommended naming scheme for the JPMS automatic modules: http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html.


3.1 io.jenetics.ext
The io.jenetics.ext module implements additional nonstandard Genes and
evolutionary operations. It also contains data structures which are used by these
additional Genes and operations.

3.1.1 Data structures


3.1.1.1 Tree
The Tree interface defines a general tree data type, where each tree node can
have an arbitrary number of children.
public interface Tree<V, T extends Tree<V, T>> {
    V value();
    Optional<T> parent();
    T childAt(int index);
    int childCount();
}
Listing 3.1: Tree interface

Listing 3.1 shows the Tree interface with its basic abstract tree methods. All
other needed tree methods, e. g. for node traversal and search, are implemented
by default methods, which are derived from these four abstract tree methods. A
mutable default implementation of the Tree interface is given by the TreeNode
class.

[Tree diagram: root 0 with children 1, 2, 3; second level nodes 4–9; leaf nodes 10 and 11]

Figure 3.1.1: Example tree

To illustrate the usage of the TreeNode class, we will create a TreeNode
instance from the tree shown in figure 3.1.1. The example tree consists of 12
nodes with a maximal depth of three and a varying child count from one to
three.
final TreeNode<Integer> tree = TreeNode.of(0)
    .attach(TreeNode.of(1)
        .attach(4, 5))
    .attach(TreeNode.of(2)
        .attach(6))
    .attach(TreeNode.of(3)
        .attach(TreeNode.of(7)
            .attach(10, 11))
        .attach(8)
        .attach(9));


Listing 3.2: Example TreeNode

Listing 3.2 shows the TreeNode representation of the given example tree. New
children are added by using the attach method. For the full Tree method list,
have a look at the Javadoc documentation.

3.1.1.2 Parentheses tree


A parentheses tree2 is a serialized representation of a tree and a simplified
form of the Newick tree format3. The parentheses tree representation of the tree
in figure 3.1.1 will look like the following string:

0(1(4,5),2(6),3(7(10,11),8,9))

As you can see, nodes on the same tree level are separated by a comma, ','. New
tree levels are opened with an opening parenthesis '(' and closed with a closing
parenthesis ')'. No additional spaces are inserted between the separator character
and the node value; any spaces in the parentheses tree string become part of
the node value. Figure 3.1.2 shows the syntax diagram of the parentheses tree.
The NodeValue in the diagram is the string representation of the Tree::value
object.

Figure 3.1.2: Parentheses tree syntax diagram

To get the parentheses tree representation, you just have to call Tree-
::toParenthesesString. This method uses the Object::toString method for
serializing the tree node value. If you need a different string representation,
you can use the Tree::toParenthesesString(Function<? super V, String>)
method. A simple example of how to use this method is shown in the code
snippet below.

final Tree<Path, ?> tree = ...;
final String string = tree.toParenthesesString(Path::getFileName);

If the string representation of the tree node value contains one of the protected
characters ',', '(' or ')', they will be escaped with a '\' character.

final Tree<String, ?> tree = TreeNode.of("(root)")
    .attach(",", "(", ")");

The tree in the code snippet above will be represented as the following parentheses
string:
2 https://www.i-programmer.info/programming/theory/3458-parentheses-are-trees.

html
3 http://evolution.genetics.washington.edu/phylip/newicktree.html


\(root\)(\,,\(,\))
Serializing a tree into parentheses form is just one part of the story. It is
also possible to read the parentheses string back as a tree object. The Tree-
Node::parse(String) method allows you to parse a tree string back to a
TreeNode<String> object. If you need to create a tree with the original node
type, you can call the parse method with an additional string mapper function.
How to parse a given parentheses tree string is shown in the code below.

final Tree<Integer, ?> tree = TreeNode.parse(
    "0(1(4,5),2(6),3(7(10,11),8,9))",
    Integer::parseInt
);

The TreeNode::parse method will throw an IllegalArgumentException if it
is called with an invalid tree string.

3.1.1.3 Flat tree


The main purpose of the Tree data type in the io.jenetics.ext module is to
support hierarchical TreeGenes, which are needed for genetic programming (see
section 3.2). Since the Chromosome type is essentially an array, a mapping from
the hierarchical tree structure to a 1-dimensional array is needed.4 For general
trees with arbitrary child count, additional information needs to be stored for a
bijective mapping between tree and array. The FlatTree interface extends the
Tree node with a childOffset method, which returns the absolute start index
of the tree's children.
public interface FlatTree<V, T extends FlatTree<V, T>>
    extends Tree<V, T>
{
    int childOffset();
    default ISeq<T> flattenedNodes() { ... };
}
Listing 3.3: FlatTree interface
Listing 3.3 shows the additional child offset needed for reconstructing the tree
from the flattened array version. When flattening an existing tree, the nodes are
traversed in breadth-first order.5 For each node, the absolute array offset of the
first child is stored, together with the child count of the node. If the node has
no children, the child offset is set to −1.
Figure 3.1.3 illustrates the flattened example tree shown in figure 3.1.1. The
curved arrows denote the child offset of a given parent node and the curly braces
denote the child count of a given parent node.
final TreeNode<Integer> tree = ...;
final ISeq<FlatTreeNode<Integer>> nodes = FlatTreeNode.of(tree)
    .flattenedNodes();
assert Tree.equals(tree, nodes.get(0));

final TreeNode<Integer> unflattened = TreeNode.of(nodes.get(0));
assert tree.equals(unflattened);
assert unflattened.equals(tree);
4 There exist mapping schemes for perfect binary trees, which allow a bijective mapping from tree to array without additional storage need: https://en.wikipedia.org/wiki/Binary_tree#Arrays. For general trees with arbitrary child count, such a simple mapping doesn't exist.
5 https://en.wikipedia.org/wiki/Breadth-first_search


Figure 3.1.3: Example FlatTree

The code snippet above shows how to flatten a given integer tree and convert it
back to a regular tree. The first element of the flattened tree node sequence is
always the root node.

Since the TreeGene and the ProgramGene are implementing the
FlatTree interface, it is helpful to know and understand the used
tree-to-array mapping.

Since there is no possibility to change the nodes of a FlatTree, it can be
used as an immutable version of the Tree interface. The tree nodes are also
stored more memory-efficiently than in the TreeNode class.

3.1.1.4 Tree formatting


Using the parentheses tree is one possibility for creating a string representation
of a given tree. Although it is the default format returned by the toString()
method, it is sometimes desirable to use different formats. The TreeFormatter
class lets you implement your own formats and also defines additional tree
formats.
TreeFormatter.PARENTHESES Converts a tree to its default parentheses format.
This is the default format and is also used by the Tree::toString method.
TreeFormatter.TREE Creates a verbose tree string, which spans multiple lines,
e.g.

div
  cos
    1.3
  cos
    3.14

TreeFormatter.DOT Creates a tree string in the dot format, which can be used
to create nice graphs with Graphviz6 .
TreeFormatter.LISP Creates a Lisp tree from a given Tree instance. E. g.

(mul (div (cos 1.0) (cos 3.14)) (sin (mul 1.0 z)))
6 https://www.graphviz.org/


3.1.1.5 Tree reduction


The Tree is a very mighty data structure. It implements methods which allow
you to traverse and manipulate it. One of these methods is the Tree::reduce
function, which traverses the tree in pre-order.
public interface Tree<V, T> {
    default <U> U reduce(
        U[] neutral,
        BiFunction<? super V, ? super U[], ? extends U> reducer
    );
}
Listing 3.4: Tree::reduce method

Listing 3.4 shows the signature of the reduce method. The neutral array is
used in the reducer function for leaf elements. After the reduction process, one
element is returned, which might be null for empty trees. The following code
snippet shows how to use the Tree::reduce method for evaluating a simple
arithmetic expression tree.
final Tree<String, ?> formula = TreeNode.parse(
    "add(sub(6,div(230,10)),mul(5,6))"
);
final double result = formula.reduce(new Double[0], (op, args) ->
    switch (op) {
        case "add" -> args[0] + args[1];
        case "sub" -> args[0] - args[1];
        case "mul" -> args[0] * args[1];
        case "div" -> args[0] / args[1];
        default -> Double.parseDouble(op);
    }
);

The result of the reduced arithmetic tree will be 13.0, as expected.

3.1.2 Rewriting
Tree rewriting is a synonym for term rewriting, i.e., the process of transforming
trees (tree-structured data) into other trees by applying rewriting rules. Rewriting
trees is not necessarily deterministic: one rewrite rule can be applied in many
different ways to a term, or more than one rule may be applicable to a tree
node. The rewriting system implementation in Jenetics is currently used for
simplifying program trees, which are evolved in genetic programming problems
(see sections 3.2 and 3.2.4). A good introduction to tree/term rewriting systems
can be found in [3].
Definition. (Tree rewrite rule): A tree rewrite rule is a pair of terms (sub
trees), l → r. The notation indicates that the left-hand side, l, can be replaced
by the right-hand side, r.

A rule, l → r, can be applied to a tree, t, if the left tree (pattern) matches a


sub tree of t. The matching sub tree is then replaced by the right tree (pattern)
r.
Definition. (Tree rewrite system): A tree rewrite system is a set, R, of
rewrite rules, l → r.


In contrast to string rewriting systems, whose objects are flat sequences of


symbols, the objects a term rewriting system works on, i.e. the terms, form a
term algebra. A term can be visualized as a tree of symbols, the set of admitted
symbols being fixed by a given signature.

3.1.2.1 Tree pattern


The TreePattern class is used for the left-hand and the right-hand side of a
rewrite rule. It is typed and consists of variable (Var) and value (Val) nodes
which form a sum type7 of the sealed Decl class.
public final class TreePattern<V> {
    public TreePattern(Tree<Decl<V>, ?> pattern) { ... }
    public TreeMatcher<V> matcher(Tree<V, ?> tree) { ... }
    public TreeNode<V> expand(Map<Var<V>, Tree<V, ?>> vars) { ... }
}
Listing 3.5: TreePattern class

Listing 3.5 shows the constructor and the main methods of the TreePattern class.
The matcher method is used for the left-hand side of a rule and the expand method
for the right-hand side. How to create a simple tree pattern is shown in the code
snippet below.
final Tree<Decl<String>, ?> t = TreeNode
    .<Decl<String>>of(new Val<>("add"))
    .attach(new Var<>("x"), new Val<>("1"));
final TreePattern<String> p = new TreePattern<>(t);
assert p.matcher(TreeNode.parse("add(sub(x,y),1)")).matches();

You can see that the variable x will match arbitrary sub trees. For more
complicated patterns it is quite cumbersome to create them via a Decl tree. Usually
you will create a TreePattern object by compiling a proper pattern string.
For creating the same pattern as in the example above you can write Tree-
Pattern.compile("add($x,1)"). The base syntax for the tree pattern follows
the parentheses tree DSL described in section 3.1.1.2. It only differs in the declaration of
tree variables, which start with a '$' and must be a valid Java identifier. If you
want to match non-string trees, you must specify an additional mapper function
with the compile method.
final TreePattern<Integer> pattern = TreePattern
    .compile("0($x,1)", Integer::parseInt);

The right-hand side functionality of the rewrite rule is used to expand a given
pattern. For expanding a given pattern, you have to deliver a Var to sub-tree
mapping.
final TreePattern<String> pattern = TreePattern
    .compile("add($x,$y,1)");
final Map<Var<String>, Tree<String, ?>> vars = new HashMap<>();
vars.put(new TreePattern.Var<>("x"), TreeNode.parse("sin(x)"));
vars.put(new TreePattern.Var<>("y"), TreeNode.parse("sin(y)"));

final Tree<String, ?> tree = pattern.expand(vars);
assert tree.toParenthesesString().equals("add(sin(x),sin(y),1)");

7 https://en.wikipedia.org/wiki/Algebraic_data_type


3.1.2.2 Tree rewriter


The TreeRewriter interface is an abstraction of the tree rewriting functionality.
Its rewrite method takes a TreeNode, which will be rewritten, and the maximal
number of times the rules should be applied to the input tree.
public interface TreeRewriter<V> {
    int rewrite(TreeNode<V> tree, int limit);
}
Listing 3.6: TreeRewriter interface
With the TreeRewriter interface you are able to combine two or more tree
rewriters into one. This can be done with the concat(final TreeRewriter<V>...
rewriters) factory method. There are two implementations of the Tree-
Rewriter interface: the TreeRewriteRule class and the TRS class.
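The following sketch combines two rewrite rules with the concat factory method mentioned above; the rules and the input tree are only examples:

final TreeRewriter<String> rewriter = TreeRewriter.concat(
    TreeRewriteRule.compile("add($x,0) -> $x"),
    TreeRewriteRule.compile("mul($x,1) -> $x")
);
final TreeNode<String> tree = TreeNode.parse("mul(add(5,0),1)");
// Rewrites the tree in place; with the rules above it collapses to "5".
rewriter.rewrite(tree);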

3.1.2.3 Tree rewrite rule


A TreeRewriteRule consists of a left matching pattern and a right replacing
pattern. To simplify the creation of a rewrite rule, it is possible to create one
via a simple DSL: add(0,$x) -> $x. The left and the right tree pattern are
separated by an arrow, ->, and the pattern DSL is described in section 3.1.2.1.
final TreeRewriteRule<String> rule = TreeRewriteRule
    .compile("add($x,0) -> $x");
final TreeNode<String> t = parse("add(5,0)");
rule.rewrite(t);

Since the TreeRewriteRule implements the TreeRewriter interface, it can
directly be used for rewriting input trees.

3.1.2.4 Tree rewrite system (TRS)


The TRS class puts all things together and allows for defining a complete tree
(term) rewriting system. The primary constructor takes a sequence of
TreeRewriteRules (ISeq<TreeRewriteRule<V>>), but the TRS creation can be
simplified by using a simple DSL.
final TRS<String> trs = TRS.of(
    "add(0,$x) -> $x",
    "add(S($x),$y) -> S(add($x,$y))",
    "mul(0,$x) -> 0",
    "mul(S($x),$y) -> add(mul($x,$y),$y)"
);

The example above defines a tree rewrite system with four rewrite rules, which
are applied in the given order. Each rule is applied until the given tree stays
unchanged. This also means that the termination of the TRS can't be guaranteed.
It is mainly your responsibility to create a rewrite system which will always
terminate. If you are not sure whether the system is terminating or not, you'd
better call the TreeRewriter.rewrite(TreeNode, int) method, which also
takes the maximal number of times the rules should be applied to the input tree.
final TreeNode<String> t = parse("add(S(0),S(mul(S(0),S(S(0)))))");
trs.rewrite(t);
assert t.equals(parse("S(S(S(S(0))))"));

Since the given tree rewrite system is terminating, we can safely apply the TRS
to add(S(0),S(mul(S(0),S(S(0))))), which will then be rewritten to S(S(S(S(0)))).


3.1.2.5 Constant expression rewriter


The ConstRewriter class allows for the evaluation of constant tree expressions.
The code snippet below shows how to evaluate a constant double
expression.
final TreeNode<Op<Double>> tree = MathExpr
    .parse("1+2+3+4")
    .toTree();
ConstRewriter.ofType(Double.class).rewrite(tree);
assert tree.value().equals(Const.of(10.0));

Since the ConstRewriter can rewrite constant expressions of arbitrary
types, a rewriter instance of the appropriate type, Double, must be created first.

3.1.3 Genes
3.1.3.1 BigInteger gene
The BigIntegerGene implements the NumericGene interface and can be used
when the range of the existing LongGene or DoubleGene is not enough. Its
allele type is a BigInteger, which can store arbitrary precision integers. There
also exists a corresponding BigIntegerChromosome.
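A minimal sketch is given below. It assumes a BigIntegerChromosome.of(min, max, length) style factory; check the Javadoc for the exact signature:

final BigInteger min = BigInteger.ZERO;
final BigInteger max = BigInteger.TEN.pow(30);
// Genotype factory with one chromosome of three BigInteger genes.
final Factory<Genotype<BigIntegerGene>> gtf = Genotype.of(
    BigIntegerChromosome.of(min, max, 3)
);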

3.1.3.2 Tree gene


The TreeGene interface extends the FlatTree interface and serves as basis for
the ProgramGene, used for genetic programming. Its tree nodes are stored in
the corresponding TreeChromosome. How the tree hierarchy is flattened and
mapped to an array is described in section 3.1.1.3.

3.1.4 Operators
Simulated binary crossover The SimulatedBinaryCrossover performs
the simulated binary crossover (SBX) on NumericChromosomes such that each
position is either crossed contracted or expanded with a certain probability. The
probability distribution is designed such that the children will lie closer to their
parents as is the case with the single point binary crossover. It is implemented
as described in [16].

Single-node crossover The SingleNodeCrossover class works on TreeChromosomes.
It swaps two randomly chosen nodes from two tree chromosomes.
Figure 3.1.4 shows how the single-node crossover works. In this example, node 3
of the first tree is swapped with node h of the second tree.

Reverse sequence mutator (RSM) The RSMutator chooses two positions, i
and j, randomly. The gene order in a chromosome will then be reversed between
these two points. This mutation operator can also be used for combinatorial
problems, where no duplicated genes within a chromosome are allowed, e.g. for
the TSP. [1]


[Figure: two example trees before and after the crossover; node 3 (with its subtree) of the first tree is swapped with node h of the second tree]

Figure 3.1.4: Single-node crossover

Hybridizing PSM and RSM (HPRM) The HPRMutator constructs an
offspring from a pair of parents by hybridizing the two mutation operators PSM
(SwapMutator) and RSM. Its main application is for combinatorial problems,
like the TSP. [2]
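As a sketch, these combinatorial operators are simply registered as alterers of an Engine. The tsp variable stands for a placeholder Problem<ISeq<Integer>, EnumGene<Integer>, Double>, and the probability constructors of RSMutator and HPRMutator are assumed:

final Engine<EnumGene<Integer>, Double> engine = Engine
    .builder(tsp)
    .alterers(
        new RSMutator<>(0.1),
        new HPRMutator<>(0.1),
        new PartiallyMatchedCrossover<>(0.3)
    )
    .build();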

3.1.5 Weasel program


The Weasel program8 is a thought experiment by Richard Dawkins, in which
he tries to illustrate the function of genetic mutation and selection.9 For this
reason he chose the well-known example of typewriting monkeys.
I don’t know who it was first pointed out that, given enough time, a
monkey bashing away at random on a typewriter could produce all
the works of Shakespeare. The operative phrase is, of course, given
enough time. Let us limit the task facing our monkey somewhat.
Suppose that he has to produce, not the complete works of Shake-
speare but just the short sentence »Methinks it is like a weasel«, and
we shall make it relatively easy by giving him a typewriter with a
restricted keyboard, one with just the 26 (uppercase) letters, and a
space bar. How long will he take to write this one little sentence?[14]

The search space of the 28-character-long target string is 27^28 ≈ 10^40. If the
monkey writes 1,000,000 different sentences per second, it would take about 10^26
8 https://en.wikipedia.org/wiki/Weasel_program
9 The classes are located in the io.jenetics.ext module.


years (on average) to write the correct one. Although Dawkins did not provide
the source code for his program, a »Weasel« style algorithm could run as follows:

1. Start with a random string of 28 characters.


2. Make n copies of the string (reproduce).
3. Mutate the characters with a mutation probability of 5%.
4. Compare each new string with the target string »METHINKS IT IS LIKE
A WEASEL«, and give each a score (the number of letters in the string
that are correct and in the correct position).
5. If any of the new strings has a perfect score (28), halt. Otherwise, take
the highest scoring string, and go to step 2.
Richard Dawkins was also very careful to point out the limitations of this
simulation:
Although the monkey/Shakespeare model is useful for explaining the
distinction between single-step selection and cumulative selection, it is
misleading in important ways. One of these is that, in each generation
of selective »breeding«, the mutant »progeny« phrases were judged
according to the criterion of resemblance to a distant ideal target,
the phrase METHINKS IT IS LIKE A WEASEL. Life isn’t like that.
Evolution has no long-term goal. There is no long-distance target, no
final perfection to serve as a criterion for selection, although human
vanity cherishes the absurd notion that our species is the final goal of
evolution. In real life, the criterion for selection is always short-term,
either simple survival or, more generally, reproductive success.[14]
If you want to write a Weasel program with the Jenetics library, you need to
use the special WeaselSelector and WeaselMutator.
public class WeaselProgram {
    private static final String TARGET =
        "METHINKS IT IS LIKE A WEASEL";

    private static int score(final Genotype<CharacterGene> gt) {
        final var src = gt.chromosome().as(CharSequence.class);
        return IntStream.range(0, TARGET.length())
            .map(i -> src.charAt(i) == TARGET.charAt(i) ? 1 : 0)
            .sum();
    }

    public static void main(final String[] args) {
        final CharSeq chars = CharSeq.of("A-Z ");
        final Factory<Genotype<CharacterGene>> gtf = Genotype.of(
            new CharacterChromosome(chars, TARGET.length())
        );
        final Engine<CharacterGene, Integer> engine = Engine
            .builder(WeaselProgram::score, gtf)
            .populationSize(150)
            .selector(new WeaselSelector<>())
            .offspringFraction(1)
            .alterers(new WeaselMutator<>(0.05))
            .build();
        final Phenotype<CharacterGene, Integer> result = engine


            .stream()
            .limit(byFitnessThreshold(TARGET.length() - 1))
            .peek(r -> System.out.println(
                r.totalGenerations() + ": " +
                r.bestPhenotype()))
            .collect(toBestPhenotype());
        System.out.println(result);
    }
}
Listing 3.7: Weasel program

Listing 3.7 shows how to implement the Weasel program with Jenetics. Steps (1)
and (2) of the algorithm are done implicitly when the initial population is created.
The third step is done by the WeaselMutator, with a mutation probability of 0.05.
Step (4) is done by the WeaselSelector together with the configured offspring
fraction of one. The EvolutionStream is limited by Limits.byFitness-
Threshold, which is set to score_max − 1. In the current example this value is
set to TARGET.length() - 1 = 27.
1 1: [ UBNHLJUS RCOXR LFIYLAWRDCCNY ] --> 6
2 2: [ UBNHLJUS RCOXR LFIYLAWWDCCNY ] --> 7
3 3: [ UBQHLJUS RCOXR LFIYLAWWECCNY ] --> 8
4 5: [ UBQHLJUS RCOXR LFICLAWWECCNL ] --> 9
5 6: [ W QHLJUS RCOXR LFICLA WEGCNL ] --> 10
6 7: [ W QHLJKS RCOXR LFIHLA WEGCNL ] --> 11
7 8: [ W QHLJKS RCOXR LFIHLA WEGSNL ] --> 12
8 9: [ W QHLJKS RCOXR LFIS A WEGSNL ] --> 13
9 10: [ M QHLJKS RCOXR LFIS A WEGSNL ] --> 14
10 11: [ MEQHLJKS RCOXR LFIS A WEGSNL ] --> 15
11 12: [ MEQHIJKS ICOXR LFIN A WEGSNL ] --> 17
12 14: [ MEQHINKS ICOXR LFIN A WEGSNL ] --> 18
13 16: [ METHINKS ICOXR LFIN A WEGSNL ] --> 19
14 18: [ METHINKS IMOXR LFKN A WEGSNL ] --> 20
15 19: [ METHINKS IMOXR LIKN A WEGSNL ] --> 21
16 20: [ METHINKS IMOIR LIKN A WEGSNL ] --> 22
17 23: [ METHINKS IMOIR LIKN A WEGSEL ] --> 23
18 26: [ METHINKS IMOIS LIKN A WEGSEL ] --> 24
19 27: [ METHINKS IM IS LIKN A WEHSEL ] --> 25
20 32: [ METHINKS IT IS LIKN A WEHSEL ] --> 26
21 42: [ METHINKS IT IS LIKN A WEASEL ] --> 27
22 46: [ METHINKS IT IS LIKE A WEASEL ] --> 28

The (shortened) output of the Weasel program (listing 3.7) shows that the
optimal solution is reached in generation 46.

3.1.6 Modifying Engine


The current design of the Engine allows for creating multiple independent Evolution-
Streams from a single Engine instance. One drawback of this approach is that
the EvolutionStream runs with the same evolution parameters until the stream
is truncated. It is not possible to change the stream's Engine configuration during
the evolution process. This is the purpose of the EvolutionStreamable interface.
It is similar to the Java Iterable interface and abstracts the EvolutionStream
creation.
public interface EvolutionStreamable<
    G extends Gene<?, G>,
    C extends Comparable<? super C>
> {
    EvolutionStream<G, C>
    stream(Supplier<EvolutionStart<G, C>> start);


    EvolutionStream<G, C> stream(EvolutionInit<G> init);

    EvolutionStreamable<G, C>
    limit(Supplier<Predicate<? super EvolutionResult<G, C>>> p);
}
Listing 3.8: EvolutionStreamable interface

Listing 3.8 shows the main methods of the EvolutionStreamable interface.
The existing stream methods take an initial value, which allows you to concatenate
different engines. With the limit method it is possible to limit the size of the
created EvolutionStream instances. The io.jenetics.ext module contains
additional classes which allow for concatenating evolution Engines with different
configurations, which will then create one varying EvolutionStream. These
additional Engine classes are:
1. ConcatEngine and
2. CyclicEngine.

3.1.6.1 ConcatEngine
The ConcatEngine class allows for creating more than one Engine with different
configurations, and combining them into one EvolutionStreamable (Engine).

Figure 3.1.5: Engine concatenation

Figure 3.1.5 shows how the EvolutionStream of two concatenated Engines
works. You can create the first partial EvolutionStream with an optional start
value. If the first EvolutionStream stops, its final EvolutionResult is used
as the start value of the second EvolutionStream, created by the second evolution
Engine. It is important that the evolution Engines used for concatenation are
limited. Otherwise the created EvolutionStream will only use the first Engine,
since it is not limited.

The concatenated evolution Engines must be limited (by calling
Engine::limit), otherwise only the first Engine is used for executing the
resulting EvolutionStream.

The following code sample shows how to create an EvolutionStream from
two concatenated Engines. As you can see, the two Engines are limited.
final Engine<DoubleGene, Double> engine1 = ...;
final Engine<DoubleGene, Double> engine2 = ...;

final Genotype<DoubleGene> result =


    ConcatEngine.of(
        engine1.limit(50),
        engine2.limit(() -> Limits.bySteadyFitness(30)))
    .stream()
    .collect(EvolutionResult.toBestGenotype());

A practical use case for the Engine concatenation is when you want to do a
broader exploration of the search space at the beginning and narrow it with the
following Engine. In such a setup, the first Engine would be configured with a
Mutator with a relatively big mutation probability. The mutation probabilities
of the following Engines would then be gradually reduced.

3.1.6.2 CyclicEngine
The CyclicEngine is similar to the ConcatEngine. Where the ConcatEngine
stops the evolution when the EvolutionStream of the last engine terminates,
the CyclicEngine continues with a new EvolutionStream from the first Engine.
The evolution flow of the CyclicEngine is shown in figure 3.1.6.

Figure 3.1.6: Cyclic Engine

Since the CyclicEngine creates unlimited streams, although the participating
Engines all create limited streams, the resulting EvolutionStream must
be limited as well. The code snippet below shows the creation and execution of
a cyclic EvolutionStream.
final Genotype<DoubleGene> result =
    CyclicEngine.of(
        engine1.limit(50),
        engine2.limit(() -> Limits.bySteadyFitness(15)))
    .stream()
    .limit(Limits.bySteadyFitness(50))
    .collect(EvolutionResult.toBestGenotype());

The reason for using a cyclic EvolutionStream is similar to the reason for using a
concatenated EvolutionStream. It allows you to do a broad search, followed by
a narrowed exploration. This cycle is then repeated until the limiting predicate
of the outer stream terminates the evolution process.

3.1.7 Multi-objective optimization


A Multi-objective optimization Problem (MOP) can be defined as the problem
of finding
a vector of decision variables which satisfies constraints and optimizes
a vector function whose elements represent the objective functions.
These functions form a mathematical description of performance


criteria which are usually in conflict with each other. Hence, the
term »optimize« means finding such a solution which would give the
values of all the objective functions acceptable to the decision maker.
[34]
There are several ways for solving multi-objective problems. An excellent theoretical
foundation is given in [10]. The algorithms implemented by Jenetics are
based on the notion of Pareto optimality, as described in [18], [15] and [23].

3.1.7.1 Pareto efficiency


Pareto efficiency is named after the Italian economist and political scientist
Vilfredo Pareto10. He used the concept in his studies of economic efficiency and
income distribution. The concept has been applied in different academic fields
such as economics, engineering, and the life sciences. Pareto efficiency says that
an allocation is efficient if an action makes some individual better off and no
individual worse off. In contrast to single-objective optimization, where usually
only one optimal solution exists, multi-objective optimization creates a set of
optimal solutions. The optimal solutions are also known as the Pareto front or
Pareto set.
Definition. (Pareto efficiency [10]): A solution, x, is said to be Pareto
optimal iff there is no x′ for which v = (f_1(x′), ..., f_k(x′)) dominates u =
(f_1(x), ..., f_k(x)).
The definition says that x∗ is Pareto optimal if there exists no feasible vector,
x, which would decrease some criterion without causing a simultaneous increase
in at least one other criterion.
Definition. (Pareto dominance [10]): A vector u = (u_1, ..., u_k) is said to
dominate another vector v = (v_1, ..., v_k) (denoted by u ⪰ v) iff u is partially
greater than v, i.e., ∀i ∈ {1, ..., k}: u_i ≥ v_i ∧ ∃i ∈ {1, ..., k}: u_i > v_i.
After these two basic definitions, let's have a look at a simple example. Figure
3.1.7 shows some points of a two-dimensional solution space. For simplicity, the
points all lie within a circle with radius 1 and center point (1, 1).
Figure 3.1.8 shows the Pareto front of a maximization problem. This means
we are searching for solutions that try to maximize the x and y coordinate at
the same time.
Figure 3.1.9 shows the Pareto front if we try to minimize the x and y
coordinate at the same time.

3.1.7.2 Implementing classes


The classes used for solving multi-objective problems reside in the io.jenetics-
.ext.moea package. Originally, the Jenetics library focused on solving single-
objective problems. This drove the design decision to force the return value of
the fitness function to be Comparable. If the result type of the fitness function
is a vector, it is no longer clear how to make the results comparable. Jenetics
chooses to use the Pareto dominance relation (see section 3.1.7.1). The Pareto
dominance relation, ≻, defines a strict partial order, which means ≻ is


[Scatter plot: sample points lying within a circle of radius 1 around the center point (1, 1)]

Figure 3.1.7: Circle points

[Scatter plot: Pareto front of the maximization problem, upper-right boundary of the circle]

Figure 3.1.8: Maximizing Pareto front

1. irreflexive: u ⊁ u,
2. transitive: u ≻ v ∧ v ≻ w ⇒ u ≻ w and
3. asymmetric: u ≻ v ⇒ v ⊁ u .
The io.jenetics.ext.moea package contains the classes needed for doing multi-
objective optimization. One of the central types is the Vec interface, which
allows you to wrap a vector of any element type into a Comparable.
public interface Vec<T> extends Comparable<Vec<T>> {
    T data();
    int length();
    ElementComparator<T> comparator();
    ElementDistance<T> distance();
    Comparator<T> dominance();
}
Listing 3.9: Vec interface


[Scatter plot: Pareto front of the minimization problem, lower-left boundary of the circle]

Figure 3.1.9: Minimizing Pareto front

Listing 3.9 shows the necessary methods of the Vec interface. These methods are
sufficient to do all the optimization calculations. The data() method returns the
underlying vector type, like double[] or int[]. With the ElementComparator,
which is returned by the comparator() method, it is possible to compare single
elements of the vector type T. This is similar to the ElementDistance function,
returned by the distance() method, which calculates the distance of two
vector elements. The last method, dominance(), returns the Pareto dominance
comparator, ≻. Since it is quite bothersome to implement all these methods,
the Vec interface comes with a set of factory methods, which allow for creating
Vec instances for some primitive array types.
final Vec<int[]> ivec = Vec.of(1, 2, 3);
final Vec<long[]> lvec = Vec.of(1L, 2L, 3L);
final Vec<double[]> dvec = Vec.of(1.0, 2.0, 3.0);

For efficiency reasons, the primitive arrays are not copied when the Vec instance
is created. This means you can, theoretically, change the values of a created Vec
instance after creation, which will lead to unexpected results.

Although the Vec interface extends the Comparable interface, it violates its
general contract. It only implements the Pareto dominance relation, which
defines a partial order. Trying to sort a list of Vec objects might lead to
an exception (thrown by the sorting method) at runtime.

The second difference to the single-objective setup is the EvolutionResult
collector. In the single-objective case, we only get one best result, which is
different in multi-objective optimization. As we have seen in section 3.1.7.1,
we no longer have only one result; we have a set of Pareto optimal solutions.
There is a predefined collector in the io.jenetics.ext.moea package,
MOEA::toParetoSet(IntRange), which collects the Pareto optimal Phenotypes into
an ISeq.
final ISeq<Phenotype<DoubleGene, Vec<double[]>>> paretoSet =
    engine.stream()
        .limit(100)
        .collect(MOEA.toParetoSet(IntRange.of(30, 50)));

Since there exists a potentially infinite number of Pareto optimal solutions, you
have to define the desired number of set elements. This is done with an IntRange
object, where you can specify the minimal and maximal set size. The example
above will return a Pareto set with a size in the range of [30, 50). For reducing the
Pareto set size, the distance between two vector elements is taken into account.
Points which lie very close to each other are removed. This leads to a result
where the Pareto optimal solutions are, more or less, evenly distributed over the
whole Pareto front. The crowding distance11 measure is used for calculating the
proximity of two points and is described in [10] and [18].
Till now we have described the multi-objective result type (Vec) and the final
collecting of the Pareto optimal solutions. So let's create a simple multi-objective
problem and an appropriate Engine.
final Problem<double[], DoubleGene, Vec<double[]>> problem =
    Problem.of(
        v -> Vec.of(v[0]*cos(v[1]) + 1, v[0]*sin(v[1]) + 1),
        Codecs.ofVector(
            DoubleRange.of(0, 1),
            DoubleRange.of(0, 2*PI)
        )
    );

final Engine<DoubleGene, Vec<double[]>> engine =
    Engine.builder(problem)
        .offspringSelector(new TournamentSelector<>(4))
        .survivorsSelector(UFTournamentSelector.ofVec())
        .build();

The fitness function in the example problem above creates 2D points which
all lie within a circle with a center of (1, 1). In figure 3.1.8 you can
see what the resulting solutions will look like. There is almost no difference in
creating an evolution Engine for single- or multi-objective optimization. You
only have to take care to choose the right Selector. Not all Selectors will
work for multi-objective optimization. This includes all Selectors which need
a Number fitness type and where the population needs to be sorted12. A
Selector which works fine in a multi-objective setup is the TournamentSelector.
Additionally, you can use one of the special MOO selectors: NSGA2Selector and
UFTournamentSelector.

NSGA2 selector This selector selects the first elements of the population,
which has been sorted by the crowded-comparison operator, ≻n (equation 3.1.1),
as described in [15]:

    i ≻n j if (irank < jrank) ∨ ((irank = jrank) ∧ idist > jdist) ,    (3.1.1)
11 The crowding distance value of a solution provides an estimate of the density of solutions
surrounding that solution. The crowding distance value of a particular solution is the average
distance of its two neighboring solutions. https://www.igi-global.com/dictionary/crowding-distance/42740.
12 Since the ≻ relation doesn’t define a total order, sorting the population will lead to an
IllegalArgumentException at runtime.


where irank denotes the non-domination rank of individual i and idist the crowding
distance of individual i.
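A minimal usage sketch of this selector; the NSGA2Selector.ofVec() factory is assumed here by analogy to the UFTournamentSelector.ofVec() call shown above.
// Sketch: plugging the NSGA2 selector into the evolution Engine.
// NSGA2Selector.ofVec() is assumed by analogy to UFTournamentSelector.ofVec().
final Engine<DoubleGene, Vec<double[]>> engine =
    Engine.builder(problem)
        .survivorsSelector(NSGA2Selector.ofVec())
        .offspringSelector(NSGA2Selector.ofVec())
        .build();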

Unique fitness tournament selector The selection of unique fitnesses lifts
the selection bias towards over represented fitnesses by reducing multiple solutions
sharing the same fitness to a single point in the objective space. It is therefore no
longer required to assign a crowding distance of zero to individual of equal fitness
as the selection operator correctly enforces diversity preservation by picking
unique points in the objective space. [18]

Since the multi-objective optimization (MOO) classes are an extension to
the existing evolution Engine, the implementation doesn't exactly follow an
established algorithm, like NSGA2 or SPEA2. The results and performance,
described in the relevant papers, are therefore not directly comparable. See
listing 1.2 for a comparison with the Jenetics evolution flavor.

3.1.7.3 Termination
Most of the existing termination strategies, implemented in the Limits class,
presume a total order of the fitness values. This assumption holds for single-
objective optimization problems, but not for multi-objective problems. Only
termination strategies which don't rely on the total order of the fitness values
can be safely used. The following termination strategies can be used for multi-
objective problems:
• Limits::byFixedGeneration,
• Limits::byExecutionTime and
• Limits::byGeneConvergence.
All other strategies don't have a well-defined termination behavior.
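A hedged sketch of how such a limit can be combined with the Pareto set collector shown earlier; the generation count and the time budget are arbitrary values chosen for illustration.
// Sketch: terminating a multi-objective run by generation count and wall-clock time.
// Uses the limits listed above; java.time.Duration is required for the time limit.
final ISeq<Phenotype<DoubleGene, Vec<double[]>>> paretoSet =
    engine.stream()
        .limit(Limits.byFixedGeneration(500))
        .limit(Limits.byExecutionTime(Duration.ofSeconds(30)))
        .collect(MOEA.toParetoSet(IntRange.of(30, 50)));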

3.1.7.4 Mixed optimization


Till now, we have only considered MOO problems where all objectives were
either minimized or maximized. This property might be too restrictive for some
problem classes. If you have a MOO problem with three objectives, for example,
where objectives one and three must be minimized and objective two has to be
maximized, you need an additional mechanism for doing this. Defining the
optimization direction of the Engine is not sufficient. The fitness result Vec has
to be configured accordingly. This can be done by using the most generic factory
method of the Vec interface. Since this is quite bothersome, the VecFactory
can be used for this task. Listing 3.10 shows the main method of the interface.
The additional static factory methods have been omitted.
@FunctionalInterface
public interface VecFactory<T> {
    Vec<T> newVec(final T array);
}
Listing 3.10: VecFactory interface


Instead of creating the solution Vec instances directly, the fitness function must
create them with a properly configured VecFactory instance.
final VecFactory<double[]> factory = VecFactory.ofDoubleVec(
    Optimize.MINIMUM,
    Optimize.MAXIMUM,
    Optimize.MINIMUM
);

Vec<double[]> fitness(final double[] point) {
    final double x = point[0];
    final double y = point[1];
    return factory.newVec(new double[] {
        sin(x)*y,
        cos(y)*x,
        x + y
    });
}

The example code above shows how the VecFactory must be configured to
create Vec<double[]> objects with the desired optimization properties. In the
fitness function you then use the VecFactory instance for creating the fitness
values instead of the Vec::of(double...) factory method. The optimization
direction of the evolution Engine remains at its default value, Optimize.MAXIMUM.
If you configure the Engine for minimization, the configured optimization
directions in the VecFactory are reversed. That means the first objective
will be maximized instead of minimized, and so on.
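A minimal sketch of how the fitness function above could be wired into an Engine; the input codec (a two-element double vector in [0, 10]) and the selector choice are illustrative assumptions.
// Sketch: building an Engine around the VecFactory-based fitness function.
// The codec range, vector length and selectors are assumptions for this example.
final Engine<DoubleGene, Vec<double[]>> engine =
    Engine.builder(
            this::fitness,
            Codecs.ofVector(DoubleRange.of(0, 10), 2))
        .survivorsSelector(UFTournamentSelector.ofVec())
        .offspringSelector(new TournamentSelector<>(3))
        .build();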

3.1.8 Grammatical evolution


One of the difficulties when defining a codec13 for a given problem is the creation
of invalid solution candidates or individuals. Using constraints is one way of
handling this problem, which is described in section 2.5 on page 62. Another
possibility is to assign bad fitness values to invalid individuals. This might
work to some degree, but at the cost of additional CPU and evolution time. In
the best case, your codec creates no invalid solutions at all. Grammatical
evolution (GE) can help you to achieve this. GE was introduced by Michael
O'Neill and Conor Ryan [37, 32, 38] for solving genetic programming (GP)14
problems. Although the initial realm of GE was creating/evolving programs
in arbitrary languages, it is not restricted to this area. GE can be used for
every problem where the problem domain can be expressed as sentences of a
context-free language (CFL)15.
GE requires an additional mapping step for creating a sentence (program)
from a given CFG. Diagram 3.1.10 on the following page shows the principal
decoding process of GE. A BitChromosome or IntegerChromosome determines
the rules which are used for creating the sentences. The grammar is usually given
in Backus-Naur form (BNF). The result is the construction of a syntactically
correct program from a binary string which can then be evaluated by a fitness
function. In the following sections, the building blocks of GE are described in
greater detail.
13 See section 2.3 on page 54.
14 See section 3.2 on page 108.
15 https://en.wikipedia.org/wiki/Context-free_language


[Diagram: a binary string (BitChromosome) or an integer string (IntegerChromosome) is mapped, via a Codec and the grammar rules (CFG), to a program (Tree), which is then executed.]

Figure 3.1.10: Grammatical evolution

3.1.8.1 Context-free grammar


A context-free grammar (CFG)16 [19] basically consists of a finite set of grammar
rules. The grammar rules consist of two kinds of symbols: 1) the terminals,
which are the symbols of the alphabet underlying the language, and 2) the
non-terminals, which behave like variables ranging over strings of terminals. A
rule is of the form A → α, where A is a non-terminal and α is a string of terminal
and/or non-terminal symbols. CFGs are used to generate strings (sentences),
rather than to recognize strings.
Definition. (CFG): A context-free grammar, G, is defined by the tuple G =
(N , T , R, S), where

1. N is a finite set of non-terminal symbols called the variables. Every symbol
represents a different type of phrase in the sentence.

2. T is a finite set of terminal symbols, not element of N. Terminals make
up the actual content of the sentences defined by G. The terminals are
the alphabet of the language.

3. R is a finite relation in N × (N ∪ T)∗. Members of R are the rewrite rules
of the grammar.

4. S ∈ N is the start symbol and represents the whole sentence.
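For the arithmetic expression grammar used as a running example in the following sections, these components are N = {⟨expr⟩, ⟨op⟩, ⟨var⟩, ⟨num⟩}, T = {+, −, ∗, /, x, y, 1, ..., 9}, S = ⟨expr⟩, and R contains one rewrite rule per line of the BNF shown in section 3.1.8.2.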

3.1.8.2 Backus-Naur form


The Backus-Naur form (BNF)17 is a traditional form to represent a CFG. It is
used to formally define the grammar of a language, so that there is no ambiguity
as to what is allowed and what is not. BNF uses a range of symbols and
16 https://en.wikipedia.org/wiki/Context-free_grammar
17 https://en.wikipedia.org/wiki/Backus-Naur_form


expressions to create production rules. The names of the production rules are
put within angle brackets, <...>, and the alternatives are separated by vertical
bars, |. A simple BNF production rule might look like this:
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In this example, num is the name of the production rule and the values 1..9
represent the terminal symbols associated with the rule. If a non-terminal
symbol appears on the right hand side, it means that there will be another
production rule (or a set of rules) to define its replacement.
<expr> ::= <num> | <var> | <expr> <op> <expr>
This shows that expr is either a num, a var, or two exprs concatenated by an op.
All components are non-terminal, therefore further production rules are required.
<op> ::= + | - | * | /
<var> ::= x | y
These two rules define four op symbols, {+, −, ∗, /}, and two var terminals, {x, y}.
Production rules for the syntax of a language might come as a large set of BNF
statements that specify how every aspect of the language is defined. Every
non-terminal symbol on the right hand side of a production rule must have a
rule that has the symbol on the left side. This continues until everything can be
specified in relation to terminal symbols. Here is the whole grammar, which
defines a language for simple arithmetic expressions:
<expr> ::= <num> | <var> | <expr> <op> <expr>
<op> ::= + | - | * | /
<var> ::= x | y
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The symbol of the first rule, expr, will automatically serve as the start symbol of
the grammar, G, defined by the given BNF.

3.1.8.3 Sentence generation


After we have defined the CFG, we want to create valid (random) sentences
from our grammar. To do so, we will introduce a numbering schema for the
alternative expressions of every rule. Such a schema for our simple arithmetic
grammar is shown below:
[0] [1] [2]
<expr> ::= <num> | <var> | <expr> <op> <expr>
[0] [1] [2] [3]
<op> ::= + | - | * | /
[0] [1]
<var> ::= x | y
[0] [1] [2] [3] [4] [5] [6] [7] [8]
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The following example shows snapshots of the symbol list, L, for a possible
sentence creation run, according to the described leftmost derivation algorithm
3.1 on the next page.


Algorithm 3.1 Leftmost derivation


1. L ← [S]
Initialize the symbol list, L, with the start symbol, S.

2. n ← L [min {i}] ∈ N
Pick the leftmost non-terminal symbol, L [min {i}], from the current sentence
list, L.
3. E ← R (n) [rand]
Get the rule, R (n), for the chosen non-terminal, n, and select a random
rule alternative, E. E will contain one or more terminal and/or non-terminal
symbols.
4. L [min {i}] ← E
Replace the chosen variable, n, with the selected symbols, E.
5. Repeat steps 2..4 until the symbol list, L, contains only terminals.

1. L = [<expr>]. Initialize the sentence list with the start symbol.


2. L = [<expr>, <op>, <expr>]. Select the first non-terminal symbol in
the list, <expr>, and replace it with a randomly chosen rule alternative.
Alternative [2], <expr> <op> <expr>, is chosen and replaces the variable
<expr>.
3. L = [<num>, <op>, <expr>]. Replacing the first <expr> with alternative
[0], <num>.
4. L = [5, <op>, <expr>]. Replacing <num> with 5.
5. L = [5, -, <expr>]
6. L = [5, -, <var>]
7. L = [5, -, x]
After the 6th iteration the symbol list contains only terminal symbols and the
generation process stops. If we convert the symbol list into a string, we get
an algebraic expression, 5 − x, which is an element of the CFL, defined by our
simple CFG. Algorithm 3.2 on the following page shows the rightmost derivation
method, which is a variation of the leftmost derivation algorithm. The only
difference is that instead of processing the symbol list from left to right, it
is processed from right to left.

3.1.8.4 Mapping
Now we have everything in place to do the last step. How to create a sentence
from a given grammar? The original paper [37] uses a bit string for creating
a sentence from a given chromosome and called this process mapping. Since
we have already described how to create a sentence from a grammar, we must
describe the last missing part of the process, and this missing part is the selection
of alternative symbols from a given rule. This is step three in the described


Algorithm 3.2 Rightmost derivation


1. L ← [S]
Initialize the symbol list, L, with the start symbol, S.

2. n ← L [max {i}] ∈ N
Pick the rightmost non-terminal symbol, L [max {i}], from the current
sentence list, L.
3. E ← R (n) [rand]
Get the rule, R (n), for the chosen non-terminal, n, and select a random
rule alternative, E. E will contain one or more terminal and/or non-terminal
symbols.
4. L [max {i}] ← E
Replace the chosen variable, n, with the selected symbols, E.
5. Repeat steps 2..4 until the symbol list, L, contains only terminals.

algorithms 3.1 on the previous page and 3.2. Instead of selecting a random
alternative, the index of the selected alternative must be determined by the
given chromosome.

[Diagram: the codon sequence 00000101|10110001|01110000|11101001|01111111|01000010, read as the unsigned integers 5, 177, 112, 233, 127, 66, drives the derivation ⟨expr⟩ → ⟨expr⟩⟨op⟩⟨expr⟩ → ⟨num⟩⟨op⟩⟨expr⟩ → 5⟨op⟩⟨expr⟩ → 5-⟨expr⟩ → 5-⟨var⟩ → 5-x. Each codon is taken modulo the number of alternatives of the rule being expanded: 5 % 3 = 2, 177 % 3 = 0, 112 % 9 = 4, 233 % 4 = 1, 127 % 3 = 1, 66 % 2 = 0.]

Figure 3.1.11: GE mapping

Figure 3.1.11 shows the GE mapping process. The binary string is split into
8-bit chunks and interpreted as unsigned integers, with values in the range of
[0, 256). These 8-bit chunks are called codons. Whenever a new rule alternative
must be selected, a new codon is read from the chromosome. A simple modulo
operation is used to get an index within the desired range. If more codons are
needed during the mapping process than available in the input chromosome,
the reading of the codons wraps around and starts at the beginning of the
chromosome again.
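A minimal sketch of this index calculation (a plain helper written for illustration, not the library's implementation):
// Illustrative helper: maps one codon to the index of a rule alternative.
// 'alternatives' is the number of alternatives of the rule being expanded.
static int alternativeIndex(final int codon, final int alternatives) {
    return Math.floorMod(codon, alternatives); // e.g. 233 % 4 = 1 selects '-' from <op>
}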


3.1.8.5 Implementing classes


This section describes the main classes which are part of the implementation of
the GE. All the described classes are part of the io.jenetics.ext module and
reside in the io.jenetics.ext.grammar package.
Cfg This class represents a context-free grammar as described in section 3.1.8.1
on page 101.
Bnf Although it is possible to create instances of the Cfg class manually, it is in
most cases easier to define it as BNF string. This class contains methods
for parsing and formatting CFGs from and to BNF strings.
SymbolIndex The SymbolIndex interface defines the strategy by which alternative
symbols of a given rule are selected during the sentence generation process.
The classical algorithm uses codons, encoded in a binary string.

Codons This class is an implementation of the classical symbol selection algorithm
as described in the original paper. It implements the SymbolIndex
interface.
Generator The Generator interface is an abstraction of the sentence generation
algorithm as such as algorithm 3.1 on page 103 or 3.2 on the preceding
page.
Mappings The Mappings class contains factory methods for creating different
mapping strategies. Since the mapping process, as shown in figure 3.1.11
on the previous page, is implemented as Codec18 , the factory methods will
return Codec instances for the BitChromosome and IntegerChromosome.

Let’s start with the Cfg class and how to create grammars. The Cfg class comes
with a set of factory methods, which lets you create Cfg objects quite easily. Our
already known arithmetic expression grammar can be created with the following
code. The used factory methods, N, T, E, R, were statically imported.
final Cfg<String> cfg = Cfg.of(
    R("expr",
        E(N("num")), E(N("var")),
        E(N("expr"), N("op"), N("expr"))
    ),
    R("op", E(T("+")), E(T("-")), E(T("*")), E(T("/"))),
    R("var", E(T("x")), E(T("y"))),
    R("num",
        E(T("1")), E(T("2")), E(T("3")),
        E(T("4")), E(T("5")), E(T("6")),
        E(T("7")), E(T("8")), E(T("9"))
    )
);

As an alternative, the above CFG can also be created from a string in BNF form,
as described in section 3.1.8.2 on page 101. The code snippet below shows how
to do this.
final Cfg<String> cfg = Bnf.parse("""
    <expr> ::= <num> | <var> | <expr> <op> <expr>
    <op> ::= + | - | * | /
    <var> ::= x | y
    <num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    """
);

18 See section 2.3 on page 54.

The SymbolIndex interface is responsible for selecting the index of the rule
alternative to choose. The reason for this interface is to decouple the selection
algorithm from the data structure which determines the selection; it decouples
the algorithm from the underlying BitChromosome or IntegerChromosome.
@FunctionalInterface
public interface SymbolIndex {
    int next(Cfg.Rule<?> rule, int bound);
}
Listing 3.11: SymbolIndex interface

Listing 3.11 shows the single method of the SymbolIndex interface. It takes
the rule for which to select an alternative and the desired index bound, and
returns the selected symbol index. The rule parameter of the next function
allows using different index selection strategies, or different index sources, for
different rules. This interface makes it possible to let the sentence creation
be controlled by an IntegerChromosome instead of the classical bit string.
The Codons class implements the SymbolIndex interface and can be created
from BitChromosome and IntegerChromosome instances. Codons created from
a BitChromosome give you an implementation of the classic symbol selection
algorithm.
// Create codons backed by a binary string.
var bch = BitChromosome.of(100*8);
var bcodons = Codons.ofBitGenes(bch);
// Create codons backed by an integer string.
var ich = IntegerChromosome.of(IntRange.of(0, 256), 100);
var icodons = Codons.ofIntegerGenes(ich);

The code snippet above shows two ways of creating essentially the same codons.
In the first variant, the codons are read from a BitChromosome with a length
of 800, which results in 100 different indexes readable from the created Codons
object. In the second variant an IntegerChromosome, with a value range of
[0, 256) and a length of 100, is used for the Codons object. Using an
IntegerChromosome instead of a BitChromosome gives you greater flexibility, as it
allows you to specify an explicit range for the codon values.
Implementations of the functional Generator interface are responsible for
creating sentences from a given grammar. A generator takes a Cfg object as input
and creates a generic result object of type R. This genericity lets you use the same
interface for different result types, like List<Symbol<String>> or
Tree<Symbol<String>, ?>.
@FunctionalInterface
public interface Generator<T, R> {
    R generate(Cfg<? extends T> cfg);
}
Listing 3.12: Generator interface

Listing 3.12 shows the definition of the Generator interface. It takes a Cfg
object as input and returns the created result sentence of type R. The kind of


the created result is not determined by the interface. It can be a random instance
of the input grammar, a reproducible, deterministic result, or even always the
same result. It is the responsibility of the implementations to determine which
kind of result to return. The reason for this is to not make assumptions about the
sentence creation and bake these assumptions into the API; this avoids a leaky
abstraction.
There are currently two implementations of the Generator interface: 1) the
SentenceGenerator, which generates a list of terminal symbols (sentences) from
a given grammar, and 2) the DerivationTreeGenerator, generating a derivation
tree, or parse tree.
// Define sentence generator for creating random sentences.
var generator = new SentenceGenerator<String>(
    SymbolIndex.of(RandomGenerator.getDefault()),
    1_000
);
// Create random sentence for given CFG.
List<Cfg.Terminal<String>> sentence = generator.generate(cfg);
String string = sentence.stream()
    .map(Cfg.Symbol::name)
    .collect(Collectors.joining());

In the code snippet above you can see how to create a SentenceGenerator which
will create random sentences with a maximal length of 1000. The generated
sentence will be a list of terminal symbols, but it is quite easy to convert it to a
string. The DerivationTreeGenerator can be used similarly; instead of
a list of terminals, it creates a Tree<Symbol<String>, ?>.
The factory methods of the Mappings class put it all together and let you
easily specify the needed parts of the mapping process. Jenetics already has
an interface for expressing such a mapping process, the Codec19 interface. So
no additional interface or class was introduced and the Mappings factories
return Codec instances instead. In the code snippet below you can see how to
create the classical mapping, with a single BitChromosome used for the rule
symbol selection. It is created via the Mappers::singleBitChromosomeMapper
method.
final Codec<List<Terminal<String>>, BitGene> codec =
    Mappers.singleBitChromosomeMapper(
        cfg,
        // Length of the used BitChromosome.
        100*8,
        // The used generator, created from SymbolIndex.
        index -> new SentenceGenerator<>(index, 1_000)
    );

For creating this codec you must specify the CFG, the length of the chromosome
and the sentence Generator. The Generator is given as a factory function, which
takes a SymbolIndex as input. The codec created in the code snippet below
is essentially the same as the one created with the Mappers::singleBitChromosomeMapper
method. It only differs in the way the codons are created. Using an
IntegerChromosome as codon source gives you greater flexibility and allows you
to change the value range of the created codons.
final Codec<List<Terminal<String>>, IntegerGene> codec =
    Mappers.singleIntegerChromosomeMapper(
        cfg,
        // Value range of created codons.
        IntRange.of(0, 256),
        // Length of the used IntegerChromosome.
        100,
        index -> new SentenceGenerator<>(index, 1_000)
    );

19 See section 2.3 on page 54.

Besides the classical mapping algorithms, Jenetics also contains a novel method
for genotype to sentence mapping. This new approach uses a separate
IntegerChromosome for every Cfg.Rule. This allows aligning the range of the
chromosome with the number of alternatives of the rule. With this property,
no modulo operation is needed for creating codons with the correct bounds.
No additional modulo operation means that, for randomly created chromosomes,
all alternatives have the same probability of being selected. All rule
alternatives have the same chances of being selected; there is no accidental bias
towards a given rule alternative or symbol.
final Codec<List<Terminal<String>>, IntegerGene> codec =
    Mappers.multiIntegerChromosomeMapper(
        cfg,
        // Chromosome length depends on the number of alternatives.
        rule -> IntRange.of(rule.alternatives().size()*25),
        index -> new SentenceGenerator<>(index, 1_000)
    );

The code snippet above shows how to create a mapping which uses a genotype
with one chromosome for every rule in the given CFG. Our example CFG
[0] [1] [2]
<expr> ::= <num> | <var> | <expr> <op> <expr>
[0] [1] [2] [3]
<op> ::= + | - | * | /
[0] [1]
<var> ::= x | y
[0] [1] [2] [3] [4] [5] [6] [7] [8]
<num> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
will use a Genotype with the following structure for the encoding of the codons.
Genotype.of(
    IntegerChromosome.of(IntRange.of(0, 3), 3*25), // <expr>
    IntegerChromosome.of(IntRange.of(0, 4), 4*25), // <op>
    IntegerChromosome.of(IntRange.of(0, 2), 2*25), // <var>
    IntegerChromosome.of(IntRange.of(0, 9), 9*25)  // <num>
);

As you can see, the number range for each rule chromosome reflects the number
of available rule alternatives and the chromosome length can be expressed as a
multiple of the number of alternatives of the corresponding rule. If you don’t
need this flexibility, it is of course also possible to use constant chromosome
lengths. Just use a length function like this: rule -> IntRange.of(1_000).

3.2 io.jenetics.prog
In artificial intelligence, genetic programming (GP) is a technique whereby
computer programs are encoded as a set of genes that are then modified (evolved)
using an evolutionary algorithm (often a genetic algorithm).20 The io.jenetics.prog
module contains classes which enable the Jenetics library to do GP. It
introduces a ProgramGene and ProgramChromosome pair, which serves as the
main data structure for genetic programs. A ProgramGene is essentially a tree
(AST21) of operations (Op) stored in a ProgramChromosome.22

3.2.1 Operations
When creating your own genetic programs, it is not necessary to derive classes from
ProgramGene or ProgramChromosome. The intended extension point is the
Op interface.

The extension point for your own GP implementations is the Op interface. There
is in general no need for extending the ProgramChromosome class.

public interface Op<T> {
    String name();
    int arity();
    T apply(T[] args);
}
Listing 3.13: GP Op interface

The generic type of the Op interface (see listing 3.13) enforces the data type
constraints for the created program tree and makes the implementation a strongly
typed GP. Using the Op.of factory method, a new operation is created by defining
the desired operation function.
final Op<Double> add = Op.of("+", 2, v -> v[0] + v[1]);
final Op<String> concat = Op.of("+", 2, v -> v[0] + v[1]);

A new ProgramChromosome is created with the operations suitable for our
problem. When creating a new ProgramChromosome, we must distinguish two
different kinds of operations:
1. Non-terminal operations have an arity greater than zero, which means
they take at least one argument. These operations need to have child
nodes, where the number of children must be equal to the arity of the
operation of the parent node. Non-terminal operations will be abbreviated
to operations.
2. Terminal operations have an arity of zero and form the leaves of the
program tree. Terminal operations will be abbreviated to terminals.
The io.jenetics.prog module comes with three predefined terminal opera-
tions: Var, Const and EphemeralConst.
20 https://en.wikipedia.org/wiki/Genetic_programming
21 https://en.wikipedia.org/wiki/Abstract_syntax_tree
22When implementing the GP module, the emphasis was to not create a parallel world of

genes and chromosomes. It was a requirement, that the existing Alterer and Selector classes
could also be used for the new GP classes. This has been achieved by flattening the AST of a
genetic program to fit into the 1-dimensional (flat) structure of a chromosome.


Var The Var operation defines a variable of a program, which is set from
outside when it is evaluated.
final Var<Double> x = Var.of("x", 0);
final Var<Double> y = Var.of("y", 1);
final Var<Double> z = Var.of("z", 2);
final ISeq<Op<Double>> terminals = ISeq.of(x, y, z);

The terminal operations defined in the listing above can be used for defining
a program which takes a 3-dimensional vector as input parameters, x, y, and
z, with the argument indices 0, 1, and 2. If you have another look at the
apply method of the operation interface, you can see that this method takes an
object array of type T. The variable x will return the first element of the input
arguments, because it has been created with index 0.

Const The Const operation will always return the same, constant value when
evaluated.
final Const<Double> one = Const.of(1.0);
final Const<Double> pi = Const.of("PI", Math.PI);

You can create a constant operation in two flavors: with a value only, and with a
dedicated name. If a constant has a name, the symbolic name is used instead of
the value when the program tree is printed.

EphemeralConst An ephemeral constant is a terminal operation which encapsulates
a value that is generated at run time from the Supplier it is created
from. Ephemeral constants allow you to have terminals that don't all have the
same value. To create an ephemeral constant that takes its random value in
[0, 1) you write the following code.
final Op<Double> rand1 = EphemeralConst
    .of(RandomRegistry.random()::nextDouble);
final Op<Double> rand2 = EphemeralConst
    .of("R", RandomRegistry.random()::nextDouble);

The ephemeral constant value is determined when it is inserted in the tree and
never changes until it is replaced by another ephemeral constant.

3.2.2 Program creation


The ProgramChromosome comes with some factory methods, which let you
easily create program trees with a given depth and a given set of operations and
terminals.
final int depth = 5;
final ISeq<Op<Double>> operations = ISeq.of(...);
final ISeq<Op<Double>> terminals = ISeq.of(...);
final ProgramChromosome<Double> program = ProgramChromosome
    .of(depth, operations, terminals);

The code snippet above will create a perfect program tree23 of depth 5. All non-
leaf nodes will contain operations, randomly selected from the given operations,
whereas all leaf nodes are filled with operations from the terminals.
23 All leaves of a perfect tree have the same depth and all internal nodes have degree Op.arity.


The created program tree is perfect, which means that all leaf nodes have
the same depth. If new trees need to be created during evolution, they
will be created with the depth, operations and terminals defined by the
template program tree.

During the evolution phase, the size of the ProgramChromosome can grow and
shrink. The SingleNodeCrossover, which is part of the io.jenetics.ext module,
is responsible for this change in the program size. When a smaller sub tree is
exchanged with a bigger sub tree, the size of the first tree will grow and the
size of the second tree will shrink. This can lead to undesirably large programs.
For this reason, it is possible to create a ProgramChromosome with an
additional validation predicate.
final ProgramChromosome<Double> program = ProgramChromosome.of(
    depth, ch -> ch.root().size() <= 50,
    operations, terminals
);

The predicate, ch -> ch.root().size() <= 50, marks all programs with more
than 50 nodes as invalid. Invalid chromosomes will then be replaced by newly
created ones. When defining a validation predicate, you have to take care that
the desired depth and the validation predicate match. If the given program
tree depth is too big, e. g. 51, every newly created program will be immediately
marked as invalid. This is because a tree with depth 51 will certainly have more
than 50 nodes.
The evolution Engine used for solving GP problems is created the same
way as for normal GA problems. Also the execution of the EvolutionStream
stays the same. The first Gene of the collected final Genotype represents the
evolved program, which can be used to calculate function values from arbitrary
arguments.
final Engine<ProgramGene<Double>, Double> engine = Engine
    .builder(Main::error, program)
    .minimizing()
    .alterers(
        new SingleNodeCrossover<>(),
        new Mutator<>())
    .build();

final ProgramGene<Double> program = engine.stream()
    .limit(300)
    .collect(EvolutionResult.toBestGenotype())
    .gene();
final double result = program.eval(3.4);

For a complete GP example have a look at the examples in chapter 5.7. The code
example above also shows that the program is represented by the first gene (aka
root gene) of the ProgramChromosome. Since the ProgramGene implements the
Tree<Op<A>,ProgramGene<A>> interface, it smoothly integrates with existing
tree algorithms. Some possible program gene assignments are shown in the code
snippet below, which will compile without warnings or additional casts.
final ProgramChromosome<Double> chromosome = ...;
assert chromosome.gene() == chromosome.root();

final ProgramGene<Double> program = chromosome.root();
final Tree<Op<Double>, ?> opTree = chromosome.root();
final Tree<?, ?> tree = chromosome.root();

3.2.3 Program repair


The specialized crossover class, SingleNodeCrossover, for a TreeGene guarantees
that the program tree after the alter operation is still valid. It obeys
the tree structure of the Gene. General alterers, not written for ProgramGene
or TreeGene classes, will most likely destroy the tree property of the altered
chromosome. There are essentially two possibilities for handling invalid tree
chromosomes:
1. Marking the Chromosome as invalid. This possibility is easier to achieve,
but would also lead to a large number of invalid Chromosomes, which
must be recreated. When recreating invalid Chromosomes we will also lose
possible solutions.
2. Trying to repair the invalid Chromosome. This is the approach the Jenetics
library has chosen. The repair process reuses the operations in a
ProgramChromosome and rebuilds the tree property by using the operation arity.

Jenetics allows for the usage of arbitrary Alterer implementations, even
alterers not implemented for ProgramGenes. Genes destroyed by such
alterers are repaired.

3.2.4 Program pruning


When you are solving symbolic regression problems, the mathematical expression
trees, created during the evolution process, can become quite big. From the
diversity point of view, this might not be that bad, but it comes with additional
computation cost. With the MathRewriteAlterer you are able to simplify some
portion of the population in each generation. The rewrite alterer uses the default
TreeRewriter24 defined by the MathExpr.REWRITER field. It is also possible to
create a MathRewriteAlterer instance with your own TreeRewriter.
final Engine<ProgramGene<Double>, Double> engine = Engine
    .builder(Main::error, program)
    .minimizing()
    .alterers(
        new SingleNodeCrossover<>(),
        new Mutator<>(),
        new MathRewriteAlterer<>(0.5))
    .build();

In the example above, half of the expression trees are simplified in each generation.
If you want to prune the final result, you can do this with the MathExpr::rewrite
method, which uses the MathExpr.REWRITER tree rewriter for the rewrite task.
24 See section 3.1.2 on page 86 for a detailed description of the implemented tree rewrite system.


final ProgramGene<Double> program = engine.stream()
    .limit(3000)
    .collect(EvolutionResult.toBestGenotype())
    .gene();

final TreeNode<Op<Double>> tree = TreeNode.ofTree(program);
MathExpr.rewrite(tree);

The algorithm used for pruning the expression tree currently only uses some
basic mathematical identities, like x + 0 = x, x · 1 = x or x · 0 = 0. More
advanced simplification algorithms may be implemented in the future. The
MathExpr helper class can also be used for creating mathematical expression
trees from the usual textual representation.
final MathExpr expr = MathExpr
    .parse("5*z + 6*x + sin(y)^3 + (1 + sin(z*5)/4)/6");
final double value = expr.eval(5.5, 4, 2.3);

The variables in an expression string are sorted alphabetically. This means that
the expression is evaluated with x = 5.5, y = 4 and z = 2.3, which leads to a
result value of 44.19673085074048.

3.2.5 Multi-root programs


The examples given so far were using a single ProgramChromosome for modeling
the program. Since the Genotype is able to hold more than one Chromosome, it
is possible to create more than one program root. These programs are evaluated
concurrently.
final Codec<ISeq<Function<Double[], Double>>, ProgramGene<Double>>
    codec = Codec.of(
        Genotype.of(
            // First 'program'.
            ProgramChromosome.of(
                4, ch -> ch.root().size() <= 30,
                operations, terminals
            ),
            // Second 'program'.
            ProgramChromosome.of(
                5, ch -> ch.root().size() <= 50,
                operations, terminals
            )
        ),
        gt -> gt.stream()
            .map(Chromosome::gene)
            .collect(ISeq.toISeq())
    );

The code snippet above shows how to create a codec with two independent
program roots. These programs are then mapped, in the fitness function, to
the combined fitness value. It is also possible to use different operations and
terminals for each ProgramChromosome.
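A hedged sketch of such a combining fitness function; the sample input point and the squared-sum combination are arbitrary choices made for this illustration.
// Illustrative fitness function for the two-root codec above; the input values
// and the way the two results are combined are assumptions of this sketch.
static double fitness(final ISeq<Function<Double[], Double>> programs) {
    final Double[] input = {1.0, 2.0, 3.0};
    final double r1 = programs.get(0).apply(input);
    final double r2 = programs.get(1).apply(input);
    return r1*r1 + r2*r2;
}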

3.2.6 Symbolic regression


Symbolic regression is a specific type of regression analysis, where the search
space consists of mathematical expressions. The task is to find a model, which
fits a given data set in terms of accuracy and simplicity. In a classical approach,


you will try to optimize the parameters of a predefined function type, e. g. a


polynomial of degree n:

    f(x) = Σ_{k=0}^{n} ak x^k .

The encoding would only be a DoubleChromosome of length n + 1, where the
Gene at position k ∈ [0, ..., n] represents the coefficient ak of the polynomial. If the
type of mathematical function is not known in advance, GP can be used to find
a function which is composed of a given set of primitives.

Symbolic regression involves finding a mathematical expression, in
symbolic form, that provides a good, best, or perfect fit between a
given finite sampling of values of the independent variables and the
associated values of the dependent variables.[24]

Since symbolic regression is quite a common task in GP, Jenetics comes with
classes and interfaces supporting the implementation of such problems. These
classes are defined in the io.jenetics.prog.regression package. The following
sections describe these classes and interfaces and their usage. A complete
symbolic regression example is given in section 5.7.

3.2.6.1 Loss function


The loss function measures how good the evolved program (tree) predicts the
expected outcome or data set. If the prediction deviates too much from the
expected data, the loss function will cough up a larger number. Loss functions
are classified into two major categories, depending on the type of the learning
task—regression losses and classification losses. In the following paragraphs,
only loss functions suitable for regression problems will be described.

Mean squared error The mean squared error is the default loss function
used for regression analysis. It is also known as quadratic loss or L2 loss and is
calculated as the average of the squared differences between the predicted and
actual values.
    MSE = (1/n) Σ_{i=0}^{n−1} (yi − ỹi)² ,    (3.2.1)

where yi denotes the expected function value and ỹi the calculated (estimated)
value for data point i. The result is always positive, and the perfect value is
0. The squaring means that larger mistakes contribute disproportionately more
to the error than smaller mistakes, i.e., the model penalizes larger mistakes. The
mean squared error is the preferred loss function for regression problems.

Mean absolute error The mean absolute error, also known as L1 loss, is
calculated as the average of the absolute difference between the expected and
calculated values.
    MAE = (1/n) Σ_{i=0}^{n−1} |yi − ỹi|    (3.2.2)
This loss function is suitable for regression problems where the distribution of
the target variable may be mostly Gaussian, but may have outliers, e. g. large or

114
3.2. IO.JENETICS.PROG CHAPTER 3. MODULES

small values far from the mean value. This means that the MAE is more robust
than the MSE, which is useful if the sample data is corrupted with outliers.

The MAE is more robust to outliers, but its derivatives are not continuous,
making it less efficient to find the correct solution. The MSE is sensitive to
corrupt data, but finds more stable and closed form solutions.

The interface used for calculating the loss between calculated and expected
values is shown in listing 3.14.
@FunctionalInterface
public interface LossFunction<T> {
    double apply(T[] calculated, T[] expected);
}
Listing 3.14: LossFunction interface
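The predefined implementations referenced later in this section, LossFunction::mse and LossFunction::mae, can also be called directly; a minimal usage sketch with made-up sample arrays:
// Sketch: applying the predefined loss functions to arbitrary sample arrays.
final Double[] calculated = {1.0, 2.0, 3.0};
final Double[] expected   = {1.0, 2.5, 2.0};
final double mse = LossFunction.mse(calculated, expected);
final double mae = LossFunction.mae(calculated, expected);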

3.2.6.2 Complexity function


The complexity function measures the complexity of the evolved tree. If you have
two programs with the same loss value, you usually want the simpler program
to survive. A simple complexity measure is the number of nodes a program tree
consists of. You can obtain such a measure by the ofNodeCount(int) factory
method of the Complexity interface. The complexity measure, C (P ), is defined
as

    C(P) = 1 − √(1 − min(N(P), Nmax)² / Nmax²) ,    (3.2.3)
where N(P) is the number of nodes the program, P, consists of and Nmax the
maximal allowed number of program nodes. If the number of program nodes is
equal to or greater than the maximal node number, C(P) will return 1.
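As a quick worked example: for Nmax = 28 and a program with N(P) = 14 nodes, C(P) = 1 − √(1 − (14/28)²) = 1 − √0.75 ≈ 0.13.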

Figure 3.2.1: Node count complexity (C(P) plotted against N(P))

The graph in figure 3.2.1 shows how the program complexity increases with
the number of nodes. For the example graph the maximal node count was set to
28.


@FunctionalInterface
public interface Complexity<T> {
    double apply(Tree<? extends Op<T>, ?> program);
}
Listing 3.15: Complexity interface

Listing 3.15 shows the interface which calculates the complexity measure of a
given program tree.

3.2.6.3 Error function


The error function combines the loss function and the complexity function into
one error measure. It is used as the fitness function which the GP is minimizing.
@FunctionalInterface
public interface Error<T> {
    double apply(
        Tree<? extends Op<T>, ?> program,
        T[] calculated,
        T[] expected
    );
}
Listing 3.16: Error interface

Listing 3.16 shows the error (fitness) function used for evolving symbolic regression
problems. Instead of implementing the error function from scratch, you will
probably want to use one of the factory methods for creating it from one of the
predefined LossFunction and Complexity measures.
final Error<Double> error1 = Error.of(LossFunction::mse);
final Error<Double> error2 = Error.of(
    LossFunction::mae,
    Complexity.ofNodeCount(28)
);
final Error<Double> error3 = Error.of(
    LossFunction::mse,
    Complexity.ofNodeCount(28),
    (loss, complexity) -> loss + loss*complexity
);

The code snippet above shows three possibilities to create an error function
by using the predefined loss functions and complexity measures. error1 is created
by using the mean squared error, MSE. error2 and error3 define the same kind of
error function; the only difference is that error3 defines the loss-complexity
composition function explicitly.

3.2.6.4 Sample points


Solving regression problems requires comparing the current solution (program
tree) with a set of sample points, which represent the original function to be
approximated. The Sample interface represents such a sample point. It actually
maps an n-dimensional point of a domain, D, to a one-dimensional point of the
same domain: Dn → D.
public interface Sample<T> {
    int arity();
    T argAt(int index);
    T result();
}
Listing 3.17: Sample interface

The arity of the sample point returns the dimension, n. To make it easier to
create double sample points, some factory methods are also given in the Sample
interface.
final Sample<Double> sample1 = Sample.ofDouble(0.0, 0.0);
final Sample<Double> sample2 = Sample.ofDouble(1.0, 1.0);
final Sample<Double> sample3 = Sample.ofDouble(2.0, 2.0);

The code snippet above shows how to create three sample points for a function
f : R → R.

3.2.6.5 Regression problem


The Regression class is the only concrete type of the public API of the
regression package. It integrates the interfaces described in the last sections
into one problem definition.
public final class Regression<T>
    implements Problem<Tree<Op<T>, ?>, ProgramGene<T>, Double>
{
    ...
}

As you can see in the code snippet above, the Regression class implements the
Problem interface and can therefore be easily used for setting up an appropriate
evolution Engine. A full regression example can be found in section 5.7.
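A hedged sketch of how such a problem definition might be assembled; the codecOf factory, the tree depth and the sample values are assumptions made for this illustration.
// Sketch: assembling a Regression problem from operations, terminals, an error
// measure and sample points. The concrete values are illustrative only.
final Regression<Double> regression = Regression.of(
    Regression.codecOf(operations, terminals, 5),
    Error.of(LossFunction::mse, Complexity.ofNodeCount(28)),
    Sample.ofDouble(-1.0, -2.0),
    Sample.ofDouble(0.0, 0.0),
    Sample.ofDouble(1.0, 2.0)
);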

3.2.7 Boolean programs


The default data type for doing symbolic regression is the Double class. This is
supported by a standard set of mathematical operations, defined in the MathOp
class. Since the GP operations are not restricted to any particular type, the
boolean operations, defined in the BoolOp class, can be used for defining boolean
programs.
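A minimal sketch of a boolean program chromosome; the chosen operations (BoolOp.AND, OR, NOT), the variable names and the tree depth are illustrative assumptions.
// Sketch: a boolean program built from BoolOp operations and two boolean variables.
final ISeq<Op<Boolean>> operations = ISeq.of(BoolOp.AND, BoolOp.OR, BoolOp.NOT);
final ISeq<Op<Boolean>> terminals = ISeq.of(
    Var.of("a", 0), Var.of("b", 1), BoolOp.TRUE
);
final ProgramChromosome<Boolean> program =
    ProgramChromosome.of(4, operations, terminals);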

3.3 io.jenetics.xml
The io.jenetics.xml module allows for writing and reading Chromosomes and
Genotypes to and from XML. Since the existing JAXB marshalling is part
of the deprecated javax.xml.bind module, the io.jenetics.xml module is
now the recommended way for XML marshalling of the Jenetics classes. The XML
marshalling, implemented in this module, is based on the Java XMLStreamWriter
and XMLStreamReader classes of the java.xml module.

3.3.1 XML writer


The main entry point for writing XML files is the typed XMLWriter interface.
Listing 3.18 shows the interface of the XMLWriter.


@FunctionalInterface
public interface Writer<T> {
    void write(XMLStreamWriter xml, T data)
        throws XMLStreamException;

    static <T> Writer<T> attr(String name);
    static <T> Writer<T> attr(String name, Object value);
    static <T> Writer<T> text();

    static <T> Writer<T>
    elem(String name, Writer<? super T>... children);

    static <T> Writer<Iterable<T>>
    elems(Writer<? super T> writer);
}
Listing 3.18: XMLWriter interface

Together with the static Writer factory methods, it is possible to define arbitrary
writers through composition. There is no need for implementing the Writer
interface. A simple example will show you how to create (compose) a Writer
class for the IntegerChromosome. The created XML should look like the
example below.
<int-chromosome length="3">
    <min>-2147483648</min>
    <max>2147483647</max>
    <alleles>
        <allele>-1878762439</allele>
        <allele>-957346595</allele>
        <allele>-88668137</allele>
    </alleles>
</int-chromosome>

The following writer will create the desired XML from an integer Chromosome.
As the example shows, the structure of the XML can easily be grasped from the
XML writer definition and vice versa.
final Writer<IntegerChromosome> writer =
    elem("int-chromosome",
        attr("length").map(ch -> ch.length()),
        elem("min", Writer.<Integer>text().map(ch -> ch.min())),
        elem("max", Writer.<Integer>text().map(ch -> ch.max())),
        elem("alleles",
            elems("allele", Writer.<Integer>text())
                .map(ch -> ch.toSeq().map(g -> g.allele()))
        )
    );

3.3.2 XML reader


Reading and writing XML files uses the same concepts. For reading XML there
is an abstract Reader class, which can be easily composed. The main method
of the Reader class can be seen in listing 3.19.
public abstract class Reader<T> {
    public abstract T read(final XMLStreamReader xml)
        throws XMLStreamException;
}
Listing 3.19: XMLReader class


When creating an XMLReader, the structure of the XML must be defined in a
similar way as for the XMLWriter. Additionally, a factory function, which will
create the desired object from the extracted XML data, is needed. A Reader
which will read the XML representation of an IntegerChromosome can be seen
in the code snippet below.
final Reader<IntegerChromosome> reader =
    elem(
        (Object[] v) -> {
            final int length = (int)v[0];
            final int min = (int)v[1];
            final int max = (int)v[2];
            final List<Integer> alleles = (List<Integer>)v[3];
            assert alleles.size() == length;
            return IntegerChromosome.of(
                alleles.stream()
                    .map(value -> IntegerGene.of(value, min, max))
                    .toArray(IntegerGene[]::new)
            );
        },
        "int-chromosome",
        attr("length").map(Integer::parseInt),
        elem("min", text().map(Integer::parseInt)),
        elem("max", text().map(Integer::parseInt)),
        elem("alleles",
            elems(elem("allele", text().map(Integer::parseInt)))
        )
    );

3.3.3 Marshalling performance


Another important aspect when doing marshalling is the space needed for the
marshaled objects and the time needed for doing the marshalling. For the
performance tests a genotype with a varying chromosome count is used. The
used genotype template can be seen in the code snippet below.
final Genotype<DoubleGene> genotype = Genotype.of(
    DoubleChromosome.of(0.0, 1.0, 100),
    chromosomeCount
);

Table 3.3.1 shows the required space of the marshaled genotypes for different
marshalling methods: (a) Java serialization, (b) JAXB25 serialization and (c)
XMLWriter.

Chromosome count Java serialization JAXB XML writer


1 0.0017 MiB 0.0045 MiB 0.0035 MiB
10 0.0090 MiB 0.0439 MiB 0.0346 MiB
100 0.0812 MiB 0.4379 MiB 0.3459 MiB
1000 0.8039 MiB 4.3772 MiB 3.4578 MiB
10000 8.0309 MiB 43.7730 MiB 34.5795 MiB
100000 80.3003 MiB 437.7283 MiB 345.7940 MiB

Table 3.3.1: Marshaled object size


25 The JAXB marshalling has been removed in version 4.0. It is still part of the table for comparison with the new XML marshalling.


Using the Java serialization will create the smallest files and the XMLWriter
of the io.jenetics.xml module will create files roughly 75% the size of the
JAXB serialized genotypes. The size of the marshaled objects also influences
the write performance. As you can see in diagram 3.3.1 the Java serialization
is the fastest marshalling method, followed by the JAXB marshalling. The
XMLWriter is the slowest one, but still comparable to the JAXB method.

Figure 3.3.1: Genotype write performance (marshalling time [µs] versus chromosome count for JAXB, Java serialization and the XML writer)

For reading the serialized genotypes, we will see similar results (see diagram
3.3.2). Reading Java serialized genotypes has the best read performance, followed
by JAXB and the XML Reader. This time the difference between JAXB and
the XML Reader is hardly visible.

3.4 io.jenetics.prngine
The prngine26 module contains pseudo-random number generators for sequential
and parallel Monte Carlo simulations27. It has been designed to work
smoothly with the Jenetics GA library, but it has no dependency on it. All
PRNG implementations of this library implement the Java RandomGenerator
interface, which makes them easily usable in other projects.
26 This module is not part of the Jenetics project directly. Since it has no dependency

on any of the Jenetics modules, it has been extracted to a separate GitHub repository
(https://github.com/jenetics/prngine) with an independent versioning.
27 https://de.wikipedia.org/wiki/Monte-Carlo-Simulation
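Since every prngine generator implements the java.util.random.RandomGenerator interface and is default-constructible, it can be used wherever a standard Java generator is expected. A minimal sketch, assuming the prngine JAR is on the classpath (the chosen generator class is arbitrary):

import java.util.random.RandomGenerator;

import io.jenetics.prngine.LCG64ShiftRandom;

public class PrngineUsage {
    public static void main(final String[] args) {
        // All prngine generators are default-constructible.
        final RandomGenerator random = new LCG64ShiftRandom();

        // Use it like any other RandomGenerator implementation.
        System.out.println(random.nextLong());
        System.out.println(random.nextDouble());
    }
}

Such a generator can also be plugged into Jenetics via the RandomRegistry described in section 1.4.2.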


[Figure: log-log plot of the marshalling (read) time in µs versus the chromosome count, for JAXB, Java serialization and the XML reader.]
Figure 3.3.2: Genotype read performance

The pseudo random number generators of the io.jenetics.prngine module are not cryptographically strong PRNGs.

The io.jenetics.prngine module consists of the following PRNG implementations:
KISS32Random Implementation of a simple PRNG, as proposed in Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications (JKISS32, page 3) by David Jones, UCL Bioinformatics Group.[22] The period of this PRNG is ≈ 2.6·10^36.

KISS64Random Implementation of a simple PRNG, as proposed in Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications (JKISS64, page 10) by David Jones, UCL Bioinformatics Group.[22] The PRNG has a period of ≈ 1.8·10^75.

LCG64ShiftRandom This class implements a linear congruential PRNG with an additional bit-shift transition. It is a port of the trng::lcg64_shift PRNG class of the TRNG library created by Heiko Bauke.28

MT19937_32Random This is a 32-bit version of the Mersenne Twister pseudo random number generator.29

28 https://github.com/jenetics/trng4
29 https://en.wikipedia.org/wiki/Mersenne_Twister

MT19937_64Random This is a 64-bit version of the Mersenne Twister pseudo random number generator.

XOR32ShiftRandom This generator was discovered and characterized by George Marsaglia [Xorshift RNGs]. In just three XORs and three shifts (generally fast operations) it produces a full period of 2^32 − 1 on 32 bits. (The missing value is zero, which perpetuates itself and must be avoided.)30

XOR64ShiftRandom This generator was discovered and characterized by George Marsaglia [Xorshift RNGs]. In just three XORs and three shifts (generally fast operations) it produces a full period of 2^64 − 1 on 64 bits. (The missing value is zero, which perpetuates itself and must be avoided.)
All implemented PRNGs have been tested with the dieharder test suite. Table
3.4.1 shows the statistical performance of the implemented PRNGs, including
the Java RandomGenerator, L64X256MixRandom, which is the default generator
used by the library.

PRNG Passed Weak Failed


KISS32Random 113 1 0
KISS64Random 109 5 0
LCG64ShiftRandom 111 3 0
MT19937_32Random 111 3 0
MT19937_64Random 108 6 0
XOR32ShiftRandom 103 6 5
XOR64ShiftRandom 113 1 0
L64X256MixRandom 111 3 0

Table 3.4.1: Dieharder results

The second important performance measure for a PRNG is the number of random numbers it is able to create per second.31 Table 3.4.2 shows the creation speed for all implemented generators; a rough way of measuring this yourself is sketched after the table.

PRNG 10^6 int/s 10^6 float/s 10^6 long/s 10^6 double/s


KISS32Random 189 143 129 108
KISS64Random 128 124 115 124
LCG64ShiftRandom 258 185 261 191
MT19937_32Random 140 115 92 82
MT19937_64Random 148 120 148 120
XOR32ShiftRandom 227 161 140 120
XOR64ShiftRandom 225 166 235 166
L64X256MixRandom 162 136 121 166

Table 3.4.2: PRNG speed
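The numbers in table 3.4.2 were measured with the JMH micro-benchmark library (see footnote 31). A much cruder way to get a feeling for the throughput of a single generator is a simple timing loop; the sketch below is only an illustration and will not give JMH-quality results.

import java.util.random.RandomGenerator;

import io.jenetics.prngine.LCG64ShiftRandom;

public class PrngSpeed {
    public static void main(final String[] args) {
        final RandomGenerator random = new LCG64ShiftRandom();
        final int n = 100_000_000;

        long sink = 0; // Consume the values so the loop isn't optimized away.
        final long start = System.nanoTime();
        for (int i = 0; i < n; ++i) {
            sink += random.nextInt();
        }
        final double seconds = (System.nanoTime() - start)/1e9;

        System.out.printf(
            "%.0f million ints/s (checksum: %d)%n",
            n/seconds/1e6, sink
        );
    }
}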

30 http://digitalcommons.wayne.edu/jmasm/vol2/iss1/2/
31 Measured on an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz with Java(TM) SE Runtime Environment (build 1.8.0_102-b14), Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode), using the JMH micro-benchmark library.

Appendix
Chapter 4

Internals

This section contains internal implementation details which don't fit into one of the previous sections. They are not essential for using the library, but give the user a deeper insight into some design decisions made when implementing the library. It also introduces tools and classes which were developed for testing purposes. These classes are not exported and are not part of the official API.

4.1 PRNG testing


Jenetics uses the dieharder1 (command line) tool for testing the randomness of the used PRNGs. dieharder is a random number generator (RNG) testing suite. It is intended to test generators, not files of possibly random numbers. Since dieharder needs a huge amount of random data for testing the quality of an RNG, it is usually advisable to pipe the random numbers to the dieharder process:

$ cat /dev/urandom | dieharder -g 200 -a

The example above demonstrates how to stream a raw binary stream of bits to the stdin (raw) interface of dieharder. With the DieHarder class, which is part of the io.jenetics.prngine.internal package, it is easily possible to test PRNGs implementing the java.util.random.RandomGenerator interface. The only requirement is that the PRNG must be default-constructible and part of the classpath.
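Conceptually, such a test bridge only has to write the raw bytes of a default-constructed generator to standard output, so that they can be piped into dieharder -g 200 -a like the /dev/urandom stream above. The following sketch is not the actual DieHarder implementation, merely an illustration of the idea; the class name is made up.

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.random.RandomGenerator;

// Illustration only: stream raw random bytes to stdout for dieharder.
public class RawRandomStream {
    public static void main(final String[] args) throws IOException {
        // The real tool selects the generator class by name; here it is
        // simply hard-coded to the JDK default generator.
        final RandomGenerator random = RandomGenerator.of("L64X256MixRandom");

        try (var out = new DataOutputStream(new BufferedOutputStream(System.out))) {
            while (true) {
                out.writeLong(random.nextLong());
            }
        }
    }
}

The DieHarder helper invoked below does essentially the same, selecting the generator by the name given on the command line.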

$ java -cp prngine-2.0.0.jar \


io.jenetics.prngine.internal.DieHarder \
<random-engine-name> -a

Calling the command above will create an instance of the given random engine and stream the random data (bytes) to the raw interface of the dieharder process.
#=============================================================================#
#             Testing: L64X256MixRandom (2022-02-18 19:11)                    #
#=============================================================================#
#=============================================================================#
#                       Mac OS X 10.15.7 (x86_64)                             #

1 From Robert G. Brown:http://www.phy.duke.edu/~rgb/General/dieharder.php


#                            java version "17"                               #
#            Java(TM) SE Runtime Environment (build 17+35-LTS-2724)          #
#            Java HotSpot(TM) 64-Bit Server VM (build 17+35-LTS-2724)        #
#=============================================================================#
#=============================================================================#
#        dieharder version 3.31.1 Copyright 2003 Robert G. Brown             #
#=============================================================================#
   rng_name    |           rands/second|   Seed   |
stdin_input_raw|              2.39e+07 | 340119430|
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.91974692|  PASSED
      diehard_operm5|   0|   1000000|     100|0.59856940|  PASSED
  diehard_rank_32x32|   0|     40000|     100|0.99990675|  PASSED
    diehard_rank_6x8|   0|    100000|     100|0.46844654|  PASSED
...
Preparing to run test 209.  ntuple = 0
        dab_monobit2|  12|  65000000|       1|0.76563780|  PASSED
#=============================================================================#
# Summary: PASSED=111, WEAK=3, FAILED=0                                       #
# 235.031,383 MB of random data created with 79,891 MB/sec                    #
#=============================================================================#
#=============================================================================#
# Runtime: 0:49:01                                                            #
#=============================================================================#

In the listing above, a part of the created dieharder report is shown. For test-
ing the LCG64ShiftRandom class, which is part of the io.jenetics.prngine
module, the following command can be called:

$ java -cp prngine-2.0.0.jar \


io.jenetics.prngine.internal.DieHarder \
LCG64ShiftRandom -a

Table 4.1.1 shows the summary of the dieharder tests. The full report is part
of the source file of the LCG64ShiftRandom class.2

Passed tests Weak tests Failed tests


111 3 0

Table 4.1.1: LCG64ShiftRandom quality

4.2 Random seeding


The PRNGs3 used by the Jenetics library need to be initialized with a proper seed value before they can be used. The usual way of doing this is to take the current time stamp.
public static long nanoSeed() {
    return System.nanoTime();
}

Before applying this method throughout the whole library, I decided to perform some statistical tests. For this purpose I treated the seed method itself as a PRNG and analyzed the created long values with the DieHarder class.
2 https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/

jenetics/prngine/LCG64ShiftRandom.java
3 See section 1.4.2 on page 34.


The nanoSeed method has been wrapped into the io.jenetics.prngine.internal.NanoTimeRandom class. Assuming that the dieharder tool is in the search path, calling

$ java -cp prngine-2.0.0.jar \


io.jenetics.prngine.internal.DieHarder \
NanoTimeRandom -a

will perform the statistical tests for the nano time random engine. The statistical
quality is rather bad: every single test failed. Table 4.2.1 shows the summary of
the dieharder report.4

Passed tests Weak tests Failed tests


0 0 114

Table 4.2.1: Nano time seeding quality

An alternative source of entropy for generating seed values would be the /dev/random or /dev/urandom file. But this approach is not portable, and portability is a prerequisite for the Jenetics library.
The next attempt fetches the seeds from the JVM via the Object::hashCode method, since the hash code of an Object is available on every operating system and is most likely »randomly« distributed.
public static long objectSeed() {
    final long a = new Object().hashCode();
    final long b = new Object().hashCode();
    return mixStafford13(a << 32 | b);
}

private static long mixStafford13(final long z) {
    long v = (z^(z >>> 30))*0xbf58476d1ce4e5b9L;
    v = (v^(v >>> 27))*0x94d049bb133111ebL;
    return v^(v >>> 31);
}

This seed method has been wrapped into the ObjectHashRandom class and
tested as well with

$ java -cp prngine-2.0.0.jar \


io.jenetics.prngine.internal.DieHarder \
ObjectHashRandom -a

Table 4.2.2 shows the summary of the dieharder report5, which is already excellent.
After additional experimentation, a combination of the nano time seed and the object hash seeding seems to be the right solution. The rationale behind this is that the PRNG seed shouldn't rely on a single source of entropy.
4 The detailed test report can be found in the source of the NanoTimeRandom

class. https://github.com/jenetics/prngine/blob/master/prngine/src/main/java/io/
jenetics/prngine/internal/NanoTimeRandom.java
5 Full report: https://github.com/jenetics/prngine/blob/master/prngine/src/main/

java/io/jenetics/prngine/internal/ObjectHashRandom.java

126
4.2. RANDOM SEEDING CHAPTER 4. INTERNALS

Passed tests Weak tests Failed tests


113 1 0

Table 4.2.2: Object hash seeding quality

public static long seed() {
    final long a = mixStafford13(System.currentTimeMillis());
    final long b = mixStafford13(System.nanoTime());
    return seed(mix(a, b));
}

private static long seed(final long base) {
    return mix(base, objectSeed());
}

private static long mix(final long a, final long b) {
    long c = a^b;
    c ^= c << 17;
    c ^= c >>> 31;
    c ^= c << 8;
    return c;
}
Listing 4.1: Random seeding

The code in listing 4.1 shows how the nano time seed is mixed with the object seed. The mix method was inspired by the mixing step of the lcg64_shift6 random engine, which has been reimplemented in the LCG64ShiftRandom class.
Running the tests with

$ java -cp prngine-2.0.0.jar \


io.jenetics.prngine.internal.DieHarder \
SeedRandom -a

leads to the statistics summary7 , which is shown in table 4.2.3.

Passed tests Weak tests Failed tests


110 4 0

Table 4.2.3: Combined random seeding quality

The statistical performance of this seeding is, according to the dieharder test suite, better than that of some of the real random engines, including the default Java Random engine. Using the proposed seed method is in any case preferable to the simple System.nanoTime() call.
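As a final illustration of the idea, and explicitly not the library code, the following self-contained snippet combines the two entropy sources in a crude way and uses the result to seed a java.util.Random instance; for real use, prefer the mixing shown in listing 4.1.

import java.util.Random;

// Simplified, inline variant of the seeding idea from listing 4.1.
public class SeedingDemo {
    public static void main(final String[] args) {
        final long a = System.nanoTime();
        final long b = new Object().hashCode();
        // Crude mix of the two sources, for illustration only.
        final long seed = a ^ Long.rotateLeft(b, 31);

        final Random random = new Random(seed);
        System.out.println(random.nextInt());
    }
}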

Open questions
• How does this method perform on operating systems other than Linux?
• How does this method perform on other JVM implementations?
6 This class is part of the TRNG library: https://github.com/rabauke/trng4/blob/

master/src/lcg64_shift.hpp
7 Full report: https://github.com/jenetics/prngine/blob/master/prngine/src/main/

java/io/jenetics/prngine/internal/SeedRandom.java

Chapter 5

Examples

This section contains some coding examples which should give you a feeling of how to use the Jenetics library. The given examples are complete, in the sense that they will compile, run and produce the shown example output. The examples delivered with the Jenetics library can be run with the run-examples.sh script.

$ ./jenetics.example/src/main/scripts/run-examples.sh

Since the script uses JARs located in the build directory, you have to build them with the jar Gradle target first; see section 6.

5.1 Ones counting


Ones counting is one of the simplest model problems. It uses a binary chromosome and forms a classic genetic algorithm1. The fitness of a Genotype is proportional to the number of ones.
import static io.jenetics.engine.EvolutionResult.toBestPhenotype;
import static io.jenetics.engine.Limits.bySteadyFitness;

import io.jenetics.BitChromosome;
import io.jenetics.BitGene;
import io.jenetics.Genotype;
import io.jenetics.Mutator;
import io.jenetics.Phenotype;
import io.jenetics.RouletteWheelSelector;
import io.jenetics.SinglePointCrossover;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionStatistics;

public class OnesCounting {

    // This method calculates the fitness for a given genotype.
    private static Integer count(final Genotype<BitGene> gt) {
        return gt.chromosome()
            .as(BitChromosome.class)
            .bitCount();
    }

    public static void main(String[] args) {
        // Configure and build the evolution engine.
        final Engine<BitGene, Integer> engine = Engine
            .builder(
                OnesCounting::count,
                BitChromosome.of(20, 0.15))
            .populationSize(500)
            .selector(new RouletteWheelSelector<>())
            .alterers(
                new Mutator<>(0.55),
                new SinglePointCrossover<>(0.06))
            .build();

        // Create evolution statistics consumer.
        final EvolutionStatistics<Integer, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<BitGene, Integer> best = engine.stream()
            // Truncate the evolution stream after 7 "steady"
            // generations.
            .limit(bySteadyFitness(7))
            // The evolution will stop after maximal 100
            // generations.
            .limit(100)
            // Update the evaluation statistics after
            // each generation.
            .peek(statistics)
            // Collect (reduce) the evolution stream to
            // its best phenotype.
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}

1 In the classic genetic algorithm the problem is a maximization problem and the fitness function is positive. The domain of the fitness function is a bit-chromosome.

The Genotype in this example consists of one BitChromosome with a ones probability of 0.15. The altering of the offspring population is performed by mutation, with a mutation probability of 0.55, and then by a single-point crossover, with a crossover probability of 0.06. The evolution stream is truncated after at most 100 generations. The roulette-wheel selector is used for both the offspring and the survivor selection, since it is set with the builder's selector method; without that call, the default tournament selector would be used.2
1 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
2 | Time statistics |
3 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
4 | Selection : sum =0 .0 1 65 80 1 44 00 0 s ; mean = 0 .0 0 13 81 6 78 6 67 s |
5 | Altering : sum = 0. 0 96 9 04 15 9 00 0 s ; mean =0 .0 0 80 75 34 6 58 3 s |
6 | Fitness calculation : sum = 0. 02 28 9 43 18 0 00 s ; mean =0 . 00 19 0 78 59 8 33 s |
7 | Overall execution : sum =0 .1 36 5 75 32 3 00 0 s ; mean = 0 .0 11 3 81 27 6 91 7 s |
8 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
9 | Evolution statistics |
10 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
11 | Generations : 12 |
12 | Altered : sum =40 ,487; mean = 3 37 3 .9 1 66 6 6 66 7 |
13 | Killed : sum =0; mean =0.000000000 |
14 | Invalids : sum =0; mean =0.000000000 |
15 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
16 | Population statistics |
17 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+

2 For the other default values (population size, maximal age, ...) have a look at the Javadoc:

https://jenetics.io/javadoc/jenetics/7.1/index.html


18 | Age : max =9; mean =0.808667; var =1.446299 |


19 | Fitness : |
20 | min = 1.0000 00000000 |
21 | max = 1 8. 00 0 00 00 00 0 00 |
22 | mean = 10 . 05 08 33 3 33 33 3 |
23 | var = 7.839 555898 205 |
24 | std = 2.7 9992 0694 985 |
25 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
26 [ 0 0 0 0 1 1 0 1 | 1 1 1 1 0 1 1 1 | 1 1 1 1 1 1 1 1 ] --> 18

The given example will print the overall timing statistics onto the console. In the Evolution statistics section you can see that it took 12 generations to fulfill the termination criterion: finding no better result after 7 consecutive generations.

5.2 Real function


In this example we try to find the minimum value of the function

f(x) = cos(1/2 + sin(x)) · cos(x)    (5.2.1)

[Figure: plot of f(x) for x ∈ [0, 2π]; the function values range between −1 and 1.]
Figure 5.2.1: Real function

The graph of function 5.2.1, in the range of [0, 2π], is shown in figure 5.2.1
and the listing beneath shows the GA implementation which will minimize the
function.
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static java.lang.Math.sin;
import static io.jenetics.engine.EvolutionResult.toBestPhenotype;
import static io.jenetics.engine.Limits.bySteadyFitness;

import io.jenetics.DoubleGene;
import io.jenetics.MeanAlterer;
import io.jenetics.Mutator;
import io.jenetics.Optimize;
import io.jenetics.Phenotype;
import io.jenetics.engine.Codecs;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionStatistics;
import io.jenetics.util.DoubleRange;

public class RealFunction {

    // The fitness function.
    private static double fitness(final double x) {
        return cos(0.5 + sin(x))*cos(x);
    }

    public static void main(final String[] args) {
        final Engine<DoubleGene, Double> engine = Engine
            // Create a new builder with the given fitness
            // function and chromosome.
            .builder(
                RealFunction::fitness,
                Codecs.ofScalar(DoubleRange.of(0.0, 2.0*PI)))
            .populationSize(500)
            .optimize(Optimize.MINIMUM)
            .alterers(
                new Mutator<>(0.03),
                new MeanAlterer<>(0.6))
            // Build an evolution engine with the
            // defined parameters.
            .build();

        // Create evolution statistics consumer.
        final EvolutionStatistics<Double, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<DoubleGene, Double> best = engine.stream()
            // Truncate the evolution stream after 7 "steady"
            // generations.
            .limit(bySteadyFitness(7))
            // The evolution will stop after maximal 100
            // generations.
            .limit(100)
            // Update the evaluation statistics after
            // each generation.
            .peek(statistics)
            // Collect (reduce) the evolution stream to
            // its best phenotype.
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}

The GA works with 1 × 1 DoubleChromosomes whose values are restricted


to the range [0, 2π].
1 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
2 | Time statistics |
3 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
4 | Selection : sum =0 . 06 44 0 64 5 60 00 s ; mean = 0. 0 03 06 6 9 74 09 5 s |
5 | Altering : sum = 0. 07 01 58 3 82 00 0 s ; mean = 0 .0 03 34 0 87 53 33 s |
6 | Fitness calculation : sum = 0. 05 04 5 26 47 0 00 s ; mean =0 .0 0 24 02 50 7 00 0 s |
7 | Overall execution : sum =0 .1 6 98 35 1 54 00 0 s ; mean =0 .0 0 80 87 38 8 28 6 s |
8 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
9 | Evolution statistics |
10 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
11 | Generations : 21 |
12 | Altered : sum =3 ,897; mean =185.5 71428571 |
13 | Killed : sum =0; mean =0.000000000 |
14 | Invalids : sum =0; mean =0.000000000 |


15 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
16 | Population statistics |
17 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
18 | Age : max =9; mean =1.104381; var =1.962625 |
19 | Fitness : |
20 | min = -0.938171897696 |
21 | max = 0.93 63101 25279 |
22 | mean = -0.897856583665 |
23 | var = 0.0272 462748 38 |
24 | std = 0.16 50644 5661 7 |
25 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
26 [ [ [ 3 . 3 8 9 1 2 5 7 8 2 6 5 7 3 1 4 ] ] ] --> -0.9381718976956661

The GA generates a console output like the one above. The exact location of the minimum, for the given range, is x = 3.3891257828907939... You can also see that the evolution stream was truncated after 21 generations, once no better result had been found for 7 consecutive generations.

5.3 Rastrigin function


The Rastrigin function3 is often used to test the optimization performance of genetic algorithms.
f(x) = A·n + ∑_{i=1}^{n} (x_i^2 − A·cos(2π·x_i))    (5.3.1)

As the plot in figure 5.3.1 shows, the Rastrigin function has many local minima, which makes it difficult for standard, gradient-based methods to find the global minimum. If A = 10 and x_i ∈ [−5.12, 5.12], the function has only one global minimum, at x = 0 with f(x) = 0.
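A quick check of this claim: inserting x_i = 0 into equation 5.3.1 gives

    f(0) = A·n + ∑_{i=1}^{n} (0 − A·cos 0) = A·n − n·A = 0,

and since every summand x_i^2 − A·cos(2π·x_i) is at least −A, no smaller function value is possible.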

Figure 5.3.1: Rastrigin function


3 https://en.wikipedia.org/wiki/Rastrigin_function

132
5.3. RASTRIGIN FUNCTION CHAPTER 5. EXAMPLES

The following listing shows the Engine setup for solving the Rastrigin function, which is very similar to the setup for the real function in section 5.2. Besides the different fitness function, the Codec for double vectors is used instead of the double scalar Codec.
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static io.jenetics.engine.EvolutionResult.toBestPhenotype;
import static io.jenetics.engine.Limits.bySteadyFitness;

import io.jenetics.DoubleGene;
import io.jenetics.MeanAlterer;
import io.jenetics.Mutator;
import io.jenetics.Optimize;
import io.jenetics.Phenotype;
import io.jenetics.engine.Codecs;
import io.jenetics.engine.Engine;
import io.jenetics.engine.EvolutionStatistics;
import io.jenetics.util.DoubleRange;

public class RastriginFunction {
    private static final double A = 10;
    private static final double R = 5.12;
    private static final int N = 2;

    private static double fitness(final double[] x) {
        double value = A*N;
        for (int i = 0; i < N; ++i) {
            value += x[i]*x[i] - A*cos(2.0*PI*x[i]);
        }

        return value;
    }

    public static void main(final String[] args) {
        final Engine<DoubleGene, Double> engine = Engine
            .builder(
                RastriginFunction::fitness,
                // Codec for 'x' vector.
                Codecs.ofVector(DoubleRange.of(-R, R), N))
            .populationSize(500)
            .optimize(Optimize.MINIMUM)
            .alterers(
                new Mutator<>(0.03),
                new MeanAlterer<>(0.6))
            .build();

        final EvolutionStatistics<Double, ?>
            statistics = EvolutionStatistics.ofNumber();

        final Phenotype<DoubleGene, Double> best = engine.stream()
            .limit(bySteadyFitness(7))
            .peek(statistics)
            .collect(toBestPhenotype());

        System.out.println(statistics);
        System.out.println(best);
    }
}

The console output of the program shows that Jenetics finds the optimal solution after 38 generations.


1 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
2 | Time statistics |
3 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
4 | Selection : sum =0 .2 0 91 85 1 34 00 0 s ; mean = 0 .0 05 5 04 87 19 4 7 s |
5 | Altering : sum = 0. 29 5 10 20 44 0 00 s ; mean =0 . 00 77 6 58 43 2 63 s |
6 | Fitness calculation : sum = 0. 17 6 87 9 93 7 00 0 s ; mean = 0. 0 04 65 4 73 51 8 4 s |
7 | Overall execution : sum =0 .6 6 45 1 72 56 0 00 s ; mean = 0. 01 7 48 72 9 62 11 s |
8 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
9 | Evolution statistics |
10 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
11 | Generations : 38 |
12 | Altered : sum =7 ,549; mean =198 .6578 94737 |
13 | Killed : sum =0; mean =0.000000000 |
14 | Invalids : sum =0; mean =0.000000000 |
15 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
16 | Population statistics |
17 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
18 | Age : max =8; mean =1.100211; var =1.814053 |
19 | Fitness : |
20 | min = 0.0000 00000000 |
21 | max = 6 3 .6 72 6 04 0 47 47 5 |
22 | mean = 3 .4841574 52128 |
23 | var = 7 1. 0 47 47 51 3 90 18 |
24 | std = 8.42 8966 4336 16 |
25 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
26 [[[ -1.3226168588424143 E -9] ,[ -1.096964971404292 E -9]]] --> 0.0

5.4 0/1 Knapsack


In the Knapsack problem4 a set of items, each with a size and a value, is given. The task is to select a subset of the items whose total size does not exceed the knapsack size and whose total value is as large as possible. For solving the 0/1 knapsack problem we define a BitChromosome with one bit for each item. If the i-th bit is set to one, the i-th item is selected.
1 import s t a t i c i o . j e n e t i c s . e n g i n e . E v o l u t i o n R e s u l t . t o B e s t P h e n o t y p e ;
2 import s t a t i c i o . j e n e t i c s . e n g i n e . L i m i t s . b y S t e a d y F i t n e s s ;
3
4 import j a v a . u t i l . f u n c t i o n . F u n c t i o n ;
5 import j a v a . u t i l . stream . C o l l e c t o r ;
6 import j a v a . u t i l . stream . Stream ;
7
8 import io . jenetics . BitGene ;
9 import io . jenetics . Mutator ;
10 import io . jenetics . Phenotype ;
11 import io . jenetics . RouletteWheelSelector ;
12 import io . jenetics . SinglePointCrossover ;
13 import io . jenetics . TournamentSelector ;
14 import io . jenetics . e n g i n e . Codec ;
15 import io . jenetics . e n g i n e . Codecs ;
16 import io . jenetics . e n g i n e . Engine ;
17 import io . jenetics . engine . E v o l u t i o n S t a t i s t i c s ;
18 import io . jenetics . u t i l . ISeq ;
19 import io . jenetics . u t i l . RandomRegistry ;
20
21 // The main c l a s s .
22 public c l a s s Knapsack {
23
24 // This c l a s s r e p r e s e n t s a knapsack item , with a s p e c i f i c
25 // " s i z e " and " v a l u e " .
26 f i n a l s t a t i c c l a s s Item {
27 public f i n a l double s i z e ;
4 https://en.wikipedia.org/wiki/Knapsack_problem


28 public f i n a l double v a l u e ;
29
30 Item ( f i n a l double s i z e , f i n a l double v a l u e ) {
31 this . s i z e = s i z e ;
32 this . value = value ;
33 }
34
35 // C r e a t e a new random knapsack item .
36 s t a t i c Item random ( ) {
37 f i n a l var r = RandomRegistry . random ( ) ;
38 return new Item (
39 r . nextDouble ( ) ∗ 1 0 0 ,
40 r . nextDouble ( ) ∗100
41 );
42 }
43
44 // C o l l e c t o r f o r summing up t h e knapsack i t e m s .
45 s t a t i c C o l l e c t o r <Item , ? , Item> toSum ( ) {
46 return C o l l e c t o r . o f (
47 ( ) −> new double [ 2 ] ,
48 ( a , b ) −> { a [ 0 ] += b . s i z e ; a [ 1 ] += b . v a l u e ; } ,
49 ( a , b ) −> { a [ 0 ] += b [ 0 ] ; a [ 1 ] += b [ 1 ] ; return a ; } ,
50 r −> new Item ( r [ 0 ] , r [ 1 ] )
51 );
52 }
53 }
54
55 // C r e a t i n g t h e f i t n e s s f u n c t i o n .
56 s t a t i c Function<ISeq<Item >, Double>
57 f i t n e s s ( f i n a l double s i z e ) {
58 return i t e m s −> {
59 f i n a l Item sum = i t e m s . stream ( ) . c o l l e c t ( Item . toSum ( ) ) ;
60 return sum . s i z e <= s i z e ? sum . v a l u e : 0 ;
61 };
62 }
63
64 public s t a t i c void main ( f i n a l S t r i n g [ ] a r g s ) {
65 f i n a l int nitems = 1 5 ;
66 f i n a l double k s s i z e = n i t e m s ∗ 1 0 0 . 0 / 3 . 0 ;
67
68 f i n a l ISeq<Item> i t e m s =
69 Stream . g e n e r a t e ( Item : : random )
70 . l i m i t ( nitems )
71 . c o l l e c t ( ISeq . toISeq ( ) ) ;
72
73 // D e f i n i n g t h e c o d e c .
74 f i n a l Codec<ISeq<Item >, BitGene> c o d e c =
75 Codecs . o f S u b S e t ( i t e m s ) ;
76
77 // C o n f i g u r e and b u i l d t h e e v o l u t i o n e n g i n e .
78 f i n a l Engine<BitGene , Double> e n g i n e = Engine
79 . b u i l d e r ( f i t n e s s ( k s s i z e ) , codec )
80 . populationSize (500)
81 . s u r v i v o r s S e l e c t o r (new T o u r n a m e n t S e l e c t o r <>(5) )
82 . o f f s p r i n g S e l e c t o r (new R o u l e t t e W h e e l S e l e c t o r <>() )
83 . alterers (
84 new Mutator < >(0.115) ,
85 new S i n g l e P o i n t C r o s s o v e r < >(0.16) )
86 . build () ;
87
88 // C r e a t e e v o l u t i o n s t a t i s t i c s consumer .
89 f i n a l E v o l u t i o n S t a t i s t i c s <Double , ?>


90 s t a t i s t i c s = E v o l u t i o n S t a t i s t i c s . ofNumber ( ) ;
91
92 f i n a l Phenotype<BitGene , Double> b e s t = e n g i n e . stream ( )
93 // Truncate t h e e v o l u t i o n stream a f t e r 7 " s t e a d y "
94 // g e n e r a t i o n s .
95 . l i m i t ( bySteadyFitness (7) )
96 // The e v o l u t i o n w i l l s t o p a f t e r maximal 100
97 // g e n e r a t i o n s .
98 . l i m i t (100)
99 // Update t h e e v a l u a t i o n s t a t i s t i c s a f t e r
100 // each g e n e r a t i o n
101 . peek ( s t a t i s t i c s )
102 // C o l l e c t ( r e d u c e ) t h e e v o l u t i o n stream t o
103 // i t s b e s t phenotype .
104 . c o l l e c t ( toBestPhenotype ( ) ) ;
105
106 f i n a l ISeq<Item> knapsack = c o d e c . d ec od e ( b e s t . g e n o t y p e ( ) ) ;
107
108 System . out . p r i n t l n ( s t a t i s t i c s ) ;
109 System . out . p r i n t l n ( b e s t ) ;
110 System . out . p r i n t l n ( " \n\n " ) ;
111 System . out . p r i n t f (
112 " Genotype o f b e s t item : %s%n " ,
113 best . genotype ( )
114 );
115
116 f i n a l double f i l l S i z e = knapsack . stream ( )
117 . mapToDouble ( i t −> i t . s i z e )
118 . sum ( ) ;
119
120 System . out . p r i n t f ( " %.2 f%% f i l l e d .%n " , 100∗ f i l l S i z e / k s s i z e ) ;
121 }
122 }

The console output for the Knapsack GA will look like the listing beneath.
1 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
2 | Time statistics |
3 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
4 | Selection : sum =0 .0 4 44 6 59 78 0 00 s ; mean =0 .0 0 55 58 24 7 25 0 s |
5 | Altering : sum = 0. 06 7 38 52 11 0 00 s ; mean =0 .0 0 84 23 15 1 37 5 s |
6 | Fitness calculation : sum = 0. 03 72 0 81 89 0 00 s ; mean =0 . 00 46 5 10 23 6 25 s |
7 | Overall execution : sum =0 .1 2 64 6 85 39 0 00 s ; mean =0 . 01 58 0 85 67 3 75 s |
8 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
9 | Evolution statistics |
10 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
11 | Generations : 8 |
12 | Altered : sum =4 ,842; mean = 605.25 0000000 |
13 | Killed : sum =0; mean =0.000000000 |
14 | Invalids : sum =0; mean =0.000000000 |
15 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
16 | Population statistics |
17 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
18 | Age : max =7; mean =1.387500; var =2.780039 |
19 | Fitness : |
20 | min = 0.0000 00000000 |
21 | max = 5 4 2 . 3 6 3 2 3 5 9 9 9 3 4 2 |
22 | mean = 4 3 6 . 0 9 8 2 4 8 6 2 8 6 6 1 |
23 | var = 1 1 4 3 1 . 8 0 1 2 9 1 8 1 2 3 9 0 |
24 | std = 1 0 6 . 9 1 9 6 0 1 9 9 9 8 7 8 |
25 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
26 [ 0 1 1 1 1 0 1 1 | 1 0 1 1 1 1 0 1 ] --> 5 4 2 . 3 6 3 2 3 5 9 9 9 3 4 1 7


5.5 Traveling salesman


The Traveling Salesman problem5 is one of the classical problems in computational mathematics and the most notorious NP-complete problem. The goal is to find the path with the shortest distance, or the least cost, between N different cities. Testing all possible paths for N cities would lead to N! checks to find the shortest one.
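To get a feeling for this growth: the example below uses N = 20 stops, for which an exhaustive search would already have to examine 20! ≈ 2.4·10^18 possible paths.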
The following example uses a set of cities lying on a circle, which means the optimal path will be a (regular) polygon. This makes it easy to check the quality of the found solution.
1 import static j a v a . l a n g . Math . PI ;
2 import static j a v a . l a n g . Math . c o s ;
3 import static j a v a . l a n g . Math . hypot ;
4 import static j a v a . l a n g . Math . s i n ;
5 import static j a v a . l a n g . System . out ;
6 import static java . u t i l . Objects . requireNonNull ;
7 import static i o . j e n e t i c s . engine . EvolutionResult . toBestPhenotype ;
8 import static io . j e n e t i c s . engine . Limits . bySteadyFitness ;
9
10 import j a v a . u t i l . f u n c t i o n . F u n c t i o n ;
11 import j a v a . u t i l . stream . I n t S t r e a m ;
12
13 import io . jenetics . EnumGene ;
14 import io . jenetics . Optimize ;
15 import io . jenetics . PartiallyMatchedCrossover ;
16 import io . jenetics . Phenotype ;
17 import io . jenetics . SwapMutator ;
18 import io . jenetics . e n g i n e . Codec ;
19 import io . jenetics . e n g i n e . Codecs ;
20 import io . jenetics . e n g i n e . Engine ;
21 import io . jenetics . engine . E v o l u t i o n S t a t i s t i c s ;
22 import io . jenetics . e n g i n e . Problem ;
23 import io . jenetics . u t i l . ISeq ;
24 import io . jenetics . u t i l . MSeq ;
25 import io . jenetics . u t i l . RandomRegistry ;
26
27 public c l a s s T r a v e l i n g S a l e s m a n
28 implements Problem<ISeq<double [ ] > , EnumGene<double [ ] > , Double>
29 {
30
31 private f i n a l ISeq<double [] > _ p o i n t s ;
32
33 // C r e a t e new TSP problem i n s t a n c e with g i v e n way p o i n t s .
34 public T r a v e l i n g S a l e s m a n ( ISeq<double [] > p o i n t s ) {
35 _points = requireNonNull ( p o i n t s ) ;
36 }
37
38 @Override
39 public Function<ISeq<double [ ] > , Double> f i t n e s s ( ) {
40 return p −> I n t S t r e a m . r a n g e ( 0 , p . l e n g t h ( ) )
41 . mapToDouble ( i −> {
42 f i n a l double [ ] p1 = p . g e t ( i ) ;
43 f i n a l double [ ] p2 = p . g e t ( ( i + 1 )%p . s i z e ( ) ) ;
44 return hypot ( p1 [ 0 ] − p2 [ 0 ] , p1 [ 1 ] − p2 [ 1 ] ) ; } )
45 . sum ( ) ;
46 }
47
48 @Override
5 https://en.wikipedia.org/wiki/Travelling_salesman_problem


49 public Codec<ISeq<double [ ] > , EnumGene<double[]>> c o d e c ( ) {


50 return Codecs . o f P e r m u t a t i o n ( _ p o i n t s ) ;
51 }
52
53 // C r e a t e a new TSM example problem with t h e g i v e n number
54 // o f s t o p s . A l l s t o p s l i e on a c i r c l e with t h e g i v e n r a d i u s .
55 public s t a t i c T r a v e l i n g S a l e s m a n o f ( i n t s t o p s , double r a d i u s ) {
56 f i n a l MSeq<double [] > p o i n t s = MSeq . o f L e n g t h ( s t o p s ) ;
57 f i n a l double d e l t a = 2 . 0 ∗ PI / s t o p s ;
58
59 f o r ( i n t i = 0 ; i < s t o p s ; ++i ) {
60 f i n a l double a l p h a = d e l t a ∗ i ;
61 f i n a l double x = c o s ( a l p h a ) ∗ r a d i u s + r a d i u s ;
62 f i n a l double y = s i n ( a l p h a ) ∗ r a d i u s + r a d i u s ;
63 p o i n t s . s e t ( i , new double [ ] { x , y } ) ;
64 }
65
66 // S h u f f l i n g o f t h e c r e a t e d p o i n t s .
67 f i n a l var random = RandomRegistry . random ( ) ;
68 f o r ( i n t j = p o i n t s . l e n g t h ( ) − 1 ; j > 0 ; −−j ) {
69 f i n a l i n t i = random . n e x t I n t ( j + 1 ) ;
70 f i n a l double [ ] tmp = p o i n t s . g e t ( i ) ;
71 points . set ( i , points . get ( j ) ) ;
72 p o i n t s . s e t ( j , tmp ) ;
73 }
74
75 return new T r a v e l i n g S a l e s m a n ( p o i n t s . t o I S e q ( ) ) ;
76 }
77
78 public s t a t i c void main ( S t r i n g [ ] a r g s ) {
79 i n t s t o p s = 2 0 ; double R = 1 0 ;
80 double minPathLength = 2 . 0 ∗ s t o p s ∗R∗ s i n ( PI / s t o p s ) ;
81
82 T r a v e l i n g S a l e s m a n tsm = T r a v e l i n g S a l e s m a n . o f ( s t o p s , R) ;
83 Engine<EnumGene<double [ ] > , Double> e n g i n e = Engine
84 . b u i l d e r ( tsm )
85 . o p t i m i z e ( Optimize .MINIMUM)
86 . maximalPhenotypeAge ( 1 1 )
87 . populationSize (500)
88 . alterers (
89 new SwapMutator < >(0.2) ,
90 new P a r t i a l l y M a t c h e d C r o s s o v e r < >(0.35) )
91 . build () ;
92
93 // C r e a t e e v o l u t i o n s t a t i s t i c s consumer .
94 E v o l u t i o n S t a t i s t i c s <Double , ?>
95 s t a t i s t i c s = E v o l u t i o n S t a t i s t i c s . ofNumber ( ) ;
96
97 Phenotype<EnumGene<double [ ] > , Double> b e s t =
98 e n g i n e . stream ( )
99 // Truncate t h e e v o l u t i o n stream a f t e r 25 " s t e a d y "
100 // g e n e r a t i o n s .
101 . l i m i t ( bySteadyFitness (25) )
102 // The e v o l u t i o n w i l l s t o p a f t e r maximal 250
103 // g e n e r a t i o n s .
104 . l i m i t (250)
105 // Update t h e e v a l u a t i o n s t a t i s t i c s a f t e r
106 // each g e n e r a t i o n
107 . peek ( s t a t i s t i c s )
108 // C o l l e c t ( r e d u c e ) t h e e v o l u t i o n stream t o
109 // i t s b e s t phenotype .
110 . c o l l e c t ( toBestPhenotype ( ) ) ;


111
112 out . p r i n t l n ( s t a t i s t i c s ) ;
113 out . p r i n t l n ( "Known min path l e n g t h : " + minPathLength ) ;
114 out . p r i n t l n ( " Found min path l e n g t h : " + b e s t . f i t n e s s ( ) ) ;
115 }
116
117 }

The Traveling Salesman problem is a very good example of how to solve combinatorial problems with a GA. Jenetics contains several classes which work very well with this kind of problem. Wrapping the base type into an EnumGene is the first thing to do. In our example, every city has a unique number, which means we are wrapping an Integer into an EnumGene. Creating a genotype for integer values is very easy with the factory method of the PermutationChromosome; for other data types you have to use one of its other factory methods, which take the allowed alleles (see the short sketch at the end of this section). As alterers, we are using a swap mutator and a partially-matched crossover. These alterers guarantee that no invalid solutions are created: every city exists exactly once in the altered chromosomes.
1 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
2 | Time statistics |
3 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
4 | Selection : sum =0 .0 7 74 51 2 97 00 0 s ; mean = 0 .0 0 06 1 96 1 03 76 s |
5 | Altering : sum = 0. 20 53 5 16 88 0 00 s ; mean =0 .0 0 16 42 8 13 50 4 s |
6 | Fitness calculation : sum = 0. 09 7 12 72 25 0 00 s ; mean =0 .0 0 07 77 01 7 80 0 s |
7 | Overall execution : sum =0 .3 71 3 04 46 4 00 0 s ; mean = 0. 0 02 97 0 43 57 1 2 s |
8 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
9 | Evolution statistics |
10 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
11 | Generations : 125 |
12 | Altered : sum =177 ,200; mean = 1 41 7. 6 00 00 00 0 0 |
13 | Killed : sum =173; mean =1.384000000 |
14 | Invalids : sum =0; mean =0.000000000 |
15 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
16 | Population statistics |
17 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
18 | Age : max =11; mean =1.677872; var =5.617299 |
19 | Fitness : |
20 | min = 6 2 .5 73 7 86 0 16 0 92 |
21 | max = 3 4 4 . 2 48 7 6 3 7 2 0 4 8 7 |
22 | mean = 1 4 4 . 6 3 6 7 4 9 9 7 4 5 9 1 |
23 | var = 5 0 8 2 . 9 4 7 2 4 7 8 7 8 9 5 3 |
24 | std = 7 1 .2 94 7 91 1 69 3 34 |
25 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -+
26 Known min path length : 6 2 . 5 7 3 7 8 6 0 1 6 0 9 2 3 5
27 Found min path length : 6 2 . 5 7 3 7 8 6 0 1 6 0 9 2 3 5

The listing above shows the output generated by our example. The last two lines compare the known minimal path length of the regular polygon with the path length found by the GA. As you can see, the GA has found the optimal tour.
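A minimal sketch of the two mentioned ways to create a permutation genotype directly; the PermutationChromosome factory methods used here (ofInteger for integer alleles and of for arbitrary alleles) are taken from the io.jenetics API as I understand it, so double-check the exact signatures in the Javadoc.

import io.jenetics.EnumGene;
import io.jenetics.Genotype;
import io.jenetics.PermutationChromosome;
import io.jenetics.util.ISeq;

// Permutation of the integers 0..19, e.g. city indices.
final Genotype<EnumGene<Integer>> gt1 = Genotype.of(
    PermutationChromosome.ofInteger(20)
);

// Permutation of arbitrary alleles, e.g. city names.
final Genotype<EnumGene<String>> gt2 = Genotype.of(
    PermutationChromosome.of(ISeq.of("Vienna", "Linz", "Graz", "Salzburg"))
);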

5.6 Evolving images


The following example tries to approximate a given image by semitransparent polygons.6 It comes with a Swing UI, where you can immediately start your own experiments. After compiling the sources with
6 Original idea by Roger Johansson http://rogeralsing.com/2008/12/07/
genetic-programming-evolution-of-mona-lisa.


$ ./gradlew jar

you can start the example by calling

$ ./jrun io.jenetics.example.image.EvolvingImages

Figure 5.6.1: Evolving images UI

Figure 5.6.1 shows the GUI after evolving the default image for about 4,000 generations. With the »Open« button it is possible to load other images for polygonization. The »Save« button allows storing polygonized images in PNG format to disk. At the bottom of the UI, you can change some of the GA parameters of the example:
Population size The number of individuals in the population.

Tournament size The example uses a TournamentSelector for selecting the


offspring population. This parameter lets you set the number of individuals
used for the tournament step.
Mutation rate The probability that a polygon component (color or vertex
position) is altered.

Mutation magnitude In case a polygon component is going to be mutated,


its value will be randomly modified in the uniform range of [−m, +m].
Polygon length The number of edges (or vertices) of the created polygons.


Polygon count The number of polygons of one individual (Genotype).


Reference image size To improve the processing speed, the fitness of a given
polygon set (individual) is not calculated with the full sized image. Instead
a scaled reference image with the given size is used. A smaller reference
image will speed up the calculation, but will also reduce the accuracy.
It is also possible to run and configure the Evolving Images example from the command line. This allows performing long-running evolution experiments and saving polygon images every n generations, specified with the --image-generation parameter.

$ ./jrun io.jenetics.example.image.EvolvingImages evolve \


--engine-properties engine.properties \
--input-image monalisa.png \
--output-dir evolving-images \
--generations 10000 \
--image-generation 100

Every command line argument has proper default values, so that it is possible
to start it without parameters. Listing 5.1 shows the default values for the GA
engine if the --engine-properties parameter is not specified.
population_size=50
tournament_size=3
mutation_rate=0.025
mutation_multitude=0.15
polygon_length=4
polygon_count=250
reference_image_width=60
reference_image_height=60
Listing 5.1: Default engine.properties
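The file uses the standard Java properties format. How the example maps these keys onto its engine configuration is internal to the example code, so the following is only a generic sketch of loading such a file; the file name and the defaults are the ones from listing 5.1.

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class EngineProperties {
    public static void main(final String[] args) throws IOException {
        final Properties props = new Properties();
        try (InputStream in = new FileInputStream("engine.properties")) {
            props.load(in);
        }

        // Read single parameters, falling back to the documented defaults.
        final int populationSize =
            Integer.parseInt(props.getProperty("population_size", "50"));
        final double mutationRate =
            Double.parseDouble(props.getProperty("mutation_rate", "0.025"));

        System.out.println(populationSize + ", " + mutationRate);
    }
}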

For a quick start, you can simply call

$ ./jrun io.jenetics.example.image.EvolvingImages evolve

The images in figure 5.6.2 show the resulting polygon images after the given number of generations. They were created with the command line version of the program using the default engine.properties file (listing 5.1):

$ ./jrun io.jenetics.example.image.EvolvingImages evolve \


--generations 1000000 \
--image-generation 100

5.7 Symbolic regression


The following example shows how to set up and solve a symbolic regression problem with the help of GP and Jenetics. The data set used for the example was created with the polynomial 4x^3 − 3x^2 + x. This allows us to check the quality of the function found by the GP. Setting up a GP requires a little bit

a) 10^0 generations   b) 10^2 generations   c) 10^3 generations
d) 10^4 generations   e) 10^5 generations   f) 10^6 generations

Figure 5.6.2: Evolving Mona Lisa images

more effort than the setup of a GA. First, you have to define the set of atomic mathematical operations the GP is working with. These operations influence the search space and are a kind of a priori knowledge put into the GP. As a second step you have to define the terminal operations. Terminals are either constants or variables. The number of variables defines the domain dimension of the fitness function.
1 import s t a t i c i o . j e n e t i c s . u t i l . RandomRegistry . random ;
2
3 import j a v a . u t i l . L i s t ;
4
5 import io . jenetics . Mutator ;
6 import io . jenetics . e n g i n e . Engine ;
7 import io . jenetics . engine . EvolutionResult ;
8 import io . jenetics . engine . Limits ;
9 import io . jenetics . u t i l . ISeq ;
10
11 import i o . j e n e t i c s . e x t . S i n g l e N o d e C r o s s o v e r ;
12 import i o . j e n e t i c s . e x t . u t i l . TreeNode ;
13
14 import io . jenetics . prog . ProgramGene ;
15 import io . jenetics . prog . op . EphemeralConst ;
16 import io . jenetics . prog . op . MathExpr ;
17 import io . jenetics . prog . op . MathOp ;
18 import io . jenetics . prog . op . Op ;
19 import io . jenetics . prog . op . Var ;
20 import io . jenetics . prog . r e g r e s s i o n . E r r o r ;
21 import io . jenetics . prog . r e g r e s s i o n . L o s s F u n c t i o n ;
22 import io . jenetics . prog . r e g r e s s i o n . R e g r e s s i o n ;
23 import io . jenetics . prog . r e g r e s s i o n . Sample ;
24
25 public c l a s s S y m b o l i c R e g r e s s i o n {


26
27 // D e f i n i t i o n o f t h e a l l o w e d o p e r a t i o n s .
28 private s t a t i c f i n a l ISeq<Op<Double>> OPS =
29 I S e q . o f (MathOp .ADD, MathOp . SUB, MathOp .MUL) ;
30
31 // D e f i n i t i o n o f t h e t e r m i n a l s .
32 private s t a t i c f i n a l ISeq<Op<Double>> TMS = I S e q . o f (
33 Var . o f ( " x " , 0 ) ,
34 EphemeralConst . o f ( ( ) −> ( double ) random ( ) . n e x t I n t ( 1 0 ) )
35 );
36
37 // Lookup t a b l e f o r { @code 4∗ x ^3 − 3∗ x ^2 + x}
38 s t a t i c f i n a l L i s t <Sample<Double>> SAMPLES = L i s t . o f (
39 Sample . o f D o u b l e ( − 1 . 0 , −8.0000) ,
40 Sample . o f D o u b l e ( − 0 . 9 , −6.2460) ,
41 Sample . o f D o u b l e ( − 0 . 8 , −4.7680) ,
42 Sample . o f D o u b l e ( − 0 . 7 , −3.5420) ,
43 Sample . o f D o u b l e ( − 0 . 6 , −2.5440) ,
44 Sample . o f D o u b l e ( − 0 . 5 , −1.7500) ,
45 Sample . o f D o u b l e ( − 0 . 4 , −1.1360) ,
46 Sample . o f D o u b l e ( − 0 . 3 , −0.6780) ,
47 Sample . o f D o u b l e ( − 0 . 2 , −0.3520) ,
48 Sample . o f D o u b l e ( − 0 . 1 , −0.1340) ,
49 Sample . o f D o u b l e ( 0 . 0 , 0 . 0 0 0 0 ) ,
50 Sample . o f D o u b l e ( 0 . 1 , 0 . 0 7 4 0 ) ,
51 Sample . o f D o u b l e ( 0 . 2 , 0 . 1 1 2 0 ) ,
52 Sample . o f D o u b l e ( 0 . 3 , 0 . 1 3 8 0 ) ,
53 Sample . o f D o u b l e ( 0 . 4 , 0 . 1 7 6 0 ) ,
54 Sample . o f D o u b l e ( 0 . 5 , 0 . 2 5 0 0 ) ,
55 Sample . o f D o u b l e ( 0 . 6 , 0 . 3 8 4 0 ) ,
56 Sample . o f D o u b l e ( 0 . 7 , 0 . 6 0 2 0 ) ,
57 Sample . o f D o u b l e ( 0 . 8 , 0 . 9 2 8 0 ) ,
58 Sample . o f D o u b l e ( 0 . 9 , 1 . 3 8 6 0 ) ,
59 Sample . o f D o u b l e ( 1 . 0 , 2 . 0 0 0 0 )
60 );
61
62 s t a t i c f i n a l R e g r e s s i o n <Double> REGRESSION =
63 Regression . of (
64 R e g r e s s i o n . codecOf (
65 OPS, TMS, 5 ,
66 t −> t . gene ( ) . s i z e ( ) < 30
67 ),
68 E r r o r . o f ( L o s s F u n c t i o n : : mse ) ,
69 SAMPLES
70 );
71
72 public s t a t i c void main ( f i n a l S t r i n g [ ] a r g s ) {
73 f i n a l Engine<ProgramGene<Double >, Double> e n g i n e = Engine
74 . b u i l d e r (REGRESSION)
75 . minimizing ( )
76 . alterers (
77 new S i n g l e N o d e C r o s s o v e r < >(0.1) ,
78 new Mutator <>() )
79 . build () ;
80
81 f i n a l E v o l u t i o n R e s u l t <ProgramGene<Double >, Double> e r =
82 e n g i n e . stream ( )
83 . l i m i t ( Limits . byFitnessThreshold (0 .0 1) )
84 . c o l l e c t ( EvolutionResult . toBestEvolutionResult () ) ;
85
86 f i n a l ProgramGene<Double> program = e r . b e s t P h e n o t y p e ( )
87 . genotype ( )


88 . gene ( ) ;
89
90 f i n a l TreeNode<Op<Double>> t r e e = program . toTreeNode ( ) ;
91 MathExpr . r e w r i t e ( t r e e ) ;
92 System . out . p r i n t l n ( "G: " + er . totalGenerations () ) ;
93 System . out . p r i n t l n ( "F : " + new MathExpr ( t r e e ) ) ;
94 System . out . p r i n t l n ( "E : " + REGRESSION . e r r o r ( t r e e ) ) ;
95 }
96 }

The error function uses the mean squared error7 as loss function and no additional tree complexity metric. One output of a GP run is shown in figure 5.7.1. If we simplify this program tree, we get exactly the polynomial which created the sample data.
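To convince yourself that the simplified program really equals the sample polynomial, you can evaluate it at a few of the sample points. The sketch below assumes that MathExpr.parse and MathExpr.eval behave as their names suggest; check the io.jenetics.prog.op.MathExpr Javadoc for the exact signatures.

import io.jenetics.prog.op.MathExpr;

public class PolynomialCheck {
    public static void main(final String[] args) {
        // The polynomial which created the sample data.
        final MathExpr poly = MathExpr.parse("4*x^3 - 3*x^2 + x");

        for (double x = -1.0; x <= 1.0; x += 0.5) {
            System.out.printf("f(%.1f) = %.4f%n", x, poly.eval(x));
        }
    }
}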

[Figure: tree rendering of one evolved program, built from the operations add, sub and mul and the variable x.]

Figure 5.7.1: Symbolic regression polynomial

5.8 Grammar based regression


The same example as given in section 5.7 on page 141 can also be solved with grammatical evolution8. How to do this is shown in the following example.
7 https://en.wikipedia.org/wiki/Mean_squared_error
8 See section 3.1.8 on page 100.


1   import static java.util.Objects.requireNonNull;
2   import static java.util.stream.Collectors.joining;
3   import static io.jenetics.prog.op.MathExpr.parseTree;
4
5   import java.util.List;
6   import java.util.function.Function;
7
8   import io.jenetics.IntegerGene;
9   import io.jenetics.Phenotype;
10  import io.jenetics.SinglePointCrossover;
11  import io.jenetics.SwapMutator;
12  import io.jenetics.engine.Codec;
13  import io.jenetics.engine.Engine;
14  import io.jenetics.engine.EvolutionResult;
15  import io.jenetics.engine.Limits;
16  import io.jenetics.engine.Problem;
17  import io.jenetics.util.IntRange;
18
19  import io.jenetics.ext.grammar.Bnf;
20  import io.jenetics.ext.grammar.Cfg;
21  import io.jenetics.ext.grammar.Cfg.Symbol;
22  import io.jenetics.ext.grammar.Mappers;
23  import io.jenetics.ext.grammar.SentenceGenerator;
24  import io.jenetics.ext.util.Tree;
25  import io.jenetics.ext.util.TreeNode;
26
27  import io.jenetics.prog.op.Const;
28  import io.jenetics.prog.op.MathExpr;
29  import io.jenetics.prog.op.Op;
30  import io.jenetics.prog.regression.Error;
31  import io.jenetics.prog.regression.LossFunction;
32  import io.jenetics.prog.regression.Sample;
33  import io.jenetics.prog.regression.Sampling;
34  import io.jenetics.prog.regression.Sampling.Result;
35
36  public class GrammarBasedRegression
37      implements Problem<Tree<Op<Double>, ?>, IntegerGene, Double>
38  {
39
40    private static final Cfg<String> GRAMMAR = Bnf.parse("""
41      <expr> ::= x | <num> | <expr> <op> <expr>
42      <op>   ::= + | - | * | /
43      <num>  ::= 2 | 3 | 4
44      """
45    );
46
47    private static final Codec<Tree<Op<Double>, ?>, IntegerGene>
48      CODEC = Mappers.multiIntegerChromosomeMapper(
49        GRAMMAR,
50        // The length of the chromosome is 25 times the length
51        // of the alternatives of a given rule. Every rule
52        // gets its own chromosome. It would also be possible
53        // to define variable chromosome length with the
54        // returned integer range.
55        rule -> IntRange.of(rule.alternatives().size()*25),
56        // The used generator defines the generated data type,
57        // which is `List<Terminal<String>>`.
58        index -> new SentenceGenerator<>(index, 50)
59      )
60      // Map the type of the codec from `List<Terminal<String>>`
61      // to `String`
62      .map(s -> s.stream().map(Symbol::name).collect(joining()))
63      // Map the type of the codec from `String` to
64      // `Tree<Op<Double>, ?>`
65      .map(e -> e.isEmpty()
66        ? TreeNode.of(Const.of(0.0))
67        : parseTree(e));
68
69    private static final Error<Double> ERROR =
70      Error.of(LossFunction::mse);
71
72    private final Sampling<Double> _sampling;
73
74    public GrammarBasedRegression(Sampling<Double> sampling) {
75      _sampling = requireNonNull(sampling);
76    }
77
78    public GrammarBasedRegression(List<Sample<Double>> samples) {
79      this(Sampling.of(samples));
80    }
81
82    @Override
83    public Codec<Tree<Op<Double>, ?>, IntegerGene> codec() {
84      return CODEC;
85    }
86
87    @Override
88    public Function<Tree<Op<Double>, ?>, Double> fitness() {
89      return program -> {
90        final Result<Double> result = _sampling.eval(program);
91        return ERROR.apply(
92          program, result.calculated(), result.expected()
93        );
94      };
95    }
96
97    public static void main(final String[] args) {
98      final var regression = new GrammarBasedRegression(
99        SymbolicRegression.SAMPLES
100     );
101
102     final Engine<IntegerGene, Double> engine = Engine
103       .builder(regression)
104       .alterers(
105         new SwapMutator<>(),
106         new SinglePointCrossover<>())
107       .minimizing()
108       .build();
109
110     final EvolutionResult<IntegerGene, Double> result = engine
111       .stream()
112       .limit(Limits.byFitnessThreshold(0.05))
113       .collect(EvolutionResult.toBestEvolutionResult());
114
115     final Phenotype<IntegerGene, Double> best =
116       result.bestPhenotype();
117
118     final Tree<Op<Double>, ?> program =
119       regression.decode(best.genotype());
120
121     System.out.println(
122       "Generations: " + result.totalGenerations());
123     System.out.println(
124       "Function:    " + new MathExpr(program).simplify());
125     System.out.println(
126       "Error:       " + regression.fitness().apply(program));
127   }
128
129 }
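
A short note on the last codec mapping step: an empty sentence, which can occur when the derivation is not completed within the sentence generator's limit, is mapped to the constant 0.0; every non-empty sentence is parsed into an operation tree via MathExpr.parseTree. The following sketch shows this step in isolation; the sentence string is a made-up example, not actual program output.

import static io.jenetics.prog.op.MathExpr.parseTree;

import io.jenetics.ext.util.Tree;
import io.jenetics.ext.util.TreeNode;
import io.jenetics.prog.op.Const;
import io.jenetics.prog.op.Op;

public class SentenceToTree {
    public static void main(final String[] args) {
        // Hypothetical sentence, as the generator could produce it from
        // the grammar above (terminal symbol names joined to one string).
        final String sentence = "x*x+3";

        // Same fallback logic as in the CODEC above: empty sentences
        // are mapped to the constant 0.0, all others are parsed.
        final Tree<Op<Double>, ?> tree = sentence.isEmpty()
            ? TreeNode.<Op<Double>>of(Const.of(0.0))
            : parseTree(sentence);

        System.out.println(tree);
    }
}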

5.9 DTLZ1
Deb, Thiele, Laumanns and Zitzler have proposed a set of generational MOPs
for testing and comparing MOEAs. This suite of benchmarks attempts to define
generic MOEA test problems that are scalable to a user-defined number of
objectives. Because of the last names of its creators, this test suite is known as
DTLZ (Deb-Thiele-Laumanns-Zitzler) [10].
DTLZ1 is an M-objective problem with a linear Pareto-optimal front [17]:

\begin{align*}
f_1(x)     &= \tfrac{1}{2}\, x_1 x_2 \cdots x_{M-1}\, (1 + g(x_M)), \\
f_2(x)     &= \tfrac{1}{2}\, x_1 x_2 \cdots (1 - x_{M-1})\, (1 + g(x_M)), \\
           &\;\;\vdots \\
f_{M-1}(x) &= \tfrac{1}{2}\, x_1 (1 - x_2)\, (1 + g(x_M)), \\
f_M(x)     &= \tfrac{1}{2}\, (1 - x_1)\, (1 + g(x_M)), \\
           &\quad \forall i \in [1, \dots, n]: \; 0 \le x_i \le 1
\end{align*}

The function g(x_M) requires |x_M| = k variables and can be any function with
g ≥ 0. Typically, g is defined as
\[
g(x_M) = 100\left[\,|x_M| + \sum_{x_i \in x_M} \left( \left(x_i - \tfrac{1}{2}\right)^2
  - \cos\left(20\pi\left(x_i - \tfrac{1}{2}\right)\right) \right)\right].
\]
In the above problem, the total number of variables is n = M + k − 1. The
search space contains 11^k − 1 local Pareto-optimal fronts, each of which can
attract an MOEA.
1   import static java.lang.Math.PI;
2   import static java.lang.Math.cos;
3   import static java.lang.Math.pow;
4
5   import io.jenetics.DoubleGene;
6   import io.jenetics.Mutator;
7   import io.jenetics.Phenotype;
8   import io.jenetics.TournamentSelector;
9   import io.jenetics.engine.Codecs;
10  import io.jenetics.engine.Engine;
11  import io.jenetics.engine.Problem;
12  import io.jenetics.util.DoubleRange;
13  import io.jenetics.util.ISeq;
14  import io.jenetics.util.IntRange;
15
16  import io.jenetics.ext.SimulatedBinaryCrossover;
17  import io.jenetics.ext.moea.MOEA;
18  import io.jenetics.ext.moea.NSGA2Selector;
19  import io.jenetics.ext.moea.Vec;
20
21  public class DTLZ1 {
22    private static final int VARIABLES = 4;
23    private static final int OBJECTIVES = 3;
24    private static final int K = VARIABLES - OBJECTIVES + 1;
25
26    static final Problem<double[], DoubleGene, Vec<double[]>>
27      PROBLEM = Problem.of(
28        DTLZ1::f,
29        Codecs.ofVector(DoubleRange.of(0, 1.0), VARIABLES)
30      );
31
32    static Vec<double[]> f(final double[] x) {
33      double g = 0.0;
34      for (int i = VARIABLES - K; i < VARIABLES; i++) {
35        g += pow(x[i] - 0.5, 2.0) - cos(20.0*PI*(x[i] - 0.5));
36      }
37      g = 100.0*(K + g);
38
39      final double[] f = new double[OBJECTIVES];
40      for (int i = 0; i < OBJECTIVES; ++i) {
41        f[i] = 0.5*(1.0 + g);
42        for (int j = 0; j < OBJECTIVES - i - 1; ++j) {
43          f[i] *= x[j];
44        }
45        if (i != 0) {
46          f[i] *= 1 - x[OBJECTIVES - i - 1];
47        }
48      }
49
50      return Vec.of(f);
51    }
52
53    static final Engine<DoubleGene, Vec<double[]>> ENGINE =
54      Engine.builder(PROBLEM)
55        .populationSize(100)
56        .alterers(
57          new SimulatedBinaryCrossover<>(1),
58          new Mutator<>(1.0/VARIABLES))
59        .offspringSelector(new TournamentSelector<>(5))
60        .survivorsSelector(NSGA2Selector.ofVec())
61        .minimizing()
62        .build();
63
64    public static void main(final String[] args) {
65      final ISeq<Vec<double[]>> front = ENGINE.stream()
66        .limit(2500)
67        .collect(MOEA.toParetoSet(IntRange.of(1000, 1100)))
68        .map(Phenotype::fitness);
69    }
70
71  }

The listing above shows the encoding of the DTLZ1 problem with the Jenetics
library. Figure 5.9.1 shows the Pareto-optimal front of the DTLZ1 optimization.

Figure 5.9.1: Pareto front DTLZ1 (3-D plot of the objective values f1, f2 and f3)
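
For creating a plot like figure 5.9.1, the collected front can simply be written out, e.g. as CSV rows; the Vec class gives access to the underlying array via its data() method. The following sketch is not part of the manual's example and assumes it is placed in the same (default) package as the DTLZ1 class, so that the package-private ENGINE field is visible.

import io.jenetics.Phenotype;
import io.jenetics.util.ISeq;
import io.jenetics.util.IntRange;

import io.jenetics.ext.moea.MOEA;
import io.jenetics.ext.moea.Vec;

public class DTLZ1Export {
    public static void main(final String[] args) {
        // Collect the Pareto front exactly as in the DTLZ1 example.
        final ISeq<Vec<double[]>> front = DTLZ1.ENGINE.stream()
            .limit(2500)
            .collect(MOEA.toParetoSet(IntRange.of(1000, 1100)))
            .map(Phenotype::fitness);

        // Print one CSV row (f1,f2,f3) per non-dominated solution,
        // e.g. for plotting with an external tool.
        System.out.println("f1,f2,f3");
        for (Vec<double[]> point : front) {
            final double[] f = point.data();
            System.out.println(f[0] + "," + f[1] + "," + f[2]);
        }
    }
}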
Chapter 6

Build

For building the Jenetics library from source, download the most recent, stable
package version from https://github.com/jenetics/jenetics/releases and
extract it to some build directory.

$ unzip jenetics-<version>.zip -d <builddir>

<version> denotes the actual Jenetics version and <builddir> the actual build
directory. Alternatively you can check out the latest version from the Git master
branch.

$ git clone https://github.com/jenetics/jenetics.git \
    <builddir>

Jenetics uses Gradle1 as its build system and organizes the source into sub-projects
(modules).2 Each sub-project is located in its own sub-directory.

Published projects
• jenetics: This project contains the source code and tests for the Jenetics
base-module.

• jenetics.ext: This module contains additional non-standard GA operations
and data types. It also contains classes for solving multi-objective
problems (MOEA). Additional classes for defining tree rewrite systems are
also part of this module.
• jenetics.prog: This module contains classes which allow you to do genetic
programming (GP). It works seamlessly with the existing EvolutionStream
and evolution Engine.
• jenetics.xml: XML marshalling module for the Jenetics base data struc-
tures.
1 http://gradle.org/downloads
2 If you are calling the gradlew script (instead of gradle), which is part of the downloaded
package, the proper Gradle version is automatically downloaded and you don’t have to install
Gradle explicitly.


• prngine: PRNGine is a pseudo-random number generator library for
sequential and parallel Monte Carlo simulations. Since this library has no
dependencies on any of the other projects, it has its own repository3 with
independent versioning.

Non-published projects
• jenetics.example: This project contains example code for the base-
module.
• jenetics.incubator: This project contains experimental code which
might find its way into one of the official modules.
• jenetics.doc: Contains the code of the website and this manual.
• jenetics.tool: This module contains classes used for doing integration
testing and algorithmic performance testing. It is also used for creating
GA performance measures and creating diagrams from the performance
measures.
For building the library, change into the <builddir> directory (or one of the
module directories) and call one of the available tasks:
• compileJava: Compiles the Jenetics sources and copies the class files to
the <builddir>/<module-dir>/build/classes/main directory.
• jar: Compiles the sources and creates the JAR files. The artifacts are
copied to the <builddir>/<module-dir>/build/libs directory.
• test: Compiles and executes the unit tests. The test results are printed
onto the console and a test report, created by TestNG, is written to the
<builddir>/<module-dir> directory.
• javadoc: Generates the API documentation. The Javadoc is stored in the
<builddir>/<module-dir>/build/docs directory.
• clean: Deletes the <builddir>/build/* directories and removes all gen-
erated artifacts.
For building the library from the source, call

$ cd <build-dir>
$ gradle jar

or

$ ./gradlew jar

if you don’t have the Gradle build system installed. Calling the Gradle wrapper
script will download all needed files and trigger the build task afterwards.
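
Single modules can be built and tested the same way. Assuming the Gradle sub-project names match the directory names listed above (a plausible but unverified assumption about the build layout), a module-local run would look like this:

$ ./gradlew :jenetics:test
$ ./gradlew :jenetics.ext:jar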

Maven Central The whole Jenetics package can also be downloaded from
the Maven Central repository http://repo.maven.apache.org/maven2:
3 https://github.com/jenetics/prngine


pom.xml snippet for Maven

<dependency>
  <groupId>io.jenetics</groupId>
  <artifactId>module-name</artifactId>
  <version>7.1.0</version>
</dependency>

Gradle
'io.jenetics:module-name:7.1.0'
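
For reference, a complete Gradle dependency block which pulls in the base module together with the ext and prog modules might look as follows (a sketch only; the artifact names are the module names listed above and implementation is the standard Gradle dependency scope; add just the modules you actually need):

dependencies {
    // Jenetics base module (genes, chromosomes, engine, ...)
    implementation 'io.jenetics:jenetics:7.1.0'
    // Non-standard operations, MOEA and tree rewriting
    implementation 'io.jenetics:jenetics.ext:7.1.0'
    // Genetic programming support
    implementation 'io.jenetics:jenetics.prog:7.1.0'
}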

License The library itself is licensed under the Apache License, Version 2.0.
Copyright 2007-2022 Franz Wilhelmstötter

Licensed under the Apache License, Version 2.0 (the "License");


you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software


distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Bibliography

[1] Otman Abdoun, Jaafar Abouchabaka, and Chakir Tajani. Analyzing the
performance of mutation operators to solve the travelling salesman problem.
CoRR, abs/1203.3099, 2012.
[2] Otman Abdoun, Chakir Tajani, and Jaafar Abouchabaka. Hybridizing
PSM and RSM operator for solving np-complete problems: Application to
travelling salesman problem. CoRR, abs/1203.5028, 2012.
[3] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge
University Press, 1998.
[4] Thomas Back. Evolutionary Algorithms in Theory and Practice. Oxford
Univiversity Press, 1996.
[5] James E. Baker. Reducing bias and inefficiency in the selection algorithm.
Proceedings of the Second International Conference on Genetic Algorithms
and their Application, pages 14–21, 1987.
[6] Shumeet Baluja and Rich Caruana. Removing the genetics from the standard
genetic algorithm. pages 38–46. Morgan Kaufmann Publishers, 1995.
[7] Heiko Bauke. Tina’s random number generator library.
https://github.com/rabauke/trng4/blob/master/doc/trng.pdf, 2011.
[8] Tobias Blickle and Lothar Thiele. A comparison of selection schemes used
in evolutionary algorithms. Evolutionary Computation, 4:361–394, 1997.
[9] Joshua Bloch. Effective Java. Addison-Wesley Professional, 3rd edition,
2018.
[10] Carlos A. Coello Coello, Gary B. Lamont, and David A. Van Veldhuizen. Evo-
lutionary Algorithms for Solving Multi-Objective Problems. Genetic and
Evolutionary Computation. Springer, Berlin, Heidelberg, 2nd edition, 2007.
[11] P.K. Chawdhry, R. Roy, and R.K. Pant. Soft Computing in Engineering
Design and Manufacturing. Springer London, 1998.
[12] Carlos A. Coello Coello. Theoretical and numerical constraint-handling
techniques used with evolutionary algorithms: A survey of the state of the
art. Computer Methods in Applied Mechanics and Engineering, 191(11-12):
1245–1287, 2002.


[13] Rituparna Datta and Kalyanmoy Deb. Evolutionary Constrained Optimiza-
tion. 2014.

[14] Richard Dawkins. The Blind Watchmaker. New York: W. W. Norton &
Company, 1986.
[15] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist
multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary
Computation, 6(2):182–197, April 2002.

[16] Kalyanmoy Deb and Hans-Georg Beyer. Self-adaptive genetic algorithms
with simulated binary crossover. Complex Systems, 9:431–454, 1999.
[17] Kalyanmoy Deb, Lothar Thiele, Marco Laumanns, and Eckart Zitzler.
Scalable test problems for evolutionary multi-objective optimization. Number
112 in TIK-Technical Report. ETH-Zentrum, ETH-Zentrum Switzerland,
July 2001.
[18] Félix-Antoine Fortin and Marc Parizeau. Revisiting the nsga-ii crowding-
distance computation. In Proceedings of the 15th Annual Conference on
Genetic and Evolutionary Computation, GECCO ’13, pages 623–630, New
York, NY, USA, 2013. ACM.

[19] Raymond Greenlaw. Grammars. In Raymond Greenlaw, editor, Fundamen-
tals of the Theory of Computation: Principles and Practice, pages 195–220.
Morgan Kaufmann, Oxford, 1998.
[20] J.F. Hughes and J.D. Foley. Computer Graphics: Principles and Practice.
The systems programming series. Addison-Wesley, 2014.

[21] Raj Jain and Imrich Chlamtac. The P2 algorithm for dynamic calculation
of quantiles and histograms without storing observations. Communications
of the ACM, 28(10):1076–1085, October 1985.
[22] David Jones. Good practice in (pseudo) random number generation for
bioinformatics applications, May 2010.

[23] Abdullah Konak, David W. Coit, and Alice E. Smith. Multi-objective
optimization using genetic algorithms: A tutorial. Reliability Engineering
& System Safety, 91(9):992–1007, 2006.
[24] John R. Koza. Genetic Programming: On the Programming of Computers
by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.
[25] John R. Koza. Introduction to genetic programming: Tutorial. In Proceed-
ings of the 10th Annual Conference Companion on Genetic and Evolutionary
Computation, GECCO ’08, pages 2299–2338, New York, NY, USA, 2008.
ACM.

[26] Sean Luke. Essentials of Metaheuristics. Lulu, second edition, 2013. Avail-
able for free at http://cs.gmu.edu/~sean/book/metaheuristics/.
[27] Efrén Mezura-Montes. Constraint-Handling in Evolutionary Optimization,
volume 198. 01 2009.


[28] Zbigniew Michalewicz. A survey of constraint handling techniques in evolu-
tionary computation methods. In Evolutionary Programming, 1995.
[29] Zbigniew Michalewicz. Genetic Algorithms + Data Structures = Evolution.
Springer, 1996.
[30] Melanie Mitchell. An Introduction to Genetic Algorithms. MIT Press,
Cambridge, MA, USA, 1998.
[31] Heinz Mühlenbein and Dirk Schlierkamp-Voosen. Predictive models for the
breeder genetic algorithm I: Continuous parameter optimization. Evolutionary
Computation, 1(1):25–49, 1993.
[32] Michael O’Neil and Conor Ryan. Grammatical Evolution: Evolutionary
Automatic Programming in an Arbitrary Language. Springer US, Boston,
MA, 2003.
[33] Oracle. Value-based classes. https://docs.oracle.com/javase/8/docs/api/
java/lang/doc-files/ValueBased.html, 2014.
[34] A. Osyczka. Multicriteria optimization for engineering design. Design
Optimization, pages 193–227, 1985.
[35] Charles C. Palmer and Aaron Kershenbaum. An approach to a problem in
network design using genetic algorithms. Networks, 26(3):151–163, 1995.
[36] Franz Rothlauf. Representations for Genetic and Evolutionary Algorithms.
Springer, 2 edition, 2006.
[37] Conor Ryan, J. J. Collins, and Michael O’Neill. Grammatical evolution:
Evolving programs for an arbitrary language. In Wolfgang Banzhaf, Riccardo
Poli, Marc Schoenauer, and Terence C. Fogarty, editors, Proceedings of the
First European Workshop on Genetic Programming, volume 1391 of LNCS,
pages 83–96, Paris, 14-15 April 1998. Springer-Verlag.
[38] Conor Ryan, Michael O’Neill, and JJ Collins. Introduction to 20 Years
of Grammatical Evolution, pages 1–21. Springer International Publishing,
Cham, 2018.
[39] Daniel Shiffman. The Nature of Code. The Nature of Code, 1 edition, 12
2012.
[40] S. N. Sivanandam and S. N. Deepa. Introduction to Genetic Algorithms.
Springer, 2010.
[41] W. Vent. Rechenberg, ingo, evolutionsstrategie — optimierung technischer
systeme nach prinzipien der biologischen evolution. 170 s. mit 36 abb.
frommann-holzboog-verlag. stuttgart 1973. broschiert. Feddes Repertorium,
86(5):337–337, 1975.
[42] Eric W. Weisstein. Scalar function. http://mathworld.wolfram.com/
ScalarFunction.html, 2015.
[43] Eric W. Weisstein. Vector function. http://mathworld.wolfram.com/
VectorFunction.html, 2015.
[44] Darrell Whitley. A genetic algorithm tutorial. Statistics and Computing,
4:65–85, 1994.

Index

0/1 Knapsack, 134 configuration, 32


2-point crossover, 19 maxBatchSize, 33
3-point crossover, 19 maxSurplusQueuedTaskCount, 33
splitThreshold, 33
Allele, 6, 41 tweaks, 32
Alterer, 16, 45 Constraint, 24, 62
AnyChromosome, 43 Context-free grammar, 101
AnyGene, 42 Crossover
Architecture, 4 2-point crossover, 19
Assignment problem, 59 3-point crossover, 19
Intermediate crossover, 21
Backus-Naur form, 101 Line crossover, 21
Base classes, 5 Multiple-point crossover, 18
BigIntegerGene, 89 Partially-matched crossover, 19,
Block splitting, 35 20
BNF, 101 Simulated binary crossover, 89
Boltzmann selector, 15 Single-point crossover, 18, 19
Build, 150 Uniform crossover, 20
Gradle, 150 CyclicEngine, 94
gradlew, 150
Dieharder, 124
CFG, 101 Directed graph, 53
Chromosome, 7, 8, 42 Distinct population, 79
recombination, 18 Domain classes, 6
scalar, 48 Domain model, 6
variable length, 7 Download, 150
Clock, 24 DTLZ1, 147
Codec, 54
Composite, 59 Elite selector, 16, 45
Mapping, 59 Elitism, 16, 45
Matrix, 56 Encoding, 47
Permutation, 58 Affine transformation, 50
Scalar, 55 Directed graph, 53
Subset, 56 Graph, 52
Vector, 55 Real function, 48
Combine alterer, 20 Scalar function, 49
Compile, 151 Undirected graph, 52
Complexity function, 115 Vector function, 49
Composite codec, 59 Weighted graph, 53
ConcatEngine, 93 Engine, 23, 46
Concurrency, 31 Evaluator, 30


reproducible, 76 Var, 110


Engine classes, 22 Genotype, 7, 8
Ephemeral constant, 110 scalar, 10, 48
Error function, 116 vector, 9
ES, 77 Git repository, 150
Evolution, 25 GP, 108, 141
Engine, 23 Gradle, 150
interception, 79 gradlew, 150
performance, 76 Grammar, 144
reproducible, 76 Regression, 144
Stream, 4, 22, 26 Grammar based regression, 144
Evolution strategy, 77 Grammatical evolution, 100
(µ + λ)-ES, 78 Graph, 52
(µ, λ)-ES, 77
Evolution time, 70 Hello World, 2
Evolution workflow, 4
EvolutionResult, 27 Installation, 150
interception, 79 Interception, 79
mapper, 79 Internals, 124
EvolutionStatistics, 28 Invertible codec, 61
EvolutionStream, 26 io.jenetics.ext, 82
EvolutionStreamable, 93 io.jenetics.prngine, 120
Evolving images, 139, 140 io.jenetics.prog, 108
Examples, 128 io.jenetics.xml, 117
0/1 Knapsack, 134 Java property
Evolving images, 139, 140 maxBatchSize, 33
Ones counting, 128 maxSurplusQueuedTaskCount, 33
Rastrigin function, 132 splitThreshold, 33
Real function, 130
Traveling salesman, 137 L1, 114
Exponential-rank selector, 15 L2, 114
LCG64ShiftRandom, 36, 125
Fitness convergence, 72 Leapfrog, 35
Fitness function, 22 License, i, 152
Fitness threshold, 71 Linear-rank selector, 14
Fixed generation, 68 Loss function, 114
FlatTree, 84 L1, 114
L2, 114
Gaussian mutator, 17
MAE, 114
Gene, 6, 41
MSE, 114
validation, 7
Gene convergence, 76 MAE, 114
Generator, 106 Mapping codec, 59
Genetic algorithm, 3 Matrix codec, 56
Genetic programming, 108, 141 Mean absolute error, 114
Const, 110 Mean alterer, 21
Operations, 109 Mean squared error, 114
Program creation, 110 Mixed optimization, 99
Program pruning, 112 Modifying Engine, 92
Program repair, 112 Modules, 81


io.jenetics.ext, 82 Rastrigin function, 132


io.jenetics.prngine, 120 Real function, 130
io.jenetics.prog, 108 Recombination, 17
io.jenetics.xml, 117 Regression problem, 117
Mona Lisa, 142 Reproducibility, 76
Monte Carlo selector, 13, 77 Rewrite rule, 86
MOO, 94 Rewrite system, 86
Mixed optimization, 99 Rewriting, 86
MSE, 114 Roulette-wheel selector, 14
Multi-objective optimization, 94
Mixed optimization, 99 Sample point, 116
Multiple-point crossover, 18 SBX, 89
Mutation, 16 Scalar chromosome, 48
Mutator, 17 Scalar codec, 55
Scalar genotype, 48
NSGA2 selector, 98 Seeding, 125
Selector, 12, 44
Ones counting, 128 Elite, 45
Operation classes, 12 Seq, 38
Serialization, 37
Package structure, 5
Simulated binary crossover, 89
Parentheses tree, 83
Single-node crossover, 89
Pareto dominance, 95
Single-point crossover, 18, 19
Pareto efficiency, 95
Source code, 150
Partial alterer, 22
Statistics, 39, 45
Partially-matched crossover, 19, 20
Steady fitness, 69
Permutation codec, 58
Stochastic-universal selector, 15
Phenotype, 11
Subset codec, 56
Population, 6, 11
Swap mutator, 17
Population convergence, 74
Symbolic regression, 113, 141
PRNG, 34
SymbolIndex, 106
Block splitting, 35
LCG64ShiftRandom, 36 Term rewrite rule, 88
Leapfrog, 35 Term rewrite system, 88
Parameterization, 35 Termination, 67, 99
Random seeding, 35 Evolution time, 70
PRNG testing, 124 Fitness convergence, 72
Probability selector, 13 Fitness threshold, 71
Problem, 62 Fixed generation, 68
Gene convergence, 76
Quantile, 40
Population convergence, 74
Random, 34 Steady fitness, 69
Generator, 34 Tournament selector, 12
LCG64ShiftRandom, 36 Traveling salesman, 137
Registry, 34 Tree, 82
seeding, 125 pattern, 87
testing, 124 reduce, 86
Random seeding, 35 rewrite rule, 86
RandomGenerator, 34 rewrite system, 86
Randomness, 34 Tree pattern, 87


Tree rewrite rule, 86, 88


Tree rewrite system, 86, 88
Tree rewriter, 88
TreeGene, 89
Truncation selector, 13

Undirected graph, 52
Uniform crossover, 20
Unique fitness tournament selector,
99
Unique population, 79

Validation, 7
Vec, 96
VecFactory, 99
Vector codec, 55

Weasel program, 90
WeaselMutator, 91
WeaselSelector, 91
Weighted graph, 53

