1
Semantic Genetic Programming
Alberto Moraglio
University of Exeter
Exeter, UK
A.Moraglio@exeter.ac.uk
Krzysztof Krawiec
Poznan University of Technology
Poznan, Poland
krawiec@cs.put.poznan.pl
2
Instructors
• Alberto Moraglio
– Position: Lecturer in Computer Science at the University of Exeter, UK
– Research Area: founder of the Geometric Theory of Evolutionary
Algorithms, which unifies Evolutionary Algorithms across representations
and has been used for the principled design of new successful search
algorithms, including a new form of Genetic Programming based on
semantics, and for their rigorous theoretical analysis.
• Krzysztof Krawiec
– Position: Associate Professor at Poznan University of Technology, Poland
– Research Area: genetic programming and coevolutionary algorithms, with
applications in program synthesis, modeling, image analysis, and games.
Within GP: design of effective search operators (particularly crossovers),
discovery of semantic modularity of programs, and exploitation of program
execution traces for improving performance of program synthesis.
3
Aims
• Give a comprehensive overview of semantic methods in
genetic programming
• Illustrate in an accessible way a formal geometric framework
for program semantics
• Analyze rigorously their performance (runtime analysis)
• Present current challenges and trends in semantic GP
• Outline new emerging approaches
4
Agenda
1. Introduction to Semantic Genetic Programming
2. Geometric Operators on Semantic Space
3. Approximating Geometric Semantic Genetic Programming
4. Geometric Semantic Genetic Programming
5. Other Developments and Current Research Directions
5
I. Introduction to
Semantic Genetic Programming
6
Genetic Programming
• Generate-and-test approach to program synthesis
• Programs represented as symbolic structures (usually abstract syntax trees, ASTs)
• Population-based
• Iterative: start with a population of programs drawn at random, and repeat:
– select the most promising individuals,
– perturb using mutation and crossover
• … until solution found
• This tutorial: focus on tree-based GP (but usually easily generalizable to other
genres).
7
Motivations for Semantic GP (SGP)
• Traditional GP search operates directly
on syntax, largely disregarding program
semantics.
• Consequences:
– Complex, rugged genotype-phenotype
mapping
– Low relatedness of offspring to parents
– Slight change can dramatically change the
output of the program
– And conversely: high likelihood of no-effect
(neutrality)
– Low fitness-distance correlation
8
Questions
• Can we make GP more aware of the effects of program
execution, i.e., program ‘behavior’?
• Can we design search operators that produce offspring
programs that behave similarly to their parent(s)?
• Can we design search operators that are guaranteed to do so?
9
Program Semantics
• Program semantics = a formal method of capturing program
behavior in abstraction from syntax.
• Common formalisms: denotational semantics, operational
semantics.
– Rarely applicable in GP, where program correctness is typically
expressed w.r.t. fitness cases (tests).
• Note: semantics (noun) vs. semantic (adj.)
10
GP Semantics
• Problems in GP are typically posed using a set of fitness cases (tests)
• Observation: Program behavior is reflected in the effects of computation,
i.e., program output.
• Program semantics in GP: the tuple (vector) of outputs for the training
fitness cases. Example:
• Important consequence: the semantics s(p) is a point in an n-dimensional
space (n = number of fitness cases).
• A distance between s(p1) and s(p2) reflects semantic similarity of p1 and p2
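A minimal sketch of the definition above, assuming a program is just a Python callable and that the fitness-case inputs are the hypothetical values in `cases`. It computes the semantics s(p) as a tuple of outputs and measures semantic similarity with the Euclidean distance.

```python
import math

def semantics(program, fitness_cases):
    """Return the tuple of outputs of `program` on the fitness-case inputs."""
    return tuple(program(x) for x in fitness_cases)

def euclidean(s1, s2):
    """Euclidean distance between two semantics (output vectors)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))

cases = [0.0, 1.0, 2.0, 3.0]      # hypothetical fitness-case inputs

p1 = lambda x: x * x
p2 = lambda x: x * x + 1.0
p3 = lambda x: (x + 0.0) * x      # different syntax, same behavior as p1

s1, s2, s3 = (semantics(p, cases) for p in (p1, p2, p3))

print(s1)                  # point in a 4-dimensional semantic space
print(euclidean(s1, s2))   # p2 differs from p1 by 1 on every case
print(euclidean(s1, s3))   # 0.0: p3 is a semantic clone of p1
```

Note that p1 and p3 are syntactically different but occupy the same point in semantic space.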
11
Semantic Building Blocks
(McPhee, Ohs, Hutchison 2007/2008)
• Studied the impact of subtree crossover in terms of semantic building
blocks.
• Describe the semantic action of crossover.
• Provide insight into what does (or doesn’t) make crossover effective.
• Define semantics of subtrees and semantics of contexts, where
context = a tree with one branch missing.
• Definition of program semantics inspired by Poli's and Page's work on
sub-machine code GP
12
Semantic Building Blocks
(McPhee, Ohs, Hutchison 2007/2008)
• The distribution of context semantics is
key to the success (or failure) of runs.
• A very high proportion (typically over
75%) of crossover events are
guaranteed to perform no useful
search in the semantic space.
13
Semantically-Driven Crossover (SDC)
(Beadle and Johnson 2008)
• Program semantics = reduced ordered binary decision diagrams
(ROBDDs)
• Trial-and-error wrapper of tree-swapping crossover:
– Pick a pair of parents and generate from them a potential offspring (candidate
offspring)
– Calculate ROBDD semantics of parents and offspring
– Repeat if the offspring's semantics is the same as that of either parent
Analogously: Semantically-driven mutation (SDM)
(Beadle & Johnson 2009)
14
Semantic-Aware Crossovers
• Motivation: swap semantically similar subprograms in the parent
programs, to ‘smoothen’ the semantic effect of crossover.
• Semantic-aware crossover (SAX) (Quang et al. 2011)
– Select a pair of subprograms such that their semantics are sufficiently similar (upper
limit on distance)
• Semantic Similarity-based Crossover (SSX) (Quang et al. 2011)
– As SAX, but imposes also lower limit on distance between the subprograms, to
prevent producing semantically neutral offspring (see efficiency later in this tutorial).
• (Quang et al. 2013): Picks the closest semantically different subprogram in
the other parent.
• Analogous mutations defined too.
15
Semantic-Aware Initialization
Semantically-driven Initialization (Beadle and Johnson 2009)
• Constructs a population of semantically distinct programs of gradually
increasing complexity.
• Start with population P filled with all single-instruction programs
• To generate a new program:
– Repeat:
• Create a random program p by combining a randomly selected non-terminal
instruction r (of arity k) with k randomly selected programs in P
– Until p has a non-constant semantics that is sufficiently distant from semantics of
all programs in P
– Add p to P and return p
16
Semantic-Aware Initialization
• Behavioral Initialization (Jackson 2010)
– Set P ← ∅
• To generate a new program:
– Repeat:
• Create a random program p using conventional methods (e.g., Grow or Full)
– Until the semantic of p is sufficiently distant from semantics of all programs in P
– Add p to P and return p
• Observation: Semantic diversity decreases rapidly with run progress (as
opposed to syntactic/structural which increases and then levels-off)
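The behavioral initialization loop can be sketched as follows. This is an illustration under stated assumptions, not Jackson's implementation: programs are Boolean expressions stored as nested tuples, `grow` is a simplified stand-in for the Grow method, semantics is the full truth table over three variables, and "sufficiently distant" is taken to mean a Hamming distance of at least `min_dist`.

```python
import itertools
import random

VARS = ("x1", "x2", "x3")
INPUTS = list(itertools.product((False, True), repeat=len(VARS)))

def grow(rng, depth):
    """Random Boolean expression as a nested tuple (simplified Grow)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(VARS)
    op = rng.choice(("and", "or", "not"))
    if op == "not":
        return (op, grow(rng, depth - 1))
    return (op, grow(rng, depth - 1), grow(rng, depth - 1))

def evaluate(tree, env):
    """Evaluate an expression tree under a variable assignment."""
    if isinstance(tree, str):
        return env[tree]
    if tree[0] == "not":
        return not evaluate(tree[1], env)
    a, b = evaluate(tree[1], env), evaluate(tree[2], env)
    return (a and b) if tree[0] == "and" else (a or b)

def semantics(tree):
    """Output vector of the tree over all input combinations."""
    return tuple(evaluate(tree, dict(zip(VARS, row))) for row in INPUTS)

def hamming(s1, s2):
    return sum(a != b for a, b in zip(s1, s2))

def behavioral_init(size, rng, min_dist=1, depth=4, max_tries=10_000):
    """Accept a random program only if it is semantically distant from all accepted ones."""
    population, seen = [], []
    for _ in range(max_tries):
        if len(population) == size:
            break
        p = grow(rng, depth)
        s = semantics(p)
        if all(hamming(s, t) >= min_dist for t in seen):
            population.append(p)
            seen.append(s)
    return population

pop = behavioral_init(size=10, rng=random.Random(0))
print(len(pop), len({semantics(p) for p in pop}))
```

The `max_tries` cap is an added safeguard so the loop terminates even when the requested diversity cannot be reached.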
17
II. Geometric Operators
on Semantic Space
18
Metric Space
$d(x, y) \geq 0$
$d(x, y) = 0 \iff x = y$
$d(x, y) = d(y, x)$
$d(x, z) + d(z, y) \geq d(x, y)$
Balls & Segments
$B(x; r) = \{\, y \in S \mid d(x, y) \leq r \,\}$
$[x; y] = \{\, z \in S \mid d(x, z) + d(z, y) = d(x, y) \,\}$
19
Squared Balls & Chunky Segments
Balls:
• Hamming space: B(000; 1) on the 3-bit cube (vertices 000, 001, 010, 011, 100, 101, 110, 111)
• Euclidean space: B((3, 3); 1)
• Manhattan space: B((3, 3); 1)
Line segments:
• Hamming space: [000; 011] = [001; 010], 2 geodesics
• Euclidean space: [(1, 1); (3, 2)], 1 geodesic
• Manhattan space: [(1, 1); (3, 2)] = [(1, 2); (3, 1)], infinitely many geodesics
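The ball and segment definitions can be checked directly by enumeration. The sketch below does this for the Hamming cube and, restricted to an integer grid (an assumption made so that the sets are finite), for the Manhattan examples; the helper names are illustrative.

```python
def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def in_ball(y, centre, r, d):
    """y is in B(centre; r) iff d(centre, y) <= r."""
    return d(centre, y) <= r

def in_segment(z, x, y, d):
    """z is in [x; y] iff d(x, z) + d(z, y) == d(x, y)."""
    return d(x, z) + d(z, y) == d(x, y)

cube = [format(i, "03b") for i in range(8)]
ball = [y for y in cube if in_ball(y, "000", 1, hamming)]
segment = [z for z in cube if in_segment(z, "000", "011", hamming)]

grid = [(i, j) for i in range(5) for j in range(5)]
seg_a = {z for z in grid if in_segment(z, (1, 1), (3, 2), manhattan)}
seg_b = {z for z in grid if in_segment(z, (1, 2), (3, 1), manhattan)}

print(ball)             # B(000; 1)
print(segment)          # [000; 011], the same set as [001; 010]
print(seg_a == seg_b)   # Manhattan segments are axis-aligned rectangles
```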
20
Geometric Crossover & Mutation
• Geometric crossover: a recombination operator is a geometric
crossover under the metric d if all its offspring are in the d-metric
segment between its parents.
• Geometric mutation: a mutation operator is a r-geometric
mutation under the metric d if all its offspring are in the d-ball of
radius r centred in the parent.
21
Example of Geometric Mutation
[Figure: the 3-bit Hamming cube with vertices 000, 001, 010, 011, 100, 101, 110, 111; edges join strings that differ in one bit.]
Neighbourhood structure naturally associated with the shortest path
distance.
Traditional one-point mutation is 1-geometric under Hamming
distance.
22
Example of Geometric Crossover
• Geometric crossover: offspring are in a segment
between parents for some distance.
• The traditional crossover is geometric under the
Hamming distance.
Parents: A = 10110, B = 11011
Offspring: X = 11010
H(A,X) = 2, H(X,B) = 1, H(A,B) = 3
H(A,X) + H(X,B) = H(A,B)
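The segment property can be tested empirically. The sketch below uses uniform crossover as one example of a traditional mask-based crossover (this choice, the seed and the sample size are assumptions) and checks that every sampled offspring of A and B satisfies H(A,X) + H(X,B) = H(A,B).

```python
import random

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def uniform_crossover(a, b, rng):
    """Each offspring bit is copied from one of the two parents at random."""
    return "".join(rng.choice(pair) for pair in zip(a, b))

A, B = "10110", "11011"
rng = random.Random(1)
offspring = [uniform_crossover(A, B, rng) for _ in range(200)]

all_in_segment = all(
    hamming(A, x) + hamming(x, B) == hamming(A, B) for x in offspring
)
print(all_in_segment)
print(hamming(A, "11010"), hamming("11010", B), hamming(A, B))
```

Offspring bits can only differ from a parent where the parents themselves differ, which is why the equality holds for every mask.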
23
Significance of Geometric View
• Unification Across Representations
• Simple Landscape for Crossover
• Crossover Principled Design
• Principled Generalisation of Search Algorithms
• General Theory Across Representations
24
Semantic Operators
• Semantic search operators: operators that act on
the syntax of the programs but that guarantee that
some semantic criterion holds (e.g., semantic
mutation: offspring are semantically similar to
parents)
25
[Figure: a semantic mutation applied to a program's syntax induces a mutation on its semantics (output vector), e.g. 0 1 0 1 0 1 1 1 becomes 0 1 1 1 0 1 1 1.]
Fitness as Distance
• Aim: we want to find a function that scores
perfectly on a given set of input-output examples
(test cases)
• Error of a program: number of mismatches on the
test cases
• Fitness as distance: the error of a program can be
interpreted as the distance of the output vector of
the program to the target output vector
• Distance functions: Hamming distance for Boolean
outputs, Euclidean distance for continuous outputs
26
Semantic Distance & Operators
• The semantic distance between two functions is
the distance of their output vectors measured
with the distance function used in the definition of
the fitness function
• Semantic geometric operators are geometric
operators defined on the metric space of
functions endowed with the semantic distance
27
Semantic Fitness Landscape
• The fitness landscape seen by GP with semantic
geometric operators is always a cone landscape
by definition (unimodal with a linear gradient)
which GP can easily optimise!
28
29
III. Approximating
Geometric Semantic GP
30
Trial-and-Error Geometric Crossover (KLX)
Krawiec and Lichocki Crossover, KLX (Krawiec and Lichocki 2009)
• Goal: Minimize offspring’s total semantic distance from the parents under some
assumed metric || ||.
• Technical realization: Mate the parents (p1, p2) repeatedly using a ‘regular’
crossover operator CX
• Calculate parent semantics s(p1), s(p2)
• Repeat:
– Apply CX to (p1,p2) n times, creating a pool of candidates C
– Calculate the semantics s(z) of each candidate z ∈ C
• Return the candidate z that minimises the total distance:
argmin ||s(z) - s(p1)|| + ||s(z) - s(p2)||
• A form of brood selection
31
Trial-and-Error Geometric Crossover (KLX)
Motivation: Given a globally convex
fitness landscape (one global
optimum), solutions on a segment
connecting solutions x and y cannot
be worse than the worse of them.
32
Promotion of Equidistance
• All candidate offspring on the segment [s(p1);s(p2)] minimize total distance equally well, no
matter how different from the parents they are.
– An offspring z that is a ‘semantic clone’ of p1 (s(z) = s(p1)) also minimises the total
distance.
– The likelihood of crossover producing a semantic clone of one of the parents is
high in GP (see remarks on neutrality later)
• KLX promotes similarity to parents. This may hamper exploration.
• Idea: Extend total distance by a term that promotes balanced distance from both parents
(KLX+)
argmin ||s(z) - s(p1)|| + ||s(z) - s(p2)|| + | ||s(z) - s(p1)|| - ||s(z) - s(p2)|| |
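The effect of the extra term can be seen on a tiny pool of candidate semantics. The sketch below assumes the Euclidean norm and three hand-picked, hypothetical candidates; it only illustrates the two selection criteria, not the crossover that generates the pool.

```python
import math

def klx_score(sz, s1, s2):
    """Total semantic distance from both parents (KLX criterion)."""
    return math.dist(sz, s1) + math.dist(sz, s2)

def klx_plus_score(sz, s1, s2):
    """KLX criterion plus a penalty for unequal distances (KLX+ criterion)."""
    d1, d2 = math.dist(sz, s1), math.dist(sz, s2)
    return d1 + d2 + abs(d1 - d2)

s1, s2 = (0.0, 0.0), (4.0, 0.0)       # parent semantics
candidates = {                         # hypothetical candidate semantics
    "clone_of_p1": (0.0, 0.0),
    "midpoint": (2.0, 0.0),
    "off_segment": (2.0, 3.0),
}

for name, sz in candidates.items():
    print(name, klx_score(sz, s1, s2), klx_plus_score(sz, s1, s2))

best_plus = min(candidates, key=lambda k: klx_plus_score(candidates[k], s1, s2))
print(best_plus)
```

Under KLX the clone and the midpoint tie, whereas KLX+ separates them in favour of the midpoint.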
33
Locally Geometric Crossover
(Krawiec & Pawlak 2012)
• Motivations: Finding an ‘almost geometric’ offspring can be difficult for entire
parent programs,
– … but should be easier for subprograms.
– This may make sense if ‘geometricity’ can propagate through a tree.
• The algorithm:
– Find the syntactic common region of the parents (where the trees overlap)
– Select two homologous nodes (subprograms) p1 and p2 in the common region
– Calculate the midpoint sm between s(p1) and s(p2)
– Find two programs p’1 and p’2 in a library that have the closest semantic distance from sm
– Replace p1 and p2 with p’1 and p’2, respectively.
34
35
Semantic Backpropagation
• Motivation: many instructions used in GP are invertible or partially
invertible.
• Example: symbolic regression:
– Fully invertible: e.g., addition: y = x + c ⇒ x = y - c
– Partially invertible: e.g., square: y = x^2 ⇒ x = ±sqrt(y)
• The desired output t of a program (target) is known.
• Given a program and t, this allows deriving desired semantics at any
point in a program tree.
36
Semantic Backpropagation
SBP can be used to back propagate any semantics.
37
Semantic Backpropagation
• Note: desired semantics is not a vector of scalar values.
• Desired semantics is a tuple of sets of desired outputs, because not all
instructions are bijective. Examples:
– D = ({2}, {3}, {2,-4}, {0, 1})
– D = ({T}, {F}, {T,F})
• Special case: non-realizable desired semantics, e.g., D = ({T}, ∅, {T,F})
– Or: non-realizable under assumed constraints (e.g., size of subprogram).
• Algorithms have to account for that.
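A minimal sketch of backpropagating desired semantics through two instructions, assuming the program is square(# + 1) with # the node to be replaced and a hypothetical three-case target. Desired semantics is a tuple of sets, one set per fitness case; an empty set marks a case that cannot be realized.

```python
import math

def invert_square(desired):
    """y = x**2  =>  x = +sqrt(y) or -sqrt(y); a negative y has no real solution."""
    result = []
    for ys in desired:
        xs = set()
        for y in ys:
            if y >= 0:
                r = math.sqrt(y)
                xs.update({r, -r})
        result.append(xs)
    return tuple(result)

def invert_add(desired, other):
    """y = x + c  =>  x = y - c, applied per fitness case."""
    return tuple({y - c for y in ys} for ys, c in zip(desired, other))

target = ({4.0}, {9.0}, {-1.0})                 # desired outputs at the root
D_square = invert_square(target)                # desired semantics of (# + 1)
D_hole = invert_add(D_square, (1.0, 1.0, 1.0))  # desired semantics of #

print(D_square)
print(D_hole)
```

The third fitness case ends up with an empty set, because no real argument makes a square equal to -1.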
38
Propagation of Desired Semantics
• Two fitness cases, 2D semantic space
• Desired outputs: (0,0)
• Program: cos(sin(x))
• Visualization:
– semantic distance as a function of inputs (x1, x2)
– red = smaller semantic distance (greater fitness)
39
Propagation of Desired Semantics
• Top: desired semantics of cos(#)
– target achieved for x1, x2 = π/2 + kπ, k ∈ Z
• Bottom: desired semantics of cos(sin(#))
– Target cannot be achieved, because
sin(x) ∈ [-1, 1], and thus no x causes
cos(sin(x)) = 0
40
Operators Based on SBP
• Approximately Geometric Crossover, AGX (Krawiec & Pawlak 2013)
–A crossover operator
–Uses SBP to match the midpoint on the segment connecting the parents’ semantics
–Starting point of SBP: the midpoint on the segment
• Random Desired Operator, RDO (Wieloch & Krawiec 2013)
–A mutation operator
–Uses SBP to match the target of the search process
–Starting point of SBP: the target semantics of the
41
Operators Based on SBP
• Common part of workflow:
–Pick a node p’ in a parent p
–Perform semantic backpropagation of desired semantics from the root of p to
p’, obtaining desired semantics D
–Replace p’ with a (sub)program from a library that best matches D
• Other differences:
–RDO is agnostic about geometric considerations
–RDO and AGX may use various libraries
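The library lookup step can be sketched as a brute-force nearest-neighbour search. The matching rule used here is an assumption chosen for illustration: sum, over fitness cases, the distance to the closest desired value and skip cases whose desired set is empty. The library contents and the desired semantics D are hypothetical.

```python
def distance_to_desired(s, D):
    """Sum over fitness cases of the distance to the nearest desired value.

    Cases with an empty desired set are skipped (an assumption of this sketch).
    """
    total = 0.0
    for out, wanted in zip(s, D):
        if wanted:
            total += min(abs(out - w) for w in wanted)
    return total

# Hypothetical library: program text -> semantics on inputs x = 1, 2, 3.
library = {
    "x": (1.0, 2.0, 3.0),
    "x*x": (1.0, 4.0, 9.0),
    "x+x": (2.0, 4.0, 6.0),
}

D = ({2.0, -2.0}, {4.0}, {7.0, -7.0})   # desired semantics at the chosen node

best = min(library, key=lambda name: distance_to_desired(library[name], D))
print(best, distance_to_desired(library[best], D))
```

For large libraries the linear scan would be replaced by a spatial index, as noted under the technical challenges of SBP.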
42
AGX: Some Results
(Pawlak, Wieloch, Krawiec, 2014)
43
Library of Subprograms
• The source of subprograms for SBP
– Static: Generated prior to run
– Dynamic: Other programs in the current population
• Example of static library: All programs built upon given set of instructions.
– Instructions {+, −,, /, sin, cos, exp, log, x}, max tree height h
– Semantic duplicates eliminated
• Total number of programs: 212 (for h = 3), 108520 (for h = 4)
– Depends on the instruction set and tests (in general the fewer tests,
the fewer unique semantics)
– Impact of floating-point precision
44
Semantic Diversity of Libraries
Exemplary library:
• All programs composed of {+,−,×,/,sin,exp,x},
max tree depth: 4.
• Semantics: 20 points distributed equidistantly
in [−5, 5] ⇒ 20-dimensional semantic space
• Semantic duplicates removed.
Visualization:
• Reduction to 2D by PCA,
• Red: the smallest (i.e. single node) programs,
• Blue: the longest (i.e. 15 nodes) programs.
Observation: strongly non-uniform distribution of
semantics.
• Expected: see (Langdon & Poli 2002)
45
Technical Challenges of SBP
• Limited semantic diversity
– Using a mutation operator in parallel recommended (to provide constant influx of new
code)
• Computational overhead of library search
– Can be tackled with appropriate algorithms (nearest-neighbor search, e.g., kd-trees)
46
SBP: Remarks and Extensions
• Requirements of SBP-based operators
– AGX requires a means of constructing a midpoint on a segment.
• Possible in vector spaces, but in general not in metric spaces
– RDO can work with any metric (vector space not required)
• The node/subtree p to be replaced can be selected deterministically:
– E.g., the node where the divergence of the actual semantics s(p) and the desired
semantics D is the greatest (Wieloch 2012)
47
IV. Geometric Semantic GP (GSGP)
Geometric Semantic Operators Construction
• By approximation:
– Trial & Error is wasteful
– Offspring do not conform exactly to the semantic requirement
• By direct construction: Is it possible to find search operators that
operate on syntax but that are guaranteed to respect geometric
semantic criteria by direct construction?
• Owing to the complexity of the genotype-phenotype map in GP,
Krawiec & Lichocki (2009) hypothesized that designing a
crossover operator with such a guarantee is in general
impossible. A pessimist? No, the established view until then...
48
Geometric Semantic Crossover
for Boolean Expressions
49
T1, T2: parent trees
TR: random tree
T3 = (T1 ∧ TR) ∨ (¬TR ∧ T2)
Theorem
The output vector of the offspring T3 is in the
Hamming segment between the output
vectors of its parent trees T1 and T2 for any
tree TR
50
Example: parity problem
• 3-parity problem: we want to find a function
P(X1,X2,X3) that returns 1 when an odd number
of input variables is 1, 0 otherwise.
51
O = 0 1 0 1 0 1 1 1
Error = HD(Y,O) = 5
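The error can be recomputed in a few lines. The candidate function below is borrowed from the mutation slides later in the tutorial, X = ((X1 ∧ X2) ∧ ¬X3) ∨ X3, whose output column matches O; using it here is an assumption, since the tree on this slide was not preserved.

```python
import itertools

INPUTS = list(itertools.product((0, 1), repeat=3))   # rows 000 ... 111

def output_vector(f):
    """Outputs of f on all 8 input combinations, as 0/1 integers."""
    return [int(bool(f(*row))) for row in INPUTS]

def hamming(a, b):
    return sum(p != q for p, q in zip(a, b))

parity = lambda x1, x2, x3: (x1 + x2 + x3) % 2               # target function
candidate = lambda x1, x2, x3: (x1 and x2 and not x3) or x3  # assumed program

Y = output_vector(parity)
O = output_vector(candidate)
error = hamming(Y, O)

print(Y)
print(O)
print(error)
```

The error is the Hamming distance between the candidate's output vector and the target output vector, which is exactly the fitness-as-distance view.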
Example: tree crossover
52
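[Figure: the parent trees T1 and T2 and the random tree TR are substituted into the crossover scheme and simplified to obtain the offspring tree T3.]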
Example: output vector crossover
53
• The output vector of TR acts as a crossover mask to
recombine the output vectors of T1 and T2 to produce the
output vector T3.
• This is a geometric crossover on the semantic distance:
output vector of T3 is in the Hamming segment between the
output vectors of T1 and T2.
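The mask interpretation can be verified empirically. In the sketch below the random tree TR is replaced by a random truth table (an assumption that keeps the example short), the parents are the example functions from the functional compactification slides, and the offspring is built as (T1 and TR) or (not TR and T2).

```python
import itertools
import random

INPUTS = list(itertools.product((False, True), repeat=3))

def output_vector(f):
    return tuple(bool(f(*row)) for row in INPUTS)

def hamming(a, b):
    return sum(p != q for p, q in zip(a, b))

def semantic_crossover(t1, t2, tr):
    """Offspring function: T1 where TR is true, T2 where TR is false."""
    return lambda *x: (t1(*x) and tr(*x)) or (not tr(*x) and t2(*x))

def random_function(rng):
    """Random Boolean function given as a truth table (stand-in for a random tree)."""
    table = {row: rng.random() < 0.5 for row in INPUTS}
    return lambda *x: table[x]

T1 = lambda x1, x2, x3: x1 or (x2 and not x3)
T2 = lambda x1, x2, x3: x1 and x2

rng = random.Random(0)
always_in_segment = True
for _ in range(100):
    T3 = semantic_crossover(T1, T2, random_function(rng))
    o1, o2, o3 = output_vector(T1), output_vector(T2), output_vector(T3)
    if hamming(o1, o3) + hamming(o3, o2) != hamming(o1, o2):
        always_in_segment = False

print(always_in_segment)
```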
Geometric Semantic Crossover
for Arithmetic Expressions
54
Function co-domain: real
Output vectors: real vectors
Semantic distance = Euclidean
CR = random real in [0,1]
Semantic distance = Manhattan
CR = random function with co-
domain [0,1]
T3 =
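A sketch of the arithmetic case, assuming the convex-combination form T3 = (T1 · TR) + ((1 − TR) · T2) from the geometric semantic GP literature (the formula itself is not preserved on this slide). The fitness-case inputs, the parent functions and the logistic function used as a [0,1]-valued TR are all illustrative assumptions.

```python
import math
import random

CASES = [-1.0, 0.0, 0.5, 2.0]          # hypothetical fitness-case inputs

def output_vector(f):
    return [f(x) for x in CASES]

def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

def crossover_euclidean(t1, t2, r):
    """r is a single random real in [0, 1]."""
    return lambda x: r * t1(x) + (1 - r) * t2(x)

def crossover_manhattan(t1, t2, tr):
    """tr is a function of the inputs with co-domain [0, 1]."""
    return lambda x: tr(x) * t1(x) + (1 - tr(x)) * t2(x)

T1 = lambda x: x * x
T2 = lambda x: 3.0 * x + 1.0
TR = lambda x: 1.0 / (1.0 + math.exp(-x))   # logistic: values in (0, 1)

rng = random.Random(0)
o1, o2 = output_vector(T1), output_vector(T2)
oe = output_vector(crossover_euclidean(T1, T2, rng.random()))
om = output_vector(crossover_manhattan(T1, T2, TR))

on_euclidean_segment = math.isclose(
    math.dist(o1, oe) + math.dist(oe, o2), math.dist(o1, o2))
on_manhattan_segment = math.isclose(
    manhattan(o1, om) + manhattan(om, o2), manhattan(o1, o2))
print(on_euclidean_segment, on_manhattan_segment)
```

With a single constant the offspring lies on the straight line between the parents' output vectors; with a per-input weight each output coordinate lies between the parents' coordinates, which is the Manhattan segment.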
Geometric Semantic Crossover for Classifiers
55
Function co-domain: symbol
Output vectors: symbol string
Semantic distance = Hamming
RC = random function with
boolean co-domain
(i.e., random condition function
of the inputs)
T3 =
Remark 1: Domain-Specific
• Unlike traditional syntactic operators which
are of general applicability, semantic
operators are domain-specific
• But there is a systematic way to derive
them for any domain
56
Remark 2: Quick Growth
• Offspring grows in size very quickly, as the
size of the offspring is larger than the sum
of the sizes of its parents!
• To keep the size manageable we need to
simplify the offspring without changing the
computed function:
– Boolean expressions: Boolean simplification
– Math Formulas: algebraic simplification
– Programs: simplification by formal methods
57
Remark 3: Syntax Does Not Matter!
• The offspring is defined purely functionally,
independently from how the parent functions and
itself are actually represented (e.g., trees)
• The genotype representation does not matter:
solution can be represented using any genotype
structure (trees, graphs, sequences)/language
(Java, Lisp, Prolog) as long as the semantic
operators can be described in that language
58
Semantic Mutations
• It is possible to derive geometric semantic
mutation operators.
• They also have very simple forms for
Boolean, Arithmetic and Program domains.
59
EXPERIMENTS
60
Boolean Problems
61
Polynomial Regression Problems
62
Classification Problems
63
DEALING WITH GROWTH
64
Geometric Semantic Crossover
for Boolean Expressions (Growth)
65
T1, T2: parent trees
TR: random tree
T3 =
size(T3) = 4 + 2 * size(TR) + size(T1) + size(T2)
average size at generation n + 1 > 2 * average size at generation n
PROBLEM: size grows exponentially in the number of generations!
Geometric Semantic Mutation
for Boolean Expressions (Growth)
66
T: parent tree
M: random minterm tree
TM: mutant tree
size(TM) = 2 + size(M) + size(T)
average size at generation n + 1 = constant + average size at generation n
NO PROBLEM: size grows linearly in the number of generations
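The two size formulas can be iterated to compare the growth rates. The sketch assumes that both parents have the current average size and that random trees and minterms always have 5 nodes; these numbers are illustrative only.

```python
def crossover_size(size_t1, size_t2, size_tr):
    """size(T3) = 4 + 2 * size(TR) + size(T1) + size(T2)"""
    return 4 + 2 * size_tr + size_t1 + size_t2

def mutation_size(size_t, size_m):
    """size(TM) = 2 + size(M) + size(T)"""
    return 2 + size_m + size_t

def simulate(generations, start=5, random_tree=5):
    """Average size after the given number of generations for each operator."""
    xo = mut = start
    for _ in range(generations):
        xo = crossover_size(xo, xo, random_tree)   # both parents of average size
        mut = mutation_size(mut, random_tree)
    return xo, mut

xo_size, mut_size = simulate(10)
print(xo_size)    # crossover only: size more than doubles each generation
print(mut_size)   # mutation only: size grows by a constant each generation
```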
Three Solutions
1. Algebraic simplification of offspring
- Can be computationally expensive
- Not all domains can be simplified algebraically
- Understandable final solutions
2. Not using crossover
- Semantic Hill-Climber finds optimum efficiently
- Linear growth is acceptable
3. Compactification of offspring (Vanneschi et al, 2013)
- Linear growth even with crossover
- Applicable to any domain
- Complicated Implementation (pointers structure)
- Final solution is black box
67
Compactification Method
(Vanneschi et al, 2013)
- Individuals are represented as an explicit linked data structure with shared
references to their parents, and recursively to all their ancestry.
- At each generation, each new offspring of crossover requires only a new
triplet of references ⇒ linear growth in the number of generations.
68
Compactification Method
- Output vector of offspring can be computed using the explicitly stored output
vectors of the parent and mask trees. This turns fitness computation from
exponential in the number of generations to constant time.
69
Compactification Method
- Explicit garbage collection of unreferenced past
individuals in the data structure.
- Final solution is extracted from the data structure, but this
takes exponentially long in the number of generations.
- Extracted solution is queried on non-training inputs to
make predictions. This takes exponential time since done
on extracted solution.
Good idea, but can be improved and beautified!
70
Functional Compactification (Moraglio, 2014)
• Individuals are represented directly as
anonymous Python functions:
P1 = lambda x1, x2, x3: x1 or (x2 and not x3)
P2 = lambda x1, x2, x3: x1 and x2
RF = lambda x1, x2, x3: not (x2 and x3)
71
Functional Compactification
• Offspring call parents rather than pointing
to them:
OX = lambda x1, x2, x3: (
    (P1(x1, x2, x3) and RF(x1, x2, x3))
    or (P2(x1, x2, x3) and not RF(x1, x2, x3)))
• The size of offspring is constant in the number of
generations
72
Functional Compactification
• Mutation and Crossover are higher order
functions that take functions as inputs (parents)
and return functions as output (offspring):
Crossover: (B^3 → B) × (B^3 → B) → (B^3 → B)
• The function call structure implicitly keeps track
of all ancestry of an individual
73
Functional Compactification
• All individuals are memoized functions:
- The output of previously seen inputs is retrieved from
an implicit storage, not recalculated
- The first time the fitness of an individual is calculated,
its output vector is implicitly stored
- As the output vectors of parents are stored, the fitness
of the offspring takes constant time in num generations
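A compact sketch of the functional idea, not the Tiny GSGP code itself: crossover and mutation are higher-order functions that return memoized closures over their parents. Memoization is done here with `functools.lru_cache`, which is one possible choice; the function names and the `force_to_one` flag are illustrative.

```python
from functools import lru_cache

def memoize(f):
    """Cache outputs so that previously seen inputs are never recomputed."""
    return lru_cache(maxsize=None)(f)

def crossover(p1, p2, rf):
    """Return an offspring that calls its parents and the random function rf."""
    @memoize
    def offspring(*x):
        return (p1(*x) and rf(*x)) or (p2(*x) and not rf(*x))
    return offspring

def mutation(p, minterm, force_to_one):
    """Return a mutant: p OR minterm, or p AND NOT minterm."""
    @memoize
    def mutant(*x):
        if force_to_one:
            return p(*x) or minterm(*x)
        return p(*x) and not minterm(*x)
    return mutant

P1 = memoize(lambda x1, x2, x3: x1 or (x2 and not x3))
P2 = memoize(lambda x1, x2, x3: x1 and x2)
RF = memoize(lambda x1, x2, x3: not (x2 and x3))

OX = crossover(P1, P2, RF)
first = OX(True, False, False)    # computed by calling the parents
again = OX(True, False, False)    # served from the offspring's own cache

M = memoize(lambda x1, x2, x3: (not x1) and x2 and (not x3))
MX = mutation(P2, M, force_to_one=True)
print(first, again, OX.cache_info().hits, MX(False, True, False))
```

Each offspring stores only references to its parents, so its definition stays constant in size no matter how deep the ancestry is.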
74
Functional Compactification
- Garbage collection of unreferenced past functions is done
automatically by the Python runtime.
- Final solution is a Python compiled function (but can be
extracted by keeping track of its source code). The
extracted solution would be exponentially long.
- The compiled final solution can be queried on non-
training inputs to make predictions. Thanks to the
memoization obtaining the output takes only linear time.
75
Functional Compactification
• The functional interpretation of the
compactification method delegates implicitly
all book-keeping of the original
compactification method to the Python
runtime.
• The resulting code is elegant, much shorter
and clearer, as it has only minimal clutter
(< 100 lines including extensive comments vs
original compactification > 2000 lines of C++).
76
77
GSGP Implementations
• Original Mathematica implementation with algebraic
simplification (see https://github.com/amoraglio/GSGP)
• Compactification method in C++ (see
http://gsgp.sourceforge.net/)
• Functional compactification aka Tiny GSGP in Python (see
https://github.com/amoraglio/GSGP)
• Scala implementation using the ScaPS library (see
http://www.cs.put.poznan.pl/kkrawiec/wiki/?n=Site.Scaps)
RUNTIME ANALYSIS OF
MUTATION-BASED GSGP
78
• Rigorous analytical formula of the
expected optimisation time of the search
algorithm A on the problem class P (on
the worst instance) for increasing size n
of the problem
Runtime Analysis
79
• Algorithm: stochastic hill-climber i.e., flip a bit of the current
solution and accept new solution if it is better than current
• Problem class: one-max i.e., sum of ones in the bit string to
maximise; the problem size is the string size
• Expected optimisation time: O(n log n) by coupon collector
argument
• This result generalises to onemax with an unknown target
string, i.e., to any cone landscape on binary strings
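The hill-climber in this example can be sketched as follows, using the generalised form with an unknown target string (fitness = number of positions that agree with the target). The string length and seed are arbitrary assumptions, and the step count printed is a single sample rather than the expected runtime.

```python
import random

def hill_climb(target, rng):
    """Flip one random bit per step; keep the new string only if it is better."""
    n = len(target)
    fitness = lambda s: sum(a == b for a, b in zip(s, target))
    x = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while fitness(x) < n:
        i = rng.randrange(n)
        y = x[:]
        y[i] ^= 1                      # flip bit i
        if fitness(y) > fitness(x):
            x = y
        steps += 1
    return x, steps

rng = random.Random(42)
target = [rng.randint(0, 1) for _ in range(50)]
solution, steps = hill_climb(target, rng)
print(solution == target, steps)
```

Every wrong bit must be picked at least once, which is where the coupon-collector argument for the O(n log n) bound comes from.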
Runtime Analysis (example)
80
Semantic Mutation
(syntactic search & semantic effect)
81
[Figure: a semantic mutation applied to a program's syntax induces a mutation on its semantics (output vector), e.g. 0 1 0 1 0 1 1 1 becomes 0 1 1 1 0 1 1 1.]
Search Equivalence
82
Semantic GP search at a
syntax level on any problem
Traditional GA search on
output vectors on onemax
Semantics
The search outputs a tree (i.e., a function),
but the runtime analysis can be done on the GA!
Forcing Point Mutation (not Bit Flip)
83
X1 X2 X3 Output
0 0 0 0
0 0 1 1
0 1 0 0 → 1
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
X = ((X1 ^ X2) ^ !X3) v X3
M = !X1 ^ X2 ^ !X3
X’ = X v M
Issue 1: Exponential Chromosome Size
• Problem size n: number of input variables
• Output vector size N: 2^n
(exponentially long in the number of variables!)
• (1+1)-EA on OneMax has runtime N log N = n 2^n
(exponential!)
84
Issue 2: Exponential Amount of Neutrality
• Training set size t: must be polynomial in n for the
fitness to be computable in poly time
• The output vectors of size 2^n have only poly(n)
active bits, all other bits are inactive: sparse
OneMax with very rare active bits
• Black-box model: we do not know which bits are
active and which are inactive
• (1+1)-EA takes exponential time to optimise
sparse OneMax
85
Solution: Block Mutation
• Use incomplete minterm as a basis for forcing mutation.
This has the effect of forcing at once blocks of entries to
the same random value.
86
X1 X2 X3 Output
0 0 0 0 → 1
0 0 1 1 → 1
0 1 0 0 → 1
0 1 1 1 → 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
X = ((X1 ^ X2) ^ !X3) v X3
M = !X1
X’ = X v M
Fixed Block Mutation
87
X1 X2 X3 Output
0 0 0 0
0 0 1 1
0 1 0 0 → 0
0 1 1 1 → 0
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
Fix Variables = {X1,X2}
Possible M =
{!X1 ^ !X2, !X1 ^ X2, X1 ^ !X2, X1 ^ X2}
X = ((X1 ^ X2) ^ !X3) v X3
M = !X1 ^ X2
X’ = X ^ !M
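The three forcing mutations shown on these slides can be reproduced with a small sketch. A minterm (complete or incomplete) is modelled as a dictionary from variable index to required bit; this encoding and the helper names are assumptions made for illustration.

```python
import itertools

INPUTS = list(itertools.product((0, 1), repeat=3))    # rows 000 ... 111

def output_vector(f):
    return [int(bool(f(*row))) for row in INPUTS]

def minterm(assignment):
    """Conjunction of literals, e.g. {0: 0, 1: 1} encodes !X1 ^ X2."""
    return lambda *x: all(x[i] == bit for i, bit in assignment.items())

def force_to_one(p, m):
    """X' = X v M"""
    return lambda *x: p(*x) or m(*x)

def force_to_zero(p, m):
    """X' = X ^ !M"""
    return lambda *x: p(*x) and not m(*x)

X = lambda x1, x2, x3: (x1 and x2 and not x3) or x3

point = force_to_one(X, minterm({0: 0, 1: 1, 2: 0}))   # M = !X1 ^ X2 ^ !X3
block = force_to_one(X, minterm({0: 0}))               # M = !X1
fixed = force_to_zero(X, minterm({0: 0, 1: 1}))        # M = !X1 ^ X2

for name, f in [("X", X), ("point", point), ("block", block), ("fixed", fixed)]:
    print(name, output_vector(f))
```

A complete minterm changes one entry of the output vector, while an incomplete minterm forces a whole block of entries to the same value.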
Polynomial Runtime with High Probability of
Success on All Boolean Problems!
88
Proof idea: choose the number v of fixed variables such that the number of
blocks partitioning the output vector is polynomial in n (so that the runtime is
polynomial) and sufficiently larger than the training set size, so that
each training example falls in a block of its own w.h.p. (which
guarantees that the optimum can be reached).
Lesson from Theory
• Rigorous runtime analysis of GSGP on general classes of
non-toy problems is possible as the landscape is always a
cone
• There are issues with GSGP which require careful design
of semantic mutations to obtain efficient search. Theory
can guide the design of provably good semantic operators
in terms of runtime
• Runtime analyses of GSGP with several other mutation
operators for Boolean, arithmetic and classification
domains have been carried out, producing refined, provably good
semantic search operators
89
90
V. Other developments &
current research directions
91
SGP and Neutrality
• Similarly to non-semantic operators, SGP operators can be ineffective (in the semantic sense).
– The offspring is a semantic clone of a parent.
– Slows down the search process.
• Fraction of neutral mutations:

Operator                 Symbolic regression   Boolean function synthesis
SGX (Moraglio et al.)    0.679                 0.719
AGX (Pawlak et al.)      0.131                 0.935
LGX (Krawiec et al.)     0.067                 0.724
KLX (Krawiec et al.)     0.866                 0.895
SAC (Uy et al.)          0.067                 0.649
GPX (Koza et al.)        0.103                 0.518

• Can be tackled by testing potential offspring for semantic neutrality.
92
GP as a Test-Based Problem
• Test based problem (S, T, G, Q) (Popovici et al. 2012):
– S – set of candidate solutions (in GP: programs)
– T – set of tests (in GP: tests, fitness cases)
– G – interaction matrix
– Q – quality measure
• Examples: Games (strategies vs. opponents), control problems (controllers vs. initial
conditions), machine learning from examples (hypotheses vs. examples)
– Generally: co-optimization and co-search
93
Discovery of Underlying Objectives via
Clustering
(Krawiec & Liskowski 2013)
94
Behavioral GP
• Generalizes program behavior to the entire course of program execution, not only
program output
• Program behavior = list of execution traces
(Krawiec & Swan 2013, Krawiec & O’Reilly 2014)
95
Behavioral GP: Example
96
Recent Developments
• New approaches based on semantic back propagation
(Ffrancon & Schoenauer, 2015)
• Lexicase selection (Helmuth et al. 2012)
• Relationship to novelty search (program semantics =
behavioral descriptor)
• Application to other types of GP
– Geometric Semantic Grammatical Evolution
• Many Real-World Applications (Vanneschi et al, 2013)
• Generalisation Studies
– PAC learning for provably good generalisation of GSGP
• Derivation of semantic operators for more complex domains
(e.g., recursive programs) on more complex data structures
(e.g., lists)
Other Lines of Investigation in GSGP
97
98
Thank you!
Questions?
Credits: The authors thank Bartosz Wieloch and Tomasz Pawlak for their
feedback on the slides of the tutorial. Other credits: Wikipedia
99
References
• A. Moraglio, K. Krawiec, C. Johnson, Geometric Semantic Genetic Programming, PPSN XII, 2012.
• K. Krawiec, P. Lichocki, Approximating Geometric Crossover in Semantic Space, GECCO 2009.
• K. Krawiec, T. Pawlak, Locally Geometric Semantic Crossover: A Study on the Roles of Semantics and Homology in
Recombination Operators, Genetic Programming and Evolvable Machines, 2013.
• T. Pawlak, B. Wieloch, K. Krawiec, Semantic Backpropagation for Designing Genetic Operators in Genetic Programming, IEEE
Transactions on Evolutionary Computation, 2014.
• L. Beadle, C. Johnson, Semantically Driven Crossover in Genetic Programming, CEC 2008,
• L. Beadle, C. Johnson, Semantically Driven Mutation in Genetic Programming, CEC 2009,
• N.Q. Uy, N.X. Hoai, M. O’Neill, R.I. McKay, E. Galvan-Lopez, Semantically-based crossover in genetic programming: application
to real-valued symbolic regression, Genetic Programming and Evolvable Machines, 2011,
• N.Q. Uy, N.X. Hoai, M. O’Neill, R.I. McKay, D.N. Phong, On the roles of semantic locality in genetic programming, Information
Sciences, 2013,
• N.Q. Uy, N.X. Hoai, Michael O’Neill, Semantics based mutation in genetic programming: The case for real-valued symbolic
regression, MENDEL 2009.
• L. Beadle, C. Johnson, Semantic analysis of program initialisation in genetic programming, Genetic Programming and Evolvable
Machines, 2009,
• D. Jackson, Promoting Phenotypic Diversity in Genetic Programming, PPSN XI, 2010.
• Semantic selection:
• E. Galvan-Lopez, B. Cody-Kenny, L. Trujillo, A. Kattan, Using Semantics in the Selection Mechanism in Genetic Programming:
a Simple Method for Promoting Semantic Diversity, CEC 2013.
• R.E. Smith, S. Forrest, and A.S. Perelson. “Searching for diverse, cooperative populations with genetic algorithms”. In:
Evolutionary Computation 1.2 (1993).
• Lasarczyk, C. W. G., Dittrich, P. & Banzhaf, W. Dynamic Subset Selection Based on a Fitness Case Topology,
Evolutionary Computation, 2004, 12, 223-242
• Nguyen Quang Uy, Nguyen Xuan Hoai, Michael O’Neill, R. I. McKay, and Dao Ngoc Phong. On the roles of semantic
locality of crossover in genetic programming. Information Sciences, 235:195–213, 20 June 2013.
• Mauro Castelli, Leonardo Vanneschi, and Sara Silva. Semantic search-based genetic programming and the effect of
intron deletion. IEEE Transactions on Cybernetics, 44(1):103–113, January 2014.
• Langdon, W. B. & Poli, R. Foundations of Genetic Programming Springer-Verlag, 2002
• McPhee, N. F., Ohs, B. & Hutchison, T., Semantic Building Blocks in Genetic Programming, in O'Neill, M et al. (eds.)
Proceedings of the 11th European Conference on Genetic Programming, EuroGP 2008, Springer, 2008, 4971, 134-145
100
References
• A. Moraglio, Towards a Geometric Unification of Evolutionary Algorithms, PhD Thesis, University of Essex, UK, 2007.
• A. Moraglio, R. Poli, Topological Interpretation of Crossover, Genetic and Evolutionary Computation Conference, pages 1377-
1388, 2004.
• A. Moraglio, A. Mambrini, L. Manzoni, Runtime Analysis of Mutation-Based Geometric Semantic Genetic Programming on
Boolean Functions, Foundations of Genetic Algorithms, 2013.
• A. Moraglio, A. Mambrini, Runtime Analysis of Mutation-Based Geometric Semantic Genetic Programming for Basis Functions
Regression, Genetic and Evolutionary Computation Conference, 2013.
• A. Mambrini, L. Manzoni, A. Moraglio, Theory-Laden Design of Mutation-Based Geometric Semantic Genetic Programming for
Learning Classification Trees, IEEE Congress on Evolutionary Computation 2013.
• A. Moraglio, J. McDermott, M. O’Neill, Geometric Semantic Grammatical Evolution, SMGP workshop at PPSN, 2014.
• A. Moraglio, An Efficient Implementation of GSGP using Higher-Order Functions and Memoization, SMGP workshop at PPSN,
2014.
• J. Fieldsend, A. Moraglio. Strength through diversity: Disaggregation and multi-objectivisation approaches for genetic
programming, GECCO, 2015 (to appear).
• L. Vanneschi, M. Castelli, L. Manzoni, S. Silva, A New Implementation of Geometric Semantic GP and Its Application to
Problems in Pharmacokinetics, EuroGP 2013
• L. Vanneschi, S. Silva, M. Castelli, L. Manzoni, Geometric semantic genetic programming for real life applications, in Genetic
Programming Theory and Practice XI, 2013
• R. Ffrancon, M. Schoenauer, Greedy Semantic Local Search for Small Solutions, Semantic Methods in Genetic Programming
Workshop, GECCO’15, 2015.

Semantic Genetic Programming Tutorial

  • 1. 1 Semantic Genetic Programming Alberto Moraglio University of Exeter Exeter, UK [email protected] Krzysztof Krawiec Poznan University of Technology Poznan, Poland [email protected]
  • 2. 2 Instructors • Alberto Moraglio – Position: Lecturer in Computer Science at the University of Exeter, UK – Research Area: founder of the Geometric Theory of Evolutionary Algorithms, which unifies Evolutionary Algorithms across representations and has been used for the principled design of new successful search algorithms, including a new form of Genetic Programming based on semantics, and for their rigorous theoretical analysis. • Krzysztof Krawiec – Position: Associate Professor at Poznan University of Technology, Poland – Research Area: genetic programming and coevolutionary algorithms, with applications in program synthesis, modeling, image analysis, and games. Within GP: design of effective search operators (particularly crossovers), discovery of semantic modularity of programs, and exploitation of program execution traces for improving performance of program synthesis.
  • 3. 3 Aims • Give a comprehensive overview of semantic methods in genetic programming • Illustrate in an accessible way a formal geometric framework for program semantics • Analyze rigorously their performance (runtime analysis) • Present current challenges and trends in semantic GP • Outline new emerging approaches
  • 4. 4 Agenda 1. Introduction to Semantic Genetic Programming 2. Geometric Operators on Semantic Space 3. Approximating Geometric Semantic Genetic Programming 4. Geometric Sematic Genetic Programming 5. Other Developments and Current Research Directions
  • 5. 5 I. Introduction to Semantic Genetic Programming
  • 6. 6 Genetic Programming • Generate-and test approach to program synthesis • Programs represented as symbolic structures (usually abstract syntax trees, ASTs) • Population-based • Iterative: start with a population of programs drawn at random, and repeat: – select the most promising individuals, – perturb using mutation and crossover • … until solution found • This tutorial: focus on tree-based GP (but usually easily generalizable to other genres).
  • 7. 7 Motivations for Semantic GP (SGP) • Traditional GP search operates directly on syntax, largely disregarding program semantics. • Consequences: – Complex, rugged genotype-phenotype mapping – Low relatedness of offspring to parents – Slight change can dramatically change the output of the program – And conversely: high likelihood of no-effect (neutrality) – Low fitness-distance correlation
  • 8. 8 Questions • Can we make GP more aware of the effects of program execution, i.e., program ‘behavior’? • Can we design search operators that produce offspring programs that behave similarly to their parent(s)? • Can we design search operators that are guaranteed to do so?
  • 9. 9 Program Semantics • Program semantics = a formal method of capturing program behavior in abstraction from syntax. • Common formalisms: denotational semantics, operational semantics. – Rarely applicable in GP, where program correctness is typically expressed w.r.t. fitness cases (tests). • Note: semantics (noun) vs. semantic (adj.)
  • 10. 10 GP Semantics • Problems in GP are typically posed using a set of fitness cases (tests). • Observation: program behavior is reflected in the effects of computation, i.e., program output. • Program semantics in GP: the tuple (vector) of outputs for the training fitness cases. • Important consequence: the semantics s(p) of a program p is a point in an n-dimensional space, where n is the number of fitness cases. • The distance between s(p1) and s(p2) reflects the semantic similarity of p1 and p2.
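The definition of semantics as a vector of outputs can be made concrete with a small sketch. Everything here is illustrative: the helper names `semantics`, `hamming` and `euclidean`, the two example programs and the choice of all 2^3 Boolean inputs as fitness cases are assumptions, not part of the tutorial.

```python
import math
from itertools import product

def semantics(program, fitness_cases):
    """Semantics s(p): the tuple of outputs of p, one per fitness case."""
    return tuple(program(*case) for case in fitness_cases)

def hamming(u, v):
    """Number of positions where two output vectors differ (Boolean outputs)."""
    return sum(a != b for a, b in zip(u, v))

def euclidean(u, v):
    """Euclidean distance between two output vectors (continuous outputs)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical fitness cases: all 2^3 assignments of three Boolean inputs.
cases = list(product([False, True], repeat=3))

# Two illustrative programs, written as plain Python functions.
p1 = lambda x1, x2, x3: x1 or (x2 and not x3)
p2 = lambda x1, x2, x3: x1 and x2

s1 = semantics(p1, cases)   # a point in an 8-dimensional semantic space
s2 = semantics(p2, cases)
print(hamming(s1, s2))      # semantic distance between p1 and p2
```

With eight fitness cases each program becomes a point in an 8-dimensional space, and the printed value is the semantic distance between the two programs.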
  • 11. 11 Semantic Building Blocks (McPhee, Ohs, Hutchison 2007/2008) • Studied the impact of subtree crossover in terms of semantic building blocks. • Describe the semantic action of crossover. • Provide insight into what does (or doesn’t) make crossover effective. • Define semantics of subtrees and semantics of contexts, where context = a tree with one branch missing. • Definition of program semantics inspired by Poli's and Page's work on sub-machine code GP
  • 12. 12 Semantic Building Blocks (McPhee, Ohs, Hutchison 2007/2008) • The distribution of context semantics is key to the success (or failure) of runs. • A very high proportion (typically over 75%) of crossover events are guaranteed to perform no useful search in the semantic space.
  • 13. 13 Semantically-Driven Crossover (SDC) (Beadle and Johnson 2008) • Program semantics = reduced ordered binary decision diagrams (ROBDDs) • Trial-and-error wrapper of tree-swapping crossover: – Pick a pair of parents and generate from them a potential offspring (candidate offspring) – Calculate the ROBDD semantics of parents and offspring – Repeat if the offspring's semantics is the same as that of either parent • Analogously: Semantically-driven mutation (SDM) (Beadle & Johnson 2009)
  • 14. 14 Semantic-Aware Crossovers • Motivation: swap semantically similar subprograms in the parent programs, to ‘smoothen’ the semantic effect of crossover. • Semantic-aware crossover (SAX) (Quang et al. 2011) – Select a pair of subprograms such that their semantics are sufficiently similar (upper limit on distance) • Semantic Similarity-based Crossover (SSX) (Quang et al. 2011) – As SAX, but also imposes a lower limit on the distance between the subprograms, to prevent producing semantically neutral offspring (see efficiency later in this tutorial). • (Quang et al. 2013): picks the closest semantically different subprogram in the other parent. • Analogous mutations are defined too.
  • 15. 15 Semantic-Aware Initialization Semantically-driven Initialization (Beadle and Johnson 2009) • Constructs a population of semantically distinct programs of gradually increasing complexity. • Start with population P filled with all single-instruction programs • To generate a new program: – Repeat: • Create a random program p by combining a randomly selected non-terminal instruction r (of arity k) with k randomly selected programs in P – Until p has a non-constant semantics that is sufficiently distant from semantics of all programs in P – Add p to P and return p
  • 16. 16 Semantic-Aware Initialization • Behavioral Initialization (Jackson 2010) – Set P ← ∅ • To generate a new program: – Repeat: • Create a random program p using conventional methods (e.g., Grow or Full) – Until the semantics of p is sufficiently distant from the semantics of all programs in P – Add p to P and return p • Observation: semantic diversity decreases rapidly as the run progresses (as opposed to syntactic/structural diversity, which increases and then levels off)
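  • 19. Balls & Segments • Ball of radius $r$ centred at $x$ in a metric space $(S, d)$: $B(x; r) = \{ y \in S \mid d(x, y) \le r \}$ • Segment between $x$ and $y$: $[x; y] = \{ z \in S \mid d(x, z) + d(z, y) = d(x, y) \}$ 19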
  • 20. Squared Balls & Chunky Segments [Figure, balls: B(000; 1) in Hamming space; B((3, 3); 1) in Euclidean space; B((3, 3); 1) in Manhattan space. Figure, line segments: [000; 011] = [001; 010] in Hamming space (2 geodesics); [(1, 1); (3, 2)] in Euclidean space (1 geodesic); [(1, 1); (3, 2)] = [(1, 2); (3, 1)] in Manhattan space (infinitely many geodesics)] 20
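The ball and segment definitions can be checked by enumeration on the 3-bit Hamming space used in the figure. This is a minimal sketch; the function names `ball` and `segment` are chosen here for illustration.

```python
from itertools import product

def hamming(u, v):
    """Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(u, v))

def ball(space, centre, radius, d):
    """B(centre; radius): all points within distance radius of centre."""
    return [y for y in space if d(centre, y) <= radius]

def segment(space, x, y, d):
    """[x; y]: all points z with d(x, z) + d(z, y) = d(x, y)."""
    return [z for z in space if d(x, z) + d(z, y) == d(x, y)]

# The whole 3-bit Hamming space: 000, 001, ..., 111.
space = ["".join(bits) for bits in product("01", repeat=3)]

print(ball(space, "000", 1, hamming))
print(segment(space, "000", "011", hamming))
print(segment(space, "001", "010", hamming))
```

The last two calls return the same set of points, which is the sense in which [000; 011] = [001; 010]: in Hamming space a segment is not determined by a unique pair of end points.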
  • 21. Geometric Crossover & Mutation • Geometric crossover: a recombination operator is a geometric crossover under the metric d if all its offspring are in the d-metric segment between its parents. • Geometric mutation: a mutation operator is an r-geometric mutation under the metric d if all its offspring are in the d-ball of radius r centred at the parent. 21
  • 22. Example of Geometric Mutation [Figure: the Hamming cube on the bit strings 000, 001, 010, 011, 100, 101, 110, 111] Neighbourhood structure naturally associated with the shortest path distance. Traditional one-point mutation is 1-geometric under Hamming distance. 22
  • 23. Example of Geometric Crossover • Geometric crossover: offspring are in a segment between parents for some distance. • The traditional crossover is geometric under the Hamming distance. • Example: parents A = 10110 and B = 11011, offspring X = 11010; H(A,X) = 2, H(X,B) = 1, H(A,B) = 3, so H(A,X) + H(X,B) = H(A,B) 23
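The claim that traditional crossover is geometric under Hamming distance can be verified exhaustively for the parents A and B of this slide. The sketch below uses a mask-based crossover (the helper `mask_crossover` is an illustrative name) and checks the segment condition for every possible mask.

```python
from itertools import product

def hamming(u, v):
    """Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(u, v))

def mask_crossover(a, b, mask):
    """Take the bit from a where the mask is '1', from b elsewhere."""
    return "".join(x if m == "1" else y for x, y, m in zip(a, b, mask))

def in_segment(z, x, y):
    """True if z lies in the Hamming segment [x; y]."""
    return hamming(x, z) + hamming(z, y) == hamming(x, y)

A, B = "10110", "11011"
X = mask_crossover(A, B, "00001")   # first four bits from B, last bit from A
print(X, hamming(A, X), hamming(X, B), hamming(A, B))

# Every mask (one-point, two-point, uniform, ...) yields offspring in [A; B].
all_masks = ["".join(m) for m in product("01", repeat=len(A))]
always_geometric = all(in_segment(mask_crossover(A, B, m), A, B) for m in all_masks)
print(always_geometric)
```

One-point and uniform crossover are both special cases of mask crossover, so the same check covers them.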
  • 24. Significance of Geometric View • Unification Across Representations • Simple Landscape for Crossover • Crossover Principled Design • Principled Generalisation of Search Algorithms • General Theory Across Representations 24
  • 25. Semantic Operators • Semantic search operators: operators that act on the syntax of the programs but that guarantee that some semantic criterion holds (e.g., semantic mutation: offspring are semantically similar to parents) [Figure: a semantic mutation applied to a program induces, through the semantics mapping, a mutation of its output vector, e.g., 0 1 0 1 0 1 1 1 → 0 1 1 1 0 1 1 1] 25
  • 26. Fitness as Distance • Aim: we want to find a function that scores perfectly on a given set of input-output examples (test cases) • Error of a program: number of mismatches on the test cases • Fitness as distance: the error of a program can be interpreted as the distance of the output vector of the program to the target output vector • Distance functions: Hamming distance for Boolean outputs, Euclidean distance for continuous outputs 26
  • 27. Semantic Distance & Operators • The semantic distance between two functions is the distance of their output vectors measured with the distance function used in the definition of the fitness function • Semantic geometric operators are geometric operators defined on the metric space of functions endowed with the semantic distance 27
  • 28. Semantic Fitness Landscape • The fitness landscape seen by GP with semantic geometric operators is always a cone landscape by definition (unimodal with a linear gradient) which GP can easily optimise! 28
  • 30. 30 Trial-and-Error Geometric Crossover (KLX) Krawiec and Lichocki Crossover, KLX (Krawiec and Lichocki 2009) • Goal: minimize the offspring’s total semantic distance from the parents under some assumed metric || ||. • Technical realization: mate the parents (p1, p2) repeatedly using a ‘regular’ crossover operator CX • Calculate the parent semantics s(p1), s(p2) • Apply CX to (p1, p2) n times, creating a pool of candidates C • Calculate the semantics s(z) of each candidate z ∈ C • Return the candidate z that minimises the total distance: argmin ||s(z) - s(p1)|| + ||s(z) - s(p2)|| • A form of brood selection
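The KLX procedure can be sketched as a generic brood-selection wrapper. This is not the authors' implementation: the function signature, the toy "programs" (tuples that serve as their own semantics) and the fixed candidate list that stands in for a stochastic crossover CX are all assumptions made to keep the example deterministic.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def klx(p1, p2, crossover, semantics, dist, n):
    """Breed n candidates with a regular crossover and return the one whose
    semantics has the smallest total distance from both parents' semantics."""
    s1, s2 = semantics(p1), semantics(p2)
    pool = [crossover(p1, p2) for _ in range(n)]

    def total_distance(z):
        sz = semantics(z)
        return dist(sz, s1) + dist(sz, s2)

    return min(pool, key=total_distance)

# Toy stand-ins (hypothetical): a "program" is a tuple that is its own
# semantics, and the "crossover" replays a fixed list of candidates.
candidates = iter([(5.0, 5.0), (1.0, 0.5), (2.0, 0.0)])
toy_crossover = lambda a, b: next(candidates)
identity_semantics = lambda p: p

best = klx((0.0, 0.0), (4.0, 0.0), toy_crossover, identity_semantics, euclidean, 3)
print(best)
```

The selected candidate lies exactly on the segment between the parents, so its total distance equals the distance between the parents, the minimum possible value.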
  • 31. 31 Trial-and-Error Geometric Crossover (KLX) Motivation: Given a globally convex fitness landscape (one global optimum), solutions on a segment connecting solutions x and y cannot be worse than the worse of them.
  • 32. 32 Promotion of Equidistance • All candidate offspring on the segment [s(p1);s(p2)] minimize total distance equally well, no matter how different from the parents they are. – An offspring z that is a ‘semantic clone’ of p1 (s(z) = s(p1)) also minimises the total distance. – The likelihood of crossover producing a semantic clone of one of the parents is high in GP (see remarks on neutrality later) • KLX promotes similarity to parents. This may hamper exploration. • Idea: Extend total distance by a term that promotes balanced distance from both parents (KLX+) argmin ||s(z) - s(p1)|| + ||s(z) - s(p2)|| + | ||s(z) - s(p1)|| - ||s(z) - s(p2)|| |
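The effect of the extra KLX+ term can be seen on two candidates that the plain KLX criterion cannot tell apart. The function names and the 2D semantics below are illustrative assumptions; the two scoring formulas are the ones stated on the slide.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def klx_score(sz, s1, s2, dist=euclidean):
    """Total distance of a candidate's semantics from both parents."""
    return dist(sz, s1) + dist(sz, s2)

def klx_plus_score(sz, s1, s2, dist=euclidean):
    """Total distance plus a penalty for unbalanced distances (KLX+)."""
    d1, d2 = dist(sz, s1), dist(sz, s2)
    return d1 + d2 + abs(d1 - d2)

s1, s2 = (0.0, 0.0), (4.0, 0.0)   # hypothetical parent semantics
clone = (0.0, 0.0)                # semantic clone of parent 1
middle = (2.0, 0.0)               # midpoint of the segment

print(klx_score(clone, s1, s2), klx_score(middle, s1, s2))
print(klx_plus_score(clone, s1, s2), klx_plus_score(middle, s1, s2))
```

Under KLX both candidates score the same, while under KLX+ the semantic clone is penalised and the equidistant candidate is preferred.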
  • 33. 33 Locally Geometric Crossover (Krawiec & Pawlak 2012) • Motivations: finding an ‘almost geometric’ offspring can be difficult for entire parent programs, – … but should be easier for subprograms. – This may make sense if ‘geometricity’ can propagate through a tree. • The algorithm: – Find the syntactic common region of the parents (where the trees overlap) – Select two homologous nodes (subprograms) p1 and p2 in the common region – Calculate the midpoint sm between s(p1) and s(p2) – Find two programs p’1 and p’2 in a library whose semantics are closest to sm – Replace p1 and p2 with p’1 and p’2, respectively.
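The midpoint and library-lookup steps of the algorithm can be sketched as follows. The tiny library, its keys and the three fitness cases x = 0, 1, 2 are hypothetical, and a linear scan stands in for whatever search structure a real implementation would use.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def midpoint(s1, s2):
    """Midpoint of the segment between two semantics (vector space assumed)."""
    return tuple((a + b) / 2 for a, b in zip(s1, s2))

def closest(library, target, k, dist=euclidean):
    """Names of the k library programs whose semantics are nearest to target."""
    return sorted(library, key=lambda name: dist(library[name], target))[:k]

# Hypothetical library: program text -> semantics on fitness cases x = 0, 1, 2.
library = {
    "x":     (0.0, 1.0, 2.0),
    "x + 1": (1.0, 2.0, 3.0),
    "2 * x": (0.0, 2.0, 4.0),
    "x * x": (0.0, 1.0, 4.0),
}

s_p1 = (0.0, 1.0, 2.0)   # semantics of the subprogram picked in parent 1
s_p2 = (2.0, 3.0, 4.0)   # semantics of the homologous subprogram in parent 2
sm = midpoint(s_p1, s_p2)
print(sm, closest(library, sm, k=2))
```

The two returned programs would replace the selected subprograms in the two parents.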
  • 35. 35 Semantic Backpropagation • Motivation: many instructions used in GP are invertible or partially invertible. • Example: symbolic regression: – Fully invertible: e.g., addition: y = x + c ⇒ x = y - c – Partially invertible: e.g., square: y = x^2 ⇒ x = ±sqrt(y) • The desired output t of a program (target) is known. • Given a program and t, this allows deriving the desired semantics at any point in a program tree.
  • 36. 36 Semantic Backpropagation SBP can be used to back propagate any semantics.
  • 37. 37 Semantic Backpropagation • Note: desired semantics is not a vector of scalar values. • Desired semantics is a tuple of sets of desired outputs, because not all instructions are bijective. Examples: – D = ({2}, {3}, {2,-4}, {0, 1}) – D = ({T}, {F}, {T,F}) • Special case: non-realizable desired semantics, e.g., D = ({T}, ∅, {T,F}) – Or: non-realizable under assumed constraints (e.g., size of subprogram). • Algorithms have to account for that.
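A minimal sketch of how desired semantics become tuples of sets: each instruction on the path from the root to the chosen node is inverted, one fitness case at a time. The helper names, the example program (# + 1)^2 and the target vector are assumptions chosen to show one ambiguous and one non-realizable case.

```python
import math

def invert_add(desired, c):
    """y = x + c  =>  x = y - c (fully invertible)."""
    return {y - c for y in desired}

def invert_square(desired):
    """y = x**2  =>  x = +sqrt(y) or -sqrt(y); a negative y has no solution."""
    result = set()
    for y in desired:
        if y >= 0:
            root = math.sqrt(y)
            result.update({root, -root})
    return result

def backpropagate(desired_semantics, inverters):
    """Push the desired output sets from the root down to a node, applying
    the inverse of each instruction on the path, per fitness case."""
    result = []
    for desired in desired_semantics:
        for invert in inverters:
            desired = invert(desired)
        result.append(desired)
    return tuple(result)

# Hypothetical program (# + 1)**2, where # marks the node to be replaced,
# and a hypothetical target vector t = (4, 0, -1).
target = ({4.0}, {0.0}, {-1.0})
D = backpropagate(target, [invert_square, lambda d: invert_add(d, 1.0)])
print(D)
```

The first fitness case yields two acceptable values, the second exactly one, and the third an empty set, i.e. a non-realizable desired semantics that an operator has to handle.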
  • 38. 38 Propagation of Desired Semantics • Two fitness cases, 2D semantic space • Desired outputs: (0,0) • Program: cos(sin(x)) • Visualization: – semantic distance as a function of inputs (x1, x2) – red = smaller semantic distance (greater fitness)
  • 39. 39 Propagation of Desired Semantics • Top: desired semantics of cos(#) – target achieved for x1, x2 = π/2 + kπ, k ∈ Z • Bottom: desired semantics of cos(sin(#)) – Target cannot be achieved, because sin(x) ∈ [-1, 1], and thus no x causes cos(sin(x)) = 0
  • 40. 40 Operators Based on SBP • Approximately Geometric Crossover, AGX (Krawiec & Pawlak 2013) –A crossover operator –Uses SBP to match the midpoint on the segment connecting the parents’ semantics –Starting point of SBP: the midpoint on the segment • Random Desired Operator, RDO (Wieloch & Krawiec 2013) –A mutation operator –Uses SBP to match the target of the search process –Starting point of SBP: the target semantics of the
  • 41. 41 Operators Based on SBP • Common part of workflow: –Pick a node p’ in a parent p –Perform semantic backpropagation of desired semantics from the root of p to p’, obtaining desired semantics D –Replace p’ with a (sub)program from a library that best matches D • Other differences: –RDO is agnostic about geometric considerations –RDO and AGX may use various libraries
  • 42. 42 AGX: Some Results (Pawlak, Wieloch, Krawiec, 2014)
  • 43. 43 Library of Subprograms • The source of subprograms for SBP – Static: generated prior to the run – Dynamic: other programs in the current population • Example of static library: all programs built upon a given set of instructions. – Instructions {+, −, ×, /, sin, cos, exp, log, x}, max tree height h – Semantic duplicates eliminated • Total number of programs: 212 (for h = 3), 108520 (for h = 4) – Depends on the instruction set and tests (in general the fewer tests, the fewer unique semantics) – Impact of floating-point precision
  • 44. 44 Semantic Diversity of Libraries Exemplary library: • All programs composed of {+,−,×,/,sin,exp,x}, max tree depth: 4. • Semantics: 20 points distributed equidistantly in [−5, 5] ⇒ 20-dimensional semantic space • Semantic duplicates removed. Visualization: • Reduction to 2D by PCA, • Red: the smallest (i.e. single node) programs, • Blue: the longest (i.e. 15 nodes) programs. Observation: strongly non-uniform distribution of semantics. • Expected: see (Langdon & Poli 2002)
  • 45. 45 Technical Challenges of SBP • Limited semantic diversity – Using a mutation operator in parallel recommended (to provide constant influx of new code) • Computational overhead of library search – Can be tackled with appropriate algorithms (nearest-neighbor search, e.g., kd-trees)
  • 46. 46 SBP: Remarks and Extensions • Requirements of SBP-based operators – AGX requires a means of constructing a midpoint on a segment. • Possible in vector spaces, but in general not in metric spaces – RDO can work with any metric (vector space not required) • The node/subtree p to be replaced can be selected deterministically: – E.g., the node where the divergence of the actual semantics s(p) and the desired semantics D is the greatest (Wieloch 2012)
  • 48. Geometric Semantic Operators Construction • By approximation: – Trial & error is wasteful – Offspring do not conform exactly to the semantic requirement • By direct construction: is it possible to find search operators that operate on syntax but are guaranteed, by direct construction, to respect geometric semantic criteria? • Due to the complexity of the genotype-phenotype map in GP, Krawiec & Lichocki (2009) hypothesized that designing a crossover operator with such a guarantee is in general impossible. A pessimist? No, the established view until then... 48
  • 49. Geometric Semantic Crossover for Boolean Expressions 49 T1, T2: parent trees TR: random tree T3 = (T1 ∧ TR) ∨ (¬TR ∧ T2)
  • 50. Theorem The output vector of the offspring T3 is in the Hamming segment between the output vectors of its parent trees T1 and T2 for any tree TR 50
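The theorem can be checked exhaustively on a small instance. The sketch assumes the offspring form T3 = (T1 and TR) or (not TR and T2), which is the expression used for the offspring OX in the functional compactification slides; the parents are illustrative, and every one of the 256 Boolean functions of three inputs is tried as TR.

```python
from itertools import product

cases = list(product([False, True], repeat=3))

def semantics(f):
    """Output vector of a Boolean function over all 2^3 inputs."""
    return tuple(bool(f(*c)) for c in cases)

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def gs_crossover(t1, t2, tr):
    """Offspring T3 = (T1 and TR) or (not TR and T2), built as a function."""
    return lambda *x: (t1(*x) and tr(*x)) or (not tr(*x) and t2(*x))

def tree_from_table(table):
    """Stand-in for a random tree TR: a function with the given truth table."""
    lookup = dict(zip(cases, table))
    return lambda *x: lookup[x]

# Illustrative parents.
t1 = lambda x1, x2, x3: x1 or (x2 and not x3)
t2 = lambda x1, x2, x3: x1 and x2
s1, s2 = semantics(t1), semantics(t2)

# The segment condition must hold for every possible TR.
theorem_holds = all(
    hamming(s1, s3) + hamming(s3, s2) == hamming(s1, s2)
    for table in product([False, True], repeat=len(cases))
    for s3 in [semantics(gs_crossover(t1, t2, tree_from_table(table)))]
)
print(theorem_holds)
```

The check passes because each output bit of T3 is copied from T1 where TR is true and from T2 elsewhere, so the output vector of TR acts as a crossover mask.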
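  • 51. Example: parity problem • 3-parity problem: we want to find a function P(X1,X2,X3) that returns 1 when an odd number of input variables is 1, 0 otherwise. • Target output vector (input rows 000 to 111): Y = 0 1 1 0 1 0 0 1 • Output vector of a candidate program: O = 0 1 0 1 0 1 1 1 • Error = HD(Y, O) = 5 51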
  • 52. Example: tree crossover [Figure: parent trees T1 and T2, random tree TR, and the offspring tree T3 obtained by substitution & simplification] 52
  • 53. Example: output vector crossover 53 • The output vector of TR acts as a crossover mask to recombine the output vectors of T1 and T2 to produce the output vector T3. • This is a geometric crossover on the semantic distance: output vector of T3 is in the Hamming segment between the output vectors of T1 and T2.
  • 54. Geometric Semantic Crossover for Arithmetic Expressions 54 • Function co-domain: real • Output vectors: real vectors • Semantic distance = Euclidean: CR = random real in [0,1] • Semantic distance = Manhattan: CR = random function with co-domain [0,1] • T3 = [formula shown as an image on the slide]
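The offspring formula did not survive in this transcript. The sketch below assumes the convex-combination form T3 = T1·r + (1 − r)·T2 with a random real r in [0, 1], which is the form given for the Euclidean case in Moraglio, Krawiec and Johnson (PPSN 2012); the parents, the inputs and the value of r are illustrative.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def gs_crossover_arith(t1, t2, r):
    """Assumed offspring form: T3(x) = r * T1(x) + (1 - r) * T2(x), r in [0, 1]."""
    return lambda x: r * t1(x) + (1 - r) * t2(x)

xs = [0.0, 0.5, 1.0, 2.0]            # hypothetical fitness-case inputs
t1 = lambda x: x * x                 # illustrative parents
t2 = lambda x: 2 * x + 1
t3 = gs_crossover_arith(t1, t2, 0.25)

s1, s2, s3 = ([t(x) for x in xs] for t in (t1, t2, t3))

# The offspring's output vector lies on the Euclidean segment between the
# parents' output vectors (up to floating-point rounding).
on_segment = math.isclose(euclidean(s1, s3) + euclidean(s3, s2), euclidean(s1, s2))
print(s3, on_segment)
```

Because r is a constant, the whole output vector of T3 is the same convex combination of the parents' vectors, which places it on the straight line between them.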
  • 55. Geometric Semantic Crossover for Classifiers 55 Function co-domain: symbol Output vectors: symbol string Semantic distance = Hamming RC = random function with boolean co-domain (i.e., random condition function of the inputs) T3 =
  • 56. Remark 1: Domain-Specific • Unlike traditional syntactic operators which are of general applicability, semantic operators are domain-specific • But there is a systematic way to derive them for any domain 56
  • 57. Remark 2: Quick Growth • Offspring grows in size very quickly, as the size of the offspring is larger than the sum of the sizes of its parents! • To keep the size manageable we need to simplify the offspring without changing the computed function: – Boolean expressions: Boolean simplification – Math Formulas: algebraic simplification – Programs: simplification by formal methods 57
  • 58. Remark 3: Syntax Does Not Matter! • The offspring is defined purely functionally, independently of how the parents and the offspring itself are actually represented (e.g., as trees) • The genotype representation does not matter: a solution can be represented using any genotype structure (trees, graphs, sequences) or language (Java, Lisp, Prolog), as long as the semantic operators can be described in that language 58
  • 59. Semantic Mutations • It is possible to derive geometric semantic mutation operators. • They also have very simple forms for Boolean, Arithmetic and Program domains. 59
  • 65. Geometric Semantic Crossover for Boolean Expressions (Growth) 65 T1, T2: parent trees TR: random tree T3 = (T1 ∧ TR) ∨ (¬TR ∧ T2) size(T3) = 4 + 2 * size(TR) + size(T1) + size(T2) average size at generation n + 1 > 2 * average size at generation n PROBLEM: size grows exponentially in the number of generations!
  • 66. Geometric Semantic Mutation for Boolean Expressions (Growth) 66 T: parent tree M: random minterm tree TM: mutant tree size(TM) = 2 + size(M) + size(T) average size at generation n + 1 = constant + average size at generation n NO PROBLEM: size grows linearly in the number of generations
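The two size recurrences can be iterated to compare exponential and linear growth. The sketch assumes, purely for illustration, an initial program size of 5 nodes, random trees and minterms of 5 nodes, and crossover parents of equal (average) size.

```python
def crossover_size(size_t1, size_t2, size_tr):
    """size(T3) = 4 + 2 * size(TR) + size(T1) + size(T2)."""
    return 4 + 2 * size_tr + size_t1 + size_t2

def mutation_size(size_t, size_m):
    """size(TM) = 2 + size(M) + size(T)."""
    return 2 + size_m + size_t

def simulate(generations, initial=5, random_tree=5):
    """Track program size per generation under crossover only and mutation only."""
    xo, mu = [initial], [initial]
    for _ in range(generations):
        xo.append(crossover_size(xo[-1], xo[-1], random_tree))
        mu.append(mutation_size(mu[-1], random_tree))
    return xo, mu

xo_sizes, mu_sizes = simulate(10)
print(xo_sizes)   # more than doubles each generation
print(mu_sizes)   # grows by a constant each generation
```

After ten generations the crossover-only lineage has programs of tens of thousands of nodes, while the mutation-only lineage stays below a hundred.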
  • 67. Three Solutions 1. Algebraic simplification of offspring - Can be computationally expensive - Not all domains can be simplified algebraically - Understandable final solutions 2. Not using crossover - Semantic Hill-Climber finds optimum efficiently - Linear growth is acceptable 3. Compactification of offspring (Vanneschi et al, 2013) - Linear growth even with crossover - Applicable to any domain - Complicated Implementation (pointers structure) - Final solution is black box 67
  • 68. Compactification Method (Vanneschi et al, 2013) - Individuals are represented as an explicit shared data structure linked to their parents, and recursively to all their ancestry. - At each generation, each new offspring of crossover requires only a new triplet of references ⇒ linear growth in the number of generations. 68
  • 69. Compactification Method - Output vector of offspring can be computed using the explicitly stored output vectors of the parent and mask trees. This turns fitness computation from exponential in the number of generations to constant time. 69
  • 70. Compactification Method - Explicit garbage collection of unreferenced past individuals in the data structure. - The final solution is extracted from the data structure, but this takes time exponential in the number of generations. - The extracted solution is queried on non-training inputs to make predictions. This takes exponential time, since it is done on the extracted solution. Good idea, but can be improved and beautified! 70
  • 71. Functional Compactification (Moraglio, 2014) • Individuals are represented directly as anonymous Python functions: P1 = lambda x1, x2, x3: x1 or (x2 and not x3) P2 = lambda x1, x2, x3: x1 and x2 RF = lambda x1, x2, x3: not (x2 and x3) 71
  • 72. Functional Compactification • Offspring call parents rather than pointing to them: OX = lambda x1, x2, x3: (P1(x1, x2, x3) and RF(x1, x2, x3)) or (P2(x1, x2, x3) and not RF(x1, x2, x3)) • The size of offspring is constant in the number of generations 72
  • 73. Functional Compactification • Mutation and crossover are higher-order functions that take functions as input (parents) and return functions as output (offspring): Crossover: (B^3 → B) × (B^3 → B) → (B^3 → B) • The structure of function calls implicitly keeps track of all the ancestry of an individual 73
  • 74. Functional Compactification • All individuals are memoized functions: - The output for previously seen inputs is retrieved from an implicit storage, not recalculated - The first time the fitness of an individual is calculated, its output vector is implicitly stored - As the output vectors of parents are stored, computing the fitness of the offspring takes constant time in the number of generations 74
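The ideas on these slides (individuals as functions, crossover as a higher-order function, memoized outputs) can be combined in a few lines. This is a sketch in the spirit of the functional compactification, not the Tiny GSGP source; the hand-written `memoize` helper and the name `crossover` are assumptions, while P1, P2 and RF are the functions from the earlier slide.

```python
from itertools import product

def memoize(f):
    """Cache outputs by input tuple so each input is computed only once."""
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

def crossover(p1, p2, rf):
    """Higher-order function: parents and random mask in, offspring function out."""
    return memoize(lambda *x: (p1(*x) and rf(*x)) or (p2(*x) and not rf(*x)))

P1 = memoize(lambda x1, x2, x3: x1 or (x2 and not x3))
P2 = memoize(lambda x1, x2, x3: x1 and x2)
RF = memoize(lambda x1, x2, x3: not (x2 and x3))

OX = crossover(P1, P2, RF)   # the offspring calls its parents; no tree is copied

cases = list(product([False, True], repeat=3))
output_vector = [OX(*c) for c in cases]
print(output_vector)
```

Once the parents have been evaluated on the fitness cases, evaluating the offspring on the same cases costs only a few cache lookups per case, independent of how many generations of ancestry lie behind it.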
  • 75. Functional Compactification - Garbage collection of unreferenced past functions is done automatically by the Python interpreter. - The final solution is a compiled Python function (but its expression can be extracted by keeping track of its source code). The extracted solution would be exponentially long. - The compiled final solution can be queried on non-training inputs to make predictions. Thanks to memoization, obtaining the output takes only linear time. 75
  • 76. Functional Compactification • The functional interpretation of the compactification method implicitly delegates all the book-keeping of the original compactification method to the Python interpreter. • The resulting code is elegant, much shorter and clearer, as it has only minimal clutter (< 100 lines including extensive comments, vs > 2000 lines of C++ for the original compactification). 76
  • 77. 77 GSGP Implementations • Original Mathematica implementation with algebraic simplification (see https://github.com/amoraglio/GSGP) • Compactification method in C++ (see http://gsgp.sourceforge.net/) • Functional compactification aka Tiny GSGP in Python (see https://github.com/amoraglio/GSGP) • Scala implementation using the ScaPS library (see http://www.cs.put.poznan.pl/kkrawiec/wiki/?n=Site.Scaps)
  • 79. Runtime Analysis • Rigorous analytical formula of the expected optimisation time of the search algorithm A on the problem class P (on the worst instance) for increasing size n of the problem 79
  • 80. Runtime Analysis (example) • Algorithm: stochastic hill-climber, i.e., flip a bit of the current solution and accept the new solution if it is better than the current one • Problem class: OneMax, i.e., maximise the number of ones in the bit string; the problem size is the string length • Expected optimisation time: O(n log n) by a coupon collector argument • This result generalises to OneMax with an unknown target string, i.e., to any cone landscape on binary strings 80
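The algorithm analysed on this slide can be written down directly. The sketch flips one uniformly chosen bit per step and keeps the result only if the number of ones increases; the function name, the seed and the string length are illustrative, and no particular step count is claimed.

```python
import random

def hill_climb_onemax(n, rng):
    """Stochastic hill-climber on OneMax: flip one random bit per step and
    accept the new string only if it has more ones than the current one."""
    x = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while sum(x) < n:
        y = list(x)
        i = rng.randrange(n)
        y[i] = 1 - y[i]          # flip bit i
        if sum(y) > sum(x):      # accept strict improvements only
            x = y
        steps += 1
    return x, steps

solution, steps = hill_climb_onemax(20, random.Random(0))
print(solution, steps)
```

Each remaining zero bit has to be picked at least once, which is why the expected number of steps follows the coupon collector bound of order n log n.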
  • 81. Semantic Mutation (syntactic search & semantic effect) [Figure: a semantic mutation applied to a program induces, through the semantics mapping, a mutation of its output vector, e.g., 0 1 0 1 0 1 1 1 → 0 1 1 1 0 1 1 1] 81
  • 82. Search Equivalence 82 • Semantic GP search at the syntax level on any problem is equivalent, through the semantics mapping, to traditional GA search on output vectors on OneMax. • The search outputs a tree (i.e., a function), but the runtime analysis can be done on the GA!
  • 83. Forcing Point Mutation (not Bit Flip) 83 • Parent: X = ((X1 ^ X2) ^ !X3) v X3 • Random minterm: M = !X1 ^ X2 ^ !X3 • Offspring: X’ = X v M • Truth table (X1 X2 X3: output of X): 000: 0; 001: 1; 010: 0 → 1 (forced by M); 011: 1; 100: 0; 101: 1; 110: 1; 111: 1
  • 84. Issue 1: Exponential Chromosome Size • Problem size n: number of input variables • Output vector size N: 2^n (exponentially long in the number of variables!) • (1+1)-EA on OneMax has runtime N log N = n 2^n (exponential!) 84
  • 85. Issue 2: Exponential Amount of Neutrality • Training set size t: must be polynomial in n for the fitness to be computable in poly time • The output vectors of size 2^n have only poly(n) active bits, all other bits are inactive: sparse OneMax with very rare active bits • Black-box model: we do not know which bits are active and which are inactive • (1+1)-EA takes exponential time to optimise sparse OneMax 85
  • 86. Solution: Block Mutation • Use an incomplete minterm as a basis for forcing mutation. This has the effect of forcing entire blocks of entries at once to the same random value. 86 • Parent: X = ((X1 ^ X2) ^ !X3) v X3 • Incomplete minterm: M = !X1 • Offspring: X’ = X v M • Truth table (X1 X2 X3: output of X): 000: 0 → 1; 001: 1 → 1; 010: 0 → 1; 011: 1 → 1; 100: 0; 101: 1; 110: 1; 111: 1
  • 87. Fixed Block Mutation 87 • Fix Variables = {X1,X2} • Possible M = {!X1 ^ !X2, !X1 ^ X2, X1 ^ !X2, X1 ^ X2} • Parent: X = ((X1 ^ X2) ^ !X3) v X3 • Chosen M = !X1 ^ X2 • Offspring: X’ = X ^ !M • Truth table (X1 X2 X3: output of X): 000: 0; 001: 1; 010: 0 → 0; 011: 1 → 0; 100: 0; 101: 1; 110: 1; 111: 1
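The three forcing mutations of the preceding slides can be reproduced on the same parent X to see which truth-table rows each one changes. The helper names are illustrative; in the slides' notation ^ is AND, v is OR and ! is NOT.

```python
from itertools import product

cases = list(product([0, 1], repeat=3))   # rows 000 ... 111

def output_vector(f):
    return [int(bool(f(*c))) for c in cases]

def force_to_one(f, m):
    """X' = X v M: every row selected by M is forced to 1."""
    return lambda *x: f(*x) or m(*x)

def force_to_zero(f, m):
    """X' = X ^ !M: every row selected by M is forced to 0."""
    return lambda *x: f(*x) and not m(*x)

# Parent X = ((X1 ^ X2) ^ !X3) v X3.
X = lambda x1, x2, x3: (x1 and x2 and not x3) or x3

point = lambda x1, x2, x3: (not x1) and x2 and (not x3)   # full minterm: 1 row
block = lambda x1, x2, x3: not x1                          # incomplete minterm: 4 rows
fixed = lambda x1, x2, x3: (not x1) and x2                 # minterm over {X1, X2}: 2 rows

print(output_vector(X))
print(output_vector(force_to_one(X, point)))    # point mutation
print(output_vector(force_to_one(X, block)))    # block mutation
print(output_vector(force_to_zero(X, fixed)))   # fixed block mutation
```

A full minterm touches a single entry of the output vector, whereas an incomplete minterm sets a whole block of entries at once, which is what allows the search to reach the few active bits of an exponentially long output vector.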
  • 88. Polynomial Runtime with High Probability of Success on All Boolean Problems! 88 Proof idea: choose v such that the number of blocks in the partition of the output vector is polynomial in n (so that the runtime is polynomial) and sufficiently larger than the training set size, so that each training example falls in a block of its own w.h.p. (which guarantees that the optimum can be reached).
  • 89. Lesson from Theory • Rigorous runtime analysis of GSGP on general classes of non-toy problems is possible, as the landscape is always a cone • There are issues with GSGP that require careful design of semantic mutations to obtain efficient search. Theory can guide the design of semantic operators that are provably good in terms of runtime • Runtime analyses of GSGP with several other mutation operators for Boolean, arithmetic and classification domains have been carried out, producing refined, provably good semantic search operators 89
  • 90. 90 V. Other developments & current research directions
  • 91. 91 SGP and Neutrality • Similarly to non-semantic operators, SGP operators can be ineffective (in the semantic sense). – The offspring is a semantic clone of a parent. – Slows down the search process. • Can be tackled by testing potential offspring for semantic neutrality. • Percentage of neutral mutations (symbolic regression / Boolean function synthesis): – SGX (Moraglio et al.): 0.679 / 0.719 – AGX (Pawlak et al.): 0.131 / 0.935 – LGX (Krawiec et al.): 0.067 / 0.724 – KLX (Krawiec et al.): 0.866 / 0.895 – SAC (Uy et al.): 0.067 / 0.649 – GPX (Koza et al.): 0.103 / 0.518
  • 92. 92 GP as a Test-Based Problem • Test based problem (S, T, G, Q) (Popovici et al. 2012): – S – set of candidate solutions (in GP: programs) – T – set of tests (in GP: tests, fitness cases) – G – interaction matrix – Q – quality measure • Examples: Games (strategies vs. opponents), control problems (controllers vs. initial conditions), machine learning from examples (hypotheses vs. examples) – Generally: co-optimization and co-search
  • 93. 93 Discovery of Underlying Objectives via Clustering (Krawiec & Liskowski 2013)
  • 94. 94 Behavioral GP • Generalizes program behavior to the entire course of program execution, not only program output • Program behavior = list of execution traces (Krawiec & Swan 2013, Krawiec & O’Reilly 2014)
  • 96. 96 Recent Developments • New approaches based on semantic back propagation (Ffrancon & Schoenauer, 2015) • Lexicase selection (Helmuth et al. 2012) • Relationship to novelty search (program semantics = behavioral descriptor)
  • 97. Other Lines of Investigation in GSGP • Application to other types of GP – Geometric Semantic Grammatical Evolution • Many Real-World Applications (Vanneschi et al, 2013) • Generalisation Studies – PAC learning for provably good generalisation of GSGP • Derivation of semantic operators for more complex domains (e.g., recursive programs) on more complex data structures (e.g., lists) 97
  • 98. 98 Thank you! Questions? Credits: The authors thank Bartosz Wieloch and Tomasz Pawlak for their feedback on the slides of the tutorial. Other credits: Wikipedia
  • 99. 99 References • A. Moraglio, K. Krawiec, C. Johnson, Geometric Semantic Genetic Programming, PPSN XII, 2012. • K. Krawiec, P. Lichocki, Approximating Geometric Crossover in Semantic Space, GECCO 2009. • K. Krawiec, T. Pawlak, Locally Geometric Semantic Crossover: A Study on the Roles of Semantic and Homology in Recombination Operators, Genetic Programming and Evolvable Machines, 2013. • T. Pawlak, B. Wieloch, K. Krawiec, Semantic Backpropagation for Designing Genetic Operators in Genetic Programming, IEEE Transactions on Evolutionary Computation, 2014. • L. Beadle, C. Johnson, Semantically Driven Crossover in Genetic Programming, CEC 2008. • L. Beadle, C. Johnson, Semantically Driven Mutation in Genetic Programming, CEC 2009. • N.Q. Uy, N.X. Hoai, M. O’Neill, R.I. McKay, E. Galvan-Lopez, Semantically-based crossover in genetic programming: application to real-valued symbolic regression, Genetic Programming and Evolvable Machines, 2011. • N.Q. Uy, N.X. Hoai, M. O’Neill, R.I. McKay, D.N. Phong, On the roles of semantic locality of crossover in genetic programming, Information Sciences, 235:195–213, 2013. • N.Q. Uy, N.X. Hoai, M. O’Neill, Semantics based mutation in genetic programming: The case for real-valued symbolic regression, MENDEL 2009. • L. Beadle, C. Johnson, Semantic analysis of program initialisation in genetic programming, Genetic Programming and Evolvable Machines, 2009. • D. Jackson, Promoting Phenotypic Diversity in Genetic Programming, PPSN XI, 2010. • Semantic selection: • E. Galvan-Lopez, B. Cody-Kenny, L. Trujillo, A. Kattan, Using Semantics in the Selection Mechanism in Genetic Programming: a Simple Method for Promoting Semantic Diversity, CEC 2013. • R.E. Smith, S. Forrest, A.S. Perelson, Searching for diverse, cooperative populations with genetic algorithms, Evolutionary Computation, 1(2), 1993. • C.W.G. Lasarczyk, P. Dittrich, W. Banzhaf, Dynamic Subset Selection Based on a Fitness Case Topology, Evolutionary Computation, 12:223–242, 2004. • M. Castelli, L. Vanneschi, S. Silva, Semantic search-based genetic programming and the effect of intron deletion, IEEE Transactions on Cybernetics, 44(1):103–113, 2014. • W.B. Langdon, R. Poli, Foundations of Genetic Programming, Springer-Verlag, 2002. • N.F. McPhee, B. Ohs, T. Hutchison, Semantic Building Blocks in Genetic Programming, in M. O'Neill et al. (eds.), Proceedings of the 11th European Conference on Genetic Programming, EuroGP 2008, Springer, vol. 4971, pp. 134–145, 2008.
  • 100. 100 References • A. Moraglio, Towards a Geometric Unification of Evolutionary Algorithms, PhD Thesis, University of Essex, UK, 2007. • A. Moraglio, R. Poli, Topological Interpretation of Crossover, Genetic and Evolutionary Computation Conference, pages 1377-1388, 2004. • A. Moraglio, A. Mambrini, L. Manzoni, Runtime Analysis of Mutation-Based Geometric Semantic Genetic Programming on Boolean Functions, Foundations of Genetic Algorithms, 2013. • A. Moraglio, A. Mambrini, Runtime Analysis of Mutation-Based Geometric Semantic Genetic Programming for Basis Functions Regression, Genetic and Evolutionary Computation Conference, 2013. • A. Mambrini, L. Manzoni, A. Moraglio, Theory-Laden Design of Mutation-Based Geometric Semantic Genetic Programming for Learning Classification Trees, IEEE Congress on Evolutionary Computation, 2013. • A. Moraglio, J. McDermott, M. O’Neill, Geometric Semantic Grammatical Evolution, SMGP workshop at PPSN, 2014. • A. Moraglio, An Efficient Implementation of GSGP using Higher-Order Functions and Memoization, SMGP workshop at PPSN, 2014. • J. Fieldsend, A. Moraglio, Strength through diversity: Disaggregation and multi-objectivisation approaches for genetic programming, GECCO, 2015 (to appear). • L. Vanneschi, M. Castelli, L. Manzoni, S. Silva, A New Implementation of Geometric Semantic GP and Its Application to Problems in Pharmacokinetics, EuroGP 2013. • L. Vanneschi, S. Silva, M. Castelli, L. Manzoni, Geometric semantic genetic programming for real life applications, in Genetic Programming Theory and Practice XI, 2013. • R. Ffrancon, M. Schoenauer, Greedy Semantic Local Search for Small Solutions, Semantic Methods in Genetic Programming Workshop, GECCO’15, 2015.