
Introduction To Genetic Algorithms (GA)

Genetic algorithms (GA) are an evolutionary optimization approach that can be applied to complex, non-linear problems. GA works by stochastically evolving generations of solution populations using concepts like fitness, crossover, and mutation. It has advantages like being able to handle mixed discrete/continuous problems and being robust to local optima. GA encoding methods include binary, permutation, direct value, and tree representations. The basic GA process involves initializing a population, evaluating fitness, selecting parents, recombining and mutating to create offspring, and replacing the population iteratively until a termination criterion is met.

Uploaded by

Somya Arya
Copyright
© Attribution Non-Commercial (BY-NC)

1. Introduction to Genetic Algorithms (GA)


Genetic algorithms (GA) are an evolutionary optimization approach and an
alternative to traditional optimization methods. GA is one of the most appropriate
methods for complex non-linear models where locating the global optimum is a
difficult task. GA follows the concept of solution evolution by stochastically
developing generations of solution populations using a given fitness measure. They
are particularly applicable to problems which are large, non-linear and possibly
discrete in nature.
1.1 Differences between GAs and other optimization and search procedures

Advantages of GA:

 GAs work with a coding for the parameter set, not the parameters
themselves.
 GAs search from a population of points, not a single point.
 GAs use probabilistic transition rules, not deterministic rules.
 GAs are well suited for optimization problems.
 GAs are robust with respect to local minima and maxima.
 GAs work well on mixed discrete/continuous problems.
 GAs can operate on various representations.
 GAs are stochastic.
 GAs are easily parallelised.
 GAs are easy to implement and lend themselves well to hybridization.

Limitations of GA

 GA implementation is still an art. They often require a lot of tweaking; then
again, sometimes you can tweak all you want and they will still find the
same result.
 GAs are computationally expensive.
 GAs typically require less information about the problem (e.g. no gradients),
but designing an objective function and getting the representation and
operators right can be difficult.

Summarizing: GA is suited to dynamic problems, because the genetic
programming paradigm parallels nature in that it is a never-ending process.
However, as a practical matter, a run of the genetic programming paradigm
terminates when the termination criterion is satisfied.

1.2 From Biology to Genetic Algorithms


We can borrow some terms (and ideas) from biology . . .
 Chromosome: The coding of a possible solution for a given problem, usually
represented with an array of bits or characters
 Gene: A single bit or a set of bits coding part of the solution
 Allele: One of the elements used to code the genes
 Fitness: Evaluation of the actual solution
To model a learning process as evolution:
 Crossover: Generate a new solution by “mixing” two existing solutions
 Mutation: Random change in the solution
2. Working Principle of GA
The GA procedure is based on the Darwinian principle of survival of the fittest.
 An initial population is created containing a predefined number of
individuals, each represented by a genetic string.
 Each individual has an associated fitness measure. The concept that the
fittest (or best) individuals in a population will produce fitter offspring is
then implemented in order to reproduce the next population.
 Selected individuals are chosen for reproduction (or crossover) at each
generation, with an appropriate mutation factor to randomly modify the
genes of an individual, in order to develop the new population.
 The result is another set of individuals based on the original subjects,
leading to subsequent populations with better fitness; those with lower
fitness will naturally be discarded from the population.
The GA consists of four main stages: evaluation, selection, crossover and
mutation.
 The evaluation procedure measures the fitness of each individual solution in
the population and assigns to it a score.
 The selection procedure randomly selects individuals of the current
population for development of the next generation.
 The crossover procedure takes two selected individuals and combines them
about a crossover point thereby creating two new individuals.
 The mutation procedure randomly modifies the genes of an individual
subject to a small mutation factor, introducing further randomness into the
population. This iterative process continues until one of the possible
termination criteria is met: if a known optimal or acceptable solution level is
attained; or if a maximum number of generations have been performed; or if
a given number of generations without fitness improvement occur,
(Goldberg 1989; Lewin 1994, 1996).
3. Structure of GAs

GAs are categorised under the umbrella term Evolutionary Algorithms, which is
used to describe computer-based problem-solving systems that use
computational models of evolutionary processes as key elements in their design
and implementation. A variety of evolutionary algorithms have been proposed, the
major ones being: GAs, evolutionary programming, evolution strategies,
classifier systems, and genetic programming (check out www.google.com on these
keywords). They all share a common conceptual base of simulating the 'evolution'
of individual structures via processes of selection, mutation and reproduction.
The existing GAs are founded upon the following main principles:

1. Reproduction
2. Fitness
3. Crossover
4. Mutation

There are various flavours of GAs in circulation, varying in the implementation of
these four components, but in essence the algorithms all follow a standard
procedure:
1. [Start] Generate random population of n chromosomes (suitable solutions)
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population
3. [New population] Create a new population by repeating following steps
until the new population is complete
(a) [Selection] Select two parent chromosomes from the population
according to their fitness (the better the fitness, the bigger the
chance of being selected)
(b) [Crossover] With a crossover probability, cross over the parents to
form new offspring. If no crossover is performed, the offspring are exact
copies of the parents.
(c) [Mutation] With a mutation probability, mutate the new offspring at each
locus
(d) [Accepting] Place new offspring in the new population
4. [Replace] Use new generated population for a further run of the algorithm
5. [Test] If the end condition is satisfied, return the best solution in current
population
6. [Loop] Go to step 2

Figure 1. Simplified flow chart of a Genetic Algorithm.

PSEUDOCODE

Algorithm GA is
    // start with an initial time
    t := 0;

    // initialise a usually random population of individuals
    initpopulation P(t);

    // evaluate fitness of all initial individuals in population
    evaluate P(t);

    // test for termination criterion (time, fitness, etc.)
    while not done do
        // increase the time counter
        t := t + 1;

        // select sub-population for offspring production
        P' := selectparents P(t);

        // recombine the "genes" of selected parents
        recombine P'(t);

        // perturb the mated population stochastically
        mutate P'(t);

        // evaluate its new fitness
        evaluate P'(t);

        // select the survivors from actual fitness
        P := survive P, P'(t);
    od
end GA.
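As an illustration, the pseudocode above can be turned into a short runnable sketch. The following minimal Python GA follows the same initialise/evaluate/select/recombine/mutate/survive loop; the parameter values, function names and the OneMax fitness (count of 1-bits) are illustrative assumptions, not from the text:

```python
import random

def run_ga(fitness, n_bits=16, pop_size=20, generations=50,
           p_cross=0.9, p_mut=0.01, seed=0):
    """Minimal generational GA sketch; all defaults are illustrative."""
    rng = random.Random(seed)
    # initpopulation: random bit-string individuals
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]      # evaluate

        def pick():                                  # selectparents:
            # simple fitness-proportionate (roulette) choice
            return rng.choices(pop, weights=scores, k=1)[0]

        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:               # recombine
                point = rng.randrange(1, n_bits)
                c1 = p1[:point] + p2[point:]
                c2 = p2[:point] + p1[point:]
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):                   # mutate at each locus
                for i in range(n_bits):
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                new_pop.append(child)
        pop = new_pop[:pop_size]                     # survive: full replacement
    return max(pop, key=fitness)

# OneMax: maximize the number of 1-bits in the chromosome
best = run_ga(fitness=sum)
print(sum(best))
```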

4. Genetic Algorithms: Encoding Operators

4.1 Encoding (I)

Binary encoding is the most common one (mainly because early GA research used
this type of encoding)

 In binary encoding, every chromosome is a string of bits (0 or 1)

 Simple Implementation of the genetic operators

 Not always natural for many problems

Chromosome 1: 1101100100110110

Chromosome 2: 1110111000011110

Example of Problem: Knapsack problem


 There are items with given values and sizes. The knapsack has a given
capacity. Select items to maximize the total value in the knapsack, without
exceeding its capacity.

 Each bit says whether the corresponding item is in the knapsack.
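The knapsack fitness function described above can be sketched as follows; the item values, sizes and capacity are a made-up instance for illustration:

```python
# Hypothetical knapsack instance: item values, item sizes, and a capacity
values = [60, 100, 120, 80]
sizes = [10, 20, 30, 25]
capacity = 50

def knapsack_fitness(bits):
    """Fitness of a binary chromosome: total value, or 0 if over capacity."""
    total_size = sum(s for s, b in zip(sizes, bits) if b)
    total_value = sum(v for v, b in zip(values, bits) if b)
    return total_value if total_size <= capacity else 0

print(knapsack_fitness([1, 1, 0, 0]))  # items 1 and 2 fit (size 30): value 160
print(knapsack_fitness([1, 1, 1, 0]))  # size 60 > 50: penalized to 0
```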

4.2 Encoding (II)

Permutation encoding can be used in ordering problems

 Every chromosome is a string of numbers that represent a position in a
sequence

 Crossover and mutation must be designed to leave the chromosome
consistent (i.e. a valid sequence)

Chromosome 1: 15 7 8 3 5 13 10 11 16 12 1 14 2 4 6 9

Chromosome 2: 2 9 14 1 11 5 8 15 13 6 12 16 7 3 10 4

Example of Problem: Travelling salesman problem (TSP)

 There are cities and given distances between them. Travelling salesman has
to visit all of them, but he does not want to travel more than necessary. Find
a sequence of cities with a minimal travelled distance.

 Chromosome describes the order of cities
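As a sketch of the TSP fitness described above, the length of the closed tour encoded by a permutation chromosome can be computed as follows (the city coordinates are a made-up instance):

```python
import math

# Hypothetical city coordinates; the chromosome is a visiting order
cities = [(0, 0), (1, 0), (1, 1), (0, 1)]

def tour_length(order):
    """Total closed-tour distance for a permutation chromosome."""
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

print(tour_length([0, 1, 2, 3]))  # perimeter of the unit square: 4.0
```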

4.3 Encoding (III)

Direct value encoding can be used in problems where some more complicated
values are required

 Every chromosome is a sequence of some values connected to the problem,


such as (real) numbers, chars or any objects

 A good choice for some special problems, but it is necessary to develop
problem-specific crossover and mutation operators

Chromosome: D S A B H Y V V

Chromosome: 2.5678 1.4361 3.3426 7.8761

Chromosome: open walk back close

Example of Problem: Finding weights for a neural network


 A neural network with a defined architecture is given. Find weights between
neurons to get the desired output from the network

 Real values in chromosomes represent weights in the neural network

4.4 Encoding (IV)

Tree encoding is used mainly for evolving programs or expressions (i.e., genetic
programming)

 Every chromosome is a tree of some objects, such as functions or commands


in programming language.

 Programming language LISP is often used for this purpose, so crossover and
mutation can be done relatively easily.

Chromosome: (+ x (/ 5 y))

Example of Problem: Finding a function that would approximate given pairs of


values

 Input and output values are given. The task is to find a function that will
give the best outputs for all inputs.

 Chromosomes are functions represented as trees
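A chromosome such as (+ x (/ 5 y)) can be represented and evaluated without LISP; the following Python sketch, with nested tuples standing in for trees, is one possible illustration (all names are assumptions for this example):

```python
import operator

# Operator symbols mapped to their implementations
OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def evaluate(node, env):
    """Recursively evaluate a tree chromosome against variable bindings."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env.get(node, node)  # variable name or literal constant

# The text's example chromosome (+ x (/ 5 y)) as a nested tuple
chrom = ('+', 'x', ('/', 5, 'y'))
print(evaluate(chrom, {'x': 2, 'y': 5}))  # 2 + 5/5 = 3.0
```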

5. Genetic Algorithms: Selection Operators

According to Darwin’s theory of evolution, the best chromosomes survive to create
new offspring. There are many methods for selecting the best chromosomes:

 Roulette wheel selection,

 Rank selection

 Tournament selection

 Boltzmann selection
5.1 Roulette wheel selection

The simplest selection scheme is roulette-wheel selection, also called stochastic
sampling with replacement. This is a stochastic algorithm and involves the
following technique:

The individuals are mapped to contiguous segments of a line, such that each
individual's segment is equal in size to its fitness. A random number is generated
and the individual whose segment spans the random number is selected. The
process is repeated until the desired number of individuals is obtained (called
mating population). This technique is analogous to a roulette wheel with each slice
proportional in size to the fitness.

This can be simulated by the following algorithm:

1. [Sum] Calculate sum S of all chromosome fitness values in population.

2. [Select] Generate random number r from interval (0, S).

3. [Loop] Go through the population, summing fitness values from 0 to S. When
the running sum is greater than r, stop and return the chromosome you are at.

PSEUDOCODE

sum := 0
for all members of population
    sum += fitness of this individual
end for

cumulative := 0
for all members of population
    probability of this individual := cumulative + (fitness / sum)
    cumulative := probability of this individual
end for

loop until new population is full
    do this twice:
        number := random between 0 and 1
        for all members of population
            if number is greater than the previous individual's probability
            and not greater than this individual's probability,
            then this individual is selected
        end for
    end
    create offspring from the two selected parents
end loop
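The roulette-wheel procedure above can also be sketched as a runnable function; the population, fitness values and seed below are illustrative:

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    r = rng.uniform(0, total)        # step 2: random number in (0, S)
    running = 0.0
    for ind, fit in zip(population, fitnesses):
        running += fit               # step 3: walk until the sum reaches r
        if running >= r:
            return ind
    return population[-1]            # guard against floating-point round-off

rng = random.Random(42)
pop = ['a', 'b', 'c']
fits = [1.0, 1.0, 8.0]               # 'c' holds 80% of the wheel
picks = [roulette_select(pop, fits, rng) for _ in range(1000)]
print(picks.count('c') / 1000)       # should be roughly 0.8
```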
Table shows the fitness value and the selection probability for 8 individuals.
Individual 5 is the most fit individual and occupies the largest interval, and
individual 1 is the least fit individual having the smallest interval on the line. For
selecting the mating population the appropriate number of uniformly distributed
random numbers (uniform distributed between 0.0 and 1.0) is independently
generated.
The table shows the fitness value and the selection probability for 8 individuals:

No.  Initial     Fitness   Probability   Expected Count   Cumulative   Random   Chromosome   Selected
     Population  Fi        Pi = Fi/sum   (n=8)*Pi         Pi           No.      No.          Population
1    0000 0000   1         0.0429        0.33             0.0429       0.259    3            0001 0101
2    0010 0001   2.1       0.09          0.72             0.1329       0.038    1            0000 0000
3    0001 0101   3.11      0.1336        1.064            0.266        0.486    5            0110 1010
4    0010 1000   4.01      0.1723        1.368            0.438        0.428    4            0010 1000
5    0110 1010   4.66      0.2           1.6              0.638        0.095    2            0010 0001
6    1110 1000   1.91      0.082         0.656            0.72         0.3      4            0010 1000
7    1110 1101   1.93      0.0829        0.664            0.809        0.616    5            0110 1010
8    0111 1100   4.55      0.1955        1.56             1            0.897    8            0111 1100

Sum of all fitness values: sum = 23.264

Average fitness = 23.264/8 = 2.908

Individual 5 is the most fit individual and occupies the largest interval, while
individual 1 is the least fit individual, having the smallest interval on the line. For
selecting the mating population, the appropriate number of uniformly distributed
random numbers (between 0.0 and 1.0) is independently generated. After one
iteration, instead of the least fit individual (1), the next least fit individual (6)
has been discarded from the pool. This shows the noisiness of roulette-wheel
selection. After repeating the selection procedure n = 8 times, the count of each
chromosome is shown below.
No.                  1          2          3          4          5          6          7          8
Initial Population   0000 0000  0010 0001  0001 0101  0010 1000  0110 1010  1110 1000  1110 1101  0111 1100
Count in the pool    1          1          1          2          2          0          0          1
Selected Population  0000 0000  0010 0001  0001 0101  0010 1000  0010 1000  0110 1010  0110 1010  0111 1100

According to the expected counts, individuals 5 and 8 should receive roughly two
copies each and 1 and 6 almost none; in this run, individuals 4 and 5 received two
copies while 6 and 7 received none. After 8 iterations the pool obtained is still
fairly similar to the expected counts.
5.2 Rank Selection

The previous selection scheme has problems when the fitness values differ very
much. For example, if the best chromosome occupies 90% of the roulette wheel,
the other chromosomes will have very few chances to be selected.

Rank selection first ranks the population and then every chromosome receives a
fitness from this ranking. The worst will have fitness 1, the second worst 2, etc.,
and the best will have fitness N (the number of chromosomes in the population).

After this, all the chromosomes have a chance to be selected. But this method can
lead to slower convergence, because the best chromosomes do not differ so much
from the others.
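The ranking step can be sketched as follows; the raw fitness values are made up for illustration:

```python
def rank_fitnesses(raw_fitnesses):
    """Replace raw fitness with rank: worst -> 1, ..., best -> N."""
    order = sorted(range(len(raw_fitnesses)), key=lambda i: raw_fitnesses[i])
    ranks = [0] * len(raw_fitnesses)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

# One dominant individual (most of the total fitness) becomes merely "best rank"
print(rank_fitnesses([0.5, 0.6, 90.0, 2.0]))  # [1, 2, 4, 3]
```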

5.3 Tournament Selection

All the methods above rely on global population statistics.

– This could be a bottleneck, especially on parallel machines

– They rely on the presence of an external fitness function, which might not
exist: e.g. evolving game players

Tournament selection is an informal procedure:

– Pick k members at random then select the best of these

– Repeat to select more individuals

 Probability of selecting i will depend on:

– Rank of i

– Size of sample k

 higher k increases selection pressure

– Whether contestants are picked with replacement

 Picking without replacement increases selection pressure

– Whether the fittest contestant always wins (deterministic) or this happens
with probability p

 For k = 2, the time for the fittest individual to take over the population is
the same as for linear ranking with s = 2p
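A minimal sketch of deterministic tournament selection as described above (parameter names and the toy population are illustrative):

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    """Pick k contestants at random (without replacement), return the fittest."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

pop = list(range(10))                    # toy case: fitness = the value itself
winner = tournament_select(pop, fitness=lambda x: x, k=3, rng=random.Random(0))
print(winner)
# With k = len(pop), the tournament always returns the global best:
print(tournament_select(pop, fitness=lambda x: x, k=10, rng=random.Random(0)))  # 9
```

Larger k raises selection pressure: the k=10 call above can only ever return the fittest individual, while k=2 gives weaker individuals a real chance.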
5.4 Steady-State Selection

This is not a particular method of selecting parents. The main idea of this
selection is that a big part of the chromosomes should survive to the next
generation.

The GA then works in the following way. In every generation, a few good (high-
fitness) chromosomes are selected for creating new offspring. Then some bad
(low-fitness) chromosomes are removed and the new offspring are placed in
their place. The rest of the population survives to the new generation.

5.5 Elitism

The idea of elitism has already been introduced. When creating a new population
by crossover and mutation, there is a big chance that we will lose the best
chromosome.

Elitism is the name of the method which first copies the best chromosome (or a
few best chromosomes) to the new population. The rest is done in the classical
way. Elitism can very rapidly increase the performance of a GA, because it
prevents losing the best solution found.
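One elitist generation step can be sketched as below; the make_offspring helper is a hypothetical stand-in for the usual selection/crossover/mutation pipeline, and all names here are illustrative:

```python
import random

def next_generation(population, fitness, make_offspring, n_elite=1, rng=random):
    """Build a new population, copying the n_elite best unchanged first."""
    elite = sorted(population, key=fitness, reverse=True)[:n_elite]
    offspring = [make_offspring(population, rng)
                 for _ in range(len(population) - n_elite)]
    return elite + offspring

rng = random.Random(1)
pop = [3, 9, 1, 7]                       # toy case: fitness = the value itself
new = next_generation(pop, fitness=lambda x: x,
                      make_offspring=lambda p, r: r.choice(p), rng=rng)
print(new[0])  # 9: the best individual always survives
```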

6. Genetic Algorithm: Crossover and Mutation operators

There are many methods for crossovers and mutations depending on the kind of
encoding applied to represent the chromosomes.

6.1 Crossover/Mutation (I)

For binary encoding we have many Crossover operators:

 Single point crossover: one crossover point is selected, binary string from
the beginning of the chromosome to the crossover point is copied from the
first parent, the rest is copied from the other parent
11001011 + 11011111 = 11001111

 Two point crossover: two crossover points are selected, binary string from
the beginning of the chromosome to the first crossover point is copied from
the first parent, the part from the first to the second crossover point is copied
from the other parent and the rest is copied from the first parent again.

11001011 + 11011111 = 11011111


 Uniform crossover: bits are randomly copied from the first or from the
second parent

11001011 + 11011101 = 11011111

 Arithmetic crossover: some arithmetic operation is performed to make a
new offspring (e.g., logic AND)

11001011 + 11011111 = 11001011 (AND)

Mutation: inversion of selected bits

11001011 => 10001011
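The binary operators above can be sketched as short functions, reusing the example strings from the text (crossover points are chosen here for illustration):

```python
def one_point(p1, p2, point):
    """Single-point crossover on equal-length bit strings."""
    return p1[:point] + p2[point:]

def two_point(p1, p2, a, b):
    """Two-point crossover: the middle segment comes from the second parent."""
    return p1[:a] + p2[a:b] + p1[b:]

def bit_flip(bits, i):
    """Mutation: invert the bit at locus i."""
    return bits[:i] + ('1' if bits[i] == '0' else '0') + bits[i + 1:]

print(one_point('11001011', '11011111', 4))   # '11001111'
print(two_point('11001011', '11011111', 3, 6))  # '11011111'
print(bit_flip('11001011', 1))                # '10001011'
```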

6.2 Crossover/Mutation (II)

For permutation encoding we have to preserve consistency:

 Single point crossover: one crossover point is selected; the permutation is
copied from the first parent up to the crossover point, then the other parent
is scanned and the remaining numbers are added in the order in which they
appear

(1 2 3 4 5 6 7 8 9) + (4 5 3 6 8 9 7 2 1) = (1 2 3 4 5 6 8 9 7)

Order changing mutation: two numbers are selected and exchanged

(1 2 3 4 5 6 8 9 7) => (1 8 3 4 5 6 2 9 7)
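The permutation operators above can be sketched as follows, reusing the example permutations from the text (the crossover point and swap positions are chosen to reproduce those examples):

```python
def perm_crossover(p1, p2, point):
    """Copy p1 up to the point, then take the missing numbers in p2's order."""
    head = p1[:point]
    return head + [x for x in p2 if x not in head]

def swap_mutation(perm, i, j):
    """Order-changing mutation: exchange the numbers at positions i and j."""
    out = perm[:]
    out[i], out[j] = out[j], out[i]
    return out

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 3, 6, 8, 9, 7, 2, 1]
child = perm_crossover(p1, p2, 5)
print(child)                        # [1, 2, 3, 4, 5, 6, 8, 9, 7]
print(swap_mutation(child, 1, 6))   # [1, 8, 3, 4, 5, 6, 2, 9, 7]
```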

6.3 Crossover/Mutation (III)

For real-value encoding, we can reuse the crossover operators from binary
encoding.

Mutation: a small number is added to (or subtracted from) selected values

(1.29 5.68 2.86 4.11 5.55) => (1.29 5.68 2.73 4.22 5.55)
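This "creep" mutation for real-valued chromosomes can be sketched as below; the perturbation size, mutation probability and seed are illustrative assumptions:

```python
import random

def creep_mutation(values, sigma=0.1, p_mut=0.5, rng=random):
    """Add a small Gaussian perturbation to some of the real-valued genes."""
    return [v + rng.gauss(0, sigma) if rng.random() < p_mut else v
            for v in values]

chrom = [1.29, 5.68, 2.86, 4.11, 5.55]
print(creep_mutation(chrom, rng=random.Random(7)))
```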

6.4 Crossover/Mutation (IV)

Specific operators have to be selected also for tree encoding

 Tree crossover: one crossover point is selected in both parents, parents are
divided in that point and the parts below crossover points are exchanged to
produce new offspring.
Changing mutation: the operator, number, or variable in a randomly selected
node is changed.

7. Parameters of GA

Today there is no general theory describing the parameters of a GA for an
arbitrary problem. Recommendations are often the results of empirical studies of
GAs, which were often performed only on binary encoding.

Crossover rate:

The crossover rate generally should be high, about 80%-95%. (However, some
results show that for some problems a crossover rate of about 60% is best.)

Mutation rate:

On the other hand, the mutation rate should be very low. The best rates reported
are about 0.5%-1%.

Population size:

It may be surprising that a very big population size usually does not improve the
performance of a GA (in terms of the speed of finding a solution). A good
population size is about 20-30, although sometimes sizes of 50-100 are reported
as best. Some research also shows that the best population size depends on the
encoding, i.e. on the size of the encoded string: if you have a chromosome with
32 bits, the population should be, say, 32, and certainly about twice the best
population size for a chromosome with 16 bits.

Selection:
Basic roulette-wheel selection can be used, but rank selection can sometimes be
better. There are also more sophisticated methods which change the parameters
of selection during the run of the GA; basically, these behave like simulated
annealing. Selection again depends on the problem.
Encoding:
Encoding depends on the problem and also on the size of the problem instance.
Check the chapter about encoding for suggestions, or look at other resources.

Crossover and mutation type:

Operators depend on encoding and on the problem.

8. Applications of GA

Genetic algorithms have been used for difficult problems (such as NP-hard
problems), for machine learning and also for evolving simple programs. They
have also been used in art, for evolving pictures and music.

An advantage of GAs is their parallelism. A GA travels through the search space
with many individuals (and with genotypes rather than phenotypes), so it is less
likely to get stuck in a local extremum than some other methods.

They are also easy to implement. Once you have a GA, you just have to write a
new chromosome (just one object) to solve another problem. With the same
encoding you just change the fitness function and that is all. On the other hand,
choosing the encoding and fitness function can be difficult.

A disadvantage of GAs is their computational time. They can be slower than some
other methods, but with today’s computers this is not such a big problem.

To get an idea about problems solved by GA, here is a short list of some
applications:

 Nonlinear dynamical systems - predicting, data analysis

 Designing neural networks, both architecture and weights

 Robot trajectory

 Evolving LISP programs (genetic programming)

 Strategy planning

 Finding shape of protein molecules

 TSP and sequence scheduling

 Functions for creating images
