Introduction To Genetic Algorithms (GA)
Advantages of GA:
GAs work with a coding for the parameter set, not the parameters
themselves.
GAs search from a population of points, not a single point.
GAs use probabilistic transition rules, not deterministic rules.
GAs are well suited to optimization problems.
GAs are robust with respect to local minima and maxima.
GAs work well on mixed discrete/continuous problems.
GAs can operate on various representations.
GAs are stochastic.
GAs are easily parallelised.
GAs are easy to implement and lend themselves well to hybridization.
Limitations of GA
GAs fall under the umbrella term Evolutionary Algorithms, which describes
computer-based problem-solving systems that use computational models of
evolutionary processes as key elements in their design and implementation.
A variety of evolutionary algorithms have been proposed, of which the major
ones are: GAs, evolutionary programming, evolution strategies, classifier
systems, and genetic programming (check out www.google.com on these keywords).
They all share a common conceptual base of simulating the 'evolution' of
individual structures via processes of selection, mutation and reproduction.
The existing GAs are founded upon the following main principles:
1. Reproduction
2. Fitness
3. Crossover
4. Mutation
PSEUDOCODE
Algorithm GA is
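A minimal sketch of a generic GA loop built from the four principles listed above, not necessarily the exact algorithm these notes had in mind. It assumes chromosomes are fixed-length bit lists, a non-negative fitness function, fitness-proportionate parent choice and single-point crossover; the function name `run_ga` and all default parameter values are illustrative assumptions.

```python
import random

def run_ga(fitness, n_bits=16, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.01, rng=None):
    """Generic GA over fixed-length bit lists; returns the best final individual."""
    rng = rng or random.Random()
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness: score every chromosome (assumed non-negative).
        # The tiny offset keeps the weights valid if all scores are 0.
        weights = [fitness(c) + 1e-9 for c in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # Reproduction: choose two parents in proportion to fitness.
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            child = p1[:]
            # Crossover: single point, applied with probability crossover_rate.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# Toy problem (an assumption for illustration): maximise the number of 1 bits.
best = run_ga(fitness=sum, rng=random.Random(0))
print(best, sum(best))
```

The whole population is replaced in every generation here; other replacement schemes are possible.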
Binary encoding is the most common one (mainly because the first GA research
used this type of encoding)
Chromosome 1 1101100100110110
Chromosome 2 1110111000011110
Chromosome 1: 15 7 8 3 5 13 10 11 16 12 1 14 2 4 6 9
There are cities with given distances between them. The travelling salesman
has to visit all of them, but does not want to travel more than necessary.
Find a sequence of cities with a minimal total travelled distance.
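For a permutation chromosome, the quantity to minimise can be sketched as the length of the tour it describes. The sketch below assumes cities are points in the plane with Euclidean distances and that the salesman returns to the starting city; neither assumption is stated in the text. The function name `tour_length` and the four-city instance are hypothetical.

```python
import math

def tour_length(tour, coords):
    """Total length of the closed tour that visits the cities in the given order."""
    total = 0.0
    for i in range(len(tour)):
        a = coords[tour[i]]
        b = coords[tour[(i + 1) % len(tour)]]  # wrap around to the first city
        total += math.dist(a, b)
    return total

# Hypothetical instance: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
print(tour_length([1, 2, 3, 4], coords))  # 4.0 (walks around the square)
print(tour_length([1, 3, 2, 4], coords))  # longer, because it uses both diagonals
```

Since a GA usually maximises fitness, a tour length like this would typically be negated or inverted before being used as a fitness value.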
Direct value encoding can be used in problems where more complicated values
are required
Good choice for some special problems, but it is necessary to develop
problem-specific crossover and mutation operators
Chromosome: D S A B H Y V V
Tree encoding is used mainly for evolving programs or expressions (i.e., genetic
programming)
The programming language LISP is often used for this purpose, because its
programs are already written as trees, so crossover and mutation can be done
relatively easily.
Chromosome: (+ x (/ 5 y))
Input and output values are given. The task is to find a function that will
give the best outputs for all inputs.
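A minimal sketch of how a tree chromosome could be evaluated and scored against given input/output pairs. It assumes the tree is stored as nested Python tuples rather than LISP, that all operators are binary, and that quality is measured by the sum of squared errors; the names `evaluate`, `error` and the sample data are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree, env):
    """Evaluate a nested-tuple expression such as ("+", "x", ("/", 5, "y"))."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]   # variable: look up its value
    return tree            # numeric constant

def error(tree, samples):
    """Sum of squared errors over (inputs, expected output) pairs; lower is better."""
    return sum((evaluate(tree, env) - target) ** 2 for env, target in samples)

chromosome = ("+", "x", ("/", 5, "y"))   # the tree (+ x (/ 5 y))
samples = [({"x": 1, "y": 5}, 2.0), ({"x": 0, "y": 2}, 2.5)]
print(error(chromosome, samples))  # 0.0, the tree reproduces both outputs
```

Division by zero is not handled in this sketch; a real implementation would need some form of protected division.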
Rank selection
Tournament selection
Boltzmann selection
5.1 Roulette wheel selection
The individuals are mapped to contiguous segments of a line, such that each
individual's segment is equal in size to its fitness. A random number is generated
and the individual whose segment spans the random number is selected. The
process is repeated until the desired number of individuals is obtained (called
mating population). This technique is analogous to a roulette wheel with each slice
proportional in size to the fitness.
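3. [Loop] Go through the population and accumulate the fitness values, from 0
up to S (the sum of all fitness values in the population). When the running
sum is greater than r (a random number drawn from the interval (0, S)), stop
and return the chromosome where you are.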
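The loop described above can be sketched as follows, assuming non-negative fitness values with a positive sum. The function name `roulette_select` and the optional `rng` argument are illustrative assumptions.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Return one individual with probability proportional to its fitness."""
    total = sum(fitnesses)        # S: total fitness of the population
    r = rng.uniform(0, total)     # random point on the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running > r:           # this individual's segment spans r
            return individual
    return population[-1]         # guard for the edge case r == S

# Draw a mating population of the same size as the original one.
pop = ["A", "B", "C", "D"]
fit = [1, 2, 3, 4]
mating_pool = [roulette_select(pop, fit) for _ in pop]
print(mating_pool)
```

Each call costs time linear in the population size. Precomputing cumulative sums and using binary search would be a common speed-up.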
PSEUDOCODE
After one iteration, instead of the least fit individual 1, the next least fit
individual 6 has been discarded from the pool. This shows the noisiness of
roulette-wheel selection. After repeating the selection procedure n = 8 times,
the count of each chromosome is shown below.
No.                  1         2         3         4         5         6         7         8
Initial population   00000000  00100001  00010101  00101000  01101010  11101000  11101101  01111100
Selected population  00000000  00100001  00010101  00101000  00101000  01101010  01101010  01111100
Referring to the expected counts, 5 and 8 get two copies each and 1 and 6 get
no copies. After 8 iterations, the pool we get is very similar to the expected
counts.
5.2 Rank Selection
Roulette wheel selection runs into problems when the fitness values differ
greatly. For example, if the best chromosome's fitness takes up 90% of the
roulette wheel, then the other chromosomes will have very little chance of
being selected.
Rank selection first ranks the population, and then every chromosome receives
its fitness from this ranking. The worst will have fitness 1, the second worst
2, etc., and the best will have fitness N (the number of chromosomes in the
population).
After this, all the chromosomes have a chance to be selected. However, this
method can lead to slower convergence, because the best chromosomes do not
differ so much from the other ones.
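A minimal sketch of the ranking step, assuming ties are broken by position in the population and that the ranks are then used as roulette-wheel weights. The names `rank_weights` and `rank_select` are hypothetical.

```python
import random

def rank_weights(fitnesses):
    """Replace each fitness by its rank: worst -> 1, ..., best -> N."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    weights = [0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return weights

def rank_select(population, fitnesses, rng=random):
    """Roulette-wheel selection on the ranks instead of the raw fitness values."""
    return rng.choices(population, weights=rank_weights(fitnesses), k=1)[0]

# One chromosome holds 90% of the raw fitness ...
print(rank_weights([90, 3, 5, 2]))  # [4, 2, 3, 1]
```

With these ranks the best chromosome's share of the wheel drops from 90% to 4/10 = 40%, which illustrates both the benefit and the slower convergence mentioned above.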
– Rank of i
– Size of sample k
For k = 2, time for fittest individual to take over population is the same
as linear ranking with s = 2 • p
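Assuming the notes above refer to tournament selection (listed earlier among the selection methods), with k the size of the sample and p the probability that the fitter contestant wins, a minimal sketch could look like this. Sampling with replacement and the name `tournament_select` are assumptions.

```python
import random

def tournament_select(population, fitnesses, k=2, p=1.0, rng=random):
    """Sample k contestants with replacement; the fittest wins with probability p."""
    contestants = [rng.randrange(len(population)) for _ in range(k)]
    contestants.sort(key=lambda i: fitnesses[i], reverse=True)
    for i in contestants:
        if rng.random() < p:      # with p = 1.0 the fittest contestant always wins
            return population[i]
    return population[contestants[-1]]  # nobody "won": fall back to the least fit

pop = ["A", "B", "C", "D"]
fit = [1, 2, 3, 4]
print([tournament_select(pop, fit, k=2) for _ in pop])
```

Only the relative order of the fitness values matters here, so the selection probability of an individual depends on its rank and on k rather than on the raw fitness values.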
5.4 Steady-State Selection
This is not a particular method of selecting parents. The main idea of this
selection is that a big part of the chromosomes should survive to the next
generation.
The GA then works in the following way. In every generation, a few (good, with
high fitness) chromosomes are selected for creating new offspring. Then some
(bad, with low fitness) chromosomes are removed and the new offspring are put
in their place. The rest of the population survives to the new generation.
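One generation of this scheme might be sketched as below. It assumes the parents are simply the fittest chromosomes and that a caller-supplied `make_child(parent_a, parent_b)` performs crossover and mutation; `steady_state_step`, `make_child` and `n_replace` are hypothetical names.

```python
import random

def steady_state_step(population, fitness, make_child, n_replace=2, rng=random):
    """Breed from the fittest, overwrite the least fit, keep everyone else."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:max(2, n_replace)]            # a few good chromosomes
    children = [make_child(*rng.sample(parents, 2)) for _ in range(n_replace)]
    survivors = ranked[:len(ranked) - n_replace]    # drop the n_replace worst
    return survivors + children

# Toy example: chromosomes are numbers, fitness is the value itself,
# and a "child" is the sum of its parents.
new_pop = steady_state_step([1, 5, 3, 9, 7], fitness=lambda c: c,
                            make_child=lambda a, b: a + b)
print(new_pop)  # [9, 7, 5, 16, 16]
```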
5.5 Elitism
The idea of elitism has already been introduced. When creating a new population
by crossover and mutation, there is a big chance that we will lose the best
chromosome.
Elitism is the name of the method that first copies the best chromosome (or a
few best chromosomes) to the new population. The rest is done in the classical
way. Elitism can very rapidly increase the performance of a GA, because it
prevents losing the best solution found.
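A minimal sketch of elitism wrapped around an arbitrary breeding step. The caller-supplied `breed(population)` is assumed to return one new chromosome produced by selection, crossover and mutation; `next_generation_with_elitism`, `breed` and `n_elite` are hypothetical names.

```python
def next_generation_with_elitism(population, fitness, breed, n_elite=1):
    """Copy the n_elite best chromosomes unchanged, then fill up with offspring."""
    elite = sorted(population, key=fitness, reverse=True)[:n_elite]
    offspring = [breed(population) for _ in range(len(population) - n_elite)]
    return elite + offspring

# Toy example: even if breeding only produces poor offspring (here always 0),
# the best chromosome 8 is carried over.
print(next_generation_with_elitism([3, 8, 1, 5], fitness=lambda c: c,
                                   breed=lambda pop: 0))  # [8, 0, 0, 0]
```

Because the elite are copied before any crossover or mutation is applied, the best fitness in the population can never decrease from one generation to the next.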
There are many methods for crossovers and mutations depending on the kind of
encoding applied to represent the chromosomes.
Single point crossover: one crossover point is selected, binary string from
the beginning of the chromosome to the crossover point is copied from the
first parent, the rest is copied from the other parent
11001011+11011111 = 11001111
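The single point crossover example can be reproduced with string slicing. The crossover point is not given in the text; a point of 5 is assumed here because it is one choice that yields the result shown. The function name is hypothetical.

```python
import random

def single_point_crossover(parent1, parent2, point):
    """Copy parent1 up to the crossover point and parent2 from there on."""
    return parent1[:point] + parent2[point:]

print(single_point_crossover("11001011", "11011111", 5))  # 11001111

# In a GA the point would normally be drawn at random:
point = random.randint(1, 7)
print(point, single_point_crossover("11001011", "11011111", point))
```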
Two point crossover: two crossover points are selected, binary string from
the beginning of the chromosome to the first crossover point is copied from
the first parent, the part from the first to the second crossover point is copied
from the other parent and the rest is copied from the first parent again.
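Two point crossover can be sketched in the same way. The parents and crossover points below are hypothetical, chosen so that the origin of every bit is easy to see.

```python
def two_point_crossover(parent1, parent2, first, second):
    """Take the middle segment from parent2 and both ends from parent1."""
    return parent1[:first] + parent2[first:second] + parent1[second:]

print(two_point_crossover("00000000", "11111111", 2, 6))  # 00111100
```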
(1 2 3 4 5 6 7 8 9) + (4 5 3 6 8 9 7 2 1) = (1 2 3 4 5 6 8 9 7)
(1 2 3 4 5 6 8 9 7) => (1 8 3 4 5 6 2 9 7)
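The two permutation examples above appear to show a single point crossover that keeps the chromosome a valid permutation, followed by a mutation that exchanges two positions. A minimal sketch under that reading follows; the crossover point 6 and the swapped positions are inferred from the numbers, and the function names are hypothetical.

```python
def permutation_crossover(parent1, parent2, point):
    """Copy parent1 up to the point, then append the genes that are still
    missing in the order in which they appear in parent2."""
    head = parent1[:point]
    tail = [gene for gene in parent2 if gene not in head]
    return head + tail

def swap_mutation(chromosome, i, j):
    """Exchange the genes at positions i and j (0-based)."""
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

child = permutation_crossover([1, 2, 3, 4, 5, 6, 7, 8, 9],
                              [4, 5, 3, 6, 8, 9, 7, 2, 1], 6)
print(child)                       # [1, 2, 3, 4, 5, 6, 8, 9, 7]
print(swap_mutation(child, 1, 6))  # [1, 8, 3, 4, 5, 6, 2, 9, 7]
```

Both operators return a permutation of the same genes, so every city is still visited exactly once.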
For real value encoding we can reuse crossover from binary encoding. For
mutation, a small number is added to (or subtracted from) selected values:
(1.29 5.68 2.86 4.11 5.55) => (1.29 5.68 2.73 4.22 5.55)
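The real value example changes two of the five values by a small amount. A minimal sketch of such a mutation, assuming each value is perturbed independently with some probability by a uniformly distributed step; `perturb_mutation`, `rate` and `max_step` are hypothetical names and values.

```python
import random

def perturb_mutation(chromosome, rate=0.4, max_step=0.15, rng=random):
    """Add a small random amount in [-max_step, max_step] to some of the values."""
    return [x + rng.uniform(-max_step, max_step) if rng.random() < rate else x
            for x in chromosome]

print(perturb_mutation([1.29, 5.68, 2.86, 4.11, 5.55]))
```

Which values change, and by how much, differs from run to run.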
Tree crossover: one crossover point is selected in both parents, parents are
divided in that point and the parts below crossover points are exchanged to
produce new offspring.
Changing mutation: the operator, number, or variable in a randomly selected
node is changed.
Parameters of GA
Today there is no general theory that describes the right GA parameters for
every problem. Recommendations are often the results of empirical studies of
GAs, which were often performed only on binary encoding.
Crossover rate:
Crossover rate should generally be high, about 80%-95%. (However, some results
show that for some problems a crossover rate of about 60% is best.)
Mutation rate:
On the other hand, mutation rate should be very low. The best rates reported
are about 0.5%-1%.
Population size:
It may be surprising that a very big population size usually does not improve
the performance of a GA (in terms of the speed of finding a solution). A good
population size is about 20-30, although sizes of 50-100 are sometimes
reported as best. Some research also shows that the best population size
depends on the encoding, namely on the size of the encoded string. That is,
for chromosomes with 32 bits the population size should be, say, 32, but
surely two times more than the best population size for chromosomes with
16 bits.
Selection:
Basic roulette wheel selection can be used, but sometimes rank selection can be
better. There are also more sophisticated methods that change the selection
parameters during the run of the GA; basically, they behave like simulated
annealing. Selection again depends on the problem.
Encoding:
Encoding depends on the problem and also on the size of the problem instance.
See the section on encoding for some suggestions, or look at other resources.
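The recommendations in this section can be collected in a small settings object as a starting point. This is only a sketch: the class name `GAParams`, the chosen defaults and the `warnings` helper are assumptions, and the ranges are the rough guidelines quoted above, not hard limits.

```python
from dataclasses import dataclass

@dataclass
class GAParams:
    crossover_rate: float = 0.9    # about 0.80-0.95; sometimes about 0.60
    mutation_rate: float = 0.01    # about 0.005-0.01
    population_size: int = 30      # about 20-30; sometimes 50-100
    selection: str = "roulette"    # or "rank"

    def warnings(self):
        """List the settings that fall outside the ranges quoted in the text."""
        notes = []
        if not 0.6 <= self.crossover_rate <= 0.95:
            notes.append("crossover_rate outside 0.60-0.95")
        if not 0.005 <= self.mutation_rate <= 0.01:
            notes.append("mutation_rate outside 0.005-0.01")
        if not 20 <= self.population_size <= 100:
            notes.append("population_size outside 20-100")
        if self.selection not in ("roulette", "rank"):
            notes.append("selection is neither 'roulette' nor 'rank'")
        return notes

print(GAParams().warnings())                   # []
print(GAParams(mutation_rate=0.2).warnings())  # ['mutation_rate outside 0.005-0.01']
```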
Applications of GA
Genetic algorithms have been used for difficult problems (such as NP-hard
problems), for machine learning and also for evolving simple programs. They have
also been used for some art, for evolving pictures and music.
They are also easy to implement. Once you have a GA, you just have to write a
new chromosome (just one object) to solve another problem. With the same
encoding, you just change the fitness function and that is all. On the other
hand, choosing the encoding and the fitness function can be difficult.
A disadvantage of GAs is their computational time. They can be slower than some
other methods. But with today’s computers this is not such a big problem.
To get an idea about problems solved by GA, here is a short list of some
applications:
Robot trajectory
Strategy planning