Roulette Wheel Selection Methods
The basic roulette wheel selection method is stochastic sampling with replacement (SSR). Here, the segment
size and selection probability remain the same throughout the selection phase and individuals are selected
according to the procedure outlined above. SSR gives zero bias but a potentially unlimited spread. Any
individual with a segment size > 0 could entirely fill the next population. Stochastic sampling with partial
replacement (SSPR) extends upon SSR by resizing an individual’s segment if it is selected. Each time an
individual is selected, the size of its segment is reduced by 1.0. If the segment size becomes negative, then it
is set to 0.0. This provides an upper bound on the spread. However, the lower bound is zero and the bias is
higher than that of SSR. Remainder sampling methods involve two distinct phases. In the integral phase,
individuals are selected deterministically according to the integer part of their
expected trials. The remaining individuals are then selected probabilistically from the fractional parts of the
individuals' expected values. Remainder stochastic sampling with replacement (RSSR) uses roulette wheel
selection to sample the individuals not assigned deterministically. During the roulette wheel selection phase,
individuals' fractional parts remain unchanged and thus compete for selection between “spins”. RSSR
provides zero bias and the spread is lower bounded. The upper bound is limited only by the number of
fractionally assigned samples and the size of the integral part of an individual. For example, any individual
with a fractional part > 0 could win all the samples during the fractional phase. Remainder stochastic
sampling without replacement (RSSWR) sets the fractional part of an individual’s expected values to zero if it
is sampled during the fractional phase. This gives RSSWR minimum spread, although this selection method is
biased in favour of smaller fractions.
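The basic SSR procedure can be sketched in Python. Here the segment sizes are raw fitness values, and the function name is illustrative rather than taken from any library:

```python
import random

def roulette_select(population, fitnesses, n):
    """Stochastic sampling with replacement (SSR): each spin of the wheel
    draws one individual with probability proportional to its segment
    (fitness); segments are never resized, so any individual with a
    segment size > 0 can, in principle, win every spin."""
    total = sum(fitnesses)
    selected = []
    for _ in range(n):
        spin = random.uniform(0, total)
        cumulative = 0.0
        for individual, fitness in zip(population, fitnesses):
            cumulative += fitness
            if spin <= cumulative:
                selected.append(individual)
                break
        else:
            selected.append(population[-1])  # guard against float round-off
    return selected

# "c" owns 70% of the wheel, so it tends to dominate the sample:
parents = roulette_select(["a", "b", "c"], [1.0, 2.0, 7.0], n=5)
```

SSPR would differ only in that a selected individual's segment is reduced by 1.0 (floored at 0.0) after each draw.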
GENETIC ALGORITHM
A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is
routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to
the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using
techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
Methodology
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which
encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves
toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other
encodings are also possible. The evolution usually starts from a population of randomly generated individuals
and happens in generations. In each generation, the fitness of every individual in the population is evaluated,
multiple individuals are stochastically selected from the current population (based on their fitness), and
modified (recombined and possibly randomly mutated) to form a new population. The new population is then
used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum
number of generations has been produced, or a satisfactory fitness level has been reached for the population.
If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may
not have been reached.
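The generational loop described above can be sketched as follows. The helper parameters (`fitness`, `random_individual`, `crossover`, `mutate`) are hypothetical placeholders for problem-specific operators, and a simple truncation-style selection stands in for the fitness-based selection step:

```python
import random

def run_ga(fitness, random_individual, crossover, mutate,
           pop_size=100, generations=50, target_fitness=None):
    """Generational loop: evaluate fitness, select the fitter half,
    recombine and mutate to form the next population, and stop on a
    generation cap or a satisfactory fitness level."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if target_fitness is not None and fitness(population[0]) >= target_fitness:
            break  # satisfactory fitness level reached
        parents = population[: pop_size // 2]  # simple truncation selection
        population = [mutate(crossover(random.choice(parents),
                                       random.choice(parents)))
                      for _ in range(pop_size)]
    return max(population, key=fitness)

# Toy "one-max" problem: evolve a 10-bit string toward all ones.
best = run_ga(
    fitness=sum,
    random_individual=lambda: [random.randint(0, 1) for _ in range(10)],
    crossover=lambda a, b: a[:5] + b[5:],
    mutate=lambda g: [1 - bit if random.random() < 0.05 else bit for bit in g],
    target_fitness=10)
```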
A standard representation of the solution is as an array of bits. Arrays of other types and structures can be
used in essentially the same way. The main property that makes these genetic representations convenient is
that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations.
Variable length representations may also be used, but crossover implementation is more complex in this case.
Tree-like representations are explored in genetic programming and graph-form representations are explored in
evolutionary programming.
The fitness function is defined over the genetic representation and measures the quality of the represented
solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants
to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation
of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0
or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the total
size of the objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of the values of all
objects in the knapsack if the representation is valid, or 0 otherwise. In some problems, it is hard or even
impossible to define the fitness expression; in these cases, interactive genetic algorithms are used.
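A minimal sketch of the knapsack fitness function just described, using the bit-array encoding (the values, weights, and capacity below are made-up illustration data):

```python
def knapsack_fitness(bits, values, weights, capacity):
    """Fitness for the bit-array knapsack encoding: the total value of
    the selected objects if they fit in the knapsack, 0 otherwise."""
    total_weight = sum(w for b, w in zip(bits, weights) if b)
    if total_weight > capacity:
        return 0  # invalid representation: over capacity
    return sum(v for b, v in zip(bits, values) if b)

# Three objects worth (10, 7, 4) weighing (5, 4, 3), capacity 8:
print(knapsack_fitness([1, 0, 1], [10, 7, 4], [5, 4, 3], 8))  # 5+3 fits: 14
print(knapsack_fitness([1, 1, 1], [10, 7, 4], [5, 4, 3], 8))  # 12 > 8: 0
```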
Once we have the genetic representation and the fitness function defined, GA proceeds to initialize a
population of solutions randomly, then improve it through repetitive application of mutation, crossover,
inversion and selection operators.
Initialization
Initially, many individual solutions are randomly generated to form an initial population. The population size
depends on the nature of the problem, but a population typically contains several hundred or several thousand possible
solutions. Traditionally, the population is generated randomly, covering the entire range of possible solutions
(the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to
be found.
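A minimal initialization sketch for bit-string genomes, with optional seeding; the function name and parameters are illustrative:

```python
import random

def init_population(pop_size, genome_length, seeds=()):
    """Random bit-string population covering the search space, optionally
    'seeded' with individuals placed where good solutions are expected."""
    population = [list(s) for s in seeds]
    while len(population) < pop_size:
        population.append([random.randint(0, 1) for _ in range(genome_length)])
    return population

# 50 genomes of 8 bits, the first one seeded near a suspected optimum:
pop = init_population(50, 8, seeds=[[1, 1, 1, 1, 1, 1, 1, 1]])
```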
Selection
During each successive generation, a proportion of the existing population is selected to breed a new
generation. Individual solutions are selected through a fitness-based process, where fitter solutions (as
measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the
fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample
of the population, as this process may be very time-consuming.
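Tournament selection is a common method of the second kind: it rates only a small random sample of the population per pick. A minimal sketch (illustrative, not from any particular library):

```python
import random

def tournament_select(population, fitness, k=3):
    """Rate only a random sample of k individuals and return the fittest,
    avoiding a full evaluation of the population on every pick."""
    return max(random.sample(population, k), key=fitness)

# With k = len(population) this degenerates to picking the global best:
winner = tournament_select(list(range(10)), fitness=lambda x: x, k=10)
```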
Reproduction
The next step is to generate a second generation population of solutions from those selected through genetic
operators: crossover (also called recombination), and/or mutation.
For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool
selected previously. By producing a "child" solution using the above methods of crossover and mutation, a
new solution is created which typically shares many of the characteristics of its "parents". New parents are
selected for each new child, and the process continues until a new population of solutions of appropriate size
is generated. Although reproduction methods based on a pair of parents are more "biology inspired", some
research suggests that recombining more than two "parents" can produce higher quality chromosomes.
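The crossover-then-mutation step for producing one child can be sketched as follows, using a one-point crossover on bit lists; the mutation rate is an arbitrary example value:

```python
import random

def one_point_crossover(mother, father):
    """Exchange the tails of two parents at a randomly chosen point."""
    point = random.randint(1, len(mother) - 1)
    return mother[:point] + father[point:]

def mutate(genome, rate=0.01):
    """Flip each bit independently with a small probability."""
    return [1 - bit if random.random() < rate else bit for bit in genome]

# One child sharing characteristics of both parents:
child = mutate(one_point_crossover([0] * 8, [1] * 8))
```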
These processes ultimately result in the next generation population of chromosomes that is different from the
initial generation. Generally the average fitness will have increased by this procedure for the population, since
only the best organisms from the first generation are selected for breeding, along with a small proportion of
less fit solutions, for reasons already mentioned above.
Although crossover and mutation are known as the main genetic operators, it is possible to use other
operators such as regrouping, colonization-extinction, or migration in genetic algorithms.
Termination
This generational process is repeated until a termination condition has been reached. Common terminating
conditions are: a fixed maximum number of generations is reached, a solution is found that satisfies a target
fitness level, or successive generations no longer produce significantly better results.
Introduction
A genetic algorithm (GA) is an artificial intelligence procedure based on the theory of natural selection
and evolution. This search algorithm balances the need for:
1. exploitation -
Selection and crossover tend to converge on a good but sub-optimal solution.
2. exploration -
Selection and mutation create a parallel, noise-tolerant, hill climbing algorithm, preventing a
premature convergence.
For details of Genetic Algorithm, please refer to my partner's first article in GA Project.
Applications
Traditional methods of search and optimization are too slow in finding a solution in a very complex search
space, even when implemented on supercomputers. Genetic Algorithm is a robust search method requiring little
information to search effectively in a large or poorly understood search space. In particular, a genetic search
progresses through a population of points, in contrast to the single point of focus of most search algorithms.
Moreover, it is useful in the very tricky area of nonlinear problems. Its intrinsic parallelism (in evaluation
functions, selections and so on) allows the use of distributed processing machines.
Genetic Algorithm codes parameters of the search space as binary strings of fixed length. It employs a
population of strings initialized at random, which evolve to the next generation by genetic operators such as
selection, crossover and mutation. The fitness function evaluates the quality of solutions coded by strings.
Selection allows strings with higher fitness to appear with higher probability in the next generation. Crossover
combines two parents by exchanging parts of their strings, starting from a randomly chosen crossover point.
This leads to new solutions inheriting desirable qualities from both parents. Mutation flips single bits in a
string, which prevents the GA from premature convergence by exploring new regions of the search space.
GA tends to take advantage of the fittest solutions by giving them greater weight, and concentrating the search
in the regions which lead to fitter structures, and hence better solutions of the problem.
Finding good parameter settings that work for a particular problem is not a trivial task. The critical factors are
to determine robust parameter settings for population size, encoding, selection criteria, genetic operator
probabilities and evaluation (fitness) normalization techniques.
If the population is too small, the genetic algorithm will converge too quickly to a local optimal point
and may not find the best solution. On the other hand, too many members in a population result in long
waiting times for significant improvement.
Coding the solutions follows the principle of meaningful building blocks and the principle of minimal
alphabets, typically by using binary strings.
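A common way to code a real-valued parameter with a minimal (binary) alphabet is a fixed-length bit string mapped linearly onto the parameter's range. A sketch, with an illustrative 8-bit resolution:

```python
def decode(bits, lo, hi):
    """Map a fixed-length binary string onto a real parameter in [lo, hi]."""
    n = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * n / (2 ** len(bits) - 1)

# 8 bits give 256 evenly spaced values across the range:
print(decode([0] * 8, -1.0, 1.0))  # -1.0, the lower bound
print(decode([1] * 8, -1.0, 1.0))  #  1.0, the upper bound
```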
The fitter member will have a greater chance of reproducing. The members with lower fitness are
replaced by the offspring. Thus in successive generations, the members on average are fitter as
solutions to the problem.
Too high a mutation rate introduces too much diversity and takes longer to reach the optimal solution.
Too low a mutation rate tends to miss some near-optimal points. Two-point crossover reaches the same
results more quickly and retains good solutions longer than one-point crossover.
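Two-point crossover keeps both ends of one parent and swaps in the middle segment of the other; a minimal sketch on bit lists:

```python
import random

def two_point_crossover(mother, father):
    """Swap the segment between two randomly chosen cut points, keeping
    the mother's head and tail around the father's middle."""
    i, j = sorted(random.sample(range(1, len(mother)), 2))
    return mother[:i] + father[i:j] + mother[j:]

# Both ends come from the first parent, the middle from the second:
child = two_point_crossover([0] * 10, [1] * 10)
```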
The fitness function links the Genetic Algorithm to the problem to be solved. The assigned fitness is
used to calculate the selection probabilities for choosing parents, for determining which member will
be replaced by which child.
Computer-Aided Design
Genetic Algorithm uses the feedback from the evaluation process to select the fitter designs, generating new
designs through the recombination of parts of the selected designs. Eventually, this results in a population of
high-performance designs.