MCTS-GA
Abstract— In this research, we investigate the possibility of applying a search strategy to genetic algorithms to explore the entire genetic tree structure. Several methods aid in performing tree searches; however, simpler algorithms such as breadth-first, depth-first, and iterative techniques are computation-heavy and often result in a long execution time. Adversarial techniques are often the preferred mechanism when performing a probabilistic search, yielding optimal results more quickly. The problem we are trying to tackle in this paper is the optimization of neural networks using genetic algorithms. Genetic algorithms (GA) form a tree of possible states and provide a mechanism for rewards via the fitness function. Monte Carlo Tree Search (MCTS) has proven to be an effective tree search strategy given states and rewards; therefore, we will combine these approaches to optimally search for the best result generated with genetic algorithms.

Keywords—genetic algorithm, mcts, optimization, reinforcement learning, neural network

I. INTRODUCTION

Genetic algorithms belong to a subclass of evolutionary algorithms that provide a method for optimization based on genetic selection principles [8]. They serve a purpose in machine learning and research development, in addition to search tool optimization systems. The approach used in genetic algorithms is analogous to biological concepts of chromosome generation, with operators such as selection, crossover, mutation, and recombination. GA is a population-based approach that aims to improve a solution over successive generations. The process of evolution using GA involves starting with a random population and evolving it using crossover and mutation operators to generate children. The best-fit solution is then filtered, and the genetic process is repeated until the objective is achieved. We can observe that in the process of genetic algorithms, a tree of possible solutions is generated, and the best-fit solution is picked for successive iteration, which limits the search space and computational resources.

Genetic algorithms are used for various problem domains such as decision trees, segmentation, classification, etc. However, in this paper, our focus will be on the application of GA in optimizing neural network weights.

Figure 1. Simple Genetic Algorithm (selection, crossover, mutation)

The Monte Carlo Tree Search approach was developed in 2006 as an application to game-tree search. This tree search works on the principle of cumulative reward calculated from child nodes and uses Q and N values to balance between exploration and exploitation. The exploration approach considers the number of nodes visited and uses a quantitative approach to discover child nodes that have not been visited. The exploitation approach follows a qualitative strategy to discover child nodes, with the Q-value indicating the cumulative sum of rewards.

Figure 2. Outline of a Monte Carlo Tree Search
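The canonical GA loop outlined in the introduction (random initial population, crossover and mutation to produce children, best-fit filtering, repeat) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy fitness function, the Gaussian mutation, the elitist filter, and all parameter values are assumptions made for the example.

```python
import random


def fitness(individual):
    # Toy objective (assumption): higher is better, maximum 0 at the origin.
    return -sum(x * x for x in individual)


def one_point_crossover(parent_a, parent_b, rng):
    # Cut both parents at the same random point and join the two halves.
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(individual, rng, rate=0.1, scale=0.5):
    # Perturb each gene with probability `rate` (hypothetical settings).
    return [x + rng.gauss(0.0, scale) if rng.random() < rate else x
            for x in individual]


def simple_ga(pop_size=20, dims=5, generations=30, seed=0):
    rng = random.Random(seed)
    # Start from a random population.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dims)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            p1, p2 = rng.sample(population, 2)
            children.append(mutate(one_point_crossover(p1, p2, rng), rng))
        # Filter: keep only the best-fit individuals for the next iteration.
        population = sorted(population + children,
                            key=fitness, reverse=True)[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history
```

Because only the best-fit individuals survive each generation, every other branch of the tree of possible solutions is discarded; this is the pruning that the tree-search view aims to revisit.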
III. APPROACH

In our approach, we try to address both of the issues mentioned above by combining MCTS and GA. We can take advantage of the genetic algorithm structure, as it generates a tree with a mechanism to evaluate the reward in terms of the fitness of the individual. MCTS uses the same underlying tree structure along with the fitness function to calculate the Q-value, which indicates the cumulative sum of rewards; the N-value, which indicates the visit count of each node, is also maintained for the calculation of upper bounds. The overall structure follows the complete expansion of child nodes using GA, with MCTS under the UCT policy searching for the optimal nodes and the fitness function serving as the method to assign rewards.

Figure 5. Application of Monte Carlo Tree Search to Genetic Algorithm
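The tree bookkeeping described above can be sketched as a small node type that carries a population together with its Q-value and N-value. This is only an illustration under stated assumptions: the `Node` class and the function names are hypothetical, the exploration term uses the standard UCB1 form with N taken as the parent's visit count, and unvisited children are given an infinite score so that they are tried first.

```python
import math


class Node:
    """A tree node holding a GA population plus MCTS statistics."""

    def __init__(self, population, parent=None):
        self.population = population  # individuals at this node (hypothetical layout)
        self.parent = parent
        self.children = []
        self.q = 0.0  # Q-value: cumulative sum of rewards (fitness values)
        self.n = 0    # N-value: visit count

    def add_child(self, population):
        child = Node(population, parent=self)
        self.children.append(child)
        return child


def uct_score(child, n_total):
    # Assumption: unvisited children are explored before any visited one.
    if child.n == 0:
        return math.inf
    # UCT = Q + UCB1, with UCB1 = sqrt(2 * ln(N) / N_i).
    return child.q + math.sqrt(2.0 * math.log(max(n_total, 1)) / child.n)


def select_child(node):
    # Pick the child with the highest UCT score.
    return max(node.children, key=lambda c: uct_score(c, node.n))


def backpropagate(node, reward):
    # Add the reward and one visit to every node on the path to the root.
    while node is not None:
        node.q += reward
        node.n += 1
        node = node.parent
```

A common variant uses the mean reward Q/N in place of the cumulative sum; the sketch keeps the cumulative form because that is how the Q-value is described here.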
The process of evolution using GA involves starting with a random population, which in our case is the weights of the neural network. We generate the initial population P0 of a certain size by adding values drawn from a random uniform distribution to each dimension of the weights of the neural network and creating children with random weights.

Next, we introduce the concept of genetic action for MCTS, as illustrated by the dashed box in Figure 5. The genetic action consists of the following three genetic operations applied on the parent population.

• Genetic Action -
  o Selection
  o Crossover
  o Mutation

Selection: Various selection strategies such as roulette wheel, tournament, rank, steady state, etc. are available for filtering individuals of higher fitness.

Crossover: A crossover operator is then applied on the initial population to generate children. 1-point crossover is applied, where we have restricted the crossover operator to each layer of the neural network to mitigate the problem of competing conventions and maintain the integrity of the weights of the neural network.

Mutation: The mutation operator is used to introduce random mutation in the population. For our approach, we have used random mutation on each layer of the neural network by swapping the weight values of 2 randomly chosen individuals.

The subsequent child nodes are selected with the UCT policy of MCTS from the root node until a leaf node is found. The children among the leaf nodes are selected randomly and their corresponding Q and N values are backpropagated. The UCT policy using UCB1 is given in Equation 1.
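One possible reading of the genetic action (selection, layer-wise 1-point crossover, layer-wise swap mutation) is sketched below. Several details are assumptions rather than the authors' stated choices: an individual is represented as a list of layers, each a flat list of weights; tournament selection is used as the selection strategy; and the mutation swaps a single weight at one random position per layer between two randomly chosen individuals. All function names are hypothetical.

```python
import random


def tournament_select(population, fitness, k, rng):
    # Pick k random contenders and return the fittest one.
    return max(rng.sample(population, k), key=fitness)


def layerwise_crossover(parent_a, parent_b, rng):
    # 1-point crossover applied separately inside each layer, so weights
    # never move from one layer to another.
    child = []
    for layer_a, layer_b in zip(parent_a, parent_b):
        cut = rng.randrange(1, len(layer_a))
        child.append(layer_a[:cut] + layer_b[cut:])
    return child


def swap_mutation(population, rng):
    # For each layer, swap one weight between two randomly chosen individuals.
    mutated = [[list(layer) for layer in ind] for ind in population]
    for layer_idx in range(len(mutated[0])):
        i, j = rng.sample(range(len(mutated)), 2)
        pos = rng.randrange(len(mutated[i][layer_idx]))
        layer_i, layer_j = mutated[i][layer_idx], mutated[j][layer_idx]
        layer_i[pos], layer_j[pos] = layer_j[pos], layer_i[pos]
    return mutated


def genetic_action(population, fitness, n_children, rng, k=3):
    # Selection -> crossover -> mutation, producing one set of children.
    children = []
    for _ in range(n_children):
        p1 = tournament_select(population, fitness, k, rng)
        p2 = tournament_select(population, fitness, k, rng)
        children.append(layerwise_crossover(p1, p2, rng))
    return swap_mutation(children, rng)
```

The sketch assumes every layer holds at least two weights and that at least two children are requested, since both the cut point and the swap need two distinct choices.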
$$\mathrm{UCT} = Q + \mathrm{UCB1}$$

$$\mathrm{UCB1} = \sqrt{\frac{2 \ln(N)}{N_i}}$$

where $N_i$ is the N-value of the $i$th child node and $N$ is the number of child nodes.

Equation 1. Upper confidence bounds

The next step in the process is the application of the concept of rollout (simulation) based on the MCTS approach. For the selected child node, an evolutionary rollout operation is applied wherein the individual is evolved using a (μ+λ) evolutionary strategy, unlike the regular random action simulations performed in prior research experiments [3]. The reason for applying the evolutionary rollout is to find out if the genetically mutated individual is of the highest fitness possible for its phenotype. We define this process as ageing the individual, in comparison to the biological phenomenon of ageing. Thus, the ageing process introduced in rollout determines the best possible age (generation) in which the individual would be best suited to genetically mutate again. The rollout/ageing process will replace the individual if a better fitness is found in later generations of the evolutionary strategy.

This process is repeated until the specified tree depth is reached. This approach provides computational flexibility in terms of configurable parameters such as tree height, tournament selection, number of rollout generations and branching factor. These parameters can be used in combination with each other as per the computational capacity available. Thus, the application of genetic action and evolutionary rollout in amalgamation with MCTS provides the basis for the approach discussed in this paper.

The results obtained from primary testing have proven to be positive. The MCTS-GA approach was able to optimize the neural network weights for better classification of the diabetes data and obtained better accuracy results.

The comparison of the results obtained to those of the original neural network and the canonical genetic algorithm is shown below.

TABLE I. ACCURACY RESULTS

            Neural Net - SGD   Neural Net - ADAM   Genetic Algorithm   MCTS-GA
accuracy    0.49               0.72                0.73                0.745
recall      0.42               0.73                0.77                0.78

The classification accuracy can also be seen in the ROC-AUC curves indicated below.

Figure 6. ROC-AUC curve for neural network, genetic algorithm and MCTS-GA respectively

The confusion matrices for the three approaches compared are shown below.