GACNN - Training Deep Convolutional Neural Networks With Genetic Algorithm
KARAN DIXIT
(1MV17CS046)
Abstract

Convolutional Neural Networks (CNNs) have gained significant attention in recent years due to their increasing number of real-world applications. Their performance is highly dependent on the network structure and on the optimization method selected for tuning the network parameters. In this paper, we propose novel yet efficient methods for training convolutional neural networks. Most current state-of-the-art learning methods for CNNs are based on gradient descent. In contrast to these traditional CNN training methods, we propose to optimize CNNs using methods based on Genetic Algorithms (GAs). These methods are carried out using three individual GA schemes: Steady-State, Generational, and Elitism. We present new genetic operators for crossover and mutation, and also an innovative encoding paradigm from CNNs to chromosomes that aims to reduce the resulting chromosome's size by a large factor. We compare the effectiveness and scalability of our encoding with the traditional encoding. Furthermore, the individual GA schemes used for training the networks are compared with each other in terms of convergence rate and overall accuracy. Finally, our new encoding alongside the best-performing GA-based training scheme is compared to backpropagation training with Adam optimization.
1. Introduction
Genetic Algorithm (GA), as one of the subsets of Evolutionary Algorithms, is a global optimization method inspired by the process of natural selection, used for solving both constrained and unconstrained optimization problems. A genetic algorithm repeatedly modifies a population of individual solutions by selecting the best individuals from the current population as parents and using them to produce children for the next generation through a number of bio-inspired operators. Over successive generations, the population "evolves" towards an optimal solution. Since the training process of a CNN is essentially an optimization problem, intuition suggests that a GA can be used to perform it.

In this work, we attempt to train two different deep convolutional neural network architectures on an image classification task over two different modern datasets using methods based on the genetic algorithm. These methods are carried out using three individual GA schemes: Steady-State, Generational, and Elitism. Our training methods involve novel genetic operators for crossover and mutation. In addition, we introduce the Accordion chromosome structure, an innovative encoding paradigm from networks to chromosomes that reduces the chromosome's size by a large factor, leading to faster operation times.

In general, a genetic algorithm consists of the following five steps (a minimal sketch of this loop is given after the list):
1. Initialization
2. Evaluation
3. Selection
4. Genetic Operators
5. Termination
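
For concreteness, here is a minimal, illustrative sketch of this five-step loop in Python. All function names (init_individual, fitness, select_parents, crossover, mutate) are placeholders rather than the implementation used in this work; termination is simplified to a fixed generation budget, and the loop shown is the generational flavor, whereas the steady-state and elitism schemes differ in how the next population is formed.

```python
import random

def genetic_algorithm(pop_size, generations, init_individual, fitness,
                      select_parents, crossover, mutate, mutation_rate=0.1):
    # 1. Initialization: create the starting population.
    population = [init_individual() for _ in range(pop_size)]
    for _ in range(generations):  # 5. Termination: fixed budget in this sketch.
        # 2. Evaluation and fitness assignment.
        scored = [(fitness(ind), ind) for ind in population]
        # 3. Selection and 4. genetic operators produce the next generation.
        children = []
        while len(children) < pop_size:
            parent_a, parent_b = select_parents(scored)
            child = crossover(parent_a, parent_b)
            if random.random() < mutation_rate:
                child = mutate(child)
            children.append(child)
        population = children
    return max(population, key=fitness)
```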
The steady-state scheme for training a convolutional neural network consists of the following steps:

1. Initialization: In this step, a number of networks equal to pop_size are initialized using Keras, with their convolution filter and connection weight values assigned random numbers drawn from a truncated normal distribution centered on zero, using Keras' built-in Glorot normal initializer (a sketch of this step follows).
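
As a minimal sketch of this initialization, assuming a small illustrative architecture (the actual MNIST/CIFAR10 architectures are not reproduced here); build_network and pop_size = 20 are placeholder choices of ours:

```python
from tensorflow import keras

def build_network():
    # Keras' Glorot normal initializer draws from a truncated normal
    # distribution centered on zero, as described above.
    init = keras.initializers.GlorotNormal()
    model = keras.Sequential([
        keras.layers.Conv2D(8, 3, activation="relu",
                            kernel_initializer=init, input_shape=(28, 28, 1)),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax", kernel_initializer=init),
    ])
    model.compile(loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

pop_size = 20
population = [build_network() for _ in range(pop_size)]  # 1. Initialization
```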
2. Evaluation: During this step, each network's performance is evaluated based on its accuracy as reported by Keras' model.evaluate() function. This step is executed in parallel using the multiprocess library in Python, allowing for a faster program run time.

3. Fitness Assignment: Each network is assigned a fitness value f_i based on its evaluation. Here, we used the accuracy of each network as its fitness value. (A combined sketch of the evaluation and fitness assignment follows.)
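
A minimal sketch of steps 2 and 3, assuming the population from the previous sketch and a held-out evaluation set (x_eval, y_eval); the paper runs the evaluation in parallel with the multiprocess library, while this sketch keeps it sequential for clarity:

```python
def fitness(model, x_eval, y_eval):
    # Step 2: model.evaluate() returns [loss, accuracy];
    # step 3: the accuracy is used directly as the fitness value f_i.
    _, accuracy = model.evaluate(x_eval, y_eval, verbose=0)
    return accuracy

# Fitness assignment over the whole population (sequential stand-in for
# the multiprocess-based parallel evaluation described above).
fitnesses = [fitness(net, x_eval, y_eval) for net in population]
```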
4. Selection: A selection probability is assigned to each network in this step. For our work, we used one of the most well-known selection strategies, Fitness Proportional Selection, more commonly known as Roulette Wheel selection. In this selection, the higher the fitness of a network, the more probable it is to be selected as a parent for reproduction (see the sketch below).
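
A minimal sketch of roulette wheel selection, assuming the fitness values are the non-negative accuracies computed earlier; the policy for avoiding duplicate parents is left out:

```python
import numpy as np

def roulette_wheel_select(population, fitnesses, rng=np.random.default_rng()):
    # Each network's selection probability is proportional to its fitness.
    fitnesses = np.asarray(fitnesses, dtype=float)
    probabilities = fitnesses / fitnesses.sum()
    index = rng.choice(len(population), p=probabilities)
    return population[index]

# Two parents for reproduction (a real implementation may resample
# to force the parents to be distinct).
# parent_a = roulette_wheel_select(population, fitnesses)
# parent_b = roulette_wheel_select(population, fitnesses)
```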
5. Crossover: In this process, two parents spawn a new child sharing some of their attributes. For this operator, the fully folded chromosome is not considered. Instead, a semi-folded chromosome structure is used, in which each element is either a convolution filter or the entirety of the incoming weights of a neuron in the fully connected section (a sketch follows).
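
To illustrate, here is a minimal sketch that builds such a semi-folded chromosome from a Keras model and applies a crossover over it. Representing the chromosome as a list of NumPy arrays, ignoring bias terms, and mixing elements uniformly between the parents are all our assumptions, since the exact rules are not spelled out in this excerpt:

```python
import numpy as np

def semi_fold(model):
    # Semi-folded chromosome: one element per convolution filter or per
    # fully connected neuron's incoming-weight vector (biases omitted here).
    chromosome = []
    for layer in model.layers:
        weights = layer.get_weights()
        if not weights:
            continue  # layers such as Flatten carry no weights
        kernel = weights[0]
        if kernel.ndim == 4:                    # Conv2D kernel: (h, w, in, out)
            chromosome.extend(kernel[..., f] for f in range(kernel.shape[-1]))
        elif kernel.ndim == 2:                  # Dense kernel: (in, out)
            chromosome.extend(kernel[:, n] for n in range(kernel.shape[-1]))
    return chromosome

def crossover(parent_a, parent_b, rng=np.random.default_rng()):
    # Each element (whole filter or whole incoming-weight vector) is
    # inherited wholesale from one of the two parents.
    return [a.copy() if rng.random() < 0.5 else b.copy()
            for a, b in zip(parent_a, parent_b)]
```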
6. Mutation: During this process, one parent spawns a new child sharing most of its attributes. The process clones an exact copy of the parent's chromosome and randomly selects some of the elements in the semi-folded chromosome structure to be altered (a sketch follows).
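
A minimal sketch of this mutation under the same list-of-arrays representation; the per-element mutation probability and the zero-centered noise used to alter the selected elements are our assumptions:

```python
import numpy as np

def mutate(parent, mutation_prob=0.05, stddev=0.05,
           rng=np.random.default_rng()):
    # Clone the parent's chromosome, then randomly pick elements of the
    # semi-folded structure and alter them.
    child = [elem.copy() for elem in parent]
    for elem in child:
        if rng.random() < mutation_prob:
            # Perturb the whole filter / incoming-weight vector with
            # zero-centered noise (assumed alteration rule).
            elem += rng.normal(0.0, stddev, size=elem.shape)
    return child
```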
4. Evaluation Results

We must also note here a very important factor in the fidelity of our results: the initialization used for all of the training methods for each network is the same. This means that, for example, for the MNIST network, a population is initialized once, and then this one and only population is evolved using the different encodings and training schemes. This results in the same starting accuracy for every comparison that is made. Also, when comparing against the backpropagation methods later, the fittest member of the initial population is the one selected to be trained with backpropagation.

In Figure 12a, the network for the MNIST classification task was trained using the steady-state scheme, once with the traditional encoding and once with the Accordion encoding. It can be seen from this result that the Accordion encoding performs only slightly better than the traditional encoding. The same evaluation was also carried out for the CIFAR10 classification task. As the results in Figure 12b show, even though the initial accuracy of the population of networks trained using the Accordion encoding is lower than that of the traditional encoding, the population trained with the Accordion encoding surpasses the traditional encoding multiple times over the course of the evolution. Additionally, evolution with the Accordion encoding reaches a higher accuracy threshold than the traditional encoding in the same number of iterations. In short, the Accordion encoding converges faster and reaches better accuracies than the traditional encoding.
Figure 12: Comparison of the Accordion encoding and traditional encoding used in the steady-state training scheme of (a) the network for the MNIST classification task and (b) the network for the CIFAR10 classification task.
6. Conclusion