
Recent Trends in Computer Graphics and Multimedia Technology

Volume 4 Issue 1

Building CNN Model for Autonomous Car using Udacity Simulator

Jinendra Anchalia1*, Dr S Srividhya2


1Student, 2Associate Professor,
Department of Information Science & Engineering,
BNM Institute of Technology, Bangalore, India.
*Corresponding Author
E-Mail Id:- [email protected]

ABSTRACT
Recent times have seen a rising trend in the area of automation. One of the most trending topics in the field is self-driving cars, which represent the culmination of artificial intelligence, machine learning, deep learning, the Internet of Things and many other in-demand domains and technologies that have improved significantly in the last decade. These cars use complex artificial intelligence and machine learning algorithms. This paper explains how to build an architecture for autonomous cars in the Udacity simulator, how to tune different parameters of a CNN model to reach an acceptable accuracy, and presents the outcomes in plain, understandable language. This research uses a simulation environment both to generate the dataset and to test the model.

Keywords:- Artificial intelligence, deep learning, augmentation, behavioral cloning

INTRODUCTION
The last decade has seen a growth in the demand for skills like artificial intelligence and machine learning: teaching computers to mimic human behaviour, analyze and understand large amounts of data, and predict future outcomes that can be of utmost business importance.

According to a recent research report, the AI market is expected to grow from approximately 30 billion USD in 2020 to approximately 300 billion USD by 2026, at a CAGR of 35.6% [1]. Multi-national companies like Microsoft, Alphabet, Facebook, etc. rely heavily on artificial intelligence and machine learning, and spend billions of dollars on its research and development.

One of the important categories in AI is deep learning, and a lot of work has been done in the area of applying deep learning algorithms to image datasets to generate important insights and predict possible future outcomes.

For various tasks related to computer vision, like detection, classification and recognition on both images and videos, one algorithm supersedes all others: the Convolutional Neural Network, or CNN. In a CNN, the various features of an image/video frame are not spatially dependent. What a CNN does first is try to detect immediate and steep changes in pixel value.

For example, in an image containing a face, a CNN will first detect the boundary of the face. Thus, the location of the face in the image does not matter to the network. Then, as we move to the deeper layers of the network, it tries to obtain other, more abstract features. This way a CNN learns to extract features from a frame, as in Fig.1 [2].

HBRP Publication Page 1-7 2022. All Rights Reserved Page 1



Fig.1:-Feature extraction from a frame by a CNN
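The "steep change in pixel value" that the paper describes is exactly what a small gradient kernel responds to. The following numpy sketch is illustrative only (it is not code from the paper) and uses a hypothetical `detect_vertical_edges` helper with a minimal 1x2 difference kernel:

```python
import numpy as np

def detect_vertical_edges(image, kernel=None):
    """Convolve a grayscale image with a horizontal-gradient kernel so
    that steep left-to-right changes in pixel value produce large
    responses (valid convolution, no padding)."""
    if kernel is None:
        # Compares each pixel with its left neighbour.
        kernel = np.array([[-1.0, 1.0]])
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A synthetic frame: dark on the left half, bright on the right half.
frame = np.zeros((4, 8))
frame[:, 4:] = 255.0

response = detect_vertical_edges(frame)
# The only steep change is between columns 3 and 4, so the response
# is 255 in that column and 0 everywhere else.
```

Deeper layers of a real CNN learn such kernels automatically instead of using a hand-written one; this sketch only shows why the response is invariant to where the edge sits in the frame.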

RELATED WORK
Growing technology and computational capabilities have now enabled us to build, deploy and evaluate complex neural networks. Mariusz Bojarski et al. [3], working for NVIDIA, published a paper where they presented the output of their experiment: they mounted a single front-facing camera, recorded a driver's behaviour, and configured a neural network to later mimic the driver.

Brilian Tafjira et al. [4] proposed a well-guided approach to developing an autonomous driving car. The method uses a pre-trained YOLOv1, a road lane detector, and a controller to integrate them into a smart system. Javier del Egio et al. [5] proposed the use of Long Short-Term Memory (LSTM) layers in the network architecture for smoother driving, by taking into account the previous values encountered by the network. Nitin Kanagaraj et al. [6] presented a method using a Spatial Transformer Network (STN) that implements a calibration matrix and distortion coefficients for lane detection in autonomous cars. A lot of work has been done, and is continuously being done, in the field of autonomous cars. Surely the domain has a lot of potential, both economically and socially.

BASIC CNN ARCHITECTURE
On the path to understanding a CNN, it is very important to know its architecture. A typical CNN consists of three components:
1. Input layer
2. Hidden layer
   2.1 Convolutional layer
   2.2 Pooling layer
   2.3 Activation function
3. Output layer

Input Layer
The input to a CNN is an image. Mostly an image consists of three channels (red, green, blue); other colour schemes can be grayscale, HSV, etc. Simply put, an image can be viewed as a 2D array or matrix of pixels, and the neural network reduces it such that no features are lost and yet it becomes easier to process.

Hidden Layer
Convolutional Layer
This layer extracts information from the input image. Input images may be distorted or unclear, so in this layer a kernel is applied, which compresses the image such that no data loss occurs. By doing this, the effect of any problem with the input is suppressed.

Pooling Layer
This layer down-samples the features received. Simply put, it reduces the number of parameters or dimensions of the input feature, reducing its spatial size and hence acting as a noise suppressant.

Activation Function
A biological neuron, the inspiration behind neural networks, is triggered at a certain threshold; hence the introduction of activation functions into neural networks. Output is sent to the next layer only above a certain value, introducing non-linearity into the model.

Output Layer
The output layer consists of a few fully-connected neural layers. A fully-connected layer is one where every neuron gets input from all the neurons of the previous layer. This allows the model to learn all possible non-linear combinations of high-level features.

UDACITY SIMULATOR
Udacity built a virtual environment for generating datasets for self-driving cars. This virtual environment can also be used to test a trained machine learning model. The project is open source, so any enthusiast is allowed to access and modify the simulator as per their requirements. It is developed using Unity's video game development IDE. Its resolution (as in Fig.2), controls and tracks can be changed as per the user's wish.

The environment has two modes and two tracks to work on, as seen in Fig.3. The modes include the training mode, where the user can record the movement of the car while steering it with the controls. At every frame, three pictures are taken from the left, right and centre cameras mounted on top of the car. An additional csv file (see Fig.4) is also generated, which not only stores the paths of the images but also provides additional data related to the frame: the steering angle, throttle, brake and speed of the car at that instant [7]. This dataset can be generated on either of the two tracks provided. After the model is built, the second track can be used to test the model in a real-time simulated environment; since the second track was not seen by the model previously, this also validates whether or not our model overfits the training data.
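The pooling and activation behaviour described in the BASIC CNN ARCHITECTURE section can be sketched in plain numpy. This is an illustrative sketch, not code from the paper, and the 2x2 window size is an arbitrary choice for the example:

```python
import numpy as np

def relu(x):
    """Activation: pass values above the threshold (0), zero otherwise."""
    return np.maximum(x, 0.0)

def max_pool_2x2(feature_map):
    """Down-sample a 2D feature map by keeping the maximum of each
    non-overlapping 2x2 window, halving both spatial dimensions."""
    h, w = feature_map.shape
    return feature_map[:h - h % 2, :w - w % 2].reshape(
        h // 2, 2, w // 2, 2).max(axis=(1, 3))

features = np.array([[1.0, -2.0, 3.0, 0.0],
                     [4.0, 0.5, -1.0, 2.0]])
pooled = max_pool_2x2(relu(features))
# relu zeroes the negatives, then each 2x2 window keeps its maximum,
# so a 2x4 map becomes a 1x2 map: fewer parameters, same salient values.
```

This is the sense in which pooling acts as a noise suppressant: small values inside each window never reach the next layer.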

Fig.2:-Initial prompt to the simulator




Fig.3:-Udacity self-driving car simulator

Fig.4:-Generated csv file
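The csv file of Fig.4 can be read with the standard library alone. The column order below (three image paths, then steering, throttle, brake, speed) is an assumption based on the description above, and `parse_driving_log` is a hypothetical helper, not code from the paper:

```python
import csv
import io

# Assumed column layout: three image paths followed by the four
# telemetry values described in the text, one row per frame.
FIELDS = ["center", "left", "right", "steering", "throttle", "brake", "speed"]

def parse_driving_log(file_obj):
    """Read the simulator's csv log into a list of dicts, converting
    the four numeric telemetry columns to float."""
    rows = []
    for record in csv.reader(file_obj):
        row = dict(zip(FIELDS, (field.strip() for field in record)))
        for key in ("steering", "throttle", "brake", "speed"):
            row[key] = float(row[key])
        rows.append(row)
    return rows

# One synthetic frame record in the assumed layout.
sample = io.StringIO(
    "IMG/center_1.jpg,IMG/left_1.jpg,IMG/right_1.jpg,-0.25,0.8,0.0,30.1\n"
)
log = parse_driving_log(sample)
```

In practice the steering column becomes the regression target, and the image paths are used to load the training frames.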

BASIC ARCHITECTURE
In this domain, the most commonly used architectures for building a model are NVIDIA's model and the Comma.ai model. But we take a much simpler architecture, defined in [8], as our base architecture, which will further be referred to as architecture A. It has only 5 layers: 1 normalization layer, 1 convolutional layer, 1 pooling layer, 1 dropout layer and 1 flatten layer. The number of filters in the convolutional layer is only 2. The size of the input image is 16x32. This architecture has only 69 parameters.

Fig.5:-Base Architecture
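The tiny parameter counts quoted above come from standard layer formulas; normalization, pooling, dropout and flatten layers contribute none. The helpers below are illustrative (the paper does not state architecture A's kernel size, so the 3x3 kernel in the example is an assumption):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """One weight per (kernel cell, input channel, filter) triple,
    plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_features, out_features):
    """Fully-connected layer: a weight per input-output pair plus biases."""
    return in_features * out_features + out_features

# E.g. a hypothetical 3x3 convolution over a single-channel input with
# the paper's 2 filters would hold 3*3*1*2 + 2 = 20 parameters.
tiny_conv = conv2d_params(3, 3, 1, 2)
```

The rest of architecture A's 69 parameters would sit in its output layer, whose size depends on the flattened feature dimensions.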




METHODOLOGY
We test model A on the complete dataset, divided into 6000 training images and 1500 test images, for 10 epochs. For the first experiment we do not use any augmentation technique; for the next experiment we use augmentation. For all the architectures we use the Adam optimizer, with mean squared error (mse) as the loss function and ReLU as the activation function.

For model B, we add a couple of convolutional layers to our base architecture. We also increase the number of filters to 16, and hence the number of parameters also increases; we also use augmentation techniques with this model. Fig.6 shows the architectural design of model B.

Fig.6:-Architectural design of model B
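The paper does not list its exact augmentation techniques, but one commonly used in behavioural cloning mirrors each frame left-to-right and negates its steering angle, doubling the dataset. The sketch below is illustrative, with a hypothetical `flip_augment` helper:

```python
import numpy as np

def flip_augment(images, steering_angles):
    """Mirror each frame along the width axis and negate its steering
    angle; return the original batch concatenated with the flipped one."""
    flipped = images[:, :, ::-1]  # reverse the width axis
    augmented_images = np.concatenate([images, flipped])
    augmented_angles = np.concatenate([steering_angles, -steering_angles])
    return augmented_images, augmented_angles

batch = np.arange(12, dtype=float).reshape(2, 2, 3)  # two tiny 2x3 "frames"
angles = np.array([0.3, -0.1])
aug_x, aug_y = flip_augment(batch, angles)
# The batch doubles from 2 to 4 frames; a left turn becomes a right turn.
```

Flipping also balances the steering distribution, which on a loop track is otherwise biased toward one direction.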

For model B, the input shape is also changed to 66x200, which is the default size of the images generated by the Udacity simulator.

For model C we use multiple convolutional layers and dense layers. The number of features in the third architecture is 64, with an input shape of 66x200. Before the dense layers, there is a flatten layer to convert the 2D array to a 1D array. We also insert normalization, pooling and dropout layers for better generalization. This architectural design can be seen in Fig.7 below.

Fig.7:-Architectural design of model C
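All three models are trained with the Adam optimizer and mse loss. As a minimal numpy sketch of what one Adam update does (illustrative only, using Adam's usual default hyperparameters, not code from the paper):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error, the loss used for all three models."""
    return float(np.mean((y_true - y_pred) ** 2))

def adam_step(param, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and squared gradient (v), bias-corrected, then a scaled step."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), state

# On the first step the bias-corrected ratio is g / |g|, so the
# parameter moves by roughly lr in the direction that reduces the loss.
state = {"t": 0, "m": 0.0, "v": 0.0}
w, state = adam_step(1.0, grad=4.0, state=state, lr=0.001)
```

The per-parameter scaling by the squared-gradient average is why Adam needs little learning-rate tuning across the three architectures.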




RESULTS
The results from all the experiments conducted can be seen in Table 1 below. Four experiments are conducted in increasing order of complexity.

Experiment    Min loss    Max loss    Accuracy
Model A(1)    0.0362      0.0384      64.72%
Model A(2)    0.0346      0.0362      66.62%
Model B       0.0308      0.0424      67.73%
Model C       0.0312      0.6493      68.42%

Table 1:-Experimental results

Model A(1) above shows the experimental result of the base architecture without any augmentation of the dataset. Model A(2) shows the result of the base design with various augmentation techniques applied to the dataset.

CONCLUSIONS
An acceptable change in accuracy can be observed in the base architecture when different augmentation techniques are applied to the dataset. This is because augmentation increases the size of the dataset.

Model B and Model C do not show any significant improvement over one another, or over the base model, when augmentation is applied.

One of the major factors behind the lack of significant difference between the three models may be the small dataset, even after augmentation: a complex neural network is not able to converge well on it.

Some other practical limitations of this paper are that the simulator generates data that is noise free, which is not true of the real world. Different weather and lighting conditions are also not taken into account. Cars can have certain mechanical faults, which too are ignored.

REFERENCES
1. "Global Artificial Intelligence Market" by Facts and Factors Research, 2021.
2. Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017, August). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET) (pp. 1-6). IEEE.
3. Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., ... & Zieba, K. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
4. Nugraha, B. T., & Su, S. F. (2017, October). Towards self-driving car using convolutional neural network and road lane detector. In 2017 2nd International Conference on Automation, Cognitive Science, Optics, Micro Electro-Mechanical System, and Information Technology (ICACOMIT) (pp. 65-69). IEEE.
5. Egio, J. D., Bergasa, L. M., Romera, E., Gómez Huélamo, C., Araluce, J., & Barea, R. (2018, November). Self-driving a Car in Simulation Through a CNN. In Workshop of Physical Agents (pp. 31-43). Springer, Cham.
6. Kanagaraj, N., Hicks, D., Goyal, A., Tiwari, S., & Singh, G. (2021). Deep learning using computer vision in self-driving cars for lane and traffic sign detection. International Journal of System Assurance Engineering and Management, 12(6), 1011-1025.
7. "GitHub - udacity/self-driving-car-sim: A self-driving car simulator built with Unity."
8. Mengxi Wu: Self-driving car in a simulator with a tiny neural network (2017). https://medium.com/@xslittlegrass/self-driving-car-in-a-simulator-with-a-tiny-neural-network-13d33b871234#.8fj065dgy

