Control System Term Paper

Consider the popular benchmark MNIST dataset [Link], which contains a large number of handwritten digits. It is widely used in industry for training computer vision systems. Design a Multi-Layer Perceptron Neural Network (MLPNN) or Artificial Neural Network (ANN).

Uploaded by

scintilla1822

B.P. PODDAR INSTITUTE OF MANAGEMENT AND TECHNOLOGY

Mini Project

CONTROL SYSTEM AND INSTRUMENTATION (EC601)

ECE-A 3rd Year

Submitted by :

Name                 University Roll No.
Ankit Baranwal       11500318108
Ankan Kumar Bose     11500318109
Ankan Ghosh          11500318110
Anjali Jain          11500318111
Anjali Gupta         11500318112
Ananya Sil           11500318113
Aliva Biswas         11500318114

CONTENT :
1. Problem Statement
2. Abstract
3. Introduction
3.1 Artificial Neural Networks (ANN)
3.2 MNIST Dataset
4. Model
4.1 Visualizing the dataset
4.2 Data Preprocessing
4.3 Building the Model
4.3.1 Activation Functions
4.3.2 Dropout for regularization
4.4 Compiling the Model
4.5 Training the Model
5. Model Evaluation
5.1 Making Prediction
5.2 Evaluating Performance
6. Applications
7. Conclusion
8. References
9. Acknowledgement

1. PROBLEM STATEMENT
Consider the popular benchmark MNIST dataset [Link], which contains a large number of handwritten
digits. It is widely used in industry for training computer vision systems.
Design a Multi-Layer Perceptron Neural Network (MLPNN) or Artificial Neural Network
(ANN) with the following model specifications:

PROJECT LINK:
Click here to open the project notebook.

2. ABSTRACT
This paper begins by introducing the basic concept of the Artificial Neural Network (ANN),
followed by a brief discussion of the MNIST dataset used in this project. After these basics,
we walk through the project. We first discuss in detail the model we used, i.e. how it was
built and compiled, along with the concepts associated with it. We then discuss the
predictions made by the model and finally evaluate its performance.

3. INTRODUCTION

3.1 Artificial Neural Networks (ANN)


An artificial neural network (ANN) is the piece of a computing system designed to simulate the
way the human brain analyzes and processes information. It is the foundation of artificial
intelligence (AI) and solves problems that would prove impossible or difficult by human or
statistical standards.[1]
To understand the architecture of an artificial neural network, we first have to understand
what a neural network consists of. A neural network is made up of a large number of
artificial neurons, termed units, arranged in a sequence of layers. Let us look at the
types of layers found in an artificial neural network.
An artificial neural network primarily consists of three types of layers:

Input Layer:
As the name suggests, it accepts the inputs, which the programmer may provide in several
different formats.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs the calculations that
find hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layer, and the final result
is conveyed through this layer.
Each neuron takes its inputs, computes their weighted sum and adds a bias. This computation
is represented in the form of a transfer function:

$$z = \sum_{i=1}^{n} w_i x_i + b$$

where x_i are the inputs, w_i the corresponding weights and b the bias. The weighted total z
is then passed as the input to an activation function to produce the output. Activation
functions determine whether a node should fire or not. Only the nodes that fire pass their
signal on towards the output layer. Different activation functions are available, and the
choice depends on the sort of task we are performing.
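
The computation of a single neuron can be sketched in a few lines. This is only an illustration in plain Python, not code from the project notebook; the function names, the sigmoid activation and the numeric values are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical values, chosen only for illustration.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

output = neuron_output(inputs, weights, bias, sigmoid)
```

Here the weighted sum is -0.4, so the sigmoid output lies a little below 0.5.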

3.2 MNIST Dataset


The MNIST database (Modified National Institute of Standards and Technology database) is a
large database of handwritten digits that is commonly used for training various image processing
systems. The database is also widely used for training and testing in the field of machine
learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators
felt that since NIST's training dataset was taken from American Census Bureau employees, while
the testing dataset was taken from American high school students, it was not well-suited for
machine learning experiments. Furthermore, the black and white images from NIST were
normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale
levels. [2]

4. MODEL
We created a model that recognizes handwritten digits, using the MNIST dataset. The model is
a multi-layered Artificial Neural Network (ANN) that takes in a 28x28 pixel image of a
handwritten digit and gives the recognized digit as the output. The model is discussed in
detail below.

4.1 Visualizing the dataset


We have selected a random data instance and visualized the digit.

4.2 Data Preprocessing


The last step before creating our model is to preprocess our data. This simply means applying
some prior transformations to our data before feeding it to the model. In this case we simply
scale all our grayscale pixel values (0-255) to lie between 0 and 1. We do this by dividing
each value in the training and testing sets by 255.0. Inputs in a small, consistent range
are easier for the model to process during training.
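
The scaling step can be sketched as follows. This is a minimal plain-Python illustration, not the notebook's code; the helper name `scale_pixels` and the tiny 2x3 patch (standing in for a full 28x28 image) are assumptions.

```python
def scale_pixels(image):
    """Scale a 2-D list of 0-255 grayscale values to floats in [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

# A hypothetical 2x3 patch standing in for a 28x28 MNIST image.
patch = [
    [0, 128, 255],
    [64, 32, 16],
]

scaled = scale_pixels(patch)
```

The same division is applied to both the training and the testing sets so that the model sees inputs on the same scale in both.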

4.3 Building the Model
We have built the artificial neural network according to the problem statement specified.

4.3.1 Activation Functions


An activation function is simply a function that is applied to the weighted sum of a neuron.
It can be anything we want, but it is typically a non-linear function. Without it, a stack of
layers would collapse into a single linear transformation of the input. We add the
non-linearity to introduce more complexity to our model, which lets it capture more complex
patterns and typically make better predictions.

Activation functions used in our model :


a) Rectified Linear Unit (ReLU)
It is used in the hidden layers.
The Rectified Linear Unit (ReLU) mitigates the vanishing gradient problem associated with
the Sigmoid activation function, because its gradient is 1 for every positive input.
ReLU(x) = max(0, x)
The ReLU function returns the input itself when it is positive and 0 otherwise.
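
As a small illustration (plain Python, with hypothetical input values), ReLU and its gradient can be written as below. The derivative at exactly 0 is taken to be 0 here, which is a common convention rather than something stated in this report.

```python
def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return max(0.0, x)

def relu_derivative(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise (0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0

samples = [-2.0, -0.5, 0.0, 1.5, 3.0]
activations = [relu(x) for x in samples]
gradients = [relu_derivative(x) for x in samples]
```

Because the gradient is exactly 1 for positive inputs, it does not shrink as it is propagated back through many layers, unlike the Sigmoid gradient, which is at most 0.25.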

b) Softmax Activation Function
It is used in our output layer because this is a multiclass classification problem.
Softmax maps an arbitrary real vector of length K to a real vector of length K whose
elements each lie in the range (0, 1) and sum to 1, so the result can be read as a
probability distribution over the K classes. Softmax is different from the normal max
function: the max function only outputs the largest value, whereas Softmax assigns smaller
values a smaller probability instead of discarding them outright.
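
A minimal sketch of Softmax in plain Python is given below. Subtracting the largest score before exponentiating is a standard numerical-stability trick and is an addition of this example, not something described in the report; the scores are hypothetical.

```python
import math

def softmax(scores):
    """Map a list of real scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, but the other classes keep a non-zero share.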

4.3.2 Dropout for regularization


Deep learning neural networks are likely to quickly overfit a training dataset with few examples.
A single model can be used to simulate having a large number of different network architectures
by randomly dropping out nodes during training. This is called dropout and offers a very
computationally cheap and remarkably effective regularization method to reduce overfitting and
improve generalization error in deep neural networks of all kinds. [5]
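
The idea of randomly dropping nodes can be sketched as follows. This example uses the "inverted" variant, in which the kept activations are rescaled by 1 / (1 - rate) during training so that nothing needs to change at test time. That variant, the function name and the rate of 0.5 are assumptions of this sketch; the report does not state the rate used in the project.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training
    and rescale the survivors; return the input unchanged at test time."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)      # seeded so the example is repeatable
hidden = [1.0] * 10         # hypothetical hidden-layer activations

dropped = dropout(hidden, rate=0.5, rng=rng)
at_test_time = dropout(hidden, rate=0.5, rng=rng, training=False)
```

Each training step draws a fresh random mask, so the network effectively trains a different thinned architecture every time.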

4.4 Compiling the Model

Backpropagation
We iteratively update the weights to reduce the loss. Backpropagation computes the gradient
of the loss with respect to each weight, and an optimizer function uses these gradients to
update the weights so that the loss decreases.
Adam Optimizer
Adam Optimizer inherits the strengths or the positive attributes of the Momentum and Root
Mean Square Propagation (RMSP) methods and builds upon them to give a more optimized
gradient descent. [3]
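Taking the formulas used in the above two methods, we get

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$$

where $g_t = \partial L / \partial w_t$ is the gradient of the loss L with respect to the
weights $w_t$ at iteration t, $m_t$ is the running average of the gradients (Momentum) and
$v_t$ is the running average of the squared gradients (RMSP).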

Parameters used:
ϵ = a small positive constant that avoids a 'division by 0' error when vt -> 0 (ϵ = 10^-8)
β1 & β2 = decay rates of the averages of gradients in the above two methods (β1 = 0.9 & β2 = 0.999)
α = step size parameter / learning rate (α = 0.001)

Since $m_t$ and $v_t$ are both initialized as 0 (based on the above methods), they tend to
be 'biased towards 0', because both β1 and β2 ≈ 1. This optimizer fixes the problem by
computing 'bias-corrected' versions of $m_t$ and $v_t$. This also helps to control the
weight updates near the global minimum and prevents large oscillations around it. The
formulas used are:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

Intuitively, we are adapting the gradient descent step after every iteration so that it
remains controlled and unbiased throughout the process, hence the name Adam (Adaptive Moment
Estimation).
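Now, instead of the raw moment estimates $m_t$ and $v_t$, we take the bias-corrected
estimates $\hat{m}_t$ and $\hat{v}_t$. Putting them into our general equation, we get

$$w_{t+1} = w_t - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$

where $w_t$ denotes the weights at iteration t.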
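
The Adam update described in this section (moment estimates, bias correction, weight step) can be sketched for a single scalar weight as below. This is an illustrative plain-Python version, not the optimizer implementation used in the project; the function name `adam_step`, the toy loss f(w) = (w - 3)^2 and the larger learning rate of 0.01 used in the loop are assumptions made for the example.

```python
import math

def adam_step(w, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar weight and return (w, m, v)."""
    m = beta1 * m + (1 - beta1) * grad           # running average of gradients
    v = beta2 * v + (1 - beta2) * grad * grad    # running average of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                 # bias-corrected second moment
    w = w - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Minimise the hypothetical loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, alpha=0.01)
```

Thanks to the bias correction, the very first step has a magnitude of roughly α regardless of how large the gradient is, which is what keeps the early updates controlled.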

Categorical Cross Entropy Loss


Also called Softmax Loss, it is a Softmax activation followed by a Cross-Entropy loss. If we
use this loss, we train the network to output a probability over the C classes for each
image. It is used for multi-class classification.

In the specific (and usual) case of multi-class classification the labels are one-hot, so only
the positive class Cp keeps its term in the loss. There is only one element of the target
vector t which is not zero, ti = tp. So, discarding the elements of the summation which are
zero due to the target labels, we can write:

$$CE = -\log\left(\frac{e^{s_p}}{\sum_{j=1}^{C} e^{s_j}}\right)$$

where $s_p$ is the network's score for the positive class and $s_j$ are the scores for each
of the C classes.
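
For a one-hot label, the loss therefore reduces to the negative log of the Softmax probability of the positive class. The sketch below shows this in plain Python; the function names and the scores are hypothetical, and the code is an illustration rather than the loss implementation used in the project.

```python
import math

def softmax(scores):
    """Map a list of real scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(scores, target_index):
    """Cross-entropy for a one-hot target: -log of the positive-class probability."""
    return -math.log(softmax(scores)[target_index])

scores = [5.0, 1.0, 0.5]  # hypothetical output scores for 3 classes
loss_when_correct = categorical_cross_entropy(scores, target_index=0)
loss_when_wrong = categorical_cross_entropy(scores, target_index=1)
```

The loss is small when the positive class has the highest score and grows as the network assigns it less probability.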

4.5 Training the Model


Finally we trained the model by feeding it with the training data.

5. MODEL EVALUATION
We tested our model on the test data and obtained an accuracy of 98.07%.

5.1 Making Prediction
We predict the output for the test data using the model. As the Softmax activation function
is used in the output layer, we get an array of probabilities over all the classes for each
instance. From this we build an array that contains, for each instance, the class with the
highest probability.
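
Picking the most probable class per instance is an argmax over each row of probabilities. A minimal plain-Python sketch, with a hypothetical helper name and made-up probabilities for three classes:

```python
def predict_classes(probabilities):
    """Return, for each row of class probabilities, the index of the largest one."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]

# Hypothetical Softmax outputs for three instances and three classes.
probabilities = [
    [0.05, 0.90, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
]

predicted = predict_classes(probabilities)
```

For MNIST each row would have ten entries, one per digit, and the returned index is the predicted digit.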

5.2 Evaluating Performance


We created a confusion matrix of our predictions and then evaluated our model with the help of
performance metrics such as precision, recall and F1 score.
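
How these metrics follow from the confusion matrix can be sketched as below. The code is a plain-Python illustration with hypothetical labels for a 3-class problem, not the evaluation code from the notebook; rows of the matrix are taken to be true classes and columns predicted classes, which is an assumption of this example.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count instances by (true class, predicted class); rows are true classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_metrics(matrix):
    """Return a (precision, recall, f1) tuple for every class."""
    n = len(matrix)
    metrics = []
    for c in range(n):
        true_pos = matrix[c][c]
        predicted_c = sum(matrix[r][c] for r in range(n))  # column total
        actual_c = sum(matrix[c])                          # row total
        precision = true_pos / predicted_c if predicted_c else 0.0
        recall = true_pos / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics

# Hypothetical labels and predictions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
scores = per_class_metrics(cm)
```

Precision asks how many of the instances predicted as a class really belong to it, recall asks how many instances of a class were found, and the F1 score is their harmonic mean.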

6. APPLICATIONS
Artificial neural networks have been applied across many areas of operations. Email service
providers use ANNs to detect and delete spam from a user's inbox; asset managers use them to
forecast the direction of a company's stock; credit rating firms use them to improve their
credit scoring methods; e-commerce platforms use them to personalize recommendations to their
audience; chatbots are developed with ANNs for natural language processing; deep learning
algorithms use ANNs to predict the likelihood of an event; and the list of ANN applications
goes on across multiple sectors, industries, and countries. [4]

7. CONCLUSION
Artificial neural networks are paving the way for life-changing applications to be developed for
use in all sectors of the economy. Artificial intelligence platforms that are built on ANNs are
disrupting the traditional ways of doing things. From translating web pages into other languages
to having a virtual assistant order groceries online to conversing with chatbots to solve problems,
AI platforms are simplifying transactions and making services accessible to all at negligible
costs.

8. REFERENCES
[1] https://www.investopedia.com/terms/a/artificial-neural-networks-ann.asp

[2] https://en.wikipedia.org/wiki/MNIST_database

[3] https://www.geeksforgeeks.org/intuition-of-adam-optimizer/

[4] https://www.investopedia.com/terms/a/artificial-neural-networks-ann.asp

[5] https://machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/

9. ACKNOWLEDGEMENT
We are grateful to have been a part of this project, which is included in the curriculum of
MAKAUT as well as of our college. We are thankful to our parents, teachers and friends, who
have directly or indirectly helped us to complete this project. Preparing the project in
collaboration with our project guide, Mr. Prasenjit Kumar Mudi, was a refreshing experience.
We convey our heartfelt regards and appreciation for his sincere cooperation in this project.
