Control System Term Paper
Mini Project
Submitted by :
1. PROBLEM STATEMENT
Consider the popular benchmark MNIST dataset [Link], which contains a large number of handwritten
digits and is widely used in industry to train various computer vision systems.
Design a Multi-Layer Perceptron Neural Network (MLPNN) or Artificial Neural Network
(ANN) with the following model specifications:
PROJECT LINK:
Click here to open the project notebook.
2. ABSTRACT
This paper begins by introducing the basic concept of the Artificial Neural Network (ANN) and
briefly discussing the MNIST dataset used in this project. After covering these basics, we walk
through the project itself. We first discuss the model in detail, i.e. building and compiling it
and the various concepts associated with these steps. We then discuss the predictions made by the
model, and finally we evaluate its performance.
3. INTRODUCTION
Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.
Hidden Layer:
The hidden layer lies between the input and output layers. It performs all the calculations to
find hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layers, and the final result is
conveyed through this output layer.
The artificial neural network takes the inputs, computes their weighted sum, and adds a bias.
This computation is represented in the form of a transfer function.
The weighted total is then passed as an input to an activation function to produce the output.
Activation functions decide whether a node should fire or not; only the nodes that fire
contribute to the output layer. Different activation functions are available and are chosen
according to the type of task being performed.
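Written out with symbols introduced here for clarity, a node with inputs x_1, ..., x_n, weights w_1, ..., w_n and bias b computes

z = \sum_{i=1}^{n} w_i x_i + b,    y = f(z),

where f is the chosen activation function.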
4. MODEL
We created a model that recognizes handwritten digits, achieved with the help of the MNIST
dataset. The model is a multi-layered Artificial Neural Network (ANN) that takes in a 28x28
image of a handwritten digit and gives the recognized digit as the output. The model is
discussed in detail below.
4.3 Building the Model
We have built the artificial neural network according to the problem statement specified.
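As a rough sketch, the model can be built in Keras as follows. The 28x28 input and the 10-way Softmax output follow the description in this paper; the hidden-layer size and dropout rate are illustrative assumptions, with the exact values given in the project notebook.

from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST: 60,000 training and 10,000 test images of size 28x28.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale pixel values to [0, 1] and one-hot encode the digit labels.
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Multi-layer perceptron: flatten the image, one hidden layer with
# dropout for regularization (see [5]), and a 10-way Softmax output.
# Layer sizes and dropout rate are assumptions, not the notebook's exact values.
model = keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(10, activation="softmax"),
])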
b) Softmax Activation Function
It is used in our output layer because this is a multiclass classification problem.
For an arbitrary real vector of length K, Softmax compresses it into a real vector of
length K whose elements lie in the range (0, 1) and sum to 1.
It has many applications in multiclass classification and neural networks. Softmax differs
from the ordinary max function: the max function only outputs the largest value, whereas
Softmax assigns smaller values a smaller probability rather than discarding them outright.
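For a vector z = (z_1, ..., z_K), the standard definition is

softmax(z)_i = e^{z_i} / \sum_{j=1}^{K} e^{z_j},    i = 1, ..., K,

so every output lies in (0, 1) and the outputs sum to 1.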
4.4 Compiling the Model
Backpropagation
We iteratively update the weights to reduce the loss; this is done by backpropagation.
The loss function is minimized with the help of optimizer functions.
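As a sketch of this step in Keras, continuing the model built in Section 4.3 (the epoch count and batch size are assumptions; the exact settings are in the project notebook):

# Compile with the Adam optimizer and categorical cross-entropy,
# both of which are discussed below; accuracy is tracked as the metric.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# fit() runs the iterative weight updates (backpropagation).
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.1)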
Adam Optimizer
Adam Optimizer inherits the strengths or the positive attributes of the Momentum and Root
Mean Square Propagation (RMSP) methods and builds upon them to give a more optimized
gradient descent. [3]
Taking the formulas used in the above two methods, we get
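the standard first- and second-moment estimates, with g_t denoting the gradient of the loss at step t:

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2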
Parameters used:
ε: a small positive constant to avoid a division-by-zero error when v_t → 0 (10^-8)
β1, β2: decay rates of the averages of the gradients in the above two methods (β1 = 0.9, β2 = 0.999)
α: step-size parameter / learning rate (0.001)
Since m_t and v_t are both initialized to 0 (as in the above methods), they tend to be
'biased towards 0', because both β1 and β2 are close to 1. Adam fixes this problem by
computing 'bias-corrected' m_t and v_t. This also helps control the weight updates as they
approach the global minimum, preventing large oscillations near it. The formulas used are:
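In their standard form, the bias-corrected estimates are

\hat{m}_t = m_t / (1 - \beta_1^t),    \hat{v}_t = v_t / (1 - \beta_2^t).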
Intuitively, we are adapting the gradient descent after every iteration so that it remains
controlled and unbiased throughout the process, hence the name Adam (Adaptive Moment
Estimation). Now, instead of the raw moment estimates m_t and v_t, we take the bias-corrected
estimates \hat{m}_t and \hat{v}_t. Putting them into our general update equation, we get
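the standard Adam weight update

w_{t+1} = w_t - \alpha \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon),

with the parameter values listed above.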
In the specific (and usual) case of multi-class classification, the labels are one-hot, so only
the positive class C_p keeps its term in the loss. Only one element of the target vector t is
non-zero, namely t_p. Discarding the elements of the summation that are zero because of the
target labels, we can write:
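the standard categorical cross-entropy for a single example (writing s_i for the Softmax output of class i):

CE = -\sum_{i} t_i \log(s_i) = -\log(s_p),

i.e. only the log-probability assigned to the true class contributes to the loss.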
5. MODEL EVALUATION
We tested our model on the test data and achieved 98.07% accuracy.
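As a sketch, continuing the code from the earlier sections (variable names are the ones introduced there), the evaluation step looks like:

# Evaluate the trained model on the 10,000 held-out MNIST test images.
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")  # the project run reported 98.07%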
5.1 Making Prediction
We predict the output of the test data using the model. Since the Softmax activation function is
used in the output layer, we get an array of probabilities over all the classes for each instance.
We then build an array containing, for each instance, the class with the highest probability.
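A sketch of this prediction step, continuing the earlier code:

import numpy as np

# predict() returns, for each test image, an array of 10 Softmax
# probabilities; argmax picks the class with the highest probability.
probabilities = model.predict(x_test)                 # shape: (10000, 10)
predicted_digits = np.argmax(probabilities, axis=1)
print(predicted_digits[:10])  # predicted digits for the first ten test images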
6. APPLICATIONS
Artificial neural networks have been applied across many areas of operations. Email service providers
use ANNs to detect and delete spam from a user’s inbox; asset managers use it to forecast the
direction of a company’s stock; credit rating firms use it to improve their credit scoring methods;
e-commerce platforms use it to personalize recommendations to their audience; chatbots are
developed with ANNs for natural language processing; deep learning algorithms use ANN to
predict the likelihood of an event; and the list of ANN incorporation goes on across multiple
sectors, industries, and countries. [4]
7. CONCLUSION
Artificial neural networks are paving the way for life-changing applications to be developed for
use in all sectors of the economy. Artificial intelligence platforms that are built on ANNs are
disrupting the traditional ways of doing things. From translating web pages into other languages
to having a virtual assistant order groceries online to conversing with chatbots to solve problems,
AI platforms are simplifying transactions and making services accessible to all at negligible
costs.
8. REFERENCES
[1] https://www.investopedia.com/terms/a/artificial-neural-networks-ann.asp
[2] https://en.wikipedia.org/wiki/MNIST_database
[3] https://www.geeksforgeeks.org/intuition-of-adam-optimizer/
[4] https://www.investopedia.com/terms/a/artificial-neural-networks-ann.asp
[5] https://machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/
9. ACKNOWLEDGEMENT
We are grateful to be a part of this project, which is included in the curriculum of MAKAUT and
of our college. We are thankful to our parents, teachers, and friends who have directly or
indirectly helped us complete this project. Preparing the project in collaboration with our
project guide, Mr. Prasenjit Kumar Mudi, has been a refreshing experience. We convey our heartfelt
regards and appreciation for his sincere cooperation in this project.