Report: Implementing a Simple Neural Network for MNIST Dataset

1. Introduction

1.1 Objective
The primary goal of this assignment was to implement a simple neural network with up to five
layers to classify handwritten digits from the MNIST dataset. This involved designing the
architecture, implementing both forward and backward propagation, and optimizing the weights. The implementation was written entirely in Python with the NumPy library, without relying on pre-built frameworks such as PyTorch or TensorFlow.
1.2 Dataset
The MNIST dataset consists of 60,000 training images and 10,000 testing images of handwritten
digits, each labeled as a number between 0 and 9. The images are grayscale with a resolution of
28 \times 28 pixels.
2. Methodology
2.1 Data Preparation
The dataset was processed to make it suitable for training:
• Images were normalized to values between 0 and 1 for consistency.
• Labels were extracted as integers ranging from 0 to 9.
• Mini-batches were created using the create_mini_batches function, which shuffled
the data to ensure unbiased training.
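As an illustration of this batching step, the following is a minimal NumPy sketch of a helper with the signature used above; the batch size of 64 and the normalization comment are assumptions, not values stated in this report.

import numpy as np

def create_mini_batches(X, y, batch_size=64):
    # Shuffle the samples so every epoch sees the data in a different order.
    indices = np.random.permutation(X.shape[0])
    X, y = X[indices], y[indices]
    # Slice the shuffled arrays into consecutive mini-batches.
    return [(X[i:i + batch_size], y[i:i + batch_size])
            for i in range(0, X.shape[0], batch_size)]

# Normalization step (assumed): scale 8-bit pixel values into [0, 1].
# X_train = X_train.astype(np.float32) / 255.0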
2.2 Model Architecture
The neural network architecture was implemented as follows:
1. Convolutional Layer:
• 3 \times 3 filters, with 16 feature maps.
• Extracts spatial features from the input images.
2. Flatten Layer:
• Converts the feature maps into a 1D vector.
3. Fully Connected Layers:
• FC1: 128 neurons, using the ReLU activation function.
• FC2: 10 neurons, using the Softmax activation function to output class
probabilities.
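As a rough sketch of the parameter shapes implied by this architecture, the snippet below assumes a stride-1, unpadded convolution and a simple random initialization, neither of which the report states explicitly; all variable names are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Convolutional layer: 16 filters of size 3x3.
conv_filters = rng.standard_normal((16, 3, 3)) * 0.1

# A 28x28 image convolved with 3x3 filters (stride 1, no padding) yields
# 16 feature maps of size 26x26, i.e. 16 * 26 * 26 = 10816 values after flattening.
flat_dim = 16 * 26 * 26

# Fully connected layers: FC1 (flat_dim -> 128, ReLU) and FC2 (128 -> 10, Softmax).
W1 = rng.standard_normal((flat_dim, 128)) * np.sqrt(2.0 / flat_dim)
b1 = np.zeros(128)
W2 = rng.standard_normal((128, 10)) * np.sqrt(2.0 / 128)
b2 = np.zeros(10)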
Activation Functions:

• ReLU: f(x) = \max(0, x)

• Softmax: Converts raw scores into probabilities:

Softmax(x_i) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}
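A minimal NumPy sketch of these two activations; subtracting the row-wise maximum inside the softmax is a standard numerical-stability trick added here, not something the report describes.

import numpy as np

def relu(x):
    # Element-wise max(0, x).
    return np.maximum(0, x)

def softmax(x):
    # Shift by the row-wise maximum for numerical stability, then normalize.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)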

2.3 Forward Propagation


The input data is passed through each layer in the following order:
1. Convolutional Layer produces feature maps.
2. Flatten Layer converts the feature maps into a 1D array.
3. Fully Connected Layers process the data, with Softmax applied at the output to
generate class probabilities.
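A sketch of this forward pass for a single image, assuming layer objects that expose a forward method; conv, flatten, fc1, and fc2 are illustrative names, not the report's code.

import numpy as np

def forward(x, conv, flatten, fc1, fc2):
    # 1. Convolutional layer: input image -> 16 feature maps.
    out = conv.forward(x)
    # 2. Flatten layer: feature maps -> 1D vector.
    out = flatten.forward(out)
    # 3. FC1 with ReLU, then FC2 followed by Softmax for class probabilities.
    out = np.maximum(0, fc1.forward(out))
    scores = fc2.forward(out)
    exp = np.exp(scores - np.max(scores))
    return exp / np.sum(exp)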

2.4 Backward Propagation


Backward propagation involves calculating the gradients of the loss function with respect to the
weights in each layer:
1. Gradients for the output layer are calculated using the derivative of the Softmax and
Cross Entropy Loss functions.
2. Gradients are propagated backward through the fully connected and convolutional
layers.
3. The computed gradients are used to update the weights in each layer.
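A minimal sketch of step 1 and the output-layer weight gradients, assuming one-hot labels; it uses the standard simplification that the combined Softmax / Cross Entropy derivative is probs - y, and the variable names are illustrative rather than taken from the report's code.

def backward_output_layer(probs, y_onehot, a_prev, W_out):
    # Combined derivative of Softmax and Cross Entropy Loss: dL/dz = probs - y.
    batch = probs.shape[0]
    dz = (probs - y_onehot) / batch
    # Gradients for the output layer's weights and bias.
    dW = a_prev.T @ dz
    db = dz.sum(axis=0)
    # Gradient propagated back toward the earlier fully connected / conv layers.
    da_prev = dz @ W_out.T
    return dW, db, da_prev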
Loss Function:
Cross Entropy Loss was used, defined as:
L = -\frac{1}{N} \sum_{i=1}^{N} \log p_{i, y_i},

where p_{i, y_i} is the predicted probability of the correct class for sample i.
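A short NumPy sketch of this loss, assuming probs has shape (N, 10) and labels holds the integer class indices; the small epsilon guarding against log(0) is an assumption, not something stated in the report.

import numpy as np

def cross_entropy_loss(probs, labels, eps=1e-12):
    # Predicted probability of the correct class for each sample.
    n = probs.shape[0]
    correct_probs = probs[np.arange(n), labels]
    # Average negative log-likelihood over the batch.
    return -np.mean(np.log(correct_probs + eps))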
2.5 Optimization
Weights were updated using the Gradient Descent algorithm:
W = W - \alpha \cdot \nabla L,

where W represents the weights, \alpha is the learning rate, and \nabla L is the gradient of the
loss. The learning rate was set to 0.01 for this implementation.
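Applied to each parameter array in turn, the update rule reads as follows (a sketch using the learning rate of 0.01 given above):

learning_rate = 0.01  # alpha

def sgd_update(W, grad_W):
    # W = W - alpha * gradient of the loss with respect to W
    return W - learning_rate * grad_W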
3. Experimental Results
3.1 Experimental Setup
• Programming Environment: Python 3 and NumPy.
• Hardware: MacBook Air.
• Dataset: MNIST (60,000 training images, 10,000 testing images).
3.2 Results
The following tables present the loss value for each epoch and the accuracy on the training and test sets:

Epoch   Loss
1       1.134826686807056
2       0.4221529610636374
3       0.357064769553238
4       0.3322257751385145
5       0.3188174469358134
6       0.3096424919533025
7       0.30356329993117026
8       0.2986148860533998
9       0.29462226426888455
10      0.291294958006831

Training Accuracy (%)    92.00
Test Accuracy (%)        92.05

4. Conclusion
This project successfully implemented a neural network from scratch using Python and NumPy
to classify MNIST digits. The model achieved a training accuracy of 92.00% and a testing
accuracy of 92.05%. Future improvements could include the use of dropout and batch
normalization to enhance performance further.
5. References

1. NumPy Documentation: https://numpy.org/
2. MNIST Dataset: https://www.kaggle.com/datasets/hojjatk/mnist-dataset
