DL-basics-of-neural-networks-MNIST-dataset.ipynb
The objective is to classify grayscale images of handwritten digits (0-9) from the MNIST dataset, a collection of 70,000 such images. Each image is a 28x28 pixel matrix, where each pixel stores a grayscale intensity from 0 to 255. The dataset consists of 60,000 training samples and 10,000 test samples and is widely used as a benchmark for image classification.
The goal is to train a neural network to accurately predict the correct digit for each image based on patterns learned during training. This is a supervised learning problem where the input is the image and the output is the digit label.
Here is an example of how the computer interprets each image from the dataset:
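As a minimal sketch of that idea (this snippet is illustrative and assumes only the Keras MNIST loader), one training image can be printed as the raw 28x28 matrix of integer intensities the computer actually works with:
# Illustrative sketch: show one MNIST image as the grid of numbers the computer sees.
from tensorflow.keras.datasets import mnist

(images, labels), _ = mnist.load_data()

print("Label:", labels[0])        # the digit this image represents
print("Shape:", images[0].shape)  # (28, 28)
print(images[0])                  # pixel intensities from 0 (black) to 255 (white)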
2. Setup
# Libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
import numpy as np
TensorFlow is a popular deep learning framework used to build and train neural networks and
will be used here due to its flexibility, scalability, and ease of integration. For beginners,
TensorFlow offers high-level APIs like Keras, which simplify the process of creating and training
models. These features make TensorFlow an excellent choice for learning and experimentation.
Before building the model, let's load the dataset and understand its structure visually. This will help us comprehend the nature of the data we are working with and check whether the dataset is balanced (i.e., roughly equal representation of all digits).
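This step can be sketched as follows; the sketch assumes the standard Keras loader, and a bar chart of label counts is one simple way to check balance (the variable names X_train, y_train, X_test, y_test match the cells that follow):
# Load MNIST and check the class balance of the training labels.
(X_train, y_train), (X_test, y_test) = mnist.load_data()

digits, counts = np.unique(y_train, return_counts=True)
plt.bar(digits, counts, color='steelblue')
plt.xticks(digits)
plt.title('Number of Training Samples per Digit')
plt.xlabel('Digit')
plt.ylabel('Count')
plt.show()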
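Before plotting the intensity histograms, the pixel values are scaled to the [0, 1] range. A minimal sketch of that step, keeping an unscaled copy in temp_X_train so it can be compared with the normalized X_train used below:
# Normalization: keep an unscaled copy, then scale pixel values from [0, 255] to [0, 1].
temp_X_train = X_train.copy()
X_train = X_train / 255.0
X_test = X_test / 255.0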
# Before normalization
plt.subplot(1, 2, 1)
plt.hist(temp_X_train.flatten(), bins=50, color='green', alpha=0.7)
plt.title('Pixel Intensity Before Normalization')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
# After normalization
plt.subplot(1, 2, 2)
plt.hist(X_train.flatten(), bins=50, color='blue', alpha=0.7)
plt.title('Pixel Intensity After Normalization')
plt.xlabel('Pixel Intensity (Normalized)')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()
Normalization ensures that the input data has a uniform range, which helps the model converge
faster during training. Neural networks perform better when the data is scaled, as it prevents
larger input values from dominating the learning process.
4. Modelling
The model is built using TensorFlow's Sequential API, which allows us to stack layers
sequentially. Each layer processes data in a specific way:
1. Flatten Layer: Converts the 2D input images (28x28 pixels) into a 1D array of 784 features.
This is necessary to feed the data into the Dense layers.
2. Dense Layer (Hidden): A fully connected layer with ReLU activation. ReLU introduces non-linearity, allowing the model to learn complex patterns.
3. Dense Layer (Output): The final layer has 10 neurons, each representing a digit (0-9). It
uses the softmax activation function to produce probabilities for each class, enabling
classification.
model = Sequential([
    Flatten(input_shape=(28, 28)),   # 2D image (28x28) -> 1D array of 784 features
    Dense(128, activation='relu'),   # hidden layer; 128 neurons is an assumed size
    Dense(10, activation='softmax')  # Output layer with 10 neurons (one for each digit)
])
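The compile-and-train step can be summarized as a hedged sketch; 'adam' and the epoch count echo the suggestions in the experimentation section, while the loss, batch size, and validation split here are illustrative assumptions rather than the notebook's exact settings:
# Compile and train the model (settings are illustrative assumptions).
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # integer labels 0-9
              metrics=['accuracy'])

history = model.fit(X_train, y_train,
                    epochs=5,
                    batch_size=32,
                    validation_split=0.1)

# Evaluate on the held-out test set.
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")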
# Make predictions
predictions = model.predict(X_test)
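Each row of predictions contains ten softmax probabilities, one per digit; taking the argmax gives the predicted class, which can then be compared against the true labels:
# The predicted digit is the class with the highest probability.
predicted_digits = np.argmax(predictions, axis=1)

print("Predicted:", predicted_digits[:10])
print("Actual:   ", y_test[:10])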
5. Experimentation
Run experiments with the model architecture and training process. Here are some suggestions:
Add more hidden layers: Try increasing the depth of the network by stacking additional Dense layers.
Change the number of neurons: Modify the number of neurons in each Dense layer to see how it affects performance.
Use different activation functions: Experiment with alternatives like 'sigmoid', 'tanh', or 'LeakyReLU'.
Try other optimization algorithms: Replace 'adam' with optimizers like 'sgd', 'rmsprop', or 'nadam'.
Alter the number of epochs: Train the model for more or fewer epochs and observe overfitting or underfitting.
Use dropout: Introduce dropout layers to prevent overfitting and enhance generalization.
Modify the learning rate: Adjust the optimizer's learning rate to see how it influences convergence.
Document your changes and note whether and how each modification impacts the training, validation, and test accuracy.
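As one illustration, here is a hedged variant combining a few of these suggestions (an extra hidden layer, dropout, and an explicit learning rate); the specific values are assumptions to experiment with, not recommendations:
from tensorflow.keras.layers import Dropout
from tensorflow.keras.optimizers import Adam

# Illustrative variant: deeper network with dropout and a custom learning rate.
experimental_model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(256, activation='relu'),
    Dropout(0.3),                    # randomly drops 30% of activations during training
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

experimental_model.compile(optimizer=Adam(learning_rate=0.0005),
                           loss='sparse_categorical_crossentropy',
                           metrics=['accuracy'])

experimental_model.fit(X_train, y_train, epochs=10,
                       batch_size=32, validation_split=0.1)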
6. Conclusion