MN5
Shaik Muneer
Roll No: 22KT1A4257
3rd Year (AI&ML)
PSCMR College Of Engineering And Technology
Abstract:
Handwritten digit classification is a fundamental problem in the field of computer vision and
machine learning. The MNIST dataset, a widely used benchmark, consists of 28x28 grayscale
images of handwritten digits (0–9). This work presents the design and implementation of
a Convolutional Neural Network (CNN) for classifying these digits. CNNs are particularly
well-suited for image-related tasks due to their ability to automatically learn spatial hierarchies
of features, such as edges, textures, and patterns, from raw pixel data.
The proposed architecture comprises three main components:
1. Convolutional Layers: To extract spatial features from the input images using learnable
filters.
2. Pooling Layers: To reduce the spatial dimensions of the feature maps while retaining the
most salient information.
3. Fully Connected Layers: To combine the extracted features and perform classification.
The model is trained using the Adam optimizer and categorical cross-entropy loss, which are
standard choices for multi-class classification tasks. The dataset is preprocessed by normalizing
pixel values to the range [0, 1] and splitting it into training and testing sets. The model achieves
high accuracy on the MNIST test set, demonstrating the effectiveness of CNNs for handwritten
digit classification.
This work highlights the power of deep learning and CNNs in solving image classification
problems and provides a foundation for more complex computer vision tasks.
1. Introduction:
One of the most commonly used benchmarks in the area of machine learning and computer
vision is the MNIST dataset. This is a collection of 28x28 pixel grayscale images of handwritten
digits from 0 to 9. The foundational dataset has been widely used in developing and testing
algorithms that have been specifically designed for image classification tasks. It involves
accurately classifying the images into their respective digit categories, thus making it imperative
for the model to learn the intricate patterns and features inherent in the handwritten digits.
Convolutional Neural Networks have been one of the most effective tools in the application of
image classification because of their ability to automatically and adaptively learn spatial
hierarchies of features from input images. In contrast to traditional fully connected neural
networks, CNNs make use of convolutional layers that can capture local patterns like edges,
textures, and shapes that help discriminate between different digits. By stacking multiple
convolutional layers, pooling layers, and fully connected layers, CNNs can capture complex
relationships in the data, which makes them well suited for tasks such as MNIST digit
classification.
In this project, we develop a Convolutional Neural Network from scratch using a deep learning
framework such as TensorFlow or PyTorch. The aim is to train the model on the MNIST dataset
and then evaluate it on a held-out test set, targeting high classification accuracy. This provides
hands-on experience in designing and implementing CNNs while deepening our understanding of
how these networks learn to interpret visual data. By the end of the project, we aim to have a
robust model that classifies handwritten digits accurately, demonstrating the effectiveness of
CNNs for image recognition tasks.
1. Introduction
● Background: Brief overview of the MNIST dataset, which contains 70,000 images of
handwritten digits (0-9) and is widely used for training image processing systems.
● Objective: To develop a CNN model that accurately classifies handwritten digits using
the MNIST dataset.
2. Dataset Description
● Source: MNIST dataset, consisting of 60,000 training images and 10,000 testing images.
3. Data Preprocessing
● Normalization: Scale pixel values from [0, 255] to [0, 1] to improve model convergence.
● Reshaping: Reshape the input data to include a channel dimension (e.g., from (28, 28) to
(28, 28, 1)).
● Train-Test Split: Ensure that the dataset is properly split into training and testing sets.
4. Model Design
● Architecture:
● Convolutional Layer 1: Applies several filters (e.g., 32 filters of size (3x3)) with
ReLU activation.
● Max Pooling Layer 1: Reduces spatial dimensions (e.g., using a (2x2) pooling
size).
● Flatten Layer: Flattens the output from the convolutional layers into a one-dimensional
vector.
● Dense Layer: Fully connected layer with a suitable number of neurons (e.g., 128)
and ReLU activation.
● Output Layer: A dense layer with 10 neurons (one for each digit) and softmax
activation to output probabilities.
5. Model Compilation
● Compile the model with the Adam optimizer, categorical cross-entropy loss, and accuracy
as the evaluation metric.
6. Model Training
● Train the model using the training dataset with appropriate parameters (e.g., 10-20
epochs and a batch size of 32).
7. Model Evaluation
● Evaluate the model on the test set using accuracy as the primary metric.
8. Results Visualization
● Plot training and validation accuracy/loss curves to visualize model performance over
epochs.
● Display some test images along with their predicted labels to qualitatively assess model
predictions.
9. Conclusion
● Discuss potential improvements or alternative architectures (e.g., deeper networks,
dropout layers).
● Investigate transfer learning using pre-trained models on similar tasks for improved
accuracy.
2. Related Work:
3. Proposed Methodology:
The methodology details the approach to developing a Convolutional Neural Network (CNN) for
identifying handwritten digits in the MNIST dataset. It covers data preparation, model design,
training, evaluation, and finally deployment.
1. Data Collection
Dataset: The MNIST dataset of 70,000 grayscale images of handwritten digits (0-9) is used. It is
split into 60,000 training images and 10,000 test images.
2. Data Preprocessing
Loading the Dataset: Use libraries like TensorFlow or Keras to load the MNIST dataset.
Reshaping the Data: Convert each image from a 2D array of 28x28 pixels to a 3D array of shape
(28, 28, 1), adding a channel dimension for the single grayscale channel (the full training set then
has shape (60000, 28, 28, 1)).
Normalization: Scale the pixel values from the range [0, 255] to [0, 1] by dividing by 255. This
helps improve the convergence of the neural network during training.
One-Hot Encoding: Encode the target labels, which are digits, as one-hot encoded vectors. For
instance, '3' can be represented as [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
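A minimal sketch of these preprocessing steps with TensorFlow/Keras (variable names are illustrative):
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

# Load MNIST: 60,000 training and 10,000 test images of shape (28, 28)
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Add the grayscale channel dimension and scale pixel values to [0, 1]
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# One-hot encode the labels, e.g. 3 -> [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)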
3. Model Design
Architecture Selection: Define a CNN architecture for image classification. The selected
architecture comprises:
Input Layer: The input layer will take images of shape (28, 28, 1).
Convolutional Layers: Make use of several convolutional layers with ReLU activation to learn
the feature space of images.
Pooling Layers: Max pooling is added for reduction of spatial dimensions while preserving
useful features.
Flatten Layer: The output of the convolutional layers is flattened in preparation for the dense
layers.
Dense Layers: One or more dense layers learn rich representations, with a dropout layer applied
to reduce overfitting.
Output Layer: A dense layer with softmax activation produces a probability for each of the 10
digit classes.
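A sketch of one architecture matching this description, using the Keras Sequential API (the layer sizes and dropout rate are illustrative choices, not fixed by the design above):
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),  # second conv block; filter count assumed
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),                           # dropout rate assumed
    layers.Dense(10, activation='softmax')         # one probability per digit class
])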
4. Compilation of Model
Loss Function: Employ the categorical cross-entropy loss function, the standard choice for
multi-class classification problems.
Optimizer: Use an optimizer such as Adam or SGD for minimizing the loss function during
training.
Metrics: Employ accuracy as the evaluation metric for measuring the performance of the model.
5. Model Training
Training Process: Train the model on the training dataset. Define the number of epochs (typically
10-20) and the batch size (e.g., 32) for the training process.
Validation Split: If needed, utilize a validation split from the training data to track the
performance of the model and avoid overfitting.
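A minimal sketch of these compilation and training steps, assuming model, X_train, and y_train come from the preceding steps (the 10% validation split is an assumed value):
# Compile with the Adam optimizer and categorical cross-entropy loss
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train for 10 epochs with batch size 32, holding out 10% of the data for validation
history = model.fit(X_train, y_train,
                    epochs=10, batch_size=32,
                    validation_split=0.1)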
6. Model Evaluation
Testing: Test the trained model on the test dataset to evaluate its performance. Compute
accuracy, precision, recall, and F1-score.
Confusion Matrix: Generate a confusion matrix to visualize the model's performance across
different digit classes.
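A sketch of this evaluation step using scikit-learn's metrics, assuming the trained model and the one-hot encoded test labels from the earlier steps:
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

test_loss, test_acc = model.evaluate(X_test, y_test)

# Convert softmax outputs and one-hot labels back to digit indices
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)

print(classification_report(y_true, y_pred))  # precision, recall, and F1-score per digit
print(confusion_matrix(y_true, y_pred))       # rows: true digits, columns: predicted digits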
7. Hyperparameter Tuning
Optimization: Experiment with different hyperparameters, such as the number of filters, kernel
sizes, learning rates, and dropout rates, to improve model performance.
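As an illustration, a simple manual grid search over two of these hyperparameters might look like the sketch below; the candidate values and the build_model helper are assumptions for illustration, and the imports from the earlier snippets are reused:
def build_model(dropout_rate):
    # Rebuild a small CNN with a configurable dropout rate
    return models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dropout(dropout_rate),
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax')])

for dropout_rate in [0.25, 0.5]:          # candidate dropout rates (assumed)
    for lr in [1e-3, 1e-4]:               # candidate learning rates (assumed)
        m = build_model(dropout_rate)
        m.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss='categorical_crossentropy', metrics=['accuracy'])
        hist = m.fit(X_train, y_train, epochs=5, batch_size=32,
                     validation_split=0.1, verbose=0)
        print(dropout_rate, lr, max(hist.history['val_accuracy']))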
8. Model Deployment
Saving the Model: Save the trained model using formats such as HDF5 or TensorFlow
SavedModel for future use.
Deployment: Deploy the model in a web application or mobile app where users can submit
handwritten digits and receive real-time predictions.
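A minimal sketch of saving and reloading the trained model (the file name is illustrative; shown here with HDF5, while the TensorFlow SavedModel format works analogously depending on the installed version):
# Save the trained model in HDF5 format
model.save('mnist_cnn.h5')

# Later, reload the saved model for inference
restored = tf.keras.models.load_model('mnist_cnn.h5')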
9. Future Work
Data Augmentation: Augment the training dataset artificially using techniques such as rotation,
scaling, and translation to enhance robustness.
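One possible sketch of such augmentation with Keras' ImageDataGenerator (the transformation ranges are assumed values):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Randomly rotate, shift, and zoom the training images on the fly
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)

# Train on augmented batches instead of the raw training set
model.fit(datagen.flow(X_train, y_train, batch_size=32), epochs=10)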
Transfer Learning: Explore transfer learning, in which models pre-trained on similar tasks are
fine-tuned on MNIST to achieve better performance and reduce training time.
Ensemble Methods: Explore ensemble methods by combining the predictions of multiple models
to achieve higher accuracy.
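A minimal sketch of such an ensemble, averaging the softmax outputs of several independently trained models (trained_models is a hypothetical list of fitted Keras models):
import numpy as np

# trained_models: hypothetical list of independently trained CNNs
probs = np.mean([m.predict(X_test) for m in trained_models], axis=0)
ensemble_pred = np.argmax(probs, axis=1)  # class with the highest averaged probability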
4. Implementation:
import tensorflow as tf
import numpy as np
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
These imports provide a starting point for developing a CNN that classifies handwritten digits
using the MNIST dataset. TensorFlow, together with the user-friendly Keras interface, makes it
easy to build, train, and test deep learning models. The code below can be extended with
additional preprocessing, model tuning, and evaluation techniques to achieve high accuracy in
digit recognition. This approach not only reflects the effectiveness of CNNs for image
classification tasks but also supplies a practical outline for exploring more complex applications
in the future.
(X_train, y_train), (X_test, y_test) = datasets.mnist.load_data()
X_train = X_train.astype('float32') / 255
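The remaining preprocessing steps summarized below (normalizing the test set, reshaping, and one-hot encoding) would look roughly like this sketch:
X_test = X_test.astype('float32') / 255

# Add the grayscale channel dimension expected by the convolutional layers
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# One-hot encode the digit labels for use with categorical cross-entropy
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)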
In summary, the code above prepares the MNIST dataset for the CNN by loading, reshaping,
normalizing, and one-hot encoding the data, making it ready for training and evaluation.
Preprocessing steps such as these are essential for accurate digit classification and form a
common backbone of machine learning pipelines. With preprocessing complete, the usual next
steps are to define the architecture of the CNN, compile the model, train it on the prepared
dataset, and evaluate its performance on the test set.
model = models.Sequential()
# First convolutional block: 32 filters of size 3x3, as described in the model design
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
# Second convolutional block (64 filters assumed)
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
In conclusion, this code snippet effectively constructs a CNN architecture tailored for the
MNIST handwritten digit classification task. By combining convolutional layers, pooling layers,
and fully connected layers, the model is designed to learn hierarchical representations of the
input images, enabling it to classify digits accurately. This architecture is well-suited for image
classification tasks and serves as a solid foundation for further enhancements, such as
hyperparameter tuning, regularization techniques, and model evaluation. With this model
structure in place, the next steps would typically involve compiling the model, training it on the
preprocessed dataset, and evaluating its performance on the test set.
model.summary()
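The compilation, training, and evaluation steps discussed below would look roughly like the following sketch (the 10 epochs and batch size of 32 follow the methodology; the 10% validation split is an assumed value):
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train while monitoring a held-out validation split
history = model.fit(X_train, y_train,
                    epochs=10, batch_size=32,
                    validation_split=0.1)

# Evaluate on the unseen test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)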
In conclusion, this code snippet effectively demonstrates the process of compiling, training, and
evaluating a CNN model for handwritten digit classification using the MNIST dataset. By
utilizing the Adam optimizer and categorical cross-entropy loss, the model is well-equipped to
learn from the training data. The training process, monitored through validation metrics, helps
ensure that the model does not overfit. The final evaluation on the test set provides a quantitative
measure of the model's performance, with the printed test accuracy serving as a key indicator of
its effectiveness. This methodology not only highlights the practical application of CNNs in
image classification tasks but also sets the stage for potential future enhancements, such as
hyperparameter tuning, data augmentation, and model optimization.
# Plot training and validation accuracy over epochs
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='lower right')
plt.show()

# Plot training and validation loss over epochs
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')
plt.show()
In conclusion, the code snippet successfully generates visualizations that are essential for
evaluating the training process of the CNN model. By plotting training and validation accuracy
and loss, one can gain a comprehensive understanding of the model's learning behavior and
performance. These insights are critical for refining the model and achieving optimal results in
handwritten digit classification tasks. The ability to visualize training metrics not only aids in
model evaluation but also enhances the overall interpretability of the machine learning process.
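For the qualitative assessment described earlier (displaying test images alongside their predicted labels), a short sketch might look like this; it continues from the trained model and preprocessed test data above:
# Predict labels for the first nine test images and display them in a grid
predictions = np.argmax(model.predict(X_test[:9]), axis=1)
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(X_test[i].reshape(28, 28), cmap='gray')
    plt.title('Predicted: {}'.format(predictions[i]))
    plt.axis('off')
plt.tight_layout()
plt.show()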
5. Results:
Conclusion:
In this project, we successfully built a Convolutional Neural Network (CNN) for the
classification of handwritten digits using the MNIST dataset. The methodology encompassed
several key steps, each contributing to the overall effectiveness of the model:
1. Data Preparation:
● We began by loading the MNIST dataset, which consists of 70,000 images of
handwritten digits. The dataset was preprocessed by reshaping the images to
include a channel dimension, normalizing pixel values to a range of [0, 1], and
converting the target labels into one-hot encoded vectors. These preprocessing
steps are crucial for ensuring that the data is in the appropriate format for training
the CNN.
● Reshaping ensures images have the correct input shape for CNNs, i.e., (28, 28, 1).
● Normalization speeds up training and prevents issues with large pixel values.
2. Model Architecture:
● The CNN architecture was designed with multiple layers, including convolutional
layers for feature extraction, max pooling layers for downsampling, and fully
connected layers for classification. This architecture allows the model to learn
hierarchical representations of the input images, effectively capturing the spatial
patterns associated with different digits.
3. Model Training:
● The model was compiled using the Adam optimizer and categorical cross-entropy
loss function, which are well-suited for multi-class classification tasks. We trained
the model for 10 epochs, monitoring both training and validation accuracy to
ensure that the model was learning effectively without overfitting.
● Adam Optimizer – A widely used adaptive learning rate optimization algorithm
that balances speed and performance.
● Accuracy Metric – Tracks how well the model predicts digit labels.
4. Model Evaluation:
● After training, the model was evaluated on a separate test dataset, achieving a
high accuracy score. This performance metric indicates that the model generalizes
well to unseen data, making it a reliable tool for digit classification.
● Loss – Indicates how well the model’s predictions match actual labels.
5. Visualization of Results:
● We visualized the training process by plotting the accuracy and loss curves for
both training and validation datasets. These plots provided insights into the
model's learning behavior, helping to identify potential issues such as overfitting
or underfitting.
● For example, these curves help identify underfitting (when both training and validation
accuracy remain low).
Future Work:
While the current implementation of a Convolutional Neural Network (CNN) for classifying
handwritten digits from the MNIST dataset has yielded promising results, there are several
avenues for future work that could enhance the model's performance, robustness, and
applicability. Below are some suggested directions for further exploration:
1. Data Augmentation:
● Artificially expand the training dataset with techniques such as rotation, scaling, and
translation to make the model more robust to variations in handwriting.
2. Hyperparameter Tuning:
● Experiment with different hyperparameters, such as the number of filters, kernel sizes,
learning rates, and dropout rates, to further improve performance.
3. Advanced Architectures:
● Explore deeper architectures such as ResNet, which use residual connections to address
the vanishing gradient issue that makes traditional deep networks difficult to train.
4. Transfer Learning:
● Investigate the use of transfer learning by leveraging pre-trained models on
similar tasks. Fine-tuning these models on the MNIST dataset can lead to
improved performance and reduced training time.
5. Regularization Techniques:
6. Ensemble Methods:
7. Cross-Validation:
8. Model Interpretability:
10. Real-Time Application:
● Conduct longitudinal studies to assess how the model's performance changes over
time and how it can adapt to new data or changes in handwriting styles. This
could involve continuous learning techniques to update the model as new data
becomes available.
References:
● LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). "Gradient-Based Learning
Applied to Document Recognition." Proceedings of the IEEE, 86(11), 2278-2324.
● This pioneering paper introduces the MNIST dataset and discusses the application of
convolutional neural networks to digit recognition.
● Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with
Deep Convolutional Neural Networks." Advances in Neural Information Processing
Systems, 25, 1097-1105.
● This paper introduces the AlexNet architecture, which popularized deep learning and CNNs
by demonstrating strong performance on image classification tasks.
● Simonyan, K., & Zisserman, A. (2014). "Very Deep Convolutional Networks for Large-
Scale Image Recognition." arXiv preprint arXiv:1409.1556.
● This paper introduces the VGG architecture, emphasizing depth in CNNs and influencing
many subsequent models.
● He, K., Zhang, X., Ren, S., & Sun, J. (2016). "Deep Residual Learning for Image
Recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 770-778.
● This paper presents the ResNet architecture, which introduced residual connections to enable
the training of very deep networks and demonstrated their efficacy in image classification
tasks.
● Chollet, F. (2015). "Keras: The Python Deep Learning Library." GitHub Repository.
Retrieved from https://github.com/fchollet/keras
● This resource documents how to build and train neural networks with Keras and provides
examples that are particularly relevant to image classification tasks.
● Goodfellow, I., Bengio, Y., & Courville, A. (2016). "Deep Learning." MIT Press.
● This book offers a comprehensive introduction to deep learning concepts, including CNNs,
and is a valuable resource for understanding the theoretical foundations of the techniques
used here.
● Zhang, Y., & LeCun, Y. (2015). "Text Understanding from Scratch." arXiv preprint
arXiv:1502.01710.
● This article applies CNNs to text data, showing that CNN architectures can be used in
applications beyond image classification.
● Bengio, Y. (2012). "Practical Recommendations for Gradient-Based Training of Deep
Architectures." Neural Networks: Tricks of the Trade, 437-478.
● This chapter gives practical advice for optimizing the training of deep architectures,
including CNNs.
● Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). "Densely
Connected Convolutional Networks." Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 4700-4708.
● This article introduces DenseNet, a CNN architecture in which each layer is directly
connected to every other layer in a feed-forward manner. This strengthens feature
propagation, encourages feature reuse, and reduces the number of parameters.
● Scikit-learn Documentation. n.d. "User Guide." Available at
https://scikit-learn.org/stable/user_guide.html
● The Scikit-learn documentation provides detailed guidance on various machine learning
algorithms, including preprocessing techniques and model evaluation metrics.