ML Digit Classification Report
ML Models
Abstract:
This project evaluates the performance of five supervised learning algorithms—Logistic
Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Random
Forest—on the UCI Handwritten Digits dataset. The dataset comprises 1,797 grayscale
images of handwritten digits (0-9), and the goal is to classify each image correctly. After
data normalization and train-test splitting, each model was trained and evaluated using
standard metrics including accuracy, precision, recall, and F1-score. A performance
comparison was visualized using bar plots and confusion matrices.
Keywords:
Handwritten Digit Classification, Supervised Learning, Model Comparison, Accuracy, UCI
Dataset
1. Introduction:
Handwritten digit recognition is a classic problem in pattern recognition and machine
learning. It has practical applications in postal mail sorting, bank check recognition, and
digitizing handwritten documents. This project uses the UCI Digits dataset to explore how
different machine learning algorithms perform on this classification task.
2. Proposed Methodology:
The approach consists of four steps: normalizing the data, splitting it into training and test
sets, training each of the five models, and evaluating them on accuracy, precision, recall,
and F1-score.
a. Dataset:
The UCI Handwritten Digits dataset contains 1,797 images of handwritten digits, each 8x8
pixels. Each image is flattened into 64 numerical features, one per pixel, with integer
intensities from 0 to 16.
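The dataset shape described above can be checked with a short sketch. It assumes the copy of the dataset bundled with scikit-learn as load_digits; the report does not state how the data was loaded, so this choice is an assumption.

```python
from sklearn.datasets import load_digits

# Assumption: the scikit-learn bundled copy of the UCI digits data.
digits = load_digits()
X, y = digits.data, digits.target

print(X.shape)              # (1797, 64): one row of 64 pixel features per image
print(digits.images.shape)  # (1797, 8, 8): the same data as 8x8 images
print(X.min(), X.max())     # 0.0 16.0: range of the pixel intensities
print(sorted(set(y)))       # the ten digit classes, 0 through 9
```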
b. Preprocessing:
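Pixel values were normalized to the range [0, 1] by dividing by 16.0, the maximum pixel
intensity. The dataset was then split into training (80%) and test (20%) sets, stratified by
digit class so that both sets keep the same class proportions.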
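The preprocessing step can be sketched as follows. This is a minimal sketch, assuming scikit-learn's load_digits and train_test_split; the random_state value is a hypothetical choice for reproducibility and is not taken from the report.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# Scale pixel intensities from [0, 16] to [0, 1].
X_scaled = X / 16.0

# 80/20 split, stratified on the digit label.
# random_state=42 is an assumed seed, not a value from the report.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

With 1,797 samples, a 20% test fraction is rounded up, so the test set holds 360 images and the training set holds the remaining 1,437.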
c. Models Used:
Five classification algorithms were evaluated:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Support Vector Machine (SVM)
- Decision Tree
- Random Forest
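One way to train and compare the five models listed above is sketched below. The report does not give hyperparameters, so everything here is an assumption: scikit-learn defaults are used, max_iter=1000 is set only to help Logistic Regression converge, and the random_state values are hypothetical seeds.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, stratify=y, random_state=42
)

# Assumed settings: library defaults, not the report's actual configuration.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Fit each model on the training set and record its test accuracy.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Keeping the models in a dictionary means the same loop can later feed a bar plot or a per-model confusion matrix without repeating the training code.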
d. Evaluation Metrics:
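Accuracy, precision, recall, and F1-score were calculated for each model. Precision, recall,
and F1-score used weighted averages, in which each class's score is weighted by its number
of true samples. Five-fold cross-validation was also performed.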
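The evaluation can be sketched for a single model as shown below, using Logistic Regression as the example. The scikit-learn calls are real, but two details are assumptions because the report does not specify them: the random seed, and running cross-validation on the full normalized dataset rather than on the training set only.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # assumed seed
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

# average="weighted" weights each class's score by its number of true samples.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)

# Five-fold cross-validation; for a classifier, an integer cv uses stratified folds.
# Assumption: cross-validation runs on the full dataset.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=5, scoring="accuracy"
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
print(f"5-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

Weighted recall always equals accuracy, since weighting each class's recall by its support and summing gives the total fraction of correct predictions. The recall figure therefore adds no information beyond accuracy under this averaging scheme.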