Handwritten Digit Recognition Using Quantum Convolution Neural Network
Ravuri Daniel1, Bode Prasad2, Prudhvi Kiran Pasam3, Dorababu Sudarsa4, Ambarapu Sudhakar5,
Bodapati Venkata Rajanna5
1 Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada, India
2 Department of Information Technology, Vignan’s Institute of Information Technology (A), Visakhapatnam, India
3 Department of Computer Science and Engineering (IoT), R.V.R. & J. C. College of Engineering (A), Guntur, India
4 Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, India
5 Department of Electrical and Electronics Engineering, MLR Institute of Technology, Hyderabad, India
Corresponding Author:
Ravuri Daniel
Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology
Vijayawada, India
Email: [email protected]
1. INTRODUCTION
Handwriting is considered the most conventional and structured way of documenting facts and
information. Individuals have unique and idiosyncratic handwriting. A system that is able to recognize and
analyze human handwriting in any language is referred to as a handwritten character recognition (HCR) system
[1]–[3]. Handwriting recognition can be carried out from both online and offline sources. In recent times, the
application of handwriting recognition has become increasingly prevalent and is now used in various domains,
including but not limited to, reading postal addresses, language translation, bank forms and check amounts,
digital libraries, keyword spotting, and traffic sign detection.
Recognizing handwritten digits is a challenging task for computer applications because digits come in
various shapes and sizes and are often imperfectly formed. Handwritten digit recognition refers to a computer’s
ability to identify and classify human-written numbers from various sources, including images, papers, and
touch screens, into ten predefined classes (0-9). The task poses several challenges due to the varying writing
styles among individuals, which makes it different from optical character recognition. A handwritten digit
recognition system takes digit images as input and recognizes the digit present in each image. To this end,
existing research has applied support vector machines, multilayer perceptrons, convolutional neural networks
(CNNs), and other deep learning methods [4], [5] to the Modified National Institute of Standards and
Technology (MNIST) dataset. However, CNNs [6], [7] require a large amount of labeled data to train, and
obtaining such datasets can be costly and time-consuming. CNNs can also be prone to overfitting, in which the
model performs well on the training data but poorly on new, unseen data. In certain real-world applications,
this difficulty in generalizing beyond the scope of the training data can limit the usefulness of CNNs. In this
paper, a new method is proposed to overcome these limitations by incorporating a quantum convolutional
neural network (QCNN) algorithm [8], [9]. A QCNN can perform certain operations much faster than classical
convolutional neural networks, especially those involving matrix multiplication and Fourier transforms. It can
require fewer resources and can be more efficient than classical CNNs, especially when dealing with large
datasets or working with noisy and incomplete data. QCNNs have the potential to scale more efficiently and
effectively than classical algorithms, making them better suited for large-scale applications. To evaluate the
effectiveness of the proposed method, tests were conducted on the MNIST dataset, achieving an average
accuracy of 91.08%.
This paper is organized as follows. Section 2 presents the related work. Section 3 describes the
proposed method. Section 4 describes the results and analysis in detail. Finally, section 5 summarizes the main
conclusions about the advantages and disadvantages of the proposed method.
2. RELATED WORK
Numerous methods have been proposed for classifying handwritten characters and digits.
Handwriting recognition has previously shown encouraging results using shallow networks [10], [11]. Hinton
et al. reported an accuracy of 91.08% on the MNIST dataset in their research on deep belief networks (DBNs),
which consist of three layers and incorporate a learning algorithm [12]. To recognize unconstrained
handwriting, Pham et al. used a regularization technique called dropout to improve the efficiency of recurrent
neural networks (RNNs) and lower the word error rate (WER) and character error rate (CER) [13]. The
performance of handwritten character recognition (HCR) was significantly transformed by the introduction of
the convolutional neural network (CNN), which achieved state-of-the-art accuracy [14], [15]. In 2003, Simard
et al. introduced a common CNN architecture for visual document analysis, which simplified the training of
complex neural network methods [16]. Wang et al. used multilayer CNNs to perform end-to-end text
recognition and achieved excellent results on benchmark datasets, including Street View Text and ICDAR
2003 [17].
CNNs have demonstrated exceptional performance in offline handwritten character recognition, and
researchers have studied handwritten text recognition in various regional and international languages,
including Chinese [18]–[20], Arabic [21], Tamil [22], Indic scripts [23], Urdu [24], [25], and Telugu [26]. In
their model, Gupta and colleagues utilized CNN-derived features and identified informative local regions in
character images, achieving better recognition accuracy. They employed a novel multi-objective optimization
framework for HCR. Ptucha et al. presented an intelligent character recognition (ICR) system based on
convolutional neural networks [27]. The model was evaluated on the IAM dataset and the French-language
RIMES lexicon dataset, and it reported commendable results. Tapotosh Ghosh et al. converted images from the
CMATERdb dataset into 28×28 black-and-white form with white as the foreground color, and designed CNN
parameters using InceptionResNetV2, DenseNet121, and InceptionNetV3 to improve recognition
performance.
A quantum convolutional neural network (QCNN) is a type of neural network that leverages the
principles of quantum mechanics to perform computations. By using qubits instead of classical bits, QCNNs
can potentially provide faster processing times and higher accuracy than classical neural networks for
handwritten digit recognition. Because QCNN is still an emerging technology, this work proposes QCNN-based
handwritten digit recognition to improve accuracy and reduce processing time.
3. METHODOLOGY
The proposed methodology includes data collection, pre-processing, model building, feature
prediction, and visualization of results. Data collection involves gathering the relevant data that will be used
to train and evaluate the model. It is important to ensure that the data is representative and of good quality in
order to produce reliable results. After collection, the data is pre-processed to prepare it for the modeling phase.
The tasks involved in this phase are data cleaning, handling missing values, removing duplicates, and dealing
with outliers. Data pre-processing also includes feature scaling, normalization, or transformation to make the
data suitable for the selected model. The model is built using the quantum convolutional neural network
algorithm and trained using the prepared data from the previous phase. After the model is trained, it can be
used to make predictions on new, unseen data. The input features are provided to the model, and it generates
predictions based on the patterns and relationships learned from the training data. After predictions are
obtained from the model, the results are visualized and interpreted. The detailed process of the proposed
method is shown in Figure 1 and described as follows.
3.2. Pre-processing
Pre-processing is a crucial stage in handwritten digit recognition and comprises five steps. The first is
image normalization, a common pre-processing step that rescales the value of each pixel in the image so that
it falls within a specific range, typically between 0 and 1. This can lessen the effect of variations in
illumination or other changes in the input image. The second step shrinks the input image to a fixed size, which
can help cut down the model's parameter count and speed up training. Additionally, by lessening the effect of
minute differences in input image size, rescaling can help increase the robustness of the model. The third step
is data augmentation, which creates new training samples from existing ones by rotating, translating, and
scaling them. This can help expand the training dataset and enhance the model's generalizability. The fourth
step is feature extraction, which can help distinguish between classes by extracting significant features from
the input image. In a QCNN, certain image properties can be extracted by encoding them into the amplitudes
of quantum states using quantum circuits. The final step is quantum circuit optimization: once the image has
been transformed into a quantum circuit, it is crucial to refine the circuit to reduce its depth and gate count.
This can help reduce the circuit's total runtime and improve the model's performance.
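The first three pre-processing steps can be illustrated with a minimal sketch in plain Python. The function names (`normalize`, `downsample`, `shift_right`), the block-averaging resize, and the toy 8×8 input are assumptions made for illustration only; the paper does not specify which resizing or augmentation routines were used.

```python
def normalize(image, max_value=255.0):
    """Rescale integer pixel intensities into the range [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


def downsample(image, out_size):
    """Shrink a square image to out_size x out_size by block averaging.

    Block averaging is an assumed resize method, chosen for simplicity.
    """
    n = len(image)
    if n % out_size != 0:
        raise ValueError("image side must be a multiple of out_size")
    block = n // out_size
    result = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            cells = [image[r * block + i][c * block + j]
                     for i in range(block) for j in range(block)]
            row.append(sum(cells) / len(cells))
        result.append(row)
    return result


def shift_right(image, fill=0.0):
    """Augmentation example: translate the image one pixel to the right."""
    return [[fill] + row[:-1] for row in image]


# Toy 8x8 "image": a bright vertical bar in columns 2 and 3.
raw = [[255 if c in (2, 3) else 0 for c in range(8)] for _ in range(8)]
small = downsample(normalize(raw), 4)   # 8x8 -> 4x4, values in [0, 1]
augmented = shift_right(small)          # one extra training sample
```

The same `downsample` call would reduce a 28×28 MNIST image to 4×4, since 28 is a multiple of 4; each output pixel is then the mean of a 7×7 block.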
Figure 4 illustrates a quantum neural network circuit. The input data is encoded into the state of the
qubits, and a sequence of quantum gates is applied to the qubits to process the input data. The readout qubit is
then measured, and a prediction is made from the measurement result.
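The readout idea can be sketched with a deliberately simplified model. The sketch assumes that the data qubits are in a computational basis state (one classical bit each) and that each data qubit acts on the readout qubit through a controlled-RY rotation with a trainable angle; these gates, the angle values, and the sign convention for the class label are illustrative assumptions, not the gate set of the circuit in Figure 4.

```python
import math


def ry(theta):
    """2x2 matrix of a single-qubit RY rotation (real-valued)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def apply(gate, state):
    """Apply a 2x2 gate to a single-qubit state [amp0, amp1]."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]


def readout_expectation(bits, thetas):
    """Expectation value of Pauli-Z on the readout qubit.

    Each data qubit controls an RY(theta) on the readout qubit; because the
    data qubits are basis states, the rotation fires only where the bit is 1.
    """
    state = [1.0, 0.0]  # readout qubit starts in |0>
    for bit, theta in zip(bits, thetas):
        if bit == 1:
            state = apply(ry(theta), state)
    return state[0] ** 2 - state[1] ** 2


def predict(bits, thetas):
    """Assumed convention: negative <Z> means class 1, otherwise class 0."""
    return 1 if readout_expectation(bits, thetas) < 0 else 0


thetas = [math.pi / 2, math.pi / 3, math.pi / 4, math.pi / 4]  # hypothetical weights
score_a = readout_expectation([1, 0, 1, 1], thetas)  # total rotation pi
score_b = readout_expectation([0, 1, 0, 0], thetas)  # total rotation pi/3
```

In a trained model the angles in `thetas` would be adjusted by an optimizer so that the sign of the readout expectation matches the label.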
The procedure converts binary images made up of black and white pixels, such as the training and
test data sets shown in Figure 5, into quantum circuits. Figure 5(a) shows the training data sets (2,2) and (3,1)
for the CNOT gate as a quantum circuit, and Figure 5(b) shows the test data set (2,1) for the CNOT gate as a
quantum circuit. A threshold is applied to the pixel values, and a qubit is rotated through an X gate only if the
corresponding pixel value is higher than the threshold. This lessens the negative effects of noisy pixels and
helps ensure that the resulting quantum circuit is efficient and robust.
Figure 5. Training and test data sets of the CNOT gate: (a) CNOT gate for the training data set, (b) CNOT
gate for the test data set
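The thresholded encoding described above can be sketched without any quantum library by listing which qubits would receive an X gate. The threshold of 0.5, the helper names, and the toy 4×4 image are assumptions for illustration; the paper does not state the threshold value it used.

```python
def binarize(image, threshold=0.5):
    """Map each pixel to 1 if it exceeds the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def x_gate_positions(binary_image):
    """Return the (row, col) grid positions of qubits that receive an X gate."""
    return [(r, c)
            for r, row in enumerate(binary_image)
            for c, bit in enumerate(row)
            if bit == 1]


def basis_state_label(binary_image):
    """Bit string of the basis state prepared from |00...0> by those X gates."""
    return "".join(str(bit) for row in binary_image for bit in row)


# Toy 4x4 image with two bright pixels and some low-intensity noise.
image = [[0.0, 0.1, 0.0, 0.0],
         [0.0, 0.2, 0.9, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.8, 0.0, 0.0]]
bits = binarize(image)
positions = x_gate_positions(bits)   # qubits flipped from |0> to |1>
label = basis_state_label(bits)      # 16-qubit basis state, row-major order
```

The noisy pixels (0.1 and 0.2) fall below the threshold and add no gates, so the circuit contains one X gate per bright pixel only.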
In quantum computing systems, a quantum neural network (QNN) is used to speed up learning tasks
on quantum data. Figure 6 depicts the QNN architecture: the image is rescaled to 4×4 dimensions before being
passed to the unitary matrix for feature extraction across various channels. The extracted features are then used
to build a quantum circuit model, which is optimized using a loss function combined with an optimizer. For
binary classification problems, a 2-layer circuit design was adopted and refined through hyperparameter testing
over several epochs. The final model, which resembles a tiny recurrent neural network stretched across pixels,
was created using two layers together with preparation and readout processes. Every data qubit in every layer
affects the readout qubit because n repetitions of the same gate are used. The stages of designing a quantum
convolutional neural network are as follows:
Stage 1: A quantum circuit is built to accept a small region of the input image, a 2×2 square, as its limited
field of view.
Stage 2: A quantum computation associated with the unitary matrix (U) is applied to the circuit; the circuit
diagram of gates is the common visual representation of such quantum operations.
Stage 3: The quantum system is measured, yielding a number of classical expectation values.
Stage 4: As in a conventional convolution layer, each expectation value is mapped to a different channel of a
single output pixel.
Stage 5: The procedure is repeated over different regions of the image, so that a full scan of the input image is
accomplished and a multi-channel output object is produced.
Stage 6: The quantum convolution layer can be followed by either quantum or classical layers.
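The stages above can be sketched as a small state-vector simulation in plain Python. Several choices here are assumptions made for illustration and are not taken from the paper: each pixel p in [0, 1] is encoded with an RY(πp) rotation, a fixed chain of CNOT gates stands in for the unitary U, Pauli-Z expectation values are used as the measured quantities, and the window moves with stride 2.

```python
import math
from itertools import product

N_QUBITS = 4  # one qubit per pixel of a 2x2 patch


def encode(patch):
    """Stage 1: product state with RY(pi * p) applied to |0> for each pixel p."""
    single = [(math.cos(math.pi * p / 2), math.sin(math.pi * p / 2)) for p in patch]
    state = {}
    for bits in product((0, 1), repeat=N_QUBITS):
        amp = 1.0
        for qubit, bit in enumerate(bits):
            amp *= single[qubit][bit]
        state[bits] = amp
    return state


def cnot(state, control, target):
    """Apply a CNOT by permuting basis states (flip target when control is 1)."""
    new_state = {}
    for bits, amp in state.items():
        flipped = list(bits)
        if bits[control] == 1:
            flipped[target] ^= 1
        new_state[tuple(flipped)] = amp
    return new_state


def z_expectations(state):
    """Stage 3: expectation value of Pauli-Z on every qubit."""
    return [sum(amp * amp * (1 - 2 * bits[q]) for bits, amp in state.items())
            for q in range(N_QUBITS)]


def quanv_patch(patch):
    """Stages 1-4 for one patch: returns one value per output channel."""
    state = encode(patch)
    for control in range(N_QUBITS - 1):  # Stage 2: fixed stand-in for U
        state = cnot(state, control, control + 1)
    return z_expectations(state)


def quanv_layer(image, stride=2):
    """Stage 5: scan the image and build a multi-channel output."""
    out = []
    for r in range(0, len(image) - 1, stride):
        row = []
        for c in range(0, len(image[0]) - 1, stride):
            patch = [image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1]]
            row.append(quanv_patch(patch))
        out.append(row)
    return out


image = [[0.0] * 4 for _ in range(4)]
image[0][0] = 1.0                    # a single bright pixel
features = quanv_layer(image)        # shape: 2 x 2 pixels x 4 channels
```

For Stage 6, `features` would be flattened and passed to a subsequent layer. In this sketch the bright pixel flips its qubit to |1>, the CNOT chain propagates the flip to the other three qubits, and all four channels of the first output pixel read -1, while patches of dark pixels read +1.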
Handwritten digit recognition using quantum convolution neural network (Ravuri Daniel)
538 ISSN: 2252-8938
The loss of the CNN is higher than that of the QCNN. At point 1, the loss of the CNN is between 20%
and 25%, whereas that of the QCNN is between 5% and 10%. At point 4, the loss of the CNN is between 15%
and 20%, whereas that of the QCNN is between 0% and 5%. At point 10, the loss of the CNN is between 5%
and 10%, whereas that of the QCNN is between 0% and 5%. Figure 8 shows a drastic decrease in loss for the
CNN, whereas the loss of the QCNN changes at a steady rate.
Figure 9 shows that the QCNN achieved higher accuracy (91.07%) than the CNN (84.68%) for
handwritten digit recognition. The loss of the QCNN (3.3%) is also lower than that of the CNN (7.33%). The
differences in accuracy and loss between the CNN and the QCNN are 6.39% and 4.07%, respectively.
[Figure: x-axis "Algorithms", categories QCNN and CNN]
5. CONCLUSION
In this paper, a novel approach to handwriting recognition using a quantum neural learning model is
presented. Using a sample set of over 60,000 handwritten digit images, the experimental comparison showed a
high level of efficiency, with an overall accuracy of 91.07%. When using quantum hardware, the model’s
training process took much less time than conventional classical CNN model development with comparable
sample sizes. The model’s overall speed and effectiveness were also demonstrated by the inference time for
each image, which was measured at one minute. With the use of QCNN, several drawbacks of CNNs, such as
overfitting and vanishing gradients, have been addressed, leading to higher accuracy rates. Additionally, the
proposed technique has been shown to be resilient to image distortions and changes in handwriting styles. The
generalizability and scalability of QCNN for bigger datasets and more difficult recognition tasks, however,
require more investigation. Overall, the use of QCNN has considerable potential to advance the field of
handwritten digit recognition.
REFERENCES
[1] L. Deng, “The MNIST database of handwritten digit images for machine learning research,” IEEE Signal Processing Magazine,
vol. 29, no. 6, pp. 141–142, Nov. 2012, doi: 10.1109/MSP.2012.2211477.
[2] G. Elizabeth Rani, M. Sakthimohan, G. Abhigna Reddy, D. Selvalakshmi, T. Keerthi, and R. Raja Sekar, “MNIST Handwritten
Digit Recognition using Machine Learning,” in 2022 2nd International Conference on Advance Computing and Innovative
Technologies in Engineering, ICACITE 2022, Apr. 2022, pp. 768–772. doi: 10.1109/ICACITE53722.2022.9823806.
[3] Y. Shima, Y. Nakashima, and M. Yasuda, “Classifying for a mixture of object images and character patterns by using CNN pre-
trained for large-scale object image dataset,” in Proceedings of the 13th IEEE Conference on Industrial Electronics and Applications,
ICIEA 2018, May 2018, pp. 2360–2365. doi: 10.1109/ICIEA.2018.8398104.
[4] I. Hussain, R. Ahmad, S. Muhammad, K. Ullah, H. Shah, and A. Namoun, “PHTI: Pashto Handwritten Text Imagebase for Deep
Learning Applications,” IEEE Access, vol. 10, pp. 113149–113157, 2022, doi: 10.1109/ACCESS.2022.3216881.
[5] X. Y. Zhang, F. Yin, Y. M. Zhang, C. L. Liu, and Y. Bengio, “Drawing and Recognizing Chinese Characters with Recurrent Neural
Network,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 849–862, Apr. 2018, doi:
10.1109/TPAMI.2017.2695539.
[6] C. Lee, G. Srinivasan, P. Panda, and K. Roy, “Deep Spiking Convolutional Neural Network Trained with Unsupervised Spike-
Timing-Dependent Plasticity,” IEEE Transactions on Cognitive and Developmental Systems, vol. 11, no. 3, pp. 384–394, Sep. 2019,
doi: 10.1109/TCDS.2018.2833071.
[7] Z. Meng, X. Guo, Z. Pan, D. Sun, and S. Liu, “Data Segmentation and Augmentation Methods Based on Raw Data Using Deep
Neural Networks Approach for Rotating Machinery Fault Diagnosis,” IEEE Access, vol. 7, pp. 79510–79522, 2019, doi:
10.1109/ACCESS.2019.2923417.
[8] K. Sooksatra, P. Rivas, and J. Orduz, “Evaluating Accuracy and Adversarial Robustness of Quanvolutional Neural Networks,” in
Proceedings - 2021 International Conference on Computational Science and Computational Intelligence, CSCI 2021, Dec. 2021,
pp. 152–157. doi: 10.1109/CSCI54926.2021.00097.
[9] F. Tacchino et al., “Variational Learning for Quantum Artificial Neural Networks,” IEEE Transactions on Quantum Engineering,
vol. 2, pp. 1–10, 2021, doi: 10.1109/tqe.2021.3062494.
[10] M. Shabir et al., “Real-Time Pashto Handwritten Character Recognition Using Salient Geometric and Spectral Features,” IEEE
Access, vol. 9, pp. 160238–160248, 2021, doi: 10.1109/ACCESS.2021.3123726.
[11] S. Ahlawat and R. Rishi, “Handwritten digit recognition using adaptive neuro-fuzzy system and ranked features,” in 2018
International Conference on Computing, Power and Communication Technologies, GUCON 2018, Sep. 2019, pp. 1128–1132. doi:
10.1109/GUCON.2018.8675013.
[12] T. H. S. Li, P. H. Kuo, C. Y. Chang, H. P. Hsu, Y. C. Chen, and C. H. Chang, “Deep Belief Network-Based Learning Algorithm
for Humanoid Robot in a Pitching Game,” IEEE Access, vol. 7, pp. 165659–165670, 2019, doi: 10.1109/ACCESS.2019.2953282.
[13] V. Pham, T. Bluche, C. Kermorvant, and J. Louradour, “Dropout Improves Recurrent Neural Networks for Handwriting
Recognition,” in Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, Sep. 2014, vol. 2014-
December, pp. 285–290. doi: 10.1109/ICFHR.2014.55.
[14] E. Irmak, “A Novel Deep Convolutional Neural Network Model for COVID-19 Disease Detection,” Nov. 2020. doi:
10.1109/TIPTEKNO50054.2020.9299286.
[15] N. A. N. Azlan, C. K. Lu, I. Elamvazuthi, and T. B. Tang, “Automatic detection of masses from mammographic images via artificial
intelligence techniques,” IEEE Sensors Journal, vol. 20, no. 21, pp. 13094–13102, Nov. 2020, doi: 10.1109/JSEN.2020.3002559.
[16] P. Y. Simard, D. Steinkraus, and J. C. Platt, “Best practices for convolutional neural networks applied to visual document analysis,”
in Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, 2003, vol. 2003-January, pp. 958–
963. doi: 10.1109/ICDAR.2003.1227801.
[17] T. Wang, D. J. Wu, A. Coates, and A. Y. Ng, “End-to-end text recognition with convolutional neural networks,” in Proceedings of
the 21st International Conference on Pattern Recognition (ICPR2012), 2012, pp. 3304–3308.
[18] X. Liu, G. Meng, S. Xiang, and C. Pan, “Handwritten text generation via disentangled representations,” IEEE Signal Processing
Letters, vol. 28, pp. 1838–1842, 2021, doi: 10.1109/LSP.2021.3109541.
[19] Z. Xie, Z. Sun, L. Jin, Z. Feng, and S. Zhang, “Fully convolutional recurrent network for handwritten Chinese text recognition,” in
Proceedings - International Conference on Pattern Recognition, Dec. 2016, vol. 0, pp. 4011–4016. doi:
10.1109/ICPR.2016.7900261.
[20] L. Xu, Y. Wang, X. Li, and M. Pan, “Recognition of Handwritten Chinese Characters Based on Concept Learning,” IEEE Access,
vol. 7, pp. 102039–102053, 2019, doi: 10.1109/ACCESS.2019.2930799.
[21] C. Boufenar and M. Batouche, “Investigation on deep learning for off-line handwritten Arabic Character Recognition using Theano
research platform,” Apr. 2017. doi: 10.1109/ISACV.2017.8054902.
[22] N. Shaffi and F. Hajamohideen, “UTHCD: A New Benchmarking for Tamil Handwritten OCR,” IEEE Access, vol. 9, pp. 101469–
101493, 2021, doi: 10.1109/ACCESS.2021.3096823.
[23] R. Battiti, B. Demir, and L. Bruzzone, “Quad-tree based compressed histogram attribute profiles for classification of very high
resolution images,” in International Geoscience and Remote Sensing Symposium (IGARSS), Jul. 2016, vol. 2016-November, pp.
3330–3333. doi: 10.1109/IGARSS.2016.7729861.
[24] A. Rasheed, N. Ali, B. Zafar, A. Shabbir, M. Sajid, and M. T. Mahmood, “Handwritten Urdu Characters and Digits Recognition
Using Transfer Learning and Augmentation With AlexNet,” IEEE Access, vol. 10, pp. 102629–102645, 2022, doi:
10.1109/ACCESS.2022.3208959.
[25] T. Anjum and N. Khan, “An attention based method for offline handwritten Urdu text recognition,” in Proceedings of International
Conference on Frontiers in Handwriting Recognition, ICFHR, Sep. 2020, vol. 2020-September, pp. 169–174. doi:
10.1109/ICFHR2020.2020.00040.
[26] C. L. P. Chen and B. Wang, “Random-Positioned License Plate Recognition Using Hybrid Broad Learning System and
Convolutional Networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 1, pp. 444–456, Jan. 2022, doi:
10.1109/TITS.2020.3011937.
[27] M. A. Ferrer et al., “Static and dynamic synthesis of bengali and devanagari signatures,” IEEE Transactions on Cybernetics, vol.
48, no. 10, pp. 2896–2907, Oct. 2018, doi: 10.1109/TCYB.2017.2751740.
BIOGRAPHIES OF AUTHORS
Dr. Ambarapu Sudhakar has served since 2019 as Professor and Head of the Electrical
and Electronics Engineering Department at MLR Institute of Technology, Hyderabad,
Telangana, India. He has taught at engineering colleges for 17 years. His interests include
electric drives and intelligent controllers, among other topics. He holds several patents and has
published research articles in indexed journals. His research earned him the 2016 INDUS
Research Excellence Award. He has written a textbook and delivered expert talks at webinars
and conferences. He has organised state and national conferences with Indian government
funding. He can be contacted at email: [email protected].