Neuroquantology | December 2022 | Volume 20 | Issue 21 | Page 1211-1221 | Doi:10.48047/NQ.2022.20.21.NQ99127
Sakuldeep Singh et al/Indic Hand Written Script Identification Using Ensemble learning Soft Voting Classifier and Easy OCR

Indic Hand Written Script Identification Using Ensemble learning Soft Voting Classifier and Easy OCR
Sakuldeep Singh
Email - [email protected]
Research Scholar, Monad University Pilakhua, Hapur (U.P), India.
Dr. R.B.Singh
Email – [email protected]
Professor (Dept of Mathematics), Monad University Pilakhua, Hapur (U.P), India.

Abstract –
Despite decades of study on offline Indic script recognition, handwritten characters and numerals remain challenging to read. This is due to the strong visual similarity between characters and the pervasive structural similarity of the Indic scripts. Machine learning-based techniques have achieved results for handwritten Indic script identification that are comparable to those for other computer vision tasks, even though the problem is still fairly recent. However, developing a handcrafted machine learning model that works well across Indian languages requires considerable trial and error and in-depth knowledge of the problem. A solution was found after the search was streamlined using an evolving meta-heuristics approach. Combining machine learning with EasyOcr in this way improved both text extraction and language recognition. This work focuses on Hindi, Malayalam, Kannada, and Tamil, and uses ensemble learning models together with the EasyOcr library to detect the languages present in images. The proposed models, including an ensemble learning voting classifier, a multi-layer perceptron, and a support vector machine, reach accuracies of 98.6% and 89.9%, with a 100% detection and text extraction rate for Hindi, Kannada, Malayalam, and Tamil.
Keywords— Machine Learning, Voting Classifier, Ensemble learning, Ada Boost, Multi-Layer
Perceptron, Script Identification, Easy OCR.
DOI Number:10.48047/NQ.2022.20.21.NQ99127 Neuroquantology 2022; 20(21):1211-1221

I. INTRODUCTION
Machine learning has recently been used in computer vision research to solve a range of problems. Two examples are the identification of text and of script in images of natural scenes. Many applications require text detection and classification in natural scene images, including autonomous vehicles, intelligent robots, script narrators, and drones[1][2]. This detection is more difficult in images of natural environments than in scanned documents. The challenges include complex backgrounds, low resolution, background noise, illumination reflections, varying lighting, the presence of the same text in various locales and languages, and wide variation in the alignment and position of the text as well as in the size, colour, and aspect ratio of the font. Because reading systems rely on a single language, any comprehensive multilingual text reading method without script recognition is regarded as incomplete. End-to-end (e2e)[3][4] systems organise the identification and verification of multilingual scripts into two stages: the first finds writing in an image, and the second determines what language the text is written in. Text detection is therefore an essential stage in the process of identifying scripts[5].

Fig. 1 Script Identification using Ensemble Learning.


Nonetheless, research on character recognition gained momentum as a result of advances in AI and deep learning techniques, which were seen as a glimmer of hope for better outcomes[6][7]. In the past, for lack of technological alternatives, records were written by hand on paper or leaves with ink. Yet a written record on paper is never completely risk-free[8]. The document might be shredded, have its edges torn off, or be severely exposed to the weather. Additionally, in this era of globalisation, people tend to forget their own ethnic origins. The ancient Gujarati scripts should be digitally preserved so that future generations can study them and gain important life lessons[9][10]. This is one of the best ways to contribute to the preservation of the culture. This article proposes a machine learning-based approach, together with image processing techniques, for automatically digitising such handwritten scripts: text is first extracted from images and converted to strings, and a trained machine learning model then classifies and recognises the language[11]. In this work, machine learning and ensemble learning methods are used to categorise and recognise cursive handwritten Indian languages such as Hindi, Tamil, Kannada, and Malayalam. To assess how well the trained models perform, they are run on test data[12][13].

II. LITERATURE REVIEW
Cheikhrouhou et al. (2021) propose an end-to-end multi-task deep neural network for simultaneous script identification and Keyword Spotting (KWS) on images of printed and handwritten text in a variety of languages. They design a CNN-BLSTM architecture that provides a unified strategy for both problems. So that the network covers more relevant information, local and global features are extracted during the script identification stage. To capture pairwise correlations between these features, compact bilinear pooling is used instead of conventional feature fusion methods, which amount to a linear feature concatenation. The KWS module receives the output of script identification, discards characters from irrelevant scripts, and then performs decoding in single-script mode. The CTC loss for KWS and the NLL loss for script identification are minimised jointly during end-to-end multi-task training of all network parameters. The approach was evaluated on several available datasets representing different languages and writing styles. In tests, deep multi-task representation learning outperformed state-of-the-art systems on both script recognition and keyword spotting[14].

Khalil et al. (2021) show that FCNs can also be used for model improvement and classification. The proposed approach extends the Efficient and Accurate Scene Text Detector with additional FCN branches for script identification. Most end-to-end (e2e) methods train the text detection and script identification models separately; nevertheless, the authors

suggest two e2e techniques for training the models simultaneously: multi-channel segmentation (MCS) and multi-channel mask (MCM). On the ICDAR MLT 2017 and MLe2e datasets, MCS outperforms existing approaches with recall values of 54.34% and 81.13%, respectively. MCM performs similarly to other state-of-the-art techniques[15].

Feurer et al. (2019) use a mixed-code dataset comprising Roman Urdu, Hindi, Saraiki, Bengali, and English to address the problem of mixed script recognition. Several RNN variants with word vectorization are used to train the language recognition model. The best configurations for LSTM, Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional Gated Recurrent Unit (Bi-GRU) were determined experimentally. By combining learnt word class features with GloVe embeddings, the experiment achieved its highest accuracy, 90.17, with Bi-GRU. The study also addresses problems that arise in multilingual settings, such as phonetic typing, generative spelling, and the transliteration of Roman words into English characters[16].

Aniket et al. (2019) describe a proposed application that uses image processing and machine learning to identify and recognise Gujarati handwriting. They draw attention to the substantial machinery needed for this process. The task is difficult because Gujarati contains curved characters and handwriting styles vary widely. The article discusses the entire process of character detection and identification, including image acquisition, preprocessing, segmentation, classification, recognition, and post-processing. It also highlights crucial elements such as designing a neural network suited to the difficult task of Gujarati script handwriting recognition, training and testing that model, and tuning numerous hyper-parameters to achieve the best accuracy. The article is aimed at computer enthusiasts and researchers interested in creating algorithms for Gujarati script recognition, and its goal is to clarify and illustrate the unique qualities of the Gujarati script[17].

Gomez et al. (2017) address script identification in scene text images. Modern CNN classifiers cannot account for the widely varying aspect ratios of scene text instances, which makes them difficult to apply directly. Instead of scaling input images to a fixed aspect ratio, as is typically done with holistic CNN classifiers, the authors propose a patch-based classification framework that preserves the discriminative parts of the input image that are characteristic of its class. Within this framework, they propose a novel approach for estimating the relative weights of the multiple stroke-part representations, which makes use of ensembles of conjoined networks. On two publicly available script identification datasets, this learning technique achieves state-of-the-art results. They also provide a new open benchmark dataset for evaluating end-to-end reading systems on multilingual scene text. Experiments on this dataset with an end-to-end system that combines the script identification technique with a previously published text detector and a commercially available OCR engine emphasise the crucial role that script identification plays in the system[18].
III. PROPOSED METHODOLOGY


[Flowchart boxes: Data Collection; Preprocess and Clean; Perform EDA; Count Vectorizer; Bag of Words; Label Encoding; Ensemble Learning; Performance Evaluation; EasyOcr: Hand-written text Extraction from Images and Convert into Strings; Detect and Classify Languages and Class]
Fig. 2 Proposed Flowchart.
This section describes a machine learning pipeline for the visual recognition and classification of handwritten Indian-language text or script. It uses both handwritten text and image features. The design, a natural language processing pipeline, uses text data from four Indian languages (Hindi, Kannada, Tamil, and Malayalam). After the data has been gathered and cleaned, the next steps are to apply a label encoder to convert categorical features to numerical values, apply a count vectorizer, and then train machine learning algorithms: AdaBoost, SVM, and MLP[19], as well as a voting classifier that combines four algorithms through ensemble learning[24]. After these steps, the effectiveness of the trained models is assessed on text data. The machine learning-based EasyOcr library is used to recognise script in image form: it extracts text from images, and the extracted text is passed to the trained machine learning models for detection and classification.

3.1.1 Data Collection
Because no single data set covered the different languages, data was gathered from various sources to create a new data set for developing models that could recognise and categorise the intended languages. Data for four Indian languages was collected from public sources, namely the Kaggle, UCI, and Data World websites. Four languages were chosen: Malayalam, Tamil, Kannada, and Hindi. Malayalam has 591 text samples, Tamil has 464, Kannada has 366, and Hindi has 62 text samples of various lengths. This information can be loaded into a pandas data frame for study.

3.1.2 Pre-processing
Preprocessing began by finding null and NaN values. Next, duplicate values in the text data were removed, and finally a clean text column was created. This column was produced with Python's regular expression library by stripping symbols, numbers, punctuation, keywords[25][26], links, and extra whitespace from the text. The cleaned text was then tokenized and count-vectorized, and a label encoder was applied[20][21]. Only Indian-language data, gathered from a variety of sources, was used. The class labels must be transformed with the label encoder[27].

3.1.3 Data Splitting
The data was split in a 90:10 ratio: 90% is used for training and 10% for evaluation. Holding out part of the data in this way helps to detect overfitting. A machine learning model overfits when it fits the training data so well that it cannot reliably fit new data[30]. The split is performed before the data is passed to an ML model[31].

3.1.4 Proposed Models
• Ensemble Voting Classifier
A voting classifier is a machine learning estimator that trains a number of base models, or estimators, and produces predictions by aggregating the output of each base estimator. The aggregation criterion is a combined decision over the votes of the estimators. There are two types of voting. In hard voting, the vote is based on the predicted output class. In soft voting, the vote is based on the predicted probability of each output class.
\[ \hat{y} = \arg\max_{i} \sum_{j=1}^{m} w_j \, \chi_A\big(C_j(x) = i\big) \] (1)
In the equation above, C_j is the j-th classifier, w_j is the weight associated with its prediction, and \chi_A is the indicator function, so the sum counts the weighted votes for class i. For soft voting, the indicator is replaced by the probability that classifier C_j assigns to class i. Ensemble learning is a general meta approach to machine learning that combines the predictions of different models to improve predictive performance. Although a seemingly unlimited number of ensembles can be constructed for a predictive modelling problem, three strategies dominate the field: bagging, stacking, and boosting. Each is less a single algorithm than a field of research that has produced many more specialised techniques. It is important to understand each technique and to consider it in a predictive modelling project[22].
• Ada Boost
AdaBoost is adaptive in the sense that subsequent weak learners are adjusted in favour of the instances that earlier classifiers misclassified. In some cases it may be less prone to overfitting than other learning algorithms. Even if the individual learners are weak, the final model can be shown to converge to a strong learner as long as each learner performs slightly better than random guessing. AdaBoost is typically used to combine weak base learners (such as decision stumps), but it has also been shown to be effective in combining strong base learners (such as deep decision trees), producing an even more accurate model.
• Support Vector Machine (SVM)
The Support Vector Machine (SVM) is a supervised machine learning technique that can be used for classification or regression problems. It is, however, most often applied to classification problems such as text categorisation. The SVM approach represents each data item as a point in n-dimensional space, where n is the number of features and the value of each feature is the value of a particular coordinate[23]. Classification is then performed by finding the optimal hyper-plane that best separates the two classes. Support vectors are simply the coordinates of individual observations, and the SVM classifier is the frontier (hyper-plane or line) that best separates the two classes.
• Multi-Layer Perceptron
The multi-layer perceptron (MLP) is an extension of the feed-forward neural network. It has three types of layers, as shown in Fig. 3: an input layer, an output layer, and a hidden layer. The input layer receives the signal to be processed. The output layer performs tasks such as classification and prediction. An arbitrary number of hidden layers between the input and output layers form the real computational engine of the MLP. As in any feed-forward network, data flows forward from the input layer to the output layer. The neurons of the MLP are trained with the backpropagation learning algorithm. Because MLPs can approximate any continuous function, they can solve problems that are not linearly separable. The main uses of MLPs are classification, recognition, forecasting, and pattern approximation.

IV. RESULT & DISCUSSION
The confusion matrix (CM) is a table that can be used to illustrate classification performance. It compares the classes predicted by the system with the actual class of each data instance. It is not a quantity that can be measured on its own, but instead depends on

the following four factors: TP, FP, TN, and FN, which are summarised below.
True Negative (TN): an outcome in which the model correctly predicts the absence of the target class.
True Positive (TP): an outcome in which the model correctly predicts the presence of the target class.
False Positive (FP): an outcome in which the model incorrectly predicts that the positive class is present.
False Negative (FN): an outcome in which the model incorrectly predicts the negative class, that is, it misses an actual positive.
1) Accuracy
Accuracy is the proportion of correctly classified data samples among the total number of data samples. It is obtained by dividing the number of correctly recognised samples, TP + TN (the main diagonal of the CM), by the total number of samples.
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \] (1)
2) Precision
Precision is determined by comparing the true positives (TP) with all instances predicted as positive (TP + FP): TP is divided by the sum of the two components, TP + FP.
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \] (2)
3) Recall or Sensitivity
Recall is the proportion of all actual positive instances that the model identified as positive. The denominator is therefore the total number of positive instances in the dataset (TP + FN). Put another way, it shows how many of the correct instances the model missed while finding the others.
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \] (3)
\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN} = \mathrm{Recall} \] (4)
4) F-score
The F-score combines precision and recall into a single value by taking their harmonic mean. It follows that both FN and FP are taken into account. It is computed as:
\[ F\text{-score} = \frac{2\,(\mathrm{Precision} \times \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}} \] (5)
5) Specificity
Specificity is the ratio of correctly identified negative instances to all negative instances. The denominator of this equation is the total number of negative instances in the dataset (TN + FP). It is similar to recall, the main distinction being that only negative instances are considered. An example is finding out how many healthy individuals, who do not have cancer, are correctly told that they show no signs of the disease. It is an evaluation measure of how well the classes are distinguished from one another.
\[ \mathrm{Specificity} = \frac{TN}{TN + FP} \] (6)
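The five measures defined above follow directly from the four confusion-matrix counts. The snippet below is a small sketch of those formulas; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F-score and specificity from the four counts."""
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # identical to sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f_score": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }

# Made-up counts: 8 true positives, 2 false positives,
# 9 true negatives, 1 false negative.
scores = binary_metrics(tp=8, fp=2, tn=9, fn=1)
```

For these counts the accuracy is 17/20 and the precision is 8/10, while recall (8/9) and specificity (9/11) differ because they are normalised by the actual positives and the actual negatives respectively.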
Table 1. Performance Evaluation of Machine Learning Models

Model      Accuracy%   F score%   Precision%   Recall%
SVM        89.9        89.9       89.9         89.9
MLP        98.6        98.6       98.6         98.6
AdaBoost   76.5        76.5       76.5         76.5


Table 1 shows the performance evaluation of the machine learning models. The Multi-Layer Perceptron obtained the best values for accuracy, F score, precision, and recall, with a score of 0.986, compared to the Support Vector Machine and AdaBoost.
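In Table 1 the four scores are identical within every row. One reading that would produce this pattern, offered here as an assumption because the averaging scheme is not stated, is micro-averaging over the language classes: in single-label multi-class classification every misclassified sample counts once as an FP (for the predicted class) and once as an FN (for the true class), so micro-averaged precision, recall, and F-score all equal accuracy. The sketch below demonstrates this on a made-up 4-class confusion matrix; the matrix values and the function name are hypothetical.

```python
def micro_scores(cm):
    """Micro-averaged scores from a square confusion matrix.

    cm[r][c] is the number of samples of true class r predicted as class c.
    """
    n = len(cm)
    tp = sum(cm[i][i] for i in range(n))
    # Every off-diagonal cell is a false positive for its column class...
    fp = sum(cm[r][c] for r in range(n) for c in range(n) if r != c)
    # ...and, at the same time, a false negative for its row class.
    fn = fp
    total = sum(sum(row) for row in cm)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": tp / total,
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }

# Made-up confusion matrix for four classes, for illustration only.
toy_cm = [
    [50, 1, 0, 0],
    [2, 40, 1, 0],
    [0, 0, 35, 1],
    [0, 1, 0, 6],
]
toy_scores = micro_scores(toy_cm)
```

With macro or weighted averaging the four columns would generally differ, so reporting the averaging scheme alongside such a table removes the ambiguity.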

Fig.3 Performance Evaluation of Models


Fig. 3 shows the performance assessment of the models, where accuracy is shown in blue, precision in green, recall in purple, and the F score in red. Compared to the Support Vector Machine and AdaBoost, the Multi-Layer Perceptron has the best accuracy, precision, recall, and F-measure.

Fig.4 Confusion Matrix of SVM

Fig.5 Confusion Matrix of MLP


Fig.6 Confusion Matrix of AdaBoost

Figs. 4 to 6 show the confusion matrices of the machine learning models, with the accuracy, the predicted values, and the actual values. AdaBoost and the Support Vector Machine are both less accurate than the Multi-Layer Perceptron, which has the highest accuracy at 0.987.

Table 2. Performance Evaluation of Ensemble Learning Voting Classifier

Model                 Accuracy%   Precision%   Recall%   Test Score%   F1 score%   Specificity%   Sensitivity%
MLP                   98.65       98.65        98.65     98.65         98.65       100            100
Naive Bayes           98.65       98.65        98.65     98.65         98.65       100            100
Logistic Regression   97.3        97.3         97.3      97.3          97.3        90             100
Ensemble              98.65       98.65        98.65     98.65         98.65       100            100

As shown in Table 2, the Multi-Layer Perceptron, Naive Bayes, Logistic Regression, and the ensemble learning voting classifier all perform well. In terms of accuracy, precision, recall, test score, and F1 score (0.9865), the MLP, Naive Bayes, and the Ensemble are the best.
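The Ensemble row combines its base models by voting. The difference between hard voting (weighted class votes) and soft voting (weighted class probabilities) can be sketched as follows. The function names, the three classifiers, and their probabilities are hypothetical and chosen only so that the two rules disagree; they are not outputs of the models in Table 2.

```python
def hard_vote(predictions, weights=None):
    """Weighted majority vote over predicted class labels."""
    weights = weights or [1.0] * len(predictions)
    tally = {}
    for label, w in zip(predictions, weights):
        tally[label] = tally.get(label, 0.0) + w
    # On a tie, the class that received a vote first wins.
    return max(tally, key=tally.get)

def soft_vote(probabilities, weights=None):
    """Pick the class with the largest weighted sum of predicted probabilities."""
    weights = weights or [1.0] * len(probabilities)
    classes = list(probabilities[0])
    score = {
        c: sum(w * p[c] for p, w in zip(probabilities, weights))
        for c in classes
    }
    return max(score, key=score.get)

# Made-up probability outputs of three base classifiers for one sample.
probs = [
    {"Hindi": 0.90, "Tamil": 0.10},
    {"Hindi": 0.40, "Tamil": 0.60},
    {"Hindi": 0.45, "Tamil": 0.55},
]
labels = [max(p, key=p.get) for p in probs]  # ["Hindi", "Tamil", "Tamil"]

hard_result = hard_vote(labels)   # two of three votes go to Tamil
soft_result = soft_vote(probs)    # Hindi sums to 1.75, Tamil to 1.25
```

Soft voting lets one confident classifier outweigh two uncertain ones, which is why it needs base models that output probabilities.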

Fig.7 Performance Evaluation of Ensemble learning voting classifier

Fig. 7 depicts the performance evaluation of the models, where accuracy is represented by blue, precision by red, recall by green, the F score by blue, the test score by purple, specificity by orange, and sensitivity by navy blue. The greatest accuracy, precision, recall, and F-measure come from the Multi-Layer Perceptron, Naive Bayes, and ensemble models.


Fig.8 Confusion Matrix of Ensemble learning Voting Classifier

Fig. 8 displays the confusion matrix of the ensemble learning voting classifier, together with the accuracy, the predicted values, and the actual values.

Fig.9 Extracted and Predicted Scripts
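The extract-then-classify flow behind Fig. 9 can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the trained voting classifier is replaced by a hypothetical stand-in that simply counts characters per Unicode block, the helper names are invented, and the language codes passed to EasyOCR in `demo` are a guess. The only EasyOCR calls used are `easyocr.Reader(...)` and `reader.readtext(..., detail=0)`.

```python
from collections import Counter

# Unicode blocks of the four scripts; Devanagari is mapped to "Hindi" here
# because Hindi is the only Devanagari language in this four-class setting.
SCRIPT_BLOCKS = {
    "Hindi": (0x0900, 0x097F),
    "Tamil": (0x0B80, 0x0BFF),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}

def script_by_unicode_block(text):
    """Stand-in classifier: the script whose block covers the most characters."""
    counts = Counter()
    for ch in text:
        for name, (lo, hi) in SCRIPT_BLOCKS.items():
            if lo <= ord(ch) <= hi:
                counts[name] += 1
    return counts.most_common(1)[0][0] if counts else "Unknown"

def extract_text(reader, image_path):
    """Join all text fragments that `reader` finds in the image."""
    # detail=0 asks EasyOCR for plain strings instead of
    # (bounding box, text, confidence) tuples.
    return " ".join(reader.readtext(image_path, detail=0))

def identify_script(reader, classify, image_path):
    """Run OCR, then pass the extracted string to a text classifier."""
    text = extract_text(reader, image_path)
    return text, classify(text)

def demo(image_path):
    """Not called here: requires the easyocr package and downloads model weights."""
    import easyocr
    reader = easyocr.Reader(["hi", "en"])  # language codes are an assumption
    return identify_script(reader, script_by_unicode_block, image_path)
```

Passing the reader and the classifier in as arguments keeps the OCR step and the language model independent, so the stand-in can be swapped for a trained model that exposes a text-in, label-out function.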

Fig. 9 shows the extracted and predicted scripts for Hindi, Malayalam, and Kannada.

V. CONCLUSION
Despite decades of research on offline Indic script recognition, it is still difficult to recognise handwritten characters and numerals. This is due to the strong visual resemblance between characters and the widespread structural similarity of the Indic scripts. Machine learning-based techniques have attained state-of-the-art results in handwritten Indic script identification, comparable to other computer vision tasks, even though the problem is still fairly recent. However, creating a handcrafted machine learning model that is effective for different Indian languages takes a lot of trial and error and in-depth familiarity with the problem. The search was streamlined by using an evolving meta-heuristics strategy, and a solution was found. Focusing on Hindi, Malayalam, Kannada, and Tamil, this work proposed different models, including an ensemble learning voting classifier, a multi-layer perceptron, and a support vector machine, which reach accuracies of 98.6% and 89.9% with a 100% detection and text extraction rate, to detect the languages present in images using the EasyOcr library. This contrasts with previous studies, which prioritised particular languages rather than considering Hindi, Malayalam, Kannada, and Tamil together.

References
[1] N. Saqib, K. F. Haque, V. P. Yanambaka,

and A. Abdelgawad, “Convolutional-Neural-Network-Based Handwritten Character Recognition: An Approach with Massive Multisource Data,” Algorithms, vol. 15, no. 4, 2022, doi: 10.3390/a15040129.
[2] B. Muthusamy, “Deep Learning in Text Recognition and Text Detection : A Review,” Int. Res. J. Eng. Technol., pp. 9–24, 2022, [Online]. Available: www.irjet.net
[3] P. P. Nair, A. James, and C. Saravanan, “Malayalam handwritten character recognition using convolutional neural network,” Proc. Int. Conf. Inven. Commun. Comput. Technol. ICICCT 2017, vol. 15, no. 9, pp. 278–281, 2017, doi: 10.1109/ICICCT.2017.7975203.
[4] M. Sonkusare and N. Sahu, “A Survey on Handwritten Character Recognition (HCR) Techniques for English Alphabets,” Adv. Vis. Comput. An Int. J., vol. 3, no. 1, pp. 1–12, 2016, doi: 10.5121/avc.2016.3101.
[5] M. S. Tej, T. V. Saradhi, M. Spandana, and V. Savya, “Hand Witten Text Recognition using Deep Learning,” Int. J. Res. Appl. Sci. Eng. Technol., vol. 10, no. 4, pp. 84–89, 2022, doi: 10.22214/ijraset.2022.41156.
[6] G. Elizabeth Rani, M. Sakthimohan, G. Abhigna Reddy, D. Selvalakshmi, T. Keerthi, and R. Raja Sekar, “MNIST Handwritten Digit Recognition using Machine Learning,” 2022 2nd Int. Conf. Adv. Comput. Innov. Technol. Eng. ICACITE 2022, vol. 03, no. 003, pp. 768–772, 2022, doi: 10.1109/ICACITE53722.2022.9823806.
[7] A. Shrivastava, I. Jaggi, S. Gupta, and D. Gupta, “Handwritten Digit Recognition Using Machine Learning: A Review,” 2019 2nd Int. Conf. Power Energy Environ. Intell. Control. PEEIC 2019, vol. 8, no. 12, pp. 322–326, 2019, doi: 10.1109/PEEIC47157.2019.8976601.
[8] I. Lopez-moreno, J. Gonzalez-dominguez, O. Plchot, D. Martinez, J. Gonzalez-rodriguez, and P. Moreno, “Google Inc ., New York , USA ATVS-Biometric Recognition Group , Universidad Autonoma de Madrid , Spain Brno University of Technology , Czech Republic Aragon Institute for Engineering Research ( I3A ), University of Zaragoza , Spain,” pp. 0–4, 2014.
[9] R. Ahmed et al., “Deep neural network-based contextual recognition of arabic handwritten scripts,” Entropy, vol. 23, no. 3, pp. 4–6, 2021, doi: 10.3390/e23030340.
[10] G. R. Hemanth, M. Jayasree, S. K. Venii, P. Akshaya, and R. Saranya, “Cnn-Rnn Based Handwritten Text Recognition,” Ictact J. Soft Comput., vol. 6956, no. October, p. 1, 2021, doi: 10.21917/ijsc.2021.0351.
[11] S. Kaur, S. Bawa, and R. Kumar, “Script Identification in Handwritten Documents for Gurumukhi-Latin Script Using Transfer Learning with Deep and Shallow Classifiers,” 2021, [Online]. Available: https://www.researchsquare.com/article/rs-695509/latest.pdf
[12] N. Padmaja, B. N. S. Raja, and B. P. Kumar, “Real time sign language detection system using deep learning techniques,” J. Pharm. Negat. Results, vol. 13, no. 1, pp. 1052–1059, 2022, doi: 10.47750/pnr.2022.13.S01.126.
[13] G. S. Bhati and A. R. Garg, “Handwritten Devanagari Character Recognition Using CNN with Transfer Learning,” pp. 269–279, 2021, doi: 10.1007/978-981-33-6984-9_22.
[14] A. Cheikhrouhou, Y. Kessentini, and S. Kanoun, “Multi-task learning for simultaneous script identification and keyword spotting in document images,” Pattern Recognit., vol. 113, 2021, doi: 10.1016/j.patcog.2021.107832.
[15] A. Khalil, M. Jarrah, M. Al-Ayyoub, and Y. Jararweh, “Text detection and script identification in natural scene images using deep learning,” Comput. Electr. Eng., vol. 91, no. February, p. 107043, 2021, doi: 10.1016/j.compeleceng.2021.107043.
[16] M. Feurer and F. Hutter, “Hyperparameter Optimization,” vol. 2021, pp. 3–33, 2019, doi: 10.1007/978-3-030-05318-5_1.
[17] S. Aniket, R. Atharva, C. Prabha, D. Rupali, and P. Shubham, “Handwritten Gujarati script recognition with image processing and deep learning,” 2019 Int. Conf. Nascent Technol. Eng. ICNTE 2019 - Proc., no. Icnte, pp. 1–4, 2019, doi: 10.1109/ICNTE44896.2019.8946074.
[18] L. Gomez, A. Nicolaou, and D. Karatzas, “Improving patch-based scene text script identification with ensembles of conjoined networks,” Pattern Recognit., vol. 67, pp. 85–96, 2017, doi: 10.1016/j.patcog.2017.01.032.
[19] A. Bhat, V. Yadav, V. Dargan, and Yash, “Sign Language to Text Conversion using Deep Learning,” 2022 3rd Int. Conf. Emerg. Technol. INCET 2022, pp. 4036–4044, 2022, doi: 10.1109/INCET54531.2022.9824885.
[20] M. Das, M. Panda, and S. Dash, “Enhancing the Power of CNN Using Data Augmentation Techniques for Odia Handwritten Character Recognition,” Adv. Multimed., vol. 2022, 2022, doi: 10.1155/2022/6180701.
[21] A. AYVACI ERDOĞAN and A. E. TÜMER, “Deep Learning Method for Handwriting Recognition,” MANAS J. Eng., vol. 9, no. 1, pp. 85–92, 2021, doi: 10.51354/mjen.852312.
[22] A. Asokan and S. N Unnithan, “Offline Recognition of Malayalam and Kannada Handwritten Documents Using Deep Learning,” Int. J. Comput. Commun. Informatics, vol. 3, no. 2, pp. 12–24, 2021, doi: 10.34256/ijcci2122.
[23] B. Jose and K. P. Pushpalatha, “Intelligent Handwritten Character Recognition For Malayalam Scripts Using Deep Learning Approach,” IOP Conf. Ser. Mater. Sci. Eng., vol. 1085, no. 1, p. 012022, 2021, doi: 10.1088/1757-899x/1085/1/012022.
[24] D. C. Cireşan, U. Meier, L. M. Gambardella, and J. Schmidhuber, “Convolutional neural network committees for handwritten character classification,” Proc. Int. Conf. Doc. Anal. Recognition, ICDAR, vol. 10, pp. 1135–1139, 2011, doi: 10.1109/ICDAR.2011.229.
[25] A. Mirza and I. Siddiqi, “Recognition of cursive video text using a deep learning framework,” IET Image Process., vol. 14, no. 14, pp. 3444–3455, 2020, doi: 10.1049/iet-ipr.2019.1070.
[26] S. Aqab and M. U. Tariq, “Handwriting recognition using artificial intelligence neural network and image processing,” Int. J. Adv. Comput. Sci. Appl., vol. 11, no. 7, pp. 137–146, 2020, doi: 10.14569/IJACSA.2020.0110719.
[27] P. Thangamariappan and D. J. . Miraclin Joyce Pamila, “Handwritten Recognition By Using Machine Learning Approach,” Int. J. Eng. Appl. Sci. Technol., vol. 04, no. 11, pp. 564–567, 2020, doi: 10.33564/ijeast.2020.v04i11.099.
[28] P. Sujatha and D. Lalitha Bhaskari, “Telugu and hindi script recognition using deep learning techniques,” Int. J. Innov. Technol. Explor. Eng., vol. 8, no. 11, pp. 1758–1764, 2019, doi: 10.35940/ijitee.K1755.0981119.
[29] C. Science and K. Dutta, “Handwritten Word Recognition for Indic & Latin scripts using Deep CNN-RNN Hybrid Networks,” no. March, 2019.
[30] S. Susan and J. Malhotra, “Recognising devanagari script by deep structure learning of image quadrants,” DESIDOC J. Libr. Inf. Technol., vol. 40, no. 5, pp. 268–271, 2020, doi: 10.14429/djlit.40.5.16336.
[31] S. Ali, Z. Shaukat, M. Azeem, Z. Sakhawat, T. Mahmood, and K. ur Rehman, “An efficient and improved scheme for handwritten digit recognition based on convolutional neural network,” SN Appl. Sci., vol. 1, no. 9, pp. 1–9, 2019, doi: 10.1007/s42452-019-1161-5.

eISSN1303-5150 www.neuroquantology.com
Reproduced with permission of copyright owner. Further reproduction
prohibited without permission.
