A Compact Deep Learning Model For Khmer Handwritten Text Recognition
Corresponding Author:
Norliza Mohd Noor
Department of Engineering, Razak Faculty of Technology and Informatics
Universiti Teknologi Malaysia Kuala Lumpur Campus
Jalan Sultan Yahya Petra, 54100 Kuala Lumpur, Malaysia
Email: [email protected]
1. INTRODUCTION
Khmer is the official language of Cambodia, spoken by about 16 million people. It has an
alphasyllabary (abugida) writing system: words are composed of syllables, most of which consist of a base
consonant and additional signs for vowels. The modern Khmer alphabet consists of 33 consonants.
There is great demand for a recognition system that reflects the specifics of Khmer writing, due to the
constant accumulation of documents in spheres such as government, healthcare, finance, and education. Until the
early 2000s, most records in the government and private sectors in Cambodia were kept as handwritten
documents and hand-filled forms. One has to browse manually through the entire mass of paper to reach any
of these records. The bulk of such tasks is extremely laborious to carry out on a daily basis, even with the help of a
systematic archiving system. An effective deep learning [1] application for digitizing handwritten text
is particularly important for promoting the development of public and private services. Such an application
also needs to be inexpensive and applicable in developing economies.
In contrast to other common writing systems, there is very little research on
Khmer text recognition. Most of the effort has been made only within the past decade [2]-[8]. Sok and
Taing [3] and Srun and Vishnyakov [4], [5], [7] studied recognition of printed Khmer text. Ye et al. [8]
developed an online recognition method for printed text in the Khmer, Bangla, and Myanmar alphabets. The
amount of work in the field, as well as the nature of the data collected for the relevant experiments, describes the
current state of the art for Khmer handwritten text recognition (HTR). Most of the data used in past
experiments was printed (machine-generated) text, which greatly impedes the development of an accurate
application.
As an extension of previous experiments [9], [10], the current work implements convolutional neural
networks (CNNs) for Khmer HTR. A novel, compact model, 2+1CNN, is proposed and used alongside models from
the literature (LeNet-5 [1], AlexNet [11], visual geometry group 16 (VGG16), VGG19 [12], and ResNet [24]).
2+1CNN is designed for binary classification, while the existing models were optimized accordingly because of the
one-against-all tactic adopted throughout the work.
To increase overall performance, an independent network was trained and evaluated for each class.
While training each network, one particular class was taken as "positive" and all others as "negative". That
is, given a set of classes $C = \{c_1, c_2, \dots, c_k\}$, the samples of class $c_j$ were isolated and all samples of the other
classes $c_1, \dots, c_{j-1}, c_{j+1}, \dots, c_k$ were considered as "not $c_j$" (or $c_j'$). Training a CNN
with this setting yielded a classifier model $F_j(\cdot)$. The output of the training process was the combination of all
trained classifiers:

$$F(\cdot) = \{F_1(\cdot), F_2(\cdot), \dots, F_k(\cdot)\} \tag{1}$$
Intuitively, the final model was designed to iterate the question "Are you of class $c_j$?" instead of
asking directly "What class are you?" This work aims to design a compact model for a Khmer HTR system.
The lack of appropriate datasets contributes to the difficulty of the task; only the datasets collected in the
preliminary experiments [9], [10] were used.
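The one-against-all setup can be summarized with a short sketch. The helper names below are illustrative assumptions rather than the authors' actual training code, and the trained classifiers are treated as callables that return a score for the "positive" class:

```python
# Hedged sketch of the one-against-all tactic: one binary classifier F_j per
# class, combined at prediction time by taking the most confident answer.
import numpy as np

def binarize_labels(labels, positive_class):
    """Relabel for training F_j: 1 for samples of class c_j, 0 for c_j'."""
    return (np.asarray(labels) == positive_class).astype(np.int32)

def combined_predict(classifiers, sample):
    """Ask each trained F_j 'Are you of class c_j?' and return the index of
    the class whose network answers with the highest score."""
    scores = [f_j(sample) for f_j in classifiers]
    return int(np.argmax(scores))
```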
2. RELATED WORK
2.1. Recognition of Khmer handwriting
Meng and Morariu [2] described how to combine a feedforward artificial neural network (ANN) with
a self-organizing map (SOM) to design a recognition system for printed Khmer characters. Sok and Taing [3]
described their experiment with support vector machines (SVM) on printed Khmer characters; accuracy as a
function of font size and CPU load were presented as an efficiency assessment. The authors also listed the scarce
work done towards Khmer optical character recognition (OCR) to emphasize the lack of research on the Khmer
language. Backpropagation was used by Srun [4] to train a classifier to recognize Khmer characters; for these
experiments, Srun sampled printed text, and preprocessing consisted of resizing images to standard dimensions.
Thumwarin et al. [6] implemented finite impulse response (FIR) filtering to extract features from handwritten
Khmer characters and passed the results to a Euclidean-distance classifier. That work relies on temporal
information, which is impossible to collect from a scanned image of a manuscript; another problem is that the
method requires extra hardware for collecting the temporal information. Further work by Srun and
Vishnyakov [7] covered the implementation of classifiers in TESSERACT and further improvement of the
recognition quality of scanned characters. The earliest mention of Khmer HTR in a computerized setting dates
to 2008, in the work by Ye et al. [8], which proposed a recognition system for scripts such as Myanmar, Khmer,
and Bangla. Their research data was collected by drawing characters with a mouse, which is also a drawback of
that work. Unlike many previous attempts, the data used in the current work reflects the nature of common
handwriting, which makes the resultant models more realistic. The Khmer datasets acquired in previous attempts
are compared in Table 1.
2.2. Convolutional neural networks
CNN architectures differ in the number and arrangement of layers such as convolution, activation,
and pooling. CNNs also differ from each other in the method and objective of training, e.g., prediction, object
discovery, and segmentation.
According to LeCun [1], [16], the CNN is a variation of the multilayer perceptron that requires minimal
preprocessing. The connectivity pattern between neurons in a CNN is inspired by the biological
processes of the animal visual cortex, where each cortical neuron responds to signals from only a restricted
area of the visual field (the receptive field). Matsugu et al. [17] described how the receptive fields connected to
different neurons partially overlap; this leads to the entire visual field being covered and, therefore, to
smooth vision.
Figure 1 shows an example of a three-dimensional neuron arrangement in a convolutional neural
network. Every layer takes a three-channel image, where each pixel has a separate value for the red, green, and
blue components; the image is split to form output in the form of a 3D matrix of neurons. The data used in this
study was preprocessed into grayscale images.
The convolution operation is performed on the input data; this step models the response of an
individual biological neuron to visual input. The activation step applies a transformation to the output of each
neuron by using activation functions. The rectified linear unit (ReLU) is an example of a commonly used
activation function: it passes the output of a neuron through unchanged if it is positive and maps it to zero if it is
negative.
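In symbols, the rectifier applied to each neuron output x is:

$$\mathrm{ReLU}(x) = \max(0, x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$$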
The output of the activation step can be further transformed by applying a pooling step. Pooling
reduces the dimensionality of the feature map by condensing the output of small regions of neurons into a
single output. This helps to simplify the subsequent layers and reduces the number of parameters that the
model needs to learn. CNN layers are configured by these three concepts. A CNN can have tens or hundreds
of hidden layers, each of which learns to detect different features of an image. Every hidden layer increases the
complexity of the learned image features: for example, the first hidden layer learns to detect edges, while the
last layer learns to detect more complex shapes.
In a CNN, inputs from a small local receptive field (LRF) are connected to a single neuron of the hidden
layer. The LRF is translated across the image to create a feature map from the input layer for use in the hidden
layers; convolutions are used to implement this process efficiently [19]. A convolution operation is applied
to the input of each layer, mimicking the reaction of neurons to visual input. The CNN architecture
also includes pooling layers, which group the outputs of one layer into a single neuron in the next
layer [11], [20]. The clusters of neurons are designed as square batches of size $n \times n$,
where $n = 2, 3, 4, \dots$
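As a concrete illustration of how a pooling batch condenses a region of neurons into one output, the minimal NumPy sketch below applies 2×2 max pooling in the simplest non-overlapping case (n = 2 with stride 2); the input values are arbitrary:

```python
# Minimal illustration of 2x2 max pooling: each square batch of neurons
# is condensed into a single output (non-overlapping case, stride 2).
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [3, 1, 4, 6]])

# Reshape the 4x4 map into 2x2 blocks and keep the maximum of each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4]
               #  [7 9]]
```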
In some cases, pooling batches need to be moved beyond the boundaries of a sample image, which
may cause ambiguity in the training process as well as computational and programmatic complexity.
Extending the image by several rows and columns of pixels to match the size of the pooling batches (padding)
helps to overcome this problem. The values used for the extra pixels may be chosen differently: the average
over the spectrum of pixel values (average padding) or zeros (zero padding). Denoting the filter size as $F$, the
input size as $W$, the resulting image size as $R$, the padding size as $P$, and the stride size as $S$, the size of
the sample after each pooling layer is given by (2), which applies to each of the two dimensions separately:
$$R = \begin{cases} \dfrac{W + 2P - F}{S}, & \text{if } S \mid (W + 2P - F) \\[6pt] \left\lfloor \dfrac{W + 2P - F}{S} \right\rfloor + 1, & \text{otherwise} \end{cases} \tag{2}$$
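For a quick sanity check, (2) can be evaluated with a few lines of code; the helper below is a sketch that implements the piecewise rule exactly as stated:

```python
# Output size after one convolution/pooling layer, per the piecewise rule (2):
# R = (W + 2P - F) / S when S divides (W + 2P - F), otherwise floor(...) + 1.
def output_size(W, F, P=0, S=1):
    span = W + 2 * P - F
    return span // S if span % S == 0 else span // S + 1

# Example: a 224-pixel input with a 5x5 filter, no padding, and stride 1
print(output_size(224, F=5, P=0, S=1))  # 219
```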
3. RESEARCH METHOD
Current experiments were based on the same data set and most preprocessing steps [9], [10]. Later,
the potential to highly increase the recognition rate of neural networks was explored [21]. Figure 2 shows the
development of the Khmer HTR framework. Data collection and preliminary experiments were completed in
our previous work [9], [10]. In preliminary experiments, the number of features was reduced by 90% using
three independent methods: correlation-based feature selection (CORR), two-dimensional Fourier transform
(FT2D, and Gabor filters (GF). The result of each method was classified with an artificial neural network
(ANN). The original data, without feature space transformation, was classified for comparison of
performance. Gabor Filters yielded the highest improvement in recognition. Such a fact suggested that filters
may play an important role in feature extraction. The current study is based on convolutional models, which
rely on a wider variety of filters. In the course of current work, Models LeNet-5, AlexNet, VGG16, VGG19,
ResNet50 have been modified for binary classification.
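For context on the Gabor filters mentioned above, the sketch below shows one plausible way to extract Gabor-filter features with OpenCV; the kernel parameters (size, sigma, wavelength, and orientations) are illustrative assumptions, not the values used in [10]:

```python
# Hedged sketch: Gabor-filter feature extraction for a grayscale character
# image. All kernel parameters here are assumed for illustration only.
import cv2
import numpy as np

def gabor_features(gray, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Filter the image with a small bank of oriented Gabor kernels and
    stack the responses as a feature volume."""
    responses = []
    for theta in thetas:
        kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    return np.stack(responses, axis=-1)
```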
To prevent overfitting, 50% of the nodes in the fully connected layer are dropped out at random.
ReLU is used as the activation function because it is simple to differentiate and behaves similarly to other
common activation functions. The hyper-parameters used in 2+1CNN are as follows (a code sketch follows the
list):
− Input images are pre-processed and resized to 224×224.
− First convolutional layer with ReLU as the activation function, 5×5 filters, and stride size 1.
− First pooling layer with 2×2 filters and stride size 1.
− Second convolutional layer with ReLU activation, 5×5 filters, and stride size 2.
− Second pooling layer with 2×2 filters and stride size 1.
− The dropout stage randomly erases 50% of the perceptrons to reduce overfitting.
− The fully connected layer is made of 463 perceptrons with the ReLU activation function. This number
is the average of the number of features and the number of samples, i.e., (number of features + number of
samples) / 2.
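The following Keras sketch assembles the layers listed above. The filter counts (32 and 64), the optimizer, and the loss are assumptions made for illustration; the list above does not specify them:

```python
# A minimal sketch of 2+1CNN following the listed hyper-parameters.
# Assumed (not stated in the list): 32/64 filters, Adam, binary cross-entropy.
from tensorflow import keras
from tensorflow.keras import layers

def build_2plus1cnn(input_shape=(224, 224, 1)):
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=5, strides=1, activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=1),
        layers.Conv2D(64, kernel_size=5, strides=2, activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=1),
        layers.Flatten(),
        layers.Dropout(0.5),                    # erase 50% of the perceptrons
        layers.Dense(463, activation="relu"),   # fully connected layer
        layers.Dense(1, activation="sigmoid"),  # binary output: c_j vs. c_j'
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```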
Table 2 illustrates the structure of 2+1CNN. The values of R, W, P, and F were obtained per (2). Filter
sizes were chosen to minimize the number of computations required during model training. Figure 3 visualizes a
sample as it traverses each layer of 2+1CNN; the represented layers are input, convolution, pooling, convolution,
and pooling. All other models used in this work (LeNet, AlexNet, VGG16, VGG19, and ResNet) were also
modified so that the number of output classes was reduced to two. This modification implements binary
classification under the adopted one-against-all tactic. While 2+1CNN was built from the ground up, transfer
learning was used to retrain the state-of-the-art models on Khmer samples. Because of the limited processing
power available and the large amount of data, training of all classifiers was limited to 500 iterations.
4. RESULTS AND DISCUSSION
Table 3 compares the hardware used in previous experiments to that of the current work. The overall
comparison of the models is given in Table 4. Table 5 compares the current work against previous attempts; it
highlights the progress in the field of handwritten text recognition for abugida writing systems, including Khmer.
In previous attempts, data was collected either by scanning printed text or by drawing with a computer mouse,
which makes it difficult to represent common handwriting. The results of the current HTR task were achieved
on a hardware system of lesser specifications.
5. CONCLUSION
This work aimed to develop a compact and effective model for offline recognition of Khmer
handwritten characters. In general, recognition rates came out at 93-98%. The 2+1CNN model was built from
the ground up and performed at over 94%, which is at the same level as other, more sophisticated models.
The results also help to close the research gap in the field since, at the time of the experiments, Khmer
HTR had not yet been approached with deep learning. The main contribution is a compact Khmer HTR
model (2+1CNN) with low computational requirements, which is based on open-source software and does
not require any proprietary packages. These aspects ease its implementation, therefore allowing swift
digitization of document corpora in rural and developing areas. The developed models may be applied in an
end-user OCR application targeted at the general public, as well as used as the back-end of more sophisticated
applications aiming to digitize documents. Further work may include recognition based on information about
the layout of documents, forms, and tables.
ACKNOWLEDGEMENTS
This work was partially funded by Universiti Teknologi Malaysia and the Ministry of Higher
Education Malaysia.
REFERENCES
[1] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition,"
in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.
[2] H. Meng and D. Morariu, "Khmer character recognition using artificial neural network," in Signal and Information
Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, 2014, pp. 1-8, doi:
10.1109/APSIPA.2014.7041824.
[3] P. Sok and N. Taing, "Support Vector Machine (SVM) based classifier for Khmer Printed Character-set
Recognition," in Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014
Asia-Pacific, 2014, pp. 1-9, doi: 10.1109/APSIPA.2014.7041823.
[4] S. Srun, "Applying Backpropagation for Khmer Printing Character Recognition," Proceedings of Japan-Cambodia
Joint Symposium on Information Systems and Communication Technology 2011, Phnom Penh, 2011, pp. 135-136.
[5] S. Srun and U. Vishnyakov, "An Approach for Quality Enhancement of the Text Recognition," Intellectual CAD,
vol. 4, 2009.
[6] P. Thumwarin, S. Khem, K. Janchitraponvej, and T. Matsuura, "On-line writer dependent character recognition for
Khmer based on FIR system characterizing handwriting motion," 2008 SICE Annual Conference, 2008, pp. 73-78,
doi: 10.1109/SICE.2008.4654625.
[7] S. Srun, "Applying Tesseract for Khmer Optical Character Recognition," in ASEAN-UEC Symposium, 2015.
Y. K. Thu, O. Phavy, and Y. Urano, "Positional gesture for advanced smart terminals: Simple gesture text input for
syllabic scripts like Myanmar, Khmer and Bangla," in 2008 First ITU-T Kaleidoscope Academic Conference -
Innovations in NGN: Future Network and Services, 2008, pp. 77-84, doi: 10.1109/KINGN.2008.4542252.
[9] B. Annanurov and N. M. Noor, "Handwritten Khmer text recognition," in 2016 IEEE International WIE
Conference on Electrical and Computer Engineering (WIECON-ECE), 2016, pp. 176-179, doi: 10.1109/WIECON-
ECE.2016.8009112.
[10] B. Annanurov and N. M. Noor, "Feature selection for Khmer handwritten text recognition," in 2017 IEEE
Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), 2017, pp. 626-
630, doi: 10.1109/EIConRus.2017.7910634.
[11] A. Krizhevsky, I. Sutskever, and G.E. Hinton, "ImageNet classification with deep convolutional neural networks,"
in Advances in Neural Information Processing Systems, 2012.
[12] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition,"
arXiv:1409.1556, 2014.
[13] Z. Y. He, “A New Feature Fusion Method for Handwritten Character Recognition Based on 3D
Accelerometer,” Applied Mechanics and Materials, vol. 44-47, pp. 1583–1587, 2010, doi:
10.4028/www.scientific.net/AMM.44-47.1583.
[14] V. Kruy and W. Kameyama, “Preliminary Experiment on Khmer OCR,” 8th International Conference of Frontiers
of Information Technology, 2010.
[15] S. Kheang, K. Katsurada, Y. Iribe, and T. Nitta, “Solving the Phoneme Conflict in Grapheme-to-Phoneme
Conversion Using a Two-Stage Neural Network-Based Approach,” IEICE Transactions on Information and
Systems, vol. E97.D, no. 4, pp. 901–910, 2014, doi: 10.1587/transinf.E97.D.901.
[16] Y. LeCun, "Deep learning & convolutional networks," in 2015 IEEE Hot Chips 27 Symposium (HCS), 2015, pp. 1-
95, doi: 10.1109/HOTCHIPS.2015.7477328.
[17] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust
face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555-559, 2003, doi:
10.1016/S0893-6080(03)00115-1.
[18] A. Karpathy, "Connecting images and natural language," Ph.D. thesis, Dept. of Computer Science, Stanford
University, 2016.
[19] K. Gregor and Y. LeCun, "Emergence of Complex-Like Cells in a Temporal Product Network with Local
Receptive Fields," arXiv:1006.0448, 2010.
[20] D. C. Cireşan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Flexible, high performance
convolutional neural networks for image classification," in IJCAI International Joint Conference on Artificial
Intelligence, 2011, doi: 10.5591/978-1-57735-516-8/IJCAI11-210.
[21] B. Annanurov and N.M. Noor, "Khmer handwritten text recognition with convolution neural networks," ARPN
Journal of Engineering and Applied Sciences, vol. 13, no. 22, pp. 8828-8833, 2018.
[22] R. Venkatesan and M. J. Er, “A novel progressive learning technique for multi-class classification,”
Neurocomputing, vol. 207, pp. 310–321, 2016, doi: 10.1016/j.neucom.2016.05.006.
[23] A. Krizhevsky, "Convolutional deep belief networks on cifar-10," in Unpublished manuscript, U.o. Toronto, Editor.
2010, Available: https://www.cs.toronto.edu/~kriz/conv-cifar10-aug2010.pdf.
[24] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in 2016 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778, doi: 10.1109/CVPR.2016.90.
[25] J. Sueiras, V. Ruiz, A. Sanchez, and J. F. Velez, “Offline continuous handwriting recognition using sequence to
sequence neural networks,” Neurocomputing, vol. 289, pp. 119–128, 2018, doi: 10.1016/j.neucom.2018.02.008.
[26] M. Al Rabbani Alif, S. Ahmed, and M. A. Hasan, "Isolated Bangla handwritten character recognition with
convolutional neural network," in 2017 20th International Conference of Computer and Information Technology
(ICCIT), 2017, pp. 1-6, doi: 10.1109/ICCITECHN.2017.8281823.
[27] R. Zhang, Q. Wang, and Y. Lu, "Combination of ResNet and Center Loss Based Metric Learning for Handwritten
Chinese Character Recognition," in 2017 14th IAPR International Conference on Document Analysis and
Recognition (ICDAR), 2017, pp. 25-29, doi: 10.1109/ICDAR.2017.324.
[28] K. R. Ayyalasomayajula, F. Malmberg, and A. Brun, “PDNet: Semantic segmentation integrated with a primal-dual
network for document binarization,” Pattern Recognition Letters, vol. 121, pp. 52–60, 2019, doi:
10.1016/j.patrec.2018.05.011.
[29] D. Soselia, M. Tsintsadze, L. Shugliashvili, I. Koberidze, S. Amashukeli, and S. Jijavadze, “On Georgian
Handwritten Character Recognition,” IFAC-PapersOnLine, vol. 51, no. 30, pp. 161–165, 2018, doi:
10.1016/j.ifacol.2018.11.279.
[30] R. Sabzi et al., "Recognizing Persian handwritten words using deep convolutional networks," in 2017 Artificial
Intelligence and Signal Processing Conference (AISP), 2017, pp. 85-90, doi: 10.1109/AISP.2017.8324114.
BIOGRAPHIES OF AUTHORS
Dr. Bayram Annanurov completed his Ph.D. at the Universiti Teknologi Malaysia in 2016.
His main research area is Deep Learning. He is currently teaching programming and
optimization at Paragon International University in Phnom Penh, Cambodia.
Dr. Norliza Mohd Noor is a professor at Razak Faculty of Technology and Informatics,
Universiti Teknologi Malaysia, Kuala Lumpur Campus. Her research areas are image analysis
and machine learning.