The State of the Art Handwritten Recognition of Arabic Script Using Simplified Fuzzy ARTMAP and Hidden Markov Models
3 authors:
Nikhat Akhtar
Department of Information Technology (IT), Goel Institute of Technology & Manage…
All content following this page was uploaded by Nikhat Akhtar on 04 February 2015.
Abstract– In this paper, we present the recognition of handwritten characters of Arabic script. Arabic is now the 6th most spoken language in the world and is spoken by more than 200 million people worldwide. In the 7th century A.D., Arabic started to spread across the Middle East as many people converted to Islam. During this time of religious conversions, Arabic replaced many South Arabian languages, most of which are no longer commonly spoken or understood. The challenges in Arabic handwritten character recognition lie chiefly in the variation and distortion of handwritten characters, since different writers may use a different style of handwriting and a different direction to draw the same shape of the characters of the Arabic script. Although various new trends and technologies keep emerging, handwriting still plays an important role. There are different strategies for recognizing Arabic handwritten data, such as Simplified Fuzzy ARTMAP and Hidden Markov Models (HMM). In this paper, we use Simplified Fuzzy ARTMAP, which is an updated version of Predictive Adaptive Resonance Theory. It can also adjust clusters as the Arabic script requires, which helps to mitigate noise. We have tested our method on Arabic script and obtained encouraging results from the proposed technique.

Index Terms– Hidden Markov Model (HMM), Arabic Script, Handwriting, Fuzzy ARTMAP, Recognition and Feature Extractor

I. INTRODUCTION

ARABIC is the official language of many countries in the Middle East such as Saudi Arabia, Jordan, Lebanon, Libya, Egypt, Iraq, Morocco, and Sudan. It is also one of the six official languages of the United Nations. Arabic is a "Semitic" language and is most closely related to Aramaic and Hebrew. Semitic languages are based on a consonantal root system. Every word in Arabic is derived from one or another root word [1]. By the 7th century A.D., Arabic started to spread across the Middle East as many people converted to Islam. During this time of religious conversions, Arabic replaced many South Arabian languages, most of which are no longer commonly [2] spoken or understood. There are three forms of Arabic: Qur'anical Arabic, Modern Standard Arabic, and Colloquial Arabic. Qur'anical Arabic is not used in conversation or in non-religious writing, and Modern Standard Arabic is the official language of the Arab world. Colloquial Arabic refers to Arabic that is spoken as a dialect [3].

The modern Arabic writing system runs from right to left and is a cursive script [4]. There are twenty-eight letters in the alphabet, but because the script is cursive, 22 of the letters take different shapes when they are in initial, medial, final, or isolated positions [5]. Six letters in the alphabet have only two possible forms because other letters can connect to them, but they cannot be connected from. The three long vowels are represented within the alphabet. However, the three short vowels are not. Short vowels can be indicated by optional diacritical markings [6], but these are most often not written. Texts in which they are written are usually of a religious nature, and the markings are included to ensure the proper pronunciation of all the words. Note that Arabic is particularly rich in uvular, pharyngeal, and pharyngealized ("emphatic") sounds.

Arabic handwritten script recognition is an open field of research with considerable scope for development. Several models have already been applied to handwritten character recognition systems, including framework-based models, support vector machines, stochastic models, and learning-based models. In this paper we use Hidden Markov Models because they are mainly sequence [7] classifiers and are frequently used for recognition of Arabic handwritten script. They are stochastic models and can cope with noise and with variations in Arabic handwriting.

A new approach has been proposed to improve the performance of the simplified fuzzy ARTMAP neural network [8] for character recognition of handwritten Arabic script. A few fuzzy values are used for similar Arabic characters to improve recognition. Fuzzy ARTMAP has already given good performance for other character sets [9]. Because the available information is carried by both labeled and unlabeled Arabic patterns, it is essential to combine supervised and unsupervised learning in a single training algorithm. We use simplified fuzzy ARTMAP and hidden Markov models on Arabic script and have obtained encouraging results.
3. Initial state distribution $\Pi = \{\pi_i\}$, $i \in S$, where $\pi_i$ is the probability that the model starts in state $i$.

4. State transition probability distribution $A = \{a_{ij}\}$, $i, j \in S$, where $a_{ij}$ is the probability of moving from state $i$ to state $j$.

5. Observation symbol probability distribution $B = \{b_j(o_t)\}$, where $b_j(o_t)$ is the probability of observing symbol $o_t$ when the model is in state $j$.

By modeling a problem as a hidden Markov model, and assuming that some set of data was generated by that model, we are able to calculate the probability of the observation sequence [13] and the probable underlying state sequences. We can train the model parameters on the observed data to obtain a more accurate model, and then use the trained model to predict unseen data.

The difference between a Markov model and a hidden Markov model is that the states are observed directly in the Markov model, but only indirectly and with uncertainty in the hidden Markov model. This is best illustrated using the graphical model representation shown in Fig. 1.

Arabic character recognition is exceedingly difficult to automate. Human beings can recognize varied objects and make sense of a large amount of visual information, apparently with very little effort. Emulating human performance of the recognition task, to the extent allowed by physical limitations, would be extremely beneficial for the system. The difficulty lies in real-world handwritten Arabic data, where handwritten Arabic characters are the input to the system and printed characters are the target output.

This paper uses a collected set of Arabic character data. A special sheet of paper was designed for the Arabic data collection. The data were collected from many people of various ages and backgrounds. The data were acquired manually, i.e., a sheet of bond paper was given to each respondent, who was asked to write the Arabic characters from to once. Bond paper was used because it is a strong, high-quality, durable writing paper similar to bank paper but with a weight greater than 50 g/m². The sheets of bond paper were then scanned using an HP Scanjet 5590 Digital Flatbed Scanner at 2400 dpi optical resolution, which gives low-noise, good-quality images. The digitized images are stored in BMP files, as shown in Fig. 2.
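Returning to the hidden Markov model parameters $\Pi$, $A$ and $B$ listed above, the probability of an observation sequence can be computed with the standard forward algorithm. The following is a minimal sketch under stated assumptions, not the implementation used in this paper: the function name `forward_probability` and the two-state model values `PI`, `A` and `B` are hypothetical and chosen only for illustration.

```python
def forward_probability(pi, A, B, observations):
    """Return P(observations | model) using the forward algorithm.

    pi[i]   : probability that the model starts in state i
    A[i][j] : probability of moving from state i to state j
    B[j][o] : probability of observing symbol o in state j
    """
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: sum the forward variables over all final states
    return sum(alpha)


# Hypothetical two-state model with two observation symbols (0 and 1).
PI = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],
     [0.2, 0.8]]

likelihood = forward_probability(PI, A, B, [0, 1])
```

In a recognizer built this way, one model would typically be trained per character class and an unknown observation sequence assigned to the class whose model gives the highest likelihood; that design is a common convention and an assumption here, since the text does not spell it out.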
International Journal of Computer Science and Telecommunications [Volume 5, Issue 8, August 2014] 28
inhibited. For each input $A$, a fuzzy choice function is used to get the response for each $F_2^a$ category:

$$T_j(A) = \frac{|A \wedge W_j^a|}{\alpha_a + |W_j^a|} \qquad (1)$$

where $\wedge$ is the fuzzy AND (component-wise minimum), $|\cdot|$ is the sum of the components, $W_j^a$ is the weight vector of the $j$th ARTa category and $\alpha_a > 0$ is the choice parameter.

Let $J$ be the node with the highest value computed as in (1). If the resonance condition from eq. (2) is not fulfilled,

$$\frac{|A \wedge W_J^a|}{|A|} \ge \rho_a \qquad (2)$$

then the $J$th node is inhibited so that it takes no part in further competitions for this pattern, and a new search for a resonant category is performed. This might lead to the creation of a new category in ARTa.

An identical process occurs in ARTb; let $K$ be the winning node from ARTb. The $F_2^b$ output vector is set to:

$$y_k^b = \begin{cases} 1, & k = K \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

An output vector $X^{ab}$ is formed in the Mapfield: $X^{ab} = y^b \wedge W_J^{ab}$. A Mapfield vigilance test controls the match between the predicted vector $X^{ab}$ and the target vector $y^b$:

$$\frac{|X^{ab}|}{|y^b|} \ge \rho_{ab} \qquad (4)$$

where $\rho_{ab} \in [0, 1]$ is the Mapfield vigilance parameter. If the test from (4) is not passed, then a sequence of steps called match tracking is initiated (the vigilance parameter $\rho_a$ is increased and a new resonant category is sought for ARTa); otherwise learning occurs in ARTa, ARTb and the Mapfield:

$$W_J^{a(\mathrm{new})} = \beta_a \left(A \wedge W_J^{a(\mathrm{old})}\right) + (1 - \beta_a)\, W_J^{a(\mathrm{old})} \qquad (5)$$

(and analogously in ARTb) and $W_{Jk}^{ab} = \delta_{kK}$, where $\delta_{ij}$ is Kronecker's delta. With respect to $\beta_a$, there are two learning modes: fast learning, with $\beta_a = 1$ for the entire training process, and fast-commit, slow-recode learning, which corresponds to setting $\beta_a = 1$ when creating a new node and $\beta_a < 1$ for subsequent learning.

In batch supervised learning mode, fuzzy ARTMAP is also efficient in that its asymptotic [23] generalization error can be reached with moderate time and space complexity. Fuzzy ARTMAP networks have been successfully applied to complex real-world pattern recognition tasks such as the recognition of radar signals, multi-sensor image fusion [24], remote sensing and data mining, recognition of handwritten characters, and signature verification.

V. WORKING OF FEATURE EXTRACTOR FOR ARABIC SCRIPT

The classification of two-dimensional objects from Arabic visual image data is a vital Arabic pattern recognition task. A feature is defined as a function of one or more measurements, each of which specifies some quantifiable property of an Arabic object, and is computed such that it quantifies some significant characteristic of the object. We classify the various features [24] currently employed as follows:

• General features: application-independent features such as color, texture, and shape. According to the abstraction level, they can be further divided into three types: pixel-level features, calculated at each pixel (e.g., color, location); local features, calculated over the results of a subdivision of the image based on image segmentation or edge detection; and global features, calculated over the entire image or just a regular sub-area of an image.
• Domain-specific features: application-dependent features such as human faces, fingerprints, and conceptual features. These features are often a synthesis of low-level features for a specific domain.

On the other hand, all features can be coarsely classified into low-level features and high-level features. Low-level features can be extracted directly from the original images, whereas high-level feature extraction must be based on low-level features [25].

This task exemplifies many facets of a typical Arabic pattern recognition problem, including feature selection, dimensionality reduction and the use of qualitative descriptors. Moments are the extracted features derived from raw measurements. Moments are used to obtain invariance to Arabic script Scaling (ASS), Arabic script Rotation (ASR) and Arabic script Translation (AST). The properties of invariance to ASS, ASR and AST transforms may be derived using functions of moments. The moment transformation of an Arabic image function $\mathrm{ĄƗƑ}(x,y)$ is given by:

$$l_{mn} = \iint x^m y^n \, \mathrm{ĄƗƑ}(x,y)\, dx\, dy, \qquad m, n = 0, 1, 2, \ldots, \infty$$

In the case of a spatially discretized 9×11 (M×N) character denoted by $\mathrm{ĄƗƑ}(i,j)$, this is approximated as:

$$l_{mn} = \sum_{i=0}^{5} \sum_{j=0}^{7} i^m j^n \, \mathrm{ĄƗƑ}(i,j)$$

Here $\mathrm{ĄƗƑ}(i,j)$ takes the value 0 or 1 depending on the state of the $(i,j)$th pixel. The intensity is affected by several factors, such as the style of handwriting and the ink used for the written Arabic character, i.e. $0 \le \mathrm{ĄƗƑ}(i,j) \le 1$ indicates that the intensity [26] lies between the ends of a spectrum. However, $\mathrm{ĄƗƑ}(i,j)$ is constant over any pixel region. The central moments are given by:

$$\mu_{mn} = \sum_{i=0}^{5} \sum_{j=0}^{7} (i - \hat{i})^m (j - \hat{j})^n \, \mathrm{ĄƗƑ}(i,j)$$

where

$$\hat{i} = \frac{l_{10}}{l_{00}}, \qquad \hat{j} = \frac{l_{01}}{l_{00}}$$

The central moments are still sensitive to ASR and ASS transformations. Scaling invariance may be obtained by further normalizing $\mu_{mn}$ as:

$$\eta_{mn} = \frac{\mu_{mn}}{\mu_{00}^{\frac{m+n}{2} + 1}}, \qquad m + n = 2, 3, \ldots$$
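The moment features of this section can be computed directly from a pixel grid. The sketch below is a minimal illustration rather than the authors' feature extractor: it evaluates the raw moments $l_{mn}$, the central moments $\mu_{mn}$ and the scale-normalized moments for an image stored as a list of rows. The function names and the small binary pattern `GLYPH` are hypothetical, and the sums run over whatever image size is supplied instead of fixed limits.

```python
def raw_moment(img, m, n):
    """Raw moment l_mn: sum over pixels of i^m * j^n * img[i][j]."""
    return sum((i ** m) * (j ** n) * value
               for i, row in enumerate(img)
               for j, value in enumerate(row))


def central_moment(img, m, n):
    """Central moment mu_mn, taken about the centroid (i_hat, j_hat)."""
    l00 = raw_moment(img, 0, 0)
    i_hat = raw_moment(img, 1, 0) / l00
    j_hat = raw_moment(img, 0, 1) / l00
    return sum(((i - i_hat) ** m) * ((j - j_hat) ** n) * value
               for i, row in enumerate(img)
               for j, value in enumerate(row))


def normalized_moment(img, m, n):
    """Scale-normalized moment: mu_mn / mu_00 ** ((m + n) / 2 + 1)."""
    mu00 = central_moment(img, 0, 0)
    return central_moment(img, m, n) / mu00 ** ((m + n) / 2 + 1)


# Hypothetical 3x3 binary pattern used only to exercise the functions.
GLYPH = [[1, 1, 0],
         [0, 1, 0],
         [0, 1, 1]]

# Second-order normalized moments as a small feature vector.
features = [normalized_moment(GLYPH, m, n) for m, n in [(2, 0), (1, 1), (0, 2)]]
```

Because the central moments are measured from the centroid, shifting the pattern inside a larger image leaves them unchanged, which is the translation invariance (AST) described above. Rotation invariance would need further combinations of these moments, which are not shown here.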
VI. THE ALGORITHM FOR TRAINING AND INFERENCE PHASES FOR ARABIC SCRIPT

The algorithms for the Training and Inference phases of SFARTMAP, based on the extended definition of complementation, are as follows:

A) The Training phase of Arabic Script

Phase 1: Select a suitable value for the vigilance parameter ƥ (0 < ƥ < 1) and a small value for š. Set the number of training epochs to the desired number of training epochs and the count of training epochs to 0.

Phase 2: x ← 1;
Count of training epochs = count of training epochs + 1;
While (count of training epochs ≤ number of training epochs) repeat Phases 3–13;

Phase 3: Input the pattern vector $I_x = (b_{x1}, b_{x2}, b_{x3}, b_{x4}, \ldots, b_{xd})$ of dimension d and its category $R_x$. [28]

Phase 4: Compute the augmented input vector using the extended definition of complementation of fuzzy sets, where applicable:
$BI_x = (b_{x1}, b_{x2}, b_{x3}, b_{x4}, \ldots, b_{xd}, 1 - b_{x1}, 1 - b_{x2}, 1 - b_{x3}, \ldots, 1 - b_{xd})$

Phase 5: If $BI_x$ is the first input in the given category $R_x$, set the top-down weight vector $W_x$ to $BI_x$ and link $W_x$ to the category $R_x$.
Go to Phase 12.

Phase 6: If $BI_x$ is an input pattern vector whose category already exists, then compute the activation function $CF_y(BI_x)$ for each of the existing top-down weight nodes $TDW_y$:

$$CF_y(BI_x) = \frac{|BI_x \wedge W_y|}{š + |W_y|}$$

Phase 7: Select the top-down weight node N which records the highest [29] activation function: $N_c(BI_x) = \max_y CF_y(BI_x)$

Phase 8: Compute the match function $CF_N(BI_x)$ of the winner node N. If $CF_N(BI_x) > ƥ$ and $R_x$ is the same as the category $R_N$ linked to $W_N$, then update the weight vector $W_N$ as

$$W_N^{new} = (1 - \beta)\, W_N^{old} + \beta \left(BI_x \wedge W_N^{old}\right)$$

where $\beta$ is the learning rate of Section IV ($\beta = 1$ for fast learning).

Phase 9: If $CF_N(BI_x) > ƥ$ and $R_x$ is not the category $R_N$ linked to $W_N$, then initiate match tracking by setting ƥ to $CF_N(BI_x)$ and incrementing it by a small value ƛ:
ƥ = $CF_N(BI_x)$ + ƛ
If any more top-down weight nodes exist, then consider the next highest winner $W_N$ among the top-down weight nodes;
Go to Phase 8;
Else go to Phase 11;

Phase 10: If $CF_N(BI_x) < ƥ$, then if any more top-down weight nodes exist, consider the next highest winner $W_N$ among the top-down weight nodes;
Go to Phase 8;
Else go to Phase 11;

Phase 11: Create a new top-down weight node $W_{new}$ such that $W_{new} = BI_x$ and link the node to the category $R_x$;

Phase 12: If there are no more input patterns, then go to Phase 14;

Phase 13: Otherwise x ← x + 1;
Go to Phase 3;

Phase 14: Go to Phase 2;

B) The Inference phase of Arabic Script

Phase 1: Let $W_y$, y = 1, 2, 3, ..., m denote the m top-down weight vectors obtained after training the network with a given set of training patterns;
Let $I_x$ be the inference pattern set, each of whose categories is to be inferred by the network;
x ← 1;

Phase 2: Read input $I_x$;

Phase 3: Compute the augmented input $BI_x$;

Phase 4: For y ← 1 to m compute the activation functions

$$CF_y(BI_x) = \frac{|BI_x \wedge W_y|}{š + |W_y|}$$

Phase 5: Select the winner N among the m activation functions: $N_c(BI_x) = \max_y CF_y(BI_x)$

Phase 6: Output the category $R_N$ linked to $N_c(BI_x)$ as the one to which $I_x$ belongs.

Phase 7: If there are no more inference pattern vectors, then exit;
Else x ← x + 1;
Go to Phase 2.

VII. EXPERIMENTAL RESULTS

The performance of the Simplified Fuzzy ARTMAP and Hidden Markov Models (HMM) is given in Figure 4. We observe that up to a noise level of 0.20 to 0.25, recognition of handwritten Arabic script is 100%. Table 1 shows the feature values of handwritten Arabic characters. Table 2 gives the recognition rate, which is based on the difference between the feature values of ideal and handwritten Arabic characters. The last row of Table 2 gives the average recognition rate for handwritten Arabic characters. We observe that a recognition rate of up to 96.38% is achieved for handwritten Arabic script.
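The training and inference phases of Section VI can be condensed into a short program. The following is a rough sketch under stated assumptions, not the authors' code: the class name `SimplifiedFuzzyArtmap`, its parameter names and default values are hypothetical; the match function is assumed to take the usual form |BI ∧ W| / |BI|, which the phases above do not write out; and a match exactly equal to the vigilance value is treated as passing, a case the phases leave open.

```python
def complement_code(pattern):
    """Training Phase 4: augment (b1..bd) with the complements (1-b1..1-bd)."""
    return list(pattern) + [1.0 - b for b in pattern]


def fuzzy_and(u, v):
    """Component-wise minimum of two vectors."""
    return [min(a, b) for a, b in zip(u, v)]


class SimplifiedFuzzyArtmap:
    def __init__(self, vigilance=0.5, alpha=1e-3, beta=1.0, epsilon=1e-3):
        self.vigilance = vigilance  # baseline vigilance parameter
        self.alpha = alpha          # small constant in the activation function
        self.beta = beta            # learning rate (1.0 means fast learning)
        self.epsilon = epsilon      # match-tracking increment
        self.weights = []           # top-down weight vectors W_y
        self.labels = []            # category linked to each W_y

    def _activation(self, bi, w):
        return sum(fuzzy_and(bi, w)) / (self.alpha + sum(w))

    def _match(self, bi, w):
        # Assumed form of the match function, see the lead-in.
        return sum(fuzzy_and(bi, w)) / sum(bi)

    def train_one(self, pattern, label):
        bi = complement_code(pattern)
        if label not in self.labels:            # Phase 5: first input of a category
            self.weights.append(bi)
            self.labels.append(label)
            return
        rho = self.vigilance
        # Phases 6-7: visit nodes from highest to lowest activation.
        order = sorted(range(len(self.weights)),
                       key=lambda y: self._activation(bi, self.weights[y]),
                       reverse=True)
        for y in order:
            match = self._match(bi, self.weights[y])
            if match < rho:                     # Phase 10: try the next winner
                continue
            if self.labels[y] == label:         # Phase 8: resonance, update W_N
                w = self.weights[y]
                self.weights[y] = [(1 - self.beta) * old + self.beta * new
                                   for old, new in zip(w, fuzzy_and(bi, w))]
                return
            rho = match + self.epsilon          # Phase 9: match tracking
        self.weights.append(bi)                 # Phase 11: commit a new node
        self.labels.append(label)

    def fit(self, patterns, labels, epochs=1):
        for _ in range(epochs):
            for pattern, label in zip(patterns, labels):
                self.train_one(pattern, label)
        return self

    def predict(self, pattern):
        """Inference phases 2-6: return the category of the most active node."""
        bi = complement_code(pattern)
        best = max(range(len(self.weights)),
                   key=lambda y: self._activation(bi, self.weights[y]))
        return self.labels[best]
```

As a usage sketch, `SimplifiedFuzzyArtmap().fit(patterns, labels)` expects feature vectors scaled to the range [0, 1], for example suitably rescaled moment features, and `predict` then returns the category linked to the winning node. Sorting the nodes once by activation is a design shortcut for "consider the next highest winner"; it gives the same visiting order as repeatedly picking the best remaining node.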
Yusuf Perwej et al. 31
REFERENCES
[17] R. Andonie and L. Sasu, "Fuzzy ARTMAP with input relevances," IEEE Transactions on Neural Networks, 17, pp. 929–941, 2006.
[18] H. K. Kwan and Y. Cai, "A fuzzy neural network and its application to pattern recognition," IEEE Trans. on Fuzzy Systems, 2(3), pp. 185–191, 1994.
[19] C. P. Lim, H. H. Toh, and T. S. Lee, "An evaluation of the fuzzy ARTMAP neural network using offline and on-line strategies," Neural Network World, 4, pp. 327–339, 1999.
[20] I. Dagher, M. Georgiopoulos, G. L. Heileman, and G. Bebis, "Fuzzy ARTVar: An improved fuzzy ARTMAP algorithm," in Proceedings IEEE World Congress Computational Intelligence, Anchorage, pp. 1688–1693, 1998.
[21] J. A. Vazquez-Lopez, I. Lopez-Juarez, and M. Peña-Cabrera, "On the use of the FuzzyARTMAP Neural Network for Pattern Recognition in Statistical Process Control using a Factorial Design," Int. J. of Computers, Communications & Control, Vol. V (2), pp. 205–215, 2010.
[22] P. Henniges, E. Granger, and R. Sabourin, "Factors of overtraining with fuzzy ARTMAP neural networks," Int. Joint Conference on Neural Networks, Montreal, Canada, pp. 1075–1080, 2005.
[23] B. Lerner and B. Vigdor, "An empirical study of fuzzy ARTMAP applied to cytogenetics," IEEE Convention of Electrical and Electronics Engineers in Israel, pp. 301–304, 2004.
[24] M. Taghi, V. Baghmisheh, and P. Nikola, "A Fast Simplified Fuzzy ARTMAP Network," Neural Processing Letters, 17, pp. 273–316, 2003.
[25] J. F. D. Addison, S. Wermter, and J. MacIntyre, "Effectiveness of feature extraction in neural network architectures for novelty detection," ICANN-99, Ninth International Conference on Artificial Neural Networks, Edinburgh, UK, pp. 976–981, September 1999.
[26] Yusuf Perwej and Ashish Chaturvedi, "Machine Recognition of Hand Written Characters using Neural Networks," International Journal of Computer Applications (IJCA), USA, Vol. 14, No. 2, pp. 6–9, January 2011, ISSN 0975 – 8887, DOI: 10.5120/1819-2380.
[27] E. Saber and A. M. Tekalp, "Integration of color, edge and texture features for automatic region-based image annotation and retrieval," Electronic Imaging, 7, pp. 684–700, 1998.
[28] I. Dagher, M. Georgiopoulos, G. L. Heileman, and G. Bebis, "Fuzzy ARTVar: An improved fuzzy ARTMAP algorithm," in Proceedings IEEE World Congress Computational Intelligence WCCI'98, Anchorage, pp. 1688–1693, 1998.
[29] E. Gomez-Sanchez, Y. A. Dimitriadis, J. M. Cano-Izquierdo, and J. Lopez-Coronado, "Π ARTMAP: Use of mutual information for category reduction in fuzzy ARTMAP," IEEE Transactions on Neural Networks, 13, pp. 58–69, 2002.

Dr. Yusuf Perwej is an Assistant Professor in the Department of Computer Science & Engineering, Al Baha University, Al Baha, Kingdom of Saudi Arabia (KSA). He has authored a number of journal articles and papers. His research interests include Soft Computing, Artificial Neural Network, Machine Learning, Pattern Matching, Pattern Recognition, Artificial Intelligence, Image Processing, Fuzzy Logic, Genetic Algorithm, Robotics, Bluetooth and Network. He is a member of IEEE.

Dr. Shaikh Abdul Hannan is an Assistant Professor in the Department of Computer Science & Engineering, Al Baha University, Al Baha, Kingdom of Saudi Arabia (KSA). He has authored a number of journal articles and papers. His research interests include Data Mining and Data Warehouse, Artificial Neural Network, Artificial Intelligence and Image Processing.

Nikhat Akhtar is an Assistant Professor in the Department of Computer Science & Engineering, Integral University, Lucknow, India. She has authored a number of journal articles and papers. Her research interests include Soft Computing, Swarm Intelligence, Storage Technology, Artificial Neural Network, Cryptography, Pattern Matching, Pattern Recognition, Artificial Intelligence, Network Security, Fuzzy Logic, Network and Database. She is a member of IEEE.