
Research Paper On OCR

This document provides an overview of optical character recognition (OCR) systems. It discusses the history and evolution of OCR from early mechanical devices to modern computer-based systems. The document also categorizes different types of OCR systems, including handwritten versus printed text recognition, online versus offline systems, and segmentation, feature extraction, classification, and post-processing steps used in OCR. While significant research has improved OCR accuracy, the document concludes that machine recognition is still not as reliable as human reading abilities.

Journal of Information & Communication Technology-JICT Vol. 10 Issue. 2, December 2016

A Survey on Optical Character Recognition System


Noman Islam, Zeeshan Islam, Nazia Noor

Manuscript received August 17, 2016; accepted October 10, 2016; date of current version December 2016.
Noman Islam is with Iqra University, Karachi, Pakistan (email: [email protected])
Zeeshan Islam is with Aladdin Solutions, Pakistan (email: [email protected])
Nazia Noor is with Indus University, Pakistan (email: [email protected])
ISSN 2409-6520

Abstract—Optical Character Recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into its constituent characters. Despite decades of intense research, developing OCR with capabilities comparable to those of humans still remains an open challenge. Because of this challenging nature, researchers from industry and academic circles have directed their attention towards Optical Character Recognition. Over the last few years, the number of academic laboratories and companies involved in research on character recognition has increased dramatically. This research aims at summarizing the research done so far in the field of OCR. It provides an overview of different aspects of OCR and discusses corresponding proposals aimed at resolving issues of OCR.

Keywords—character recognition, document image analysis, OCR, OCR survey, classification

I. INTRODUCTION

Optical Character Recognition (OCR) is a piece of software that converts printed text and images into a digitized form that can be manipulated by a machine. Unlike the human brain, which can very easily recognize text and characters in an image, machines are not intelligent enough to perceive the information available in an image. Therefore, a large number of research efforts have been put forward that attempt to transform a document image into a format understandable by machines.

OCR is a complex problem because of the variety of languages, fonts and styles in which text can be written, and because of the complex rules of languages. Hence, techniques from different disciplines of computer science (e.g. image processing, pattern classification and natural language processing) are employed to address different challenges. This paper introduces the reader to the problem. It presents the historical perspective, applications, challenges and techniques of OCR.

II. LITERATURE REVIEW

Character recognition is not a new problem; its roots can be traced back to systems that predate the invention of computers. The earliest OCR systems were not computers but mechanical devices that were able to recognize characters, although with very slow speed and low accuracy. In 1951, M. Sheppard invented a reading robot, GISMO, that can be considered the earliest work on modern OCR [1]. GISMO could read musical notation as well as words on a printed page, one by one. However, it could recognize only 23 characters. The machine could also copy a typewritten page. J. Rainbow, in 1954, devised a machine that could read uppercase typewritten English characters, one per minute. The early OCR systems were criticized for errors and slow recognition speed. Hence, little research effort was put into the topic during the 1960s and 1970s. The only developments were made at government agencies and large corporations such as banks, newspapers and airlines.

Because of the complexities associated with recognition, it was felt that there should be standardized OCR fonts to ease the task of recognition. Hence, OCR-A and OCR-B were developed by ANSI and ECMA in 1970, which provided comparatively acceptable recognition rates [2].

During the past thirty years, substantial research has been done on OCR. This has led to the emergence of document image analysis (DIA) and of multi-lingual, handwritten and omni-font OCRs [2]. Despite these extensive research efforts, the machine's ability to reliably read text is still far below that of humans. Hence, current OCR research focuses on improving the accuracy and speed of OCR for documents of diverse styles, printed or written in unconstrained environments. No open source or commercial software has been available for complex languages such as Urdu or Sindhi.

III. TYPES OF OPTICAL CHARACTER RECOGNITION SYSTEMS

Research on OCR has been carried out in a multitude of directions in past years. This section discusses the different types of OCR systems that have emerged as a result of this research. We can categorize these systems based on image acquisition mode, character connectivity, font restrictions, etc. Fig. 1 categorizes character recognition systems.

Based on the type of input, OCR systems can be categorized as handwriting recognition and machine-printed character recognition. The latter is a relatively
simpler problem because characters are usually of uniform dimensions, and the positions of characters on the page can be predicted [3].

Handwritten character recognition is a very tough job because of the different writing styles of users as well as the different pen movements a user may make for the same character. These systems can be divided into two sub-categories, i.e. on-line and off-line systems. The former operate in real time while the user is writing the character. They are less complex as they can capture temporal information, i.e. speed, velocity, number of strokes made, direction of writing of strokes, etc. In addition, there is no need for thinning techniques as the trace of the pen is only a few pixels wide. Offline recognition systems operate on static data, i.e. the input is a bitmap. Hence, recognition is much more difficult.

Many online systems are available because they are easier to develop, have good accuracy and can be incorporated for input in tablets and PDAs [4].

[Figure 1: Types of character recognition system. Character recognition divides into handwritten and printed; handwritten recognition divides into online and offline.]

IV. APPLICATIONS OF OCR

OCR enables a large number of useful applications. In its early days, OCR was used for mail sorting, bank cheque reading and signature verification [5]. In addition, OCR can be used by organizations for automated form processing where a large amount of data is available in printed form. Other uses of OCR include processing utility bills, passport validation, pen computing and automated number plate recognition [6]. Another useful application of OCR is helping blind and visually impaired people to read text [7].

IV. MAJOR PHASES OF OCR

The process of OCR is a composite activity comprising different phases. These phases are as follows:
Image acquisition: Capturing the image from an external source such as a scanner or a camera.
Preprocessing: Once the image has been acquired, different preprocessing steps can be performed to improve the quality of the image. Preprocessing techniques include noise removal, thresholding and extraction of the image baseline.
Character segmentation: In this step, the characters in the image are separated so that they can be passed to the recognition engine. Among the simplest techniques are connected component analysis and projection profiles. However, in complex situations where the characters are overlapping or broken, or some noise is present in the image, advanced character segmentation techniques are used.
Feature extraction: The segmented characters are then processed to extract different features. Based on these features, the characters are recognized. Different types of features, such as moments, can be extracted from images. The extracted features should be efficiently computable, minimize intra-class variations and maximize inter-class variations.
Character classification: This step maps the features of the segmented image to different categories or classes. There are different types of character classification techniques. Structural classification techniques are based on features extracted from the structure of the image and use different decision rules to classify characters. Statistical pattern classification methods use probabilistic models and other statistical methods to classify the characters.
Post processing: After classification, the results are not 100% correct, especially for complex languages. Post processing techniques can be applied to improve the accuracy of OCR systems. These techniques utilize natural language processing and geometric and linguistic context to correct errors in OCR results. For example, a post processor can employ a spell checker and dictionary, or probabilistic models such as Markov chains and n-grams, to improve accuracy. The time and space complexity of a post processor should not be very high, and applying a post processor should not introduce new errors.

a. Image Acquisition

Image acquisition is the initial step of OCR. It comprises obtaining a digital image and converting it into a suitable form that can be easily processed by a computer. This can involve quantization as well as compression of the image [8]. A special case of quantization is binarization, which involves only two levels of image. In most cases, the binary image suffices to characterize the image. The compression itself can be lossy or lossless. An overview of various image compression techniques has been provided in [9].

b. Pre-processing

Next after image acquisition is pre-processing, which aims to enhance the quality of the image. One of the pre-processing techniques is thresholding, which binarizes the image based on some threshold value [9]. The threshold value can be set at a local or global level.

Different types of filters such as averaging, min and max filters can be applied. Alternatively, different morphological operations such as erosion, dilation, opening and closing can be performed.
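Global thresholding and the four morphological operations named in the pre-processing step can be sketched as follows. This is a minimal illustration in plain Python, not the method of any cited system. The function names (`binarize`, `erode`, `dilate`, `opening`, `closing`), the fixed global threshold of 128, the 3x3 window and the convention that pixels outside the image count as background are all assumptions made for this example.

```python
def binarize(gray, threshold=128):
    """Global thresholding: pixels darker than `threshold` become ink (1), the rest background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def _window(img, r, c):
    """Values in the 3x3 window centred on (r, c); pixels outside the image count as 0 (assumption)."""
    h, w = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(img):
    """Erosion: a pixel stays ink only if its whole 3x3 window is ink."""
    return [[min(_window(img, r, c)) for c in range(len(img[0]))] for r in range(len(img))]


def dilate(img):
    """Dilation: a pixel becomes ink if any pixel in its 3x3 window is ink."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))] for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation; removes specks that cannot contain the window."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion; fills small holes inside ink regions."""
    return erode(dilate(img))


# Hypothetical 6x6 grayscale page: a dark 3x3 blob plus one isolated dark speck.
W, B = 255, 0
page = [
    [W, W, W, W, W, W],
    [W, B, B, B, W, W],
    [W, B, B, B, W, W],
    [W, B, B, B, W, W],
    [W, W, W, W, W, W],
    [W, W, W, W, W, B],
]
binary = binarize(page)
cleaned = opening(binary)  # the speck at (5, 5) is removed, the 3x3 blob is kept
```

A local (adaptive) threshold would replace the single `threshold` value with one computed per neighbourhood; the sketch uses a global value only for brevity.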

Table 1: Major phases of OCR system

Phase               Description                                           Approaches
Acquisition         The process of acquiring image                        Digitization, binarization, compression
Pre-processing      To enhance quality of image                           Noise removal, skew removal, thinning, morphological operations
Segmentation        To separate image into its constituent characters     Implicit vs explicit segmentation
Feature extraction  To extract features from image                        Geometrical features such as loops, corner points; statistical features such as moments
Classification      To categorize a character into its particular class   Neural network, Bayesian, nearest neighbor
Post-processing     To improve accuracy of OCR results                    Contextual approaches, multiple classifiers, dictionary based approaches

An important part of pre-processing is to find the skew in the document. Techniques for skew estimation include projection profiles, the Hough transform and nearest neighbor methods.

In some cases, thinning of the image is also performed before later phases are applied [10]. Finally, the text lines present in the document can also be found as part of the pre-processing phase. This can be done based on projections or clustering of the pixels.

c. Character Segmentation

In this step, the image is segmented into characters before being passed to the classification phase. The segmentation can be performed explicitly, or implicitly as a byproduct of the classification phase [11]. In addition, the other phases of OCR can help in providing contextual information useful for segmentation of the image.

d. Feature Extraction

In this stage, various features of characters are extracted. These features uniquely identify characters. The selection of the right features and the total number of features to be used is an important research question. Different types of features such as the image itself, geometrical features (loops, strokes) and statistical features (moments) can be used. Finally, various techniques such as principal component analysis can be used to reduce the dimensionality of the image.

e. Classification

Classification is defined as the process of assigning a character to its appropriate category. The structural approach to classification is based on relationships present in image components. The statistical approaches are based on the use of a discriminant function to classify the image. Some of the statistical classification approaches are the Bayesian classifier, decision tree classifier, neural network classifier and nearest neighbor classifier [12]. Finally, there are classifiers based on the syntactic approach, which assumes a grammatical approach to compose an image from its sub-constituents.

f. Post-processing

Once the character has been classified, there are various approaches that can be used to improve the accuracy of OCR results. One of the approaches is to use more than one classifier for classification of the image. The classifiers can be used in cascading, parallel or hierarchical fashion. The results of the classifiers can then be combined using various approaches.

In order to improve OCR results, contextual analysis can also be performed. The geometrical and document context of the image can help in reducing the chances of errors. Lexical processing based on Markov models and dictionaries can also help in improving the results of OCR [12].

V. CONCLUSION

In this paper, an overview of various techniques of OCR has been presented. OCR is not an atomic process but comprises various phases such as acquisition, pre-processing, segmentation, feature extraction, classification and post-processing. Each of the steps is discussed in detail in this paper. Using a combination of these techniques, an efficient OCR system can be developed as future work. The OCR system can also be used in different practical applications such as number-plate recognition, smart libraries and various other real-time applications.

Despite the significant amount of research in OCR, recognition of characters for languages such as Arabic, Sindhi and Urdu still remains an open challenge. An overview of OCR techniques for these languages is planned as future work. Another important area of research is multi-lingual character recognition systems. Finally, the employment of OCR systems in practical applications remains an active area of research.

ACKNOWLEDGMENT

The authors would like to thank Iqra University, Karachi for their support in the completion of this research work.

REFERENCES
[1] Satti, D.A., 2013, Offline Urdu Nastaliq OCR for Printed Text using Analytical Approach. MS thesis report, Quaid-i-Azam University: Islamabad, Pakistan. p. 141.
[2] Mahmoud, S.A., & Al-Badr, B., 1995, Survey and bibliography of Arabic optical text recognition. Signal Processing, 41(1), 49-77.
[3] Bhansali, M., & Kumar, P., 2013, An Alternative Method for Facilitating Cheque Clearance Using Smart Phones Application. International Journal of Application or Innovation in Engineering & Management (IJAIEM), 2(1), 211-217.
[4] Qadri, M.T., & Asif, M., 2009, Automatic Number Plate Recognition System for Vehicle Identification Using Optical Character Recognition, presented at International Conference on Education Technology and Computer, Singapore, 2009. Singapore: IEEE.
[5] Shen, H., & Coughlan, J.M., 2012, Towards A Real Time System for Finding and Reading Signs for Visually Impaired Users. Computers Helping People with Special Needs. Linz, Austria: Springer International Publishing.
[6] Bhavani, S., & Thanushkodi, K., 2010, A Survey On Coding Algorithms In Medical Image Compression. International Journal on Computer Science and Engineering, 2(5), 1429-1434.
[7] Bhammar, M.B., & Mehta, K.A., 2012, Survey of various image compression techniques. International Journal on Darshan Institute of Engineering Research & Emerging Technologies, 1(1), 85-90.
[8] Lazaro, J., Martín, J.L., Arias, J., Astarloa, A., & Cuadrado, C., 2010, Neuro semantic thresholding using OCR software for high precision OCR applications. Image and Vision Computing, 28(4), 571-578.
[9] Lund, W.B., Kennard, D.J., & Ringger, E.K., 2013, Combining Multiple Thresholding Binarization Values to Improve OCR Output, presented at Document Recognition and Retrieval XX Conference 2013, California, USA, 2013. USA: SPIE.
[10] Shaikh, N.A., & Shaikh, Z.A., 2005, A generalized thinning algorithm for cursive and non-cursive language scripts, presented at 9th International Multitopic Conference IEEE INMIC, Pakistan, 2005. Pakistan: IEEE.
[11] Shaikh, N.A., Shaikh, Z.A., & Ali, G., 2008, Segmentation of Arabic text into characters for recognition, presented at International Multi Topic Conference, IMTIC, Jamshoro, Pakistan, 2008. Pakistan: Springer.
[12] Ciresan, D.C., Meier, U., Gambardella, L.M., & Schmidhuber, J., 2011, Convolutional neural network committees for handwritten character classification, presented at International Conference on Document Analysis and Recognition, Beijing, China, 2011. USA: IEEE.
