Signboard Detection and Text Recognition Using Artificial Neural Networks


Muhammad A. Panhwar
School of Electronic Engineering, Beijing University of Posts & Telecommunications, Beijing, China
[email protected]

Kamran A. Memon, Adeel Abro, Deng Zhongliang
School of Electronic Engineering, Beijing University of Posts & Telecommunications, Beijing, China
{ali.kamran77,adeelabro, Dengzhl}@bupt.edu.cn

Sijjad A. Khuhro
School of Computer Science and Technology, University of Science and Technology of China, Anhui, Hefei, 230026, China
[email protected]

Saleemullah Memon
School of information & communication engineering, Beijing University of Posts & Telecommunications, Beijing, China
[email protected]

Abstract— Machine learning (ML) is a sub-field of Artificial Intelligence (AI) that
allows a computer to learn and adopt new rules. With an ML algorithm, the computer can
identify patterns in observations, figures and other formats, which may describe a model
or other structure. The machine observes various events from the world and presents
results with or without explicitly pre-programmed rules. This paper presents the
recognition of scene text from the outside environment, focusing on signboards. A
framework for the detection and recognition of text from the natural environment is
presented. Going a step beyond paper-based text recognition, text recognition from
natural scenes is divided into several phases. First, the image is captured from the
outside environment with a smart device, followed by detection of the edges of a
signboard. The next phase is the detection of text and its recognition in two languages,
Urdu and English. The final phase uses an Artificial Neural Network for the
classification and recognition of the text extracted from the natural scene.
Experimental results have been generated on an image database created as part of this
research. The system is multilingual and produces output in Urdu and English. It
performs well compared to traditional recognition systems and delivers an overall
accuracy of 85% in image results.

Keywords-Text Detection; Text Recognition; Natural Scenic; Text Detection & Recognition

978-1-7281-1190-2/19/$31.00 ©2019 IEEE

I. INTRODUCTION
Machine learning (ML) is an application of artificial intelligence (AI) that provides
the ability to learn from experience without explicit programming. Machine learning
focuses on developing computer algorithms that can access data and learn on their own.
The learning process starts with observation, direct experience, or guidance, looking
for patterns in examples so that better decisions can be made in the future. The main
purpose is to allow a computer to learn and adjust its behavior according to human
intervention [1].

Machine learning allows large amounts of data to be analyzed. It generally yields faster
and more accurate results for pattern identification. An image, a fingerprint, and
similar data may each be called a pattern. Pattern recognition is the process of using
machine learning algorithms to recognize such patterns.

Both machine learning (ML) algorithms and pattern recognition play a vital role in
extracting information from natural scenes [2]. Natural scenes contain helpful and
valuable information; extracting the text makes that information easy for humans and
computers to understand. Research on extracting text from natural scenes is very active
and challenging in computer vision applications [3]. Reading the text of a scene enables
many useful applications based on geographic information. Despite its similarity to
traditional optical character recognition (OCR), reading text in natural scenes is
challenging because of random patterns, lighting conditions, variation in the view, and
background objects, none of which can be controlled [4].

In this paper, we introduce a framework to extract text from natural scenes. Our primary
focus is the identification of words in the scene. The paper is arranged into several
sections. The next section reviews text detection and recognition. Section 3 discusses
the methodology of extracting text from natural scenes. Section 4 presents the
experimental results, Section 5 gives statistics and analysis of the proposed system,
and Section 6 concludes the paper.

II. RELATED MATERIAL
Huang et al. (2015) proposed a method for detecting and recognizing text in scenes using
two algorithms, MSER (maximally stable extremal regions) and SVM (support vector
machine). Unlike end-to-end text identification, they divide the recognition problem
into detection and recognition stages. In the detection phase, the system extracts
connected components using MSER and color clustering to obtain possible text. Next,
non-text areas among the connected components are filtered out using visual saliency.
Lastly, word lines are obtained by text line generation. In the
identification phase, the system uses vertical projection to divide words and then
recognizes the characters with an SVM-based framework. Experimental results are reported
to show better performance than traditional text detection and identification
methods [5].

Baran et al. (2018) introduced a new, effective way to automatically detect text and
recognize characters in natural scene images. The text detection method belongs to the
connected-component category and uses MSER features. A new filter is designed to
eliminate non-text areas and to form words (phrases) from separate characters. The
resulting phrases are recognized using an OCR system. Finally, the IMCOP platform and
the use of the method within that delivery platform are described [6].

Dai et al. (2018) present a new end-to-end framework for detecting multi-oriented scene
text by segmentation. The system offers a fused text segmentation network that combines
multi-level features. It draws on both semantic segmentation and region-based object
detection and can segment text instances. Without additional pipelines, the approach
outperforms the previous state of the art on multi-oriented text detection benchmarks:
on ICDAR 2015 and MSRA-TD it reaches 84.1% and 82.0%, respectively. The system also
reports a baseline on a dataset containing curved text, which suggests the effectiveness
of the proposed method [7].

Kaushik & Verma (2018) offer a detailed overview of various schemes for text recognition
in natural scene images. The study also compares different state-of-the-art methods. The
review focuses mainly on connected-component-based and region-based methods. The paper
also describes the benefits and limitations of text recognition in natural scenes [8].

Tian et al. (2018) proposed a Bayesian framework for tracking-based text detection and
recognition in videos with embedded captions. The framework consists of three main
components: text tracking, detection, and recognition. Within this unified framework,
text is first tracked. The tracking result is then refined using detection or
recognition results, and text detection and recognition are in turn enhanced by
multi-frame integration. A database (USTB-VidTEXT) is also made publicly available.
Various experiments on this database confirm that the approach improves text detection
and recognition in web videos [9].

III. TEXT RECOGNITION FROM SCENES
Pattern recognition is a robust, exciting and fast-growing field that supports the
development of related areas such as image processing, computer vision, data acquisition
systems, and neural networks. The methodology of this study comprises several phases, as
shown in Figure 1. Capturing the image is the first step in signboard detection. The
next step is real-time signboard detection, which is a challenging task because our
environment is full of different kinds of signboards, including glow signboards, 3D,
neon, and creative signboards, with different layouts, shapes and angles. Text detection
is the next phase, which is also very challenging because the text on a signboard may be
blurred or have broken ligatures, and may vary in size, color, resolution, shape,
texture, background, text geometry, lighting, and contrast with the background [10]. The
text detection process is performed with the help of the following equation:

$$H_{r1} = \sum f_{i,j} \tag{1}$$

where $f_{i,j}$ is a function of property $I$, and $I = 0$ or $1$.

The text extraction process is also called text localization. This processing is
considered for text detection and localization [11]. Recognition is the next and main
phase, in which the image text (pixel-based text) is converted into a readable and
editable form.

Figure 1. Several Phases

Recognition of the detected text is done by applying different techniques, including
neural networks [12], an SVM classifier [13] and the HOG method [14].

IV. RESULTS THROUGH EXPERIMENTS
To evaluate and test the system, we performed experiments on different images captured
from natural scenes. The images were obtained from the natural scene image database
created in our other research. Figure 2 shows a captured image, and signboard edge
detection is shown in Figure 3. Figure 4 and Figure 5 show the text detection in the
natural scene using MSER. Figure 6(a) shows the text recognition in English, whereas
Figure 6(b) shows the recognized Urdu text, saved as a .txt file.

Figure 2. Image from Natural Scene
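
As an illustration of Equation (1) in Section III, one possible reading is that
$f_{i,j}$ is the binary value (0 or 1) of the pixel at row i and column j, so that
summing along a row or a column gives a projection profile. This reading is an
assumption; the paper does not state the summation range. The Python sketch below uses
hypothetical names (`row_profile`, `column_profile`, `split_on_gaps`) and a made-up test
patch. It computes both profiles and splits the column profile at empty columns, in the
spirit of the vertical projection that [5] uses to divide words.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = text pixel, 0 = background


def row_profile(img: Image) -> List[int]:
    """Sum of pixel values along each row (horizontal projection)."""
    return [sum(row) for row in img]


def column_profile(img: Image) -> List[int]:
    """Sum of pixel values along each column (vertical projection)."""
    return [sum(col) for col in zip(*img)]


def split_on_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of consecutive non-zero runs; end is exclusive."""
    runs = []
    start = None
    for idx, value in enumerate(profile):
        if value > 0 and start is None:
            start = idx                      # a run of text begins
        elif value == 0 and start is not None:
            runs.append((start, idx))        # an empty line or column ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))   # run reaches the image border
    return runs


# Hypothetical 4x9 binary patch: two glyph-like blobs separated by empty columns.
patch = [
    [0, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

rows = row_profile(patch)
cols = column_profile(patch)
segments = split_on_gaps(cols)
```

For this patch the column profile is zero at column 0 and at columns 3 to 5, so the two
blobs come back as the column ranges (1, 3) and (6, 9), end exclusive. A real signboard
image would first need binarization, which is not shown here.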
Figure 3. Signboard Edges Detection

Figure 4. English Text Detection

Figure 5. Urdu Text Detection

Figure 6. (a) English Text Recognition (b) Urdu Text Recognition

V. STATISTICS AND ANALYSIS OF THE PROPOSED SYSTEM

We randomly selected 500 natural scene images from a database created in separate
research, which is currently under review for publication; these sample images were used
for training and testing. For feature extraction, a modified version of [15] was used. A
neural network was selected for the recognition of these images. Table 1 shows the
accuracy of the proposed system for each of its steps. The system performs recognition
for both English and Urdu, although it must be manually set to recognize either English
or Urdu. Accuracy was measured by counting the number of correctly recognized images.
The recognition along with the training samples is depicted in Figure 7(a), and the
efficiency is given in Figure 7(b).
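
The accuracy measure described above is the number of correctly recognized images
divided by the number of test images. The short Python sketch below recomputes it from
the counts listed in Table 1; the names `table_1` and `accuracy_percent` are
hypothetical and only serve the illustration.

```python
# Counts as listed in Table 1: (step, testing samples, correctly recognized).
table_1 = [
    ("Signboard Detection", 500, 425),
    ("Text Detection", 500, 405),
    ("Text Extraction", 500, 356),
    ("Text Recognition", 500, 351),
]


def accuracy_percent(correct: int, total: int) -> float:
    """Share of correctly recognized images, as a percentage with one decimal."""
    return round(100.0 * correct / total, 1)


accuracies = {step: accuracy_percent(correct, total)
              for step, total, correct in table_1}

for step, value in accuracies.items():
    print(f"{step}: {value}%")
```

With these counts, 405 out of 500 works out to 81%, whereas Table 1 lists 80% for text
detection; the paper does not say whether the count or the percentage is the intended
figure. The other three rows match the table.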
Figure 7: (a) English Text Recognition (b) Urdu Text Recognition

Table 1: Accuracy with Steps

Overall Steps         Testing Samples   Correctly Recognized   Accuracy
Signboard Detection   500               425                    85%
Text Detection        500               405                    80%
Text Extraction       500               356                    71.2%
Text Recognition      500               351                    70.2%

VI. CONCLUSION
ML algorithms are on the rise. Currently, ML is one of the most difficult topics in the
field of information technology. It is a combination of different techniques that allow
machines to learn from data, find patterns, and make predictions. This paper presents an
overall framework for text recognition from natural scenes. The image is first captured
from the outside environment, and the system then detects the edges of a signboard.
Signboard detection is followed by text detection. The identified text is then extracted
from the scene. Finally, a neural network is used for recognition. Experimental results
indicated good performance, with accuracy ranging from 70% for text recognition to 85%
for signboard detection.

ACKNOWLEDGMENT
This research work is jointly supported by the National Natural Science Foundation of
China under fund nos. 61572454, 61562453, and 61520106007, and by the State Key
Laboratory Intelligent Communication, Navigation and Micro-Nano System, Beijing
University of Posts and Telecommunications. The research reported in this paper has been
financially supported by the National High Technology 863 Program of China
(No. 2015AA124103) and by the National Key R&D Program no. 2016YFB05502001. The authors
are thankful for the financial support, guidance and assistance provided by the National
Natural Science Foundation of China and the State Key Laboratory Intelligent
Communication, Navigation and Micro-Nano System, BUPT.

References

[1] Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2018). Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research, 18, 1-43.
[2] Sahare, P., & Dhok, S. B. (2017). Review of Text Extraction Algorithms for Scene-text and Document Images. IETE Technical Review, 34(2), 144–164. https://doi.org/10.1080/02564602.2016.1160805
[3] Kaur, T. (2015). Text Detection and Recognition from Natural Scene, 4(7), 3211–3216.
[4] Liao, M., Shi, B., Bai, X., Wang, X., & Liu, W. (2014). TextBoxes: A Fast Text Detector with a Single Deep Neural Network, 4161–4167.
[5] Huang, X., Shen, T., Wang, R., & Gao, C. (2015). Text detection and recognition in natural scene images. 2015 International Conference on Estimation, Detection and Information Fusion (ICEDIF), 44–49. https://doi.org/10.1109/ICEDIF.2015.7280160
[6] Baran, R., Partila, P., & Wilk, R. (2018). Automated Text Detection and Character Recognition in Natural Scenes Based on Local Image Features and Contour Processing Techniques. In W. Karwowski & T. Ahram (Eds.), Intelligent Human Systems Integration (pp. 42–48). Cham: Springer International Publishing.
[7] Dai, Y., Huang, Z., Gao, Y., Xu, Y., Chen, K., Guo, J., & Qiu, W. (2018). Fused Text Segmentation Networks for Multi-oriented Scene Text Detection. 2018 24th International Conference on Pattern Recognition (ICPR), 3604–3609.
[8] Kaushik, D., & Verma, V. S. (2018). Review on Text Recognition in Natural Scene Images. In B. Panda, S. Sharma, & U. Batra (Eds.), Innovations in Computational Intelligence: Best Selected Papers of the Third International Conference on REDSET 2016 (pp. 29–43). Singapore: Springer Singapore. https://doi.org/10.1007/978-981-10-4555-4_3
[9] Tian, S., Yin, X., Member, S., Su, Y., & Hao, H. (2018). A Unified Framework for Tracking Based Text Detection and Recognition from Web Videos, 40(3), 542–554.
[10] Raghunandan, K. S., Shivakumara, P., Ieee, M., Roy, S., & Kumar, G. H. (2018). Multi-Script-Oriented Text Detection and Recognition in Video / Scene / Born Digital Images, 8215(c). https://doi.org/10.1109/TCSVT.2018.2817642
[11] H. Zhang, K. Zhao, Y. Z. Song, and J. Guo, “Text extraction from natural scene image: a survey,” Neurocomputing, Vol. 122, pp. 310–323, Dec. 2013.
[12] Shi, B., Bai, X., & Yao, C. (n.d.). An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition.
[13] Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems and Their Applications, 13, 18–28. Retrieved from https://ieeexplore.ieee.org/abstract/document/708428/authors#authors
[14] Sun, D., & Watada, J. (2015). Detecting pedestrians and vehicles in traffic scene based on boosted HOG features and SVM. In 2015 IEEE 9th International Symposium on Intelligent Signal Processing (WISP) Proceedings (pp. 1–4). https://doi.org/10.1109/WISP.2015.7139161
[15] Bhaskar, S., & Saravanan, K. N. (2017). Design and Description of Feature Extraction Algorithm for Old English Font.
