Signboard Detection and Text Recognition Using Artificial Neural Networks
Abstract— Machine learning (ML) is a sub-field of Artificial Intelligence (AI) that allows a computer to learn and adopt new rules. With an ML algorithm, the computer can identify patterns in observations, figures, and other formats, which may describe a model or other structure. The machine observes various events from the world and presents results with or without explicitly pre-programmed rules. This paper presents the recognition of scene text from the outside environment, focusing on signboards. A framework for the detection and recognition of text from the natural environment is presented. Going beyond paper-based text recognition, text recognition from natural scenes is divided into several phases. First, the image is captured from the outside environment with a smart device, followed by detection of the edges of a signboard. The next phase is the detection of text and its recognition in two languages, Urdu and English. The final phase uses an Artificial Neural Network for the classification and recognition of the text extracted from natural scenes or an outside environment. Experimental results have been generated on the image database developed as part of this research. The system is multilingual and produces output in Urdu and English. It performs well compared to traditional recognition systems and delivers an overall accuracy of 85% on the image results.

Keywords- Text Detection; Text Recognition; Natural Scenic; Text Detection & Recognition

978-1-7281-1190-2/19/$31.00©2019 IEEE

I. INTRODUCTION
Machine learning (ML) is an application of artificial intelligence (AI) that provides the ability to learn from experience without explicit programming. Machine learning focuses on developing computer algorithms that can access data and learn on their own. The learning process starts with observation, direct experience, or instruction in order to find patterns in examples and make better decisions in the future. The main purpose is to allow a computer to learn and adjust its behavior accordingly, without human intervention [1].

Learning through machines allows large amounts of data to be analyzed. Generally, faster and more accurate results are obtained for pattern identification. Images, fingerprints, and similar data may be regarded as patterns. Pattern recognition is the process of using machine learning algorithms to recognize such patterns.

Both machine learning (ML) algorithms and pattern recognition play a vital role in extracting information from natural scenes [2]. Natural scenes contain helpful and valuable information; extracting the text makes this information easy for humans and computers to understand. Research on extracting text from natural scenes is very active and challenging in computer vision applications [3]. Reading the text in a scene enables many useful applications based on geographic information. Despite its similarity to traditional optical character recognition (OCR), recognizing text in natural scenes is challenging because of random patterns, lighting conditions, variation in the view, and background objects that cannot be controlled [4].

In this paper, we introduce a framework to extract text from natural landscapes and natural scenes. Our primary focus is the identification of words in the scene. The paper is arranged into several sections. The next section reviews text detection and recognition. Section 3 discusses the methodology of extracting text from natural scenes. Section 4 presents results and discussion. Conclusion and future work are summarized in Section 5.

II. RELATED MATERIAL
Huang et al. (2015) proposed a method for detecting and recognizing text in scenes using two algorithms, MSER and SVM. Rather than performing text identification only at the end, they split the problem into detection and recognition. In the detection phase, the system extracts connected components using MSER and color clustering to obtain possible text. Next, non-text areas among the connected components are filtered out using visual saliency. Lastly, word lines are obtained by text line generation. In the
identification phase, the system uses vertical projection to divide words and then recognizes the characters with the SVM-based framework. Experimental results indicate better performance than traditional text detection and identification methods [5].

Baran et al. (2018) introduced a new, effective way to automatically detect text and identify characters in natural scene images. The text detection method belongs to the connected component category and uses the MSER feature. A new filter is designed to eliminate irrelevant regions and unrelated words (phrases). The final text is recognized using an OCR system. Finally, the application of the method within the IMCOP content delivery platform is explained [6].

Dai et al. (2018) present a new end-to-end framework for detecting multi-oriented scene text by segmentation. The system offers a fused text segmentation network that combines multi-level features. It draws on both semantic segmentation and region-based object detection, and is able to segment text instances. Without additional pipelines, the approach outperforms the current state of the art on multi-oriented text detection benchmarks: ICDAR 2015 and MSRA-TD reach 84.1% and 82.0%, respectively. The system also reports a baseline on text that includes curved text, suggesting the effectiveness of the proposed method [7].

Kaushik & Verma (2018) offer a detailed overview of various schemes for the text recognition process in natural scene images. The study also compares different state-of-the-art methods. The review focuses mainly on connected-component and region-based methods. The paper also outlines various benefits and limitations of text recognition from natural scenes [8].

Tian et al. (2018) proposed a Bayesian-based framework for tracking-based text detection and identification in videos with embedded subtitles. The framework consists of three main components: text tracking, detection, and identification. Within this unified framework, text is first handled by tracking. The tracking result is then corrected and improved using the detection or identification results, while text detection and identification are enhanced by multi-frame integration. A database (USTB-VidTEXT) is also made publicly available. Various experiments on this database confirm that the approach improves text detection and identification in web videos [9].

III. TEXT RECOGNITION FROM SCENES
Pattern recognition is a robust, exciting, and fast-growing field that supports developments in related areas such as image processing, computer vision, data acquisition systems, and neural networks. The methodology of the study comprises several phases, as shown in Figure 1. Capturing the image is the very first step for signboard detection. The next step is real-time signboard detection, which is a challenging task because our environment is full of different forms of signboards, including glow signboards, 3D, neon, and creative signboards, with different layouts, shapes, and angles. Text detection is the next phase, which is also very challenging because the text on a signboard may be blurred or have broken ligatures, and may vary in size, color, resolution, shape, texture, background, text geometry, lighting, and contrast with the background [10]. The text detection process has been performed with the help of the following equation:

$H_{r1} = \sum f_{i,j}$    (1)

where $f_{i,j}$ is a function of the property $I$, and $I = 0$ or $1$.

The text extraction process is also called text localization. This processing is considered for text detection and localization [11]. Recognition is the next and main phase, in which the image text (pixel-based text) is converted into a readable and editable form.

Figure 1. Several Phases

Recognition of the detected text is done by applying different techniques, including a neural network [12], an SVM classifier [13], and the HOG method [14].

IV. RESULTS THROUGH EXPERIMENTS
To evaluate and test the system, we performed experiments on different images captured from natural scenes. The natural scene images have been obtained from the natural scene image database created in our other research. Figure 2 shows a captured image, and the signboard edge detection is shown in Figure 3. Figure 4 and Figure 5 show the text detection in the natural scene using MSER. Figure 6(a) shows the text recognition in English, whereas Figure 6(b) shows the recognized Urdu text, saved in a file with the .txt extension.

Figure 2. Image from Natural Scene
Figure 3. Signboard Edges Detection
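Equation (1) sums a binary pixel function, and the vertical projection mentioned in the related work does the same along image columns. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that f(i, j) is 1 for a text pixel and 0 otherwise, and that the sum runs along one row or one column; the source does not state the summation index. The names `row_profile`, `column_profile`, and `runs` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple

# Assumption: a binarized image stored as rows of 0/1 pixels, 1 = text pixel.
Binary = List[List[int]]


def row_profile(img: Binary) -> List[int]:
    """Count text pixels in every row (one possible reading of Eq. (1))."""
    return [sum(row) for row in img]


def column_profile(img: Binary) -> List[int]:
    """Count text pixels in every column (a vertical projection)."""
    return [sum(col) for col in zip(*img)]


def runs(profile: List[int], min_count: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans where the profile >= min_count."""
    spans = []
    start = None
    for idx, value in enumerate(profile):
        if value >= min_count and start is None:
            start = idx                    # a text span begins here
        elif value < min_count and start is not None:
            spans.append((start, idx))     # the span ended at the gap
            start = None
    if start is not None:                  # span reaches the image border
        spans.append((start, len(profile)))
    return spans


# Toy 4x7 image with two separate blobs standing in for two characters.
toy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

text_rows = runs(row_profile(toy))        # rows 1..2 contain text
char_cols = runs(column_profile(toy))     # columns 1..2 and 4..5
print(text_rows, char_cols)
```

On the toy image the row profile isolates one text line and the column profile splits it into two candidate characters at the empty column. Real signboard images would first need binarization and noise filtering, which this sketch leaves out.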
supported by the National High Technology 863 Program of China (No. 2015AA124103) and by the National Key R&D Program No. 2016YFB05502001. The authors are thankful for the financial support, guidance, and assistance provided by the National Natural Science Foundation of China and the State Key Laboratory Intelligent Communication, Navigation and Micro-Nano System, BUPT.
References