Sujata Oak
Information Technology
Ramrao Adik Institute of Technology
Nerul, India
[email protected]
Abstract—The new generation of mobile phones has great hardware capability and fast processing, powerful enough to support applications that help users connect and interact with the world from their own comfort zone. This system is an OCR reading system which uses the camera application present in a smart phone combined with OCR (Optical Character Recognition). OCR is a mechanism which converts images of typed, handwritten, or printed text into machine-encoded text. The system helps the user take a picture of, or scan, a document using the phone's camera; the image is scanned, and the application reads the text written in the English language and converts the output into speech. The speech output is generated using a Text-To-Speech module. The purpose of delivering the output in the form of voice/speech is to serve the information present on the document to the visually impaired.

Keywords— Mobile Phone, Optical Character Recognition, Text to Speech, Visually Impaired.

I. INTRODUCTION

In today's world, although most information is represented electronically, the information printed on paper still has its own relevance. However, this information is not available to visually impaired people. To help them obtain this vital information, we propose a system in which the document is read electronically and converted into speech. This system helps the visually impaired become aware of the information presented on a document through speech. Our application recognizes the text captured by a mobile phone camera, displays the recognition result back on the screen of the mobile phone, and produces speech for the recognized text.

Mobile devices are becoming very popular, especially smartphones, and researchers are developing various applications for their users. The proposed system will be available to the end user in the form of an Android application that can be downloaded from application stores; developing an Android-based system therefore helps serve a greater number of people. The system helps the visually impaired read an essential document within a few clicks. The application requires an inbuilt camera in the phone. Since a visually impaired user may not be able to click a perfect picture of the document, pre-processing of the captured image will be done; all of these features are provided by the Optical Character Recognition (OCR) module. Once the image has been pre-processed, OCR reads and recognizes the text visible in the image. The final outcome of the system is audible speech which reads out the text recognized by the OCR module; to deliver this voice output, a Text-To-Speech (TTS) module is used. The application reads anything present in the English language, as well as any numerical value. Every special character, such as a comma, exclamation mark, full stop, or question mark, is taken into consideration, and the right pause is inserted whenever one of these is encountered. Thus, using OCR, the system recognizes and converts the image into text format, and Text-To-Speech further converts the recognized text into voice, which helps the visually impaired read the document and keeps them updated.

II. LITERATURE REVIEW

For text reading there are many different techniques available, such as label reading, the voice stick, the Brick Pi reader, and pen aids, but these methods perform text-to-speech by creating datasets. To address this problem, the finger reading technique was developed; it eliminates the previously created and stored datasets and provides a response by reading any text given as input in a captured image.

1. In [1], the authors suggest that MATLAB and LabVIEW are used to preprocess an image, which is then given as input; the image is segmented, and the OCR module then starts its process of text recognition. This system not only converted images to speech but also took input in text format from the user and converted it into speech, and can thereby be used by people with speech loss. Initially the system generates a bitmap in ARGB-8888 format and then passes it to the Tesseract engine for recognition.

2. In [2], the authors proposed a system that allows the user to view a virtual object in the real world using marker-based augmented reality. The user has to provide any one side of the image, which can be the left, right, top, or the
bottom view of the image. Later, to get the fully appearing virtual object, the image is placed on a 3D cube. A live video feed is given as input, from which the system generates binary images, i.e. digital images that have only two possible values for each pixel. These binary images are processed using an image processing technique to detect the AR marker. Once the AR marker is detected, its location is provided. The system then calculates the relative pose of the camera in real time. The term pose means the six-degrees-of-freedom (DOF) position, i.e. the 3D location and 3D orientation of an object. Finally, it displays the augmented image on the display screen.
4. In [5], the authors cover the major image processing techniques, such as image acquisition and processing. Here text is identified from an image using tools such as LabVIEW and the NI Vision toolkit.

5. In [6], the authors present an algorithm for extracting information from a business card using an Android mobile phone. The research used the open-source OCR software Tesseract, which helped overcome environmental conditions such as reflection, blurring, variable lighting, and scaling that appeared while clicking a picture. In this system a grayscale or color image is provided as input. The program takes .tiff and .bmp files, then converts the grayscale images into binary images. It calculates the optimal threshold separating the background and foreground pixel classes such that the combined within-class variance is minimal (equivalently, the variance between the two classes is maximal). It then finds locations having a pixel count less than a specific threshold. After each line of text is found, Tesseract examines the lines of text to find the approximate text.

6. In [7], the authors propose an algorithm which helps detect, localize, and extract text that is horizontally aligned in an image, irrespective of the background present. It uses projection profile analysis and geometric properties to segregate the text region and detect the text. After this processing, the result is sent to the OCR engine for character recognition.

7. In [8], the authors explain a system which extracts text characters from natural scenes using smart mobile devices. The algorithm designs a discriminative character descriptor and models each specific character class by designing stroke character maps that capture character structure.
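The binarization step summarized in [6] is essentially Otsu's method: choose the grayscale threshold that minimizes the combined within-class variance of the background and foreground pixel classes. A minimal pure-Python sketch of that idea follows; the pixel values below are an illustrative toy image, not data from the cited paper.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that minimizes the combined
    within-class variance of the two pixel classes (Otsu's method)."""
    # Build a 256-bin histogram of the grayscale values.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    best_t, best_var = 0, float("inf")
    for t in range(1, 256):
        w0 = sum(hist[:t])   # background (below threshold) pixel count
        w1 = total - w0      # foreground pixel count
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum(v * hist[v] for v in range(t)) / w0
        mean1 = sum(v * hist[v] for v in range(t, 256)) / w1
        var0 = sum(hist[v] * (v - mean0) ** 2 for v in range(t)) / w0
        var1 = sum(hist[v] * (v - mean1) ** 2 for v in range(t, 256)) / w1
        within = (w0 * var0 + w1 * var1) / total  # weighted within-class variance
        if within < best_var:
            best_var, best_t = within, t

    return best_t

# Toy image: dark ink pixels around 40, light paper pixels around 200.
pixels = [38, 40, 42, 45, 41] * 20 + [198, 200, 202, 205, 199] * 80
t = otsu_threshold(pixels)
binary = [1 if p >= t else 0 for p in pixels]  # 1 = paper, 0 = ink
```

The chosen threshold falls between the two intensity clusters, so the resulting binary image cleanly separates ink from paper, which is what the subsequent line-finding step in [6] relies on.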
III. PROPOSED METHODOLOGY

System Design:

Fig.1: Proposed Block Diagram.

This application will be available to users in the form of an Android application that can be downloaded from the application stores. The modules involved in this reading system are as follows:

•Camera:
The inbuilt camera application of any smart phone is used to capture an image of the document to be read. The image will contain textual regions from which the text will be recognized.

•OCR module:
The OCR (Optical Character Recognition) module is used to preprocess the image and to detect and identify the words in the English language. After the image is captured at a standard resolution, the textual regions within the image are localized. The system is only concerned with the textual regions; complex backgrounds are not within the scope of the project. The captured image is processed, and the first step in the processing is localization of the textual regions in the image. The identified characters are converted into machine-encoded text and displayed on the screen with the help of this module. The core functionality of the system is to recognize the text from the image. The localized textual regions are used as input for this feature, and the text (alphanumeric characters: A-Z, a-z and 0-9) in these regions is recognized.

•Text To Speech module:
This module is used to give speech output of the text converted from the image. It thus helps the user hear whatever is printed or written on the scanned document.

•User Interface module:
The User Interface module provides features such as a login page and a registration page, and also displays the result.
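As described in the Introduction, the speech output should pause at special characters such as commas, full stops, exclamation marks, and question marks. A minimal sketch of how the Text To Speech module could split recognized text into utterances paired with pause lengths is shown below; the pause durations and the chunk format are illustrative assumptions, not details taken from the paper's implementation (a real engine such as Android's TextToSpeech handles pauses through its own API).

```python
import re

# Illustrative pause lengths, in seconds, for each punctuation mark.
PAUSES = {",": 0.3, ".": 0.6, "!": 0.6, "?": 0.6}

def split_for_speech(text):
    """Split recognized text at punctuation, pairing each chunk with
    the pause (in seconds) to insert after speaking it."""
    chunks = []
    # Each match is a run of non-punctuation text plus at most one
    # trailing punctuation mark.
    for part, punct in re.findall(r"([^,.!?]+)([,.!?]?)", text):
        part = part.strip()
        if part:
            chunks.append((part, PAUSES.get(punct, 0.0)))
    return chunks

print(split_for_speech("Hello, world. Ready?"))
# → [('Hello', 0.3), ('world', 0.6), ('Ready', 0.6)]
```

Each chunk would then be spoken in turn, sleeping for the indicated pause between utterances, which gives the "right pause" behavior the system aims for.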
Proposed Approach:

RESULT

Text Recognition:
The AI-based reading system for the blind converts the image file into text, displays it on the screen, and gives a voice output. For conversion and recognition of text it uses OCR, which relies on an in-built library for recognition. It successfully recognizes the English alphabet, A-Z and a-z.

Hand written Text Recognition:

Number Recognition:
The system not only recognizes alphabetic words and letters but also identifies numerical values from 0-9, which are likewise delivered in the form of speech.

CONCLUSION

The AI-based reading system using OCR is an artificial-intelligence reading system developed using a smart phone's camera combined with OCR (Optical Character Recognition). The application detects text using the camera, scans it, converts it into digital text recognized by the system, displays the recognized text, and gives speech output. To understand the dynamics of the project, a basic idea of what AI and OCR are is required. This report explains the entire working of the system, along with the minimum requirements needed to implement it. Hence, a visually impaired person can easily use this AI-based reading system as a simple, friendly application anywhere around the globe.
REFERENCES