Journal of Xidian University    https://doi.org/10.37896/jxu15.7/029    ISSN No: 1001-2400
SMART GLASSES FOR BLIND PEOPLE
J ANUSHA [1], S CHARAN [2], M YAMINI [3], MANISHA BAGH [4], P V PAVAN RAJGOPAL [5]
DEPARTMENT OF ELECTRONICS AND COMMUNICATION [1,2,3,4,5]
POTTI SRIRAMULU CHALAVADI MALLIKARRJUNA RAO COLLEGE OF ENGINEERING AND TECHNOLOGY
ABSTRACT:
These "Smart Glasses" are designed to help blind people read and translate typed text written in the English language. Inventions of this kind serve as a way to motivate blind students to complete their education despite their difficulties; the main objective is to develop a new way of reading text for blind people and to facilitate their communication. The main task of the glasses is to scan a text image and convert it into audio, which the user listens to through a headphone connected to the glasses. The glasses are equipped with an ultrasonic sensor that measures the required distance between the user and the object carrying the text, so that a clear picture can be taken; the picture is captured when the user presses a button. All computing and processing operations were performed on a Raspberry Pi 3B+ and a Raspberry Pi 3B. In the results, the combination of OCR with the EAST detector provided very high accuracy, with the glasses able to recognize almost 99% of the text. However, the glasses have some drawbacks: they currently support only the English language, and the maximum distance for capturing images is between 150 cm and 300 cm. As future work, it is possible to support many languages and to enhance the design so that the glasses are smaller and more comfortable to wear.
1. Introduction
In our lives, there are many people who suffer from different diseases or handicaps. These people need help to make their lives easier and better. The main goal of the "Smart Glasses" is to help blind people and people with vision difficulties by introducing a new technology that enables them to read typed text. The glasses are equipped with technology to scan any written text and convert it into audio; they can also translate words from English to Arabic using the Google API. The aim of the "Smart Glasses" is to help these people in different aspects of life. For example, the glasses are especially helpful in the education field: blind people and people with vision difficulties can read, study and learn from any printed text image. In this way, the "Smart Glasses" encourage blind people and people with vision difficulties to learn and succeed in many different fields.

2. Literature Review
Text detection and recognition have been challenging problems in several computer vision fields, and many research papers have discussed methods and algorithms for extracting text from images. The main purpose of this literature review is to survey some of these methods and their effectiveness in terms of accuracy. An end-to-end text recognition approach combining the power of neural networks with new unsupervised feature learning took advantage of a known training framework to achieve high accuracy in its text detection and character recognition modules; the two models were combined using simple methods to build an end-to-end text recognition system. The datasets used were ICDAR 2003 and SVT, and its 62-way character classifier obtained 83.9% accuracy on cropped characters from the first dataset. A novel scene text recognition algorithm depended mainly on machine learning methods. Two types of classifiers were designed to improve accuracy: the first generates candidates, while the second filters out candidates that are not text. A novel technique was also developed to take advantage of multi-channel information. Two datasets were used in that study, ICDAR 2005 and ICDAR 2011, and the method achieved significant results under different evaluation protocols [2]. PhotoOCR is a system designed to detect and extract text from any image using machine learning techniques; it also uses distributed language modelling.
The goal of this system was to recognize text in challenging images, such as poor-quality or blurred ones, and it has been used in applications such as Google Translate. The datasets used were ICDAR and SVT, and the results showed a processing time for text detection and recognition of around 600 ms per image [3]. A method for text recognition in natural scene images proposed an accurate and robust approach for detecting text, using an MSER algorithm to detect almost all characters in an image. The datasets used were ICDAR 2011 and a multilingual dataset, and the results showed that MSER achieved 88.52% character-level recall [4]. An end-to-end real-time text recognition and localization system used an ER (Extremal Regions) detector that covered about 94.8% of the characters; the average run time on an 800×600 image was 0.3 s on a standard PC. On the ICDAR 2011 dataset the method achieved 64.7% image recall, and on SVT it achieved 32.9% [5]. Text detection and localization using oriented stroke detection took advantage of two important techniques, connected components and a sliding window: a character or letter is recognized as a region of the image containing strokes in a particular direction and position. The dataset used was ICDAR 2011, and the experiments showed 66% recall, better than previous methods. EAST (Efficient and Accurate Scene Text detector) is a simple and powerful pipeline for detecting text in natural scenes that achieves both high accuracy and efficiency. Three datasets were used in that study: ICDAR 2015, COCO-Text and MSRA-TD500. The experiments showed that the method outperforms previous methods in both accuracy and efficiency.
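Since the proposed glasses pair the EAST detector with OCR, a brief sketch of how EAST is commonly run may help. The snippet below is illustrative only: it assumes OpenCV's DNN module and the publicly available pretrained model file frozen_east_text_detection.pb (neither is an artifact of this paper), and it simplifies the rotated-box decoding to axis-aligned boxes.

```python
# Illustrative EAST text detection via OpenCV's DNN module (a sketch,
# not the paper's implementation). Rotation decoding is omitted.
import cv2

net = cv2.dnn.readNet("frozen_east_text_detection.pb")
image = cv2.imread("page.jpg")
H, W = 320, 320  # EAST input sides must be multiples of 32
blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
                             (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                "feature_fusion/concat_3"])

boxes, confidences = [], []
rows, cols = scores.shape[2:4]
for y in range(rows):
    for x in range(cols):
        score = float(scores[0, 0, y, x])
        if score < 0.5:
            continue
        # Geometry holds distances from this cell to the box edges
        # (top, right, bottom, left) plus a rotation angle, ignored here.
        d_top, d_right, d_bottom, d_left = geometry[0, 0:4, y, x]
        cx, cy = x * 4.0, y * 4.0  # each output cell covers 4x4 input pixels
        boxes.append([int(cx - d_left), int(cy - d_top),
                      int(d_left + d_right), int(d_top + d_bottom)])
        confidences.append(score)

# Non-maximum suppression keeps one box per detected word region.
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
```

Each surviving box can then be cropped from the original image and handed to the OCR stage described in the next section.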
3. Proposed System
The smart eyewear design depends mainly on the processing unit, in this case the Raspberry Pi 3B+. The main hardware is a Linux-based ARM processor board that accepts a micro SD card, which allows the number of task functions to be increased as desired. A Raspberry Pi camera is used for image acquisition; it is connected to the Raspberry Pi with a flex cable and fixed at the top middle of the glasses for optimal image capture. The Raspberry Pi's audio port connects to an earpiece, and its GPIO port is configured to receive input from push-button switches. To make the text easier to identify, the reading material is placed within a custom-designed frame with red borders.

The general principle of operation for such glasses is that the user gives instructions via the switches and listens to the output through the earpiece. In this case, the user starts a task mode with a push of the button. In text recognition mode, the glasses first confirm that the text area is correctly positioned and readable; otherwise, they ask the user to change the orientation of the material. After confirmation, the view is processed in real time: the image is sent to optical character recognition (OCR) software for text extraction and subsequently forwarded to a text-to-speech synthesizer, and the text is then read out through the audio output port. The image processing adopted in this work was implemented using Simulink (MathWorks, Natick, MA). In reading mode, the main challenges are image quality and the position and orientation of the text in the image, so the first step is to detect the red borders and the frame orientation. To simplify subsequent image processing, we propose an indicator that informs the user if the image is significantly skewed or if part of the frame is cropped. Once the text area is localized and cropped, the image is enhanced by noise filtering, contrast enhancement (a histogram matching technique) and morphological operations. The Tesseract OCR engine [9] is used in the last step to extract the text before it is converted into audio output.
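As a rough illustration of this capture-enhance-OCR-speak chain, here is a minimal Python sketch. It assumes OpenCV, pytesseract and pyttsx3 rather than the Simulink implementation actually used in this work, and it substitutes plain histogram equalization for the histogram matching step; file names and function names are hypothetical.

```python
# Minimal sketch of the reading pipeline: enhance the image, run Tesseract,
# and speak the result. Library choices are illustrative, not the paper's.
import cv2
import pytesseract
import pyttsx3

def read_text_aloud(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Noise filtering and contrast enhancement (the paper uses histogram
    # matching; global equalization is a simple stand-in here).
    gray = cv2.medianBlur(gray, 3)
    gray = cv2.equalizeHist(gray)

    # Morphological opening removes small speckles before OCR.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)

    # Tesseract extracts the text; a text-to-speech engine reads it out.
    text = pytesseract.image_to_string(gray)
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text

if __name__ == "__main__":
    print(read_text_aloud("captured_page.jpg"))
```

On the glasses, a function like this would be triggered from the push-button handler on the GPIO port rather than run from the command line.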
Hardware System
The proposed system includes a depth camera for acquiring depth information about the surroundings; an ultrasonic rangefinder, consisting of an ultrasonic sensor and an MCU (Microprogrammed Control Unit), for measuring the obstacle distance; an embedded CPU (Central Processing Unit) board acting as the main processing module, which performs operations such as depth image processing, data fusion, AR rendering and guiding-sound synthesis; a pair of AR glasses to display the visual enhancement information; and an earphone to play the guiding sound. The hardware configuration of the proposed system is illustrated in Fig. 1, and the initial prototype of the system is shown in Fig. 2.

Fig. 1. The hardware configuration of the proposed system.

In this work, the ultrasonic sensor is mounted on the glasses. The sensor operates at 40 kHz: the samples are sent by the sensor's transmitter, the object reflects the ultrasound wave, and the sensor's receiver picks up the reflected wave. The distance to the object can then be obtained from the
time interval between sending the wave and receiving it, since distance = (time interval × speed of sound) / 2. As shown in Fig. 3, the trigger pin of the sensor must receive a high pulse (5 V) for at least 10 µs to start a measurement; this triggers the sensor to transmit a burst of 8 ultrasonic cycles at 40 kHz and wait for the reflected burst. Once the sensor has sent the 8-cycle burst, its Echo pin is set high, and when the reflected burst is received the Echo pin is set low, producing a pulse at the Echo pin. If no reflected burst is received within 30 ms, the Echo pin stays high, and the distance is set to a very large value to represent that there is no object in front of the user. The MCU controls the ultrasonic sensor, starting measurements and detecting the pulse at the Echo pin. The working range of the ultrasonic sensor is taken to be about 40 cm to 150 cm.
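A minimal sketch of this trigger/echo timing follows, assuming an HC-SR04-style sensor wired directly to Raspberry Pi GPIO pins. The pin numbers and the RPi.GPIO polling approach are assumptions for illustration; in the paper this task is delegated to the MCU.

```python
# Sketch of the 10 us trigger pulse / echo-pulse-width measurement described
# above. Pin numbers are assumptions; the paper uses a separate MCU for this.
import time
import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24
GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def measure_distance_cm(timeout=0.03):
    # A 10 us high pulse on the trigger pin starts one measurement.
    GPIO.output(TRIG, True)
    time.sleep(10e-6)
    GPIO.output(TRIG, False)

    # Echo goes high once the 8-cycle 40 kHz burst has been sent...
    start = time.time()
    while GPIO.input(ECHO) == 0:
        if time.time() - start > timeout:
            return None  # no echo: no object in front of the user
    pulse_start = time.time()

    # ...and goes low when the reflected burst is received.
    while GPIO.input(ECHO) == 1:
        if time.time() - pulse_start > timeout:
            return None  # no reflection within 30 ms
    pulse_end = time.time()

    # distance = round-trip time x speed of sound (~34300 cm/s) / 2
    return (pulse_end - pulse_start) * 34300 / 2
```

A `None` return here corresponds to the "very large distance" convention used above to signal that no object is in front of the user.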
Fig. 3. Diagram of the proposed system.

4. Result
Several sample texts were prepared and tested. The operation of the project for the reading task, the reference text and the experimental results obtained with the proposed smart glasses are shown in Figure 1 below. Admittedly the text is relatively simple, but it proves the basic concept of our design.
5. Conclusion and Future Work
This paper presents a new concept of smart glasses designed for visually impaired people using a low-cost single-board computer, the Raspberry Pi 3B+, and its camera. For demonstration purposes the glasses are designed to perform text recognition, but the system's capability can easily be extended to multiple tasks by adding more models to the core program, restricted only by the size of the Raspberry Pi's SD card. Each model represents a specific task or mode, and the user can run the desired task independently of the other tasks. The system design, working mechanism and principles were discussed along with some experimental results. This new concept aims to improve visually impaired students' lives regardless of their economic situation. Immediate future work includes assessing the user-friendliness of the glasses and optimizing the power management of the computing unit.

6. References
WHO: Visual impairment and blindness. WHO, 7 April 1948. http://www.who.int/Mediacentre/factsheets/fs282/en/. Accessed Oct 2015
UNESCO: Modern Stage of SNE Development: Implementation of Inclusive Education. In: ICTs in Education for People with Special Needs, pp. 12-14. Institute for Information Technologies in Education UNESCO, Moscow, Kedrova (2006)
Low vision assistance. EnableMart (1957). https://www.enablemart.com/vision/low-visionassistance. Accessed Oct 2015
Velázquez, R.: Wearable assistive devices for the blind. In: Lay-Ekuakille, A., Mukhopadhyay, S.C. (eds.) Wearable and Autonomous Systems. LNEE, vol. 75, pp. 331-349. Springer, Heidelberg (2010)
Jafri, R., Ali, S.A.: Exploring the potential of eyewear-based wearable display devices for use by the visually impaired. In: International Conference on User Science and Engineering, Shah Alam, 2-5 September 2014
The Macular Degeneration Foundation: Low Vision Aids & Technology. The Macular Degeneration Foundation, Sydney, Australia, July 2012