Real Time Indian Sign Language Recognition and Speech Generation Using Convolutional Neural Network
Abstract
Sign language is an essential mode of communication for the deaf community worldwide. Even today, there is an evident
communication gap between sign language users and non-sign-language users. To resolve this problem, researchers have proposed
real-time sign language recognition systems. While American Sign Language (ASL) has been extensively studied, Indian Sign
Language (ISL) has received less attention. This paper explores the challenges and opportunities involved in
developing a real-time sign language recognition system for ISL. The purpose of the project is to bridge the communication gap
for the hearing-impaired community and provide a tool for better accessibility. The paper identifies the sign recognition and speech
generation challenges while delivering a system capable of recognizing different ISL characters. The objective is to focus
on the words that deaf people use daily rather than on individual alphabets.
Keywords: CNN (Convolutional Neural Network), ISL (Indian Sign Language), SLR (Sign Language Recognition), Speech
Generation.
I. INTRODUCTION
Sign language (SL) is a visual-gestural language used by deaf and hard-of-hearing people to communicate with others. Deaf
people express meaning through hand gestures and other parts of the body to convey their message. Sign languages have
vocabularies and grammars entirely different from those of spoken languages. [1]
Hearing loss is one of the most common disabilities. According to the World Health Organization, the number of people
with disabling hearing loss worldwide is projected to reach 630 million by 2030. India, one of the world's most populous
countries, has around 63 million deaf or hearing-impaired citizens, and the true figure may be even higher. For a
hearing-impaired person, sign language can be a very useful adaptation. Deaf people use it to communicate with others
daily, making it like a mother tongue for them. It is based on visual hand signs and has its own vocabulary.[2]
Indian Sign Language (ISL) is the official sign language of the deaf community in India. It is made up of structured, coded motions, each
of which has a distinct meaning. People who are deaf or mute have traditionally relied on interpreters to help them interact with others,
but it can be difficult to find a suitable interpreter. For a hearing-impaired person, sign language is the only completely
dependable way of communicating and expressing their views to others. Accordingly, the field of sign language recognition has seen a
rise in interest over the last few years.
II. RELATED WORK
Honnaraju B, Meghana M, et al. [4] focus on recognition utilities that work in real time to ensure their usability in varied
situations. They achieved this goal by creating a custom dataset that addresses issues such as rotational
variability and background dependence. Their system was successfully trained on all 36 ISL letters and numbers, reaching an
accuracy of 99%. The paper presents a method that uses a visual-vocabulary model to identify Indian Sign Language
letters and numbers in a live video stream. To predict the data, the project used skin-color segmentation and background subtraction
along with CNN and SVM to perform sign and speech prediction.
Snehal H, Manpreet S, et al. [3] conducted a survey of the deaf and hard-of-hearing (DHH)
community, with an emphasis on Indian Sign Language (ISL) recognition. In this study, they propose the development of a
hybrid CNN-based system for real-time recognition of ISL. While the system is still in its early stages of development,
their detailed analysis of current methods and their limits provides valuable insights for future implementation. By
promoting technology that responds to the specific needs of DHH communities, the survey highlights the importance
of socially responsible technology development.
Aman P, Avinash K., et al. [5] aim to provide a feasible way of communication between hearing and speech-impaired people
by using hand gestures. The proposed system can be accessed through a webcam or any
built-in camera that can detect a sign and process it for recognition. The system can also detect words such as Hello,
Welcome, and Thank you. Human-movement classification was found to be the best image-processing method. The system
can recognize selected sign language characters in a controlled, low-light environment with 70% to 80% accuracy.
III. CONVOLUTIONAL NEURAL NETWORK
The Convolutional Neural Network (CNN) architecture used in this study is structured to effectively learn and classify sign
language images. The model's input layer accepts grayscale images with a size of 100 x 100 pixels. The network is successively
composed of multiple sets of Conv2D layers, each followed by MaxPooling2D layers. These convolutional layers play a crucial
role in extracting features from the input images, with ReLU activation functions introducing nonlinearity to facilitate learning
complex patterns. MaxPooling layers are designed to shrink spatial dimensions, which helps reduce computational complexity
and control overfitting. To prevent overfitting, dropout layers are strategically placed after each MaxPooling layer, which
randomly deactivates a portion of neurons during training.
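The conv-then-pool behavior described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (they use Keras Conv2D/MaxPooling2D layers); it only demonstrates how a single 3x3 filter with ReLU and 2x2 max pooling shrinks a 100 x 100 input, and the kernel values here are random placeholders rather than learned weights.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as CNN libraries implement it)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Elementwise nonlinearity: negative responses are zeroed."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; halves each spatial dimension for size=2."""
    h, w = x.shape
    h, w = h - h % size, w - w % size  # trim to a multiple of the window
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.random.rand(100, 100)                  # one grayscale input image
kernel = np.random.randn(3, 3)                  # placeholder for one learned 3x3 filter
features = max_pool(relu(conv2d(img, kernel)))  # conv -> ReLU -> 2x2 max pool
print(features.shape)                           # (49, 49): 100 -> 98 (valid conv) -> 49 (pool)
```

The halving of spatial dimensions at each pooling stage is what keeps the computational cost of the later layers manageable.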
Following the convolutional layers, a flattening layer reshapes the multidimensional output into a one-dimensional array
and prepares it for input into fully connected layers. The fully connected layers consist of a 512-unit dense layer with ReLU
activation, followed by a dropout layer with a dropout rate of 0.5 to further mitigate overfitting. Finally, the output layer uses
softmax activation with 36 units, one per class in the dataset. During training, the model is compiled with
the Adam optimizer and the categorical cross-entropy loss function, with accuracy as the metric of interest. A batch size of
256 and 100 epochs are used for training.
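The layer dimensions stated above can be checked with back-of-the-envelope arithmetic. The paper does not state the number of convolutional blocks or their filter counts, so the three blocks with 32/64/128 filters and 'same'-padded 3x3 kernels below are illustrative assumptions, not the authors' configuration; only the 100 x 100 grayscale input, the 512-unit dense layer, and the 36-way softmax come from the text.

```python
# Walk the 100x100x1 grayscale input through a hypothetical stack of three
# Conv2D (3x3, 'same' padding) + 2x2 MaxPooling2D blocks and count parameters.
filters = [32, 64, 128]   # assumed filter counts -- not given in the paper
h = w = 100               # stated input resolution
c = 1                     # grayscale, so a single input channel
params = 0
for f in filters:
    params += (3 * 3 * c + 1) * f   # 3x3 conv weights + one bias per filter
    c = f                           # output channels feed the next block
    h, w = h // 2, w // 2           # 2x2 max pooling halves each dimension
flat = h * w * c                    # size after the flattening layer
params += (flat + 1) * 512          # 512-unit dense layer (ReLU)
params += (512 + 1) * 36            # 36-way softmax output layer
print(h, w, c, flat, params)        # 12 12 128 18432 9548836
```

Even in this modest sketch, the dense layer dominates the parameter count, which is one reason the 0.5 dropout before it matters for overfitting control.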
IV. METHODOLOGY
Our proposed methodology for real-time Indian Sign Language recognition aims to accurately interpret and classify ISL gestures. To
achieve this, we employ a multi-step process that involves data acquisition, preprocessing, feature extraction, and model
training.[6] The pipeline is designed to capture the intricate details and movements of ISL signs, from dataset collection
through CNN training. Fig. 1 shows the general flow of the proposed architecture, representing the sequential steps involved.
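The preprocessing step between data acquisition and the CNN can be sketched as follows. The paper does not specify its exact preprocessing, so this is a plausible stand-in: standard luminance weights for grayscale conversion, nearest-neighbour resizing to the 100 x 100 input the network expects, and scaling to [0, 1].

```python
import numpy as np

def preprocess(frame, size=100):
    """Reduce a captured RGB frame to a 100x100x1 grayscale array in [0, 1].

    The exact preprocessing is not specified in the paper; this sketch uses
    ITU-R BT.601 luminance weights, nearest-neighbour resizing, and
    division by 255 for normalization.
    """
    gray = frame @ np.array([0.299, 0.587, 0.114])  # RGB -> luminance
    h, w = gray.shape
    rows = np.arange(size) * h // size              # nearest-neighbour row indices
    cols = np.arange(size) * w // size              # nearest-neighbour column indices
    resized = gray[np.ix_(rows, cols)]              # sample a 100x100 grid
    return (resized / 255.0).astype(np.float32)[..., None]  # add channel axis

# A random stand-in for one 480x640 webcam frame:
frame = np.random.randint(0, 256, (480, 640, 3)).astype(np.float64)
x = preprocess(frame)
print(x.shape)  # (100, 100, 1), ready for the CNN input layer
```

In the real pipeline a hand-segmentation step (e.g. background subtraction, as in the cited work) would precede this resizing, so that the 100 x 100 crop is dominated by the signing hand rather than the scene.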
Fig. 3. Result
Fig. 4. Confusion matrix
VI. CONCLUSION
In conclusion, our study achieves a commendable 73.4% accuracy in real-time Indian Sign Language recognition and
speech generation. By utilizing a Convolutional Neural Network (CNN) architecture, our model bridges the communication gap
for the hearing-impaired by effectively recognizing ISL gestures from image inputs. This innovation can change lives by
enabling smooth communication between ISL users and those who rely on spoken language. Although our system shows
encouraging accuracy, continuous efforts are focused on improving real-time processing capabilities and fine-tuning
the model architecture. With continued advancements, we envision a future in which barriers to communication are removed,
promoting inclusivity and fair access to communication resources.
REFERENCES
[1] Deep, A. Litoriya, A. Ingole, V. Asare, S. M. Bhole, S. Pathak. "Realtime Sign Language Detection and Recognition", 2022 2nd Asian
Conference on Innovation in Technology (ASIANCON), DOI: 10.1109/ASIANCON55314.2022.9908995, August 2022.
[2] K. Goyal, Dr. Velmathi G. "Indian Sign Language Recognition Using MediaPipe Holistic",
https://arxiv.org/ftp/arxiv/papers/2304/2304.10256.pdf, Vellore Institute of Technology, Chennai.
[3] S. Hon, M. Sidhu, S. Marathe, T. A. Rane. "Real Time Indian Sign Language Recognition using Convolutional Neural Network",
IJNRD.ORG, ISSN: 2456-4184, Department of Information Technology, PICT, Pune. Volume 9, Issue 2, February 2024.
[4] Honnaraju B, Meghana M, Sanjana D S, Nisarga N S, Nikhil H R. "Sign Language Recognition Using Deep Learning (CNN) and
SVM", International Research Journal of Modernization in Engineering Technology and Science, e-ISSN: 2582-5208,
Volume 05, Issue 05, May 2023.
[5] A. Pathak, A. Kumar, Priyam, P. Gupta, G. Chugh, Dr. Akhilesh Das. "Real Time Sign Language Detection", International Journal for
Modern Trends in Science and Technology, 8(01): pp. 32-37, ISSN: 2455-3778, 2022.
[6] Dr. Honnaraju B, Meghana M, Sanjana D S, Nisarga N S, Nikhil H R. "Sign Language Recognition Using Deep Learning (CNN) and
SVM", International Research Journal of Modernization in Engineering Technology and Science, Impact Factor 7.868,
e-ISSN: 2582-5208, Volume 05, Issue 05, May 2023.
[7] R. Kadwade, A. Tangade, N. Pakhare, S. Kolhe, H. Waikar, S. J. Wagh. "Indian Sign Language Recognition System", International
Journal of Engineering Research & Technology (IJERT), Maharaja Institute of Technology, Mysore, Karnataka, India. Vol. 12,
Issue 05, May 2023. ISSN: 2278-0181.
[8] V. Jain, A. Jain, A. Chauhan, S. S. Kotla, A. Gautam. "American Sign Language Recognition using Support Vector Machine and
Convolutional Neural Network", International Journal of Information Technology 13, DOI: 10.1007/s41870-021-00617-x,
February 2021.
[9] D. Kothadiya, C. Bhatt, K. Sapariya, K. Patel, A.-B. Gil-González, J. M. Corchado. "Deepsign: Sign Language Detection and
Recognition Using Deep Learning", Electronics 2022, 11, 1780. https://doi.org/10.3390/electronics11111780.
Published: 3 June 2022.
[10] S. Malik, Y. Kholiwal, Dr. Jayashree J. "Sign Language Recognition and Detection: A Comprehensive Survey",
Journal of Data Acquisition and Processing, Vol. 38(3), 2023. DOI: 10.5281/zenodo.777632. ISSN: 1004-9037.