ML Assisted Sign Language to Speech Conversion Gloves for the Differently Abled
2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI) | 979-8-3503-6052-3/24/$31.00 ©2024 IEEE | DOI: 10.1109/IATMSI60426.2024.10503002

K H Akhil
Department of Electrical and Electronics Engineering,
Amrita Vishwa Vidyapeetham,
Bangalore, India
[email protected]

K Deepa
Department of Electrical and Electronics Engineering,
Amrita Vishwa Vidyapeetham,
Bangalore, India
[email protected]
Abstract — In this rapidly changing world, where everyone is vying for recognition, opportunities, success, and survival, people with disabilities frequently find themselves marginalized. They struggle to blend in with society and receive different treatment. The advancement of AI-assisted technologies has contributed to closing this social gap and has given the disabled newfound hope. The “Sign-to-Speech” glove makes it possible to translate sign language into speech. Flex sensors that track finger movement are integrated into the glove, and an Arduino microcontroller is used to process the sensor data. The speech synthesis software runs on a Raspberry Pi, which is connected to the Arduino. The glove functions by tracking the finger and hand movements that create various sign language expressions. The Arduino receives data from the flex sensors, which register variations in resistance as the fingers move, and acts as an analog-to-digital converter. The machine learning model is deployed on the Raspberry Pi and is used to identify the sign and convert it into speech. The speech output is routed through a speaker, making it possible for the listener to hear it. By providing pertinent training data to the machine learning model, the device can be configured for various sign languages, making the overall system versatile.

Keywords — “Sign to Speech glove”, Arduino, Raspberry Pi, Machine Learning model.

I. INTRODUCTION

People who are hearing and speech impaired encounter many difficulties in their daily lives, even while completing basic tasks. Using sign language to communicate with others is one of these difficulties, because only a small portion of the general population knows and practises sign language. Various systems making use of Artificial Neural Networks (ANN) have been proposed which can convert sign language to speech [1]. Those who are speech impaired or deaf encounter numerous challenges when trying to communicate with others in the community. This can lower their self-esteem and cause them to feel socially isolated. The general public and the deaf can communicate more easily when sign language is used. A variety of technologies are used to translate American Sign Language into text. Indian Sign Language has been widely used by the deaf community within India for many years, although not much research has been done on it [2-3]. Using various datasets available on the internet, different deep learning-based models have been developed to detect motion and gesture and thus translate sign language to text [4].

Several artificial neural network-based solutions have also been proposed in the past few years. These models employ RNN and LSTM networks to translate sign language into speech. Neural networks can also be combined with hardware-based models such as gloves to further increase accuracy [5-7]. Gloves can be designed to convert sign language to speech using flex sensors. They drastically reduce the cost but are less accurate than their competition. Different combinations of flex sensor values can be mapped to bits, which are then converted to short speech [8]. Flex sensors can also be paired with accelerometers and gyroscopes to increase the glove's freedom of motion and widen its range of gestures. Making these systems convert sign language to languages other than English is an added advantage [9-10]. Another way to achieve sign language to speech conversion is to use keypads and haptic feedback for two-way communication; machine learning can also be integrated into such a model to increase its efficiency and accuracy [11]. Machine learning can further be used to convert live audio into sign language, which allows those who are not familiar with sign language to communicate in both directions [12].

The proposed model is a hand glove fitted with flex sensors, a gyroscope and an accelerometer to detect hand gestures. Generally, gloves implementing the range comparison method are inexpensive but tend to be less accurate. Machine learning algorithms can be integrated into these gloves to increase the accuracy while keeping the cost relatively low. All things considered, these gloves offer a prompt and effective means of communication and aid in closing the gap between people with special needs and the general public. The sign language to speech conversion gloves presented here use a machine learning algorithm, random forest, to increase the accuracy of the gloves. The paper also puts forward a simple, lightweight yet highly effective sign language to speech conversion system that is portable and user-friendly.

This paper consists of seven sections, including the introduction. Section two, “Proposed System”, contains block diagrams representing the proposed system and its working, along with brief explanations. The third section, “Hardware Components”, gives information about every component used in this model. The fourth section, “Results and Discussion”, presents and analyzes the outputs obtained. Section five discusses the future scope of the work.
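To make the random forest approach above concrete, the following sketch trains a scikit-learn classifier on flex-sensor feature vectors. All readings and label names here are hypothetical illustrations, not the paper's actual dataset:

```python
# Minimal sketch: a random forest over flex-sensor feature vectors.
# Sensor values and labels are hypothetical, not the authors' data.
from sklearn.ensemble import RandomForestClassifier

# Each sample: five flex-sensor readings (one per finger), 10-bit ADC range.
X_train = [
    [120, 130, 125, 118, 122],   # hypothetical readings for sign 'A'
    [118, 128, 127, 121, 119],
    [860, 870, 855, 865, 858],   # hypothetical readings for sign 'B'
    [855, 868, 860, 862, 861],
]
y_train = ["A", "A", "B", "B"]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# A new reading close to the 'A' cluster is classified as 'A'.
print(clf.predict([[121, 129, 126, 119, 120]])[0])   # prints "A"
```

In practice the training set would hold many samples per letter of the chosen sign language, which is what makes the system reconfigurable for other sign languages.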
Fig. 2. Block diagram representation of the working of the sign language to speech gloves

The working of the proposed system is as given below:

5) Speakers
The speaker used is rated for 8 Ohm, 0.25 Watt applications. It gives the voice output according to the sign identified by the machine learning model.

6) Software
The open-source Arduino IDE (Integrated Development Environment) was utilized to program and develop software for the Arduino Mega 2560. Raspberry Pi OS (64-bit) was installed on the Raspberry Pi, and the code was run in Python using the Thonny editor. The main Python libraries used were scikit-learn, pandas, numpy and pyttsx3.
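The Raspberry Pi side of this software stack can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes the Arduino streams one comma-separated line of flex-sensor readings per gesture over USB serial (the port name and message format are assumptions), and it uses pyserial, a trained scikit-learn classifier, and pyttsx3 for the voice output:

```python
# Sketch of the Raspberry Pi side: parse one line of sensor readings
# from the Arduino, classify it, and speak the predicted letter.
# Serial port name and the comma-separated message format are assumptions.

def parse_reading(line: str) -> list[int]:
    """Convert one serial line like '120,130,125,118,122' to ADC integers."""
    return [int(v) for v in line.strip().split(",")]

def speak(text: str) -> None:
    """Voice the predicted letter through the speaker using pyttsx3."""
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def run(clf, port: str = "/dev/ttyACM0", baud: int = 9600) -> None:
    """Main loop: read gestures over serial and speak each prediction.
    `clf` is a trained classifier, e.g. a scikit-learn RandomForestClassifier."""
    import serial                      # pyserial
    ser = serial.Serial(port, baud)
    while True:
        features = parse_reading(ser.readline().decode())
        speak(clf.predict([features])[0])
```

Keeping the parsing, classification and speech steps in separate functions mirrors the division of work between the Arduino (digitizing) and the Raspberry Pi (recognition and synthesis).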
TABLE 1. COMPONENTS USED

Sl. No  Name of Component  Specification
1       Flex sensor        • Dimensions: 2×2 inches
                           • Temperature Range: -35°C to +80°C
                           • Resistance: 10K ohm (in rest position)
                           • Tolerance: ±30%
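Since the Arduino digitizes the flex sensor's changing resistance, the raw ADC count has to be mapped back to ohms. The sketch below assumes a common wiring, not stated in the paper, in which the sensor forms the upper leg of a voltage divider with a 10 kΩ fixed resistor to ground:

```python
# Sketch: recover a flex sensor's resistance from a 10-bit Arduino ADC
# reading. Assumes (our assumption, not stated in the paper) the sensor
# is the upper leg of a voltage divider with a 10 kΩ resistor to ground.

R_FIXED = 10_000   # ohms, matching the sensor's ~10 kΩ rest resistance
ADC_MAX = 1023     # 10-bit ADC full scale on the Arduino Mega

def flex_resistance(adc: int) -> float:
    """R_flex = R_fixED*(ADC_MAX/adc - 1) is wrong to mistype; derived from
    Vout = Vcc * R_fixed / (R_flex + R_fixed), with adc = ADC_MAX * Vout / Vcc."""
    return R_FIXED * (ADC_MAX / adc - 1)

# At rest the divider sits near mid-scale, so the reading maps to ~10 kΩ:
print(round(flex_resistance(512)))   # prints 9980
```

The ±30% tolerance in Table 1 is why the system relies on a trained classifier rather than fixed resistance thresholds.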
Fig. 5. Output for the alphabet ‘A’

Fig. 4 shows the hand sign for the letter ‘A’ and Fig. 5 shows the output received in the Thonny terminal on the Raspberry Pi. As seen from the figure, the output received is the same letter as the gesture.

Fig. 8. Hand sign of the alphabet ‘Y’

Figs. 6 and 8 depict the hand signs of the letters ‘L’ and ‘Y’. Figs. 7 and 9 show the output obtained when the sign language for the letters ‘L’ and ‘Y’ was performed. The same output was also received via the speakers.

Fig. 7. Output for the alphabet ‘L’

Fig. 11. Confusion matrix of the developed model
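The accuracy and confusion-matrix figures reported in the analysis follow standard definitions, which the paper computes with scikit-learn. The sketch below reproduces them in plain Python over hypothetical label lists, chosen only to show the layout (rows are true letters, columns are predictions):

```python
# Sketch: accuracy and confusion matrix from predicted vs. true letters.
# The label lists are hypothetical; sklearn.metrics offers the same
# computations (accuracy_score, confusion_matrix).
from collections import Counter

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true label, columns the predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

y_true = ["A", "A", "F", "L", "Y", "A"]
y_pred = ["A", "F", "F", "L", "Y", "A"]   # one 'A' misread as 'F'

print(accuracy(y_true, y_pred))           # 5/6 = 0.8333...
for row in confusion_matrix(y_true, y_pred, ["A", "F", "L", "Y"]):
    print(row)
```

Off-diagonal counts, like the single ('A', 'F') cell here, are exactly the misclassifications visible in a plot such as Fig. 11.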
The confusion matrix obtained for the random forest algorithm is shown in Fig. 11. The confusion matrix contains data for only 6 alphabets for ease of viewing. As shown in the figure, there have been only 7 cases of false prediction, in which the letters ‘A’ and ‘F’ were predicted incorrectly. This shows that the model predicts the value correctly in a large number of cases, making it accurate and highly effective in detecting hand gestures.

V. ANALYSIS AND FUTURE SCOPE

TABLE 2. PERFORMANCE OF THE ML MODEL

Parameter     Value Obtained
Accuracy      0.94
Precision     0.9425
F1 score      0.93967
Recall score  0.94

Table 2 represents the effectiveness of the model in terms of its performance parameters. The accuracy of the developed system is in the range of 92–96%. With the use of sign language to speech conversion gloves, people who are deaf or mute can interact with others, reducing the communication gap between the two sides.

TABLE 3. MODEL COMPARISON

Parameter       Random Forest  ResNet based system
Accuracy Range  92–96%         64.5–99.6%
Ease of use     Simple         Complex

TABLE 4. COMPARISON WITH CNN BASED SYSTEM

Parameter  Random Forest  CNN based system
Accuracy   92–96%         65.85%–83.85%
Precision  0.94           0.69–0.97
F1 score   0.939          0.69–0.98

Tables 3 and 4 show the comparison of the random forest model with the ResNet and CNN based systems [5-7]. The future scope of the work is:

a) Better Communication:
The sign language to speech conversion gloves can aid communication between people who use sign language and people who do not. This can help enhance relationships in a variety of contexts such as the workplace, healthcare and education.

b) Improving Accessibility:
The hard of hearing community can benefit from improved access to information and services thanks to these gloves. This involves enhanced availability of healthcare services, emergency communication, and educational materials.

c) Cost Effectiveness:
Cost-effectiveness is a crucial consideration when developing such models. The sign language to speech gloves may require a high initial investment, but the value they provide to the differently abled is far greater than what we can imagine.

d) Integration with Smart Devices:
Smartphones and other smart devices can be integrated with sign language gloves, facilitating users’ access to information and interpersonal communication. Translation services, text-to-speech capabilities, and other features can be a part of this integration.

e) Global Impact:
By removing barriers to communication and enhancing the lives of millions of sign language users, sign language gloves have the potential to have a major global impact.

VI. CONCLUSION

The random forest model that has been used provides a much more consistent range of performance parameters: accuracy, precision and F1 score. It is also user-friendly and can be understood by anyone. Since the system offers a very simple way of converting sign language to speech, the memory requirement and complexity of the technology are also low. Hence, this system combines the simplicity of flex sensors with the advanced techniques of machine learning to detect hand gestures and convert the given sign language to speech. Additionally, this system offers a very user-friendly platform for communication between the two sides. Such systems are very easy to transition to and to use in the long run. Anyone, regardless of their background or field of work, can use this system easily. In conclusion, these systems play a vital role in promoting equality, providing more opportunities and contributing to a hassle-free future.

In the coming years, various other technologies can also be added to this glove. An application or website with details on the gloves themselves can be designed, facilitating easy status monitoring and on-off control of the gloves. The flex sensors are prone to error due to changes in the ambient temperature, which alter the resistance of the flex sensors and can cause an incorrect identification of the sign. This can be prevented by including the ambient temperature as an input to the machine learning model. The prototype can be made adaptable to other sign languages by enlarging its dataset and implementing a language model to provide the speech output. This way, a larger crowd from different parts of the world can harness the benefits of this system.

VII. ACKNOWLEDGEMENT

Rhethika S, Hrithik T H, K H Akhil and Deepa K have filed this work as an Indian patent, “A Glove System for Two-way Communication and An Apparatus Thereof”.

REFERENCES

[1] C. U. Om Kumar, K. P. K. Devan, P. Renukadevi, V. Balaji, A. Srinivas and R. Krithiga, "Real Time Detection and Conversion of Gestures to Text and Speech to Sign System," 3rd International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2022, pp. 73-78.
[2] J. Peguda, V. S. S. Santosh, Y. Vijayalata, A. D. R. N and V. Mounish, "Speech to Sign Language Translation for Indian Languages," 8th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2022, pp. 1131-1135.
[3] S. Y. Heera, M. K. Murthy, V. S. Sravanti and S. Salvi, "Talking hands — An Indian sign language to speech translating gloves," International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), Bengaluru, India, 2017, pp. 746-751.
[4] S. R, S. R. Hegde, C. K, A. Priyesh, A. S. Manjunath and B. N.
Arunakumari, "Indian Sign Language to Speech Conversion
Using Convolutional Neural Network," IEEE 2nd Mysore Sub
Section International Conference (MysuruCon), Mysuru, India,
2022, pp. 1-5.
[5] C. U. Om Kumar, K. P. K. Devan, P. Renukadevi, V. Balaji, A.
Srinivas and R. Krithiga, "Real Time Detection and Conversion
of Gestures to Text and Speech to Sign System," 3rd International
Conference on Electronics and Sustainable Communication
Systems (ICESC), Coimbatore, India, 2022, pp. 73-78.
[6] L. Fernandes, P. Dalvi, A. Junnarkar and M. Bansode,
"Convolutional Neural Network based Bidirectional Sign
Language Translation System," Third International Conference
on Smart Systems and Inventive Technology (ICSSIT),
Tirunelveli, India, 2020, pp. 769-775.
[7] L. VB, S. KB, P. H, S. Abhishek and A. T, "An Empirical
Analysis of CNN for American Sign Language Recognition," 5th
International Conference on Inventive Research in Computing
Applications (ICIRCA), Coimbatore, India, 2023, pp. 421-428.
[8] N. T. Muralidharan, R. R. S., R. M. R., S. N. M. and H. M. E.,
"Modelling of Sign Language Smart Glove Based on Bit
Equivalent Implementation Using Flex Sensor," International
Conference on Wireless Communications Signal Processing and
Networking (WiSPNET), Chennai, India, 2022, pp. 99-104.
[9] P. Telluri, S. Manam, S. Somarouthu, J. M. Oli and C. Ramesh,
"Low cost flex powered gesture detection system and its
applications," Second International Conference on Inventive
Research in Computing Applications (ICIRCA), Coimbatore,
India, 2020.
[10] J. Kunjumon and R. K. Megalingam, "Hand Gesture Recognition
System For Translating Indian Sign Language Into Text And
Speech," International Conference on Smart Systems and
Inventive Technology (ICSSIT), Tirunelveli, India, 2019, pp. 14-
18.
[11] P. Prajapati, G. S. Surya and M. Nithya, "An Interpreter for the
Differently Abled using Haptic Feedback and Machine
Learning," Third International Conference on Smart Systems and
Inventive Technology (ICSSIT), Tirunelveli, India, 2020, pp. 1-
7.
[12] B. R. Reddy, D. C. Rup, M. Rohith and M. Belwal, "Indian Sign
Language Generation from Live Audio or Text for Tamil," 9th
International Conference on Advanced Computing and
Communication Systems (ICACCS), Coimbatore, India, 2023,
pp. 1507-1513.