Hand Gesture to Speech Conversion
By
Muhammad Adeel Asad
2021-ag-4972
Muhammad Junaid
2021-ag-4989
Advised by
Dr. Rana Muhammad Saleem
Dr. Muhammad Ahmad
Bachelor of Science
In
Computer Science
SIGNATURES:
1. Student ___________________________
(Muhammad Adeel Asad)
2. Student ___________________________
(Muhammad Junaid)
EVALUATION COMMITTEE:
1. Advisor ____________________________
(Dr. Rana Muhammad Saleem)
2. Advisor ____________________________
(Dr. Muhammad Ahmad)
3. Member ____________________________
(Hafiz Muhammad Haroon)
4. Member ____________________________
(Mrs. Sidra Habib)
DEDICATION
This project is dedicated to our parents, whose sacrifices,
unwavering support, and constant encouragement have been the
bedrock of our academic journey. Their belief in our potential has
been a source of immense strength and motivation. We also
dedicate this work to our mentors, Dr. Rana Muhammad Saleem and
Dr. Muhammad Ahmad, whose profound guidance and mentorship
have been crucial in shaping our academic and professional growth.
Their dedication to excellence and commitment to fostering
knowledge have inspired us to strive for the highest standards.
Additionally, this project is dedicated to all aspiring students and
researchers in Information Technology who dare to dream, innovate,
and push the boundaries of what is possible. May this work serve as
a testament to the power of perseverance, hard work, and the
pursuit of knowledge. Lastly, we dedicate this to the memory of
those who have been part of our lives and journey, whose
contributions, though sometimes unseen, have been vital in
reaching this milestone.
DECLARATION
This project, titled "Hand Gesture to Speech Conversion" and submitted to the Sub
Campus Burewala, University of Agriculture, Faisalabad, Pakistan, is our original
work. All sources of information and data used in this project have been
acknowledged and referenced appropriately. We confirm that this project has not been
previously submitted, in whole or in part, for any degree, diploma, or other
qualifications at any other academic institution. Any assistance or collaboration
received during the research and preparation of this project has been fully disclosed.
We take full responsibility for the content and findings presented in this report and
affirm that the work is free from plagiarism or academic dishonesty.
_________________
Muhammad Adeel Asad
2021-ag-4972
_________________
Muhammad Junaid
2021-ag-4989
ACKNOWLEDGMENT
We are grateful to ALLAH THE ALMIGHTY, who gave us the strength to complete this
project on time and with the best possible quality, and to our family and friends who
supported us at every step of life, particularly during the past four years of university.
We want to thank and sincerely acknowledge the help of our supervisors, Dr. Rana
Muhammad Saleem and Dr. Muhammad Ahmad, whose complete
guidance, support, and encouragement motivated us to do this project. We want to
thank all the volunteers who helped us while testing this application. Lastly, we would
like to thank all the faculty members of the CS department for their help, time, and
support, which were gladly given to us in this time of need.
TABLE OF CONTENTS
Sr. No. Title
1 CHAPTER 1
1.1 INTRODUCTION
1.2 Objectives
1.2.1 Detect Hand Gestures
1.2.2 Convert Gestures into Speech
1.2.3 Mobile Connectivity
1.2.4 User-Friendly
1.2.5 Immediate Feedback
1.2.6 Affordable Solution
1.3 Scope and Domain
1.3.1 Assistive Technology
1.3.2 Applications
1.3.3 Flex Sensors
1.3.4 ESP32 Microcontroller
1.3.5 Mobile App
1.4 Functional Requirements
1.4.1 Recognizing Hand Gestures
1.4.2 Real-Time Processing
1.4.3 Speech Output
1.4.4 Wireless Communication
1.4.5 Simple User Interface
1.4.6 Customization
1.4.7 Power Efficiency
1.5 Non-Functional Requirements
1.5.1 Ease of Use
1.5.2 Scalability
1.5.3 Reliability
1.5.4 Real-Time Performance
1.5.5 Portability
1.5.6 Security
1.5.7 Compatibility
1.5.8 Affordability
1.5.9 Maintainability
1.5.10 Energy Efficiency
CHAPTER 2
2 LITERATURE REVIEW
2.1 Gesture Recognition Technologies
2.1.1 Overview of Gesture Recognition
2.1.2 Existing Gesture Recognition Systems for Disabilities
2.1.3 Technology Behind Gesture Recognition
2.1.4 Wireless Communication Technologies in Gesture Recognition
2.1.5 Improvements Over Existing Systems
2.2 Proposed System
2.2.1 System Architecture
2.2.2 Improvements Over Existing Systems
2.2.3 Technological Innovations
2.2.4 Benefits of the Proposed System
2.3 UML Diagram
2.3.1 Gesture Input
2.3.2 Sensor Data Collection
2.3.3 Signal Processing
2.3.4 Command Transmission
2.3.5 Speech Synthesis
2.3.6 Voice Output
2.4 Use Case Diagram
2.4.1 Actors
2.4.2 Use Case
2.5 Class Diagram
2.6 Sequence Diagram
2.6.1 Turn On the System
2.6.2 Provide Power
2.6.3 Read Sensor Data (Loop)
2.6.4 Process Sensor Data
2.6.5 Detect Gesture
2.6.6 Display Gesture
2.7 Circuit Diagram
2.7.1 Components
2.7.2 Connections
2.7.3 Working
CHAPTER 3
3 PROJECT INTERFACE
3.1 Main Interface
3.2 Flex Sensors
3.3 ESP32S
3.4 OLED Display
3.5 Jumper Wire
3.6 Resistor
3.7 Mini Breadboard
CONCLUSION
LITERATURE CITED
LIST OF FIGURES
Figure No. Title
1 UML Diagram
2 Use Case Diagram
3 Class Diagram
4 Sequence Diagram
5 Circuit Diagram
6 Main Interface
7 Flex Sensor
8 ESP32S
9 OLED Display
10 Jumper Wire
11 10k Ohm Resistor
12 Mini Breadboard
ABSTRACT
The "Hand Gesture to Speech Conversion" project offers a simple way for people
with hearing and speech difficulties to communicate efficiently. This system uses
sensors and modern technology to convert hand gestures into spoken words or text. It
allows individuals who are deaf or mute to interact with others in real time. The
project includes wearable sensors and a mobile connection to accurately process and
translate hand movements. This project helps bridge the communication gap between
people with hearing or speech impairments and the rest of society, promoting better
understanding and inclusion.
CHAPTER 1
INTRODUCTION
In recent years, technology has made huge strides in helping people interact with
devices in new ways. One exciting development area is the Internet of Things (IoT),
which allows everyday objects to connect and communicate. One of the most
innovative ways this technology is used is to help people with speech difficulties.
The "Hand Gesture to Speech Conversion" project is an IoT-based system that allows
people to communicate using hand gestures, which are then converted into spoken
words.
This project aims to provide an easy and effective way for people with speech
impairments to communicate. Using sensors and Bluetooth technology, the system
detects specific hand movements, translates them into text, and then uses a speaker or
mobile phone to read the text. The system is designed to be simple and easily
connected to a mobile phone via Bluetooth, allowing for smooth and quick
communication.
1.2 Objectives
1.2.1 Detect Hand Gestures:
The system should accurately detect specific hand movements using flex sensors
worn on the hand.
1.2.2 Convert Gestures into Speech:
Each detected gesture should be translated into text and read aloud through a
speaker or mobile phone.
1.2.3 Mobile Connectivity:
The system should connect easily to a mobile phone via Bluetooth, allowing for
smooth and quick communication.
1.2.4 User-Friendly:
The system should be easy to use, with no complicated settings or operations. It
should be designed so that even users without technical knowledge can efficiently
operate it.
1.2.5 Immediate Feedback:
Once a gesture is made, the system will provide instant feedback by showing the
result on the mobile screen and speaking the word aloud. This ensures that the user
knows their gesture has been understood.
1.2.6 Affordable Solution:
The system should be built from affordable, readily available components so that it
remains within reach of the people who need it.
1.3 Scope and Domain
1.3.1 Assistive Technology:
This project falls under the domain of assistive technology, which aims to create
solutions for people with communication difficulties. Specifically, this project targets
individuals who have trouble speaking due to medical conditions like stroke, cerebral
palsy, or other physical disabilities.
1.3.2 Applications:
The system can be used in everyday settings such as homes, schools, and public
places, wherever a user needs to communicate with people who do not understand
sign language.
1.3.3 Flex Sensors:
These sensors detect bending movements in the user's hand, allowing the system to
recognize different gestures based on how much the hand bends.
1.3.4 ESP32 Microcontroller:
This small computer processes the data from the sensors, figures out what the gesture
means, and sends the information to a mobile device or speaker via Bluetooth.
1.3.5 Mobile App:
A mobile app displays the gesture and reads the corresponding word aloud. It also
allows the user to manage the system and check its performance.
1.4 Functional Requirements
1.4.1 Recognizing Hand Gestures
The system must detect and identify each predefined hand gesture from the flex
sensor readings.
1.4.2 Real-Time Processing
Sensor data must be processed immediately so that the output follows the gesture
without noticeable delay.
1.4.3 Speech Output
Each recognized gesture must be converted into text and spoken aloud through the
mobile device.
1.4.4 Wireless Communication
The ESP32 must transmit recognized gestures to the mobile device over Bluetooth.
1.4.5 Simple User Interface
The mobile app must provide a simple interface that displays each recognized
gesture.
1.4.6 Customization
Users should be able to change or add gestures to tailor the system to their needs.
1.4.7 Power Efficiency
The system should consume little power so that it can run for long periods on a
battery.
1.5 Non-Functional Requirements
1.5.1 Ease of Use
The system should be usable without training; performing a gesture should be all
that is needed to produce speech.
1.5.2 Scalability
While the system will start with basic gestures, it should be able to handle more
complex gestures or a bigger vocabulary in the future.
1.5.3 Reliability
The system should work consistently without frequent errors. It should always
recognize gestures correctly and speak the right word.
1.5.4 Real-Time Performance
Gestures should be recognized and spoken with minimal delay so that conversation
feels natural.
1.5.5 Portability
The system should be lightweight and easy to carry. This makes it useful in various
environments, whether at home, school, or public.
1.5.6 Security
The system should ensure that data sent between the devices is secure so unauthorized
people cannot interfere with the communication.
1.5.7 Compatibility
The mobile app should work on Android and iOS smartphones so that as many people
as possible can use the system.
1.5.8 Affordability
The system should have affordable, readily available components to ensure it is
within reach for most people who need it.
1.5.9 Maintainability
The system should be easy to maintain and update. For example, users should be able
to update the mobile app or recalibrate the sensors quickly.
1.5.10 Energy Efficiency
The system should make efficient use of battery power, taking advantage of the
ESP32's low energy consumption.
CHAPTER 2
LITERATURE REVIEW
2.1 Gesture Recognition Technologies
2.1.1 Overview of Gesture Recognition
Gesture recognition allows a computer to interpret hand or body movements as
input, turning physical motion into text, speech, or commands.
2.1.2 Existing Gesture Recognition Systems for Disabilities
Current gesture recognition systems that assist people with disabilities typically focus
on interpreting hand or body movements. For example, systems like Myo Gesture
Control Armband or sign language interpreters use sensors to track hand movements
and convert them into text or speech. However, many of these systems require
expensive sensors or complex setup processes, which can limit their accessibility.
Some systems also struggle with accuracy or are limited to predefined gestures,
restricting users' ability to express a broad range of ideas.
2.1.3 Technology Behind Gesture Recognition
Gesture recognition technology relies heavily on sensors, which can detect movement
in the hand or other body parts. Commonly used sensors include accelerometers,
gyroscopes, and flex sensors. Flex sensors, like those used in this project, are
particularly suited for detecting the bending of fingers or hand movements. These
sensors change resistance based on the flexion of the hand, which a microcontroller
can then interpret to determine the gesture made.
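To make this sensing principle concrete, the following minimal Arduino-style sketch
reads one flex sensor on an ESP32 analog pin; the pin number is an illustrative
assumption, not taken from this report's circuit.

// Minimal sketch (assumed pin): reading one flex sensor that forms a
// voltage divider with a fixed resistor on an ESP32 analog input.
const int FLEX_PIN = 34;  // any ADC-capable GPIO would work

void setup() {
  Serial.begin(115200);
}

void loop() {
  int raw = analogRead(FLEX_PIN);  // 0..4095 from the ESP32's 12-bit ADC
  Serial.println(raw);             // the value shifts as the finger bends
  delay(100);
}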
Despite this progress, many gesture recognition systems still face challenges such as
high hardware cost, complex setup, and limited predefined gesture vocabularies.
The "Hand Gesture to Speech Conversion" system consists of several key components
that work together to convert hand gestures into speech:
● ESP32 Microcontroller: The ESP32 processes the data from the flex sensors,
identifies the gesture, and communicates this information wirelessly to the
mobile device via Bluetooth.
● Flex Sensors: These sensors detect the bending of fingers or hands and send
corresponding data to the microcontroller.
● Mobile App: The app displays the recognized gesture and uses the text-to-
speech functionality to speak the corresponding word aloud. It also serves as
an interface for user interaction and settings.
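As a sketch of the wireless link described above, the code below pushes a recognized
word to a paired phone using the ESP32 Arduino core's BluetoothSerial library; the
device name "GestureGlove" and the fixed example word are placeholders rather than
values from this project.

#include "BluetoothSerial.h"

BluetoothSerial SerialBT;

void setup() {
  SerialBT.begin("GestureGlove");  // hypothetical Bluetooth device name
}

void loop() {
  // In the full system this string would come from gesture detection;
  // a fixed word stands in for a recognized gesture here.
  SerialBT.println("HELLO");
  delay(2000);
}

On the phone side, the app reads this serial stream, displays the text, and passes it
to the platform's text-to-speech engine.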
2.2.2 Improvements Over Existing Systems
● Accuracy and Customization: The system uses flex sensors to detect hand
movement more precisely than older systems relying on accelerometers or
cameras. Additionally, users can add or modify gestures, allowing for more
flexibility in communication.
● Affordability: Unlike other gesture recognition systems that require costly
hardware, this system utilizes affordable, readily available components,
making it more accessible.
● User-Friendliness: The system is designed to be easy to use with minimal
setup, even for individuals without technical knowledge. The mobile app
interface is simple, providing instant feedback after each gesture is detected.
2.2.3 Technological Innovations
● Flex Sensors: These sensors detect subtle hand movements, providing high
accuracy in recognizing various gestures.
● ESP32 Microcontroller: The ESP32 offers Bluetooth and Wi-Fi capabilities,
making the system versatile for wireless communication. Its low energy
consumption ensures the system remains efficient when powered by battery-
operated devices.
2.3 UML Diagram
The UML diagram outlines the process flow for the "Hand Gesture to Speech
Conversion" system. The following steps detail the sequence of actions:
2.3.1 Gesture Input
The user provides input through hand movements, which are captured by the flex
sensors attached to the system. Each gesture represents a unique action or word.
2.3.6 Voice Output
The TTS engine produces audio output that is played through the mobile device's
speaker. This allows the user to hear the spoken word, enabling effective
communication.
Figure 1: UML Diagram
2.4 Use Case Diagram
This use case diagram outlines the interaction between the user, the "Gesture to
Speech Conversion System," and the external system. It illustrates how user gestures
are translated into spoken words and displayed on a mobile device.
2.4.1 Actors:
● User: The primary actor initiates the process by performing specific hand
gestures.
● System: The Gesture to Speech Conversion System executes tasks based on
the detected user input.
Figure 2: Use Case Diagram
2.5 Class Diagram
This class diagram illustrates the interaction between the components of the "Hand
Gesture to Speech Conversion" system. It highlights the functional relationship
between sensors, gesture control, central controller, and speech output subsystems,
along with the role of the power supply. The FlexSensor sends bend data to
GestureControl, which processes it and sends it to the MainController. The
MainController maps the gesture to a command and coordinates with SpeechOutput
to generate speech and display results. The power supply provides power to all
components, ensuring stable system operation.
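A schematic C++ skeleton of these relationships is shown below; the class names
follow the diagram, while the method names and stub bodies are illustrative
assumptions rather than the project's actual code.

#include <string>

class FlexSensor {
public:
  int readBend() { return 0; }  // stub: would return the raw bend value
};

class GestureControl {
public:
  int processBendData(const int bend[4]) { return 0; }  // stub: gesture id
};

class SpeechOutput {
public:
  void display(const std::string& word) { /* would draw on the display */ }
  void speak(const std::string& word)   { /* would trigger the phone's TTS */ }
};

class MainController {
  GestureControl gestureControl;
  SpeechOutput speechOutput;
public:
  // Maps the detected gesture to its command and coordinates the output.
  void handleGesture(const int bend[4]) {
    int id = gestureControl.processBendData(bend);
    std::string word = (id == 0) ? "HELLO" : "UNKNOWN";  // illustrative mapping
    speechOutput.display(word);
    speechOutput.speak(word);
  }
};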
Figure 3: Class Diagram
2.6 Sequence Diagram
The sequence diagram below illustrates the interaction between the components of the
"Hand Gesture to Speech Conversion" system. It demonstrates the step-by-step
process from powering the system to generating and displaying the output. Steps in
the Sequence Diagram:
2.6.1 Turn On the System
The user turns on the system.
2.6.2 Provide Power
The Battery provides power to all connected components, including the ESP32, Flex
Sensors, and OLED Display.
2.6.3 Read Sensor Data (Loop)
The ESP32 begins a loop to collect data from all four Flex Sensors. Flex Sensor 1
sends its bend value (readValue1) to the ESP32. Flex Sensor 2, Flex Sensor 3, and
Flex Sensor 4 sequentially send their respective bend values (readValue2, readValue3,
readValue4) to the ESP32.
2.6.4 Process Sensor Data
The ESP32 aggregates the sensor values and sends them to the Gesture Detection
Algorithm for processing. The Gesture Detection Algorithm interprets the combined
sensor data to identify the specific hand gesture.
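The read-and-detect loop described in these two steps might look like the sketch
below; the pin numbers, threshold, and gesture table are assumptions for
illustration, not the project's calibrated values.

// Assumed pins, threshold, and gesture table; illustrative only.
const int FLEX_PINS[4] = {32, 33, 34, 35};
const int BEND_THRESHOLD = 2500;  // raw ADC value treated as "bent"

struct Gesture { int pattern[4]; const char* word; };
const Gesture GESTURES[] = {
  {{1, 1, 1, 1}, "HELLO"},
  {{1, 0, 0, 0}, "YES"},
  {{0, 0, 0, 1}, "NO"},
};

void setup() {
  Serial.begin(115200);
}

void loop() {
  // Read all four sensors and reduce each to bent (1) or straight (0).
  int bent[4];
  for (int i = 0; i < 4; i++)
    bent[i] = analogRead(FLEX_PINS[i]) > BEND_THRESHOLD ? 1 : 0;

  // Match the combined pattern against the gesture table.
  for (const Gesture& g : GESTURES) {
    bool match = true;
    for (int i = 0; i < 4; i++)
      if (g.pattern[i] != bent[i]) { match = false; break; }
    if (match) { Serial.println(g.word); break; }
  }
  delay(200);
}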
2.6.5 Detect Gesture
The Gesture Detection Algorithm determines the detected gesture based on the
processed data. The recognized gesture is converted into its corresponding text or
command.
2.6.6 Display Gesture
The detected gesture (in text form) is sent to the OLED Display, where the result is
displayed as feedback for the user.
Figure 4: Sequence Diagram
2.7 Circuit Diagram
The circuit diagram demonstrates the interaction between the hardware components of
the hand gesture to speech conversion system. It provides an overview of how the
input from flex sensors is processed by the NodeMCU ESP32 microcontroller and
displayed on the OLED module.
2.7.1 Components:
● Four 2.2-inch flex sensors
● 38-pin NodeMCU ESP32S microcontroller
● 0.96-inch OLED display module (128x64, I2C)
● 10k Ohm pull-down resistors
● Mini breadboard and male-to-female jumper wires
● USB power supply
2.7.2 Connections:
● Flex Sensors to ESP32: Each flex sensor is connected to an analog pin of the
ESP32. Pull-down resistors are used to stabilize the sensor signals.
● OLED Display to ESP32: The OLED uses the I2C communication protocol.
The SDA (data) pin connects to GPIO21, and the SCL (clock) pin connects to
GPIO22 of the ESP32. The power (VCC) and ground (GND) pins are
connected to the 3.3V and GND pins of the ESP32, respectively.
● Power Supply: The USB cable supplies power to the ESP32, which powers
the OLED and sensors through the breadboard.
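In code, this wiring could be brought up as in the minimal sketch below, which
assumes the widely used Adafruit SSD1306 library and the common 0x3C I2C address
for 0.96-inch modules.

#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>

Adafruit_SSD1306 display(128, 64, &Wire, -1);  // 128x64 panel, no reset pin

void setup() {
  Wire.begin(21, 22);                         // SDA = GPIO21, SCL = GPIO22
  display.begin(SSD1306_SWITCHCAPVCC, 0x3C);  // assumed I2C address
  display.clearDisplay();
  display.setTextSize(2);
  display.setTextColor(SSD1306_WHITE);
  display.setCursor(0, 0);
  display.println("READY");
  display.display();
}

void loop() {}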
2.7.3 Working:
● Input Signals: The flex sensors detect the bending of fingers and send
variable resistance data as analog signals to the ESP32.
● Data Processing: The ESP32 converts these analog inputs into digital values.
The values are mapped to predefined gestures stored in the system's database.
● Output Display: The OLED screen displays the text.
● Wireless Communication: The ESP32 can send data wirelessly to a mobile
device for additional text-to-speech output.
Figure 5: Circuit Diagram
CHAPTER 3
PROJECT INTERFACE
3.1 Main Interface
The "Hand Gesture to Speech Conversion" system is designed to provide an accessible
and user-friendly interface for individuals with speech impairments. This system
translates hand movements into spoken words by leveraging flex sensors, a
microcontroller (ESP32), and a mobile application. The primary interface includes the
hand gesture detection module, wireless communication through ESP32, and a text-
to-speech-enabled mobile device.
The system aims to create a seamless interaction between the user and their
environment by converting physical gestures into meaningful voice output, enabling
individuals with speech disabilities to communicate effectively.
Figure 6: Main Interface
3.2 Flex Sensors
The Flex Sensors, measuring 2.2 inches in length, are critical for detecting hand
movements. These sensors operate by changing their resistance as they bend. When
the user performs a gesture, the bending motion alters the sensor's resistance, which is
then converted into an analog signal. This signal is sent to the ESP32 for processing,
enabling the system to interpret the gesture. The sensors' lightweight and flexible
nature makes them ideal for wearable applications like this one, ensuring comfort
and accuracy.
Figure 7: Flex Sensor
3.3 ESP32S
The 38 Pin NodeMCU ESP32S Microcontroller serves as the brain of the system.
Equipped with built-in WiFi and Bluetooth capabilities, the ESP32 handles data
processing and wireless communication. It reads analog input from the flex sensors,
processes the data to identify gestures, and transmits the results to the mobile
application via Bluetooth. Its high processing power and energy efficiency make it an
excellent choice for real-time applications and portable systems, ensuring smooth and
reliable performance.
Figure 8: ESP32S
3.4 OLED Display
The Arduino 0.96-inch OLED Display Module plays a crucial role in visually
displaying the recognized gestures. This small screen, measuring just 0.96 inches with
a resolution of 128x64 pixels, uses I2C communication to interface with the ESP32
microcontroller. Its compact size and efficiency make it suitable for wearable
projects. The display module outputs the detected hand gesture in text form, allowing
users to see the system's interpretation of their movements, enhancing interaction and
usability.
Figure 9: OLED Display
3.5 Jumper Wire
To connect the components, jumper wires are
used. These wires, with male-to-female connectors, offer flexibility and durability for
making temporary connections between the ESP32, flex sensors, resistors, and the
OLED display. Their 20cm length ensures organized and neat wiring, eliminating the
need for permanent soldering, which is particularly useful during the system's
prototyping and testing stages.
Figure 10: Jumper Wire
3.6 Resistor
The 10k Ohm Resistor is used in the system as part of a voltage divider circuit,
working alongside the flex sensors. It stabilizes the output signal from the flex sensors
and prevents issues such as floating analog readings when the sensors are not bent.
This resistor ensures accurate and reliable data transmission to the ESP32 while
protecting the circuit from excessive current flow, making it a vital component of the
system's functionality.
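As a worked example of this voltage divider (using assumed, datasheet-typical flex
sensor resistances rather than measured values), the voltage seen by the ESP32's ADC
follows Vout = Vcc x R_fixed / (R_flex + R_fixed):

#include <cstdio>

int main() {
  const double VCC = 3.3;          // ESP32 supply voltage
  const double R_FIXED = 10000.0;  // the 10k pull-down resistor
  const double R_FLAT = 25000.0;   // assumed flex resistance when flat
  const double R_BENT = 70000.0;   // assumed flex resistance when fully bent
  double vFlat = VCC * R_FIXED / (R_FLAT + R_FIXED);  // about 0.94 V
  double vBent = VCC * R_FIXED / (R_BENT + R_FIXED);  // about 0.41 V
  std::printf("flat: %.2f V, bent: %.2f V\n", vFlat, vBent);
  return 0;
}

The swing between these two readings is what the ESP32 digitizes and compares
against its gesture thresholds.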
Figure 11: 10k Ohm Resistor
"Hand Gesture to Speech Conversion" system. It provides a compact, reusable
platform for creating and testing electrical circuits without soldering. With 170
connection points, the breadboard allows components like resistors, sensors, and
wires to be connected securely and rearranged quickly, making it ideal for quick
modifications and experimentation during development.
Figure 12: Mini Breadboard
CONCLUSION
In conclusion, the "Hand Gesture to Speech Conversion" project is a meaningful step for
people with hearing and speech impairments, promoting their inclusion in society and
facilitating communication. The system, built on sensors and modern technology,
converts sign language into text or voice, allowing users to express their
thoughts and feelings effectively. The project not only helps to eliminate
communication barriers but also reflects how technology can play a key role in
solving social problems. Its success highlights the importance of research and
development to further improve and expand the capabilities of assistive devices,
helping to create a more inclusive and accessible world.
LITERATURE CITED
Ata-ur-Rehman, Salman Afghani, Muhammad Akmal and Raheel Yousaf,
"Microcontroller and Sensors Based Gesture Vocalizer", Proceedings of the
7th WSEAS International Conference on Signal Processing, Robotics and
Automation (ISPRA '08), ISSN: 1790-5117, ISBN: 978-960-6766-44-2,
University of Cambridge, UK, February 20-22, 2008.
Kanika Rastogi and Pankaj Bhardwaj, "A Review Paper on Smart Glove", International
Journal on Recent and Innovation Trends in Computing and Communication,
Vol. 4, Issue 5.
K. V. Fale, Akshay Phalke, Pratik Chaudhari and Pradeep Jadhav, "Smart Glove:
Gesture Vocalizer for Deaf and Dumb People", International Journal of
Innovative Research in Computer and Communication Engineering, Vol. 4,
Issue 4, April 2016.
M. S. Kasar, Anvita Deshmukh, Akshada Gavande and Priyanka Ghadage, "Smart
Speaking Glove: Virtual Tongue for Deaf and Dumb", International Journal of
Advanced Research in Electrical, Electronics and Instrumentation
Engineering, Vol. 5, Issue 3, March 2016.
Kiran R, "Digital Vocalizer System for Speech and Hearing Impaired", International
Journal of Advanced Research in Computer and Communication Engineering,
Vol. 4, Issue 5, May 2015, DOI: 10.17148/IJARCCE.2015.4519.