The University of Lahore – Final Project Proposal

Faculty of Information Technology


Department of Software Engineering
The University of Lahore

Final Year Project Proposal


For probably the first time in your undergraduate/graduate program, you are required to defend a
proposal of a larger project. In teams, you will be working on the common project, but individual
team members will be required to take on responsibilities for specific work for which each will
be held accountable. Interaction, collaboration and assistance are allowed and expected, but each
person will receive an individual mark for his/her work performed in the project.

Day Month Year

DATE – –

PROJECT TITLE: Enabling Speech through Gesture Recognition “GesCom”


STUDENT INFORMATION.
Sr.  Student ID  Name             Email               Mobile
1.   70131025    Ahmad Yaar       [email protected]   03019657066
2.   70132747    Muhammad Naveed  [email protected]   03212160248
3.   70132841    Bilal Sher       [email protected]   03106860994

PROBLEM STATEMENT
Individuals with speech impairments (people who cannot speak) often face challenges in daily
communication and must rely on a translator to convey their thoughts and needs. This reliance can
limit their independence and social interaction. Our solution aims to address this issue by
developing software that translates gestures into text and speech, and vice versa, enabling
seamless communication without external assistance.

EXECUTIVE SUMMARY

This project addresses the communication challenges faced by individuals with speech
impairments. Current solutions often rely on human translators, limiting independence and social
interaction. This project aims to develop innovative software that facilitates seamless
communication by translating gestures into text or speech, and vice versa. This technology will
empower individuals with speech impairments to express themselves independently, enhance their
social participation, and improve their overall quality of life. The software will leverage advanced
computer vision and machine learning algorithms to accurately recognize and interpret a wide
range of gestures, providing a user-friendly and accessible communication platform.

Some competing apps exist in this category, but each offers only one or two functionalities, such
as gesture-to-text or text-to-speech. Our app combines six core functionalities on a single
platform, which is what makes it unique.

INTRODUCTION

Relevance and Importance of the Problem

Effective communication is a fundamental human right, enabling social interaction, personal
growth, and overall well-being. Speech impairments, stemming from deafness or muteness,
significantly hinder individuals' ability to express themselves freely. This can lead to social
isolation, limited educational and professional opportunities, and a diminished quality of life.
Reliance on human translators for constant communication is not only costly and inconvenient but
also restricts spontaneity and independent participation in daily activities.

Background Information

Individuals with speech impairments face diverse challenges, including difficulty in producing
clear speech sounds, controlling vocal pitch and volume, and comprehending or processing
language. This can manifest in various ways, from slurred speech to complete inability to vocalize.
Existing assistive technologies, such as text-to-speech devices and augmentative and alternative
communication (AAC) systems, often require significant manual input and may not be intuitive or
adaptable to individual needs.

Previous Related Work

Recent research has demonstrated significant advancements in gesture-based communication
systems, particularly within the context of assistive technologies.
2024 Research:
"Real-time Hand Gesture Recognition for Assistive Communication Using Deep Learning"
(Journal of Medical Systems, 2024):
This study showcases a deep learning-based system for real-time hand gesture recognition tailored
for individuals with speech impairments. The system effectively utilizes a combination of
convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to accurately
recognize dynamic hand gestures and translate them into text or speech. This directly addresses the
core challenge of the problem statement by exploring a technology that can enable independent
communication for individuals with speech impairments.

"A Wearable Gesture-Based Communication Device for People with Motor Disabilities"
(IEEE Transactions on Haptics, 2024):
This research presents a wearable device designed to enable individuals with motor disabilities to
communicate through simple hand gestures. The device incorporates inertial measurement units
(IMUs) and a machine learning algorithm to recognize gestures and generate corresponding
messages. This research aligns with the problem statement by focusing on developing assistive
technologies that enhance communication for individuals with motor impairments, which often co-
occur with speech impairments.

"Improving Accessibility and Inclusivity through Gesture-Based Interaction in Smart
Environments" (ACM Transactions on Accessible Computing, 2024):
This study explores the potential of gesture-based interaction to enhance accessibility and
inclusivity in smart environments. It investigates the use of intuitive gestures to control smart
devices and appliances, enabling individuals with motor impairments to interact with their
surroundings more independently. While not directly focused on speech impairments, this research
contributes to the broader goal of enhancing communication and independence for individuals
with disabilities through innovative technology.

COMPETITORS/COMPETITIVE ANALYSIS

Functionalities      Hand Talk Translator   Spread Signs   The ASL app   Our App
Gesture to text      No                     No             Yes           Yes
Text to gesture      Yes                    Yes            No            Yes
Text to speech       No                     No             No            Yes
Speech to text       No                     No             No            Yes
Gesture to speech    No                     No             No            Yes
Speech to gesture    Yes                    No             No            Yes
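The comparison above can be checked mechanically. The following is a minimal sketch that transcribes the table into a Python dictionary and counts how many of the six conversions each app covers. The names FEATURES, SUPPORT, coverage and missing are illustrative only and are not part of the proposed app.

```python
# Feature matrix transcribed from the competitive analysis table.
FEATURES = [
    "gesture to text",
    "text to gesture",
    "text to speech",
    "speech to text",
    "gesture to speech",
    "speech to gesture",
]

# Each app maps to the set of conversions marked "Yes" in the table.
SUPPORT = {
    "Hand Talk Translator": {"text to gesture", "speech to gesture"},
    "Spread Signs": {"text to gesture"},
    "The ASL app": {"gesture to text"},
    "Our App": set(FEATURES),
}


def coverage(app: str) -> int:
    """Return how many of the six conversions the given app supports."""
    return len(SUPPORT[app] & set(FEATURES))


def missing(app: str) -> list:
    """Return the conversions the given app does not support, in table order."""
    return [feature for feature in FEATURES if feature not in SUPPORT[app]]


if __name__ == "__main__":
    for app in SUPPORT:
        print(f"{app}: {coverage(app)}/{len(FEATURES)}")
```

Under this transcription, no listed competitor covers more than two of the six conversions.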

OBJECTIVES
The primary objectives of the App are:

Facilitate Communication for Speech-Impaired Individuals:


Develop a software solution that enables individuals with speech impairments to communicate
effectively and independently without the need for external translators.

Seamless Gesture-to-Text and Speech Conversion:


Implement a system that accurately recognizes gestures and converts them into readable text or
spoken words in real-time.

Enable Multimodal Communication:


Provide comprehensive features, including:

1. Text-to-speech conversion
2. Speech-to-text conversion
3. Gesture-to-text conversion
4. Text-to-gesture conversion
5. Speech-to-gesture conversion
6. Gesture-to-speech conversion

Leverage Advanced Technologies:


Utilize AI, machine learning, and computer vision to ensure high accuracy, adaptability, and
scalability of gesture recognition and language processing.

Empower Independence:
Foster greater independence and confidence among users, enabling them to engage in social,
professional, and personal interactions seamlessly.

Global Accessibility:
Make the software available on mobile devices, ensuring it reaches a diverse audience worldwide.

MOTIVATION
The motivation is to create a more equitable and inclusive society where individuals with speech
impairments have the same opportunities for communication and social participation as everyone
else.

REQUIREMENTS

Functional Requirements:

1. Gesture-to-Text Conversion

The app can recognize and interpret gestures (such as hand movements, facial expressions, or
body posture) and convert them into real-time text, allowing users to express themselves without
speaking.

2. Gesture-to-Speech Conversion

Recognizes gestures and translates them into spoken words, enabling users to "speak" via their
gestures and communicate with others who may not understand sign language or gestures.

3. Text-to-Gesture Conversion

Converts written text into visual gestures or sign language, helping non-verbal individuals
communicate with others who understand gestures but not text.

4. Text-to-Speech Conversion

Converts written text into spoken language, allowing individuals with speech impairments
to communicate through text that is spoken aloud by the app.

5. Speech-to-Text Conversion

Converts spoken language into text, allowing individuals with hearing impairments or speech
difficulties to read what others are saying in real-time.

6. Speech-to-Gesture Conversion

Converts spoken language into gestures or sign language, allowing the app to "translate" speech
into an accessible visual form.
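One way to read the six requirements above is that text acts as a common intermediate form: gesture-to-speech can be built from gesture-to-text followed by text-to-speech, and speech-to-gesture from speech-to-text followed by text-to-gesture. The sketch below illustrates that routing idea only. It is an assumption about how the conversions could be composed, not the project's confirmed design, and the converter functions are hypothetical stubs standing in for the real gesture model and speech engines.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (source modality, target modality)


def _stub(target: str) -> Callable[[str], str]:
    """Hypothetical placeholder converter that tags its input with the target modality."""
    return lambda payload: f"{target}<{payload}>"


# Four primitive converters; a real app would plug in actual models or APIs here.
PRIMITIVES: Dict[Step, Callable[[str], str]] = {
    ("gesture", "text"): _stub("text"),
    ("text", "gesture"): _stub("gesture"),
    ("speech", "text"): _stub("text"),
    ("text", "speech"): _stub("speech"),
}


def plan(source: str, target: str) -> List[Step]:
    """Return the converter steps needed to go from source to target."""
    if source == target:
        raise ValueError("source and target modality must differ")
    if (source, target) in PRIMITIVES:
        return [(source, target)]
    # No direct converter: route through text as the intermediate form.
    route = [(source, "text"), ("text", target)]
    if all(step in PRIMITIVES for step in route):
        return route
    raise ValueError(f"no route from {source} to {target}")


def convert(payload: str, source: str, target: str) -> str:
    """Apply each planned step in order to the payload."""
    for step in plan(source, target):
        payload = PRIMITIVES[step](payload)
    return payload


if __name__ == "__main__":
    print(plan("gesture", "speech"))
    print(convert("wave", "gesture", "speech"))
```

With this layout, four primitive converters are enough to serve all six required conversions, at the cost of one extra step for the two gesture and speech pairs.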

Non-Functional Requirements:
1. Reliability and Stability:
Robust and stable system with minimal crashes or errors.
High availability and uptime.
2. Usability:
Easy to use, even for users with limited technical experience.
3. Performance:
Real-time performance with minimal latency.
Efficient resource utilization (CPU, memory, battery).
4. Portability:
Runs on smartphones (Android and iOS).
FEATURES OF PROJECT
1. Gesture-to-Text Conversion

The app can recognize and interpret gestures (such as hand movements, facial expressions, or
body posture) and convert them into real-time text, allowing users to express themselves without
speaking.

2. Gesture-to-Speech Conversion

Recognizes gestures and translates them into spoken words, enabling users to "speak" via their
gestures and communicate with others who may not understand sign language or gestures.

3. Text-to-Gesture Conversion

Converts written text into visual gestures or sign language, helping non-verbal individuals
communicate with others who understand gestures but not text.

4. Text-to-Speech Conversion

Converts written text into spoken language, allowing individuals with speech impairments
to communicate through text that is spoken aloud by the app.

5. Speech-to-Text Conversion

Converts spoken language into text, allowing individuals with hearing impairments or speech
difficulties to read what others are saying in real-time.

6. Speech-to-Gesture Conversion

Converts spoken language into gestures or sign language, allowing the app to "translate" speech
into an accessible visual form.

7. Real-Time Communication

Ensures that all conversions (gesture-to-text, text-to-speech, etc.) happen in real-time, enabling
smooth, ongoing conversations between the user and others.

8. Easy-to-Use Interface

Designed with an intuitive and accessible interface, making the app simple to navigate for people
of all ages and tech proficiency levels.
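For the real-time requirement in feature 7, a per-frame gesture classifier usually produces noisy labels, so some form of temporal smoothing is commonly applied before text is shown or spoken. The sketch below uses a sliding-window majority vote. This technique, the class name GestureSmoother, and the window and min_votes values are all assumptions made for illustration; the proposal does not specify how predictions will be stabilized.

```python
from collections import Counter, deque
from typing import Deque, Optional


class GestureSmoother:
    """Emit a gesture label only once it wins a majority vote over recent frames."""

    def __init__(self, window: int = 5, min_votes: int = 3) -> None:
        # Keep only the most recent `window` per-frame predictions.
        self.frames: Deque[str] = deque(maxlen=window)
        self.min_votes = min_votes
        self.last_emitted: Optional[str] = None

    def update(self, label: str) -> Optional[str]:
        """Add one per-frame prediction; return a label when a new stable gesture appears."""
        self.frames.append(label)
        winner, votes = Counter(self.frames).most_common(1)[0]
        if votes >= self.min_votes and winner != self.last_emitted:
            self.last_emitted = winner
            return winner
        return None


if __name__ == "__main__":
    smoother = GestureSmoother()
    stream = ["hello", "hello", "thanks", "hello", "hello",
              "thanks", "thanks", "thanks", "thanks"]
    emitted = [out for out in map(smoother.update, stream) if out is not None]
    print(emitted)
```

A larger window makes the output more stable but adds latency, so the window size would need tuning against the minimal-latency requirement. This sketch also never re-emits the same label twice in a row, which a real system would need to handle for deliberately repeated signs.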
ARCHITECTURAL DESIGN

Hardware Components:

1. User Devices (Mobile):

The core user interface for interaction with the app will be a mobile phone. These devices
have basic computing capabilities, the necessary sensors (camera, microphone, speakers), and a
display for rendering text, speech, and gestures.

2. Camera:

The device's camera (or an external webcam) will capture gestures, facial expressions, and body
movements, essential for the gesture recognition system. This will be used for the gesture-to-text
and gesture-to-speech conversion features.

3. Microphone:

The microphone on the user's device will capture speech for the speech-to-text conversion feature. It will
also be used for detecting speech input in the speech-to-gesture conversion system.

4. Speakers:

For the text-to-speech functionality, speakers on the device will produce audible speech when text is
converted into spoken words.

Software Components:

1. Mobile Application:

A cross-platform application will be built to run on mobile phones (Android, iOS). The app will serve as
the front-end interface for user interaction with various functionalities such as text, speech, and gesture
conversions.

2. Speech Recognition (Natural Language Processing):

The app will integrate speech recognition tools (e.g., Google Speech-to-Text API or Microsoft's Azure
Speech Services) to convert spoken language into text. It will also include Text-to-Speech (TTS) systems
such as Google Text-to-Speech or Amazon Polly to read text aloud.

3. APIs:

External APIs (e.g., Google Cloud Speech-to-Text, Text-to-Speech APIs, computer vision models for
gesture recognition) will be integrated into the app for processing speech input and generating speech or
gesture output.

4. Database:

A local database or cloud database will store user preferences, customization settings, historical
interactions, and any other relevant data, enabling a personalized experience.
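The local database option described in item 4 could look roughly like the sketch below, which uses Python's built-in sqlite3 module. The table names, column names, and helper functions are hypothetical, and SQLite is used here only because it needs no server; the proposal itself lists MySQL, Firebase, and MongoDB as candidate databases.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical schema: one table for per-user settings, one for past conversions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS preferences (
    user_id    TEXT NOT NULL,
    pref_name  TEXT NOT NULL,
    pref_value TEXT NOT NULL,
    PRIMARY KEY (user_id, pref_name)
);
CREATE TABLE IF NOT EXISTS history (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT NOT NULL,
    source  TEXT NOT NULL,
    target  TEXT NOT NULL,
    content TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def set_preference(conn: sqlite3.Connection, user_id: str, name: str, value: str) -> None:
    """Insert a preference, overwriting any earlier value for the same name."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO preferences (user_id, pref_name, pref_value) "
            "VALUES (?, ?, ?)",
            (user_id, name, value),
        )


def get_preferences(conn: sqlite3.Connection, user_id: str) -> Dict[str, str]:
    """Return all preferences for one user as a name-to-value dictionary."""
    rows = conn.execute(
        "SELECT pref_name, pref_value FROM preferences WHERE user_id = ?",
        (user_id,),
    )
    return dict(rows.fetchall())


def log_interaction(conn: sqlite3.Connection, user_id: str,
                    source: str, target: str, content: str) -> None:
    """Record one conversion, for example gesture to text."""
    with conn:
        conn.execute(
            "INSERT INTO history (user_id, source, target, content) VALUES (?, ?, ?, ?)",
            (user_id, source, target, content),
        )


def recent_history(conn: sqlite3.Connection, user_id: str,
                   limit: int = 10) -> List[Tuple[str, str, str]]:
    """Return the most recent conversions for one user, newest first."""
    rows = conn.execute(
        "SELECT source, target, content FROM history "
        "WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    )
    return rows.fetchall()
```

Keeping preferences as name and value pairs lets new settings be added without a schema change, which suits an early prototype.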

Network Components:

1. Internet Connectivity:

The app will rely on an internet connection for cloud synchronization, fetching updates, and accessing
APIs for speech recognition, text-to-speech conversion, and machine learning model processing.


IMPLEMENTATION TOOLS AND TECHNIQUES


Techniques:
Natural Language Processing (NLP)
Tools:
Google Colab, PyCharm, VS Code
UI/UX:
Figma, Canva
Programming Languages and Frameworks:
React Native, Python
Backend and APIs:
FastAPI, Laravel
Artificial Intelligence and Machine Learning:
TensorFlow, Keras, scikit-learn, PyTorch, Anaconda
Testing Tools:
Manual Testing
Database:
MySQL, Firebase, MongoDB

Project Plan

VERSION CONTROL

Version   Date         Description
1         10/12/2024   Initial draft created
1.1       31/12/2024   Added detailed architectural components


……………………………….DO NOT WRITE BELOW THIS LINE…………………………………

FOR OFFICE USE ONLY


Approved Yes No

Checked & Approved/Not Approved By:

Name:

Signature:
Day Month Year

DATE – –
