"A S F D A D": Linguistic Sign Communicator
Submitted by:
Tabia Rashid 2012/comp/BS(SE)/14067 1214105
Maria Farooq 2012/comp/BS(SE)/14056 1214094
Anousha Khan 2012/comp/BS(SE)/14032 1214070
Maha Shakeel 2012/comp/BS(SE)/14055 1214093
January 2016
PROJECT APPROVAL
By:
Tabia Rashid 2012/comp/BS(SE)/14067 1214105
Maria Farooq 2012/comp/BS(SE)/14056 1214094
Anousha Khan 2012/comp/BS(SE)/14032 1214070
Maha Shakeel 2012/comp/BS(SE)/14055 1214093
Approval Committee:
___________________________
Ms. Narmeen Shawoo Bawany
(Internal Advisor)
___________________________
Mr. Adeel
Designation: Senior Software Engineer
Organization: Synety Groups
(External Advisor)
___________________________
(Head of the Department)
ABSTRACT
Linguistic Sign Communicator (LSC) is an interpreter for deaf and dumb people in Pakistani
society. Its aim is to translate Pakistani Sign Language (PSL) gestures efficiently into
both text and audible speech in two languages, English and Urdu. The interpreter uses a
Microsoft Kinect V1 device, whose sensors capture depth images and extract human skeleton
points, and whose microphone array can capture voice. The sensor can detect up to six
people and track the full skeletons of two of them at a time. The interpreter translates
not only alphabets but also words, phrases and sentences from the performed gestures. It
also provides learning videos so that users can learn signs easily and test their own
gestures.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
Chapter 1 INTRODUCTION
1.1 Purpose
1.2 Project Overview
1.3 Related Studies
1.4 Project Boundaries
1.5 Scope
1.5.1 Scope-in
1.5.2 Scope-out
1.6 Intended Audience & Reading Suggestions
3.2.2 Rationale behind project
3.3 Other non-Functional Requirements
3.3.1 Performance Requirements
3.3.1.1 Response Time
3.3.1.2 Workload
3.3.2 Safety Requirements
3.4 Software Quality Attributes
3.4.1 Correctness
3.4.2 Flexibility
3.4.3 Interoperability
3.4.4 Maintainability
3.4.5 Reliability
3.4.6 Robustness
3.4.7 Ease of Use
3.5 Business Constraints
4.5.3 C#
4.6 Memory Constraints
4.6.1 Hardware Constraints
4.6.2 Application Constraints
Chapter 5 IMPLEMENTATION
Conclusion
Appendix B GLOSSARY
LIST OF FIGURES
LIST OF TABLES
TABLE 1.1 PREVIOUS PROJECTS
ACKNOWLEDGEMENT
We would like to express our gratitude to all those who assisted us and bore with us
throughout this work. Firstly, we acknowledge the Almighty for His guidance and wisdom.
After Him, we take this opportunity to express our profound appreciation, respect and
thankfulness to our mentor, Ms. Narmeen Shawoo Bawany, for her valuable time and
guidance, for the confidence she showed in us by letting us work on a project of this
magnitude using the latest technologies, and for her support, help and encouragement in
the deployment of this project. Without her help this work would not have been possible.
Further, we are grateful to Mr. Adeel, who helped and motivated us through all the
difficult times. Without his motivation and his appreciation of every small milestone
during this stressful period, we would not have been able to achieve this success.
We are highly indebted to the MIC (Microsoft Innovation Centre) and Jinnah University for
Women for their guidance and constant supervision, for providing the necessary
information regarding the project, and for their support in completing it.
We would also like to express our sincere gratitude to our parents, who suffered sleepless
nights just because the lights were on while we were working on the system. Last but not
least, we are grateful to our friends, who were the best source of motivation during this
hectic phase.
We have put considerable effort into this project. However, it would not have been
possible without the kind support and help of many individuals and organizations. We
would like to extend our sincere thanks to all of them.
Our thanks and appreciation also go to our colleagues who helped in developing the
project and to the people who willingly helped us with their abilities.
Chapter 1. Introduction
Chapter 1
INTRODUCTION
1.1 PURPOSE
The main idea of this project is to ease the social lives of deaf and dumb people
living in Pakistan. Sign language is a non-verbal form of communication found amongst
deaf communities around the world. Sign languages do not share a common origin and are
hence difficult to interpret. LSC is an interpreter that translates hand gestures to
audible speech.
The main aim of this interpreter is to present a system that can efficiently translate
Pakistani Sign Language gestures to both text and audible speech. The interpreter makes
use of a skeleton based technique comprising flex sensors, tactile sensors and an
accelerometer. For each hand gesture, the sensors produce a signal corresponding to the
hand sign, and the controller matches the gesture against pre-stored inputs. The device
translates not only alphabets but can also form words and sentences from performed
gestures that the users themselves have saved.
The disadvantage of vision based techniques is that they require complex algorithms for
data processing. Other challenges in image and video processing include varying lighting
conditions, backgrounds, field of view constraints and occlusion. The sensor based
technique offers greater mobility.
The main aim of this software is to present a system that can efficiently translate
Pakistani Sign Language gestures to both text and audible speech. The interpreter here
makes use of a sensor based technique comprising flex sensors, tactile sensors and an
accelerometer. For each hand gesture, the sensors produce a signal corresponding to the
hand sign, and the controller matches the gesture against pre-stored inputs. The software
translates not only alphabets but can also form words and phrases from performed
gestures. A training mode is offered in the software to make the signs easy to learn.
The method described by Jonathan Hall [9] uses a Markov model, a typical model for a
stochastic sequence over a number of states, where the states are inferred from
observations or data. In this approach, the observation data are sequential 3D points
(x, y, z) of joints. A gesture is recognized based on the states as well as the
transitions between them. These states are hidden, and hence this type of Markov model is
called a Hidden Markov Model (HMM). This method uses a skeleton-based gesture model and
also takes the transitions between states into consideration. The accuracy of the gesture
model depends on the initialization of the states by the user, so erroneous input from
the user could lead to poor performance. Averaging different sets of input states for the
same gesture could solve this problem.
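As a rough illustration of how an HMM can score a gesture, the sketch below implements the standard forward algorithm for a discrete HMM and picks the model with the highest likelihood. It is not taken from [9]: the function names, the two toy models and the reduction of 3D joint points to two symbols (0 = hand low, 1 = hand high) are all assumptions made for the example, and Python is used only for brevity (LSC itself is written in C#).

```python
from typing import Dict, Sequence, Tuple

# A model is (start probabilities, transition matrix, emission matrix).
Model = Tuple[Sequence[float], Sequence[Sequence[float]], Sequence[Sequence[float]]]


def forward_likelihood(observations, start_prob, trans_prob, emit_prob) -> float:
    """Return P(observations | model) with the forward algorithm.

    `observations` must be a non-empty sequence of symbol indices.
    """
    n_states = len(start_prob)
    # alpha[s]: probability of the observations so far, ending in hidden state s
    alpha = [start_prob[s] * emit_prob[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit_prob[s][obs]
            * sum(alpha[p] * trans_prob[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(observations, models: Dict[str, Model]) -> str:
    """Return the name of the model that best explains the observations."""
    return max(models, key=lambda name: forward_likelihood(observations, *models[name]))


# Hypothetical two-state, left-to-right models for two toy gestures.
GESTURE_MODELS: Dict[str, Model] = {
    "raise": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "lower": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
```

In a real recognizer the continuous joint coordinates would first have to be quantized or modelled with continuous emission densities; that step is left out here.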
The Gesture Service for Kinect project [4], built using the Windows SDK, considers
gestures to be made up of parts. Each part of a gesture is a specific movement that, when
combined with other gesture parts, makes up the whole gesture. This method uses a
skeleton based gesture model. Recognizing gesture parts alone is not sufficient to
recognize a gesture. The overall system comprises three classes, namely gesture
controller, gesture and gesture part. The method uses a gesture controller to control the
transition between gesture parts and to update the state of each gesture part. Though
this method tries to incorporate transitions, it is not efficient unless we consider a
large number of gesture parts that are close to each other.
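The controller / gesture / gesture-part split can be sketched as a small state machine. The code below is a minimal illustration under our own assumptions, not the code of [4]: the class layout, the joint names ("hand_right", "elbow_right") and the frame timeout are hypothetical, and Python is used only for brevity.

```python
class Gesture:
    """A gesture is an ordered list of parts; each part is a test on one skeleton frame."""

    def __init__(self, name, parts, max_frames_per_part=30):
        self.name = name
        self.parts = parts
        self.max_frames_per_part = max_frames_per_part
        self._index = 0   # which part we are currently waiting for
        self._waited = 0  # frames spent waiting for that part

    def reset(self):
        self._index = 0
        self._waited = 0

    def update(self, skeleton):
        """Feed one frame; return True when the last part has just been matched."""
        if self.parts[self._index](skeleton):
            self._index += 1
            self._waited = 0
            if self._index == len(self.parts):
                self.reset()
                return True
        else:
            self._waited += 1
            if self._waited > self.max_frames_per_part:
                self.reset()  # the user took too long, start over
        return False


class GestureController:
    """Passes every frame to all registered gestures and reports the completed ones."""

    def __init__(self):
        self.gestures = []

    def add(self, gesture):
        self.gestures.append(gesture)

    def update(self, skeleton):
        return [g.name for g in self.gestures if g.update(skeleton)]


# Two hypothetical parts of a "swipe right" gesture, comparing x coordinates.
def hand_left_of_elbow(skeleton):
    return skeleton["hand_right"][0] < skeleton["elbow_right"][0]


def hand_right_of_elbow(skeleton):
    return skeleton["hand_right"][0] > skeleton["elbow_right"][0]
```

A skeleton frame here is simply a dictionary mapping a joint name to an (x, y, z) tuple; a gesture is reported on the frame in which its final part is satisfied.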
The Kinetic Space described in [5] provides a tool which allows anybody to record and
automatically recognize customized gestures using the depth images and skeleton data
provided by the Kinect sensors. This method is very similar to the Hidden Markov Model
[3] discussed before. It uses a skeleton based gesture model and also takes the
transitions between states into consideration. No code has to be written by the trainer.
Its analysis routines can detect not only simple gestures such as pushing, clicking,
forming a circle or waving, but also more complicated gestures such as those used in
dance performances or sign language. In addition, it provides visual feedback on how
well individual body parts resemble a given gesture.
This method does not break a gesture into segments or parts, and as a result a large
amount of data is used to describe a gesture, making it a memory inefficient solution.
Considering gesture segments and interpolating between them would result in a more
memory efficient solution.
The project described in [6] allows developers to include fast, reliable and highly
customizable gesture recognition in Microsoft Kinect SDK C# projects. This method uses a
skeleton based gesture model. It uses the Dynamic Time Warping (DTW) [7] algorithm for
measuring the similarity between two sequences which may vary in time or speed. It relies
on skeletal tracking, but its drawback is that it currently supports only 2D vectors and
not 3D.
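To make the role of DTW concrete, the following is a minimal textbook formulation of the DTW distance between two sequences of 2D points. It is a sketch only, not the implementation used in [6] or in LSC; the function name and the use of Euclidean distance as the per-frame cost are assumptions, and Python is used only for brevity.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of points.

    cost[i][j] holds the cheapest accumulated cost of aligning the first
    i points of `a` with the first j points of `b`.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # per-frame Euclidean cost
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because frames may be repeated during alignment, the same hand path performed at half speed still has distance zero to the original, which is why DTW suits gestures performed at varying speeds.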
The software includes a gesture recorder that records the user's skeleton and
trains the system. The recognizer software then recognizes the gestures that have been
trained by the user. This method supports only 2D and not 3D vectors. It does not track
whether a user is correctly following a trajectory between poses. It does not give
incremental feedback along the way.
A method for contact-less hand gesture recognition using Microsoft Kinect for Xbox has
been described in [8]. The system can detect the presence of gestures, identify fingers,
and recognize the meanings of nine gestures in a pre-defined Popular Gesture scenario.
The accuracy of the system ranges from 84 percent to 99 percent for single-hand
gestures. Because the depth sensor of the Kinect is an infrared camera, the lighting
conditions, the signers' skin colors and clothing, and the background have little impact
on the performance of this system. This method has a good accuracy rate, but it is
limited to hand gestures.
Further past projects are:
Figure 1.1 shows a symbol as used in Pakistani Sign Language alongside one from Swedish
Sign Language. Other issues could be:
One semantic concept corresponds to a specific sign.
Several semantic concepts are mapped onto a unique sign.
One semantic concept generates several signs.
Verbs, general and specific Nouns.
1.5 SCOPE
1.5.1 SCOPE-IN
1.5.2 SCOPE-OUT
Chapter 2 . Overall Description
Chapter 2
OVERALL DESCRIPTION
It is the individual who lacks the ability to speak. Some types are:
Articulation Disorder
Fluency Disorder
It is assumed that the deaf or dumb person clearly knows Pakistani Sign Language and
also understands English and Urdu at a basic level.
There are some limitations within LSC:
Brightness from the sun and high contrast can sometimes make it hard for the sensor to
detect the expected skin color.
It is hard to make a decision when the background color of the tracking environment is
similar to the skin color; the LSC then receives unexpected pixels, and semantic
problems arise.
Different skeletons can be detected at the same time.
The nearest skeleton will be detected.
The sensor should be placed at a proper height from which it can scan the whole body.
The sensor should not be exposed to sunlight or high beams.
A person who wants to use the system should record his or her own gestures.
The data acquisition phase begins once a user performs a gesture. Our intent is to
track the user's motion and capture each individual movement. As movement is detected, we
determine the beginning and end of the motion and pass this data to the data analysis
phase.
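One simple way to find the beginning and end of a motion is to threshold the frame-to-frame displacement of a tracked joint. The sketch below illustrates that idea only; the function name, the threshold of 0.05 m per frame and the minimum segment length are assumptions, not values taken from LSC, and Python is used only for brevity.

```python
import math


def segment_motion(frames, speed_threshold=0.05, min_length=3):
    """Return (start, end) index pairs, inclusive, of runs where the joint is moving.

    `frames` is a list of (x, y, z) positions of one joint, one entry per frame.
    A frame counts as moving when it is more than `speed_threshold` away from
    the previous frame. Runs shorter than `min_length` frames are dropped.
    """
    segments = []
    start = None
    for i in range(1, len(frames)):
        moving = math.dist(frames[i], frames[i - 1]) > speed_threshold
        if moving and start is None:
            start = i - 1  # include the last resting frame as the start pose
        elif not moving and start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # A motion that is still running when the data ends
    if start is not None and len(frames) - start >= min_length:
        segments.append((start, len(frames) - 1))
    return segments
```

Each returned segment can then be sliced out of the frame list and handed to the data analysis phase.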
Data analysis uses recognition algorithms that compare the current movement data against
a predefined library. If a gesture is not found, we return to the beginning of our flow
chart and require the user to re-perform the gesture. If the gesture is found, the
translation is displayed on the user interface.
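The found / not found decision can be sketched as a nearest-template lookup with a rejection threshold. The code below is an illustration under stated assumptions: the names `recognise` and `mean_frame_distance`, the toy library and the threshold are hypothetical, and the simple frame-by-frame distance stands in for whatever recognition algorithm is actually used. Python is used only for brevity.

```python
import math


def mean_frame_distance(a, b):
    """Average point-to-point distance between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of frames")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def recognise(sample, library, max_distance, distance=mean_frame_distance):
    """Return the name of the closest library gesture, or None if none is close enough.

    Returning None corresponds to going back to the start of the flow chart
    and asking the user to perform the gesture again.
    """
    best_name, best_score = None, math.inf
    for name, template in library.items():
        score = distance(sample, template)
        if score < best_score:
            best_name, best_score = name, score
    return best_name if best_score <= max_distance else None
```

The `distance` parameter is left pluggable so that a time-tolerant measure can be swapped in without changing the lookup logic.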
The user interface gives the user feedback by showing the translated word or phrase as
text output. The intended application of the LSC is to allow those who use Pakistani
Sign Language as their first language to communicate with those who are unable to sign.
A device such as this would allow them to communicate seamlessly despite the language
barrier.
2.7.1 ASSUMPTIONS
2.7.2 DEPENDENCIES
Chapter 3. Requirement Analysis
Chapter 3
REQUIREMENT ANALYSIS
Most of the problems that arose during real-time video conferencing with deaf and dumb
people were semantic. For example, a person poses a gesture and then wants to stop the
conversation: how will he or she express the full stop? This was the most problematic
situation encountered when interpreters were built previously.
3.3.1.2 WORKLOAD
Only a single person can perform gestures at a time.
Too many gestures performed continuously may cause the system to hang.
The Kinect sensor should be protected; after running continuously it may sometimes stop
showing the skeleton and then has to be restarted.
3.3.2 SAFETY REQUIREMENT
The user should be positioned 6 to 8 feet (1.8 m to 2.4 m) away from the sensor.
Protect the Kinect sensor from sunlight.
3.4.1 CORRECTNESS
Gestures must be performed correctly according to the PSL book that has been
implemented, in order to retrieve correct sentences and phrases from the system.
3.4.2 FLEXIBILITY
The user can store the PSL gestures he or she requires for future use. The user can also
override the pre-recorded gestures.
3.4.3 INTEROPERABILITY
Gesture data and the phrase dictionary are stored in XML format, which can be shared with
other environments and reused for making similar projects.
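As an illustration of why an XML store is easy to share, the sketch below writes recorded gestures to XML and reads them back with Python's standard library. The element and attribute names (gestures, gesture, frame, joint, type, x, y) are invented for this example and are not the schema used by LSC; Python is used only for brevity.

```python
import xml.etree.ElementTree as ET


def gestures_to_xml(gestures):
    """Serialize {gesture name: [frame, ...]} where a frame is {joint: (x, y)}."""
    root = ET.Element("gestures")
    for name, frames in gestures.items():
        gesture_el = ET.SubElement(root, "gesture", name=name)
        for frame in frames:
            frame_el = ET.SubElement(gesture_el, "frame")
            for joint, (x, y) in frame.items():
                # repr() keeps the full float precision for a lossless round trip
                ET.SubElement(frame_el, "joint", type=joint, x=repr(x), y=repr(y))
    return ET.tostring(root, encoding="unicode")


def gestures_from_xml(text):
    """Inverse of gestures_to_xml."""
    root = ET.fromstring(text)
    gestures = {}
    for gesture_el in root.findall("gesture"):
        gestures[gesture_el.get("name")] = [
            {
                joint_el.get("type"): (float(joint_el.get("x")), float(joint_el.get("y")))
                for joint_el in frame_el.findall("joint")
            }
            for frame_el in gesture_el.findall("frame")
        ]
    return gestures
```

Any environment with an XML parser can read such a file, which is the property this requirement relies on.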
3.4.4 MAINTAINABILITY
If any failure occurs in the hardware, it will be repaired or replaced.
3.4.5 RELIABILITY
The system must be able to run for as long as the Kinect sensor supports.
3.4.6 ROBUSTNESS
The system must be able to continue operating even if it finds more than one skeleton
or no skeleton.
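A minimal sketch of this requirement is shown below: given whatever skeletons the sensor reports in a frame, it returns the nearest tracked one, or nothing when no skeleton is tracked, so that the caller can skip the frame instead of failing. The dictionary layout and function name are hypothetical, the choice of the nearest skeleton is our reading of the limitations listed in Chapter 2, and Python is used only for brevity.

```python
def pick_active_skeleton(skeletons):
    """Return the tracked skeleton closest to the sensor, or None if there is none.

    Each skeleton is a dict with a boolean "tracked" flag and a "position"
    tuple (x, y, z) in metres, where z is the distance from the sensor.
    """
    tracked = [s for s in skeletons if s["tracked"]]
    if not tracked:
        return None  # no user in view: the caller simply waits for the next frame
    return min(tracked, key=lambda s: s["position"][2])
```

Handling the empty and the multi-skeleton cases in one place keeps the rest of the recognition pipeline free of special cases.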
Chapter 4. System Features
Chapter 4
SYSTEM FEATURES
4.1.6 SLIDE-UP/SLIDE-DOWN:
This feature tilts the Kinect sensor up and down to fit the position of the skeleton, so
that the user's skeleton can be brought into view without the user having to move.
If some gestures are not found or not recorded, the system will take you to the testing
mode.
A second user can reply by typing text in a box and pressing Enter, which will show
videos based on that text.
Start performing some gestures. You can see the names of the gestures in the select
box, and most of them should be obvious to perform. Matches will appear in the results
text panel at the top of the skeleton canvas.
Try recording your own gestures. Make sure your skeleton is being tracked, select the
gesture name you want to record, then click the Capture button. The gesture is
currently hard-coded to look at 32 frames (which is actually every other frame over 64
elapsed frames).
When the recording of each gesture is finished, click the Store button. Test your new
gesture a few times to see if you're happy with it. If not, re-record it and try again.
When you're happy with your results, save your gestures to file by clicking the Load
File button.
If you want to clear the list shown on your right-hand side, click the Clear list
button.
Make your own gestures: simply amend or add to the select box items with a unique name
and record your gesture.
Now a user can load the file when the application runs, perform the desired gestures,
and see proper sentences and their Urdu translations in communication mode.
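The capture step in the procedure above keeps 32 frames, taken as every other frame over 64 elapsed frames. A rolling buffer with that behaviour can be sketched as follows; the class name and its methods are hypothetical and only illustrate the sampling scheme, and Python is used only for brevity.

```python
from collections import deque


class FrameBuffer:
    """Keeps the most recent `kept_frames` frames, storing only every `skip`-th frame."""

    def __init__(self, kept_frames=32, skip=2):
        self.skip = skip
        self._frames = deque(maxlen=kept_frames)  # old frames fall off the left
        self._seen = 0

    def add(self, frame):
        # Frames 0, 2, 4, ... are stored; the ones in between are ignored.
        if self._seen % self.skip == 0:
            self._frames.append(frame)
        self._seen += 1

    def is_full(self):
        return len(self._frames) == self._frames.maxlen

    def snapshot(self):
        """Return the buffered frames, oldest first."""
        return list(self._frames)
```

Halving the frame rate in this way halves the data stored per gesture while still covering the same time window.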
4.3.1 FEATURES:
The following features are available on the website.
4.5.3 C#
The C# language is used to code the functionality behind the WPF windows.
Chapter 5. Implementation
Chapter 5
IMPLEMENTATION
We have implemented different modes in LSC for the ease of users. These modes were
discussed in Chapter 4 under the User Interfaces section. The interfaces have been
implemented using several different classes, some of which are discussed in this chapter.
5.1.2 CURVES:
It contains an interface that declares the functions and a class that implements that
interface with the function definitions. These functions find curves using the
k-curvature algorithm. Curvature is any of a number of loosely related concepts in
different areas of geometry. Intuitively, curvature is the amount by which a geometric
object deviates from being flat, or straight in the case of a line, but this is defined in
different ways depending on the context.
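In its usual formulation, the k-curvature of a contour point is the angle between the two vectors that join the point to the contour points k steps behind and k steps ahead; small angles mark sharp features such as fingertips. The sketch below follows that general formulation and is not the project's Curves class: the function names, the 60 degree limit and the assumption of a closed contour are ours, and Python is used only for brevity.

```python
import math


def k_curvature_angles(contour, k):
    """Angle in radians at each point of a closed 2D contour, using neighbours k steps away."""
    n = len(contour)
    angles = []
    for i in range(n):
        px, py = contour[i]
        ax, ay = contour[(i - k) % n]  # k steps behind, wrapping around
        bx, by = contour[(i + k) % n]  # k steps ahead, wrapping around
        v1 = (ax - px, ay - py)
        v2 = (bx - px, by - py)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0.0:
            angles.append(math.pi)  # degenerate neighbours: treat as flat
            continue
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        angles.append(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angles


def sharp_points(contour, k, max_angle_deg=60.0):
    """Indices of contour points whose k-curvature angle is at most the limit."""
    limit = math.radians(max_angle_deg)
    return [i for i, a in enumerate(k_curvature_angles(contour, k)) if a <= limit]
```

The angle alone does not distinguish a fingertip from the valley between two fingers; a further check, for example on the sign of the cross product of the two vectors, would be needed for that.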
5.1.4 FINGERS:
This class contains functions to capture finger positions from the depth image. It
tracks all the fingers of both hands.
5.4 XML:
LSC saves the gestures in XML format, with several nodes containing the positions and
directions of the skeleton performing the gesture in front of the Kinect. It stores the
frame, finger distances and positions, Kinect distances and positions from the skeleton,
and joint types as an array list. LSC uses an XML format dictionary to retrieve
sentences. The format for sentences is as follows:
Conclusion
CONCLUSION
Sign Language is used all around Pakistan by deaf and dumb people. There is often a
barrier between those who use Pakistani Sign Language as a first language and those who
do not know it, and we have tried to bridge this gap in communication by designing the
Linguistic Sign Communicator. The LSC captures gestures as 2D data using the Microsoft
Kinect hardware. Our program uses the Dynamic Time Warping algorithm to detect signs
very accurately and display them on a computer screen in a user friendly manner.
The LSC is designed to be a proof of concept, and the results of the final product are
very accurate. A library of PSL signs and phrases was recorded, and when the LSC was
tested, the desired sign was almost always detected by the software and converted into
two natural languages, English and Urdu. There are, however, improvements that can
easily be made using the core concepts derived in this project.
Although the Linguistic Sign Communicator is a proof of concept, there are many
improvements that could make it better. Pakistani Sign Language also includes facial
gestures (e.g. raised eyebrows), which could be incorporated into the Linguistic Sign
Communicator using imaging techniques. Adding facial gestures would make this project
much more robust.
A Pakistani Sign Language linguist would also be needed to get sentence concatenation
right. There are also different dialects in the different regions where Pakistani Sign
Language is used, and these would need to be considered while recording the signs.
Appendices
Appendix A
Appendix B
GLOSSARY
LSC
Linguistic Sign Communicator is a software application for the deaf and dumb community
all over Pakistan which will help to ease their social lives.
Kinect V1
Kinect is a hardware device made by Microsoft Corporation. It possesses various sensors
that capture color, depth and skeleton images, with which applications driven by natural
user interface (NUI) gestures can be built.
DTW
Dynamic Time Warping is an algorithm for measuring the similarity between two temporal
sequences which may vary in time or speed, by finding an optimal match (according to a
suitable metric) between the two sequences.
XML
Extensible Markup Language is a simple, very flexible text format derived from SGML
(ISO 8879). Originally designed to meet the challenges of large-scale electronic
publishing, XML is also playing an increasingly important role in the exchange of a wide
variety of data on the Web and elsewhere.
SDK
Software Development Kit is typically a set of software development tools that allows
the creation of applications for a certain software package, software framework, hardware
platform, computer system, video game console, operating system, or similar
development platform.
PSL
Pakistani Sign Language. In this report, PSL also refers to the PSL book, a resource with
5,000 unique words and phrases, and growing. Each word has a graphic illustration and a
voice-over in English and Urdu, so three languages (PSL, English and Urdu) can be learned
together. Words can be viewed by category or searched individually.
WPF
Windows Presentation Foundation is a graphical subsystem for rendering user interfaces
in Windows-based applications by Microsoft. WPF, previously known as "Avalon", was
initially released as part of .NET Framework 3.0. Rather than relying on the older GDI
subsystem, WPF uses DirectX.
API
Application Program Interface is a set of routines, protocols, and tools for building
software applications. The API specifies how software components should interact, and
APIs are used when programming graphical user interface (GUI) components.
2D
Two-Dimensional. 2D computer graphics is the computer-based generation of digital
images, mostly from two-dimensional models (such as 2D geometric models, text, and
digital images) and by techniques specific to them. The term may stand for the branch of
computer science that comprises such techniques, or for the models themselves.
Appendix C
ANALYSIS MODEL
C1 ACTIVITY DIAGRAM
A UML Activity diagram showing the process is given below:
C4 CLASS DIAGRAM
Resumes
Tabia Rashid
House#B-95,Architect & Engineering Housing Society,Gulistan-e-jauhar,block-
08,Karachi,Pakistan
Cell no: (+92)346-2570782
Phone: 021-346643156
Email: [email protected]
DOB: 09th October 1993
Sex: Female
Marital Status: Single
Objective
To excel in the field of web development and software engineering, in both business and
academic settings, with proven leadership skills in developing and managing projects and
the ability to work as part of a team. I am willing to dedicate myself to strictly adhering
to employment ethics and to giving my best to the respective company.
Programming Skills
.NET framework
PHP + AJAX+MYSQL
Languages :C,C++, Java, C#
Html 5
CSS3
Android
XML
Database applications using Java
SQL management System
Visual Basic
MS OFFICE
Recent Projects
Linguistic Sign Communicator (Year 2015, FYP, Jinnah University For Women)
Adda Fashion (Year 2015,Made for client)
Fun MP4 Tube (Year 2014, Jinnah University For Women)
Edible arts by Nadia(Year 2014, Jinnah University For Women)
Clinical Management System (Year 2012, Jinnah University For Women)
Import Shipping Module(Year 2013, Jinnah University For Women)
Sunlight Inventory System(Year 2013, Jinnah University For Women)
Achievements
Research Paper (Evaluation of Smart phone Applications Accessibility for Blind Users)
Technologies
Software: Visual Studio 2013,.NET framework, Netbeans, eclipse, Notepad++, MS
Office (Word, Access, Excel, PowerPoint).
Qualification:
Portfolio on Request
Objective
To excel in the field of web development and software engineering, in both business and
academic settings, with proven leadership skills in developing and managing projects and
the ability to work as part of a team. I am willing to dedicate myself to strictly adhering
to employment ethics and to giving my best to the respective company.
Programming Skills
.NET framework
PHP + AJAX+MYSQL
Languages :C,C++, Java, C#
Html 5
CSS3
Android
XML
Database applications using Java
SQL management System
Visual Basic
MS OFFICE
Recent Projects
Linguistic Sign Communicator (Year 2015, FYP, Jinnah University For Women)
Fun MP4 Tube (Year 2014, Jinnah University For Women)
Clinical Management System (Year 2012, Jinnah University For Women)
Achievements
Best Poster Award (Linguistic Sign Communicator)
Technologies
Software: Visual Studio 2013,.NET framework, Netbeans, eclipse, Notepad++, MS
Office (Word, Access, Excel, PowerPoint).
Qualification:
Portfolio on Request
“I allow university authorities to publish my resume online and to submit/send resume to
any organization”.
Anousha Khan
Flat#A1-316,Unique Classic,Block#15,Gulistan-e-jauhar,Karachi,Pakistan
Cell no: (+92)332-8233858
Phone: 021-34012349
Email: [email protected]
DOB: 18th September 1993
Sex: Female
Marital Status: Married
Objective
To excel in the field of web development and software engineering, in both business and
academic settings, with proven leadership skills in developing and managing projects and
the ability to work as part of a team. I am willing to dedicate myself to strictly adhering
to employment ethics and to giving my best to the respective company.
Programming Skills
.NET framework
PHP + AJAX+MYSQL
Languages :C,C++, Java, C#
Html 5
CSS3
Android
XML
Database applications using Java
SQL management System
Visual Basic
MS OFFICE
Recent Projects
Linguistic Sign Communicator (Year 2015, FYP, Jinnah University For Women)
Adda Fashion (Year 2015,Made for client)
Fun MP4 Tube (Year 2014, Jinnah University For Women)
Edible arts by Nadia(Year 2014, Jinnah University For Women)
Clinical Management System (Year 2012, Jinnah University For Women)
Import Shipping Module(Year 2013, Jinnah University For Women)
Sunlight Inventory System(Year 2013, Jinnah University For Women)
Achievements
Best Poster Award (Linguistic Sign Communicator)
Technologies
Software: Visual Studio 2013,.NET framework, Netbeans, eclipse, Notepad++, MS
Office (Word, Access, Excel, PowerPoint).
Qualification:
Portfolio on Request
“I allow university authorities to publish my resume online and to submit/send resume to
any organization”.
Maria Farooq
House#14, block C Police headquarter garden, Karachi,Pakistan
Cell no: (+92)321-5236705
Email: [email protected]
DOB: 25th March 1994
Sex: Female
Marital Status: Single
Objective
To be a part of a progressive environment for career advancement and professional
growth, which will help me gain sufficient knowledge.
Programming Skills
.NET framework
PHP +MYSQL
Languages :C,C++, Java, C#
Html 5
CSS3
Android
XML
Database applications using Java
SQL management System
Visual Basic
MS OFFICE
Recent Projects
Linguistic Sign Communicator (Year 2015, FYP, Jinnah University For Women)
Fun MP4 Tube (Year 2014, Jinnah University For Women)
Edible arts by Nadia(Year 2014, Jinnah University For Women)
Car Showroom (Year 2012, Jinnah University For Women)
Import Shipping Module(Year 2013, Jinnah University For Women)
Sunlight Inventory System(Year 2013, Jinnah University For Women)
Achievements
Research Paper (Evaluation of Smart phone Applications Accessibility for Blind Users)
Qualification:
Portfolio on Request
“I allow university authorities to publish my resume online and to submit/send resume to
any organization”.