Project Report
SATHIYAVANI E (814319104049)
SOUNDARYA M (814319104053)
KAUSHICA K (814319104018)
BRINDHA P (814319104004)
DHANALAKSHMI SRINIVASAN
ENGINEERING COLLEGE (AUTONOMOUS)
PERAMBALUR-621212
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Project Overview
1.2 Purpose
2. LITERATURE SURVEY
2.1 Existing problem
2.2 References
2.3 Problem Statement Definition
3. IDEATION & PROPOSED SOLUTION
4. REQUIREMENT ANALYSIS
5. PROJECT DESIGN
6. PROJECT PLANNING & SCHEDULING
7. CODING AND SOLUTION
8. TESTING
9. RESULTS
10. ADVANTAGES AND DISADVANTAGES
11. CONCLUSION
13. APPENDIX
Source Code
ABSTRACT
The use of doctor-computer interaction devices in the operating room (OR) requires new
modalities that support medical imaging manipulation while allowing doctors' hands to remain
sterile, supporting their focus of attention, and providing fast response times. This paper
presents “Gestix,” a vision-based hand gesture capture and recognition system that interprets
in real-time the user's gestures for navigation and manipulation of images in an electronic
medical record (EMR) database. Navigation and other gestures are translated to commands
based on their temporal trajectories, through video capture. “Gestix” was tested during a brain
biopsy procedure. In the in vivo experiment, this interface prevented the surgeon's focus shift
and change of location while achieving a rapid intuitive reaction and easy interaction. Data
from two usability tests provide insights and implications regarding human-computer
interaction based on nonverbal conversational modalities.
1. INTRODUCTION:
Humans can recognize body and sign language easily. This is possible due to the
combination of vision and the synaptic interactions formed during brain development. In
order to replicate this skill in computers, several problems need to be solved: how to
separate objects of interest in images, and which image capture technology and
classification technique are most appropriate, among others.
In this project, Gesture-based Desktop Automation, a model is first trained on images of
different hand gestures, such as showing numbers with the fingers (1, 2, 3, 4). The system
uses the integrated webcam to capture video frames. The gesture captured in a video frame
is compared against the trained model and the gesture is identified. If the predicted gesture
is 0, the image is converted into a rectangle; if 1, the image is resized to (200, 200); if 2,
the image is rotated by -45°; if 3, the image is blurred; if 4, the image is resized to
(400, 400); and if 5, the image is converted to grayscale.
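As a sketch of how this gesture-to-operation mapping could be implemented with OpenCV (the function name and the rectangle coordinates are our own illustrative assumptions, not taken from the report):

import cv2

def apply_gesture_operation(gesture, image):
    """Apply the image operation associated with a predicted gesture index."""
    if gesture == 0:                      # draw a rectangle on the image
        return cv2.rectangle(image.copy(), (50, 50), (200, 200), (0, 0, 255), 2)
    elif gesture == 1:                    # resize to 200x200
        return cv2.resize(image, (200, 200))
    elif gesture == 2:                    # rotate by -45 degrees about the centre
        h, w = image.shape[:2]
        m = cv2.getRotationMatrix2D((w / 2, h / 2), -45, 1.0)
        return cv2.warpAffine(image, m, (w, h))
    elif gesture == 3:                    # blur the image
        return cv2.GaussianBlur(image, (15, 15), 0)
    elif gesture == 4:                    # resize to 400x400
        return cv2.resize(image, (400, 400))
    elif gesture == 5:                    # convert to grayscale
        return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return image                          # unrecognized gesture: leave unchanged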
1.1 Overview:
Project Objectives
Technical Architecture:
1. Define the classification categories
2. Collect training images
3. Train the model
4. Test the model
1.2 PURPOSE:
Computer information technology is increasingly penetrating the hospital domain. A major
challenge in this process is to provide doctors with efficient, intuitive, accurate and safe means
of interaction without affecting the quality of their work. Keyboards and pointing devices, such as the
mouse, are today's principal methods of human-computer interaction. However, the use of computer
keyboards and mice by doctors and nurses in intensive care units (ICUs) is a common route for
spreading infections [1]. In this paper, we suggest the use of hand gestures as an alternative to existing
interface techniques, offering the major advantage of sterility. Even though voice control also provides
sterility, the noise level in the operating room (OR) makes it problematic [2].

In this work we refer to gestures as a basic form of non-verbal communication made with the hands.
Psychological studies have shown that young children use gestures to communicate before they learn
to talk. Manipulation, as a form of gesticulation, is often used when people speak to each other about
some object. Naturalness of expression, non-encumbered interaction, intuitiveness and high sterility
are all good reasons to replace the current interface technology (e.g., keyboard, mouse, and joystick)
with more natural interfaces.

This paper presents a video-based hand gesture capture and recognition system used to manipulate
magnetic resonance images (MRI) within a graphical user interface. A hand gesture vocabulary of
commands was selected as natural in the sense that each gesture is cognitively associated with the
notion or command it is meant to represent. For example, moving the hand left represents a "turn left"
command. The operation of the gesture interface was tested at the Washington Hospital Center in
Washington, DC. Two operations were observed in the hospital's neurosurgery department and insights
regarding the suitability of a hand gesture system were obtained. To our knowledge, this is the first time
that a hand gesture recognition system was successfully implemented in an "in vivo" neurosurgical
biopsy. A sterile human-machine interface is of supreme importance because it is the means by which
the surgeon controls medical information while avoiding contamination of the patient, the OR and the
surgeon.
2. LITERATURE SURVEY
2.2 REFERENCES:
1. Touchless user interface for intraoperative image control during interventional radiology procedures
Authors: Justin H. Tan, MD; Cherng Chao, MD, JD; Mazen Zawaideh, BS; Anne C. Roberts, MD; Thomas B. Kinney, MD
Techniques: Machine learning methods for hand gesture recognition tasks.
Findings: They evaluated and compared multiple classification methods to finally choose the best recognition model and develop a touchless real-time graphical user interface for medical image manipulation based on this hand recognition approach.

2. Gestures for Picture Archiving and Communication Systems (PACS) operation in the operating room: Is there any standard? (2018)
Authors: Naveen Madapana, Glebys Gonzalez, Richard Rodgers, Lingsong Zhang, Juan P. Wachs
Techniques: 1. Command extraction; 2. Unconstrained gesture elicitation; 3. Agreement analysis; 4. Synapse software was used for browsing radiology images.
Findings: In recent years there has been a major spur of hand-gesture interfaces for controlling Electronic Medical Records in the Operating Room. Yet it is still not clear which gestures should be used to operate these interfaces. This work addresses the challenge of determining the best gestures to control a PACS system in the OR, based uniquely on agreement among surgeons.

3. Touchless computer interfaces in hospitals: A review (2019)
Authors: Seán Cronin (GLANTA Ltd, Ireland); Gavin Doherty (Trinity College Dublin, Ireland)
Techniques: Eye gaze technology (EGT); capacitive floor sensors and inertial orientation sensors; colour cameras such as the Canon VC-C4; the Loop Pointer and MESA SR-3000 ToF cameras; the Siemens integrated OR system; a wireless hands-free surgical pointer; the Apple iPad; Leap Motion controllers; and the Microsoft Kinect ToF camera.
Findings: A variety of outcomes are studied in the literature, with accuracy of gesture recognition being the most frequently reported outcome. A number of factors should be considered when evaluating a system: sensitivity and recall of gestures, precision and positive predictive value, F-measure, likelihood ratio and recognition accuracy should all be rigorously evaluated using standard, public data sets.

5. Gesture-controlled image positioning for minimally invasive interventions (2021)
Authors: Benjamin Fritsch, Thomas Hoffmann, André Mewes, Georg Rose
Techniques: 1. Stereo infrared optical tracking system; 2. Qt application framework (Qt Group, Helsinki, Finland).
Findings: The main purpose of the software is to control the position of the X-ray tube. Especially when using gantry CT systems, it is not possible to see the real-time angle of the X-ray tube because of the CT housing. For this, a prototypical GUI was developed to visualize the real-time position and provide gesture interaction capabilities.
3. IDEATION & PROPOSED SOLUTION
3.2 IDEATION & BRAINSTORMING
Social Impact / Customer Satisfaction
The solution fulfils social responsibilities and customer needs in various fields such as
hospitals, schools and image-processing work. It helps professionals control images
without physical contact with the computer.

Financial Benefits
The software is cost-efficient to deploy for the health care department as well as in
hospitals, and it can be used in collaboration with the government for health awareness camps.

Scalability of Solution
Better execution in terms of accurate results, sensitivity, system architecture design, and
the transparency and flexibility of the software.
5. Understand the root cause of the problem
The root cause of the problem is that users find it somewhat difficult to manipulate,
navigate and work with images through physical interaction with the computer. They
expect new technology that makes image manipulation easier.
• Before using our product, users find it difficult to work with the images.
• After using the gesture-based tool for image processing, users find it easy to
manipulate the images.

6. Behaviour
• The user behaviour behind this problem is that users face a large number of steps and
processes when working with radiology images to manipulate the data.
• This behaviour repeats every time the user works with the images.

This project provides a solution for these major problems faced by users. The hand gesture
data sets were trained with a CNN (Convolutional Neural Network), a deep learning
algorithm, so that hand gestures at different angles are identified accurately and the
operation for the particular hand gesture position is carried out.
4. REQUIREMENT ANALYSIS
4.1 Functional requirement
A functional requirement defines what a product must do and what its features and
functions are. Functional requirements are product features or functions that developers must
implement to enable users to accomplish their tasks. Generally, functional requirements
describe system behaviour under specific conditions.
NFR-3 Reliability: The system has many model samples for each hand gesture at different
angles, so there is little chance of system failure.
NFR-4 Performance: The system performs fast; it responds to the user in a fraction of a
second and the processing runs quickly at the other end.
NFR-5 Availability: The system can be accessed by authorized users from anywhere, at
any time, without delay, and it is available in any situation.
NFR-6 Scalability: The system can give access to and manage a large number of users at
a time without any identifiable loss.
5. PROJECT DESIGN
5.1 Data Flow Diagrams
A Data Flow Diagram (DFD) is a traditional visual representation of the information
flows within a system. A neat and clear DFD can depict the right amount of system
requirements graphically. It shows how data enters and leaves the system, what changes the
information, and where data is stored.
Level-0 Diagram
Level-1 Diagram
5.2 Solution and Technical Architecture:
6. PROJECT PLANNING & SCHEDULING
Velocity:
Imagine we have a 10-day sprint duration and the team's velocity is 20 story points per
sprint. The team's average velocity (AV) per iteration unit is then
AV = 20 story points / 10 days = 2 story points per day.
6.2 Sprint Delivery Schedule:
16. Application Building: Run the Application (Kaushica K)
Burndown Chart:
Road map:
7. CODING AND SOLUTION
7.1 Feature 1:
7.2 Feature 2:
7.3 Feature 3:
7.4 Feature 4:
8.TESTING
8.1 Test Cases:
A test case has components that describe the input, action and expected
response, in order to determine whether a feature of an application is working correctly.
A test case is a set of instructions on how to validate a particular test
objective/target, which, when followed, tells us whether the expected behaviour of the
system is satisfied.

This sort of testing is carried out by users, clients, or other authorised bodies to
verify the requirements and operational procedures of an application or piece of
software. Acceptance testing is the most crucial stage of testing, since it determines
whether or not the customer will accept the application or program. It can cover the
application's UI, performance, usability, and usefulness. It is also referred to as end-user
testing, operational acceptance testing, and user acceptance testing (UAT).
9.RESULTS
9.1 Performance Metrics:
10.ADVANTAGES AND DISADVANTAGES:
ADVANTAGES:
It is a user-friendly application.
It helps people use gestures easily.
Simple user interface.
It alleviates the coordinator's burden of managing users and resources.
Compared to other web applications, it incorporates provisions for numerous
gesture levels.
Good customer satisfaction.
The performance is better in terms of quality and time.
It makes better use of the database, which stores user and product history.
Quality prediction, scalability and speed are the main advantages of the
proposed scheme.
DISADVANTAGES:
It cannot automatically verify user genuineness.
It requires an active internet connection.
11.CONCLUSION
A hand gesture system for MRI manipulation in an EMR image database called
“Gestix” was tested during a brain biopsy surgery. This system is a real-time hand-tracking
recognition technique based on color and motion fusion. In an in vivo experiment, this type of
interface prevented the surgeon's focus shift and change of location while achieving rapid, intuitive
interaction with an EMR image database. In addition to allowing sterile interaction with EMRs,
the “Gestix” hand gesture interface provides: (i) ease of use—the system allows the surgeon to
use his/her hands, their natural work tool; (ii) rapid reaction—nonverbal instructions by hand
gesture commands are intuitive and fast (In practice, the “Gestix” system can process images
and track hands at a frame-rate of 150 Hz, thus, responding to the surgeon's gesture commands
in real-time), (iii) an unencumbered interface—the proposed system does not require the
surgeon to attach a microphone, use head-mounted (body-contact) sensing devices or to use
foot pedals, and (iv) distance control—the hand gestures can be performed up to 5 meters from
the camera and still be recognized accurately. The results of two usability tests (contextual and
individual interviews) and a satisfaction questionnaire indicated that the “Gestix” system
provided a versatile method that can be used in the OR to manipulate medical images in real-
time and in a sterile manner.
We are now considering the addition of a body posture recognition system to increase the
functionality of the system, as well as visual tracking of both hands to provide a richer set of
gesture commands. For example, pinching the corners of a virtual image with both hands and
stretching the arms would represent an image zoom-in action. In addition, we wish to assess
whether a stereo camera will increase the gesture recognition accuracy of the system. A more
exhaustive comparative experiment between our system and other human–machine interfaces,
such as voice, is also left for future work.
13. APPENDIX
SOURCE CODE:
Data Collection
ML depends heavily on data; without data, it is impossible for a machine to learn. Data is
the most crucial aspect that makes algorithm training possible. In machine learning
projects, we need a training data set: the actual data set used to train the model to
perform various actions.
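A minimal sketch of how such a training set could be collected from the integrated webcam with OpenCV; the folder layout, class label, ROI coordinates and sample count below are illustrative assumptions, not taken from the report:

import os
import cv2

label, num_images = "2", 100               # assumed class name and sample count
out_dir = os.path.join("dataset", label)
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(0)                  # integrated webcam
count = 0
while count < num_images:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:400, 100:400]          # region where the hand is shown
    cv2.imshow("collecting", roi)
    cv2.imwrite(os.path.join(out_dir, f"{count}.jpg"), roi)
    count += 1
    if cv2.waitKey(100) & 0xFF == ord("q"):   # press q to stop early
        break
cap.release()
cv2.destroyAllWindows()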
Image Preprocessing
In this step we improve the image data by suppressing unwanted distortions and enhancing
the image features that are important for further processing.
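One common way to do this in Keras is with ImageDataGenerator; the directory layout, target size and augmentation parameters below are assumptions chosen for illustration:

from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values and apply light augmentation to the training set.
train_datagen = ImageDataGenerator(rescale=1.0 / 255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

# Assumed directory layout: one sub-folder per gesture class (0-5).
x_train = train_datagen.flow_from_directory("dataset/train",
                                            target_size=(64, 64),
                                            batch_size=32,
                                            color_mode="grayscale",
                                            class_mode="categorical")
x_test = test_datagen.flow_from_directory("dataset/test",
                                          target_size=(64, 64),
                                          batch_size=32,
                                          color_mode="grayscale",
                                          class_mode="categorical")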
Model Building
In this step we build a Convolutional Neural Network, which contains an input layer along
with convolution and max-pooling layers, and finally an output layer.
Adding CNN Layers
Understanding the model is a very important phase for using it properly for training and
prediction. Keras provides a simple method, summary, to get full information about the
model and its layers.
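A minimal sketch of a CNN of the kind described, ending with a summary call; the layer sizes and the 64x64 grayscale input are assumptions chosen to match the preprocessing sketch above:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Input/convolution, max-pooling, then dense layers ending in a
# 6-way softmax (one output per gesture class 0-5).
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(64, 64, 1), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=6, activation="softmax"))

model.summary()   # prints every layer with its output shape and parameter count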
● Compilation is the final step in creating a model. Once compilation is done,
we can move on to the training phase. A loss function is used to find the error or deviation
in the learning process; Keras requires a loss function during the model compilation
process.
● Optimization is an important process that optimizes the input weights by
comparing the prediction with the loss function. Here we are using the Adam
optimizer.
● Metrics are used to evaluate the performance of your model. A metric is similar to a loss
function, but it is not used in the training process.
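Putting the three bullets together, a hedged sketch of the compilation and training step; the epoch count, the x_train/x_test generators (from the preprocessing sketch) and the saved-model file name are our assumptions:

# Adam optimizer, categorical cross-entropy loss (multi-class gestures),
# accuracy as the evaluation metric (reported, but not used for training).
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Training phase, using the generators from the preprocessing sketch.
model.fit(x_train, epochs=10, validation_data=x_test)
model.save("gesture.h5")   # assumed file name for the saved model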
Test The Model
Evaluation is a process carried out during development of the model to check whether the
model is the best fit for the given problem and the corresponding data.
Load the saved model using load_model.
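A short sketch, assuming the model was saved as gesture.h5 during training and that x_test is the test generator from the preprocessing step:

from keras.models import load_model

# Load the saved model and check its fit on the held-out test data.
model = load_model("gesture.h5")          # assumed file name from training
loss, accuracy = model.evaluate(x_test)
print(f"test loss: {loss:.4f}, test accuracy: {accuracy:.4f}")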
Plotting images:
Using the model, we predict the output for the given input image.
The predicted class index name will be printed here.
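For illustration, a sketch of this prediction step; the image file name, the saved-model name and the class ordering are assumptions:

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

model = load_model("gesture.h5")          # assumed file name from training
img = image.load_img("sample.jpg", target_size=(64, 64), color_mode="grayscale")
x = image.img_to_array(img) / 255.0
x = np.expand_dims(x, axis=0)             # shape (1, 64, 64, 1)

pred = int(np.argmax(model.predict(x), axis=1)[0])
class_names = ["0", "1", "2", "3", "4", "5"]   # assumed class ordering
print("predicted class:", class_names[pred])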
Application Building
After the model is trained, in this step we build our Flask application, which
runs in the local browser with a user interface.
● We use HTML to create the front-end part of the web page.
● Here, we created 3 HTML pages: home.html, intro.html and index6.html.
● home.html displays the home page.
● intro.html displays an introduction to hand gesture recognition.
● index6.html accepts input from the user and predicts the values.
We also use JavaScript (main.js) and CSS (main.css) to enhance the functionality and
appearance of the HTML pages.
● Build the Flask file 'app.py'; Flask is a web framework written in Python for
server-side scripting.
● The app starts running when the Flask constructor is called with "__name__" in main.
The three routes below render the home, introduction and index HTML pages.
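A rough sketch of how app.py and the three rendering routes might look; the route paths are our assumptions, while the template names come from the report:

from flask import Flask, render_template

app = Flask(__name__)                      # app constructed with __name__

@app.route("/")
def home():
    return render_template("home.html")    # home page

@app.route("/intro")
def intro():
    return render_template("intro.html")   # introduction page

@app.route("/launch")
def launch():
    return render_template("index6.html")  # page that accepts user input

if __name__ == "__main__":
    app.run(debug=True)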
The predict route is used for prediction; it contains all the code used for predicting our
results. First, inside the launch function we have the following:
Creating ROI
A region of interest (ROI) is a portion of an image that you want to filter or operate on in
some way. The toolbox supports a set of ROI objects that you can use to create ROIs of
many shapes, such as circles, ellipses, polygons, rectangles, and hand-drawn shapes. A
common use of an ROI is to create a binary mask image.
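In our OpenCV-based setting, a minimal sketch of creating a rectangular ROI on each webcam frame (the helper name and the coordinates are illustrative assumptions):

import cv2

def get_roi(frame):
    """Mark a fixed region of interest on the frame and return the crop.
    The hand gesture shown inside this rectangle is what gets classified."""
    x1, y1, x2, y2 = 100, 100, 400, 400        # assumed ROI coordinates
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)   # visual guide
    return frame[y1:y2, x1:x2]                 # cropped ROI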
Predicting our results
After placing the ROI and getting the frames from the webcam, it is time to predict
the gesture using the model we trained and stored in a variable for
further operations.
Finally, according to the result predicted by our model, we perform certain
operations such as resize, blur, rotate, etc.
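Tying the pieces together, a hedged sketch of the capture-predict-operate loop; it reuses the get_roi and apply_gesture_operation helpers and the assumed gesture.h5 model from the earlier sketches:

import cv2
import numpy as np
from keras.models import load_model

model = load_model("gesture.h5")               # assumed file name from training
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = get_roi(frame)                       # ROI helper sketched above
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    x = cv2.resize(gray, (64, 64)).reshape(1, 64, 64, 1) / 255.0
    gesture = int(np.argmax(model.predict(x))) # predicted class index 0-5
    result = apply_gesture_operation(gesture, frame)   # mapping sketched earlier
    cv2.imshow("result", result)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press q to quit
        break
cap.release()
cv2.destroyAllWindows()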
Run The Application
At last, we run our Flask application.
OUTPUT: