
A GESTURE-BASED TOOL FOR STERILE BROWSING OF RADIOLOGY IMAGES

NALAIYA THIRAN PROJECT REPORT


TEAM ID : PNT2022TMID08438

SATHIYAVANI E (814319104049)
SOUNDARYA M (814319104053)
KAUSHICA K (814319104018)
BRINDHA P (814319104004)

in partial fulfillment for the award of the degree


of
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE & ENGINEERING

DHANALAKSHMI SRINIVASAN
ENGINEERING COLLEGE (AUTONOMOUS)
PERAMBALUR-621212
TABLE OF CONTENTS
1. INTRODUCTION
   1.1 Project Overview
   1.2 Purpose
2. LITERATURE SURVEY
   2.1 Existing Problem
   2.2 References
   2.3 Problem Statement Definition
3. IDEATION & PROPOSED SOLUTION
   3.1 Empathy Map Canvas
   3.2 Ideation & Brainstorming
   3.3 Proposed Solution
   3.4 Problem Solution Fit
4. REQUIREMENT ANALYSIS
   4.1 Functional Requirements
   4.2 Non-Functional Requirements
5. PROJECT DESIGN
   5.1 Data Flow Diagrams
   5.2 Solution & Technical Architecture
   5.3 User Stories
6. PROJECT PLANNING & SCHEDULING
   6.1 Sprint Planning & Estimation
   6.2 Sprint Delivery Schedule
   6.3 Reports from JIRA
7. CODING & SOLUTIONING
   7.1 Feature 1
   7.2 Feature 2
   7.3 Feature 3
   7.4 Feature 4
8. TESTING
   8.1 Test Cases
   8.2 User Acceptance Testing
9. RESULTS
   9.1 Performance Metrics
10. ADVANTAGES & DISADVANTAGES
11. CONCLUSION
12. FUTURE SCOPE
13. APPENDIX
    Source Code
    GitHub & Project Demo Link


ABSTRACT:

The use of doctor-computer interaction devices in the operating room (OR) requires new
modalities that support medical image manipulation while allowing doctors' hands to remain
sterile, supporting their focus of attention, and providing fast response times. This paper
presents "Gestix," a vision-based hand gesture capture and recognition system that interprets
the user's gestures in real time for navigation and manipulation of images in an electronic
medical record (EMR) database. Through video capture, navigation and other gestures are
translated into commands based on their temporal trajectories. "Gestix" was tested during a
brain biopsy procedure. In the in vivo experiment, this interface prevented the surgeon's focus
shift and change of location while providing rapid, intuitive reaction and easy interaction. Data
from two usability tests provide insights and implications regarding human-computer
interaction based on nonverbal conversational modalities. In this work we refer to gestures as
a basic form of non-verbal communication made with the hands. Psychological studies have shown
that young children use gestures to communicate before they learn to talk. Manipulation, as a
form of gesticulation, is often used when people speak to each other about some object.
Naturalness of expression, unencumbered interaction, intuitiveness and high sterility are all
good reasons to replace the current interface technology (e.g., keyboard, mouse, and joystick)
with more natural interfaces.

1. INTRODUCTION:
Humans can recognize body and sign language easily. This is possible due to the
combination of vision and the synaptic interactions formed during brain development.
To replicate this skill in computers, several problems need to be solved: how to
separate objects of interest in images, and which image capture technology and
classification technique are most appropriate, among others.

In this project, gesture-based desktop automation, the model is first trained on images
of different hand gestures, such as showing numbers with the fingers (1, 2, 3, 4). The
model uses the integrated webcam to capture video frames. The gesture captured in the
video frame is compared against the trained model and identified. If the predicted
gesture is 0, a rectangle is drawn on the image; 1, the image is resized to (200, 200);
2, the image is rotated by -45°; 3, the image is blurred; 4, the image is resized to
(400, 400); 5, the image is converted to grayscale.
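
As an illustration of this mapping, the dispatch from predicted class index to image operation might look like the sketch below, written in Python with OpenCV. The function name and the exact parameters (rectangle coordinates, blur kernel size) are illustrative assumptions, not the project's exact code.

import cv2

def apply_gesture_operation(image, gesture_class):
    # Dispatch table for the gesture classes described above (illustrative).
    if gesture_class == 0:
        out = image.copy()
        cv2.rectangle(out, (50, 50), (200, 200), (0, 255, 0), 2)  # draw a rectangle
        return out
    if gesture_class == 1:
        return cv2.resize(image, (200, 200))            # resize to (200, 200)
    if gesture_class == 2:
        h, w = image.shape[:2]                          # rotate by -45 degrees
        m = cv2.getRotationMatrix2D((w / 2, h / 2), -45, 1.0)
        return cv2.warpAffine(image, m, (w, h))
    if gesture_class == 3:
        return cv2.GaussianBlur(image, (15, 15), 0)     # blur the image
    if gesture_class == 4:
        return cv2.resize(image, (400, 400))            # resize to (400, 400)
    if gesture_class == 5:
        return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    return image                                        # unknown class: no change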

1.1 Overview:
Project Objectives

● Know the fundamental concepts and techniques of the Convolutional Neural Network (CNN).
● Gain a broad understanding of image data.
● Know how to pre-process/clean the data using different data pre-processing techniques.
● Know how to build a web application using the Flask framework.

Technical Architecture:

1. Define our classification categories
2. Collect training images
3. Train the model
4. Test our model

1.2 PURPOSE:
Computer information technology is increasingly penetrating the hospital domain. A major challenge
in this process is to provide doctors with efficient, intuitive, accurate and safe means of
interaction without affecting the quality of their work. Keyboards and pointing devices, such as the
mouse, are today's principal method of human-computer interaction. However, the use of computer
keyboards and mice by doctors and nurses in intensive care units (ICUs) is a common route for
spreading infections. In this paper, we suggest the use of hand gestures as an alternative to existing
interface techniques, offering the major advantage of sterility. Even though voice control also provides
sterility, the noise level in the operating room (OR) makes it problematic. In this work we refer to
gestures as a basic form of non-verbal communication made with the hands. Psychological studies have
shown that young children use gestures to communicate before they learn to talk. Manipulation, as a
form of gesticulation, is often used when people speak to each other about some object. Naturalness of
expression, unencumbered interaction, intuitiveness and high sterility are all good reasons to replace
the current interface technology (e.g., keyboard, mouse, and joystick) with more natural interfaces. This
paper presents a video-based hand gesture capture and recognition system used to manipulate magnetic
resonance images (MRI) within a graphical user interface. A hand gesture vocabulary of commands was
selected to be natural, in the sense that each gesture is cognitively associated with the notion or
command it is meant to represent. For example, moving the hand left represents a "turn left"
command. The operation of the gesture interface was tested at the Washington Hospital Center in
Washington, DC. Two operations were observed in the hospital's neurosurgery department and insights
regarding the suitability of a hand gesture system were obtained. To our knowledge, this is the first time
that a hand gesture recognition system was successfully used in an "in vivo" neurosurgical
biopsy. A sterile human-machine interface is of supreme importance because it is the means by which
the surgeon controls medical information while avoiding contamination of the patient, the OR and the
surgeon.

2. LITERATURE SURVEY

2.1 EXISTING PROBLEM :


In the early 1990s, scientists, surgeons and other experts began to draw
together state-of-the-art technologies to develop comprehensive image-guidance systems for
surgery, such as the Stealth Station. This is a free-hand stereotactic pointing device in which
a position is converted into its corresponding location in the image space of a high-performance
computer monitor. In a setting like the OR, touch screen displays are often used, and must be
sealed to prevent the build-up of contaminants. They should also have smooth surfaces for easy
cleaning with common cleaning solutions. These requirements are often overlooked in the busy
OR environment.

Many of these deficiencies may be overcome by introducing a more natural human-computer
interaction mode into the hospital environment. The bases of human-human
communication are speech, hand and body gestures, facial expression, and eye gaze. Some of
these concepts have been exploited in systems for improving medical procedures. With a "face
mouse", a surgeon can control the motion of the laparoscope by simply making the appropriate
face gesture, without hand or foot switches or voice input. Current research incorporating hand
gestures into doctor-computer interfaces appeared in Graetzel et al., who developed a computer
vision system that enables surgeons to perform standard mouse functions (pointer movement
and button presses) with hand gestures. Another aspect of gestures is their capability to aid
handicapped people by offering a natural alternative form of interface and serving as a
diagnostic tool. Wheelchairs, as mobility aids, have been enhanced into robotic vehicles able
to recognize the user's commands through hand gestures.

2.2 REFERENCES:

1. "Developing a Touchless User Interface for Intraoperative Image Control during Interventional
   Radiology Procedures" (2013). Justin H. Tan, MD; Cherng Chao, MD, JD; Mazen Zawaideh, BS;
   Anne C. Roberts, MD; Thomas B. Kinney, MD.
   Techniques: Leap Motion Controller.
   Findings: The authors proposed a simple and accurate implementation of a machine learning
   method for hand gesture recognition tasks. They evaluated and compared multiple classification
   methods, finally choosing the best recognition model to develop a touchless real-time graphical
   user interface for medical image manipulation based on this hand recognition approach.

2. "Gestures for Picture Archiving and Communication Systems (PACS) operation in the operating
   room: Is there any standard?" (2018). Naveen Madapana, Glebys Gonzalez, Richard Rodgers,
   Lingsong Zhang, Juan P. Wachs.
   Techniques: (1) command extraction, (2) unconstrained gesture elicitation, (3) agreement
   analysis; Synapse software was used for browsing radiology images.
   Findings: In recent years there has been a major spur of hand-gesture interfaces for controlling
   Electronic Medical Records in the operating room, yet it is still not clear which gestures
   should be used to operate these interfaces. This work addresses the challenge of determining
   the best gestures to control a PACS system in the OR, based uniquely on agreement among
   surgeons.

3. "Touchless computer interfaces in hospitals: A review" (2019). Seán Cronin, GLANTA Ltd,
   Ireland; Gavin Doherty, Trinity College Dublin, Ireland.
   Techniques: eye gaze technology (EGT); capacitive floor sensors and inertial orientation
   sensors; colour cameras such as the Canon VC-C4; the Loop Pointer and MESA SR-31000 ToF
   cameras; the Siemens integrated OR system; a wireless hands-free surgical pointer; the Apple
   iPad; Leap Motion controllers; and the Microsoft Kinect ToF camera.
   Findings: A variety of outcomes are studied in the literature, with accuracy of gesture
   recognition being the most frequently reported outcome. A number of factors should be
   considered when evaluating a system: sensitivity and recall of gestures, precision and positive
   predictive value, F-measure, likelihood ratio and recognition accuracy should all be rigorously
   evaluated using standard, public data sets.

4. "Hand-gesture-based Touchless Exploration of Medical Images with Leap Motion Controller"
   (2020). Safa Ameur, Anouar Ben Khalifa, Med Salim Bouhlel.
   Techniques: Leap Motion Controller (LMC).
   Findings: The authors develop a touchless graphical user interface based on the LMC, which
   offers the surgeon a new way to command DICOM (Digital Imaging and Communications in Medicine)
   images. The framework relies essentially on a strong recognition approach that extracts
   statistical features, such as the mean and the standard deviation, from the LMC raw data; the
   system is then trained on a public dataset composed of 11 gestures dedicated to commanding
   DICOM images.

5. "Gesture-controlled image system positioning for minimally invasive interventions" (2021).
   Benjamin Fritsch, Thomas Hoffmann, André Mewes, Georg Rose.
   Techniques: (1) stereo infrared optical tracking system, (2) Qt application framework
   (Qt Group, Helsinki, Finland).
   Findings: The main purpose of the software is to control the position of the X-ray tube.
   Especially when using gantry CT systems, it is not possible to see the real-time angle of the
   X-ray tube because of the CT housing. For this, a prototypical GUI was developed to visualize
   the real-time position and provide gesture interaction capabilities.
2.3 PROBLEM STATEMENT DEFINITION:

3. IDEATION & PROPOSED SOLUTION

3.1 Empathy Map Canvas

3.2 IDEATION & BRAINSTORMING

3.3 PROPOSED SOLUTION


The solution to the problem statement is to access, manipulate and navigate images without
physical contact with the computer, using a gesture-based browsing tool built on a CNN
(Convolutional Neural Network), a deep learning algorithm. Using CNN and dense layers, the data
sets are trained and tested. The data sets here are images of hand gestures and positions, which
are used to access and manipulate images without physical contact with the computer. The
solution is delivered as a web application built with Python and Flask. The user can upload
images and work with them using hand gestures, with the help of OpenCV, which allows image
processing and computer vision tasks to be performed. The solution flow is as follows: once the
web cam captures the hand gesture, the image is processed with the help of OpenCV and matched
against the test and train data sets; according to the match, the system starts to perform the
particular task assigned to that gesture.
Uniqueness / Novelty
The novelty and uniqueness of the project is that it offers a better and simpler method of
working with images, and it is a forward-looking technology for manipulating images in
companies, hospitals and similar settings. It reflects a good understanding of the user's
problem and uses strong techniques and neural networks to process the images. It provides a
simple and attractive user interface for customers working in the web application. Because of
the advanced techniques used, the performance of the project is strong, and the image processing
and identification in this solution are accurate.

Social Impact / Customer Satisfaction
It takes on social responsibility and delivers customer satisfaction by fulfilling customer
needs in various fields such as hospitals, schools and image-related work. It helps
professionals control images without physical contact with the computer.
Financial Benefits
The software is cost-efficient to deploy for the health care department as well as in hospitals,
and can be used in collaboration with the government for health awareness camps.
Scalability
Better execution in terms of accurate results, sensitivity, system architecture design,
transparency and flexibility of the software.

3.4 PROBLEM SOLUTION FIT

1. Who is your customer?
The customers for this product are doctors, medical lab technicians and other people who work
with radiology images in the hospital sector and need sterile browsing of radiology images.

2. Explore limitations to use your product
Limitations for customers using the product: spending power, a network connection, an available
device, and a web cam to capture the user's gestures.

3. Available solutions for the customer problem
By using gestures, the user can browse images in an effective way. The tool uses hand gestures
to navigate and manipulate the images, and the user can even upload hand gesture images to
browse the radiology images, all without making physical contact with the computer.

4. Frequent problem to solve for the customer
Making physical contact with the computer to navigate or manipulate images is the frequent
problem. This gesture tool solves it: hand gestures make image navigation and manipulation
simple and easy.

5. Understand the root cause of the problem
The root cause of the problem is that users find it difficult to manipulate, navigate and work
with images through physical interaction with the computer. They were expecting a new
technology to make image manipulation easier.

6. Behaviour
The user behaviour behind this problem is facing a number of steps and processes to work with
radiology images when manipulating data, and this behaviour often repeats.

7. Triggers to act
The physical contact and repeated processes involved in image work trigger the user to adopt
our product, and the design and user interface of the product encourage its use.

8. Emotions of user
Before using our product, the user finds it somewhat difficult to work with the images. After
using the gesture-based tool for image processing, the user finds it easy to manipulate the
images.

9. Be where your customers are
The customer is online, accessing our product, which uses a CNN (Convolutional Neural Network)
with test data and train data to manipulate the images.

10. Solution
This project offers a solution for these major problems faced by users. The hand gesture data
sets are trained with a CNN (Convolutional Neural Network), a deep learning algorithm. With
this, hand gestures at different angles are identified accurately and the process for each
particular hand gesture position is executed.

4. REQUIREMENT ANALYSIS
4.1 Functional requirement

A functional requirement defines what a product must do and what its features and functions
are. Functional requirements are product features or functions that developers must implement
to enable users to accomplish their tasks. Generally, they describe system behaviour under
specific conditions.

Following are the functional requirements of the proposed solution.

FR-1 Launching the model and algorithms: When the system starts, it launches the trained CNN
algorithm and models from the cloud.
FR-2 Capturing the images: Using the webcam, images are captured and uploaded to the system.
FR-3 Identifying the gestures: After the images are uploaded, the gesture is identified by the
system.
FR-4 Model rendering: After capturing the image, the algorithm starts its processing task.
FR-5 Sterile browsing: Sterile browsing can be performed after the hand gestures are identified.
FR-6 Visibility of images: After all these processes are complete, the user can see the images
and work with them (zoom, blur, rotate, and change the pixels).

4.2 Non-Functional requirements


Non-functional requirements are not related to the system's functionality; rather, they define
how the system should perform. Here we briefly describe the most typical non-functional
requirements.
Following are the non-functional requirements of the proposed solution.

NFR-1 Usability: The user interface is simple, easy to understand and easy to access; users can
control the images without making physical contact with the system.
NFR-2 Security: The system is protected and only authorized users can access it.
NFR-3 Reliability: The system holds many model samples for each hand gesture at different
angles, which greatly reduces the chance of system failure.
NFR-4 Performance: The system is fast; it responds to the user in a fraction of a second and
processing runs quickly on the other end.
NFR-5 Availability: The system can be accessed by authorized users from anywhere, at any time,
without delay, and will be available in any situation.
NFR-6 Scalability: The system can serve and manage many users at a time with no identifiable
loss.

5. PROJECT DESIGN
5.1 Data Flow Diagrams
A Data Flow Diagram (DFD) is a traditional visual representation of the information
flows within a system. A neat and clear DFD can depict the right amount of the system
requirement graphically. It shows how data enters and leaves the system, what changes the
information, and where data is stored.
Level-0 Diagram

Level-1 Diagram

5.2 Solution and Technical Architecture:

5.3 User Stories:


User Type: Customer (Mobile user)
Functional Requirement (Epic): Registration
  USN-1: As a user, I can upload an image for performing the action.
         Acceptance criteria: I can upload an image. Priority: High. Release: Sprint-3.
  USN-2: As a user, I can show my hand sign in front of the camera.
         Acceptance criteria: I can show a hand sign. Priority: High. Release: Sprint-1, Sprint-2.
  USN-3: As a user, I will be sent the result for the uploaded image based on my hand sign.
         Acceptance criteria: I can get the result. Priority: High. Release: Sprint-2.

User Type: Customer (Web user)
  Same as the mobile user.

6. PROJECT PLANNING & SCHEDULING

6.1 Sprint Planning & Estimation:

Project Tracker, Velocity & Burndown Chart:


Sprint   | Total Story Points | Duration | Sprint Start Date | Sprint End Date (Planned) | Story Points Completed (On Planned End Date) | Sprint Release Date (Actual)
Sprint-1 | 10 | 6 days | 24 Oct 2022 | 29 Oct 2022 | 10 | 29 Oct 2022
Sprint-2 | 10 | 6 days | 31 Oct 2022 | 05 Nov 2022 | 10 | 05 Nov 2022
Sprint-3 | 10 | 6 days | 07 Nov 2022 | 12 Nov 2022 | 10 | 12 Nov 2022
Sprint-4 | 10 | 6 days | 14 Nov 2022 | 19 Nov 2022 | 10 | 19 Nov 2022

Velocity:
Imagine we have a 10-day sprint duration and the team's velocity is 20 (points per sprint).
Let's calculate the team's average velocity (AV) per iteration unit (story points per day).
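Working this out: AV = velocity / sprint duration = 20 / 10 = 2 story points per day. Applying the same formula to this project's sprints (10 story points over a 6-day sprint) gives 10 / 6 ≈ 1.67 story points per day.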

6.2 Sprint Delivery Schedule:

S. No | Milestone | Activities | Team Members
01 | Data Collection | Download the Dataset | Brindha P
02 | Data Collection | Image Pre-processing | Soundarya M
03 | Data Collection | Import the Image Data Generator Library | Kaushica K
04 | Data Collection | Configure Image Data Generator Class | Sathiyavani E
05 | Data Collection | Apply Image Data Generator Functionality to Train Set and Test Set | Sathiyavani E, Soundarya M
06 | Model Building | Import the Model Building Libraries | Kaushica K, Brindha P
07 | Model Building | Initializing the Model | Kaushica K, Brindha P
08 | Model Building | Adding CNN Layers | Sathiyavani E, Soundarya M
09 | Model Building | Adding Dense Layers | Sathiyavani E, Soundarya M
10 | Model Building | Configure the Learning Process | Sathiyavani E
11 | Model Building | Train the Model | Sathiyavani E, Soundarya M
12 | Model Building | Save the Model | Sathiyavani E, Soundarya M
13 | Model Building | Test the Model | Kaushica K, Brindha P
14 | Application Building | Create HTML Pages | Soundarya M
15 | Application Building | Build Python Code | Sathiyavani E
16 | Application Building | Run the Application | Kaushica K
17 | Train the Model on IBM | Register for IBM Cloud | Sathiyavani E, Soundarya M, Kaushica K, Brindha P
18 | Train the Model on IBM | Train Model on IBM | Sathiyavani E
6.3 Reports from JIRA:

Burndown Chart:

Road map:

7. CODING & SOLUTIONING

7.1 Feature 1:

7.2 Feature 2:

7.3 Feature 3:

7.4 Feature 4:

8. TESTING
8.1 Test Cases:

A test case has components that describe the input, action and expected response,
in order to determine whether a feature of an application is working correctly.
A test case is a set of instructions on how to validate a particular test
objective/target which, when followed, will tell us whether the expected behaviour of
the system is satisfied.

Characteristics of a good test case:

● Accurate: exacts the purpose.
● Economical: no unnecessary steps or words.
● Traceable: capable of being traced to requirements.
● Repeatable: can be used to perform the test over and over.
● Reusable: can be reused if necessary.

8.2 User Acceptance Testing :

This sort of testing is carried out by users, clients, or other authorised bodies to
identify the requirements and operational procedures of an application or piece of
software. Acceptance testing is the most crucial stage of testing, since it determines
whether or not the customer will accept the application. It can cover the application's
UI, performance, usability, and usefulness. It is also referred to as end-user testing,
operational acceptance testing, and user acceptance testing (UAT).

9. RESULTS
9.1 Performance Metrics:

[Figure: bar chart of performance metrics per gesture request type (accuracy level, dashboard, plasma), with values on a scale of roughly 8.4 to 10.2.]

10. ADVANTAGES AND DISADVANTAGES:
ADVANTAGES:
● It is a user-friendly application.
● It helps people work with gestures easily.
● It has a simple user interface.
● It alleviates the burden on the coordinator by making users and resources easy to manage.
● Compared to other web applications, it incorporates provisions for numerous gesture levels.
● Good customer satisfaction.
● The performance is better in terms of quality and time.
● It makes better use of the database, which stores user and product history.
● Quality prediction, scalability, prediction and speed are the main advantages of the
  proposed scheme.

DISADVANTAGES:
● It cannot automatically verify user genuineness.
● It requires an active internet connection.

11. CONCLUSION

A hand gesture system for MRI manipulation in an EMR image database, called
"Gestix," was tested during a brain biopsy surgery. The system is a real-time hand-tracking
recognition technique based on colour and motion fusion. In an in vivo experiment, this type of
interface prevented the surgeon's focus shift and change of location while achieving rapid,
intuitive interaction with an EMR image database. In addition to allowing sterile interaction
with EMRs, the "Gestix" hand gesture interface provides: (i) ease of use - the system allows the
surgeon to use his/her hands, their natural work tool; (ii) rapid reaction - nonverbal
instructions by hand gesture commands are intuitive and fast (in practice, the "Gestix" system
can process images and track hands at a frame rate of 150 Hz, responding to the surgeon's
gesture commands in real time); (iii) an unencumbered interface - the proposed system does not
require the surgeon to attach a microphone, use head-mounted (body-contact) sensing devices or
use foot pedals; and (iv) distance control - the hand gestures can be performed up to 5 meters
from the camera and still be recognized accurately. The results of two usability tests
(contextual and individual interviews) and a satisfaction questionnaire indicated that the
"Gestix" system provides a versatile method that can be used in the OR to manipulate medical
images in real time and in a sterile manner.

12. FUTURE SCOPE

We are now considering the addition of a body posture recognition system to increase the
functionality of the system, as well as visual tracking of both hands to provide a richer set of
gesture commands. For example, pinching the corners of a virtual image with both hands and
stretching the arms would represent an image zoom-in action. In addition, we wish to assess
whether a stereo camera will increase the gesture recognition accuracy of the system. A more
exhaustive comparative experiment between our system and other human–machine interfaces,
such as voice, is also left for future work.

13. APPENDIX

SOURCE CODE:
Data Collection
ML depends heavily on data; without data, it is impossible for a machine to learn. Data is
the most crucial aspect that makes algorithm training possible. In machine learning
projects we need a training data set: the actual data set used to train the model to
perform various actions.

Image Preprocessing
In this step we improve the image data by suppressing unwanted distortions and enhancing
image features that are important for further processing, and we perform some geometric
transformations of the images, such as rotation, scaling and translation.
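
A minimal sketch of this step using Keras' ImageDataGenerator is shown below; the directory names, target size and augmentation parameters are illustrative assumptions, not the project's exact configuration.

from keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values and apply light geometric augmentation to the train set
train_datagen = ImageDataGenerator(rescale=1.0 / 255, rotation_range=20,
                                   zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

# 'data/train' and 'data/test' are placeholder folder names
x_train = train_datagen.flow_from_directory('data/train', target_size=(64, 64),
                                            color_mode='grayscale', batch_size=32,
                                            class_mode='categorical')
x_test = test_datagen.flow_from_directory('data/test', target_size=(64, 64),
                                          color_mode='grayscale', batch_size=32,
                                          class_mode='categorical')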

Model Building
In this step we build the Convolutional Neural Network, which contains an input layer along
with convolution and max-pooling layers, and finally an output layer.
Adding CNN Layers

Adding Dense Layers


A dense layer is a deeply connected neural network layer; it is the most common and
frequently used layer type.

Understanding the model is a very important phase in using it properly for training and
prediction. Keras provides a simple method, summary(), to get full information about the
model and its layers.
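
Putting the layers described above together, the model might be built as in the following sketch; the layer counts, filter sizes and input shape are assumptions consistent with the description, and the final dense layer has six outputs for the gesture classes 0-5.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
# Convolution and max-pooling over 64x64 grayscale input frames
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature maps and classify with dense layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(6, activation='softmax'))  # six gesture classes (0-5)

# Print the per-layer description of the model
model.summary()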

Configure The Learning Process

● Compilation is the final step in creating a model. Once compilation is done, we can
  move on to the training phase. A loss function is used to find the error or deviation
  in the learning process; Keras requires a loss function during the model compilation
  process.
● Optimization is an important process which optimizes the input weights by comparing
  the prediction against the loss function. Here we are using the Adam optimizer.
● Metrics are used to evaluate the performance of the model. A metric is similar to a
  loss function, but it is not used in the training process.
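
A sketch of this compilation step follows; the categorical cross-entropy loss and accuracy metric are assumed choices consistent with a multi-class classifier, while the Adam optimizer is the one named above.

model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # error measure for learning
              metrics=['accuracy'])             # reported, but not trained on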

Train The Model

Train the model with our image dataset. The fit_generator function is used to train a
deep learning neural network.
Arguments:

● steps_per_epoch: specifies the total number of steps taken from the generator once one
  epoch is finished and the next epoch has started. We can calculate its value as the
  total number of samples in the dataset divided by the batch size.
● epochs: an integer, the number of epochs we want to train our model for.
● validation_data: can be either (1) an inputs-and-targets list, (2) a generator, or
  (3) an inputs, targets, and sample_weights list, which is used to evaluate the loss
  and metrics for the model after each epoch has ended.
● validation_steps: used only if validation_data is a generator. It specifies the total
  number of steps taken from the generator before it is stopped at every epoch, and its
  value is calculated as the total number of validation data points in the dataset
  divided by the validation batch size.
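
A sketch of the training call with these arguments; the generator names, epoch count and saved-model file name are placeholders (newer Keras versions fold fit_generator into model.fit).

# steps_per_epoch = training batches per epoch; validation_steps likewise
model.fit_generator(x_train,
                    steps_per_epoch=len(x_train),
                    epochs=25,
                    validation_data=x_test,
                    validation_steps=len(x_test))

# Save the trained model for later use by the Flask application
model.save('gesture.h5')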

Test The Model
Evaluation is a process carried out during model development to check whether the model is
the best fit for the given problem and the corresponding data.
Load the saved model using load_model.

Plotting images:

Taking an image as input and checking the results

By using the model we are predicting the output for the given input image

The predicted class index name will be printed here.
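
A sketch of this step, assuming the model was saved as gesture.h5; the file names, input size and class labels are illustrative.

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

model = load_model('gesture.h5')  # assumed file name

# Load a test image with the same size and colour mode used in training
img = image.load_img('sample.jpg', target_size=(64, 64), color_mode='grayscale')
x = np.expand_dims(image.img_to_array(img), axis=0)  # shape (1, 64, 64, 1)

pred = model.predict(x)
class_index = int(np.argmax(pred))       # index of the most probable class
labels = ['0', '1', '2', '3', '4', '5']  # illustrative class names
print('Predicted gesture:', labels[class_index])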

Application Building
After the model is trained, in this step we build our Flask application, which will run
in our local browser with a user interface.

Create HTML Pages

● We use HTML to create the front-end part of the web page.
● Here we created three HTML pages: home.html, intro.html and index6.html.
● home.html displays the home page.
● intro.html displays an introduction to hand gesture recognition.
● index6.html accepts input from the user and predicts the values.
We also use JavaScript (main.js) and CSS (main.css) to enhance the functionality and
appearance of the HTML pages.

Build Python Code

● Build the flask file 'app.py'; Flask is a web framework written in Python for
  server-side scripting.
● The app starts running when the script is executed directly, i.e., when the
  __name__ == "__main__" condition is satisfied.
● render_template is used to return an HTML file.
● The "GET" method is used to request a page from the server.
● The "POST" method is used to send the user's input (such as the uploaded image) to
  the server.
Importing Libraries

● Creating our flask application and loading our model

Routing to the html Page

The three routes above are used to render the home, introduction and index HTML pages.
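
A minimal sketch of these routes; the template names come from the pages listed above, while the route paths and the model file name are assumptions.

from flask import Flask, render_template
from keras.models import load_model

app = Flask(__name__)
model = load_model('gesture.h5')  # assumed file name

@app.route('/')
def home():
    return render_template('home.html')    # home page

@app.route('/intro')
def intro():
    return render_template('intro.html')   # introduction page

@app.route('/index6')
def index():
    return render_template('index6.html')  # page that accepts user input

if __name__ == '__main__':
    app.run(debug=True)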

The predict route is used for prediction; it contains all the code used for predicting
our results. Inside the launch function we do the following:

● Get our input and store it.
● Grab the frames from the web cam.
● Create the ROI.
● Predict our results.
● Showcase the results with the help of OpenCV.
● Finally, run the application.

Getting our input and storing it

Once the predict route is called, we check whether the request method is POST; if it is,
we request the image files and, with the help of the os module, store the image in the
uploads folder on our local system.

Grab the frames from the web cam

When we run the code, a web cam opens to take the gesture input, so we capture frames of
the gesture for predicting our results.
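
A sketch of the frame-grabbing loop with OpenCV; the window name and quit key are illustrative.

import cv2

cap = cv2.VideoCapture(0)       # open the default (integrated) webcam
while True:
    ret, frame = cap.read()     # grab one frame per iteration
    if not ret:
        break
    frame = cv2.flip(frame, 1)  # mirror the frame for natural interaction
    cv2.imshow('Gesture input', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to stop capturing
        break
cap.release()
cv2.destroyAllWindows()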

Creating the ROI
A region of interest (ROI) is a portion of an image that you want to filter or operate on
in some way. The toolbox supports a set of ROI objects that you can use to create ROIs of
many shapes, such as circles, ellipses, polygons, rectangles, and hand-drawn shapes. A
common use of an ROI is to create a binary mask image.
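
For example, a rectangular ROI can be cut from each frame by simple array slicing, as in this sketch; the coordinates and target size are illustrative.

import cv2

def extract_roi(frame):
    # Mark the region of interest on the frame so the user can see it
    x1, y1, x2, y2 = 100, 100, 300, 300  # illustrative coordinates
    cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
    roi = frame[y1:y2, x1:x2]
    # Convert to grayscale and resize to the model's input size
    roi = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    return cv2.resize(roi, (64, 64))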

Predicting our results
After placing the ROI and getting the frames from the web cam, it is time to predict the
gesture using the model we trained; the result is stored in a variable for further
operations.

Finally, according to the result predicted by our model, we perform certain operations
such as resize, blur and rotate.

Run The Application
At last, we run our Flask application.

Run the app in a local browser:

● Open the Anaconda prompt from the start menu.
● Navigate to the folder containing your Python script.
● Type the command "python app.py".

The application will then run on localhost:5000. Navigate to http://127.0.0.1:5000/ in
your browser to view the web page.

OUTPUT:


GITHUB LINK: https://github.com/IBM-EPBL/IBM-Project-49-1658204178


DEMO LINK: https://drive.google.com/file/d/1hxPDomCqJYelHjEgghCgtAMjuVzwqGFI/view?usp=share_link

