PROJECT SUMMARY
ABSTRACT
Hand gesture recognition systems have received great attention in recent years because of their
manifold applications and their ability to support efficient human-computer interaction. In this
paper a survey of recent hand gesture recognition systems is presented. Key issues of hand
gesture recognition, together with the challenges facing gesture systems, are outlined. Methods
used in recent posture and gesture recognition systems are reviewed as well. A summary of
research results on hand gesture methods and databases, and a comparison between the main
gesture recognition phases, are also given. Finally, the advantages and drawbacks of the
discussed systems are explained.
KEYWORDS
Hand Posture, Hand Gesture, Human Computer Interaction (HCI), Segmentation, Feature
Extraction, Classification Tools, Neural Networks.
INTRODUCTION
The essential aim of building a hand gesture recognition system is to create a natural interaction
between human and computer, where the recognized gestures can be used for controlling a robot
or conveying meaningful information. How to make the produced hand gestures understood and
well interpreted by the computer is the core problem of gesture interaction. Human-computer
interaction (HCI), also called Man-Machine Interaction (MMI), refers to the relation between the
human and the computer, or more precisely the machine, since the machine is of little value
without suitable use by the human. Two main characteristics should be considered when
designing an HCI system: functionality and usability. System functionality refers to the set of
functions or services that the system provides to its users, while system usability refers to the
level and scope at which the system can operate and perform specific user tasks efficiently. A
system that attains a suitable balance between these two concepts is considered a powerful,
high-performing system. Gestures are used for communication between humans and machines
as well as between people using sign language.
Gestures can be static (a posture or certain pose), which require less computational complexity,
or dynamic (a sequence of postures), which are more complex but better suited to real-time
environments. Different methods have been proposed for acquiring the information a gesture
recognition system needs. Some methods use additional hardware devices such as data gloves
and colour markers to extract a comprehensive description of gesture features easily. Other
methods are based on the appearance of the hand, using skin colour to segment the hand and
extract the necessary features; these methods are considered easier, more natural, and less costly
than the hardware-based methods mentioned before. Some recent reviews have explained
gesture recognition applications and their growing importance in everyday life, especially for
human-computer interaction (HCI), robot control, games, and surveillance, using different tools
and algorithms.
This work demonstrates the advancement of gesture recognition systems, discussing the
different stages required to build a complete, low-error system using different algorithms. The
paper is organized as follows: the next section explains the key issues of a hand gesture
recognition system, namely segmentation, feature extraction, and recognition; applications of
gesture recognition systems and gesture challenges are then discussed alongside a literature
review of recent hand gesture recognition systems.
SEGMENTATION
Segmentation is the first step in recognizing hand gestures. It is the process of dividing the input
image (in this case, the hand gesture image) into regions separated by boundaries. The
segmentation process depends on the type of gesture: for a dynamic gesture the hand has to be
located and tracked, while for a static gesture (posture) the input image only has to be
segmented. The hand is located first, generally by a bounding box specified using skin colour,
and then tracked. For tracking there are two main approaches: either the video is divided into
frames and each frame is processed alone, in which case each frame is treated as a posture and
segmented, or tracking information such as shape and skin colour is used together with tools
such as a Kalman filter.
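As a hedged illustration of the second tracking approach, the sketch below applies OpenCV's Kalman filter to smooth a sequence of hand centroids obtained from segmentation; the constant-velocity state model and the noise values are illustrative assumptions, not parameters taken from any surveyed system.

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter over the hand centroid: state = [x, y, dx, dy].
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2      # assumed process noise
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1  # assumed measurement noise

def smooth_track(measured_centroids):
    """Feed per-frame hand centroids (from segmentation) and return smoothed positions."""
    smoothed = []
    for cx, cy in measured_centroids:
        kf.predict()
        est = kf.correct(np.array([[cx], [cy]], dtype=np.float32))
        smoothed.append((float(est[0, 0]), float(est[1, 0])))
    return smoothed

# usage (hypothetical): smooth_track([(120, 80), (124, 83), (129, 86)])
```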
The most common and useful cue for segmenting the hand is skin colour, since it is easy to use
and invariant to scale, translation, and rotation changes. Different tools and methods use skin
and non-skin pixels to model the hand. These methods are parametric and non-parametric
techniques: the Gaussian Model (GM) and Gaussian Mixture Model (GMM) are parametric
techniques, while histogram-based techniques are non-parametric. However, skin colour is
affected by changes in illumination conditions and by different skin tones. Some researchers
overcome this problem using data gloves and coloured markers, which provide exact
information about the orientation and position of the palm and fingers. Others used infrared
cameras, or range information generated by a special Time-of-Flight (ToF) camera; although
these systems can detect different skin colours under cluttered backgrounds, they are affected by
changes in temperature and are expensive. Segmentation is considered an open problem in itself.
The colour space used in a specific application plays an essential role in the success of the
segmentation process; however, colour spaces are sensitive to lighting changes, so researchers
tend to use only the chrominance components and neglect the luminance components, as in the
r-g and HS colour spaces. Factors that obstruct the segmentation process include complex
backgrounds, illumination changes, and low video quality. Some systems applied the HSV
colour model, which concentrates on the pigments of the pixel, [14] used the YCbCr colour
space, and others used the normalized r-g colour space. Preprocessing operations such as
subtraction, edge detection, and normalization are applied to enhance the segmented hand
image. Figure 2 shows some segmentation method examples.
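As a hedged sketch of skin-colour segmentation in a chrominance-oriented colour space (the Cr/Cb bounds below are common illustrative values, not thresholds from any surveyed system):

```python
import cv2
import numpy as np

def segment_hand_ycbcr(bgr_frame):
    """Rough skin segmentation by thresholding the Cr/Cb chrominance channels."""
    ycrcb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb)
    # Illustrative skin range; real systems tune or learn these bounds.
    lower = np.array([0, 133, 77], dtype=np.uint8)    # Y, Cr, Cb
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Morphological clean-up to remove small noise blobs and fill gaps.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    return cv2.bitwise_and(bgr_frame, bgr_frame, mask=mask), mask
```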
PROBLEM STATEMENT
Hand Gesture Recognition (HGR) is the process of identifying and interpreting hand gestures in
real time or from pre-recorded video data. It is an active research area that has gained significant
attention due to its wide range of applications in areas such as virtual reality, gaming, sign
language recognition, and human-computer interaction. In this problem statement, we focus on
dynamic hand gestures, which involve motion over time and can be particularly challenging to
recognize due to the complexity and variability of hand movements.
Objective: The objective of this problem is to develop an accurate and robust hand gesture
recognition system for dynamic applications. The system should be able to recognize a set of
predefined hand gestures from video data in real time or from pre-recorded videos.
Data: The data used for this problem can be video recordings of hand gestures performed by
human subjects. The videos can be recorded using a variety of sensors, such as RGB cameras,
depth cameras, or time-of-flight cameras. The data should be labelled with the corresponding
hand gesture for each video.
Evaluation Metrics: The performance of the hand gesture recognition system can be evaluated
using metrics such as accuracy, precision, recall, and F1 score. The system should be able to
achieve high accuracy, precision, recall, and F1 score for the predefined hand gestures.
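A brief illustration of computing these metrics with scikit-learn, using hypothetical gesture labels:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true / y_pred are hypothetical per-clip gesture labels from a held-out test set.
y_true = ["swipe_left", "swipe_right", "fist", "swipe_left", "open_palm"]
y_pred = ["swipe_left", "swipe_right", "fist", "open_palm", "open_palm"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```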
Challenges: Some of the challenges in hand gesture recognition for dynamic applications
include:
• Variability in hand movements between different subjects
• Occlusion of the hand by other body parts or objects in the environment
• Changes in lighting conditions
• Variability in hand shape, size, and colour between different subjects
• Real-time processing requirements
Approach: A possible approach to hand gesture recognition for dynamic applications can
include the following steps (a minimal end-to-end sketch in code follows the list):
• Preprocessing: The video data may need to be pre-processed to remove noise, correct
for lighting conditions, and normalize hand appearance.
• Hand Detection: The system should be able to detect the hand in each video frame. This
can be done using object detection algorithms or specialized hand detection algorithms.
• Feature Extraction: The system should extract relevant features from the hand
movement data. These features can include hand shape, motion, and orientation.
• Classification: The system should classify the hand gestures based on the extracted
features. This can be done using machine learning algorithms such as support vector
machines, random forests, or neural networks.
• Postprocessing: The system may need to perform postprocessing to handle cases where
the hand gesture is not clearly identified in a single video frame.
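The sketch below ties these steps together under simplifying assumptions: the hand is isolated with a skin-colour mask (such as the earlier segmentation sketch), features are simple shape descriptors (area ratio, solidity, Hu moments), and classification uses a scikit-learn SVM. The function names and feature choices are illustrative, not a fixed design.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def extract_features(mask):
    """Shape features from a binary hand mask: area ratio, solidity, Hu moments."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return np.zeros(9, dtype=np.float32)
    hand = max(contours, key=cv2.contourArea)          # assume the largest blob is the hand
    area = cv2.contourArea(hand)
    hull_area = cv2.contourArea(cv2.convexHull(hand))
    solidity = area / hull_area if hull_area > 0 else 0.0
    hu = cv2.HuMoments(cv2.moments(hand)).flatten()    # 7 rotation-invariant shape moments
    return np.concatenate([[area / mask.size, solidity], hu]).astype(np.float32)

def classify_clip(frames, clf, segment_fn):
    """Average per-frame features over a clip, then classify with a trained SVM."""
    feats = [extract_features(segment_fn(f)[1]) for f in frames]  # segment_fn returns (img, mask)
    return clf.predict(np.mean(feats, axis=0, keepdims=True))[0]

# Training on hypothetical labelled clips:
# X = np.stack([...per-clip averaged feature vectors...]); y = [...gesture labels...]
# clf = SVC(kernel="rbf").fit(X, y)
```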
Expected Outcome: The expected outcome of this problem is a hand gesture recognition
system that can accurately and robustly recognize dynamic hand gestures in real-time or from
pre-recorded videos. The system should be able to handle variability in hand movements,
occlusion, changes in lighting conditions, and other challenges. The system should achieve
high accuracy, precision, recall, and F1 score for the predefined hand gestures.
PURPOSE
The purpose of Hand Gesture Recognition (HGR) for dynamic applications is multifaceted and
impactful. Let’s explore some key reasons why HGR plays a crucial role:
Natural Interaction: HGR aims to create a natural interaction between humans and computers or other
devices. By recognizing hand gestures, users can interact with machines in more intuitive and
human-like ways. Imagine controlling a robot or conveying meaningful information using
simple hand movements.
Human-Computer Interaction (HCI): HGR is a fundamental feature of HCI.
It enables users to communicate with technology without relying solely on traditional input
methods like keyboards or touchscreens. Dynamic hand gestures provide an alternative and
expressive means of input.
Real-Time Interaction: Real-time HGR allows for immediate responses. Users can perform
gestures, and the system responds promptly, enhancing the overall user experience.
Applications such as virtual reality, gaming, and smart home control benefit from real-time
interaction.
Applications of HGR:
Dynamic applications that utilize HGR include:
• Virtual Reality (VR) and Augmented Reality (AR): Users can manipulate virtual
objects, navigate interfaces, and interact with immersive environments using hand
gestures.
• Gaming: Gesture-based gaming enhances gameplay by introducing physical
movements.
• Healthcare: Surgeons can control medical equipment during surgeries without touching
physical interfaces.
• Smart Homes: Adjusting lights, temperature, or music with gestures.
• Sign Language Translation: HGR can aid in real-time sign language interpretation.
• Robotics: Robots can respond to human gestures for collaboration or assistance.
Challenges and Advances:
Challenges: Accurate recognition of dynamic gestures in various contexts, robustness to
lighting conditions, occlusions, and noise.
Advances: Deep learning techniques, such as 3D DenseNet and LSTM, improve performance
and real-time response speed.
In summary, HGR bridges the gap between humans and technology, enabling seamless and
expressive interactions across a wide range of dynamic applications.
BACKGROUND
The background of Hand Gesture Recognition for Dynamic Applications lies in the importance
of hand gestures as a means of human-computer interaction (HCI). Hand gestures are a natural
way of communicating and can be used to convey a wide range of information. There are two
types of hand gestures: static hand signs and dynamic hand gestures. Static hand signs are hand
poses without any movement, while dynamic hand gestures are defined as a sequence of hand
poses. Hand gesture recognition systems can be categorized into two approaches: video-based
and 3D image-based. Video-based systems use conventional cameras, making them easily
implementable on widely available platforms. On the other hand, 3D image-based systems
require special devices such as Microsoft Kinect or Leap Motion, which can improve
performance. For practical use, hand gesture recognition systems should work in real-time and
have the ability to segment meaningful portions from a continuous data stream, known as
gesture spotting. Gesture spotting is essential to detect the start and end of a gesture naturally
in a continuous sequence of hand motion. Previous research has focused on developing hand
sign recognition systems, which recognize static hand signs. However, there is a need for
dynamic hand gesture recognition systems that can recognize dynamic hand gestures in real-
time. This is where the proposed system comes in, which uses a Self-Organizing Map (SOM)
and Hebbian learning network to recognize dynamic hand gestures with gesture spotting.
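The SOM part of such a system can be sketched generically; the minimal NumPy implementation below shows a standard SOM weight update (best-matching unit plus a Gaussian neighbourhood) and is not the proposed SOM-plus-Hebbian architecture itself; all grid sizes and hyperparameters are assumptions.

```python
import numpy as np

def train_som(data, grid_h=8, grid_w=8, n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a basic Self-Organizing Map on row-vector samples in `data`."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.random((grid_h, grid_w, dim))
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    coords = np.stack([ys, xs], axis=-1).astype(float)      # grid position of each unit
    for t in range(n_iter):
        lr = lr0 * np.exp(-t / n_iter)                      # decaying learning rate
        sigma = sigma0 * np.exp(-t / n_iter)                # shrinking neighbourhood
        x = data[rng.integers(len(data))]                   # random training sample
        dists = np.linalg.norm(weights - x, axis=-1)        # distance to every unit
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best-matching unit
        grid_d2 = np.sum((coords - np.array(bmu, float)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]  # Gaussian neighbourhood weights
        weights += lr * h * (x - weights)                   # pull units toward the sample
    return weights

# usage (hypothetical): weights = train_som(np.random.rand(500, 16))
```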
SCOPE
The scope of Hand Gesture Recognition (HGR) for dynamic applications is vast and impactful.
Let’s explore the various aspects of its scope:
Natural Interaction and HCI:
• HGR enables natural interaction between humans and computers or devices.
• It enhances human-computer interaction (HCI) by allowing users to communicate
through gestures.
• Applications include gaming, virtual reality (VR), and sign language communication.
Applications and Use Cases:
• Robotics: HGR can control robots using hand gestures, aiding in collaboration or
assistance.
• Gaming: Gesture-based gaming enhances gameplay and immersion.
• Healthcare: Surgeons can manipulate medical equipment without physical touch during
surgeries.
• Smart Homes: Adjusting lights, temperature, or music with gestures.
• Sign Language Translation: Real-time interpretation of sign language.
• Video-Based Surveillance: Detecting suspicious gestures or actions.
Real-Time Interaction:
• Real-time HGR allows immediate responses, crucial for applications like VR and
robotics.
• Users can perform gestures, and the system responds promptly.
Challenges and Advances:
Challenges: Accurate recognition in various contexts, robustness to noise, occlusions, and
lighting conditions.
Advances: Deep learning techniques (e.g., 3D DenseNet, LSTM) improve performance and
real-time response speed.
In summary, HGR bridges the gap between humans and technology, enabling seamless
interactions across dynamic applications. Its scope extends from entertainment to critical fields
like healthcare and robotics.
SUGGESTIONS
Here are some valuable suggestions for Hand Gesture Recognition (HGR) in dynamic
applications:
Real-Time Gesture Recognition Models:
• Develop real-time HGR models using deep learning techniques such as convolutional
neural networks (CNNs) or recurrent neural networks (RNNs); a minimal sketch follows this list.
• Optimize for low latency to ensure immediate responses in interactive applications.
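A minimal sketch of such a model in PyTorch, combining a small per-frame CNN with an LSTM over the frame sequence; the layer sizes, clip shape, and class count are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class GestureCNNLSTM(nn.Module):
    """Per-frame CNN features fed to an LSTM for dynamic gesture classification."""
    def __init__(self, n_classes=10, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),           # -> 32-dim feature per frame
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):                                # clips: (batch, frames, 3, H, W)
        b, t, c, h, w = clips.shape
        feats = self.cnn(clips.reshape(b * t, c, h, w)).reshape(b, t, -1)
        out, _ = self.lstm(feats)                            # temporal modelling over frames
        return self.head(out[:, -1])                         # classify from the last time step

# usage (hypothetical): logits = GestureCNNLSTM()(torch.randn(2, 16, 3, 64, 64))
```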
Multi-Modal Approaches:
• Combine visual information from RGB images with depth data (e.g., from depth
sensors or stereo cameras).
• Multi-modal approaches enhance robustness and accuracy, especially in challenging
lighting conditions.
Data Augmentation:
• Augment the training dataset with variations in lighting, background, and hand poses (see the sketch after this list).
• Synthetic data generation can help improve model generalization.
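A hedged example of a per-frame augmentation pipeline using torchvision transforms (the jitter and affine ranges are assumed values):

```python
import torchvision.transforms as T

# Illustrative per-frame augmentation pipeline (parameter ranges are assumed values).
augment = T.Compose([
    T.ColorJitter(brightness=0.4, contrast=0.4),                         # lighting changes
    T.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1)),  # pose/position variation
    T.RandomHorizontalFlip(p=0.5),                                       # left/right hand variation
    T.ToTensor(),
])
# For dynamic gestures, the same sampled transform would normally be applied
# consistently to every frame of a clip.
```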
Transfer Learning:
• Pre-train models on large-scale datasets (e.g., ImageNet) and fine-tune them for HGR
tasks, as sketched below.
• Transfer learning accelerates model convergence and improves performance.
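A minimal transfer-learning sketch with torchvision (assumes torchvision's weights API and a hypothetical count of 10 gesture classes):

```python
import torch.nn as nn
import torchvision.models as models

# Load an ImageNet-pretrained backbone (torchvision >= 0.13 weights API assumed)
# and replace its classifier head for a hypothetical set of 10 gesture classes.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False                           # freeze the pretrained features
backbone.fc = nn.Linear(backbone.fc.in_features, 10)  # only this new head is trained
```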
Gesture Segmentation:
• Implement gesture segmentation to identify the start and end points of gestures (a toy spotting sketch follows this list).
• Temporal modelling is crucial for recognizing dynamic gestures.
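A toy spotting sketch under a strong simplifying assumption: a gesture is any span where a per-frame motion-energy signal (e.g., mean absolute frame difference) stays above a threshold; the threshold and minimum length are arbitrary illustrative values.

```python
import numpy as np

def spot_gestures(motion_energy, threshold=0.2, min_len=8):
    """Return [start, end) frame spans where motion energy stays above a threshold."""
    active = np.asarray(motion_energy) > threshold
    spans, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                                 # gesture candidate begins
        elif not a and start is not None:
            if i - start >= min_len:                  # keep only sufficiently long spans
                spans.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_len:
        spans.append((start, len(active)))            # gesture still active at the end
    return spans

# usage (hypothetical): spans = spot_gestures(np.random.rand(100))
```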
User-Centric Training:
• Collect user-specific data to adapt models to individual hand shapes and movements.
• Personalized models enhance accuracy for specific users.
Privacy and Security Considerations:
• Ensure that HGR systems respect user privacy.
• Avoid capturing sensitive information during gesture recognition.
Edge Deployment:
• Opt for lightweight models suitable for edge devices (e.g., smartphones, IoT devices), as sketched below.
• Edge deployment minimizes latency and reduces reliance on cloud resources.
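One lightweight option, sketched here with PyTorch dynamic quantization on a toy stand-in model (layer sizes are hypothetical):

```python
import torch
import torch.nn as nn

# Toy classifier standing in for a trained gesture model (hypothetical sizes).
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

# Dynamic int8 quantization of the Linear layers for lighter CPU/edge inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)
```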
Gesture Vocabulary:
• Define a comprehensive set of gestures relevant to the application domain.
• Consider both static and dynamic gestures.
Continuous Learning:
• Enable models to adapt over time by incorporating new gestures or user-specific
variations.
• Online learning techniques can enhance long-term usability.
DRAWBACKS
In this section, drawbacks of some of the discussed methods are explained. The orientation
histogram method has some problems: similar gestures might have different orientation
histograms, different gestures could have similar orientation histograms, and the method
performs well for any object that dominates the image even if it is not the hand gesture. The
Neural Network classifier has been applied for gesture classification, but it is time-consuming,
and as the amount of training data increases, the time needed for classification increases too; in
one reviewed system the NN required several hours to learn 42 characters and four days to learn
ten words. The fuzzy c-means clustering algorithm has some disadvantages: a wrong-object
extraction problem arises if objects larger than the hand are present, the performance of the
recognition algorithm decreases when the distance between the user and the camera is greater
than 1.5 meters, and it is sensitive to lighting condition changes, while unwanted objects might
overlap with the hand gesture. Another reviewed system is sensitive to changes in environmental
lighting, which produces erroneous segmentation of the hand region. HMM tools are well suited
to recognizing dynamic gestures but are computationally expensive. Other systems have
limitations that restrict the application: gestures are made with the right hand only, the arm must
be vertical, the palm must face the camera, and the background must be plain and uniform. One
system could recognize only the numbers 0 to 9, while a system proposed for controlling a robot
can only count the number of active fingers, without regard to which particular fingers are
active, with a fixed set of commands.
CONCLUSIONS
In this paper various methods for gesture recognition are discussed; these include Neural
Networks, HMMs, and fuzzy c-means clustering, besides the use of orientation histograms for
feature representation. For dynamic gestures, HMM tools have shown their efficiency, especially
for robot control, where they are used as classifiers and for capturing hand shape. For feature
extraction, additional methods and algorithms are required, for example to capture the shape of
the hand; one reviewed system applied a bivariate Gaussian function to fit the segmented hand,
which minimizes the effect of rotation. The selection of a specific recognition algorithm depends
on the intended application. In this work, application areas for gesture systems are presented, an
explanation of gesture recognition issues and a detailed discussion of recent recognition systems
are given, and a summary of some selected systems is listed as well.