Literature survey
9. CNN and LSTM: the positions of 18 key points tracked by OpenPose (ears, eyes, nose, neck, shoulders, hips, knees, ankles, elbows, and wrists) for yoga poses in a video; accuracy 99.04% (real time: 98.92%). A minimal sketch of this keypoint-sequence architecture follows this table.
10. LSTM: 99.2%
11. RLA-based CNN + TL: Dataset 1: 0.993; Dataset 2: 0.944; Dataset 3: 0.928
12. Normalization with Logistic Regression classifier: 94%
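Entry 9 above (and entries 1 and 9 of the conclusions table that follows) describe a hybrid CNN and LSTM operating on sequences of OpenPose keypoints. The following is a minimal Keras sketch of that idea, assuming 18 (x, y) keypoints per frame, 45-frame clips, and six pose classes; the clip length, layer sizes, and training settings are illustrative assumptions, not values taken from the surveyed papers.

```python
import numpy as np
from tensorflow.keras import layers, models

NUM_FRAMES = 45      # assumed clip length (frames per sample)
NUM_KEYPOINTS = 18   # OpenPose body keypoints (ears, eyes, nose, neck, ...)
NUM_CLASSES = 6      # assumed number of yoga poses

def build_cnn_lstm():
    """Time-distributed 1D CNN over per-frame keypoints, followed by an LSTM."""
    inputs = layers.Input(shape=(NUM_FRAMES, NUM_KEYPOINTS, 2))  # (x, y) per keypoint
    # Per-frame convolution looks for spatial patterns among the keypoints.
    x = layers.TimeDistributed(layers.Conv1D(32, kernel_size=3, activation="relu"))(inputs)
    x = layers.TimeDistributed(layers.GlobalMaxPooling1D())(x)
    # The LSTM accumulates evidence over the recent frames of the clip.
    x = layers.LSTM(64)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_cnn_lstm()
    # Dummy batch: 4 clips of 45 frames, 18 (x, y) keypoints each.
    dummy = np.random.rand(4, NUM_FRAMES, NUM_KEYPOINTS, 2).astype("float32")
    print(model.predict(dummy).shape)  # (4, 6)
```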
Sl. No. | Conclusion | Future Work
1. Conclusion: Deep learning methods are promising because of the vast research being done in this field. The use of a hybrid CNN and LSTM model on OpenPose data is seen to be highly effective and classifies all six yoga poses perfectly. A basic CNN and SVM also perform well beyond expectations. The performance of the SVM proves that ML algorithms can also be used for pose estimation or activity recognition problems. Also, the SVM is much lighter and less complex than a neural network and requires less training time.
   Future work: The dataset can be expanded by adding more yoga poses performed by individuals not only in indoor settings but also outdoors. The performance of the models depends on the quality of OpenPose pose estimation, which may not perform well in cases of overlap between people or between body parts. A portable device for self-training and real-time predictions can be implemented for this system. This work demonstrates activity recognition for practical applications; a comparable approach can be utilized for pose recognition in tasks such as sports, surveillance, and healthcare. Multi-person pose estimation is a whole new problem in itself and has a lot of scope for research. There are many scenarios where single-person pose estimation would not suffice; for example, pose estimation in crowded scenes involves multiple persons and requires tracking and identifying the pose of each individual.
2. Conclusion: In this work, we have proposed a novel neural network architecture, YoNet, to recognize five common yoga poses after a thorough discussion of current related works. The intuition of our architecture is to extract the spatial and depth features from the image separately and to use both types of features for recognition (a sketch of this two-branch idea is given after the table). This gives our architecture an advantage in differentiating better among the poses, as hypothesized in our methodology and proven through the result analysis and comparison carried out in our research work.
   Future work: More poses can be considered even with our proposed architecture, owing to its feature extraction strategy. Future research work also includes achieving better performance through hyper-parameter tuning.
3. Conclusion: The paper presented and compared four innovative yoga pose recognition models. LGDeep is the best yoga posture classification model, using deep transfer learning and ensemble techniques (a hedged transfer-learning sketch follows the table). LGDeep achieved 100% classification accuracy, exceeding previous similar studies and models. LGDeep's specificity and sensitivity exceed those of other techniques, proving its usefulness. The LGDeep model's dependability and accuracy make it a highly suitable candidate for a yoga position recognition system. The recommended technique might improve yoga practitioners' health and safety due to its strong classification capabilities.
   Future work: Future research and development may include dataset expansion, extending the dataset to incorporate more postures, variants, and body types, which can improve the model's generalizability and robustness. To increase accessibility, the yoga posture recognition system might include a user-friendly interface so that users can easily interact with the system, visualize their positions, and track their progress over time.
4. Conclusion: The yoga posture evaluation system could help re-popularize asanas while also ensuring that each asana is performed correctly. To accomplish this task, deep learning and AI techniques are promising and have a lot of potential. A convolutional neural network applied to key points determined with MediaPipe was found to be quite effective for this purpose, accurately classifying all five yoga asanas (see the keypoint-based sketch after the table). This work also attempts to address the many obstacles and restrictions in current state-of-the-art procedures.
   Future work: The proposed model can be modified to work on a video dataset or a real-time feed. Three-dimensional convolutional neural networks can also be explored for yoga asana detection and may achieve even better results. Body posture tracking and classification can be used in training robots, health care, surveillance, sports, motion capture, motion tracking, consoles, augmented reality, etc. There is still a lot of untapped potential and research that can be done in human posture detection.
5. Conclusion: Human pose estimation has been studied extensively over the past years. Compared with other computer vision problems, human pose estimation is distinctive because it needs to localize and assemble human body parts based on a defined structure of the human body. The use of pose estimation in fitness and sports can help prevent injuries and improve the performance of people's workouts. Yoga self-instruction systems carry the potential to make yoga popular while ensuring it is performed in the correct way. Deep learning methods are promising because of the vast research being done in this field. The use of a hybrid CNN and LSTM model on OpenPose data appears to be highly effective and classifies all six yoga poses perfectly.
   Future work: The proposed models currently classify only six yoga asanas. There are many yoga asanas, and consequently building a pose estimation model that is effective for all asanas is a challenging problem. The dataset can be extended by adding more yoga poses performed by people in indoor as well as outdoor settings.
6. Conclusion: To provide a portable embedded solution for real-world deployment of the developed yoga pose recognition system, we optimized the designed lightweight 3DCNN Model3 using the TensorRT SDK and deployed it on the Nvidia Xavier embedded board for real-time inference. Such a system can be deployed in real-world conditions or integrated with a self-training yoga system to recognize different poses.
   Future work: Our future research direction will be to develop deep learning-based models to identify abnormalities in yoga poses and incorporate feedback techniques for posture correction. We also plan to explore the use of dual-stream neural networks combining body pose (extracted from body keypoints) and spatial information (obtained from the RGB frames) for the task of yoga pose recognition. Finally, we intend to study skeleton-based yoga pose recognition, combining skeleton features and advanced neural network techniques such as graph convolutional networks (GCNs) and transformers.
7. Conclusion: The proposed model was able to successfully classify the yoga poses with an average accuracy of 92.34%. This study aimed at making a model that can be used as a yoga coach and is easily accessible to common people for living a healthy and stress-free lifestyle. Real-time computation of features and real-time categorization are key components of the proposed approach.
   Future work: In future, the skeleton-point data and a CNN can be combined with the image data, using an ensemble technique, to overcome problems such as strong articulations, small and barely visible joints, occlusions, and clothing, and thereby make the system more robust. A limitation of the proposed model is that it can be computationally expensive to train and may require more computational resources than other types of neural networks.
8. Conclusion: In this paper, we presented a vision-based system for the recognition of Yoga poses in real time, which intends to overcome the limitations of the existing state-of-the-art technique. To this end, we first built a large-scale dataset of ten Yoga poses captured in complex real-world environments so that the designed system could deliver better performance when deployed in real-world conditions. Secondly, we proposed a lightweight 3D CNN architecture that exploits the inherent spatial–temporal relationship among the Yoga poses for their recognition (a minimal 3D CNN sketch is given after the table). On the test set of the in-house Yoga pose dataset, the proposed 3D CNN model achieved a recognition accuracy of 91.15% along with average precision, average recall, and average F1-score of 0.91. Besides, on the publicly available six-pose Yoga dataset, the proposed model achieved a competitive recognition accuracy of 99.39% along with average precision, average recall, and average F1-score of 0.99.
   Future work: In future work, more poses and video clips can be added to the database to enhance its usability. Additionally, the fusion of geometric and spatial–temporal features from the Yoga pose sequences can be explored to enhance the recognition accuracy of the designed system. The designed system can also be optimized and realized on portable embedded devices for the recognition of Yoga poses in real time.
9. Conclusion: In this paper, we proposed a Yoga identification system using a traditional RGB camera. The dataset was collected using an HD 1080p Logitech webcam for 15 individuals (ten males and five females) and made publicly available. OpenPose is used to capture the user and detect keypoints. The end-to-end deep learning-based framework eliminates the need for handcrafted features, allowing new asanas to be added by simply retraining the model with new data. We applied a time-distributed CNN layer to detect patterns between keypoints in a single frame and an LSTM to memorize the patterns found in the recent frames. Using the LSTM for the memory of previous frames and polling for denoising makes the system even more robust by minimizing the error due to false keypoint detection.
   Future work: In future work, more asanas and a larger dataset comprising both images and videos can be included. Also, the system can be implemented on a portable device for real-time predictions and self-training. This work serves as a demonstration of activity recognition systems for realistic applications. A similar approach can be used for posture recognition in various tasks such as surveillance, sports, healthcare, and image classification.
10. Conclusion: This paper presents a yoga recognition method utilizing a conventional RGB camera. The data were obtained for ten individuals (five men and five women) using an HD 1080p RGB camera and made publicly available. Pose estimation is employed to find the important keypoint locations. The end-to-end deep-learning-based architecture removes the requirement for hand-crafted features, allowing the model to be retrained with new data to include new asanas. The memory elements of the LSTM were used in this study to memorize the patterns seen in recent frames. The results make the system even more resilient by reducing the error due to false keypoint detection, using the LSTM for the memory of earlier frames and polling for denoising.
   Future work: Additional yoga poses and a larger dataset with both images and videos may be integrated in future development, and more models may be examined with the yoga asana data.
11. Conclusion: This article aims to establish a systematic model for yoga pose categorization by exploiting the designed RLA-based CNN with TL. Human object detection is implemented using SQA, where the network is trained utilizing RLA. The segmented result is passed to the feature extraction module, where features including SLBT, LTP, HLoG, and hierarchical skeleton features are refined. Finally, yoga pose classification is performed using the CNN with TL.
   Future work: Future work would include the incorporation of other hybrid optimization algorithms to attain superior accuracy in pose classification.
12. Conclusion: In this study, a yoga pose classifier was successfully developed which works perfectly on images, static video, and live video of any user. The study starts from environment creation and proceeds with data collection from open data sources. The MediaPipe pose estimation library is used for human pose estimation and returns body key points; these key points form the basis of a new dataset. Data preprocessing then takes place, in which the target variables are changed. After this, the data is normalized for better performance of the machine learning algorithms, and finally feature engineering is carried out, where various joint angles of the body are calculated using the formula shown in figure 6 (a keypoint-and-angle sketch is given after the table). Once the data is completely preprocessed, it is fed to the machine learning models. These models are evaluated on test data and compared on the basis of accuracy score. The logistic regression classifier achieves the maximum score of 94% among all classifiers. For classification, a confidence threshold of 97% is used; below this threshold, "no pose detected" is returned to the user.
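Entry 2 describes the YoNet architecture as extracting spatial and depth features from the same image separately and fusing both for recognition. The original YoNet layers are not reproduced in this survey; the sketch below only illustrates that stated intuition with two parallel branches whose pooled features are concatenated. The input resolution, the use of separable convolutions for the second branch, and all layer widths are assumptions.

```python
from tensorflow.keras import layers, models

IMG_SHAPE = (224, 224, 3)  # assumed input resolution
NUM_CLASSES = 5            # five yoga poses, as in entry 2

def build_two_branch_net():
    """Two parallel feature extractors over the same image, fused for classification."""
    inputs = layers.Input(shape=IMG_SHAPE)

    # Branch A: standard convolutions, standing in here for the spatial feature stream.
    a = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    a = layers.MaxPooling2D()(a)
    a = layers.Conv2D(64, 3, activation="relu", padding="same")(a)
    a = layers.GlobalAveragePooling2D()(a)

    # Branch B: depthwise-separable convolutions, standing in for the second feature stream.
    b = layers.SeparableConv2D(32, 3, activation="relu", padding="same")(inputs)
    b = layers.MaxPooling2D()(b)
    b = layers.SeparableConv2D(64, 3, activation="relu", padding="same")(b)
    b = layers.GlobalAveragePooling2D()(b)

    # Both feature types are concatenated and used jointly for recognition.
    x = layers.Concatenate()([a, b])
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return models.Model(inputs, outputs)
```

Concatenating the two pooled feature vectors lets the classifier weigh both feature types jointly, which is the advantage claimed in entry 2.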
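Entry 3 attributes LGDeep's results to deep transfer learning combined with ensemble techniques, but the backbone and ensemble members are not spelled out in this survey. The sketch below is only a generic illustration of that pattern, assuming a frozen ImageNet-pretrained MobileNetV2 as the feature extractor and a soft-voting ensemble of two classical classifiers; none of these choices are claimed to match LGDeep.

```python
from tensorflow.keras.applications import MobileNetV2
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

# Frozen ImageNet-pretrained backbone used purely as a feature extractor (transfer learning).
backbone = MobileNetV2(weights="imagenet", include_top=False, pooling="avg",
                       input_shape=(224, 224, 3))
backbone.trainable = False

def extract_features(images):
    """images: float array of shape (n, 224, 224, 3), preprocessed for MobileNetV2."""
    return backbone.predict(images, verbose=0)

# Soft-voting ensemble over the extracted deep features.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("knn", KNeighborsClassifier())],
    voting="soft")

# Assumed training data (not shown): X_img are yoga-pose images, y are pose labels.
# ensemble.fit(extract_features(X_img), y)
# predictions = ensemble.predict(extract_features(X_test))
```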
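Entry 8 relies on a lightweight 3D CNN that exploits the spatial–temporal relationship within short clips. The exact architecture is not reproduced here; the sketch below only shows the general Conv3D pattern, with an assumed clip size of 16 frames at 112x112 resolution and ten pose classes.

```python
from tensorflow.keras import layers, models

CLIP_SHAPE = (16, 112, 112, 3)  # assumed: 16 frames of 112x112 RGB
NUM_CLASSES = 10                # ten yoga poses, as in entry 8

def build_3d_cnn():
    """Small Conv3D stack that mixes spatial and temporal information jointly."""
    inputs = layers.Input(shape=CLIP_SHAPE)
    x = layers.Conv3D(16, kernel_size=3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling3D(pool_size=(1, 2, 2))(x)   # downsample space, keep time
    x = layers.Conv3D(32, kernel_size=3, activation="relu", padding="same")(x)
    x = layers.MaxPooling3D(pool_size=(2, 2, 2))(x)   # now downsample time as well
    x = layers.Conv3D(64, kernel_size=3, activation="relu", padding="same")(x)
    x = layers.GlobalAveragePooling3D()(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return models.Model(inputs, outputs)
```

A model of this kind can subsequently be exported (for example to ONNX) and optimized with a tool such as TensorRT for real-time inference on an embedded board, which is the deployment path summarized in entry 6.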
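Entries 4 and 12 both build classifiers on top of MediaPipe pose key points, and entry 12 additionally computes joint angles as features and applies a 97% confidence threshold before reporting a pose. The sketch below shows one way those pieces could fit together; the specific angle definition, the classifier settings, and the helper names are illustrative assumptions, and the exact angle formula referenced as figure 6 in the original paper is not reproduced here.

```python
import numpy as np
import cv2
import mediapipe as mp
from sklearn.linear_model import LogisticRegression

mp_pose = mp.solutions.pose

def extract_keypoints(image_bgr):
    """Run MediaPipe Pose on one image and return (x, y) landmark coordinates."""
    with mp_pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return None
    return np.array([(lm.x, lm.y) for lm in results.pose_landmarks.landmark])

def joint_angle(a, b, c):
    """Angle (degrees) at point b formed by points a-b-c, e.g. a knee or elbow angle."""
    ba, bc = a - b, c - b
    cosine = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc) + 1e-8)
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

THRESHOLD = 0.97  # entry 12: below 97% confidence, report "no pose detected"

def classify(clf, features):
    """Return the predicted pose label, or a fallback when confidence is too low."""
    probs = clf.predict_proba([features])[0]
    if probs.max() < THRESHOLD:
        return "no pose detected"
    return clf.classes_[probs.argmax()]

# Assumed training data (not shown): X_train are joint-angle feature vectors, y_train are labels.
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```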