Human Activity Detection Using Pose Net
Abstract—Human detection and tracking is one of the important fields of study that has attracted a lot of attention recently. Although commercial technologies for human identification and counting are now available, further experimentation is required to address the remaining challenges. Abnormality detection, commonly called automated video surveillance, is the process of observing and describing human behaviour and interactions in a crowded setting. People must be detected and tracked in order to maintain reliability, welfare, and site management. Object identification is a critical stage in abnormality detection and automated video surveillance. Background subtraction is a technique for detecting human actions in video segments. It is the common technique for distinguishing moving objects from static images: as the name implies, background subtraction separates foreground objects from the background in a sequence of video frames. In this scenario, the primary purpose of abnormality detection is to recognise and track a moving object using recorded video images and Pose Net.

Keywords—Background subtraction, Pose Net, CNN

I. INTRODUCTION

Computer vision is a multidisciplinary field that focuses on how computers can be designed to interpret digital images or videos. From an engineering perspective, it aims to automate tasks that the human visual system can perform. Computer vision tasks include techniques for acquiring, processing, analysing, and understanding digital images, as well as extracting high-dimensional data from the real world in order to produce statistical information such as decisions. Here, understanding refers to the conversion of visual representations into descriptions of the world that can interact with other cognitive processes and lead to appropriate action. Human activity detection and pose estimation have attracted attention in a variety of applications, including video-based recognition and human–computer interaction. However, much research is still being carried out to improve precision and speed. In most cases, activity recognition and pose estimation are performed separately.

Although pose is strongly linked to activity detection, no solution that addresses both problems simultaneously for the benefit of activity recognition is being researched. One of the primary advantages of machine learning is its capacity for end-to-end optimization, and a problem that can be optimized end to end benefits in several ways.

Convolutional Neural Networks (CNNs) have grown in popularity in recent years as a solution to image classification problems. Given this success in image identification and description, researchers have begun to use CNNs more widely for visual categorization. Owing to lighting conditions, occlusion, background clutter, deformation, variation in scale, and inter-class variance, classifying real-life videos into unconstrained activities is a difficult task.

Human Action Recognition (HAR) is a task that analyses human behaviour and assigns a label to each action. It has a large number of applications and is becoming increasingly popular in computer vision. Skeleton, depth, infrared, point cloud, audio, acceleration, radar, and WiFi signals are some of the data modalities that can be used to represent human actions. Depending on the application environment, these modalities capture different sources of useful yet distinct information and offer different benefits. Accordingly, numerous past studies have investigated HAR techniques that employ various modalities. HAR is important for a range of real-world applications. It can be used to detect potentially dangerous human actions and ensure safe operation in both visual surveillance and autonomous navigation systems. It is also helpful for video retrieval, human-robot interaction, and entertainment. The aim of HAR is to automatically analyse and recognise the nature of an action in unknown video sequences. Academia and industry are both interested in HAR because of the rising demand for automated interpretation of human activity. In practice, analysing and understanding a person's behaviour is important for a variety of applications, including video indexing, biometrics, surveillance, and security.

II. LITERATURE REVIEW

Fatemeh et al. [1] present a hierarchical approach that combines HOG with background subtraction, a deep neural network, and skeletal modelling. A CNN and an LSTM recurrent network are combined for feature selection and for retaining past information, and a Softmax-KNN classifier is finally used to identify human actions.

Fernando et al. [2] propose a deep neural network for HAR. The measurements from multiple body-worn devices are handled separately by this network. Three datasets were used: Opportunity, Pamap2, and an industrial dataset. The architecture was evaluated and found to outperform the state of the art.

H. Mei et al. [3] present a way of detecting a large number of objects whose count is unknown and fluctuates over time. The multiple object tracking approach uses a graph structure to keep track of a number of conjectures regarding the
clustering algorithms on the provided data files. compared to alternative clustering techniques.

Vinayakumar R. et al. [15] investigate the usefulness of the IRNN and various RNN variants for intrusion detection (ID). The detection rates achieved by the IRNN methods on the KDDCup-99 intrusion dataset are very similar to those achieved by other RNN variants. Through numerous tests with IRNN and RNN variant designs, the reasoning behind the network topology and its parameters has been thoroughly studied.

Soman K. P. et al. [16] report that experiments using members of the RNN family produced a lower false-positive (FP) rate than typical machine learning classifiers. RNN designs are popular because they can retain details over long-term dependencies across time lags and relate them to subsequent details in sequence connections. The usefulness of RNN designs is also demonstrated on the UNSW-NB15 dataset.

Jie Yin et al. [17] first use a one-class SVM trained on commonly observed normal activities to filter out the activities with a very high likelihood of being normal. To lower the false-positive rate in an unsupervised way, they next derive models of abnormal behaviour from a general normal model using kernel nonlinear regression. They demonstrate that the method offers a favourable trade-off between the abnormality detection rate and the false-alarm rate, and that it enables the automatic derivation of abnormal activity models without the need to explicitly label the rare abnormal training data. Its efficacy is shown using real data gathered from a sensor network deployed in a realistic environment.

human computer interactions, patient monitoring systems, and robotics, human activity detection is quickly becoming a popular topic of research. We split these datasets into two categories: two-dimensional (2D-RGB) datasets and three-dimensional (3D-RGB) datasets. The most accurate state-of-the-art algorithms for these datasets are also presented. We briefly discuss the benefits and drawbacks of using 2D and 3D datasets.

The rest of the document is organised as follows. The second section presents the literature survey. The third section describes the methods. The fourth section discusses the proposed system. The fifth section contains the results and discussion, and the last section contains the study's conclusion.

III. METHODOLOGY

A. CNN

Human pose estimation is the task of extracting the body's skeletal keypoints and joint locations corresponding to the human body parts. It uses the large number of keypoints and joints to connect up the two-dimensional structure of the human body. In this project, we used the OpenPose system to estimate pose from an input image. The image is passed to the CNN network in OpenPose to extract features from the input. The feature map is then processed through multiple CNN layers to produce Part Affinity Fields (PAFs) and confidence maps. To capture the human pose in the image, the part affinity fields and confidence maps produced above are passed through a bipartite graph matching algorithm.
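The matching step described above can be illustrated with a minimal sketch. It is not the OpenPose implementation or its API. It assumes the PAF for one limb type is available as a small grid of 2D vectors and that candidate keypoints for the two ends of the limb have already been read from the confidence maps. The function names `paf_score` and `match_parts`, the grid layout `paf[y][x]`, the sample count, and the 0.5 threshold are all hypothetical choices made for illustration. A greedy assignment (best-scoring pairs first, each keypoint used once) stands in here for a full bipartite matching algorithm.

```python
import math


def paf_score(paf, p1, p2, num_samples=10):
    """Average alignment between the PAF and the segment p1 -> p2.

    paf is a grid indexed as paf[y][x] -> (vx, vy); p1 and p2 are (x, y).
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb
    total = 0.0
    for i in range(num_samples):
        t = i / (num_samples - 1)
        x = int(round(p1[0] + t * dx))
        y = int(round(p1[1] + t * dy))
        vx, vy = paf[y][x]
        total += vx * ux + vy * uy  # dot product of field and limb direction
    return total / num_samples


def match_parts(paf, parts_a, parts_b, min_score=0.5):
    """Greedy bipartite matching between two lists of candidate keypoints."""
    candidates = []
    for i, a in enumerate(parts_a):
        for j, b in enumerate(parts_b):
            score = paf_score(paf, a, b)
            if score >= min_score:
                candidates.append((score, i, j))
    candidates.sort(reverse=True)  # strongest connections first
    used_a, used_b, pairs = set(), set(), []
    for score, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j))
    return sorted(pairs)


# Toy field (assumed data): two horizontal limbs, on rows y=1 and y=4.
demo_paf = [[(0.0, 0.0)] * 10 for _ in range(6)]
for row in (1, 4):
    for col in range(1, 9):
        demo_paf[row][col] = (1.0, 0.0)

starts = [(1, 1), (1, 4)]  # e.g. two candidate shoulders
ends = [(8, 4), (8, 1)]    # e.g. two candidate elbows, listed in mixed order
print(match_parts(demo_paf, starts, ends))  # -> [(0, 1), (1, 0)]
```

In the toy example each start point is paired with the end point on the same row, because only that segment runs along the field; the diagonal pairings score below the threshold and are discarded.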
3. Extract features of the body and the normalized joint positions.
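The text does not state how the joint positions are normalized, so the following is only a minimal sketch under stated assumptions: joints are 2D pixel coordinates keyed by name, the pose is translated so that a root joint sits at the origin, and it is scaled so that the distance from the root to a reference joint equals one. The function name `normalize_joints` and the joint names `mid_hip`, `neck`, and `right_wrist` are hypothetical and chosen for illustration.

```python
import math


def normalize_joints(joints, root="mid_hip", ref="neck"):
    """Translate joints so `root` is the origin and scale so |root - ref| = 1.

    joints maps a joint name to its (x, y) position in pixels.
    """
    rx, ry = joints[root]
    nx, ny = joints[ref]
    scale = math.hypot(nx - rx, ny - ry)  # torso length used as the unit
    if scale == 0:
        raise ValueError("root and reference joints coincide; cannot normalize")
    return {
        name: ((x - rx) / scale, (y - ry) / scale)
        for name, (x, y) in joints.items()
    }


# Assumed example pose in pixel coordinates.
pose = {"mid_hip": (100, 200), "neck": (100, 100), "right_wrist": (150, 150)}
print(normalize_joints(pose)["right_wrist"])  # -> (0.5, -0.5)
```

Normalizing in this way makes the features independent of where the person stands in the frame and of their apparent size, so the same pose at a different position or scale maps to the same feature values.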
Fig. 3. Human activity detected is walking.
Fig. 4. Human activity detected is fighting.

Classification report:

TABLE I. CLASSIFICATION REPORT

              precision  recall  f1-score  support
1             0.98       0.98    0.98      50
0             0.98       0.98    0.98      50
accuracy                         0.98      100
Macro avg     0.98       0.98    0.98      100
Weighted avg  0.98       0.98    0.98      100

VI. CONCLUSION

This article discusses the most recent work in this field of study. It first discusses the objective of human pose estimation and then presents human pose estimation approaches and keypoint detection methods for pose representation. It also discusses some common datasets and different types of classification methods, providing a survey that supports the design and development of an automated human recognition system. In numerous computer vision applications, the need to interpret human actions has become unavoidable. Among the major application fields, anomaly recognition is becoming a prominent research topic because techniques have been developed that let a system perform anomaly detection without the help of a trainer and become a self-learner. We conclude that our study helps in designing and developing an automated human recognition system for anomaly recognition with different poses, irrespective of the many challenges.

REFERENCES

[1] Fatemeh Serpush, Mahdi Rezaei, “Complex Human Action Recognition Using a Hierarchical Feature Reduction and Deep Learning-Based Method”, SN Computer Science, 2021
[2] Fernando Moya Rueda, Gernot A. Fink, “Convolutional Neural Networks for Human Activity Recognition Using Body-Worn Sensors”, 2018
[3] Mei Han, A. Sethi, “A detection-based multiple object tracking method”, International Conference on Image Processing, 2004
[4] Rashmi R. Koli, Tanveer I. Bagban, “Human Action Recognition Using Deep Neural Networks”, Fourth World Conference on Smart Trends in Systems, Security and Sustainability, 2020
[5] M. Leo, T. D'Orazio, I. Gnoni, “Complex human activity recognition for monitoring wide outdoor environments”, 7th International Conference on Pattern Recognition, 2004
[6] Xiaoran Shi, Yaxin Li, Feng Zhou, “Human Activity Recognition Based on Deep Learning Method”, International Conference on Radar (RADAR), 2018
[7] Noor Almaadeed, Omar Elharrouss, Somaya Al-Maadeed, “A Novel Approach for Robust Multi Human Action Recognition and Summarization based on 3D Convolutional Neural Networks”, 2021
[8] Tingtian Li, Zixun Sun, Xiao Chen, “Group-Skeleton-Based Human Action Recognition in Complex Events”, 2020
[9] Amel Ben Mahjoub, Mohamed Atri, “Human action recognition using RGB data”, 11th International Design & Test Symposium, 2016
[10] V. Parameswari, S. Pushpalatha, “Human Activity Recognition using SVM and Deep Learning”, European Journal of Molecular & Clinical Medicine, Volume 7, Issue 4, 2020
[11] Kavya, J., & Geetha, M. (2016, September). An FSM based methodology for interleaved and concurrent activity recognition. In 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 994-999). IEEE.
[12] Ashokan, V., & Murthy, O. R. (2017, July). Comparative evaluation of classifiers for abnormal event detection in ATMs. In 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT) (pp. 1330-1333). IEEE.
[13] Amrutha, C. V., Jyotsna, C., & Amudha, J. (2020, March). Deep learning approach for suspicious activity detection from surveillance video. In 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA) (pp. 335-339). IEEE.
[14] Chitturi, B., Thomas, J., & Indulekha, T. S. (2015, December). New approaches for discovering unsupervised human activities by mining sensor data. In 2015 International Conference on Computing and Network Communications (CoCoNet) (pp. 118-123). IEEE.
[15] Vinayakumar, R., Soman, K. P., & Poornachandran, P. (2019). A comparative analysis of deep learning approaches for network intrusion detection systems (N-IDSs): deep learning for N-IDSs. International Journal of Digital Crime and Forensics (IJDCF), 11(3), 65-89.
[16] Vinayakumar, R., Soman, K. P., & Poornachandran, P. (2017). Evaluation of recurrent neural network and its variants for intrusion detection system (IDS). International Journal of Information System Modeling and Design (IJISMD), 8(3), 43-63.
[17] Yin, J., Yang, Q., & Pan, J. J. (2008). Sensor-based abnormal human-activity detection. IEEE Transactions on Knowledge and Data Engineering, 20(8), 1082-1090.
[18] Singh, G., & Cuzzolin, F. (2016). Untrimmed video classification for activity detection: submission to ActivityNet challenge. arXiv preprint arXiv:1607.01979.
[19] Kumaran, N., & Reddy, U. S. (2021). Classification of human activity detection based on an intelligent regression model in video sequences. IET Image Processing, 15(1), 65-76.
[20] Erol, B., & Amin, M. G. (2019). Radar data cube processing for human activity recognition using multisubspace learning. IEEE Transactions on Aerospace and Electronic Systems, 55(6), 3617-3628.
[21] Singh, T., & Vishwakarma, D. K. (2019). Human activity recognition in video benchmarks: A survey. Advances in Signal Processing and Communication, 247-259.