
Smartwatch-based Human Activity Recognition Using Hybrid LSTM Network


Sakorn Mekruksavanich¹ and Anuchit Jitpattanakul²

¹Department of Computer Engineering, School of Information and Communication Technology,
University of Phayao, Phayao, Thailand
[email protected]

²Intelligent and Nonlinear Dynamic Innovations Research Center, Department of Mathematics,
Faculty of Applied Science, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand
[email protected]

Abstract—As a result of the rapid development of wearable sensor technology, the use of smartwatch sensors for human activity recognition (HAR) has recently become a popular area of research. Currently, a large number of mobile applications, such as healthcare monitoring and sport performance tracking, apply the results of major HAR research studies. In this paper, an HAR framework that employs spatial-temporal features automatically extracted from smartwatch sensor data is proposed. The framework takes a hybrid deep learning approach, combining Long Short-Term Memory networks and a Convolutional Neural Network and thereby eliminating the need for manual feature extraction. The hyperparameters of each of the considered networks are tuned by Bayesian optimization. The results indicate that the proposed hybrid deep learning model outperforms the baseline models, with an average accuracy of 96.2% and an F-measure of 96.3%.

Index Terms—smartwatch, deep learning, human activity recognition, wearable devices, hybrid LSTM

I. INTRODUCTION

There are numerous reasons for the rapid development of wearable sensor technology, for example the decreasing cost of sensor devices and the significant improvement in the computational capacity of miniaturized sensors [1]. Wearable sensors are tiny devices that people can carry continuously while performing their everyday activities. Sensors such as accelerometers, barometers, global positioning systems (GPS), gyroscopes, and magnetometers are capable of capturing the signal of a person's physical movement at any time and in any place.

The advantages of wearable sensors have been adopted in a number of beneficial mobile applications, including abnormal driving detection [2], healthcare systems for remotely monitoring elderly people [3]–[5], sport performance tracking [6], and mobile assistance systems for people who are visually impaired [7].

Interest in applying wearable sensors to human activity recognition (HAR) is currently increasing among many researchers in the pervasive computing field [8]–[12]. These days, direction and motion sensors that can be employed for the classification of human activities, for example accelerometers and gyroscopes, are installed on smartphones and smartwatches [13]. However, the focus of these efforts has mostly been on smartphone-based activity recognition, including the studies investigated in [14]–[16]. The use of these common, commercially available devices significantly increased the possible applications of activity recognition; nevertheless, limits have resulted from their placement on the user's body and their unstable orientation, such as when the smartphone moves around in the user's pocket or when the position of the pocket is not suitable for tracking hand-based activities. Since women do not usually carry their smartphone in their pocket, they are especially prone to the limitations of smartphone-based activity recognition. The use of smartwatches, which are worn in a constant location, addresses the majority of these drawbacks, as smartwatches are ideally positioned for tracking hand-based activities.

In addition, machine learning algorithms can be employed to improve smartwatch-based activity recognition models in order to provide more accurate assessments of a broad variety of activities [17]–[19]. Nevertheless, these conventional machine learning approaches regularly depend upon heuristic, manual feature extraction and are thus normally limited by human domain knowledge. Because of this limitation, the performance of conventional machine learning approaches is restricted in terms of classification accuracy and the other metrics used for evaluation [20]. In this paper, deep learning (DL) approaches are applied in order to overcome these limitations.

An HAR framework that uses spatial-temporal features automatically extracted from smartwatch sensor data by applying a hybrid deep learning model known as a CNN-LSTM network is proposed in this work. An experimental evaluation is performed to compare the proposed CNN-LSTM-based approach with baseline deep learning models on a public dataset referred to as the WISDM dataset [21]. In addition, Bayesian optimization is employed to tune the CNN-LSTM models' hyperparameters.

II. BACKGROUND AND RELATED RESEARCH

In this section, the background knowledge for this study and the related research concerned with HAR and deep learning approaches are summarized.

A. Deep Learning Applied to HAR

Smartwatch-based HAR focuses on understanding human behavior from smartwatch or smartphone sensors, such as the gyroscope and accelerometer, which record signals while users perform their activities. The HAR problem can be systematically formulated as a time-series classification problem [22].
In recent studies, the main focus of research conducted on HAR has been deep learning (DL). Numerous successful outcomes have been reported, which has led researchers to apply a range of deep learning approaches to solve complex HAR problems [23]–[26]. These approaches reduce the effort needed for feature extraction compared with conventional machine learning; as a result, DL models can be built directly from raw sensor data while still achieving highly efficient recognition.

One type of deep learning network is the convolutional neural network (CNN), which has been recommended for improving performance on wearable-based HAR problems [27]. CNNs can automatically extract spatial features from raw sensor data. However, many human activities produce time-series data with temporal dependencies. To handle these temporal dependencies, the use of Long Short-Term Memory (LSTM) networks has been proposed, and their application in the field of HAR is currently increasing [28], [29]. Hybrid LSTMs, which combine several preceding CNN layers that extract spatial features with LSTM layers that extract temporal features, provide the advantages of both architectures. In the experiments carried out in this study, such a combined CNN-LSTM was investigated, as described in Section III.
III. PROPOSED METHODS

The proposed framework for smartwatch-based human activity recognition captures sensor data from the smartwatch in order to classify the activities performed by its wearer. The overall methods applied in this study to achieve the aim of this research are illustrated in Fig. 1.

Fig. 1. The proposed framework of smartwatch-based HAR
A. Smartwatch Dataset

The WISDM dataset from the UCI Repository is the publicly available source of the raw sensor data used in this study [21]. The dataset contains accelerometer and gyroscope data from various smartphones running Android 6.0 and a smartwatch (LG G Watch) running Android Wear 1.5. The sensor data were collected from smartwatches worn on the dominant hand of 51 participants during 18 physical activities performed in everyday life. Each activity was performed separately for approximately 3 minutes at a sampling rate of 20 Hz.
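For readers reproducing the pipeline, a minimal loading sketch is given below. The file path and column layout (subject, activity, timestamp, x, y, z, with a trailing semicolon on each record) are assumptions based on the dataset's published distribution, not details stated in the paper; verify them against the files you download.

```python
import pandas as pd

# Assumed record layout of a raw WISDM smartwatch sensor file.
COLUMNS = ["subject", "activity", "timestamp", "x", "y", "z"]

def load_wisdm_file(path: str) -> pd.DataFrame:
    """Parse one raw WISDM sensor file into a tidy DataFrame."""
    df = pd.read_csv(path, header=None, names=COLUMNS, on_bad_lines="skip")
    # Raw records end with a trailing ';' on the z value; strip and cast.
    df["z"] = df["z"].astype(str).str.rstrip(";").astype(float)
    # Seven subjects are later dropped because they lack some activities
    # (see Section III-B).
    bad = [1616, 1618, 1637, 1638, 1639, 1640, 1642]
    return df[~df["subject"].isin(bad)]
```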
B. Preprocessing and Segmentation of the Data

In this work, only the smartwatch data included in the WISDM dataset was utilized. An exploratory data analysis of the sensor data found that the activity data from seven of the participants did not include all of the pre-determined activities, so the data from those seven subjects (subjects 1616, 1618, 1637, 1638, 1639, 1640 and 1642) were disregarded. As a result, smartwatch data from only 44 subjects was used. For processing the time-series data in the HAR problem, a 10-second sliding window with a 50% overlap was used to segment the data.
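Concretely, at the dataset's 20 Hz sampling rate a 10-second window spans 200 readings, and a 50% overlap means each new window starts 100 readings after the previous one. The sketch below illustrates this segmentation; the array names and the majority-vote labeling rule are assumptions, since the paper does not specify how windows are labeled.

```python
import numpy as np

def sliding_windows(signal: np.ndarray, labels: np.ndarray,
                    window: int = 200, step: int = 100):
    """Segment a (T, 6) sensor stream (3-axis accel + 3-axis gyro) into
    fixed windows: 200 samples = 10 s at 20 Hz; step 100 = 50% overlap."""
    X, y = [], []
    for start in range(0, len(signal) - window + 1, step):
        X.append(signal[start:start + window])
        # Label each window by the majority activity inside it (assumed rule).
        values, counts = np.unique(labels[start:start + window],
                                   return_counts=True)
        y.append(values[np.argmax(counts)])
    return np.stack(X), np.array(y)
```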
C. Hybrid Long Short-Term Memory Network

In this research, a hybrid LSTM network known as a 2-layer CNN-LSTM is proposed to improve recognition performance. The CNN-LSTM comprises two convolutional layers and a single LSTM layer. The input sensor data has size 200×6, and the output layer has size 18×1. Fig. 2 illustrates the CNN-LSTM's structure.

Fig. 2. The structure of the proposed CNN-LSTM network

To evaluate the proposed hybrid LSTM networks, evaluation metrics from the field of HAR are used: accuracy, precision, recall, and F-measure [30].
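To make the architecture concrete, the following Keras sketch instantiates a model consistent with this description and with the tuned values later reported in Table II. It is a plausible reading of Fig. 2 rather than the authors' released code; in particular, the exact placement of pooling and dropout is an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_lstm(n_classes: int = 18) -> tf.keras.Model:
    """2 convolutional layers + 1 LSTM layer, sized per Table II."""
    model = models.Sequential([
        tf.keras.Input(shape=(200, 6)),                # 10 s window, 6 channels
        layers.Conv1D(113, kernel_size=3, strides=1, activation="relu"),
        layers.Conv1D(123, kernel_size=3, strides=1, activation="relu"),
        layers.Dropout(0.06480703),
        layers.MaxPooling1D(pool_size=2),
        layers.LSTM(128),
        layers.Dropout(0.21129224),
        layers.Dense(458, activation="relu"),
        layers.Dense(n_classes, activation="softmax"), # 18 activities
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.00049271),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With 3-sample kernels and stride 1, the two convolutions shorten the 200-step window to 196 steps, which pooling halves to 98 steps before they are fed to the LSTM.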

TABLE I
PERFORMANCE METRICS OF THE DEEP LEARNING MODELS

Scenario             Accuracy   Precision   Recall   F-measure
CNN                  93.1%      93.1%       93.1%    93.1%
LSTM                 89.6%      89.7%       89.6%    89.6%
Proposed CNN-LSTM    96.2%      96.3%       96.2%    96.3%

TABLE II
SUMMARY OF THE HYPERPARAMETERS OF THE CNN-LSTM NETWORKS IN THIS RESEARCH

Stage          Hyperparameter     Value
Architecture   Convolution 1      Kernel Size 3, Stride 1, Filters 113
               Convolution 2      Kernel Size 3, Stride 1, Filters 123
               Dropout 1          0.06480703
               Maxpooling         2
               LSTM neurons       128
               Dropout 2          0.21129224
               Dense              458
Training       Optimizer          Adam
               Batch Size         64
               Learning Rate      0.00049271
               Number of Epochs   50
IV. EXPERIMENTS AND RESULTS

In this section, the experimental setting and the results employed for the evaluation of the proposed CNN-LSTM networks for smartwatch-based HAR are described.
A. Experiments

To compare the performance of the proposed CNN-LSTM networks with that of the other DL models, several model variations were used in the experiments. In the first experiment, a basic CNN network composed of only one convolutional layer working with Dropout and a single Dense layer was used. In the second experiment, a basic LSTM network referred to as a Vanilla LSTM, composed of LSTM layers working with Dropout and one Dense layer, was used. The third experiment used the proposed CNN-LSTM, an LSTM network in which convolutional layers are combined with an LSTM layer. Minimal sketches of the two baseline networks are given below.
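The following sketches, reusing the imports from the CNN-LSTM sketch above, match the stated depths of the two baselines; the layer widths and dropout rates shown are placeholders, as the paper does not report the baselines' tuned settings.

```python
def build_baseline_cnn(n_classes: int = 18) -> tf.keras.Model:
    # One Conv1D layer with Dropout and a single Dense head, per Section IV-A.
    return models.Sequential([
        tf.keras.Input(shape=(200, 6)),
        layers.Conv1D(64, kernel_size=3, activation="relu"),  # width assumed
        layers.Dropout(0.2),                                  # rate assumed
        layers.GlobalMaxPooling1D(),
        layers.Dense(n_classes, activation="softmax"),
    ])

def build_vanilla_lstm(n_classes: int = 18) -> tf.keras.Model:
    # LSTM layer(s) with Dropout and one Dense layer, per Section IV-A.
    return models.Sequential([
        tf.keras.Input(shape=(200, 6)),
        layers.LSTM(128),                                     # width assumed
        layers.Dropout(0.2),                                  # rate assumed
        layers.Dense(n_classes, activation="softmax"),
    ])
```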
The code for the experiments was written in Python 3.6.9, using the TensorFlow [31], Keras [32], scikit-learn [33], NumPy [34], and Pandas [35] libraries. The experiments were executed on the Google Colab platform with a Tesla K80 GPU, and the hyperparameters of each model were optimized with SigOpt [36].
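The paper tuned hyperparameters through SigOpt's commercial optimization service [36]. As an open-source stand-in for the same idea, the sketch below uses KerasTuner's Bayesian optimizer over illustrative search ranges; the actual ranges used in the study are not reported.

```python
import keras_tuner as kt

def build_tunable(hp: kt.HyperParameters) -> tf.keras.Model:
    """Search space mirroring the CNN-LSTM architecture (ranges assumed)."""
    model = models.Sequential([
        tf.keras.Input(shape=(200, 6)),
        layers.Conv1D(hp.Int("filters_1", 32, 128), 3, activation="relu"),
        layers.Conv1D(hp.Int("filters_2", 32, 128), 3, activation="relu"),
        layers.Dropout(hp.Float("dropout_1", 0.0, 0.5)),
        layers.MaxPooling1D(2),
        layers.LSTM(hp.Int("lstm_units", 64, 256, step=64)),
        layers.Dropout(hp.Float("dropout_2", 0.0, 0.5)),
        layers.Dense(hp.Int("dense_units", 128, 512), activation="relu"),
        layers.Dense(18, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.BayesianOptimization(build_tunable, objective="val_accuracy",
                                max_trials=30, overwrite=True)
# tuner.search(X_train, y_train, epochs=50, batch_size=64, validation_split=0.2)
```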
B. Experimental Results

To evaluate the performance of the smartwatch-based HAR, three experiments on the different model variations were performed using the WISDM dataset, as described in Section III. The smartwatch sensor data was divided into 70% training data and 30% testing data, which resulted in 41,440 and 17,761 samples, respectively. Experiments were conducted to evaluate the recognition performance of the DL networks with a variety of metrics, including accuracy, precision, recall, and F-measure. Table I shows the accuracy and the other metrics obtained from the various DL networks trained on the WISDM dataset.
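A minimal evaluation sketch with scikit-learn, matching the four reported metrics and continuing the names from the earlier sketches, is shown below; macro averaging is assumed, since the paper does not state its averaging scheme.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_pred = model.predict(X_test).argmax(axis=1)   # class with highest softmax
acc = accuracy_score(y_test, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro")            # averaging scheme assumed
cm = confusion_matrix(y_test, y_pred)           # basis for Fig. 3
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")
```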
As seen in Table I, the proposed CNN-LSTM was tuned by Bayesian optimization in order to find a set of hyperparameters that provides high performance. The resulting hyperparameters are summarized in Table II.

It can be seen that the proposed CNN-LSTM networks outperform all of the other networks, with an accuracy of 96.2% and an F-measure of 96.3%. Therefore, the performance of the CNN-LSTM is better than that of the baseline DL models. The confusion matrix for the CNN-LSTM networks is shown in Fig. 3.

Fig. 3. Confusion Matrix for the proposed CNN-LSTM

V. CONCLUSION AND FUTURE WORK

In this work, we proposed a CNN-LSTM network to tackle the smartwatch-based HAR problem. This hybrid LSTM takes advantage of both the spatial feature extraction of the CNN and the temporal feature extraction of the LSTM. We also used Bayesian optimization to find optimized hyperparameters for the model in order to achieve better HAR performance. We evaluated the performance of this CNN-LSTM network with various metrics on a public dataset known as WISDM. The results show that the proposed hybrid LSTM outperforms the other baseline networks by utilizing automatic spatial-temporal feature extraction from the raw sensor data, achieving an average accuracy of 96.2% and an F-measure of 96.3%. In future research, we shall develop this model for personalized human activity recognition using a transfer learning approach based on smartwatch sensors.

ACKNOWLEDGMENT

The authors thank the SigOpt team for the optimization services provided.

REFERENCES

[1] C. Jobanputra, J. Bavishi, and N. Doshi, "Human activity recognition: A survey," Procedia Computer Science, vol. 155, pp. 698–703, 2019.
[2] L. Liu, C. Karatas, H. Li, S. Tan, M. Gruteser, J. Yang, Y. Chen, and R. P. Martin, "Toward detection of unsafe driving with wearables," in Proceedings of the 2015 Workshop on Wearable Systems and Applications (WearSys '15), New York, NY, USA: Association for Computing Machinery, 2015, pp. 27–32.
[3] S. Mekruksavanich and A. Jitpattanakul, "Classification of gait pattern with wearable sensing data," in 2019 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON), 2019, pp. 137–141.
[4] L. M. Schrader, A. V. Toro, S. G. A. Konietzny, S. Rüping, B. Schäpers, M. Steinböck, C. Krewer, F. Mueller, J. Güttler, and T. Bock, "Advanced sensing and human activity recognition in early intervention and rehabilitation of elderly people," Journal of Population Ageing, vol. 13, pp. 139–165, 2020.
[5] S. Mekruksavanich, "Medical expert system based ontology for diabetes disease diagnosis," in 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), 2016, pp. 383–389.
[6] S. Mekruksavanich and A. Jitpattanakul, "Exercise activity recognition with surface electromyography sensor using machine learning approach," in 2020 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON), 2020, pp. 75–78.
[7] L. Porzi, S. Messelodi, C. M. Modena, and E. Ricci, "A smart watch-based gesture recognition system for assisting people with visual impairments," in Proceedings of the 3rd ACM International Workshop on Interactive Multimedia on Mobile and Portable Devices (IMMPD '13), New York, NY, USA: Association for Computing Machinery, 2013, pp. 19–24.
[8] O. C. Ann and L. B. Theng, "Human activity recognition: A review," in 2014 IEEE International Conference on Control System, Computing and Engineering (ICCSCE 2014), 2014, pp. 389–393.
[9] S. Ramasamy Ramamurthy and N. Roy, "Recent trends in machine learning for human activity recognition: A survey," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, p. e1254, 2018.
[10] F. Attal, S. Mohammed, M. Dedabrishvili, F. Chamroukhi, L. Oukhellou, and Y. Amirat, "Physical human activity recognition using wearable sensors," Sensors, vol. 15, pp. 31314–31338, 2015.
[11] M. Mario, "Human activity recognition based on single sensor square HV acceleration images and convolutional neural networks," IEEE Sensors Journal, vol. 19, no. 4, pp. 1487–1498, 2019.
[12] Y. Zhang, Z. Zhang, Y. Zhang, J. Bao, Y. Zhang, and H. Deng, "Human activity recognition based on motion sensor using U-Net," IEEE Access, vol. 7, pp. 75213–75226, 2019.
[13] U. Alrazzak and B. Alhalabi, "A survey on human activity recognition using accelerometer sensor," in 2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), 2019, pp. 152–159.
[14] N. Zouba, F. Bremond, and M. Thonnat, "An activity monitoring system for real elderly at home: Validation study," in 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2010, pp. 278–285.
[15] N. Hnoohom, S. Mekruksavanich, and A. Jitpattanakul, "Human activity recognition using triaxial acceleration data from smartphone and ensemble learning," in 2017 13th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), 2017, pp. 408–412.
[16] N. Ahmed, Rafiq, and Islam, "Enhanced human activity recognition based on smartphone sensor data using hybrid feature selection model," Sensors, vol. 20, p. 317, 2020.
[17] S. Mekruksavanich, N. Hnoohom, and A. Jitpattanakul, "Smartwatch-based sitting detection with human activity recognition for office workers syndrome," in 2018 International ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI-NCON), 2018, pp. 160–164.
[18] J. Yang, K. Cheng, J. Chen, B. Zhou, and Q. Li, "Smartphones based online activity recognition for indoor localization using deep convolutional neural network," in 2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services (UPINLBS), 2018, pp. 1–7.
[19] H. N. Monday, J. Ping Li, G. U. Nneji, M. Folarin Raji, C. C. Ukwuoma, J. Khan, C. J. Ejiyi, A. Ulhaq, S. Nahar, H. S. Abubakar, and C. A. Ijeoma, "A survey on hand-based behavioral activity recognition," in 2019 16th International Computer Conference on Wavelet Active Media Technology and Information Processing, 2019, pp. 119–123.
[20] R. Singh and R. Srivastava, "Some contemporary approaches for human activity recognition: A survey," in 2020 International Conference on Power Electronics & IoT Applications in Renewable Energy and its Control (PARC), 2020, pp. 544–548.
[21] G. M. Weiss, K. Yoneda, and T. Hayajneh, "Smartphone and smartwatch-based biometrics using activities of daily living," IEEE Access, vol. 7, pp. 133190–133202, 2019.
[22] J. Wang, Y. Chen, S. Hao, X. Peng, and L. Hu, "Deep learning for sensor-based activity recognition: A survey," Pattern Recognition Letters, vol. 119, pp. 3–11, 2019.
[23] A. Baldominos, A. Cervantes, Y. Sáez, and P. Isasi, "A comparison of machine learning and deep learning techniques for activity recognition using mobile devices," Sensors, vol. 19, p. 521, 2019.
[24] P. P. San, P. Kakar, X.-L. Li, S. Krishnaswamy, J.-B. Yang, and M. N. Nguyen, "Deep learning for human activity recognition," in Big Data Analytics for Sensor-Network Collected Intelligence (Intelligent Data-Centric Systems), H.-H. Hsu, C.-Y. Chang, and C.-H. Hsu, Eds. Academic Press, 2017, ch. 9, pp. 186–204.
[25] S. Wan, L. Qi, X. Xu, C. Tong, and Z. Gu, "Deep learning models for real-time human activity recognition with smartphones," Mobile Networks and Applications, vol. 25, 2019.
[26] A. Murad and J.-Y. Pyun, "Deep recurrent neural networks for human activity recognition," Sensors, vol. 17, p. 2556, 2017.
[27] Song-Mi Lee, Sang Min Yoon, and Heeryon Cho, "Human activity recognition from accelerometer data using convolutional neural network," in 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), 2017, pp. 131–134.
[28] N. Tüfek and O. Özkaya, "A comparative research on human activity recognition using deep learning," in 2019 27th Signal Processing and Communications Applications Conference (SIU), 2019, pp. 1–4.
[29] M. Devanne, P. Papadakis, and S. M. Nguyen, "Recognition of activities of daily living via hierarchical long-short term memory networks," in 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), 2019, pp. 3318–3324.
[30] D. Powers, "Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation," J. Mach. Learn. Technol., vol. 2, pp. 2229–3981, 2011.
[31] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.
[32] F. Chollet, "Keras," https://github.com/fchollet/keras, 2015.
[33] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, and G. Louppe, "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research, vol. 12, 2012.
[34] S. van der Walt, S. Colbert, and G. Varoquaux, "The NumPy array: A structure for efficient numerical computation," Computing in Science and Engineering, vol. 13, pp. 22–30, 2011.
[35] W. McKinney, "Data structures for statistical computing in Python," in Proceedings of the 9th Python in Science Conference, 2010.
[36] I. Dewancker, M. McCourt, S. Clark, P. Hayes, A. Johnson, and G. Ke, "A stratified analysis of Bayesian optimization methods," ArXiv, vol. abs/1603.09441, 2016.
