Applied Intelligence

https://doi.org/10.1007/s10489-020-02134-z

Sequential neural networks for multi-resident activity recognition in ambient sensing smart homes

Anubhav Natani1 · Abhishek Sharma1 · Thinagaran Perumal2

Accepted: 9 December 2020


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC part of Springer Nature 2021

Abstract
Advances in smart home technology and IoT devices have made it possible to monitor human activities in a non-intrusive
way. This data, in turn, enables us to predict the health status and energy consumption patterns of the residents of these smart
homes. Machine learning has been an excellent tool for predicting human activities from the raw sensor data of a single
resident. However, multi-resident activity recognition is still a challenging task, as there is no direct correspondence between
sensor values and the activities of individual residents. In this paper, we apply deep learning algorithms to the real-world ARAS
multi-resident data set, which consists of data from two houses, each with two residents. We use different variations of Recurrent
Neural Network (RNN), Convolutional Neural Network (CNN), and their combination on the data set and keep the labels
separate for both residents. We evaluate the performance of the models based on several metrics.

Keywords Activities of Daily Life (ADL) · Multi resident · Neural networks · Sequential neural networks · Deep learning ·
Human activity recognition

Abhishek Sharma
[email protected]

Thinagaran Perumal
[email protected]

Anubhav Natani
[email protected]

1 Electronics and Communication, The LNM Institute of Information Technology, Jaipur, Rajasthan, 302031, India
2 Computer Science Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia

1 Introduction

With the help of advancements in machine learning and IoT, smart homes are becoming more and more capable and accessible. In these advanced smart homes, we can build systems to monitor human activities, which helps in predicting the health status [1] of residents and in identifying their energy usage patterns. This, in turn, supports the analysis of efficient energy consumption and makes it possible to assist residents when required.

In the past, different types of sensors have been used to collect data for human activity recognition. These include vision-based, wearable and ambient sensors. There are pros and cons to each of these sensors. For example, activity recognition can be performed with very high accuracy with the help of data from vision sensors [2], but residents might not want to be watched all the time, so this method raises privacy issues. Similarly, although wearable sensors [3] are less invasive, some residents might find it uncomfortable to wear them all the time. Ambient sensors are less intrusive and do not hinder user privacy, as they are mainly environment-based passive sensors embedded in the smart home; residents also do not have to interact with these sensors explicitly. In this paper, we showcase the use of ambient sensors for human activity recognition. The ambient sensors used for the collection of the data set include Contact Sensor, Force Sensor, Photocell, Pressure Mat, and Sonar Distance [4].

Previous research in the field of human activity recognition mostly focused on single-resident activity recognition, which is simpler, but in the real world homes do not always have a single resident. That is why smart homes should support activity recognition for multiple residents. Multi-resident activity recognition is a more complex task, as sensor states do not directly reflect the activity of a specific person; they represent information about the joint activities of the users.

In recent works, various techniques have been used to overcome this obstacle. These techniques include data-association

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


A. Natani et al.

and the use of temporal approaches for activity recognition; sequential models such as HMM [5] and CRF [6] are also widely used in this area. Other works showcase the use of IDTs [7] and other machine learning models like random forests [8] for the task of multi-resident activity recognition. In this paper, we examine the use of different kinds of neural networks for the task of activity recognition, namely Multi-Layer Perceptron (MLP) [9], Recurrent Neural Networks (RNN) [10], Convolutional Neural Network (CNN) [11], and the combination of CNN and RNN. We then investigate the impact of data set size on the accuracy and training time of each type of neural network for the task of human activity recognition. This paper focuses on treating the labels of each resident separately instead of combining them into a single label. This approach scales better in real-world situations: as the number of residents in the smart home increases or decreases, only the last layer of the neural network needs to be changed.

We have applied the mentioned approaches to the ARAS data set [12], which consists of data from two smart homes, denoted as A and B. The data was collected for 30 days with the help of 20 ambient sensors, and 27 different activities were performed by two residents. The results show that the combination of RNN and CNN for one-dimensional data performs consistently, giving excellent results with less variance. This work helps in understanding the performance of different kinds of neural networks for the task of human activity recognition and suggests the best possible methods that can be used for making better recognition systems.

2 Related work

In recent years, a lot of research work has been done in the field of human activity recognition. In this section, we first discuss the types of sensors used in collecting the data for human activity recognition, and then we discuss the different approaches used to predict activities from the collected data.

2.1 Types of sensors

Different types of sensors are used for the collection of data for activity recognition. These include wearable sensors [3, 13], which comprise sensors like accelerometers and gyroscopes [14], employed for the collection of data related to the acceleration and rotation of the subject. The main issue with this type of sensor is that it is mostly limited to physical movements like running, walking and playing a sport. Furthermore, subjects do not feel comfortable wearing them all the time. Another type of sensor is the vision-based sensor [2, 15], which uses images and videos captured from a camera for activity recognition. The data collected from these sensors can be used for determining the presence and orientation of the subject in the environment. These types of sensors are more effective in human activity recognition, but they are very complicated and costly and also raise a severe concern about the subject's privacy. Other types of sensors include ambient sensors [4], which are passive sensors embedded in the smart environment. They comprise many types of sensors, including Photocell, Contact Sensor, Pressure Mat, and other sensors used for collecting various types of data related to the environment and how subjects interact with it [12, 16]. These sensors are less intrusive, as they do not interfere with the subject's privacy, and they do not cause any discomfort to the subject, as they are passively embedded in the smart environment.

2.2 Types of approaches

Many types of approaches are used for human activity recognition. These mainly include logic-based approaches and machine learning-based approaches. In this paper, we have used deep learning for human activity recognition, which comes under machine learning-based approaches.

Firstly, logic-based approaches involve logic-based context models, where we define the context using expressions and use rules to describe the relationships and constraints [17]. Shet et al. [18] proposed a framework integrating computer vision algorithms with logic programming to describe and identify video surveillance activities in a parking lot. Other human activity recognition systems were proposed in [19]; these systems are based on an Event Calculus logic programming implementation [20].

Secondly, approaches based on machine learning involve the use of several machine learning algorithms for human activity recognition. Earlier works show the use of algorithms like naive Bayes [21], kernel methods like SVM [22], decision trees like incremental decision trees [7], ensembles like random forest [8] and clustering algorithms [23]. There is also much research involving the use of graphical models like Hidden Markov Models [5, 24, 25], Conditional Random Fields [6, 26, 27], Gaussian Mixture Models [28], and Dynamic Bayesian Networks [29]. Due to the complexity of the problem of multiple residents, most of the previous works have used graphical models for multi-resident activity recognition. Models like the Hidden Markov Model are widely used for tasks where multiple activity labels can be combined in a single label to be used by the HMM [30].

Activity recognition is much more complicated in the multi-resident setting [31], where sensor states do not directly reflect a particular resident's activity. Much of the research discussed earlier focuses on the



Sequential neural networks for multi-resident activity recognition in ambient...

activity recognition of a single resident. However, recent works show the use of ambient sensors for human activity recognition. These sensors can capture a wide range of activities like sleeping, watching television, cooking, or talking [12]. As these sensors collect only one value per time step for the whole environment, the data does not distinguish between the activities of the two residents. Tran et al. [30] showcased the use of different types of HMMs, like factorial HMM, with different types of labelling strategy for the task of multi-resident activity recognition; their work also includes the use of CRF for the same task, where they showcased the use of simple CRF and factorial CRF for modelling multi-resident activity recognition. Lastly, Al Machot et al. showcased how human activity can be detected from sensor data streams [32].

Deep learning has been used for activity recognition [33–36] in many ways, like using Deep Convolutional Neural Networks for activity recognition from RFID data [37]. Convolutional Neural Networks are also used to identify activities from 3D data collected using corresponding depth sensors [38]. Other works show the use of Recurrent Neural Networks for the task of activity recognition [10]. Liciotti et al. [39] showcased the use of different types of LSTM for human activity recognition on the CASAS dataset [16]. In this paper, we showcase the use of different types of neural networks for the task of activity recognition. We have used Multi-Layer Perceptron (MLP), Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and the combination of RNNs and CNNs for human activity recognition. These neural networks perform well in the scenario of a multi-resident environment-based dataset.

3 Technical approach

Let us denote the activities of the residents at a particular time $t$ by $A^{(t)}$ and the sensor states at that time by $s^{(t)}$. Here $A^{(t)} = \{A^{(1,t)}, A^{(2,t)}\}$, i.e. the activities performed by the two residents, and $s^{(t)} = \{s^{(1,t)}, s^{(2,t)}, \ldots, s^{(n,t)}\}$, where $n$ is the total number of sensors. We use $t_1{:}t_2$ to denote the interval between time steps $t_1$ and $t_2$. Hence, $A^{(t_1:t_2)}$ means the activities between $t_1$ and $t_2$, and $s^{(t_1:t_2)}$ means the sensor states between $t_1$ and $t_2$. Finally, for prediction we use the sensor readings $s^{(t_1:t_2)}$ to predict the activities $A^{(t_1:t_2)}$.

We have used different types of neural networks to approach the task of multi-resident activity recognition, which we will elaborate one by one.

Figure 1 shows the workflow of the experiment. First, we took the raw data and did feature extraction, then divided it into train and test sets of different sizes, and we trained our model on the train set. Lastly, we measured the performance of the model using various metrics on the test set.

Figure 2 shows the model selection process. First, we divide the training dataset into validation data and train data; then we train our model on the train data, validate it on the validation data, and save the model for later inference.

3.1 Multi-Layer perceptron approach

The Multi-Layer perceptron is the simplest neural network, consisting of one input layer, any number of hidden layers and an output layer. In this case, we have modified the multi-layer perceptron to have two output layers instead of one, to accommodate two residents. By doing this we can avoid the multi-label approach [40] or the combined label approach [41], which were used earlier for multi-resident activity modeling. We can use the multi-layer perceptron to model activities as

$$X_1 = g_1(W^{(1)} \cdot s^{(t)} + b^{(1)}) \tag{1}$$

$$X_2 = g_1(W^{(2)} \cdot X_1 + b^{(2)}) \tag{2}$$

$$Y_a = g_2(W^{(3a)} \cdot X_2 + b^{(3a)}) \tag{3}$$

$$Y_b = g_2(W^{(3b)} \cdot X_2 + b^{(3b)}) \tag{4}$$

In (1)–(4), the weight matrix of the nth layer is denoted by $W^{(n)}$, the sensor states at time $t$, which are given as input to the model, are denoted by $s^{(t)}$, the bias is denoted by $b$, and $g_1$, $g_2$ specify the activation functions. $X_1$ denotes the output from the first layer, $X_2$ denotes the output from the second layer, $Y_a$ denotes the output for resident A and $Y_b$ denotes the output for resident B.

We can make several modifications to a multi-layer perceptron network by increasing the number of layers and the number of neurons to tune its performance; here we have three layers with 128, 64 and 28 neurons. We have used a rather simple multi-layer perceptron with few layers to model activities, as increasing the number of layers resulted in more training time with no significant gains in accuracy.

One of the benefits of using this network is that it can be trained in very little time compared to other networks, but this comes at the cost of accuracy, as it does not capture sequential (i.e. previous) information and also does not make use of spatial information. That is why it performs worse in the task of activity recognition.

Figure 3 shows the model architecture for the multi-layer perceptron network. We have used two hidden layers, which contain 128 and 64 neurons, and we have two output layers, one for each resident, each having 28 neurons representing each activity. We have used standard cross entropy loss to train the neural network.
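Equations (1)–(4) can be sketched in plain Python as a forward pass with a shared trunk and one output head per resident. This is a minimal illustration under stated assumptions, not the authors' implementation: the choice of ReLU for $g_1$ and softmax for $g_2$, the random initialisation, and the helper names (`dense`, `init_layer`, `two_head_mlp`) are hypothetical; the layer sizes (20 sensor inputs, 128 and 64 hidden neurons, 28 outputs per resident) follow the text.

```python
import math
import random


def dense(weights, bias, x):
    """Affine map W.x + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def two_head_mlp(s_t, params):
    x1 = relu(dense(*params["h1"], s_t))         # Eq. (1), g1 assumed to be ReLU
    x2 = relu(dense(*params["h2"], x1))          # Eq. (2)
    y_a = softmax(dense(*params["out_a"], x2))   # Eq. (3), g2 assumed to be softmax
    y_b = softmax(dense(*params["out_b"], x2))   # Eq. (4)
    return y_a, y_b


rng = random.Random(0)
params = {
    "h1": init_layer(128, 20, rng),     # 20 binary sensor inputs
    "h2": init_layer(64, 128, rng),
    "out_a": init_layer(28, 64, rng),   # head for resident A
    "out_b": init_layer(28, 64, rng),   # head for resident B
}
s_t = [rng.randint(0, 1) for _ in range(20)]   # one made-up sensor state vector
y_a, y_b = two_head_mlp(s_t, params)
```

Because both heads read the same $X_2$, supporting a different number of residents would only mean adding or removing output layers, which is the scaling argument made in the introduction.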


Fig. 1 Process Workflow

3.2 Convolutional neural network based approach

Convolutional Neural Networks (CNN) [42] are generally used to map image data to an output variable. The benefit of using a CNN is its ability to make use of spatial information. It directly takes a two-dimensional array as input and contains convolutional layers as hidden layers. We have used two types of convolutional neural network to model activities as feature maps.

$$X_1 = g_1(W^{(1)} * s^{(t)} + b^{(1)}) \tag{5}$$

$$X_2 = g_1(W^{(2)} * X_1 + b^{(2)}) \tag{6}$$

$$X_3 = g_1(W^{(3)} \cdot X_2 + b^{(3)}) \tag{7}$$

$$Y_a = g_2(W^{(4a)} \cdot X_3 + b^{(4a)}) \tag{8}$$

$$Y_b = g_2(W^{(4b)} \cdot X_3 + b^{(4b)}) \tag{9}$$

In (5) and (6), the convolution operation $*$ is performed on the input to capture the spatial information; this generates the output for the next layer. Similar to the previous model, the weight matrix of the nth layer is denoted by $W^{(n)}$, the sensor states at time $t$, which are given as input to the model, are denoted by $s^{(t)}$, the bias is denoted by $b$, and $g_1$, $g_2$ specify the activation functions. $X_n$ denotes the output from the nth layer, $Y_a$ denotes the output for resident A and $Y_b$ denotes the output for resident B. Equations (7)–(9) are similar to the previous model.

Figure 4 shows the model architecture for the one-dimensional Convolutional Neural Network. We have used two one-dimensional convolutional layers, giving out 40 and 80 channels, with a kernel of size two, followed by a fully connected layer having 128 neurons; we also used


Fig. 2 Model Selection Process

Fig. 3 Model architecture for Multi-Layer Perceptron Network


Fig. 4 Model architecture for convolutional neural network for one-dimension

ReLU activation in all the layers. We have two output layers, one for each resident, each having 28 neurons representing each activity, followed by a softmax activation function.

Figure 5 shows the mapping of one-dimensional sensor states to two-dimensional feature maps.

Figure 6 shows the model architecture for the two-dimensional Convolutional Neural Network. We have used two two-dimensional convolutional layers, giving out 6 and 12 channels, with a kernel of size two, followed by a fully connected layer having 64 neurons; we also used ReLU activation. We have two output layers, one for each resident, each having 28 neurons representing each activity. The input to the network is in the form of a feature map, which is shown in Fig. 5.

Fig. 5 Feature Maps

3.3 Recurrent neural networks based approach

Recurrent Neural Networks (RNN) [43, 44] use past information to determine the output, because in an RNN the current cell state is influenced by the previous cell state. In this paper, we have used three different types of RNN architectures: Elman RNN, Gated Recurrent Units (GRU), and Long Short-Term Memory networks (LSTM). Elman RNN is the most straightforward RNN architecture, where the output of the current state is generated by combining the weighted hidden value from the previous state with the weighted input to the current state and passing the result through a tanh activation function. These kinds of RNNs are unable to learn long-term dependencies and suffer from vanishing gradient issues. In contrast, GRUs have additional update and reset gates, which help to tackle the vanishing gradient problem. In a GRU, the update gate determines the amount of knowledge from the previous hidden state to be passed on, and the reset gate determines the amount of prior knowledge to forget. Lastly, LSTM has an additional forget gate, input gate and output gate, which make it capable of learning long-term dependencies. In LSTM, the forget gate determines which information to reject from the previous hidden state, the input gate determines which information will be updated, and the output gate determines which part of the information will go to the next hidden state. Similar to MLP and CNN, we have multiple output layers for each type of RNN.


Fig. 6 Model architecture for convolutional neural network for two-dimension

   
Figure 7 shows the model architecture for the different types of Recurrent Neural Network. In the first layer, we have 256 recurrent units, followed by a fully connected layer having 128 neurons, and we have two output layers, one for each resident, each having 28 neurons representing each activity.

3.3.1 For Elman RNN

At each time stamp, the input for the current cell $s_t$ and the hidden state from the previous cell $h_{t-1}$ are used to determine the output $a_t$ for the current cell.

$$a_t = W \cdot h_{t-1} + U \cdot s_t + b \tag{10}$$

$$h_t = \tanh(a_t) \tag{11}$$

$$A_t = V \cdot h_t + c \tag{12}$$

In (10), $W$ and $U$ denote the weight matrices, $b$ denotes the bias, $h_{t-1}$ is the hidden state from the previous cell and $s_t$ denotes the sensor state. In (11), $h_t$ is the hidden state of the current cell, calculated by passing the output $a_t$ into a tanh activation function. Lastly, in (12), $A_t$ denotes the output of the current RNN cell, which is calculated from the current hidden state $h_t$, the weight matrix $V$ and the bias $c$. Elman RNN networks suffer from the vanishing gradient problem, and it is hard to learn long-term dependencies with these kinds of networks.

3.3.2 For LSTM

Using LSTM has some advantages over Elman RNN, as it is far less affected by the vanishing gradient problem and can be used to learn long-term dependencies. The input and forget gates of the LSTM accommodate the previous state of the memory cell $C_{t-1}$, whereas in the Elman RNN the entire state is replaced at each step.

$$f_t = \sigma(W_f \cdot [h_{t-1}, s_t] + b_f) \tag{13}$$

$$i_t = \sigma(W_i \cdot [h_{t-1}, s_t] + b_i) \tag{14}$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, s_t] + b_C) \tag{15}$$

$$C_t = f_t \circ C_{t-1} + i_t \circ \tilde{C}_t \tag{16}$$

$$o_t = \sigma(W_o \cdot [h_{t-1}, s_t] + b_o) \tag{17}$$

$$h_t = o_t \circ \tanh(C_t) \tag{18}$$

Equation (13) helps us decide which information to throw away from the previous cell state; this layer is called the forget gate layer. It uses the hidden state $h_{t-1}$ from the previous cell and the input to the current cell $s_t$ to determine the forget coefficient $f_t$, which determines the amount of information to forget. Equation (14) is called the input gate, which helps us decide which values to update. It gives us an update coefficient $i_t$, which determines the amount of information to be updated. In (15) we create a vector of new candidate values $\tilde{C}_t$ that could be added to the state. In (16) we determine the new cell state by multiplying the old cell state $C_{t-1}$ with the forget coefficient, multiplying the candidate value vector $\tilde{C}_t$ with the input coefficient, and adding the two. Equation (17) helps us decide which part of the cell state will go to the hidden state, i.e. the output of the cell ($o_t$). The cell hidden state $h_t$ is obtained by passing the cell state through a tanh activation function and multiplying it with the output gate value (18).

3.3.3 For GRU

Similar to LSTM, a GRU can also use the hidden state of the previous cell and gives the same advantages as LSTM over Elman RNN, with the added advantage of a shorter training time.

$$z_t = \sigma(W_z \cdot s_t + U_z \cdot h_{t-1}) \tag{19}$$

$$r_t = \sigma(W_r \cdot s_t + U_r \cdot h_{t-1}) \tag{20}$$

$$\tilde{h}_t = \tanh(W \cdot s_t + r_t \circ (U \cdot h_{t-1})) \tag{21}$$

$$h_t = z_t \circ h_{t-1} + (1 - z_t) \circ \tilde{h}_t \tag{22}$$
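The GRU update in (19)–(22) can be sketched in plain Python as a single step function. This is an illustrative sketch rather than the authors' code: the helper names (`matvec`, `gru_step`, `random_matrix`), the random weights, and the hidden size of 4 are assumptions chosen to keep the example small (the text uses 256 recurrent units), and the candidate state is written with the sensor input $s_t$, matching (19) and (20).

```python
import math
import random


def matvec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(s_t, h_prev, p):
    """One GRU step; p maps the names Wz, Uz, Wr, Ur, W, U to weight matrices."""
    ws_z, uh_z = matvec(p["Wz"], s_t), matvec(p["Uz"], h_prev)
    z_t = [sigmoid(a + b) for a, b in zip(ws_z, uh_z)]                 # Eq. (19)
    ws_r, uh_r = matvec(p["Wr"], s_t), matvec(p["Ur"], h_prev)
    r_t = [sigmoid(a + b) for a, b in zip(ws_r, uh_r)]                 # Eq. (20)
    ws, uh = matvec(p["W"], s_t), matvec(p["U"], h_prev)
    h_cand = [math.tanh(a + r * b) for a, r, b in zip(ws, r_t, uh)]    # Eq. (21)
    # Eq. (22): blend the previous state and the candidate state.
    return [z * hp + (1.0 - z) * hc for z, hp, hc in zip(z_t, h_prev, h_cand)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_sensors, n_hidden = 20, 4   # hidden size kept tiny for illustration
params = {name: random_matrix(n_hidden, n_sensors, rng) for name in ("Wz", "Wr", "W")}
params.update({name: random_matrix(n_hidden, n_hidden, rng) for name in ("Uz", "Ur", "U")})

h = [0.0] * n_hidden
for _ in range(3):            # three made-up time steps of binary sensor states
    s_t = [rng.randint(0, 1) for _ in range(n_sensors)]
    h = gru_step(s_t, h, params)
```

With all weights set to zero, both gates evaluate to 0.5 and the candidate state to 0, so the step simply halves the previous hidden state; this makes the role of $z_t$ as a blending coefficient in (22) easy to check by hand.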


Fig. 7 Model architecture for Recurrent Neural Network

Equation (19) is called the update gate equation, which determines how much past information needs to be passed along to the future. The update coefficient $z_t$ is determined with the help of the current input $s_t$ and the previous cell's hidden state $h_{t-1}$. Equation (20) is called the reset gate equation, which is used to determine how much past information to forget. The reset coefficient $r_t$ is calculated from the current input and the previous hidden state. Lastly, (21)–(22) are used to determine the hidden state $h_t$ and mitigate the vanishing gradient problem.

These networks work very well on the dataset, with GRU beating all of the previous networks in terms of accuracy, but they fall behind the combination of CNN and RNN, as they do not make use of the spatial information, which can increase accuracy significantly, as we have seen with CNNs.

3.4 Combination of CNN and RNN

These networks are a combination of convolutional neural networks and recurrent neural networks [45]. They can take advantage of both the RNN and the CNN, as they can use the spatial information as well as past information for determining the output. For the sequential part of these networks, we have used LSTM, as it worked best during our testing.

These networks gave us the best accuracy results on both datasets. We have used two versions of these networks: one which utilizes a CNN that works with one-dimensional data, and another that works with two-dimensional data.

Figure 8 shows the model architecture for the combination of the one-dimensional CNN and LSTM networks. It has the same layers as the one-dimensional CNN (Fig. 4), followed by a fully connected layer with 64 neurons, followed by a layer having 256 LSTM units, which is followed by a fully connected layer having 128 neurons. We also used ReLU activation, and the network has two output layers for the two residents, the same as the earlier networks.

Figure 9 shows the model architecture for the combination of the two-dimensional CNN and LSTM networks. It has the same layers as the two-dimensional CNN (Fig. 6), followed by a fully connected layer with 64 neurons, followed by a layer having 256 LSTM units, which is followed by a fully connected layer having 128 neurons. We also used ReLU activation, and the network has two output layers for the two residents, the same as the earlier networks.


Fig. 8 Model architecture for combination of CNN for one dimension and LSTM networks

4 Dataset

We conducted our analysis on the ARAS multi-resident dataset [12]. The ARAS dataset was collected in two houses, House A and House B, for 30 days. In these homes, 20 different sensors were placed at different locations. The residents living in House A are two males with an average age of 25, whereas the residents living in House B are a married couple with an average age of 34. In both houses, each resident was asked to perform 27 different activities. The ARAS dataset features are binary sensor readings, denoted by 1 when the sensor is activated and 0 when it is deactivated. Each time stamp has two outputs, i.e. the activities performed by Resident A and Resident B. In this paper, we have modeled each output separately, as discussed in previous sections.

5 Evaluation metrics

5.1 Accuracy

Accuracy determines the percentage of activities correctly predicted by our model. We denote the test set by $D_{test}$. We also denote the ground truth value of a resident $r$ at time stamp $t$ by $A_{r,t}$ and the predicted value of that resident at time $t$ by $a_{r,t}$. The model is evaluated on the basis of the accuracy of each resident's activity (23).

$$accuracy_r = \frac{1}{|D_{test}|} \sum_{a_{r,1:T} \in D_{test}} \frac{1}{T} \sum_{t} f(A_{r,t}, a_{r,t}) \tag{23}$$

In (23), the value of the function $f$ is 1 if $A_{r,t}$ is equal to $a_{r,t}$; otherwise it is equal to 0.

5.2 True positive

A true positive occurs when the model correctly predicts the positive class.

5.3 True negative

A true negative occurs when the model correctly predicts the negative class.

5.4 False positive

A false positive occurs when the model incorrectly predicts the positive class.

5.5 False negative

A false negative occurs when the model incorrectly predicts the negative class.

5.6 Precision

Precision determines the percentage of positive predictions that are correct.

$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{24}$$

5.7 Recall

Recall determines the percentage of actual positives that were identified correctly.

$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{25}$$

5.8 F Score

F-Score is the harmonic mean of precision and recall.

$$\text{F-Score} = \frac{2 \cdot (\text{Precision} \cdot \text{Recall})}{\text{Precision} + \text{Recall}} \tag{26}$$
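The metrics in (23)–(26) can be sketched in plain Python for one resident. This is a minimal sketch under stated assumptions: the function names and the toy label sequences are hypothetical, accuracy is computed over a single label sequence rather than averaged over several test sequences as in (23), and returning 0.0 when a denominator is zero is a convention chosen here, not something the text specifies.

```python
def accuracy(y_true, y_pred):
    """Fraction of time stamps where the predicted activity equals the ground truth."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_fscore(y_true, y_pred, label):
    """Per-activity precision, recall and F-score, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)   # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)   # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0               # Eq. (24)
    recall = tp / (tp + fn) if tp + fn else 0.0                  # Eq. (25)
    denom = precision + recall
    fscore = 2 * precision * recall / denom if denom else 0.0    # Eq. (26)
    return precision, recall, fscore


# Made-up activity labels for one resident over six time stamps.
y_true = [2, 2, 11, 11, 11, 27]
y_pred = [2, 11, 11, 11, 27, 27]

acc = accuracy(y_true, y_pred)
pr_11, re_11, fs_11 = precision_recall_fscore(y_true, y_pred, label=11)
pr_27, re_27, fs_27 = precision_recall_fscore(y_true, y_pred, label=27)
```

Calling `precision_recall_fscore` once per activity number gives per-activity PR, RE and FS values of the kind reported in the results tables.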


Fig. 9 Model architecture for combination of CNN for two dimension and LSTM networks

6 Results

We have split the ARAS dataset in two ways for analysis. First, we considered 10 days' worth of data, of which we used 8 days for training, 1 day for validation and 1 day for testing. Secondly, we considered the whole ARAS dataset, i.e. 30 days' worth of data, of which we used 24 days for training, 3 days for validation and 3 days for testing. We do this for both houses. In this way we can analyze the effect of dataset size on the performance of the models.

We have measured the accuracy of all the models, i.e. Multi-Layer Perceptron, Convolutional Neural Networks, Recurrent Neural Networks, and the combination of Convolutional Neural Networks and Recurrent Neural Networks, for all the dataset sizes mentioned above. We also used several other metrics for the overall evaluation of the models: precision, recall and f-score. Lastly, we compared the different models based on training time.

Some of the notations we use in the results section are PR for precision, RE for recall, FS for f-score, MLP for multi-layer perceptron neural network, LSTM for long short-term memory neural network, GRU for gated recurrent unit neural network, CNN2D for two-dimensional convolutional neural network, CNN2DS for the combination of two-dimensional convolutional neural network with LSTM neural network, CNN1D for one-dimensional convolutional neural network, CNN1DS for the combination of one-dimensional convolutional neural network with LSTM network, Res A for resident A and Res B for resident B.

6.1 ARAS dataset house A

6.1.1 10 Days

Table 1 shows the accuracy of each resident, the average accuracy and the training time for each model for 10 days on ARAS dataset House A. From Table 1 we can observe that the CNN1DS model gives us the highest average accuracy. Simple models such as MLP, CNN2D and CNN1D struggle to reach high accuracy, as they do not make use of sequential information. However, the high accuracy comes at the cost of high training time.

6.1.2 30 Days

Table 2 shows the accuracy of each resident, the average accuracy and the training time for each model for 30 days on ARAS dataset House A. A similar pattern can be seen in Table 2, but as the dataset size increases, the performance of the GRU model improves significantly compared to other models. We can see that with more data, all models except MLP and CNN1D perform better.

Furthermore we can observe on ARAS dataset House A increasing dataset size does not improve the accuracy significantly.

We analyze the best performing model, CNN1DS, based on overall performance by using PR, RE and FS.

6.1.3 For resident A

Table 3 shows the precision, recall and f-score of the CNN1DS model for different activities of Resident A on ARAS dataset House A.

Table 1 Accuracy and time taken for 10 days

Model     Accuracy of Res A (%)   Accuracy of Res B (%)   Average (%)   Time taken (s)
MLP       64.61                   79.20                   71.905        1169.9
LSTM      73.64                   85.21                   79.425        4017.5
GRU       75.17                   87.29                   81.23         3481.1
RNN       71.27                   81.67                   76.47         2685.3
CNN2D     64.78                   79.82                   72.3          1304.1
CNN2DS    77.66                   85.00                   81.33         4510.1
CNN1D     72.02                   84.84                   78.43         1703.2
CNN1DS    77.49                   87.10                   82.295        4600
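The day-level splits described at the start of this section (8/1/1 days out of 10, and 24/3/3 days out of 30) can be sketched as follows. This is only an illustration: the function name `split_days` is hypothetical, and the assumption that days are assigned chronologically in contiguous blocks is ours, since the text does not state which days go into each subset.

```python
def split_days(day_ids, n_train, n_val, n_test):
    """Split a list of day identifiers into contiguous train/validation/test blocks."""
    if n_train + n_val + n_test > len(day_ids):
        raise ValueError("not enough days for the requested split")
    train = day_ids[:n_train]
    val = day_ids[n_train:n_train + n_val]
    test = day_ids[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


days = list(range(1, 31))   # 30 recorded days, numbered 1..30 for illustration

# 10-day experiment: 8 days training, 1 day validation, 1 day testing.
train_10, val_10, test_10 = split_days(days[:10], 8, 1, 1)

# 30-day experiment: 24 days training, 3 days validation, 3 days testing.
train_30, val_30, test_30 = split_days(days, 24, 3, 3)
```

Splitting by whole days keeps every time stamp of a given day in the same subset, so no day contributes to both training and testing.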


Table 2 Accuracy and time taken for 30 Days Table 5 Accuracy and time taken for 10 days

Model Accuracy Accuracy Average Time Model Accuracy Accuracy Average Time
of Res A of Res B Taken(s) of Res A of Res B Taken(s)

MLP 64.64 75.20 69.92 3163.4 MLP 91.11 91.99 91.55 1048.2
LSTM 77.17 91.43 84.3 11973.6 LSTM 95.93 86.03 90.98 4043.7
GRU 81.83 90.13 85.98 11286 GRU 95.93 85.39 90.66 3492.7
RNN 74.25 90.27 82.26 9141.2 RNN 92.66 82.15 87.405 2661.8
CNN2D 72.07 88.12 80.095 3376.9 CNN2D 94.51 83.39 88.95 1129
CNN2DS 76.19 90.38 83.285 12339.2 CNN2DS 91.79 90.21 91 4135.1
CNN1D 64.66 75.27 69.965 5636.4 CNN1D 94.49 83.14 88.815 1726
CNN1DS 81.81 90.82 86.315 14043.7 CNN1DS 96.96 86.23 91.595 4592.4

Table 3 Evaluation metrics for Resident A

Activity number   PR     RE     FS
1                 0.25   0.38   0.30
2                 1      0.78   0.87
3                 0.29   0.91   0.44
4                 0.3    0.99   0.46
9                 0.70   0.29   0.40
11                1.00   0.73   0.84
12                0.34   0.90   0.49
13                1.00   0.00   0.00
14                0.63   0.93   0.75
15                0.62   0.83   0.71
17                0.61   0.65   0.63
21                0.53   0.08   0.14
22                0.64   0.43   0.51
27                0.84   0.33   0.47

Table 6 Accuracy and time taken for 30 days

Model     Accuracy of Res A   Accuracy of Res B   Average   Time Taken (s)
MLP       91.53               87.77               89.7      3283
LSTM      86.71               87.39               87.05     12103.3
GRU       88.98               85.17               87.075    10807.6
RNN       86.00               82.96               84.48     7901.5
CNN2D     91.42               81.65               86.4      3906.9
CNN2DS    87.57               88.96               88.265    12078
CNN1D     92.51               81.55               87.03     5287.8
CNN1DS    89.21               86.74               87.975    14811.3
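The effect of moving from 10 to 30 days on House B can be read off by comparing the Average columns of Table 5 and Table 6. The following is a minimal sketch of that comparison; the values are copied from the two tables, and the names `AVG_10_DAYS`, `AVG_30_DAYS`, `accuracy_change` and `best_model` are illustrative.

```python
# Average accuracy per model on ARAS House B, copied from
# Table 5 (10 days) and Table 6 (30 days).
AVG_10_DAYS = {
    "MLP": 91.55, "LSTM": 90.98, "GRU": 90.66, "RNN": 87.405,
    "CNN2D": 88.95, "CNN2DS": 91.0, "CNN1D": 88.815, "CNN1DS": 91.595,
}
AVG_30_DAYS = {
    "MLP": 89.7, "LSTM": 87.05, "GRU": 87.075, "RNN": 84.48,
    "CNN2D": 86.4, "CNN2DS": 88.265, "CNN1D": 87.03, "CNN1DS": 87.975,
}


def accuracy_change(before, after):
    """Per-model change in average accuracy (after minus before)."""
    return {model: after[model] - before[model] for model in before}


def best_model(averages):
    """Model with the highest average accuracy."""
    return max(averages, key=averages.get)


if __name__ == "__main__":
    for model, delta in accuracy_change(AVG_10_DAYS, AVG_30_DAYS).items():
        print(f"{model:<8}{delta:+7.3f}")
    print("best for 10 days:", best_model(AVG_10_DAYS))
    print("best for 30 days:", best_model(AVG_30_DAYS))
```

With these values every model's average drops between the two settings, and the best model changes from CNN1DS (10 days) to MLP (30 days).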
Table 4 Evaluation metrics for Resident B

Activity number   PR     RE     FS
2                 0.95   0.97   0.96
11                1      0.97   0.98
14                0.5    0.09   0.16
15                0.75   0.6    0.67
20                1      0.98   0.99
27                0.56   0.27   0.36

Table 7 Evaluation metrics for Resident A

Activity number   PR     RE     FS
1                 0.26   0.28   0.26
2                 1.00   1.00   1.00
3                 0.22   0.18   0.2
4                 0.99   0.42   0.59
8                 0.94   0.59   0.73
11                1.00   1.00   1.00
12                0.55   0.96   0.7
13                0.74   1.00   0.85
14                0.95   0.70   0.80
15                0.13   0.15   0.14
18                0.13   0.04   0.06
27                0.94   0.86   0.90
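The PR, RE and FS columns in the evaluation-metric tables are per-activity precision, recall and F-score. The following is a minimal sketch of how such per-class values can be computed from true and predicted activity labels. It assumes that FS is the F1 score (the harmonic mean of precision and recall), which this section does not state explicitly, and it returns 0.0 for an undefined ratio, which may differ from the convention used for the tables. The labels in the usage lines are made up for illustration and are not ARAS data.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed the true label
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Assumption: undefined ratios are reported as 0.0.
        pr = tp[label] / predicted if predicted else 0.0
        re = tp[label] / actual if actual else 0.0
        fs = 2 * pr * re / (pr + re) if pr + re else 0.0
        metrics[label] = (pr, re, fs)
    return metrics


if __name__ == "__main__":
    # Hypothetical activity numbers for one resident.
    y_true = [2, 2, 2, 11, 11, 1, 1, 1]
    y_pred = [2, 2, 1, 11, 11, 1, 2, 1]
    for label, (pr, re, fs) in per_class_metrics(y_true, y_pred).items():
        print(f"{label:>3}  PR={pr:.2f}  RE={re:.2f}  FS={fs:.2f}")
```

Because the paper keeps the labels of the two residents separate, such a function would be applied once per resident, on that resident's own label sequence.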


6.1.4 For resident B

Table 4 shows the precision, recall and F-score of the CNN1DS model for different activities of Resident B on ARAS dataset House A.

6.2 ARAS dataset house B

6.2.1 10 Days

Table 5 shows the accuracy for each resident, the average accuracy and the training time of each model for 10 days on ARAS dataset House B. From Table 5 we can observe that the CNN1DS model gives the highest average accuracy, and that simple models like MLP, CNN2D and CNN1D also give high accuracy. This is because the data in House B has very little variance compared to House A.

6.2.2 30 Days

Table 6 shows the accuracy for each resident, the average accuracy and the training time of each model for 30 days on ARAS dataset House B. Significantly different results are obtained this time as compared to House A: the performance of almost all the models decayed as the number of days increased. This is due to an increase in variance in the data as more days are added.

We analyze the overall performance of the CNN1DS model using precision (PR), recall (RE) and F-score (FS).

6.2.3 For Resident A

Table 7 shows the precision, recall and F-score of the CNN1DS model for different activities of Resident A on ARAS dataset House B.

6.2.4 For Resident B

Table 8 shows the precision, recall and F-score of the CNN1DS model for different activities of Resident B on ARAS dataset House B.

Table 8 Evaluation metrics for Resident B

Activity number   PR     RE     FS
1                 0.28   0.52   0.36
2                 0.99   1.00   1.00
4                 0.97   0.15   0.26
8                 1.00   0.49   0.65
11                1.00   1.00   1.00
12                0.53   0.99   0.69
15                0.70   0.13   0.22
17                0.30   0.39   0.34
18                0.29   0.00   0.01
27                0.97   0.44   0.61

We have seen two cases of how dataset size can affect the performance of a model. In House A, a larger dataset positively impacted model performance, whereas in House B, increasing the dataset size degraded model performance, though not significantly. Because we have a lot of data, the effect of dataset size is not significant for either house in the ARAS dataset.

Figure 10 shows the accuracy of different models for 10 days and 30 days for ARAS dataset House A. Figure 11 shows the accuracy of different models for 10 days and 30 days for ARAS dataset House B. Figure 12 shows the training time of different models for 10 days and 30 days for ARAS dataset House A. Figure 13 shows the training time of different models for 10 days and 30 days for ARAS dataset House B.

Fig. 10 Accuracy of House A

Fig. 11 Accuracy of House B

Fig. 12 Time Taken of House A

Fig. 13 Time Taken of House B

These figures show that the combination of convolutional and recurrent neural networks, which depends on past values and also uses spatial information, gives good accuracy for both houses and both dataset sizes. But these models also take a large amount of time to train. In contrast, simple networks like the multi layer perceptron and convolutional neural networks take less time to train but do not consistently produce good results.

Tables 9 and 10 show a comparison of our results with some recent work on multi-resident activity recognition on the ARAS dataset [30].

Table 9 Accuracy for both residents on ARAS House A

Model           Accuracy of Res A   Accuracy of Res B
HMM             64.61               79.20
CRF             73.64               85.21
KNN             75.17               87.29
CNN1DS(Ours)    81.81               90.82

Table 10 Accuracy for both residents on ARAS House B

Model           Accuracy of Res A   Accuracy of Res B
HMM             79.29               76.98
CRF             88.36               89.27
KNN             51.34               60.98
MLP(Ours)       91.53               87.77

7 Conclusion

In this paper we have shown the performance of different kinds of neural networks for multi-resident activity recognition in smart homes with ambient sensors. The results show that the combination of convolutional neural networks and recurrent neural networks (LSTM) performs better than other models on this kind of problem. In general, deep learning approaches perform better than traditional sequential models like HMM and CRF for the given task. But deep learning methods have drawbacks as well: they can take a large amount of time and data to train, and they are not very predictable. Sometimes they give very high accuracy and other times they fail completely. We have seen this in the case of ARAS dataset House B, where the MLP algorithm, which performed worst on House A, was the best model on the 30-day data (Table 6).

References

1. Cook DJ, Das S, Gopalratnam K, Roy A (2003) Health monitoring in an agent-based smart home. In: Proceedings of the International Conference on Aging, Disability and Independence Advancing Technology and Services to Promote Quality of Life, pp 3–141
2. Poppe R (2010) A survey on vision-based human action recognition. Image Vis Comput 28(6):976–990
3. Plötz T, Hammerla NY, Olivier PL (2011) Feature learning for activity recognition in ubiquitous computing. In: Twenty-Second International Joint Conference on Artificial Intelligence
4. Shelke S, Aksanli B (2019) Static and dynamic activity detection with ambient sensors in smart spaces. Sensors 19(4):804
5. Using a hidden markov model for resident identification (2010) International Conference on Intelligent Environments, pp 7479–7479
6. Vail DL, Veloso MM, Lafferty JD (2007) Conditional random fields for activity recognition. In: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems. ACM, pp 235
7. Prossegger M, Bouchachia H (2014) Multi-resident activity recognition using incremental decision trees, pp 182–191
8. Xu L, Yang W, Cao Y, Li Q (2017) Human activity recognition based on random forests. In: 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). IEEE, pp 548–553
9. Putra DNS, Yulita IN (2019) Multilayer perceptron for activity recognition using a batteryless wearable sensor. In: IOP Conference Series: Earth and Environmental Science, vol 248. IOP Publishing, pp 012039
10. Singh D, Merdivan E, Psychoula I, Kropf J, Hanke S, Geist M, Holzinger A (2017) Human activity recognition using recurrent neural networks. In: International Cross-Domain Conference for Machine Learning and Knowledge Extraction. Springer, pp 267–274
11. Ji S, Xu W, Yang M, Yu K (2012) 3d convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35(1):221–231
12. Alemdar H, Ertan H, Incel OD, Ersoy C (2013) Aras human activity datasets in multiple homes with multiple residents, pp 232–235


13. Jordao A, Nazare Jr A C, Sena J, Schwartz WR (2018) Human activity recognition based on wearable sensor data: A standardization of the state-of-the-art. arXiv:1806.05226
14. Zubair M, Song K, Yoon C (2016) Human activity recognition using wearable accelerometer sensors. IEEE, pp 1–5
15. Zhang S, Wei Z, Nie J, Huang L, Wang S, Li Z (2017) A review on human activity recognition using vision-based method. Journal of Healthcare Engineering 2017
16. Cook DJ, Crandall AS, Thomas BL, Krishnan NC (2012) Casas: A smart home in a box. Computer 46(7):62–69
17. Ye J, Stevenson G, Dobson S (2015) Kcar: A knowledge-driven approach for concurrent activity recognition. Pervasive Mob Comput 19:47–70
18. Shet VD, Harwood D, Davis LS (2005) Vidmap: video monitoring of activity with prolog. In: IEEE Conference on Advanced Video and Signal Based Surveillance. IEEE, pp 224–229
19. Artikis A, Sergot M, Paliouras G (2013) A logic-based approach to activity recognition. In: Human Behavior Recognition Technologies: Intelligent Applications for Monitoring and Security. IGI Global, pp 1–13
20. Kowalski R, Sergot M (1989) A logic-based calculus of events. In: Foundations of knowledge base management. Springer, pp 23–55
21. Cook DJ (2010) Learning setting-generalized activity models for smart spaces. IEEE Intell Syst 2010(99):1
22. Cook DJ, Krishnan NC, Rashidi P (2013) Activity discovery and activity recognition: A new partnership. IEEE Trans Cybern 43(3):820–828
23. Fahad LG, Tahir SF, Rajarajan M (2014) Activity recognition in smart homes using clustering based classification. In: 2014 22nd International Conference on Pattern Recognition. IEEE, pp 1348–1353
24. Tran S, Zhang Q, Karunanithi M (2009) Mixed-dependency models for multi-resident activity recognition in smart-homes
25. Chen R, Tong Y (2014) A two-stage method for solving multi-resident activity recognition in smart environments. Entropy 16(4):2184–2203
26. Nazerfard E, Das B, Holder LB, Cook DJ (2010) Conditional random fields for activity recognition in smart environments. In: Proceedings of the 1st ACM International Health Informatics Symposium. ACM, pp 282–286
27. Hsu K-C, Chiang Y-T, Lin G-Y, Lu C-H, Hsu JY-J, Fu L-C (2010) Strategies for inference mechanism of conditional random fields for multiple-resident activity recognition in a smart home. In: International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Springer, pp 417–426
28. Zhuang X, Huang J, Potamianos G, Hasegawa-Johnson M (2009) Acoustic fall detection using gaussian mixture models and gmm supervectors. In: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, pp 69–72
29. Ribaric S, Hrkac T (2012) A model of fuzzy spatio-temporal knowledge representation and reasoning based on high-level petri nets. Inf Syst 37(3):238–256
30. Tran SN, Nguyen D, Ngo T-S, Vu X-S, Hoang L, Zhang Q, Karunanithi M (2019) On multi-resident activity recognition in ambient smart-homes. Artif Intell Rev:1–17
31. Tran SN, Zhang Q (2020) Towards multi-resident activity monitoring with smarter safer home platform. In: Smart Assisted Living: Toward An Open Smart-Home Infrastructure. Springer International Publishing, Cham, pp 249–267
32. Al Machot F, Mosa A, Ali M, Kyamakya K (2017) Activity recognition in sensor data streams for active and assisted living environments. IEEE Trans Circ Syst for Video Technol PP. https://doi.org/10.1109/TCSVT.2017.2764868
33. Hassan MM, Uddin MZ, Mohamed A, Almogren A (2018) A robust human activity recognition system using smartphone sensors and deep learning. Futur Gener Comput Syst 81:307–313
34. Wang J, Chen Y, Hao S, Peng X, Hu L (2019) Deep learning for sensor-based activity recognition: A survey. Pattern Recogn Lett 119:3–11
35. Phyo CN, Zin TT, Tin P (2019) Deep learning for recognizing human activities using motions of skeletal joints. IEEE Trans Consum Electron 65(2):243–252. https://doi.org/10.1109/TCE.2019.2908986
36. Baccouche M, Mamalet F, Wolf C, Garcia C, Baskurt A (2011) Sequential deep learning for human action recognition. In: Salah AA, Lepri B (eds) Human Behavior Understanding. Springer, Berlin, pp 29–39
37. Li X, Zhang Y, Marsic I, Sarcevic A, Burd RS (2016) Deep learning for rfid-based activity recognition. In: Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM. ACM, pp 164–175
38. Wang K, Wang X, Lin L, Wang M, Zuo W (2014) 3d human activity recognition with reconfigurable convolutional neural networks. In: Proceedings of the 22nd ACM international conference on Multimedia. ACM, pp 97–106
39. Liciotti D, Bernardini M, Romeo L, Frontoni E (2019) A sequential deep learning application for recognising human activities in smart homes. Neurocomputing
40. Mohamed R (2017) Multi label classification on multi resident in smart home using classifier chains. Adv Sci Lett 4:400–407
41. Mohamed R, Perumal T, Sulaiman M, Mustapha N (2017) Multi-resident activity recognition using label combination approach in smart home environment. In: 2017 IEEE International Symposium on Consumer Electronics (ISCE). IEEE, pp 69–71
42. LeCun Y, Bottou L, Bengio Y, Haffner P et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
43. Sherstinsky A (2018) Fundamentals of recurrent neural network (rnn) and long short-term memory (lstm) network. arXiv:1808.03314
44. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
45. Donahue J, Anne Hendricks L, Guadarrama S, Rohrbach M, Venugopalan S, Saenko K, Darrell T (2015) Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2625–2634

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Anubhav Natani is a pre-final year student pursuing a B.Tech in Communication and Computer Engineering at the LNM Institute of Information Technology, Jaipur, India. He has published work at the IEEE Global Conference on Consumer Electronics (IEEE GCCE 2019). He received a merit-based scholarship for his academic performance from the LNM Institute of Information Technology, Jaipur, India, in 2018 and 2019. His research interests are Machine Learning, Deep Learning, Activity Recognition, Reinforcement Learning and Artificial Intelligence.

Abhishek Sharma received his B.E. in Electronics Engineering from Jiwaji University, Gwalior, India, and his Ph.D. in Embedded Systems from the University of Genoa, Italy. He is presently working as an Assistant Professor in the Department of Electronics and Communication Engineering at LNM Institute of Information Technology, Jaipur, India. He is also a member of IEEE, Computer Society and Consumer Electronics Society, a Lifetime Member of the Indian Society for Technical Education, and a Lifetime Member of Advanced Computing Society, India. In the present institute, he is the coordinator of the ARM university partner program. He is also the center leader of the LNM Smart Technology Center (L-CST). His research interests include real-time systems and embedded systems.

Thinagaran Perumal received his Ph.D. in the area of smart technology and robotics from Universiti Putra Malaysia. He is currently a Senior Lecturer at the Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. He is the recipient of the 2014 Early Career Award from the IEEE Consumer Electronics Society for his pioneering contribution to the field of consumer electronics. He is currently appointed as Head of Cyber-Physical Systems in the university and has been elected as Chair of the IEEE Consumer Electronics Society Malaysia Chapter. He is also heading the National Committee on Standardization for IoT (IEC/ISO TC/G/16) as Chairman since 2018. His research interests are towards interoperability aspects of smart homes and the Internet of Things (IoT), wearable computing, and cyber-physical systems.
