Article
A Smart Home Digital Twin to Support the Recognition of
Activities of Daily Living
Damien Bouchabou 1,2,*,†, Juliette Grosset 1,3,†, Sao Mai Nguyen 1,3, Christophe Lohr 1 and Xavier Puig 4
Abstract: One of the challenges in the field of human activity recognition in smart homes based on
IoT sensors is the variability in the recorded data. This variability arises from differences in home
configurations, sensor network setups, and the number and habits of inhabitants, resulting in a
lack of data that accurately represent the application environment. Although simulators have been
proposed in the literature to generate data, they fail to bridge the gap between training and field data
or produce diverse datasets. In this article, we propose a solution to address this issue by leveraging
the concept of digital twins to reduce the disparity between training and real-world data and generate
more varied datasets. We introduce the Virtual Smart Home, a simulator specifically designed for
modeling daily life activities in smart homes, which is adapted from the Virtual Home simulator.
To assess its realism, we compare a set of activity data recorded in a real-life smart apartment with
its replication in the VirtualSmartHome simulator. Additionally, we demonstrate that an activity
recognition algorithm trained on the data generated by the VirtualSmartHome simulator can be
successfully validated using real-life field data.
Keywords: smart home; machine learning; home automation; simulator; database; digital twin; transfer learning

1. Introduction

Over the past few decades, there has been a significant increase in the adoption of smart homes and real-world testbeds, driven by the proliferation of Internet of Things (IoT) devices. These devices enable the detection of various aspects within homes, such as door openings, room luminosity, temperature, humidity, and more. Human Activity Recognition (HAR) algorithms in smart homes have become crucial for classifying streams of data from IoT sensor networks into Activities of Daily Living (ADLs). These algorithms enable smart homes to provide adaptive services, including minimizing power consumption, improving healthcare, and enhancing overall well-being.

Despite the notable advancements in machine learning techniques and the improved performance of HAR algorithms, their practical application to real-world test cases continues to encounter challenges. These challenges primarily stem from the variability and sparsity of sensor data, leading to a significant mismatch between the training and test sets.

1.1. A Variable and Sparse Unevenly Sampled Time Series

While HAR based on video data has made significant strides in performance [1], HAR in smart homes continues to encounter specific challenges, as highlighted in the survey by Bouchabou et al. [2]. Recent advances in HAR algorithms, such as convolutional neural networks [3] and fully connected networks [4], along with sequence learning methods like
long short-term memory [5], have contributed to advancements in the field. However, the
task of recognizing ADLs in a smart home environment remains inherently challenging,
primarily due to several contributing factors:
• Partial observability and sparsity of the data: The input data in HAR consists of traces
captured by a variety of sensors, including motion sensors, door sensors, temperature
sensors, and more, integrated into the environment or objects within the house [6].
However, each sensor has a limited field of view, resulting in most of the residents’
movements going unobserved by the sensor network in a typical smart home setup.
Unlike HAR in videos, where the context of human actions, such as objects of interest
or the position of obstacles, can be captured in the images, the sparsity of ambient
sensors in HAR does not provide information beyond their field of view. Each sensor
activation alone provides limited information about the current activity. For example,
the activation of a motion sensor in the kitchen could indicate activities such as “cook-
ing”, “washing dishes”, or “housekeeping”. Therefore, the information from multiple
sensors needs to be combined to infer the current activity accurately. Additionally,
each sensor activation provides only a partial piece of information about the activity
and the state of the environment, unlike videos where both the agent performing
the activity and the environment state are visible. Consequently, the time series of
sensor activity traces cannot be approximated as a Markov chain. Instead, estimating
the context or current state of the environment relies on past information and the
relationship with other sensors.
• Variability of the data: Activity traces between different households exhibit significant
variations. The variability arises from differences in house structures, layouts, and
equipment. House layouts can vary in terms of apartments, houses with gardens,
houses with multiple floors, the presence of bathrooms and bedrooms, open-plan
or separate kitchens, and more. The number and types of sensors can also differ
significantly between homes. For instance, datasets like MIT [7] use 77–84 sensors
for each apartment, while the Kasteren dataset [8] uses 14–21 sensors. The ARAS
dataset [9] includes apartments with 20 sensors, while the Orange4Home dataset [10]
is based on an apartment equipped with 236 sensors. All these factors, including
home topography, sensor count, and their placement, can result in radical differences
in activity traces. The second cause of variability stems from household composition
and residents’ living habits. ADLs vary depending on the residents’ habits, hobbies,
and daily routines, leading to different class balances among ADLs. For example,
the typical day of a student, a healthy adult, or an elderly person with frailty will
exhibit distinct patterns. Furthermore, the more residents there are, the more the
sensor activation traces corresponding to each resident’s activities become intertwined,
leading to complex scenarios involving composite actions, concurrent activities, and
interleaved activities.
Therefore, it becomes imperative for algorithms to proficiently analyze sparse and
irregular time series data to establish effective generalization across a spectrum of house
configurations, equipment variations, households with distinct dynamics, and diverse daily
habits. It is crucial to recognize that training machine learning algorithms, as well as any
other HAR methodologies intended for deployment in these multifaceted contexts, man-
dates the use of training data that comprehensively encapsulates this extensive variability.
In this way, the digital twin can be used to fine-tune algorithms before their deployment in the
actual target house.
Moreover, digital twins have the potential to generate data representing a vast range
of house configurations, household habits, and resident behaviors, thereby accelerating
simulations, facilitating automatic labeling, and eliminating the cost of physical sensors.
This extensive dataset can then be utilized for pre-training machine learning models.
Furthermore, a digital twin can aid in evaluating the correct positioning and selection of
sensors to recognize a predefined list of activities.
Digital twin models have gained significant interest in various application domains,
such as manufacturing, aerospace, healthcare, and medicine [13]. While digital twins for
smart homes are relatively less explored, digital twins for buildings have been studied
extensively. Ngah Nasaruddin et al. [14] define a digital twin of a building as the interaction
between the interior environment of a real building and a realistic virtual representation
model of the building environment. This digital twin enables real-time monitoring and data
acquisition. For example, digital twins of buildings have been utilized in [15] to determine
the strategic locations of sensors for efficient data collection.
1.3. Contributions
The gap between training and testing data in HAR for smart homes presents significant
challenges due to the variability and sparsity of activity traces. In this study, we address
this issue by exploring the possibility of generating data suitable for deployment scenarios
by using the concept of a smart home digital twin. Our study centers on the application of
this method within the domain of HAR deep learning techniques. It is worth highlighting
that our proposed method transcends this domain, as its applicability extends seamlessly
to encompass both machine learning and non-machine learning approaches.
Our contributions are as follows:
• We propose a novel approach that paves the way for digital twins in the context of
smart homes.
• We enhance the Virtual Home [16] video-based data simulator to support sensor-based
data simulation for smart homes, which we refer to as VirtualSmartHome.
• We demonstrate, through an illustrative example, that we can replicate a real apart-
ment to generate data for training an ADL classification algorithm.
• Our study validates the effectiveness of our approach in generating data that closely
resembles real-life scenarios and enables the training of an ADL recognition algorithm.
• We outline a tool and methodology for creating digital twins for smart homes, encom-
passing a simulator for ADLs in smart homes and a replicable approach for modeling
real-life apartments and scenarios.
• The proposed tool and methodology can be utilized to develop more effective ADL
classification algorithms and enhance the overall performance of smart home systems.
In the next section (Section 2), we provide a comprehensive review of the state-of-the-
art approaches in HAR algorithms, ADL datasets, and home simulators. Subsequently,
in Section 3, we introduce the VirtualSmartHome simulator that we have developed,
along with our methodology for replicating real apartments and human activities. Moving
forward, in Section 4, we present an evaluation of our simulator, comparing the synthetic
data produced by the VirtualSmartHome simulator with real data from a smart apartment.
We also demonstrate the potential of our approach by employing the generated datasets
for a HAR algorithm.
2. Related Work
While recent HAR algorithms have demonstrated improved recognition rates when
trained and tested on the same households, their generalizability across different house-
holds remains limited. The existing ADL datasets also have their own limitations, prompt-
ing the exploration of smart home simulators to generate relevant test data. In this section,
we discuss the limitations of current HAR algorithms and ADL datasets, and review the
available home simulators.
2.1. Machine Learning Algorithms for Activity Recognition Based on Smart Home IoT Data
Numerous methods and algorithms have been studied for HAR in the smart home
domain. Early approaches utilized machine learning techniques such as Support Vector
Machines (SVM), naive Bayes networks, or Hidden Markov Models (HMM), as reviewed
in [17]. However, these models are designed for specific contexts and rely on hand-crafted
features, which are time-consuming to produce and limit the models' generalization
and adaptability.
More recently, deep learning techniques have emerged as a promising approach due
to their ability to serve as end-to-end models, simultaneously extracting features and
classifying activities. These models are predominantly based on Convolutional Neural
Networks (CNN) or Long Short-Term Memory (LSTM).
CNN structures excel at feature extraction and pattern recognition. They have two key
advantages for HAR. Firstly, they can capture local dependencies, meaning they consider
the significance of nearby observations that are correlated with the current event. Secondly,
they are scale-invariant, capable of handling differences in step frequency or event occur-
rence. For example, Gochoo et al. [18] transformed activity sequences into binary images
to leverage 2D CNN-based structures. Singh et al. [19] applied a 1D CNN structure to raw
data sequences, demonstrating their high feature extraction capability. Their experiments
demonstrated that the CNN 1D architecture yields comparable results to LSTM-based
models while being more computationally efficient. However, LSTM-based models still
outperform the CNN 1D architecture.
LSTM models are specifically designed to handle time sequences and effectively
capture both long- and short-term dependencies. In the context of HAR in smart homes,
Liciotti et al. [5] extensively investigated various LSTM structures and demonstrated that
LSTM surpasses traditional HAR approaches in terms of classification scores without
the need for handcrafted features. This superiority can be attributed to LSTM’s ability to
generate features that encode temporal patterns, as highlighted in [20] when compared
to conventional machine learning techniques. As a result, LSTM-based structures have
emerged as the leading models for tackling the challenges of HAR in the smart home domain.
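As a minimal illustration of this family of models (a sketch of our own; the vocabulary size, layer widths, and class count are placeholders rather than values from the cited works), an embedding layer followed by an LSTM can map a sequence of discrete sensor events to an ADL class:

```python
import tensorflow as tf

# Sketch of an embedding + LSTM classifier for sequences of discrete sensor
# events; hyperparameters are illustrative, not those of the cited works.
vocab_size = 128   # number of distinct sensor events (+1 reserved for padding)
num_classes = 12   # number of ADL classes

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=64, mask_zero=True),
    tf.keras.layers.LSTM(64),                                  # temporal feature extraction
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # one probability per ADL class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```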
Additionally, collecting data from real inhabitants is time-consuming, and the recording
cannot be accelerated like in a simulation. Furthermore, the data requires ground truth,
including data stream segmentation (start and end time) and class labels, which necessitates
significant user investment and is prone to errors due to manual annotations, as described
in [2].
Table 1. Cost of CASAS components: “smart home in a box” [21].
Table 2. Comparison of the reviewed simulators.
Simulators | Open | Approach | Multi | Environment | API | Apartment | Objects | Scripts | IoT Sensors | Designer/Editor | Visual | Application | Output
AI2Thor [25] | Yes | Model | Yes | Unity | Python | 17 | 609 | No | No | Yes | 3D | Robot Interaction | Videos
iGibson [26] | Yes | Model | Yes | Bullet | Python | 15 | 570 | No | 1 | Yes | None | Robot Interaction | Videos
Sims4Action [27] | No | Model | Yes | Sims 4 | No | None | NA | No | No | Game Interface | 3D | Human Activity | Videos
Ai Habitat [28] | Yes | Model | Yes | C++ | Python | None | NA | No | No | Yes | None | Human Activity | Sens. Log
OpenSHS [29] | Yes | Hybrid | No | Blender | Python | None | NA | No | 29 (B) | With Blender | 3D | Human Activity | Sens. Log
SESim [30] | Yes | Model | No | Unity | NA | NA | Yes | Yes | 5 | Yes | 3D | Human Activity | Sens. Log
Persim 3D [31] | No | Model | No | Unity | C# | Gator Tech | Yes | No | Yes (B) | Yes | 3D | Human Activity | Sens. Log
IE Sim [32] | No | Hybrid | No | NA | NA | NA | Yes | No | Yes (B) | Yes | 2D | Human Activity | Sens. Log
SIMACT [33] | Yes | Model | No | JME | Java | 3D kitchen | Yes | Yes | Yes (B) | With Sketchup | 3D | Human Activity | Sens. Log
Park et al. [34] | No | Interactive | No | Unity | NA | 1 | Yes | No | NA | With Unity | 3D | Human Activity | Sens. Log
Francillette et al. [35] | Yes | Hybrid | Yes | Unity | NA | NA | Yes | Yes | 8 (B and A) | With Unity | 3D | Human Activity | Sens. Log
Buchmayr et al. [36] | No | Interactive | No | NA | NA | NA | Yes | No | Yes (B) | NA | 3D | Human Activity | Sens. Log
Armac et al. [37] | No | Interactive | Yes | NA | NA | None | Yes | No | Yes (B) | Yes | 2D | Human Activity | Sens. Log
VirtualHome [38] | Yes | Hybrid | Yes | Unity | Python | 7 | 308 | Yes | No | Yes | 3D | Human Activity | Videos
erated by the simulator with real-world data collected from the Gator Tech Smart House
(GTSH) [48] and reported an 81% similarity. However, the authors did not evaluate the
performance of HAR algorithms using this simulator. Additionally, the current version of
the simulator only supports simulation of a single user’s activity.
More recently, hybrid approaches have emerged, combining both model-based and
interactive approaches in a single simulator [16,29,32,35]. These approaches offer the
advantages of both methods.
Alshammari et al. [29] proposed OpenSHS, a simulator for ADL dataset generation.
Designers can use Blender 3D to create the space and deploy devices, and users can control
an agent with a first-person view to generate agent traces. The simulator records sensor
readings and states based on user interactions. It also supports script-based actions in the
environment. However, the project does not appear to be actively updated or used.
Francillette et al. [35] developed a simulation tool capable of modeling the behav-
ior of individuals with Mild Cognitive Impairment (MCI) or Alzheimer’s Disease (AD).
The simulator allows the manual control or modeling of an agent based on a behavior
tree model with error probabilities for each action. The authors demonstrated that their
simulator accurately emulates individuals with MCI or AD when actions have different
error probabilities.
Synnott et al. [32] introduced IE Sim, a simulator capable of generating datasets
associated with normal and hazardous scenarios. Users can interact with the simulator
through a virtual agent to perform activities. The simulator provides an object toolbox with
a wide range of indoor objects and sensors, allowing users to create new objects as well. IE
Sim collects sensor readings throughout the simulation. The authors demonstrated that the
simulator’s data can be used to detect hazardous activities and overlapping activities. IE
Sim combines interactive and agent modeling approaches.
Puig et al. [16,49] proposed the Virtual Home simulator, a multi-agent platform for
simulating activities in a home. Humanoid avatars represent the agents, which can interact
with the environment using high-level instructions. Users can also control agents in a
first-person view to interact with the environment. This simulator supports video playback
of human activities and enables agent training for complex tasks. It includes a knowledge
base that provides instructions for a wide range of activities.
The Virtual Home simulator aligns with our requirements for recognizing activities
in a house. Although some common human actions are not yet implemented, such as
hoovering or eating, the extensible programming of the simulator allows for modifications.
Furthermore, the simulator facilitates the reproduction of human activity scenarios, retrieval
of sensor states, and replication of a real smart apartment for a digital twin. It is an ongoing
project with an active community.
Virtual Home is developed on the Unity3D game engine, which offers robust kinematic,
physics, and navigation models. Moreover, users can take advantage of the vast collection
of 3D models accessible through Unity’s Assets store, providing access to a diverse range
of humanoid models.
Moreover, Virtual Home offers a straightforward process for adding new flats by
utilizing the provided Unity project [49]. Each environment in Virtual Home represents
an interior flat with multiple rooms and interactive objects. The configuration of each
flat scene is stored in a .json file, which contains nodes representing each object and their
relationships with other objects (specified as “edge labels”). For instance, the label “between”
can be used to describe the relationship between rooms connected by a door. By modifying
these description files, users can easily add, modify, or remove objects, enabling the creation
of diverse scenes for generating videos or training agents.
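As an illustration of this mechanism (a sketch of our own: the node and edge fields below follow the general structure described above, not necessarily the exact VirtualHome schema), a scene description can be built or loaded, extended with a new object, and written back to its .json file:

```python
import json

# Sketch: a scene is described by object nodes and labelled edges (relations).
scene = {
    "nodes": [
        {"id": 1, "class_name": "kitchen", "states": []},
        {"id": 2, "class_name": "fridge", "states": ["CLOSED"]},
    ],
    "edges": [
        {"from_id": 2, "relation_type": "INSIDE", "to_id": 1},
    ],
}

# Adding an object amounts to appending a node and the edges that place it.
scene["nodes"].append({"id": 3, "class_name": "toaster", "states": ["OFF"]})
scene["edges"].append({"from_id": 3, "relation_type": "INSIDE", "to_id": 1})

with open("modified_scene.json", "w") as f:
    json.dump(scene, f, indent=2)
```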
Another notable feature of Virtual Home is its capability to create custom virtual
databases within specific environments, with a supportive community that contributes
new features. In a study conducted by Liao et al. [50], Virtual Home was utilized to
generate a novel dataset. The researchers expanded the original Virtual Home database by
incorporating additional actions for the agents and introducing programs. These programs
consist of predefined sequences of instructions that can be assigned to agents, enabling
them to perform activities within their simulated environment.
In Virtual Home, an avatar’s activity is represented by a sequence of actions, such
as “<char0> [PutBack] <glass> (1) <table>”, as described in [51]. This flexible framework
facilitates the training of agents to engage in various everyday activities.
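For example (an illustrative program of our own, written in the instruction format quoted above; the objects and indices are arbitrary), a short activity could be scripted as:

```
<char0> [Walk] <kitchen> (1)
<char0> [Open] <fridge> (1)
<char0> [Grab] <glass> (1)
<char0> [Close] <fridge> (1)
<char0> [PutBack] <glass> (1) <table> (1)
```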
The authors successfully collected a new dataset [50] based on Virtual Home [38],
encompassing 3000 daily activities. Furthermore, they expanded the database by incorpo-
rating over 30,000 programs, offering a wide range of actions and possibilities. Additionally,
the researchers graphed each environment, which consisted of an average of 300 objects
and 4000 spatial relationships.
Using Virtual Home, users can create scenarios where 3D avatars perform daily activi-
ties, with the ability to capture the simulated actions through a virtual camera. Moreover,
the simulator enables the replication of flats, facilitating the creation of digital twins of
apartments. However, it is important to note that Virtual Home does not support the
acquisition of data through home automation sensors.
In order to enhance the ability to reproduce scenarios and collect data from ambient
sensors, we have implemented several new features in our simulator:
1. Interactive Objects: While Virtual Home already offers a variety of objects for inclusion
in apartments, many are passive and non-interactive. To address this limitation, we
added the functionality to interact with some new objects. Agents can now open
trash cans, the drawers of column cabinets, and push on toilet faucets. Objects with
doors are implemented by splitting them into two parts—one static and one capable
of rotation around an axis to simulate interaction. Fluid objects like toilet faucets are
simulated by placing the origin point of the fluid at its supposed source.
2. Simulation Time Acceleration: To generate a large volume of data quickly, we imple-
mented the ability to accelerate simulation times. This feature utilizes the Time.timeScale
function of the Unity game engine. However, the acceleration cannot surpass the ren-
dering time of the Unity game engine, resulting in a maximum four-fold increase in
simulation speed.
3. Real-Life Apartment Replication and Room Creation: To replicate a real-life apartment,
we propose a methodology that involves creating a 2D map of the flat using tools like
Sweet Home 3D [52]. This map is then reproduced in Virtual Home, respecting the
hierarchical object structure imposed by the simulator. Finally, the interactive objects
are placed in a manner similar to their real-world counterparts. We demonstrated the
effectiveness of this method by replicating a real-life intelligent apartment based on
our room dimension measurements. Additionally, we have introduced the ability to
create new rooms, such as outdoor and entrance areas.
4. IoT Sensors: While Virtual Home previously focused on recording activities using
videos, we have implemented IoT sensors to enhance the simulation. The following
sensors have been incorporated: (1) opening/closing sensors, (2) pressure sensors,
(3) lights, (4) power consumption, and (5) zone occupancy sensors. Except for the
zone occupancy sensors, all other sensors are simulated using the environment graph
of the scene (see the code sketch after this list). This graph lists all objects in the scene with their corresponding states
(e.g., closed/open, on/off). The zone occupancy sensor takes the form of a sensitive
floor, implemented using a raycast. It originates from the center of the avatar and is
directed downwards. The floor of the flat is divided into rooms, and the intersection
with the floor identifies the room in which the avatar is located.
5. Simulation Interface: We have developed an interface that allows users to launch
simulations by specifying the apartment, ambient sensors, scenarios, date, and time.
The interface facilitates the scripting of each labeled activity for reproduction in
the simulation. It provides three main functions: (1) the creation of an experiment
configuration file, where the simulation flat and desired sensor data can be chosen;
(2) the creation of a scenario configuration file, offering choices such as experiment
date, simulation acceleration, and various activities with their durations; (3) the
association of an experiment configuration file with a scenario configuration file and
the subsequent launch of the simulation. This functionality enables the storage of
synthetic sensor logs in a database file and provides a comprehensive record of the
conducted experiment, including the experiment configuration file and the scenario
configuration file.
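To make point 4 above more concrete, the following sketch (our own illustration, not the simulator's actual code; the graph fields are assumed to follow the node structure described earlier) derives sensor events by comparing the object states exposed by two successive environment graphs:

```python
from datetime import datetime

def graph_to_states(graph):
    """Map object id -> set of states for every node of an environment graph."""
    return {node["id"]: set(node.get("states", [])) for node in graph["nodes"]}

def diff_to_events(prev_graph, curr_graph, timestamp: datetime):
    """Emit one sensor event per object whose state changed between snapshots."""
    prev, curr = graph_to_states(prev_graph), graph_to_states(curr_graph)
    events = []
    for obj_id, states in curr.items():
        if states != prev.get(obj_id, set()):
            events.append({
                "time": timestamp.isoformat(),
                "object_id": obj_id,
                "states": sorted(states),   # e.g. ["OPEN"] or ["ON"]
            })
    return events

# Example with two snapshots: the fridge (id 2) is opened between them.
g0 = {"nodes": [{"id": 2, "class_name": "fridge", "states": ["CLOSED"]}]}
g1 = {"nodes": [{"id": 2, "class_name": "fridge", "states": ["OPEN"]}]}
print(diff_to_events(g0, g1, datetime(2023, 6, 5, 8, 30)))
```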
Figure 2. Layout showing the positions of the different sensors in the smart apartment.
Tables 3–5 report the number of activities performed by the volunteers in each scenario. Table 6 provides a
global summary of the generated dataset.
Table 3. Summary of all recorded data from morning scenarios.
Activity Subject 1 Subject 2 Subject 3 Subject 5 Subject 6 Subject 7 Subject 8 Subject 9 Total/Activity
Bathe 5 3 3 3 1 1 0 0 16
Cook 5 4 4 7 2 2 0 2 26
Dress 6 4 4 1 1 1 0 2 19
Eat 5 4 3 3 1 2 0 1 19
Enter Home 0 0 0 1 0 0 0 0 1
Go To Toilets 5 4 3 1 1 1 0 1 16
Leave Home 5 4 2 2 1 2 0 1 17
Read 0 0 2 2 0 0 0 0 4
Sleep 5 4 4 0 1 2 0 1 17
Sleep in Bed 0 0 0 2 0 0 0 0 2
Wash Dishes 5 4 3 0 1 2 0 1 16
Watch TV 0 0 0 3 1 0 0 2 6
Total/Subject 41 31 28 25 10 13 0 11 159
Activity Subject 1 Subject 2 Subject 3 Subject 5 Subject 6 Subject 7 Subject 8 Subject 9 Total/Activity
Bathe 5 0 1 3 0 0 0 0 9
Cook 7 6 1 3 1 4 2 5 29
Dress 7 2 0 1 0 0 0 1 11
Eat 5 2 1 3 0 2 1 2 16
Enter Home 5 2 1 2 1 2 1 2 16
Go To Toilets 8 4 1 4 1 0 0 2 20
Leave Home 5 1 1 2 1 2 1 2 15
Read 1 3 1 2 0 0 0 0 7
Sleep 2 0 0 1 0 0 0 0 3
Sleep in Bed 4 0 0 0 0 0 0 0 4
Wash Dishes 6 2 1 2 1 2 1 2 17
Watch TV 0 2 0 0 1 4 1 0 8
Total/Subject 55 24 8 23 6 16 7 16 155
Activity Subject 1 Subject 2 Subject 3 Subject 5 Subject 6 Subject 7 Subject 8 Subject 9 Total/Activity
Bathe 6 2 3 5 0 1 0 2 19
Cook 6 5 2 5 0 3 0 3 24
Dress 8 2 1 4 1 2 1 0 19
Eat 4 2 2 4 0 1 0 2 15
Enter Home 5 2 2 3 1 2 1 2 18
Go To Toilets 10 3 1 4 0 2 0 2 22
Leave Home 0 0 0 0 0 0 0 0 0
Read 5 1 0 0 0 1 0 0 7
Sleep 4 2 1 3 1 2 1 2 16
Sleep in Bed 5 2 3 3 1 2 1 2 19
Wash Dishes 6 2 2 2 0 1 0 2 15
Watch TV 4 3 2 7 1 3 1 5 26
Total/Subject 63 26 19 40 5 20 5 22 200
Activity Subject 1 Subject 2 Subject 3 Subject 5 Subject 6 Subject 7 Subject 8 Subject 9 Total/Activity
Bathe 16 5 7 11 1 2 0 2 44
Cook 18 15 7 15 3 9 2 10 79
Dress 21 8 5 6 2 3 1 3 49
Eat 14 8 6 10 1 5 1 6 50
Enter Home 10 4 3 6 2 4 2 4 35
Go To Toilets 23 11 5 9 2 3 0 5 58
Leave Home 10 5 3 4 2 4 1 4 32
Read 6 4 3 4 0 1 0 0 18
Sleep 11 6 5 4 2 4 1 2 36
Sleep in Bed 9 2 3 5 1 2 1 1 25
Wash Dishes 17 8 6 4 2 5 1 5 48
Watch TV 4 5 2 10 3 7 2 7 40
Total/Subject 159 81 55 88 21 49 12 49 514
Several post-processing steps were performed, including renaming the sensors and
removing certain sensors (e.g., motion sensors, CO2 sensors, WiFi, radio level sensors, noise
sensor) from the real sequences that could not be implemented in Virtual Home or were not
relevant for our project. While our real-life sensors provided power consumption values,
we transformed them into ON or OFF states for simplicity in the virtual dataset, e.g., values
of devices such as the TV or the oven.
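As an illustration of this conversion (the threshold value is ours and purely indicative):

```python
# Sketch: reduce a power consumption reading (in watts) to a binary device
# state, as done for devices such as the TV or the oven.
def power_to_state(watts: float, threshold: float = 5.0) -> str:
    return "ON" if watts > threshold else "OFF"

print(power_to_state(120.0))  # ON
print(power_to_state(0.3))    # OFF
```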
Figure 4. (a) View of the living lab from a fisheye camera, (b) representation in the Virtual Home
simulator.
The synthetic dataset was generated by replicating the recorded scenarios from the
living lab within the simulator. We scripted an avatar to mimic each action performed by
our volunteers. For example, if a volunteer followed these steps for the activity “cooking”:
entering the kitchen, washing hands, opening the fridge, closing the fridge, turning on the
oven, etc., we scripted the avatars in the VirtualSmartHome simulator to simulate each of
these actions. We created one script for each occurrence of an activity performed by our
volunteers in all three scenarios, allowing us to obtain a database of scripts that we can
reuse later for other environment configurations.
In conclusion, we successfully replicated segments of scenarios involving three ran-
domly selected subjects (subject 3, subject 7, and subject 9). It is noteworthy to mention that
the process of scripting actions for the avatars proved to be time-intensive. This aspect also
presents a potential avenue for future work, wherein the development of tools to facilitate
avatar scripting could enhance the efficiency of the simulation process. Ultimately, we
managed to recreate 23 out of 55 scenarios for subject 3, 37 out of 49 scenarios for subject 7,
and accomplished full replication for subject 9.
For a comprehensive overview of the synthetic dataset, please refer to the summarized
details presented in Table 7.
4.1. Comparison of Triggered Sensors in Real and Synthetic Logs for Similar Scenarios
To gain an initial understanding of the synthetic data generated, we compared the
frequency of sensor activations in the synthetic data with that of the real data.
Figure 5 illustrates the comparison of the number of triggered sensors in the real
dataset (in blue) and the synthetic dataset (in red) across 15 scenarios. Most scenarios
showed a similar count of triggered sensors in both datasets. However, some scenarios (1, 4,
5, 9, and 13) exhibited an excess of triggered sensors in the synthetic data. Upon examining
Table 8, we observed that these scenarios predominantly involved presence sensors, which
detect the presence of the avatar in a specific room. The difference in sensor activations can
be attributed to the fact that the real-life sensor did not always detect the volunteer, and
the path chosen by the avatar in the simulator did not always match the movement of the
volunteer during the recording experiment.
Figure 5. Comparison graph of sensors triggered in real and synthetic logs for similar scenarios.
In conclusion, the comparison of triggered sensors between the real and synthetic logs
for similar scenarios showed a generally close alignment. However, discrepancies were
observed in scenarios, in particular for the presence sensors, which can be attributed to
variations in detection and movement between the real-life recording and the simulation.
The cross-correlation Formula (1) for discrete functions was used, and the cross-
correlation values were calculated for all sensors and times in the sequences. To ensure a
fair comparison with the longer real log sequences, we expanded the cross-correlation tables
for synthetic sensors by duplicating the last line since the sensor values do not change.
To determine the similarity between the real and synthetic log sequences, we multi-
plied the value of each sensor in the real sequence by the corresponding synthetic sensor
value. If the values matched (e.g., both ON or both OFF), a score of 1 was assigned; other-
wise, a score of −1 was assigned. This calculation was performed for each sensor at each
time point, resulting in the final cross-correlation table. The score was computed as the
sum of all cross-correlation values for the sensors at each time.
The percentage likelihood between the two log sequences was calculated using the
following formula:
Percentage = (Maximum Score / (Number of Sensors in Reality × Number of Events in Reality)) × 100
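The following sketch (our own reading of the procedure described above; the shift handling and the ON/OFF encoding are illustrative) computes this percentage for two logs represented as binary matrices of shape (number of events, number of sensors):

```python
import numpy as np

def similarity_percentage(real: np.ndarray, synthetic: np.ndarray) -> float:
    """real and synthetic are 0/1 matrices of shape (n_events, n_sensors)."""
    n_events, n_sensors = real.shape

    # Extend the synthetic table by duplicating its last line so that it covers
    # the (longer) real sequence, since the sensor values no longer change.
    if len(synthetic) < n_events:
        pad = np.repeat(synthetic[-1:], n_events - len(synthetic), axis=0)
        synthetic = np.vstack([synthetic, pad])
    synthetic = synthetic[:n_events]

    def score(shift: int) -> int:
        shifted = np.roll(synthetic, shift, axis=0)  # circular shift, for simplicity
        # +1 where both logs agree (both ON or both OFF), -1 otherwise.
        return int(np.where(real == shifted, 1, -1).sum())

    max_score = max(score(s) for s in range(n_events))
    return max_score / (n_sensors * n_events) * 100

real = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
synthetic = np.array([[1, 0], [1, 1], [0, 1]])
print(round(similarity_percentage(real, synthetic), 2))
```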
Table 9 presents the results obtained for the 15 processed scenarios, including the
similarity percentage and the average similarity across scenarios.
Subject S9 (Scenario Index 1–6): Similarity 75.73%, 97.40%, 75.42%, 75.30%, 69.61%, 74.42%; Average Similarity 77.98%
Subject S7 (Scenario Index 7–11): Similarity 81.24%, 93.18%, 85.22%, 84.73%, 88.28%; Average Similarity 86.53%
Subject S3 (Scenario Index 12–15): Similarity 79.93%, 77.53%, 54.73%, 82.78%; Average Similarity 73.74%
The obtained percentages were generally above 70%, except for one case that yielded
54.73%. Upon closer examination, we identified that the SensFloor sensor could be activated
and deactivated multiple times in the same room, leading to variations in the log sequences.
In conclusion, the cross-correlation analysis revealed that the synthetic log sequences
exhibited a high level of similarity to the real sequences, with similarity percentages ex-
ceeding 70% and reaching up to 86.53%. This indicates that the digital twin approach
allowed us to generate synthetic data that closely resembled the real-world data. Although
variations were observed in some scenarios due to the presence sensors, the overall com-
parison demonstrated a remarkable alignment between the two datasets. These findings
suggest that the synthetic data generated through the digital twin approach can be effec-
tively utilized for various applications, including activity recognition algorithms. In the
following Section 4.3, we will investigate whether this level of similarity is sufficient to
achieve comparable performance using an activity recognition algorithm.
described in Section 4.3.1. The experiment methods, which involve data preprocessing and
algorithm training, are detailed in Section 4.3.2.
Example of the input representation: an activity sequence of sensor events (M004ON, M005OFF, M007OFF, M004OFF, ...) is encoded as a sequence of integer indices (8, 11, 15, 9, ...) and associated with its activity label.
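A sketch of this encoding step (the event names come from the example above; the index assignment is illustrative):

```python
def build_vocabulary(sequences):
    """Assign an integer index to every distinct sensor event (0 is padding)."""
    vocab = {}
    for seq in sequences:
        for event in seq:
            vocab.setdefault(event, len(vocab) + 1)
    return vocab

def encode(sequence, vocab):
    return [vocab[event] for event in sequence]

events = ["M004ON", "M005OFF", "M007OFF", "M004OFF", "M004ON"]
vocab = build_vocabulary([events])
print(encode(events, vocab))  # [1, 2, 3, 4, 1]
```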
Table 10. Experiments 1 and 2: Partitioning of activity sequence samples into training, validation,
and test sets.
Subject # Train Samples # Validation Samples # Test Samples
Subject 3 68 18 28
Subject 7 61 16 37
Subject 9 52 13 49
Finally, we compared the results obtained from the synthetic and real leave-one-
subject-out cross-validations to evaluate the algorithm’s performance when trained and
tested on synthetic data versus real data, respectively.
The results of the two leave-one-subject-out cross-validations are presented in
Tables 11 and 12. These results demonstrate that the algorithm can be trained with both
real and synthetic data, yielding comparable outcomes. Notably, there is a high degree of
similarity in the results for Subject “9”. However, for the other two subjects, we observe
more differences between the results of synthetic and real data. Specifically, the perfor-
mance, as measured by F1-score and balanced accuracy, is better for Subject “7” and “3”
with synthetic data.
Confusion matrices: (a) Subject 3 real data; (b) Subject 7 real data; (c) Subject 9 real data; (d) Subject 3 synthetic data; (e) Subject 7 synthetic data; (f) Subject 9 synthetic data.
In general, the confusion matrices reveal certain patterns. For example, activities such
as “Enter Home” and “Leave Home” are often confused, which is logical since they trigger
similar types of sensors (entrance floor, principal door, etc.). Similarly, “Bathe” and “Go
To Toilet” activities show confusion, likely because one may wash their hands after using
the toilet, and in our apartment, the toilet is located in the same room as the bathroom.
“Reading” and “Watching TV” activities can also be easily confused as they occur in the
same room and trigger similar types of sensors. Additionally, “Wash Dishes” and “Cook”
activities, being in the same room, are occasionally confused by the algorithm.
Comparing the confusion matrices for real and synthetic data, we observe that the
aforementioned confusions occur in both cases. For instance, the activity “Read” is not
correctly recognized in both synthetic and real data for Subject “3”, although the scores are
slightly higher for real data. This difference can be attributed to the richness of real data,
which exhibits more sensor activations in the sequence. Moreover, the avatar’s trajectory
in simulation can introduce sensor activations that are not correlated with the current
activity. In contrast, during the real recordings, subjects pay attention to room transitions,
while the avatar does not, resulting in sensor activations from other rooms that disrupt
activity sequences.
In conclusion, despite the logical confusions made by the algorithm, the recogni-
tion results obtained are quite similar for real and synthetic data. The next subsection
(Section 4.3.4) investigates the extent to which synthetic data can be effectively used for
activity recognition in our digital twin.
The objective was to determine whether an algorithm trained on synthetic data could effectively
recognize activities in real data.
In more detail, three models were trained, one for each subject, each trained on that subject's
own synthetic data and tested on that subject's own real data. To train these models, 80% of
the synthetic data was used, while the remaining 20% was used for validation during the
training process (details in Table 13). A stratified partitioning method was used to create the
training and validation subsets. After training, each algorithm was tested using real data from
the corresponding subject.
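Such a stratified 80/20 split can be written, for instance, with scikit-learn (a sketch with dummy data; in the experiment, the inputs would be one subject's encoded synthetic sequences and their activity labels):

```python
from sklearn.model_selection import train_test_split

# Dummy data: 10 encoded synthetic sequences and their activity labels.
X_synthetic = [[1, 2, 3], [4, 5], [1, 2], [6, 7, 8], [2, 3],
               [4, 4, 1], [5, 1], [3, 3, 2], [2, 2], [7, 8]]
y_synthetic = ["Cook", "Sleep", "Cook", "Sleep", "Cook",
               "Sleep", "Cook", "Sleep", "Cook", "Sleep"]

# Stratified partitioning: 80% training, 20% validation, with the class
# proportions preserved in both subsets.
X_train, X_val, y_train, y_val = train_test_split(
    X_synthetic, y_synthetic, test_size=0.2, stratify=y_synthetic, random_state=42
)
print(len(X_train), len(X_val))  # 8 2
```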
Table 13. Experiment 3: Partitioning of activity sequence samples into training, validation, and test sets.
Analyzing Table 14 and Figure 9, we can initially observe that the synthetic data
generated for each subject enabled training and recognition of activity sequences for the
corresponding real datasets (one subject at a time). Subjects “9” and “7” achieved good
performance in terms of Accuracy, Balanced Accuracy, and F1-score. Notably, subject “7”
exhibited the best performance among the subjects. For these two subjects, the synthetic
data appeared realistic enough to achieve activity recognition with an accuracy of over 70%
for both subjects.
In contrast, subject “3” displayed the lowest performance. It seems that the synthetic
data generated for this subject were insufficient to enable accurate activity recognition. The
poor performance for this subject suggests that there are differences between the synthetic
and real data. A closer examination of the real data for subject “3” reveals sequences
that are interfered with by sensors triggering unrelated to the activity. For example, the
presence sensor on the floor is regularly triggered in the living room while subject “3” is
in the kitchen. This disturbance occurs due to a sensor malfunction, detecting a second
presence in the living room. Such malfunctions are not anticipated or simulated in the
synthetic data.
Activity | S9: Precision, Recall, F1-Score, Support | S7: Precision, Recall, F1-Score, Support | S3: Precision, Recall, F1-Score, Support
Bathe 0.00% 0.00% 0.00% 2 100.00% 100.00% 100.00% 2 50.00% 100.00% 66.67% 4
Cook 90.00% 90.00% 90.00% 10 54.55% 100.00% 70.59% 6 75.00% 75.00% 75.00% 4
Dress 75.00% 100.00% 85.71% 3 100.00% 100.00% 100.00% 3 100.00% 50.00% 66.67% 2
Eat 50.00% 50.00% 50.00% 6 100.00% 75.00% 85.71% 4 0.00% 0.00% 0.00% 3
Enter Home 50.00% 75.00% 60.00% 4 100.00% 33.33% 50.00% 3 60.00% 100.00% 75.00% 3
Go To Toilets 66.67% 80.00% 72.73% 5 75.00% 100.00% 85.71% 3 0.00% 0.00% 0.00% 2
Leave Home 40.00% 50.00% 44.44% 4 60.00% 100.00% 75.00% 3 0.00% 0.00% 0.00% 2
Sleep 50.00% 33.33% 40.00% 3 100.00% 75.00% 85.71% 4 100.00% 50.00% 66.67% 2
Wash Dishes 100.00% 80.00% 88.89% 5 100.00% 25.00% 40.00% 4 44.44% 100.00% 61.54% 4
Watch TV 100.00% 85.71% 92.31% 7 100.00% 80.00% 88.89% 5 0.00% 0.00% 0.00% 2
Accuracy 71.43% 78.38% 57.14%
Balanced Accuracy 64.40% 78.83% 47.50%
Macro Avg 62.17% 64.40% 62.41% 49 88.95% 78.83% 78.16% 37 42.94% 47.50% 41.15% 28
Weighted Avg 70.78% 71.43% 70.39% 49 87.36% 78.38% 76.91% 37 44.92% 57.14% 46.59% 28
Confusion matrices: (a) Subject 9 real data; (b) Subject 7 real data; (c) Subject 3 real data.
Additionally, we observed that the activity “Bathe” was not recognized for subject “9”,
whereas it was recognized with 100% accuracy for subject “7”. Subject “3” had four activity
classes out of ten that were not recognized. These results indicate that synthetic data
can be used to train an algorithm and recognize activities in real data. However, relying
solely on activity data from a single subject may not always be sufficient. Furthermore, the
performance can be degraded by sensor malfunctions in real conditions, which can disrupt
the activity recognition algorithm. Therefore, incorporating more data and variability into
the training dataset is necessary to address these challenges.
To train the algorithm, we merged all the synthetic data into one dataset. Then, 80% of
this dataset was used to train the algorithm, and 20% was used for validation. Finally, the
algorithm was tested on each subject's real dataset. The results are shown in Table 15.
Table 15 demonstrates that the algorithm achieved higher classification scores (close
to 80%) for all subjects compared to the previous experiment. Subject “7” maintained very
similar performance to the previous experiment. However, subjects “9” and “3” showed
notable improvement, particularly subject “3”, which had previously exhibited the worst
results. Subject “3” experienced an increase in accuracy and balanced accuracy from 57.14%
and 47.50% to 78.57% and 81.67%, respectively.
Furthermore, Table 15 and Figure 11 reveal that more activity classes were correctly
identified. The introduction of additional synthetic data from other subjects within the
same apartment led to improved classification performance. The contribution of data
from different subjects introduced variability in the execution of activities, enabling the
algorithm to better generalize and capture sensor behavior during activities. Having a
diverse range of examples is crucial for training a deep learning algorithm.
In conclusion, by utilizing more synthetic data, the algorithm demonstrated increased
performance in real conditions. The inclusion of behavioral variability from different
subjects facilitated better generalization. This generalization resulted in significant im-
provements, particularly for subject “3”.
Activity | S9: Precision, Recall, F1-Score, Support | S7: Precision, Recall, F1-Score, Support | S3: Precision, Recall, F1-Score, Support
Bathe 100.00% 100.00% 100.00% 2 100.00% 100.00% 100.00% 2 100.00% 75.00% 85.71% 4
Cook 100.00% 80.00% 88.89% 10 50.00% 100.00% 66.67% 6 100.00% 25.00% 40.00% 4
Dress 100.00% 100.00% 100.00% 3 100.00% 66.67% 80.00% 3 100.00% 100.00% 100.00% 2
Eat 50.00% 66.67% 57.14% 6 100.00% 50.00% 66.67% 4 100.00% 66.67% 80.00% 3
Enter Home 66.67% 50.00% 57.14% 4 100.00% 100.00% 100.00% 3 75.00% 100.00% 85.71% 3
Go To Toilets 100.00% 80.00% 88.89% 5 100.00% 100.00% 100.00% 3 66.67% 100.00% 80.00% 2
Leave Home 100.00% 75.00% 85.71% 4 100.00% 100.00% 100.00% 3 100.00% 50.00% 66.67% 2
Sleep 37.50% 100.00% 54.55% 3 66.67% 100.00% 80.00% 4 100.00% 100.00% 100.00% 2
Wash Dishes 100.00% 80.00% 88.89% 5 0.00% 0.00% 0.00% 4 57.14% 100.00% 72.73% 4
Watch TV 100.00% 85.71% 92.31% 7 100.00% 80.00% 88.89% 5 66.67% 100.00% 80.00% 2
Accuracy 79.59% 78.38% 78.57%
Balanced Accuracy 81.74% 79.67% 81.67%
Macro Avg 85.42% 81.74% 81.35% 49 81.67% 79.67% 78.22% 37 86.55% 81.67% 79.08% 28
Weighted Avg 87.33% 79.59% 81.67% 49 77.48% 78.38% 74.89% 37 86.44% 78.57% 76.58% 28
Confusion matrices: (a) Subject 9 real data; (b) Subject 7 real data; (c) Subject 3 real data.
4.4. Summary
The experiments conducted in this section yielded valuable insights. The results
demonstrated that the simulator has the ability to generate synthetic data that closely
resemble real-world data. The activity recognition algorithm performed similarly on
both synthetic and real data, indicating that training the algorithm solely on synthetic data
can effectively recognize activities in real-world scenarios. Moreover, when the entire set
of generated synthetic data was utilized, the algorithm’s performance improved for each
subject. This improvement can be attributed to the increased variability and examples
provided by the additional synthetic data, allowing the algorithm to better generalize and
capture the behavior of sensors during different activities.
5. Conclusions
In this study, we have explored the potential of leveraging a digital twin concept to
generate synthetic data for Human Activity Recognition (HAR) algorithms in the context
of smart homes. Our primary objective was to bridge the gap between training data and
real-world usage data, effectively addressing the challenges posed by the variability and
sparsity of activity traces in smart home environments.
To achieve this, we introduced the VirtualSmartHome simulator, which enhanced the
Virtual Home environment to support sensor-based data simulation. With this simulator,
we successfully replicated a real smart apartment and generated synthetic data by modeling
the behaviors of residents through avatars. The extensive evaluation and metric analysis
revealed a significant similarity between the synthetic data generated by the simulator and
the real data collected from volunteers in the smart apartment.
We then utilized the synthetic data to train a HAR algorithm, which demonstrated
robust activity recognition performance on the real data, achieving an average F1 score of
approximately 80%. Although this experiment was conducted on a relatively small dataset,
the promising results underscore the viability of our approach.
However, we acknowledge the need for further in-depth discussion and analysis to gain
deeper insights from the results. In future work, we intend to explore the limitations of our
study, specifically focusing on the impact of data collection in a lab environment versus a
real-world home and the significance of dataset size. Understanding these aspects is critical for
assessing the generalizability and practical applicability of our proposed approach.
To achieve this, we plan to expand the experiment by generating more synthetic
data from additional volunteers’ activities. Additionally, we aim to extend the evaluation
to include a larger number of real smart houses, allowing for a more comprehensive
assessment of our approach’s performance across diverse environments.
Furthermore, we will explore the possibility of integrating scenarios with multiple
agents to enrich datasets with more complex situations. Additionally, we intend to augment
the simulator with new sensing modalities, such as audio sensors and radar systems, among
others. These additions will not only enhance the realism of the synthetic data
but also broaden the scope of activity recognition research within smart homes.
In our future work, we aim to investigate training the algorithm from scratch in
a house without replicating the labeled activities of the final resident. Instead, we will
solely utilize the activity scripts provided by our volunteers, enabling a more realistic and
autonomous training process. By addressing these aspects in our future work, we aim to
further validate and enhance the effectiveness of our approach for generating synthetic
data and training HAR algorithms in smart homes.
By harnessing the concept of digital twins and generating realistic synthetic data, we
effectively mitigate the challenges posed by limited datasets and the gap between training
and real environment data, thereby enhancing the applicability of HAR algorithms in
real-world smart home environments. Our study contributes to the development of a
comprehensive tool and methodology for implementing digital twins in the context of
smart homes, enabling the development of more effective ADL classification algorithms
and ultimately improving the performance of smart home systems.
In conclusion, the results obtained from this work highlight the potential of digital
twins in generating synthetic data and training HAR algorithms for smart homes. Our
future research will focus on addressing the identified limitations and further validating
the approach with larger and more diverse datasets from real smart homes. We firmly
believe that our findings significantly contribute to the advancement of HAR technology,
paving the way for more efficient and adaptive smart home systems.
Author Contributions: Conceptualization, D.B. and J.G.; methodology, D.B., S.M.N. and C.L.; software, D.B. and J.G.; validation, D.B., J.G., S.M.N. and C.L.; formal analysis, D.B.; investigation, D.B.; data curation, D.B. and J.G.; writing—original draft preparation, D.B. and J.G.; writing—review and editing, D.B., J.G., S.M.N., C.L. and X.P.; visualization, J.G.; supervision, S.M.N., C.L. and X.P. All authors have read and agreed to the published version of the manuscript.
Funding: This work is partially supported by project VITAAL and is financed by Brest Metropole,
the region of Brittany and the European Regional Development Fund (ERDF). This work is partially
supported by the “plan France relance” of 21 December 2020 and a CIFRE agreement with the
company Delta Dore in Bonemain 35270 France, managed by the National Association of Technical
Research (ANRT) in France. We gratefully acknowledge the support of AID Project ACoCaTherm
which supported the dataset creation.
Institutional Review Board Statement: Not applicable
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Not applicable
Acknowledgments: We would like to thank Jérôme Kerdreux for his support in this project.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Dang, L.M.; Min, K.; Wang, H.; Piran, M.J.; Lee, C.H.; Moon, H. Sensor-based and vision-based human activity recognition: A
comprehensive survey. Pattern Recognit. 2020, 108, 107561. [CrossRef]
2. Bouchabou, D.; Nguyen, S.M.; Lohr, C.; LeDuc, B.; Kanellos, I. A Survey of Human Activity Recognition in Smart Homes Based
on IoT Sensors Algorithms: Taxonomies, Challenges, and Opportunities with Deep Learning. Sensors 2021, 21, 6037. [CrossRef]
[PubMed]
3. Tan, T.H.; Gochoo, M.; Huang, S.C.; Liu, Y.H.; Liu, S.H.; Huang, Y.F. Multi-resident activity recognition in a smart home using
RGB activity image and DCNN. IEEE Sens. J. 2018, 18, 9718–9727. [CrossRef]
4. Bouchabou, D.; Nguyen, S.M.; Lohr, C.; Kanellos, I.; Leduc, B. Fully Convolutional Network Bootstrapped by Word Encoding
and Embedding for Activity Recognition in Smart Homes. In Proceedings of the IJCAI 2020 Workshop on Deep Learning for
Human Activity Recognition, Yokohama, Japan, 7–15 January 2021.
5. Liciotti, D.; Bernardini, M.; Romeo, L.; Frontoni, E. A Sequential Deep Learning Application for Recognising Human Activities in
Smart Homes. Neurocomputing 2019, 396, 501–513. [CrossRef]
6. Hussain, Z.; Sheng, Q.; Zhang, W.E. Different Approaches for Human Activity Recognition: A Survey. arXiv 2019, arXiv:1906.05074.
7. Tapia, E.M.; Intille, S.S.; Larson, K. Activity recognition in the home using simple and ubiquitous sensors. In Proceedings of the
International Conference on Pervasive Computing, Vienna, Austria, 21–23 April 2004; Springer: Berlin/Heidelberg, Germany,
2004; pp. 158–175.
8. van Kasteren, T.L.; Englebienne, G.; Kröse, B.J. Human activity recognition from wireless sensor network data: Benchmark and
software. In Activity Recognition in Pervasive Intelligent Environments; Atlantis Press: Paris, France, 2011; pp. 165–186.
9. Alemdar, H.; Ertan, H.; Incel, O.D.; Ersoy, C. ARAS human activity datasets in multiple homes with multiple residents. In
Proceedings of the 2013 7th International Conference on Pervasive Computing Technologies for Healthcare and Workshops,
Venice, Italy, 5–8 May 2013; pp. 232–235.
10. Cumin, J.; Lefebvre, G.; Ramparany, F.; Crowley, J.L. A dataset of routine daily activities in an instrumented home. In Proceedings
of the International Conference on Ubiquitous Computing and Ambient Intelligence, Philadelphia, PA, USA, 7–10 November
2017; Springer: Cham, Switzerland, 2017; pp. 413–425.
11. Grieves, M. Digital Twin: Manufacturing Excellence through Virtual Factory Replication; Digital Twin White Paper; Digital Twin
Consortium: Boston, MA, USA, 2014; Volume 1, pp. 1–7.
12. Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisci-
plinary Perspectives on Complex Systems; Springer: Cham, Switzerland, 2017; pp. 85–113.
13. Barricelli, B.R.; Casiraghi, E.; Fogli, D. A survey on digital twin: Definitions, characteristics, applications, and design implications.
IEEE Access 2019, 7, 167653–167671. [CrossRef]
14. Ngah Nasaruddin, A.; Ito, T.; Tee, B.T. Digital Twin Approach to Building Information Management. Proc. Manuf. Syst. Div. Conf.
2018, 2018, 304. [CrossRef]
15. Khajavi, S.; Hossein Motlagh, N.; Jaribion, A.; Werner, L.; Holmström, J. Digital Twin: Vision, Benefits, Boundaries, and Creation
for Buildings. IEEE Access 2019, 7, 147406–147419. [CrossRef]
16. Puig, X.; Ra, K.; Boben, M.; Li, J.; Wang, T.; Fidler, S.; Torralba, A. Virtualhome: Simulating Household Activities via Programs. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 8494–8502.
17. Sedky, M.; Howard, C.; Alshammari, T.; Alshammari, N. Evaluating machine learning techniques for activity classification in
smart home environments. Int. J. Inf. Syst. Comput. Sci. 2018, 12, 48–54.
18. Gochoo, M.; Tan, T.H.; Liu, S.H.; Jean, F.R.; Alnajjar, F.S.; Huang, S.C. Unobtrusive activity recognition of elderly people living
alone using anonymous binary sensors and DCNN. IEEE J. Biomed. Health Inform. 2018, 23, 693–702. [CrossRef]
19. Singh, D.; Merdivan, E.; Hanke, S.; Kropf, J.; Geist, M.; Holzinger, A. Convolutional and recurrent neural networks
for activity recognition in smart environment. In Towards Integrative Machine Learning and Knowledge Extraction, Proceed-
ings of the BIRS Workshop, Banff, AB, Canada, 24–26 July 2015; Springer: Cham, Switzerland, 2017; pp. 194–205.
20. Singh, D.; Merdivan, E.; Psychoula, I.; Kropf, J.; Hanke, S.; Geist, M.; Holzinger, A. Human activity recognition using recurrent
neural networks. In Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction,
Reggio, Italy, 29 August–1 September 2017; Springer: Cham, Switzerland, 2017; pp. 267–274.
21. Cook, D.J.; Crandall, A.S.; Thomas, B.L.; Krishnan, N.C. CASAS: A Smart Home in a Box. Computer 2013, 46, 62–69. [CrossRef]
22. De-La-Hoz-Franco, E.; Ariza-Colpas, P.; Quero, J.M.; Espinilla, M. Sensor-based datasets for human activity recognition—A
systematic review of literature. IEEE Access 2018, 6, 59192–59210. [CrossRef]
23. Golestan, S.; Stroulia, E.; Nikolaidis, I. Smart Indoor Space Simulation Methodologies: A Review. IEEE Sens. J. 2022, 22, 8337–8359.
[CrossRef]
24. Bruneau, J.; Consel, C.; OMalley, M.; Taha, W.; Hannourah, W.M. Virtual testing for smart buildings. In Proceedings of the 2012
Eighth International Conference on Intelligent Environments, Guanajuato, Mexico, 26–29 June 2012; pp. 282–289.
25. Kolve, E.; Mottaghi, R.; Han, W.; VanderBilt, E.; Weihs, L.; Herrasti, A.; Gordon, D.; Zhu, Y.; Gupta, A.; Farhadi, A. AI2-THOR:
An Interactive 3D Environment for Visual AI. arXiv 2019, arXiv:1712.05474.
26. Xia, F.; Zamir, A.R.; He, Z.; Sax, A.; Malik, J.; Savarese, S. Gibson Env: Real-World Perception for Embodied Agents. In Proceedings
of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 9068–9079. [CrossRef]
27. Roitberg, A.; Schneider, D.; Djamal, A.; Seibold, C.; Reiß, S.; Stiefelhagen, R. Let’s Play for Action: Recognizing Activities of Daily
Living by Learning from Life Simulation Video Games. arXiv 2021, arXiv:2107.05617.
28. Savva, M.; Kadian, A.; Maksymets, O.; Zhao, Y.; Wijmans, E.; Jain, B.; Straub, J.; Liu, J.; Koltun, V.; Malik, J.; et al. Habitat: A
Platform for Embodied AI Research. arXiv 2019, arXiv:1904.01201.
29. Alshammari, N.; Alshammari, T.; Sedky, M.; Champion, J.; Bauer, C. OpenSHS: Open smart home simulator. Sensors 2017,
17, 1003. [CrossRef] [PubMed]
30. Ho, B.; Vogts, D.; Wesson, J. A smart home simulation tool to support the recognition of activities of daily living. In Proceedings of
the South African Institute of Computer Scientists and Information Technologists 2019, Skukuza, South Africa, 17–18 September
2019; pp. 1–10.
31. Lee, J.W.; Cho, S.; Liu, S.; Cho, K.; Helal, S. Persim 3D: Context-Driven Simulation and Modeling of Human Activities in Smart
Spaces. IEEE Trans. Autom. Sci. Eng. 2015, 12, 1243–1256. [CrossRef]
32. Synnott, J.; Chen, L.; Nugent, C.; Moore, G. IE Sim—A Flexible Tool for the Simulation of Data Generated within Intelligent Environ-
ments. In Proceedings of the Ambient Intelligence, Pisa, Italy, 13–15 November 2012; Paternò, F., de Ruyter, B., Markopoulos, P., Santoro,
C., van Loenen, E., Luyten, K., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2012; pp. 373–378.
33. Bouchard, K.; Ajroud, A.; Bouchard, B.; Bouzouane, A. SIMACT: A 3D Open Source Smart Home Simulator for Activity
Recognition with Open Database and Visual Editor. Int. J. Hybrid Inf. Technol. 2012, 5, 13–32.
34. Park, B.; Min, H.; Bang, G.; Ko, I. The User Activity Reasoning Model in a Virtual Living Space Simulator. Int. J. Softw. Eng. Its
Appl. 2015, 9, 53–62. [CrossRef]
35. Francillette, Y.; Boucher, E.; Bouzouane, A.; Gaboury, S. The Virtual Environment for Rapid Prototyping of the Intelligent
Environment. Sensors 2017, 17, 2562. [CrossRef] [PubMed]
36. Buchmayr, M.; Kurschl, W.; Küng, J. A simulator for generating and visualizing sensor data for ambient intelligence environments.
Procedia Comput. Sci. 2011, 5, 90–97. [CrossRef]
37. Armac, I.; Retkowitz, D. Simulation of smart environments. In Proceedings of the IEEE International Conference on Pervasive
Services, Istanbul, Turkey, 15–20 July 2007; pp. 257–266.
38. VirtualHome. Available online: http://www.virtual-home.org/ (accessed on 21 January 2021).
39. Savva, M.; Chang, A.X.; Dosovitskiy, A.; Funkhouser, T.; Koltun, V. MINOS: Multimodal indoor simulator for navigation in
complex environments. arXiv 2017, arXiv:1712.03931.
40. Roberts, M.; Ramapuram, J.; Ranjan, A.; Kumar, A.; Bautista, M.A.; Paczan, N.; Webb, R.; Susskind, J.M. Hypersim: A
Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding. arXiv 2021, arXiv:2011.02523.
41. Srivastava, S.; Li, C.; Lingelbach, M.; Martín-Martín, R.; Xia, F.; Vainio, K.; Lian, Z.; Gokmen, C.; Buch, S.; Liu, C.K.; et al.
BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments. arXiv 2021,
arXiv:2108.03332.
42. Shridhar, M.; Thomason, J.; Gordon, D.; Bisk, Y.; Han, W.; Mottaghi, R.; Zettlemoyer, L.; Fox, D. Alfred: A benchmark for
interpreting grounded instructions for everyday tasks. In Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10740–10749.
43. Puig, X.; Shu, T.; Li, S.; Wang, Z.; Liao, Y.H.; Tenenbaum, J.B.; Fidler, S.; Torralba, A. Watch-And-Help: A Challenge for Social
Perception and Human-AI Collaboration. In Proceedings of the International Conference on Learning Representations, Online,
3–7 May 2021.
44. Cao, Z.; Gao, H.; Mangalam, K.; Cai, Q.; Vo, M.; Malik, J. Long-term human motion prediction with scene context. In Proceedings
of the ECCV, Glasgow, UK, 23–28 August 2020.
45. Varol, G.; Romero, J.; Martin, X.; Mahmood, N.; Black, M.J.; Laptev, I.; Schmid, C. Learning from Synthetic Humans. In Proceedings
of the CVPR, Honolulu, HI, USA, 21–26 July 2017.
46. Synnott, J.; Nugent, C.; Jeffers, P. Simulation of smart home activity datasets. Sensors 2015, 15, 14162–14179. [CrossRef]
47. Kamara-Esteban, O.; Azkune, G.; Pijoan, A.; Borges, C.E.; Alonso-Vicario, A.; López-de Ipiña, D. MASSHA: An agent-based
approach for human activity simulation in intelligent environments. Pervasive Mob. Comput. 2017, 40, 279–300. [CrossRef]
48. Helal, S.; Mann, W.; El-Zabadani, H.; King, J.; Kaddoura, Y.; Jansen, E. The gator tech smart house: A programmable pervasive
space. Computer 2005, 38, 50–60. [CrossRef]
49. Puig, X. VirtualHome Source Code. Available online: https://github.com/xavierpuigf/virtualhome_unity (accessed on 25 January 2021).
50. Liao, Y.; Puig, X.; Boben, M.; Torralba, A.; Fidler, S. Synthesizing Environment-Aware Activities via Activity Sketches. In
Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA,
15–20 June 2019; pp. 6284–6292, ISSN: 2575-7075. [CrossRef]
51. VirtualHome Source Code and API. Available online: https://github.com/xavierpuigf/virtualhome (accessed on 25 January 2021).
52. Sweet Home 3D-Draw Floor Plans and Arrange Furniture Freely. Available online: https://www.sweethome3d.com/ (accessed on 10 September 2022).
53. Experiment'Haal, le Living Lab Santé Autonomie (LLSA). Available online: http://www.imt-atlantique.fr/fr/recherche-et-innovation/plateformes-de-recherche/experiment-haal (accessed on 21 January 2021).
54. Lohr, C.; Kerdreux, J. Improvements of the xAAL home automation system. Future Internet 2020, 12, 104. [CrossRef]
55. Future-Shape. SensFloor—The Floor Becomes a Touch Screen. Available online: https://future-shape.com/en/system/ (accessed on 6 December 2021).
56. Katz, S. Assessing self-maintenance: Activities of daily living, mobility, and instrumental activities of daily living. J. Am. Geriatr.
Soc. 1983, 31, 721–727. [CrossRef]
57. Cross-Correlation. Available online: https://en.wikipedia.org/w/index.php?title=Cross-correlation&oldid=1031522391 (accessed on 17 August 2021).
58. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef] [PubMed]
59. Sammut, C.; Webb, G.I. (Eds.) Leave-One-Out Cross-Validation. In Encyclopedia of Machine Learning; Springer: Boston, MA, USA,
2010; pp. 600–601. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.