fault-tolerance-in-iot
Abstract—Fault tolerance increases system availability and reliability by making systems robust to failures and proactive enough to handle them. Fault tolerance can be introduced at different architectural layers of the Internet of Things (IoT) because a fault can occur at any layer. For example, motion sensors and motors can fail at the root layer, network connectivity can be disrupted at the network layer, and computation and storage nodes can behave erroneously in their respective layers, so it becomes crucial to introduce fault tolerance at every layer of an IoT system. This study paves the way for classifying current and possible fault-tolerant approaches by presenting the techniques (replication, network control, etc.), architectural patterns (centralized, hybrid, etc.), layers (network, sense, etc.), and styles (microservices, publish-subscribe, etc.) that can help make a system fault tolerant efficiently. The paper also discusses current trends in fault tolerance, areas that have been widely studied, and areas that offer scope for future work in making IoT-based systems fault tolerant and efficient.

Keywords—Fault Tolerance, Internet of Things, Replication, Reliability, Availability.

I. INTRODUCTION
The IoT is the internal and external communication of intelligent elements through the internet in order to deliver smart services [1]. A dependable IoT system should offer reliable, fault-free services. A fault is a flaw that impacts the correct functionality of hardware or software systems [2]. Because IoT devices are heterogeneous, highly distributed, battery-powered, reliant on wireless communication, and affected by scalability, it is especially difficult to create a pattern for fault tolerance in IoT. The distributed nature of IoT devices [3] may cause the system to suffer from server crashes, server omissions, incorrect responses, and arbitrary errors. Their reliance on wireless links and batteries makes IoT devices hard to recover [4]. In addition, exposure to new equipment and facilities influences the performance of the system.

Although the IoT was introduced more than a decade ago [5], researchers are still working to define many of its aspects and quality of service (QoS) attributes, such as fault tolerance. The purpose of this research is therefore to define and classify the state of the art of the domain and to highlight the approaches, techniques, and architectures that are potentially relevant for modelling IoT with fault tolerance. A comprehensive mapping analysis has been carried out to achieve this objective. The primary studies were selected on the basis of precise inclusion and exclusion criteria and a detailed review.

The paper is organized as follows. Section II discusses related works in the field of fault tolerance in IoT. Section III explains the taxonomy on the basis of which fault tolerance in different systems is compared. The comparison is done in Section IV. Section V presents the current trends in the field of fault tolerance in IoT. Section VI concludes the paper.

II. RELATED WORKS
Moghaddam et al. [6] discussed different ways of achieving fault tolerance in IoT systems, along with fault tolerance aspects and subdomains. The paper also shows changing and emerging trends in the field of fault tolerance in IoT. The study was performed as a systematic mapping and lays a foundation for future studies of fault tolerance in the IoT domain.

Rullo et al. [7] reviewed redundancy-based fault tolerance techniques that target availability and data integrity. The paper discusses fault tolerance implementation techniques and approaches at the sensing and network layers. It reviews recently proposed approaches for achieving fault tolerance and shows how they can be implemented to introduce fault tolerance at the device level, overcoming disadvantages of older algorithms.

III. TAXONOMY
The aim of this study is framed according to the Goal-Question-Metric perspectives, which are as follows:

Purpose: to have a thorough understanding of IoT fault-tolerant systems.
Issue: through the detection, classification, and analysis of different approaches, techniques, and architectures.
Object: approaches based on existing IoT frameworks.
Viewpoint: from the perspectives of both research and industry.

We then considered all the selected studies and filtered them according to a set of well-defined criteria for inclusion and exclusion. According to the guidelines, two key drivers shaped the inclusion/exclusion criteria: (i) keeping the focus of the selected papers on the scope of the study; and (ii) avoiding grey or non-scientific work.
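
The two selection criteria above can be illustrated with a small filter. The sketch below is only an illustration under stated assumptions: the `Study` record, its fields (`title`, `topics`, `peer_reviewed`), the scope keywords, and the sample entries are all hypothetical and do not reproduce the authors' actual screening protocol or data.

```python
from dataclasses import dataclass, field

# Hypothetical scope keywords; the paper does not list its exact search terms.
SCOPE_TOPICS = {"fault tolerance", "iot"}


@dataclass
class Study:
    """Hypothetical record describing one candidate study."""
    title: str
    topics: set = field(default_factory=set)
    peer_reviewed: bool = False


def is_included(study: Study) -> bool:
    """Apply both criteria: (i) within scope, (ii) not grey literature."""
    in_scope = SCOPE_TOPICS.issubset({t.lower() for t in study.topics})
    return in_scope and study.peer_reviewed


def select_primary_studies(candidates):
    """Return only the candidates that pass both criteria."""
    return [s for s in candidates if is_included(s)]


# Made-up candidates, for illustration only.
candidates = [
    Study("Peer-reviewed, in scope", {"IoT", "Fault Tolerance"}, True),
    Study("Blog post, in scope", {"IoT", "Fault Tolerance"}, False),
    Study("Peer-reviewed, out of scope", {"IoT", "Energy"}, True),
]
selected = select_primary_studies(candidates)
print([s.title for s in selected])
```

Keeping each criterion as a separate boolean makes it easy to record why a study was excluded, which a mapping study would typically need to report.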
This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 International License
Asian Journal of Convergence in Technology Volume VII and Issue I
ISSN NO: 2350-1146 I.F-5.11
3) Distributed Recovery Block:
Papers [3] and [13] employ the distributed recovery block technique to ensure that node computations are error-free.

4) Time Redundancy:
Paper [3] follows the time redundancy technique.

E. Quality of IoT Service
Quality of Service (QoS) is an ever-increasing network requirement today. New applications, such as voice and live video transmission, which are accessible to consumers over the internet, raise the standards for the quality of the services offered. When the traffic volume is greater than what can be transmitted over the network, devices queue the packets, holding them in memory until resources become available to transmit them. Papers [8], [9], [11], and [12] focus on performance. Availability is used as an attribute in papers [8], [9], and [10]; security is described in papers [1] and [4]; scalability is used as an attribute in paper [9]; interoperability is used as an attribute in paper [4]; and energy consumption is the focus of paper [9].

V. TRENDS
In most of the papers reviewed, the actuate and sense layer was targeted to introduce fault tolerance, replication and network control techniques were primarily employed, and performance and availability were the QoS attributes most often discussed. Some papers [1], [3], [9] discussed novel approaches to achieving fault tolerance that remove disadvantages of older techniques.

Energy consumption, one of the QoS attributes, receives less attention when systems are made fault tolerant, so there is wide scope for work in this area. The time redundancy technique is also employed in only a handful of papers, which leaves room for future research. It was also observed that the correspondence between fault tolerance techniques and the associated architecture is less studied. So, although fault tolerance in IoT has been studied for over a decade, there is still much scope for improvement in the field.

VI. CONCLUSION
In this paper, we present a systematic mapping analysis with the objective of classifying and defining the state of the art of the domain and extracting a collection of methods and techniques for fault tolerance in IoT. The fault tolerance capability of some papers shows that cloud data center faults can be addressed in real time by customized design before repair becomes available. In comparison to some of the contributions discussed in the related works, the transition of data from failed devices to safe ones takes more time and causes delay. The findings of this study are both research-oriented and industry-oriented and are intended to establish a context for future research on fault tolerance in IoT. As future work, we will analyze the possible incorporation of existing research at the industrial level of IoT. The study will help readers analyze IoT systems in detail.

REFERENCES
[1] D. Terry, "Toward a New Approach to IoT Fault Tolerance," Computer, vol. 49, no. 8, pp. 80-83, Aug. 2016.
[2] "What Is Fault Tolerance?: Creating a Fault Tolerant System: Imperva," Learning Center, Imperva, 30 Dec. 2019, www.imperva.com/learn/availability/fault-tolerance/.
[3] T. Chakraborty, A. U. Nambi, R. Chandra, R. Sharma, M. Swaminathan, Z. Kapetanovic, and J. Appavoo, "Fall-curve: A novel primitive for IoT Fault Detection and Isolation," in Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems (SenSys '18), Association for Computing Machinery, New York, NY, USA, 2018, pp. 95–107.
[4] N. Mohamed, J. Al-Jaroodi, and I. Jawhar, "Towards Fault Tolerant Fog Computing for IoT-Based Smart City Applications," 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 2019, pp. 0752-0757.
[5] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, "Internet of Things (IoT): A vision, architectural elements, and future directions," Future Generation Computer Systems, vol. 29, no. 7, pp. 1645-1660, 2013.
[6] M. T. Moghaddam and H. Muccini, "Fault-tolerant IoT," in International Workshop on Software Engineering for Resilient Systems, Springer, Cham, 2019, pp. 67-84.
[7] A. Rullo, E. Serra, and J. Lobo, "Redundancy as a Measure of Fault-Tolerance for the Internet of Things: A Review," in Policy-Based Autonomic Data Governance, Springer, Cham, 2019, pp. 202-226.
[8] A. Javed, K. Heljanko, A. Buda, and K. Främling, "CEFIoT: A fault-tolerant IoT architecture for edge and cloud," 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Singapore, 2018, pp. 813-818.
[9] M. Z. Hasan and F. Al-Turjman, "Optimizing Multipath Routing With Guaranteed Fault Tolerance in Internet of Things," IEEE Sensors Journal, vol. 17, no. 19, pp. 6463-6473, 1 Oct. 2017.
[10] P. H. Su, C. Shih, J. Y. Hsu, K. Lin, and Y. Wang, "Decentralized fault tolerance mechanism for intelligent IoT/M2M middleware," 2014 IEEE World Forum on Internet of Things (WF-IoT), Seoul, 2014, pp. 45-50.
[11] S. Zhou, K. Lin, J. Na, C. Chuang, and C. Shih, "Supporting Service Adaptation in Fault Tolerant Internet of Things," 2015 IEEE 8th International Conference on Service-Oriented Computing and Applications (SOCA), Rome, 2015, pp. 65-72.
[12] M. Mudassar, Y. Zhai, L. Liao, and J. Shen, "A Decentralized Latency-Aware Task Allocation and Group Formation Approach With Fault Tolerance for IoT Applications," IEEE Access, vol. 8, pp. 4912-4923, 2020.
[13] A. Celesti, L. Carnevale, A. Galletta, M. Fazio, and M. Villari, "A Watchdog Service Making Container-Based Micro-services Reliable in IoT Clouds," 2017 IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud), Prague, 2017, pp. 372-378.