Preprints202203 0087 v1
1 Swinburne University of Technology, Melbourne, VIC-3122, Australia.
2 Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Chittagong-4349, Bangladesh.
3 Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah-21589, Saudi Arabia.
∗Correspondence: [email protected] (Iqbal H. Sarker)

Abstract The Internet of Things (IoT) is one of the most widely used technologies today, and it has a significant effect on our lives in a variety of ways, including social, commercial, and economic aspects. In terms of automation, productivity, and comfort for consumers across a wide range of application areas, from education to smart cities, present and future IoT technologies hold great promise for improving the overall quality of human life. However, cyber-attacks and threats greatly affect smart applications in the IoT environment. Traditional IoT security techniques are insufficient for recent security challenges, given the rapid growth of different kinds of attacks and threats. Utilizing artificial intelligence (AI) expertise, especially machine and deep learning solutions, is the key to delivering a dynamically enhanced and up-to-date security system for the next-generation IoT system. Throughout this article, we present a comprehensive picture of IoT security intelligence, which is built on machine and deep learning technologies that extract insights from raw data to intelligently protect IoT devices against a variety of cyber-attacks. Finally, based on our study, we highlight the associated research issues and future directions within the scope of our study. Overall, this article aspires to serve as a reference point and guide, particularly from a technical standpoint, for cybersecurity experts and researchers working in the context of IoT.

Keywords Internet of Things; cyber-attacks; anomalies; machine learning; deep learning; IoT data analytics; intelligent decision-making; security intelligence

1 Introduction

The Internet of Things (IoT) is one of the most widely used technologies today and is often described as a connected network of heterogeneous components enabling intelligent systems and services that detect, capture, distribute, and analyze data. ‘Things’ in the IoT refer to smart devices, such as sensors, smartwatches, smart refrigerators, smoke detectors, radio frequency identification (RFID), heartbeat monitors, accelerometers, smartphones, and so on, that collect and transmit data. The number of connected things in IoT systems is increasing day by day. For instance, there will be about 20.4 billion connected things globally in 2022, compared to 8.4 billion connected things in 2020 [57]. The IoT has a significant effect on our lives in a variety of ways, including social, commercial, and economic aspects. In terms of growing the digital economy, the IoT sector is projected to grow in revenue from 892 billion in 2018 to 4 trillion by 2025 [57]. The IoT enables large-scale technological advancements and value-added services in a variety of areas of our lives, including smart homes, smart cities, transportation, logistics, smart health, retail, agriculture, and business, as well as smart metering, remote monitoring, and process automation. In terms of automation, performance, and comfort, current and future IoT applications and
2 Sarker et al.
services have tremendous potential for enhancing consumer quality of life. However, in the context of IoT, numerous sorts of cyber-attacks and threats are viewed as challenging problems for the expansion of IoT. Therefore, this paper focuses primarily on IoT security intelligence to effectively protect systems and applications from a variety of cyber-attacks and threats in IoT.

The most basic need in the IoT network is to protect all of the systems, apps, and connected devices. IoT networks’ massive size introduces new challenges in a variety of areas, including device management, data management, computing, security, and privacy. As the IoT grows, various security concerns are being raised as potential threats. Without a trusted system, the emerging IoT applications, such as those mentioned above, will be unable to meet the needs of people and society and may lose all their potential. Typically, IoT systems operate on several layers, including the perception or sensing layer, the networking and data communication layer, the middleware or support layer, and the application layer. These layers are briefly discussed in Section 3. Each of these layers has a unique set of tasks and relevant technologies to perform in an IoT application, and each layer brings a new set of issues and security risks. For example, denial of service (DoS) attacks, spoofing attacks, jamming, eavesdropping, data tampering, man-in-the-middle attacks, and malicious, etc. are the most common IoT attacks [137]. Thus, depending on the nature of the security issues, potential IoT security solutions such as authentication, access control, threat and risk prediction, malware analysis, anomaly or intrusion detection, and prevention could be useful. Due to the rapid growth in security threats and attacks, and the complexity of security incidents, the conventional techniques for dealing with them are no longer effective. Therefore, an intelligent security system based on modern technologies that can address these security concerns is urgently required to protect the next-generation IoT system.

Artificial Intelligence (AI) is one of the most important technologies for developing intelligent systems, and it is considered to be a part of the Fourth Industrial Revolution (4IR) [119] [130] as well. Thus, utilizing AI knowledge, particularly machine and deep learning, we can detect anomalies or unwanted malicious activities in the IoT and, as a result, offer a dynamic security solution that is constantly improved and up to date. Typically, machine or deep learning models comprise a set of rules, methods, or complex transfer functions that extract useful insights or interesting data patterns from the security data [122]. Thus, it is possible to utilize the resultant security models to train machines to predict threats or risks at an early stage, or to identify anomalies in IoT to develop an appropriate defensive policy. Based on information gathered so far from the literature on these technologies and their use in the IoT environment, the contribution of this article is summarized as follows:

– This study concentrates on the knowledge of artificial intelligence, particularly machine and deep learning-based IoT security solutions and their effectiveness.
– We discuss the IoT environment, various challenging IoT security issues, IoT systems with various layers, and the associated security issues in each layer, to highlight the scope of this study.
– We present different machine learning techniques as well as deep learning architectures and techniques, and their usage for intelligent security modeling to solve security problems in the IoT environment.
– Finally, we explore the issues that have been encountered, as well as potential research opportunities and future directions, to secure and trust IoT networks and systems.

The remainder of the paper is organized as follows. Section 2 discusses the domain’s history and reviews related work. We discuss IoT system architectures with different layers and the associated security issues in each layer in Section 3. We present various machine and deep learning-based security solutions in the IoT environment in Section 4. The challenges faced, as well as prospective study opportunities and future directions, are highlighted in Section 5, and the work is concluded in Section 6.

2 Background and Related Work

In this section, we present a comprehensive literature review on the IoT environment with various application areas, challenging IoT security issues, and recent IoT security approaches including machine learning techniques, and highlight the scope of our study.

2.1 The IoT Paradigm

The Internet of Things (IoT) represents a paradigm shift in information technology. The term ‘Internet of Things,’ which is also abbreviated as IoT, is composed of two key words: the first is ‘Internet,’ and the second is ‘Things’, where the Things are defined as smart devices or objects.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 7 March 2022 doi:10.20944/preprints202203.0087.v1
The Internet of Things (IoT) is one of the emerging smart technologies for the Fourth Industrial Revolution (or Industry 4.0), which represents the ongoing automation of traditional manufacturing and industrial practices [130]. The IoT refers to a network of interconnected, internet-connected devices that may collect and send data over a wireless network without the need for human intervention. Several organizations and research groups describe IoT and smart environments in a variety of ways and from a variety of perspectives. For instance, Thiesse et al. [141] define the IoT as “consisting of hardware items and digital information flows based on RFID tags”. The Institute of Electrical and Electronics Engineers (IEEE) defines the IoT as a “collection of items with sensors that form a network connected to the Internet” [93]. The European Telecommunications Standards Institute (ETSI) defines “machine-to-machine (M2M) communications as an automated communications system that makes decisions and processes data operations without direct human intervention” [72]. Cisco (San Francisco), which is well-known as the worldwide leader in IT, networking, and cybersecurity solutions, has summarized the IoE (Internet-of-everything) concept “as a network that consists of people, data, things, and processes” [36].

The RFID (Radio Frequency Identification) group defines the “IoT as the worldwide network of interconnected objects uniquely addressable based on standard communication protocols” [143]. According to the Cluster of European research projects on the IoT [133], “Things are active participants in business, information and social processes where they are enabled to interact and communicate among themselves and with the environment by exchanging data and information sensed about the environment while reacting autonomously to the real/physical world events and influencing it by running processes that trigger actions and create services with or without direct human intervention”. Gubbi et al. [50] define “IoT is the interconnection of sensing and actuating devices providing the ability to share information across platforms through a unified framework, developing a common operating picture for enabling innovative applications”. Atzori et al. [29] define IoT in three paradigms: internet-oriented (middleware), things-oriented (sensors), and semantic-oriented (knowledge).

In general, the IoT’s main pillars are as follows: smart devices, data, analytics, and connectivity. Thus, the IoT can be defined as a network of connected heterogeneous components that can sense, collect, transmit, and analyze data over a wireless network to enable intelligent decision making and services aimed at improving the quality of human life, where the Things are defined as smart devices or objects such as sensors, smartwatches, and smartphones.

2.2 IoT-based Smart Environments

A smart environment is typically a world where sensors and computing devices are integrated with everyday objects through a connected network to enhance the comfort and efficiency of human life. Ahmed et al. [23] state that “the term ‘smart’ refers to the ability to autonomously obtain and apply knowledge, and the term ‘environment’ refers to the surroundings”. According to Belissent et al. [32], “a smart environment uses information and communications technologies to make the critical infrastructure components and services of a city’s administration, education, healthcare, public safety, real estate, transportation and utilities more aware, interactive and efficient”. Recent developments in IoT have elevated it to the status of a technology for creating smart environments, such as intelligent cities, intelligent healthcare systems, and intelligent building management systems. Figure 1 and Figure 2 depict the total number of connected IoT devices and the worldwide IoT market [137], as well as the potential economic impact and projected market share of dominant IoT applications by 2025 [24].

The goal of such smart environments is to provide services based on data acquired by IoT-enabled sensors using intelligent methods, which has a significant impact on our lives [124] in various dimensions, such as social, commercial, and economic. According to the statistics of Navigant Research mentioned in Elrawy et al. [43], the global smart city services market is expected to reach 225.5 billion US dollars by 2026, up from 93.5 billion US dollars in 2017. A range of factors, such as usable bandwidth, serving an increasing number of users and smart objects in IoT networks, managing large volumes of data, and scalable computing systems such as cloud computing, need to be considered when implementing the IoT paradigm for building smart environments, e.g. smart cities, to ensure the quality of service of smart environment applications [136].

2.3 What Makes IoT Security Challenging?

Many personal and commercial devices are becoming “smart” as the digital revolution takes hold. On IoT networks, traditional security and privacy approaches may fail. The dynamic nature of IoT connectivity introduces a new set of security challenges. The following are some examples:
Fig. 1: Total connected IoT devices and global IoT market so far and future prediction.
Fig. 2: Potential economic impact and projected market share of dominant IoT applications by 2025.
– Heterogeneity: IoT intends to connect a huge number of heterogeneous devices [82] to enable advanced applications that can improve human life quality. As a result, IoT devices come in a variety of shapes and sizes, resulting in a diverse set of hardware and software schemes.
– Volume: In IoT, a large number of devices, i.e., billions of smart devices [57], are interconnected, which are coupled with the high volume, velocity, and structure of real-world data.
– Inter-connectivity: The IoT refers to the interconnection between devices and the information they send to and receive from one another, like a conversation. Thus, IoT networks can be accessed at any time and from anywhere [82].
– Structure and vulnerability: IoT devices are vulnerable to various types of attacks, such as cookie theft, cross-site scripting, structured query language (SQL) injection, session hijacking, and often distributed denial of service. On a large, self-organized IoT network, the vulnerability to distributed denial of service attacks typically grows [57].
– Dynamism: As IoT devices are continually removed and added, the network reconfiguration is dynamic and must be adaptable [82].
– Proximity: In short-range communications, ad hoc networks may rely on local devices. Proximity means that an IoT-enabled object changes and behaves according to the current location [34].
– Latency and reliability: The main challenges in industrial IoT networks include low-latency and high-reliability wireless communication. Sensitive applications like surgical devices, assembly line production, and traffic monitoring require highly reliable and low-latency communication [89].
– Cost, resource, and energy consumption: An IoT device is a piece of hardware with a sensor that transmits data from one place to another over the Internet. The systems should be configured to reduce needed resources as well as costs due to a large number of sensors in a complex system application [82].
– Security and privacy protection: Consumer and proprietary data must be secured and protected, particularly in sensitive domains, such as healthcare applications [82].
– Intelligent decision-making: For many IoT applications, sophisticated decisions should be intelligent, according to the preferences of the users, and must be made in real-time.

Although most of these issues are shared by many Internet access points, the constraints of IoT devices, as well as the dynamic nature and complexity of the environment in which they operate, magnify many of these concerns beyond the scope of traditional security capabilities.

2.4 Related work and the Scope of this Study

Several studies have been done on IoT security. For instance, the authors in [68] present a survey of IoT security issues, where they review and categorize the popular security issues, such as attacks and threats, concerning the IoT layered architecture, networking, communication, and management. Another study on IoT security has been presented in [94].

The authors in [153] present several research challenges and opportunities related to IoT security, where they have considered the general security background of IoT. In [56], an overview of the current status of IoT security research, as well as associated tools like IoT modelers and simulators, was presented. In [91], the authors provide an overview of security concepts, technological and security concerns, viable solutions, and prospective approaches for safeguarding the IoT. They give their analysis of the current state and issues of IoT security in their survey, which takes into account three layers of architecture: perception layer, network layer, and application layer. The authors of [57] undertake an
IoT security survey that takes into account application domains, security threats, and solution architectures. A taxonomy on IoT vulnerabilities, attack vectors, attacks that exploit such vulnerabilities, and corresponding methodologies, has been presented in [99]. In [26], a study on IoT security is presented by the authors, which focuses on the most recent IoT security threats and vulnerabilities identified via a thorough assessment of current IoT security studies.

In addition to these surveys, many studies on machine learning have been conducted. For example, in the paper [145], the authors explore the threat model for IoT systems and evaluate IoT security solutions based on machine-learning methods like supervised learning, unsupervised learning, and reinforcement learning. They explore methods for data privacy protection that use learning-based IoT authentication, access control, secure offloading, and malware detection. The authors examine the security requirements, attack vectors, and other discussions in [61], focused on machine learning for IoT networks. In [25], a survey of machine and deep learning techniques for IoT security was presented. In [154], the impact of new IoT features on protection and privacy, considering new threats, existing solutions and challenges, was addressed. In order to construct data-driven security systems employing machine and deep learning techniques, it’s important to understand the nature of the data, including various forms of cyber threats and related features. Several such datasets exist in the area of cybersecurity. These include NSL-KDD [139], UNSW-NB15 [97], DARPA [147] [85], CAIDA [4] [3], ISOT’10 [14] [13], ISCX’12 [5] [128], CTU-13 [10], CIC-IDS [9], CIC-DDoS2019 [6], MAWI [64], ADFA IDS [146], CERT [84] [48], EnronSpam [12], SpamAssassin [17], LingSpam [15], DGA [1] [2] [11] [151], Malware Genome project [155], Virus Share [18], VirusTotal [19], Comodo [7], Contagio [8], DREBIN [74], Microsoft [16], and Bot-IoT [71]. Machine and deep learning based models can be built utilizing these datasets, according to the problem domain. For instance, a neural network based deep learning model is used to build an intrusion detection model utilizing the NSL-KDD dataset [45]. In [38], the authors use several such datasets, namely NSL-KDD [139], UNSW-NB15 [97], and CIC-IDS [9], when analyzing their machine learning-based network intrusion detection model for IoT security.

Unlike the previous studies, this paper focuses on artificial intelligence knowledge, particularly machine and deep learning-based IoT security solutions. For this, we present different machine learning techniques as well as deep learning architectures and techniques, and their usage for intelligent security modeling to solve security problems in the context of IoT.

3 IoT System Architectures and Security Issues

In this section, we first highlight the attack surface areas of the IoT, and then we summarize the security issues through the overall architecture of an IoT system.

3.1 IoT Attack Surface Areas

In the following, we summarize surface areas for IoT attacks, or areas where threats and vulnerabilities can exist in IoT systems and applications. These are:

– Devices: IoT devices are one of the most common ways that cyberattacks are initiated. Memory, firmware, the physical interface, the web interface, and network resources are all aspects of an IoT system that can be vulnerable. Attackers can take advantage of vulnerable update systems, outdated components, and risky default settings, among other things.
– Communication channels: Attacks against IoT components could originate via the communication channels that link them to one another. Protocols used in IoT systems could have security vulnerabilities that could compromise the whole system. IoT systems are vulnerable to well-known network attacks, such as denial of service (DoS) and spoofing, which may cause significant damage.
– Applications and software: Vulnerabilities in the web applications and associated software of IoT devices might cause systems to be compromised. Web apps, for example, can be used to steal user credentials or to distribute malicious firmware upgrades.

3.2 Architectures and Security Issues

Based on the IoT attack surface areas highlighted above, in this section, we summarize the security issues through the overall architecture of an IoT system. Several architectures for IoT have been proposed by different researchers and research groups. Conventional IoT architecture is considered to have three layers: the perception layer, the network layer, and the application layer [91]. However, the support or middleware layer, which lies between the network layer and the application layer, was later considered an important layer, according to the needs for data processing and intelligent decision making. In several cases, the IoT architectures are based on a network layer and a support layer according
to the needs. Furthermore, the concept of cloud computing for the support layer has been included in some studies of IoT systems. In this paper, we take into account the most popular four-layered IoT architecture, comprising the perception layer, the networking and data communication layer, the middleware layer, and the application layer, shown in Figure 3, while discussing the security threats and attacks in the domain of IoT security.

– Security Issues at Perception or Sensing Layer: The perception layer is a hardware layer consisting of physical devices and sensors in different forms, and is thus also known as the sensing layer. These devices or sensors, such as mechanical, electrical, electronic, or chemical sensors, are connected with the physical world to capture different kinds of information according to the particular IoT applications. WSN, RFID, and other types of sensing and identifying systems are the key technologies employed in the perception layer [140]. Four major cybersecurity issues exist in this layer: i) wireless signal strength; ii) sensor node exposure in IoT devices; iii) dynamic nature of IoT topology; and iv) communication, computation, storage, and memory constraints [98] [87]. To defend the IoT network, this layer employs three popular mechanisms: node authentication, lightweight encryption, and access control [87].
Many attacks and crimes that are common in practice target the confidentiality of the perception layer. Examples include node capturing, malicious code, fake data injection, replay attacks, side-channel attacks, etc. [57]. For example, a node capturing attack can cause a node to stop delivering genuine data, destroying the entire network and even compromising the security of the entire IoT application. False data or malicious code injection attacks might produce false results and cause the IoT application to malfunction. Eavesdropping, often known as sniffing or snooping, is a type of attack that uses unsecured network communications to acquire data in transit between devices. A replay attack is defined as spoofing, changing, or repeating the identifying information of smart devices in an IoT network. A timing attack occurs when an attacker steals the encryption key associated with time and other critical data [129]. Aside from direct attacks on the nodes, a variety of side-channel attacks may result in sensitive data being leaked.
– Security Issues at Networking and Data Communications Layer: The main purpose of this layer is to transmit the information collected by the perception layer, as described above. At this layer, cutting-edge technologies such as Wi-Fi, LTE, Bluetooth, 3G/4G, ZigBee, and others are used to operate cloud computing platforms, Internet gateways, switching, and routing devices, among other things [87]. At this layer, the most important cybersecurity issues are confidentiality, privacy, and compatibility. Attackers are highly likely to carry out criminal activity at this layer through phishing, distributed denial-of-service (DDoS/DoS), data transit attacks, routing attacks, and attacks on identity authentication and encryption, among other methods [57] [51].
For example, this layer of IoT is extremely vulnerable to phishing attacks, which aim to steal personal data such as credit card and login information or to infect victims’ devices with malware [57]. An access attack, also known as an advanced persistent threat, occurs when an unauthorized individual or adversary gains access to the IoT network, because IoT apps are constantly receiving and transferring valuable data. The most prevalent and destructive attacks on a network are denial of service (DoS) and distributed denial of service (DDoS) attacks, which cause network resources to be exhausted and service to be unavailable. Furthermore, attackers may use routing attacks like sinkhole attacks, wormhole attacks, and others to reroute routing paths during data transmission.
– Security Issues at Middleware or Support Layer: It’s a layer of software that exists between the network and the application. As a result, this layer is usually in charge of IoT device service management, as well as data processing and intelligent operations on data with decision-making. It can be seen as a dependable support platform, similar to the cloud [50], that makes this layer in the IoT system easier to use. In several cases, the more distributed fog computing technologies have been used to replace the centralized cloud environment, resulting in improved performance and faster response times [35]. At this level, the authenticity, integrity, and confidentiality of all transmitted data should be checked and maintained [87].
Although the middleware layer is essential for delivering a secure and dependable IoT application, it is also vulnerable to attacks such as insider attacks, man-in-the-middle attacks, SQL injection attacks, signature wrapping attacks, cloud malware injection, cloud flooding attacks, and so on [57] [73]. Internal attackers intentionally modify and extract data or information within the network in a malicious inside attack [81]. Through a SQL injection attack, an attacker can include malicious SQL queries in a program to obtain sensitive data from any user
Fig. 3: Various security issues at different layers, namely the perception, network, middleware, and application layers of IoT architecture and systems, which need to be handled intelligently for secured IoT applications and services in various real-world scenarios.
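Among the middleware-layer issues listed in Fig. 3, SQL injection lends itself to a short illustration. The sketch below is not taken from the surveyed works: the table `readings`, its columns, and the helper names `readings_unsafe` and `readings_safe` are hypothetical, and an in-memory SQLite database stands in for whatever store a real middleware would use. It only aims to show why building a query by string concatenation lets a crafted input read other users' rows, while a parameterized query treats the same input as plain data.

```python
import sqlite3


def make_db():
    # Hypothetical middleware store: one row per device reading.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device_id TEXT, owner TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [("thermo-1", "alice", 21.5), ("lock-7", "bob", 1.0), ("cam-3", "carol", 0.0)],
    )
    return conn


def readings_unsafe(conn, owner):
    # Vulnerable: the caller-supplied string becomes part of the SQL text.
    query = (
        "SELECT device_id FROM readings WHERE owner = '"
        + owner
        + "' ORDER BY device_id"
    )
    return [row[0] for row in conn.execute(query)]


def readings_safe(conn, owner):
    # Parameterized: the driver passes the value separately from the SQL text.
    query = "SELECT device_id FROM readings WHERE owner = ? ORDER BY device_id"
    return [row[0] for row in conn.execute(query, (owner,))]


conn = make_db()
payload = "' OR '1'='1"  # classic injection string, used here for illustration

print(readings_unsafe(conn, "alice"))  # only alice's device
print(readings_unsafe(conn, payload))  # every owner's devices leak
print(readings_safe(conn, payload))    # no owner has this literal name
```

With the concatenated query, the payload turns the `WHERE` clause into a condition that is always true, so all three devices are returned; the parameterized version returns nothing for the same input.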
and even change database records. A virtualization attack occurs when a virtual machine is harmed and its effects spread to other virtual machines. Cloud malware injection allows an attacker to take control of a cloud, inject malicious code, or even implant a virtual machine into a cloud. Cloud flooding attacks, which increase the workload on cloud servers, may have a significant impact on cloud servers.
– Application layer: The application layer is responsible for controlling the overall management of IoT apps that interact with users in a personalized way. A personal computer, smartphone, or any smart object or device that can utilize IoT services via Internet connectivity can serve as the interface. In numerous application domains, such as smart homes, smart cities, industrial, building, and health applications, the application layer is dependent on the information processed in the middleware layer [69]. Different applications may have different levels of security needs, depending on the application environment and the necessity. As an example, the security method used in online banking should be more secure than the one used in exchanging climate change forecast information. Many security issues must be addressed at the application layer, including access control attacks, malicious code attacks, sniffing attacks, reprogram attacks, data breaches, service interruption attacks, application vulnerabilities, and software bugs, to name a few examples [57] [75].
In the application layer, malicious data is transferred and exchanged amongst smart devices. Practitioners and academics face major issues in protecting data privacy and security as well as identifying things. The attacker injects malware into the system via the use of viruses, worms, Trojan horses, and spyware to deny service,
manipulate data, and/or gain access to confidential data [149]. Service interruption attacks, often known as DDoS attacks, prevent genuine consumers from using IoT applications by intentionally making servers or networks too busy to respond. Attackers may use sniffer programs to monitor network traffic in IoT applications to get access to confidential user data. In an unauthorized access attack, an attacker might quickly destroy a system by restricting access to IoT-related services or destroying existing data [87]. Furthermore, attackers may attempt to remotely reprogram IoT devices, which might result in the IoT system being hacked.

As discussed above, several security threats and attacks might happen in each layer of an IoT system. In addition, zero-day attacks [27] [33] [126], a term used to refer to the threat posed by an unknown security vulnerability [127] [95], are considered serious potential security threats. Thus, an in-depth analysis of detecting these cyber-attacks is important, and artificial intelligence, particularly machine learning methods and deep learning architectures, can be considered a good solution for securing the system against such anomalies in the domain of IoT security.

4 Machine and Deep Learning Techniques in IoT Security Solutions

Machine and deep learning techniques are well known as AI techniques that can help IoT devices learn from experience represented as data and behave accordingly. The learning models often comprise a set of rules, procedures, or sophisticated ‘transfer functions’ that may be used to uncover relevant security incident trends in IoT data, as well as to recognize and predict behavior [42]. As a result, in an IoT context, both machine learning and deep learning can operate in dynamic IoT networks without the requirement for human or user intervention. The potential role of machine learning and deep learning techniques in developing a data-driven model for IoT security intelligence is shown in Figure 4. Several machine learning methods can be used to learn from IoT security data, including classification and regression analysis, clustering, rule-based methods, feature optimization methods [114], and deep learning methods based on artificial neural networks, such as the multi-layer perceptron network, convolutional network, recurrent network, etc. [113] [112]. Thus, in the following subsections, we discuss how different machine and deep learning methods can be applied to security solutions in the context of IoT.

Fig. 4: Illustration of the potential role of machine learning and deep learning methods in building a data-driven model for IoT security intelligence.

4.1 Classification and Regression Techniques

In the area of machine learning, both classification and regression methods are well known and widely used. A classification task in IoT security is usually defined as predicting a fixed discrete value or category, such as an [anomaly, normal] or [attack-1, attack-2, attack-3, etc.] outcome, whereas a regression task is defined as predicting a continuous or numeric value, such as the impact of attacks. Several popular classification techniques exist, such as k-nearest neighbors [22], support vector machines [67], naive Bayes [65], adaptive boosting [46], logistic regression [78], decision trees [105], IntrudTree [115], BehavDT [117], and ensemble learning such as random forests [37], that can categorize security incidents in order to address different IoT security issues, including intrusion or attack detection, malware analysis, and anomaly or fraud detection in IoT.

For instance, the support vector machine classification technique is used in profiling abnormal behavior of IoT devices [80] and for detecting Android malware for reliable IoT services [53]. The random forest technique is used for anomaly detection [39] [103], denial-of-service attack detection [41], IoT intrusion detection services [106] [96], smart city anomaly detection [28], etc. Similarly, a naive Bayes based classification model is used to detect anomalies [135], and a logistic regression-based method to detect malicious IoT botnets [104] [31]. On the other hand, a regression model is useful for predicting attacks quantitatively or for predicting the impact of an attack, such as worms, viruses, or other malicious software [62]. Similarly, regression techniques could be useful for building a quantitative security model, e.g., of phishing in a certain period or of network packet parameters [122]. Several popular regression techniques, such as linear, logistic, polynomial, Ridge, Lasso, regression trees, principal components, ElasticNet, Poisson, negative binomial, stepwise, and partial least squares regression [144], exist that can be used to build the quantitative security model according to their working principles in machine learning. For instance, a linear regression-based model is used to identify the cyber attack origin [76], and multiple regression analysis is used for correlating human traits and cybersecurity behavior intentions [49]. Similarly, regression regularization methods such as Lasso, Ridge, or ElasticNet can enhance security attack analysis by yielding better outcomes given the high dimensionality of IoT security data [52].

Thus, we can conclude that classification techniques can be used to build the prediction and classification model [123] utilizing the relevant data in the domain of IoT security, while regression techniques are mainly used to estimate impact in the model [62] by determining the predictor strength, time-series causes, or the effect of the relations, considering the security attributes and the outcome.

4.2 Clustering Techniques

In machine learning, clustering is another popular task for analyzing IoT security data and is considered unsupervised learning. It groups a set of data points based on measures of similarity and dissimilarity in the security data generated by IoT devices from diverse sources. Thus, clustering could contribute to the discovery of hidden patterns and structures in data, allowing for the detection of abnormalities or attacks in IoT. Data can be clustered from several perspectives, including partition, hierarchy, fuzzy theory, distribution, density, graph theory, grid, and fractal theory [148]. K-means [90], K-medoids [107], single linkage [131], complete linkage [132], agglomerative clustering, bottom-up BOTS [118], DBSCAN, OPTICS, and the Gaussian mixture model [148] are popular clustering algorithms. These clustering techniques can be used to solve various IoT security problems. For instance, the k-means algorithm is used in profiling the abnormal behavior of IoT devices [80]. A dynamic threshold-based approach can be used to detect outliers or noisy instances in data [110]. A fuzzy clustering approach is used in IoT intrusion detection [86]. Clustering approaches are also useful for extracting insights or knowledge from system log data for cybersecurity applications [77]. Thus, by uncovering hidden patterns and structures in IoT security data, clustering techniques can play a significant role, through measuring behavioral similarity or dissimilarity, in solving various security problems, such as outlier detection, anomaly detection, signature extraction, fraud detection, cyber-attack detection, etc. in the domain of IoT.

4.3 Rule-based Techniques

A rule-based system, which extracts rules from data and applies them to make intelligent decisions, can mimic human intelligence [111]. Thus, rule-based systems can play a significant role in IoT security through learning security or policy rules from data [119]. Association rule learning is a prominent machine learning method for discovering associations or rules among a set of available attributes in a security dataset [20]. Several types of association rules have been proposed in the area, such as frequent pattern based [21] [60] [88], tree-based [55], logic-based [44], fuzzy rules [138], belief rules [156], etc. Rule learning techniques such as AIS [20], Apriori [21], Apriori-TID and Apriori-Hybrid [21], FP-Tree [55], Eclat [152], and RARM [40] exist that can be used to solve IoT security problems and support intelligent decision making. For instance, an association rule-mining algorithm-based network intrusion detection approach has been presented in [125]. Moreover, fuzzy association rules are used to build a rule-based intrusion detection system [138]. To analyze IoT malware activities, an FP-tree association rule-based study has been conducted in [100].

Although a rule-based approach is easy to adopt, it has high time complexity because it generates a huge number of associations or frequent patterns depending on the support and confidence values, which consequently makes the model complex [21] [137]. An effective association model could minimize this issue. For instance, in our earlier paper, Sarker et al. [121], we present a rule learning approach that effectively discovers association rules that are non-redundant and reliable, and thus could play a significant role in the domain of IoT security as well. The rules can also be used to build knowledge-based systems or rule-based expert systems [120] to solve more complex security problems in IoT. Each of these systems consists of a set of policy rules that define the scope of what kinds of activities should be allowed on a network, where each rule explicitly either allows or denies an activity. Even new zero-day attacks may be blocked by systems that utilize rule-driven controls or filters for security policy monitoring.

4.4 Security Feature Optimization and Principal Component Analysis

For an effective IoT security system based on the machine learning approach, security feature engineering and optimization are considered key issues in the IoT cyber threat landscape. The reason is that the security features and corresponding IoT data directly influence the machine learning-based security models, and thus a data dimensionality reduction technique is important [102]. Feature engineering is the general term for constructing and modifying security attributes or variables to effectively develop machine learning-based security models [114]. As today’s IoT security datasets may include features that are less relevant or not at all important, effectively modeling cyber attacks or abnormalities is challenging. A security model built on such features can suffer from several issues, including excessive variance, overfitting, high computing and model preparation cost, and a lack of generalization, all of which can degrade prediction accuracy [115]. Thus, selecting an optimal number of security features based on their impact or importance [115] could minimize such issues while building an IoT security model with high-dimensional data sets. Several approaches can be used [114]: wrapper methods such as recursive feature elimination and forward feature selection; filter methods such as Pearson correlation, the chi-squared test, and the analysis of variance test; or embedded methods such as regularization (Lasso, Ridge, or ElasticNet) and tree-based feature importance. Along with feature selection, principal component analysis (PCA) [114] is utilized to generate brand-new components that capture the majority of the relevant information. While developing a machine learning-based security model, these new components may help handle the high dimensionality of IoT security data, for example in IoT network traffic anomaly detection [58].

4.5 Deep Neural Network Learning-based Approaches

Deep learning (DL) is a subset of machine learning that developed from the Artificial Neural Network (ANN), which offers a computational architecture for learning from data by combining multiple processing levels, such as input, hidden, and output layers, into a single network [54]. Thus, deep learning techniques are also capable of learning from IoT security data through these layers and are known as hierarchical learning methods because of the way they capture knowledge in a deep architecture. Deep learning outperforms typical machine learning algorithms in a variety of situations, especially when learning from huge security datasets. IoT devices and their applications or systems produce a large amount of security data in the IoT environment; consequently, depending on the datasets, DL approaches may deliver better results. Depending on the characteristics and nature of the security data, different deep learning architectures such as the multilayer perceptron (MLP), convolutional neural networks (CNN), recurrent neural networks (RNN), deep belief networks (DBN), or hybrid networks can be used to build IoT security models [113] [147], as discussed below.

– Multilayer perceptron (MLP): A multilayer perceptron (MLP), often known as a feedforward artificial neural network, is the fundamental building block
of deep learning algorithms. A typical MLP comprises an input layer, one or more hidden layers, and an output layer. Each node in one layer is connected to every node in the next layer, and each connection carries a weight. The weight values are updated internally via the backpropagation process as the model is trained. Such MLP networks are used to build an intrusion detection model utilizing the NSL-KDD dataset [45], for malware analysis [66], to generate explanations in IoT environments [47], and to detect malicious botnet traffic from IoT devices [63]. To perform a security threat analysis of the IoT, an MLP-based network is used in [59], where the model classifies the network data as normal or as under attack.
– Convolutional neural networks (CNN): The CNN [79] extends the traditional ANN design with convolutional layers, pooling layers, and fully connected layers. Parameter sharing in the convolutional layers reduces the number of parameters to be optimized, and thus the model complexity. CNNs also commonly employ dropout to address the problem of overfitting, which can occur in the MLP network. In recent years, CNNs have been commonly utilized in numerous areas such as natural language processing, audio analysis, picture processing, and other autocorrelated data because they take advantage of the two-dimensional (2D) structure of the input data. CNNs may also be used in the area of Internet of Things (IoT) security. CNN-based deep learning models have been used for intrusion detection, such as detecting denial-of-service (DoS) attacks [134], for malware detection [150], and for Android malware detection [92]. Furthermore, an intrusion detection model based on multi-CNN fusion may be utilized [83]. In the IoT environment, innovative CNN-based deep learning models with lightweight architectures could reduce computation and provide higher performance with constrained resources.
– Recurrent neural network (RNN): A recurrent neural network (RNN) is another kind of ANN in which the connections between nodes form a directed graph along a temporal sequence. The RNN model, which is derived from feedforward neural networks, can process variable-length sequences of inputs by using its internal state, or memory. Because of its capacity to handle sequential data effectively, the RNN model can be used for IoT security as well as for natural language processing and voice recognition. IoT devices produce a significant quantity of sequential data from several sources, such as network traffic flows, time-dependent data, and so on. When the behavior patterns of the threat are time-dependent, using recurrent connections can help neural networks detect security concerns. The reason for this is that the network can contain a unit called Long Short Term
Memory (LSTM) that allows it to retain prior inputs, making it a particularly helpful model for time series prediction. Such an LSTM-based recurrent network can be used for several purposes in the domain of security, such as intrusion detection [70] and detecting and classifying malicious apps [142].

In addition to these deep learning models, hybrid network models, such as an ensemble of classifiers or an LSTM network combined with a CNN, can also be applied for detecting IoT attacks, such as malware detection [150], phishing, and botnet attack detection and mitigation across multiple IoT devices [101]. Other deep learning models, such as a deep belief network (DBN) based security model, may be applied to IoT security [30] [108]. In our earlier paper, Sarker et al. [113], we explored different types of deep learning techniques with their taxonomy, divided into discriminative techniques for supervised tasks, generative techniques for unsupervised tasks, and hybrid techniques, which can be used according to the data characteristics. In Table 1, we have summarized how various machine learning methods including deep learning are used to solve various security issues in the domain of IoT. Thus, we can infer that the above-mentioned machine or deep learning techniques, as well as their variants or modified lightweight approaches, can play a significant role in data-driven security analytics in the IoT environment.

5 Research Issues and Directions

Our study on machine and deep learning-based security solutions raises several concerns in the area of IoT security. Consequently, in this section, we describe and analyze the challenges that have been encountered, as well as research opportunities and future directions for securing IoT networks and systems.

The effectiveness and efficiency of a machine learning or deep learning-based IoT security solution are primarily determined by the nature and features of the data, as well as the learning algorithms’ performance. A variety of machine and deep learning techniques are available to evaluate data and extract insights, as detailed in Section 4. As a result, choosing an appropriate learning algorithm for the intended application in IoT security can be challenging, because the results of different learning algorithms may vary depending on the data characteristics [123] [114]. If the wrong learning algorithm is chosen, unexpected results may occur, resulting in wasted effort as well as a loss of the model’s efficacy and accuracy. In the same way, unnecessary IoT security data might result in garbage processing and inaccurate outcomes. If the IoT data is bad, such as being non-representative, of poor quality, containing irrelevant attributes, or being insufficient in quantity for training, machine or deep learning security models may yield reduced accuracy or even become worthless. Future research opportunities and directions in the topic of IoT security include the following:

– In the world of IoT, gathering security data is not straightforward. The dynamic characteristics of IoT, such as heterogeneity, covered briefly in Section 2, allow for the generation of massive amounts of data at a high frequency from various domains. It is critical to gather and manage relevant IoT-generated data for target applications, such as security in smart city applications, to facilitate further analysis. As a result, while working with IoT-generated data, a more in-depth analysis of data gathering methods is required.
– Many ambiguous values, missing values, outliers, and erroneous data may be discovered in historical or raw IoT security data. Data quality and the availability of training data have a significant impact on the machine learning and deep learning methods presented in Section 4, and hence on the IoT security model. As a result, cleaning and pre-processing the various security data generated in an IoT environment is a challenging task. To effectively apply learning algorithms in the domain of IoT security, improvement of current methods or the development of new data preparation techniques is expected.
– It is critical for an effective IoT security solution to consider the constraints or capabilities of the IoT devices and systems where learning-based security models are utilized, as addressed briefly in Section 4. As a consequence, there should be a trade-off between security and device capabilities in terms of data storage, computing, data processing, decision-making, and communication resources. Therefore, an in-depth investigation is required to discover the most appropriate machine or deep learning methods.
– Because of the huge amount of redundant processing, the classical learning techniques outlined in Section 4 may not be directly applicable to IoT devices in various circumstances. The association rule learning technique [21], for example, may extract redundant rules from IoT security data in a rule-based system, making the decision-making process complex and ineffective [121]. As a result, a better understanding of the benefits and limitations of existing learning methods is required, making the development of new lightweight algorithms or methods for IoT devices a challenging task.
– Compared to older patterns, a recent malicious behavioral trend is more likely to be interesting and significant for forecasting or detecting attacks in IoT security. As a result, rather than conventional data analysis, the idea of recency analysis, i.e., insight or knowledge extracted from current patterns [116], may be more appropriate in a variety of situations. Thus, another difficult challenge is to propose new lightweight solutions for IoT devices that take current data patterns into consideration, and eventually to construct a recency-based IoT security model.

developed using the obtained IoT security knowledge connected with the target application.

Finally, we have discussed the issues that have arisen, as well as potential research directions and future approaches based on learning techniques. The challenges highlighted present promising research possibilities in the field, which must be addressed with effective solutions to enhance IoT security over time. Overall, we believe that our study on machine and deep learning-based security solutions points in the direction of a promising path and can be used as a reference guide for future IoT security research and implementations by academic and industry experts.
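As a minimal illustration of the classification task described in Section 4.1, the following Python sketch labels a traffic sample as normal or anomaly using a k-nearest neighbors vote. The function name knn_predict, the two features, and the toy samples are hypothetical and are not taken from any dataset or implementation cited in this paper; features are left unscaled for brevity, which a practical model would need to address.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank labelled samples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (packets per second, mean packet size in bytes).
train = [
    ((10.0, 500.0), "normal"),
    ((12.0, 480.0), "normal"),
    ((9.0, 510.0), "normal"),
    ((900.0, 60.0), "anomaly"),
    ((950.0, 64.0), "anomaly"),
    ((870.0, 58.0), "anomaly"),
]

print(knn_predict(train, (11.0, 495.0)))  # query near the low-rate samples
print(knn_predict(train, (920.0, 62.0)))  # query near the high-rate samples
```

The same interface would extend to a multi-class [attack-1, attack-2, ...] outcome simply by using more than two labels in the training samples.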
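To make the clustering idea of Section 4.2 concrete, the sketch below runs a plain k-means loop that groups device behavior points by Euclidean distance. It is an illustrative assumption rather than the method of any cited work: the function name kmeans, the fixed initial centroids, the fixed iteration count, and the two-dimensional toy points are all hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the centroid closest to this point.
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical behavior vectors, e.g. (connections per minute, distinct ports).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
for center, members in zip(centroids, clusters):
    print(center, members)
```

A point whose distance to every centroid exceeds some chosen threshold could then be treated as an outlier candidate, which is one simple way to connect clustering to anomaly detection.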
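The role of the support and confidence values mentioned in Section 4.3 can be sketched as follows. This simplified miner only considers rules between pairs of items and is not the full Apriori algorithm [21]; the function name mine_rules, the thresholds, and the event names in the toy transactions are hypothetical.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= t for t in transactions) / n

    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # infrequent pair: no rule is generated from it
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical security events observed together in one time window.
transactions = [
    {"port_scan", "failed_login", "alert"},
    {"port_scan", "alert"},
    {"failed_login"},
    {"port_scan", "failed_login", "alert"},
]
rules = mine_rules(transactions)
for rule in rules:
    print(rule)
```

Lowering min_support or min_confidence increases the number of rules returned, which illustrates why the number of generated associations, and hence model complexity, depends on these two values.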
13. The honeynet project. http://www.honeynet.org/chapters/france/ (accessed on 20 October 2019).
14. Isot botnet dataset. https://www.uvic.ca/engineering/ece/isot/datasets/index.php/ (accessed on 20 October 2019).
15. Lingspam. Available online: https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/lingspampublic.tar.gz/ (accessed on 20 October 2019).
16. Microsoft malware classification (big 2015). Available online: http://arxiv.org/abs/1802.10135/ (accessed on 20 October 2019).
17. Spamassassin. Available online: http://www.spamassassin.org/publiccorpus/ (accessed on 20 October 2019).
18. Virusshare. Available online: http://virusshare.com/ (accessed on 20 October 2019).
19. Virustotal. Available online: https://virustotal.com/ (accessed on 20 October 2019).
20. Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. Mining association rules between sets of items in large databases. In ACM SIGMOD Record, volume 22, pages 207–216. ACM, 1993.
21. Rakesh Agrawal, Ramakrishnan Srikant, et al. Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB, volume 1215, pages 487–499, 1994.
22. David W Aha, Dennis Kibler, and Marc K Albert. Instance-based learning algorithms. Machine learning, 6(1):37–66, 1991.
23. Ejaz Ahmed, Ibrar Yaqoob, Abdullah Gani, Muhammad Imran, and Mohsen Guizani. Internet-of-things-based smart environments: state of the art, taxonomy, and open research challenges. IEEE Wireless Communications, 23(5):10–16, 2016.
24. Ala Al-Fuqaha, Mohsen Guizani, Mehdi Mohammadi, Mohammed Aledhari, and Moussa Ayyash. Internet of things: A survey on enabling technologies, protocols, and applications. IEEE communications surveys & tutorials, 17(4):2347–2376, 2015.
25. Mohammed Ali Al-Garadi, Amr Mohamed, Abdulla Al-Ali, Xiaojiang Du, Ihsan Ali, and Mohsen Guizani. A survey of machine and deep learning methods for internet of things (iot) security. IEEE Communications Surveys & Tutorials, 2020.
26. Fadele Ayotunde Alaba, Mazliza Othman, Ibrahim Abaker Targio Hashem, and Faiz Alotaibi. Internet of things security: A survey. Journal of Network and Computer Applications, 88:10–28, 2017.
27. Mamoun Alazab, Sitalakshmi Venkatraman, Paul Watters, Moutaz Alazab, et al. Zero-day malware detection based on supervised learning algorithms of api call signatures. 2010.
28. Ibrahim Alrashdi, Ali Alqazzaz, Esam Aloufi, Raed Alharthi, Mohamed Zohdy, and Hua Ming. Ad-iot: Anomaly detection of iot cyberattacks in smart city using machine learning. In 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), pages 0305–0310. IEEE, 2019.
29. Luigi Atzori, Antonio Iera, and Giacomo Morabito. The internet of things: A survey. Computer networks, 54(15):2787–2805, 2010.
30. Nagaraj Balakrishnan, Arunkumar Rajendran, Danilo Pelusi, and Vijayakumar Ponnusamy. Deep belief network enhanced intrusion detection system to prevent security breach in the internet of things. Internet of Things, page 100112, 2019.
31. Rohan Bapat, Abhijith Mandya, Xinyang Liu, Brendan Abraham, Donald E Brown, Hyojung Kang, and Malathi Veeraraghavan. Identifying malicious botnet traffic using logistic regression. In 2018 Systems and Information Engineering Design Symposium (SIEDS), pages 266–271. IEEE, 2018.
32. Jennifer Bélissent et al. Getting clever about smart cities: New opportunities require new business models. Cambridge, Massachusetts, USA, 193:244–77, 2010.
33. Leyla Bilge and Tudor Dumitraş. Before we knew it: an empirical study of zero-day attacks in the real world. In Proceedings of the 2012 ACM conference on Computer and communications security, pages 833–844. ACM, 2012.
34. Miodrag Bolic, Majed Rostamian, and Petar M Djuric. Proximity detection with rfid: A step toward the internet of things. IEEE Pervasive Computing, 14(2):70–76, 2015.
35. Flavio Bonomi, Rodolfo Milito, Preethi Natarajan, and Jiang Zhu. Fog computing: A platform for internet of things and analytics. In Big data and internet of things: A roadmap for smart environments, pages 169–186. Springer, 2014.
36. Joseph Bradley, Jeff Loucks, James Macaulay, and Andy Noronha. Internet of everything (ioe) value index. White Paper CISCO and/or its affiliates, 2013.
37. Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.
38. Nadia Chaabouni, Mohamed Mosbah, Akka Zemmari, Cyrille Sauvignac, and Parvez Faruki. Network intrusion detection for iot security based on learning techniques. IEEE Communications Surveys & Tutorials, 21(3):2671–2701, 2019.
39. Yaping Chang, Wei Li, and Zhongming Yang. Network intrusion detection based on random forest and support vector machine. In 2017 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC), volume 1, pages 635–638. IEEE, 2017.
40. Amitabha Das, Wee-Keong Ng, and Yew-Kwong Woon. Rapid association rule mining. In Proceedings of the tenth international conference on Information and knowledge management, pages 474–481. ACM, 2001.
41. Rohan Doshi, Noah Apthorpe, and Nick Feamster. Machine learning ddos detection for consumer internet of things devices. In 2018 IEEE Security and Privacy Workshops (SPW), pages 29–35. IEEE, 2018.
42. Sumeet Dua and Xian Du. Data mining and machine learning in cybersecurity. CRC press, 2016.
43. Mohamed Faisal Elrawy, Ali Ismail Awad, and Hesham FA Hamed. Intrusion detection systems for iot-based smart environments: a survey. Journal of Cloud Computing, 7(1):21, 2018.
44. Peter A Flach and Nicolas Lachiche. Confirmation-guided discovery of first-order rules with tertius. Machine Learning, 42(1-2):61–95, 2001.
45. Felipe De Almeida Florencio, Edward David Moreno Ordonez, Hendrik Teixeira Macedo, Ricardo José Paiva De Britto Salgueiro, Filipe Barreto Do Nascimento, and Flavio Arthur Oliveira Santos. Intrusion detection via mlp neural network using an arduino embedded system. In 2018 VIII Brazilian Symposium on Computing Systems Engineering (SBESC), pages 190–195. IEEE, 2018.
46. Yoav Freund, Robert E Schapire, et al. Experiments with a new boosting algorithm. In Icml, volume 96, pages 148–156. Citeseer, 1996.
47. Iván García-Magariño, Rajarajan Muttukrishnan, and Jaime Lloret. Human-centric ai for trustworthy iot systems with explainable multilayer perceptrons. IEEE Access, 7:125562–125574, 2019.
48. Joshua Glasser and Brian Lindauer. Bridging the gap: A pragmatic approach to generating insider threat data. In 2013 IEEE Security and Privacy Workshops, pages 98–104. IEEE, 2013.
49. Margaret Gratian, Sruthi Bandi, Michel Cukier, Josiah Dykstra, and Amy Ginther. Correlating human traits and cyber security behavior intentions. computers & security, 73:345–358, 2018.
50. Jayavardhana Gubbi, Rajkumar Buyya, Slaven Marusic, and Marimuthu Palaniswami. Internet of things (iot): A vision, architectural elements, and future directions. Future generation computer systems, 29(7):1645–1660, 2013.
51. Brij B Gupta, Aakanksha Tewari, Ankit Kumar Jain, and Dharma P Agrawal. Fighting against phishing attacks: state of the art and future challenges. Neural Computing and Applications, 28(12):3629–3654, 2017.
52. Desta Haileselassie Hagos, Anis Yazidi, Øivind Kure, and Paal E Engelstad. Enhancing security attacks analysis using regularized machine learning techniques. In 2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), pages 909–918. IEEE, 2017.
53. Hyo-Sik Ham, Hwan-Hee Kim, Myung-Sup Kim, and Mi-Jung Choi. Linear svm-based android malware detection for reliable iot services. Journal of Applied Mathematics, 2014, 2014.
54. Jiawei Han, Jian Pei, and Micheline Kamber. Data mining: concepts and techniques. Elsevier, 2011.
55. Jiawei Han, Jian Pei, and Yiwen Yin. Mining frequent patterns without candidate generation. In ACM Sigmod Record, volume 29, pages 1–12. ACM, 2000.
56. Wan Haslina Hassan et al. Current research on internet of things (iot) security: A survey. Computer networks, 148:283–294, 2019.
57. Vikas Hassija, Vinay Chamola, Vikas Saxena, Divyansh Jain, Pranav Goyal, and Biplab Sikdar. A survey on iot security: application areas, security threats, and solution architectures. IEEE Access, 7:82721–82743, 2019.
58. Dang Hai Hoang and Ha Duong Nguyen. A pca-based method for iot network traffic anomaly detection. In 2018 20th International Conference on Advanced Communication Technology (ICACT), pages 381–386. IEEE, 2018.
59. Elike Hodo, Xavier Bellekens, Andrew Hamilton, Pierre-Louis Dubouilh, Ephraim Iorkyase, Christos Tachtatzis, and Robert Atkinson. Threat analysis of iot networks using artificial neural network intrusion detection system. In 2016 International Symposium on Networks, Computers and Communications (ISNCC), pages 1–6. IEEE, 2016.
63. Yousra Javed and Navid Rajabi. Multi-layer perceptron artificial neural network based iot botnet traffic classification. In Proceedings of the Future Technologies Conference, pages 973–984. Springer, 2019.
64. Xuyang Jing, Zheng Yan, Xueqin Jiang, and Witold Pedrycz. Network traffic fusion and analysis against ddos flooding attacks with a novel reversible sketch. Information Fusion, 51:100–113, 2019.
65. George H John and Pat Langley. Estimating continuous distributions in bayesian classifiers. In Proceedings of the Eleventh conference on Uncertainty in artificial intelligence, pages 338–345. Morgan Kaufmann Publishers Inc., 1995.
66. ElMouatez Billah Karbab, Mourad Debbabi, Abdelouahid Derhab, and Djedjiga Mouheb. Maldozer: Automatic framework for android malware detection using deep learning. Digital Investigation, 24:S48–S59, 2018.
67. S. Sathiya Keerthi, Shirish Krishnaj Shevade, Chiranjib Bhattacharyya, and Karuturi Radha Krishna Murthy. Improvements to platt’s smo algorithm for svm classifier design. Neural computation, 13(3):637–649, 2001.
68. Minhaj Ahmad Khan and Khaled Salah. Iot security: Review, blockchain solutions, and open challenges. Future generation computer systems, 82:395–411, 2018.
69. R Khan, S Khan, R Zaheer, and S Khan. Future internet: The internet of things architecture, possible applications and key challenges in: 2012 10th international conference on frontiers of information technology, 257–260. IEEE, Islamabad, 10, 2012.
70. Jihyun Kim, Jaehyun Kim, Huong Le Thi Thu, and Howon Kim. Long short term memory recurrent neural network classifier for intrusion detection. In 2016 International Conference on Platform Technology and Service (PlatCon), pages 1–5. IEEE, 2016.
71. Nickolaos Koroniotis, Nour Moustafa, Elena Sitnikova, and Benjamin Turnbull. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Future Generation Computer Systems, 100:779–796, 2019.
72. Srdjan Krčo, Boris Pokrić, and Francois Carrez. Designing iot architecture (s): A european perspective. In 2014 IEEE World Forum on Internet of Things (WF-IoT), pages 79–84. IEEE, 2014.
73. Dennis Kügler. “man in the middle” attacks on bluetooth. In International Conference on Financial Cryptography, pages 149–161. Springer, 2003.
74. Rajesh Kumar, Zhang Xiaosong, Riaz Ullah Khan, Jay Kumar, and Ijaz Ahad. Effective and explainable detection of android malware based on machine learning algorithms. In Proceedings of the 2018 International Conference on Computing and Artificial Intelligence, pages 35–40. ACM, 2018.
75. Sathish Alampalayam Kumar, Tyler Vealey, and Harshit Srivastava. Security in internet of things: Challenges, solutions and future directions. In 2016 49th
60. Maurice Houtsma and Arun Swami. Set-oriented min- Hawaii International Conference on System Sciences
ing for association rules in relational databases. In Data (HICSS), pages 5772–5781. IEEE, 2016.
Engineering, 1995. Proceedings of the Eleventh Interna- 76. Mohammed Lalou, Hamamache Kheddouci, and Salim
tional Conference on, pages 25–33. IEEE, 1995. Hariri. Identifying the cyber attack origin with par-
61. Fatima Hussain, Rasheed Hussain, Syed Ali Hassan, and tial observation: a linear regression based approach. In
Ekram Hossain. Machine learning in iot security: cur- 2017 IEEE 2nd International Workshops on Founda-
rent solutions and future challenges. IEEE Communi- tions and Applications of Self* Systems (FAS* W),
cations Surveys & Tutorials, 2020. pages 329–333. IEEE, 2017.
62. Venkatesh Jaganathan, Priyesh Cherurveettil, and 77. Max Landauer, Florian Skopik, Markus Wurzenberger,
Premapriya Muthu Sivashanmugam. Using a prediction and Andreas Rauber. System log clustering approaches
model to manage cyber security threats. The Scientific for cyber security applications: A survey. Computers &
World Journal, 2015, 2015. Security, 92:101739, 2020.
78. Saskia Le Cessie and Johannes C Van Houwelingen. Ridge estimators in logistic regression. Journal of the Royal Statistical Society: Series C (Applied Statistics), 41(1):191–201, 1992.
79. Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
80. Soo-Yeon Lee, Sa-rang Wi, Eunil Seo, Jun-Kwon Jung, and Tai-Myoung Chung. Profiot: Abnormal behavior profiling (ABP) of IoT devices based on a machine learning approach. In 2017 27th International Telecommunication Networks and Applications Conference (ITNAC), pages 1–6. IEEE, 2017.
81. Shancang Li and Li Da Xu. Securing the internet of things. Syngress, 2017.
82. Shancang Li, Li Da Xu, and Shanshan Zhao. The internet of things: a survey. Information Systems Frontiers, 17(2):243–259, 2015.
83. Yanmiao Li, Yingying Xu, Zhi Liu, Haixia Hou, Yushuo Zheng, Yang Xin, Yuefeng Zhao, and Lizhen Cui. Robust detection for network intrusion of industrial IoT based on multi-CNN fusion. Measurement, 154:107450, 2020.
84. Brian Lindauer, Joshua Glasser, Mitch Rosen, Kurt C Wallnau, and L ExactData. Generating test data for insider threat detectors. JoWUA, 5(2):80–94, 2014.
85. Richard P Lippmann, David J Fried, Isaac Graf, Joshua W Haines, Kristopher R Kendall, David McClung, Dan Weber, Seth E Webster, Dan Wyschogrod, Robert K Cunningham, et al. Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation. In Proceedings DARPA Information Survivability Conference and Exposition. DISCEX’00, volume 2, pages 12–26. IEEE, 2000.
86. Liqun Liu, Bing Xu, Xiaoping Zhang, and Xianjun Wu. An intrusion detection method for internet of things based on suppressed fuzzy clustering. EURASIP Journal on Wireless Communications and Networking, 2018(1):113, 2018.
87. Yang Lu and Li Da Xu. Internet of things (IoT) cybersecurity research: A review of current research topics. IEEE Internet of Things Journal, 6(2):2103–2115, 2018.
88. Bing Liu, Wynne Hsu, and Yiming Ma. Integrating classification and association rule mining. In Proceedings of the fourth international conference on knowledge discovery and data mining, 1998.
89. Zheng Ma, Ming Xiao, Yue Xiao, Zhibo Pang, H Vincent Poor, and Branka Vucetic. High-reliability and low-latency wireless communication for internet of things: challenges, fundamentals, and enabling technologies. IEEE Internet of Things Journal, 6(5):7946–7970, 2019.
90. James MacQueen. Some methods for classification and analysis of multivariate observations. In Fifth Berkeley symposium on mathematical statistics and probability, volume 1, 1967.
91. Rwan Mahmoud, Tasneem Yousuf, Fadi Aloul, and Imran Zualkernan. Internet of things (IoT) security: Current status, challenges and prospective measures. In 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), pages 336–341. IEEE, 2015.
92. Niall McLaughlin, Jesus Martinez del Rincon, Boo-Joong Kang, Suleiman Yerima, Paul Miller, Sakir Sezer, Yeganeh Safaei, Erik Trickel, Ziming Zhao, Adam Doupé, et al. Deep Android malware detection. In Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, pages 301–308, 2017.
93. Roberto Minerva, Abyi Biru, and Domenico Rotondi. Towards a definition of the internet of things (IoT). IEEE Internet Initiative, 1(1):1–86, 2015.
94. Daniel Minoli and Benedict Occhiogrosso. Blockchain mechanisms for IoT security. Internet of Things, 1:1–13, 2018.
95. Sophia Moganedi. Undetectable data breach in IoT: Healthcare data at risk. In ECCWS 2018 17th European Conference on Cyber Warfare and Security V2, page 296. Academic Conferences and publishing limited, 2018.
96. TagyAldeen Mohamed, Takanobu Otsuka, and Takayuki Ito. Towards machine learning based IoT intrusion detection service. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, pages 580–585. Springer, 2018.
97. Nour Moustafa and Jill Slay. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
98. Farooq Muhammad, Waseem Anjum, and Khairi Sadia Mazhar. A critical analysis on the security concerns of internet of things (IoT). International Journal of Computer Applications, 111(7):1–6, 2015.
99. Nataliia Neshenko, Elias Bou-Harb, Jorge Crichigno, Georges Kaddoum, and Nasir Ghani. Demystifying IoT security: an exhaustive survey on IoT vulnerabilities and a first empirical look on internet-scale IoT exploitations. IEEE Communications Surveys & Tutorials, 21(3):2702–2733, 2019.
100. Seiichi Ozawa, Tao Ban, Naoki Hashimoto, Junji Nakazato, and Jumpei Shimamura. A study of IoT malware activities using association rule learning for darknet sensor data. International Journal of Information Security, 19(1):83–92, 2020.
101. Gonzalo De La Torre Parra, Paul Rad, Kim-Kwang Raymond Choo, and Nicole Beebe. Detecting internet of things attacks using distributed deep learning. Journal of Network and Computer Applications, page 102662, 2020.
102. Morteza Safaei Pour, Elias Bou-Harb, Kavita Varma, Nataliia Neshenko, Dimitris A Pados, and Kim-Kwang Raymond Choo. Comprehending the IoT cyber threat landscape: A data dimensionality reduction technique to infer and characterize internet-scale IoT probing campaigns. Digital Investigation, 28:S40–S49, 2019.
103. Rifkie Primartha and Bayu Adhi Tama. Anomaly detection using random forest: A performance revisited. In 2017 International conference on data and software engineering (ICoDSE), pages 1–6. IEEE, 2017.
104. Anton O Prokofiev, Yulia S Smirnova, and Vasiliy A Surov. A method to detect internet of things botnets. In 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), pages 105–108. IEEE, 2018.
105. J. Ross Quinlan. C4.5: Programs for machine learning. Machine Learning, 1993.
106. Paulo Angelo Alves Resende and André Costa Drummond. A survey of random forest based methods for intrusion detection systems. ACM Computing Surveys (CSUR), 51(3):1–36, 2018.
107. Lior Rokach. A survey of clustering algorithms. In Data Mining and Knowledge Discovery Handbook, pages 269–298. Springer, 2010.
108. Ahmed Saeed, Ali Ahmadinia, Abbas Javed, and Hadi Larijani. Intelligent intrusion detection in low-power IoTs. ACM Transactions on Internet Technology (TOIT), 16(4):1–25, 2016.
109. Iqbal H Sarker. Context-aware rule learning from smartphone data: survey, challenges and future directions. Journal of Big Data, 6(1):95, 2019.
110. Iqbal H Sarker. A machine learning based robust prediction model for real-life mobile phone data. Internet of Things, 5:180–193, 2019.
111. Iqbal H Sarker. Data science and analytics: An overview from data-driven smart computing, decision-making and applications perspective. SN Computer Science, 2021.
112. Iqbal H Sarker. Deep cybersecurity: a comprehensive overview from neural network and deep learning perspective. SN Computer Science, 2(3):1–16, 2021.
113. Iqbal H Sarker. Deep learning: A comprehensive overview on techniques, taxonomy, applications and research directions. SN Computer Science, 2021.
114. Iqbal H Sarker. Machine learning: Algorithms, real-world applications and research directions. SN Computer Science, 2(3):1–21, 2021.
115. Iqbal H Sarker, Yoosef B Abushark, Fawaz Alsolami, and Asif Irshad Khan. Intrudtree: A machine learning based cyber security intrusion detection model. Symmetry, 12(5):754, 2020.
116. Iqbal H Sarker, Alan Colman, and Jun Han. Recencyminer: mining recency-based personalized behavior from contextual smartphone data. Journal of Big Data, 6(1):49, 2019.
117. Iqbal H Sarker, Alan Colman, Jun Han, Asif Irshad Khan, Yoosef B Abushark, and Khaled Salah. Behavdt: a behavioral decision tree learning to build user-centric context-aware predictive model. Mobile Networks and Applications, 25(3):1151–1161, 2020.
118. Iqbal H Sarker, Alan Colman, Muhammad Ashad Kabir, and Jun Han. Individualized time-series segmentation for mining mobile phone user behavior. The Computer Journal, 61(3):349–368, 2018.
119. Iqbal H Sarker, Md Hasan Furhad, and Raza Nowrozy. AI-driven cybersecurity: an overview, security intelligence modeling and research directions. SN Computer Science, 2(3):1–18, 2021.
120. Iqbal H Sarker, Mohammed Moshiul Hoque, Md Kafil Uddin, and Tawfeeq Alsanoosy. Mobile data science and intelligent apps: Concepts, AI-based modeling and research directions. Mobile Networks and Applications, pages 1–19, 2020.
121. Iqbal H Sarker and ASM Kayes. Abc-ruleminer: User behavioral rule-based machine learning method for context-aware intelligent services. Journal of Network and Computer Applications, 168:102762, 2020.
122. Iqbal H Sarker, ASM Kayes, Shahriar Badsha, Hamed Alqahtani, Paul Watters, and Alex Ng. Cybersecurity data science: an overview from machine learning perspective. Journal of Big Data, 7(1):1–29, 2020.
123. Iqbal H Sarker, ASM Kayes, and Paul Watters. Effectiveness analysis of machine learning classification models for predicting personalized context-aware smartphone usage. Journal of Big Data, 6(1):57, 2019.
124. Hans Schaffers, Nicos Komninos, Marc Pallot, Brigitte Trousse, Michael Nilsson, and Alvaro Oliveira. Smart cities and the future internet: Towards cooperation frameworks for open innovation. In The future internet assembly, pages 431–446. Springer, Berlin, Heidelberg, 2011.
125. Devaraju Sellappan and Ramakrishnan Srinivasan. Association rule-mining-based intrusion detection system with entropy-based feature selection: Intrusion detection system. In Handbook of Research on Intelligent Data Processing and Information Security Systems, pages 1–24. IGI Global, 2020.
126. Vishal Sharma, Kyungroul Lee, Soonhyun Kwon, Jiyoon Kim, Hyungjoon Park, Kangbin Yim, and Sun-Young Lee. A consensus framework for reliability and mitigation of zero-day attacks in IoT. Security and Communication Networks, 2017, 2017.
127. Abraham Shaw. Data breach: from notification to prevention using PCI DSS. Colum. JL & Soc. Probs., 43:517, 2009.
128. Ali Shiravi, Hadi Shiravi, Mahbod Tavallaee, and Ali A Ghorbani. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Computers & Security, 31(3):357–374, 2012.
129. Sabrina Sicari, Alessandra Rizzardi, Luigi Alfredo Grieco, and Alberto Coen-Porisini. Security, privacy and trust in internet of things: The road ahead. Computer networks, 76:146–164, 2015.
130. Beata Ślusarczyk. Industry 4.0: Are we ready? Polish Journal of Management Studies, 17, 2018.
131. Peter HA Sneath. The application of computers to taxonomy. Journal of General Microbiology, 17(1), 1957.
132. Thorvald Sorensen. A method of establishing groups of equal amplitude in plant sociology based on similarity of species. Biol. Skr., 5, 1948.
133. Harald Sundmaeker, Patrick Guillemin, Peter Friess, and Sylvie Woelfflé. Vision and challenges for realising the internet of things. Cluster of European Research Projects on the Internet of Things, European Commission, 3(3):34–36, 2010.
134. Bambang Susilo and Riri Fitri Sari. Intrusion detection in IoT networks using deep learning algorithm. Information, 11(5):279, 2020.
135. Mayank Swarnkar and Neminath Hubballi. Ocpad: One class naive Bayes classifier for payload based anomaly detection. Expert Systems with Applications, 64:330–339, 2016.
136. Amir Taherkordi and Frank Eliassen. Scalable modeling of cloud-based IoT services for smart cities. In 2016 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), pages 1–6. IEEE, 2016.
137. Syeda Manjia Tahsien, Hadis Karimipour, and Petros Spachos. Machine learning based solutions for security of internet of things (IoT): A survey. Journal of Network and Computer Applications, 161:102630, 2020.
138. Arman Tajbakhsh, Mohammad Rahmati, and Abdolreza Mirzaei. Intrusion detection using fuzzy association rules. Applied Soft Computing, 9(2):462–469, 2009.
139. Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A Ghorbani. A detailed analysis of the KDD Cup 99 data set. In 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pages 1–6. IEEE, 2009.
140. Aakanksha Tewari and Brij B Gupta. Security, privacy and trust of different layers in internet-of-things (IoTs) framework. Future generation computer systems, 108:909–920, 2020.
141. Frederic Thiesse and Florian Michahelles. An overview of EPC technology. Sensor review, 26(2):101–105, 2006.
142. R Vinayakumar, KP Soman, and Prabaharan Poornachandran. Deep Android malware detection and classification. In 2017 International conference on ad-