Blacksite: human-in-the-loop artificial immune system for intrusion detection in internet of things
https://doi.org/10.1007/s42454-020-00017-9
RESEARCH ARTICLE
Received: 9 August 2020 / Accepted: 4 December 2020 / Published online: 2 January 2021
© The Author(s), under exclusive licence to Springer Nature Switzerland AG part of Springer Nature 2021
Abstract
The Internet of Things (IoT) has rapidly changed information systems and networks in a significant way. Traditional networks are
experiencing exponential increases in data volume, velocity, and variety. With the intermingling of IoT devices and legacy
systems, new security threats are becoming more prevalent and severe. Successful attacks can cause significant damage or
disruption to critical infrastructures or theft of private data through previously nonexistent attack vectors. In response, existing
security solutions must be adapted, or new countermeasures must be created to address the unique threats presented by IoT.
Therefore, we propose the Blacksite framework for novel, adaptive, real-time intrusion detection in IoT networks using human
intelligence integrated into an artificial immune system with a deep neural network–based validation model. We recommend a
solution that can address the unique challenges of IoT networks and present implementation strategies as well as a pilot
implementation of the core component (deep neural network model for malicious traffic classification) of Blacksite. The proposed
framework is designed to rapidly respond to attacks and adapt to changing network environments. Blacksite serves as a
foundation for further development of holistic IoT intrusion detection solutions, wherein each node contributes to the security
of the network.
Keywords: Human intelligence · Internet of Things · Intrusion detection · Artificial immune system · Negative selection algorithm · Deep neural network
This distributed manufacturing coupled with increasing market demands has made security of devices less of a priority. A key motivation for businesses to produce new technologies is financial gain. To introduce products to market sooner and with more features, the balance between functionality and security becomes skewed away from security. Furthermore, due to the distributed nature of IoT manufacturing, there is a lack of standardization for IoT device hardware and software. This exposes the devices to security risks through the commingling of devices within a single networked environment where a single point of failure could be catastrophic to the entire network. With a lack of standardization, IoT devices implement different degrees of security and are subsequently exploitable through various threat vectors.

To address the unique security challenges presented by IoT environments, a holistic approach to security is necessary. Traditional security solutions designed for single hosts and networks lack consideration for IoT environments, such as adaptability of network topology and the velocity and variety of data. The implementation of IoT networks is still relatively new, and thus, security solutions have yet to catch up to their adoption. Moreover, IoT devices present a challenge to traditional security solutions due to their modest processing power and storage capabilities, which prevent large antivirus and intrusion detection systems from being installed on those devices.

A potential solution must be robust, lightweight, adaptive, fault/error-tolerant, distributed, and dynamic. Artificial immune systems (AIS) have shown efficacy for addressing two-class deterministic problems with the previously mentioned characteristics (Brown et al. 2016a, b; Brown et al. 2017). AISs provide a robust solution which can be deployed across a distributed environment. By distributing security control across the nodes on a network, where each node performs a subtask of the overall security solution, a holistic security model can be achieved. We posit that an AIS-based intrusion detection system can address the unique challenges of future IoT networks: high volume of data, distributed devices across large geographic areas, device physical location constantly changing (e.g., smart cars), etc.

However, an AIS alone cannot tackle the unique security challenges presented by IoT devices intermingling with traditional devices on legacy networks due to the high throughput and computational requirements those networks demand. This intermingling of devices will continue far into the future, much like IPv4 has continued to survive in spite of the arrival of IPv6. Moreover, there is a lack of significant evidence of the performance of AIS implementations in real-world systems. Deep neural networks (DNN), however, have proven effective at providing data-driven solutions to many information security problems. DNNs have shown high detection rates in laboratory and real-world environments. A high detection rate is critical in high-throughput network environments. Therefore, we propose the Blacksite framework, a DNN-powered AIS for intrusion detection in IoT networks.

The applications of an AIS or a DNN in cybersecurity are not new in the literature; however, the concept of using both together with human intelligence for intrusion detection is completely novel. This innovative approach will bring significant improvement on intrusion detection systems (IDS) for IoT environments. The novelty lies within the use of a DNN algorithm to train detectors during the negative selection algorithm (NSA). Furthermore, detectors are validated using the DNN after a suspicious flow has been detected. Human-in-the-loop confirmation combined with the aforementioned techniques provides a dynamic and robust security solution for IoT networks.

Blacksite requires a paradigm shift in cybersecurity to account for the future magnitude and impact smart technologies will have on networks. A holistic and adaptive approach is necessary to address future communication network security. The line between network security and host security will begin to blur due to device and network topology dynamicity. Using human intelligence integrated with adaptive detectors provided by an AIS and a high detection rate through a DNN, Blacksite is expected to meet future network security requirements. This solution is proposed to reduce the need for standalone IDS, IPS, firewalls, etc. Once established, Blacksite can be inserted into a communication network and begin threat assessment.

In this paper, our major contributions are as follows:

• We explore the cybersecurity risks associated with IoT adoption.
• We propose the Blacksite framework, a novel intrusion detection solution built using a DNN-powered artificial immune system integrated with human feedback. A DNN-based model is used to train initial detectors of an AIS, validate those detectors in the event of an intrusion, and adapt to a changing network environment.
• We provide implementation strategies for Blacksite.

This paper is organized as follows. In Section 2, we survey related works in immunity-based, deep neural network–based, and anomaly-based intrusion detection systems for IoT. In Section 3, we discuss the IoT security landscape and the unique challenges it presents. In Section 4, a brief overview of AISs and their uses is presented. In Section 5, we provide an overview of the Blacksite framework. Section 6 presents an implementation strategy for testing and deploying Blacksite. Finally, we present our conclusions and future work in Section 7.
Hum.-Intell. Syst. Integr. (2021) 3:55–67 57
2 A review of IoT intrusion detection approaches

Research efforts have sought to address IoT security using various techniques (Zarpelao et al. 2017; da Costa et al. 2019; Hajiheidari et al. 2019). Our proposed framework is built using a DNN with an AIS for anomaly-based intrusion detection. Research has been conducted using immunity-based (Aldhaheri et al. 2020; Fernandes et al. 2017) and machine learning (DNN) approaches for anomaly-based intrusion detection; however, the inclusion of both within one system for holistic detection has not been explored. We overview various related research efforts in the following sections.

2.1 Immunity-based IDS for IoT

Pamukov et al. (2018) explored the efficacy of combining NSA with a neural network for intrusion detection. Their proposed solution seeks to train a neural network based on the detectors generated by NSA, which are generated and trained using a “self” dataset. The resulting detectors are then labeled “non-self.” The original “self” dataset and the “non-self” detectors are subsequently used to train a single-layer feed-forward neural network. The proposed solution, although intriguing, was not tested in an IoT environment; therefore, its applicability is still undetermined. In 2020, Pamukov et al. extended their previously described NSA with a NN by comparing a feed-forward NN to a cascade-forward NN, which contains more input-to-hidden and hidden-to-hidden layer connections. Both NN algorithms performed similarly. The same limitations are present as were observed in their original work.

Alalade (2020) proposed applying the extreme learning machine and artificial immune system (AIS-ELM) introduced by Tian et al. (2018) to anomaly-based intrusion detection in smart home environments. Clonal selection, an algorithm inspired by biological immune systems, is used to optimize input weights and hidden biases for the extreme learning machine (ELM). Subsequently, the ELM is used to detect anomalies in the network. This differs from our work because an AIS-inspired algorithm was used to train another classifier. In contrast, our work proposes the use of both AIS classifiers and ML classifiers in conjunction with one another.

Greensmith (2015) proposed the use of artificial immune system technologies to address cyber intrusions in IoT networks. Greensmith proposed the use of a modified Dendritic Cell Algorithm, where signals are generated when anomalous behavior is detected, to identify possible intrusion on IoT nodes. Those signals are interpreted by an analysis engine once a certain threshold of anomalous signal is received. A limitation of this proposal was the lack of an implementation scheme.

2.2 Use of deep neural networks for IDS

DNNs, a type of artificial neural network, have been gaining significant popularity due to their training stability, generalizability, and scalability. A DNN extracts complex and nonlinear features from training data to build a classification model. DNNs are distinct in their use of multi-layered neurons, whereby several layers of decision nodes coordinate to transform an input parameter to a desired output (classification). A few research initiatives employed DNNs in IoT IDS; however, there is a lack of systematic research on DNN-based AIS for IoT.

Diro and Chilamkurti (2018) proposed a deep learning algorithm as a novel approach for intrusion detection for IoT. Their proposed system used distributed fog nodes (data processing nodes located at the network edge) within an IoT network to collect data, train the model, and perform local attack detection. A centralized coordinating master node is responsible for collaborative parameter sharing and optimization. However, the authors noted that many zero-day algorithms could circumvent their deep learning approach, resulting in longer evaluation times for possible attacks. On the contrary, our proposed model uses the initial detectors to serve as a quick response mechanism for real-time detection, and then a deep neural network is used for validation of the attack and attack response.

2.3 Anomaly-based IDS for IoT

Fu et al. (2011) proposed a solution for IoT IDS using anomaly-based detection. They collected data from time slice windows of network activity. Time slices were collected until the distance between each subsequent slice was below a specified threshold. These slices act as the network baseline. During anomaly detection, if a suspected slice significantly deviates in distance from the initial baseline, it is classified as anomalous. A limitation of this approach is that not all anomalies can be detected, such as those which occur during the initial baseline phase or overlap across multiple time slices.

Ding et al. (2013) proposed a game theoretic approach for network behavior analysis. In their scheme, normal nodes are considered selfish such that they seek to use a minimal amount of resources by only sending data to a specific destination. In contrast, attacker nodes advertise information, which consumes significant network resources. The proposed solution uses a stochastic approach to analyze node behavior. Moreover, a given set of moves is calculated for the participant players which distinguishes between normal and attacker. The limitation of this approach is a lack of implementation for actual IoT networks and lack of real-time analysis.

Rajasegarar et al. (2014) proposed a distributed anomaly detection framework which used hyper-ellipsoidal groups to show behavioral information of each node. A score is given to
Infected devices could also be used as a pivot point to infiltrate the network further. As Jones and Carter (2017) describe, attackers infiltrated Target’s payment information system through vulnerabilities in the HVAC system to steal customer data. Attackers used their access within the HVAC system to pivot to the payment information system and exfiltrate credit card data. Both systems were configured on the same virtual local area network, thereby making this attack possible. A simple oversight exposed a critical system to attack through network misconfiguration and Internet-enabled devices. This example highlights the need for robust, adaptive, and easily deployable IoT security solutions.

3.1 Difficulties securing IoT networks

Securing IoT devices poses a significant challenge due to resource constraints, the distribution of devices across potentially large geographic networks, the disparity in security between hardware and software vendors, rapid development lifecycles, and the lack of training, standardization, policies, and laws. IoT presents a unique security challenge because traditional frameworks for securing local hosts and networks cannot be efficiently deployed to small embedded devices. The security of IoT networks cannot rely on previously established security solutions because of the aforementioned constraints and challenges of IoT networks.

3.1.1 Resource constraints

Many IoT devices are constrained by computational, storage, and other similarly limited resources. Many devices are designed to perform in low-powered environments for longevity and minimal intervention. Specifically, embedded devices and sensors may operate in hard-to-reach environments. Therefore, processing and data transmission use battery power. The OS and programs on the device must be efficient.

For example, RFID chips contain just enough processing capability to collect and transmit data. RFID chips store data and utilize the electromagnetic field of an RFID scanner to power the chip during data transmission. This scenario presents a unique situation where a traditional security solution cannot be applied. Consequently, a potential solution must be lightweight and portable to accommodate a wide range of devices.

3.1.2 Distribution of devices

IoT devices could be geographically located across small or large areas but still require communication. Security solutions must account for the various deployment models in IoT. From small home networks to expansive global networks, IoT devices must effectively communicate through secure channels. For example, home sensors monitor occupant activity and status. This data is collected, analyzed, and transmitted to the occupant’s mobile device through cellular networks. This data must traverse the Internet to reach the end device, which leaves the home sensors exposed to external threats. This example is a microcosm of the distributed nature of many IoT networks.

3.1.3 Human aspects of security: lack of training, standardization, policies, and laws

A lack of user training can have dramatic effects when considering IoT devices. Examples include end-users who misconfigure their devices and create undesired vulnerabilities; employees who either misinterpret IoT device data or receive compromised data and use that data for business decisions; software developers who lack formal secure software development skills; and cybersecurity professionals who misconfigure security devices due to a lack of IoT security training.

IoT standardization can be divided into platform, connectivity, business model, and applications (Banafa 2016). An IoT platform must account for the variety in form and design of products, analysis of terabytes of data streaming across devices, and scalability to accept new protocols and technologies, while maintaining connectivity across diverse deployments in a cohesive and unified system.

Technological advances are typically driven by financial motivations. The unification of application development is predicated on financial bottom-line considerations. Businesses value financial gain from product development over allocating the necessary resources toward the development of standardized security strategies. Moreover, IoT standardization requires regulatory oversight and applicable laws for governance.

Historically, policies and laws have not progressed with technology. This is especially evident in the digital age. As more IoT devices and autonomous systems are connected to the network, the policies and laws outlining their uses and regulation must constantly evolve. Policies and laws are a critical piece in security because humans are the weakest link in a secure system, and policies and laws contribute to human aspects of security.

In many security chains, the weakest link is typically the human factor. This is an unfortunate reality that security professionals must address to provide security to critical systems. For example, there is significant research regarding password security due to minimal password policies and users circumventing the policy (Morris and Thompson 1979; Ur et al. 2016). The previous example is a microcosm of the difficulty in securing systems. The human factor plays a critical role in the security of systems. Therefore, security solutions must leverage human intelligence coupled with cyber systems and machine learning.

4 Artificial immune system

Inspired by the human immune system, the artificial immune system (AIS) attempts to mimic the natural immune system
that produces biological signals and responses when a foreign antigen is encountered (Yuan et al. 2014; Zhao et al. 2013). More specifically, T cells are created and distributed throughout the body. Once a T cell matches a “Self” cell, it undergoes a lysis process, or commits suicide. After this phase, the remaining T cells will detect any antigens which are not identified as Self. The traditional AIS algorithm is shown in Algorithm 1.

An AIS solves two-class deterministic problems. An AIS is composed of a set of detectors. Detectors are bit-strings which can be matched to network flows to detect malicious intrusions. They are generated through various techniques. Primarily, the negative selection algorithm (NSA) is used for detector generation. Negative selection aims to build a set of detectors which have identified “Self” during a training phase, and therefore, can positively detect “Non-self” (Hofmeyr 1999; Hofmeyr and Forrest 1999). “Self,” in this context, is defined as normal network activity, whereas “Non-self” is defined as anomalous activity.

Furthermore, a threshold matching rule is used to determine a match between two network flows. In a traditional AIS, an r-consecutive bit matching (RCBM) rule, where a minimum threshold, r, of consecutive bits must be identical between two bit-strings, is typically used for match determination. However, for Blacksite, a similar scheme is deployed but instead of r-consecutive bits, r-features must be identical. The r-feature matching rule dictates that a threshold value, r, of features must match, either consecutively or non-consecutively, between a detector and a suspicious flow for a match to occur. For example, using the CSE-CIC-IDS2018 dataset (Sharafaldin et al. 2018), 76 features are collected for bi-directional network connections (i.e., a flow). Those features include total forward packets, total backwards packets, and total length of forward packets, to name a few. Given r = 50, 50 or more features must be identical between a given detector and a suspicious flow for a match (detection) to occur.

Significant research has been conducted to utilize AISs for addressing problems in cybersecurity, robotics, fraud detection, and anomaly detection (Ramakrishnan and Srinivasan 2009), to name a few. Through extrapolation from the natural immune system, an ideal AIS provides several desirable characteristics (Greensmith 2015; Fernandes et al. 2017) for security:

• Unique: Detectors can be assigned to focus specifically on a single device.
• Robust: Small deviations in malicious samples do not significantly affect detection because detectors provide partial matching through a threshold matching rule.
• Diverse: Detectors are constantly regenerated and replaced, which provides a diverse set of detectors.
• Anomaly detection: During NSA, detectors are only retained if they do not match “Self” (i.e., benign traffic). Therefore, the remaining detectors should match any “Non-self” (i.e., anomalous traffic).
• Lightweight: It does not require significant computational resources for detection nor significant storage.
• Fault tolerant: Imprecise and uncertain data conditions, such as missing data or unstructured data, do not drastically affect classification accuracy.
• Adaptive: The AIS is constantly evolving and learning the network environment. The detectors are changed in response to network changes.
• Distributed: It does not require centralized control. Data collection and detectors can be distributed to individual devices.

5 Blacksite framework overview

The proposed Blacksite framework, shown in Fig. 2, extends NSA of AIS by incorporating a deep neural network (DNN) model for detector set generation. The DNN additionally adds validation of detected suspicious traffic, which is expected to significantly reduce false positives, or benign traffic incorrectly identified as malicious, and increase the detection rate, or the percent of malicious traffic successfully detected. The NSA is implemented by first training a supervised DNN model using labeled data of “Self”/“Non-self” instances. Then, a detector set is created by randomly generating network flow detectors. When a flow arrives for analysis, the detector set uses threshold matching to determine a match. If a match occurs, the flow is labeled suspicious and forwarded to the DNN model for validation. If the DNN model validates the match, the flow is forwarded to the human expert for final manual confirmation, where the flow is labeled confirmed and added to the training dataset to be used for updating the DNN model.

Our proposed framework delivers two main innovations: real-time analysis of threats and zero-day (previously unseen) attack detection. By using a lightweight detector set as the initial
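The r-feature matching rule and the DNN-guided negative selection described in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_classifier` is a hypothetical stand-in for the trained DNN, the function names and the `tol` parameter are assumptions introduced for illustration, and detectors are drawn uniformly from [0, 1) on the assumption that flow features are normalized to that range.

```python
import random


def r_feature_match(detector, flow, r, tol=0.0):
    """r-feature rule: at least r features agree; positions need not be adjacent.

    tol is an illustrative assumption: with tol=0.0 features must be identical,
    as in the text; a small positive tol would tolerate near-equal real values.
    """
    if len(detector) != len(flow):
        raise ValueError("detector and flow must have the same number of features")
    agreeing = sum(1 for d, f in zip(detector, flow) if abs(d - f) <= tol)
    return agreeing >= r


def generate_detectors(classify, n_features, n_detectors, rng, max_tries=100_000):
    """Negative selection: discard random candidates classified as benign ("Self")."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = [rng.random() for _ in range(n_features)]
        if classify(candidate) == "malicious":
            detectors.append(candidate)  # retained as a new (Immature) detector
    return detectors


def toy_classifier(vector):
    """Hypothetical stand-in for the trained DNN; NOT a real traffic model."""
    return "malicious" if sum(vector) / len(vector) > 0.5 else "benign"


rng = random.Random(0)
detectors = generate_detectors(toy_classifier, n_features=76, n_detectors=5, rng=rng)

# Build a flow that differs from the first detector in 26 of 76 features,
# so exactly 50 features remain identical and the rule fires for r = 50.
flow = list(detectors[0])
for i in range(26):
    flow[i] = 2.0  # outside [0, 1), so it cannot equal the detector's value
print(r_feature_match(detectors[0], flow, r=50))
```

Requiring exact equality on real-valued features is strict, which is why the sketch exposes `tol`; how close two feature values must be to count as a match is a design choice the text leaves open.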
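The detect, validate, and confirm flow of the framework can be sketched as a single decision function. This is a hedged illustration rather than the Blacksite implementation: the names `Tier`, `Detector`, and `triage` are hypothetical, the DNN and the human expert are passed in as plain callables, and a human verdict of `None` is assumed to represent an elapsed confirmation timeframe.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Sequence, Tuple


class Tier(Enum):
    IMMATURE = 1
    MATURE = 2
    MEMORY = 3


@dataclass
class Detector:
    features: List[float]
    tier: Tier = Tier.IMMATURE


def matches(detector: Detector, flow: Sequence[float], r: int) -> bool:
    """r-feature rule: at least r positions hold identical values."""
    same = sum(1 for d, f in zip(detector.features, flow) if d == f)
    return same >= r


def triage(
    flow: Sequence[float],
    detectors: List[Detector],
    r: int,
    dnn_is_malicious: Callable[[Sequence[float]], bool],
    human_verdict: Callable[[Sequence[float]], Optional[bool]],
    training_set: List[Tuple[List[float], str]],
) -> str:
    """Return "allow" or "deny" for one flow, promoting detectors along the way."""
    hits = [d for d in detectors if matches(d, flow, r)]
    if not hits:
        return "allow"                      # no detector flagged the flow
    if not dnn_is_malicious(flow):
        return "allow"                      # DNN classified the flow as benign
    for d in hits:
        if d.tier is Tier.IMMATURE:
            d.tier = Tier.MATURE            # DNN validated the detection
    verdict = human_verdict(flow)
    if verdict is None:
        return "deny"                       # timeframe elapsed: DNN result stands
    if verdict:
        for d in hits:
            d.tier = Tier.MEMORY            # human confirmed the detection
        training_set.append((list(flow), "malicious"))
        return "deny"
    training_set.append((list(flow), "benign"))
    return "allow"


detector = Detector([0.1, 0.2, 0.3, 0.4])
training: List[Tuple[List[float], str]] = []
decision = triage(
    [0.1, 0.2, 0.3, 0.9], [detector], 3,
    dnn_is_malicious=lambda f: True,
    human_verdict=lambda f: True,
    training_set=training,
)
print(decision, detector.tier.name)  # prints "deny MEMORY"
```

Only human-reviewed flows are appended to `training_set`, so a timeout leaves the training data untouched while the DNN's deny decision still takes effect.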
validation. If the detectors do not detect a flow as suspicious, it is allowed for transmission.
4. DNN validation for mature detector set: The DNN model classifies a suspicious flow received from the detector set as either malicious or benign. If the suspicious flow is classified as malicious by the model, then the model has validated the classification of the detector(s) which originally detected the suspicious flow. Therefore, the set of original detectors is promoted to Mature. The malicious flow is denied from transmission and forwarded to the human-in-the-loop for final confirmation. If the DNN-based model classifies the suspicious flow as benign, it is allowed for transmission.
5. Human-in-the-loop confirmation: The human-in-the-loop has final confirmation rights. However, the human-in-the-loop can be ignored if a specified timeframe has elapsed from DNN validation to human-in-the-loop confirmation. If that timeframe has not elapsed, and the human expert manually confirms the malicious classification of the suspicious flow, the set of original detectors is promoted to Memory, the flow is added to the training dataset as a malicious instance, and the flow is denied. If the human expert determines the suspicious flow is benign, the flow is added to the training dataset for future training of the DNN-based model as a benign instance and allowed for transmission.

In the following sections, we discuss the placement of Blacksite in a network and provide an overview of the main components in the framework: DNN-based model, detector set, and human-in-the-loop. We detail how the DNN-based model is trained and deployed. We then discuss detector generation and labeling. Lastly, we discuss the human-in-the-loop confirmation process.

5.1 Placement

Blacksite will be integrated into an IoT integration middleware (Guth et al. 2016), which can also be referred to as edge computing (Satyanarayanan 2017). Data will be received from devices and intermediary gateways (i.e., if translation of data is required from an IoT protocol, such as ZigBee, to a corresponding transport protocol, such as HTTP) and processed in a centralized location within an IoT integration middleware platform. As an IoT integration middleware, Blacksite communicates and interacts with devices directly (or through a gateway) to deny/allow flows when malicious instances are detected. Future iterations of Blacksite will seek to eliminate intermediary gateways for communication to devices.

A lightweight software package will be installed directly on devices, which communicate with Blacksite to collect and analyze data, and respond to malicious threats. We explore this further in Section 7. This represents the best-case scenario; however, due to processing and storage limitations of many IoT devices, a gateway may still be necessary for translation. In many modern IoT networks, the gateway plays a critical role in aggregating data from distributed devices. However, to alleviate the computation load on the gateway, we abstract Blacksite from the gateway device itself and place it in an IoT integration middleware or edge computing platform. The Blacksite framework can be ported to many different environments. Each phase, described in the next few sections, can occur on physically separate systems with data collection occurring at the source node or in a central location.

5.2 Deep neural network model

As described above, the DNN-based model serves two purposes: (1) implementation of NSA for detector set generation and (2) suspicious flow validation. The feed-forward multi-class classification model is trained using a labeled dataset of benign and malicious flow feature vectors. Once the model is trained, it is tuned by sweeping through a set of batch sizes, where a batch is the number of instances the model evaluates during training before the model weights are updated.

When the model training is deemed sufficient, NSA begins in order to develop the detector set. First, a detector is created by randomly generating a set of features equivalent in length to training dataset vectors, then evaluated by the DNN. If the detector is classified as benign, it is destroyed, and a new detector is randomly generated to replace it for consideration. If the detector is classified as malicious, it is retained and labelled as an Immature detector with lifespan T_Immature and added to the detector set. After a specified number of detectors have been added to the detector set, NSA ends.

After the detector set has been created, it begins the analysis phase where new flows are checked against the detectors using threshold matching to determine if they are benign or suspicious. Network flows will be collected through a network analyzer. If one or more detector(s) within the detector set matches a suspicious instance through threshold matching, the instance is forwarded to the DNN model for validation. The model will classify the suspicious instance as benign or malicious. If the instance is malicious, the original detector(s) responsible for detecting the suspicious flow are promoted to Mature with a lifespan T_Mature. The instance is denied from transmission and forwarded to the human-in-the-loop for final manual confirmation. If the suspicious instance is classified as benign, it is allowed to transmit.

The DNN-based model is updated when new instances are added to the training dataset by the human-in-the-loop within a given timeframe. The timeframe is critical to account for
unexpectedly large volumes of malicious traffic, as in a distributed denial-of-service attack.

5.3 Detector set generation

Detectors in an AIS can be equated to the T cells of the biological immune system. There are three types of detectors: (1) Immature, (2) Mature, and (3) Memory. Each successively higher classification of detectors exhibits progressively longer lifespans, i.e., T_Immature < T_Mature < T_Memory. Immature detectors are randomly generated to identify benign and detect suspicious traffic. Mature detectors were previously Immature detectors, which were promoted as a result of identifying a suspicious flow that was validated by the DNN-based model. Lastly, a Mature detector is promoted to Memory when the suspicious flow is confirmed by the human expert.

When the lifespan of any type of detector has elapsed, the detector is replaced by a randomly generated Immature detector. The Immature detector must undergo NSA to be added to the detector set. If a detector successfully detects a suspicious instance prior to the end of its lifetime, it is either promoted or, in the case of Memory detectors, its lifespan is extended. The various types of detectors exist to retain detectors which perform well and replace those which are ineffective. This tiered approach to detectors provides dynamicity and adaptability to changing network conditions. The detector set functions as an initial detection system to quickly identify suspicious instances and forward them to the DNN-based model and human-in-the-loop for deeper inspection.

An individual detector consists of n randomly generated features, where n = length of training set feature vectors. For each feature, the randomly generated number is between 0 and 1 to adhere to the normalized values expected by the DNN.

5.4 Human-in-the-loop confirmation

Human intelligence is a critical component for model refinement. The human-in-the-loop serves as the final confirmation mechanism. In this process, suspicious instances received from the DNN model are manually confirmed malicious or deemed benign. Once the suspicious instance has reached the human-in-the-loop, the flow is displayed to the human expert in clear text. The manual inspection process is similar to a typical incident response process (Freiling and Schwittay 2007), whereby a human expert conducts deep analysis on suspicious flows by reviewing the flows’ features (e.g., flow byte/s, packet flow/s, etc. (Sharafaldin and Lashkari 2018)) based on experience and training.

If the human-in-the-loop confirms the suspicious flow as malicious, the flow is denied from transmission and added to the training dataset as a malicious instance. The original detectors which identified the suspicious instance are promoted to Memory with lifespan T_Memory. The framework can continue functioning without human-in-the-loop intervention, if necessary. If a human expert does not confirm the classification of an instance within a specified timeframe, the framework will continue using DNN validation and remove the suspicious flow from human-in-the-loop consideration. The flow will not be added to the training dataset since human-in-the-loop confirmation was not received.

6 Implementation strategy

6.1 Dataset

Due to the lack of publicly available IoT IDS datasets and the legacy nature of many commonly used datasets (e.g., DARPA (Sharafaldin et al. 2018)), we have identified two potential sources of data for Blacksite implementation: (1) the CSE-CIC-IDS2018 dataset (Sharafaldin et al. 2018) and (2) locally simulated data. The CSE-CIC-IDS2018 dataset is a systematically generated IDS dataset of diverse and comprehensive intrusion detection network data. The dataset was built by the University of New Brunswick Canadian Institute for Cybersecurity based on the creation of user profiles which contain abstract representations of events and behaviors seen on the network. The dataset consists of network traffic representing normal traffic and seven different attack scenarios, such as brute force attacks and botnet attacks.

Our second source of data will originate from an in-house testbed environment. The environment will consist of traditional devices (PCs, switches, routers), and network traffic will be recorded and analyzed using the same network analyzer, CICFlowMeter-V3, as the CSE-CIC-IDS2018 dataset for consistency.

6.2 Preliminary results

Preliminary results show a high accuracy for the DNN in classifying network flows using 2 days of network flow from the CSE-CIC-IDS2018 dataset. Training for 100 epochs (number of times to iterate through each flow during training) with a batch size of 80 using 10-fold cross-validation produced an average accuracy of 98.92%
The human expert will have final determination to either with a standard deviation of 0.23%. This showed prom-
allow or deny a flow. If the human expert determines a flow is ising results toward the potential of the DNN in detect-
benign, it is allowed for transmission, and added to the train- ing malicious flows. Below, we discuss the experimental
ing dataset as a benign instance. Consequently, if the human- setup and preliminary results in depth.
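The way a mean accuracy and standard deviation are obtained from 10-fold cross-validation can be illustrated with a minimal sketch. It is not the Blacksite implementation: the DNN (and therefore the epoch and batch-size settings) is replaced by a hypothetical nearest-centroid stand-in so the example runs with the Python standard library only, the flows are synthetic four-feature vectors normalized to [0, 1], and the sample standard deviation is assumed because the text does not state which estimator was used. All function names are illustrative.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_centroids(features, labels):
    """Stand-in for DNN training (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def cross_validate(features, labels, k=10, seed=0):
    """Return (mean accuracy, sample standard deviation) over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        model = train_centroids(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def make_toy_flows(n=200, seed=1):
    """Synthetic flows (assumption): label 0 = benign, label 1 = malicious."""
    rng = random.Random(seed)
    features, labels = [], []
    for i in range(n):
        label = i % 2
        center = 0.25 if label == 0 else 0.75
        # Clamp to [0, 1] to mimic normalized flow features.
        features.append(
            [min(1.0, max(0.0, rng.gauss(center, 0.1))) for _ in range(4)]
        )
        labels.append(label)
    return features, labels


if __name__ == "__main__":
    flows, flow_labels = make_toy_flows()
    mean_acc, std_acc = cross_validate(flows, flow_labels, k=10)
    print(f"mean accuracy = {mean_acc:.4f}, std = {std_acc:.4f}")
```

In an actual experiment, `train_centroids` and `predict` would be replaced by fitting and querying the DNN on each training fold, with the epoch count and batch size passed to that training step.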
64 Hum.-Intell. Syst. Integr. (2021) 3:55–67
6.2.1 Experimental setup

Table 2 Results of batch size sweep using 10-fold cross-validation
7 Study limitations

A major limitation of this work is the limited availability of IoT IDS datasets. Currently, there is a significant lack of community-available datasets which focus specifically on IoT environments. Researchers have used the KDD Cup 1999 dataset (KDD Cup 1999 Data 1999) for IDS exploration. Although this dataset has been sufficient for traditional security research, IoT presents a unique challenge which is not fully captured by this dataset. The diversity of devices and sophistication of modern attacks could not be captured until recently.

Another limitation of this study is the lack of experimentation for the full model. In this study, the initial DNN was trained and evaluated. We did not evaluate the effect, either in detection rate or detection speed, of the AIS components, specifically, the detector set. Therefore, the experimentation results of this study were strictly limited to DNN training and analysis.

Future work, as described in the next section, will seek to evaluate the efficacy of the detector set.

8 Conclusion and discussion

In conclusion, IoT devices will be integral to future communication networks which require adaptive, dynamic, and efficient IDS to protect against attacks. Traditional security systems were not designed with these considerations in mind. Furthermore, human intelligence integration is critical when designing state-of-the-art intrusion detection systems. Therefore, we present Blacksite for intrusion detection in IoT networks. Blacksite combines human intelligence and a deep neural network (DNN) model with an artificial immune system (AIS) to create a sophisticated security solution that can adjust to dynamic environments, protect against cyber threats, and still support high volume IoT communication. The initial AIS detectors are trained using the DNN model and then serve as the first line of defense. If a detector identifies a suspicious flow, deeper inspection is conducted by the DNN model followed by the human-in-the-loop process. However, if necessary, the framework can function without human intervention based on prior experience. Future communication networks require holistic security solutions which identify the network and devices connected to it in a unified fashion. Blacksite is designed to achieve that goal while still maintaining the expected efficiency of such a solution.

In preliminary implementation, a DNN trained on IDS data achieved a peak accuracy of 99.74% when classifying malicious attacks. These results show the potential effectiveness of DNN for intrusion detection. In future work, we plan to implement and test the entire Blacksite framework as an integrated system. As described in Section 6, the dataset will be critical for future work. The initial step will be to implement the framework into a working model. We will then train the model using the dataset described above. The next step will be validation and testing of the DNN-based model. Next, we will generate initial immature detectors, then validate and test those detectors on unseen data. The two detection methods (DNN-based classifier and detector set) will be combined to accomplish phases 1–3 as described in Section 5. Upon the conclusion of phases 1–4 testing, the human-in-the-loop will be added to the framework for phase 5. At this point, the entire framework will be implemented and tested as one system. The performance and efficiency, in terms of computational, time, and resource overhead, will be measured.

We envision Blacksite as a distributed framework. Future work will explore device-specific or fog node modules which collect data (e.g., CPU process data, network flows) and provide initial detection using local resources. A small detector set will reside on each IoT device to analyze data as they are generated. If a detection is found, the suspicious instance will be sent to a centralized controller where DNN validation and human-in-the-loop confirmation will occur. This provides a granular inspection of suspicious data at the source and offloads computational overhead from the centralized controller to the participating nodes. However, the distribution scheme must account for limited computational resources on many IoT devices. Therefore, the detector set, DNN, and human-in-the-loop may remain on the centralized controller while participating nodes would be solely responsible for data collection. The centralized controller could be a fog node or reside in the cloud. A cloud-based approach would grant nearly limitless processing and storage resources but could inject latency and other security concerns, whereas a fog node approach brings detection closer to the source, at the potential expense of available resources and complex implementation.

Future research will also explore the efficacy of Blacksite against DDoS attacks. The network packets used in a DDoS attack individually may appear to an IDS as legitimate traffic and therefore may not elicit a detection response; however, the voluminous amount of those flows should be detected. The current Blacksite framework has mechanisms which address potential DDoS attacks (deny/allow flows based on DNN validation and/or human-in-the-loop confirmation); however, this process may not be very efficient when dealing with large amounts of data. Therefore, additional mechanisms, such as long short-term memory (LSTM) neural network algorithms, will be explored to address sequence-specific traffic emblematic of DDoS attacks.

Funding This work was supported in part by the US Government, including the US Department of Education under the Title III Historically Black Graduate Institutions (HBGI) grant.

Data availability Data is publicly available at https://www.unb.ca/cic/datasets/ids-2018.html
Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

References

Abhishta A, van Rijswijk-Deig R, Nieuwehuis LJ (2019) Measuring the impact of a successful DDoS attack on the customer behaviour of managed DNS service providers. ACM SIGCOMM Comput Commun Rev 5:70–76
Alalade ED (2020) Intrusion detection system in smart home network using artificial immune system and extreme learning machine hybrid approach. 2020 IEEE 6th World Forum on Internet of Things (WF-IoT) (pp. 1-2). New Orleans, LA, USA: IEEE
Aldhaheri S, Alghazzawi D, Cheng L, Barnawi A, Alzahrani BA (2020) Artificial immune systems approaches to secure the internet of things: a systematic review of the literature and recommendations for future research. J Inf Secur Appl 35:138–159
Banafa A (2016) IoT standardization and implementation challenges. IEEE Internet of Things
Bin Onn A, Salim S, Ahmad M, Jamil M (2014) Development of a network-enabled traffic light system. 2014 IEEE International Conference on Control System, Computing and Engineering (ICCSCE 2014) (pp. 241–244). Batu Ferringhi
Brown J, Anwar M, Dozier G (2016a) Detection of mobile malware: an artificial immunity approach. IEEE Security and Privacy Workshops (SPW) (pp. 74-80). San Jose
Brown J, Anwar M, Dozier G (2016b) Intrusion detection using a multiple-detector set artificial immune system. 17th International Conference on Information Reuse and Integration (IRI) (pp. 283-286). Pittsburgh
Brown J, Anwar M, Dozier G (2017) An artificial immunity approach to malware detection in a mobile platform. EURASIP J Inf Secur 2017(1):7
da Costa K, Papa J, Lisboa C, Munoz R, de Albuquerqu V (2019) Internet of things: a survey on machine learning-based intrusion detection approaches. Comput Netw 151:147–157
Ding Y, Cheng XZ, Lin F (2013) A security differential game model for sensor networks in context of the internet of things. Wireless Personal Communications 72(1):375–388
Diro A, Chilamkurti N (2018) Distributed attack detection scheme using deep learning approach for Internet of Things. Futur Gener Comput Syst 82:761–768
Fernandes DA, Freire MM, Fazendeiro PA, Inacio PR (2017) Applications of artificial immune systems to computer security: a survey. Journal of Information Security and Applications 35:138–159
Freiling F, Schwittay B (2007) A common process model for incident response and computer forensics. Imf 7:19–40
Fu R, Zheng K, Zhang D, Yang Y (2011) An intrusion detection scheme based on anomaly mining in Internet of Things. IET International Conference on Wireless, Mobile and Multimedia Networks (ICWMMN 2011) (pp. 315-320). IET
Greensmith J (2015) Securing the Internet of Things with responsive artificial immune systems. Annual Conference on Genetic and Evolutionary Computation. ACM
Guth J, Breitenbucher U, Falkenthal M, Leymann F, Reinfurt L (2016) Comparison of IoT platform architectures: a field study based on a reference architecture. 2016 Cloudification of the Internet of Things (pp. 1-6). IEEE
Hajiheidari S, Wakil K, Badri M, Navimipour N (2019) Intrusion detection systems in the Internet of things: a comprehensive investigation. Comput Netw 160:165–191
Harper R (2006) Inside the smart home. Springer Science & Business Media
Hofmeyr S (1999) An immunological model of distributed detection and its application to computer security. Ph.D Dissertation, University of New Mexico, Department of Computer Science
Hofmeyr S, Forrest S (1999) Immunity by design: an artificial immune system. Genetic and Evolutionary Computation Conference (GECCO-1999) (pp. 1289-1296). San Francisco
Hollands RG (2008) Will the real smart city please stand up? Intelligent, progressive or entrepreneurial? City 12(3):303–320
Jiang L, Liu DY, Yang B (2004) Smart home research. 2004 International Conference on Machine Learning and Cybernetics (IEEE CAT. No. 04EX823). 2, pp. 659-663. IEEE
Jones CB, Carter C (2017) Trusted interconnections between a centralized controller and commercial building HVAC systems for reliable channel response. IEEE Access 5:11063–11073
Katagi M, Moriai S (2008) Lightweight cryptography for the internet of things. Sony Corporation
KDD Cup 1999 Data (1999) Retrieved from Donald Bren School of Information & Computer Science at University of California, Irvine: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
Mansfield-Devine S (2016) DDoS goes mainstream: how headline-grabbing attacks could make this threat an organisation’s biggest nightmare. Netw Secur 11:7–13
Morris R, Thompson K (1979) Password security: a case history. Commun ACM 22(11):594–597
Nam T, Pardo AT (2011) Conceptualizing smart city with dimensions of technology, people, and institutions. 12th Annual International Digital Government Research Conference: Digital Government Innovation in Challenging Times (pp. 282-291). ACM
Pamukov M, Poulkov V, Shterev V (2018) Negative selection and neural network-based algorithm for intrusion detection in IoT. In: 41st International Conference on Telecommunications and Signal Processing (TSP). IEEE
Pamukov M, Poulkov V, Shterev V (2020) NSNN algorithm performance with different neural network architectures. In: 2020 43rd International Conference on Telecommunications and Signal Processing (TSP). IEEE
Rajasegarar S, Gluhak A, A IM, Nati M, Moshtaghi M, Lecki C, Palaniswami M (2014) Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks. Pattern Recogn 47(9):2867–2879
Ramakrishnan S, Srinivasan S (2009) Intelligent agent based artificial immune system for computer security - a review. Artif Intell Rev 32(1–4):13–43
Satyanarayanan M (2017) The emergence of edge computing. Computer 50(1):30–39
Sharafaldin I, Lashkari AG (2018) Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. 4th International Conference on Information Systems Security and Privacy (ICISSP). Portugal
Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. In ICISSP, pp. 108-116
Tian HY, Li SJ, Wu TQ, Yao M (2018) An extreme learning machine based on artificial immune system. Comput Intell Neurosci. https://doi.org/10.1155/2018/3635845
Ur B, Bees J, Sebreti SM, Bauer L, Christin N, Cranor LF (2016) Do users' perceptions of password security match reality? Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp 3748-3760
Wang J, Kuang Q, Duan S (2015) A new online anomaly learning and detection for large-scale service of internet of thing. Pers Ubiquitous Comput 19(7):1021-1031
Yuan Z, Lu Y, Wang Z, Xue Y (2014) Droid-sec: deep learning in android malware detection. ACM Conference on SIGCOMM (pp. 371-372). ACM
Zarpelao B, Miani RK, de Alvarenga S (2017) A Survey of Intrusion Detection in Internet of Things. J Netw Comput Appl 84:25–37
Zhang C, Green R (2015) Communication Security in Internet of Thing: Preventive Measure and Avoid DDoS Attack over IoT Network. 18th Symposium on Communications & Networking (pp. 8-15). Society for Computer Simulation International
Zhao M, Zhang T, Wang J, Yuan Z (2013) A smartphone malware detection framework based on artificial immunology. J Netw 8(2):469–476

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.