A Voting Gray Wolf Optimizer-Based Ensemble Learning Models For Intrusion Detection in The Internet of Things

This article proposes a new approach for detecting intrusion attacks in an IoT network using an ensemble learning technique based on gray wolf optimizer. The proposed model employs a voting technique combining the probability averages of base learners optimized by GWO. When tested on two datasets, the ensemble model achieved very high accuracy, detection rate, and other performance metrics, demonstrating its effectiveness for intrusion detection in IoT networks.


International Journal of Information Security

https://doi.org/10.1007/s10207-023-00803-x

REGULAR CONTRIBUTION

A voting gray wolf optimizer-based ensemble learning models for intrusion detection in the Internet of Things
Yakub Kayode Saheed1 · Sanjay Misra2

Accepted: 11 December 2023

Abstract
The Internet of Things (IoT) has garnered considerable attention from academic and industrial circles as a pivotal technology
in recent years. The escalation of security risks is observed to be associated with the growing interest in IoT applications.
Intrusion detection systems (IDS) have been devised as viable instruments for identifying and averting malicious actions in this
context. Several techniques described in academic papers are reported to be very accurate, but they cannot be used in the real
world because the datasets used to build and test the models do not accurately reflect and simulate the IoT network. Other
existing methods address these issues, but they are not good enough for commercial use because of their low precision, low
detection rate, poor receiver operating characteristic (ROC), and high false acceptance rate (FAR). The effectiveness of
these solutions is predominantly dependent on individual learners and is consequently influenced by the inherent limitations
of each learning algorithm. This study introduces a new approach for detecting intrusion attacks in an IoT network, which
involves the use of an ensemble learning technique based on the gray wolf optimizer (GWO). The novelty of this study is
threefold. Firstly, the proposed voting GWO ensemble model incorporates two crucial components, a traffic analyzer and a
classification phase engine, and employs a voting technique to combine the probability averages of the base learners.
Secondly, feature selection and feature extraction techniques are combined to reduce dimensionality. Thirdly, GWO is
employed to optimize the parameters of the ensemble models. In addition, the approach employs the most realistic intrusion
detection datasets that are accessible and combines multiple learners to generate ensemble learners. Information gain (IG)
and principal component analysis (PCA) were hybridized to reduce dimensionality.
The study utilized a novel GWO ensemble learning approach that incorporated a decision tree, random forest, K-nearest
neighbor, and multilayer perceptron for classification. To evaluate the efficacy of the proposed model, two authentic datasets,
namely, BoT-IoT and UNSW-NB15, were scrutinized. The GWO-optimized ensemble model demonstrates superior accuracy
when compared to other machine learning-based and deep learning models. Specifically, the model achieves an accuracy rate
of 99.98%, a DR of 99.97%, a precision rate of 99.94%, an ROC rate of 99.99%, and an FAR rate of 1.30 on the BoT-IoT
dataset. According to the experimental results, the proposed ensemble model optimized by GWO achieved an accuracy of
100%, a DR of 99.9%, a precision of 99.59%, an ROC of 99.40%, and an FAR of 1.5 when tested on the UNSW-NB15
dataset.

Keywords Internet of Things · Gray wolf optimizer · Ensemble model · Intrusion detection system · K-nearest neighbor ·
Multilayer perceptron · Random forest · Decision tree · Average of probability

✉ Sanjay Misra
[email protected]

Yakub Kayode Saheed
[email protected]

1 Department of Computer Science, University of Alcala, Madrid, Spain
2 Department of Applied Data Science, Institute for Energy Technology, Halden, Norway

1 Introduction

The Internet of Things (IoT) is experiencing rapid growth and assuming an increasingly significant role in our everyday existence. IoT nodes can establish a connection with the Internet using an Internet Protocol (IP) address [1]. The past decade has witnessed a significant surge in the level of interconnectivity among individuals, machines, and services,
ultimately leading to the emergence of a novel communication paradigm referred to as the IoT [2]. The proliferation of self-configured smart nodes is fueling the development of a wide range of innovative applications, including but not limited to home automation, process automation, smart automobiles, health-care systems, decision analytics, smart grids, industrial development, and autonomous cars [3]. It is predicted by analysts that in the future, the number of interconnected devices will surpass that of the human population on Earth. As per the International Data Corporation’s projections, by the year 2025, a total of 41.6 billion interconnected IoT devices are expected to generate a staggering amount of 79.4 zettabytes of data, in contrast with the anticipated global population of 8.1 billion individuals [4].

The IoT is vulnerable to a range of security threats and presents significant security challenges for end-users, particularly as it continues to expand into various aspects of communal life, as shown in Fig. 1. The IoT is a complex system of various networks that include security measures for sensor data, Internet and mobile network connectivity, privacy protection, network authentication, access control, and information management [5]. In recent years, the occurrence of anomalies and security breaches on IoT devices has become increasingly prevalent. The Internet of Things infrastructure framework is becoming increasingly complex, which is resulting in the introduction of undesired vulnerabilities into its systems. The IoT has the potential to facilitate the seamless integration of physical objects into networks, thereby providing advanced information services to individuals. A multitude of IoT services and applications that utilize ML have emerged across various domains such as security, surveillance, health care, transportation, control, and object monitoring. Preventative security measures are often limited by inadequate planning and implementation, and given the inevitability of attacks, machine learning systems can offer essential services and resilient security strategies for safeguarding IoT devices [6].

The attack detection system is classified as either a signature-based or an anomaly-based system. Signature-based systems compare certain patterns, such as bytes or harmful instruction sequences, in malware-infected network traffic to known attack types stored in a database [7]. Systems based on anomalies detect unknown threats or deviations from the typical flow. Unlike signature-based detection systems, machine learning-based solutions have the potential to detect unknown attacks. However, the ML models must be sufficiently precise to achieve high accuracy, a high detection rate (DR), and a high ROC while minimizing false alarms [6]. They must be trained and assessed on genuine datasets to demonstrate their efficacy in real-world deployments. The basic strategy is to utilize ML to create a model of legitimate action and then analyze new behavior against the ML model.

As a result, numerous approaches and strategies such as data encryption, firewalls, and user verification via the fog computing model have been created and implemented to defend the IoT platform. These attack channels and risks continue to evolve, rendering traditional security solutions inefficient and ineffective at addressing the IoT safety challenge, paving the way for a new wave of IDS based on ML. A substantial amount of work and study has been undertaken to determine the optimum intelligent IDS for various types of applications in IoT-based environments [8]. As IDS is one of the key remedies used to ensure IoT security, there is a propensity to employ multiple techniques concurrently [9]. Alharbi et al. [10] proposed an IoT security proof-of-concept system built into the fog computing layer. Each unit defends against a specific type of attack. The IDS of traffic analyzer components was employed to spot DDoS and DoS attacks with a classification engine based on the decision tree ML technique. To authenticate the IDS’s answer, the challenge-response component sends a challenge communication in the event of intrusion detection. If the system fails to respond to this message, the firewall unit disables the connection. Pajouh et al. [11] introduced a unique layered IDS for IoT backbone networks that uses a two-tier dimensionality reduction and classification phase. The dimension reduction engine is built of component analysis and LDA units, while the classification engine is composed of NB and a cascaded version of the CF-KNN units. The NB was utilized to classify attack records, which were further improved using the CF-KNN algorithm as a secondary filter layer. Using the NSLKDD [12] dataset, the suggested model demonstrated modest detection performance for difficult-to-catch attacks, specifically those belonging to the U2R and R2L classes. Zhang et al. [13] used the UNSW-NB [14] standard dataset to illustrate the efficacy of ML-based intrusion detection using a full depiction of modern IoT attack scenarios. They employed a new feature selection engine that applied DAE founded on a biased loss function, while using a simple MLP as the classifier. This unique feature selection technique resulted in an increased emphasis on attack-representative features. Koroniotis et al. [15] proposed an IoT network forensic framework consisting of C4.5, ARM, NB, and ANN ML approaches to recognize and spot novel and complex forms of present botnet attacks as another application of the UNSW-NB dataset.

Traditional ML techniques and approaches have been widely used due to their high accuracy for attack detection and low false alarms, but they have been criticized for their inability to detect innovative threats. Traditional ML techniques are incapable of identifying composite and new attacks. Most modern cyberattacks are minor mutations of earlier attacks, whose logic and concepts serve as the basis for the novel attacks. This means that typical ML models will fail to recognize

Fig. 1 The Internet of Things scenario

minute mutations because they are incapable of abstracting information to discern novel threats [16]. Hence, a more robust, intelligent method for IoT attack detection is needed. Therefore, this paper proposes an ensemble learning method. Ensemble learning for resilient IoT security is a strategy for solving a specific artificial intelligence-based challenge by combining different models or expertise. Ensemble learning enhances generalization, simplification, and voting among the various ensemble strategies in the intrusion detection problem, resulting in a higher detection performance than individual models [17]. The paper’s primary contributions are as follows:

• Propose a new voting ensemble learning approach for IoT intrusion detection (To the best of our knowledge, this is the first voting GWO-optimized ensemble model for intrusion detection in the IoT).
• Analyze the model using feature extraction (principal component analysis) and feature selection (information gain) for dimensionality reduction. We created a hybrid IG + PCA technique for feature selection, feature extraction, and GWO-optimized ensemble models for classification tasks.
• Based on network traffic characteristics, low-cost and mountable cyber intrusion detection for IoT are proposed.
• Suggest several realistic datasets for IDS in the IoT environment.
• Develop a voting ensemble model based on the average of probability to increase the detection accuracy and decrease the false alarm rate to detect cyberattacks in the IoT.
• Leverage the realistic BoT-IoT and UNSW-NB15 datasets that reflect modern-day attacks and are representative of real-world attack scenarios in IoT which also satisfy IoT protocol requirements as against outdated and non-representative datasets used in some previous studies.

The paper is divided into seven sections. The literature and existing works are presented in Sect. 2. The proposed methodology is detailed in Sect. 3. Section 4 presents the GWO-optimized ensemble models, and Sect. 5 presents the experimental setup. The findings and discussions are given in Sect. 6. Section 7 contains the conclusion and recommendations for future work.

2 Related work

The study [18] employed bloom filtering for signature matching and offered a dynamic coding mechanism for constructing a decentralized signature-based IDS in IP-USN. The study [19] created a virtual test platform to mimic an actual network environment, installing a Snort IDS for traffic control and attack discovery by reflecting traffic to the server and constructing a stream-based IDS intelligent system using ML developed a specification-based IDS capable of identifying a novel sort of danger, the topology attack. They suggested an IDS architecture built on top of a network monitor and explained its monitoring techniques using an RPL FSM. Roy et al. [20] presented the use of a Bi-LSTM RNN for intrusion detection to spot a binary categorization of normal and malicious attacks. The model was trained on the UNSW-NB15 dataset and had a detection accuracy above 95% in IoT attacks. The work [21] devised an approach for detecting resource-constrained deep packet anomalies that distinguish between regular and anomalous payloads. Xu et al. [22] presented a unique IDS that examined the realization of several basic hybrid RNN models and MLP to protect against IoT threats. Both the NSL-KDD and KDD Cup 99 datasets are utilized for training and assessing the described models. The study [23] developed a several-layered RNN

Table 1 Summary of existing IoT attack detection using machine learning and deep learning

| Authors | Methodology | Dataset | Results/strengths | Gaps |
|---|---|---|---|---|
| [39] | Single-layered ANN | N-BaIoT | Accuracy 99 | Building ten ANN models to identify an attack is a resource-intensive and time-consuming procedure |
| [40] | SMOTE and ANN | BoT-IoT | Accuracy 100 | Focus solely on detecting DDoS attacks. Additionally, a simple ANN with only one hidden layer was deployed |
| [41] | CNN | System call graph | Accuracy 97% and F-measure 98.33% | No experiment was conducted to determine the presence of other harmful lines on IoT devices |
| [20] | Bi-LSTM | UNSW-NB15 | Accuracy 95% | There was no optimization of parameters with long training time |
| [23] | RNN | NSLKDD | Probe 97.35%; DoS: 98.27%; R2L: 77.25%; and U2R: 64.93% | It examined a single dataset without elucidating the tuning of the hyperparameters |
| [42] | K-NN, Gaussian Naive Bayes, and random forest | Capture live network | KNN: accuracy 94.44, precision 92.0, recall 100, and F-measure 96; Gaussian Naive Bayes: accuracy 77.78, precision 75, recall 100, and F-measure 86; and random forest: accuracy 88.8, precision 86, recall 100, and F-measure 92 | The suggested approach is policy-based and relies on known attack signatures, signatures are upgraded |
| [24] | GRU, LSTM, BLS, and Bi-LSTM | NSLKDD | BLS performs better with an accuracy reaching 84.14% and F-measures 84.68% | They considered only a single basic network data |
| [43] | GRU + MLP, BGRU-MLP, BLSTM + MLP, GRU, MLP, DLSTM + MLP, and LSTM | KDD Cup 99, NSLKDD | Accuracy 99.24% | Several general attacks were discovered while examining a single dataset |
| [44] | SVM, RF, DT, and logistic regression | Capture live network | SVM 98.06; RF accuracy 99.17; DT 98.34; LR 97.50 | It is difficult to replicate the research. The implementation details of the ML model are absent |
| [25] | SVM, J48, NB, MLP, NB, RF, RF, RNN-IDS, and ANN | NSLKDD | RNN-IDS accuracy 95.2% | For performance comparisons, only machine learning models and outdated dataset were used for the experimental analysis |
| [27] | DJ, DF, DNN, LSTM, DBN, and GRU | NSLKDD, KDD Cup, and CICIDS | DBN gave an accuracy of 96.9% outperforming others | There are no realistic IoT datasets examined |
| Proposed GWO ensemble models | RF, DT, MLP, and KNN | BoT-IoT, UNSW-NB15 | Improved accuracy, F-measure, and ROC | We used multiple base classifiers, including RF, DT, MLP, and KNN, and designed a voting GWO ensemble model |

model for IoT gadgets that might be deployed. The identification rates of attacks were determined to be DoS at 98.27%, the probe at 97.35%, U2R at 64.93%, and R2L at 77.25%, respectively, using the NSL-KDD dataset. Li et al. [24] used the NSLKDD dataset to build GRU, LSTM, BLS, and Bi-LSTM algorithms for several known intrusion classification tasks. According to the performance study, the BLS significantly reduces training time while maintaining an accuracy of 72.64% and 84.15% for the KDDTest-21 and KDDTest+ data, respectively. The author [25] demonstrated an accuracy of 85.5% to 95.25% for RNN-IDS using a heuristic technique for intrusion detection. The IDS is initially trained using the gradient descent approach and then retrained and tested using the KDD20+ and KDDTest+ datasets. RNN-IDS outperforms various applied algorithms, including SVM, J48, NB, MLP NB tree, RF, ANN, and RF tree. In ref. [26], a DoS detecting design for 6LoWPAN was presented. This design incorporated an IDS into the ebbits framework created under the EUFP7 program. The paper [27] conducted an experimental investigation on intrusion detection utilizing DJ, DF, DNN, LSTM-RNN, DBN, GRU-RNN, and RNN of ML and deep learning models. Four datasets, namely, KDD Cup 99, NSLKDD, CICIDS2017, and CICIDS, were used to evaluate the algorithms’ effectiveness in detecting and classifying anomalies using 22 distinct evaluation measures. However, the experiment results indicate that when DL models are combined with machine learning models, notably DBN, the detection accuracy rate increases from 5 to 10%. The study [26] set out to spot DoS attack protocols against CoAP and 6LoWPAN communication and to offer an IDS architecture for detecting and blocking attacks in an internet-connected environment. Jiang et al. [28] experimented with a mixed sampling-based intrusion detection method using the UNSW-NB15 and NSL-KDD datasets separately. The OSS and SMOTE are combined to create balanced data for training models built with CNN, AlexNet, BiLSTM, LeNet-5, and RF algorithms. According to the statistical result, CNN-BiLSTM surpassed other classifiers with an accuracy of 83.58%. Hasan et al. [29] addressed many paradigmatic machine learning strategies for spotting intrusions into IoT nets that result in system failure. On the DS2OS data, five-fold cross-validation was performed using LR, SVM, DT, RF, and ANN. Cheng et al. [30] developed an HS-TCN for detecting anomalous communication in the Internet of Things. The experiment was controlled using two variants of the unique dataset DS2OS: data collected over eleven (11) days and the DS2OS-UA. For both adjusted datasets, the HS-TCN model outperforms the LSTM and SVM models. The author [31] suggested an intrusion detection approach founded on node usage analysis in 6LowPAN. Sahu et al. [32] developed another machine learning-based method for detecting anomalies by combining LR and ANN classification methods. Both the ANN and LR achieve approximately 99.4% accuracy when the entire dataset is used and 99.99% accuracy when approximately 105,952 data points are omitted from the unique data. In both situations, the data are divided into 75% and 25% subsets. In reference [33], an event-processing IDS architecture based on CEP technology was described. Kalis [34], an adaptive expert IDS that can supervise several protocols without modifying existing IoT software, is a thorough approach for detecting IoT intrusions. Reddy et al. [35] described a DNN architecture for securing the apps of future smart cities. The findings demonstrate that this DNN technique achieves an accuracy of approximately 98.26% when compared to standard machine learning classifiers with a variable layer and neurons. The authors [36] developed a novel method for
detecting network intrusions in IoT networks that is built on a conditional variational autoencoder with a specialized design that incorporates intrusion tags. To detect malicious activity, ref. [37] employed a single-class SVM equipped with characteristics such as memory utilization and CPU utilization. The study [38] examined the efficacy of many community detection methods for detecting P2P bots, particularly when only incomplete information is available. They demonstrated that the approach may be used with approximately half of the nodes, presenting their connection graphs with only a slight upsurge in detection mistakes. Table 1 summarizes the assessed studies on IoT security as per their datasets, models, best accuracy results, and gaps.

As seen from the review of the existing studies, the focus of some of the research is solely on detecting DDoS attacks. Other sizable attacks are not taken into account. Also, a simple ANN with only one hidden layer was deployed in one case with no optimization techniques applied. The majority of the work also lacks comparative analysis with other ML and DL models. In another study, it was difficult to replicate the research work. The implementation details of the machine learning model are absent, with obsolete datasets that do not reflect contemporary IoT attacks. Finally, one reviewed approach is policy-based and relies on known attack signatures; hence, it will not be up-to-date with the most recent attack trends until signatures are upgraded.

Unlike the past efforts, we investigate intrusion detection for IoT resource-constrained devices in the network in this research. The difference is that our technique is divided into three stages. The first is hybrid dimensionality reduction, which involves using PCA and IG to choose the relevant attributes. The proposed GWO ensemble intrusion detection model includes two important engines in the second phase: a traffic analyzer and a classification phase engine. In the third phase, voting was utilized to merge the base learners’ probability averages.

2.1 Motivation for the intelligent threat model on the Internet of Things

As IoT grows, so does the number of cybersecurity threats that investigators must address and examine to develop a reliable IDS. Numerous forms of malevolent action attempt to compromise the privacy and security of IoT gadgets, and all smart appliances connected to the Internet are potentially vulnerable. For a variety of reasons, the IoT is vulnerable to cyberattacks. For starters, IoT appliances are frequently unattended (for example, sensors located in remote places), making it relatively uncomplicated for an assailant to get admittance to them physically. Second, the vast majority of data transfers are wireless, making eavesdropping easier. Finally, most IoT devices have limited storage and computing capabilities [45]. Additional anti-virus protection, for example, cannot be deployed on IoT gadgets. Using numerous hacking tactics, hackers can disrupt or manipulate the functionality of smart gadgets [46]. In light of the physically insecure nature of a large number of IoT gadgets, some hacking approaches require active access to smart gadgets, making an attack more difficult but not impossible. Other attacks could be carried out remotely over the Internet. Table 2 shows the main kinds of attacks targeting smart devices.

The intrusion attacks can affect an IoT bot network comprised of unsecured IoT gadgets such as electrical gadgets, security systems, automobiles, thermostats, lights in home or commercial locations, speaker systems, and wall timers. These attacks give a cybercriminal the ability to take control of the sensors. Unlike traditional botnets, compromised IoT devices actively seek to propagate their malicious behavior to a growing range of gadgets. While a traditional bot network may consist of hundreds of bots, IoT bot malware is far larger in scope, involving a large number of connected gadgets [51]. For instance, on October 21, 2016, cybercriminals targeted a prominent DNS firm named Dyn. This attack was initiated by a massive flood of DNS lookup queries from millions of IP addresses [52]. Such a bot network must infect a significant number of devices linked to the Internet, including printers, camcorders, and other gadgets. This IoT bot network attack was initiated by malevolent software known as Mirai. As a result of the Mirai contagion, computers continually search the Internet for susceptible gadgets and log in using the default username and password, attacking them with malicious programs. Researchers in the security field described how they targeted the Chrysler Jeep Cherokee at Black Hat 2015. While hacking the Jeep’s IoT device and sensor network, one could remotely access the vehicle as it drove down the motorway [53]. The specific security challenges addressed in this research, which involves developing an IDS for the IoT using a hybrid approach of feature extraction via PCA, feature selection via IG, and parameter optimization using GWO for ensemble models, are related to the cybersecurity aspects of IoT environments. First, regarding vulnerabilities in IoT devices, it is important to note that these devices frequently have limited resources and may lack comprehensive security measures. The primary objective of the IDS suggested in this study is to identify and address vulnerabilities present in these devices, hence thwarting unauthorized access and control. Furthermore, it is imperative to periodically upgrade the firmware and software of IoT devices to ensure their security. The suggested approach has the potential to facilitate monitoring and ensure the timely implementation of changes. Authentication and access control play a vital role in safeguarding IoT systems, as they are responsible for ensuring that solely authorized individuals or devices are granted access. The proposed IDS has the potential to effectively detect and identify unauthorized access attempts.
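The third stage of the technique outlined before Sect. 2.1, merging the base learners' probability averages, can be illustrated with a minimal sketch. It assumes each base learner (RF, DT, KNN, MLP) returns a probability vector over the classes for a record; the function names `soft_vote` and `hard_vote` and all probability values below are hypothetical and are not taken from the paper's experiments. GWO tuning of the learners is not shown.

```python
def soft_vote(probabilities):
    """Average the class-probability vectors of all base learners.

    Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averaged = [
        sum(model_probs[c] for model_probs in probabilities) / n_models
        for c in range(n_classes)
    ]
    return averaged.index(max(averaged)), averaged


def hard_vote(probabilities):
    """Majority vote over each learner's most likely class, for comparison."""
    votes = [p.index(max(p)) for p in probabilities]
    return max(set(votes), key=votes.count)


# Hypothetical outputs for one traffic record; index 0 = normal, 1 = attack.
base_outputs = {
    "RF": [0.45, 0.55],
    "DT": [0.45, 0.55],
    "KNN": [0.90, 0.10],
    "MLP": [0.48, 0.52],
}
probs = list(base_outputs.values())
soft_label, averaged = soft_vote(probs)
hard_label = hard_vote(probs)
print(soft_label, [round(p, 3) for p in averaged], hard_label)
```

In this made-up case three learners lean slightly toward "attack", so a majority vote picks class 1, while the averaged probabilities (about 0.57 for "normal") pick class 0 because one learner is very confident. This is the practical difference between averaging probabilities and counting votes.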

Table 2 Common types of attacks against smart IoT devices

| References | IoT types of attack | Examples | Description |
|---|---|---|---|
| [47] | Attack on cloud infrastructure | Numerous cloud services contain a logical fault, which allows a cybercriminal to get delicate customer information as well as contact with the device without verification. These services also feature common management console susceptibilities | IoT devices connect to cloud services on the back end. Clients of IoT cloud services may be able to choose easy passwords |
| [48] | Attack on device | In the case of intelligent IoT devices such as surveillance cameras, a cybercriminal may gain direct knowledge of the equipment, allowing them to change the design settings | An attack is when someone exploits a defect or weakness in the IoT infrastructure to get access to it |
| [48] | Man-in-the-middle attack | Eavesdropping attacks such as man-in-the-middle are a sort of snooping attack. The attacker might use this approach to relay and possibly change interactions between two IoT devices invisibly | The attackers analyzed network traffic using a network packet analyzer, namely, Wireshark. IoT gadget interacts with additional IoT appliances. This link is neither encoded nor even authorized. This is the reason an attacker may easily target network access, allowing them to mount attacks such as ARP poisoning |
| [22] | Denial of service | An adversary can disable the sensors’ capacity to transmit and receive data. Additionally, battery misuse, device disabling, or device botching are examples | A cybercriminal can disable or alter electronic equipment and its associated gadgets via physical or virtual access to the IoT sensors |
| [45] | IoT botnet attack | Mirai is regarded as a watershed moment in the latest threats because it leverages security flaws in IoT systems to launch attacks [49] | The term "IoT botnet" refers to a collection of compromised computers, smart gadgets, and utilities linked to the Web; these gadgets are the targets of attacks. They are mostly interested in attacking internet clients and devices, such as IP cameras and edge routers |
| [50] | Reconnaissance | This can be accomplished through the use of network port scanners and packet sniffers | The objective is to collect data on an IoT base, comprising network facilities and connected gadgets |

Fig. 2 The framework of the proposed GWO ensemble models for IoT

3 Methodology

This section discusses the framework, philosophy, and design principles of our proposed method. In this research, a hybrid IG-PCA-based feature selection and extraction method employing optimized voting gray wolf optimizer-based ensemble learning models is proposed for intrusion detection in IoT. The general design of the suggested model is portrayed in Fig. 2 and is made up of three phases. The first phase is dimensionality reduction, utilizing PCA and IG to retain the relevant attributes. In the second phase, two key engines comprise the proposed ensemble intrusion detection model: a traffic analyzer and a classification-phase engine (RF, DT, MLP, KNN, and voting ensemble). The evolutionary-based GWO optimization was used to optimize the parameters of the ensemble models. Preprocessing of traffic connection records in the traffic processing unit yields traffic data in a format appropriate for processing by the ensemble models of the classification phase, and the GWO ensemble intrusion detection classes these connections as normal or attack. In the third phase, voting was utilized to combine the average of the probabilities of the base classifiers. The new voting methodology employs GWO ensemble models to improve the prediction capacity of the legitimate/intrusion classification. A probability average offers a rapid response and effective immediate safety management for the IoT system. Voting is a critical phase of the proposed classification-based traffic analysis; it analyzes network traffic that seeks to reach the IoT system and generates a security alert if an intrusion is identified.

In the framework illustrated in Fig. 2, the training data are processed with the IG approach, where the IG entropy is estimated. Following this, we calculate the eigenvalues of the PCA covariance matrix. During the testing phase, voting is conducted by averaging the probabilities obtained from the GWO-optimized ensembles, namely RF, DT, MLP, and KNN. The voting mechanism is further enhanced by the alpha, beta, and delta vectors, which are responsible for updating the voting process. In an IoT setting, data collection encompasses not only the reception of data from IoT devices but also the transmission of commands, updates, or responses back to these devices. The bidirectional flow of information is of
utmost importance in facilitating real-time interactions and control inside IoT devices.

3.1 Data preprocessing

Normalization is a technique for scaling attributes in which the goal is to have all attribute values on the same scale. Normalization techniques include the standardized approach, min–max normalization, and z-score normalization [54, 55]. We selected the min–max normalization technique since the majority of the features had a normal distribution and to prevent information from leaking from the test data.

3.2 Normalization technique

The min–max approach [56] rescales a feature so that all of its values lie inside the interval [0, 1]. Equation (1) gives the fundamental formula for min–max normalization.

$$ y_{\text{new}} = \frac{y - \min(y)}{\max(y) - \min(y)} \tag{1} $$

where $y$ represents the value of a certain feature, $\min(y)$ represents its minimum value, and $\max(y)$ represents its highest value.

3.3 Feature selection

The IoT ecosystem comprises intelligent devices with limited computing power, energy, communication range, and memory. Among the issues with IDSs is the handling of numerous irrelevant features, which might result in system overhead. Thus, the objective of feature evaluation is to discover key attributes that may be employed in the IDS to detect a variety of attacks efficiently. The characteristics are examined for both normal and abnormal behaviors using the retrieved labels to select the most important features. We used an information gain (IG) strategy for feature selection and principal component analysis (PCA) for feature extraction.

3.4 Feature selection with IG

IG is a frequently used entropy-based feature evaluation approach in ML [57]. The information gain technique was rapid to execute, and this strategy extracted the model's optimal feature set. IG has frequently been used in the literature to determine how successfully each attribute distinguishes the given data. The first phase in this research is to use IG with ranking as a filtering strategy to lower the datasets' dimensionality. The primary idea behind this method is to evaluate features by estimating their IG entropy and sorting them in decreasing order. Each feature receives a score, from most relevant to least relevant. The attributes with the best scores are used as the input set of attributes for the next dimensionality reduction stage. The author of [58] describes the overall entropy $K$ of a given dataset $D$ as follows:

$$ K(D) = -\sum_{i=1}^{e} p_i \log_2 p_i \tag{2} $$

where $e$ signifies the total number of classes, and $p_i$ denotes the proportion of cases belonging to class $i$. The reduction in entropy is estimated for each feature using the following formula:

$$ IG(D, A) = K(D) - \sum_{w \in A} \frac{|D_{A,w}|}{|D|}\, K(D_{A,w}) \tag{3} $$

where $w$ ranges over the values of feature $A$, and $D_{A,w}$ is the subset of $D$ in which $A$ takes the value $w$.

3.5 Feature extraction with PCA

The attributes selected by the IG method can be utilized directly for classification. However, one of the most typical IG issues is a bias toward attributes with many possible values [59]. In this scenario, these features have a near-zero entropy, which inflates their gain relative to other attributes. As a result, the true importance of these attributes to the training examples may not be reflected in their ranking. To overcome this constraint, features from the attribute selection phase are passed on for additional reduction using the PCA method to identify the best subgroup of features. This allows the PCA to narrow the search area from the whole subspace to the features that have been pre-selected [60]. The purpose of using PCA is to minimize dimensionality while retaining important attribute information in the data. It decreases the number of variables by employing orthogonal combinations with significant variance. Table 3 shows the proposed hybrid dimensionality reduction for our suggested models.

Two stages are employed to reduce the dimensionality of features from $m$ dimensions to $j$ dimensions: preprocessing and dimensionality reduction. During the preprocessing phase, the mean and variance of the data are standardized using Eqs. (4) and (5) (steps 1–4 below). During the second phase (steps 5–8), the covariance matrix $\mathrm{Cov}_n$, its eigenvectors, and its eigenvalues are constructed, and the data are projected, using Eqs. (6) and (7).

1. Standardize the initial input feature values by their mean and standard deviation, starting with Eq. (4), where $n$ is the number of cases and $Y^{(i)}$ are the data points.

$$ \mu = \frac{1}{n}\sum_{i=1}^{n} Y^{(i)} \tag{4} $$

2. Substitute $Y^{(i)}$ with $Y^{(i)} - \mu$.
3. Using Eq. (5), rescale each $Y_k^{(i)}$ to have unit variance.

$$ \sigma_k^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_k^{(i)}\right)^2 \tag{5} $$

4. Substitute each $Y_k^{(i)}$ with $Y_k^{(i)}/\sigma_k$.
5. Compute the covariance matrix $\mathrm{Cov}_n$:

$$ \mathrm{Cov}_n = \frac{1}{n}\sum_{i=1}^{n} Y^{(i)}\left(Y^{(i)}\right)^{T} \tag{6} $$

6. Calculate the eigenvectors and eigenvalues of $\mathrm{Cov}_n$.
7. Sort the eigenvectors by decreasing eigenvalue and select the $j$ eigenvectors with the greatest eigenvalues to produce $S$.
8. Using $S$ and Eq. (7), convert the data to the new subspace.

$$ y = S \times x \tag{7} $$

where $x$ is the vector representing one sample in the original $m$-dimensional feature space, and $y$ is the converted $j \times 1$ sample in the new subspace.

The computational complexity of performing the specified PCA depends on the number of attributes $F$ representing each data point:

$$ O\!\left(F^{3}\right) \tag{8} $$

[Table 3: Hybrid feature dimensionality reduction]

In this study, PCA is utilized to reduce the dimensionality of the BoT-IoT and UNSW-NB15 datasets by compressing the attribute space to the ten (10) and nine (9) top-ranked features, respectively. To identify the most effective features, we employed information gain in our feature selection process, which quantifies the importance of each feature based on its ability to discriminate between different classes (e.g., normal and intrusion). Features with higher information gain were considered more effective in distinguishing between classes. The design principle of PCA is given in Table 4.

Table 4 Design principles of PCA

| Parameter | Value |
|---|---|
| Parameter ranking | True |
| Num to select | 6 |
| Threshold | 0.5 |
| Variance | 1.832 |

Parameter ranking typically refers to the process of assessing and ranking the importance or influence of different parameters or hyperparameters on a machine learning model's performance. These parameters are settings or configurations that can be adjusted to influence how a model learns from data and makes predictions. In our research, parameter ranking is set to true, the num-to-select parameter in PCA is set to 6, the threshold is set to 0.5, and the variance is set to 1.832. The design principle revolves around finding a new set of orthogonal axes, called principal components, that capture the maximum variance in the data while reducing its dimensionality.

Ten (10) new features were selected from the BoT-IoT dataset, and nine (9) features were chosen from UNSW-NB15; these are subsequently passed to the GWO-optimized ensemble models (RF, DT, MLP, and KNN). The information gain efficiently identifies the most
relevant features based on their contribution to the target variable, while PCA optimally captures the variance within the dataset to create a reduced set of orthogonal features. By combining these two methods, we achieve a balanced feature reduction approach that maximizes the preservation of informative features while minimizing computational overhead.

PCA aims to transform the original high-dimensional feature space into a lower-dimensional space while retaining as much of the variance in the IoT network traffic data as possible. This dimensionality reduction can lead to several benefits:

i. Curse of Dimensionality: High-dimensional IoT network traffic data can suffer from the "curse of dimensionality," where the number of features greatly exceeds the number of samples. This can lead to increased computational complexity, overfitting, and difficulty in visualization. PCA helps mitigate these issues by reducing the dimensionality.
ii. Noise Reduction: High-dimensional IoT network data often contain noise and irrelevant features. PCA helps remove or down-weight such noisy dimensions by identifying and emphasizing the dimensions with the most significant information.
iii. Improved Model Performance: Reducing dimensionality leads to faster training and inference times for machine learning models and can also reduce overfitting.

3.6 Handling the class imbalance problem

Class imbalance is a prevalent issue in machine learning, particularly in the context of intrusion detection systems. The challenge arises from the substantial disparity between the abundance of normal instances and the scarcity of attack instances. In this research, we employed the synthetic minority oversampling technique (SMOTE) to tackle this concern. SMOTE produces artificial cases for the underrepresented class by interpolating between the available data points. We ensure that the data are preprocessed properly, including removing irrelevant features, handling missing values, and encoding categorical variables. Subsequently, we divide the datasets into features (x) and corresponding labels (y) for both the training and testing datasets. We then create an instance of SMOTE and apply it to the training data. The mathematical representation is given in Eq. (9).

$$ x_{\text{synthetic}} = x_{\text{minority}} + \text{random\_number} \times \left(n - x_{\text{minority}}\right) \tag{9} $$

Assume a dataset with features x and labels y. For each minority instance x_minority, we find its k-nearest neighbors from the minority class. The distance metric used for finding neighbors (such as Euclidean distance) can vary. Denote the set of k-nearest neighbors as N(x_minority). For each neighbor n in N(x_minority), a synthetic instance x_synthetic is generated as in Eq. (9). Here, random_number is a random value between 0 and 1 that controls the interpolation between x_minority and n. Equation (9) is applied to each feature of x_minority and n to generate the corresponding feature of x_synthetic.

3.7 Optimization of the ensemble learning models (ELM) with gray wolf optimizer

GWO is a metaheuristic algorithm that mimics the leadership hierarchy and hunting method of gray wolves [61]. In the mathematical model of GWO, the best solution is denoted alpha (α). Beta (β) and delta (δ) denote the second- and third-best solutions, respectively. The remaining candidate solutions are taken to be omega (ω). The hunt is guided by α, β, and δ, and the ω wolves follow these three. To hunt, the pack first encircles the prey. Equations (10)–(13) are applied to model the encircling behavior mathematically.

$$ \vec{Z}(r+1) = \vec{Z}_p(r) + \vec{B} \cdot \vec{E} \tag{10} $$

where $\vec{Z}_p$ is the position of the prey, $\vec{Z}$ is the gray wolf position, $\vec{B}$ and $\vec{D}$ are coefficient vectors, and $r$ is the iteration number. $\vec{E}$ is given by Eq. (11):

$$ \vec{E} = \left|\vec{D} \cdot \vec{Z}_p(r) - \vec{Z}(r)\right| \tag{11} $$

$$ \vec{B} = 2b \cdot \vec{t}_1 - b \tag{12} $$

$$ \vec{D} = 2\,\vec{t}_2 \tag{13} $$

Here $b$ is lowered linearly from 2 to 0 over the course of the iterations, while $\vec{t}_1$ and $\vec{t}_2$ are random vectors in the interval [0, 1]. Typically, the alpha leads the pursuit. Moreover, the beta
and the delta may occasionally take part in the hunt. To emulate the hunting behavior of gray wolves mathematically, the alpha (the best candidate solution), beta (the second-best candidate solution), and delta (the third-best candidate solution) are assumed to have better knowledge of the likely prey position. The three best solutions obtained so far are therefore saved, and the other search agents are required to update their positions according to those of the best search agents. The position update of the wolves is given by Eq. (14):

$$ \vec{Z}(r+1) = \frac{\vec{Z}_1 + \vec{Z}_2 + \vec{Z}_3}{3} \tag{14} $$

$$ \vec{Z}_1 = \vec{Z}_\alpha - \vec{B}_1 \cdot \vec{E}_\alpha \tag{15} $$

$$ \vec{Z}_2 = \vec{Z}_\beta - \vec{B}_2 \cdot \vec{E}_\beta \tag{16} $$

$$ \vec{Z}_3 = \vec{Z}_\delta - \vec{B}_3 \cdot \vec{E}_\delta \tag{17} $$

where $\vec{Z}_\alpha$, $\vec{Z}_\beta$, and $\vec{Z}_\delta$ are the three best solutions at iteration $r$, $\vec{B}_1$, $\vec{B}_2$, and $\vec{B}_3$ are defined as in Eq. (12), and $\vec{E}_\alpha$, $\vec{E}_\beta$, and $\vec{E}_\delta$ are given by Eqs. (18)–(20), respectively.

$$ \vec{E}_\alpha = \left|\vec{D}_1 \cdot \vec{Z}_\alpha - \vec{Z}\right| \tag{18} $$

$$ \vec{E}_\beta = \left|\vec{D}_2 \cdot \vec{Z}_\beta - \vec{Z}\right| \tag{19} $$

$$ \vec{E}_\delta = \left|\vec{D}_3 \cdot \vec{Z}_\delta - \vec{Z}\right| \tag{20} $$

$\vec{D}_1$, $\vec{D}_2$, and $\vec{D}_3$ are given as in Eq. (13).

A final remark on GWO concerns the updating of the parameter $b$, which regulates the exploration–exploitation trade-off. The parameter is updated at each iteration so that it decreases linearly from 2 to 0 following Eq. (21).

$$ b = 2 - r\,\frac{2}{\text{MaxIter}} \tag{21} $$

where MaxIter is the total number of allowable optimization iterations, and $r$ is the current iteration. The hunting and pursuit positions of gray wolves are required to be updated as binary values {1, 0}. The gray wolf optimization pseudocode is described in Table 5.

Table 5 Pseudocode of gray wolf optimization

1 Initialize the population size s, the maximum number of iterations MaxItr, the coefficient parameter, and the D and B vectors
2 Create an initial population Z_j(r) at random
3 Evaluate each search agent's fitness using f(z_j)
4 Determine Z_α, Z_β, and Z_δ as the 1st, 2nd, and 3rd best solutions
5 Repeat
6 For (j = 1; j ≤ s) do
7 Update each population agent by applying Eq. (14)
8 End for
9 Update Z_α, Z_β, and Z_δ accordingly
10 Set r = r + 1
11 Until the termination criterion is met (r ≥ MaxItr)
12 Return the best solution Z_α

We chose GWO to optimize the parameters of the ensemble algorithms because of three significant merits it has over other algorithms: exploration and exploitation, convergence speed, and handling of constraints. GWO has gained significant prominence among swarm intelligence methodologies due to characteristics such as few parameters to tune, simplicity and ease of use, scalability, and, most notably, its convergence speed, which it achieves by maintaining the right balance between exploitation and exploration during the search. GWO exhibits a better balance between exploration (searching the solution space) and exploitation (exploiting promising solutions). It uses the concept of alpha, beta, delta, and omega wolves to strike this balance, which can lead to more efficient optimization compared to other algorithms. In some cases, GWO tends to converge faster to a global optimum than several other algorithms. The nature-inspired hunting behavior of gray wolves mimicked in GWO, such as encircling prey, can lead to more efficient exploration and faster convergence. GWO promotes diverse solution exploration due to its hierarchical structure and the hunting behavior of gray wolves. This can help avoid getting stuck in local optima and facilitate a more comprehensive search of the solution space.

In our research, GWO is utilized to optimize the parameters of RF, DT, MLP, and n for KNN. The gray wolf optimizer (GWO) is a nature-inspired optimization algorithm that simulates the hunting behavior of gray wolves to find optimal solutions. We utilized the pseudocode of GWO to optimize the hyperparameters of the ensemble learning models: random forest, decision tree, multilayer perceptron (MLP), and K-nearest neighbor (KNN) [62]. A high-level overview of how we integrated GWO with the ensemble models follows:

1. Initialize a population of gray wolves with random hyperparameter settings for the ensemble models.
2. Define a fitness function that evaluates the performance of the ensemble model with the given hyperparameters.
The fitness function used appropriate evaluation metrics.
3. In each iteration of the GWO loop, evaluate the fitness of each wolf (hyperparameter set) using the ensemble model. Update the positions of the alpha, beta, and delta wolves based on their fitness values. These wolves represent the best solutions found so far.
4. Update the positions of the other wolves using predefined formulas that simulate the hunting behavior of gray wolves. This step helps explore the search space efficiently.
5. Apply boundary constraints to ensure that hyperparameters remain within valid ranges for the ensemble models.
6. After a certain number of iterations or when a stopping criterion is met, select the best solution found so far based on fitness values.
7. Perform cross-validation to assess the performance of the ensemble model with the selected hyperparameters on a validation set.
8. If the new solution (hyperparameters) is better than the previous best solution, update the best solution.
9. Continue the optimization process until the stopping criterion is met.
10. Finally, return the best solution, which represents the optimal hyperparameters for the ensemble learning models.

By integrating GWO with the ensemble models in this way, we effectively search for the best hyperparameters to maximize the ensemble's performance, improving its accuracy and effectiveness in real-world applications.

3.8 Mathematical formulation of the ensemble method for classification

Let $\{y(u)\}$, $u = 1, \ldots, m$, be a random data set of samples and their features with a mean of zero. Equation (22) shows the covariance matrix of $y(u)$. Algorithm 1 summarizes the selection procedure of the hybrid IG-PCA approach.

$$ Z = \frac{1}{m-1}\sum_{u=1}^{m} y(u)\, y(u)^{T} \tag{22} $$

In PCA, the transformation from $y(u)$ to $x(u)$ is calculated as follows:

$$ x(u) = N^{T} y(u) \tag{23} $$

where $N$ denotes an $m \times m$ orthogonal matrix whose $j$th column is equal to the $j$th eigenvector of the sample covariance matrix $Z$. PCA first solves the eigenvalue problem stated in Eq. (24).

$$ \beta_j k_j = Z k_j \tag{24} $$

where $\beta_j$ signifies an eigenvalue of $Z$ (say $\beta_1 > \beta_2 > \ldots > \beta_m$), and $k_j$ is the corresponding eigenvector. The principal components are obtained using Eq. (25) as follows:

$$ x_j(u) = k_j^{T} y(u), \quad j = 1, 2, \ldots, m \tag{25} $$

The $j$th principal component is denoted by $x_j(u)$. The computation to project a fresh sample $y(u)$ onto the principal subspace is given in Eq. (26). Let

$$ \hat{y}(u) = \sum_{j=1}^{q} \left(e_j^{T} y(u)\right) e_j \tag{26} $$

where $A = \{e_j : e_j = k_j,\ j = 1, \ldots, q\}$. Equation (27) calculates the distance $f$ between $y(u)$ and $\hat{y}(u)$ to determine the projection error $\varepsilon$:

$$ \varepsilon = f\!\left(y(u), \hat{y}(u)\right) \tag{27} $$

3.9 Ensemble model

Ensemble methods are effective ways of improving the prediction outcome of the overall model by developing numerous independent models and integrating them to provide results with improved accuracy [63]. Ensemble learning approaches include boosting, bagging, Bayesian parameter averaging, and stacking [64]. This work proposes a unique ensemble classifier that employs RF, DT, MLP, and KNN learners to improve intrusion detection accuracy in IoT. These algorithms were utilized in a voting algorithm and were combined using the average-of-probabilities method. To improve the performance of each of the models, GWO was used to optimize the parameters of each of the ensemble (RF, DT, MLP, and KNN) models.

Assume we have $\varphi$ classifiers $A = \{A_1, A_2, \ldots, A_\varphi\}$ and $l$ labels $\{h_1, h_2, \ldots, h_l\}$. For the classifiers given above, $\varphi = 4$ and $l = 2$ (that is, non-attack and attack) for the datasets analyzed in this work. $A_j : Z^m \to [0, 1]^l$ is a classifier that takes an object $y \in Z^m$ and returns a vector $[J_{A_j}(h_1 \mid y), \ldots, J_{A_j}(h_l \mid y)]$, where $J_{A_j}(h_i \mid y)$ represents the probability given by $A_j$ to the hypothesis that entity $y$ belongs to class $h_i$. Let $n_i$ be the average of the probabilities provided by the different classifiers for every class $h_i$,

$$ n_i = \frac{1}{\varphi}\sum_{j=1}^{\varphi} J_{A_j}(h_i \mid y) \tag{28} $$
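Equation (28), together with the maximum rule of Eq. (29), amounts to soft voting over the base learners. The following is a minimal sketch in plain Python, assuming each base learner (RF, DT, MLP, KNN) has already produced a probability vector for one connection record. The function names and the probability values are illustrative assumptions and are not taken from the paper's implementation.

```python
from typing import List, Sequence


def average_probabilities(per_classifier: Sequence[Sequence[float]]) -> List[float]:
    """Eq. (28): average, per class, the probabilities given by each classifier."""
    phi = len(per_classifier)            # number of classifiers
    n_labels = len(per_classifier[0])    # number of classes l
    return [sum(p[i] for p in per_classifier) / phi for i in range(n_labels)]


def vote(per_classifier: Sequence[Sequence[float]]) -> int:
    """Eq. (29): return the index of the class with the highest mean probability."""
    means = average_probabilities(per_classifier)
    return max(range(len(means)), key=means.__getitem__)


# Hypothetical outputs [P(non-attack), P(attack)] of four base learners for one record.
probs = [[0.30, 0.70], [0.60, 0.40], [0.20, 0.80], [0.45, 0.55]]
means = average_probabilities(probs)
label = vote(probs)  # index 1 stands for "attack" in this toy encoding
```

In this toy case one learner favors the non-attack class, but the averaged probabilities still favor the attack class, which is the behavior the average-of-probabilities rule is meant to provide.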

Let $N$ denote the collection of mean probabilities for each class, $(n_1, n_2, \ldots, n_l)$. Object $y$ is assigned to the class in $N$ with the highest mean, i.e., $y$ is allocated to class $g$ if and only if

$$ n_g = \max N \tag{29} $$

The proposed ensemble approach's performance is evaluated using two well-known intrusion detection benchmark datasets that are well suited for IoT, namely BoT-IoT and UNSW-NB15.

3.10 Ensemble learning strategy

Ensemble learning is a powerful technique that combines multiple individual learning algorithms to create a stronger, more accurate predictive model. Voting-based ensembles are a popular approach within ensemble learning. In this research, we averaged the probabilities from multiple models for intrusion detection in the IoT using the BoT-IoT and UNSW-NB15 datasets. A step-by-step explanation of how we achieved this follows:

Step 1: Data Preparation. We preprocess and split the datasets (BoT-IoT and UNSW-NB15) into training and testing subsets with the target labels (intrusion or non-intrusion) and the corresponding features for each dataset.

Step 2: Individual Learning Algorithms. We choose the set of individual learning algorithms to ensemble: RF, DT, MLP, and KNN.

Step 3: Train Individual Models. We trained each selected learning algorithm (RF, DT, MLP, and KNN) on the training data from both datasets (BoT-IoT and UNSW-NB15). This gave us a set of trained models, each capable of making intrusion detection predictions.

Step 4: Probability Prediction. We use each trained model to make predictions on our testing data. Instead of just obtaining the final prediction label, we are interested in the predicted probabilities of intrusion (class attack) for each instance.

Step 5: Ensemble Voting. For each instance in our testing data, we calculated the average of the predicted probabilities from all the individual models. This average can be computed for class 1 (intrusion).

Step 6: Evaluation. We evaluated the performance of our voting ensemble models using standard metrics such as accuracy, DR, precision, ROC, and FAR on our testing data. We also compare these results with the performance of the individual models to assess the effectiveness of the ensemble.

3.11 Benefits of the proposed voting-based ensembles model

• Reduced Bias: Combining multiple models can help reduce bias present in any individual model.
• Improved Generalization: Ensembles often perform better on unseen data compared to individual models.
• Robustness: Ensemble methods are more robust against overfitting, especially if the individual models are diverse.
• Model Diversity: Using different learning algorithms ensures that the ensemble captures different aspects of the data.

4 Experimental setup with the software and hardware requirements

The simulations are executed on a laptop with an Intel Core (TM) i5-8250U processor clocked at 1.60 GHz and 8 GB of RAM. To demonstrate the efficacy of the proposed approach, four GWO ensemble models (RF, DT, MLP, and KNN) combined by an average of probabilities are chosen. The algorithms are used to classify and identify threats and anomalies across the BoT-IoT and UNSW-NB15 datasets. Scikit-learn was utilized in the implementation of the models.

4.1 Metrics used for performance evaluation

This study evaluated the performance of the proposed system using multiple performance measures, including precision, recall, detection rate (DR), and accuracy (Acc), as well as the time required to create the model. These metrics' definitions are provided below. True positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) determine the metrics.

Detection rate (DR): The DR is the proportion of identified attacks relative to the total number of attack events in the dataset. Equation (30) can be utilized to estimate DR.

$$ DR = \frac{TP}{TP + FN} \tag{30} $$

Accuracy is the measure of the classifier's ability to correctly classify an object as normal or as an attack. The accuracy is defined by Eq. (31).

$$ \text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{31} $$

Precision is the ratio of correctly predicted positives to the total number of predicted positives. It is considered a measure of the classifier's exactness. A low value represents a high number of FP. The precision is computed using Eq. (32).
$$ \text{Precision} = \frac{TP}{TP + FP} \tag{32} $$

Recall is calculated by dividing the number of TP by the sum of TP and FN. Recall is regarded as a measure of a classifier's completeness, with a low recall value indicating a large number of FN [65]. Recall is estimated using Eq. (33).

$$ \text{Recall} = \frac{TP}{TP + FN} \tag{33} $$

4.2 Description of the dataset

One of the primary challenges encountered in the domain of anomaly detection research revolves around obtaining or generating a suitable dataset for experimental endeavors. In this study, we analyzed pre-existing datasets to identify the dataset that is most appropriate for further exploration. The authors delineated the dataset prerequisites for identifying anomalies in the IoT by the following six criteria:

C1 The dataset ought to be acquired from the IoT;
C2 It is recommended that the dataset includes anomalies;
C3 The dataset must be appropriately labeled to distinguish between normal and abnormal data;
C4 It is recommended that the dataset utilized in the study closely approximates real-world data, specifically data derived from authentic or partially authentic systems;
C5 It is recommended that the datasets encompass a diverse range of attack scenarios and network conditions. A key criterion was the inclusion of a wide variety of attack types and patterns to ensure a comprehensive evaluation of our intrusion detection system;
C6 The accessibility and availability of the datasets to the research community were taken into account. It was important to select datasets that are publicly accessible, well-documented, and readily available for replication and validation by other researchers.

The datasets that meet the specified criteria, namely those that comprise labeled sensor, actuator, and network data, include the recently developed BoT-IoT and the UNSW-NB15 dataset. These datasets were subjected to a comprehensive analysis by the authors. The particulars of each dataset are delineated as follows.

4.2.1 BoT-IoT dataset

The BoT-IoT contains both typical IoT network traffic and a range of attacks. These data were utilized to test our system. It was chosen because it accurately depicts an IoT ecosystem context. DoS, DDoS, data exfiltration, keylogging, service scan, and OS attacks are included in the dataset. The BoT-IoT is available at https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php. All of these data were preprocessed to establish network-level patterns for the varied types of traffic generated by devices and to use these similarities to spot attack behavior in the IoT architecture [51]. Table 6 summarizes the number of benign and attack samples in the collection.

Table 6 Attack and normal behavior statistics from the BoT-IoT dataset

| Attack and normal behavior | Values |
|---|---|
| DDoS | 2766 |
| Reconnaissance | 298 |
| Keylogging | 73 |
| Normal | 8945 |

4.2.2 UNSW-NB15 dataset

The researchers [14] created the UNSW-NB15 dataset at UNSW Canberra. They used the IXIA PerfectStorm tool to create a mix of benign and malicious traffic, yielding a 100 GB dataset in the form of PCAP files, including many novel attributes. The generated data were intended to be utilized for building and validating intrusion detection systems. Nevertheless, the data were created using a simulated environment to generate attack activity. The UNSW-NB15 dataset record distribution is specified in Table 7.

Table 7 UNSW-NB15 data records

| Traffic type | Number of records |
|---|---|
| Fuzzers | 24,246 |
| Backdoors | 2329 |
| Analysis | 2677 |
| Exploits | 44,525 |
| DoS | 16,353 |
| Generic | 215,481 |
| Reconnaissance | 13,987 |
| Worms | 174 |
| Shellcode | 1511 |
| Normal | 2,218,761 |

5 Results and discussion

We present the detailed findings of experiments conducted utilizing the proposed framework in this section. The suggested approach was tested on the datasets mentioned above.
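The metrics defined in Eqs. (30)–(33) follow directly from the four confusion-matrix counts. The sketch below shows that arithmetic in plain Python. The function name and the counts are hypothetical and chosen only for illustration; they are not results from the paper, and the sketch assumes every denominator is non-zero.

```python
def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute DR, accuracy, precision, and recall from confusion-matrix counts."""
    return {
        "DR": tp / (tp + fn),                          # Eq. (30)
        "accuracy": (tp + tn) / (tp + fn + fp + tn),   # Eq. (31)
        "precision": tp / (tp + fp),                   # Eq. (32)
        "recall": tp / (tp + fn),                      # Eq. (33)
    }


# Hypothetical counts: 100 attacks (90 detected), 900 normal records (20 false alarms).
m = detection_metrics(tp=90, tn=880, fp=20, fn=10)
```

As written in Eqs. (30) and (33), DR and recall share the same formula, so the two values always coincide.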

Table 8 Confusion matrix the performance of standard ML models IG + PCA-RF, IG

+ PCA-DT, IG + PCA-MLP, IG + PCA-KNN, and the pro-
Attack/intrusion Non-attack/legitimate
posed voting GWO ensemble model. The results indicate that
Attack/intrusion TP FN the voting GWO ensemble model performs the best, with an
Non- FP TN
accuracy of 99.98% and DR of 99.97%, precision of 99.94%,
attack/legitimate ROC of 99.99%, and FAR of 1.30.

The oversampling without replacement method was used to 5.2 Experimental analysis based on UNSW-NB15
divide each dataset’s selected samples into two distinct sub- dataset
groups for training and testing. As a result, the training subset
can accurately predict model performance on previously Additional tests on the UNSW-NB15 dataset were carried out
unrecognized data, and the testing sample is reserved for to demonstrate the efficiency of the suggested feature dimen-
assessing the model’s performance. In this instance, generat- sionality reduction (IG + PCA) GWO ensemble model. As
ing subgroups for cross-validation evaluation is not essential, in the first experiment, IG and PCAs were computed dur-
which could be time-consuming with large datasets. Two ing the preprocessing step of these datasets. In this second
tests were conducted to evaluate the efficiency of the presented technique. The following evaluation metrics were used according to the confusion matrix shown in Table 8: precision, accuracy, detection rate, ROC, and FAR. The authors of [66] explain the mathematical computations for these metrics.
Here, TP is the number of actual attacks recognized as attacks, TN is the number of normal patterns identified as normal, FN is the number of attacks identified as normal patterns, and FP is the number of normal patterns identified as attacks.

5.1 Experimental analysis based on BoT-IoT dataset

The BoT-IoT dataset was used in the first experiment. To begin, the most important attributes were determined by computing the IG entropy for every feature and ranking the features in descending order. From the original thirty-one (31) potential features, ten (10) were chosen for the following step. Deploying IG alone was seen to produce several false alarms. To overcome this constraint, a second reduction phase was applied to the selected attributes using PCA as feature extraction. To avoid bias, the PCA was fitted using only the training set, ensuring that no information from the test data leaked into the training dataset. If the complete dataset is used to construct the PCAs, the model will not perform as well when genuinely new, unseen data are introduced. Similarly, calculating PCAs on the two sets independently results in two mismatched sets of data: we cannot build a classifier in one domain and then apply it to another. The same transformation learned from the training set was used to map the testing dataset into the same feature space using the batch-filtering method. The new datasets were used to assess the efficiency of the presented method: five separate classifiers were built using the training data and evaluated on the testing dataset. On the BoT-IoT dataset, Table 9 compares

experiment, nine (9) candidate features were chosen from UNSW-NB15 by computing the entropy of the IG and, subsequently, the PCA feature extraction. Table 10 shows the best results obtained using the dimensionality reduction approaches on the dataset. Our proposed model produces promising classification results. Table 10 compares the performance of the IG + PCA-RF, IG + PCA-DT, IG + PCA-MLP, IG + PCA-KNN, and the proposed GWO ensemble model on the UNSW-NB15 dataset. The voting GWO ensemble technique outperforms all other approaches, attaining an accuracy of 100%, DR of 99.99%, precision of 99.59%, ROC of 99.40%, and FAR of 1.15.

5.3 Multiclass experimental analysis on the BoT-IoT dataset

The initial step was the computation of the IG entropy for each feature, with the resulting values arranged in descending order to identify the most significant features. Out of the initial set of thirty-one (31) possible features, a subset of ten (10) features was selected for the subsequent stage. Using IG in isolation was observed to generate several false alarms. To address this limitation, a secondary reduction phase was implemented, applying PCA as a feature extraction technique to the selected features. To mitigate bias, the PCA was fitted exclusively on the training dataset to prevent any leakage of information from the test data into the training set.
Table 11 shows the performance of the proposed voting GWO ensemble model on BoT-IoT in a multiclass scenario. The results indicate that, on DDoS HTTP, the voting GWO ensemble model achieved an accuracy of 99.87%, DR of 99.89%, precision of 99.60%, ROC of 99.56%, and FAR of 1.20.
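The preprocessing described in this section (rank features by IG, keep the top ones, then fit PCA on the training split only and reuse that projection for the test split) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, estimates IG with equal-width binning (an assumption, since the text does not state how continuous features are discretized), and the function names `fit_ig_pca` and `apply_ig_pca` are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels, bins=10):
    """IG = H(y) - H(y | binned feature). Equal-width binning is an assumption."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])
    conditional = 0.0
    for b in np.unique(binned):
        mask = binned == b
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional

def fit_ig_pca(X_train, y_train, n_select, n_components):
    """Learn the IG feature subset and the PCA projection from training data only."""
    gains = np.array([information_gain(X_train[:, j], y_train)
                      for j in range(X_train.shape[1])])
    selected = np.argsort(gains)[::-1][:n_select]  # highest IG first
    mean = X_train[:, selected].mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(X_train[:, selected] - mean, full_matrices=False)
    return selected, mean, vt[:n_components]

def apply_ig_pca(X, selected, mean, components):
    """Project any split using only parameters learned on the training split."""
    return (X[:, selected] - mean) @ components.T
```

Calling `fit_ig_pca` on the training split and then `apply_ig_pca` on both splits keeps the test data out of the fitted parameters, which is the leakage concern the text raises. Fitting a second, separate PCA on the test split would yield a different projection and therefore mismatched feature spaces.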


Table 9 The performance of standard ML approaches and the proposed voting ensemble model on BoT-IoT

| Classifier | Accuracy | DR | Precision | ROC | FAR |
|---|---|---|---|---|---|
| IG + PCA-RF | 97.00 | 99.10 | 97.0 | 98.0 | 2.32 |
| IG + PCA-DT | 93.00 | 98.90 | 96.0 | 97.0 | 3.89 |
| IG + PCA-MLP | 95.00 | 98.0 | 97.0 | 98.0 | 4.83 |
| IG + PCA-KNN | 98.30 | 97.30 | 98.90 | 98.40 | 3.70 |
| Proposed IG + PCA-Voting GWO ensemble, average of probability | 99.98 | 99.97 | 99.94 | 99.99 | 1.30 |

Values of our proposed model are in bold
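The "average of probability" voting rule named in the table can be illustrated with a short sketch. It is a generic soft-voting example in plain Python, not the authors' code: each base learner is assumed to output one probability per class for a sample, and the optional `weights` argument is a hypothetical hook (this section does not state which quantities GWO tunes).

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities from several base learners for one sample.

    prob_lists: one list of class probabilities per learner.
    weights: optional per-learner weights (hypothetical hook); equal by default.
    Returns (predicted class index, averaged probabilities).
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged
```

With equal weights this reduces to the plain mean of the learners' probability vectors, so a single confident learner can be outvoted by several moderately confident ones.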

Table 10 The performance of standard ML techniques and voting ensemble model on the UNSW-NB15

| ML approaches | Accuracy | DR | Precision | ROC | FAR |
|---|---|---|---|---|---|
| IG + PCA-RF | 98.14 | 99.20 | 99.20 | 98.10 | 3.40 |
| IG + PCA-DT | 97.00 | 99.12 | 98.40 | 97.81 | 5.20 |
| IG + PCA-MLP | 98.23 | 98.70 | 98.80 | 96.83 | 4.31 |
| IG + PCA-KNN | 97.80 | 99.70 | 98.80 | 98.30 | 3.79 |
| Proposed IG + PCA-Voting GWO ensemble, average of probability | 100 | 99.99 | 99.59 | 99.40 | 1.15 |

Values of our proposed model are in bold
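For reference, the accuracy, DR, precision, and FAR columns can be computed from the confusion-matrix counts TP, TN, FP, and FN. The sketch below uses the standard textbook definitions; the paper defers its exact formulas to [66], so treating these as the formulas actually used is an assumption, and the function name `ids_metrics` is hypothetical.

```python
def ids_metrics(tp, tn, fp, fn):
    """Return accuracy, detection rate, precision and FAR as percentages.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one sample was predicted as an attack.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "detection_rate": 100.0 * tp / (tp + fn),  # share of attacks caught
        "precision": 100.0 * tp / (tp + fp),       # share of alerts that are real
        "far": 100.0 * fp / (fp + tn),             # share of normal traffic flagged
    }
```

Under these definitions a lower FAR is better while the other three metrics are better when higher, which is how the table rows should be compared.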

Table 11 Performance of the voting GWO ensemble model relative to the different attack types and benign in terms of accuracy, DR, precision, ROC, and FAR on the BoT-IoT dataset

| Type of attack | Accuracy | DR | Precision | ROC | FAR |
|---|---|---|---|---|---|
| Benign | 99.82 | 98.67 | 99.18 | 99.90 | 3.18 |
| OS fingerprinting | 98.41 | 99.86 | 99.28 | 99.18 | 4.28 |
| Service scanning | 98.67 | 98.87 | 99.68 | 99.68 | 3.89 |
| DoS TCP | 99.62 | 99.78 | 98.81 | 99.10 | 1.89 |
| DoS HTTP | 99.89 | 98.77 | 99.72 | 98.10 | 1.01 |
| DoS UDP | 98.84 | 98.89 | 99.83 | 98.53 | 1.10 |
| Data theft | 99.99 | 98.97 | 99.78 | 98.05 | 2.60 |
| Keylogging | 98.76 | 99.45 | 99.09 | 99.12 | 2.80 |
| DDoS UDP | 99.56 | 99.58 | 98.68 | 99.68 | 1.90 |
| DDoS TCP | 99.83 | 99.60 | 98.10 | 99.32 | 1.59 |
| DDoS HTTP | 99.87 | 99.89 | 99.60 | 99.56 | 1.20 |

Values of our proposed model are in bold
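Per-class figures such as those in the table are commonly obtained by treating each class in turn as "positive" and all other classes as "negative" (one-vs-rest). The paper does not state how its per-class values were derived, so the sketch below is an assumption-labeled illustration; `per_class_rates` is a hypothetical helper and the confusion matrix in the usage note is made up.

```python
def per_class_rates(confusion, class_index):
    """One-vs-rest DR and FAR (in %) for one class of a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp                          # missed samples of this class
    fp = sum(confusion[i][class_index] for i in range(n)) - tp     # other classes flagged as this one
    tn = sum(map(sum, confusion)) - tp - fn - fp
    return {"dr": 100.0 * tp / (tp + fn), "far": 100.0 * fp / (fp + tn)}
```

For an illustrative 3-class matrix `[[50, 2, 1], [3, 40, 2], [0, 5, 45]]`, class 1 has TP = 40, FN = 5, FP = 7, and TN = 96, from which DR and FAR follow directly.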

5.4 Multiclass experimental analysis on the UNSW-NB15

Further experiments were conducted on the UNSW-NB15 dataset to showcase the effectiveness of the proposed ensemble model, which combines feature dimensionality reduction techniques (IG + PCA) with the GWO. Similar to the initial experiment, the datasets underwent preprocessing in which IG and PCAs were generated. In the subsequent experiment, a total of nine (9) candidate features were selected from the UNSW-NB15 dataset by evaluating the entropy of the information gain (IG) and subsequently applying PCA for feature extraction. Table 12 shows the performance of the proposed voting GWO ensemble model on UNSW-NB15 in a multiclass scenario. The results indicate that, on reconnaissance, the voting GWO ensemble model achieved an accuracy of 99.91%, DR of 99.75%, precision of 97.08%, ROC of 98.80%, and FAR of 1.80.

5.5 Evaluation and comparison of current datasets suitability for IoT network

To determine the essential qualities of a valuable and realistic dataset for an IoT network, some of the current IDS datasets were evaluated in this part.


Table 12 Performance of the voting GWO ensemble model relative to the different attack types and benign in terms of accuracy, DR, precision, ROC, and FAR on the UNSW-NB15 dataset

| Type of attack | Accuracy | DR | Precision | ROC | FAR |
|---|---|---|---|---|---|
| Benign | 99.99 | 99.89 | 98.80 | 99.89 | 3.42 |
| DoS | 99.09 | 99.56 | 99.40 | 99.53 | 3.45 |
| Backdoor | 99.10 | 99.87 | 97.82 | 98.45 | 2.77 |
| Worm | 99.89 | 98.10 | 98.17 | 98.78 | 2.99 |
| Shellcode | 99.14 | 98.72 | 98.32 | 97.62 | 3.98 |
| Probe | 99.89 | 98.82 | 96.88 | 98.64 | 1.89 |
| Exploits | 99.90 | 98.67 | 99.08 | 99.80 | 3.79 |
| Fuzzer | 99.89 | 99.90 | 98.71 | 99.62 | 2.89 |
| Analysis | 99.78 | 99.10 | 98.83 | 99.84 | 2.89 |
| Generic | 99.69 | 99.59 | 98.67 | 99.89 | 1.99 |
| Reconnaissance | 99.91 | 99.75 | 97.08 | 98.80 | 1.80 |

Values of our proposed model are in bold

5.5.1 DARPA

This dataset was created for the purpose of analyzing network security. Researchers have criticized DARPA because of problems with the artificial injection of attacks and benign traffic. DARPA covers tasks such as sending and receiving mail, surfing the web, sending and receiving files via FTP, using telnet to log into distant systems and carry out work, sending and receiving IRC messages, and remotely monitoring the router using SNMP. The dataset contains various types of attacks, including but not limited to denial of service (DoS), password guessing, buffer overflow, remote file transfer protocol (FTP), SYN flood, network mapper (Nmap), and rootkit. Regrettably, the dataset does not accurately reflect real-world network traffic in IoT and exhibits anomalies such as the absence of false positives. Furthermore, it is no longer current enough to provide a comprehensive assessment of IDSs with respect to contemporary network infrastructures and attack modalities. The absence of actual attack data records is also evident [67].

5.5.2 KDD Cup 99

The dataset known as KDD Cup 1999 was derived by analyzing the tcpdump component of the 1998 DARPA dataset. However, it is important to note that the KDD Cup 1999 dataset is not immune to the same issues as its predecessor. The KDD99 dataset encompasses over twenty distinct types of attacks, including but not limited to neptune-dos, pod-dos, smurf-dos, buffer-overflow, rootkit, satan, and teardrop. Merging network traffic records of both normal and attack traffic within a simulated environment yields a dataset that contains a substantial number of redundant records, which are also tainted with data corruption. This, in turn, results in biased testing outcomes, as reported in reference [68]. NSL-KDD was developed as a means of addressing certain limitations of the KDD dataset [68], which had been identified in previous research [67].

5.5.3 CDX

The CDX dataset demonstrates the use of network warfare competitions for the creation of contemporary labeled datasets. In this dataset, attackers used widely recognized attack tools such as Nikto, Nessus, and WebScarab to conduct automated reconnaissance and attacks. Benign network traffic encompasses essential services such as web browsing, email communication, DNS queries, and other necessary functions. According to [69], CDX has limitations in terms of traffic diversity and volume, although it can still serve as a tool for testing IDS alert rules.

5.5.4 Kyoto

This dataset was generated using honeypots, thereby precluding the possibility of manual labeling and anonymization. However, it is important to note that the dataset's scope is restricted solely to those attacks that were directed toward the honeypots. Compared to the previous datasets, it offers ten additional features, including IDS detection, malware identification, and Ashula detection. These features are beneficial for conducting NIDS evaluation and analysis. As the attacks repeatedly simulate normal traffic, the resulting DNS and mail traffic information does not accurately reflect real-world normal traffic. Therefore, false positives are not present. The significance of false positives lies in their ability to reduce the frequency of alerts, as indicated by [70].


5.5.5 Twente

To generate the dataset, three distinct services, namely, OpenSSH, Apache web server, and Proftp utilizing auth/ident on port 113, were deployed to gather information from a honeypot network via netflow. Certain types of traffic, including auth/ident, ICMP, and IRC traffic, may produce side effects that are neither entirely benign nor malicious. In addition, the dataset includes alert traffic that is both unidentified and lacking correlations. The labeled dataset is deemed more realistic; however, its lack of volume and variety of attacks is a conspicuous limitation, as noted in reference [71].

5.5.6 ISCX2012

The authors have presented a valuable recommendation for producing realistic and useful IDS evaluation datasets through a dynamic approach. The dataset in question was generated using this approach. The methodology is split into two distinct components, denoted as the alpha and beta profiles. The alpha profile executes multiple stages of attack scenarios to filter the anomalous segment of the dataset. The beta profile, a benign traffic generator, produces authentic network traffic accompanied by ambient noise. Empirical data are used to construct profiles that simulate authentic traffic for various protocols such as HTTP, SMTP, SSH, IMAP, POP3, and FTP. The dataset produced by this methodology comprises network traces that include complete packet payloads and pertinent profiles. Nevertheless, the dataset does not cover newer network protocols: approximately 70% of contemporary network traffic is HTTPS, and no traces of HTTPS are present within the dataset. Furthermore, the distribution of the simulated attacks is not grounded on empirical data [72]. Table 13 shows some popular realistic datasets for IoT networks.
As can be seen, only the datasets used in this study meet all criteria. Tables 13 and 14 list and explain the datasets' flaws and strengths based on relevant documents and research, as well as their suitability for IoT networks. Some feature values are not presented as a result of inadequate documentation and a lack of metadata. Here, we evaluated the proposed model using two well-known datasets: UNSW-NB15 and BoT-IoT. In contrast with the datasets used in several existing models, which do not accurately reflect contemporary attacks on IoT networks and do not adhere to IoT protocol requirements, these chosen datasets are appropriate and realistic for IoT network traffic.

6 Discussion of findings

6.1 Comparison with the existing studies

In this section, we compared the performance of the proposed GWO ensemble model with the existing state-of-the-art models in Table 15. The majority of the state-of-the-art models concentrated on the NSLKDD and KDD Cup 99 datasets. These are unrealistic intrusion detection datasets for the evaluation of IoT systems. Such models are unsuccessful in practical use because the dataset used to train and evaluate them is non-representative. On the other hand, several existing techniques address these issues but provide low accuracy, DR, precision, and ROC together with a high FAR, preventing them from being implemented in commercial systems. It is also worth mentioning that the existing state-of-the-art models paid no attention to feature dimensionality; this stage of dimensionality reduction is regarded as the most crucial stage. This phase is particularly time- and labor-intensive. This paper addressed the feature dimensionality phase by proposing a hybridized IG + PCA for dimensionality reduction and provides a novel GWO ensemble model for classification. Additionally, the proposed ensemble model was evaluated on the realistic BoT-IoT and UNSW-NB15 datasets, which makes it suitable for commercial and industrial applications. As shown in Fig. 3, the best state-of-the-art model provides 100% accuracy on the BoT-IoT data, while the ROC and F-measure were disregarded. On the comparable BoT-IoT data, the proposed voting GWO ensemble model achieved an accuracy of 99.98%, DR of 99.97%, precision of 99.94%, ROC of 99.99%, and FAR of 1.30.

6.2 Computational compatibility across IoT devices

When designing a machine learning model for intrusion detection in IoT environments, it is important to consider the computational compatibility of the proposed model, especially given the heterogeneity in computational power among IoT devices. A model that works well on high-power devices might struggle or be impractical to implement on resource-constrained IoT devices. Imagine a scenario where our proposed model is deployed for real-time anomaly detection in a smart city environment, where various types of IoT devices are utilized, ranging from resource-constrained sensors to more powerful edge devices. In this scenario, the lightweight nature of our voting GWO ensemble model enables seamless integration across these devices. Resource-intensive tasks are offloaded to devices with higher computational power, while less resource-intensive tasks are managed by lower-powered devices. Our model's architecture is designed to dynamically adjust its computational requirements based on the available


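Table 13 A comparative analysis of the datasets currently accessible for detecting attacks in IoT

| | | DARPA | LBNL | Kyoto | AWID | ISCX 2012 | KDD'99 | CDX | Twente |
|---|---|---|---|---|---|---|---|---|---|
| Traffic | | No | Yes | No | No | No | No | No | Yes |
| Network | | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Label | | Yes | No | Yes | Yes | Yes | Yes | No | Yes |
| Capture | | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Interaction | | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Attacks | Brute-force | Yes | – | Yes | Yes | Yes | Yes | No | Yes |
| | Browser | Yes | – | Yes | Yes | Yes | Yes | No | No |
| | DoS | Yes | – | Yes | Yes | Yes | Yes | Yes | No |
| | DNS | No | – | Yes | No | No | No | Yes | No |
| | Backdoor | No | – | Yes | No | No | No | No | No |
| | Scan | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| | Others | Yes | – | Yes | Yes | Yes | Yes | – | Yes |
| Protocols | HTTP | No | No | Yes | No | No | No | No | No |
| | HTTP | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| | FTP | Yes | No | Yes | Yes | Yes | Yes | Yes | No |
| | Email | Yes | No | Yes | Yes | Yes | Yes | Yes | No |
| | Ssh | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Heterogeneity | | No | No | No | No | Yes | No | No | – |
| Anonymity | | No | Yes | No | No | No | No | – | – |
| Metadata | | Yes | No | Yes | Yes | Yes | Yes | No | Yes |
| Feature set | | No | No | Yes | Yes | No | Yes | No | No |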

Table 14 Summary of representative (realistic) and non-representative (non-realistic) datasets for IoT

| Dataset/authors | Traffic creation year | Public availability | Attack traffic | Normal traffic | Realistic network traffic for IoT |
|---|---|---|---|---|---|
| DARPA [73] | 1999 | Yes | Yes | Yes | No |
| LBNL [74] | 2005 | Yes | Yes | Yes | No |
| Kyoto 2006 + [70] | 2011 | Yes | Yes | Yes | No |
| NSL-KDD [68] | 2009 | Yes | Yes | Yes | No |
| SSENET-2011 [75] | 2011 | n.i.f | Yes | Yes | No |
| UNIBS [76] | 2009 | o.r | No | Yes | No |
| CDX [69] | 2009 | Yes | Yes | Yes | No |
| Twente [71] | 2009 | Yes | Yes | Yes | No |
| ISCX 2012 [72] | 2012 | Yes | Yes | Yes | No |
| Botnet [77] | 2014 | Yes | Yes | Yes | No |
| AWID [16] | 2015 | o.r | Yes | Yes | No |
| DDoS [78] | 2016 | Yes | Yes | Yes | No |
| CIDDS-001 [79] | 2017 | Yes | Yes | Yes | Yes |
| N-BaIoT [80] | 2018 | Yes | Yes | Yes | Yes |
| UNSW-NB15 | 2015 | Yes | Yes | Yes | Yes |
| BoT-IoT | 2019 | Yes | Yes | Yes | Yes |

o.r on request and n.i.f no information found


Table 15 Comparison with the state-of-the-art models

| Authors | Methodology | Dataset used | Accuracy | DR | Precision | FAR |
|---|---|---|---|---|---|---|
| [44] | SVM, RF, DT, and LR | Capture live network | SVM 98.06; RF 99.17; DT 98.34; LR 97.50 | x | x | x |
| [41] | CNN | System call graph | 97 | x | x | 0.034 |
| [20] | Bi-LSTM | UNSW-NB15 | 95 | x | x | 0 |
| [42] | K-NN, Gaussian Naive Bayes, and random forest | Capture live network | K-NN 94.44; Gaussian Naive Bayes 77.78; RF 88.8 | KNN 100; GNB 100; RF 100 | K-NN 96; GNB 86; RF 92 | x |
| [24] | GRU, LSTM, BLS, and Bi-LSTM | NSLKDD | 84.14 | x | x | x |
| [43] | GRU-MLP, BGRU-MLP, BLSTM + MLP, GRU, MLP, LSTM-MLP, LSTM | KDD Cup 99, NSLKDD | 99.24 | x | x | 0.84 |
| [39] | Single-layered ANN | N-BaIoT | 99 | x | x | x |
| [25] | SVM, J48, NB, MLP, NB tree, RF, RF tree, RNN-IDS, and ANN | NSLKDD | 95.2 | x | x | 6.3 |
| [40] | SMOTE and ANN | BoT-IoT | 100 | x | x | x |
| [27] | DJ, DF, DNN, LSTM, DBN, and GRU | NSLKDD, KDD Cup, and CICIDS | 96.9 | x | x | 5.44 |
| [81] | MTNN | ToN_IoT | 87.79 | 90.69 | 77.95 | x |
| [82] | CNN-CapSA | BoT-IoT | 99.94 | 99.93 | 99.93 | x |
| [83] | CNN-MGO | BoT-IoT | 99.62 | 99.72 | 99.52 | x |
| Our proposed model | Voting GWO ensemble model | BoT-IoT | 99.98 | 99.97 | 99.94 | 1.30 |
| Our proposed model | Voting GWO ensemble model | UNSW-NB15 | 100 | 99.99 | 99.59 | 1.15 |

resources, ensuring effective and efficient operation across the heterogeneous IoT landscape.

6.3 Transferability of the proposed research to real-world IoT applications

Our research is designed with a strong focus on practical applicability in real-world IoT environments. Here are key points highlighting the transferability of our research to real-world IoT applications:

a. IoT-Centric Approach: We developed our intrusion detection system with a deep understanding of the unique characteristics and challenges of IoT networks. This approach ensures that our research is directly relevant to the specific requirements and constraints of IoT applications.
b. Dataset Selection: We utilized datasets, such as BoT-IoT and UNSW-NB15, that are representative of real-world IoT network traffic and intrusions. This dataset selection ensures that our research is grounded in the realities of IoT security.
c. Hybrid Approach: Our research combines feature extraction via principal component analysis (PCA), feature selection via IG, and GWO-based ensemble models.


Fig. 3 Comparison of the proposed models with the existing models. [Bar chart titled "Proposed models versus existing systems": FAR, Precision, DR, and Accuracy (axis 0 to 120) for our proposed model (two entries) and references [21], [34], [19], [33], [37], [18], [36], [14], [35], and [38].]

This hybrid approach is designed to enhance the robustness and effectiveness of intrusion detection in real-world IoT scenarios.
d. Generalization: We conducted experiments and evaluations on multiple datasets to ensure the generalizability of our proposed model to diverse IoT applications. Our research demonstrates the adaptability and transferability of our approach across various IoT contexts.
e. Performance Metrics: We evaluated our intrusion detection system using well-established performance metrics, such as accuracy, DR, precision, and FAR. These metrics reflect the real-world effectiveness of our approach in identifying and mitigating security threats.
f. Scalability: We addressed the scalability challenges often encountered in IoT environments, ensuring that our research can handle growing numbers of devices and data volumes while maintaining effectiveness.
g. Practical Deployment Considerations: We discussed the practical considerations of deploying our intrusion detection system in real-world IoT applications, including the optimization of model parameters and the importance of network segmentation.
h. Security Challenges: Our research explicitly addresses a range of security challenges and threats in IoT environments, making it directly applicable to scenarios where IoT security is a concern.

This research is built on a foundation that prioritizes real-world relevance and practicality. We have conducted experiments and evaluations that demonstrate the effectiveness and transferability of our IDS to various IoT applications. By addressing the unique challenges of IoT security and employing a hybrid approach that combines feature extraction, feature selection, and optimization techniques, we aim to provide a solution that can be readily applied in real-world IoT environments.

6.4 Threats to validity

The main threat to validity is random sampling, which makes it difficult to duplicate the exact experiment. To validate the suggested approach's reliability, the experiments were repeated on two separate realistic IoT datasets with a substantial sample size. Finally, while the presented approach performed well in binary-class classification, it deserves additional investigation on multiclass classification problems.

7 Conclusion and future work

This paper proposes a novel voting GWO ensemble learning model for the detection of attacks in an IoT environment. The suggested system successfully detects various forms of IoT threats by leveraging the feature set retrieved from the IoT ecosystem. The strengths of this paper are the voting GWO ensemble model, which is the first of its kind; the hybridization of IG + PCA for dimensionality reduction; and the use of realistic datasets that reflect real-time attacks in the IoT context. To construct a successful ensemble IDS for detecting IoT attacks, a collection of relevant features was selected. The experimental findings show that the detection accuracy is increased in the voting GWO ensemble model in the suggested framework using the average probability technique. Our experimental results indicate that our proposed voting ensemble model outperforms other ML and DL approaches, attaining an accuracy of 100%, DR of 99.99%, precision of 99.59%, ROC of 99.40%, and FAR of 1.15 on the UNSW-NB15 compared to earlier studies. This indicates that our presented method will be extremely beneficial in designing contemporary IDS for the IoT environment. The suggested model will be extended in the future to incorporate multiple class classification problems. Also,


deep learning models to classify additional forms of attacks may be considered in future work.

Authors' contributions Authors contributed equally.

Funding Open access funding provided by Institute for Energy Technology.

Data availability The BoT-IoT is available at https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php and N. Moustafa and J. Slay, "UNSW-NB15: A comprehensive dataset for network intrusion detection systems (UNSW-NB15 network dataset)," 2015 Mil. Commun. Inf. Syst. Conf. MilCIS 2015—Proc., 2015, doi: https://doi.org/10.1109/MilCIS.2015.7348942. It is also cited in the reference list as [14].

Declarations

Conflict of interest Authors do not have any financial or non-financial interests that are directly or indirectly related to the work submitted for publication.

Ethical approval Authors comply with the highest level of ethical standards while preparing the manuscript.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Islam, N., et al.: Towards Machine learning based intrusion detection in IoT networks. Comput. Mater. Contin. 69(2), 1801–1821 (2021). https://doi.org/10.32604/cmc.2021.018466
2. Rahman, M.A., Asyhari, A.T.: The emergence of Internet of things (IoT): connecting anything, anywhere. Computers 8(2), 8–11 (2019). https://doi.org/10.3390/computers8020040
3. Lin, H., Hu, J., Wang, X., Alhamid, M.F., Piran, M.J.: Toward secure data fusion in industrial IoT using transfer learning. IEEE Trans. Ind. Inform. 17(10), 7114–7122 (2021). https://doi.org/10.1109/TII.2020.3038780
4. Farsi, M., Daneshkhah, A., Hosseinian-Far, H., Jahankhani, A.: Digital Twin Technologies and Smart Cities. Springer, Berlin/Heidelberg, Germany (2020)
5. Zhao, K., Ge, L.: A survey on the Internet of things security. In: Proceedings—9th International Conference on Computational Intelligence and Security, CIS 2013, pp. 663–667 (2013). https://doi.org/10.1109/CIS.2013.145
6. Yan, Z., Zhang, P., Vasilakos, A.V.: A survey on trust management for Internet of Things. J. Netw. Comput. Appl. 42, 120–134 (2014). https://doi.org/10.1016/j.jnca.2014.01.014
7. Saheed, Y.K., Babatunde, A.O.: Genetic algorithm technique in program path coverage for improving software testing. Afr. J. Comput. ICT 7(5), 151–158 (2014)
8. Kelton, A.P., Papa, J.P., Lisboa, C.O., Munoz, R., De, V.H.C.: Internet of Things: a survey on machine learning-based intrusion detection approaches. Comput. Netw. 151, 147–157 (2019). https://doi.org/10.1016/j.comnet.2019.01.023
9. Saheed, Y.K., Misra, S., Chockalingam, S.: Autoencoder via DCNN and LSTM models for intrusion detection in industrial control systems of critical infrastructures. In: 2023 IEEE/ACM 4th Int. Work. Eng. Cybersecurity Crit. Syst. (EnCyCriS), Melbourne, Aust., pp. 9–16 (2023). https://doi.org/10.1109/EnCyCriS59249.2023.00006
10. Alharbi, S., Rodriguez, P., Maharaja, R., Iyer, P., Bose, N., Ye, Z.: FOCUS: a fog computing-based security system for the Internet of Things. (2018)
11. Pajouh, H.H., Javidan, R., Khayami, R., Dehghantanha, A., Choo, K.K.R.: A two-layer dimension reduction and two-tier classification model for anomaly-based intrusion detection in IoT backbone networks. IEEE Trans. Emerg. Top. Comput. 7(2), 314–323 (2019). https://doi.org/10.1109/TETC.2016.2633228
12. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. no. Cisda, pp. 1–6 (2009).
13. Zhang, H., Wu, C.Q., Gao, S., Wang, Z., Xu, Y., Liu, Y.: An effective deep learning based scheme for network intrusion detection. In: 2018 24th Int. Conf. Pattern Recognit., pp. 682–687 (2018)
14. Moustafa, N., Slay, J.: UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Mil. Commun. Inf. Syst. Conf. MilCIS 2015—Proc. (2015). https://doi.org/10.1109/MilCIS.2015.7348942
15. Koroniotis, N., Moustafa, N., Sitnikova, E.: Towards Developing Network Forensic Mechanism for Botnet Activities in the IoT Based on Machine Learning Techniques. Springer International Publishing
16. Kolias, C., Kambourakis, G., Stavrou, A., Gritzalis, S.: Intrusion detection in 802.11 Networks: Empirical Evaluation of Threats and a Public Dataset. no. c, pp. 1–24 (2015). https://doi.org/10.1109/COMST.2015.2402161
17. Saheed, Y.K., Usman, A.A., Sukat, F.D., Abdulrahman, M.: A novel hybrid autoencoder and modified particle swarm optimization feature selection for intrusion detection in the internet of things network. Front. Comput. Sci. 5, 1–13 (2023). https://doi.org/10.3389/fcomp.2023.997159
18. Amin, S.O., Siddiqui, M.S., Hong, C.S., Choe, J.: A novel coding scheme to implement signature based IDS in IP based sensor networks. In: 2009 IFIP/IEEE Int. Symp. Integr. Netw. Manag. IM 2009, pp. 269–274 (2009). https://doi.org/10.1109/INMW.2009.5195973
19. Abubakar, A., Pranggono, B.: Machine learning based intrusion detection system for software defined networks. In: 2017 Seventh International Conference on Emerging Security Technologies, pp. 138–143 (2017)
20. Roy, B., Cheung, H.: A deep learning approach for intrusion detection in internet of things using bi-directional long short-term memory recurrent neural network. In: 2018 28th Int. Telecommun. Networks Appl. Conf. ITNAC 2018, pp. 1–6 (2019). https://doi.org/10.1109/ATNAC.2018.8615294
21. Le, A., Loo, J., Luo, Y., Lasebae, A.: Specification-based IDS for securing RPL from topology attacks. IFIP Wirel. Days 1(1), 4–6 (2011). https://doi.org/10.1109/WD.2011.6098218
22. Bertino, E.: Botnets and Internet of Things Security. Computer (Long. Beach. Calif)., pp. 76–79 (2017)
23. Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., Razaque, A.: Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory 101, 102031 (2020). https://doi.org/10.1016/j.simpat.2019.102031
24. Li, Z., Batta, P., Trajkovi, L.: Comparison of Machine Learning Algorithms for Detection of Network Intrusions. pp. 4248–4253 (2018). https://doi.org/10.1109/SMC.2018.00719

25. Ayyaz-ul-haq, Q., Larijani, H., Ahmad, J.: A heuristic intrusion detection system for Internet-of-Things (IoT). In: Arai, K., Bhatia, R., Kapoor, S. (eds.) Intelligent Computing. CompCom 2019. Advances in Intelligent Systems and Computing. Springer Cham, pp. 86–98 (2019)
26. Böhm, A., Jonsson, M., Uhlemann, E.: Performance comparison of a platooning application using the IEEE 802.11p MAC on the control channel and a centralized MAC on a service channel. Int. Conf. Wirel. Mob. Comput. Netw. Commun. 545–552 (2013). https://doi.org/10.1109/WiMOB.2013.6673411
27. Elmasry, W., Akbulut, A., Zaim, A.H.: Empirical study on multiclass classification-based network intrusion detection. Comput. Intell. 35(4), 919–954 (2019). https://doi.org/10.1111/coin.12220
28. Jiang, K., Wang, W., Wang, A., Wu, H.: Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access 8(3), 32464–32476 (2020). https://doi.org/10.1109/ACCESS.2020.2973730
29. Hasan, M., Islam, M., Zarif, I.I., Hashem, M.M.A.: Internet of things attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Internet Things 7, 100059 (2019). https://doi.org/10.1016/j.iot.2019.100059
30. Cheng, Y., Xu, Y., Zhong, H., Liu, Y.: Leveraging Semi-supervised Hierarchical Stacking Temporal Convolutional Network for Anomaly Detection in IoT Communication, vol. 4662, no. c (2020). https://doi.org/10.1109/JIOT.2020.3000771
31. Lee, T.H., Wen, C.H., Chang, L.H., Chiang, H.S., Hsieh, M.C.: A lightweight intrusion detection scheme based on energy consumption analysis in 6LowPAN. In: Advanced Technologies, Embedded and Multimedia for Human-centric Computing (2014). https://doi.org/10.1007/978-94-007-7262-5_137
32. Sahu, N.K., Mukherjee, I.: Machine learning based anomaly detection for IoT network:(Anomaly detection in IoT network). In: 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), no. Icoei, pp. 787–794 (2020). https://doi.org/10.1109/ICOEI48184.2020.9142921
33. Chen, J., Chen, C.: Design of complex event-processing IDS in internet of things. In: Proc. - 2014 6th Int. Conf. Meas. Technol.
4th Int. Conf. Informatics Comput. ICIC 2019, pp. 0–4 (2019). https://doi.org/10.1109/ICIC47613.2019.8985853
41. Le, H.V., Ngo, Q.D., Le, V.H.: Iot Botnet detection using system call graphs and one-class CNN classification. Int. J. Innov. Technol. Explor. Eng. 8(10) (2019).
42. Kumar, A., Lim, T.J.: EDIMA: early detection of IoT malware network activity using machine learning techniques. In: IEEE 5th World Forum Internet Things, WF-IoT 2019—Conf. Proc., pp. 289–294 (2019). https://doi.org/10.1109/WF-IoT.2019.8767194
43. Xu, C., Member, S., Shen, J., Du, X.I.N., Zhang, F.A.N.: An intrusion detection system using a deep neural network with gated recurrent units. IEEE Access PP(c), 1 (2018). https://doi.org/10.1109/ACCESS.2018.2867564
44. Chaudhary, P., Gupta, B.B.: DDoS detection framework in resource constrained internet of things domain. In: 2019 IEEE 8th Glob. Conf. Consum. Electron. GCCE 2019, pp. 675–678 (2019). https://doi.org/10.1109/GCCE46687.2019.9015465
45. Khraisat, A., Gondal, I., Vamplew, P., Kamruzzaman, J., Alazab, A.: A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks. Electron (2019). https://doi.org/10.3390/electronics8111210
46. Alazab, A., Abawajy, J., Hobbs, M., Layton, R.: Crime Toolkits: The Productisation of Cybercrime (2013). https://doi.org/10.1109/TrustCom.2013.273
47. Singh, J., Pasquier, T., Bacon, J., Ko, H., Eyers, D.: Twenty security considerations for cloud-supported Internet of Things. vol. 4662, no. c, pp. 1–16 (2015). https://doi.org/10.1109/JIOT.2015.2460333
48. Adeyiola, A.Q., Saheed, Y.K., Misra, S., Chockalingam, S.: Metaheuristic firefly and C5.0 algorithms based intrusion detection for critical infrastructures. In: 2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI), pp. 1–7 (2023). https://doi.org/10.1109/ICAPAI58366.2023.10193917
49. Kolias, C., Kambourakis, G., Stavrou, A., Voas, J.: DDoS in the IoT: Mirai and other botnets. Computer (Long Beach Calif.) 50(7), 80–84 (2017). https://doi.org/10.1109/MC.2017.201
50. Abomhara, M., Køien, G.M.: Cyber security and the internet of
Mechatronics Autom. ICMTMA 2014, pp. 226–229 (2014). https:// things : vulnerabilities , threats , intruders.4, 65–88 (2015). https://
doi.org/10.1109/ICMTMA.2014.57 doi.org/10.13052/jcsm2245-1439.414
34. Midi, D., Rullo, A., Mudgerikar, A., Bertino, E.: Kalis—a system 51. Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B.: Towards
for knowledge-driven adaptable intrusion detection for the Internet the development of realistic botnet dataset in the Internet of Things
of Things. In: Proc. - Int. Conf. Distrib. Comput. Syst., pp. 656–666 for network forensic analytics: Bot-IoT dataset. Futur. Gener. Com-
(2017). https://doi.org/10.1109/ICDCS.2017.104 put. Syst. 100, 779–796 (2019). https://doi.org/10.1016/j.future.
35. Karunkumar, D., Himansu, R., Behera, S., Nayak, J.: Deep neural 2019.05.041
network based anomaly detection in Internet of Things network 52. Mansfield-devine, S., Security, N.: DDoS goes mainstream: attacks
traffic tracking for the applications of future smart cities. no. July, could make this threat an organisation ’ s biggest nightmare. Netw.
pp. 1–26 (2020). https://doi.org/10.1002/ett.4121 Secur. 2016(11), 7–13 (2016). https://doi.org/10.1016/S1353-4858
36. Lopez-Martin, M., Carro, B., Sanchez-Esguevillas, A., Lloret, J.: (16)30104-0
Conditional variational autoencoder for prediction and feature 53. Greenberg, A.: Hackers remotely kill a jeep on the highway—with
recovery applied to intrusion detection in iot. Sensors (Switzer- me in it. Wired, 7(21) (2015)
land) (2017). https://doi.org/10.3390/s17091967 54. Saheed, Y.K.: Data analytics for intrusion detection system based
37. Guller, M.: Big data analytics with Spark: A practitioner’s guide to on recurrent neural network and supervised machine learning meth-
using Spark for large scale data analysis. Apress (2015) ods. In: Recurrent Neural Networks, pp. 167–179. CRC Press
38. Joshi, H.P., Bennison, M., Dutta, R.: Collaborative botnet detection Taylor & Francis Group (2022)
with partial communication graph information. In: 2017 IEEE 38th 55. Jain, S., Shukla, S., Wadhvani, R.: Dynamic selection of normal-
Sarnoff Symp. (2017). https://doi.org/10.1109/SARNOF.2017.80 ization techniques using data complexity measures. Expert Syst.
80397 Appl. 106, 252–262 (2018). https://doi.org/10.1016/j.eswa.2018.
39. Soe, Y.N., Feng, Y., Santosa, P.I., Hartanto, R., Sakurai, K.: A 04.008
sequential scheme for detecting cyber attacks in IoT environ- 56. Georganos, S., Lennert, M., Grippa, T., Vanhuysse, S., Johnson, B.,
ment. In: Proc. - IEEE 17th Int. Conf. Dependable, Auton. Secur. Wolff, E.: Normalization in unsupervised segmentation parameter
Comput. IEEE 17th Int. Conf. Pervasive Intell. Comput. IEEE optimization: a solution based on local regression trend analysis.
5th Int. Conf. Cloud Big Data Comput. 4th Cyber Sci., vol. Remote Sens. (2018). https://doi.org/10.3390/rs10020222
324, pp. 238–244 (2019). https://doi.org/10.1109/DASC/PiCom/ 57. Saheed, Y.K.: Performance improvement of intrusion detection sys-
CBDCom/CyberSciTech.2019.00051 tem for detecting attacks on internet of things and edge of things.
40. Soe, Y.N., Santosa, P.I., Hartanto, R.: DDoS attack detection based In: Misra, S., Kumar, T.A., Piuri, V., Garg, L. (eds.) Artificial
on simple ANN with SMOTE for IoT environment. In: Proc. 2019 Intelligence for Cloud and Edge Computing. Internet of Things


(Technology, Communications and Computing). Springer, Cham (2022)
58. Gray, R.M.: Entropy and Information Theory. Springer Science & Business Media (2011)
59. Adi, E., Baig, Z., Hingston, P.: Stealthy Denial of Service (DoS) attack modelling and detection for HTTP/2 services. J. Netw. Comput. Appl. 91, 1–13 (2017). https://doi.org/10.1016/j.jnca.2017.04.015
60. Saheed, Y.K.: Machine learning-based blockchain technology for protection and privacy against intrusion attacks in intelligent transportation systems. In: Machine Learning, Blockchain Technologies and Big Data Analytics for IoTs: Methods, Technologies and Applications, p. 16 (2022)
61. ZorarpacI, E., Özel, S.A.: A hybrid approach of differential evolution and artificial bee colony for feature selection. Expert Syst. Appl. 62, 91–103 (2016). https://doi.org/10.1016/j.eswa.2016.06.004
62. Jimoh, R.G., Ridwan, M.Y., Yusuf, O.O., Saheed, Y.K.: Application of dimensionality reduction on classification of colon cancer using Ica and K-Nn algorithm. Anale. Ser. Informatică, vol. 6, no. 10, pp. 55–59, 2018, [Online]. Available: http://anale-informatica.tibiscus.ro/download/lucrari/16-1-06-Olatunde.pdf
63. Seni, G., Elder, J.F.: Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, vol. 2, no. 1 (2010)
64. Hung, C., Chen, J.H.: A selective ensemble based on expected probabilities for bankruptcy prediction. Expert Syst. Appl. 36(3 PART 1), 5297–5303 (2009). https://doi.org/10.1016/j.eswa.2008.06.068
65. Abdulhammed, R., Musafer, H., Alessa, A., Faezipour, M., Abuzneid, A.: Features dimensionality reduction approaches for machine learning based network intrusion detection. Electron (2019). https://doi.org/10.3390/electronics8030322
66. Elhag, S., Fernández, A., Bawakid, A., Alshomrani, S., Herrera, F.: On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on Intrusion Detection Systems. Expert Syst. Appl. 42(1), 193–202 (2015). https://doi.org/10.1016/j.eswa.2014.08.002
67. Mchugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 3(4), 262–294 (2000). https://doi.org/10.1145/382912.382923
68. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set in computational intelligence for security and defense applications. Comput. Intell. Secur. Def. Appl., no. Cisda, 1–6 (2009)
69. Sangster, B. et al.: Toward instrumenting network warfare competitions to generate labeled datasets. In: 2nd Work. Cyber Secur. Exp. Test, CSET 2009 (2009)
70. Sato, M., Yamaki, H., Takakura, H.: Unknown attacks detection using feature extraction from anomaly-based IDS alerts. In: Proc. - 2012 IEEE/IPSJ 12th Int. Symp. Appl. Internet, SAINT 2012, pp. 273–277 (2012). https://doi.org/10.1109/SAINT.2012.51
71. Sperotto, A., Sadre, R., Van Vliet, F., Pras, A.: A labeled data set for flow-based intrusion detection. In: IP Operations and Management: 9th IEEE International Workshop, IPOM, pp. 39–50 (2009). https://doi.org/10.1007/978-3-642-04968-2_4
72. Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A.: Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 31(3), 357–374 (2012). https://doi.org/10.1016/j.cose.2011.12.012
73. Lippmann, R.P. et al.: Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation. In: Proc. - DARPA Inf. Surviv. Conf. Expo. DISCEX 2000, vol. 2, pp. 12–26 (2000). https://doi.org/10.1109/DISCEX.2000.821506
74. Ruoming, P., Mark, A., Mike, B., Jason, L., Vern, P., Brian, T.: A first look at modern enterprise traffic. In: p. Proceedings of the 5th ACM SIGCOMM conference on I (2005)
75. Vasudevan, A.R., Harshini, E., Selvakumar, S.: SSENet-2011: a network intrusion detection system dataset and its comparison with KDD CUP 99 dataset. Asian Himalayas Int. Conf. Internet (2011). https://doi.org/10.1109/AHICI.2011.6113948
76. Gringoli, F., Salgarelli, L., Cascarano, N., Risso, F., Claffy, K.C., Rodriguez, P.: GT: picking up the truth from the ground in traffic classification. ACM SIGCOMM Comput. Commun. Rev. 39(5), 12–18 (2009)
77. Beigi, E.B., Jazi, H.H., Stakhanova, N., Ghorbani, A.A.: Towards effective feature selection in machine learning-based botnet detection approaches. In: 2014 IEEE Conf. Commun. Netw. Secur. CNS 2014, pp. 247–255 (2014). https://doi.org/10.1109/CNS.2014.6997492
78. Alkasassbeh, M., Al-Naymat, G., B.A, A., Almseidin, M.: Detecting distributed denial of service attacks using data mining techniques. Int. J. Adv. Comput. Sci. Appl. 7(1), 436–445 (2016). https://doi.org/10.14569/ijacsa.2016.070159
79. Sharafaldin, I., Gharib, A., Lashkari, A.H., Ghorbani, A.A.: Towards a reliable intrusion detection benchmark dataset. Softw. Netw. 2017(1), 177–200 (2017). https://doi.org/10.13052/jsn2445-9739.2017.009
80. Meidan, Y., et al.: N-BaIoT-Network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Comput. 17(3), 12–22 (2018). https://doi.org/10.1109/MPRV.2018.03367731
81. Ahmed, S.W., Kientz, F., Kashef, R.: A modified transformer neural network (MTNN) for robust intrusion detection in IoT networks. In: 2023 Int. Telecommun. Conf. ITC-Egypt 2023, pp. 663–668 (2023). https://doi.org/10.1109/ITC-Egypt58155.2023.10206134
82. Abd Elaziz, M., Al-qaness, M.A.A., Dahou, A., Ibrahim, R.A., El-Latif, A.A.A.: Intrusion detection approach for cloud and IoT environments using deep learning and Capuchin Search Algorithm. Adv. Eng. Softw. 176(December 2022), 103402 (2023). https://doi.org/10.1016/j.advengsoft.2022.103402
83. Fatani, A., et al.: Enhancing intrusion detection systems for IoT and cloud environments using a growth optimizer algorithm and conventional neural networks. Sensors 23(9), 1–14 (2023). https://doi.org/10.3390/s23094430

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.