A Voting Gray Wolf Optimizer-Based Ensemble Learning Models For Intrusion Detection in The Internet of Things
https://doi.org/10.1007/s10207-023-00803-x
REGULAR CONTRIBUTION
Abstract
The Internet of Things (IoT) has garnered considerable attention from academic and industrial circles as a pivotal technology
in recent years. The escalation of security risks is observed to be associated with the growing interest in IoT applications.
Intrusion detection systems (IDS) have been devised as viable instruments for identifying and averting malicious actions in this
context. Several techniques described in academic papers are reported to be very accurate, but they cannot be used in the real
world because the datasets used to build and test the models do not accurately reflect IoT networks. Other existing
methods address these dataset issues, but they are not good enough for commercial use because of their low precision,
low detection rate (DR), poor receiver operating characteristic (ROC), and high false acceptance rate (FAR). The effectiveness of
these solutions is predominantly dependent on individual learners and is consequently influenced by the inherent limitations
of each learning algorithm. This study introduces a new approach for detecting intrusion attacks in an IoT network, which
involves the use of an ensemble learning technique based on the gray wolf optimizer (GWO). The novelty of this study is
threefold. First, the proposed voting GWO ensemble model incorporates two crucial components, a traffic analyzer and a
classification phase engine, and uses a voting technique to combine the probability averages of the base learners. Second,
feature selection and feature extraction techniques are combined to reduce dimensionality. Third, GWO is employed to
optimize the parameters of the ensemble models. The approach also employs the most authentic intrusion detection datasets
that are accessible and combines multiple learners to generate ensemble learners.
The hybridization of information gain (IG) and principal component analysis (PCA) was employed to reduce dimensionality.
The study utilized a novel GWO ensemble learning approach that incorporated a decision tree, random forest, K-nearest
neighbor, and multilayer perceptron for classification. To evaluate the efficacy of the proposed model, two authentic datasets,
namely, BoT-IoT and UNSW-NB15, were scrutinized. The GWO-optimized ensemble model demonstrates superior accuracy
when compared to other machine learning-based and deep learning models. Specifically, the model achieves an accuracy rate
of 99.98%, a DR of 99.97%, a precision rate of 99.94%, an ROC rate of 99.99%, and an FAR rate of 1.30 on the BoT-IoT
dataset. According to the experimental results, the proposed ensemble model optimized by GWO achieved an accuracy of
100%, a DR of 99.9%, a precision of 99.59%, an ROC of 99.40%, and an FAR of 1.5 when tested on the UNSW-NB15
dataset.
Keywords Internet of Things · Gray wolf optimizer · Ensemble model · Intrusion detection system · K-nearest neighbor ·
Multilayer perceptron · Random forest · Decision tree · Average of probability
Sanjay Misra
Yakub Kayode Saheed
1 Department of Computer Science, University of Alcala, Madrid, Spain
2 Department of Applied Data Science, Institute for Energy Technology, Halden, Norway

1 Introduction

The Internet of Things (IoT) is experiencing rapid growth and assuming an increasingly significant role in our everyday existence. IoT nodes can establish a connection with the Internet using an Internet Protocol (IP) address [1]. The past decade has witnessed a significant surge in the level of interconnectivity among individuals, machines, and services,
ultimately leading to the emergence of a novel communication paradigm referred to as the IoT [2]. The proliferation of self-configured smart nodes is fueling the development of a wide range of innovative applications, including but not limited to home automation, process automation, smart automobiles, health-care systems, decision analytics, smart grids, industrial development, and autonomous cars [3]. Analysts predict that the number of interconnected devices will eventually surpass the human population on Earth. According to the International Data Corporation's projections, by 2025 a total of 41.6 billion interconnected IoT devices are expected to generate 79.4 zettabytes of data, in contrast with an anticipated global population of 8.1 billion individuals [4].

The IoT is vulnerable to a range of security threats and presents significant security challenges for end-users, particularly as it continues to expand into various aspects of communal life, as shown in Fig. 1. The IoT is a complex system of various networks that include security measures for sensor data, Internet and mobile network connectivity, privacy protection, network authentication, access control, and information management [5]. In recent years, anomalies and security breaches on IoT devices have become increasingly prevalent. The IoT infrastructure is becoming increasingly complex, which introduces undesired vulnerabilities into its systems. The IoT has the potential to facilitate the seamless integration of physical objects into networks, thereby providing advanced information services to individuals. A multitude of IoT services and applications that utilize ML have emerged across various domains such as security, surveillance, health care, transportation, control, and object monitoring. Preventative security measures are often limited by inadequate planning and implementation, and given the inevitability of attacks, machine learning systems can offer essential services and resilient security strategies for safeguarding IoT devices [6].

An attack detection system is classified as either signature-based or anomaly-based. Signature-based systems compare certain patterns, such as bytes or harmful instruction sequences, in malware-infected network traffic to known attack types stored in a database [7]. Anomaly-based systems detect unknown threats or deviations from the typical flow. Unlike signature-based detection systems, machine learning-based solutions have the potential to detect unknown attacks. However, the ML models must be sufficiently precise to maximize accuracy, increase the detection rate (DR), achieve a high ROC, and minimize false alarms [6]. They must be trained and assessed on genuine datasets to demonstrate their efficacy in real-world deployments. The basic strategy is to utilize ML to create a model of legitimate behavior and then analyze new behavior against the ML model.

As a result, numerous approaches and strategies such as data encryption, firewalls, and user verification via the fog computing model have been created and implemented to defend the IoT platform. Attack channels and risks continue to evolve, rendering traditional security solutions inefficient and ineffective at addressing the IoT safety challenge and paving the way for a new wave of IDS based on ML. A substantial amount of work has been undertaken to determine the optimum intelligent IDS for various types of applications in IoT-based environments [8]. As IDS is one of the key remedies used to ensure IoT security, there is a propensity to employ multiple techniques concurrently [9]. Alharbi et al. [10] proposed an IoT security proof-of-concept system built into the fog computing layer. Each unit defends against a specific type of attack. The traffic analyzer component of the IDS was employed to spot DDoS and DoS attacks with a classification engine based on the decision tree ML technique. To authenticate the IDS's answer, the challenge-response component sends a challenge communication in the event of intrusion detection. If the system fails to respond to this message, the firewall unit disables the connection. Pajouh et al. [11] introduced a layered IDS for IoT backbone networks that uses two-tier dimensionality reduction and classification. The dimension reduction engine is built of component analysis and LDA units, while the classification engine is composed of NB and a cascaded version of the CF-KNN units. The NB was utilized to classify attack records, which were further refined using the CF-KNN algorithm as a secondary filter layer. Using the NSLKDD [12] dataset, the suggested model demonstrated modest detection performance for difficult-to-catch attacks, specifically those belonging to the U2R and R2L classes. Zhang et al. [13] used the UNSW-NB [14] standard dataset to illustrate the efficacy of ML-based intrusion detection using a full depiction of modern IoT attack scenarios. They employed a new feature selection engine that applied a DAE founded on a biased loss function, while using a simple MLP as the classifier. This feature selection technique resulted in an increased emphasis on attack-representative features. Koroniotis et al. [15] proposed an IoT network forensic framework consisting of C4.5, ARM, NB, and ANN ML approaches to recognize novel and complex forms of present botnet attacks as another application of the UNSW-NB dataset.

Traditional ML techniques have been widely used due to their high accuracy for attack detection and low false alarms, but they have been criticized for their inability to detect innovative threats. Traditional ML techniques are incapable of identifying composite and new attacks. Most modern attacks are minor mutations of previously known cyberattacks, so the logic and concepts of prior attacks serve as the basis for the novel attacks. This means that typical ML models will fail to recognize
minute mutations because they are incapable of abstracting information to discern novel threats [16]. Hence, a more robust, intelligent method for IoT attack detection is needed. Therefore, this paper proposes an ensemble learning method. Ensemble learning for resilient IoT security is a strategy for solving a specific artificial intelligence-based challenge by combining different models or expertise. Ensemble learning enhances generalization, simplification, and voting among the various ensemble strategies in the intrusion detection problem, resulting in a higher detection performance than individual models [17]. The paper's primary contributions are as follows:

• Propose a new voting ensemble learning approach for IoT intrusion detection (to the best of our knowledge, this is the first voting GWO-optimized ensemble model for intrusion detection in the IoT).
• Analyze the model using feature extraction (principal component analysis) and feature selection (information gain) for dimensionality reduction. We created a hybrid IG + PCA technique for feature selection and feature extraction, and GWO-optimized ensemble models for classification tasks.
• Propose low-cost and mountable cyber intrusion detection for the IoT based on network traffic characteristics.
• Suggest several realistic datasets for IDS in the IoT environment.
• Develop a voting ensemble model based on the average of probability to increase the detection accuracy and decrease the false alarm rate when detecting cyberattacks in the IoT.
• Leverage the realistic BoT-IoT and UNSW-NB15 datasets, which reflect modern-day attacks, are representative of real-world attack scenarios in the IoT, and satisfy IoT protocol requirements, as against the outdated and non-representative datasets used in some previous studies.

The paper is divided into seven sections. The literature and existing works are presented in Sect. 2. The proposed methodology is detailed in Sect. 3. Section 4 presents the GWO-optimized ensemble models, and Sect. 5 presents the experimental setup. The findings and discussions are given in Sect. 6. Section 7 contains the conclusion and recommendations for future work.

2 Related work

The study [18] employed bloom filtering for signature matching and offered a dynamic coding mechanism for constructing a decentralized signature-based IDS in IP-USN. The study [19] created a virtual test platform to mimic an actual network environment, installing a Snort IDS for traffic control and attack discovery by reflecting traffic to the server and constructing a stream-based intelligent IDS using ML. A further study developed a specification-based IDS capable of identifying a novel sort of danger, the topology attack. They suggested an IDS architecture built on top of a network monitor and explained its monitoring techniques using an RPL FSM. Roy et al. [20] presented the use of a Bi-LSTM RNN for intrusion detection to perform a binary categorization of normal and malicious traffic. The model was trained on the UNSW-NB15 dataset and had a detection accuracy above 95% for IoT attacks. The work [21] devised a resource-constrained approach for detecting deep packet anomalies that distinguishes between regular and anomalous payloads. Xu et al. [22] presented an IDS that examined the realization of several basic hybrid RNN models and MLP to protect against IoT threats. Both the NSL-KDD and KDD Cup 99 datasets are utilized for training and assessing the described models. The study [23] developed a several-layered RNN
Table 1 Summary of existing IoT attack detection using machine learning and deep learning
Table 1 (continued)
Reference | Models | Dataset | Best result | Gap
[25] | SVM, J48, NB, MLP, NB, RF, RF, RNN-IDS, and ANN | NSLKDD | RNN-IDS accuracy 95.2% | For performance comparisons, only machine learning models and an outdated dataset were used for the experimental analysis
[27] | DJ, DF, DNN, LSTM, DBN, and GRU | NSLKDD, KDD Cup, and CICIDS | DBN gave an accuracy of 96.9%, outperforming others | There are no realistic IoT datasets examined
Proposed GWO ensemble models | RF, DT, MLP, and KNN | BoT-IoT, UNSW-NB15 | Improved accuracy, F-measure, and ROC | We used multiple base classifiers, including RF, DT, MLP, and KNN, and designed a voting GWO ensemble model
model for IoT gadgets that might be deployed. The identification rates of attacks were determined to be 98.27% for DoS, 97.35% for probe, 64.93% for U2R, and 77.25% for R2L, using the NSL-KDD dataset. Li et al. [24] used the NSLKDD dataset to build GRU, LSTM, BLS, and Bi-LSTM algorithms for several known intrusion classification tasks. According to the performance study, the BLS significantly reduces training time while maintaining an accuracy of 72.64% and 84.15% for the KDDTest-21 and KDDTest+ data, respectively. The author [25] demonstrated an accuracy of 85.5% to 95.25% for RNN-IDS using a heuristic technique for intrusion detection. The IDS is initially trained using the gradient descent approach and then retrained and tested using the KDD20+ and KDDTest+ datasets. RNN-IDS outperforms various applied algorithms, including SVM, J48, NB, MLP, NB tree, RF, ANN, and RF tree. In ref. [26], a DoS detection design for 6LoWPAN was presented. This design incorporated an IDS into the ebbits framework created under the EU FP7 program. The paper [27] conducted an experimental investigation on intrusion detection utilizing DJ, DF, DNN, LSTM-RNN, DBN, GRU-RNN, and RNN ML and deep learning models. Four datasets, namely, KDD Cup 99, NSLKDD, CICIDS2017, and CICIDS, were used to evaluate the algorithms' effectiveness in detecting and classifying anomalies using 22 distinct evaluation measures. The experimental results indicate that when DL models, notably DBN, are combined with machine learning models, the detection accuracy increases by 5 to 10%. The study [26] set out to spot DoS attacks against CoAP and 6LoWPAN communication and to offer an IDS architecture for detecting and blocking attacks in an internet-connected environment. Jiang et al. [28] experimented with a mixed sampling-based intrusion detection method using the UNSW-NB15 and NSL-KDD datasets separately. OSS and SMOTE are combined to create balanced data for training models built with CNN, AlexNet, BiLSTM, LeNet-5, and RF algorithms. According to the statistical result, CNN-BiLSTM surpassed other classifiers with an accuracy of 83.58%. Hasan et al. [29] addressed many paradigmatic machine learning strategies for spotting intrusions into IoT networks that result in system failure. On the DS2OS data, five-fold cross-validation was performed using LR, SVM, DT, RF, and ANN. Cheng et al. [30] developed an HS-TCN for detecting anomalous communication in the Internet of Things. The experiment was conducted using two variants of the DS2OS dataset: data collected over eleven (11) days and the DS2OS-UA. For both adjusted datasets, the HS-TCN model outperforms the LSTM and SVM models. The author [31] suggested an intrusion detection approach founded on node usage analysis in 6LoWPAN. Sahu et al. [32] developed another machine learning-based method for detecting anomalies by combining LR and ANN classification methods. Both the ANN and LR achieve approximately 99.4% accuracy when the entire dataset is used and 99.99% accuracy when approximately 105,952 data points are omitted from the original data. In both situations, the data are divided into 75% and 25% subsets. In reference [33], an event-processing IDS architecture based on CEP technology was described. Kalis [34], an adaptive expert IDS that can supervise several protocols without modifying existing IoT software, is a thorough approach for detecting IoT intrusions. Reddy et al. [35] described a DNN architecture for securing the apps of future smart cities. The findings demonstrate that this DNN technique achieves an accuracy of approximately 98.26% when compared to standard machine learning classifiers with a variable number of layers and neurons. The authors [36] developed a novel method for
detecting network intrusions in IoT networks that is built on a conditional variational autoencoder with a specialized design that incorporates intrusion tags. To detect malicious activity, ref. [37] employed a single-class SVM equipped with characteristics such as memory utilization and CPU utilization. The study [38] examined the efficacy of many community detection methods for detecting P2P bots, particularly when only incomplete information is available. They demonstrated that the approach may be used with approximately half of the nodes presenting their connection graphs, with only a slight increase in detection mistakes. Table 1 summarizes the assessed studies on IoT security as per their datasets, models, best accuracy results, and gaps.

As seen from the review of the existing studies, the focus of some of the research is solely on detecting DDoS attacks. Other sizable attacks are not taken into account. Also, a simple ANN with only one hidden layer was deployed in one case with no optimization techniques applied. The majority of the work also lacks comparative analysis with other ML and DL models. In another study, it was difficult to replicate the research work. The implementation details of the machine learning model are absent, with obsolete datasets that do not reflect contemporary IoT attacks. Finally, one suggested approach is policy-based and relies on known attack signatures; hence, it will not be up-to-date with the most recent attack trends until signatures are upgraded.

Unlike the past efforts, we investigate intrusion detection for resource-constrained IoT devices in the network in this research. The difference is that our technique is divided into three stages. The first is hybrid dimensionality reduction, which involves using PCA and IG to choose the relevant attributes. In the second phase, the proposed GWO ensemble intrusion detection model includes two important engines: a traffic analyzer and a classification phase engine. In the third phase, voting is utilized to merge the base learners' probability averages.

2.1 Motivation for the intelligent threat model on the Internet of Things

As IoT grows, so does the number of cybersecurity threats that investigators must address and examine to develop a reliable IDS. Numerous forms of malevolent action attempt to compromise the privacy and security of IoT gadgets, and all smart appliances connected to the Internet are potentially vulnerable. For a variety of reasons, the IoT is vulnerable to cyberattacks. First, IoT appliances are frequently unattended (for example, sensors located in remote places), making it relatively uncomplicated for an assailant to gain physical access to them. Second, the vast majority of data transfers are wireless, making eavesdropping easier. Finally, most IoT devices have limited storage and computing capabilities [45]. Additional anti-virus protection, for example, cannot be deployed on IoT gadgets. Using numerous hacking tactics, hackers can disrupt or manipulate the functionality of smart gadgets [46]. In light of the physically insecure nature of a large number of IoT gadgets, some hacking approaches require active access to smart gadgets, making an attack more difficult but not impossible. Other attacks could be carried out remotely over the Internet. Table 2 shows the main kinds of attacks targeting smart devices.

The intrusion attacks can affect an IoT bot network comprised of unsecured IoT gadgets such as electrical gadgets, security systems, automobiles, thermostats, lights in home or commercial locations, speaker systems, and wall timers. These attacks give a cybercriminal the ability to take control of the sensors. Unlike traditional botnets, compromised IoT devices actively seek to propagate their malicious behavior to a growing range of gadgets. While a traditional bot network may consist of hundreds of bots, IoT bot malware is far larger in scope, involving a large number of connected gadgets [51]. For instance, on October 21, 2016, cybercriminals targeted a prominent DNS firm named Dyn. This attack was initiated by a massive flood of DNS lookup queries from millions of IP addresses [52]. The bot network must infect a significant number of devices linked to the Internet, including printers, camcorders, and other gadgets. This IoT bot network attack was initiated by malevolent software known as Mirai. Once infected with Mirai, computers continually search the Internet for susceptible gadgets and log in using the default username and password, attacking them with malicious programs. Researchers in the security field described how they targeted the Chrysler Jeep Cherokee at Black Hat 2015. By hacking the Jeep's IoT device and sensor network, one could remotely access the vehicle as it drove down the motorway [53].

The specific security challenges addressed in this research, which involves developing an IDS for the IoT using a hybrid approach of feature extraction via PCA, feature selection via IG, and parameter optimization using GWO for ensemble models, are related to the cybersecurity aspects of IoT environments. First, regarding vulnerabilities in IoT devices, it is important to note that these devices frequently have limited resources and may lack comprehensive security measures. The primary objective of the IDS suggested in this study is to identify and address vulnerabilities present in these devices, hence thwarting unauthorized access and control. Furthermore, it is imperative to periodically upgrade the firmware and software of IoT devices to ensure their security. The suggested approach has the potential to facilitate monitoring and ensure the timely implementation of changes. Authentication and access control play a vital role in safeguarding IoT systems, as they are responsible for ensuring that solely authorized individuals or devices are granted access. The proposed IDS has the potential to effectively detect and identify unauthorized access attempts.
utmost importance in facilitating real-time interactions and control inside IoT devices.

3.1 Data preprocessing

Normalization is a technique for scaling attributes in which the goal is to have all attribute values on the same scale. Normalization techniques include the standardized approach, min–max normalization, and z-score normalization [54, 55]. We selected the min–max normalizing technique since the majority of the features had a normal distribution and to prevent information from leaking into the test data.

3.2 Normalization technique

are used as the input set of attributes for the next dimensionality reduction stage. The author [58] describes the overall entropy "K" of a given dataset "D" as follows:

\[ K(D) = -\sum_{i=1}^{e} p_i \log_2 p_i \tag{2} \]

where "e" signifies the total number of classes, and "p_i" denotes the proportion of cases belonging to class i. The reduction in entropy is estimated for each feature using the following formula:

\[ IG(D, M) = K(D) - \sum_{w \in A} \frac{|D_{A,w}|}{|D|} K(D_w) \tag{3} \]
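As an illustration of Eqs. (2) and (3), the following minimal sketch computes the entropy of a label vector and the information gain of a single discrete feature. The function names `entropy` and `information_gain` and the toy arrays are assumptions made for this example, not the authors' implementation; continuous traffic features would first need to be discretized, which is not shown.

```python
import numpy as np


def entropy(labels):
    """Entropy K(D) of a label array in bits, as in Eq. (2)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(feature, labels):
    """Entropy reduction obtained by splitting on one discrete feature, as in Eq. (3)."""
    weighted = 0.0
    for value in np.unique(feature):
        mask = feature == value
        # |D_w| / |D| times the entropy of the subset with feature == value
        weighted += mask.mean() * entropy(labels[mask])
    return entropy(labels) - weighted


# Hypothetical toy data: two classes (0 = normal, 1 = intrusion).
labels = np.array([0, 0, 1, 1])
perfect = np.array(["a", "a", "b", "b"])  # separates the classes exactly
useless = np.array(["a", "b", "a", "b"])  # carries no class information

gains = {
    "perfect": information_gain(perfect, labels),
    "useless": information_gain(useless, labels),
}
# Features would then be ranked by decreasing gain.
ranking = sorted(gains, key=gains.get, reverse=True)
print(ranking)
```

Ranking features by this score and keeping the top-ranked ones corresponds to the feature selection step that precedes PCA in the described pipeline.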
3. Using Eq. (5), transform each vector $Y_k^{(i)}$ to have unit variance.

\[ \sigma_k^2 = \frac{1}{n} \sum_{i} \left( Y_k^{(i)} \right)^2 \tag{5} \]

4. Substitute each $Y_k^{(i)}$ with $Y_k^{(i)} / \sigma_k$.
5. Compute the covariance matrix $Cov_n$:

\[ Cov_n = \frac{1}{n} \sum_{i} Y^{(i)} \left( Y^{(i)} \right)^T \tag{6} \]

6. Calculate the eigenvectors and eigenvalues of $Cov_n$.
7. Sort the eigenvectors by decreasing eigenvalue and select the j eigenvectors with the greatest eigenvalues to produce S.
8. Using S and Eq. (7), convert the data to the new subspace.

\[ Y = S \times X \tag{7} \]

where X is a 1 × e vector representing one sample, and Y is the converted j × 1 sample in the new subspace.

The computational complexity of performing the specified PCA depends on the number of attributes F representing each data point:

\[ O(F^3) \tag{8} \]

In this study, PCA is utilized to reduce the dimensionality of the BoT-IoT and UNSW-NB15 datasets by compressing the attribute space to the ten (10) and nine (9) top-ranked features, respectively. To identify the most effective features, we employed information gain in our feature selection process, which quantifies the importance of each feature based on its ability to discriminate between different classes (e.g., normal and intrusions). Features with higher information gain were considered more effective in distinguishing between classes. The design principle of PCA is given in Table 4.

Table 4 Design principles of PCA
Parameter | Values
Parameter ranking | True
Num to select | 6
Threshold | 0.5
Variance | 1.832

Parameter ranking typically refers to the process of assessing and ranking the importance or influence of different parameters or hyperparameters on a machine learning model's performance. These parameters are settings or configurations that can be adjusted to influence how a model learns from data and makes predictions. In our research, the parameter ranking in the settings is set to true. The num to select parameter in PCA is set to the value 6. The threshold value is set to 0.5, and the variance is set to 1.832. The design principle revolves around finding a new set of orthogonal axes, called principal components, that capture the maximum variance in the data while reducing its dimensionality.

Ten (10) new features were selected from the BoT-IoT dataset, and nine (9) features were chosen from the UNSW-NB15 dataset; these are subsequently passed to the GWO-optimized ensemble models (RF, DT, MLP, and KNN). The information gain efficiently identifies the most
relevant features based on their contribution to the target variable, while PCA optimally captures the variance within the dataset to create a reduced set of orthogonal features. By combining these two methods, we achieve a balanced feature reduction approach that maximizes the preservation of informative features while minimizing computational overhead.

PCA aims to transform the original high-dimensional feature space into a lower-dimensional space while retaining as much of the variance in the IoT network traffic data as possible. This dimensionality reduction can lead to several benefits:

i. Curse of Dimensionality: High-dimensional IoT network traffic data can suffer from the "curse of dimensionality," where the number of features greatly exceeds the number of samples. This can lead to increased computational complexity, overfitting, and difficulty in visualization. PCA helps mitigate these issues by reducing the dimensionality.
ii. Noise Reduction: High-dimensional IoT network data often contain noise and irrelevant features. PCA helps remove or down-weight such noisy dimensions by identifying and emphasizing the dimensions with the most significant information.
iii. Improved Model Performance: Reducing dimensionality leads to faster training and inference times for machine learning models, as well as potentially reducing overfitting.

3.6 Handling the class imbalance problem

Assume there exists a dataset with features x and labels y. For each minority instance x_minority, its k-nearest neighbors from the minority class must be found. The distance metric used for finding neighbors (such as Euclidean distance) can vary. Denote the set of k-nearest neighbors as N(x_minority). For each neighbor n in N(x_minority), a synthetic instance x_synthetic is generated as in Eq. (9):

\[ x_{\text{synthetic}} = x_{\text{minority}} + \text{random\_number} \times \left( n - x_{\text{minority}} \right) \tag{9} \]

Here, random_number is a random value between 0 and 1, controlling the interpolation between x_minority and n. The formula in Eq. (9) is applied to each feature of x_minority and n to generate the corresponding feature of x_synthetic.

3.7 Optimization of the ensemble learning models (ELM) with gray wolf optimizer

The GWO methodology is a metaheuristic algorithm that replicates the leadership hierarchy and hunting method of gray wolves [61]. In the mathematical model of GWO, the best candidate solution is denoted by alpha (α). Beta (β) and delta (δ) denote the second- and third-best candidate solutions, respectively. The remaining candidate solutions are known as omega (ω). In GWO, the hunt is guided by α, β, and δ, and the ω candidates follow these three. To pursue prey, the pack first encircles it. Eqs. (10)–(13) are applied to mathematically model this encircling behavior.
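The leader-guided search described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard GWO coefficient definitions (a parameter `a` decreasing linearly from 2 to 0, B = 2a·r1 − a, D = 2·r2 with r1 and r2 uniform in [0, 1]), picks the three leaders from the current population at each iteration, and minimizes a toy sphere function rather than tuning ensemble parameters. The function name `gwo_minimize` and all argument names are hypothetical.

```python
import numpy as np


def gwo_minimize(objective, dim, bounds, n_wolves=20, max_iter=200, seed=0):
    """Minimize `objective` over a box using a basic gray wolf optimizer."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    wolves = rng.uniform(low, high, size=(n_wolves, dim))  # initial population Z_j
    fitness = np.array([objective(w) for w in wolves])

    best = int(np.argmin(fitness))
    best_pos, best_fit = wolves[best].copy(), float(fitness[best])

    for r in range(max_iter):
        # alpha, beta, delta: the three best wolves of the current population
        order = np.argsort(fitness)
        leaders = [wolves[order[k]].copy() for k in range(3)]
        a = 2.0 * (1.0 - r / max_iter)  # assumed linear decrease from 2 to 0

        for j in range(n_wolves):
            candidates = []
            for leader in leaders:
                B = 2.0 * a * rng.random(dim) - a    # coefficient vector B
                D = 2.0 * rng.random(dim)            # coefficient vector D
                E = np.abs(D * leader - wolves[j])   # distance to the leader
                candidates.append(leader - B * E)    # Z1, Z2, Z3
            # new position: average of the three leader-guided candidates
            wolves[j] = np.clip(np.mean(candidates, axis=0), low, high)

        fitness = np.array([objective(w) for w in wolves])
        best = int(np.argmin(fitness))
        if fitness[best] < best_fit:  # keep the best solution seen so far
            best_pos, best_fit = wolves[best].copy(), float(fitness[best])

    return best_pos, best_fit


def sphere(z):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(z ** 2))


best_z, best_f = gwo_minimize(sphere, dim=5, bounds=(-10.0, 10.0))
print(best_f)
```

In the setting of this paper, the objective would instead score a candidate set of ensemble parameters, for example by validation error, which is an assumption about how the optimizer is wired to the classifiers.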
and the delta may occasionally take part in the hunt. To model the hunting behavior of gray wolves mathematically, the alpha (the best candidate solution), beta (the second-best solution), and delta (the third-best solution) are assumed to have better knowledge of the likely prey position. The three best solutions found so far are therefore retained, and the other search agents must update their positions according to the positions of these best search agents. The position update of the wolves is given by Eq. (14):

$$\vec{Z}(r+1) = \frac{\vec{Z}_1 + \vec{Z}_2 + \vec{Z}_3}{3} \tag{14}$$

$$\vec{Z}_1 = \vec{Z}_\alpha - \vec{B}_1 \cdot \vec{E}_\alpha \tag{15}$$

$$\vec{Z}_2 = \vec{Z}_\beta - \vec{B}_2 \cdot \vec{E}_\beta \tag{16}$$

$$\vec{Z}_3 = \vec{Z}_\delta - \vec{B}_3 \cdot \vec{E}_\delta \tag{17}$$

where $\vec{Z}_\alpha$, $\vec{Z}_\beta$, and $\vec{Z}_\delta$ are the three best solutions at iteration r, $\vec{B}_1$, $\vec{B}_2$, and $\vec{B}_3$ are the coefficient vectors appearing in Eqs. (15)–(17), and $\vec{E}_\alpha$, $\vec{E}_\beta$, and $\vec{E}_\delta$ are given by Eqs. (18)–(20), respectively.

$$\vec{E}_\alpha = \left| \vec{D}_1 \cdot \vec{Z}_\alpha - \vec{Z} \right| \tag{18}$$

$$\vec{E}_\beta = \left| \vec{D}_2 \cdot \vec{Z}_\beta - \vec{Z} \right| \tag{19}$$

$$\vec{E}_\delta = \left| \vec{D}_3 \cdot \vec{Z}_\delta - \vec{Z} \right| \tag{20}$$

$\vec{D}_1$, $\vec{D}_2$, and $\vec{D}_3$ are given as in Eq. (13).

A final observation regarding GWO concerns the parameter that controls the exploration-exploitation trade-off. This parameter is updated in every iteration so that it decreases linearly from 2 to 0, following Eq. (21):

$$b = 2 - r \cdot \frac{2}{\mathrm{MaxIter}} \tag{21}$$

where MaxIter is the total number of optimization iterations allowed, and r is the current iteration number. The hunting and pursuit positions of gray wolves are required to be updated by binary {1, 0}. The gray wolf optimization pseudocode is described in Table 5.

Table 5 Pseudocode of gray wolf optimization
1 Initialize the population size s, the maximum number of iterations MaxIter, the coefficient parameter, and the D and B vectors
2 Randomly create an initial population $Z_j(r)$
3 Evaluate each search agent's fitness using $f(z_j)$
4 Determine $Z_\alpha$, $Z_\beta$, and $Z_\delta$, the 1st, 2nd, and 3rd best solutions
5 Repeat
6 For (j = 1; j ≤ s) do
7 Update each population agent by applying Eq. (21)
8 End for
9 Update the vectors $Z_\alpha$, $Z_\beta$, and $Z_\delta$ accordingly
10 Set r = r + 1
11 Until the termination criterion is met (r ≥ MaxIter)
12 Return the optimal solution $Z_\alpha$

We chose GWO to optimize the parameters of the ensemble algorithms because of three significant merits it has over other algorithms: exploration and exploitation, convergence speed, and constraint handling. GWO has gained considerable prominence among swarm intelligence methodologies due to characteristics such as fine-tuning parameters, simplicity and ease of use, scalability, and, most notably, its ability to converge quickly by maintaining the right balance between exploitation and exploration during the search. GWO exhibits a better balance between exploration (searching the solution space) and exploitation (exploiting promising solutions). It uses the concept of alpha, beta, delta, and omega wolves to strike this balance, which can lead to more efficient optimization compared to other algorithms. In some cases, GWO tends to converge to a global optimum faster than several other algorithms. The nature-inspired hunting behavior of gray wolves mimicked in GWO, such as encircling prey, can lead to more efficient exploration and faster convergence. GWO promotes diverse solution exploration due to its hierarchical structure and the hunting behavior of gray wolves. This can help avoid getting stuck in local optima and facilitate a more comprehensive search of the solution space.

In our research, GWO is utilized to optimize the parameters of RF, DT, MLP, and n for KNN. Gray wolf optimizer (GWO) is a nature-inspired optimization algorithm that simulates the hunting behavior of gray wolves to find optimal solutions. We used the GWO pseudocode to optimize the hyperparameters of the ensemble learning models: random forest, decision tree, multilayer perceptron (MLP), and K-nearest neighbor (KNN) [62]. The following is a high-level overview of how we integrated GWO with the ensemble models:

1. Initialize a population of gray wolves with random hyperparameter settings for the ensemble models.
2. Define a fitness function that evaluates the performance of the ensemble model with the given hyperparameters.
The fitness function used appropriate evaluation metrics.
3. In each iteration of the GWO loop, evaluate the fitness of each wolf (hyperparameter set) using the ensemble model. Update the positions of the alpha, beta, and delta wolves based on their fitness values. These wolves represent the best solutions found so far.
4. Update the positions of the other wolves using predefined formulas that simulate the hunting behavior of gray wolves. This step helps explore the search space efficiently.
5. Apply boundary constraints to ensure that hyperparameters remain within valid ranges for the ensemble models.
6. After a certain number of iterations or when a stopping criterion is met, select the best solution found so far based on fitness values.
7. Perform cross-validation to assess the performance of the ensemble model with the selected hyperparameters on a validation set.
8. If the new solution (hyperparameters) is better than the previous best solution, update the best solution.
9. Continue the optimization process until the stopping criterion is met.
10. Finally, return the best solution, which represents the optimal hyperparameters for the ensemble learning models.

By integrating GWO with ensemble models in this way, we effectively search for the best hyperparameters to maximize the ensemble's performance, improving its accuracy and effectiveness in real-world applications.

3.8 Mathematical formulation of the ensemble method for classification

Let {y(u)}, u = 1, …, m, be a random dataset of samples and their features with a mean of zero. Equation (22) shows the covariance matrix of y(u). Algorithm 1 summarizes the hybrid IG-PCA approach's selection procedure.

$$Z = \frac{1}{m-1} \sum_{u=1}^{m} y(u)\, y(u)^{U} \tag{22}$$

In PCA, the transformation function from y(u) to x(u) is calculated as follows:

$$x(u) = N^{U} y(u) \tag{23}$$

where N is the transformation matrix. PCA first solves the eigenvalue problem stated in Eq. (24).

$$\beta_j k_j = Z k_j \tag{24}$$

where $\beta_j$ signifies an eigenvalue of Z (say $\beta_1 > \beta_2 > \dots > \beta_m$), and $k_j$ is the corresponding eigenvector. The PCA is obtained using Eq. (25) as follows:

$$x_j(u) = k_j^{U} y(u), \quad j = 1, 2, \dots, m \tag{25}$$

The jth principal component is denoted by $x_j(u)$. The computation to project a new sample y(u) onto the principal space is given in Eq. (26). Let

$$\hat{y}(u) = \sum_{j=1}^{q} a_j^{U} y(u)\, a_j \tag{26}$$

where $A = \{a_j : a_j = k_j,\ j = 1, \dots, q\}$. Equation (27) computes the distance f between y(u) and $\hat{y}(u)$, which gives the projection error b of y(u):

$$b = f\left(y(u), \hat{y}(u)\right) \tag{27}$$

3.9 Ensemble model

Ensemble methods are effective ways of improving the prediction outcome of the overall model by developing multiple independent models and integrating them to provide results with improved accuracy [63]. Ensemble learning approaches include boosting, bagging, Bayesian parameter averaging, and stacking [64]. This work proposes a unique ensemble classifier to improve intrusion detection accuracy in IoT that employs RF, DT, MLP, and KNN learners. These algorithms were utilized in a voting algorithm and were combined using the average of probabilities method. To improve the performance of each of the models, GWO was used to optimize the parameters of each of the ensemble (RF, DT, MLP, and KNN) models.

Assume we have φ classifiers $A = \{A_1, A_2, \dots, A_\varphi\}$ and l labels $\{h_1, h_2, \dots, h_l\}$. According to the classifiers given above, φ = 4 and l = 2 (that is, non-attack and attack) for the datasets analyzed in this work. $A_j : Z^m \to [0, 1]^l$ is a classifier. It takes an object $y \in Z^m$ and returns a vector $[J_{A_j}(h_1|y), \dots, J_{A_j}(h_l|y)]$, where $J_{A_j}(h_i|y)$ represents the probability given by $A_j$ to the assumption that object y belongs to class $h_i$. Let $n_i$ be the average of the probabilities provided by the different classifiers for class $h_i$.
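The averaging of class probabilities described in Sect. 3.9, combined with the usual rule of assigning the class with the highest mean, can be illustrated with a small sketch. This is a hedged example rather than the authors' code: the function name `soft_vote`, the class ordering (index 0 for non-attack, index 1 for attack), and the probability values are hypothetical.

```python
def soft_vote(prob_vectors):
    """Average per-classifier probability vectors and pick the class with the highest mean.

    prob_vectors: one probability vector per base classifier, all for the same object.
    Returns (index of the winning class, list of mean probabilities per class).
    """
    n_classifiers = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    means = [
        sum(vector[i] for vector in prob_vectors) / n_classifiers
        for i in range(n_classes)
    ]
    winner = max(range(n_classes), key=lambda i: means[i])
    return winner, means


# Hypothetical outputs of four base learners (e.g., RF, DT, MLP, KNN) for one flow,
# ordered as [P(non-attack), P(attack)].
probabilities = [
    [0.90, 0.10],
    [0.40, 0.60],
    [0.30, 0.70],
    [0.45, 0.55],
]
winning_class, mean_probabilities = soft_vote(probabilities)
print(winning_class, mean_probabilities)
```

In this made-up case three of the four learners lean toward the attack class, yet the averaged probabilities favor non-attack because one learner is very confident. This is the practical difference between averaging probabilities and counting hard votes.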
Let N denote the collection of mean probabilities for each class $(n_1, n_2, \dots, n_l)$. Object y is assigned to the class in N with the highest mean, i.e., y is allocated to class g if and only if

$$n_g = \max N \tag{29}$$

The proposed ensemble approach's performance is evaluated using two well-known intrusion detection datasets that are well suited to IoT, namely, BoT-IoT and UNSW-NB15.

3.10 Ensemble learning strategy

Step 2: Individual Learning Algorithms Choose the set of individual learning algorithms to ensemble: RF, DT, MLP, and KNN.

Step 3: Train Individual Models We trained each selected learning algorithm (RF, DT, MLP, and KNN) on training data from both datasets (BoT-IoT and UNSW-NB15). This gave us a set of trained models, each capable of making intrusion detection predictions.

Step 4: Probability Prediction We use each trained model to make predictions on our testing data. Instead of just obtaining the final prediction label, we are interested in the predicted probabilities of intrusion (class attack) for each instance.

Step 5: Ensemble Voting For each instance in our testing data, we calculated the average of the predicted probabilities from all the individual models. This average can be computed for class 1 (intrusion).

Step 6: Evaluation We evaluated the performance of our voting ensemble models using standard metrics such as accuracy, DR, precision, ROC, and FAR on our testing data. We also compared these results with the performance of the individual models to assess the effectiveness of the ensemble.

3.11 Benefits of the proposed voting-based ensembles model

• Reduced Bias Combining multiple models can help reduce bias present in any individual model.
• Improved Generalization Ensembles often perform better on unseen data compared to individual models.
• Robustness Ensemble methods are more robust against overfitting, especially if the individual models are diverse.
• Model Diversity Using different learning algorithms ensures that the ensemble captures different aspects of the data.

4.1 Metrics used for performance evaluation

This study evaluated the performance of the proposed system using multiple performance measures, including precision, recall, detection rate (DR), and accuracy (Acc), as well as the time required to create the model. These metrics are defined below in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Detection rate (DR): The DR is the proportion of identified attacks relative to the total number of attack events in the dataset. Equation (30) can be used to estimate DR.

$$\mathrm{DR} = \frac{TP}{TP + FN} \tag{30}$$

Accuracy is the measure of the classifier's ability to correctly classify an object as normal or as an attack. The accuracy is defined by Eq. (31).

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{31}$$

Precision is the ratio of correctly predicted positives to the total number of predicted positives. It is considered a measure of the classifier's exactness. A low value indicates a high number of FP. The precision is computed using Eq. (32).
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{32}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{33}$$

Table 6 Attack and normal behavior statistics from the BoT-IoT dataset
Attack and normal behavior | Values

Table 7 UNSW-NB15 data records
Feature type | Number of records
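The metric definitions in Eqs. (30)–(33) can be checked with a few lines of code. The sketch below is illustrative only: the function name `ids_metrics` and the confusion-matrix counts are hypothetical, and the FAR formula FP / (FP + TN) is the commonly used definition, assumed here because the paper does not state its FAR equation in this section.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute intrusion detection metrics from confusion-matrix counts.

    Assumes every denominator is non-zero; no zero-division handling is included.
    """
    return {
        "detection_rate": tp / (tp + fn),             # Eq. (30)
        "accuracy": (tp + tn) / (tp + fn + fp + tn),  # Eq. (31)
        "precision": tp / (tp + fp),                  # Eq. (32)
        "recall": tp / (tp + fn),                     # Eq. (33)
        "far": fp / (fp + tn),                        # assumed definition of FAR
    }


# Hypothetical counts: 100 attack flows and 100 normal flows.
metrics = ids_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these definitions, DR and recall are the same quantity, since both divide the detected attacks by all attack events.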
The oversampling without replacement method was used to divide each dataset's selected samples into two distinct subgroups for training and testing. As a result, the testing subset is reserved for assessing the model, so its performance on previously unseen data can be estimated accurately. In this setting, generating subgroups for cross-validation, which can be time-consuming with large datasets, is not essential. Two tests were conducted to evaluate the efficiency of the presented technique. The following evaluation metrics were used according to the confusion matrix shown in Table 8: precision, accuracy, detection rate, ROC, and FAR. The authors of [66] explain the mathematical computations for the measurement methods used.

Here, TP is the number of attacks recognized as attacks, TN is the number of normal patterns identified as normal, FN is the number of attacks identified as normal patterns, and FP is the number of normal patterns identified as attacks.

5.1 Experimental analysis based on BoT-IoT dataset

The BoT-IoT dataset was used in the first experiment. To begin, vital attributes were determined by computing the IG entropy for every feature and ranking the features in descending order. From the original thirty-one (31) potential features, ten (10) were chosen for the following step. Using IG alone was observed to produce several false alarms. To overcome this constraint, a second reduction phase based on the selected attributes was performed using PCA as feature extraction. To avoid bias, the PCA was fitted using only the training set, ensuring that no information from the test data leaked into the training dataset. If the complete dataset is used to construct the PCAs, the model will not perform as well when genuinely new, unseen data are introduced. Similarly, calculating PCAs on the two sets independently will result in two mismatched sets of data. We cannot build a classifier in one domain and then apply it to another. The same characteristics from the training set were used to translate the testing dataset into the same feature space using the batch-filtering method. The new datasets were used to assess the efficiency of the presented method, so five separate classifiers were built using the training data and evaluated on the testing dataset. On the BoT-IoT dataset, Table 9 compares

5.2 Experimental analysis based on UNSW-NB15 dataset

Additional tests on the UNSW-NB15 dataset were carried out to demonstrate the efficiency of the suggested feature dimensionality reduction (IG + PCA) GWO ensemble model. As in the first experiment, IG and PCAs were computed during the preprocessing step of these datasets. In this second experiment, nine (9) candidate features were chosen from UNSW-NB15 by computing the entropy of the IG and, subsequently, applying PCA feature extraction. Table 10 shows the best results obtained using the dimension reduction approaches on the dataset. Our proposed model produces promising classification results, as seen in the table. Table 10 compares the performance of the IG + PCA-RF, IG + PCA-DT, IG + PCA-MLP, IG + PCA-KNN, and the proposed GWO ensemble model on the UNSW-NB15 dataset. The voting GWO ensemble technique outperforms all other approaches, with an accuracy attaining 100%, DR of 99.99%, precision of 99.59%, ROC of 99.40%, and FAR of 1.15.

5.3 Multiclass experimental analysis on the BoT-IoT dataset

The initial step was the computation of the IG entropy for each feature, with the resulting values arranged in descending order to identify the most significant features. Out of the initial set of thirty-one (31) possible features, a subset of ten (10) features was selected for the subsequent stage. Using IG in isolation was observed to generate several false alarms. To address this limitation, a secondary reduction phase was implemented, using the selected features and employing PCA as a feature extraction technique. To mitigate bias, the PCA was conducted exclusively on the training dataset, to prevent any potential leakage of information from the test data into the training set.

Table 11 shows the performance of the proposed voting GWO ensemble model on BoT-IoT in a multiclass scenario. The results indicate that, on DDoS HTTP, the voting GWO ensemble model achieved an accuracy of 99.87%, DR of 99.89%, precision of 99.60%, ROC of 99.56%, and FAR of 1.20.
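The IG ranking step used in these experiments (compute an entropy-based information gain per feature, sort in descending order, keep the top few) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names, the toy feature names (`proto`, `flag`), and the toy labels are hypothetical, and the features are treated as categorical, so continuous features would need to be discretized first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy training data: 1 = attack, 0 = normal (hypothetical values).
train_labels = [1, 1, 0, 0]
train_columns = {
    "proto": ["udp", "udp", "tcp", "tcp"],  # perfectly separates the classes
    "flag": [0, 1, 0, 1],                   # carries no class information
}
print(rank_features(train_columns, train_labels, k=1))
```

To stay consistent with the leakage concern raised in this section, the ranking would be computed on the training split only, and the selected feature names would then be applied unchanged to the test split.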
Table 9 The performance of standard ML approaches and the proposed voting ensemble model on BoT-IoT
Table 10 The performance of standard ML techniques and voting ensemble model on the UNSW-NB15
5.4 Multiclass experimental analysis on the UNSW-NB15

Further experiments were conducted on the UNSW-NB15 dataset to showcase the effectiveness of the proposed ensemble model, which combines feature dimensionality reduction techniques (IG + PCA) with the GWO. As in the initial experiment, the datasets underwent preprocessing in which IG and PCAs were generated. In this experiment, a total of nine (9) candidate features were selected from the UNSW-NB15 dataset by evaluating the entropy of the information gain (IG) and subsequently applying PCA for feature extraction. Table 12 shows the performance of the proposed voting GWO ensemble model on BoT-IoT in a multiclass scenario. The results indicate that, on reconnaissance, the voting GWO ensemble model achieved an accuracy of 99.91%, DR of 99.75%, precision of 97.08%, ROC of 98.80%, and FAR of 1.80.

5.5 Evaluation and comparison of current datasets suitability for IoT network

To determine the essential qualities of a valuable and realistic dataset for an IoT network, some of the current IDS datasets were evaluated in this part.
5.5.1 DARPA

This dataset was created for the purpose of analyzing network security. Researchers have criticized DARPA because of problems with the artificial injection of attacks as well as benign traffic. DARPA covers tasks such as sending and receiving mail, surfing the web, sending and receiving files via FTP, using telnet to log into distant systems and carry out work, sending and receiving IRC messages, and remotely monitoring the router using SNMP. The dataset comprises various types of attacks, including but not limited to denial of service (DOS), password guessing, buffer overflow, remote file transfer protocol (FTP), syn flood, network mapper (Nmap), and rootkit. Regrettably, the dataset does not accurately reflect real-world network traffic in IoT and exhibits anomalies such as the absence of false positives. Furthermore, it is no longer current enough to provide a comprehensive assessment of IDSs with respect to contemporary network infrastructures and attack modalities. Furthermore, the absence of factual attack data records is evident [67].

turn, results in testing outcomes that are biased, as reported in reference [68]. NSL-KDD was developed as a means of addressing certain limitations of the KDD dataset [68], which had been identified in previous research [67].

5.5.3 CDX

The CDX dataset demonstrates how network warfare competitions can be used to create contemporary labeled datasets. The dataset reveals that attackers used widely recognized attack tools such as Nikto, Nessus, and WebScarab to conduct automated reconnaissance and attacks. Benign network traffic encompasses essential services such as web browsing, email communication, DNS queries, and other necessary functions. According to [69], CDX has limitations in terms of traffic diversity and volume, although it can still serve as a tool for testing IDS alert rules.

5.5.4 Kyoto
To generate the dataset, three distinct services, namely, OpenSSH, Apache web server, and Proftp utilizing auth/ident on port 113, were deployed to gather information from a honeypot network via netflow. Certain types of traffic, including auth/ident, ICMP, and irc traffic, may produce side effects that are neither entirely benign nor malicious. In addition, the dataset includes alert traffic that is both unidentified and uncorrelated. This labeled dataset is deemed more realistic; however, its deficiency in terms of the volume and variety of attacks is a conspicuous limitation, as noted in reference [71].

5.5.6 ISCX2012

The authors have presented a valuable recommendation for producing realistic and useful IDS evaluation datasets through a dynamic approach. The dataset in question was generated using this approach. The authors' methodology has two distinct components, denoted the alpha and beta profiles. The alpha profile executes multiple stages of attack scenarios to produce the anomalous segment of the dataset. The beta profile, a benign traffic generator, produces authentic network traffic accompanied by ambient noise. Empirical data are used to construct profiles that simulate authentic traffic for various protocols such as HTTP, SMTP, SSH, IMAP, POP3, and FTP. The dataset produced by this methodology comprises network traces that include complete packet payloads and pertinent profiles. Nevertheless, the dataset does not cover novel network protocols: a significant proportion of contemporary network traffic, approximately 70%, is HTTPS, and no traces of HTTPS are present in the dataset. Furthermore, the distribution of the simulated attacks is not grounded in empirical data [72]. Table 13 shows some popular realistic datasets for IoT networks.

As can be seen, only the proposed datasets used in this study meet all criteria. Tables 13 and 14 list and explain the datasets' flaws and strengths based on relevant documents and research, as well as their suitability for IoT networks. Some feature values are not presented as a result of inadequate documentation and a lack of metadata. Here, we evaluated the proposed model using two well-known datasets: UNSW-NB15 and BoT-IoT. In contrast with the datasets used in several existing models, which do not accurately reflect contemporary attacks on IoT networks and do not adhere to IoT protocol requirements, these chosen datasets are appropriate and realistic for IoT network traffic.

6.1 Comparison with the existing studies

In this section, we compare the performance of the proposed GWO ensemble model with the existing state-of-the-art models in Table 15. The majority of the state-of-the-art models concentrated on the NSLKDD and KDD Cup 99 datasets. These are unrealistic intrusion detection datasets for the evaluation of IoT systems. Such models are unsuccessful in practical use because the datasets used to train and evaluate them are non-representative. On the other hand, several existing techniques address these issues but perform poorly in terms of accuracy, DR, precision, ROC, and FAR, preventing them from being implemented in commercial systems. It is also worth mentioning that the existing state-of-the-art models paid no attention to feature dimensionality; this stage of dimensionality reduction is regarded as the most crucial stage. This phase is particularly time- and labor-intensive. This paper addressed the feature dimensionality phase by proposing a hybridized IG + PCA for dimensionality reduction and provides a novel GWO ensemble model for classification. Additionally, the proposed ensemble model was evaluated on the realistic BoT-IoT and UNSW-NB15 datasets, which makes it suitable for commercial and industrial applications. As shown in Fig. 3, the best state-of-the-art model provides 100% accuracy on the BoT-IoT data, while the ROC and F-measure were disregarded. On the comparable BoT-IoT data, the proposed voting GWO ensemble model achieved an accuracy of 99.98%, DR of 99.97%, precision of 99.94%, ROC of 99.99%, and FAR of 1.30.

6.2 Computational compatibility across IoT devices

When designing a machine learning model for intrusion detection in IoT environments, it is important to consider the computational compatibility of the proposed model, especially given the heterogeneity in computational power among IoT devices. A model that works well on high-power devices might struggle or be impractical to implement on resource-constrained IoT devices. Imagine a scenario where our proposed model is deployed for real-time anomaly detection in a smart city environment, where various types of IoT devices are used, ranging from resource-constrained sensors to more powerful edge devices. In this scenario, the lightweight nature of our voting GWO ensemble model enables seamless integration across these devices. Resource-intensive tasks are offloaded to devices with higher computational power, while less resource-intensive tasks are managed by lower-powered devices. Our model's architecture is designed to dynamically adjust its computational requirements based on the available
Table 13 A comparative analysis of the datasets currently accessible for detecting attacks in IoT
Table 14 Summary of representative (realistic) and non-representative (non-realistic) datasets for IoT
Dataset/authors | Traffic creation year | Public availability | Attack traffic | Normal traffic | Realistic network traffic for IoT
resources, ensuring effective and efficient operation across the heterogeneous IoT landscape.

6.3 Transferability of the proposed research to real-world IoT applications

Our research is designed with a strong focus on practical applicability in real-world IoT environments. The key points highlighting the transferability of our research to real-world IoT applications are as follows:

a. IoT-Centric Approach We developed our intrusion detection system with a deep understanding of the unique characteristics and challenges of IoT networks. This approach ensures that our research is directly relevant to the specific requirements and constraints of IoT applications.
b. Dataset Selection We used datasets, such as BoT-IoT and UNSW-NB15, that are representative of real-world IoT network traffic and intrusions. This dataset selection ensures that our research is grounded in the realities of IoT security.
c. Hybrid Approach Our research combines feature extraction via principal component analysis (PCA), feature selection via IG, and GWO-based ensemble models.
[Figure: bar chart with one bar each for references [37], [18], [36], [14], [35], and [38]; axis scale from 0 to 120]
This hybrid approach is designed to enhance the robustness and effectiveness of intrusion detection in real-world IoT scenarios.
d. Generalization We conducted experiments and evaluations on multiple datasets to ensure the generalizability of our proposed model to diverse IoT applications. Our research demonstrates the adaptability and transferability of our approach across various IoT contexts.
e. Performance Metrics We evaluated our intrusion detection system using well-established performance metrics, such as accuracy, DR, precision, and FAR. These metrics reflect the real-world effectiveness of our approach in identifying and mitigating security threats.
f. Scalability We addressed the scalability challenges often encountered in IoT environments, ensuring that our research can handle growing numbers of devices and data volumes while maintaining effectiveness.
g. Practical Deployment Considerations We discussed the practical considerations of deploying our intrusion detection system in real-world IoT applications, including the optimization of model parameters and the importance of network segmentation.
h. Security Challenges Our research explicitly addresses a range of security challenges and threats in IoT environments, making it directly applicable to scenarios where IoT security is a concern.

This research is built on a foundation that prioritizes real-world relevance and practicality. We have conducted experiments and evaluations that demonstrate the effectiveness and transferability of our IDS to various IoT applications. By addressing the unique challenges of IoT security and employing a hybrid approach that combines feature extraction, feature selection, and optimization techniques, we aim to provide a solution that can be readily applied in real-world IoT environments.

6.4 Threats to validity

The main threat to validity is random sampling, which makes it difficult to replicate the exact experiment. To validate the reliability of the suggested approach, the experiments were repeated on two separate realistic IoT datasets with a substantial sample size. Finally, while the presented approach performed well in binary classification, it deserves additional investigation on multiclass classification problems.

7 Conclusion and future work

This paper proposes a novel voting GWO ensemble learning model for the detection of attacks in an IoT environment. The suggested system successfully detects various forms of IoT threats by leveraging the feature set retrieved from the IoT ecosystem. The strength of this paper lies in the voting GWO ensemble model, which is the first of its kind, the hybridization of IG + PCA for dimensionality reduction, and the use of realistic datasets that reflect real-time attacks in the IoT context. To construct a successful ensemble IDS for detecting IoT attacks, a collection of relevant features was selected. The experimental findings show that the detection accuracy is increased in the voting GWO ensemble model in the suggested framework using the average probability technique. Our experimental results indicate that our proposed voting ensemble model outperforms other ML and DL approaches from earlier studies, attaining an overall accuracy of 100%, DR of 99.99%, precision of 99.59%, ROC of 99.40%, and FAR of 1.15 on the UNSW-NB15 dataset. This indicates that our presented method will be extremely beneficial in designing contemporary IDS for the IoT environment. The suggested model will be extended in the future to incorporate multiple class classification problems. Also,
the deep learning model to classify the additional forms of 8. Kelton, A.P., Papa, J.P., Lisboa, C.O., Munoz, R., De, V.H.C.:
attacks may be considered in the future work. Internet of Things: a survey on machine learning-based intrusion
detection approaches. Comput. Netw. 151, 147–157 (2019). https://
Authors’ contributions Authors contributed equally. doi.org/10.1016/j.comnet.2019.01.023
Funding Open access funding provided by Institute for Energy Technology.

Data availability The BoT-IoT dataset is available at https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php. The UNSW-NB15 dataset is described in N. Moustafa and J. Slay, “UNSW-NB15: A comprehensive dataset for network intrusion detection systems (UNSW-NB15 network dataset),” 2015 Mil. Commun. Inf. Syst. Conf. MilCIS 2015 - Proc., 2015, doi: https://doi.org/10.1109/MilCIS.2015.7348942, which is also listed in the reference list as [14].

Declarations

Conflict of interest The authors do not have any financial or non-financial interests that are directly or indirectly related to the work submitted for publication.

Ethical approval The authors comply with the highest level of ethical standards while preparing the manuscript.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Islam, N., et al.: Towards Machine learning based intrusion detection in IoT networks. Comput. Mater. Contin. 69(2), 1801–1821 (2021). https://doi.org/10.32604/cmc.2021.018466
2. Rahman, M.A., Asyhari, A.T.: The emergence of Internet of things (IoT): connecting anything, anywhere. Computers 8(2), 8–11 (2019). https://doi.org/10.3390/computers8020040
3. Lin, H., Hu, J., Wang, X., Alhamid, M.F., Piran, M.J.: Toward secure data fusion in industrial IoT using transfer learning. IEEE Trans. Ind. Inform. 17(10), 7114–7122 (2021). https://doi.org/10.1109/TII.2020.3038780
4. Farsi, M., Daneshkhah, A., Hosseinian-Far, H., Jahankhani, A.: Digital Twin Technologies and Smart Cities. Springer, Berlin/Heidelberg, Germany (2020)
5. Zhao, K., Ge, L.: A survey on the Internet of things security. In: Proceedings - 9th International Conference on Computational Intelligence and Security, CIS 2013, pp. 663–667 (2013). https://doi.org/10.1109/CIS.2013.145
6. Yan, Z., Zhang, P., Vasilakos, A.V.: A survey on trust management for Internet of Things. J. Netw. Comput. Appl. 42, 120–134 (2014). https://doi.org/10.1016/j.jnca.2014.01.014
7. Saheed, Y.K., Babatunde, A.O.: Genetic algorithm technique in program path coverage for improving software testing. Afr. J. Comput. ICT 7(5), 151–158 (2014)
9. Saheed, Y.K., Misra, S., Chockalingam, S.: Autoencoder via DCNN and LSTM models for intrusion detection in industrial control systems of critical infrastructures. In: 2023 IEEE/ACM 4th Int. Work. Eng. Cybersecurity Crit. Syst. (EnCyCriS), Melbourne, Aust., pp. 9–16 (2023). https://doi.org/10.1109/EnCyCriS59249.2023.00006
10. Alharbi, S., Rodriguez, P., Maharaja, R., Iyer, P., Bose, N., Ye, Z.: FOCUS: a fog computing-based security system for the Internet of Things. (2018)
11. Pajouh, H.H., Javidan, R., Khayami, R., Dehghantanha, A., Choo, K.K.R.: A two-layer dimension reduction and two-tier classification model for anomaly-based intrusion detection in IoT backbone networks. IEEE Trans. Emerg. Top. Comput. 7(2), 314–323 (2019). https://doi.org/10.1109/TETC.2016.2633228
12. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. no. Cisda, pp. 1–6 (2009)
13. Zhang, H., Wu, C.Q., Gao, S., Wang, Z., Xu, Y., Liu, Y.: An effective deep learning based scheme for network intrusion detection. In: 2018 24th Int. Conf. Pattern Recognit., pp. 682–687 (2018)
14. Moustafa, N., Slay, J.: UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Mil. Commun. Inf. Syst. Conf. MilCIS 2015 - Proc. (2015). https://doi.org/10.1109/MilCIS.2015.7348942
15. Koroniotis, N., Moustafa, N., Sitnikova, E.: Towards Developing Network Forensic Mechanism for Botnet Activities in the IoT Based on Machine Learning Techniques. Springer International Publishing
16. Kolias, C., Kambourakis, G., Stavrou, A., Gritzalis, S.: Intrusion detection in 802.11 Networks: Empirical Evaluation of Threats and a Public Dataset. no. c, pp. 1–24 (2015). https://doi.org/10.1109/COMST.2015.2402161
17. Saheed, Y.K., Usman, A.A., Sukat, F.D., Abdulrahman, M.: A novel hybrid autoencoder and modified particle swarm optimization feature selection for intrusion detection in the internet of things network. Front. Comput. Sci. 5, 1–13 (2023). https://doi.org/10.3389/fcomp.2023.997159
18. Amin, S.O., Siddiqui, M.S., Hong, C.S., Choe, J.: A novel coding scheme to implement signature based IDS in IP based sensor networks. In: 2009 IFIP/IEEE Int. Symp. Integr. Netw. Manag. IM 2009, pp. 269–274 (2009). https://doi.org/10.1109/INMW.2009.5195973
19. Abubakar, A., Pranggono, B.: Machine learning based intrusion detection system for software defined networks. In: 2017 Seventh International Conference on Emerging Security Technologies, pp. 138–143 (2017)
20. Roy, B., Cheung, H.: A deep learning approach for intrusion detection in internet of things using bi-directional long short-term memory recurrent neural network. In: 2018 28th Int. Telecommun. Networks Appl. Conf. ITNAC 2018, pp. 1–6 (2019). https://doi.org/10.1109/ATNAC.2018.8615294
21. Le, A., Loo, J., Luo, Y., Lasebae, A.: Specification-based IDS for securing RPL from topology attacks. IFIP Wirel. Days 1(1), 4–6 (2011). https://doi.org/10.1109/WD.2011.6098218
22. Bertino, E.: Botnets and Internet of Things Security. Computer (Long Beach Calif.), pp. 76–79 (2017)
23. Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., Razaque, A.: Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory 101, 102031 (2020). https://doi.org/10.1016/j.simpat.2019.102031
24. Li, Z., Batta, P., Trajković, L.: Comparison of Machine Learning Algorithms for Detection of Network Intrusions. pp. 4248–4253 (2018). https://doi.org/10.1109/SMC.2018.00719
25. Ayyaz-ul-haq, Q., Larijani, H., Ahmad, J.: A heuristic intrusion detection system for Internet-of-Things (IoT). In: Arai, K., Bhatia, R., Kapoor, S. (eds.) Intelligent Computing. CompCom 2019. Advances in Intelligent Systems and Computing. Springer Cham, pp. 86–98 (2019)
26. Böhm, A., Jonsson, M., Uhlemann, E.: Performance comparison of a platooning application using the IEEE 802.11p MAC on the control channel and a centralized MAC on a service channel. Int. Conf. Wirel. Mob. Comput. Netw. Commun. 545–552 (2013). https://doi.org/10.1109/WiMOB.2013.6673411
27. Elmasry, W., Akbulut, A., Zaim, A.H.: Empirical study on multiclass classification-based network intrusion detection. Comput. Intell. 35(4), 919–954 (2019). https://doi.org/10.1111/coin.12220
28. Jiang, K., Wang, W., Wang, A., Wu, H.: Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access 8(3), 32464–32476 (2020). https://doi.org/10.1109/ACCESS.2020.2973730
29. Hasan, M., Islam, M., Zarif, I.I., Hashem, M.M.A.: Internet of things attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Internet Things 7, 100059 (2019). https://doi.org/10.1016/j.iot.2019.100059
30. Cheng, Y., Xu, Y., Zhong, H., Liu, Y.: Leveraging Semi-supervised Hierarchical Stacking Temporal Convolutional Network for Anomaly Detection in IoT Communication, vol. 4662, no. c (2020). https://doi.org/10.1109/JIOT.2020.3000771
31. Lee, T.H., Wen, C.H., Chang, L.H., Chiang, H.S., Hsieh, M.C.: A lightweight intrusion detection scheme based on energy consumption analysis in 6LowPAN. In: Advanced Technologies, Embedded and Multimedia for Human-centric Computing (2014). https://doi.org/10.1007/978-94-007-7262-5_137
32. Sahu, N.K., Mukherjee, I.: Machine learning based anomaly detection for IoT network: (Anomaly detection in IoT network). In: 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), no. Icoei, pp. 787–794 (2020). https://doi.org/10.1109/ICOEI48184.2020.9142921
33. Chen, J., Chen, C.: Design of complex event-processing IDS in internet of things. In: Proc. - 2014 6th Int. Conf. Meas. Technol. Mechatronics Autom. ICMTMA 2014, pp. 226–229 (2014). https://doi.org/10.1109/ICMTMA.2014.57
34. Midi, D., Rullo, A., Mudgerikar, A., Bertino, E.: Kalis - a system for knowledge-driven adaptable intrusion detection for the Internet of Things. In: Proc. - Int. Conf. Distrib. Comput. Syst., pp. 656–666 (2017). https://doi.org/10.1109/ICDCS.2017.104
35. Karunkumar, D., Himansu, R., Behera, S., Nayak, J.: Deep neural network based anomaly detection in Internet of Things network traffic tracking for the applications of future smart cities. no. July, pp. 1–26 (2020). https://doi.org/10.1002/ett.4121
36. Lopez-Martin, M., Carro, B., Sanchez-Esguevillas, A., Lloret, J.: Conditional variational autoencoder for prediction and feature recovery applied to intrusion detection in iot. Sensors (Switzerland) (2017). https://doi.org/10.3390/s17091967
37. Guller, M.: Big data analytics with Spark: A practitioner’s guide to using Spark for large scale data analysis. Apress (2015)
38. Joshi, H.P., Bennison, M., Dutta, R.: Collaborative botnet detection with partial communication graph information. In: 2017 IEEE 38th Sarnoff Symp. (2017). https://doi.org/10.1109/SARNOF.2017.8080397
39. Soe, Y.N., Feng, Y., Santosa, P.I., Hartanto, R., Sakurai, K.: A sequential scheme for detecting cyber attacks in IoT environment. In: Proc. - IEEE 17th Int. Conf. Dependable, Auton. Secur. Comput. IEEE 17th Int. Conf. Pervasive Intell. Comput. IEEE 5th Int. Conf. Cloud Big Data Comput. 4th Cyber Sci., vol. 324, pp. 238–244 (2019). https://doi.org/10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00051
40. Soe, Y.N., Santosa, P.I., Hartanto, R.: DDoS attack detection based on simple ANN with SMOTE for IoT environment. In: Proc. 2019 4th Int. Conf. Informatics Comput. ICIC 2019, pp. 0–4 (2019). https://doi.org/10.1109/ICIC47613.2019.8985853
41. Le, H.V., Ngo, Q.D., Le, V.H.: Iot Botnet detection using system call graphs and one-class CNN classification. Int. J. Innov. Technol. Explor. Eng. 8(10) (2019)
42. Kumar, A., Lim, T.J.: EDIMA: early detection of IoT malware network activity using machine learning techniques. In: IEEE 5th World Forum Internet Things, WF-IoT 2019 - Conf. Proc., pp. 289–294 (2019). https://doi.org/10.1109/WF-IoT.2019.8767194
43. Xu, C., Shen, J., Du, X., Zhang, F.: An intrusion detection system using a deep neural network with gated recurrent units. IEEE Access PP(c), 1 (2018). https://doi.org/10.1109/ACCESS.2018.2867564
44. Chaudhary, P., Gupta, B.B.: DDoS detection framework in resource constrained internet of things domain. In: 2019 IEEE 8th Glob. Conf. Consum. Electron. GCCE 2019, pp. 675–678 (2019). https://doi.org/10.1109/GCCE46687.2019.9015465
45. Khraisat, A., Gondal, I., Vamplew, P., Kamruzzaman, J., Alazab, A.: A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks. Electron (2019). https://doi.org/10.3390/electronics8111210
46. Alazab, A., Abawajy, J., Hobbs, M., Layton, R.: Crime Toolkits: The Productisation of Cybercrime (2013). https://doi.org/10.1109/TrustCom.2013.273
47. Singh, J., Pasquier, T., Bacon, J., Ko, H., Eyers, D.: Twenty security considerations for cloud-supported Internet of Things. vol. 4662, no. c, pp. 1–16 (2015). https://doi.org/10.1109/JIOT.2015.2460333
48. Adeyiola, A.Q., Saheed, Y.K., Misra, S., Chockalingam, S.: Metaheuristic firefly and C5.0 algorithms based intrusion detection for critical infrastructures. In: 2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI), pp. 1–7 (2023). https://doi.org/10.1109/ICAPAI58366.2023.10193917
49. Kolias, C., Kambourakis, G., Stavrou, A., Voas, J.: DDoS in the IoT: Mirai and other botnets. Computer (Long Beach Calif.) 50(7), 80–84 (2017). https://doi.org/10.1109/MC.2017.201
50. Abomhara, M., Køien, G.M.: Cyber security and the internet of things: vulnerabilities, threats, intruders. 4, 65–88 (2015). https://doi.org/10.13052/jcsm2245-1439.414
51. Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B.: Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Futur. Gener. Comput. Syst. 100, 779–796 (2019). https://doi.org/10.1016/j.future.2019.05.041
52. Mansfield-devine, S., Security, N.: DDoS goes mainstream: attacks could make this threat an organisation’s biggest nightmare. Netw. Secur. 2016(11), 7–13 (2016). https://doi.org/10.1016/S1353-4858(16)30104-0
53. Greenberg, A.: Hackers remotely kill a jeep on the highway - with me in it. Wired, 7(21) (2015)
54. Saheed, Y.K.: Data analytics for intrusion detection system based on recurrent neural network and supervised machine learning methods. In: Recurrent Neural Networks, pp. 167–179. CRC Press Taylor & Francis Group (2022)
55. Jain, S., Shukla, S., Wadhvani, R.: Dynamic selection of normalization techniques using data complexity measures. Expert Syst. Appl. 106, 252–262 (2018). https://doi.org/10.1016/j.eswa.2018.04.008
56. Georganos, S., Lennert, M., Grippa, T., Vanhuysse, S., Johnson, B., Wolff, E.: Normalization in unsupervised segmentation parameter optimization: a solution based on local regression trend analysis. Remote Sens. (2018). https://doi.org/10.3390/rs10020222
57. Saheed, Y.K.: Performance improvement of intrusion detection system for detecting attacks on internet of things and edge of things. In: Misra, S., Kumar, T.A., Piuri, V., Garg, L. (eds.) Artificial Intelligence for Cloud and Edge Computing. Internet of Things
(Technology, Communications and Computing). Springer, Cham (2022)
58. Gray, R.M.: Entropy and Information Theory. Springer Science & Business Media (2011)
59. Adi, E., Baig, Z., Hingston, P.: Stealthy Denial of Service (DoS) attack modelling and detection for HTTP/2 services. J. Netw. Comput. Appl. 91, 1–13 (2017). https://doi.org/10.1016/j.jnca.2017.04.015
60. Saheed, Y.K.: Machine learning-based blockchain technology for protection and privacy against intrusion attacks in intelligent transportation systems. In: Machine Learning, Blockchain Technologies and Big Data Analytics for IoTs: Methods, Technologies and Applications, p. 16 (2022)
61. Zorarpacı, E., Özel, S.A.: A hybrid approach of differential evolution and artificial bee colony for feature selection. Expert Syst. Appl. 62, 91–103 (2016). https://doi.org/10.1016/j.eswa.2016.06.004
62. Jimoh, R.G., Ridwan, M.Y., Yusuf, O.O., Saheed, Y.K.: Application of dimensionality reduction on classification of colon cancer using Ica and K-Nn algorithm. Anale. Ser. Informatică, vol. 6, no. 10, pp. 55–59, 2018, [Online]. Available: http://anale-informatica.tibiscus.ro/download/lucrari/16-1-06-Olatunde.pdf
63. Seni, G., Elder, J.F.: Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, vol. 2, no. 1 (2010)
64. Hung, C., Chen, J.H.: A selective ensemble based on expected probabilities for bankruptcy prediction. Expert Syst. Appl. 36(3 PART 1), 5297–5303 (2009). https://doi.org/10.1016/j.eswa.2008.06.068
65. Abdulhammed, R., Musafer, H., Alessa, A., Faezipour, M., Abuzneid, A.: Features dimensionality reduction approaches for machine learning based network intrusion detection. Electron (2019). https://doi.org/10.3390/electronics8030322
66. Elhag, S., Fernández, A., Bawakid, A., Alshomrani, S., Herrera, F.: On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on Intrusion Detection Systems. Expert Syst. Appl. 42(1), 193–202 (2015). https://doi.org/10.1016/j.eswa.2014.08.002
67. Mchugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 3(4), 262–294 (2000). https://doi.org/10.1145/382912.382923
68. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set in computational intelligence for security and defense applications. Comput. Intell. Secur. Def. Appl., no. Cisda, 1–6 (2009)
69. Sangster, B. et al.: Toward instrumenting network warfare competitions to generate labeled datasets. In: 2nd Work. Cyber Secur. Exp. Test, CSET 2009 (2009)
70. Sato, M., Yamaki, H., Takakura, H.: Unknown attacks detection using feature extraction from anomaly-based IDS alerts. In: Proc. - 2012 IEEE/IPSJ 12th Int. Symp. Appl. Internet, SAINT 2012, pp. 273–277 (2012). https://doi.org/10.1109/SAINT.2012.51
71. Sperotto, A., Sadre, R., Van Vliet, F., Pras, A.: A labeled data set for flow-based intrusion detection. In: IP Operations and Management: 9th IEEE International Workshop, IPOM, pp. 39–50 (2009). https://doi.org/10.1007/978-3-642-04968-2_4
72. Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A.: Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 31(3), 357–374 (2012). https://doi.org/10.1016/j.cose.2011.12.012
73. Lippmann, R.P. et al.: Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation. In: Proc. - DARPA Inf. Surviv. Conf. Expo. DISCEX 2000, vol. 2, pp. 12–26 (2000). https://doi.org/10.1109/DISCEX.2000.821506
74. Ruoming, P., Mark, A., Mike, B., Jason, L., Vern, P., Brian, T.: A first look at modern enterprise traffic. In: p. Proceedings of the 5th ACM SIGCOMM conference on I (2005)
75. Vasudevan, A.R., Harshini, E., Selvakumar, S.: SSENet-2011: a network intrusion detection system dataset and its comparison with KDD CUP 99 dataset. Asian Himalayas Int. Conf. Internet (2011). https://doi.org/10.1109/AHICI.2011.6113948
76. Gringoli, F., Salgarelli, L., Cascarano, N., Risso, F., Claffy, K.C., Rodriguez, P.: GT: picking up the truth from the ground in traffic classification. ACM SIGCOMM Comput. Commun. Rev. 39(5), 12–18 (2009)
77. Beigi, E.B., Jazi, H.H., Stakhanova, N., Ghorbani, A.A.: Towards effective feature selection in machine learning-based botnet detection approaches. In: 2014 IEEE Conf. Commun. Netw. Secur. CNS 2014, pp. 247–255 (2014). https://doi.org/10.1109/CNS.2014.6997492
78. Alkasassbeh, M., Al-Naymat, G., B.A, A., Almseidin, M.: Detecting distributed denial of service attacks using data mining techniques. Int. J. Adv. Comput. Sci. Appl. 7(1), 436–445 (2016). https://doi.org/10.14569/ijacsa.2016.070159
79. Sharafaldin, I., Gharib, A., Lashkari, A.H., Ghorbani, A.A.: Towards a reliable intrusion detection benchmark dataset. Softw. Netw. 2017(1), 177–200 (2017). https://doi.org/10.13052/jsn2445-9739.2017.009
80. Meidan, Y., et al.: N-BaIoT-Network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Comput. 17(3), 12–22 (2018). https://doi.org/10.1109/MPRV.2018.03367731
81. Ahmed, S.W., Kientz, F., Kashef, R.: A modified transformer neural network (MTNN) for robust intrusion detection in IoT networks. In: 2023 Int. Telecommun. Conf. ITC-Egypt 2023, pp. 663–668 (2023). https://doi.org/10.1109/ITC-Egypt58155.2023.10206134
82. Abd Elaziz, M., Al-qaness, M.A.A., Dahou, A., Ibrahim, R.A., El-Latif, A.A.A.: Intrusion detection approach for cloud and IoT environments using deep learning and Capuchin Search Algorithm. Adv. Eng. Softw. 176(December 2022), 103402 (2023). https://doi.org/10.1016/j.advengsoft.2022.103402
83. Fatani, A., et al.: Enhancing intrusion detection systems for IoT and cloud environments using a growth optimizer algorithm and conventional neural networks. Sensors 23(9), 1–14 (2023). https://doi.org/10.3390/s23094430

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.