
Computers & Security 00 (2024) 1–17

APELID: Enhancing Real-Time Intrusion Detection with Augmented WGAN and Parallel Ensemble Learning

Hoang V. Vo, Hanh P. Du, Hoa N. Nguyen
Department of Information Systems, VNU University of Engineering and Technology, Hanoi 100000, Vietnam
E-mail: [email protected], [hanhdp,hoa.nguyen]@vnu.edu.vn

Abstract
This paper proposes APELID, an AI-powered intrusion detection method that improves intrusion detection performance by
increasing the quality of the training set and employing numerous potent AI models. APELID is composed of two algorithms:
AWGAN and PELID. AWGAN uses a clustering algorithm to select representative samples from the majority classes and a WGAN
algorithm to generate more realistic samples from the minority classes in order to augment the training set. To improve the
efficacy of intrusion detection, we propose the PELID algorithm, which executes multiple efficient models in parallel, such as DNN,
XGB, GBM, and BME. In addition, we deploy a sandbox to enhance the malware detection capability for network-transferred
malware file dynamic analysis. To evaluate APELID, rigorous experiments utilizing well-known datasets are conducted. Using the
CSE-CIC-IDS2018 and NSL-KDD datasets, APELID achieves an F1-score of 99.99% and 99.65%, respectively, which is superior to
state-of-the-art algorithms. In addition, the average deep inspection time (i.e., 22.29 µs/flow) for a single network traffic flow is
fast enough to detect intrusions in real-time.

© 2023 Published by Elsevier Ltd.

Keywords: AI-powered Intrusion Detection, Traffic Deep Analysis, Data Augmentation, Wasserstein Generative Adversarial
Networks, Deep Neural Network, eXtreme Gradient Boosting, Gradient Boosting on Decision Trees, Bagging Meta-Estimator.

1. Introduction

The robust connection and pervasive use of the Internet have contributed to the improvement of people's lives, but they also pose a number of threats to the network. Therefore, intrusion detection systems (IDS) are essential network security devices for defending increasingly vulnerable systems against network attacks. Specifically, an IDS employs signature-based detection [1, 2, 3, 4] to detect and prevent known attacks, while using machine learning (ML) or deep learning (DL) to identify anomalous behavior [5, 6, 7, 8]. Due to the dearth of detection methods for new network threats, it is challenging to identify unknown network attacks using either method. A rules-based engine, for example, cannot detect unknown malware, and anomaly-based ML still faces challenges such as false positive alarms and increased latency.
Recent proposals and contributions to this field have emphasized the use of artificial intelligence (AI-powered methods) for IDS [9, 10]. However, using an AI-powered method to perform real-time deep flow inspection slows down the network, especially for high-bandwidth network traffic, potentially resulting in bottlenecks and a delay in analysis [11, 12]. Therefore, a deployment strategy for a practical IDS is required. Moreover, the effectiveness of AI models substantially lags behind the actual demand for an IDS. Because the quality of the dataset determines how well the AI-powered model can be constructed, using an unbalanced dataset to train the model can result in overfitting. In addition, the minority classes in the training set may have a low precision score and a high rate of false positives [13, 14]. However, well-known datasets in this field, including NSL-KDD, KDD99, UNSW-NB15, and CSE-CIC-IDS2018, typically exhibit significant imbalance [15] caused by redundancy, a small number of samples for specific classes, a small number of intrusion samples in comparison to the benign class, etc. When employing AI-powered intrusion detection in real-time, the imbalanced dataset is an issue that must be resolved, particularly to improve or increase the accuracy of the ML models.

∗ Corresponding author: [email protected]

Moreover, Internet users must contend with the rapid dissemination of malware from numerous unknown or unreliable sources [16, 17]. Malware is code designed to damage a computer system's functionality; it is also known as malicious software. By executing these files in a sandbox, we can identify malicious activities such as modifying registry entries and deleting or uploading system files [18, 19]. Thus, we can use the sandbox to observe the suspicious behavior of the malware and gain a deeper understanding of it without causing any harm to our existing system. Therefore, the sandbox should be integrated with the IDPS to address the problem of malware files transferred between networks.

Research Challenges: Real-time intrusion detection for large-scale networks still faces the following research challenges:
1. Well-known IDS datasets cannot account for novel network attacks. In addition, there is an imbalance between sample classes. Consequently, one of the primary challenges for AI-powered IDS is to select the finest samples from the majority classes and to generate realistic data for new attacks.
2. Currently, a high rate of false positives persists when an AI-powered IDS is deployed in practice. Consequently, increasing the precision and accuracy of AI models is one of the constant challenges we face, particularly in light of new network intrusions.
3. Utilizing AI models for intrusion detection causes significant problems, including bottlenecks and high latency. In reality, the delay and efficacy of the IDS are significant issues when deploying for high-bandwidth, large-scale networks. Therefore, we must reduce the duration of AI-powered deep inspections.
4. Attackers favor delivering and installing malware on user computers, which causes the propagation of malware throughout the network. In order to proactively protect network-transferred files from malware, integrating malware hunting into the IDS in practice is also a challenge.

Highlights: These are the primary contributions of this work, illustrated by our proposed method, namely APELID:
• We propose an algorithm (specifically AWGAN) to enhance the quality of the training set by selecting the best samples in the majority classes and generating realistic samples for the minority classes that the learning model will use. AWGAN employs the WGAN method in order to generate more realistic samples for minority classes. In the meantime, the majority classes are compressed using an Edited Nearest Neighbors algorithm.
• For deep network traffic inspection, we employ an ensemble learning model comprised of XGBoost, CBT, GBM, Bagging Meta-Estimator, and DNN (specifically PELID) to increase the precision and accuracy of intrusion detection.
• In response to the latency requirement in AI-powered IDS, PELID-based prediction is proposed to perform in parallel. In addition, a periodic PELID-based analysis sampling strategy is utilized to avoid bottlenecks for real-time, AI-powered intrusion detection in massive traffic networks.
• We propose integrating IDS with a sandbox-based analyzer to improve its malware detection capabilities by automatically capturing, transmitting, and analyzing network-transferred files. This analyzer assists administrators in identifying suspicious files and generating new rules to prevent malware in advance.
• The PELID and AWGAN algorithms are incorporated into APELID, which is then implemented and deployed as an inline IDPS. Extensive experiments on well-known datasets indicate that APELID's F1-score and latency are superior for intrusion detection.

The remainder of the paper includes the following sections. Section §2 introduces the problem formulation and analyzes related works. §3 presents our AI-powered real-time intrusion detection method, APELID, which enhances performance, both for speed and accuracy. A strategy to integrate APELID into an inline IDPS is also described in this section. We also present a method for improving proactive malware hunting by adopting a sandbox that integrates with the IDPS. In Section §4, we summarize the evaluation of APELID performance in comparison with state-of-the-art (SOTA) methods before the conclusions of our research in Section §5.

2. Problem Formulation & Related Works

For AI-powered intrusion detection, each network traffic flow is modeled by a vector of features f = [a1, ..., an]. Thus, we can build classifiers from this feature vector set that use an AI model to determine abnormal or benign flows. The accuracy of predicting anomalous flows depends on the AI model and the dataset used to train it. This paper aims to enhance the AI-powered intrusion detection model and augment the training set. The following subsections will clarify these issues.

2.1. AI-powered Intrusion Detection

Current research affirms that DNN, XGBoost, CBT, GBM, and BME are among the best methods to predict intrusion attacks [6, 2, 8, 10, 20]. Thus, we focus on these methods in our research to enhance intrusion detection by deep network traffic analysis.
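To make the model lineup above concrete, the following sketch instantiates one classifier of each family with off-the-shelf Python libraries (xgboost, catboost, scikit-learn); the constructor arguments are illustrative placeholders rather than the tuned hyperparameters of §4.3, and MLPClassifier merely stands in for the paper's DNN.

# Minimal sketch (assumptions noted above): one instance of each base learner family.
from xgboost import XGBClassifier                 # XGB
from catboost import CatBoostClassifier           # CBT (gradient boosting on decision trees)
from sklearn.ensemble import GradientBoostingClassifier, BaggingClassifier  # GBM, BME
from sklearn.neural_network import MLPClassifier  # simple stand-in for the DNN
from sklearn.tree import DecisionTreeClassifier

def build_base_models():
    """Return the five base learners as a name -> estimator dictionary."""
    return {
        "XGB": XGBClassifier(n_estimators=100, max_depth=9, learning_rate=0.003),
        "CBT": CatBoostClassifier(iterations=300, depth=8, verbose=False),
        "GBM": GradientBoostingClassifier(n_estimators=200),
        "BME": BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50),
        "DNN": MLPClassifier(hidden_layer_sizes=(128, 64), activation="relu", max_iter=200),
    }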

Deep Neural Network (DNN): A DNN consists of several layers: input, hidden, and output. These layers are established to extract and transform features. It is considered a good method for detecting network attacks. The relationship between layer i (i = 1, 2, ..., n) and layer i − 1 can be expressed as:

v_i = f^(i)(W_i^T v_{i−1} + b_i)    (1)

where v_i ∈ R^{d_i} is the vector of layer i, W_i ∈ R^{d_{i−1} × d_i} is the weight matrix between layer i and layer i − 1, b_i ∈ R^{d_i} is the bias vector, and f^(i)(·) is the activation function of layer i.
Several activation functions are used in the DNN, such as Sigmoid f(x) = 1 / (1 + e^{−x}); Hyperbolic tangent f(x) = (1 − e^{−2x}) / (1 + e^{−2x}); Softmax f(x)_i = e^{x_i} / Σ_j e^{x_j}; and Rectified Linear Unit (ReLU) f(x) = max{x, 0}. ReLU has the advantage of efficiency when training on a large number of datasets [21].
Extreme Gradient Boosting (XGB): XGB is based on a set of ML concepts using decision trees. It uses the notion of strengthening by minimizing errors through the introduction of a gradient term [6]. XGB uses an ensemble of K classification and regression trees, each of which has K_E^i, i ∈ 1..K, nodes. The total of the prediction scores for each tree represents the final prediction:

ŷ_i = φ(x_i) = Σ_{k=1}^{K} f_k(x_i),  f_k ∈ F    (2)

where y_i are the corresponding class labels, x_i are members of the training set, f_k is the leaf score for the k-th tree, and F is the set of all K scores for all classification and regression trees.
Gradient Boosting on Decision Trees (CBT): CBT is a high-performance gradient-boosting approach on decision trees [2]. It reduces overfitting with a strategy to handle categorical features. It develops an oblivious tree model using the greedy target statistics technique on randomly shuffled training data to enhance the model's resilience.
Gradient Boosting Machine (GBM): The GBM is used for regression and classification issues [8]; as a mixture of weak classification models, it often builds a model of decision trees. The least error values can be achieved by enhancing the gradient, updating the estimates in accordance with the learning rate.
Bagging Meta-Estimator (BME): It is an ensemble meta-estimator that aggregates the predictions of base classifiers fitted to random subsets of the original dataset [10]. Each training dataset in Bagging is created through a random drawing of N examples with replacement, where N represents the size of the initial training dataset. In the test phase, the ML model predicts input x with every base classifier, and the plurality vote combines the predictions.
Ensemble Learning: Combining the outputs of various classification models can lead to more significant predictions and enhance the generalization ability of the baseline models [20]. Suppose P_1(f), P_2(f), ..., P_n(f) are the probability outputs of n AI models for a feature vector f; ω_1, ω_2, ..., ω_n are weight ratios that represent the importance of each model, where each ω_i ∈ (0, 1) and Σ_{i=1}^{n} ω_i = 1. Ensemble learning combines them to classify a network flow as regular or attack by the following formula:

ELPred(f) = (1/n) Σ_{i=1}^{n} P_i(f) ∗ ω_i    (3)

2.2. Data Augmentation

We define the training set RT, including n majority classes RT_maj and m minority classes RT_min. The number of class labels varies greatly between RT_maj and RT_min. We achieve the highest quality of the dataset by compressing the majority classes with the K-Means algorithm using the following formula:

S_maj = Σ_{i=1}^{n} Compress(RT_maj)    (4)

We generate the minority classes by using an oversampling technique, then verifying and removing noise (Sample_noise) from the new class samples to increase the number of samples of the minority classes by the following formula:

S_min = Σ_{i=1}^{m} Generate(RT_min) \ Sample_noise    (5)

Finally, we obtain the augmented training set ATS with the same number τ of samples for every class label by the following formula:

ATS = S_maj + S_min    (6)

2.3. Related Works

Several works investigate ML issues when applied to IDS [22, 23, 24, 25, 26, 27, 28]. The following subsections focus on recently related works regarding (i) Deep Learning-based Intrusion Detection, (ii) Boosting-based Intrusion Detection, and (iii) Dataset Augmentation.

2.3.1. Deep Learning-based Intrusion Detection

DL approaches effectively detect network attack associations within raw samples, feature learning, and classification tasks. Many DL techniques have been used to implement IDS in the last few years [22, 29, 30]. DL is used in real-time environments in several studies on network attack detection. For example, Bontemps et al. [31] propose a real-time collective anomaly detection model based on neural network learning and feature operating. Their method involves using typical time series data to train an LSTM RNN, followed by a live prediction for each time step. An approach for NIDS based on a hierarchical and dynamic feature extraction framework (HDFEF) was proposed by Li et al. [32]. They defined a network activity as a series of packets using various network traffic flows.
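As a concrete reading of Eq. (3), the short sketch below combines per-model class-probability vectors with their weights; the model names and the equal example weights are placeholders, not the tuned values reported later in §4.3.

import numpy as np

def el_pred(probas: dict, weights: dict) -> np.ndarray:
    """Weighted soft vote of Eq. (3): probas[name] is an (n_samples, n_classes)
    probability matrix, weights[name] the corresponding omega_i (summing to 1)."""
    names = list(probas)
    stacked = np.stack([probas[m] * weights[m] for m in names])  # (n_models, n_samples, n_classes)
    return stacked.sum(axis=0) / len(names)                      # the 1/n factor of Eq. (3)

# Illustrative usage with two dummy models and equal weights.
p = {"XGB": np.array([[0.9, 0.1]]), "DNN": np.array([[0.6, 0.4]])}
w = {"XGB": 0.5, "DNN": 0.5}
scores = el_pred(p, w)
label = scores.argmax(axis=1)   # 0 = benign, 1 = attack in this toy example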

The distribution of the feature representations of several temporally associated network packets is then dynamically adjusted with an attention mechanism in a hierarchical network model. The final discriminant vectors are then obtained and utilized for classification after combining the vectors from the multi-space mapping. The precision of the HDFEF on the CSE-CIC-IDS2018 dataset is 99.05%. A DL method for anomaly detection using a Restricted Boltzmann Machine (RBM) and a deep belief network was proposed by Alrawashdeh et al. [33]. Their method involved creating unsupervised feature reduction using a one-hidden-layer RBM and then passing the weights from this RBM to another RBM to create a deep belief network. With the NSL-KDD dataset, they were 97.91% accurate. In addition, Jayalaxmi et al. [34] introduce the IDS framework known as PIGNUS, which combines an effective feature-mapping method with a cascade model. PIGNUS combines a Cascade Forward Back Propagation Neural Network for classification and attack detection with Auto Encoders to choose the best features. The cascade model creates an accurate categorization by using related links from the input layer to the output layer to identify typical and aberrant behavior patterns. The experiment result from PIGNUS reaches 99.02% accuracy for the NSL-KDD dataset. Aldarwbi et al. [35] offer a system that transforms the network traffic flow features into waves and leverages advanced audio/speech recognition DL-based methods such as LSTM, Deep Belief Networks (DBNs), and CNN to detect intruders. It achieves an accuracy of 84.82% and 99.41% for the NSL-KDD and CIC-IDS2017 datasets, respectively. In other work, the authors employed the Firefly Optimization (FFO) technique to detect incursions and a Probabilistic Neural Network (PNN) to categorize classes based on the NSL-KDD dataset [36]. The proposed approach achieves an accuracy of 98.99%.
Qazi et al. [37] presented a hybrid IDS framework using a convolutional recurrent neural network (CRNN) to detect network threats. This method merges an RNN with a CNN in which various RNN layers follow two convolutional layers. The output is then fed into fully connected, flattened, and SoftMax layers, which enable the model to detect and classify traffic. The experiment results on the CSE-CIC-IDS2018 dataset reach 98.90% accuracy. In addition, Ren et al. [38] employed a mix of CNN and the Attention mechanism to construct a CA Block focused on local spatiotemporal feature extraction, using Equalization Loss v2 (EQL v2) to raise the minority class weight and balance the learning attention on minority classes. The accuracy of the experiment's results for the NSL-KDD and UNSW-NB15 datasets is 99.77% and 89.39%, respectively.
Moreover, Ghanbarzadeh et al. [39] proposed a method that uses the Multi-objective Quantum-inspired Binary Horse herd Optimisation Algorithm (MQBHOA) for IDS. This technique implements the Horse herd Optimisation Algorithm (HOA) metaheuristic, a robust algorithm inspired by nature. The method achieved 99.0% and 99.78% accuracy on the NSL-KDD and CSE-CIC-IDS2018 datasets, respectively. In another method, Al et al. [40] offer a new classification-based NIDS for network flow traffic generating huge data. The suggested system combines a Hybrid DL (HDL) network comprised of a CNN and an LSTM for a better IDS. In addition, data imbalance processing consisting of the SMOTE and Tomek-Links sampling methods, termed STL, was utilized to mitigate the effects of data imbalance on system performance. The accuracy of the proposed method was 99.17% in binary classification and 99.83% in multiclass classification.

2.3.2. Boosting-based Intrusion Detection

Since XGB outperforms other well-known single-ML algorithms, its popularity is rising [46]. Verma et al. [41] recently suggested a technique for IDS that combined the XGB algorithm with K-Means clustering. For the NSL-KDD dataset, the experiment result has an accuracy of 81.2%, 82.38%, and 84.25% for the ANN, SVM, and XGB models, respectively. Additionally, Devan et al. [42] propose a strategy for enhancing NIDS that blends a DNN with XGB. This approach uses the XGB technique for feature selection, and the experiment results are 97.60% accurate. Numerous dual ensemble techniques involving fine-tuned boosting algorithms, such as XGB, CBT, LightGBM, and GBM, are fully assessed utilizing publicly available datasets, such as UNSW-NB15 and NSL-KDD. Louk et al. [47] presented a dual ensemble model by blending two current ensemble methods: bagging and CBT. The experiment results show that the presented technique achieves 94.66% accuracy.
Golchha et al. [48] present an attack detection framework for IIoT utilizing a voting-based ensemble learning method. This work includes an ensemble of current and classical ML approaches, including Histogram Gradient Boosting, CBT, Random Forest (RF), and a hard voting classifier. The experiment result reaches 99.85%, 97.90%, and 98.83% accuracy for CBT, HGB, and RF, respectively. Moreover, Nazir et al. [49] suggested a wrapper-based feature selection approach called 'Tabu Search - Random Forest (TS-RF).' The Tabu search is utilized as a search technique, while the RF is employed as a learning process for IDS. The suggested model attained an accuracy of 83.12% for the UNSW-NB15 dataset.
In another approach, Hammad et al. [44] present a method for categorizing network attacks called Multinomial Mixture Modeling with Median Absolute Deviation and Random Forest Algorithm (MMM-RF). This approach uses t-SNE to minimize data dimension, Correlation Feature Selection (CFS) to analyze the most important factors affecting network traffic, and SMOTE combined with Random Under-Sampling to control imbalance on the CSE-CIC-IDS2018 dataset. It has a 99.98% accuracy rate.
Despite the fact that many proposed ML/DL techniques have improved the development of IDS, they fail to achieve excellent performance, which consists of a low false alarm rate and a high detection rate. One of the explanations is that

Table 1. Summary of Related Works

Method | Venue | Approach | Dataset | Acc (%)
XGB [41] | ICMLA 2016 | Applying the XGB algorithm with K-Means clustering for intrusion detection. | NSL-KDD | 84.25
DNN+RBM [33] | ICCCNT 2018 | Proposal using a one-hidden-layer RBM to perform unsupervised feature reduction. | NSL-KDD | 97.91
XGB+DNN [42] | Neural Computing and Applications 2020 | Application of the XGB technique for feature selection followed by a DNN. | NSL-KDD | 97.60
CNN+BiLSTM [26] | AIPR 2020 | Utilization of a traditional oversampling technique to balance the dataset, and CNN for intrusion detection. | NSL-KDD | 99.22
RF+miniVGGNet [43] | IEEE Access 2020 | Combination of K-Means and ENN to balance the dataset, then RF + miniVGGNet to detect intrusions. | NSL-KDD, CIC-IDS2018 | 82.84, 96.99
WGAN+LightGBM [25] | Computer Science 2021 | Applying WGAN-GP for data generation on minority class samples and using LightGBM for the classification. | NSL-KDD, CIC-IDS2018 | 99.00, 96.00
SDAID [20] | Globecom 2022 | Balancing majority and minority class samples and using ensemble learning consisting of an XGBoost and a DNN model for the classification. | NSL-KDD, CIC-IDS2018 | 99.62, 99.93
MMM-RF [44] | Computers & Security 2022 | Use CFS to analyze network traffic, t-SNE to minimize data dimension, and SMOTE to balance the CSE-CIC-IDS2018 dataset. | CIC-IDS2018 | 99.98
CNN, DBNs, LSTM [35] | Computers and Electrical Engineering 2022 | Transforms the traffic flow features into waves and utilizes advanced audio/speech recognition deep-learning-based techniques to detect intruders. | CIC-IDS2017, NSL-KDD | 99.21, 84.82
CNN+LSTM [45] | Digital Communications and Networks 2023 | Used SMOTE to balance abnormal traffic, CNN to extract deep features, then CNN-LSTM to detect intrusions. | UNSW-NB15, CIC-IDS2017, NSL-KDD | 99.21, 99.32, 98.45
FFO+PNN [36] | Alexandria Engineering Journal 2023 | Used the FFO technique to extract features and PNN to classify categories. | NSL-KDD | 98.99
CNN+EQL [38] | Computer Communications 2023 | Used CNN mingled with the Attention mechanism to form a CA Block focusing on local spatiotemporal feature extraction, and EQL v2 to increase the minority class weight and balance the learning attention on minority classes. | UNSW-NB15, NSL-KDD, CIC-IDS2017, CIC-DDoS2019 | 89.39, 99.77, 99.88, 99.58
PIGNUS [34] | Computers & Security 2023 | Use Auto Encoders to select optimal features and a Cascade Forward Back Propagation Neural Network for classification and attack detection. | NSL-KDD | 99.02

the majority of these works disregard the imbalanced data in IDS datasets.

2.3.3. Dataset Augmentation

Researchers recently proposed several methods to improve the quality of datasets for training ML or DL models. For instance, to balance the dataset in NIDS for industrial IoT, Zhang et al. [25] propose PWG-IDS, based on WGAN with a gradient penalty, for generating minority class samples. The proposal reduces the number of iterations and generates more realistic sample data than GAN, using LightGBM as the classification algorithm. The experimental findings on the NSL-KDD and CSE-CIC-IDS2018 datasets demonstrate an accuracy of 99% and 96%, respectively. Sinha et al. [26] proposed a conventional oversampling procedure to balance the dataset. The experiment evaluated the CNN-BiLSTM model based on the NSL-KDD dataset, achieving an accuracy of 99.22% and a detection rate of 98.88% for the NSL-KDD dataset. However, the experiment only demonstrates cross-validation and does not include independent test data after execution. In another approach, Gupta et al. [27] presented a solution to balance the dataset. This method integrates DL and ensemble learning algorithms with data-level techniques based on Random Oversampling (ROS) and SVM-SMOTE. Before data oversampling, they used a DNN to execute binary classification of benign and attack network traffic flows, followed by the XGB algorithm. They also distinguish between the majority and minority attack classes. Then the dataset is resampled, and the RF algorithm is applied to classify the various minority attack classes.
In addition, Liu et al. [43] presented a technique to balance the dataset for network IDS, namely DSSTE. This method used techniques to balance the dataset for the

minority and majority classes. The experiment result of this approach achieves 96.99% and 82.84% for the CSE-CIC-IDS2018 and NSL-KDD datasets, respectively. Research also concentrates on separating the training and testing datasets to boost detection quality; for instance, Ullah et al. [45] proposed an IDS employing transformer-based transfer learning for Imbalanced Network Traffic (IDS-INT). It employs SMOTE to balance unusual traffic and detect minority attacks, uses CNN to extract features, and the CNN-LSTM model to detect attacks, with an accuracy of 99.21% on the UNSW-NB15 dataset.
Table 1 illustrates our analysis of SOTA methods by the following factors: method, approach, dataset, and result. It shows that we still need to reduce false positives, and the ML models must also be enhanced when applied to the IDS. In addition, the low detection rates caused by poor training dataset quality make ML-based IDS more challenging than other anomaly detection applications.

3. APELID: Augmented WGAN and Parallel Ensemble Learning for Real-Time Intrusion Detection

This section introduces our proposed method, APELID, combining a method to augment the dataset (AWGAN, described in §3.1) and a parallel ensemble learning method (PELID, specified in §3.2) for deeply analyzing network traffic. Thus, APELID is a comprehensive research solution for enhancing real-time intrusion detection. A short version of APELID is WGID, presented at SoICT 2022 [50]. WGID only comprised a WGAN algorithm to tackle the imbalanced dataset and used the XGBoost method to detect intrusions. The following subsections describe our APELID in detail.

3.1. Training Set Augmentation

[Figure 1. Training Set Augmentation: the raw dataset is normalized and randomly split into a raw training set and a testing set; majority classes are compressed by K-Means-based sampling, while minority classes are augmented with WGAN-generated samples validated via backpropagation to minimize error, yielding the final training set.]

Algorithm 1 AWGAN: Create the Training & Testing Sets by Augmented WGAN
Input: F - raw dataset, represented by a list of feature vectors; r - ratio between training and testing sets, default 7:3; τ - maximum samples in a label.
Output: T - training set; V - testing set.
1: L ← GetLabels(F)                ▷ Get all labels of dataset F.
2: F ← Normalize(F)                ▷ Normalize all feature vectors.
3: (RT, V) ← SplitTrainTest(F, r)  ▷ Split F randomly into the raw training set RT and testing set V with ratio r.
4: (S_maj, S_min) ← GetClasses(RT) ▷ Determine majority classes (S_maj) and minority classes (S_min) from RT.
5: T ← ∅
6: for each M ∈ S_maj do           ▷ Compress each majority class.
7:     C ← Clustering(M, |L|)      ▷ Compute the centroids C of |L| clusters by using ENN.
8:     M ← Select(M, C, τ)         ▷ Compress majority samples using C of |L| clusters.
9:     T ← T ∪ M
10: end for
11: for each M ∈ S_min do          ▷ Generate samples for minority classes by WGAN.
12:     while |M| < τ do
13:         S ← WGANSampling(M)    ▷ Generate new samples.
14:         M ← Denoise(M, S)      ▷ Eliminate noise samples.
15:     end while                  ▷ Repeat until τ samples are obtained.
16:     T ← T ∪ M                  ▷ Add realistic samples to T.
17: end for
18: return (T, V)

Step 1 - Dataset Preprocessing: carry out dataset normalization; eliminate noise and duplicated raw data from the dataset; and split it into the training and testing sets, preset by a ratio of 7:3, respectively. We use τ as a constant determining the maximum number of samples of a label class in the training set. The testing set is used for evaluating the DL models in our project.
Step 2 - Finding Majority and Minority Classes: The training set is separated into the majority classes and the minority classes from the initial/original training dataset. We compress the majority classes and utilize an oversampling approach to create data for the minority classes. Consequently, the total number of samples of every class in the training set is equal.
Step 3 - Compressing Majority Classes: we reduce the number of samples of the majority classes by proposing a method inspired by the idea of the DSSTE [43] algorithm to augment the dataset. We use the Edited Nearest Neighbors (ENN) algorithm to obtain the majority samples that are frequently difficult to classify due to their proximity. Using a clustering algorithm, we then compress every label class in the majority classes to reduce their number of samples. We eventually obtain τ samples for each label class in the majority classes and append the majority class samples to the training set.
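As a rough illustration of Steps 2-3 and lines 6-10 of Algorithm 1, the sketch below cleans a majority class with Edited Nearest Neighbours and then keeps the τ samples closest to K-Means centroids. It uses imbalanced-learn and scikit-learn as stand-ins for the paper's implementation; the WGAN-based minority generation of Step 4 is not shown here.

import numpy as np
from imblearn.under_sampling import EditedNearestNeighbours
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def compress_majority(X, y, n_clusters, tau, random_state=0):
    """Keep at most `tau` representative samples per class of the majority-class subset.

    X, y        : features and labels of the majority classes only
    n_clusters  : number of clusters (|L| in Algorithm 1)
    tau         : maximum samples kept per class
    """
    # 1) Drop ambiguous samples lying close to other classes (ENN cleaning).
    X_clean, y_clean = EditedNearestNeighbours().fit_resample(X, y)

    X_out, y_out = [], []
    for label in np.unique(y_clean):
        Xc = X_clean[y_clean == label]
        if len(Xc) <= tau:                       # already small enough, keep as-is
            X_out.append(Xc); y_out.append(np.full(len(Xc), label)); continue
        # 2) Cluster the class and rank samples by distance to their nearest centroid.
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(Xc)
        _, dist = pairwise_distances_argmin_min(Xc, km.cluster_centers_)
        keep = np.argsort(dist)[:tau]            # the tau most "central" samples
        X_out.append(Xc[keep]); y_out.append(np.full(tau, label))
    return np.vstack(X_out), np.concatenate(y_out)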

Step 4 - Augmenting Realistic Data for Minority Classes: We balance the minority classes using an oversampling model based on the WGAN [51], which uses attack data to generate simulated attack samples. We then validate these new samples by using a train-and-test model built from the actual dataset to test the output of the oversampling model. Depending on the result of the testing phase, with the new attack samples used as a testing set, the oversampling model is backpropagated to minimize the error. It also removes the noise, i.e., new attack samples that failed to be classified. This step repeats until the train-and-test model cannot distinguish actual from simulated attack data and the number of samples of every label class in the minority classes equals τ. Moreover, this step generates more realistic attack samples, and these new attack samples may be similar to future attacks.
Step 5 - Results: Finally, we obtain a new training set, in which the number of samples of every label class equals τ, and use this training set to train our models and the testing set from Step 1 to test them. Thus, it helps us to obtain a better AI model.

3.2. Parallel Ensemble Learning-based Intrusion Detection

Two ideas motivated our intrusion detection method: the ensemble learning approach and parallel computing. The first tries to improve the quality of intrusion detection, while the second helps to reduce intrusion detection latency. As a result, the intrusion detection approach suggested in our study is known as PELID, which stands for "Parallel Ensemble Learning for Intrusion Detection."
As described in [6, 2, 8, 10, 20], the most effective recent AI models for intrusion detection are DNN, XGB, CBT, GBM, and BME. In addition, as illustrated in Table 1, the accuracy and F1-score results of these models for intrusion detection currently exceed 98%. This predominantly motivated our selection of these models for our PELID ensemble.
In PELID, the combination of the AI models is performed by a soft-voting method. However, the effect of each individual model is regulated by a weighted score (ω_i ∈ [0, 1]) in the overall PELID model. In general, with n AI models, the total sum of these scores has to be 1: Σ_{i=1}^{n} ω_i = 1. All of these scores are determined experimentally in order to identify the optimal combination of the AI models. Consequently, Alg. 2 depicts our PELID algorithm.

Algorithm 2 PELID: Parallel Ensemble Learning-based Intrusion Detection
Model: XGB, GBM, CATB, BME, DNN - the trained XGB, GBM, CBT, BME, and DNN models, and their ensemble weights ω_i where Σ_{i=1}^{5} ω_i = 1.
Input: f - traffic flow.
Output: (msg, R) - (alert messages; newly generated rules)
1: R ← ∅
2: F ← Featurize(f)                          ▷ Extract features of traffic flow f.
3: Fin ← Normalize(F)                        ▷ Perform the feature engineering: remove unused features and normalize the rest.
4: Cats ← [DstPort, Protocol]                ▷ Categorical variables.
5: Conts ← Fin \ Cats                        ▷ Continuous variables.
6: Perform in parallel five processes P1, P2, P3, P4, P5:
7:     P1: pXGB ← XGB.predict(Cats, Conts)   ▷ Perform the prediction using XGB.
8:     P2: pGBM ← GBM.predict(Cats, Conts)   ▷ Perform the prediction using GBM.
9:     P3: pCATB ← CATB.predict(Cats, Conts) ▷ Perform the prediction using CATB.
10:    P4: pBME ← BME.predict(Cats, Conts)   ▷ Perform the prediction using BME.
11:    P5: pDNN ← DNN.predict(Cats, Conts)   ▷ Perform the prediction using DNN.
12: Wait until P1, P2, P3, P4, P5 have finished.
13: scores ← (pXGB ∗ ω1 + pGBM ∗ ω2 + pCATB ∗ ω3 + pBME ∗ ω4 + pDNN ∗ ω5)
14: FC ← scores.argmax(axis = 1)             ▷ Get the predicted label of the flow.
15: if FC != 0 then                          ▷ Classified as a network attack.
16:     msg ← Alert(FC, f)                   ▷ Generate an alert using metadata from the flow f; set the alert category to the predicted label.
17:     R ← RuleGenerator(FC, f)             ▷ Generate a new signature based on its indicator of compromise.
18: end if
19: return msg, R

In Alg. 2, the network traffic flow is first captured, extracted, and modeled by a feature vector F. In this step, the CICFlowMeter tool [52] can be used, returning a vector of 83 features for each flow. Next, F is cleaned by removing unused features and normalizing the rest. With CICFlowMeter explicitly, this step retains only 73 features by eliminating [FlowID, SrcIP, SrcPort, Label, BwdPSHFlags, BwdURGFlags, FwdByts/bAvg, FwdPkts/bAvg, FwdBlkRateAvg, BwdByts/bAvg]. Among them, two features [DstPort, Protocol] are used as categorical variables, and the 71 remaining ones are considered continuous variables. It is worth noting that all the above steps are also used for preparing the training set before training each AI model. Moreover, all AI models have to be trained using datasets augmented by AWGAN before using PELID.
Returning to the PELID algorithm, the normalized vectors are then fed into the AI models to run the prediction step. Here, in order to enhance the speed of intrusion detection, the AI-based predictions are performed in parallel. Concretely, in PELID, five processes P1, P2, P3, P4, and P5 are run concurrently to compute the intrusion probability. Lastly, Eq. 3 is used to determine whether the flow is benign or an attack. In the event that network activities under an intrusion attack are identified, PELID sends an alert message to the administrator and generates a rule in the form of an indicator of compromise to update the IDPS's signature database.

3.3. Strategy for AI-powered Real-Time Intrusion Detection

When the AI-powered model for detecting intrusions is deployed, a large-scale traffic network, such as an optical one, experiences significant network latency or bottleneck congestion. We employ a rules-based engine to detect and prevent common network attacks. APELID then concentrates on the network traffic flows missed by the current rules-based engine, to decrease analysis time. It operates in two phases, as described in the following paragraphs.
Initially, a well-known IDPS, such as Suricata or Snort, is used to capture network traffic in both the receiving and transmitting directions. Next, we use a rules-based engine to analyze network traffic in order to detect and prevent known network attacks: drop, reject, alert, and pass. APELID, performing in-depth analysis, identifies abnormal network traffic behavior for the remaining case, denoted by 'Other.'
Second, during the in-depth analysis phase, the IDPS is modified to capture the flows corresponding to 'Other.' This flow data is then analyzed by the PELID method to identify one of the twelve labels: Benign, DoS-SlowHTTPTest, BruteForce-Web, BruteForce-XSS, DDoS-LOIC-UDP, DDoS-HOIC, DoS-Hulk, DoS-GoldenEye, Bot, DoS-Slowloris, Infiltration, and SQL-Injection.
For large-scale network traffic, the deep analysis certainly risks overloading the IDPS. Therefore, we propose an efficient strategy to sample the traffic flows. We control the periodic deep analysis sampling strategy using six variables: DI_Cycle, DIC_Min, DIC_Max, and DI_Window, DIW_Min, DIW_Max. These parameters are all natural numbers with units of seconds, and their meanings are as follows:
• DI_Cycle: the sampling cycle T for deep analysis. If this parameter has a value of 0, the IDPS system performs deep traffic analysis with a random cycle in the range from DIC_Min to DIC_Max. The default values of these parameters are 60, 30, and 300 seconds, respectively.
• DI_Window: the window size for deep analysis. If this parameter is 0, the system captures the flows for deep analysis in a random window size from DIW_Min to DIW_Max. The default value of DI_Window is 10 seconds, and DIW_Min, DIW_Max are 1 and 30 seconds, respectively.
These parameters are selected and configured in the IDPS. The sampling cycle and duration determine the performance of the IDPS for high-volume network traffic (throughput of 10 Gbps or more). If DI_Cycle is small or DI_Window is large, the DeepAnalyzer must make more predictions. This leads to increased latency and possibly causes bottlenecks. Consequently, these parameters are also chosen based on the context of network throughput and the IDPS's computing capacity.
Consequently, Fig. 2 illustrates the system architecture necessary to integrate our APELID into an inline IDPS. Note that if the DeepAnalyzer detects anomalous network behavior, the IDPS will generate an indicator of compromise (IoC) and add it to the signature-based ruleset of the IDPS. Thus, it notifies the administrator of current traffic flows and thwarts future network attacks of a similar nature.

3.4. Hunting Malware by Sandbox Approach

In order to improve the capability to detect malicious files transferred over the network, our proposed APELID solution is integrated with a MalwareAnalyzer based on a sandbox approach, as illustrated in Fig. 2. The MalwareAnalyzer is assumed to perform both static and dynamic file analysis to identify malware threats.
Our method for detecting malicious network-transmitted files is as follows. The IDPS initially acquires network-transmitted files and stores them in the FileStore folder. Then, we construct a Python program that periodically examines the FileStore folder for new files. All new files are therefore submitted automatically to the MalwareAnalyzer for sandbox-based malware analysis.
The MalwareAnalyzer enables the deployment of a well-known sandbox, such as Cuckoo, with two essential entities: the Host and the Agent. Each agent can be launched on a virtual machine that has been quarantined (Analysis VM). The Analysis VM executes the file and records its complete behavior. For further investigation, it can also identify behaviors associated with malware, such as extracted artifacts, registry modifications, dropped files, related processes, DLL library files in use, and network activity data.
The MalwareAnalyzer also performs static analysis by utilizing the Yara utility. Here, we investigate the file signatures, hashes, strings, and other data related to the suspicious file. The MalwareAnalyzer integrates with a number of additional malware analyzers, including VirusTotal.
Lastly, the MalwareAnalyzer combines the static and dynamic analysis results and gives an analysis report with a severity score from 0 to 10. If the severity score exceeds 7, in our proposal the MalwareAnalyzer sends an alert to the administrator and collects file-related information, such as the incoming IP address, domain, URL, etc. These data permit the development of a new IoC-based rule and its incorporation into the IDPS signature database, thereby preventing similar future threats. Consider a scenario in which the severity score is less than 7, such as malicious code that resists execution on the analysis virtual machines. In this case, the results of the sandbox analysis will also be sent to the administrator in order to provide additional information. In some instances, human reverse engineering analysis is required to assess the actual malware risk.
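The loop below is a rough sketch of how the DI_Cycle / DI_Window parameters could drive the DeepAnalyzer; the defaults mirror the values quoted above, while capture_flows and deep_analyze are hypothetical placeholders for the IDPS capture hook and the PELID prediction call.

import random
import time

def deep_inspection_loop(capture_flows, deep_analyze,
                         di_cycle=60, dic_min=30, dic_max=300,
                         di_window=10, diw_min=1, diw_max=30):
    """Periodically sample 'Other' flows for PELID-based deep analysis.

    A cycle (or window) value of 0 means: pick a random duration in the
    configured [min, max] range for every iteration, as described in Section 3.3.
    """
    while True:
        cycle = di_cycle or random.randint(dic_min, dic_max)
        window = di_window or random.randint(diw_min, diw_max)
        flows = capture_flows(duration=window)   # capture flows for `window` seconds
        deep_analyze(flows)                      # PELID prediction on the sampled flows
        time.sleep(max(cycle - window, 0))       # idle until the next sampling cycle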

[Figure 2. Architecture of APELID-based Intrusion Detection: incoming and outgoing traffic passes a signature-based detector (drop/reject/alert/pass); the remaining 'Other' flows are sensed, featurized, and sent to the DeepAnalyzer, where PELID (XGB, CBT, GBM, BME, DNN with weighted voting) labels each flow as Benign or one of the eleven attack classes; network-transferred files are extracted to the FileStore and submitted to the MalwareAnalyzer (sandbox host with analysis VMs, alerting when the score exceeds 7); alerts, IoCs, and new signatures are fed back to the IDPS ruleset.]
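Before the formal listing in Algorithm 3 below, this is a rough sketch of the FileStore polling and submission flow of §3.4. The REST endpoint, folder path, and polling interval are assumptions for illustration, not the exact interface of the deployed sandbox.

import time
from pathlib import Path
import requests

FILESTORE = Path("/var/idps/filestore")   # assumed path of the IDPS file-extraction folder
SUBMIT_URL = "http://sandbox.local:8090/tasks/create/file"   # hypothetical sandbox REST endpoint

def poll_filestore(interval=30):
    """Periodically submit newly extracted files to the sandbox-based MalwareAnalyzer."""
    seen = set()
    while True:
        for path in FILESTORE.glob("*"):
            if path.is_file() and path.name not in seen:
                with path.open("rb") as fh:
                    requests.post(SUBMIT_URL, files={"file": (path.name, fh)})
                seen.add(path.name)
        # Report retrieval, the severity-7 threshold, and rule generation are handled
        # on the MalwareAnalyzer side, as formalized in Algorithm 3.
        time.sleep(interval)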

Algorithm 3 Malware Detection
Input: F - new files transferred in the network and accumulated in the FileStore folder.
Output: (msg, R) - (alert message; new rules generated from detected malware files)
1: Ready ← Wait_Sandbox_Ready   ▷ Blocking function until the Sandbox is ready.
2: IngestFiles(F)               ▷ Send F from the FileStore folder to the Sandbox.
3: score ← HybridAnalyzer(F)    ▷ Determine the overall score of both static and dynamic analysis.
4: if score > 7 then            ▷ Critical suspicious file.
5:     R ← RuleGenerator(F)     ▷ Update the rules to block the connection.
6:     msg ← 'Detected Malware Files'
7:     return msg, R
8: end if

Notably, the MalwareAnalyzer is not designed to prevent malware files in real-time. Nevertheless, our method enables us to proactively enhance and improve the IoC and signature database of the IDPS in response to comparable future threats. Alg. 3 illustrates our strategy for analyzing and identifying these malware files.

4. Experiments & Evaluation

To demonstrate the performance of our method, we conduct a comprehensive experiment to answer the following research questions:
1. RQ1: Does the AWGAN data augmentation method allow for improvement in the quality of the training dataset?
2. RQ2: Does combining multiple AI models in PELID, both traditional ML and DL, allow for enhancing the performance of network intrusion detection? Is it possible to detect unsigned attacks?
3. RQ3: When deploying an inline IDPS system in an intranet with large-scale network traffic, is the deep analysis of network flows with the AI model generated by the APELID method fast enough to ensure that network flows are handled in real time?
4. RQ4: Is implementing malware file detection in the inline IDPS system, combined with AI model-based deep analysis, possible?

Our rigorous experiments were conducted to answer the above research questions. The following sections, in turn, detail the results we obtained while experimenting with and evaluating our APELID method.

4.1. Experiment Setup

This section describes the computing infrastructure and the software framework libraries used to evaluate the APELID method. We use Python version 3.8 as the programming language with the following libraries and frameworks: Fastai V2.7.10, Pandas V1.2.3, Matplotlib V3.7.1, Scikit-learn V1.2.2, and Numpy V1.20.2, on a server machine with 2 x Intel Xeon-Platinum 8160 (24 cores), 256 GB DDR4 RAM, and an NVIDIA Tesla T4 16 GB.

4.2. AWGAN-based Dataset Preparation

Two well-known datasets, CSE-CIC-IDS2018 and NSL-KDD [20], have been selected to perform our rigorous experiments, evaluate, and solve the RQs mentioned above.

These datasets were also selected based on our assessment of the SOTA methods in Section 2. CSE-CIC-IDS2018 has a total of 12 classes labeled [Benign, Infiltration, Bot, DDoS-HOIC, DoS-GoldenEye, DoS-Hulk, DoS-SlowHTTPTest, DoS-Slowloris, DDoS-LOIC-UDP, BruteForce-Web, BruteForce-XSS, SQL-Injection]. The Benign samples in this dataset are much more numerous than the attack samples. It has enough samples for Bot and DDoS-HOIC, while SQL-Injection and BruteForce-Web have very few attack samples.
Related to NSL-KDD, it has four attack classes: [DoS, Probe, U2R, R2L]. Like CSE-CIC-IDS2018, we also observe that Benign samples dominate the attack samples. However, DoS possesses a larger number of samples, while the other attacks, i.e., R2L and U2R, suffer from very low sample counts.
For both datasets, the samples of each label (in each dataset) are largely imbalanced, as illustrated in Table 2. Inspired by [43, 20], to enhance these datasets, the AWGAN algorithm is applied to each class, where the parameter r (ratio between training and testing sets) is set to 7:3 and τ (maximum samples in a label) is set to 20,000. Note that, with AWGAN set to these parameters, a class having more than 20,000 samples will be "compressed," selecting only 20,000 samples. Meanwhile, a class with fewer than 20,000 samples will be "zoomed" and "generated" with more realistic samples up to 20,000. A point that should also be emphasized here is that for classes with fewer than 20,000 samples, the training/test set division must be performed before applying WGAN. For example, with the "BruteForce-Web" class of CSE-CIC-IDS2018, with 261 samples, the test set will be randomly selected at 30% ∗ 261 ≃ 78 samples. The remaining 183 samples will be fed into AWGAN to generate up to 14,000 samples.
Similarly, after utilizing our AWGAN, we obtained the augmented training sets for training the AI models and the test sets for evaluating them. Finally, Table 2 summarizes the number of samples for each class of both datasets. In this table, the Original column represents the number of original samples after removing NaN and duplicate values, the Train column is the number of samples augmented by AWGAN, and the Test column shows the original samples used to evaluate PELID.
For simplicity, the augmented train set and test set of CSE-CIC-IDS2018 are called DS1 for short; DS2 stands for the augmented train set and test set of NSL-KDD.

Table 2. Augmented Datasets

Label | Original | Train | Test
DS1: CSE-CIC-IDS2018
Benign | 4,360,029 | 14,000 | 6,000
Infiltration | 160,604 | 14,000 | 6,000
Bot | 282,310 | 14,000 | 6,000
DDoS-HOIC | 668,461 | 14,000 | 6,000
DoS-GoldenEye | 41,455 | 14,000 | 6,000
DoS-Hulk | 434,873 | 14,000 | 6,000
DoS-SlowHTTPTest | 13,067 | 14,000 | 4,082
DoS-Slowloris | 6,977 | 14,000 | 2,093
DDoS-LOIC-UDP | 1,120 | 14,000 | 336
BruteForce-Web | 261 | 14,000 | 78
BruteForce-XSS | 97 | 14,000 | 29
SQL-Injection | 53 | 14,000 | 17
DS2: NSL-KDD
Benign | 61,343 | 14,000 | 6,000
DoS | 39,927 | 14,000 | 6,000
Probe | 8,333 | 14,000 | 2,500
R2L | 637 | 14,000 | 191
U2R | 40 | 14,000 | 12

4.3. Hyperparameter Optimization

We select model parameters to optimize the DL models based on a technique called Hyperparameter Optimization [53]. We utilize Ax to determine optimal parameters. Ax is a platform for optimizing all types of experiments, including ML experiments, A/B tests, and simulations. We also employ a method known as Bayesian Optimization. It begins by constructing a smooth surrogate model of the outcomes using Gaussian processes and prior experimental observations [54].
Hyperparameters in the XGB model can accept two categories of values, range and choice. We determined the learning rate to be 0.003, n_estimators to be 100, and max_depth to be 9. During the training phase, we expressly define the optimal values for the hyperparameters to configure XGB as a tree booster. Similarly, the same methods are utilized for the remaining machine-learning models. Additionally, for our ensemble learning model, we developed a Python utility for modifying the weight ratios of the five models to obtain the optimal model. Every time a weight ratio is altered by 0.01, the aggregate of the weight ratios remains 1. Finally, the weight ratios of the XGB, CBT, GBM, BME, and DNN models are 0.3, 0.2, 0.2, 0.2, and 0.1, respectively.

4.4. Evaluation Metrics

To evaluate the network intrusion detection method, we use standard metrics computed from the confusion matrix, such as Accuracy (Acc), Precision (Prec), Recall (Rec), and F1-score (F1). All of the ML models developed in this study are multi-label classification models [55, 56]. Therefore, the metrics for evaluating model performance need to be based on an overall assessment of all n results of predicted labels ŷ_i and real labels y_i, where i ∈ [1...n]. The following overall formulas compute these metrics:

Acc = (1/n) Σ_{i=1}^{n} |y_i ∩ ŷ_i| / |y_i ∪ ŷ_i|        Prec = (1/n) Σ_{i=1}^{n} |y_i ∩ ŷ_i| / |y_i|

Rec = (1/n) Σ_{i=1}^{n} |y_i ∩ ŷ_i| / |ŷ_i|        F1 = (1/n) Σ_{i=1}^{n} 2|y_i ∩ ŷ_i| / (|y_i| + |ŷ_i|)
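The snippet below is a minimal, assumption-laden reading of these example-based formulas, treating each y_i and ŷ_i as a set of labels; the One-vs-Rest AUC discussed next can then be obtained with scikit-learn's roc_auc_score. It is a verification aid, not the exact evaluation code of this paper.

import numpy as np
from sklearn.metrics import roc_auc_score

def example_based_metrics(y_true, y_pred):
    """Acc, Prec, Rec, F1 as defined above; y_true / y_pred are lists of label sets."""
    n = len(y_true)
    acc = prec = rec = f1 = 0.0
    for yt, yp in zip(y_true, y_pred):
        inter = len(yt & yp)
        acc  += inter / len(yt | yp)
        prec += inter / len(yt)
        rec  += inter / len(yp)
        f1   += 2 * inter / (len(yt) + len(yp))
    return acc / n, prec / n, rec / n, f1 / n

# One-vs-Rest multiclass AUC (Section 4.4), given true labels and per-class probabilities:
# auc = roc_auc_score(y_true_labels, y_proba, multi_class="ovr", average="macro")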


To measure the overall performance of intrusion detection, we also use the false positive rate (FPR) and false negative rate (FNR), computed by the following formulas:

FPR = False Intrusion Number / Total Number of Benigns
FNR = False Benign Number / Total Number of Intrusions

In order to evaluate the efficacy of multiclass classification, we also use the Area Under the Receiver Operating Characteristic Curve (AUC) [56]. It is worth noting that with numerous intrusion classes, the One-vs-Rest (OvR) strategy is more suitable than One-vs-One for calculating the AUC. OvR can also handle class imbalances since it treats each class independently and does not require balanced class distributions. Thus, we compute the AUC for each class individually against the rest of the classes, and then average the individual AUC scores to obtain the multiclass AUC:

AUC = (1/n) Σ_{i=1}^{n} AUC_i

4.5. Experiment Results

We implemented the AWGAN and PELID algorithms and deployed them in an inline IDPS in order to validate the four research questions mentioned above. Therefore, three scenarios are proposed to evaluate our methods: DS1-based experiments, DS2-based experiments, and a practical model for hunting malware in an IDPS using the sandbox method.
The experiment process for both datasets is the same. First, the augmented training set is used to train the five individual AI models of PELID. Then, the testing set is used to assess not just the performance of the five AI models, but also the ensemble model PELID. In addition, the experiment computes all the evaluation metrics and measures the time required to analyze each traffic flow by PELID-based intrusion detection. To prevent the impact of other processes on the testing server, the time consumption of PELID is the average value obtained from six separate prediction runs. The obtained results are summarized and evaluated in the next subsections.

Table 3. Confusion Matrix of DS1-based PELID (rows: true label; columns: predicted label, in the order Benign, Bot, BruteForce-Web, BruteForce-XSS, DDoS-HOIC, DDoS-LOIC-UDP, DoS-GoldenEye, DoS-Hulk, DoS-SlowHTTPTest, DoS-Slowloris, Infiltration, SQL-Injection)

Benign | 6000 0 0 0 0 0 0 0 0 0 0 0
Bot | 0 6000 0 0 0 0 0 0 0 0 0 0
BruteForce-Web | 0 0 78 0 0 0 0 0 0 0 0 0
BruteForce-XSS | 0 0 0 29 0 0 0 0 0 0 0 0
DDoS-HOIC | 0 0 0 0 6000 0 0 0 0 0 0 0
DDoS-LOIC-UDP | 0 0 0 0 0 336 0 0 0 0 0 0
DoS-GoldenEye | 0 0 0 0 0 0 5999 0 0 1 0 0
DoS-Hulk | 0 0 0 0 0 0 0 6000 0 0 0 0
DoS-SlowHTTPTest | 0 0 0 0 0 0 0 0 4082 0 0 0
DoS-Slowloris | 0 0 0 0 0 0 0 0 0 2093 0 0
Infiltration | 0 0 0 0 0 0 0 0 0 0 6000 0
SQL-Injection | 0 0 1 3 0 0 0 0 0 0 0 13

Table 4. Confusion Matrix of DS2-based PELID (rows: true label; columns: predicted label, in the order DoS, Probe, R2L, U2R, Benign)

DoS | 6000 0 0 0 0
Probe | 0 2484 0 0 16
R2L | 0 0 185 0 6
U2R | 0 0 0 4 8
Benign | 0 18 4 0 5978

4.5.1. DS1-based Experimental Results

These experiments focus on evaluating our APELID method by utilizing the DS1 dataset, augmented from CSE-CIC-IDS2018 by AWGAN, to train and evaluate the AI models. In this scenario, we trained all five specialized models and incorporated them into the PELID model on the GPU computing infrastructure mentioned above. After training, the test set of DS1 is used to evaluate both the five single models and the ensemble model according to all the evaluation metrics. The detailed results of the experiment with DS1 are illustrated in the first part of Table 5 and the confusion matrix shown in Table 3.
All five individual models evaluated in this study achieved an F1-score of 99.77% or above, indicating excellent performance. This demonstrates the efficiency gains in intrusion detection made possible by data augmentation using the AWGAN algorithm.
These experiment results show that 1/17 SQL-Injection attacks were identified as BruteForce-Web, 3/17 SQL-Injection attacks were denoted as BruteForce-XSS, 1/17 SQL-Injection attacks were classified as Infiltration, and 1/6,000 DoS-GoldenEye attacks was identified as DoS-Slowloris. All the attack flows (42,635 total attacks) are detected by PELID. There are no false positives in intrusion detection. Overall, the F1-score of PELID is 99.99%, and its value is the same for the other metrics of Acc, Prec, and Rec.

4.5.2. DS2-based Experimental Results

Similar to the previous experiments with DS1, we also evaluate the proposed APELID method with the DS2 dataset. Therefore, we train all five individual models with DS2's augmented training set, just as we did with DS1. Both the five individual models and our PELID ensemble model are then evaluated on the DS2 test set.

Table 5. Evaluation of AI models (%)


DS1-based Evaluation DS2-based Evaluation
Metric
XGB CBT GBM BME DNN PELID XGB CBT GBM BME DNN PELID
F1 99.77 99.92 99.95 99.77 97.75 99.99 99.48 99.21 99.48 99.48 98.00 99.63
Acc 99.76 99.92 99.96 99.98 97.54 99.99 99.49 99.22 99.56 99.43 98.07 99.65
Prec 99.83 99.93 99.96 99.98 98.20 99.99 99.49 99.21 99.49 99.41 98.03 99.65
Rec 99.76 99.92 99.96 99.98 97.54 99.99 99.49 99.22 99.49 99.43 98.07 99.65
FPR 0 0 0.03 0 0.13 0 0.67 1.27 0.63 0.77 1.22 0.37
FNR 0 0.01 0 0 1.37 0 0.37 0.39 0.30 0.32 2.26 0.34
AUC 100 100 99.99 99.99 98.69 100 99.99 99.98 99.99 99.89 99.85 99.99

Table 6. Malware Hunting Results


FTP N Malware Type Hash VT APELID
Web Server
1 QuasarRAT .exe 832ab3a898d188426d3541e1533b55f9 56/68 Yes
2 Loki .xlsx 5b6aec60c3be4724f7980a659206531a 29/58 Yes
Database 3 STRRAT .jar 2199150e7d79d0e831cda314c7ce6f56 28/62 Yes
4 AsynRAT .doc da6419e4d4e4528990898bcfdaa85e01 32/60 Yes
Sanbo
5 SnakeKeylogger .exe 715b0f6390ba4387a4155c1d59a3669c 49/69 Yes
Switch 6 AgentTesla .exe 5c590fcb32aedec16532aa857eec28b5 40/66 Yes
7 OskiStealer .xlsx 6a9203346218dded19d0a8a1dee24023 20/59 Yes
8 NanoCore .exe 4bae18ac4a73ff38f7ed718365e6c2b2 41/67 Yes
9 DanaBot .exe 5f4731a4ef7d1484893213caaf6a6685 42/69 Yes
Mail Files 10 DCRAT .exe ea800644b9dfd027807447fdd98241aa 50/68 Yes
11 YellowCockatoo .dll df7b2ece343c52df774d72e12ea09009 51/69 Yes
IDPS Sandbox 12 RemoteManipulator .exe 4c5649e9b9a2d9997ac2600a804e0aeb 41/68 Yes
DMZ 13 Pony .exe ab468a5b5cd9470c0895097efa2a687f 63/71 Yes
14 Stealc .exe cea30f806e644cebe48399eefa345e51 47/71 Yes
15 njRat .exe b17414d6949c2e013de14fdc268cfc89 65/71 Yes
Attacker 16 RedLineStealer .exe 8a61e10948c23a9a5c353d28b8738490 35/71 Yes
FW 17 Guildma .zip 8a61e10948c23a9a5c353d28b8738490 35/71 Yes
18 Gozi .js 1df2e7a13459223b2cc55b93744add77 24/71 Yes
19 DarkTortilla .exe 1c354a83f81063dc75612a9a7bd51225 54/71 Yes
20 VectorStealer .xlsx 5b47098a17ecd534de15df03b12beacb 40/71 Yes

Figure 3. Malware Hunting Scenario

The second part of Table 5 shows the experimental results obtained with DS2, and Table 4 presents the PELID model's confusion matrix.

The experimental findings in this scenario demonstrate the efficacy of the individual models in intrusion detection, with F1 values of 98% or higher. In particular, we can see that the AWGAN algorithm has greatly improved the quality of the training set, confirmed by the fact that all of the AUC measurements exceed 99.85%.

For our PELID method, the confusion matrix in Table 4 shows that the FNR is 0.34%: in total, 30 network attacks (16 Probe, 6 R2L, and 8 U2R) are not detected by PELID. The FPR is 0.37%: 22 Benign flows are classified as intrusions.

4.5.3. Malware Hunting Assessment

This experiment scenario is designed to assess the sandbox-based malware hunting capability of the APELID method. Here, we utilize experimental data consisting of 80 benign and 20 malicious files. The benign files are Windows system files obtained from a freshly installed Windows virtual machine and downloaded from reputable Internet sources. The malware files were downloaded from public sources such as https://bazaar.abuse.ch/ and https://virustotal.com/.

It is anticipated that the answer to our RQ4 will be a partial and temporary "YES", because dynamic analysis can be used to detect malware behavior; the following experiment provides a proof of concept. To evaluate the detection capabilities of sandbox-based analysis, we conducted the experiments on a host equipped with Ubuntu 20.04 LTS, an Intel i7 CPU clocked at 2.3 GHz, and 8 GB of RAM.

This scenario involves two completely separate networks, shown in Fig. 3: a DMZ network, which includes a Web server (HTTP and FTP) and a Mail server (SMTP), and an attacker network. We used 100 files, including 80 normal and 20 malware files, which were sent to the DMZ network and uploaded as administrator to the sandbox.

The IDPS automatically captured files transmitted between networks over unencrypted protocols such as FTP and HTTP. We wrote a Python tool to automatically check the captured files and send them to the sandbox-based analysis for malware hunting. We compared the experimental results with VirusTotal (VT), as shown in Table 6, which indicates that our custom sandbox can detect common file types such as .exe, .dll, and .jar. This demonstrates that we can use the sandbox to detect malware file transfers between networks and to proactively hunt malware in suspicious files.

Hence, these results consolidate the affirmation of RQ4: sandbox-based dynamic analysis can be used to detect malware.
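The file-forwarding tool mentioned above is not listed in the paper; the following is only a minimal sketch of how such a tool could look, assuming a Cuckoo-style sandbox exposing its standard REST API. The host, port, API token, and extension list are placeholders, not values from the deployed system.

import os
import requests

SANDBOX_API = "http://<sandbox-host>:8090"          # hypothetical Cuckoo-style REST API endpoint
HEADERS = {"Authorization": "Bearer <api-token>"}    # placeholder token
SUSPICIOUS_EXT = {".exe", ".dll", ".jar", ".doc", ".xlsx", ".js", ".zip"}

def submit_captured_file(path):
    """Send one captured file to the sandbox and return the created task id."""
    with open(path, "rb") as fh:
        resp = requests.post(f"{SANDBOX_API}/tasks/create/file",
                             files={"file": (os.path.basename(path), fh)},
                             headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json().get("task_id")

def scan_capture_dir(capture_dir):
    """Submit every captured file whose extension looks worth analyzing."""
    for name in os.listdir(capture_dir):
        if os.path.splitext(name)[1].lower() in SUSPICIOUS_EXT:
            task = submit_captured_file(os.path.join(capture_dir, name))
            print(f"{name}: submitted as sandbox task {task}")

A production version would also deduplicate submissions by file hash and poll the corresponding report before deciding whether to raise an alert.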
Figure 4. t-SNE-based Visualization of DS1 Training Set: (a) Original, (b) Augmented

Figure 5. t-SNE-based Visualization of DS2 Training Set: (a) Original, (b) Augmented
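Figures 4 and 5 are two-dimensional t-SNE projections of the training sets before and after augmentation (see Section 4.6.1 below). A minimal sketch of how such a projection can be produced, assuming scikit-learn and matplotlib and a hypothetical feature matrix X with integer-encoded labels y, is:

import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(X, y, title):
    """Project the feature matrix to 2-D with t-SNE and scatter-plot it by class."""
    emb = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=42).fit_transform(X)
    plt.figure(figsize=(6, 5))
    scatter = plt.scatter(emb[:, 0], emb[:, 1], c=y, s=3, cmap="tab20")
    plt.legend(*scatter.legend_elements(), title="Class", fontsize=6)
    plt.title(title)
    plt.tight_layout()
    plt.show()

# e.g. plot_tsne(X_train_original, y_train_original, "Original")
#      plot_tsne(X_train_augmented, y_train_augmented, "Augmented")

In practice, running t-SNE on the full augmented set can be slow, so subsampling a few thousand flows per class is usually enough to show the cluster structure.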
4.6. Evaluation and Discussion

4.6.1. Efficacy of AWGAN

To evaluate the efficiency of the data augmentation method AWGAN, we use t-distributed stochastic neighbor embedding (t-SNE) [57] to visualize the high-dimensional training sets of both DS1 and DS2. Fig. 4a and 5a show the original data before AWGAN-based augmentation, while Fig. 4b and 5b illustrate the augmented training sets of DS1 and DS2, respectively.

As illustrated in Fig. 4 and 5, the visualization confirms that the training set obtained with AWGAN-based augmentation overcomes the challenges of sparse and unbalanced data: the DS1 and DS2 datasets exhibit more distinct clusters corresponding to their classes than before the augmentation. Together with the very high intrusion detection results reported in Table 5 for both datasets, this clearly shows that AWGAN enhances the quality of the training set. Therefore, these results allow us to answer RQ1 affirmatively.

4.6.2. Efficacy of PELID in Intrusion Detection

With the data augmentation algorithm AWGAN and the ensemble learning method PELID, APELID achieves outstanding intrusion detection performance: 99.99% for Accuracy, Precision, F1-score, and Recall on the CSE-CIC-IDS2018 dataset. Moreover, for the NSL-KDD dataset, these evaluation metrics also reach the excellent value of 99.65% across the board. Table 5 also shows that the AUC of all five single models is close to 100% for both datasets. In particular, the AUC of the PELID model is 100% with DS1 and 99.99% with DS2. These two values clearly demonstrate the near-ideal classification efficiency of the PELID model and show that it essentially handles the problem of class imbalance.

Compared with the individual AI models, as illustrated in Table 5, PELID improves the F1-score by between 0.04% (over the GBM model) and 2.24% (over DNN) on DS1, and by between 0.17% (over XGB, GBM, and BME) and 1.63% (over DNN) on DS2.
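These gains come from combining the five individual models, while the parallel execution discussed in Section 4.6.3 keeps that combination cheap at inference time. As an illustration only (the actual PELID aggregation rule is defined earlier in the paper), the sketch below runs several already-fitted classifiers concurrently and soft-votes their class probabilities; both the probability averaging and the thread-based parallelism are assumptions, not the authors' implementation.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_ensemble_predict(models, X):
    """Run predict_proba of every fitted model in parallel and soft-vote the result."""
    # Assumes all models expose predict_proba with the same class ordering.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        probas = list(pool.map(lambda m: m.predict_proba(X), models))
    avg = np.mean(probas, axis=0)                      # average class probabilities
    return models[0].classes_[np.argmax(avg, axis=1)]  # winning class label per flow

# e.g. y_pred = parallel_ensemble_predict([xgb, cbt, gbm, bme, dnn], X_test)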
In addition, both the FPR and FNR rates are lower than those of all five individual models (DNN, XGB, CBT, GBM, and BME). These results allow us to respond to RQ2: combining multiple AI models in PELID improves network intrusion detection. With a very high F1-score and both FPR and FNR below 0.37%, PELID can detect unknown intrusion attacks, and the combination of five individual AI models also increases its resilience to adversarial attacks. Moreover, the PELID model trained on DS1 has nearly perfect intrusion detection performance: a 99.99% F1-score, 100% AUC, and zero FPR and FNR. Therefore, this model is selected for integration into our IDPS in order to improve the detection efficacy for the eleven types of intrusions.

4.6.3. Efficacy of PELID in Time Consumption

By executing the individual AI models in parallel, PELID theoretically reduces the execution time, and our experiments demonstrate this conclusively.

Fig. 6 shows that the average time PELID needs to predict the 14,703 flows of the DS2 testing set, measured over six different runs, is 251.81 ms. Meanwhile, PELID needs an average of 950.48 ms over six runs to analyze the 42,635 flows of the DS1 testing set. Therefore, the average time PELID needs to investigate one flow is 17.13 µs with DS2 and 22.29 µs with DS1. These experiments also indicate that the PELID-based analysis is faster with DS2 than with DS1, which comes from the fact that the NSL-KDD dataset has fewer features and labels than CSE-CIC-IDS2018 (40 vs. 71 features and 4 vs. 12 labels).

The time consumption of PELID with DS1 and DS2 can be used to estimate the achievable network traffic throughput. Based on Fig. 6, the PELID-based deep analysis can process 44,863 flows/s with the model trained on DS1 and 58,377 flows/s with the model trained on DS2. Using the notions of "mouse" and "elephant" flows from [58], we find that for mouse flows (less than 10 KB per flow) PELID can analyze a network throughput of 44,863 × 10 KB ≈ 438 MB/s ≈ 3.42 Gbps with the DS1-based model, or 58,377 × 10 KB ≈ 570 MB/s ≈ 4.45 Gbps with the DS2-based model. For elephant flows (more than 10 MB per flow), PELID can reach up to 44,863 × 10 MB ≈ 448,630 MB/s ≈ 3,504 Gbps or 58,377 × 10 MB ≈ 583,770 MB/s ≈ 4,560 Gbps, respectively. Therefore, PELID-based intrusion detection can be performed in large-scale networks. Consequently, RQ3 has been resolved by all these experimental results.

The experiments in Table 5 show that the DS1-based PELID achieves higher accuracy, precision, and F1-score than the DS2-based PELID. Therefore, we build an inline IDPS based on the Suricata solution and integrate the PELID model trained on DS1. It is currently deployed in inline mode to detect and prevent intrusions on our university's 10 Gbps large-scale network, and it demonstrates that the time the PELID model needs to analyze a network traffic flow in parallel is short enough to qualify as real-time in practice. Note that the open-source Cuckoo sandbox is also incorporated into our IDPS in order to hunt malware and contribute further to the response to RQ4.

Figure 6. Comparing time consumption (milliseconds) between parallel and sequential processing of PELID. 'Baseline' illustrates the average execution time of the five individual AI models.

Table 7. Comparison with SOTA Methods (%)

DS1: CSE-CIC-IDS2018
Method                 Acc     Prec    F1      Rec
APELID (our)           99.99   99.99   99.99   99.99
MMM-RF [44]            99.98   −       −       −
SDAID [20]             99.93   99.93   99.93   99.93
GAN+RF [9]             99.83   98.68   95.04   92.76
KNN-MQBHOA [39]        99.78   99.56   99.65   99.87
HDLNIDS [37]           98.90   98.63   99.03   99.14
CNN [11]               98.17   95.00   94.00   95.00
AUE [7]                97.90   98.00   98.00   98.00
miniVGGNet [43]        96.99   97.46   97.04   96.97

DS2: NSL-KDD
Method                 Acc     Prec    F1      Rec
APELID (our)           99.65   99.65   99.63   99.65
SDAID [20]             99.62   99.62   99.62   99.62
KNN-MQBHOA [39]        99.00   99.00   97.00   98.00
FFO-PNN [36]           98.99   96.97   96.97   96.97
DLNID [15]             90.73   86.38   89.65   93.17
GMM-WGAN-IDS [1]       86.59   88.55   86.88   86.59
Adaptive-Ensemble [3]  85.20   86.50   86.50   85.20
CAFE-CNN [13]          83.34   85.35   82.60   83.44

4.6.4. Comparison with SOTAs

In order to assess our proposed method, the experimental results of APELID are compared with other recently proposed methods on the same well-known datasets. Table 7 shows the comparison, with the results of the other methods taken directly from their publications. We observe that APELID reaches F1-scores of 99.99% and 99.65%, higher than all SOTA models on CSE-CIC-IDS2018 and NSL-KDD, respectively. Consequently, these comparisons further validate the efficacy of our APELID method and contribute to the response to RQ3.
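As a quick sanity check on the figures quoted in Section 4.6.3, the per-flow latencies, flow rates, and throughput bounds can be reproduced with a few lines. The 1024-based unit conversions are an assumption inferred from the reported values, so small rounding differences from the numbers in the text are expected.

def throughput_report(total_ms, n_flows, flow_size_bytes):
    """Derive per-flow latency, flow rate, and throughput from one batch timing."""
    per_flow_us = total_ms * 1e3 / n_flows                  # microseconds per flow
    flows_per_s = 1e6 / per_flow_us                         # flows analyzed per second
    mbytes_per_s = flows_per_s * flow_size_bytes / 1024**2  # MB/s (1024-based)
    gbits_per_s = mbytes_per_s * 8 / 1024                   # Gbps (1024-based)
    return per_flow_us, flows_per_s, mbytes_per_s, gbits_per_s

# DS1 (CSE-CIC-IDS2018): 950.48 ms for 42,635 flows, "mouse" flows of 10 KB
print(throughput_report(950.48, 42_635, 10 * 1024))  # ≈ 22.29 µs, ≈ 44.9k flows/s, ≈ 438 MB/s, ≈ 3.42 Gbps
# DS2 (NSL-KDD): 251.81 ms for 14,703 flows
print(throughput_report(251.81, 14_703, 10 * 1024))  # ≈ 17.13 µs, ≈ 58.4k flows/s, ≈ 570 MB/s, ≈ 4.45 Gbps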
5. Conclusions

In this paper, we present APELID, an AI-powered method that improves the precision, accuracy, and speed of real-time intrusion detection. To obtain high-quality AI models, we propose augmenting the training set with the AWGAN algorithm. AWGAN is inspired by the WGAN method in order to generate more realistic minority-class samples; in addition, the K-Means algorithm is used to select the finest samples from the majority classes. To improve the precision and accuracy of intrusion detection, APELID includes the PELID algorithm, which is designed to ensemble numerous AI models, including XGB, CBT, GBM, BME, and DNN. To reduce inspection latency, we propose parallel PELID-based deep flow analysis and an efficient strategy for sensing network flows. In addition, we deploy a laboratory for the behavioral analysis of malware in order to proactively combat the proliferation of malware on the network. Consequently, APELID is regarded as a comprehensive and holistic solution for enhancing IDS performance.

For the purpose of evaluating APELID, we conduct rigorous experiments with well-known datasets. The experimental results have validated the superior performance of APELID over other SOTA methods, with F1-scores of 99.99% and 99.65% for the CSE-CIC-IDS2018 and NSL-KDD datasets, respectively. In addition, APELID can be integrated into an inline IDPS thanks to the low latency of its deep flow analysis when combined with an adequate sampling strategy. In future research, we intend to extend our intrusion detection method to prevent unknown and adversarial attacks not only on the network but also on the host.

Data and Code Availability

Our proposed method and the dataset are available at https://github.com/vovanhoang/APELID/.

References

[1] J. Cui, L. Zong, J. Xie, M. Tang, A novel multi-module integrated intrusion detection system for high-dimensional imbalanced data, Applied Intelligence (2022). doi:10.1007/s10489-022-03361-2.
[2] J. Ding, Z. Chen, L. Xiaolong, B. Lai, Sales forecasting based on CatBoost, in: 2020 2nd International Conference on Information Technology and Computer Application (ITCA), 2020, pp. 636–639. doi:10.1109/ITCA52113.2020.00138.
[3] X. Gao, C. Shan, C. Hu, Z. Niu, Z. Liu, An adaptive ensemble machine learning model for intrusion detection, IEEE Access 7 (2019) 82512–82521. doi:10.1109/ACCESS.2019.2923640.
[4] N. G. Narkar, N. M. Shekokar, A rule based intrusion detection system to identify vindictive web spider, in: 2016 International Conference on Computing, Analytics and Security Trends (CAST), 2016, pp. 271–275. doi:10.1109/CAST.2016.7914979.
[5] T. Akiba, S. Sano, T. Yanase, T. Ohta, M. Koyama, Optuna: A next-generation hyperparameter optimization framework, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 2623–2631. doi:10.1145/3292500.3330701.
[6] A. Gouveia, M. P. Correia, Recent Advances in Security, Privacy, and Trust for Internet of Things (IoT) and Cyber-Physical Systems (CPS), 1st Edition, Chapman and Hall/CRC, 2020, Ch. Network Intrusion Detection with XGBoost, pp. 150–156.
[7] F. Zhao, H. Zhang, J. Peng, X. Zhuang, S.-G. Na, A semi-self-taught network intrusion detection system, Neural Computing and Applications 32 (2020). doi:10.1007/s00521-020-04914-7.
[8] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, LightGBM: A highly efficient gradient boosting decision tree, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 3149–3157.
[9] J. Lee, K. Park, GAN-based imbalanced data intrusion detection system, Personal and Ubiquitous Computing 25 (2021). doi:10.1007/s00779-019-01332-y.
[10] X.-D. Zeng, S. Chao, F. Wong, Optimization of bagging classifiers based on SBCB algorithm, in: 2010 International Conference on Machine Learning and Cybernetics, Vol. 1, 2010, pp. 262–267. doi:10.1109/ICMLC.2010.5581054.
[11] M. Mbow, H. Koide, K. Sakurai, Handling class imbalance problem in intrusion detection system based on deep learning, International Journal of Networking and Computing 12 (2022) 467–492.
[12] R. Chowdhury, S. Sen, A. Goswami, S. Purkait, B. Saha, An implementation of bi-phase network intrusion detection system by using real-time traffic analysis, Expert Systems with Applications 224 (2023) 119831. doi:10.1016/j.eswa.2023.119831.
[13] E. Shams, A. Rizaner, A. Ulusoy, A novel context-aware feature extraction method for convolutional neural network-based intrusion detection systems, Neural Computing and Applications 33 (2021) 1–19. doi:10.1007/s00521-021-05994-9.
[14] H. Zhang, L. Huang, C. Q. Wu, Z. Li, An effective convolutional neural network based on SMOTE and Gaussian mixture model for intrusion detection in imbalanced dataset, Computer Networks 177 (2020) 107315. doi:10.1016/j.comnet.2020.107315.
[15] Y. Fu, Y. Du, Z. Cao, Q. Li, W. Xiang, A deep learning model for network intrusion detection with imbalanced data, Electronics 11 (2022) 898. doi:10.3390/electronics11060898.
[16] S. Jamalpur, Y. S. Navya, P. Raja, G. Tagore, G. R. K. Rao, Dynamic malware analysis using Cuckoo sandbox, in: 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), 2018, pp. 1056–1060. doi:10.1109/ICICCT.2018.8473346.
[17] W. Liu, S. Zhong, A novel dynamic model for web malware spreading over scale-free networks, Physica A: Statistical Mechanics and its Applications 505 (2018) 848–863. doi:10.1016/j.physa.2018.04.015.
[18] M. Vasilescu, L. Gheorghe, N. Tapus, Practical malware analysis based on sandboxing, in: 2014 RoEduNet Conference 13th Edition: Networking in Education and Research Joint Event RENAM 8th Conference, 2014, pp. 1–6. doi:10.1109/RoEduNet-RENAM.2014.6955304.
[19] S. Liu, P. Feng, S. Wang, K. Sun, J. Cao, Enhancing malware analysis sandboxes with emulated user behavior, Computers & Security 115 (2022) 102613. doi:10.1016/j.cose.2022.102613.
[20] H. V. Vo, H. N. Nguyen, T. N. Nguyen, H. P. Du, SDAID: Towards a hybrid signature and deep analysis-based intrusion detection method, in: IEEE Global Communications Conference, 2022, pp. 2615–2620. doi:10.1109/GLOBECOM48099.2022.10001582.
[21] G. Bingham, R. Miikkulainen, Discovering parametric activation functions, Neural Networks 148 (2022) 48–65. doi:10.1016/j.neunet.2022.01.001.
[22] S. T. Ikram, A. K. Cherukuri, B. Poorva, P. S. Ushasree, Y. Zhang, X. Liu, G. Li, Anomaly detection using XGBoost ensemble of deep neural network models, Cybernetics and Information Technologies 21 (3) (2021) 175–188. doi:10.2478/cait-2021-0037.
[23] J. A. Sáez, B. Krawczyk, M. Woźniak, On the influence of class noise in medical data classification: Treatment using noise filtering methods, Applied Artificial Intelligence 30 (6) (2016) 590–609. doi:10.1080/08839514.2016.1193719.
[24] G. Bovenzi, G. Aceto, D. Ciuonzo, V. Persico, A. Pescapé, A hierarchical hybrid intrusion detection approach in IoT scenarios, in: GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020, pp. 1–7. doi:10.1109/GLOBECOM42002.2020.9348167.
[25] L. Zhang, S. Jiang, X. Shen, B. B. Gupta, Z. Tian, PWG-IDS: An intrusion detection model for solving class imbalance in IIoT networks using generative adversarial networks, CoRR abs/2110.03445 (2021). arXiv:2110.03445.
[26] J. Sinha, M. Manollas, Efficient deep CNN-BiLSTM model for network intrusion detection, in: Proceedings of the 2020 3rd International Conference on Artificial Intelligence and Pattern Recognition, AIPR '20, Association for Computing Machinery, New York, NY, USA, 2020, pp. 223–231. doi:10.1145/3430199.3430224.
[27] N. Gupta, V. Jindal, P. Bedi, CSE-IDS: Using cost-sensitive deep learning and ensemble algorithms to handle class imbalance in network-based intrusion detection systems, Computers & Security 112 (2021) 102499. doi:10.1016/j.cose.2021.102499.
[28] P. Jeatrakul, K. W. Wong, C. C. Fung, Classification of imbalanced data by combining the complementary neural network and SMOTE algorithm, in: Proceedings of the 17th International Conference on Neural Information Processing: Models and Applications - Volume Part II, ICONIP'10, Springer-Verlag, Berlin, Heidelberg, 2010, pp. 152–159.
[29] P. Mishra, V. Varadharajan, U. Tupakula, E. S. Pilli, A detailed investigation and analysis of using machine learning techniques for intrusion detection, IEEE Communications Surveys & Tutorials 21 (1) (2019) 686–728. doi:10.1109/COMST.2018.2847722.
[30] M. A. Ferrag, L. Maglaras, S. Moschoyiannis, H. Janicke, Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study, Journal of Information Security and Applications 50 (2019). doi:10.1016/j.jisa.2019.102419.
[31] L. Bontemps, V. L. Cao, J. McDermott, N. Le-Khac, Collective anomaly detection based on long short term memory recurrent neural network, CoRR abs/1703.09752 (2017). arXiv:1703.09752.
[32] Y. Li, T. Qin, Y. Huang, J. Lan, Z. Liang, T. Geng, HDFEF: A hierarchical and dynamic feature extraction framework for intrusion detection systems, Computers & Security 121 (2022) 102842. doi:10.1016/j.cose.2022.102842.
[33] K. Alrawashdeh, C. Purdy, Toward an online anomaly intrusion detection system based on deep learning, in: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016, pp. 195–200. doi:10.1109/ICMLA.2016.0040.
[34] P. Jayalaxmi, R. Saha, G. Kumar, M. Alazab, M. Conti, X. Cheng, PIGNUS: A deep learning model for IDS in industrial internet-of-things, Computers & Security (2023) 103315. doi:10.1016/j.cose.2023.103315.
[35] M. Y. Aldarwbi, A. H. Lashkari, A. A. Ghorbani, The sound of intrusion: A novel network intrusion detection system, Computers and Electrical Engineering 104 (2022) 108455. doi:10.1016/j.compeleceng.2022.108455.
[36] N. Omer, A. H. Samak, A. I. Taloba, R. M. Abd El-Aziz, A novel optimized probabilistic neural network approach for intrusion detection and categorization, Alexandria Engineering Journal 72 (2023) 351–361. doi:10.1016/j.aej.2023.03.093.
[37] E. Qazi, M. Faheem, T. Zia, HDLNIDS: Hybrid deep-learning-based network intrusion detection system, Applied Sciences 13 (2023) 4921. doi:10.3390/app13084921.
[38] K. Ren, S. Yuan, C. Zhang, Y. Shi, Z. Huang, CANET: A hierarchical CNN-attention model for network intrusion detection, Computer Communications (2023). doi:10.1016/j.comcom.2023.04.018.
[39] R. Ghanbarzadeh, A. Hosseinalipour, A. Ghaffari, A novel network intrusion detection method based on metaheuristic optimisation algorithms, Journal of Ambient Intelligence and Humanized Computing (2023) 1–18. doi:10.1007/s12652-023-04571-3.
[40] S. Al, M. Dener, STL-HDL: A new hybrid network intrusion detection system for imbalanced dataset on big data environment, Computers & Security 110 (2021) 102435. doi:10.1016/j.cose.2021.102435.
[41] P. Verma, S. Anwar, S. Khan, S. B. Mane, Network intrusion detection using clustering and gradient boosting, in: 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2018, pp. 1–7. doi:10.1109/ICCCNT.2018.8494186.
[42] P. Devan, N. Khare, An efficient XGBoost–DNN-based classification model for network intrusion detection system, Neural Computing and Applications 32 (16) (2020) 12499–12514. doi:10.1007/s00521-020-04708-x.
[43] L. Liu, P. Wang, J. Lin, L. Liu, Intrusion detection of imbalanced network traffic based on machine learning and deep learning, IEEE Access 9 (2021) 7550–7563. doi:10.1109/ACCESS.2020.3048198.
[44] M. Hammad, N. Hewahi, W. Elmedany, MMM-RF: A novel high accuracy multinomial mixture model for network intrusion detection systems, Computers & Security 120 (2022) 102777. doi:10.1016/j.cose.2022.102777.
[45] F. Ullah, S. Ullah, G. Srivastava, J. C.-W. Lin, IDS-INT: Intrusion detection system using transformer-based transfer learning for imbalanced network traffic, Digital Communications and Networks (2023). doi:10.1016/j.dcan.2023.03.008.
[46] C. Adam-Bourdarios, G. Cowan, C. Germain, I. Guyon, B. Kégl, D. Rousseau, The Higgs boson machine learning challenge, in: Proceedings of the 2014 International Conference on High-Energy Physics and Machine Learning - Volume 42, HEPML'14, JMLR.org, 2014, pp. 19–55.
[47] M. H. L. Louk, B. A. Tama, Dual-IDS: A bagging-based gradient boosting decision tree model for network anomaly intrusion detection system, Expert Systems with Applications 213 (2023) 119030. doi:10.1016/j.eswa.2022.119030.
[48] R. Golchha, A. Joshi, G. P. Gupta, Voting-based ensemble learning approach for cyber attacks detection in industrial internet of things, Procedia Computer Science 218 (2023) 1752–1759, International Conference on Machine Learning and Data Engineering. doi:10.1016/j.procs.2023.01.153.
[49] A. Nazir, R. A. Khan, A novel combinatorial optimization based feature selection method for network intrusion detection, Computers & Security 102 (2021) 102164. doi:10.1016/j.cose.2020.102164.
[50] H. V. Vo, D. H. Nguyen, T. T. Nguyen, H. N. Nguyen, D. V. Nguyen, Leveraging AI-driven realtime intrusion detection by using WGAN and XGBoost, in: Proceedings of the 11th International Symposium on Information and Communication Technology, SoICT '22, Association for Computing Machinery, New York, NY, USA, 2022, pp. 208–215. doi:10.1145/3568562.3568660.
[51] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein GAN, Machine Learning (2017).
[52] M. S. E. Sayed, N.-A. Le-Khac, M. A. Azer, A. D. Jurcut, A flow-based anomaly detection approach with feature selection method against DDoS attacks in SDNs, IEEE Transactions on Cognitive Communications and Networking 8 (4) (2022) 1862–1880. doi:10.1109/TCCN.2022.3186331.
[53] G. Muniraju, B. Kailkhura, J. J. Thiagarajan, P.-T. Bremer, C. Tepedelenlioglu, A. Spanias, Coverage-based designs improve sample mining and hyperparameter optimization, IEEE Transactions on Neural Networks and Learning Systems 32 (3) (2021) 1241–1253. doi:10.1109/TNNLS.2020.2982936.
[54] H. V. Le, T. N. Nguyen, H. N. Nguyen, L. Le, An efficient hybrid webshell detection method for webserver of marine transportation systems, IEEE Transactions on Intelligent Transportation Systems 24 (2) (2023) 2630–2642. doi:10.1109/TITS.2021.3122979.
[55] G. P. Dubey, D. R. K. Bhujade, Optimal feature selection for machine learning based intrusion detection system by exploiting attribute dependence, Materials Today: Proceedings 47 (2021) 6325–6331, SI: TIME-2021. doi:10.1016/j.matpr.2021.04.643.
[56] G. V. Le, T. H. Nguyen, P. D. Pham, O. V. Phung, H. N. Nguyen, GuruWS: A hybrid platform for detecting malicious web shells and web application vulnerabilities, Transactions on Computational Collective Intelligence 11370 (2019) 184–208. doi:10.1007/978-3-662-58611-2_5.
[57] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research 9 (2008) 2579–2605.
[58] J. Alvarez-Horcajo, D. Lopez-Pajares, J. M. Arco, J. A. Carral, I. Martinez-Yelmo, TCP-path: Improving load balance by network exploration, in: 6th IEEE International Conference on Cloud Networking, CloudNet 2017, Prague, Czech Republic, September 25-27, 2017, IEEE, 2017, pp. 65–70. doi:10.1109/CloudNet.2017.8071533.