A Bidirectional LSTM Deep Learning Approach For Intrusion Detection

This document proposes a bidirectional Long-Short-Term Memory (BDLSTM) deep learning approach for intrusion detection. It aims to address the high false alarm rates and difficulty detecting certain attack types (U2R and R2L attacks) that existing intrusion detection systems suffer from. The proposed BDLSTM model is trained and evaluated on the NSL-KDD dataset. Experimental results show that the BDLSTM approach outperforms conventional LSTM and existing models in terms of accuracy, precision, recall, F1-score, and lower false alarm rate. It also achieves higher detection accuracy for U2R and R2L attacks compared to conventional LSTM models.

A Bidirectional LSTM Deep Learning Approach for Intrusion Detection

Imrana Yakubu (a,c), Xiang Yanping (a,∗), Liaqat Ali (b,∗), Abdul-Rauf Zaharawu (d,∗)

(a) School of Computer Science and Engineering, University of Electronic Science and
Technology of China (UESTC), Chengdu 611731, China
(b) School of Information and Communication Engineering, University of Electronic
Science and Technology of China (UESTC), Chengdu 611731, China
(c) Department of Computer Science, University for Development Studies (UDS), Tamale,
Ghana
(d) Department of Education, University for Development Studies (UDS), Tamale, Ghana

Abstract
Attacks on computer networks and the Internet have risen alarmingly in recent
times, triggering the need to develop and deploy intrusion detection systems
(IDSs) that help prevent or mitigate the challenges posed by network
intruders. Over the years, IDSs have played, and continue to play, a
significant role in spotting network attacks and anomalies, which have become
a major problem for most Internet and network providers. Researchers around
the globe have proposed many IDSs to combat the threat of network invaders.
However, most previously proposed IDSs raise false alarms at a high rate.
Additionally, most existing models have difficulty detecting certain attack
types, especially User-to-Root (U2R) and Remote-to-Local (R2L) attacks, for
which their detection accuracy is often lower. Hence, in this paper, we
propose a bidirectional Long-Short-Term-Memory (BDLSTM) based intrusion
detection system to handle these challenges. The NSL-KDD dataset, a benchmark
dataset for most IDSs, is used to train and measure the performance of our
model. Experimental results validate the effectiveness of the BDLSTM
approach. The BDLSTM approach not only outperforms the conventional LSTM in
terms of accuracy but also has higher precision, recall and F-score values,
as well as a much lower false alarm rate than the existing models.
Furthermore, the BDLSTM model achieves a higher detection accuracy for U2R
and R2L attacks than the conventional LSTM. The proposed model also
outperforms many state-of-the-art methods recently proposed for the intrusion
detection problem.

Keywords: machine learning, deep learning, recurrent neural networks,
bidirectional LSTM, intrusion detection

∗ Corresponding authors.
Email addresses: [email protected] (Imrana Yakubu),
[email protected] (Xiang Yanping), [email protected] (Liaqat Ali),
[email protected] (Abdul-Rauf Zaharawu)

Preprint submitted to Expert Systems with Applications, June 16, 2020

1. Introduction
Information and Communication Technology (ICT), the Internet of Things
(IoT) and mobile devices have advanced massively in recent years. This
growth has significantly increased the number of individuals and
organizations that rely on wireless networks to accomplish various tasks.
The rise in Internet usage has significantly changed the lives of most
individuals as well as the way most organizations work. However, this rapid
growth in Internet services and the large amount of information traffic have
raised many security concerns in recent times. To deal with these concerns
and make networks more secure, security researchers around the globe have
proposed many different techniques and ideas (Berman et al., 2019).
Intrusion Detection Systems (IDSs) have proven to be one of the most
promising, if not the best, ways to identify and deal with network intruders.
IDSs can identify network systems that have already been intruded as well as
systems that are currently experiencing intrusion.
Intrusion Detection Systems are mostly used to monitor network traffic,
analyze it and spot possible attacks (anomalies) or unauthorized network
access by invaders (Jang-Jaccard & Nepal, 2014). Ideally, an IDS is a
software program (or set of programs) that gathers and analyzes a variety of
criteria (metrics or parameters) relating to a network in order to determine
whether the security of the network has been breached. In general, the term
“IDS” brings three key methodological concepts into play: Anomaly Detection
(Wenke et al., 1999), Misuse Detection (Cannady, 1998) and a Hybrid of the
two (Kim et al., 2014; Depren et al., 2005). With anomaly detection, the IDS
reacts when it detects a deviation from a previously defined state of the
computer system (Beqiri, 2009). Anomaly detection is good at spotting
behavior that differs significantly from normal activity (Gregg, 2014).
Misuse detection, on the other hand, compares recorded user activity against
attack behaviors known to be used to penetrate systems (Cannady, 1998). With
these methodologies in mind, it is essential to note that intrusions can be
attacks coming from the Internet (outsider attacks), authorized users
(insiders) seeking to obtain greater privileges, or privileged users
attempting to misuse their privileges. Security researchers have carried out
a considerable amount of commendable research on anomaly detection for
computer networks and the Internet as a whole. Although these works have
given good outcomes in dealing with anomalies, they have some drawbacks from
an application perspective.
Machine learning techniques such as Support Vector Machine (SVM), K-Nearest
Neighbor (KNN), Random Forest (RF) and Naïve Bayes (NB) have been widely
proposed for detecting and identifying network intruders (Horng et al., 2011;
Manzoor & Kumar, 2017; Chandak et al., 2019; Zhang & Zulkernine, 2006; Koc
et al., 2012). However, these techniques are based on traditional machine
learning and come with a greater computational cost. Most of them also raise
false alarms because they are basically shallow learners that do not gain a
deeper understanding of their datasets. In contrast to traditional machine
learning, more recent approaches referred to as Deep Learning have shown
state-of-the-art performance on many problems (Liaqat et al., 2019b,a),
including intrusion detection. Deep learning provides automated tools for
deep feature extraction and gives a better representation of data that can
be used to generate improved models. The Recurrent Neural Network (RNN) has
become one of the most widely used deep learning approaches for
classification and other evaluations on data sequences, and current research
on intrusion (anomaly) detection builds on it (Tang et al., 2018; Kim & Kim,
2015). Moreover, RNNs can deliver excellent results in sequential learning
and can enhance the detection of anomalies in a network system.
In this work, we propose to use a bidirectional Long-Short-Term-Memory
based RNN model, referred to as BDLSTM, for network anomaly (intrusion)
detection. To train and measure the performance of our model, we use the
NSL-KDD dataset (UNB, 2009), which is publicly available in the University
of New Brunswick (UNB) data repository. This work makes the following
contributions:
i It presents the development and implementation of an IDS using a
bidirectional LSTM model, which is capable of accurately modelling and
handling practical sequences of data.
ii To the best of our knowledge, this is the first study that proposes the
use of BDLSTM for the intrusion detection problem.
iii It proposes a model capable of learning from a labeled dataset whether a
record is normal or a type of attack, and of applying the acquired
knowledge to classify unseen data accurately.
iv It achieves better intrusion detection accuracy than the conventional
LSTM (an improvement of 7.9%). Additionally, the BDLSTM model outperforms
many other recently proposed approaches.
The remainder of this work is organized as follows: Section 2 briefly
explains Deep Learning approaches as applied to intrusion detection.
Section 3 provides an overview of the literature on RNN, LSTM and intrusion
detection. Section 4 presents the Long-Short-Term-Memory (LSTM) model and
describes the dataset used in this work. Section 5 describes the experiments
and discusses the results. In Section 6, we present our conclusion and
future work.

2. An Overview of Deep Learning


Within artificial intelligence (AI), deep learning (DL) is a field that
mimics the way the human brain processes data and generates patterns for
making effective decisions. It is the subset of Machine Learning (ML) that
provides more advanced tools for generating models and uses a layered form
of artificial neural network (ANN) to implement ML algorithms. Also referred
to as deep neural learning (DNL), deep learning architectures are built from
multiple interconnected layers that learn to convert their input data into
multiple levels of abstraction (LeCun et al., 2015; Bengio et al., 2013).
Deep belief networks (DBNs), deep neural networks (DNNs), convolutional
neural networks (CNNs) and recurrent neural networks (RNNs) are types of DL
that have been applied to several research domains, including intrusion
detection, and have yielded commendable results that are close to or even
beyond human performance.

2.1. Recurrent Neural Networks


The Recurrent Neural Network (RNN), a subset of DL, is well suited to
handling sequential data because of its recurrent (circular) connections,
which give it an edge over the traditional feed-forward neural network. It
is a category of neural network that can use previous outputs as inputs
while maintaining hidden states (Berman et al., 2019; Kim & Kim, 2015). RNNs
can process input of any length, keep the model size fixed as the input size
grows, take historical knowledge into account and share weights across time.
Unlike traditional feed-forward networks, which can only remember things
learnt during training, RNNs can recall what they have previously seen and
base their decisions on this previously acquired knowledge (Bengio et al.,
1994). That is to say, RNNs not only recall things learnt during training
but also recall knowledge acquired from previous inputs while creating
outputs. Although RNNs have been used to solve several research problems,
they suffer from drawbacks such as vanishing gradients (Pearlmutter, 1995;
Bengio et al., 1994; Pineda, 1987), which make them unable to learn
long-term dependencies. To combat this drawback, the Long-Short-Term-Memory
(LSTM), which has been adopted for the purpose of this work, was developed.
LSTMs can handle the vanishing gradient problem and, as such, effectively
learn long-term dependencies.
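
The vanishing gradient effect described above can be illustrated with a small numerical sketch. The scalar recurrence h_t = tanh(w h_{t-1} + x_t), the weight 0.5, the constant inputs and the function name are illustrative assumptions and are not settings taken from this work.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2), which is below |w| in size
        grad *= w * (1.0 - h * h)
    return abs(grad)


# Hypothetical setting: recurrent weight 0.5 and a constant input of 0.1
grad_short = gradient_through_time(0.5, [0.1] * 5)
grad_long = gradient_through_time(0.5, [0.1] * 50)
print(grad_short, grad_long)
```

Each step multiplies the gradient by a factor smaller than 0.5 in this setting, so the contribution of the earliest input shrinks geometrically with the sequence length. This is the behaviour that the gating mechanism of the LSTM is intended to counter.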

2.2. Long-Short-Term-Memory (LSTM)


As mentioned earlier, LSTM was developed to solve the vanishing gradient
issue in standard RNNs (Hochreiter & Schmidhuber, 1995). To achieve this,
LSTMs use a mechanism known as gating, which enables them to learn
long-range dependencies. Gating here means that LSTM models use multiple
switch gates that allow them to bypass units and, as a result, recall
information over longer time spans (Hochreiter & Schmidhuber, 1997). Owing
to this ability, LSTMs have gained a lot of attention in recent times and
are being used by many researchers in the security domain to deal with
pressing security issues. As the name implies, a typical LSTM model has
memory units referred to as cells, which accept the current input and the
previous state as input (Hochreiter & Schmidhuber, 1995; Staudemeyer &
Omlin, 2013; Hochreiter & Schmidhuber, 1997). These cells choose what to
keep in and what to discard from the memory and then combine the current
memory, the input and the previous state. By doing so, they can capture
long-range dependencies (Le et al., 2017). As a result, LSTMs have been
adopted by many researchers for intrusion detection in networks (Kim & Kim,
2015; Staudemeyer & Omlin, 2013; Le et al., 2017; Staudemeyer, 2015; Kim et
al., 2016) and have proven to be one of the best techniques for dealing with
this issue, which makes them worth further research attention.

3. Related Works
Intrusion detection has been a persistent problem for researchers in the
security domain. Machine Learning techniques have in recent times proven to
be among the most efficient methods for combating intrusion in network
systems. Researchers in this domain have proposed several ML techniques, a
few of which are discussed in this section.
Traditional machine learning methods such as SVM, RF, KNN and NB have been
proposed by the authors in (Parwez et al., 2017; Reddy et al., 2016; Ikram &
Cherukuri, 2016; Ingre & Yadav, 2015; Nie et al., 2017). Although these
methods have yielded good results over the years, they suffer from some
inherent limitations, which inspired the development of deep neural
networks. In (Tang et al., 2016), Tang et al. proposed a deep learning
approach for intrusion detection in network systems. They applied a DL
technique to flow-based intrusion detection in a software-defined network.
Their model was trained and tested on the NSL-KDD dataset and achieved a
good result. An intrusion detection approach based on a feed-forward DNN was
proposed in (Kasongo & Sun, 2019). That work combined a feed-forward DNN
with a filter-based feature selection technique. The approach uses an
information gain mechanism and was shown experimentally to outperform most
existing traditional ML approaches.

The authors in (Tang et al., 2018) proposed a recurrent neural network
known as GRU-RNN, which uses a gating mechanism, for detecting intrusions in
network systems (specifically software-defined networks). Their approach
uses the NSL-KDD dataset for testing and evaluation. According to the
authors, the GRU-RNN causes no deterioration in network performance and
achieves greater accuracy in detecting anomalies. However, the approach was
based on only six of the features in the dataset. In (Kim et al., 2016), Kim
et al. applied LSTM to an RNN and presented a model for intrusion detection
systems. Their model was trained on the KDD Cup 1999 dataset and produced
good accuracy, confirming the effectiveness of DL for IDSs. On the basis of
LSTM-RNN, Fu et al. (Fu et al., 2018) proposed a smart network attack
detection system whose architecture comprises an input layer, a mean pooling
layer and, for the output, a regression layer. The NSL-KDD dataset was used
to train the model, which performed well and outperformed existing classical
machine learning algorithms (KNN, NB, SVM).
In (Staudemeyer, 2015), the behaviors of normal and malicious users were
used to model network traffic as a time series in a supervised learning
technique to enhance intrusion detection. To evaluate the approach, an LSTM
model was trained on the DARPA and KDD Cup ’99 datasets and tested with
different network topologies. Different feature sets were also evaluated for
detecting attacks, and networks were trained for individual attack types. An
IDS classifier was built in (Le et al., 2017) using a recurrent neural
network approach. According to the authors, Nadam was identified as the most
suitable of six optimizers for the LSTM-RNN and produced better performance
in detecting intruders than existing works. In (Ishitaki et al., 2017), a
Deep Recurrent Neural Network (DRNN) based user behavior prediction method
was presented to monitor the behavior of users in a Tor network. The authors
set up a Tor server and client and used them, with the aid of the Wireshark
network analyzer, to collect data on users of the Tor network. The collected
data was then used to simulate the DRNN model, and good predictions were
obtained.

4. Methodology
In this section, we first present an intuition for the traditional LSTM
architecture. Then, we explain in detail the Bidirectional LSTM (BDLSTM)
based intrusion detection architecture. We further present a discussion of
the NSL-KDD dataset used in training our model.

Figure 1: Basic Architecture of a Long Short-Term Memory Cell

4.1. Traditional LSTM


As explained earlier in Section 2, recurrent neural networks (RNNs) suffer
from the vanishing gradient drawback and are therefore unable to capture
long-term dependencies. To combat this drawback, the LSTM model was
introduced (Schmidhuber, 2015). In an LSTM, conventional neurons are
replaced with memory cells controlled by input, output and forget gates,
which gives it the ability to deal with the vanishing gradient issue of
traditional RNNs. The structure of the traditional LSTM is given in Figure
1, and the connection between the inputs and outputs at times t and t − 1 is
described mathematically by the following equations:

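\[ i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{1} \]

\[ f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{2} \]

\[ c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{3} \]

\[ o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + b_o) \tag{4} \]

\[ h_t = o_t \tanh(c_t) \tag{5} \]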

where c is the cell state, and σ (the sigmoid function) and tanh denote the
activation functions. The input vector is denoted by x and the output is
given by h_t. W and b denote the weight and bias parameters, respectively.
f_t is the forget gate, whose role is to sieve out unwanted information.
i_t (the input gate) and c introduce new information into the cell state.
o_t, the output gate, outputs the relevant information.
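
To make the data flow of Equations (1) to (5) concrete, the following is a minimal sketch of a single time step for a cell with one hidden unit, so that every weight is a scalar. The scalar simplification, the parameter dictionary and the function names are assumptions made for illustration; the sketch follows the equations as written above, including the use of c_{t-1} in the output gate.

```python
import math

# Names of the scalar parameters appearing in Equations (1) to (5)
PARAM_NAMES = [
    "W_xi", "W_hi", "W_ci", "b_i",
    "W_xf", "W_hf", "W_cf", "b_f",
    "W_xc", "W_hc", "b_c",
    "W_xo", "W_ho", "W_co", "b_o",
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step for a single hidden unit (all quantities are scalars)."""
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev + p["W_ci"] * c_prev + p["b_i"])  # Eq. (1)
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev + p["W_cf"] * c_prev + p["b_f"])  # Eq. (2)
    c_t = f_t * c_prev + i_t * math.tanh(p["W_xc"] * x_t + p["W_hc"] * h_prev + p["b_c"])  # Eq. (3)
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev + p["W_co"] * c_prev + p["b_o"])  # Eq. (4)
    h_t = o_t * math.tanh(c_t)  # Eq. (5)
    return h_t, c_t


# Hypothetical parameters: every weight 0.5 and every bias 0.0
params = {name: (0.0 if name.startswith("b_") else 0.5) for name in PARAM_NAMES}
h, c = 0.0, 0.0
for x in [0.2, -0.4, 0.9]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The forget gate scales the previous cell state and the input gate scales the new candidate value, so the cell state is updated additively rather than being overwritten at every step.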

Figure 2: Basic Architecture of a Bidirectional RNN

Figure 3: Architecture of the Proposed BDLSTM Approach

4.2. Bidirectional LSTM (BDLSTM)


The BDLSTM is a combination of the bidirectional RNN and the LSTM, with the
ability to handle sequential input in both directions. To achieve this, the
first recurrent layer is duplicated: the first layer receives the input
sequence as it is, while the duplicated layer receives a reversed copy of
the input sequence. Because both layers are built from LSTM cells, the
vanishing gradient issue of traditional RNNs is still dealt with
effectively. A BDLSTM can be trained using all available input information
from both the past and the future of a particular time frame. Input
sequences are processed in two directions (that is, from left to right and
from right to left) using a forward hidden layer and a backward hidden layer
(Graves et al., 2013). The outputs of these hidden layers are then passed on
to the same output layer (see Figure 3). As shown in Figure 2, the output
sequence y, the forward hidden sequence (h-fwrd) and the backward hidden
sequence (h-back) can be calculated as follows (Graves et al., 2013; Mousa &
Schuller, 2017):
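\[ \overrightarrow{h}_t = H\left(W_{x\overrightarrow{h}} x_t + W_{\overrightarrow{h}\overrightarrow{h}} \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}\right) \tag{6} \]

\[ \overleftarrow{h}_t = H\left(W_{x\overleftarrow{h}} x_t + W_{\overleftarrow{h}\overleftarrow{h}} \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}}\right) \tag{7} \]

\[ y_t = W_{\overrightarrow{h}y} \overrightarrow{h}_t + W_{\overleftarrow{h}y} \overleftarrow{h}_t + b_y \tag{8} \]

where the terms W denote weight matrices ($W_{x\overrightarrow{h}}$ and
$W_{x\overleftarrow{h}}$ are the forward input-hidden weight and backward
input-hidden weight matrices respectively), the terms b
($b_{\overrightarrow{h}}$ and $b_{\overleftarrow{h}}$) denote the bias
vectors in both directions, and the term H represents the hidden layer.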
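
The two passes in Equations (6) to (8) can be sketched in a few lines. For brevity, the sketch below assumes that H is a plain tanh recurrence with scalar weights rather than an LSTM cell; the parameter values and function names are hypothetical.

```python
import math


def run_direction(xs, w_xh, w_hh, b_h):
    """Scan xs in the given order with h_t = tanh(w_xh * x_t + w_hh * h_prev + b_h)."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h + b_h)
        states.append(h)
    return states


def bidirectional_outputs(xs, fwd, bwd, w_fy, w_by, b_y):
    """Combine a forward and a backward scan into one output per time step."""
    h_fwd = run_direction(xs, *fwd)              # Eq. (6): left to right
    h_bwd = run_direction(xs[::-1], *bwd)[::-1]  # Eq. (7): right to left, re-aligned to t
    return [w_fy * hf + w_by * hb + b_y for hf, hb in zip(h_fwd, h_bwd)]  # Eq. (8)


# Hypothetical parameters: (input weight, recurrent weight, bias) per direction
fwd_params = (0.8, 0.5, 0.0)
bwd_params = (0.8, 0.5, 0.0)
xs = [0.2, -0.5, 0.9]
ys = bidirectional_outputs(xs, fwd_params, bwd_params, 1.0, 1.0, 0.0)
print(ys)
```

Because the backward scan starts from the end of the sequence, the output at the first time step already depends on the last input, which a unidirectional layer cannot provide.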

The model proposed in this paper is based on a bidirectional LSTM
implemented with TensorFlow and Keras. The Adaptive Moment Estimation (Adam)
algorithm was used as the optimizer for updating model weights, with a
learning rate of 0.001. The binary cross-entropy function was used as the
loss function for the binary (Normal and Abnormal) classification, and the
categorical cross-entropy function for the multi-class (Normal, DoS, U2R,
R2L, and Probe) classification. As shown in our model’s architecture (see
Figure 3), inputs are mapped to their representations using an Embedding
layer. The embeddings are then passed on to the LSTM layer, which processes
them in two directions: forward and backward. The LSTM outputs are then fed
to fully connected layers, with the rectified linear unit (ReLU) as the
activation function for the hidden layers. Sigmoid and Softmax activation
functions were applied to the final output layer for the binary and
multi-class classifications, respectively. Finally, a dropout probability of
0.2 was applied to the layers to ensure that our model does not overfit the
data.
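
The pairing of output activation and loss function described above can be written out directly. The sketch below is a plain-Python illustration of the standard definitions and not the Keras code used in this work; the function names, the clipping constant and the example values are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Loss for one record of the binary (Normal vs. Abnormal) problem."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one record of the five-class problem."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Hypothetical network outputs for a single traffic record
p_abnormal = sigmoid(1.2)
class_probs = softmax([0.3, 2.1, -0.7, 0.0, 0.5])
print(binary_cross_entropy(1, p_abnormal))
print(categorical_cross_entropy([0, 1, 0, 0, 0], class_probs))
```

The sigmoid yields one probability for the two-class case, while the softmax yields five probabilities that sum to one, which is why each is paired with its own cross-entropy form.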

4.3. Description of Dataset


The NSL-KDD dataset (UNB, 2009; Tavallaee et al., 2009) is one of the
benchmark datasets for evaluating Intrusion Detection Systems (IDSs). It is
an enhanced form of the KDDCup 99 dataset (Dua & Graff, 2017). The method
proposed in this research was trained, evaluated and tested on the NSL-KDD
dataset. The dataset has 125,973 traffic samples for training, and 22,544
traffic samples in KDDTest+ and 11,850 traffic samples in KDDTest-21 for
testing. It comes with 41 features, of which 3 are non-numeric (i.e.
protocol_type, service and flag) and 38 are numeric, as shown in Table 1.
Furthermore, the dataset contains a single class label, which is grouped
either into two classes (Normal and Abnormal) for binary classification or
into five main classes (Normal, Denial of Service (DoS), User-to-Root (U2R),
Remote-to-Local (R2L), and Probe) for multi-class classification, as
described in Table 3.

Table 1: Feature List of NSL-KDD Dataset

NAME OF FEATURE      DATA TYPE    NAME OF FEATURE               DATA TYPE
duration             numeric      is_guest_login                numeric
protocol_type        nonnumeric   count                         numeric
service              nonnumeric   srv_count                     numeric
flag                 nonnumeric   serror_rate                   numeric
src_bytes            numeric      srv_serror_rate               numeric
dst_bytes            numeric      rerror_rate                   numeric
land                 numeric      srv_rerror_rate               numeric
wrong_fragment       numeric      same_srv_rate                 numeric
urgent               numeric      diff_srv_rate                 numeric
hot                  numeric      srv_diff_host_rate            numeric
num_failed_logins    numeric      dst_host_count                numeric
logged_in            numeric      dst_host_srv_count            numeric
num_compromised      numeric      dst_host_same_srv_rate        numeric
root_shell           numeric      dst_host_diff_srv_rate        numeric
su_attempted         numeric      dst_host_same_src_port_rate   numeric
num_root             numeric      dst_host_srv_diff_host_rate   numeric
num_file_creations   numeric      dst_host_serror_rate          numeric
num_shells           numeric      dst_host_srv_serror_rate      numeric
num_access_files     numeric      dst_host_rerror_rate          numeric
num_outbound_cmds    numeric      dst_host_srv_rerror_rate      numeric
is_host_login        numeric

The NSL-KDD dataset comes in two parts: the KDDTrain+ dataset for training
and the test datasets (KDDTest+ and KDDTest-21) for testing. Additionally,
to make the detection of intruders more realistic, the test datasets contain
many attacks that do not appear in the training set (KDDTrain+). Thus, in
addition to the 22 types of attack in the training set, there are 17 more
attack types in the test set. Table 2 displays the distribution of attack
types in the dataset.

Table 2: Attack Categories of the Different Types of Attacks

ATTACK CATEGORY    TYPES OF ATTACK
                   TRAINING RECORDS              TEST RECORDS
ATTACK   DoS       back, land, neptune, pod,     back, land, neptune, pod, smurf,
                   smurf, teardrop               teardrop, mailbomb, processtable,
                                                 udpstorm, apache2, worm
         Probe     ipsweep, nmap, portsweep,     ipsweep, nmap, portsweep, satan,
                   satan                         mscan, saint
         U2R       buffer-overflow, loadmodule,  buffer-overflow, loadmodule, perl,
                   perl, rootkit                 rootkit, sqlattack, xterm, ps
         R2L       ftp-write, guess-passwd,      ftp-write, guess-passwd, imap,
                   imap, multihop, phf, spy,     multihop, phf, spy, warezmaster,
                   warezclient, warezmaster      xlock, xsnoop, snmpguess,
                                                 snmpgetattack, httptunnel,
                                                 sendmail, named
NORMAL             normal                        normal

Table 3: Breakdown of Traffic Records in the NSL-KDD

ATTACK CATEGORY    NUMBER OF TRAFFIC RECORDS
                   TRAINING    KDDTest+    KDDTest-21
ATTACK   DoS       45927       7458        4342
         Probe     11656       2421        2402
         U2R       52          200         200
         R2L       995         2754        2754
NORMAL             67343       9711        2152
TOTAL              125973      22544       11850
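
The grouping of attack types into the two label schemes, and the class imbalance visible in Table 3, can be sketched as follows. The attack-type spellings follow Table 2 and may differ from the spellings in the raw dataset files; the dictionary and function names are hypothetical.

```python
# Training-set attack types grouped by category, as listed in Table 2
ATTACK_CATEGORY = {
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "buffer-overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
    "ftp-write": "R2L", "guess-passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
}


def five_class(label):
    """Map a raw label to Normal, DoS, Probe, U2R or R2L."""
    return "Normal" if label == "normal" else ATTACK_CATEGORY[label]


def binary_class(label):
    """Map a raw label to Normal or Abnormal."""
    return "Normal" if label == "normal" else "Abnormal"


# Training record counts from Table 3
TRAIN_COUNTS = {"DoS": 45927, "Probe": 11656, "U2R": 52, "R2L": 995, "Normal": 67343}
total = sum(TRAIN_COUNTS.values())
class_shares = {name: 100.0 * count / total for name, count in TRAIN_COUNTS.items()}

for name, share in class_shares.items():
    print(f"{name}: {share:.3f}% of training records")
```

The shares computed from Table 3 show that U2R and R2L together account for well under 1% of the training records, which is consistent with these two categories being the hardest to detect.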

4.4. Data Preparation


As stated in Section 4.3, 38 of the features in the NSL-KDD dataset are
numeric and 3 are non-numeric. However, like any RNN, our proposed BDLSTM
model only handles numerical inputs, so all non-numeric features had to be
converted to numeric representations. The non-numeric features in the
NSL-KDD dataset that require this transformation are protocol_type, service
and flag. These three features were encoded by assigning a unique integer
value to each of their categories. Once all features were in numeric form,
the next step was feature scaling, which ensures that the dataset is in
normalized form. The values of some features in the NSL-KDD dataset (e.g.
src_bytes and dst_bytes) are unevenly distributed, so the values of every
feature were scaled to the range [0, 1] using Min-Max scaling. This ensures
that our classifier does not produce biased outcomes. Min-Max feature
scaling is expressed mathematically as follows:

\[ Z' = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{9} \]

Here, Z' represents the new (scaled) value, X denotes the original value,
and X_min and X_max are the minimum and maximum values of the feature.
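
The two preparation steps can be sketched in plain Python as shown below. This is a stand-in for illustration and not the library routine used in this work; the function names and the sample values are hypothetical.

```python
def integer_encode(values):
    """Give each distinct category an integer code, in order of first appearance."""
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes))
    return [codes[value] for value in values], codes


def min_max_scale(column):
    """Scale a numeric column to [0, 1] with Equation (9)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical sample values for one non-numeric and one numeric feature
protocol_type = ["tcp", "udp", "tcp", "icmp"]
src_bytes = [0, 50, 200, 100]

encoded, mapping = integer_encode(protocol_type)
scaled = min_max_scale(src_bytes)
print(encoded, mapping)
print(scaled)
```

After scaling, a feature with a wide value range such as src_bytes contributes on the same numeric scale as a rate feature that already lies between 0 and 1.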

Algorithm 1: BDLSTM Training

Input: Features Extracted
Output: Classifications
Load Dataset
for Data in Training and Test Sets do
    Extract Features (x)
    Extract Labels (y)
for Features in x do
    if Feature = Nonnumerical then
        Encode Feature using Keras Library
    Scale Features with Z' = (X − X_min) / (X_max − X_min)
for i from 1 → n do
    Start: K = 10
    Split Training set into K-groups
    Load BDLSTM model
    Fit model with K-1 groups
    Validate model with remaining Kth group
    Repeat until all K-groups are used as validation set
Test model on Test sets (NSL-KDDTest+ and NSL-KDDTest-21)

4.5. Performance Metrics
To evaluate the performance of our model, the accuracy (ACC), precision,
true positive rate (TPR), true negative rate (TNR), false positive rate
(FPR) and false negative rate (FNR), as well as the F-score, were
calculated. Each of these measures is obtained from the confusion matrix,
where TP, TN, FP and FN denote the numbers of true positives, true
negatives, false positives and false negatives, and is defined as follows:
i) Accuracy (ACC): The ratio of the number of correctly classified traffic
records to the total number of traffic records:
\[ ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{10} \]
ii) True Positive Rate (TPR): The ratio of the number of intrusion records
that are correctly detected as intrusions to the total number of intrusion
records:
\[ TPR = \frac{TP}{FN + TP} \tag{11} \]
iii) True Negative Rate (TNR): The percentage of normal records that are
correctly detected as normal:
\[ TNR = \frac{TN}{FP + TN} \tag{12} \]
iv) False Positive Rate (FPR): The percentage of normal behaviours that are
classified as intrusive behaviours:
\[ FPR = \frac{FP}{TN + FP} \tag{13} \]
v) False Negative Rate (FNR): The percentage of intrusive behaviours that
are classified as normal:
\[ FNR = \frac{FN}{TP + FN} \tag{14} \]
vi) Precision: The ratio of the true anomalous records to all traffic
records that were identified as intrusions:
\[ Precision = \frac{TP}{TP + FP} \tag{15} \]
vii) F-Score: The harmonic mean of the precision and the true positive rate:
\[ F\text{-}Score = 2 \left( \frac{1}{Precision^{-1} + TPR^{-1}} \right) \tag{16} \]
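
Equations (10) to (16) translate directly into code. The sketch below computes all seven measures from the four confusion-matrix counts; the function name and the example counts are hypothetical and are not results from this work.

```python
def ids_metrics(tp, tn, fp, fn):
    """Compute the measures of Equations (10) to (16) from confusion-matrix counts."""
    precision = tp / (tp + fp)                       # Eq. (15)
    tpr = tp / (fn + tp)                             # Eq. (11), also called recall
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),      # Eq. (10)
        "TPR": tpr,
        "TNR": tn / (fp + tn),                       # Eq. (12), also called specificity
        "FPR": fp / (tn + fp),                       # Eq. (13), the false alarm rate
        "FNR": fn / (tp + fn),                       # Eq. (14)
        "Precision": precision,
        "F": 2 * (1 / (1 / precision + 1 / tpr)),    # Eq. (16)
    }


# Hypothetical counts: 100 intrusion records and 100 normal records
metrics = ids_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

By construction, TPR and FNR sum to one, as do TNR and FPR, so a lower false alarm rate always corresponds to a higher specificity.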

5. Experimental Results
As stated earlier, the proposed model was implemented in the Python
programming language using the TensorFlow and Keras libraries on a 64-bit
Windows 10 operating system (OS). The experiments were carried out on a Dell
personal computer (PC) with an Intel Core i5-9300H @ 4.1 GHz, 8 GB RAM and
an NVIDIA GeForce GTX 1050 Ti with 4 GB of dedicated GDDR5 VRAM. To
ascertain the effectiveness of our proposed approach, two classes of
experiment were carried out.
The first class of experiment is a binary (2-class) classification with
target behaviors categorized as Anomaly and Normal, whereas the second is a
5-class classification with target behaviors categorized as Normal, DoS,
Probe, R2L and U2R. For each experiment, we first implement the conventional
LSTM and compare its performance with that of the bidirectional LSTM
approach. We further compare the performance of the bidirectional LSTM
approach with other existing methods in the literature (i.e. ANN, NB, SVM,
RF, Multi-Layer Perceptron (MLP), RNN-IDS, SCDNN, MDPCA-DBN and STL).
To validate and evaluate the performance of our model, a stratified K-fold
cross-validation method was implemented with K set to 10. The stratified
K-fold ensures that the sample percentage of each class is maintained in
every fold, so that the training and validation folds share the same class
distribution. The model was then fit on K-1 folds (10 minus 1, i.e. 9 folds)
and validated on the remaining fold. This process was repeated until every
one of the K folds had been used once as the validation set. The score for
each fold was recorded, and the mean of the recorded scores was taken as the
model’s performance, as depicted in Figure 4.

Figure 4: Cross-Validation scores over folds. The bars represent the validation score for
each of the 10 folds for the two classification problems and the dashed lines indicate the
mean validation score for the classification problems (i.e. 2-class and 5-class classification)
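The validation procedure described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the toy labels and the placeholder scorer are all assumptions, and in practice a library routine such as scikit-learn's StratifiedKFold would normally be used instead.

```python
from collections import defaultdict


def stratified_kfold_indices(labels, k=10):
    """Yield (train_idx, test_idx) pairs with similar class ratios in every fold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the samples of each class round-robin into the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold is held out once; the other k-1 folds form the training set.
    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train_idx, test_idx


def cross_validate(labels, score_fold, k=10):
    """Score every held-out fold and return the scores and their mean."""
    scores = [
        score_fold(train, test)
        for train, test in stratified_kfold_indices(labels, k)
    ]
    return scores, sum(scores) / len(scores)


# Hypothetical toy labels: 70 "normal" and 30 "anomaly" records.
labels = ["normal"] * 70 + ["anomaly"] * 30


def dummy_score(train, test):
    # Placeholder: a real scorer would fit the model on `train`
    # and return its validation accuracy on `test`.
    return sum(labels[i] == "normal" for i in test) / len(test)


scores, mean_score = cross_validate(labels, dummy_score, k=10)
```

With K = 10, each call to the scorer sees nine folds for fitting and one fold for validation, and the reported performance is the mean of the ten fold scores.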

5.1. Experiment 1: Binary Classification


The first experiment was carried out with two target classes (Anomaly
and Normal) and makes use of all features in the dataset for a 2-class
classification. For this classification problem, the confusion matrices
depicted in Figures 5a and 5b were used to evaluate the conventional LSTM
model’s performance on the test datasets (KDDTest+ and KDDTest-21), whereas
Figures 6a and 6b were used to measure the performance of the bidirectional
LSTM model based on the evaluation metrics presented in Section 4.5.
Tables 4 and 5 summarize the performance results obtained by the
conventional LSTM and the bidirectional LSTM (BDLSTM) models,
respectively, both trained with all 41 features. The performance results
of the BDLSTM model compared with the conventional LSTM classifier and
other existing classifiers are presented in Table 6. The results indicate
that, with regard to accuracy, precision, true positive rate (recall),
true negative rate (specificity), and F-score, the BDLSTM classifier is
superior at spotting network anomalies.
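As a rough sketch of how the measures named above follow from a binary confusion matrix, the snippet below assumes the standard definitions with Anomaly as the positive class. The function names and the example counts are hypothetical and are not taken from the paper's confusion matrices.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return the six reported measures as fractions in [0, 1]."""
    recall = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)         # true negative rate
    false_alarm_rate = fp / (fp + tn)    # equals 1 - specificity
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "false_alarm_rate": false_alarm_rate,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


def f_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=90, fn=10, fp=5, tn=95)

# Consistency check against reported values: BDLSTM precision and recall
# on KDDTest+ (99.05% and 90.79%).
bdlstm_f = f_score_from(0.9905, 0.9079)
```

Under these definitions the false alarm rate and the specificity always sum to 100%, which is the pattern visible in every row pair of Tables 4 and 5, and the precision and recall reported for BDLSTM on KDDTest+ reproduce the reported F-score of 94.74% after rounding.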

Figure 5: Conventional LSTM confusion matrices for binary classification:
(a) KDDTest+, (b) KDDTest-21

Figure 6: Bidirectional LSTM confusion matrices for binary classification:
(a) KDDTest+, (b) KDDTest-21

Table 4: Conventional LSTM Model Performance for Binary Classification

PERFORMANCE MEASURE    TEST DATASET
                       KDDTest+     KDDTest-21
Accuracy               89.81%       79.87%
Recall                 84.03%       79.54%
False Alarm Rate       2.55%        18.63%
Specificity            97.45%       81.37%
Precision              97.75%       95.06%
F-Score                90.38%       86.61%

Table 5: Bidirectional LSTM Model Performance for Binary Classification

PERFORMANCE MEASURE    TEST DATASET
                       KDDTest+     KDDTest-21
Accuracy               94.26%       87.46%
Recall                 90.79%       88.32%
False Alarm Rate       1.15%        16.40%
Specificity            98.85%       83.60%
Precision              99.05%       96.04%
F-Score                94.74%       92.02%

Table 6: Accuracy of BDLSTM Against Existing Methods - Binary Classification

METHOD                                             ACCURACY IN PERCENTAGE
                                                   KDDTest+     KDDTest-21
J48 (Tavallaee et al., 2009)                       81.05%       63.97%
Naive Bayes (Tavallaee et al., 2009)               76.56%       55.77%
NB Tree (Tavallaee et al., 2009)                   82.02%       66.16%
Random Forest (Tavallaee et al., 2009)             80.67%       63.26%
Random Tree (Tavallaee et al., 2009)               81.59%       58.51%
Multi-Layer Perceptron (Tavallaee et al., 2009)    77.41%       57.34%
SVM (Tavallaee et al., 2009)                       69.52%       42.29%
RNN (Yin et al., 2017)                             83.28%       68.55%
STL (Javaid et al., 2016)                          88.39%       -
Sigmoid PIO (Hadeel et al., 2020)                  86.90%       -
Cosine PIO (Hadeel et al., 2020)                   88.30%       -
Conventional LSTM                                  89.81%       79.87%
Proposed BDLSTM                                    94.26%       87.46%

As presented in Tables 4, 5, and 6, our proposed IDS (i.e., the IDS based
on BDLSTM) obtained a higher accuracy for the 2-class classification than
the other existing IDSs on the NSL-KDD dataset. The proposed IDS obtains
a training accuracy of 99.95% on the KDDTrain+ dataset and testing
accuracies of 94.26% and 87.46% on the KDDTest+ and KDDTest-21 datasets,
respectively, which is superior to the results obtained by the other
existing models.
From Table 6, it can be observed that the BDLSTM model improves the
detection accuracy of the conventional LSTM model by 4.45 percentage
points on KDDTest+ and 7.59 percentage points on KDDTest-21. In addition,
it obtained a very good precision of 99.05% and 96.04%, respectively, on
the two test datasets. Furthermore, relative to the conventional LSTM, our
model obtained a better F-score (94.74% vs. 90.38% and 92.02% vs. 86.61%)
with a lower false alarm rate (1.15% vs. 2.55% and 16.40% vs. 18.63%),
which gives it an edge in detecting anomalies.
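The accuracy gains quoted from Table 6 are differences in percentage points. A small sketch of the arithmetic is given below; the relative error reduction is an additional, derived quantity that the paper itself does not report, and the function and variable names are illustrative.

```python
# Accuracy (%) from Table 6 as (conventional LSTM, BDLSTM).
accuracy = {
    "KDDTest+": (89.81, 94.26),
    "KDDTest-21": (79.87, 87.46),
}


def accuracy_gain(lstm_acc, bdlstm_acc):
    """Return the gain in percentage points and the relative error reduction (%)."""
    gain_pp = bdlstm_acc - lstm_acc
    error_before = 100.0 - lstm_acc   # share of samples the LSTM misclassified
    relative_error_reduction = 100.0 * gain_pp / error_before
    return gain_pp, relative_error_reduction


gains = {name: accuracy_gain(*pair) for name, pair in accuracy.items()}
for name, (gain_pp, rel) in gains.items():
    print(f"{name}: +{gain_pp:.2f} points, {rel:.1f}% fewer errors")
```

Expressed this way, the 4.45 and 7.59 point gains correspond to removing somewhat more than a third of the conventional LSTM's misclassifications on each test set.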

5.2. Experiment 2: Multi-class Classification


The second experiment presents a 5-class (Normal, DoS, Probe, R2L, and
U2R) classifier trained with all 41 features. To compare our work against
the existing works mentioned in Experiment 1, an experiment similar to
those presented in (Tavallaee et al., 2009; Yin et al., 2017) was carried
out on the benchmark dataset for a 5-class classification. On the basis of
the evaluation metrics, the BDLSTM approach shows superiority over the
existing methods in detecting the four kinds of anomalies (i.e., DoS,
Probe, R2L, and U2R).
Figures 7a and 7b present the confusion matrices obtained on the test
datasets (KDDTest+ and KDDTest-21) for the conventional LSTM model,
whereas Figures 8a and 8b present the confusion matrices obtained for the
BDLSTM model. The performance of the two models was measured from these
matrices and is summarized in Tables 7 and 8 for the conventional LSTM and
the BDLSTM model, respectively.
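A minimal sketch of how per-class measures of this kind can be read off a multi-class confusion matrix is shown below, treating each class in turn as positive and all other classes as negative (one-vs-rest). This is an assumption about the computation rather than a description of the authors' code, and the 5x5 matrix is hypothetical.

```python
def per_class_metrics(cm, class_names):
    """cm[i][j] counts samples of true class i that were predicted as class j."""
    total = sum(sum(row) for row in cm)
    results = {}
    for c, name in enumerate(class_names):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                    # class c predicted as another class
        fp = sum(row[c] for row in cm) - tp     # other classes predicted as c
        tn = total - tp - fn - fp
        recall = tp / (tp + fn)
        precision = tp / (tp + fp)
        results[name] = {
            "recall": recall,
            "far": fp / (fp + tn),
            "specificity": tn / (tn + fp),
            "precision": precision,
            "f_score": 2 * precision * recall / (precision + recall),
        }
    return results


classes = ["Normal", "DoS", "Probe", "R2L", "U2R"]

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [
    [90, 2, 5, 2, 1],
    [4, 45, 1, 0, 0],
    [2, 1, 17, 0, 0],
    [5, 0, 0, 15, 0],
    [3, 0, 0, 2, 5],
]

metrics = per_class_metrics(cm, classes)
```

In this toy matrix the rare U2R class has a recall of only 50% while its false alarm rate stays below 1%, the same qualitative pattern that appears for U2R in the reported results.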

Figure 7: Conventional LSTM confusion matrices for multi-class
classification: (a) KDDTest+, (b) KDDTest-21

Figure 8: Bidirectional LSTM confusion matrices for multi-class
classification: (a) KDDTest+, (b) KDDTest-21

Table 7: Conventional LSTM Model Performance For Multi-Class Classification

DATASET       CLASS     PERFORMANCE MEASURE
                        Recall    FAR       Specificity   Precision   F-Score
KDDTest+      Normal    92.25%    6.84%     93.16%        91.07%      91.66%
              DoS       86.36%    0.25%     99.75%        99.43%      92.44%
              Probe     87.32%    9.07%     90.93%        53.65%      66.47%
              R2L       75.05%    0.17%     99.83%        98.43%      85.17%
              U2R       46.00%    0.44%     99.56%        48.42%      47.18%
KDDTest-21    Normal    78.30%    14.44%    85.56%        54.62%      64.35%
              DoS       70.57%    0.59%     99.41%        98.58%      82.26%
              Probe     85.85%    14.79%    85.21%        59.61%      70.36%
              R2L       70.04%    0.44%     99.56%        97.97%      81.69%
              U2R       43.50%    1.22%     98.78%        37.99%      40.56%

Table 8: Bidirectional LSTM Model Performance For Multi-Class Classification

DATASET       CLASS     PERFORMANCE MEASURE
                        Recall    FAR       Specificity   Precision   F-Score
KDDTest+      Normal    95.40%    5.64%     94.36%        92.75%      94.06%
              DoS       90.34%    0.07%     99.93%        99.85%      94.86%
              Probe     91.53%    5.68%     94.32%        65.99%      76.69%
              R2L       82.43%    0.03%     99.97%        99.74%      90.26%
              U2R       54.00%    0.30%     99.70%        62.07%      57.75%
KDDTest-21    Normal    82.53%    11.36%    88.64%        61.71%      70.62%
              DoS       84.22%    0.25%     99.75%        99.48%      91.22%
              Probe     90.13%    9.80%     90.20%        70.04%      78.83%
              R2L       73.60%    0.23%     99.77%        98.97%      84.42%
              U2R       49.00%    0.51%     99.49%        62.42%      54.90%

Table 9: Comparison of Result for KDDTest+ - Multi-class Classification

METHOD                            PERFORMANCE MEASURE
                                  Accuracy   Precision   Recall    FAR       F-Score
SVM                               77.12%     80.80%      77.12%    15.90%    73.90%
J48                               75.23%     80.30%      75.23%    16.70%    71.3%
Multi-Layer Perceptron            75.66%     0.00%       75.70%    16.60%    0.00%
Naive Bayes                       71.48%     76.30%      71.50%    12.30%    71.40%
Random Forest                     77.07%     82.20%      77.10%    16.2%     73.10%
Random Tree                       75.13%     79.3%       75.10%    16.90%    70.50%
NB Tree                           74.65%     78.23%      74.70%    14.60%    74.48%
RNN (Yin et al., 2017)            81.29%     83.07%      81.29%    12.42%    79.25%
SCDNN (Ma et al., 2016)           72.64%     -           57.48%    27.36%    -
STL (Javaid et al., 2016)         79.10%     83.33%      68.99%    -         75.76%
MDPCA-DBN (Yang et al., 2019)     82.08%     97.27%      70.51%    2.62%     81.75%
Conventional LSTM                 87.26%     90.34%      87.26%    4.03%     88.03%
Proposed BDLSTM                   91.36%     92.81%      91.36%    0.88%     91.67%

Table 10: Comparison of Result for KDDTest-21 - Multi-class Classification

METHOD                            PERFORMANCE MEASURE
                                  Accuracy   Precision   Recall    FAR       F-Score
SVM                               56.59%     77.20%      56.60%    10.3%     56.60%
J48                               53.22%     76.6%       53.20%    11.50%    51.70%
Multi-Layer Perceptron            53.81%     0.00%       53.80%    11.20%    0.00%
Naive Bayes                       48.57%     62.60%      48.60%    11.00%    50.20%
Random Forest                     56.79%     80.20%      56.80%    10.60%    55.3%
Random Tree                       53.29%     75.30%      53.30%    11.2%     50.80%
NB Tree                           57.58%     76.43%      57.60%    10.40%    65.69%
RNN (Yin et al., 2017)            64.67%     -           -         -         -
SCDNN (Ma et al., 2016)           44.55%     -           37.85%    55.45%    -
MDPCA-DBN (Yang et al., 2019)     66.18%     95.51%      61.57%    13.06%    74.87%
Conventional LSTM                 74.49%     81.53%      79.49%    5.96%     75.76%
Proposed BDLSTM                   82.05%     85.91%      82.05%    4.20%     82.77%

Tables 9 and 10 show that the proposed BDLSTM not only improves on the
performance of the conventional LSTM but also has a higher detection
accuracy than the existing IDS models, at 91.36% and 82.05% for KDDTest+
and KDDTest-21, respectively. In terms of false alarms, the proposed model
also achieved a much lower rate than the existing algorithms: 0.88% for
KDDTest+ and 4.20% for KDDTest-21.
Tables 9 and 10 also show that, with regard to precision, MDPCA-DBN
achieves a higher score (97.27% on KDDTest+ and 95.51% on KDDTest-21) than
the proposed model (92.81% and 85.91%, respectively). The proposed BDLSTM
model, however, obtained much better recall values of 91.36% and 82.05% on
the two test datasets, compared with 70.51% and 61.57% for MDPCA-DBN.
Because the F-score balances precision and recall, our model outperformed
MDPCA-DBN on this measure: BDLSTM obtained F-scores of 91.67% on KDDTest+
and 82.77% on KDDTest-21, against 81.75% and 74.87% for MDPCA-DBN, and
these are also higher than those of the other existing models. In summary,
the proposed BDLSTM compares favorably with the existing models in
detecting intrusions.
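In Tables 9 and 10 the Accuracy and Recall values reported for the proposed BDLSTM are identical (91.36% and 82.05%). One reading consistent with this, although the averaging scheme is not stated here, is that the multi-class precision and recall are support-weighted averages over the five classes. The sketch below, using a hypothetical 3-class confusion matrix and illustrative function names, shows why weighted recall always equals overall accuracy while weighted precision generally does not.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def weighted_recall(cm):
    """Per-class recall averaged with weights equal to each class's share of samples."""
    total = sum(sum(row) for row in cm)
    score = 0.0
    for i, row in enumerate(cm):
        support = sum(row)
        if support:
            score += (support / total) * (row[i] / support)
    return score


def weighted_precision(cm):
    """Per-class precision averaged with the same support weights."""
    total = sum(sum(row) for row in cm)
    score = 0.0
    for i, row in enumerate(cm):
        predicted = sum(r[i] for r in cm)
        if predicted:
            score += (sum(row) / total) * (cm[i][i] / predicted)
    return score


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [
    [50, 5, 5],
    [4, 25, 1],
    [3, 2, 5],
]

acc = accuracy(cm)
w_recall = weighted_recall(cm)
w_precision = weighted_precision(cm)
```

The support weights cancel the per-class denominators in the recall sum, leaving the diagonal total divided by the sample count, which is the accuracy.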
A graphical comparison of our model’s detection accuracy with that of the
other existing models is presented in Figures 9 and 10.

Figure 9: Comparison of Detection Accuracy for the 2-Class Classification

Figure 10: Comparison of Detection Accuracy for the 5-Class Classification

6. Conclusion and Future Works
This work proposed a deep learning approach to intrusion detection, namely
a bidirectional Long-Short-Term Memory (BDLSTM) model, which uses layers
of LSTM cells in the forward and backward directions coupled with fully
connected layers to detect network intrusions. To substantiate the model’s
performance, the NSL-KDD dataset, which is widely used as a benchmark
dataset for intrusion detection, was used to train and evaluate the model.
On the KDDTest+ set, the proposed approach reached an accuracy of 94.26%
for binary classification and 91.36% for 5-class classification. In the
experiments, the BDLSTM model obtained a higher accuracy, recall, and
F-score than the conventional LSTM model and other existing intrusion
detection models proposed in the literature. In addition, the proposed
model improves not only the overall anomaly detection rate but also the
detection rate of each class (i.e., Normal, DoS, Probe, R2L, and U2R),
especially for R2L and U2R attacks. In future work, we intend to develop
and evaluate integrated systems that combine state-of-the-art feature
selection methods with conventional LSTM and BDLSTM models.

Declaration of interest
The authors declare that they have no conflicts of interest.

References
Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A
review and new perspectives. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 35 , 1798–1828.

Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependen-
cies with gradient descent is difficult. Transactions on Neural Networks,
5 , 157–166.

Beqiri, E. (2009). Neural networks for intrusion detection systems. Global Se-
curity, Safety, and Sustainability. ICGS3 2009. Communications in Com-
puter and Information Science, 45 , 156–165.

Berman, D. S., Buczak, A. L., Chavis, J. S., & Corbett, C. L. (2019). A
survey of deep learning methods for cyber security. Information, 10 , 122.

Cannady, J. (1998). Artificial neural networks for misuse detection. In Na-
tional Information Systems Security Conference (pp. 443–456).
Chandak, T., Shukla, S., & Wadhvani, R. (2019). An analysis of “a feature
reduced intrusion detection system using ann classifier”. Expert Systems
with Applications, 130 , 79–83.
Depren, O., Topallar, M., Anarim, E., & Ciliz, M. K. (2005). An intelli-
gent intrusion detection system (ids) for anomaly and misuse detection in
computer networks. Expert Systems with Applications, 29 , 713–722.
Dua, D., & Graff, C. (2017). Uci machine learning repository-kdd cup 1999
data set. http://archive.ics.uci.edu/ml.
Fu, Y., Lou, F., Meng, F., Tian, Z., Zhang, H., & Jiang, F. (2018). An in-
telligent network attack detection method based on rnn. In 2018 IEEE
Third International Conference on Data Science in Cyberspace (DSC),
Guangzhou (pp. 483–489). IEEE.
Graves, A., Mohamed, A. R., & Hinton, G. (2013). Speech recognition with
deep recurrent neural networks. In 2013 IEEE International Conference on
Acoustics, Speech and Signal Processing, Vancouver, BC (pp. 6645–6649).
IEEE.
Gregg, M. (2014). Certified Ethical Hacker (CEH) Cert Guide. Pearson
Education, Inc., USA.
Hadeel, A., Ahmad, S., & Khair, S., Eddin (2020). A feature selection algo-
rithm for intrusion detection system based on pigeon inspired optimizer.
Expert Systems with Applications, 148 , 113249.
Hochreiter, S., & Schmidhuber, J. (1995). Long short-term memory. Neural
Computation, 9 , 1735–1780.
Hochreiter, S., & Schmidhuber, J. (1997). Lstm can solve hard long time lag
problems. In Proceedings of the 9th International Conference on Neural
Information Processing Systems (pp. 473–479). MIT Press.
Horng, S.-J., Su, M.-Y., Chen, Y.-H., Kao, T.-W., Chen, R.-J., Lai, J.-L.,
& Perkasa, C. D. (2011). A novel intrusion detection system based on
hierarchical clustering and support vector machines. Expert Systems with
Applications, 38 , 306–313.

Ikram, T. S., & Cherukuri, A. K. (2016). Improving accuracy of intrusion
detection model using pca and optimized svm. CIT. Journal of Computing
and Information Technology, 24 , 133–148.

Ingre, B., & Yadav, A. (2015). Performance analysis of nsl-kdd dataset
using ann. In 2015 International Conference on Signal Processing and
Communication Engineering Systems, Guntur (pp. 92–96). IEEE.

Ishitaki, T., Obukata, R., Oda, T., & Barolli, L. (2017). Application of deep
recurrent neural networks for prediction of user behavior in tor networks. In
2017 31st International Conference on Advanced Information Networking
and Applications Workshops (WAINA), Taipei (pp. 238–243). IEEE.

Jang-Jaccard, J., & Nepal, S. (2014). A survey of emerging threats in
cybersecurity. Journal of Computer and System Sciences, 80 , 973–993.

Javaid, A., Niyaz, Q., Sun, W., & Alam, M. (2016). A deep learning ap-
proach for network intrusion detection system. In Proceedings of the 9th
EAI International Conference on Bio-Inspired Information and Commu-
nications Technologies (Formerly BIONETICS) (p. 6). ICST (Institute for
Computer Sciences, Social-Informatics and Telecommunications Engineer-
ing).

Kasongo, S. M., & Sun, Y. (2019). A deep learning method with filter based
feature engineering for wireless intrusion detection system. IEEE Access,
7 , 38597–38607.

Kim, G., Lee, S., & Kim, S. (2014). A novel hybrid intrusion detection
method integrating anomaly detection with misuse detection. Expert Sys-
tems with Applications, 41 , 1690–1700.

Kim, J., & Kim, H. (2015). Applying recurrent neural network to intrusion
detection with hessian free optimization. In Revised Selected Papers of
the 16th International Workshop on Information Security Applications -
Volume 9503 (pp. 357–369). Springer-Verlag.

Kim, J., Kim, J., Le, T., & Kim, H. (2016). Long short term memory recur-
rent neural network classifier for intrusion detection. In 2016 International
Conference on Platform Technology and Service (PlatCon), Jeju (pp. 1–5).
IEEE.

Koc, L., Mazzuchi, T. A., & Sarkani, S. (2012). A network intrusion detection
system based on a hidden naïve bayes multiclass classifier. Expert Systems
with Applications, 39 , 13492–13500.

Le, T., Kim, J., & Kim, H. (2017). An effective intrusion detection classifier
using long short-term memory with gradient descent optimization. In 2017
International Conference on Platform Technology and Service (PlatCon),
Busan (pp. 1–6). IEEE.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521 , 436–444.

Liaqat, A., Ce, Z., Mingyi, Z., & Yipeng, L. (2019a). Early diagnosis of
parkinson’s disease from multiple voice recordings by simultaneous sample
and feature selection. Expert Systems with Applications, 137 , 22–28.

Liaqat, A., Shafqat, K., Ullah, Noorbakhsh, G., Amiri, Imrana, Y., Iqbal,
Q., Adeeb, N., & Redhwan, N. (2019b). A feature-driven decision support
system for heart failure prediction based on χ² statistical model and
gaussian naive bayes. Computational and Mathematical Methods in Medicine,
2019 .

Ma, T., Wang, F., Cheng, J., Yu, Y., & Chen, X. (2016). A hybrid spec-
tral clustering and deep neural network ensemble algorithm for intrusion
detection in sensor networks. Sensors, 16 , 1701.

Manzoor, I., & Kumar, N. (2017). A feature reduced intrusion detection
system using ann classifier. Expert Systems with Applications, 88 , 249–257.

Mousa, A., & Schuller, B. (2017). Contextual bidirectional long short-term
memory recurrent neural network language models: A generative approach
to sentiment analysis. In Proceedings of the 15th Conference of the
European Chapter of the Association for Computational Linguistics: Volume
1, Long Papers (pp. 1023–1032). Association for Computational Linguistics.

Nie, L., Jiang, D., & Lv, Z. (2017). Modeling network traffic for traffic
matrix estimation and anomaly detection based on bayesian network in
cloud computing networks. Ann. Telecommun., 72 , 297–305.

Parwez, M. S., Rawat, D. B., & Garuba, M. (2017). Big data analytics
for user-activity analysis and user-anomaly detection in mobile wireless
network. IEEE Transactions on Industrial Informatics, 13 , 2058–2065.

Pearlmutter, B. A. (1995). Gradient calculations for dynamic recurrent neu-
ral networks: A survey. Trans. Neur. Netw., 6 , 1212–1228.
Pineda, F. J. (1987). Generalization of backpropagation to recurrent and
higher order neural networks. In Proceedings of the 1987 International
Conference on Neural Information Processing Systems (pp. 602–611). MIT
Press.
Reddy, R. R., Ramadevi, Y., & Sunitha, K. V. N. (2016). Effective discrim-
inant function for intrusion detection using svm. In 2016 International
Conference on Advances in Computing, Communications and Informatics
(ICACCI), Jaipur (pp. 1148–1153). IEEE.
Schmidhuber, J. (2015). Deep learning in neural networks: An overview.
Neural Networks, 61 , 85–117.
Staudemeyer, R. C. (2015). Applying long short-term memory recurrent
neural networks to intrusion detection. South African Computer Journal
(SACJ), 56 , 136–154.
Staudemeyer, R. C., & Omlin, C. W. (2013). Evaluating performance of
long short-term memory recurrent neural networks on intrusion detection
data. In Proceedings of the South African Institute for Computer Scientists
and Information Technologists Conference (pp. 218–224). Association for
Computing Machinery.
Tang, T. A., Mhamdi, L., McLernon, D., Zaidi, S. A. R., & Ghogho, M.
(2016). Deep learning approach for network intrusion detection in soft-
ware defined networking. In 2016 International Conference on Wireless
Networks and Mobile Communications (WINCOM), Fez (pp. 258–263).
IEEE.
Tang, T. A., Mhamdi, L., McLernon, D., Zaidi, S. A. R., & Ghogho, M.
(2018). Deep recurrent neural network for intrusion detection in sdn-based
networks. In 2018 4th IEEE Conference on Network Softwarization and
Workshops (NetSoft), Montreal, QC (pp. 202–206). IEEE.
Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. (2009). A detailed
analysis of the kdd cup 99 data set. In 2009 IEEE Symposium on Com-
putational Intelligence for Security and Defense Applications, Ottawa (pp.
1–6). IEEE.

UNB (2009). Nsl-kdd dataset. https://www.unb.ca/cic/datasets/nsl.html.

Wenke, L., Stolfo, S. J., & Mok, K. W. (1999). A data mining framework
for building intrusion detection models. In Proceedings of the 1999 IEEE
Symposium on Security and Privacy (Cat. No.99CB36344) (pp. 120–132).
IEEE.

Yang, Y., Zheng, K., Wu, C., Niu, X., & Yang, Y. (2019). Building an effec-
tive intrusion detection system using the modified density peak clustering
algorithm and deep belief networks. Appl. Sci., 9 , 238.

Yin, C., Zhu, Y., Fei, J., & He, X. (2017). A deep learning approach for in-
trusion detection using recurrent neural networks. IEEE Access, 5 , 21954–
21961.

Zhang, J., & Zulkernine, M. (2006). A hybrid network intrusion detection
technique using random forests. In First International Conference
on Availability, Reliability and Security (ARES’06) (pp. 262–269). IEEE.
