
Received January 12, 2020, accepted January 31, 2020, date of publication February 10, 2020, date of current version February 18, 2020.

Digital Object Identifier 10.1109/ACCESS.2020.2972627

BAT: Deep Learning Methods on Network Intrusion Detection Using NSL-KDD Dataset
TONGTONG SU, HUAZHI SUN, JINQI ZHU, SHENG WANG, AND YABO LI
School of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
Corresponding authors: Huazhi Sun ([email protected]) and Jinqi Zhu ([email protected])
This work was supported in part by the Natural Science Foundation of Tianjin under Grant 17JCYBJC16400, Grant 18JCYBJC85900, and
Grant 18JCQNJC70200, and in part by the Science and Technology Development Fund of Tianjin Education Commission for Higher
Education under Grant JW1702.

ABSTRACT Intrusion detection can identify unknown attacks from network traffic and has been an effective means of network security. Nowadays, existing methods for network anomaly detection are usually based on traditional machine learning models, such as KNN, SVM, etc. Although these methods can obtain some outstanding features, they achieve relatively low accuracy and rely heavily on manual design of traffic features, which has become obsolete in the age of big data. To solve the problems of low accuracy and feature engineering in intrusion detection, a traffic anomaly detection model, BAT, is proposed. The BAT model combines BLSTM (Bidirectional Long Short-Term Memory) and an attention mechanism. The attention mechanism is used to screen the network flow vector composed of packet vectors generated by the BLSTM model, which can obtain the key features for network traffic classification. In addition, we adopt multiple convolutional layers to capture the local features of traffic data. As multiple convolutional layers are used to process data samples, we refer to the BAT model as BAT-MC. The softmax classifier is used for network traffic classification. The proposed end-to-end model does not use any feature engineering skills and can automatically learn the key features of the hierarchy. It can describe network traffic behavior well and effectively improve anomaly detection ability. We test our model on a public benchmark dataset, and the experimental results demonstrate that our model performs better than the other methods compared.

INDEX TERMS Network traffic, intrusion detection, deep learning, BLSTM, attention mechanism.

The associate editor coordinating the review of this manuscript and approving it for publication was Fan Zhang.

I. INTRODUCTION
With the development and improvement of Internet technology, the Internet is providing various convenient services for people. However, we are also facing various security threats. Network viruses, eavesdropping and malicious attacks are on the rise, causing network security to become the focus of attention of society and government departments. Fortunately, these problems can be well solved via intrusion detection. Intrusion detection plays an important part in ensuring network information security. However, with the explosive growth of Internet business, traffic types in the network are increasing day by day, and network behavior characteristics are becoming increasingly complex, which brings great challenges to intrusion detection [1], [2]. How to identify various kinds of malicious network traffic, especially unexpected malicious network traffic, is a key problem that cannot be avoided.
In fact, network traffic can be divided into two categories (normal traffic and malicious traffic). Furthermore, network traffic can also be divided into five categories: Normal, DoS (Denial of Service attacks), R2L (Remote to Local attacks), U2R (User to Root attacks) and Probe (Probing attacks). Hence, intrusion detection can be considered as a classification problem. By improving the performance of classifiers in effectively identifying malicious traffic, intrusion detection accuracy can be largely improved.
Machine learning methods [3]–[8] have been widely used in intrusion detection to identify malicious traffic. However, these methods belong to shallow learning and often emphasize feature engineering and selection. They have difficulty in feature selection and cannot effectively solve the massive intrusion data classification problem, which leads to low recognition accuracy and a high false alarm rate. In recent years, intrusion detection methods based on deep learning have been proposed successively. In [9], the authors propose a malware traffic classification method based on convolutional

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 29575
T. Su et al.: BAT: Deep Learning Methods on Network Intrusion Detection Using NSL-KDD Dataset

neural network with traffic data as image. This method does not need manually designed features, and directly takes the original traffic as the input data to the classifier. In [10], the authors provide an analysis of the viability of Recurrent Neural Networks (RNN) to detect the behavior of network traffic by modeling it as a sequence of states that change over time. In [11], the authors verify the performance of the Long Short-Term Memory (LSTM) network in classifying intrusion traffic. Experimental results show that LSTM can learn all the attack classes hidden in the training data. All the above methods treat the entire network traffic as a whole consisting of a sequence of traffic bytes. They don't make full use of domain knowledge of network traffic. For example, CNN converts continuous network traffic into images for processing, which is equivalent to treating traffic flows as independent and ignoring the internal relations of network traffic. Firstly, network traffic is a hierarchical structure. Specifically, network traffic is a traffic unit composed of multiple data packets. A data packet is a traffic unit composed of multiple bytes. Secondly, traffic features in the same and different packets are significantly different. Sequential features between different packets need to be extracted independently. In other words, not all traffic features are equally important for traffic classification in the process of extracting features on a certain network traffic.
However, few prior works have utilized the above mentioned structure of network traffic. Inspired by these characteristics, in this paper, we propose and demonstrate our method to analyze network traffic in an overall view. Network traffic is generally collected at fixed time intervals. Repeating this collecting process $m$ times, we can get the network traffic $X'$, where $X' = (x'_1, x'_2, \ldots, x'_m)$ is a matrix with $m$ data packets. Each $x'$ represents a data packet, and each data packet is seen as a whole consisting of a sequence of traffic bytes. Before entering the data into the BAT model, the original data is preprocessed by multiple convolutional layers. Global features can be obtained with the increase of the convolutional layer. With the preprocessing, we get an abstract representation of network traffic $X$ from $X'$. In order to make full use of domain knowledge of network traffic, we propose a deep learning model BAT-MC that mainly combines bidirectional long short-term memory (BLSTM) [12] and attention mechanism [13]. BLSTM is used to learn the characteristics of each packet and get the vector corresponding to each packet. The attention mechanism is then used to perform feature learning on the sequence data composed of the packet vectors to obtain the fine-grained features. At this point, the key feature extraction of network traffic via the attention mechanism is finished. The whole process of feature learning does not use any feature engineering skills. The automatically learnt key features can better describe the traffic behavior, which can effectively improve the anomaly detection capability. Finally, a fully connected network and a softmax function are applied to the obtained fine-grained features for anomaly detection. To verify the effectiveness of the BAT-MC network, it is comprehensively evaluated on the NSL-KDD dataset and gets the best results. The accuracy of the BAT-MC network can reach 84.25%, which is about 4.12% and 2.96% higher than the existing CNN and RNN model, respectively.
The following are some of the key contributions and findings of our work:
1) We propose an end-to-end deep learning model BAT-MC that is composed of BLSTM and attention mechanism. BAT-MC can well solve the problem of intrusion detection and provide a new research method for intrusion detection.
2) We introduce the attention mechanism into the BLSTM model to highlight the key input. The attention mechanism conducts feature learning on sequential data composed of data packet vectors. The obtained feature information is reasonable and accurate.
3) We compare the performance of BAT-MC with traditional deep learning methods. The BAT-MC model can extract information from each packet. By making full use of the structure information of network traffic, the BAT-MC model can capture features more comprehensively.
4) We evaluate our proposed network with the real NSL-KDD dataset. The experimental results show that the performance of BAT-MC is better than the traditional methods.

The rest of the paper is organized as follows: In Section 2, we give a brief overview of the related work, especially how intelligent algorithms facilitate the development of intrusion detection. In Section 3, we present details of the proposed BAT-MC model. In Section 4, we explain the experimental setup and present our results. The performance of the BAT-MC model is compared with other machine learning methods both in binary classification and multiclass classification. Section 5 draws the conclusions.

II. RELATED WORKS
Intrusion detection technology can be divided into three major categories: pattern matching methods, traditional machine learning methods and deep learning methods.
At the beginning, people mainly used pattern matching algorithms for intrusion detection. The pattern matching algorithm [14], [15] is the core algorithm of intrusion detection systems based on feature matching. Most algorithms have been considered for use in the past. In [16], the authors make a summary of pattern matching algorithms in Intrusion Detection Systems: KMP algorithm, BM algorithm, BMH algorithm, BMHS algorithm, AC algorithm and AC-BM algorithm. Experiments show that the improved algorithm can accelerate the matching speed and has a good time performance. In [17], the Naive approach, the Knuth-Morris-Pratt algorithm and the Rabin-Karp algorithm are compared in order to check which of them is most efficient in pattern/intrusion detection. Pcap files have been used as datasets in order to determine the efficiency of the algorithms by taking into consideration their respective running times. These traditional pattern


recognition algorithms have serious defects, which cannot achieve the effect of intrusion detection. Finding an efficient algorithm that reaches high efficiency and low false positive rates is still the focus of current work. With the development of artificial intelligence, the application of intelligent algorithms for intrusion detection has become a new research hotspot.
The traffic anomaly detection methods based on machine learning have achieved a lot of success. In [18], the authors propose a new method of feature selection and classification based on support vector machine (SVM). Experimental results on the NSL-KDD Cup 99 intrusion detection data set showed that the classification accuracy of this method with all training features reached 99%. In [19], the authors combine k-means clustering with the KNN classifier. The experimental results on the NSL-KDD dataset show that this method greatly improves the performance of the KNN classifier. In [20], the authors propose a new framework to combine misuse and anomaly detection in which they apply the random forests algorithm. Experimental results show that the overall detection rate of the hybrid system is 94.7% and the overall false positive rate is 2%. In [21], the performance on the NSL-KDD dataset is evaluated via an Artificial Neural Network (ANN). The detection rate obtained on the NSL-KDD dataset is 81.2% and 79.9% for the intrusion detection and attack type classification tasks, respectively. In [22], an intrusion detection method based on decision tree (DT) is proposed. Experimental results of feature selection using the correlation-based feature selection (CFS) subset evaluation method show that the DT based intrusion detection system has a higher accuracy. As described above, machine learning methods have been proposed and have achieved success for intrusion detection systems. However, these methods require large-scale preprocessing and complex feature engineering of traffic data. It is impossible to solve the massive intrusion data classification problem using machine learning methods.
With the superior performance of deep learning in image recognition [23], [24] and speech recognition [25], [26], traffic anomaly detection methods based on deep learning have been proposed. In [27], the authors use Self-taught Learning (STL) on the NSL-KDD dataset for network intrusion detection. Testing results show that their 5-class classification achieved an average f-score of 75.76%. In [28], the authors propose an intrusion detection method using a deep belief network (DBN) and a probabilistic neural network (PNN). The experimental result on the KDD CUP 1999 dataset shows that the method performs better than the traditional PNN, PCA-PNN and unoptimized DBN-PNN. Similarly, [29] and [30] train the DBN as a classifier to detect intrusions. In [31], the authors propose a novel network intrusion detection model utilizing convolutional neural networks (CNNs). The CNN model not only reduces the false alarm rate (FAR) but also improves the accuracy of the classes with small numbers of samples. In [32], an artificial intelligence (AI) intrusion detection system using a deep neural network (DNN) is investigated and tested with the KDD Cup 99 dataset in response to ever-evolving network attacks. The results show a significantly high accuracy and detection rate, averaging 99%. However, current deep learning methods don't make full use of the structured information of network traffic. Network traffic is essentially a kind of time series data. Similar to the structure of letters, words, sentences and paragraphs in natural language processing (NLP), network traffic is composed of multiple data packets and each data packet is a set of multiple bytes.
In this paper, drawing on the application methods of deep learning in NLP, we adopt phased processing. The BLSTM is used to learn the sequential features in the data packet to obtain a vector corresponding to each data packet. Then, the attention layer is used to perform feature learning on the sequential data composed of the packet vectors. Attention can filter the characteristics to get a network flow vector, which helps to achieve more accurate network traffic classification. Through the two learning phases of BLSTM and attention on the time series features, the BAT-MC model finally outputs a network flow vector, which contains structured information of network traffic. Hence, the BAT-MC model makes full use of the structure information of network traffic.

III. PROPOSED WORK
As shown in Figure 1, the BAT-MC model consists of five components, from bottom to top: the input layer, multiple convolutional layers, the BLSTM layer, the attention layer and the output layer. At the input layer, the BAT-MC model converts each traffic byte into a one-hot data format. Each traffic byte is encoded as an n-dimensional vector. After the traffic bytes are converted into a numerical form, we perform normalization operations. At the multiple convolutional layers, we convert the numerical data into traffic images. The convolutional operation is used as a feature extractor that takes an image representation of a data packet. At the BLSTM layer, the BLSTM model, which connects the forward LSTM and the backward LSTM, is used to extract features on the traffic bytes of each packet. The BLSTM model can learn the sequential characteristics within the traffic bytes because BLSTM is suitable to the structure of network traffic. In the attention layer, the attention mechanism is used to analyze the degree of importance of packet vectors to obtain fine-grained features which are more salient for malicious traffic detection. At the output layer, the features generated by the attention mechanism are then imported into a fully connected layer for feature fusion, which obtains the key features that accurately characterize network traffic behavior. Finally, the fused features are fed into a classifier to get the final recognition results.

A. DATA PREPROCESSING LAYER
There are three symbolic data types in the NSL-KDD data features: protocol type, flag and service. We use a one-hot encoder to map these features into binary vectors.
One-Hot Processing: The NSL-KDD dataset is processed by the one-hot method to transform symbolic features into numerical features. For example, the second feature of the


NSL-KDD data sample is protocol type. The protocol type has three values: tcp, udp, and icmp. The one-hot method converts these values into binary codes that can be recognized by a computer, where tcp is [1, 0, 0], udp is [0, 1, 0], and icmp is [0, 0, 1].

FIGURE 1. The architecture of the BAT-MC model. The whole architecture is divided into five parts.

Normalization Processing: The values of the original data may be too large, resulting in problems such as large values swamping small ones ("large numbers eating decimals"), overflow during data processing, and inconsistent weights. We use a scaler to normalize the continuous data into the range [0, 1]. Normalization processing eliminates the influence of the measurement unit on the model training, and makes the training result more dependent on the characteristics of the data itself. The formula is shown in equation (1) and equation (2).

$$r' = \frac{r - r_{\min}}{r_{\max} - r_{\min}} \tag{1}$$

$$r_{\max} = \max\{r\} \tag{2}$$

where $r$ stands for the numeric feature value, $r_{\min}$ stands for the minimal value of the feature, $r_{\max}$ stands for the maximal value, and $r'$ stands for the value after the normalization.

B. MULTIPLE CONVOLUTIONAL LAYERS
After the above processing operations, the convolutional layer is used to capture the local features of traffic data. The convolutional layer [33], [34] is the most important part of the CNN, which convolves the input images (or feature maps) with multiple convolutional kernels to create different feature maps. According to [35], the shallower convolutional layers, whose receptive field is narrow, can extract local information, while the deeper layers can capture global information with a larger receptive field. Hence, as the number of convolutional layers increases, the scale of the convolutional feature gradually becomes coarser. In this paper, the input of the convolutional layer can be formulated as a tensor of size $H \times W \times 1$, where $H$ and $W$ denote the height and width of the data yielded by normalization processing. Suppose a layer of $N$ units is the input to a convolutional layer. If we use a filter $w$ of width $m$, the convolutional output will have $(N - m + 1)$ units. The convolutional calculation process is shown in equation (3).

$$x^{l,j}_{i,k} = f\left(b_j + \sum_{a=1}^{m} w^{j}_{a,k}\, r^{l-1,j}_{i+(k-1)\times s+a-1}\right) \tag{3}$$

where $x^{l,j}_{i,k}$ is the $i$th unit of the $j$th feature map of the $k$th section in the $l$th layer, and $s$ is the range of the section. $f$ is a non-linear mapping, usually the hyperbolic tangent function, $\tanh(\cdot)$.

C. BLSTM LAYER
For the time series data composed of traffic bytes, BLSTM can effectively use the context information of data for feature learning. The BLSTM is used to learn the time series features in the data packet. Traffic bytes of each data packet are sequentially input into a BLSTM, which finally obtains a packet vector. BLSTM is an enhanced version of LSTM (Long Short-Term Memory) [36], [37]. The BLSTM model is used to extract coarse-grained features by connecting forward LSTM and backward LSTM. LSTM is designed by the input
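The one-hot and normalization steps of Section III-A can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' preprocessing code: the helper names `one_hot` and `min_max_scale`, the toy records and the choice to map a constant column to zeros are all assumptions made for the example. The category order follows the tcp/udp/icmp example in the text, and the scaling follows equation (1).

```python
def one_hot(value, categories):
    """Return a binary indicator list for `value` over an ordered category list."""
    return [1 if value == c else 0 for c in categories]


def min_max_scale(column):
    """Rescale a numeric column to [0, 1] following equation (1)."""
    r_min, r_max = min(column), max(column)
    if r_max == r_min:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(r - r_min) / (r_max - r_min) for r in column]


PROTOCOLS = ["tcp", "udp", "icmp"]  # order taken from the example in the text

# Hypothetical toy records: (protocol_type, duration, src_bytes)
records = [("tcp", 0, 491), ("udp", 2, 146), ("icmp", 10, 0)]

encoded_protocol = [one_hot(p, PROTOCOLS) for p, _, _ in records]
scaled_duration = min_max_scale([d for _, d, _ in records])
scaled_src_bytes = min_max_scale([b for _, _, b in records])

# Concatenate the one-hot block with the scaled numeric features.
samples = [
    enc + [d, b]
    for enc, d, b in zip(encoded_protocol, scaled_duration, scaled_src_bytes)
]
```

In practice the minimum and maximum would be taken from the training split only and reused for the test split, so that test data does not influence the scaling.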


gate $i$, the forget gate $f$ and the output gate $o$ to control how to overwrite the information by comparing the inner memory cell $C$ when new information arrives [38]. When information enters an LSTM network, we can judge whether it is useful according to relevant rules. Only the information that passes the algorithm's check is retained, and inconsistent information is forgotten through the forget gate. Given an input sequence $x = (x_0, \ldots, x_t)$ at time $t$ and the hidden states of a BLSTM layer, $h = (h_0, \ldots, h_t)$ can be derived as follows.

FIGURE 2. The architecture of BLSTM model.

The forget gate takes the output of the hidden layer $h_{t-1}$ at the previous moment and the input $x_t$ at the current moment as input to selectively forget in the cell state $C_t$, which can be expressed as:

$$f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{4}$$

The input gate cooperates with a tanh function to control the addition of new information. tanh generates a new candidate vector. The input gate generates a value from 0 to 1 for each item in $\tilde{C}_t$ to control how much new information will be added, which can be expressed as:

$$C_t = \mathrm{sigmoid}(f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t) \tag{5}$$

$$i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{6}$$

$$\tilde{C}_t = \tanh(W_c x_t + W_c h_{t-1} + b_c) \tag{7}$$

The output gate is used to control how much of the current unit state will be filtered out, which can be expressed as:

$$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{8}$$

For the BLSTM model at time $t$, the hidden state $h_t$, which is a packet vector generated from each packet, can be defined as the concatenation of $\overleftarrow{h}_t$ and $\overrightarrow{h}_t$, which can be expressed as:

$$h_t = \overleftarrow{h}_t + \overrightarrow{h}_t \tag{9}$$

$$\overrightarrow{h}_t = \tanh(W_{x\overrightarrow{h}}\, x_t + W_{\overrightarrow{h}\overrightarrow{h}}\, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}) \tag{10}$$

$$\overleftarrow{h}_t = \tanh(W_{x\overleftarrow{h}}\, x_t + W_{\overleftarrow{h}\overleftarrow{h}}\, \overleftarrow{h}_{t-1} + b_{\overleftarrow{h}}) \tag{11}$$

where "$\cdot$" means the pointwise product. $x$ represents the input of the heterogeneous time series data. $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ are the hidden states of the forward LSTM layer and the backward LSTM layer at time $t$. All the matrices $W$ are the connection weights between two units, and $b$ are bias vectors.

D. ATTENTION LAYER
BLSTM eventually generates a packet vector for each packet. These packet vectors are arranged in the order of interaction between the two parties in the network stream to form a sequence of packet vectors. The relationships within packet vectors will be learned by the attention layer. Similarly to [39], the attention mechanism is used to adjust the probability of packet vectors so that our model pays more attention to important features. Firstly, the packet vector $h_t$ extracted by the BLSTM model is used to obtain its implicit representation $u_t$ through a nonlinear transformation, which can be expressed as:

$$u_t = \tanh(W_w h_t + b_w) \tag{12}$$

We next measure the importance of packet vectors based on the similarity of the representation $u_t$ with a context vector $u_w$ and obtain the normalized importance weight coefficient $\alpha_t$. $u_w$ is a randomly initialized matrix that can focus on important information over $u_t$. The weight coefficient for the above coarse-grained features can be expressed as:

$$\alpha_t = \frac{\exp(u_t^{T} u_w)}{\sum_t \exp(u_t^{T} u_w)} \tag{13}$$

Finally, the fine-grained feature $s$ can be computed via the weighted sum of $h_t$ based on $\alpha_t$. $s$ can be expressed as:

$$s = \sum_t \alpha_t h_t \tag{14}$$

The fine-grained feature vector $s$ generated from the attention mechanism is used for malicious traffic recognition with a softmax classifier, which can be expressed as:

$$y = \mathrm{softmax}(W_h s + b_h) \tag{15}$$

where $W_h$ represents the weight matrix of the classifier, which can map $s$ to a new vector with length $h$. $h$ is the number of categories of network traffic.

E. MODEL TRAINING
Training the proposed network contains a forward pass and a backward pass.
Forward Propagation: The BAT-MC model is mainly composed of the BLSTM layer and the attention layer, each of which presents a different structure and thus plays a different role in the whole model. The forward propagation [40], [41] is conducted from the BLSTM layer to the attention layer. The input of the current model is obtained by the processing of the previous model. After the completion of forward propagation, the final recognition result is obtained. The NSL-KDD dataset is defined as $X$. The divided training dataset and testing datasets can be expressed as $x_1, x_2, x_3$. After the one-hot operation and the normalization operation, every sample is converted into a format $X''$ that is acceptable to the BAT-MC model. Meanwhile, we set the cell state vector size as $S_{state}$. The abnormal traffic detection algorithm based on the BAT-MC model is summarized as Algorithm 1. The objective function of our model is the cross-entropy based cost
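The attention pooling of Section III-D, equations (12) to (14), can be illustrated with a small standard-library sketch. It is not the authors' Keras layer: the function name `attention_pool`, the toy 2-dimensional packet vectors and the hand-picked weights are assumptions for the example, and the context $u_w$ is treated here as a vector. Subtracting the maximum score before exponentiating is a common numerical-stability step that leaves the weights of equation (13) unchanged.

```python
import math


def attention_pool(packet_vectors, W_w, b_w, u_w):
    """Return (alpha, s) for a list of packet vectors h_t.

    u_t     = tanh(W_w h_t + b_w)                        equation (12)
    alpha_t = exp(u_t . u_w) / sum_t exp(u_t . u_w)      equation (13)
    s       = sum_t alpha_t * h_t                        equation (14)
    """
    scores = []
    for h in packet_vectors:
        u = [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(W_w, b_w)]
        scores.append(sum(ui * ci for ui, ci in zip(u, u_w)))

    top = max(scores)  # shift scores so the largest exponent is exp(0)
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]

    dim = len(packet_vectors[0])
    s = [sum(a * h[k] for a, h in zip(alpha, packet_vectors)) for k in range(dim)]
    return alpha, s


# Hypothetical toy inputs: three packet vectors of dimension 2.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_w = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen for readability
b_w = [0.0, 0.0]
u_w = [1.0, 1.0]

alpha, s = attention_pool(H, W_w, b_w, u_w)
```

With these inputs the third packet vector receives the largest weight, because both of its transformed components align with the context vector; in a trained model `W_w`, `b_w` and `u_w` would be learned jointly with the rest of the network.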


Algorithm 1 BAT-MC Intrusion Detection Algorithm
Input: NSL-KDD dataset, adam, lr, batch_size
Output: Accuracy
1 get $X = (x_1, x_2, x_3)$ from NSL-KDD dataset;
2 $x'_1, x'_2, x'_3$ = one-hot($x_1, x_2, x_3$);
3 $x''_1, x''_2, x''_3$ = normalization($x'_1, x'_2, x'_3$);
4 conduct convolutional processing;
5 for $t = 1$; $t \le T$; do
6   create $\overleftarrow{\mathrm{LSTM}}$ cell by $S_{state}$;
7   create $\overrightarrow{\mathrm{LSTM}}$ cell by $S_{state}$;
8   connect $BLSTM_{net}$ by $\overleftarrow{\mathrm{LSTM}}$ cell and $\overrightarrow{\mathrm{LSTM}}$ cell;
9   initialize $BLSTM_{net}$ by seed;
10   get hidden states $h_t$ of the $BLSTM_{net}$;
11 end
12 add a full connection layer, whose value is 320;
13 add a dropout, whose value is 0.1;
14 for each hidden state in $1 : h_t$; do
15   obtain the implicit representation $u_t$ of $h_t$ through a nonlinear transformation;
16   generate a random initialization matrix $u_w$;
17   obtain the normalized importance weight coefficient $\alpha_t$;
18   get the fine-grained feature $s$ via $\alpha_t$ and $h_t$;
19 end
20 add a full connection layer, whose value is 1024;
21 add a full connection layer, whose value is 10;
22 $P = BAT\text{-}MC_{net}(X'')$;
23 get Loss by $p_i$ and $y_i$;
24 update $BAT\text{-}MC_{net}$ by Adam with loss and $\eta$;
25 return accuracy, f1-score;

function [42]. The goal of training this model is to minimize the cross entropy of the expected and actual outputs for all activities. The formula is shown in (16):

$$C = -\sum_i \sum_j \left[ y_i^{j} \ln a_i^{j} + (1 - y_i^{j}) \ln(1 - a_i^{j}) \right] \tag{16}$$

where $i$ is the index of network traffic. $j$ is the traffic category. $a$ is the actual category of network traffic and $y$ is the predicted category.
Backward Propagation: The model is trained with Adam [43]. Adam is calculated by the back-propagation algorithm. Error differentials are back-propagated with the forward-backward algorithm. Back-Propagation Through Time (BPTT) [44], [45] is applied to calculate the error differentials. In this paper, we use the BPTT algorithm to obtain the derivatives of the objective function with respect to all the weights, and minimize the objective function by stochastic gradient descent.

IV. EVALUATION
In this section, we first determine the parameters of BAT-MC to obtain the optimal model through experiments carried out on a public dataset: the NSL-KDD dataset [46], [47]. Then, we analyze the performance of the BAT-MC model. Finally, in order to verify the advancement and practicability of the BAT-MC model, we compare the performance of this model with some state-of-the-art works.

A. BENCHMARK DATASETS
The final result of network traffic anomaly detection is closely related to the dataset. The NSL-KDD dataset is an enhanced version of the KDD Cup 1999 dataset [48], [49], which is widely used in intrusion detection experiments. The NSL-KDD dataset not only effectively solves the inherent redundant records problem of the KDD Cup 1999 dataset but also makes the number of records reasonable in the training dataset and testing dataset. The NSL-KDD dataset is mainly composed of the KDDTrain+ training dataset and the KDDTest+ and KDDTest-21 testing datasets, which allows a reasonable comparison of the experimental results of different methods. As shown in Table 1, the NSL-KDD dataset has normal records and four different types of abnormal records. The KDDTest-21 dataset is a subset of KDDTest+ and is more difficult to classify.

TABLE 1. Different classifications in the NSL-KDD dataset.

Network traffic is generally collected at fixed time intervals. Essentially, network traffic data is a kind of time series data. Network traffic is a traffic unit composed of multiple data packets. Each data packet is seen as a whole consisting of a sequence of traffic bytes. There are 41 features from different data packets and 1 class label for every data packet. It can be described in the following form: $x = (b_0, \ldots, b_i, \ldots)$. $b_i$ is the $i$-th feature in a data packet, and $x$ represents the continuous features of a data packet. These features include basic features (1-10), content features (11-22) and traffic features (23-41) [50]. According to its characteristics, there are four types of attacks in this dataset: DoS (Denial of Service attacks), R2L (Remote to Local attacks), U2R (User to Root attacks), and Probe (Probing attacks).

B. EVALUATION METRIC
In this paper, Accuracy (A) is used to evaluate the BAT-MC model. In addition to accuracy, the true positive rate (TPR) and the false positive rate (FPR) are also introduced [51]. These three indicators are commonly used in the research field of network traffic anomaly detection, and their calculation formulas are shown below. True Positive (TP) represents the correct classification of an intruder. False Positive (FP) represents the incorrect classification of a normal user taken as an intruder. True Negative (TN) represents a normal
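The cost in equation (16) can be evaluated directly for a small batch. The sketch below is an illustration only, under one labelled assumption: it takes $y$ as the 0/1 target indicator and $a$ as the network's output probability, which is the usual reading of this formula, although the sentence following equation (16) assigns the two names the other way round. The function name `cross_entropy_cost` and the clamping constant `eps` are not from the paper.

```python
import math


def cross_entropy_cost(targets, outputs, eps=1e-12):
    """Evaluate equation (16) over samples i and categories j.

    targets: per-sample lists of 0/1 indicators (assumed to play the role of y)
    outputs: per-sample lists of probabilities in [0, 1] (assumed role of a)
    """
    total = 0.0
    for y_row, a_row in zip(targets, outputs):
        for y, a in zip(y_row, a_row):
            a = min(max(a, eps), 1.0 - eps)  # clamp so both logarithms stay finite
            total += y * math.log(a) + (1.0 - y) * math.log(1.0 - a)
    return -total


# Hypothetical two-category example for a single sample.
uncertain = cross_entropy_cost([[1, 0]], [[0.5, 0.5]])
confident_right = cross_entropy_cost([[1, 0]], [[0.9, 0.1]])
confident_wrong = cross_entropy_cost([[1, 0]], [[0.1, 0.9]])
```

The cost is smallest when the output probability mass sits on the target category and grows quickly for confident mistakes, which is the behavior the optimizer exploits during training.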


user classified correctly. False Negative (FN) represents an instance where the intruder is incorrectly classified as a normal user.
Accuracy represents the proportion of correctly classified samples to the total number of samples. It is defined as follows:

$$A = \frac{TP + TN}{TP + FP + FN + TN} \tag{17}$$

True Positive Rate (TPR): as the equivalent of the Detection Rate (DR), it represents the percentage of the number of records correctly identified over the total number of anomaly records.

$$DR = TPR = \frac{TP}{TP + FN} \tag{18}$$

False Positive Rate (FPR) represents the percentage of the number of records rejected incorrectly over the total number of normal records. It is defined as follows:

$$FPR = \frac{FP}{FP + TN} \tag{19}$$

C. EXPERIMENTAL SETTINGS
In order to test the performance of the BAT-MC model proposed in this paper, the NSL-KDD dataset is used for verification. The data samples of the NSL-KDD dataset are divided into two parts: one is used to build a classifier and is called the training dataset. The other is used to evaluate the classifier and is called the testing dataset. There are 125,973 records in the training set and 22,543 records in the testing set. Table 2 shows the distribution of training and testing records for the (normal/attack) type of network traffic.

TABLE 2. Distribution of training and testing records.

TABLE 3. Super parameters of the end-to-end learning model.

In the experiment of identifying malicious traffic, when there are 80 hidden nodes in the BAT-MC model, the accuracy of BAT-MC on the KDDTest+ dataset is higher. Meanwhile, the learning rate is set to 0.01 and the number of training epochs is 100. The confusion matrices generated by the BAT-MC model on the KDDTest+ dataset are shown in Figure 3 and Figure 4. Figure 3 and Figure 4 represent the experimental results of the BAT-MC model for the 2-class and 5-class classification, respectively. The experimental results show that most samples are concentrated on the diagonal of the confusion matrix, indicating that the overall classification performance is very high. However, it can be intuitively seen from the confusion matrix in Figure 3 that the BAT-MC network achieves good detection performance in distinguishing normal traffic from attack traffic (only 51 samples are false positives), but there is still further improvement in

The operating environment of all experiments is Keras with

tensorflow as the backend; Operating system is 64-bit CtOS7;
Processor is E5-2620 v4; Main frequency is 2.10GHz; Mem-
ory is 32.0G; Python version is 3.6. In view of many hyper
parameters existing in the BAT-MC model, we performed
100 iterations of training on the NSL-KDD set. The hyper
parameters with the highest accuracy is selected as the model
parameter. The BAT-MC model is also verified on the test-
ing dataset. After lots of experiments, three one-dimensional
convolution layers are adopted when building the BAT-MC
model for intrusion detection task. The parameter list of BAT-
MC network is set as shown in Table 3.

D. PERFORMANCE ANALYSIS OF BAT-MC

Experiments have been designed to study the performance
of the BAT-MC model for 2-category and 5-category clas-
sification, such as Normal, DoS, R2L, U2R and Probe. FIGURE 3. Confusion matrix yielded by the BAT-MC model (5-class).


distinguishing different attack traffic. The detection of DoS and Probe attack traffic is relatively good, while the detection of R2L and U2R attack traffic is ineffective.

FIGURE 4. Confusion matrix yielded by the BAT-MC model (2-class).

After careful fine-tuning, the accuracy comparison of the BAT-MC model on the KDDTest+ and KDDTest-21 sets is shown in Figure 5. As the number of iterations increases, the accuracy of the BAT-MC model on both the training set and the test set shows an overall upward trend. Experiments on the KDDTest+ dataset show that when epoch = 100, the BAT-MC model has a good accuracy (84.25%). At the same time, the accuracy of the BAT-MC model on the KDDTest-21 dataset is 69.42% and the accuracy on the KDDTrain+ dataset is 99.21%. Table 4 shows the detection rate (DR) and false positive rate (FPR) for different attack types. The motivation of intrusion detection is to obtain a higher accuracy and detection rate with a lower false positive rate. It can be seen that the U2R class has the lowest detection rate and false positive rate. Classes with fewer samples, such as U2R, are more likely to be misclassified than those with more samples.

FIGURE 5. Accuracy on the KDDTest+ and KDDTest-21 datasets (5-class).

TABLE 4. DR and FPR of the BAT-MC model on the NSL-KDD dataset (5-class).

Here, we evaluate the performance of our model with respect to convolutional layer diversity. We perform the classification task with different numbers of convolutional layers. As shown in Table 5, the accuracy increases somewhat as the number of convolutional layers increases. When the BAT-MC model does not include convolutional layers, the accuracy of BAT-MC reaches 84.25%. Overall, our BAT-MC model shows a better classification accuracy (84.25%) for diverse convolutional layers.

TABLE 5. Convolutional layer diversity.

E. COMPARISON TO THE STATE OF THE ART
In order to objectively evaluate the accuracy and differentiation of the BAT-MC network, we compare our network with the related works proposed in [52]–[54]. In [52], the authors propose a deep learning approach for intrusion detection using recurrent neural networks (RNN). Compared with traditional classification methods, such as J48, naive Bayesian, and random forest, the approach obtains a higher accuracy rate and detection rate with a low false positive rate, especially under the task of multiclass classification on the NSL-KDD dataset. In [53], the authors build a Deep Neural Network (DNN) model for an intrusion detection system and train the model with the NSL-KDD dataset. Experimental results confirm that the deep learning approach shows strong potential to be used for flow-based anomaly detection in SDN environments. In [54], the authors propose to use a typical deep learning method, Convolutional Neural Networks (CNN), for detecting cyber intrusions. The experimental results show that the performance of this IDS model is superior to the performance of models based on traditional machine learning methods and novel deep learning methods in multi-class classification. These works use the same NSL-KDD dataset for network traffic classification. They are not only recent, highly relevant, and representative works on intrusion detection, but also achieve excellent accuracy. The comparison results among these works on the NSL-KDD dataset are shown in Figure 6 and Figure 7, respectively.

As shown in Figure 6 and Figure 7, we can observe that the BAT-MC model performs better than other models in terms of accuracy, reaching 84.25% and 69.42% on the KDDTest+ and KDDTest-21 testing sets, respectively. Compared with the


model of [52], the authors adopt traditional machine learning methods to detect abnormal traffic. That is to say, traffic features need to be designed manually, and the extraction and selection of network traffic features must be completed before model training. In contrast, the BAT-MC model directly takes the collected traffic as original input. Then, the attention mechanism captures key features from the outputs produced by the BLSTM model. Experimental results show that the BAT-MC model can automatically extract features by means of end-to-end learning, which achieves better classification results than manual design methods. Meanwhile, we compare against recent works that use deep learning models for abnormal traffic detection. As can be seen from Figure 7, the BAT-MC model achieved the best results on both the KDDTest+ and KDDTest-21 testing sets. On the KDDTest+ set, the accuracy of the BAT-MC model is 4.12% and 2.96% higher than CNN [54] and RNN [52], respectively. On the KDDTest-21 set, the accuracy of the BAT-MC model is 4.75% and 7.1% higher than CNN [54] and RNN [52], respectively. The BAT-MC network is more accurate than CNN because CNN is more suitable for processing image data. Additionally, CNN uses a fixed convolution kernel that cannot model longer contextual information, which is not conducive to the feature extraction of time series data. The BAT-MC network is better than RNN, LSTM and BLSTM because the BAT-MC model combines the attention mechanism to capture the key features and obtain more context information. The BAT-MC model can capture features of network traffic more comprehensively, extracting the information of each data packet and then using it frame by frame. These results prove that the BAT-MC network can offer a significant advantage across very different scenarios.

FIGURE 6. Performance of BAT-MC model and other machine learning models.

FIGURE 7. Performance of BAT-MC model and other deep learning models.

As the number of iterations increases, the accuracy of each model shows an overall upward trend. It can be seen from Figure 8 that the test accuracy of the BAT-MC model not only improves the fastest but also fluctuates less after 20 iterations. The accuracy of the BAT-MC model remains almost unchanged. As the number of iterations increases, the accuracy of the BAT model continues to increase, eventually reaching an ideal state. The accuracy of the BAT-MC model is higher than that of the BAT model because BAT-MC can capture global information, which proves the advantages of multiple convolutional layers. The RNN model has small-scale fluctuations in accuracy during the iterative process. The RNN model improves faster and also has lower accuracy than the BAT and BAT-MC models. The CNN model starts to improve at a slower rate and has the worst performance among all models. In summary, the BAT-MC network classifies the time series data with 84.25% accuracy, which makes it an effective intrusion detection method.

FIGURE 8. Comparison of accuracy with different models.

V. CONCLUSION
The current deep learning methods in network traffic classification research do not make full use of the structured information of network traffic. Drawing on the application methods of deep learning in the field of natural language processing, we propose a novel model, BAT-MC, which performs two-phase learning with BLSTM and attention on the time series features for intrusion detection using the NSL-KDD dataset. The BLSTM layer, which connects the forward LSTM and the backward LSTM, is used to extract features from the traffic bytes of each packet. Each data packet can produce a packet vector. These packet vectors are arranged to form a network flow vector. The attention layer is used to perform feature learning on the network flow vector composed of packet vectors. The above feature learning process is automatically completed by the deep neural network without any feature engineering technology.
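The attention step summarized above, which weights packet vectors and combines them into a flow-level representation, can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the scoring function (a tanh projection, a dot product with a context vector, then a softmax), the names `attention_pool`, `W`, `b` and `u`, and the toy numbers are all assumptions chosen for illustration. In the real model the packet vectors would come from the BLSTM layer and the parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(packet_vectors, W, b, u):
    """Collapse a sequence of packet vectors into one flow vector.

    Assumed scoring rule: score_t = u . tanh(W h_t + b),
    alpha = softmax(score), flow = sum_t alpha_t * h_t.
    """
    scores = []
    for h in packet_vectors:
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
            for row, bias in zip(W, b)
        ]
        scores.append(sum(ui * zi for ui, zi in zip(u, hidden)))
    weights = softmax(scores)
    dim = len(packet_vectors[0])
    flow = [
        sum(a * h[d] for a, h in zip(weights, packet_vectors))
        for d in range(dim)
    ]
    return flow, weights


# Three toy 2-dimensional "packet vectors" (hypothetical values).
packets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection matrix
b = [0.0, 0.0]                # hypothetical bias
u = [1.0, 1.0]                # hypothetical context vector
flow, weights = attention_pool(packets, W, b, u)
print(weights, flow)
```

With these toy parameters the third packet receives the largest weight because its projected score is highest; with all-zero `W` and `b` the weights become uniform and the flow vector reduces to a plain average of the packet vectors.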


This model effectively avoids the problem of manually designing features. The performance of the BAT-MC method is tested on the KDDTest+ and KDDTest-21 datasets. Experimental results on the NSL-KDD dataset indicate that the BAT-MC model achieves accuracies of 84.25% on KDDTest+ and 69.42% on KDDTest-21. Comparisons with some standard classifiers show that the BAT-MC model's results are very promising when compared to other current deep learning-based methods. Hence, we believe that the proposed method is a powerful tool for the intrusion detection problem.

REFERENCES
[1] B. B. Zarpelo, R. S Miani, C. T. Kawakani, and S. C. de Alvarenga, "A survey of intrusion detection in Internet of Things," J. Netw. Comput. Appl., vol. 84, pp. 25–37, Apr. 2017.
[2] B. Mukherjee, L. T. Heberlein, and K. N. Levitt, "Network intrusion detection," IEEE Netw., vol. 8, no. 3, pp. 26–41, May 1994.
[3] S. Kishorwagh, V. K. Pachghare, and S. R. Kolhe, "Survey on intrusion detection system using machine learning techniques," Int. J. Control Automat., vol. 78, no. 16, pp. 30–37, Sep. 2013.
[4] N. Sultana, N. Chilamkurti, W. Peng, and R. Alhadad, "Survey on SDN based network intrusion detection system using machine learning approaches," Peer-to-Peer Netw. Appl., vol. 12, no. 2, pp. 493–501, Mar. 2019.
[5] M. Panda, A. Abraham, S. Das, and M. R. Patra, "Network intrusion detection system: A machine learning approach," Intell. Decis. Technol., vol. 5, no. 4, pp. 347–356, 2011.
[6] W. Li, P. Yi, Y. Wu, L. Pan, and J. Li, "A new intrusion detection system based on KNN classification algorithm in wireless sensor network," J. Electr. Comput. Eng., vol. 2014, pp. 1–8, Jun. 2014.
[7] S. Garg and S. Batra, "A novel ensembled technique for anomaly detection," Int. J. Commun. Syst., vol. 30, no. 11, p. e3248, Jul. 2017.
[8] F. Kuang, W. Xu, and S. Zhang, "A novel hybrid KPCA and SVM with GA model for intrusion detection," Appl. Soft Comput., vol. 18, pp. 178–184, May 2014.
[9] W. Wang, M. Zhu, X. Zeng, X. Ye, and Y. Sheng, "Malware traffic classification using convolutional neural network for representation learning," in Proc. Int. Conf. Inf. Netw. (ICOIN), 2017, pp. 712–717.
[10] P. Torres, C. Catania, S. Garcia, and C. G. Garino, "An analysis of Recurrent Neural Networks for Botnet detection behavior," in Proc. IEEE Biennial Congr. Argentina (ARGENCON), Jun. 2016, pp. 1–6.
[11] R. C. Staudemeyer and C. W. Omlin, "ACM press the south African institute for computer scientists and information technologists conference - east London, south Africa (2013.10.07-2013.10.09) proceedings of the south African institute for computer scientists and information technologists co," in Proc. South African Inst. Comput. Scientists Inf. Technol. Conf., 2013, pp. 252–261.
[12] S. Cornegruta, R. Bakewell, S. Withey, and G. Montana, "Modelling radiological language with bidirectional long short-term memory networks," in Proc. 7th Int. Workshop Health Text Mining Inf. Anal., 2016, pp. 1–11.
[13] O. Firat, K. Cho, and Y. Bengio, "Multi-way, multilingual neural machine translation with a shared attention mechanism," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics, Hum. Lang. Technol., 2016, pp. 1–10.
[14] H. Zhang, "Design of intrusion detection system based on a new pattern matching algorithm," in Proc. Int. Conf. Comput. Eng. Technol., Jan. 2009, pp. 545–548.
[15] C. Yin, "An improved BM pattern matching algorithm in intrusion detection system," Appl. Mech. Mater., vols. 148–149, pp. 1145–1148, Jan. 2012.
[16] P.-F. Wu and H.-J. Shen, "The research and amelioration of pattern-matching algorithm in intrusion detection system," in Proc. IEEE 14th Int. Conf. High Perform. Comput. Commun., IEEE 9th Int. Conf. Embedded Softw. Syst., Jun. 2012, pp. 1712–1715.
[17] V. Dagar, V. Prakash, and T. Bhatia, "Analysis of pattern matching algorithms in network intrusion detection systems," in Proc. 2nd Int. Conf. Adv. Comput., Commun., Autom. (ICACCA), Sep. 2016, pp. 1–5.
[18] M. S. Pervez and D. M. Farid, "Feature selection and intrusion classification in NSL-KDD cup 99 dataset employing SVMs," in Proc. 8th Int. Conf. Softw., Knowl., Inf. Manage. Appl. (SKIMA), Dec. 2014, pp. 1–6.
[19] H. Shapoorifard and P. Shamsinejad, "Intrusion detection using a novel hybrid method incorporating an improved KNN," Int. J. Control Automat., vol. 173, no. 1, pp. 5–9, Sep. 2017.
[20] J. Zhang, M. Zulkernine, and A. Haque, "Random-forests-based network intrusion detection systems," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 5, pp. 649–659, Sep. 2008.
[21] B. Ingre and A. Yadav, "2015 international conference on signal processing and communication engineering systems (spaces)," in Proc. Int. Conf. Signal Process. Commun. Eng. Syst., 2015, pp. 1–15.
[22] B. Ingre, A. Yadav, and A. K. Soni, "Decision tree based intrusion detection system for NSL-KDD dataset," in Proc. Int. Conf. Inf. Commun. Technol. Intell. Syst., 2017, pp. 207–218.
[23] M. Asadi-Aghbolaghi, A. Clapes, M. Bellantonio, H. J. Escalante, V. Ponce-Lopez, X. Baro, I. Guyon, S. Kasaei, and S. Escalera, "A survey on deep learning based approaches for action and gesture recognition in image sequences," in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2017, pp. 476–483.
[24] Z. Yan, "Multi-instance multi-stage deep learning for medical image recognition," Deep Learn. Med. Image Anal., pp. 83–104, Jan. 2017.
[25] Z. Zhang, J. Geiger, J. Pohjalainen, A. E.-D. Mousa, W. Jin, and B. Schuller, "Deep learning for environmentally robust speech recognition," ACM Trans. Intell. Syst. Technol., vol. 9, no. 5, pp. 1–28, 2017.
[26] K. Noda, Y. Yamaguchi, K. Nakadai, H. G. Okuno, and T. Ogata, "Audio-visual speech recognition using deep learning," Appl. Intell., vol. 42, no. 4, pp. 722–737, Jun. 2015.
[27] A. Javaid, Q. Niyaz, W. Sun, and M. Alam, "A deep learning approach for network intrusion detection system," in Proc. 9th EAI Int. Conf. Bio-Inspired Inf. Commun. Technol. (BIONETICS), 2016, pp. 21–26.
[28] G. Zhao, C. Zhang, and L. Zheng, "Intrusion detection using deep belief network and probabilistic neural network," in Proc. IEEE Int. Conf. Comput. Sci. Eng. (CSE), IEEE Int. Conf. Embedded Ubiquitous Comput. (EUC), Jul. 2017, pp. 639–642.
[29] N. Gao, L. Gao, Q. Gao, and H. Wang, "An intrusion detection model based on deep belief networks," in Proc. 2nd Int. Conf. Adv. Cloud Big Data, Nov. 2014, pp. 247–252.
[30] M. Z. Alom, V. Bontupalli, and T. M. Taha, "Intrusion detection using deep belief networks," in Proc. Nat. Aerosp. Electron. Conf. (NAECON), Jun. 2015, pp. 247–252.
[31] K. Wu, Z. Chen, and W. Li, "A novel intrusion detection model for a massive network using convolutional neural networks," IEEE Access, vol. 6, pp. 50850–50859, 2018.
[32] J. Kim, N. Shin, S. Y. Jo, and S. Hyun Kim, "Method of intrusion detection using deep neural network," in Proc. IEEE Int. Conf. Big Data Smart Comput. (BigComp), Feb. 2017, pp. 313–316.
[33] A. Tatsuma and M. Aono, "Food image recognition using covariance of convolutional layer feature maps," IEICE Trans. Inf. Syst., vol. E99.D, no. 6, pp. 1711–1715, 2016.
[34] Z. Yu, T. Li, G. Luo, H. Fujita, N. Yu, and Y. Pan, "Convolutional networks with cross-layer neurons for image recognition," Inf. Sci., vols. 433–434, pp. 241–254, Apr. 2018.
[35] W. Luo, Y. Li, R. Urtasun, and R. Zemel, "Understanding the effective receptive field in deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 4898–4906.
[36] K. Greff, R. K. Srivastava, J. Koutnik, B. R. Steunebrink, and J. Schmidhuber, "LSTM: A search space odyssey," IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 10, pp. 2222–2232, Oct. 2017.
[37] F. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors, vol. 16, no. 1, p. 115, Jan. 2016.
[38] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Comput., vol. 12, no. 10, pp. 2451–2471, Oct. 2000.
[39] N. Pappas and A. Popescu-Belis, "Multilingual hierarchical attention networks for document classification," in Proc. IJCNLP, 2017, pp. 1–11.
[40] Y. Hua, Z. Zhao, R. Li, X. Chen, Z. Liu, and H. Zhang, "Deep learning with long short-term memory for time series prediction," IEEE Commun. Mag., vol. 57, no. 6, pp. 114–119, Jun. 2019.
[41] S. Iamsa-at and P. Horata, "Handwritten character recognition using histograms of oriented gradient features in deep learning of artificial neural network," in Proc. Int. Conf. IT Converg. Secur. (ICITCS), Dec. 2013.
[42] A. Boubezoul and S. Paris, "Application of global optimization methods to model and feature selection," Pattern Recognit., vol. 45, no. 10, pp. 3676–3686, Oct. 2012.


[43] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980. [Online]. Available: https://arxiv.org/abs/1412.6980
[44] M. Zeng, L. T. Nguyen, B. Yu, O. J. Mengshoel, J. Zhu, P. Wu, and J. Zhang, "Convolutional neural networks for human activity recognition using mobile sensors," in Proc. 6th Int. Conf. Mobile Comput., Appl. Services, 2014, pp. 197–205.
[45] A. Graves, S. Fernández, and F. Gomez, "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks," in Proc. Int. Conf. Mach. Learn., 2006, pp. 369–376.
[46] S. Revathi and A. Malathi, "A detailed analysis on NSL-KDD dataset using various machine learning techniques for intrusion detection," Int. J. Eng. Res. Technol., vol. 2, no. 12, pp. 1848–1853, 2013.
[47] D. H. Deshmukh, T. Ghorpade, and P. Padiya, "Improving classification using preprocessing and machine learning algorithms on NSL-KDD dataset," in Proc. Int. Conf. Commun., Inf. Comput. Technol. (ICCICT), Jan. 2015, pp. 1–6.
[48] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, "A detailed analysis of the KDD CUP 99 data set," in Proc. IEEE Symp. Comput. Intell. Secur. Defense Appl., Jul. 2009, pp. 1–6.
[49] V. Engen, J. Vincent, and K. Phalp, "Exploring discrepancies in findings obtained with the KDD Cup '99 data set," Intell. Data Anal., vol. 15, no. 2, pp. 251–276, Mar. 2011.
[50] L. Dhanabal and S. P. Shantharajah, "A study on NSL-KDD dataset for intrusion detection system based on classification algorithms," vol. 4, no. 6, pp. 446–452, 2015.
[51] E. M. Stock, J. D. Stamey, R. Sankaranarayanan, D. M. Young, R. Muwonge, and M. Arbyn, "Estimation of disease prevalence, true positive rate, and false positive rate of two screening tests when disease verification is applied on only screen-positives: A hierarchical model using multi-center data," Cancer Epidemiol., vol. 36, no. 2, pp. 153–160, Apr. 2012.
[52] C. Yin, Y. Zhu, J. Fei, and X. He, "A deep learning approach for intrusion detection using recurrent neural networks," IEEE Access, vol. 5, pp. 21954–21961, 2017.
[53] T. A. Tang, L. Mhamdi, D. Mclernon, S. A. R. Zaidi, and M. Ghogho, "Deep learning approach for network intrusion detection in software defined networking," in Proc. Int. Conf. Wireless Netw. Mobile Commun. (WINCOM), Oct. 2016.
[54] Y. Ding and Y. Zhai, "Intrusion detection system for NSL-KDD dataset using convolutional neural networks," in Proc. 2nd Int. Conf. Comput. Sci. Artif. Intell. (CSAI), 2018, pp. 81–85.

TONGTONG SU was born in 1992. He received the master's degree in computer science from the School of Computer and Information Engineering, Tianjin Normal University, in 2019. His main research interests include machine learning and pattern recognition.

HUAZHI SUN received the Ph.D. degree from the University of Science and Technology of Beijing, China, in 2008. He is currently a Professor with the School of Computer and Information Engineering, Tianjin Normal University, China. His main research interests include mobile computing and distributed computing.

JINQI ZHU received the Ph.D. degree in computer science from the University of Electronic Science and Technology of China (UESTC), China, in 2009. In 2013, she joined Nanyang Technological University (NTU) as a Visiting Scholar, under the supervision of Dr. Y. G. Wen. She is currently an Associate Professor with the School of Computer and Information Engineering, Tianjin Normal University, China. Her main research interests include mobile computing, vehicular networks, and security networks.

SHENG WANG is currently pursuing the master's degree with the Academy of Computer and Information Engineering, Tianjin Normal University, China. His current research interests include network technology, big data analysis, and deep learning.

YABO LI is currently pursuing the master's degree in computer application technology with Tianjin Normal University. Her main research interests include wireless self-organizing networks and mobile computing.

No ratings yet
Applsci 10 00370 v2
14 pages
Expert Veri Ed, Online, Free.: Unlimited Access
No ratings yet
Expert Veri Ed, Online, Free.: Unlimited Access
3 pages
MNIST - Ipynb - Colab
No ratings yet
MNIST - Ipynb - Colab
5 pages
Advanced_Integration_of_Artificial_Intel
No ratings yet
Advanced_Integration_of_Artificial_Intel
12 pages
FALLSEM2024-25 BCSE209L TH VL2024250101586 2024-07-30 Reference-Material-I
No ratings yet
FALLSEM2024-25 BCSE209L TH VL2024250101586 2024-07-30 Reference-Material-I
22 pages
Ugv History
No ratings yet
Ugv History
10 pages
ASU Resume Template
No ratings yet
ASU Resume Template
2 pages
Artificial Intelligence
No ratings yet
Artificial Intelligence
2 pages
Fortigate 200f Series
No ratings yet
Fortigate 200f Series
10 pages


Received January 12, 2020, accepted January 31, 2020, date of publication February 10, 2020, date of current

version February 18, 2020.

BAT: Deep Learning Methods on Network

I. INTRODUCTION

In fact, network traffic can be divided into two categories

29576 VOLUME 8, 2020


Input: NSL-KDD dataset, adam, lr, batch_size


The operating environment of all experiments is Keras with

D. PERFORMANCE ANALYSIS OF BAT-MC


TABLE 4. DR and FPR of the BAT-MC model on the NSL-KDD dataset

TABLE 5. Convolutional layer Diversity.

FIGURE 4. Confusion matrix yielded by the BAT-MC model (2-class).

different numbers of convolutional layers. As shown in Table 5,

E. COMPARISON TO THE STATE OF THE ART


FIGURE 6. Performance of BAT-MC model and other machine learning

FIGURE 8. Comparison of Accuracy with different models.

which can extract the information of each data packet and

