Intrusion Detection System Based On Support Vector Machines and The Two-Phase Bat Algorithm

The document proposes a hybrid intrusion detection system that combines the Binary Bat algorithm with Lévy flights for feature selection, and the Bat algorithm for parameter optimization of support vector machines (SVM) classifiers. The system, called BBAL-BA-SVM, uses the Binary Bat algorithm to select relevant features from network data, and the Bat algorithm to tune the parameters of SVM models trained on selected feature subsets, with the goal of improving detection accuracy and reducing false alarms compared to other methods. The proposed approach is tested on the standard NSL-KDD network intrusion detection dataset.

INTRUSION DETECTION SYSTEM BASED ON SUPPORT VECTOR MACHINES AND THE TWO-PHASE BAT ALGORITHM

Eseoghene Daniel Erigha, Femi Emmanuel Ayo, Oluwatobi Olakunle Dada, and Olusegun Folorunso

Department of Computer Science, Federal University of Agriculture, Abeokuta, Ogun State, Nigeria

Journal of Information System Security, Volume 13, Issue 3. ISSN: 1551-0123. www.jissec.org
Editor-in-Chief: Gurpreet Dhillon, The University of North Carolina, Greensboro, USA.
Managing Editor: Filipe de Sá-Soares, University of Minho, Portugal.

Abstract

Network intrusions have become a pervasive threat to online ecosystems. Hence the
need for an effective Intrusion Detection System (IDS) to safeguard and protect
assets from a myriad of network attacks. A number of IDS that utilize effective
feature selection methods have been proposed in the literature. However, this study
asserts that an IDS can provide better performance if parameter optimization for the
classifier is embedded in the feature selection process. Consequently, this paper
proposes a hybrid wrapper feature selection approach that combines the Binary Bat
algorithm with Lévy flights, together with the Bat algorithm and Support Vector
Machines (BBAL-BA-SVM). The Binary Bat algorithm with Lévy flights performs
the feature selection while the Bat algorithm performs parameter optimization on the
SVM for each feature subset. Experimental results using the NSL-KDD dataset show
that the proposed model provides higher accuracy in attack detection with a lower
false alarm rate than the compared models.

Keywords: Intrusion Detection System, Bat Algorithm, Binary Bat Algorithm, Support Vector Machines (SVM), BBAL.

Introduction

An Intrusion Detection System (IDS), as a remedy for the complex and diverse
threats to network security, is not without loopholes. The need to guarantee
information security has increased the popularity of intrusion detection schemes
(Manekar and Waghmare, 2014). Several protection schemes have been formulated
and modified in the past, but these schemes are still vulnerable to diverse risks and
attacks. The feature selection approach is one of the recent methods for building a
more reliable IDS. The efficiency of most protection schemes that are rooted in the
feature selection approach is measured on the basis of their accuracy and false alarm
rate. The feature selection approach is sound for providing real-time security solutions
and for correcting anomaly detection. Data preprocessing and classification are the
two common phases of IDS models in the effort to improve performance (Enache
and Sgarciu, 2015). The preprocessing phase entails dataset transformation, while
the classification phase makes use of the transformed dataset for anomaly detection.
Although the preprocessing phase is crucial in extracting a reduced dataset of
relevant features, it is most important to have an optimization step in the
classification phase for parameter optimization. In this paper,
we propose a hybrid wrapper feature selection approach that combines the Binary Bat
algorithm with Lévy flights (Enache and Sgarciu, 2014), together with the Bat
algorithm and Support Vector Machines (BBAL-BA-SVM). Our main goal is to
enhance the previously proposed scheme (Enache et al., 2015), test it on a standard
dataset, and compare it with other related feature selection schemes. This goal is
motivated by the assertion that an IDS can provide better performance if parameter
optimization for the classifier is embedded in the feature selection process.

The rest of this paper is organized as follows: Section 2 presents an up-to-date review
of feature selection approaches combining Swarm Intelligence (SI) with
machine learning algorithms. Next, Section 3 defines the Bat Algorithm, while
Section 4 introduces the Binary Bat Algorithm. Sections 5 and 6 present the Binary
Bat Algorithm with Lévy flights and the SVM concept, respectively. Section 7 presents
the proposed model, while Section 8 shows the evaluation results. Finally, the
conclusions and future work are discussed.

Feature selection

Big data is a common concern for network security. The fast rate at which incoming
network traffic is converted into huge datasets makes it difficult for the IDS to
process the resulting big data and produce real-time classification. Feature
selection, also known as variable selection, can help reduce the number of
attributes in a huge dataset and extract only the fraction of attributes that best
represents the problem with the same efficiency (Sammut and Webb, 2010).
The feature selection approach is very useful in terms of model simplification
for easier interpretation by researchers, reduced training time, and the reduction of
variance to enhance generalization. The core premise of feature selection methods is
that the data contains many variables that are either redundant or irrelevant and can
therefore be removed without reducing the quality of the data.
Feature selection approaches are commonly employed in domains where there are
many features and few data points. Basically, feature selection approaches can be
divided into scalar methods that choose features individually and vector methods that
search for a new feature subset based on the correlation between features. The latter
approach can further be divided according to the feature evaluation measure:
filter-based (scores feature subsets by a proxy measure instead of the error rate),
wrapper (requires a predictive model to score feature subsets), and hybrid
(combines both filter and wrapper) (Dua and Du, 2011). The filter approach is less
computationally intensive than the wrapper approach and is independent of the
classifier. On the other hand, the wrapper method entails
training and testing the feature subset with a classifier using various evaluation
methods, which may include repeated cross-validation; this results in longer
execution time and dependence on the predictive model. In general, wrapper
approaches are greedy methods that add the best feature (or delete the worst
feature) at each iteration, and they are considered to be more reliable because they
describe the problem for the classifier used for detection, thus assuring a good
accuracy level. Filter-based methods currently in application include
Correlation Feature Selection (Nguyen et al., 2010) and Information Gain (Enache and
Patriciu, 2014). Filter-based methods offer lower prediction performance,
while wrapper methods, if applied appropriately, assure the improvement of the
classifier's performance measures. In recent times, Swarm Intelligence (SI)
algorithms have prevailed, as these simple yet intelligent algorithms can solve
complex problems with a population of simple agents interacting locally with one
another and with their environment. Although they have no centralized mechanism
for directing the behaviour of each agent, the agents obey basic rules, leading to
interactions that produce "intelligent" global behaviour unknown to
the individual agents. SI algorithms normally deal with n-dimensional space
problems in which the best solution lies in a subset of points in the space. These
algorithms have also been applied to numerous NP-hard problems (El-Hefnawy,
2014), optimization problems (Valdez, 2011), and problems in robotics, data mining,
diagnosis, and military applications. The application of SI to the IDS domain is
still limited, but interesting and captivating. SI algorithms have been applied for
feature selection and classification as an isolated element or in combination with
other well-known predictive models.

Ma et al. (2008) pioneered the first hybrid models that combine SI with
SVM. The authors used the Binary Particle Swarm Optimization (BPSO) for feature
selection and SVM for parameter optimization. The evaluation results of their model
on the KDD99 dataset proved to be accurate. Wang et al. (2009) also used PSO with
SVM; however, the authors implemented BPSO for feature selection and SPSO
(standard PSO) to improve the input parameters for the SVM classifier. The
authors employed the accuracy of the classifier as the objective function for PSO
and reported an improved detection rate (99.84%). Lately, several IDS models
using SI algorithms have been proposed, including Artificial Bee Colony (ABC)
(Wang et al., 2010), Bat Algorithm (Enache and Sgarciu, 2015), and Hybrid Bat
Algorithm (Laamari and Kamel, 2014). The results obtained by these models have
shown that hybrid IDS which combine SI with machine learning algorithms can
obtain better results.

Enache et al. (2015) proposed a novel wrapper approach tagged
BBAL-SVM, an enhancement to the ordinary Binary Bat algorithm that uses Lévy
flights to improve feature selection. The authors conducted tests on the NSL-KDD
dataset using an SVM classifier to show the superiority of BBAL-SVM over BBA
and BPSO. Although BBAL-SVM exhibited a relatively high accuracy, we assert
that the addition of parameter optimization to the feature selection process can
further improve the performance of the IDS model. In this paper, we propose a
hybrid wrapper feature selection approach that combines the Binary Bat algorithm with
Lévy flights, together with the Bat algorithm and Support Vector Machines (BBAL-BA-SVM).
The Binary Bat algorithm with Lévy flights performs the feature selection
while the Bat algorithm performs parameter optimization on the SVM for each
feature subset. BBAL has several strengths that qualify it for intrusion detection,
such as good generalization, even with huge noisy datasets. BA was used for the
optimization of the SVM parameters due to its superior performance over other SI
algorithms (Enache and Sgarciu, 2015). SVM, on the other hand, was used because of
its suitability for classification, stemming from its good generalization performance,
absence of local minima (due to the adoption of quadratic optimization), and fast
execution time.

Bat Algorithm

Bat Algorithm (BA) is a relatively new bio-inspired meta-heuristic optimization
algorithm proposed by Yang (2010). The algorithm is modeled after the echolocation
behavior of bats. Echolocation typically works as a type of sonar. Bats emit a
short and loud pulse of sound that strikes an object and returns to their ears
after a period of time (Griffin et al., 1960). Thus, bats can determine their distance
from that object (Metzner, 1991). Interestingly, this characteristic
enables bats to distinguish between prey and an obstacle, allowing
them to hunt even in complete darkness (Metzner, 1991).

The Bat algorithm has been developed to behave as a group of bats tracking
food or prey using their echolocation capability. Yang (2010) modeled the intelligence
of this algorithm using three rules. First, all bats use echolocation to sense distance
and to distinguish between a target and an obstacle. Secondly, all bats fly randomly,
and their trajectory is characterized by their internally encoded frequency ($freq$),
velocity ($v$) and position in space ($x$). Lastly, loudness may vary in many ways;
however, it is assumed that the loudness varies from a large (positive) $A_0$ to a
minimum constant value $A_{min}$.
Within the context of the algorithm's operations, each individual $i$ in the group of
bats has a current position $x_i = (x_{i,1}, \dots, x_{i,d})$ and a current flying
velocity $v_i = (v_{i,1}, \dots, v_{i,d})$, where $d$ is the
problem dimension. To determine the optimal position, each bat updates its position
and velocity according to (1), (2) and (3).

$$freq_i = freq_{min} + (freq_{max} - freq_{min}) \cdot \beta \tag{1}$$

$$v_i^{t} = v_i^{t-1} + (x_i^{t-1} - x_{best}) \cdot freq_i \tag{2}$$

$$x_i^{t} = x_i^{t-1} + v_i^{t} \tag{3}$$

where $\beta \in [0, 1]$ is a random vector drawn from a uniform distribution and
$x_{best}$ denotes the best position found so far by the swarm. Moreover, as
the bat attains a position closer to its target, it will decrease its loudness $A_i$ and
increase its rate of pulse emission $r_i$ using (4) and (5):

$$A_i^{t+1} = \alpha \cdot A_i^{t} \tag{4}$$

$$r_i^{t+1} = r_i^{0} \cdot \left[1 - e^{-\gamma \cdot t}\right] \tag{5}$$

where $\alpha$ ($0 < \alpha < 1$) and $\gamma$ ($\gamma > 0$) are constants. Finally, the
author assumes that the loudness will vary from a large value to a minimum one.
Furthermore, in order to improve its position in space, the bat will perform a local
search by using uniform random walks as defined in (6).

$$x_{new} = x_{old} + \delta \cdot \bar{A}^{t} \tag{6}$$

where $\delta \in [-1, 1]$ is a random number and $\bar{A}^{t}$ is the average loudness
of all bats at iteration $t$.
Similar to any optimization algorithm, a fitness function is defined to evaluate each
bat solution. In addition, all the individuals in the swarm will each fly randomly
performing local searches using random walks in order to diversify their position
(Enache and Sgarciu, 2015). As the algorithm obtains a candidate solution, the bat
will try to exploit it by adjusting its pulse rate and loudness. In this study, we adopt
the BA for the optimization of SVM parameters.
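
The update rules (1) to (6) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' Java implementation: the function name `bat_algorithm`, the clipping of positions to a bounding box, the use of the current best solution as $x_{old}$ in (6), and the acceptance test (improvement and a random draw below the loudness, as in Yang's description of the algorithm) are choices made here for the example. The sketch maximises the supplied fitness function.

```python
import math
import random


def bat_algorithm(fitness, bounds, n_bats=10, n_gen=50,
                  freq_min=0.0, freq_max=1.0, a0=1.0, r0=0.5,
                  alpha=0.9, gamma=0.1, seed=0):
    """Maximise `fitness` over the box `bounds` with a basic Bat Algorithm."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep a position inside the search box (an assumption of this sketch).
        return [min(max(xj, lo), hi) for xj, (lo, hi) in zip(x, bounds)]

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    loud = [a0] * n_bats      # loudness A_i
    rate = [0.0] * n_bats     # pulse rate r_i, grows towards r0 through (5)
    fit = [fitness(p) for p in pos]
    best_i = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_i][:], fit[best_i]

    for t in range(1, n_gen + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            freq = freq_min + (freq_max - freq_min) * rng.random()   # (1)
            vel[i] = [v + (x - b) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]      # (2)
            cand = [x + v for x, v in zip(pos[i], vel[i])]           # (3)
            if rng.random() > rate[i]:
                # Local random walk (6); x_old is taken to be the best solution.
                cand = [b + rng.uniform(-1.0, 1.0) * mean_loud for b in best]
            cand = clip(cand)
            cand_fit = fitness(cand)
            if cand_fit > fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha                                     # (4)
                rate[i] = r0 * (1.0 - math.exp(-gamma * t))          # (5)
            if cand_fit > best_fit:
                # Track the best candidate seen, whether or not it was accepted.
                best, best_fit = cand[:], cand_fit
    return best, best_fit
```

For SVM tuning, `bounds` would hold the two ranges for C and σ, and `fitness` would train and score the classifier; both are left abstract here.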

Binary Bat Algorithm

The Binary Bat Algorithm (BBA) is the binary version of the BA that was proposed
in Nakamura et al. (2012). Here, the search space is modeled as an n-dimensional
search grid, where n corresponds to the number of features. In this case, the optimal
solution is chosen among the $2^n$ possibilities, and it corresponds to one corner of a
hypercube (Nakamura et al., 2012). The intuition behind the algorithm is to associate
each bat with a set of binary coordinates indicative of the absence or presence of a
feature in the eventual optimal feature subset. Hence, each bat is formatted as a
multi-dimensional array of 0's and 1's. In order to represent each bat in a
binary format, the authors adopt a sigmoid function defined in (7).

$$S(v_{i,j}) = \frac{1}{1 + e^{-v_{i,j}}} \tag{7}$$

The new coordinates of each bat are expressed using (8):

$$x_{i,j} = \begin{cases} 1 & \text{if } S(v_{i,j}) > \delta \\ 0 & \text{otherwise} \end{cases} \tag{8}$$

where $\delta$ is a random number between 0 and 1.

The fitness function is usually defined in terms of the accuracy of a classifier with
respect to the subset of features selected, as represented by the binary bat.
To demonstrate the effectiveness of BBA, tests were conducted in Nakamura et al. (2012)
using a series of datasets. The results of the experimentation showed that BBA
outperformed other binary swarm optimization algorithms, such as Binary Particle
Swarm Optimization (Kennedy and Eberhart, 1997), Binary Gravitational Search
Algorithm (Rashedi et al., 2010) and Binary Harmony Search Algorithm (Ramos,
2011). This forms the basis for the adoption of BBA in this study.
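
As an illustration of (7) and (8), the following Python sketch maps a real-valued velocity vector to a 0/1 feature mask. The helper names `sigmoid`, `binarize` and `selected_indices` are hypothetical and introduced only for this example; the sigmoid is written in a numerically stable form, which is an implementation choice rather than something stated in the source.

```python
import math
import random


def sigmoid(v):
    """Transfer function S(v) of equation (7), in an overflow-safe form."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def binarize(velocity, rng):
    """Equation (8): coordinate j becomes 1 when S(v_j) exceeds a random delta."""
    return [1 if sigmoid(v) > rng.random() else 0 for v in velocity]


def selected_indices(mask):
    """Return the 1-based indices of the features that a binary bat keeps."""
    return [j + 1 for j, bit in enumerate(mask) if bit == 1]
```

A strongly positive velocity component almost always keeps its feature, a strongly negative one almost always drops it, and values near zero behave like a coin flip.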

Binary Bat Algorithm with Lévy Flights

Without doubt, BBA offers significant improvement in performance over
other swarm-based optimization algorithms, such as Binary Particle Swarm
Optimization (Ma et al., 2008) and Binary Firefly Algorithm (Nakamura et al.,
2012). However, it is believed that BBA can attain further improvements in
feature selection if its exploration mechanism is improved. Consequently,
this resulted in the proposal of the Binary Bat Algorithm with Lévy flights (BBAL)
in Enache and Sgarciu (2014). The authors replaced the random walk component of
BBA with Lévy flights. Thus, (6) becomes (9):

$$x_{new} = x_{old} + t^{-\eta} \cdot \bar{A}^{t} \tag{9}$$

where $t^{-\eta}$ is the Lévy flights distribution and $1 < \eta \le 3$ is a constant.

Lévy flights are Markovian processes that recent studies have shown to
describe the foraging patterns of animals such as spider
monkeys, albatrosses and other predatory animals (Yang, 2010). More specifically, they
are random walks whose step lengths are drawn from a Lévy distribution. It is also
noteworthy that their trajectory describes a local search that suddenly takes a
90-degree turn. The motivation of the authors for replacing the distribution is to
improve randomization such that the algorithm will not get caught in local
minima (Enache and Sgarciu, 2015). We adopt BBAL over BBA due to the
reduction in the number of selected features and the improved performance, as shown
in Enache and Sgarciu (2014) and Enache et al. (2015).
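
One possible reading of (9) is that the step length is drawn from a heavy-tailed power law whose density is proportional to $t^{-\eta}$. The Python sketch below samples such steps by inverse-transform sampling. The lower cut-off `t_min`, the random choice of direction, and the function names are assumptions made for this illustration; the source does not specify how the Lévy steps were generated.

```python
import random


def levy_step(eta, rng, t_min=1.0):
    """Draw a step length t >= t_min with density proportional to t**(-eta)."""
    if not 1.0 < eta <= 3.0:
        raise ValueError("eta must satisfy 1 < eta <= 3")
    u = rng.random()  # uniform in [0, 1)
    # Inverse of the CDF F(t) = 1 - (t / t_min) ** (-(eta - 1)).
    return t_min * (1.0 - u) ** (-1.0 / (eta - 1.0))


def levy_move(x_old, mean_loudness, eta, rng):
    """Perturb each coordinate with a heavy-tailed step scaled by the loudness."""
    moved = []
    for xj in x_old:
        sign = rng.choice((-1.0, 1.0))  # random direction (assumption)
        moved.append(xj + sign * levy_step(eta, rng) * mean_loudness)
    return moved
```

Most sampled steps stay close to `t_min`, but occasional very long jumps occur, which is the property that helps the search escape local optima.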

Support Vector Machines

Support Vector Machines (SVM) is a non-probabilistic binary classifier that is based
on the concept of structural risk minimization from statistical learning theory
(Martinez-Bea et al., 2014). Its objective is to construct and search for the optimal
hyperplane, or margin, that achieves a good separation in a very high-dimensional
space. More formally, the training instances are given as
$(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$, where $x_i \in R^n$ is the n-dimensional
feature array, $y_i \in \{-1, +1\}$ is the class label,
and N is the number of instances in the dataset. A new instance x can be classified
using the function described in Valdez et al. (2011):

$$g(x) = \mathrm{sign}(w \cdot x + b) \tag{10}$$

where w is the weight vector and b is the bias. These two coefficients define the
hyperplane that separates the two classes. In cases where the classification problem
is not linearly separable due to the distribution of the data points in the dataset, the
algorithm can map the dataset into a higher-dimensional feature space and try to
construct the hyperplane that linearly separates the mapped vectors. Hence the
adoption of a kernel function $K$ that can replace $x_i$ with $K(x_i)$.
The three main types of kernel functions utilized in SVM classification are the
radial basis function (RBF) kernel, the sigmoid kernel, and the polynomial kernel. In this
study, we adopt the SVM with RBF, because it has fewer controllable parameters
and it is also a universal kernel function (Enache and Patriciu, 2014). The kernel
function is defined using (11):

$$K(x_i, x) = \exp\left(-\frac{1}{2\sigma^2} \lVert x - x_i \rVert^2\right) \tag{11}$$

In the RBF kernel function, this study aims to optimize the parameters C and σ in
order to improve the performance of the SVM.

• C is the regularization parameter. A lower value of C provides a larger
margin of the hyperplane and permits softer constraints. Thus, increasing the
value of C yields a more accurate model but with a smaller margin.

• σ is the kernel parameter. This tunable parameter controls the correlation
among support vectors. Too large a value of σ leads to a very tight correlation
among the support vectors, which ultimately makes it difficult to achieve
sufficient accuracy.

Similar to Enache et al. (2015), we adopt the BA for SVM parameter optimization.
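
To make (10) and (11) concrete, the following Python sketch evaluates the linear decision function and the RBF kernel for plain lists of numbers. The function names are hypothetical, and treating a score of exactly zero as the positive class is a convention chosen for this example, not something the source specifies.

```python
import math


def linear_decision(w, b, x):
    """Equation (10): the sign of w . x + b, returned as +1 or -1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1  # a zero score is mapped to +1 by convention


def rbf_kernel(x_i, x, sigma):
    """Equation (11): exp(-||x - x_i||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

The kernel value is 1 for identical points and decays towards 0 as the points move apart; a small σ makes that decay sharper, which is why σ has to be tuned together with C.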

The proposed model

The proposed model (BBAL-BA-SVM) is an anomaly-detection-based IDS and has
three main stages, namely: the dataset preprocessing phase, the feature selection with
parameter optimization phase, and the detection phase. To our knowledge, no IDS
model has been developed using a combination of BBAL, BA, and SVM.

Dataset Preprocessing Phase

The dataset adopted for implementation is the NSL-KDD dataset (Tavallaee et al.,
2009), which has been acclaimed as being more effective for intrusion detection in
comparison to the KDD99 dataset (Enache and Sgarciu, 2015). Each record in the
dataset consists of 41 attributes and is labeled as either normal or anomaly. The
attributes can be categorized as time based (19 features), connection based (9
features) and content based (13 features). The attacks modeled by the dataset can be
classified into four groups: Probing, Denial of Service (DoS), User-to-Root (U2R),
and Remote-to-Local (R2L).

For the implementation, the training data (TR) consists of 9,500 randomly selected
records from the NSL-KDD training file, while the testing data (TE) consists of 4,500
randomly selected records from the NSL-KDD test file. Furthermore, symbolic
attributes (protocol_type, service, flag, and class) are mapped to numeric values. The
training dataset TR is split into two groups, TR1 and TR2, of 65% and 35%
respectively. Table 1 shows the summary of the dataset.

Dataset    Number of Instances
TR1        6,175 (65% of TR)
TR2        3,325 (35% of TR)
TE         4,500

Table 1: Summary of the dataset
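
The 65%/35% partition of TR can be reproduced with a short Python sketch. The source does not state how the split was drawn, so the seeded random shuffle below, along with the function name `split_train`, is an assumption made for illustration.

```python
import random


def split_train(records, first_fraction=0.65, seed=0):
    """Shuffle the records and split them into (TR1, TR2)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 9,500 training records this yields 6,175 records for TR1 and 3,325 for TR2, matching Table 1.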

Feature Selection with Parameter Optimization Phase

In this phase, we conduct a two-phase optimization, using BBAL and BA to determine
the optimal feature subset of the dataset and the best parameters for the SVM
classifier. Each binary bat is formatted as a d-dimensional vector, where d is the
number of features in the dataset; a coordinate of value zero indicates the
absence of a feature, while a coordinate of value one indicates its presence.
In addition, each bat of the BA is formatted as a 2-dimensional vector holding the
parameters C and σ of the SVM algorithm.

For each binary bat bb (feature subset) of the swarm of binary bats, the fitness is
obtained from the fitness value returned by the BA. This is the notion of two-phase
optimization. The BA accepts an instance of a bb as input. Next, the dataset
(TR1 and TR2) is reduced based on the coordinates of the received bb. Thereafter, the
BA determines the best values of C and σ for the SVM using the reduced dataset over
a number of generations. The SVM is trained using the reduced TR1 and the values of C
and σ, while its performance on the reduced TR2 is observed. The fitness function for
the BA follows Wang et al. (2009), as in Enache et al. (2015), and is given in (12).

$$fitness = 90\% \cdot Accuracy + 10\% \cdot \left(\frac{1}{nbFeat}\right) \tag{12}$$

where:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$

• nbFeat = number of selected features from bb
• TP = number of attacks properly classified
• TN = number of normal records properly classified
• FP = number of normal records erroneously classified
• FN = number of attacks erroneously classified

The BA, for each instance of bb, returns the optimal values for C and σ, and the best
fitness. Thus, the best fitness returned by the BA determines the choice of optimal
feature subset by the BBAL algorithm. In essence, the BBAL conducts the feature
selection, while the BA performs parameter optimization for each selected bb.
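
The interaction between the two phases can be sketched in Python as follows. The fitness in (12) and (13) is computed from confusion-matrix counts, and the outer-phase evaluation of a binary bat reduces TR1 and TR2 before delegating to an inner optimiser. The callable `tune_svm` is a hypothetical placeholder for the BA-driven search over C and σ; its signature, and the decision to reject an empty feature subset, are assumptions of this sketch.

```python
def reduce_record(record, mask):
    """Keep only the attributes whose coordinate in the binary bat is 1."""
    return [value for value, bit in zip(record, mask) if bit == 1]


def accuracy(tp, tn, fp, fn):
    """Equation (13)."""
    return (tp + tn) / (tp + tn + fp + fn)


def fitness(tp, tn, fp, fn, nb_feat):
    """Equation (12): 90% accuracy plus 10% of the inverse feature count."""
    if nb_feat == 0:
        raise ValueError("a binary bat must select at least one feature")
    return 0.9 * accuracy(tp, tn, fp, fn) + 0.1 * (1.0 / nb_feat)


def binary_bat_fitness(mask, tr1, tr2, tune_svm):
    """Outer-phase evaluation of one binary bat.

    `tune_svm(reduced_tr1, reduced_tr2, nb_feat)` is a hypothetical callable
    standing in for the inner BA; it must return (C, sigma, best_fitness).
    """
    reduced_tr1 = [reduce_record(r, mask) for r in tr1]
    reduced_tr2 = [reduce_record(r, mask) for r in tr2]
    return tune_svm(reduced_tr1, reduced_tr2, sum(mask))
```

Because every binary bat triggers a full inner search, the cost of the outer loop is multiplied by the cost of the inner one, which is consistent with the longer training time the two-phase models require.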

Detection Phase

In this phase, the optimal feature subset together with its best C and σ are adopted by
the SVM to perform detection of intrusions in the dataset. Datasets TR1 and TE are
first reduced, based on the coordinate values of the feature subset returned from the
preceding phase. Next, the SVM is trained using the reduced TR1 and parameters C
and σ. Finally, the performance of the SVM is evaluated using the reduced TE dataset.

Evaluation results
The proposed model was implemented on a personal computer with a 2.10GHz
Intel Pentium(R) 4 CPU and 4GB of memory under Windows 10. In order to
show the superiority of the proposed model, we compare our model with Ordinary
SVM without parameter optimization (OSVM), Bat Algorithm with SVM (BA-SVM),
Binary Bat Algorithm with SVM (BBA-SVM), Binary Bat Algorithm using
Lévy flights with SVM (BBAL-SVM) (Enache et al., 2015), and Binary Bat
Algorithm with Bat Algorithm and SVM (BBA-BA-SVM). Note that dataset TR2
was not adopted for OSVM, since it does not involve optimization. Thus, OSVM
was trained with TR1 and tested with the TE dataset. The fitness function defined in (12)
was adopted by all the optimization algorithms in this study, and these algorithms
were implemented using the Java programming language. WEKA version 3.7.12 [25]
was adopted for the SVM classifier.

For the Bat algorithm, the following parameters were used:

• Number of bats = 10
• Maximum loudness $A_0$ = 10
• Minimum pulse rate $r_0$ = 0.9
• Frequency ranges between 0.2 and 1.0
• γ = 0.1 and α = 0.9
• Number of dimensions = 2
• Number of generations = 10

For the BBA and BBAL algorithms, the following parameters were used:

• Number of bats = 10
• Maximum loudness $A_0$ = 0.5
• Minimum pulse rate $r_0$ = 0.5
• Frequency ranges between 0.8 and 1.0
• γ = 0.1 and α = 0.9
• Number of dimensions = 41
• Number of generations = 200

For parameter C the range is set between 1 and 3500, while the range between 0.001
and 50 is set for σ.

Performance Metrics

Similar to Enache et al. (2015), three performance metrics are utilized for model
evaluation.

• Attack Detection Rate (ADR): measures the model's ability to detect attacks.

$$ADR = \frac{TP}{TP + FP} \tag{14}$$

• False Alarm Rate (FAR): measures the false alarms generated by the model.

$$FAR = \frac{FP}{FP + TN} \tag{15}$$

• Accuracy: shows whether the proposed model is capable of correctly raising
alarms when it detects intrusions and not generating false alarms when the
network traffic is normal.

$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \tag{16}$$
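
The three metrics can be computed directly from the confusion-matrix counts, as in the Python sketch below. The function names are hypothetical, and `attack_detection_rate` follows (14) exactly as printed in this paper; note that many other works define detection rate with FN rather than FP in the denominator.

```python
def attack_detection_rate(tp, fp):
    """Equation (14) as printed: TP / (TP + FP)."""
    return tp / (tp + fp)


def false_alarm_rate(fp, tn):
    """Equation (15): FP / (FP + TN)."""
    return fp / (fp + tn)


def accuracy(tp, tn, fp, fn):
    """Equation (16): (TP + TN) / (TP + FP + TN + FN)."""
    return (tp + tn) / (tp + fp + tn + fn)
```

For example, with TP = 90, FP = 10, TN = 95 and FN = 5, the sketch gives an ADR of 0.9, a FAR of 10/105 and an accuracy of 0.925.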

Results and Analysis

The experiments conducted on the NSL-KDD dataset also showed the effectiveness
of BBAL over BBA with respect to the number of features selected. BBA selected
18 features, while BBAL selected 13 features with an additional improvement in
performance over BBA. Table 2 shows the features selected by BBAL and BBA.
Each feature $f_i$ corresponds to index $i$ in the NSL-KDD dataset. For example, $f_2$
corresponds to the attribute protocol_type in the dataset.

The results from Table 3 show that OSVM performed poorly on the TE dataset.
However, the optimization of the SVM algorithm using BA provided significant
improvements of about 8.7, 3.5, and 6.5 percentage points in ADR, FAR, and Accuracy
respectively. The results also indicate that feature subset selection is a veritable
tool for improving the detection of attacks while also reducing the false alarm rate.
This is evident from the results of BBA-SVM and BBAL-SVM, as they both offered
significant improvement over OSVM and BA-SVM. Furthermore, the proposed model
(BBAL-BA-SVM) provides about 2.5 and 1.5 percentage points of improvement over
BBAL-SVM in ADR and Accuracy respectively. This shows that further optimization of
the parameters of the classification algorithm for each candidate feature subset is an
effective mechanism for improving attack detection. It is also apparent that
BBAL-BA-SVM offers an improvement in performance and a reduction in feature subset
size over BBA-BA-SVM. However, it is noteworthy that both BBAL-BA-SVM and
BBA-BA-SVM require more training time than the other models, due to the extra layer
of parameter optimization.

Algorithm   #Features   Features selected
BBA         18          f2, f3, f6, f8, f9, f14, f16, f19, f24, f25, f32, f33, f35, f36, f38, f39, f40, f4
BBAL        13          f2, f3, f9, f13, f14, f16, f20, f25, f32, f33, f34, f40, f41

Table 2: Features selected by each feature selection algorithm using the NSL-KDD dataset

Algorithm      Number of Features   ADR (%)   FAR (%)   Accuracy (%)
OSVM           41                   59.06     5.59      74.22
BA-SVM         41                   67.78     2.07      80.71
BBA-SVM        18                   95.11     1.34      97.49
BBA-BA-SVM     18                   97.28     0.82      99.24
BBAL-SVM       13                   96.43     1.11      98.28
BBAL-BA-SVM    13                   98.93     0.45      99.76

Table 3: Experimental results
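
The improvements quoted in the analysis can be checked against Table 3 with a few lines of Python. The dictionary below simply transcribes the ADR, FAR and Accuracy columns of the table; the helper name `improvement` is hypothetical, and the proposed model is labeled BBAL-BA-SVM as in the text.

```python
# Values transcribed from Table 3: (ADR %, FAR %, Accuracy %).
TABLE_3 = {
    "OSVM": (59.06, 5.59, 74.22),
    "BA-SVM": (67.78, 2.07, 80.71),
    "BBA-SVM": (95.11, 1.34, 97.49),
    "BBA-BA-SVM": (97.28, 0.82, 99.24),
    "BBAL-SVM": (96.43, 1.11, 98.28),
    "BBAL-BA-SVM": (98.93, 0.45, 99.76),
}


def improvement(baseline, model):
    """Return (ADR gain, FAR reduction, Accuracy gain) in percentage points."""
    b_adr, b_far, b_acc = TABLE_3[baseline]
    m_adr, m_far, m_acc = TABLE_3[model]
    return (round(m_adr - b_adr, 2),
            round(b_far - m_far, 2),
            round(m_acc - b_acc, 2))
```

Calling `improvement("OSVM", "BA-SVM")` gives the gain from parameter optimization alone, and `improvement("BBAL-SVM", "BBAL-BA-SVM")` gives the gain from adding the second optimization phase.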

Conclusions and Future Work

In this paper, we proposed a hybrid model that adopts the Binary Bat algorithm with
Lévy flights, the Bat algorithm and SVM (BBAL-BA-SVM) to improve the wrapper
feature selection method for intrusion detection that was presented in
Enache et al. (2015). In our approach, for each binary bat in the BBAL, the BA
creates a reduced dataset based on the coordinate values of the binary bat and
determines its optimal performance by also finding the best values for parameters C
and σ in the SVM.

Hence, the BBAL returns the optimal feature subset with the additional values for
parameter C and σ. These values are then used for attack detection in the dataset.

This study adopted the NSL-KDD intrusion dataset to evaluate the performance of
BBAL-BA-SVM. The results showed that parameter optimization for the SVM for each
candidate feature subset can improve the detection of attacks, while also reducing the
false alarm rate of the IDS. Furthermore, comparative analysis showed that
BBAL-BA-SVM performed better than BBA-BA-SVM and BBAL-SVM (Enache et
al., 2015). BBAL-BA-SVM provided the best ADR, FAR, and Accuracy of 98.93%,
0.45%, and 99.76% respectively.

Future work will focus on experimentation with other optimization algorithms for
feature selection and on validating our approach on more robust datasets.

References

Blum, L. and Langley, P. (1997). "Selection of relevant features and examples in
machine learning," Artificial Intelligence, Vol. 97(1-2), pp. 245-271, December 1997.

Dua, S. and Du, X. (2011). "Classical machine-learning paradigms for data mining," In
Data Mining and Machine Learning in Cybersecurity, pages 23–56. Auerbach
Publications, Taylor and Francis Group.

Eid, H. F. and Hassanien, A. (2012). "Improved Real-Time Discretize Network
Intrusion Detection Model," in the 7th International Conference on Bio-Inspired
Computing: Theories and Applications (BIC-TA 2012), Vol. 201, 2013, pp. 99-109.

El-Hefnawy, N. A. (2014). "Solving bi-level problems using modified particle swarm
optimization algorithm," In International Journal of Artificial Intelligence, Volume
12, pages 88–101.

Enache, A.-C. and Sgarciu, V. (2014). "Enhanced intrusion detection system based
on bat algorithm-support vector machine," In SECRYPT 2014 - Proceedings of the
11th International Conference on Security and Cryptography, Vienna, Austria, 28-30
August, 2014, pages 184–189.

Enache, A.-C. and Patriciu, V. V. (2014). "Intrusions detection based on support
vector machine optimized with swarm intelligence," In 9th IEEE International
Symposium on Applied Computational Intelligence and Informatics, SACI 2014,
Timisoara, Romania, May 15-17, 2014, pages 153–158.

Enache, A.-C. and Sgarciu, V. (2015). "Anomaly Intrusions Detection Based on
Support Vector Machines with an Improved Bat Algorithm," In Control Systems and
Computer Science (CSCS), 2015. 20th International Conference on, pp. 317-321.
IEEE.

Enache, A.-C., Sgarciu, V., and Alina, P.-N. (2015). "Intelligent feature selection
method rooted in Binary Bat Algorithm for intrusion detection," In Applied
Computational Intelligence and Informatics, 2015. 10th Jubilee International
Symposium, pp. 517-521. IEEE.

Griffin, D. R., Webster, F. A., and Michael, C. R. (1960). "The echolocation of
flying insects by bats," Animal Behaviour, Vol. 8, No. 34, pp. 141–154.

International Conference on Availability, Reliability and Security (2010). Pages 17–24.

Kennedy J. and Eberhart, R. C. (1997). “A discrete binary version of the particle
swarm algorithm,” in IEEE International Conference on Systems, Man, and
Cybernetics, Vol. 5, pp. 4104–4108.

Kukielka, P. and Kotulski, Z. (2014). New unknown attack detection with the neural
network-based ids. In The State of the Art in Intrusion Prevention and Detection,
pages 259–284. Auerbach Publications.

Laamari, M. A. and Kamel, N. (2014). “A hybrid bat based feature selection
approach for intrusion detection,” In Bio-Inspired Computing – Theories and
Applications, Volume 472 of Communications in Computer and Information
Science, pages 230–238. Springer Berlin Heidelberg.

Ma, J., Liu, X., and Liu, S. (2008). “A New Intrusion Detection Method Based on
BPSO-SVM”, in Proc. of the International Symposium on Computational
Intelligence and Design (ISCID2008), Vol. 1, pp. 473–477.

Manekar, V. and Waghmare, K. (2014). “Intrusion Detection System using Support
Vector Machine (SVM) and Particle Swarm Optimization (PSO)”, International
Journal of Advanced Computer Research, vol. 4, no. 3, pp. 808–812.

Martinez-Bea, S., Castillo-Perez, S., and Garcia-Alfaro, J. (2014). “Real-time
malicious fast-flux detection using DNS and bot related features,” 11th Annual
International Conference on Privacy, Security and Trust (PST), Tarragona, Catalonia,
pp. 369–372.

Metzner, W. (1991). “Echolocation behaviour in bats,” Science Progress Edinburgh,
Vol. 75, No. 298, pp. 453–465.

Nakamura, R., Pereira, L., Costa, K., Rodrigues, D., Papa, J., and Yang, X.-S. (2012).
BBA: a binary bat algorithm for feature selection. In Proceedings of the 25th
Conference on Graphics, Patterns and Images (SIBGRAPI ’12), pages 291–297.

Nguyen, H., Franke, K., and Petrovic, S. (2010). Improving effectiveness of
intrusion detection by correlation feature selection. In ARES ’10.

Ramos, C., Souza, A., Chiachia, G., Falcao, A., and Papa, J. (2011). “A novel
algorithm for feature selection using harmony search and its application for non-
technical losses detection,” Computers & Electrical Engineering, Vol. 37, No. 6, pp.
886–894.

Rashedi, E., Nezamabadi-pour, H., and Saryazdi, S. (2010). “BGSA: binary
gravitational search algorithm,” Natural Computing, Vol. 9, pp. 727–745.

Sammut, C. and Webb, G. I. (2010). Feature selection. In Encyclopedia of Machine
Learning, pp. 429–433, Springer, New York.

Schnitzler, H.-U. and Kalko, E. K. V. (2001). “Echolocation by insect-eating bats,”
BioScience, Vol. 51, No. 7, pp. 557–569, July 2001.

Tavallaee, M., Bagheri, E., Lu, W., and Ghorbani, A. A. (2009). A detailed analysis
of the KDD CUP 99 data set. In Proceedings of the IEEE Symposium on
Computational Intelligence in Security and Defense Applications, pages 1–6. IEEE.

Valdez, F., Melin, P., and Castillo, O. (2011). An improved evolutionary method
with fuzzy logic for combining particle swarm optimization and genetic algorithms.
In Applied Soft Computing, Volume 11, pages 2625–2632.

Wang, J., Hong, X., Li, T., and Ren, R. (2009). “A real-time intrusion detection
system based on pso-svm,” In Proceedings of the International Workshop on
Information Security and Application, pages 319–321. Academy Publisher.

Wang, J., Li, T., and Ren, R. (2010). A real time IDSs based on artificial bee colony-
support vector machine algorithm. In Proceedings of the International Workshop on
Advanced Computational Intelligence, pp. 91–96, IEEE.

Witten, I. H. and Frank, E. (2005). “Data Mining: Practical Machine Learning Tools
and Techniques”, Morgan Kaufmann, 2nd Edition.

Yang, X.-S. (2010). Random walks and Lévy flights. In Nature-Inspired
Metaheuristic Algorithms, Second Edition, pages 11–20. Luniver Press.

Yang, X.-S. (2010). “A new metaheuristic bat-inspired algorithm,” In Nature
Inspired Cooperative Strategies for Optimization (NICSO 2010), Volume 284 of
Studies in Computational Intelligence, pages 65–74, Springer Berlin Heidelberg.
