Proceedings of the Second International Conference on Machine Learning and Cybernetics, Xi'an, 2-5 November 2003

IMPROVING ONE-CLASS SVM FOR ANOMALY DETECTION


KUN-LUN LI¹,², HOU-KUAN HUANG¹, SHENG-FENG TIAN¹, WEI XU¹

¹School of Computer & Information Technology, Northern Jiaotong University, Beijing, China, 100044
²Faculty of Mathematics and Computer Science, Hebei University, Baoding, China, 071002
E-MAIL: likunlm-njtu@163.com

Abstract:

With the tremendous growth of the Internet, information system security has become an issue of serious global concern due to rapid connection and accessibility. Developing effective methods for intrusion detection is therefore an urgent task for assuring computer and information system security. Since most attacks and misuses can be recognized through the examination of system audit log files and pattern analysis therein, an approach to intrusion detection can be built on them. We first analyze attack and misuse patterns in log files in depth, and then propose an approach using support vector machines for anomaly detection. It is a one-class SVM based approach, trained with abstracted user audit log data from 1999 DARPA.

Keywords:

Information security; intrusion detection; anomaly detection and SVMs

1. Introduction

Intrusion detection is one of the most important techniques of dynamic information security technology. James P. Anderson proposed it in the early 1980s in "Computer Security Threat Monitoring and Surveillance" [1]. IDS (short for Intrusion Detection System) has been extensively used in many fields and has played important roles in protecting various kinds of information systems. Intrusion prevention techniques, such as user authentication (e.g. using passwords or biometrics), avoiding programming errors, and information protection, have been used to protect computer systems. But intrusion prevention alone is not sufficient because systems become ever more complex, and there are always exploitable weaknesses in the systems due to design and programming errors. Intrusion detection is therefore needed as another wall to protect computer systems.

The most popular way to detect intrusions has been by using the audit data generated by the operating system. An audit trail is a record of activities on a system that are logged to a file in chronologically sorted order. Since almost all activities are logged on a system, it is possible that a manual inspection of these logs would allow intrusions to be detected. However, the incredibly large sizes of audit data generated (on the order of 100 Megabytes a day) make manual analysis impossible. IDSs automate the drudgery of wading through the audit data jungle. Audit trails are particularly useful because they can be used to establish guilt of attackers, and they are often the only way to detect unauthorized but subversive user activity. In this paper we use the data derived from DARPA rather than audit data.

The standard support vector machine (SVM) is a classifier that finds a maximal margin separating two classes of data, and it has been applied successfully in many areas. But intrusion detection data are special: the normal dataset is much larger than the abnormal one, so the standard SVM does not work well on our task. We present a new SVM method, which is based on the one-class SVM described in [2] by Bernhard Scholkopf et al.

2. Intrusion and Intrusion Detection

An intrusion can be defined as "any set of actions that attempts to compromise the integrity, confidentiality or availability of a resource". In the context of information systems, intrusion refers to any unauthorized attempt to access or damage, or malicious use of, information resources. The identification of unauthorized use, misuse and attacks on an information system is defined as intrusion detection. There are mainly two types of intrusion detection techniques: anomaly detection and misuse detection. Most current IDSs adopt one or both of them.

Anomaly detection techniques assume that all intrusive activities are necessarily anomalous. This means that if we could establish a "normal activity profile" for a system, we could, in theory, flag all system states varying from the established profile by statistically significant amounts as intrusion attempts. The concept behind misuse detection schemes is that there are ways to represent attacks

0-7803-7865-2/03/$17.00 ©2003 IEEE



in the form of a pattern or a signature so that even variations of the same attack can be detected. This means that these systems are not unlike virus detection systems: they can detect many or all known attack patterns, but they are of little use for as yet unknown attack methods. Anomaly detection is often considered very difficult, as it must be tailored system-to-system, and even user-to-user, because behavior patterns and system usage can vary widely. The advantage of anomaly detection systems is that they can detect unknown intrusions with no a priori knowledge about specific intrusions.
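
As a rough illustration of the two detection styles described in this section, the following Python sketch contrasts a statistical "normal activity profile" with signature matching. It is a minimal sketch and not part of the method proposed in this paper: the feature (logins per hour), the three-standard-deviation threshold, and the signature strings are all hypothetical choices made for the example.

```python
import statistics

# Hypothetical signatures of known attacks (misuse detection).
KNOWN_SIGNATURES = {"GET /cgi-bin/phf", "USER root PASS guest"}


def build_profile(normal_values):
    """Summarize normal activity by its mean and sample standard deviation."""
    mean = statistics.mean(normal_values)
    std = statistics.stdev(normal_values)
    return mean, std


def is_anomalous(value, profile, threshold=3.0):
    """Anomaly detection: flag values far from the normal profile."""
    mean, std = profile
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold


def matches_signature(event):
    """Misuse detection: flag events containing a known attack pattern."""
    return any(sig in event for sig in KNOWN_SIGNATURES)


# Hypothetical feature: logins per hour observed during normal operation.
normal_logins = [4, 5, 6, 5, 4, 6, 5, 5]
profile = build_profile(normal_logins)

anomaly_flags = [is_anomalous(v, profile) for v in (5, 7, 40)]
signature_flags = [
    matches_signature(e) for e in ("GET /cgi-bin/phf?x", "GET /index.html")
]
```

The profile-based check needs no knowledge of specific attacks but depends on how well the normal data were characterized, whereas the signature check only fires on patterns that were listed in advance.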

3. SVMs for Classification


The basic idea of SVM is as follows. For nonlinearly separable data, we can transform them into a high dimensional space by a nonlinear map such that the data points will be linearly separable. In our approach, we define the data corresponding to the attacks to be the negative examples and the normal ones to be the positive. According to SVM theory [4], the separating hyper-plane in a 2-class SVM is decided by the support vectors close to the hyper-plane. In fact, only the points close to the separating hyper-plane are needed. But when the number of negative examples is too small, the generalization performance of the SVM classifier is weak. Furthermore, owing to the fact described above, the error rate proves unsatisfactory.

One-class SVM: Bernhard Scholkopf et al. suggested a method of adapting the SVM methodology to the one-class classification problem. Essentially, after transforming the features via a kernel, they treat the origin as the only member of the second class. By introducing "relaxation parameters", they separate the image of the one class from the origin.

4. Improving One-Class SVM for Anomaly Detection

The basic idea is to work first in the feature space, and assume that not only is the origin in the second class, but also that all data points "close enough" to the origin are to be considered as outliers or anomaly data points. If the input data match the selected samples, then they are regarded as anomaly data, i.e., they belong to the anomaly class. Figure 1 gives the geometry interpretation of the improved one-class SVM.

Figure 1. Geometry interpretation of one-class SVM based classifier

In this case, we have to determine how far from the origin a point can be before being classified as anomaly data. Suppose we are given the training data $\{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$, where $x \in R^N$, $y \in \{-1, +1\}$, and $R^N$ is the feature space. This leads to the following quadratic programming problem:

$$\min\Big(R^2 + C\sum_{i=1}^{l}\xi_i\Big)$$

$$\text{s.t.}\quad y_i\big(\|\Phi(x_i) - a\|^2 - R^2\big) \le \xi_i,\quad \xi_i \ge 0,\quad 1 \le i \le l \tag{1}$$

where $\xi_i$ are slack variables that are penalized in the objective function. The goal of introducing the slack variables is to allow some error during training. Here $a$ and $R$ are the center and radius of the hyper-sphere respectively, and $C$ is the penalty parameter. The Lagrangian form corresponding to formula (1) is as follows:

$$L = R^2 + C\sum_{i=1}^{l}\xi_i - \sum_{i=1}^{l}\alpha_i\big[\xi_i - y_i\big(\|\Phi(x_i) - a\|^2 - R^2\big)\big] - \sum_{i=1}^{l}\beta_i\xi_i \tag{2}$$

where $\alpha_i \ge 0$, $\beta_i \ge 0$ are Lagrange multipliers. Setting the derivatives of $L$ with respect to all of the variables to zero, we get the following formulas:


$$\sum_{i=1}^{l}\alpha_i y_i = 1,\qquad 0 \le \alpha_i \le C,\qquad \sum_{i=1}^{l}\alpha_i y_i \Phi(x_i) = a \tag{3}$$

and the corresponding dual is:

$$\min\Big(\sum_{i,j=1}^{l}\alpha_i\alpha_j y_i y_j\big(\Phi(x_i)\cdot\Phi(x_j)\big) - \sum_{i=1}^{l}\alpha_i y_i\big(\Phi(x_i)\cdot\Phi(x_i)\big)\Big)$$

$$\text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{l}\alpha_i y_i = 1 \tag{4}$$

We use an appropriate kernel function $K(x_i, x_j)$ to represent the inner product $\Phi(x_i)\cdot\Phi(x_j)$. The choice of kernel function depends on experience and experiment. The Gaussian RBF $K(x, y) = e^{-\|x - y\|^2/\sigma^2}$ is applied here.

For any input $x$, we first calculate the distance between the data point and the center of the hyper-sphere. If the following condition is true

$$\|\Phi(x) - a\| \le R \tag{5}$$

the data point $x$ lies within the hyper-sphere and we regard it as belonging to the +1 class; otherwise it belongs to the -1 class. The radius of the hyper-sphere can be computed from the distance between the bounded vectors and the center of the hyper-sphere by the following method:

$$R^2 = \frac{1}{n_{bsv}}\sum_{k}\Big(K(x_k, x_k) - 2\sum_{i}\alpha_i y_i K(x_k, x_i)\Big) + \sum_{i,j}\alpha_i\alpha_j y_i y_j K(x_i, x_j) \tag{6}$$

where $x_k$ are the bounded vectors and $n_{bsv}$ is the number of the bounded vectors. Therefore the decision can be written as:

$$f(x) = \sum_{i}\alpha_i y_i K(x, x_i) + b \tag{7}$$

where $b = -\frac{1}{n_{bsv}}\sum_{k}\sum_{i}\alpha_i y_i K(x_k, x_i)$. For any given vector $x$, one can decide that it belongs to class +1 if $f(x) \ge 0$, and to class -1 otherwise. When no $x$ belongs to class -1, all $y_i = +1$, which is similar to the one-class SVMs.

5. Data Preprocessing and Experiment

The data are acquired from the 1999 DARPA Intrusion Detection Evaluation Program, which was run by MIT Lincoln Lab. They set up an environment to acquire raw TCP/IP dump data for a local-area network (LAN) simulating a typical U.S. Air Force LAN. The objective of the study was to survey and evaluate the state of the art in intrusion detection research. For each TCP/IP connection, 18 quantitative and qualitative features were extracted. Table 1 shows the different kinds of exploits in 1999 DARPA, and presents attacks broken up into categories by type and operating system.

The data are about 4.0 Gbytes of compressed tcpdump of 7 weeks of network traffic, which can be processed into about 5 million connection records of about 100 bytes each. The data contain the content (i.e., the data portion) of every packet transmitted between hosts inside and outside a simulated military base. The first step is to take ,4,292 records from 1999 DARPA. Secondly, by applying an approach of mining association rules (the source code provided by our lab), we extract 4,758 normal records for the experiment as training data, and then pick 2,338 connection records for testing. Table 2 gives the list of features of 1999 DARPA.

We apply the method to a set of intrusion data as described above. In our experiment we use the SVMs to differentiate intrusions from normal activities. On the testing set, which consists of 2,338 examples with 18 features, our method achieved 96% accuracy, while the standard SVM achieved 91% on the same data.

We perform the experiment over the abstracted data using a cluster based method, naive Bayes, K-NN and standard SVM (here referring to SVM-light), and compare them with the one-class based method given in this paper. The result is given in Table 3.

6. Conclusions and Future Work

We have proposed an approach to intrusion detection using SVMs, and tested its performance on a set of selected data. It delivers high accuracy on the testing set and proves to have a comparable level of performance. Our next work includes constructing an Intrusion Detection System that can work on-line and can identify different categories of attacks. We will also utilize a multi-class SVM classifier in future work [10].
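
The decision step of Section 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the decision function has the form f(x) = sum_i alpha_i y_i K(x, x_i) + b with b averaged over the bounded vectors, it assumes the Gaussian kernel is parameterized as exp(-||x - y||^2 / sigma^2), and the coefficients alpha are hand-picked feasible values (they satisfy sum_i alpha_i y_i = 1) rather than the output of a quadratic programming solver. All function names and the toy data are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel, assumed here to be exp(-||x - y||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)


def center_term(x, train, alpha, labels, sigma):
    """sum_i alpha_i * y_i * K(x_i, x): inner product of Phi(x) with the center."""
    return sum(
        a * y * rbf_kernel(xi, x, sigma)
        for xi, a, y in zip(train, alpha, labels)
    )


def offset(boundary, train, alpha, labels, sigma):
    """b = -(1/n) * sum_k sum_i alpha_i y_i K(x_k, x_i) over boundary vectors x_k."""
    total = sum(center_term(xk, train, alpha, labels, sigma) for xk in boundary)
    return -total / len(boundary)


def decide(x, train, alpha, labels, b, sigma):
    """Return +1 (normal) if f(x) >= 0, otherwise -1 (anomaly)."""
    f = center_term(x, train, alpha, labels, sigma) + b
    return 1 if f >= 0 else -1


# Toy data: four normal points on a unit square, all labeled +1.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
labels = [1, 1, 1, 1]
alpha = [0.25, 0.25, 0.25, 0.25]  # assumed coefficients, sum(alpha * y) == 1
sigma = 1.0

# Every training point is treated as a boundary vector in this toy case.
b = offset(train, train, alpha, labels, sigma)

inside = decide((0.5, 0.5), train, alpha, labels, b, sigma)
outside = decide((3.0, 3.0), train, alpha, labels, b, sigma)
```

In this toy case the point at the middle of the square is classified as +1 and the distant point as -1. With real data the coefficients would come from solving the dual problem (4), and only the vectors with 0 < alpha_i < C would be used to compute b.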


Table 1. Attacks Used in 1999 DARPA Evaluation

         Solaris        SunOS          NT             Linux
Probe    portsweep      portsweep      ntinfoscan     lsdomain
         queso          queso          portsweep      mscan
                                                      portsweep
                                                      queso
                                                      satan
DoS      neptune        arppoison      arppoison      apache2
         pod            land           crashiis       arppoison
         processtable   mailbomb       dosnuke        back
         selfping       neptune        smurf          mailbomb
         smurf          pod            tcpreset       neptune
         syslogd        processtable                  pod
         tcpreset                                     processtable
         warezclient                                  smurf
                                                      tcpreset
                                                      teardrop
                                                      udpstorm
R2L      dict           dict           dict           dict
         ftpwrite       xsnoop         framespoof     imap
         guest                         netbus         named
         httptunnel                    netcat         ncftp
         xlock                         ppmacro        phf
         xsnoop                                       sendmail
                                                      sshtrojan
                                                      xlock
                                                      xsnoop
U2R      eject          loadmodule     casesen        perl
         fdformat                      ntfsdos        sqlattack
         ffbconfig                     nukepw         xterm
         ps                            sechole
                                       yaga

Table 2. List of features of 1999 DARPA

Feature           Description                                                   Value type
duration          Length (number of seconds) of the connection                  Continuous
protocol_type     Type of the protocol, e.g. TCP, UDP, etc.                     Discrete
service           Network service on the destination, e.g. http, telnet, etc.  Discrete
flag              Normal or error status of the connection
land              1 - connection is from/to the same host/port; 0 - otherwise
wrong_fragment    Number of "wrong" fragments
urgent
count             Number (
                  % of connections that have "SYN" errors
                  % of connections that have "REJ" errors                       Continuous
                  % of connections to the same service
                  % of connections to different services
srv_count         Number of connections to the same service as the current      Continuous
                  connection in the past 2 seconds
The following features refer to these same-service connections
Srv_serror_%      % of connections that have "SYN" errors                       Continuous
Srv_rerror_%      % of connections that have "REJ" errors                       Continuous
Srv_diff_host_%   % of connections to different hosts                           Continuous
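
To make the time-window features in the table concrete, the following Python sketch computes a count of same-service connections in the past 2 seconds and the percentage of those that have "SYN" errors. It is an illustrative sketch only: the record layout (`time`, `service`, `syn_error`), the function name, and the choice to count only connections preceding the current one are assumptions, not details given in the paper.

```python
def window_features(connections, index, window=2.0):
    """Return (srv_count, SYN-error percentage) for connections[index].

    `connections` is assumed to be a list of dicts sorted by "time".
    Only connections that precede the current one are counted.
    """
    current = connections[index]
    start = current["time"] - window
    same_service = [
        c for c in connections[:index]
        if c["time"] >= start and c["service"] == current["service"]
    ]
    srv_count = len(same_service)
    if srv_count == 0:
        return 0, 0.0
    syn_errors = sum(1 for c in same_service if c["syn_error"])
    return srv_count, 100.0 * syn_errors / srv_count


# Hypothetical connection records, ordered by start time in seconds.
connections = [
    {"time": 0.0, "service": "http", "syn_error": False},
    {"time": 0.5, "service": "http", "syn_error": True},
    {"time": 1.0, "service": "telnet", "syn_error": False},
    {"time": 1.5, "service": "http", "syn_error": False},
    {"time": 2.2, "service": "http", "syn_error": False},
]

features = window_features(connections, 4)
```

For the last record, the first connection falls outside the 2-second window and the telnet connection is a different service, so two http connections remain, one of which has a SYN error.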


Table 3. Results of the experiment

                Cluster based   Naive Bayes   K-NN   Standard SVM   One-class based
Accuracy rate   93%             91%           89%    91%            96%

References

[1] Wenke Lee, Salvatore J. Stolfo & Kui W. Mok. Adaptive Intrusion Detection: A Data Mining Approach. Artificial Intelligence Review 14: 533-567, 2000
[2] Bernhard Scholkopf et al. Estimating the Support of a High-Dimensional Distribution. Technical Report, Department of Computer Science, University of Haifa, Haifa, 2001
[3] Larry M. Manevitz and Malik Yousef. One-class SVMs for Document Classification. JMLR, 2, 2001
[4] V. N. Vapnik. The Nature of Statistical Learning Theory. New York: Springer, 1995
[5] Jake Ryan, Meng-Jang Lin, Risto Miikkulainen. Intrusion Detection with Neural Networks, Advances in Neural Information Processing Systems, 1998
[6] Srinivas Mukkamala, Guadalupe Janoski and Andrew Sung. Intrusion Detection Using Neural Networks and Support Vector Machines, IEEE IJCNN (May, 2002)
[7] Yunqiang Chen, Xiang Zhou, and Thomas S. Huang. One-Class SVM for Learning in Image Retrieval, Proceedings of the IEEE Int'l Conference on Image Processing, 2001
[8] James Cannady. Artificial Neural Networks for Misuse Detection, Proceedings of the 1998 National Information Systems Security Conference (NISSC'98), Arlington, VA, October 5-8, 1998
[9] http://kdd.ics.uci.edu/databases/kddcup99
[10] Kun-Lun Li, Hou-Kuan Huang, Sheng-Feng Tian. A Novel Multi-class SVM Classifier Based on DDAG. IEEE ICMLC'02, Nov. 2002
