Abstract—The widespread use of Wireless Sensor Networks (WSNs) has introduced many security threats due to the nature of such networks, particularly their limited hardware resources and infrastructure-less nature. Denial of Service is one of the most common types of attack facing such networks. Building an Intrusion Detection and Prevention System to mitigate the effect of Denial of Service attacks is not an easy task. This paper proposes the use of two machine learning techniques, namely decision trees and Support Vector Machines, to detect attack signatures on a specialized dataset. The dataset contains regular profiles and several Denial of Service attack scenarios in WSNs. The experimental results show that the decision tree technique achieved a better (higher) true positive rate and a better (lower) false positive rate than Support Vector Machines: 99.86% vs. 99.62% and 0.05% vs. 0.09%, respectively.

Keywords—Wireless Sensor Networks, Denial of Service, Intrusion Detection System, Decision Trees, Support Vector Machines.

I. INTRODUCTION

Networking systems are used across the globe by most companies, and even by individual developers, to create innovative solutions and products that help organizations and citizens use new technologies to satisfy their various needs. Sensors are a relatively new technology that entered the market early and are now used in the Internet of Things (IoT) [1]. Wireless Sensor Networks (WSNs) are currently one of the hottest research areas. WSN growth is most apparent in critical military and civilian applications. Although beneficial, most of these applications have various associated security threats, especially in unattended environments.

A WSN consists of many independent, autonomous, tiny, low-power and usually cheap sensor nodes. These nodes are distributed across a geographical area of interest to collect important data and transfer them wirelessly to a powerful node, called the sink node [2]. WSNs usually use special protocols to send possibly sensitive data over the network. Therefore, information security and protection are essential in all WSNs to prevent various security threats. Unfortunately, achieving this is a major challenge because of the limited resources of WSNs, including battery power, memory and processing capabilities. These restrictions make traditional security measures such as encryption impractical in these networks.

Cyber attackers can break a sensor node, eavesdrop on messages, inject fake messages, compromise data integrity, and waste network resources. The Denial of Service (DoS) or Distributed DoS (DDoS) attack is one of the most prominent and serious attacks that threaten the security of computer networks in general and WSN security in particular.

D/DoS attacks can disable a network service, application or website by sending an excessive number of fake requests to overwhelm the network/server resources to the point that legitimate traffic is prevented from accessing the network until the attack is stopped. DoS attacks can be easily executed, but their prevention is often costly and complex.

To avoid and prevent security threats, a security control such as an Intrusion Detection System (IDS) can be used to detect known and unknown attacks. An IDS detects abnormal or suspicious activities and raises an alarm when an intrusion occurs [1].

Implementing an IDS in WSNs is more difficult than in other types of networks because sensors are usually designed to be small and cheap, without sufficient hardware resources; complex solutions are therefore not practical in such networks. This difficulty motivated the authors to develop a lightweight IDS solution for WSNs. This paper uses decision trees and Support Vector Machines. These two well-known machine learning techniques are tested on a specialized WSN dataset containing regular (no attack) and several attack scenarios to check their effectiveness in DoS detection.

The rest of this paper is organized as follows. Section II presents related work in the literature. Section III introduces preliminary information needed for subsequent sections, including the studied dataset, the machine learning algorithms and tools used, and the performance metrics. Experimental results are presented and discussed in Section IV. Conclusions and avenues for future work are presented in Section V.

II. LITERATURE REVIEW

In recent years, WSNs have become a preferred solution for many applications. WSNs are useful in monitoring critical facilities such as power grids, water supplies, traffic networks, telecommunications systems, farms, agriculture, and military command applications. In the medical and health care fields there are many WSN-based applications, such as monitoring patients' physiological parameters and tracking their locations within hospitals, homes and elderly care centers, monitoring chronic and elderly patients, and wearable wireless sensor design. Moreover, during natural disasters, WSNs can be used to detect flooding, power outages, fire, tornadoes, volcanoes or earthquakes [3].

Security is an essential non-functional, and sometimes functional, requirement of most systems. Broadly speaking, security is related to preventing different types of unauthorized access to system functions or detection and
prevention of attempts to block authorized access such as DoS attacks. Security should be considered at early stages of the network design to secure transmitted data and to isolate the locations of network members from unauthorized access [3]. Security protocols, methods and models used in wired networks are inappropriate for WSNs due to their constrained energy resources. The rapid development of wireless communication technology and small sensor nodes has allowed WSNs to spread rapidly and also to become a preferred target for attackers.

Internal attack is a critical security problem in WSNs. A Continuous Time Markov Chain (CTMC) enables the analysis of the behavior of sensors in a WSN when attacked internally. Shi et al. [4] built an epidemiological model for internal attack detection. In this model, the detection rate is the rate of transition from a compromised state to a response state. Using the Bellman equation, the utility for the state transitions of a sensor can be written in the standard forms of dynamic programming [4].

The most common WSN security threats include eavesdropping, privacy breaches and DoS attacks. DoS attacks stop WSN functionality, and it is difficult to protect against and recover from this kind of attack. Such threats are common across all dedicated wireless networks and are not restricted to WSNs [5]. D/DoS attacks can be broadly classified into three main categories: 1) Volume-based attacks, which use either Internet Control Message Protocol (ICMP) packets or User Datagram Protocol (UDP) floods to send a huge number of packets and/or large packets to the target server in order to fill the bandwidth and force the server or service down. 2) Protocol-based attacks, in which actual server resources are consumed with no resources left to respond to legitimate requests. Similarly, the resources of intermediate services such as load balancers, firewalls, and other communications equipment or services can be consumed and disabled. Examples of protocol-based DoS attacks include SYN flood, ping of death, and Smurf DDoS. 3) Application-based attacks: an application layer (layer 7) attack occurs when an attacker malforms and spoofs packets that require low bandwidth to attack an application target server.

Different WSN protocols are characterized and compared based on their performance levels using different Quality of Service (QoS) metrics. QoS metrics include high reliability, low jitter, low energy consumption and low end-to-end delay. Routes in a WSN are selected to maintain a good energy level. Such route selection is based on factors such as how many packets can be transmitted successfully without energy depletion, but this is an expensive approach since trees must be established and maintained to perform such calculations in energy-restricted networks [2].

Several types of WSN protocols serve different purposes, such as flat-based, hierarchical-based and location-based protocols. Low-Energy Adaptive Clustering Hierarchy (LEACH) is a hierarchical-based routing protocol that combines clustering and MAC layer techniques. The LEACH protocol minimizes energy usage and outperforms classical clustering approaches. It is distributed and increases the lifetime of the network. In LEACH, a Cluster Head (CH) allows communication between group members and sinks. CHs also perform data aggregation and local data fusion to eliminate redundancies [1] [6].

Four types of DoS attacks could target the LEACH protocol. These attacks are Blackhole, Grayhole, Flooding, and Scheduling or Time-Division Multiple Access (TDMA) attacks. These attacks are implemented in the dataset used in this paper, as discussed in Section III.A.

In a Blackhole attack, a malicious node sends a false route reply message when it receives a Route Request (RREQ) message, without checking its routing table. The malicious node refuses to forward data packets to the destination. Wrong routing error messages inform other nodes in the network that the destination is the next hop from the attacker node and that the attacker node has the best route to that destination [7]. All neighboring nodes update their routing tables and make the attacking node the first hop toward the destination. When the blackhole attacker node receives data packets, it drops all of them and no packet reaches the destination [6].

The Grayhole attack, also known as the misbehaving attack, is an extension of the blackhole attack in which the behavior of the malicious node is exceptionally unpredictable. It is a selective packet dropping attack. The malicious node exploits the Ad hoc On-Demand Distance Vector (AODV) routing protocol to advertise itself as a valid path to the destination node in order to intercept packets. In a grayhole attack, malicious nodes drop intercepted packets with a certain probability. Due to the uncertain nature of this attack, it is more difficult to detect than the blackhole attack [7]. Malicious behavior in a grayhole attack has two different forms. The malicious node may drop packets that come from or are destined to certain nodes in the network while forwarding all packets to other nodes. Another grayhole behavior is a combination of the previous two, which is more difficult to detect.

The Flooding attack affects the LEACH protocol by sending a large number of messages to the nodes to consume the energy, memory and network traffic of the nodes inside the WSN, which consequently disrupts the whole network [8]. The Scheduling attack behaves the same as the flooding attack but is triggered on a pre-configured date and time schedule to consume WSN node resources and disrupt the network [1] [8].

III. PRELIMINARIES

This section introduces preliminary information needed for subsequent sections, including a description of the studied dataset, the machine learning techniques and tools used, and the performance metrics.

A. Data Sets

Almomani et al. [1] used the LEACH protocol to collect a dataset representing WSN features in different attack scenarios. The LEACH protocol was selected because it is one of the most common and widely used routing protocols in WSNs. The WSN dataset contains 374661 records with the attributes described in Table I. Four different DoS attacks are simulated in the dataset: Blackhole (10049 records), Grayhole (14596), Flooding (3312) and Scheduling (6638), as well as normal (no attack) behavior with the remaining 340066 records.

B. Support Vector Machines

Support Vector Machines (SVM) is a supervised machine learning algorithm that can be used for both classification and regression tasks.
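The class counts reported for the dataset in Section III.A can be checked against the stated total with a few lines of Python. The sketch below is illustrative only: the names `CLASS_COUNTS`, `class_shares` and `majority_baseline` are introduced here and do not come from the paper, and the majority-class baseline is an added reference point, not a result reported by the authors.

```python
# Class counts as reported for the WSN dataset in Section III.A.
CLASS_COUNTS = {
    "Normal": 340066,
    "Grayhole": 14596,
    "Blackhole": 10049,
    "Scheduling": 6638,
    "Flooding": 3312,
}
REPORTED_TOTAL = 374661  # total number of records stated in the text


def class_shares(counts):
    """Return each class's share of all records as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def majority_baseline(counts):
    """Accuracy of a trivial classifier that always predicts the largest class."""
    return max(counts.values()) / sum(counts.values())


if __name__ == "__main__":
    # The per-class counts add up to the reported number of records.
    assert sum(CLASS_COUNTS.values()) == REPORTED_TOTAL
    for name, share in class_shares(CLASS_COUNTS).items():
        print(f"{name:<11}{CLASS_COUNTS[name]:>8}{share:>9.2%}")
    print(f"Majority-class baseline: {majority_baseline(CLASS_COUNTS):.2%}")
```

Because the normal class dominates the dataset, overall accuracy is best read alongside the per-class rates.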
TABLE I. WSN DATA SET ATTRIBUTES [1].
#   Attribute Name          Attribute Description
1   Node ID                 A unique ID used to distinguish sensor nodes in any round and at any stage
2   Time                    The current simulation time of the node
3   Is CH                   A flag to distinguish whether the node is a cluster head or not
4   Who CH                  The ID of the CH in the current simulation round
5   Distance to CH          The distance between the node and its CH
6   Energy Consumption      The amount of energy consumed in the previous round
7   ADV_CH send             The number of advertise CH's broadcast messages sent to the nodes
8   ADV_CH receives         The number of advertise CH messages received from CHs
9   Join_REQ send           The number of join request messages sent by the nodes to the CH
10  Join_REQ receives       The number of join request messages received by the CH from the nodes
11  ADV_SCH send            The number of advertise TDMA schedule broadcast messages sent to the nodes
12  ADV_SCH receives        The number of TDMA schedule messages received from CHs
13  Rank                    The order of this node within the TDMA schedule
14  Data sent               The number of data packets sent from a sensor to its CH
15  Data Received           The number of data packets received from CH
16  Data sent to BS         The number of data packets sent to the BS
17  Distance CH to BS       The distance between the CH and the BS
18  Send Code               The cluster sending code
19  Attack Type             Type of the node. It is a class of five possible values: Blackhole, Grayhole, Flooding, Scheduling, and normal (the node is not an attacker)
20  RSSI                    Received Signal Strength Indication between the node and its CH
21  Max Distance to CH      The maximum distance between the CH and the nodes within the cluster
22  Average distance to CH  The average distance between nodes in the cluster to their CH
23  Current energy          The current energy for the node in the current round

SVM is based on the concept of decision planes that define decision boundaries. Support vectors are simply the coordinates of individual observations. In SVM, each data element is a point in an n-dimensional space, where n is the number of features and the value of each feature is a specific coordinate. The SVM is the frontier (hyper-plane/line) that best segregates two classes. A decision plane is one that separates a group of objects into different classes [2] [3].

C. Decision Tree

A decision tree is an algorithm for making decisions. It is a graphical representation used to solve a problem by presenting the available alternative solutions to it. Decision tree analysis answers a series of questions until it reaches the final decision choice, called the leaf. Reaching a decision requires following a path from the root to the leaf. The path is selected based on the answers to the questions. Each answer represents a possible value for a selected feature of the studied domain. Decision trees are commonly used in operations management research to support the decision making process [2] [7].

D. Waikato Environment for Knowledge Analysis (WEKA)

WEKA is an open source tool written in the Java programming language, designed and developed at the University of Waikato, New Zealand. This tool is usually used for data mining tasks and for modeling machine learning algorithms. WEKA can be used to analyze data from different perspectives and to summarize the results in a useful manner. WEKA supports several data mining tasks such as spectral clustering, classification, data pre-processing, feature selection and regression [9]. WEKA is used in this research to build the decision tree and to construct the SVM. It can also be used to insert noise into the data set before building the classifier, in order to study its noise-tolerance capabilities [10].

E. Used Evaluation Metrics

Several metrics are used to test the effectiveness of the machine learning techniques in DoS detection.

• True Positive (TP): rate of positive (attack) tuples that were correctly labeled by the classifier as positive.

• False Positive (FP): rate of negative (no attack) tuples that were incorrectly labeled by the classifier as positive.

• True Negative (TN): rate of negative tuples that were correctly labeled by the classifier as negative.

• False Negative (FN): rate of positive tuples that were incorrectly labeled by the classifier as negative.

• Precision: how close two or more measurements are to each other. It is used as a measure of accuracy and represents the samples that are correctly classified among those that were classified as positive cases by the classifier.

• Recall: ratio of the number of samples that were correctly classified as positive to the number of all samples that are actually positive.

• Receiver Operating Characteristics (ROC): a graph or curve used to organize classifiers and visualize their performance. It is also used to compare diagnostic tests. ROC is widely used in machine learning, decision making and data mining.

• Confusion Matrix: a useful technique for summarizing the performance of a classification algorithm. When there are more than two classes in the dataset, classification accuracy alone can be misleading. A confusion matrix is therefore a better indicator of what the classification model is getting right and what types of errors it is making.

IV. EXPERIMENTAL RESULTS

The current research is empirical in nature and is based on conducting extensive experiments to check the effectiveness of machine learning techniques in the studied scenarios.

A. Decision Tree (J48)

This section presents and discusses the results of using the J48 implementation of decision trees available in WEKA.

1) Decision Tree With Full Dataset

In this experiment, all available records are used to test the accuracy of the system using decision trees. The summary of the obtained results is listed in Table II, the detailed accuracy per class in Table III, and the confusion matrix in Table IV. The number of leaves is 316 and the size of the tree is 631. The time taken to build this model is 26.89 seconds.
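The root-to-leaf traversal described in Section III.C, and the leaf and size counts that WEKA reports for a tree, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the feature names are borrowed from Table I, but the thresholds, the tree structure and the helper names (`TOY_TREE`, `classify`, `count_leaves`, `tree_size`) are hypothetical and do not represent the J48 model learned in the experiments.

```python
# Toy tree for illustration only. Feature names follow Table I, but the
# thresholds and structure are hypothetical, NOT the learned J48 model.
TOY_TREE = {
    "feature": "ADV_SCH send",
    "threshold": 0,
    "left": {"label": "Normal"},      # taken when value <= threshold
    "right": {                        # taken when value > threshold
        "feature": "Data sent to BS",
        "threshold": 10,
        "left": {"label": "Grayhole"},
        "right": {"label": "Normal"},
    },
}


def classify(node, record):
    """Follow one root-to-leaf path, answering one threshold question per node."""
    while "label" not in node:
        goes_left = record[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["label"]


def count_leaves(node):
    """Number of leaf (decision) nodes in the tree."""
    if "label" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


def tree_size(node):
    """Total number of nodes: internal question nodes plus leaves."""
    if "label" in node:
        return 1
    return 1 + tree_size(node["left"]) + tree_size(node["right"])


if __name__ == "__main__":
    sample = {"ADV_SCH send": 3, "Data sent to BS": 4}  # made-up record
    print(classify(TOY_TREE, sample))
    print(count_leaves(TOY_TREE), tree_size(TOY_TREE))
```

In any tree whose internal nodes all have exactly two children, the size equals 2 × leaves − 1. The reported figures for the full-dataset model (316 leaves, size 631) satisfy this relation, which is consistent with a tree built from binary splits.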
TABLE II. SUMMARY OF DECISION TREE RESULTS USING FULL DATA.
Correctly Classified Instances      373401    99.6637%
Incorrectly Classified Instances    1260      0.3363%
Kappa statistic                     0.9805
Mean absolute error                 0.002
Root mean squared error             0.0351
Relative absolute error             2.9202%
Root relative squared error         18.8321%
Total Number of Instances           374661

TABLE III. DETAILED DECISION TREE ACCURACY PER CLASS USING FULL DATA.
Class          TP      FP      Prec.   Recall  ROC
Normal         0.999   0.020   0.998   0.999   0.990
Flood.         0.975   0.000   0.953   0.975   0.995
TDMA           0.927   0.000   0.995   0.927   0.966
Grayhole       0.982   0.001   0.984   0.982   0.997
Blackhole      0.992   0.000   0.985   0.992   0.998
Weighted Avg.  0.997   0.018   0.997   0.997   0.991

Based on the results, it can be noticed that in decision tree classification, the use of the selected attack dataset is better than the use of the full dataset in terms of the number of leaves, the size of the tree, the time taken to build the tree, and the number of correctly and incorrectly classified instances.

TABLE V. SUMMARY OF DECISION TREE RESULTS USING SELECTED ATTACKS
Correctly Classified Instances      357463    99.8573%
Incorrectly Classified Instances    511       0.1427%
Kappa statistic                     0.9851
Mean absolute error                 0.0012
Root mean squared error             0.0293
Relative absolute error             1.8866%
Root relative squared error         16.3771%
Total Number of Instances           357974

TABLE VI. DETAILED DECISION TREE ACCURACY PER CLASS USING SELECTED ATTACKS
TABLE VIII. DECISION TREE EVALUATION METRICS USING FULL DATASET WITH FOCUS ON FLOODING AND GRAYHOLE ATTACKS.
            TP        FP     TN
Normal      339719    0      309
Flooding    3229      83     0
Grayhole    14332     120    0
Total       357280    203    309

TABLE XV. CONFUSION MATRIX FOR SVM USING SELECTED ATTACKS
a       b       c         <-- Classified as
3113    28      171       | a = Flooding
0       14269   327       | b = Grayhole
345     595     339126    | c = Normal
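The per-class rates defined in Section III.E can be recomputed from a confusion matrix such as Table XV. The following Python sketch is illustrative only and is not the authors' evaluation code (the paper uses WEKA). It assumes the usual WEKA layout, in which rows are actual classes and columns are predicted classes, and the names `MATRIX`, `per_class_metrics` and `accuracy` are introduced here.

```python
LABELS = ["Flooding", "Grayhole", "Normal"]

# Values copied from Table XV. Assumed layout (WEKA convention):
# rows are actual classes, columns are the classes assigned by the classifier.
MATRIX = [
    [3113, 28, 171],
    [0, 14269, 327],
    [345, 595, 339126],
]


def per_class_metrics(matrix, labels):
    """One-vs-rest TP rate (recall), FP rate and precision for each class."""
    total = sum(sum(row) for row in matrix)
    results = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # actual i, predicted other
        fp = sum(row[i] for row in matrix) - tp   # actual other, predicted i
        tn = total - tp - fn - fp
        results[label] = {
            "tp_rate": tp / (tp + fn),
            "fp_rate": fp / (fp + tn),
            "precision": tp / (tp + fp),
        }
    return results


def accuracy(matrix):
    """Fraction of all instances that lie on the main diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


if __name__ == "__main__":
    for label, m in per_class_metrics(MATRIX, LABELS).items():
        print(f"{label:<9} TP rate={m['tp_rate']:.3f} "
              f"FP rate={m['fp_rate']:.4f} precision={m['precision']:.3f}")
    print(f"accuracy={accuracy(MATRIX):.4f}")
```

The row sums of the matrix (3312, 14596 and 340066) match the Flooding, Grayhole and normal record counts given in Section III.A, which supports reading the rows as actual classes.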
REFERENCES