1 s2.0 S1570870523000380 Main
Ad Hoc Networks
journal homepage: www.elsevier.com/locate/adhoc
The Internet of Things (IoT) has revolutionized networking by connecting real-world entities to the Internet. IoT connects communication devices and has an incredible impact on perspective analytics on the massive volume of data produced every day. An attacker may exploit vulnerabilities of IoT entities and compromise users’ security and privacy. The development of solutions to address the security and privacy issues of IoT is at a premature stage and is considered a challenge. This challenge becomes more critical when the devices in the network are resource-constrained in terms of energy, processing and memory. IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN) has emerged in recent years as an adaptation layer to carry IPv6 packets over IEEE 802.15.4. Many IoT applications use the Routing Protocol for Low Power and Lossy Networks (RPL), a network layer protocol developed for routing in 6LoWPAN. Security is challenging in resource-constrained environments where encryption may not be a viable solution. The version number attack is one of the most common network layer attacks against RPL-based 6LoWPAN. The RPL specification does not address the integrity of the version number and therefore leaves the version number mechanism as a weak point in terms of security. This paper investigates the impact of the version number attack in RPL networks while considering the mobility of the sensor nodes. We propose a solution that uses a Q-Learning strategy to detect the malicious nodes that are performing the version number attack. The proposed approach detects malicious nodes with reasonable accuracy while imposing significantly less overhead on the nodes of low power and lossy networks. Other approaches exist, such as Message Authentication Codes (MAC) based on symmetric keys, but these techniques incur memory and communication overhead. We therefore propose a different approach, Q-Learning, to detect the attacker nodes.

Keywords: IoT; RPL; Version number attack; Security; 6LoWPAN; Q-learning
1. Introduction

The genesis of IoT has made a significant change in how people interact through interconnected smart devices and do their business. The inception of IoT has added senses to computers, which has profited the world’s global economy [1]. IoT connects billions of devices where a lot of data is exchanged and processed without human intervention. The proliferation of real-time data generated using interconnected devices demands analytical and fast decisions for businesses. IoT applications aim for lightweight, secure, scalable and mobile solutions for communication among the devices, as well as for retaining security and privacy.

IoT has made a significant impact on the lifestyle of humans. There is a vast number of IoT applications that have improved human lives in terms of health, food quality, and environment quality. The inception of IoT has added communication capabilities to embedded devices, which work and perform tasks without human intervention. Reports indicate that there will be approximately 75.44 billion IoT devices installed worldwide by 2025 and that revenue from IoT will cross $212 billion worldwide [2,3]. This clearly shows how IoT is going to profit the world’s global economy [1] at a major scale. The connected devices exchange a large volume of data which is processed for decision
✩ G. Sharma is with the Department of Computer Science & Engineering, Malaviya National Institute of Technology Jaipur, Rajasthan, India. He is also
Assistant Professor at Manipal University Jaipur. J. Grover is with the Department of Computer Science & Engineering, Malaviya National Institute of Technology
Jaipur, Rajasthan, India. A. Verma is with the Department of Computer Science & Engineering, PDPM Indian Institute of Information Technology, Design and
Manufacturing, Jabalpur, Madhya Pradesh, India.
∗ Corresponding author at: Malaviya National Institute of Technology, Jaipur JLN Marg, 302017, Rajasthan, India.
E-mail addresses: [email protected] (G. Sharma), [email protected] (J. Grover), [email protected] (A. Verma).
https://doi.org/10.1016/j.adhoc.2023.103118
Received 2 March 2022; Received in revised form 22 December 2022; Accepted 5 February 2023
Available online 8 February 2023
1570-8705/© 2023 Elsevier B.V. All rights reserved.
G. Sharma et al. Ad Hoc Networks 142 (2023) 103118
making at the end machines or in the cloud. The proliferation of real-time data generated using interconnected devices demands analytical and fast decisions for business organizations. IoT devices are currently among the favourite targets of attackers. These devices become easy targets because of the vulnerabilities present in the hardware design, embedded software, and networking protocols being used. Security and privacy concerns slow down the growth of IoT and limit its global adoption. This demands the development of security solutions for IoT to address existing and upcoming threats.

There are many critical applications of IoT that run over resource-constrained networks and need lightweight, secure, scalable, and mobility-supporting solutions to maintain users’ security and privacy. 6LoWPAN is one such example that runs over resource-constrained devices or nodes [4,5]. In simple words, 6LoWPAN is an adaptation layer between the media access control (MAC) layer and the network layer that optimizes IPv6 packets in resource-constrained networks [6]. In such networks, classical routing protocols like Ad hoc On-Demand Distance Vector (AODV), Open Shortest Path First (OSPF), and Dynamic Source Routing (DSR) are not recommended because of the overhead involved [7]. RPL is a lightweight routing protocol designed to perform energy-efficient routing in 6LoWPAN, which suffers from communication delay, constrained resources, fluctuating link qualities, and varying convergence time. RPL at present is a "Proposed Standard" of the Internet Engineering Task Force (IETF), which was drafted in 2011 and specified in RFC 6550 [8]. It is a de-facto protocol for the network layer and has become one of the notable protocols for routing in 6LoWPAN [9]. Many resource-constrained IoT applications, such as agriculture, remote area monitoring, military applications and the health care industry, use the RPL protocol [10].

In recent years, it has been observed that the number of attacks on IoT has increased drastically. Attackers exploit the vulnerabilities of insecure devices like CCTV cameras, smart TVs, smart bulbs, connected printers, smartphones, and even smart speakers to perform attacks [11]. Recent research by Eyal Itkin et al. [12] shows how to infect a network using a fax machine/all-in-one printer. IoT has increased the vulnerability surface for attackers, since a hacked device may compromise all other connected devices through the communication channel. The threat surface grows with the number of connected devices, which augments the security and privacy issues of IoT users.

IoT is a major target of attackers since the devices are constantly connected and provide an expanded threat surface. It is clear that 6LoWPANs are also not secure and will be targeted to perform attacks. It is essential to address and mitigate IoT security issues. This paper primarily addresses the version number attack in RPL-based 6LoWPAN, considering the mobility of nodes. Due to the ad-hoc characteristics of IoT, it is challenging to detect routing attacks. RPL, which works above the 6LoWPAN adaptation layer, has its own set of threats and inherits attacks that were popular for wireless sensor networks [13]. RPL is prone to different attacks, including network traffic, topology, and resource based attacks. The attacker tries to flood the network with unnecessary packets, impacting performance, energy consumption and network delays [14].

Many researchers worldwide [15–18] have proposed security solutions for RPL, yet attackers exploit vulnerabilities in resource-constrained networks to disrupt the regular operation of applications. This research aims to detect version number attackers using Q-Learning in RPL networks. In RPL-based IoT, a version number attacker implements the attack by simply changing the version number of an incoming DIO message and multicasting the modified DIO message. Since RPL does not give specifications for the integrity of the version number, this motivates us to detect the version number attack in mobile IoT. The proposed approach runs the Q-Learning algorithm directly in the sensor nodes to detect the version attacker in real time. Terms used in the context of IoT and RPL, their acronyms, and definitions are listed in Table 1.

Table 1
List of abbreviations.

Abbreviation   Definition
6LoWPAN        IPv6 over Low-Power Wireless Personal Area Networks
IoT            Internet of Things
RPL            Routing Protocol for Low Power and Lossy Networks
AODV           Ad hoc On-Demand Distance Vector
OSPF           Open Shortest Path First
DSR            Dynamic Source Routing
IETF           Internet Engineering Task Force
DODAG          Destination Oriented Directed Acyclic Graph
DIO            DODAG Information Object
DIS            DODAG Information Solicitation
DAO            Destination Advertisement Object
MRHOF          Minimum Rank with Hysteresis Objective Function
OF0            Objective Function Zero
ABC            Artificial Bee Colony
PDR            Packet Delivery Ratio
AE2ED          Average End-to-End Delay
RL             Reinforcement Learning
IQR            Interquartile Range
UGDM           Unit Disk Graph Medium
ETX            Expected Transmission Count
VAD            Version Number Attack Detection
DIO-P          DIO Processing
WPS            Worst Parent Selection
IDS            Intrusion Detection System

1.1. Motivation

Some of the IoT applications which use the RPL protocol at the network layer are connected cars, connected factories, connected buildings, smart creatures, wearable sensors, etc. Wearable sensors are one of the IoT health-care applications where the nodes are moving and sending data to smartphones, which in turn send data to the cloud [19]. The security and privacy of the data are essential in such applications since they directly relate to human life. Surveillance using drones and remote access to data in the military is another mobile IoT-based application where security is critical. Tracking of endangered species is being carried out using smart sensors fitted to their bodies or by using unmanned aerial vehicles, and the data gathered from the sensors could be vital for the security of these species [20]. Many IoT applications have adopted RPL, but there are numerous security vulnerabilities in this protocol. The version number attack is a serious attack that can harm the proper functioning of the network. This motivates us to address the version number attack and its impact on IoT networks with mobile nodes. The literature survey in Section 3 shows that most of the solutions are machine learning-based and do not consider the mobility of nodes. The proposed approach detects version number attackers in real time, specifically for mobile IoT.

1.2. Contributions

The contributions of our paper are listed below:

1. Implementation and comprehensive analysis of the version number attack in RPL-based mobile IoT networks using various performance metrics.
2. Implementation and detailed examination of a Q-Learning based version number attack detection approach which imposes significantly less overhead on the sensor nodes.
3. Comparison of our approach with the existing Message Authentication Codes (MAC) techniques.
4. Comparison of the proposed technique with existing works, together with a follow-up discussion.
1.3. Organization
2. Background

This section discusses the RPL protocol and the version number attack. It also covers the basics of Q-Learning and the steps involved in its implementation.

2.1. Overview of RPL protocol

RPL is a standard routing protocol for low-power and lossy networks (LLNs). It is energy efficient in 6LoWPAN due to its flexible nature and Quality-of-Service support [21,22]. The Internet Engineering Task Force (IETF) has suggested RPL as a network layer protocol standard. RFC 6550 [8] contains a more detailed discussion of the RPL standard. RPL is a distance vector and source routing protocol that is designed to operate on top of the IEEE 802.15.4 MAC layer protocol. RPL supports point-to-multipoint traffic from a central point to the device nodes. Point-to-point and multipoint-to-point traffic are also supported in RPL.

RPL forms a topology based on the concept of a Destination Oriented Directed Acyclic Graph (DODAG). It defines a loop-free, tree-like structure that specifies the default routes between nodes in LLNs. A network may consist of one or several DODAGs at the same time, which together form an RPLInstance. A DODAG is identified by the combination of RPLInstanceID, DODAGID, and DODAGVersion. A single network may also contain multiple RPLInstances concurrently. These instances are logically independent. RPL has a few primary characteristics, such as auto-configuration, self-healing, loop avoidance and detection, transparency, and support for multiple sinks.

RPL control messages are specified as a new type of ICMPv6 control message. There are four types of control messages in RPL for creating and maintaining the whole topology and DODAG: (i) DODAG Information Object (DIO), (ii) DODAG Information Solicitation (DIS), (iii) Destination Advertisement Object (DAO), and (iv) Destination Advertisement Object Acknowledgment (DAO-ACK). In RPL, an objective function (OF) defines the routing metrics, optimization objective, rank calculation, and parent selection criteria. Various OFs in RPL include the ETX Objective Function (ETXOF) [23], the Minimum Rank with Hysteresis Objective Function (MRHOF) [24], and Objective Function Zero (OF0) [25]. Each node is associated with a rank, which plays a significant role in DODAG management. The individual position of each node in relation to the DODAG root node is defined by its rank. According to the rank rule, the rank value should rise in the downward direction (from root to leaves) and fall in the upward direction. In RPL, the rank idea is applied to: (1) detect and resolve routing loops; (2) maintain a parent–child relationship; (3) distinguish between parents and siblings; (4) restore broken links. A DIS message is used to solicit a DODAG Information Object. DIS may be used to discover nodes in a DODAG. A DIO message is issued by the root node to construct a new DAG and is then multicast through the DODAG structure. The DIO message carries the necessary information that allows a node to discover an RPLInstance, its configuration parameters and a parent set. A DAO message is used for bi-directional communication to record the nodes visited along the upward path. Every node except the root node unicasts the DAO message to convey the routing tables with prefixes of its descendants and to advertise its addresses and prefixes to its parent. A DAO-ACK message is a unicast packet sent by a DAO recipient in response to a unicast DAO message [26]. The "Trickle timer" is used by RPL to limit the transmission of control messages and reduce network energy usage. The trickle timer is reset when an inconsistency is detected, i.e., loops, link loss, or a change in the parent set. The trickle timer interval is increased or decreased in case of a stable network and inconsistency detection, respectively. In a stable network, the interval is increased in order to decrease the number of DIOs transmitted in the network. Upon detection of any inconsistency, the interval is reduced to increase the number of DIOs and fix the inconsistency quickly [27].

2.2. Version number attack

RPL was designed to support lossy networks, but it has several vulnerabilities that various researchers have discovered [28–31]. RPL-based networks are susceptible to version number attacks, which cause the network to behave anomalously. This attack has a negative impact on the network, causing the DODAG to be directed towards the malicious node. The network instability increases power consumption, reduces the packet delivery ratio and increases delays in communication. The modification of the version number creates loops in the network, as the version number is not coming from the root. A change to the version number by any node other than the root violates the RPL specification.

The DODAG formation is started by the root, which sends the DIO message to the neighbouring nodes. The neighbours calculate the cost of the route to join the DODAG and form a path towards the root. The adjacent nodes multicast the DIO messages and repeat the process until the DODAG is formed. Once the DODAG is created, the network becomes stable.

A malicious node could send a higher version number in the DIO message, which causes inconsistency in the network. When the root receives a DIO with a new version number, it starts global repair by resetting the trickle timer, as shown in Fig. 1.

It is not easy for the sink and legitimate nodes to find the attacker nodes locally, because the nodes receive many DIO messages from different neighbours. So, we apply the concept of Q-Learning to detect the version number attackers with good accuracy.

2.3. Q-learning

Reinforcement learning is not a novel concept in robotics. Modern reinforcement learning, in the form of deep reinforcement learning, uses deep learning [32] for faster and more efficient learning of the environment by the agent [33]. Many applications use reinforcement learning, such as game play, robotics, trading systems, and self-driving cars. We can extend this concept to create trust-based attack detection in the
mobile IoT systems. The Q-Learning concept starts with the Bellman equation and introduces stochastic and temporal change to approximate the value function. Fig. 2 shows the Q-Learning algorithm process, which starts by initializing the Q-Table and choosing an action.

Fig. 2. Q-Learning process.

$$v^{*}(s) = \max_{a}\left[ R(s,a) + \gamma\, v^{*}(s') \right] \tag{1}$$

Eq. (1) is the Bellman equation, where $R(s,a)$ represents the instantaneous reward associated with taking action $a$ in state $s$, $\gamma$ represents the discounting factor, and $v^{*}(s')$ represents the value of the possible future state $s'$, with the maximum taken over the actions $a$.

If we introduce stochastic behaviour into Eq. (1) by taking the expected value over the next states, $E[v(s')]$, the equation for Q-learning becomes:

$$Q(s,a) = R(s,a) + \gamma \sum_{s'} p(s,a,s') \max_{a'} Q(s',a') \tag{2}$$

The recursive function in Eq. (2) takes the instantaneous reward and the expected value from the other states to calculate its own value. In Eq. (2), $p(s,a,s')$ represents the probability of reaching state $s'$ from state $s$ by taking the action $a$. One application of Reinforcement Learning is the optimization of memory control [34]. This application models DRAM access using a Markov Decision Process (MDP). The controller’s performance surpassed First-Ready, First-Come-First-Serve (FR-FCFS) by 19%. The memory access requests performed well after training the controller with nine benchmark applications. This controller can be extended to use the Q-Learning technique. Since Reinforcement Learning (RL) is based on a fixed policy, the controller provides a reward of 1 or 0 if an action is performed (read or write requests). The values for each action can be defined probabilistically in Q-Learning, a model-free RL method, allowing the controller to give the processor the command. Since Q-Learning allows us to generate multiple optimal policies, the controller identifies the optimal action for a given state. In [35], the authors show how RL and its extensions could be used in financial markets. The authors predicted the stock price using both SARSA and Q-Learning and showed that Q-Learning uses explorative actions to predict the stock price better. In our approach, we find those nodes that generate the maximum expected value and cross the threshold value. Using this analogy, we employ the Q-Learning concept in our approach for the detection of version attackers. Section 4.6 specifies the details of the attacker node detection process.

3. Related work

This section discusses different approaches that many authors have implemented to detect attacks in RPL-based IoT networks or WSNs. Most of the techniques are based on IDS systems using machine learning approaches. We start our discussion with Gara et al. [36], who proposed an IDS system for wireless sensor networks for selective forwarding and clone attacks. This technique uses two different sink nodes, one for creating the DODAG and another for finding the attacker nodes in the cluster of nodes. In [16], the authors implement CHA-IDS based on the anomaly and signature of the data received at the sink node. This system aggregates the data; the analyzer analyses it and responds to the network about the abnormal activity. In [37], the author discusses how to mitigate the version number attack for static nodes based on a monitoring table, which contains the node information used to check for malicious nodes.

A probabilistic model for the mitigation of the version number attack is presented in [38], where the nodes are at some specific locations. This approach has not considered the mobility of nodes. This technique discards the DIO messages based on the version numbers and the node’s rank. In [39], the sink or the neighbouring nodes verify whether the received DIO message has a greater version than their own version number. The verification process detects whether the node is malicious or not.

A rule-based method [40] is used to detect the version and rank attacks in health networks. The authors use a threshold to check different parameters such as energy consumption, i.e., the ETX value. If the threshold is exceeded, the sink broadcasts the exact version number to the other nodes. A distributed monitoring method [41] is proposed to detect the version number attack using monitoring nodes, which capture the packets from normal nodes and send them to the sink node to detect the attack. This technique implements local and distributed algorithms to detect the attack. The root maintains both an attack list and a neighbour list to identify the malicious node.

A machine learning-based technique for detecting RPL-based attacks is presented in [42]. This technique generates legitimate and attack datasets, which are given to an SVM classifier to identify the attack. The research article [43] uses a gradient boosting machine to detect the version number attack, generating the dataset using Wireshark. This technique preprocesses the generated data before applying the learning model. After selecting the relevant features and labelling the dataset, the machine learning model, i.e., gradient boosting, is used to classify the benign and malicious nodes.

Similarly, Anitha et al. [44] detect the version number attack by comparing the root’s version number with the DIO message received from the neighbouring nodes. The methodology identifies the attack source by comparing it with the nodes with a higher rank. The technique uses a threshold mechanism to detect the attacker closer to the root node. If the count reaches the threshold value, the node is declared an attacker node.

A technique to mitigate flooding attacks in RPL-based IoT networks is presented in [45]. The authors use threshold values to control the DIS Time Interval and the DIS Start Delay. If a node crosses the limits of the threshold values, the DIS message is discarded and the IP address is stored in the blacklist table to treat the node as an attacker. In [46], the authors have implemented an IDS system to detect attacks on RPL networks. They have used three different datasets with machine learning algorithms such as random forest, GBM, and CART to classify the dataset, achieving more than 95% accuracy. They have also implemented Secure-RPL to mitigate the DIS flooding attack by maintaining a blocklist table [47]. This technique uses the DIS time interval and DIS delay parameters of the Contiki operating system to detect the attacker node.

Sharma et al. [48] have proposed a technique for simulating attacks to generate a dataset for multiple attacks. They generated a dataset for the version number attack, Hello Flood attack, and decreased rank attack, and identified 58 features for a machine learning algorithm to classify the attacks. Sarumathi et al. [49] have proposed an IDS system for the Sybil attack using an Artificial Bee Colony (ABC) inspired algorithm when the nodes are mobile. The IDS system counts the number of control messages in the stipulated period and calculates the time between the messages; if it exceeds the limit, a flag is set to check whether the event is malicious or legitimate. Wadhaj et al. [50] have proposed a mitigation of the DAO attack in RPL-based IoT networks which restricts the number of DAO messages received from a child node. If the count crosses the threshold, no DAO will be forwarded until the next time slot. This mitigation technique can increase the PDR and reduce the effect of the attack.
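The per-slot DAO limiting idea summarized above can be illustrated with a minimal sketch. It is not the implementation from [50]: the class name `DaoRateLimiter`, the threshold of five DAOs, and the 60 s slot length are hypothetical values chosen only to show the mechanism of counting DAOs per child and refusing to forward once the count exceeds the threshold, until the next slot begins.

```python
from collections import defaultdict


class DaoRateLimiter:
    """Forward at most `max_per_slot` DAO messages per child in each time slot.

    All parameter values are illustrative assumptions, not values from [50].
    """

    def __init__(self, max_per_slot=5, slot_seconds=60.0):
        self.max_per_slot = max_per_slot
        self.slot_seconds = slot_seconds
        self._slot_index = 0
        self._counts = defaultdict(int)  # child id -> DAOs seen in this slot

    def allow(self, child_id, now):
        """Return True if a DAO from `child_id` at time `now` may be forwarded."""
        slot = int(now // self.slot_seconds)
        if slot != self._slot_index:
            # A new time slot has started: forget the previous counts.
            self._slot_index = slot
            self._counts.clear()
        self._counts[child_id] += 1
        return self._counts[child_id] <= self.max_per_slot


if __name__ == "__main__":
    limiter = DaoRateLimiter(max_per_slot=2, slot_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("child-1", t))
```

Counting per child rather than globally means one misbehaving child does not block DAOs from its siblings, which is one plausible reading of the description; a global counter would be an equally simple variant.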
Congu Pu et al. [51] have used the Gini Index model to mitigate the Sybil attack. This technique uses Gini impurity to detect the DIS attacker nodes; the control message impurity increases when there is a Sybil attack. The defence mechanism discards the DAO messages if the impurity exceeds the threshold limit. Although this solution is capable of identifying a Sybil attack, it is not able to identify the attacker node. As a result, this approach is not able to block the attacker node itself. The attacker may attack the network at intervals, making the network inconsistent. This strategy reduces the DIO message reply rate so that the energy consumption at the nodes can be reduced. Ahmed et al. [52] have presented a review of different attacks in RPL-based IoT networks and have also discussed different IDS systems for detecting the attacks, such as signature-based, anomaly-based and specification-based systems. An IDS for Copycat attacks in RPL-based IoT networks is presented in [18,53]. This work uses a threshold based on the DIO time interval to detect the attackers. In the attacker’s case, the number of DIO messages increases significantly. The neighbours put these nodes in the blocklist table. Further processing of the DIOs coming from the attacker is completely suppressed to mitigate the attack.

A theoretical comparative study of different attacks is shown in Table 2. It shows different IDS systems or solutions against various attacks proposed by eminent authors. Table 3 shows the comparative study of related work with the proposed solution.

In today’s world, there are various applications, such as health care and defence systems, where the nodes could be mobile. This research focuses on static as well as mobile systems to detect the version number attack.

Fig. 3. Version number attack detection flow model.

Fig. 4. Attack: Environment.
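Since the proposed solution builds on the Q-value recursion of Eq. (2) in Section 2.3, a small worked example may help fix ideas. The sketch below repeatedly applies Eq. (2) to a toy two-state model until the Q-table stops changing. Everything in it is an assumption made for illustration: the states (`stable`, `repair`), the actions, the rewards, the transition probabilities and the discount factor are invented, and the code is not the detection algorithm of this paper.

```python
# Toy fixed-point iteration of Eq. (2):
#   Q(s,a) = R(s,a) + gamma * sum_{s'} p(s,a,s') * max_{a'} Q(s',a')
# All states, actions, rewards and probabilities below are hypothetical.

GAMMA = 0.9
STATES = ["stable", "repair"]
ACTIONS = ["keep_version", "change_version"]

# REWARD[s][a]: instantaneous reward (negative values act as penalties).
REWARD = {
    "stable": {"keep_version": 0.0, "change_version": -1.0},
    "repair": {"keep_version": -0.5, "change_version": -1.0},
}

# TRANSITION[s][a][s2] = p(s, a, s2); each inner dict sums to 1.
TRANSITION = {
    "stable": {
        "keep_version": {"stable": 1.0, "repair": 0.0},
        "change_version": {"stable": 0.0, "repair": 1.0},
    },
    "repair": {
        "keep_version": {"stable": 0.8, "repair": 0.2},
        "change_version": {"stable": 0.0, "repair": 1.0},
    },
}


def q_iteration(gamma=GAMMA, sweeps=200):
    """Apply Eq. (2) to every (state, action) pair for a fixed number of sweeps."""
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    for _ in range(sweeps):
        new_q = {}
        for s in STATES:
            new_q[s] = {}
            for a in ACTIONS:
                expected = sum(
                    p * max(q[s2].values())
                    for s2, p in TRANSITION[s][a].items()
                )
                new_q[s][a] = REWARD[s][a] + gamma * expected
        q = new_q
    return q


def greedy_policy(q):
    """Pick the action with the largest Q-value in each state."""
    return {s: max(q[s], key=q[s].get) for s in STATES}


if __name__ == "__main__":
    q_table = q_iteration()
    for state in STATES:
        print(state, q_table[state])
    print(greedy_policy(q_table))
```

In this toy model the penalty attached to `change_version` propagates through the discounted sum, so its Q-value ends up lower than that of `keep_version` in both states. This is the model-based form of the recursion; a node that does not know $p(s,a,s')$ would instead estimate the Q-values from observed transitions.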
4. Proposed solution
This section describes the proposed methodology for the detection of the version number attack. This approach uses Q-Learning, which is inherently stochastic, so that it also works in mobile networks. Researchers have developed different algorithms to detect the version number attack, such as threshold-based algorithms. Our approach uses Q-Learning with a discounting factor to make the detection effective and lightweight for constrained networks, specifically for RPL.

4.1. Flow model of attack detection

We first address the version number attack [64], which unnecessarily increases the DIO messages in the network. Each sensor node wants to be part of the latest DODAG. Initially, the root sends the DIO message to the neighbour nodes so that they can join the network. Other sensor nodes multicast the DIO message so that other neighbouring nodes may join the network. A version attacker changes the version number and multicasts the DIO message to other nodes. As a result, the current DODAG no longer exists and the receiving node resets the trickle timer. The nodes send the DIO message with the updated version number to other nodes. It is not easy to detect who changed the version number first, and due to the version change the network is flooded with control messages.

The increase in the number of control messages reduces the packet delivery ratio, since the data packets are not part of the latest DODAG and so do not reach the sink node. The proposed approach uses the Q-Learning technique for detecting the version number attack, which inherently accounts for the stochastic behaviour of the network. So, we propose our implementation for detecting the version number attack using the Q-Learning mechanism to overcome the vulnerabilities in RPL-based networks.

We name this proposed detection technique QSec-RPL (as shown in Fig. 3), which uses the Q-Learning approach to calculate each node’s Q-value based on the number of DIO messages. The attack detection process runs after every $T$ amount of time. Here we consider $T = 120$ s. Experimentally, we have observed that it takes some time for the DODAG to configure the network; $T = 120$ s [65] is sufficient time for DODAG establishment.

According to the Q-Learning principle, this research uses mobile and static nodes that learn from their environment as the simulation progresses. A node receives a penalty depending on its actions in the environment. If the version number changes, the nodes receiving the DIO message get a penalty as per the Q-Learning strategy, which is calculated based on the change in version for that node. The sink node analyses the network behaviour every $T$ seconds to detect the attacker.

Fig. 4 illustrates a 6LoWPAN-based IoT network, where the states of the environment vary with the change in the version number because the DODAG is reconfigured with each version number update. The following are some important points about the environment:

1. The network model represents the environment, i.e., the set-up of the problem where the agents send packets to other agents.
2. After the DODAG is formed, the network remains stable with the same DODAG version until an agent (node) resets the trickle timer.
3. When a node gets a DIO message with a different version number from its neighbour or the sink, it resets the trickle timer and sends the DIO message to others with the new version number. Each node calculates the immediate reward value $R(S, a)$, where $S$ represents the current state and $a$ represents the action, which is a different version number received from another node (agent).

Fig. 5 shows the change in the state when the nodes broadcast the DIO messages in the network. The root multicasts the DIO control packets when the DODAG creation begins. Once the DODAG is established, DIO control messages are not communicated among the nodes until there is an inconsistency in the network. The state transition diagram shows that there are 5 nodes starting with a DIO count of 0. As the multicast
Table 2
Defense mechanisms: IDS for RPL based IoT networks.
S.no | Reference | Mechanism | Summary | Attacks addressed | Limitations
1 | Kasinathan et al. [54] | Signature Based (DEMO) | Detect attacks using Suricata open source IDS. Uses Frequency Agility Manager to operate in different channels. | DoS Attack | No performance study of the IDS is given
2 | Zhang et al. [55] | Specification Based | Uses FSM for monitoring nodes to implement normal and malicious states. Detects attack if any node sends DIO message with lower ETX value. | Routing choice intrusion | Only homogeneous nodes are considered.
3 | Le et al. [56], Le et al. [15] | Specification Based | EFSM created using Integer Linear Programming. Generates RPL trace files to show the legitimate states with transitions | Rank, Local Repair, DIS, Sinkhole attacks | Lacks in implementation and performance analysis.
4 | Surendar and Umamakeswari [57] | Specification Based (InDReS) | Cluster head acts as monitoring node and counts the packet drops of the adjacent nodes. Compare the ranks of the neighbouring nodes with the threshold value and detects the malicious node. | Sinkhole attack | Considers only homogeneous nodes
5 | Mayzaud et al. [58] | Anomaly Based | Two types of node: monitoring and monitored. Monitoring nodes collect data and detects attacks in distributed manner. | DODAG inconsistency | Considers only single attacker. Uses high end machines which adds cost overhead.
6 | Mayzaud et al. [41] | Anomaly Based | Collaboration of monitoring nodes to transfer information using multi instance network. | Version number attack | Only one attacker and no mobility
7 | Mayzaud et al. [59] | Anomaly Based | Monitoring nodes to send information to the root about who changed the Version number called Local Assessment. The Localization algorithm deployed on the sink detects the attacker. | Version number attack | Only one attacker and no mobility considered.
8 | Bostani, Hamid and Sheikhan, Mansour 2017 [60] | Hybrid IDS | Specification module deployed on the router nodes analyse their child nodes and sends the information to the Gateway. Gateway uses anomaly approach uses the Optimum path Forest Clustering on the incoming packets from the router nodes. | Sinkhole, Selective forwarding and Wormhole attacks | Not suitable for the energy constrained environment.
9 | Ioulianou et al. [61] | Signature Based | Uses IDS routers and IDS detectors (Sends malicious info to routers). | DIS, Version Number | Framework is not validated. Only proposed the approach.
10 | Verma and Ranga [17] | Anomaly based | Detection of attacks by using NIDDS17 dataset. It uses ensemble classifiers. | Blackhole, Sybil, Clone ID, Selective Forwarding, Hello Flooding and Local Repair attacks | Mobility of nodes is not considered
11 | Kfoury et al. [62] | Signature Based (SOMIDS) | Perform clustering of traffic classes using Pcap files. Data aggregation on DIS, DAO, DIO, rank, version number change and mote power. | Sinkhole, Version Number and HELLO flooding | Implementation overhead, Real time detection is not possible
12 | Verma and Virender [46] | Anomaly based | Uses the datasets CIDDS-001, UNSW-NB15, and NSL-KDD to detect the attack DoS attack. Implements several ML algorithms and measures the performance in terms of accuracy, FPR, AUC etc. | DoS attack | Mobility of nodes is not considered
13 | Agiollo et al. [10] | Hybrid IDS | Uses dataset RADAR to detect around 14 attacks using NetSim. Anomaly part detects the malicious node since it knows how node behaves when there is no attacker. It uses Auto Regressive Integrated Moving Average (ARIMA) model. Signature part sees the specific patterns in the data. It uses Clone Identity, Change in DODAG, Change in Version or Rank etc. | Routing attacks | Real time detection is not possible and mobility is not considered
14 | Kiran [63] | Anomaly based | Extension to SVELTE. DWA-IDS uses the Nmapper and IDS module. | Routing attacks (WPS) | Mobility is not addressed.
15 | Our proposed approach | Behaviour Based | This lightweight technique uses the Q-Learning strategy to detect the abnormality in the network. | Version number attack | Considers mobility of the nodes and detects the attackers with 91% accuracy. This strategy is suitable for real time detection
begins, the count increases; the child nodes further multicast the DIO message so that each node selects the highest rank parent node for the data transfer among nodes and sink.
The count of DIO messages with penalty is embedded in the Q-Learning mathematical model as represented in Algorithm 1. The DIO message count parameter is used by the Q-Learning process to determine how much it increases with and without an attack. The proposed approach is presented in Algorithm 1. This algorithm represents the changes in the rpl-icmp6.c file of the Contiki operating system.

4.2. Demystifying Algorithm 1: Version Number Attack Detection

Algorithm 1 (QSec-RPL) presents our proposed detection approach. The procedure QSec-RPLProc executes when the node receives a DIO packet. To maintain the integrity of the data, we maintain a static array variable that holds each node’s information. We have also declared a global variable 𝐿𝑖𝑠𝑡[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸], which is the list of 𝑑𝑖𝑜𝑐𝑜𝑢𝑛𝑡 values with the discounting factor 𝛥−𝛿. Each node maintains a record of how many DIO messages it gets from each other node in a 120 s duration. After
every 120 s the client sends this information to the server through UDP packets. Table 4 describes the parameters used in the algorithms.
The server filters the information from clients and generates a graph representing DIO messages coming in and going out among the nodes. The client nodes that gather the data are Z1 nodes, while the server is a border router or the root node. The next subsection discusses how the server uses this information to detect the attack.
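The client-side bookkeeping described above (count DIOs per sender, report every 120 s) can be sketched as follows. This is a minimal illustration in Python rather than the Contiki C code: the class name `DioCounter`, the space-separated payload format, and the choice to reset the counts after each report are assumptions made for the example, not details taken from the paper.

```python
REPORT_INTERVAL_S = 120  # reporting window stated in the text


class DioCounter:
    """Per-node tally of DIO messages received from each neighbour (hypothetical helper)."""

    def __init__(self, max_node):
        self.max_node = max_node
        # Index 0 is unused so that node IDs can start at 1.
        self.counts = [0] * (max_node + 1)
        self.window_start = 0.0

    def on_dio(self, sender_id):
        """Record one DIO message received from `sender_id`."""
        if not 1 <= sender_id <= self.max_node:
            raise ValueError(f"unknown node id {sender_id}")
        self.counts[sender_id] += 1

    def report_due(self, now):
        """True once a full reporting window has elapsed."""
        return now - self.window_start >= REPORT_INTERVAL_S

    def build_report(self, now):
        """Serialise the counts as a UDP payload and start a new window.

        Assumed payload format: decimal counts separated by spaces,
        one entry per node ID from 1 to max_node.
        """
        payload = " ".join(str(c) for c in self.counts[1:])
        self.counts = [0] * (self.max_node + 1)
        self.window_start = now
        return payload


counter = DioCounter(max_node=4)
for sender in (2, 2, 3):
    counter.on_dio(sender)
if counter.report_due(now=120.0):
    print(counter.build_report(now=120.0))  # prints "0 2 1 0"
```

Passing the current time in explicitly keeps the sketch testable without a real timer; on a mote the same role would be played by a periodic timer callback.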
Fig. 5. State transition diagram during DIO message broadcast.
This section describes how the server handles the information sent by each client node in the form of UDP packets. The sink is responsible for detecting the version number attack by accumulating and processing the DIO 𝐿𝑖𝑠𝑡[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸] received from different sensor nodes. Algorithm 2 shows that the server creates a matrix after getting all the 𝐿𝑖𝑠𝑡[] from different nodes.
This 𝐿𝑖𝑠𝑡[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸] table is determined for each node by using the Q-Learning Algorithm 1, which provides the number of DIO messages communicated among the nodes with the penalty 𝛥 − 𝛿. If there is a
change in the version number, the value is modified in the table w.r.t. the particular node. Algorithm 2 provides the global impact on the overall network and shows the significant difference in the DIO count, power consumption and latency, which we describe in Section 5.

Algorithm 2: QSec-RPL: DIO Processing.
1: 𝑀𝐴𝑋_𝑁𝑂𝐷𝐸 ⊳ Global Variables
2: 𝜔[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸], 𝑈𝐷𝑃_𝑝𝑎𝑐𝑘𝑒𝑡 ⊳ Local variables: Array to store the DIO information filtered from the UDP packet and UDP packet received from each sensor node
3: 𝑄_𝑇𝑎𝑏𝑙𝑒_𝑆𝑒𝑟𝑣𝑒𝑟[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸][𝑀𝐴𝑋_𝑁𝑂𝐷𝐸] ⊳ Matrix to store the overall DIO information received from sensor nodes
4: for 𝑖 ← 1 𝑡𝑜 𝑠𝑡𝑟𝑙𝑒𝑛(𝑈𝐷𝑃_𝑝𝑎𝑐𝑘𝑒𝑡) do
5: 𝑐 = 𝑈𝐷𝑃_𝑝𝑎𝑐𝑘𝑒𝑡[𝑘]
6: 𝑑𝑖𝑔𝑖𝑡 = 𝑐 − '0' ⊳ Convert the character into digit
7: 𝜔[𝑖] = 𝜔[𝑖] ∗ 10 + 𝑑𝑖𝑔𝑖𝑡 ⊳ Convert into number and store into the array.
8: end for
9: for 𝑗 ← 1 𝑡𝑜 𝑀𝐴𝑋_𝑁𝑂𝐷𝐸 do
10: 𝑄_𝑇𝑎𝑏𝑙𝑒_𝑆𝑒𝑟𝑣𝑒𝑟[𝑖𝑛𝑑𝑒𝑥][𝑗] = 𝜔[𝑗] ⊳ index represents the sensor node ID
11: end for
12: end procedure

Algorithm 2 represents the changes made in 𝑆𝑒𝑟𝑣𝑒𝑟.𝑐, which gathers UDP packets from different sensor nodes, filters them to get the DIO related information and stores it in the array 𝑄_𝑇𝑎𝑏𝑙𝑒_𝑆𝑒𝑟𝑣𝑒𝑟[𝑀𝐴𝑋_𝑁𝑂𝐷𝐸][𝑀𝐴𝑋_𝑁𝑂𝐷𝐸]. This matrix, which is depicted in Section 5, shows a significant difference in the results when there is a version number attack in the network.

Algorithm 3: QSec-RPL: UDP Client Send Procedure.

4.5. Q-Learning table

To understand the concept of Q-Learning for the detection of version number attack, we take an environment having 10 nodes. We simulate the results using Contiki Cooja, which depicts different environment states when there is an attack and no attack.
In Fig. 7, the Q-Table shows the total control messages transferred among the nodes when there is no version number attack. After 15 min of the simulation, we can see that the total count for the control messages is comparatively small. Column 1 signifies the sink node. Here, column 1 index is 0 because we are assuming that the server node is not the attacker node. The number of DIOs received by the server node is not a useful metric for detecting the attacker node.
In Fig. 8, when the nodes are not moving and we make node 10 the attacker node, the total control message count increases significantly. This increase is due to the version number attack, which forces the network to reconfigure itself most of the time. Eq. (3) calculates the % difference.

\[ \%\,\mathit{increase} = \frac{\mathit{NewNumber} - \mathit{OriginalNumber}}{\mathit{OriginalNumber}} \times 100 \qquad (3) \]

Applying Eq. (3) to the scenario with a version number attack in the network gives an increase of 582.9%, which is enormous. Fig. 9 also shows that 𝑛𝑜𝑑𝑒1 is the sink node, which is not considered as the attacker. Here 𝑛𝑜𝑑𝑒10 is treated as the attacker, as the number of DIO messages communicated with this node is the maximum, i.e. 242. This table is generated using Algorithm 1, which uses the DIO count and the penalty mechanism of Q-Learning.
We get similar results when about 50% of the nodes are moving. Here 𝑛𝑜𝑑𝑒2, 𝑛𝑜𝑑𝑒3, 𝑛𝑜𝑑𝑒5 and 𝑛𝑜𝑑𝑒8 are mobile nodes. The % increase w.r.t. the normal scenario is 394.4%, which is very high but less than in the static attack scenario; the difference is due to mobility. Here, the calculations are primarily based on DIO messages, since the version number attack increases the DIO messages in the network compared to the other control messages.
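To make the percentage-increase calculation concrete, here is a small Python sketch of the formula used in this subsection. The function names and the example totals (200 and 500) are hypothetical; the paper reports only the resulting percentages (582.9% and 394.4%), not the underlying totals.

```python
def percent_increase(original, new):
    """Percentage increase of `new` over `original`, as in Eq. (3)."""
    if original == 0:
        raise ValueError("original count must be non-zero")
    return (new - original) / original * 100.0


def growth_factor(percent):
    """How many times larger the new count is, given a percentage increase."""
    return 1.0 + percent / 100.0


# Hypothetical totals, chosen only to illustrate the arithmetic.
print(percent_increase(200, 500))      # 150.0
# The reported increases correspond to these multipliers of the baseline:
print(round(growth_factor(582.9), 2))  # static attack scenario
print(round(growth_factor(394.4), 2))  # mobile attack scenario
```

Read this way, an increase of 582.9% means the control message count under attack is roughly 6.8 times the no-attack count, and 394.4% corresponds to roughly 4.9 times.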
7: 𝐼𝑄𝑅 = (𝛾 − 𝛼)
8: 𝜁 = 𝛾 + 1.5 ∗ 𝐼𝑄𝑅 ⊳ Upper limit threshold
9: if 𝑓𝑙𝑎𝑔𝐵𝑖𝑡 == 1 then
10: for 𝑖 ← 1 𝑡𝑜 𝑀𝐴𝑋_𝑁𝑂𝐷𝐸 do
11: if 𝑄−𝑇𝑎𝑏𝑙𝑒[𝑖] ≥ 𝜁 then
12: 𝑑𝑒𝑛𝑦𝑙𝑖𝑠𝑡−𝑡𝑎𝑏𝑙𝑒[𝑖] = 𝑄−𝑇𝑎𝑏𝑙𝑒[𝑖]
13: end if
14: end for
15: else 𝑃𝑟𝑖𝑛𝑡(𝑁𝑜 𝑚𝑎𝑙𝑖𝑐𝑖𝑜𝑢𝑠 𝑛𝑜𝑑𝑒 𝑓𝑜𝑢𝑛𝑑)
16: end if
17: end procedure
Fig. 7. Network state without attack.
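Steps 7 to 16 of the listing above apply an interquartile-range upper fence to the per-node DIO counts. The Python sketch below illustrates that idea under stated assumptions: 𝛼 and 𝛾 are taken to be the first and third quartiles (the earlier steps that define them are not shown here), `version_changed` stands in for 𝑓𝑙𝑎𝑔𝐵𝑖𝑡, the quartile interpolation method is a free choice, and all counts except the 242 reported for node 10 are invented for the example.

```python
import statistics


def upper_threshold(values):
    """Upper fence: Q3 + 1.5 * IQR (quartile method is an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q3 + 1.5 * iqr


def build_denylist(dio_counts, version_changed):
    """Return {node_id: count} for nodes at or above the upper fence.

    dio_counts maps node ID -> DIO count taken from the Q-Table.
    version_changed plays the role of flagBit in the listing.
    """
    if not version_changed:
        return {}  # corresponds to "No malicious node found"
    zeta = upper_threshold(list(dio_counts.values()))
    return {node: count for node, count in dio_counts.items() if count >= zeta}


# Hypothetical counts for a 10-node network; only node 10's value (242)
# comes from the text.
counts = {1: 0, 2: 20, 3: 22, 4: 25, 5: 24, 6: 21, 7: 23, 8: 26, 9: 22, 10: 242}
print(build_denylist(counts, version_changed=True))  # {10: 242}
```

An upper fence of this kind adapts to the traffic level of each run, which is one plausible reason to prefer it over a fixed DIO-count threshold.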
5. Performance evaluation
Table 5
Experimental simulation parameters.
Parameters | Value
Grid size | 200 m × 200 m
Number of sensor nodes | Up to 70
Gateway nodes | 1
Radio medium | Unit Disk Graph Medium
PHY and MAC layer | IEEE 802.15.4
Transmission range | 50 m
Interference range | 100 m
Number of Attacker nodes | 10%, 20%, 30%
Number of Mobile nodes | 50%
Speed of node | 1–2 m/s for mobile nodes
Data packet size | 30 bytes
Data packet sending interval | 60 s

Table 6
Zolertia Z1 specifications [68].
Parameters | Value
Low Power Microcontroller | MSP430F2617
RISC based CPU Clock Speed | 16 MHz
RAM Size | 8 KB
ROM Size | 92 KB
Transceiver | CC2420 (IEEE 802.15.4 compliant)
Power Consumption Active Mode | 365 μA (2.2 V)
Power Consumption Standby Mode | 0.5 μA (2.2 V)
Power Consumption TX Mode | 17.4 mA (2.1–3.6 V)
Power Consumption RX Mode | 18.8 mA (2.1–3.6 V)
Power Consumption INT Mode | 17.4 mA (2.1–3.6 V)

of the Contiki to implement the version number attack. The malicious node, when it starts attacking the network, changes the Version Number of the DODAG, which destabilizes the network. It causes the node to reset the 𝑡𝑟𝑖𝑐𝑘𝑙𝑒 𝑡𝑖𝑚𝑒𝑟; due to the version change, many nodes are in an inconsistent state, which causes the root to reset the 𝑡𝑟𝑖𝑐𝑘𝑙𝑒 𝑡𝑖𝑚𝑒𝑟 too. The root tries to reconfigure the network again, and this problem persists as the attacker keeps propagating DIO messages with a different, increased version number.
The detection strategy makes changes in the rpl-icmp6.c, udp-client.c and udp-server.c files of the Contiki operating system by implementing the procedures described in Section 4. The placement of the malicious node is critical; if it is closer to the sink node, the network becomes unstable as soon as the attacker starts attacking [59].
This implementation uses Cooja as the emulator. Cooja has the ability to deliver precise evaluation results. To create realistic models, it integrates a hardware emulator called MSPsim, which runs the same binary code as the sensor devices. We use the Z1 platform, which acts as a 6LoWPAN node. The Unit Disk Graph Medium (UDGM) radio model is used in this study. The simulation was run on a 200 m × 200 m grid with the number of nodes ranging from 10 to 70. One server node is present in each network scenario, and it receives the DIO count from each sensor node. The Random Waypoint Mobility Model is used to mimic node mobility, and node speeds range from 1–2 m∕s.
This experiment uses Cooja’s power trace to compute the radio statistics, i.e. the time the radio is on during transmission, reception and interference [47]. The Power-Trace focuses on the absolute power consumed by the sensor board. This analysis shows the impact on the performance indicators, i.e. power consumption, average end-to-end delay, and packet delivery ratio for the attack model.

5.2. Attack model assumptions

This model assumes 𝑁 nodes in the 6LoWPAN network, which comprises one sink node or border router, while the rest are sensor nodes that communicate data to the sink through neighbour nodes. Table 5 lists the parameters for the attack model. We also assume that the sink is not the attacker node.
This dynamic network model, whose simulation grid size is 200 × 200, uses up to 70 nodes for the performance evaluation. Here we assume one of the nodes is the sink node, and the rest are sensor nodes. We also assume 50% of the nodes are moving in the mobile network with a speed of 1–2 m∕s using the Random Way Point mobility model [69].
The proposed detection scheme aims to detect the version number attack in RPL based IoT networks and works for mobile and static networks with the following assumptions:
1. The detection of the version number attack starts after 120 s, since some time is required for the network to stabilize, after which the attacker starts the attack in the network.
2. The attacker node is a resource rich sensor node.
3. We perform multiple simulation experiments using the COOJA simulator for mobile and static networks to validate the results.
4. Dynamic analysis of the network is performed through the sink node to inform the sensor nodes about the attack.

5.3. Performance indicators

The attack’s impact on the network is evaluated using the following metrics:
• Number of DIOs transmitted: This indicator shows the significant increase in DIO packet throughput among the nodes when there is an attack. Experimentally we also see that the throughput of control packets like DIS and DAO does not change significantly when there is an attack in the network. So we apply the Q-Learning mechanism only to the DIO packets, which is the key performance indicator.
• Power Consumption: This indicator shows how much power is consumed during the attack and no attack conditions, i.e. for how much time the radio transceiver remained in different states (ON, INT, RX, TX).
• Packet delivery ratio (PDR): It is the ratio of the number of packets received by the Gateway to the total number of packets sent by sensor nodes.
• Average end-to-end delay (AE2ED): This is the ratio of the total time taken by the successfully delivered packets to reach the Gateway to the number of such packets, without considering the unsuccessful packets.
The performance of the proposed QSec-RPL approach is evaluated using the performance indicators Accuracy, Recall or True Positive Rate, True Negative Rate, False Positive Rate, False Negative Rate, and Precision.

6. Simulation results

This section shows the results that we achieved by implementing the QSec-RPL algorithm in Contiki’s rpl-icmp6.c file and in udp-client.c and udp-server.c. The results that we achieved experimentally in these scenarios show that mobility consumes a lot of power and also increases delay. PDR drops significantly when nodes are mobile. End-to-end delay is very high when the nodes are mobile, and it is very high at 1-hop with the increase in the number of attackers. Similarly, power consumption increases with the mobility of nodes. This paper presents an intensive analysis of different metrics in RPL based mobile networks and provides suggestions for any Intrusion Detection System (IDS) employing mobility in its solution.

6.1. Impact of version number attack

The version number attack’s effects are described in this subsection. We show the results using PDR, AE2ED, and Power Consumption as metrics.
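The PDR and AE2ED definitions from Section 5.3 can be written out as a short Python sketch. The function names, the dictionary-based packet log, and the example timestamps are all assumptions made for illustration; they are not taken from the authors' Cooja scripts.

```python
def packet_delivery_ratio(received, sent):
    """Packets received by the gateway divided by packets sent by sensor nodes."""
    if sent == 0:
        raise ValueError("no packets were sent")
    return received / sent


def average_end_to_end_delay(send_times, recv_times):
    """Mean delay over delivered packets only.

    send_times and recv_times map packet ID -> timestamp in seconds.
    Packets missing from recv_times were lost and are ignored.
    """
    delays = [recv_times[p] - send_times[p] for p in recv_times if p in send_times]
    if not delays:
        raise ValueError("no packet was delivered")
    return sum(delays) / len(delays)


# Hypothetical log: four packets sent at 60 s intervals, packet 3 is lost.
sent_log = {1: 0.0, 2: 60.0, 3: 120.0, 4: 180.0}
recv_log = {1: 0.5, 2: 60.25, 4: 180.75}

print(packet_delivery_ratio(len(recv_log), len(sent_log)))  # 0.75
print(average_end_to_end_delay(sent_log, recv_log))         # 0.5
```

Because lost packets are excluded from AE2ED, a run with heavy loss can show a low delay alongside a poor PDR, so the two indicators are best read together.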
Fig. 11. Impact on PDR with and without mobility.
Fig. 12. Average End to End delay: mobile and static nodes.
Table 7
Efficiency-parameters.
% of attackers TPR FNR FPR TNR Precision Recall Accuracy Fooling rate
Number of nodes : 10
0% – – 0 1 – – 1 –
10.0% 1 0 0 1 1 1 1 0
20.0% 1 0 0 1 1 1 1 0
30.0% 0.67 0.33 0 1 1 0.67 0.90 0.33
Number of nodes : 20
0% – – 0 1 – – 1 0
10.0% 0.5 0.5 0 1 1 0.5 0.95 0.50
20.0% 0.75 0.25 0 1 1 0.75 0.95 0.25
30.0% 0.67 0.33 0.071 0.93 0.8 0.67 0.85 0.33
Number of nodes : 30
0% – – 0 1 – – 1 –
10.0% 0.67 0.33 0.0744 0.93 0.5 0.667 0.90 0.33
20.0% 0.67 0.33 0.083 0.92 0.67 0.67 0.87 0.33
30.0% 0.78 0.22 0.14 0.86 0.70 0.78 0.83 0.22
Number of nodes : 40
0% – – 0 1 – – 1 –
10.0% 0.75 0.25 0.083 0.92 0.50 0.75 0.90 0.25
20.0% 0.75 0.25 0.125 0.875 0.60 0.75 0.85 0.25
30.0% 0.83 0.17 0.18 0.82 0.67 0.83 0.825 0.17
Number of nodes : 50
0% – – 0 1 – – 1 –
10.0% 0.80 0.20 0.07 0.93 0.57 0.80 0.92 0.20
20.0% 0.75 0.25 0.13 0.87 0.64 0.75 0.84 0.30
30.0% 0.80 0.20 0.20 0.8 0.63 0.80 0.80 0.20
Number of nodes : 60
0% – – 0 1 – – 1 –
10.0% 0.83 0.17 0.055 0.944 0.625 0.83 0.93 0.167
20.0% 0.75 0.25 0.104 0.89 0.64 0.75 0.86 0.25
30.0% 0.72 0.27 0.16 0.83 0.65 0.72 0.8 0.27
Number of nodes : 70
0% – – 0 1 – – 1 –
10.0% 0.71 0.28 0.13 0.87 0.38 0.71 0.85 0.28
20.0% 0.64 0.35 0.16 0.84 0.5 0.64 0.8 0.35
30.0% 0.67 0.33 0.204 0.79 0.58 0.66 0.75 0.33
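As a cross-check on how the columns of Table 7 relate to one another, the following Python sketch derives them from confusion-matrix counts. The function name is hypothetical, and the counts used in the example (2 true positives, 1 false negative, 0 false positives, 7 true negatives) are inferred from the 10-node, 30% attacker row rather than stated in the paper. Fooling rate follows the text's definition: attackers identified as legitimate over all attackers.

```python
def detection_metrics(tp, fn, fp, tn):
    """Rates from confusion-matrix counts; None where a rate is undefined."""

    def ratio(num, den):
        return num / den if den else None

    attackers = tp + fn   # actual attacker nodes
    legitimate = fp + tn  # actual legitimate nodes
    return {
        "tpr": ratio(tp, attackers),  # recall
        "fnr": ratio(fn, attackers),
        "fpr": ratio(fp, legitimate),
        "tnr": ratio(tn, legitimate),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, attackers + legitimate),
        "fooling_rate": ratio(fn, attackers),
    }


# Inferred counts for 10 nodes with 30% attackers (3 attackers, 7 legitimate).
row = detection_metrics(tp=2, fn=1, fp=0, tn=7)
print({name: round(value, 2) for name, value in row.items()})

# With no attackers, the attacker-based rates are undefined, which matches
# the dashes in the 0% rows of the table.
print(detection_metrics(tp=0, fn=0, fp=0, tn=10))
```

With these counts the sketch gives TPR 0.67, FNR 0.33, FPR 0, TNR 1, precision 1 and accuracy 0.90, in line with the corresponding row of the table.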
get control packets from the legitimate nodes due to the mobility of the nodes. Because of this, the attacker is not able to multicast these packets with the modified version number, and the system records such nodes as False Negatives. Fooling rate represents the fraction of attacker nodes identified as legitimate nodes during an attack over the total number of attacker nodes. In this experiment the fooling rate is low, which supports our proposed approach.
Some nodes can get many control packets and exceed the threshold limits, so the model identifies them as attacker nodes (False Positives). However, the False Positive rate is very low, as shown in Table 7. The accuracy of the proposed lightweight solution is significant, so this solution can be incorporated in real applications. On average, the proposed solution with up to 70 nodes gives an accuracy of around 90%. This is shown in Fig. 14. We get high TPR and TNR and low FPR and FNR, i.e. our model is predicting well. The model provides a good recall rate for the positive class, which supports our proposed solution.

6.3. Implementation overhead

The Zolertia Z1 mote has 8 KB of RAM and 92 KB of ROM [68]. We incorporated our proposed solution into the ContikiRPL implementation and performed an evaluation study. It can be observed in Fig. 15 that the
Table 8
Comparison with conventional message authentication codes.
S.No. Algorithm Communication Remarks
Modulo Hash Storage Compute Buffer Communication
operation function cost cost required overhead
1 Message Unicast Yes Yes High 𝑂(𝑛) for each n Yes Yes. 128 bit message
Digest (MD) bytes [77] digest
2 Secure Hash Al- Unicast Yes Yes High 𝑂(𝑛) for each n Yes Yes. 512 bit message
gorithms (SHA) bytes [77] digest
3 Hash-based Unicast Yes Yes High 𝑂(𝑛) [78,79] Yes Yes. 160 bit message
Message digest
Authentication
Code (H-MAC)
4 TESLA Multicast Yes Yes High 𝑂(𝑛) [76] One MAC Yes Yes. 160 bit message
function digest
computation
overhead per packet
5 Proposed Multicast No No Low Low 𝑂(𝑛) as No 16 bit for a node.
Approach discussed in Piggyback with the data
(QSec-RPL) Section 7 packet. Only when there
is change in the version
number. Nodes
communicates DIO count.
remains the same. This algorithm has 𝑂(𝑛) time complexity, where 𝑛 is the number of nodes in the network. Similarly, the DIO-P process takes the UDP packet as input and filters out the numerical values from it to store the information into the global variable 𝑄_𝑇𝑎𝑏𝑙𝑒_𝑆𝑒𝑟𝑣𝑒𝑟[][], i.e. the updated DIO count for the node, as received from the specific node. This algorithm also has a time complexity of 𝑂(𝑛), where 𝑛 is the length of the UDP packet. So overall, our implementation takes 𝑂(𝑛) + 𝑂(𝑛) = 𝑂(𝑛).
Another vital aspect of our implementation is that the proposed approach is feasible for networks other than 6LoWPAN. This approach can be used wherever networks are constrained in terms of power, energy and memory. Since our implementation is not architecture dependent, we can also apply this approach to Wireless Sensor Networks, Wired Networks, Mobile Ad hoc Networks, Vehicular Ad hoc Networks and Flying Ad hoc Networks.
This section provides some insights into our research and how it can be helpful for other types of attacks. The technique could reasonably be applied to detect Blackhole and Selective-Forwarding attacks. A Blackhole attack results in DoS because the malicious node drops all the packets. Our proposed approach could find the attacker node because it applies the Q-Learning approach to assess whether the node is benign or malicious. We could also apply this approach to the Selective-Forwarding attack, as the behaviour of the node can be calculated mathematically to identify whether it is the attacker or not. The approach is useful for developing a Trust-Based Intrusion Detection System (IDS) for different attacks in different networks, as the approach is independent of the network architecture.

9. Conclusion and future work

By taking into account mobile nodes and version number attacks, this work was able to determine the impact on measures such as packet delivery ratio, power consumption, and end-to-end delay. We can extend this simulation to different attacks like the Rank attack, Hello Flood attack and Hatchetman attack. Most of the research does not consider the mobility of the nodes, so we plan to analyse the impact of different types of attacks under mobility.
In future work, we plan to implement different attacks and generate data for implementing the Intrusion Detection System. We also plan to perform the simulation using the Multi-path Ray-tracer Medium (MRM), which is more realistic as compared to UDGM [80]. The future work also aims at mitigating the attacks while keeping the mobility scenario in RPL based networks. In future, we plan to generate a dataset for different attacks and see the behaviour of the network. The dataset will be used for detecting the attack using machine learning techniques. We also plan to detect and mitigate coordinated attacks in IoT applications.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

References

[1] M.R. Palattella, M. Dohler, A. Grieco, G. Rizzo, J. Torsner, T. Engel, L. Ladid, Internet of things in the 5G era: Enablers, architecture, and business models, IEEE J. Select. Areas Commun. 34 (2016) 510–527.
[2] Statista, Internet of Things (IoT) connected devices installed base worldwide from 2015 to 2025, 2022, Accessed 03 February 2022.
[3] H. Tankovska, Forecast end-user spending on IoT solutions worldwide from 2017 to 2025, 2022, Accessed 03 February 2022.
[4] A.O. Bang, U.P. Rao, P. Kaliyar, M. Conti, Assessment of routing attacks and mitigation techniques with RPL control messages: A survey, ACM Comput. Surv. 55 (2022) 1–36.
[5] G. Mulligan, The 6lowpan architecture, in: Proceedings of the 4th Workshop on Embedded Networked Sensors, 2007, pp. 78–82.
[6] G. Sharma, J. Grover, A. Verma, R. Kumar, R. Lahre, Analysis of hatchetman attack in RPL based IoT networks, in: International Conference on Emerging Technologies in Computer Engineering, Springer, 2022, pp. 666–678.
[7] A. Verma, V. Ranga, The impact of copycat attack on RPL based 6LoWPAN networks in internet of things, Computing (2020) 1–22.
[8] T. Winter, P. Thubert, A. Brandt, J.W. Hui, R. Kelsey, P. Levis, K. Pister, R. Struik, J.P. Vasseur, R.K. Alexander, et al., RPL: IPv6 routing protocol for low-power and lossy networks, RFC 6550 (2012) 1–157.
[9] A. Verma, V. Ranga, Security of RPL based 6LoWPAN networks in the internet of things: A review, IEEE Sens. J. 20 (2020) 5666–5690.
[10] A. Agiollo, M. Conti, P. Kaliyar, T. Lin, L. Pajola, DETONAR: Detection of routing attacks in RPL-based IoT, IEEE Trans. Netw. Serv. Manag. (2021).
[11] CISOMAG, 10 IoT security incidents that make you feel less secure, 2020, Accessed 10 January 2020.
[12] Y.L. Eyal Itkin, Y. Balmas, Faxploit: Sending fax back to the dark ages, 2018, Accessed 12 August 2018.
[13] S.M. Muzammal, R.K. Murugesan, N. Jhanjhi, A comprehensive review on secure routing in internet of things: Mitigation methods and trust-based approaches, IEEE Internet Things J. (2020).
[14] I. Butun, P. Österberg, H. Song, Security of the internet of things: Vulnerabilities, attacks, and countermeasures, IEEE Commun. Surv. Tutor. 22 (2019) 616–644.
[15] A. Le, J. Loo, K.K. Chai, M. Aiash, A specification-based for detecting attacks on RPL-based network topology, Information 7 (2016) 25.
[16] M.N. Napiah, M.Y.I.B. Idris, R. Ramli, I. Ahmedy, Compression header analyzer intrusion detection system (CHA-IDS) for 6LoWPAN communication protocol, IEEE Access 6 (2018) 16623–16638.
[17] A. Verma, V. Ranga, ELNIDS: Ensemble learning based network intrusion detection system for RPL based Internet of Things, in: 2019 4th International Conference on Internet of Things: Smart Innovation and Usages, IoT-SIU, IEEE, 2019b, pp. 1–6.
[18] A. Verma, V. Ranga, CoSec-RPL: Detection of copycat attacks in RPL based 6LoWPANs using outlier analysis, Telecommun. Syst. 75 (2020) 43–61.
[19] P. Pace, G. Aloi, R. Gravina, G. Caliciuri, G. Fortino, A. Liotta, An edge-based architecture to support efficient applications for healthcare industry 4.0, IEEE Trans. Ind. Inform. 15 (2018) 481–489.
[20] J. Xu, G. Solmaz, R. Rahmatizadeh, D. Turgut, L. Boloni, Internet of things applications: Animal monitoring with unmanned aerial vehicle, 2016, arXiv preprint arXiv:1610.05287.
[21] J. Granjal, E. Monteiro, J.S. Silva, Security for the internet of things: A survey of existing protocols and open research issues, IEEE Commun. Surv. Tutor. 17 (2015) 1294–1312.
[22] M.R. Palattella, N. Accettura, X. Vilajosana, T. Watteyne, L.A. Grieco, G. Boggia, M. Dohler, Standardized protocol stack for the internet of (important) things, IEEE Commun. Surv. Tutor. 15 (2013) 1389–1406.
[23] O. Gnawali, P. Levis, The ETX objective function for RPL, draft-gnawali-roll-etxof-01, 2010.
[24] Levis Gnawali, The minimum rank with hysteresis objective function (MRHOF), 2012, IETF, CA, USA, RFC 6719.
[25] P. Thubert, Objective Function Zero for the Routing Protocol for Low-Power and Lossy Networks (RPL), Technical Report, 2012.
[26] O. Gaddour, A. Koubâa, RPL in a nutshell: A survey, Comput. Netw. 56 (2012) 3163–3178.
[27] J. Vasseur, N. Agarwal, J. Hui, Z. Shelby, P. Bertrand, C. Chauvenet, RPL: The IP routing protocol designed for low power and lossy networks, Internet Protocol Smart Objects (IPSO) Alliance 36 (2011) 1–20.
[28] A.M. Raoof, Secure Routing and Forwarding in RPL-Based Internet of Things: Challenges and Solutions (Ph.D. thesis), Carleton University, 2021.
[29] J. Tournier, F. Lesueur, F. Le Mouël, L. Guyon, H. Ben-Hassine, A survey of IoT protocols and their security issues through the lens of a generic IoT stack, Internet Things 16 (2021) 100264.
[30] H. HaddadPajouh, A. Dehghantanha, R.M. Parizi, M. Aledhari, H. Karimipour, A survey on internet of things security: Requirements, Challenges, and Solutions, Internet Things 14 (2021) 100129.
[31] Tomic, McCann, A Survey of potential security issues in existing wireless sensor network protocols, IEEE Internet Things J. 4 (2017) 1910–1923.
[32] D. Zhang, F.R. Yu, R. Yang, A machine learning approach for software-defined vehicular ad hoc networks with trust management, in: 2018 IEEE Global Communications Conference, GLOBECOM, IEEE, 2018, pp. 1–6.
[33] H. Van Hasselt, A. Guez, D. Silver, Deep reinforcement learning with double q-learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2016.
[34] A.G. Barto, P.S. Thomas, R.S. Sutton, Some recent applications of reinforcement learning, in: Proceedings of the Eighteenth Yale Workshop on Adaptive and Learning Systems, 2017.
[35] M. Corazza, A. Sangalli, Q-Learning and Sarsa: A Comparison Between Two Intelligent Stochastic Control Approaches for Financial Trading, University Ca’Foscari of Venice, Dept. of Economics Research Paper Series No 15, 2015.
[36] F. Gara, L.B. Saad, R.B. Ayed, An efficient intrusion detection system for selective forwarding and clone attackers in ipv6-based wireless sensor networks under mobility, Int. J. Semant. Web Inf. Syst. (IJSWIS) 13 (2017) 22–47.
[37] Z. A Almusaylim, N. Jhanjhi, A. Alhumam, Detection and mitigation of RPL rank and version number attacks in the internet of things: SRPL-RP, Sensors 20 (2020) 5997.
[38] A. Arış, S.B.Ö. Yalçın, S.F. Oktuğ, New lightweight mitigation techniques for RPL version number attacks, Ad Hoc Netw. 85 (2019) 81–91.
[39] F. Ahmed, Y.B. Ko, A distributed and cooperative verification mechanism to defend against DODAG version number attack in RPL, in: PECCS, 2016, pp. 55–62.
[40] S.S. Ambarkar, N. Shekokar, A secure model to protect healthcare IoT system from version number and rank attack, J. Univ. Shanghai Sci. Technol.
[41] A. Mayzaud, R. Badonnel, I. Chrisment, Detecting version number attacks in RPL-based networks using a distributed monitoring architecture, in: 2016 12th International Conference on Network and Service Management, CNSM, IEEE, 2016a, pp. 127–135.
[42] M.D. Momand, M.K. Mohsin, et al., Machine learning-based multiple attack detection in RPL over IoT, in: 2021 International Conference on Computer Communication and Informatics, ICCCI, IEEE, 2021, pp. 1–8.
[43] M. Osman, J. He, F.M.M. Mokbal, N. Zhu, S. Qureshi, ML-LGBM: A machine learning model based on light gradient boosting machine for the detection of version number attacks in RPL-based networks, IEEE Access (2021).
[44] A.A. Anitha, L. Arockiam, VeNADet: Version number attack detection for RPL based internet of things, Solid State Technol. 64 (2021) 2225–2237.
[45] A. Verma, V. Ranga, Addressing flooding attacks in IPv6-based low power and lossy networks, in: TENCON 2019-2019 IEEE Region 10 Conference, TENCON, IEEE, 2019a, pp. 552–557.
[46] A. Verma, R. Virender, Machine learning based intrusion detection systems for IoT applications, Wirel. Pers. Commun. 111 (2020) 2287–2310.
[47] A. Verma, V. Ranga, Mitigation of dis flooding attacks in RPL-based 6lowpan networks, Trans. Emerg. Telecommun. Technol. 31 (2020) e3802.
[48] M. Sharma, H. Elmiligi, F. Gebali, A. Verma, Simulating attacks for RPL and generating multi-class dataset for supervised machine learning, in: 2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference, IEMCON, IEEE, 2019, pp. 0020–0026.
[49] S. Murali, A. Jamalipour, A lightweight intrusion detection for sybil attack under mobile RPL in the internet of things, IEEE Internet Things J. 7 (2019) 379–388.
[50] I. Wadhaj, B. Ghaleb, C. Thomson, A. Al-Dubai, W.J. Buchanan, Mitigation mechanisms against the DAO attack on the routing protocol for low power and lossy networks (RPL), IEEE Access 8 (2020) 43665–43675.
[51] C. Pu, Sybil attack in RPL-based internet of things: Analysis and defenses, IEEE Internet Things J. 7 (2020) 4937–4949.
[52] A. Raoof, A. Matrawy, C.H. Lung, Routing attacks and mitigation methods for RPL-based internet of things, IEEE Commun. Surv. Tutor. 21 (2018) 1582–1606.
[53] A. Verma, V. Ranga, The impact of copycat attack on RPL based 6lowpan networks in internet of things, Computing 103 (2021) 1479–1500.
[54] P. Kasinathan, G. Costamagna, H. Khaleel, C. Pastrone, M.A. Spirito, DEMO: An IDS framework for internet of things empowered by 6LoWPAN, in: Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, ACM, New York, NY, USA, 2013, pp. 1337–1340, http://dx.doi.org/10.1145/2508859.2512494.
[55] L. Zhang, G. Feng, S. Qin, Intrusion detection system for RPL from routing choice intrusion, in: 2015 IEEE International Conference on Communication Workshop, ICCW, IEEE, 2015, pp. 2652–2658.
[56] A. Le, J. Loo, Y. Luo, A. Lasebae, Specification-based IDS for securing RPL from topology attacks, in: 2011 IFIP Wireless Days, WD, IEEE, 2011, pp. 1–3.
[57] M. Surendar, A. Umamakeswari, Indres: An intrusion detection and response system for internet of things with 6lowpan, in: 2016 International Conference on Wireless Communications, Signal Processing and Networking, WiSPNET, IEEE, 2016, pp. 1903–1908.
[58] A. Mayzaud, A. Sehgal, R. Badonnel, I. Chrisment, J. Schönwälder, Using the RPL protocol for supporting passive monitoring in the internet of things, in: NOMS 2016-2016 IEEE/IFIP Network Operations and Management Symposium, IEEE, 2016b, pp. 366–374.
[59] A. Mayzaud, R. Badonnel, I. Chrisment, A distributed monitoring strategy for detecting version number attacks in RPL-based networks, IEEE Trans. Netw. Serv. Manag. 14 (2017) 472–486.
[60] Hamid Bostani, Mansour Sheikhan, Hybrid of anomaly-based and specification-based IDS for internet of things using unsupervised OPF based on MapReduce approach, Comput. Commun. 98 (2017) 52–71.
[61] P. Ioulianou, V. Vasilakis, I. Moscholios, M. Logothetis, A signature-based intrusion detection system for the internet of things, Inf. Commun. Technol. Form (2018).
[62] E. Kfoury, J. Saab, P. Younes, R. Achkar, A self organizing map intrusion detection system for RPL protocol attacks, Int. J. Interdiscipl. Telecommun. Network. (IJITN) 11 (2019) 30–43.
[63] U. Kiran, IDS to detect worst parent selection attack in RPL-based IoT network, in: 2022 14th International Conference on Communication Systems & NetworkS, COMSNETS, IEEE, 2022, pp. 769–773.
[64] G. Sharma, J. Grover, A. Verma, Performance evaluation of mobile RPL-based IoT networks under version number attack, Comput. Commun. 197 (2023) 12–22, http://dx.doi.org/10.1016/j.comcom.2022.10.014, URL: https://www.sciencedirect.com/science/article/pii/S0140366422004029.
G. Sharma et al. Ad Hoc Networks 142 (2023) 103118
[65] H. Kermajani, C. Gomez, On the network convergence process in RPL over IEEE 802.15.4 multihop networks: Improvement and trade-offs, Sensors 14 (2014) 11993–12022.
[66] D.C. Hoaglin, John W. Tukey and data analysis, Stat. Sci. (2003) 311–318.
[67] P. Kugler, P. Nordhus, B. Eskofier, Shimmer, Cooja and Contiki: A new toolset for the simulation of on-node signal processing algorithms, in: 2013 IEEE International Conference on Body Sensor Networks, IEEE, 2013, pp. 1–6.
[68] Zolertia, Z1 Datasheet.
[69] C. Bettstetter, H. Hartenstein, Pérez-Costa, Stochastic properties of the random waypoint mobility model, Wirel. Netw. 10 (2004) 555–567.
[70] D. Boneh, C. Gentry, B. Lynn, H. Shacham, et al., A survey of two signature aggregation techniques, 2003.
[71] A. Shamir, E. Tromer, On the cost of factoring RSA-1024, RSA CryptoBytes 6 (2003) 10–19.
[72] B.A. Forouzan, D. Mukhopadhyay, Cryptography and Network Security. Vol. 12, McGraw Hill Education (India) Private Limited New York, NY, USA, 2015.
[73] R. Pappu, B. Recht, J. Taylor, N. Gershenfeld, Physical one-way functions, Science 297 (2002) 2026–2030.
[74] A.E. Hajjar, G. Roussos, M. Paterson, On the performance of key pre-distribution for RPL-based IoT networks, in: Interoperability, Safety and Security in IoT, Springer, 2016, pp. 67–78.
[75] P. Ilia, G. Oikonomou, T. Tryfonas, Cryptographic key exchange in ipv6-based low power, lossy networks, in: IFIP International Workshop on Information Security Theory and Practices, Springer, 2013, pp. 34–49.
[76] A. Perrig, R. Canetti, D. Song, J.D. Tygar, Efficient and secure source authentication for multicast, in: Network and Distributed System Security Symposium, NDSS, 2001, pp. 35–46.
[77] D. Rachmawati, J. Tarigan, A. Ginting, A comparative study of message digest 5 (md5) and sha256 algorithm, J. Phys.: Conf. Ser. 978 (1) (2018) 012116.
[78] E. Dubrova, M. Näslund, G. Selander, F. Lindqvist, Lightweight message authentication for constrained devices, in: Proceedings of the 11th ACM Conference on Security & Privacy in Wireless and Mobile Networks, 2018, pp. 196–201.
[79] H. Li, V. Kumar, J.-M.J. Park, Y. Yang, Cumulative message authentication codes for resource-constrained networks, in: 2020 IEEE Conference on Communications and Network Security, CNS, IEEE, 2020, pp. 1–9.
[80] P. Perazzo, C. Vallati, G. Anastasi, G. Dini, DIO suppression attack against routing in the internet of things, IEEE Commun. Lett. 21 (2017) 2524–2527.

Mr. Girish Sharma is a research fellow at Malaviya National Institute of Technology, Jaipur. He is also an Assistant Professor in the Department of Computer Science & Engineering at Manipal University Jaipur, Rajasthan, India. He obtained his M.Tech. degree (2016) in Computer Engineering from MNIT Jaipur, Rajasthan, India. He completed his B.Tech degree (2009) in Computer Science & Engineering from University of Rajasthan, Jaipur, Rajasthan, India. He has more than ten years of experience in teaching. His current areas of interest include Security in Computing, Internet of Things, Intrusion Detection, and Static & Dynamic Analysis of Applications.

Dr. Jyoti Grover received her Ph.D. in Computer Engineering from Malaviya National Institute of Technology, Jaipur (India) in 2013. Currently, she is an Assistant Professor in the Department of Computer Science and Engineering, Malaviya National Institute of Technology Jaipur, India. Her research interests include Ad hoc networks security, VANET, IoT, SDN, intelligent transportation system, cloud and mobile computing. She has published more than 50 research papers in international journals and conferences.

Dr. Abhishek Verma is an Assistant Professor in the Department of Computer Science & Engineering at IIITDM, India. He obtained his Ph.D. degree (2020) in Internet of Things security from the National Institute of Technology Kurukshetra, Haryana, India. He completed his B.Tech degree (2014) in Computer Science & Engineering from Uttar Pradesh Technical University, India, and his M.Tech degree (2016) in Computer Engineering from the National Institute of Technology Kurukshetra, India. He has more than six years of experience in research and teaching. He has published more than 20 research articles in international journals and conferences of high repute. He is an editorial board member of Research Reports on Computer Science (RRCS) and an active review board member of various reputed journals, including IEEE, Springer, Wiley, and Elsevier. His current areas of interest include Information Security, Intrusion Detection, and the Internet of Things.