Indonesian Journal of Electrical Engineering and Computer Science
Vol. 7, No. 3, September 2017, pp. 718 ~ 723
DOI: 10.11591/ijeecs.v7.i3.pp718-723
Received May 29, 2017; Revised August 15, 2017; Accepted August 30, 2017
Learning Based Route Management in Mobile Ad Hoc Networks
Rahul Desai*1, B P Patil2, Davinder Pal Sharma3
1Sinhgad College of Engineering, Army Institute of Technology, Pune, India
2E & TC Department, Army Institute of Technology, Pune, India
3Department of Physics, The University of the West Indies, St. Augustine, St. George, Trinidad and Tobago
*Corresponding author, email: desaimrahul@yahoo.com
Abstract
Ad hoc networks are mobile wireless networks in which every node acts as a router. Existing
routing protocols such as Destination Sequenced Distance Vector (DSDV), Optimized Link State
Routing (OLSR), Ad hoc On-demand Distance Vector (AODV) and Dynamic Source Routing (DSR) are
optimized versions of distance vector or link state routing protocols. Reinforcement learning is a
recently developed method that learns from interaction with an environment. Q learning, a form of
reinforcement learning that learns from delayed reinforcements, has become popular in the
networking domain. When Q learning is applied to routing, the routing tables of distance vector
algorithms are replaced by estimation tables of so-called Q values, which are based on link delay.
In this paper, various optimization techniques over Q routing are described in detail together with
their algorithms.
Keywords: Q Routing, Reinforcement, CQ routing, DRQ routing, CDRQ routing, DSR, AODV, DSDV
Copyright © 2017 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
An ad hoc network is a technology in which no fixed infrastructure is required; all nodes
are mobile and can therefore move from one network to another [1, 2]. An ad hoc network is a
temporary network in which each node also acts as a router. All nodes are self-configured
(addresses and routing features), and multiple hops may be required to transfer data from one node
to another. Energy is also one of the most important parameters, since all nodes have a limited
power supply. Ad hoc network characteristics include peer-to-peer operation, zero administration,
low power, multihop communication, and dynamic auto-configuration. Routing consists of two steps:
forwarding packets to the next hop, and deciding how the forwarding process should deliver packets
to the destination in the minimum number of hops. To judge the merit of a routing protocol,
qualitative and quantitative metrics are used to measure its suitability and performance.
Performance parameters such as packet delivery ratio, delay, jitter and control overhead are used
to judge the performance of routing protocols.
Two types of protocols are widely adopted for ad hoc networks: proactive routing
protocols and on-demand (also known as reactive) routing protocols. Proactive protocols always
maintain routing paths between all pairs of nodes irrespective of their usage, while reactive
protocols find a path to a node only when it is needed. Proactive routing protocols always hold
the optimum routes to every destination node, but they are not suitable for large networks because
of their high overheads and poor convergence behaviour. Destination Sequenced Distance Vector
(DSDV) is one of the earliest protocols developed for ad hoc networks [3, 4]. It is based on the
distance vector algorithm and uses sequence numbers to avoid the count-to-infinity problem. Every
node discovers its neighbours by sending hello messages and exchanges its routing table with them.
Periodic full updates and smaller incremental updates are also transmitted to keep routing tables
up to date.
The Optimized Link State Routing protocol [5, 6] is another proactive routing protocol,
based on the link state algorithm. Here, every node broadcasts link state updates to every other
node in the network and thus builds link tables from which routing tables are derived. To reduce
the overheads, the multipoint relay concept is widely used. Distance vector and link state are
thus the two families of algorithms widely used for wired as well as wireless networks. In
distance vector routing protocols, the distances in terms of number of hops are communicated to
the neighbours, which build up their routing tables from them. A routing table basically consists
of three columns: the first holds the destination node, the second the next hop to which packets
are to be delivered, and the third the metric or cost.
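To make the structure concrete, such a table can be modelled as a simple mapping from
destinations to (next hop, metric) pairs. The sketch below is illustrative only; the field names
are assumptions rather than part of any particular protocol implementation.

from dataclasses import dataclass

@dataclass
class RouteEntry:
    destination: str  # column 1: the destination node
    next_hop: str     # column 2: the neighbour packets are delivered to
    metric: int       # column 3: the cost, e.g. hop count

# One entry per destination, keyed by destination for direct lookup.
routing_table = {"D": RouteEntry("D", next_hop="B", metric=3)}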
In on-demand routing protocols, a route to the destination is obtained only when it is
needed. When a source node wants to transmit data packets to a destination node, it initiates a
route discovery process. Route request (RREQ) messages flood the network until one reaches the
destination; the destination node replies with a route reply (RREP) message, which is unicast back
towards the source node. All nodes, including the source node, keep this route information in
caches for future use. Dynamic Source Routing (DSR) is characterized by the use of source routing:
the data packets carry the source route in the packet header. When a link or node goes down and an
existing route is no longer available, the source node again initiates route discovery to find a
new optimum route. Route error packets and acknowledgement packets are also used. Ad hoc On-demand
Distance Vector (AODV) is also an on-demand routing protocol. It uses traditional routing tables
with one entry per destination. In AODV, only one route path is available in the routing table; if
this path fails, the node again initiates route discovery to find another optimum path [7-9].
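As an illustration of this request/reply pattern, the following sketch floods RREQs
breadth-first over a static graph and returns the path along which the RREP would be unicast back.
It is a simplification under stated assumptions (no mobility, no packet loss, illustrative node
names), not the DSR or AODV specification.

from collections import deque

def discover_route(graph, source, destination):
    # Flood RREQs breadth-first; each node forwards a given RREQ only once.
    visited = {source}
    queue = deque([[source]])          # each entry records the path the RREQ took
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path                # the RREP retraces this path to the source
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None                        # no route: a route error would be reported

graph = {"S": ["A", "B"], "A": ["S", "D"], "B": ["S", "D"], "D": ["A", "B"]}
print(discover_route(graph, "S", "D"))  # -> ['S', 'A', 'D']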
2. Survey of Reinforcement Based Routing Methods
Reinforcement learning is the process of mapping situations to actions so as to maximize a
reward signal. Various strategies are used, such as positive or negative reinforcement, as well as
model-based or model-free approaches. Q routing is a recently developed concept that is also based
on reinforcement learning. Each node in the network contains a reinforcement learning module that
tries to find the optimum path to the destination. A direct or indirect training signal is
required to improve the routing policy. As illustrated in Figure 1, let Qx(Y, D) represent the
time that node X estimates it takes to deliver a packet P to the destination node D when the
packet is forwarded to the neighbouring node Y. After sending the packet, node X also receives
node Y's estimate of the remaining time in the network [10, 11].
Figure 1. Q routing
In Q routing, each node maintains a table of Q values which represent the delivery delays
via each of its next hops. For every incoming packet, a node consults its Q table and decides the
next hop based on the least estimated delivery time to the destination [10, 11]. At the same time,
the sending node receives an estimate of the remaining delivery time for the packet to the
destination. Thus, after every packet transmitted by the source node and the intermediate nodes
[11], Q values are returned to these nodes, which update their Q tables so that they track the
current state of the network. As soon as node X sends a packet P destined for node D to one of its
neighbouring nodes Y, node Y sends back to node X its best estimate Qy(Z, D) for destination D,
which represents its remaining time to deliver the packet to the destination node D [10, 11].
PacketSend (X)
1. Receive the packet from the packet queue.
2. Find the best neighbour Y = argmin Qx(Y, D).
3. Forward the packet to neighbour Y.
4. Receive the estimate (Qy(Z, D) + qy) from node Y.
5. Update the Q value Qx(Y, D).
PacketReceive (Y)
1. Receive a packet from neighbour X.
2. Calculate the best estimate Qy(Z, D) for destination D and send it back to node X.
3. Get ready to receive the next packet.
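The update in step 5 of PacketSend follows the standard Q routing rule: the old estimate
Qx(Y, D) is moved towards the new sample qy + s + Qy(Z, D), where qy is the queueing delay at Y
and s the transmission delay. A minimal sketch of a node's Q table and this update is given below;
the class layout, parameter names and the default learning rate are illustrative assumptions, not
values taken from the paper.

class QRouter:
    """One node's Q table and update rule (illustrative sketch)."""

    def __init__(self, neighbours, destinations, eta=0.5):
        self.eta = eta  # constant learning rate, as in standard Q routing
        # Q[y][d]: estimated time to deliver to d when forwarding via y
        self.Q = {y: {d: 0.0 for d in destinations} for y in neighbours}

    def best_neighbour(self, d):
        # PacketSend step 2: Y = argmin Qx(Y, D)
        return min(self.Q, key=lambda y: self.Q[y][d])

    def best_estimate(self, d):
        # PacketReceive step 2: the estimate sent back upstream
        return min(self.Q[y][d] for y in self.Q)

    def update(self, y, d, y_estimate, q_y, s=1.0):
        # PacketSend step 5: move Qx(Y, D) towards the new sample
        sample = q_y + s + y_estimate
        self.Q[y][d] += self.eta * (sample - self.Q[y][d])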
By adding a confidence measure, the quality of exploration is improved and learning
becomes faster, so the Q values track the current state of the network more closely. Each node in
the network maintains C tables of confidence values, where each Q value is associated with a C
value. This value is a real number between 0 and 1 and essentially specifies the confidence in the
corresponding Q value [10].
In standard Q routing the learning rate is kept constant, which means there is no way to
express the reliability of the Q values. In confidence based Q routing, the learning rate depends
on the confidence of the Q value being updated and on that of its new estimate. In particular,
when node X sends a packet to its neighbour Y, it also gets back the confidence value Cy(Z, D)
associated with the returned Q value. When node X updates its Qx(Y, D) value, it first computes
the learning rate η, which depends on both Cx(Y, D) and Cy(Z, D). A simple and effective learning
rate function is ηf(Cold, Cnew) = max(Cnew, 1 − Cold). The confidence value represents the
reliability of the corresponding Q value and therefore changes with time: a confidence value
decays whenever its Q value was not updated in the last time step [11].
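This learning rate rule is small enough to state directly in code. A sketch follows; the
decay factor is an assumed value, since the paper does not specify one.

def learning_rate(c_old, c_new):
    # eta_f(C_old, C_new) = max(C_new, 1 - C_old): learn quickly when the
    # new estimate is trusted (high C_new) or the old value is unreliable
    # (low C_old).
    return max(c_new, 1.0 - c_old)

def decay_confidence(c, factor=0.95):
    # A confidence value decays in every time step in which its Q value was
    # not updated; the factor 0.95 is an assumption for illustration.
    return c * factor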
Figure 2. CQ routing
In confidence based Q routing, the PacketSend and PacketReceive algorithms can be
summarized as follows [10-12].
PacketSend (X)
1. Receive the packet from the packet queue.
2. Find the best neighbour Y = argmin Qx(Y, D).
3. Forward the packet to neighbour Y.
4. Receive the estimate (Qy(Z, D) + qy) and Cy(Z, D) from node Y.
5. Update the Q value Qx(Y, D) and the confidence value Cx(Y, D).
PacketReceive (Y)
1. Receive a packet from neighbour X.
2. Calculate the best estimate Qy(Z, D) for destination D and send it back to node X.
3. Find the corresponding confidence value Cy(Z, D) and send it back to node X.
4. Get ready to receive the next packet.
In dual reinforcement Q routing (DRQ), learning occurs in both directions, which
effectively doubles the learning performance of the Q routing algorithm. Instead of using only a
single reinforcement signal, an indirect reinforcement signal extracted from the incoming
information is also used to update the state of the network. When a node X sends a packet to a
neighbouring node Y, it also sends additional routing information, which node Y uses to update its
routing decisions in the opposite direction. Thus backward exploration is added to standard
Q routing [11]. Figure 3 illustrates this backward exploration.
Figure 3. DRQ routing
In dual reinforcement confidence based Q routing (CDRQ), the PacketSend and PacketReceive
algorithms can be summarized as follows [10-12].
PacketSend (X)
1. Receive the packet P(S, D) from the packet queue.
2. Find the best estimate Qx(Z, S) back towards the source S.
3. Append (Qx(Z, S) + qx) and Cx(Z, S) to the packet P(S, D).
4. Find the best neighbour Y = argmin Qx(Y, D).
5. Forward the packet to neighbour Y.
6. Receive the estimate (Qy(Z, D) + qy) and Cy(Z, D) from node Y.
7. Update the Q value Qx(Y, D) and the confidence value Cx(Y, D).
PacketReceive (Y)
1. Receive a packet from neighbour X.
2. Using the appended estimate Qx(Z, S) + qx and Cx(Z, S), update the Q value Qy(X, S) and the confidence value Cy(X, S).
3. Calculate the best estimates Qy(Z, D) and Cy(Z, D) for destination D and send them back to node X.
4. Get ready to receive the next packet.
Thus confidence values are used not only for exploration but also in making routing
decisions [12].
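Putting the forward and backward updates together, a compact sketch of one CDRQ node is
given below. The class layout, delay handling and the confidence update are illustrative
assumptions; only the adaptive learning rate and the two update directions follow the algorithms
above.

class CDRQRouter:
    """Sketch of a CDRQ node: forward and backward Q/C updates."""

    def __init__(self, neighbours, destinations):
        self.Q = {y: {d: 0.0 for d in destinations} for y in neighbours}
        self.C = {y: {d: 1.0 for d in destinations} for y in neighbours}

    def best(self, d):
        # argmin over neighbours, returning (neighbour, Q value, C value)
        y = min(self.Q, key=lambda n: self.Q[n][d])
        return y, self.Q[y][d], self.C[y][d]

    def _update(self, via, target, q_est, c_est, delay):
        # Adaptive learning rate from CQ routing: max(C_new, 1 - C_old)
        eta = max(c_est, 1.0 - self.C[via][target])
        self.Q[via][target] += eta * (delay + q_est - self.Q[via][target])
        self.C[via][target] = c_est  # simplification: adopt the new confidence

    def on_reply(self, y, d, q_y, c_y, delay):
        # Forward exploration: X learns from Y's estimate for destination D.
        self._update(y, d, q_y, c_y, delay)

    def on_packet(self, x, s, q_x, c_x, delay):
        # Backward exploration (what DRQ adds): Y learns from the estimate
        # that X appended for the source S.
        self._update(x, s, q_x, c_x, delay)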
3. Analysis
The experiments are performed using the NS2 simulator, an open source tool used for
research on wired and wireless networks. The number of nodes varies from 10 to 100, the topology
size is 1000 m × 1000 m, and the simulation time is 200 seconds. The DSDV, DSR, AODV and dual
reinforcement confidence based Q routing (CDRQ) protocols are analysed.
Figure 4. Number of Nodes vs. PDR
It is observed that when the network size increases beyond 60 nodes, the AODV and DSR
protocols start dropping packets, whereas CDRQ routing maintains a consistent packet delivery
ratio irrespective of the network size.
End-to-end delay is the time taken by a data packet to reach the destination; the
end-to-end delay results are illustrated in Figure 5. Here again, dual reinforcement confidence
based routing provides lower delay than standard Q routing and the other non-optimized variants of
Q routing.
Figure 5. Number of Nodes vs. Delay
4. Conclusion
This paper presents a comparative analysis in NS2 of various optimized versions of
existing routing protocols against dual reinforcement confidence based Q routing. The study
compares the DSDV, AODV and DSR protocols with the CDRQ routing protocol for an ad hoc network.
PDR and delay are very important parameters in deciding how reliably a protocol works. The CDRQ
variant, based on reinforcement learning, shows significantly better results than the existing
routing protocols.
References
[1] Mukhtiar Ahmed, Mazleena Salleh, M. Ibrahim Channa, Mohd Foad Rohani. Review on Localization
Based Routing Protocols for Underwater Wireless Sensor Network. International Journal of Electrical
and Computer Engineering, Vol 7, No 1: February 2017.
[2] Dilip Singh Sisodia, Riya Singhal, Vijay Khandal. A Performance Review of Intra and Inter-Group
MANET Routing Protocols under Varying Speed of Nodes. International Journal of Electrical and
Computer Engineering, Vol 7, No 5: October 2017.
[3] Justin Sophia I, N. Rama. Improving the Proactive Routing Protocol using Depth First Iterative
Deepening Spanning Tree in Mobile Ad Hoc Network. International Journal of Electrical and Computer
Engineering, Vol 7, No 1: February 2017.
[4] Rahul Desai, B P Patil. Analysis of Reinforcement Based Adaptive Routing in MANET. Indonesian
Journal of Electrical Engineering and Computer Science, Vol 2, No 3: June 2016.
[5] P. Jacquet et al. Optimized Link State Routing Protocol for Ad Hoc Networks. Proc. IEEE Int’l. Multi
Topic Conf. (INMIC ’01), 2001, pp. 62–68.
[6] T. Clausen and P. Jacquet. Optimized Link State Routing Protocol (OLSR), RFC 3626,
IETF, Oct. 2003. [Online]. Available: http://www.ietf.org/rfc/rfc3626.txt.
[7] Reji Mano, P.C. Kishore Raja, Christeena Joseph, Radhika Baskar. Hardware Implementation of
Intrusion Detection System for Ad-Hoc Network. International Journal of Reconfigurable and
Embedded Systems (IJRES), Vol 5, No 3: November 2016.
[8] Shalini Singh, Rajeev Tripathi. Performance Analysis of Extended AODV with IEEE802.11e HCCA to
support QoS in Hybrid Network. Indonesian Journal of Electrical Engineering and Computer Science,
Vol 12, No 9: September 2014.
[9] AL-Gabri Malek, Chunlin LI, Layuan Li. Improving ZigBee AODV Mesh Routing Algorithm Topology
and Simulation Analysis. TELKOMNIKA Indonesian Journal of Electrical Engineering Vol.12, No.2,
2014, pp. 1528~1535.
[10] S. Kumar. Confidence based dual reinforcement Q-routing: an on-line adaptive network routing
algorithm. MS thesis. University of Texas at Austin, 1998.
[11] S. Kumar, R. Miikkulainen. Dual Reinforcement Q-Routing: An On-Line Adaptive Routing Algorithm.
Artificial Neural Networks in Engineering, 1997.
[12] Rahul Desai, B P Patil. Prioritized Sweeping Reinforcement Learning Based Routing for MANETs.
Indonesian Journal of Electrical Engineering and Computer Science. Vol. 5, No. 2, Feb 2017, pp.
684~694.