Artificial Intelligence and Deep Learning For Computer Network Management and Analysis
Edited by
Sangita Roy, Rajat Subhra Chakraborty,
Jimson Mathew, Arka Prokash Mazumdar
and Sudeshna Chakraborty
Features:
This book serves as a valuable reference for students, researchers, and
practitioners who wish to study and get acquainted with the application of
cutting-edge AI, ML, and DL techniques to network management and
cybersecurity.
Chapman & Hall/Distributed Computing and Intelligent
Data Analytics Series
Series Editors: Niranjanamurthy M and Sudeshna Chakraborty
First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
CRC Press is an imprint of Taylor & Francis Group, LLC
© 2023 selection and editorial matter, Sangita Roy, Rajat Subhra Chakraborty, Jimson Mathew,
Arka Prokash Mazumdar and Sudeshna Chakraborty; individual chapters, the contributors
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to
publish in this form has not been obtained. If any copyright material has not been
acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted,
reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means,
now known or hereafter invented, including photocopying, microfilming, and recording, or in
any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and
are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging‑in‑Publication Data
Names: Roy, Sangita, editor.
Title: Artificial intelligence and deep learning for computer network management and analysis /
edited by Sangita Roy, Rajat Subhra Chakraborty, Jimson Mathew, Arka Prokash Mazumdar,
Sudeshna Chakraborty.
Description: First edition. | Boca Raton : Chapman & Hall/CRC Press, 2023. |
Series: Chapman & Hall/CRC distributed computing and intelligent data analytics series |
Includes bibliographical references and index. |
Identifiers: LCCN 2022050235 (print) | LCCN 2022050236 (ebook) | ISBN 9781032079592 (hbk) |
ISBN 9781032461380 (pbk) | ISBN 9781003212249 (ebk)
Subjects: LCSH: Computer networks‐‐Management‐‐Data processing. | Autonomic computing. |
Computer networks‐‐Automatic control. | Self-organizing systems. | Artificial intelligence.
Classification: LCC TK5105.548 .A78 2023 (print) | LCC TK5105.548 (ebook) | DDC
005.74028563‐‐dc23/eng/20221220
LC record available at https://lccn.loc.gov/2022050235
ISBN: 978-1-032-07959-2 (hbk)
ISBN: 978-1-032-46138-0 (pbk)
ISBN: 978-1-003-21224-9 (ebk)
DOI: 10.1201/9781003212249
Typeset in Palatino
by MPS Limited, Dehradun
Contents
Preface.....................................................................................................................vii
About the Editors ..................................................................................................ix
Contributors..........................................................................................................xiii
Preface
In recent years, particularly with the advent of deep learning (DL), new
avenues have opened up to handle today’s most complex and very
dynamic computer networks, and the large amount of data (often real
time) that they generate. Artificial intelligence (AI) and machine learning
(ML) techniques have already shown their effectiveness in a range of
network and service management problems, including, but not limited
to, cloud management, traffic management, and cybersecurity. There exist numerous
research articles in this domain, but a comprehensive and self-sufficient
book capturing the current state-of-the-art has been lacking. This book
aims to systematically collect quality research spanning AI, ML, and DL
applications to diverse sub-topics of computer networks, communications,
and security under a single cover. It also aspires to provide more insight
into the applicability of the underlying theory, otherwise a rarity in many
such books.
The first chapter applies ML to traffic management, in particular the
classification of Domain Name System (DNS) query packets sent over a secure
(encrypted) connection. This important problem is challenging to solve
because the relevant fields in the packet header and body that allow easy
classification are not available in plaintext in an encrypted packet. An
accurate DL model and a support vector machine (SVM)–based ML model,
built on well-chosen features constructed specifically for this purpose, are
proposed to solve this problem.
Wi-Fi access points periodically broadcast beacons for dynamic network
management. However, these frames are typically unprotected, and thus
can be exploited by adversaries to severely affect the security and
performance of the underlying Wi-Fi network. In the second chapter, the
authors have proposed and developed a methodology applying several
supervised ML techniques and DL to perform real-time detection of
authentic and forged beacons. The technique achieves high accuracy in
detecting and classifying several beacon attacks.
In the third chapter, the authors have proposed a reinforcement learning-
based switch migration to handle the load imbalance among the controllers
and optimize network configuration for a software defined network (SDN).
The scope of this chapter is to design a framework that dynamically maps
switches to controllers, such that the load on the controllers is balanced. The
scheme exploits the concept of a knowledge plane to automate the switch to
controller mappings. The knowledge plane learns about the environment,
such as network traffic, and it uses reinforcement learning to discover the
actions that will lead to an optimal load-balanced environment.
1
Deep Learning in Traffic Management
DOI: 10.1201/9781003212249-1
CONTENTS
1.1 Introduction ....................................................................................................1
1.2 Survey on DoH, DoT, and Machine Learning Classification................2
1.3 Implementation (Diff Models and All)...................................................... 4
1.3.1 Dataset .................................................................................................4
1.3.1.1 DoH vs. Non-DoH Dataset...............................................4
1.3.1.2 Malicious vs. Non-Malicious DoH .................................. 4
1.3.2 Feature Engineering .......................................................................... 4
1.3.3 Classification Models ........................................................................5
1.3.3.1 The Keras Sequential Model.............................................5
1.3.3.2 The SVM Model..................................................................7
1.4 Results and Analysis (with Graphs) ..........................................................8
1.5 Conclusion ......................................................................................................9
References ..............................................................................................................11
1.1 Introduction
DNS over HTTPS (DoH) was introduced to overcome the security vulnerabilities
of DNS, which exposes DNS queries to possible snoopers, compromising
the privacy and security of a connection through man-in-the-middle
attacks and eavesdropping. DoH is able to encrypt these DNS requests as an
HTTPS request ensuring all the security features of the HTTPS protocol.
However, this also introduces a catch: DoH packets can no longer be distinguished
from normal HTTPS packets, which opens an avenue for malicious
packets to masquerade as DoH responses. This creates the need
to analyze DoH packets for patterns that can distinguish them from other
HTTPS packets. The unavailability of a dataset that can be used for analysis of
DoH traffic is a key obstacle to identifying DoH traffic and, subsequently,
the nature of such a DoH packet, i.e., whether it is malicious or not.
“DoHlyzer” [1], a Canadian Institute for Cybersecurity (CIC) project funded
by the Canadian Internet Registration Authority (CIRA), is one of the only
available tools that helps gather such a dataset in a systematic way, apply
feature extraction, and then use models for classification. Our approach uses
their modules of feature extraction as a baseline for improvement and makes
use of a diverse dataset generated from several combinations of browsers,
operating systems, internet service providers (ISPs), and locations across India.
The motivation behind being able to analyze DoH traffic comes from the
fact that while encryption provides a greater level of security for DNS
requests, it can provide reasons for worry in other cases. For example,
readable information from DNS can be used to identify malware, botnet
communication, and data exfiltration, and encryption removes this visibility.
Sudden increases in DNS requests can be a sign of data exfiltration [2,3],
the unauthorized transfer of data from a computer by a malicious actor.
However, with DoH, it is no longer known if and when there is a DNS query.
DNS information can also be used to enforce security policies, such as
limiting access to services in corporate networks, parental control, and
blocking phishing servers or blacklisted domains.
While DoH might be able to curb censorship, the significance of that
is out of the scope of this project. While there are some intuitive features
and statistical features that can be used to identify a large number of DoH
packets in a capture session, the same can often be used to emulate DoH
packets that could be malicious traffic in disguise. This also largely varies
depending on the network connection strength, the server being requested,
etc., which creates several limitations to fixed pattern and knowledge-based
classification methods for network traffic. The main factor motivating this
project is that identifying DoH packets within network traffic can help
retain the strength of security tools that have so far protected us based
on DNS information. From an academic perspective, it also helps us
understand the strength of the privacy protection that DoH provides to users,
which can prevent them from being profiled or subjected to unnecessary censorship.
This work aims to analyze network traffic to extract identifiable information
that can differentiate between non-DoH web traffic and DoH traffic using
machine learning and deep learning methods for a variety of packet captures.
TABLE 1.1
Dataset Parameters
Parameter Feature
F1 Number of flow bytes sent
F2 Rate of flow bytes sent
F3 Number of flow bytes received
F4 Rate of flow bytes received
F5 Mean packet length
F6 Median packet length
F7 Mode packet length
F8 Variance of packet length
F9 Standard deviation of packet length
F10 Coefficient of variation of packet length
F11 Skew from median packet length
F12 Skew from mode packet length
F13 Mean packet time
F14 Median packet time
F15 Mode packet time
F16 Variance of packet time
F17 Standard deviation of packet time
F18 Coefficient of variation of packet time
F19 Skew from median packet time
F20 Skew from mode packet time
F21 Mean Request/response time difference
F22 Median request/response time difference
F23 Mode request/response time difference
F24 Variance of request/response time difference
F25 Standard deviation of request/response time difference
F26 Coefficient of variation of request/response time difference
F27 Skew from request/response time difference
the model through the dataset. The function of the ETC with respect to the
dataset is to produce a feature-importance ranking that yields an accurate
representation of the reduced data while maintaining as much variance and
information as possible.
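As an illustration of this step, the sketch below ranks the Table 1.1 features by importance; it assumes that ETC refers to scikit-learn's ExtraTreesClassifier and that the features have been exported to a CSV with a binary DoH label (file and column names are hypothetical).

```python
# Sketch: feature-importance ranking with an extra trees classifier.
# Assumptions: ETC = sklearn.ensemble.ExtraTreesClassifier; the F1-F27 features of
# Table 1.1 live in "doh_flows.csv" with a binary "is_doh" label column.
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier

df = pd.read_csv("doh_flows.csv")
X, y = df.drop(columns=["is_doh"]), df["is_doh"]

etc = ExtraTreesClassifier(n_estimators=100, random_state=0)
etc.fit(X, y)

importance = pd.Series(etc.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False).head(5))  # top-5 features, cf. Figure 1.2
```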
FIGURE 1.1
Model architecture.
L_{CE} = -\sum_{i=1}^{n} t_i \log(p_i), \quad \text{for } n \text{ classes}, \qquad (1.1)

where t_i is the truth label and p_i is the softmax probability for the i-th class.
\min_{w} \; \lVert w \rVert^{2} + \sum_{i=1}^{n} \left( 1 - y_i \langle x_i, w \rangle \right)_{+} \qquad (1.2)

where \langle x_i, w \rangle is the SVM function that provides the classification label for input x_i and SVM parameter w, and y_i is the true label.
F_1 = \frac{2}{\frac{1}{\text{recall}} + \frac{1}{\text{precision}}} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} = \frac{tp}{tp + \frac{1}{2}(fp + fn)} \qquad (1.3)
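To make the two classifiers and their objectives concrete, the sketch below trains a small Keras sequential network with the categorical cross-entropy of Equation (1.1) and a linear SVM minimizing the hinge objective of Equation (1.2), then scores both with the F1 measure of Equation (1.3). The layer sizes, training settings, and 27-feature input are illustrative assumptions, not the exact configuration evaluated in this chapter.

```python
# Sketch: Keras sequential model (cross-entropy, Eq. 1.1) and linear SVM (hinge loss,
# Eq. 1.2), both scored with F1 (Eq. 1.3). Hyperparameters are illustrative assumptions.
import numpy as np
from tensorflow import keras
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

def build_sequential(n_features=27, n_classes=2):
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),  # p_i in Eq. (1.1)
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

def train_and_score(X_train, y_train, X_test, y_test):
    dl = build_sequential(X_train.shape[1], len(np.unique(y_train)))
    dl.fit(X_train, keras.utils.to_categorical(y_train),
           epochs=20, batch_size=64, verbose=0)
    dl_pred = dl.predict(X_test).argmax(axis=1)

    svm = LinearSVC(loss="hinge", max_iter=10000)  # minimizes the objective of Eq. (1.2)
    svm.fit(X_train, y_train)
    svm_pred = svm.predict(X_test)
    return f1_score(y_test, dl_pred), f1_score(y_test, svm_pred)
```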
Figure 1.2 shows that five features stand out when trying to identify a
DoH packet. The following are the most important features to look for,
a ranking our research uniquely provides:
• PacketLengthMean
• PacketLengthStandardDeviation
• PacketLengthMode
FIGURE 1.2
Feature importance.
• PacketLengthCoefficientofVariation
• Duration
It can be seen that these features are easily attainable from our parsing
technique as they are openly available and not encrypted. It is also observed
that statistical analysis of packet length and time, in general, provides
enough information to identify DoH packets.
1.5 Conclusion
Our dataset is an improvement over, and more varied than, those of previous
works. This chapter provides a more challenging set of packets,
replicating the real-world setting much better than previous works.
The proposed deep learning model provides 93% accuracy in detecting
these packets, proving to be highly accurate and robust on the task.
Finally, the proposed model provides a high accuracy of 97% on
malicious DoH detection as well, beating all previous
FIGURE 1.3
Sequential model accuracy.
FIGURE 1.4
SVM model accuracy.
approaches. In summary, this work has introduced a new deep learning approach to classify and
detect DoH in a real-world setting. Future work would involve exploring
DoT detection as well as decrypting messages in these protocols using
deep learning. The decryption poses the largest challenge, and we hope
to see the proposed work being used as a base for the same.
References
[1] M. MontazeriShatoori, L. Davidson, G. Kaur, and A. Habibi Lashkari,
“Detection of DoH tunnels using time-series classification of encrypted
traffic,” in 2020 IEEE Intl Conf on Dependable, Autonomic and Secure
Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf
on Cloud and Big Data Computing, Intl Conf on Cyber Science and
Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), 2020,
pp. 63–70. doi: 10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00026.
[2] D. Vekshin, K. Hynek, and T. Cejka, “DoH Insight: Detecting DNS over
HTTPS by machine learning,” New York, NY, USA: Association for Computing
Machinery, 2020, isbn: 9781450388337. doi: 10.1145/3407023.3409192.
[3] [Online]. Available: https://dnscrypt.info/
[4] C. López Romera, “DNS over HTTPS traffic analysis and detection,” 2020.
[5] L. Jun, Z. Shunyi, L. Yanqing, and Z. Zailong, “Internet traffic classification
using machine learning,” in 2007 Second International Conference on
Communications and Networking in China, IEEE, 2007, pp. 239–243.
[6] T. T. Nguyen and G. Armitage, “A survey of techniques for internet traffic
classification using machine learning,” IEEE communications surveys &
tutorials, vol. 10, no. 4, pp. 56–76, 2008.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification
with deep convolutional neural networks,” Advances in neural information
processing systems, vol. 25, pp. 1097–1105, 2012.
[8] “Towards a comprehensive picture of the great firewall’s DNS censorship,”
in 4th USENIX Workshop on Free and Open Communications
on the Internet (FOCI 14), San Diego, CA: USENIX Association, Aug.
2014. [Online]. Available: https://www.usenix.org/conference/foci14/
workshop-program/presentation/anonymous.
[9] W. M. Shbair, T. Cholez, J. Francois, and I. Chrisment, “A multi-level
framework to identify HTTPS services,” in NOMS 2016 – 2016 IEEE/IFIP
Network Operations and Management Symposium, IEEE, 2016, pp. 240–248.
[10] J. Manzoor, I. Drago, and R. Sadre, “How HTTP/2 is changing web traffic
and how to detect it,” in 2017 Network Traffic Measurement and Analysis
Conference (TMA), IEEE, 2017, pp. 1–9.
[11] T. A. Peña, “A deep learning approach to detecting covert channels in the
domain name system,” Ph.D. dissertation, Capitol Technology University,
2020.
[12] J. Bushart and C. Rossow, “Padding ain’t enough: Assessing the privacy
guarantees of encrypted DNS,” in 10th USENIX Workshop on Free and
Open Communications on the Internet (FOCI 20), 2020.
[13] S. Siby, M. Juarez, C. Diaz, N. Vallina-Rodriguez, and C. Troncoso,
“Encrypted DNS --> privacy? A traffic analysis perspective,” Jun. 2019.
[14] A. Sherstinsky, “Fundamentals of recurrent neural network (rnn) and long
short-term memory (lstm) network,” Physica D: Nonlinear Phenomena, vol. 404,
p. 132 306, 2020.
[15] G. P. Zhang, “Time series forecasting using a hybrid ARIMA and neural
network model,” Neurocomputing, vol. 50, pp. 159–175, 2003.
[16] S. Guha and P. Francis, “Identity trail: Covert surveillance using dns,” in
Proceedings of 7th Workshop on Privacy Enhancing Technologies, 2007.
[17] M. Ghaemi and M.-R. Feizi-Derakhshi, “Feature selection using forest
optimization algorithm,” Pattern Recognition, vol. 60, pp. 121–129, 2016.
[18] G. Biau and E. Scornet, “A random forest guided tour,” Test, vol. 25, no. 2,
pp. 197–227, 2016.
[19] S. R. Safavian and D. Landgrebe, “A survey of decision tree classifier
methodology,” IEEE transactions on systems, man, and cybernetics, vol. 21,
no. 3, pp. 660–674, 1991.
[20] M. Borga, “Learning multidimensional signal processing,” Ph.D. dissertation,
Linköping University Electronic Press, 1998.
[21] T. Joachims, “Making large-scale svm learning practical,” Technical report,
Tech. Rep., 1998.
[22] P. Pearce, B. Jones, F. Li, R. Ensafi, N. Feamster, N. Weaver, and V. Paxson,
“Global measurement of DNS manipulation,” in 26th USENIX Security
Symposium (USENIX Security 17), Vancouver, BC: USENIX Association,
Aug. 2017, pp. 307–323, isbn: 978-1-931971-40-9. [Online]. Available:
https://www.usenix.org/conference/usenixsecurity17/technical-sessions/
presentation/pearce.
[23] A. F. M. Agarap, “A neural network architecture combining gated recurrent
unit (GRU) and support vector machine (SVM) for intrusion detection
in network traffic data,” in Proceedings of the 2018 10th International
Conference on Machine Learning and Computing, 2018, pp. 26–30.
2
Machine Learning–Based Approach
for Detecting Beacon Forgeries in
Wi-Fi Networks
CONTENTS
2.1 Introduction ..................................................................................................14
2.2 Problem Statement ......................................................................................15
2.3 Related Work................................................................................................ 15
2.4 Brief Introduction to the Models .............................................................. 16
2.4.1 SVM.................................................................................................... 16
2.4.2 k-NN ..................................................................................................16
2.4.3 Random Forest .................................................................................17
2.4.4 Multilayer Perceptron (MLP) ........................................................17
2.4.5 CNN...................................................................................................17
2.5 Dataset Generation......................................................................................18
2.5.1 Beacon Forgery ................................................................................18
2.5.2 Beacon Flooding ..............................................................................18
2.5.3 De-authentication Attack ............................................................... 19
2.5.4 Attack Modeling ..............................................................................22
2.5.4.1 Feature Extraction.............................................................22
2.6 Dataset Classification .................................................................................. 24
2.7 Evaluation .....................................................................................................24
2.7.1 Analysis of Results.......................................................................... 26
2.8 Conclusion and Future Work....................................................................31
References ..............................................................................................................32
DOI: 10.1201/9781003212249-2
2.1 Introduction
Wi-Fi, belonging to a family of wireless technology protocols based on IEEE
802.11, has become an integral part of connecting IoT devices. Experts predict
that by 2022 more than half of all IP traffic will be generated by
wireless devices [1], and an efficient way for these devices to access the
internet is through Wi-Fi. Wi-Fi utilizes wired
network devices called access points [2] to connect to the internet.
Access points use management frames like beacon and de-authentication
frames to announce their presence or to disconnect from a network. Lack of
authentication of these frames may eventually lead to spoofing, resulting
in attacks such as beacon forgery, de-authentication attacks, and beacon flooding.
A combination of de-authentication and beacon flooding can be used to
consume the resources of legitimate access points and trick users into
connecting to fake access points, which may then become a hotspot for man-in-the-middle
attacks. Beacon forgery may also result in a reduction of the victim’s transmission
power, eventually making the network connection unusable [3,4].
Compared to a wired network, defending against attacks in real-time
Wi-Fi networks is challenging due to the difficulty of controlling the
area of access and protecting the transmission medium. However,
analyzing the patterns of packets can help to anticipate attacks beforehand
and prepare to defend against emerging threats [5]. Towards
this goal, intrusion detection systems (IDSs) are leveraged to analyze
patterns of normal and abnormal behavior. There exist two broad categories
of IDS, as described below.
Existing IDSs detect beacon frame spoofing and access point flooding attacks
while leaving de-authentication attacks and beacon flooding undetected,
which may eventually result in denial of service to legitimate access points.
Hence, in this chapter, we propose a machine learning–based approach to
detect numerous beacon forgeries. In particular, we generate a dataset by
launching spoofing, flooding, and de-authentication attacks on the network
and capturing the network frames and analyzing them using supervised and
deep learning models. We show that our approach detects attacks such as
beacon forgery, flooding, and de-authentication with an accuracy of 92%.
2.4.1 SVM
SVM is used for binary classification and proposes an optimum boundary
between clusters of data points by constructing a maximum-margin hyperplane,
given by Equation 2.1.

H : w^{T}\phi(x) + b = 0 \qquad (2.1)
2.4.2 k-NN
k-NN selects k nearest neighbours and calculates distance between the
query instance and remaining samples using Euclidean distance (d), as
shown in Equation 2.3. Values thus obtained are arranged in an order and
the top k points are chosen to determine the nearest neighbours [23].
d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (2.3)

where,
p_i = query instance
q_i = second sample considered
2.4.3 Random Forest

RF_i = \frac{\sum_{j \in \text{all trees}} \text{norm}_{ij}}{T} \qquad (2.4)

where norm_{ij} is the normalized importance of feature i in tree j, and T is the total number of trees.
2.4.4 Multilayer Perceptron (MLP)

In an MLP, the weights are updated iteratively based on the error between the expected and predicted outputs:

w_{new} = w_{old} + L\,(y_{exp} - y_{pred})\,x \qquad (2.5)

where,
L = learning rate
y_{exp} = expected value of output
y_{pred} = predicted value of output
x = input
2.4.5 CNN
CNN belongs to the class of neural networks that process data with a
grid topology. Each of the convolutional layers contains a series of filters
called kernels. A kernel is a matrix of integers (pixels) used to reshape
the input vectors to the same size as that of the kernel, as shown in
Equation 2.6; the step with which the kernel slides over the input is known
as the stride. Each of the corresponding pixels in the kernel and the
reshaped vectors are multiplied to generate a feature map. Data is then
learnt and classified based on the feature map values.
W_{out} = \frac{W - F + 2P}{S} + 1 \qquad (2.6)

where W is the input width, F is the kernel (filter) size, P is the padding, and S is the stride.
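As a concrete, minimal sketch of the five models described above, the code below instantiates them with scikit-learn and Keras. The hyperparameters (k, number of trees, layer sizes, kernel count) are illustrative assumptions and not necessarily the settings used in the evaluation.

```python
# Sketch: the five classifiers of Section 2.4 (assumed hyperparameters).
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from tensorflow import keras

def build_classifiers(n_features, n_classes):
    svm = SVC(kernel="rbf")                        # separating hyperplane, Eq. (2.1)
    knn = KNeighborsClassifier(n_neighbors=5)      # Euclidean distance, Eq. (2.3)
    rf = RandomForestClassifier(n_estimators=100)  # importance averaged over T trees, Eq. (2.4)
    mlp = MLPClassifier(hidden_layer_sizes=(64, 32), learning_rate_init=0.001)

    # 1-D CNN over the per-frame feature vector; output width follows Eq. (2.6).
    cnn = keras.Sequential([
        keras.layers.Input(shape=(n_features, 1)),
        keras.layers.Conv1D(filters=32, kernel_size=3, strides=1,
                            padding="same", activation="relu"),
        keras.layers.Flatten(),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
    return {"SVM": svm, "k-NN": knn, "Random Forest": rf, "MLP": mlp, "CNN": cnn}
```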
FIGURE 2.1
Creating a fake beacon using Kali Linux.
FIGURE 2.2
Rate of flow of network frames during beacon forgery attack.
simulated using Kali Linux, and the graphs of the received beacons with
a specific SSID are given in Figures 2.3 and 2.4, respectively.
Figure 2.4 shows an increase in beacon frames during the beacon
flooding attack, but unlike the forgery attack, these beacons have the same
SSID but different BSSIDs. A high beacon flow rate indicates that the
flooding attack was performed successfully, as shown in Figure 2.2. A
beacon flooding attack is easier to detect as victims are flooded with a
large number of frames.
FIGURE 2.3
Beacon flooding using Kali Linux.
FIGURE 2.4
Rate of flow of network frames during beacon flooding attack.
FIGURE 2.5
De-authentication attack using Kali Linux.
FIGURE 2.6
Rate of flow of de-authentication frames during de-authentication attack.
FIGURE 2.7
Graphical results for evaluation of balanced dataset.
Frequency = N_D / T \qquad (2.7)
FIGURE 2.8
Sample of the extracted data.
TABLE 2.1
Features Extracted From pcap Files
Feature Extracted                           Description
wlan.da                                     Destination of the frame
wlan.fixed.timestamp                        Timestamp of arrival
wlan.bssid                                  BSSID of the frame (MAC address)
frame.time                                  Timestamp
wlan_radio.signal_dbm                       Radio signal strength
wlan.fixed.beacon                           Beacon interval
wlan.sa                                     Source address of access point
wlan.fixed.capabilities.ess                 Channel capability (extended service set)
wlan.fixed.capabilities.ibss                Channel capability (IBSS)
wlan.ra                                     Receiver address (de-authentication frame)
wlan.fixed.capabilities.cfpoll.ap           Channel capability (CFP participation)
wlan.fixed.capabilities.privacy             Channel capability (privacy)
wlan.fixed.capabilities.preamble            Channel capability (short preamble)
wlan.fixed.capabilities.pbcc                Channel capability (PBCC)
wlan.fixed.capabilities.agility             Channel capability (channel agility)
wlan.fixed.capabilities.spec_man            Channel capability (spectral management)
wlan.fixed.capabilities.short_slot_time     Channel capability (short slot time)
wlan.fixed.capabilities.radio_measurement   Channel capability (radio measurement implemented)
wlan.fixed.capabilities.apsd                Channel capability (APSD)
wlan.ssid                                   SSID of access point (name of access point)
wlan.fixed.capabilities.del_blk_ack         Channel capability (delayed block ACK)
frame.time_epoch                            Epoch time
frame.time_relative                         Time passed since the arrival of the first frame
wlan.sa                                     Source address (de-authentication frame)
wlan.fixed.capabilities.imm_blk_ack         Channel capability (immediate block ACK)
wlan.fixed.capabilities.dsss_ofdm           Channel capability (DSSS-OFDM allowed)
wlan.fixed.reason_code                      Reason for de-authentication
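These fields can be exported from a packet capture with tshark's field-export mode; the sketch below wraps such a call from Python. The capture file name, output path, and the subset of fields chosen are assumptions for illustration.

```python
# Sketch: dump a subset of the Table 2.1 fields from a capture to CSV using tshark.
# File names and the field subset are illustrative assumptions.
import subprocess

FIELDS = [
    "frame.time_epoch", "frame.time_relative", "wlan.sa", "wlan.da", "wlan.bssid",
    "wlan.ssid", "wlan_radio.signal_dbm", "wlan.fixed.beacon",
    "wlan.fixed.capabilities.privacy", "wlan.fixed.reason_code",
]

def extract_features(pcap_path="beacons.pcap", csv_path="beacons.csv"):
    cmd = ["tshark", "-r", pcap_path,
           "-Y", "wlan.fc.type == 0",          # management frames only
           "-T", "fields", "-E", "header=y", "-E", "separator=,"]
    for field in FIELDS:
        cmd += ["-e", field]
    with open(csv_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)

extract_features()
```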
2.7 Evaluation
After the training data was pre-processed, various machine and deep
learning models were used to study the attack patterns in the data. As the
resulting dataset, shown in Table 2.2, was unbalanced, we use the synthetic
minority oversampling technique (SMOTE) [26] to balance it by oversampling
the minority class. SMOTE randomly selects a point in the minority class,
finds its k nearest neighbours, and creates synthetic data points between the
selected point and its neighbours. Table 2.3 shows the class
distribution of the dataset after balancing.
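A minimal sketch of this balancing step using the imbalanced-learn implementation of SMOTE; the feature file and label column names are assumptions.

```python
# Sketch: balance the four classes of Table 2.2 with SMOTE (imbalanced-learn).
import pandas as pd
from imblearn.over_sampling import SMOTE

df = pd.read_csv("beacon_features.csv")            # hypothetical pre-processed dataset
X, y = df.drop(columns=["label"]), df["label"]

smote = SMOTE(k_neighbors=5, random_state=42)
X_balanced, y_balanced = smote.fit_resample(X, y)  # every class oversampled to 7,118
print(y_balanced.value_counts())
```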
TABLE 2.2
Class Distribution of Dataset Before SMOTE Balancing
(Unbalanced Dataset)
Class Number of Samples
Normal 7,118
Beacon Flooding 2,222
De-authentication Attack 1,262
Beacon Forgery 1,119
TABLE 2.3
Class Distribution of Dataset After SMOTE Balancing
(Balanced Dataset)
Class Number of Samples
Normal 7,118
Beacon Flooding 7,118
De-authentication Attack 7,118
Beacon Forgery 7,118
2. Recall – Recall defines the ability of the model to find all the relevant
cases within a dataset.
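These metrics can be computed directly from a model's predictions; a brief sketch follows, reusing the balanced arrays from the previous snippet and a random forest as the example classifier (the split ratio and weighted averaging are assumptions).

```python
# Sketch: compute the metrics of Tables 2.4 and 2.5 for one classifier.
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

X_tr, X_te, y_tr, y_te = train_test_split(X_balanced, y_balanced, test_size=0.3,
                                          random_state=42, stratify=y_balanced)
clf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)
pred = clf.predict(X_te)

print("Precision:", precision_score(y_te, pred, average="weighted"))
print("Recall   :", recall_score(y_te, pred, average="weighted"))
print("F1-score :", f1_score(y_te, pred, average="weighted"))
print("Accuracy :", accuracy_score(y_te, pred))
```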
TABLE 2.4
Performance Metrics on Unbalanced Dataset
Algorithm Precision (%) Recall (%) F1-Score (%) Accuracy (%)
Machine Learning Models
k-NN 36.87 60.72 45.89 60.72
SVM 11.50 10.76 20.90 10.76
Random Forest 78.97 73.32 65.39 73.32
Deep Learning Models
MLP 91.00 91.00 92.00 92.00
CNN 100.00 44.00 62.00 44.00
FIGURE 2.9
Graphical results for evaluation of unbalanced data.
FIGURE 2.10
Confusion matrix of k-NN for unbalanced dataset.
FIGURE 2.11
Confusion matrix of SVM for unbalanced dataset.
FIGURE 2.12
Confusion matrix of random forest for unbalanced dataset.
FIGURE 2.13
Confusion matrix of MLP for unbalanced dataset.
highest accuracy, followed by the random forest. SVM, suited to binomial
classification, has the lowest performance. As CNN studies the sampled
original data, its accuracy is very low. From the results obtained
using a balanced dataset, we can see that even though random forest
outperforms the other models, there is a steep decrease in efficiency due to
FIGURE 2.14
Confusion matrix of CNN for unbalanced dataset.
TABLE 2.5
Performance Metrics on Balanced Dataset
Algorithm Precision (%) Recall (%) F1 Score (%) Accuracy (%)
FIGURE 2.15
Confusion matrix of k-NN for balanced dataset.
30 Artificial Intelligence and Deep Learning for Computer Network
FIGURE 2.16
Confusion matrix of SVM for balanced dataset.
FIGURE 2.17
Confusion matrix of random forest for balanced dataset.
FIGURE 2.18
Confusion matrix of MLP for balanced dataset.
FIGURE 2.19
Confusion matrix of CNN for balanced dataset.
References
[1] “Cisco Visual Networking Index: Global Mobile Data Traffic Forecast
Update, 2017–2022,” White Paper, Cisco, 2019.
[2] Martínez, Asier, et al. “Beacon frame spoofing attack detection in IEEE
802.11 networks.” 2008 Third International Conference on Availability,
Reliability and Security. IEEE, 2008.
[3] Vanhoef, Mathy, Prasant Adhikari, and Christina Pöpper. “Protecting wi-fi
beacons from outsider forgeries.” Proceedings of the 13th ACM Conference
on Security and Privacy in Wireless and Mobile Networks. 2020.
[4] Asaduzzaman, Md, Mohammad Shahjahan Majib, and Md Mahbubur
Rahman. “Wi-fi frame classification and feature selection analysis in
detecting evil twin attack.” 2020 IEEE Region 10 Symposium (TENSYMP).
IEEE, 2020.
[5] Boob, Snehal, and Priyanka Jadhav. “Wireless intrusion detection system.”
International Journal of Computer Applications 5.8 (2010): 9–13.
[6] Bronte, Robert, Hossain Shahriar, and Hisham M. Haddad. “A signature-
based intrusion detection system for web applications based on genetic
algorithm.” Proceedings of the 9th International Conference on Security of
Information and Networks. 2016.
[7] Otoum, Yazan, and Amiya Nayak. “AS-IDS: Anomaly and Signature Based
IDS for the Internet of Things.” Journal of Network and Systems Management
29.3 (2021): 1–26.
[8] Ugochukwu, Chibuzor John, E. O. Bennett, and P. Harcourt. An intrusion
detection system using machine learning algorithm. LAP LAMBERT Academic
Publishing, 2019.
[9] Usha, M., and P. J. W. N. Kavitha. “Anomaly based intrusion detection for
802.11 networks with optimal features using SVM classifier.” Wireless
Networks 23.8 (2017): 2431–2446.
[10] “Kali.” [Online]. Available: https://www.kali.org/
[11] “TSHARK.DEV.” [Online]. Available: https://tshark.dev/
[12] Liu, Donggang, Peng Ning, and Wenliang Du. “Detecting malicious beacon
nodes for secure location discovery in wireless sensor networks.” 25th IEEE
International Conference on Distributed Computing Systems (ICDCS’05).
IEEE, 2005.
[13] Wright, “Detecting wireless LAN MAC address spoofing,” http://home.jwu.edu/jwright/, 2003.
3
Reinforcement Learning for Switch Migration to Balance Loads
CONTENTS
3.1 Introduction ..................................................................................................36
3.2 Literature Survey.........................................................................................36
3.3 Load Balancing in SDN..............................................................................38
3.3.1 Problem Formulation......................................................................38
3.4 Knowledge-Defined Networking.............................................................. 39
3.4.1 Classical SDN Architecture ...........................................................40
3.4.2 SDN Architecture with Knowledge Plane .................................. 41
3.5 Load Balancing Using Reinforcement Learning ....................................41
3.5.1 Reinforcement Learning ................................................................. 41
3.5.2 Markov Decision Process (MDP)..................................................42
3.5.3 Problem Formulation Using Markov Decision Process ...........42
3.5.3.1 State Space (S) ...................................................................42
3.5.3.2 Action Space (A) ............................................................... 42
3.5.3.3 Reward Function (R)........................................................43
3.5.4 Q-Learning........................................................................................44
3.5.4.1 Exploration and Exploitation Trade-off........................ 45
3.6 Methodology ................................................................................................ 46
3.6.1 Random Approach .......................................................................... 47
3.6.2 Q-Learning with ϵ-greedy.............................................................. 48
3.7 Results and Implementation......................................................................50
3.7.1 Experiment Setup ............................................................................50
3.7.2 Evaluation Metric ............................................................................51
3.7.3 Experimental Results ......................................................................51
3.8 Conclusion and Future Work....................................................................54
References ..............................................................................................................54
DOI: 10.1201/9781003212249-3
3.1 Introduction
In the last few years, software-defined networking (SDN) has defined an
architectural paradigm that enables network-wide automation. The SDN
market is developing quickly and, according to market reports [1], is still
believed to be in its early stages. The popularity of SDN stems from the
fact that it allows automated provisioning, network virtualization, and
efficient management by separating the forwarding and control planes.
The network intelligence is centralized through programmable SDN
controllers. Although the control plane is centralized, a single physical
controller can lead to reliability, performance, and robustness issues.
The control plane in practice is logically centralized, which is achieved
using multiple, physically distributed controllers. Each controller has a set of
switches under its control, and it controls a specific portion of the network,
called the SDN domain. The SDN domains periodically exchange messages
to allow synchronization. However, these SDN domains cannot be static.
Organizations need continuous network monitoring and performance optimization
to support highly dynamic network traffic. Consequently, there is
a need for the network to evolve by continuously changing the boundaries of
the SDN domains. A static SDN network can lead to controller load imbalance.
The research community has considered the application of artificial intelligence
to optimize the network configuration automatically. The employment
of machine learning (ML) techniques to operate the network has led to a
new construct called the knowledge plane [2]. The logical centralization of the
control plane facilitates ML techniques that were otherwise not possible on
traditional networks that are inherently distributed. The knowledge plane
captures the network traffic, uses analytics, and makes decisions on behalf of
the network operator. The closed-loop feedback provided by the knowledge
plane is fundamental to achieve the desired performance level.
In this chapter, we propose a reinforcement learning–based switch
migration to handle the load imbalance among the controllers. The
framework aims to discover which actions will lead to an optimal network
configuration. The action is to move a switch from one controller to
another. A reward is associated with each such action. Ultimately, the
algorithm will learn the set of switch migrations (action updates) leading to
the target of a balanced network.
the day. Spatial traffic variations happen because of the flows produced by
applications associated with various switches. Due to these variations, the
accumulated traffic at a controller may exceed its capacity. We may need to
migrate switches from overloaded controllers to underloaded ones to deal
with this load imbalance. However, unlike a static controller-to-switch
mapping, the migrations need to happen on the fly. An additional issue that needs to be
addressed is the disruption of ongoing flows.
Dixit et al. [3,4] proposed to address the issue of load balancing by using
the equal controller mode (specified in OpenFlow v1.2) while transitioning a
controller from master to slave. In addition, they suggest resizing the
controller pool depending on whether the controller load exceeds or falls below
an upper or lower threshold.
Wang et al. [5] proposed a framework called switch migration-based
decision-making (SMDM), where they compute the load diversity, a ratio of
controller loads. Switch migration takes place in case the load diversity
between two controllers exceeds a certain threshold. The switches selected
for migration are those with less load and higher efficiency.
Hu et al. [6] proposed efficiency-aware switch migration (EASM), where
they measure the degree of load balancing using the normalized load
variance of the controller load. If the load difference matrix exceeds a
threshold, the controller load is presumed to be unbalanced. The
threshold is a function of the difference between the maximum and
minimum controller load.
Filali et al. [7] use the ARIMA time series model to predict a switch’s load. The
forecasting allows finding the time step at which a controller will become
overloaded and accordingly schedule a switch migration in advance. The
authors use a predetermined threshold to identify overloaded controllers.
The work by Zhou et al. [8] is among those few who consider the problem
of load oscillation due to inappropriate switch migration. Load oscillation
occurs when underloaded controllers, which are used to offload the traffic
load of overloaded controllers, become rapidly overloaded themselves. As
stated by the authors, the problem is due to the target switches not being
selected based on the overall network status.
Ul Haque et al. [9] address the variance in the controller load by employing
a controller module, which is a set of controllers. Their method estimates the
number of flows that the switches will produce on a regular interval and
accordingly activates the appropriate number of controllers.
Chen et al. [10] use a game-theoretic method to solve the problem of
controller load balancing. The underloaded controllers are modelled as
players who compete for switches from overloaded controllers. The payoffs
are determined when an underloaded controller is selected as the master
controller of a victim switch.
From this survey, we find that a system becomes unbalanced due to
changes in the network traffic characteristics. If we learn the behavior of
the network traffic, then we can make the right decisions. Several papers use
a threshold to trigger the switch migration, but they do not outline how
the threshold can be determined. Our proposed work applies reinforcement
learning to present a framework that learns the network traffic characteristics,
and it can take an optimal decision in choosing the underloaded
controller that balances the load of the system.
TABLE 3.1
List of Symbols Used in Our Chapter
Symbol Description
S Set of switches, |S| = n
C Set of controllers, |C| = k
s(t) Flow-request rate of switch s at time t
ILc(t) Instantaneous load of controller c at time t
LI(t) Ideal load of the controllers at time t
Let si(t) denote the number of new flows arriving at switch si in time t. In
computing a controller’s load, we take into account the total number of
flows experienced by it at any given time. Thus, the instantaneous load of a
controller cj at time t is the sum of all the flow request messages produced
by its assigned switches.
IL_{c_j}(t) = \sum_{i=1}^{n} s_i(t)\, x_{ij}(t) \qquad (3.2)

where x_{ij}(t) = 1 if switch s_i is assigned to controller c_j at time t, and 0 otherwise.
The ideal load of a controller is taken as a mean load of the system, which is
computed as follows:
LI(t) = \frac{\sum_{j=1}^{K} IL_{c_j}(t)}{K} \qquad (3.3)
The goal is to minimize the difference between the instantaneous load and the
ideal load (Equation 3.3). Thus, our objective is as follows:

\min \sum_{j=1}^{K} \left| IL_{c_j}(t) - LI(t) \right|
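This load bookkeeping is straightforward to express in code; the following sketch (variable and function names are illustrative) computes each controller's instantaneous load, the ideal load, and the total deviation that switch migration tries to minimize.

```python
# Sketch: instantaneous controller load (Eq. 3.2), ideal load (Eq. 3.3), and the
# total deviation from the ideal load that migration aims to minimize.
def controller_loads(flow_rates, assignment, k):
    """flow_rates[i] : s_i(t), new flows arriving at switch i in the current time step
       assignment[i] : index j of the controller that switch i is mapped to
       k             : number of controllers"""
    il = [0.0] * k
    for i, rate in enumerate(flow_rates):
        il[assignment[i]] += rate                # IL_cj(t): sum over assigned switches
    ideal = sum(il) / k                          # LI(t)
    deviation = sum(abs(load - ideal) for load in il)
    return il, ideal, deviation

# Nine switches mapped to three controllers (the scenario of Example 3.2 below)
# yields instantaneous loads of 75, 130, and 60 flows.
il, ideal, dev = controller_loads([25, 25, 25, 25, 25, 35, 45, 25, 35],
                                  [0, 0, 0, 1, 1, 1, 1, 2, 2], k=3)
```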
3.4 Knowledge-Defined Networking

In SDN, the control plane controls the forwarding devices. SDN originated from the need
to optimize network resources in fast-evolving networks. A key to solving
such a problem is automating the decision process of the control plane. The
integration of analytics and behaviour models into SDN to automate decision
making has led to a new paradigm called knowledge-defined networking
(KDN). In this section, we briefly describe the evolution of KDN, starting
from SDN.
FIGURE 3.1
Architecture of SDN.
In this technique, an agent learns its behaviour through trial and error. It is
rewarded for interacting with the environment. The agent’s actions are
determined not just by the immediate reward they offer but also by the
possibility of a delayed payoff. By dynamically adjusting parameters, a
reinforcement learning system aims to maximize reinforcement signals. The
signals produced by the environment are an evaluation of how the action
was completed.
The term D_{i,j} is the load deviation coefficient (or discrete coefficient)
between two controllers, c_i and c_j. Similarly, D'_{i,j} is the load deviation after a
switch is migrated from controller c_i to c_j. It is computed as

D_{i,j} := D_{c_i, c_j} = \sqrt{ \sum_{k=i,j} \left( IL_{c_k} - L^{I}_{c_i, c_j} \right)^{2} / 2 } \;\Big/\; L^{I}_{c_i, c_j} \qquad (3.6)

where L^{I}_{c_i, c_j} is the average (ideal) load of controllers c_i and c_j. Analogously, the load deviation coefficient of the whole system is

D = \sqrt{ \sum_{k=1}^{n} \left( IL_{c_k} - LI \right)^{2} / n } \;\Big/\; LI. \qquad (3.7)
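A small helper makes these coefficients concrete: the sketch below computes the deviation (discrete) coefficient of a set of controller loads as their standard deviation divided by their mean, mirroring Equations (3.6) and (3.7). It is illustrative and does not attempt to reproduce the exact numbers reported later in the chapter.

```python
# Sketch: load deviation ("discrete") coefficient of Eqs. (3.6)/(3.7):
# standard deviation of the controller loads divided by their mean.
from math import sqrt

def deviation_coefficient(loads):
    mean = sum(loads) / len(loads)
    variance = sum((load - mean) ** 2 for load in loads) / len(loads)
    return sqrt(variance) / mean

# Pairwise coefficient before and after migrating a 45-flow switch from c2 to c3
d_before = deviation_coefficient([130, 60])
d_after = deviation_coefficient([85, 105])   # smaller value => more balanced pair
```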
FIGURE 3.2
A state-action example for the MDP formulation.
3.5.4 Q-Learning
Q-learning is an RL technique that works by learning an action-value
function. It represents the expected utility of an action in a given state and
TABLE 3.2
Q-table for Example 3.1 at Time ti
Action
a2 a3
State s1 Q(s1,a2) Q(s1,a3)
s2 Q(s2,a2) Q(s2,a3)
s3 Q(s3,a2) Q(s3,a3)
Q(S_T, a)_{t+1} = Q(S_T, a)_t + \alpha \left( R(S_T, a) + \gamma \max_{a'} Q(S_T', a')_t - Q(S_T, a)_t \right) \qquad (3.8)
The sum of the current return value and the discounted next-largest value
Q(S_T', a')_t in the memory is used as the expected value. Incremental,
iterative learning is performed using the difference between the expected
value and the current estimate, so as to obtain the value of Q(S_T, a)_{t+1} in the
(t + 1)-th round. The learning rate α is set between 0 and 1. A value of
0 means that the Q-values are never updated; hence, nothing is learned.
Setting a high value such as 0.9 means that learning can occur quickly. A
large magnitude of the discount factor γ emphasizes long-term benefits,
while smaller values emphasize immediate rewards. Initially, the
Q-table is initialized with “0.” Exploration and exploitation trade-offs are
used to tune the Q-values gradually.
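A compact sketch of this update rule together with an ϵ-greedy action choice is given below. The values of α, γ, and the ϵ decay follow the experimental setup used later in the chapter (0.6, 0.7, and 0.7 decaying to 0.05); the tabular state/action encoding is an illustrative assumption.

```python
# Sketch: tabular Q-learning with an epsilon-greedy policy (update rule of Eq. 3.8).
import random

class QLearningAgent:
    def __init__(self, n_states, n_actions, alpha=0.6, gamma=0.7,
                 epsilon=0.7, epsilon_min=0.05, decay=0.995):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.epsilon_min, self.decay = epsilon, epsilon_min, decay

    def select_action(self, state):
        # Exploration vs. exploitation trade-off (epsilon-greedy)
        if random.random() < self.epsilon:
            return random.randrange(len(self.q[state]))
        return max(range(len(self.q[state])), key=lambda a: self.q[state][a])

    def update(self, state, action, reward, next_state):
        # Q(S,a) <- Q(S,a) + alpha * (R + gamma * max_a' Q(S',a') - Q(S,a))
        best_next = max(self.q[next_state])
        self.q[state][action] += self.alpha * (
            reward + self.gamma * best_next - self.q[state][action])
        self.epsilon = max(self.epsilon_min, self.epsilon * self.decay)
```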
3.6 Methodology
We propose two different methods in this section to achieve load balancing.
In the first approach, the switch and target controller are selected
randomly. In the second, the switch with the maximum load is selected and
the target controller is chosen with a Q-table algorithm. The core driving
force for optimizing the migration model is the decision-making process
in reinforcement learning, which happens in the knowledge plane. The
control loop between the knowledge plane and the control plane is used to
exchange load statistics and switch migration decisions. Finally, the control
plane implements the switch migration on the data plane. The optimal
switch migration model is shown in Figure 3.3.
FIGURE 3.3
Optimal switch migration model based on RL.
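Combining the helpers sketched earlier (controller_loads, deviation_coefficient, and QLearningAgent), one step of this control loop might look as follows. The switch-selection rule and the reward, here simply the drop in the deviation coefficient rather than the chapter's Equation (3.5), are illustrative assumptions.

```python
# Sketch of one migration step of the control loop in Figure 3.3 (assumptions noted above).
def migration_step(agent, flow_rates, assignment, k):
    loads, _, _ = controller_loads(flow_rates, assignment, k)
    src = max(range(k), key=lambda j: loads[j])        # most loaded controller = state
    d_before = deviation_coefficient(loads)

    dst = agent.select_action(src)                     # target controller = action
    if dst == src:
        return assignment                              # nothing to migrate

    # Migrate the heaviest switch of the overloaded controller (descending-order schedule).
    candidates = [i for i, c in enumerate(assignment) if c == src]
    victim = max(candidates, key=lambda i: flow_rates[i])
    assignment[victim] = dst

    new_loads, _, _ = controller_loads(flow_rates, assignment, k)
    reward = d_before - deviation_coefficient(new_loads)  # positive if more balanced
    next_state = max(range(k), key=lambda j: new_loads[j])
    agent.update(src, dst, reward, next_state)
    return assignment
```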
Example 3.2: In Figure 3.4, we show an SDN network with nine SDN switches
distributed across three SDN domains (controllers). The flow arrival rate of the
switches for the first time step is shown in Table 3.3. The instantaneous load of
controllers c1, c2, and c3 in terms of number of flows are 75, 130, and 60,
respectively.
The corresponding state space is {S1 → 〈0, 0, 1〉, S2 → 〈0, 1, 0〉, S3 → 〈1, 0, 0〉}.
The action space is {A1 → 〈0, 0, 1〉, A2 → 〈0, 1, 0〉, A3 → 〈1, 0, 0〉}. We have
already discussed the MDP formulation in detail in Subsection 3.5.2 with an
example.
We assume homogeneous controllers with maximum processing capacity of
150 flows per second. The controller load ratios are Rc1 = 75/150 = 0.5; Rc2 = 130/
150 = 0.866; and Rc3 = 60/150 = 0.4. Using these values, we compute the pairwise
average load ratio of the controllers. The values are Rc1c2 = 205/300 = 0.683;
FIGURE 3.4
Unbalanced switch to controller mapping.
TABLE 3.3
Flow Arrival Rate of Switches at Time t0
Switch s1 s2 s3 s4 s5 s6 s7 s8 s9
si(t0) 25 25 25 25 25 35 45 25 35
si(t1) – – – – – – – – –
TABLE 3.4
Q-matrix
Action
A1 A2 A3
State S1 1 1 1
S2 1 1 1
S3 1 1 1
TABLE 3.5
State of Q-matrix Update (from time t0 to time t1)
Action
A1 A2 A3
State S1 1 1 1
S2 1 1 0.61451894
S3 1 1 1
Again, we compute the controller load ratios, the average controller load ratio, and the
discrete coefficient Dij = 0.19451894. The value of the new discrete coefficient is
smaller, which means the system has become more stable after the transfer of the
switch. The reward is computed using Equation (3.5).
The rewards become larger as the system becomes more stable. The Q-matrix is
updated using Equation (3.8). We take γ to be 0.7, α to be 0.6 in the first step, and
Q[S2, A3] = 1 + 1(0.19451894 + 0.7 ∗ 0.6 − 1) = 0.61451894. The new Q-table
for the next iteration is shown in Table 3.5.
TABLE 3.6
Experimental Setup
Parameter Value Parameter Value
# Controllers 5 # Switches 34
Topology Arnes Factor (α) 0.6
Factor (γ) 0.7 Factor (ϵ) 0.7 decay to 0.05
Dataset: CAIDA1 [ 15]
Data size 1065GB # Flows 67M+ (1hr)
Flow-req. rate 18.7k/s Flow size 30 sec
Dataset: CAIDA2 [ 15]
Data size 1463GB # Flows 63M+ (1hr)
Flow-req. rate 17.6k/s Flow size 30 sec
Dataset: University [ 13]
Data size 97.17GB No. of 0.43M+(56mins)
Flows
Flow-req. rate 110k/s Flow size 30 sec
*Damping, +Million.
\text{Load balancing rate} = \frac{ \sum_{j=1}^{k} \left| L_{c_j}(t) - LI(t) \right| }{ k }
FIGURE 3.5
Load on controllers using Q-learning with ϵ-greedy approach.
load means the number of flows generated by those switches connected to the
controller. The load is computed at a 30-sec interval (time steps). This ex
periment is to show the dynamic nature of network traffic. As can be seen
from the figure, there is a wide variation in the controller load. This result
affirms our hypothesis regarding the bursty nature of network traffic.
A comparison of the load balancing rate of the Q-learning based approach
and a random approach is shown in Figure 3.6. There is a wide variation
in the load balancing of the random approach. In the case of Q-learning with
FIGURE 3.6
Comparison of load balance rate.
ϵ-greedy approach, the variation in the load balancing rate is in the range of
100,000 to 700,000, while that of the random method is in the range of 200,000
to 1,200,000. The results show that the proposed Q-learning approach can
restrict the variation in the load balancing rate. Further, it also results in
an improvement in the load balancing rate. Our proposed approach also
improves the load balancing among the controllers. The average load
balancing rate of the Q-learning approach is 30% lower than that of the random
method. The action in Q-learning is either chosen at random or is the best
action as determined by the Q-matrix. In the early phases of
learning, the Q-matrix does not have much intelligence. Consequently, the
random process takes precedence over the optimal action. However, as the
learning process gains momentum, random actions are chosen with a lower
percentage, while optimal actions from the Q-matrix are selected at a
higher rate. The proposed approach thus results in a balance between
exploitation and exploration, resulting in a better solution. On the other hand,
the random strategy merely explores the solution space.
A comparison of the number of switches migrated in both approaches
is shown in Figure 3.7. The number of switches migrated in the random
approach is approximately three times that with the ϵ-greedy approach.
Migration of a switch involves cost in terms of the control messages
exchanged between the different stakeholders. The results show that the
Q-learning approach reduces the switch migration cost and the synchronization
cost between the controllers. This also translates to a reduction in the
migration time. In this work, we have considered the selection of switches
based on a descending-order schedule; thus, only a few switches are offloaded
from overloaded controllers to balance the system.
FIGURE 3.7
Comparison of number of migrated switches.
References
[1] 2020 global networking trends report. https://www.data3.com/knowledge-
centre/ebooks/2020-global-networking-trends-report-see-whats-next-in-
networking/, 2020 (accessed August 2021).
[2] Albert Mestres, Alberto Rodriguez-Natal, Josep Carner, Pere Barlet-Ros,
Eduard Alarcon, Marc Sole, Victor Muntes-Mulero, David Meyer,
Sharon Barkai, Mike J Hibbett, et al. Knowledge-defined networking.
ACM SIGCOMM Computer Communication Review, 47(3):2–10, 2017.
[3] Advait Dixit, Fang Hao, Sarit Mukherjee, TV Lakshman, and Ramana
Kompella. Towards an elastic distributed sdn controller. In Proceedings of the
second ACM SIGCOMM workshop on Hot topics in software defined networking,
pages 7–12, 2013.
[4] Advait Dixit, Fang Hao, Sarit Mukherjee, TV Lakshman, and Ramana Rao
Kompella. ElastiCon; an Elastic Distributed SDN Controller. In ACM/IEEE
Symposium on Architectures for Networking and Communications Systems
(ANCS), pages 17–27, 2014.
[5] Chuan’an Wang, Bo Hu, Shanzhi Chen, Desheng Li, and Bin Liu. A Switch
Migration-Based Decision-Making Scheme for Balancing Load in SDN.
IEEE Access, 5:4537–4544, 2017.
Reinforcement Learning for Switch Migration to Balance Loads 55
[6] Tao Hu, Julong Lan, Jianhui Zhang, and Wei Zhao. EASM: Efficiency-Aware
Switch Migration for Balancing Controller Loads in Software-Defined
Networking. Peer-to-Peer Networking and Applications, 12(2):452–464, 2019.
[7] S. Filali, Cherkaoui, and A. Kobbane. Prediction-based switch migration
scheduling for sdn load balancing. In ICC 2019 - 2019 IEEE International
Conference on Communications (ICC), pages 1–6, 2019.
[8] Yaning Zhou, Ying Wang, Jinke Yu, Junhua Ba, and Shilei Zhang. Load
Balancing for Multiple Controllers in SDN based on Switches Group.
In 19th Asia-Pacific on Network Operations and Management Symposium
(APNOMS), pages 227–230. IEEE, 2017.
[9] Md Tanvir Ishtaique ul Huque, Weisheng Si, Guillaume Jourjon, and Vincent
Gramoli. Large-Scale Dynamic Controller Placement. IEEE Transactions on
Network and Service Management, 14(1):63–76, 2017.
[10] H. Chen, G. Cheng, and Z. Wang. A game-theoretic approach to elastic control
in software-defined networking. China Communications, 13(5):103–109, 2016.
[11] David D Clark, Craig Partridge, J Christopher Ramming, and John T
Wroclawski. A knowledge plane for the internet. In Proceedings of the 2003
conference on Applications, technologies, architectures, and protocols for computer
communications, pages 3–10, 2003.
[12] Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduc
tion. MIT press, 2018.
[13] Data Center Measurement University Data Set, [Last accessed on 05/12/2019].
Available from: http://pages.cs.wisc.edu/~tbenson/IMC10_Data.html.
[14] The internet topology zoo. http://www.topology-zoo.org/index.html,
2019 (accessed January, 2020).
[15] The CAIDA UCSD Anonymized Internet Traces 2016, [Last accessed on
05/12/2019]. Available from: http://www.caida.org/data/passive/passive_2016_dataset.xml.
[16] J. Chandra, A. Kumari, and A. S. Sairam. Predictive flow modeling in
software defined network. In TENCON 2019 - 2019 IEEE Region 10 Conference
(TENCON), pages 1494–1498, 2019.
4
Green Corridor over a Narrow Lane:
Supporting High-Priority Message
Delivery through NB-IoT
CONTENTS
4.1 Introduction ..................................................................................................58
4.1.1 Challenges in Delay-Sensitive Traffic Scheduling
over NB-IoT ......................................................................................58
4.1.2 Contribution of This Work ............................................................59
4.1.3 Organization of the Paper.............................................................. 59
4.2 Related Works.............................................................................................. 60
4.3 NB-PTS: System Model and Design Details........................................... 61
4.3.1 Queueing Model Description ........................................................62
4.3.1.1 Solution Approach in NB-PTS .......................................62
4.3.1.2 Derivation of Target Mean Delay.................................. 63
4.3.2 Estimation of Queue Threshold Value ........................................65
4.3.2.1 ϵ-greedy Policy ..................................................................65
4.3.3 Calculation of Target Mean Delay ...............................................66
4.3.4 Metric for Scheduling .....................................................................66
4.3.5 Knowledge Base ..............................................................................67
4.3.6 Scheduling in NB-PTS ....................................................................67
4.3.7 Time-Bound Analysis .....................................................................67
4.4 Performance Analysis .................................................................................68
4.4.1 Baseline Mechanisms ......................................................................68
4.4.2 Prioritized Traffic Generation .......................................................69
4.4.3 Implementation Details ..................................................................70
4.4.4 Analysis of Throughput ................................................................. 71
4.4.4.1 Overall Average Throughput .........................................73
4.4.5 Analysis of Packet Loss Rate and Packet Delay........................ 73
4.4.5.1 PLR and Delay of Individual Priority-Based Traffic ... 74
4.1 Introduction
While a large number of research and commercial establishments have
proposed various smart city and smart infrastructure management solutions, a primary requirement for supporting these applications is to deploy millions of sensors over diverse platforms. Consequently, supporting the convergence of sensing, communication, and computing over the Internet of Things (IoT) has become a pressing requirement, and existing cellular networks built on 4G and 5G communication technologies can directly
support this convergence. Narrowband IoT (NB-IoT) [1,2] supports a large
number of low-throughput IoT devices over the existing cellular platforms
built over the long-term evolution (LTE) and LTE-Advanced (LTE-A) technologies. Consequently, it can play a prominent role in the deployment of
wide-scale smart city applications.
From the technical perspective, NB-IoT operates on a narrow channel bandwidth of 180 kHz in order to offer a coverage of over 160 dB, while maintaining a latency tolerance of approximately 10 seconds [2]. With these specifications, NB-IoT primarily targets IoT devices that are delay tolerant or situated in areas where the signal strength is poor. LTE operators can deploy the NB-IoT standard inside an LTE carrier by allocating a single physical resource block (PRB) of 180 kHz to NB-IoT. Although the technology has multiple advantages for industrial automation and control, such as cost-effective massive deployment support, plug-and-play operation over existing LTE networks, long battery life due to ultralow-power communication, better structure penetration, and better data rates compared to LoRa and SigFox, it does not support delay-sensitive applications. This makes it unsuitable for industrial applications where real-time data delivery with a strict delay guarantee is one of the major requirements.
TABLE 4.1
Examples of QCI Values Defined by 3GPP LTE (columns: QCI, resource type, priority, packet delay budget, and example services).
FIGURE 4.1
State transition model for independent Bernoulli process in each phase along with threshold.
From the state transition diagram (Figure 4.1), the discrete-time finite queue with L = K − 1 can be expressed as given in the following, where α is the arrival probability and β the service probability in a phase:

q_0 = q_0(1 − α) + q_1 β(1 − α)
q_1 = q_0 α + q_1[αβ + (1 − α)(1 − β)] + q_2 β(1 − α)
q_i = q_{i−1} α(1 − β) + q_i[αβ + (1 − α)(1 − β)] + q_{i+1} β(1 − α),  2 ≤ i ≤ K − 3
q_{K−2} = q_{K−3} α(1 − β) + q_{K−2}[αβ + (1 − α)(1 − β)] + q_{K−1} β
q_{K−1} = q_{K−2} α(1 − β) + q_{K−1}(1 − β)
By solving these equations recursively and assuming σ = α(1 − β)/(β(1 − α)), we can express the equilibrium probabilities in terms of q_0 as given in the following:

q_i = σ^i q_0 / (1 − β),  1 ≤ i ≤ K − 2    (4.1)

For i = K − 1, we have

q_{K−1} = σ^{K−1}(1 − α) q_0 / (1 − β)    (4.2)

By normalising the equations, we get Σ_{i=0}^{K−1} q_i = 1. Thus, q_0 can be computed as follows:

q_0 = (1 − β)(1 − σ) / [1 − (1 − σ)(β + ασ^{K−1}) − σ^K]    (4.3)

Now, we find the generating function of the finite queue length process, given by P(z) = Σ_{i=0}^{K−1} q_i z^i. That is,

P(z) = q_0 [1 − (1 − σz)(β + ασ^{K−1}z^{K−1}) − (σz)^K] / [(1 − β)(1 − σz)]    (4.4)
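As a concrete illustration of (4.1)–(4.3), the closed-form probabilities can be compared against a direct numerical solution of the balance equations. The following Python sketch does this for arbitrary illustrative values of the arrival probability α, the service probability β, and the queue size K; none of these values come from the chapter.

import numpy as np

def stationary_closed_form(alpha, beta, K):
    """Closed-form q_i from (4.1)-(4.3) for the finite queue with L = K - 1."""
    sigma = alpha * (1 - beta) / (beta * (1 - alpha))
    q0 = (1 - beta) * (1 - sigma) / (
        1 - (1 - sigma) * (beta + alpha * sigma ** (K - 1)) - sigma ** K)
    q = np.empty(K)
    q[0] = q0
    for i in range(1, K - 1):
        q[i] = q0 * sigma ** i / (1 - beta)                       # equation (4.1)
    q[K - 1] = q0 * sigma ** (K - 1) * (1 - alpha) / (1 - beta)   # equation (4.2)
    return q

def stationary_direct(alpha, beta, K):
    """Solve the balance equations of the state transition model directly."""
    P = np.zeros((K, K))                          # one-step transition matrix
    P[0, 0], P[0, 1] = 1 - alpha, alpha
    for i in range(1, K - 1):
        P[i, i - 1] = beta * (1 - alpha)
        P[i, i] = alpha * beta + (1 - alpha) * (1 - beta)
        P[i, i + 1] = alpha * (1 - beta)
    P[K - 1, K - 2], P[K - 1, K - 1] = beta, 1 - beta   # arrivals blocked when full
    # Solve q P = q together with sum(q) = 1.
    A = np.vstack([P.T - np.eye(K), np.ones(K)])
    b = np.zeros(K + 1)
    b[-1] = 1.0
    q, *_ = np.linalg.lstsq(A, b, rcond=None)
    return q

alpha, beta, K = 0.3, 0.6, 8                      # illustrative parameters only
qc = stationary_closed_form(alpha, beta, K)
qd = stationary_direct(alpha, beta, K)
print(np.allclose(qc, qd))                        # True: closed form matches direct solve
print("mean queue length:", np.dot(np.arange(K), qc))   # MQL, i.e. P'(1)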
Next, we find the mean queue waiting time by using Little’s Law, which
states that the average number of packets in a system is equal to the product
of average waiting time of packets and average arrival rate of packets in the
system [25]. For L = K − 1, we compute the mean queue length (MQL) for
this queue by taking the first-order derivative of P (z) at z = 1. In this way,
we can continue with L = K − 2, K − 3, …, and identify a general pattern for the MQL equation. That means the threshold L can be placed at any position of the
finite queue. Therefore, for L = K − 1, we have

D_{t+1} = σ_t [1 − L_{t+1} α_t σ_t^{L_{t+1}−1} + (2L_{t+1} α_t − (L_{t+1} + 1)) σ_t^{L_{t+1}} − (L_{t+1} α_t − L_{t+1}) σ_t^{L_{t+1}+1}] / {(1 − σ_t)[1 − (1 − σ_t)(β + α_t σ_t^{L_{t+1}}) − σ_t^{L_{t+1}+1}(1 + β)(1 − α_t)]}    (4.6)

where σ_t = α_t(1 − β)/(β(1 − α_t)). For a given value of L_{t+1} and the current arrival rate α_t, (4.6) gives the target mean delay corresponding to the threshold L_{t+1} in [t, t + 1], since D_t can influence the queue length in [t, t + 1]. The ϵ-greedy policy enforcement employs two phases, as follows.
The scheduling metric for traffic j over [t, t + 1] is computed as

W_{t+1}^j = γ_{t+1}^j × (D_{t+1}^{HOL,j} / D_{t+1}^j) × (p_j / Σ_{i=1}^n p_i)    (4.7)

In this equation, p_j specifies the priority of traffic j. It can be noted that the value of W_{t+1}^j increases as p_j increases, and higher values of W_{t+1}^j provide more chances of scheduling of traffic j. In (4.7), D_{t+1}^{HOL,j} is the current head-of-line (HOL) packet delay of traffic j, and γ_{t+1}^j denotes the maximum probability that the HOL packet delay of traffic j exceeds its target mean delay. The HOL delay is the waiting time of the first packet in the transmission queue. D_{t+1}^j defines the target mean delay of traffic j. The intuition behind (4.7) is that the traffic with the highest γ_{t+1}^j value has failed to meet its target delay most often. Furthermore, a traffic should get more chances to be scheduled as its HOL delay increases. So, W_{t+1}^j gets higher as γ_{t+1}^j or D_{t+1}^{HOL,j} increases. Additionally, higher values of p_j increase the chance of scheduling of traffic j. However, a higher value of the target mean delay of a traffic signifies that the traffic can sustain a delay and, thus, the value of W_{t+1}^j is reduced as D_{t+1}^j increases.
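As an illustration of (4.7), the sketch below computes the metric for a few traffic classes; all input values (exceedance probabilities, HOL delays, target mean delays, priorities) are placeholder numbers used only for illustration.

def scheduling_metric(gamma, hol_delay, target_delay, priorities):
    """Compute W_{t+1}^j of (4.7) for each traffic class j."""
    total_priority = sum(priorities)
    return [g * (d_hol / d_tgt) * (p / total_priority)
            for g, d_hol, d_tgt, p in zip(gamma, hol_delay, target_delay, priorities)]

# Three traffic classes (illustrative numbers only).
gamma        = [0.7, 0.4, 0.2]      # prob. that the HOL delay exceeds the target mean delay
hol_delay    = [0.30, 0.20, 0.15]   # current HOL packet delay (s)
target_delay = [0.25, 0.40, 0.60]   # target mean delay D_{t+1}^j from (4.6) (s)
priorities   = [3, 2, 1]
W = scheduling_metric(gamma, hol_delay, target_delay, priorities)
selected = max(range(len(W)), key=W.__getitem__)   # traffic with the highest W is scheduled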
1. Let L_{t+1}^j be the queue threshold value of traffic j over [t, t + 1] (where j = 1, 2, 3, …, n). The value of L_{t+1}^j is estimated by applying the ϵ-greedy policy as follows (a code sketch of this selection appears after this list).
   a. At time t, the SINR S_t of the channel is measured.
   b. Exploitation: If S_t ∈ [(S_t − δ), (S_t + δ)] in K_S ⊂ K, the L_{t+1} which has produced the lowest average PLR in the past within the range [(S_t − δ), (S_t + δ)] in K_S is chosen. Otherwise, the L_{t+1} which has provided the minimum average PLR in K is selected. The probability of exploitation is (1 − ϵ). Here, δ > 0 is a small integer used to define a small range of SINR values around the present SINR S_t.
   c. Exploration: A value of L_{t+1} is selected at random with probability ϵ.
2. The target mean delay D_{t+1}^j is computed by using (4.6).
3. The metric W_{t+1}^j is calculated for j = 1, 2, 3, …, n, by using (4.7).
4. The traffic that has the highest value of W_{t+1}^j is scheduled in the next transmission phase (in slot [t, t + 1]).
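The ϵ-greedy selection in step 1 can be sketched as follows. The knowledge base is modelled here simply as a list of (SINR, threshold, average PLR) records, and the parameter values are illustrative assumptions rather than the chapter's implementation.

import random

def select_threshold(knowledge_base, s_t, delta, epsilon, thresholds):
    """epsilon-greedy choice of the queue threshold L_{t+1} (step 1 above)."""
    if random.random() < epsilon:                        # exploration
        return random.choice(thresholds)
    # Exploitation: prefer records whose SINR lies within [s_t - delta, s_t + delta].
    near = [r for r in knowledge_base if abs(r["sinr"] - s_t) <= delta]
    candidates = near if near else knowledge_base        # fall back to the whole base K
    best = min(candidates, key=lambda r: r["avg_plr"])   # lowest average PLR seen so far
    return best["threshold"]

# Illustrative knowledge base and call.
K_base = [{"sinr": 12.0, "threshold": 8,  "avg_plr": 0.12},
          {"sinr": 15.0, "threshold": 12, "avg_plr": 0.07},
          {"sinr": 18.0, "threshold": 16, "avg_plr": 0.05}]
L_next = select_threshold(K_base, s_t=14.5, delta=2, epsilon=0.1,
                          thresholds=[4, 8, 12, 16])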
Figure 4.2 demonstrates the basic execution steps of NB-PTS with the three
phase design approach.
FIGURE 4.2
Steps of execution of NB-PTS with three phases.
TABLE 4.2
PHY/MAC and Control Parameters Used in Simulation

Path loss model: FriisSpectrumPropagationLossModel
Fading model: TraceFadingLossModel
TxPower of UE: 23 dBm
TxPower of eNB: 46 dBm
NoiseFigure of UE: 9
NoiseFigure of eNB: 5
DefaultTransmissionMode: 0 (SISO)
Propagation delay model: Constant speed propagation delay model
Maximum physical data rate: 250 kbps
Bit error rate (BER): 0.03
Adaptive modulation and coding (AMC) model: Vienna
Mobility model: Random direction 2D mobility model ("Bounds: Rectangle (−100, 100, −100, 100)", "Speed: ConstantRandomVariable [Constant=3.0]", "Pause: ConstantRandomVariable [Constant=0.4]")
UE scheduler type: PfFfMacScheduler
Cell radius: 1.5 km
Transmission mode: Single-Tone
the radio access strategy in an NB-IoT system and its downlink scheduling issues. Since NIS addresses downlink scheduling in NB-IoT, its downlink scheduling can be analyzed with prioritized traffic in order to find the impact on packet transmission delay. NANIS focuses on adapting the time interval between two successive NPDCCH occasions and on minimizing the radio resources consumed to receive data during downlink transmission. The primary design concept of NANIS is to utilize as many narrowband physical downlink shared channel (NPDSCH) subframes as possible. Thus, NANIS can help minimize the packet transmission delay for prioritized traffic. Hence, we use NIS and NANIS as baselines. The "General" baseline is basically a first-in first-out (FIFO) approach for scheduling packets.
FIGURE 4.3
NB-PTS implementation modules in ns-3-dev-NB-IOT.
High-Priority Message Delivery 71
FIGURE 4.4
(a) Average throughput of Priority-1 traffic; (b) average throughput of Priority-2 traffic.
72 Artificial Intelligence and Deep Learning for Computer Network
FIGURE 4.5
(a) Average throughput of Priority-3 traffic; (b) overall average throughput.
FIGURE 4.6
(a) Average PLR; (b) average delay.
FIGURE 4.7
Analysis of individual priority: (a) average PLR; (b) average delay.
High-Priority Message Delivery 75
W_{t+1}^j includes the probability that the HOL packet delay of a traffic exceeds its target mean delay. NB-PTS schedules packets by trying to reduce this probability; consequently, the rate of packet transmission within the target mean delay is increased. This helps to minimize the average delay in packet scheduling.
When the number of connected UEs is 30, it can be noted from Figure 4.7 that the average PLR of "Priority-1" traffic is 14% and 33% lower than that of "Priority-2" and "Priority-3," respectively, whereas the average PLR of "Priority-2" is 22% lower than that of "Priority-3." In terms of delay, "Priority-1" has 30% and 40% lower average delay than "Priority-2" and "Priority-3," respectively. The average delay is 14% less in "Priority-2" than in "Priority-3" traffic.
FIGURE 4.8
(a) Convergence behavior; (b) average delay distribution.
FIGURE 4.9
(a) Impact on consumed subframes; (b) computational time.
4.5 Conclusion
Packet scheduling is an important aspect of meeting QoS requirements in an NB-IoT–based IIoT network, in which the channel bandwidth and peak data rate
References
[1] S. Popli, R. K. Jha, and S. Jain, “A Survey on Energy Efficient Narrowband
Internet of Things (NBIoT): Architecture, Application and Challenges,” IEEE
Access, vol. 7, pp. 16739–16776, 2019.
[2] 3GPP RP-161248, 3GPP TSG-RAN Meeting 72, Ericsson, Nokia, ZTE, NTT
DOCOMO Inc., Busan, South Korea, “Introduction of NB-IoT in 36.331,”
June 2016.
[3] 3GPP TS 23.203 V10.6.0, “Technical Specification Group Services and System
Aspects; Policy and charging control architecture (Release 10),” March 2012.
[4] L. Da Xu, W. He, and S. Li, “Internet of things in industries: A survey,” IEEE
Transactions on industrial informatics, vol. 10, no. 4, pp. 2233–2243, 2014.
[5] E. Sisinni, A. Saifullah, S. Han, U. Jennehag, and M. Gidlund, “Industrial
Internet of Things: Challenges, Opportunities, and Directions,” IEEE
Transactions on Industrial Informatics, vol. 14, no. 11, pp. 4724–4734, 2018.
[6] F. Tong, Y. Sun, and S. He, “On Positioning Performance for the Narrow-Band
Internet of Things: How Participating eNBs Impact?” IEEE Transactions on
Industrial Informatics, vol. 15, no. 1, pp. 423–433, 2019.
[7] C. Yu, L. Yu, Y. Wu, Y. He, and Q. Lu, “Uplink Scheduling and Link
Adaptation for Narrowband Internet of Things Systems,” IEEE Access, vol. 5,
pp. 1724–1734, 2017.
[8] G. Tsoukaneri, M. Condoluci, T. Mahmoodi, M. Dohler, and M. K. Marina,
“Group Communications in Narrowband-IoT: Architecture, Procedures, and
Evaluation,” IEEE Internet of Things Journal, vol. 5, no. 3, pp. 1539–1549, 2018.
[9] H. Malik, H. Pervaiz, M. M. Alam, Y. Le Moullec, A. Kuusik, and
M. A. Imran, “Radio Resource Management Scheme in NB-IoT Systems,”
IEEE Access, vol. 6, pp. 15051–15064, 2018.
[10] S.-M. Oh and J. Shin, “An Efficient Small Data Transmission Scheme
in the 3GPP NB-IoT System,” IEEE Communications Letters, vol. 21, no. 3,
pp. 660–663, 2016.
5
Vulnerabilities Detection in Cybersecurity
CONTENTS
5.1 Introduction ..................................................................................................81
5.2 Literature Survey.........................................................................................84
5.2.1 Research Gap.................................................................................... 84
5.2.2 Importance of the Chapter in the Context of
Current Status .................................................................................. 88
5.3 Pictorial Representation of Cybersecurity Working Model.................90
5.3.1 The Proposed Approach ................................................................ 90
5.3.2 Apply LSTM..................................................................................... 92
5.3.3 Algorithm for LSTM .......................................................................93
5.3.4 LSTM Implementation for Vulnerabilities Detection................93
5.3.5 Results ............................................................................................... 94
5.4 Conclusion ....................................................................................................95
References ..............................................................................................................97
5.1 Introduction
Security information and event management (SIEM) plays a key role in improving next-generation cybersecurity-based secure communication against potential cyber-attacks. Nowadays, the world is connected by Internet of Things (IoT) devices, whose number is expected to reach 1.3 trillion by 2026. In such cases, the communication layers between computers and IoT devices, through centralized and decentralized systems, become more vulnerable to cyber-attacks such as DDoS attacks, botnets, social attacks, and man-in-the-middle attacks during their application in worldwide industrial areas such as the healthcare sector, smart transportation, agriculture, online banking, Google Pay, and Paytm banking [1].
As a solution, this chapter proposes a deep learning–based vulnerability detection and prevention model for SIEM with lightweight computing. In addition, an open-source decentralized vulnerability detection model will be developed for testing the implementation in industrial applications. Through this approach, scalability, reduced latency, secure and reliable communication, and fast downloading/uploading over the Internet of Things can be achieved. The deliverable of the chapter is a deep learning (DL)–based model to detect and protect against vulnerabilities in IoT networks, which will be a major focus of secure communication in the near future. The following are features of SIEM for IoT networks [2]:
FIGURE 5.1
AI technologies in the cybersecurity market over the next five years (1. 28% responded that every cybersecurity solution will have some aspect of AI; 2. 31% responded that a small part (less than 50%) will; 3. 39% responded that up to 50% of all cybersecurity offerings will use some aspect of AI; 4. others).
TABLE 5.1
Comparison of Existing and Proposed Cybersecurity Models

Existing: Traditional cybersecurity models do not provide efficient anomaly detection. Proposed: The DL-based cybersecurity model proposed here is platform independent, with the ability to detect and prevent all types of communication-layer attacks. It can provide a better solution that includes intelligence, attack detection speed, and accuracy.
Existing: Increased computation time and space complexity. Proposed: Reduced computation time and space complexity using AI techniques.
TABLE 5.2
Comparison Among Various Existing Cybersecurity Tools and Techniques
(Tools compared, in column order: OSSIM, ELK Stack, OSSEC, Wazuh, Apache Metron, SIEMonster, Prelude, SecurityOnion, MozDef, Snort, Suricata)

1. Includes key SIEM components* (Y/N): Y N Y – – – – – – – –
3. Log management capabilities: N Y N – Y Y Y – – – –
4. Performance issues: Y – – – – – – – – – Y
5. Online version: – – – – – N – – – – –
6. Reporting or alerting in syslog: – N Y – Y Y Y – Y Y –
7. Used for security applications: – N Y Y Y – – Y Y Y Y
8. Open-source operating systems support: – – Y – Y – – Y – Y –
9. Scalability: N – – Y – – – – – – –
10. Open-source tools version: Y – – – Y – – Y Y – –
11. Security: – N – – Y – – Y – – –
12. High cost to maintain: – Y – – – – – – – – –
FIGURE 5.2
Level of integration of SIEM for threat intelligence and analytics applications.
TABLE 5.3
Number of Cyber-Attacks that Cybersecurity Model
can Alert to on Average Per Day
Avg. No of Alerts per Day Cyber-attacks Reported (%)
5 47
10–20 23
21–30 13
31–40 3
41 14
TABLE 5.4
Number of Cyber-Attacks that Turned Out to be False
Positives
False Positives Range (%) Cyberattacks (%)
10% 27
10–30% 29
31–50% 15
50% 28
TABLE 5.5
Survey Among Various Industries With Respect to Different Parameters
(Columns: % of respondents from industries; company employee count or revenue; no. of respondents / total respondents in that category; remarks about security model or time taken to investigate/alert.)

30%; more than $1 billion; 29/95; more than 31 attack incidents returned/day
72%; between $100 million and $1 billion; 57/79; AI-enhanced systems are effective (also represented in Figure 5.4)
69%; between 1,001 and 5,000 employees; 59/85
56%; fewer than 1,000 employees or less than $100 million; 68/121
60%; –; 64/107
68%; $1 billion or more; 65/95
65%; high employee count, i.e., more than 5,000 employees; 67/103
47%; mid-size companies, between $100 million and $1 billion; 37/79; somewhat effective
36%; large companies, more than $1 billion; 34/95
–; small companies, less than $100 million; 53/121
43%; 1,000 and fewer employees; –
31%; 1,001 to 5,000 employees; 26/85
34%; mid-size company; 27/79
Table 5.5 presents the survey among various industries with respect to different parameters, and Table 5.6 presents the average time taken to investigate a security incident. Figure 5.3 represents the combined percentages for very effective and somewhat effective security models that are enhanced by AI. Table 5.7 lists the various data sources that are referred to while investigating a security incident [8,9].
Figure 5.2(a) represents the cybersecurity model; (b) represents security analytics; (c) represents threat intelligence; (d) represents email/malware; (e) represents cloud security; (f) represents network management; (g) represents IAM; and (h) represents IoT security, whose level of integration is much lower than that of the other areas. This chapter therefore concentrates on detecting and protecting against IoT threats.
FIGURE 5.3
Level of integration of SIEM for threat intelligence and analytics applications.
TABLE 5.6
Average Time Taken to Investigate Cyber-Attacks
Avg. Time Taken Cyber-attacks Investigated (%)
10 min 32
10–20 min 22
20 min–1 hour 29
2– 12 hours 13
12 hours 5
TABLE 5.7
Various Data Sources That Are Referred to While Investigating Cyber-Attacks
Type of Data Source Cyber-attacks Investigated (%)
Threat feeds 76
Search engines 67
Research articles 48
Blogs 46
Other 10
FIGURE 5.4
Combined percentages for very effective and somewhat effective security tools enhanced
by AI.
The SIEM model provides the ability to exchange messages from one device to another, place orders, and complete data transmission with the help of peer-to-peer centralized techniques at minimum hardware cost in a secure event management setting.
FIGURE 5.5
Importance of incorporating AI into SIEM.
TABLE 5.8
Scope Addressed

Scope: To implement a vulnerability monitoring and detection model for suspicious network events using DL techniques.
Addressed by: Anomaly detection via a topic modelling technique from NLP, using a collection of network event data sources such as NetFlow logs, DNS logs, and HTTP proxy logs (here, the logs are the network events).

Scope: To create an open-source SIEM model for IoT that can work without third-party approval.
Addressed by: Developing an open-source coding framework for monitoring and detecting suspicious network events (flow, DNS, and proxy) in order to open the borders of SIEM correlation restrictions. The framework implementation is based on open-source decoders; data are loaded into Hadoop for transformation.
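The topic-modelling idea listed in Table 5.8 can be sketched roughly as follows; the log strings, the tokenization, and the thresholding rule are illustrative assumptions and not the chapter's actual pipeline.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Network events (flow/DNS/proxy log lines) treated as short documents.
normal_logs = ["dns query example.com A 10.0.0.5",
               "http GET intranet.local /index.html 200",
               "flow 10.0.0.5 10.0.0.9 tcp 443 accepted"] * 50
new_logs    = ["dns query xj3k9.badhost.ru TXT 10.0.0.5",
               "http GET intranet.local /index.html 200"]

vec = CountVectorizer(token_pattern=r"[^ ]+")        # whitespace tokenization of log fields
X_train = vec.fit_transform(normal_logs)
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(X_train)

# Events whose topic mixture shows no strong affinity to any learned "normal" topic are flagged.
topic_mix = lda.transform(vec.transform(new_logs))
anomalous = topic_mix.max(axis=1) < 0.5              # threshold is an arbitrary illustrative choice
print(list(zip(new_logs, anomalous)))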
FIGURE 5.6
Working model for proposed SIEM in IoT networks.
FIGURE 5.7
Vulnerability detection in cybersecurity using deep learning model.
FIGURE 5.8
The basic structure of LSTM.
5.3.5 Results
The purpose of this section is to present an LSTM-based vulnerability detection model trained on a vulnerable data set. Although this data set contains instances that belong to some vulnerable classes with an unbalanced distribution, it has been shown that this problem does not affect the classification performance [15]. Figure 5.12 shows the LSTM model accuracy, with the number of epochs plotted on the x-axis and accuracy on the y-axis. The grey colour represents the test data, while the black colour represents the training data. Figure 5.11 shows the LSTM model loss, with the number of epochs
FIGURE 5.9
New information calculation.
FIGURE 5.10
Updates the information of the cell.
plotted on the x-axis and the loss on the y-axis. The grey colour represents the test data, while the black colour represents the training data.
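A minimal Keras sketch of an LSTM classifier of the kind discussed above is given below; the vocabulary size, sequence length, layer sizes, and number of classes are assumptions for illustration and not the exact configuration behind Figures 5.11 and 5.12.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

NUM_CLASSES, VOCAB, SEQ_LEN = 8, 800, 342   # assumed event vocabulary and sequence length

model = Sequential([
    Embedding(input_dim=VOCAB, output_dim=64),
    LSTM(128),                               # sequence model over encoded network/API events
    Dropout(0.3),
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Placeholder data standing in for the vulnerable data set.
X = np.random.randint(0, VOCAB, size=(256, SEQ_LEN))
y = np.random.randint(0, NUM_CLASSES, size=(256,))
history = model.fit(X, y, validation_split=0.2, epochs=3, batch_size=32, verbose=0)
# history.history["accuracy"] and ["loss"] give the per-epoch curves of the kind plotted above.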
5.4 Conclusion
In this chapter, a deep learning–based vulnerability detection and prevention model against cyber-attacks for IoT with lightweight computing
FIGURE 5.11
Model loss [ 15].
FIGURE 5.12
Model accuracy [ 15].
References
[1] R. Arthi and S. Krishnaveni. Design and development of iot testbed
with ddos attack for cybersecurity research. In 2021 3rd International
Conference on Signal Processing and Communication (ICPSC), pages 586–590.
IEEE, 2021.
[2] Charles Wheelus and Xingquan Zhu. Iot network security: Threats, risks,
and a data-driven defense framework. IoT, 1(2):259–285, 2020.
[3] Zhen Li, Deqing Zou, Shouhuai Xu, Zhaoxuan Chen, Yawei Zhu, and
Hai Jin. VulDeeLocator: A deep learning-based fine-grained vulnerability
detector. IEEE Transactions on Dependable and Secure Computing, 2021.
[4] Alejandro Mazuera-Rozo, Anamaria Mojica-Hanke, Mario Linares-Vasquez,
and Gabriele Bavota. Shallow or deep? An empirical study on detecting vulnerabilities using deep learning. arXiv preprint arXiv:2103.11940,
2021.
[5] Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, and
Zhaoxuan Chen. SySeVR: A framework for using deep learning to detect
software vulnerabilities. IEEE Transactions on Dependable and Secure
Computing, 2021.
[6] Prasesh Adina, Raghav H. Venkatnarayan, and Muhammad Shahzad.
Impacts & detection of network layer attacks on iot networks. In Proceedings
of the 1st ACM MobiHoc Workshop on Mobile IoT Sensing, Security, and Privacy,
pages 1–6, 2018.
[7] Meenigi Ramesh Babu and KN Veena. A survey on attack detection
methods for iot using machine learning and deep learning. In 2021 3rd
International Conference on Signal Processing and Communication (ICPSC),
pages 625–630. IEEE, 2021.
[8] Ismail Butun, Patrik Osterberg, and Houbing Song. Security of the internet of
things: Vulnerabilities, attacks, and countermeasures. IEEE Communications
Surveys & Tutorials, 22(1):616–644, 2019.
[9] Abhirup Khanna. An architectural design for cloud of things. Facta
universitatis-series: Electronics and Energetics, 29(3):357–365, 2016.
[10] Mohammed Zagane, Mustapha Kamel Abdi, and Mamdouh Alenezi. Deep
learning for software vulnerabilities detection using code metrics. IEEE
Access, 8:74562–74570, 2020.
[11] Xiang Li, Yuchen Jiang, Chenglin Liu, Shaochong Liu, Hao Luo, and
Shen Yin. Playing against deep neural network-based object detectors: A
novel bidirectional adversarial attack approach. IEEE Transactions on
Artificial Intelligence, 2021.
[12] Jun-Yan He, Xiao Wu, Zhi-Qi Cheng, Zhaoquan Yuan, and Yu-Gang Jiang. DB-LSTM: Densely-connected bi-directional LSTM for human action recognition.
Neurocomputing, 444:319–331, 2021.
[13] Raneem Qaddoura, Al-Zoubi Ala’M, Iman Almomani, and Hossam Faris.
Predicting different types of imbalanced intrusion activities based
on a multi-stage deep learning approach. In 2021 International Conference
on Information Technology (ICIT), pages 858–863. IEEE, 2021.
[14] Francis Akowuah and Fanxin Kong. Real-time adaptive sensor attack
detection in autonomous cyber-physical systems. In 2021 IEEE 27th Real-
Time and Embedded Technology and Applications Symposium (RTAS), pages
237–250. IEEE, 2021.
[15] Ferhat Ozgur Catak, Ahmet Faruk Yaz, Ogerta Elezaj, and Javed Ahmed.
Deep learning based sequential model for malware analysis using windows
exe api calls. PeerJ Computer Science, 6:e285, 2020.
6
Detection and Localization of Double-
Compressed Forged Regions in JPEG
Images Using DCT Coefficients and
Deep Learning–Based CNN
CONTENTS
6.1 Introduction ..................................................................................................99
6.1.1 Motivation and Objectives...........................................................101
6.1.2 Our Contributions ......................................................................... 103
6.2 Related Background .................................................................................. 103
6.2.1 Overview of JPEG Image Compression ....................................103
6.2.2 JPEG Attack Model ....................................................................... 105
6.2.3 Related Works................................................................................ 105
6.3 Deep Learning–Based Forensic Framework for JPEG Double-
Compression Detection ............................................................................107
6.3.1 JPEG DCT Coefficients Extraction and Selection .................... 107
6.3.2 CNN Architecture ......................................................................... 109
6.4 Localizing Double JPEG Compressed Forged Regions ...................... 110
6.4.1 JPEG Double-Compression Region Localization ..................... 110
6.4.2 Experimental Results for JPEG Double-Compression
Localization..................................................................................... 111
6.5 Conclusion ..................................................................................................113
References ............................................................................................................114
6.1 Introduction
In recent years, cybersecurity has gained a lot of attention from researchers.
Cybersecurity is the framework that protects our valuable digital
images and videos. For example, more than 500 hours of video content were uploaded to YouTube every minute [6], Facebook users uploaded 14.58 million images per hour [7], and 8.95 million images and videos were shared on Instagram per day, as per data collected in 2017 [8].
Among the shared content, a lot of information is manipulated (i.e., intentionally created fake information), which conveys wrong information to the audience. According to a report [9] by Massachusetts Institute of Technology (MIT) researchers, roughly one in every eight photos shared in WhatsApp groups during the Lok Sabha elections in India in 2019 was misleading. A total of 5 million messages from 2.5 lakh users were compiled between October 2018 and June 2019. It was observed that 52% of all messages were visual, including 35% images and 17% videos, and that 13% of all shared images were misleading [9]. In this chapter, we discuss image manipulation and its detection techniques in detail. The motivation and prime objectives of this chapter are presented in the following section.
FIGURE 6.1
An example of image forgery [ 11]: (a) authentic image and (b) forged image.
horses are present, and Figure 6.1(b) presents a forged (manipulated) image, where two extra horses are added. Using traditional techniques for multimedia security and protection, including digital signatures and digital watermarking, we can verify the authenticity of multimedia. Such techniques depend on some external information, such as a digital signature or watermark, and this external information is computed by pre-processing the multimedia data in some form or another. Additionally, such techniques require embedded hardware chips and software to embed the digital signature or watermark into images and videos. Also, many digital cameras do not support digital watermarking or digital signatures.
On the other hand, the field of blind forensics is recent and still developing, and it does not require any pre-processing step, i.e., embedding a digital signature or watermark. By analyzing intrinsic characteristics of images [12,13], which are left behind by the manipulation operation itself, it can detect tampered images and videos. Hence, a blind forensic technique is a completely post-processing-based operation. This chapter aims to investigate the problem of forgery detection in images through blind digital forensic measures.
The most common blind digital image forgery detection techniques are copy-move forgery detection [14], image splicing forgery detection [15], image retouching detection [16], and joint photographic experts group (JPEG) double/multiple compression detection [17].
Copy-move forgery is also known as region duplication forgery, where regions of an image are copied and pasted onto another target region within the same image to obscure or repeat one or more significant object(s) in the image. Unlike copy-move forgery, image splicing [18] comprises a composition of multiple images. Image retouching [16,19] is nothing but enhancing the image quality by editing certain image pixels, such as red-eye removal, sharpening, and tone adjustment. Such forms of modification are generally detectable by investigating inconsistencies in the natural statistical properties of an image [19–21].
For JPEG double/multiple compression detection [13,17,22], note that a JPEG compressed image has to be decompressed first before a forgery can be performed. After performing the forgery, the resultant image may be compressed again to be stored in JPEG format. Hence, when a JPEG image is modified or edited, a subsequent compression occurs due to consecutive saving of the edited image back to memory. So, it might be inferred that double or multiple compressed images are forged; however, it is not guaranteed that a double or multiple JPEG compressed image is always forged. Nevertheless, varying degrees of JPEG compression are a good indicator of forgery.
The main objective of this chapter is to present a blind forensic framework for the detection of JPEG double compression–based image manipulation (forgery). In the following section, we present our contributions in this chapter.
FIGURE 6.2
JPEG compression and de-compression process.
decompression technique are shown in Figure 6.2. First, the image is divided into non-overlapping pixel blocks of size 8 × 8, denoted by B, followed by a 2D discrete cosine transform (DCT) performed on each block. Hence, each block obtains its corresponding DCT coefficient block, denoted by D_B. Then, each DCT coefficient block is quantized by an 8 × 8 quantization matrix, Q_QF. The quality factor, QF, defines the quantization matrix. The value of QF lies in [1, 100], and a higher value of QF denotes a lower degree of compression.
The JPEG de-compression process is just the reverse of the compression process. First, the quantized DCT coefficients QC_qB are de-quantized by multiplying them with the corresponding quantization matrix, Q_QF, to obtain the de-quantized coefficients D̄_B. Then, the image pixel blocks (say B′) are reconstructed by applying the inverse DCT (IDCT) to the de-quantized DCT coefficients, finally followed by a rounding and truncation operation.
The quantization function is a non-invertible operation due to the rounding function. This makes JPEG a lossy compression technique. Due to the quantization step, a quantization error is generated for an image during the encoding and decoding process in JPEG. The quantization error is defined as Q_error = D_B − round(D_B/Q_QF) Q_QF.
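The block-wise quantization step and the resulting quantization error can be illustrated with the following Python sketch; a uniform quantization matrix is used here as a stand-in for the quality-factor-dependent table Q_QF, and the pixel block is random illustrative data.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    """2D type-II DCT, as used in JPEG."""
    return dct(dct(block.T, norm="ortho").T, norm="ortho")

def idct2(coeff):
    return idct(idct(coeff.T, norm="ortho").T, norm="ortho")

# One 8x8 pixel block B (random stand-in for image data, level-shifted as in JPEG).
B = np.random.randint(0, 256, size=(8, 8)).astype(float) - 128.0
Q_QF = np.full((8, 8), 16.0)          # stand-in quantization matrix (real JPEG scales a table by QF)

D_B = dct2(B)                         # DCT coefficient block
QC_qB = np.round(D_B / Q_QF)          # quantization (non-invertible rounding)
D_B_deq = QC_qB * Q_QF                # de-quantization
Q_error = D_B - D_B_deq               # quantization error: D_B - round(D_B/Q_QF) * Q_QF

B_rec = np.round(idct2(D_B_deq))      # IDCT followed by rounding reintroduces rounding error
print(np.abs(Q_error).max(), np.abs(B - B_rec).max())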
Along with the quantization error, a rounding error and a truncation error
are also introduced during IDCT. Some float values are generated when
performing IDCT on de-quantized DCT coefficients. The float values are
FIGURE 6.3
JPEG attack on image [ 32]: (a) authentic 512 × 512 image; (b) selected region, re-saved at a
different compression quality factor; (c) forged image with partially double-compressed
regions.
before CNN training; (ii) CNN in the noise domain, where denoised JPEG images are directly fed into the CNN; and (iii) CNN embedding DCT histograms, where the DCT histograms are computed within a CNN layer rather than extracted from the JPEG bitstream, which still works if double JPEG images are stored in bitmap or PNG format. Hence, the third approach relies on a CNN that automatically extracts first-order features from the DCT coefficients. The third approach also achieves better accuracy than the other two methods, and performs efficiently when the second quality factor is greater than the first.
In [38], Amerini et al. proposed a multidomain approach, combining spatial-domain and frequency-domain CNN models for the detection of double compression in JPEG images. Park et al. [39] proposed a deep convolutional neural network for JPEG double-compression detection in images with mixed JPEG quality factors. They used a DCT histogram and JPEG quantization tables as input to the CNN, which separates forged images from authentic ones. However, this method suffers when the pixel values of an image are saturated and only low frequencies are present. In addition, it also suffers when the difference between two consecutive quality factors of JPEG compression is low.
To reduce the computational cost, instead of selecting all 16 blocks, only 7 DCT blocks are considered, as follows. For the i-th coefficient, we find the block where it assumes the highest value compared to the rest of the 15 blocks. This maximum block and its six neighbors are considered: position-wise, its three immediate predecessors and three immediate successor blocks are used for feature extraction. For example, if the 13th DCT block contains the highest value for the i-th coefficient, DCT blocks indexed [9–12,14,33,34] are also considered. This generates a 19 × 7 dimension DCT coefficient vector for each 32 × 32 image block. This abstraction is carried out to reduce computational complexity without losing any significant block information.
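A sketch of this block-selection step is given below. The choice of which 19 coefficients are retained per block (taken here as the first 19 in the assumed ordering) and the handling of maxima near the ends of the block index range are assumptions, since those details are specified in [32] rather than in the text above.

import numpy as np

def select_dct_features(dct_blocks, n_coeffs=19, n_neighbors=3):
    """Build the 19 x 7 feature matrix for one 32x32 image block.

    dct_blocks: array of shape (16, 64) -- the 8x8 DCT coefficient blocks of a
    32x32 image block, each flattened (assumed to be in a fixed scan order)."""
    n_blocks = dct_blocks.shape[0]                        # 16 DCT blocks per 32x32 region
    window = 2 * n_neighbors + 1                          # maximum block + 3 predecessors + 3 successors
    features = np.zeros((n_coeffs, window))
    for c in range(n_coeffs):                             # which 19 coefficients: assumed low-frequency
        col = dct_blocks[:, c]
        m = int(np.argmax(col))                           # block with the highest value of coefficient c
        lo = min(max(0, m - n_neighbors), n_blocks - window)   # shift window at the edges (assumption)
        features[c] = col[np.arange(lo, lo + window)]
    return features

# Example: random stand-in for the 16 DCT blocks of one 32x32 image block.
blocks = np.random.randn(16, 64)
print(select_dct_features(blocks).shape)                  # (19, 7)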
To present the DCT coefficient selection procedure more clearly to the
readers, an example is presented in Figure 6.4, which shows a 32 × 32 image
FIGURE 6.4
An example of DCT coefficient selection (second coefficient shown) [ 32].
FIGURE 6.5
Convolution neural network (CNN) architecture [ 32].
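As the surrounding text only names the CNN of Figure 6.5, the following Keras definition is an illustrative stand-in for a classifier over the 19 × 7 DCT-coefficient features, not the exact architecture of [32]; all layer choices are assumptions.

from tensorflow.keras import layers, models

# Illustrative CNN over the 19 x 7 x 1 DCT-coefficient features (not the exact net of [32]).
model = models.Sequential([
    layers.Input(shape=(19, 7, 1)),
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 1)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),   # single- vs double-compressed block
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()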
FIGURE 6.6
Stride movement demonstration for forgery localization in test images: (a) Represents top-
leftmost 32 × 32 image block, (b) horizontal stride movement of 8 pixels to the next 32 × 32
image block, (c) vertical stride movement of 8 pixels to the next 32 × 32 image block, (d) top-
leftmost 8 × 8 unit of forgery localization (marked in cyan color).
evident from Figure 6.6(d). This technique helps to obtain small units, 8 × 8 image blocks, of forgery localization.
Following a similar procedure, each (overlapping) 32 × 32 JPEG block is tested sequentially, its class label is assigned using the trained CNN, and the window is moved to the next image block. For the blocks at the image boundary, the image is padded with a sufficient number of zero rows and columns to maintain the 32 × 32 block size, if required. This process helps to localize the JPEG forgery and is considerably accurate.
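A sketch of this sliding-window localization is shown below; classify_block is assumed to wrap the trained CNN and return a forged/authentic label for a 32 × 32 block, and the per-unit majority vote is a design choice, not a detail stated above.

import numpy as np

def localize_forgery(image, classify_block, block=32, stride=8):
    """Label each 8x8 unit of `image` as forged (True) or authentic (False).

    classify_block: callable taking a (32, 32) array and returning 0 or 1,
    assumed to wrap the trained CNN of Section 6.3.2."""
    H, W = image.shape
    # Zero-pad so that the last windows still have size 32 x 32.
    pad_h = (-(H - block)) % stride if H >= block else block - H
    pad_w = (-(W - block)) % stride if W >= block else block - W
    padded = np.pad(image, ((0, pad_h), (0, pad_w)))
    votes = np.zeros(((H + pad_h) // stride, (W + pad_w) // stride))
    counts = np.zeros_like(votes)
    for y in range(0, padded.shape[0] - block + 1, stride):
        for x in range(0, padded.shape[1] - block + 1, stride):
            label = classify_block(padded[y:y + block, x:x + block])
            votes[y // stride:(y + block) // stride,
                  x // stride:(x + block) // stride] += label
            counts[y // stride:(y + block) // stride,
                   x // stride:(x + block) // stride] += 1
    # Majority vote per 8x8 unit over all 32x32 windows covering it (a design choice).
    return (votes / np.maximum(counts, 1)) > 0.5

# classify_block could, for instance, extract the 19x7 DCT features of the block
# and feed them to the trained CNN; that wrapper is assumed here.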
We select 480 images for training the deep learning model, and the remaining 20 images are considered for testing.
For training, we first compress the TIFF images with JPEG quality factors of QF1 = 55, 65, 75, 85, and 95 to create a single-compressed dataset, denoted by SSC. Next, the single-compressed images in SSC are re-compressed with JPEG quality factors QF2 = 55, 65, 75, 85, and 95 to create a double JPEG compressed dataset, denoted by SDC. In the training phase, all the single- and double-compressed images are divided into non-overlapping blocks of size 32 × 32, as discussed in Section 6.3.1.
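The construction of SSC and SDC can be sketched as follows; the directory layout and the use of Pillow for re-saving are implementation assumptions.

import os
from PIL import Image

QF1_LIST = [55, 65, 75, 85, 95]
QF2_LIST = [55, 65, 75, 85, 95]

def build_datasets(tiff_dir, out_dir):
    """Create single- (SSC) and double-compressed (SDC) JPEG sets from TIFF images."""
    for name in os.listdir(tiff_dir):
        if not name.lower().endswith((".tif", ".tiff")):
            continue
        img = Image.open(os.path.join(tiff_dir, name)).convert("RGB")
        stem = os.path.splitext(name)[0]
        for qf1 in QF1_LIST:
            sc_path = os.path.join(out_dir, "SSC", f"{stem}_q{qf1}.jpg")
            os.makedirs(os.path.dirname(sc_path), exist_ok=True)
            img.save(sc_path, "JPEG", quality=qf1)                      # single compression
            for qf2 in QF2_LIST:
                dc_path = os.path.join(out_dir, "SDC", f"{stem}_q{qf1}_q{qf2}.jpg")
                os.makedirs(os.path.dirname(dc_path), exist_ok=True)
                Image.open(sc_path).save(dc_path, "JPEG", quality=qf2)  # re-compression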
In the testing phase, we create JPEG double-compressed forged images
according to the JPEG attack model, as discussed in Section 6.2.2, by varying
the JPEG compression quality factor and using forgery sizes of 10%, 30%, and 50% of the actual images.
A visual illustration of the JPEG recompression-based forgery localization results of the presented framework [32] on the UCID dataset [42] is shown in Figure 6.7 for three different forgery sizes. In Figure 6.7, the top row represents the single-compressed images with a JPEG quality factor QF1 = 50. For creating forged images, 10%, 30%, and 50% regions of the single-compressed JPEG images are extracted, saved with a quality factor of QF3 = 90, and relocated back to their original positions, as shown in the middle row of Figures 6.7(a), (b), and (c), respectively. The forged (double-compressed) regions are highlighted in green, and the corresponding forgery detection and localization results are presented in the bottom row of Figure 6.7 (highlighted in white). From Figure 6.7, it can be observed that the presented technique can locate the forged regions efficiently.
Figure 6.8 presents the JPEG re-compression localization results, in terms of average accuracy, of the presented deep learning–based forensic technique on 20 UCID test images [42] for varying quality factors QF1 and QF2. Figure 6.8 also presents the localization results of three state-of-the-art
FIGURE 6.7
Forgery detection and localization results [ 32]. Forgery sizes: (a) 10%, (b) 30%, (c) 50%. (Top)
Authentic images. (Middle) Tampered images: tampered regions highlighted. (Bottom) Detection
and localization of forged region.
FIGURE 6.8
Average accuracy for varying QF2 − QF1 values [ 32].
schemes, namely those of Wang et al. [26], Bianchi et al. [27], and Lin et al. [28]. From Figure 6.8, it can be observed that the presented deep learning–based forensic model performs considerably better when QF1 > QF2, compared to the other three schemes. This is because the CNN helps to preserve spatially structured features and efficiently learns the statistical patterns of the JPEG coefficient distribution, hence improving the detection accuracy. However, the performance of the presented scheme is only marginally higher than the state of the art for the cases QF1 < QF2. The performance of the presented scheme is higher than the schemes of Wang et al. [26], Bianchi et al. [27], and Lin et al. [28] in the case of QF1 = QF2; still, the presented scheme does not provide a satisfactory performance in this case.
6.5 Conclusion
In this chapter, we have discussed illicit modification attacks on multimedia, i.e., image and video forgery, and their relation to cybersecurity. We have observed that JPEG re-compression footprints help to detect forgery in images. In this chapter, we have also presented a deep learning–based forensic framework for the detection and localization of forgery (double JPEG compressed regions) in images. The pre-processed 19 × 7 JPEG DCT coefficients are fed to a CNN to extract and learn suitable features from single- and double-compressed JPEG images.
The experimental results prove that the presented deep learning–based
forensic scheme can detect and locate the JPEG double-compressed forged
References
[1] Rossouw von Solms and Johan van Niekerk. From information security
to cyber security. Computers & Security, 38:97–102, 2013.
[2] Nigel Martin and John Rice. Cybercrime: Understanding and addressing
the concerns of stakeholders. Computers & Security, 30(8):803–814, 2011.
[3] Manuel Jiménez, Pedro Sánchez, Francisca Rosique, Bárbara Álvarez, and Andrés Iborra. A tool for facilitating the teaching of smart home applications. Computer Applications in Engineering Education, 22(1):178–186,
2014.
[4] DC: Department of Homeland Security Washington. Critical infrastructure,
Cited 23 November 2012.
[5] The Whitehouse. International strategy for cyberspace: prosperity, security, and
openness in a networked world, Cited February 2012.
[6] Youtube. Statistics, 2021 (accessed January 3, 2021).
[7] Salman Aslam. Facebook by the Numbers: Stats, Demographics & Fun Facts,
2021 (accessed January 3, 2021).
[8] Mary Lister. 33 Mind-Boggling Instagram Stats & Facts for 2018, 2021 (accessed
January 3, 2021).
[9] The Times of India. 1 out of 8 photos in political WhatsApp groups misleading, 2021 (accessed January 3, 2021).
[10] Ingemar Cox, Matthew Miller, Jeffrey Bloom, Jessica Fridrich, and
Ton Kalker. Digital watermarking and steganography. Morgan Kaufmann, 2007.
[11] Yu-Feng Hsu and Shih-Fu Chang. Detecting image splicing using geometry invariants and camera characteristics consistency. In 2006 IEEE
International Conference on Multimedia and Expo, pages 549–552. IEEE,
2006.
[12] Hany Farid. Exposing digital forgeries from JPEG ghosts. IEEE Transactions
on Information Forensics and Security, 4(1):154–160, 2009.
[13] Simone Milani, Marco Tagliasacchi, and Stefano Tubaro. Discriminating
multiple JPEG compressions using first digit features. APSIPA Transactions
on Signal and Information Processing, 3, 2014.
[14] Rahul Dixit and Ruchira Naskar. Region duplication detection in digital
images based on centroid linkage clustering of key–points and graph
[40] Qing Wang and Rong Zhang. Double JPEG compression forensics based on
a convolutional neural network. EURASIP Journal on Information Security, 1,
23, 2016.
[41] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural
Information Processing Systems, pages 1097–1105, 2012.
[42] Gerald Schaefer and Michal Stich. UCID: an uncompressed color image
database. In Minerva M. Yeung, Rainer W. Lienhart, and Chung-Sheng Li,
editors, Storage and Retrieval Methods and Applications for Multimedia 2004,
volume 5307, pages 472–480. International Society for Optics and Photonics,
SPIE, 2003.
Index
Note: Page numbers in italics indicate a figure and those in bold indicate a
table.